
ESC384: Partial Differential Equations

Lecture Notes
Version 1.0 (Fall 2021)

Masayuki Yano

University of Toronto
Institute for Aerospace Studies
©2019–2021 Masayuki Yano, University of Toronto.

Contents

I Introduction 7

1 Introduction to PDEs 8
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2 Partial differential equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Classification of PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.4 Classification of second-order PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.5 PDEs in physics and engineering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.A Well-posedness problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2 Fourier series: formulation 16


2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2 Periodic, even, and odd functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.3 Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4 Fourier sine and cosine series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.5 Fourier series on arbitrary intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.A Complex form of Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.B List of useful integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

3 Fourier series: analysis 29


3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.2 Regularity of functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
3.3 Convergence of Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.4 Convergence of Fourier series on finite intervals . . . . . . . . . . . . . . . . . . . . . 34
3.5 Uniform and absolute convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.6 Differentiation of Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.A Proof of pointwise convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.B Proof of uniform convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.C L2 convergence and optimality of Fourier series . . . . . . . . . . . . . . . . . . . . . 41

4 Sturm-Liouville theory 43
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.2 Motivation: Fourier series and eigenproblem . . . . . . . . . . . . . . . . . . . . . . . 43

4.3 Sturm-Liouville theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.4 Examples of the Sturm-Liouville problem . . . . . . . . . . . . . . . . . . . . . . . . 47
4.5 Generalized Fourier series by eigenfunctions . . . . . . . . . . . . . . . . . . . . . . . 49
4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.A L2 properties of generalized Fourier series . . . . . . . . . . . . . . . . . . . . . . . . 50

II Heat equation 51

5 Heat equation: introduction 52


5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.2 Derivation of the heat equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.3 Boundary and initial conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
5.4 Nonhomogeneous heat equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.5 Nondimensionalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.6 Maximum principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.7 Uniqueness by the maximum principle . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.8 Uniform stability by the maximum principle . . . . . . . . . . . . . . . . . . . . . . . 59
5.9 Energy method: uniqueness and stability in L2 sense . . . . . . . . . . . . . . . . . . 60
5.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

6 Heat equation: separation of variables 62


6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
6.2 Homogeneous Dirichlet boundary conditions . . . . . . . . . . . . . . . . . . . . . . . 62
6.3 Homogeneous Neumann boundary conditions . . . . . . . . . . . . . . . . . . . . . . 66
6.4 Mixed boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
6.5 Nonhomogeneous source term . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
6.6 Nonhomogeneous boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

7 Heat equation: fundamental solution 77


7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
7.2 Fourier integral representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
7.3 Heat equation in R × R≥0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
7.4 Fundamental solution of the heat equation in R × R>0 . . . . . . . . . . . . . . . . . 79
7.5 Heat equation in R>0 × R>0 : homogeneous Dirichlet . . . . . . . . . . . . . . . . . . 82
7.6 Heat equation in R>0 × R>0 : homogeneous Neumann . . . . . . . . . . . . . . . . . 84
7.7 Nonhomogeneous problems and Duhamel’s principle . . . . . . . . . . . . . . . . . . 85
7.8 Nonhomogeneous boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
7.A “Derivation” of Fourier integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
7.B Fundamental solution in Rn × R>0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

8 Heat equation: finite difference method 93
8.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
8.2 Model equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
8.3 Discretization in space and time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
8.4 Finite difference in space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
8.5 Semi-discrete heat equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
8.6 Temporal discretization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
8.7 Fully discrete equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
8.8 Solution of the fully discrete equation . . . . . . . . . . . . . . . . . . . . . . . . . . 100
8.9 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
8.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

III Laplace’s equation 104

9 Laplace’s equation: introduction 105


9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
9.2 Laplace’s and Poisson’s equations in science and engineering . . . . . . . . . . . . . . 105
9.3 Mean-value formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
9.4 Maximum principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
9.5 Uniqueness and uniform stability by maximum principle . . . . . . . . . . . . . . . . 110
9.6 Energy method: uniqueness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
9.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

10 Laplace’s equation: separation of variables 113


10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
10.2 Laplace’s equation on a rectangle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
10.3 Laplace’s equation on a disk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
10.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

11 Laplace’s equation: fundamental solutions and Green’s functions 124


11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
11.2 Fundamental solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
11.3 Poisson’s equation in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
11.4 Green’s function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
11.5 Finding Green’s functions: general formulation . . . . . . . . . . . . . . . . . . . . . 128
11.6 Green’s function for the upper half plane . . . . . . . . . . . . . . . . . . . . . . . . . 129
11.7 Green’s function for a disk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
11.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
11.A Proofs of properties of Green’s functions . . . . . . . . . . . . . . . . . . . . . . . . . 136

12 Poisson’s equation: finite difference method 139


12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
12.2 Problem statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
12.3 Finite difference formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
12.4 Example: square domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

12.5 Example: L-shaped domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
12.6 Treatment of various boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . 143
12.7 Time dependent problems: heat equation . . . . . . . . . . . . . . . . . . . . . . . . 145
12.8 Other numerical methods for PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
12.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

IV Hyperbolic equations 147

13 Wave equation: introduction 148


13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
13.2 Derivation: vibrating string in R × R>0 . . . . . . . . . . . . . . . . . . . . . . . . . 148
13.3 Derivation: acoustic wave equation in R3 × R>0 . . . . . . . . . . . . . . . . . . . . . 149
13.4 Boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
13.5 Nondimensionalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
13.6 Energy conservation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
13.7 Energy method: uniqueness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
13.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

14 Wave equation: separation of variables and d’Alembert’s formula 155


14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
14.2 Separation of variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
14.3 d’Alembert’s formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
14.4 Neumann boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
14.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

15 Transport equation: method of characteristics 167


15.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
15.2 Derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
15.3 Generalized transport equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
15.4 Method of characteristics for homogeneous problems . . . . . . . . . . . . . . . . . . 168
15.5 Method of characteristics for nonhomogeneous problems . . . . . . . . . . . . . . . . 172
15.6 Nonhomogeneous problems: Duhamel’s principle revisited . . . . . . . . . . . . . . . 174
15.7 Initial-boundary value problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
15.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177

16 Wave equation: method of characteristics 179


16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
16.2 d’Alembert’s formula: revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
16.3 Wave equation on R>0 × R>0 : homogeneous Dirichlet BC . . . . . . . . . . . . . . . 182
16.4 Wave equation on R>0 × R>0 : homogeneous Neumann BC . . . . . . . . . . . . . . . 184
16.5 Wave equation on (0, 1) × R>0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
16.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

17 Weak solution of conservation laws 188
17.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
17.2 Conservation laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
17.3 Weak solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
17.4 Weak solutions of transport equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
17.5 Weak solution of Burgers’ equation: shocks . . . . . . . . . . . . . . . . . . . . . . . 192
17.6 Entropy condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192

Part I

Introduction

Lecture 1

Introduction to PDEs


1.1 Introduction
Our goal in this lecture is to introduce some of the key concepts that we will visit throughout
this course. We provide a definition of partial differential equations (PDEs) and a few model (and
carefully chosen) PDEs. We then discuss several ways to classify PDEs with a particular emphasis
on second-order PDEs, which are ubiquitous in science and engineering. We conclude the lecture
with examples of PDEs in science and engineering.

1.2 Partial differential equations


We first provide a definition of PDEs:
Definition 1.1 (partial differential equation (PDE)). A PDE is an equation that relates an unknown function u and some of its partial derivatives, where the function is defined over an n-dimensional spatial domain Ω ⊂ R^n and time interval I ⊂ R. In general, a PDE can be expressed as
\[
G\left(x_1, \dots, x_n, t, u, \frac{\partial u}{\partial x_1}, \dots, \frac{\partial u}{\partial x_n}, \frac{\partial u}{\partial t}, \frac{\partial^2 u}{\partial x_1^2}, \frac{\partial^2 u}{\partial x_1 \partial x_2}, \dots\right) = 0,
\]
where \(x_1, \dots, x_n\) are the spatial coordinates, t is the time, and \(\partial u/\partial x_1, \dots\) denote the partial derivatives.
We provide a few concrete examples of model PDEs that we will study throughout this course:
• Laplace's equation. Laplace's equation is given by
\[
-\Delta u = 0,
\]
where \(\Delta = \nabla \cdot \nabla = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_n^2}\) is the Laplacian. Laplace's equation describes various physical phenomena including (but not limited to) the following: steady heat transfer (without a heat source), where u is the temperature; electrostatics, where u is the electrostatic potential; and steady incompressible irrotational flow, where u is the velocity potential.

• Poisson’s equation. Poisson’s equation is given by

−∆u = f,

where f is a source function. Poisson’s equation describes the following: steady heat transfer
in the presence of a heat source; and electrostatic potential in the presence of charge. Laplace’s
equation is a special case of Poisson’s equation for f = 0.

• Heat equation. The heat equation is given by
\[
\frac{\partial u}{\partial t} - \nabla \cdot (a \nabla u) = f,
\]
where a is the diffusivity constant, and f is the source term. The heat equation describes the following: unsteady heat transfer; and the diffusion of chemical species.

• Transport equation. The transport equation is given by
\[
\frac{\partial u}{\partial t} + b \cdot \nabla u = 0,
\]
where b is an advection field. The transport equation describes the transport of a conserved quantity (e.g., the concentration of a chemical species) in the presence of an advection field.

• Wave equation. The wave equation is given by
\[
\frac{\partial^2 u}{\partial t^2} - c^2 \Delta u = 0,
\]
where c is the wave propagation speed. The wave equation describes the propagation of waves through a medium, including the following: the propagation of acoustic pressure waves in air; electromagnetic fields through space; and the vibration of taut strings and membranes.

• Burgers' equation. The (inviscid) Burgers' equation is given by
\[
\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = 0.
\]
Burgers' equation describes the propagation of a nonlinear wave, in which the speed of propagation depends on the solution value. The equation models features such as shocks and rarefaction waves that are encountered in compressible flows.
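As a small sanity check of the transport equation above in one dimension: any smooth profile advected at speed b, \(u(x,t) = g(x - bt)\), is a solution. The sketch below (the Gaussian profile and the sample point are our own choices, not from the notes) verifies this with finite differences.

```python
import numpy as np

# One-dimensional transport equation check: u(x, t) = g(x - b t) satisfies
# u_t + b u_x = 0 for any smooth profile g. Profile and numbers are our choices.
b = 1.5
g = lambda s: np.exp(-s**2)
u = lambda x, t: g(x - b * t)

x, t, h = 0.3, 0.7, 1e-6
u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)   # central differences
u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)
assert abs(u_t + b * u_x) < 1e-8              # residual of u_t + b u_x = 0
```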

1.3 Classification of PDEs


To facilitate our study of PDEs, we now classify PDEs by a few of their attributes. For each
definition we introduce, we provide a few concrete examples.

Definition 1.2 (order of a PDE). The order of a PDE is the order of the highest-order derivative
that appears in the PDE.

Example 1.3. Laplace's equation \(-\Delta u = 0\) is a second-order PDE, and the transport equation \(\frac{\partial u}{\partial t} + a \frac{\partial u}{\partial x} = 0\) is a first-order PDE.

Example 1.4. The heat equation \(\frac{\partial u}{\partial t} - \Delta u = 0\) is a second-order PDE. More precisely, it is first-order in time and second-order in space. Note that time-dependent PDEs admit a more precise characterization.

Definition 1.5 (linear operator). An operator L is a linear operator if

L(αw + βv) = αLw + βLv

for any functions w and v and any scalars α and β.

Definition 1.6 (linear PDEs). A PDE is said to be linear if the PDE can be expressed as

Lu = f,

where L is a linear operator and f is a function independent of u. If a PDE is not linear, then it
is called a nonlinear PDE.

Definition 1.7 (linear homogeneous PDEs). A linear homogeneous PDE is a PDE of the form

Lu = 0,

where L is a linear operator; i.e., the term that does not depend on u vanishes. If a linear PDE is
not homogeneous, then it is nonhomogeneous (or inhomogeneous).

Example 1.8. Laplace's equation \(-\Delta u = 0\) is a linear homogeneous PDE; the Laplacian \(L \equiv -\Delta\) is a linear operator, and the right-hand side is zero. Poisson's equation \(-\Delta u = f\) is a linear nonhomogeneous PDE for \(f \neq 0\). The heat equation \(\frac{\partial u}{\partial t} - \nabla \cdot (a \nabla u) = f\) is a linear homogeneous PDE if \(f = 0\) and is a linear nonhomogeneous PDE if \(f \neq 0\). Burgers' equation \(\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = 0\) is a nonlinear PDE, as there is no way to express \(u \frac{\partial u}{\partial x}\) as a linear operator on u.

The superposition principle. Linear homogeneous PDEs, just like linear homogeneous ODEs, are
particularly nice to study because we can appeal to the superposition principle to seek a solution.
Specifically, if u1 and u2 satisfy Lu1 = 0 and Lu2 = 0, then u1 + u2 satisfies L(u1 + u2 ) = 0. If we
in addition find a solution u0 to a nonhomogeneous PDE Lu0 = f , then u0 + u1 + u2 also satisfies
the nonhomogeneous PDE L(u0 + u1 + u2 ) = f . We will exploit these properties in many solution
techniques throughout this course.
Due to the superposition principle, in general there are an infinite number of functions that
satisfy a given PDE. We must supply additional conditions to find a unique solution. In the
context of PDEs, these conditions are provided by boundary conditions, if the problem is posed in
a bounded domain, and by initial conditions, if the problem is time dependent. (Initial conditions
can be thought of as “boundary conditions” in the space-time domain.) We will study various
boundary and initial conditions for each model PDE in later lectures.
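The superposition property can be observed numerically. In the sketch below (the two harmonic functions are our own choices), \(u_1 = \sin(\pi x)\sinh(\pi y)\) and \(u_2 = x^2 - y^2\) each solve Laplace's equation, and a five-point finite-difference Laplacian confirms that their sum does as well.

```python
import numpy as np

# Superposition for Laplace's equation: u1 and u2 are harmonic, so u1 + u2 is
# harmonic too. We check Delta u ~ 0 with a five-point finite-difference stencil.
u1 = lambda x, y: np.sin(np.pi * x) * np.sinh(np.pi * y)
u2 = lambda x, y: x**2 - y**2

def lap(u, x, y, h=1e-4):
    # five-point approximation of the Laplacian at (x, y)
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h) - 4.0 * u(x, y)) / h**2

assert abs(lap(u1, 0.3, 0.4)) < 1e-4
assert abs(lap(u2, 0.3, 0.4)) < 1e-4
assert abs(lap(lambda x, y: u1(x, y) + u2(x, y), 0.3, 0.4)) < 1e-4  # superposition
```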

1.4 Classification of second-order PDEs
We have introduced a few second-order linear PDEs including Laplace's equation, the heat equation, and the wave equation. In fact, second-order PDEs are ubiquitous in science and engineering, which justifies a finer classification of these PDEs.

Definition 1.9 (classification of second-order linear PDEs). Consider a second-order linear PDE with a differential operator on \(\mathbb{R}^n\):
\[
Lu = \sum_{i,j=1}^{n} a_{ij} \frac{\partial^2 u}{\partial x_i \partial x_j} + \sum_{i=1}^{n} b_i \frac{\partial u}{\partial x_i} + cu \tag{1.1}
\]
for some coefficients \(a_{ij}\), \(b_i\), and c. Let \(A \in \mathbb{R}^{n \times n}\) be a symmetric matrix such that \(A_{ij} = a_{ij}\), \(i, j = 1, \dots, n\). (Note that A can always be symmetrized without loss of generality since the mixed partials are the same.) Let \(\lambda_1, \dots, \lambda_n\) be the eigenvalues of A. Then we have the following classification of PDEs:

• elliptic: all eigenvalues are nonzero and have the same sign (i.e., all positive or all negative).

• hyperbolic: all eigenvalues are nonzero, and one of them has the opposite sign from the n − 1
others.

• ultrahyperbolic: all eigenvalues are nonzero, and at least two of them are positive and at least
two of them are negative.

• parabolic: exactly one eigenvalue is zero and all other eigenvalues have the same sign.

If the PDE is time-dependent, then we treat time as the zeroth coordinate and run the summations
from 0 to n in (1.1).

Example 1.10 (classification: Laplace's equation). The differential operator for Laplace's equation in \(\mathbb{R}^n\) is
\[
L = -\Delta = -\frac{\partial^2}{\partial x_1^2} - \cdots - \frac{\partial^2}{\partial x_n^2}.
\]
We recognize the associated A matrix is \(A = -I\). All eigenvalues are equal to −1, and hence Laplace's equation is elliptic.

Example 1.11 (classification: Poisson’s equation). Poisson’s equation is also elliptic because its
differential operator is the Laplacian.

Example 1.12 (classification: wave equation). The differential operator for the wave equation in \(\mathbb{R}^n\) is
\[
L = \frac{\partial^2}{\partial t^2} - \Delta = \frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x_1^2} - \cdots - \frac{\partial^2}{\partial x_n^2}.
\]
We treat the temporal coordinate as the zeroth coordinate and recognize the associated A matrix is \(A = \mathrm{diag}(1, -1, -1, \dots, -1) \in \mathbb{R}^{(n+1) \times (n+1)}\). We have one positive eigenvalue and n negative eigenvalues. The wave equation is hence hyperbolic.

Example 1.13 (classification: heat equation). The differential operator for the heat equation in \(\mathbb{R}^n\) is
\[
L = \frac{\partial}{\partial t} - \Delta = \frac{\partial}{\partial t} - \frac{\partial^2}{\partial x_1^2} - \cdots - \frac{\partial^2}{\partial x_n^2}.
\]
We treat the temporal coordinate as the zeroth coordinate and recognize the associated A matrix is \(A = \mathrm{diag}(0, -1, -1, \dots, -1) \in \mathbb{R}^{(n+1) \times (n+1)}\). Note that the entry of A associated with the temporal coordinate is 0 because the differential operator does not contain a second derivative with respect to time. We have one zero eigenvalue and n negative eigenvalues. The heat equation is hence parabolic.
Example 1.14 (classification: finding the A matrix). Consider a second-order PDE
\[
\frac{\partial^2 u}{\partial x_1^2} - 2 \frac{\partial^2 u}{\partial x_1 \partial x_2} + 2 \frac{\partial^2 u}{\partial x_2^2} = f
\]
for some f. We note that the linear differential operator can be rearranged into the form (1.1):
\[
Lu = 1\,\frac{\partial^2 u}{\partial x_1^2} - 1\,\frac{\partial^2 u}{\partial x_1 \partial x_2} - 1\,\frac{\partial^2 u}{\partial x_2 \partial x_1} + 2\,\frac{\partial^2 u}{\partial x_2^2},
\]
where we have split the mixed partial term into two so that the coefficients in front of the terms are the same. We recognize that \(a_{11} = 1\), \(a_{12} = a_{21} = -1\), and \(a_{22} = 2\), and hence
\[
A = \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}.
\]
Note that the matrix is symmetric (as required for classification) thanks to the splitting of the mixed partial term. To find the eigenvalues, we form the characteristic equation \(p_A(\lambda) \equiv \det(\lambda I - A) = \lambda^2 - 3\lambda + 1\) and find its roots are \(\lambda = \frac{3}{2} \pm \frac{\sqrt{5}}{2}\). Since both eigenvalues are positive, the equation is elliptic.
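The procedure of Examples 1.10–1.14 can be automated: compute the eigenvalues of the symmetric matrix A and count their signs. The sketch below (the `classify` helper is our own, not from the notes) applies Definition 1.9 with NumPy.

```python
import numpy as np

# Classify a second-order linear PDE from the eigenvalue signs of its symmetric
# coefficient matrix A (Definition 1.9). Helper name and tolerance are our own.
def classify(A, tol=1e-12):
    lam = np.linalg.eigvalsh(np.asarray(A, dtype=float))  # A must be symmetric
    pos = int(np.sum(lam > tol))
    neg = int(np.sum(lam < -tol))
    zero = lam.size - pos - neg
    if zero == 0 and (pos == 0 or neg == 0):
        return "elliptic"          # all nonzero, same sign
    if zero == 0 and min(pos, neg) == 1:
        return "hyperbolic"        # one eigenvalue of opposite sign
    if zero == 0 and pos >= 2 and neg >= 2:
        return "ultrahyperbolic"   # at least two of each sign
    if zero == 1 and (pos == 0 or neg == 0):
        return "parabolic"         # exactly one zero, rest same sign
    return "unclassified"

assert classify([[1.0, -1.0], [-1.0, 2.0]]) == "elliptic"    # Example 1.14
assert classify(-np.eye(3)) == "elliptic"                    # Laplace's equation
assert classify(np.diag([1.0, -1.0, -1.0])) == "hyperbolic"  # wave equation
assert classify(np.diag([0.0, -1.0, -1.0])) == "parabolic"   # heat equation
```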
Example 1.15 (classification: PDE with variable coefficients). The classification of a PDE may vary over the domain for PDEs with variable coefficients. For instance, consider the second-order PDE
\[
y \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0.
\]
The matrix A associated with the PDE is \(A = \mathrm{diag}(y, 1)\), and its eigenvalues are \(\lambda_1 = 1\) and \(\lambda_2 = y\). Hence, the PDE is parabolic on the line \(y = 0\), elliptic in the region \(y > 0\), and hyperbolic in the region \(y < 0\).
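The regional behavior of Example 1.15 can be made concrete by inspecting the eigenvalue signs of \(A = \mathrm{diag}(y, 1)\) at sample points (the chosen y values below are our own illustration):

```python
import numpy as np

# Example 1.15 in code: A = diag(y, 1) changes type as y changes sign. We
# inspect the eigenvalue signs at a few sample y values (our own choices).
for y in (-1.0, 0.0, 1.0):
    signs = tuple(np.sign(np.linalg.eigvalsh(np.diag([y, 1.0]))))
    print(y, signs)
# y = -1: one negative, one positive eigenvalue -> hyperbolic
# y =  0: one zero, one positive eigenvalue     -> parabolic
# y =  1: two positive eigenvalues              -> elliptic
```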
The classification of PDEs into elliptic, hyperbolic, and parabolic equations is useful because equations in a given class exhibit similar behaviors. For instance, the only difference between the wave equation (viewed in space-time, with time as the zeroth coordinate) and Laplace's equation is that one of the eigenvalues is positive for the wave equation; however, the wave equation is hyperbolic and Laplace's equation is elliptic, and they describe different physical processes and exhibit different solution behaviors. On the other hand, while Laplace's equation and the Navier-Cauchy equations of linear elasticity might look quite different, they are both elliptic and exhibit similar behaviors. Similarly, while the wave equation and Maxwell's equations might look quite different, they are both hyperbolic and exhibit similar behaviors.

1.5 PDEs in physics and engineering
Before we conclude this lecture, we provide some examples of more complex PDEs that are encountered in science and engineering. While we will not study these PDEs in detail, we will see that much of the behavior of these more complex PDEs is captured by our model PDEs.
• Navier-Cauchy equations. The Navier-Cauchy equations are a system of PDEs of the form
\[
\mu \sum_{j=1}^{d} \frac{\partial^2 u_i}{\partial x_j^2} + (\mu + \lambda) \sum_{j=1}^{d} \frac{\partial^2 u_j}{\partial x_i \partial x_j} = f_i, \qquad i = 1, \dots, d,
\]
where u is the (vector-valued) displacement field, f is the body force, and λ and µ are the first and second Lamé parameters, respectively. The Navier-Cauchy equations, also known as the equations of linear elasticity, are a system of linear PDEs. The equations describe the deformation of a material under load (under the assumption of small displacements and strains), and hence the behavior of engineering structures.
• Maxwell's equations. Maxwell's equations in free space are given by
\[
\nabla \times E = -\frac{\partial B}{\partial t}, \qquad
\nabla \cdot E = 0, \qquad
\nabla \times B = \frac{1}{c^2} \frac{\partial E}{\partial t}, \qquad
\nabla \cdot B = 0,
\]
where E is the electric field, B is the magnetic field, and c is the speed of light. Maxwell's equations are a system of linear, homogeneous PDEs. Maxwell's equations describe the propagation of electromagnetic waves in free space, and hence the behavior of electric circuits, electric motors, wireless communication, and power generation, among others.
• Incompressible Navier-Stokes equations. The incompressible Navier-Stokes equations are given by
\[
\frac{\partial u}{\partial t} + \nabla \cdot (u \otimes u) + \nabla p - \frac{1}{Re} \Delta u = f, \qquad
\nabla \cdot u = 0,
\]
where u is the (vector-valued) velocity field, p is the (scalar-valued) pressure field, and Re is the Reynolds number. The incompressible Navier-Stokes equations comprise a system of nonlinear PDEs that describes the velocity and pressure of an incompressible (Newtonian) flow.
• Compressible Euler equations. The compressible Euler equations are given by
\[
\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho u) = 0, \qquad
\frac{\partial (\rho u)}{\partial t} + \nabla \cdot (\rho u \otimes u) + \nabla p = 0, \qquad
\frac{\partial (\rho e)}{\partial t} + \nabla \cdot (\rho h u) = 0,
\]
where ρ is the density field, u is the (vector-valued) velocity field, e is the specific internal energy field, p is the pressure field, and h is the specific enthalpy field. The compressible Euler equations are a system of nonlinear PDEs that describes the behavior of inviscid compressible flows.

• Schrödinger equation. The Schrödinger equation for a single particle is given by
\[
i \hbar \frac{\partial \Psi}{\partial t} = \left( -\frac{\hbar^2}{2m} \Delta + V \right) \Psi,
\]
where \(\Psi : \mathbb{R}^n \times \mathbb{R} \to \mathbb{C}\) is the wave function in position space, ℏ is the reduced Planck constant, m is the particle mass, and \(V : \mathbb{R}^n \times \mathbb{R} \to \mathbb{C}\) is the potential function. The Schrödinger equation describes the quantum state of an isolated non-relativistic quantum system.

• Black-Scholes equation. The Black-Scholes equation is given by
\[
\frac{\partial V}{\partial t} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + r S \frac{\partial V}{\partial S} - r V = 0,
\]
where V is the price of the option as a function of the stock price S and time t, r is the risk-free interest rate, and σ is the volatility of the stock.
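As a numerical sanity check of the Black-Scholes equation above, the sketch below verifies that the standard closed-form price of a European call (background knowledge, not derived in these notes; the parameter values are our own choices) satisfies the PDE to finite-difference accuracy.

```python
import math

# Verify that the standard European call price satisfies the Black-Scholes PDE
# V_t + (1/2) sigma^2 S^2 V_SS + r S V_S - r V = 0 at a sample point.
def N(x):  # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

K, T, r, sigma = 1.0, 1.0, 0.05, 0.2   # strike, maturity, rate, volatility

def V(S, t):
    tau = T - t                        # time to maturity
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * N(d1) - K * math.exp(-r * tau) * N(d2)

S, t, h = 1.1, 0.4, 1e-4
V_t = (V(S, t + h) - V(S, t - h)) / (2 * h)
V_S = (V(S + h, t) - V(S - h, t)) / (2 * h)
V_SS = (V(S + h, t) - 2 * V(S, t) + V(S - h, t)) / h**2
residual = V_t + 0.5 * sigma**2 * S**2 * V_SS + r * S * V_S - r * V(S, t)
assert abs(residual) < 1e-6
```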

1.6 Summary
We summarize key points of this lecture:

1. A PDE is an equation that relates an unknown function u and some of its partial derivatives.

2. PDEs can be classified in terms of the order, linearity, and homogeneity.

3. PDEs are augmented by initial and/or boundary conditions to provide a well-posed problem.

4. Second-order PDEs can be further classified as elliptic, hyperbolic, ultrahyperbolic, or parabolic. The classification is based on the eigenvalues of the coefficient matrix of the second derivatives.

5. Laplace’s equation, Poisson’s equation, the heat equation, the transport equation, the wave
equation, and Burgers’ equation each serves as a model PDE of a given classification and will
be closely studied throughout this course.

1.A Well-posedness problem
Throughout this course, we will introduce different techniques to solve PDEs. However, it only
makes sense to seek a solution if the problem is well-posed. We study this notion in this appendix.
Definition 1.16 (well-posedness in the sense of Hadamard). A mathematical problem is said to
be well-posed if (i) a solution exists, (ii) the solution is unique, and (iii) the solution depends
continuously on the data. If the problem is not well-posed, it is said to be ill-posed.
As we are not yet equipped to study the well-posedness of PDEs, we consider a few examples
of well-posedness in the context of linear algebra and ODEs.
Example 1.17 (linear algebra). Consider the solution of a linear system associated with a matrix \(A \in \mathbb{R}^{n \times n}\): find \(x \in \mathbb{R}^n\) such that
\[
Ax = b
\]
for some data \(b \in \mathbb{R}^n\). We recall that the problem has a unique solution for any data b if A is nonsingular. In addition, we can readily show that if \(Ax_1 = b_1\) and \(Ax_2 = b_2\), then \(\|x_1 - x_2\| \le (1/\sigma_{\min}(A)) \|b_1 - b_2\|\), where \(\sigma_{\min}(A)\) is the minimum singular value of A, which is nonzero for a nonsingular A. A small perturbation in the data b results in a small perturbation in the solution x. Hence the problem is well-posed if A is nonsingular.
On the other hand, if A is singular and \(b \notin \mathrm{Img}(A)\), then the solution does not exist. If A is singular and \(b \in \mathrm{Img}(A)\), then there are infinitely many solutions; if \(Ax = b\), then \(x + y\) is also a solution for any \(y \in \mathrm{Ker}(A)\). (Recall that \(\mathrm{Ker}(A) = \{0\}\) for a nonsingular A, but \(\mathrm{Ker}(A)\) includes other elements for a singular A.) Hence, if A is singular, then the problem is ill-posed.
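The continuous-dependence bound in Example 1.17 is easy to observe numerically. In the sketch below (the matrix and data are our own choices), we solve two nearby linear systems and check that the perturbation of the solution is controlled by \((1/\sigma_{\min}(A))\|b_1 - b_2\|\).

```python
import numpy as np

# Well-posedness of Ax = b for nonsingular A: a small perturbation of the data
# b perturbs the solution x by at most (1/sigma_min(A)) * ||b1 - b2||.
A = np.array([[2.0, 1.0], [1.0, 3.0]])              # nonsingular
sigma_min = np.linalg.svd(A, compute_uv=False).min()

b1 = np.array([1.0, 2.0])
b2 = b1 + np.array([1e-6, -2e-6])                   # small data perturbation
x1 = np.linalg.solve(A, b1)
x2 = np.linalg.solve(A, b2)
assert np.linalg.norm(x1 - x2) <= np.linalg.norm(b1 - b2) / sigma_min
```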
Example 1.18 (initial value problem). Consider an initial value problem on \(I \equiv (0, T]\) for the final time \(T \in \mathbb{R}_{>0}\):
\[
\frac{du}{dt} = f(t, u) \quad \text{in } I, \qquad u(t = 0) = g,
\]
where \(u : I \to \mathbb{R}\) is the solution, \(f : I \times \mathbb{R} \to \mathbb{R}\) defines the ODE, and \(g \in \mathbb{R}\) is the initial condition. Suppose f is continuous in t and uniformly Lipschitz continuous in u; i.e., there exists \(\gamma < \infty\) such that for any \((t, u_1)\) and \((t, u_2)\) we have \(|f(t, u_1) - f(t, u_2)| \le \gamma |u_1 - u_2|\). Then the Picard-Lindelöf theorem states that a unique solution to the initial value problem exists. In addition, the energy estimate states that \(|u_1(t) - u_2(t)| \le |g_1 - g_2| \exp(\gamma t)\), which means that the solution depends continuously on the data. Hence the problem is well-posed if the assumptions of the Picard-Lindelöf theorem hold.
It is also relatively simple to find f for which the problem is ill-posed. A classical example is

du/dt = u^{1/3},
u(t = 0) = 0;

this initial value problem admits infinitely many solutions of the form

u(t) = 0 for t ∈ (0, τ), and u(t) = ((2/3)(t − τ))^{3/2} for t ∈ (τ, T],

for any τ ∈ I. We readily verify that f(t, u) = u^{1/3} does not satisfy the uniform Lipschitz continuity condition of the Picard–Lindelöf theorem.
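The nonuniqueness can be verified directly by substitution. The following sketch (not from the notes) checks that both the trivial solution and the nontrivial branch with τ = 0 satisfy the ODE pointwise:

```python
# Sketch: both u(t) = 0 and u(t) = (2t/3)^{3/2} (the tau = 0 branch above)
# satisfy du/dt = u^{1/3} with u(0) = 0, illustrating nonuniqueness.

def ode_rhs(u):
    # right-hand side f(t, u) = u^{1/3}
    return u ** (1.0 / 3.0)

t = 1.0
# trivial solution u(t) = 0: du/dt = 0 matches u^{1/3} = 0
res_trivial = abs(0.0 - ode_rhs(0.0))
# nontrivial solution u(t) = (2t/3)^{3/2}: du/dt = (2t/3)^{1/2}
u_val = (2.0 * t / 3.0) ** 1.5
res_nontrivial = abs((2.0 * t / 3.0) ** 0.5 - ode_rhs(u_val))
# both residuals vanish (up to roundoff), yet the two solutions differ
```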

Lecture 2

Fourier series: formulation

©2019–2021 Masayuki Yano. Prepared for ESC384 Partial Differential Equations taught at the
University of Toronto.

2.1 Introduction
In order to solve a PDE, we must be able to represent its solution, which is a function in space (and
time). We could attempt to describe the function in terms of the value that it takes at each point
in space (and time), but this representation is not convenient to solve or to analyze PDEs. A much
more convenient approach is to represent the function as a linear combination of known functions,
so that the function canPbe identified by a set of coefficients. Namely, we represent the solution
u : Ω → R as u(x) = ∞ n=1 ûn φn (x), where φn , n = 1, 2, . . . , are the known functions and ûn ,
n = 1, 2, . . . , are the coefficients to be determined. One class of infinite series that is particularly
useful to represent the solution of PDEs is Fourier series. In this lecture we introduce Fourier series
so that we can use them to solve PDEs in the following lectures. In this lecture we focus on the
construction of Fourier series; we will analyze its theoretical properties in the next lecture.

2.2 Periodic, even, and odd functions


As periodic functions play an important role in the construction of Fourier series, we first provide
a few definitions related to the periodicity of functions.

Definition 2.1 (periodic function). A function f : R → R is said to be z-periodic if

f (x + z) = f (x) ∀x ∈ R.

The real number z is called a period. The smallest real number z such that the relationship holds
is called the fundamental period.

Example 2.2 (periodicity of sine functions). Perhaps one of the most common periodic functions
is the sine function. The function f (x) = sin(x) is 2π-periodic because

sin(x + 2π) = sin(x) ∀x ∈ R.

Its fundamental period is 2π, because the relationship does not hold for any shorter period. Note
that any integer multiple of the fundamental period is also a period of the function; e.g., z =
2π, 4π, 6π.

Example 2.3 (periodicity of sum of sine functions). Consider f(x) = cos(x) + sin(2x). The functions cos(x) and sin(2x) are periodic with fundamental periods 2π and π, respectively. The fundamental period of the sum is 2π, the least common multiple of the two fundamental periods (here, the larger of the two).

We also recall the definition of even and odd functions.

Definition 2.4 (even function). A function f : (−a, a) → R is an even function if for all x ∈ (−a, a),

f (−x) = f (x).

Definition 2.5 (odd function). A function f : (−a, a) → R is an odd function if for all x ∈ (−a, a),

f (−x) = −f (x).

If f is even, then the graph y = f (x) is symmetric about the y axis; the left hand side is the
mirror image of the right hand side. If f is odd, then the graph y = f (x) is symmetric about the
origin. We provide a few examples.

Example 2.6. For any λ ≥ 0, the function f (x) = cos(λx) is an even function because f (−x) =
cos(−λx) = cos(λx) = f (x). For any λ ≥ 0, the function f (x) = sin(λx) is an odd function because
f (−x) = sin(−λx) = − sin(λx) = −f (x).

Example 2.7. A monomial xk is an even function if k is even and an odd function if k is odd.

We also note a few properties of even and odd functions under some operations; each result can be readily proven from the definition of even and odd functions:

• Summation. The sum of two even functions is even. The sum of two odd functions is odd.
However, the sum of an even function and an odd function is in general neither even nor odd.

• Multiplication. The product of two even functions or two odd functions is even. The product
of an even function and an odd function is odd.

• Differentiation. The derivative of an odd function is even, and the derivative of an even
function is odd. The result follows from the chain rule.
• Integration. If f is an even function, then the integral g(x) = ∫₀ˣ f(ξ) dξ is odd. If f is an odd function, then the integral g(x) = ∫₀ˣ f(ξ) dξ is even.

• Integrals. If f is odd, then ∫_{−a}^{a} f(x) dx = 0. If f is even, then ∫_{−a}^{a} f(x) dx = 2 ∫₀^{a} f(x) dx.

• Behavior at origin. If f is even and is differentiable at x = 0, then its derivative must vanish:
i.e., f 0 (x = 0) = 0. If f is odd and is continuous at x = 0, then its value must vanish: i.e.,
f (x = 0) = 0.

Given a function defined on a finite interval, we may also “extend” the function to the entire
real line R to construct a periodic function on R. We introduce a few different extensions.

17
Definition 2.8 (periodic extension). Given a function f : (0, a) → R, its periodic extension to R is the function f^p : R → R such that

f^p(x) = f(x), x ∈ (0, a),

and has a period a (i.e., f^p(x + a) = f^p(x), ∀x ∈ R, or equivalently f^p(x + na) = f(x), ∀x ∈ (0, a) and ∀n ∈ Z).
Definition 2.9 (even periodic extension). Given a function f : (0, a) → R, its even periodic extension is the function f^{e,p} : R → R such that

f^{e,p}(x) = f(x), x ∈ (0, a),
f^{e,p}(x) = f(−x), x ∈ (−a, 0),

and has a period of 2a (i.e., f^{e,p}(x + 2a) = f^{e,p}(x), ∀x ∈ R).

Definition 2.10 (odd periodic extension). Given a function f : (0, a) → R, its odd periodic extension is the function f^{o,p} : R → R such that

f^{o,p}(x) = f(x), x ∈ (0, a),
f^{o,p}(x) = −f(−x), x ∈ (−a, 0),

and has a period of 2a (i.e., f^{o,p}(x + 2a) = f^{o,p}(x), ∀x ∈ R).


In the above definitions of periodic, even-periodic, and odd-periodic extensions, we do not
specify the values of the extensions at the end points x = na, n ∈ Z. As we will see shortly, the
lack of the specification does not cause a problem in our discussion of Fourier series.
We demonstrate these three periodic extensions using a concrete example.
Example 2.11 (periodic extensions). Let f (x) = x, x ∈ (0, 1). Figure 2.1 shows the periodic
extension, even periodic extension, and the odd periodic extension for the function. To construct
the periodic extension, we simply repeat the function, with a period of 1. We construct the even
periodic extension in two steps: we first mirror the function about the y-axis to construct the
even extension; we then construct the periodic extension of the even extension with a period of
2. By definition the even (or odd) extension of a function is defined on a symmetric interval (e.g.,
[−1, 1]). We also note that the period of the even periodic extension is twice the domain length of
the original function to match the domain length of the even extension. Similarly, we construct the
odd periodic extension in two steps: we first mirror the function about the origin to construct the
odd extension; we then construct the periodic extension of the odd extension with a period of 2.
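The three extensions of Example 2.11 can be sketched in code (a sketch, not part of the original notes; the helper names `periodic_ext`, `_to_base`, etc. are mine): each evaluator maps an arbitrary x ∈ R back into the base interval and then applies the appropriate reflection.

```python
import math

# Sketch of Example 2.11: the periodic, even periodic, and odd periodic
# extensions of f(x) = x on (0, 1), evaluated at arbitrary x in R.

def f(x):
    return x  # original function on (0, 1)

def periodic_ext(x):
    # period 1: shift x into [0, 1)
    return f(x - math.floor(x))

def _to_base(x):
    # map x into the base interval (-1, 1] for the period-2 extensions
    return x - 2.0 * math.floor((x + 1.0) / 2.0)

def even_periodic_ext(x):
    # mirror about the y-axis, then repeat with period 2
    return f(abs(_to_base(x)))

def odd_periodic_ext(x):
    # mirror about the origin, then repeat with period 2
    y = _to_base(x)
    return f(y) if y >= 0.0 else -f(-y)
```

For instance, the even periodic extension evaluated at x = −0.3 returns f(0.3), and at x = 2.3 it matches its value at x = 0.3.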

2.3 Fourier series


We now introduce Fourier series, which provide a representation of a periodic function as a linear
combination of sinusoidal functions. For simplicity, we consider 2-periodic functions; we will later
generalize the formulation to an arbitrary period in Section 2.5.
Definition 2.12 (Fourier series). Let f : R → R be a 2-periodic function. The Fourier series of the function is

f(x) ∼ (1/2) a₀ + Σ_{n=1}^{∞} (a_n cos(nπx) + b_n sin(nπx)),   (2.1)
Figure 2.1: Periodic extensions of f(x) = x for x ∈ [0, 1]: (a) periodic extension; (b) even periodic extension; (c) odd periodic extension.

with Fourier coefficients

a_n = ∫_{−1}^{1} cos(nπx) f(x) dx, n = 0, 1, 2, . . . ,   (2.2)
b_n = ∫_{−1}^{1} sin(nπx) f(x) dx, n = 1, 2, . . . .

Definition 2.13 (truncated Fourier series). Let f : R → R be a 2-periodic function. The N-term truncated Fourier series of the function is

s_N(x) ≡ (1/2) a₀ + Σ_{n=1}^{N} (a_n cos(nπx) + b_n sin(nπx)),

with coefficients given by (2.2).

We make some observations about Fourier series.

1. Because we have not verified, and in general will not be able to verify, that the Fourier series
of f is the same as f , we use the notation “∼” instead of “=” in (2.1). In the next lecture
we will answer various questions on the convergence of Fourier series including the following:
does the series converge? In what sense does the series converge? How smooth does the
function f have to be for the series to converge?

2. The set of functions

{1/2, sin(πx), cos(πx), sin(2πx), cos(2πx), . . .}

is orthonormal with respect to the L2 inner product, (w, v) ≡ ∫_{−1}^{1} w v dx. For all n, m ∈ N_{>0},

∫_{−1}^{1} (1/2) cos(mπx) dx = ∫_{−1}^{1} (1/2) sin(mπx) dx = 0,
∫_{−1}^{1} cos(nπx) cos(mπx) dx = δ_{mn},
∫_{−1}^{1} sin(nπx) sin(mπx) dx = δ_{mn},
∫_{−1}^{1} sin(nπx) cos(mπx) dx = 0,

where δ_{mn} is the Kronecker delta; i.e., δ_{mn} = 1 if m = n and δ_{mn} = 0 if m ≠ n.

3. The expression for the coefficients (2.2) is a consequence of the orthonormality of the sinusoidal functions. For instance, to obtain the expression for a₂, we multiply the Fourier series (2.1) by cos(2πx), integrate the expression over (−1, 1), and appeal to the orthonormality: i.e.,

∫_{−1}^{1} cos(2πx) f(x) dx = ∫_{−1}^{1} cos(2πx) ((1/2) a₀ + Σ_{n=1}^{∞} (a_n cos(nπx) + b_n sin(nπx))) dx
= (1/2) a₀ ∫_{−1}^{1} cos(2πx) dx + Σ_{n=1}^{∞} a_n ∫_{−1}^{1} cos(2πx) cos(nπx) dx + Σ_{n=1}^{∞} b_n ∫_{−1}^{1} cos(2πx) sin(nπx) dx
= Σ_{n=1}^{∞} a_n δ_{2n} = a₂.

All integrals with the exception of the one multiplied by a₂ vanish due to the mutual orthogonality of the sine and cosine functions. We obtain the expressions for all other a_n and b_n using the same orthogonality argument.

4. We can also find the Fourier series of a function f defined on (−1, 1) (instead of a 2-periodic function on R). The Fourier series of f : (−1, 1) → R is given by (2.1) and it converges to (i) f on (−1, 1) and (ii) its periodic extension f^p on R.

5. The truncated Fourier series sN of f is the N -term partial sum of the Fourier series of f .
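The orthogonality relations above can be confirmed numerically (a sketch, not from the notes; the `integrate` helper is a hypothetical composite trapezoidal rule, which is highly accurate here because the integrands are periodic over (−1, 1)):

```python
import math

# Sketch: numerically confirm the orthogonality relations on (-1, 1) that
# underlie the Fourier coefficient formulas, via a composite trapezoidal rule.

def integrate(g, a=-1.0, b=1.0, n=4000):
    h = (b - a) / n
    return h * (0.5 * (g(a) + g(b)) + sum(g(a + i * h) for i in range(1, n)))

# distinct cosine modes are orthogonal, sine modes have unit norm,
# and sine/cosine modes are mutually orthogonal
i_cc = integrate(lambda x: math.cos(2 * math.pi * x) * math.cos(3 * math.pi * x))
i_ss = integrate(lambda x: math.sin(2 * math.pi * x) ** 2)
i_sc = integrate(lambda x: math.sin(2 * math.pi * x) * math.cos(2 * math.pi * x))
# i_cc ~ 0, i_ss ~ 1, i_sc ~ 0
```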

We now consider an example:

Example 2.14 (The Fourier series of f(x) = x on (−1, 1)). We consider the 2-periodic extension of the function f(x) = x for x ∈ [−1, 1]. We wish to find the Fourier series representation of the function. The coefficient a_n, n ∈ N_{≥0}, is given by

a_n = ∫_{−1}^{1} cos(nπx) f(x) dx = ∫_{−1}^{1} cos(nπx) x dx = 0;

the integral evaluates to 0 because the integrand is an odd function, and the domain of integration is symmetric. The Fourier coefficient b_n is given by

b_n = ∫_{−1}^{1} sin(nπx) f(x) dx = ∫_{−1}^{1} sin(nπx) x dx
    = [−(1/(nπ)) cos(nπx) x]_{x=−1}^{1} + ∫_{−1}^{1} (1/(nπ)) cos(nπx) dx
    = −(1/(nπ)) (cos(nπ) + cos(−nπ)) + (1/(n²π²)) [sin(nπx)]_{x=−1}^{1}
    = −(2/(nπ)) cos(nπ) = −(2/(nπ)) (−1)ⁿ = (2/(nπ)) (−1)^{n+1};

here, the second equality follows from integration by parts, and the second-to-last equality follows from cos(nπ) = (−1)ⁿ, ∀n ∈ Z. It follows that

f(x) ∼ Σ_{n=1}^{∞} b_n sin(nπx) = Σ_{n=1}^{∞} (2(−1)^{n+1}/(nπ)) sin(nπx)
     = (2/π) (sin(πx) − (1/2) sin(2πx) + (1/3) sin(3πx) − · · ·).

Figure 2.2: Truncated Fourier series of f(x) = x for x ∈ [−1, 1].
The series contains only sine terms (and not cosine terms) because the function f (x) = x is an odd
function. Figure 2.2 illustrates the Fourier series approximation of the function. We observe that
the weighted sum of the sine functions at least visually converges to f . (As noted earlier, we will
analyze the convergence of Fourier series in the next lecture.)
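The closed-form coefficients of Example 2.14 can be checked against quadrature (a sketch, not from the notes; `b_quad` is a hypothetical midpoint-rule evaluator of the coefficient integral):

```python
import math

# Sketch: check the closed-form coefficients of Example 2.14,
# b_n = 2(-1)^{n+1}/(n pi), against midpoint-rule quadrature of
# b_n = int_{-1}^{1} sin(n pi x) x dx.

def b_quad(n, m=20000):
    h = 2.0 / m
    total = 0.0
    for i in range(m):
        x = -1.0 + (i + 0.5) * h      # midpoint of the i-th subinterval
        total += math.sin(n * math.pi * x) * x * h
    return total

b_exact = [2.0 * (-1) ** (n + 1) / (n * math.pi) for n in range(1, 5)]
b_num = [b_quad(n) for n in range(1, 5)]
max_diff = max(abs(e - q) for e, q in zip(b_exact, b_num))
# quadrature and closed form agree to roughly quadrature accuracy
```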

2.4 Fourier sine and cosine series


The Fourier series provides a representation for any periodic function. It is often also useful to
construct a representation specialized to odd or even periodic functions. A Fourier sine series is
the Fourier series associated with odd periodic functions, and a Fourier cosine series is the Fourier
series associated with even periodic functions.
Definition 2.15 (Fourier cosine series). Let f : R → R be an even 2-periodic function. The Fourier cosine series of f is given by

f(x) ∼ (1/2) f̂₀ + Σ_{n=1}^{∞} f̂_n cos(nπx)   (2.3)

where

f̂_n = ∫_{−1}^{1} cos(nπx) f(x) dx = 2 ∫₀^{1} cos(nπx) f(x) dx, n ∈ N_{≥0}.

Definition 2.16 (Fourier sine series). Let f : R → R be an odd 2-periodic function. The Fourier sine series of f is given by

f(x) ∼ Σ_{n=1}^{∞} f̂_n sin(nπx)   (2.4)

where

f̂_n = ∫_{−1}^{1} sin(nπx) f(x) dx = 2 ∫₀^{1} sin(nπx) f(x) dx, n ∈ N_{>0}.

We make some observations:


1. The domain of integration for the coefficients can be reduced from [−1, 1] to [0, 1] due to the
symmetry of even functions for the Fourier cosine series (and due to the antisymmetry of odd
functions for the Fourier sine series).

2. Similar to the (full) Fourier series, we can also find the Fourier cosine series of a function f
defined on (0, 1) (instead of an even 2-periodic function on the entire R). The Fourier cosine
series of f : (0, 1) → R is given by (2.3) and it converges to (i) f on (0, 1) and (ii) its even
periodic extension f^{e,p} on the entire R.

3. Similarly, we can find the Fourier sine series of a function f defined on (0, 1) (instead of an
odd 2-periodic function on the entire R). The Fourier sine series of f : (0, 1) → R is given
by (2.4) and it converges to (i) f on (0, 1) and (ii) its odd periodic extension f^{o,p} on the entire
R.

4. We use the notation (f̂_n)_{n=0}^{∞} to denote the Fourier sine/cosine coefficients of f. The notation makes the connection between the function and its Fourier coefficients more explicit, which will be helpful when we work with Fourier series associated with multiple functions. (Unfortunately this notation cannot be used for the (full) Fourier series, as we require two sets of coefficients for sine and cosine.)
We now consider an example of Fourier cosine and sine series.
Example 2.17 (Fourier cosine series of f(x) = x on (0, 1)). We consider the Fourier cosine series representation of f(x) = x for x ∈ (0, 1). The coefficient f̂₀ of the Fourier cosine series is given by

f̂₀ = 2 ∫₀^{1} x dx = 1.

For n > 0, we obtain

f̂_n = 2 ∫₀^{1} x cos(nπx) dx = 2 [cos(nπx)/(n²π²) + x sin(nπx)/(nπ)]_{x=0}^{1} = (2/(n²π²)) (cos(nπ) − 1)
     = (2/(n²π²)) ((−1)ⁿ − 1) = −4/(n²π²) for n odd, and 0 for n even.

Figure 2.3: Truncated Fourier cosine and sine series of f(x) = x for x ∈ [0, 1]: (a) Fourier cosine series; (b) Fourier sine series.

The Fourier cosine series of f is hence

f(x) ∼ (1/2) f̂₀ + Σ_{n=1}^{∞} f̂_n cos(nπx) = 1/2 + Σ_{k=1}^{∞} f̂_{2k−1} cos((2k − 1)πx)
     = 1/2 − Σ_{k=1}^{∞} (4/((2k − 1)²π²)) cos((2k − 1)πx)
     = 1/2 − (4/π²) (cos(πx) + (1/9) cos(3πx) + (1/25) cos(5πx) + · · ·).
Figure 2.3(a) illustrates the Fourier cosine series representation of the function. We observe that
the series (at least visually) approximates the function f (x) = x on (0, 1). Note that the Fourier
cosine series approximates the even periodic extension of f .

Example 2.18 (Fourier sine series approximation of f(x) = x on (0, 1)). We consider the Fourier sine series representation of f(x) = x for x ∈ (0, 1). The coefficients of the Fourier sine series are given by

f̂_n = 2 ∫₀^{1} x sin(nπx) dx = 2 [sin(nπx)/(n²π²) − x cos(nπx)/(nπ)]_{x=0}^{1} = −(2/(nπ)) cos(nπ) = (2/(nπ)) (−1)^{n+1}.

The Fourier sine series of f is hence

f(x) ∼ Σ_{n=1}^{∞} f̂_n sin(nπx) = Σ_{n=1}^{∞} (2(−1)^{n+1}/(nπ)) sin(nπx) = (2/π) (sin(πx) − (1/2) sin(2πx) + (1/3) sin(3πx) − · · ·).

Not surprisingly, the Fourier sine series of f (x) = x on (0, 1) considered here is the same as the
Fourier series of g(x) = x on (−1, 1) considered in Example 2.14 because the odd periodic extension
of f is the same as the periodic extension of g. Figure 2.3(b) illustrates the Fourier sine series
representation of the function. Note that the Fourier sine series approximates the odd periodic
extension of f .
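The cosine-series coefficients of Example 2.17 can be exercised directly (a sketch, not from the notes; `cosine_partial_sum` is a hypothetical evaluator of the truncated series): the partial sums approach f(x) = x at interior points as more terms are kept.

```python
import math

# Sketch: evaluate the truncated Fourier cosine series of Example 2.17,
# s(x) = 1/2 - sum over odd n of 4/(n^2 pi^2) cos(n pi x),
# and observe convergence toward f(x) = x at an interior point.

def cosine_partial_sum(x, N):
    s = 0.5  # the (1/2) * fhat_0 term, with fhat_0 = 1
    for n in range(1, N + 1, 2):  # only odd n contribute
        s -= 4.0 / (n * math.pi) ** 2 * math.cos(n * math.pi * x)
    return s

err_coarse = abs(cosine_partial_sum(0.3, 2) - 0.3)    # few terms
err_fine = abs(cosine_partial_sum(0.3, 200) - 0.3)    # many terms
# err_fine is far smaller than err_coarse
```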

2.5 Fourier series on arbitrary intervals
We now consider Fourier series for periodic functions of arbitrary periods. Given a 2a-periodic function f : R → R, the Fourier series of f results from simple adjustments of the arguments of the sine and cosine functions to the period of 2a:

f(x) ∼ (1/2) a₀ + Σ_{n=1}^{∞} (a_n cos(nπx/a) + b_n sin(nπx/a)),

with coefficients

a_n = (1/a) ∫_{−a}^{a} cos(nπx/a) f(x) dx, n = 0, 1, 2, . . . ,
b_n = (1/a) ∫_{−a}^{a} sin(nπx/a) f(x) dx, n = 1, 2, . . . .

The normalization factor of the coefficients for the Fourier series on (−a, a) is scaled by 1/a relative to the Fourier series on (−1, 1) because the orthogonality relations are

(1/a) ∫_{−a}^{a} sin(nπx/a) cos(mπx/a) dx = 0,
(1/a) ∫_{−a}^{a} sin(nπx/a) sin(mπx/a) dx = δ_{mn},
(1/a) ∫_{−a}^{a} cos(nπx/a) cos(mπx/a) dx = δ_{mn}.

The Fourier series of a function f : (−a, a) → R is also given by the same expression, and the series converges to the 2a-periodic extension of f on the entire R.
Similarly, the Fourier cosine series of an even 2a-periodic function f : R → R is given by

f(x) ∼ (1/2) f̂₀ + Σ_{n=1}^{∞} f̂_n cos(nπx/a)

with Fourier cosine coefficients

f̂_n = (2/a) ∫₀^{a} cos(nπx/a) f(x) dx, n = 0, 1, 2, . . . .

The Fourier cosine series of a function f : (0, a) → R is also given by the same expression, and the series converges to the even 2a-periodic extension of f on the entire R.
The Fourier sine series of an odd 2a-periodic function f : R → R is given by

f(x) ∼ Σ_{n=1}^{∞} f̂_n sin(nπx/a)

with Fourier sine coefficients

f̂_n = (2/a) ∫₀^{a} sin(nπx/a) f(x) dx, n = 1, 2, . . . .

The Fourier sine series of a function f : (0, a) → R is also given by the same expression, and the
series converges to the odd 2a-periodic extension of f on the entire R.
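The general-period coefficient formulas above can be sketched in code (not part of the original notes; `fourier_coeffs` is a hypothetical helper using midpoint quadrature, which is essentially exact here since a full period of a trigonometric polynomial is integrated):

```python
import math

# Sketch: Fourier coefficients on an arbitrary interval (-a, a), following
# the formulas of this section, computed with a midpoint quadrature rule.

def fourier_coeffs(f, a, N, m=4000):
    h = 2.0 * a / m
    xs = [-a + (i + 0.5) * h for i in range(m)]
    a0 = sum(f(x) for x in xs) * h / a
    an = [sum(math.cos(n * math.pi * x / a) * f(x) for x in xs) * h / a
          for n in range(1, N + 1)]
    bn = [sum(math.sin(n * math.pi * x / a) * f(x) for x in xs) * h / a
          for n in range(1, N + 1)]
    return a0, an, bn

# sanity check: f(x) = cos(pi x / a) on (-2, 2) should give a_1 = 1
# and all other coefficients 0
a0, an, bn = fourier_coeffs(lambda x: math.cos(math.pi * x / 2.0), 2.0, 3)
```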

2.6 Summary
We summarize key points of this lecture:

1. Fourier series provide series representations (or approximations) of periodic functions f : R → R in terms of sinusoidal functions. To find the coefficients of the series, we multiply f by the sinusoidal functions and integrate over the period; this is a consequence of the orthonormality of the sinusoidal functions.

2. Fourier cosine and sine series provide series representations (or approximations) of even and odd periodic functions, respectively.

3. Given f : (−a, a) → R, its (full) Fourier series approximates the periodic extension of f. Given f : (0, a) → R, its Fourier cosine and sine series approximate the even periodic and odd periodic extensions, respectively, of f.

4. Fourier series can be applied to periodic functions of an arbitrary period by adjusting the
period of the sinusoidal functions and the associated coefficients accordingly.

2.A Complex form of Fourier series
The Fourier series can be written in a more compact form using complex exponentials. To begin, we recall the identities (which follow from Euler's formula)

sin(x) = (1/(2i)) (exp(ix) − exp(−ix)) and cos(x) = (1/2) (exp(ix) + exp(−ix)).

It follows that the set of sinusoidal functions

{1, cos(πx), sin(πx), cos(2πx), sin(2πx), . . .}

can be replaced by the complex exponentials

{1, exp(−iπx), exp(iπx), exp(−i2πx), exp(i2πx), . . .} = {exp(inπx)}_{n=−∞}^{∞}.

The complex form of the Fourier series is as follows:

Definition 2.19 (complex form of Fourier series). Let f : R → R be a 2-periodic function. The complex form of the Fourier series of f is

f(x) ∼ Σ_{n=−∞}^{∞} f̂_n exp(inπx),   (2.5)

with coefficients

f̂_n = (1/2) ∫_{−1}^{1} exp(−inπx) f(x) dx.

Similar to the (real) Fourier series of f, the expression for the coefficients follows from the orthogonality relationship. Specifically, the orthogonality relationship for the complex exponentials exp(−imπx) and exp(inπx) for m ≠ n is

∫_{−1}^{1} exp(−imπx) exp(inπx) dx = ∫_{−1}^{1} exp(i(n − m)πx) dx
= (1/(iπ(n − m))) (exp(i(n − m)π) − exp(−i(n − m)π)) = (2/(π(n − m))) sin((n − m)π) = 0,

and for m = n is

∫_{−1}^{1} exp(i(n − m)πx) dx = ∫_{−1}^{1} dx = 2.

The complex form of Fourier series (2.5) immediately follows from the orthogonality relationship. For a 2a-periodic function f : R → R, the complex form of the Fourier series is given by

f(x) ∼ Σ_{n=−∞}^{∞} f̂_n exp(inπx/a),

with complex Fourier coefficients

f̂_n = (1/(2a)) ∫_{−a}^{a} exp(−inπx/a) f(x) dx.

The Fourier series of a function f : (−a, a) → R is also given by the same expression, and the series
converges to the 2a-periodic extension of f on the entire R.
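For a real function, the complex coefficients are related to the real-form coefficients by f̂_n = (a_n − i b_n)/2 for n > 0; this can be checked numerically (a sketch, not from the notes; `fhat` is a hypothetical midpoint-rule evaluator of the coefficient integral):

```python
import cmath
import math

# Sketch: complex Fourier coefficients of f(x) = x on (-1, 1) via midpoint
# quadrature, compared against the real-form coefficients of Example 2.14
# through the relation fhat_n = (a_n - i b_n)/2 (here a_n = 0).

def fhat(n, m=20000):
    h = 2.0 / m
    s = 0j
    for i in range(m):
        x = -1.0 + (i + 0.5) * h
        s += cmath.exp(-1j * n * math.pi * x) * x * h
    return 0.5 * s

n = 3
b_n = 2.0 * (-1) ** (n + 1) / (n * math.pi)   # real-form sine coefficient
expected = -0.5j * b_n                        # (a_n - i b_n)/2 with a_n = 0
# fhat(3) agrees with expected to quadrature accuracy
```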
The complex form of Fourier series is more concise and arguably more elegant than the real
form of Fourier series, which requires separate coefficients for the sine and cosine modes. However,
in this course we mostly use the real form of Fourier series; this is mainly because Fourier sine and
cosine series are far more frequently encountered than the (full) Fourier series in the analysis of
PDEs, and the sine/cosine series also admit a compact representation.

2.B List of useful integrals
We list a few indefinite integrals that are useful for the evaluation of Fourier series.
1. ∫ sin(αx) dx = −cos(αx)/α

2. ∫ cos(αx) dx = sin(αx)/α

3. ∫ x sin(αx) dx = sin(αx)/α² − x cos(αx)/α

4. ∫ x cos(αx) dx = cos(αx)/α² + x sin(αx)/α

5. ∫ x² sin(αx) dx = 2x sin(αx)/α² + (2 − α²x²) cos(αx)/α³

6. ∫ x² cos(αx) dx = 2x cos(αx)/α² + (α²x² − 2) sin(αx)/α³

7. ∫ sin(αx) sin(βx) dx = sin((α − β)x)/(2(α − β)) − sin((α + β)x)/(2(α + β)), α ≠ β

8. ∫ cos(αx) sin(βx) dx = cos((α − β)x)/(2(α − β)) − cos((α + β)x)/(2(α + β)), α ≠ β

9. ∫ cos(αx) cos(βx) dx = sin((α − β)x)/(2(α − β)) + sin((α + β)x)/(2(α + β)), α ≠ β

10. ∫ sin²(αx) dx = x/2 − sin(2αx)/(4α)

11. ∫ cos(αx) sin(αx) dx = sin²(αx)/(2α)

12. ∫ cos²(αx) dx = x/2 + sin(2αx)/(4α)

13. cos(nπ) = (−1)ⁿ, n ∈ Z

14. sin(nπ) = 0, n ∈ Z
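Any entry in the list can be spot-checked by differentiating the antiderivative numerically (a sketch, not from the notes; the helper names are mine). Here entry 3 is verified with a central difference:

```python
import math

# Sketch: spot-check entry 3 of the list,
# int x sin(ax) dx = sin(ax)/a^2 - x cos(ax)/a,
# by differentiating the antiderivative with a central difference.

def antiderivative(x, a=2.5):
    return math.sin(a * x) / a ** 2 - x * math.cos(a * x) / a

def integrand(x, a=2.5):
    return x * math.sin(a * x)

x0, h = 0.7, 1e-5
dF = (antiderivative(x0 + h) - antiderivative(x0 - h)) / (2.0 * h)
# dF matches integrand(x0) up to O(h^2) discretization error
```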

Lecture 3

Fourier series: analysis


3.1 Introduction
In the previous lecture we introduced Fourier series as a means to represent (i) periodic functions
on R or (ii) a periodic extension of functions defined on an interval. We have yet to answer the all
important question: do Fourier series converge and, if so, for what classes of functions and in what
sense? In this lecture we introduce a few different types of convergence for a sequence of functions
and then discuss the regularity that the underlying function must possess to achieve each type of
convergence.

3.2 Regularity of functions


We first review a few different characterizations of the regularity of functions. We will use these
characterizations in the subsequent discussion of Fourier series.
Definition 3.1 (continuous functions). A function f : (a, b) → R defined on an open interval (a, b) is continuous on (a, b) if lim_{x→c} f(x) = f(c) for all c ∈ (a, b). A function g : [a, b] → R defined on a closed interval [a, b] is continuous on [a, b] if (i) lim_{x→c} g(x) = g(c) for all c ∈ (a, b), (ii) lim_{x→a⁺} g(x) = g(a), and (iii) lim_{x→b⁻} g(x) = g(b). The space of continuous functions on an interval I is denoted C⁰(I).
Definition 3.2 (continuously differentiable functions). Let I ⊂ R be an open or closed interval. A function f : I → R is continuously differentiable on I if its derivative f′ is continuous on I. The space of continuously differentiable functions is denoted C¹(I).
Definition 3.3 (piecewise continuous functions). Let (a, b) ⊂ R. A function f : (a, b) → R is piecewise continuous on (a, b) if there exists a set of points a ≡ x₁ < x₂ < · · · < x_N ≡ b such that f is continuous on each interval (xᵢ, xᵢ₊₁) and the one-sided limits lim_{x→xᵢ⁺} f(x) and lim_{x→xᵢ₊₁⁻} f(x) exist for i = 1, . . . , N − 1.
Definition 3.4 (piecewise continuously differentiable functions). Let (a, b) ⊂ R. A function f : (a, b) → R is piecewise continuously differentiable on (a, b) if there exists a set of points a ≡ x₁ < x₂ < · · · < x_N ≡ b such that f is continuously differentiable on each interval (xᵢ, xᵢ₊₁) and the one-sided limits lim_{x→xᵢ⁺} f′(x) and lim_{x→xᵢ₊₁⁻} f′(x) exist for i = 1, . . . , N − 1.
A function f is continuous on (a, b) if there are no discontinuities in the function. A function f
is piecewise continuous on (a, b) if we can construct a finite number of open subintervals such that
f is continuous on each of the open subintervals and limits exist at the endpoints; in other words,
the function may have a finite number of discontinuities as long as it is bounded everywhere. Note
that if f is continuous on (a, b), then it is also piecewise continuous on (a, b); however, the converse
is not true in general. We now consider a few concrete examples.
Example 3.5 (continuous function). Let f (x) = 1/x for x ∈ [0, 1]. The function is not continuous
on the closed interval [0, 1] as the function is not defined at x = 0. The function is however
continuous on the open interval (0, 1).

Example 3.6 (continuously differentiable function). Let f(x) = √x for x ∈ [0, 1]. The function is continuous on the interval [0, 1] but is not continuously differentiable on the interval; the derivative f′(x) = 1/(2√x) is not defined at x = 0. (The function is similarly piecewise continuous but not piecewise continuously differentiable.)
Example 3.7 (piecewise continuous function). Let

f(x) = 0 for x ∈ [−1, 0], and f(x) = 1 for x ∈ (0, 1].

The function is not continuous on [−1, 1] because lim_{x→0⁻} f(x) = 0 ≠ 1 = lim_{x→0⁺} f(x). However, the function is piecewise continuous on [−1, 1] because it is continuous on (−1, 0) and (0, 1) and the limits at the endpoints exist.
Example 3.8. Let f (x) = tan(x) for x ∈ R, and consider subintervals In ≡ (nπ − π/2, nπ + π/2),
n ∈ Z. While the function is continuous on each open subinterval In , the function is not piecewise
continuous because the one-sided limits do not exist at the endpoints of the subintervals.

3.3 Convergence of Fourier series


We have so far discussed the representation of various functions as Fourier series but have not
discussed two important questions: does the series converge, and, if yes, in what sense do they
converge? To facilitate the discussion of convergence, we recall the truncated Fourier series defined
in Definition 2.13,

s_N(x) ≡ (1/2) a₀ + Σ_{n=1}^{N} (a_n cos(nπx) + b_n sin(nπx)),

which is the N-term partial sum of the Fourier series. We now present results on the pointwise convergence, uniform convergence, and L2 convergence of Fourier series.
Theorem 3.9 (Pointwise convergence of Fourier series). Let f be a piecewise continuously differentiable function with a period 2, and s_N be the associated truncated Fourier series. Then for any fixed x ∈ R,

s_N(x) → (1/2) [f(x⁻) + f(x⁺)] as N → ∞.

The result also holds for the Fourier cosine (resp. sine) series if f is a piecewise continuously differentiable, even (resp. odd) function with a period 2.

Proof. See Appendix 3.A. (Optional)

Corollary 3.10. Consider the setting of Theorem 3.9. If f is continuous at x, then sN (x) → f (x)
as N → ∞. If f is not continuous at x, then sN (x) converges to the average of the two one-sided
limits f (x− ) and f (x+ ) as N → ∞.

Theorem 3.11 (Uniform convergence of Fourier series). Let f be a continuous and piecewise continuously differentiable function with a period 2, and s_N be the associated truncated Fourier series. Then s_N converges uniformly to f; i.e.,

max_{x∈R} |f(x) − s_N(x)| → 0 as N → ∞.

The result also holds for the Fourier cosine (resp. sine) series if f is a continuous, piecewise continuously differentiable, even (resp. odd) function with a period 2.

Proof. See Appendix 3.B. (Optional)

Theorem 3.12 (L2 convergence of Fourier series). Let f be a square-integrable function with a period 2; i.e., f is 2-periodic and

∫_{−1}^{1} f(x)² dx < ∞.

Then s_N converges in L2 to f; i.e.,

∫_{−1}^{1} (f(x) − s_N(x))² dx → 0 as N → ∞.

The result also holds for the Fourier cosine (resp. sine) series if f is a square-integrable, even (resp. odd) function with a period 2.

Proof. See Appendix 3.C. (Optional)

Example 3.13 (pointwise, uniform, and L2 convergence). In this example we illustrate pointwise,
uniform, and L2 convergence using a simple sequence of functions (which is not a truncated Fourier
series). Consider a sequence of functions fN : [0, 1] → R, N = 1, 2, . . . , given by

f_N(x) = x^N.

For any fixed x ∈ [0, 1), this function converges to

f_N(x) → 0 as N → ∞

because x < 1. For x = 1, the function evaluates to

f_N(x = 1) = 1^N = 1, ∀N.

Figure 3.1: Convergence of fN (x) = xN .

Hence, the functions fN converge to f : [0, 1] → R given by


f(x) = 0 for x ∈ [0, 1), and f(x) = 1 at x = 1.

This convergence is pointwise, since we considered the convergence fN (x) → f (x) for each fixed x.
While fN → f pointwise, the sequence does not converge uniformly. To see this, we observe
that, for any N ,

max_{x∈[0,1]} |f(x) − f_N(x)| = max_{x∈(0,1)} |f(x) − f_N(x)| = max_{x∈(0,1)} |0 − x^N| = max_{x∈(0,1)} |x^N| = 1 ≠ 0,

where the first equality follows from |f (x) − fN (x)| = 0 at x = 0 and 1. (Technically, the “max”
should be replaced by “sup”, the least upper bound.) The maximum error is equal to 1 for all N
and does not converge to 0 as N → ∞; hence the convergence is not uniform.
While the sequence does not converge uniformly, it does converge in L2 because
∫₀^{1} (f(x) − f_N(x))² dx = ∫₀^{1} (0 − x^N)² dx = (1/(2N + 1)) x^{2N+1} |_{x=0}^{1} = 1/(2N + 1),

which converges to 0 as N → ∞.
In summary, fN → f pointwise and in L2 , but not uniformly. In general, uniform convergence is
a stronger condition than pointwise or L2 convergence; i.e., if fN → f uniformly, then the sequence
also converges pointwise and in L2 , but the converse is not true.
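The three behaviors of Example 3.13 can be reproduced numerically (a sketch, not from the notes; `sup_err` samples the supremum on a grid and the helper names are mine):

```python
# Sketch of Example 3.13: for f_N(x) = x^N on [0, 1], the sup-norm error
# against the pointwise limit stays near 1 while the L2 error vanishes.

def sup_err(N, m=100000):
    # sampled sup over (0, 1) of |0 - x^N|; approaches 1 from below
    return max((i / float(m)) ** N for i in range(1, m))

def l2_err_sq(N):
    # exact: int_0^1 x^{2N} dx = 1/(2N + 1)
    return 1.0 / (2 * N + 1)

sups = [sup_err(N) for N in (1, 10, 100)]
l2s = [l2_err_sq(N) for N in (1, 10, 100)]
# sups stay near 1 (no uniform convergence); l2s decay to 0 (L2 convergence)
```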

Example 3.14 (pointwise vs uniform convergence of Fourier series). We now consider uniform and pointwise convergence of the Fourier series for a 2-periodic square-wave function

f(x) = sgn(sin(πx));

here sgn(z) evaluates to 1 if z > 0, 0 if z = 0, and −1 if z < 0. Since this is an odd function, we approximate the function using the Fourier sine series of the form Σ_{n=1}^{∞} f̂_n sin(nπx). We find that the coefficients must satisfy

f̂_n = 2 ∫₀^{1} sin(nπx) dx = 4/(nπ) for n odd, and 0 for n even.

Hence our Fourier sine series representation of the square wave is

f(x) ∼ Σ_{k=1}^{∞} (4/((2k − 1)π)) sin((2k − 1)πx).

Figure 3.2: Convergence of the Fourier series for a square wave: (a) function; (b) convergence.

Note that the square wave function is piecewise continuously differentiable but is not (globally)
continuous. Hence, it satisfies the condition for pointwise convergence but not uniform convergence.
Figure 3.2(a) shows the truncated Fourier series for the square wave for N = 10, 30, 100, and
1000. We observe that (i) the truncated Fourier series at least visually gets “closer” to the exact
square wave almost everywhere as N increases, but (ii) the magnitude of the oscillations near
the discontinuities does not decrease. (It can be shown that the peak value is approximately 1.18 as N → ∞.) In particular, for any given fixed x (e.g., x = 0.5), we (at least visually) observe that
the truncated Fourier series converges to the exact square wave as N → ∞; this suggests pointwise
convergence. However, regardless of how large N may be, we can always find some x for which the
error is large (i.e., the peak of the oscillation); this suggests the series is not uniformly convergent.

We study the behavior in more detail in Figure 3.2(b), which shows the error in the truncated
Fourier series as a function of the number of terms N for three different points of the square
wave. To observe convergence, we fix the value of x to (say) x = 0.5 and observe that the error
|f (x = 0.5) − sN (x = 0.5)| decreases as N increases; i.e., sN (x = 0.5) → f (x = 0.5) as N increases.
This convergence for a fixed x is also shown for x = 0.1 and x = 0.01 and in fact is observed for any
x; i.e., for any fixed x and any δ > 0, there exists N sufficiently large such that |f(x) − s_N(x)| < δ. However,
this value of N depends on x. In particular, as x → 0, the number of terms N required to achieve a
given error grows in an unbounded manner. This is a manifestation of the fact that the truncated
Fourier series sN(x) is a continuous function that evaluates to sN(x = 0) = 0, and hence a large
number of terms is required for sN(x = ε) to evaluate to values near 1 as ε → 0. Hence, the Fourier series does
not converge uniformly for this discontinuous function.
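Both behaviors can be checked numerically. The following sketch (plain NumPy; the grid resolution and the values of N are arbitrary choices, not part of the original notes) evaluates the truncated sine series of the square wave, verifying the pointwise convergence at x = 0.5 and the persistent overshoot near the discontinuity.

```python
import numpy as np

def s_N(x, N):
    """Truncated Fourier sine series of the square wave: only odd n contribute."""
    n = np.arange(1, N + 1, 2)                       # odd n <= N
    return (4.0 / (np.pi * n)) @ np.sin(np.pi * np.outer(n, x))

# pointwise convergence at the fixed interior point x = 0.5
errs = [abs(1.0 - s_N(np.array([0.5]), N)[0]) for N in (10, 100, 1000)]
assert errs[0] > errs[1] > errs[2]                   # error shrinks as N grows

# the overshoot near the jump at x = 0 does not shrink (Gibbs phenomenon)
x = np.linspace(1e-4, 0.5, 5001)
peaks = [s_N(x, N).max() for N in (100, 1000)]
for p in peaks:
    assert 1.15 < p < 1.20                           # peak stays near 1.18
```

Increasing N moves the overshoot closer to the discontinuity but never reduces its height, which is exactly the failure of uniform convergence described above.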

We note that the key difference between the conditions for pointwise convergence and uniform
convergence is the (global) continuity of the function. Namely, if f : R → R is discontinuous
anywhere, we cannot expect the Fourier series to converge uniformly; the condition for uniform
convergence is more stringent than for pointwise convergence. The condition for L2 convergence is
even weaker than the condition for pointwise convergence; all piecewise continuously differentiable
functions are square integrable since the function is necessarily bounded, but square integrable func-
tion need not be piecewise continuously differentiable. (In fact, the square integrability condition
is even weaker than the piecewise continuity condition by the same argument.)

3.4 Convergence of Fourier series on finite intervals


While the convergence theorems, Theorems 3.9, 3.11, and 3.12, are provided for periodic functions
on the entire R, we can readily apply the theorems to a function defined on a finite interval by
considering an appropriate periodic extension of the function. For instance, the (full) Fourier series
of f : (−a, a) → R is pointwise convergent if f is piecewise continuously differentiable; it converges
to f on (−a, a) in the pointwise sense (and to the periodic extension of f on the entire R). Similarly,
the Fourier cosine and sine series of f : (0, a) → R are each pointwise convergent if f is piecewise
continuously differentiable; they each converge to f on (0, a) in the pointwise sense (and to the
even and odd periodic extensions, respectively, of f on the entire R).
The same extension also applies to uniform convergence; however, it is important to note that
the (global) continuity condition for uniform convergence applies to the periodic, even periodic,
and odd periodic extensions for the Fourier, Fourier cosine, and Fourier sine series, respectively.
We illustrate this in a concrete example:

Example 3.15 (uniform convergence on a finite interval). Let f (x) = x on (0, 1). The Fourier
cosine series of the function converges uniformly on (0, 1) as the even periodic extension of f is a
continuous and piecewise continuously differentiable function on R. On the other hand, the Fourier
sine series of the function does not converge uniformly on (0, 1) as the odd periodic extension of f
is not a continuous function on R; see Figure 2.1. (The Fourier sine series is nevertheless pointwise
convergent because the odd periodic extension of f is piecewise continuously differentiable.)
The continuity requirement for each type of periodic extension, which is required for uniform
convergence, can be summarized via the following conditions for the endpoints:

• (Full) Fourier series. For f : (−a, a) → R, the values at the endpoints should match: i.e.,
f (−a+ ) = f (a− ).

• Fourier sine series. For f : (0, a) → R, the function must vanish at the endpoints: i.e.,
f (0+ ) = f (a− ) = 0.

• Fourier cosine series. For f : (0, a) → R, there are no additional endpoint conditions.

We readily arrive at these results by considering appropriate periodic extensions of f .
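These endpoint conditions can be observed numerically for Example 3.15. The sketch below (assuming the coefficient formulas for f(x) = x from Examples 2.17 and 2.18; the grid and truncation levels are arbitrary choices) compares the maximum error of the truncated cosine and sine series on [0, 1]: the former shrinks with N, the latter does not.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 2001)
f = x

def cos_series(N):
    # cosine series of f(x) = x on (0,1): a0/2 = 1/2, a_n = 2((-1)^n - 1)/(n^2 pi^2)
    n = np.arange(1, N + 1)
    a = 2.0 * ((-1.0)**n - 1.0) / (n**2 * np.pi**2)
    return 0.5 + a @ np.cos(np.pi * np.outer(n, x))

def sin_series(N):
    # sine series of f(x) = x on (0,1): b_n = 2(-1)^{n+1}/(n pi)
    n = np.arange(1, N + 1)
    b = 2.0 * (-1.0)**(n + 1) / (n * np.pi)
    return b @ np.sin(np.pi * np.outer(n, x))

# cosine series: the sup-norm error shrinks with N (uniform convergence)
e_cos = [np.max(np.abs(f - cos_series(N))) for N in (10, 100, 1000)]
assert e_cos[0] > e_cos[1] > e_cos[2]

# sine series: the series vanishes at x = 1 while f(1-) = 1, so the sup-norm
# error stays O(1) for every N (no uniform convergence)
e_sin = [np.max(np.abs(f - sin_series(N))) for N in (10, 100, 1000)]
assert all(e > 0.1 for e in e_sin)
```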

3.5 Uniform and absolute convergence
Theorem 3.11 provides conditions that a function f must satisfy for its Fourier series to be uniformly
convergent. We now provide an alternative condition for uniform convergence in terms of the
coefficients of the Fourier series.
Theorem 3.16 (absolute convergence). Let f : R → R be a periodic function with the Fourier
series

    f(x) ∼ (1/2)a0 + ∑_{n=1}^∞ (an cos(nπx) + bn sin(nπx)),

where the coefficients are given by (2.1). The Fourier series converges uniformly to f if

    ∑_{n=1}^∞ (|an| + |bn|) < ∞.

Proof. See Appendix 3.B. (Optional)


This theorem is useful because it allows us to assess the uniform convergence of an arbitrary
series based solely on the decay of its Fourier coefficients, as illustrated in the following examples.
Example 3.17. Consider a Fourier cosine series

    f(x) ∼ ∑_{n=1}^∞ (2/(n²π²))((−1)^n − 1) cos(nπx).

We recognize that the Fourier cosine coefficients are fˆn = (2/(n²π²))((−1)^n − 1). To check if the series is
uniformly convergent, we check the absolute convergence of the coefficients and observe that

    ∑_{n=1}^∞ |fˆn| ≤ ∑_{n=1}^∞ 4/(n²π²) < ∞.

Because the coefficients are absolutely convergent, the Fourier series converges uniformly. (Note
that this Fourier cosine series is associated with f(x) = x, x ∈ (0, 1), considered in Example 2.17.
Because the even periodic extension of the function is continuous and piecewise continuously dif-
ferentiable, Theorem 3.11 also states the Fourier series is uniformly convergent. However, Theo-
rem 3.16, unlike Theorem 3.11, does not require us to know an explicit form of f.)
Example 3.18. Consider a Fourier sine series

    f(x) ∼ ∑_{n=1}^∞ (2/(nπ))(−1)^{n+1} sin(nπx).

We recognize that the Fourier sine coefficients are fˆn = (−1)^{n+1} 2/(nπ). To check if the series is
uniformly convergent, we check the absolute convergence of the coefficients and observe that

    ∑_{n=1}^∞ |fˆn| = ∑_{n=1}^∞ 2/(nπ),

which diverges. Because the coefficients are not absolutely convergent, the Fourier series does not
converge uniformly. (Note that this Fourier sine series is associated with f(x) = x, x ∈ (0, 1),
considered in Example 2.18. Because the odd periodic extension of the function is not continuous,
Theorem 3.11 also states the Fourier series is not uniformly convergent.)
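Theorem 3.16 reduces the uniform-convergence check to a scalar series, which we can probe numerically. The sketch below (partial sums only — a finite computation cannot prove convergence or divergence, it merely illustrates the trend) contrasts the two examples.

```python
import numpy as np

n = np.arange(1, 10**6 + 1).astype(float)

# Example 3.17: |f^_n| <= 4/(n^2 pi^2); the bounding series sums to (4/pi^2)(pi^2/6) = 2/3
cos_partial = np.cumsum(4.0 / (np.pi**2 * n**2))
assert abs(cos_partial[-1] - 2.0 / 3.0) < 1e-5       # partial sums settle near the limit

# Example 3.18: |f^_n| = 2/(n pi) is a multiple of the harmonic series
sin_partial = np.cumsum(2.0 / (np.pi * n))
# each doubling of the number of terms adds about (2/pi) ln 2 ~ 0.44, so no finite limit
assert sin_partial[-1] - sin_partial[len(n) // 2 - 1] > 0.4
```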

3.6 Differentiation of Fourier series
We now summarize the properties of Fourier series under differentiation. These properties will play
a crucial role as we use Fourier series to represent solutions to (partial) differential equations.

Theorem 3.19 (differentiation of Fourier series). Let f be a continuous periodic function with
piecewise continuously differentiable derivative f′. Then the Fourier series of f′ can be obtained
by term-by-term differentiation of the Fourier series of f; i.e., if

    f(x) ∼ (1/2)a0 + ∑_{n=1}^∞ (an cos(nπx) + bn sin(nπx)),

then

    f′(x) ∼ ∑_{n=1}^∞ (−an nπ sin(nπx) + bn nπ cos(nπx)).

For any fixed x ∈ R, the differentiated series converges to (1/2)(f′(x−) + f′(x+)).

Theorem 3.20 (uniform convergence of derivatives). Let f be a periodic function with Fourier
coefficients an and bn. Suppose

    ∑_{n=1}^∞ (|n^k an| + |n^k bn|) < ∞

for some k ∈ N>0. Then f has continuous derivatives f′, . . . , f^(k), whose Fourier series are
obtained by term-by-term differentiation of the Fourier series of f.

Proof. The proof follows from Theorems 3.19 and 3.16.

Example 3.21 (C∞ or smooth functions). One particular form of Fourier series that we encounter
in the analysis of the heat equation is

    f(x) = ∑_{n=1}^∞ exp(−λn) sin(nπx)

for λ > 0. We recognize that an = 0 and bn = exp(−λn). To verify if the derivative of order k
exists, we consider the series

    ∑_{n=1}^∞ n^k exp(−λn),

and observe that this series converges for any k ∈ N>0. Hence f is infinitely differentiable; an
infinitely differentiable function is said to be smooth and belongs to C∞. (To prove the series
converges, use, for instance, the comparison test with cn = 1/n²: note that (i) the ratio of terms,
n^k exp(−λn)/cn = n^{k+2} exp(−λn), is less than 1 for n sufficiently large, and (ii) ∑_{n=1}^∞ cn converges.)
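The summability of n^k exp(−λn) can be sanity-checked numerically; for k = 1 the series even has the closed form ∑_{n≥1} n xⁿ = x/(1 − x)² with x = e^{−λ}. A minimal sketch (λ = 0.5 and the values of k are arbitrary choices):

```python
import numpy as np

lam = 0.5
x = np.exp(-lam)
n = np.arange(1, 2001).astype(float)

# k = 1: compare the partial sum against the closed form x/(1-x)^2
s1 = np.sum(n * x**n)
assert abs(s1 - x / (1.0 - x)**2) < 1e-12

# larger k: the exponential decay still wins; the tail past n = 1000 is negligible
for k in (3, 5, 10):
    terms = n**k * np.exp(-lam * n)
    assert np.sum(terms[1000:]) < 1e-10 * np.sum(terms)
```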

3.7 Summary
We summarize key points of this lecture:

1. Uniform, pointwise, and L2 convergences are three different ways to characterize the con-
vergence of a sequence of functions (fn )n to f . Uniform convergence is more stringent than
pointwise convergence, and pointwise convergence is more stringent than L2 convergence.

2. The Fourier series of f : R → R converges in L2 for any periodic function f that is square
integrable.

3. The Fourier series of f : R → R converges pointwise, except at discontinuities, for any
periodic function f that is piecewise continuously differentiable. At the discontinuities the
Fourier series converges to the average of the left and right limits.

4. The Fourier series of f : R → R converges uniformly for any periodic function f that is
continuous and piecewise continuously differentiable.

5. A Fourier series is uniformly convergent if its coefficients are absolutely convergent.

6. The conditions for pointwise, uniform, and L2 convergence readily extend to functions defined
on a bounded domain as long as the appropriate periodic extension of the function satisfies
the required continuity and differentiability conditions.

7. The term-by-term differentiation of a Fourier series is well defined only if the function is
continuous and piecewise continuously differentiable.

3.A Proof of pointwise convergence
We prove the pointwise convergence of Fourier series, Theorem 3.9. We prove the result for f :
R → R with period 2, but the result readily generalizes to periodic functions of any period
through the transformation considered in Section 2.5. The presentations in this and subsequent
Appendices closely follow Strauss, Partial Differential Equations: An Introduction.
For simplicity, we prove the result for a continuous f . To begin, we recall that the Fourier series
and the associated partial sum are given by

    f(x) ∼ (1/2)a0 + ∑_{n=1}^∞ (an cos(nπx) + bn sin(nπx)),

    sN(x) = (1/2)a0 + ∑_{n=1}^N (an cos(nπx) + bn sin(nπx)),

for coefficients

    an = ∫_{−1}^{1} cos(nπy)f(y) dy and bn = ∫_{−1}^{1} sin(nπy)f(y) dy.
We now substitute the expression for the coefficients into the partial sum to obtain

    sN(x) = ∫_{y=−1}^{1} [1/2 + ∑_{n=1}^N (cos(nπy) cos(nπx) + sin(nπy) sin(nπx))] f(y) dy

          = ∫_{y=−1}^{1} [1/2 + ∑_{n=1}^N cos(nπ(y − x))] f(y) dy.

We now introduce the Dirichlet kernel

    KN(ξ) ≡ 1/2 + ∑_{n=1}^N cos(nπξ)

so that

    sN(x) = ∫_{y=−1}^{1} KN(y − x)f(y) dy.

The Dirichlet kernel has two important properties. The first is that the sum can be expressed in a
simple form:

    KN(ξ) = sin((N + 1/2)πξ) / (2 sin(πξ/2));

this result can be readily shown using De Moivre's formula and rearranging the exponentials.
The second property is that

    ∫_{−1}^{1} KN(ξ) dξ = 1,

since only the leading term, 1/2, contributes to the integral. We now set ξ = y − x to obtain

    sN(x) = ∫_{y=−1}^{1} KN(y − x)f(y) dy = ∫_{ξ=−1−x}^{1−x} KN(ξ)f(x + ξ) dξ = ∫_{ξ=−1}^{1} KN(ξ)f(x + ξ) dξ,

where the last equality follows since KN and f are both 2-periodic. We now appeal to
∫_{−1}^{1} KN(ξ) dξ = 1 to obtain

    f(x) − sN(x) = ∫_{ξ=−1}^{1} KN(ξ)(f(x) − f(x + ξ)) dξ.

We now rearrange the expression as follows:

    f(x) − sN(x) = ∫_{ξ=−1}^{1} [sin((N + 1/2)πξ) / (2 sin(πξ/2))] (f(x) − f(x + ξ)) dξ = ∫_{ξ=−1}^{1} g(ξ; x) φN(ξ) dξ,

where

    g(ξ; x) ≡ (f(x) − f(x + ξ)) / (2 sin(πξ/2)),
    φN(ξ) ≡ sin((N + 1/2)πξ), N = 1, 2, . . . .

We note that {φN}_{N=1}^∞ is an orthonormal set on (−1, 1) because

    ∫_{−1}^{1} sin((N + 1/2)πξ) sin((M + 1/2)πξ) dξ = ∫_{−1}^{1} sin(Nπξ) sin(Mπξ) dξ = δMN,

where the first equality follows from the periodicity of the sine functions. Hence by Bessel's
inequality

    ∑_{N=1}^∞ (g(·; x), φN)² ≤ ‖g(·; x)‖².

It follows that if ‖g(·; x)‖ < ∞ then the series converges, which implies that (g(·; x), φN) = f(x) −
sN(x) → 0 as N → ∞.
We finally verify that ‖g(·; x)‖ < ∞. We have

    ‖g(·; x)‖² = ∫_{ξ=−1}^{1} [(f(x) − f(x + ξ)) / (2 sin(πξ/2))]² dξ.

Since f is continuous, the numerator is bounded. The only potential difficulty arises when the
denominator goes to 0 at ξ = 0. However we find that

    lim_{ξ→0} g(ξ; x) = lim_{ξ→0} (f(x) − f(x + ξ)) / (2 sin(πξ/2)) = lim_{ξ→0} (−f′(x + ξ)) / (π cos(πξ/2)) = −f′(x)/π < ∞

by L'Hôpital's rule since f is differentiable at x (by assumption). Hence g is bounded everywhere
and the integral ‖g(·; x)‖ is finite.
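The two kernel properties used in this proof are easy to confirm numerically. The sketch below (N = 7 and the quadrature order are arbitrary choices) compares the cosine sum against the closed form and checks the unit integral with Gauss–Legendre quadrature.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def K_sum(xi, N):
    # K_N(xi) = 1/2 + sum_{n=1}^N cos(n pi xi)
    n = np.arange(1, N + 1)
    return 0.5 + np.cos(np.pi * np.outer(xi, n)).sum(axis=1)

def K_closed(xi, N):
    # K_N(xi) = sin((N + 1/2) pi xi) / (2 sin(pi xi / 2)); removable singularity at xi = 0
    return np.sin((N + 0.5) * np.pi * xi) / (2.0 * np.sin(0.5 * np.pi * xi))

N = 7
xi = np.linspace(-0.999, 0.999, 1000)        # grid that avoids xi = 0 exactly
assert np.allclose(K_sum(xi, N), K_closed(xi, N))

q, w = leggauss(200)                         # Gauss-Legendre nodes/weights on (-1, 1)
assert abs(w @ K_sum(q, N) - 1.0) < 1e-12    # integral of K_N over (-1, 1) is 1
```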

3.B Proof of uniform convergence
We prove two theorems regarding the uniform convergence of Fourier series: Theorems 3.11 and
3.16. We first prove Theorem 3.16, which states that the Fourier series converges uniformly
to f if its coefficients are absolutely convergent. For a given x ∈ [−1, 1], we take the difference
between f(x) and the truncated Fourier series sN(x) to obtain

    f(x) − sN(x) = (1/2)a0 + ∑_{n=1}^∞ (an cos(nπx) + bn sin(nπx)) − (1/2)a0 − ∑_{n=1}^N (an cos(nπx) + bn sin(nπx))
                 = ∑_{n=N+1}^∞ (an cos(nπx) + bn sin(nπx)).

It follows that

    max_{x∈[−1,1]} |f(x) − sN(x)| = max_{x∈[−1,1]} |∑_{n=N+1}^∞ (an cos(nπx) + bn sin(nπx))| ≤ ∑_{n=N+1}^∞ (|an| + |bn|).

Because (by assumption) the series is a tail of the absolutely convergent series ∑_{n=1}^∞ (|an| + |bn|) <
∞, it must vanish as N → ∞. It follows that

    max_{x∈[−1,1]} |f(x) − sN(x)| ≤ ∑_{n=N+1}^∞ (|an| + |bn|) → 0 as N → ∞,

if the coefficients are absolutely convergent.


We next prove Theorem 3.11, which states that the Fourier series converges uniformly to f :
R → R if f is continuously differentiable and has a period of 2. To begin, let an and bn denote the
Fourier coefficients of f and a′n and b′n denote the Fourier coefficients of f′. We observe that

    an = ∫_{−1}^{1} cos(nπx)f(x) dx = (1/(nπ)) f(x) sin(nπx)|_{x=−1}^{1} − (1/(nπ)) ∫_{−1}^{1} sin(nπx)f′(x) dx = −(1/(nπ)) b′n,

    bn = ∫_{−1}^{1} sin(nπx)f(x) dx = −(1/(nπ)) f(x) cos(nπx)|_{x=−1}^{1} + (1/(nπ)) ∫_{−1}^{1} cos(nπx)f′(x) dx = (1/(nπ)) a′n,

where the first equality follows because f is continuously differentiable, and the last equality follows
from periodicity in each case. It follows, by the Cauchy–Schwarz inequality, that

    ∑_{n=1}^∞ (|an| + |bn|) = ∑_{n=1}^∞ (1/(nπ)) (|b′n| + |a′n|) ≤ (∑_{n=1}^∞ 1/n²)^{1/2} (∑_{n=1}^∞ 2(|a′n|² + |b′n|²))^{1/2} < ∞,

where the last bound follows from the convergence of the p-series for p = 2, which implies
∑_{n=1}^∞ 1/n² < ∞, and Bessel's inequality for the derivative f′, which implies ∑_{n=1}^∞ (|a′n|² + |b′n|²) ≤
‖f′‖² < ∞. Since the Fourier coefficients of f are absolutely convergent, the Fourier series converges
uniformly to f by Theorem 3.16, which we just proved.

3.C L2 convergence and optimality of Fourier series
We now introduce some properties of Fourier series related to convergence in L2. In fact, all
properties we will consider apply to any series representation of functions based on a (complete)
set of orthonormal functions (φn)_{n=1}^∞. We hence abstract our Fourier series in the following form:

    f(x) ∼ ∑_{n=1}^∞ fˆn φn(x)

for coefficients

    fˆn = ∫_{−1}^{1} φn(x)f(x) dx.

The associated truncated Fourier series is defined by

    sN(x) ≡ ∑_{n=1}^N fˆn φn(x).

(If the functions (φn)_{n=1}^∞ are orthogonal but not orthonormal, then the coefficients are normalized
by the squared norm of φn: i.e., fˆn = ∫_{−1}^{1} φn(x)f(x) dx / ∫_{−1}^{1} φn(x)² dx.)

Proposition 3.22 (optimality of Fourier series). Consider functions of the form

    gN(x) ≡ ∑_{n=1}^N ĝn φn(x),

where ĝn, n = 1, 2, . . . , are some arbitrary coefficients. The truncated Fourier series sN of f is the
best N-term approximation associated with the basis (φn) in the sense that

    ‖f − sN‖ ≤ ‖f − gN‖ ∀gN.

Proof. We find

    EN ≡ ‖f − gN‖² = ‖f‖² − 2(f, gN) + (gN, gN)
       = ‖f‖² − 2(f, ∑_{n=1}^N ĝn φn) + (∑_{n=1}^N ĝn φn, ∑_{m=1}^N ĝm φm)
       = ‖f‖² − 2 ∑_{n=1}^N ĝn (f, φn) + ∑_{n,m=1}^N ĝn ĝm (φn, φm) = ‖f‖² − 2 ∑_{n=1}^N ĝn fˆn + ∑_{n=1}^N ĝn²,    (3.1)

where the last equality follows from the definition of the (generalized) Fourier coefficients (fˆn =
(f, φn)) and the orthonormality of the functions ((φm, φn) = δmn).
We recall that EN is minimized if (i) all first derivatives of EN with respect to the coefficients
vanish and (ii) the Hessian (i.e., second derivative) of EN with respect to the coefficients is positive
definite. To this end, we evaluate the first derivative of EN with respect to the coefficients:

    ∂EN/∂ĝn = −2fˆn + 2ĝn = 0  ⇒  ĝn = fˆn.

In addition, the second derivative is

    ∂²EN/∂ĝn² = 2,

and all mixed partial derivatives vanish; hence the Hessian is positive definite. It follows that the
set of coefficients that minimizes the square-integral error EN is given by ĝn = fˆn = (f, φn), which
are the generalized Fourier coefficients.

Theorem 3.23 (Bessel's inequality). Let f : (−1, 1) → R be a square-integrable function, (φn)_{n=1}^∞
be an orthonormal set of functions, and fˆn = (f, φn) be the associated coefficients. Then,

    ∑_{n=1}^∞ fˆn² ≤ ∫_{−1}^{1} f(x)² dx.

Proof. We substitute ĝn = fˆn in (3.1) to obtain, for all N ∈ N,

    EN ≡ ‖f − sN‖² = ‖f‖² − ∑_{n=1}^N fˆn² ≥ 0.

We rearrange the inequality to obtain

    ∑_{n=1}^N fˆn² ≤ ‖f‖².

Since the relationship holds for any N, we take N → ∞ to obtain the desired inequality.

Theorem 3.24 (Parseval's identity). The Fourier series of f : (−1, 1) → R converges to f in the
mean-square sense if and only if

    ∑_{n=1}^∞ fˆn² = ∫_{−1}^{1} f² dx.

Proof. Parseval's identity is a consequence of the L2 convergence of Fourier series. Namely,

    lim_{N→∞} EN ≡ lim_{N→∞} ‖f − sN‖² = lim_{N→∞} (‖f‖² − ∑_{n=1}^N fˆn²) = ‖f‖² − ∑_{n=1}^∞ fˆn² = 0,

which is the desired equality.
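Parseval's identity can be verified numerically for a concrete case. Take f(x) = x on (−1, 1) and the orthonormal functions φn(x) = sin(nπx) (note ∫_{−1}^{1} sin²(nπx) dx = 1, and the cosine coefficients vanish since f is odd, so the sines alone capture f); the coefficients are fˆn = 2(−1)^{n+1}/(nπ). A sketch (10⁶ terms is an arbitrary truncation):

```python
import numpy as np

# coefficients f^_n = integral of x sin(n pi x) over (-1, 1) = 2(-1)^{n+1}/(n pi)
n = np.arange(1, 10**6 + 1)
sign = np.where(n % 2 == 1, 1.0, -1.0)
fhat = 2.0 * sign / (n * np.pi)

lhs = np.sum(fhat**2)              # sum of f^_n^2, truncated at 10^6 terms
rhs = 2.0 / 3.0                    # integral of x^2 over (-1, 1)
assert abs(lhs - rhs) < 1e-5       # the neglected tail is ~ 4/(pi^2 N)
```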

Remark 3.25 (decay of Fourier coefficients). Let f : (−1, 1) → R be a square-integrable function,
(φn)_{n=1}^∞ be an orthonormal set of functions, and fˆn = (f, φn) be the associated coefficients. Then

    fˆn = (f, φn) → 0 as n → ∞.

This is a direct consequence of Bessel's inequality. Since the series ∑_{n=1}^∞ fˆn² < ∞, the coefficients
must converge to 0.

Lecture 4

Sturm-Liouville theory

©2019–2021 Masayuki Yano. Prepared for ESC384 Partial Differential Equations taught at the
University of Toronto.

4.1 Introduction
We have so far formulated and analyzed Fourier series based on sinusoidal functions. In our
formulation, we have found that orthogonal functions provide a particularly simple expression for
the coefficients; in our analysis, we found that many of the convergence properties derive from the
orthogonality and completeness of the sinusoidal functions. In this lecture we study an important
class of ordinary differential equations (ODEs): the Sturm-Liouville problem. The solution to the
problem is a complete set of orthogonal functions, which can be used to form a generalized Fourier
series, which shares many of the convergence properties of the “classical” Fourier series.

4.2 Motivation: Fourier series and eigenproblem


We first explore the connection between Fourier series and some select eigenproblems. To this end,
consider an ODE eigenproblem: find eigenpairs (φn , λn ) (i.e., eigenfunction φn and the associated
eigenvalue λn ) such that

    −φn′′ = λn φn in (0, 1),                (4.1)
    φn(0) = φn(1) = 0.

To find the solution, we consider three different cases for the eigenvalue:
• Case 1: λn < 0. Let µn = −λn so that µn > 0. The general solution to (4.1) is of the form

      φn(x) = An cosh(√µn x) + Bn sinh(√µn x).

  The boundary condition at x = 0 requires

      φn(x = 0) = An = 0.

  The boundary condition at x = 1 requires

      φn(x = 1) = Bn sinh(√µn) = 0  ⇒  Bn = 0,

  since sinh(x) = 0 only if x = 0. The only solution to the eigenproblem (4.1) is the trivial
  solution φn = 0. Hence λn < 0 does not yield a desired family of nontrivial functions.
• Case 2: λn = 0. If λn = 0, the general solution to (4.1) is of the form

φn (x) = An x + Bn .

The boundary condition at x = 0 requires

φn (x = 0) = Bn = 0.

The boundary condition at x = 1 requires

φn (x = 1) = An = 0.

Again, if λn = 0, the only solution to the eigenproblem (4.1) is the trivial solution φn = 0.
• Case 3: λn > 0. The general solution to (4.1) is of the form

      φn(x) = An cos(√λn x) + Bn sin(√λn x).

  The boundary condition at x = 0 requires

      φn(x = 0) = An = 0.

  The boundary condition at x = 1 requires

      φn(x = 1) = Bn sin(√λn) = 0.

  Since we seek a nontrivial solution, we require Bn ≠ 0. It hence follows that

      sin(√λn) = 0  ⇒  λn = n²π², n = 1, 2, . . . .

  Hence φn(x) = sin(nπx) and λn = n²π² for n = 1, 2, . . . are the nontrivial solutions to the
  eigenproblem (4.1).

The solution to the eigenproblem (4.1) is hence

    φn(x) = sin(nπx) and λn = n²π² for n = 1, 2, . . . .

We make a few observations:


1. The eigenfunctions are the basis of Fourier sine series, and hence the linear combination of
the eigenfunctions can approximate any square integrable function.
2. The eigenfunctions are orthogonal.
3. The eigenvalues are real and nonnegative.
We now ask an important question:
Could we have known all of the above properties just by looking at the eigenproblem
without having to find the explicit expressions for the solution?
It turns out the answer is yes. All of the above properties are consequences of the Sturm-Liouville
theory.
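These properties also emerge from a discrete analogue of the eigenproblem. The sketch below (a standard second-order finite-difference discretization, not part of the original notes; the grid size is an arbitrary choice) builds a symmetric matrix version of −φ′′ with homogeneous Dirichlet conditions and checks that its smallest eigenvalues approximate n²π² and that the first eigenvector lines up with sin(πx).

```python
import numpy as np

m = 400                                   # interior grid points on (0, 1)
h = 1.0 / (m + 1)
x = np.linspace(h, 1.0 - h, m)

# -phi'' ~ (1/h^2)(-phi_{j-1} + 2 phi_j - phi_{j+1}): symmetric tridiagonal matrix
A = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h**2
lam, V = np.linalg.eigh(A)                # symmetric => real eigenvalues, orthogonal eigenvectors

# smallest discrete eigenvalues approximate lambda_n = n^2 pi^2
for k in (1, 2, 3):
    exact = (k * np.pi)**2
    assert abs(lam[k - 1] - exact) / exact < 1e-3

# first eigenvector is (up to sign and scale) sin(pi x)
s = np.sin(np.pi * x)
cosang = abs(V[:, 0] @ s) / (np.linalg.norm(V[:, 0]) * np.linalg.norm(s))
assert cosang > 1.0 - 1e-8
```

The matrix A is the discrete counterpart of a self-adjoint operator: symmetry gives real eigenvalues and orthogonal eigenvectors, mirroring the Sturm-Liouville results developed next.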

4.3 Sturm-Liouville theory
We introduce the regular Sturm-Liouville problem.

Definition 4.1 (regular Sturm-Liouville problem). The regular Sturm-Liouville problem is given
by

    −(pφ′)′ + qφ = λwφ in (a, b),
    α1 φ(a) − α2 φ′(a) = 0,
    β1 φ(b) + β2 φ′(b) = 0,

where p : [a, b] → R, p′ : [a, b] → R, q : [a, b] → R, and w : [a, b] → R are continuous on [a, b], p > 0
and w > 0 on [a, b], and α1² + α2² > 0 and β1² + β2² > 0.

While the particular form of the Sturm-Liouville equation might appear rather restrictive at
first glance, the equation is quite general; specifically, second-order ODEs of the form

    −γ2 φ′′ + γ1 φ′ + γ0 φ = λφ in (a, b)

can be recast as the Sturm-Liouville equation through the change of variables

    p(x) = exp(−∫^x (γ1(ξ)/γ2(ξ)) dξ),  q(x) = (γ0(x)/γ2(x)) p(x),  w(x) = (1/γ2(x)) p(x).

We now introduce important properties of the Sturm-Liouville problem and its solutions. (Note:
proofs are provided for completeness, but they should be considered optional reading items.)

Theorem 4.2 (self-adjointness of Sturm-Liouville operator). The Sturm-Liouville operator L ≡
−(d/dx)(p (d/dx)) + q is self-adjoint in the sense that

    (Lw, v) = (w, Lv)

for any complex-valued functions w, v ∈ C²([a, b]) that satisfy the boundary conditions. Here
(·, ·) is the L2 inner product on complex-valued functions given by (f, g) ≡ ∫_a^b f ḡ dx,
where ḡ denotes the complex conjugate of g.

Proof. We use integration by parts to obtain

    (Lw, v) − (w, Lv) = ∫_a^b [(−(pw′)′ + qw)v̄ − (−(pv̄′)′ + qv̄)w] dx = [p(−w′v̄ + v̄′w)]_{x=a}^{b}.

We now appeal to the boundary condition at x = a to find that

    α1 w(a) − α2 w′(a) = 0,
    α1 v̄(a) − α2 v̄′(a) = 0,

which may be written as

    [ w(a)   −w′(a) ] [ α1 ]   [ 0 ]
    [ v̄(a)   −v̄′(a) ] [ α2 ] = [ 0 ].

Since α1² + α2² > 0, the linear system has a nontrivial solution and the matrix must be singular. It
follows that

    w(a)v̄′(a) − w′(a)v̄(a) = 0.

We also invoke the same argument on the boundary x = b to find that

    w(b)v̄′(b) − w′(b)v̄(b) = 0.

Hence

    (Lw, v) − (w, Lv) = [p(−w′v̄ + v̄′w)]_{x=a}^{b} = 0,

which is the desired equality.

Theorem 4.3 (realness of Sturm-Liouville eigenvalues). All eigenvalues of the regular Sturm-
Liouville problem are real.

Proof. Let φ and λ be an eigenfunction and eigenvalue, respectively, of a regular Sturm-Liouville
problem. Then Lφ = λwφ, and hence

    0 = (Lφ, φ) − (φ, Lφ) = (λ − λ̄) ∫_a^b w|φ|² dx.

Since ∫_a^b w|φ|² dx ≠ 0, we have λ − λ̄ = 2i Im(λ) = 0, and λ is real.

Theorem 4.4 (orthogonality of Sturm-Liouville eigenfunctions). The eigenfunctions of the regular
Sturm-Liouville problem associated with two distinct eigenvalues λn and λm are orthogonal with
respect to the weight w : [a, b] → R>0 in the sense that

    ∫_a^b w φm φn dx = 0,  n ≠ m.

Proof. Suppose φm and φn are eigenfunctions associated with distinct eigenvalues λm and λn so
that

    −(pφn′)′ + qφn − λn wφn = Lφn − λn wφn = 0,
    −(pφm′)′ + qφm − λm wφm = Lφm − λm wφm = 0.

We multiply the first and second equations by φm and φn, respectively, integrate over (a, b), and
take the difference of the two equations to find that

    0 = (Lφn, φm) − λn (wφn, φm) − (φn, Lφm) + λm (wφn, φm)
      = [(Lφn, φm) − (φn, Lφm)] − (λn − λm)(wφn, φm).

Because the Sturm-Liouville operator L is self-adjoint, the term in the bracket vanishes. It follows
that

    (λn − λm)(wφn, φm) = (λn − λm) ∫_a^b w φn φm dx = 0.

Hence if λm ≠ λn, the eigenfunctions are orthogonal with respect to the weight function w.

Theorem 4.5. The regular Sturm-Liouville problem has an infinite number of eigenvalues and
λn → ∞ as n → ∞. Moreover, if w > 0 on [a, b] and α1 ≥ 0, α2 ≥ 0, β1 ≥ 0, and β2 ≥ 0, then
λn ≥ 0 for all n.

These properties may look rather abstract at first glance. However, given that the Sturm-
Liouville operator is a linear operator, we can relate the above properties to the familiar properties
of linear operators in Rn : symmetric matrices (or Hermitian matrices if complex). First, self-
adjointness (Lw, v) = (w, Lv) is analogous to the condition that v^H(Aw) = (Av)^H w for real symmetric
matrices, where ·^H denotes the conjugate transpose. Second, all eigenvalues of the Sturm-Liouville
problem are real, just like real symmetric matrices have all real eigenvalues. Third, the eigenfunc-
tions of the Sturm-Liouville problem are orthogonal, just like the eigenvectors of real symmetric
matrices are orthogonal. Finally, under certain conditions all eigenvalues of the Sturm-Liouville
problem are nonnegative, just like all eigenvalues of symmetric positive semi-definite matrices are
nonnegative.

4.4 Examples of the Sturm-Liouville problem


We now provide some examples of orthogonal functions that arise as solutions of the regular Sturm-
Liouville problem.

Example 4.6 (sine functions). Consider an eigenproblem

    −φn′′ = λn φn in (0, 1),
    φn(0) = φn(1) = 0.

We readily verify that this is a regular Sturm-Liouville problem with p = 1, q = 0, w = 1,
α1 = β1 = 1, and α2 = β2 = 0. It follows that the eigenfunctions {φn} are orthogonal. Moreover,
because w, α1, α2, β1, and β2 are all nonnegative, the eigenvalues are nonnegative. The power of
the Sturm-Liouville theory is that we arrive at these conclusions without necessarily knowing the
explicit expressions for the eigenvalues and eigenfunctions.
(Of course, we have seen in Section 4.2 that the explicit form of the solution is φn(x) = sin(nπx)
and λn = n²π² for n = 1, 2, . . . . We readily verify that the solution satisfies all of the above
properties.)

Example 4.7 (cosine functions). Consider an eigenproblem

    −φn′′ = λn φn in (0, 1),
    φn′(0) = φn′(1) = 0.

We readily verify that this is a regular Sturm-Liouville problem with p = 1, q = 0, w = 1,
α1 = β1 = 0, and α2 = β2 = 1. It follows that the eigenfunctions {φn} are orthogonal. Moreover,
because w, α1, α2, β1, and β2 are all nonnegative, the eigenvalues are nonnegative.
We can again find explicit expressions for the solutions to verify the above claims. We consider
three cases:

• Case 1: λn < 0. Let µn = −λn so that µn > 0. The general solution to the eigenproblem
  is of the form φn(x) = an cosh(√µn x) + bn sinh(√µn x). The derivative of the function is
  φn′(x) = an √µn sinh(√µn x) + bn √µn cosh(√µn x). The boundary condition at x = 0 requires
  that φn′(x = 0) = bn √µn = 0, i.e., bn = 0. The boundary condition at x = 1 requires that
  φn′(x = 1) = an √µn sinh(√µn) = 0; as sinh is zero only at the origin, the condition requires
  that an = 0. Hence there is no nontrivial solution to the boundary value eigenproblem for
  λn < 0.

• Case 2: λn = 0. If λn = 0, the general solution to the eigenproblem is φn(x) = an + bn x.
  The derivative of the function is φn′(x) = bn. The boundary conditions at x = 0 and x = 1
  require that φn′(x = 0) = φn′(x = 1) = bn = 0. We hence find that the one nontrivial solution
  is φn(x) = 1.

• Case 3: λn > 0. The general solution to the eigenproblem is of the form φn(x) = an cos(√λn x) +
  bn sin(√λn x). The derivative of the function is φn′(x) = −an √λn sin(√λn x) + bn √λn cos(√λn x).
  The boundary condition at x = 0 requires φn′(x = 0) = bn √λn = 0. The boundary condition
  at x = 1 requires φn′(x = 1) = −an √λn sin(√λn) = 0; the nontrivial solutions are given by
  λn = n²π², n = 1, 2, . . . . The nontrivial solutions are hence of the form φn(x) = cos(nπx),
  n = 1, 2, . . . . In fact, we can include the constant solution found in Case 2 in this general
  family of solutions by starting from n = 0.

As expected from the Sturm-Liouville theory, the problem has eigenvalues which are real and
nonnegative: λn = n²π² ≥ 0, n = 0, 1, 2, . . . . In addition, the eigenfunctions φn(x) = cos(nπx) are
orthogonal on (0, 1), as we have already verified in the context of classical Fourier series.
Example 4.8 (sinusoidal functions). Consider an eigenproblem

    −φn′′ = λn φn in (−1, 1),
    φn(−1) = φn(1),
    φn′(−1) = φn′(1).

This is a Sturm-Liouville equation with a periodic boundary condition. This is not a regular Sturm-
Liouville problem, but a periodic Sturm-Liouville problem. The solution to this periodic Sturm-
Liouville problem is given by

    λn = m²π² and φn(x) = cos(mπx) for n even, sin(mπx) for n odd,

for m ≡ ⌈n/2⌉, n = 0, 1, 2, . . . . We recognize that the eigenfunctions are the basis for the full
Fourier series. All eigenvalues are nonnegative and the eigenfunctions are orthogonal, as we have
already verified in the context of classical Fourier series. (The orthogonality proof holds for the
periodic boundary condition with little modification.)
Example 4.9 (Legendre polynomials). Consider an eigenproblem

    −((1 − x²)φn′)′ = λn φn in (−1, 1).

This is a Sturm-Liouville equation, with coefficients p(x) = 1 − x², q = 0, and w = 1. Because
p(x = −1) = p(x = 1) = 0, the differential operator vanishes at the boundary; the problem hence
does not require a boundary condition. A problem associated with a Sturm-Liouville equation with
p that vanishes at either or both boundaries is called a singular Sturm-Liouville problem. The
solutions to this singular Sturm-Liouville problem are the Legendre polynomials, which form an
orthogonal set of polynomials. (The orthogonality proof holds without any boundary conditions
because p vanishes at both boundaries.)
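Both claims are easy to check for the first few Legendre polynomials with numpy.polynomial.legendre. The sketch below also uses the known eigenvalues λn = n(n + 1), a standard fact about Legendre polynomials that the text does not derive.

```python
import numpy as np
from numpy.polynomial import legendre as L

# orthogonality of P_m, P_n on (-1, 1), via Gauss-Legendre quadrature
x, w = L.leggauss(50)                     # exact for polynomial integrands of degree <= 99
for m in range(6):
    for n in range(6):
        ip = w @ (L.legval(x, np.eye(6)[m]) * L.legval(x, np.eye(6)[n]))
        if m != n:
            assert abs(ip) < 1e-12                        # orthogonality
        else:
            assert abs(ip - 2.0 / (2 * n + 1)) < 1e-12    # known norm ||P_n||^2 = 2/(2n+1)

# Sturm-Liouville relation -((1 - x^2) P_n')' = n(n+1) P_n, computed in the
# Legendre basis; 1 - x^2 has Legendre coefficients (2/3, 0, -2/3)
for n in range(1, 6):
    cn = np.zeros(n + 1); cn[n] = 1.0                     # coefficients of P_n
    lhs = -L.legder(L.legmul([2/3, 0, -2/3], L.legder(cn)))
    assert np.allclose(lhs, n * (n + 1) * cn)
```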

4.5 Generalized Fourier series by eigenfunctions
We now use the orthogonality and completeness of eigenfunctions of the Sturm-Liouville problem
to form generalized Fourier series.
Definition 4.10 (generalized Fourier series). Let (φn)_{n=1}^∞ be eigenfunctions of a regular Sturm-
Liouville problem with α1 ≥ 0, α2 ≥ 0, β1 ≥ 0, and β2 ≥ 0. We can then approximate any
f : (a, b) → R by a generalized Fourier series

    f(x) ∼ ∑_{n=1}^∞ fˆn φn(x)                (4.2)

for coefficients

    fˆn = ∫_a^b f φn w dx / ∫_a^b φn² w dx.
Similarly to the classical Fourier series, the expression for the coefficients is a consequence of
the orthogonality of the functions (φn)_{n=1}^∞. Our expression for the coefficient has (the square of)
the norm of the function as the denominator, as we did not assume the functions are orthonormal
(i.e., functions in general do not have a unit norm); the expression could be simplified if we had
normalized the functions.
We conclude this section by stating without proof the pointwise, uniform, and L2 convergence
of generalized Fourier series.
Theorem 4.11 (pointwise convergence of generalized Fourier series). Let f : [a, b] → R be a
piecewise continuously differentiable function that satisfies the boundary conditions of the regular
Sturm-Liouville problem. Then the generalized Fourier series (4.2) converges for any x ∈ (a, b) to
(1/2)(f(x−) + f(x+)).
Theorem 4.12 (uniform convergence of generalized Fourier series). Let f : [a, b] → R be a twice
continuously differentiable function that satisfies the boundary conditions of the regular Sturm-
Liouville problem. Then the generalized Fourier series (4.2) converges uniformly to f .
Theorem 4.13 (L2 convergence of generalized Fourier series). Let f : (a, b) → R be a function
that is square integrable with respect to the weight function w and satisfies the boundary conditions
of the regular Sturm-Liouville problem. Then the generalized Fourier series (4.2) converges with
respect to the w-weighted norm (∫_a^b (·)² w dx)^{1/2}.
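The coefficient formula in Definition 4.10 is straightforward to evaluate by quadrature. A sketch for the sine eigenfunctions of Example 4.6 (weight w = 1) applied to f(x) = x, whose exact sine coefficients 2(−1)^{n+1}/(nπ) appear in Example 2.18; the grid size and the values of n are arbitrary choices.

```python
import numpy as np

def trap(y, x):
    """Composite trapezoidal rule (kept explicit to avoid version-specific NumPy names)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# f^_n = (integral of f phi_n w dx) / (integral of phi_n^2 w dx)
# with phi_n(x) = sin(n pi x), w = 1, and f(x) = x on (0, 1)
x = np.linspace(0.0, 1.0, 100001)
f = x
for n in (1, 2, 3, 7):
    phi = np.sin(n * np.pi * x)
    fhat = trap(f * phi, x) / trap(phi**2, x)
    exact = 2.0 * (-1.0)**(n + 1) / (n * np.pi)
    assert abs(fhat - exact) < 1e-7
```

The same two-integral recipe works verbatim for any Sturm-Liouville eigenfunctions and weight w; only phi and the weight in the integrands change.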

4.6 Summary
We summarize key points of this lecture:
1. Sturm-Liouville operator is self-adjoint. As a result, all eigenvalues of a regular Sturm-
Liouville problem are real, and its eigenfunctions are orthogonal.
2. The family of sine, cosine, and sine-cosine functions used in Fourier series arise as solutions
to Sturm-Liouville problems. As a result, the functions are orthogonal.
3. Solutions of a Sturm-Liouville problem can be used to construct a generalized Fourier series.
The coefficients of the series can be found using the relevant orthogonality relation. The
series converges as a result of the completeness of the eigenfunctions.
4.A L2 properties of generalized Fourier series
We now state some relationships between the coefficients of a generalized Fourier series and the
function it approximates.
Theorem 4.14 (generalized Parseval's identity). Let f : (a, b) → R be a function that is square
integrable with respect to the weight function w and satisfies the boundary conditions of the regular
Sturm-Liouville problem. Then
$$ \sum_{n=1}^{\infty} \hat{f}_n^2 \int_a^b \phi_n^2 w \, dx = \int_a^b f^2 w \, dx. $$
In other words, the eigenfunctions of the regular Sturm-Liouville problem are complete with respect
to the w-weighted L2 norm $\int_a^b (\cdot)^2 w \, dx$.

Corollary 4.15 (generalized Bessel's inequality). As a consequence of Parseval's identity, a
weaker condition, Bessel's inequality,
$$ \sum_{n=1}^{\infty} \hat{f}_n^2 \int_a^b \phi_n^2 w \, dx \le \int_a^b f^2 w \, dx, $$
also holds.
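A quick numerical sanity check of Parseval's identity can be run with closed-form data (an added illustration; the function f(x) = x(1 − x), the basis φ_n(x) = sin(nπx) with w ≡ 1 on (0, 1), and the truncation level are choices for the demo). For this family ∫₀¹ φ_n² dx = 1/2, the exact coefficients are f̂_n = 8/(n³π³) for odd n and 0 for even n, and ∫₀¹ f² dx = 1/30:

```python
import math

# truncated left-hand side of Parseval's identity for f(x) = x(1-x), phi_n = sin(n pi x), w = 1
fhat = [8.0 / (n ** 3 * math.pi ** 3) if n % 2 == 1 else 0.0 for n in range(1, 41)]
lhs = sum(c ** 2 * 0.5 for c in fhat)  # each term multiplied by int_0^1 sin(n pi x)^2 dx = 1/2
rhs = 1.0 / 30.0                       # int_0^1 x^2 (1-x)^2 dx
gap = rhs - lhs                        # nonnegative by Bessel's inequality; tiny for 40 terms
```

The truncated sum sits just below the exact integral, illustrating Bessel's inequality, and the gap vanishes rapidly as terms are added, illustrating Parseval's identity.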

Part II

Heat equation

Lecture 5

Heat equation: introduction

©2019–2021 Masayuki Yano. Prepared for ESC384 Partial Differential Equations taught at the
University of Toronto.

5.1 Introduction
The heat equation describes heat conduction in media and plays an important role in the thermal
analysis of many engineering systems. The heat equation is also of mathematical interest, as it is
an archetype of parabolic PDEs. In this lecture we derive the heat equation and the associated
boundary conditions. We then explore various properties of the equation and the solution — such
as the maximum principle, energy stability, and the uniqueness of the solution — without finding
a closed form expression for the solution. We will introduce a few different approaches to find a
solution representation to the heat equation in the subsequent lectures.

5.2 Derivation of the heat equation


We derive the heat equation based on three principles of heat transfer:
1. The thermal energy density e is given by e = ρcu, where ρ > 0 is the density, c > 0 is the
specific heat capacity, and u is the temperature.
2. Fourier’s law. The rate of heat transfer is proportional to the temperature gradient: q =
−k∇u, where q is the heat flux and k > 0 is the thermal conductivity.
3. Conservation of energy. The change in the thermal energy in a volume over a time interval
∆t is equal to the net heat entering the volume over ∆t.
From these three principles, we can readily derive the heat equation.
To begin, we introduce a domain Ω ⊂ Rn and the associated fixed subdomain ω ⊂ Ω. We
denote the boundary of the subdomain by ∂ω. (In general, the boundary of any domain D ⊂ Rn
is denoted by ∂D.) We next note that the total energy in the subdomain ω is $E = \int_\omega \rho c u \, dx$. We
then note that by the conservation of energy
$$ \underbrace{\frac{d}{dt} \int_\omega \rho c u \, dx}_{\text{change in thermal energy}} = \underbrace{-\int_{\partial\omega} q \cdot \nu \, ds}_{\text{net heat entering } \omega}, $$
where ν is the outward-pointing unit normal vector on ∂ω. We make two observations: (i) the
subdomain ω is independent of time and hence the time derivative can be brought inside the
integral; (ii) the integral over ∂ω can be converted to an integral over ω using the divergence
theorem. We hence obtain
$$ \int_\omega \rho c \frac{\partial u}{\partial t} \, dx = -\int_\omega \nabla \cdot q \, dx. $$
We next note that, since the relationship above should hold for any ω ⊂ Ω, the integrands must
be equal to each other:
$$ \rho c \frac{\partial u}{\partial t} = -\nabla \cdot q \quad \text{in } \Omega \times \mathbb{R}_{>0}. $$
We then appeal to Fourier’s law, q = −k∇u, to obtain
$$ \rho c \frac{\partial u}{\partial t} = \nabla \cdot (k \nabla u) \quad \text{in } \Omega \times \mathbb{R}_{>0}. $$
This is the general form of the heat equation that can treat spatially varying density, specific heat
capacity, and thermal conductivity.
If we assume that the density, specific heat capacity, and thermal conductivity are constant in
space and time, we can further simplify the equation. Namely, we introduce the thermal diffusivity
κ ≡ k/(ρc) (and appeal to our constant coefficient assumption) to obtain
$$ \frac{\partial u}{\partial t} = \kappa \Delta u \quad \text{in } \Omega \times \mathbb{R}_{>0}. \qquad (5.1) $$
This form of the equation is often referred to as the heat equation in the mathematics community;
in this course we will follow the same convention.
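One way to gain confidence in (5.1) is to test it against a manufactured solution; the Python sketch below (an added illustration, not part of the original notes; the diffusivity value, sample points, and step size are arbitrary choices) verifies by centered finite differences that u(x, t) = e^{−κπ²t} sin(πx) satisfies the one-dimensional form ∂u/∂t = κ ∂²u/∂x².

```python
import math

kappa = 0.7  # assumed diffusivity for the check

def u(x, t):
    # manufactured solution of the 1D heat equation u_t = kappa * u_xx
    return math.exp(-kappa * math.pi ** 2 * t) * math.sin(math.pi * x)

def heat_residual(x, t, h=1e-4):
    """Centered-difference approximation of u_t - kappa * u_xx; near zero for a true solution."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h ** 2
    return u_t - kappa * u_xx
```

The residual is on the order of the finite-difference truncation error rather than of the solution itself, as expected for an exact solution of the PDE.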

5.3 Boundary and initial conditions


To define a complete initial-boundary value problem, we now complement the heat equation (5.1)
with boundary and initial conditions. There are three different types of boundary conditions for
the heat equation:
1. Dirichlet (first-type) boundary conditions; i.e., prescribed temperature. If the temperature
at the boundary is prescribed, we obtain a boundary condition of the form

u = ub on ΓD × R>0 ,

where ub is the prescribed boundary temperature and ΓD ⊂ ∂Ω is the Dirichlet boundary.


The Dirichlet boundary condition only depends on the value (and not the derivative) of the
state u. If ub = 0, then the condition is called a homogeneous Dirichlet boundary condition;
otherwise it is a nonhomogeneous Dirichlet boundary condition.

2. Neumann (second-type) boundary conditions; i.e., prescribed heat flux. If the heat flux at
the boundary is prescribed, we obtain a boundary condition of the form

ν · (k∇u) = q b on ΓN × R>0 ,

where q b is the prescribed boundary heat flux into the domain and ΓN ⊂ ∂Ω is the Neumann
boundary. Note q b > 0 indicates heat transfers into the domain (i.e., the boundary is heated)

and q b < 0 indicates heat transfers out from the domain (i.e., the boundary is cooled).
(Note that the signs are “flipped” from Fourier’s law because Fourier’s law characterizes the
heat transfer from the domain into the boundary, whereas in the above q b characterizes the
heat transfer from the boundary into the domain.) The Neumann boundary condition only
depends on the derivative (and not the value) of the state u. If q b = 0, then the condition is
called a homogeneous Neumann boundary condition (i.e., insulated boundary); otherwise it
is a nonhomogeneous Neumann boundary condition.

3. Robin (third-type) boundary conditions; i.e., Newton’s law of cooling. If a boundary is ex-
posed to a different medium (e.g., fluid), then the heat is transferred from/to the surrounding
medium. Newton’s law states that the rate of this heat transfer is proportional to the difference
in the temperature of the material and that of the surroundings. The associated boundary condition is
of the form
ν · (k∇u) = h(uenv − u) on ΓR × R>0 ,
where h > 0 is the heat transfer coefficient, uenv is the temperature of the environment, and
ΓR ⊂ ∂Ω is the Robin boundary. The Robin boundary condition depends on both the value
and derivative of the state u. If uenv is 0, then the condition is called a homogeneous Robin
boundary condition; otherwise it is a nonhomogeneous Robin boundary condition.

As the heat equation (5.1) is second-order in space, one (and only one) boundary condition must
be specified at each point on the boundary. In other words, we partition the domain boundary into
non-overlapping sets so that

$$ \partial\Omega = \bar{\Gamma}_D \cup \bar{\Gamma}_N \cup \bar{\Gamma}_R, \qquad (5.2) $$
$$ \Gamma_D \cap \Gamma_N = \Gamma_D \cap \Gamma_R = \Gamma_N \cap \Gamma_R = \emptyset, \qquad (5.3) $$

and impose the appropriate boundary condition on each part of the boundary. (Some of the sets ΓD ,
ΓN , and ΓR may be empty. Optional note: $\bar{D}$ denotes the closure of D; we need to close each set
in the first equality because ΓD , ΓN , and ΓR are open.)
To completely specify the initial-boundary value problem, we must also specify the initial con-
dition. As the heat equation is first order in time, we need to specify the value of the state at
the initial time:
u = g on Ω × {t = 0},
where g is the prescribed initial temperature.

Example 5.1 (heat equation initial-boundary value problem). A one-dimensional heat equation
over the domain Ω ≡ (0, L) with homogeneous Dirichlet boundary conditions is given by

$$ \frac{\partial u}{\partial t} = \kappa \frac{\partial^2 u}{\partial x^2} \quad \text{in } (0, L) \times \mathbb{R}_{>0}, $$
$$ u = 0 \quad \text{on } \{0, L\} \times \mathbb{R}_{>0}, \qquad (5.4) $$
$$ u = g \quad \text{on } (0, L) \times \{t = 0\}. $$

It can be shown that this initial-boundary value problem is well-posed assuming the initial condition
g is sufficiently regular.

5.4 Nonhomogeneous heat equation
In Section 5.2, we derived the heat equation assuming that there is no “volume heating”. In the
presence of volume heating, the change in the thermal energy in the volume ω depends not
only on the net heat entering through the surface ∂ω but also on the volume heating from inside the
domain:
$$ \underbrace{\frac{d}{dt} \int_\omega \rho c u \, dx}_{\text{change in thermal energy}} = \underbrace{-\int_{\partial\omega} q \cdot \nu \, ds}_{\text{net heat entering } \omega} + \underbrace{\int_\omega \tilde{f} \, dx}_{\text{volume heating}}. $$

After the application of the divergence theorem we obtain a PDE

$$ \frac{\partial u}{\partial t} = \kappa \Delta u + f \quad \text{in } \Omega \times \mathbb{R}_{>0} $$
for $f \equiv \tilde{f}/(\rho c)$. Note that this is a nonhomogeneous equation, because the PDE cannot be expressed
as Lu = 0 for any linear operator L due to the presence of the source term f that is independent
of u.

5.5 Nondimensionalization
In studying any PDE, it is often convenient to nondimensionalize the equations. To provide a
concrete example, we consider the one-dimensional heat equation (5.4). To nondimensionalize the
equation, we introduce the following scales:
1. Length scale. We choose for our length scale the domain length L. The nondimensionalized
spatial coordinate is x̃ = x/L and the nondimensionalized domain is Ω̃ = (0, 1).

2. Time scale. We choose for our time scale T ≡ L²/κ. The nondimensionalized time is t̃ = t/(L²/κ).

3. Temperature scale. We choose for our temperature scale U ≡ max(g). The nondimensionalized
temperature is ũ = u/ max(g). The nondimensional initial temperature is bounded by 1.
While the choice of the length and temperature scales may be rather obvious, the choice of the time
scale is perhaps not. The choice is motivated by the following simplification of the heat equation.
Namely, since
$$ \frac{\partial u}{\partial t} = \frac{U}{T} \frac{\partial \tilde{u}}{\partial \tilde{t}} = \frac{U \kappa}{L^2} \frac{\partial \tilde{u}}{\partial \tilde{t}} \quad \text{and} \quad \frac{\partial^2 u}{\partial x^2} = \frac{U}{L^2} \frac{\partial^2 \tilde{u}}{\partial \tilde{x}^2}, $$
we obtain
$$ \frac{\partial u}{\partial t} = \kappa \frac{\partial^2 u}{\partial x^2} \;\Rightarrow\; \frac{U \kappa}{L^2} \frac{\partial \tilde{u}}{\partial \tilde{t}} = \kappa \frac{U}{L^2} \frac{\partial^2 \tilde{u}}{\partial \tilde{x}^2} \;\Rightarrow\; \frac{\partial \tilde{u}}{\partial \tilde{t}} = \frac{\partial^2 \tilde{u}}{\partial \tilde{x}^2}; $$
in other words, the nondimensionalized diffusion coefficient is unity. The nondimensionalized initial-
boundary value problem is hence
$$ \frac{\partial \tilde{u}}{\partial \tilde{t}} = \frac{\partial^2 \tilde{u}}{\partial \tilde{x}^2} \quad \text{in } (0, 1) \times \mathbb{R}_{>0}, $$
$$ \tilde{u} = 0 \quad \text{on } \{0, 1\} \times \mathbb{R}_{>0}, $$
$$ \tilde{u} = \tilde{g} \quad \text{on } (0, 1) \times \{\tilde{t} = 0\}, $$

where g̃ ≡ g/ max(g).
The nondimensionalization offers simplicity without any loss of generality. Specifically, we can
always recover the solution to the dimensional equation as

$$ u(x, t) = U \tilde{u}(x/L, t/T) = \max(g) \, \tilde{u}(x/L, \kappa t / L^2). $$

For this reason, we will focus on the nondimensionalized form of the heat equation (or any other
equations) in this course.

Example 5.2 (nondimensionalization). Suppose the solution to the nondimensionalized heat equation is
given by
$$ \tilde{u}(\tilde{x}, \tilde{t}) = \sin(\pi \tilde{x}) \exp(-\pi^2 \tilde{t}). $$
(This is in fact the solution to the one-dimensional heat equation above for g̃(x̃) = sin(πx̃).) The
solution to the dimensional equation is then given by
$$ u(x, t) = \max(g) \, \tilde{u}(x/L, \kappa t / L^2) = \max(g) \sin\left(\frac{\pi x}{L}\right) \exp\left(-\frac{\pi^2 \kappa t}{L^2}\right). $$
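The scaling in this example can be exercised numerically; in the Python sketch below (an added illustration; the values of L, κ, and max(g) are arbitrary demo choices) the dimensional solution is rebuilt from the nondimensional one, and one diffusive time T = L²/κ is seen to damp the peak by exactly e^{−π²}.

```python
import math

L, kappa, g_max = 2.0, 0.5, 3.0  # assumed dimensional scales for the demo

def u_tilde(x_t, t_t):
    # nondimensional solution from the example
    return math.sin(math.pi * x_t) * math.exp(-math.pi ** 2 * t_t)

def u(x, t):
    # dimensional solution recovered via u(x, t) = max(g) * u_tilde(x/L, kappa t / L^2)
    return g_max * u_tilde(x / L, kappa * t / L ** 2)

T = L ** 2 / kappa              # the diffusive time scale used in the nondimensionalization
peak0 = u(L / 2.0, 0.0)         # initial peak, equal to max(g)
ratio = u(L / 2.0, T) / peak0   # decay over one diffusive time: exp(-pi^2)
```

This is one way to see why T = L²/κ is the natural time scale: it is the time over which diffusion acts across the full domain length.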

5.6 Maximum principle


We now introduce a property of the heat equation that is important both in theory and in practice:
the maximum principle. The maximum principle facilitates the verification of two of the three
conditions of well-posedness — uniqueness and stability (but not existence) — of the initial and/or
boundary value problems associated with the heat equation. (Note: proofs are provided in this
and subsequent sections for completeness, but they should be considered optional reading items.)

Theorem 5.3 (maximum principle for the heat equation). Suppose we are given an open spatial
domain Ω and a time interval I ≡ (t0 , tf ], where Ω may be unbounded and tf may be infinite. We
define a parabolic cylinder Ω × I and a parabolic boundary

$$ \Gamma \equiv (\bar{\Omega} \times \{t = t_0\}) \cup (\partial\Omega \times \bar{I}) = (\bar{\Omega} \times \bar{I}) \setminus (\Omega \times I), $$

which includes the spatial boundary and initial time “boundary” (but not the final time “bound-
ary”). If u satisfies the heat equation $\frac{\partial u}{\partial t} - \Delta u = 0$ in Ω × I, then
$$ \max_{(x,t) \in \overline{\Omega \times I}} u(x, t) = \max_{(x,t) \in \Gamma} u(x, t). $$

In words, the solution to the heat equation attains the maximum value somewhere on the parabolic
boundary Γ. (The parabolic boundary Γ in the space-time domain is shown in Figure 5.1.)

Proof. We prove the result by contradiction: we suppose that the maximizer (x⋆, t⋆) of u is in Ω × I
and then show that this is not possible. We again emphasize Ω × I is open everywhere except at
the final time t = tf ; the maximizer (x⋆, t⋆) is hence either interior to Ω × I or at the final time
t = tf . We first introduce an auxiliary function, for some ε > 0,
$$ w(x, t) = u(x, t) - \epsilon (t - t^{\star}). $$
Figure 5.1: Parabolic boundary Γ in the space-time domain for the maximum principle.

We may choose ε ∈ R>0 sufficiently small such that w also attains its maximum value in Ω × I.
Suppose (x⋆⋆, t⋆⋆) ∈ Ω × I is the maximizer of w. Then
$$ \Delta w(x^{\star\star}, t^{\star\star}) \le 0 $$
because the function has to be concave at the maximizer due to the second-order optimality con-
dition. In addition,
$$ \frac{\partial w}{\partial t}(x^{\star\star}, t^{\star\star}) \ge 0 $$
because (i) $\frac{\partial w}{\partial t}(x^{\star\star}, t^{\star\star}) = 0$ if $t^{\star\star} \in (t_0, t_f)$ from the first-order optimality condition and (ii)
$\frac{\partial w}{\partial t}(x^{\star\star}, t^{\star\star}) \ge 0$ if $t^{\star\star} = t_f$, since the function must not be decreasing with t for the point to be the
maximum. It follows that
$$ \Delta u(x^{\star\star}, t^{\star\star}) = \Delta w(x^{\star\star}, t^{\star\star}) \le 0, $$
$$ \frac{\partial w}{\partial t}(x^{\star\star}, t^{\star\star}) = \frac{\partial u}{\partial t}(x^{\star\star}, t^{\star\star}) - \epsilon \ge 0 \;\Rightarrow\; \frac{\partial u}{\partial t}(x^{\star\star}, t^{\star\star}) \ge \epsilon \;\Rightarrow\; \frac{\partial u}{\partial t}(x^{\star\star}, t^{\star\star}) > 0. $$
However, the solution to the heat equation must satisfy $\frac{\partial u}{\partial t} = \Delta u$, which is a contradiction. We
hence conclude that the maximizer must be on the parabolic boundary Γ.

Remark 5.4. The maximum principle (Theorem 5.3) is sometimes referred to as the weak maxi-
mum principle. A closely related strong maximum principle states that the maximum value cannot
be attained anywhere in Ω×I unless u is constant in Ω×I; i.e., if there exists a point (x⋆, t⋆) ∈ Ω×I
such that
$$ u(x^{\star}, t^{\star}) = \max_{\overline{\Omega \times I}} u(x, t), $$
then u is constant in Ω × I. (We recall that Ω is open and I is open at t = t0 .)

Theorem 5.5 (minimum principle for the heat equation). In the same setting as Theorem 5.3, we
obtain
$$ \min_{(x,t) \in \overline{\Omega \times I}} u(x, t) = \min_{(x,t) \in \Gamma} u(x, t). $$

Proof. Let w = −u. We recognize that w is also the solution to a heat equation and apply the
maximum principle to obtain
$$ \max_{(x,t) \in \overline{\Omega \times I}} w(x, t) = \max_{(x,t) \in \Gamma} w(x, t). $$
We expand the left hand side and the right hand side to obtain
$$ \text{(LHS)} = \max_{(x,t) \in \overline{\Omega \times I}} w(x, t) = \max_{(x,t) \in \overline{\Omega \times I}} -u(x, t) = -\min_{(x,t) \in \overline{\Omega \times I}} u(x, t), $$
$$ \text{(RHS)} = \max_{(x,t) \in \Gamma} w(x, t) = \max_{(x,t) \in \Gamma} -u(x, t) = -\min_{(x,t) \in \Gamma} u(x, t). $$
It follows that
$$ -\min_{(x,t) \in \overline{\Omega \times I}} u(x, t) = -\min_{(x,t) \in \Gamma} u(x, t) \;\Rightarrow\; \min_{(x,t) \in \overline{\Omega \times I}} u(x, t) = \min_{(x,t) \in \Gamma} u(x, t). $$

Example 5.6 (maximum principle for the heat equation). Consider the initial-boundary value
problem
$$ \frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = 0 \quad \text{in } (-1, 2) \times \mathbb{R}_{>0}, $$
$$ u(x = -1, t) = \exp(-t), \quad t \in \mathbb{R}_{>0}, $$
$$ u(x = 2, t) = |\cos(3\pi t)|, \quad t \in \mathbb{R}_{>0}, $$
$$ u(x, t = 0) = \frac{4}{9}(x - 1/2)^2, \quad x \in (-1, 2). $$
We observe that the values of the solution along the left and right boundaries, as well as at the
initial time, are in [0, 1]. Hence by the maximum and minimum principles

0 ≤ u(x, t) ≤ 1 ∀(x, t) ∈ [−1, 2] × R≥0 .

We arrive at this conclusion even though we do not have any representation for the solution.
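The bound can also be observed on a numerical solution; the Python sketch below (an added illustration, not part of the original notes; the explicit finite-difference scheme, grid sizes, and final time are demo choices, with Δt/Δx² ≤ 1/2 so the scheme inherits a discrete maximum principle) marches the example forward and tracks the extreme values of the solution.

```python
import math

# explicit finite-difference march of the example problem on (-1, 2); demo parameters
nx = 61
dx = 3.0 / (nx - 1)
dt, nt = 1.0e-3, 500                 # dt/dx^2 = 0.4 <= 1/2, so the update is a convex combination
x = [-1.0 + i * dx for i in range(nx)]
u = [4.0 / 9.0 * (xi - 0.5) ** 2 for xi in x]   # initial condition

lo, hi = min(u), max(u)
for k in range(1, nt + 1):
    t = k * dt
    new = u[:]
    for i in range(1, nx - 1):
        new[i] = u[i] + dt / dx ** 2 * (u[i + 1] - 2.0 * u[i] + u[i - 1])
    new[0] = math.exp(-t)                        # left Dirichlet data
    new[-1] = abs(math.cos(3.0 * math.pi * t))   # right Dirichlet data
    u = new
    lo, hi = min(lo, min(u)), max(hi, max(u))
```

The recorded extremes lo and hi stay within [0, 1], consistent with the maximum and minimum principles, with the extreme values attained on the parabolic boundary.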

5.7 Uniqueness by the maximum principle


We now appeal to the maximum (and minimum) principle to show that the solution to the initial-
boundary value problem, if it exists, is unique.
Theorem 5.7 (uniqueness of the solution to the heat equation). Let Ω ⊂ Rn be a spatial domain
and I = (0, tf ] be the time interval. There is at most one solution u : Ω × I → R that satisfies
$$ \frac{\partial u}{\partial t} - \Delta u = f \quad \text{in } \Omega \times I, $$
$$ u = h \quad \text{on } \partial\Omega \times I, $$
$$ u = g \quad \text{on } \Omega \times \{t = 0\}, $$

where f : Ω × I → R is a source function, h : ∂Ω × I → R is the boundary function, and g : Ω → R


is the initial condition.

Proof. Let u1 and u2 be the solutions to the heat equation initial-boundary value problem. Since
the equation is linear, we know by the principle of superposition that the function w ≡ u1 − u2
satisfies the initial-boundary value problem associated with the homogeneous heat equation:
$$ \frac{\partial w}{\partial t} - \Delta w = 0 \quad \text{in } \Omega \times I, $$
$$ w = 0 \quad \text{on } \partial\Omega \times I, $$
$$ w = 0 \quad \text{on } \Omega \times \{t = 0\}. $$
By the maximum and minimum principles, we find that
$$ \max_{(x,t) \in \overline{\Omega \times I}} w(x, t) = \max_{(x,t) \in \Gamma} w(x, t) = 0 \;\Rightarrow\; w \le 0 \quad \text{in } \Omega \times I, $$
$$ \min_{(x,t) \in \overline{\Omega \times I}} w(x, t) = \min_{(x,t) \in \Gamma} w(x, t) = 0 \;\Rightarrow\; w \ge 0 \quad \text{in } \Omega \times I, $$
where $\Gamma \equiv (\bar{\Omega} \times \{t = 0\}) \cup (\partial\Omega \times \bar{I})$ is the parabolic boundary. Since w ≤ 0 and w ≥ 0, it follows
that w = u1 − u2 = 0 in Ω × I.

5.8 Uniform stability by the maximum principle


We recall that the third ingredient of well-posedness is stability. In the study of PDEs, we are often
interested in stability in two different senses: “uniform” and “square-integral”. In this section we
study the uniform stability of the heat equation using the maximum principle.

Theorem 5.8 (uniform stability of the heat equation). Let u1 and u2 be solutions to the initial-
boundary value problems associated with two different sets of boundary and initial data (h1 , g1 ) and
(h2 , g2 ), respectively:
$$ \frac{\partial u_i}{\partial t} - \Delta u_i = 0 \quad \text{in } \Omega \times I, $$
$$ u_i = h_i \quad \text{on } \partial\Omega \times I, $$
$$ u_i = g_i \quad \text{on } \Omega \times \{t = 0\} $$
for i = 1, 2. Then the solution depends continuously on the data in the sense that
$$ \max_{(x,t) \in \overline{\Omega \times I}} |u_1(x, t) - u_2(x, t)| \le \max\left\{ \max_{(x,t) \in \partial\Omega \times \bar{I}} |h_1(x, t) - h_2(x, t)|, \; \max_{x \in \bar{\Omega}} |g_1(x) - g_2(x)| \right\}. \qquad (5.5) $$

Proof. We appeal to the linearity of the differential operators and take the difference of the two sets
of equations to obtain
$$ \frac{\partial (u_1 - u_2)}{\partial t} - \Delta (u_1 - u_2) = 0 \quad \text{in } \Omega \times I, $$
$$ (u_1 - u_2) = (h_1 - h_2) \quad \text{on } \partial\Omega \times I, $$
$$ (u_1 - u_2) = (g_1 - g_2) \quad \text{on } \Omega \times \{t = 0\}. $$
The maximum and minimum principles require, for any (x, t) ∈ Ω × I,
$$ u_1(x, t) - u_2(x, t) \le \max\left\{ \max_{(x,t) \in \partial\Omega \times \bar{I}} (h_1(x, t) - h_2(x, t)), \; \max_{x \in \bar{\Omega}} (g_1(x) - g_2(x)) \right\}, $$
$$ u_1(x, t) - u_2(x, t) \ge \min\left\{ \min_{(x,t) \in \partial\Omega \times \bar{I}} (h_1(x, t) - h_2(x, t)), \; \min_{x \in \bar{\Omega}} (g_1(x) - g_2(x)) \right\}. $$
It follows that
$$ \max_{(x,t) \in \overline{\Omega \times I}} |u_1(x, t) - u_2(x, t)| \le \max\left\{ \max_{(x,t) \in \partial\Omega \times \bar{I}} |h_1(x, t) - h_2(x, t)|, \; \max_{x \in \bar{\Omega}} |g_1(x) - g_2(x)| \right\}, $$
which is the desired inequality.

The left hand side of (5.5) measures the closeness of the solutions and its right hand side
measures the closeness of initial and boundary data. The equation shows that the difference in
the two solutions u1 and u2 over any point x ∈ Ω at any time t ∈ I is bounded by the maximum
difference in the boundary data h1 −h2 and the initial data g1 −g2 . Hence a small change in the data
results in a small change in the solution; the problem is hence stable. As the stability statement
applies to all points in the space-time domain Ω × I, this is called stability in the “uniform” sense.
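The uniform stability bound can be checked on a pair of closed-form solutions; in this Python sketch (an added illustration; the single-mode solutions, the perturbation amplitude 0.01, and the sampling grids are demo choices) two solutions with identical homogeneous boundary data and slightly different initial data stay within the initial-data gap at every sampled point.

```python
import math

# two exact solutions of u_t = u_xx on (0,1) with homogeneous Dirichlet data
u1 = lambda x, t: math.exp(-math.pi ** 2 * t) * math.sin(math.pi * x)
u2 = lambda x, t: u1(x, t) + 0.01 * math.exp(-9.0 * math.pi ** 2 * t) * math.sin(3.0 * math.pi * x)

xs = [i / 300.0 for i in range(301)]
ts = [0.01 * j for j in range(101)]
max_sol_diff = max(abs(u1(x, t) - u2(x, t)) for t in ts for x in xs)
max_data_diff = max(abs(0.01 * math.sin(3.0 * math.pi * x)) for x in xs)  # |g1 - g2|; h1 = h2 = 0
```

Here the gap between the solutions is largest at t = 0 and only shrinks afterwards, so the bound (5.5) is attained with equality at the initial time.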

5.9 Energy method: uniqueness and stability in L2 sense


We have so far appealed to the maximum principle to show the uniqueness and stability of the
solution to the heat equation initial-boundary value problem. We can also prove uniqueness and
stability using the energy method. Unlike the results associated with the maximum principle which
are in the uniform sense, the results associated with the energy methods are in the square-integral
(or L2 ) sense.
We first consider uniqueness. As in the proof of Theorem 5.7, suppose u1 and u2 are the two
solutions to the heat equation initial-boundary value problem. Then their difference w ≡ u1 − u2
satisfies
$$ \frac{\partial w}{\partial t} - \Delta w = 0 \quad \text{in } \Omega \times I, $$
$$ w = 0 \quad \text{on } \partial\Omega \times I, $$
$$ w = 0 \quad \text{on } \Omega \times \{t = 0\}. $$
We now define
$$ E(t) \equiv \frac{1}{2} \int_\Omega w(x, t)^2 \, dx, \quad t \in I. $$
We then observe
$$ \frac{d}{dt} E = \int_\Omega w \frac{\partial w}{\partial t} \, dx = \int_\Omega w \Delta w \, dx = -\int_\Omega \nabla w \cdot \nabla w \, dx + \int_{\partial\Omega} w \nu \cdot \nabla w \, ds = -\int_\Omega |\nabla w|^2 \, dx \le 0; $$
in the last equality, the integral on the boundary vanishes because w = 0 on ∂Ω. Since $\frac{dE}{dt} \le 0$,
$E(t) = \frac{1}{2} \int_\Omega w(\cdot, t)^2 \, dx$ is a nonincreasing function of time and
$$ \int_\Omega w(\cdot, t)^2 \, dx \le \int_\Omega w(\cdot, t = 0)^2 \, dx = 0, \qquad (5.6) $$

where the last equality follows from w(·, t = 0) = u1 (·, t = 0) − u2 (·, t = 0) = g − g = 0. Conse-
quently, w(·, t) = u1 (·, t) − u2 (·, t) = 0 for each t ∈ I, and u1 = u2 in Ω × I. This energy method
can be readily extended for problems that involve Neumann and Robin boundary conditions. In
addition, as we will see in later lectures, the method also extends to equations that do not possess
the maximum principle.
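The energy argument has a discrete analogue that is easy to check; in the Python sketch below (an added illustration; the explicit scheme, initial profile, and grid parameters are demo choices with Δt/Δx² ≤ 1/2) the discrete energy E = ½∫w² dx is recorded at every step and never increases.

```python
import math

# explicit finite-difference solution of w_t = w_xx with homogeneous Dirichlet data
nx = 51
dx = 1.0 / (nx - 1)
dt, nt = 1.0e-4, 2000     # dt/dx^2 = 0.25 <= 1/2 for stability
w = [math.sin(math.pi * i * dx) + 0.5 * math.sin(4.0 * math.pi * i * dx) for i in range(nx)]
w[0] = w[-1] = 0.0

def energy(v):
    # E = 1/2 * int w^2 dx, approximated by a simple Riemann sum
    return 0.5 * sum(vi ** 2 for vi in v) * dx

energies = [energy(w)]
for _ in range(nt):
    new = [0.0] * nx
    for i in range(1, nx - 1):
        new[i] = w[i] + dt / dx ** 2 * (w[i + 1] - 2.0 * w[i] + w[i - 1])
    w = new
    energies.append(energy(w))

nonincreasing = all(b <= a + 1e-14 for a, b in zip(energies, energies[1:]))
```

The monotone decay of the recorded energies mirrors the continuous inequality dE/dt ≤ 0 derived above.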
We next consider stability. To this end, we consider the same setting as Theorem 5.8, but for
simplicity consider the case where only the initial data is changed and boundary data is fixed; i.e.,
g1 ≠ g2 but h1 = h2 . We appeal to the energy balance (5.6) and obtain, for any t ∈ I,
$$ \int_\Omega (u_1(\cdot, t) - u_2(\cdot, t))^2 \, dx \le \int_\Omega (g_1 - g_2)^2 \, dx. $$

Again the left hand side measures the closeness of the solution and the right hand side measures
the closeness of the initial data. We again observe that a small change in the data results in a small
change in the solution; the problem is stable. However, the measure of “closeness” is not in the
maximum (or uniform) sense but rather in the square-integral sense. We hence refer to this form
of stability as stability in the square-integral (or L2 ) sense.

5.10 Summary
We summarize key points of this lecture:

1. The heat equation describes the temperature distribution in a medium and is derived from
the definition of thermal energy, Fourier’s law, and conservation of energy.

2. The three different types of boundary conditions that are relevant for the heat equation are
associated with prescribed temperature, prescribed heat flux, and Newton’s law of cooling.
The three conditions are called respectively Dirichlet, Neumann, and Robin conditions, or
first-type, second-type, and third-type conditions.

3. The heat equation (just like any other equation) can be nondimensionalized. In particular,
if we choose the time scale T ≡ L2 /κ, the resulting equation has the diffusivity constant of
unity.

4. The maximum principle states that the value of the solution to the (homogeneous) heat
equation over the entire space-time domain is bounded from above by the maximum value
along the boundaries and in the initial condition. The minimum principle similarly holds.

5. The maximum principle can be used to prove the solution to the heat equation, if it exists,
is unique. The principle also asserts that the solution is stable in the uniform sense.

6. The energy method can also be used to prove the uniqueness and L2 stability of the heat
equation.

Lecture 6

Heat equation: separation of variables

©2019–2021 Masayuki Yano. Prepared for ESC384 Partial Differential Equations taught at the
University of Toronto.

6.1 Introduction
In this lecture we consider the solution of initial-boundary value problems associated with the
heat equation in a bounded domain. To find a solution representation, we introduce separation of
variables, which seeks solutions to the PDE that are separable in space and time, and then consider
a linear combination of these separable solutions that satisfies the particular boundary and initial
conditions. The approach builds on the Fourier series representation of various functions. We
consider the treatment of various boundary conditions; we then provide physical and mathematical
interpretations of the associated solutions.

6.2 Homogeneous Dirichlet boundary conditions


We consider the heat equation over the one-dimensional spatial domain (0, 1) with the homogeneous
Dirichlet boundary conditions:

$$ \frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = 0 \quad \text{in } (0, 1) \times \mathbb{R}_{>0}, $$
$$ u = 0 \quad \text{on } \{0, 1\} \times \mathbb{R}_{>0}, $$
$$ u = g \quad \text{on } (0, 1) \times \{t = 0\}. $$

Our solution procedure builds on separation of variables and the principle of superposition:

1. Separation of variables. We seek a family of solutions to the heat equation of the form

un (x, t) = φn (x)Tn (t), n = 1, 2, . . . ,

which satisfy the PDE and the boundary conditions but not necessarily the initial condition.
As we will see shortly, the separable form allows us to use the standard ODE techniques to
find φn and Tn .

2. The principle of superposition. We then consider a linear combination of the above separable
functions
$$ u(x, t) = \sum_{n=1}^{\infty} a_n u_n(x, t) $$
that satisfy the initial condition. Note that due to the linearity and homogeneity of the PDE
and boundary conditions, any linear combination of the separable functions also satisfies the
PDE and the boundary conditions. Hence if we find the coefficients (an ) that satisfy the
initial condition, then the solution will satisfy the PDE, boundary conditions, and initial
condition.
This two-step procedure can be used to find a solution — and in fact a unique solution — to the
heat equation. (It can be shown that any function over (0, 1) × R>0 can be represented as a linear
combination of the form φn (x)Tn (t).)
We first seek a family of solutions of the form un (x, t) = φn (x)Tn (t) that satisfy the heat
equation and boundary conditions (but not the initial condition). To begin, we substitute the
separable form un (x, t) = φn (x)Tn (t) into the heat equation to obtain
$$ \phi_n T_n' = \phi_n'' T_n, $$
where $(\cdot)'$ and $(\cdot)''$ denote the first and second derivatives, respectively, of the single-variable func-
tion. We now assume that φn (x)Tn (t) ≠ 0 ∀(x, t) and divide through the equation by φn Tn to
obtain
$$ \frac{T_n'}{T_n} = \frac{\phi_n''}{\phi_n}. $$
We make a key observation: since the left hand side $T_n'/T_n$ is a function of t (only) and the right
hand side $\phi_n''/\phi_n$ is a function of x (only), each side must be independent of both x and t and hence
be a constant. We denote this constant by αn to obtain
$$ \frac{T_n'}{T_n} = \alpha_n \quad \text{and} \quad \frac{\phi_n''}{\phi_n} = \alpha_n. $$
Equivalently, we obtain a pair of ODEs
$$ \phi_n'' = \alpha_n \phi_n \quad \text{in } (0, 1), $$
$$ T_n' = \alpha_n T_n \quad \text{in } (0, \infty); $$

note that each equation is an ODE (rather than a PDE) since each problem is associated with
either x or t (but not both). In addition, the homogeneous Dirichlet boundary condition requires
that

$$ u_n(x = 0, t) = \phi_n(0) T_n(t) = 0, $$
$$ u_n(x = 1, t) = \phi_n(1) T_n(t) = 0. $$
As we seek a nontrivial solution (i.e., Tn ≠ 0), we deduce that φn (0) = φn (1) = 0. Hence the spatial
function φn must satisfy a boundary-value eigenproblem
$$ -\phi_n'' = -\alpha_n \phi_n \quad \text{in } (0, 1), \qquad (6.1) $$
$$ \phi_n(0) = \phi_n(1) = 0. $$

We recognize that this is a Sturm-Liouville problem. (We have negated the equation so that it fits
in the standard Sturm-Liouville form.) The problem was analyzed in Section 4.4 and in particular
Example 4.6; we found the solutions are of the form $-\alpha_n = n^2 \pi^2$ and $\phi_n(x) = \sin(n\pi x)$. (The
eigenfunction φn is unique up to multiplication by a constant.) For notational convenience, we will
henceforth use $\lambda_n \equiv \sqrt{-\alpha_n} = n\pi$ instead of αn .
Having found nontrivial solutions to the spatial eigenproblem, we now seek the associated solu-
tions to the temporal ODE, $T_n' = \alpha_n T_n$. Upon the substitution $\lambda_n^2 = -\alpha_n$, we obtain $T_n' = -\lambda_n^2 T_n$.
The general solution to the ODE is of the form
$$ T_n(t) = \exp(-\lambda_n^2 t) = \exp(-n^2 \pi^2 t), $$
which again is unique up to multiplication by a constant. We combine the expressions for φn and
Tn to conclude that any functions of the form
$$ u_n(x, t) = \phi_n(x) T_n(t) = \exp(-n^2 \pi^2 t) \sin(n\pi x), \quad n = 1, 2, \ldots, $$

satisfy the homogeneous PDE and boundary conditions (but not the initial condition).
We now appeal to the principle of superposition to find the solution to the initial-boundary
value problem. The key observation is that if each un for n = 1, 2, . . . satisfies the homogeneous
PDE and boundary conditions, so does
$$ u(x, t) = \sum_{n=1}^{\infty} a_n u_n(x, t) = \sum_{n=1}^{\infty} a_n \exp(-n^2 \pi^2 t) \sin(n\pi x), $$

for any an , n = 1, 2, . . . . The principle of superposition is a direct consequence of the linearity of
the heat equation and the homogeneous boundary conditions. Namely,
$$ \frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = \frac{\partial}{\partial t}\left( \sum_{n=1}^{\infty} a_n u_n \right) - \frac{\partial^2}{\partial x^2}\left( \sum_{n=1}^{\infty} a_n u_n \right) = \sum_{n=1}^{\infty} a_n \left( \frac{\partial u_n}{\partial t} - \frac{\partial^2 u_n}{\partial x^2} \right) = 0, $$
$$ u(x = 0, t) = \sum_{n=1}^{\infty} a_n u_n(x = 0, t) = 0, $$
$$ u(x = 1, t) = \sum_{n=1}^{\infty} a_n u_n(x = 1, t) = 0, $$

since each un satisfies the heat equation and the boundary conditions. We make one key remark:
the fact that both the PDE and boundary conditions are homogeneous is crucial to apply the
principle of superposition. Any linear combination of functions that satisfy a homogeneous PDE
and boundary conditions satisfies the PDE and the boundary conditions (simply because a sum of
multiples of 0 is always 0), but this would not be the case if the PDE or boundary conditions were not
homogeneous.
Having satisfied the PDE and boundary conditions, we now seek a set of coefficients an that
satisfies the initial condition:
$$ u(x, t = 0) = \sum_{n=1}^{\infty} a_n \sin(n\pi x) = g(x). $$

We recognize that this is a Fourier sine series, and the coefficients are given by
$$ a_n = \hat{g}_n = 2 \int_0^1 g(x) \sin(n\pi x) \, dx, \quad n = 1, 2, \ldots. $$
(Note that we denote the Fourier sine coefficients of g by ĝn , n = 1, 2, . . . .) Hence the solution to
our initial-boundary value problem is given by
$$ u(x, t) = \sum_{n=1}^{\infty} \hat{g}_n \exp(-n^2 \pi^2 t) \sin(n\pi x), \qquad (6.2) $$
where $\hat{g}_n = 2 \int_0^1 g(x) \sin(n\pi x) \, dx$. We make a few observations.
1. Convergence. If the initial condition g is square integrable, then the Fourier series is uni-
formly convergent for u(·, t), t > 0. To see this, we observe that (i) the Fourier coeffi-
cients are $b_n = \hat{g}_n \exp(-n^2 \pi^2 t)$ and (ii) the associated infinite sum $\sum_{n=1}^{\infty} b_n$ is absolutely
convergent. The absolute convergence of the series follows from a comparison test with
$\sum_{n=1}^{\infty} c_n = \sum_{n=1}^{\infty} 1/n^2$, which we know is convergent. We first note that $|\hat{g}_n| \to 0$ as
$n \to \infty$ because the initial condition g is square integrable and then find that $b_n/c_n =
|\hat{g}_n| \exp(-n^2 \pi^2 t)/(1/n^2) = |\hat{g}_n| n^2 \exp(-n^2 \pi^2 t) \to 0$ as $n \to \infty$, and hence the series con-
verges.

2. Smoothness. For t > 0, the solution to the heat equation is smooth (i.e., infinitely differen-
tiable and u(·, t) ∈ C ∞ (0, 1)). To see this, we recall that the function has a continuous k-th
order derivative if
$$ \sum_{n=1}^{\infty} n^k |b_n| = \sum_{n=1}^{\infty} |\hat{g}_n| n^k \exp(-n^2 \pi^2 t) < \infty. $$
We again perform a comparison test with a convergent series $\sum_{n=1}^{\infty} c_n = \sum_{n=1}^{\infty} 1/n^2$. We find
that $(n^k b_n)/c_n = |\hat{g}_n| n^k \exp(-n^2 \pi^2 t)/(1/n^2) = |\hat{g}_n| n^{k+2} \exp(-n^2 \pi^2 t) \to 0$ as $n \to \infty$ for any
$k \in \mathbb{N}_{\ge 0}$ and t > 0, and hence the series converges. It follows that the solution to the heat
equation is infinitely differentiable for any t > 0, even if the initial condition is only square
integrable (i.e., it could be discontinuous or be unbounded).

3. Steady state solution. As t → ∞, the solution decays and converges to u = 0. This agrees with
our physical intuition that at steady state the temperature should be equal to the ambient
temperature (i.e., the temperature on the boundary) which is 0.

4. Mode shape and decay rate. We observe that each mode exponentially decays with time. We
also observe that higher-frequency modes decay more rapidly than lower-frequency modes.
As a result, the solution gets qualitatively “smoother” as the time evolves.
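The convergence and smoothness observations above can be probed numerically; in this Python sketch (an added illustration using |ĝ_n| = 2/(nπ), the coefficient magnitudes of g(x) = x, with the derivative order k, time t, and truncation levels as demo choices) the weighted sum Σ nᵏ|ĝ_n|e^{−n²π²t} converges for t > 0 but grows without bound at t = 0.

```python
import math

def weighted_sum(k, t, N):
    # partial sum of n^k * |g_hat_n| * exp(-n^2 pi^2 t) with |g_hat_n| = 2/(n pi)
    return sum(n ** k * (2.0 / (n * math.pi)) * math.exp(-n ** 2 * math.pi ** 2 * t)
               for n in range(1, N + 1))

s500 = weighted_sum(2, 0.01, 500)
s1000 = weighted_sum(2, 0.01, 1000)   # identical: the series has converged for t > 0
d500 = weighted_sum(2, 0.0, 500)
d1000 = weighted_sum(2, 0.0, 1000)    # keeps growing: no smoothing at t = 0
```

This contrast is the smoothing effect of the heat equation in miniature: the factor e^{−n²π²t} tames any algebraic growth in n the instant t becomes positive.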
Example 6.1 (heat equation with homogeneous Dirichlet boundary conditions). We consider the
heat equation initial-boundary value problem
$$ \frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = 0 \quad \text{in } (0, 1) \times \mathbb{R}_{>0}, $$
$$ u = 0 \quad \text{on } \{0, 1\} \times \mathbb{R}_{>0}, $$
$$ u(x, t = 0) = x \quad \forall x \in (0, 1). $$

Figure 6.1: Solution to the heat equation with homogeneous Dirichlet boundary condition.

We have already computed the Fourier sine series representation of the initial condition g(x) = x
on (0, 1) in Example 2.18; the expression is
$$ g(x) \sim \sum_{n=1}^{\infty} \hat{g}_n \sin(n\pi x) = \sum_{n=1}^{\infty} \frac{2(-1)^{n+1}}{n\pi} \sin(n\pi x). $$
It follows that the solution to the heat equation is given by
$$ u(x, t) = \sum_{n=1}^{\infty} \hat{g}_n \exp(-n^2 \pi^2 t) \sin(n\pi x) = \sum_{n=1}^{\infty} \frac{2(-1)^{n+1}}{n\pi} \exp(-n^2 \pi^2 t) \sin(n\pi x). $$

Figure 6.1 shows the solution to the heat equation at a few different time instances. We observe
that the solution (i) decays in time and (ii) qualitatively becomes smoother. Both behaviors are
readily predicted from our Fourier series representation of the solution. We also observe that (iii)
the initial condition g(x) = x does not approach 0 on the x = 1 boundary; however, the problem
is still well-posed and the solution u(·, t) satisfies the boundary condition for any t > 0.
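The series of this example is straightforward to evaluate; the Python sketch below (an added illustration; the truncation level and sample times are demo choices) tabulates a truncated sum at x = 1/2 and reproduces the decay in time seen in Figure 6.1.

```python
import math

def u(x, t, N=200):
    # truncated series solution of Example 6.1 with g_hat_n = 2 (-1)^(n+1) / (n pi)
    return sum(2.0 * (-1.0) ** (n + 1) / (n * math.pi)
               * math.exp(-n ** 2 * math.pi ** 2 * t) * math.sin(n * math.pi * x)
               for n in range(1, N + 1))

vals = [u(0.5, t) for t in (0.01, 0.05, 0.1, 0.3)]  # decaying in time at x = 1/2
```

By t = 0.3 essentially only the first mode survives, so the value at x = 1/2 is very close to (2/π)e^{−0.3π²}, in line with the observation that higher-frequency modes decay much faster.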

6.3 Homogeneous Neumann boundary conditions


We now consider a heat equation with a homogeneous Neumann boundary condition
$$ \frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = 0 \quad \text{in } (0, 1) \times \mathbb{R}_{>0}, $$
$$ \frac{\partial u}{\partial x} = 0 \quad \text{on } \{0, 1\} \times \mathbb{R}_{>0}, $$
$$ u = g \quad \text{on } (0, 1) \times \{t = 0\}. $$

We apply separation of variables and consider solutions of the form un (x, t) = φn (x)Tn (t). The
substitution of the separable form of the solution yields
φ00n T0
φn Tn0 − φ00n Tn = 0 ⇒ = n = αn ;
φn Tn

here we appeal to the fact that φ00n /φn is a function of x only and Tn0 /Tn is a function of t only, and
hence each expression must be a constant.
We now focus on the solution of the spatial problem. We appeal to the boundary conditions
and find that
∂un/∂x (x = 0, t) = φn′(0) Tn(t) = 0,
∂un/∂x (x = 1, t) = φn′(1) Tn(t) = 0;
since we seek a nontrivial solution (i.e., Tn ≠ 0), we require φn′(0) = φn′(1) = 0. Hence the
boundary-value eigenproblem associated with the spatial function φn is
−φn″ = −αn φn   in (0, 1),
φn′(0) = φn′(1) = 0.
This is again a Sturm-Liouville problem. The problem was studied in Example 4.7; the eigenvalues are −αn = n²π² and the eigenfunctions are φn = cos(nπx). We again introduce λn = √(−αn) = nπ for notational convenience.
We now seek the solution to the temporal ODE
Tn′(t) = −λn² Tn(t)

for λn = nπ. The solution to this ODE is of the form Tn(t) = exp(−λn²t) = exp(−n²π²t) (up to multiplication by a constant). Hence the general solution is of the form

u(x, t) = Σ_{n=0}^∞ an Tn(t) φn(x) = Σ_{n=0}^∞ an exp(−n²π²t) cos(nπx).

We now impose the initial condition to find

u(x, t = 0) = Σ_{n=0}^∞ an cos(nπx) = g(x).

We recognize that this is almost a Fourier cosine series, with the exception of the constant term having a factor of 1 instead of 1/2. We rescale the representation and find that the solution is given by

u(x, t) = a0/2 + Σ_{n=1}^∞ an exp(−n²π²t) cos(nπx),

where

an = ĝn = 2 ∫_0^1 cos(nπx) g(x) dx.
We make a few observations:
1. Energy conservation. We recall that the total thermal energy in a domain Ω at time t is E(t) = ∫_Ω u(x, t) dx (for the nondimensionalized equation with ρc = 1). We observe that, for the homogeneous Neumann boundary condition, the total energy is conserved:

E(t) = ∫_0^1 u(x, t) dx = ∫_0^1 [a0/2 + Σ_{n=1}^∞ an exp(−n²π²t) cos(nπx)] dx = a0/2 = ∫_0^1 g(x) dx;
the terms associated with n = 1, 2, . . . vanish due to the orthogonality of the cosine function
with the constant function. Hence the total thermal energy is conserved. Physically, the
homogeneous Neumann boundary condition corresponds to an insulated boundary (i.e., a
boundary without any heat transfer), and hence the observed conservation of thermal energy
agrees with our physical intuition.

2. Convergence; smoothness; mode shape and decay rate. The observations we made for the homogeneous Dirichlet solution (6.2) also hold for the homogeneous Neumann solution.

Example 6.2 (heat equation with homogeneous Neumann boundary condition). We consider the
heat equation initial boundary value problem

∂u/∂t − ∂²u/∂x² = 0   in (0, 1) × R>0,
∂u/∂x = 0   on {0, 1} × R>0,
u(x, t = 0) = x   on (0, 1) × {t = 0}.

We have already computed the Fourier cosine series representation of the initial condition g(x) = x
on (0, 1) in Example 2.17; the expression is
g(x) ∼ ĝ0/2 + Σ_{n=1}^∞ ĝn cos(nπx) = 1/2 − Σ_{k=1}^∞ [4/((2k − 1)²π²)] cos((2k − 1)πx).

It follows that the solution to the heat equation is given by

u(x, t) = ĝ0/2 + Σ_{n=1}^∞ ĝn exp(−n²π²t) cos(nπx) = 1/2 − Σ_{k=1}^∞ [4/((2k − 1)²π²)] exp(−(2k − 1)²π²t) cos((2k − 1)πx).

Figure 6.2 shows the solution to the heat equation at a few different time instances. We observe
that the solution (i) satisfies the homogeneous Neumann condition (i.e., the derivative vanishes
at the endpoints) and (ii) is smooth for any time. Both behaviors are readily predicted from our
Fourier series representation of the solution.
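The conservation of total thermal energy can also be verified numerically. The sketch below (an illustrative check, with a hand-picked truncation and quadrature resolution) evaluates the truncated cosine series for g(x) = x and approximates E(t) = ∫_0^1 u(x, t) dx with the trapezoid rule:

```python
import math

def u_neumann(x, t, n_terms=200):
    """Truncated Fourier cosine series solution for g(x) = x with
    homogeneous Neumann boundary conditions on (0, 1)."""
    total = 0.5
    for k in range(1, n_terms + 1):
        m = 2 * k - 1  # only odd cosine modes appear for g(x) = x
        total -= (4.0 / (m**2 * math.pi**2)) \
            * math.exp(-m**2 * math.pi**2 * t) * math.cos(m * math.pi * x)
    return total

def energy(t, n_pts=1000):
    """Trapezoid-rule approximation of E(t) = int_0^1 u(x, t) dx."""
    h = 1.0 / n_pts
    vals = [u_neumann(i * h, t) for i in range(n_pts + 1)]
    return h * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])
```

The computed energy stays at E(t) = ∫_0^1 x dx = 1/2 for all times, as predicted by the analysis above.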

6.4 Mixed boundary conditions


We now consider a heat equation with mixed boundary conditions

∂u/∂t − ∂²u/∂x² = 0   in (0, 1) × R>0,
u = 0   on {x = 0} × R>0,
∂u/∂x = 0   on {x = 1} × R>0,
u = g   on (0, 1) × {t = 0}.

As the procedure is similar to the case for the homogeneous Dirichlet (or Neumann) boundary
condition considered earlier, we only highlight steps that are different from the previous cases.

Figure 6.2: Solution to the heat equation with homogeneous Neumann boundary condition.

We apply separation of variables and arrive at the following boundary-value eigenproblem as-
sociated with the spatial function φn

φn″ = αn φn   in (0, 1),
φn(0) = 0,
φn′(1) = 0.

This — or more precisely −φn″ = −αn φn — is a Sturm-Liouville problem with a mixed boundary condition; we know that the eigenvalues are real and the eigenfunctions are orthogonal. Moreover, because the conditions for Theorem 4.5 are satisfied, the eigenvalues −αn ≥ 0 (or αn ≤ 0). We hence set αn = −λn² and find the solution is of the form φn(x) = an cos(λn x) + bn sin(λn x). The boundary condition φn(x = 0) = 0 requires that an = 0. The boundary condition φn′(1) = 0 requires that φn′(1) = bn λn cos(λn) = 0, which implies λn = (n − 1/2)π, n = 1, 2, . . . .
The temporal ODE Tn′(t) = −λn² Tn(t) is the same as before, and its solution is of the form Tn(t) = exp(−λn²t) = exp(−(n − 1/2)²π²t) (up to multiplication by a constant). It follows that the solution to the mixed boundary value problem is of the form

u(x, t) = Σ_{n=1}^∞ an exp(−(n − 1/2)²π²t) sin((n − 1/2)πx).   (6.3)

We impose the initial condition to find

u(x, t = 0) = Σ_{n=1}^∞ an sin((n − 1/2)πx) = g(x).

We know that {sin((n − 1/2)πx)}_{n=1}^∞ forms an orthogonal basis because its members are the eigenfunctions of a Sturm-Liouville problem. We hence multiply both sides of the equation by sin((m − 1/2)πx) and integrate over (0, 1) to obtain

am ∫_0^1 sin((m − 1/2)πx)² dx = ∫_0^1 sin((m − 1/2)πx) g(x) dx,   m = 1, 2, . . . .

The integral on the left-hand side evaluates to

∫_0^1 sin((m − 1/2)πx)² dx = ∫_0^1 (1/2)(1 − cos(2(m − 1/2)πx)) dx = 1/2,

where the last equality follows from the fact that the integral of cos(2(m − 1/2)πx) over (0, 1) is 0 for any m. Hence

(1/2) an = ∫_0^1 sin((n − 1/2)πx) g(x) dx

or equivalently

an = 2 ∫_0^1 sin((n − 1/2)πx) g(x) dx.   (6.4)

The series (6.3) with the coefficients (6.4) provides a solution representation for the mixed boundary value problem.
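As a concrete check of (6.3)–(6.4), the sketch below computes the coefficients an by numerical quadrature for the illustrative choice g = 1 (not an example from the notes), for which an = 2/((n − 1/2)π) in closed form since cos((n − 1/2)π) = 0:

```python
import math

def mixed_coefs(g, n_terms=100, n_pts=2000):
    """Coefficients a_n = 2 int_0^1 sin((n - 1/2) pi x) g(x) dx, by the trapezoid rule."""
    h = 1.0 / n_pts
    coefs = []
    for n in range(1, n_terms + 1):
        lam = (n - 0.5) * math.pi
        s = 0.5 * (math.sin(0.0) * g(0.0) + math.sin(lam) * g(1.0))
        for i in range(1, n_pts):
            s += math.sin(lam * i * h) * g(i * h)
        coefs.append(2.0 * h * s)
    return coefs

def u_mixed(x, t, coefs):
    """Truncated series (6.3) for the mixed Dirichlet/Neumann problem."""
    total = 0.0
    for n, a in enumerate(coefs, start=1):
        lam = (n - 0.5) * math.pi
        total += a * math.exp(-lam**2 * t) * math.sin(lam * x)
    return total
```

The computed solution vanishes at x = 0 (Dirichlet end) term by term, and for t > 0 the series is dominated by the slowest mode λ1 = π/2.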

6.5 Nonhomogeneous source term


We now consider the heat equation with a nonhomogeneous source term:

∂u/∂t − ∂²u/∂x² = f   in (0, 1) × R>0,
u = 0   on {0, 1} × R>0,
u = g   on (0, 1) × {t = 0},

where the source function f : (0, 1) × R>0 → R in general depends on both space and time. This
nonhomogeneous problem can also be solved using separation of variables.
To begin, we first apply separation of variables to the homogeneous PDE ∂u/∂t − ∂²u/∂x² = 0 with the homogeneous Dirichlet boundary condition to find appropriate spatial basis functions (φn). This homogeneous problem, considered in Section 6.2, yields the basis functions φn(x) = sin(nπx). We express the solution as a linear combination of these spatial basis functions to obtain a solution representation

u(x, t) = Σ_{n=1}^∞ ûn(t) sin(nπx).

Note that the Fourier sine series will satisfy the homogeneous Dirichlet boundary condition for any time-dependent coefficients (ûn(t))_{n=1}^∞ at each time. We now substitute the Fourier representation of u into the PDE to obtain

Σ_{n=1}^∞ [ûn′(t) + n²π² ûn(t)] sin(nπx) = f(x, t).

We next multiply the equation by 2 sin(mπx), integrate from x = 0 to x = 1, and appeal to the
orthogonality of the Fourier basis to obtain

ûm′(t) + m²π² ûm(t) = f̂m(t),   m = 1, 2, . . . ,   (6.5)

where the Fourier coefficients of f are given by

f̂m(t) = 2 ∫_{x=0}^{1} f(x, t) sin(mπx) dx,   m = 1, 2, . . . .

The time-dependent coefficients ûn : R≥0 → R, n = 1, 2, . . . , must hence satisfy the ODE (6.5).
We now wish to identify the appropriate initial conditions for (6.5). To this end, we substitute the solution representation into the initial condition to obtain

u(x, t = 0) = Σ_{n=1}^∞ ûn(t = 0) sin(nπx) = g(x).

This is a Fourier sine series and hence we deduce that

ûn(t = 0) = ĝn ≡ 2 ∫_{x=0}^{1} g(x) sin(nπx) dx.   (6.6)

The combination of (6.5) and (6.6) yields

ûn′(t) + n²π² ûn(t) = f̂n(t),
ûn(t = 0) = ĝn

for n = 1, 2, . . . , which is a set of ODE initial value problems.


We find the solution to the ODE initial value problem using an integrating factor. To this end,
we multiply each ODE by exp(n²π²t) and note that

exp(n²π²t) ûn′(t) + n²π² exp(n²π²t) ûn(t) = d/dt (exp(n²π²t) ûn(t)) = exp(n²π²t) f̂n(t).
We integrate the expression from 0 to t to obtain

exp(n²π²t) ûn(t) − ûn(t = 0) = ∫_{τ=0}^{t} exp(n²π²τ) f̂n(τ) dτ.

We divide through by exp(n²π²t) and rearrange the expression to obtain

ûn(t) = ûn(t = 0) exp(−n²π²t) + ∫_{τ=0}^{t} exp(−n²π²(t − τ)) f̂n(τ) dτ.

Since ûn(t = 0) = ĝn, it follows that

ûn(t) = ĝn exp(−n²π²t) + ∫_{τ=0}^{t} exp(−n²π²(t − τ)) f̂n(τ) dτ.   (6.7)

Hence the solution to the nonhomogeneous heat equation is given by

u(x, t) = Σ_{n=1}^∞ ûn(t) sin(nπx),

where the coefficient functions are given by (6.7).


We make a few observations:

1. The solution to the nonhomogeneous heat equation is a sum of the solutions to two initial-boundary value problems. The first problem is the homogeneous problem with a nonzero initial condition: ∂u/∂t − ∂²u/∂x² = 0 and u = g at t = 0. The second problem is the nonhomogeneous problem with a zero initial condition: ∂u/∂t − ∂²u/∂x² = f and u = 0 at t = 0. The result is an application of the principle of superposition.
2. The term due to the nonhomogeneous source, ∫_{τ=0}^{t} exp(−n²π²(t − τ)) f̂n(τ) dτ, suggests that the addition of the heat source at time τ affects the solution for all t > τ, which agrees with our physical intuition. However, the influence of the heat source at time τ decays exponentially in time.

Example 6.3 (heat equation with a source term). We consider a nonhomogeneous heat equation

∂u/∂t (x, t) − ∂²u/∂x² (x, t) = x,   x ∈ (0, 1), t ∈ R>0,
u = 0   on {0, 1} × R>0,
u = 0   on (0, 1) × {t = 0}.

The heat equation has a nonzero but time-independent source term.


We first recognize that our boundary conditions are homogeneous Dirichlet, and hence our spatial basis is given by φn(x) = sin(nπx), n = 1, 2, . . . ; the basis of the Fourier sine series. To find the Fourier sine series representation of the solution, we first identify the Fourier sine coefficients of the source term f(x) = x on x ∈ (0, 1):

f̂n = 2(−1)^n⁺¹/(nπ),   n ∈ Z>0.

We appeal to (6.7), note that ĝn = 0 since the initial condition is g = 0, and find the Fourier sine coefficients of the solution:

ûn(t) = ∫_{τ=0}^{t} exp(−n²π²(t − τ)) f̂n dτ = [1/(n²π²)] (1 − exp(−n²π²t)) f̂n = [2(−1)^n⁺¹/(n³π³)] (1 − exp(−n²π²t)).

Hence the solution is given by

u(x, t) = Σ_{n=1}^∞ [2(−1)^n⁺¹/(n³π³)] (1 − exp(−n²π²t)) sin(nπx).

Figure 6.3 shows the solution to the heat equation at a few different time instances. We observe that, as t → ∞, the solution converges to the nontrivial steady-state solution

u(x, t → ∞) = Σ_{n=1}^∞ [2(−1)^n⁺¹/(n³π³)] sin(nπx);

this again agrees with our intuition that, in the presence of a steady heat source, the temperature approaches a steady nonzero distribution over time.
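The steady-state behavior can be confirmed numerically: as t → ∞ the series should converge to the solution of the steady problem −u″ = x with u(0) = u(1) = 0, which is u(x) = (x − x³)/6 by direct integration. The sketch below (illustrative truncation, not from the notes) evaluates the series:

```python
import math

def u_source(x, t, n_terms=200):
    """Series solution of u_t - u_xx = x on (0, 1) with homogeneous
    Dirichlet boundary conditions and zero initial condition."""
    total = 0.0
    for n in range(1, n_terms + 1):
        c = 2.0 * (-1) ** (n + 1) / (n**3 * math.pi**3)
        total += c * (1.0 - math.exp(-n**2 * math.pi**2 * t)) * math.sin(n * math.pi * x)
    return total
```

At t = 0 every term vanishes (the zero initial condition), while for large t the exponentials die out and the series reproduces (x − x³)/6.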

Figure 6.3: Solution to the heat equation with a nonhomogeneous source term.

6.6 Nonhomogeneous boundary conditions


We now consider the heat equation with nonhomogeneous boundary conditions:
∂u/∂t − ∂²u/∂x² = f   in (0, 1) × R>0,
u = h   on {0, 1} × R>0,
u = g   on (0, 1) × {t = 0},

where the boundary function h : {0, 1} × R>0 → R in general depends on time. To solve this
problem, we consider the decomposition of the solution into two parts: (i) a function ub that
satisfies the boundary condition and (ii) the remainder of the solution w such that u = ub + w.
We first note that the function ub can be obtained by “extending” the boundary function h to the
interior of the domain:
ub (x, t) = h(0, t) + (h(1, t) − h(0, t))x;
this is obviously not a unique choice of a function that satisfies the boundary conditions, but it is a
convenient choice because the Laplacian of the linear function vanishes. We substitute u = ub + w
to our initial-boundary value problem to obtain

∂ub/∂t + ∂w/∂t − ∂²w/∂x² = f   in (0, 1) × R>0,
ub + w = h   on {0, 1} × R>0,
ub + w = g   on (0, 1) × {t = 0},

where we have appealed to ∂²ub/∂x² = 0. In other words, the function w must satisfy

∂w/∂t − ∂²w/∂x² = f − ∂ub/∂t   in (0, 1) × R>0,
w = 0   on {0, 1} × R>0,
w = g − ub   on (0, 1) × {t = 0}.

This is a heat equation with a nonhomogeneous source term but with a homogeneous boundary
condition, which was considered in Section 6.5. We may hence solve for w using the procedure
outlined in the section and set u = ub + w. In the special case in which the boundary condition
does not depend on time, the heat equation for w simplifies to ∂w/∂t − ∂²w/∂x² = f.
Example 6.4 (nonhomogeneous Dirichlet boundary conditions). We consider the solution of a
heat equation initial-boundary value problem
∂u/∂t − ∂²u/∂x² = 0   in (0, 1) × R>0,
u(x = 0, t) = TL,   t ∈ R>0,
u(x = 1, t) = TR,   t ∈ R>0,
u = 0   on (0, 1) × {t = 0}
for some fixed temperatures TL and TR . The boundary condition is nonhomogeneous but is time-
independent. We seek the solution of a form u = ub + w, for some ub that satisfies the boundary
condition and w which solves the initial-boundary value problem. We choose the function
ub (x, t) = TL + (TR − TL )x
as our boundary function, which satisfies the boundary conditions. We then note that w must then
satisfy
∂w/∂t − ∂²w/∂x² = 0   on (0, 1) × R>0,
w = 0   on {0, 1} × R>0,
w = −ub = −TL − (TR − TL)x   on (0, 1) × {t = 0}.
This is a homogeneous heat equation with homogeneous boundary conditions, the solution approach
to which was discussed in Section 6.2.
The Fourier sine coefficients for the initial condition are
bn = 2 ∫_0^1 sin(nπx)(−TL − (TR − TL)x) dx
   = 2TL [cos(nπx)/(nπ)]_{x=0}^{1} + 2(TR − TL) [−sin(nπx)/(n²π²) + x cos(nπx)/(nπ)]_{x=0}^{1}
   = 2TL (cos(nπ) − 1)/(nπ) + 2(TR − TL) cos(nπ)/(nπ) = (2/(nπ)) ((−1)^n TL − TL + (−1)^n (TR − TL))
   = (2/(nπ)) (−TL + (−1)^n TR).
Hence the function w is given by
w(x, t) = Σ_{n=1}^∞ bn exp(−n²π²t) sin(nπx) = Σ_{n=1}^∞ (2/(nπ)) (−TL + (−1)^n TR) exp(−n²π²t) sin(nπx).

The solution u is

u(x, t) = ub(x, t) + w(x, t) = TL + (TR − TL)x + Σ_{n=1}^∞ (2/(nπ)) (−TL + (−1)^n TR) exp(−n²π²t) sin(nπx).

Figure 6.4: Solution to the heat equation with nonhomogeneous Dirichlet boundary condition.

Figure 6.4 shows the solution to the heat equation for TL = 1/3 and TR = 1 at a few different
time instances. We make a few observations. First, for a small t, the series representation of w is
almost equal to −ub and hence the solution u is close to zero (i.e., the initial condition). Second,
as t → ∞, the function w → 0 and hence the solution u converges to ub .
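A direct evaluation of u = ub + w confirms both limits. The following sketch (with the values TL = 1/3 and TR = 1 chosen to match Figure 6.4, and an illustrative truncation) evaluates the series:

```python
import math

def u_fixed_ends(x, t, TL, TR, n_terms=400):
    """u = u_b + w for constant Dirichlet data TL, TR and zero initial condition."""
    ub = TL + (TR - TL) * x  # steady linear profile satisfying the boundary conditions
    w = 0.0
    for n in range(1, n_terms + 1):
        bn = 2.0 / (n * math.pi) * (-TL + (-1) ** n * TR)
        w += bn * math.exp(-n**2 * math.pi**2 * t) * math.sin(n * math.pi * x)
    return ub + w
```

At very small t the series for w nearly cancels ub, so u is close to the zero initial condition; by t ≈ 1 the modes have decayed and u has converged to the linear steady profile ub.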

6.7 Summary
We summarize the key points of this lecture:

1. Using separation of variables and the principle of superposition, we can find Fourier series
representations of the solution to the heat equation on a one-dimensional bounded spatial
domain.

2. The application of separation of variables to the heat equation yields a Sturm-Liouville prob-
lem for the spatial modes. For homogeneous Dirichlet, Neumann, and mixed boundary condi-
tions, the Sturm-Liouville theory can be used to deduce that all eigenvalues are non-negative
and the eigenfunctions are orthogonal.

3. The solution to the heat equation is smooth (i.e., infinitely differentiable) for t > 0 for any
boundary conditions.

4. The solution to the heat equation with homogeneous Dirichlet boundary conditions is given by a Fourier sine series. All modes decay in time, and higher spatial modes decay more rapidly in time.

5. The solution to the heat equation with homogeneous Neumann boundary condition is given
by Fourier cosine series. The total energy is conserved in time.

6. The solutions to a heat equation with a nonhomogeneous source term or a nonhomogeneous boundary condition can be found again using separation of variables. The solution is the sum of the solutions associated with (i) a nonhomogeneous PDE (i.e., finite source) and a homogeneous initial condition and (ii) a nonhomogeneous initial condition and a homogeneous PDE (i.e., zero source).

Lecture 7

Heat equation: fundamental solution

©2019–2021 Masayuki Yano. Prepared for ESC384 Partial Differential Equations taught at the
University of Toronto.

7.1 Introduction
In this lecture we seek the solution of initial or initial-boundary value problems associated with the heat equation in unbounded domains. To find the solutions, we generalize Fourier series to unbounded domains to form Fourier integrals. We then identify, using Fourier integrals, the fundamental solution (or heat kernel) for heat equations in unbounded domains. We next employ the method of reflection to solve problems on the half (as opposed to entire) real line. We finally introduce Duhamel's principle, which allows us to express the solution to nonhomogeneous problems as a linear combination of the solutions to a family of homogeneous problems.

7.2 Fourier integral representation


We have so far represented functions on bounded domains as Fourier series. For instance, the Fourier series of f : (−L, L) → R is given by

f(x) = a0/2 + Σ_{n=1}^∞ (an cos(nπx/L) + bn sin(nπx/L)),

where

a0 ≡ (1/L) ∫_{−L}^{L} f(x) dx,   an ≡ (1/L) ∫_{−L}^{L} cos(nπx/L) f(x) dx,   bn ≡ (1/L) ∫_{−L}^{L} sin(nπx/L) f(x) dx.

We now introduce Fourier integrals, which provide similar representations for functions on unbounded domains (i.e., L → ∞).

Definition 7.1 (Fourier integral). Let f : R → R satisfy ∫_{−∞}^{∞} |f(x)| dx < ∞. The Fourier integral representation of f is given by

f(x) ∼ ∫_{λ=0}^{∞} (a(λ) cos(λx) + b(λ) sin(λx)) dλ,

where the Fourier coefficients are given by

a(λ) ≡ (1/π) ∫_{−∞}^{∞} cos(λx) f(x) dx,   b(λ) ≡ (1/π) ∫_{−∞}^{∞} sin(λx) f(x) dx.

The Fourier integral, just like the Fourier series, converges for piecewise continuously differentiable functions in a pointwise sense. We here state a key result without a proof.

Theorem 7.2 (convergence of Fourier integral). Let f : R → R be a function that (i) is piecewise continuously differentiable on any interval and (ii) satisfies ∫_{−∞}^{∞} |f(x)| dx < ∞. Then for any x ∈ R,

∫_{λ=0}^{∞} (a(λ) cos(λx) + b(λ) sin(λx)) dλ = (1/2)(f(x⁺) + f(x⁻)).

The Fourier integral, just like Fourier series, can be extended to odd and even functions.

Definition 7.3 (Fourier cosine integral). Let f : R → R be an even function that satisfies ∫_0^∞ |f(x)| dx < ∞. The Fourier cosine integral representation of f is given by

f(x) = ∫_{λ=0}^{∞} f̂(λ) cos(λx) dλ,

where the Fourier cosine coefficient is given by

f̂(λ) = (1/π) ∫_{−∞}^{∞} cos(λx) f(x) dx = (2/π) ∫_0^∞ cos(λx) f(x) dx.

Definition 7.4 (Fourier sine integral). Let f : R → R be an odd function that satisfies ∫_0^∞ |f(x)| dx < ∞. The Fourier sine integral representation of f is given by

f(x) = ∫_{λ=0}^{∞} f̂(λ) sin(λx) dλ,

where the Fourier sine coefficient is given by

f̂(λ) = (1/π) ∫_{−∞}^{∞} sin(λx) f(x) dx = (2/π) ∫_0^∞ sin(λx) f(x) dx.

The Fourier cosine and sine integrals exhibit the same convergence property as the (full) Fourier
integral.
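As a concrete illustration (not an example from the notes), take the even function f(x) = exp(−|x|), whose Fourier cosine coefficient evaluates in closed form to f̂(λ) = (2/π)/(1 + λ²). The sketch below reconstructs f by truncating and discretizing the Fourier cosine integral; the truncation limit and grid resolution are illustrative choices:

```python
import math

def f_hat(lam):
    """Fourier cosine coefficient of f(x) = exp(-|x|): (2/pi) / (1 + lam^2)."""
    return (2.0 / math.pi) / (1.0 + lam * lam)

def f_reconstructed(x, lam_max=200.0, n_pts=20000):
    """Trapezoid-rule approximation of int_0^lam_max f_hat(lam) cos(lam x) dlam."""
    h = lam_max / n_pts
    s = 0.5 * (f_hat(0.0) + f_hat(lam_max) * math.cos(lam_max * x))
    for i in range(1, n_pts):
        lam = i * h
        s += f_hat(lam) * math.cos(lam * x)
    return h * s
```

The truncated integral recovers exp(−|x|) to within a few times 1/λ_max, reflecting the slow (algebraic) decay of f̂(λ) for this only once-differentiable function.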

7.3 Heat equation in R × R≥0


We now consider the heat equation on the entire real line R:

∂u/∂t − ∂²u/∂x² = 0   in R × R>0,   (7.1)
u = g   on R × {t = 0},
|u(x, t)| < ∞ as x → ±∞.

The boundary condition is replaced by a boundedness condition which requires the solution to be
finite as x → ±∞.
As before, we apply separation of variables and express the solution as a linear combination of
homogeneous solutions of the form u(x, t) = φ(x)T (t). As usual, the substitution of the separable
expression to the heat equation yields a condition that
T′/T = φ″/φ = α,
for some constant α. The boundedness condition requires that φ remain finite as x → ±∞, and hence α must be negative so that φ are sinusoidal functions instead of hyperbolic sinusoidal functions, which are unbounded as x → ±∞. Without loss of generality, we set α = −λ², and find the solutions to the eigenproblem

φ″ + λ²φ = 0   in R,
|φ(x)| < ∞ as x → ±∞,

are of the form

φ(x) = a cos(λx) + b sin(λx);

the expression is bounded for any x ∈ R, λ ∈ R>0. The associated temporal functions are T(t) = exp(−λ²t).
We consider the "linear combination" of the solutions φ(x)T(t) to obtain

u(x, t) = ∫_{λ=0}^{∞} (a(λ) cos(λx) + b(λ) sin(λx)) exp(−λ²t) dλ.   (7.2)
The coefficient functions a(λ) and b(λ) are determined from the initial condition:

u(x, t = 0) = ∫_{λ=0}^{∞} (a(λ) cos(λx) + b(λ) sin(λx)) dλ = g(x).

We deduce that a(λ) and b(λ) are the Fourier coefficients of the Fourier integral of g; i.e.,

a(λ) ≡ (1/π) ∫_{x=−∞}^{∞} cos(λx) g(x) dx,   b(λ) ≡ (1/π) ∫_{x=−∞}^{∞} sin(λx) g(x) dx.   (7.3)
The solution to (7.1) is hence given by (7.2) with the coefficients (7.3).

7.4 Fundamental solution of the heat equation in R × R>0


We can rearrange the Fourier integral of the solution (7.2) to gain further insight into the structure of the heat equation. To begin, we substitute the coefficients (7.3) into (7.2) to obtain

u(x, t) = (1/π) ∫_{λ=0}^{∞} [(∫_{ξ=−∞}^{∞} cos(λξ) g(ξ) dξ) cos(λx) + (∫_{ξ=−∞}^{∞} sin(λξ) g(ξ) dξ) sin(λx)] exp(−λ²t) dλ
        = (1/π) ∫_{λ=0}^{∞} ∫_{ξ=−∞}^{∞} (cos(λξ) cos(λx) + sin(λξ) sin(λx)) g(ξ) dξ exp(−λ²t) dλ
        = (1/π) ∫_{λ=0}^{∞} ∫_{ξ=−∞}^{∞} cos(λ(ξ − x)) g(ξ) dξ exp(−λ²t) dλ,

where the last equality follows from a trigonometric identity. We now assume that the order of integration can be reversed to obtain

u(x, t) = (1/π) ∫_{ξ=−∞}^{∞} ∫_{λ=0}^{∞} cos(λ(ξ − x)) exp(−λ²t) dλ g(ξ) dξ.

The inner integral on λ over (0, ∞) can be computed using an integration technique in complex analysis:

∫_{λ=0}^{∞} cos(λ(ξ − x)) exp(−λ²t) dλ = √(π/(4t)) exp(−(ξ − x)²/(4t)).
(The evaluation is beyond the scope of this course, but those who have taken complex analysis
could attempt its evaluation.) We hence find that
u(x, t) = (1/√(4πt)) ∫_{ξ=−∞}^{∞} exp(−(ξ − x)²/(4t)) g(ξ) dξ.   (7.4)

This final expression shows how the initial condition g affects the solution u.
Because the solution of the form (7.4) provides much useful information about the behavior
of the PDE, we now formalize the solution representation of this form using the concept of the
fundamental solution.

Definition 7.5 (fundamental solution of the heat equation). The fundamental solution of the heat equation ∂u/∂t − ∂²u/∂x² = 0 in R × R>0 is

Φ(x, t) ≡ (1/√(4πt)) exp(−x²/(4t)) for t > 0,   and   Φ(x, t) ≡ 0 for t < 0.

The fundamental solution is also called the heat kernel or diffusion kernel.

The fundamental solution Φ(·, t), t > 0, is the Gaussian of variance 2t (or standard deviation √(2t)); the height of the Gaussian is 1/√(4πt). The fundamental solution for three different time instances is shown in Figure 7.1. The fundamental solution spreads and decays over time.
Using the fundamental solution, the solution to the initial value problem with u(x, t = 0) = g(x)
can be represented as
u(x, t) = ∫_{−∞}^{∞} Φ(x − ξ, t) g(ξ) dξ = (1/√(4πt)) ∫_{−∞}^{∞} exp(−(x − ξ)²/(4t)) g(ξ) dξ.

We now make several observations:

1. Isotropy. The fundamental solution is even in x; i.e., Φ(x, t) = Φ(−x, t). The initial solution
g spreads out to left and right in an equal manner, or more generally isotropically in higher
dimensions. (This is in contrast to the transport equation, where the solution transports in
a directional manner.)

2. Conservation of thermal energy. The spatial integral of the fundamental solution over (−∞, ∞) is given by

∫_{−∞}^{∞} Φ(x, t) dx = 1.

Figure 7.1: Fundamental solution of the heat equation.

Recalling that the spatial integral of the solution over (a, b) ⊂ R represents the thermal energy in the domain (a, b), we conclude that (i) the solution spreads out over time but (ii) the thermal energy is conserved. This is consistent with our understanding of the conservation of thermal energy.

3. Infinite propagation speed. The expression (7.4) shows that the solution u at a (space-time)
point (x, t), t > 0, depends on the initial solution g everywhere because exp(−(x − ξ)2 /(4t))
is nonzero for all ξ ∈ R. Conversely, the initial solution g at any point ξ affects solution
everywhere and immediately because the function exp(−(ξ − x)2 /4t) is nonzero everywhere
in R×R>0 . In other words, any change in the initial condition at some point ξ is felt for any x
immediately for t > 0, regardless of how far apart x and ξ are; i.e., disturbances propagate at
infinite speed. (This is in contrast to the finite propagation speed observed for the transport
and wave equations.)

4. Exponential decay of influence. While any disturbance affects the solution instantaneously everywhere, its effect is quantitatively limited to a short distance over a short time. Namely, if t ≪ 1, then the exponential in the integrand evaluates to nearly zero except for ξ ≈ x. Hence

u(x, t) ≈ (1/√(4πt)) ∫_{ξ=x−a}^{x+a} exp(−(ξ − x)²/(4t)) g(ξ) dξ.

The initial condition g(x) will have a significant effect on the solution u(·, t) only in the vicinity of x for small t.

5. Positivity. If g ≥ 0 but g ≢ 0 (i.e., g(x) ≥ 0 for all x and g(x) > 0 for some x), then the solution u is positive for all points x ∈ R and times t > 0. In words, if the initial temperature is nonnegative everywhere and positive somewhere, then the temperature is positive everywhere at any later time, no matter how small.

6. Smoothness. It can be shown that the solution to the heat equation is smooth; the solution
is infinitely differentiable in both space and time. The smoothness holds even if the initial
condition g is not differentiable. This is consistent with our observation for the solution on a
bounded domain obtained using the Fourier series.

Figure 7.2: The solution to the heat equation with a square initial condition.

Example 7.6 (heat equation on R × R>0). Consider the heat equation initial value problem (7.1) with the initial condition

g(x) = 1 for x ∈ (−1, 1),   and   g(x) = 0 for x ∈ R \ (−1, 1).

Using the fundamental solution, we can express the solution as

u(x, t) = (1/√(4πt)) ∫_{−1}^{1} exp(−(ξ − x)²/(4t)) dξ.

The integration is over (−1, 1) as the initial condition g vanishes outside of this interval. Figure 7.2
shows the solution to the heat equation. We readily confirm some of the observations we made
earlier. Even though the initial condition is discontinuous, the solution diffuses quickly and is
smooth for t > 0. While the solution is formally positive everywhere for any t > 0 (i.e., infinite
propagation speed), quantitatively the effect of the initial condition decays exponentially away
from x = ±1.
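The integral in Example 7.6 can be evaluated by numerical quadrature and cross-checked against the closed form u(x, t) = (1/2)[erf((x + 1)/√(4t)) − erf((x − 1)/√(4t))] that follows from the definition of the error function; the grid resolution below is an illustrative choice:

```python
import math

def heat_kernel(x, t):
    """Fundamental solution Phi(x, t) of the one-dimensional heat equation, t > 0."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def u_square(x, t, n_pts=4000):
    """Trapezoid-rule approximation of u(x, t) = int_{-1}^{1} Phi(x - xi, t) dxi."""
    h = 2.0 / n_pts
    s = 0.5 * (heat_kernel(x + 1.0, t) + heat_kernel(x - 1.0, t))
    for i in range(1, n_pts):
        s += heat_kernel(x - (-1.0 + i * h), t)
    return h * s
```

Note that the kernel is even in x (isotropy) and that the computed solution decays rapidly away from the support of g, consistent with the observations above.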

Remark 7.7. Fundamental solutions also exist in Rn × R>0 ; see Appendix 7.B.

7.5 Heat equation in R>0 × R>0 : homogeneous Dirichlet


We now consider the heat equation on the positive half of the real line R>0:

∂u/∂t − ∂²u/∂x² = 0   in R>0 × R>0,   (7.5)
u = 0   on {x = 0} × R>0,
u = g   on R>0 × {t = 0},
|u(x, t)| < ∞ as x → ∞.

We seek the solution using the method of reflection. The idea is to identify a problem on the entire
real line R whose solution restricted to R>0 solves the problem (7.5) on the half line. To find the

equivalent problem on R, we recall a key fact: if a function is (i) continuous and (ii) odd, then the
function vanishes at x = 0. The continuity requirement is satisfied for any solution to the heat
equation because the solution is smooth (i.e., infinitely differentiable). We hence wish to find a
problem on R whose solution is the odd extension of the solution of (7.5). We achieve this goal
by using the initial condition g̃ : R → R that is the odd extension of g : R>0 → R. The resulting
problem on the entire real line R is as follows:
∂u/∂t − ∂²u/∂x² = 0   in R × R>0,
u = g̃   on R × {t = 0},
|u(x, t)| < ∞ as x → ±∞,

where g̃ is the odd extension of g,

g̃(x) = g(x) for x ∈ R>0,   and   g̃(x) = −g(−x) for x ∈ R<0.

The associated solution is

u(x, t) = ∫_{−∞}^{∞} Φ(x − ξ, t) g̃(ξ) dξ = ∫_0^∞ Φ(x − ξ, t) g(ξ) dξ − ∫_{−∞}^{0} Φ(x − ξ, t) g(−ξ) dξ
        = ∫_0^∞ (Φ(x − ξ, t) − Φ(x + ξ, t)) g(ξ) dξ
        = (1/√(4πt)) ∫_0^∞ [exp(−(x − ξ)²/(4t)) − exp(−(x + ξ)²/(4t))] g(ξ) dξ.
Because the method of reflection applied to the homogeneous Dirichlet boundary condition is based
on the odd extension of the initial condition, the method is also called the method of odd extensions.
Example 7.8. We consider the heat equation initial-boundary value problem (7.5) with the initial condition g = 1 on R>0. The solution is

u(x, t) = (1/√(4πt)) ∫_0^∞ [exp(−(x − ξ)²/(4t)) − exp(−(x + ξ)²/(4t))] dξ.

We substitute z = (x − ξ)/√(4t) and z = (x + ξ)/√(4t) into the first and second integrals, respectively, and obtain

u(x, t) = (1/√π) (∫_{−∞}^{x/√(4t)} exp(−z²) dz − ∫_{x/√(4t)}^{∞} exp(−z²) dz)
        = (1/√π) ∫_{−x/√(4t)}^{x/√(4t)} exp(−z²) dz ≡ erf(x/√(4t));

here the first equality follows from the aforementioned substitution, the second equality follows from the symmetry of the integrand and the cancellation of the integrals over (−∞, −x/√(4t)) and (x/√(4t), ∞), and the last equality follows from the definition of the error function.
Figure 7.3 shows the solution, extended oddly to the entire real line R. We make a few observations, the latter three of which reiterate our earlier observations about the solution of the heat equation (on R × R>0).

Figure 7.3: Solution of Example 7.8.

1. The solution attains u(x = 0, t) = 0 for all t ∈ [0, ∞), and the homogeneous Dirichlet boundary condition is satisfied.

2. Smoothness. The discontinuity in the initial condition disappears immediately. The solution is infinitely differentiable for t > 0.

3. Infinite propagation speed. The discontinuity affects the solution immediately and everywhere.
The disturbance propagates at infinite speed.

4. Exponential decay of influence. However, quantitatively, only the solution in the region |x| ≲ √(4t) sees significant deviation from the initial condition; significant impact of the diffusion process is only felt in this region, and the width of the region grows as √(4t).
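The erf expression can be cross-checked against direct quadrature of the odd-extension (image) representation; the truncation of the ξ domain and the grid below are illustrative choices:

```python
import math

def u_erf(x, t):
    """Closed-form solution u(x, t) = erf(x / sqrt(4 t)) from Example 7.8."""
    return math.erf(x / math.sqrt(4.0 * t))

def u_images(x, t, xi_max=20.0, n_pts=8000):
    """Trapezoid-rule quadrature of int_0^inf (Phi(x-xi,t) - Phi(x+xi,t)) dxi."""
    h = xi_max / n_pts
    s = 0.0
    for i in range(n_pts + 1):
        xi = i * h
        w = 0.5 if i in (0, n_pts) else 1.0
        s += w * (math.exp(-(x - xi)**2 / (4.0 * t))
                  - math.exp(-(x + xi)**2 / (4.0 * t)))
    return h * s / math.sqrt(4.0 * math.pi * t)
```

The two evaluations agree to within quadrature error, and both vanish at x = 0 for all t, as required by the Dirichlet condition.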

7.6 Heat equation in R>0 × R>0 : homogeneous Neumann


We now consider the heat equation on the half line R>0:

∂u/∂t − ∂²u/∂x² = 0   in R>0 × R>0,   (7.6)
∂u/∂x = 0   on {x = 0} × R>0,
u = g   on R>0 × {t = 0},
|u(x, t)| < ∞ as x → ∞.

We again seek the solution using the method of reflection; i.e., we seek a problem on the entire R
whose solution restricted to the half line R>0 is the solution to the homogeneous Neumann problem.
We recall a key fact: any continuously differentiable and even function has a vanishing derivative at
x = 0. The continuously differentiable requirement is satisfied for any solution to the heat equation
because the solution is smooth (i.e., infinitely differentiable). We hence wish to find a problem on
R whose solution is an even extension of the solution to the homogeneous Neumann problem. To
this end, we recast the problem on the entire R whose initial condition is the even extension of the

initial condition g. (Note that, to ensure the derivative (instead of the value) vanishes at x = 0,
we consider the even (instead of the odd) extension.)
The resulting problem on the entire real line R is

∂u/∂t − ∂²u/∂x² = 0   in R × R>0,   (7.7)
u = g̃   on R × {t = 0},
|u(x, t)| < ∞ as x → ±∞,

where g̃ is the even extension of g,

g̃(x) = g(x) for x ∈ R>0,   and   g̃(x) = g(−x) for x ∈ R<0.

The associated solution is

u(x, t) = ∫_0^∞ (Φ(x − ξ, t) + Φ(x + ξ, t)) g(ξ) dξ
        = (1/√(4πt)) ∫_0^∞ [exp(−(x − ξ)²/(4t)) + exp(−(x + ξ)²/(4t))] g(ξ) dξ.

The method of reflection applied to the homogeneous Neumann problem is also called the method
of even extension. We can readily verify the earlier observations about (i) smoothness and (ii)
infinite propagation speed.

7.7 Nonhomogeneous problems and Duhamel’s principle


We now consider the solution of nonhomogeneous problems of the form

∂u/∂t − ∂²u/∂x² = f   in R × R>0,   (7.8)
u = g   on R × {t = 0}.

We immediately note that, by the linearity of the PDE, the solution can be decomposed into u = w + v, where w satisfies the homogeneous heat equation with a nontrivial initial condition

∂w/∂t − ∂²w/∂x² = 0   in R × R>0,   (7.9)
w = g   on R × {t = 0},

and v satisfies the nonhomogeneous heat equation with the trivial initial condition

∂v/∂t − ∂²v/∂x² = f   in R × R>0,   (7.10)
v = 0   on R × {t = 0}.

We have already seen how to solve (7.9) in previous sections; we hence focus on the solution
of (7.10). One approach to solve the problem is to appeal to Duhamel’s principle:

Theorem 7.9 (Duhamel’s principle for the heat equation). Let u : R × R>0 → R be the solution
to the initial value problem
∂u ∂ 2 u
− 2 =f in R × R>0 ,
∂t ∂x
u=0 on R × {t = 0},

for some source function f : R × R>0 → R. For any fixed s ∈ R>0 , let us : R × R>s → R be the
solution to
∂us ∂ 2 us
− = 0 in R × R>s ,
∂t ∂x2
us = f (·, s) on R × {t = s}.

Then, the solution u : R × R>0 is given by


Z t
u(x, t) = us (x, t)ds, x ∈ R, t ∈ R>0 . (7.11)
s=0

Proof. We differentiate (7.11) in time to obtain

∂u/∂t (x, t) = lim_{s→t} u^s(x, t) + ∫_{s=0}^{t} ∂u^s/∂t (x, t) ds = f(x, t) + ∫_{s=0}^{t} ∂u^s/∂t (x, t) ds.

Similarly, we apply the spatial differential operator to obtain

∂²u/∂x² (x, t) = ∫_{s=0}^{t} ∂²u^s/∂x² (x, t) ds = ∫_{s=0}^{t} ∂u^s/∂t (x, t) ds.

It follows that

∂u/∂t (x, t) − ∂²u/∂x² (x, t) = f(x, t) + ∫_{s=0}^{t} ∂u^s/∂t (x, t) ds − ∫_{s=0}^{t} ∂u^s/∂t (x, t) ds = f(x, t),
and hence the solution (7.11) satisfies the PDE. We also observe that
Z t=0
u(x, t = 0) = us (x, t = 0)ds = 0, x ∈ R,
s=0

as us (·, t = 0) = f (·, s) is bounded, and hence the solution satisfies the homogeneous initial condi-
tion.

Intuitively, Duhamel's principle states that the solution u to the nonhomogeneous heat equation
can be expressed as a "linear combination" of the solutions {u^s}_{s∈(0,t)} to homogeneous heat
equations with initial conditions associated with the nonhomogeneous source term f. We note that
each solution u^s : R × R>s → R can be expressed using the fundamental solution:

    u^s(x, t) = ∫_R Φ(x − ξ, t − s) f(ξ, s) dξ.

It follows that the solution to (7.10) with the homogeneous initial condition is

    v(x, t) = ∫₀ᵗ u^s(x, t) ds = ∫₀ᵗ ∫_R Φ(x − ξ, t − s) f(ξ, s) dξ ds.

The complete solution to (7.8) with the initial condition u(·, t = 0) = g is hence

    u(x, t) = ∫_R Φ(x − ξ, t) g(ξ) dξ + ∫₀ᵗ ∫_R Φ(x − ξ, t − s) f(ξ, s) dξ ds
            = (1/√(4πt)) ∫_R exp(−(x − ξ)²/(4t)) g(ξ) dξ
              + ∫₀ᵗ (1/√(4π(t − s))) ∫_R exp(−(x − ξ)²/(4(t − s))) f(ξ, s) dξ ds.

Remark 7.10. The proof of Duhamel's principle relies only on the linearity of the spatial operator
(which is ∂²/∂x² for the heat equation in one dimension). Hence the principle applies to any linear
nonhomogeneous PDE in any spatial dimension.
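The Duhamel double integral can be checked numerically by quadrature. The sketch below (illustrative; the midpoint rule, the truncation radius, and the test source are our choices, not from the notes) uses the constant source f ≡ 1 with g = 0, for which the exact solution of (7.10) is v(x, t) = t, because the heat kernel integrates to one in ξ:

```python
import math

def heat_kernel(x, t):
    # 1D heat kernel Phi(x, t), t > 0.
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def duhamel(x, t, f, xi_max=20.0, n_xi=400, n_s=400):
    # v(x, t) = int_0^t int_R Phi(x - xi, t - s) f(xi, s) dxi ds, evaluated
    # with the midpoint rule in both variables (midpoints in s avoid the
    # kernel's singular limit s -> t).
    ds, dxi = t / n_s, 2.0 * xi_max / n_xi
    total = 0.0
    for j in range(n_s):
        s = (j + 0.5) * ds
        for k in range(n_xi):
            xi = -xi_max + (k + 0.5) * dxi
            total += heat_kernel(x - xi, t - s) * f(xi, s) * dxi * ds
    return total

# Constant source f = 1 with zero initial data: since the kernel integrates
# to one in xi, the exact solution is v(x, t) = t.
v = duhamel(x=0.3, t=1.0, f=lambda xi, s: 1.0)
print(v)  # close to 1.0
```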

Example 7.11 (nonhomogeneous heat equation: revisited). In Section 6.5, we considered the
solution of the nonhomogeneous one-dimensional heat equation using the separation of variables.
For homogeneous Dirichlet problems on Ω ≡ (0, 1) with the initial condition g = 0, we found that
the solution is given by u(x, t) = Σ_{n=1}^∞ ûn(t) sin(nπx), where

    ûn(t) = ∫₀ᵗ exp(−n²π²(t − τ)) f̂n(τ) dτ,

and f̂n(τ) are the Fourier sine coefficients of the source function f(·, τ).
    We can derive the same relationship using Duhamel's principle. To see this, we first recall
that the solution to the homogeneous heat equation for any fixed s ∈ R>0,

    ∂u^s/∂t − ∂²u^s/∂x² = 0   in (0, 1) × R>s,
    u^s = 0                    on {0, 1} × R>s,
    u^s = f(·, s)              on (0, 1) × {t = s},

is given by

    u^s(x, t) = Σ_{n=1}^∞ f̂n(s) exp(−n²π²(t − s)) sin(nπx).

We now appeal to Duhamel's principle to observe that

    u(x, t) = ∫₀ᵗ u^s(x, t) ds = Σ_{n=1}^∞ ( ∫₀ᵗ exp(−n²π²(t − s)) f̂n(s) ds ) sin(nπx),

and the integral in parentheses is precisely ûn(t); the expression is identical to that obtained in
Section 6.5 through an explicit manipulation of the Fourier sine series coefficients.

7.8 Nonhomogeneous boundary conditions


We now consider the solution of nonhomogeneous heat equations on the half line R>0 with non-
homogeneous boundary conditions. Our approach is similar to the approach we used to treat
nonhomogeneous boundary conditions on bounded domains using separation of variables in Sec-
tion 6.6.
    We first consider the case with a nonhomogeneous Dirichlet boundary condition:

    ∂u/∂t − ∂²u/∂x² = f   in R>0 × R>0,                                      (7.12)
    u = h                  on {x = 0} × R>0,
    u = g                  on R>0 × {t = 0},

where f : R>0 × R>0 → R is the source function, h : R>0 → R is the boundary function, and
g : R>0 → R is the initial condition. We decompose the solution into two parts:

    u(x, t) = v(x, t) + h(t);

note that the second term matches the boundary condition for all t, and hence the first term
vanishes on the boundary. (Here h(t) plays the role of ub in Section 6.6.) The substitution of the
decomposition into (7.12) yields another IBVP for v : R>0 × R>0 → R:

    ∂v/∂t − ∂²v/∂x² = f − dh/dt   in R>0 × R>0,
    v = 0                          on {x = 0} × R>0,
    v = g − h                      on R>0 × {t = 0}.

This is a nonhomogeneous heat equation with a homogeneous Dirichlet boundary condition, which
can be solved using Duhamel's principle and the fundamental solution as shown in Section 7.7.
We next consider the case with a nonhomogeneous Neumann boundary condition:

    ∂u/∂t − ∂²u/∂x² = f   in R>0 × R>0,                                      (7.13)
    ∂u/∂x = h              on {x = 0} × R>0,
    u = g                  on R>0 × {t = 0},

where f : R>0 × R>0 → R is the source function, h : R>0 → R is the boundary derivative function,
and g : R>0 → R is the initial condition. In this case, we consider the decomposition

    u(x, t) = v(x, t) + x h(t);

the second term matches the nonhomogeneous Neumann boundary condition for all t, and hence
the first term satisfies a homogeneous Neumann boundary condition on the boundary. The
substitution of the decomposition into (7.13) yields another IBVP for v : R>0 × R>0 → R:

    ∂v/∂t − ∂²v/∂x² = f − x dh/dt   in R>0 × R>0,
    ∂v/∂x = 0                        on {x = 0} × R>0,
    v = g − x h                      on R>0 × {t = 0}.

This is a nonhomogeneous heat equation with a homogeneous Neumann boundary condition, which
again can be solved using Duhamel's principle and the fundamental solution as shown in Section 7.7.
We conclude the section with an example with a nonhomogeneous Dirichlet boundary condition.

Example 7.12. Consider the heat equation on a half line,

    ∂u/∂t − ∂²u/∂x² = 0   in R>0 × R>0,                                      (7.14)
    u = 1                  on {x = 0} × R>0,
    u = 0                  on R>0 × {t = 0}.

Our initial condition is the zero solution, and we impose the boundary condition u = 1 at x = 0;
we hence intuitively expect the temperature to increase with time, and the "warm region" to
spread with time.
    We recognize h(t) = 1 and decompose the solution u as

    u(x, t) = v(x, t) + 1.

The substitution of the decomposition into (7.14) yields an IBVP for v : R>0 × R>0 → R:

    ∂v/∂t − ∂²v/∂x² = 0   in R>0 × R>0,
    v = 0                  on {x = 0} × R>0,
    v = −1                 on R>0 × {t = 0}.

This problem was considered in Example 7.8, and it has the solution

    v(x, t) = −erf(x/√(4t)).

Hence the solution to our original problem (7.14) is given by

    u(x, t) = 1 − erf(x/√(4t)) = erfc(x/√(4t)).

We now make a few observations about the solutions v and u:

1. Smoothness. The inconsistency in the initial and boundary conditions at (x, t) = (0, 0)
   disappears immediately. The solution is infinitely differentiable for t > 0.

2. Infinite propagation speed. The inconsistency affects the solution immediately and everywhere.
   The disturbance propagates at infinite speed.

3. Exponential decay of influence. However, quantitatively, only the solution in the region
   x ≲ √(4t) sees significant deviation from the initial condition; significant impact of the
   diffusion process is only felt in this region, and the width of the region grows as √(4t).
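The closed-form solution is easy to evaluate numerically; Python's standard math module provides erf and erfc. A small illustrative check of the three observations above:

```python
import math

def u(x, t):
    # u(x, t) = erfc(x / sqrt(4 t)) for the half-line problem (7.14).
    return math.erfc(x / math.sqrt(4.0 * t))

print(u(0.0, 0.25))              # boundary value: 1.0
print(u(5.0, 0.01))              # far field at early time: essentially 0
print(u(1.0, 0.1), u(1.0, 0.4))  # a fixed point heats up over time
```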

7.9 Summary
We summarize key points of this lecture:

1. The Fourier integral is a "generalization" of the Fourier series to non-periodic functions on
   the entire real line R.

Figure 7.4: Solution to the heat equation initial-boundary value problem (7.14).

2. The solution to the heat equation on R can be expressed as a convolution of the initial
condition and the fundamental solution (i.e., the heat kernel).

3. Important properties of the heat kernel include isotropy, conservation of thermal energy,
infinite propagation speed, exponential decay of influence, positivity, and smoothness.

4. The solution to the heat equation on R>0 with homogeneous Dirichlet boundary condition can
be found using the method of odd extension, which belongs to a family of reflection methods.

5. The solution to the heat equation on R>0 with homogeneous Neumann boundary condition
can be found using the method of even extension.

6. Duhamel’s principle can be used to express the solution to nonhomogeneous heat equation as
a linear combination of a family of solutions to homogeneous heat equations with particular
initial conditions.

7. Nonhomogeneous boundary conditions can be treated by decomposing the solution into a
   (known) boundary function that satisfies the boundary condition and an auxiliary solution.
   The solution to the auxiliary problem, with homogeneous boundary conditions, can be
   obtained using Duhamel's principle and the fundamental solution.

7.A "Derivation" of Fourier integrals
We note that the following "derivation" of Fourier integrals is not rigorous; nevertheless it
provides the correct final expression for Fourier integrals. We first analyze the constant term of
the Fourier series:

    f₀(x) ≡ a₀ ≡ (1/2L) ∫_{−L}^{L} f(x) dx ≤ (1/2L) ∫_{−L}^{L} |f(x)| dx → 0 as L → ∞,

where we have used the fact that the integral of f over R is finite. (Note that if ∫_R |f(x)| dx is
finite, then (1/2L) ∫_{−L}^{L} |f(x)| dx → 0 as L → ∞.) We next analyze the cosine terms of the
Fourier series:

    f_c(x) ≡ Σ_{n=1}^∞ ( (1/L) ∫_{−L}^{L} cos(nπξ/L) f(ξ) dξ ) cos(nπx/L),

where ξ is a "dummy variable" for the integration to avoid conflation with "x". We now consider
the change of variables λn ≡ nπ/L and note ∆λ ≡ λ_{n+1} − λn = π/L to obtain

    f_c(x) ≡ Σ_{n=1}^∞ ∆λ ( (1/π) ∫_{−L}^{L} cos(λn ξ) f(ξ) dξ ) cos(λn x).

We then interpret the sum as a Riemann sum and take L → ∞ (and hence ∆λ → 0) to replace
the sum with an integral and obtain

    f_c(x) ≡ ∫₀^∞ ( (1/π) ∫_{−∞}^{∞} cos(λξ) f(ξ) dξ ) cos(λx) dλ.

We can apply the same transformation to the sine terms of the Fourier series to obtain

    f_s(x) ≡ ∫₀^∞ ( (1/π) ∫_{−∞}^{∞} sin(λξ) f(ξ) dξ ) sin(λx) dλ.

We again note that the above "derivation" is not rigorous; however, it leads to the correct
definition of the Fourier integral.

7.B Fundamental solution in Rⁿ × R>0

The idea of the fundamental solution readily extends to higher spatial dimensions. We here omit
the derivation and state the key results.

Definition 7.13 (fundamental solution of the heat equation in Rⁿ). The fundamental solution of
the heat equation ∂u/∂t − ∆u = 0 in Rⁿ is

    Φ(x, t) ≡ (1/(4πt)^{n/2}) exp(−|x|²/(4t)),   t > 0,
    Φ(x, t) ≡ 0,                                 t < 0.

We verify that the particular result we derived for R¹ is a specialization of this general result to
n = 1. We now consider the initial value problem

    ∂u/∂t − ∆u = 0   in Rⁿ × R>0,
    u = g             on Rⁿ × {t = 0},
    |u| < ∞           in Rⁿ × R>0.

Analogously to the problem on R × R>0, the solution to this problem on Rⁿ × R>0 can be
expressed using the fundamental solution:

    u(x, t) = ∫_{Rⁿ} Φ(x − ξ, t) g(ξ) dξ = (1/(4πt)^{n/2}) ∫_{Rⁿ} exp(−|x − ξ|²/(4t)) g(ξ) dξ.   (7.15)

We can readily show that the four observations we have made about the solution of the heat
equation in R¹ — smoothness, infinite propagation speed, exponential decay of influence, and
positivity — also hold for the heat equation in Rⁿ; the observations follow from the form of the
fundamental solution.
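The conservation-of-thermal-energy property, ∫_{Rⁿ} Φ(x, t) dx = 1 for every t > 0, can be verified by quadrature. The sketch below (illustrative; the truncation box and grid resolution are arbitrary choices) checks it for n = 2 with a tensor-product midpoint rule:

```python
import math

def Phi(x, y, t):
    # 2D heat kernel: (1 / (4 pi t)) exp(-(x^2 + y^2) / (4 t)).
    return math.exp(-(x * x + y * y) / (4.0 * t)) / (4.0 * math.pi * t)

def total_mass(t, L=12.0, n=300):
    # Midpoint rule over the truncated square [-L, L]^2.
    h = 2.0 * L / n
    s = 0.0
    for i in range(n):
        x = -L + (i + 0.5) * h
        for j in range(n):
            y = -L + (j + 0.5) * h
            s += Phi(x, y, t) * h * h
    return s

for t in (0.1, 1.0, 4.0):
    print(t, total_mass(t))  # each close to 1.0
```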

Lecture 8

Heat equation: finite difference method

©2019–2021 Masayuki Yano. Prepared for ESC384 Partial Differential Equations taught at the
University of Toronto.

8.1 Motivation
We have so far considered analytical approaches to “solve” for — or more precisely represent
— the solution to the heat equation: separation of variables and fundamental solutions. While
the analytical representations are useful in understanding properties of the PDE, they cannot
readily treat general heat equations involving variable coefficients and general domains, which are
frequently encountered in science and engineering. In practice, these more general initial-boundary
value problems are solved — or more precisely approximated — using numerical methods. In this
lecture we consider numerical approximation of the heat equation. For simplicity, we consider a
one-dimensional heat equation; we will later consider the treatment of higher-dimensional problems.

8.2 Model equation


Throughout this lecture, we consider an initial-boundary value problem defined on the spatial
domain Ω ≡ (xL, xR) and the time interval I ≡ (0, tf]:

    ∂u/∂t − ∂²u/∂x² = f   in (xL, xR) × (0, tf],
    u = hL                 on {x = xL} × (0, tf],
    ∂u/∂x = hR             on {x = xR} × (0, tf],
    u = g                  on (xL, xR) × {t = 0},

where the functions f : Ω × I → R, g : Ω → R, hL : I → R, and hR : I → R depend on space
and/or time in an arbitrary manner. We consider a problem with Dirichlet and Neumann boundary
conditions on the left and right boundary, respectively, to demonstrate the treatment of these two
different boundary conditions.

Figure 8.1: Space-time finite difference grid for one-dimensional heat equation for n = 4 and J = 5.

8.3 Discretization in space and time


To approximate the solution to the heat equation, we first discretize our space-time domain
(xL, xR) × (0, tf). To this end, we introduce a spatial computational grid comprising n + 1 points:

    xL ≡ x0 < x1 < · · · < x_{n−1} < xn ≡ xR.

For simplicity we assume that the grid points are equispaced so that

    xi = i∆x,   i = 0, 1, . . . , n,

where ∆x ≡ (xR − xL)/n. We similarly introduce a temporal computational grid comprising J + 1
points:

    0 ≡ t0 < t1 < · · · < t_{J−1} < tJ ≡ tf.

For simplicity, we again assume that the time intervals are equispaced so that

    tj = j∆t,   j = 0, 1, . . . , J,

where ∆t ≡ tf/J. Our goal is to approximate the solution at the space-time grid points u(xi, tj)
by a computational approximation ũ_i^j so that

    u(xi, tj) ≈ ũ_i^j,   i = 0, . . . , n,   j = 0, . . . , J.

Figure 8.1 shows the space-time finite difference grid. We now introduce the finite difference
method in space and time to approximate the solution at the grid points.
    (While we will not consider non-equispaced grids or time intervals in this lecture, finite
difference methods readily extend to these cases. In fact, adaptive and non-equispaced grids and
time intervals can greatly improve the efficiency of the method for some problems.)

8.4 Finite difference in space
The numerical approximation of PDEs requires the approximation of the (spatial) derivatives
that appear in the PDEs. To this end, we introduce a generic function f : [xL, xR] → R and
illustrate an approach to estimate its derivative at a set of points. Specifically, given the function
values at the grid points x0, . . . , xn, we wish to approximate the derivative at the same set of
points. We first introduce an approximation of the first derivative:

Definition 8.1 (first-derivative central difference formula). The first-derivative central difference
formula is

    f′(xi) ≈ (f_{i+1} − f_{i−1}) / (2∆x),                                    (8.1)

where fi ≡ f(xi) and ∆x = x_{i+1} − xi = xi − x_{i−1}.

This is an example of a finite difference formula, in which the derivative is approximated by a
finite number of differences on a finite grid. Note that in the limit of ∆x → 0, the formula
recovers the definition of the derivative. We now formally analyze the truncation error.

Proposition 8.2. Given a thrice continuously differentiable function f,

    | f′(xi) − (f_{i+1} − f_{i−1})/(2∆x) | ≤ (1/6) max_{ξ∈[x_{i−1}, x_{i+1}]} |f′′′(ξ)| ∆x²,

where fi ≡ f(xi) and ∆x = x_{i+1} − xi = xi − x_{i−1}. The formula is second-order accurate;
i.e., the error E is bounded by E ≤ C∆x² for some C independent of ∆x, and in particular it
depends on the square of the grid spacing ∆x.

Proof. We consider the Taylor series expansions of f_{i+1} and f_{i−1} about xi to obtain

    (f_{i+1} − f_{i−1})/(2∆x)
      = (1/(2∆x)) ( fi + fi′∆x + (1/2)fi′′∆x² + (1/6)f′′′(ξ1)∆x³
                    − (fi − fi′∆x + (1/2)fi′′∆x² − (1/6)f′′′(ξ2)∆x³) )
      = (1/(2∆x)) ( 2fi′∆x + (1/6)(f′′′(ξ1) + f′′′(ξ2))∆x³ )
      = fi′ + (1/12)(f′′′(ξ1) + f′′′(ξ2))∆x²

for some ξ1 ∈ [xi, x_{i+1}] and ξ2 ∈ [x_{i−1}, xi]. It follows that

    | fi′ − (f_{i+1} − f_{i−1})/(2∆x) | = (1/12)|f′′′(ξ1) + f′′′(ξ2)|∆x²
                                        ≤ (1/6) max_{ξ∈[x_{i−1}, x_{i+1}]} |f′′′(ξ)| ∆x²,

which is the desired relationship.

We can similarly approximate the second derivative using the finite difference method.

Definition 8.3 (second-derivative central difference formula). The second-derivative central dif-
ference formula is

    f′′(xi) ≈ (f_{i+1} − 2fi + f_{i−1}) / ∆x²,                               (8.2)

where fi ≡ f(xi) and ∆x = x_{i+1} − xi = xi − x_{i−1}.

The denominator scales with ∆x² for the second derivative (as opposed to ∆x for the first
derivative); perhaps this is not too surprising if we consider the definition of the derivative and
the limit. We now formally analyze the truncation error.

Proposition 8.4. Given a function f with a continuous fourth derivative,

    | fi′′ − (f_{i+1} − 2fi + f_{i−1})/∆x² | ≤ (1/12) max_{ξ∈[x_{i−1}, x_{i+1}]} |f⁽⁴⁾(ξ)| ∆x²,

where fi ≡ f(xi) and ∆x = x_{i+1} − xi = xi − x_{i−1}. The formula is second-order accurate.

Proof. We express f_{i+1}, −2fi, and f_{i−1} using Taylor series expansions:

    f_{i+1} = fi + fi′∆x + (1/2)fi′′∆x² + (1/6)fi′′′∆x³ + (1/24)f⁽⁴⁾(ξ1)∆x⁴,
    −2fi    = −2fi,
    f_{i−1} = fi − fi′∆x + (1/2)fi′′∆x² − (1/6)fi′′′∆x³ + (1/24)f⁽⁴⁾(ξ2)∆x⁴,

for some ξ1 ∈ [xi, x_{i+1}] and ξ2 ∈ [x_{i−1}, xi]. We sum the equations to obtain

    f_{i+1} − 2fi + f_{i−1} = fi′′∆x² + (1/24)(f⁽⁴⁾(ξ1) + f⁽⁴⁾(ξ2))∆x⁴;

note in particular that the odd-order terms cancel due to the symmetry. We finally note

    | fi′′ − (f_{i+1} − 2fi + f_{i−1})/∆x² | = (1/24)|f⁽⁴⁾(ξ1) + f⁽⁴⁾(ξ2)|∆x²
                                             ≤ (1/12) max_{s∈[x_{i−1}, x_{i+1}]} |f⁽⁴⁾(s)| ∆x²,

which is the desired relationship.

In general, it is possible to construct finite difference formulas for derivatives of arbitrary order.
In addition, it is possible to construct finite difference formulas that are accurate to an arbitrary
order, in which the error scales as ∆x^p for an arbitrarily large p. Higher-order and/or higher-
derivative approximations are achieved by including more points (xi, f(xi)) in the finite difference
formula and choosing the associated weights so that more terms of the Taylor series expansion
cancel. This can be achieved in a systematic manner using a Taylor table.
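The O(∆x²) behavior of both central difference formulas (8.1) and (8.2) is easy to observe numerically. The following sketch (illustrative; f(x) = sin x and the evaluation point x = 1 are arbitrary choices) estimates the observed order from one grid halving:

```python
import math

def d1_central(f, x, dx):
    # First-derivative central difference (8.1).
    return (f(x + dx) - f(x - dx)) / (2.0 * dx)

def d2_central(f, x, dx):
    # Second-derivative central difference (8.2).
    return (f(x + dx) - 2.0 * f(x) + f(x - dx)) / dx**2

f, x = math.sin, 1.0
exact1, exact2 = math.cos(x), -math.sin(x)

errs1 = [abs(d1_central(f, x, dx) - exact1) for dx in (0.1, 0.05)]
errs2 = [abs(d2_central(f, x, dx) - exact2) for dx in (0.1, 0.05)]
p1 = math.log2(errs1[0] / errs1[1])  # observed order of (8.1)
p2 = math.log2(errs2[0] / errs2[1])  # observed order of (8.2)
print(p1, p2)  # both close to 2
```

Halving ∆x reduces each error by about a factor of four, so the estimated orders p1 and p2 are both near 2, as Propositions 8.2 and 8.4 predict.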

8.5 Semi-discrete heat equation

We now apply finite difference formulas to approximate the heat equation. In particular, we wish
to approximate the value of the solution at nodes x1, . . . , xn at time t by

    ũi(t) ≈ u(xi, t),   i = 1, . . . , n,   t ∈ R>0.

Note that our goal is to approximate the solution u : Ω × I → R that depends on both space and
time by a set of n functions ũi : I → R, i = 1, . . . , n, associated with the n fixed grid points. To
this end, we start with the heat equation

    ∂u/∂t (x, t) − ∂²u/∂x² (x, t) = f(x, t),   x ∈ Ω, t ∈ I,

and approximate the Laplacian operator, which is the second derivative in space, using the central
difference formula (8.2) to obtain the heat equation discretized in space:

    dũi/dt (t) − (ũ_{i+1}(t) − 2ũi(t) + ũ_{i−1}(t))/∆x² = f̂i(t),   t ∈ I,    (8.3)

where f̂i(t) ≡ f(xi, t). We use f̂i(t) (instead of f̃i(t)) to denote the evaluation of f at (xi, t)
because this evaluation is exact at xi; this is unlike ũi(t), which in general will not equal u(xi, t).
    While (8.3) provides appropriate equations for the interior nodes i = 2, . . . , n − 1, we need to
incorporate the boundary conditions for i = 1 and n as follows:

1. Left Dirichlet boundary. On the left boundary, we wish to impose the Dirichlet boundary
   condition u(x = xL, t) = hL(t). Because the solution at i = 0 is "known", we set

       ũ_{i=0}(t) = hL(t).

   We substitute the expression into the difference equation (8.3) for i = 1 to obtain

       dũ1/dt (t) − (ũ2(t) − 2ũ1(t) + hL(t))/∆x² = f̂1(t),

   which, after some rearrangement, yields

       dũ1/dt (t) − (ũ2(t) − 2ũ1(t))/∆x² = f̂1(t) + hL(t)/∆x².

   Note that we have moved the "known" value to the right-hand side of the equation so that
   the left-hand side is linear in ũ.

2. Right Neumann boundary. On the right boundary, we wish to impose the Neumann boundary
   condition ∂u/∂x (x = xR, t) = hR(t). We apply the central finite difference formula for the
   first derivative at i = n to obtain

       (ũ_{n+1}(t) − ũ_{n−1}(t))/(2∆x) = hR(t)   ⇒   ũ_{n+1}(t) = ũ_{n−1}(t) + 2∆x hR(t).

   We substitute the relationship into the difference equation (8.3) for i = n to obtain

       dũn/dt (t) − ((ũ_{n−1}(t) + 2∆x hR(t)) − 2ũn(t) + ũ_{n−1}(t))/∆x² = f̂n(t),   t ∈ R>0,

   which, after some rearrangement, yields

       dũn/dt (t) − (−2ũn(t) + 2ũ_{n−1}(t))/∆x² = f̂n(t) + 2hR(t)/∆x.

The finite difference approximation, which incorporates the boundary conditions, is hence given by

    dũ1/dt (t) − (ũ2(t) − 2ũ1(t))/∆x² = f̂1(t) + hL(t)/∆x²,
    dũi/dt (t) − (ũ_{i+1}(t) − 2ũi(t) + ũ_{i−1}(t))/∆x² = f̂i(t),   i = 2, . . . , n − 1,   (8.4)
    dũn/dt (t) − (−2ũn(t) + 2ũ_{n−1}(t))/∆x² = f̂n(t) + 2hR(t)/∆x,
    ũi(t = 0) = ĝi,   i = 1, . . . , n,

where ĝi ≡ g(xi), i = 1, . . . , n. We make a few observations:

1. The system (8.4) is called a semi-discrete form of the heat equation (with appropriate bound-
   ary conditions) because the equation is discretized in space but not in time.

2. We have turned the original PDE-based initial-boundary value problem into an initial value
   problem based on a system of n ODEs, where the unknowns are the approximations of
   u(x1, ·), . . . , u(xn, ·) (which excludes the left boundary, where the solution is known).

3. Assuming the solution u is sufficiently regular, this approximation is second-order accurate
   in space; i.e.,

       |u(xi, t) − ũi(t)| ≤ C∆x²,   i = 1, . . . , n,   t ∈ R>0,

   for some constant C < ∞ independent of ∆x. Decreasing the grid spacing ∆x by a factor of
   two results in the error decreasing by a factor of four.

The system of ODEs can also be written in matrix form. We first concatenate ũ1(t), . . . , ũn(t)
into a vector ũ(t) ∈ Rⁿ whose i-th entry is ũi(t). We then seek ũ : R>0 → Rⁿ such that

    dũ/dt (t) + Âũ(t) = f̂(t) + ĥ(t)   in Rⁿ,  t ∈ R>0,                       (8.5)
    ũ(t = 0) = ĝ   in Rⁿ,

where Â ∈ Rⁿ×ⁿ, f̂(t) ∈ Rⁿ, ĥ(t) ∈ Rⁿ, and ĝ ∈ Rⁿ are given by

                   ⎡  2  −1                 ⎤
                   ⎢ −1   2  −1             ⎥
    Â = (1/∆x²) ·  ⎢      ⋱   ⋱   ⋱        ⎥ ,
                   ⎢         −1   2  −1     ⎥
                   ⎣             −2   2     ⎦

    f̂(t) = (f̂1(t), f̂2(t), . . . , f̂_{n−1}(t), f̂n(t))ᵀ,
    ĥ(t) = (hL(t)/∆x², 0, . . . , 0, 2hR(t)/∆x)ᵀ,
    ĝ = (ĝ1, ĝ2, . . . , ĝ_{n−1}, ĝn)ᵀ.

The matrix form of the equation allows us to (i) more compactly express our finite difference
approximation and (ii) efficiently implement our approximation in a computer program using a
language such as Matlab.
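The assembly of the semi-discrete system can be sketched in Python as well; the following illustration (our translation, using scipy.sparse in place of Matlab's sparse command) builds the tridiagonal matrix Â and the boundary vector ĥ for the Dirichlet-Neumann configuration above:

```python
import numpy as np
import scipy.sparse as sp

def assemble(n, dx, hL, hR):
    """Assemble A_hat and h_hat of the semi-discrete system
    du/dt + A u = f + h (Dirichlet at the left node, Neumann at the right)."""
    main = 2.0 * np.ones(n)
    upper = -np.ones(n - 1)
    lower = -np.ones(n - 1)
    lower[-1] = -2.0            # last row (-2, 2) from the Neumann ghost point
    A = sp.diags([lower, main, upper], [-1, 0, 1], format="csc") / dx**2
    h = np.zeros(n)
    h[0] = hL / dx**2           # Dirichlet data moved to the right-hand side
    h[-1] = 2.0 * hR / dx       # Neumann data
    return A, h

n, dx = 5, 0.2
A, h = assemble(n, dx, hL=1.0, hR=0.0)
print(A.toarray() * dx**2)      # tridiag(-1, 2, -1) with last row (-2, 2)
```

Storing Â in a sparse format keeps the memory cost at O(n), consistent with the observations in Section 8.8 below.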

8.6 Temporal discretization

We next wish to apply a time discretization to the semi-discrete form (8.4). There are many
time-marching methods, but in this lecture we consider the Crank-Nicolson method. To this end,
we first introduce an initial value problem associated with a system of n ODEs: find ũ : I → Rⁿ
such that

    dũ/dt (t) = F(ũ(t), t)   in Rⁿ,  t ∈ I,
    ũ(t = 0) = ĝ             in Rⁿ,

where F : Rⁿ × I → Rⁿ describes the dynamics of the system, and ĝ ∈ Rⁿ is the initial condition.
We now wish to find a sequence of approximations ũ⁰, . . . , ũᴶ such that

    ũ(tj) ≈ ũ^j,   j = 0, . . . , J.

The Crank-Nicolson approximation is given by

    (ũ^j − ũ^{j−1})/∆t = (1/2)( F(ũ^j, tj) + F(ũ^{j−1}, t_{j−1}) )   in Rⁿ,  j = 1, . . . , J,
    ũ^{j=0} = ĝ   in Rⁿ.

As regards the accuracy of the Crank-Nicolson method, we have the following result:

Proposition 8.5. Suppose dũ/dt = F(ũ, t) in Rⁿ, where (i) F is Lipschitz continuous in ũ and
(ii) F is continuous in t. Then the Crank-Nicolson approximation of ũ(tj), ũ^j, satisfies

    |ũ(tj) − ũ^j| ≤ C max_{s∈[0,tj]} |ũ′′′(s)| ∆t²,

where C is independent of ∆t.

We make two observations:

1. Convergence: |ũ(tj) − ũ^j| → 0 as ∆t → 0 for any tj, and hence the scheme is convergent;

2. Order of accuracy: |ũ(tj) − ũ^j| ≤ C∆t², and hence the scheme is second-order accurate.
   Decreasing the time step ∆t by a factor of two results in the error decreasing by a factor of
   four.
Remark 8.6. (This discussion is beyond the scope of this course, but we make one remark regarding
the stability of time-marching schemes.) The choice of the time-marching scheme is typically
governed by three factors: the order of accuracy; the cost; and the stability of the scheme. The
stability of the scheme dictates the maximum time step ∆t that can be used for the scheme to remain
numerically stable (i.e., the solution to not “blow up” in time). The Crank-Nicolson method has
a nice stability property and remains stable for any choice of ∆t, assuming the underlying PDE is
stable (which is the case for the heat equation). We just note that the order of accuracy is not the
only consideration in choosing a time-marching scheme.
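The second-order accuracy of Crank-Nicolson is easy to observe on a scalar test problem. The sketch below (illustrative; the test equation du/dt = −u with u(0) = 1, i.e. F(u, t) = −u, is our choice) exploits the fact that for a linear F the implicit update can be solved in closed form, and estimates the observed order by halving ∆t:

```python
import math

def crank_nicolson_decay(T, dt):
    # du/dt = -u, u(0) = 1.  The Crank-Nicolson update
    # (u^j - u^{j-1})/dt = -(u^j + u^{j-1})/2 solves in closed form to
    # u^j = u^{j-1} (1 - dt/2) / (1 + dt/2).
    r = (1.0 - 0.5 * dt) / (1.0 + 0.5 * dt)
    u = 1.0
    for _ in range(round(T / dt)):
        u *= r
    return u

T = 1.0
exact = math.exp(-T)
e_coarse = abs(crank_nicolson_decay(T, 0.1) - exact)
e_fine = abs(crank_nicolson_decay(T, 0.05) - exact)
p = math.log2(e_coarse / e_fine)
print(p)  # close to 2
```

Halving ∆t reduces the error at T by about a factor of four, so the observed order p is near 2, consistent with Proposition 8.5.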

8.7 Fully discrete equation

We now apply the Crank-Nicolson method to the semi-discrete form of the heat equation to
obtain a fully discrete equation. We denote the finite difference approximation of the state at
position xi and time tj by

    ũ_i^j ≈ u(xi, tj),   i = 1, . . . , n,   j = 1, . . . , J.

The application of the Crank-Nicolson formula to (8.4) yields

    (ũ_1^j − ũ_1^{j−1})/∆t = (1/2) Σ_{k∈{j−1,j}} ( (ũ_2^k − 2ũ_1^k)/∆x² + f̂_1^k + hL^k/∆x² ),
                                                            j = 1, . . . , J,
    (ũ_i^j − ũ_i^{j−1})/∆t = (1/2) Σ_{k∈{j−1,j}} ( (ũ_{i+1}^k − 2ũ_i^k + ũ_{i−1}^k)/∆x² + f̂_i^k ),
                                                            i = 2, . . . , n − 1,  j = 1, . . . , J,   (8.6)
    (ũ_n^j − ũ_n^{j−1})/∆t = (1/2) Σ_{k∈{j−1,j}} ( (−2ũ_n^k + 2ũ_{n−1}^k)/∆x² + f̂_n^k + 2hR^k/∆x ),
                                                            j = 1, . . . , J,
    ũ_i^{j=0} = ĝi,   i = 1, . . . , n,

where f̂_i^j ≡ f(xi, tj), hL^j ≡ hL(tj), and hR^j ≡ hR(tj). We can also write the formulation in a
more compact matrix-vector form:

    (ũ^j − ũ^{j−1})/∆t = (1/2)( −Âũ^j + f̂^j + ĥ^j − Âũ^{j−1} + f̂^{j−1} + ĥ^{j−1} )   in Rⁿ,
                                                            j = 1, . . . , J,
    ũ^{j=0} = ĝ   in Rⁿ,

where Â ∈ Rⁿ×ⁿ, f̂^j ≡ f̂(tj) ∈ Rⁿ, and ĥ^j ≡ ĥ(tj) ∈ Rⁿ; the expressions for f̂(t) and ĥ(t) are
provided in the semi-discrete formulation (8.5). We make a few observations:
1. The system (8.6) is a fully discrete form of the heat equation because it is discretized in both
   space and time.

2. Assuming the solution u is sufficiently regular, the approximation is second-order accurate
   in both space and time; i.e.,

       |u(xi, tj) − ũ_i^j| ≤ C∆x² + D∆t²,   i = 1, . . . , n,   j = 1, . . . , J,

   for some constants C < ∞ and D < ∞. Decreasing both ∆x and ∆t by a factor of two results
   in the error decreasing by a factor of four.

3. Because the error is a sum of the spatial and temporal errors, it is insufficient to decrease
   only ∆x or ∆t. Both ∆x and ∆t must be decreased to control the error.

8.8 Solution of the fully discrete equation

We can solve the fully discrete equation by marching in time as follows:

0. Initialize the state to ũ^{j=0} = ĝ.

1. Update the state by solving the linear system

       (I + (1/2)∆tÂ) ũ^j = (I − (1/2)∆tÂ) ũ^{j−1} + (1/2)∆t (f̂^j + ĥ^j + f̂^{j−1} + ĥ^{j−1})   (8.7)

   for ũ^j ∈ Rⁿ, starting with j = 1 and time-marching to j = J.

We make a few observations:

1. The matrix I + (1/2)∆tÂ ∈ Rⁿ×ⁿ is nonsingular for all ∆t, and hence the linear system has a
   unique solution.

2. The matrices I + (1/2)∆tÂ ∈ Rⁿ×ⁿ and I − (1/2)∆tÂ ∈ Rⁿ×ⁿ are tridiagonal: they have
   nonzero entries only on the diagonal, superdiagonal, and subdiagonal. The matrices can be
   efficiently stored in O(n) space by storing only the indices and values of the nonzero entries;
   this is unlike a general dense n × n matrix, which requires O(n²) space to store.

3. The tridiagonal system can be solved efficiently in O(n) operations using the Thomas algo-
   rithm, which is a specialization of LU factorization (or Gaussian elimination) for a tridiagonal
   matrix. (A tridiagonal matrix is a special case of a more general sparse matrix, which is a
   matrix whose entries are mostly zero.)

4. While the matrix I + (1/2)∆tÂ is sparse, its inverse (I + (1/2)∆tÂ)⁻¹ is a dense matrix whose
   entries are nonzero everywhere. Computing (I + (1/2)∆tÂ)⁻¹ requires O(n³) operations, and
   storing the matrix requires O(n²) space. Hence, the matrix (I + (1/2)∆tÂ)⁻¹ should never be
   explicitly formed.

5. More pragmatically, to solve (8.7) using Matlab, we first store I and Â as sparse matrices
   using the sparse command. We then call

       ũ^j ← (I + (1/2)∆tÂ) \ ((I − (1/2)∆tÂ) ũ^{j−1} + (1/2)∆t (f̂^j + ĥ^j + f̂^{j−1} + ĥ^{j−1})),

   where "\" is the backslash operator, which automatically solves the system using the Thomas
   algorithm for tridiagonal systems. We should never use the inv function, which is very
   inefficient.
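The complete marching loop can be sketched as follows. This is an illustrative Python translation of (8.7) (using scipy's sparse factorization in place of Matlab's backslash); the test data f = 0, hL = hR = 0, g(x) = sin(πx/2) are our choice, selected because the exact solution sin(πx/2) exp(−(π/2)²t) satisfies both boundary conditions:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Crank-Nicolson marching for u_t - u_xx = 0 on (0, 1) with u(0, t) = 0
# (Dirichlet), u_x(1, t) = 0 (Neumann), u(x, 0) = sin(pi x / 2).
# Since f = hL = hR = 0, the load vectors in (8.7) vanish.
n, J, tf = 64, 64, 1.0
dx, dt = 1.0 / n, tf / J
x = dx * np.arange(1, n + 1)    # unknowns: interior + right-boundary nodes

main = 2.0 * np.ones(n)
upper = -np.ones(n - 1)
lower = -np.ones(n - 1)
lower[-1] = -2.0                # Neumann closure in the last row
A = sp.diags([lower, main, upper], [-1, 0, 1], format="csc") / dx**2

I = sp.identity(n, format="csc")
LHS = (I + 0.5 * dt * A).tocsc()
RHS = (I - 0.5 * dt * A).tocsc()
solve = spla.factorized(LHS)    # factor once, reuse at every time step

u = np.sin(0.5 * np.pi * x)     # initial condition g
for _ in range(J):
    u = solve(RHS @ u)

exact = np.sin(0.5 * np.pi * x) * np.exp(-0.25 * np.pi**2 * tf)
err = float(np.max(np.abs(u - exact)))
print(err)  # small: second order in both dx and dt
```

Factoring the left-hand-side matrix once outside the loop mirrors observation 3: the cost per time step is then O(n) back-substitutions rather than a fresh factorization.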

8.9 Example
We now consider the approximation of the heat equation on (0, 1) × (0, 1):

    ∂u/∂t (x, t) − ∂²u/∂x² (x, t) = cos(t/2) exp(x²),   ∀x ∈ (0, 1), t ∈ (0, 1],     (8.8)
    u(x = 0, t) = 0,                                     ∀t ∈ (0, 1],
    ∂u/∂x (x = 1, t) = 0,                                ∀t ∈ (0, 1],
    u(x, t = 0) = (1/2) sin(3πx/2),                      ∀x ∈ (0, 1).

Figure 8.2 shows the finite difference solution for two different grid spacings ∆x and time steps
∆t. We note that the arbitrary initial condition and the time-dependent source function can be
handled without any difficulty using the numerical method. We also qualitatively observe that
the approximation gets finer as ∆x decreases.
    Table 8.1 shows the maximum error

    ‖u − ũ‖∞ ≡ max_{i=1,...,n; j=1,...,J} |u(xi, tj) − ũ_i^j|

as a function of ∆x and ∆t. We make some observations:


1. The solution accuracy can be limited by either the lack of the spatial resolution or the lack
   of the temporal resolution. We must decrease both ∆x and ∆t to control the error.

2. If ∆t is sufficiently small such that the accuracy is limited by the spatial resolution — i.e.,
   ∆t = 1/256, which is the last column of the table — the error scales as O(∆x²), which is
   consistent with the theory for the second-order accurate finite difference method.

3. If ∆x is sufficiently small such that the accuracy is limited by the temporal resolution — i.e.,
   ∆x = 1/64, which is the last row of the table — the error scales as O(∆t²), which is consistent
   with the theory for the second-order accurate Crank-Nicolson method.

4. To minimize the computational cost (i.e., the number of grid points or time steps) required
   to achieve a given accuracy, we wish to choose a combination of ∆x and ∆t such that the
   spatial and temporal errors are balanced and neither one is dominating the other.
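The observed convergence orders can be computed directly from the entries of Table 8.1; the short script below (illustrative) uses the last column, where the temporal error is negligible, and the first three entries of the last row, where the spatial error is negligible:

```python
import math

# Last column of Table 8.1 (dt = 1/256): errors for dx = 1/8, 1/16, 1/32.
e_dx = [7.52e-3, 1.89e-3, 4.73e-4]
orders_x = [math.log2(e_dx[k] / e_dx[k + 1]) for k in range(2)]

# Last row of Table 8.1 (dx = 1/64): errors for dt = 1/32, 1/64, 1/128.
e_dt = [8.47e-3, 1.97e-3, 4.48e-4]
orders_t = [math.log2(e_dt[k] / e_dt[k + 1]) for k in range(2)]

print(orders_x)  # each close to 2: O(dx^2)
print(orders_t)  # each close to 2: O(dt^2)
```

Note that the final entry of the last row (2.09 × 10⁻⁴ at ∆t = 1/256) is excluded: there the error has begun to hit the spatial error floor, illustrating observation 1.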

Figure 8.2: Approximation of the heat equation problem (8.8): (a) ∆x = 1/8, ∆t = 1/32;
(b) ∆x = 1/32, ∆t = 1/128.

                 ∆t = 1/32     ∆t = 1/64     ∆t = 1/128    ∆t = 1/256
    ∆x = 1/8    7.61 × 10⁻³   7.54 × 10⁻³   7.53 × 10⁻³   7.52 × 10⁻³
    ∆x = 1/16   7.75 × 10⁻³   1.91 × 10⁻³   1.89 × 10⁻³   1.89 × 10⁻³
    ∆x = 1/32   8.33 × 10⁻³   1.84 × 10⁻³   4.77 × 10⁻⁴   4.73 × 10⁻⁴
    ∆x = 1/64   8.47 × 10⁻³   1.97 × 10⁻³   4.48 × 10⁻⁴   2.09 × 10⁻⁴

Table 8.1: Maximum error for the heat equation problem (8.8) as a function of the grid spacing
∆x and time step ∆t.

8.10 Summary
We summarize key points of this lecture:

1. The finite difference method provides an approximation of the solution to the heat equation
   on a set of grid points in space and instances in time.

2. Given a function f, its derivative evaluated at a grid point xi can be approximated as a
   linear combination of f evaluated at a set of grid points. The order of accuracy of the finite
   difference formula can be analyzed using Taylor series. The number of required grid points
   depends on the order of the derivative and the order of accuracy.

3. The semi-discrete form of a time-dependent PDE is a system of ODEs (initial value problems)
   that has been discretized in space but remains continuous in time.

4. The semi-discrete equation can be discretized in time using a time-marching method (e.g.,
   the Crank-Nicolson method), which yields a fully discrete equation.

5. The fully discrete equation can be solved by starting with t = 0 and then marching forward
   in time. If an implicit method such as Crank-Nicolson is used for the time discretization, we
   must solve a linear system at each time instance; the linear system can be solved efficiently
   because the matrix is sparse.
Part III

Laplace’s equation

Lecture 9

Laplace’s equation: introduction

©2019–2021 Masayuki Yano. Prepared for ESC384 Partial Differential Equations taught at the
University of Toronto.

9.1 Introduction
In this lecture we introduce and study Laplace’s equation

−∆u = 0 in Ω ⊂ Rn

and the closely related Poisson’s equation

−∆u = f in Ω ⊂ Rn

for some f : Ω → R. Both equations are ubiquitous in science and engineering, and they are an
archetype of elliptic PDEs. Following the approach used for the heat equation, we will first derive a
number of properties associated with Laplace’s and Poisson’s equations without explicitly seeking
the solution.
Classification. We recall the classification system introduced in Lecture 1. We immediately ob-
serve that Laplace's and Poisson's equations are second-order linear equations. Laplace's equation
is homogeneous, while Poisson's equation is nonhomogeneous (for f ≠ 0). Both equations are also
elliptic. To see this, for simplicity we consider n = 2 and observe that the matrix associated with
the second-order operator −∆ = −∂²/∂x₁² − ∂²/∂x₂² is

    A = ⎡ −1   0 ⎤
        ⎣  0  −1 ⎦ ;

the associated eigenvalues are λ1,2 = −1. Because the eigenvalues have the same sign, the equation
is elliptic. This argument would hold for an arbitrary n because the A matrix is A = −I_{n×n}.

9.2 Laplace’s and Poisson’s equations in science and engineering


We provide a few applications of Laplace’s and Poisson’s equations in science and engineering.

• Steady heat transfer. The PDE that describes the steady temperature distribution u is ob-
tained by setting the time derivative term ∂u/∂t of the heat equation to 0. The resulting equation
is Laplace's equation −∆u = 0 if there is no (volume) heat source and Poisson's equation
−∆u = f if there is a source term f . We may obtain a boundary value problem by augmenting
the PDE with some combination of the boundary conditions discussed in Section 5.3.

• Steady diffusion. The steady-state concentration of chemical species dispersed in a medium is
described by Laplace's equation. The species concentration u : Ω → R satisfies Laplace's
equation −∆u = 0 if there is no volume chemical source and Poisson's equation −∆u = f if
there is a source term.

• Potential flow. Steady, inviscid, incompressible, and irrotational flows can be modeled using
the potential flow theory. The velocity potential φ : Ω → R satisfies Laplace’s equation
−∆φ = 0. The velocity field is then given by v = ∇φ.

• Electrostatics. The electric field in a linear, isotropic, and homogeneous material is described
by Poisson's equation. The electric potential φ : Ω → R satisfies Poisson's equation −∆φ =
ρ/ε, where ρ is the volumetric charge density and ε is the permittivity of the medium. The
electric field is then given by E = −∇φ.

Laplace’s equation is also ubiquitous in mathematics, and its solution bears a special name.

Definition 9.1 (harmonic function). Let Ω ⊂ Rn. A function u : Ω → R that satisfies Laplace's
equation −∆u = 0 in Ω is called a harmonic function.

Before we proceed with the solution of Laplace's and Poisson's equations, we make one remark
about our choice of sign. Laplace's and Poisson's equations are sometimes written without the
negative sign in front of the differential operator, i.e., ∆u = 0 instead of −∆u = 0. Throughout
this course we will use the convention with the negative sign in front.

9.3 Mean-value formulas


Harmonic functions have a number of special properties. The first property is expressed in mean
value formulas. (All proofs in this section and following sections should be considered optional.)

Definition 9.2 (ball in Rn ). A ball of radius r and centered at x0 is an open subset of Rn given
by B(x0 , r) ≡ {x ∈ Rn | kx − x0 k2 < r}, where k · k2 is the Euclidean norm in Rn . Its boundary
is ∂B(x0 , r) = {x ∈ Rn | kx − x0 k2 = r}. The volume of the ball is denoted by |B(x0 , r)|, and the
surface area of the ball is denoted by |∂B(x0 , r)|.

Example 9.3. In two dimensions, B(x0 , r) is a disk, and ∂B(x0 , r) is a circle. Its “volume” (i.e.,
area in R2 ) is |B(x0 , r)| = πr2 , and its “surface area” (i.e., arc length in R2 ) is |∂B(x0 , r)| = 2πr.

Theorem 9.4 (mean-value formula for harmonic functions). If u is harmonic in Ω, then for any
ball B(x0, r) ⊂ Ω and the associated boundary ∂B(x0, r) we have

    u(x0) = (1/|B(x0, r)|) ∫_{B(x0,r)} u dx = (1/|∂B(x0, r)|) ∫_{∂B(x0,r)} u ds;

that is, u(x0) equals both the average of u over the volume and the average of u over the surface.

Proof. We prove the formula in R2 ; the proof readily extends to higher dimensions. We first prove
the relationship for the average over the circumference. Without loss of generality, we choose the
coordinate system centered about the ball and introduce a function

    φ(r) = (1/(2πr)) ∫_{∂B(0,r)} u(y) ds(y) = (1/(2πr)) ∫_{θ=0}^{2π} u(r, θ) r dθ = (1/(2π)) ∫_{θ=0}^{2π} u(r, θ) dθ.

We then differentiate the function to obtain

    φ′(r) = (1/(2π)) ∫_{θ=0}^{2π} ∂u/∂r (r, θ) dθ = (1/(2πr)) ∫_{θ=0}^{2π} ∂u/∂r (r, θ) r dθ.

We now (i) recognize that the integration is over the circle ∂B(0, r), (ii) recognize ∂u/∂r = ν · ∇u for ν
the outward pointing unit normal on the circle, and (iii) invoke the divergence theorem to obtain

    φ′(r) = (1/(2πr)) ∫_{∂B(0,r)} ν · ∇u ds = (1/(2πr)) ∫_{B(0,r)} ∆u dx = 0,

where the last equality follows from ∆u = 0. Hence φ is a constant function and we can equate its
value for any r to the limit of r going to 0, lim_{r→0} φ(r):

    φ(r) = lim_{a→0} φ(a) = lim_{a→0} (1/(2πa)) ∫_{∂B(0,a)} u(y) ds(y) = u(x = 0),

which is the first relationship. (Note that we use a dummy variable a for the limit lim_{a→0} φ(a) to
avoid conflation with r, at which φ is evaluated.)
To prove the second relationship we again use polar coordinates to obtain

    ∫_{B(0,r)} u(x) dx = ∫_{z=0}^{r} ( ∫_{∂B(0,z)} u ds ) dz = ∫_{z=0}^{r} u(0) 2πz dz = u(0) πr²,

which is the second relationship.

Theorem 9.5 (converse to the mean-value theorem). If u ∈ C²(Ω) satisfies

    u(x0) = (1/(2πr)) ∫_{∂B(x0,r)} u(s) ds

for all balls B(x0, r) ⊂ Ω, then u is harmonic.

Proof. Proof is beyond the scope of this course. See, for example, Partial Differential Equations
by Evans.

Example 9.6 (mean-value formulas in R1 ). Let u : R → R be a harmonic function (i.e., u satisfies
−u′′ = 0 in R). We introduce a ball B(x0 = 1, r = 2) = (−1, 3). (The ball is just an open line
segment in R1 .) Suppose u(x = −1) = 1 and u(x = 3) = 2. Then by the mean-value formula, the
value at the center of the ball is given by the ball surface average

    u(1) = (1/|∂B(1, 2)|) ∫_{∂B(1,2)} u ds = (1/2)(u(−1) + u(3)) = 3/2.

In addition, the average of u over the line segment (−1, 3) is

    (1/|B(x0, r)|) ∫_{B(x0,r)} u dx = (1/4) ∫_{−1}^{3} u dx = 3/2.

Note that we arrive at these conclusions without knowing the precise form of the solution.
(The explicit form of the solution is u(x) = x/4 + 5/4, which yields results consistent with the
above. Harmonic functions in R1 are linear because u′′ = 0.)

Example 9.7 (mean-value formulas in R2 ). Let u : R2 → R be a harmonic function. We introduce
a ball B(0, 2) ⊂ R2 . Suppose along the surface of the ball (i.e., the circle) the function is given by

    u(r = 2, θ) = 1 + cos(3θ)

in the polar coordinates. Then, by the mean-value formula for the surface, the value at the center
of the ball is the average over the circle:

    u(0) = (1/|∂B(0, 2)|) ∫_{∂B(0,2)} u ds = (1/(2π · 2)) ∫_{θ=0}^{2π} (1 + cos(3θ)) 2 dθ = 1.

Similarly, by the mean-value formula for the volume, the average over the disk is

    (1/|B(0, 2)|) ∫_{B(0,2)} u dx = u(0) = 1.

We arrive at these results without knowing the explicit form of the solution.
(We will find an explicit form of the solution using separation of variables in Example 10.2. The
solution, as expected, satisfies the above mean-value properties.)
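The mean-value formulas are also easy to verify numerically. The sketch below (an illustration, not part of the notes) checks both averages for the harmonic function u(x, y) = x² − y²; the ball center and radius are arbitrary choices.

```python
import numpy as np

# Numerical check of the mean-value formulas for the harmonic function
# u(x, y) = x^2 - y^2; the ball center and radius are arbitrary choices.
u = lambda x, y: x**2 - y**2
x0, y0, r = 0.7, -0.3, 1.5
center = u(x0, y0)

# Average over the surface (circle): equispaced quadrature in theta
theta = 2 * np.pi * np.arange(2000) / 2000
surf_avg = np.mean(u(x0 + r * np.cos(theta), y0 + r * np.sin(theta)))

# Average over the volume (disk): midpoint rule in rho of u * rho drho dtheta
rho = (np.arange(400) + 0.5) * (r / 400)
R, T = np.meshgrid(rho, theta)
integrand = u(x0 + R * np.cos(T), y0 + R * np.sin(T)) * R
vol_avg = integrand.sum() * (r / 400) * (2 * np.pi / 2000) / (np.pi * r**2)
```

Both averages reproduce the center value u(x0, y0) to quadrature accuracy, independent of the ball chosen, as the theorem predicts.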

9.4 Maximum principles


Similarly to the heat equation, Laplace’s equation also satisfies certain maximum and minimum
principles:

Theorem 9.8 ((weak) maximum principle). Let u be a harmonic function in a connected open
domain Ω ⊂ R2 . Then

    max_{Ω̄} u = max_{∂Ω} u;

the function attains its maximum value on the boundary of the domain.

Proof. Suppose x⋆ ∈ Ω̄ is the maximizer so that u(x⋆ ) = max_{Ω̄} u ≡ M . We have two cases:
(i) x⋆ is an interior point, x⋆ ∈ Ω; (ii) x⋆ is on the boundary, x⋆ ∈ ∂Ω. If x⋆ ∈ Ω, then by
the strong maximum principle u is a constant function, and hence the maximum is attained
everywhere, including on the boundary ∂Ω. If x⋆ ∈ ∂Ω, the maximum is attained on the boundary
by assumption, and the statement holds trivially.

Theorem 9.9 (strong maximum principle). Let u be a harmonic function in a connected open
domain Ω ⊂ Rn . If there exists a point x⋆ ∈ Ω such that

    u(x⋆ ) = max_{Ω̄} u,

then u is constant in Ω. (Note that x⋆ must lie in the open set Ω, which is strictly interior to Ω̄.)

Proof. Suppose x⋆ ∈ Ω is the maximizer so that u(x⋆ ) = max_{Ω̄} u ≡ M . Then, by the mean-value
formula for a ball B(x⋆ , a) ⊂ Ω,

    M = u(x⋆ ) = (1/|B(x⋆ , a)|) ∫_{B(x⋆,a)} u(x) dx ≤ max_{x∈B(x⋆,a)} u(x) ≤ M.

The last inequality holds as equality only if u(x) = M for all x ∈ B(x⋆ , a), and hence we conclude
that u must be constant (and equal to M ) in B(x⋆ , a). We then repeat the same argument for a
new ball with center x⋆′ and radius a′ that lies in B(x⋆ , a), from which we conclude u(x) = M
for all x ∈ B(x⋆ , a) ∪ B(x⋆′ , a′ ). We repeat the same argument until we cover the whole domain
and conclude that u(x) = M for all x ∈ Ω̄.

Theorem 9.10 (weak and strong minimum principles). Let u be a harmonic function in a connected
open domain Ω ⊂ R2 . Then

    min_{Ω̄} u = min_{∂Ω} u.

In addition, if there exists a point x⋆ ∈ Ω such that

    u(x⋆ ) = min_{Ω̄} u,

then u is constant in Ω.

Proof. We apply the weak and strong maximum principles, respectively, to −u. (See the analogous
argument for the maximum/minimum principles for the heat equation for a complete proof.)

We make a few observations:

1. The weak maximum (resp. minimum) principle states that, if u : Ω̄ → R is a harmonic
function, then it attains a maximum (resp. minimum) on the boundary ∂Ω. In other words,
for any x ∈ Ω̄ (including the boundary),

    min_{∂Ω} u ≤ u(x) ≤ max_{∂Ω} u.

2. The strong maximum (resp. minimum) principle states that, if u : Ω̄ → R is a harmonic
function, then the only way for it to attain a maximum (resp. minimum) in Ω — i.e., strictly
inside the domain and not on the boundary — is if u is a constant function. Conversely,
if u is a non-constant function, then the function values in Ω are strictly less than (resp.
greater than) the maximum (resp. minimum) value on the boundary. In other words, for a
non-constant harmonic function u and for any x ∈ Ω (strictly inside),

    min_{∂Ω} u < u(x) < max_{∂Ω} u.

We now provide an example:
Example 9.11 (maximum principle in R2 ). Suppose u : R2 → R is a harmonic function. We
introduce a ball Ω = B(0, 2) ⊂ R2 . Suppose along the surface of the ball (i.e., the circle) the
function is given by

    u(r = 2, θ) = 2 + cos(3θ)²

in the polar coordinates. Then by the weak maximum and minimum principles, for any x ∈ Ω̄
(including the boundary),

    2 = min_{∂Ω} u ≤ u(x) ≤ max_{∂Ω} u = 3.

In addition, because u is a non-constant function, by the strong maximum and minimum principles,
for any x ∈ Ω (strictly inside),

    2 < u(x) < 3.
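As a numerical sanity check of this example (an illustration, not part of the notes): using cos(3θ)² = (1 + cos(6θ))/2, the harmonic extension of this boundary data is u(r, θ) = 2.5 + 0.5 (r/2)⁶ cos(6θ) — a closed form that anticipates the disk series of the next lecture, so it is an assumption here, verified only by sampling.

```python
import numpy as np

# Assumed harmonic extension of g = 2 + cos(3t)^2 on the disk of radius 2
# (anticipating the disk series of Lecture 10): since
# cos(3t)^2 = (1 + cos(6t))/2, u(r, t) = 2.5 + 0.5*(r/2)**6 * cos(6t).
u = lambda r, t: 2.5 + 0.5 * (r / 2) ** 6 * np.cos(6 * t)

t = np.linspace(-np.pi, np.pi, 721)      # 0.5-degree angular sampling
bnd = u(2.0, t)                          # boundary values, i.e., g(t)
r_in = np.linspace(0.0, 1.99, 200)       # radii strictly inside the disk
R, T = np.meshgrid(r_in, t)
interior = u(R, T)                       # interior samples
```

The boundary samples attain the extremes 2 and 3, while every interior sample lies strictly between them, consistent with the weak and strong principles.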
We conclude the section with a positivity result:
Corollary 9.12 (positivity). Let u be the solution to

    −∆u = 0 in Ω,
    u = g on ∂Ω.

If g ≥ 0, then u ≥ 0 everywhere in Ω̄. In addition, if Ω is connected and g(x) > 0 for some x ∈ ∂Ω,
then u > 0 everywhere in Ω.

Proof. Non-negativity is a consequence of the (weak) minimum principle: if g ≥ 0 on ∂Ω, then for
any x ∈ Ω̄, u(x) ≥ min_{x∈∂Ω} g(x) ≥ 0.
Positivity is a consequence of the strong minimum principle. Suppose g ≥ 0 and g(x) > 0 for
some x ∈ ∂Ω, and suppose there exists x⋆ ∈ Ω such that u(x⋆ ) = 0 = min_{Ω̄} u. Then, by the strong
minimum principle, u must be a constant function, u = 0. However, this contradicts the fact that
g(x) > 0 for some x ∈ ∂Ω. Hence u must be positive everywhere in Ω.

9.5 Uniqueness and uniform stability by maximum principle


We now assess two of the three conditions of well-posedness: uniqueness and stability. We appeal
to the maximum principle to show Dirichlet boundary value problems associated with Laplace’s
equation satisfy both conditions. The approach is analogous to our proof of uniqueness and stability
for the heat equation by its maximum principle.
Theorem 9.13 (uniqueness of the solution to Laplace's equation). Let g ∈ C(∂Ω). There exists
at most one solution u ∈ C²(Ω) ∩ C(Ω̄) to the boundary value problem

    −∆u = 0 in Ω,
    u = g on ∂Ω.

Proof. If both u1 ∈ C²(Ω) ∩ C(Ω̄) and u2 ∈ C²(Ω) ∩ C(Ω̄) are solutions to the boundary value
problem, then u1 − u2 ∈ C²(Ω) ∩ C(Ω̄) and by linearity

    −∆(u1 − u2 ) = 0 in Ω,
    (u1 − u2 ) = 0 on ∂Ω.

By the maximum and minimum principles, u1 − u2 = 0 everywhere in Ω̄ and hence u1 = u2 .

Theorem 9.14 (stability of the solution to Laplace's equation). Let u1 and u2 be the solutions to
the boundary value problems associated with boundary data g1 and g2 , respectively:

    −∆ui = 0 in Ω,
    ui = gi on ∂Ω

for i = 1, 2. Then the solution depends continuously on the data in the sense that

    max_{x∈Ω̄} |u1 (x) − u2 (x)| ≤ max_{x∈∂Ω} |g1 (x) − g2 (x)|.     (9.1)

Proof. We appeal to the linearity of the Laplacian operator and take the difference of the two sets
of equations to obtain

    −∆(u1 − u2 ) = 0 in Ω,
    (u1 − u2 ) = (g1 − g2 ) on ∂Ω.

The (weak) maximum (resp. minimum) principle requires the maximum (resp. minimum) to occur
on the boundary; i.e., for any x ∈ Ω̄,

    u1 (x) − u2 (x) ≤ max_{x∈∂Ω} (g1 (x) − g2 (x)),
    u1 (x) − u2 (x) ≥ min_{x∈∂Ω} (g1 (x) − g2 (x)).

It follows that

    max_{x∈Ω̄} |u1 (x) − u2 (x)| ≤ max_{x∈∂Ω} |g1 (x) − g2 (x)|,

which is the desired inequality.

The left-hand side of (9.1) measures the closeness of the two solutions, and the right-hand
side of (9.1) measures the closeness of the boundary data. The uniform stability result shows that
the difference between the two solutions u1 and u2 at any point x ∈ Ω̄ is bounded by the maximum
difference in the boundary data g1 − g2 . Hence a small change in the data results in a small change
in the solution; the problem is hence stable. We have seen in Lecture 5 that the heat equation
also enjoys a similar stability property; however, not all PDEs possess the property, as we will
see for the wave and transport equations.

9.6 Energy method: uniqueness


Similarly to the case of the heat equation, we can also prove the uniqueness of the solution to
Laplace’s equation using an energy method. Suppose u1 and u2 are two solutions to the same
boundary value problem associated with Laplace’s equation. Then their difference w ≡ u1 − u2
satisfies

    −∆w = 0 in Ω,
    w = 0 on ∂Ω.

We then observe that

    0 = − ∫_Ω w ∆w dx = ∫_Ω ∇w · ∇w dx,

where the second equality follows from integration by parts and the fact that w = 0 on ∂Ω. Hence
∇w = 0 in Ω. Because w = 0 on ∂Ω, we conclude w = u1 − u2 = 0 in Ω.

9.7 Summary
We summarize key points of this lecture:

1. Laplace’s equation is ubiquitous in both engineering and mathematics. A function that


satisfies Laplace’s equation is called a harmonic function.

2. Harmonic functions satisfy mean-value properties. Given a harmonic function over a ball, the
value of the function at the center is equal to both the mean of the function over the ball and
the mean of the function over the surface of the ball.

3. Harmonic functions satisfy maximum principles. A harmonic function attains the maximum
value somewhere on the boundary (i.e., the weak maximum principle); if the maximum is
attained in an interior, then it must be a constant function (i.e., the strong maximum prin-
ciple).

4. Harmonic functions also satisfy the weak and strong minimum principles.

5. Dirichlet boundary value problems associated with Laplace’s equation have unique solutions.
The proof follows from either the maximum principle or the energy method.

6. Dirichlet boundary value problems associated with Laplace’s equation are stable in the sense
that the solution depends continuously on the boundary data. The proof follows from the
maximum principle.

Lecture 10

Laplace’s equation: separation of variables

©2019–2021 Masayuki Yano. Prepared for ESC384 Partial Differential Equations taught at the
University of Toronto.

10.1 Introduction
In this lecture, we seek the solution of Laplace’s and Poisson’s equations on bounded domains
using separation of variables. While the technique is fundamentally the same as that considered for
the heat equation, we pay particular attention to the extension of the technique to higher spatial
dimensions (n > 1).

10.2 Laplace’s equation on a rectangle


We now consider the solution of Laplace’s equation on a rectangle Ω ≡ (0, L) × (0, H) using
separation of variables:

    −∆u = 0 in Ω ≡ (0, L) × (0, H),                  (10.1)
    u = g 1 on Γ1 ≡ {x = 0} × (0, H),
    u = g 2 on Γ2 ≡ {x = L} × (0, H),
    u = g 3 on Γ3 ≡ (0, L) × {y = 0},
    u = g 4 on Γ4 ≡ (0, L) × {y = H},

where x and y denote the first and second spatial coordinates, respectively, and g i , i = 1, . . . , 4,
are some boundary functions. Note that the boundaries Γ1 , Γ2 , Γ3 , and Γ4 correspond to the left,
right, bottom, and top boundaries, respectively.
To solve the boundary value problem (10.1), we again appeal to the principle of superposition.
Namely, we decompose the solution u into two parts u = u1,2 +u3,4 , where u1,2 is the solution to the
boundary value problem with only g 1 and g 2 as the nonhomogeneous boundaries, and u3,4 is the
solution to the boundary value problem with only g 3 and g 4 as the nonhomogeneous boundaries.

More explicitly, u1,2 is the solution to

    −∆u1,2 = 0 in Ω,                                  (10.2)
    u1,2 = g 1 on Γ1 ,
    u1,2 = g 2 on Γ2 ,
    u1,2 = 0 on Γ3 ∪ Γ4 ;

similarly, u3,4 is the solution to

−∆u3,4 = 0 in Ω,
u3,4 = g 3 on Γ3 ,
u3,4 = g 4 on Γ4 ,
u3,4 = 0 on Γ1 ∪ Γ2 .

We readily see u = u1,2 + u3,4 by superposition.


We now consider the solution of (10.2) by separation of variables and linear superposition. We
recall the two-step procedure:

• Step 1. We find a family of separable solutions of the form

un (x, y) = Xn (x; Cn , Dn )Yn (y), n = 1, 2, . . . ,

each of which satisfies the homogeneous boundary conditions on Γ3 and Γ4 for any choice
of coefficients Cn and Dn , but not (in general) the nonhomogeneous boundary conditions on
Γ1 and Γ2 . (As we have seen for the heat equation, Yn will be a family of functions in y
that satisfy homogeneous boundary conditions at y = 0 and y = H; hence, we associate the
indeterminate coefficients Cn and Dn with functions Xn in x.)

• Step 2. We consider a particular choice of coefficients Cn and Dn such that

    u1,2 (x, y) = Σ_{n=1}^{∞} Xn (x; Cn , Dn )Yn (y)

satisfies the nonhomogeneous boundary conditions on Γ1 and Γ2 . Note that the solution u1,2
automatically satisfies the homogeneous boundary conditions on Γ3 and Γ4 since each un
satisfies the homogeneous boundary condition.

To find the separable solution of the form un (x, y) = Xn (x)Yn (y), we substitute the expression
into (10.2) and obtain

    −Xn′′ Yn − Xn Yn′′ = 0   ⇒   Xn′′ /Xn = −Yn′′ /Yn = αn

for some constant αn ∈ R. We have used the fact that Xn′′ /Xn and Yn′′ /Yn depend only on x and y,
respectively, and hence each of them must be a constant.

We now consider the boundary value eigenproblem associated with Yn ; we solve for Yn (instead
of Xn ) first because we wish to impose homogeneous boundary conditions at the endpoints y = 0
and y = H. We first observe that the boundary condition implies that

    un (x, y = 0) = Xn (x)Yn (0) = 0, ∀x ∈ (0, L),
    un (x, y = H) = Xn (x)Yn (H) = 0, ∀x ∈ (0, L).

Since we seek a nontrivial solution, we require Xn ≠ 0 and deduce Yn (0) = Yn (H) = 0. Our
boundary value eigenproblem for Yn is as follows:

    −Yn′′ = αn Yn in (0, H),
    Yn (0) = Yn (H) = 0.

This is a Sturm-Liouville problem. In fact this is a scaled version of the problem considered in
Example 4.6, and the eigenpairs are given by

    Yn (y) = sin(λn y) and αn = λn²

for λn = nπ/H, n = 1, 2, . . . .
We next consider the ODE associated with Xn :

    Xn′′ = αn Xn   ⇒   Xn′′ = λn² Xn

for λn = nπ/H. The general solution is given by

Xn (x) = Cn sinh(λn x) + Dn cosh(λn x).

We hence conclude that any solution of the form

    u1,2 (x, y) = Σ_{n=1}^{∞} (Cn sinh(λn x) + Dn cosh(λn x)) sin(λn y)

satisfies Laplace’s equation and the associated homogeneous boundary conditions on y = 0 and
y = H.
We now enforce the nonhomogeneous boundary conditions on x = 0 and x = L. We first enforce
u1,2 (x = 0, y) = g 1 (y) to obtain

    u1,2 (x = 0, y) = Σ_{n=1}^{∞} Dn sin(λn y) = g 1 (y)

for λn = nπ/H. We recognize that this is a Fourier sine series and the coefficients are given by

    Dn = ĝn1 ≡ (2/H) ∫_{0}^{H} sin(nπy/H) g 1 (y) dy,   n = 1, 2, . . . .     (10.3)

We next enforce the boundary condition u1,2 (x = L, y) = g 2 (y) to obtain

    u1,2 (x = L, y) = Σ_{n=1}^{∞} (Cn sinh(λn L) + ĝn1 cosh(λn L)) sin(λn y) = g 2 (y),

where we have set Dn = ĝn1 . We note that the “coefficient” (Cn sinh(λn L) + ĝn1 cosh(λn L)) must be
the n-th Fourier sine coefficient of g 2 . Hence

    Cn sinh(λn L) + ĝn1 cosh(λn L) = ĝn2 ≡ (2/H) ∫_{0}^{H} sin(nπy/H) g 2 (y) dy.

We rearrange the expression to obtain

    Cn = (ĝn2 − ĝn1 cosh(λn L)) / sinh(λn L).     (10.4)

We hence conclude that the solution u1,2 is given by

    u1,2 (x, y) = Σ_{n=1}^{∞} [ ((ĝn2 − ĝn1 cosh(λn L))/sinh(λn L)) sinh(λn x) + ĝn1 cosh(λn x) ] sin(λn y).

We may group the terms containing ĝn1 and ĝn2 to obtain

    u1,2 (x, y) = Σ_{n=1}^{∞} [ ĝn1 (cosh(λn x) − coth(λn L) sinh(λn x)) + ĝn2 sinh(λn x)/sinh(λn L) ] sin(λn y),

which may be further simplified using a hyperbolic identity to yield

    u1,2 (x, y) = Σ_{n=1}^{∞} [ ĝn1 sinh(λn (L − x))/sinh(λn L) + ĝn2 sinh(λn x)/sinh(λn L) ] sin(λn y),

where λn = nπ/H.
By the symmetry of the problem, we also find the solution to

    −∆u3,4 = 0 in Ω,                                   (10.5)
    u3,4 = g 3 on Γ3 ,
    u3,4 = g 4 on Γ4 ,
    u3,4 = 0 on Γ1 ∪ Γ2

is

    u3,4 (x, y) = Σ_{n=1}^{∞} [ ĝn3 sinh(µn (H − y))/sinh(µn H) + ĝn4 sinh(µn y)/sinh(µn H) ] sin(µn x),

where µn = nπ/L, and ĝn3 and ĝn4 are the Fourier sine series coefficients

    ĝn3 ≡ (2/L) ∫_{0}^{L} sin(nπx/L) g 3 (x) dx   and   ĝn4 ≡ (2/L) ∫_{0}^{L} sin(nπx/L) g 4 (x) dx.

It follows that, by linear superposition, the solution to the original problem (10.1) is given by

    u(x, y) = u1,2 (x, y) + u3,4 (x, y)
            = Σ_{n=1}^{∞} [ ĝn1 sinh(λn (L − x))/sinh(λn L) + ĝn2 sinh(λn x)/sinh(λn L) ] sin(λn y)
            + Σ_{n=1}^{∞} [ ĝn3 sinh(µn (H − y))/sinh(µn H) + ĝn4 sinh(µn y)/sinh(µn H) ] sin(µn x)

for λn = nπ/H and µn = nπ/L.
We make a few observations:

1. The solution is a linear combination of four harmonic functions associated with the four
boundary functions g 1 , . . . , g 4 ; i.e., u = u1 + u2 + u3 + u4 , where

    u1 (x, y) = Σ_{n=1}^{∞} ĝn1 (sinh(λn (L − x))/sinh(λn L)) sin(λn y),
    u2 (x, y) = Σ_{n=1}^{∞} ĝn2 (sinh(λn x)/sinh(λn L)) sin(λn y),
    u3 (x, y) = Σ_{n=1}^{∞} ĝn3 (sinh(µn (H − y))/sinh(µn H)) sin(µn x),
    u4 (x, y) = Σ_{n=1}^{∞} ĝn4 (sinh(µn y)/sinh(µn H)) sin(µn x).

2. We readily verify that the solution satisfies the boundary conditions:

    u(x = 0, y) = Σ_{n=1}^{∞} ĝn1 sin(λn y) = g 1 (y),   u(x = L, y) = Σ_{n=1}^{∞} ĝn2 sin(λn y) = g 2 (y),
    u(x, y = 0) = Σ_{n=1}^{∞} ĝn3 sin(µn x) = g 3 (x),   u(x, y = H) = Σ_{n=1}^{∞} ĝn4 sin(µn x) = g 4 (x).

3. Figure 10.1 shows the functions sinh(λn (L − x))/sinh(λn L) and sinh(λn x)/sinh(λn L) for
H = 1, which dictate the decay of the n-th mode away from the x = 0 and x = L boundaries,
respectively. We observe that the functions decay exponentially away from the boundaries
because sinh(λn x) ≈ exp(λn x)/2 for λn x ≫ 1. Moreover, higher modes (i.e., modes associated
with a larger n) decay more rapidly away from the boundaries because the exponential rate
λn = nπ/H grows with n.

4. Figure 10.2 shows the solution associated with a nonhomogeneous boundary condition on
x = L and with homogeneous boundary conditions elsewhere: un (x, y) =
ĝn2 (sinh(λn x)/sinh(λn L)) sin(λn y) for n = 1 and 3. We observe that each mode is smooth
and decays exponentially away from the boundary.

Example 10.1 (Laplace’s equation on a square). We consider Laplace’s equation on the unit square
Ω ≡ (0, 1)2 . The boundary value problem is given by (10.1) for the triangular boundary functions

    g 2 (y) = 1 − 2|y − 0.5|,   y ∈ (0, 1),
    g 4 (x) = 1/2 − |x − 0.5|,  x ∈ (0, 1),

and g 1 = g 3 = 0. To find the solution, we first note that λn = µn = nπ since L = H = 1. We then
find the Fourier sine coefficients of g 2 and g 4 :

    ĝn2 = 8(−1)^((n−1)/2) /(n²π²) for n odd,  ĝn2 = 0 for n even;
    ĝn4 = 4(−1)^((n−1)/2) /(n²π²) for n odd,  ĝn4 = 0 for n even.

Figure 10.1: Functions sinh(nπ(1 − x))/ sinh(nπ) and sinh(nπx)/ sinh(nπ).

(a) n = 1 (b) n = 3

Figure 10.2: Solutions to Laplace’s equation on (0, 1.5) × (0, 1) for g 2 (y) = sin(nπy) and g i = 0 for
i = 1, 3, 4.

Figure 10.3: Solution to Laplace’s equation on a square with triangular boundary functions.

We then substitute the coefficients into our general expression to find

    u(x, y) = Σ_{n=1,3,5,...} [ (8(−1)^((n−1)/2)/(n²π²)) (sinh(nπx)/sinh(nπ)) sin(nπy)
                              + (4(−1)^((n−1)/2)/(n²π²)) (sinh(nπy)/sinh(nπ)) sin(nπx) ].

Figure 10.3 shows the solution to the problem. We observe that even though the boundary functions
are not smooth, the solution is smooth on Ω. In fact it can be shown that harmonic functions are
infinitely continuously differentiable in Ω (but not necessarily on ∂Ω). We also observe that the
maximum (resp. minimum) principle is clearly satisfied and the maximum (resp. minimum) value
is attained on the boundary.
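The truncated series of Example 10.1 can also be evaluated numerically. The sketch below (an illustration; the truncation level and sample points are arbitrary choices) sums the odd-n terms and checks that the boundary data are reproduced and that an interior value obeys the maximum principle.

```python
import numpy as np

# Truncated series solution of Example 10.1 on the unit square; the
# truncation level N and the sample points are arbitrary choices.
def u(x, y, N=101):
    total = 0.0
    for n in range(1, N + 1, 2):             # only odd n contribute
        sgn = (-1.0) ** ((n - 1) // 2)
        g2 = 8 * sgn / (n * np.pi) ** 2      # Fourier sine coefficients of g^2
        g4 = 4 * sgn / (n * np.pi) ** 2      # Fourier sine coefficients of g^4
        total += g2 * np.sinh(n * np.pi * x) / np.sinh(n * np.pi) * np.sin(n * np.pi * y)
        total += g4 * np.sinh(n * np.pi * y) / np.sinh(n * np.pi) * np.sin(n * np.pi * x)
    return total

err2 = abs(u(1.0, 0.25) - 0.5)   # boundary check: g^2(0.25) = 1 - 2|0.25 - 0.5|
err4 = abs(u(0.25, 1.0) - 0.25)  # boundary check: g^4(0.25) = 1/2 - |0.25 - 0.5|
mid = u(0.5, 0.5)                # interior value; maximum principle: 0 < mid < 1
```

For much larger truncation levels, the ratio sinh(nπx)/sinh(nπ) should be evaluated in log space (e.g., as exp(nπ(x − 1)) for large nπ) to avoid floating-point overflow.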

10.3 Laplace’s equation on a disk


We now consider the solution of Laplace’s equation on a disk Ω ≡ {x ∈ R2 | kxk2 < a}:

    −∆u = 0 in Ω,                                     (10.6)
    u = g on ∂Ω.

We again use separation of variables but now in polar coordinates. The Laplacian in polar coordi-
nates is

    ∆u = (1/r) ∂/∂r (r ∂u/∂r) + (1/r²) ∂²u/∂θ² = ∂²u/∂r² + (1/r) ∂u/∂r + (1/r²) ∂²u/∂θ² = 0.
We substitute the separable form of the solution un (r, θ) = Rn (r)Θn (θ) and obtain

    Rn′′ Θn + (1/r) Rn′ Θn + (1/r²) Rn Θn′′ = 0.

We multiply through by r²/(Rn Θn ) to obtain

    r² Rn′′ /Rn + r Rn′ /Rn + Θn′′ /Θn = 0   ⇒   r² Rn′′ /Rn + r Rn′ /Rn = −Θn′′ /Θn = αn .
Since the left and right hand sides are functions of only r and θ, respectively, we deduce as before
that they must be equal to some constant αn .
We now consider the eigenproblem associated with Θn . Because we wish the function Θn :
(−π, π) → R to be continuous at the endpoints, we impose periodic boundary conditions. In
addition, we solve for Θn (instead of Rn ) first because the periodic boundary condition can be
readily incorporated into the separable form of the solution, just like the homogeneous boundary
condition in the rectangular case. The resulting eigenproblem is given by

    −Θn′′ = αn Θn in (−π, π),
    Θn (−π) = Θn (π),  Θn′ (−π) = Θn′ (π).

This is a periodic Sturm-Liouville problem. The general solution to the ODE is given by

    Θn (θ) = An cos(λn θ) + Bn sin(λn θ)

for λn = √αn . We then impose the periodic boundary condition to find that λn = n, n = 0, 1, 2, . . . ,
and hence

    Θn (θ) = An cos(nθ) + Bn sin(nθ),

and αn = λn² = n². (Note that Θn=0 = An is a constant function, which of course is periodic.)
Having found αn = n², we next observe that Rn must satisfy the ODE

    r² Rn′′ + r Rn′ − n² Rn = 0.

This is a Cauchy-Euler equation of order two, whose solution is of the form Rn (r) = rᵐ for some
m. We substitute the form of the solution to obtain

    m(m − 1) rᵐ + m rᵐ − n² rᵐ = 0   ⇒   m² − n² = 0   ⇒   m = ±n.

It follows that Rn (r) = Cn rⁿ + Dn r⁻ⁿ if n ≠ 0. For n = 0, the solution is given by Rn=0 (r) =
Cn=0 + Dn=0 log(r). We now impose the condition that Rn should be bounded on [0, a) and in
particular at the origin; we hence reject the solutions r⁻ⁿ and log(r), which are unbounded at the
origin. It follows that our solution is

    Rn (r) = Cn rⁿ ,   n = 0, 1, 2, . . . .

Hence our family of separable solutions is

    un (r, θ) = Rn (r)Θn (θ) = rⁿ (An cos(nθ) + Bn sin(nθ)),   n = 0, 1, 2, . . . .

We appeal to the principle of superposition and rescale the constant term by 1/2 and all other
terms by 1/aⁿ for convenience to obtain

    u(r, θ) = A0 /2 + Σ_{n=1}^{∞} (r/a)ⁿ (An cos(nθ) + Bn sin(nθ)).     (10.7)

To find the coefficients (An )_{n=0}^{∞} and (Bn )_{n=1}^{∞} , we apply the boundary condition at r = a:

    u(r = a, θ) = A0 /2 + Σ_{n=1}^{∞} (An cos(nθ) + Bn sin(nθ)) = g(θ);

this is the Fourier series for g with the period of 2π and hence the coefficients are given by

    An = (1/π) ∫_{−π}^{π} g(θ) cos(nθ) dθ,   n = 0, 1, . . . ,     (10.8)
    Bn = (1/π) ∫_{−π}^{π} g(θ) sin(nθ) dθ,   n = 1, 2, . . . .

Hence the Fourier series representation of the solution to (10.6) is given by (10.7) with coefficients
(10.8). We make a few observations:

1. As the amplitude of the n-th mode depends on (r/a)ⁿ with r/a ≤ 1, high modes — modes with
a large n that exhibit rapid oscillation on the boundary — decay rapidly away from the
boundary. This behavior is shown in Figure 10.4 and is similar to the decay away from the
boundary observed for rectangular domains.

2. The solution (as expected) satisfies the mean-value formulas over Ω and ∂Ω. The value at
the center of the disk is

    u(r = 0, θ) = A0 /2.

The average of the solution over the disk is

    (1/(πa²)) ∫_Ω u dx = (1/(πa²)) ∫_{θ=−π}^{π} ∫_{r=0}^{a} [ A0 /2 + Σ_{n=1}^{∞} (r/a)ⁿ (An cos(nθ) + Bn sin(nθ)) ] r dr dθ
                       = (1/(πa²)) ∫_{θ=−π}^{π} ∫_{r=0}^{a} (A0 /2) r dr dθ = A0 /2,

where the second equality follows because all of the sine and cosine integrals vanish when
integrated over (−π, π). The average over the circumference is

    (1/(2πa)) ∫_{∂Ω} u ds = (1/(2πa)) ∫_{θ=−π}^{π} u(r = a, θ) a dθ
                          = (1/(2πa)) ∫_{θ=−π}^{π} [ A0 /2 + Σ_{n=1}^{∞} (An cos(nθ) + Bn sin(nθ)) ] a dθ
                          = (1/(2πa)) ∫_{θ=−π}^{π} (A0 /2) a dθ = A0 /2,

where the third equality again follows because all of the sine and cosine integrals vanish.
Hence, we observe that, for any boundary condition g, the value of the solution at the center
of the disk is equal to (i) its average over the disk and (ii) its average over the circumference,
as expected from the mean-value formulas.

Example 10.2 (Laplace’s equation on a disk). Let Ω ≡ {x ∈ R2 | kxk2 < 2} and consider

−∆u = 0 in Ω,
u(r = 2, θ) = 1 + cos(3θ), θ ∈ [−π, π).

Figure 10.4: Function (r/a)ⁿ .

We wish to find the solution to the boundary value problem. We could find the Fourier coefficients
using the formula (10.8), but for this simple boundary condition we can find the coefficients by
inspection. Namely, we wish to find (An )_{n=0}^{∞} and (Bn )_{n=1}^{∞} such that

    u(r = 2, θ) = A0 /2 + Σ_{n=1}^{∞} (An cos(nθ) + Bn sin(nθ)) = 1 + cos(3θ).

We match the coefficients on the left and right hand sides to observe that A0 /2 = 1 (i.e., A0 = 2),
A3 = 1, and all other coefficients should be zero. Hence the solution is

    u(r, θ) = 1 + (r/2)³ cos(3θ).
Figure 10.5 shows the solution. We observe that, as expected, the solution is smooth. In addition,
we readily verify the (weak) maximum and minimum principles: the maximum and minimum of 2
and 0, respectively, occur on the boundary. The strong maximum and minimum principles are also
satisfied: the solution values strictly inside the domain Ω (i.e., not on the boundary) are all strictly
greater than 0 and strictly less than 2, since (r/2)³ < 1 for all r < 2. Finally, we observe that
the mean-value formulas are satisfied; the value at the center, the average over the disk, and the
average over the circumference are all equal to 1.
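A quick numerical verification of this example (an illustration, not part of the notes): since r³ cos(3θ) = Re((x + iy)³) = x³ − 3xy², the solution is the polynomial u = 1 + (x³ − 3xy²)/8 in Cartesian coordinates, whose harmonicity and boundary values can be checked directly.

```python
import numpy as np

# Cartesian form of 1 + (r/2)^3 cos(3t): since r^3 cos(3t) = x^3 - 3 x y^2,
# the solution is the polynomial below.
u = lambda x, y: 1 + (x**3 - 3 * x * y**2) / 8

# Five-point finite-difference Laplacian at a sample interior point;
# it vanishes (to roundoff) because u is harmonic.
x0, y0, h = 0.4, -0.9, 1e-3
lap = (u(x0 + h, y0) + u(x0 - h, y0) + u(x0, y0 + h) + u(x0, y0 - h)
       - 4 * u(x0, y0)) / h**2

# On the boundary r = 2 the polynomial reduces to the data 1 + cos(3t)
t = np.linspace(-np.pi, np.pi, 100)
bnd_err = np.max(np.abs(u(2 * np.cos(t), 2 * np.sin(t)) - (1 + np.cos(3 * t))))
```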

10.4 Summary
We summarize key points of this lecture:

1. Separation of variables can be applied to solve Laplace’s equation in two (or higher) dimen-
sions.

2. The solution to Laplace’s equation on a rectangle or disk can be expressed in terms of the
Fourier coefficients of the boundary data.

Figure 10.5: Solution to Laplace’s equation on a disk.

3. The harmonic function on a rectangle or disk decays away from the boundary. The solu-
tion decays more rapidly for more oscillatory boundary functions; i.e., the effect of highly
oscillatory boundary modes are confined to the thin region near the boundary.

Lecture 11

Laplace’s equation: fundamental solutions and Green’s functions

©2019–2021 Masayuki Yano. Prepared for ESC384 Partial Differential Equations taught at the
University of Toronto.

11.1 Introduction
In this lecture we seek the fundamental solution of Laplace’s equation and appeal to the principle
of linear superposition to solve Laplace’s and Poisson’s equations. The technique allows us to
naturally analyze the solution to Laplace’s equation in unbounded domains. We will also apply the
method of reflection to find Green’s functions and solve boundary value problems.

11.2 Fundamental solution


We first introduce the Dirac delta distribution, which is important to characterize the fundamental
solution.

Definition 11.1 (Dirac delta distribution). The Dirac delta distribution δ over Rn is a distribution
(or “function”) such that

    δ(x) = 0,   x ≠ 0,
    ∫_{Rn} δ(x) dx = 1,
    ∫_{Rn} f (x) δ(x − a) dx = f (a),   ∀a ∈ Rn ,

where f : Rn → R is continuous at a.

We wish to develop a better insight about the Dirac delta distribution. To this end, we restrict
ourselves to R1 and recall the definition of a Gaussian g : R → R centered about the origin with a
standard deviation σ:

    g(x; σ) = (1/√(2πσ²)) exp(−x²/(2σ²)).


Figure 11.1: Gaussian functions for a few different values of σ.

Figure 11.1 shows Gaussian functions for a few different values of σ. The Dirac delta distribution
can be thought of as a Gaussian in the limit of σ → 0. To see this, we make three observations:

1. A Gaussian g(x; σ) is nearly zero for any x such that |x| ≫ σ. This becomes the first property
of Dirac delta in the limit of σ → 0; i.e., for x ≠ 0, δ(x) → 0 as σ → 0.

2. For any σ > 0, a Gaussian satisfies

    ∫_R g(x; σ) dx = 1,

which is the second property of Dirac delta: ∫_R δ(x) dx = 1.

3. The convolution of a Gaussian with a function,


Z
g(x − a, σ)f (x)dx,
R

provides an weighted average of f near a, where the nearness


R is defined by σ. In the limit of
σ → 0, this becomes the third property of Dirac delta: R δ(x − a)f (x)dx = f (a).
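These limits can be checked numerically. The following sketch (our own illustration in Python; the helper `gauss_conv` and the test function cos(x) are not from the notes) approximates the convolution of a Gaussian with f(x) = cos(x) by the composite Simpson rule and shows the result approaching f(0) = 1 as σ → 0.

```python
import math

def gauss_conv(f, a, sigma, n=4000):
    """Approximate the convolution integral of g(x - a; sigma) with f(x)
    by the composite Simpson rule; the integrand is negligible outside
    |x - a| > 10*sigma, so we truncate the domain there."""
    lo, hi = a - 10 * sigma, a + 10 * sigma
    h = (hi - lo) / n
    def integrand(x):
        g = math.exp(-(x - a) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)
        return g * f(x)
    s = integrand(lo) + integrand(hi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * integrand(lo + k * h)
    return (h / 3) * s

for sigma in (1.0, 0.1, 0.01):
    print(gauss_conv(math.cos, 0.0, sigma))  # ≈ 0.607, 0.995, 0.99995: tends to cos(0) = 1
```

For this particular f the convolution is exp(−σ²/2) in closed form, which the quadrature reproduces; the weighted average collapses onto f(a) as σ → 0, as claimed in observation 3.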

Having defined the Dirac delta distribution, we now introduce the fundamental solution of
Laplace’s equation.

Definition 11.2 (fundamental solution of Laplace’s equation). The fundamental solution to Laplace’s
equation is the function Φ : Rn → R such that

−∆Φ = δ in Rn ,

Figure 11.2: The fundamental solution of Laplace’s equation in R2 . Note that the function is
unbounded at the origin.

where δ is the Dirac delta function. The fundamental solutions are given by

    Φ(x) = −(1/2) ‖x‖,                   n = 1,
           −(1/2π) log ‖x‖,              n = 2,
           1/(4π ‖x‖),                   n = 3,
           1/(σ_n (n − 2) ‖x‖^(n−2)),    n ≥ 4,

where σ_n is the surface area of the n-dimensional unit ball, and ‖x‖ ≡ (x_1² + · · · + x_n²)^(1/2).
First part of proof (for R2). We first prove the result for R2, for which Φ(x) = −(1/2π) log ‖x‖ =
−(1/2π) log(r). We first verify that the fundamental solution satisfies Laplace's equation for any r > 0:

    −∆Φ = −(1/r) ∂/∂r ( r ∂Φ/∂r ) − (1/r²) ∂²Φ/∂θ²
        = −(1/r) ∂/∂r ( r ∂/∂r ( −(1/2π) log(r) ) )
        = (1/2π) (1/r) ∂/∂r ( r · (1/r) ) = 0

for r > 0, and hence −∆Φ(x) = 0 for x ≠ 0.


Showing −∆Φ = δ requires the theory of distributions, which is beyond the scope of this
course.

Figure 11.2 shows the fundamental solution of Laplace’s equation in R2 . The function is un-
bounded at the origin and decays rapidly away from the origin. By definition, the function is
harmonic everywhere except at the origin. In electrostatics, the fundamental solution represents
the electric potential associated with a single point charge at the origin.

11.3 Poisson’s equation in Rn


We now use the fundamental solution to solve Poisson’s problem in Rn :

Theorem 11.3. Suppose f : Rn → R is twice continuously differentiable and has a compact
support. Then a solution to Poisson’s problem

−∆u = f in Rn

can be expressed as

    u(x) = ∫_{Rn} Φ(x − ξ) f(ξ) dξ,
where Φ : Rn → R is the fundamental solution to Laplace’s equation.

Proof. The proof is beyond the scope of this course.


However, using the definition of the fundamental solution and the Dirac delta distribution, we
may at least formally show

    −∆u(x) = − ∫_{Rn} ∆Φ(x − ξ) f(ξ) dξ = ∫_{Rn} δ(x − ξ) f(ξ) dξ = f(x),

and hence −∆u(x) = f(x) ∀x ∈ Rn.

11.4 Green’s function


We now consider the solution of Laplace's (and Poisson's) equation in a domain Ω with a boundary
using Green's functions, which are similar to the fundamental solution but incorporate the
boundary condition.

Definition 11.4 (Green’s function). A Green’s function in Ω is a function G : Ω × Ω → R that


satisfies, for each fixed ξ ∈ Ω,

−∆G(x, ξ) = δ(x − ξ) ∀x ∈ Ω,
G(x, ξ) = 0 ∀x ∈ ∂Ω.

We can equivalently interpret G(·, ξ) : Ω → R as a function that satisfies, for each fixed ξ ∈ Ω,

−∆G(·, ξ) = δξ in Ω,
G(·, ξ) = 0 on ∂Ω,

where δξ is the Dirac delta distribution centered about ξ (i.e., δξ (x) = δ(x − ξ)). Like the funda-
mental solution, a Green’s function G(·, ξ) is harmonic everywhere except at ξ; however, unlike the
fundamental solution, the Green’s function is associated with a particular domain with a boundary
∂Ω. In electrostatics, the Green’s function G(·, ξ) represents the electric potential associated with
a single point charge at ξ inside a conducting boundary ∂Ω (where the potential vanishes).
We first note that Green’s functions are symmetric.

Theorem 11.5 (symmetry of Green's function). For any x, ξ ∈ Ω, x ≠ ξ,

G(x, ξ) = G(ξ, x).

Proof. Proof is provided in Appendix 11.A.

The symmetry of Green’s functions is known as the principle of reciprocity. In electrostatics,
the symmetry implies that (i) the potential at point a due to a point charge at b and (ii) the
potential at point b due to a point charge at a are the same.
The Green’s function provides the following solution representation for nonhomogeneous Dirich-
let boundary value problem:

Theorem 11.6 (Green’s representation formula). Let G : Ω → R be the Green’s function associ-
ated with a domain Ω. Then the solution to the Poisson equation

−∆u = f in Ω,
u=g on ∂Ω

can be expressed as

    u(x) = ∫_Ω G(x, ξ) f(ξ) dξ − ∫_{∂Ω} ν · ∇_ξ G(x, ξ) g(ξ) dξ.    (11.1)

Proof. Proof is provided in Appendix 11.A.


We may also at least formally obtain the expression from the definition of the delta distribution
and two successive applications of integration by parts:

    u(ξ) = ∫_Ω δ(ξ − x) u(x) dx = − ∫_Ω ∆G(x, ξ) u(x) dx
         = ∫_Ω ∇G(x, ξ) · ∇u(x) dx − ∫_{∂Ω} ν · ∇G(x, ξ) u(x) dx
         = − ∫_Ω G(x, ξ) ∆u(x) dx + ∫_{∂Ω} G(x, ξ) ν · ∇u(x) dx − ∫_{∂Ω} ν · ∇G(x, ξ) u(x) dx
         = ∫_Ω G(x, ξ) f(x) dx − ∫_{∂Ω} ν · ∇G(x, ξ) g(x) dx,

where the last equality follows from G(·, ξ) = 0 on ∂Ω. We finally appeal to the symmetry of the
Green's function to obtain the desired result.

The representation formula allows us to express the solution to Poisson's problem as the
convolution of the Green's function with the source/boundary data f and g. The Green's function
for problems with boundaries (in Theorem 11.6) plays a role analogous to that of the fundamental
solution for problems without boundaries (in Theorem 11.3).

11.5 Finding Green’s functions: general formulation


In general, a Green’s function, which satisfies Definition 11.4, can be expressed in terms of the
fundamental solution Φ and a “correction function” Ψ. Namely, we consider the decomposition of
the form
G(x, ξ) = Φ(x − ξ) + Ψ(x, ξ).
We apply the Laplacian to the decomposition to obtain

−∆G(x, ξ) = −∆Φ(x − ξ) − ∆Ψ(x, ξ) = δ(x − ξ) − ∆Ψ(x, ξ) ∀x, ξ ∈ Ω.

Since we wish −∆G(x, ξ) = δ(x − ξ), we conclude that

−∆Ψ(x, ξ) = 0 ∀x, ξ ∈ Ω.

We next apply the boundary condition for the Green’s function to the decomposition to obtain

G(x, ξ) = Φ(x − ξ) + Ψ(x, ξ) = 0 ∀x ∈ ∂Ω, ∀ξ ∈ Ω,

which implies that


Ψ(x, ξ) = −Φ(x − ξ), ∀x ∈ ∂Ω, ∀ξ ∈ Ω.
Thus, the correction function Ψ : Ω × Ω → R must satisfy, for each fixed ξ ∈ Ω,

−∆x Ψ(x, ξ) = 0 ∀x ∈ Ω
Ψ(x, ξ) = −Φ(x − ξ) ∀x ∈ ∂Ω.

The correction function, unlike the fundamental solution, does not contain a singularity and is
harmonic everywhere in Ω.

11.6 Green’s function for the upper half plane


We consider Laplace’s problem on the upper half plane. To this end, we introduce the boundary
Γ ≡ {(x1, x2) | x2 = 0} and the domain Ω ≡ {(x1, x2) | x2 > 0}. The boundary value problem is

−∆u = 0 in Ω, (11.2)
u=g on Γ.

We solve the problem in two steps: we first find the Green’s function; we then apply Green’s
representation formula.
To find the Green’s function, we use the method of images (or the method of reflection) and
express the Green’s function as a sum of fundamental solutions. The idea is to choose an image
point ξ̃ ∈ R2 \ Ω associated with the reflection of ξ about the axis x2 = 0; i.e.,

    ξ̃ = (ξ̃1, ξ̃2) ≡ (ξ1, −ξ2).

The image point ξ̃ associated with the point ξ has the important property that any point on the
boundary Γ is equidistant from ξ and ξ̃; as we will see shortly, this property is what enforces
the boundary condition. We then choose the correction function

    Ψ(x, ξ) = −Φ(x − ξ̃),

so that the Green's function is

    G(x, ξ) = Φ(x − ξ) − Φ(x − ξ̃) = −(1/2π) log ‖x − ξ‖ + (1/2π) log ‖x − ξ̃‖.

We observe the following:

1. The Green’s function G(x, ξ) is harmonic everywhere in Ω except at ξ since (i) Φ(x − ξ) is
˜ is harmonic everywhere (since ξ˜ 6∈ Ω).
harmonic everywhere except at ξ and (ii) Φ(x − ξ)

(a) Full R2 visualization (b) Upper half plane visualization

Figure 11.3: Green’s function for the upper half plane G(x, ξ ≡ (0, 0.5)). The function is unbounded
at ξ = (0, 0.5) (and at the image point ξ˜ = (0, −0.5)). The red line indicates the boundary.

2. The value of the fundamental solution depends only on the distance between two points, and
   the distances from x to ξ and from x to ξ̃ are the same for all x ∈ Γ. Hence, for any x ∈ Γ,
   Φ(x − ξ) = Φ(x − ξ̃), which implies G(x, ξ) = Φ(x − ξ) − Φ(x − ξ̃) = 0. Thus the boundary
   condition for the Green's function is satisfied.
Figure 11.3 shows the Green’s function G(x, ξ ≡ (0, 0.5)) for the upper half plane.
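As a quick numerical sanity check of these two properties (our own sketch in Python; the function name `G` and the sample points are illustrative), the Green's function built from the image point vanishes on Γ and is discretely harmonic away from ξ:

```python
import math

def G(x1, x2, s1, s2):
    """Method-of-images Green's function for the upper half plane (n = 2), source at (s1, s2)."""
    d = math.hypot(x1 - s1, x2 - s2)    # distance to the source point
    di = math.hypot(x1 - s1, x2 + s2)   # distance to the image point (s1, -s2)
    return (-math.log(d) + math.log(di)) / (2 * math.pi)

print(G(3.7, 0.0, 0.0, 0.5))  # → 0.0: the two distances coincide on the boundary x2 = 0

# Five-point Laplacian at a point away from the source (0, 0.5) is (nearly) zero:
# G is harmonic there, up to finite difference truncation error.
h, x1, x2 = 1e-3, 0.3, 1.2
lap = (G(x1 + h, x2, 0, 0.5) + G(x1 - h, x2, 0, 0.5)
       + G(x1, x2 + h, 0, 0.5) + G(x1, x2 - h, 0, 0.5) - 4 * G(x1, x2, 0, 0.5)) / h**2
print(abs(lap))  # small: only truncation error of the discrete Laplacian remains
```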
We now wish to represent the solution to Laplace's problem (11.2) using the representation
formula (11.1). To this end, we first evaluate the directional derivative ν · ∇_ξ G(x, ξ) to find

    ν · ∇_ξ G(x, ξ) = −∂G(x, ξ)/∂ξ2 = (1/2π) ∂/∂ξ2 ( log ‖x − ξ‖ − log ‖x − ξ̃‖ )
                    = (1/2π) ( (ξ2 − x2)/‖x − ξ‖² − (ξ2 + x2)/‖x − ξ̃‖² ),

where the first equality follows from ν = (0, −1) on Γ for the upper half plane, and the last equality
follows from ξ̃2 = −ξ2. For ξ ∈ Γ, ξ = ξ̃ and hence

    ν · ∇_ξ G(x, ξ) = −(1/π) x2/‖x − ξ‖².

We appeal to (11.1) and find that the solution to (11.2) is

    u(x) = ∫_Ω G(x, ξ) f(ξ) dξ − ∫_Γ ν · ∇_ξ G(x, ξ) g(ξ) dξ = (1/π) ∫_Γ x2/‖x − ξ‖² g(ξ) dξ,

where the integral over Ω vanishes because f = 0. This solution representation is called Poisson's
formula for the upper half plane (in R2). The function

    K(x, ξ) ≡ (1/π) x2/‖x − ξ‖²,

which multiplies the boundary function g, is called Poisson's kernel for the upper half plane (in
R2).
We make a few observations:

1. If g ≥ 0 everywhere on Γ and g > 0 somewhere on Γ, then the solution is strictly positive
(u > 0) everywhere in the upper half plane. This is consistent with the strong maximum
principle.
2. Any disturbance in the boundary condition affects the solution everywhere in the upper half
plane because Poisson’s kernel is nonzero everywhere.
3. In R2, the effect of the disturbance decays as ∼ 1/x2 as we move away from the plane. In
   other words, while a disturbance in the boundary condition affects the solution everywhere in
   the domain, its effect decays away from the boundary.
Remark 11.7 (Poisson’s formula for upper half plane in Rn ). Following the same derivation as
the above, we can shown that Poisson’s formula for the upper half plane in Rn is
Z
2 xn
u(x) = g(ξ)dξ,
nαn Γ kx − ξkn
where αn is the volume of the unit ball in Rn (e.g., αn=2 = π, αn=3 = 4/3π).
Example 11.8 (Laplace’s equation in the upper half plane). We consider a boundary-value problem
−∆u = 0 in Ω ≡ {(x1 , x2 ) | x2 > 0},
u=g on Γ ≡ {(x1 , x2 ) | x2 = 0},
where (
1, x1 ∈ (−1/2, 1/2),
g((x1 , 0)) ≡
0, otherwise.
Using Poisson’s formula for the upper half plane, we can express the solution as
1 1/2
Z Z
1 x2 x2
u(x) = g(ξ)dξ = dξ1 ,
π Γ kx − ξk 2 π ξ1 =−1/2 (x1 − ξ1 )2 + x22
where the last equality follows from ξ = (ξ1 , 0) on the boundary Γ. Figure 11.4 shows the solution.
We observe that even though the boundary condition is discontinuous, the solution is smooth every-
where in the upper half plane (excluding the boundary). We also observe that the strong maximum
principle is satisfied and the solution is strictly positive everywhere (excluding the boundary).
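The last integral can also be evaluated in closed form. The sketch below (our own Python illustration; the Simpson-rule quadrature and the closed form, obtained by noting that the antiderivative of the kernel in ξ1 is arctan((ξ1 − x1)/x2), are our additions, not part of the notes) checks the two against each other.

```python
import math

def u_half_plane(x1, x2, n=2000):
    """Simpson-rule evaluation of u(x) = (1/pi) * int_{-1/2}^{1/2} x2/((x1-s)^2 + x2^2) ds."""
    a, b = -0.5, 0.5
    h = (b - a) / n
    def kernel(s):
        return x2 / ((x1 - s) ** 2 + x2 ** 2)
    total = kernel(a) + kernel(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * kernel(a + j * h)
    return (h / 3) * total / math.pi

def u_exact(x1, x2):
    """Closed form: u = (1/pi) * (atan((1/2 - x1)/x2) + atan((1/2 + x1)/x2))."""
    return (math.atan((0.5 - x1) / x2) + math.atan((0.5 + x1) / x2)) / math.pi

print(u_half_plane(0.0, 0.5))  # → 0.5, i.e. (2/π) arctan(1)
```

As x2 → 0 with x1 ∈ (−1/2, 1/2), both arctangents approach π/2 and u → 1, recovering the boundary data; directly above the center of the segment, at x = (0, 0.5), the solution is exactly 1/2.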

11.7 Green’s function for a disk


We now consider the solution of Laplace’s problem on a disk Ω ≡ B(0, a) = {x ∈ R2 | kxk < a} of
radius a:
−∆u = 0 in Ω, (11.3)
u=g on ∂Ω.
We again solve the problem in two steps: we first find the Green’s function; we then apply the
Green’s representation formula.
As before, to find the Green’s function, we wish to find the correction function Ψ which takes
on the same value as the fundamental solution Φ on the boundary. To this end, we first introduce
the dual point with respect to the disk.

Figure 11.4: Solution to Laplace's problem in the upper half plane.

Definition 11.9 (dual point with respect to a circle). The point

    ξ̃ ≡ (a²/‖ξ‖²) ξ

is said to be dual to ξ with respect to the circle ∂B(0, a). For any fixed ξ ∈ B(0, a), the dual point
satisfies, for x ∈ ∂B(0, a),

    (‖ξ‖/a) ‖x − ξ̃‖ = ‖x − ξ‖.
Proof. We first observe that

    (‖ξ‖/a) ‖x − ξ̃‖ = (‖ξ‖/a) ‖x − (a²/‖ξ‖²) ξ‖ = ( (‖ξ‖²/a²) x · x − 2 x · ξ + (a²/‖ξ‖²) ξ · ξ )^(1/2).

Since x · x = a² for x ∈ ∂B(0, a),

    (‖ξ‖/a) ‖x − ξ̃‖ = ( ‖ξ‖² − 2 x · ξ + ‖x‖² )^(1/2) = ‖x − ξ‖,

which is the desired relationship.
Given (‖ξ‖/a) ‖x − ξ̃‖ = ‖x − ξ‖ for x ∈ ∂B(0, a), we choose the correction function

    Ψ(x, ξ) = −Φ( (‖ξ‖/a)(x − ξ̃) )

so that the Green's function is

    G(x, ξ) = Φ(x − ξ) − Φ( (‖ξ‖/a)(x − ξ̃) ) = −(1/2π) log ‖x − ξ‖ + (1/2π) log( (‖ξ‖/a) ‖x − ξ̃‖ ).

The Green’s function then satisfies the boundary condition. Figure 11.5 shows the Green’s function
G(x, ξ ≡ (0.4, 0)) for the unit disk.
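The boundary condition can again be checked numerically (our own Python sketch; the function name `G_disk` is illustrative): on any point of the circle, the dual-point identity makes the two logarithms cancel.

```python
import math

def G_disk(x, s, a=1.0):
    """Green's function for the disk B(0, a) built from the dual point (n = 2)."""
    ns = math.hypot(s[0], s[1])
    dual = (a**2 / ns**2 * s[0], a**2 / ns**2 * s[1])   # dual point (a^2/|s|^2) s
    d = math.hypot(x[0] - s[0], x[1] - s[1])            # distance to the source
    dd = math.hypot(x[0] - dual[0], x[1] - dual[1])     # distance to the dual point
    return (-math.log(d) + math.log(ns / a * dd)) / (2 * math.pi)

t = 0.7  # an arbitrary boundary point (cos t, sin t) of the unit circle
print(abs(G_disk((math.cos(t), math.sin(t)), (0.4, 0.0))))  # ≈ 0 (machine precision)
```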

(a) surface plot (b) top view

Figure 11.5: Green’s function for the unit disk G(x, ξ ≡ (0.4, 0)). The function is unbounded at
ξ = (0.4, 0.0) (and at the dual point ξ˜ = (2.5, 0.0)). The red line indicates the boundary.

We now wish to represent the solution to Laplace's problem (11.3) using the representation
formula (11.1). After the tedious but straightforward evaluation of ν · ∇_ξ G(x, ξ) for ξ ∈ ∂B(0, a),
we find a solution representation:

    u(x) = − ∫_{∂Ω} ν · ∇_ξ G(x, ξ) g(ξ) dξ = (a² − ‖x‖²)/(2πa) ∫_{∂B(0,a)} g(ξ)/‖x − ξ‖² dξ.

This formula is called Poisson's formula for a disk (in R2). The function

    K(x, ξ) ≡ (a² − ‖x‖²)/(2πa) · 1/‖x − ξ‖²,

which multiplies the boundary function g, is called Poisson's kernel for a disk (in R2).

Remark 11.10 (Poisson’s formula for a ball in Rn ). Following the same derivation as the above,
we can show that Poisson’s formula for a ball B(0, a) ⊂ Rn is

a2 − kxk2
Z
1
u(x) = g(ξ)dξ,
nαn a ∂B(0,a) kx − ξkn

where αn is the volume of the unit ball in Rn .

Example 11.11 (Laplace’s equation in a disk). We consider a boundary-value problem

−∆u = 0 in Ω ≡ {x ∈ R2 | kxk < 1},


u=g on ∂Ω,

where (
1, θ ∈ (−π/10, π/10),
g((r = 1, θ)) ≡
0, otherwise.

Figure 11.6: Solution to Laplace's problem in a disk.

Using Poisson’s formula for a unit disk, we can express the solution as
π/10
a2 − kxk2 12 − kxk2
Z Z
g(ξ) 1
u(x) = dξ = dθ,
2πa ∂B(0,a) kx − ξk2 2π −π/10 (x1 − cos(θ))2 + (x2 − sin(θ))2

where the last equality follows from ξ = (cos(θ), sin(θ)) on ∂B(0, a = 1). Figure 11.6 shows
the solution. We observe that even though the boundary condition is discontinuous, the solution is
smooth everywhere in the disk (excluding the boundary). We also observe that the strong maximum
principle is satisfied and the solution is strictly positive everywhere (excluding the boundary).

Example 11.12 (proof of mean value formula by Poisson's formula). We evaluate Poisson's formula
(for the disk) at the origin x = 0 to find that

    u(0) = (a²/(2πa)) ∫_{∂B(0,a)} g(ξ)/a² dξ = (1/(2πa)) ∫_{∂B(0,a)} g(ξ) dξ,

which is the mean value formula.
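The mean value property can be confirmed by evaluating Poisson's formula of Example 11.11 numerically at the center (our own Python sketch; the quadrature routine and function name are illustrative): since g = 1 on an arc of length 2π/10 of the unit circle, the mean of g is (2π/10)/(2π) = 1/10.

```python
import math

def u_disk(x1, x2, n=2000):
    """Poisson's formula for the unit disk with g = 1 on theta in (-pi/10, pi/10),
    0 elsewhere, evaluated by the composite Simpson rule."""
    t0, t1 = -math.pi / 10, math.pi / 10
    h = (t1 - t0) / n
    def kernel(t):
        return 1.0 / ((x1 - math.cos(t)) ** 2 + (x2 - math.sin(t)) ** 2)
    s = kernel(t0) + kernel(t1)
    for j in range(1, n):
        s += (4 if j % 2 else 2) * kernel(t0 + j * h)
    return (1 - x1**2 - x2**2) / (2 * math.pi) * (h / 3) * s

print(u_disk(0.0, 0.0))  # → 0.1, the mean of g over the unit circle
```

At interior points other than the center the solution stays strictly between 0 and 1, consistent with the strong maximum principle noted above.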

11.8 Summary
We summarize key points of this lecture:

1. The Dirac delta distribution vanishes everywhere except at the origin and yet integrates
to 1; it can be thought of as the Gaussian function in the limit of the standard deviation
approaching 0.

2. The fundamental solution of Laplace's equation satisfies Poisson's equation in Rn with the
   Dirac delta distribution as the source. The fundamental solution decays rapidly away from
   the origin.

3. The solution to Poisson’s equation in Rn can be expressed as the convolution of the funda-
mental solution with the source function.

4. Green’s function of Laplace’s equation on Ω ⊂ Rn satisfies the Poisson’s equation with the
Dirac delta distribution as the source and vanishes on ∂Ω.

5. Green’s function is symmetric. This is known as the principle of reciprocity.

6. The solution to Poisson’s problem on Ω ⊂ Rn with Dirichlet boundary conditions can be


expressed using the Green’s representation formula.

7. Green’s function for a half plane can be found using the method of reflection (a.k.a. method
of images).

8. Green’s function for a disk can be found using the method of reflection using the concept of
dual points.

11.A Proofs of properties of Green’s functions
Before we prove properties of Green’s function, we introduce Green’s second identity, which helps
us prove the properties.

Proposition 11.13 (Green’s second identity). For all w, u ∈ C 2 (Ω),


Z Z
(w∆u − u∆w)dx = (w(ν · ∇u) − u(ν · ∇w))ds.
Ω ∂Ω

Proof. We invoke integration by parts twice to yield


Z Z Z Z Z
w∆udx = w(ν · ∇u)ds − ∇w · ∇udx = (w(ν · ∇u) − (ν · ∇w)u)ds + (∆w)udx.
Ω ∂Ω Ω ∂Ω Ω
R
The subtraction of Ω (∆w)udx from both sides yields the desired result.

Proposition 11.14. Let Φ : Rn → R be the fundamental solution of Laplace's equation. Then,
for any continuously differentiable u,

    lim_{ε→0} ∫_{∂B(a,ε)} ( Φ(x − a) ∂u/∂ν(x) − u(x) ∂Φ/∂ν(x − a) ) dx = u(a),

where B(a, ε) is the ball of radius ε centered about a, and ∂u/∂ν(x) ≡ ν · ∇u(x) for ν the outward-
pointing normal on the surface of the ball.

Proof. We prove the result in two dimensions (n = 2), for which Φ(x) = −(1/2π) log ‖x‖. The proof
for other dimensions follows similarly. We first note that, for any fixed ε > 0,

    (⋆) ≡ ∫_{∂B(a,ε)} ( Φ(x − a) ∂u/∂ν(x) − u(x) ∂Φ/∂ν(x − a) ) dx
        = −(1/2π) log(ε) ∫_{∂B(a,ε)} ∂u/∂ν(x) dx + (1/(2πε)) ∫_{∂B(a,ε)} u(x) dx,

where the equality follows because (i) the fundamental solution depends only on the distance from a,
and hence (ii) Φ(x − a) = −(1/2π) log(ε) and ∂Φ/∂ν(x − a) = ∂/∂r(−(1/2π) log(r))|_{r=ε} = −1/(2πε)
for any x ∈ ∂B(a, ε). We now rewrite the integrals in terms of the mean over the surface of the ball
to obtain

    (⋆) = −(1/2π) log(ε) · 2πε · mean_{∂B(a,ε)}(∂u/∂ν) + (1/(2πε)) · 2πε · mean_{∂B(a,ε)}(u)
        = −ε log(ε) mean_{∂B(a,ε)}(∂u/∂ν) + mean_{∂B(a,ε)}(u),

where mean_{∂B(a,ε)}(∂u/∂ν) ≡ (1/(2πε)) ∫_{∂B(a,ε)} ∂u/∂ν(x) dx and mean_{∂B(a,ε)}(u) ≡
(1/(2πε)) ∫_{∂B(a,ε)} u(x) dx. Since u is continuously differentiable, ∂u/∂ν and u are bounded and
so are their means. Moreover, since lim_{ε→0} ε log(ε) = 0, the first term is 0 in the limit of ε → 0.
It follows that

    lim_{ε→0} (⋆) = lim_{ε→0} mean_{∂B(a,ε)}(u) = u(a),

which is the desired result.

Proof of symmetry of Green’s function. We first introduce u(x) = G(x, a) = Φ(x − a) + Ψ(x, a)
and w(x) = G(x, b) = Φ(x − b) + Ψ(x, b), where G(x, a) and G(x, b) are Green’s functions on
Ω, and Ψ(x, a) and Ψ(x, b) are the associated correction functions. We next introduce a domain
Ω ≡ Ω \ (B(a, ) ∪ B(b, )) for  > 0 but sufficiently small; i.e., Ω with two holes centered about
a and b. Both u and w are harmonic in Ω since the holes remove the singularity in the Green’s
functions. We now invoke the Green’s second identity with u and w. The left hand side of the
Green’s second identify over Ω is
Z
(LHS) = (G(x, b)∆G(x, a) − G(x, a)∆G(x, b))dx = 0;
Ω

the term vanishes because G(x, a) and G(x, b) are harmonic everywhere in Ω and hence ∆G(x, a) =
∆G(x, b) = 0 for all x ∈ Ω . We next note that the right hand side of the Green’s second identify
over Ω is
Z
∂u ∂w
(RHS) = (w −u )dx
∂Ω ∂ν ∂ν
Z Z Z
∂u ∂w ∂u ∂w ∂u ∂w
= (w −u )dx − (w −u )dx − (w −u )dx
∂Ω ∂ν ∂ν ∂B(a,) ∂ν ∂ν ∂B(b,) ∂ν ∂ν
≡ CΩ − A − B ,

where ν is the outward-pointing normal with respect to Ω on ∂Ω and with respect to B(a, ) on
∂B(a, ). (In other words, ν is the inward-pointing normal with respect to Ω on ∂B(a, ).) The
first term CΩ vanishes because u(x) = G(x, a) = 0 and w(x) = G(x, b) = 0 for all x ∈ ∂Ω. The
second term simplifies to
Z
∂G ∂w
lim A = lim (w(x) (x, a) − G(x, a) (x))dx
→0 →0 ∂B(a,) ∂ν ∂ν
Z
∂Φ ∂w
= lim (w(x) (x − a) − Φ(x − a) (x))dx
→0 ∂B(a,) ∂ν ∂ν
Z
∂Ψ ∂w
+ lim (w(x) (x, a) − Ψ(x, a) (x))dx
→0 ∂B(a,) ∂ν ∂ν
= −w(a) = −G(a, b),

where the second equality follows from the decomposition G(x, a) = Φ(x − a) + Ψ(x, a); the last
equality follows from (i) the application of Proposition 11.14 to the first term and (ii) the fact that
the second term vanishes as  → 0 because w(x) ∂Ψ ∂w
∂ν (x, a) − Ψ(x, a) ∂ν (x) is continuous. Similarly,

lim B = u(b) = G(b, a).


→0

It follows 0 = (LHS) = (RHS) = −A − B = G(a, b) − G(b, a), and hence G(a, b) = G(b, a).

Proof of Green’s representation formula. We first introduce w(x) = G(x, ξ) = Φ(x − ξ) + Ψ(x, ξ),
where G(x, ξ) is the Green’s function on Ω and Ψ(x, ξ) is the associated correction function. We
next introduce a domain Ω ≡ Ω \ B(ξ, ) for  > 0 but sufficiently small. Note that w is harmonic

137
in Ω since the hole removes the singularity in the Green’s function. We now invoke the Green’s
second identity with w(x) = G(x, ξ) and u(x). The left hand side is given by
Z Z
(LHS) = (G(x, ξ)∆u(x) − u(x)∆G(x, ξ))dx = − G(x, ξ)f (x)dx;
Ω Ω

the second term vanishes because G(x, ξ) is harmonic in Ω . The right hand side is given by
Z
∂u(x) ∂G
(RHS) = (G(x, ξ) − u(x) (x, ξ))dx
Ω ∂ν ∂ν
Z Z
∂u(x) ∂G ∂u(x) ∂G
= (G(x, ξ) − u(x) (x, ξ))dx − (G(x, ξ) − u(x) (x, ξ))dx
∂Ω ∂ν ∂ν ∂B(ξ,) ∂ν ∂ν
≡ CΩ − D ,

where ν is the outward-pointing normal with respect to Ω on ∂Ω and with respect to B(a, ) on
∂B(a, ). (In other words, ν is the inward-pointing normal with respect to Ω on ∂B(a, ).) The
first term simplifies to Z
∂G
CΩ = − g(x) (x, ξ)dx,
∂Ω ∂ν
because G(x, ξ) = 0 and u(x) = g(x) for all x ∈ ∂Ω. The second term simplifies to

lim D = u(ξ),
→0

following the same argument used to simplify A in the proof of symmetry of Green’s functions.
Equating the left and right hand sides, we obtain
Z Z
∂G
u(ξ) = G(x, ξ)f (x)dx − g(x) (x, ξ)dx.
Ω ∂Ω ∂ν

We switch the roles of x and ξ to obtain


Z Z Z Z
∂G
u(x) = G(ξ, x)f (ξ)dξ − g(ξ) (ξ, x)dξ = G(ξ, x)f (ξ)dξ − g(ξ)ν · ∇ξ G(ξ, x)dξ.
Ω ∂Ω ∂ν Ω ∂Ω

We appeal to the symmetry of the Green’s function to obtain


Z Z
u(x) = G(x, ξ)f (ξ)dξ − g(ξ)ν · ∇ξ G(x, ξ)dξ,
Ω ∂Ω

which is the desired result.

Lecture 12

Poisson's equation: finite difference method

12.1 Introduction
In this lecture we consider finite difference approximations of Poisson’s (and Laplace’s) equation
in two dimensions. Numerical methods are arguably a necessity for PDEs in two and higher
dimensions, as there is no analytical means to find the solution for complex domain geometries.
Building on our discussion of the finite difference method for the one-dimensional heat equation in
Lecture 8, in this lecture we focus on finite difference approximations in higher dimensions.

12.2 Problem statement


We introduce Poisson’s equation in two dimensions:

∂2u ∂2u
− − 2 =f in Ω ≡ (0, 1) × (0, 1),
∂x2 ∂y
u=0 on ∂Ω,

where f : Ω → R is some forcing function. For simplicity, we consider a unit square domain and
homogeneous Dirichlet boundary condition. We defer to Section 12.6 for the treatment of other
boundary conditions.

12.3 Finite difference formulation


We approximate the two-dimensional Poisson’s equation by a finite difference method. As we have
done for the heat equation, we first introduce a computational grid over the domain Ω. The grid
points are given by
(xi , yj ) = (i∆x, j∆y), i, j = 0, . . . , n + 1,

Figure 12.1: Computational grid for two-dimensional Poisson equation.

where ∆x = ∆y = h = 1/(n + 1) is the spacing between any two adjacent grid points. We then
denote our approximation of the solution by

    ũ_{i,j} ≡ ũ(x_i, y_j), i, j = 0, . . . , n + 1.
An example of computational grid for n = 3 is shown in Figure 12.1. (Note that our indices range
from i, j = 0 to n + 1, so that the unknowns are at i, j = 1, . . . , n after we impose the Dirichlet
boundary conditions at i, j = 0 and n + 1. The indices are different from the mixed boundary
condition case considered in Lecture 8, where the right most node was an unknown node because
of the Neumann boundary condition. This also affects the relationship between the grid spacing
∆x and n.)
We apply the second-order central difference formula to approximate the second derivatives:

    ∂²u/∂x²|_{i,j} ≈ (ũ_{i+1,j} − 2ũ_{i,j} + ũ_{i−1,j})/∆x²,
    ∂²u/∂y²|_{i,j} ≈ (ũ_{i,j+1} − 2ũ_{i,j} + ũ_{i,j−1})/∆y².

We note that, because we are approximating partial derivatives, the finite difference step for ∂²u/∂x²
and ∂²u/∂y² is taken (only) in the respective coordinate direction. We also recall that these approximations
of the second derivative are second-order accurate (Proposition 8.4); i.e., using the Taylor
series, we can show that

    ∂²u/∂x²|_{i,j} − (ũ_{i+1,j} − 2ũ_{i,j} + ũ_{i−1,j})/∆x² = O(∆x²)

for sufficiently regular u.


Combining the two approximations of partial derivatives, we obtain an approximation of the
Laplacian:

    −∆u|_{i,j} = −∂²u/∂x²|_{i,j} − ∂²u/∂y²|_{i,j} ≈ −(ũ_{i+1,j} − 2ũ_{i,j} + ũ_{i−1,j})/∆x² − (ũ_{i,j+1} − 2ũ_{i,j} + ũ_{i,j−1})/∆y².

Because the approximation of the Laplacian at (i, j) involves the solution values at five points (i.e.,
the four neighbors and itself), the finite difference approximation is said to have a five-point stencil.
The substitution of the finite difference formula into the Poisson equation yields

    −(ũ_{i+1,j} − 2ũ_{i,j} + ũ_{i−1,j})/∆x² − (ũ_{i,j+1} − 2ũ_{i,j} + ũ_{i,j−1})/∆y² = f̂_{i,j},  i, j = 1, . . . , n,    (12.1)

where f̂_{i,j} = f(x_i, y_j). The homogeneous Dirichlet boundary conditions become

    ũ_{0,j} = ũ_{n+1,j} = 0, j = 0, . . . , n + 1,
    ũ_{i,0} = ũ_{i,n+1} = 0, i = 1, . . . , n.
Note that we have n² equations associated with the interior grid points. The difference equation
can again be expressed in a matrix form:

    Âũ = f̂;

for instance, if n = 3 and h ≡ ∆x = ∆y, the matrix Â ∈ R9×9 and the vectors ũ ∈ R9 and f̂ ∈ R9
are given by

    Â = (1/h²) [  4  −1   0  −1   0   0   0   0   0
                 −1   4  −1   0  −1   0   0   0   0
                  0  −1   4   0   0  −1   0   0   0
                 −1   0   0   4  −1   0  −1   0   0
                  0  −1   0  −1   4  −1   0  −1   0
                  0   0  −1   0  −1   4   0   0  −1
                  0   0   0  −1   0   0   4  −1   0
                  0   0   0   0  −1   0  −1   4  −1
                  0   0   0   0   0  −1   0  −1   4 ],

    ũ = (ũ_{11}, ũ_{21}, ũ_{31}, ũ_{12}, ũ_{22}, ũ_{32}, ũ_{13}, ũ_{23}, ũ_{33})^T,
    f̂ = (f̂_{11}, f̂_{21}, f̂_{31}, f̂_{12}, f̂_{22}, f̂_{32}, f̂_{13}, f̂_{23}, f̂_{33})^T.

Note that we have ordered the grid points such that the x-index is the faster changing index and the
y-index is the slower changing index; this choice is arbitrary.
Because we have employed a second-order accurate finite difference formula to approximate the
Laplacian, it can be shown that this finite difference approximation is second-order accurate. If the
exact solution is sufficiently regular, then the error is bounded by

    ‖u − ũ‖∞ = max_{i,j=1,...,n} |u(x_i, y_j) − ũ_{i,j}| ≤ C∆x² + D∆y²,

for some C < ∞ and D < ∞. We make two observations:


1. Convergence: the error ‖u − ũ‖∞ → 0 as ∆x → 0 and ∆y → 0.

2. Order of accuracy: ‖u − ũ‖∞ ≤ C∆x² + D∆y², so the scheme is second-order accurate.
   Decreasing both ∆x and ∆y by a factor of two causes the error to decrease by a factor of
   four.
We make one final remark on computational consideration. Similar to the case of one-dimensional
heat equation, the matrix  associated with two- (or higher-) dimensional Laplace’s equation is
sparse (i.e., mostly zero). It is important to take advantage of this sparsity in computation. Prag-
matically, in Matlab, we should (i) ensure that the matrix is stored as a sparse matrix using the
sparse command and (ii) use the “\” backslash operator to solve the linear system Âũ = fˆ. We
should never use the inv function, which is very inefficient because the inverse of  is a dense
matrix.
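The sparse assembly and solve can be illustrated concretely (our own sketch in Python with scipy.sparse, playing the role of Matlab's sparse storage and backslash; the Kronecker-product construction Â = I ⊗ T + T ⊗ I, with T the one-dimensional second-difference matrix, is one standard way to build the five-point operator):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 3
h = 1.0 / (n + 1)
# 1D second-difference matrix with Dirichlet BCs: tridiag(-1, 2, -1) / h^2
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csr") / h**2
I = sp.identity(n, format="csr")
# five-point 2D Laplacian; x-index is the faster (inner) index, as in the 9x9 matrix above
A = (sp.kron(I, T) + sp.kron(T, I)).tocsr()
u = spla.spsolve(A, np.ones(n * n))  # sparse direct solve (cf. Matlab's backslash)
print(A[0, 0] * h**2, A[0, 1] * h**2, A[0, 3] * h**2)  # → 4.0 -1.0 -1.0
```

The assembled matrix reproduces the stencil entries 4 and −1 (scaled by 1/h²) displayed above, and `spsolve` exploits the sparsity; forming the inverse explicitly would be wasteful since the inverse of Â is dense.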

(a) h = 1/8 (b) h = 1/64

Figure 12.2: Finite difference approximations of Poisson’s equation on the unit square domain.

h       ‖u − ũ‖∞
1/4     2.69 × 10⁻³
1/8     6.56 × 10⁻⁴
1/16    1.63 × 10⁻⁴
1/32    4.07 × 10⁻⁵
1/64    1.02 × 10⁻⁵

Table 12.1: Maximum error for finite difference approximations of Poisson’s equation on the unit
square domain as a function of the grid spacing.

12.4 Example: square domain

We consider a two-dimensional Poisson problem

−∆u = sin(πx) sin(πy) in Ω ≡ (0, 1)2 ,


u=0 on ∂Ω.

Finite difference approximations of the solution for two different values of the grid spacing h =
∆x = ∆y are shown in Figure 12.2. The h = 1/64 approximation is noticeably smoother than the
h = 1/8 approximation.
Table 12.1 shows the maximum error over all grid points as a function of the grid spacing. We
observe that the maximum error decreases by approximately a factor of four whenever the grid
spacing is decreased by a factor of two. This is consistent with the fact that the error scales as
O(∆x2 , ∆y 2 ) for the second-order accurate finite difference method for smooth problems.
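The convergence behavior in Table 12.1 can be reproduced with a short script (our own Python/NumPy sketch; we assemble the operator densely via np.kron for brevity, though a sparse format is preferable in practice). The exact solution of this problem is u = sin(πx) sin(πy)/(2π²).

```python
import numpy as np

def fd_error(n):
    """Max-norm error of the five-point scheme for -lap(u) = sin(pi x) sin(pi y), u = 0 on the boundary."""
    h = 1.0 / (n + 1)
    T = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    A = np.kron(np.eye(n), T) + np.kron(T, np.eye(n))
    x = np.arange(1, n + 1) * h
    X, Y = np.meshgrid(x, x)
    f = (np.sin(np.pi * X) * np.sin(np.pi * Y)).ravel()
    u = np.linalg.solve(A, f)
    return np.max(np.abs(u - f / (2 * np.pi**2)))   # exact solution is f / (2 pi^2)

e1, e2 = fd_error(7), fd_error(15)   # grid spacings h = 1/8 and h = 1/16
print(e1, e2, e1 / e2)               # ≈ 6.6e-4, 1.6e-4, ratio ≈ 4 (cf. Table 12.1)
```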

(a) computational grid (h = 1/6) (b) solution (h = 1/64)

Figure 12.3: Computational grid and solution for the L-shaped domain problem.

12.5 Example: L-shaped domain


The finite difference approximation readily extends to non-square domains. We here consider
Poisson’s equation on a L-shaped domain Ω ≡ (0, 1)2 \ [0, 1/2]2 :

−∆u = 1 in Ω,
u=0 on ∂Ω.

An example of the computational grid for the L-shaped domain is shown in Figure 12.3(a). Note
that there is no computational grid over (0, 1/2)2 , which is not part of the domain. The solution
obtained on a finer grid is shown in Figure 12.3(b). The solution is in fact singular at the 270◦
corner; i.e., higher-order derivatives are not continuous at the corner. Table 12.2 shows that, due
to the presence of the singularity, the finite difference error converges as O(h2/3 ) for h = ∆x = ∆y,
which is (significantly) slower than the convergence of O(h2 ) for smooth problems. The fact that the
method does not achieve the design order of accuracy for problems with singularities is consistent
with the fact that the finite difference formulas are derived using the Taylor series for smooth
functions.

12.6 Treatment of various boundary conditions


The treatment of various boundary conditions in two dimensions is similar to their treatment in
one dimension in Lecture 8.

1. Dirichlet boundary. Suppose we wish to impose a nonhomogeneous Dirichlet boundary
   condition on the x = 0 boundary:

       u(x_{i=0}, y_j) = h(y_j).

h       ‖u − ũ‖∞
1/4     1.79 × 10⁻³
1/8     1.92 × 10⁻³
1/16    1.40 × 10⁻³
1/32    9.19 × 10⁻⁴
1/64    5.80 × 10⁻⁴

Table 12.2: Maximum error for the two-dimensional Poisson problem on the L-shaped domain as
a function of grid spacing.

This boundary condition is incorporated in the finite difference equation associated with
(x_{i=1}, y_j). We substitute ũ_{0,j} = h(y_j) into (12.1) evaluated at (x_{i=1}, y_j),

    −(ũ_{2,j} − 2ũ_{1,j} + ũ_{0,j})/∆x² − (ũ_{1,j+1} − 2ũ_{1,j} + ũ_{1,j−1})/∆y² = f̂_{1,j},

to obtain

    −(ũ_{2,j} − 2ũ_{1,j} + h(y_j))/∆x² − (ũ_{1,j+1} − 2ũ_{1,j} + ũ_{1,j−1})/∆y² = f̂_{1,j}.

We move all known terms to the right hand side to obtain

    −(ũ_{2,j} − 2ũ_{1,j})/∆x² − (ũ_{1,j+1} − 2ũ_{1,j} + ũ_{1,j−1})/∆y² = f̂_{1,j} + h(y_j)/∆x².

We note that the expression is similar to the expression obtained for the nonhomogeneous Dirichlet
boundary condition for the one-dimensional heat equation.

2. Neumann boundary. Suppose we wish to impose a nonhomogeneous Neumann boundary
   condition on the x = 0 boundary:

       −∂u/∂x (x_{i=0}, y_j) = h(y_j).
This boundary condition is incorporated in the finite difference equation associated with
(x_{i=0}, y_j). (Recall that, unlike the case of a Dirichlet boundary condition, the solution on the
boundary u(x_{i=0}, y_j) is unknown for a Neumann boundary.) The finite difference approximation
of the boundary condition is

    −(ũ_{1,j} − ũ_{−1,j})/(2∆x) = h(y_j),

which implies ũ_{−1,j} = ũ_{1,j} + 2∆x h(y_j). We substitute the (approximation of the) boundary
condition into (12.1) evaluated at (x_{i=0}, y_j),

    −(ũ_{1,j} − 2ũ_{0,j} + ũ_{−1,j})/∆x² − (ũ_{0,j+1} − 2ũ_{0,j} + ũ_{0,j−1})/∆y² = f̂_{0,j},

to obtain

    −(ũ_{1,j} − 2ũ_{0,j} + ũ_{1,j} + 2∆x h(y_j))/∆x² − (ũ_{0,j+1} − 2ũ_{0,j} + ũ_{0,j−1})/∆y² = f̂_{0,j}.

We again move known quantities to the right hand side to obtain

    −(2ũ_{1,j} − 2ũ_{0,j})/∆x² − (ũ_{0,j+1} − 2ũ_{0,j} + ũ_{0,j−1})/∆y² = f̂_{0,j} + 2h(y_j)/∆x.

We note that the expression is similar to the expression obtained for the nonhomogeneous Neumann
boundary condition for the one-dimensional heat equation.
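A quick way to verify this bookkeeping (our own Python/NumPy sketch; the test problem −∆u = 0 with u = x on ∂Ω is our choice) is to impose Dirichlet data for which the scheme is exact: the five-point formula has zero truncation error for linear functions, so after the known boundary values are moved to the right-hand side the discrete solution should equal u = x to machine precision.

```python
import numpy as np

n = 5
h = 1.0 / (n + 1)
T = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
A = np.kron(np.eye(n), T) + np.kron(T, np.eye(n))   # five-point Laplacian, x-index fast
x = np.arange(1, n + 1) * h
b = np.zeros((n, n))           # b[j, i]: known boundary values moved to the right-hand side
b[:, -1] += 1.0 / h**2         # from u(1, y) = 1  (the u(0, y) = 0 boundary adds nothing)
b[0, :] += x / h**2            # from u(x, 0) = x
b[-1, :] += x / h**2           # from u(x, 1) = x
u = np.linalg.solve(A, b.ravel()).reshape(n, n)
print(np.max(np.abs(u - x)))   # ≈ 0 (machine precision); u[j, i] equals x[i]
```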

12.7 Time dependent problems: heat equation


We can readily apply the finite difference approximation of two-dimensional Laplace's equation
to construct a semi-discrete equation for the two-dimensional heat equation. The resulting
semi-discrete equation is

    dũ_{i,j}/dt − (ũ_{i+1,j} − 2ũ_{i,j} + ũ_{i−1,j})/∆x² − (ũ_{i,j+1} − 2ũ_{i,j} + ũ_{i,j−1})/∆y² = f̂_{i,j},  i, j = 1, . . . , n.

The associated matrix form of the equation is

    dũ/dt + Âũ = f̂.

We can then apply a time marching method discussed in Lecture 8 to the time derivative to obtain
a fully discrete equation. We then solve the fully discrete equation, starting at time t = 0.
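A minimal time-marching sketch (our own, in Python/NumPy; the forward Euler step, the time step, and the final time are illustrative choices) marches the semi-discrete system from a zero initial condition; by t = 1 the transient has decayed and the iterate agrees with the steady Poisson solution:

```python
import numpy as np

n = 7
h = 1.0 / (n + 1)
T = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
A = np.kron(np.eye(n), T) + np.kron(T, np.eye(n))   # five-point Laplacian
f = np.ones(n * n)

dt = h**2 / 8                     # below the forward-Euler stability limit (about h^2/4 here)
u = np.zeros(n * n)               # initial condition u(., t = 0) = 0
for _ in range(round(1.0 / dt)):  # march to t = 1: u <- u + dt * (f - A u)
    u = u + dt * (f - A @ u)

steady = np.linalg.solve(A, f)    # steady state satisfies A u = f
print(np.max(np.abs(u - steady))) # tiny: the heat solution has reached steady state
```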

12.8 Other numerical methods for PDEs


While we have focused our discussion on the finite difference method in this course, there are many
other numerical methods for PDEs. We here highlight several well-known approaches:
• Finite element methods. Finite element methods are derived from so-called weak form of
the PDEs. Finite element methods permit the use of unstructured meshes, which facili-
tates the treatment of complex geometries; an example of unstructured meshes is shown in
Figure 12.4(a). The method is also amenable to rigorous analysis using functional analysis
techniques. The finite element method can be used to approximate PDEs of all types (i.e.,
parabolic, elliptic, and hyperbolic) and is used in many engineering applications including
structures, fluids, acoustics, and electromagnetics. An example of a finite element approxima-
tion of fluid flow is shown in Figure 12.4(b).
• Finite volume methods. Finite volume methods are derived from the integral (as opposed
to differential) form of the PDEs and are particularly well-suited for hyperbolic conservation
laws in fluid dynamics. Like finite element methods, finite volume methods also permit the
use of unstructured meshes, which facilitates the treatment of complex geometries.
• Boundary element methods. Boundary element methods are designed for linear homogeneous
PDEs. The method appeals to the fundamental solution of the PDE and the Green’s repre-
sentation formula to recast the PDE as a problem on boundaries. While applicable only to
linear homogeneous PDEs, the method provides an efficient solution of these PDEs and is
widely used in electromagnetics; the vortex-lattice method for potential flows in fluid dynamics
is also an example of a boundary element method.

Figure 12.4: Example of a mesh and solution for flow past a cylinder: (a) mesh; (b) solution
(x-component of velocity).

12.9 Summary
We summarize key points of this lecture:

1. Finite difference methods readily extend to higher dimensions. To approximate a partial


derivative in a given direction, a finite difference formula is applied in that direction.

2. If the Laplacian operator is approximated using the second-order accurate finite difference
formula, the resulting finite difference approximation of Laplace’s (or Poisson’s) equation is
second-order accurate.

3. Various boundary conditions can be treated. On a Dirichlet boundary, we simply replace


the value of the solution on the boundary with the known value. On a Neumann (or Robin)
boundary, we approximate the boundary condition using a finite difference formula and make
an appropriate substitution.

4. Finite difference methods for higher spatial dimensions can also be used to identify a semi-
discrete form of higher-dimensional time-dependent problems.

5. There are many numerical methods for PDEs, including finite element, finite volume, and
boundary element methods.

Part IV

Hyperbolic equations

Lecture 13

Wave equation: introduction

©2019–2021 Masayuki Yano. Prepared for ESC384 Partial Differential Equations taught at the
University of Toronto.

13.1 Introduction
We have so far considered parabolic equations in the heat equation and elliptic equations in Pois-
son’s (and Laplace’s) equation. We now consider a new class of equations: hyperbolic equations.
Hyperbolic equations describe physical phenomena in which quantities are “transported” rather
than “diffused.” The two model hyperbolic equations we consider in this course are the wave
equation and the transport equation. In this lecture we focus on the wave equation; we introduce
the equation and study the behavior of the equation without finding an explicit expression of the
solution.

13.2 Derivation: vibrating string in R × R>0


We now derive the one-dimensional wave equation which models a vibrating string. We assume
that the flexible string has a (per-length) density of ρ and is in tension T . We apply Newton’s law
to a small segment (xL , xR ) of the string shown in Figure 13.1 to obtain the equations of motion
for the longitudinal (x) and transverse (y) motions:

$$\underbrace{0}_{\text{no movement in } x} = \underbrace{T n_x|_{x_R} - T n_x|_{x_L}}_{\text{longitudinal force}}, \tag{13.1}$$

$$\underbrace{\int_{x_L}^{x_R} \rho \frac{\partial^2 u}{\partial t^2}\, dx}_{\text{mass } \times \text{ acceleration}} = \underbrace{T n_y|_{x_R} - T n_y|_{x_L}}_{\text{transverse force}}, \tag{13.2}$$

where nx and ny are the longitudinal and transverse components of the unit vector n pointing along
the string. From the geometry of the problem, the unit vector n = (nx, ny) must be proportional
to (dx, du) or equivalently (1, ∂u/∂x); hence, the components of the unit vector can be expressed as

$$n_x = \frac{1}{\sqrt{1 + \left(\frac{\partial u}{\partial x}\right)^2}} \qquad \text{and} \qquad n_y = \frac{\partial u / \partial x}{\sqrt{1 + \left(\frac{\partial u}{\partial x}\right)^2}}. \tag{13.3}$$

Figure 13.1: A vibrating string and the associated force decomposition of a segment: (a) string;
(b) force decomposition.
We now assume that the displacement u and the slope ∂u/∂x are small so that
$\sqrt{1 + (\partial u/\partial x)^2} \approx 1$. This assumption simplifies the components of the
unit vector to nx = 1 and ny = ∂u/∂x. Consequently, the longitudinal equation of motion (13.1)
simplifies to

$$T|_{x_R} - T|_{x_L} = 0;$$

since the choice of the segment (xL, xR) is arbitrary, we deduce that the tension is constant along
the string.
This further simplifies the transverse equation of motion (13.2) to

$$\int_{x_L}^{x_R} \rho \frac{\partial^2 u}{\partial t^2}\, dx = T\left( \frac{\partial u}{\partial x}\bigg|_{x_R} - \frac{\partial u}{\partial x}\bigg|_{x_L} \right).$$

We apply the divergence theorem to the right hand side to obtain

$$\int_{x_L}^{x_R} \rho \frac{\partial^2 u}{\partial t^2}\, dx = \int_{x_L}^{x_R} T \frac{\partial^2 u}{\partial x^2}\, dx.$$

Again, since the choice of the segment (xL, xR) is arbitrary, the integrands must be equal to each
other and we obtain

$$\frac{\partial^2 u}{\partial t^2} = \frac{T}{\rho} \frac{\partial^2 u}{\partial x^2}.$$

We finally set $c \equiv \sqrt{T/\rho}$ and obtain

$$\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2}.$$
This is the wave equation, where c is the wave speed.

13.3 Derivation: acoustic wave equation in R3 × R>0


We now derive a multi-dimensional wave equation which models acoustic waves. The starting point
for the acoustic wave equation is the Euler equations for compressible flows:

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho v) = 0,$$
$$\frac{\partial (\rho v)}{\partial t} + v \cdot \nabla (\rho v) + \nabla p = 0;$$

the first and second equations are the conservation of mass and momentum, respectively. We now
express the density, velocity, and pressure as the perturbation (ρ̃, ṽ, and p̃) about the background
state (ρ0 , v0 , and p0 ) that are invariant in space and time:

ρ = ρ0 + ρ̃,
v = v0 + ṽ = ṽ,
p = p0 + p̃,

where we have assumed that the background velocity is 0. We substitute the expressions to the
Euler equations, appeal to the fact ρ0 , v0 , and p0 are invariant in space and time, and neglect the
product of two or more perturbation terms to obtain
$$\frac{\partial(\rho_0 + \tilde\rho)}{\partial t} + \nabla \cdot \left( (\rho_0 + \tilde\rho)\tilde v \right) \approx \frac{\partial \tilde\rho}{\partial t} + \rho_0 \nabla \cdot \tilde v = 0,$$
$$\frac{\partial((\rho_0 + \tilde\rho)\tilde v)}{\partial t} + \tilde v \cdot \nabla\left( (\rho_0 + \tilde\rho)\tilde v \right) + \nabla(p_0 + \tilde p) \approx \rho_0 \frac{\partial \tilde v}{\partial t} + \nabla \tilde p = 0.$$

We next note that for isentropic flows c² ≈ p̃/ρ̃. It follows that

$$\frac{1}{c^2}\frac{\partial \tilde p}{\partial t} + \rho_0 \nabla \cdot \tilde v = 0,$$
$$\rho_0 \frac{\partial \tilde v}{\partial t} + \nabla \tilde p = 0. \tag{13.4}$$

We now wish to eliminate the velocity from the equations. To this end, we assume that both p̃ and
ṽ are sufficiently smooth, differentiate the first equation in time, take the divergence of the second
equation, and take the difference of the two equations:

$$0 = \frac{\partial}{\partial t}\left( \frac{1}{c^2}\frac{\partial \tilde p}{\partial t} + \rho_0 \nabla \cdot \tilde v \right) - \nabla \cdot \left( \rho_0 \frac{\partial \tilde v}{\partial t} + \nabla \tilde p \right) = \frac{1}{c^2}\frac{\partial^2 \tilde p}{\partial t^2} + \rho_0 \frac{\partial}{\partial t}\nabla \cdot \tilde v - \rho_0 \nabla \cdot \frac{\partial \tilde v}{\partial t} - \Delta \tilde p = \frac{1}{c^2}\frac{\partial^2 \tilde p}{\partial t^2} - \Delta \tilde p.$$
This is the acoustic wave equation for the (perturbation in the) pressure.
Following the convention that is used for all other PDEs in this course, we denote the solution
— perturbation in the pressure — by the variable u and multiply through by c2 to obtain

$$\frac{\partial^2 u}{\partial t^2} - c^2 \Delta u = 0. \tag{13.5}$$
This is the (dimensional form of the) wave equation.

13.4 Boundary conditions


To define a complete initial boundary value problem, we now complement the wave equation (13.5)
with boundary and initial conditions. We consider two different types of boundary conditions for
the wave equation.

1. Dirichlet boundary condition (e.g., prescribe pressure for the acoustic wave equation). We first
recall that the solution to the (acoustic) wave equation represent the pressure perturbation.
Hence, if the pressure perturbation at the boundary is prescribed, we obtain a boundary
condition of the form
u = p̃b on ΓD × R>0 ,
where ΓD is the Dirichlet boundary, and p̃b is the prescribed pressure perturbation. This is a
Dirichlet boundary condition, as the equation depends only on the value of the solution (and
not its derivative). If the pressure is fixed (i.e., there is no perturbation), then the equation
simplifies to u = 0.

2. Neumann boundary condition (e.g., prescribed normal acceleration for the acoustic wave equa-
tion). The prescribed acceleration condition arises when we have, for instance, a speaker that
vibrates at a known frequency and displacement. To prescribe the acceleration condition, we
evaluate the linearized momentum equation (13.4) in the normal direction ν to obtain

$$\nu \cdot \nabla \tilde p = -\rho_0 \frac{\partial (\nu \cdot \tilde v)}{\partial t}.$$

Recalling u = p̃ we obtain

$$\nu \cdot \nabla u = -\rho_0 \frac{\partial (\nu \cdot \tilde v)}{\partial t} = -\rho_0\, \nu \cdot \tilde a_b \quad \text{on } \Gamma_N \times \mathbb{R}_{>0},$$

where ΓN is the Neumann boundary, and ãb is the prescribed acceleration of the wall. This
is a Neumann boundary condition, as the equation depends only on the derivative (and not
the value). If the wall is fixed (i.e., there is no acceleration), then the equation simplifies to
ν · ∇u = 0.

The wave equation, like the heat equation, is second order in space. Hence, we must enforce one
and only one boundary condition at any point on ∂Ω.
The wave equation, unlike the heat equation, is second order in time. Hence we must specify
both the value and time derivative at the initial time to obtain a well posed problem; we impose

u = g on Ω × {t = 0},
∂u/∂t = h on Ω × {t = 0},
for some prescribed initial state g : Ω → R and the time derivative of the state h : Ω → R.

13.5 Nondimensionalization
As in the case for the heat equation, we may choose the nondimensionalization for the spatial and
temporal variables so that the nondimensionalized equation has a unity wave speed. To demonstrate
the idea, consider a wave equation

$$\frac{\partial^2 u}{\partial t^2} - c^2 \frac{\partial^2 u}{\partial x^2} = 0 \quad \text{in } (0, L) \times \mathbb{R}_{>0}.$$

We introduce the length scale L and the time scale T ≡ L/c, so that the nondimensionalized
coordinate is x̃ = x/L and t̃ = t/T = ct/L. The nondimesionalized wave equation is then

$$\frac{\partial^2 \tilde u}{\partial \tilde t^2} - \frac{\partial^2 \tilde u}{\partial \tilde x^2} = 0 \quad \text{in } (0, 1) \times \mathbb{R}_{>0}.$$
The choice of the time scale in particular reduces the wave speed to unity.

Example 13.1 (nondimensionalization). Suppose the solution to nondimensionalized wave equa-


tion is given by
ũ(x̃, t̃) = f(x̃ − t̃) + f(x̃ + t̃),

for some function f : R → R. The solution to the original equation is then

u(x, t) = ũ(x/L, ct/L) = f((x − ct)/L) + f((x + ct)/L).
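This traveling-wave form can be sanity-checked numerically. The sketch below (with hypothetical values L = 2, c = 3 and a Gaussian profile f, all of our choosing) verifies that u(x, t) = f((x − ct)/L) + f((x + ct)/L) satisfies the dimensional wave equation by comparing centered second differences in t and x:

```python
import numpy as np

# Hypothetical length scale, wave speed, and profile (not from the notes)
L, c = 2.0, 3.0
f = lambda s: np.exp(-s**2)

def u(x, t):
    """Traveling-wave solution built from the nondimensional solution."""
    return f((x - c * t) / L) + f((x + c * t) / L)

# centered second differences approximate u_tt and u_xx at a sample point
x0, t0, d = 0.7, 0.4, 1e-4
u_tt = (u(x0, t0 + d) - 2.0 * u(x0, t0) + u(x0, t0 - d)) / d**2
u_xx = (u(x0 + d, t0) - 2.0 * u(x0, t0) + u(x0 - d, t0)) / d**2
residual = u_tt - c**2 * u_xx   # should vanish up to discretization error
```

The residual is zero up to the truncation and roundoff error of the centered differences.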

13.6 Energy conservation


In the derivation of the wave equation, we have seen that the equation is associated with phenomena
such as the vibration of a string and acoustics, where the energy is propagated as waves. Our
physical intuition suggests that, in the absence of friction, the energy should be conserved and
simply propagate without any dissipation. Indeed the wave equation respects this physical intuition,
as stated in the following theorem.

Theorem 13.2 (conservation of energy for the wave equation). Suppose u is the solution to the
wave equation on Ω with homogeneous boundary conditions:

$$\frac{\partial^2 u}{\partial t^2} - \Delta u = 0 \quad \text{in } \Omega \times \mathbb{R}_{>0},$$
$$u = 0 \quad \text{on } \Gamma_D \times \mathbb{R}_{>0},$$
$$\frac{\partial u}{\partial n} = 0 \quad \text{on } \Gamma_N \times \mathbb{R}_{>0},$$
where ΓD and ΓN are Dirichlet and Neumann boundaries, respectively, so that ΓD ∪ ΓN = ∂Ω. Let
E(t) be the total energy at time t given by
$$E(t) \equiv \frac{1}{2}\int_\Omega \left( \left(\frac{\partial u}{\partial t}\right)^2 + \nabla u \cdot \nabla u \right) dx.$$

Then the total energy is conserved:

E(t) = E(t = 0) ∀t ∈ R>0 .

Proof. Let us introduce the kinetic and potential energy


$$K \equiv \frac{1}{2}\int_\Omega \left(\frac{\partial u}{\partial t}\right)^2 dx,$$
$$P \equiv \frac{1}{2}\int_\Omega \nabla u \cdot \nabla u\, dx,$$

so that E = K + P . The change in the kinetic energy with time is

$$\frac{dK}{dt} = \frac{1}{2}\frac{d}{dt}\int_\Omega \left(\frac{\partial u}{\partial t}\right)^2 dx = \int_\Omega \frac{\partial u}{\partial t}\frac{\partial^2 u}{\partial t^2}\, dx = \int_\Omega \frac{\partial u}{\partial t}\,\Delta u\, dx$$
$$= -\int_\Omega \nabla\frac{\partial u}{\partial t} \cdot \nabla u\, dx + \underbrace{\int_{\Gamma_D} \frac{\partial u}{\partial t}\frac{\partial u}{\partial n}\, ds}_{=0 \text{ as } u = 0 \,\Rightarrow\, \partial u/\partial t = 0} + \underbrace{\int_{\Gamma_N} \frac{\partial u}{\partial t}\frac{\partial u}{\partial n}\, ds}_{=0 \text{ as } \partial u/\partial n = 0}$$
$$= -\int_\Omega \frac{1}{2}\frac{\partial}{\partial t}\left( \nabla u \cdot \nabla u \right) dx = -\frac{1}{2}\frac{d}{dt}\int_\Omega \nabla u \cdot \nabla u\, dx = -\frac{dP}{dt};$$

here the third equality follows from the definition of the wave equation, the fourth equality follows
from the integration by parts, and the last equality follows from the definition of the potential
energy. The change in the kinetic energy is exactly the negative of the change in the potential
energy. It follows that dE/dt = dK/dt + dP/dt = 0, which implies E(t) = E(t = 0).

We note that this energy conservation (or equality) for the wave equation is different from the
energy stability (or inequality) for the heat equation. This again agrees with our physical intuition
that in phenomena modeled by the wave equation the energy is conserved and propagated, whereas
in phenomena modeled by the heat equation the energy is dissipated through diffusion.
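Theorem 13.2 can also be illustrated numerically. Below we take the standing-wave solution u(x, t) = cos(πt) sin(πx) of the wave equation on (0, 1) with homogeneous Dirichlet conditions (whose exact total energy is π²/4) and evaluate E(t) by trapezoidal quadrature at several times; the specific times and quadrature resolution are arbitrary choices for illustration.

```python
import numpy as np

def energy(t, nq=2000):
    """Total energy E(t) = (1/2) * integral of (u_t)^2 + (u_x)^2 for the
    standing wave u = cos(pi t) sin(pi x), by trapezoidal quadrature."""
    x = np.linspace(0.0, 1.0, nq)
    u_t = -np.pi * np.sin(np.pi * t) * np.sin(np.pi * x)   # kinetic term
    u_x = np.pi * np.cos(np.pi * t) * np.cos(np.pi * x)    # potential term
    integrand = 0.5 * (u_t**2 + u_x**2)
    dx = x[1] - x[0]
    return dx * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

energies = [energy(t) for t in (0.0, 0.13, 0.4, 0.77, 1.0)]
# each entry should equal pi^2 / 4, independent of t
```

Energy sloshes between the kinetic and potential terms, but their sum stays fixed at π²/4 up to quadrature error.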

13.7 Energy method: uniqueness


We now appeal to the energy conservation to show that the solution to an initial-boundary value
problem for the wave equation, if it exists, is unique.

Theorem 13.3 (uniqueness of the solution to the wave equation). Let Ω ⊂ Rn be a spatial domain
and R>0 be the time interval. There is at most one solution u : Ω × R>0 → R that satisfies

$$\frac{\partial^2 u}{\partial t^2} - \Delta u = f \quad \text{in } \Omega \times \mathbb{R}_{>0},$$
$$u = h_D \quad \text{on } \Gamma_D \times \mathbb{R}_{>0},$$
$$\frac{\partial u}{\partial n} = h_N \quad \text{on } \Gamma_N \times \mathbb{R}_{>0},$$
$$u = g_0 \quad \text{on } \Omega \times \{t = 0\},$$
$$\frac{\partial u}{\partial t} = g_1 \quad \text{on } \Omega \times \{t = 0\},$$

where f : Ω × R>0 → R is a source function, hD : ΓD × R>0 → R is the Dirichlet boundary function,


hN : ΓN × R>0 → R is the Neumann boundary function, and g0 : Ω → R and g1 : Ω → R specify
the initial conditions.

Proof. Suppose u1 and u2 are two solutions to the wave equation initial-boundary value problem.

Then their difference w ≡ u1 − u2 satisfies

$$\frac{\partial^2 w}{\partial t^2} - \Delta w = 0 \quad \text{in } \Omega \times \mathbb{R}_{>0},$$
$$w = 0 \quad \text{on } \Gamma_D \times \mathbb{R}_{>0},$$
$$\frac{\partial w}{\partial n} = 0 \quad \text{on } \Gamma_N \times \mathbb{R}_{>0},$$
$$w = 0 \quad \text{on } \Omega \times \{t = 0\},$$
$$\frac{\partial w}{\partial t} = 0 \quad \text{on } \Omega \times \{t = 0\}.$$

The total energy of the difference w, for any t ≥ 0, is

$$E(t) = \frac{1}{2}\int_\Omega \left( \left(\frac{\partial w}{\partial t}\right)^2 + \nabla w \cdot \nabla w \right) dx = \frac{1}{2}\int_\Omega \left( \left(\frac{\partial w}{\partial t}\right)^2 + \nabla w \cdot \nabla w \right) dx\, \bigg|_{t=0} = 0,$$

where the second equality follows from energy conservation, and the third equality follows from (i)
∂w/∂t(t = 0) = 0 by the initial condition and (ii) ∇w(t = 0) = 0 since w(t = 0) = 0 by the initial
condition. Since E(t) = 0 and the energy is the sum of two non-negative terms, it follows that
∂w/∂t = 0 for all t ≥ 0. Since w(t = 0) = 0, we conclude that w(t) = u1(t) − u2(t) = 0 for all t ≥ 0.

We recall that for the heat equation we could establish the uniqueness of the solution using
either the energy method or the maximum principle. Unfortunately there is no property that is
analogous to the maximum principle for the wave equation. The lack of maximum principle is one
of the key differentiators between the heat and wave equations.

13.8 Summary
We summarize key points of this lecture:

1. The wave equation describes phenomena including the vibration of a taut string and the
propagation of acoustic pressure wave.

2. Initial-boundary value problems associated with the wave equation is obtained by incorporat-
ing Dirichlet and/or Neumann boundary conditions and initial conditions on both the value
and time derivative of the state.

3. The wave equation can be nondimensionalized such that the nondimensionalized equation has
a unit wave speed.

4. The wave equation with homogeneous Dirichlet or Neumann boundary conditions conserves
the total energy, which is the sum of the kinetic and potential energy.

5. The wave equation with an appropriate set of initial and boundary conditions has a unique
solution. The uniqueness can be proven using the energy method.

6. There is no property that is analogous to the maximum principle for the wave equation.

Lecture 14

Wave equation: separation of


variables and d’Alembert’s formula


14.1 Introduction
In this lecture we solve the wave equation in a bounded one-dimensional spatial domain using
separation of variables. The procedure is identical to that used for the heat equation; however, we
will find that the solution to the wave equation can be interpreted as a superposition of standing
waves unlike the solution to the heat equation which decays in time. Moreover, we will find an
alternative representation for the solution in terms of traveling waves. We will exploit this latter
interpretation in the subsequent lectures.

14.2 Separation of variables


We consider a wave equation in (0, 1) × R>0 with homogeneous Dirichlet boundary conditions:

$$\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0 \quad \text{in } (0,1) \times \mathbb{R}_{>0}, \tag{14.1}$$
$$u = 0 \quad \text{on } \{0,1\} \times \mathbb{R}_{>0},$$
$$u = g \quad \text{on } (0,1) \times \{t = 0\},$$
$$\frac{\partial u}{\partial t} = h \quad \text{on } (0,1) \times \{t = 0\}.$$
We wish to find a series representation of the solution using separation of variables. We recall the
two steps of the procedure:

1. We seek a family of solutions of separable form un (x, t) = φn (x)Tn (t), n = 1, 2, . . . , that


satisfy the PDE and the boundary condition (but not necessarily the initial conditions).

2. We identify a linear combination of u1 , u2 , . . . that satisfies the initial conditions.

To begin, we substitute un (x, t) = φn (x)Tn (t) to the wave equation and obtain

$$\phi_n T_n'' - \phi_n'' T_n = 0.$$

We rearrange the expression to obtain

$$\frac{\phi_n''}{\phi_n} = \frac{T_n''}{T_n} = -\alpha_n;$$

here, we have appealed to the fact that φ00n /φn depends on x only and Tn00 /Tn depends on t only,
and hence the expressions must be equal to a constant, which we denote by −αn . We then impose
the boundary conditions to find

un (x = 0, t) = φn (0)Tn (t) = 0 ⇒ φn (0) = 0,


un (x = 1, t) = φn (1)Tn (t) = 0 ⇒ φn (1) = 0;

note that we choose φn (0) = φn (1) = 0 because the other option, Tn (t) = 0, would result in
the trivial solution un = 0. It follows that the spatial functions must satisfy the boundary value
eigenproblem

$$-\phi_n'' = \alpha_n \phi_n \quad \text{in } (0,1),$$
$$\phi_n = 0 \quad \text{on } \{0,1\}.$$

This is a Sturm-Liouville problem, and hence we expect the φn to form a complete set of orthonormal
functions. In fact, this very Sturm-Liouville problem was considered in Example 4.6: the eigenfunctions
are given by

$$\phi_n(x) = \sin(n\pi x),$$

and the eigenvalues are αn = n²π², for n = 1, 2, . . . .
Given αn = n²π², the temporal function must satisfy, for each n, the ODE

$$-T_n'' = \alpha_n T_n \quad \Rightarrow \quad -T_n'' = n^2\pi^2 T_n.$$

The solution to the ODE is of the form

$$T_n(t) = A_n \cos(n\pi t) + B_n \sin(n\pi t).$$

Hence our separable solutions that satisfy the wave equation and the boundary condition are of the
form

$$u_n(x,t) = A_n \cos(n\pi t)\sin(n\pi x) + B_n \sin(n\pi t)\sin(n\pi x).$$
Note that because the wave equation is second-order in time, we obtain a family of functions with
two unknown coefficients; this is unlike the heat equation which yields a family of functions with
one unknown coefficient.
Our general solution is a linear combination of the separable solutions, hence

$$u(x,t) = \sum_{n=1}^{\infty} \left( A_n \cos(n\pi t)\sin(n\pi x) + B_n \sin(n\pi t)\sin(n\pi x) \right). \tag{14.2}$$

We now impose the initial conditions to obtain

$$u(x, t = 0) = \sum_{n=1}^{\infty} A_n \sin(n\pi x) = g(x),$$
$$\frac{\partial u}{\partial t}(x, t = 0) = \sum_{n=1}^{\infty} B_n n\pi \sin(n\pi x) = h(x);$$

here, we have assumed that the solution is sufficiently regular so that the series may be differentiated
term by term. We note that the first and second equations are the Fourier sine series for g and h,
respectively. We hence obtain

$$A_n = \hat g_n = 2\int_0^1 \sin(n\pi x)\, g(x)\, dx,$$
$$B_n n\pi = \hat h_n = 2\int_0^1 \sin(n\pi x)\, h(x)\, dx \quad \Rightarrow \quad B_n = \frac{\hat h_n}{n\pi} = \frac{2}{n\pi}\int_0^1 \sin(n\pi x)\, h(x)\, dx.$$

Hence the solution to (14.1) is given by

$$u(x,t) = \sum_{n=1}^{\infty} \left( \hat g_n \cos(n\pi t) + \frac{\hat h_n}{n\pi} \sin(n\pi t) \right) \sin(n\pi x), \tag{14.3}$$

where we have grouped the time-dependent terms. We make a few observations:


1. Interpretation as standing waves. Consider the special case where the initial condition is
g(x) = sin(mπx) for some integer m and h = 0. The solution to the problem is then given by

u(x, t) = cos(mπt) sin(mπx).

For any fixed t ∈ R>0 , the snapshot of the solution is u(x, t) ∝ sin(mπx); i.e., the snapshot
has the same shape (up to scaling) at any time. For any fixed x ∈ (0, 1), the variation in the
solution in time is u(x, t) ∝ cos(mπt); i.e., the amplitude varies sinusoidally in time. Hence,
as the name suggests, the wave equation admits standing waves as solutions.

2. Oscillatory Fourier coefficients. For any fixed t, the solution u(·, t) in (14.3) can be interpreted
as the Fourier sine series $\sum_{n=1}^{\infty} a_n(t)\sin(n\pi x)$ with time-varying coefficients
$a_n(t) = \hat g_n \cos(n\pi t) + \frac{\hat h_n}{n\pi}\sin(n\pi t)$, which are oscillatory. This is in
contrast to the coefficients for the heat equation, whose solution is given by

$$u(x,t) = \sum_{n=1}^{\infty} \hat g_n \exp(-n^2\pi^2 t)\sin(n\pi x);$$

the Fourier coefficients for the heat equation, $a_n(t) = \hat g_n \exp(-n^2\pi^2 t)$, decay exponentially
in time.

3. (Lack of ) smoothness. For any fixed t, the solution to the wave equation u(·, t) is not nec-
essarily smooth if the initial conditions g and h are not smooth. This is in contrast to the
solution to the heat equation which is infinitely differentiable for any t > 0 even if the ini-
tial condition is discontinuous. We recall that the solution of the heat equation is smooth

157
(a) snapshots (b) space-time contours

Figure 14.1: Solution to the one-dimensional wave equation with the initial condition g = sin(2πx)
and h = 0.

because the Fourier coefficients decay exponentially with the mode number n for t > 0; as
the wave equation does not have this property, the solution is not necessarily smooth. (This
lack of smoothing effect will become more apparent for another solution representation that
we introduce next.)
Example 14.1 (wave equation by separation of variables). We consider the solution of the initial-
boundary value problem (14.1) for the initial condition g(x) = sin(2πx) and h = 0. By inspection,
we find that the Fourier sine series coefficients are ĝ2 = 1 and ĝi = 0, i ≠ 2. The solution is hence
given by

u(x, t) = cos(2πt) sin(2πx).
Figure 14.1 visualizes the solution as temporal snapshots and as a space-time contour. We observe
a standing wave with a unit wavelength and a unit period.
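The series solution (14.3) is easy to evaluate numerically. The sketch below (a hypothetical helper of our design) computes the Fourier sine coefficients ĝn and ĥn by midpoint quadrature and sums a truncated series; for g = sin(2πx) and h = 0 it reproduces the standing wave of this example.

```python
import numpy as np

def series_solution(g, h, x, t, nmax=50, nq=4000):
    """Evaluate the separation-of-variables solution (14.3) of the Dirichlet
    problem (14.1); ghat_n and hhat_n are Fourier sine coefficients of g and h
    computed by midpoint quadrature on (0, 1)."""
    xi = (np.arange(nq) + 0.5) / nq                 # midpoint quadrature nodes
    u = np.zeros_like(np.asarray(x, dtype=float))
    for n in range(1, nmax + 1):
        ghat = 2.0 * np.mean(np.sin(n * np.pi * xi) * g(xi))
        hhat = 2.0 * np.mean(np.sin(n * np.pi * xi) * h(xi))
        u += (ghat * np.cos(n * np.pi * t)
              + hhat / (n * np.pi) * np.sin(n * np.pi * t)) * np.sin(n * np.pi * x)
    return u
```

For g = sin(2πx), the quadrature picks out ĝ2 ≈ 1 and all other coefficients vanish, so the series collapses to u(x, t) = cos(2πt) sin(2πx).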

14.3 d’Alembert’s formula


We can further simplify the series solution (14.2) to derive d’Alembert’s formula, which reveals
more properties of the wave equation. To simplify the derivation we consider the solution for two
cases — (i) g ≠ 0 and h = 0 and (ii) h ≠ 0 and g = 0 — and then take the superposition of the
two solutions.
Case 1: g ≠ 0 and h = 0. We set ĥn = 0 in (14.3) to obtain

$$u(x,t) = \sum_{n=1}^{\infty} \hat g_n \sin(n\pi x)\cos(n\pi t) = \frac{1}{2}\sum_{n=1}^{\infty} \hat g_n \left( \sin(n\pi(x - t)) + \sin(n\pi(x + t)) \right),$$

where the second equality follows from $\sin(n\pi x)\cos(n\pi t) = \frac{1}{2}(\sin(n\pi(x - t)) + \sin(n\pi(x + t)))$. We
recall that ĝn are the Fourier sine coefficients of g and hence, assuming g^{o,p} is sufficiently regular,

$$\sum_{n=1}^{\infty} \hat g_n \sin(n\pi\xi) = g^{o,p}(\xi) \quad \forall \xi \in \mathbb{R},$$

where g^{o,p} : R → R is the odd periodic extension of g : (0, 1) → R. Hence,

$$u(x,t) = \frac{1}{2}\sum_{n=1}^{\infty} \hat g_n \left( \sin(n\pi(x - t)) + \sin(n\pi(x + t)) \right) = \frac{1}{2}\left[ g^{o,p}(x - t) + g^{o,p}(x + t) \right].$$

Case 2: h ≠ 0 and g = 0. We set ĝn = 0 in (14.3) to obtain

$$u(x,t) = \sum_{n=1}^{\infty} \frac{\hat h_n}{n\pi} \sin(n\pi x)\sin(n\pi t) = \frac{1}{2}\sum_{n=1}^{\infty} \frac{\hat h_n}{n\pi} \left( \cos(n\pi(x - t)) - \cos(n\pi(x + t)) \right),$$

where the second equality follows from $\sin(n\pi x)\sin(n\pi t) = \frac{1}{2}(\cos(n\pi(x - t)) - \cos(n\pi(x + t)))$. We
recall that ĥn are the Fourier sine coefficients of h and hence, assuming h^{o,p} is sufficiently regular,

$$h^{o,p}(\xi) = \sum_{n=1}^{\infty} \hat h_n \sin(n\pi\xi) \quad \forall \xi \in \mathbb{R},$$

where h^{o,p} : R → R is the odd periodic extension of h : (0, 1) → R. We integrate the expression
over (a, b) to obtain

$$\int_a^b h^{o,p}(\xi)\, d\xi = \sum_{n=1}^{\infty} \frac{\hat h_n}{n\pi} \left[ -\cos(n\pi\xi) \right]_{\xi=a}^{b} = \sum_{n=1}^{\infty} \frac{\hat h_n}{n\pi} \left( \cos(n\pi a) - \cos(n\pi b) \right).$$

It hence follows that

$$u(x,t) = \frac{1}{2}\sum_{n=1}^{\infty} \frac{\hat h_n}{n\pi} \left( \cos(n\pi(x - t)) - \cos(n\pi(x + t)) \right) = \frac{1}{2}\int_{x-t}^{x+t} h^{o,p}(\xi)\, d\xi.$$

We take the superposition of the two solutions and find that the solution to the initial-boundary
value problem (14.1) is

$$u(x,t) = \frac{1}{2}\left[ g^{o,p}(x - t) + g^{o,p}(x + t) \right] + \frac{1}{2}\int_{x-t}^{x+t} h^{o,p}(\xi)\, d\xi. \tag{14.4}$$

This is d'Alembert's formula for the wave equation.
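d'Alembert's formula is straightforward to implement. The sketch below (a hypothetical helper, shown for the case h = 0) evaluates (14.4) by constructing the odd periodic extension explicitly:

```python
import numpy as np

def odd_periodic(g):
    """Return the odd periodic extension (period 2) of g : (0,1) -> R."""
    def gop(s):
        s = np.mod(np.asarray(s, dtype=float), 2.0)     # reduce to [0, 2)
        return np.where(s < 1.0, g(s), -g(2.0 - s))     # odd reflection
    return gop

def dalembert(g, x, t):
    """Evaluate d'Alembert's formula (14.4) for h = 0:
    u(x,t) = (g_op(x - t) + g_op(x + t)) / 2."""
    gop = odd_periodic(g)
    return 0.5 * (gop(x - t) + gop(x + t))
```

For g = sin(2πx), the odd periodic extension is sin(2πx) itself, and the formula reduces to the standing wave cos(2πt) sin(2πx) of Example 14.1.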
We now use d’Alembert’s formula to sketch the solution to the wave equation.
Example 14.2 (d’Alembert’s formula: Dirichlet boundary condition). We consider the initial-
boundary value problem (14.1) for the initial condition g(x) = exp(−(x − 1/2)2 /0.052 ) and h = 0.
We wish to sketch the solution at various times using d’Alembert’s formula (14.4), as shown in
Figure 14.2. We proceed as follows:
1. We first sketch two odd periodic extensions of g(x, t = 0), each scaled by the factor of 1/2.
In Figure 14.2, the right-traveling wave (1/2)g^{o,p}(x − t) and the left-traveling wave
(1/2)g^{o,p}(x + t) are shown in blue and orange, respectively. The waves coincide at time t = 0.

2. To sketch the solution at time t, we now slide the right-traveling (blue) wave and left-traveling
(orange) wave by t. Even though only the solution over (0, 1) is physically relevant, we
slide the entire periodic extension. The solution is given by the sum of the two waves:
u(x, t) = (1/2)(g^{o,p}(x − t) + g^{o,p}(x + t)).

Figure 14.2: Solution to the one-dimensional wave equation with Dirichlet boundary conditions
and the initial condition g(x) = exp(−(x − 1/2)²/0.05²) and h = 0. The blue and orange waves
are the right- and left-traveling odd periodic extension g^{o,p}, respectively. The red wave is the sum
of the two waves evaluated over the domain (0, 1).

Figure 14.3: Solution to the one-dimensional wave equation with Dirichlet boundary conditions and
the initial condition g(x) = exp(−(x − 1/2)²/0.05²) and h = 0: (a) snapshots; (b) space-time
contours. The snapshots are in ∆t = 1/10 intervals.

This two-step procedure applies at any t. The effect of the wave reflecting off the boundary is
captured by the odd periodic extension entering the physical domain (0, 1). We observe that the
solution “inverts” when it reflects at the boundary at t ≈ 1/2; the wave is inverted because the
odd-periodic extension is inverted in (−1, 0) and (1, 2), and these inverted waves enter the domain
(0, 1) for t ∈ (1/2, 1).
Figure 14.3 shows the solution as snapshots and a space-time contour.

We conclude the section with a few observations about the d’Alembert’s formula:

1. Interpretation as traveling waves. Consider the case h = 0. The solution is the superposition
of two traveling waves: the left-traveling wave (1/2)g^{o,p}(x + t) and the right-traveling wave
(1/2)g^{o,p}(x − t).

2. Reflection at boundaries. When the wave reflects off the boundary, the reflection induces
a change of sign because we consider the odd periodic extension; i.e., the reflected wave is
“inverted.”

3. Lack of smoothness (g ≠ 0 and h = 0). Consider the case of g ≠ 0 and h = 0. Because
the solution is a superposition of two waves associated with the (odd periodic extension of
the) initial condition, the solution u(·, t) at time t is in general no smoother than the initial
condition g. For instance, if g^{o,p} is (say) only twice differentiable, then u(·, t) is also only
twice differentiable. This is unlike the heat equation, for which the solution u(·, t) is infinitely
differentiable even if the initial condition is discontinuous.

4. Lack of smoothness (h ≠ 0 and g = 0). Similar to the case above, the solution u(·, t) at time
t is only once more differentiable than the initial function h^{o,p}. For instance, if h^{o,p} is only
once differentiable, then u(·, t) is only twice differentiable. (The additional regularity arises
because we integrate h.)

We will later revisit d’Alembert’s formula and its interpretation using the method of characteristics.

14.4 Neumann boundary conditions
We now consider the solution of the wave equation with Neumann boundary conditions:

$$\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0 \quad \text{in } (0,1) \times \mathbb{R}_{>0}, \tag{14.5}$$
$$\frac{\partial u}{\partial x} = 0 \quad \text{on } \{0,1\} \times \mathbb{R}_{>0},$$
$$u = g \quad \text{on } (0,1) \times \{t = 0\},$$
$$\frac{\partial u}{\partial t} = h \quad \text{on } (0,1) \times \{t = 0\}.$$
As many of the steps to solving the problem are identical to the Dirichlet boundary condition
case, we provide an abbreviated exposition here. As before, we first find a family of separable
solutions of the form un (x, t) = φn (x)Tn (t) that satisfies the homogeneous PDE and boundary
conditions. The substitution of the separable form to the PDE and boundary conditions yield a
spatial eigenproblem:

$$-\phi_n'' = \alpha_n \phi_n \quad \text{in } (0,1),$$
$$\phi_n' = 0 \quad \text{on } \{0,1\}.$$

This is a Sturm-Liouville problem, whose solutions are given by

$$\phi_n(x) = \cos(n\pi x) \quad \text{and} \quad \alpha_n = n^2\pi^2 \quad \text{for } n = 0, 1, 2, \dots.$$

The associated temporal ODE is

$$-T_n'' = \alpha_n T_n \quad \Rightarrow \quad -T_n'' = n^2\pi^2 T_n, \quad n = 0, 1, 2, \dots.$$

The solutions are given by

$$T_n(t) = \begin{cases} \frac{1}{2}A_n + \frac{1}{2}B_n t, & n = 0, \\ A_n \cos(n\pi t) + B_n \sin(n\pi t), & n = 1, 2, \dots. \end{cases}$$

(We here scale the n = 0 term by 1/2 for convenience.) The general solution is hence given by

$$u(x,t) = \frac{1}{2}A_0 + \frac{1}{2}B_0 t + \sum_{n=1}^{\infty} \left( A_n \cos(n\pi t) + B_n \sin(n\pi t) \right) \cos(n\pi x).$$

We now impose the initial conditions:

$$u(x, t = 0) = \frac{1}{2}A_0 + \sum_{n=1}^{\infty} A_n \cos(n\pi x) = g(x),$$
$$\frac{\partial u}{\partial t}(x, t = 0) = \frac{1}{2}B_0 + \sum_{n=1}^{\infty} B_n n\pi \cos(n\pi x) = h(x).$$

The first condition implies that An must be the Fourier cosine coefficients of g: i.e.,

$$A_n = \hat g_n \equiv 2\int_0^1 \cos(n\pi x)\, g(x)\, dx, \quad n = 0, 1, 2, \dots.$$

The second condition implies that B0, B1 · 1π, B2 · 2π, . . . must be the Fourier cosine coefficients of h,
and hence

$$B_n = \begin{cases} \hat h_n, & n = 0, \\ \dfrac{\hat h_n}{n\pi}, & n = 1, 2, \dots \end{cases} \qquad \text{where} \quad \hat h_n \equiv 2\int_0^1 \cos(n\pi x)\, h(x)\, dx.$$

The solution is hence given by

$$u(x,t) = \frac{1}{2}\hat g_0 + \frac{1}{2}\hat h_0 t + \sum_{n=1}^{\infty} \left( \hat g_n \cos(n\pi t) + \frac{\hat h_n}{n\pi} \sin(n\pi t) \right) \cos(n\pi x),$$

which is a series representation of the solution to (14.5).


We can also find d'Alembert's formula for the Neumann boundary condition case using a similar
approach as the Dirichlet boundary condition case. First, for g ≠ 0 and h = 0, we observe that

$$u(x,t) = \frac{1}{2}\hat g_0 + \sum_{n=1}^{\infty} \hat g_n \cos(n\pi x)\cos(n\pi t) = \frac{1}{2}\hat g_0 + \frac{1}{2}\sum_{n=1}^{\infty} \hat g_n \left( \cos(n\pi(x + t)) + \cos(n\pi(x - t)) \right)$$
$$= \frac{1}{2}\left( g^{e,p}(x + t) + g^{e,p}(x - t) \right),$$

where the second equality follows from a trigonometric identity, and the last equality follows from
the fact that the Fourier cosine series of g converges to its even periodic extension g^{e,p} (assuming
sufficient regularity). Second, for g = 0 and h ≠ 0, we observe that

$$u(x,t) = \frac{1}{2}\hat h_0 t + \sum_{n=1}^{\infty} \frac{\hat h_n}{n\pi} \cos(n\pi x)\sin(n\pi t) = \frac{1}{2}\hat h_0 t + \frac{1}{2}\sum_{n=1}^{\infty} \frac{\hat h_n}{n\pi} \left( \sin(n\pi(x + t)) - \sin(n\pi(x - t)) \right).$$

We also note that, assuming h^{e,p} is sufficiently regular,

$$h^{e,p}(\xi) = \frac{1}{2}\hat h_0 + \sum_{n=1}^{\infty} \hat h_n \cos(n\pi\xi) \quad \forall \xi \in \mathbb{R},$$

and hence

$$\int_a^b h^{e,p}(\xi)\, d\xi = \frac{1}{2}\hat h_0 (b - a) + \sum_{n=1}^{\infty} \frac{\hat h_n}{n\pi} \left( \sin(n\pi b) - \sin(n\pi a) \right).$$

We choose a = x − t and b = x + t to observe that

$$u(x,t) = \frac{1}{2}\hat h_0 t + \frac{1}{2}\sum_{n=1}^{\infty} \frac{\hat h_n}{n\pi} \left( \sin(n\pi(x + t)) - \sin(n\pi(x - t)) \right) = \frac{1}{2}\int_{x-t}^{x+t} h^{e,p}(\xi)\, d\xi.$$

Combining the expression with the solution to the g ≠ 0 and h = 0 case, we find that the solution
to (14.5) can be expressed as

$$u(x,t) = \frac{1}{2}\left( g^{e,p}(x - t) + g^{e,p}(x + t) \right) + \frac{1}{2}\int_{x-t}^{x+t} h^{e,p}(\xi)\, d\xi, \tag{14.6}$$

which is d'Alembert's formula for the Neumann boundary condition problem (14.5). The formula
is identical to the Dirichlet boundary case (14.4) with the exception of the use of the even periodic
extension instead of the odd periodic extension.
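In code, only the extension changes relative to the Dirichlet case. A minimal sketch (hypothetical helper, h = 0):

```python
import numpy as np

def even_periodic(g):
    """Return the even periodic extension (period 2) of g : (0,1) -> R."""
    def gep(s):
        s = np.mod(np.asarray(s, dtype=float), 2.0)    # reduce to [0, 2)
        return np.where(s < 1.0, g(s), g(2.0 - s))     # even reflection
    return gep

def dalembert_neumann(g, x, t):
    """Evaluate d'Alembert's formula (14.6) for h = 0:
    u(x,t) = (g_ep(x - t) + g_ep(x + t)) / 2."""
    gep = even_periodic(g)
    return 0.5 * (gep(x - t) + gep(x + t))
```

For g = cos(πx), the even periodic extension is cos(πx) itself, so the formula reduces to cos(πt) cos(πx), in agreement with the series solution with ĝ1 = 1.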
We now use d’Alembert’s formula to sketch the solution to the wave equation.

Figure 14.4: Solution to the one-dimensional wave equation with Neumann boundary conditions
and the initial condition g(x) = exp(−(x − 1/2)²/0.05²) and h = 0. The blue and orange waves are
the right- and left-traveling even periodic extension g^{e,p}, respectively. The red wave is the sum of
the two waves evaluated over the domain (0, 1).

Example 14.3 (d'Alembert's formula: Neumann boundary condition). We consider the solution
of the initial-boundary value problem (14.5) for the initial condition g(x) = exp(−(x − 1/2)²/0.05²)
and h = 0. We sketch the solution at various times using d'Alembert's formula (14.6). The
resulting sketches are shown in Figure 14.4. The procedure to sketch the solutions is identical to
Example 14.2, except that we consider the even periodic extension for the Neumann boundary
condition. As a result, the solution does not invert when it reflects at the boundary at t ≈ 1/2;
the wave is not inverted because the even-periodic extension in (−1, 0) and (1, 2) are not inverted,
and these waves enter the domain (0, 1) for t ∈ (1/2, 1).
Figure 14.5 shows the solution as snapshots and a space-time contour.

We make a few observations:

1. For the Neumann boundary problem, the series solution uses the Fourier cosine series and
the associated d’Alembert’s formula uses the even periodic extension of the initial conditions.

Figure 14.5: Solution to the one-dimensional wave equation with Neumann boundary conditions
and the initial condition g(x) = exp(−(x − 1/2)²/0.05²) and h = 0: (a) snapshots; (b) space-time
contours. The snapshots are in ∆t = 1/10 intervals.

This is unlike the Dirichlet boundary problem which uses the Fourier sine series and the odd
periodic extension of the initial conditions.

2. When the wave reflects off the boundary, the reflection does not induce a change of sign be-
cause we consider the even periodic extension. This is unlike the Dirichlet boundary problem
in which the reflection induces a change of sign.

3. In general, the solution is not smooth; it is only as differentiable as g^{e,p} and only once more
differentiable than h^{e,p}.

14.5 Summary
We summarize key points of this lecture:

1. The one-dimensional wave equation can be readily solved using separation of variables. The
solution is expressed in terms of the (generalized) Fourier coefficients of the initial conditions.

2. The Fourier coefficients for the wave equation are sinusoidal functions of time and oscillate,
which may be interpreted as standing waves. This is unlike the Fourier coefficients for the
heat equation, which decay exponentially in time.

3. For any fixed t, the solution to the wave equation is not necessarily smooth if the initial
condition is not smooth. This is again unlike the heat equation which smooths the solution.

4. d’Alembert’s formula allows us to interpret the solution to the wave equation as a super-
position of traveling waves associated with (appropriate periodic extensions of) the initial
conditions.

5. The solutions to both Dirichlet boundary and Neumann boundary problems can be expressed
as appropriate Fourier series or using d’Alembert’s formula.

6. The reflection induces a change of sign at a Dirichlet boundary but not at a Neumann bound-
ary.

Lecture 15

Transport equation: method of characteristics

©2019–2021 Masayuki Yano. Prepared for ESC384 Partial Differential Equations taught at the
University of Toronto.

15.1 Introduction
In the previous lecture we considered the wave equation, which is a second-order hyperbolic equation.
In this lecture we consider (arguably) a simpler first-order hyperbolic equation: the transport
equation. The transport equation, as the name suggests, describes the transport of quantities in space.
It also serves as a model equation for more complicated systems of first-order hyperbolic equations,
such as the Euler equations of compressible flow.

15.2 Derivation
We consider transport of some pollutant by a fluid. To begin, we introduce a domain Ω ⊂ Rn and
the associated fixed subdomain ω ⊂ Ω. The density of the pollutant at any point in the domain is
u : Rn → R, and the (vector-valued) velocity field of the fluid is b : Rn → Rn . We assume that the
pollutant perfectly follows the motion of the fluid. (Physically this assumption is satisfied if the
particles that constitute the pollutant are small enough.) We also assume that the velocity field is
divergence free: ∇ · b = 0. (Physically this corresponds to incompressible flows.)
We now appeal to the conservation of mass to derive the transport equation which describes
the evolution of the pollutant density u. We first note that the total mass of the pollutant in Ω is
M = ∫_Ω u dx. We next appeal to the conservation of mass to obtain

d/dt ∫_ω u dx = − ∫_{∂ω} u (b · ν) ds,

where the left-hand side is the rate of change of the pollutant mass in ω, the right-hand side is the
net rate at which pollutant mass enters ω, and ν is the outward-pointing unit normal on ∂ω. We
then make two observations: (i) since ω is fixed in time, the time derivative can be brought inside
the integral; (ii) the integral over ∂ω can be converted to an integral over ω using the divergence
theorem. We hence obtain

∫_ω ∂u/∂t dx = − ∫_ω ∇ · (bu) dx.

This relationship should hold for any ω ⊂ Ω, and hence the integrands must be equal to each other:

∂u/∂t = −∇ · (bu).

Because we assumed that the velocity field b is divergence free (i.e., ∇ · b = 0), we obtain ∇ · (bu) =
(∇ · b)u + b · ∇u = b · ∇u and hence

∂u/∂t + b · ∇u = 0.

This is the transport equation.

Remark 15.1 (transport equation and conservation principle). In the above, we derived the
transport equation from the conservation principle under the assumption that the velocity field b
is divergence-free. Following convention, from here on we define the transport equation as any PDE
of the form ∂u/∂t + b · ∇u = 0, regardless of whether b is divergence-free. Note that if b is not
divergence-free, then the transport equation does not express a conservation law.

15.3 Generalized transport equation


Before we present a solution procedure for the transport equation, we first generalize the transport
equation, as our procedure is quite flexible. Namely, we consider the case where the velocity b
at the point (x, t) depends on both the position (x, t) and the state u(x, t). Such a generalized
transport equation is given by

∂u/∂t (x, t) + b(x, t, u(x, t)) · ∇u(x, t) = 0,

where b : Rn × R>0 × R → Rn is the velocity field. We consider three cases:

• Case 1: b is constant. This is a constant-coefficient transport equation.

• Case 2: b depends on x and t but not on u. This is a variable-coefficient transport equation.

• Case 3: b depends on u. This is a nonlinear transport equation. (More precisely, it is a
quasilinear equation because it is linear in the derivatives ∂u/∂t and ∇u.)

15.4 Method of characteristics for homogeneous problems


We now present the method of characteristics for a homogeneous (generalized) transport equation.
We first introduce an initial value problem

∂u/∂t + b · ∇u = 0 in Rn × R>0,   (15.1)
u = g on Rn × {t = 0},

where the transport field b in general depends on the point (x, t) and the state u(x, t), and the
initial condition g : Rn → R is given.
To solve the problem, let us first assume that the solution u is continuously differentiable: i.e.,
u ∈ C1(Rn × R≥0). We then introduce a “trajectory” xc : R≥0 → Rn; note that this is a “curve” in
space parametrized by time, or a point whose location depends on t. We next denote the associated
solution along the trajectory by uc : R≥0 → R:

uc(t) ≡ u(xc(t), t).

We note that the solution uc evolves along this trajectory as follows:

d/dt (uc(t)) = d/dt u(xc(t), t) = ∂u/∂t + dxc/dt · ∇u.

We compare the expression for duc/dt with the (generalized) transport equation (15.1) and note the
following: if we choose a curve xc such that dxc/dt = b(xc(t), t, u(xc(t), t)), then

d/dt (uc(t)) = d/dt u(xc(t), t) = ∂u/∂t + b(xc(t), t, u(xc(t), t)) · ∇u = 0.   (15.2)

Along the curve xc such that dxc/dt (t) = b(xc(t), t, u(xc(t), t)), the solution uc(t) is therefore constant.
The curve xc is called the characteristic, and the pair of ODEs that governs the evolution of the
solution along the characteristic is called the characteristic equations.

Definition 15.2 (characteristic equations for a homogeneous transport equation). The characteristic
equations for a homogeneous transport equation ∂u/∂t + b · ∇u = 0 are the pair of ODEs

dxc/dt (t) = b(xc(t), t, uc(t)), t ∈ R>0,
duc/dt (t) = 0, t ∈ R>0.

The time-parametrized curve xc : R≥0 → Rn is called the characteristic.

Note that this is a pair of ODEs — and not a PDE — that can be readily integrated in time.
The solution at time t is given by

xc(t) = ∫_{τ=0}^{t} b(xc(τ), τ, uc(τ)) dτ + xc,0,
uc(t) = uc(0) = u(xc(0), 0) = g(xc,0),

where xc,0 ≡ xc(t = 0) is the initial point of the characteristic. The solution along the characteristic
xc is hence equal to the initial value g(xc,0) at the point xc,0.
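When the velocity field depends on x, t, or u, the characteristic ODEs rarely admit a closed form, but they can be integrated numerically. The following is a minimal Python sketch of this idea; the field b(x, t, u) = sin(t), the initial condition, and all names are illustrative assumptions, not taken from the notes:

```python
import numpy as np

def integrate_characteristic(x0, b, g, T, n_steps=20000):
    """Forward-Euler integration of the characteristic ODEs
    dx_c/dt = b(x_c, t, u_c), du_c/dt = 0 for a homogeneous equation,
    starting from x_c(0) = x0 and u_c(0) = g(x0)."""
    dt = T / n_steps
    x, u = x0, g(x0)
    for k in range(n_steps):
        x = x + dt * b(x, k * dt, u)  # u_c stays constant along x_c
    return x, u

# Assumed illustrative field (not from the notes): b(x, t, u) = sin(t),
# for which the exact characteristic is x_c(t) = x0 + 1 - cos(t).
b = lambda x, t, u: np.sin(t)
g = lambda x: np.exp(-x**2)
xT, uT = integrate_characteristic(0.3, b, g, T=2.0)
print(xT, 0.3 + 1.0 - np.cos(2.0))  # the two values agree closely
print(uT, g(0.3))                   # u_c equals the initial value g(x_c,0)
```

Because the exact characteristic for this field is xc(t) = x0 + 1 − cos(t), the two printed values agree to several digits, while uc remains exactly g(x0) along the characteristic, as the second ODE requires.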

Example 15.3 (constant-coefficient transport equation). We consider a transport equation

∂u/∂t + b · ∇u = 0 in Rn × R>0,
u = g on Rn × {t = 0},

where b ∈ Rn is a (constant) velocity vector, and g : Rn → R is the initial condition. The
characteristic takes a particularly simple form for this transport equation with a constant velocity.
The characteristic equation is

dxc/dt (t) = b ∀t ∈ R>0;

direct integration of the ODE shows that the solution is given by

xc(t) = ∫_{τ=0}^{t} b dτ + xc,0 = bt + xc,0.

Since the solution u is constant along the characteristic xc (for a homogeneous equation), we obtain

uc(t) = u(xc(t), t) = u(bt + xc,0, t) = g(xc,0).

We propagate “backward” and choose xc,0 = x − bt to obtain the solution

u(x, t) = g(x − bt).

This is the general solution to the homogeneous transport equation with a constant coefficient.
We now compare the behavior of the solutions to the transport equation and the heat equation.
To this end, we note that the solutions to the two equations for a given initial condition g can be
expressed as

transport: u(x, t) = g(x − bt),
heat: u(x, t) = ∫_{Rn} Φ(x − ξ, t) g(ξ) dξ,

where Φ is the fundamental solution of the heat equation, which is smooth and nonzero everywhere.
We make a few observations:
1. Transport. In the transport equation, the initial condition g is “transported” at the velocity
b. This is consistent with our intuition for transport phenomena.
2. Limited domain of influence/dependence. The solution to the transport equation at (x, t)
depends only on the initial condition at the point xc,0 = x − bt from which the characteristic
originates. Conversely, the initial condition at ξ ∈ Rn only affects the solution along its
characteristic. This is unlike for the heat equation where the solution at (x, t) depends on
the initial condition everywhere, and the initial condition at ξ affects the solution u(·, t)
everywhere for any t > 0.
3. Finite propagation speed. For the transport equation, a change in the initial condition at a
point xc,0 is felt only along the characteristic xc(t) = bt + xc,0. As the point xc(t) moves at the
velocity b (since dxc/dt = b), the change in the initial condition also propagates at this finite
velocity. This is unlike for the heat equation, where a disturbance propagates at infinite speed.

4. Lack of smoothness. For the transport equation, the solution u(·, t) at time t has the same
regularity as the initial condition g. If g is (only) once continuously differentiable, so is u(·, t).
This is like the wave equation — another hyperbolic equation — where d'Alembert's formula
shows that the solution u(·, t) is only as regular as the initial condition. This is unlike the
heat equation — a parabolic equation — where the solution u(·, t) is smooth (i.e., infinitely
differentiable) even if the initial condition is discontinuous.

(a) characteristics (b) solution snapshots

Figure 15.1: The characteristics and solution snapshots for the constant-coefficient transport equa-
tion in Example 15.3.

Example 15.4. Consider a one-dimensional transport equation

∂u/∂t + ∂u/∂x = 0 in R × R>0,
u(x, t = 0) = g(x) = exp(−x²), ∀x ∈ R.

This is a homogeneous transport equation with the constant coefficient b = 1. Following the steps
used in Example 15.3, we observe that the solution is given by

u(x, t) = g(x − t) = exp(−(x − t)²).

The characteristics and the snapshots of the solution are shown in Figure 15.1. The characteristics
are parallel lines of slope 1 in the space-time domain, and the solution translates at a constant
speed of 1.
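One quick way to sanity-check such a closed-form solution is to evaluate the PDE residual with central finite differences. A minimal sketch (the test point and step size below are arbitrary choices, not from the notes):

```python
import numpy as np

# Finite-difference residual check (a sketch): u(x,t) = g(x - t) with
# g(x) = exp(-x^2) should satisfy u_t + u_x = 0 at every point.
u = lambda x, t: np.exp(-(x - t)**2)
eps = 1e-5                           # step size: arbitrary small value
x, t = 0.7, 1.3                      # test point: arbitrary choice
u_t = (u(x, t + eps) - u(x, t - eps)) / (2 * eps)
u_x = (u(x + eps, t) - u(x - eps, t)) / (2 * eps)
residual = u_t + u_x
print(residual)                      # close to zero
```

The residual is zero up to the O(eps²) truncation error of the central differences, consistent with u(x, t) = g(x − t) solving the PDE exactly.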

Example 15.5. Consider an initial value problem associated with a homogeneous transport equation
with a variable coefficient:

∂u/∂t + x ∂u/∂x = 0 in R × R>0,
u(x, t = 0) = g(x) = exp(−x²), x ∈ R.

The characteristics must satisfy

dxc/dt (t) = xc(t), t ∈ R>0.

The solution to this ODE is of the form xc(t) = xc,0 exp(t), where xc,0 is the point from which the
characteristic emanates. Since the PDE is homogeneous, the solution must be constant along each
characteristic, and we find that

u(xc(t), t) = u(xc,0 exp(t), t) = u(xc,0, t = 0) = g(xc,0).

(a) characteristics (b) solution snapshots

Figure 15.2: The characteristics and solution snapshots for Example 15.5.

We choose x = xc,0 exp(t), or xc,0 = x exp(−t), to obtain

u(x, t) = g(x exp(−t)) = exp(−x² exp(−2t)).

The characteristics and the snapshots of the solution are shown in Figure 15.2. The characteristics
spread apart at an increasingly faster rate over time. Accordingly, the solution spreads out at an
increasingly faster rate over time.
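The closed form obtained above can be sanity-checked against the PDE by central finite differences; the test points and step size below are arbitrary choices:

```python
import numpy as np

# Finite-difference check of the Example 15.5 solution (a sanity sketch):
# u(x,t) = exp(-x^2 exp(-2t)) should satisfy u_t + x u_x = 0.
u = lambda x, t: np.exp(-x**2 * np.exp(-2.0 * t))
eps = 1e-5
residuals = []
for x, t in [(0.5, 0.2), (-1.0, 1.0), (2.0, 0.7)]:  # arbitrary test points
    u_t = (u(x, t + eps) - u(x, t - eps)) / (2 * eps)
    u_x = (u(x + eps, t) - u(x - eps, t)) / (2 * eps)
    residuals.append(u_t + x * u_x)
print(residuals)  # all entries close to zero
```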
Remark 15.6. For the case with a constant coefficient, if g ∈ C 1 (Rn ), then u ∈ C 1 (Rn ×R>0 ), and
our assumption on the smoothness of u is also satisfied. It turns out that the regularity condition
can be relaxed if we consider weak solutions. We will consider this reformulation of the transport
equation which admits not only nondifferentiable but also discontinuous solutions in a later lecture.
The method of characteristics is in fact valid even if the initial condition (and hence the solution)
is not differentiable.

15.5 Method of characteristics for nonhomogeneous problems


We now consider an initial value problem associated with a nonhomogeneous (generalized) transport
equation

∂u/∂t + b · ∇u = f in Rn × R>0,   (15.3)
u = g on Rn × {t = 0},

where b : Rn × R>0 × R → Rn is the velocity field, f : Rn × R>0 → R is the source function, and
g : Rn → R is the initial condition.
As before we define our characteristic as the time-parametrized curve which satisfies dxc/dt =
b(xc(t), t, u(xc(t), t)). We then observe that the solution along the characteristic evolves as follows:

d/dt (uc(t)) = d/dt u(xc(t), t) = ∂u/∂t + dxc/dt · ∇u = f(xc(t), t).

Hence we obtain the following characteristic equations:

Definition 15.7 (characteristic equations for a nonhomogeneous transport equation). The
characteristic equations for a nonhomogeneous transport equation ∂u/∂t + b · ∇u = f are the pair of ODEs

dxc/dt (t) = b(xc(t), t, uc(t)), t ∈ R>0,
duc/dt (t) = f(xc(t), t), t ∈ R>0.

We note that the solution uc along the characteristic is not a constant (since duc/dt ≠ 0 in general),
but its value only depends on the initial condition and the values of f along the characteristic xc.
The pair of ODEs can be readily integrated in time to obtain

xc(t) = ∫_{τ=0}^{t} b(xc(τ), τ, uc(τ)) dτ + xc,0,
uc(t) = ∫_{τ=0}^{t} f(xc(τ), τ) dτ + g(xc,0),

where xc,0 is the starting point of the characteristic, and the second equation incorporates both the
source function and the initial condition.
Example 15.8 (nonhomogeneous transport equation with a constant coefficient). We consider a
transport equation

∂u/∂t + b · ∇u = f in Rn × R>0,
u = g on Rn × {t = 0},

with a constant velocity field b ∈ Rn, a source function f : Rn × R>0 → R, and an initial condition
g : Rn → R. The characteristic must satisfy dxc/dt = b, which for the constant velocity field yields
xc(t) = bt + xc,0. It hence follows that

uc(t) = u(bt + xc,0, t) = g(xc,0) + ∫_{τ=0}^{t} f(bτ + xc,0, τ) dτ.

We choose xc,0 = x − bt to obtain

u(x, t) = g(x − bt) + ∫_{τ=0}^{t} f(x − b(t − τ), τ) dτ.   (15.4)

As expected from linear superposition, the solution is the sum of the contributions from the initial
condition g and the source term f.
Example 15.9. Consider a one-dimensional transport equation

∂u/∂t (x, t) + 2 ∂u/∂x (x, t) = f(x, t) = cos(x) ∀x ∈ R, t ∈ R>0,
u(x, t = 0) = g(x) = exp(−x²) ∀x ∈ R.

Following the procedure in the previous example for b = 2, we obtain

u(x, t) = g(x − 2t) + ∫_{τ=0}^{t} f(x − 2(t − τ), τ) dτ = exp(−(x − 2t)²) + ∫_{τ=0}^{t} cos(x − 2(t − τ)) dτ
= exp(−(x − 2t)²) + (1/2) sin(x − 2(t − τ))|_{τ=0}^{t} = exp(−(x − 2t)²) + (1/2)(sin(x) − sin(x − 2t)).
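As a sanity check, the closed form of Example 15.9 can be tested against the PDE u_t + 2 u_x = cos(x) and the initial condition by central finite differences; the test point and step size below are arbitrary choices:

```python
import numpy as np

# Sanity check of the closed form of Example 15.9 by central differences:
# u should satisfy u_t + 2 u_x = cos(x) and u(x,0) = exp(-x^2).
u = lambda x, t: np.exp(-(x - 2.0*t)**2) + 0.5 * (np.sin(x) - np.sin(x - 2.0*t))
eps = 1e-5
x, t = 0.4, 0.9                      # arbitrary test point
u_t = (u(x, t + eps) - u(x, t - eps)) / (2 * eps)
u_x = (u(x + eps, t) - u(x - eps, t)) / (2 * eps)
residual = u_t + 2.0 * u_x - np.cos(x)
print(residual)                      # close to zero
print(u(x, 0.0) - np.exp(-x**2))     # initial condition: exactly zero
```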

15.6 Nonhomogeneous problems: Duhamel’s principle revisited
We recall Duhamel’s principle (Theorem 7.9), which allows us to express the solution to a non-
homogeneous heat equation in terms of the solution to a family of homogeneous heat equations.
Duhamel’s principle in fact applies to any equation that is first-order in time. We here provide a
general statement:
Theorem 15.10 (Duhamel’s principle). Consider the initial value problem

∂u/∂t + Lu = f in Rn × R>0,   (15.5)
u = g on Rn × {t = 0},

where L is a linear differential operator (e.g., the transport operator b · ∇ or the Laplacian operator
−∆), f : Rn × R>0 → R is the source function, and g : Rn → R is the initial condition. For any fixed
s ∈ R>0, let u_f^s : Rn × R>s → R be the solution to

∂u_f^s/∂t + L u_f^s = 0 in Rn × R>s,   (15.6)
u_f^s = f(·, s) on Rn × {t = s},

and u_g : Rn × R>0 → R be the solution to

∂u_g/∂t + L u_g = 0 in Rn × R>0,   (15.7)
u_g = g on Rn × {t = 0}.

Then the solution to (15.5) is given by

u(x, t) = u_g(x, t) + ∫_{s=0}^{t} u_f^s(x, t) ds, x ∈ Rn, t ∈ R>0.   (15.8)

Proof. We first consider the case of f ≠ 0 and g = 0 (and hence u_g = 0). We differentiate (15.8)
(for u_g = 0) in time to obtain

∂u/∂t (x, t) = ∂/∂t ( ∫_{s=0}^{t} u_f^s(x, t) ds ) = u_f^t(x, t) + ∫_{s=0}^{t} ∂u_f^s/∂t (x, t) ds = f(x, t) − ∫_{s=0}^{t} L u_f^s(x, t) ds
= f(x, t) − L ( ∫_{s=0}^{t} u_f^s(x, t) ds ) = f(x, t) − Lu(x, t).

Here the first equality follows from the differentiation of (15.8) with respect to the integration
bound and the integrand, the second equality follows from (15.6) and the end condition u_f^t(x, t) =
f(x, t), the third equality follows from the fact that the spatial operator L commutes with the
integration in s, and the last equality follows from (15.8). Hence, the solution (15.8) satisfies the
PDE. We also observe that

u(x, t = 0) = ∫_{s=0}^{t=0} u_f^s(x, t = 0) ds = 0, x ∈ Rn,

as the integrand u_f^s is bounded; the solution hence satisfies the homogeneous initial condition
g = 0 for this case.

We next consider the case of f = 0 and g ≠ 0. This case is associated with (15.7), and hence
the solution is given by u_g by definition.
We finally appeal to the linear superposition of the solutions for (i) f ≠ 0 and g = 0 and (ii)
f = 0 and g ≠ 0 to obtain (15.8).

Example 15.11. We now apply Duhamel’s principle to the transport equation with a constant
velocity field:

∂u/∂t + b · ∇u = f in Rn × R>0,
u = g on Rn × {t = 0}.

The family of homogeneous problems of interest is

∂u^s/∂t + b · ∇u^s = 0 in Rn × (s, ∞),
u^s = f(·, s) on Rn × {t = s}.

Using the method of characteristics, the solution, for any given s, is given by

u^s(x, t) = f(x − b(t − s), s).

We also recall that the solution to the homogeneous transport equation is given by

u_g(x, t) = g(x − bt).

The application of Duhamel’s principle thus yields the solution

u(x, t) = u_g(x, t) + ∫_{s=0}^{t} u^s(x, t) ds = g(x − bt) + ∫_{s=0}^{t} f(x − b(t − s), s) ds,

which, as we expect, is identical to (15.4) obtained using a more direct approach.
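The Duhamel superposition can also be evaluated numerically. The sketch below reuses the data of Example 15.9 (b = 2, f(x, t) = cos(x)), approximates the integral of u^s(x, t) over s in (0, t) by the trapezoidal rule, and compares it with the closed-form value (1/2)(sin(x) − sin(x − 2t)); the evaluation point and quadrature resolution are arbitrary choices:

```python
import numpy as np

# Duhamel's principle for the data of Example 15.9 (b = 2, f = cos(x)):
# superpose u^s(x,t) = f(x - b(t - s), s) over s in (0, t) by trapezoidal
# quadrature and compare with the closed form (1/2)(sin x - sin(x - 2t)).
b = 2.0
x, t = 0.4, 0.9                       # arbitrary evaluation point
s = np.linspace(0.0, t, 2001)         # quadrature nodes: arbitrary resolution
vals = np.cos(x - b * (t - s))        # u^s(x, t) for each s
duhamel = float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(s)))
closed = 0.5 * (np.sin(x) - np.sin(x - b * t))
print(duhamel, closed)                # the two values agree closely
```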

15.7 Initial-boundary value problems


We now consider an initial-boundary value problem in a bounded spatial domain Ω ⊂ Rn. Because
the transport equation is first-order in space, a boundary condition is only imposed on the inflow
boundary

Γin ≡ {x ∈ ∂Ω | b(x) · ν(x) < 0},

where ν is the outward-pointing unit normal vector on the boundary. No boundary condition is
imposed on the outflow boundary

Γout ≡ ∂Ω \ Γin.

This agrees with our physical intuition: if u models the concentration of a pollutant, then we
can only specify the concentration at the places where the pollutant enters, and not leaves, the
domain. In addition, the only type of boundary condition that can be imposed is of the Dirichlet
type. Hence, given a spatial domain Ω ⊂ Rn, the initial-boundary value problem associated with
the transport equation takes the following form:

∂u/∂t + b · ∇u = f in Ω × R>0,
u = h on Γin × R>0,
u = g on Ω × {t = 0},

where f : Ω × R>0 → R is the source function, h : Γin × R>0 → R is the (in general time-dependent)
boundary function, and g : Ω → R is the initial condition.
We can solve the initial-boundary value problem using the method of characteristics. We here
provide a concrete one-dimensional example.
Example 15.12. We consider a one-dimensional initial-boundary value problem on a half line:

∂u/∂t + b ∂u/∂x = 0 in R>0 × R>0,
u = h on {x = 0} × R>0,
u = g on R>0 × {t = 0},

where b > 0 is a fixed advection speed, h : R>0 → R is a boundary condition function, and
g : R>0 → R is an initial condition function. Note that we require b > 0 so that {x = 0} is an
inflow boundary, where a boundary condition can be imposed.
To solve the problem, we again note that the ODE for the characteristics is

dxc/dt (t) = b.

We integrate the ODE from t0 to t to obtain

xc(t) = b(t − t0) + xc(t0).   (15.9)

Because the solution to the homogeneous PDE is constant along each characteristic, we obtain

u(xc(t), t) = u(xc(t0), t0).   (15.10)

We now consider two cases distinguished by the point from which the characteristic emanates, as
shown in Figure 15.3.
Case 1: x > bt. In this case, the characteristic that goes through (x, t) emanates from the initial-
time boundary {t = 0} of the space-time domain. (I.e., if we trace the characteristic backward from
(x, t), we intersect the initial-time boundary.) We hence set xc (t) = x and t0 = 0 in (15.9) to obtain
x = bt + xc (t0 ) ⇒ xc (t0 ) = x − bt.
The substitution of these expressions into (15.10) yields
u(x, t) = u(xc (t0 ), t0 ) = u(x − bt, 0) = g(x − bt).
Case 2: x < bt. In this case, the characteristic that goes through (x, t) emanates from the
{x = 0} boundary. (I.e., if we trace the characteristic backward from (x, t), we intersect the
{x = 0} boundary.) We hence set xc (t) = x and xc (t0 ) = 0 in (15.9) to obtain
x = b(t − t0 ) ⇒ t0 = t − x/b.

Figure 15.3: Characteristics for the transport equation on the half line R>0 .

The substitution of these expressions into (15.10) yields

u(x, t) = u(xc (t0 ), t0 ) = u(0, t − x/b) = h(t − x/b).

In summary, the solution is given by

u(x, t) = g(x − bt) for x > bt,
u(x, t) = h(t − x/b) for x < bt.

We note that the solution u(x, t) for x > bt depends only on the initial condition g; it does not
depend on the boundary condition h. Conversely, the solution u(x, t) for x < bt depends only on the
boundary condition h; it does not depend on the initial condition g. This is a direct consequence
of the limited domain of influence and dependence that the transport equation exhibits. The result
can be contrasted with the solution of the heat equation on the half line R>0 , where the solution
at any (x, t) depends on both the initial and boundary conditions.
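The piecewise solution of Example 15.12 can be evaluated directly. The following minimal sketch uses assumed illustrative data (b = 1, g(x) = exp(−x), h(t) = sin(t)), not taken from the notes:

```python
import numpy as np

# Evaluation of the half-line solution of Example 15.12 (a sketch):
# for x > bt the solution comes from the initial condition g; for x < bt
# it comes from the inflow boundary condition h.
def u_half_line(x, t, b, g, h):
    return g(x - b * t) if x > b * t else h(t - x / b)

# Assumed illustrative data (not from the notes): b = 1, g(x) = exp(-x),
# h(t) = sin(t).
b = 1.0
g = lambda x: np.exp(-x)
h = lambda t: np.sin(t)
print(u_half_line(3.0, 1.0, b, g, h))  # x > bt: g(3 - 1) = exp(-2)
print(u_half_line(0.5, 2.0, b, g, h))  # x < bt: h(2 - 0.5) = sin(1.5)
```

The branch taken at each point reflects the limited domain of dependence: the first evaluation never sees h, and the second never sees g.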

15.8 Summary
We summarize key points of this lecture:

1. The transport equation models the transport of a conserved quantity by an advection field.

2. The generalized transport equation models transport with an advection field that may (i) be
constant, (ii) depend on space and time, or (iii) depend on space, time, and the state.

3. The generalized transport equation can be solved using the method of characteristics. The
method turns the PDE into a pair of ODEs along characteristics.

4. For homogeneous transport equations, the value of the solution along each characteristic is a
constant. The characteristic can be found by integrating the advection field. If the advection
field is constant, the solution is simply translated in time.

5. For nonhomogeneous transport equations, the value of the solution along each characteristic
grows/decays depending on the values of the source function along the characteristic.

6. The transport equation exhibits limited domain of influence and finite propagation speed.
This is unlike the heat equation which exhibits unbounded domain of influence and infinite
propagation speed.

7. The solution to the transport equation is not smooth if the initial condition is not smooth.
This is unlike the heat (and Laplace’s) equation for which the solution is always smooth even
if the initial condition (or boundary condition) is discontinuous.

8. The solution to nonhomogeneous transport equations can also be obtained using Duhamel’s
principle.

9. If the spatial domain has a boundary, then Dirichlet boundary conditions are imposed on the
inflow boundaries; no boundary conditions are imposed elsewhere.

Lecture 16

Wave equation: method of characteristics

©2019–2021 Masayuki Yano. Prepared for ESC384 Partial Differential Equations taught at the
University of Toronto.

16.1 Introduction
In this lecture we analyze the wave equation in an unbounded domain using d'Alembert's formula,
which we derive here using the method of characteristics. The method provides physical insights
that are not as obvious from the analysis based on separation of variables. The method of
characteristics in fact provides, for the wave equation, an insight similar to that provided by the
fundamental solution for the heat equation. We will also study the behavior of the wave equation
on the half line R>0 using the method of reflection, with particular attention to the behavior of the
solution at the boundary.

16.2 d’Alembert’s formula: revisited


We first consider the IVP for the one-dimensional wave equation in R × R>0:

∂²u/∂t² − ∂²u/∂x² = 0 in R × R>0,
u = g on R × {t = 0},
∂u/∂t = h on R × {t = 0},

for some functions g : R → R and h : R → R. Our goal is to find a representation for u in terms
of g and h using the method of characteristics. We will find that the resulting formula is
d'Alembert's formula, which we have derived for bounded domains using separation of variables in
Lecture 14.
To this end, we first factor the second-order operator as a product of two first-order operators:

∂²u/∂t² − ∂²u/∂x² = (∂/∂t + ∂/∂x) (∂/∂t − ∂/∂x) u = 0,

where we denote v ≡ (∂/∂t − ∂/∂x) u.

This “factorization” allows us to decompose the wave equation, which is a second-order PDE, into
a pair of first-order PDEs:

∂u/∂t − ∂u/∂x = v,   (16.1)
∂v/∂t + ∂v/∂x = 0.   (16.2)

We recognize that these are transport equations. We approach the problem in two steps: we first solve
(16.2) for v; we then solve (16.1) for u.
We first apply the method of characteristics to the one-dimensional homogeneous transport
equation (16.2) (for which b = 1) to obtain

v(x, t) = v0(x − t),

where v0(x) = v(x, t = 0) ∀x ∈ R is the initial condition, which we later specify.
We next substitute the expression for v into (16.1) to obtain

∂u/∂t (x, t) − ∂u/∂x (x, t) = v0(x − t).

This is a nonhomogeneous transport equation with a constant coefficient b = −1 and the source
function v(x, t) = v0(x − t). We again apply the method of characteristics to obtain

u(x, t) = u(x − bt, 0) + ∫_{τ=0}^{t} v(x − b(t − τ), τ) dτ
= u(x − bt, 0) + ∫_{τ=0}^{t} v0(x − b(t − τ) − τ) dτ
= u(x + t, 0) + ∫_{τ=0}^{t} v0(x + t − 2τ) dτ.

We now consider the change of variables ξ = x + t − 2τ to obtain

u(x, t) = u(x + t, 0) + ∫_{ξ=x+t}^{x−t} v0(ξ) (−(1/2) dξ) = u(x + t, 0) + (1/2) ∫_{x−t}^{x+t} v0(ξ) dξ.   (16.3)

We then appeal to the initial condition u(x + t, 0) = g(x + t) and the definition of v0 to obtain

v0(ξ) = v(ξ, 0) = ∂u/∂t (ξ, 0) − ∂u/∂x (ξ, 0) = h(ξ) − g′(ξ).

The substitution of this relationship into (16.3) yields

u(x, t) = g(x + t) + (1/2) ∫_{x−t}^{x+t} (h(ξ) − g′(ξ)) dξ,

which, by the fundamental theorem of calculus, simplifies to

u(x, t) = g(x + t) − (1/2) g(x + t) + (1/2) g(x − t) + (1/2) ∫_{x−t}^{x+t} h(ξ) dξ
= (1/2)(g(x + t) + g(x − t)) + (1/2) ∫_{x−t}^{x+t} h(ξ) dξ.   (16.4)

(a) domain of dependence (b) domain of influence

Figure 16.1: Domain of dependence and influence for the wave equation.

We have rederived d'Alembert's formula using the method of characteristics. Note that the
formula is now defined on the entire real line R.
We now compare the behaviors of the wave equation (which is hyperbolic) and the heat equation
(which is parabolic). To this end, we note that the solutions to the two equations for a given initial
condition g (and h) can be expressed as

wave: u(x, t) = (1/2)(g(x + t) + g(x − t)) + (1/2) ∫_{x−t}^{x+t} h(ξ) dξ,
heat: u(x, t) = ∫_R Φ(x − ξ, t) g(ξ) dξ,

where Φ is the fundamental solution of the heat equation, which is smooth and nonzero everywhere.
We make a few observations:

1. Limited domain of dependence. The solution at (x, t) depends only on the initial position g
at x − t and x + t and the initial velocity in (x − t, x + t). (Note that in the dimensional form,
the domain is (x − ct, x + ct).) Figure 16.1(a) illustrates the domain of dependence for the
wave equation in the space-time domain. The bounded domain of dependence is unlike the
heat equation, in which the solution u(x, t) depends on the initial condition everywhere.

2. Limited domain of influence. The domain of influence is the subset of the space-time domain
in which the initial values g(x0) and h(x0) affect the solution. Figure 16.1(b) illustrates the
domain of influence for the wave equation. Again, because the wave propagates at the speed
of at most c = 1, the initial condition at x0 only influences the solution in [x0 − t, x0 + t] at
time t. This is again unlike the heat equation, for which the initial condition g(x0) influences
the entire solution for t > 0.

3. Finite propagation speed. The waves propagate at the speed of at most c = 1, and the initial
condition outside of [x − t, x + t] does not have time to influence the solution at (x, t). This
is again unlike the heat equation with the infinite propagation speed.

4. Lack of smoothness. The solution to the wave equation is only as regular as the initial
condition g and only one order more regular than the initial condition h. If g is (only)
twice continuously differentiable and h is (only) once continuously differentiable, then u(·, t)
is (only) twice continuously differentiable. This is unlike the heat equation whose solution
u(·, t) is smooth (i.e., infinitely differentiable) for any t > 0 even if the initial condition g is
discontinuous.
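A finite-difference sanity check of d'Alembert's formula (16.4): with a choice of h that has a simple antiderivative, the formula can be evaluated in closed form and tested against the wave equation. The data g and h, the antiderivative H, and the test point below are illustrative assumptions, not from the notes:

```python
import numpy as np

# Finite-difference check of d'Alembert's formula (16.4). We pick
# g(x) = exp(-x^2) and h(x) = x exp(-x^2), whose antiderivative
# H(x) = -exp(-x^2)/2 lets us evaluate the integral of h exactly.
g = lambda x: np.exp(-x**2)
H = lambda x: -0.5 * np.exp(-x**2)    # H' = h
u = lambda x, t: 0.5 * (g(x + t) + g(x - t)) + 0.5 * (H(x + t) - H(x - t))
eps = 1e-4
x, t = 0.3, 0.8                       # arbitrary test point
u_tt = (u(x, t + eps) - 2.0 * u(x, t) + u(x, t - eps)) / eps**2
u_xx = (u(x + eps, t) - 2.0 * u(x, t) + u(x - eps, t)) / eps**2
print(u_tt - u_xx)                    # wave-equation residual: close to zero
print(u(x, 0.0) - g(x))               # initial condition: exactly zero
```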

16.3 Wave equation on R>0 × R>0 : homogeneous Dirichlet BC
We now consider the wave equation on the half line R>0:

∂²u/∂t² − ∂²u/∂x² = 0 in R>0 × R>0,   (16.5)
u = 0 on {x = 0} × R>0,
u = g on R>0 × {t = 0},
∂u/∂t = h on R>0 × {t = 0};

note that a homogeneous Dirichlet boundary condition is imposed at x = 0.
We wish to solve the problem using the method of reflection, which we have considered in the
context of the heat equation in Lecture 7. We recall that the idea is to identify a problem on the
entire real line R whose solution, restricted to the half line R>0, is the solution to (16.5). We also
recall that we can impose the homogeneous Dirichlet boundary condition by considering the odd
extension of the initial conditions, as the odd extension of a continuous function vanishes at the
origin. We hence consider

∂²u/∂t² − ∂²u/∂x² = 0 in R × R>0,   (16.6)
u = g^o on R × {t = 0},
∂u/∂t = h^o on R × {t = 0},

where g^o : R → R and h^o : R → R are the odd extensions of g : R>0 → R and h : R>0 → R,
respectively. We apply d'Alembert's formula to the extended problem to obtain

u(x, t) = (1/2)(g^o(x + t) + g^o(x − t)) + (1/2) ∫_{x−t}^{x+t} h^o(ξ) dξ.

We can simplify the formula with some manipulation specific to odd functions. To this end, we
partition the space-time domain into two parts: the part not affected by the reflected wave (x > t)
and the part affected by the reflected wave (x < t).
Case 1: x > t. In this case, all arguments of g^o and h^o are positive and hence we can simply
substitute g^o = g and h^o = h:

u(x, t) = (1/2)(g(x + t) + g(x − t)) + (1/2) ∫_{x−t}^{x+t} h(ξ) dξ, x > t.

An illustration for the x > t case is shown in Figure 16.2(a). Because the reflected wave does not
reach u(x, t), the solution is identical to what would have been obtained if the boundary did not
exist.
Case 2: x < t. In this case, some of the arguments of g^o and h^o become negative. We appeal
to the property of the odd extension to obtain, for x − t < 0,

g^o(x − t) = −g(t − x).

(a) x > ct (b) x < ct

Figure 16.2: Visualization of the solution of the wave equation on the half line with homogeneous
Dirichlet boundary condition at x = 0.

In addition, we split the integral of h^o into three parts according to Figure 16.2(b) to obtain

(1/2) ∫_{x−t}^{x+t} h^o(ξ) dξ = (1/2) ∫_{x−t}^{0} h^o(ξ) dξ + (1/2) ∫_{0}^{t−x} h^o(ξ) dξ + (1/2) ∫_{t−x}^{x+t} h^o(ξ) dξ = (1/2) ∫_{t−x}^{x+t} h(ξ) dξ;

here, the first two integrals cancel each other because h^o is an odd function and the integration
domains (x − t, 0) and (0, t − x) are symmetric about the origin. We hence obtain

u(x, t) = (1/2)(g(x + t) − g(t − x)) + (1/2) ∫_{t−x}^{x+t} h(ξ) dξ, x < t.

An illustration for the x < t case is shown in Figure 16.2(b).


We make a few observations:

1. For x > t, the solution u(x, t) does not depend on the boundary condition, as the informa-
tion does not reach (x, t) due to the finite propagation speed. This is unlike for the heat
equation, where the boundary condition influences the solution everywhere due to the infinite
propagation speed.

2. For x < t, the reflected wave that reaches (x, t) originates at t − x and is inverted (i.e., the
sign is changed).

3. For x < t, the integral of h over (0, t − x) vanishes because it cancels with the odd extension
over (−(t − x), 0); hence the solution depends only on h over (t − x, x + t).
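The odd-extension construction is easy to evaluate numerically. The sketch below takes h = 0 for simplicity and an assumed illustrative pulse g (not from the notes); it shows that the formula vanishes at x = 0 for all time and that the reflected wave carries the opposite sign:

```python
import numpy as np

# Half-line Dirichlet problem via the odd extension (h = 0 for simplicity):
# u(x,t) = (g_o(x+t) + g_o(x-t)) / 2, where g_o is the odd extension of g.
# The pulse g below is an assumed illustrative choice, not from the notes.
g = lambda x: np.exp(-(x - 2.0)**2)        # pulse centered at x = 2
g_o = lambda x: np.sign(x) * g(np.abs(x))  # odd extension of g
u = lambda x, t: 0.5 * (g_o(x + t) + g_o(x - t))
print(u(0.0, 1.7))  # boundary value u(0, t): zero for all t
print(u(0.5, 3.0))  # x < t: includes the sign-inverted reflected wave
```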

16.4 Wave equation on R>0 × R>0 : homogeneous Neumann BC
We now consider the wave equation on the half line R>0 :
\begin{align*}
\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} &= 0 && \text{in } \mathbb{R}_{>0}\times\mathbb{R}_{>0},\\
\frac{\partial u}{\partial x} &= 0 && \text{on } \{x=0\}\times\mathbb{R}_{>0},\\
u &= g && \text{on } \mathbb{R}_{>0}\times\{t=0\},\\
\frac{\partial u}{\partial t} &= h && \text{on } \mathbb{R}_{>0}\times\{t=0\};
\end{align*}
note that homogeneous Neumann boundary condition is imposed at x = 0.
We again use the method of reflection. We recall that for homogeneous Neumann boundary
condition, we consider the even extension of the initial conditions, as the even extension of a
continuous function has a vanishing derivative at the origin. We hence consider
\begin{align}
\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} &= 0 && \text{in } \mathbb{R}\times\mathbb{R}_{>0}, \tag{16.7}\\
u &= g^e && \text{on } \mathbb{R}\times\{t=0\},\\
\frac{\partial u}{\partial t} &= h^e && \text{on } \mathbb{R}\times\{t=0\},
\end{align}
where $g^e : \mathbb{R}\to\mathbb{R}$ and $h^e : \mathbb{R}\to\mathbb{R}$ are the even extensions of $g : \mathbb{R}_{>0}\to\mathbb{R}$ and $h : \mathbb{R}_{>0}\to\mathbb{R}$, respectively. We apply d'Alembert's formula to obtain
\[
u(x, t) = \frac{1}{2}\left(g^e(x+t) + g^e(x-t)\right) + \frac{1}{2}\int_{x-t}^{x+t} h^e(\xi)\,d\xi.
\]
We simplify the formula with some manipulation specific to even functions. We again partition
the space-time domain into two parts: the part not affected by the reflected wave (x > t) and the
part affected by the reflected wave (x < t).
Case 1: x > t. In this case, all arguments of $g^e$ and $h^e$ are positive and hence we can simply substitute $g^e = g$ and $h^e = h$:
\[
u(x, t) = \frac{1}{2}\left(g(x+t) + g(x-t)\right) + \frac{1}{2}\int_{x-t}^{x+t} h(\xi)\,d\xi, \qquad x > t.
\]
As before, and as shown in Figure 16.3(a), the reflected wave does not reach u(x, t) and hence the
solution is identical to what would have been obtained if the boundary did not exist.
Case 2: x < t. In this case, some of the arguments of $g^e$ and $h^e$ become negative. We appeal to the property of the even extension to obtain, for $x - t < 0$,
\[
g^e(x - t) = g(t - x).
\]
In addition, we split the integral of $h^e$ into three parts according to Figure 16.3(b) to obtain
\[
\frac{1}{2}\int_{x-t}^{x+t} h^e(\xi)\,d\xi
= \underbrace{\frac{1}{2}\int_{x-t}^{0} h^e(\xi)\,d\xi + \frac{1}{2}\int_{0}^{t-x} h^e(\xi)\,d\xi}_{\text{equal to each other}}
+ \frac{1}{2}\int_{t-x}^{x+t} h^e(\xi)\,d\xi
= \int_{0}^{t-x} h(\xi)\,d\xi + \frac{1}{2}\int_{t-x}^{x+t} h(\xi)\,d\xi;
\]

Figure 16.3: Visualization of the solution of the wave equation on the half line with homogeneous Neumann boundary condition at x = 0; (a) x > ct, (b) x < ct.

here the first two integrals are equal to each other because $h^e$ is an even function and the integration domains $(x-t, 0)$ and $(0, t-x)$ are symmetric about the origin. We hence obtain
\[
u(x, t) = \frac{1}{2}\left(g(x+t) + g(t-x)\right) + \int_{0}^{t-x} h(\xi)\,d\xi + \frac{1}{2}\int_{t-x}^{x+t} h(\xi)\,d\xi, \qquad x < t.
\]

An illustration for the x < t case is shown in Figure 16.3(b).


We make a few observations:

1. For x > t, the solution u(x, t) does not depend on the boundary condition (as before).

2. For x < t, the reflected wave that reaches (x, t) originates at t − x and is not inverted (i.e.,
has the same sign). This is unlike at a homogeneous Dirichlet boundary where the reflected
wave is inverted.

3. For x < t, the integral of h over (0, t − x) is doubled because it must also account for the
even extension over (−(t − x), 0). The solution depends on h over (0, x + t); this differs
from the homogeneous Dirichlet boundary case, in which the solution depends on h over only
(t − x, x + t).
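As with the Dirichlet case, the even-extension construction can be verified numerically; a minimal sketch follows, where the initial data g and h are arbitrary illustrative choices and trapezoid is a simple quadrature helper.

```python
import numpy as np

def trapezoid(f, a, b, n=8001):
    """Composite trapezoidal rule for the integral of f over (a, b)."""
    xi = np.linspace(a, b, n)
    y = f(xi)
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(xi))

def u_neumann(g, h, x, t):
    """d'Alembert solution built from the even extensions g^e, h^e,
    which solves the half-line problem with u_x(0, t) = 0 (c = 1)."""
    ge = lambda s: g(np.abs(s))  # even extension of g
    he = lambda s: h(np.abs(s))  # even extension of h
    return 0.5 * (ge(x + t) + ge(x - t)) + 0.5 * trapezoid(he, x - t, x + t)

# hypothetical smooth initial data for illustration
g = lambda s: np.exp(-(s - 1.0)**2)
h = lambda s: s * np.exp(-s)

# for x < t the result matches the reflected-wave formula derived above
x, t = 0.4, 1.5
u_formula = (0.5 * (g(x + t) + g(t - x))
             + trapezoid(h, 0.0, t - x) + 0.5 * trapezoid(h, t - x, x + t))
assert abs(u_neumann(g, h, x, t) - u_formula) < 1e-5

# the Neumann condition u_x(0, t) = 0 holds (central difference in x)
eps = 1e-3
du_dx = (u_neumann(g, h, eps, t) - u_neumann(g, h, -eps, t)) / (2.0 * eps)
assert abs(du_dx) < 1e-6
```

Note how the doubled integral over (0, t - x) in the derived formula is exactly what the even extension produces.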

16.5 Wave equation on (0, 1) × R>0

In a previous lecture we considered the solution of the wave equation on a bounded domain using
separation of variables. We now revisit the problem, but now using the method of reflection. The

problem we consider is
\begin{align}
\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} &= 0 && \text{in } (0,1)\times\mathbb{R}_{>0}, \tag{16.8}\\
u &= 0 && \text{on } \{0,1\}\times\mathbb{R}_{>0},\\
u &= g && \text{on } (0,1)\times\{t=0\},\\
\frac{\partial u}{\partial t} &= h && \text{on } (0,1)\times\{t=0\};
\end{align}
the homogeneous Dirichlet boundary condition is imposed at both x = 0 and 1.
We again consider a problem on the entire real line $\mathbb{R}$ whose solution restricted to $(0, 1)$ satisfies (16.8). To find the solution that vanishes at both $x = 0$ and $1$, we consider initial conditions that are odd with respect to both $x = 0$ and $1$. The odd extension of $g : (0,1) \to \mathbb{R}$ and $h : (0,1) \to \mathbb{R}$ to $g^o : (-1,1) \to \mathbb{R}$ and $h^o : (-1,1) \to \mathbb{R}$ results in functions that are odd about $x = 0$. We then consider the periodic extension (with period 2) of $g^o$ and $h^o$ — which is the odd periodic extension of $g$ and $h$ — to obtain functions that are also odd with respect to $x = 1$. Hence, the problem on the entire $\mathbb{R}$ that we seek is
\begin{align}
\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} &= 0 && \text{in } \mathbb{R}\times\mathbb{R}_{>0}, \tag{16.9}\\
u &= g^{o,p} && \text{on } \mathbb{R}\times\{t=0\},\\
\frac{\partial u}{\partial t} &= h^{o,p} && \text{on } \mathbb{R}\times\{t=0\},
\end{align}
where $g^{o,p} : \mathbb{R}\to\mathbb{R}$ and $h^{o,p} : \mathbb{R}\to\mathbb{R}$ are the odd periodic extensions of $g : (0,1)\to\mathbb{R}$ and $h : (0,1)\to\mathbb{R}$,
respectively. We had previously derived this result using separation of variables; we have now
derived the result using the method of characteristics and the method of reflection. This approach
based on the method of reflection readily extends to the case of homogeneous Neumann boundary
conditions and mixed boundary conditions (i.e., Dirichlet on one end and Neumann on the other
end).
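A sketch of the same construction in code, assuming period-2 odd periodic extensions; the initial data below are illustrative choices that vanish at x = 0 and 1.

```python
import numpy as np

def odd_periodic(f, s):
    """Odd periodic extension (period 2) of f : (0, 1) -> R to all of R."""
    w = np.mod(s + 1.0, 2.0) - 1.0  # wrap the argument into [-1, 1)
    return np.where(w >= 0.0, f(w), -f(-w))

def u_bounded(g, h, x, t, n=8001):
    """d'Alembert solution of (16.9); its restriction to (0, 1) solves (16.8)."""
    xi = np.linspace(x - t, x + t, n)
    hv = odd_periodic(h, xi)
    integral = np.sum(0.5 * (hv[1:] + hv[:-1]) * np.diff(xi))  # trapezoid rule
    gpart = odd_periodic(g, x + t) + odd_periodic(g, x - t)
    return 0.5 * gpart + 0.5 * integral

# illustrative initial data on (0, 1)
g = lambda s: np.sin(np.pi * s)
h = lambda s: s * (1.0 - s)

# the reconstructed solution vanishes at both boundaries for all sampled times
for t in (0.3, 1.7, 4.2):
    assert abs(u_bounded(g, h, 0.0, t)) < 1e-6
    assert abs(u_bounded(g, h, 1.0, t)) < 1e-6
```

The assertions exercise exactly the property the odd periodic extension was designed to deliver: oddness about both x = 0 and x = 1 forces the d'Alembert solution to vanish there for all time.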

16.6 Summary
We summarize key points of this lecture:
1. We can derive d'Alembert's formula on the entire real line R using the method of characteristics.
2. The wave equation exhibits a limited domain of influence, limited domain of dependence, and
finite propagation speed. This is unlike the heat equation which exhibits unbounded domain
of influence and dependence and infinite propagation speed.
3. The solution to the wave equation is in general not smooth if either of the initial conditions
g or h is not smooth.
4. The solution to the wave equation on the half line R>0 can be obtained using the method
of reflection. If homogeneous Dirichlet boundary condition is imposed at x = 0, then we
consider the odd extension of the initial condition and find the reflected wave is inverted.
If homogeneous Neumann boundary condition is imposed at x = 0, we consider the even
extension of the initial condition and find the reflected wave is not inverted.

5. The solution to the wave equation on a bounded domain can be obtained using the method of
reflection on both boundaries. The initial conditions are extended using appropriate odd/even
periodic extensions.

Lecture 17

Weak solution of conservation laws

©2019–2021 Masayuki Yano. Prepared for ESC384 Partial Differential Equations taught at the
University of Toronto.

17.1 Introduction
In this lecture we consider the solution of conservation laws, which play an important role in
fluid mechanics. We first provide an abstract form of conservation laws and then provide a few
concrete model problems. We then introduce the notion of weak solution; a weak solution may
contain discontinuities and allows us to analyze phenomena such as shock and rarefaction waves.
We conclude the lecture with a discussion of entropy-satisfying (i.e., physically relevant) solutions
to conservation laws in the presence of discontinuities.

17.2 Conservation laws


To consider the idea of weak solution in a general but simple setting, we first introduce a general
conservation law in one dimension:
\begin{align}
\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}F(u) &= 0 && \text{in } \mathbb{R}\times\mathbb{R}_{>0}, \tag{17.1}\\
u &= g && \text{on } \mathbb{R}\times\{t=0\},
\end{align}

where u is the conserved state, and F : R → R is a flux function. We assume that the flux is
continuous in u. This general form of the equation encompasses many relevant PDEs; we provide
a few examples.
Example 17.1 (transport equation). The constant-coefficient transport equation $\frac{\partial u}{\partial t} + b\frac{\partial u}{\partial x} = 0$ for $b \in \mathbb{R}$ is a conservation law with the flux $F(u) = bu$. (More generally, in higher dimensions, the transport equation with any divergence-free velocity field, $\nabla \cdot b = 0$, is a conservation law with the flux $F(u) = bu$.)
Example 17.2 (Burgers' equation). Burgers' equation $\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} = 0$ is a model equation for compressible flows, which exhibits phenomena such as shock waves and rarefaction waves. We observe that $u\frac{\partial u}{\partial x} = \frac{\partial}{\partial x}\left(\frac{1}{2}u^2\right)$, and hence Burgers' equation is a conservation law with the flux $F(u) = \frac{1}{2}u^2$.

Example 17.3 (traffic flow equation). The traffic flow equation models the density of cars on a
one-way road. The conserved state u is the density of the cars and the flux is given by F (u) =
uvmax (1 − u/umax ), where vmax is the maximum velocity of the traffic and umax is the maximum
density of the cars.
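As a small illustration, the traffic flux and its derivative (the speed at which information propagates) can be coded directly; the numerical values of v_max and u_max below are arbitrary assumptions.

```python
v_max, u_max = 30.0, 0.2  # hypothetical values: m/s and cars/m

def F(u):
    """Traffic flux F(u) = u * v_max * (1 - u / u_max)."""
    return u * v_max * (1.0 - u / u_max)

def dF(u):
    """Derivative F'(u) = v_max * (1 - 2 u / u_max): the characteristic speed."""
    return v_max * (1.0 - 2.0 * u / u_max)

# the flux vanishes on an empty road and in bumper-to-bumper traffic
assert F(0.0) == 0.0 and F(u_max) == 0.0
# information travels forward in light traffic and backward in dense traffic
assert dF(0.0) == v_max and dF(u_max) == -v_max
```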

17.3 Weak solution


We have observed that the solution to the transport equation $\frac{\partial u}{\partial t} + b\frac{\partial u}{\partial x} = 0$ is in general not smooth. This is unlike the solution to the heat equation $\frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = 0$, for which the solution is smooth (i.e., $u(\cdot, t) \in C^\infty(\mathbb{R})$) for any $t > 0$. This lack of smoothness is in fact a property of conservation laws, where the solution is "transported" instead of "diffused". However, as it stands, the PDE $\frac{\partial u}{\partial t} + b\frac{\partial u}{\partial x} = 0$ does not even make sense unless the solution is at least once differentiable in both space and time. It turns out we can relax this differentiability requirement and pose a different form of the PDE, called the weak form, so that the problem admits less regular solutions (and initial and boundary conditions), including discontinuous solutions. In this section we introduce this weak form of conservation laws.
To begin, we introduce a space of test functions, which comprises smooth space-time functions $v \in C^\infty(\mathbb{R}\times\mathbb{R}_{\ge 0})$. We (for now) assume the solution is differentiable — and hence $\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}F(u) = 0$ makes sense — multiply our strong form by the test function, and integrate by parts to obtain
\begin{align*}
0 &= \int_0^\infty\int_{-\infty}^\infty v\left(\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}F(u)\right)dx\,dt\\
&= -\int_0^\infty\int_{-\infty}^\infty \frac{\partial v}{\partial t}\,u\,dx\,dt - \int_{-\infty}^\infty v\,u\,dx\Big|_{t=0} - \int_0^\infty\int_{-\infty}^\infty \frac{\partial v}{\partial x}F(u)\,dx\,dt.
\end{align*}

We finally substitute the initial condition $u = g$ on $\mathbb{R}\times\{t=0\}$ to obtain
\[
\int_0^\infty\int_{-\infty}^\infty \left(\frac{\partial v}{\partial t}u + \frac{\partial v}{\partial x}F(u)\right)dx\,dt + \int_{-\infty}^\infty v\,g\,dx\Big|_{t=0} = 0.
\]
This is the weak form of the conservation law.


Definition 17.4 (weak solution). A bounded function $u \in L^\infty(\mathbb{R}\times\mathbb{R}_{>0})$ such that
\[
\int_0^\infty\int_{-\infty}^\infty \left(\frac{\partial v}{\partial t}u + \frac{\partial v}{\partial x}F(u)\right)dx\,dt + \int_{-\infty}^\infty v\,g\,dx\Big|_{t=0} = 0 \qquad \forall v \in C^\infty(\mathbb{R}\times\mathbb{R}_{\ge 0})
\]
is called a weak solution of the conservation law $\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}F(u) = 0$ with the initial condition $u = g$ at $t = 0$.
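As a sanity check of the definition, one can verify the weak-form identity numerically for the transport flux F(u) = bu and a classical solution. The test function v below is a hypothetical smooth bump, and the quadrature box and resolutions are arbitrary choices; the identity holds up to quadrature and domain-truncation error.

```python
import numpy as np

b = 1.0                            # transport speed (illustrative)
F = lambda u: b * u                # transport flux
g = lambda x: np.exp(-x**2)        # smooth initial condition
u = lambda x, t: g(x - b * t)      # classical solution u(x, t) = g(x - b t)

# hypothetical smooth test function, decaying in x and in t
v  = lambda x, t: np.exp(-(x**2 + (t - 1.0)**2))
vt = lambda x, t: -2.0 * (t - 1.0) * v(x, t)  # dv/dt
vx = lambda x, t: -2.0 * x * v(x, t)          # dv/dx

# midpoint-rule quadrature over a truncated space-time box
xg = np.linspace(-12.0, 12.0, 1201); xm = 0.5 * (xg[1:] + xg[:-1])
tg = np.linspace(0.0, 8.0, 801);     tm = 0.5 * (tg[1:] + tg[:-1])
X, T = np.meshgrid(xm, tm)
dx, dt = xg[1] - xg[0], tg[1] - tg[0]

interior = np.sum(vt(X, T) * u(X, T) + vx(X, T) * F(u(X, T))) * dx * dt
initial = np.sum(v(xm, 0.0) * g(xm)) * dx
assert abs(interior + initial) < 1e-3  # the weak-form identity holds numerically
```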
We now analyze the behavior of the weak solution for three cases with increasingly less regular
solutions: smooth solution, continuous but nondifferentiable solution, and discontinuous solution.
Smooth solution. Any solution u that satisfies the strong form of the conservation law (17.1) is a solution to the weak form; we integrate by parts in both space and time to obtain
\begin{align*}
&\int_0^\infty\int_{-\infty}^\infty \left(\frac{\partial v}{\partial t}u + \frac{\partial v}{\partial x}F(u)\right)dx\,dt + \int_{-\infty}^\infty v\,g\,dx\Big|_{t=0}\\
&\qquad = -\int_0^\infty\int_{-\infty}^\infty v\left(\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}F(u)\right)dx\,dt + \int_{-\infty}^\infty v\,(g - u)\,dx\Big|_{t=0} = 0,
\end{align*}
where the first and second terms vanish because u satisfies the strong form of the PDE and the initial condition, respectively.
Continuous but nondifferentiable solution. We now consider the case where u is not differentiable across some space-time curve and hence the strong form (17.1) does not make sense. For instance, for the transport equation, such a nondifferentiable solution would arise if the initial condition is not differentiable. We will show that a weak solution exists even in this case. To begin, we introduce a space-time curve $x = \xi(t)$ and the associated space-time unit normal vector
\[
n \equiv (n_x, n_t) = \frac{1}{\sqrt{1 + (\xi')^2}}\,(1, -\xi')
\]
that points outward from the "left" domain $\{(x,t)\in\mathbb{R}\times\mathbb{R}_{>0} \mid x < \xi(t)\}$ into the "right" domain. Note that $\xi' = \frac{d\xi}{dt}$ is the velocity at which the nondifferentiable point moves. We assume that the solution u satisfies $\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}F(u) = 0$ everywhere except on the curve $x = \xi(t)$ and observe that
\begin{align*}
0 &= \int_0^\infty\int_{-\infty}^\infty \left(\frac{\partial v}{\partial t}u + \frac{\partial v}{\partial x}F(u)\right)dx\,dt\\
&= \int_0^\infty\int_{-\infty}^{\xi(t)} \left(\frac{\partial v}{\partial t}u + \frac{\partial v}{\partial x}F(u)\right)dx\,dt + \int_0^\infty\int_{\xi(t)}^{\infty} \left(\frac{\partial v}{\partial t}u + \frac{\partial v}{\partial x}F(u)\right)dx\,dt\\
&= -\int_0^\infty\int_{-\infty}^{\xi(t)} v\left(\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}F(u)\right)dx\,dt + \int_{x=\xi(t)} \left(v\,u^- n_t + v\,F(u^-) n_x\right)ds\\
&\quad - \int_0^\infty\int_{\xi(t)}^{\infty} v\left(\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}F(u)\right)dx\,dt - \int_{x=\xi(t)} \left(v\,u^+ n_t + v\,F(u^+) n_x\right)ds\\
&= \int_{x=\xi(t)} v\left((u^- - u^+)n_t + (F(u^-) - F(u^+))n_x\right)ds.
\end{align*}

Because v is arbitrary, we require the solution to satisfy
\[
(u^- - u^+)n_t + (F(u^-) - F(u^+))n_x = 0
\]
along the nondifferentiable curve $x = \xi(t)$. We now substitute $n_t/n_x = -\xi'$ to obtain
\begin{equation}
F(u^-) - F(u^+) = \xi'(u^- - u^+). \tag{17.2}
\end{equation}

If u is continuous but not differentiable across $x = \xi(t)$, we observe that the condition at the nondifferentiable interface (17.2) is automatically satisfied because (i) $u^- = u^+$ since u is continuous and (ii) $F(u^-) = F(u^+)$ since $F : \mathbb{R}\to\mathbb{R}$ is continuous in its argument and u is continuous. In addition, we note that the above argument can be generalized to any finite number of nondifferentiable curves. Hence we conclude that if the solution u (i) is continuous everywhere and (ii) satisfies the strong form $\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}F(u) = 0$ everywhere except on a finite number of space-time curves on which u is not differentiable, then u is a weak solution.
Discontinuous solution. We now consider the case where u is discontinuous across some space-
time curve. We follow the same argument as the case above for nondifferentiable solution and find
that the condition across the space-time curve x = ξ(t) is

\[
F(u^-) - F(u^+) = \xi'(u^- - u^+).
\]

However, now that u is discontinuous, this condition is not automatically satisfied. This is in fact
the Rankine-Hugoniot condition, which the weak solution must satisfy across a discontinuity or
shock.

Theorem 17.5 (Rankine-Hugoniot condition). Given a conservation law $\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}F(u) = 0$, the discontinuous solution across the shock curve $\{(x,t) \mid x = \xi(t),\ t \in \mathbb{R}_{>0}\}$ satisfies
\[
F(u^-) - F(u^+) = s(u^- - u^+)
\]
for the shock speed $s = \xi'$.


Hence, if the solution is discontinuous along a finite number of space-time curves, the weak solution must (i) satisfy the strong form of the equation everywhere it is continuous and differentiable and (ii) satisfy the Rankine-Hugoniot condition across the discontinuities.
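The Rankine-Hugoniot condition translates directly into a one-line formula for the shock speed; the following sketch applies it to the transport and Burgers fluxes discussed in this lecture.

```python
def shock_speed(F, u_minus, u_plus):
    """Shock speed from the Rankine-Hugoniot condition:
    s = (F(u-) - F(u+)) / (u- - u+)."""
    return (F(u_minus) - F(u_plus)) / (u_minus - u_plus)

# transport flux F(u) = b u: every jump travels at the transport speed b
b = 3.0
assert shock_speed(lambda u: b * u, 1.0, 0.0) == b

# Burgers flux F(u) = u^2/2: a jump from u- = 1 to u+ = 0 travels at s = 1/2
assert shock_speed(lambda u: 0.5 * u**2, 1.0, 0.0) == 0.5
```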

17.4 Weak solutions of transport equation


We now seek the weak solution of the transport equation with nondifferentiable initial conditions.
Example 17.6 (transport equation with a nondifferentiable initial condition). Consider the transport equation $\frac{\partial u}{\partial t} + b\frac{\partial u}{\partial x} = 0$ on $\mathbb{R}\times\mathbb{R}_{>0}$ with the initial condition
\[
g(x) = \begin{cases} 1 - |x|, & x \in (-1, 1), \\ 0, & x \in (-\infty, -1] \cup [1, \infty). \end{cases}
\]

The initial condition is continuous but not differentiable at x = −1, 0, and 1. We use the method
of characteristics to find that the solution is

u(x, t) = g(x − bt),

which is a triangular wave that propagates at the speed b. Because u is continuous, this is a weak
solution to the problem.
Example 17.7 (transport equation with a discontinuous initial condition). Consider the transport equation $\frac{\partial u}{\partial t} + b\frac{\partial u}{\partial x} = 0$ on $\mathbb{R}\times\mathbb{R}_{>0}$ with the initial condition
\[
g(x) = \begin{cases} 1, & x \in (-1, 1), \\ 0, & x \in (-\infty, -1) \cup (1, \infty). \end{cases}
\]

The initial condition is discontinuous at x = −1 and 1. We use the method of characteristics to find that the solution is u(x, t) = g(x − bt), which is a square wave that propagates at the speed b. Because u is discontinuous, we must check that the Rankine-Hugoniot condition is satisfied. We observe that the discontinuities must propagate at the speed
\[
s = \frac{F(u^-) - F(u^+)}{u^- - u^+} = \frac{bu^- - bu^+}{u^- - u^+} = \frac{b(u^- - u^+)}{u^- - u^+} = b.
\]
The discontinuities in the solution u(x, t) = g(x − bt) indeed propagate at the speed b, the Rankine-Hugoniot condition is satisfied, and u is a weak solution to the problem.
We recall that the characteristic equations were obtained assuming the solution is differentiable.
However, the above two examples show that the solution to the characteristic equation, u(x, t) =
g(x − bt), is indeed a weak solution even if the initial condition is nondifferentiable and hence the
solution is nondifferentiable.

Figure 17.1: The characteristics and solution snapshots for the Burgers' problem with a shock in Example 17.8; (a) characteristics, (b) solution snapshots.

17.5 Weak solution of Burgers’ equation: shocks


We now seek a weak solution of Burgers’ equation.
Example 17.8 (shock in Burgers' equation). Consider the initial value problem: Burgers' equation
\[
\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} = \frac{\partial u}{\partial t} + \frac{\partial}{\partial x}\left(\frac{1}{2}u^2\right) = 0 \quad \text{in } \mathbb{R}\times\mathbb{R}_{>0}
\]
with the initial condition
\[
g(x) = \begin{cases} 1, & x < 0, \\ 0, & x > 0. \end{cases}
\]
Because $1 = u(x = 0^-) > u(x = 0^+) = 0$, the characteristic curves intersect for any t > 0 and there is no continuous solution. However, there is a discontinuous solution given by
\[
u(x, t) = \begin{cases} 1, & x < st, \\ 0, & x > st, \end{cases}
\]
where the shock speed s is given by the Rankine-Hugoniot condition
\[
s = \frac{F(u^-) - F(u^+)}{u^- - u^+} = \frac{\frac{1}{2}\cdot 1^2 - \frac{1}{2}\cdot 0^2}{1 - 0} = \frac{1}{2}.
\]
Figure 17.1 shows the characteristics and the solution snapshots. The characteristics collapse into
the shock at x = st = t/2; the snapshots also show the traveling shock.

17.6 Entropy condition


We now consider another initial value problem: Burgers’ equation on R with the initial condition
\[
g(x) = \begin{cases} 0, & x < 0, \\ 1, & x > 0. \end{cases}
\]

Figure 17.2: The characteristics and solution snapshots for a Burgers' problem with a nonphysical shock wave; (a) characteristics, (b) solution snapshots.

We apply the method of characteristics to find that
\[
u(x, t) = \begin{cases} 0, & x < 0, \\ 1, & x > t. \end{cases}
\]

However, the method of characteristics does not inform us what the solution is in the wedge $\{(x, t) \mid 0 < x < t,\ t \in \mathbb{R}_{>0}\}$.
Solution 1. One potential solution is to fill the wedge with a discontinuous solution
\[
u_1(x, t) = \begin{cases} 0, & x < t/2, \\ 1, & x > t/2. \end{cases}
\]
We readily verify that the solution satisfies Burgers' equation outside of the discontinuity. The solution also satisfies the Rankine-Hugoniot condition across the discontinuity:
\[
s = \frac{F(u^-) - F(u^+)}{u^- - u^+} = \frac{\frac{1}{2}\cdot 0^2 - \frac{1}{2}\cdot 1^2}{0 - 1} = \frac{1}{2}.
\]
Hence, this is a weak solution to the initial value problem. Figure 17.2 shows the characteristics
and the solution snapshots.
Solution 2. Another potential solution is to fill the wedge with a continuous solution
\[
u_2(x, t) = \begin{cases} 0, & x \in (-\infty, 0), \\ x/t, & x \in (0, t), \\ 1, & x \in (t, \infty). \end{cases}
\]
We first note that the solution is continuous. We second observe that Burgers' equation is satisfied for $x \in (-\infty, 0)$ and $x \in (t, \infty)$, where the solution is constant. We finally observe that for $x \in (0, t)$,
\[
\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} = -\frac{x}{t^2} + \frac{x}{t}\cdot\frac{1}{t} = 0.
\]

Figure 17.3: The characteristics and solution snapshots for a Burgers' problem with a rarefaction wave; (a) characteristics, (b) solution snapshots.

Hence the continuous solution satisfies the strong form of the Burgers’ equation everywhere except
along the two nondifferentiable curves; this is a weak solution of the Burgers’ equation. Figure 17.3
shows the characteristics and the solution snapshots.
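The verification for x in (0, t) can also be carried out numerically with finite differences; a minimal sketch follows, where the sample points and step size are arbitrary choices.

```python
import numpy as np

def u2(x, t):
    """The continuous (rarefaction) solution above, for t > 0."""
    return np.clip(x / t, 0.0, 1.0)

# central-difference check that u_t + u u_x = 0 inside the wedge 0 < x < t
t, eps = 2.0, 1e-6
for x in (0.3, 0.9, 1.5):
    ut = (u2(x, t + eps) - u2(x, t - eps)) / (2.0 * eps)
    ux = (u2(x + eps, t) - u2(x - eps, t)) / (2.0 * eps)
    assert abs(ut + u2(x, t) * ux) < 1e-8
```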
We have shown that there are at least two weak solutions to the Burgers' equation; in fact we can show that there are infinitely many weak solutions. However, only one of these solutions is physically relevant. It turns out there is an additional condition, called the entropy condition,
that a physically relevant solution must satisfy. While there are a few different characterizations of
the condition, one interpretation is that information cannot be created when we march forward in
time. Mathematically, this translates to a condition that if we start at a point (x, t) ∈ R × R>0 and
march backward in time along the characteristic, then we would not cross any other characteristics.
In other words, if we start at a point (x, t) ∈ R × R>0 and march backward in time along the
characteristic, we would not encounter any discontinuities. Note that, as we have seen, if we march
forward in time along the characteristic, we may cross another characteristic at which point a
discontinuity is formed. The characteristic terminating at a discontinuity when marching forward
in time corresponds to the loss of information, which is permissible. However, characteristics
terminating at a discontinuity when marching backward in time — or equivalently characteristics
emanating from a discontinuity when marching forward in time — corresponds to the creation of
information, which is not permissible. Because each characteristic moves at the speed $F'(u)$, we arrive at the following mathematical condition:
Definition 17.9 (entropy condition). Along a discontinuity, a physically relevant weak solution to
a conservation law must satisfy
\[
F'(u^-) > s > F'(u^+),
\]
where s is the shock speed given by the Rankine-Hugoniot condition.
Applying the entropy condition to Solutions 1 and 2 above, we find that Solution 1 is nonphysical since $0 = F'(u^-) < F'(u^+) = 1$. On the other hand, Solution 2 satisfies the entropy condition at all discontinuities (vacuously, because it has no discontinuities). Hence the solution shown in Figure 17.3 is the physically relevant solution to the weak form.
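The entropy check itself is a one-line comparison; the sketch below applies it to the two Burgers shocks considered in this lecture (recall F'(u) = u for the Burgers flux).

```python
def is_entropy_shock(dF, u_minus, u_plus, s):
    """Entropy condition across a shock: F'(u-) > s > F'(u+)."""
    return dF(u_minus) > s > dF(u_plus)

dF = lambda u: u  # F'(u) = u for the Burgers flux F(u) = u^2/2

# Example 17.8 (u- = 1, u+ = 0, s = 1/2): an admissible, physical shock
assert is_entropy_shock(dF, 1.0, 0.0, 0.5)
# Solution 1 above (u- = 0, u+ = 1, s = 1/2): violates the entropy condition
assert not is_entropy_shock(dF, 0.0, 1.0, 0.5)
```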

