
Notes on Partial Differential Equations

(Theory)
Claudio Canuto
Dipartimento di Matematica
Politecnico di Torino
10129 Torino Italy
[email protected]
http://calvino.polito.it/ccanuto
July 21, 2009

Chapter 1

Basic Concepts
1.1  Vectors

The inner product between two column vectors a = (a_i) ∈ IR^m and b = (b_i) ∈ IR^m will be denoted by

    a \cdot b := \sum_{i=1}^m a_i b_i .

An equivalent notation is a^T b (where the superscript T indicates the transpose of a vector or a matrix), since a \cdot b is the matrix product between the (1, m)-matrix a^T and the (m, 1)-matrix b.
For m = 3, the vector product (or external product) between a and b is the vector

    a \times b := \det \begin{pmatrix} e_1 & e_2 & e_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{pmatrix}
                = (a_2 b_3 - a_3 b_2) e_1 + (a_3 b_1 - a_1 b_3) e_2 + (a_1 b_2 - a_2 b_1) e_3
                = (a_2 b_3 - a_3 b_2 , \; a_3 b_1 - a_1 b_3 , \; a_1 b_2 - a_2 b_1)^T ,

where e_i = (\delta_{ij})_{j=1,...,3}, i = 1, . . . , 3, are the vectors of the canonical basis.

1.2  Introduction and Notations

Let us denote by x = (x_1, . . . , x_m) the independent variable in the Euclidean space IR^m. Let u = u(x) be a real-valued function defined in some open set O ⊆ IR^m. Given a multi-integer α = (α_1, . . . , α_m) ∈ IN^m, the α-partial derivative of u at a point x ∈ O is obtained by differentiating u at x α_i-times with respect to the variable x_i, for i = 1, . . . , m. The order of the partial derivative is defined as |α| := α_1 + · · · + α_m. The α-partial derivative will be denoted by one of the symbols

    D^\alpha u \qquad \text{or} \qquad \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdots \partial x_m^{\alpha_m}} .

Other symbols may be preferred for indicating low order derivatives. For instance, the first order partial derivatives with respect to x_i will also be denoted by

    D_i u \qquad \text{or} \qquad D_{x_i} u \qquad \text{or} \qquad u_{x_i} ;

the second order derivative with respect to x_i and x_j will be denoted by

    D^2_{ij} u \qquad \text{or} \qquad D^2_{x_i x_j} u \qquad \text{or} \qquad u_{x_i x_j} ,

and so on.
We now introduce the most commonly used first order differential operators. The gradient ∇ (or grad) is defined as

    \nabla u := \begin{pmatrix} \frac{\partial u}{\partial x_1} \\ \vdots \\ \frac{\partial u}{\partial x_m} \end{pmatrix} = \begin{pmatrix} \frac{\partial}{\partial x_1} \\ \vdots \\ \frac{\partial}{\partial x_m} \end{pmatrix} u ;

note that ∇ acts on a scalar function and produces a column vector function, i.e., a vector field defined in O.
The divergence ∇· (or div) acts on a vector-valued function u = (u_1, . . . , u_m)^T and produces a scalar function, according to the definition

    \nabla \cdot u := \frac{\partial u_1}{\partial x_1} + \cdots + \frac{\partial u_m}{\partial x_m} .

The notation is coherent with the fact that ∇ · u can be formally obtained as the inner product of the column vectors ∇ and u. Therefore, an equivalent notation for ∇ · u is ∇^T u.
In dimension m = 3, the curl ∇× (or rot) acts on a vector-valued function u and produces a vector-valued function, according to the definition

    \nabla \times u = \left( \frac{\partial u_3}{\partial x_2} - \frac{\partial u_2}{\partial x_3} , \; \frac{\partial u_1}{\partial x_3} - \frac{\partial u_3}{\partial x_1} , \; \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2} \right)^T ;

the vector ∇ × u can be formally obtained as the vector product of the column vectors ∇ and u.
In dimension m = 2, we can define the curl of a scalar function u as the column vector in IR^2

    \nabla \times u = \left( \frac{\partial u}{\partial x_2} , \; -\frac{\partial u}{\partial x_1} \right)^T ;

note that ∇ × u contains the two first components of the vector ∇ × U ∈ IR^3, where U = (0, 0, u)^T ∈ IR^3 (the last component is obviously 0). Similarly, we can define the curl of a vector function u = (u_1, u_2)^T ∈ IR^2 as the scalar

    \nabla \times u = \frac{\partial u_2}{\partial x_1} - \frac{\partial u_1}{\partial x_2} ,

which coincides with the third component of the vector ∇ × U ∈ IR^3, where U = (u, 0)^T (the first two components are 0 since u does not depend on x_3).
The perhaps most popular second order differential operator is the Laplacian Δ, defined as

    \Delta u := \frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_m^2} .

Note that Δu is obtained by taking the divergence of the vector ∇u, i.e., Δu = ∇ · ∇u = ∇^T ∇u; for this reason, the Laplacian is also denoted by the symbol ∇².
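The following short sketch (not part of the original notes) checks these identities symbolically with Python's sympy: for an arbitrary smooth scalar field, the divergence of the gradient equals the Laplacian and the curl of the gradient vanishes.

# A minimal symbolic check (assuming sympy is available) that
# div(grad u) = Laplacian(u) and curl(grad u) = 0 in three variables.
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
u = sp.sin(x1) * sp.exp(x2) * x3**2          # an arbitrary smooth scalar field

grad_u = [sp.diff(u, v) for v in (x1, x2, x3)]                     # gradient
div_grad_u = sum(sp.diff(g, v) for g, v in zip(grad_u, (x1, x2, x3)))
laplacian_u = sum(sp.diff(u, v, 2) for v in (x1, x2, x3))

curl_grad_u = [sp.diff(grad_u[2], x2) - sp.diff(grad_u[1], x3),
               sp.diff(grad_u[0], x3) - sp.diff(grad_u[2], x1),
               sp.diff(grad_u[1], x1) - sp.diff(grad_u[0], x2)]

print(sp.simplify(div_grad_u - laplacian_u))   # 0
print([sp.simplify(c) for c in curl_grad_u])   # [0, 0, 0]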


A partial differential equation is a relationship

    R(x, u; D^\alpha u) = 0     (1.2.1)

among the independent variable x, the dependent variable u = u(x) and certain partial derivatives D^α applied to u or to some functions depending on u; the multi-integers α vary in some finite subset of IN^m. The equation is required to be satisfied in some open set O ⊆ IR^m. The order of the equation is the maximum order of the partial derivatives which appear in the relationship.
Examples of (first order) partial differential equations are
(i) the transport (or advection) equation

    a \cdot \nabla u = \sum_{i=1}^m a_i \frac{\partial u}{\partial x_i} = 0 ,     (1.2.2)

where a = a(x) = (a_1(x), . . . , a_m(x))^T is a given vector field defined in O;

(ii) the inviscid Burgers equation

    \frac{\partial u}{\partial t} + \frac{\partial}{\partial x} \left( \frac{u^2}{2} \right) = 0 ,     (1.2.3)

where we have set (x_1, x_2) = (x, t) ∈ IR^2;

(iii) the liquid crystal equation

    \nabla u \cdot \nabla u = \sum_{i=1}^m \left( \frac{\partial u}{\partial x_i} \right)^2 = f ,     (1.2.4)

where f = f(x) is a given function in O.


A partial differential equation is linear if (1.2.1) can be written as

    L(x, u; D^\alpha) = f ,     (1.2.5)

where L is linear in u, i.e., L(x, λu + μv; D^α) = λ L(x, u; D^α) + μ L(x, v; D^α) for all λ, μ ∈ IR, and f = f(x) is a given function in O. Equivalently, the left-hand side of (1.2.5) is a linear differential operator (of some order N) applied to u; this means that for all α with |α| ≤ N there exist coefficients a_α = a_α(x) such that

    L(x, u; D^\alpha) = \sum_{|\alpha| \le N} a_\alpha D^\alpha u ,     (1.2.6)

and, at each x ∈ O, at least one coefficient with |α| = N is not vanishing. For convenience, the left-hand side L(x, u; D^α) will be denoted by Lu.
If we restrict the sum in (1.2.6) to the indices with |α| = N, we obtain the principal part L^{(N)} of the operator L, i.e.,

    L^{(N)} u = \sum_{|\alpha| = N} a_\alpha D^\alpha u .

The transport equation (1.2.2) is an example of a linear first order equation. Examples of
linear second order equations (in two independent variables) are

(i) the Poisson equation

    -\frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial y^2} = f     (1.2.7)

(called the Laplace equation if f = 0);

(ii) the heat equation

    \frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = f     (1.2.8)

(t denotes the time variable, whereas x denotes the space variable);

(iii) the wave equation

    \frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = f ;     (1.2.9)

(iv) the Tricomi equation

    y \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = f .     (1.2.10)

Note that the principal part of the heat operator is minus the second order derivative in the x space variable, i.e., minus the Laplacian in one space variable.
A partial differential equation is quasi-linear if it is linear in the highest-order derivatives, i.e., if it can be written as

    \sum_{|\alpha| = N} a_\alpha D^\alpha u = f ,     (1.2.11)

where the coefficients a_α as well as f may depend not only on x but also on u and certain derivatives D^β u of order |β| < N. An example is the inviscid Burgers equation (1.2.3), which can be written in the (formally) equivalent expression

    \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = 0 .

Finally, a partial differential equation is semi-linear if it is quasi-linear and the coefficients a_α in (1.2.11) depend neither on u nor on its derivatives (whereas f may depend). Examples of semi-linear equations are the viscous Burgers equation

    \frac{\partial u}{\partial t} - \nu \frac{\partial^2 u}{\partial x^2} + \frac{\partial}{\partial x} \left( \frac{u^2}{2} \right) = 0

(where ν > 0 is the viscosity constant), the Korteweg-de Vries equation

    \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} + \frac{\partial^3 u}{\partial x^3} = 0 ,     (1.2.12)

and the ground-state equation

    -\Delta u = u^3 .
We will now discuss in which sense a function u defined in the open set O ⊆ IR^m is a solution of the partial differential equation (1.2.1) therein. Indeed, we can give different meanings to the word "solution". We go from the concept of classical solution to that of strong solution, and then to weaker and weaker definitions, which require a solution to be less and less regular (i.e., differentiable). One of the main achievements of the Mathematics of the XXth century has been the relaxation of the concept of solution of a partial differential equation; this has allowed differential problems to be formulated in the most appropriate way for being studied by often sophisticated analytical tools, and numerically discretized by efficient methods.
Let us denote by N the order of the partial differential equation (1.2.1). A classical solution is an N-times continuously differentiable function in O (i.e., u ∈ C^N(O)) which, inserted with all its derivatives in the left-hand side of (1.2.1), makes the equation satisfied pointwise in O:

    R(x, u; D^\alpha u) = 0 , \qquad \forall x \in O .     (1.2.13)

We want to formulate these conditions in an equivalent way, which subsequently will allow us to relax the concept of solution. To this end, we introduce the notion of test function, i.e., an infinitely differentiable function φ defined and having compact support in O: this means that the closed set

    \mathrm{supp}\, \varphi := \text{closure of } \{ x \in O : \varphi(x) \ne 0 \}

is bounded and contained in O. Then, φ vanishes with all its derivatives in a neighborhood of the boundary ∂O. The set of all test functions forms a vector space, which will be denoted by D(O). Note that any partial derivative of a test function is itself a test function.

Example 1.2.1. It is often important to know that test functions with certain properties exist; for example, one often needs a test function that is positive in a small neighborhood of a given point x_0 and zero outside that neighborhood. Such a function can be given explicitly:

    \varphi(x) = \begin{cases} \exp\left( \dfrac{\varepsilon^2}{\|x - x_0\|^2 - \varepsilon^2} \right) & \text{if } \|x - x_0\| < \varepsilon \\[6pt] 0 & \text{otherwise,} \end{cases}

where ‖ · ‖ denotes the Euclidean norm of a vector in IR^m.
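As an illustration (not in the original notes), the following short Python sketch evaluates this bump function on a grid and confirms numerically that it is strictly positive inside the ball of radius ε around x_0 and identically zero outside; the center and radius are arbitrary choices.

# A minimal numerical sketch of the bump test function above (assumed center/radius).
import numpy as np

def bump(x, x0, eps):
    """Smooth function, positive for ||x - x0|| < eps, zero elsewhere."""
    r2 = np.sum((x - x0) ** 2, axis=-1)
    out = np.zeros_like(r2)
    inside = r2 < eps ** 2
    out[inside] = np.exp(eps ** 2 / (r2[inside] - eps ** 2))
    return out

x0, eps = np.array([0.0, 0.0]), 0.5
grid = np.stack(np.meshgrid(np.linspace(-1, 1, 201), np.linspace(-1, 1, 201)), axis=-1)
phi = bump(grid, x0, eps)
print(phi.max() > 0.0)                                          # positive inside the ball
print(np.all(phi[np.sum(grid**2, axis=-1) >= eps**2] == 0.0))   # zero outside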

Let us assume that R depends continuously on all its arguments, so that R(x, v; D^α v) is a continuous function in O, for all functions v ∈ C^N(O). The set of conditions (1.2.13) is equivalent to the set of conditions

    \int_O R(x, u; D^\alpha u) \, \varphi(x) \, dx = 0 , \qquad \forall \varphi \in D(O) .     (1.2.14)

Equivalence is established by a classical argument in analysis. If (1.2.13) is satisfied, we multiply both sides by φ(x) and integrate over O to get (1.2.14). Conversely, suppose that (1.2.14) holds; assume by contradiction that there exists x_0 ∈ O such that R(x_0, u; D^α u) ≠ 0, say, strictly positive. Since R(x, u; D^α u) is a continuous function of x, it will also be strictly positive in a neighborhood B(x_0) of x_0. Take as test function a nonnegative function φ having support contained in B(x_0) and satisfying φ(x_0) = 1. Then,

    \int_O R(x, u; D^\alpha u) \, \varphi(x) \, dx > 0 ,

which contradicts (1.2.14).


The interest of formulation (1.2.14) relies on the fact that certain derivatives applied to u, or to some functions of u, can be moved onto φ, thus relaxing the differentiability requirements on u. This is accomplished via the (repeated) use of the integration-by-parts formula

    \int_O \frac{\partial g}{\partial x_i}(x) \, \varphi(x) \, dx = - \int_O g(x) \, \frac{\partial \varphi}{\partial x_i}(x) \, dx , \qquad \forall \varphi \in D(O), \; i = 1, . . . , m,     (1.2.15)

which holds, at least, if g is continuously differentiable in O. Note that no boundary term appears, since a test function vanishes in a neighborhood of ∂O. While the left-hand side requires the partial derivative of g with respect to x_i to be defined and integrable on O, the right-hand side is defined under the milder condition that g be integrable on O, only.
To explain how (1.2.15) is used to manipulate (1.2.14), assume that the partial differential equation is written in the quasi-divergence form

    R(x, u; D^\alpha u) = \nabla \cdot g(x, u; D^\alpha u) + g_0(x, u; D^\alpha u) = \sum_{i=1}^m \frac{\partial}{\partial x_i} g_i(x, u; D^\alpha u) + g_0(x, u; D^\alpha u) ,

where each g_i (i = 0, 1, . . . , m) only involves partial derivatives of order strictly less than N. Many partial differential equations which model fundamental phenomena of the physical world are precisely obtained in this form; conservation laws are an example. Then, applying (1.2.15), conditions (1.2.14) can be written as

    \int_O \left[ - \sum_{i=1}^m g_i(x, u; D^\alpha u) \, \frac{\partial \varphi}{\partial x_i}(x) + g_0(x, u; D^\alpha u) \, \varphi(x) \right] dx = 0 , \qquad \forall \varphi \in D(O) .     (1.2.16)

In this formulation, u need not be differentiable up to order N; it is enough for the functions g_i to be defined and integrable on O. Any function u for which this is true and which satisfies (1.2.16) is called a weak solution of the partial differential equation. Obviously, a classical solution is also a weak solution, whereas the converse need not be true.
Further integrations by parts in (1.2.16) may lead to even weaker definitions of solution.
Example 1.2.2. Consider the transport equation (1.2.2) and assume that the coefficients a_i (i = 1, . . . , m) belong to C^1(O). After a change of sign, the equation can be written as

    - \sum_{i=1}^m \frac{\partial}{\partial x_i}(a_i u) + \left( \sum_{i=1}^m \frac{\partial a_i}{\partial x_i} \right) u = 0 .

Thus, the weak formulation is

    \int_O \left[ \sum_{i=1}^m a_i(x) \, u(x) \, \frac{\partial \varphi}{\partial x_i}(x) + \left( \sum_{i=1}^m \frac{\partial a_i}{\partial x_i}(x) \right) u(x) \, \varphi(x) \right] dx = 0 , \qquad \forall \varphi \in D(O) .

In this way, we allow the transport equation to have bounded, piecewise smooth but discontinuous weak solutions.
Example 1.2.3. Recalling that Δ = ∇ · ∇, the Poisson equation

    -\Delta u = f

in O is written in weak form as

    \int_O \nabla u \cdot \nabla \varphi \, dx = \int_O \sum_{i=1}^m \frac{\partial u}{\partial x_i} \frac{\partial \varphi}{\partial x_i} \, dx = \int_O f \varphi \, dx , \qquad \forall \varphi \in D(O),

provided f and all the first order partial derivatives of u exist and are integrable on O. A further integration by parts yields

    - \int_O u \, \Delta \varphi \, dx = \int_O f \varphi \, dx , \qquad \forall \varphi \in D(O),

which only requires u to be integrable on O.


Partial differential equations are usually supplemented by boundary and/or initial conditions, i.e., conditions that the solution has to satisfy on all or part of the boundary ∂O of the region O in which the equation is set; if O is unbounded, the solution may be required to match a prescribed asymptotic behaviour at infinity. Indeed, in most cases, a partial differential equation admits infinitely many solutions; the conditions on ∂O or at infinity, which often originate as part of the mathematical model describing the phenomenon of interest, allow us to select precisely one solution.
Example 1.2.4. Consider the simple transport equation in one space variable

    u_t + u_x = 0 .

It is immediate to check that if g = g(s) is any continuously differentiable function on the real line, then u(x, t) = g(x − t) is a classical solution of the equation. Thus, if we set the equation in the half-plane {(x, t) ∈ IR^2 : t > 0}, so that ∂O = {(x, 0) : x ∈ IR} represents the space at the initial time t = 0, then u is the unique solution of the initial value problem

    u_t + u_x = 0 \quad \text{in } O ,
    u = g \quad \text{on } \partial O .
Example 1.2.5. As a second example, consider the Poisson equation

    -\Delta u = f

in some O ⊆ IR^m. If we know one solution u_f, then all solutions can be written in the form u = u_0 + u_f, where u_0 denotes any harmonic function in O, i.e., any solution of the homogeneous equation Δu_0 = 0 therein. We shall see that, under appropriate assumptions, a unique solution can be selected by forcing u to vanish on the whole of ∂O, and, if O is unbounded, at infinity.
The two previous examples concern linear partial differential equations,

    Lu = f \quad \text{in } O .     (1.2.17)

For such equations, the set of solutions is an (affine) vector space. To see this, consider at first the associated homogeneous equation

    Lu_0 = 0 \quad \text{in } O .

The set of its solutions is a linear vector space: indeed, if u_0 and v_0 are two such solutions, then by the linearity of L one has

    L(\lambda u_0 + \mu v_0) = \lambda L u_0 + \mu L v_0 = 0 , \qquad \forall \lambda, \mu \in IR,

so λu_0 + μv_0 is also a solution. Going back to the nonhomogeneous equation, if u and v are two solutions, then

    L(u - v) = Lu - Lv = f - f = 0 ,

i.e., u − v is a solution of the homogeneous equation. Thus, if (1.2.17) admits a solution u_f, then all its solutions can be written in the form

    u = u_0 + u_f ,

with u_0 an arbitrary solution of the homogeneous equation. This is the well-known superposition principle of linear equations.

1.3  Linear First Order Equations

The most general linear first order partial differential equation is

    a \cdot \nabla u + a_0 u = \sum_{i=1}^m a_i \frac{\partial u}{\partial x_i} + a_0 u = f ;     (1.3.1)

a = (a_1, . . . , a_m)^T ≠ 0 and a_0 are the coefficients of the equation, whereas f is a given function. An alternative formulation of the equation is the quasi-divergence form

    \nabla \cdot (a u) + a_0 u = \sum_{i=1}^m \frac{\partial}{\partial x_i}(a_i u) + a_0 u = f .

If the coefficients a_i (i = 1, . . . , m) are differentiable, the two formulations are equivalent by the differentiation rule of a product, up to a different definition of the zeroth-order coefficient a_0.
We want to show that (1.3.1) is equivalent to a family of ordinary differential equations. We write a = ‖a‖ â, with â having unitary Euclidean norm, and we denote by

    \frac{\partial u}{\partial \hat{a}} = \hat{a} \cdot \nabla u

the directional derivative of u along â. Then, (1.3.1) becomes

    \|a\| \frac{\partial u}{\partial \hat{a}} + a_0 u = f ,

which is a family of ordinary differential equations in the directions of a. To be more explicit, let us assume that the coefficients a_i (i = 1, . . . , m) are bounded, continuously differentiable functions in the closure Ō of the region O in which the equation is set. Let us introduce the characteristic curves of the equation, i.e., the curves x = x(s) defined as the solutions of the autonomous ordinary differential system

    \frac{dx}{ds} = a(x) .     (1.3.2)
Here, s is a real variable which parametrizes each curve. A classical result in the theory of ordinary differential equations (see, e.g., (??)) guarantees that, under the assumptions made on the coefficients, for each x_0 ∈ Ō there exists exactly one characteristic curve passing through x_0; it is defined as the solution x = x(s; x_0) of the Cauchy problem

    \frac{dx}{ds} = a(x) , \qquad x(0) = x_0 .

The solution exists for positive and negative values of the parameter s, until x reaches the boundary ∂O. Note that the characteristics only depend on the principal part of the operator.
Let now u be a (classical) solution of (1.3.1), and let us consider its restriction ũ = ũ(s) = u(x(s)) to a characteristic curve. By the chain rule and (1.3.2), one has

    \frac{d\tilde{u}}{ds} = \sum_{i=1}^m \frac{\partial u}{\partial x_i} \frac{dx_i}{ds} = a \cdot \nabla u .

It follows that u can be determined by solving the linear ordinary differential equation

    \frac{d\tilde{u}}{ds} + \tilde{a}_0 \tilde{u} = \tilde{f}     (1.3.3)
Figure 1.1: Decomposition of the boundary of a channel O into inflow boundary ∂O⁻, characteristic boundary ∂O⁰ and outflow boundary ∂O⁺
on each characteristic curve (again, the symbol ~ indicates restriction to the characteristic curve); solvability is guaranteed if, for instance, a_0 and f are bounded and continuous in Ō. Furthermore, ũ can be uniquely determined by prescribing its value at one point of each characteristic curve.
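The reduction to ODEs along characteristics translates directly into a numerical procedure. The sketch below (not from the original notes; the coefficient field, data and integration interval are arbitrary choices) integrates the characteristic system dx/ds = a(x) together with the ODE (1.3.3) for ũ, using scipy's solve_ivp, for a simple two-dimensional variable-coefficient problem a · ∇u + a_0 u = f.

# Minimal sketch: solve a . grad(u) + a0*u = f along a characteristic (assumed data).
import numpy as np
from scipy.integrate import solve_ivp

a = lambda x: np.array([1.0, 0.5 * x[0]])   # velocity field a(x), an arbitrary example
a0 = lambda x: 0.2                           # zeroth-order coefficient
f = lambda x: 1.0                            # right-hand side
u_in = lambda x0: np.sin(3.0 * x0[1])        # prescribed value where the curve starts

def rhs(s, y):
    # y = (x1, x2, u): characteristic position plus the solution restricted to the curve.
    x, u = y[:2], y[2]
    return np.concatenate([a(x), [f(x) - a0(x) * u]])   # dx/ds = a, du/ds = f - a0*u

def solve_along_characteristic(x0, s_end=2.0):
    y0 = np.concatenate([x0, [u_in(x0)]])
    return solve_ivp(rhs, (0.0, s_end), y0, dense_output=True, rtol=1e-8)

sol = solve_along_characteristic(np.array([0.0, 0.3]))
print(sol.y[:2, -1], sol.y[2, -1])   # endpoint of the curve and the value of u there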
A situation of particular interest is the following one. Let ∂O be smooth enough so that the unit vector n = n(x) normal to ∂O exists at each point x ∈ ∂O; we assume that O is locally on one side of ∂O, and n is pointing outwards. Let us introduce the inflow boundary of O as the set

    \partial O^- := \{ x \in \partial O : (a \cdot n)(x) < 0 \} .     (1.3.4)

The terminology comes from the fact that if a is the (Eulerian) velocity of fluid particles, then ∂O⁻ is the portion of the boundary where the fluid is entering the region O. The sets ∂O⁺ (outflow boundary) and ∂O⁰ (characteristic boundary) are defined similarly, with < replaced by > and =, respectively.
Now, suppose that each point in O is reached by a characteristic curve issuing from ∂O⁻ (see Figure 1.1). Then, we can prescribe the value of u at each point in ∂O⁻ and uniquely solve the set of equations (1.3.3), getting u at each point in O. In other words, given a function g on ∂O⁻, the boundary value problem

    a \cdot \nabla u + a_0 u = f \quad \text{in } O ,
    u = g \quad \text{on } \partial O^-     (1.3.5)

admits a unique solution.
Before presenting an example, we anticipate that in Chapter ?? we shall see that this problem
is indeed solvable under weaker assumptions on the data (the domain, the coefficients of the
operator and the right-hand sides f and g).
Example 1.3.1. Consider the simple, constant coefficient equation

    u_t + a u_x = 0     (1.3.6)

in the variables (x_1, x_2) = (x, t). Thus, a = (a, 1)^T and a_0 = 0. The characteristic curves are defined by the relations

    \frac{dx}{ds} = a , \qquad \frac{dt}{ds} = 1 .

Eliminating s, we get

    x - a t = \text{constant} ;     (1.3.7)

Figure 1.2: Characteristics in the (x, t)-plane


in other words, the characteristics are straight lines in the plane (x, t) having slope 1/a (see Fig. 1.2). The solution u is constant along these lines. Thus, the equation models the propagation of a signal in the x-direction, with speed a: a signal issued at time t_0 from position x_0 is received at time t_0 + Δt at position x_0 + aΔt (see Fig. 1.3). Indeed, u(x_0 + aΔt, t_0 + Δt) = u(x_0, t_0).
At first, let us suppose that O is the half-plane {(x, t) : t > 0}. Since n = (0, −1)^T on ∂O = {(x, 0) : x ∈ IR}, we have a · n = −1 therein, so that ∂O⁻ = ∂O, i.e., all the boundary is inflow. Thus, we prescribe the value u_0 of u on ∂O, i.e., at the initial time t = 0. In this case, (1.3.5) reads as

    u_t + a u_x = 0 , \qquad x \in IR, \; t > 0 ,
    u(x, 0) = u_0(x) , \qquad x \in IR ,     (1.3.8)

which is more properly called an initial value problem. Given a point (x̄, t̄) ∈ O, the characteristic line passing through it originates from the point (x_0, 0) ∈ ∂O such that x̄ − a t̄ = x_0 (see (1.3.7) and Fig. 1.4). Since u is constant on this line, we have u(x̄, t̄) = u(x_0, 0) = u_0(x_0) = u_0(x̄ − a t̄).
Figure 1.3: Propagation of a signal from time t = t_0 to time t = t_0 + Δt

Figure 1.4: Solution of the initial value problem


Dropping the bars on x and t, we get the explicit formula for the solution of the initial value problem (1.3.8):

    u(x, t) = u_0(x - a t) , \qquad \text{for all } (x, t) \in O .     (1.3.9)

It is trivial to check that if u_0 is continuously differentiable, then u is continuously differentiable with respect to x and t; thus, u is a classical solution of the partial differential equation. On the other hand, suppose that u_0 has a discontinuity at a point x_0; then, u will have a discontinuity across the characteristic line x − at = x_0 issued at x_0. In other words, singularities propagate along characteristic curves. This very important property is quite general, as it holds for all first order equations. Obviously, the function defined by (1.3.9) is not a strong solution of (1.3.6) in O; however, it is a weak solution.
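A quick numerical illustration (not part of the original notes; the initial profile, speed and step sizes are arbitrary): the function u(x, t) = u_0(x − at) is evaluated on a grid and the residual u_t + a u_x is approximated by centered finite differences, confirming that it is small wherever u_0 is smooth.

# Minimal check that u(x,t) = u0(x - a*t) solves u_t + a u_x = 0 (assumed data).
import numpy as np

a = 1.5
u0 = lambda x: np.exp(-x**2)          # a smooth initial profile

x = np.linspace(-5.0, 5.0, 401)
t = np.linspace(0.0, 2.0, 401)
X, T = np.meshgrid(x, t, indexing='ij')
U = u0(X - a * T)

dx, dt = x[1] - x[0], t[1] - t[0]
u_x = np.gradient(U, dx, axis=0)      # centered differences in x
u_t = np.gradient(U, dt, axis=1)      # centered differences in t
print(np.max(np.abs(u_t + a * u_x)))  # small (finite-difference error only)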
At last, suppose that O is the semi-infinite strip O = {(x, t) : 0 < x < 1, t > 0}. For the sake of definiteness, assume that a > 0. Then,

    \partial O^- = \{(x, 0) : 0 \le x < 1\} \cup \{(0, t) : t > 0\}

(note that the normal vector to ∂O does not exist at the origin, yet this boundary point is an inflow point for the equation). We prescribe the value u_0 = u_0(x) at time t = 0 and the value g = g(t) at the left endpoint of the interval (0, 1); no condition has to be prescribed at the right endpoint. Thus, we consider the initial-boundary value problem

    u_t + a u_x = 0 , \qquad 0 < x < 1, \; t > 0 ,
    u(x, 0) = u_0(x) , \qquad 0 < x < 1 ,     (1.3.10)
    u(0, t) = g(t) , \qquad t > 0 .

In order to solve this problem, let us fix a point (x̄, t̄) ∈ O. If x̄ ≥ a t̄, then the characteristic passing through (x̄, t̄) meets ∂O⁻ at the point (x_0, 0), with x_0 = x̄ − a t̄; hence, as above, u(x̄, t̄) = u_0(x̄ − a t̄). On the other hand, if x̄ < a t̄, then the characteristic passing through (x̄, t̄) meets ∂O⁻ at the point (0, t_0), with t_0 = t̄ − x̄/a (see Fig. 1.5); hence, u(x̄, t̄) = u(0, t_0) = g(t_0) = g(t̄ − x̄/a). We conclude that the solution of the initial-boundary value problem (1.3.10) is

    u(x, t) = \begin{cases} u_0(x - a t) & \text{if } x \ge a t \\ g(t - x/a) & \text{if } x < a t . \end{cases}

Note that if the data u_0 and g do not match properly at the origin, a singularity propagates along the characteristic line x = at.
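The piecewise formula above is straightforward to implement; the following sketch (illustrative choices of a, u_0 and g, not from the notes) evaluates the solution of (1.3.10) and can be used, e.g., to visualize how a mismatch between u_0(0) and g(0) travels along the line x = at.

# Exact solution of the initial-boundary value problem (1.3.10) (assumed data).
import numpy as np

def solve_ibvp(x, t, a, u0, g):
    """Evaluate u(x,t) = u0(x - a t) where x >= a t, and g(t - x/a) where x < a t."""
    x, t = np.asarray(x, dtype=float), np.asarray(t, dtype=float)
    return np.where(x >= a * t, u0(x - a * t), g(t - x / a))

a = 2.0
u0 = lambda s: np.cos(np.pi * s)     # initial datum on 0 < x < 1
g = lambda s: 1.0 + 0.0 * s          # boundary datum at x = 0; g(0) = u0(0), so no jump

x = np.linspace(0.0, 1.0, 5)
print(solve_ibvp(x, 0.3, a, u0, g))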

Figure 1.5: Solution of the initial-boundary value problem


If the coefficient a is strictly negative, the boundary condition g is enforced at the right endpoint
of the interval (0, 1).
The concept of characteristic line introduced above is a particular case of the more general concept of characteristic manifold. An (m − 1)-dimensional manifold Σ (a line in two dimensions, a surface in three dimensions, and so on) contained in Ō is said to be non-characteristic for equation (1.3.1) whenever the following property holds: if one prescribes the value of u on Σ, then u is uniquely determined by the partial differential equation in a neighborhood of Σ. As a first step, one aims at determining the gradient of u on Σ; then, if the manifold, the coefficients and the data are smoother and smoother, one can differentiate the equation to get derivatives of u on Σ of higher and higher order; at last, the condition of real analyticity leads to the representation of u in terms of its Taylor series in a neighborhood of each point in Σ (Cauchy-Kowalewska Theorem).
Confining ourselves to the determination of the gradient of u on Σ, we observe that the directional derivative of u along any tangential vector to Σ is uniquely determined by the prescribed value of u therein. Therefore, the differential equation should allow us to express the derivative of u along a non-tangential direction to Σ in terms of the value of u on the manifold. In other words, denoting by n the normal vector to Σ, one should have a · n ≠ 0 on Σ. This motivates the following

Definition 1.3.2. Any smooth manifold Σ in Ō whose normal vector n satisfies

    a \cdot n = 0 \quad \text{on } \Sigma

is called a characteristic manifold for equation (1.3.1).

It is easy to check that characteristic curves lie on characteristic manifolds. Furthermore, the inflow boundary ∂O⁻ of O is obviously a non-characteristic manifold.

1.4  Linear Second Order Equations

The most general linear second order partial differential equation reads as follows:

    - \sum_{i,j=1}^m a_{ij} \frac{\partial^2 u}{\partial x_i \partial x_j} + \sum_{i=1}^m a_i \frac{\partial u}{\partial x_i} + a_0 u = f     (1.4.1)

(the choice of the minus sign in front of the principal part will be motivated in the sequel). We actually consider the equation in the quasi-divergence form

    - \sum_{i,j=1}^m \frac{\partial}{\partial x_i} \left( a_{ij} \frac{\partial u}{\partial x_j} \right) + \sum_{i=1}^m \frac{\partial}{\partial x_i}(a_i u) + a_0 u = f .     (1.4.2)

As already mentioned above, often the equation is derived in this form; if not, we can transform (1.4.1) into (1.4.2) by an appropriate modification of the lower order coefficients a_i (i = 0, 1, . . . , m).
To simplify the notation, let us introduce the square matrix of order m

    A := (a_{ij})_{1 \le i,j \le m} .     (1.4.3)

Since u_{x_i x_j} = u_{x_j x_i} for any twice continuously differentiable function, it is not restrictive to assume that a_{ij} = a_{ji} for all i and j, i.e., to assume that the matrix A is symmetric. Indeed, if the matrix is not symmetric, we write

    a_{ij} u_{x_i x_j} + a_{ji} u_{x_j x_i} = \frac{1}{2}(a_{ij} + a_{ji}) u_{x_i x_j} + \frac{1}{2}(a_{ij} + a_{ji}) u_{x_j x_i} ,

i.e., we replace A by ½(A + A^T). As before, a = (a_1, . . . , a_m)^T denotes the coefficients of the first order part. Then, (1.4.2) is compactly written as

    Lu = -\nabla \cdot (A \nabla u) + \nabla \cdot (a u) + a_0 u = f ,     (1.4.4)

or, equivalently,

    Lu = -\nabla^T (A \nabla u) + \nabla^T (a u) + a_0 u = f .     (1.4.5)
A linear second order differential equation can be classified according to the structure of its
principal part. This classification is very important: indeed, the type of the equation influences
the kind of boundary and/or initial conditions which are admissible for the equation, the relevant
properties of the solution, as well as the techniques for solving the equation (analytically or
numerically).
The classification is accomplished by looking at the sign of the eigenvalues of the coefficient matrix A (recall that A is symmetric, so all its eigenvalues are real). Note that since the coefficients may depend on x, the type of the equation may vary from point to point. Let us consider A = A(x̄) at a fixed point x̄ ∈ O.
Three situations are most commonly encountered in applications:
(i) all the eigenvalues of A are nonzero, and they all have the same sign; in this case we say that the operator L (or the equation (1.4.4)) is of elliptic type at x̄;
(ii) precisely one eigenvalue of A is zero, while the others have constant sign; in this case we say that the operator L is of parabolic type at x̄;
(iii) all the eigenvalues of A are nonzero, and precisely one eigenvalue has a different sign with respect to the others; in this case we say that the operator L is of hyperbolic type at x̄.
In two independent variables, this classification is exhaustive (since, by assumption, A cannot be the null matrix). The terminology comes from the fact that the level curves in the (ξ_1, ξ_2)-plane of the associated quadratic form

    Q(\xi) = \xi^T A \, \xi , \qquad \xi = (\xi_1, \xi_2)^T

are ellipses, or degenerate parabolae, or hyperbolae, depending on whether the operator L is elliptic, or parabolic, or hyperbolic.


Example 1.4.1. The Poisson equation (1.2.7) is elliptic, the heat equation (1.2.8) is parabolic,
whereas the wave equation (1.2.9) is hyperbolic. Obviously, the type of each equation is the same
at all points in the plane.
Conversely, the Tricomi equation (1.2.10) is of variable type: it is elliptic in the upper half
plane, parabolic on the axis y = 0 and hyperbolic in the lower half plane.
In three or more independent variables, other situations may occur. If A has two or more zero eigenvalues and the remaining ones are of one sign, we say that the operator is ultra-parabolic. If two or more eigenvalues are of one sign, whereas two or more remaining ones are of the opposite sign, we say that L is ultra-hyperbolic. We shall not consider these cases further on.
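Since the classification only requires the signs of the eigenvalues of the symmetric matrix A, it is easy to automate; the helper below (an illustrative sketch, not part of the notes) symmetrizes the coefficient matrix, reports the type, and is applied to the principal parts of the Poisson, heat, wave and Tricomi equations (written in the form -sum a_ij u_{x_i x_j} + ... used above) at a sample point.

# Classify a second order operator from the eigenvalue signs of A (sketch, assumed matrices).
import numpy as np

def classify(A, tol=1e-12):
    A = 0.5 * (np.asarray(A, dtype=float) + np.asarray(A, dtype=float).T)
    lam = np.linalg.eigvalsh(A)
    n_zero = np.sum(np.abs(lam) <= tol)
    n_pos = np.sum(lam > tol)
    n_neg = np.sum(lam < -tol)
    if n_zero == 0 and (n_pos == 0 or n_neg == 0):
        return 'elliptic'
    if n_zero == 1 and (n_pos == 0 or n_neg == 0):
        return 'parabolic'
    if n_zero == 0 and min(n_pos, n_neg) == 1:
        return 'hyperbolic'
    return 'other (ultra-parabolic / ultra-hyperbolic, ...)'

y = 1.0  # sample point for the Tricomi equation
print(classify([[1, 0], [0, 1]]))    # Poisson:  elliptic
print(classify([[1, 0], [0, 0]]))    # heat:     parabolic
print(classify([[1, 0], [0, -1]]))   # wave:     hyperbolic
print(classify([[-y, 0], [0, -1]]))  # Tricomi at y = 1: elliptic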
We now use the classification introduced above to reduce the general second order equation (1.4.4) to a canonical form. To this end, we shall make the simplifying assumption that the coefficients of the principal part are constant (otherwise, one can modify the arguments below by freezing the coefficients in a neighborhood of each point x̄ ∈ O).
Denote by λ_i (i = 1, . . . , m) the eigenvalues of A, and let w_i be the corresponding eigenvectors, which form a complete set since A is symmetric. Define the diagonal matrix Λ := diag(λ_1, . . . , λ_m), as well as the orthogonal matrix S := (w_1, . . . , w_m). The eigenvalue-eigenvector relations, written as AS = SΛ, yield the diagonalization of A

    S^T A \, S = \Lambda .     (1.4.6)

Now, let us fix a point x̄ ∈ O and let us make the change of independent variable

    y = \bar{x} + S^T (x - \bar{x}) .

Denoting by ∇_x the gradient in the x-variable and defining ∇_y similarly, we have by the chain rule

    \nabla_x = S \, \nabla_y .

Indeed, if V = V(y) is a given function and we set v(x) = V(x̄ + S^T(x − x̄)), we have

    \frac{\partial v}{\partial x_i} = \sum_{j=1}^m \frac{\partial V}{\partial y_j} \frac{\partial y_j}{\partial x_i} = \sum_{j=1}^m s_{ij} \frac{\partial V}{\partial y_j} ,

since y_j = x̄_j + \sum_{i=1}^m (S^T)_{ji}(x_i - x̄_i) = x̄_j + \sum_{i=1}^m s_{ij}(x_i - x̄_i).
Substituting into (1.4.5) gives

    Lu = -\nabla_y^T \left( S^T A \, S \, \nabla_y u \right) + \nabla_y^T \left( S^T (a u) \right) + a_0 u = f ;

recalling (1.4.6) and setting a^* := S^T a, we obtain

    Lu = -\nabla_y^T (\Lambda \nabla_y u) + \nabla_y^T (a^* u) + a_0 u = f .     (1.4.7)

Thus, we have diagonalized the principal part of the operator L, i.e.,

    Lu = -\sum_{i=1}^m \lambda_i \frac{\partial^2 u}{\partial y_i^2} + \text{lower order terms} .

In order to proceed, we consider the three main types of equations introduced above.

(i) Elliptic equations.
If the equation is elliptic, we can assume - possibly after changing the sign of the equation - that all the eigenvalues of A are positive. Then, we set

    D := \mathrm{diag}\left( \frac{1}{\sqrt{\lambda_1}}, \ldots, \frac{1}{\sqrt{\lambda_m}} \right)

and we make the further change of variable z = x̄ + D(y − x̄), which implies ∇_y = D∇_z. Thus, setting ā := D a^*, (1.4.7) becomes

    Lu = -\Delta_z u + \nabla_z^T (\bar{a} u) + a_0 u = f ,

where Δ_z is the Laplacian in the z-variable. We conclude that the Laplace operator −Δ is the canonical form of an elliptic operator.

(ii) Parabolic equations.
Set n = m − 1. Suppose that λ_i > 0 for i = 1, . . . , n, whereas λ_m = 0. If the last component a^*_m of the vector a^* appearing in (1.4.7) is zero, then the equation does not contain partial derivatives of u with respect to y_m: it is an elliptic equation in n variables, the variable y_m acting only as a parameter. On the other hand, if a^*_m ≠ 0, we set

    D := \mathrm{diag}\left( \frac{1}{\sqrt{\lambda_1}}, \ldots, \frac{1}{\sqrt{\lambda_n}}, \frac{1}{a^*_m} \right)

and we make the change of variable z = x̄ + D(y − x̄). Writing z = (z_1, . . . , z_n, t) = (z, t) and denoting by ā the n first components of the vector D a^*, we transform (1.4.7) into the form

    D_t u - \Delta_z u + \nabla_z^T (\bar{a} u) + a_0 u = f .

We conclude that the heat operator

    D_t - \Delta

is the canonical form of a (genuinely) parabolic operator.

(iii) Hyperbolic equations.
Set again n = m − 1. Suppose now that λ_i > 0 for i = 1, . . . , n, whereas λ_m < 0. Define

    D := \mathrm{diag}\left( \frac{1}{\sqrt{\lambda_1}}, \ldots, \frac{1}{\sqrt{\lambda_n}}, \frac{1}{\sqrt{-\lambda_m}} \right)

and perform the same change of variable as in the parabolic case. Setting ā := D a^*, (1.4.7) becomes

    D^2_{tt} u - \Delta_z u + \nabla_z^T (\bar{a} u) + a_0 u = f .

We conclude that the wave operator (also termed the d'Alembert operator)

    \Box := D^2_{tt} - \Delta

is the canonical form of a hyperbolic operator.
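The diagonalization step S^T A S = Λ underlying these canonical forms can be checked numerically; the sketch below (illustrative coefficient matrix, not from the notes) uses numpy's symmetric eigendecomposition and then rescales each nonzero eigenvalue to ±1, as in the changes of variable above.

# Sketch: diagonalize the principal part, S^T A S = Lambda, then rescale to +/-1 (assumed A).
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, -1.0]])              # an arbitrary symmetric coefficient matrix

lam, S = np.linalg.eigh(A)               # columns of S are orthonormal eigenvectors
Lambda = np.diag(lam)
print(np.allclose(S.T @ A @ S, Lambda))  # True: the principal part is diagonalized

d = np.where(np.abs(lam) > 1e-12, 1.0 / np.sqrt(np.abs(lam)), 1.0)
D = np.diag(d)
print(np.round(D @ Lambda @ D, 12))      # diag(-1, 1) here: a hyperbolic operator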

1.5  Boundary and Initial Conditions. Characteristics

Let us consider the simple case of a hyperbolic equation, in dimension m = 2. After diagonalization, and assuming that the lower order terms are zero, we have

    -\lambda_1 \frac{\partial^2 u}{\partial y_1^2} - \lambda_2 \frac{\partial^2 u}{\partial y_2^2} = f ,     (1.5.1)

with λ_1 > 0 and λ_2 < 0. Let us define a² := λ_1/|λ_2| > 0; setting x = y_1, t = y_2 and g = f/|λ_2|, we obtain

    D^2_{tt} u - a^2 D^2_{xx} u = g .     (1.5.2)

The equation factorizes as

    (D_t + a D_x)(D_t - a D_x) u = g ,     (1.5.3)

which is equivalent to the first order hyperbolic system

    (D_t + a D_x) w = g     (1.5.4)
    (D_t - a D_x) u = w     (1.5.5)

(note that the + and − signs can be exchanged in these formulae). Recalling the results of Sect. 1.3, u can be obtained by first integrating (1.5.4) along the characteristics x − at = constant, next integrating (1.5.5) along the characteristics x + at = constant. Actually, the families of lines x ∓ at = constant are called the characteristics of equation (1.5.2). In order to uniquely determine the solution, one can prescribe a condition on u for each characteristic line at each boundary point where it enters the region O. Let us detail two examples.
Example 1.5.1. At first, suppose that O is the half-plane {(x, t) : t > 0}. Both characteristics enter O at each point in ∂O; thus, we prescribe u and a non-tangential derivative of u, such as the normal derivative u_t, therein. Precisely, we consider the initial value problem

    u_{tt} - a^2 u_{xx} = 0 , \qquad x \in IR, \; t > 0 ,
    u(x, 0) = u_0(x) , \qquad x \in IR ,     (1.5.6)
    u_t(x, 0) = u_1(x) , \qquad x \in IR

(where, for simplicity, we have chosen g ≡ 0). Taking into account (1.5.4), (1.5.5) and noting that w(x, 0) = (u_t − a u_x)(x, 0) = u_1(x) − a u_0'(x), we first integrate along the characteristics x − at = constant to solve the initial value problem

    w_t + a w_x = 0 , \qquad x \in IR, \; t > 0 ,
    w(x, 0) = u_1(x) - a u_0'(x) , \qquad x \in IR ;

we get w(x, t) = u_1(x − at) − a u_0'(x − at). Next, we integrate along the characteristics x + at = constant to solve the initial value problem

    u_t - a u_x = w , \qquad x \in IR, \; t > 0 ,
    u(x, 0) = u_0(x) , \qquad x \in IR .

We get

    u(x, t) = u_0(x + at) + \int_0^t w(x + at - as, s) \, ds ;

Figure 1.6: The domain of dependence of a point (x, t) (left) and the domain of influence of a point (x_0, 0)
substituting the expression of w and making a change of variable in the integral leads to the final form of the solution:

    u(x, t) = \frac{1}{2} \left[ u_0(x - at) + u_0(x + at) \right] + \frac{1}{2a} \int_{x-at}^{x+at} u_1(s) \, ds .     (1.5.7)

Setting φ(z) = ½ u_0(z) + (1/2a) ∫_0^z u_1(s) ds and ψ(z) = ½ u_0(z) + (1/2a) ∫_z^0 u_1(s) ds, we have

    u(x, t) = \varphi(x + at) + \psi(x - at) .

Note that the solution is the superposition of two signals, traveling leftwards and rightwards, respectively, with speed −a and +a; also note that u at (x, t) only depends on the initial data on the interval [x − at, x + at]. If we had considered our equation with a nonzero right-hand side g, then u(x, t) would have depended on the values of g in the triangle

    T = \{ (x', t') : 0 \le t' \le t, \; x - a(t - t') \le x' \le x + a(t - t') \} .

Indeed, adapting the computations above to the presence of the right-hand side yields

    u(x, t) = \frac{1}{2} \left[ u_0(x - at) + u_0(x + at) \right] + \frac{1}{2a} \int_{x-at}^{x+at} u_1(s) \, ds + \int_0^t ds \int_s^t g(x - a(2\tau - t - s), s) \, d\tau .

We call the region T the domain of dependence of the point (x, t) (see Fig. 1.6, left).
Conversely, the initial values at a point (x_0, 0) influence the solution in the angle

    A = \{ (x, t) : x_0 - at \le x \le x_0 + at \} ;

this region is called the domain of influence of the point (x_0, 0) (see Fig. 1.6, right).
This simple example shows that a second order hyperbolic equation describes the propagation and composition of two signals moving at finite speed; the solution depends locally on the data of the problem (the initial data u_0 and u_1, the right-hand side g).
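The closed formula (1.5.7) is easy to evaluate numerically; the following sketch (data chosen for illustration, not from the notes) implements it with a simple trapezoidal quadrature for the u_1 term and compares the result with the known exact solution for these particular data.

# d'Alembert's formula (1.5.7) with numerical quadrature for the u1 term (assumed data).
import numpy as np

def dalembert(x, t, a, u0, u1, nq=200):
    """u(x,t) = [u0(x-at)+u0(x+at)]/2 + (1/2a) * integral of u1 over [x-at, x+at]."""
    s = np.linspace(x - a * t, x + a * t, nq)
    f = u1(s)
    integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(s))   # trapezoidal rule
    return 0.5 * (u0(x - a * t) + u0(x + a * t)) + integral / (2.0 * a)

a = 1.0
u0 = lambda s: np.sin(s)
u1 = lambda s: np.cos(s)

x, t = 0.7, 0.4
exact = np.sin(x + t)                        # for these data the solution is sin(x + a t)
print(dalembert(x, t, a, u0, u1), exact)     # the two values agree closely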
Example 1.5.2. Let us now consider our equation in the semi-infinite strip
O = {(x, t) : 0 < x < 1, t > 0}.

At each point of the spatial boundary {(0, t) : t > 0} ∪ {(1, t) : t > 0}, one characteristic is entering the domain and one is leaving it, see Fig. 1.7. Thus, one has to prescribe one boundary condition on u; this can be either the value of u or the value of u_x (which is the normal derivative to ∂O therein). For instance, we can consider the following initial-boundary value problem

Figure 1.7: The characteristics entering the domain

    u_{tt} - a^2 u_{xx} = 0 , \qquad 0 < x < 1, \; t > 0 ,
    u(0, t) = \varphi_0(t) , \qquad t > 0 ,
    u_x(1, t) = \varphi_1(t) , \qquad t > 0 ,     (1.5.8)
    u(x, 0) = u_0(x) , \qquad 0 < x < 1 ,
    u_t(x, 0) = u_1(x) , \qquad 0 < x < 1 .

In order to motivate the admissibility of the boundary conditions, let us fix a point P_0 = (0, t_0) on ∂O (see Fig. 1.8). If we prescribe u at this point, say u(0, t_0) = φ_0(t_0), it is convenient to exchange the signs in (1.5.4), (1.5.5); then, we use the boundary data to integrate u along the characteristic line x − at = −a t_0 entering O at P_0, i.e., we solve

    u_t + a u_x = w ,
    u(0, t_0) = \varphi_0(t_0) ,

with w coming from the inside along the characteristic lines x + at = constant.

Figure 1.8: Construction of the solution near a boundary point P_0 = (0, t_0)


Conversely, if we prescribe u_x at P_0, say u_x(0, t_0) = φ_1(t_0), then we factorize the equation as in (1.5.4), (1.5.5) and we observe that w is known at P_0. Indeed, u has already been determined for t ≤ t_0, so

    u_t(0, t_0) = \lim_{t \to t_0^-} \frac{u(0, t) - u(0, t_0)}{t - t_0} ;

thus, w(0, t_0) = u_t(0, t_0) − a φ_1(t_0) and we can integrate w along the characteristic line x − at = −a t_0, i.e.,

    w_t + a w_x = 0 ,
    w(0, t_0) = u_t(0, t_0) - a \varphi_1(t_0) .
For an initial-boundary value problem, the domains of dependence and influence are defined in the obvious way.
Summarizing, at a point belonging to ∂O, we assign as many independent boundary conditions as the number of characteristics entering the domain O for increasing t.
It is instructive to consider the situation in which λ_1 is fixed and λ_2 tends to 0. In this case, the speed a tends to infinity, i.e., signals propagate with faster and faster speed. Geometrically, the slopes of the characteristic lines in the (x, t)-plane tend to 0, and the domains of dependence and influence of any point get wider and wider. In the limit, eq. (1.5.1) becomes the elliptic equation

    -\lambda_1 \frac{\partial^2 u}{\partial x^2} = f

in the sole space variable y_1 = x. The solution at each point x̄ depends on the values of the data f at all points x in the domain, as well as on the boundary data at all boundary points.
If a low order term is present in the equation, i.e., if the equation is

    -\lambda_1 \frac{\partial^2 u}{\partial x^2} - \lambda_2 \frac{\partial^2 u}{\partial t^2} + a_1 \frac{\partial u}{\partial x} + a_2 \frac{\partial u}{\partial t} = f

with a_2 ≠ 0, the limit equation for λ_2 → 0 is parabolic; the solution at each point (x̄, t̄) in the domain depends on all the values of the data f and the boundary data for all t < t̄, as well as on the initial condition u_0 at t = 0. Propagation of signals takes place with infinite speed.
At last, we briefly deal with the concept of characteristic manifold. For a second order partial differential equation, a manifold Σ ⊂ Ō is non-characteristic if the prescription of u and ∇u on Σ uniquely determines the Hessian of u (i.e., the set of all its second order partial derivatives) therein, via the differential equation.
Suppose that the manifold is described by an implicit equation ψ(x) = 0, for a smooth ψ. Fix a point x̄ on Σ and let

    n = \frac{\nabla \psi(\bar{x})}{\|\nabla \psi(\bar{x})\|}

be the normal vector to Σ at x̄. Let us make a change of independent variable y = x̄ + R^T(x − x̄), such that the last coordinate direction is along n. Setting Ψ(y) = ψ(x̄ + (R^T)^{-1}(y − x̄)), we have R^{-1}∇_x ψ = ∇_y Ψ, so we choose R such that R^{-1} n = e_m = (0, . . . , 0, 1)^T. The differential equation in the new coordinates becomes

    -\nabla_y^T \left( R^T A \, R \, \nabla_y u \right) + \text{lower order terms} = f .

We note that all second order derivatives of u except u_{y_m y_m} are determined at x̄ by the values of u and ∇u on Σ. Thus, in order to get the value of u_{y_m y_m}, we must have

    (R^T A \, R)_{mm} = e_m^T R^T A \, R \, e_m = n^T A \, n \ne 0 .
Thus, we are led to the following

Definition 1.5.3. Any smooth (m − 1)-dimensional manifold Σ in Ō whose normal vector n satisfies

    n^T A \, n = 0 \quad \text{on } \Sigma

is called a characteristic manifold for equation (1.4.1).
For instance, the characteristic manifolds of the wave equation (1.5.2) are defined by the equation a²n_x² − n_t² = 0 (where n = (n_x, n_t)^T), i.e., they are precisely the lines x ∓ at = constant. The characteristic manifolds for the heat equation (1.2.8) satisfy n_x² = 0, i.e., they are the lines t = constant.
Finally, an elliptic equation has no (real) characteristic manifolds. This means that, under appropriate regularity conditions, the Cauchy problem

    Lu = f \quad \text{in } O ,
    u = u_0 \quad \text{on } \Sigma ,
    \frac{\partial u}{\partial n} = u_1 \quad \text{on } \Sigma ,

is always uniquely solvable in a neighborhood O of any (m − 1)-dimensional manifold Σ. However, the Cauchy problem is not well-posed for an elliptic equation. This means that arbitrarily small changes in the data u_0 and u_1 may lead to arbitrarily large changes in the solution u, as the following example shows.
Example 1.5.4. Let us consider the Laplace equation in the half-plane

    \Delta u = 0 \quad \text{in } O = \{(x, y) \in IR^2 : y > 0\}

and let us prescribe on ∂O

    u(x, 0) = \frac{\sin nx}{n} , \qquad \frac{\partial u}{\partial n}(x, 0) = -\frac{\partial u}{\partial y}(x, 0) = 0 ,

for a fixed n > 0 (thus, u(x, y) = u_n(x, y)). The solution can be found by the ansatz

    u(x, y) = \frac{\sin nx}{n} \, \hat{u}(y) ,

which reduces the problem to a second order ordinary differential equation with two initial conditions:

    \hat{u}'' - n^2 \hat{u} = 0 , \qquad \hat{u}(0) = 1 , \quad \hat{u}'(0) = 0 .

The result is

    \hat{u}(y) = \frac{e^{ny} + e^{-ny}}{2} = \cosh ny ,

so that the exact solution of the Laplace problem writes as

    u(x, y) = \frac{\sin nx}{n} \cosh ny .

As n → ∞, the initial data converge to 0 uniformly in x, whereas u becomes arbitrarily large in an arbitrarily small neighborhood of ∂O.
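A small numerical illustration of this ill-posedness (not from the notes; the values of n and y are arbitrary): the size of the boundary datum decays like 1/n while the solution at a fixed small height y grows like cosh(ny)/n.

# Hadamard's example: data of size 1/n, solution of size cosh(n*y)/n at height y (sketch).
import numpy as np

y = 0.1                                  # a fixed, small distance from the boundary
for n in (10, 50, 100, 200):
    data_size = 1.0 / n                  # max over x of |sin(n x)/n|
    sol_size = np.cosh(n * y) / n        # max over x of |u_n(x, y)|
    print(n, data_size, sol_size)
# The data go to zero while the solution at y = 0.1 blows up: the Cauchy
# problem for the Laplace equation does not depend continuously on its data.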
For this reason, an elliptic equation is more appropriately supplemented by one boundary condition, involving u and/or ∂u/∂n, at each point of the boundary ∂O of the region O where the equation is set. In this way, one obtains a well-posed problem, as we shall see in Chapter 4.

1.6  Exercises

1.1. Consider the transport equation

    \frac{\partial u}{\partial t} + 2 \frac{\partial u}{\partial x} = 0

in the half-plane {(x, t) : t > 0}, with the initial condition

    u(x, 0) = u_0(x) = \begin{cases} 3 & \text{if } x < 0 \\ 1 & \text{if } x > 0 . \end{cases}

Show that the function

    u(x, t) = \begin{cases} 3 & \text{if } t > \tfrac{1}{2} x \\ 1 & \text{if } t < \tfrac{1}{2} x \end{cases}

is a weak, not classical, solution of the equation.


1.2. Consider the linear equation

    \frac{\partial u}{\partial t} + x \frac{\partial u}{\partial x} = 1

and:
(i) find the general solution;
(ii) solve the initial value problem in O = IR × (0, +∞) with the condition u(x, 0) = u_0(x);
(iii) solve the initial-boundary value problem first for x ∈ [0, 1] and then for x ∈ [1, 2] with the further condition u = g(t) on the inflow boundary.
1.3. The inviscid Burgers equation

    \frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = 0

is the simplest example of a nonlinear transport equation.
(i) Show that the solution u is constant along the characteristics.
(ii) Deduce that the characteristics are straight lines in the half-plane {(x, t) : t > 0}.
(iii) Suppose the initial datum u(x, 0) = u_0(x) is prescribed for every x ∈ IR; find the slope of the characteristics.

1.4. Classify the following second order equations:

    (i) \quad \frac{\partial^2 u}{\partial x^2} + 3 \frac{\partial^2 u}{\partial x \partial y} + \frac{\partial^2 u}{\partial y \partial x} + 4 \frac{\partial^2 u}{\partial y^2} = f

    (ii) \quad \frac{\partial^2 u}{\partial x^2} + y \frac{\partial^2 u}{\partial x \partial y} = g .

Chapter 2

Theory of Distributions
The theory of distributions was created by Laurent Schwartz in 1944; its main purpose is to extend
the results which hold for integrable and differentiable functions to those functions that do not
satisfy the necessary conditions of classical regularity.

2.1  Basic Definitions

Let O be an open set in IR^m; we recall that the set D(O) has been previously defined as

    D(O) = \{ \varphi \in C^\infty(O) \; | \; \mathrm{supp}\, \varphi \text{ is a compact subset of } O \},

where supp φ denotes the support of φ, i.e., the closure of the set of all points x in O such that φ does not vanish on them:

    \mathrm{supp}\, \varphi = \overline{ \{ x \in O \; | \; \varphi(x) \ne 0 \} } ;

it is easy to verify that D(O) is a linear space.
Let us now introduce the following notion of convergence in D(O):

Definition 2.1.1. A sequence {φ_n}_{n≥0} ⊂ D(O) is said to converge to φ ∈ D(O) if:
(i) there exists a compact set K ⊂ O which contains all the supports of the φ_n and of φ;
(ii) for all multi-integers α ∈ IN^m, the sequence {D^α φ_n}_{n≥0} converges to D^α φ uniformly on K, i.e.,

    \| D^\alpha \varphi_n - D^\alpha \varphi \|_{\infty, K} \to 0 \quad \text{as } n \to \infty .
We are now ready to discuss the concept of distribution.

Definition 2.1.2. A distribution (or generalized function) is a linear form

    T : D(O) \to IR

such that if {φ_n}_{n≥0} converges to φ in D(O) then {T(φ_n)}_{n≥0} also converges to T(φ) in IR when n → ∞.

The set of all distributions on O is a linear space denoted by D′(O). Moreover, the notation ⟨T, φ⟩ is often used instead of T(φ) and it is called a duality form.

Example 2.1.3. Let f be a real-valued and Riemann (or Lebesgue) integrable function on O; let us set

    \langle T_f, \varphi \rangle := \int_O f(x) \varphi(x) \, dx , \qquad \forall \varphi \in D(O),

and let us verify that T_f is a distribution. To do this, we have to check the properties of the previous definition; in particular:
(i) T_f is certainly a linear form because it is real-valued and the integral is a linear operator;
(ii) suppose {φ_n}_{n≥0} ⊂ D(O) is a sequence such that ‖φ_n − φ‖_{∞,K} → 0 when n → ∞ for a certain φ ∈ D(O); then

    \langle T_f, \varphi_n \rangle - \langle T_f, \varphi \rangle = \langle T_f, \varphi_n - \varphi \rangle = \int_O f(x) [\varphi_n(x) - \varphi(x)] \, dx

and so

    |\langle T_f, \varphi_n \rangle - \langle T_f, \varphi \rangle| \le \int_K |f(x)| \, |\varphi_n(x) - \varphi(x)| \, dx \le \|\varphi_n - \varphi\|_{\infty, K} \int_K |f(x)| \, dx = G \, \|\varphi_n - \varphi\|_{\infty, K} \to 0 ,

where G = ∫_K |f(x)| dx is a finite constant that comes from the hypothesis that f is integrable on K. Thus we have ⟨T_f, φ_n⟩ → ⟨T_f, φ⟩ when n → ∞.
Note that the following equality holds true:

    \int_O f(x) [\varphi_n(x) - \varphi(x)] \, dx = \int_K f(x) [\varphi_n(x) - \varphi(x)] \, dx ,

because the supports of all the φ_n's and of φ are contained in K, so the integral vanishes on O \ K. This implies that only a local integrability of f on subdomains of O, and not on the whole set O, is needed to define the distribution T_f.
Throughout this chapter, we shall refer to this type of distribution as a function-like distribution.
Example 2.1.4 (The Dirac delta). Consider a point x_0 ∈ O; we introduce now the following form

    \langle \delta_{x_0}, \varphi \rangle := \varphi(x_0) , \qquad \forall \varphi \in D(O),

and we want to verify that it is a distribution in the sense of Definition 2.1.2.
(i) The linearity is obvious.
(ii) Let us suppose that φ_n → φ in D(O); by Definition 2.1.1, for all α we have a uniform convergence of D^α φ_n to D^α φ; then for |α| = 0 it follows

    \max_{x \in K} |\varphi_n(x) - \varphi(x)| \to 0 \quad \text{as } n \to \infty

and so, if x_0 ∈ K:

    |\langle \delta_{x_0}, \varphi_n \rangle - \langle \delta_{x_0}, \varphi \rangle| = |\varphi_n(x_0) - \varphi(x_0)| \le \max_{x \in K} |\varphi_n(x) - \varphi(x)| \to 0 .

If x_0 ∉ K, then it results directly φ_n(x_0) − φ(x_0) = 0 for all n.

Figure 2.1: A piecewise constant approximation of the Dirac delta δ_0.


Such a distribution is called the Dirac delta on the point x 0 ; it is possible to show (see Exercise
2.1) that it is not a function-like distribution, i.e., it does not exist any function f such that the
action of x 0 on a test function D(O) can be expressed as the integral on O of f versus .
After introducing the notion of convergence in D(O), it would be useful to provide a similar
tool for the space D (O) too. This is accomplished by the following
Definition 2.1.5. Let T, Tn D (O), n 0; the sequence {Tn }n0 is said to converge to T in
the sense of D (O) if
n
hTn , i hT, i
for every D(O).
This definition leads us to an important characterization of the Dirac delta. Let us set O = IR and T = δ_0, ⟨T, φ⟩ = φ(0) for all φ ∈ D(IR); then, for every n > 0, let us define the function (see Figure 2.1)

    f_n(x) = \begin{cases} n & \text{if } |x| \le \frac{1}{2n} \\ 0 & \text{otherwise.} \end{cases}

We can observe that the integral

    \int_{IR} f_n(x) \, dx = \int_{-1/(2n)}^{1/(2n)} n \, dx = 1

does not depend on n: every function f_n has therefore the same unitary area on IR. If we now consider the family of distributions T_{f_n}, we have:

    \langle T_{f_n}, \varphi \rangle = \int_{IR} f_n(x) \varphi(x) \, dx = n \int_{-1/(2n)}^{1/(2n)} \varphi(x) \, dx = n \cdot \frac{1}{n} \, \varphi(x_n) = \varphi(x_n) ,

where x_n is a point in the interval [−1/(2n), 1/(2n)] whose existence is guaranteed by the Integral Mean Value Theorem. It is clear that x_n → 0 when n → ∞; then using the continuity of φ gives

    \langle T_{f_n}, \varphi \rangle \to \varphi(0) = \langle \delta_0, \varphi \rangle \quad \text{as } n \to \infty .

Since this argument holds for every φ ∈ D(IR), we conclude that T_{f_n} → δ_0 in the sense of D′(IR). This shows that, although the Dirac delta cannot be represented by a classical function, it can nevertheless be obtained as a limit of classical functions in the sense of Definition 2.1.5.
In general, it is easy to check that any sequence {f_n} of integrable functions satisfying ∫_{IR} f_n(x) dx = 1 and supp f_n ⊂ B(0, r_n) with r_n → 0 as n → ∞, converges to δ_0 in D′(IR) as n → ∞.
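This convergence is easy to observe numerically; the sketch below (illustrative test function and quadrature, not from the notes) evaluates ⟨T_{f_n}, φ⟩ = n ∫_{-1/2n}^{1/2n} φ(x) dx for increasing n and watches it approach φ(0).

# Numerical illustration: <T_{f_n}, phi> -> phi(0) as n grows (assumed test function).
import numpy as np

phi = lambda x: np.exp(-x**2) * np.cos(x)   # a smooth "test-like" function; phi(0) = 1

def pairing(n, nq=10001):
    # <T_{f_n}, phi> = n * integral of phi over [-1/(2n), 1/(2n)] (trapezoidal rule).
    x = np.linspace(-0.5 / n, 0.5 / n, nq)
    y = phi(x)
    return n * np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

for n in (1, 10, 100, 1000):
    print(n, pairing(n))    # values approach phi(0) = 1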
Definition 2.1.6. A distribution T is said to be of finite order if there exist r ∈ IN and a constant C_r > 0 such that

    |\langle T, \varphi \rangle| \le C_r \max_{|\alpha| \le r} \| D^\alpha \varphi \|_{\infty, O} , \qquad \forall \varphi \in D(O).

The smallest r for which this condition holds is called the order of the distribution.
Example 2.1.7. Let T_f ∈ D′(O); then

    |\langle T_f, \varphi \rangle| = \left| \int_O f(x) \varphi(x) \, dx \right| \le \|\varphi\|_{\infty, O} \int_O |f(x)| \, dx

and, if f is integrable over O, we have

    \int_O |f(x)| \, dx = C < +\infty ,

so

    |\langle T_f, \varphi \rangle| \le C \, \|\varphi\|_{\infty, O} .

In this case r = 0, hence T_f is a distribution of order zero.
It is possible to verify that this is also the order of the Dirac delta δ_{x_0}.
Definition 2.1.8. Let T ∈ D′(O); the support of T is the smallest closed set K ⊂ Ō such that

    \forall \varphi \in D(O), \quad \mathrm{supp}\, \varphi \subset O \setminus K \; \Longrightarrow \; \langle T, \varphi \rangle = 0 .

This definition states that the support of a distribution T is strictly related to those of test functions. More in detail, the support K of T is the smallest closed set in Ō that has the following property: every test function that vanishes on the whole of K, i.e., such that its support does not intersect K, sees T as zero.
For instance, if we take T = δ_{x_0}, x_0 ∈ O, we find supp δ_{x_0} = {x_0} because every test function φ whose support does not contain x_0 is such that φ(x_0) = 0 and so ⟨δ_{x_0}, φ⟩ = 0.
As another example, let us consider an integrable function f with a compact support in O; then supp T_f = supp f.
Example 2.1.9. Consider an open set O in IR^m and let Σ be a closed (m − 1)-dimensional regular manifold contained in O; let g be an integrable function defined on Σ. Then, the distribution δ_{Σ,g} defined as

    \langle \delta_{\Sigma, g}, \varphi \rangle = \int_\Sigma g(\sigma) \varphi(\sigma) \, d\sigma , \qquad \forall \varphi \in D(O),

is of order zero with support equal to Σ.

2.2  Derivatives of Distributions

In this section, the main results from the differential theory of distributions are presented. In particular, we shall see, with the aid of many examples, in which sense such a theory represents a generalization of the classical one and what meaning has to be given to the word "derivative" when referred to a distribution.
Let us start with this basic definition.

Definition 2.2.1. Let α ∈ IN^m and T ∈ D′(O); the partial derivative of T of order α is the distribution D^α T whose action on a test function φ ∈ D(O) is defined as

    \langle D^\alpha T, \varphi \rangle = (-1)^{|\alpha|} \langle T, D^\alpha \varphi \rangle .

We can immediately observe that, in the sense of this definition, all distributions are infinitely differentiable, since the derivative is moved onto the test function, which is of class C^∞(O). The following example will explain the reason for such a definition and where it comes from.
Example 2.2.2. Let f ∈ C^1(O) and consider the distribution T = T_f; in order to calculate its derivative D_i T_f, we set α = (0, . . . , 0, 1, 0, . . . , 0) (where the only component of the multi-integer different from zero is the i-th) and then we apply Definition 2.2.1:

    \langle D_i T_f, \varphi \rangle = - \langle T_f, D_i \varphi \rangle = - \int_O f(x) \frac{\partial \varphi}{\partial x_i}(x) \, dx = \int_O \frac{\partial f}{\partial x_i}(x) \varphi(x) \, dx = \langle T_{D_i f}, \varphi \rangle

for all φ ∈ D(O); we recall that in applying the integration-by-parts formula no boundary term appears, since a test function vanishes in a neighborhood of ∂O.
We conclude that D_i T_f = T_{D_i f}, i.e., the partial derivative with respect to x_i of the distribution based on the function f is the distribution based on the function D_i f, which exists in the classical sense under the hypothesis f ∈ C^1(O). As we have just seen, this result follows from the integration-by-parts formula and it allows us to calculate the derivatives of a function-like distribution in a somewhat classical way.
Example 2.2.3. Let O = IR and consider the function

    f(x) = \begin{cases} 2x & \text{if } x \ge 0 \\ -x & \text{if } x < 0 , \end{cases}

which is not differentiable in the classical sense because of the singularity at the origin. Nevertheless, in the distributional sense we have:

    \langle (T_f)', \varphi \rangle = - \langle T_f, \varphi' \rangle = - \int_{IR} f(x) \varphi'(x) \, dx
    = \int_{-\infty}^0 x \, \varphi'(x) \, dx - \int_0^{+\infty} 2x \, \varphi'(x) \, dx
    = - \int_{-\infty}^0 \varphi(x) \, dx + \int_0^{+\infty} 2 \varphi(x) \, dx = \int_{IR} g(x) \varphi(x) \, dx = \langle T_g, \varphi \rangle , \qquad \forall \varphi \in D(IR),

where g is the function

    g(x) = \begin{cases} 2 & \text{if } x > 0 \\ -1 & \text{if } x < 0 ; \end{cases}

then (T_f)' = T_g or, as one often writes, f' = g in the sense of distributions.
Note that the derivative of T_f is itself a function-like distribution; this depends strictly on the fact that f is continuous on IR.
Example 2.2.4. Let us now consider the function

    f(x) = \begin{cases} 2x + 1 & \text{if } x > 0 \\ -x & \text{if } x < 0 , \end{cases}

which is discontinuous at the origin. In this case we have:

    \langle (T_f)', \varphi \rangle = - \langle T_f, \varphi' \rangle = - \int_{IR} f(x) \varphi'(x) \, dx
    = \int_{-\infty}^0 x \, \varphi'(x) \, dx - \int_0^{+\infty} (2x + 1) \varphi'(x) \, dx
    = - \int_{-\infty}^0 \varphi(x) \, dx + \int_0^{+\infty} 2 \varphi(x) \, dx - \left[ (2x + 1) \varphi(x) \right]_0^{+\infty}
    = \int_{IR} g(x) \varphi(x) \, dx + \varphi(0) , \qquad \forall \varphi \in D(IR),

where g is defined as in the example above. Then

    \langle (T_f)', \varphi \rangle = \langle T_g, \varphi \rangle + \langle \delta_0, \varphi \rangle , \qquad \forall \varphi \in D(IR),

and consequently (T_f)' = T_g + δ_0, which is no longer a function-like distribution although T_f is.


In general, if one has

    f(x) = \begin{cases} f_+(x) & \text{if } x > x_0 \\ f_-(x) & \text{if } x < x_0 , \end{cases}

with f_+ ∈ C^1[x_0, +∞), f_- ∈ C^1(−∞, x_0], then

    (T_f)' = T_g + |[f]|_{x = x_0} \, \delta_{x_0} ,     (2.2.1)

where

    g(x) = \begin{cases} f_+'(x) & \text{if } x > x_0 \\ f_-'(x) & \text{if } x < x_0 \end{cases}

and |[f]|_{x=x_0} denotes the jump of f at the point x_0. Therefore, (T_f)' is a function-like distribution if, and only if, f is continuous at x_0; in fact, in this case |[f]|_{x=x_0} = 0, which eliminates the delta from the expression (2.2.1).
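Formula (2.2.1) can be checked numerically for the function of Example 2.2.4: for a test function φ, the two pairings −∫ f φ' dx and ∫ g φ dx + |[f]| φ(0) must coincide. The sketch below does this with a simple quadrature (the test function and grid are illustrative choices, not from the notes).

# Check (T_f)' = T_g + jump * delta_0 for f of Example 2.2.4 (assumed test function).
import numpy as np

f = lambda x: np.where(x > 0, 2 * x + 1, -x)
g = lambda x: np.where(x > 0, 2.0, -1.0)
jump = 1.0                                   # f(0+) - f(0-) = 1

phi = lambda x: np.exp(-1.0 / (1.0 - x**2)) * (np.abs(x) < 1)   # compactly supported bump
x = np.linspace(-0.999, 0.999, 200001)
dx = x[1] - x[0]
dphi = np.gradient(phi(x), dx)

lhs = -np.sum(f(x) * dphi) * dx              # -<T_f, phi'>
rhs = np.sum(g(x) * phi(x)) * dx + jump * phi(0.0)
print(lhs, rhs)                              # the two values agree closely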
Example 2.2.5. The Heaviside function is defined as

    H(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{if } x < 0 ; \end{cases}

from (2.2.1) it follows (T_H)' = δ_0. The Heaviside function is thus a primitive of the Dirac delta in the sense of distributions.
More often one writes H' = δ_0, where the derivative is, of course, intended in the distributional sense.

Example 2.2.6. Consider the Dirac delta δ_0 ∈ D′(IR) and let φ ∈ D(IR); from Definition 2.2.1 one has

    \langle \delta_0', \varphi \rangle = - \langle \delta_0, \varphi' \rangle = - \varphi'(0) ,
    \langle \delta_0'', \varphi \rangle = \langle \delta_0, \varphi'' \rangle = \varphi''(0) ,
    \vdots
    \langle \delta_0^{(k)}, \varphi \rangle = \cdots = (-1)^k \varphi^{(k)}(0) , \qquad \forall k \in IN.

Example 2.2.7. The multidimensional counterpart of the general situation considered in Example 2.2.4 is as follows. Let O be an open set in IR^m and let Σ be an (m − 1)-dimensional regular manifold contained in O, which splits O as O^- ∪ O^+ ∪ Σ, with O^± open disjoint sets such that Ō^- ∩ Ō^+ = Σ. Let the function f satisfy

    f(x) = \begin{cases} f_-(x) & \text{if } x \in O^- \\ f_+(x) & \text{if } x \in O^+ , \end{cases}

with f_+ ∈ C^1(O^+), f_- ∈ C^1(O^-). Then, for any i = 1, . . . , m, one has

    D_i(T_f) = T_{g_i} + \delta_{\Sigma, h_i} ,     (2.2.2)

where

    g_i(x) = \begin{cases} D_i f_-(x) & \text{if } x \in O^- \\ D_i f_+(x) & \text{if } x \in O^+ \end{cases}

and

    h_i(\sigma) = |[f]|_\sigma \, n_i(\sigma) , \qquad \sigma \in \Sigma,

where |[f]|_σ denotes the jump of f at the point σ in going from O^- to O^+, and n_i is the i-th component of the unit normal vector to Σ pointing from O^- to O^+.
We prove the result in the particular case in which f is the Heaviside function associated with the given partition of O, i.e.,

    H(x) = \begin{cases} 0 & \text{if } x \in O^- \\ 1 & \text{if } x \in O^+ . \end{cases}

Let us compute D_i(T_H) in the sense of distributions. Using the divergence theorem (see (3.1.3)), we have

    \langle D_i(T_H), \varphi \rangle = - \int_O H(x) \frac{\partial \varphi}{\partial x_i}(x) \, dx = - \int_{O^+} \frac{\partial \varphi}{\partial x_i}(x) \, dx = \int_\Sigma \varphi \, n_i \, d\sigma = \langle \delta_{\Sigma, n_i}, \varphi \rangle

for all φ ∈ D(O); this is precisely (2.2.2) in the present situation.
We refer to Exercise 2.6 for the proof of the general result.

2.3  Study of the Laplace Operator in D′(O)

In this section we study the Laplacian

    \Delta = \sum_{i=1}^m \frac{\partial^2}{\partial x_i^2}

as an operator into the space of distributions D′(O); in particular, we are interested in those functions g : O ⊆ IR^m → IR whose Laplacian is the Dirac delta δ_0 at the origin.
Definition 2.3.1. Every function g = g(x), x ∈ IR^m, such that

    \Delta g = \delta_0 \quad \text{in } D'(O)

is said to be a fundamental solution of the Laplacian.

Let us start with m = 1 (dimension 1); if we take the function u(x) = |x|, it is easy to verify that u'(x) = sign(x) and thus u''(x) = 2δ_0, as it immediately follows from (2.2.1). Hence, the function g(x) = ½ u(x) = ½ |x| is a fundamental solution of the Laplacian on IR.
Let us now consider m = 2; in this case, it is convenient to use the polar coordinates defined by the transformation

    \Phi : [0, +\infty) \times [0, 2\pi) \to IR^2 , \qquad (r, \theta) \mapsto (x, y) = (r \cos\theta, r \sin\theta) ;

since we can think of every function u = u(x, y) as u(x, y) = u(\Phi(r, \theta)) = U(r, \theta), we have the relationship Δ_{(x,y)} u(x, y) = Δ_{(r,θ)} U(r, θ), where Δ_{(x,y)} and Δ_{(r,θ)} denote the Laplacian in cartesian and polar coordinates respectively, with (see Exercise 2.8)

    \Delta_{(r,\theta)} = \frac{\partial^2}{\partial r^2} + \frac{1}{r} \frac{\partial}{\partial r} + \frac{1}{r^2} \frac{\partial^2}{\partial \theta^2} .     (2.3.1)

If we take the function u(x, y) = \log \sqrt{x^2 + y^2} = \log r and we set (x, y) ≠ (0, 0), we obtain from (2.3.1)

    \Delta u = \Delta_{(r,\theta)} \log r = -\frac{1}{r^2} + \frac{1}{r^2} = 0 ;

hence, log r is a harmonic function in the classical sense everywhere in the plane except at the origin.
Let us now calculate Δu in the sense of distributions; taking φ ∈ D(IR^2) we have

    \langle \Delta u, \varphi \rangle = \langle u, \Delta \varphi \rangle = \int_{IR^2} \log r \, \Delta \varphi \, dx \, dy = \lim_{\varepsilon \to 0^+} \int_{IR^2 \setminus B(0,\varepsilon)} \log r \, \Delta \varphi \, dx \, dy ,

where B(0, ε) is the open ball of radius ε > 0 centered at the origin. Applying the integration-by-parts formula gives

    \langle \Delta u, \varphi \rangle
    = \lim_{\varepsilon \to 0^+} \left[ \int_{r > \varepsilon} \Delta(\log r) \, \varphi \, dx \, dy + \int_{r = \varepsilon} \left( \log r \, \frac{\partial \varphi}{\partial n} - \frac{\partial \log r}{\partial n} \, \varphi \right) d\gamma \right]
    = \lim_{\varepsilon \to 0^+} \left[ \int_{r = \varepsilon} \log r \, \frac{\partial \varphi}{\partial n} \, d\gamma - \int_{r = \varepsilon} \frac{\partial \log r}{\partial n} \, \varphi \, d\gamma \right] ,

where the result Δ(log r) = 0 away from the origin has been used.
Since B(0, ε) is a circle, the normal vector n on its circumference r = ε is a radial vector, which allows us to write

    \frac{\partial \log r}{\partial n} = - \frac{d}{dr} \log r = - \frac{1}{r} ,

where the minus sign depends only on the fact that ∂(log r)/∂n = ∇(log r) · n should be negative, because the two vectors ∇(log r) and n point in opposite directions. Thus

    - \int_{r = \varepsilon} \frac{\partial \log r}{\partial n} \, \varphi \, d\gamma = \int_{r = \varepsilon} \frac{1}{\varepsilon} \, \varphi \, d\gamma = 2\pi \, \frac{1}{2\pi\varepsilon} \int_{r = \varepsilon} \varphi \, d\gamma .

Note that \frac{1}{2\pi\varepsilon} \int_{r=\varepsilon} \varphi \, d\gamma is the mean value that φ takes along the circumference r = ε; since φ is continuous, it follows

    \lim_{\varepsilon \to 0^+} 2\pi \, \frac{1}{2\pi\varepsilon} \int_{r = \varepsilon} \varphi \, d\gamma = 2\pi \, \varphi(0, 0) .

Moreover,

    \int_{r = \varepsilon} \log r \, \frac{\partial \varphi}{\partial n} \, d\gamma = \log \varepsilon \int_{r = \varepsilon} \frac{\partial \varphi}{\partial n} \, d\gamma

and it results

    \left| \int_{r = \varepsilon} \frac{\partial \varphi}{\partial n} \, d\gamma \right| = \left| \int_{r = \varepsilon} \nabla \varphi \cdot n \, d\gamma \right| \le \int_{r = \varepsilon} |\nabla \varphi \cdot n| \, d\gamma \le \int_{r = \varepsilon} \|\nabla \varphi\| \, d\gamma \le \max_{(x,y) \in IR^2} \|\nabla \varphi\| \int_{r = \varepsilon} d\gamma = 2\pi\varepsilon \max_{(x,y) \in IR^2} \|\nabla \varphi\| ;

in the third passage, the Cauchy-Schwarz inequality has been used together with the fact that ‖n‖ = 1 (here ‖ · ‖ denotes the Euclidean norm in IR^2). Since

    2\pi \max_{(x,y) \in IR^2} \|\nabla \varphi\| = M

is a real nonnegative finite constant, we conclude that

    \left| \int_{r = \varepsilon} \log r \, \frac{\partial \varphi}{\partial n} \, d\gamma \right| \le M \, \varepsilon \, |\log \varepsilon| \to 0 \quad \text{as } \varepsilon \to 0^+ ,

and finally

    \langle \Delta u, \varphi \rangle = 2\pi \, \varphi(0, 0) = 2\pi \, \langle \delta_0, \varphi \rangle , \qquad \forall \varphi \in D(IR^2),

that is

    \Delta u = 2\pi \, \delta_0 \quad \text{in } D'(IR^2).

Hence, the function g(x, y) = \frac{1}{2\pi} u(x, y) = \frac{1}{2\pi} \log \sqrt{x^2 + y^2} is a fundamental solution for the Laplacian on IR^2.
In three dimensions, with the aid of the spherical coordinates, it can be found that the function

u(x, y, z) = p

1
x2

+ y2 + z2

is such that u = 40 ; it is therefore proportional to a fundamental solution on IR3 .


In general, we have the following expressions for the fundamental solutions of the Laplacian:

r
m=1

v
1

um

m=2
log r
uX
2
t
x2i
(2.3.2)
r
=
kx
k
=
g(x ) =
1 1

m
=
3
i=1

4 r

1
1

m2 m 4

(m 2)m r

34

CHAPTER 2. THEORY OF DISTRIBUTIONS

2 m/2
is the surface area of the unit sphere in IRm .
(m/2)
It is obvious that adding any harmonic function to g, i.e., a function v such that v 0,
leads to another fundamental solution of the Laplacian. Actually, we are more interested in the
existence rather than in the uniqueness of the fundamental solutions, since their importance is
due to the fact that they provide a powerful tool for solving the following more general matter:
find u such that u = f in O, where f is a given bounded function with a compact support and
(i.e. f L1 (O)).

integrable on O
Note that, given any function g such that g = 0 , the new function

where m =

G(x , y ) := g(x y ),

x , y IRm

has the following property: if we denote by x the Laplacian with respect to the variable x , then
in D (IRm ),

x G = y

because the singularity of g has now been moved from the origin to the point y .
Let us set
Z
g(x y )f (y) dy =
u(x ) := (f g)(x ) =
O
Z
G(x , y )f (y ) dy;
=
O

if we calculate the Laplacian of u in the sense of distributions we obtain


Z
u(x )(x ) dx =
hu, i = hu, i =
O
Z Z
G(x , y )f (y )(x ) dx dy =
=
O O
Z
Z
G(x , y )(x ) dx dy ;
f (y)
=
O

but from Definition 2.2.1 we have


Z
G(x , y )(x ) dx = hG, i = hx G, i = hy , i = (y )
O

and then
hu, i =

f (y)(y ) dy = hf, i;

since this argument holds for every test function D(O), we conclude that such a u is a solution
of the elliptic equation u = f in the sense of distributions.

2.4

Exercises

2.1. Prove that the Dirac delta is not a function-like distribution, i.e., that it does not exist any
integrable function f : O IRm IR such that
Z
f (x )(x ) dx ,
D(O).
hx0 , i =
O

2.4. EXERCISES

35

2.2. Consider an open set O in IRm and let be a (m 1)-dimensional regular manifold contained
in O. Moreover, let g be an integrable function defined on ; prove that the formula
Z

g()
hT,g , i =
() d
n

defines a distribution T,g D (O) which is not function-like.


2.3. Find all functions u : IR IR such that u = 21 3 in D (IR).
2.4. For every n 1 let us set

calculate lim fn in D (IR).


n

if x < 0 or x >
0
fn (x) = n2 x
if 0 x n1

2n n2 x if n1 < x n2 ;

2
n

2.5. Let be the straight line in the plane having equation y = 2x. Define then the distribution
D (IR2 ) such that
Z
() d
h , i =

for every test function D(IR ).


(i) Find all functions u : IR2 IR such that
u
=
x

in D (IR2 ).

(ii) Find all functions v : IR2 IR such that


v
v
+
=
x y

in D (IR2 ).

2.6. Prove the identity (2.2.2) in the general case.


2.7. Find the distributions u D (IR2 ) such that
u
= 0
x

in D (IR2 ).

Which is the support of u?


2.8. Prove that the Laplacian in polar coordinates is given by equation (2.3.1).

36

CHAPTER 2. THEORY OF DISTRIBUTIONS

Chapter 3

Sobolev Spaces
3.1

Motivation

In order to motivate the introduction of the Sobolev space H1 (), let us consider the following
Dirichlet boundary-value problem for a general second-order elliptic operator Lu:
(
Lu = (Au) + (au) + a0 u = f in
(3.1.1)
u = 0 on .
Here, A, a, a0 and f are known functions defined in ; precisely, A takes its values in the space
of symmetric and positive-definite matrices of order n, a is a vector-valued function, whereas a0
and f are scalar functions.
We aim at giving a weak (or integral, or variational) formulation of this problem, which
corresponds to the general form (1.2.16). At the beginning, we will proceed in a formal manner,
assuming that all mathematical operations are permitted; then, step by step, we will envisage a
set of assumptions on the data of the problem (the coefficients of the operator, the right-hand
side, the domain) which make the resulting formulation mathematically rigorous.
The starting point consists of multiplying the first equation in (3.1.1) by a test function v and
integrating over , to get
Z
Z
Z
Z
fv .
(3.1.2)
a0 uv =
(au)v +
(Au)v +

Next, we perform an integration-by-parts in the first and second term on the left-hand side.
Precisely, we invoke the divergence theorem
Z
Z
Fn,
(3.1.3)
F=

where F is a vector field and n is the unit vector which is normal to and pointing outwards,
as well as the differentiation rule for a product
(v) = ( ) v + v ,

(3.1.4)

where is a vector field and v is a scalar function. Applying (3.1.3) and (3.1.4) to F = v with
= Au, we obtain
Z
Z
Z
n (Au) v ;
(Au) v =
(Au)v +

37

38

CHAPTER 3. SOBOLEV SPACES

if we introduce the conormal derivative of u on with respect to A, i.e. the function


u
= n (Au)
nA

(3.1.5)

u
= n u when A is the identity matrix), we get
n
Z
Z
Z
u
(Au) v
(Au)v =
v.
(3.1.6)
nA

(which coincides with the normal derivative

On the other hand, applying (3.1.3) and (3.1.4) to F = v with = au yields


Z
Z
Z
a n uv .
(au)v = u a v +

(3.1.7)

Combining the two previous results, we can write (3.1.2) as


Z
Z
Z
Z
Z
Z
u
a0 uv
fv .
u a v +
a n uv =
(Au) v
v+
nA

(3.1.8)

Now, we observe that u is required to vanish on ; therefore, from now on, we will require
that our test functions v vanish on , too (note that functions in D() do satisfy this condition).
Then, (3.1.8) simplifies as
Z
Z
Z
Z
fv .
(3.1.9)
a0 uv =
u a v +
(Au) v

Note that this equation only involves first-order partial derivatives of u and v.
Next, we make assumptions on the functions appearing in (3.1.9), so that all integrals therein
are guaranteed to be meaningful and finite. On the left-hand side, we have integrals of products
of three functions, such as
Z
Z
Z
u v
v
aij
ai
a0 uv ,
or
u
or
xj xi
xi

whereas the right-hand side is the integral of the product of two functions. Thus, we set ourselves in
the framework of the Lebesgue Integration Theory, which, in particular, ensures that the product

of two functions is integrable in , i.e., L1 () if Lp () and Lp () with


p, p [1, ] satisfying p1 + p1 = 1; furthermore, the following Holder inequality holds:
Z
Z
1/p
Z
1/p Z


p
p

||
||
= kkLp () kkLp ()
||

(3.1.10)

R
1/p
has to be replaced by ess sup ||, and similarly if p = ). This
(if p = , the term ||p

result extends to the product of three functions, i.e., L1 () if Lp (), Lp () and

Lp () with p, p , p [1, ] satisfying p1 + p1 + p1 = 1; in this case, one has


Z



kkLp () kk p kk p
.
(3.1.11)


L ()
L ()

The structure of the integrals in (3.1.9) suggests to work in a Hilbertian setting, i.e., to assume that
u, v and their first derivatives belong to L2 (). More precisely, the previous results tell us that

3.2. THE SPACE H1 ()

39

R
f v is well-defined if f and v L2 (); a0 uv is well-defined if a0 L () and u, v L2 ();
R
v
v
an integral of the form ai x
u is well-defined if ai L (), u L2 () and x
L2 (); finally,
i
i
R
u v
u
v
an integral of the form aij x
is well-defined if aij L (), x
and x
L2 (). In
j xi
j
i
conclusion, if we assume that

A (L ())nn ,

a (L ())n ,

a0 L (),

f L2 () ,

(3.1.12)

then u and v should belong to L2 () together with all their first-order partial derivatives. Such
derivatives have to be considered in the sense of distributions, since u and v are merely L2 integrable functions, and not classical differentiable functions.
This leads us to introduce the Sobolev space H1 () and, subsequently, its closed subspace
1
H0 () of the functions vanishing on . This will be the appropriate space for setting the weak
formulation of problem (3.1.1) and for studying its well-posedness.

3.2

The space H1 ()

Motivated by the previous discussion, we introduce the space H1 () as follows.


Definition 3.2.1. H1 () is the subspace of L2 () of the functions whose first-order partial derivatives, in the distributional sense, belong to L2 (), i.e.,
H1 () = {v L2 () :

v
L2 () for 1 i n} = {v L2 () : v (L2 ())n } .
xi

H1 () is endowed with the inner product


(u, v)H1 () = (u, v)L2 () +

n 
X
u
i=1

v
,
xi xi

L2 ()

= (u, v)L2 () + (u, v)(L2 ())n ,

which induces the norm


kvkH1 () =

kvk2L2 ()


n
X
v 2


+
xi 2
i=1

L ()

!1/2

1/2

= kvk2L2 () + kvk2(L2 ())n
.

We point out that the requirement v/xi L2 () means that there exists gi L2 () such that
Tv /xi = Tgi in the sense of distributions, i.e.,
Z
Z

h
gi = hTgi , i
D() ;
Tv , i = hTv ,
i= v
=
xi
xi
xi

then, gi is identified to v/xi .


By the very definition of the norm in H1 (), one has kvkL2 () kvkH1 () for all v H1 (),
i.e., the inclusion H1 () L2 () is continuous.
Next property is one of the fundamental properties of H1 ().
Property 3.2.2. H1 () is a Hilbert space.
Proof. Let {vk }k0 be a Cauchy sequence in H1 ()-norm, i.e., > 0, k IN such that
, m > k one has kv vm kH1 () < . This immediately implies that each sequence {vk }k0 ,

40

CHAPTER 3. SOBOLEV SPACES

{vk /xi }k0 for i = 1, . . . , n, is a Cauchy sequence in L2 (). By the completeness of this space,
there exist functions v and gi , i = 1, . . . , n, belonging to L2 (), such that
lim vk = v ,

vk
= gi , i = 1, . . . , n ,
k xi
lim

in L2 (). The property is proven if we prove that v/xi = gi for i = 1, . . . , n. This follows from

Z
Z 
Z
v

h
vk
, i = v
=
lim vk
= lim
k
k
xi
xi
xi
xi

Z
Z 
Z
vk
vk
= lim
gi = hgi , i
D() .
=
lim
=
k xi
k xi

Next property, which we state without proof, is important both from the theoretical and
the constructive/numerical point of view; indeed, it guarantees that functions in H1 () can be
approximated arbitrarily well by functions belonging to a sequence of finite dimensional subspaces.
Property 3.2.3. H1 () is separable, i.e., it contains a sequence {vk }k0 which is dense in it.
Let us improve our knowledge of the space H1 () by observing that it contains classical differentiable functions. Indeed, if is bounded, any function v C 1 () belongs to H1 (), and one
has
!1/2

n
X
v 2
2
1/2


kvkC 0 () +
kvkH1 () ||
(n + 1)||1/2 kvkC 1 () .
xi 0
i=1

C ()

In other words, we have:

Property 3.2.4. If is bounded, then C 1 () H1 () with continuous injection.


If is not bounded, then any function v C 1 () which decays fast enough as kx k ,
belongs to H1 (). In particular, for any open set , one has:
Property 3.2.5. D() H1 ().
So far, we have seen that sufficiently smooth functions, in a classical sense, belong to H1 ().
On the other hand, H1 () also contains piecewise smooth functions, provided they are globally
continuous. The following result illustrates the situation.
Property 3.2.6. Let be a bounded open set, which is divided into two open subsets and +
by a smooth (n 1)-dimensional manifold . Given two functions v C 1 ( ), the function v
defined as
(
v (x) if x ,
v(x) =
v+ (x) if x + ,
belongs to H1 () if and only if v is continuous across .
Proof. Obviously, v L2 (). On the other hand, for any i = 1, . . . , n, if we set

(x ) if x ,

xi
gi (x ) =
v

+ (x ) if x + ,
xi

3.2. THE SPACE H1 ()

41

we have (see (2.2.2))

Tv = Tgi + ,[v] ni ,
xi
where [v] is the jump of v across and ni is the i-th component of the normal vector n to . The
result follows from the observation that gi L2 (), whereas ,[v] ni 6 L2 () unless [v] ni 0.
The previous result has a strong practical impact, as it guarantees that one can use continuous, piecewise polynomial functions in order to approximate the solution of second-order elliptic
problem; the finite element method relies precisely on this property.

One may wander if a function belonging to H1 () is more regular than just an L2 ()-function,
for instance if it is continuous. First, let us clarify the real meaning of the statement a function
v H1 () is continuous. Indeed, H1 () is a subspace of L2 (), which according to the Lebesgue
integration theory rigorously speaking does not contain functions but classes of equivalence of
functions, two functions in the same class differing only on a zero-measure subset of . Then, the
statement above means that in the equivalence class of v there exists a function, say v, which is
continuous. For simplicity, in the sequel of this book we will not distinguish a function and the
equivalence class which contains it. The following result gives a positive answer in dimension 1.
Property 3.2.7. If = I is a bounded interval, then H1 (I) C0,1/2 (I) with continuous injection,
older continuous functions of exponent 1/2 in I.
where C0,1/2 (I) is the space of the H
Proof. Let us fix v H1 (I) and let us set g = v L2 (I). Let us define the function
Z x
w(x) =
g(s) ds ,
x0

where x0 is any fixed point in I. Since v = w in the distribution sense, there exists a constant C
such that v(x) = w(x) + C in I. Thus, for any two points x1 , x2 I,
Z x1
g(s) ds ,
v(x1 ) v(x2 ) = w(x1 ) w(x2 ) =
x2

whence, by the Cauchy-Schwarz inequality,


Z

|v(x1 ) v(x2 )| =

x1
x2

Z

1 g(s) ds

x1

x2

1/2 Z

1 ds
2

x1

x2

1/2

g (s) ds |x1 x2 |1/2 kgkL2 (I) .
2

This precisely means that v is Holder continuous of exponent 1/2 in I, and that
|v|C0,1/2 (I) := sup

x1 ,x2 I

|v(x1 ) v(x2 )|
kv kL2 (I) .
|x1 x2 |1/2

On the other hand, again by the Cauchy-Schwarz inequality,


Z
Z




v(y) dy = 1 v(y) dy |I|1/2 kvk 2 .
L (I)



I

Since v is continuous, there exists a point x


I such that
Z
1
v(y) dy
v(
x) =
|I| I

(3.2.1)

(3.2.2)

42

CHAPTER 3. SOBOLEV SPACES

by the mean value theorem. For any x I, we write


v(x) = v(
x) +

v(x) v(
x)
|x x
|1/2 ,
1/2
|x x
|

and we apply (3.2.1) and (3.2.2) to get


|v(x)| |v(
x)| +

|v(x) v(
x)| 1/2
|I| |I|1/2 kvkL2 (I) + |I|1/2 kv kL2 (I) ,
1/2
|x x
|

which easily implies kvkC0 (I) C1 kvkH1 (I) . In conclusion, this estimate and (3.2.1) yield
kvkC0,1/2 (I) C2 kvkH1 (I) .
If n > 1, then H1 () is neither contained in C0 () nor in L (),
p as the following simple
counterexample shows. Consider the disc = {(x, y) IR2 : r = x2 + y 2 < 1/2} and the
function v(x, y) = | log r| for some > 0. Obviously, v is unbounded as r 0, hence, in
particular, it is not continuous at the origin. Let us show that v H1 () if and only if < 1/2.
We have
Z
Z
Z
2

v 2 dxdy =

1/2

| log r|2 r drd < +

for any > 0 .

On the other hand,

v
x
= | log r|1 2 ,
x
r

y
v
= | log r|1 2 ,
y
r

whence
Z 1/2
Z 2 2
Z
v v
1
+ dxdy = 2 | log r|22 1 dxdy = 22
| log r|22 dr < +
x y
2
r
r
0

iff < 1/2 .

We will see later on (Thm. 3.8.2) that functions in H1 (), although not necessarily continuous,
belong to some space Lp () with p > 2 depending on n.
Another fundamental result is the following one. Let D() be the space of the C -functions
defined in , whose support is compact and contained in (thus, they are allowed to be nonzero
on ). Note that D() = C () if is bounded, whereas D() = D() iff = IRn . The space
D() can equivalently be defined as the space of the restrictions to of the functions in D(IRn ).
Property 3.2.8. D() is a dense subspace of H1 ().
We will give some ideas of the proof, in some particular cases, later on. The property states
that any function in H1 () can be approximated arbitrarily well by smooth classical functions.
This will allow us to pass to the limit and extend results which are well-known for classical
functions to analogous results for functions in H1 ().

3.3

The families of spaces Hm () and Wm,p ()

The definition of H1 () can be generalized, by considering for any m 2 the set of all functions
v L2 () such that all their distributional partial derivatives D v of order || m belong to
L2 (). Thus, we set
Hm () = {v L2 () : D v L2 () for || m}

3.4. THE SPACES HS (IRN )

43

and we endow this space by the inner product


X
(u, v)Hm () =
(D u, D v)L2 () ,
||m

which induces the norm

kvkHm () =

||m

D0v

1/2

kD vk2L2 ()

(we use here the convention that


= v). In this way, we obtain a separable Hilbert space which
enjoys properties similar or equal to those seen for H1 (): for instance, it contains D() as a dense
subspace and, if is bounded, it contains Cm () but also those functions of Cm1 () which are
piecewise Cm -differentiable. Furthermore, all functions in Hm () enjoy classical differentiability
of some order < m (which depends on the space dimension n); for instance, in dimension n = 2,
the space H2 () is contained in C0 () with continuous inclusion (but not in C1 ()). The precise
result will be given in Thm. 3.8.2.
We thus have a scale of function spaces, in which smoothness is measured in a weak, integral
sense; each space is strictly contained in all the spaces of lower index, with continuous inclusion.
Such a scale is the counterpart of the classical scale of spaces Cm (), in which smoothness is
measured in a strong, pointwise sense. Precisely, the two sequences of spaces satisfy

Hm+1 () Hm () Hm1 ()

H1 ()

H0 () = L2 ()

Cm+1 () Cm () Cm1 ()

C1 ()

C0 ()

and if is bounded each space of the lower sequence is contained with continuous inclusion in
the space above it in the upper sequence. Working in the Sobolev scale rather than in the classical
one is more appropriate for handling the weak, or integral, formulation of an elliptic boundary
value problem; in particular, the Sobolev scale consists of Hilbert spaces, whereas the classical
scale consists merely of non-reflexive Banach spaces.
A further generalization comes from replacing L2 () by some Lp () with p [1, +] in the
definition of the Sobolev space. Thus, we set
Wm,p () = {v Lp () : D v Lp () for || m}
equipped with the norm

kvkWm,p () =

||m

1/p

kD vkpLp ()

Such a space is a Banach space, which as Lp () is reflexive if 1 < p < + and is non-reflexive
if p = 1 or p = +. Note that Wm,2 () = Hm (). Sobolev spaces of summability index p 6= 2
play a crucial role in studying nonlinear partial differential equations.

3.4

The spaces Hs (IRn )

The study of Sobolev spaces is particularly interesting and important when the domain is the
full space IRn . In this section, we first provide a characterization of H1 (IRn ) by means of the
Fourier transform. Next, we consider the spaces Hm (IRn ) for m > 1 and we extend the definition
of Sobolev spaces to the case where the index is any real number. Finally, we sketch the proof of
Property 3.2.8 in the present situation, i.e., when the boundary is empty.

44

3.4.1

CHAPTER 3. SOBOLEV SPACES

A characterization of H1 (IRn )

We recall that the (continuous) Fourier transform


1
v(x ) 7 v() =
(2)n/2

IR

v(x ) eix dx

is an isomorphism between L2 (IRn ) and itself, whose inverse is


Z
1
v() e ix d ;
v() 7 v(x ) =
(2)n/2 IRn
more precisely, the transform is an isometry, i.e., for all v, w L2 (IRn ) one has
Z
Z
v(x )w(x ) dx =
v()w()

d ,
whence, kvkL2 (IRn ) = k
v kL2 (IRn ) .
IRn

IRn

(3.4.1)

Furthermore, if is a compactly supported, continuously differentiable function, one has





() = i k ()

IRn , k = 1, . . . , n ,
(3.4.2)
xk
as it can be seen by applying an integration by parts in the integral which defines the Fourier
transform of /xk .
The following result gives the announced characterization of H1 (IRn ) in terms of summability
at infinity of the Fourier transform of its functions.
Proposition 3.4.1. One has
H1 (IRn ) = {v L2 (IRn ) : the function 7 (1 + kk2 )1/2 v() belongs to L2 (IRn )}
and
kvkH1 (IRn ) =

Z

IRn

(1 + kk )|
v ()| d

1/2

v
L2 (IRn ) in the distributional
Proof. It is enough to prove that, for each k = 1, . . . , n, x
k
sense iff the function 7 k v() belongs to L2 (IRn ), with identical norm. Let us assume that
v
2
n
n
xk = gk L (IR ); then, using (3.4.2) and (3.4.1), for all D(IR ) one has on the one side

v
, i =
h
xk

v(x )
(x ) dx =
xk
IRn

and on the other side


v
h
, i =
xk

IRn



Z

d
i k v() ()
v()
()d =
xk
IRn
IRn

gk (x )(x ) dx =

IRn

gk ()()
d .

By equating the two last expressions and by recalling that is arbitrary, we get i k v() = gk ()
almost everywhere in IRn , and therefore the function 7 k v() belongs to L2 (IRn ). Conversely,
if this happens, one sets gk () = i k v(), so that its inverse transform gk (x ) belongs to L2 (IRn )
v
.
and satisfies gk = x
k
The argument given in the proof shows that


v
1
n
() = i k v()
IRn , k = 1, . . . , n .
(3.4.3)
for all v H (IR ) ,
xk

3.4. THE SPACES HS (IRN )

3.4.2

45

The spaces Hs (IRn ), with s IR

Given a vector = (1 , . . . , n ) IRn and a multi-index = (1 , . . . , n ) INn , let us set


= 11 22 nn IR. An argument similar to the proof of Proposition 3.4.1 shows that
Dv L2 (IRn ) in the sense of distributions if and only if v() L2 (IRn ); in this case, one has
(D v) () = i|| v() ,
which generalizes (3.4.3). Thus, any Sobolev space Hm (IRn ) can be characterized as the subset
of L2 (IRn ) of those functions satisfying v() L2 (IRn ) for all INn such that || m; the
norm in Hm (IRn ) could be represented as

kvkHm (IRn ) =

IRn ||m

1/2

||2|
v ()|2 d

where || = (|1 |, |2 |, . . . , |n |). An equivalent but simpler expression of the norm is preferred,
which can be derived by applying the following technical lemma, whose elementary proof is left to
the reader.
Lemma 3.4.2. There exists constants c, C > 0 depending only on n and m such that
X
m
m
,
IRn .

||2 C 1 + kk2
c 1 + kk2
||m

The result allows us to characterize Hm (IRn ) in an equivalent manner as the subset of L2 (IRn )
m/2
v() L2 (IRn ), and to use the L2 (IRn )-norm of this
of those functions such that 1 + kk2
m
n
function as an equivalent norm in H (IR ).
At this point, a remarkable observation can be made, namely, that the latter characterization
does not require the parameter m to be an integer: any real value of m is admissible. This leads
us to extend the definition of Sobolev spaces given so far, to the case of real positive indices.
Definition 3.4.3. For any real s 0, we set
Hs (IRn ) = {v L2 (IRn ) :

1 + kk2

equipped with the norm


kvkHs (IRn ) =

Z

IRn


2 s

1 + kk

s/2

v() L2 (IRn )}

|
v ()| d

1/2

(3.4.4)

In this way, we obtain a continuos family of separable Hilbert spaces, which satisfy
D(IRn ) is a dense subspace of Hs (IRn )
s

H (IR ) H (IR )
s

H (IR ) = H (IR )
m+1

iff

if

s >s ,

for all

s0;

s=m,
m

(IR ) H (IR ) H (IRn )

iff

m<s<m+1.

The last relation shows that the Sobolev spaces Hs (IRn ) of non-integer index can be viewed as a
sort of interpolating spaces between consecutive Sobolev spaces of integer index. The concept can
be made rigorous, within the so-called Theory of Space Interpolation.

46

CHAPTER 3. SOBOLEV SPACES

One can furtherly extend the definition of Sobolev space to the case of negative indices, by
setting


Hs (IRn ) = H|s| (IRn )
if s < 0 ,

where X denotes the dual space of the Hilbert space X; equivalently, Hs (IRn ) can be defined
as the space of the distributions whose Fourier transform (defined in a suitable sense) makes the
right-hand side of (3.4.4) finite.

3.4.3

Sketch of the proof of Property 3.2.8

The idea is to apply to any function in H1 (IRn ) a truncation, which yields a compactly supported
function, followed by a regularization, which produces a C function.
Truncation.

Given any R > 0, let R D(IR) be an even function satisfying


0 R (t) 1
R (t) 1

R (t) 0

t IR ,

if |t| R ,

if |t| R + 1 .

Then, given any v H1 (IRn ), one can prove that the function vR (x ) = R (kx k)v(x ) belongs to
H1 (IRn ) and is supported in B(0, R + 1); furthermore, kv vR kH1 (IRn ) 0 as R +.
Regularization.

Given any > 0, let (x ) be any non-negative function in D(IRn ) satisfying


Z
(x ) dx = 1 .
supp B(0, ) ,
IRn

An example of such function is obtained by properly scaling the function given in Example 1.2.1.
Note that as 0, converges in D (IRn ) to the distribution 0 .
Then, given any v H1 (IRn ), one can prove that the convolution function
Z
(x y )v(y ) dy
v (x ) = ( v)(x ) =
IRn

belongs to H1 (IRn ) and is infinitely differentiable at every x IRn ; furthermore, kvv kH1 (IRn ) 0
as 0.
Finally, we combine the two previous approximations by considering functions vR, = (vR )
obtained by first truncating a function v H1 (IRn ), and then regularizing the result. Since both
vR and are compactly supported, so is vR, ; precisely, supp vR, B(0, R + 1 + ). Thus, vR,
belongs to D(IRn ).
An appropriate choice of = (R), such that (R) 0 as R , shows that v can be
approximated in H1 (IRn ) to any prescribed precision by a function vR, for a sufficiently large R.

3.5

The space H1 (IRn+ )

One of the simplest examples of open domain with nonempty boundary is the semi-space =
IRn+ = {x = (x1 , . . . , xn1 , xn ) IRn : xn > 0}, whose boundary IRn+ = {x = (x1 , . . . , xn1 , 0)
IRn } can be identified with IRn1 . For notational simplicity, every point x IRn will be written
as x = (x , xn ) with x IRn1 .

3.5. THE SPACE H1 (IRN


+)

47

A number of properties of H1 (IRn+ ), such as Property 3.2.8, can be obtained from the analogous
properties of H1 (IRn ) after introducing a suitable prolongation operator which extends the functions belonging to H1 (IRn+ ) into functions belonging to H1 (IRn ). Precisely, given any v H1 (IRn+ ),
let us set
(P v)(x ) = v(x , |xn |)
x = (x , xn ) IRn .
Thus, P realizes an extension of v by an even reflection around the boundary IRn+ . It is easy to
check that P v H1 (IRn ) iff v H1 (IRn+ ), and that

v H1 (IRn+ ) ,
kP vkH1 (IRn ) 2kvkH1 (IRn+

i.e., P is continuous. Thus, we have established the following result.


Property 3.5.1. There exists a linear continuous operator P : H1 (IRn+ ) H1 (IRn ) such that
P v|IRn+ v for all v H1 (IRn+ ).
Any operator satisfying the conditions of the Property is termed a prolongation operator in
H1 (IRn+ ).
As an application, we can easily prove Property 3.2.8 in the current situation. Indeed, given
any v H1 (IRn+ ), by the analogous result in IRn one can find a sequence of functions vk D(IRn )
which converge to v = P v in the H1 (IRn )-norm. It is immediate that the functions vk = (
vk )|IRn
+

belong to D(IRn+ ) and converge to v in the H1 (IRn+ )-norm.

Next, we begin the discussion of the concept of trace on the boundary of a function
belonging to H1 (). Clarifying this concept is very important, for instance in order to give the
proper meaning to a Dirichlet boundary condition. The present geometrically simple situation
will allow us to keep ideas separated from technicalities, which may occur in the case of general
domains. While for a smooth function defined in (e.g., a function in D()) its trace on is
defined pointwise in the obvious way, for a function v which merely belongs to H1 () the same
procedure cannot be applied in dimension n 2. Indeed, v is an L2 -function, hence it is actually
a class of equivalence of functions, which may arbitrarily differ on subsets of zero measure in ;
since is precisely one of such subsets, the pointwise restriction of v to is meaningless. (The
situation is different in dimension 1, since we have seen that in each class of equivalence there is
one member which is continuous up to the boundary (Property 3.2.7), so that in particular its
boundary values are well-defined.)
The correct approach to the problem of defining boundary traces of functions in H1 () consists
of considering the trace operator as a linear continuous mapping defined on H1 (), which is first
defined pointwise on the subset of smooth functions, and which is next extended in a unique way
to the whole space thanks to the density of smooth functions.
We will detail this procedure in the case of = IRn+ . To this end, the following result is of
paramount importance.
Proposition 3.5.2. One has
k|IRn+ kL2 (IRn+ ) kkH1 (IRn+ )

D(IRn+ ) ,

where |IRn+ denotes the restriction of to IRn+ .


Proof. Given any D(IRn+ ), let A > 0 be such that supp IRn1 [0, A]. For any x IRn1 ,
one has by the fundamental theorem of integral calculus in one dimension,

Z A
Z A


2
2

(x , xn ) dxn = 2

(x , xn ) dxn ;
(x , 0) = (x , A) (x , 0) =
xn
0
0 xn

48

CHAPTER 3. SOBOLEV SPACES

hence, the Cauchy-Schwarz inequality and the inequality 2ab a2 + b2 yield


2

(x , 0) 2

Z
A

(x , xn ) dxn
0

2 (x , xn ) dxn +

1/2

A

A

xn

xn

!1/2
2

(x , xn ) dxn

2
(x , xn ) dxn .

Integrating over IRn1 with respect to x , we obtain


Z

IRn
+

(x ) dx

IRn
+

(x ) dx +

IRn
+

xn

2
(x ) dx ,

(3.5.1)

which implies the thesis.


The result tells us that the trace operator, which maps D(IRn+ ) into its restriction | on
, is continuous if the domain of definition is equipped with the H1 (IRn+ )-norm and the image
is equipped with the L2 (IRn+ )-norm. But since D(IRn+ ) is dense in H1 (IRn+ ), the Continuous
Extension Theorem guarantees that there exists a unique extension, say , which is defined on
the whole of H1 (IRn+ ), takes its values in L2 (IRn+ ) and is linear and continuous between these two
spaces. This means that the trace (v) of any function v H1 (IRn+ ) is well-defined as an element
of L2 (IRn+ ) (at least).
With a little additional effort, we can see that the image of is actually a proper subspace

of L2 (). Note, indeed, that in the proof of Proposition 3.5.2 only the L2 -norms of and x
n
2
have been used on the right-hand side (see (3.5.1)). We now involve the L -norms of the other
first-order partial derivatives of with the aim of improving the bound given in the previous
proposition, by putting a stronger norm on the left-hand side. To this end, we recall that IRn+
is isomorphic to IRn1 , so that one can define the space H1/2 (IRn+ ) by identifying it to the space
H1/2 (IRn1 ) defined in Def. 3.4.3. The new result is as follows.
Proposition 3.5.3. There exists a constant C > 0 such that
k|IRn+ kH1/2 (IRn ) CkkH1 (IRn+ )
+

D(IRn+ ) .

Proof. Let us denote by (


, xn ) the (n 1)-dimensional Fourier transform of (x , xn ) with

respect to the variables x , keeping xn fixed. Using the same notation as in the proof of Prop.
3.5.2, we have (x , xn ) 0 for all xn A, hence also (
, xn ) 0 for all xn A. Thus, we can
write


Z A
Z A


2
2
IRe (
, xn )
| ( , xn )| dxn = 2
( , xn ) dxn
| ( , 0)| =
xn
0
0 xn


Z A


( , xn ) dxn .
|(
, xn )|
2
xn
0

Multiplying this inequality by (1 + k k2 )1/2 and integrating over IRn1 , we obtain




Z
Z


2 1/2

2 1/2 2

(1 + k k ) |(
, xn )|
(1 + k k ) | ( , 0)| d 2
( , xn ) d dxn
xn
IRn
IRn1
+
2
Z
Z


2


( , xn ) d dxn .
(1 + k k )|(
, xn )| d dxn +


xn
IRn
IRn
+
+

3.6. SOBOLEV SPACES ON BOUNDED DOMAINS

49

Recalling Proposition 3.4.1, the first integral on the right-hand side equals
"
2 #
Z
n1 
X

2 +
(x ) dx ;
n
xk
IR+
k=1

on the other hand, the second integral on the right-hand side equals


Z
2
(x ) dx ,
xn
IRn
+
since differentiation in the xn variable and Fourier
in the x -variables are independent

 transform

of each other, so that they commute, i.e., x


= x
. This concludes the proof.
n
n
The previous result tells us that the trace operator maps H1 (IRn+ ) into H1/2 (IRn+ ) in a
continuous way. A deeper result, that we will not prove, guarantees that the image of H1 (IRn+ )
under is precisely the space H1/2 (IRn+ ), and that admits a continuous right-inverse :
H1/2 (IRn+ ) H1 (IRn+ ), i.e., ( (g)) = g for all g H1/2 (IRn+ ), or, equivalently,
for each g H1/2 (IRn+ ), there exists v = vg H1 (IRn+ ) such that
(vg ) = g and kvg kH1 (IRn+ ) CkgkH1/2 (IRn ) .
+

We summarize the properties of in the following theorem.


Theorem 3.5.4. There exists a linear continuous operator
: H1 (IRn+ ) H1/2 (IRn+ ) ,
termed trace operator, such that () = | for all D(IRn+ ).
H1/2 (IRn+ ) and admits a continuous right-inverse

It is surjective upon

: H1/2 (IRn+ ) H1 (IRn+ ) ,


termed lifting operator.

3.6

Sobolev spaces on bounded domains

From now on, we suppose that is a bounded domain, and we make some assumptions on its
boundary , which will guarantee the validity of such results as the existence of prolongation or
trace operators, as for the half-space.
The following definitions will be crucial for our purposes.
Definition 3.6.1. A bounded open domain is said of class Cm (or simply a Cm -domain) for m 1
if there exists a finite covering of by open bounded sets Ai , i = 0, 1, . . . , I, such that
i) A0 A0 ;
ii) for each i = 1, . . . , I, there exists a mapping i : Ai B(0, 1) with the following properties:
a) i is bijective ;
b) i is of class Cm , with inverse i1 also of class Cm ;

50

CHAPTER 3. SOBOLEV SPACES


c) i (Ai ) = {y B(0, 1) : yn > 0} ;

d) i (Ai ) = {y B(0, 1) : yn = 0} .
Many domain of interest in applications have boundaries of polygonal (in dimension 2) or
polyhedral (in dimension 3) type. The presence of corners or edges prevents them from being of
class C1 . The following definition relaxes the previous one for m = 1, in such a way that polygonal
or polyhedral domains fulfil its conditions.
Definition 3.6.2. A bounded open domain is said a Lipschitz domain if there exists a finite
covering of by open bounded sets Ai , i = 0, 1, . . . , I, such that
i) A0 A0 ;
ii) for each i = 1, . . . , I, there exists a mapping i : Ai B(0, 1) with the following properties:
a) i is bijective ;
b) i is of class C1 , with inverse i1 also of class C1 ;
c) there exists a Lipschitz-continuous function gi : IRn1 IR such that gi (0) = 0,
i (Ai ) = {y = (y , yn ) B(0, 1) : yn > g(y )}
and
i (Ai ) = {y = (y , yn ) B(0, 1) : yn = g(y )} .
Obviously, a C1 -domain is a particular case of Lipschitz domain, where one can take each gi 0.

The concept of C1 -domain (of Lipschitz domain, resp.) can be equivalently expressed by saying
that locally its boundary is a graph of a C1 -function (a Lipschitz-continuous function, resp.), such
that the domain lies on one side of the graph.
One can prove that any convex domain is a Lipschitz domain.
Examples of domain which are neither C1 nor Lipschitz are those containing cusp points, such
as
= {x IR2 : x2 + (y 1)2 < 1 and x2 + (2y 1)2 > 1} ,
i.e., the region between two circumferences which are tangent at the origin. The origin is a cusp
point, such that in none of its neighborhoods the boundary can be represented as the graph
of a function.
The concept of partition of unity provides the tool which allows us to localize the study of a
function defined in .
Definition 3.6.3. A partition of unity associated with the covering {Ai }i=0,1,...,I of is a set of
nonnegative C -functions i : IRn IR such that
i)
ii)

supp i Ai ;
I
X
i=0

i (x) = 1

x .

One can prove the following result.


Property 3.6.4. Given any finite covering of , there exists a partition of unity associated with
it.

3.6. SOBOLEV SPACES ON BOUNDED DOMAINS

51

Example 3.6.5. Let us exhibit a simple partition of unity associated with a covering of an interval
of the real line. Let us start by considering the even function


(
exp t211
|t| < 1 ,
(t) =
0
|t| 1 ;
and let us set

Rs

(s) = R
+

(t) dt
(t) dt

which satisfies (s) 0 for s 1, (s) 1 for s 1, (s) is strictly increasing in [1, 1]. Note
that 21 is an odd function, and this easily implies the identity (s) + (s) 1 in IR. Setting
(s) = (20s), we squeeze into the interval [1/20, 1/20] the transition region between the values
0 and 1.
Consider now the interval = (1, 1) and the covering



1
1
1 5
1
1
1
), ( 43 + 10
) ,
A1 = 34 10
, 4 + 10
,
A2 = ( 54 + 10
), 43 + 10
) .
A0 = ( 43 + 10
Then, a partition of unity associated with this covering is given by
0 (x) = (x+ 34 )+( 43 x)1 ,

1 (x) = (x 43 )+( 54 x)1 ,

2 (x) = (x+ 54 )+( 34 x)1 .

By a partition of unity in a Lipschitz domain , we can construct prolongation and trace


operators defined in H1 (), starting from the analogous operators in H1 (IRn+ ). Let us sketch the
idea. The starting point consists of writing a function v H1 () as
v(x ) = 1 v(x ) =

I
X
i=0

i (x ) v(x ) =

I
X

(i (x )v(x )) ,

i=0

P
i.e., setting vi (x ) = i (x )v(x ), we express v as v = Ii=0 vi , with supp vi Ai and kvi kH1 ()
CkvkH1 () for i = 0, . . . , I.
Now, v0 vanishes in a neighborhood of , hence, we can think of it as extended by zero outside
, i.e., v0 H1 (IRn ). On the other hand, for i = 1, . . . , I, we define vi = vi i1 , a function which
is supported in B(0, 1) IRn+ , so that it can be extended by zero to a function vi in IRn+ , satisfying
vi kH1 (IRn+ ) Ckvi kH1 () .
vi H1 (IRn+ ) with k
1
n
Let P : H (IR+ ) H1 (IRn ) be the prolongation operator defined in Sect. 3.5. Then, the
function

on Ai ,

vi (x ) 

vi (x ) = i (x ) (P vi ) i (x ) on Ai C ,

0
on IRn \ A ,
i

is an extension of vi which satisfies k


vi kH1 (IRn ) Ckvi kH1 () . Finally, the global prolongation
PI
operator is defined as P v = v0 + i=1 vi . Thus, we have established the following result.

Property 3.6.6. Let be a bounded Lipschitz domain. There exists a linear continuous operator
P : H1 () H1 (IRn ) such that P v| v for all v H1 ().

52

CHAPTER 3. SOBOLEV SPACES

As for the case = IRn+ , this property allows one to prove Property 3.2.8 for all Lipschitz domains.
Let us now consider the problem of defining the trace operator. To this end, let IRn+ :
H (IRn+ ) H1/2 (IRn+ ) be the trace operator defined in Sect. 3.5. Then, the function IRn+ (
vi )i
is defined on Ai ; since it vanishes as the argument approaches Ai , it can be extended by
zero to \ Ai , givingP
rise to a function (vi ) defined on . Thus, the trace operator on
is defined as (v) = Ii=1 (vi ). It is easily seen that () = | if C (), and that
(v) L2 () with
k (v)kL2 () CkvkH1 () .
(3.6.1)
1

In other words, : H1 () L2 () is a linear continuous operator. Let us denote by


H1/2 () = (H1 ())

(3.6.2)

its image, which is a subspace of L2 (). On the other hand, is clearly non-injective (many
functions in may have the same trace on ). Thus, let us introduce the subspace
H10 () = ker = {v H1 () : (v) = 0} ,

(3.6.3)

which is a closed subspace of H1 (), since is continuous.


Next, let us give a few results about quotient spaces. If X is a Hilbert space, with inner
product (x, y)X and norm kxkX , and X0 is a closed subspace of X, the quotient space X/X0 is
the set of all equivalence classes x = x + X0 = {y X : y x X0 }; it is a Hilbert space for the
quotient norm
kxk
X/X0 = inf kxkX = inf kx + x0 kX .
xx

x0 X0

We observe that the infimum above is actually a minimum. Indeed, given x X/X0 , there exists
a unique element x X such that kx kX = kxk
X/X0 . This result can be proven by taking any
element y x and setting x = y y, where is the orthogonal projection operator from X
upon X0 ; it is easily seen that x is independent of the particular choice of y, and satisfies the
conditions stated above. Equivalently, the linear continuous mapping x X 7 x X/X0 admits
a continuous right-inverse x X/X0 7 x X.
We apply these results to the quotient space H1 ()/H10 (), after observing that, by definition
of kernel of a linear operator, induces an algebraic isomorphism between H1 ()/H10 () and
H1/2 (). Therefore, we can equip the space H1/2 () by the quotient norm
|k g |kH1/2 () = kvk
H1 ()/H10 () ,

(3.6.4)

where v is the unique equivalence class of all functions v H1 () satisfying (v) = g; equivalently, we have
|k g |kH1/2 () =
inf
kvkH1 () ,
(3.6.5)
vH1 (), (v)=g

or
|k g |kH1/2 () = kvg kH1 () ,
where v = vg is the element in v of smallest H1 ()-norm. One can prove (exercise) that vg is the
unique solution of the elliptic problem
(
vg + vg = 0 in ,
(3.6.6)
vg = g
on .


3.7. THE SPACE H10 () AND THE POINCARE-FRIEDRICHS
INEQUALITY

53

Thus, for the norm just introduced in H1/2 (), the mapping g 7 vg is a continuous right-inverse
of the mapping v 7 (v).
One can give an intrinsic definition of the space H1/2 (), as one of the fractional order
Sobolev spaces Hs (), 0 < s < 1, defined as
Z Z
|g(x ) g(y )|2
s
2
2
H () = {g L () : |g|Hs () =
dxdy < +}
2s+n
kx y k
equipped with the norm

1/2
kgkHs () = kgk2L2 () + |g|2Hs ()
.

This norm is equivalent to the norm defined in (3.6.4). If = IRn+ , this definition is also equivalent
to the one given in Sect. 3.5 via the Fourier transform.
We summarize the results that can be proven about the trace operator .
Theorem 3.6.7. Let be a bounded Lipschitz domain. There exists a linear continuous operator
: H1 () H1/2 () ,
termed trace operator, such that () = | for all C (). It is surjective upon H1/2 ()
and admits a continuous right-inverse
: H1/2 () H1 () ,
termed lifting operator.
Remark 3.6.8. Let us show on a single example how the previous results can be extended to
Sobolev spaces of higher order.
Consider a bounded C2 -domain, and take a function v H2 (). Then, not only its trace (v)
is well-defined in H1/2 (), but also the traces (v/xi ) of its first-order partial derivatives
are well-defined in H1/2 ().
This is expressed by saying, on the one side, that (v) belongs to the Sobolev space H3/2 ()
1
(i.e., it is more regular, as a consequence of the fact that v is more regular
P than just an H ()function) and, on the other side, that the normal derivative v/n = i (v/xi )ni is welldefined in H1/2 ().

3.7

The space H10 () and the Poincar


e-Friedrichs inequality

The space H10 () has been defined in (3.6.3). It is the natural space in which to set the variational
formulation of a Dirichlet problem for a second-order elliptic boundary-value problem.
An equivalent definition is based on the following property.
Property 3.7.1. D() is dense in H10 ().
Thus, H10 () can be equivalently defined as the closure of D() with respect to the topology
of H1 (); i.e., a function v H1 () belongs to H10 () if and only if there exists a sequence of
functions n D() satisfying kv n kH1 () 0 as n .
As a closed subspace of the Hilbert space H1 (), H10 () is itself a Hilbert space, for the
same inner product as in H1 (). On the other hand, a simpler, yet equivalent inner product

54

CHAPTER 3. SOBOLEV SPACES

can be defined in H10 () (equivalent meaning that it induces an equivalent norm). In order to
motivate its definition, let us start from the observation that the norm in H1 (), given in Definition
3.2.1, depends on both the L2 ()-norm of the function and the L2 ()-norm of its gradient. In
H1 () functions exist, which have one of the two norms much larger that the other one. For
instance, highly oscillatory but bounded functions (such as v,k (x, y) = sin kx cos ky in the square
= (0, 2)2 ) may be arbitrarily small while their gradients may be arbitrarily large. On the
contrary, constant functions (such as vk (x, y) = k) may be arbitrarily large while their gradients
are identically zero.
However, suppose bounded and consider a function constrained to vanish on : then, it
can be large somewhere in the domain only if its gradient, too, is large somewhere. This intuitive
concept can be made rigorous through an important inequality, known as the Poincare-Friedrichs
inequality, which we now state.
Proposition 3.7.2. Let be a bounded domain. Then, there exists a constant CP > 0 such that
v H10 () .

kvkL2 () CP kvk(L2 ())n

(3.7.1)

Any constant for which this inequality holds is referred to as a Poincare constant in the domain.
There exists a minimal value CP () of this constant, depending only on , which can be referred
to as the Poincare constant of the domain .
Proof. We follow the strategy of first proving the inequality for all functions in D(). Next, since
D() is dense in H10 () and both sides of the inequality depend continuously on the H1 ()-norm,
we can pass to the limit and extend the inequality to all functions in H10 ().
Since is bounded (in particular, in the xn -direction), there exist constants a < b such that
IRn1 [a, b]. Given any D(), let us extend it by zero outside ; then, for any x IRn1
and any xn [a, b], the fundamental theorem of Calculus yields
Z xn

(x , xn ) =
(x , s) ds .
xn
a
By the Cauchy-Schwarz inequality, we get
Z

|(x , xn )| =


1/2
Z xn


2

1 ds
(x , s)ds
1
xn
a

xn

1/2

(xn a)

xn


!1/2
2


xn (x , s) ds

!1/2
Z b
2


.
xn (x , s) ds
a

Squaring both sides and integrating with respect to xn yields




Z b
Z b
Z b
Z b
2
2
1
2
2




(x , s) ds = (b a)
(x , s) ds .
(xn a) dxn
(x , xn ) dxn


2
a xn
a
a
a xn
Integrating with respect to x yields

2
Z
Z


1
2
2

(x ) dx .
(x ) dx (b a)

2
IRn1 [a,b] xn
IRn1 [a,b]

Since 0 outside , this is equivalent to


2
Z
Z


1
2
2
2

dx 1 (b a)2 kk2 2
(x ) dx (b a)
kkL2 () =
;
(x
)


(L ())n
2
2

xn


3.7. THE SPACE H10 () AND THE POINCARE-FRIEDRICHS
INEQUALITY

55

thus, inequality (3.7.1) is established with CP = 21/2 (b a) for any D(). The existence of
a minimal value of the Poincare constant will be proven in Chap. 6.
The assumptions make above in order to obtain the Poincare-Friedrichs inequality ( bounded
and functions vanishing on the whole of ) are just one of the possible sets of assumptions which
guarantee the inequality to hold. Here are possible extensions:
The proof clearly indicates that need not be bounded, but just bounded in one direction,
i.e., in the direction of a coordinate axis (possibly after a rigid rotation).
Functions need not vanish on the whole of , but just on a proper subset which has
positive (n 1)-dimensional measure, provided any point in the domain can be connected
to by a curve completely contained in the domain. In particular, if we introduce the
closed subspace of H1 () of the functions vanishing on a subset of positive (n 1)dimensional measure, i.e., if we set
H10, () = {v H1 () : (v) = 0 on } ,

(3.7.2)

then the Poincare-Friendrichs inequality holds in H10, () if is a bounded connected domain.


Let us introduce the closed subspace of L2 () of the zero-average functions, i.e., let us set
L20 ()

= {v L () :

v = 0} .

(3.7.3)

Note that zero-average functions cannot be strictly positive or strictly negative throughout
the domain; hence, if in addition they are continuous, they necessarily vanish somewhere
in the domain. So, it is not unexpected that one can prove (see Exercise 3.2) that the
Poincare-Friedrichs inequality holds in the closed subspace of H1 () given by H1 () L20 ().
We are now ready to introduce, as announced, a new inner product in H10 () or, more generally,
in any subspace of H1 () for which a Poincare-Friedrichs inequality holds.
Proposition 3.7.3. Let H0 denote any closed subspace of H1 () for which there exists a constant
CP > 0 such that
v H0 .
(3.7.4)
kvkL2 () CP kvk(L2 ())n
Then, the bilinear form
(u, v)H0 =

u v

is an inner product in H0 ; the induced norm kvkH0 =


H1 ()-norm in H0 , since one has

1
2
1+CP

(3.7.5)


2 1/2
kvk

kvkH1 () kvkH0 kvkH1 ()

is equivalent to the standard

v H0 .

(3.7.6)

Proof. It is enough to prove the first inequality in (3.7.6), which immediately follows from
(3.7.4).

56

CHAPTER 3. SOBOLEV SPACES

3.8

Imbedding theorems

In this section, we present, without proof, two fundamental theorems concerning Sobolev spaces,
the Rellich theorem and the Sobolev imbedding theorem.
Rellichs theorem is the counterpart in Sobolev spaces of classical theorems such as the AscoliArzel`a theorem. It pertains to the possibility of extracting a converging sequence of functions,
with respect to a certain Sobolev norm, from knowing that their derivatives of sufficiently high
order are bounded in the L2 ()-norm. The essential condition is that the domain has to be
bounded, for otherwise the result is not true.
The precise statement is as follows.
Theorem 3.8.1. (Rellich) Let be a bounded domain in IRn . Then, for any m 0 the inclusion
Hm () Hk () with 0 k < m, is compact.
For instance, if {vn }n1 is a sequence of functions in H1 () satisfying kvn kH1 () C for
some constant C > 0, then the theorem assures the existence of a subsequence {vnj }j1 which is
convergent in L2 ().
The Sobolev imbedding theorem links Sobolev regularity to classical regularity, allowing one
e.g. to see the weak solution of a variational problem as a classical solution as well. In essence,
the theorem says that any function in Hm () for large enough m has the property that all its
derivatives of order up to a certain k < m are classical derivatives (and not just distributional
derivatives), and they are continuous in . Furthermore, even if m is not large enough, any
function in Hm () is p-integrable for some p > 2 (and not just square-integrable), as soon as
m > 0.
The precise statement is as follows.
Theorem 3.8.2. (Sobolev) Let be any domain in IRn . Let m > 0 be given, and denote by [z]
the largest integer z.
i) If m < n/2, then Hm () Lp () for p =

2n
n2m

> 2.

ii) If m = n/2, then Hm () Lp () for all p < .


iii) If m > n/2, then Hm () Ck, () L (), for k = [m n/2] and = m n/2 k if
m n/2 is not an integer, and for k = m n/2 1 and arbitrary < 1 if m n/2 is an
integer.
All inclusions above are continuous. In addition, if is bounded, the inclusions are compact.
Examples 3.8.3.
i) In dimension n = 1, the theorem gives Hm () Cm1,1/2 (). This result, for m = 1, has
been already proven in Sect. 3.2 (Property 3.2.7).
ii) In dimension n = 2, the theorem gives H1 () Lp () for all p < (but not contained in
L (), as already noted in Sect. 3.2), and Hm () Cm2, () L () for any integer m 2
and any real < 1.

iii) In dimension n = 3, the theorem gives H1 () L6 (), and Hm () Cm2,1/2 ()L ()


for any integer m 2.

3.9. THE DUALS OF H1 () AND H10 ()

3.9

57

The duals of H1 () and H10()

In this section, we analyze the structure of the dual spaces of the Sobolev spaces H1 () and H10 ().
The results will tell us which kind of right-hand side is admissible in the variational formulation
of a second-order elliptic problem.
We recall that the dual space of a Banach space X is the Banach space X of the continuous
linear forms F : X IR, equipped with the norm
kF kX = sup

xX

F (x)
.
kxkX

If X is a Hilbert space, the Riesz Representation Theorem says that each F X can be written
as F (x) = (y, x)X x X, for a unique y X, which satisfies kykX = kF kX . Thus, X can be
identified to X via the isometry F 7 y.
If X and Y are Banach spaces satisfying X Y with continuous injection, i.e., kxkY CkxkX
for all x X, then any FY Y defines an FX X by setting FX (x) = FY (x) x X;
furthermore, kFX kX CkFY kY , i.e., the mapping FY Y 7 FX X is continuous. If,
in addition, X is dense in Y , then this mapping is injective, since FX = GX means FY (x) =
GY (x) x X, whence FY (y) = GY (y) y Y , by the uniqueness of the extension of a continuous
map defined on a dense subset. In other words, we can identify Y to a subspace of X , i.e., we
have Y X with continuous injection and dense image.
A particularly important situation is the following one. We are given two Hilbert spaces V
and H, such that V H with continuous injection and dense image; furthermore, we identify H
to H via the Riesz Representation Theorem as above. Then, we have H = H V , so that we
can write the chain of inclusions
V H V ,
(3.9.1)
where each inclusion is continuous and with dense image. The pair (V, H) is often called a Gelfand
pair; equivalently, the triple (V, H, V ) is called a Gelfand triple. Examples of Gelfand triples are
(H1 (), L2 (), (H1 ()) )

and

(H10 (), L2 (), (H10 ()) ) ,

provided we identify L2 () to its dual space. Indeed, functions in L2 () can be approximated


arbitrarily well by compactly supported, smooth functions, i.e., D() is dense in L2 (), so that
a fortiori both H10 () and H1 () are dense in L2 ().
We want to find an explicit representation of any element in (H1 ()) or (H10 ()) . To this end,
let us introduce the mapping
S : H1 () (L2 ())n+1


v
v
,
.
.
.
,
= Sv .
v 7
v,
x1
xn

(3.9.2)

The mapping is trivially injective, since Sv = 0 implies that the first component of Sv, i.e., v
itself, is zero; furthermore, it is an isometry, since kSvk(L2 ())n+1 = kvkH1 () for all v H1 ().
Thus, we can identify H1 () to the subspace Z = S(H1 ()) of (L2 ())n+1 ; this subspace is closed,
since H1 () is complete. Correspondingly, the dual of H1 () can be identified to the dual of Z,
via the isometry F (H1 ()) 7 FZ Z defined as FZ (w) = F (S 1 (w)) w Z.
We now invoke the Hahn-Banch Theorem, which guarantees that, given a linear continuous
form FZ on a closed subspace Z of a Banach space X, there exists a linear continuous form Fe on

58

CHAPTER 3. SOBOLEV SPACES

X which extends FZ and such that kFZ kZ = kFekX . Thus, if we start from any F (H1 ()) ,
there exists Fe (L2 ())n+1 ) such that
F (v) = FZ (Sv) = Fe(Sv)

v H1 ()

(3.9.3)

and kF k(H1 ()) = kFZ kZ = kFek(L2 ())n+1 ) . On the other hand, having identified (L2 ()) to
L2 (), the space (L2 ())n+1 ) can be identified to (L2 ())n+1 , so that Fe can be identified to an
element f = (f0 , f1 , . . . , fn ) (L2 ())n+1 by the relation
Fe(w) = (f , w)L2 ()n+1 =
and kFek((L2 ())n+1 ) = kf k(L2 ())n+1 =

n
X
(fi , wi )L2 ()

w (L2 ())n+1 ,

i=0

P

n
2
i=0 kfi kL2 ()

conclude that F (H1 ()) can be represented as

1/2

(3.9.4)

. Combining (3.9.3) and (3.9.4), we


n 
X
v
F (v) = (f0 , v)L2 () +
fi ,
xi L2 ()

v H1 () ,

i=1

for suitable functions f0 , f1 , . . . , fn L2 () satisfying kF k(H1 ()) =

P

n
2
i=0 kfi kL2 ()

(3.9.5)
1/2

Since the extension FZ 7 Fe is not unique, the representation (3.9.5) is not unique as well.
For instance, since
Z
Z
v

dx =
v dx
if D() ,
x1
x1

we can replace in (3.9.5) f1 by f1 + and f0 by f0 x


without changing the right-hand side. If
1
2
n+1
f = (f0 , f1 , . . . , fn ) (L ())
denotes now any (n + 1)-ple of functions for which (3.9.5) holds,
then by the Cauchy-Schwarz inequality we have

|F (v)| kf0 kL2 () kvkL2 () +

n
X
i=1



v


kf k(L2 ())n+1 kvkH1 () ,
kf1 kL2 ()
xi L2 ()

i.e., kF k(H1 ()) kf k(L2 ())n+1 . On the other hand, the construction above shows that there
exists f (L2 ())n+1 for which the equality sign is attained.
The following statement summarizes the results obtained so far.

Theorem 3.9.1. Any F (H1 ()) can be represented, in a non-unique way, as



n 
X
v
F (v) = (f0 , v)L2 () +
fi ,
xi L2 ()

v H1 () ,

i=1

(3.9.6)

for suitable functions f0 , f1 , . . . , fn L2 (). In addition, setting R(F ) = {f (L2 ())n+1 :


(3.9.6) holds }, we have
!1/2
n
X
2
kfi kL2 ()
kF k(H1 ()) = min kf k(L2 ())n+1 = min
(3.9.7)
.
fR(F )

fR(F )

i=0

3.9. THE DUALS OF H1 () AND H10 ()

59

We now consider the dual space of H10 (), which is usually denoted by H1 (). All previous
considerations can be repeated, with the only change that the operator S introduced in (3.9.2)
is now restricted to H10 (), and consequently its image is a closed subspace of Z, say Z0 . Then,
given any F H1 (), we arrive as above to the representation formula

n 
X
v
fi ,
F (v) = (f0 , v)L2 () +
xi L2 ()
i=1

v H10 () ,

(3.9.8)

which is identical to (3.9.5), except that now v is restricted to H10 ().


The key difference with respect to case discussed above is that elements in H1 () are distributions. Indeed, D() is dense in H10 (), hence, H1 () can be identified to a dense subspace
of D (), the space of distributions over . In other words, we have the sequence of continuous
inclusions with dense images
D() H10 () L2 () H1 () D () .
Recalling the expression of the partial derivatives of a distribution (Def. 2.2.1), the right-hand
side of (3.9.8) can be written as
+
*

n
n 
X
X
fi

,
D() ,
= f0
(f0 , )L2 () +
fi ,
xi L2 ()
xi
i=1

i=1

i.e.,

n
X
fi
F = f0
= f0 (f1 , . . . , fn )
xi
i=1

in D () .

(3.9.9)

Thus, any F H1 () is the sum of an L2 ()-function and the divergence (in the sense of
distributions) of a vector of L2 ()-functions. The representation of its norm is analogous to the
one in (H1 ()) .
We summarize the results as follows.
Theorem 3.9.2. Let us denote by H1 () the dual space of H10 (). Then, H1 () D (), and
any F H1 () can be represented, in a non-unique way, as in (3.9.9) for suitable f0 , f1 , . . . , fn in
L2 (); equivalently, (3.9.8) holds. In addition, setting R0 (F ) = {f (L2 ())n+1 : (3.9.8) holds },
we have
!1/2
n
X
(3.9.10)
.
kfi k2L2 ()
kF kH1 () = min kf k(L2 ())n+1 = min
fR0 (F )

fR0 (F )

i=0

Examples 3.9.3.
i) Let = (a, b) be an interval in IR and let x0 any point in . Since H10 () H1 () C0 (),
the linear form v 7 Fx0 (v) = v(x0 ) belongs to both H1 () and (H1 ()) . As an element of
H1 (), it coincides with the distribution x0 (we simply say that x0 belongs to H1 ()); indeed,
1
we can represent it as in (3.9.9) setting Fx0 = f0 df
dx , with f0 0 and f1 (x) = H(x x0 ), where
H is the Heaviside function. We refer to Exercise 3.3 for a representation of Fx0 , as an element of
(H1 ()) , in the form (3.9.6).
ii) In dimension n > 1, the distribution x 0 , with x 0 , belongs neither to H1 () nor to
(H1 ()) ; indeed, neither H1 () nor H10 () are imbedded in C0 ().

60

CHAPTER 3. SOBOLEV SPACES

Consider instead the $(n-1)$-dimensional manifold $\Sigma = \{x \in \Omega : x_1 = c\}$ and a bounded domain $\Omega$ which is cut by $\Sigma$ into two non-empty subsets $\Omega^-$ and $\Omega^+$ according to whether $x_1 < c$ or $x_1 > c$. Then, the form
$$ F_\Sigma(v) = \int_\Sigma \gamma(v) \, d\sigma , $$
where $\gamma(v)$ is the trace of $v$ on $\Sigma$, belongs to both $H^{-1}(\Omega)$ and $(H^1(\Omega))'$, since the mapping $\gamma : H^1(\Omega^\pm) \to L^2(\Sigma)$ is continuous, as seen in Sect. 3.6. As an element of $H^{-1}(\Omega)$, $F_\Sigma$ coincides with the distribution $\delta_\Sigma$ such that $\langle \delta_\Sigma, \varphi \rangle = \int_\Sigma \varphi_{|\Sigma} \, d\sigma$ for all $\varphi \in \mathcal{D}(\Omega)$. It can be represented in the form (3.9.9) by setting $f_0 = f_2 = \dots = f_n \equiv 0$ and $f_1 = -\chi_{\Omega^+}$, where $\chi_{\Omega^+}$ is the characteristic function of the set $\Omega^+$. We refer to Exercise 3.4 for a representation of $F_\Sigma$, as an element of $(H^1(\Omega))'$, in the form (3.9.6).
iii) Let $Lw = -\nabla \cdot (A \nabla w) + \mathbf{a} \cdot \nabla w + a_0 w$ be any second-order operator in $\Omega$; let us assume that all its coefficients belong to $L^\infty(\Omega)$. Then, for any $w \in H^1(\Omega)$, one has $A \nabla w \in (L^2(\Omega))^n$ as well as $\mathbf{a} \cdot \nabla w + a_0 w \in L^2(\Omega)$. Thus, according to Thm. 3.9.2, $Lw$ belongs to $H^{-1}(\Omega)$ and the mapping $w \in H^1(\Omega) \mapsto Lw \in H^{-1}(\Omega)$ is continuous. A particularly relevant case occurs when $Lw = -\Delta w$: by restricting $w$ to $H^1_0(\Omega)$, we obtain that the operator $-\Delta$ maps $H^1_0(\Omega)$ into its dual $H^{-1}(\Omega)$. Precisely, there exists a constant $C > 0$ depending on $\Omega$ such that
$$ \|{-\Delta v}\|_{H^{-1}(\Omega)} \leq C \|v\|_{H^1_0(\Omega)} \qquad \forall v \in H^1_0(\Omega) . \qquad (3.9.11) $$
The Laplacian is actually an isomorphism between these two spaces, as discussed in the next chapter (see Property 4.3.7).
One can also consider $Lw$, for $w \in H^1(\Omega)$, as an element of $(H^1(\Omega))'$, by setting
$$ F(v) = \int_\Omega (A \nabla w) \cdot \nabla v + \int_\Omega (\mathbf{a} \cdot \nabla w + a_0 w) \, v \qquad \forall v \in H^1(\Omega) . $$
iv) Let $\Omega$ be a bounded Lipschitz domain; recall that the trace operator $\gamma$ is continuous from $H^1(\Omega)$ to $L^2(\partial\Omega)$. Thus, given any $h \in L^2(\partial\Omega)$, the form $F_h$ defined by
$$ v \mapsto F_h(v) = \int_{\partial\Omega} h \, \gamma(v) \, d\sigma \qquad (3.9.12) $$
belongs to $(H^1(\Omega))'$. How can we represent $F_h$ according to (3.9.6)? To answer this question, let us consider the Neumann problem
$$ \begin{cases} -\Delta w + w = 0 & \text{in } \Omega , \\ \dfrac{\partial w}{\partial n} = h & \text{on } \partial\Omega . \end{cases} $$
Anticipating the results of the next chapter, we can say that there exists a unique solution $w \in H^1(\Omega)$ of the variational formulation of this problem, which is given by
$$ \int_\Omega \nabla w \cdot \nabla v + \int_\Omega w v = \int_{\partial\Omega} h \, \gamma(v) \qquad \forall v \in H^1(\Omega) . \qquad (3.9.13) $$
Then, the $(n+1)$-ple $f = \big( w, \frac{\partial w}{\partial x_1}, \dots, \frac{\partial w}{\partial x_n} \big)$ provides a representation of the form $F_h$. It is not difficult to prove, indeed, that this is the representation which realizes the minimum in (3.9.7).

Consider now the same mapping (3.9.12), but restrict it to the functions of $H^1_0(\Omega)$. Obviously, $(F_h)_{|H^1_0(\Omega)}$ belongs to $H^{-1}(\Omega)$, but... it is nothing else than the null form, i.e., $F_h^0 = 0 \in H^{-1}(\Omega)$. Since, by (3.9.13),
$$ 0 = F_h^0(\varphi) = \int_{\partial\Omega} h \, \varphi_{|\partial\Omega} = \int_\Omega \nabla w \cdot \nabla \varphi + \int_\Omega w \varphi = \langle -\Delta w + w, \varphi \rangle \qquad \forall \varphi \in \mathcal{D}(\Omega) , $$
the same $f$ as above does provide one of the possible representations of $F_h^0$ in the form (3.9.9), but surely it is not the one with minimal norm! Obviously, such a representation is provided by the $(n+1)$-ple $f = (0, 0, \dots, 0)$.

3.10  Exercises

3.1. Let us consider the function
$$ u(x, y) = |x - y|^\alpha $$
on the set $\Omega = [1, 2] \times [1, 2] \subset \mathbb{R}^2$, where $\alpha$ is a real parameter.
(i) Find the values of $\alpha$ for which $u \in L^2(\Omega)$ and those for which $u \in H^1(\Omega)$.
(ii) For the values of $\alpha$ that allow $u$ to be in $H^1(\Omega)$, calculate the Laplacian $\Delta u$ and find the space which it belongs to.

3.2. Prove that the expression
$$ \|v\|_* = \Big| \int_\Omega v(x) \, dx \Big| + \|\nabla v\|_{(L^2(\Omega))^n} $$
is a norm in $H^1(\Omega)$, which is equivalent to the standard norm $\|v\|_{H^1(\Omega)}$. Deduce from this the validity of a Poincaré–Friedrichs inequality in the space $H^1(\Omega) \cap L^2_0(\Omega)$, where $L^2_0(\Omega)$ is defined in (3.7.3).

3.3. Prove that the form $F_{x_0}$ defined in Example 3.9.3, i) can be represented as
$$ F_{x_0}(v) = \int_a^b f_0(x) v(x) \, dx + \int_a^b f_1(x) v'(x) \, dx \qquad \forall v \in H^1(a, b) , $$
where
$$ f_0(x) = \begin{cases} 0 & \text{in } (a, x_0) , \\ w(x) & \text{in } (x_0, b) , \end{cases} \qquad f_1(x) = \begin{cases} 0 & \text{in } (a, x_0) , \\ w'(x) & \text{in } (x_0, b) , \end{cases} $$
and $w$ is the solution of the Neumann problem in $(x_0, b)$:
$$ \begin{cases} -w'' + w = 0 & \text{in } (x_0, b) , \\ -w'(x_0) = 1 , \quad w'(b) = 0 . \end{cases} $$

3.4. Adapt the arguments of the previous exercise in order to find a representation of the form $F_\Sigma$ defined in Example 3.9.3, ii), as an element of $(H^1(\Omega))'$.


Chapter 4

Elliptic Problems

4.1  Weak formulation of elliptic boundary-value problems

Hereafter, we will study a more general situation than the homogeneous Dirichlet boundary-value problem given in (3.1.1). Precisely, let us consider the mixed Dirichlet/Neumann problem
$$ \begin{cases} Lu = -\nabla \cdot (A \nabla u) + \nabla \cdot (\mathbf{a} u) + a_0 u = f & \text{in } \Omega , \\ u = g & \text{on } \Gamma_D , \\ \dfrac{\partial u}{\partial n_A} = h & \text{on } \Gamma_N . \end{cases} \qquad (4.1.1) $$
Here, $\Omega$ is a bounded and connected Lipschitz domain, whose boundary $\partial\Omega$ is partitioned into two relatively open subsets $\Gamma_D$ (the Dirichlet part of the boundary) and $\Gamma_N$ (the Neumann part), i.e., they satisfy
$$ \Gamma_D \subseteq \partial\Omega , \quad \Gamma_N \subseteq \partial\Omega , \qquad \Gamma_D \cap \Gamma_N = \emptyset , \qquad \overline{\Gamma}_D \cup \overline{\Gamma}_N = \partial\Omega . $$
We assume again that the coefficients $A$, $\mathbf{a}$, $a_0$ satisfy (3.1.12); in addition, if $\Gamma_N$ is not empty, we assume that $A$ and $\mathbf{a}$ are defined and continuous in a neighborhood of $\Gamma_N$, so that the conormal vector $\mathbf{n}_A = A \mathbf{n}$ and the normal component $a_n = \mathbf{a} \cdot \mathbf{n}$ of the vector $\mathbf{a}$ make sense therein. We recall that the conormal derivative of $u$ is defined as $\frac{\partial u}{\partial n_A} = \mathbf{n}_A \cdot \nabla u$; see also (3.1.5).
We now add the crucial assumption that the equation is elliptic throughout the domain, i.e., the matrix $A(x)$ is positive-definite at each $x \in \Omega$.
As far as the data $f$, $g$ and $h$ are concerned, we assume as in (3.1.12) that $f \in L^2(\Omega)$, that $g$ is the trace on $\partial\Omega$ of a function of $H^1(\Omega)$, i.e., $g \in H^{1/2}(\partial\Omega)$, and finally that $h \in L^2(\Gamma_N)$.

We now formulate the problem in a weak sense, so that the solution $u$ is only required to belong to $H^1(\Omega)$. We recall the identity (3.1.8), obtained after multiplying the equation $Lu = f$ by a test function $v$, integrating over $\Omega$ and performing two integrations by parts. If we restrict the test functions to those vanishing on $\Gamma_D$, and replace therein the conormal derivative of $u$ by the prescribed value $h$, we obtain
$$ \int_\Omega (A \nabla u) \cdot \nabla v - \int_\Omega u \, \mathbf{a} \cdot \nabla v + \int_\Omega a_0 u v + \int_{\Gamma_N} a_n u v = \int_\Omega f v + \int_{\Gamma_N} h v . \qquad (4.1.2) $$
Note that all the integrals which appear in this equation make sense if $u, v \in H^1(\Omega)$, thanks to the definition of this space and the trace properties established in Chapter 3. Hence, we introduce the bilinear form
$$ a(u, v) := \int_\Omega (A \nabla u) \cdot \nabla v - \int_\Omega u \, \mathbf{a} \cdot \nabla v + \int_\Omega a_0 u v + \int_{\Gamma_N} a_n u v \qquad (4.1.3) $$


defined in $H^1(\Omega) \times H^1(\Omega)$, as well as the linear form
$$ F(v) := \int_\Omega f v + \int_{\Gamma_N} h v \qquad (4.1.4) $$
defined in $H^1(\Omega)$. We also introduce the closed subspace of $H^1(\Omega)$ of the functions vanishing on $\Gamma_D$, i.e., we set
$$ H^1_{0,D}(\Omega) = \{ v \in H^1(\Omega) : v = 0 \text{ on } \Gamma_D \} , $$
equipped with the $H^1(\Omega)$-norm.
Then, we are led to considering the following weak formulation of the boundary-value problem stated in (4.1.1):

Problem 4.1.1. Find a function $u \in H^1(\Omega)$, with $u = g$ on $\Gamma_D$, such that
$$ a(u, v) = F(v) \qquad \forall v \in H^1_{0,D}(\Omega) . \qquad (4.1.5) $$

At this point, if $\Gamma_D$ is non-empty, it is convenient to distinguish two cases.

a) Homogeneous Dirichlet condition, $g = 0$.
Both the solution $u$ and the test functions $v$ belong to the same space $H^1_{0,D}(\Omega)$. The weak formulation becomes:

Problem 4.1.2. Find $u \in H^1_{0,D}(\Omega)$ such that
$$ a(u, v) = F(v) \qquad \forall v \in H^1_{0,D}(\Omega) . $$

b) Non-homogeneous Dirichlet condition, $g \neq 0$.
This case can be reduced to the previous one by a change of unknown. Precisely, since $g \in H^{1/2}(\partial\Omega)$, there exists $u_g \in H^1(\Omega)$ such that $u_{g|\partial\Omega} = g$; we write $u = u_0 + u_g$, where the new unknown $u_0$ satisfies $(u_0)_{|\Gamma_D} = u_{|\Gamma_D} - (u_g)_{|\Gamma_D} = (g - g)_{|\Gamma_D} = 0$, i.e., $u_0 \in H^1_{0,D}(\Omega)$. Substituting the expression of $u$ into (4.1.5), we obtain
$$ a(u_0, v) + a(u_g, v) = F(v) \qquad \forall v \in H^1_{0,D}(\Omega) . \qquad (4.1.6) $$
Thus, if we introduce the modified linear form
$$ \tilde F(v) := F(v) - a(u_g, v) , \qquad (4.1.7) $$
still defined in $H^1(\Omega)$, we reduce the non-homogeneous problem for $u$ to the following one for $u_0$:

Problem 4.1.3. Find $u_0 \in H^1_{0,D}(\Omega)$ such that
$$ a(u_0, v) = \tilde F(v) \qquad \forall v \in H^1_{0,D}(\Omega) . $$

This change of dependent variable is made in preparation for applying the abstract existence and uniqueness theory developed in the subsequent section. Obviously, if $\Gamma_D = \emptyset$, we simply consider Problem 4.1.2 with $H^1_{0,D}(\Omega) = H^1(\Omega)$.
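The lifting trick $u = u_0 + u_g$ is easy to visualize in one dimension. The following sketch is not from the notes; it assumes $\Omega = (0,1)$, an affine lifting of the boundary data, and a standard finite-difference discretization chosen for illustration only.

```python
import numpy as np

# Hypothetical 1D illustration of the lifting u = u0 + ug:
# solve -u'' = f on (0,1) with u(0)=g0, u(1)=g1 by picking the affine lifting
# ug(x) = g0 + (g1 - g0) x and solving a homogeneous Dirichlet problem for u0.
n = 200                      # number of interior grid points (arbitrary choice)
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)

f = np.sin(np.pi * x)        # sample right-hand side
g0, g1 = 2.0, -1.0           # Dirichlet data

ug = g0 + (g1 - g0) * x      # affine lifting; ug'' = 0, so the RHS for u0 is just f

# Second-order finite-difference Laplacian with zero boundary values.
A = (np.diag(2 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2
u0 = np.linalg.solve(A, f)   # u0 vanishes at both endpoints, as in H^1_{0,D}
u = u0 + ug                  # solution of the original non-homogeneous problem

# Compare with the exact solution sin(pi x)/pi^2 + g0 + (g1 - g0) x.
u_ex = np.sin(np.pi * x) / np.pi**2 + g0 + (g1 - g0) * x
print(np.max(np.abs(u - u_ex)))   # O(h^2) error
```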


Examples 4.1.4. i) Let us consider the homogeneous Dirichlet problem for the Poisson equation:
$$ \begin{cases} -\Delta u = f & \text{in } \Omega , \\ u = 0 & \text{on } \partial\Omega . \end{cases} \qquad (4.1.8) $$
It admits the following weak formulation: Find $u \in H^1_0(\Omega)$ such that
$$ \int_\Omega \nabla u \cdot \nabla v = \int_\Omega f v \qquad \forall v \in H^1_0(\Omega) . \qquad (4.1.9) $$
We observe that, keeping in mind Theorem 3.9.2, the datum $f \in L^2(\Omega)$ can be replaced by a datum $F \in H^{-1}(\Omega)$, i.e., of the form
$$ F = f_0 - \sum_{i=1}^n \frac{\partial f_i}{\partial x_i} , \qquad f_i \in L^2(\Omega) . $$
In this case, (4.1.9) becomes
$$ \int_\Omega \nabla u \cdot \nabla v = \int_\Omega f_0 v + \sum_{i=1}^n \int_\Omega f_i \frac{\partial v}{\partial x_i} \qquad \forall v \in H^1_0(\Omega) . \qquad (4.1.10) $$
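A Galerkin discretization makes the weak formulation (4.1.9) concrete. The sketch below is my own illustration, not part of the notes; it assumes the 1D interval $\Omega = (0,1)$, piecewise-linear finite elements on a uniform mesh, and a lumped approximation of the load vector.

```python
import numpy as np

# Galerkin sketch of (4.1.9) in 1D: find u_h in V_h (a subspace of H^1_0) with
# (u_h', v_h') = (f, v_h) for every v_h in V_h, V_h = piecewise-linear functions.
n = 100                                  # number of elements (illustrative choice)
h = 1.0 / n
nodes = np.linspace(0.0, 1.0, n + 1)

# P1 stiffness matrix on the interior nodes: K_ij = int phi_i' phi_j' dx.
K = (np.diag(2 * np.ones(n - 1)) - np.diag(np.ones(n - 2), 1)
     - np.diag(np.ones(n - 2), -1)) / h

# Load vector via mass lumping: (f, phi_i) ~ h * f(x_i).
f = lambda x: np.pi**2 * np.sin(np.pi * x)
b = h * f(nodes[1:-1])

u_h = np.linalg.solve(K, b)              # nodal values of the Galerkin solution
print(np.max(np.abs(u_h - np.sin(np.pi * nodes[1:-1]))))   # small discretization error
```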

ii) Let us consider the non-homogeneous Neumann problem for the Helmholtz equation:
$$ \begin{cases} -\Delta u + a_0 u = f & \text{in } \Omega , \\ \dfrac{\partial u}{\partial n} = h & \text{on } \partial\Omega . \end{cases} \qquad (4.1.11) $$
It admits the following weak formulation: Find $u \in H^1(\Omega)$ such that
$$ \int_\Omega (\nabla u \cdot \nabla v + a_0 u v) = \int_\Omega f v + \int_{\partial\Omega} h v \qquad \forall v \in H^1(\Omega) . \qquad (4.1.12) $$
In this case, too, a more general datum $F$ can replace $f$ on the right-hand side.

iii) Let us consider the mixed Dirichlet/Neumann problem for the convection-diffusion equation:
$$ \begin{cases} -\nu \Delta u + \nabla \cdot (\mathbf{a} u) = f & \text{in } \Omega , \\ u = g & \text{on } \Gamma_D , \\ \nu \dfrac{\partial u}{\partial n} = 0 & \text{on } \Gamma_N , \end{cases} \qquad (4.1.13) $$
where $\nu > 0$ is a constant and $\Gamma_D = \{ x \in \partial\Omega : (\mathbf{a} \cdot \mathbf{n})(x) < 0 \}$. It admits the following weak formulation: Find $u \in H^1(\Omega)$, with $u = g$ on $\Gamma_D$, such that
$$ \nu \int_\Omega \nabla u \cdot \nabla v - \int_\Omega u \, \mathbf{a} \cdot \nabla v + \int_{\Gamma_N} a_n u v = \int_\Omega f v \qquad \forall v \in H^1_{0,D}(\Omega) . \qquad (4.1.14) $$
The reduction to an equivalent homogeneous problem is based on the representation $u = u_0 + u_g$, where $u_g$ is any fixed function in $H^1(\Omega)$ satisfying $u_g = g$ on $\Gamma_D$. Then, $u_0$ is defined by the following weak formulation: Find $u_0 \in H^1_{0,D}(\Omega)$ such that
$$ \nu \int_\Omega \nabla u_0 \cdot \nabla v - \int_\Omega u_0 \, \mathbf{a} \cdot \nabla v + \int_{\Gamma_N} a_n u_0 v = \int_\Omega f v - \nu \int_\Omega \nabla u_g \cdot \nabla v + \int_\Omega u_g \, \mathbf{a} \cdot \nabla v - \int_{\Gamma_N} a_n u_g v \qquad \forall v \in H^1_{0,D}(\Omega) . \qquad (4.1.15)\text{--}(4.1.16) $$

The next section will be devoted to the study of an abstract form of Problem 4.1.2 or 4.1.3, providing a set of assumptions which guarantee its solvability.


4.2  The Lax–Milgram Theorem

Let $V$ be a reflexive Banach space, with norm $\|v\|_V$. (We recall that a Banach space is reflexive if its bidual $V'' = (V')'$ can be identified with $V$ itself; each Hilbert space is reflexive, thanks to the Riesz Representation Theorem.)
Let
$$ a : V \times V \to \mathbb{R} , \qquad (w, v) \mapsto a(w, v) \qquad (4.2.1) $$
be a bilinear form defined on $V$; let
$$ F : V \to \mathbb{R} , \qquad v \mapsto F(v) \qquad (4.2.2) $$
be a linear form defined on $V$. We consider the following abstract problem.

Problem 4.2.1. Find $u \in V$ such that
$$ a(u, v) = F(v) \qquad \forall v \in V . \qquad (4.2.3) $$

In order to prove the solvability of this problem, we assume that the forms $a$ and $F$ are continuous. The latter condition is equivalent to $F \in V'$; concerning the form $a$, we give the following definition.

Definition 4.2.2. The bilinear form $a$ is said to be continuous in $V$ if there exists a constant $C > 0$ such that
$$ |a(w, v)| \leq C \, \|w\|_V \|v\|_V \qquad \forall w, v \in V . \qquad (4.2.4) $$
The smallest constant $C$ for which this bound holds is
$$ \|a\| = \sup_{w, v \in V \setminus \{0\}} \frac{|a(w, v)|}{\|w\|_V \|v\|_V} , \qquad (4.2.5) $$
which is termed the norm of $a$.

If the form $a$ is continuous, we can associate to it, in a canonical way, a linear operator
$$ A : V \to V' , $$
defined as follows: for any $w \in V$, the mapping $v \in V \mapsto a(w, v) \in \mathbb{R}$ is linear and continuous; therefore, it is an element of $V'$, which we denote by $Aw$. In other words, $A$ is defined by the relations
$$ (Aw)(v) = a(w, v) \qquad \forall w, v \in V . \qquad (4.2.6) $$
Thus, we can give the following operatorial expression to Problem 4.2.1.

Problem 4.2.3. Find $u \in V$ such that
$$ Au = F . $$


It will be convenient to introduce the duality pairing between $V'$ and $V$, i.e., the bilinear form
$$ \langle F, v \rangle = F(v) , \qquad F \in V' , \; v \in V . \qquad (4.2.7) $$
Then, the operator $A$ is defined by
$$ \langle Aw, v \rangle = a(w, v) \qquad \forall w, v \in V , $$
and the equation $Au = F$ can be equivalently written as
$$ \langle Au, v \rangle = \langle F, v \rangle \qquad \forall v \in V . $$
Next, we introduce the crucial property of the bilinear form $a$ which will provide, together with continuity, a sufficient condition for the solvability of Problem 4.2.1.

Definition 4.2.4. The bilinear form $a$ is said to be coercive in $V$ if there exists a constant $\alpha > 0$ such that
$$ a(v, v) \geq \alpha \|v\|_V^2 \qquad \forall v \in V . \qquad (4.2.8) $$
Any $\alpha > 0$ satisfying (4.2.8) is termed a coercivity constant of the form $a$. The best coercivity constant is given by
$$ \alpha^* = \inf_{v \in V \setminus \{0\}} \frac{a(v, v)}{\|v\|_V^2} . \qquad (4.2.9) $$

We are now ready to state the main result of the present theory.

Theorem 4.2.5 (Lax–Milgram). Assume that the bilinear form $a$ on the reflexive Banach space $V$ is continuous and coercive, with coercivity constant $\alpha$. Then, given any linear continuous form $F$ on $V$, Problem 4.2.1 admits one and only one solution, which satisfies
$$ \|u\|_V \leq \frac{1}{\alpha} \|F\|_{V'} . \qquad (4.2.10) $$

Before proving the theorem, let us make some important observations.

Remark 4.2.6. The Lax–Milgram Theorem provides sufficient conditions under which Problem 4.2.1 is well-posed in the sense of Hadamard. This means that: for any datum $F$, the problem has a solution $u$; the solution is unique; and the solution depends on the datum in a continuous way.
The latter condition is made explicit as follows: let $F_1, F_2$ be two elements in $V'$, and let $u_1, u_2 \in V$ be the solutions of the corresponding problems. Then, by linearity, the difference $u_1 - u_2$ is the solution of
$$ a(u_1 - u_2, v) = (F_1 - F_2)(v) \qquad \forall v \in V ; $$
hence, by (4.2.10) applied to this problem, we conclude that
$$ \|u_1 - u_2\|_V \leq \frac{1}{\alpha} \|F_1 - F_2\|_{V'} . $$
In other words, a small perturbation in the data yields a small perturbation in the solution. Equivalently, the mapping $F \in V' \mapsto u = A^{-1} F \in V$ is Lipschitz-continuous, with Lipschitz constant $1/\alpha$.
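The finite-dimensional analogue is instructive: on $V = \mathbb{R}^n$, coercivity of $a(w,v) = v^T A w$ amounts to positive definiteness of the symmetric part of $A$, and the bound (4.2.10) becomes an elementary linear-algebra inequality. The following check is my own sketch (the matrix, seed and scaling are arbitrary choices).

```python
import numpy as np

# Finite-dimensional sanity check of (4.2.10): with a(w, v) = v^T A w on R^n,
# the coercivity constant is the smallest eigenvalue of the symmetric part of A,
# and the unique solution of Au = F satisfies |u| <= |F| / alpha.
rng = np.random.default_rng(0)
n = 50
M = rng.standard_normal((n, n))
A = np.eye(n) + 0.02 * M              # nonsymmetric, dominated by the identity

sym = 0.5 * (A + A.T)
alpha = np.linalg.eigvalsh(sym).min() # coercivity constant of v -> v^T A v
assert alpha > 0                      # holds for this particular scaling/seed

F = rng.standard_normal(n)            # an arbitrary "linear form"
u = np.linalg.solve(A, F)             # the unique solution of Au = F

print(np.linalg.norm(u), np.linalg.norm(F) / alpha)   # first value <= second
```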


Remark 4.2.7. The Lax–Milgram Theorem can be viewed as a generalization of the Riesz Representation Theorem in a Hilbert space. Indeed, assume that $V$ is such a space and that $a(w, v)$ is a continuous and coercive bilinear form on $V$, which in addition is symmetric, i.e.,
$$ a(v, w) = a(w, v) \qquad \forall v, w \in V . \qquad (4.2.11) $$
Then, it is easily seen that the bilinear form $(w, v)_a = a(w, v)$ is an inner product in $V$, which induces a norm, $\|v\|_a = \sqrt{a(v, v)}$, equivalent to the original norm in $V$; indeed, by coercivity and continuity one has
$$ \alpha \|v\|_V^2 \leq a(v, v) \leq \|a\| \, \|v\|_V^2 \qquad \forall v \in V , $$
i.e.,
$$ \sqrt{\alpha} \, \|v\|_V \leq \|v\|_a \leq \sqrt{\|a\|} \, \|v\|_V \qquad \forall v \in V . $$
Thus, any linear form on $V$ which is continuous in the original norm of $V$ is also continuous in the norm induced by $a$, and vice-versa. Given any such form $F$, the Lax–Milgram Theorem assures the existence of a unique $u \in V$ such that
$$ (u, v)_a = F(v) \qquad \forall v \in V ; $$
this is precisely what is assured by the Riesz Representation Theorem.


Proof of Theorem 4.2.5. At first, let us remark that if we assume for the moment the existence of a solution, then its uniqueness and the bound (4.2.10) are immediate. Indeed, this inequality follows from coercivity after choosing $v = u$ in (4.2.3), since
$$ \alpha \|u\|_V^2 \leq a(u, u) = F(u) \leq \|F\|_{V'} \|u\|_V ; $$
dividing by $\alpha \|u\|_V$, we get the result. Uniqueness follows from (4.2.10): if $u_1, u_2$ are any two solutions of Problem 4.2.1, their difference satisfies
$$ a(u_1 - u_2, v) = F(v) - F(v) = 0 \qquad \forall v \in V , $$
whence $\|u_1 - u_2\|_V \leq \frac{1}{\alpha} \|F - F\|_{V'} = 0$, which implies $u_1 = u_2$.

Thus, we are left with the task of proving the existence of a solution of Problem 4.2.1. We will actually prove that the operator $A$ introduced in (4.2.6) is an (algebraic and topological) isomorphism between $V$ and $V'$. Let us proceed in several steps.

Step 1): $A : V \to V'$ is continuous. This follows, as expected, from the continuity of $a$; indeed, for any $w \in V$,
$$ \|Aw\|_{V'} = \sup_{v \in V \setminus \{0\}} \frac{\langle Aw, v \rangle}{\|v\|_V} = \sup_{v \in V \setminus \{0\}} \frac{a(w, v)}{\|v\|_V} \leq \sup_{v \in V \setminus \{0\}} \frac{\|a\| \, \|w\|_V \|v\|_V}{\|v\|_V} = \|a\| \, \|w\|_V . $$

Step 2): $A$ is injective. This follows from the coercivity of $a$. Indeed, $Aw = 0 \in V'$ means $\langle Aw, v \rangle = a(w, v) = 0$ for all $v \in V$; taking $v = w$ we get $\alpha \|w\|_V^2 \leq a(w, w) = 0$, which implies $w = 0$.
Let us introduce the image of $A$ in $V'$, i.e., the subspace
$$ Z = \mathrm{Im}(A) = \{ G \in V' : G = Aw \text{ for some } w \in V \} . $$
Then, $A$ is an algebraic isomorphism between $V$ and $Z$, whose inverse will be denoted, as usual, by $A^{-1}$.


Step 3): $A^{-1} : Z \to V$ is continuous. This follows again from the coercivity of $a$. Indeed, given any $G \in Z$, let $w \in V$ be such that $Aw = G$. This means that $w$ satisfies
$$ a(w, v) = G(v) \qquad \forall v \in V ; $$
applying the bound (4.2.10) to this problem, we get $\|A^{-1} G\|_V \leq \frac{1}{\alpha} \|G\|_{V'}$, which is precisely the claim.

Step 4): $Z$ is a closed subspace of $V'$. This follows from the completeness of $V$. Indeed, let $F$ belong to the closure of $Z$ in $V'$. Then, there exists a sequence $\{G_n\}_{n \in \mathbb{N}} \subset Z$ converging to $F$ in the norm of $V'$. Let us set $w_n = A^{-1} G_n \in V$. By the result of Step 3), we have
$$ \|w_n - w_m\|_V \leq \frac{1}{\alpha} \|G_n - G_m\|_{V'} \qquad \forall n, m \in \mathbb{N} . $$
This implies that $\{w_n\}_{n \in \mathbb{N}}$ is a Cauchy sequence in $V$; hence, it converges to some $w \in V$, since this space is complete. Then, the form $G = Aw$ belongs to $Z$ and, by Step 1), the sequence $\{G_n = A w_n\}_{n \in \mathbb{N}}$ converges to $G$ in $V'$. Since it also converges to $F$ by assumption, necessarily $F = G$ by the uniqueness of the limit; hence, $F$ belongs to $Z$.

Step 5): $Z$ coincides with $V'$. This follows from the reflexivity of $V$. By contradiction, assume that $Z$ is a proper subspace of $V'$. Then, according to the Hahn–Banach theorem, there exists a linear continuous form $\mathcal{W}$ on $V'$ such that $\mathcal{W}(G) = 0$ for all $G \in Z$, but $\mathcal{W}(F) \neq 0$ for some $F \in V' \setminus Z$. In other words, $\mathcal{W} \in V''$, and $\|\mathcal{W}\|_{V''} > 0$.
Since $V$ is reflexive, $\mathcal{W}$ can be identified with an element $w$ of $V$; precisely, there exists a unique $w \in V$ such that
$$ \mathcal{W}(F) = F(w) \qquad \forall F \in V' , \qquad \text{and} \qquad \|\mathcal{W}\|_{V''} = \|w\|_V > 0 . $$
Choosing any $F = Av \in Z$, with $v \in V$, in the previous relation yields
$$ 0 = \mathcal{W}(Av) = (Av)(w) = a(v, w) \qquad \forall v \in V ; $$
but taking $v = w$ and using again the coercivity of $a$ yields
$$ \alpha \|w\|_V^2 \leq a(w, w) = 0 , $$
which implies $\|w\|_V = 0$, i.e., $w = 0$. This contradicts the property $\|w\|_V > 0$ stated before, and the claim is proven.
The proof of the Lax–Milgram Theorem is then concluded.

4.2.1  The symmetric case: equivalence with a minimization problem

Let us consider again the relevant case of a continuous and coercive bilinear form $a(w, v)$ which is symmetric, i.e., it satisfies (4.2.11); let us show that in this case Problem 4.2.1 is equivalent to a minimization problem in $V$. Precisely, let us introduce the functional
$$ J : V \to \mathbb{R} , \qquad v \mapsto J(v) = \tfrac{1}{2} a(v, v) - F(v) , \qquad (4.2.12) $$
and let us consider the following global minimization problem for $J$:


Problem 4.2.8. Find $u \in V$ such that
$$ J(u) = \min_{v \in V} J(v) . \qquad (4.2.13) $$

In order to study such a problem, let us establish some useful properties of $J$.

Property 4.2.9. For any $w \in V$, the following identity holds:
$$ J(w + v) = J(w) + \big[ a(w, v) - F(v) \big] + \tfrac{1}{2} a(v, v) \qquad \forall v \in V . \qquad (4.2.14) $$

Proof. The result is an immediate consequence of the bilinearity and symmetry of $a$ and the linearity of $F$.

The identity can be interpreted as follows: think of $v$ as an increment $v = \delta w$ given to $w$. Then, $\delta w \mapsto a(w, \delta w) - F(\delta w)$ represents the linear part of the increment of $J$, whereas $\delta w \mapsto \tfrac{1}{2} a(\delta w, \delta w)$ represents the quadratic part. Equivalently stated, (4.2.14) is nothing but the Taylor expansion of $J$ around $w$,
$$ J(w + \delta w) = J(w) + \langle \nabla J(w), \delta w \rangle + \tfrac{1}{2} \langle H J(w) \, \delta w, \delta w \rangle \qquad \forall \delta w \in V , $$
where the gradient of $J$ at $w$ is the form $\nabla J(w) \in V'$ defined by
$$ \langle \nabla J(w), v \rangle = a(w, v) - F(v) \qquad \forall v \in V , \qquad (4.2.15) $$
whereas the hessian of $J$ at $w$ is the constant mapping $HJ(w) = HJ \in \mathcal{L}(V, V')$ defined by
$$ \langle (HJ) v_1, v_2 \rangle = a(v_1, v_2) \qquad \forall v_1, v_2 \in V . $$

A related important property is the following one.

Property 4.2.10. The functional $J$ is strictly convex in $V$, i.e., it satisfies
$$ J((1 - \theta) v_1 + \theta v_2) < (1 - \theta) J(v_1) + \theta J(v_2) \qquad \forall v_1 \neq v_2 \in V \text{ and } \forall \theta : 0 < \theta < 1 . \qquad (4.2.16) $$

Proof. Note that the inequality can be equivalently written as
$$ J(v_1 + \theta(v_2 - v_1)) < J(v_1) + \theta \big[ J(v_2) - J(v_1) \big] . $$
Then, using (4.2.14) with $w = v_1$ and $v = \theta(v_2 - v_1)$, we obtain
$$ J(v_1 + \theta(v_2 - v_1)) = J(v_1) + \theta \big[ a(v_1, v_2 - v_1) - F(v_2 - v_1) \big] + \tfrac{1}{2} \theta^2 a(v_2 - v_1, v_2 - v_1) . \qquad (4.2.17) $$
On the other hand, using the symmetry of the bilinear form, we obtain the relation
$$ a(v_2 - v_1, v_2 - v_1) = a(v_2, v_2) - 2 a(v_1, v_2) + a(v_1, v_1) , $$
which can be easily manipulated to give
$$ a(v_1, v_2) - a(v_1, v_1) = \tfrac{1}{2} a(v_2, v_2) - \tfrac{1}{2} a(v_1, v_1) - \tfrac{1}{2} a(v_2 - v_1, v_2 - v_1) ; $$
this identity, inserted in (4.2.17), yields
$$ J(v_1 + \theta(v_2 - v_1)) = J(v_1) + \theta \big[ J(v_2) - J(v_1) \big] - \tfrac{1}{2} \theta (1 - \theta) \, a(v_2 - v_1, v_2 - v_1) . $$
Since $v_2 - v_1 \neq 0$, we have by coercivity $a(v_2 - v_1, v_2 - v_1) > 0$; in addition, $\theta(1 - \theta) > 0$. This gives the result.

The last property concerns the behavior of $J$ at infinity.


Property 4.2.11. The functional $J$ satisfies
$$ J(v) \to +\infty \qquad \text{as } \|v\|_V \to \infty . \qquad (4.2.18) $$

Proof. By the coercivity of $a$ and the continuity of $F$, we get
$$ J(v) \geq \tfrac{\alpha}{2} \|v\|_V^2 - \|F\|_{V'} \|v\|_V = \|v\|_V^2 \Big( \frac{\alpha}{2} - \frac{\|F\|_{V'}}{\|v\|_V} \Big) . $$
As soon as $\|v\|_V \geq \frac{4}{\alpha} \|F\|_{V'}$, the quantity in parentheses is larger than or equal to $\frac{\alpha}{4}$, whence
$$ J(v) \geq \tfrac{\alpha}{4} \|v\|_V^2 , $$
which clearly implies the result.

Properties 4.2.10 and 4.2.11 should give the intuitive idea that the graph of $J$ in $V \times \mathbb{R}$ behaves like an elliptic paraboloid in $\mathbb{R}^3$ (but beware: here we are in infinite dimension!). In particular, the existence of a unique minimum for $J$ should not be a surprise at this point. This is made precise in the following fundamental statement.

Theorem 4.2.12. The weak Problem 4.2.1 and the minimization Problem 4.2.8 are equivalent: $u$ is a solution of the latter problem if and only if it is a solution of the former problem.

Thus, since we already know that Problem 4.2.1 admits one and only one solution, the same is true for Problem 4.2.8.

Proof. At first, assume that $u$ is a solution of Problem 4.2.1. Property 4.2.9 with $w = u$ yields
$$ J(u + v) = J(u) + \tfrac{1}{2} a(v, v) \qquad \forall v \in V . $$
If $v \neq 0$, then $a(v, v) > 0$ by coercivity, hence $J(u + v) > J(u)$, i.e., $u$ is a strict minimizer of $J$.
Conversely, let $u$ be a solution of Problem 4.2.8. For any fixed $v \in V$, consider the quadratic function (parabola) $\psi : \mathbb{R} \to \mathbb{R}$ defined by
$$ \psi(t) = J(u + t v) = J(u) + t \big[ a(u, v) - F(v) \big] + \tfrac{1}{2} t^2 a(v, v) . $$
By assumption, it has a minimum at $t = 0$; differentiating, we get
$$ 0 = \frac{d\psi}{dt}(0) = a(u, v) - F(v) ; $$
since $v$ is arbitrary, this shows that $u$ is a solution of Problem 4.2.1.

In view of (4.2.15), equations (4.2.3) state that $\nabla J(u) = 0$, i.e., they express the property that the functional $J$ is stationary at $u$. They are often called the Euler–Lagrange equations of the minimization problem.
If we consider the bilinear form $a(w, v)$ defined in (4.1.3), then it is symmetric if (and only if) $A$ is a symmetric matrix and $\mathbf{a} = 0$ throughout the domain. In this case, the weak formulation of the boundary-value problem (4.1.1), given by (4.1.5), may be referred to as the variational formulation of the problem, and the corresponding solution $u$ is called the variational solution.
Example 4.2.13. Consider the homogeneous Dirichlet problem for the Poisson equation, given in (4.1.8), and the related variational formulation (4.1.9). The corresponding functional $J : H^1_0(\Omega) \to \mathbb{R}$ is given by
$$ J(v) = \frac{1}{2} \int_\Omega \|\nabla v\|^2_{\mathbb{R}^n} - \int_\Omega f v . $$
The first integral on the right-hand side is called the Dirichlet integral of $v$ in $\Omega$.
In an (extremely simplified) description of small deformations in linear Elasticity (such as in the membrane problem), $v$ represents an admissible displacement, constrained to vanish on the boundary of the body which occupies the domain $\Omega$. Then, the quantity
$$ \frac{1}{2} \int_\Omega \|\nabla v\|^2_{\mathbb{R}^n} $$
represents the internal elastic energy associated with the displaced configuration, whereas the integral
$$ \int_\Omega f v $$
represents the potential energy associated with the work of the external force of density $f$ when the displacement $v$ takes place. Thus, $J(v)$ represents the total energy of the configuration described by $v$. Eq. (4.2.13) translates the well-known physical principle that, among all admissible displacements, the one which corresponds to the equilibrium of the elastic body under the external forces is characterized by having the minimal total energy.
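The equivalence of Theorem 4.2.12 can be observed numerically on a finite-dimensional subspace: minimizing the discrete energy gives the same vector as solving the Galerkin linear system. The sketch below is my own illustration (it assumes SciPy is available and uses an arbitrary 1D finite-difference discretization of (4.1.8)).

```python
import numpy as np
from scipy.optimize import minimize

# Discrete version of Theorem 4.2.12: minimizing J(v) = 1/2 v^T K v - b^T v
# returns the same vector as solving the Euler-Lagrange system K u = b.
n = 30
h = 1.0 / (n + 1)
K = (np.diag(2 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h          # discrete Dirichlet Laplacian
b = h * np.ones(n)                               # load vector for f = 1

J = lambda v: 0.5 * v @ K @ v - b @ v            # discrete total energy
u_min = minimize(J, np.zeros(n), jac=lambda v: K @ v - b).x
u_gal = np.linalg.solve(K, b)

print(np.max(np.abs(u_min - u_gal)))             # coincide up to solver tolerance
```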

4.3  Sufficient conditions for the well-posedness of elliptic boundary-value problems

Let us go back to the mixed boundary-value problem (4.1.1), and let us establish suitable assumptions under which Problems 4.1.2 or 4.1.3 are well-posed. Thanks to the Lax–Milgram theorem, this will be accomplished by checking that the bilinear form $a(w, v)$ defined in (4.1.3) is continuous and coercive in $H^1_{0,D}(\Omega)$, and that the linear forms $F(v)$ or $\tilde F(v)$, defined in (4.1.4) or (4.1.7), are continuous in $H^1_{0,D}(\Omega)$.
At first, let us deal with continuity.
Lemma 4.3.1. Under the assumptions on the coefficients $A$, $\mathbf{a}$, $a_0$ stated at the beginning of Sect. 4.1, one has
$$ |a(w, v)| \leq C(A, \mathbf{a}, a_0) \, \|w\|_{H^1(\Omega)} \|v\|_{H^1(\Omega)} \qquad \forall w, v \in H^1_{0,D}(\Omega) , \qquad (4.3.1) $$
where $C(A, \mathbf{a}, a_0)$ depends upon $\|A\|_{(L^\infty(\Omega))^{n \times n}}$, $\|\mathbf{a}\|_{(L^\infty(\Omega))^n}$, $\|a_0\|_{L^\infty(\Omega)}$ and $\|\mathbf{a} \cdot \mathbf{n}\|_{L^\infty(\Gamma_N)}$.

Proof. In the proof, $C$ will denote a constant, independent of $w$, $v$ and of the coefficients of the operator, which may be different from place to place.
Using Hölder's inequality (3.1.11) with $p = \infty$, $p' = p'' = 2$, we easily bound each addend in the definition of $a(w, v)$; precisely:
$$ \Big| \int_\Omega (A \nabla w) \cdot \nabla v \Big| \leq C \|A\|_{(L^\infty(\Omega))^{n \times n}} \|\nabla w\|_{(L^2(\Omega))^n} \|\nabla v\|_{(L^2(\Omega))^n} \leq C \|A\|_{(L^\infty(\Omega))^{n \times n}} \|w\|_{H^1(\Omega)} \|v\|_{H^1(\Omega)} , $$
$$ \Big| \int_\Omega w \, \mathbf{a} \cdot \nabla v \Big| \leq \|\mathbf{a}\|_{(L^\infty(\Omega))^n} \|w\|_{L^2(\Omega)} \|\nabla v\|_{(L^2(\Omega))^n} \leq \|\mathbf{a}\|_{(L^\infty(\Omega))^n} \|w\|_{H^1(\Omega)} \|v\|_{H^1(\Omega)} , $$
$$ \Big| \int_\Omega a_0 w v \Big| \leq \|a_0\|_{L^\infty(\Omega)} \|w\|_{L^2(\Omega)} \|v\|_{L^2(\Omega)} \leq \|a_0\|_{L^\infty(\Omega)} \|w\|_{H^1(\Omega)} \|v\|_{H^1(\Omega)} , $$
$$ \Big| \int_{\Gamma_N} a_n w v \Big| \leq \|\mathbf{a} \cdot \mathbf{n}\|_{L^\infty(\Gamma_N)} \|w\|_{L^2(\Gamma_N)} \|v\|_{L^2(\Gamma_N)} \leq \|\mathbf{a} \cdot \mathbf{n}\|_{L^\infty(\Gamma_N)} \|w\|_{L^2(\partial\Omega)} \|v\|_{L^2(\partial\Omega)} \leq C \|\mathbf{a} \cdot \mathbf{n}\|_{L^\infty(\Gamma_N)} \|w\|_{H^1(\Omega)} \|v\|_{H^1(\Omega)} . $$
The last inequality follows from the continuity of the trace operator $\gamma : H^1(\Omega) \to L^2(\partial\Omega)$ (recall (3.6.1)).
Lemma 4.3.2. Let the assumptions on the data $f, g, h$ stated at the beginning of Sect. 4.1 be satisfied. In addition, suppose that the function $u_g$ is any lifting of the datum $g$ inside $\Omega$ satisfying $\|u_g\|_{H^1(\Omega)} \leq 2 \|g\|_{H^{1/2}(\partial\Omega)}$ (according to the definition (3.6.5) of the $H^{1/2}(\partial\Omega)$-norm of $g$). Then, one has
$$ |F(v)| \leq \big( \|f\|_{L^2(\Omega)} + \|h\|_{L^2(\Gamma_N)} \big) \|v\|_{H^1(\Omega)} \qquad \forall v \in H^1_{0,D}(\Omega) , \qquad (4.3.2) $$
$$ |\tilde F(v)| \leq \big( \|f\|_{L^2(\Omega)} + \|h\|_{L^2(\Gamma_N)} + 2 C(A, \mathbf{a}, a_0) \|g\|_{H^{1/2}(\partial\Omega)} \big) \|v\|_{H^1(\Omega)} \qquad \forall v \in H^1_{0,D}(\Omega) . \qquad (4.3.3) $$

Proof. The first inequality follows immediately from the Cauchy–Schwarz inequality. Concerning the second one, we have
$$ |\tilde F(v)| \leq |F(v)| + |a(u_g, v)| \leq |F(v)| + C(A, \mathbf{a}, a_0) \|u_g\|_{H^1(\Omega)} \|v\|_{H^1(\Omega)} . $$
We conclude by the assumption made on $u_g$.
Next, let us define a set of assumptions on the coefficients of the operator $L$ which ensure that the bilinear form $a(w, v)$ is coercive in $H^1_{0,D}(\Omega)$ with respect to the $H^1(\Omega)$-norm. At first, let us observe that
$$ a(v, v) = \int_\Omega \nabla v \cdot A \nabla v - \int_\Omega v \, \mathbf{a} \cdot \nabla v + \int_\Omega a_0 v^2 + \int_{\Gamma_N} a_n v^2 \qquad \forall v \in H^1_{0,D}(\Omega) . \qquad (4.3.4) $$
The first integral on the right-hand side is non-negative, since the assumption that the operator is elliptic throughout the domain implies that $\xi \cdot A \xi > 0$ almost everywhere in $\Omega$, for any $\xi \in \mathbb{R}^n$, $\xi \neq 0$. However, this assumption is not sufficient to yield a control of the $L^2$-norm of $\nabla v$; we need a stronger condition, expressed by the following definition.

Definition 4.3.3. The operator $L$ is said to be uniformly elliptic in $\Omega$ if there exists a constant $\alpha_0 > 0$ such that
$$ \xi \cdot A \xi \geq \alpha_0 \|\xi\|^2_{\mathbb{R}^n} \qquad \forall \xi \in \mathbb{R}^n , \; \text{almost everywhere in } \Omega . \qquad (4.3.5) $$

If $L$ is uniformly elliptic in $\Omega$, then
$$ \int_\Omega \nabla v \cdot A \nabla v \geq \alpha_0 \int_\Omega \|\nabla v\|^2_{\mathbb{R}^n} = \alpha_0 \|\nabla v\|^2_{(L^2(\Omega))^n} . \qquad (4.3.6) $$
The second integral on the right-hand side of (4.3.4) can be manipulated as follows. We first note that $v \nabla v = \nabla(\tfrac{1}{2} v^2)$, so that
$$ \int_\Omega v \, \mathbf{a} \cdot \nabla v = \int_\Omega \mathbf{a} \cdot \nabla (\tfrac{1}{2} v^2) . $$
Next, we integrate by parts, after assuming that $\nabla \cdot \mathbf{a}$ exists in $L^\infty(\Omega)$; then,
$$ \int_\Omega v \, \mathbf{a} \cdot \nabla v = - \int_\Omega \tfrac{1}{2} (\nabla \cdot \mathbf{a}) v^2 + \int_{\Gamma_N} \tfrac{1}{2} a_n v^2 ; $$
the last integral is indeed an integral over $\Gamma_N$, since $v$ vanishes on $\Gamma_D$. Thus, substituting this expression and inequality (4.3.6) into (4.3.4) yields
$$ a(v, v) \geq \alpha_0 \|\nabla v\|^2_{(L^2(\Omega))^n} + \int_\Omega \big( \tfrac{1}{2} \nabla \cdot \mathbf{a} + a_0 \big) v^2 + \int_{\Gamma_N} \tfrac{1}{2} a_n v^2 \qquad \forall v \in H^1_{0,D}(\Omega) . $$
At this point, we make the assumptions that $\mathbf{a} \cdot \mathbf{n} \geq 0$ on $\Gamma_N$ and that there exists a constant $\mu$ such that $\tfrac{1}{2} \nabla \cdot \mathbf{a} + a_0 \geq \mu$ almost everywhere in $\Omega$. Then, the above inequality implies
$$ a(v, v) \geq \alpha_0 \|\nabla v\|^2_{(L^2(\Omega))^n} + \mu \|v\|^2_{L^2(\Omega)} \qquad \forall v \in H^1_{0,D}(\Omega) . $$
Finally, we observe that if the Poincaré–Friedrichs inequality (3.7.4) holds for $H^1_{0,D}(\Omega)$, then it is enough to assume $\mu \geq 0$ to get coercivity; indeed, recalling the first inequality in (3.7.6), we obtain
$$ a(v, v) \geq \alpha_0 \|\nabla v\|^2_{(L^2(\Omega))^n} \geq \frac{\alpha_0}{1 + C_P^2} \, \|v\|^2_{H^1(\Omega)} \qquad \forall v \in H^1_{0,D}(\Omega) . $$
Otherwise, we assume $\mu > 0$ and we deduce that
$$ a(v, v) \geq \min(\alpha_0, \mu) \big( \|\nabla v\|^2_{(L^2(\Omega))^n} + \|v\|^2_{L^2(\Omega)} \big) = \min(\alpha_0, \mu) \, \|v\|^2_{H^1(\Omega)} \qquad \forall v \in H^1_{0,D}(\Omega) . $$
Let us summarize these results as follows.

Lemma 4.3.4. Assume that $L$ is uniformly elliptic in $\Omega$, i.e., (4.3.5) holds true. Furthermore, assume that $\nabla \cdot \mathbf{a} \in L^\infty(\Omega)$ and that there exists a constant $\mu \geq 0$ for which $\tfrac{1}{2} \nabla \cdot \mathbf{a} + a_0 \geq \mu$ almost everywhere in $\Omega$; let $\mu > 0$ whenever the Poincaré–Friedrichs inequality (3.7.4) does not hold for $H^1_{0,D}(\Omega)$. Finally, assume that $\mathbf{a} \cdot \mathbf{n} \geq 0$ on $\Gamma_N$. Then, the bilinear form $a(w, v)$ is coercive in $H^1_{0,D}(\Omega)$, with coercivity constant in the $H^1(\Omega)$-norm given by
$$ \alpha = \begin{cases} \dfrac{\alpha_0}{1 + C_P^2} & \text{if the Poincaré–Friedrichs inequality holds in } H^1_{0,D}(\Omega) , \\[2mm] \min(\alpha_0, \mu) & \text{if the inequality does not hold} . \end{cases} \qquad (4.3.7) $$

The three previous lemmas guarantee that the bilinear form $a$ and the linear forms $F$ or $\tilde F$, which define Problems 4.1.2 or 4.1.3, satisfy the assumptions of the Lax–Milgram Theorem. Consequently, each of these problems is well-posed. Recalling that the solution $u$ of Problem 4.1.1 is given by $u = u_0 + u_g$, hence in particular $\|u\|_{H^1(\Omega)} \leq \|u_0\|_{H^1(\Omega)} + \|u_g\|_{H^1(\Omega)}$, we arrive at the following final result.

Theorem 4.3.5. Let the assumptions stated in Lemmas 4.3.1, 4.3.2 and 4.3.4 on the coefficients $A$, $\mathbf{a}$, $a_0$ and the data $f$, $g$, $h$ of the mixed Dirichlet/Neumann boundary-value problem (4.1.1) be satisfied. Then, the weak formulation of the problem, given by Problem 4.1.1, admits one and only one solution $u$, for which the following bound holds:
$$ \|u\|_{H^1(\Omega)} \leq C \big( \|f\|_{L^2(\Omega)} + \|g\|_{H^{1/2}(\partial\Omega)} + \|h\|_{L^2(\Gamma_N)} \big) , \qquad (4.3.8) $$
where the constant $C$ depends upon $\|A\|_{(L^\infty(\Omega))^{n \times n}}$, $\|\mathbf{a}\|_{(L^\infty(\Omega))^n}$, $\|a_0\|_{L^\infty(\Omega)}$ and $\|\mathbf{a} \cdot \mathbf{n}\|_{L^\infty(\Gamma_N)}$.


Examples 4.3.6. Let us resume the Examples 4.1.4.

i) The homogeneous Dirichlet problem for the Poisson equation, stated in (4.1.8), admits one and only one variational solution $u \in H^1_0(\Omega)$ satisfying (4.1.9). Indeed, for the operator $-\Delta$ one has $A = I$, hence $\alpha_0 = 1$ in (4.3.5); consequently, the bilinear form $a(w, v) = \int_\Omega \nabla w \cdot \nabla v$ is coercive in $H^1_0(\Omega)$, with constant $\alpha = 1/(1 + C_P^2)$, where $C_P$ is the Poincaré–Friedrichs constant in $H^1_0(\Omega)$ (see (4.3.7)). Thus, from (4.2.10) and the fact that $\|f\|_{H^{-1}(\Omega)} \leq \|f\|_{L^2(\Omega)}$ if $f \in L^2(\Omega)$, we obtain the following bound for the $H^1(\Omega)$-norm of $u$:
$$ \|u\|_{H^1(\Omega)} \leq (1 + C_P^2) \, \|f\|_{L^2(\Omega)} . \qquad (4.3.9) $$
Note that a direct estimate on the norm $\|u\|_{H^1_0(\Omega)} = \|\nabla u\|_{(L^2(\Omega))^n}$ (recall Proposition 3.7.3) can be obtained from the relations
$$ \|\nabla u\|^2_{(L^2(\Omega))^n} = a(u, u) = (f, u) \leq \|f\|_{L^2(\Omega)} \|u\|_{L^2(\Omega)} \leq C_P \|f\|_{L^2(\Omega)} \|\nabla u\|_{(L^2(\Omega))^n} , $$
which give
$$ \|u\|_{H^1_0(\Omega)} \leq C_P \|f\|_{L^2(\Omega)} . \qquad (4.3.10) $$
A similar well-posedness result holds if the datum $f$ is replaced by the more general datum $F \in H^{-1}(\Omega)$, according to (4.1.10). In this case, recalling (3.9.10), one has
$$ \|u\|_{H^1(\Omega)} \leq (1 + C_P^2) \, \|F\|_{H^{-1}(\Omega)} \leq (1 + C_P^2) \Big( \sum_{i=0}^n \|f_i\|^2_{L^2(\Omega)} \Big)^{1/2} . \qquad (4.3.11) $$
The latter result can be stated in the form of the following fundamental property.

Property 4.3.7. The operator
$$ -\Delta : H^1_0(\Omega) \to H^{-1}(\Omega) $$
is an algebraic and topological isomorphism between these spaces.

ii) Consider now the non-homogeneous Neumann problem for the Helmholtz equation, stated in (4.1.11). Assume that $a_0 \geq \mu$ almost everywhere in $\Omega$, for a suitable constant $\mu > 0$. Then, the problem admits one and only one variational solution $u \in H^1(\Omega)$ satisfying (4.1.12). The following bound holds for the $H^1(\Omega)$-norm of $u$:
$$ \|u\|_{H^1(\Omega)} \leq \frac{1}{\min(1, \mu)} \big( \|f\|_{L^2(\Omega)} + \|h\|_{L^2(\partial\Omega)} \big) . \qquad (4.3.12) $$

iii) Finally, let us consider the mixed Dirichlet/Neumann problem for the convection-diffusion equation, stated in (4.1.13). Let us assume that the Dirichlet part $\Gamma_D$ of the boundary is non-empty, and that the velocity field $\mathbf{a}$ is solenoidal, i.e., $\nabla \cdot \mathbf{a} = 0$ in $\Omega$. Note that by the very definition of $\Gamma_D$ we have $a_n \geq 0$ on $\Gamma_N$. Thus, the problem admits one and only one weak solution $u \in H^1(\Omega)$ satisfying $u = g$ on $\Gamma_D$ and (4.1.14). The following bound holds for the $H^1(\Omega)$-norm of $u$:
$$ \|u\|_{H^1(\Omega)} \leq \frac{1 + C_P^2}{\nu} \big( \|f\|_{L^2(\Omega)} + \|g\|_{H^{1/2}(\partial\Omega)} \big) , \qquad (4.3.13) $$
where $C_P$ is the Poincaré–Friedrichs constant in $H^1_{0,D}(\Omega)$.

4.4  What the weak solution satisfies

Theorem 4.3.5 ensures the existence and uniqueness of a weak solution of the boundary-value problem (4.1.1), i.e., a solution of the weak formulation (4.1.5) of the problem. This formulation has been derived by combining the partial differential equation to be satisfied inside the domain with the Dirichlet and/or Neumann conditions to be satisfied at the boundary, via integrations by parts.
From now on, we will move in the opposite direction, going back from (4.1.5) towards (4.1.1). As a first step, we ask ourselves in which sense the weak solution $u$ satisfies the partial differential equation and the boundary conditions.
Let us begin with the equation. We recall that the most general framework in which to give sense to a differential equation is the distributional one. With this in mind, we observe that the space $H^1_{0,D}(\Omega)$ surely contains the space $\mathcal{D}(\Omega)$ of all test functions for distributions. If we restrict (4.1.5) to the functions $v = \varphi$ in this space, we get
$$ \int_\Omega (A \nabla u) \cdot \nabla \varphi - \int_\Omega u \, \mathbf{a} \cdot \nabla \varphi + \int_\Omega a_0 u \varphi = \int_\Omega f \varphi \qquad \forall \varphi \in \mathcal{D}(\Omega) . \qquad (4.4.1) $$
This means that the equation is satisfied in the distributional sense, i.e., we have
$$ -\nabla \cdot (A \nabla u) + \nabla \cdot (\mathbf{a} u) + a_0 u = f \qquad \text{in } \mathcal{D}'(\Omega) . \qquad (4.4.2) $$
From this relation, we can derive an additional piece of information on the solution. Indeed, we write
$$ -\nabla \cdot (A \nabla u) = -\nabla \cdot (\mathbf{a} u) - a_0 u + f \qquad \text{in } \mathcal{D}'(\Omega) , $$
and we observe that the right-hand side, which we write as $-(\nabla \cdot \mathbf{a}) u - \mathbf{a} \cdot \nabla u - a_0 u + f$, is a sum of functions belonging to $L^2(\Omega)$. Thus, the principal part of the operator, $L^{(2)} u = -\nabla \cdot (A \nabla u)$, is not just an element of $\mathcal{D}'(\Omega)$ (or of $H^{-1}(\Omega)$), but is an element of $L^2(\Omega)$. Therefore, if we define the domain of the operator $L^{(2)}$ as the space
$$ D(L^{(2)}) = \{ v \in H^1(\Omega) : L^{(2)} v \in L^2(\Omega) \} , \qquad (4.4.3) $$
we conclude that the solution $u$ is not just a function in $H^1(\Omega)$, but it satisfies
$$ u \in D(L^{(2)}) . \qquad (4.4.4) $$
The property $\nabla \cdot (A \nabla u) \in L^2(\Omega)$ (together with the property $\nabla \cdot (\mathbf{a} u) \in L^2(\Omega)$, already used above) allows us to write (4.4.1) in the equivalent form
$$ \int_\Omega \big( -\nabla \cdot (A \nabla u) \big) \varphi + \int_\Omega \nabla \cdot (\mathbf{a} u) \, \varphi + \int_\Omega a_0 u \varphi = \int_\Omega f \varphi \qquad \forall \varphi \in \mathcal{D}(\Omega) ; $$
since $\mathcal{D}(\Omega)$ is dense also in $L^2(\Omega)$, this is equivalent to
$$ -\nabla \cdot (A \nabla u) + \nabla \cdot (\mathbf{a} u) + a_0 u = f \qquad \text{in } L^2(\Omega) , \qquad (4.4.5) $$
i.e., the equation is actually satisfied in a stronger sense than the distributional one; compare with (4.4.2).
Condition (4.4.4) also has an important consequence for the interpretation of the Neumann boundary condition. Indeed, the following crucial property holds.


Proposition 4.4.1. For any function $v \in D(L^{(2)})$, the conormal derivative $\frac{\partial v}{\partial n_A}$ is well defined as an element of the dual space $H^{-1/2}(\partial\Omega) = (H^{1/2}(\partial\Omega))'$. Precisely, the following formula holds:
$$ \Big\langle \frac{\partial v}{\partial n_A} , \eta \Big\rangle = \int_\Omega \nabla \cdot (A \nabla v) \, w_\eta + \int_\Omega (A \nabla v) \cdot \nabla w_\eta \qquad \forall \eta \in H^{1/2}(\partial\Omega) , \qquad (4.4.6) $$
where $w_\eta$ is any function in $H^1(\Omega)$ such that $(w_\eta)_{|\partial\Omega} = \eta$.

Proof. (Sketch) The formula is the classical integration-by-parts formula if $v$ and $w_\eta$ are smooth functions (e.g., if they belong to $C^1(\overline\Omega)$). Then, the result is extended by a density argument.

We are now ready to discuss in which sense the boundary conditions are satisfied. The Dirichlet condition is clear: since $u \in H^1(\Omega)$, its trace on $\partial\Omega$ is a function in $H^{1/2}(\partial\Omega)$, and we have required that this function coincide with the datum $g$ on $\Gamma_D$. Concerning the Neumann condition, we first observe that (4.4.4) and Proposition 4.4.1 yield
$$ \frac{\partial u}{\partial n_A} \in H^{-1/2}(\partial\Omega) , \qquad (4.4.7) $$
with
$$ \Big\langle \frac{\partial u}{\partial n_A} , v_{|\partial\Omega} \Big\rangle = \int_\Omega \nabla \cdot (A \nabla u) \, v + \int_\Omega (A \nabla u) \cdot \nabla v \qquad \forall v \in H^1_{0,D}(\Omega) . $$
We now use (4.4.5) to write
$$ \Big\langle \frac{\partial u}{\partial n_A} , v_{|\partial\Omega} \Big\rangle = \int_\Omega \big( \nabla \cdot (\mathbf{a} u) + a_0 u - f \big) v + \int_\Omega (A \nabla u) \cdot \nabla v = \int_\Omega (A \nabla u) \cdot \nabla v - \int_\Omega u \, \mathbf{a} \cdot \nabla v + \int_\Omega a_0 u v + \int_{\Gamma_N} a_n u v - \int_\Omega f v . $$
Finally, we use the fact that $u$ is the solution of the weak formulation (4.1.5), so that the right-hand side equals $\int_{\Gamma_N} h v$. In other words, the conormal derivative of $u$ satisfies
$$ \Big\langle \frac{\partial u}{\partial n_A} , \eta \Big\rangle = \int_{\Gamma_N} h \, \eta \qquad \forall \eta \in H^{1/2}(\partial\Omega) , \; \eta = 0 \text{ on } \Gamma_D . \qquad (4.4.8) $$
Since the functions $\eta \in H^{1/2}(\partial\Omega)$ satisfying $\eta = 0$ on $\Gamma_D$ are dense in $L^2(\Gamma_N)$, we conclude that the conormal derivative of $u$ induces a linear form on $L^2(\Gamma_N)$ which coincides with $h$. We write
$$ \frac{\partial u}{\partial n_A}\Big|_{\Gamma_N} = h \qquad \text{in } L^2(\Gamma_N) ; \qquad (4.4.9) $$
this is precisely the way in which the Neumann boundary condition is satisfied by $u$.
We summarize the results obtained so far in the following theorem.

Theorem 4.4.2. Under the sole assumptions on the domain, the coefficients and the data for which Theorem 4.3.5 holds, the weak solution of the boundary-value problem (4.1.1) is such that:
$$ u \in D(L^{(2)}) \quad \text{and} \quad -\nabla \cdot (A \nabla u) + \nabla \cdot (\mathbf{a} u) + a_0 u = f \quad \text{in } L^2(\Omega) ; $$
$$ u_{|\partial\Omega} \in H^{1/2}(\partial\Omega) \quad \text{and} \quad u_{|\Gamma_D} = g_{|\Gamma_D} ; $$
$$ \frac{\partial u}{\partial n_A} \in H^{-1/2}(\partial\Omega) \quad \text{and} \quad \frac{\partial u}{\partial n_A}\Big|_{\Gamma_N} = h \quad \text{in } L^2(\Gamma_N) . $$

4.5  Back to classics: the regularity of the weak solution

At last, we investigate under which conditions on the domain $\Omega$, the coefficients $A$, $\mathbf{a}$, $a_0$ of the operator and the data $f, g, h$ the weak solution is a more regular function, up to being a classical solution of the boundary-value problem (4.1.1).
We first discuss regularity inside the domain; then we deal with regularity up to the boundary.

4.5.1  Internal regularity

The elliptic nature of the differential operator implies that there are no real characteristics; hence, there is no propagation of singularities. In other words, the smoothness of $u$ around a point in $\Omega$ only depends on the smoothness of the coefficients and of the datum $f$ around that point. The result is made precise in the following statement.

Proposition 4.5.1. Let $\Omega'$ be any open set contained with its closure in $\Omega$. Let $m \geq 0$ be a non-negative integer and assume that there exists an open set $\Omega''$, satisfying $\overline{\Omega'} \subset \Omega'' \subseteq \Omega$, such that
$$ A \in (C^{m+1}(\Omega''))^{n \times n} , \quad \mathbf{a} \in (C^{m+1}(\Omega''))^n , \quad a_0 \in C^{m+1}(\Omega'') , \quad \text{and} \quad f \in H^m(\Omega'') . $$
Then,
$$ u \in H^{m+2}(\Omega') . $$

The result means that if the coefficients of the operator are sufficiently smooth, then there is a gain of two orders of Sobolev regularity between the datum $f$ and the solution $u$, i.e.,
$$ f \in H^m(\Omega'') \quad \Longrightarrow \quad u \in H^{m+2}(\Omega') . \qquad (4.5.1) $$

This is a manifestation of the property that elliptic operators are regularizing operators.
A partial yet simple justification of the previous result can be provided for the model equation
$$ -\Delta u = f \qquad \text{in } \Omega = \mathbb{R}^n , $$
using the powerful tool of the Fourier transform. Indeed, using (3.4.3), if both $u$ and $f$ belong to $L^2(\mathbb{R}^n)$ this equation is equivalent to
$$ \|\xi\|^2 \, \hat u(\xi) = \hat f(\xi) , \qquad \xi \in \mathbb{R}^n . \qquad (4.5.2) $$
Then, one has
$$ \int_{\mathbb{R}^n} \big( 1 + \|\xi\|^{2(m+2)} \big) |\hat u(\xi)|^2 \, d\xi = \int_{\mathbb{R}^n} |\hat u(\xi)|^2 \, d\xi + \int_{\mathbb{R}^n} \|\xi\|^{2m} \|\xi\|^4 |\hat u(\xi)|^2 \, d\xi = \int_{\mathbb{R}^n} |\hat u(\xi)|^2 \, d\xi + \int_{\mathbb{R}^n} \|\xi\|^{2m} |\hat f(\xi)|^2 \, d\xi \leq \int_{\mathbb{R}^n} |\hat u(\xi)|^2 \, d\xi + \int_{\mathbb{R}^n} \big( 1 + \|\xi\|^{2m} \big) |\hat f(\xi)|^2 \, d\xi . $$
Using the expression (3.4.4) for the Sobolev norms, one easily gets
$$ \|u\|_{H^{m+2}(\mathbb{R}^n)} \leq C \big( \|u\|_{L^2(\mathbb{R}^n)} + \|f\|_{H^m(\mathbb{R}^n)} \big) , \qquad (4.5.3) $$
which is precisely (4.5.1) in the particular situation of $\Omega$ being the full space.
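The same mechanism is easy to observe numerically in a periodic setting. The sketch below is my own illustration (not from the notes): it solves the torus analogue of (4.5.2) by dividing Fourier coefficients by $k^2$, and compares discrete Sobolev-type norms; the choice of datum and of the exponent $m$ is arbitrary.

```python
import numpy as np

# Periodic illustration of the Fourier regularity argument: solve -u'' = f on
# the torus via u_hat(k) = f_hat(k) / k^2 and compare Sobolev-type norms
#   ||v||_{H^s}^2 ~ sum_k (1 + k^2)^s |v_hat(k)|^2 .
N = 2**10
x = 2 * np.pi * np.arange(N) / N
f = np.sign(np.sin(x))                    # discontinuous datum (low regularity)
f -= f.mean()                             # zero mean, so the problem is solvable

k = np.fft.fftfreq(N, d=1.0 / N)          # integer wavenumbers
f_hat = np.fft.fft(f) / N
u_hat = np.zeros_like(f_hat)
u_hat[k != 0] = f_hat[k != 0] / k[k != 0] ** 2

def sobolev_norm(v_hat, s):
    return np.sqrt(np.sum((1 + k**2) ** s * np.abs(v_hat) ** 2))

m = 0.25                                  # illustrative exponent with f in H^m
lhs = sobolev_norm(u_hat, m + 2)
rhs = sobolev_norm(u_hat, 0) + sobolev_norm(f_hat, m)
print(lhs, rhs)                           # lhs is bounded by a fixed multiple of rhs
```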


Going back to the general situation, the condition $u \in H^{m+2}(\Omega')$ with $m$ large enough implies classical regularity of $u$, thanks to the Sobolev Imbedding Theorem 3.8.2. Precisely, if $m > n/2 - 2$ then
$$ u \in H^{m+2}(\Omega') \quad \Longrightarrow \quad u \in C^{k,\sigma}(\overline{\Omega'}) , $$
with $k = [m + 2 - n/2]$ and $\sigma = m + 2 - n/2 - k$ if $m - n/2$ is not an integer, or $k = m + 1 - n/2$ and $\sigma < 1$ arbitrary if $m - n/2$ is an integer. In particular, if the coefficients and the datum $f$ are infinitely differentiable in $\Omega$ (so that one can take $m$ arbitrarily large), then $u$ is infinitely differentiable in $\Omega$.
It is worthwhile detailing certain results of minimal regularity in dimension two and three.

Corollary 4.5.2. Let $n = 2$ or $3$, and let the assumptions of Proposition 4.5.1 hold. Then,
$m = 0$ implies $u \in C^0(\Omega)$;
$m = 1$ implies $u \in C^1(\Omega)$;
$m = 2$ implies $u \in C^2(\Omega)$.
In the latter case, $u$ is a classical solution of the partial differential equation at each point of $\Omega$.

4.5.2  Regularity up to the boundary

Sobolev regularity up to the boundary of $\Omega$ can be achieved if the coefficients and the data are sufficiently smooth, and in addition if the boundary $\partial\Omega$ is a smooth manifold which does not contain points where a change between Dirichlet and Neumann boundary conditions occurs. The simplest situation is described in the following proposition.

Proposition 4.5.3. Let $m \geq 0$ be a non-negative integer such that
$$ A \in (C^{m+1}(\overline\Omega))^{n \times n} , \quad \mathbf{a} \in (C^{m+1}(\overline\Omega))^n , \quad a_0 \in C^{m+1}(\overline\Omega) , \quad \text{and} \quad f \in H^m(\Omega) . $$
In addition, assume that
$\partial\Omega$ is a manifold of class $C^{m+1}$;
either $\Gamma_D$ or $\Gamma_N$ is empty;
if $\Gamma_D \neq \emptyset$, then $g$ is the trace of a function $v_g \in H^{m+2}(\Omega)$;
if $\Gamma_N \neq \emptyset$, then $h$ is the trace of a function $z_h \in H^{m+1}(\Omega)$.
Under these assumptions, one has
$$ u \in H^{m+2}(\Omega) . $$

The result can be extended to more general situations, for instance when $\Gamma_D$ and $\Gamma_N$ are both non-empty, but each connected component of $\partial\Omega$ is completely contained in one of these sets. In all cases, a result of classical regularity similar to Corollary 4.5.2 holds in the whole of $\overline\Omega$.
The question of assessing the regularity of the weak solution $u$ up to the boundary becomes quite delicate if $\partial\Omega$ is not a manifold of class $C^{m+1}$ (for instance, if it has corners), or if there is a transition between Dirichlet and Neumann boundary conditions at some point of $\partial\Omega$. We confine ourselves to the illustration of some of the possible situations in the case of a polygonal domain.


The case of a polygonal domain


The following astonishingly simple example indicates how delicate the issue of regularity in a non-smooth domain is. Consider the homogeneous Dirichlet problem for the Poisson equation, stated in (4.1.8). Assume that $\Omega$ is the square $(0, 1) \times (0, 1)$, and take $f$ to be the constant function $f \equiv 1$.
Despite the fact that we have a constant-coefficient operator and the data $f$ and $g = 0$ are constant, we cannot have $u \in C^2(\overline\Omega)$! Indeed, if this were the case, we would have both $\frac{\partial^2 u}{\partial x^2}$ and $\frac{\partial^2 u}{\partial y^2}$ in $C^0(\overline\Omega)$. This would imply $\Delta u \in C^0(\overline\Omega)$, and since the Laplacian of $u$ equals $-1$ in $\Omega$, we would also have $\Delta u = -1$ on $\partial\Omega$; in particular,
$$ \Delta u(0, 0) = -1 . $$
On the other hand, the boundary condition on the side $[0, 1] \times \{0\}$ implies $\frac{\partial^2 u}{\partial x^2}(x, 0) = 0$ for $0 < x < 1$; by continuity, we would also have $\frac{\partial^2 u}{\partial x^2}(0, 0) = 0$. Similarly, using the boundary condition on the side $\{0\} \times [0, 1]$ we would have $\frac{\partial^2 u}{\partial y^2}(0, 0) = 0$, whence
$$ \Delta u(0, 0) = 0 , $$
a contradiction with the previous statement.
We aim at getting some understanding of the effect of the presence of corners and/or of transition points between boundary conditions upon the regularity of the solution. We confine ourselves to the following situation: we consider a simplified, yet significant, form of the boundary-value problem (4.1.1), i.e., the homogeneous mixed Dirichlet/Neumann problem for the Poisson equation,
$$ \begin{cases} -\Delta u = f & \text{in } \Omega , \\ u = 0 & \text{on } \Gamma_D , \\ \dfrac{\partial u}{\partial n} = 0 & \text{on } \Gamma_N , \end{cases} \qquad (4.5.4) $$
and we pose the following question: under which conditions is the implication
$$ f \in L^2(\Omega) = H^0(\Omega) \quad \Longrightarrow \quad u \in H^2(\Omega) \qquad (4.5.5) $$
true?
We assume that $\Omega$ is a bounded polygonal domain in $\mathbb{R}^2$ with vertices $V_i$, $i = 1, \dots, I$. Each side $\Gamma_i$ (which could even be a curved side) carries a unique type of boundary condition, i.e., either $\Gamma_i \subseteq \Gamma_D$ or $\Gamma_i \subseteq \Gamma_N$.
Let us assume that the vertex $V_i$ is common to the sides $\Gamma_i$ and $\Gamma_{i+1}$ (setting $\Gamma_{I+1} = \Gamma_1$); let $\omega_i \in (0, 2\pi)$ be the measure of the angle at $V_i$ contained in $\Omega$. Note that the situation of a point $P$ internal to a side at which there is a change of type of boundary condition can be included in the present setting by considering $P$ as an additional vertex of the polygon, with associated angle of measure $\pi$.
We also associate an angle $\varphi_i$ to each side $\Gamma_i$, by setting $\varphi_i = 0$ if $\Gamma_i \subseteq \Gamma_N$ and $\varphi_i = \frac{\pi}{2}$ if $\Gamma_i \subseteq \Gamma_D$. (More generally, we could enforce as boundary condition the vanishing of an oblique derivative of $u$, i.e., $\frac{\partial u}{\partial \lambda_i} = 0$ on $\Gamma_i$, where $\lambda_i$ is a fixed unit vector which is not perpendicular to the normal vector $\mathbf{n}_i$; in such a case, $\varphi_i$ would be the measure of the angle between $\lambda_i$ and $\mathbf{n}_i$.)
With these notations at hand, we define the quantities
$$ \sigma_{i,m} = \frac{\varphi_i - \varphi_{i+1} - m\pi}{\omega_i} , \qquad m \in \mathbb{Z} , \qquad (4.5.6) $$
and, correspondingly, the functions $S_{i,m}$ which, in a polar coordinate system $(r, \theta)$ around the vertex $V_i$, take the form
$$ S_{i,m}(r, \theta) = \frac{r^{\sigma_{i,m}}}{\sigma_{i,m}} \cos\big( \sigma_{i,m} \theta - \varphi_{i+1} \big) \, \eta_i(r, \theta) ; $$
here, each cut-off function $\eta_i$ is infinitely differentiable, takes the value 1 at $V_i$ and vanishes identically if $r$ is large enough.
The following result describes the structure of the solution of Problem (4.5.4).

Theorem 4.5.4. Let $u \in H^1_{0,D}(\Omega)$ be the weak solution of Problem (4.5.4), with $f \in L^2(\Omega)$. Then, $u$ can be represented as
$$ u = u_{\mathrm{reg}} + u_{\mathrm{sing}} , $$
where $u_{\mathrm{reg}} \in H^2(\Omega)$, while
$$ u_{\mathrm{sing}} = \sum_{i=1}^{I} \; \sum_{0 < \sigma_{i,m} < 1} c_{i,m}(f) \, S_{i,m} $$
for suitable coefficients $c_{i,m}(f)$ depending on $f$. Each $S_{i,m}$ belongs to $H^1(\Omega)$ but not to $H^2(\Omega)$.

The theorem states that $u$ is $H^2$ in a neighborhood of the vertex $V_i$ if there is no $m \in \mathbb{Z}$ such that $0 < \sigma_{i,m} < 1$; globally, $u \in H^2(\Omega)$ if all the $\sigma_{i,m}$ fall outside the interval $(0, 1)$.
Let us detail a few relevant situations.
Examples 4.5.5. i) Consecutive sides carrying the same boundary condition.
Assume that both $\Gamma_i$ and $\Gamma_{i+1}$ are contained in $\Gamma_D$, or in $\Gamma_N$. Then, $\varphi_i = \varphi_{i+1}$, so that
$$ \sigma_{i,m} = -m \, \frac{\pi}{\omega_i} . $$
If $\Omega$ is convex around $V_i$, i.e., if $0 < \omega_i < \pi$, then $\pi/\omega_i > 1$, so that there is no $m \in \mathbb{Z}$ such that $0 < \sigma_{i,m} < 1$: the solution is $H^2$ in a neighborhood of $V_i$.
This result is a particular case of the following property.

Property 4.5.6. If $\Omega$ is convex, and either $\Gamma_D = \emptyset$ or $\Gamma_N = \emptyset$, then $u \in H^2(\Omega)$ if $f \in L^2(\Omega)$.

Obviously, the counter-example at the beginning of the present section shows that a similar result cannot hold for higher-order regularity: the convexity of the domain is not enough to ensure that $f \in H^2(\Omega)$ implies $u \in H^4(\Omega)$.
Going back to our discussion, if $\Omega$ is not convex around $V_i$, i.e., if $\pi < \omega_i < 2\pi$, then $\frac{1}{2} < \pi/\omega_i < 1$, so that there is exactly one $m \in \mathbb{Z}$, namely $m = -1$, such that $0 < \sigma_{i,m} < 1$: the solution is not $H^2$ in any neighborhood of $V_i$. For instance, if $\omega_i = \frac{3\pi}{2}$, as in the re-entrant corner of an L-shaped domain, then one can prove that $u$ belongs at most to $H^{7/4}$ in a neighborhood of $V_i$.

ii) Consecutive sides carrying different boundary conditions.
Assume that $\Gamma_i \subseteq \Gamma_D$, whereas $\Gamma_{i+1} \subseteq \Gamma_N$. Then, $\varphi_i - \varphi_{i+1} = \frac{\pi}{2}$, so that
$$ \sigma_{i,m} = \Big( \frac{1}{2} - m \Big) \frac{\pi}{\omega_i} . $$
If $V_i$ is a point internal to a straight side of $\partial\Omega$, at which the imposed boundary condition switches from Dirichlet to Neumann, then $\omega_i = \pi$, so that $\sigma_{i,m} = \frac{1}{2} - m$. Thus, $0 < \sigma_{i,m} < 1$ exactly for $m = 0$, and we have a singularity at $V_i$: one can prove that $u$ belongs at most to $H^{3/2}$ in a neighborhood of $V_i$.
Conversely, if $\Gamma_i$ and $\Gamma_{i+1}$ are perpendicular to each other, i.e., if $\omega_i = \frac{\pi}{2}$, then $\sigma_{i,m} = 1 - 2m$. In this case, no $\sigma_{i,m}$ satisfies $0 < \sigma_{i,m} < 1$; hence, the solution is $H^2$ in a neighborhood of $V_i$.
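The exponents $\sigma_{i,m}$ of (4.5.6) are straightforward to tabulate; the following helper (my own sketch, with an arbitrary range of $m$) reproduces the four corner configurations discussed in the examples above.

```python
import numpy as np

# Evaluate the exponents sigma_{i,m} of (4.5.6) and report those in (0,1),
# i.e., the exponents responsible for a corner singularity.
def singular_exponents(phi_i, phi_ip1, omega, m_range=range(-5, 6)):
    """phi = pi/2 for a Dirichlet side, 0 for a Neumann side."""
    sigmas = [(phi_i - phi_ip1 - m * np.pi) / omega for m in m_range]
    return [s for s in sigmas if 0 < s < 1]

D, N = np.pi / 2, 0.0
print(singular_exponents(D, D, np.pi / 2))        # convex corner, same BC: []
print(singular_exponents(D, D, 3 * np.pi / 2))    # L-shaped re-entrant corner: [2/3]
print(singular_exponents(D, N, np.pi))            # D/N switch on a straight side: [1/2]
print(singular_exponents(D, N, np.pi / 2))        # D/N switch at a right angle: []
```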

4.6  Exercises

4.1. Consider the following problem in $\Omega = (0, 1)^2 \subset \mathbb{R}^2$:
$$ \begin{cases} -\Delta u = f & \text{in } \Omega , \\ u = 0 & \text{on } \Gamma_2 \cup \Gamma_3 \cup \Gamma_4 , \\ \dfrac{\partial u}{\partial n} + \lambda u = 0 & \text{on } \Gamma_1 , \end{cases} $$
where $\lambda \in \mathbb{R}$ and $\Gamma_1 = (0, 1) \times \{0\}$, $\Gamma_2 = \{1\} \times (0, 1)$, $\Gamma_3 = (0, 1) \times \{1\}$, $\Gamma_4 = \{0\} \times (0, 1)$.
(i) Write the variational formulation of the problem.
(ii) Find the conditions on $\lambda$ which guarantee the coercivity of the associated bilinear form.

4.2. Setting up a suitable bilinear form in $V = H^1_0(\Omega) \times H^1_0(\Omega)$, use the Lax–Milgram Theorem to prove the existence and uniqueness of the solution of the following elliptic system:
$$ \begin{cases} -\Delta u_1 + u_1 + \dfrac{\partial u_2}{\partial x_1} = f_1 & \text{in } \Omega , \\ -\Delta u_2 + u_1 + u_2 = f_2 & \text{in } \Omega , \\ u_1 = u_2 = 0 & \text{on } \partial\Omega , \end{cases} $$
where $\Omega$ is a bounded open set in $\mathbb{R}^n$ and $f_1, f_2 \in H^{-1}(\Omega)$.

Chapter 5

The Maximum Principle

Under the name of Maximum Principle we find several important theoretical properties of the solutions of elliptic and parabolic problems, all related to the ordering relation between real numbers. The Maximum Principle can be expressed in various forms, from the classical ones to more general statements derived from the weak, or variational, formulations of the problems. In this chapter, we will confine ourselves to elliptic boundary-value problems. The Maximum Principle for parabolic problems will be briefly accounted for in a later section.

5.1  Classical Results

In this section, for the sake of simplicity, we consider the Laplace operator only, although most of the results hold for more general elliptic operators, under suitable assumptions on the coefficients. The general treatment is postponed to the next section.
The first expression of the Maximum Principle that we present has an immediate physical interpretation. Let a thin elastic membrane occupy, when no load is applied to it, the position of a plane domain $\Omega$, and let it be attached to the boundary $\partial\Omega$. Then, if we apply a vertical load pointing upwards, the membrane gets inflated upwards as well.

Proposition 5.1.1. Let $\Omega$ be a bounded domain, with sufficiently smooth boundary $\partial\Omega$. Let $u \in C^0(\overline\Omega) \cap C^2(\Omega)$ be the solution of the problem
$$ \begin{cases} -\Delta u = f & \text{in } \Omega , \\ u = 0 & \text{on } \partial\Omega . \end{cases} $$
If $f \geq 0$ in $\Omega$, then $u \geq 0$ in $\Omega$.

Proof. Suppose at first $f > 0$ in $\Omega$ and assume by contradiction that there exists $x \in \Omega$ such that $u(x) < 0$. Since $u$ is continuous in the bounded set $\overline\Omega$, there exists $\bar x \in \Omega$ such that
$$ u(\bar x) = \min_{\overline\Omega} u < 0 . $$
Then, necessarily $\nabla u(\bar x) = 0$, and furthermore the Hessian of $u$ at $\bar x$ is nonnegative, which implies in particular
$$ \frac{\partial^2 u}{\partial x_i^2}(\bar x) \geq 0 , \qquad i = 1, 2, \dots, n . $$
Thus
$$ \Delta u(\bar x) = \sum_{i=1}^{n} \frac{\partial^2 u}{\partial x_i^2}(\bar x) \geq 0 , $$


which contradicts the fact that $-\Delta u(\bar x) = f(\bar x) > 0$.
Suppose now $f \geq 0$ in $\Omega$; take any $\varepsilon > 0$ and set $f_\varepsilon(x) = f(x) + \varepsilon$, so that $f_\varepsilon > 0$ in $\Omega$. Let $u_\varepsilon$ be the solution of the problem
$$ \begin{cases} -\Delta u_\varepsilon = f_\varepsilon & \text{in } \Omega , \\ u_\varepsilon = 0 & \text{on } \partial\Omega . \end{cases} $$
By linearity we can use the superposition principle, which allows us to write
$$ u_\varepsilon(x) = u(x) + \varepsilon \tilde u(x) , \qquad (5.1.1) $$
where $\tilde u$ solves the problem
$$ \begin{cases} -\Delta \tilde u = 1 & \text{in } \Omega , \\ \tilde u = 0 & \text{on } \partial\Omega . \end{cases} $$
This solution satisfies $\tilde u \in C^0(\overline\Omega) \cap C^2(\Omega)$, since the right-hand side is infinitely smooth and $\partial\Omega$ is supposed to be sufficiently smooth to allow this. Consequently, by (5.1.1) the same property holds for $u_\varepsilon$. Thus, we can apply to this function the result proven above and get $u_\varepsilon \geq 0$ in $\Omega$. Taking the limit in (5.1.1) as $\varepsilon \to 0$, we finally get $u \geq 0$ in $\Omega$.
Remark 5.1.2. The proof of the Proposition easily shows that $u$ cannot have points of strict local minimum inside the domain.

Remark 5.1.3. The previous result is also known as the Weak Maximum Principle. It is possible to prove even a Strong Maximum Principle, which states that if $\Omega$ is a connected domain and if $f \geq 0$ in $\Omega$ with $\int_\Omega f(x) \, dx > 0$, then $u > 0$ in $\Omega$.

So far, we have assumed zero boundary conditions and a non-zero right-hand side. The following result applies to harmonic functions, and provides a bound for their values in the interior of the domain in terms of the boundary values.

Proposition 5.1.4. Let $\Omega$ be a bounded domain and let $u \in C^0(\overline\Omega) \cap C^2(\Omega)$ be the solution of the problem
$$ \begin{cases} -\Delta u = 0 & \text{in } \Omega , \\ u = g & \text{on } \partial\Omega ; \end{cases} $$
then
$$ \min_{\partial\Omega} g \leq u(x) \leq \max_{\partial\Omega} g , \qquad \forall x \in \Omega . $$

Proof. Suppose that the second inequality does not hold, and assume that there exists $\bar x \in \Omega$ such that
$$ u(\bar x) = \max_{\overline\Omega} u > \max_{\partial\Omega} g ; $$
we can then choose $\lambda \in \mathbb{R}$ such that
$$ \max_{\partial\Omega} g < \lambda < u(\bar x) . $$
Define now $\Omega_\lambda := \{ x \in \Omega : u(x) > \lambda \}$; this set is nonempty (since $\bar x \in \Omega_\lambda$), bounded (since $\Omega_\lambda \subseteq \Omega$) and open (since $u$ is continuous); furthermore, $\partial\Omega_\lambda = \{ x \in \overline\Omega : u(x) = \lambda \}$. In $\Omega_\lambda$, $u$ solves the Dirichlet problem
$$ \begin{cases} -\Delta u = 0 & \text{in } \Omega_\lambda , \\ u = \lambda & \text{on } \partial\Omega_\lambda . \end{cases} $$
By the uniqueness of the solution, we must then have $u \equiv \lambda$ in $\Omega_\lambda$; but, by assumption, $u(\bar x) > \lambda$, a contradiction. The first inequality in the thesis is proven similarly.


Remark 5.1.5. For harmonic functions, we can actually state more: a harmonic function cannot have strict maxima or minima inside its domain.
The result is an immediate consequence of the following property of harmonic functions, which we will derive in a moment: given a point $x \in \Omega$ and any neighborhood $B_R(x) = \{ z : \|z - x\| < R \}$ contained in $\Omega$, with boundary $\partial B_R = \{ z : \|z - x\| = R \}$, we have the expression
$$ u(x) = \frac{1}{|\partial B_R|} \int_{\partial B_R} u(\xi) \, d\xi , \qquad (5.1.2) $$
where $|\partial B_R|$ denotes the measure of $\partial B_R$. As a consequence, $u$ cannot achieve, for instance, a strict local maximum value at $x$, since in that case we would have $u(\xi) < u(x)$ for every $\xi$ which parametrizes $\partial B_R$ (with $R$ small enough), and then
$$ u(x) = \frac{1}{|\partial B_R|} \int_{\partial B_R} u(\xi) \, d\xi < u(x) \, \frac{1}{|\partial B_R|} \int_{\partial B_R} d\xi = u(x) , $$
a contradiction. For a strict local minimum value at $x$, the reasoning is analogous.
We will derive (5.1.2) in dimension 2, the general case being similar. Consider the function $v(z) = \frac{1}{2\pi} \log \frac{r}{R}$, with $r = \|z - x\|$, which according to (2.3.2) satisfies
$$ \begin{cases} \Delta v = \delta_x & \text{in } B_R(x) , \\ v = 0 & \text{on } \partial B_R . \end{cases} $$
Then,
$$ u(x) = \langle \delta_x, u \rangle = \langle \Delta v, u \rangle = \int_{B_R(x)} \Delta v \, u \, dx = \int_{B_R(x)} v \, \Delta u \, dx + \int_{\partial B_R} \frac{\partial v}{\partial n} u \, d\xi - \int_{\partial B_R} v \, \frac{\partial u}{\partial n} \, d\xi = \int_{\partial B_R} \frac{\partial v}{\partial n} u \, d\xi , $$
since $u$ is harmonic in $B_R(x)$ and $v$ vanishes on $\partial B_R$. We conclude by observing that $\frac{\partial v}{\partial n} = \frac{1}{2\pi R}$ on $\partial B_R$.
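The mean-value property (5.1.2) is easy to check numerically on an explicit harmonic function. The sketch below is my own (the polynomial, center and radius are arbitrary choices).

```python
import numpy as np

# Numerical check of (5.1.2) for the harmonic function u(x, y) = x^2 - y^2 + 3x:
# the average of u over a circle of radius R centered at (x0, y0) equals u(x0, y0).
u = lambda x, y: x**2 - y**2 + 3 * x

x0, y0, R = 0.4, -0.2, 0.15
theta = np.linspace(0.0, 2 * np.pi, 100000, endpoint=False)
circle_avg = np.mean(u(x0 + R * np.cos(theta), y0 + R * np.sin(theta)))

print(circle_avg, u(x0, y0))   # the two values agree up to quadrature error
```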

5.2  Variational Results

The results presented in the previous section are special cases of a very general theorem, which can be proved by a variational technique due to G. Stampacchia. In order to obtain such a theorem, we need at first some preliminary considerations.
Let $v$ be a function defined in $\Omega$; we set
$$ v^+ = \max(v, 0) \qquad \text{and} \qquad v^- = \max(-v, 0) . $$
The functions $v^+$ and $v^-$ are said to be the positive part and the negative part of $v$, respectively. Note that the following decomposition holds true:
$$ v = v^+ - v^- ; $$
moreover, it can be shown that

Lemma 5.2.1. If $v \in H^1(\Omega)$, then $v^+, v^- \in H^1(\Omega)$.


In the sequel, we shall compare Lebesgue-measurable functions, which are defined up to sets of zero measure; for them, the pointwise value is meaningless. Therefore, an expression of the form $v \leq w$ in $\Omega$ should be understood as $v \leq w$ almost everywhere in $\Omega$.
Let us now consider a generic second-order elliptic operator, say
$$ Lu = -\nabla \cdot (A \nabla u) + \mathbf{a} \cdot \nabla u + a_0 u , $$
in a bounded domain $\Omega$, with coefficients such that the bilinear form
$$ a(u, v) = \int_\Omega A \nabla u \cdot \nabla v \, dx + \int_\Omega \mathbf{a} \cdot \nabla u \, v \, dx + \int_\Omega a_0 u v \, dx \qquad (5.2.1) $$
is continuous and coercive in $H^1_0(\Omega)$; furthermore, given $f \in L^2(\Omega)$ and $g \in H^{1/2}(\partial\Omega) \cap C^0(\partial\Omega)$, let us denote by $u$ the solution of the Dirichlet problem
$$ \begin{cases} Lu = f & \text{in } \Omega , \\ u = g & \text{on } \partial\Omega , \end{cases} $$
that is,
$$ \begin{cases} u \in H^1(\Omega) , & u_{|\partial\Omega} = g , \\ a(u, v) = (f, v) & \forall v \in H^1_0(\Omega) . \end{cases} $$
Finally, let us set
$$ m_g = \min_{\partial\Omega} g \qquad \text{and} \qquad M_g = \max_{\partial\Omega} g . \qquad (5.2.2) $$

Theorem 5.2.2 (G. Stampacchia). Under the previous assumptions, we have:
i) if $f - a_0 m_g \geq 0$ in $\Omega$, then $u \geq m_g$ in $\Omega$;
ii) if $f - a_0 M_g \leq 0$ in $\Omega$, then $u \leq M_g$ in $\Omega$.

Proof. We prove i), since the proof of ii) is similar. Let us write $u$ in the form $u = (u - m_g) + m_g$ and let us substitute it in the variational formulation:
$$ a(u - m_g, v) + a(m_g, v) = (f, v) ; $$
since $m_g$ is a constant, we simply have $a(m_g, v) = (a_0 m_g, v)$ and therefore
$$ a(u - m_g, v) = (f - a_0 m_g, v) . $$
Moreover, splitting $u - m_g$ into its positive and negative parts yields
$$ a\big( (u - m_g)^+, v \big) - a\big( (u - m_g)^-, v \big) = (f - a_0 m_g, v) . \qquad (5.2.3) $$
Now, from Lemma 5.2.1 applied to the function $u - m_g \in H^1(\Omega)$ we obtain $(u - m_g)^- \in H^1(\Omega)$, and furthermore it results that $(u - m_g)_{|\partial\Omega} = g - m_g \geq 0$ on $\partial\Omega$; hence, $(u - m_g)^-_{|\partial\Omega} = 0$; therefore
$$ (u - m_g)^- \in H^1_0(\Omega) , $$
so we can choose $v = (u - m_g)^-$ as a test function in equation (5.2.3):
$$ a\big( (u - m_g)^+, (u - m_g)^- \big) - a\big( (u - m_g)^-, (u - m_g)^- \big) = \big( f - a_0 m_g, (u - m_g)^- \big) . \qquad (5.2.4) $$


Note that in general one has, for any function $v \in H^1(\Omega)$,
$$ a(v^+, v^-) = 0 , $$
since the supports of $v^+$ and $v^-$ do not intersect in $\Omega$, except for possible sets of measure zero; consequently, equation (5.2.4) becomes
$$ -a\big( (u - m_g)^-, (u - m_g)^- \big) = \big( f - a_0 m_g, (u - m_g)^- \big) . $$
Using now the coercivity of $a$, we get
$$ \alpha \, \| (u - m_g)^- \|^2_{1,\Omega} \leq -\big( f - a_0 m_g, (u - m_g)^- \big) ; $$
moreover, since both $f - a_0 m_g$ and $(u - m_g)^-$ are assumed to be nonnegative, we have
$$ \big( f - a_0 m_g, (u - m_g)^- \big) \geq 0 , $$
which implies
$$ \| (u - m_g)^- \|^2_{1,\Omega} \leq 0 $$
and finally $(u - m_g)^- = 0$. This means that $u - m_g \geq 0$ in $\Omega$, i.e., $u \geq m_g$ in $\Omega$.


Remark 5.2.3. Instead of g C0 (), one could make the weaker assumption that g L ().
In that case, the theorem holds after replacing (5.2.2) by
mg = ess inf g

and

Mg = ess sup g.

Remark 5.2.4. By inspecting the proof, it is easily seen that the implications i) and ii) of the
theorem hold if mg indicates any number min g and Mg indicates any number max g.
In the examples below we shall show how Stampacchias Theorem can be applied to some
particular cases of elliptic problems.
Example 5.2.5. Let us consider the Dirichlet problem for the Laplace operator

u = f in ,
u = 0 on ;
we have a0 = mg = 0, hence if f 0 in we obtain u 0 in , which is the same result already
proven in the classical case (cf. Proposition 5.1.1).
Instead, if u = g 6= 0 on , then we get u min g in .
k
Example 5.2.6. The problem

u = 0 in ,
u = g on ,

is such that a0 = 0 and f 0; from Stampacchias Theorem it follows that

(i) f 0 u ming

min g u max g,

(ii) f 0 u maxg

again a result already proved in the classical case (cf. Proposition 5.1.4).


Example 5.2.7. Let us consider the Dirichlet problem for the Helmholtz operator
$$ \begin{cases} -\Delta u + u = f & \text{in } \Omega , \\ u = g & \text{on } \partial\Omega . \end{cases} $$
We have $a_0 = 1$. Assume at first that $f \geq 0$ in $\Omega$ and $m_g \leq 0$: we conclude that $u \geq \min_{\partial\Omega} g$ in $\Omega$. On the other hand, let again $f \geq 0$ in $\Omega$ but now assume $m_g > 0$; then, the assumptions of Stampacchia's Theorem may not be satisfied and we can indeed have $u < \min_{\partial\Omega} g$ in $\Omega$, as shown in the forthcoming discussion of singular perturbation problems. However, in this case we can apply Remark 5.2.4 with $m_g = 0$ and get at least $u \geq 0$ in $\Omega$.
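The phenomenon of Example 5.2.7 can be reproduced with a simple 1D computation. The sketch below is my own illustration (an arbitrary finite-difference discretization of $-u'' + u = f$ on $(0,1)$ with constant data).

```python
import numpy as np

# 1D illustration of Example 5.2.7: -u'' + u = f on (0,1), u(0) = u(1) = g,
# with f >= 0 and g > 0. Hypothesis i) with m_g = g fails (f - g < 0), and the
# solution drops below min g in the interior; still, u >= 0 as predicted by
# Remark 5.2.4 applied with m_g = 0.
n = 400
h = 1.0 / (n + 1)

g = 1.0                                  # boundary value
f = np.full(n, 0.1)                      # f >= 0 but f - a0 * m_g = 0.1 - 1 < 0

A = (np.diag(2 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2 + np.eye(n)
rhs = f.copy()
rhs[0] += g / h**2                       # lift the boundary values into the RHS
rhs[-1] += g / h**2
u = np.linalg.solve(A, rhs)

print(u.min())                           # ~0.9: strictly below min g = 1, yet nonnegative
```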

5.2.1  Singular Perturbation Problems

The hypothesis of Stampacchia Theorem involve neither second-order nor first-order coefficients
of the operator L. Therefore, these coefficients can be very small or large (provided the coercivity
of the form a is guaranteed) and the result still holds.
As an example, let us consider the so-called singular perturbation problem
$$-\varepsilon \Delta u + u = f \ \text{in } \Omega, \qquad u = g \ \text{on } \partial\Omega,$$
where $\varepsilon > 0$ is a constant. This is an extremely simplified example of a reaction-diffusion problem, in which the second-order part of the operator models a diffusion process (say, by mechanical molecular interactions), whereas the zeroth-order part models a reaction process (say, by the action of chemical agents). If the diffusion coefficient $\varepsilon$ is small, then the solution $u = u_\varepsilon$, far from the boundary, is close to the function $f$, as if the equation were simply $u = f$ without the term $-\varepsilon\Delta u$; $u_\varepsilon$ eventually adjusts itself to the boundary condition, i.e., the second-order term becomes influential, only in a small transition region attached to the boundary. This region, called the boundary layer, has width of order $\sqrt{\varepsilon}$ in the direction normal to the boundary.
To illustrate this behavior, let us consider the one-dimensional problem
$$-\varepsilon u'' + u = f \ \text{in } (0,1), \qquad u(0) = g_0, \quad u(1) = g_1,$$
with $f, g_0, g_1 \in \mathbb{R}$. In order to get the solution, let us observe that the function $w = u - f$ solves the problem
$$-\varepsilon w'' + w = 0 \ \text{in } (0,1), \qquad w(0) = g_0 - f, \quad w(1) = g_1 - f.$$
By linearity and the superposition principle, $w$ can be expressed as $w = (g_0 - f)u_0 + (g_1 - f)u_1$, where $u_0$ is the solution of the problem
$$-\varepsilon u_0'' + u_0 = 0 \ \text{in } (0,1), \qquad u_0(0) = 1, \quad u_0(1) = 0,$$
whereas $u_1$ is the solution of the problem
$$-\varepsilon u_1'' + u_1 = 0 \ \text{in } (0,1), \qquad u_1(0) = 0, \quad u_1(1) = 1.$$
In conclusion, we have
$$u(x) = f + (g_0 - f)u_0(x) + (g_1 - f)u_1(x).$$
An elementary computation yields
$$u_0(x) = \frac{e^{(1-x)/\sqrt{\varepsilon}} - e^{-(1-x)/\sqrt{\varepsilon}}}{e^{1/\sqrt{\varepsilon}} - e^{-1/\sqrt{\varepsilon}}} \qquad \text{and} \qquad u_1(x) = \frac{e^{x/\sqrt{\varepsilon}} - e^{-x/\sqrt{\varepsilon}}}{e^{1/\sqrt{\varepsilon}} - e^{-1/\sqrt{\varepsilon}}},$$
from which it is easily seen that $u_0$ is very close to $0$ for $c\sqrt{\varepsilon} < x < 1$, whereas $u_1$ is very close to $0$ for $0 < x < 1 - c\sqrt{\varepsilon}$, where $c > 1$ is any large enough constant independent of $\varepsilon$.
On the whole, the solution $u$ approximately equals $f$ in $(c\sqrt{\varepsilon}, 1 - c\sqrt{\varepsilon})$, and exhibits two boundary layers in $(0, c\sqrt{\varepsilon})$ and $(1 - c\sqrt{\varepsilon}, 1)$ (see Fig. ....).
Going back to the maximum principle discussed at the end of the previous section, we note that if $0 < f < \min(g_0, g_1)$, then we will also have $0 < u(x) < \min(g_0, g_1)$ at some point $x \in (0,1)$, provided $\varepsilon$ is small enough.
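As a quick numerical cross-check of the closed-form solution above, the following Python sketch (not part of the original notes; the values of $\varepsilon$, $f$, $g_0$, $g_1$ are arbitrary illustrative choices) evaluates $u_\varepsilon$ on a coarse grid and shows that, away from the two endpoints, the solution is essentially equal to $f$.

```python
import numpy as np

def u_eps(x, eps, f, g0, g1):
    """Exact solution of -eps*u'' + u = f on (0,1), u(0)=g0, u(1)=g1."""
    s = np.sqrt(eps)
    u0 = np.sinh((1.0 - x) / s) / np.sinh(1.0 / s)   # carries the layer at x = 0
    u1 = np.sinh(x / s) / np.sinh(1.0 / s)           # carries the layer at x = 1
    return f + (g0 - f) * u0 + (g1 - f) * u1

x = np.linspace(0.0, 1.0, 11)
for eps in (1e-2, 1e-4):
    print(f"eps = {eps:g}:", np.round(u_eps(x, eps, f=1.0, g0=3.0, g1=2.0), 3))
# For small eps the printed values are ~3 and ~2 only at the endpoints and
# ~1 (= f) everywhere else: two boundary layers of width O(sqrt(eps)).
```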

Another interesting mathematical model which leads to a singular perturbation problem is provided by the advection-diffusion problem
$$-\varepsilon \Delta u + \mathbf{a} \cdot \nabla u = f \ \text{in } \Omega, \qquad u = g \ \text{on } \partial\Omega,$$
where $\varepsilon > 0$ is again a constant and $\mathbf{a}$ is a smooth vector field such that $\|\mathbf{a}\| \simeq 1$ in $\Omega$. The term $\mathbf{a} \cdot \nabla u$ models the transport of the scalar quantity $u$ (which may represent the temperature of a fluid, or the concentration of a pollutant in a fluid) along the streamlines of the field $\mathbf{a}$.
If $\varepsilon$ is small, so that the term $-\varepsilon\Delta u$ may be neglected, the equation in $\Omega$ reduces to a particular case of the linear first order equation (1.3.1) considered in Sect. 1.3. We have seen that this equation may be solved with the boundary condition assigned on the inflow boundary $\partial\Omega^-$ defined as in (1.3.4). The solution $u = u_\varepsilon$, far from the portion of the boundary $\partial\Omega^0 \cup \partial\Omega^+$, is close to the solution $\bar{u}$ of the reduced first-order problem
$$\mathbf{a} \cdot \nabla \bar{u} = f \ \text{in } \Omega, \qquad \bar{u} = g \ \text{on } \partial\Omega^-.$$
Approaching $\partial\Omega^0 \cup \partial\Omega^+$, $u_\varepsilon$ eventually adjusts itself to the boundary condition prescribed therein. The transition occurs in boundary layers of different size: the width in the direction normal to the boundary is of order $\sqrt{\varepsilon}$ for the boundary layer attached to the characteristic boundary $\partial\Omega^0$, whereas it is of order $\varepsilon$ (thus, much smaller) for the boundary layer attached to the outflow boundary $\partial\Omega^+$.
Again we use a one-dimensional problem to illustrate the structure of the solution. Let us consider
$$-\varepsilon u'' + u' = f \ \text{in } (0,1), \qquad u(0) = g_0, \quad u(1) = g_1,$$
with $f, g_0, g_1 \in \mathbb{R}$. At first, we solve the reduced model
$$\bar{u}' = f \ \text{in } (0,1), \qquad \bar{u}(0) = g_0$$
(note that $x = 0$ is the inflow boundary point), getting $\bar{u}(x) = g_0 + f x$. Next we make the change of variable $w = u - \bar{u}$ and we observe that $w$ is the solution of
$$-\varepsilon w'' + w' = 0 \ \text{in } (0,1), \qquad w(0) = 0, \quad w(1) = g_1 - g_0 - f.$$
By linearity we get
$$u(x) = g_0 + f x + (g_1 - g_0 - f)u_1(x),$$
where now $u_1$ is the solution of
$$-\varepsilon u_1'' + u_1' = 0 \ \text{in } (0,1), \qquad u_1(0) = 0, \quad u_1(1) = 1.$$
An easy calculation yields
$$u_1(x) = \frac{e^{x/\varepsilon} - 1}{e^{1/\varepsilon} - 1} = \frac{e^{(x-1)/\varepsilon} - e^{-1/\varepsilon}}{1 - e^{-1/\varepsilon}},$$
which shows that $u_1$ is very close to $0$ for $0 < x < 1 - c\varepsilon$, for any $c > 1$ large enough.
On the whole, the solution $u$ approximately equals the linear function $g_0 + f x$ in $(0, 1 - c\varepsilon)$ and exhibits an outflow boundary layer in $(1 - c\varepsilon, 1)$ (see Fig. ....).
Concerning the maximum principle, we have $a_0 = 0$, hence Stampacchia's Theorem yields $u \geq \min(g_0, g_1)$ in $\Omega$ if $f \geq 0$, and $u \leq \max(g_0, g_1)$ in $\Omega$ if $f \leq 0$. Note that we have $\min(g_0, g_1) \leq u \leq \max(g_0, g_1)$ in $\Omega$ if $f = 0$.
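A similar sketch (again an illustration with arbitrarily chosen data, not part of the notes) evaluates the exact solution of the one-dimensional advection-diffusion problem and makes the $O(\varepsilon)$ outflow layer visible.

```python
import numpy as np

def u_adv(x, eps, f, g0, g1):
    """Exact solution of -eps*u'' + u' = f on (0,1), u(0)=g0, u(1)=g1."""
    # layer profile written in the overflow-free form
    # u1(x) = (exp((x-1)/eps) - exp(-1/eps)) / (1 - exp(-1/eps))
    u1 = (np.exp((x - 1.0) / eps) - np.exp(-1.0 / eps)) / (1.0 - np.exp(-1.0 / eps))
    return g0 + f * x + (g1 - g0 - f) * u1

x = np.linspace(0.0, 1.0, 11)
for eps in (1e-1, 1e-3):
    print(f"eps = {eps:g}:", np.round(u_adv(x, eps, f=1.0, g0=0.0, g1=5.0), 3))
# For small eps the values follow the reduced solution g0 + f*x except in the
# last subinterval, where an outflow layer of width O(eps) adjusts u to g1.
```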

Chapter 6

Spectral Theory for Elliptic Self-adjoint Problems

6.1 Introduction

In this chapter, we introduce the concept of eigenvalue and eigenfunction of a uniformly elliptic, self-adjoint operator with appropriate boundary conditions. They can be considered as the generalization to an infinite dimensional Hilbert space $H$ of the concept of eigenvalue and eigenvector of a symmetric positive-definite matrix in a Euclidean space $\mathbb{R}^n$. The eigenfunctions of the operator form an orthonormal basis in $H$, with respect to which the operator is diagonalized.
A physical motivation for introducing eigenvalues and eigenfunctions of an elliptic operator is as follows. Consider the wave equation
$$\frac{\partial^2 u}{\partial t^2} - \Delta u = f$$
in a bounded domain $\Omega \subset \mathbb{R}^n$, subject to homogeneous Dirichlet boundary conditions
$$u = 0 \ \text{on } \partial\Omega.$$
In one dimension ($n = 1$), $u$ can be interpreted as the (small) displacement of a guitar string in the direction perpendicular to the plane of the guitar, under the effect of a force density; the boundary condition forces the string to remain attached to the guitar. The two-dimensional analog is the displacement of a drum membrane (or drum skin) in the direction perpendicular to the drum surface. (For simplicity, we have set to 1 all physical constants.)
An important characteristic of the musical instrument is represented by the set of the modes of free vibration of the guitar string or the drum skin. They correspond to those solutions of the wave equation with no forcing term ($f = 0$) which are periodic in time. More precisely, they can be defined as the solutions of the form
$$u(x, t) = e^{i\omega t} w(x),$$
where $\omega \in \mathbb{R}$ is the frequency of pulsation, whereas $w$ determines the spatial shape of the pulsation. Differentiating and substituting into the wave equation, we get
$$e^{i\omega t}\big(-\omega^2 w(x) - \Delta w(x)\big) = 0,$$
and since the exponential never vanishes, this is equivalent to
$$-\Delta w = \omega^2 w \quad \text{in } \Omega;$$
on the other hand, the boundary condition for $u$ is equivalent to the boundary condition for $w$,
$$w = 0 \quad \text{on } \partial\Omega.$$
Setting $\lambda = \omega^2$, we are led to consider the following spectral problem.

Problem 6.1.1. Find all real numbers $\lambda$ and all non-zero functions $w$ defined in $\Omega$ such that
$$-\Delta w = \lambda w \ \text{in } \Omega, \qquad w = 0 \ \text{on } \partial\Omega. \qquad (6.1.1)$$

Using the functional machinery developed in Chap. 4, this problem can be equivalently written in variational form as follows.

Problem 6.1.2. Find all $\lambda \in \mathbb{R}$ and $w \in H^1_0(\Omega)$, $w \neq 0$, such that
$$\int_\Omega \nabla w \cdot \nabla v = \lambda \int_\Omega w v \qquad \forall v \in H^1_0(\Omega). \qquad (6.1.2)$$

The situation described so far is just an example, and it can be generalized in various ways,
e.g. by considering variable-coefficient elliptic operators or other types of boundary conditions. In
the next section, we will provide an abstract framework to study spectral problems formulated in
a variational form.

6.2 The abstract variational eigenvalue problem

Consider a Gelfand triple $(V, H, V')$ of separable Hilbert spaces (see (3.9.1)), and assume in addition that the inclusion $V \subset H$ is compact, i.e., from every sequence $\{v_n\}_{n\geq 1}$ which is bounded in $V$, one can extract a subsequence $\{v_{n_j}\}_{j\geq 1}$ which is convergent in $H$.
Denote by $(u, v)$ the inner product in $H$, and let $a : V \times V \to \mathbb{R}$ be a bilinear form which is continuous and coercive in $V$, with coercivity constant $\alpha > 0$ (i.e., it satisfies the assumptions of the Lax-Milgram theorem); assume in addition that the form $a$ is symmetric.
Let us consider the following eigenvalue problem.

Problem 6.2.1. Find all $\lambda \in \mathbb{R}$ and $w \in V$, $w \neq 0$, such that
$$a(w, v) = \lambda (w, v) \qquad \forall v \in V. \qquad (6.2.1)$$

The solutions of this problem are referred to as the eigenvalues $\lambda$ and the eigenfunctions $w$ of the bilinear form $a$. Note that any eigenfunction is defined up to a multiplicative constant, i.e., if $w$ is an eigenfunction, then $\gamma w$ is also an eigenfunction, for any real $\gamma \neq 0$. Furthermore, given an eigenfunction $w$, the corresponding eigenvalue can be expressed as the Rayleigh quotient
$$\lambda = \frac{a(w, w)}{(w, w)}.$$
Problem 6.2.1 can be solved by resorting to the spectral theory for a compact, self-adjoint and positive operator in a Hilbert space. Let us recall the main result of this theory.

Theorem 6.2.2. Let $X$ be a separable Hilbert space, with inner product $(x, y)$. Let $T : X \to X$ be a linear operator, satisfying the following assumptions:
a) $T$ is compact, i.e., from every bounded sequence $\{x_n\}_{n\geq 1}$, one can extract a subsequence $\{x_{n_j}\}_{j\geq 1}$ such that $\{T x_{n_j}\}_{j\geq 1}$ is convergent;
b) $T$ is self-adjoint, i.e., $(T x, y) = (T y, x)$ for all $x, y \in X$;
c) $T$ is positive, i.e., $(T x, x) > 0$ for all non-zero $x \in X$.
Under these assumptions, there exist a sequence $\{\mu_k\}_{k\geq 1}$ of strictly positive real numbers and a sequence $\{x_k\}_{k\geq 1}$ of elements of $X$, which satisfy the relations
$$T x_k = \mu_k x_k, \qquad k = 1, 2, \dots \qquad (6.2.2)$$
(each $\mu_k$ is termed an eigenvalue of $T$, whereas $x_k$ is the corresponding eigen-element, or eigenfunction). Eigenvalues and eigen-elements enjoy the following properties:
i) The sequence of eigenvalues is non-increasing and converging to 0, i.e.,
$$0 < \mu_{k+1} \leq \mu_k \leq \dots \leq \mu_2 \leq \mu_1, \qquad \lim_{k\to\infty} \mu_k = 0. \qquad (6.2.3)$$
ii) The sequence of eigen-elements forms an orthonormal system in $X$, i.e.,
$$(x_k, x_\ell) = \delta_{k\ell}, \qquad k, \ell \geq 1. \qquad (6.2.4)$$
iii) This system is indeed an orthonormal basis in $X$, i.e., every $x \in X$ can be uniquely represented as
$$x = \sum_{k=1}^{\infty} \hat{x}_k x_k \qquad \text{with } \hat{x}_k = (x, x_k); \qquad (6.2.5)$$
the coefficients $\hat{x}_k$ are termed the (generalized) Fourier coefficients of $x$ with respect to the basis of eigen-elements of $T$. The convergence of the series has to be meant in the norm of $X$, i.e.,
$$\|x - x_N\| \to 0 \ \text{as } N \to \infty, \qquad \text{where } x_N = \sum_{k=1}^{N} \hat{x}_k x_k.$$
iv) The following representation of the norm in $X$, termed Parseval identity, holds:
$$\|x\|^2 = \sum_{k=1}^{\infty} |\hat{x}_k|^2 \qquad \forall x \in X. \qquad (6.2.6)$$

Remark 6.2.3. If $\mu$ is any eigenvalue of $T$, then the set
$$V(\mu) = \{z \in X : T z = \mu z\}$$
is a vector space, termed the eigenspace of $\mu$. As a consequence of (6.2.3), $V(\mu)$ has finite dimension, say $m \geq 1$, if we have for some $k$
$$\mu_{k-1} > \mu = \mu_k = \mu_{k+1} = \dots = \mu_{k+m-1} > \mu_{k+m};$$
precisely, $V(\mu) = \mathrm{span}\,\{x_\ell : k \leq \ell \leq k + m - 1\}$. We say that $\mu$ has finite multiplicity, equal to $m$.

Remark 6.2.4. Recall that for a linear operator, the property of compactness is stronger than the property of continuity, i.e., if $T$ is compact, then necessarily it is continuous. Indeed, arguing by contradiction, if there were no constant $C > 0$ such that
$$\|T x\| \leq C \|x\| \qquad \forall x \in X,$$
then we could find for any $n \geq 1$ an element $x_n$ satisfying
$$\|T x_n\| > n \|x_n\|.$$
The sequence $y_n = x_n / \|x_n\|$ would satisfy
$$\|y_n\| = 1 \qquad \text{and} \qquad \|T y_n\| > n.$$
Clearly, no converging subsequence can be extracted from $\{T y_n\}$, a contradiction with the property of compactness.
We are ready to introduce the operator $T : H \to H$ associated with our Problem 6.2.1.

Definition 6.2.5. For any $f \in H$, we set
$$T f = u \in V \subset H \quad \text{solution of} \quad a(u, v) = (f, v) \qquad \forall v \in V. \qquad (6.2.7)$$

Proposition 6.2.6. The operator $T$ defined above satisfies assumptions a)-c) of Theorem 6.2.2.

Proof. Let us check compactness. Let $C > 0$ denote the continuity constant of the inclusion $V \subset H$, i.e., $\|v\|_H \leq C\|v\|_V$ for all $v \in V$. Furthermore, let $F \in V'$ be the form associated with $f$, i.e., the form defined by $F(v) = (f, v)$ for all $v \in V$. Then, recalling (4.2.10) and the definition of dual norm (see Sect. 3.9), one has
$$\frac{1}{C}\|u\|_H \leq \|u\|_V \leq \frac{1}{\alpha}\|F\|_{V'} \leq \frac{C}{\alpha}\|f\|_H.$$
This implies both $\|T f\|_H \leq \frac{C^2}{\alpha}\|f\|_H$ (i.e., the continuity of $T$, which we have seen above to be a necessary condition for compactness), and
$$\|T f\|_V \leq \frac{C}{\alpha}\|f\|_H.$$
Thus, if $\{f_n\}$ is a bounded sequence in $H$, then $\{T f_n\}$ is a bounded sequence in $V$ and, since the inclusion $V \subset H$ is compact by assumption, we can extract a subsequence $\{T f_{n_j}\}$ converging in $H$. This proves that $T$ is compact.
In order to prove assumption b), consider any $f$ and $g$ in $H$, and set $u = T f$, $w = T g$. Using the symmetry of the inner product in $H$ and the definition of $T$, we have
$$(T f, g) = (g, T f) = (g, u) = a(w, u) \qquad \text{and} \qquad (T g, f) = (f, T g) = (f, w) = a(u, w);$$
the result follows from the symmetry of the bilinear form $a$.
Finally, the positivity of $T$ immediately follows from the coercivity of $a$. Indeed, for any $f \in H$, one has
$$(T f, f) = (f, T f) = (f, u) = a(u, u) \geq \alpha \|u\|_V^2 \geq 0,$$
and $(T f, f) = 0$ implies $u = 0$, which in turn implies $f = 0$ by (6.2.7) and the density of $V$ in $H$.
The result just proven ensures us that statements i)-iv) of Theorem 6.2.2 apply to our operator $T$. Let us rephrase them in the language of the variational setting.
We will denote by $w_k$, $k = 1, 2, \dots$, the eigenfunctions of $T$. Then, each $T w_k$ satisfies
$$a(T w_k, v) = (w_k, v) \qquad \forall v \in V;$$
on the other hand, condition (6.2.2) defining a couple eigenvalue-eigenfunction, i.e., $T w_k = \mu_k w_k$, yields
$$\mu_k \, a(w_k, v) = (w_k, v) \qquad \forall v \in V.$$
Setting
$$\lambda_k = \frac{1}{\mu_k},$$
we obtain
$$a(w_k, v) = \lambda_k (w_k, v) \qquad \forall v \in V. \qquad (6.2.8)$$
We arrive at the conclusion that $T w_k = \mu_k w_k$ holds if and only if $\lambda_k \in \mathbb{R}^+$ and $w_k \in V$ are a solution of the variational eigenvalue problem (6.2.1). We collect the results obtained so far in the following theorem.

Theorem 6.2.7. Under the assumptions on the Gelfand triple $(V, H, V')$ and the bilinear form $a(u, v)$ stated at the beginning of this section, the variational eigenvalue Problem 6.2.1 admits a sequence of real strictly positive eigenvalues $\lambda_k$ and corresponding eigenfunctions $w_k \in V$, $k = 1, 2, \dots$, with the following properties:
i) The sequence of eigenvalues is non-decreasing and unbounded from above, i.e.,
$$0 < \lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_k \leq \lambda_{k+1} \leq \dots, \qquad \lim_{k\to\infty} \lambda_k = +\infty. \qquad (6.2.9)$$
ii) The sequence of eigenfunctions forms an orthonormal system in $H$, i.e.,
$$(w_k, w_\ell) = \delta_{k\ell}, \qquad k, \ell \geq 1. \qquad (6.2.10)$$
iii) This system is indeed an orthonormal basis in $H$, i.e., every $v \in H$ can be uniquely represented as
$$v = \sum_{k=1}^{\infty} \hat{v}_k w_k \qquad \text{with } \hat{v}_k = (v, w_k). \qquad (6.2.11)$$
iv) The Parseval identity holds:
$$\|v\|_H^2 = \sum_{k=1}^{\infty} |\hat{v}_k|^2 \qquad \forall v \in H. \qquad (6.2.12)$$

Corollary 6.2.8. The previous conclusions apply to the eigenvalue Problem 6.1.2 for the operator $-\Delta$ with Dirichlet boundary conditions in a bounded domain $\Omega$.

Proof. We know that $(H^1_0(\Omega), L^2(\Omega), H^{-1}(\Omega))$ is a Gelfand triple (see Sect. 3.9), and that the inclusion $H^1_0(\Omega) \subset L^2(\Omega)$ is compact if $\Omega$ is bounded (by Rellich's Theorem 3.8.1).

We close this section by showing that the system of eigenfunctions $\{w_k\}_{k\geq 1}$ provides a basis not only in $H$ but also in other functional spaces related to our problem, such as $V$ and $V'$; in addition, the norm of an element in such a space can be expressed in terms of a suitably weighted norm of the sequence of its (generalized) Fourier coefficients. Let us begin with the space $V$.
Setting $v = w_\ell$ in (6.2.8) and using (6.2.10), we obtain
$$a(w_k, w_\ell) = \lambda_k \delta_{k\ell}, \qquad k, \ell \geq 1. \qquad (6.2.13)$$
This means that the eigenfunctions form an orthogonal system in $V$, for the inner product $(u, v)_a = a(u, v)$ associated with the bilinear form $a$ (which induces a norm, $\|v\|_a$, uniformly equivalent to the norm $\|v\|_V$, see Remark 4.2.7). The system is indeed a basis for $V$, as stated by the following result.
Proposition 6.2.9. For any $v \in H$, one has
$$v \in V \qquad \text{if and only if} \qquad \sum_{k=1}^{\infty} \lambda_k |\hat{v}_k|^2 < +\infty.$$
In this case, the series (6.2.11) converges also in $V$ and the following Parseval identity holds true:
$$a(v, v) = \|v\|_a^2 = \sum_{k=1}^{\infty} \lambda_k |\hat{v}_k|^2. \qquad (6.2.14)$$

Proof. For any fixed $N \geq 1$, let $V_N = \mathrm{span}\,\{w_k : 1 \leq k \leq N\}$ be the subspace of $V$ spanned by the first $N$ eigenfunctions.
Assume that $v \in V$. Define $v_N \in V_N$ as the truncation to $N$ terms of the series (6.2.11), i.e., $v_N = \sum_{k=1}^{N} \hat{v}_k w_k$. It is easily seen, thanks to (6.2.13), that $v_N$ is the orthogonal projection of $v$ upon $V_N$ in the inner product $(u, v)_a$, i.e., it satisfies
$$(v - v_N, z)_a = 0 \qquad \forall z \in V_N.$$
Writing $v = (v - v_N) + v_N$, we have $(v - v_N, v_N)_a = 0$ by this relation with $z = v_N$, so that Pythagoras' theorem in the norm $\|\cdot\|_a$ applies, yielding
$$\|v\|_a^2 = \|v - v_N\|_a^2 + \|v_N\|_a^2,$$
which implies
$$\sum_{k=1}^{N} \lambda_k |\hat{v}_k|^2 = \|v_N\|_a^2 \leq \|v\|_a^2 < +\infty \qquad \forall N \geq 1.$$
Thus, the series in the statement of the proposition is convergent, and the other properties follow immediately from this result.
Conversely, assume that $\sum_{k=1}^{\infty} \lambda_k |\hat{v}_k|^2 < +\infty$. Then, setting $v_N = \sum_{k=1}^{N} \hat{v}_k w_k$ for all $N \geq 1$, we have that the sequence $\{v_N\}$ is a Cauchy sequence in $V$, since
$$\|v_N - v_M\|_a^2 = \sum_{k=M+1}^{N} \lambda_k |\hat{v}_k|^2 \to 0 \qquad \text{if } N > M \to \infty.$$
Thus, $\{v_N\}$ converges in $V$ to some element $\tilde{v} \in V$; however, since it also converges in $H$ to $v$ by the previous theorem, necessarily $v = \tilde{v}$ by the uniqueness of the limit. We conclude that $v \in V$.
Proposition 6.2.9 provides an example of the fundamental property that, for an element $v \in H$, the condition of being more regular (in a suitable sense) is equivalent to a faster decay of its Fourier coefficients $\hat{v}_k$ as $k \to \infty$, since the sequence $\{\lambda_k\}$ is diverging to $+\infty$ (recall (6.2.9)). Note in particular that from $v$ belonging to $H$ we can only infer that
$$\hat{v}_k = o(1) \quad \text{(i.e., infinitesimal)} \quad \text{as } k \to \infty,$$
since the series (6.2.12) is convergent; on the other hand, from $v$ belonging to $V$ we infer that
$$\hat{v}_k = o\!\left(\frac{1}{\sqrt{\lambda_k}}\right) \quad \text{as } k \to \infty.$$
Another example of this property is as follows. Recall the definition (4.2.6) of the operator $A : V \to V'$ associated with the bilinear form $a$. Note that the relations (6.2.8) can be equivalently written as
$$A w_k = \lambda_k w_k, \qquad k = 1, 2, \dots$$
Let us introduce the domain of the operator $A$ as the subspace of $V$
$$D(A) = \{v \in V : A v \in H\}.$$
One can prove (see the remark below) that if $v = \sum_{k=1}^{\infty} \hat{v}_k w_k$ belongs to $D(A)$, then
$$A v = \sum_{k=1}^{\infty} \hat{v}_k A w_k = \sum_{k=1}^{\infty} \lambda_k \hat{v}_k w_k, \qquad (6.2.15)$$
so that
$$\|A v\|_H^2 = \sum_{k=1}^{\infty} \lambda_k^2 |\hat{v}_k|^2 < +\infty.$$
In this case,
$$\hat{v}_k = o\!\left(\frac{1}{\lambda_k}\right) \quad \text{as } k \to \infty.$$
In the model situation of the Laplacian with Dirichlet boundary conditions, we have $H = L^2(\Omega)$, $V = H^1_0(\Omega)$ and $D(-\Delta) = H^1_0(\Omega) \cap H^2(\Omega)$, provided $\Omega$ has a $C^1$ boundary (Proposition 4.5.3) or is convex (Property 4.5.6). In one of these cases, the following equivalences hold:
$$v \in L^2(\Omega) \quad \text{if and only if} \quad \sum_{k=1}^{\infty} |\hat{v}_k|^2 < +\infty, \qquad (6.2.16)$$
$$v \in H^1_0(\Omega) \quad \text{if and only if} \quad \sum_{k=1}^{\infty} \lambda_k |\hat{v}_k|^2 < +\infty, \qquad (6.2.17)$$
$$v \in H^1_0(\Omega) \cap H^2(\Omega) \quad \text{if and only if} \quad \sum_{k=1}^{\infty} \lambda_k^2 |\hat{v}_k|^2 < +\infty. \qquad (6.2.18)$$
One can continue this sequence indefinitely, by considering the domains of the successive powers $(-\Delta)^m$ of the Laplacian.
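To make the link between regularity and coefficient decay concrete, here is a small illustrative sketch (not part of the notes; it uses the Dirichlet sine eigenfunctions on an interval, computed explicitly in Sect. 6.3.1 below): a smooth function in $H^1_0 \cap H^2$ has coefficients decaying like $k^{-3}$, while a function with a kink, which is in $H^1_0$ but not in $H^2$, only decays like $k^{-2}$.

```python
import numpy as np

L, N = 1.0, 20000
x = np.linspace(0.0, L, N + 1)
dx = L / N
w = lambda k: np.sqrt(2.0 / L) * np.sin(k * np.pi * x / L)
coef = lambda v, k: np.sum(v * w(k)) * dx          # crude quadrature of (v, w_k)

smooth = 0.5 * x * (L - x)                          # in H^1_0 and H^2
kink = np.minimum(x, L - x)                         # in H^1_0 but not in H^2
for k in (1, 3, 9, 27):
    print(k, round(coef(smooth, k), 7), round(coef(kink, k), 7))
# the coefficients of 'smooth' decay like k^-3, those of 'kink' only like k^-2,
# in agreement with the equivalences (6.2.16)-(6.2.18)
```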

Remark 6.2.10. We can also expand in eigenfunction series any element $F$ of the dual space $V'$. Indeed, we define its (generalized) Fourier coefficients by setting
$$\hat{F}_k = F(w_k), \qquad k = 1, 2, \dots;$$
then, for any $v = \sum_{k=1}^{\infty} \hat{v}_k w_k \in V$, one has (at least formally)
$$F(v) = \sum_{k=1}^{\infty} \hat{v}_k F(w_k) = \sum_{k=1}^{\infty} \hat{F}_k \hat{v}_k. \qquad (6.2.19)$$
Since $F(v)$ is finite for all $v \in V$, one can prove using Proposition 6.2.9 that necessarily
$$\sum_{k=1}^{\infty} \frac{1}{\lambda_k} |\hat{F}_k|^2 < +\infty,$$
and actually the square root of this expression is the dual norm of $F$, if $V$ is equipped with the norm $\|v\|_a$. In other words, one can prove that
$$F = \sum_{k=1}^{\infty} \hat{F}_k w_k \in V' \qquad \text{if and only if} \qquad \sum_{k=1}^{\infty} \frac{1}{\lambda_k} |\hat{F}_k|^2 < +\infty;$$
in this case, the series of $F$ is convergent in $V'$, and the action of $F$ on any $v \in V$ can be expressed as in (6.2.19).

6.3 Classical examples. Separation of variables

We have seen (Corollary 6.2.8) that in a bounded domain $\Omega$ Problem 6.1.2 admits a sequence $\{\lambda_k\}_{k\geq 1}$ of strictly positive eigenvalues, diverging to $+\infty$ as $k \to \infty$, with corresponding eigenfunctions $\{w_k\}_{k\geq 1}$, which form an orthonormal basis in $L^2(\Omega)$.
The following result is of some interest, since it relates the smallest eigenvalue $\lambda_1$ to the Poincaré constant $C_P(\Omega)$ of the domain (defined in Proposition 3.7.2).

Property 6.3.1. The Poincaré constant of the domain $\Omega$ is given by
$$C_P(\Omega) = \frac{1}{\sqrt{\lambda_1}}.$$

Proof. We recall that the Poincaré constant of the domain is the smallest constant $C_P$ for which (3.7.1) holds; this condition can be written as
$$(v, v) \leq C_P^2\, a(v, v) \qquad \forall v \in H^1_0(\Omega),$$
with $(u, v) = \int_\Omega u v$ and $a(u, v) = \int_\Omega \nabla u \cdot \nabla v$. Recalling (6.2.12) and (6.2.14), we obtain
$$\sum_{k=1}^{\infty} |\hat{v}_k|^2 \leq C_P^2 \sum_{k=1}^{\infty} \lambda_k |\hat{v}_k|^2;$$
pulling $\lambda_1$ outside the series, we get
$$\sum_{k=1}^{\infty} |\hat{v}_k|^2 \leq C_P^2\, \lambda_1 \sum_{k=1}^{\infty} \frac{\lambda_k}{\lambda_1} |\hat{v}_k|^2.$$
This inequality is true if $C_P^2 \lambda_1 = 1$, since $\lambda_k / \lambda_1 \geq 1$ for any $k$ by (6.2.9). On the other hand, with such a choice the inequality becomes an equality for $v = w_1$, meaning precisely that
$$C_P = \frac{1}{\sqrt{\lambda_1}}$$
is the smallest admissible constant in (3.7.1).
Next, we explicitly compute the eigenvalues and eigenfunctions of Problem 6.1.2 in some relevant cases.

6.3.1 The free vibrations of a string

We consider the one-dimensional version of Problem 6.1.1, i.e., we look for all the solutions of the equations
$$-w'' = \lambda w \ \text{in } (0, L), \qquad w(0) = w(L) = 0, \qquad (6.3.1)$$
where $L > 0$ is fixed. The equation in $(0, L)$ is a second-order, linear, constant-coefficient ordinary differential equation, which is well known to admit the general integral
$$w(x) = A e^{i\sqrt{\lambda}\,x} + B e^{-i\sqrt{\lambda}\,x}$$
or, equivalently but avoiding the complex variable,
$$w(x) = a \sin(\sqrt{\lambda}\,x) + b \cos(\sqrt{\lambda}\,x), \qquad a, b \in \mathbb{R} \ \text{arbitrary}.$$
Condition $w(0) = 0$ forces $b = 0$, whereas condition $w(L) = 0$ yields $\sqrt{\lambda}\,L = k\pi$ for some integer $k > 0$ (since the left-hand side is strictly positive). Thus, we have found the eigenvalues of our problem:
$$\lambda_k = k^2 \frac{\pi^2}{L^2}, \qquad k = 1, 2, \dots, \qquad (6.3.2)$$
with corresponding eigenfunctions
$$w_k(x) = a_k \sin\!\left(k\frac{\pi}{L} x\right).$$
The parameter $a_k$ is determined by the normalization condition
$$\int_0^L w_k^2(x)\,dx = a_k^2 \int_0^L \sin^2\!\left(k\frac{\pi}{L} x\right) dx = 1,$$
which easily gives $a_k = \sqrt{2/L}$. Thus, the eigenfunctions of our problem are
$$w_k(x) = \sqrt{\frac{2}{L}}\,\sin\!\left(k\frac{\pi}{L} x\right), \qquad k = 1, 2, \dots \qquad (6.3.3)$$
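A simple finite-difference experiment (an illustrative sketch, not part of the notes) confirms these values: the eigenvalues of the standard three-point discretization of $-d^2/dx^2$ with Dirichlet conditions are close to $k^2\pi^2/L^2$ for the lowest modes.

```python
import numpy as np

L, N = 1.0, 400                       # interval length and number of interior nodes
h = L / (N + 1)
# tridiagonal finite-difference matrix for -d^2/dx^2 with Dirichlet conditions
A = (np.diag(2.0 * np.ones(N)) - np.diag(np.ones(N - 1), 1)
     - np.diag(np.ones(N - 1), -1)) / h**2
num = np.sort(np.linalg.eigvalsh(A))[:5]
exact = np.array([(k * np.pi / L)**2 for k in range(1, 6)])
print(np.round(num, 3))    # numerical eigenvalues
print(np.round(exact, 3))  # k^2 pi^2 / L^2, k = 1..5: the two lists agree closely
```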

Following the same procedure illustrated above, one can easily compute (see Exercise 6.1) eigenvalues and eigenfunctions of other boundary-value problems for the second derivative operator, such as the mixed Dirichlet/Neumann boundary-value problems
$$-w'' = \lambda w \ \text{in } (0, L), \quad w(0) = 0, \quad w'(L) = 0, \qquad \text{or} \qquad -w'' = \lambda w \ \text{in } (0, L), \quad w'(0) = 0, \quad w(L) = 0, \qquad (6.3.4)$$
or the pure Neumann boundary-value problem
$$-w'' = \lambda w \ \text{in } (0, L), \qquad w'(0) = w'(L) = 0. \qquad (6.3.5)$$

Remark 6.3.2. Note that one cannot directly apply Theorem 6.2.7 to the variational formulation of the pure Neumann problem (6.3.5), i.e.,
$$\int_0^L w' v'\,dx = \lambda \int_0^L w v\,dx \qquad \forall v \in V = H^1(0, L), \qquad (6.3.6)$$
since the bilinear form $a(w, v) = \int_0^L w' v'\,dx$ is not coercive in $V$: one has $a(w, w) = 0$ for all constant functions $w$. In other words, the problem admits the eigenvalue $\lambda = 0$ (with eigenfunction $w = \mathrm{const}$), which is excluded by Theorem 6.2.7.
In order to apply the theorem, one uses the trick of adding the $L^2$-inner product on both sides of (6.3.6), i.e., one considers the modified problem
$$\int_0^L w' v'\,dx + \int_0^L w v\,dx = \mu \int_0^L w v\,dx \qquad \forall v \in V = H^1(0, L),$$
with $\mu = \lambda + 1$. Now the bilinear form on the left-hand side is precisely the inner product in $V$, so that this problem fulfils the assumptions of Theorem 6.2.7. The eigenfunctions $w_k$ of the two problems are the same, whereas the eigenvalues of (6.3.6) are given by $\lambda_k = \mu_k - 1$.
The same trick of shifting the eigenvalues applies whenever one has to solve a general eigenvalue Problem 6.2.1 in which the bilinear form $a(w, v)$ is not coercive in $V$, but is such that the shifted form $a(w, v) + \sigma(w, v)$ is indeed coercive for $\sigma > 0$ large enough.

6.3.2 The free vibrations of a square membrane

Next, we solve Problem 6.1.1 in the square $\Omega = (0, L)^2$. The nature of the domain (a cartesian product of intervals) and the form of the equation (a constant-coefficient operator) suggest to adopt the separation of variables approach, which consists of looking for a solution in the form
$$w(x, y) = \varphi(x)\psi(y), \qquad (6.3.7)$$
i.e., a product of a function of the $x$-variable alone and a function of the $y$-variable alone. Substituting into the differential equation $-\Delta w = \lambda w$, we get
$$-\big(\varphi''(x)\psi(y) + \varphi(x)\psi''(y)\big) = \lambda \varphi(x)\psi(y);$$
dividing both terms by $\varphi(x)\psi(y)$ yields
$$-\frac{\varphi''(x)}{\varphi(x)} - \frac{\psi''(y)}{\psi(y)} = \lambda. \qquad (6.3.8)$$
This identity holds in $\Omega$, i.e., for all $x \in (0, L)$ and (independently) for all $y \in (0, L)$. Keeping $y$ fixed and varying $x$, we see that necessarily
$$-\frac{\varphi''(x)}{\varphi(x)} = \text{constant (say, } \mu\text{)},$$
while keeping $x$ fixed and varying $y$, we see that necessarily
$$-\frac{\psi''(y)}{\psi(y)} = \text{constant (say, } \nu\text{)}.$$
We deduce that (6.3.8) is equivalent to
$$-\frac{\varphi''(x)}{\varphi(x)} = \mu, \qquad -\frac{\psi''(y)}{\psi(y)} = \nu, \qquad \lambda = \mu + \nu. \qquad (6.3.9)$$
Enforcing the boundary condition $w = 0$ on the vertical side $x = 0$ of the square yields
$$\varphi(0)\psi(y) = 0, \qquad 0 \leq y \leq L,$$
and since $\psi$ cannot be identically $0$ (otherwise $w$ would be so), then necessarily $\varphi(0) = 0$. Considering all other sides of the square, we find by similar arguments $\varphi(L) = 0$, $\psi(0) = 0$ and $\psi(L) = 0$.
In conclusion, a function $w$ of the form (6.3.7) is an eigenfunction of our problem if and only if $\varphi$ and $\psi$ satisfy
$$-\varphi'' = \mu\varphi \ \text{in } (0, L), \quad \varphi(0) = \varphi(L) = 0, \qquad \text{and} \qquad -\psi'' = \nu\psi \ \text{in } (0, L), \quad \psi(0) = \psi(L) = 0,$$
i.e., if and only if both $\varphi$ and $\psi$ are arbitrary eigenfunctions of the one-dimensional problem considered in the previous subsection. In this way, we find the eigenfunctions
$$w_{hk}(x, y) = \frac{2}{L}\,\sin\!\left(h\frac{\pi}{L} x\right)\sin\!\left(k\frac{\pi}{L} y\right), \qquad h, k = 1, 2, \dots, \qquad (6.3.10)$$
with associated eigenvalues
$$\lambda_{hk} = (h^2 + k^2)\frac{\pi^2}{L^2}, \qquad h, k = 1, 2, \dots \qquad (6.3.11)$$
A reasonable question at this point is whether there exist eigenfunctions other than those found so far. The answer is no, since one can prove that the set (6.3.10) is complete in $L^2(\Omega)$, so that any $w$ orthogonal to all $w_{hk}$ would necessarily be the zero function in $L^2(\Omega)$.
Note that the adopted labeling of eigenfunctions and eigenvalues, by two indices $h$ and $k$, differs from the single-index labeling used in Theorem 6.2.7. Yet, it is not difficult to see that the set $\{(h, k) : h, k = 1, 2, \dots\}$ can be numbered in such a way that the monotonicity conditions (6.2.9) are fulfilled.
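The reindexing mentioned above is easy to carry out in practice; the following sketch (illustrative only, with $L = \pi$ chosen for convenience) enumerates the doubly-indexed eigenvalues and sorts them into a nondecreasing sequence, which also exhibits their multiplicities.

```python
import numpy as np

L, H = np.pi, 6                      # side length and truncation of the index range
pairs = [(h, k) for h in range(1, H + 1) for k in range(1, H + 1)]
lam = sorted((h * h + k * k) * np.pi**2 / L**2 for h, k in pairs)
print(np.round(lam[:8], 6))
# With L = pi the first eigenvalues are 2, 5, 5, 8, 10, 10, 13, 13:
# several of them have multiplicity 2 (h and k exchanged).
```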

6.3.3 The free vibrations of a circular membrane

Figure 6.1: The first eigenfunctions of a square membrane [... to be added]

At last, we solve Problem 6.1.1 in the unit circle centered at the origin, i.e., in $\Omega = B(0, 1)$. It is natural to resort to polar coordinates $(r, \theta)$; recalling (2.3.1), the differential equation $-\Delta w = \lambda w$ becomes
$$-\left(\frac{\partial^2 w}{\partial r^2} + \frac{1}{r}\frac{\partial w}{\partial r} + \frac{1}{r^2}\frac{\partial^2 w}{\partial \theta^2}\right) = \lambda w.$$
The domain is transformed into the product of intervals $[0, 1) \times [0, 2\pi)$, which suggests as above to look for solutions in separated form
$$w(r, \theta) = \varphi(r)\eta(\theta). \qquad (6.3.12)$$
Substituting into the equation yields
$$-\left(\varphi''(r)\eta(\theta) + \frac{1}{r}\varphi'(r)\eta(\theta) + \frac{1}{r^2}\varphi(r)\eta''(\theta)\right) = \lambda \varphi(r)\eta(\theta);$$
dividing by $\varphi(r)\eta(\theta)$, we get
$$-\frac{\varphi''(r)}{\varphi(r)} - \frac{1}{r}\frac{\varphi'(r)}{\varphi(r)} - \frac{1}{r^2}\frac{\eta''(\theta)}{\eta(\theta)} = \lambda. \qquad (6.3.13)$$
Now we observe that, keeping $r$ fixed and varying $\theta$, we find that necessarily the expression $\eta''(\theta)/\eta(\theta)$ must be constant, i.e., $\eta(\theta)$ is a solution of the differential equation
$$-\eta'' = \mu\,\eta \quad \text{in } (0, 2\pi) \qquad (6.3.14)$$
for some real constant $\mu$; on the other hand, $\eta$ should be $2\pi$-periodic. Hence, all admissible $\eta$ have the form
$$\eta_m(\theta) = e^{im\theta} \qquad \text{for arbitrary } m \in \mathbb{Z},$$
and the corresponding $\mu$ in (6.3.14) is given by
$$\mu_m = m^2.$$
Substituting into (6.3.13) yields
$$-\frac{\varphi''(r)}{\varphi(r)} - \frac{1}{r}\frac{\varphi'(r)}{\varphi(r)} + \frac{m^2}{r^2} = \lambda,$$
or, equivalently,
$$r^2 \varphi''(r) + r \varphi'(r) + (\lambda r^2 - m^2)\varphi(r) = 0. \qquad (6.3.15)$$
Let us denote by $\varphi_m(r)$ any solution of such an equation. The boundary conditions for $\varphi_m$ are as follows. Since $w(1, \theta) = 0$ for all $\theta$, necessarily $\varphi_m(1) = 0$. On the other hand, when $m \neq 0$, the function $w(r, \theta) = \varphi_m(r)e^{im\theta}$ would not admit a limit independent of $\theta$ as $r \to 0^+$, unless $\varphi_m(0) = 0$; so we enforce this condition. When $m = 0$, then $w(r, \theta) = \varphi_0(r)$ and no condition is needed on $\varphi_0$ at the origin, except that of being finite as $r \to 0^+$.
Equation (6.3.15) is similar to one of the classical equations of Applied Mathematics, the Bessel equation
$$x^2 Y''(x) + x Y'(x) + (x^2 - \nu^2)\,Y(x) = 0, \qquad x > 0, \qquad (6.3.16)$$
where $\nu$ is an arbitrary real or complex number. The solutions of (6.3.15) with the boundary conditions described above can be related to certain solutions of the Bessel equation, precisely to the Bessel functions of the first kind $J_\nu(x)$, with $\nu$ a non-negative integer. The behavior of these functions for $\nu = 0, 1, 2$ is shown in Fig. 6.2 (taken from Wikipedia); it is apparent that near the origin they behave as we want $\varphi_0, \varphi_1, \varphi_2$ to behave. Indeed, we have $J_\nu(0) = 0$ for all integer $\nu > 0$, whereas $J_0(0)$ is finite and non-zero.

Figure 6.2: Plots of the Bessel functions of the first kind $J_0$, $J_1$ and $J_2$

Each function $J_\nu$ exhibits an oscillatory behavior around the horizontal axis as $x$ increases, with a monotonically increasing sequence of strictly positive, simple zeroes $x_{\nu,k}$, $k = 1, 2, \dots$. Thus, for any $m \in \mathbb{Z}$ and any integer $k \geq 1$, let us define
$$\varphi_{m,k}(r) = J_{|m|}(x_{|m|,k}\, r). \qquad (6.3.17)$$
Then, using (6.3.16) with $x = x_{|m|,k}\, r$ and $\nu = |m|$, it is easily seen that $\varphi_{m,k}$ is a solution of (6.3.15) if the constant $\lambda = \lambda_{m,k}$ is defined as
$$\lambda_{m,k} = x_{|m|,k}^2; \qquad (6.3.18)$$
in addition, since $x_{|m|,k}$ is a zero of $J_{|m|}$, we have $\varphi_{m,k}(1) = 0$.
Summarizing, for any $m \in \mathbb{Z}$ and any integer $k \geq 1$, the function $w_{m,k}$ defined in polar coordinates as
$$w_{m,k}(r, \theta) = \varphi_{m,k}(r)\,\eta_m(\theta) \qquad (6.3.19)$$
is an eigenfunction of Problem 6.1.1 in the unit circle, with corresponding eigenvalue given by (6.3.18). One can show that these are indeed the totality of the eigenfunctions of this problem, which therefore is completely solved.
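The eigenvalues $\lambda_{m,k} = x_{|m|,k}^2$ can be computed numerically from the zeros of the Bessel functions; the following sketch assumes SciPy is available (scipy.special.jn_zeros returns the first positive zeros of $J_\nu$) and prints a few of them.

```python
import numpy as np
from scipy.special import jn_zeros

# lambda_{m,k} = x_{|m|,k}^2, where x_{nu,k} is the k-th positive zero of J_nu
for m in range(0, 3):
    zeros = jn_zeros(m, 3)                 # first three zeros of J_m
    print(f"m = {m}:", np.round(zeros**2, 4))
# m = 0 gives approximately 5.78, 30.47, 74.89 (the first one is the fundamental tone);
# m = 1 and m = 2 give larger eigenvalues, each with the two eigenfunctions e^{+-im*theta}.
```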

6.4 Expansion in series of eigenfunctions

The knowledge of the eigenvalues and eigenfunctions of Problem 6.2.1 allows us to solve by series the variational problem associated with the bilinear form $a$, namely:

Problem 6.4.1. Given $f \in H$, find $u \in V$ such that
$$a(u, v) = (f, v) \qquad \forall v \in V. \qquad (6.4.1)$$

Figure 6.3: The first eigenfunctions of a circular membrane [... to be added]

This means that we are able to expand the solution $u$ in the eigenfunction series (6.2.11), where the (generalized) Fourier coefficients of $u$ are expressed in terms of those of the data $f$. The precise result is as follows.

Proposition 6.4.2. Let
$$f = \sum_{k=1}^{\infty} \hat{f}_k w_k$$
be the expansion of $f$ in the series of eigenfunctions of the bilinear form $a$. Then, the Fourier coefficients of the solution $u$ of Problem 6.4.1 are given by
$$\hat{u}_k = \frac{1}{\lambda_k}\,\hat{f}_k, \qquad k = 1, 2, \dots, \qquad (6.4.2)$$
so that the eigenfunction series of $u$ is given by
$$u = \sum_{k=1}^{\infty} \frac{1}{\lambda_k}\,\hat{f}_k w_k. \qquad (6.4.3)$$

Proof. Let us choose $v = w_\ell$ in (6.4.1), with $\ell = 1, 2, \dots$. Then,
$$a(u, w_\ell) = a\Big(\sum_{k=1}^{\infty} \hat{u}_k w_k, w_\ell\Big) = \sum_{k=1}^{\infty} \hat{u}_k\, a(w_k, w_\ell) = (f, w_\ell),$$
where the second equality is justified by Proposition 6.2.9. Recalling (6.2.13), we obtain
$$\lambda_\ell\, \hat{u}_\ell = \hat{f}_\ell,$$
which, up to a change in the label of the coefficients, is precisely (6.4.2).

Examples 6.4.3. i) Consider the one-dimensional problem
$$-u'' = f \ \text{in } (0, L), \qquad u(0) = u(L) = 0. \qquad (6.4.4)$$
Assume that the right-hand side is the constant function $f = 1$, so that the solution is the parabola $u(x) = \frac{1}{2}x(L - x)$.

Let us first compute the Fourier coefficients of $f$ with respect to the eigenfunctions $w_k$ introduced in Sect. 6.3.1. Since
$$\int_0^L \sin\!\left(k\frac{\pi}{L} x\right) dx = \begin{cases} 0 & \text{for } k \text{ even}, \\ \dfrac{2L}{k\pi} & \text{for } k \text{ odd}, \end{cases} \qquad (6.4.5)$$
we get
$$\hat{f}_k = \begin{cases} 0 & \text{for } k \text{ even}, \\ \dfrac{2L}{k\pi}\sqrt{\dfrac{2}{L}} & \text{for } k \text{ odd}. \end{cases}$$
The Fourier coefficients of the solution are given by (6.4.2), i.e.,
$$\hat{u}_k = \frac{1}{\lambda_k}\hat{f}_k = \begin{cases} 0 & \text{for } k \text{ even}, \\ \dfrac{2L^3}{k^3\pi^3}\sqrt{\dfrac{2}{L}} & \text{for } k \text{ odd}, \end{cases}$$
so that the eigenfunction expansion of the solution is as follows:
$$u(x) = \sum_{k=1}^{\infty} \hat{u}_k w_k(x) = \frac{4L^2}{\pi^3} \sum_{m=0}^{\infty} \frac{1}{(2m+1)^3}\,\sin\!\left((2m+1)\frac{\pi}{L} x\right).$$
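A short numerical check (illustrative, not part of the notes) compares partial sums of this sine series with the exact parabola $\frac{1}{2}x(L-x)$; already very few terms give a good approximation, consistently with the $k^{-3}$ decay of the coefficients.

```python
import numpy as np

L = 1.0
x = np.linspace(0.0, L, 5)

def partial_sum(x, M):
    """Sum of the first M odd-index terms of the sine series for -u''=1, u(0)=u(L)=0."""
    s = np.zeros_like(x)
    for m in range(M):
        k = 2 * m + 1
        s += np.sin(k * np.pi * x / L) / k**3
    return 4.0 * L**2 / np.pi**3 * s

print(np.round(partial_sum(x, 1), 5))     # one term is already close
print(np.round(partial_sum(x, 50), 5))
print(np.round(0.5 * x * (L - x), 5))     # exact solution
```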

ii) Consider now the two-dimensional problem
$$-\Delta u = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, \qquad (6.4.6)$$
with $\Omega = (0, L)^2$. Assume as above that the right-hand side is the constant function $f = 1$; recall that this problem has already been considered in Sect. 4.5 while discussing the regularity of the solution of an elliptic problem in a domain with corners.
Let us first compute the Fourier coefficients of $f$ with respect to the eigenfunctions $w_{hk}$ introduced in Sect. 6.3.2. Recalling (6.4.5), we have
$$\hat{f}_{hk} = \frac{2}{L}\int_0^L \sin\!\left(h\frac{\pi}{L} x\right) dx \int_0^L \sin\!\left(k\frac{\pi}{L} y\right) dy = \begin{cases} 0 & \text{for } h \text{ or } k \text{ even}, \\ \dfrac{8L}{\pi^2}\dfrac{1}{hk} & \text{for } h \text{ and } k \text{ odd}. \end{cases}$$
The Fourier coefficients of the solution are given by (6.4.2), i.e.,
$$\hat{u}_{hk} = \frac{1}{\lambda_{hk}}\hat{f}_{hk} = \begin{cases} 0 & \text{for } h \text{ or } k \text{ even}, \\ \dfrac{8L^3}{\pi^4}\dfrac{1}{hk(h^2+k^2)} & \text{for } h \text{ and } k \text{ odd}, \end{cases}$$
so that the eigenfunction expansion of the solution is as follows:
$$u(x, y) = \sum_{h=1}^{\infty}\sum_{k=1}^{\infty} \hat{u}_{hk}\, w_{hk}(x, y) = \frac{16L^2}{\pi^4} \sum_{m=0}^{\infty}\sum_{n=0}^{\infty} \frac{\sin\!\left((2m+1)\frac{\pi}{L} x\right)\sin\!\left((2n+1)\frac{\pi}{L} y\right)}{(2m+1)(2n+1)\big((2m+1)^2 + (2n+1)^2\big)}.$$
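As a numerical illustration (a sketch, not part of the notes, with $L = 1$), truncating the double series gives, at the centre of the square, a value close to 0.0737, which is approximately the maximum of the solution of $-\Delta u = 1$ on the unit square.

```python
import numpy as np

L, M = 1.0, 40                          # side length and series truncation
x = y = L / 2.0                         # evaluate at the centre of the square

s = 0.0
for m in range(M):
    for n in range(M):
        h, k = 2 * m + 1, 2 * n + 1
        s += (np.sin(h * np.pi * x / L) * np.sin(k * np.pi * y / L)
              / (h * k * (h * h + k * k)))
print(16.0 * L**2 / np.pi**4 * s)       # approx. 0.0737 = u(L/2, L/2)
```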

6.5 Exercises

6.1. Compute the eigenvalues and the eigenfunctions of the one-dimensional boundary-value problems (6.3.4) and (6.3.5).

6.2. Expand in series of eigenfunctions the solution of the problem
$$-\Delta u = x y^2 \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega,$$
where $\Omega = (0, \pi)^2 \subset \mathbb{R}^2$.

6.3. Represent in series of eigenfunctions the solution of the problem
$$-\Delta u = x \sin 3y \ \text{in } \Omega, \qquad u = 0 \ \text{on } \Gamma_1 \cup \Gamma_3, \qquad \frac{\partial u}{\partial n} = 0 \ \text{on } \Gamma_2 \cup \Gamma_4,$$
where $\Omega = (0, \pi)^2 \subset \mathbb{R}^2$, $\Gamma_1 = (0, \pi)\times\{0\}$, $\Gamma_2 = \{\pi\}\times(0, \pi)$, $\Gamma_3 = (0, \pi)\times\{\pi\}$ and $\Gamma_4 = \{0\}\times(0, \pi)$.

Chapter 7

Parabolic Problems

Parabolic problems describe propagation phenomena with infinite speed, the so-called diffusion phenomena. Many problems in the applied sciences lead to this type of mathematical model: the heat propagation through a rod or the motion of a viscous flow in a channel are just two important examples, from Thermodynamics and Fluid Dynamics, in which parabolic equations describe the time evolution of temperature and velocity, respectively.
Initial/boundary-value problems for a second-order linear parabolic operator can be studied in a manner similar to what has been done in the previous chapters for elliptic problems. Precisely, we first obtain a weak, or variational, formulation of the problem, by proceeding initially in a formal manner and then making the functional assumptions fully rigorous. Next, we prove the well-posedness of the weak formulation; in the present situation, the result will be a consequence of an a-priori bound on any possible solution of the weak problem. At last, we interpret the weak solution as a strong, or classical, one, provided suitable assumptions on the data are satisfied.
While the variational treatment of the spatial part of the operator follows the guidelines established in Chaps. 3 and 4, here the mathematical novelty is represented by the first-order time derivative. Its treatment will be based on a new result, which represents a generalization of the well-known integration-by-parts formula in one dimension.
Before starting, let us introduce some slightly new notations that are needed in order to take into account the time variable.
Let $\Omega$ be a bounded open set in $\mathbb{R}^n$ and suppose its boundary $\partial\Omega$ is smooth enough; furthermore, let $(0, T)$ be the time interval of interest. We introduce the cylinder in space and time $Q = \Omega \times (0, T) \subset \mathbb{R}^{n+1}$.
Consider a generic real-valued function $v = v(x, t)$ defined on $Q$; it is convenient to think of it as a function $v = v(t)$ defined in $\Omega$ for every fixed $t \in (0, T)$ (i.e., a function in $\Omega$ depending on $t$ as a parameter). In other words, for all $t \in (0, T)$, the function $v(t) : \Omega \to \mathbb{R}$ is defined as
$$(v(t))(x) = v(x, t), \qquad x \in \Omega.$$
Suppose now $v \in L^2(Q)$; then, by Fubini's theorem
$$\|v\|_{L^2(Q)}^2 = \int\!\!\int_Q |v(x, t)|^2\,dx\,dt = \int_0^T\!\!\int_\Omega |v(x, t)|^2\,dx\,dt,$$
and, if we set as usual
$$\|v(t)\|_{L^2(\Omega)}^2 = \int_\Omega |v(x, t)|^2\,dx,$$
it results that
$$\|v\|_{L^2(Q)}^2 = \int_0^T \|v(t)\|_{L^2(\Omega)}^2\,dt.$$
Since the left-hand side is, by assumption, a finite number, it follows that we can think of $v$ as a function $v : (0, T) \to L^2(\Omega)$ such that the further function $t \mapsto \|v(t)\|_{L^2(\Omega)}$ belongs to $L^2(0, T)$. We write $v \in L^2(0, T; L^2(\Omega))$.
Thus, we may regard functions in $L^2(Q)$ as functions of the variable $t$ taking values in the space $L^2(\Omega)$, with the further property that the spatial norm has a certain degree of integrability with respect to time. This can be generalized as follows.
Definition 7.0.1. Let $X$ be a normed space; for $1 \leq p \leq \infty$, we denote by
$$L^p(0, T; X)$$
the linear space of the measurable functions $v : (0, T) \to X$ such that
$$\int_0^T \|v(t)\|_X^p\,dt < +\infty,$$
equipped with the norm
$$\|v\|_{L^p(0, T; X)} = \left(\int_0^T \|v(t)\|_X^p\,dt\right)^{1/p}$$
(with the obvious change when $p = \infty$). In addition, $L^p(0, T; X)$ is a Banach space if and only if $X$ is so.

Hence, we can identify $L^2(Q)$ with the space $L^2(0, T; L^2(\Omega))$; in this and the next chapters, we shall also use spaces like $L^2(0, T; H^1_0(\Omega))$ and $L^2(0, T; H^{-1}(\Omega))$. We also define $C^0([0, T]; X)$ as the space of all continuous functions $v : [0, T] \to X$, equipped with the norm
$$\|v\|_{C^0([0, T]; X)} = \max_{0 \leq t \leq T} \|v(t)\|_X.$$

7.1 Variational Formulation

In order to illustrate how to derive the variational formulation of a parabolic problem, we put ourselves in the conceptually simplest situation, namely we consider the homogeneous Dirichlet problem for the heat equation:
$$\frac{\partial u}{\partial t} - \Delta u = f \ \text{in } Q, \qquad u = 0 \ \text{on } \partial\Omega \times (0, T), \qquad u = u_0 \ \text{on } \Omega \times \{0\}, \qquad (7.1.1)$$
where $u = u(x, t)$ is the unknown temperature, at position $x$ and time $t$, of a conducting body occupying the domain $\Omega$, while $f = f(x, t)$ and $u_0 = u_0(x)$ are two given functions representing the heat exchange with the surrounding environment and the initial temperature, respectively. This choice is motivated by the arguments of Sect. 1.4, where it has been shown that the heat operator is the canonical form to which we can reduce any second-order parabolic operator. However, we will also mention extensions of the subsequent results to more general situations.
As we know, the weakest way to give meaning to the heat equation is in the distributional sense: we assume $u$ and $f$ to be locally integrable in $Q$ and we require $u$ to satisfy
$$\frac{\partial u}{\partial t} - \Delta u = f \qquad \text{in } \mathcal{D}'(Q), \qquad (7.1.2)$$
i.e.,
$$-\int\!\!\int_Q u\,\frac{\partial \varphi}{\partial t}\,dx\,dt - \int\!\!\int_Q u\,\Delta\varphi\,dx\,dt = \int\!\!\int_Q f\varphi\,dx\,dt \qquad \forall \varphi \in \mathcal{D}(Q). \qquad (7.1.3)$$

A more balanced formulation, suggested by the experience gained with elliptic problems, consists of applying the Gauss theorem only once in the term involving the spatial operator, i.e., writing the term
$$-\int\!\!\int_Q u\,\Delta\varphi\,dx\,dt \qquad \text{as} \qquad \int\!\!\int_Q \nabla u \cdot \nabla\varphi\,dx\,dt.$$
Obviously, this requires more regularity on $u$: it is quite natural to assume $u \in L^2(0, T; H^1_0(\Omega))$, since then also the Dirichlet boundary condition is rigorously defined a.e. (almost everywhere, i.e., outside a set of zero measure) in time. Furthermore, this assumption allows us to give a precise meaning to the time derivative of $u$. Indeed, recalling the bound (3.9.11), one has
$$\int_0^T \|\Delta u\|_{H^{-1}(\Omega)}^2\,dt \leq C \int_0^T \|u\|_{H^1_0(\Omega)}^2\,dt < +\infty,$$
i.e.,
$$u \in L^2(0, T; H^1_0(\Omega)) \quad \Longrightarrow \quad \Delta u \in L^2(0, T; H^{-1}(\Omega)).$$
If we further assume that $f \in L^2(Q) = L^2(0, T; L^2(\Omega)) \subset L^2(0, T; H^{-1}(\Omega))$ (the latter inclusion being a consequence of (3.9.1)), then we deduce from (7.1.2) that
$$\frac{\partial u}{\partial t} = \Delta u + f \in L^2(0, T; H^{-1}(\Omega)). \qquad (7.1.4)$$
Thus, under the above assumption on $f$, we are led to look for a solution $u$ of problem (7.1.1) in the space
$$W(0, T; H^1_0(\Omega), H^{-1}(\Omega)) = \Big\{w \in L^2(0, T; H^1_0(\Omega)) : \frac{\partial w}{\partial t} \in L^2(0, T; H^{-1}(\Omega))\Big\}, \qquad (7.1.5)$$
where the time derivative has to be understood, as usual, in the distributional sense. This is a Hilbert space for the graph norm
$$\|w\|_{W(0, T; H^1_0(\Omega), H^{-1}(\Omega))} = \left(\|w\|_{L^2(0, T; H^1_0(\Omega))}^2 + \Big\|\frac{\partial w}{\partial t}\Big\|_{L^2(0, T; H^{-1}(\Omega))}^2\right)^{1/2}.$$
In this case, all three addends of the heat equation are in $L^2(0, T; H^{-1}(\Omega))$, and the following variational formulation of the equation can be given:
$$\int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial u}{\partial t}, v \Big\rangle_{H^1_0(\Omega)}\,dt + \int_0^T\!\!\int_\Omega \nabla u \cdot \nabla v\,dx\,dt = \int_0^T {}_{H^{-1}(\Omega)}\langle f, v \rangle_{H^1_0(\Omega)}\,dt \qquad \forall v \in L^2(0, T; H^1_0(\Omega)). \qquad (7.1.6)$$
What about the initial condition? We have to give a correct meaning to it. This is precisely what is provided by the following result, which in addition establishes a useful integration-by-parts formula in time for functions in $W(0, T; H^1_0(\Omega), H^{-1}(\Omega))$.

Proposition 7.1.1. Any function in $W(0, T; H^1_0(\Omega), H^{-1}(\Omega))$ is (up to a negligible set) continuous from $[0, T]$ to $L^2(\Omega)$; in other words, the space $W(0, T; H^1_0(\Omega), H^{-1}(\Omega))$ is contained in $C^0([0, T]; L^2(\Omega))$ with continuous inclusion.
Furthermore, for any $w, v \in W(0, T; H^1_0(\Omega), H^{-1}(\Omega))$, the following identity holds:
$$\frac{d}{dt}(w, v)_{L^2(\Omega)} = {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial w}{\partial t}, v \Big\rangle_{H^1_0(\Omega)} + {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial v}{\partial t}, w \Big\rangle_{H^1_0(\Omega)} \qquad \text{in } \mathcal{D}'(0, T); \qquad (7.1.7)$$
equivalently, for any $0 \leq t_1 < t_2 \leq T$, one has
$$\int_{t_1}^{t_2} {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial w}{\partial t}, v \Big\rangle_{H^1_0(\Omega)}\,dt + \int_{t_1}^{t_2} {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial v}{\partial t}, w \Big\rangle_{H^1_0(\Omega)}\,dt = (w(t_2), v(t_2))_{L^2(\Omega)} - (w(t_1), v(t_1))_{L^2(\Omega)}. \qquad (7.1.8)$$

Proof. (We just provide the essential steps.) At first, one proves that the set of all smooth enough functions, e.g. those in $C^1(\bar{Q})$ vanishing on $\partial\Omega \times [0, T]$, is dense in $W(0, T; H^1_0(\Omega), H^{-1}(\Omega))$; we skip the technical details.
Next, we prove (7.1.7). Let $\{v_n\}_{n\geq 0}$ be any sequence of smooth functions converging to $v$ in $W(0, T; H^1_0(\Omega), H^{-1}(\Omega))$. Then, for all $\varphi \in \mathcal{D}(0, T)$, one has
$${}_{\mathcal{D}'(0,T)}\Big\langle \frac{d}{dt}(w, v)_{L^2(\Omega)}, \varphi \Big\rangle_{\mathcal{D}(0,T)} = -\int_0^T (w(t), v(t))_{L^2(\Omega)}\,\varphi'(t)\,dt = -\lim_{n\to\infty} \int_0^T (w(t), v_n(t))_{L^2(\Omega)}\,\varphi'(t)\,dt.$$
Now,
$$-\int_0^T (w(t), v_n(t))_{L^2(\Omega)}\,\varphi'(t)\,dt = -\int_0^T\!\!\int_\Omega w(x, t)\,v_n(x, t)\,\frac{d\varphi}{dt}(t)\,dx\,dt$$
$$= \int_0^T\!\!\int_\Omega \frac{\partial w}{\partial t}(x, t)\,v_n(x, t)\,\varphi(t)\,dx\,dt + \int_0^T\!\!\int_\Omega w(x, t)\,\varphi(t)\,\frac{\partial v_n}{\partial t}(x, t)\,dx\,dt.$$
By definition of $\frac{\partial w}{\partial t} \in L^2(0, T; H^{-1}(\Omega))$, observing that $v_n \varphi \in \mathcal{D}(Q)$, we can write
$$\int_0^T\!\!\int_\Omega \frac{\partial w}{\partial t}(x, t)\,v_n(x, t)\,\varphi(t)\,dx\,dt = \int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial w}{\partial t}, v_n \Big\rangle_{H^1_0(\Omega)}\,\varphi(t)\,dt.$$
Now, we remember that, since $(H^1_0(\Omega), L^2(\Omega), H^{-1}(\Omega))$ form a Gelfand triple (see Sect. 3.9), the $L^2(\Omega)$-inner product of a function $g \in L^2(\Omega)$ with a function $z \in H^1_0(\Omega)$ can be equivalently viewed as the duality pairing between $H^{-1}(\Omega)$ and $H^1_0(\Omega)$, i.e.,
$$(g, z)_{L^2(\Omega)} = {}_{H^{-1}(\Omega)}\langle g, z \rangle_{H^1_0(\Omega)}. \qquad (7.1.9)$$
Therefore,
$$\int_0^T\!\!\int_\Omega w(x, t)\,\varphi(t)\,\frac{\partial v_n}{\partial t}(x, t)\,dx\,dt = \int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial v_n}{\partial t}, w \Big\rangle_{H^1_0(\Omega)}\,\varphi(t)\,dt.$$
Summarizing,
$$-\int_0^T (w(t), v_n(t))_{L^2(\Omega)}\,\varphi'(t)\,dt = \int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial w}{\partial t}, v_n \Big\rangle_{H^1_0(\Omega)}\,\varphi\,dt + \int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial v_n}{\partial t}, w \Big\rangle_{H^1_0(\Omega)}\,\varphi\,dt.$$
Letting $n \to \infty$, we get
$${}_{\mathcal{D}'(0,T)}\Big\langle \frac{d}{dt}(w, v)_{L^2(\Omega)}, \varphi \Big\rangle_{\mathcal{D}(0,T)} = \int_0^T \Big({}_{H^{-1}(\Omega)}\Big\langle \frac{\partial w}{\partial t}, v \Big\rangle_{H^1_0(\Omega)} + {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial v}{\partial t}, w \Big\rangle_{H^1_0(\Omega)}\Big)\varphi\,dt$$
$$= {}_{\mathcal{D}'(0,T)}\Big\langle {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial w}{\partial t}, v \Big\rangle_{H^1_0(\Omega)} + {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial v}{\partial t}, w \Big\rangle_{H^1_0(\Omega)}, \varphi \Big\rangle_{\mathcal{D}(0,T)}.$$
Thus, (7.1.7) is proven.
This relation indicates that the $L^1(0, T)$-function $t \mapsto (w(t), v(t))_{L^2(\Omega)}$ has a first derivative, in the distributional sense, which itself belongs to $L^1(0, T)$. From such a result one can easily deduce (as in the proof of Property 3.2.7) that
$$\text{the function } (w(t), v(t))_{L^2(\Omega)} \text{ is continuous (indeed, absolutely continuous) on } [0, T]. \qquad (7.1.10)$$
In order to prove the continuity in $L^2(\Omega)$ of $w(t)$ at any point $t_0 \in [0, T]$, we use (7.1.10) once with $v = w$ and then with the constant function $v = w(t_0)$. Using the identity
$$\|w(t) - w(t_0)\|_{L^2(\Omega)}^2 = (w(t), w(t))_{L^2(\Omega)} - 2(w(t), w(t_0))_{L^2(\Omega)} + (w(t_0), w(t_0))_{L^2(\Omega)}$$
and letting $t \to t_0$, we get the result.
At last, (7.1.8) is obtained by integrating (7.1.7) on the time interval $[t_1, t_2]$.

Proposition 7.1.1 suggests the choice of the initial data $u_0$ in $L^2(\Omega)$, since $u(0)$ is well-defined as a function in this space if $u \in W(0, T; H^1_0(\Omega), H^{-1}(\Omega))$; thus, condition $u(0) = u_0$ has to be understood as an equality between functions in $L^2(\Omega)$.
Remark 7.1.2. Proposition 7.1.1 is just a particular case of a more general result, due to J.-L. Lions, which concerns Gelfand triples $(V, H, V')$ (where $V \subset H$ with continuous and dense inclusion). Precisely, introducing the space
$$W(0, T; V, V') = \Big\{w \in L^2(0, T; V) : \frac{\partial w}{\partial t} \in L^2(0, T; V')\Big\}, \qquad (7.1.11)$$
one has:

Proposition 7.1.3. The space $W(0, T; V, V')$ is contained in $C^0([0, T]; H)$ with continuous inclusion. Furthermore, for any $w, v \in W(0, T; V, V')$ and any $0 \leq t_1 < t_2 \leq T$, one has
$$\int_{t_1}^{t_2} {}_{V'}\Big\langle \frac{\partial w}{\partial t}, v \Big\rangle_V\,dt + \int_{t_1}^{t_2} {}_{V'}\Big\langle \frac{\partial v}{\partial t}, w \Big\rangle_V\,dt = (w(t_2), v(t_2))_H - (w(t_1), v(t_1))_H. \qquad (7.1.12)$$
Note that if one takes $H = V$ (hence, $V' = H' = V$) the result tells us that if $w \in L^2(0, T; V)$ and $\frac{\partial w}{\partial t} \in L^2(0, T; V)$, then $w \in C^0([0, T]; V)$.

We are ready for stating a variational formulation of Problem (7.1.1). For simplicity, in the sequel we will set
$$a(u, v) = (u, v)_{H^1_0(\Omega)} = \int_\Omega \nabla u \cdot \nabla v\,dx, \qquad (u, v) = (u, v)_{L^2(\Omega)} = \int_\Omega u v\,dx.$$
The variational formulation that we will consider is as follows.

Problem 7.1.4. Given $f \in L^2(Q)$ and $u_0 \in L^2(\Omega)$, find $u \in W(0, T; H^1_0(\Omega), H^{-1}(\Omega))$ satisfying $u(0) = u_0$ and such that
$$\int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial u}{\partial t}, v \Big\rangle_{H^1_0(\Omega)}\,dt + \int_0^T a(u, v)\,dt = \int_0^T (f, v)\,dt \qquad \forall v \in L^2(0, T; H^1_0(\Omega)). \qquad (7.1.13)$$

Two alternative but entirely equivalent expressions of (7.1.13) can be given. They are:
$${}_{H^{-1}(\Omega)}\Big\langle \frac{\partial u}{\partial t}(t), v \Big\rangle_{H^1_0(\Omega)} + a(u(t), v) = (f(t), v) \qquad \forall v \in H^1_0(\Omega), \ \text{a.e. in } (0, T) \qquad (7.1.14)$$
and
$$\frac{d}{dt}(u(t), v) + a(u(t), v) = (f(t), v) \qquad \forall v \in H^1_0(\Omega), \ \text{a.e. in } (0, T). \qquad (7.1.15)$$
Expression (7.1.15) is particularly important both for proving the existence of a solution (as we shall see later) and for the numerical approximation of the problem. The forms (7.1.13) and (7.1.14) are equivalent, thanks to the fact that the set of all piecewise constant functions $v : [0, T] \to H^1_0(\Omega)$ is dense in $L^2(0, T; H^1_0(\Omega))$. On the other hand, (7.1.14) and (7.1.15) are equivalent, since
$$\frac{d}{dt}(u(t), v) = {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial u}{\partial t}(t), v \Big\rangle_{H^1_0(\Omega)} \qquad \text{in } \mathcal{D}'(0, T),$$
thanks to (7.1.7) with $v \in L^2(0, T; H^1_0(\Omega))$ constant in time.

7.2 An a priori Estimate

In this section, we establish an upper bound for certain norms of any solution $u$ of the variational Problem 7.1.4, depending only on suitable norms of the data $f$ and $u_0$. Such an estimate is called a priori because we derive it from the sole assumption of $u$ being a solution of this problem, without even knowing that such a function exists.
However, a result of this kind is of great importance since from it we may prove the well-posedness of our parabolic problem, i.e., the existence and uniqueness of the solution as well as its continuous dependence on the data.
Let us suppose that $u$ is a solution of Problem 7.1.4. Fix any time $\tau$ satisfying $0 < \tau \leq T$. Take $v = u$ in (7.1.14) and integrate in time from $0$ to $\tau$ (equivalently, in (7.1.13) choose $v$ coinciding with $u$ in $[0, \tau]$ and equal to $0$ in $(\tau, T]$); we get
$$\int_0^\tau {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial u}{\partial t}, u \Big\rangle_{H^1_0(\Omega)}\,dt + \int_0^\tau a(u, u)\,dt = \int_0^\tau (f, u)\,dt. \qquad (7.2.1)$$
The first integral can be expressed via Proposition 7.1.1 as follows. Set $t_1 = 0$, $t_2 = \tau$ and $w = v = u$ in (7.1.8); this easily yields
$$\int_0^\tau {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial u}{\partial t}, u \Big\rangle_{H^1_0(\Omega)}\,dt = \frac{1}{2}\big[(u(\tau), u(\tau)) - (u(0), u(0))\big] = \frac{1}{2}\|u(\tau)\|_{L^2(\Omega)}^2 - \frac{1}{2}\|u_0\|_{L^2(\Omega)}^2.$$
On the other hand,
$$\int_0^\tau a(u, u)\,dt = \int_0^\tau \|u\|_{H^1_0(\Omega)}^2\,dt.$$
Furthermore, the Cauchy-Schwarz and Poincaré inequalities yield
$$(f, u) \leq \|f\|_{L^2(\Omega)}\,\|u\|_{L^2(\Omega)} \leq C_P(\Omega)\,\|f\|_{L^2(\Omega)}\,\|u\|_{H^1_0(\Omega)},$$
where $C_P(\Omega)$ is the Poincaré constant. Finally, using the inequality $ab \leq \frac{1}{2}(a^2 + b^2)$, $a, b \geq 0$, with the choice $a = C_P(\Omega)\|f\|_{L^2(\Omega)}$ and $b = \|u\|_{H^1_0(\Omega)}$ leads to
$$(f, u) \leq \frac{1}{2}C_P^2(\Omega)\,\|f\|_{L^2(\Omega)}^2 + \frac{1}{2}\|u\|_{H^1_0(\Omega)}^2;$$
integrating in time, we get
$$\int_0^\tau (f, u)\,dt \leq \frac{1}{2}C_P^2(\Omega)\int_0^\tau \|f\|_{L^2(\Omega)}^2\,dt + \frac{1}{2}\int_0^\tau \|u\|_{H^1_0(\Omega)}^2\,dt.$$
Thus, (7.2.1) leads to the inequality
$$\frac{1}{2}\|u(\tau)\|_{L^2(\Omega)}^2 + \int_0^\tau \|u\|_{H^1_0(\Omega)}^2\,dt \leq \frac{1}{2}\|u_0\|_{L^2(\Omega)}^2 + \frac{1}{2}C_P^2(\Omega)\int_0^\tau \|f\|_{L^2(\Omega)}^2\,dt + \frac{1}{2}\int_0^\tau \|u\|_{H^1_0(\Omega)}^2\,dt,$$
which can be further manipulated to obtain
$$\|u(\tau)\|_{L^2(\Omega)}^2 + \int_0^\tau \|u\|_{H^1_0(\Omega)}^2\,dt \leq \|u_0\|_{L^2(\Omega)}^2 + C_P^2(\Omega)\int_0^\tau \|f\|_{L^2(\Omega)}^2\,dt. \qquad (7.2.2)$$
Now, if we neglect the second term on the left-hand side and use the trivial bound
$$\int_0^\tau \|f\|_{L^2(\Omega)}^2\,dt \leq \int_0^T \|f\|_{L^2(\Omega)}^2\,dt,$$
we get
$$\|u(\tau)\|_{L^2(\Omega)}^2 \leq \|u_0\|_{L^2(\Omega)}^2 + C_P^2(\Omega)\int_0^T \|f\|_{L^2(\Omega)}^2\,dt \qquad \text{for all } \tau \in [0, T],$$
i.e.,
$$\max_{\tau\in[0,T]} \|u(\tau)\|_{L^2(\Omega)}^2 \leq \|u_0\|_{L^2(\Omega)}^2 + C_P^2(\Omega)\int_0^T \|f\|_{L^2(\Omega)}^2\,dt.$$
On the other hand, if we neglect the first term on the left-hand side of (7.2.2) and we choose $\tau = T$, we have
$$\int_0^T \|u\|_{H^1_0(\Omega)}^2\,dt \leq \|u_0\|_{L^2(\Omega)}^2 + C_P^2(\Omega)\int_0^T \|f\|_{L^2(\Omega)}^2\,dt.$$
Taking the square roots of both sides in the two last inequalities, and using the relation $\sqrt{\alpha^2 + \beta^2} \leq \alpha + \beta$ for all $\alpha, \beta \geq 0$, we end up with the following result.

Proposition 7.2.1. Any solution $u$ of Problem 7.1.4 satisfies the estimate
$$\|u\|_{C^0([0,T];L^2(\Omega))} + \|u\|_{L^2(0,T;H^1_0(\Omega))} \leq C\big(\|u_0\|_{L^2(\Omega)} + \|f\|_{L^2(0,T;L^2(\Omega))}\big), \qquad (7.2.3)$$
where $C > 0$ is a constant depending only on the domain $\Omega$.

Remark 7.2.2. A similar result holds, with the obvious changes, if $f \in L^2(0, T; H^{-1}(\Omega))$, or if $a(u, v)$ is any bilinear form which is coercive in $H^1_0(\Omega)$ (as the one associated with a uniformly elliptic second-order operator in $\Omega$).

7.3 Well-Posedness of the Problem

We now derive several important consequences from the a priori estimate (7.2.3).
(i) Suppose that two sets of data $\{f_1, u_{0,1}\}$ and $\{f_2, u_{0,2}\}$ are given and denote by $u_1$ and $u_2$ the solutions of the corresponding Problems 7.1.4. Due to the linearity of the equations, $u_1 - u_2$ is the solution of the problem whose data are $f_1 - f_2$ and $u_{0,1} - u_{0,2}$, so that from (7.2.3) we have
$$\|u_1 - u_2\|_{C^0([0,T];L^2(\Omega))} + \|u_1 - u_2\|_{L^2(0,T;H^1_0(\Omega))} \leq C\big(\|u_{0,1} - u_{0,2}\|_{L^2(\Omega)} + \|f_1 - f_2\|_{L^2(0,T;L^2(\Omega))}\big), \qquad (7.3.1)$$
which shows that as the data set $\{f_1, u_{0,1}\}$ approaches $\{f_2, u_{0,2}\}$, the first solution $u_1$ also approaches the second solution $u_2$. This means that the solution of the problem, if it exists, depends continuously on the data.
(ii) Consider a set of data $\{f, u_0\}$ and suppose that two solutions $u_1$ and $u_2$ arise; then we must have
$$\|u_1 - u_2\|_{C^0([0,T];L^2(\Omega))} + \|u_1 - u_2\|_{L^2(0,T;H^1_0(\Omega))} \leq 0,$$
and thus $u_1 = u_2$. This means that the solution of the problem, if it exists, is unique.
Finally, we prove that the problem really admits a solution.
Let $\{\lambda_n, w_n\}_{n\geq 1}$ be the eigenvalue-eigenfunction pairs of the Laplacian with Dirichlet boundary conditions, given by Problem 6.1.2; since $\{w_n\}_{n\geq 1}$ is an orthonormal fundamental set in $L^2(\Omega)$, we can give the following representations for the functions $f(t), u_0 \in L^2(\Omega)$:
$$f(t) = \sum_{n=1}^{+\infty} \hat{f}_n(t) w_n, \qquad \hat{f}_n(t) = (f(t), w_n),$$
$$u_0 = \sum_{n=1}^{+\infty} \hat{u}_{0,n} w_n, \qquad \hat{u}_{0,n} = (u_0, w_n).$$
Let us seek a solution of the form
$$u(t) = \sum_{n=1}^{+\infty} \hat{u}_n(t) w_n; \qquad (7.3.2)$$
substituting into equation (7.1.15) and choosing any eigenfunction $w_m$ as a test function gives
$$\frac{d}{dt}\Big(\sum_{n=1}^{+\infty} \hat{u}_n(t) w_n, w_m\Big) + a\Big(\sum_{n=1}^{+\infty} \hat{u}_n(t) w_n, w_m\Big) = \Big(\sum_{n=1}^{+\infty} \hat{f}_n(t) w_n, w_m\Big), \qquad m = 1, 2, \dots,$$
that is
$$\sum_{n=1}^{+\infty} \hat{u}_n'(t)\,(w_n, w_m) + \sum_{n=1}^{+\infty} \hat{u}_n(t)\,a(w_n, w_m) = \sum_{n=1}^{+\infty} \hat{f}_n(t)\,(w_n, w_m), \qquad m = 1, 2, \dots;$$
recalling that
$$(w_n, w_m) = \delta_{n,m} \qquad \text{and} \qquad a(w_n, w_m) = \lambda_n \delta_{n,m},$$
we obtain
$$\hat{u}_m'(t) + \lambda_m \hat{u}_m(t) = \hat{f}_m(t), \qquad m = 1, 2, \dots, \qquad (7.3.3)$$
which shows that every generalized Fourier coefficient $\hat{u}_m(t)$ of the expansion of $u$ satisfies a first-order linear ordinary differential equation.
Moreover, since $u(0) = u_0$, the following relationship must hold:
$$\sum_{n=1}^{+\infty} \hat{u}_n(0) w_n = \sum_{n=1}^{+\infty} \hat{u}_{0,n} w_n,$$
thus
$$\hat{u}_m(0) = \hat{u}_{0,m}, \qquad m = 1, 2, \dots. \qquad (7.3.4)$$
Finally, from (7.3.3) and (7.3.4) we have for every $m \geq 1$ the Cauchy problem
$$\hat{u}_m'(t) + \lambda_m \hat{u}_m(t) = \hat{f}_m(t), \qquad \hat{u}_m(0) = \hat{u}_{0,m}, \qquad (7.3.5)$$
whose solution is
$$\hat{u}_m(t) = e^{-\lambda_m t}\,\hat{u}_{0,m} + \int_0^t e^{-\lambda_m (t-s)}\,\hat{f}_m(s)\,ds. \qquad (7.3.6)$$
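The decoupled ODEs (7.3.3)-(7.3.6) translate directly into a computable truncated solution. The following sketch (illustrative only; it uses the explicit Dirichlet eigenpairs of Sect. 6.3.1 on $(0, L)$ and a hypothetical set of initial Fourier coefficients, with $f \equiv 0$) shows how the higher modes are damped exponentially in time.

```python
import numpy as np

L = 1.0
lam = lambda n: (n * np.pi / L)**2                    # Dirichlet eigenvalues on (0, L)
w = lambda n, x: np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)

def heat_solution(x, t, u0_coeffs):
    """u(x,t) = sum_n exp(-lam_n t) * u0_n * w_n(x), truncated to the given coefficients."""
    return sum(np.exp(-lam(n) * t) * c * w(n, x)
               for n, c in enumerate(u0_coeffs, start=1))

u0_coeffs = [1.0, 0.5, 0.25]                          # hypothetical Fourier data of u0
x = np.linspace(0.0, L, 6)
for t in (0.0, 0.01, 0.1):
    print(f"t = {t}:", np.round(heat_solution(x, t, u0_coeffs), 4))
# each mode is damped by exp(-n^2 pi^2 t): the solution smooths out very rapidly
```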

It remains now to be checked that the series (7.3.2) converges to a function $u$ which solves Problem 7.1.4. To do this, let us set
$$u_N(t) := \sum_{n=1}^{N} \hat{u}_n(t) w_n, \qquad f_N(t) := \sum_{n=1}^{N} \hat{f}_n(t) w_n, \qquad u_{0,N} := \sum_{n=1}^{N} \hat{u}_{0,n} w_n;$$
it is easy to verify that $u_N$ satisfies the variational formulation of the problem
$$\frac{\partial u_N}{\partial t} - \Delta u_N = f_N \ \text{in } Q, \qquad u_N = 0 \ \text{on } \partial\Omega \times (0, T), \qquad u_N = u_{0,N} \ \text{on } \Omega \times \{0\};$$
then, if we consider a further set of data $\{f_M, u_{0,M}\}$ with the corresponding solution $u_M$, we have, from the upper bound (7.2.3),
$$\|u_N - u_M\|_{C^0([0,T];L^2(\Omega))} + \|u_N - u_M\|_{L^2(0,T;H^1_0(\Omega))} \leq C\big(\|u_{0,N} - u_{0,M}\|_{L^2(\Omega)} + \|f_N - f_M\|_{L^2(0,T;L^2(\Omega))}\big), \qquad M, N \geq 1. \qquad (7.3.7)$$
Since $\{f_N\}_{N\geq 1}$ and $\{u_{0,N}\}_{N\geq 1}$ converge in their spaces (to $f$ and $u_0$, respectively), they are Cauchy sequences; from (7.3.7) it follows that so is $\{u_N\}_{N\geq 1}$, both in $C^0([0, T]; L^2(\Omega))$ and in $L^2(0, T; H^1_0(\Omega))$. Using the completeness of these spaces allows us to conclude that when $N \to \infty$ the sequence $u_N$ converges to a function $u$ belonging to $L^2(0, T; H^1_0(\Omega)) \cap C^0([0, T]; L^2(\Omega))$.
Passing to the limit in the distributional equations satisfied by $u_N$, i.e.,
$$-\int\!\!\int_Q u_N\,\frac{\partial \varphi}{\partial t}\,dx\,dt - \int\!\!\int_Q u_N\,\Delta\varphi\,dx\,dt = \int\!\!\int_Q f_N\,\varphi\,dx\,dt \qquad \forall \varphi \in \mathcal{D}(Q),$$
we obtain (7.1.2), from which one can easily deduce that $u$ solves Problem 7.1.4.
Summarizing, we have proven the following fundamental result.

Theorem 7.3.1. Problem 7.1.4 admits one and only one solution, for which the bound (7.2.3) holds. Furthermore, the solution depends continuously on the data, as expressed by the bound (7.3.1).

7.4 Some Facts about the Regularity of the Solution

We have seen that, under the hypotheses $f \in L^2(0, T; L^2(\Omega))$ and $u_0 \in L^2(\Omega)$, the parabolic problem (7.1.1) admits a unique solution $u \in L^2(0, T; H^1_0(\Omega))$; note that we need not assign $u_0$ in $H^1_0(\Omega)$: $u_0 \in L^2(\Omega)$ is enough to obtain a solution which belongs to $H^1_0(\Omega)$ a.e. in time. This fact is already related to the regularization property of parabolic equations.
Another point of view is provided by a spectral analysis. Let us suppose $f \equiv 0$, i.e., $\hat{f}_n(t) \equiv 0$ for every $n \geq 1$; then, the solution reads as
$$u(t) = \sum_{n=1}^{+\infty} \hat{u}_n(t) w_n = \sum_{n=1}^{+\infty} e^{-\lambda_n t}\,\hat{u}_{0,n} w_n.$$
Recalling that the $L^2$-norm of a function $v$ of the form $v = \sum_{n=1}^{+\infty} \hat{v}_n w_n$ is given by the Parseval identity as $\|v\|_{L^2(\Omega)}^2 = \sum_{n=1}^{+\infty} |\hat{v}_n|^2$, it results that
$$\|u(t)\|_{L^2(\Omega)}^2 = \sum_{n=1}^{+\infty} e^{-2\lambda_n t}\,|\hat{u}_{0,n}|^2;$$
on the other hand, putting $t = 0$ yields
$$\sum_{n=1}^{+\infty} |\hat{u}_{0,n}|^2 = \|u_0\|_{L^2(\Omega)}^2 < +\infty,$$
since the function $u_0$ lies, by assumption, in $L^2(\Omega)$. Thus,
$$|\hat{u}_n(t)| \leq C e^{-\lambda_n t} \qquad \forall n \geq 1.$$
It follows that the Fourier coefficients of the solution, for any time $t > 0$, decay exponentially in the wavenumber $n$. As noted in Chap. 6, this is a manifestation of smoothness of the solution.
If $f$ is not zero, an analogous conclusion can be achieved from expression (7.3.6). Since $s - t < 0$ whenever $t > s$, any singularity of the data $f$ at a given time $s > 0$ is immediately smoothed out at all subsequent times.
In short, a parabolic problem tends to smooth the data.
Another explicit manifestation of this property can be obtained by considering the homogeneous heat equation in the half-plane $\{(x, t) : t > 0\}$:
$$\frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = 0 \ \text{in } \mathbb{R} \times (0, +\infty), \qquad u(x, t) \to 0 \ \text{for } |x| \to \infty, \qquad u(x, 0) = u_0(x), \quad x \in \mathbb{R}. \qquad (7.4.1)$$
If we apply the Fourier transform with respect to $x$ to $u$ and its derivatives we get
$$\hat{u}(\xi, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} u(x, t)\,e^{-i\xi x}\,dx, \qquad \widehat{\frac{\partial^2 u}{\partial x^2}}(\xi, t) = -\xi^2\,\hat{u}(\xi, t), \qquad \widehat{\frac{\partial u}{\partial t}}(\xi, t) = \frac{\partial \hat{u}}{\partial t}(\xi, t),$$
and then
$$\frac{\partial \hat{u}}{\partial t} + \xi^2 \hat{u} = 0, \qquad \hat{u}(\xi, 0) = \hat{u}_0(\xi), \qquad \xi \in \mathbb{R}, \ t > 0. \qquad (7.4.2)$$
Note that the condition $u(x, t) \to 0$ is implicit in the fact that $u(t)$ admits a Fourier transform, i.e., $u(t) \in L^2(\mathbb{R})$, and so it need not be further imposed in Problem (7.4.2).
Solving this Cauchy problem with respect to $t$ and regarding $\xi$ as a parameter yields
$$\hat{u}(\xi, t) = \hat{u}_0(\xi)\,e^{-\xi^2 t};$$
turning back to the physical space, we have then
$$u(x, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} \hat{u}(\xi, t)\,e^{i\xi x}\,d\xi = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} \hat{u}_0(\xi)\,e^{-\xi^2 t + i\xi x}\,d\xi$$
$$= \frac{1}{2\pi}\int_{-\infty}^{+\infty} e^{-\xi^2 t + i\xi x}\int_{-\infty}^{+\infty} u_0(y)\,e^{-i\xi y}\,dy\,d\xi = \int_{-\infty}^{+\infty}\left(\frac{1}{2\pi}\int_{-\infty}^{+\infty} e^{-\xi^2 t + i\xi(x-y)}\,d\xi\right) u_0(y)\,dy.$$
If we set
$$K(x - y, t) := \frac{1}{2\pi}\int_{-\infty}^{+\infty} e^{-\xi^2 t + i\xi(x-y)}\,d\xi, \qquad (7.4.3)$$
we can write the solution as
$$u(x, t) = \int_{-\infty}^{+\infty} K(x - y, t)\,u_0(y)\,dy = (K * u_0)(x, t),$$
where the convolution is intended with respect to the space variable only.
The function $K$ is said to be the kernel of the heat equation; it is the solution of the parabolic problem
$$\frac{\partial K}{\partial t} - \frac{\partial^2 K}{\partial x^2} = 0 \ \text{in } \mathbb{R} \times (0, +\infty), \qquad K \to 0 \ \text{for } |x| \to \infty, \qquad K = \delta_0 \ \text{for } t = 0,$$
in the sense of distributions (for this reason it is sometimes called a fundamental solution of the heat equation), and it can be expressed in a closed form, since from equation (7.4.3) it follows
$$K(z, t) = \frac{e^{-\frac{z^2}{4t}}}{\sqrt{4\pi t}},$$
so that we also have
$$u(x, t) = \int_{-\infty}^{+\infty} \frac{e^{-\frac{(x-y)^2}{4t}}}{\sqrt{4\pi t}}\,u_0(y)\,dy.$$
From here we see that $u$ is infinitely differentiable with respect to $x$ even when $u_0$ is not, since the kernel $K$ lies in $C^\infty$ for every $t > 0$.
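The smoothing effect of the kernel can be observed numerically. The sketch below (an illustration, not part of the notes) convolves a discontinuous initial datum with the Gaussian kernel $K$ by a crude quadrature and prints the result near the jump.

```python
import numpy as np

def heat_conv(x, t, u0, y):
    """Evaluate (K(.,t) * u0)(x) by a simple rectangle quadrature in y."""
    K = np.exp(-(x[:, None] - y[None, :])**2 / (4.0 * t)) / np.sqrt(4.0 * np.pi * t)
    dy = y[1] - y[0]
    return (K * u0(y)[None, :]).sum(axis=1) * dy

u0 = lambda y: (np.abs(y) < 1.0).astype(float)        # discontinuous initial datum
y = np.linspace(-8.0, 8.0, 4001)
x = np.linspace(-2.0, 2.0, 9)
for t in (0.01, 0.1, 1.0):
    print(f"t = {t}:", np.round(heat_conv(x, t, u0, y), 3))
# Already for t = 0.01 the jump at |x| = 1 is replaced by a smooth transition.
```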

An alternative fashion for studying the regularity of the solution of a parabolic problem consists of the following procedure:
(i) consider the equation
$$\frac{\partial u}{\partial t} - \Delta u = f$$
and formally differentiate both the left-hand side and the right-hand side with respect to $t$, to obtain an equation satisfied by $\frac{\partial u}{\partial t}$;
(ii) write then
$$-\Delta u = f - \frac{\partial u}{\partial t}, \qquad t > 0,$$
and exploit the results about the regularity of the solution of elliptic problems.
Accordingly, we have as a first step
$$\frac{\partial}{\partial t}\Big(\frac{\partial u}{\partial t}\Big) - \Delta\Big(\frac{\partial u}{\partial t}\Big) = \frac{\partial f}{\partial t}, \qquad (7.4.4)$$
so $\frac{\partial u}{\partial t}$ formally satisfies a parabolic equation where the right-hand side is $\frac{\partial f}{\partial t}$. Furthermore, from $u = 0$ on $\partial\Omega \times (0, T)$, it follows the boundary condition
$$\frac{\partial u}{\partial t} = 0 \ \text{on } \partial\Omega \times (0, T),$$
and from $u(0) = u_0$ on $\Omega \times \{0\}$ we obtain analogously
$$\frac{\partial u}{\partial t}(0) = f(0) + \Delta u(0) = f(0) + \Delta u_0,$$
giving us the initial condition for equation (7.4.4).
In order to justify the previous formal steps, let us suppose $\frac{\partial f}{\partial t} \in L^2(0, T; L^2(\Omega))$: since we already assumed $f \in L^2(0, T; L^2(\Omega))$, we derive $f \in C^0([0, T]; L^2(\Omega))$ from Remark 7.1.2, so that $f(0)$ is well-defined in $L^2(\Omega)$; moreover, let $u_0 \in H^2(\Omega)$, and consequently $\Delta u_0 \in L^2(\Omega)$. Therefore, the problem
$$\frac{\partial w}{\partial t} - \Delta w = \frac{\partial f}{\partial t} \ \text{in } Q, \qquad w = 0 \ \text{on } \partial\Omega \times (0, T), \qquad w = f(0) + \Delta u_0 \ \text{on } \Omega \times \{0\},$$
admits a unique solution $w \in L^2(0, T; H^1_0(\Omega))$ with $\frac{\partial w}{\partial t} \in L^2(0, T; H^{-1}(\Omega))$. But it is clear that one has $w = \frac{\partial u}{\partial t}$ (at least in the sense of distributions) and then $\frac{\partial u}{\partial t} \in L^2(0, T; H^1_0(\Omega))$ with $\frac{\partial^2 u}{\partial t^2} \in L^2(0, T; H^{-1}(\Omega))$, i.e., by Proposition 7.1.1, $\frac{\partial u}{\partial t} \in C^0([0, T]; L^2(\Omega))$.
As a second step, we write
$$-\Delta u(t) = f(t) - \frac{\partial u}{\partial t}(t) \qquad \text{a.e. in } t, \qquad (7.4.5)$$
which is an elliptic equation with respect to $x$, whose right-hand side lies in $L^2(\Omega)$ for almost every $t \in (0, T)$. If $\Omega$ is smooth or convex, then the following relationship holds true:
$$\|u(t)\|_{H^2(\Omega)} \leq C\,\Big\|f(t) - \frac{\partial u}{\partial t}(t)\Big\|_{L^2(\Omega)} < +\infty, \qquad C > 0,$$
which yields, after an integration in $t$, $u \in L^2(0, T; H^2(\Omega))$.
We conclude that the solution is more regular than in the general case, thanks to the extra regularity assumptions made on the data $f$ and $u_0$.

7.5 The Maximum Principle for Parabolic Equations

We can extend the Maximum Principle presented in Chapter 5 for elliptic equations to parabolic equations of the form
$$\frac{\partial u}{\partial t} + Lu = f \ \text{in } Q, \qquad u = g \ \text{on } \partial\Omega \times (0, T), \qquad u = u_0 \ \text{on } \Omega \times \{0\},$$
where $L$ denotes the most general second-order linear elliptic operator (cf. Chapter 5).
By the same variational technique of Stampacchia's Theorem 5.2.2, and using now the a priori upper bound for the parabolic problem, it is possible to prove the following result:

Theorem 7.5.1. Set
$$m = \min\Big(\min_{\partial\Omega\times(0,T)} g,\ \min_\Omega u_0\Big) \qquad \text{and} \qquad M = \max\Big(\max_{\partial\Omega\times(0,T)} g,\ \max_\Omega u_0\Big)$$
and assume that the same hypotheses of Theorem 5.2.2 hold on $L$. Then:
(i) if $f - a_0 m \geq 0$ in $Q$, then $u \geq m$ in $Q$;
(ii) if $f - a_0 M \leq 0$ in $Q$, then $u \leq M$ in $Q$.
In particular, if $f \geq 0$ in $Q$, $u_0 \geq 0$ in $\Omega$ and $g = 0$ on $\partial\Omega \times (0, T)$, then it follows that $u \geq 0$ in $Q$.

7.6 Exercises

7.1. Solve in series of eigenfunctions the following parabolic problem:
$$\frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = 0 \ \text{in } (0, 1) \times (0, T), \ T > 0, \qquad u(0, t) = e^{-t}, \quad u(1, t) = 0, \quad 0 < t \leq T, \qquad u(x, 0) = 0, \quad x \in (0, 1).$$
(Hint: make a suitable substitution to have a homogeneous Dirichlet condition on the boundary; for example, consider $u(x, t) = \tilde{u}(x, t) + e^{-t}(1 - x)$ and solve first for $\tilde{u}$.)

7.2. Solve in series of eigenfunctions the following parabolic problem:
$$\frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = x \ \text{in } (0, \pi) \times (0, T), \ T > 0, \qquad u(0, t) = 0, \quad \frac{\partial u}{\partial n}(\pi, t) = 0, \quad 0 < t \leq T, \qquad u(x, 0) = 1, \quad x \in (0, \pi).$$
(Hint: first of all, note that in this case it results $\frac{\partial}{\partial n} = \frac{\partial}{\partial x}$. Moreover, for this problem you need the eigenfunctions of the Laplacian with mixed boundary conditions: see Exercise 6.1.)


Chapter 8

Hyperbolic Problems
Hyperbolic problems arise in modeling transport phenomena with finite speed, especially those
involving wave motion. Maxwells equations, which describe the propagation of electromagnetic
waves in the vacuum, as well as DAlemberts equation for the pressure waves in a fluid like air or
water, and the so-called equation of a vibrating string, which gives the motion of an elastic wave
induced across a rope blocked at its ends, go all back to a hyperbolic mathematical model.
In this chapter we deal with the most general second order linear hyperbolic operator in its
canonical form, as presented in Chapter 1. We shall use the same notation introduced in the
previous chapter, and our exposition will be almost entirely parallel to that of parabolic problems.

8.1 Variational Formulation

Throughout this chapter we shall refer to the initial/boundary value problem
\[
\begin{cases}
\dfrac{\partial^2 u}{\partial t^2} - \Delta u = f & \text{in } Q\,,\\[2pt]
u = 0 & \text{on } \partial\Omega \times (0,T)\,,\\[2pt]
u = u_0\,, \quad \dfrac{\partial u}{\partial t} = u_1 & \text{on } \Omega \times \{0\}\,.
\end{cases}
\tag{8.1.1}
\]
Note that in this case we must prescribe two initial conditions due to the presence of the second
order derivative $\frac{\partial^2 u}{\partial t^2}$ in time.
Remark 8.1.1. In applications, a hyperbolic equation often arises in the form
\[
\frac{\partial^2 u}{\partial t^2} - c^2 \Delta u = f\,, \tag{8.1.2}
\]
where $c$ is the speed of the wave which is propagating across the domain $\Omega$.
However, it is possible to reduce such an equation to the form discussed in the text by a classical
procedure of Mathematical Physics, which consists in rewriting the problem in dimensionless form.
For instance, suppose $u$ is a velocity field and let us denote by $U$ a characteristic (constant)
velocity such that $u = U\hat u$, where $\hat u$ is a dimensionless variable; analogously, let us set $x = L\hat x$ and
$t = \tau \hat t$, $L$ and $\tau$ being a characteristic length of the spatial domain and a characteristic time,
respectively. Substituting into equation (8.1.2) gives
\[
\frac{U}{\tau^2} \frac{\partial^2 \hat u}{\partial \hat t^{\,2}} - c^2 \frac{U}{L^2} \hat\Delta \hat u = f\,,
\]
that is,
\[
\frac{\partial^2 \hat u}{\partial \hat t^{\,2}} - c^2 \frac{\tau^2}{L^2} \hat\Delta \hat u = \frac{\tau^2}{U} f\,.
\]
Choosing $\tau = L/c$ and setting $\hat f = \tau^2 f / U$, we finally obtain
\[
\frac{\partial^2 \hat u}{\partial \hat t^{\,2}} - \hat\Delta \hat u = \hat f\,,
\]
which is a dimensionless equation of the desired form, in the unknown $\hat u$ (here $\hat\Delta$ denotes the Laplacian
with respect to the rescaled variable $\hat x$).
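As a purely illustrative numerical instance of this rescaling (the values below are not from the notes): taking the speed of sound in air, $c \approx 343$ m/s, and a domain of length $L = 1$ m gives the characteristic time $\tau = L/c \approx 2.9$ ms, after which the coefficient in front of the rescaled Laplacian is exactly one.

```python
# Bookkeeping for the rescaling of u_tt - c^2 * Laplacian(u) = f
# (all numerical values are illustrative assumptions).
c = 343.0      # wave speed [m/s]: approximate speed of sound in air
L = 1.0        # characteristic length [m]
U = 1.0        # characteristic size of u (its physical unit is irrelevant here)

tau = L / c    # characteristic time chosen so that c^2 * tau^2 / L^2 = 1
print("tau =", tau, "s")                                      # ~2.9e-3 s
print("coefficient of the Laplacian:", c**2 * tau**2 / L**2)  # exactly 1.0

def f_hat(f):
    """Forcing term in dimensionless form: f_hat = tau^2 * f / U."""
    return tau**2 * f / U
```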
Going back to our problem, the equation
\[
\frac{\partial^2 u}{\partial t^2} - \Delta u = f \tag{8.1.3}
\]
certainly makes sense in $\mathcal{D}'(Q)$, so we can write
\[
{}_{\mathcal{D}'(Q)}\Big\langle \frac{\partial^2 u}{\partial t^2} - \Delta u,\, \varphi \Big\rangle_{\mathcal{D}(Q)}
= {}_{\mathcal{D}'(Q)}\langle f, \varphi \rangle_{\mathcal{D}(Q)}
\qquad \forall \varphi \in \mathcal{D}(Q)\,,
\]
or, more explicitly,
\[
\int_0^T \Big( \frac{\partial^2 u}{\partial t^2}, \varphi \Big)\, dt + \int_0^T a(u, \varphi)\, dt = \int_0^T (f, \varphi)\, dt
\qquad \forall \varphi \in \mathcal{D}(Q)\,, \tag{8.1.4}
\]

where, as usual, we have set
\[
a(u,v) = (u,v)_{H^1_0(\Omega)} = \int_\Omega \nabla u \cdot \nabla v\, dx\,, \qquad
(u,v) = (u,v)_{L^2(\Omega)} = \int_\Omega u\, v\, dx\,.
\]

On the other hand, if we assume $f \in L^2(Q) = L^2(0,T;L^2(\Omega))$ and $u \in L^2(0,T;H^1_0(\Omega))$, which
implies $\Delta u \in L^2(0,T;H^{-1}(\Omega))$, we obtain from (8.1.3)
\[
\frac{\partial^2 u}{\partial t^2} \in L^2(0,T;H^{-1}(\Omega))\,,
\]
so that we can express our equation in the following variational form:
\[
\int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial^2 u}{\partial t^2}, v \Big\rangle_{H^1_0(\Omega)} dt
+ \int_0^T a(u,v)\, dt = \int_0^T (f,v)\, dt
\qquad \forall v \in L^2(0,T;H^1_0(\Omega))\,. \tag{8.1.5}
\]

In order to give a precise meaning to the initial conditions, we make the further assumption
that
\[
\frac{\partial u}{\partial t} \in L^2(0,T;H^1_0(\Omega))\,;
\]
in other words, all together the solution $u$ is required to satisfy $u \in L^2(0,T;H^1_0(\Omega))$ with
$\frac{\partial u}{\partial t} \in W(0,T;H^1_0(\Omega),H^{-1}(\Omega))$. As a consequence, Proposition 7.1.1 applied to the pair $\frac{\partial u}{\partial t}, \frac{\partial^2 u}{\partial t^2}$ yields
\[
\frac{\partial u}{\partial t} \in C^0([0,T];L^2(\Omega))\,;
\]

on the other hand, recalling Remark 7.1.2 for the pair $u, \frac{\partial u}{\partial t}$, we get
\[
u \in C^0([0,T];H^1_0(\Omega))\,.
\]
Thus, the pointwise evaluation of both $u$ and $\frac{\partial u}{\partial t}$ makes sense in $[0,T]$. In particular, we have
\[
u(0) = u_0 \in H^1_0(\Omega) \qquad \text{and} \qquad \frac{\partial u}{\partial t}(0) = u_1 \in L^2(\Omega)\,,
\]
giving the natural spaces in which one has to choose the initial data.
We are ready to state the initial/boundary value problem (8.1.1) in variational form.
Problem 8.1.2. Given $f \in L^2(Q)$, $u_0 \in H^1_0(\Omega)$ and $u_1 \in L^2(\Omega)$, find $u \in L^2(0,T;H^1_0(\Omega))$ with
$\frac{\partial u}{\partial t} \in W(0,T;H^1_0(\Omega),H^{-1}(\Omega))$ satisfying $u(0) = u_0$, $\frac{\partial u}{\partial t}(0) = u_1$ and such that
\[
\int_0^T {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial^2 u}{\partial t^2}, v \Big\rangle_{H^1_0(\Omega)} dt
+ \int_0^T a(u,v)\, dt = \int_0^T (f,v)\, dt
\qquad \forall v \in L^2(0,T;H^1_0(\Omega))\,. \tag{8.1.6}
\]

Equivalent expressions of the last equation are:
\[
{}_{H^{-1}(\Omega)}\Big\langle \frac{\partial^2 u}{\partial t^2}(t), v \Big\rangle_{H^1_0(\Omega)} + a(u(t), v) = (f(t), v)
\qquad \forall v \in H^1_0(\Omega),\ \text{a.e. in } (0,T) \tag{8.1.7}
\]
and
\[
\frac{d^2}{dt^2}\,(u(t), v) + a(u(t), v) = (f(t), v)
\qquad \forall v \in H^1_0(\Omega),\ \text{a.e. in } (0,T)\,. \tag{8.1.8}
\]
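To see what (8.1.8) amounts to in practice, one can replace $H^1_0(\Omega)$ by a finite-dimensional subspace; with piecewise-linear finite elements on $(0,1)$, (8.1.8) turns into the second-order ODE system $M\ddot{\mathbf u} + A\mathbf u = \mathbf F(t)$ for the nodal values. The sketch below is a hypothetical illustration, not part of the notes: it assembles the mass and stiffness matrices and advances the system with a centered explicit time stepping, with grid, data and time step chosen arbitrarily.

```python
import numpy as np

# Galerkin semi-discretization of (8.1.8) with P1 finite elements on (0,1):
# M u'' + A u = F(t), then a centered (leapfrog-type) time stepping.
nx = 100
h = 1.0 / nx
xi = np.linspace(h, 1.0 - h, nx - 1)            # interior nodes (homogeneous Dirichlet BC)

A = (np.diag(np.full(nx - 1, 2.0 / h))          # stiffness matrix
     + np.diag(np.full(nx - 2, -1.0 / h), 1)
     + np.diag(np.full(nx - 2, -1.0 / h), -1))
M = (np.diag(np.full(nx - 1, 4.0 * h / 6.0))    # consistent mass matrix
     + np.diag(np.full(nx - 2, h / 6.0), 1)
     + np.diag(np.full(nx - 2, h / 6.0), -1))
Minv = np.linalg.inv(M)

f = lambda t: np.sin(np.pi * xi) * np.cos(t)    # forcing term (illustrative)
F = lambda t: h * f(t)                          # lumped approximation of (f, phi_i)

dt, T = 0.5 * h, 2.0                            # dt small enough for stability
u_old = np.sin(np.pi * xi)                      # u0 (illustrative)
v0 = np.zeros_like(xi)                          # u1 = 0 (illustrative)
u = u_old + dt * v0 + 0.5 * dt**2 * (Minv @ (F(0.0) - A @ u_old))

t = dt
while t < T:
    u_new = 2.0 * u - u_old + dt**2 * (Minv @ (F(t) - A @ u))
    u_old, u, t = u, u_new, t + dt

print("max |u| at final time:", np.abs(u).max())
```

The explicit stepping requires a CFL-like restriction on the time step, which is why `dt` is tied to the mesh size above.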

8.2 An a priori Estimate

As we have already seen for parabolic problems, obtaining an a priori upper bound on the norm of
the solution $u$ is once again crucial in proving the well-posedness of problem (8.1.1).
Consider any $\tau \in (0,T]$. Taking $v = \frac{\partial u}{\partial t}(t)$ in (8.1.7) and integrating from $0$ to $\tau$ yields
\[
\int_0^\tau {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial^2 u}{\partial t^2}, \frac{\partial u}{\partial t} \Big\rangle_{H^1_0(\Omega)} dt
+ \int_0^\tau a\Big(u, \frac{\partial u}{\partial t}\Big)\, dt
= \int_0^\tau \Big(f, \frac{\partial u}{\partial t}\Big)\, dt\,. \tag{8.2.1}
\]
Now take $t_1 = 0$, $t_2 = \tau$, $w = \frac{\partial^2 u}{\partial t^2}$ and $v = \frac{\partial u}{\partial t}$ in (7.1.7) and use the initial condition to get
\[
\int_0^\tau {}_{H^{-1}(\Omega)}\Big\langle \frac{\partial^2 u}{\partial t^2}, \frac{\partial u}{\partial t} \Big\rangle_{H^1_0(\Omega)} dt
= \frac12 \Big[ \Big(\frac{\partial u}{\partial t}(\tau), \frac{\partial u}{\partial t}(\tau)\Big) - \Big(\frac{\partial u}{\partial t}(0), \frac{\partial u}{\partial t}(0)\Big) \Big]
= \frac12 \Big\| \frac{\partial u}{\partial t}(\tau) \Big\|^2_{L^2(\Omega)} - \frac12 \|u_1\|^2_{L^2(\Omega)}\,.
\]

Furthermore, the following identity holds for our bilinear symmetric form $a(u,v)$:
\[
a\Big(u, \frac{\partial u}{\partial t}\Big) = \frac12 \frac{d}{dt}\, a(u,u)\,.
\]
Indeed, for any $\Delta t \ne 0$,
\[
\begin{aligned}
\frac{1}{\Delta t}\Big[ a(u(t+\Delta t), u(t+\Delta t)) - a(u(t), u(t)) \Big]
&= \frac{1}{\Delta t}\Big[ a(u(t+\Delta t), u(t+\Delta t)) - a(u(t), u(t+\Delta t)) + a(u(t), u(t+\Delta t)) - a(u(t), u(t)) \Big] \\
&= a\Big( \frac{u(t+\Delta t) - u(t)}{\Delta t},\, u(t+\Delta t) \Big) + a\Big( u(t),\, \frac{u(t+\Delta t) - u(t)}{\Delta t} \Big)\,,
\end{aligned}
\]
and the result follows by letting $\Delta t \to 0$. Therefore,
\[
\int_0^\tau a\Big(u, \frac{\partial u}{\partial t}\Big)\, dt
= \frac12 \big[ a(u(\tau), u(\tau)) - a(u(0), u(0)) \big]
= \frac12 \|u(\tau)\|^2_{H^1_0(\Omega)} - \frac12 \|u_0\|^2_{H^1_0(\Omega)}\,.
\]

Finally, using the Cauchy-Schwarz inequality and the fact that $ab \le \frac{\varepsilon}{2} a^2 + \frac{1}{2\varepsilon} b^2$ for all $a, b \ge 0$, we get
\[
\int_0^\tau \Big(f, \frac{\partial u}{\partial t}\Big)\, dt
\le \int_0^\tau \|f(t)\|_{L^2(\Omega)} \Big\| \frac{\partial u}{\partial t}(t) \Big\|_{L^2(\Omega)} dt
\le \frac{\varepsilon}{2} \int_0^\tau \|f(t)\|^2_{L^2(\Omega)}\, dt
+ \frac{1}{2\varepsilon} \int_0^\tau \Big\| \frac{\partial u}{\partial t}(t) \Big\|^2_{L^2(\Omega)} dt\,,
\]
where $\varepsilon$ is any positive real parameter.


These results lead us to the following relationship:
\[
\frac12 \Big\| \frac{\partial u}{\partial t}(\tau) \Big\|^2_{L^2(\Omega)} - \frac12 \|u_1\|^2_{L^2(\Omega)}
+ \frac12 \|u(\tau)\|^2_{H^1_0(\Omega)} - \frac12 \|u_0\|^2_{H^1_0(\Omega)}
\le \frac{\varepsilon}{2} \|f\|^2_{L^2(0,\tau;\,L^2(\Omega))}
+ \frac{1}{2\varepsilon} \int_0^\tau \Big\| \frac{\partial u}{\partial t}(t) \Big\|^2_{L^2(\Omega)} dt\,;
\]
proceeding as in the parabolic case (the last integral is bounded by $T$ times the maximum of its integrand
over $[0,T]$, and the maximum over $\tau \in [0,T]$ is then taken on the left-hand side), we can further
manipulate it to obtain
\[
\max_{t\in[0,T]} \Big\| \frac{\partial u}{\partial t}(t) \Big\|^2_{L^2(\Omega)} + \max_{t\in[0,T]} \|u(t)\|^2_{H^1_0(\Omega)}
\le 2\Big( \|u_1\|^2_{L^2(\Omega)} + \|u_0\|^2_{H^1_0(\Omega)} + \varepsilon\, \|f\|^2_{L^2(0,T;\,L^2(\Omega))} \Big)
+ \frac{2T}{\varepsilon}\, \max_{t\in[0,T]} \Big\| \frac{\partial u}{\partial t}(t) \Big\|^2_{L^2(\Omega)}\,.
\]
Choosing $\varepsilon$ as a suitable multiple of $T$ (for instance $\varepsilon = 4T$), the last term can be absorbed into
the left-hand side, and this yields the final result.

Proposition 8.2.1. Any solution $u$ of Problem 8.1.2 satisfies the estimate
\[
\|u\|_{C^0([0,T];\,H^1_0(\Omega))} + \Big\| \frac{\partial u}{\partial t} \Big\|_{C^0([0,T];\,L^2(\Omega))}
\le C \Big[ \|u_0\|_{H^1_0(\Omega)} + \|u_1\|_{L^2(\Omega)} + \sqrt{T}\, \|f\|_{L^2(0,T;\,L^2(\Omega))} \Big]\,, \tag{8.2.2}
\]
where $C > 0$ is a constant depending only on the domain $\Omega$.


Remark 8.2.2. Observe that the right-hand side depends on $T$, which implies that the upper
bound becomes less and less tight as $T$ grows to infinity. Although this is not such a serious problem
from a theoretical viewpoint, it may be unfavourable in a numerical approach, where one would
like to control and confine as much as possible the error on the numerical solution.
If we want to improve the a priori upper bound (8.2.2) in this sense, we may, for instance,
control the norm of the solution $u$ by using a different norm of the forcing term $f$. In particular,
since $f \in L^2(0,T;L^2(\Omega))$ implies$^{1}$ $f \in L^1(0,T;L^2(\Omega))$, we have
\[
\int_0^\tau \Big(f, \frac{\partial u}{\partial t}\Big)\, dt
\le \int_0^\tau \|f(t)\|_{L^2(\Omega)} \Big\| \frac{\partial u}{\partial t}(t) \Big\|_{L^2(\Omega)} dt
\le \Big( \int_0^\tau \|f(t)\|_{L^2(\Omega)}\, dt \Big) \max_{t\in[0,\tau]} \Big\| \frac{\partial u}{\partial t}(t) \Big\|_{L^2(\Omega)}
\le \|f\|_{L^1(0,T;\,L^2(\Omega))} \Big\| \frac{\partial u}{\partial t} \Big\|_{C^0([0,\tau];\,L^2(\Omega))}
\le \frac{\varepsilon}{2} \Big\| \frac{\partial u}{\partial t} \Big\|^2_{C^0([0,\tau];\,L^2(\Omega))}
+ \frac{1}{2\varepsilon} \|f\|^2_{L^1(0,T;\,L^2(\Omega))}\,,
\]
in which the constants no longer depend on $T$.

$^{1}$ Given a set $A$, we recall that the inclusion $L^2(A) \subset L^1(A)$ holds, provided $A$ has finite Lebesgue measure;
in this case, there exists a constant $C > 0$ such that $\|v\|_{L^1(A)} \le C \|v\|_{L^2(A)}$ for all $v \in L^2(A)$, which shows that
the $L^1$-norm is weaker than the $L^2$-norm. Nevertheless, these two norms are not equivalent, since the converse need
not be true; consider, e.g., the function $v(x) = 1/\sqrt{x}$ on $A = [0,1]$.
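A small numerical illustration of why this alternative bound can be preferable for large $T$ (the specific forcing profile is an assumption made only for this example): if $\|f(t)\|_{L^2(\Omega)} = e^{-t}$, then $\|f\|_{L^1(0,T;L^2(\Omega))}$ stays bounded as $T$ grows, whereas the factor $\sqrt{T}\,\|f\|_{L^2(0,T;L^2(\Omega))}$ appearing in (8.2.2) does not.

```python
import math

# Compare the two norms of the forcing term entering the a priori bounds,
# for the illustrative profile ||f(t)||_{L^2(Omega)} = exp(-t).
for T in [1.0, 10.0, 100.0, 1000.0]:
    l1_norm = 1.0 - math.exp(-T)                           # integral of e^{-t} over (0,T)
    l2_norm = math.sqrt((1.0 - math.exp(-2.0 * T)) / 2.0)  # sqrt of integral of e^{-2t}
    print(f"T = {T:7.1f}   ||f||_L1 = {l1_norm:.4f}   sqrt(T)*||f||_L2 = {math.sqrt(T) * l2_norm:.4f}")
```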

8.3 Well-Posedness of the Problem

Using the a priori upper bound (8.2.2) and following the guidelines of the parabolic case, one gets
the following result.
Theorem 8.3.1. Problem 8.1.2 admits one and only one solution, for which the bound (8.2.2)
holds. Furthermore, the solution depends continuously on the data in the norms which appear in
this bound.
Proof. (Sketch). Let us just detail the construction of the approximants of the solution by an
eigenfunction expansion; the rest of the argument is similar to the parabolic case.
Let us denote, as usual, by $\{\lambda_n, w_n\}$ the eigenvalue-eigenfunction pairs of the Laplacian with
Dirichlet boundary conditions; then we can represent the data as

\[
f(t) = \sum_{n=1}^{+\infty} f_n(t)\, w_n\,, \quad f_n(t) = (f(t), w_n)\,, \qquad
u_0 = \sum_{n=1}^{+\infty} \hat u_{0,n}\, w_n\,, \quad \hat u_{0,n} = (u_0, w_n)\,, \qquad
u_1 = \sum_{n=1}^{+\infty} \hat u_{1,n}\, w_n\,, \quad \hat u_{1,n} = (u_1, w_n)\,,
\]


and we can formally expand the solution as
\[
u(t) = \sum_{n=1}^{+\infty} \hat u_n(t)\, w_n\,, \qquad
\frac{\partial u}{\partial t}(t) = \sum_{n=1}^{+\infty} \hat u_n'(t)\, w_n\,, \quad
\hat u_n'(t) = \Big( \frac{\partial u}{\partial t}(t), w_n \Big)\,. \tag{8.3.1}
\]
Substituting into equation (8.1.8) and choosing as $v$ each eigenfunction $w_m$ gives
\[
\frac{d^2}{dt^2}\Big( \sum_{n=1}^{+\infty} \hat u_n(t)\, w_n,\, w_m \Big)
+ a\Big( \sum_{n=1}^{+\infty} \hat u_n(t)\, w_n,\, w_m \Big)
= \Big( \sum_{n=1}^{+\infty} f_n(t)\, w_n,\, w_m \Big)\,, \qquad m = 1, 2, \ldots\,.
\]

Recalling that
\[
(w_n, w_m) = \delta_{n,m} \qquad \text{and} \qquad a(w_n, w_m) = \lambda_n\, \delta_{n,m}\,,
\]
we obtain
\[
\hat u_m''(t) + \lambda_m\, \hat u_m(t) = f_m(t)\,, \qquad m = 1, 2, \ldots\,,
\]

together with the conditions
\[
u(0) = \sum_{n=1}^{+\infty} \hat u_n(0)\, w_n = u_0 = \sum_{n=1}^{+\infty} \hat u_{0,n}\, w_n
\quad \Longrightarrow \quad \hat u_m(0) = \hat u_{0,m} \quad \forall m \ge 1\,,
\]
\[
\frac{\partial u}{\partial t}(0) = \sum_{n=1}^{+\infty} \hat u_n'(0)\, w_n = u_1 = \sum_{n=1}^{+\infty} \hat u_{1,n}\, w_n
\quad \Longrightarrow \quad \hat u_m'(0) = \hat u_{1,m} \quad \forall m \ge 1\,;
\]
thus, each generalized Fourier coefficient of the solution $u$ is the solution of the following initial-value
problem for a linear second-order differential equation:
\[
\begin{cases}
\hat u_m''(t) + \lambda_m\, \hat u_m(t) = f_m(t)\,,\\
\hat u_m(0) = \hat u_{0,m}\,, \quad \hat u_m'(0) = \hat u_{1,m}\,,
\end{cases}
\qquad m = 1, 2, \ldots\,,
\]

which can be rewritten in canonical form as a first-order system by defining $v_m(t) = \hat u_m'(t)$:
\[
\begin{cases}
\hat u_m'(t) = v_m(t)\,,\\
v_m'(t) = -\lambda_m\, \hat u_m(t) + f_m(t)\,,\\
\hat u_m(0) = \hat u_{0,m}\,,\\
v_m(0) = \hat u_{1,m}\,,
\end{cases}
\qquad m = 1, 2, \ldots\,,
\]
and then introducing the vector $\boldsymbol{v}_m(t) = (\hat u_m(t), v_m(t))^T$:
\[
\boldsymbol{v}_m'(t) = \begin{pmatrix} 0 & 1 \\ -\lambda_m & 0 \end{pmatrix} \boldsymbol{v}_m(t)
+ \begin{pmatrix} 0 \\ f_m(t) \end{pmatrix}\,,
\qquad
\boldsymbol{v}_m(0) = \begin{pmatrix} \hat u_{0,m} \\ \hat u_{1,m} \end{pmatrix}\,,
\qquad m = 1, 2, \ldots\,.
\]
The solution of this differential system can be written formally as
\[
\boldsymbol{v}_m(t) = e^{tA_m} \begin{pmatrix} \hat u_{0,m} \\ \hat u_{1,m} \end{pmatrix}
+ \int_0^t e^{(t-\tau) A_m} \begin{pmatrix} 0 \\ f_m(\tau) \end{pmatrix} d\tau\,, \tag{8.3.2}
\]

where we have set $A_m = \begin{pmatrix} 0 & 1 \\ -\lambda_m & 0 \end{pmatrix}$. This matrix has the eigenvalues $\pm i\sqrt{\lambda_m}$, thus it
can be diagonalized using a nonsingular matrix $P \in \mathbb{C}^{2,2}$ such that
\[
A_m P = P \begin{pmatrix} i\sqrt{\lambda_m} & 0 \\ 0 & -i\sqrt{\lambda_m} \end{pmatrix}\,,
\]
which also yields the explicit expression of the exponential matrix
\[
e^{tA_m} = P \begin{pmatrix} e^{i\sqrt{\lambda_m}\, t} & 0 \\ 0 & e^{-i\sqrt{\lambda_m}\, t} \end{pmatrix} P^{-1}\,.
\]
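For a single mode, formula (8.3.2) can be evaluated directly; the sketch below (with an illustrative eigenvalue, forcing coefficient and data, none of them from the notes) computes it numerically, approximating the Duhamel integral with a trapezoidal sum, and cross-checks the result against a standard ODE integrator.

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

# One Fourier mode: u_m'' + lam*u_m = f_m(t), written as v' = A v + (0, f_m)^T
# with A = [[0, 1], [-lam, 0]], and solved through formula (8.3.2).
lam = 4.0                                   # illustrative eigenvalue
A = np.array([[0.0, 1.0], [-lam, 0.0]])
fm = lambda t: np.sin(t)                    # illustrative forcing coefficient
v0 = np.array([1.0, 0.0])                   # (u_{0,m}, u_{1,m}), illustrative data

def v_formula(t, ns=2000):
    s = np.linspace(0.0, t, ns)
    duh = np.array([expm((t - si) * A) @ np.array([0.0, fm(si)]) for si in s])
    w = np.full(ns, t / (ns - 1)); w[0] *= 0.5; w[-1] *= 0.5   # trapezoidal weights
    return expm(t * A) @ v0 + (w[:, None] * duh).sum(axis=0)

sol = solve_ivp(lambda t, v: A @ v + np.array([0.0, fm(t)]),
                (0.0, 3.0), v0, rtol=1e-10, atol=1e-12)
print(v_formula(3.0))        # (u_m(3), u_m'(3)) from formula (8.3.2)
print(sol.y[:, -1])          # should agree up to the quadrature error of the Duhamel term
```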

8.4 Qualitative properties of the solution

We briefly comment on certain properties of the solution of a second-order hyperbolic problem, as
far as regularity, conservation and time reversibility are concerned.

Regularity of the solution
A hyperbolic operator, unlike a parabolic one, does not smooth the solution as time increases;
roughly speaking, we may say that it simply mixes the data together. This property, related to
the fact that the operator describes a propagation phenomenon, was already observed in Chap.
1, where we provided the analytical expression of the solution in the one-dimensional case. The
spectral decomposition of the solution, given above, yields another point of view.
Indeed, assume that $f \equiv 0$, i.e., $f_m(t) = 0$ for all $m \ge 1$. Then, the general expression (8.3.2)
gives
\[
\hat u_m(t) = C_1\, e^{i\omega_m t} + C_2\, e^{-i\omega_m t}\,,
\]
with $\omega_m = \sqrt{\lambda_m}$. The coefficients $C_1, C_2$ depend on $\hat u_{0,m}$ and $\hat u_{1,m}$ but not on $t$. As a consequence,
if we choose, respectively, $u_0$ and $u_1$ to lie in $H^1_0(\Omega)$ and in $L^2(\Omega)$, the solution $u$ does not become
more and more regular as time goes by, since the terms $e^{\pm i\omega_m t}$ do not decay in time. In other
words, the series
\[
\sum_{n=1}^{+\infty} \lambda_n^s\, |\hat u_n(t)|^2
\]
converges in general only for $s = 0$ and $s = 1$, according to the fact that $u \in L^2(0,T;H^1_0(\Omega))$.
If $f$ is not zero, an analogous result can be obtained.
In order to have a smooth solution (in the Sobolev or classical scale) one has to assume
sufficiently smooth initial data and forcing term, plus of course the smoothness of the domain if
one is interested in the regularity up to the boundary.
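The contrast with the parabolic case can be made concrete on the Fourier coefficients themselves (the sketch below uses the standard modal solutions with $f = 0$ and $u_1 = 0$, and an illustrative, slowly decaying choice of initial coefficients): the heat-equation mode $n$ is damped by the factor $e^{-\lambda_n t}$, while the wave-equation mode only oscillates.

```python
import numpy as np

# Smoothing vs. no smoothing, mode by mode (f = 0, u1 = 0):
#   heat:  u_n(t) = u_n(0) * exp(-lam_n * t)      -> strong decay for large n
#   wave:  u_n(t) = u_n(0) * cos(sqrt(lam_n) * t) -> bounded oscillation, no decay
n = np.array([1, 10, 50, 100])
lam = (n * np.pi) ** 2          # Dirichlet eigenvalues of -d^2/dx^2 on (0, 1)
u0_hat = 1.0 / n                # illustrative initial coefficients
t = 0.1

print("n          :", n)
print("heat modes :", u0_hat * np.exp(-lam * t))
print("wave modes :", u0_hat * np.cos(np.sqrt(lam) * t))
```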
Conservation properties
Let us consider an elastic membrane fixed at its boundary, which oscillates starting from an initial
position $u_0$ with an initial speed $u_1$. The dimensionless mathematical model describing such a
phenomenon is
\[
\begin{cases}
\dfrac{\partial^2 u}{\partial t^2} - \Delta u = 0 & \text{in } \Omega \times (0, \infty)\,,\\[2pt]
u = 0 & \text{on } \partial\Omega \times (0, \infty)\,,\\[2pt]
u = u_0\,, \quad \dfrac{\partial u}{\partial t} = u_1 & \text{on } \Omega \times \{0\}\,,
\end{cases}
\tag{8.4.1}
\]
that is, a linear hyperbolic problem with, in particular, the forcing term $f$ equal to zero. Due to
this, the a priori upper bound derived from (8.2.1) can now be written as an equality:
\[
\frac12 \Big\| \frac{\partial u}{\partial t}(\tau) \Big\|^2_{L^2(\Omega)} + \frac12 \|u(\tau)\|^2_{H^1_0(\Omega)}
= \frac12 \|u_1\|^2_{L^2(\Omega)} + \frac12 \|u_0\|^2_{H^1_0(\Omega)}\,, \qquad \forall \tau \ge 0\,.
\]
Thus, the expression


\[
E(t) = \frac12 \Big\| \frac{\partial u}{\partial t}(t) \Big\|^2_{L^2(\Omega)} + \frac12 \|u(t)\|^2_{H^1_0(\Omega)}
\]
denotes a quantity which does not vary during the evolution in time, i.e.,
\[
\frac{d}{dt} E(t) \equiv 0\,;
\]
in other words, the quantity $E(t)$ is conserved in time. Physically, it represents the membrane's
dimensionless total energy, and, more precisely,
\[
\frac12 \Big\| \frac{\partial u}{\partial t}(t) \Big\|^2_{0,\Omega} \ \text{ is the dimensionless kinetic energy}\,, \qquad
\frac12 |u(t)|^2_{1,\Omega} \ \text{ is the dimensionless elastic energy}\,.
\]

Instead, the following problem
\[
\begin{cases}
\dfrac{\partial^2 u}{\partial t^2} + \gamma \dfrac{\partial u}{\partial t} - \Delta u = 0 & \text{in } \Omega \times (0, \infty)\,,\\[2pt]
u = 0 & \text{on } \partial\Omega \times (0, \infty)\,,\\[2pt]
u = u_0\,, \quad \dfrac{\partial u}{\partial t} = u_1 & \text{on } \Omega \times \{0\}\,,
\end{cases}
\]
with $\gamma > 0$, models the motion of the same membrane without neglecting friction forces: these are
proportional to the velocity and are taken into account by the term $\gamma \frac{\partial u}{\partial t}$.
The a priori upper bound now reads
\[
\gamma \int_0^\tau \Big\| \frac{\partial u}{\partial t}(t) \Big\|^2_{L^2(\Omega)} dt
+ \frac12 \Big\| \frac{\partial u}{\partial t}(\tau) \Big\|^2_{L^2(\Omega)} + \frac12 \|u(\tau)\|^2_{H^1_0(\Omega)}
= \frac12 \|u_1\|^2_{L^2(\Omega)} + \frac12 \|u_0\|^2_{H^1_0(\Omega)}\,, \qquad \forall \tau \ge 0\,,
\]
and we can see that, due to the dissipation term on the left-hand side, one has
\[
\gamma \int_0^\tau \Big\| \frac{\partial u}{\partial t}(t) \Big\|^2_{L^2(\Omega)} dt > 0\,, \quad \text{strictly increasing with } \tau\,,
\]
and consequently
\[
\frac{d}{dt} E(t) < 0\,,
\]
i.e., the total energy of the system is not conserved.
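Both behaviours can be checked on the exact modal solutions of the one-dimensional analogue on $(0,1)$ (a hypothetical worked example, not from the notes): with $f = 0$, each Fourier coefficient of the undamped problem is a pure oscillation and $E(t)$ is constant, while adding the damping term $\gamma\,\partial u/\partial t$ makes $E(t)$ decrease.

```python
import numpy as np

# Modal energy check on (0,1) with Dirichlet eigenpairs lam_n = (n*pi)^2:
#   undamped: c_n'' + lam_n c_n = 0
#   damped:   c_n'' + gamma c_n' + lam_n c_n = 0  (underdamped here: lam_n > gamma^2/4)
# E(t) = 1/2 sum_n ( c_n'(t)^2 + lam_n c_n(t)^2 ).  All data are illustrative.
n = np.arange(1, 6)
lam = (n * np.pi) ** 2
om = np.sqrt(lam)
a = 1.0 / n**2                    # coefficients of u0
b = 1.0 / n                       # coefficients of u1

def energy(t, gamma=0.0):
    if gamma == 0.0:
        c = a * np.cos(om * t) + (b / om) * np.sin(om * t)
        cdot = -a * om * np.sin(om * t) + b * np.cos(om * t)
    else:
        mu = np.sqrt(lam - gamma**2 / 4.0)
        B = (b + gamma * a / 2.0) / mu
        damp = np.exp(-gamma * t / 2.0)
        c = damp * (a * np.cos(mu * t) + B * np.sin(mu * t))
        cdot = damp * (b * np.cos(mu * t) - (a * mu + gamma * B / 2.0) * np.sin(mu * t))
    return 0.5 * np.sum(cdot**2 + lam * c**2)

for t in [0.0, 0.5, 1.0, 2.0]:
    print(f"t = {t:3.1f}   E (gamma=0) = {energy(t):.6f}   E (gamma=0.5) = {energy(t, 0.5):.6f}")
# first column: constant;  second column: decreasing
```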


Time reversibility
Linear hyperbolicity allows us to reverse the time axis, and reconstruct the solution from knowing
its values at a final time, instead of at an initial time. In other words, the retrograde boundary-value problem
\[
\begin{cases}
\dfrac{\partial^2 w}{\partial t^2} - \Delta w = 0 & \text{in } Q\,,\\[2pt]
w = 0 & \text{on } \partial\Omega \times (0,T)\,,\\[2pt]
w = w_0\,, \quad \dfrac{\partial w}{\partial t} = w_1 & \text{on } \Omega \times \{T\}\,,
\end{cases}
\tag{8.4.2}
\]
is well posed, as is the forward problem. Indeed, applying the change of variable $\tau = T - t$ and
setting $u(x, \tau) = w(x, T - \tau)$, we have
\[
\frac{\partial u}{\partial \tau}(x, \tau) = -\frac{\partial w}{\partial t}(x, T - \tau)\,, \qquad
\frac{\partial^2 u}{\partial \tau^2}(x, \tau) = \frac{\partial^2 w}{\partial t^2}(x, T - \tau)\,, \qquad
\Delta u(x, \tau) = \Delta w(x, T - \tau)\,,
\]
and moreover
\[
u(0) = w(T) = w_0\,, \qquad \frac{\partial u}{\partial \tau}(0) = -\frac{\partial w}{\partial t}(T) = -w_1\,,
\]
so that $u$ satisfies the standard problem
\[
\begin{cases}
\dfrac{\partial^2 u}{\partial \tau^2} - \Delta u = 0 & \text{in } Q\,,\\[2pt]
u = 0 & \text{on } \partial\Omega \times (0,T)\,,\\[2pt]
u = w_0\,, \quad \dfrac{\partial u}{\partial \tau} = -w_1 & \text{on } \Omega \times \{0\}\,.
\end{cases}
\tag{8.4.3}
\]
Roughly speaking, we may say that $w$ solves the problem (8.4.2) backwards in time.
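This reversibility can also be seen numerically (a small illustrative experiment, not from the notes): the centered finite-difference scheme for the wave equation is itself symmetric in time, so integrating forward and then applying the same recurrence backwards returns the initial datum up to round-off.

```python
import numpy as np

# Forward-then-backward integration of u_tt - u_xx = 0 on (0,1) with a centered
# scheme; the time-symmetric recurrence recovers the initial state almost exactly.
nx, nsteps = 200, 400
x = np.linspace(0.0, 1.0, nx + 1)
h = x[1] - x[0]
dt = 0.5 * h                          # satisfies the stability condition dt <= h

def lap(v):
    out = np.zeros_like(v)            # homogeneous Dirichlet conditions
    out[1:-1] = (v[2:] - 2.0 * v[1:-1] + v[:-2]) / h**2
    return out

u0 = np.sin(2 * np.pi * x)            # initial position (illustrative), initial speed = 0
u_prev, u_curr = u0, u0 + 0.5 * dt**2 * lap(u0)

for _ in range(nsteps):               # forward in time
    u_prev, u_curr = u_curr, 2.0 * u_curr - u_prev + dt**2 * lap(u_curr)

w_next, w_curr = u_curr, u_prev
for _ in range(nsteps):               # same recurrence, read backwards in time
    w_next, w_curr = w_curr, 2.0 * w_curr - w_next + dt**2 * lap(w_curr)

print("max |recovered u0 - u0| =", np.abs(w_curr - u0).max())   # ~ round-off level
```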
This, however, no longer holds if we include the damping term $\gamma \frac{\partial w}{\partial t}$, since in dissipative phenomena
the time axis cannot be reversed: indeed, the previous transformation would give the term $-\gamma \frac{\partial u}{\partial \tau}$ in
(8.4.3), with the wrong sign!
The fact that a dissipative retrograde problem is not well posed is nothing but the mathematical
counterpart of the physical concept of entropy.

8.5 Exercises

8.1. [... to be added ...]
