1998 - Reddy - Introductory Functional Analysis
Editors
J.E. Marsden
L. Sirovich
M. Golubitsky
W. Jäger
F. John (deceased)
Advisor
G. Iooss
Introductory Functional
Analysis
With Applications to Boundary Value Problems
and Finite Elements
Springer
B. Daya Reddy
Department of Mathematics and
Applied Mathematics
University of Cape Town
7700 Rondebosch
South Africa
Series Editors
J. E. Marsden
Control and Dynamical Systems, 116-81
California Institute of Technology
Pasadena, CA 91125
USA

L. Sirovich
Division of Applied Mathematics
Brown University
Providence, RI 02912
USA

M. Golubitsky
Department of Mathematics
University of Houston
Houston, TX 77204-3476
USA

W. Jäger
Department of Applied Mathematics
Universität Heidelberg
Im Neuenheimer Feld 294
69120 Heidelberg
Germany
ISBN 978-1-4612-6824-6 ISBN 978-1-4612-0575-3 (eBook)
DOI 10.1007/978-1-4612-0575-3 SPIN 10557902
Series Preface
task facing the engineer or applied scientist was thus quite daunting. For-
tunately the situation has progressed markedly since then. There is now
available a wide range of texts that present functional analysis, often with
one or more applications taken from engineering and physics, in a manner
accessible to readers not having the standard prerequisites. The styles dif-
fer, sometimes quite considerably, from one text to another, although this
is not a bad thing given the diversity of interests and backgrounds of the
potential readership.
This text is a further addition to the set of books that present functional
analysis and its applications to nonspecialists. The approach taken
is, first, to assume that readers have no more by way of relevant background
than elementary courses in linear algebra, vector analysis, and differential
equations, and wish to learn the elements of linear functional analysis.
The book begins with an introductory chapter, which is somewhat in
the nature of a prologue, and which presents in mostly descriptive form
a motivation for studying functional analysis from the viewpoint of those
involved in the study of problems from physics and engineering. The
remainder of the book is then divided into parts: Part I is devoted to linear
functional analysis, Part II to an introduction to elliptic boundary value
problems, and Part III comprises a study of the finite element method.
Two applications are treated in detail in this text: elliptic boundary
value problems and the finite element method. In both cases any prior
exposure to these areas will represent an advantage to those using this
book; indeed, it is expected that such prior exposure will in many cases
have provided the motivation to study the material presented here. The
presentation of these applications starts more or less at the beginning,
so that those having no background in these areas could use this text to
acquire such background. On the other hand, it may be the case that the
motivation to learn functional analysis arises from an interest in an area of
application other than those treated in this text. Such readers might well
prefer to focus on Part I of the book.
The incorporation of applications and other illustrative material is
approached in two distinct ways. In Part I of the book new concepts, often
of an abstract nature, are rendered more accessible by the copious use
of concrete worked examples. There is little reference in this part of the
book to applications in physics and engineering, for the simple reason that
such examples are less well suited to laying bare the essential features of
the many new concepts that accompany any introduction to functional
analysis. In Parts II and III, it is appropriate and desirable to illustrate
abstract concepts by recourse to concrete problems and examples taken
from physics and engineering, and this is the approach taken here. I have
used as examples problems such as heat conduction, as well as problems in
solid and structural mechanics (elasticity, beams, and plates) and return
regularly in Parts II and III to these examples in order to motivate and
B.D.R.
Cape Town
April 1997
Contents
Series Preface
Preface
Introduction
1 Sets
1.1 The algebra of sets
1.2 Sets of numbers
1.3 ℝⁿ and its subsets
1.4 Relations, equivalence classes, and Zorn's lemma
1.5 Theorem proving
1.6 Bibliographical remarks
1.7 Exercises
References
Index
Introduction
such applications, and in which ways it is useful. For this reason we present
in this introductory chapter an overview of how boundary value problems
are encountered, what kinds of mathematical questions arise in their
treatment, and where functional analysis fits into the general scheme of things.
The treatment is deliberately sketchy in its mathematical detail, since the
aim here is to identify important beacons or landmarks, rather than to flesh
out all their mathematical features; this latter task forms the bulk of this
text.
Boundary value problems almost always arise as mathematical models of
some real-life situation, whether of a physical, biological, economic, or other
nature. We place the planned excursion in a concrete physical context in
order to be able to show how the mathematics interacts with the physical
dictates of the problem. The main vehicle chosen for the discussion in
this chapter is the physical problem of heat conduction or, equivalently, of
diffusion and, subsequently, its steady (that is, time-independent) variants.
When arriving at the steady-state case we are able to make contact also
with other problems that have the same mathematical formulation, viz.
electrostatics, and the problem of the deflection of an elastic membrane.
We proceed now to examine the various stages that arise in the consid-
eration of these physical problems, and their mathematical realizations.
The next stage is to translate each of these terms into mathematical form.
We do this by applying balance of energy to an arbitrary part Ω' of the
body Ω; the arbitrary region has a bounding surface Γ' (see Figure 1).
Now the thermal energy in a body is quantified by the heat capacity c,
which is the amount of heat generated per unit mass, and per unit rise in
temperature. If the mass density is denoted by ρ and the temperature by u,
then the total thermal energy in Ω' at a particular time is therefore given
by
is given by
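\[ -\int_{\Gamma'} q \cdot \nu \, dA = -\int_{\Omega'} \operatorname{div} q \, dV. \]
\[ \frac{d}{dt}\int_{\Omega'} c(x)\rho(x)u(x,t)\,dV = \int_{\Omega'} f(x,t)\,dV - \int_{\Omega'} \operatorname{div} q\,dV \]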
Now the time derivative may be taken inside the integral, since the limits
of integration are fixed; it then becomes the partial derivative ∂/∂t, and
we now have, upon rearrangement,
Since the volume under consideration is arbitrary, and the functions appear-
ing in the integrand are assumed to be sufficiently smooth, the integrand
must vanish in order for this equation to hold true. This observation leads
to the preliminary form
\[ c\rho \frac{\partial u}{\partial t} + \operatorname{div} q = f \tag{5} \]
\[ q = -K \nabla u, \tag{6} \]
hot to cold. Now substitution of Fourier's law in the energy equation and
division throughout by cρ give, finally,
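\[ \frac{\partial u}{\partial t} - \frac{1}{c\rho}\operatorname{div}(K\nabla u) = Q, \tag{7} \]
where we have set Q = f/(cρ). In full, equation (7) reads
\[ \text{PDE:}\quad \frac{\partial u}{\partial t} - \frac{1}{c\rho}\operatorname{div}(K\nabla u) = Q \quad \text{in } \Omega,\ t > 0 \]
\[ \text{BCs:}\quad u = \bar{u}(x,t) \ \text{on } \Gamma_u \quad \text{and} \quad \frac{\partial u}{\partial \nu} = g(x,t) \ \text{on } \Gamma_q \]
\[ \text{PDE:}\quad -\frac{1}{c\rho}\operatorname{div}(K\nabla u) = Q \]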
Finally, this BVP takes an even simpler form if the problem is homogeneous,
that is, if the density, specific heat, and thermal conductivity are
constant. The problem now becomes that shown in Box 3; the PDE there
is known as the Poisson equation and the operator ∇² on the left-hand side
is the Laplacian, defined by
\[ \nabla^2 u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}. \tag{8} \]
PDE:
mentioned earlier, the heat equation is also known as the diffusion equation
because it serves as a model for diffusion. Likewise, the Poisson equation
models a wide range of physical phenomena.
To make the point we consider as a further example the case of
electrostatics. Suppose that we are given a distribution of stationary electric
charges in a region Ω in space; this distribution may be specified by a scalar
function ρ which gives the charge per unit volume, or charge density, at any
point. The charge density in turn gives rise to a vector force field known
as the electric field, and denoted by E. The electric field at a point x gives
the force per unit charge acting on a charge located at x.
Now, just as we considered in Example 1 the relationship between the
flux of heat through the boundary Γ' of an arbitrary region Ω' and the
change of heat inside that region, in the same way we may consider the
relationship between the flux of the electric field through Γ' (using the
same notation as in Example 1), and the total charge inside the region
enclosed by Γ'. The result is Gauss's law, which states that
\[ \int_{\Gamma'} E \cdot \nu \, dA = 4\pi \int_{\Omega'} \rho \, dV; \tag{9} \]
that is, the flux of the electric field E through any closed surface equals 4π
times the total charge enclosed by that surface. By exploiting the divergence
theorem of Gauss, the surface integral on the left-hand side of (9) can be
converted to a volume integral, and this way we arrive at the counterpart
of (5); that is,
\[ \int_{s_1}^{s_2} E \cdot \tau \, ds \]
is independent of the curve chosen to join x₁ and x₂. Here τ is the unit
tangent vector along the curve (Figure 2). An immediate mathematical
consequence is that it is possible to express the electric field as the gradient
of an electric potential function φ; that is,
\[ E = -\nabla\phi. \tag{11} \]
The minus sign takes care of the fact that the electric field points in the
direction of decreasing potential. Figure 2 shows schematically the curves
of constant potential, and a few of the electric field vectors, in the vicinity
of a uniformly charged disk.
FIGURE 2. Curves of constant potential and electric field vectors in the plane
normal to, and passing through the center of, a uniformly charged disk
Now we are ready to formulate the problem of determining the electric
field: by substituting (11) in (10) we obtain again a Poisson equation
(12)
Again, in the context of the membrane problem, this represents the
statement of the principle of minimum potential energy. Exactly what is meant
by an admissible function is a matter that takes up some time when BVPs
are discussed in full detail, but it suffices for this preliminary overview that
we consider functions which satisfy two properties:
(i) they are continuously differentiable; that is, the functions and their
derivatives are continuous on Ω̄, where Ω̄ denotes the region Ω
together with its boundary Γ. In this way the integrand in (13) makes
sense;
and
Now, when one wishes to find the minimum of a function of a single variable
h(x), say, then of course this minimum (assuming it exists) is characterized
by the necessary condition h'(x₀) = 0, x₀ being the point at which the
minimum is attained. The case such as that in Box 5, where it is required to
find a function that minimizes a given functional, is not dissimilar, despite
its greater generality. Indeed, suppose we assume that a minimum does
exist, and that this minimum is achieved at the function u. If we replace v
by u + εv, where v is arbitrary, although a member of X, then we may treat
J(u + εv) as a function of the single variable ε, and write J(u + εv) ≡ F(ε),
say. A minimum is then achieved at ε = 0, so the condition for a minimum
is therefore that
-E
Jrrfv dx - Jj'
fu dx.
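\[ \int_\Omega \nabla u \cdot \nabla v \, dV = \int_\Omega f v \, dV \quad \text{for all } v \in X \]
\[ -\int_\Omega v \nabla^2 u \, dV = \int_\Omega f v \, dV. \]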
Now the left-hand side may be transformed by using the divergence theo-
rem, according to which
\[ -\int_\Omega v \nabla^2 u \, dV = -\int_\Gamma \frac{\partial u}{\partial \nu} v \, dA + \int_\Omega \nabla u \cdot \nabla v \, dV \]
The first two questions need no clarification; in the case of the third we
are asking whether the solution changes by only a small amount if the
data are changed by a small amount (of course, what is meant by "small
amount" must be made clear). A problem for which small changes in data
cause wild fluctuations in the solution is clearly unstable, and one would
be tempted in such a case to reconsider whether the mathematical model
does indeed represent reality adequately and, if so, exactly how to interpret
such sensitivity.
If the answer to all three questions is in the affirmative, then the problem
is said to be well-posed. Armed with such knowledge, which can only properly
be obtained with the aid of the tools of functional analysis, the process
of seeking approximate solutions can then proceed from a firm base.
It is possible in some circumstances to construct counterexamples which
demonstrate that a problem does not have a unique solution. It is also
possible in such cases to obtain necessary conditions for existence of a solu-
tion. That is, with a minimum of manipulation we can establish conditions
satisfied either by the solution or by the data, assuming that a solution
does exist. Take as an example the BVP in Box 3, but assume that the
boundary condition takes the form
\[ \frac{\partial u}{\partial \nu} = g \quad \text{on } \Gamma; \tag{15} \]
in other words, if the physical problem is heat conduction, the heat flux is
given on the entire boundary.
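Now in fact it is easy to see that this problem does not have a unique
solution; indeed, if u is a solution, then so is u + c, where c is any constant,
since
\[ -\nabla^2 (u + c) = -\nabla^2 u = f \quad \text{on } \Omega \]
and
\[ \frac{\partial}{\partial \nu}(u + c) = \frac{\partial u}{\partial \nu} = g \quad \text{on } \Gamma. \]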
Physically, we may add a constant temperature to the body, and the re-
sulting temperature distribution would still be consistent with the mathe-
matical model.
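The effect of adding a constant can be checked on a discrete analogue. The sketch below is an illustration only, not part of the text's argument: it assumes a one-dimensional body on [0, 1], a uniform grid, and a second-difference approximation of −u'' with mirrored end values standing in for a zero-flux boundary. The function name and the sample temperature field are hypothetical.

```python
def apply_neumann_operator(u, h):
    """Second-difference approximation of -u'' at every node, using
    mirrored ghost values so that the flux du/dx vanishes at both ends."""
    n = len(u)
    result = []
    for i in range(n):
        left = u[i - 1] if i > 0 else u[1]            # ghost value mirrors node 1
        right = u[i + 1] if i < n - 1 else u[n - 2]   # ghost value mirrors node n-2
        result.append((2.0 * u[i] - left - right) / h ** 2)
    return result


h = 0.1
u = [(i * h) ** 2 * (1.0 - i * h) ** 2 for i in range(11)]  # sample temperature field
shifted = [value + 5.0 for value in u]                      # add a constant temperature

f_from_u = apply_neumann_operator(u, h)
f_from_shifted = apply_neumann_operator(shifted, h)

# Largest difference between the two sets of data; rounding error only.
max_gap = max(abs(p - q) for p, q in zip(f_from_u, f_from_shifted))

# A constant field is mapped to zero data.
constant_image = apply_neumann_operator([5.0] * 11, h)
```

Both fields produce the same data up to rounding, so the data alone cannot distinguish u from u + c in this discrete setting either.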
It is possible to go even further, and to show that a solution will exist only
if the data satisfy a particular condition; to see this we integrate the Poisson
equation over Ω and make use of the divergence theorem and the boundary
condition (15), to find that
\[ -\int_\Omega f \, dV = \int_\Omega \operatorname{div}(\nabla u) \, dV = \int_\Gamma \frac{\partial u}{\partial \nu} \, dA = \int_\Gamma g \, dA. \]
That is,
\[ \int_\Omega f \, dV + \int_\Gamma g \, dA = 0. \tag{16} \]
Thus it is not possible to find a solution to the problem unless the data
f and g satisfy the compatibility condition (16). For the problem of heat
conduction this asserts that the net amount of heat generated in the body
must be zero; such a condition makes perfect sense since otherwise we could
not expect to have a steady problem.
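A one-dimensional analogue makes the compatibility condition concrete. In the sketch below, which is an illustration under stated assumptions rather than anything from the text, the body is the interval [0, 1], the equation is −u'' = f, and the outward normal derivatives at the two ends are g_left = −u'(0) and g_right = u'(1), so that (16) becomes ∫f dx + g_left + g_right = 0. The helper names and the two data sets are hypothetical.

```python
import math


def trapezoid(func, a, b, n=1000):
    """Composite trapezoidal rule for the integral of func over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h


def compatibility_residual(f, g_left, g_right, a=0.0, b=1.0):
    """Return int_a^b f dx + g_left + g_right, the 1D analogue of (16).

    g_left and g_right are the prescribed outward normal derivatives,
    that is, g_left = -u'(a) and g_right = u'(b).
    """
    return trapezoid(f, a, b) + g_left + g_right


# Data generated by u(x) = cos(pi x): f = -u'' = pi^2 cos(pi x), u'(0) = u'(1) = 0.
residual_ok = compatibility_residual(
    lambda x: math.pi ** 2 * math.cos(math.pi * x), 0.0, 0.0
)

# Incompatible data: a uniform heat source with both ends insulated.
residual_bad = compatibility_residual(lambda x: 1.0, 0.0, 0.0)
```

The first residual vanishes up to rounding, while the second equals the total heat supplied, which is why no steady temperature can exist for that data.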
-In + + ...
j(bIifJI b2ifJ2 bnifJn) dV
i,j=1 j=1
\[ J(v) = \tfrac{1}{2} b^T K b - b^T F, \]
\[ K a = F; \tag{17} \]
useful information about the problem. The finite element method in particular
provides a systematic way of defining the finite-dimensional spaces
Xₙ, for increasing values of n.
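As a concrete instance of a system of the form Ka = F, the following minimal sketch assembles and solves one such system. Everything specific in it is an assumption made for illustration: the model problem −u'' = 1 on (0, 1) with u(0) = u(1) = 0, a uniform mesh, piecewise-linear "hat" basis functions, and the helper names `assemble` and `solve_tridiagonal`.

```python
def assemble(n, f_value=1.0):
    """Stiffness matrix K (stored as three diagonals) and load vector F for
    -u'' = f_value on (0, 1), u(0) = u(1) = 0, with n interior nodes
    and piecewise-linear basis functions on a uniform mesh."""
    h = 1.0 / (n + 1)
    lower = [-1.0 / h] * (n - 1)   # entries K[i+1][i]
    diag = [2.0 / h] * n           # entries K[i][i]
    upper = [-1.0 / h] * (n - 1)   # entries K[i][i+1]
    F = [f_value * h] * n          # integral of f * phi_i for constant f
    return lower, diag, upper, F


def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system by the Thomas algorithm."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    if n > 1:
        c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


n = 4
lower, diag, upper, F = assemble(n)
a = solve_tridiagonal(lower, diag, upper, F)      # coefficients of the approximation

nodes = [(i + 1) / (n + 1) for i in range(n)]
exact = [x * (1.0 - x) / 2.0 for x in nodes]      # exact solution u(x) = x(1 - x)/2
```

For this particular model problem the computed coefficients coincide with the exact solution at the nodes; that is a special feature of the one-dimensional case and should not be expected in general.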
In summary, then, the overall impression gained is that although func-
tional analysis will not in general provide a means or technique for actually
finding closed-form solutions to boundary value problems, it can help us to
develop theories that will throw light on the nature of solutions to problems.
Given that closed-form solutions may be impossible to achieve, by what-
ever method, the need for an understanding of some of these qualitative
properties of solutions is compelling.
Linear Functional
Analysis
1
Sets
x ∈ A
which is read "x is an element of A" or "x belongs to A". Likewise, the
expression
x ∉ A
reads "x is not an element of A". Various ways of defining sets are given
in the following examples.
Examples
A = {1, 2, 3, 4, 5}. (1.1)
A = {n : n ∈ ℤ, n ≥ 0},
3. The empty or null set is the set with no elements, and is denoted by
∅. For example,
Subsets, equal sets. If A and B are two sets, we say that A is a subset
of B if each element of A is an element of B. This is denoted by
A ⊂ B.
According to this definition every set is, of course, a subset of itself, and so
in order to distinguish subsets that do not coincide with the set in question,
we say that A is a proper subset of B if A is indeed a subset of B and if,
furthermore, B also contains elements that do not belong to A. If it is
A~B.
A ⊄ B.
A = B.
A ⊂ B and B ⊂ A.
For example, if
then A = B.
A ∪ B = {x : x ∈ A or x ∈ B}.
The difference of two sets A and B, written A − B, is the set of all elements
of A that do not belong to B (Figure 1.2(a)):
FIGURE 1.1. (a) The union and (b) the intersection of two sets
FIGURE 1.2. (a) The difference A − B of two sets A and B; (b) the complement
A' of a set A
The complement of a set A, denoted by A', is the set of all elements not in
A (Figure 1.2(b)). That is,
Example
4. Let A = {x : x ∈ ℤ, 1 ≤ x ≤ 10} and B = {9, 10, 11, 12}. Then
and
(1.3)
from those that cannot be labeled as in (1.3). A set that can be put in
one-to-one correspondence, in other words, labeled, with positive integers
is called a countable set. Of course, any finite set A is countable, for if A
has m members we may label them a₁, a₂, ..., aₘ.
Examples
5. The set of functions
is countable.
6. As shown in the next section, sets such as the set of points
on the real line are not countable. That is, the set of real numbers
between 0 and 1 cannot be labeled in the form a₁, a₂, ....
Cartesian products. Given two sets A and B, their Cartesian product
A × B is the set of all ordered pairs (a, b), where a ∈ A and b ∈ B. That is,
then
B × A = {(7, 1), (7, 2), (7, 3), (8, 1), (8, 2), (8, 3)} ≠ A × B.
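The set operations of this section map directly onto Python's built-in `set` type, which gives a quick way to check small examples by hand. The sketch below uses the sets of Example 4, and for the Cartesian product it assumes the sets {1, 2, 3} and {7, 8}, as inferred from the listing of B × A above; the variable names are hypothetical.

```python
from itertools import product

# Sets of Example 4.
A = set(range(1, 11))          # {x : x an integer, 1 <= x <= 10}
B = {9, 10, 11, 12}

union = A | B                  # A ∪ B
intersection = A & B           # A ∩ B
difference = A - B             # A − B

# Cartesian products; P and Q are inferred from the listing of B × A.
P = {1, 2, 3}
Q = {7, 8}
PxQ = set(product(P, Q))       # ordered pairs (a, b) with a in P, b in Q
QxP = set(product(Q, P))       # ordered pairs (b, a) with b in Q, a in P

same_product = PxQ == QxP      # False: the order within each pair matters
```

Because the pairs are ordered, the two products have the same number of elements but no element in common.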
FIGURE 1.3. The Cartesian product A × B of the two sets in (1.4)
N={1,2,3, ... }.
A rational number is a number that can be expressed as the ratio of two
integers. We denote the set of rational numbers by Q, so that
Subsets of ℝ. Very often we deal not with the whole real line but only a
portion of it, called an interval. Thus, if a and b are two points on ℝ such
that a ≤ b, then we define
the open interval (a, b) = {x : x ∈ ℝ, a < x < b};
the closed interval [a, b] = {x : x ∈ ℝ, a ≤ x ≤ b};
the half-open intervals (a, b] = {x : x ∈ ℝ, a < x ≤ b} and
[a, b) = {x : x ∈ ℝ, a ≤ x < b}.
Thus the terms "open" and "closed" indicate, respectively, that the
endpoints of the interval are excluded from or included in the set. There are
more technical definitions of open and closed sets however, which, although
consistent with the preceding definitions, are mathematically more sound.
We discuss these shortly.
where r = |z|
Open sets. Given any point c on the real line and a positive number ε, the
open interval (c − ε, c + ε) = {x : c − ε < x < c + ε} is called a neighborhood
of c. Likewise, if w is any complex number and ε a positive real number,
a neighborhood of w is the set {z ∈ ℂ : |w − z| < ε}. Neighborhoods are
illustrated in Figure 1.5.
Now, let 𝕂 represent either ℝ or ℂ, and let X be a subset of 𝕂. Then c
is called an interior point of X if we can find a neighborhood of c, all of
whose points belong to X. A set X ⊂ 𝕂 is called an open set if every point
of X is an interior point.
Examples
7. The open interval (a, b) is an open set: for any point c in (a, b) we
can define a neighborhood lying entirely in (a, b) by choosing ε to be
less than |c − a| and |c − b|. Thus every point in (a, b) is an interior
point.
On the other hand, the closed interval [a, b] is not an open set: the
points a and b are such that, no matter how small we choose ε, it
is not possible to find neighborhoods of a and b, all of whose points
lie in [a, b]. Thus a and b are not interior points and so [a, b] is not
open. Similar considerations apply to the half-open intervals [a, b)
and (a, b]; the points a and b, respectively, are not interior points.
8. The real line ℝ is an open set since every point in ℝ has a
neighborhood that lies in ℝ.
9. A simple example of an open set in ℂ is the disk of radius r and
center z₀, defined by D(z₀; r) = {z ∈ ℂ : |z − z₀| < r}.
Examples
10. The set {1,~, 1,~, 1,~, ... } has two points of accumulation, namely,
1 and 0. Since these do not belong to the set, it is not closed. On the
other hand, the closed set {1, 2, 3, ... } has no points of accumulation.
11. Consider the interval (a, b): according to the preceding definition,
every point in (a, b) is a point of accumulation. Furthermore, a and
b are also points of accumulation of (a, b) since every neighborhood
of these two points contains members of (a, b). But a and b do not
belong to the set, and so it is not closed. The closure of (a, b), on the
other hand, is [a, b].
Example
{1,0,l,0,~,0,··1
for even n,
for odd n.
\[ \lim_{n \to \infty} |u_n - u| = 0 \quad \text{or} \quad \lim_{n \to \infty} u_n = u, \tag{1.6} \]
Examples
15. Consider the sequence {aₙ} = {(3n² − 1)/(n² − 5n)}, n = 6, 7, .... As n gets
very large we would expect this sequence to approach the limit 3
(since the terms 3n² and n² dominate the numerator and denominator,
respectively). We check this by asking whether, for any ε > 0, a
number N can be found such that
\[ |a_n - 3| = \frac{15n - 1}{n^2 - 5n} < \epsilon \]
for n > N. Denote the left-hand side of (1.7) by f(n) and treat n as a
real number; then the graph of f(n) is as shown in Figure 1.6, and f
has roots n₁ and n₂, with n₂ ≥ 6. If we choose N = n₂, then clearly
f(n) > 0 for n > N, or |aₙ − 3| < ε for n > N, so that aₙ → 3.
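The choice of N in Example 15 can be tried numerically. The sketch below assumes that f(n) is the quadratic obtained by clearing the denominator in the inequality, namely f(n) = εn² − (5ε + 15)n + 1, so that f(n) > 0 is equivalent to |aₙ − 3| < ε for n ≥ 6; since (1.7) itself is not reproduced here, that reading is an assumption. The function names and the value ε = 0.01 are chosen for illustration.

```python
import math


def a(n):
    """General term of the sequence in Example 15."""
    return (3 * n ** 2 - 1) / (n ** 2 - 5 * n)


def larger_root(eps):
    """Larger root n2 of f(n) = eps*n^2 - (5*eps + 15)*n + 1 (assumed form of f)."""
    b = 5 * eps + 15
    return (b + math.sqrt(b * b - 4 * eps)) / (2 * eps)


eps = 0.01
N = math.ceil(larger_root(eps))      # smallest integer not less than n2

# Largest deviation from the limit over a long stretch of terms beyond N.
worst = max(abs(a(n) - 3) for n in range(N + 1, N + 2001))

# The term just below the root still violates the bound.
below = abs(a(N - 1) - 3)
```

With ε = 0.01 the root is roughly 1504.9, so every term with index greater than 1505 lies within 0.01 of the limit, while the term with index 1504 does not.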
Now suppose that there is a number m which belongs to A and that also
is an upper bound of A. We call m the maximum of the set A and we write
max A = m.
Similarly, if there is a number n that belongs to A and which, furthermore,
is a lower bound of A, then this number is called the minimum of A and
we write
min A = n.
Examples
17. Let A be the closed unit interval [0, 1] = {x : x ∈ ℝ, 0 ≤ x ≤ 1}.
Then any number a ≥ 1 is an upper bound, any number b ≤ 0 is a
lower bound, and
max A = 1, min A = 0.
18. Let A = (0, 1) = {x : x ∈ ℝ, 0 < x < 1}. In this case A has no
maximum or minimum, although it is bounded; the numbers 0 and 1
are lower and upper bounds, respectively, but do not belong to A.
The preceding examples illustrate once again the essential difference
between closed and open intervals: closed intervals have minima and maxima
whereas open intervals do not. Still, we would like to be able to express the
fact that, from the point of view of boundedness, a set such as (0, 1) is not
that different from [0, 1] in that it does have a least upper bound, which is
the smallest of all its upper bounds, and a greatest lower bound, which is
the largest of all its lower bounds, even though these bounds do not belong
to the set.
In general the supremum or least upper bound of a set A is a number
p' which is an upper bound of A, and which satisfies p' ≤ p for all upper
bounds p. When p' exists, we write
p' = supA.
q' = inf A.
maxA = supA.
minA = inf A.
Examples
20. Let A be the positive real line ℝ₊ = {x : x ∈ ℝ, x ≥ 0}. Then
inf ℝ₊ = min ℝ₊ = 0 and sup ℝ₊ does not exist, since ℝ₊ is not
bounded above.
{x₁, x₂, ...}. Next, let c₂ be the infimum of the sequence {x₂, x₃, ...}.
Continuing in this way, we denote by cₙ the infimum of the sequence starting
with xₙ, that is, {xₙ, xₙ₊₁, ...}.
Clearly {c₁, c₂, ...} is a monotone increasing sequence, and furthermore
this sequence is bounded above (by b). It follows (Exercise 1.14) that this
sequence {cₙ} converges to a limit c, say, which lies in [a, b]. We show next
that c is in fact a point of accumulation of the sequence {xₙ}.
To do this, choose any ε > 0, and choose also a positive integer N; then
there exists m ≥ N such that
|cₘ − c| < ε.
Now cₘ is the infimum of the set of numbers {xₘ, xₘ₊₁, ...}, so that there
exists k ≥ m such that
ℝ² = {(x, y) : x, y ∈ ℝ}.
FIGURE 1.9. The Cartesian plane ℝ²
depending on circumstances.
and ε is called, for obvious reasons, the radius of the neighborhood (Figure
1.10). We immediately generalize to ℝⁿ and define a neighborhood of a
point c in ℝⁿ to be the set
N(c, E)
Example
21. The unit square Ω̄ = Ω ∪ Γ = {x : x ∈ ℝ², 0 ≤ x₁ ≤ 1, 0 ≤ x₂ ≤ 1}
is closed (see the previous example); however Ω is not closed since all
the points lying on Γ are limit points of Ω but do not belong to Ω.
Domains in ℝⁿ. We now describe the kinds of sets in ℝⁿ that are of
greatest relevance. First, we define a connected set Ω in ℝⁿ to be a set
which has the property that every pair of points in Ω can be connected by
a curve that lies entirely in Ω. Examples of connected and disconnected sets
are shown in Figure 1.12.
We define next a domain in ℝⁿ to be an open connected set in ℝⁿ.
Domains are central to the consideration of boundary value problems, as the
examples in the Introduction indicate. Our interest is exclusively confined
to domains in ℝ, ℝ², and occasionally in ℝ³; in the case of ℝ² and ℝ³ the
boundary Γ (that is, the curve (in ℝ²) or surface (in ℝ³) within which all
points of the domain lie) is assumed to be sufficiently smooth, in the sense
that it possesses no cusps or suchlike singularities. Examples of admissible
and inadmissible domains are also shown in Figure 1.12.
Later, it is necessary to be more precise about what is meant by an
admissible domain, and there we define what is called a Lipschitz domain;
this is in a sense the standard "nice" domain with which one works in the
context of boundary value problems.
FIGURE 1.12. Connected and disconnected sets, and admissible and inadmissible
domains
Examples
22. Let A be the set of all men (in a given community, say), and B the
set of all women. Then for a ∈ A and b ∈ B, "a is the husband of b"
defines a relation on A × B.
23. Let A = {2, 3, 4}, B = {3, 4, 5, 6}, and consider the relation "y is
divisible by x", for (x, y) ∈ A × B. Then the subset making up this
relation is {(2, 4), (2, 6), (3, 3), (3, 6), (4, 4)}.
The kinds of relations that are particularly useful are those that have
some well-defined structure built into them, and we now consider some
then the set A is said to be linearly ordered, and ≤ is called a linear ordering
on A.
Examples
24. Consider the relation "<" on the real line. This is not reflexive since,
for any real number x, x ≮ x. It is also not symmetric, though it is
transitive: x < y and y < z imply that x < z.
25. As mentioned earlier, the operation "≤" defines a partial ordering on
ℝ; it is reflexive (x ≤ x), antisymmetric (x ≤ y and y ≤ x imply that
x = y), and transitive (x ≤ y and y ≤ z imply that x ≤ z). Note that
it is not, however, an equivalence relation, since it is not symmetric
(x ≤ y does not imply that y ≤ x).
26. Let F be a family of sets; that is, F is a set whose members are
themselves sets. Then set inclusion ⊂ is a partial ordering on F; note
in particular that for any two sets A and B in F, A ⊂ B and B ⊂ A
imply that A = B.
27. Let A be the set of triangles in the plane, and let "∼" be the relation
defined by "is similar to". Then this defines an equivalence relation
on A.
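On a finite set, the properties discussed in Examples 24 and 25 can be checked mechanically by representing a relation as a set of ordered pairs. The sketch below is an illustration only: a small range of integers stands in for the real line, so it demonstrates the definitions rather than proving anything about ℝ, and all function names are hypothetical.

```python
def is_reflexive(rel, A):
    """Every element is related to itself."""
    return all((x, x) in rel for x in A)


def is_symmetric(rel):
    """(x, y) in the relation implies (y, x) in the relation."""
    return all((y, x) in rel for (x, y) in rel)


def is_antisymmetric(rel):
    """(x, y) and (y, x) both in the relation imply x == y."""
    return all(x == y for (x, y) in rel if (y, x) in rel)


def is_transitive(rel):
    """(x, y) and (y, w) in the relation imply (x, w) in the relation."""
    return all((x, w) in rel
               for (x, y) in rel
               for (z, w) in rel if y == z)


A = range(-3, 4)                                       # finite stand-in for the real line
less = {(x, y) for x in A for y in A if x < y}         # the relation "<"
leq = {(x, y) for x in A for y in A if x <= y}         # the relation "<="

less_props = (is_reflexive(less, A), is_symmetric(less), is_transitive(less))
leq_props = (is_reflexive(leq, A), is_antisymmetric(leq),
             is_transitive(leq), is_symmetric(leq))
```

On this sample, "<" is transitive but neither reflexive nor symmetric, while "≤" is reflexive, antisymmetric, and transitive but not symmetric, in agreement with the two examples.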
Partitions and equivalence classes. Let A be any set, and suppose that
it is possible to define subsets A₁, A₂, ... of A which have the properties
that
(i) the sets Aᵢ are pairwise disjoint; that is, Aᵢ ∩ Aⱼ = ∅ for all i, j =
1, 2, ... such that j ≠ i;
(ii) A₁ ∪ A₂ ∪ ... = A.
Then the family of sets {A₁, ...} is called a partition of A. The motivation
for this name is easily understood if one considers Figure 1.13, which
illustrates the concept for the case of a set A in ℝ².
Examples
28. Let X = {1, 2, 3, ..., 9}, A = {1, 4, 7}, B = {2, 3, 5, 6}, and C =
{7, 8, 9}. Then {A, B, C} is not a partition of X: X = A ∪ B ∪ C but
A ∩ C = {7} ≠ ∅.
29. Consider the plane ℝ² and the family of subsets A_a defined by A_a =
{x ∈ ℝ² : x₂ = a}. Thus A_a is the set of points lying on the horizontal
line x₂ = a. Then {A_a : a ∈ ℝ} defines a partition on ℝ².
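The two defining properties of a partition translate into a short check for finite sets. The sketch below applies it to the sets of Example 28; the function name `is_partition` and the repaired block `C_fixed` are hypothetical, introduced only for illustration.

```python
def is_partition(blocks, X):
    """True if the blocks are pairwise disjoint and their union is X."""
    blocks = [set(b) for b in blocks]
    covers = set().union(*blocks) == set(X)                  # property (ii)
    disjoint = all(blocks[i].isdisjoint(blocks[j])           # property (i)
                   for i in range(len(blocks))
                   for j in range(i + 1, len(blocks)))
    return covers and disjoint


X = set(range(1, 10))
A = {1, 4, 7}
B = {2, 3, 5, 6}
C = {7, 8, 9}

overlap = A & C                                   # the offending common element
example_28 = is_partition([A, B, C], X)           # fails property (i)

C_fixed = {8, 9}                                  # hypothetical repair: drop 7 from C
repaired = is_partition([A, B, C_fixed], X)
```

The family {A, B, C} covers X but fails to be a partition because 7 belongs to both A and C; removing 7 from C restores both properties.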
It turns out that there is a close relation between partitions and
equivalence relations, and we explore this next. Suppose that ∼ defines an
equivalence relation on A, and for each a ∈ A, define the set A_a by
A_a = {x ∈ A : x ∼ a}.
Then A_a is called an equivalence class determined by a. In Example 29,
the equivalence relation x ∼ y may be defined on ℝ² by x₂ = y₂; then the
horizontal line A_a passing through a is the equivalence class defined by a.
We show that the family of equivalence classes in fact defines a partition
of A.
To prove (i), let B = ∪{A_a : a ∈ A}. Then any b ∈ B belongs to some
A_a, for a suitable choice of a, and hence b belongs also to A. Thus B ⊆ A.
Next, take any c ∈ A. By reflexivity we have c ∼ c, or c belongs to A_c, and
hence c belongs also to B. It follows that (i) holds.
The proof of (ii) follows directly from the result of Exercise 1.23, which
implies that if A_a ≠ A_b, then A_a and A_b must be disjoint. □
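The passage from an equivalence relation to a partition can be watched on a finite set. The sketch below uses a discrete stand-in for Example 29, a 3 × 3 grid of points with x ∼ y when the second coordinates agree; the grid, the function name, and the predicate `same_height` are assumptions made for illustration.

```python
from itertools import product


def equivalence_classes(A, related):
    """Return the distinct classes A_a = {x in A : x ~ a} as frozensets."""
    return {frozenset(x for x in A if related(x, a)) for a in A}


grid = set(product(range(3), range(3)))           # points (x1, x2), x1, x2 in {0, 1, 2}


def same_height(x, y):
    """x ~ y when the second coordinates agree (horizontal lines)."""
    return x[1] == y[1]


classes = equivalence_classes(grid, same_height)

# The union of the classes recovers the whole set ...
union_of_classes = set().union(*classes)

# ... and distinct classes have no point in common.
pairwise_disjoint = all(p.isdisjoint(q) for p in classes for q in classes if p != q)
```

Each class is one horizontal row of the grid; collecting the classes in a set removes the duplicates that arise because every point of a row determines the same class.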
THEOREM 3. Zorn's Lemma and the Axiom of Choice are equivalent axioms.
A ⇒ B.
A ⇒ B
B holds if A holds
Now consider the converse, that is, the case in which B implies A, or B ⇒ A.
Of course we could simply go back and transpose A and B in all the
preceding statements; but it is useful to consider this relationship from a
different angle. Specifically, we may now state that B ⇒ A is the same as
stating that B holds only if A holds, which is to say that, if A does not
hold, then neither does B. A third way of making this assertion is to state
that a necessary condition for B to hold is that A holds. Going back to the
simple example, we can state that a necessary condition for a³ = 8 to hold
is a = 2. We summarize again.
The term "iff" is shorthand for "if and only if". In the context of proofs of
theorems and the like, when faced with the task of showing that statement
A is true if and only if B is true, the typical approach is a two-stage one:
• sufficiency (if): assume that B holds, and show that this implies A;
• necessity (only if): assume that A holds, and show that this implies
B.
Example
30. Let A be the statement "a² > 4" and B the statement "a > 2".
Assume first that B is true; then clearly A is true. Thus B is a
sufficient condition for A to hold. Conversely, assume that A is true; this
implies that |a| > 2; that is, a < −2 or a > 2. Thus A is not a
sufficient condition for B to hold; alternatively, B is not a necessary
condition for A to hold (since a < −2 would also be acceptable). The
two statements are therefore not equivalent. On the other hand, if A
is the statement "a² > 4 and a > 0", and B is the statement "a > 2",
then A and B are equivalent.
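The claims of Example 30 can be probed on a sample of values. The sketch below is a numerical illustration, not a proof: it tests the two statements on a grid of values of a between −5 and 5, and the names `stmt_A`, `stmt_B`, and `stmt_A2` are hypothetical.

```python
samples = [x / 10 for x in range(-50, 51)]        # a = -5.0, -4.9, ..., 5.0


def stmt_A(a):
    return a * a > 4          # statement A: a^2 > 4


def stmt_B(a):
    return a > 2              # statement B: a > 2


def stmt_A2(a):
    return a * a > 4 and a > 0    # strengthened statement: a^2 > 4 and a > 0


# B sufficient for A: whenever B holds, A holds.
b_implies_a = all(stmt_A(a) for a in samples if stmt_B(a))

# B necessary for A: whenever A holds, B holds (this fails).
a_implies_b = all(stmt_B(a) for a in samples if stmt_A(a))

# Values where A holds but B does not.
counterexamples = [a for a in samples if stmt_A(a) and not stmt_B(a)]

# With the extra condition a > 0 the two statements agree on every sample.
equivalent = all(stmt_A2(a) == stmt_B(a) for a in samples)
```

Every counterexample found lies below −2, which is exactly the branch of |a| > 2 that statement B excludes.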
exploits the fact that the statement "if A holds, then B holds" is equivalent
to the statement "if B does not hold, then A does not hold". Faced with
the task of proving that A implies B, the procedure starts off by assuming
that B does not hold. The task is then to show that this implies that A is
not valid, usually by obtaining a contradiction of the original assumption.
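The equivalence on which this method of proof rests can be confirmed by listing all truth values. The sketch below is a small truth-table check; the helper `implies` is hypothetical and encodes "if p then q" as "not p, or q".

```python
from itertools import product


def implies(p, q):
    """Truth value of 'if p holds, then q holds'."""
    return (not p) or q


# Each row: (A, B, A implies B, not-B implies not-A).
rows = [(A, B, implies(A, B), implies(not B, not A))
        for A, B in product([True, False], repeat=2)]

# A statement and its contrapositive agree in every row.
contrapositive_equivalent = all(row[2] == row[3] for row in rows)

# By contrast, a statement and its converse do not always agree.
converse_equivalent = all(implies(A, B) == implies(B, A)
                          for A, B in product([True, False], repeat=2))
```

The contrast with the converse is worth keeping in mind: proving "B implies A" says nothing about "A implies B", whereas proving "not B implies not A" settles it.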
Example
1.7 Exercises
The algebra of sets
1.2. Let A = {1, 2}, B = {7, 8}, and C = {9, 1}. Find B × (A ∪ C) and
(A ∩ C) × B.
1.4. Let n(A) denote the number of elements of a finite set A. Prove that
1.5. The power set of a set A, denoted by 2^A or P(A), is the set of all
subsets of A. What are P(A) and P(B) if A = {1, 2, 3} and B =
{{1, 2}, 3}?
Sets of numbers
1.6. Show that the set Q of rational numbers is countable. [Hint: Set up
a table of the form
1/1 1/2 1/3
2/1 2/2 2/3
3/1 ].
1.11. Write down the first few terms of the following sequences.
(i) {(−1)ⁿ/n}, n = 1, 2, ...;
(ii) g(1- (-I)n)}~=l;
(iii) {3n²/(5n² − 6)}, n = 1, 2, ....
1.12. Determine which of the following sequences are convergent, and find
their limits.
(i) (4 − 2n − 3n²)/(2n² + n); (iii) n/(1 + n).
1.13. The sequence {(3n + 2)/(n − 1)} converges to 3 as n → ∞. Find the
smallest integer N such that
\[ \left| \frac{3n + 2}{n - 1} - 3 \right| < \epsilon \]
1.17. Suppose that A and B are two sets of real numbers that are bounded
above, with sup A = a and sup B = b. Let C be the set of real numbers
formed by considering all products of the form xy, where x ∈ A and
y ∈ B. Give a counterexample to show that, in general, sup C ≠ ab.
Subsets of ℝⁿ
1.19. Determine the points of accumulation of the following sets and estab-
lish which of these sets are open, closed, or neither.
1.22. Let $\sim$ be the relation on the set $A = \{2, 3, 4, 5, 6\}$ defined by the
statement "$|a - b|$ is divisible by 3". Write $\sim$ as a set of ordered pairs,
that is, as a subset of $A \times A$, and represent it graphically as a set of
points in the plane.
1.24. Consider the relation on $\mathbb{Z} \times \mathbb{Z}$ in which $a \sim b$ is defined by
$|a_1| + |a_2| = |b_1| + |b_2|$. Show that this is an equivalence relation, and illustrate the
manner in which $\mathbb{Z} \times \mathbb{Z}$ is partitioned.
2
Sets of functions and Lebesgue
integration
In due course we endow sets with particular properties and on the basis of
these assumed properties construct a theory for special kinds of sets such
as Hilbert spaces. In the development of this theory it is not necessary to
appeal to the precise character of a set: the basic axioms, and the theo-
rems that follow from these axioms, apply equally to sets whose members
are numbers or matrices or functions. Before embarking on the task of de-
scribing this general framework, however, we first introduce two important
examples of sets, or spaces (as they are usually called when endowed with
additional properties) of functions: these are the spaces of continuous functions,
and the $L^p$ spaces of functions whose pth powers are integrable. With
these at our disposal it is possible in subsequent chapters to illustrate aspects
of the general theory, using as special examples sets such as $\mathbb{R}$ or $\mathbb{R}^n$
which were introduced in the last chapter, as well as spaces of functions.
In Section 2.1 the concept of continuity is introduced, and the space
$C^m(\Omega)$ of m-times continuously differentiable functions is defined.
There are of course many well-behaved functions that are not contin-
uous, and that also feature in the developments to follow; an example is
the Heaviside step function. These functions need to be characterized in
an alternative manner, and this is done by exploiting not the degree of
continuity or smoothness of the function, but rather its integrability. This
process leads naturally to the definition of the $L^p$ spaces. In order to discuss
these adequately it is necessary first, however, to extend the definition
of the integral encountered in elementary courses on calculus; this is the
Riemann integral, and it is not adequate for our purposes. Its extension,
known as the Lebesgue integral, in turn relies on an acquaintance with the
to find a positive number $\delta$ (depending on $\epsilon$ and on the point $x_0$) such that
Examples
$$|f(x) - f(x_0)| = \frac{|x_0 - x|}{|x|\,|x_0|}$$
for an arbitrary fixed $x_0$ in $(0, 1]$. Let $0 < \delta < x_0$; then every x in the
interval $|x - x_0| < \delta$ satisfies $x > x_0 - \delta > 0$ and we have
$$|f(x) - f(x_0)| < \frac{\delta}{x_0(x_0 - \delta)}$$
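The estimate above, which is the one that arises for $f(x) = 1/x$, can be checked numerically. The following is only a sketch: the function names, the sample point $x_0 = 0.5$, and the radius $\delta = 0.1$ are illustrative choices, not taken from the text.

```python
def f(x):
    # The function whose continuity is being examined.
    return 1.0 / x

def bound(x0, delta):
    # Right-hand side of the estimate, valid for 0 < delta < x0.
    return delta / (x0 * (x0 - delta))

def max_deviation(x0, delta, samples=1000):
    # Largest sampled |f(x) - f(x0)| over points strictly inside |x - x0| < delta.
    worst = 0.0
    for i in range(1, samples):
        x = x0 - delta + 2.0 * delta * i / samples
        worst = max(worst, abs(f(x) - f(x0)))
    return worst

x0, delta = 0.5, 0.1  # illustrative values (assumed)
print(max_deviation(x0, delta), bound(x0, delta))
```

Because the bound grows as $x_0$ approaches 0, a smaller $\delta$ is needed there to achieve the same $\epsilon$, which is the dependence of $\delta$ on the point $x_0$.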
For functions of more than one variable the preceding ideas are easily extended.
For example, consider a function $f(\mathbf{x}) \equiv f(x, y)$ of two variables
defined on an open subset $\Omega$ of $\mathbb{R}^2$, as shown in Figure 2.3. To check
for continuity at a point $\mathbf{x}_0 = (x_0, y_0)$ in $\Omega$ we choose a positive number $\epsilon$
and construct a pair of horizontal planes at heights $f(\mathbf{x}_0) \pm \epsilon$ above the xy
plane. Then $f(\mathbf{x})$ is continuous at $\mathbf{x}_0$ if it is always possible to construct a
cylinder of radius $\delta$ (that is, the set of points $\mathbf{x}$ for which $|\mathbf{x} - \mathbf{x}_0| < \delta$), this
radius depending on $\epsilon$, such that the part of the surface lying within the
cylinder is contained in the horizontal band $|f(\mathbf{x}) - f(\mathbf{x}_0)| < \epsilon$. This is but
a special case of the general definition of continuity for functions
of any number of variables, which we now state.
The space $C(\Omega)$. For any domain $\Omega$ in $\mathbb{R}^n$ the collection of all continuous
functions defined on $\Omega$ forms a set, or space, which is denoted by $C(\Omega)$.
For functions defined on a subset $\Omega = (a, b)$ of the real line, we simply
write $C(a, b)$. The space of functions that are continuous on the closed set
$\bar{\Omega} = \Omega \cup \Gamma$ ($\Omega$ and its boundary $\Gamma$) is denoted by $C(\bar{\Omega})$, and by $C[a, b]$
for functions on the closed interval. There is more than a mere technical
2.1 Continuous functions
FIGURE 2.3. Continuity of a function of two variables
The spaces $C^m(\Omega)$ and $C^\infty(\Omega)$. Among all the continuous functions
defined on a subset $\Omega$ of $\mathbb{R}^n$, some have the property that their first derivatives
and possibly some derivatives of higher order are also continuous. It is
important to identify such functions, and so we introduce the space $C^m(\Omega)$
of functions which, together with all of their derivatives up to and including
those of order m, are continuous on $\Omega$. That is,
for $\Omega = (a, b) \subset \mathbb{R}$,
FIGURE 2.4. The function in Example 3, and its first and second derivatives
Examples
3. The function
$$u(x) = \begin{cases} 0, & -1 \le x < 0, \\ x^2, & 0 \le x \le 1, \end{cases}$$
$\sup f(\bar{\Omega})$ and $\inf f(\bar{\Omega})$ exist (that is, are finite).
Note that part (a) of the theorem states that there is a point z, say, in
$\bar{\Omega}$, such that $f(z) = \sup f(\bar{\Omega}) = \max f(\bar{\Omega})$; that is, $f(z) \ge f(x)$ for all points
$x \in \bar{\Omega}$; a similar interpretation applies with respect to the infimum.
whence
The proof for the minimum is carried out in much the same way. $\square$
Examples
6. Consider the function $u(x) = \sin x$ defined on $[0, 2\pi]$, which is closed.
The supremum of u(x) is 1, which is achieved at $x = \pi/2$, whereas
the infimum is -1, which is achieved at $x = 3\pi/2$. Theorem 1 tells us
that u is uniformly continuous.
7. Let $u(x) = 1/x$; we have seen earlier that this function is continuous
on the open interval (0,1), but that it is not uniformly continuous
there (see also Exercise 2.3). It is not continuous on [0,1]; furthermore,
inf u = 1 (at x = 1), but sup u does not exist.
8. Note that Theorem 1 gives sufficient conditions for a function to be
bounded and uniformly continuous. These are not necessary conditions,
however; for example, if $u(x) = x^2$ on (0,1), then sup u = 1,
inf u = 0, and the function u is uniformly continuous, although it
achieves its supremum and infimum (at x = 1 and x = 0, respectively)
outside the set (0,1), which is open.
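Lipschitz continuous functions. A function f defined on a set $\Omega$ in
$\mathbb{R}^n$ is said to be Lipschitz continuous (or simply Lipschitz) if there exists a
constant $L > 0$ such that
$$|f(\mathbf{x}) - f(\mathbf{y})| \le L\,|\mathbf{x} - \mathbf{y}| \quad \text{for all } \mathbf{x}, \mathbf{y} \in \Omega. \qquad (2.3)$$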
It is straightforward to show (Exercise 2.10) that every Lipschitz function
is uniformly continuous, although of course the converse is not true. This
may be better appreciated by considering the interpretation of Lipschitz
continuity for functions of a single variable (Figure 2.7): (2.3) states that
the slope of the chord joining any two points on the graph of a Lipschitz function is
bounded above by a constant L which is independent of the two points.
We see also that the definition of Lipschitz continuity does not require that
the derivative exist at every point. However, it is not difficult to show that,
if $\Omega$ is a compact set, then every continuously differentiable function on $\Omega$
is Lipschitz.
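The chord-slope interpretation of (2.3) lends itself to a small numerical experiment. The sketch below is illustrative only: the helper name, the grid sizes, and the two sample functions ($x^2$ and $\sqrt{x}$ on $[0, 1]$) are assumptions chosen for the demonstration, not examples from the text.

```python
import math

def max_chord_slope(f, a, b, n=400):
    # Largest |f(y) - f(x)| / |y - x| over all pairs of points on a uniform grid.
    pts = [a + (b - a) * i / n for i in range(n + 1)]
    best = 0.0
    for i in range(n + 1):
        for j in range(i + 1, n + 1):
            slope = abs(f(pts[j]) - f(pts[i])) / (pts[j] - pts[i])
            best = max(best, slope)
    return best

# x**2 on [0, 1]: chord slopes equal x + y, so they stay below L = 2.
square_slope = max_chord_slope(lambda x: x * x, 0.0, 1.0)

# sqrt(x) on [0, 1]: the steepest chord joins 0 to the first grid point,
# so the estimate grows like sqrt(n) as the grid is refined.
sqrt_coarse = max_chord_slope(math.sqrt, 0.0, 1.0, n=100)
sqrt_fine = max_chord_slope(math.sqrt, 0.0, 1.0, n=400)

print(square_slope, sqrt_coarse, sqrt_fine)
```

The square-root function is uniformly continuous on $[0, 1]$, yet its chord slopes are unbounded near the origin, which is one way to see that the converse fails.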
Many functions that occur in practical applications are not continuous, and
cannot therefore be accommodated in one of the spaces $C^m(\Omega)$. A simple
example is the Heaviside step function, which has many applications in
physics and engineering, and which is defined by
$$H(x) = \begin{cases} 0, & x \le 0, \\ 1, & x > 0. \end{cases}$$
FIGURE 2.8. The Heaviside step function H(x) and its integral, the ramp func-
tion R(x)
Though functions like H(x) are not continuous, they do nevertheless possess
the important property that they are integrable; that is, their integrals exist.
For example, the integral of H(x) is the ramp function R(x) shown in Figure
2.8; clearly, $R(x) \in C(-\infty, \infty)$.
Our aim is to set up a space of functions that may be classified according
to whether they, and their powers, are integrable. That is, for a given
function f we investigate the range of exponents p for which the integral
FIGURE 2.9. The basic idea behind (a) Riemann and (b) Lebesgue integration
case in which $p > q$; thus these spaces also provide a means of comparing
functions, this time through their integrability.
In order to give such spaces a proper treatment it is necessary first of
all to discuss the notion of Lebesgue measure. This in turn allows us to
introduce the notion of Lebesgue integration, which is a generalization of
the "standard" Riemann integration, and in so doing to go on to introduce
the spaces $L^p(\Omega)$.
Measure theory is a well-established branch of mathematics, and Lebesgue
measure is but one example of a measure. It is an important example,
though, and it is also intuitively the easiest to grasp. There is no need to
make reference subsequently to any other measure than that of Lebesgue,
so rather than give a general treatment of the subject, this section is restricted
to an overview of the theory of Lebesgue measure that is concise,
but which nevertheless suffices for our purposes.
In order to appreciate the need to extend the notion of Riemann integra-
tion, we return first to the definition of the Riemann integral. Restricting
the discussion for now to functions of a single variable, consider a function
f defined on the interval [a, b]. The Riemann integral is based on the idea
of dividing [a, b] into a finite number N of subintervals, the kth subinterval
having length $\Delta x_k$, and then considering sums of the form
$$f(x_1)\Delta x_1 + f(x_2)\Delta x_2 + \cdots + f(x_N)\Delta x_N,$$
as shown in Figure 2.9(a). This sum represents an approximation to the
area under the graph of f. If the function is sufficiently well-behaved - for
example, piecewise continuous - then the approximation may be improved
by increasing N, that is, by refining the subdivision of [a, b], so that in the
limit, as N gets very large, we arrive at the Riemann integral, which is
usually denoted by
$$\int_a^b f(x)\,dx.$$
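The construction just described can be sketched in a few lines. This is a minimal illustration under stated assumptions: the subintervals are taken to be of equal length, the sample point $x_k$ is the left endpoint of each subinterval, and the integrand $x^2$ on $[0, 1]$ is an arbitrary test case.

```python
def riemann_sum(f, a, b, n):
    # Sum of f(x_k) * dx over n equal subintervals, with x_k the left endpoint (assumed choice).
    dx = (b - a) / n
    return sum(f(a + k * dx) * dx for k in range(n))

# Refining the subdivision drives the sum toward the exact value 1/3.
for n in (10, 100, 1000):
    print(n, riemann_sum(lambda x: x * x, 0.0, 1.0, n))
```

Any other choice of sample point within each subinterval leads to the same limit for a piecewise continuous integrand.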
deficiencies. For example, there are certain "nasty" functions that we are
unable to deal with using the Riemann integral: an example is the function
$$u(x) = \begin{cases} 1, & x \text{ is rational}, \\ 0, & x \text{ is irrational}, \end{cases} \qquad (2.4)$$
defined on the interval [0,1]. With the more general Lebesgue integral we
avoid these problems; the Lebesgue integral is able to handle functions
like (2.4) and, furthermore, gives the same result as the Riemann integral
if the function is Riemann-integrable. Also, limits of Lebesgue-integrable
functions are always Lebesgue-integrable.
Although it might seem rather pedantic to abandon the Riemann integral
for the preceding reasons - after all, how often are we required to
integrate something like the function defined in (2.4)? - we demonstrate
later that spaces of Lebesgue-integrable functions possess properties which
allow them to be classified as Banach spaces or Hilbert spaces, with the
fortunate consequence that it is then possible to draw on the vast reser-
voir of results for such spaces. From a practical point of view, Riemann
and Lebesgue integrals coincide when the former exists, so all we will have
done would be to broaden the class of functions that can be integrated.
Since the question of whether the integral of a function f makes sense
depends very much on the function, a suitable alternative approach to the
Riemann integral might be to approximate f by a very simple function, the
integral of which can be computed without any difficulty. Then, in contrast
to the Riemann integral, the approximation to the integral of f can be
progressively improved, not by further subdivisions of the domain, but by
refining the approximation to f (see Figure 2.9(b)). The approximating
functions that serve this purpose are indeed known as simple functions,
and are defined to be functions that take on a finite number of values.
Provided that we have no problems with the subsets $M_k$ on which they
take their constant values, the integral of f can be approximated by a sum
of the form
$$\mu(\mathbb{Z}) = \mu(\mathbb{Q}) = 0.$$
2.3 Lebesgue integration and the space $L^p(\Omega)$
Examples
9. Any continuous function is measurable: in particular if M is the interval
(c,d) (Figure 2.12), then it is possible to show that $f^{-1}(c,d)$
is also open. To see this, take any point $y_0$ in M; then it is possible
to choose $\epsilon$ such that the neighborhood $\{y : |y - y_0| < \epsilon\}$ lies entirely
in M, M being open. Now denote by J the interval $f^{-1}(M)$;
by definition there exist points x and $x_0$ in J such that $f(x) = y$
and $f(x_0) = y_0$, and so $|f(x) - f(x_0)| < \epsilon$. By the continuity of f it
follows that there exists $\delta > 0$ such that $|x - x_0| < \delta$. Thus J is open.
10. Consider the Heaviside function H defined by
$$H(x) = \begin{cases} 1 & \text{if } x \ge 0, \\ 0 & \text{if } x < 0, \end{cases}$$
and shown in Figure 2.13; if we choose M as shown in the figure,
then $H^{-1}(M) = \{x : x \ge 0\}$ which is measurable; on the other
hand, if we choose the measurable set L, then $H^{-1}(L) = \{x : x < 0\}$
which again is measurable. Continuing in this way, we can verify that
the sets $H^{-1}(M)$ for measurable M are all measurable. Thus H is a
measurable function.
11. Let $\Omega$ be a measurable set, and E a measurable subset of $\Omega$; then the
characteristic function $\chi_E$ of E is defined by
$$\chi_E(x) = \begin{cases} 1 & \text{if } x \in E, \\ 0 & \text{if } x \notin E. \end{cases} \qquad (2.5)$$
FIGURE 2.14. (a) The characteristic function $\chi_E$, and (b) a simple function s
12. We return to the example given in (2.4), and observe that this can be
written in the alternative form $u = \chi_{\mathbb{Q}}$, where $\mathbb{Q}$ is the set of rational
numbers. Since $\mathbb{Q}$ is a measurable set - with Lebesgue measure zero,
since it is countable - it follows that the function u is measurable, by
Example 11.
With the characteristic function at our disposal we can now define simple
functions s: these are functions on $\Omega$ that take on only a finite number of
values. In other words, suppose that $M_1, M_2, \ldots, M_N$ is a partition of $\Omega$;
then each simple function is a measurable function of the form
$$s(x) = \sum_{k=1}^{N} a_k \chi_{M_k}(x), \qquad (2.6)$$
where $a_k$ is the value of s on $M_k$. These notions are illustrated in Figure
2.14. Since sums of measurable functions are measurable, we can conclude
that every step function is measurable.
$$\int_E dx := \int_\Omega \chi_E\,dx = \mu(E).$$
$$\int_\Omega u\,dx = \int_\Omega \chi_{\mathbb{Q}}\,dx = \mu(\mathbb{Q}) = 0.$$
FIGURE 2.16. A nondecreasing sequence of simple functions that approximate a
measurable function f
$$\int_\Omega f\,dx = \lim_{k \to \infty} \int_\Omega s_k\,dx, \qquad (2.8)$$
Example
13. Suppose that we wish to integrate the function f shown in Figure
2.16. Now this function is piecewise continuous, and it is well known
from elementary integration that its integral is the area under the
triangle, and is equal to 4.
However, the purpose of this example is
to show how the definition (2.8) may be deployed in practice, so we
in fact construct a sequence of nondecreasing simple functions that
converge to f.
There are many different ways of constructing the requisite family of
simple functions; we consider just one, in which the first member $s_1$
is as shown in Figure 2.16; the second member, $s_2$, is constructed in
a similar manner, ensuring that it too satisfies $s_2 \le f$. The process
may now be continued in a fairly obvious manner.
Considering next the integrals of the simple functions, we see that
so that
$$f = f^+ - f^-.$$
We observe that both $f^+$ and $f^-$ are nonnegative functions, so that the
preceding theory applies to these two components of f. It is possible to
show that $f^+$ and $f^-$ are both measurable if f is, and so we may define
the Lebesgue integral of f by
$$\int_\Omega f\,dx = \int_\Omega f^+\,dx - \int_\Omega f^-\,dx. \qquad (2.9)$$
It is at this point that we can clarify the need to define the integral of a
function in terms of its positive and negative parts; first we need to note
that the integral, as defined in (2.8), need not be finite; that is, it is possible
to have $\int_\Omega f\,dx = +\infty$ for a nonnegative function. Continuing this line of
argument, it is quite conceivable that evaluation of the right-hand side of
(2.9) will give $\infty - \infty$, which is of course meaningless. It is therefore to be
understood that the notation $\int_\Omega f\,dx$ makes sense only if at least one of the terms
on the right-hand side of (2.9) is finite. We go one step further, and give a
special name to those functions f for which $f^+ + f^-$ has a finite integral.
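The splitting of a function into its positive and negative parts can be illustrated numerically. The sketch below assumes the standard definitions $f^+ = \max(f, 0)$ and $f^- = \max(-f, 0)$, and it uses a midpoint quadrature as a stand-in for the integral; the helper names and the sample function $f(x) = x$ on $[-1, 2]$ are hypothetical choices made only for this example.

```python
def positive_part(f):
    # f^+ = max(f, 0), assumed standard definition.
    return lambda x: max(f(x), 0.0)

def negative_part(f):
    # f^- = max(-f, 0), so that f = f^+ - f^- and |f| = f^+ + f^-.
    return lambda x: max(-f(x), 0.0)

def midpoint_integral(g, a, b, n=10000):
    # Midpoint-rule approximation of the integral of g over (a, b).
    dx = (b - a) / n
    return sum(g(a + (k + 0.5) * dx) for k in range(n)) * dx

f = lambda x: x  # sample function on [-1, 2] (assumed)
plus = midpoint_integral(positive_part(f), -1.0, 2.0)
minus = midpoint_integral(negative_part(f), -1.0, 2.0)
total = plus - minus  # the combination that appears in (2.9)
print(plus, minus, total)
```

Here both parts have finite integrals, so the difference is meaningful; the sum of the two parts gives the integral of $|f|$.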
$$\iint_\Omega u(x)\,dx\,dy.$$
$$\int_\Omega u(x)\,dx,$$
the context making clear the dimension of the domain over which the inte-
gral is taken. This convention has in fact been implicit in the developments
leading to the Lebesgue integral; we made no distinction there between
integrals taken over $\mathbb{R}$ and over $\mathbb{R}^n$, for any n.
It is also worth bearing in mind that since sets of measure zero are
irrelevant in the evaluation of integrals, integrals may be defined over open
sets or over their closures. So, for example, it makes no difference whether
an integral is defined over an open interval (a, b), or over [a, b].
All the usual properties of Riemann integrals extend to Lebesgue inte-
grals, and we summarize without proof some of these properties.
$$\int_\Omega u\,dx = \lim_{k \to \infty} \int_\Omega u_k\,dx.$$
The usefulness of this theorem lies in the very mild conditions that are
placed on $u_k$.
$$\int_\Omega |u(x)|^p\,dx$$
exists (that is, is finite). The case p = 2 is special in many ways, as the
developments in Chapter 3 and beyond make clear; functions in $L^2(\Omega)$ have
the property that
Examples
$$H(x) = \begin{cases} 0, & x < 0, \\ 1, & 0 \le x, \end{cases}$$
belongs to $L^p(a, b)$ for any $p \ge 1$ and finite $a < 0$ and $b > 0$ since
15. The function $u(x) = x^{-1/3}$ belongs to $L^p(0, 1)$ for any $p < 3$, since
$$\int_0^1 |u(x)|^p\,dx = \int_0^1 x^{-p/3}\,dx = \frac{3}{3 - p}\left[x^{(3-p)/3}\right]_0^1$$
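The role of the restriction $p < 3$ can be seen by evaluating the same integral over $[\epsilon, 1]$ and letting $\epsilon$ shrink. This is a sketch only: the function name and the cut-off values are illustrative, and the formula used is simply the antiderivative of $x^{-p/3}$.

```python
import math

def truncated_integral(p, eps):
    # Exact integral of x**(-p/3) over [eps, 1], computed from the antiderivative.
    if p == 3:
        return -math.log(eps)
    s = (3.0 - p) / 3.0
    return (1.0 - eps ** s) / s

for eps in (1e-3, 1e-10, 1e-30):
    # p = 2 settles near 3/(3 - p) = 3; p = 3 and p = 4 keep growing as eps shrinks.
    print(eps, truncated_integral(2, eps), truncated_integral(3, eps), truncated_integral(4, eps))
```

For $p < 3$ the truncated values converge to $3/(3 - p)$, whereas for $p \ge 3$ they grow without bound.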
Some results that are frequently useful are embodied in the following the-
orem.
$$\int_\Omega u(x)v(x)\,dx$$
is finite.
which holds if $p \ge p'$ (see Exercise 3.22 later for a derivation). If u belongs
to $L^p(\Omega)$, then the integral on the right is finite, and hence so is the integral
on the left. Thus $u \in L^{p'}(\Omega)$ also.
Part (b) is a trivial consequence of (a): set $p' = 1$; then we have, for
$u \in L^p(\Omega)$,
$$\left|\int_\Omega u(x)\,dx\right| \le \int_\Omega |u(x)|\,dx \le \left[\int_\Omega |u(x)|^p\,dx\right]^{1/p} < \infty.$$
and g(x) are two measurable functions that are equal a.e. (as in Figure
2.10); then
It follows that $L^p(\Omega)$ can be partitioned into equivalence classes, each class
comprising all those functions that are equal a.e. to a given one. In order
to be able to define $L^p(\Omega)$ as a normed space (in the next chapter) it is
necessary to regard the elements of this space not as functions, but rather
as the equivalence classes of functions defined here. Notwithstanding this
distinction, it is common practice to speak of the members of $L^p(\Omega)$ as
functions; this is a harmless abuse of language provided that the precise
nature of the space is properly understood.
$$\int_\Omega f\,dx \equiv \int_\Omega u\,dx + i\int_\Omega v\,dx.$$
The definition of the $L^p$ spaces still stands, if the notation $|f|$ is interpreted
as the modulus of a complex number: $|f|^2 = u^2 + v^2$.
The space $L^\infty(\Omega)$. If we let $p \to \infty$, then we may define the space $L^\infty(\Omega)$
to be the space of all measurable functions on $\Omega$ that are bounded almost
everywhere on $\Omega$ (that is, except possibly on subsets of zero measure):
Clearly for a bounded domain $\Omega$, $L^\infty(\Omega)$ is a subset of $L^p(\Omega)$ for all $p \ge 1$,
since any $u \in L^\infty(\Omega)$ satisfies
Example
16. The function
$$u(x) = \begin{cases} x^2, & 0 \le x < 1,\ x \ne \tfrac{1}{2}, \\ +\infty, & x = \tfrac{1}{2}, \end{cases}$$
FIGURE 2.18. The relationship between the LP spaces and spaces of continuous
functions
2.5 Exercises
Continuous functions and the space $C^m(\Omega)$
2.1. Sketch and discuss the continuity of the functions
2.2. Show that the following functions are continuous on the intervals
given:
(a) polynomials of degree k defined on the interval [a, b];
(b) the function $u(x) = x^{1/2}$ on $[0, \infty)$.
Is either of these functions uniformly continuous?
2.3. (a) Show that $f(x) = 1/x$ is not uniformly continuous on (0,1).
[Hint: recall that $f(x) - f(y) = (y - x)/xy$. Show, for example
by choosing $x = 1/n$ and y appropriately, that the distance
$|x - y|$ can be made arbitrarily small although $|f(x) - f(y)|$ is
large.]
(b) Show, on the other hand, that $f(x) = 1/x$ is uniformly continuous
on [a, b], where b > a > 0.
2.4. Show that $f(\mathbf{x}) = x^2 + 2y$ is continuous at any point $\mathbf{x}$ in $\mathbb{R}^2$.
2.5. Let E be a closed connected set in $\mathbb{R}^2$, and for any point $\mathbf{x}$ in $\mathbb{R}^2$
define the function f by $f(\mathbf{x}) = d(\mathbf{x}, E)$, where $d(\mathbf{x}, E)$ is the distance
between $\mathbf{x}$ and E, defined by
()
cux
"() = {O,1, 00<S; xx<S; l~ '
2.9. Examine the continuity of $u(\mathbf{x}) = r$ on the unit disk, where $r^2 = x^2 + y^2$.
2.10. Show that every Lipschitz function is uniformly continuous.
Measure of sets in $\mathbb{R}^n$
2.11. Let I be an interval in $\mathbb{R}$, and consider the subset of all irrational
numbers in I. Is this set measurable? If so, calculate its measure.
2.12. Show that the characteristic function $\chi_E$ of a set E is a measurable
function if and only if E is itself measurable.
Lebesgue integration and the spaces $L^p(\Omega)$
2.13. Prove Lemma 1.
2.14. Verify that the integral of the nth simple function approximating the
function f in Example 13 has the value ~ - 2}+1'
2.15. For the function f defined by
$$f(x) = \begin{cases} -1 & \text{if } -1 \le x < 0, \\ +1 & \text{if } 0 \le x \le 1, \\ 0 & \text{if } |x| > 1, \end{cases}$$
find f+ and f- and determine the integral using (2.8). Repeat the
exercise for the case in which
$$g(x) = \begin{cases} -1 & \text{if } -1 \le x < 0, \\ +1 & \text{if } x \ge 0, \\ 0 & \text{if } x < -1. \end{cases}$$
2.16. Show that the Lebesgue integral of f exists if and only if that of Ifl
exists, and that
From elementary courses in vector algebra and analysis we know that the
idea of a vector as a directed line segment is not sufficient for us to build up
a nontrivial theory, let alone be of use in concrete applications. Additional
structure has to be added: we agree to add together vectors using the
parallelogram law, and we define various forms of multiplication of vectors,
for example, the scalar (dot) product and the vector (cross) product. Once
these properties have been adopted, it becomes possible to construct a
fairly sophisticated theory.
The same is true of sets in general. A set without structure is sterile,
and not of much use from the point of view of the analyst. The question of
what kinds of properties to assume is generally answered by looking at the
properties of simple sets like $\mathbb{R}$ or the set of vectors, and by generalizing
accordingly. This process of generalization is a recurrent theme in the next
few chapters, and in this chapter we begin the process by defining first
a vector space to be, broadly speaking, an arbitrary set whose members
behave as vectors. Then we show how properties such as "length", "dis-
tance" and "scalar product" can be defined for vector spaces, leading to
the notions of normed and inner product spaces.
3. there is a vector 0 called the zero vector that has the property u + 0 =
u for all vectors u;
4. there is a vector -u, called the negative of u, that has the property
u + (-u) = 0 (we normally write this as u - u = 0). This in turn
defines subtraction: by the difference u - v we then mean the vector
u + (-v) (Figure 3.1);
5. $(\alpha\beta)u = \alpha(\beta u)$;
6. $(\alpha + \beta)u = \alpha u + \beta u$, and $\alpha(u + v) = \alpha u + \alpha v$;
7. $1 \cdot u = u$ (this, with 6, tells us that $u = (1 + 0)u = 1 \cdot u + 0 \cdot u = u + 0 \cdot u$,
so that $0 \cdot u = 0$).
Now all of these properties of vectors are readily generalized to any set,
and this is what we do next.
Vector space. Let X be a set, and let $\mathbb{K}$ be either the set $\mathbb{R}$ of real numbers
or the set $\mathbb{C}$ of complex numbers, either of these being referred to here as
scalars, for convenience. Then X is called a vector space (or linear space)
3.1 Vector spaces and subspaces
VS3. there is an element 0 of X called the zero element that has the
property
u + 0 = u for all $u \in X$;
VS6. $(\alpha + \beta)u = \alpha u + \beta u$, and $\alpha(u + v) = \alpha u + \alpha v$ for all scalars $\alpha, \beta$ and
for all $u, v \in X$;
VS7. $1 \cdot u = u$.
When $\mathbb{K}$ is chosen to be the real numbers, then X is called a real vector
space, whereas it is referred to as a complex vector space if $\mathbb{K}$ is chosen to
be $\mathbb{C}$. These two sets do not exhaust the choices of scalars that may be
made, but they more than suffice for our needs.
Examples
The zero element is 0 = (0, ..., 0) and the element -x is given by
$-x = (-x_1, \ldots, -x_n)$.
3. The set $\mathbb{C}^n$ of n-tuples of complex numbers is a complex vector space,
the operations of addition and scalar multiplication being defined as
in the case of $\mathbb{R}^n$, with the scalars now being complex numbers.
3. Vector spaces, normed, and inner product spaces
5. The space $L^p(\Omega)$ is a vector space for $1 \le p < \infty$; this follows from
the Minkowski inequality for integrals
$$\left[\int_\Omega |u \pm v|^p\,dx\right]^{1/p} \le \left[\int_\Omega |u|^p\,dx\right]^{1/p} + \left[\int_\Omega |v|^p\,dx\right]^{1/p}, \qquad (3.1)$$
$$\left[\int_\Omega |\alpha u + \beta v|^p\,dx\right]^{1/p} \le \left[\int_\Omega |\alpha u|^p\,dx\right]^{1/p} + \left[\int_\Omega |\beta v|^p\,dx\right]^{1/p}$$
and this last expression is finite since u and v belong to $L^p(\Omega)$. Hence
$\alpha u + \beta v \in L^p(\Omega)$. The remaining axioms are readily shown to be
valid.
The space $L^\infty(\Omega)$ is likewise a vector space, as is easily verified.
As in the case of $C^m(\Omega)$, the spaces $L^p(\Omega)$ are real or complex vector
spaces, according as the functions are real- or complex-valued. If
complex-valued, then $|\cdot|$ in the Minkowski inequality is interpreted
as the modulus of a complex number.
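The Minkowski inequality (3.1) can be checked numerically for particular functions. The sketch below is only illustrative: it replaces the integrals by midpoint sums on $(0, 1)$, and the helper name, the exponents tried, and the sample functions $u(x) = x$ and $v(x) = 0.5 - x$ are assumptions made for the demonstration.

```python
def lp_norm(g, a, b, p, n=2000):
    # Midpoint approximation of (integral of |g|^p over (a, b))^(1/p).
    dx = (b - a) / n
    total = sum(abs(g(a + (k + 0.5) * dx)) ** p for k in range(n)) * dx
    return total ** (1.0 / p)

u = lambda x: x          # sample function (assumed)
v = lambda x: 0.5 - x    # sample function that changes sign (assumed)

results = []
for p in (1, 2, 3.5):
    lhs = lp_norm(lambda x: u(x) + v(x), 0.0, 1.0, p)
    rhs = lp_norm(u, 0.0, 1.0, p) + lp_norm(v, 0.0, 1.0, p)
    results.append((p, lhs, rhs))
    print(p, lhs, rhs)
```

Since $u + v$ is the constant 0.5 here, the left-hand side equals 0.5 for every p, while the right-hand side is strictly larger.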
Since all vector spaces are sets, it is natural to enquire whether subsets of
vector spaces are also vector spaces. This is not always true, but in those
cases in which it is true we give the subset a special name.
FIGURE 3.2. Planes passing through the origin are subspaces of $\mathbb{R}^3$
Examples
7. Consider the vector space $\mathbb{R}^3$; all points of the form (x, y, 0) form
a subspace of $\mathbb{R}^3$ - the xy plane, in common parlance - since sums
of multiples of points in the xy plane also lie in this plane. Indeed,
the set of points of any plane or line passing through the origin is a
subspace of $\mathbb{R}^3$ (Figure 3.2).
8. The set $P_3[0, 1]$ of polynomials of degree $\le 3$ forms a subset of $C[0, 1]$
and constitutes a subspace: for any polynomials $p(x), q(x) \in P_3[0, 1]$,
$$\alpha p(x) + \beta q(x)$$
is also a polynomial of degree $\le 3$, and therefore belongs to $P_3[0, 1]$.
9. The set $C(\Omega)$ of bounded continuous functions forms a subspace of
$L^p(\Omega)$ (see Section 2.3) for $1 \le p \le \infty$.
Sum of subspaces. Given two subspaces V, W of a vector space X, we
define the sum of V and W, denoted by V + W, to be the set of all members
of X of the form v + w with $v \in V$ and $w \in W$. In other words,
$$V + W = \{u \in X : u = v + w \text{ for } v \in V,\ w \in W\}.$$
The set V + W is also a subspace of X since if $u$ and $\bar{u}$ are members of
V + W, so that $u = v + w$ and $\bar{u} = \bar{v} + \bar{w}$ with $v, \bar{v} \in V$ and $w, \bar{w} \in W$,
then it follows that
Example
X = U + V and X = U + W
$$u \cdot v = |u||v| \cos\theta,$$
where $\theta$ is the angle between u and v (Figure 3.4). The inner product has
the following properties. It is symmetric ($u \cdot v = v \cdot u$), linear
($(\alpha u + \beta v) \cdot w = \alpha u \cdot w + \beta v \cdot w$), and positive-definite ($u \cdot u \ge 0$ and $u \cdot u = 0$ iff u = 0).
Furthermore, the scalar product in turn provides a means of measuring the
length or norm of a vector: indeed, for any vector u, $|u| = (u \cdot u)^{1/2}$. And
finally, when equipped with the scalar product operation it is possible to
measure the distance between two points x and y in $\mathbb{R}^3$: if this distance is
denoted by d(x, y), then
$$d(x, y) = \sqrt{(y - x) \cdot (y - x)} = |y - x| = \sqrt{(y_1 - x_1)^2 + (y_2 - x_2)^2 + (y_3 - x_3)^2}.$$
The function d(·, .), being a device for measuring distances between points,
is called the metric. The concepts of inner product, norm, and metric are
defined in much the same way for arbitrary vector spaces. In this section
we take the first step in this direction, and deal with the inner product.
and
see this we note first of all that, with the use of the inner product axioms
and the properties of complex conjugation,
This follows from the observation that, for any $u \in X$, (u, v) = (u + 0, v) =
(u, v) + (0, v). Comparison of the left- and right-hand sides gives the desired
result. In the same way it follows that (v, 0) = 0 for any $v \in X$.
Examples
12. Let $X = \mathbb{R}^3$; then the conventional or Euclidean scalar product defined
by
$$(u, v) \equiv \int_a^b u(x)\overline{v(x)}\,dx \quad \text{for } u, v \in L^2(a, b). \qquad (3.4)$$
We have, in particular,
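$$(v, u) = \int_a^b v(x)\overline{u(x)}\,dx = \int_a^b \overline{u(x)\overline{v(x)}}\,dx = \overline{\int_a^b u(x)\overline{v(x)}\,dx} = \overline{(u, v)}$$
$$(\alpha u + \beta v, w) = \int_a^b [\alpha u(x) + \beta v(x)]\overline{w(x)}\,dx = \alpha\int_a^b u(x)\overline{w(x)}\,dx + \beta\int_a^b v(x)\overline{w(x)}\,dx = \alpha(u, w) + \beta(v, w)$$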
and so Axiom CIP3 is satisfied. Finally,
$$(u, u) = \int_a^b u(x)\overline{u(x)}\,dx = \int_a^b |u(x)|^2\,dx,$$
FIGURE 3.5. Orthogonal vectors
$$(u, v) = 0.$$
Example
14. Consider the functions $u(x) = \sin x$ and $v(x) = \cos x$, with
$u, v \in L^2(-\pi, \pi)$. Making use of the inner product (3.4) (but bearing in
mind that we are dealing with real-valued functions here) we find
that
$$(u, v) = \int_{-\pi}^{\pi} \sin x \cos x\,dx = 0$$
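The orthogonality of these two functions can be confirmed numerically. The following sketch approximates the real form of the inner product (3.4) by a midpoint sum; the helper name and the number of subintervals are arbitrary choices for illustration.

```python
import math

def inner_product(u, v, a, b, n=2000):
    # Midpoint approximation of the real L^2(a, b) inner product, the integral of u*v.
    dx = (b - a) / n
    return sum(u(a + (k + 0.5) * dx) * v(a + (k + 0.5) * dx) for k in range(n)) * dx

uv = inner_product(math.sin, math.cos, -math.pi, math.pi)  # expected to vanish
uu = inner_product(math.sin, math.sin, -math.pi, math.pi)  # integral of sin^2, i.e. pi
print(uv, uu)
```

The second value shows that sin x itself is not of unit length in this inner product: its squared norm on $(-\pi, \pi)$ is $\pi$.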
$$u \cdot v = |u||v| \cos\theta$$
or
This property in fact holds for any inner product space, as the next result
shows.
(3.5)
PROOF. We assume that neither u nor v is zero; for the case in which either
of these is zero, (3.5) is satisfied trivially. The proof then follows from the
observation that, for any complex number $\alpha$,
$$(u - \alpha v, u - \alpha v) \ge 0,$$
using Axiom CIP3. Upon expansion and use of the axioms of linearity and
Hermitian symmetry this becomes
$$0 \le (u, u) - (\alpha v, u) - (u, \alpha v) + (\alpha v, \alpha v) = (u, u) - (\alpha v, u) - \overline{(\alpha v, u)} + (\alpha v, \alpha v) = (u, u) - 2\,\mathrm{Re}[\alpha(v, u)] + |\alpha|^2 (v, v),$$
where Re z denotes the real part of a complex number z. Now $\alpha$ is arbitrary,
so if we choose $\alpha$ to be equal to $\overline{(v, u)}/(v, v)$, then $|\alpha| = |(v, u)|/(v, v)$
(remember that (v, v) is real) and
FIGURE 3.6. Axioms N3 and N4 as they apply to vectors
Examples
15. Let $X = \mathbb{R}^3$; then the usual or Euclidean norm defined on $\mathbb{R}^3$ is
$$\|x\| = (x_1^2 + x_2^2 + x_3^2)^{1/2} \quad \text{for } x = (x_1, x_2, x_3).$$
(3.6)
The first three norm axioms obviously hold; to verify the triangle
inequality we note that, for any two functions u and v in $L^\infty(a, b)$,
u(x) and v(x) are simply real or complex numbers, so that
$$|u(x) + v(x)| \le |u(x)| + |v(x)|;$$
thus, recalling the properties of the supremum in Chapter 1, and bearing
in mind that these properties carry over to the essential supremum,
we have (Figure 3.7)
$$\|u + v\|_{L^\infty} = \operatorname{ess\,sup}|u(x) + v(x)| \le \operatorname{ess\,sup}(|u(x)| + |v(x)|) \le \operatorname{ess\,sup}|u(x)| + \operatorname{ess\,sup}|v(x)| = \|u\|_{L^\infty} + \|v\|_{L^\infty}.$$
3.3 Normed spaces
$$\|u\| = (u, u)^{1/2}, \qquad (3.8)$$
and we say that $\|\cdot\|$ in (3.8) is the norm generated by the inner product.
The analogy with vectors in $\mathbb{R}^3$ is clear: given the scalar (inner) product
defined for vectors, the norm or length of a vector u is given by
The question now arises: is $\|\cdot\|$ defined in (3.8) really a norm? That is, does
it satisfy all of the norm axioms? The answer, of course, is yes: first, the
quantity (u, u) is real, so that N1 is satisfied. Second, positive-definiteness
of $\|u\|$ follows directly from the positive-definiteness of the inner product.
Positive homogeneity is verified by considering that, for any complex $\alpha$,
$$\|\alpha u\|^2 = (\alpha u, \alpha u) = \alpha\bar{\alpha}(u, u) = |\alpha|^2 \|u\|^2$$
using properties CIP2 and CIP3 of the inner product. Finally, in order to
show that the triangle inequality is satisfied we consider
$$\begin{aligned} \|u + v\|^2 &= (u + v, u + v) \\ &= (u, u) + 2\,\mathrm{Re}(u, v) + (v, v) \\ &= \|u\|^2 + 2\,\mathrm{Re}(u, v) + \|v\|^2 \\ &\le \|u\|^2 + 2|(u, v)| + \|v\|^2 \\ &\le \|u\|^2 + 2\|u\|\|v\| + \|v\|^2 \quad \text{(using the Cauchy-Schwarz inequality)} \\ &= (\|u\| + \|v\|)^2. \end{aligned}$$
The desired result is now obtained by taking the square root of both sides.
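The chain of inequalities above can be tried out on concrete vectors. The sketch below works in $\mathbb{C}^3$ with the inner product $(u, v) = \sum_i u_i \overline{v_i}$, which is linear in its first argument; the function names and the sample vectors are hypothetical and serve only as an illustration.

```python
def inner(u, v):
    # Inner product on C^n, linear in the first argument: sum of u_i * conj(v_i).
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm(u):
    # Norm generated by the inner product; (u, u) is real and nonnegative.
    return inner(u, u).real ** 0.5

u = [1 + 2j, -3j, 0.5 + 0j]   # sample vectors (assumed)
v = [2 - 1j, 1 + 1j, -4 + 0j]
w = [a + b for a, b in zip(u, v)]

print(abs(inner(u, v)), norm(u) * norm(v))   # Cauchy-Schwarz: first <= second
print(norm(w), norm(u) + norm(v))            # triangle inequality: first <= second
```

Both comparisons hold with room to spare for these vectors; equality in either would require u and v to be linearly dependent.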
With the understanding that the norm on an inner product space is that
generated by the inner product, the Cauchy-Schwarz inequality may be
written in the alternative form
(3.9)
for all $u, v \in X$. Comparison of (3.9) with Figure 3.8 should make clear
the reason for referring to this identity as the parallelogram law. Indeed,
from the cosine rule in $\mathbb{R}^2$, $\|u - v\|^2 = \|u\|^2 + \|v\|^2 - 2\|u\|\|v\| \cos\theta$ and
$\|u + v\|^2 = \|u\|^2 + \|v\|^2 - 2\|u\|\|v\| \cos(180^\circ - \theta)$; adding, we obtain (3.9).
The proof for the more general case is easily carried out (see Exercise
3.10). Since the parallelogram law holds for any norm generated by an
inner product it follows that, for any normed space X, if the norm does not
satisfy the parallelogram law, then there is no inner product that generates
this norm. When this is so, the space X is not an inner product space.
Example
20. Consider $C[0, 1]$ with the sup-norm $\|\cdot\|_\infty$. Then choosing
Thus
$$\|u\|_\infty^2 + \|v\|_\infty^2 = 2.$$
The parallelogram law does not hold, and so $C[0, 1]$ with the sup-norm
is not an inner product space. In the same way we can show
that $C(\Omega)$ with the sup-norm is also not an inner product space.
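A failure of the parallelogram law for the sup-norm is easy to exhibit numerically. The sketch below uses $u(x) = 1$ and $v(x) = x$ on $[0, 1]$; this pair is an assumption made for illustration and is not necessarily the pair used in the example. The helper name and grid size are likewise arbitrary.

```python
def sup_norm(g, n=1000):
    # Grid approximation of the sup-norm on [0, 1]; exact here because the
    # sample functions attain their maxima at grid points.
    return max(abs(g(k / n)) for k in range(n + 1))

u = lambda x: 1.0   # assumed sample function
v = lambda x: x     # assumed sample function

lhs = sup_norm(lambda x: u(x) + v(x)) ** 2 + sup_norm(lambda x: u(x) - v(x)) ** 2
rhs = 2.0 * (sup_norm(u) ** 2 + sup_norm(v) ** 2)
print(lhs, rhs)  # the two sides differ, so the parallelogram law fails
```

For this pair each function has sup-norm 1, so the squared norms sum to 2, while the left-hand side of the law comes out larger than twice that sum.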
Equivalent norms. It has already been pointed out that a norm is not a
unique object, in the sense that a variety of norms may be defined on any
given vector space. Suppose then that two alternative norms $\|\cdot\|_A$ and $\|\cdot\|_B$
are defined on a vector space X. These norms are said to be equivalent to
each other if there are positive constants m and M such that
(3.10)
for all U E X.
Example
21. Consider the case $X = \mathbb{R}^2$, with the norms $\|\cdot\|_2$ and $\|\cdot\|_\infty$ defined
in Examples 15 and 16. Since $|x_1| \le \|x\|_2$ and $|x_2| \le \|x\|_2$, it follows
that $\max_i |x_i| \equiv \|x\|_\infty \le \|x\|_2$. Furthermore, $|x_1| \le \|x\|_\infty$ and
$|x_2| \le \|x\|_\infty$; squaring and adding, we find that $\|x\|_2^2 \le 2\|x\|_\infty^2$. Thus
$$\|x\|_\infty \le \|x\|_2 \le \sqrt{2}\,\|x\|_\infty$$
and $\|\cdot\|_2$ and $\|\cdot\|_\infty$ are equivalent norms.
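The two-sided bound of Example 21 is easy to test on sample vectors. The following sketch draws random points of $\mathbb{R}^2$ (the sample size, range, and seed are arbitrary assumptions) and checks both inequalities; the helper names are introduced only for this illustration.

```python
import math
import random

def norm_2(x):
    return math.sqrt(sum(c * c for c in x))

def norm_inf(x):
    return max(abs(c) for c in x)

def satisfies_bounds(x, tol=1e-12):
    # Checks ||x||_inf <= ||x||_2 <= sqrt(2) ||x||_inf for x in R^2.
    return (norm_inf(x) <= norm_2(x) + tol
            and norm_2(x) <= math.sqrt(2) * norm_inf(x) + tol)

random.seed(0)
samples = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(1000)]
all_ok = all(satisfies_bounds(x) for x in samples)

print(all_ok)
```

Neither constant can be improved: the lower bound is attained on the coordinate axes, for instance at $x = (3, 0)$, and the upper bound on the diagonals, for instance at $x = (1, 1)$.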
Metric and metric space. Let X be a set. If u and v are two members
of X, a metric on X is a real number $d(u,v)$ with the following properties,
for any $u, v, w \in X$.
Examples
$$d(u,v) = \begin{cases} 0 & \text{if } v = u, \\ 1 & \text{otherwise.} \end{cases} \qquad (3.12)$$
3.6 Exercises
Vector spaces and subspaces
3.1. Which of the following are vector spaces?
(a) the set of $m \times n$ matrices;
(b) the set of $m \times m$ matrices with determinant equal to 1;
(c) the set of points $X = \{x : x = (x_1, x_2) \in \mathbb{R}^2,\ x_2 \ge 0\}$ (that is,
the upper half plane);
(d) the set of solutions to the differential equation
$$a(x)\frac{d^2u}{dx^2} + b(x)\frac{du}{dx} + c(x)u = 0, \quad 0 < x < 1;$$
$$a(x)\frac{d^2u}{dx^2} + b(x)\frac{du}{dx} + c(x)u + d(x) = 0, \quad 0 < x < 1.$$
3.2. Consider the vector space $\mathbb{R}^2$ of ordered pairs. Which of the following
subsets of $\mathbb{R}^2$ are subspaces?
for some $v \in V$, $w \in W$.
3.6. The purpose of this exercise is to prove the Minkowski inequality for
integrals
$$\left[\int_\Omega |u \pm v|^p\,dx\right]^{1/p} \le \left[\int_\Omega |u|^p\,dx\right]^{1/p} + \left[\int_\Omega |v|^p\,dx\right]^{1/p}$$
[Figure: the curve $y = x^{p-1}$, or $x = y^{q-1}$.]
(b) Use the identity
3.8. Consider the space $C^m[0,1]$ with inner product $(\cdot,\cdot)_m$ defined by
Given $u(x) = x^3$ and $v(x) = 1 - (3x^2/2)$, show that u and v are
orthogonal with respect to the inner product $(\cdot,\cdot)_0$. Are they orthogonal
with respect to $(\cdot,\cdot)_1$? Verify the Cauchy-Schwarz inequality using
the inner product $(\cdot,\cdot)_1$.
Normed spaces
3.12. If X is a real inner product space, show that $\|x - y\| + \|y - z\| = \|x - z\|$
if and only if $y = \alpha x + (1-\alpha)z$, where $0 \le \alpha \le 1$, and $\|\cdot\|$ is
the norm generated by the inner product on X. Interpret this result
for the case $X = \mathbb{R}^2$.
$$\|u\| = \left[\int^{b} \left(\frac{du}{dx}\right)^2 dx\right]^{1/2}, \quad u \in X$$
satisfies the norm axioms for the case in which X is the space
3.17. Let X be a real inner product space. Show that $u \perp v$ in X if and
only if $\|u + \alpha v\| = \|u - \alpha v\|$ for all real numbers $\alpha$, where $\|\cdot\|$ is the
norm generated by the inner product on X. Illustrate this result in
$\mathbb{R}^2$.
3.18. The distance from a point x in a normed space X to a closed and
bounded subset B of X is defined by $d(x, B) = \inf\{\|x - y\| : y \in B\}$.
Calculate $d(x, B)$ if $X = \mathbb{R}^2$, $x = (1, 1)$, B is the closed disk of radius
~ and center (~, 0), and X has (i) the Euclidean norm; and (ii) the
norm II . 111 (see Example 16).
3.19. The purpose of this exercise is to show that
sum over 1 to n, and manipulate to get the Hölder inequality for sums
n
3.20. For any normed space V, the unit ball with center 0 and radius r is
defined by $B(0,r) = \{u \in V : \|u\| \le r\}$. Sketch $B(0,r)$ for the case
in which $V = \mathbb{R}^2$ with the norms $\|\cdot\|_p$ for p = 1, 2, and $\infty$.
3.21. Show that $\|\cdot\|_1$ and $\|\cdot\|_2$ are equivalent norms on $\mathbb{R}^2$.
3.22. The aim of this exercise is to show that, for a bounded domain $\Omega$,
(3.14)
$$\frac{1}{p} + \frac{1}{q} = \frac{1}{r} \quad \text{or} \quad \frac{1}{(p/r)} + \frac{1}{(q/r)} = 1. \qquad (3.15)$$
(3.16)
Metric spaces
3.24. Let $D = \{z \in \mathbb{C} : |z| \le 1\}$ be the closed unit disk in the complex
plane, and define
Normed and inner product spaces possess a wealth of properties, and these
in turn allow sophisticated theories to be developed and applied in a variety
of contexts. Some of these properties are introduced in this chapter.
Arguably the most basic concept, and one which pervades most discus-
sions involving normed spaces, is that of convergence of sequences. Se-
quences were introduced in Chapter 1, in the context of real and complex
numbers. We show in Section 4.1 that the definition of convergence of a
sequence in a normed space is a natural extension of that given in Chapter
1.
In Section 4.2 we focus attention on sequences in spaces of functions;
these are a special case which occurs so often in the future as to warrant
devoting some time to the elucidation of their characteristics.
The notion of completeness pervades functional analysis, and complete
normed and inner product spaces are sufficiently important to be given
special names: a complete normed space is called a Banach space and a
complete inner product space is known as a Hilbert space. We describe
completeness in Section 4.3, and then show in Section 4.4 how completeness
of a space is related to the closedness of that space. We also discuss in this
section the issue of how to complete a space that lacks this property.
Finally, in Section 4.5, we discuss further properties of inner product
spaces. In particular, we extend to arbitrary Hilbert spaces a property that
is fairly obvious in three-dimensional space: $\mathbb{R}^3$ may be decomposed into two
orthogonal subspaces (a simple example, once a set of Cartesian axes has
been introduced, would be the xy-plane and the z-axis), and every vector
may be written uniquely as the sum of orthogonal components in these
two subspaces, as shown in Figure 4.1. The generalization of this notion
to arbitrary Hilbert spaces is known as the projection theorem, which also
features later on.
4.1 Sequences
Sequences of numbers were defined in Chapter 1; here we look at sequences
in normed spaces generally. A sequence in a normed space X is an ordered
set in X whose members can be labeled with positive integers. We write
$\{u_1, u_2, \ldots\}$ or $\{u_k\}_{k=1}^{\infty}$.
Example
FIGURE 4.2. The sequence of functions with general member $u_n(x) = n(x - a)$
some element $u \in Y$ if, for any $\varepsilon > 0$, it is always possible to make $\|u_n - u\|$
smaller than $\varepsilon$ simply by choosing n large enough, larger than some number
N, say (Figure 4.3). The groundwork for a precise definition of convergence
has now been laid.
$$\lim_{n\to\infty} \|u_n - u\| = 0 \quad \text{or} \quad \lim_{n\to\infty} u_n = u, \qquad (4.2)$$
FIGURE 4.4. An illustration of the concept of uniform convergence
Suppose that this sequence is convergent in the sup-norm; that is, given
any $\varepsilon > 0$ it is possible to find a number N such that
for all $x \in [a, b]$, whenever $n > N$. But since $|u_n(x) - u(x)| \le \sup |u_n(x) - u(x)|$,
it follows that (4.3) also holds.
In other words, convergence in the sup-norm implies uniform convergence.
Conversely, suppose that $\{u_n\}$ is a uniformly convergent sequence,
so that (4.3) holds. Then $\varepsilon$ is an upper bound for $|u_n(x) - u(x)|$, for any
x in [a, b]. But this implies that the least upper bound or supremum of
$|u_n(x) - u(x)|$ must also be less than $\varepsilon$, so that
$$\|u_n - u\|_\infty \equiv \sup |u_n(x) - u(x)| < \varepsilon \quad \text{for all } x \in [a, b],\ n > N,$$
or alternatively
FIGURE 4.5. Nonuniform convergence of the sequence in Example 2
Examples
2. Let $u_n = x^n$, defined on [0,1]. This sequence converges pointwise
to 0 for $0 \le x < 1$, and to 1 at x = 1. If we set $u(x) = 0$, $0 \le x < 1$,
and $u(x) = 1$ for x = 1, then $u_n \to u$ pointwise, but the sequence
does not converge uniformly on [0,1]. However, it does converge uniformly
to zero on [0, b], where $0 < b < 1$, since in this case
$\sup |u_n(x) - u(x)| = b^n$ which goes to 0 as $n \to \infty$.
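The contrast between $[0,1]$ and $[0,b]$ can be made concrete. In the sketch below, the point $x_n = (1/2)^{1/n}$ is a choice made here for illustration: it lies in $(0,1)$ and satisfies $u_n(x_n) = 1/2$ for every n, so the sup-norm error on $[0,1]$ never falls below $1/2$; on $[0,b]$ the error is $b^n$. The function names are hypothetical.

```python
def u_n(x, n):
    return x ** n

def limit(x):
    # Pointwise limit on [0, 1]: 0 for x < 1 and 1 at x = 1.
    return 1.0 if x == 1.0 else 0.0

def witness_error(n):
    # x = (1/2)^(1/n) lies in (0, 1) and gives |u_n(x) - u(x)| = 1/2 up to rounding,
    # so the sup over [0, 1] cannot tend to zero.
    x = 0.5 ** (1.0 / n)
    return abs(u_n(x, n) - limit(x))

def sup_error_on(b, n):
    # On [0, b] with b < 1 the error x^n is increasing in x, so the sup is b^n.
    return b ** n

for n in (1, 10, 100, 1000):
    print(n, witness_error(n), sup_error_on(0.9, n))
```

A plain grid maximum on $[0,1]$ would be misleading here, since for a fixed grid the largest grid point below 1 eventually also gives a small value of $x^n$; the moving witness point avoids this.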
FIGURE 4.6. The sequence of functions in Example 3
What this definition states is that if one takes a sequence of points that
converges, then these points are mapped to a sequence of numbers (real or
complex) $f(x_1), f(x_2), \ldots$ which, first, converges, and second, the limit of
which coincides with $f(x)$. These ideas are illustrated in Figure 4.7. Now
there is little point in having alternative definitions of the same concept
unless these are equivalent, so it is essential that we establish the connection
between the $\varepsilon$-$\delta$ definition of continuity and the sequential definition. These
are in fact equivalent, as the following theorem confirms.
PROOF. First assume that the $\varepsilon$-$\delta$ definition (2.2) is valid. Then given any
$\varepsilon > 0$, there exists $\delta$ such that $|f(y) - f(x)| < \varepsilon$ when $y \in \Omega$ and $|y - x| < \delta$.
or
or
$$\lim_{n\to\infty} \int_\Omega |u_n(x) - u(x)|^p\,dx = 0. \qquad (4.9)$$
Example
which goes to zero as $n \to \infty$. It can be shown that $u_n \to 0$ in the
$L^p$-norm for any $p > 1$.
4.3 Completeness
As we have seen, convergent sequences all have the property that the distance
between successive members of a sequence, measured by means of
some appropriate norm, becomes progressively smaller, and the sequence
approaches a definite limit which is, moreover, a member of the normed
space concerned. Unfortunately, the situation is not always so clear-cut:
some normed spaces have the deficiency that, although it is possible to set
up sequences in these spaces with the property that the distance between
successive members becomes progressively smaller, the sequence does not
in fact have a limit in this space. For example, suppose we take a look at the
half-open interval (0, 1] with the norm $\|\cdot\| = |\cdot|$, and consider the sequence
$\{u_n\} = \{1/n\}_{n=1}^{\infty}$. This sequence behaves in all respects as a convergent
sequence, and converges to 0, but 0 is not in the space (0, 1]!
This behavior is undesirable for a number of reasons, and we always
make a strong distinction between spaces in which sequences that behave as
convergent sequences do in fact converge to a limit and, on the other hand,
those spaces in which the limits of such sequences are possibly "missing".
In order to proceed with the discussion, we first need to have a means
of identifying sequences with the property that the distance between successive
members decreases. These are called Cauchy sequences, and their
definition makes no reference to the notion of convergence, or of a limit,
since it is possible for such sequences not to converge.
or, more formally, if for any given $\varepsilon > 0$ there exists a number N such that
$$\|u_m - u_n\| < \varepsilon \quad \text{whenever } m, n > N. \qquad (4.11)$$
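Condition (4.11) can be checked directly for the sequence $u_n = 1/n$ in $(0,1]$ discussed above. The following sketch uses the choice $N = \lceil 1/\varepsilon \rceil$, which works because $|1/m - 1/n| < 1/N$ whenever $m, n > N$; the function names and the finite window of indices tested are assumptions made for the illustration.

```python
import math

def cauchy_index(eps):
    # Any N >= 1/eps works, since |1/m - 1/n| < 1/N for all m, n > N.
    return math.ceil(1 / eps)

def check(eps, span=200):
    # Verify (4.11) for a finite window of indices beyond N.
    N = cauchy_index(eps)
    return all(abs(1 / m - 1 / n) < eps
               for m in range(N + 1, N + span)
               for n in range(N + 1, N + span))

print(check(0.1), check(0.003))
```

The check involves only members of the sequence, never its limit; this is the point of the definition, since the limit 0 does not belong to $(0,1]$.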
Every convergent sequence is a Cauchy sequence (see Exercise 4.13), but
the point has been made that not every Cauchy sequence is convergent, for
the simple reason that, although the members may be converging to a limit,
the limit may not be part of the space. When this is so, then we say that the
space is incomplete. The situation may be remedied, however, by adding to
the space those elements that are the limits of Cauchy sequences but which
were not originally in the space. This process is called completion of the
space, which is then said to be complete. We discuss completions in more
detail in the next section, but we first define formally a complete space,
and then give some simple but important examples of complete spaces.
Example
6. The set $\mathbb{R}^n$ with any of the norms $\|\cdot\|_p$ defined by
$\|x\|_p = \left[\sum_{i=1}^{n} |x_i|^p\right]^{1/p}$
for $1 \le p < \infty$ is complete, as is $\mathbb{R}^n$ with the norm $\|\cdot\|_\infty$ defined by
$\|x\|_\infty = \max_{1 \le i \le n} |x_i|$. This follows from the completeness of $\mathbb{R}$ (see
Exercise 4.12).
7. The space $C[0,1]$ with the integral norm $\|u\|^2 = \int_0^1 u^2\,dx$ is not
complete. To see this, consider the sequence $\{u_n\}$ defined by
OS:x<~,
~<:::x<:::1.
0< X < ~,
u(x) = { ~: ~<:::x<:::1.
Hence $C[0,1]$ (and in general $C[a,b]$) is not complete in the $L^2$-norm,
and indeed it is not complete in the $L^p$-norm for any p such that
$1 \le p < \infty$. In a similar way we may show that $C(\bar{\Omega})$ with the $L^p$
norm $\|u\| = \left[\int_\Omega |u|^p\,dx\right]^{1/p}$ is not complete, for $1 \le p < \infty$.
FIGURE 4.8. The sequence in Example 9
Examples
10. $\mathbb{R}^n$ with the norm $\|\cdot\|_p$ defined in Example 6, and with $1 \le p \le \infty$,
is a Banach space, and $\mathbb{R}^n$ with the norm $\|\cdot\|_2$ is a Hilbert space.
11. The space $C[a,b]$ with the norm $\|\cdot\|_\infty$ is a Banach space, as are
the spaces $L^p(a,b)$ with the $L^p$-norm. The space $L^2(a,b)$ with the
$L^2$-norm is a Hilbert space.
Examples
12. We have already met neighborhoods of points in $\mathbb{R}^n$ (see Chapter 1,
Section 3); there, the norm used is of course the Euclidean norm.
13. Consider the space $C[0,1]$ with the sup-norm $\|u\|_\infty = \sup |u(x)|$. An
open neighborhood of the function $u_0(x)$ of radius $\varepsilon$ is the set of all
continuous functions on [0,1] for which (Figure 4.9)
$$\|u - u_0\|_\infty = \sup_{x \in [0,1]} |u(x) - u_0(x)| < \varepsilon.$$
14. In the normed space $L^2(0,1)$ with the usual $L^2$-norm, an open neighborhood
of $u_0$ is the set of all functions $u \in L^2(0,1)$ for which
Examples
15. To start with, it may be worth reviewing some of the examples in
Sections 1.2 and 1.3.
16. The set $B(u_0, r) := \{u : u \in X,\ \|u - u_0\| < r\}$, where X is any normed
space, is called the open ball with center $u_0$ and radius r, and is an
open set. Indeed, for any point v in $B(u_0, r)$ the open neighborhood
$N(v, \varepsilon)$ lies entirely in $B(u_0, r)$ provided that $\varepsilon$ is less than d, the
shortest distance from v to the boundary of $B(u_0, r)$ (Figure 4.10).
More formally, set
FIGURE 4.10. The open ball $B(u_0, r)$ with center $u_0$ and radius r
18. Consider the space $C[a,b]$ with the sup-norm, and let $V = \{u \in
C[a,b],\ |u(x)| \le 1\}$. Then V is closed, since V is in fact the closed
ball of radius 1, centered at $u_0(x) = 0$, as can be seen in Figure 4.11.
PROOF. First assume that Y is closed, and consider the convergent se-
$u \in V$
PROOF. This is not too difficult, and is left as an exercise (see Exercise
4.21). □
Example
19. Let $X = C[0,1]$ and $Y = P[0,1]$, the set of all polynomials on the
unit interval. First, we recall that $C[0,1]$ is complete in the sup-norm
$\|\cdot\|_\infty$. Now $P[0,1]$ is a subspace of $C[0,1]$, but P is not closed in C. To
see this, consider the fact that $u(x) = e^x$ is a point of accumulation of
P: for any $\varepsilon > 0$ we can always find at least one polynomial $p(x) \in P$
lying in a neighborhood of u; indeed, given any $\varepsilon > 0$, it is possible
to find a polynomial $p(x) = 1 + x + x^2/2! + \cdots + x^n/n!$ such that
tion emphasizes the need to exercise caution when generalizing from the
particular; such generalizations, or illustrations of abstract concepts in R
or Rn, are often very helpful, but there are times when one's intuition can
be misleading.
Apart from the results given in Chapters 1 and 2, compactness does not
play much of a role in subsequent developments, and we do not pursue the
topic further here.
Earlier we described in a vague fashion how an incomplete set Y may be
made complete by adding to it those limits of Cauchy sequences that were
not originally in Y. The resulting set $\bar{Y}$, say, is then called the completion of
Y. We conclude this section by recording some properties of the completion
$\bar{Y}$ of an incomplete set Y, but in order to do this it is first of all necessary
to define a few more topological concepts.
$$\|u_0 - v\| < \varepsilon.$$
Example
20. The set $\mathbb{Q}$ of rational numbers is dense in $\mathbb{R}$; the closure $\bar{\mathbb{Q}}$ (= $\mathbb{R}$) of
$\mathbb{Q}$ consists of all rational and irrational numbers.
21. The Weierstrass theorem states that, for any $u \in C[a,b]$ and for
every $\varepsilon > 0$, it is possible to find a polynomial $p \in P[a,b]$, $p(x) =
a_0 + a_1 x + \cdots$, such that
Example
x, Os lxi s ~,
~ -x, ~ < x < ~,
u(x) =
-~ -x, -~ < x < -~,
0, ~ s lxi< 1.
Then K will be the open set given by K = (- ~, 0) U (0, ~) and K =
[- ~ , ~] (note that x = 0 and x = ± ~ are points of accumulation of
K). Since K is a closed bounded subset of n, it follows that u(x) has
compact support on n (see Figure 4.13).
Dense sets in $L^p$. We are now able to show that the space $C_0(-\infty,\infty)$
of continuous functions having compact support is dense in $L^p(-\infty,\infty)$ for
$1 \le p < \infty$. This implies that any set X which contains $C_0(-\infty,\infty)$ is
itself dense in $L^p(-\infty,\infty)$.
THEOREM 6. The space $C_0(\Omega)$ is dense in $L^p(\Omega)$ for $1 \le p < \infty$, where
$\Omega$ is any open set in $\mathbb{R}^n$.
The proof may be found in some of the texts referred to at the end of
this chapter.
Example
24. We have seen that the set of rationals $\mathbb{Q}$ is dense in $\mathbb{R}$; since $\mathbb{Q}$ is
countable, it follows that $\mathbb{R}$ is separable. In the same way, $\mathbb{R}^n$ is separable:
a countable dense subset is the set $\mathbb{Q}^n$ of n-tuples of rational
numbers.
25. From the Weierstrass theorem (Example 21) we see that the countable
set $Q[a,b]$ of polynomials with rational coefficients is dense in
$C[a,b]$. Furthermore, from Theorem 6 we know that $C[a,b]$ is dense
in $L^p(a,b)$. It follows that $Q[a,b]$ is dense in $L^p(a,b)$ for $1 \le p < \infty$,
and that $L^p(a,b)$ is therefore separable.
that is, $Y^\perp$ consists of all those members of X that are orthogonal to every
member of Y. If w belongs to $Y^\perp$, we say that w is orthogonal to Y and
write $w \perp Y$. Since $(v,v) = 0$ implies that $v = 0$, it is clear that the only
member of both Y and $Y^\perp$ is the zero element: $Y \cap Y^\perp = \{0\}$.
Example
26. A canonical example of orthogonal complements is provided in $\mathbb{R}^3$.
Let Y be the real line (the $x_3$-axis, say). Then $Y^\perp$ is the $x_1x_2$-plane
since points in the $x_1x_2$-plane are orthogonal to every member of Y,
as shown in Figure 4.1.
Our main aim in this section is to show that if H is any Hilbert space and
M is a closed subspace of H, then $H = M \oplus M^\perp$; that is, every $u \in H$ has
the unique representation
$$u = v + w \quad \text{with } v \in M \text{ and } w \in M^\perp$$
(recall the discussion of direct sums in Section 3.1). Before doing so, however,
it is necessary to prove another intuitively obvious result, which is
embodied in the following theorem.
REMARK. Part (a) of the theorem says that, provided M is a closed sub-
space, we can always find a unique point Vo in M that is closer to u than
any other point in M. Furthermore, this point may be found by "dropping
a perpendicular" from u on to M. Part (b) is illustrated in Figure 4.14 for
the case in which $H = \mathbb{R}^3$, M is the xy-plane, and S the x-axis.
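Part (a) of the remark can be illustrated in $\mathbb{R}^3$ with M the xy-plane. The sketch below takes an arbitrary point u, drops the perpendicular onto M, and compares the result with randomly chosen points of M; the point, the sample, and the helper names are assumptions made only for the illustration.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def project_xy(u):
    # Drop a perpendicular from u onto M, the xy-plane.
    return (u[0], u[1], 0.0)

u = (1.0, 2.0, 3.0)          # arbitrary illustrative point
v0 = project_xy(u)           # candidate closest point in M
residual = tuple(a - b for a, b in zip(u, v0))

random.seed(1)
others = [(random.uniform(-10, 10), random.uniform(-10, 10), 0.0) for _ in range(500)]

# u - v0 is orthogonal to every sampled member of M ...
orthogonal = all(abs(dot(residual, v)) < 1e-12 for v in others)
# ... and no sampled member of M is closer to u than v0.
closest = all(dist(u, v0) <= dist(u, v) for v in others)

print(v0, orthogonal, closest)
```

A finite sample cannot prove the statement, of course; it only shows the two properties (orthogonality of $u - v_0$ to M, and minimality of the distance) holding together, as the theorem asserts.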
$$\|(v_n - u) - (v_m - u)\|^2 = -\|(v_n - u) + (v_m - u)\|^2 + 2\left(\|v_n - u\|^2 + \|v_m - u\|^2\right)$$
using the parallelogram law, Exercise 3.10. Now
and hence
Ilu - voll ~ d;
furthermore,
Hence Ilu - voll = d. The proof that Vo is unique is left as an exercise (see
Exercise 4.24).
To show that $(u - v_0, v) = 0$ for all $v \in M$, consider any point $v_0 + \alpha v$
in M; clearly,
(4.14)
(4.15)
but the right-hand side of (4.15) is less than $d^2$ since $\|u - v_0\|^2 = d^2$ and
$\beta^2/\|v\|^2 > 0$.
This leads to a contradiction, and so we must have $\beta = 0$.
(b) Choose $v \in M$ such that $v \notin S$, and let w be the point in S closest
to v (we are applying part (a) of the theorem to S). Then $v - w$ is
such a point. □
We are now ready to prove the following theorem.
(4.16)
To prove that there is only one such w, suppose that (4.16) holds for two
elements $w_1$ and $w_2$ in $M^\perp$. Then
PROOF OF LEMMA 1. First, we note that $M \subset M^{\perp\perp}$, since if v is any point
in M, then $v \perp M^\perp$ so that v also belongs to $M^{\perp\perp}$. Furthermore, since
$M^{\perp\perp}$ is closed (see Exercise 4.25), clearly $\bar{M} \subset M^{\perp\perp}$. All that remains is to
show that $\bar{M} = M^{\perp\perp}$. Suppose that $\bar{M} \ne M^{\perp\perp}$; from Theorem 7(b), there
is a nonzero point $w \in M^{\perp\perp}$ such that $w \perp \bar{M}$. Since $M \subset \bar{M}$ this means
that $w \in M^\perp$. But $w \in M^\perp \cap M^{\perp\perp}$ implies that $w = 0$, a contradiction.
Hence $M^{\perp\perp} = \bar{M}$. □
Example
27. Since $C(\bar{\Omega})$ is dense in $L^2(\Omega)$ with the $L^2$-norm, it follows from Theorem
9 that if $u \in L^2(\Omega)$ and $(u,v) = 0$ for all $v \in C(\bar{\Omega})$, then $u = 0$.
That is,
4.7 Exercises
Sequences
4.2. Let X be an inner product space and suppose that $\{u_n\}$ and $\{v_n\}$
are convergent sequences in X with limits u and v, respectively, convergence
being defined via the norm generated by the inner product
on X. Show that $(u_n, v_n) \to (u,v)$, and deduce that $(u_n, v) \to (u,v)$
and that $\|u_n\| \to \|u\|$.
4.3. If $u_n \to u$ in a normed space X and $\|u_n - w\| \le \alpha$ for some $w \in X$
and $\alpha \in \mathbb{R}$, show that $\|u - w\| \le \alpha$.
Convergence of sequences of functions
(a) $u_n(x) = \begin{cases} 0, & 0 \le x \le 1/n, \\ n, & 1/n < x < 2/n, \\ 0, & 2/n \le x \le 1; \end{cases}$
(b) $u_n(x) = n^{3/2}\, x\, e^{-n^2 x^2}$ on $[-1, 1]$.
4.8. Let $a > 0$ be a fixed real number, and define $\|u\| = \sup\{|u(x)| : |x| \le a\}$
and $|||u||| = \min(1, \|u\|)$ on the space $C(-\infty,\infty)$. Why is $\|\cdot\|$ not
a norm? Is $|||\cdot|||$ a norm?
Completeness
4.9. Show that the sequence $u_n(x) = x^{1/n}$ is a Cauchy sequence in $L^2(0,1)$.
4.11. The purpose of this exercise is to show that $C[a,b]$ is complete with
respect to the sup-norm. Let $u_n(x)$ be a Cauchy sequence; show that
$u_n(x_0)$ is a Cauchy sequence of real numbers for every fixed $x_0$ in
$[a,b]$ and deduce that $u_n(x_0)$ converges to a number $u(x_0)$, say. Next,
show that $u_n(x)$ converges uniformly to the function $u(x)$. Finally,
since $u_n \to u$ uniformly, we have
and deduce from this result and the continuity of $u_n$ that u is continuous.
4.12. Show that $\mathbb{R}^n$ with the norm $\|\cdot\|_p$ ($1 \le p \le \infty$) is complete.
4.14. Consider $C[0,1]$ with the $L^2$-norm. Show that the sequence $\{u_n\}$ is
a Cauchy sequence, where $u_n(x)$ is as shown in the following figure.
Next, show that if $u_n$ converges to $u(x)$, then we should have
U(x) = { 0,
1,
°~< x < ~,
~ x ~ 1,
4.15. Show that the set $Y = \{v \in L^2(0,1) : \int_0^1 |v(x)|\,dx = 1\}$ is complete.
4.17. Consider the space $C[a,b]$ with the sup-norm, and let M be the subset
consisting of functions v satisfying $v(a) = 0$ and $|v(x)| < 1$. Is the
function $u(x) = 1$ a point of accumulation of M?
4.18. Find the smallest value of r such that the function $v(x) = \cos 2\pi x$
lies in the closed ball with center $u_0$ and radius r in the space $C[0,1]$
with the sup-norm, where $u_0(x) = \sin 2\pi x$.
4.19. Show that a set Y in a normed space X is closed if and only if its
complement Y' = X - Y is open.
In the preceding chapters we have acquainted ourselves with some of the ba-
sic structures of normed and inner product spaces. We come now to another
fundamental concept in functional analysis, namely, that of a mapping or
operator from one space to another. At the most primitive level one re-
quires only two sets in order to define an operator from one of them to the
other, and these sets need not have any algebraic or topological structure
for the definition to make sense. Obviously, though, the really interesting
and useful properties of operators come to the fore when the two sets are
given additional structure: if the two sets are vector spaces, we can introduce
the concept of a linear operator, and if the sets are normed spaces as
well, then it is possible to construct a rich theory of linear operators on
such spaces. After a general introduction to operators in Section 5.1, we
discuss the theory of linear operators on normed spaces in Section 5.2.
Projections are a class of operators that feature strongly in later chapters
when we discuss approximations of boundary value problems. Apart from
this, much of the geometrical structure of Hilbert spaces is laid bare with
the aid of projection operators acting on these spaces. For these reasons
we devote Section 5.3 to a discussion of projection operators on Hilbert
spaces.
Operators that map members of a specified space into the real or com-
plex numbers are special, and are given a special name: these are called
functionals, and are discussed in Section 5.4. Finally, we discuss in Section
5.5 operators that map pairs of elements into the real or complex numbers
in a linear fashion; these are known as bilinear forms. Linear functionals
and bilinear forms play a central role in the study of linear boundary value
problems, as we show in Chapter 9 and subsequently.
5.1 Operators
The subject of this chapter is not entirely unfamiliar; we have all come
across both linear and nonlinear operators in earlier courses on linear algebra,
differential equations, and so on. Here we continue the process of
generalizing from the familiar. Consider the function $f(x)$, defined on the
interval $I = [-\pi/2, \pi/2]$, as shown in Figure 5.1. This familiar situation is
really just an example of the action of an operator: specifically, we have
defined f to be something that acts on any member x in I, and produces a
real number $\sin x$. Furthermore, the image $\sin x$ lies in the set $J = [-1, 1]$.
More formally we write all of this as follows:
$$f : I \to \mathbb{R}, \qquad f(x) = \sin x.$$
Here the first expression reads "f maps elements of I to elements of $\mathbb{R}$"
and the second expression tells how f does this: f acts on x to produce
$\sin x$. The set I is called the domain of the operator f, written $D(f)$. The
set $\mathbb{R}$ in which $f(x)$ takes its values is called the image space, whereas the
subset $J \subset \mathbb{R}$ consisting of all real numbers that are images of I under the
mapping f is called the range of f, written $R(f)$. We now generalize.
Let X and Y be two sets, and suppose that a rule is given whereby an
element u of X is mapped or transformed to an element v of Y. This rule
is called an operator or transformation or mapping and we write, for an
operator T,
$$T : X \to Y, \qquad Tu = v.$$
The first expression reads "T maps elements of X to elements of Y" while
the second reads "T acts on u to produce v". We refer to Y as the image
space, X is called the domain of T, written $D(T)$, and we write $R(T)$ for
the range of T, which consists of all those elements of Y that are images of
members of X. In other words,
$$R(T) = \{v : v \in Y,\ Tu = v \text{ for some } u \in X\}.$$
Finally, the element v is called the image of u under the mapping T. These
concepts are illustrated in Figure 5.2.
If the range of T happens to be all of Y, then T is called a surjective
operator, and we say that T maps X onto Y. Otherwise T maps X into Y.
Assume that the image space of T contains the zero element; then the
null space $N(T)$ of T is the set of all elements of $D(T)$ whose image is zero:
$$N(T) = \{u \in X : Tu = 0\}.$$
The inverse image of a member $v \in Y$ is denoted by $T^{-1}(v)$, and is the set
of all $u \in X$ such that $Tu = v$:
$$T^{-1}(v) = \{u \in X : Tu = v\}.$$
Likewise, the inverse image of a subset W of Y is denoted by $T^{-1}(W)$, and
is the set of all $u \in X$ such that $Tu \in W$ (Figure 5.3):
$$T^{-1}(W) = \{u \in X : Tu \in W\}.$$
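These definitions can be tried out on a small finite example. In the sketch below the domain X and the rule $Tu = u^2 - 1$ are hypothetical choices made only to illustrate the range, the null space, and the inverse image of a set.

```python
def T(u):
    # Hypothetical operator on a small finite set, chosen only for illustration.
    return u * u - 1

X = {-2, -1, 0, 1, 2}                      # domain D(T)

range_T = {T(u) for u in X}                # R(T): all images of members of X
null_space = {u for u in X if T(u) == 0}   # N(T): members of X mapped to zero

def inverse_image(W):
    # T^{-1}(W): all u in X whose image lies in W.
    return {u for u in X if T(u) in W}

print(range_T, null_space, inverse_image({3}), inverse_image({5}))
```

Note that the inverse image is always defined as a set, even when T is not one-to-one (here two members of X are mapped to 3) and even when W misses the range entirely, in which case the inverse image is empty.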
Examples
1. All functions of a real variable are operators from a subset of $\mathbb{R}$ to $\mathbb{R}$,
for example, the operator or function $f(x) = \sin x$ discussed at the
beginning of this section. In the same way, the function
$$f : \mathbb{R}^2 \to \mathbb{R}, \qquad f(\mathbf{x}) = f(x, y) = x^2 + y^2$$
x y
Tx = ( Tu
T 21
for a problem involving two variables x and y only. Thus f, the image
of u, is a continuous function. To be specific, if $\Omega \subset \mathbb{R}^2$ and $u(\mathbf{x}) =
x^2 y^3$, then the image of u is the function f defined by
$$f(x, y) = 2y^3 + 6x^2 y.$$
The question of whether $\Delta$ is a surjective operator is a question that
is taken up in Chapter 8; this is equivalent to asking whether there
exists a solution to the equation $\Delta u = f$.
Two operators $S : X \to Y$ and $T : X \to Y$ are said to be equal if for
every $u \in X$ we have
$$Su = Tu.$$
When this is the case, we write $S = T$.
The sum of two operators $S : X \to Y$ and $T : X \to Y$ is defined to be
the operator satisfying
$$(S + T)u = Su + Tu, \quad u \in X,$$
where Y is a vector space. That is, $T + S$ has the same effect on any
member u of X as would be obtained by applying T and S separately, and
then adding together the results. In order for the definitions of the sum of
operators, and of equality of operators, to make sense, the domains of the
two operators S and T must be equal, as must the image spaces.
The composition or product $TS$ of two operators $S : X \to Y$ and $T :
Y \to W$ is defined to be the operator satisfying
$$(TS)u = T(Su), \quad u \in X.$$
That is, the element $(TS)u \in W$ is found by first obtaining the element
$Su \in Y$, and then by the action of T on $Su$. Note that the composition
$TS$ is meaningless if the element $Su$ does not belong to the domain of T.
Furthermore, in general $TS \ne ST$; in fact, $ST$ may be quite meaningless.
Example
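4. Let $X = \mathbb{R}^3$, $Y = \mathbb{R}^2$, $W = \mathbb{R}$, and let $T : X \to Y$ and $S : Y \to W$
be the matrices
$$T = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 3 & 2 \end{pmatrix}, \qquad S = [\,1 \;\; 2\,].$$
$$(ST)\mathbf{x} = S\begin{pmatrix} 1 & 2 & 1 \\ 2 & 3 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = [\,1 \;\; 2\,]\begin{pmatrix} x + 2y + z \\ 2x + 3y + 2z \end{pmatrix} = 5x + 8y + 5z.$$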
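The composition in Example 4 amounts to a matrix product, which the following sketch carries out with plain lists of rows. The helper names `matmul` and `apply` and the test vector are introduced here only for illustration.

```python
def matmul(A, B):
    # Product of two matrices stored as lists of rows.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def apply(A, x):
    # Action of the matrix A on the vector x.
    return [sum(a * c for a, c in zip(row, x)) for row in A]

T = [[1, 2, 1],
     [2, 3, 2]]      # maps R^3 to R^2
S = [[1, 2]]         # maps R^2 to R

ST = matmul(S, T)    # 1 x 3 matrix representing the composition

x = [1, -1, 2]       # arbitrary illustrative vector
print(ST, apply(ST, x), apply(S, apply(T, x)))
```

Applying ST in one step agrees with applying T and then S. The product in the other order, a 2 x 3 matrix times a 1 x 2 matrix, is not defined, which is an instance of one of the two compositions being meaningless.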
The identity operator is an operator from a set X into itself, which maps
each element of X to the same element. That is,
$$Iu = u \quad \text{for all } u \in X.$$
Example
5. Let $X = Y = \mathbb{R}^3$; then the identity operator $I : \mathbb{R}^3 \to \mathbb{R}^3$ is simply
the $3 \times 3$ identity matrix. The zero operator from $\mathbb{R}^3$ to $\mathbb{R}^2$ is the
$2 \times 3$ matrix containing all zeros.
$$TT^{-1} = I,$$
that is, $(T^{-1})^{-1} = T$. If the range of T is all of Y (that is, T is surjective)
and T is also one-to-one, then T is said to be bijective; $T^{-1}$ is a one-to-one
operator from Y onto X, and we say that T is invertible.
Examples
$$T^{-1}(v)(x) = \int^{x} v(y)\,dy.$$
Here $\mathbb{K}$ is the field (either $\mathbb{R}$ or $\mathbb{C}$) over which the vector space is defined. We
may summarize (a) and (b) in one statement by defining a linear operator
to be one that satisfies
Examples
9. The differential operator $d^n/dx^n : C^n[a,b] \to C[a,b]$ is linear since
$$\frac{d^n}{dx^n}(\alpha u + \beta v) = \alpha\frac{d^n u}{dx^n} + \beta\frac{d^n v}{dx^n}.$$
Similar considerations apply to partial differential operators of all
orders, which are also linear operators.
Example
11. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be the operator defined by an $n \times n$ matrix. It is
easily shown that T is a linear operator; the question of whether T
is one-to-one is equivalent to asking whether the equation
$$Tx = y$$
has a unique solution x for a given y. According to Theorem 1 this
question may be answered by considering the equation
$$Tx_0 = 0;$$
if the only element $x_0$ satisfying this equation is $x_0 = 0$, then T is one-to-one.
Equivalently, we may check whether the matrix is nonsingular.
For example, if $T : \mathbb{R}^2 \to \mathbb{R}^2$ is given by
Example
12. To emphasize the point that two isomorphic spaces can be quite different
in nature, consider the case in which $X = \mathbb{R}^n$ and $Y = P_{n-1}[a,b]$,
the space of polynomials of degree less than or equal to $n-1$. An arbitrary
member of Y is of the form $p(x) = a_0 + a_1 x + \cdots + a_{n-1}x^{n-1}$,
and is therefore defined uniquely by the n numbers $a_0, \ldots, a_{n-1}$. Let
$T : X \to Y$ be the operator that associates with the point $a =
(a_0, \ldots, a_{n-1})$ the polynomial $p(x)$ introduced earlier; then clearly T
is linear and bijective, and is hence an isomorphism. Thus $X = \mathbb{R}^n$
and $Y = P_{n-1}[a,b]$ are isomorphic to each other.
If (5.2) holds for every $u_0 \in X$, then we simply say that T is continuous on
X. Furthermore, if $\delta$ does not depend on $u_0$, then T is said to be uniformly
continuous on X.
The situation is shown schematically in Figure 5.6. Choose some point
$u_0$ and a number $\varepsilon > 0$; then T is continuous if a number $\delta$ can be found
such that the image of the points lying inside the neighborhood $N(u_0, \delta)$
is contained in the open ball of radius $\varepsilon$ and center $Tu_0$. At this point we
draw attention to the norms used in (5.2); since $u_0$ and u are in X, the
norm used when evaluating $\|u_0 - u\|$ is the norm defined on X; on the other
hand, the norm used in the evaluation of $\|Tu_0 - Tu\|$ must be that defined
on Y. When wishing to emphasize the distinction we write, for example,
$\|u_0 - u\|_X$ and $\|Tu_0 - Tu\|_Y$. Generally, though, it is expected that there
will be no confusion about which norm should be used.
Examples
13. Let $X = Y = \mathbb{R}$ and let $f : \mathbb{R} \to \mathbb{R}$. Then the definition of continuity
given previously coincides with that given in Section 2.1 if we use the
norm $\|\cdot\| = |\cdot|$ on $\mathbb{R}$.
14. Let $X = Y = C[0,1]$ with the sup-norm, and define $T : C[0,1] \to
C[0,1]$ by
$$Tu(x) = \int_0^x u(y)\,dy, \quad x \in [0,1]$$
$$\left|\int_0^x (u_0(y) - u(y))\,dy\right| \le \int_0^x |u_0(y) - u(y)|\,dy \le (x - 0)\sup_{y \in [0,1]} |u_0(y) - u(y)|$$
(using Theorem 2, Chapter 2)
and so
Tx=
or
$$\sum_{j=1}^{n} T_{ij}x_j = a_i, \quad 1 \le i \le m.$$
$$\|b - a\|^2 = \sum_{i=1}^{m}\left(\sum_{j=1}^{n} T_{ij}(y_j - x_j)\right)^2 \le \sum_{i=1}^{m}\left(\sum_{j=1}^{n} T_{ij}^2\right)\sum_{j=1}^{n}(y_j - x_j)^2$$
Hence
PROOF. Figure 5.7 illustrates the assertion of the theorem. Suppose that
T is continuous, and for any $u_0 \in S_0$ let $v_0 = Tu_0$. Since S is open, there
is a neighborhood $N(v_0, \varepsilon)$ of $v_0$ contained entirely in S. By the continuity
of T, $u_0$ has a neighborhood $N_0(u_0, \delta)$ that is mapped into $N(v_0, \varepsilon)$. Thus,
$N_0 \subset S_0$ since $N_0$ is part of the inverse image of S; so $S_0$ is open. Conversely,
assume that the inverse image of every open set in Y is an open set in X.
Then in particular for every $u_0 \in X$ and any neighborhood $N(Tu_0, \varepsilon)$
of $Tu_0$, the inverse image $N_0$, say, of N is open. Hence $N_0$ also contains
a neighborhood of center $u_0$ and, by definition of $N_0$, the image of this
neighborhood lies in N. Since $u_0$ was arbitrary, T is continuous. □
FIGURE 5.8. Two isometrically isomorphic spaces $X$ and $Y$, with $\|Tu - Tv\| = \|u - v\|$
$$\|Tu\| \le \|T\|\,\|u\|$$
Examples
16. Let $T : \mathbb{R}^2 \to \mathbb{R}^3$; then the space $L(\mathbb{R}^2, \mathbb{R}^3)$ of linear operators from
$\mathbb{R}^2$ to $\mathbb{R}^3$ is equivalent to the space of all real $3 \times 2$ matrices. If $\mathbb{R}^2$
and $\mathbb{R}^3$ are equipped with the 1-norm
$$\|x\|_1 = \sum_{i=1}^{n} |x_i|$$
in which $n = 2$ or 3, respectively, then
$$\|T\| = \sup\left\{\frac{\|Tx\|_1}{\|x\|_1},\ x \ne 0\right\} = \sup\{\|Tx\|_1,\ \|x\|_1 = 1\}$$
(see Exercise 5.11).
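As a numerical illustration of this supremum, the following sketch samples vectors with $\|x\|_1 = 1$ and compares the largest sampled value of $\|Tx\|_1$ with the largest absolute column sum of the matrix, which is the quantity appearing in the bound (5.4). The matrix entries and the helper names are hypothetical, chosen only for illustration.

```python
# Hypothetical 3x2 matrix representing T : R^2 -> R^3 (entries are illustrative).
T = [[1.0, -2.0],
     [0.5,  3.0],
     [-4.0, 1.0]]

def norm1(v):
    """1-norm: sum of the absolute values of the components."""
    return sum(abs(c) for c in v)

def apply(T, x):
    """Matrix-vector product Tx."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

def max_column_sum(T):
    """Largest absolute column sum, max_j sum_i |T_ij|."""
    return max(sum(abs(T[i][j]) for i in range(len(T))) for j in range(len(T[0])))

def sampled_norm(T, steps=400):
    """Largest ||Tx||_1 over sampled x in R^2 with ||x||_1 = 1."""
    best = 0.0
    for s in range(steps + 1):
        t = -1.0 + 2.0 * s / steps           # first component, in [-1, 1]
        for sign in (1.0, -1.0):
            x = [t, sign * (1.0 - abs(t))]   # |t| + (1 - |t|) = 1
            best = max(best, norm1(apply(T, x)))
    return best

print(max_column_sum(T), sampled_norm(T))
```

The two printed numbers agree because $\|Tx\|_1$ is a convex function of $x$, so its maximum over the unit ball of the 1-norm is attained at one of the vertices $\pm e_j$, where it equals a column sum.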
Now
$$\le \max_{j}\sum_{i=1}^{3} |T_{ij}| \qquad (5.4)$$
$$\|Tx\|_1 = \sum_{i=1}^{3} |T_{i1}|,$$
17. Let $T = d/dx : C^1[0, 1] \to C[0, 1]$ with the sup-norm defined on
$C^1$ and $C$. $T$ is not a bounded operator; to show this, we need only
consider $u(x) = \sin nx$. Then $\|u\| = 1$ (for $n \ge 2$) and $\|du/dx\| = \|n\cos nx\| = n$.
It follows that $\|Tu\|$ can take on arbitrarily large values (for any
chosen constant $K$, we simply choose $n$ big enough to invalidate the
statement $\|Tu\| = n < K$). This result may be extended in an obvious
way to show that all ordinary and partial differential operators are
unbounded in the sup-norm.
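The growth of $\|Tu\|/\|u\|$ in this example can be checked numerically. The sketch below approximates the sup-norm on $[0, 1]$ by sampling on a uniform grid; the function names and the grid size are assumptions made for illustration.

```python
import math

def sup_norm(f, a=0.0, b=1.0, samples=20001):
    """Approximate the sup-norm of f on [a, b] by sampling on a uniform grid."""
    return max(abs(f(a + (b - a) * i / (samples - 1))) for i in range(samples))

def ratio(n):
    """Return ||Tu|| / ||u|| for u(x) = sin(n x), where Tu = u' = n cos(n x)."""
    u = lambda x: math.sin(n * x)
    du = lambda x: n * math.cos(n * x)
    return sup_norm(du) / sup_norm(u)

for n in (2, 10, 100):
    print(n, ratio(n))
```

The ratio grows in proportion to $n$, so no single constant $K$ can bound $\|Tu\|$ by $K\|u\|$ for all $u$.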
The connection between bounded and continuous linear operators is one
that is exploited very often. Suppose that $T : X \to Y$ is a bounded linear
operator; then there exists $K > 0$ such that
For the case $u_0 = 0$ we have $Tu_0 = 0$ and so $\|Tu_0\| \le \delta^{-1}\|u_0\|$. Thus
$\|Tu\| \le \delta^{-1}\|u\|$, and so $T$ is bounded. We thus have the following theorem.
Example
PROOF. We have
Hence
$$\lim_{n\to\infty}\|Au_n - Au\|_Y \le \|A\|\lim_{n\to\infty}\|u_n - u\|_X = 0;$$
5.3 Projections
Consider the following situation in $\mathbb{R}^3$, shown in Figure 5.9. Given any
vector $x$ we define an operator $P$ which has the property that
$$Px = (x_1, x_2, 0).$$
That is, $P$ projects any vector onto the $x_1x_2$-plane. It follows that if $y$ is
a vector of the form $(y_1, y_2, 0)$, then $Py = y$, so that $R(P) = \{y : y =
(\alpha, \beta, 0)\}$ and $N(P) = \{y : y = (0, 0, \gamma)\}$, where $\alpha$, $\beta$, and $\gamma$ are real
numbers. Furthermore, the only vector common to $R(P)$ and $N(P)$ is the
zero vector. More generally, $P$ has the property that $P^2x \equiv P(Px) = Px$
for all points $x$ in $\mathbb{R}^3$. This is a simple and standard example of a projection
operator on $\mathbb{R}^3$; we now generalize to arbitrary vector spaces.
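The properties of $P$ listed above can be verified directly for a sample vector. This is a minimal sketch; the vector chosen is arbitrary and the function name `P` simply mirrors the notation of the text.

```python
def P(x):
    """Project a vector of R^3 onto the x1x2-plane."""
    return (x[0], x[1], 0.0)

x = (3.0, -1.0, 2.0)                      # arbitrary illustrative vector
Px = P(x)
w = tuple(a - b for a, b in zip(x, Px))   # the remainder x - Px

print(Px)        # the component of x in R(P)
print(P(Px))     # applying P a second time changes nothing
print(w, P(w))   # x - Px lies in N(P), so P maps it to the zero vector
```

Every $x$ therefore splits as $x = Px + (x - Px)$, with the first part in $R(P)$ and the second in $N(P)$.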
Example
$$Pu = u(0)(1 - x) + u(1)x.$$
That is, $P$ maps a continuous function to its linear interpolate, as
shown in Figure 5.10. To see that $P$ is a projection operator, note
that
FIGURE 5.10. A continuous function $u$ and its linear interpolate $Pu$
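The idempotence of the interpolation operator $Pu = u(0)(1 - x) + u(1)x$ can also be checked numerically. In this sketch the test function is an arbitrary choice and the sample points are illustrative.

```python
import math

def P(u):
    """Linear interpolate of u on [0, 1]: (Pu)(x) = u(0)(1 - x) + u(1)x."""
    u0, u1 = u(0.0), u(1.0)
    return lambda x: u0 * (1.0 - x) + u1 * x

u = lambda x: math.exp(x) * math.cos(3.0 * x)   # arbitrary continuous function
Pu = P(u)
PPu = P(Pu)

xs = [i / 10 for i in range(11)]
print(max(abs(PPu(x) - Pu(x)) for x in xs))     # difference between P(Pu) and Pu
```

Since $Pu$ is already linear and takes the values $u(0)$ and $u(1)$ at the endpoints, interpolating it again returns the same function.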
Example
20. Let $X = C[-1, 1]$ and define $P$ to be the projection that maps any
$x \in C[-1, 1]$ to its even part (Figure 5.11):
Up to now we have discussed the situation that obtains when we are given
a projection operator. What of the converse situation? Suppose we are
given a subspace Y of an inner product space X. Is it possible to define an
orthogonal projection P with the property that R(P) = Y? The answer lies
in a logical extension to Theorem 8 of Chapter 4, the Projection Theorem,
as we now show.
PROOF. We know that $H = Y \oplus Y^\perp$ and that every $u \in H$ has the unique
representation
$$P(v + w) = v;$$
$$Pv = v = Qv \quad \text{for } v \in Y,$$
$$Pw = 0 = Qw \quad \text{for } w \in Y^\perp.$$
Note the close relationship between Theorem 9 and the Projection Theorem.
This section is concluded with a similar extension of Theorem 7, also
from Chapter 4.
that is,
Example
is given a special name: this is called the dual space of X, and is denoted
by X'. That is,
Most of the time we deal with the case $\mathbb{K} = \mathbb{R}$, and it is this case that is
the focus in examples.
Examples
$$(f, u) = \int_a^b u(x)\,dx.$$
and so $f$ is bounded, and is thus a member of the dual space $[L^2(a, b)]'$.
$\infty$ at $x = 0$. Furthermore, $\delta$ is assumed to have the property
$$\int_{-\infty}^{\infty}\delta(x)u(x)\,dx = u(0)$$
Furthermore,
PROOF. Assume that $\ell \ne 0$ since, if $\ell = 0$, then (5.7) and (5.8) hold with
$u = 0$. Also, observe that if a representation (5.7) exists, then the element
$u$ must be nonzero. Second, for any $v$ in $H$ for which $(\ell, v) = 0$ we must
have $(u, v) = 0$. This implies that $u$ must be orthogonal to any member
of the null space of $\ell$; that is, $u \in N(\ell)^\perp$. We are thus led to show the
existence of $u$ by considering $N(\ell)$ and $N(\ell)^\perp$.
Now $N(\ell)$ is a closed subspace of $H$ (see Exercise 5.16). Furthermore,
since $\ell \ne 0$ by assumption, $N(\ell) \ne H$ (if $N(\ell)$ were equal to $H$ this would
imply that $(\ell, v) = 0$ for all $v \in H$). Thus $N(\ell)$ is a proper subset of $H$
and so, by the Projection Theorem (Theorem 8, Chapter 4), $N(\ell)^\perp \ne \{0\}$.
Hence there must be at least one nonzero element, $u_0$, say, in $N(\ell)^\perp$. Set
then
$$0 = (u_0, z) = (u_0, (\ell, v)u_0 - (\ell, u_0)v) = (\ell, v)(u_0, u_0) - (\ell, u_0)(u_0, v)$$
which implies that
Finally, we set
$$u = \frac{(\ell, u_0)}{\|u_0\|^2}\,u_0, \qquad (5.10)$$
and from (5.9) we see that the element $u$ defined by (5.10) satisfies (5.7).
The existence of $u$ has been proved.
The proof that $u$ is unique, and the derivation of (5.8), are more straightforward
than the existence proof, and are left as exercises (see Exercise
5.30). □
Examples
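24. Let $f$ be a linear functional on $\mathbb{R}^n$; then $(f, x)$ is a real number and
according to Theorem 11 we can always find a point $y \in \mathbb{R}^n$ such
that
$$(f, x) = x \cdot y.$$
$$u(x) = \begin{cases} 1, & 0 < x \le \tfrac{1}{2}, \\ 0, & \tfrac{1}{2} < x < 1. \end{cases} \qquad (5.11)$$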
H~H',
Example
(5.12)
$$(f, v) = \int_0^{1/2} v(x)\,dx,$$
Furthermore,
However, the converse is not true. Take, for example, the function $u(x) = 0$
and the sequence defined by $u_n(x) = \cos nx$, on the interval $[0, 2\pi]$; then
$u_n \in L^2(0, 2\pi)$ (for example), and it can be shown that $u_n \rightharpoonup 0$. On the
other hand, $\|u_n - 0\|_{L^2}^2 = \pi$, that is, $\|u_n - 0\|_{L^2} = \sqrt{\pi}$, for all values of $n \ge 1$,
so $u_n$ does not converge strongly to 0.
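The contrast between weak and strong convergence for $u_n(x) = \cos nx$ can be illustrated numerically. The sketch below pairs $u_n$ with one fixed test function (an arbitrary choice; weak convergence requires the same behavior for every $v$ in $L^2$) and also evaluates the norm of $u_n$. Note that $\int_0^{2\pi}\cos^2 nx\,dx = \pi$, so the norm itself is $\sqrt{\pi}$. The quadrature rule and all names are assumptions made for illustration.

```python
import math

def integrate(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def v(x):
    """An illustrative test function in L^2(0, 2*pi)."""
    return x * (2.0 * math.pi - x)

def inner_with_v(n):
    """Inner product (u_n, v) for u_n(x) = cos(n x)."""
    return integrate(lambda x: math.cos(n * x) * v(x), 0.0, 2.0 * math.pi)

def norm_un(n):
    """L^2 norm of u_n on (0, 2*pi)."""
    return math.sqrt(integrate(lambda x: math.cos(n * x) ** 2, 0.0, 2.0 * math.pi))

for n in (1, 4, 16, 64):
    print(n, inner_with_v(n), norm_un(n))
```

For this particular $v$ the inner product equals $-4\pi/n^2$, which tends to zero, while the norm stays fixed at $\sqrt{\pi}$.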
It follows from the Riesz Representation Theorem that in a Hilbert space
$H$, $u_n \rightharpoonup u$ if and only if $(u_n, v) \to (u, v)$ for all $v \in H$.
In the context of the spaces $L^p$, the notion of weak convergence takes on
a more concrete form in the light of Riesz's Theorem. Indeed, the correspondence
(5.13) implies that a sequence $\{u_n\}$ in $L^p$ ($1 \le p < \infty$) is weakly
convergent with limit $u$ if and only if
$$\lim_{n\to\infty}\int_\Omega u_n g\,dx = \int_\Omega u g\,dx \quad \text{for all } g \in L^q(\Omega).$$
Examples
27. Let $X = Y = \mathbb{R}^3$, and let $A$ be any $3 \times 3$ matrix; then the operator
defined by $a(x, y) = x \cdot Ay$ is a bilinear form. In particular, for any
here, in general,
$$a : C^1[a, b] \times C^1[a, b] \to \mathbb{R}, \qquad a(u, v) = \int_a^b (uv + u'v')\,dx$$
is a bilinear form.
Example
$$\le \left|\int_0^1 u'v'\,dx\right| + \int_0^1 |\kappa uv|\,dx$$
Now from (5.16) it is clear that $\|u\|_{L^2} \le \|u\|_{H^1}$, and likewise $\|u'\|_{L^2} \le
\|u\|_{H^1}$; thus
Thus an $H$-elliptic form is one that is always nonnegative, and takes the
value 0 only for the case in which $v = 0$. In other words, it is positive-definite.
Example
$$\int_0^1 [(v')^2 + \kappa^2 v^2]\,dx \ge \alpha\int_0^1 [(v')^2 + v^2]\,dx = \alpha\|v\|_{H^1}^2,$$
where $\alpha = \min(1, \kappa^2)$.
The Riesz Representation Theorem for linear functionals has a counterpart
for bilinear forms that proves useful later. Suppose that we are given a real
inner product space $H$ and a continuous, $H$-elliptic bilinear form $a$ on $H$;
then for any given $u \in H$ it is possible to define a bounded linear functional
$f$ on $H$ according to the rule
or
The proof of this theorem is rather lengthy, and is made more digestible
by breaking it up into a series of five lemmas.
$$A : H \to H, \qquad Au = w. \qquad (5.21)$$
Since $a$ is bilinear,
$$\alpha a(u_1, v) + \beta a(u_2, v) = (\alpha w_1 + \beta w_2, v). \qquad (5.22)$$
(5.23)
But from (5.20) through (5.22) we see that $A$ maps $\alpha u_1 + \beta u_2$ to $\alpha w_1 + \beta w_2$:
PROOF. Let $R(A)$ denote the range of $A$ (of course, $R(A) \subset H$). We show
In particular, for $v = z$,
$$0 = a(z, z) \ge \alpha\|z\|^2$$
so that $\|z\| = 0$ or $z = 0$. Hence $A$ is one-to-one, and its inverse $A^{-1} :
R(A) \to H$ exists. Furthermore, $A^{-1}$ is linear since $A$ is linear (see Exercise
5.9), and $A^{-1}$ is bounded since
$$\lim_{k\to\infty}\|w_k - w\| = 0 \quad \text{in } H.$$
$$\lim_{k,l\to\infty}\|u_k - u_l\| \le \|A^{-1}\|\lim_{k,l\to\infty}\|w_k - w_l\| = 0$$
$$\lim_{k\to\infty} Au_k = \lim_{k\to\infty} w_k = w \quad \text{or} \quad w = A\big(\lim_{k\to\infty} u_k\big) = Au$$
(using Theorem 4). Hence $w$ is in the range of $A$, and since $w$ is the limit
of an arbitrary Cauchy sequence, $R(A)$ is complete. □
with $\|\ell\| = \|w\|$. Thus (5.24) and (5.25) imply (5.17), and (5.19) follows
from the $H$-ellipticity of $a$ and the continuity of $\ell$. This proves the theorem. □
5.7 Exercises
Operators
5.1. Describe the range and null space of the following operators.
1
->
(b) $K : L^2(0, 1) \to \mathbb{R}$, $Ku = \int_0^1 [u(x)]^2\,dx$.
5.12. Let $X$ be the space $\mathbb{R}^n$ with the norm $\|x\|_\infty = \max_{1 \le j \le n} |x_j|$. If
$A : X \to X$ is a linear operator represented by an $n \times n$ matrix, show
that
$$\|A\|_\infty = \max_{i}\sum_{j=1}^{n} |A_{ij}|.$$
$$\|Tu\|_V \ge K\|u\|_U, \quad u \in U.$$
If $T$ is a bounded below linear operator, show that $T$ is one-to-one,
and that $T^{-1} : R(T) \to U$ is a bounded operator.
$$Tu = \begin{cases} u(x) & \text{if } |x| < 1, \\ 0 & \text{otherwise}. \end{cases}$$
Pu(x) =~ r ei(x-y)u(y) dy
LI
l
(x, y) = L AijXiYj·
i,j=l
5.35. Let $\ell : H_0^1(0, 1) \to \mathbb{R}$ and $a : H_0^1(0, 1) \times H_0^1(0, 1) \to \mathbb{R}$ be defined by
$$(\ell, v) = \int_0^1 (-1 - 4x)v\,dx, \qquad a(u, v) = \int_0^1 (x + 1)u'v'\,dx,$$
where
In vector algebra it is often the case that computations are carried out using
the components of vectors. A set of three mutually orthogonal unit vectors
$\{i, j, k\}$ is selected as a basis, and every vector $a$ can then be written
as $a = \alpha i + \beta j + \gamma k$, the coefficients $\alpha, \beta, \gamma$ being the components of $a$
relative to the chosen basis. In Section 6.1 we start the process of extending
this notion to vector spaces in general, by introducing finite-dimensional
vector spaces. In Section 6.2 the vector space is endowed with an inner
product or a norm, and this in turn permits the investigation of various
properties that such inner product or normed spaces have by virtue of their
being finite-dimensional. Section 6.3 is devoted to an examination of linear
operators acting on finite-dimensional spaces; these are always continuous,
and they also inherit in general the simple nature of their domains.
These concepts are extended to infinite-dimensional spaces in Section 6.4;
if the space concerned is a Hilbert space, then the idea of an orthonormal
basis carries over in a natural way from the finite-dimensional situation.
The question of how one generates bases in infinite-dimensional spaces is
partially answered by considering Sturm-Liouville problems, the topic of
Section 6.5; these eigenvalue problems have a number of interesting properties,
the most relevant of which is that their eigenfunctions form orthonormal
bases in $L^2$.
6.1 Finite-dimensional spaces
In this section we discuss vector spaces that have the property that every
member can be expressed as a finite sum of multiples (that is, a linear combination)
of a selected subset of members of that space. The motivation for
endowing vector spaces with this property once again comes from elementary
vector algebra; every vector in three dimensions can be represented as
a sum of multiples of three noncoplanar vectors.
Linear combination. Let $X$ be a linear space and $\mathbb{K}$ the set of real or
complex numbers. Let $\{u_1, u_2, \ldots, u_n\}$ be a set of elements in $X$. The expression
(6.1)
The set $\{u_1, \ldots, u_n\}$ is linearly independent if (6.1) holds only when all of
the $\alpha_i$ are zero. In other words, a set is linearly dependent if one of its
elements can be written as a linear combination of the others; for if $\alpha_k$ is
nonzero, then (6.1) may be rewritten in the form
Examples
1. Let $X = \mathbb{R}^2$, and consider the vectors $a_1 = (2, 1)$ and $a_2 = (1, 2)$. To
test for linear dependence, consider the linear combination
The only possible solution to these two equations is $\alpha_1 = \alpha_2 = 0$, and
so $a_1$ and $a_2$ are linearly independent. Graphically this is easy to see
(Figure 6.1), in that it is not possible to express $a_2$ as a multiple of
$a_1$.
Now suppose that we also have the vector $a_3$ as shown in Figure 6.1.
This set is linearly dependent since, whatever the length and direction
of $a_3$, it is always possible to express it in the form $a_3 = \beta_1 a_1 + \beta_2 a_2$
for some $\beta_1, \beta_2$. Hence there exist scalars $\beta_1$, $\beta_2$, and $\beta_3 = -1$ such
that $\beta_1 a_1 + \beta_2 a_2 + \beta_3 a_3 = 0$.
2. Let $X = L^2(0, 1)$ and consider the functions $u_k$ ($k = 1, 2, 3$) defined
by $u_1(x) = \cosh x$, $u_2(x) = \sinh x$, $u_3(x) = e^x$. Then the equation
$$\sum_{i=1}^{3}\alpha_i u_i = 0 \quad \text{or} \quad \alpha_1\cosh x + \alpha_2\sinh x + \alpha_3 e^x = 0$$
is satisfied for any nonzero $\alpha_i$ that are related to each other by $\alpha_1 = \alpha_2$,
$\alpha_3 = -\alpha_1$, and so the set is linearly dependent.
Examples
3. Consider the space $\mathbb{R}^3$: the set $\{e_i\}_{i=1}^3 = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$
is linearly independent and also spans $\mathbb{R}^3$; hence $\{e_i\}_{i=1}^3$ is a basis
for $\mathbb{R}^3$ and $\dim\mathbb{R}^3 = 3$. Consider the point $x = 2e_1 + 3e_3$; this
has components $(2, 0, 3)$ relative to the basis $\{e_i\}$. But if we choose
instead the basis $\{f_i\}_{i=1}^3$ defined by $f_1 = e_1 + e_2 + 2e_3$, $f_2 = e_1 -
e_2 + e_3$, $f_3 = 2e_1 + e_2$, then the components of $x$ relative to this
basis are found from the fact that
$$x = f_1 + f_2.$$
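The components of $x$ relative to the basis $\{f_i\}$ can also be found by solving a linear system whose columns are the basis vectors. The following is a minimal sketch using exact rational arithmetic; the helper `solve` is a hypothetical name introduced for illustration.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A c = b exactly by Gauss-Jordan elimination (A square, nonsingular)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                M[r] = [vr - M[r][col] * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

# Basis vectors from Example 3, written in the standard basis {e_i}.
f1, f2, f3 = (1, 1, 2), (1, -1, 1), (2, 1, 0)
x = (2, 0, 3)                      # x = 2 e1 + 3 e3

# The columns of A are f1, f2, f3, so A c = x means x = c1 f1 + c2 f2 + c3 f3.
A = [[f1[i], f2[i], f3[i]] for i in range(3)]
c = solve(A, x)
print(c)
```

The solution has components $(1, 1, 0)$, in agreement with $x = f_1 + f_2$.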
° => °
3
$$p(x) = -1 \cdot (1 - x) + 1 \cdot (1 + x) - x^2 + x^3.$$
PROOF. Let $B = \{v_1, v_2, \ldots, v_n\}$ be a basis for $X$, and let $S = \{u_1, \ldots,
u_n, u_{n+1}, \ldots, u_{n+k}\}$ be any set of $(n + k)$ elements in $X$. Then by definition
there are scalars $A_{ij}$ such that
$$u_i = \sum_{j=1}^{n} A_{ij}v_j, \quad i = 1, \ldots, n + k.$$
Take the inner product of both sides of this equation with $u_1$ to obtain
Examples
5. Let $X = \mathbb{R}^2$ with $a_1$, $a_2$, and $a_3$ as in Example 1. Then with $(a, b) \equiv
a \cdot b$, $A_{ij} = a_i \cdot a_j$, and
4
detA = det ( ~ 5
2ßl + ß2 ß1 + 2ß2
which is easily shown to be identically zero for any values of $\beta_1$ and
$\beta_2$. Hence the set $\{a_1, a_2, a_3\}$ is linearly dependent.
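The determinant test of this example can be sketched in a few lines. The values chosen for $\beta_1$ and $\beta_2$ are hypothetical, and the helper names are introduced only for illustration; exact fractions are used so that the zero determinant is reproduced without rounding.

```python
from fractions import Fraction

def dot(a, b):
    """Euclidean inner product."""
    return sum(p * q for p, q in zip(a, b))

def gram(vectors):
    """Gram matrix A_ij = (a_i, a_j)."""
    return [[dot(a, b) for b in vectors] for a in vectors]

def det(M):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

a1, a2 = (2, 1), (1, 2)
b1, b2 = Fraction(3, 2), Fraction(-1, 4)        # hypothetical beta_1, beta_2
a3 = tuple(b1 * p + b2 * q for p, q in zip(a1, a2))

print(det(gram([a1, a2])))       # nonzero: a1 and a2 are independent
print(det(gram([a1, a2, a3])))   # zero: a3 is a combination of a1 and a2
```

Changing `b1` and `b2` to any other values leaves the second determinant at zero, as the example asserts.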
$$\det\begin{pmatrix} 2 & 0 & 2/3 \\ 0 & 2/3 & 0 \\ 2/3 & 0 & 2/5 \end{pmatrix} \ne 0.$$
$$(\phi_i, \phi_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise}. \end{cases} \qquad (6.4)$$
now take the inner product with $\phi_1$ to obtain $\alpha_1 \cdot 1 + 0 + \cdots + 0 = 0$. Thus
$\alpha_1 = 0$. In the same way, by taking the inner product with each $\phi_k$ in turn,
we find that all the $\alpha_k$ are zero.
Now suppose that $X$ is a finite-dimensional inner product space with
$\dim X = n$. Then a basis $\{\phi_1, \ldots, \phi_n\}$ of $X$ whose elements satisfy (6.4) is
said to be an orthonormal basis.
Examples
7. The set $\{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$ forms an orthonormal basis for $\mathbb{R}^3$.
8. Consider the space $L^2(-1, 1)$. The infinite set
if k = l,
otherwise.
One of the main advantages of orthonormal bases over other bases is
that computations involving the former are much simpler. For example, if
$\{\phi_i\}_{i=1}^{n}$ is an orthonormal basis for $X$ and $u, v \in X$, then
(U,v)
n n
tPl
Next, project $\psi_2$ onto the plane orthogonal to $\phi_1$ (Figure 6.2), using for
this purpose the projection operator $P_2$ defined by
Then set
Finally, project $\psi_3$ onto the line orthogonal to both $\phi_1$ and $\phi_2$, using the
projection operator $P_3$ defined by
Then set
H=
The resulting basis $\{\phi_k\}$ is orthonormal.
Generally, the procedure may be summarized as follows. Given a basis
$\{\psi_i\}_{i=1}^{n}$, form an orthonormal basis $\{\phi_i\}_{i=1}^{n}$ from
$$\phi_i = \frac{P_i\psi_i}{\|P_i\psi_i\|},$$
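The procedure just summarized can be sketched in code for vectors in $\mathbb{R}^n$ with the Euclidean inner product. In this sketch the projection $P_i\psi_i$ is computed by removing, one at a time, the components of $\psi_i$ along the previously constructed $\phi_1, \ldots, \phi_{i-1}$; the starting basis is hypothetical and the function names are assumptions.

```python
import math

def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(psis):
    """Orthonormalize a linearly independent list of vectors.

    At step i the components of psi_i along the previously computed
    phi_1, ..., phi_{i-1} are removed (giving P_i psi_i), and the
    result is divided by its norm.
    """
    phis = []
    for psi in psis:
        w = list(psi)
        for phi in phis:
            c = dot(w, phi)                          # component of w along phi
            w = [wi - c * pi for wi, pi in zip(w, phi)]
        norm = math.sqrt(dot(w, w))
        phis.append([wi / norm for wi in w])
    return phis

# Hypothetical basis of R^3, used only for illustration.
basis = [(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
phis = gram_schmidt(basis)
for p in phis:
    print([round(dot(p, q), 12) for q in phis])      # rows of the Gram matrix
```

The printed rows form the identity matrix, which is the orthonormality condition (6.4).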
where $\alpha_{ki}$ ($i = 1, \ldots, n$) are the components of $u_k$. Using Lemma 1 and the
fact that $\{u_k\}$ is Cauchy, it follows that for any given $\epsilon > 0$, there exists
$N$ such that
we see that $\{\alpha_{ki}\}$ is a Cauchy sequence in $\mathbb{K}$ for each fixed $i$. Hence $\alpha_{ki}$
converges to an element $\alpha_i$, say. Now define
Then
PROOF. Suppose that $\dim X = n$ and let $\{e_i\}_{i=1}^{n}$ be a basis for $X$. Then
we may express $u_k$ and $u$ in the form
and
Now
(6.5)
Then it follows that $(\ell_i, u_k) = \alpha_{ki}$ and $(\ell_i, u) = \alpha_i$, and (6.5) implies that
$\alpha_{ki} \to \alpha_i$ as $k \to \infty$, for each $i$. Thus
$$\le \sum_{i=1}^{n} |\alpha_{ki} - \alpha_i|\,\|e_i\| \to 0$$
PROOF. Let {ei, ... , en } be a basis for X; then any u E: X has the repre-
sentation u = alel + ... + ane n for certain scalars al,· .. , an, and so
so that
M
$$\|Tu\| \le C\|u\|.$$
Thus $T$ is bounded, hence continuous. □
There is a very simple relationship among the dimensions of the domain,
null space, and range of a linear operator when the operator acts on a finite
dimensional space, as we now show.
(a) if $\{e_1, \ldots, e_k\}$ is a basis for $N(T)$ and $\{e_1, \ldots, e_k, e_{k+1}, \ldots, e_n\}$ is a
basis for $X$, then $\{Te_{k+1}, \ldots, Te_n\}$ is a basis for $R(T)$, the range of
$T$;
PROOF. (a) The elements $Te_1, Te_2, \ldots, Te_n$ certainly span $R(T)$ since any
$v \in R(T)$ satisfies, for some $u \in X$,
where $\alpha_i$ are the components of $u$ relative to the basis $e_i$. Since $e_1, \ldots, e_k$
are in $N(T)$ we have $Te_1 = \cdots = Te_k = 0$ so that $\{Te_{k+1}, \ldots, Te_n\}$ spans
$R(T)$. We show next that this set is linearly independent. Suppose that
there are scalars $\beta_{k+1}, \ldots, \beta_n$ such that
$$\sum_{i=k+1}^{n}\beta_i(Te_i) = 0;$$
by the linearity of $T$,
$$T\Big(\sum_{i=k+1}^{n}\beta_i e_i\Big) = 0$$
so that the sum $\sum_{i=k+1}^{n}\beta_i e_i$ belongs to $N(T)$, and may therefore be represented
in the form
$$\sum_{i=k+1}^{n}\beta_i e_i = \sum_{j=1}^{k}\gamma_j e_j$$
for some scalars $\gamma_1, \ldots, \gamma_k$. It follows that, if we set $\beta_1 = -\gamma_1, \ldots, \beta_k = -\gamma_k$,
then this expression may be rewritten in the form
$$\sum_{i=1}^{n}\beta_i e_i = 0.$$
But $\{e_1, \ldots, e_n\}$ is linearly independent, hence $\beta_1 = \cdots = \beta_n = 0$. So
$\{Te_{k+1}, \ldots, Te_n\}$ is linearly independent and, since it spans $R(T)$, it forms
a basis for $R(T)$. Part (b) is a trivial consequence of (a). □
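Part (a) implies a dimension count: the $n - k$ elements $Te_{k+1}, \ldots, Te_n$ form a basis of $R(T)$, so $\dim N(T) + \dim R(T) = \dim X$. The sketch below checks this count for a hypothetical matrix operator by row reduction in exact arithmetic; the matrix and the helper names are assumptions made for illustration.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form; returns (matrix, list of pivot columns)."""
    M = [[Fraction(v) for v in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [v / M[r][c] for v in M[r]]
        for i in range(len(M)):
            if i != r:
                M[i] = [vi - M[i][c] * vr for vi, vr in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivots

def null_space_basis(rows):
    """One basis vector of N(T) for each free (non-pivot) column."""
    M, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for row, p in zip(M, pivots):
            v[p] = -row[free]
        basis.append(v)
    return basis

# Hypothetical operator T : R^4 -> R^3 (third row = first + second, so rank 2).
T = [[1, 0, 2, 1],
     [0, 1, 1, 3],
     [1, 1, 3, 4]]
rank = len(rref(T)[1])
kernel = null_space_basis(T)
print(len(T[0]), rank, len(kernel))   # dim X, dim R(T), dim N(T)
```

Here the number of pivot columns gives $\dim R(T)$ and the number of free columns gives $\dim N(T)$, and the two add up to the number of columns.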
Example
$$Tx = 0 \quad \text{or} \quad \begin{cases} x_1 + 2x_3 = 0, \\ 3x_1 + 4x_2 + 2x_3 = 0. \end{cases}$$
or
Example
11. Let $X = P_2[0, 1]$ and $Y = P_1[0, 1]$, where $P_k[0, 1]$ is the set of polynomials
of degree at most $k$ on $[0, 1]$; $\dim P_k[0, 1] = k + 1$. Suppose
that we choose as bases for $X$ and $Y$
$$0 = T_{11} + T_{21}x, \qquad 1 = T_{12} + T_{22}x, \qquad 2x = T_{13} + T_{23}x.$$
$$T_{11} = T_{21} = T_{22} = T_{13} = 0, \qquad T_{12} = 1, \quad T_{23} = 2,$$
so that
$$T = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}.$$
ß=(~ -1 )
-6 '
or $dp/dx = Tp = -1 + 6x$.
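The action of the matrix $T$ on the components of a polynomial can be sketched as follows. The relations $T(1) = 0$, $T(x) = 1$, $T(x^2) = 2x$ used above correspond to the monomial bases $\{1, x, x^2\}$ and $\{1, x\}$. The polynomial $p(x) = 1 - x + 3x^2$ is a hypothetical choice whose derivative is $-1 + 6x$; its constant term is arbitrary, since differentiation discards it.

```python
# Matrix of T = d/dx : P_2[0,1] -> P_1[0,1] relative to the monomial bases
# {1, x, x^2} and {1, x}.
T = [[0, 1, 0],
     [0, 0, 2]]

def apply(T, alpha):
    """Components of Tp in {1, x}, given the components alpha of p in {1, x, x^2}."""
    return [sum(t * a for t, a in zip(row, alpha)) for row in T]

# Hypothetical polynomial p(x) = 1 - x + 3x^2 (constant term chosen arbitrarily).
alpha = [1, -1, 3]
beta = apply(T, alpha)
print(beta)   # components of dp/dx = beta[0] + beta[1] * x
```

Differentiation is thereby reduced to a matrix-vector product on the component vectors.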
if j = k,
(6.9)
otherwise.
We claim that the set $L = \{\ell_1, \ldots, \ell_n\}$ thus defined is a basis for $X'$; indeed,
$L$ is linearly independent since, if
(6.10)
using (6.9). Thus (6.10) holds only if all $\alpha_j = 0$. Secondly, every $\ell \in X'$
has the unique representation
$$(\ell, u) = (\ell, \alpha_1 e_1 + \cdots + \alpha_n e_n) = \sum_{i=1}^{n}\alpha_i\beta_i.$$
if i = j,
otherwise.
Example
12. Let $H = L^2(-1, 1)$ and consider the set $\Phi_1 = \{\sin\pi x, \sin 2\pi x, \ldots\}$.
$\Phi_1$ is an orthonormal set since
$$(\phi_k, \phi_l) = \int_{-1}^{1}\sin k\pi x\,\sin l\pi x\,dx = \begin{cases} 0 & \text{if } k \ne l, \\ 1 & \text{otherwise}. \end{cases}$$
$$u = \sum_{k=1}^{\infty}(u, \phi_k)\phi_k \qquad (6.11)$$
is valid for an element $u$ of an infinite-dimensional inner product space $X$
is essentially the subject of this section. These conditions are discussed in
a moment, but first we must digress and make clear exactly what is meant
by an infinite sum of the form $\sum_{k=1}^{\infty}\alpha_k u_k$.
Suppose that $\{u_k\}$ is a sequence in an inner product space and $\{\alpha_k\}$ is
a sequence of real or complex numbers, and define the corresponding $n$th
partial sum $s_n$ of this sequence by
(6.12)
where $\{\alpha_1, \alpha_2, \ldots\}$ is a set of real or complex numbers. Now suppose that
we generate $s_1, s_2, \ldots$ using (6.12); then the series $\sum_{k=1}^{\infty}\alpha_k u_k$ is said to
converge to an element $u$ if the sequence $\{s_n\}$ of partial sums converges to
$u$. That is, we write $u = \sum_{k=1}^{\infty}\alpha_k u_k$ if, given any $\epsilon > 0$, it is possible to
find a number $N$ such that
It is in this sense that the expression (6.11) must be interpreted: any partial
sum Sn is an approximation to u, and this approximation improves as n
increases.
it follows that the issue is one of determining what the coefficients $c_k$ must
be; it turns out that the coefficients that provide the best approximation
are precisely the Fourier coefficients.
where $v_k = (v, \phi_k)$ is the $k$th Fourier coefficient of $v$ and $c_k$ are arbitrary
real or complex numbers. Then
(6.13)
L L
00 00
L
00
Now
$$\sum_{k=1}^{n}\sum_{l=1}^{n} c_k\bar{c}_l(\phi_k, \phi_l) = \sum_{k=1}^{n} |c_k|^2;$$
next,
Likewise, $(t_n, v) = \sum_{l=1}^{n} c_l\bar{v}_l$. Assembling all these terms, we find that
$$\|v - t_n\|^2 = \|v\|^2 + \sum_{k=1}^{n}\left(-c_k\bar{v}_k - \bar{c}_k v_k + |c_k|^2\right) = \Big(\|v\|^2 - \sum_{k=1}^{n} |v_k|^2\Big) + \sum_{k=1}^{n} |c_k - v_k|^2.$$
The inequality (6.13) follows by comparing this equation with that obtained
by setting $c_k = v_k$, for which case $t_n = s_n$ and the second term on the right-hand
side is zero.
Parts (b) and (c) follow readily from (a), and are treated in Exercise
6.W. 0
Example
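13. Let $X = L^2(-1, 1)$ and let $\Phi$ be the subspace of $X$ spanned by the
orthonormal set $\{1/\sqrt{2}, \cos\pi x, \sin\pi x\}$. Consider the function $v(x) =
x^2$. Then the Fourier coefficients of $v$ are
$$(v, 1/\sqrt{2}) = \int_{-1}^{1}\frac{1}{\sqrt{2}}\,x^2\,dx = \frac{\sqrt{2}}{3}, \qquad (v, \cos\pi x) = -\frac{4}{\pi^2}, \qquad (v, \sin\pi x) = 0,$$
and so, multiplying each coefficient by its basis function,
$$\bar{v} = \frac{\sqrt{2}}{3}\cdot\frac{1}{\sqrt{2}} - \frac{4}{\pi^2}\cos\pi x = \frac{1}{3} - \frac{4}{\pi^2}\cos\pi x.$$
The error in this approximation is
$$\|v - \bar{v}\|_{L^2}^2 = \int_{-1}^{1}\left[x^2 - \left(\frac{1}{3} - \frac{4}{\pi^2}\cos\pi x\right)\right]^2 dx \approx 0.0135,$$
so that $\|v - \bar{v}\|_{L^2} \approx 0.116$ and the relative error is
$$\frac{\|v - \bar{v}\|_{L^2}}{\|v\|_{L^2}} \approx 0.18.$$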
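The numbers in this example can be checked by numerical quadrature. The best approximation is $\bar{v} = \sum_k (v, \phi_k)\phi_k$, so the coefficient $\sqrt{2}/3$ multiplies the basis function $1/\sqrt{2}$ and the constant term of $\bar{v}$ is $1/3$. The sketch below uses a composite midpoint rule; the quadrature rule, the grid size, and the names are assumptions made for illustration.

```python
import math

def integrate(f, a=-1.0, b=1.0, n=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def inner(f, g):
    """L^2(-1, 1) inner product of real functions."""
    return integrate(lambda x: f(x) * g(x))

# The orthonormal set spanning the subspace.
phis = [lambda x: 1.0 / math.sqrt(2.0),
        lambda x: math.cos(math.pi * x),
        lambda x: math.sin(math.pi * x)]
v = lambda x: x * x

coeffs = [inner(v, phi) for phi in phis]          # Fourier coefficients (v, phi_k)
v_bar = lambda x: sum(c * phi(x) for c, phi in zip(coeffs, phis))
residual = lambda x: v(x) - v_bar(x)

err = math.sqrt(inner(residual, residual))        # ||v - v_bar||
rel = err / math.sqrt(inner(v, v))                # relative error
print(coeffs, err, rel)
```

By the Best Approximation Theorem, replacing any of the three coefficients by another value can only increase `err`.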
$$u = \sum_{k=1}^{\infty}(u, \phi_k)\phi_k \qquad (6.16)$$
PROOF. Assume first that $\Phi$ is an orthonormal basis. What we are required
to show is that if $s_n$ denotes, as before, the partial sum $s_n = \sum_{k=1}^{n} u_k\phi_k$,
where $u_k = (u, \phi_k)$, then $s_n \to u$ as $n \to \infty$, if and only if $\Phi$ is an orthonormal
basis. We begin by showing that $\{s_n\}$ is a Cauchy sequence in
$H$.
Let $n \ge m$, and consider
Now recall that in the proof of the Best Approximation Theorem it was
shown that $\|s_n\|^2 = \sum_{k=1}^{n} |u_k|^2$. Furthermore,
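$$\Big(\sum_{k=1}^{m} u_k\phi_k,\ \sum_{l=1}^{n} u_l\phi_l\Big) = \Big(\sum_{k=1}^{m} u_k\phi_k,\ \sum_{l=1}^{m} u_l\phi_l + \sum_{l=m+1}^{n} u_l\phi_l\Big) = \Big(\sum_{k=1}^{m} u_k\phi_k,\ \sum_{l=1}^{m} u_l\phi_l\Big) + \Big(\sum_{k=1}^{m} u_k\phi_k,\ \sum_{l=m+1}^{n} u_l\phi_l\Big)$$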
the last term arises from the fact that $a_m \le a_n$. Now from Bessel's inequality
(6.14), $|a_n| \le \|u\|^2$, and since the right-hand side is independent
of $n$, the sequence $\{a_n\}$ is bounded; it is also monotone increasing, hence
(see Exercise 1.14) it is convergent, and therefore also Cauchy (in $\mathbb{K}$). The
sequence $\{s_n\}$ is thus also a Cauchy sequence (in $H$), and by the completeness
of $H$ this sequence converges, to a member $u'$, say, of $H$. It remains
to show that $u' = u$.
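Consider
$$(u - u', \phi_l) = \Big(u - \lim_{n\to\infty}\sum_{k=1}^{n} u_k\phi_k,\ \phi_l\Big) = \lim_{n\to\infty}\Big(u - \sum_{k=1}^{n} u_k\phi_k,\ \phi_l\Big) \quad \text{(using Exercise 4.2)}$$
$$= \lim_{n\to\infty}\left[(u, \phi_l) - (u, \phi_l)\right] = 0.$$
This holds for all $l$, and $\{\phi_l\}$ is a basis, so it follows that we must have
$u' = u$. This proves the first part of the theorem.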
Conversely, suppose that every u E H has the form (6.16). Then we have,
for u E H,
$$1 = \|\phi_0\|^2 = \sum_{i=1}^{\infty}(\phi_0, \phi_i)^2 = 0,$$
$$\frac{\partial u}{\partial t} - \kappa\frac{\partial^2 u}{\partial x^2} = 0 \quad \text{in } (-\ell, \ell), \qquad (6.18)$$
$$u(x, t) = M(x)N(t);$$
substitution in (6.18) and rearrangement leads to the equation
$$\frac{N'(t)}{N(t)} = \kappa\frac{M''(x)}{M(x)},$$
and since the left-hand side depends only on $t$ and the right-hand side only
on $x$, it follows that each side of this equation is equal to a constant, which
we denote by $-\lambda$, the minus sign being inserted for future convenience. The
boundary conditions (6.19) become $M(-\ell) = M(\ell) = 0$ and so the first
problem becomes one of finding $M$ and $\lambda$ that satisfy
$$-\kappa M''(x) = \lambda M(x), \qquad M(-\ell) = 0, \quad M(\ell) = 0. \qquad (6.21)$$
The second problem involves finding $N$ that satisfies
$$Lu = \lambda u, \qquad (6.23)$$
$$\begin{pmatrix} \cos\alpha\ell & -\sin\alpha\ell \\ \cos\alpha\ell & \sin\alpha\ell \end{pmatrix}\begin{pmatrix} A \\ B \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
In order for this set of equations to have a nontrivial solution it is necessary
and sufficient that the determinant of the matrix be zero; that is, we require
that
$$\cos\alpha\ell\,\sin\alpha\ell = 0 \quad \text{or} \quad \sin 2\alpha\ell = 0,$$
the solution of which is $2\alpha\ell = k\pi$ ($k = 0, 1, 2, \ldots$), so that the problem
(6.21) has an infinite sequence of eigenvalues $\lambda_k$, $k = 0, 1, 2, \ldots$, where
and $\alpha_k = k\pi/2\ell$. We are now able to return to the problem (6.22), which
is considered in the form
$$N'(t) + \lambda_k N(t) = 0.$$
The general solution of (6.18) and (6.19) may now be obtained by adding
up the linear combinations of the possible solutions; we set aside for now
the issue of convergence of the infinite sum that results, and express the
general solution in the form
00
All that remains is to obtain the constants $A_k$ and $B_k$. These may be found
by using the last remaining condition to be satisfied, which is the initial
condition; from (6.20), then,
00
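$$(\sin\alpha_k x, \sin\alpha_j x) = (\cos\alpha_k x, \cos\alpha_j x) = \begin{cases} \ell & \text{if } k = j, \\ 0 & \text{otherwise}, \end{cases} \qquad (\cos\alpha_k x, \sin\alpha_j x) = 0. \qquad (6.26)$$
So if we take the inner product of each side of (6.25) with $\sin\alpha_k x$ and with
$\cos\alpha_k x$ in turn, we find that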
(6.27)
trigonometric functions $\Phi \equiv \{\cos\alpha_k x\}_{k=0}^{\infty} \cup \{\sin\alpha_k x\}_{k=1}^{\infty}$; second, in order
to complete the solution of the problem it is required to expand the function
$f(x)$ in terms of these trigonometric functions. The appearance of the
$L^2$-inner product in (6.27) is not accidental; indeed, the question of whether
the function $f$ may be represented in the form (6.25) is equivalent to asking
whether the set $\Phi$ forms a basis for $L^2(-\ell, \ell)$ (this set is indeed orthogonal
but not orthonormal, although easily orthonormalized). The answer is
affirmative, and this is the relevance of Sturm-Liouville problems to the
subject of orthonormal bases: the eigenfunctions of Sturm-Liouville problems
constitute orthonormal bases for $L^2$. These considerations are precisely
what motivate elementary Fourier analysis, and indeed (6.25) together with
(6.27) simply gives the Fourier series representation of $f$.
We turn now to a more detailed study of Sturm-Liouville problems, the
objective being to work towards this general result.
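$$p(x) > 0, \quad q(x) \ge 0, \quad w(x) > 0 \quad \text{on } [a, b]. \qquad (6.28)$$
$$B_1 u = \alpha_1 u(a) + \beta_1 u'(a), \qquad B_2 u = \alpha_2 u(b) + \beta_2 u'(b). \qquad (6.29)$$
$$-(pu')' + qu = \lambda wu,$$
rather than in the form in which $w$ is found on the left-hand side.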
If any of the conditions in the definition differ from those given here,
whether with respect to the boundedness of the interval, the requirements
(6.28), or the conditions (6.30), the problem is then known as a singular
Sturm-Liouville problem.
The problem (6.31) is considered in the space $L^2(a, b)$ endowed with the
inner product $(\cdot, \cdot)$ defined by
$$(u, v) = \int_a^b u(x)\overline{v(x)}\,w(x)\,dx;$$
(6.32)
Since the space $C_0^\infty(a, b)$ is contained in $D(L)$, and since $C_0^\infty(a, b)$ is dense
in $L^2(a, b)$ (see Chapter 4, Theorem 6 and the discussion that follows it),
it follows that $D(L)$ is dense in $L^2(a, b)$.
Examples
14. The problem (6.21) is a Sturm-Liouville problem with $[a, b] = [-\ell, \ell]$,
$p(x) = \kappa$, $q(x) = 0$, and $w(x) = 1$. With regard to the boundary
conditions, $\alpha_1 = \alpha_2 = 1$ and $\beta_1 = \beta_2 = 0$.
15. Legendre's equation arises when the method of separation of variables
is applied to problems having spherical symmetry (see Exercise 6.23).
This problem takes the form
using (6.34). □
LEMMA 3. Let $L$ be a symmetric linear operator defined on a Hilbert space
$H$. Then the eigenfunctions corresponding to two distinct eigenvalues are
orthogonal.
PROOF. Let $\lambda_1$ and $\lambda_2$ be eigenvalues of $L$ with eigenfunctions $u_1$ and $u_2$,
respectively. Then $Lu_i = \lambda_i u_i$ ($i = 1, 2$) and so
$$\lambda_1(u_1, u_2) - \lambda_2(u_1, u_2) = (\lambda_1 u_1, u_2) - (u_1, \lambda_2 u_2) = (Lu_1, u_2) - (u_1, Lu_2) = 0.$$
Since $\lambda_2 \ne \lambda_1$ by assumption, it follows that $(u_1, u_2) = 0$. □
Properties of Sturm-Liouville operators. We begin by establishing
that $L$ is symmetric; indeed, for any $u$ and $v$ in $D(L)$ we have
$$(Lu, v) - (u, Lv) = \int_a^b\left[-(pu')'\bar{v} + qu\bar{v} + u(p\bar{v}')' - qu\bar{v}\right]dx$$
$$= \int_a^b\left[-(pu')'\bar{v} + (p\bar{v}')'u\right]dx$$
$$= \int_a^b\left[(p\bar{v}'u)' - (p\bar{v}u')'\right]dx$$
$$= \left[p\bar{v}'u - pu'\bar{v}\right]_a^b$$
$$= p(b)\left[u(b)\bar{v}'(b) - u'(b)\bar{v}(b)\right] - p(a)\left[u(a)\bar{v}'(a) - u'(a)\bar{v}(a)\right]. \qquad (6.35)$$
Now since $v$ belongs to $D(L)$, so does $\bar{v}$ since the coefficients in the boundary
terms are all real. It follows that $B_1 u = B_1\bar{v} = 0$ or, recasting this in matrix
form after using (6.29),
$$\begin{pmatrix} u(a) & u'(a) \\ \bar{v}(a) & \bar{v}'(a) \end{pmatrix}\begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
From the set of conditions (6.30) at least one of $\alpha_1$ and $\beta_1$ must be nonzero,
and this is only possible if the matrix is singular, that is, if
$$u(a)\bar{v}'(a) - u'(a)\bar{v}(a) = 0.$$
Repeating the exercise for the boundary condition $B_2 u = B_2\bar{v} = 0$, we
obtain for that case
$$u(b)\bar{v}'(b) - u'(b)\bar{v}(b) = 0.$$
From these two equations it follows that the right-hand side of (6.35) is
zero, as desired. This result, together with a related result, is summarized
in the following theorem.
THEOREM 11.
The proof of part (b) is deferred to Exercise 6.25, as is the proof of the
following corollary.
By way of preparing for the proof of this theorem, we introduce the Rayleigh
quotient $R$, a functional defined on $D(L)$ by
$$R(v) = \frac{(Lv, v)}{\|v\|^2} \quad \text{for all } v \in D(L).$$
LEMMA 4. The minimum of $R(v)$ over all functions $v \in D(L)$ that are
orthogonal to the first $n$ eigenfunctions is $\lambda_{n+1}$. That is,
$$u = \sum_{k=1}^{\infty} u_k\phi_k \quad \text{or} \quad \lim_{n\to\infty}\|u - s_n\| = 0,$$
where $u_k = (u, \phi_k)$ and $s_n = \sum_{k=1}^{n} u_k\phi_k$ (recall Theorem 8 and its proof).
It is convenient to introduce the remainder $r_n = u - s_n$, and we now
estimate $R(r_n)$. For $k = 1, \ldots, n$ we have
(6.36)
Suppose now that we are able to show that the numerator $(Lr_n, r_n)$ is
bounded independently of $n$; then as $n \to \infty$ the right-hand side of (6.36)
goes to infinity, and hence $r_n \to 0$. The theorem is thus proved if it can be
shown that $(Lr_n, r_n)$ is bounded. This is achieved by showing that $(Lr_n, r_n)$
Now it can be shown that in fact $(Ls_n, r_n) = 0$ (see Exercise 6.27), and so
we are left with the inequality
which shows that $(Lr_n, r_n)$ is bounded. This concludes the proof of the
theorem. □
Examples
16. Theorem 12 confirms that the set of eigenfunctions $\{\cos\alpha_k x\}_{k=0}^{\infty} \cup
\{\sin\alpha_k x\}_{k=1}^{\infty}$ of the Sturm-Liouville problem (6.21) forms a basis for
$L^2(a, b)$. In this form the basis is orthogonal, but not orthonormal;
however, it is easily converted to an orthonormal basis by using the
relations (6.26), according to which $\|\cos\alpha x\| = \|\sin\alpha x\| = \sqrt{\ell}$ and,
for the case $k = 0$, $\|1\| = \sqrt{2\ell}$. The orthonormal basis is thus
17. We return to the problem (6.21), but this time express the solution
in the form
and this forms a basis for the space $L^2(-\ell, \ell)$ of complex-valued functions
(see also Example 9). Normalization is easy, since
$$P_n(x) = \frac{1}{2^n n!}\frac{d^n}{dx^n}(x^2 - 1)^n, \quad n = 0, 1, 2, \ldots.$$
Since
$$(P_n, P_n) = \int_{-1}^{1} P_n^2(x)\,dx = \frac{2}{2n + 1},$$
the corresponding orthonormal set is $\{\phi_n(x),\ n = 0, 1, \ldots\}$ where
$\phi_n = [(2n + 1)/2]^{1/2}P_n$. It is also worth noting that the Legendre
polynomials can be obtained by applying the Gram-Schmidt procedure
to the set $\{1, x, x^2, \ldots\}$ of monomials (cf. Exercise 6.8).
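Both statements can be checked with exact arithmetic. The sketch below applies Gram-Schmidt orthogonalization to the monomials in $L^2(-1, 1)$, rescales each result so that its value at $x = 1$ is 1 (the usual normalization of the Legendre polynomials), and prints $(P_n, P_n)$. Polynomials are stored as coefficient lists with the lowest degree first; this representation and the function names are assumptions made for illustration.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of polynomials given as coefficient lists (lowest degree first)."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def inner(p, q):
    """Exact L^2(-1, 1) inner product: the integral of p*q over [-1, 1]."""
    # The integral of x^k over [-1, 1] is 2/(k+1) for even k and 0 for odd k.
    return sum(c * Fraction(2, k + 1)
               for k, c in enumerate(poly_mul(p, q)) if k % 2 == 0)

def legendre(n_max):
    """Gram-Schmidt on 1, x, x^2, ..., rescaled so that P_n(1) = 1."""
    polys = []
    for n in range(n_max + 1):
        p = [Fraction(0)] * n + [Fraction(1)]            # the monomial x^n
        for q in polys:
            c = inner(p, q) / inner(q, q)
            q_padded = q + [Fraction(0)] * (len(p) - len(q))
            p = [a - c * b for a, b in zip(p, q_padded)]
        scale = sum(p)                                   # value of p at x = 1
        polys.append([a / scale for a in p])
    return polys

for n, P in enumerate(legendre(4)):
    print(n, P, inner(P, P))
```

The last column reproduces $2/(2n + 1)$, so multiplying $P_n$ by $[(2n + 1)/2]^{1/2}$ gives a function of unit norm.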
6.7 Exercises
Finite-dimensional spaces
6.4. Let $M$ be the vector space of all real $3 \times 3$ matrices and $K$ the subset
of matrices of the form
for all nonzero real numbers $\alpha, \beta, \gamma, \delta$. What are $\dim M$ and $\dim K$?
Display a basis for $K$.
6.9. Test the set $\{u_1, u_2\} = \{e^x, e^{-3x}\}$ for linear dependence in $L^2(0, 1)$
by evaluating $\det A$ where $A_{ij} = (u_i, u_j)_{L^2}$.
$$Tp = d^2p/dx^2.$$
Find the matrix corresponding to $T$.
6.12. Let $X$ be the linear space of all functions of the form $u(x) = \alpha +
\beta\cos x + \gamma\sin x$, $0 \le x \le 2\pi$, and define $T : X \to X$ by
$$Tu = \int_0^{2\pi}\left[1 + \cos(x - \xi)\right]u(\xi)\,d\xi.$$
that is, $b \in N(T^t)^\perp$. (Note that $(x, Ty) = (T^t x, y)$.) This shows
that $R(T) \subset N(T^t)^\perp$. Show that $R(T) = N(T^t)^\perp$.
Determine $N(T^t)$ for the matrix
T~[!-~n
and hence find the general form of the vector b such that Ta = b.
6.14. Find a basis for the null space of the functional $\ell : \mathbb{R}^3 \to \mathbb{R}$, $(\ell, x) =
a_1 x_1 + a_2 x_2 + a_3 x_3$, where $a_1 \ne 0$.
6.15. Prove Theorem 6, which states that there exists an isometric isomorphism
from any $n$-dimensional inner product space to $\mathbb{R}^n$. Show that
this does not hold in general for finite-dimensional normed spaces by
verifying that $(\mathbb{R}^2, \|\cdot\|_1)$ and $(\mathbb{R}^2, \|\cdot\|_2)$ are not isometrically isomorphic.
6.16. Let $X = \mathbb{R}^3$ with the norm $\|\cdot\|_1$. If $\ell$ is as defined in Exercise 6.14,
find $\|\ell\|$.
Fourier series in Hilbert spaces
6.17. The set $\{1/\sqrt{2\pi},\ \pi^{-1/2}\cos kx,\ \pi^{-1/2}\sin kx,\ k = 1, 2, \ldots\}$ is an orthonormal basis
for $L^2(-\pi, \pi)$. Find the Fourier coefficients $u_i$ if (i) $u(x) = 1$; (ii)
$$u(x) = \begin{cases} -1, & -\pi \le x \le 0, \\ 1, & 0 < x \le \pi. \end{cases}$$
6.18. Determine the first three terms of the expansion $u = \sum_{k=0}^{\infty} u_k e_k$ on
$[-1, 1]$ when $e_k$ are the normalized Legendre polynomials and
Pu = 2)u, rPk)rPk.
k=l
6.22. Let $V$ be the subspace of $L^2(-1, 1)$ spanned by the first four orthonormal
Legendre polynomials $\phi_n = (n + \tfrac{1}{2})^{1/2}P_n(x)$, $n = 0, 1, \ldots, 3$.
Write out $\phi_n$ explicitly, find the orthogonal projection of $u(x) = x^4$
onto $V$, and verify the inequality $\|u - Pu\| \le \|u - v\|$, $v \in V$, for any
suitable choice of $v$.
Sturm-Liouville problems
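6.23. Spherical coordinates $(r, \theta, \phi)$ are related to Cartesian coordinates
$(x, y, z)$ through
$$x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta,$$
and the Laplacian operator in spherical coordinates is given by
$$\nabla^2 u = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial u}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial u}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 u}{\partial\phi^2}.$$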
$$Su = -\frac{d^2u}{dx^2} + a^2x^2u, \quad x \in (-\infty, \infty),$$
(6.37)
n = 0, 1,2, ....
tions. Finally, and perhaps of most importance is the fact that approximate
solution methods such as the Galerkin and finite element methods are most
conveniently and correctly formulated in finite-dimensional subspaces of
Sobolev spaces.
In order to discuss Sobolev spaces it is necessary first of all to learn a little
about distributions. This necessary background is provided in Sections 7.1
and 7.2. Then in Section 7.3 we introduce Sobolev spaces, and discuss some
of their more important properties. The apparently innocuous question of
how one obtains the value of a function on the boundary of a domain,
given the function on the domain, is shown in Section 7.4 to be a nontrivial
problem. We show that a function has to satisfy certain requirements in
order for its values on the boundary to be defined unambiguously. Finally, in
Section 7.5 we discuss the Sobolev spaces $H_0^m(\Omega)$ of functions that, together
with their derivatives of order less than $m$, vanish on the boundary. We also
introduce in this section the space $H^{-m}(\Omega)$ (for $m > 0$) which is defined
to be the dual space of $H_0^m(\Omega)$.
7.1 Distributions
In this section and in those that follow it is often necessary to deal with
partial derivatives of all orders, and when discussing general ideas the notation can sometimes become very clumsy. As a prelude to the main topic
of this chapter the very useful multi-index notation for partial derivatives
is introduced.
Thus if $|\alpha| = m$, then $D^\alpha u$ denotes one of the $m$th partial derivatives of $u$.
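To make the bookkeeping concrete, the following sketch enumerates the multi-indices of a given order. It assumes the standard convention that $\alpha = (\alpha_1, \ldots, \alpha_n)$ is an $n$-tuple of non-negative integers with $|\alpha| = \alpha_1 + \cdots + \alpha_n$; the function names `multi_indices` and `label` are hypothetical and chosen for this illustration only.

```python
from itertools import product
from math import comb

def multi_indices(n, m):
    """Return every multi-index (alpha_1, ..., alpha_n) with |alpha| = m."""
    return [alpha for alpha in product(range(m + 1), repeat=n)
            if sum(alpha) == m]

def label(alpha, names="xyz"):
    """Spell out D^alpha u as a partial derivative (at most three variables)."""
    order = sum(alpha)
    if order == 0:
        return "u"
    bottom = " ".join(
        f"d{names[i]}" + (f"^{a}" if a > 1 else "")
        for i, a in enumerate(alpha) if a > 0
    )
    top = "du" if order == 1 else f"d^{order}u"
    return f"{top}/{bottom}"

# All second-order derivatives of a function of two variables
second_order = multi_indices(2, 2)
for alpha in second_order:
    print(alpha, label(alpha))

# Number of second-order multi-indices in three variables
count = len(multi_indices(3, 2))
```

For $n$ variables the number of multi-indices with $|\alpha| = m$ is the binomial coefficient $\binom{m+n-1}{n-1}$, which the count above reproduces for $n = 3$, $m = 2$.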
Examples
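$$\sum_{|\alpha| \le 2} a_\alpha D^\alpha u,$$
where $a_\alpha$ are given functions of $x$ and $y$. Thus
$$\sum_{|\alpha| = 1} a_\alpha D^\alpha u = a_{10}\frac{\partial^1 u}{\partial x^1\,\partial y^0} + a_{01}\frac{\partial^1 u}{\partial x^0\,\partial y^1} = 2x\left(\frac{\partial u}{\partial x} + \frac{\partial u}{\partial y}\right),$$
$$\sum_{|\alpha| = 2} a_\alpha D^\alpha u = a_{20}\frac{\partial^2 u}{\partial x^2\,\partial y^0} + a_{11}\frac{\partial^2 u}{\partial x^1\,\partial y^1} + a_{02}\frac{\partial^2 u}{\partial x^0\,\partial y^2} = 1\cdot\frac{\partial^2 u}{\partial x^2} + x^2\frac{\partial^2 u}{\partial x\,\partial y} + 1\cdot\frac{\partial^2 u}{\partial y^2}.$$
Collecting all terms, it turns out that
$$\sum_{|\alpha| \le 2} a_\alpha D^\alpha u = \frac{\partial^2 u}{\partial x^2} + x^2\frac{\partial^2 u}{\partial x\,\partial y} + \frac{\partial^2 u}{\partial y^2} + 2x\left(\frac{\partial u}{\partial x} + \frac{\partial u}{\partial y}\right) + u.$$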
In Section 5.4 we discussed an example that showed that the Dirac delta
$\delta$ is not a function at all, but is more correctly viewed as a continuous linear
functional, in that it operates on a continuous function $u$ to produce a real
number, namely, $u(0)$:
The space $\mathcal{D}(\Omega)$. For reasons that become evident later, it is desirable to
consider the action of distributions not on all of $C(\Omega)$, but on only the small
subset $C_0^\infty(\Omega)$ of infinitely differentiable functions with compact support;
the notion of functions having compact support was, of course, introduced
in Section 4.4. In the context of distribution theory it is conventional to
use the notation $\mathcal{D}(\Omega)$ for $C_0^\infty(\Omega)$, and to refer to $\mathcal{D}(\Omega)$ as the space of test
functions, because it is against functions in this space that distributions
are tested, in a sense to be made precise.
Example
3. A canonical example of a member of $\mathcal{D}(\Omega)$ is the function
$|x| \ge a$,
$|x| < a$,
defined on $\Omega = (-b, b)$, where $b > a > 0$, as shown in Figure 7.1. It
is not difficult to show that $\phi$ is infinitely differentiable, and that the
support of $\phi$ and all of its derivatives is the set $[-a, a]$.
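The behaviour described in this example can be explored numerically. The formula used below, $\phi(x) = \exp[a^2/(x^2 - a^2)]$ for $|x| < a$ and $\phi(x) = 0$ otherwise, is an assumption: it is the usual textbook bump function and may differ in detail from the expression intended in the example. The names `bump` and `bump_derivative` are likewise hypothetical.

```python
import math

def bump(x, a=1.0):
    """Assumed test function: exp(a^2 / (x^2 - a^2)) inside (-a, a), 0 outside."""
    if abs(x) >= a:
        return 0.0
    return math.exp(a * a / (x * x - a * a))

def bump_derivative(x, a=1.0):
    """First derivative of the assumed bump, from the chain rule."""
    if abs(x) >= a:
        return 0.0
    return bump(x, a) * (-2.0 * a * a * x) / (x * x - a * a) ** 2

a = 1.0
peak = bump(0.0, a)                                   # value at the centre
# Values at distance 10^-1, ..., 10^-4 inside the edge of the support
near_edge = [bump(a - 10.0 ** (-k), a) for k in range(1, 5)]
slope_near_edge = bump_derivative(a - 1e-3, a)
print(peak, near_edge, slope_near_edge)
```

The values decay towards the edge of the support far faster than any power of the distance to $\pm a$, which is what allows every derivative to be continued by zero outside $[-a, a]$.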
It is possible to provide the space $\mathcal{D}(\Omega)$ with a topology known as an
inductive limit topology, but such considerations are rather complicated.
Example
which is finite, and so we are assured that $\langle F, \phi \rangle$ has meaning. Under these
circumstances $F$ is said to be the distribution generated by $f$. In future the
different notation for a function ($f$) and its associated distribution ($F$) is
dispensed with, and we simply write $f$ for both quantities. Whether $f$ is
the function or the distribution that it generates is clear from the context;
for example, $f$ in the expression $\int_\Omega f\phi\,dx$ is clearly a function whereas $f$
in $\langle f, \phi \rangle$ is a distribution.
Examples
and is bounded for every closed interval $[a, b]$ in $[-1, 1]$. Thus $f$ generates the distribution, also denoted by $f$, and defined by
$$\langle H, \phi \rangle = \int_{-1}^{1} H(x)\phi(x)\,dx = \int_0^1 \phi(x)\,dx.$$
A distribution that is generated by a locally integrable function is called
a regular distribution. If a distribution cannot be generated by a locally
integrable function, it is then said to be a singular distribution. An important example of a singular distribution is the Dirac delta; it is not difficult
to show (see Exercise 7.2) that there does not exist a locally integrable
function that generates $\delta$.
It is possible to define in a very natural way the product of a function
and a distribution. Specifically, if $\Omega \subset \mathbb{R}^n$, $u$ belongs to $C^\infty(\Omega)$, and $f$ is a
distribution on $\Omega$, then by $uf$ we understand the distribution satisfying
$$\int_\Omega [u(x)f(x)]\phi(x)\,dx = \int_\Omega f(x)[u(x)\phi(x)]\,dx$$
that holds when $f$ is locally integrable.
Example
7. The distribution $u\delta$ on $(-1, 1)$, where $u(x) = x$, satisfies
$$\int_\Omega u\frac{\partial v}{\partial x_i}\,dx = \int_\Gamma uv\nu_i\,ds - \int_\Omega v\frac{\partial u}{\partial x_i}\,dx$$ (7.2)
$$\int_a^b uv'\,dx = [uv]_a^b - \int_a^b vu'\,dx,$$ (7.3)
Indeed, (7.3) is just a special case of (7.2) with $\Omega = (a, b)$, $\Gamma = \{a, b\}$, and
$\nu = \pm 1$ at $x = b$ and $a$, respectively.
The theorem is easily generalized to a result involving partial derivatives
of order $m$ of functions $u, v \in C^m(\bar\Omega)$ (see Exercise 7.4): by replacing $u$ by
$D^\alpha u$ in (7.2), and with $|\alpha| = m$, one can show that
(7.5)
We take (7.6) as the basis for defining the derivative of any distribution
$f$, as follows. The $\alpha$th distributional or generalized partial derivative of a
distribution $f$ is defined to be a distribution, denoted by $D^\alpha f$, that satisfies
$$\langle D^\alpha f, \phi \rangle = (-1)^{|\alpha|}\langle f, D^\alpha\phi \rangle \quad \text{for all } \phi \in \mathcal{D}(\Omega).$$ (7.7)
7.2 Derivatives of distributions 221
Thus we use the same notation for the generalized derivative of a distribution as that used for the conventional derivative of a function. Of course,
if the function belongs to $C^m(\Omega)$, then the generalized derivative coincides
with the conventional $\alpha$th partial derivative for $|\alpha| \le m$, as can be seen
immediately from (7.5) and (7.6).
For the special case of first derivatives the multi-index notation can be
dispensed with, in which case (7.7) becomes
(7.8)
Examples
$$-\int_{-1}^{1} H\frac{d\phi}{dx}\,dx = -\int_0^1 \frac{d\phi}{dx}\,dx = -[\phi]_0^1 = \phi(0) = \langle \delta, \phi \rangle,$$
so that, symbolically, $H' = \delta$; that is, the derivative $H'$ of the step
function is the Dirac delta.
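The identity $\langle H', \phi \rangle = \phi(0)$ can be checked by quadrature for one particular test function. In the sketch below the test function is an assumed bump supported on $[-\tfrac12, \tfrac12]$, and `phi`, `dphi`, and `simpson` are hypothetical helper names; the check illustrates the identity for this single $\phi$ and proves nothing in general.

```python
import math

def phi(x, a=0.5):
    """Assumed test function supported on [-a, a], inside (-1, 1)."""
    if abs(x) >= a:
        return 0.0
    return math.exp(a * a / (x * x - a * a))

def dphi(x, a=0.5):
    """Derivative of phi, from the chain rule."""
    if abs(x) >= a:
        return 0.0
    return phi(x, a) * (-2.0 * a * a * x) / (x * x - a * a) ** 2

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

# H = 0 on (-1, 0) and H = 1 on (0, 1), so -<H, phi'> reduces to
# minus the integral of phi' over (0, 1).
pairing = -simpson(dphi, 0.0, 1.0)
print(pairing, phi(0.0))
```

Both printed numbers should agree with $e^{-1} \approx 0.3679$ to quadrature accuracy, since $\phi(1) = 0$ and $\phi(0) = e^{-1}$ for this choice of bump.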
$$R(\mathbf{x}) = \begin{cases} xy & \text{if } x \ge 0,\ y \ge 0, \\ 0 & \text{if } x < 0 \text{ or } y < 0. \end{cases}$$
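The generalized derivative $D^{(1,0)}R = \partial R/\partial x$ is found from
$$\left\langle \frac{\partial R}{\partial x}, \phi \right\rangle = -\left\langle R, \frac{\partial\phi}{\partial x} \right\rangle = -\int_{-1}^{1}\int_{-1}^{1} R(\mathbf{x})\frac{\partial\phi}{\partial x}\,dx\,dy \quad (R \text{ is locally integrable})$$
$$= -\int_0^1\int_0^1 xy\frac{\partial\phi}{\partial x}\,dx\,dy = \int_0^1\int_0^1 y\phi\,dx\,dy.$$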
222 7. Distributions and Sobolev spaces
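$$(-1)^2\left\langle R, \frac{\partial^2\phi}{\partial x\,\partial y} \right\rangle = \int_0^1\int_0^1 xy\frac{\partial^2\phi}{\partial x\,\partial y}\,dx\,dy = \int_0^1\int_0^1 \phi\,dx\,dy \quad \text{(applying Green's theorem twice)}$$
$$= \int_\Omega H(\mathbf{x})\phi(\mathbf{x})\,dx\,dy,$$
Hence
$$\left\langle \frac{\partial^2 R}{\partial x\,\partial y}, \phi \right\rangle = \langle H, \phi \rangle \quad \text{so that} \quad D^{(1,1)}R = H.$$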
It follows in this case from (7.7) and (7.9) that the functions $u$ and $D^\alpha u$
are related by
FIGURE 7.3. The function $u(x) = |x|$ and its weak derivative
Example
10. The function $u(x) = |x|$ belongs to $C[-1, 1]$, but the classical derivative $u'$ does not exist, in that it is not defined at the origin. However,
the weak derivative of $u$ is the function
$$u' = \begin{cases} -1 & \text{for } -1 \le x < 0, \\ +1 & \text{for } 0 \le x \le 1 \end{cases}$$
(see Figure 7.3), since the identity $\int_{-1}^{1} u'\phi\,dx = -\int_{-1}^{1} u\phi'\,dx$ is easily
shown to hold. Note furthermore that $u' \in L^2(-1, 1)$, and is therefore
of course locally integrable.
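The identity in Example 10 can be verified exactly for a simple stand-in for $\phi$. The sketch below uses the polynomial $\phi(x) = (1 - x^2)(1 + x)$, which vanishes at $x = \pm 1$ but is not a genuine member of $\mathcal{D}(-1, 1)$; that choice, and the helper names `integrate`, `derivative`, and `times_x`, are assumptions made so that all integrals can be done in rational arithmetic.

```python
from fractions import Fraction

def integrate(coeffs, lo, hi):
    """Exact integral over [lo, hi] of sum_k coeffs[k] * x^k."""
    lo, hi = Fraction(lo), Fraction(hi)
    return sum(Fraction(c) * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1)
               for k, c in enumerate(coeffs))

def derivative(coeffs):
    """Coefficients of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def times_x(coeffs):
    """Coefficients of x * p(x)."""
    return [0] + list(coeffs)

phi = [1, 1, -1, -1]                  # (1 - x^2)(1 + x) = 1 + x - x^2 - x^3
dphi = derivative(phi)                # 1 - 2x - 3x^2

# Left side: u' = -1 on (-1, 0) and +1 on (0, 1)
lhs = -integrate(phi, -1, 0) + integrate(phi, 0, 1)

# Right side: |x| = -x on (-1, 0) and x on (0, 1)
x_dphi = times_x(dphi)
rhs = -(-integrate(x_dphi, -1, 0) + integrate(x_dphi, 0, 1))
print(lhs, rhs)
```

Both sides evaluate to $\tfrac12$ for this $\phi$. An asymmetric $\phi$ is used on purpose: for an even function both integrals vanish and the check would be uninformative.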
The preceding example illustrates one fundamental difference between clas-
sical and weak derivatives. The classical derivative, if it exists, is a function
defined pointwise on an interval, so it must be at least continuous. A weak
derivative, on the other hand, need only be locally integrable. Thus any
function $v$ differing from a weak derivative $u'$ on a set of measure zero
(for example, at a finite number of points in the real line) is itself a weak
derivative of $u$.
(7.10) would be a simple first order differential equation. Since $f$ and $g$ are
actually distributions, we go back to the definition (7.8) of a generalized
derivative; then (7.10) really reads
$Ag = f$, (7.12)
which is equivalent to
(7.13)
$Ag = f$, (7.14)
Examples
12. The equation $g'' = \delta'$ has no classical solution on $(-1, 1)$ but its weak
solution is
FIGURE 7.4. A local coordinate system for classifying the boundary of a domain,
and examples of Lipschitz and non-Lipschitz domains
Next, set up a coordinate system $(\xi_1, \ldots, \xi_n)$ such that the segment $\Gamma \cap B(x_0, \epsilon)$ can be expressed as the function
$$\xi_n = f(\xi_1, \ldots, \xi_{n-1}).$$
where $\boldsymbol{\xi} = (\xi_1, \ldots, \xi_{n-1})$ and $\boldsymbol{\eta} = (\eta_1, \ldots, \eta_{n-1})$ (recall that a Lipschitz-continuous function is uniformly continuous). The situation is illustrated in
Figure 7.4 for $n = 2$. Unless otherwise stated $\Gamma$ is assumed to be Lipschitz;
this includes, in $\mathbb{R}^2$, boundaries that are triangular, rectangular, and annular, whereas in $\mathbb{R}^3$ tetrahedra and cubes are Lipschitz. Boundaries that
are not Lipschitz include those with cusps and those that have the domain
$\Omega$ on both sides, as shown in Figure 7.4.
The Sobolev spaces $H^m(\Omega)$. The Sobolev space of order $m$, denoted by
$H^m(\Omega)$, is defined to be the space consisting of those functions in $L^2(\Omega)$
that, together with all their weak partial derivatives up to and including
those of order $m$, belong to $L^2(\Omega)$:
$$H^m(\Omega) = \{u : D^\alpha u \in L^2(\Omega) \text{ for all } \alpha \text{ such that } |\alpha| \le m\}.$$
7.3 The Sobolev spaces H"'(rl) 227
We consider real-valued functions only, and make $H^m(\Omega)$ an inner product
space by introducing the Sobolev inner product $(\cdot, \cdot)_{H^m}$ defined by
$$(u, v)_{H^m} = \int_\Omega \sum_{|\alpha| \le m} (D^\alpha u)(D^\alpha v)\,dx \quad \text{for } u, v \in H^m(\Omega).$$
This inner product in turn generates the Sobolev norm $\|\cdot\|_{H^m}$ defined by
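Note that $H^0(\Omega) = L^2(\Omega)$, and furthermore that we may write $(u, v)_{H^m}$
as
$$(u, v)_{H^m} = \sum_{|\alpha| \le m} (D^\alpha u, D^\alpha v)_{L^2};$$
in other words, the Sobolev inner product $(u, v)_{H^m}$ is equal to the sum of
the $L^2$-inner products of $D^\alpha u$ and $D^\alpha v$ over all $\alpha$ such that $|\alpha| \le m$. Of
course, we could also write
$$\|u\|_{H^m}^2 = \sum_{|\alpha| \le m} \|D^\alpha u\|_{L^2}^2;$$
when written out in full for the case $m = 2$, and for a function $u$ defined
on $\Omega \subset \mathbb{R}^2$, this becomes
$$\|u\|_{H^2}^2 = \int_\Omega \left[ u^2 + \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 + \left(\frac{\partial^2 u}{\partial x^2}\right)^2 + \left(\frac{\partial^2 u}{\partial x\,\partial y}\right)^2 + \left(\frac{\partial^2 u}{\partial y^2}\right)^2 \right] dx.$$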
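As a one-dimensional stand-in for the formula above, the following sketch evaluates $\|u\|_{H^m}^2 = \sum_{j \le m} \|u^{(j)}\|_{L^2}^2$ exactly for a polynomial. The choice $u(x) = x^2$ on $(0, 1)$ and the function names `derivative`, `l2_norm_sq`, and `sobolev_norm_sq` are assumptions introduced purely for illustration.

```python
from fractions import Fraction

def derivative(coeffs):
    """Differentiate a polynomial given by coefficients, lowest degree first."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [Fraction(0)]

def l2_norm_sq(coeffs, lo, hi):
    """Exact integral of p(x)^2 over [lo, hi]."""
    lo, hi = Fraction(lo), Fraction(hi)
    total = Fraction(0)
    for i, a in enumerate(coeffs):
        for j, b in enumerate(coeffs):
            k = i + j + 1
            total += Fraction(a) * Fraction(b) * (hi ** k - lo ** k) / k
    return total

def sobolev_norm_sq(coeffs, m, lo, hi):
    """Sum of the squared L2 norms of u, u', ..., u^(m) on (lo, hi)."""
    total, current = Fraction(0), list(coeffs)
    for _ in range(m + 1):
        total += l2_norm_sq(current, lo, hi)
        current = derivative(current)
    return total

u = [0, 0, 1]                                         # u(x) = x^2 on (0, 1)
norms = [sobolev_norm_sq(u, m, 0, 1) for m in range(3)]
print(norms)
```

For this $u$ the squared norms are $\tfrac15$, $\tfrac{23}{15}$, and $\tfrac{83}{15}$ for $m = 0, 1, 2$. They can only grow with $m$, because each step adds a non-negative term.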
Examples
0< x::; 1,
u(x) = {
1< x < 2.
Then
By inspection $u$, $u'$, and $u''$ all belong to $L^2(0, 2)$; however, the (generalized) derivative of $u''$ is $u''' = 2\delta(x - 1) \notin L^2(0, 2)$. Hence $u$ is a
member of $H^2(0, 2)$, the function $v$ belongs to $H^1(0, 2)$, and $w$ belongs to $L^2(0, 2) = H^0(0, 2)$ (see Figure 7.5). The respective norms
of these functions are
$$\int_0^2 [u^2 + (u')^2 + (u'')^2]\,dx = 71.37,$$
$$\int_0^2 [v^2 + (v')^2]\,dx = 39,$$
$$\int_0^2 w^2\,dx = 20.$$
$$u(\mathbf{x}) = \begin{cases} x & \text{for } x > 0, \\ 0 & \text{for } x \le 0, \end{cases}$$
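$$\int_\Omega \frac{\partial u}{\partial y}\phi\,dx\,dy = -\int_\Omega u\frac{\partial\phi}{\partial y}\,dx\,dy = -\int_0^1\left[\int_{-1}^{1} x\frac{\partial\phi}{\partial y}\,dy\right]dx$$
$$\int_\Omega \frac{\partial u}{\partial x}\phi\,dx\,dy = -\int_\Omega u\frac{\partial\phi}{\partial x}\,dx\,dy = -\int_{-1}^{1}\left(\int_0^1 x\frac{\partial\phi}{\partial x}\,dx\right)dy = \int_{-1}^{1}\int_0^1 \phi\,dx\,dy = \int_\Omega H(x)\phi(x, y)\,dx\,dy.$$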
PROOF. Only Parts (i) and (ii) are proved; the proof of (iii) is rather long
and technical, and its details may be found in the references at the end of
this chapter.
(i) If $u \in H^r(\Omega)$, then $D^\alpha u$ belongs to $L^2(\Omega)$ for all $\alpha$ such that $|\alpha| \le r$,
and thus for all $\alpha$ such that $|\alpha| \le m$. So $u \in H^m(\Omega)$, and $H^r(\Omega) \subseteq H^m(\Omega)$.
Hence $\{D^\alpha u_k\}$ is a Cauchy sequence in $L^2(\Omega)$ for each $\alpha$ such that $|\alpha| \le m$.
Since $L^2$ is complete, it follows that $D^\alpha u_k$ converges to a function $u^{(\alpha)}$, say,
that belongs to $L^2$. In particular, for $|\alpha| = 0$, $u_k$ converges to a function
$u$, say, in $L^2$.
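We show next that $u$ is in $H^m(\Omega)$. Consider
$$\int_\Omega \left(\lim_{k\to\infty} D^\alpha u_k\right)\phi\,dx = \left(\lim_{k\to\infty} D^\alpha u_k, \phi\right)_{L^2} = (-1)^{|\alpha|}\left(\lim_{k\to\infty} u_k, D^\alpha\phi\right)_{L^2} = (-1)^{|\alpha|}\int_\Omega u\,D^\alpha\phi\,dx,$$
where we have used the result of Exercise 2 of Chapter 4, (7.7), and the fact
that $D^\alpha u_k$ is a regular distribution. Thus $u^{(\alpha)}$ is the $\alpha$th weak derivative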
Part (iii) of the theorem has an important interpretation: from the definition of the completion of a space (Section 4.3) we know that $C^\infty(\bar\Omega)$ is
dense in $H^m(\Omega)$; hence, for any $u \in H^m(\Omega)$ it is always possible to find an
infinitely differentiable function $f(x)$, say, that is arbitrarily close to $u$ in
the sense that
for any given $\epsilon > 0$. In other words, every member of $H^m(\Omega)$ is either a
member of $C^\infty(\bar\Omega)$, or may be approximated arbitrarily closely by a function
from this space.
Example
15. Refer to Example 13: from what was said there we conclude that,
given any $\epsilon > 0$, it is possible to find functions $f$, $g$, and $h$ in $C^\infty(\bar\Omega)$
that satisfy
$$\left[\int_0^2 (u - f)^2 + (u' - f')^2 + (u'' - f'')^2\,dx\right]^{1/2} < \epsilon,$$
$$\left[\int_0^2 (v - g)^2 + (v' - g')^2\,dx\right]^{1/2} < \epsilon,$$
$$\left[\int_0^2 (w - h)^2\,dx\right]^{1/2} < \epsilon.$$
Before stating the theorem we give a simple example to show that intuition would be misleading. Let $\Omega$ be the disc of radius $\tfrac{1}{2}$ in $\mathbb{R}^2$, and let
$u = \ln(\ln(1/r))$, where $r^2 = x^2 + y^2$. Then, using polar coordinates $(r, \theta)$,
$$\iint_\Omega u^2\,dx\,dy = \int_0^{2\pi}\int_0^{1/2} [\ln(\ln(1/r))]^2\,r\,dr\,d\theta = \int_0^{2\pi}\int_{\ln 2}^{\infty} (e^{-t}\ln t)^2\,dt\,d\theta$$
(7.17)
is continuous.
domains that are subsets of the plane, though, $n = 2$ and we require that
a function be a member of $H^2(\Omega)$ in order to guarantee its continuity.
The main point about Theorem 3 is that every function in $H^m(\Omega)$ can
be approximated arbitrarily closely by a member of $C^m(\Omega)$.
We conclude this section with an important and frequently useful inequality.
$$\|u\|_{L^2}^2 \le C_1\int_\Omega \sum_{|\alpha| = 1} |D^\alpha u|^2\,dx + C_2\left[\int_\Omega u(x)\,dx\right]^2$$ (7.18)
More generally, for any $u \in H^k(\Omega)$ there exist constants $C_3$ and $C_4$ such
that
(7.19)
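PROOF. We prove (7.19) for the case $\Omega = (a, b) \subset \mathbb{R}$; the higher-dimensional
results follow in a similar way. Thus for $n = 1$ we have to derive the inequality
$$\|u\|_{H^k}^2 \le C_1\int_a^b \left(\frac{d^k u}{dx^k}\right)^2 dx + C_2\sum_{j=0}^{k-1}\left(\int_a^b \frac{d^j u}{dx^j}\,dx\right)^2.$$ (7.20)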
$$< (b - a)\int_a^b (u')^2\,dx.$$
Now we integrate with respect to $\xi$, keeping $\eta$ fixed, and then with respect
to $\eta$, to get
21 u(Od~
b
l
b
U(11)d11
or
(7.21)
l
a
b (dk-1U)2
dx k-l dx
(7.22)
Next, to get (7.20) for $k = 2$ we add $\int_a^b (u'')^2\,dx$ to both sides of (7.22) and
use (7.21), to obtain
$$\|u\|_{H^2}^2 \le (1 + C_1(1 + C_1))\int_a^b (u'')^2\,dx + C_2\left(\int_a^b u\,dx\right)^2 + C_2(1 + C_1)\left(\int_a^b u'\,dx\right)^2$$
The Sobolev spaces $W^{m,p}(\Omega)$. The Sobolev spaces $H^m(\Omega)$ have been
defined by taking as a point of departure the Hilbert space $L^2(\Omega)$; in this
way it has been possible to introduce a family of Hilbert spaces, of which
$L^2$ is a special case, viz. the case $m = 0$. In much the same way it is possible
to take as a point of departure the spaces $L^p(\Omega)$ for $1 \le p \le \infty$, and in
this way to introduce, for each $p$, a Sobolev space that is a Banach space.
Thus, for $m = 0, 1, \ldots$, the Sobolev space $W^{m,p}(\Omega)$ is defined to be the
space of functions that, together with all their weak derivatives up to and
including those of order $m$, belong to $L^p(\Omega)$:
(7.23)
That is, the Sobolev norm $\|u\|_{m,p}^p$ is equal to the sum of the $p$th powers of
the $L^p$-norms of $D^\alpha u$ over all $\alpha$ such that $|\alpha| \le m$.
Theorems 1 to 3 all have counterparts for the spaces $W^{m,p}(\Omega)$, and some
of these extensions are given in the following theorem.
(iii) $W^{m,p}(\Omega)$ is the completion, with respect to the norm $\|\cdot\|_{W^{m,p}}$, of the
space $C^\infty(\bar\Omega)$;
(7.24)
Example
$0 \le x < 1/k$,
$1/k \le x < 1 - 1/k$,
$1 - 1/k \le x \le 1$,
for some constant $C > 0$. Thus if $u$ and $v$ are two functions in $L^2(\Omega)$
(they could be continuous functions) that are close in the sense that $\|u - v\|_{L^2(\Omega)} < \epsilon$ for some small $\epsilon > 0$, then (7.25) gives immediately
so that $u|_\Gamma$ and $v|_\Gamma$ are correspondingly close. However, if $\gamma$ is not continuous there is no guarantee that this situation would obtain. This is obviously
untenable if we are to develop a coherent theory of boundary value problems.
-1
satisfies
(7.26)
for some constant $C > 0$ (note the norms used). The proof of this result is
contained in the next lemma.
PROOF. We prove the result for the case $n = 2$; the proof for the more general case follows in a similar way. We consider a local piece of the boundary
and set up coordinates $(\xi, \eta)$ so that this can be represented in the form
$$u(\xi, f(\xi)) = \int_s^{f(\xi)} \frac{\partial u}{\partial\eta}(\xi, \eta)\,d\eta + u(\xi, s),$$
where $f(\xi) - b \le s \le f(\xi)$. We use the elementary inequality $(\alpha + \beta)^2 \le 2\alpha^2 + 2\beta^2$ to obtain
(7.27)
: :; (1 1
«(,) 1d77r
2 (l 1W
(~~r d7]r
(f(~) - s? (l (fJ)2
a*
s
f(f.)
d7]
)2
< b 2(t(t;) (8U)2 dT!)2
lf(O-b 07]
(7.28)
$$ds = [1 + (f')^2]^{1/2}\,d\xi.$$
Furthermore, $f'$ is bounded so that $1 \le [1 + (f')^2]^{1/2} \le C$, where $C$ is
a constant independent of $\xi$. Substitution in the left-hand side of (7.28)
yields
j-a
a u(~, f(O?
b[l + (f')2]1/2
> r 2
d~ - Cl }f'S u(~, f(~)) ds,
for some constant Cl, where r 5 is the portion of the boundary correspond-
ing to the interval ~ E [-a,a]. The right-hand side is easily estimated, and
(7.28) becomes
$$\lim_{k\to\infty} \|u_k - u\|_{H^1} = 0,$$
7.4 Boundary values of functions and trace theorems 241
and so, using (7.26), one sees that $\{\gamma_k\}$ is a Cauchy sequence in $L^2(\Gamma)$ and
therefore converges to $v$, say, in $L^2(\Gamma)$. Define $\gamma(u) = v$; then
$$\le C\lim_{k\to\infty} \|u_k\|_{H^1(\Omega)} = C\|u\|_{H^1(\Omega)}.$$
Part (ii) of the theorem implies that, although the range of $\gamma$ is not all
of $L^2(\Gamma)$, any member of $L^2(\Gamma)$ can be approximated arbitrarily closely by
a function lying in the range of $\gamma$. □
The trace theorem enables us to define unambiguously $\gamma(u)$ or $u|_\Gamma$, provided that $u$ is smooth enough to be in $H^1(\Omega)$. Now suppose that $u$ is even
smoother, so that $u$ belongs to $H^2(\Omega)$. Then $u$ is a member of $H^1(\Omega)$ and
so in fact is $D^\alpha u$ for $|\alpha| = 1$:
$$u, \frac{\partial u}{\partial x_1}, \ldots, \frac{\partial u}{\partial x_n} \in H^1(\Omega).$$
This means that the boundary values of the first derivatives of $u$ can also
be defined unambiguously, using the trace theorem.
The argument can be generalized to the space $H^m(\Omega)$; indeed, when
$m > 1$ then for any $u \in H^m(\Omega)$ we have $D^\alpha u \in H^1(\Omega)$ for $|\alpha| \le m - 1$. By
the trace theorem the value of $D^\alpha u$ on the boundary is well-defined and
belongs to $L^2(\Gamma)$:
We introduce the notation $\gamma_\alpha$ to denote the operator that, when applied
to a member $u$ of $H^m(\Omega)$, gives the trace or boundary value of $D^\alpha u$ for
$|\alpha| \le m - 1$:
(7.29)
$u = u_0$ on $\Gamma$,
$$\int_\Omega u\frac{\partial v}{\partial x_i}\,dx = \int_\Gamma uv\nu_i\,ds - \int_\Omega \frac{\partial u}{\partial x_i}v\,dx$$ (7.30)
holds for $i = 1, 2, \ldots, n$. From this identity we can deduce higher-order
identities; for example, if $u$ is replaced by $\partial u/\partial x_i$ (assuming now that
$u \in H^2(\Omega)$) and the resulting equation is summed over $i$ from 1 to $n$, then
we find that
(7.31)
for $n = 1$, and
(7.32)
for $n \ge 2$;
(ii) for any $u \in H^2(\Omega)$ there exists a constant $C_3$ such that
$$\|u\|_{H^2}^2 \le C_3\left(\sum_{|\alpha| = 2}\int_\Omega |D^\alpha u|^2\,dx + \int_\Gamma u^2\,ds\right).$$ (7.33)
PROOF. The proof of (a) is similar to that of Theorem 1 (iii). Part (b) is
obvious. To prove (c), we use the continuity of the trace operator: let $\{u_k\}$
be a Cauchy sequence in $C_0^\infty(\Omega)$ with limit $u$ in $H_0^m(\Omega)$. Then from the
definition of $\gamma_\alpha$ we have
Example
17. The function u defined by
u(x) = { ~'~, + ~, 2x -
(2 - x?,
is a member of $H^2(0, 2)$, as Figure 7.9 shows. Also, $u$ and $du/dx$ are
equal to zero on the boundary $x = 0$, $x = 2$. Hence $u \in H_0^2(0, 2)$.
FIGURE 7.10. The construction used in the derivation of (7.34)
(7.34)
PROOF. The inequality is first established for the case $u \in C_0^\infty(\Omega)$, after
which the density of this space in $H_0^1(\Omega)$ may be used to obtain (7.34). We
focus on the situation in which $n = 2$, for convenience. Let $G = [a, b] \times [c, d]$
be a rectangle that includes $\Omega$ as a proper subset, as in Figure 7.10, and
7.5 The spaces Hf)'cn) and H-mcn) 245
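note that
$$u(x, y) = \int_c^y \frac{\partial u}{\partial t}(x, t)\,dt \quad \text{for all } (x, y) \in G$$
$$u^2(x, y) = \left(\int_c^y 1\cdot\frac{\partial u}{\partial t}(x, t)\,dt\right)^2 \le \int_c^y dt\int_c^y \left(\frac{\partial u}{\partial t}(x, t)\right)^2 dt \le (d - c)\int_c^d \left(\frac{\partial u}{\partial t}(x, t)\right)^2 dt.$$
The inequality (7.34) may now be obtained by repeating the argument, this
time integrating in the $x$-direction, and then adding.
The extension to functions in $H_0^1(\Omega)$ is left as an exercise (Exercise
7.16). □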
At this stage it is convenient to introduce a family of seminorms on
$H^m(\Omega)$. A seminorm $|\cdot|$ satisfies all the norm axioms except that of positive-definiteness (Axiom N2 in Section 3.3), in that $|u| \ge 0$, but $|u| = 0$ does
not imply that $u = 0$. The quantity $|\cdot|_m$ defined on $H^m(\Omega)$ by
$$|u|_m^2 = \sum_{|\alpha| = m}\int_\Omega |D^\alpha u|^2\,dx$$ (7.35)
This result is treated in Exercise 7.17; note in particular that (7.34) can be
expressed in the form
The Space $H^{-m}(\Omega)$. In Section 5.4 we discovered that the space $L^2(\Omega)$ is
self-dual. The question now arises as to how we can characterize $[H^m(\Omega)]'$,
the space of bounded linear functionals on $H^m(\Omega)$. Now we would hope
to find out about $[H^m(\Omega)]'$ by considering functionals $\ell$ on $\mathcal{D}(\Omega)$ (that is,
distributions), and by looking at the limits of $\langle \ell, \phi_k \rangle$ as $k \to \infty$, where
$\{\phi_k\}$ is a Cauchy sequence in $\mathcal{D}(\Omega)$. There is a complication here, however,
in that $\mathcal{D}(\Omega)$ is not dense in $H^m(\Omega)$, so that not every $u \in H^m(\Omega)$ is
the limit of a Cauchy sequence $\{\phi_k\}$ in $\mathcal{D}(\Omega)$. This dilemma is resolved
by restricting attention instead to the dual of $H_0^m(\Omega)$; $\mathcal{D}(\Omega)$ is dense in
$H_0^m(\Omega)$, by Theorem 8(a), and this property is used to define $H_0^m(\Omega)'$ in the
following theorem. Before stating the theorem we introduce the convention
whereby the dual of $H_0^m(\Omega)$ is denoted by $H^{-m}(\Omega)$:
(7.37)
PROOF. Let $f$ be any function in $L^2(\Omega)$ ($= [L^2(\Omega)]'$); then, for any $\phi \in \mathcal{D}(\Omega)$
$$\left|\int_\Omega f(D^\alpha\phi)\,dx\right|$$
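Now, for any $v \in H_0^m(\Omega)$, let $\{\phi_k\} \subset \mathcal{D}(\Omega)$ with $\lim_{k\to\infty}\phi_k = v$. Then
$$(u, \phi_k)_{H^m} = \int_\Omega \sum_{|\alpha| \le m} (D^\alpha u)(D^\alpha\phi_k)\,dx = \sum_{|\alpha| \le m} (-1)^{|\alpha|}\langle D^\alpha(D^\alpha u), \phi_k \rangle.$$
Hence, as $k \to \infty$ we have
$$q = \sum_{|\alpha| \le m} (-1)^{|\alpha|} D^\alpha(D^\alpha u)$$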
Example
18. Theorem 11 gives a useful way of characterizing the negative Sobolev
spaces $H^{-m}(\Omega)$; indeed, (7.37) indicates that if we differentiate a
member of $L^2(\Omega)$ up to $m$ times, we get a functional $q$ on $H_0^m$. For
example, take
$$H(x) = \begin{cases} 0, & -1 < x < 0, \\ 1, & 0 < x < 1, \end{cases}$$
7.7 Exercises
Distributions
$$f(x) = \sum_{|\alpha| = 0}^{\infty} \frac{x^\alpha}{\alpha!} D^\alpha f(0).$$
[Expand the right-hand side for the case $n = 2$, by working out the
first few terms.]
7.2. Show that the Dirac delta $\delta$ is not generated by a locally integrable
function $\delta(x)$, as follows. Let $\phi_a(x)$ be the test function defined by
exp [x a:.a
2 2 ], lxi< a,
1>a(x) = {
0, b> lxi ~ a,
for $b > a > 0$. Assume that a function $\delta(x)$ exists, and show that
$$\int_\Omega (D^\alpha u)v\,dx = (-1)^{|\alpha|}\int_\Omega u(D^\alpha v)\,dx + \int_\Gamma h(u, v)\,ds$$
$$f(\mathbf{x}) = \begin{cases} xy, & \text{if } xy \ge 0, \\ 0, & \text{if } xy < 0. \end{cases}$$
+1,
=Ei' f jaxay ~ g(x) ~ {
x ~ 0, y ~ 0,
Show ,hat n",l' f -1, x ~ 0, y ~ 0,
0, otherwise.
0, 0< x< 1,
(a) u'(x) = { x-I, 1 :S x< 2,
x3 - x2 - 3, 2 :S x < 3;
xy,
x(2 - y),
°°:S:S x :S:S °:S<
x
1, y < 1,
1, 1 y :S 2,
{
(b) u(x, y) = X O:Sx<~, y=l,
371", x =~, y = 1,
x, ~ < x :S 1, Y = 1.
7.11. Show that the functions
$$u(x) = \begin{cases} x, & 0 \le x \le 1, \\ 2 - x, & 1 \le x \le 2, \end{cases} \quad \text{and} \quad v(x) = \sin\pi x,$$
$$u(\mathbf{x}) = \begin{cases} x^2y^2, & x > 0,\ y > 0, \\ 0, & \text{otherwise}, \end{cases}$$
$$\|f\|_{H^{-m}} = \sup \frac{|\langle f, v \rangle|}{\|v\|_{H^m}}, \quad v \in H_0^m(\Omega).$$
7.20. In the space $H^1(\Omega)$ show that the orthogonal complement of $H_0^1(\Omega)$
is the subspace of functions $u \in H^1(\Omega)$ for which $\nabla^2 u = u$ (distributionally). Find a basis for $H_0^1(\Omega)^\perp$ for the case $\Omega = (0, 1) \subset \mathbb{R}$.
7.21. Show that $u(x) = \ln x$ is a member of $L^2(0, 1)$, and hence that $v(x) = 1/x$ belongs to $H^{-1}(0, 1)$.
Part II
In this chapter we return to the topic of the Introduction, and set about
the process of developing a mathematically coherent framework for bound-
ary value problems. Section 8.1 sets the stage by introducing a range of
problems involving differential equations; we saw some examples in the
Introduction, and here the opportunity is taken to introduce a few more.
In the remaining four sections we build up towards a general theory for
the existence, uniqueness, and regularity of solutions to elliptic boundary
value problems. The problem is posed as one involving an elliptic operator
from one Sobolev space to another. To the uninitiated, the ideas discussed
here may seem esoteric at times; rather than discuss techniques for solving
boundary value problems, the results obtained are of a qualitative nature.
This is precisely the program of investigation that was proposed in the
Introduction, and the intention is that the motivating ideas of that chapter
together with the theory developed here, convey the relevance of these
qualitative results to a proper understanding of the problem.
Examples
is given by
$$\frac{du}{dt} = [b(u) - d(u)]u.$$ (8.1)
This is a first-order nonlinear ODE, since the operator $Au \equiv du/dt - [b(u) - d(u)]u$ is nonlinear.
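To see how the nonlinearity in (8.1) shapes the solution, the equation can be integrated numerically once $b$ and $d$ are chosen. The rates below (a constant birth rate and a death rate growing linearly with $u$) are hypothetical and are not taken from the text; they turn (8.1) into the logistic equation. The fourth-order Runge-Kutta stepper `rk4` is likewise only one possible choice of integrator.

```python
import math

def birth_rate(u):
    return 0.5                       # hypothetical constant birth rate b(u)

def death_rate(u):
    return 0.1 + 0.004 * u           # hypothetical crowding-dependent death rate d(u)

def rhs(u):
    """Right-hand side of (8.1): [b(u) - d(u)] u."""
    return (birth_rate(u) - death_rate(u)) * u

def rk4(u0, t_end, steps):
    """Classical fourth-order Runge-Kutta integration of du/dt = rhs(u)."""
    h = t_end / steps
    u = u0
    history = [u]
    for _ in range(steps):
        k1 = rhs(u)
        k2 = rhs(u + 0.5 * h * k1)
        k3 = rhs(u + 0.5 * h * k2)
        k4 = rhs(u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        history.append(u)
    return history

history = rk4(u0=5.0, t_end=60.0, steps=600)
equilibrium = (0.5 - 0.1) / 0.004    # population at which b(u) = d(u)
print(history[-1], equilibrium)
```

With these assumed rates the population settles at the value where births balance deaths, $u = 100$. Doubling the initial population does not double the solution, which is the practical meaning of the operator being nonlinear.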
$$\frac{\partial u}{\partial t} - \frac{1}{c\rho}\operatorname{div}(K\nabla u) = Q$$ (8.2)
in which
is the Laplacian operator in $\mathbb{R}^3$. Recall from the Introduction that this
equation also arises, on a domain in two dimensions, in the problem
of the deflection of an elastic membrane.
$$-\frac{1}{c\rho}\frac{d}{dx}\left(K\frac{du}{dx}\right) = Q.$$ (8.3)
Note that the left-hand side of (8.3) has the form of a Sturm-Liouville
operator (recall Section 6.5).
Shape at time t
$\mathbf{t} = \boldsymbol{\sigma}\boldsymbol{\nu}$. (8.4)
(8.5)
$$\rho\frac{\partial^2 u_i}{\partial t^2} - \sum_{j=1}^{3}\frac{\partial\sigma_{ij}}{\partial x_j} = Q_i \quad (i = 1, 2, 3).$$ (8.6)
$$\varepsilon_{ij}(u) = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right).$$ (8.7)
The constitutive law for linear elastic materials then states that the
stress depends linearly on the strain at every point of the body; that
is,
(8.8)
(8.9)
It is of course possible to express the stress directly as a function of
the displacement, by writing
$\sigma = Ou$,
260 8. Elliptic boundary value problems
$Ou = \lambda[\operatorname{tr}\varepsilon(u)]I + 2\mu\,\varepsilon(u).$
(8.10)
(8.11)
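The chain from displacement to strain (8.7) to stress can be traced for a concrete displacement gradient. In the sketch below the matrix `grad_u` and the numerical values of the Lamé moduli are illustrative assumptions, not data from the text, and `strain` and `stress` are hypothetical helper names.

```python
def strain(grad_u):
    """Small-strain tensor eps_ij = (du_i/dx_j + du_j/dx_i) / 2."""
    return [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(3)]
            for i in range(3)]

def stress(grad_u, lam, mu):
    """Isotropic law: sigma = lam * tr(eps) * I + 2 * mu * eps."""
    eps = strain(grad_u)
    trace = sum(eps[i][i] for i in range(3))
    return [[lam * trace * (1.0 if i == j else 0.0) + 2.0 * mu * eps[i][j]
             for j in range(3)] for i in range(3)]

# Hypothetical displacement gradient: a small stretch along x plus a shear
grad_u = [[0.001, 0.002, 0.0],
          [0.0,   0.0,   0.0],
          [0.0,   0.0,   0.0]]
lam, mu = 1.0e5, 8.0e4               # illustrative Lamé moduli
eps = strain(grad_u)
sigma = stress(grad_u, lam, mu)
for row in sigma:
    print(row)
```

Only the symmetric part of the displacement gradient enters the strain, so the resulting stress is symmetric even though `grad_u` is not.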
6. Deflection of a plate. The next example also comes from linear elas-
ticity, and concerns the special case in which the body is a thin
plate. That is, one of its dimensions, in the z direction, say, is very
much smaller than the other two, and the body occupies the region
$\Omega \times (-h/2, h/2)$, where $\Omega$ is a domain in $\mathbb{R}^2$, so that geometrically
the plate is flat (Figure 8.2). It is assumed that external forces act
only in the $z$ direction. This set of circumstances allows various assumptions to be made about the deformation of the plate. First, the
midsurface $\Omega$ is assumed to undergo a displacement with components $u_1(x, y, 0) = u_2(x, y, 0) = 0$ and $u_3(x, y, 0) \equiv w(x, y)$. Second,
we invoke a key geometrical assumption known as the Kirchhoff-Love
hypothesis: this states that sections of the plate that are straight, and
normal to the midplane $\Omega$, remain straight and normal after deformation. The Kirchhoff-Love hypothesis has an immediate consequence,
8.1 Differential equations, boundary conditions, and initial conditions 261
to which is added
$$u_3(x, y, z) = w(x, y).$$ (8.13)
The governing equation for an elastic plate is obtained by imposing
these assumptions on the elasticity equations. First, we adopt the
convention that Greek suffixes range over 1 and 2. Next, we define
the components $S_\alpha$ and $M_{\alpha\beta}$ of the shear force vector $S$ and bending
moment matrix $M$ by
$$S_\alpha = \int_{-h/2}^{h/2} \sigma_{3\alpha}\,dz \quad \text{and} \quad M_{\alpha\beta} = \int_{-h/2}^{h/2} z\sigma_{\alpha\beta}\,dz.$$
These are quantities that are averaged over the thickness of the plate;
their interpretations are illustrated in Figure 8.2.
The shear force is eventually eliminated, but a constitutive equation
is required for $M$. This may be derived from the generalized Hooke's
law, which together with (8.12) becomes
Here $\nabla^2$ is the two-dimensional Laplace operator, $I_{\alpha\beta}$ are the components of the $2 \times 2$ identity matrix, and $D$ is called the bending
stiffness; it depends on the material and the geometry, and is defined
by $D = Eh^3/12(1 - \nu^2)$, in which $E$ and $\nu$ are material constants
known, respectively, as Young's modulus and Poisson's ratio (nothing to do with the Poisson equation!). These two constants may be
expressed in terms of the Lamé moduli if desired.
(8.16)
$$M' - S = 0,$$
(8.17)
$$S' + f = 0.$$
The constitutive equation for the bending moment comes from the assumption that Poisson's ratio $\nu$ is very nearly zero; thus from (8.14),
for example, $\sigma_{11} = E\varepsilon_{11}$ and so, after substitution for $\varepsilon_{11}$, multiplication by $z$ and integration with respect to $y$ and $z$, we find that
$$M = -EIw'',$$ (8.18)
$$S = -EIw'''.$$ (8.19)
$$EI\frac{d^4w}{dx^4} = f$$ (8.20)
for the deflection of a beam.
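A quick consistency check of the beam equation $EI\,w'''' = f$ is sketched below for a uniform load. The closed-form deflection used here is the classical one for a simply supported beam; those end conditions, the numerical values, and the helper names `deflection` and `fourth_difference` are assumptions made for the illustration and are not specified in the passage above.

```python
def deflection(x, load, stiffness, length):
    """Deflection of a uniformly loaded, simply supported beam (assumed end conditions)."""
    return load * x * (length ** 3 - 2.0 * length * x ** 2 + x ** 3) / (24.0 * stiffness)

def fourth_difference(w, x, h):
    """Central finite-difference approximation of the fourth derivative of w at x."""
    return (w(x - 2 * h) - 4 * w(x - h) + 6 * w(x)
            - 4 * w(x + h) + w(x + 2 * h)) / h ** 4

load, stiffness, length = 2.0, 5.0, 1.0      # f, EI, and span: illustrative values

def w(x):
    return deflection(x, load, stiffness, length)

# Residual of EI * w'''' - f at an interior point; the stencil is exact for quartics
residual = stiffness * fourth_difference(w, 0.4, 0.05) - load
end_values = (w(0.0), w(length))
print(residual, end_values)
```

Because the assumed deflection is a quartic polynomial, the five-point stencil reproduces its fourth derivative up to rounding, so the residual is essentially zero, and the deflection vanishes at both supports.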
the effect that we require $u(t)$ for $t$ lying in the range $0 < t \le T$ or $(0, T]$,
where $t = 0$ represents some datum and $T$ is the longest time of interest.
If $t = 0$ is taken to be the present, and a solution is required for all time
in the future, then the range of $t$ is $(0, \infty)$. Similarly, if for example, the
problem has to do with heat conduction in a slab occupying the region
$(0, 1) \times (0, 1) \times (0, 1)$, and if we require a solution for all time, then (8.2)
has to be supplemented by the statements
Examples
IC: $u(0) = u_0$
9. Heat conduction. Suppose for example that the domain $\Omega$ is the cylindrical region $r < a$ and $0 < z < L$, where $r^2 = x^2 + y^2$. Suppose
further that the ends $z = 0$ and $z = L$ are insulated and the temperature is a prescribed constant on the curved part of the boundary
(Figure 8.4). In this case it is more convenient to use cylindrical coordinates $(r, \theta, z)$; then if the initial temperature is known, and is
given by the function $f(r, \theta, z)$, the initial boundary value problem
corresponding to heat conduction is summarized as in Box 2.
BCs: $\dfrac{\partial u}{\partial z}(r, \theta, 0, t) = \dfrac{\partial u}{\partial z}(r, \theta, L, t) = 0$
$u(a, \theta, z, t) = c$
ambient temperature ua
ODE: $-\dfrac{1}{c\rho}\dfrac{d}{dx}\left(K\dfrac{du}{dx}\right) = Q$
11. Elasticity. Suppose that the elastic body under consideration is the
bar shown in Figure 8.6; this bar is fixed at the end x = 0, it is
subjected to a time-independent (vectorial) force per unit area f(y, z)
at the end x = L, and on the remainder of its surface there are no
forces acting. To specify the force boundary conditions we make use
of (8.4) and (8.8), with the appropriate choice of v. In this way we
arrive at the boundary value problem in Box 4.
2d I
f
z
PDE:
BCs: u(O,y,z) =0
(<>u)(l,y,z)e x = J(y,z)
(<>u)(x, y, z)e y = 0 for y = ±d
12. Elastic plate. The fourth-order plate problem requires two boundary
conditions at each point on the boundary, as we show in the theory
that follows. These are of two kinds: those in which the displacement
or its first derivatives are prescribed, and those in which the shear
force or bending moment along the boundary are prescribed. We take
a concrete example to show what form some of these boundary con-
ditions can take.
Consider then the rectangular plate shown in Figure 8.7. It is con-
strained against motion along the ends x = ±h, whereas the other two
ends y = ±l rest on supports that permit rotation, but not vertical
displacement. The boundary conditions along $x = \pm h$ therefore stipulate that the displacement and slope are both zero; in other words,
$w = 0$ and $\partial w/\partial x = 0$.
In order to write down the boundary conditions at the other two
ends we must first be clear about what it is that they stipulate. One
of the conditions is straightforward: w = 0 there. But the condition
that these ends are free to rotate is equivalent to stating that the
plate experiences no restraining moment or couple there. Referring
to Figure 8.2, we see that it is the moment M 1l that is required to
be zero. From (8.14) this is
The term $(-1)^{|\alpha|}$ is not essential, but is included here for future convenience.
The operator $A$ is assumed to occur in a PDE (or system of PDEs) of
the form
$$Au = f,$$
where $f$ lies in the range of $A$. For now we restrict attention to scalar-valued
functions $u$, and make the extension to vector-valued functions (that occur
in elasticity, for example) later.
The classification of $A$ depends only on the coefficients of the highest-order derivatives, that is, the derivatives of order $2m$, and the terms involving these derivatives are said to constitute the principal part of $A$, denoted
by $A_0$, and which for the operator (8.21) is given by
$$A_0 u \equiv \sum_{|\alpha|, |\beta| = m} a_{\alpha\beta} D^{\alpha+\beta} u.$$
Then
(i) $A$ is elliptic at $x_0 \in \Omega$ if
$$\sum_{|\alpha|, |\beta| = m} a_{\alpha\beta}(x_0)\,\xi^{\alpha+\beta} \ne 0 \quad \text{for all } \xi \ne 0;$$ (8.22)
For the case in which $A$ is a second-order operator (that is, $m = 1$), the
notation can be simplified. Indeed, suppose that the problem is posed in
$\mathbb{R}^n$; then (8.21) takes the form
$$Au = -\sum_{i,j=1}^{n}\frac{\partial}{\partial x_i}\left(a_{ij}(x)\frac{\partial u}{\partial x_j}\right) + \sum_{j=1}^{n} a_j\frac{\partial u}{\partial x_j} + a_0 u = f \quad \text{in } \Omega$$ (8.24)
for suitable coefficients $a_{ij}$, $a_j$, and $a_0$, and the condition of ellipticity is
examined by considering, instead of (8.22) and (8.23), the conditions
$$\sum_{i,j=1}^{n} a_{ij}(x_0)\,\xi_i\xi_j \ne 0 \quad \text{for all } \xi \ne 0,$$ (8.25)
Examples
13. Consider the operator that appears in the steady, nonhomogeneous
heat equation (that is, the steady version of Example 9), and assume
that the problem is plane, so that $n = 2$. The operator $A$ is thus
(ignoring the coefficient $1/(c\rho)$) given by
$$Au = -\operatorname{div}(K\nabla u) = -\frac{\partial}{\partial x}\left(K\frac{\partial u}{\partial x}\right) - \frac{\partial}{\partial y}\left(K\frac{\partial u}{\partial y}\right)$$
so that, in the notation of (8.24), $a_{11} = a_{22} = K$ and $a_{12} = a_{21} = 0$.
The principal part of this operator is
$$A = (1 - x)\frac{\partial^2}{\partial x^2} + 3\frac{\partial^2}{\partial y^2} - y\frac{\partial}{\partial x}$$
is elliptic only in the half plane $x < 1$; to see this, we evaluate
this expression is nonzero for all nonzero vectors $\boldsymbol{\xi} = (\xi, \eta)$ provided
that $x < 1$. However, for any point $(x_0, y_0)$ in the half plane $x \ge 1$
this expression is zero for all vectors of the form $\boldsymbol{\xi} = (\sqrt{3}, \sqrt{x_0 - 1})$.
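The ellipticity test of this example can be tried out numerically. The sketch below assumes that the quadratic form being evaluated is $(1 - x)\xi^2 + 3\eta^2$, read off from the second-order coefficients of the operator (the overall sign convention does not affect whether the form vanishes); `principal_symbol` and `is_elliptic_at` are hypothetical names, and sampling unit vectors is a heuristic check rather than a proof.

```python
import math

def principal_symbol(x, xi, eta):
    """Assumed quadratic form a_11 xi^2 + a_22 eta^2 with a_11 = 1 - x, a_22 = 3."""
    return (1.0 - x) * xi ** 2 + 3.0 * eta ** 2

def is_elliptic_at(x, samples=360):
    """Sample unit vectors (cos t, sin t) and test whether the form avoids zero.

    The form is homogeneous of degree two, so unit vectors suffice; values of
    both signs (or an exact zero) indicate that the form vanishes somewhere.
    """
    values = [principal_symbol(x,
                               math.cos(2.0 * math.pi * k / samples),
                               math.sin(2.0 * math.pi * k / samples))
              for k in range(samples)]
    return min(values) > 0.0 or max(values) < 0.0

x0 = 4.0
# The vector (sqrt(3), sqrt(x0 - 1)) from the example annihilates the form at x0
zero_value = principal_symbol(x0, math.sqrt(3.0), math.sqrt(x0 - 1.0))
report = {x: is_elliptic_at(x) for x in (-2.0, 0.0, 0.9, 1.0, 4.0)}
print(report, zero_value)
```

Points with $x < 1$ pass the sampled test, while $x = 1$ and $x = 4$ fail it, in line with the statement that the operator is elliptic only in the half plane $x < 1$.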
$$u = g,$$ (8.27)
$$\nabla u\cdot s \equiv du/ds = h,$$ (8.28)
where $du/ds$ is the tangential derivative, $s$ being the unit tangent vector to the curve defining the boundary. The condition (8.27) implies that
$du/ds = dg/ds$, which contradicts (8.28), unless $dg/ds = h$ (Figure 8.8).
Hence these two equations are inadmissible as boundary conditions when
specified together. In order to avoid situations such as these, we restrict
the manner in which boundary conditions might be specified. First, recall
that we restrict attention to boundary value problems involving differential
8.3 Normal boundary conditions 273
equations of even order 2m (m = 1, 2, ...), say, and the boundary is assumed
to be smooth (that is, of class $C^\infty$). Then the following restrictions
are imposed on the boundary conditions.
(i) A total of m conditions must be specified at each point of the bound-
ary. These are written in the form
\[ B_0 u = g_0, \quad B_1 u = g_1, \quad \ldots, \quad B_{m-1} u = g_{m-1}, \tag{8.29} \]
where $g_0, g_1, \ldots, g_{m-1}$ are given functions and $B_0, B_1, \ldots, B_{m-1}$ are
a set of linear differential operators called boundary operators. (The
boundary conditions are numbered 0, 1, 2, ... rather than 1, 2, ... for
reasons of convenience, as becomes apparent.) The jth boundary operator
is of the form
\[ B_j u = \sum_{|\alpha| \leq q_j} b_\alpha^{(j)} D^\alpha u, \]
that is, it is a linear operator of order $q_j$. The coefficients $b_\alpha^{(j)}$ are
given functions of x for $x \in \Gamma$. We assume that $b_\alpha^{(j)}$ and $g_j$ are
smooth functions;
(ii) the order of the highest derivative appearing in each boundary condition
must be less than the order of the PDE: in other words, $q_j \leq 2m - 1$
for j = 0, ..., m - 1;
(iii) $q_i \neq q_j$ for $i \neq j$; that is, no two boundary conditions should have
differential operators of the same order;
Requirements (i) through (iii) are self-explanatory but the fourth re-
quirement needs some explanation, which is best done by means of a simple
example. Suppose that we have a second-order problem with the boundary
condition
\[ \nabla u \cdot a = h \]
\[ b \cdot \nu \neq 0. \]
Thus (8.30) or (8.32) requires that the vector a should not be orthogonal to
$\nu$; this condition ensures that we do not have a situation such as that which
occurred with the pair of boundary conditions (8.27) and (8.28) discussed
earlier. There, a = s and the two conditions are contradictory.
When Conditions (i) to (iv) are satisfied, the set $\{B_0, B_1, \ldots, B_{m-1}\}$ is
said to be a set of normal boundary conditions. An important special case of
a set of normal boundary conditions arises when the order $q_j$ of the highest
derivative in the jth boundary condition is equal to j, for j = 0, 1, ..., m - 1;
such a set of boundary conditions is called a Dirichlet system of order m.
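The requirements that involve only the orders $q_j$ of the boundary operators can be checked mechanically. The following Python sketch is an illustration only: the function names are hypothetical, and the fourth requirement is not tested because it depends on the coefficients of the operators and not merely on their orders.

```python
def satisfies_order_conditions(orders, m):
    """Check the order-based requirements for boundary operators on a PDE of order 2m.

    orders[j] is q_j, the order of the jth boundary operator B_j.
    Tested: exactly m operators (i), q_j <= 2m - 1 (ii), distinct orders (iii).
    Requirement (iv) needs the coefficients themselves and is not covered here.
    """
    if len(orders) != m:
        return False
    if any(q < 0 or q > 2 * m - 1 for q in orders):
        return False
    return len(set(orders)) == len(orders)


def is_dirichlet_system(orders):
    """A Dirichlet system of order m has q_j = j for j = 0, ..., m - 1."""
    return list(orders) == list(range(len(orders)))
```

For the clamped plate (m = 2) with orders [0, 1] both checks succeed, while the orders [1, 1] fail the distinctness requirement.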
Examples
16. As observed in Example 12, the PDE corresponding to the plate
problem requires two boundary conditions to be specified at each
point on the boundary. One possibility is to specify that the displacement
and the slope are both zero along $\Gamma$; in other words, the
plate is clamped along its edge. In this case the boundary conditions
are
\[ B_0 u \equiv u = 0, \]
\[ B_1 u \equiv \nabla u \cdot \nu = \partial u/\partial \nu = 0, \]
which is a Dirichlet system of order 2 since $q_0 = 0$ and $q_1 = 1$. The
system
\[ \partial u/\partial x = g_0, \]
\[ \partial u/\partial y = g_1, \]
then we have
in which $b_{kl}$ and $c_{kl}$ are most generally sets of functions. Note that these
two matrices do not contain derivative operators, and that t is a function
of the displacement through (8.4) and (8.8).
As in the case of scalar problems, the functions $b_{kl}$ and $c_{kl}$ cannot be specified
arbitrarily. For example, it is necessary that conditions be specified for
the normal and each of the tangential n components. Such a requirement is
met by specifying that the functions appearing in the boundary conditions
(8.33) satisfy the condition:
for boundary condition k, the coefficients $b_{kk}$ and $c_{kk}$ are not both zero.    (8.34)
Example
\[ u \cdot \nu = 0, \qquad t \cdot s = 0 \]
\[ u \cdot \nu = 0, \qquad t \cdot \nu = 0, \]
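\[ \sum_{|\alpha|,|\beta| = m} a_{\alpha\beta}(x) \left[ \xi - i\nu \frac{d}{ds} \right]^{\alpha+\beta} u(s) = 0, \quad s > 0, \tag{8.35} \]
\[ \sum_{|\alpha| = q_j} b_\alpha^{(j)}(x) \left[ \xi - i\nu \frac{d}{ds} \right]^{\alpha} u(s) \Big|_{s=0} = 0, \quad j = 0, \ldots, m-1, \tag{8.36} \]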
that involve only the principal parts of A and of $B_j$ (recall that $a^\alpha =
a_1^{\alpha_1} a_2^{\alpha_2} \cdots a_n^{\alpha_n}$ for any vector a in $\mathbb{R}^n$). The set $\{B_0, B_1, \ldots, B_{m-1}\}$ of
boundary operators is compatible with A, and is said to cover A at x,
if the only solution of (8.35), (8.36) is u(s) = 0. We require that $\{B_j\}$ cover
A at every point x in $\Gamma$.
Precisely why a requirement such as the covering condition should ensure
compatibility between B j and A is not an obvious matter; the details are
lengthy, and may be pursued in the references given at the end of this
chapter.
Example
20. Consider the Poisson equation
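\[ -\nabla^2 u = f \quad \text{in } \Omega \subset \mathbb{R}^2; \]
the most general normal boundary condition is of the form
\[ B_0 u = a \frac{\partial u}{\partial x} + b \frac{\partial u}{\partial y} + cu = g \quad \text{on } \Gamma, \tag{8.37} \]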
8.4 Green's formulas and adjoint problems 279
\[ -\sigma^2 u + \frac{d^2 u}{ds^2} = 0 \tag{8.38} \]
whereas (8.36) gives
\[ (a + ib)\sigma c_2 = 0, \]
so that $c_2 = 0$ and hence u(s) = 0 provided that $a \neq 0$ or $b \neq 0$, so
that (8.37) covers A at x for any values of a, b, and c. In order to
investigate the covering condition at other points on the boundary,
we simply introduce new axes x, y so that $\nu = (0, 1)$ relative to these
axes, at the point under consideration.
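The two steps of this example can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the decaying solution of (8.38) to be $u(s) = c_2 e^{-\sigma s}$ with $\sigma > 0$, and it assumes that the boundary expression obtained from (8.36) has the form $a\sigma u(0) - i b\, u'(0)$, which reproduces the quantity $(a + ib)\sigma c_2$ quoted above. The function names are hypothetical.

```python
import math


def decaying_solution(s, sigma, c2):
    """Assumed bounded solution u(s) = c2 * exp(-sigma * s) of (8.38) for sigma > 0."""
    return c2 * math.exp(-sigma * s)


def ode_residual(s, sigma, c2, h=1e-4):
    """Residual of -sigma^2 u + u'' = 0, with u'' from a central difference."""
    u = decaying_solution(s, sigma, c2)
    second = (decaying_solution(s + h, sigma, c2) - 2.0 * u
              + decaying_solution(s - h, sigma, c2)) / (h * h)
    return -sigma ** 2 * u + second


def boundary_expression(a, b, sigma, c2):
    """Assumed form a*sigma*u(0) - i*b*u'(0), which equals (a + i b) * sigma * c2."""
    u0 = c2
    du0 = -sigma * c2
    return a * sigma * u0 - 1j * b * du0


def covers(a, b):
    """The boundary expression forces c2 = 0 unless a and b are both zero."""
    return a != 0 or b != 0
```

With real a and b, the factor a + ib is nonzero whenever a and b are not both zero, so the boundary expression can vanish only for $c_2 = 0$.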
= 9m-l
where A is a linear elliptic partial differential operator of order 2m, of the
form
the coefficients $a_{\alpha\beta}$ are functions of x, are smooth, and satisfy the condition
for ellipticity. The set $B_0, B_1, \ldots, B_{m-1}$ of boundary operators is of the
form
\[ Bu = \sum_{j=1}^{n} b_j \frac{\partial u}{\partial x_j} + cu = g \quad \text{on } \Gamma, \]
Green's formula and the formal adjoint operator. With the operator
A given by (8.41), we denote by A* the operator defined by
in which F(u, v) represents boundary terms that arise from the application
of the theorem. If A* = A, that is, $a_{\alpha\beta} = a_{\beta\alpha}$, the operator A is then said
to be formally self-adjoint.
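\[ = -\int_\Gamma v\, a_{ij} \frac{\partial u}{\partial x_j} \nu_i \, ds + \int_\Omega a_{ij} \frac{\partial u}{\partial x_j} \frac{\partial v}{\partial x_i} \, dx \]
\[ = -\int_\Gamma \left[ v\, a_{ij} \frac{\partial u}{\partial x_j} \nu_i - u\, a_{ij} \frac{\partial v}{\partial x_i} \nu_j \right] ds - \int_\Omega u \frac{\partial}{\partial x_j}\left( a_{ij} \frac{\partial v}{\partial x_i} \right) dx. \]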
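By summing over i and j we therefore find that (8.43) holds with
\[ A^* v = -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\left( a_{ji}(x) \frac{\partial v}{\partial x_j} \right) \tag{8.44} \]
and
\[ F(u, v) = -\sum_{i,j=1}^{n} a_{ij} \left( v \frac{\partial u}{\partial x_j} \nu_i - u \frac{\partial v}{\partial x_i} \nu_j \right). \tag{8.45} \]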
Examples
21. Consider the second-order ordinary differential operator
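\[ A = -\frac{d^2}{dx^2} + 1; \]
using integration by parts we have, for sufficiently smooth u and v
and for $\Omega = (0, 1)$,
\[ \int_0^1 \left( -v \frac{d^2 u}{dx^2} + vu \right) dx = -\left[ v \frac{du}{dx} \right]_0^1 + \int_0^1 \left( \frac{dv}{dx} \frac{du}{dx} + vu \right) dx \]
\[ = -\left[ v \frac{du}{dx} \right]_0^1 + \left[ \frac{dv}{dx} u \right]_0^1 + \int_0^1 \left( -\frac{d^2 v}{dx^2} + v \right) u \, dx, \]
so that
\[ \int_0^1 v\, Au \, dx = \underbrace{\left[ -v \frac{du}{dx} + \frac{dv}{dx} u \right]_0^1}_{F(u,v)} + \int_0^1 \underbrace{\left( -\frac{d^2 v}{dx^2} + v \right)}_{A^* v} u \, dx, \tag{8.46} \]
and since A* = A, A is formally self-adjoint.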
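The Green's formula for the operator $A = -d^2/dx^2 + 1$ on (0, 1) can be verified exactly for polynomial u and v. The following Python sketch is an illustration only: polynomials are stored as coefficient lists with the lowest degree first, rational arithmetic avoids rounding, and all function names are hypothetical.

```python
from fractions import Fraction


def poly(*coeffs):
    """Polynomial c0 + c1 x + c2 x^2 + ... stored as a list of Fractions."""
    return [Fraction(c) for c in coeffs]


def deriv(p):
    d = [k * c for k, c in enumerate(p)][1:]
    return d if d else [Fraction(0)]


def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def scale(p, s):
    return [s * c for c in p]


def mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def evaluate(p, x):
    return sum(c * Fraction(x) ** k for k, c in enumerate(p))


def integrate01(p):
    """Exact integral of the polynomial over (0, 1)."""
    return sum(c / (k + 1) for k, c in enumerate(p))


def A(p):
    """Apply A = -d^2/dx^2 + 1."""
    return add(scale(deriv(deriv(p)), -1), p)


def boundary_term(u, v):
    """[-v u' + v' u] evaluated between 0 and 1."""
    F = add(scale(mul(v, deriv(u)), -1), mul(deriv(v), u))
    return evaluate(F, 1) - evaluate(F, 0)


def green_gap(u, v):
    """Difference between the two sides of the Green's formula; zero if it holds."""
    lhs = integrate01(mul(v, A(u)))
    rhs = boundary_term(u, v) + integrate01(mul(A(v), u))
    return lhs - rhs
```

Calling `green_gap(poly(1, 2, 0, 3), poly(0, 1, -1, 0, 2))` returns an exact zero, in agreement with the formula and with A* = A.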
282 8. Elliptic boundary value problems
\[ Au = -\sum_{i,j=1}^{2} \frac{\partial}{\partial x_i}\left( a_{ij} \frac{\partial u}{\partial x_j} \right), \tag{8.47} \]
23. The analogue of (8.43) is readily derived for the elasticity problem.
We disregard dependence on time as before, and write the system of
PDEs (8.10) or (8.11) corresponding to the elasticity problem in the
form
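\[ Au = Q; \tag{8.48} \]
\[ \int_\Omega Au \cdot v \, dx = -\int_\Omega \operatorname{div}[C\varepsilon(u)] \cdot v \, dx \]
in which the scalar product of two matrices $\sigma$ and $\tau$ has been written
as $\sigma \cdot \tau = \sum_{i,j=1}^{n} \sigma_{ij} \tau_{ij}$. The details of the derivation of (8.49)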
are discussed in Exercise 8.16. Now another application of Green's
theorem, this time to the volume integral on the right-hand side of
(8.49), yields
\[ t(u) = [C\varepsilon(u)]\nu. \]
(8.53)
form a Dirichlet system of order 2m. Given these two sets of operators, it
is possible to write the Green's formula in the form
\[ \int_\Omega v\, Au \, dx = \int_\Omega u\, A^* v \, dx + \sum_{j=0}^{m-1} \int_\Gamma \left( S_j u\, B_j^* v - B_j u\, S_j^* v \right) ds, \tag{8.54} \]
Examples
24. In the Green's formula (8.46) we wish to express the boundary term
in the form
\[ S_0 u = \beta u \quad (p_0 = 0) \]
for some function $\beta$. Next, $S_0^*$ must be of order $2m - 1 - q_0 =
2 - 1 - 1 = 0$ and $B_0^*$ must be of order $2m - 1 - p_0 = 2 - 1 - 0 = 1$.
Thus $B_0^*$ and $S_0^*$ must be of the form
'Y V ,
Bov
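\[ u\,\partial v/\partial x: \quad \beta\sigma = \nu_x + \nu_y \]
\[ u\,\partial v/\partial y: \quad \beta\tau = \nu_x + \nu_y \]
\[ uv: \quad \beta\rho = 0 \]
\[ v\,\partial u/\partial x: \quad \gamma\nu_x = -\nu_x - \nu_y \]
\[ v\,\partial u/\partial y: \quad \gamma\nu_y = -\nu_x - \nu_y. \]
The last two of these equations give, after using the fact that $\nu_x^2 +
\nu_y^2 = 1$,
\[ \rho = 0, \quad \sigma = \tau = \nu_x + \nu_y. \]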
Hence the boundary integral can be written in the form
Böv
26. The analogue of (8.54) in the case of the elasticity problem may
be formulated by considering the specific form (8.52) taken by the
boundary integrand F(u,v). First we denote the left-hand side of
the boundary conditions (8.33) by $B_i$ (i = 1, ..., n), so that this set
of equations reads $B_i u = g_i$ (remember that t also depends on u).
Then (8.52) is expressible in the form
\[ F(u, v) = \sum_{i=1}^{n} \left( S_i u\, B_i^* v - B_i u\, S_i^* v \right), \tag{8.56} \]
in which the new operators $S_i$, $B_i^*$, $S_i^*$ are defined in exactly the same
way as in (8.54) and (8.55), with m = 1; thus for i = 1, ..., n, $\{B_i, S_i\}$
forms a Dirichlet system of order 2, $B_i^*$ is of order $1 - p_i$, where $p_i$ is
the order of $S_i$, $S_i^*$ is of order $1 - q_i$, where $q_i$ is the order of $B_i$, and
$\{B_i^*, S_i^*\}$ forms a Dirichlet system of order 2.
Suppose then that for a problem posed in $\mathbb{R}^2$ (n = 2) the pair of
boundary conditions is
\[ u \cdot \nu = 0, \]
\[ t \cdot s = 0. \]
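\[ A^* u = f^* \quad \text{in } \Omega \subset \mathbb{R}^n, \]
\[ B_0^* u = g_0^*, \quad B_1^* u = g_1^*, \quad \ldots, \quad B_{m-1}^* u = g_{m-1}^* \quad \text{on } \Gamma, \]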
Example
8 2u 8 2u 82 u
---2----
8x 2 8x8y 8 y2
1* in n,
8u OU OU
-+v--n-
OV x 8y Y 8x
90 on r.
\[ Au = f \quad \text{in } \Omega \subset \mathbb{R}^n, \]
\[ B_0 u = g_0, \quad B_1 u = g_1, \quad \ldots, \quad B_{m-1} u = g_{m-1} \quad \text{on } \Gamma, \tag{8.57} \]
where A and $B_j$ are given by (8.41) and (8.42). Our aim is to settle the
questions of
(a) existence: under what circumstances (8.57) has a solution u that belongs
to $H^s(\Omega)$, s being an integer greater than or equal to 2m;
(8.58)
imply that
from which it can be concluded that if $f_1$ and $f_2$ are close to each other in
the sense that $\|\Delta f\|$ is small, $\|\Delta f\| < \epsilon$, say, where $\epsilon$ is a small number,
then $\|\Delta u\| < C\epsilon$ so that $u_1$ and $u_2$ are correspondingly close.
The question of existence and uniqueness of a solution is best approached
by adopting the language of linear operator theory (Chapter 5). First, we
denote by $N(B_j)$ the null space of the boundary operator $B_j$; that is, if $B_j$
is regarded as an operator from $H^s(\Omega)$ to $L^2(\Gamma)$, then
Our first task is to determine the set of functions f in $H^{s-2m}(\Omega)$ for which
(8.60) admits a solution. That is, we must identify R(A), the range of A.
This enables us to solve the problem of the existence of a solution. We
find that R(A) is not all of $H^{s-2m}(\Omega)$; there are functions f in $H^{s-2m}(\Omega)$
that do not lie in R(A), and for which no solution exists. The situation
is shown diagrammatically in Figure 8.11. The second task is to ascertain
the conditions under which the solution is unique; in other words, we wish
to know the conditions under which A is one-to-one. For this purpose we
define the null space N(A) of A by
\[ N(A) = \{u \in D(A): Au = 0\} = \{u \in H^s(\Omega): Au = 0 \text{ in } \Omega,\ B_j u = 0 \text{ on } \Gamma\}. \]
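\[ A^* u = f \quad \text{in } \Omega \subset \mathbb{R}^n, \]
\[ B_0^* u = 0, \quad B_1^* u = 0, \quad \ldots, \quad B_{m-1}^* u = 0 \quad \text{on } \Gamma; \tag{8.61} \]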
we define
The null space N(A*) of A* and its orthogonal complement $N(A^*)^\perp$ are
then
\[ N(A^*) = \{w \in D(A^*): A^* w = 0\}, \]
\[ N(A^*)^\perp = \{v \in D(A^*): (v, w)_{L^2} = 0 \text{ for all } w \in N(A^*)\}. \]
Like N(A), the space N(A*) is finite-dimensional. Indeed, for most problems
of practical interest
\[ \dim N(A) = \dim N(A^*). \]
We are not particularly concerned with solutions to the adjoint problem,
but when discussing the existence of solutions to (8.60) it is necessary to
call on properties of the space $N(A^*)^\perp$. We now give a few examples.
Examples
28. Consider the problem
\[ Au = -u'' = f \quad \text{in } \Omega = (0, 1), \]
\[ B_0 u = (u(0), u(1)) = (0, 0). \]
Assume that $f \in L^2(0, 1)$, so that a solution $u \in H^2(0, 1)$ is sought.
Also,
\[ N(B_0) = \{u \in H^2(\Omega): u(0) = u(1) = 0\} = D(A). \]
Au
Bou
8.5 Existence, uniqueness, and regularity of solutions 291
Clearly
and
N(A).L
N(A) D(A)
(ii) (existence) there exists at least one solution if and only if $f \in N(A^*)^\perp$,
that is, if
\[ (f, v)_{L^2} = 0 \quad \text{for all } v \in N(A^*); \tag{8.63} \]
PROOF. (i) Take any $w \in N(A)$ and assume that there are two solutions
$u_1$, $u_2$ satisfying
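\[ {} + \sum_{j=0}^{m-1} \int_\Gamma \left( S_j u\, B_j^* v - B_j u\, S_j^* v \right) ds = (u, 0)_{L^2} + \sum_{j=0}^{m-1} \int_\Gamma \left( S_j u \cdot 0 - 0 \cdot S_j^* v \right) ds = 0. \]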
Hence $f \in N(A^*)^\perp$.
We sketch the proof of the converse and leave some of the details to
the exercises. The aim is to prove that if $f \in N(A^*)^\perp$, then $f \in R(A)$;
that is, $N(A^*)^\perp \subset R(A)$. First we note from (i) that, since A is one-to-one
from $N(A)^\perp$ onto R(A), it is possible to define the inverse operator $A^{-1}$:
$R(A) \to N(A)^\perp$. Second, it can be shown (see Exercise 8.22) that both A
and $A^{-1}$ are bounded operators, and furthermore that R(A) is closed. It
follows then from Chapter 4, Lemma 1 that $R(A)^{\perp\perp} = \overline{R(A)} = R(A)$.
Next, if $v \in R(A)^\perp$ and $u \in D(A)$, then
j=O
1 r
SjUBi V ds,
operator from $N(A)^\perp$ onto R(A); then (Exercise 8.23) there is a constant
C > 0 such that
Examples
\[ -k\nabla^2 u = f \quad \text{in } \Omega, \]
\[ u = 0 \quad \text{on } \Gamma. \]
In this case $A = A^* = -k\nabla^2$, so A is formally self-adjoint. Assume
that $f \in L^2(\Omega)$, and take s = 2 (m = 1 here). Thus
\[ -k\nabla^2 u = f \quad \text{in } \Omega, \]
\[ \partial u/\partial \nu = 0 \quad \text{on } \Gamma. \]
This would correspond physically to the problem of a membrane constrained
around its boundary in such a way that the slope there is
zero, or in the case of heat conduction, to a medium that is perfectly
insulated along its boundary.
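In this case N(A) = N(A*) = {c}, c being a constant function. From
Part (ii) of the theorem we thus deduce that there exists a solution
if and only if
\[ (f, c) = 0, \quad \text{or} \quad c \int_\Omega f \, dx = 0, \quad \text{or} \quad \int_\Omega f \, dx = 0, \]
\[ (u, c) = 0 \quad \text{or} \quad \int_\Omega u \, dx = 0. \]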
Such a condition would serve to determine the value of any arbitrary
constant in the solution.
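The role of the constants in the Neumann problem has a simple discrete counterpart. The Python sketch below is an illustration under assumptions of its own: it uses a one-dimensional analogue, $-u'' = f$ on (0, 1) with zero slope at both ends, discretized on a uniform grid by the standard three-point stencil, and the function name is hypothetical. The discrete operator annihilates constant vectors, and every vector in its range has entries that sum to zero, which mirrors the condition that the integral of f must vanish.

```python
def apply_neumann_operator(u, h):
    """Apply the discrete zero-flux operator for -u'' on a uniform grid of spacing h.

    Row i collects (u_i - u_{i-1}) / h and (u_i - u_{i+1}) / h; the missing
    neighbour at each end is dropped, which encodes the zero-slope condition.
    """
    n = len(u)
    out = []
    for i in range(n):
        left = u[i] - u[i - 1] if i > 0 else 0.0
        right = u[i] - u[i + 1] if i < n - 1 else 0.0
        out.append((left + right) / h)
    return out
```

Applying the operator to a constant vector gives zero (the discrete null space), and summing the entries of the result for any input gives zero (the discrete compatibility condition).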
32. We return to the problem of elasticity, and show that Theorem 1
is applicable to this problem as well. Suppose that the boundary
condition is
\[ 0 = (Au, u) = \int_\Omega \sum_{i,j,k,l=1}^{n} C_{ijkl}\, \varepsilon_{ij}(u)\, \varepsilon_{kl}(u) \, dx. \tag{8.64} \]
(8.66)
where $|\cdot|$ represents the norm of a matrix; that is, $|\varepsilon|^2 = \sum_{i,j=1}^{n} \varepsilon_{ij} \varepsilon_{ij}$.
\[ u(x) = a + b \times x \]
(8.67)
\[ t(u) = 0 \quad \text{on } \Gamma. \]
Physically, the body is not constrained against movement anywhere
on its boundary, so we would expect an element of nonuniqueness in
the solution, inasmuch as the body could be translated and rotated
from whatever its current position is, without affecting its state at all
(Figure 8.13). Such a motion, which takes place without adding any
deformation to the body, is known as a rigid body displacement. Its
most general form is
\[ u(x) = a + b \times x, \]
and it is easy to verify that $\varepsilon(u) = 0$ for such a displacement field.
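The vanishing of the strain for a rigid body displacement can be confirmed numerically. The following Python sketch is an illustration only, assuming the usual definition $\varepsilon_{ij}(u) = \frac{1}{2}(\partial u_i/\partial x_j + \partial u_j/\partial x_i)$ in three dimensions; the derivatives are approximated by central differences, which are exact up to rounding for a displacement that is affine in x. The function names are hypothetical.

```python
def cross(b, x):
    """Vector product b x x in three dimensions."""
    return [b[1] * x[2] - b[2] * x[1],
            b[2] * x[0] - b[0] * x[2],
            b[0] * x[1] - b[1] * x[0]]


def rigid_displacement(a, b, x):
    """u(x) = a + b x x: a translation a plus an infinitesimal rotation b."""
    c = cross(b, x)
    return [a[i] + c[i] for i in range(3)]


def strain(u, x, h=1e-3):
    """Symmetric part of the displacement gradient, by central differences."""
    grad = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        up, um = u(xp), u(xm)
        for i in range(3):
            grad[i][j] = (up[i] - um[i]) / (2.0 * h)
    return [[0.5 * (grad[i][j] + grad[j][i]) for j in range(3)] for i in range(3)]
```

The gradient of b x x is a skew-symmetric matrix, so its symmetric part is zero; by contrast, a uniform stretch such as u = (0.1 x_1, 0, 0) gives a nonzero strain component.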
8.6 Bibliographical remarks 297
For the problem with a traction boundary condition, the most general
solution of the problem Au = 0 in $\Omega$ and t(u) = 0 on $\Gamma$ is $\varepsilon = 0$, in
other words, a rigid body displacement, and so
or, equivalently, if
\[ \int_\Omega Q \, dx = 0 \quad \text{and} \quad \int_\Omega Q \times x \, dx = 0. \]
or, equivalently, if
\[ \int_\Omega u \, dx = 0 \quad \text{and} \quad \int_\Omega u \times x \, dx = 0. \]
concentrates on the aspects that are most accessible, and most relevant,
to readers of this text. Accessible treatments of an alternative approach
to regularity, using what is known as the method of differentials, may be
found in the monographs by Zeidler [53] and by Dautray and Lions [13]. The
latter text may also be consulted for further details of Korn's inequalities.
The article by Horgan [21] summarizes the major results concerning Korn's
inequalities for bounded domains, and discusses bounds on the constants
appearing in the inequalities.
Attention has been focused deliberately on those aspects of the theory of
elliptic boundary value problems that are relevant to the primary objective,
viz. that of presenting the theory of variational boundary value problems
and their approximation by finite elements. Some of the more complex
topics that have been omitted include the question of well-posedness in the
presence of nonhomogeneous boundary conditions, and in the presence of
data in $H^{-r}(\Omega)$ for r > 0. The latter would cover problems such as $-\nabla^2 u =
f$ in $\Omega$ where, for example, f is a Dirac delta. Naturally the solution u is
correspondingly irregular. These topics require some knowledge of Sobolev
spaces $H^s(\Omega)$ and $H^s(\Gamma)$ for which s is real; the theory of such spaces is
covered in the references to Sobolev spaces given at the end of Chapter 7.
We have assumed the boundary to be of class $C^\infty$; when the boundary is
less smooth (for example, Lipschitz or polygonal) then the theory on regularity
becomes more complicated, although in many cases the results look
similar to those given here. For a comprehensive treatment of problems in
nonsmooth domains the monograph by Grisvard [17] is recommended.
8.7 Exercises
Differential equations, boundary conditions, and initial conditions
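8.1. For each of the following differential equations specify the order of
the equation, state whether it is linear, and sketch the spatial domain
$\Omega$.
(a) $\frac{\partial^2 u}{\partial x^2} + \frac{\partial u}{\partial x} \frac{\partial u}{\partial y} = y$ in $\Omega = \{x \in \mathbb{R}^2: x^2 + y^2 < 1,\ y > 0\}$;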
8.2. The purpose of this exercise is to derive Navier's equation (8.11) for
elastic bodies, by retracing the steps employed in the Introduction
(equations (0.1) through (0.7)) in the derivation of the heat equation.
where the coefficients $C_{ijkl}$ are defined by (8.9). The elasticity operator
is then said to be elliptic if for all vectors $\xi$ and $\eta$,
\[ \sum_{i,j,k,l=1}^{3} C_{ijkl}\, \xi_i \eta_j \xi_k \eta_l \geq 0. \]
on r
in the form $\sum_{|\alpha| \leq q} b_\alpha D^\alpha u = g$. Is it normal?
8.10. Show that in the theory of elastic plates, the boundary condition
$S_1 = 0$ along the edge x = L of the plate can be expressed in the
form
8.11. Determine the conditions under which the pair of boundary condi-
tions
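\[ B_0 u = u, \]
\[ B_1 u = \alpha \frac{\partial^3 u}{\partial x^3} + \beta \frac{\partial^3 u}{\partial x^2 \partial y} + \gamma \frac{\partial^3 u}{\partial x \partial y^2} + \frac{\partial^3 u}{\partial y^3}, \]
cover the biharmonic operator $A = \nabla^4$, at a point on the boundary
with normal $\nu = (0, 1)$.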
8.12. An elastic body occupies the domain $\Omega = (0, 1) \times (0, 1)$. The sides
x = 0, x = 1, and y = 1 are traction-free, whereas the side y = 0
is constrained by a flexible foundation, in the sense that the normal
component of the surface traction acting on the boundary is proportional
to the normal component of displacement; the tangential
component of displacement is zero along this side. Do the boundary
conditions along y = 0 satisfy (8.34)?
8.13. Consider again the elastic body discussed in Exercise 8.12, but this
time suppose that the boundary condition along y = 0 is that corresponding
to Coulomb friction: the normal component of displacement
is zero, whereas the tangential component of traction is proportional
to the normal component of traction. Formulate this boundary con-
dition.
Green's formulas and adjoint problems
8.14. Show that the Green's formula for the operator A defined by
\[ \frac{d^4 u}{dx^4} = f \quad \text{in } \Omega = (0, 1) \]
is
\[ \int_0^1 v u'''' \, dx = \int_0^1 u v'''' \, dx + \left[ u''' v - u'' v' + u' v'' - u v''' \right]_0^1. \]
Given that $B_0 u = (u'(0), u''(1))$ and $B_1 u = (u'''(0), u'''(1))$, find the
operators $B_j^*$, $S_j$, and $S_j^*$ (j = 0, 1).
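8.15. Show that the Green's formula for the Laplacian operator $Au = \nabla^2 u$
can be expressed in the form
\[ \sum_{i,j=1}^{n} \int_\Omega \frac{\partial a_{ij}}{\partial x_j} v_i \, dx = \sum_{i,j=1}^{n} \int_\Gamma a_{ij} \nu_j v_i \, ds - \sum_{i,j=1}^{n} \int_\Omega a_{ij} \frac{\partial v_i}{\partial x_j} \, dx, \tag{8.68} \]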
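\[ Au = \nabla^4 u = f \quad \text{in } \Omega, \]
\[ u = g_0, \quad \partial u/\partial \nu = g_1 \quad \text{on } \Gamma. \]
\[ Aw = f \quad \text{in } \Omega, \]
\[ B_j w = 0 \quad \text{on } \Gamma, \]
8.21. Verify that $\varepsilon(u) = 0$ for the rigid body displacement $u(x) = a +
b \times x$.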
8.22. The purpose of this exercise is to fill in some of the details of the
proof of Theorem 1.
8.23. Investigate the conditions under which unique solutions exist to the
elasticity problem with boundary conditions given in Exercises 8.12
and 8.13.
9
Variational boundary value problems
PDE: $Au = f$ in $\Omega$,
BCs: $B_0 u = g_0$, ..., $B_{m-1} u = g_{m-1}$ on $\Gamma$,
a(u,v) = (f,v)
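Here $V = H_0^1(\Omega)$,
\[ a(u, v) = \int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega \left( \frac{\partial u}{\partial x} \frac{\partial v}{\partial x} + \frac{\partial u}{\partial y} \frac{\partial v}{\partial y} \right) dx\, dy \]
and
\[ \langle \ell, v \rangle = \int_\Omega f v \, dx\, dy. \]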
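The first question we ask is: in what sense is (9.1) equivalent to a BVP, and
what does this BVP look like? This is resolved by observing first that since
v in (9.1) is arbitrary, we can set $v = \phi \in \mathcal{D}(\Omega)$ (note that $\mathcal{D}(\Omega) \subset H_0^1(\Omega)$),
to give
\[ a(u, \phi) = \int_\Omega \left( \frac{\partial u}{\partial x} \frac{\partial \phi}{\partial x} + \frac{\partial u}{\partial y} \frac{\partial \phi}{\partial y} \right) dx\, dy = \langle \ell, \phi \rangle. \tag{9.2} \]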
Now the functions $\partial u/\partial x$ and $\partial u/\partial y$ appearing in (9.2) belong to $L^2(\Omega)$
(since $u \in H_0^1(\Omega)$) and also generate regular distributions $\partial u/\partial x$ and
$\partial u/\partial y$, from which it follows that
(9.5)
in other words (9.1) implies the problem of finding $u \in H_0^1(\Omega)$ that satisfies
the Poisson equation
(9.7)
in the sense of distributions (see Section 7.2). We could even go one step
further, and make use of the fact that $\mathcal{D}(\Omega)$ is dense in $H_0^1(\Omega)$ to argue,
using (9.6), that (9.7) makes sense in $H^{-1}(\Omega)$, the dual space of $H_0^1(\Omega)$.
Furthermore, since $u \in H_0^1(\Omega)$ it vanishes on the boundary, and we have
\[ u = 0 \quad \text{on } \Gamma, \tag{9.8} \]
where $\delta$ is the Dirac singular distribution, and the same procedure leads to
the equation
which, as we know from Section 7.2, only has meaning in the distributional
sense.
308 9. Variational boundary value problems
As (9.7) and (9.8) stand, a solution is sought in the space $H_0^1(\Omega)$. Whether
this solution coincides with a "classical" solution of the kind discussed in
Chapter 8 depends on the smoothness of f. If $f \in H^s(\Omega)$ with $s \geq 0$, then
$u \in H^{s+2}(\Omega)$, and so the solution to the VBVP is the same as that of the
classical BVP.
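To make the variational formulation concrete, the sketch below solves a one-dimensional analogue of the VBVP by a Galerkin method with piecewise-linear functions. It is an illustration under assumptions that are not in the text: the problem is $-u'' = f$ on (0, 1) with u(0) = u(1) = 0 and constant f, so that $a(u, v) = \int_0^1 u'v'\,dx$ and $\langle \ell, v \rangle = \int_0^1 f v\,dx$; the mesh is uniform; and the function names are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] and upper[i] are the off-diagonal entries of row i."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def galerkin_poisson_1d(n_elements, load=1.0):
    """Find u_h in span{hat functions} with a(u_h, v) = <l, v> for every hat function v.

    Requires n_elements >= 2. Returns the node coordinates and nodal values.
    """
    h = 1.0 / n_elements
    n = n_elements - 1                      # number of interior nodes
    diag = [2.0 / h] * n                    # a(phi_i, phi_i)
    off = [-1.0 / h] * n                    # a(phi_i, phi_{i+1}) and a(phi_i, phi_{i-1})
    rhs = [load * h] * n                    # integral of load * phi_i, exact for constant load
    interior = solve_tridiagonal(off, diag, off, rhs)
    nodes = [i * h for i in range(n_elements + 1)]
    return nodes, [0.0] + interior + [0.0]  # essential conditions built into the space
```

For f = 1 the exact solution is u(x) = x(1 - x)/2, and in this one-dimensional setting the computed nodal values agree with it up to rounding.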
So far we have shown that the VBVP (9.1) implies (9.7) (in the sense of
distributions) and (9.8). What of the converse? Suppose that we start with
the Dirichlet problem for the Poisson equation, that is,
\[ -\nabla^2 u = f \quad \text{in } \Omega, \tag{9.10} \]
\[ u = 0 \quad \text{on } \Gamma, \tag{9.11} \]
Green's theorem in the form (7.30) is now applied to the left-hand side of
(9.12), to reduce this to
PDE: $Au = f$ in $\Omega$    (9.13)
BCs: $B_0 u = 0$, ..., $B_{p-1} u = 0$ (essential)    (9.14)
     $B_p u = g_p$, ..., $B_{m-1} u = g_{m-1}$ (natural).    (9.15)
The first step is to define a space V in which the solution to the VBVP
is to be sought. This corresponds to the space $H_0^1(\Omega)$ in problem (9.1). The
space V is known as the space of admissible functions, and is defined by
or
As with the simple example worked through earlier, the next step is to
multiply both sides of (9.13) by an arbitrary function v from V, integrate,
and use Green's theorem to reduce the expression so obtained to one of the
form
\[ a(u, v) = \int_\Omega \sum_{|\alpha|,|\beta| \leq m} a_{\alpha\beta}(x)\, D^\beta u\, D^\alpha v \, dx + \text{boundary terms}. \]
Although the essential BCs are taken care of by the requirement that $u \in V$,
the natural BCs are substituted into (9.16) directly. Once the formulation
(9.16) is arrived at we may disregard any smoothness initially assumed of
u, and pose the VBVP: find $u \in V$ that satisfies (9.16) for all $v \in V$. Since
the VBVP is derived from the setting (9.13) through (9.15), every solution
of (9.13) through (9.15) is a solution of the VBVP. Conversely, it can be
shown that every solution of (9.16) solves the classical problem, possibly
in a weak or distributional sense.
9.2 Formulation of variational boundary value problems 311
Examples
1. Consider the problem
\[ -\nabla^2 u + au = f \quad \text{in } \Omega, \]
\[ \partial u/\partial \nu + bu = g \quad \text{on } \Gamma, \tag{9.17} \]
where a and b are continuous functions and it is assumed that $f \in
L^2(\Omega)$ and $g \in L^2(\Gamma)$. This problem arises in steady heat conduction,
in which the heat source is temperature-dependent, and of the form
f - au, and there is Newton cooling on the boundary. In this problem
m = 1, so that the boundary condition is a natural one. The space
of admissible functions is thus $V = H^1(\Omega)$. Multiplying both sides of
the PDE by $v \in H^1(\Omega)$, integrating, and using Green's theorem, we
get
\[ \int_\Omega (\nabla u \cdot \nabla v + auv) \, dx - \int_\Gamma \frac{\partial u}{\partial \nu} v \, ds = \int_\Omega f v \, dx. \]
The introduction of the natural boundary condition into the boundary
term reduces this equation to the VBVP of finding $u \in H^1(\Omega)$
that satisfies
\[ \underbrace{\int_\Omega (\nabla u \cdot \nabla v + auv) \, dx + \int_\Gamma buv \, ds}_{a(u,v)} = \underbrace{\int_\Omega f v \, dx + \int_\Gamma gv \, ds}_{\langle \ell, v \rangle} \tag{9.18} \]
for all $v \in H^1(\Omega)$. Thus the solution to problem (9.17) for $f \in L^2(\Omega)$
also solves the VBVP (9.18).
Conversely, if u is a solution of (9.18), then upon setting $v = \phi \in \mathcal{D}(\Omega)$
we get
(9.19)
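\[ 0 = \int_\Gamma \left( bu - g + \frac{\partial u}{\partial \nu} \right) v \, ds - \int_\Omega (\nabla^2 u - au + f) v \, dx = \int_\Gamma \left( bu - g + \frac{\partial u}{\partial \nu} \right) v \, ds \]
using (9.19). The boundary value $\partial u/\partial \nu$ is, of course, well-defined
since $u \in H^2(\Omega)$ by assumption, and so $\partial u/\partial \nu \in L^2(\Gamma)$. By choosing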
where
\[ \nabla^4 w = \frac{\partial^4 w}{\partial x^4} + 2 \frac{\partial^4 w}{\partial x^2 \partial y^2} + \frac{\partial^4 w}{\partial y^4}. \]
Recall from Section 8.1 (Box 5) that physically this equation represents
the behavior of a flat plate with stiffness D subject to a transverse
force Q per unit area, with f = Q/D. For simplicity we confine
attention here to a rectangular plate such as that shown in Figure
8.6.
Suppose that the plate is supported on its entire boundary in such
a way that rotation is permitted, but the boundary is constrained
against displacement (as in the second boundary in Section 8.1, Box
5). Then there are two boundary conditions, the first of which is w = 0
on $\Gamma$. To formulate the second boundary condition we must consider
the edges x = ±h and y = ±l separately. For the edges y = ±l we
have, as in Box 5, the condition $\partial^2 w/\partial y^2 = 0$. By a similar argument,
that essentially entails reversing the roles of x and y, we arrive at the
condition $\partial^2 w/\partial x^2 = 0$ along the edge x = ±h. In summary, we
require that
\[ w = 0 \quad \text{on } \Gamma, \]
\[ \partial^2 w/\partial x^2 = 0 \quad \text{for } x = \pm h,\ y \in [-l, l], \tag{9.20} \]
\[ \partial^2 w/\partial y^2 = 0 \quad \text{for } y = \pm l,\ x \in [-h, h]. \]
Finally,
\[ \frac{\partial}{\partial x}\left( \frac{\partial w}{\partial y} \right), \]
and $\partial w/\partial y$ vanishes along x = ±h. The other two sides are treated
in the same way, by swapping x and y.
Thus all the boundary terms vanish, and we finally obtain the VBVP:
find $w \in V$ such that
(9.24)
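\[ -\int_\Gamma [C\varepsilon(u)]\nu \cdot v \, ds \]
may be written as a sum of integrals over the parts $\Gamma_1, \ldots, \Gamma_4$ making
up $\Gamma$; now the integrals over $\Gamma_1$, $\Gamma_2$, and $\Gamma_3$ vanish, either because
v = 0 (on $\Gamma_1$) or because the surface traction vanishes (on $\Gamma_2$ and $\Gamma_3$).
The integral over $\Gamma_4$ becomes simply $\int_{\Gamma_4} f \cdot v \, ds$, after substitution
of the natural boundary conditions, and the desired VBVP is: find
$u \in V$ such that
\[ \underbrace{\int_\Omega [C\varepsilon(u)] \cdot \varepsilon(v) \, dx}_{a(u,v)} = \underbrace{\int_{\Gamma_4} f \cdot v \, ds}_{\langle \ell, v \rangle} \tag{9.25} \]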
FIGURE 9.3. The beam corresponding to Example 4
\[ u(0) = 0, \qquad u'(L) = 0, \]
\[ u'(0) = 0, \qquad u'''(L) = -S/EI. \]
Of these, all except the condition on $u'''(L)$ are essential conditions,
so it follows that the space of admissible functions is
where q = f/EI.
V being the space of admissible functions and $\|\cdot\|_V$ the norm on this space.
Without further ado we present the basic existence and uniqueness theorem
for VBVPs, after which a few specific examples are considered.
(ii) the solution depends continuously on the data, in the sense that
\[ \|u\|_V \leq \frac{1}{\alpha} \|\ell\|_{V'}, \tag{9.28} \]
where $\|\cdot\|_{V'}$ is the norm in the dual space V' of V and $\alpha$ is the
constant in (9.26).
9.3 Existence, uniqueness, and regularity of solutions 317
PROOF. The proof of this theorem follows from the Lax-Milgram theorem
(Theorem 5.13). Since a is continuous and V-elliptic, every bounded linear
functional, and in particular the functional $\ell$, can be expressed in the form
\[ \langle \ell, v \rangle = a(u, v), \]
where u is unique. This proves Part (i); Part (ii) follows by setting v = u
in (9.26) and using (9.27). This gives
\[ \alpha \|u\|_V^2 \leq a(u, u) = \langle \ell, u \rangle \leq \|\ell\|_{V'} \|u\|_V, \]
the last inequality coming from the fact that $\ell$ is bounded. Dividing throughout
by $\|u\|_V$, we obtain (9.28). □
Recall from the discussion in Section 8.5 the significance of a result such
as (9.28). This inequality assures us that a small change in the functional
$\ell$ leads to a correspondingly small change in the solution.
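A finite-dimensional analogue makes the stability estimate tangible. In the Python sketch below, which is an illustration under assumptions of its own, V is taken to be $\mathbb{R}^2$ with the Euclidean norm, the bilinear form is $a(u, v) = v^T A u$ for a symmetric positive-definite matrix A, and the functional is $\langle \ell, v \rangle = f \cdot v$. The ellipticity constant is then the smallest eigenvalue of A, and the dual norm of the functional is the Euclidean norm of f. The function names are hypothetical.

```python
import math


def solve_2x2(A, f):
    """Solve A u = f for a 2 x 2 matrix by Cramer's rule."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [(d * f[0] - b * f[1]) / det, (a * f[1] - c * f[0]) / det]


def smallest_eigenvalue(A):
    """Smallest eigenvalue of a symmetric 2 x 2 matrix; plays the role of alpha."""
    (a, b), (_, d) = A
    return 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + b * b)


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def stability_bound_holds(A, f1, f2, tol=1e-12):
    """Check |u1 - u2| <= |f1 - f2| / alpha for the solutions of A u = f1 and A u = f2."""
    u1, u2 = solve_2x2(A, f1), solve_2x2(A, f2)
    alpha = smallest_eigenvalue(A)
    du = [x - y for x, y in zip(u1, u2)]
    df = [x - y for x, y in zip(f1, f2)]
    return norm(du) <= norm(df) / alpha + tol
```

A small perturbation of the data f therefore produces a change in the solution that is at most 1/α times as large, which is the content of the estimate in this simplified setting.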
The inequality (9.28) may be expressed in an alternative form if $\ell$ is given
by
(9.29)
Examples
5. Consider the BVP
\[ -\nabla^2 u + ku = f \quad \text{in } \Omega, \]
\[ u = 0 \quad \text{on } \Gamma, \tag{9.30} \]
(9.31)
::; IlfllullvllHI;
{ I [i +:;1
('\lv· '\lv v 2) dx
In other words,
6. Consider once again the BVP (9.30), but this time assume only that
k(x) is nonnegative and bounded above, so that
and a is $H_0^1$-elliptic.
Given that all the norms are of various combinations of first deriva-
tives of u and v it follows that there is a constant M > 0 such that
in !1,
u
oujov
on r.
11n a wa
2
<l
uX
2 <l
uX
2
v
2 dx
I
a(v, v) ~ L
lal=2
i (Dav? dx,
\[ a(v, v) \geq \alpha \|v\|^2 \]
ceases to hold for any function v that is constant (for which case
a(v, v) = 0). Hence, although the lack of $H^1$-ellipticity does not necessarily
mean that a unique solution does not exist (Theorem 1 gives
sufficient conditions for the existence of a solution; if these conditions
are not satisfied, it does not imply nonexistence or nonuniqueness),
we are unable to guarantee the existence of a unique solution.
Now the problem (9.33) has in fact been treated previously, in Chapter
8, Example 31; indeed, recall that in order for a unique solution to
exist, it was necessary and sufficient that the data f and the solution
u satisfy
\[ \int_\Omega f \, dx = 0 \quad \text{and} \quad \int_\Omega u \, dx = 0. \tag{9.35} \]
The reason why we cannot prove existence of a unique solution to the
corresponding VBVP (9.34) is essentially that the conditions (9.35)
are not satisfied in the statement of (9.34) as it stands. First of all,
the space $H^1(\Omega)$ in which u is sought is too large; for uniqueness
we must restrict attention to the subspace of $H^1(\Omega)$ consisting of
elements orthogonal to constants, that is, elements satisfying $(9.35)_2$.
\[ 0 = \int_\Omega f c \, dx \quad \text{or} \quad \int_\Omega f \, dx = 0. \]
and assume that a(·,·) is Q-elliptic: there is a constant a > 0 such that
holds;
(ii) (continuous dependence on the data) the solution u satisfies
REMARK. Note that Theorem 1 is a special case for which P = {0}, so that
Q = V. Also, observe that for Example 9 we have $V = H^1(\Omega)$, $P = P_0$,
the set of constant functions, and Q consists of functions satisfying $(9.35)_2$,
PROOF. First we show the necessity of (9.38). Assuming that (9.37) holds,
we have from (9.36),
Once this has been done, the proof follows that in Theorem 1, since a(·,·)
is continuous and Q-elliptic and, furthermore, it is readily shown that Q is
a closed subspace and therefore complete in the $H^m$-norm.
Now since P is closed in V, which is in turn closed in $H^m(\Omega)$ and hence
also in $L^2(\Omega)$, we have by Theorem 8 of Chapter 4 that
\[ V = P \oplus Q. \]
Example
\[ Q = \left\{ q \in H^1(\Omega): \int_\Omega q \, dx = 0 \right\}. \tag{9.41} \]
Furthermore, we have
(9.42)
\[ t = 0 \quad \text{on } \Gamma, \]
then $\Gamma_1 = \emptyset$ and this is the same situation as that encountered
in Chapter 8, Example 33. Returning to Theorem 2, that holds for
vector-valued functions with minor modifications, we first note that
$V = [H^1(\Omega)]^3$, and the space of functions P coincides with the set of
rigid body displacements:
\[ Q = \left\{ u \in V: \int_\Omega u \, dx = 0,\ \int_\Omega u \times x \, dx = 0 \right\}. \]
Korn's inequality (8.67) holds on Q, and thus the bilinear form is
Q-elliptic. According to Theorem 2, a unique solution exists in Q
provided that the compatibility condition (9.38) is satisfied: this is
precisely the pair of conditions
\[ \int_\Omega Q \, dx = 0 \quad \text{and} \quad \int_\Omega Q \times x \, dx = 0 \]
\[ \langle \ell, v \rangle = \int_\Omega f v \, dx \tag{9.43} \]
(9.45)
holds.
\[ Au = f \quad \text{in } \Omega \tag{9.46} \]
\[ B_j u = g_j \quad \text{on } \Gamma \quad (j = 0, \ldots, m-1) \]
can be posed in the alternative form of a variational boundary value prob-
lem
The formulation (9.47) has been shown also to hold certain advantages over
(9.46).
It turns out that if the bilinear form in (9.47) is symmetric, that is, if
(and this has been the case for the examples treated here) then a
third formulation is possible: the variational boundary value problem is
equivalent to a minimization problem, in which a function u in V is sought
that renders the value of a functional $J: V \to \mathbb{R}$ a minimum; that is,
where J is defined by
the total potential energy of the system, whose minimum characterizes the
solution; this minimum is sought in $H_0^1(\Omega)$.
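The link between a symmetric variational problem and a minimization problem can be illustrated in finite dimensions. The Python sketch below rests on assumptions that are not taken from the text: V is $\mathbb{R}^2$, $a(u, v) = v^T A u$ with A symmetric and positive definite, $\langle \ell, v \rangle = f \cdot v$, and the functional is taken to have the standard quadratic form $J(v) = \frac{1}{2} a(v, v) - \langle \ell, v \rangle$. The function names are hypothetical.

```python
def bilinear(A, u, v):
    """a(u, v) = v^T A u."""
    return sum(v[i] * A[i][j] * u[j] for i in range(len(v)) for j in range(len(u)))


def energy(A, f, v):
    """Assumed quadratic functional J(v) = a(v, v) / 2 - f . v."""
    return 0.5 * bilinear(A, v, v) - sum(fi * vi for fi, vi in zip(f, v))


def solve_2x2(A, f):
    """Solution of a(u, v) = f . v for all v, that is, of A u = f (Cramer's rule)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [(d * f[0] - b * f[1]) / det, (a * f[1] - c * f[0]) / det]


def energy_increase(A, f, w):
    """J(u + w) - J(u), where u solves the variational problem."""
    u = solve_2x2(A, f)
    shifted = [ui + wi for ui, wi in zip(u, w)]
    return energy(A, f, shifted) - energy(A, f, u)
```

Because A is symmetric and Au = f, the increase equals a(w, w)/2, which is positive for every nonzero w; the solution of the variational problem is therefore the unique minimizer of J in this setting.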
Identical remarks apply to the elasticity problem; the total potential
energy is again given by
\[ f'(x_0) = 0. \]
The point $x_0$ is a minimum if the function f is convex, that is, if a straight
line drawn between any two points on the curve of f(x) lies on or above
the graph of the function. Mathematically, f is convex if, for $0 < \theta < 1$,
[Figure 9.4: a convex function g, whose minimum is attained at all points between a and b, and a strictly convex function f.]
The minimum may not be unique. For example, consider the function g
shown in Figure 9.4; the minimum value of g(x) occurs at all points between
a and b. But for a strictly convex function, that is, one for which
the minimum is unique; this is the case for the function f in Figure 9.4.
Remarkably, all of these ideas extend in a very simple way to functionals
defined on arbitrary spaces. In order to see how this is done we first introduce
the required generalizations of convex functions and their derivatives.
(9.51)
9.4 Minimization of functionals 329
(see Exercise 9.12). The operator DJ is called the gradient of J and DJ(u):
$V \to \mathbb{R}$ is the (Gateaux) derivative of J at u. Observe from (9.50) or (9.51)
that DJ maps V to its dual space V', so that DJ(u) is required to be a
bounded linear functional on V.
The Gateaux derivative does not always exist; it may be verified, for
example, that if J is defined by $J: \mathbb{R}^2 \to \mathbb{R}$,
then
lim 8- 1 [J(x
IJ~O
+ 8y) - J(x)] = yUY2'
Examples
\[ \langle DJ(x), y \rangle = \sum_{i=1}^{n} \frac{\partial J}{\partial x_i} y_i = \nabla J \cdot y; \]
that is, the Gateaux derivative is the directional derivative (see Exercise
9.13).
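The identification of the Gateaux derivative with the directional derivative can be checked numerically for a smooth function on $\mathbb{R}^2$. In the Python sketch below the function J, its gradient, and all names are hypothetical choices made for illustration; the difference quotient $\theta^{-1}[J(x + \theta y) - J(x)]$ is compared with $\nabla J \cdot y$ for a small θ.

```python
import math


def J(x):
    """Illustrative smooth function on R^2 (an assumption, not from the text)."""
    return x[0] ** 2 + 3.0 * x[0] * x[1] + math.sin(x[1])


def grad_J(x):
    """Gradient of the illustrative J, computed by hand."""
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0] + math.cos(x[1])]


def difference_quotient(func, x, y, theta=1e-7):
    """theta^{-1} [func(x + theta y) - func(x)], an approximation to <DJ(x), y>."""
    shifted = [xi + theta * yi for xi, yi in zip(x, y)]
    return (func(shifted) - func(x)) / theta


def directional_derivative(grad, x, y):
    """grad J(x) . y"""
    return sum(g * yi for g, yi in zip(grad(x), y))
```

For small θ the two quantities agree to within the expected truncation and rounding errors, and the dependence on the direction y is linear, as required of DJ(x).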
Hence
\[ = \theta J(u) + (1 - \theta) J(v). \]
(DJ(u), v)
or
PROOF. We show first that (9.52) implies (9.53). Assume that (9.52) holds;
then, replacing v by $u + \theta v$ for any $u, v \in V$ and $\theta \in (0, 1)$, we have
But v is arbitrary, so (9.54) holds if we replace v by -v. Using the linearity
of DJ(u) we get $\langle DJ(u), v \rangle \leq 0$, and so $\langle DJ(u), v \rangle = 0$.
To show that (9.53) implies (9.52), we start with
so that, as $\theta \to 0$,
Exarnples
15. The preceding example is jlh'lt a special case of the general minimiza-
tion problem that involves quadratic functionals of the form
= a(u,v) - (f,v)
(9.59)
(9.60)
where $\|\cdot\|_a$ is the norm generated by $(\cdot,\cdot)_a$. From (9.60) it is clear that
the problem amounts to one of finding $u \in V$ such that
$$-\nabla^2 u - f \ge 0, \qquad u \ge g,$$
$$(u - g)(-\nabla^2 u - f) = 0,$$
$$u = 0 \ \text{on } \Gamma.$$
334 9. Variational boundary value problems
9.6 Exercises
Formulation of variational boundary value problems
g(x)
f
9 on r,
9.3. The VBVP for the plate problem may be derived in a manner that
facilitates imposition of the natural boundary conditions, in the following way.
(a) Equations (8.15) imply that $\sum_{\alpha,\beta=1}^{2} \partial^2 M_{\alpha\beta}/\partial x_\alpha\,\partial x_\beta = -q$. Multiply
this equation by an arbitrary function $v$ and use Green's
theorem to obtain the identity
$f$ in $\Omega$,
$u = 0$ on $\Gamma$;
the coefficients $k_{ij}$ of the thermal conductivity matrix are such that
the operator is strongly elliptic. Derive the corresponding VBVP and
show that the bilinear form is V-elliptic provided that $b(x) \ge 0$. Show
also that the bilinear form is continuous provided that $|k_{ij}(x)| \le K$.
9.6. For a plate occupying a domain $\Omega$ with arbitrary, nonrectangular
boundary $\Gamma$, it can be shown (see, for example, Rektorys [41], Chapter 23)
that the moment acting on the boundary is given by $M_n =
n^T M n = \sum_{\alpha,\beta=1}^{2} M_{\alpha\beta}n_\alpha n_\beta$. In this exercise the unit normal is
denoted by $n$, to avoid confusion with Poisson's ratio $\nu$.
(a) as in Example 4;
$u''(0) = g_0$, $u''(1) = g_1$,
(b)
$u'''(0) = h_0$, $u'''(1) = h_1$.
9.9. Show that the bilinear form in Example 7 is V-elliptic provided that
the Lame constants satisfy the conditions given in Exercise 8.8.
used in Example 7.
Minimization of functionals
9.12. Show that an equivalent definition of the Gateaux derivative is
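$$\langle DJ(u), v\rangle = \frac{d}{d\theta}\bigl[J(u + \theta v)\bigr]_{\theta = 0}.$$
$$\langle DJ(x), y\rangle = \sum_{i=1}^{n}\frac{\partial J}{\partial x_i}\,y_i.$$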
is strictly convex.
9.15. Show that $J(v) = \frac12 a(v, v) - \langle f, v\rangle$ is convex if $a$ is positive, that is, if
$a(v,v) \ge 0$ for all $v \in V$. Hence prove the converse of Theorem 3: if
$u$ satisfies $a(u,v) = \langle f,v\rangle$ for all $v \in V$, then $u$ minimizes $J$.
(10.2)
The index $h$ is a parameter that lies between 0 and 1, and whose magnitude
gives some indication of how close $V^h$ is to $V$; $h$ is related to the dimension
of $V^h$, and as the number $N$ of basis functions chosen gets larger, $h$ gets
smaller (for example, we could set $h = 1/N$). In the limit, as $N \to \infty$, $h \to 0$
and we would like to choose $\{\phi_i\}$ in such a way that $V^h$ will approach
$V$, in a manner made precise later.
Having defined the space $V^h$, problem (10.1) is now posed in $V^h$ instead
of in $V$. That is, we try to find a function $u_h \in V^h$ that satisfies
(10.3)
This is the essence of the Galerkin method. In order to solve for $u_h$, we
simply note that both $u_h$ and $v_h$ must be linear combinations of the basis
functions of $V^h$, so that
$$\sum_{i=1}^{N}\sum_{j=1}^{N} a(\phi_i, \phi_j)\,c_i d_j = \sum_{j=1}^{N}\langle f, \phi_j\rangle\,d_j$$
(10.5)
10.1 The Galerkin method 341
in which
(10.6)
Note that $K_{ij}$ and $F_j$ can be evaluated in practice since the $\phi_i$ are known
functions and the forms of $a$ and $f$ are also known.
Since the coefficients $d_j$ are arbitrary, it follows that (10.5) only holds if
the term in brackets is zero. The problem is thus reduced to one of solving
the set of simultaneous linear equations
$$\sum_{i=1}^{N} K_{ij}c_i = F_j, \qquad j = 1, \ldots, N \tag{10.7}$$
(10.8)
in which $K$ and $F$ are, respectively, the matrix and vector with entries $K_{ij}$
and $F_i$. Once these equations are solved, the approximate solution $u_h$ can
be found from the first of equations (10.4).
Examples
$$-\frac{d^2u}{dx^2} = \sin\frac{\pi x}{2} \ \text{in } \Omega = (0,1), \qquad u(0) = u'(1) = 0.$$
Then
and
$Kc = F$, with right-hand side $F = (0.405,\ 0.295)^T$, gives
$c_1 = 0.738$, $c_2 = -0.33$.
The approximate solution is thus
$$u_h(x) = c_1\phi_1(x) + c_2\phi_2(x) = 0.738x - 0.33x^2.$$
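The numbers in this example can be reproduced in a few lines. The following is a sketch under stated assumptions: the basis $\phi_1 = x$, $\phi_2 = x^2$ is inferred from the form of the approximate solution, the bilinear form is taken to be $a(u,v) = \int_0^1 u'v'\,dx$, and the integrals are entered in closed form. The closed-form solution used for comparison was checked by substitution into the boundary value problem; the helper `solve_2x2` is hypothetical.

```python
import math

def solve_2x2(K, F):
    """Solve the 2x2 system K c = F by Cramer's rule."""
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    c1 = (F[0] * K[1][1] - K[0][1] * F[1]) / det
    c2 = (K[0][0] * F[1] - K[1][0] * F[0]) / det
    return [c1, c2]

# Assumed basis: phi_1 = x, phi_2 = x^2 (both vanish at x = 0).
# K_ij = integral over (0, 1) of phi_i' * phi_j'.
K = [[1.0, 1.0],
     [1.0, 4.0 / 3.0]]

# F_j = integral over (0, 1) of sin(pi x / 2) * phi_j(x), in closed form.
F = [4.0 / math.pi ** 2,
     8.0 / math.pi ** 2 - 16.0 / math.pi ** 3]

c = solve_2x2(K, F)

def u_h(x):
    """Galerkin approximation c_1 x + c_2 x^2."""
    return c[0] * x + c[1] * x * x

def u_exact(x):
    """Closed-form solution of -u'' = sin(pi x / 2), u(0) = u'(1) = 0."""
    return 4.0 / math.pi ** 2 * math.sin(math.pi * x / 2.0)

for x in (0.25, 0.5, 0.75, 1.0):
    print(f"x = {x:.2f}   u_h = {u_h(x):.4f}   u = {u_exact(x):.4f}")
```

With only two basis functions the approximation already agrees with the closed-form solution at $x = 1$, since the first Galerkin equation reads $c_1 + c_2 = F_1$.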
This problem can be solved in closed form, and the exact solution is $u(x) = (4/\pi^2)\sin(\pi x/2)$.
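$$\int_0^1 \sin n\pi x\,\sin m\pi x\,dx = \int_0^1 \cos n\pi x\,\cos m\pi x\,dx = \begin{cases} 0 & \text{if } n \ne m,\\ \tfrac12 & \text{if } n = m.\end{cases}$$
Then
$$K_{ij} = \int_0^1\!\!\int_0^1\left(\frac{\partial\phi_i}{\partial x}\frac{\partial\phi_j}{\partial x} + \frac{\partial\phi_i}{\partial y}\frac{\partial\phi_j}{\partial y}\right)dx\,dy,$$
and because of the orthogonality of the trigonometric functions the
only nonzero terms of $K_{ij}$ are
$$K_{ii} = \int_0^1\!\!\int_0^1\left[\left(\frac{\partial\phi_i}{\partial x}\right)^2 + \left(\frac{\partial\phi_i}{\partial y}\right)^2\right]dx\,dy = \pi^2\int_0^1\!\!\int_0^1 n^2\cos^2 n\pi x\,\sin^2 m\pi y\,dx\,dy + \pi^2\int_0^1\!\!\int_0^1 m^2\sin^2 n\pi x\,\cos^2 m\pi y\,dx\,dy,$$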
1o 1
1 1
xy sin n7rX sin m7rY dx dy
1
2(1, -2, -2,4).
7r
344 10. Approximate methods of solution
and so
4
4" [~( sin 7rX sill7ry + sin 2?TX sin 2?TY)
?T
- ~ (sin?Tx sin 2?TY + sin 2?TX sin ?TY)] .
We recall from Section 5.5 that the bilinear form $a(\cdot,\cdot)$ defines an inner
product on $V$ if $a$ is symmetric and V-elliptic; indeed, the properties
of linearity and symmetry are obvious, whereas the property of positive-definiteness
comes from the V-ellipticity of $a$:
and so
$K_{ii}c_i = F_i$, or $c_i = F_i/K_{ii}$.
This is in fact the case in Example 2.
However, a word of warning is appropriate. Although for the preceding
example it was quite simple to find a basis that was orthogonal with respect
to $(\cdot,\cdot)_a$, in general this is quite difficult. One could of course choose any
non-orthogonal basis and use the Gram-Schmidt procedure of Section 6.2
to orthogonalize or even orthonormalize, but for all except the most trivial
problems this is a laborious procedure, and little is to be gained from it.
The problem of constructing a basis $\{\phi_i\}_{i=1}^{N}$ in such a way that $V^h$ approaches
$V$ as $N \to \infty$ can be rather awkward. Remember that although
orthonormal bases for spaces such as $L^2$ are well known, at least for spaces
of functions on the real line or on simple two- and three-dimensional domains
(see, for example, Section 6.4), when using the Galerkin method we
are required to find bases for spaces $V$ that are subspaces of Sobolev spaces
$H^m(\Omega)$, and that are defined on domains $\Omega$ which may be quite irregular
in shape. A very simple and elegant method for constructing such bases is
provided by the finite element method. This is the topic of discussion in
the next two chapters.
$$\frac{\partial J}{\partial c_k} = 0, \qquad k = 1, \ldots, N,$$
and this yields a set of $N$ simultaneous algebraic equations in the $N$ unknowns
$c_1, \ldots, c_N$. Solution of these equations then gives the components
$c_k$ of $u_h$. In particular, if $J$ is given by (9.57), then
$$J(c_k) = \tfrac12\sum_{i,j=1}^{N} K_{ij}c_ic_j - \sum_{j=1}^{N} F_jc_j,$$
leave things at that; we ought to know, first of all, whether the Galerkin
method always works and, if so, how significantly the approximate solution
differs from the exact solution. Also, we would like to be confident that as
the number of functions $\phi_i$ in the basis of $V^h$ is increased, $V^h$ approaches
in some sense the space $V$ and $u_h$ approaches the exact solution $u$. This last
consideration is of course one of convergence of the approximate solution
as $h \to 0$ (or as $N \to \infty$).
(10.11)
or
that is,
so that
$$Pu = \sum_{k=1}^{N}(u_h, \phi_k)_a\,\phi_k = Pu_h = u_h. \tag{10.14}$$
Hence the orthogonal projection of the solution $u$ onto $V^h$, with respect to
the inner product $(\cdot,\cdot)_a$, is the approximate solution $u_h$. Clearly, then, the
error $e = u - u_h = u - Pu$ belongs to $N(P)$, that is,
$$(e, u_h)_a = 0,$$
which confirms (10.12). In other words, relative to the inner product $(\cdot,\cdot)_a$
the error is orthogonal to the subspace $V^h$.
The geometrical analogy may be carried a step further. It would appear
that the distance $\|u - v_h\|$, when measured using the norm $\|\cdot\|_a$, is a minimum
when $v_h = u_h$. That this is indeed so is borne out by the following.
We have
using the bilinearity of $a$. Now the second term on the right-hand side is
zero, since $e$ is orthogonal to all members of $V^h$ with respect to the inner
product $(\cdot,\cdot)_a$. Thus
and for fixed $u$ and $u_h$ (and hence for fixed $e$) we conclude that $\|u - v_h\|_a$
is smallest when $v_h = u_h$; that is,
(10.15)
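The best-approximation property in the energy norm can be observed on a small example. The sketch below assumes the model problem $-u'' = \sin(\pi x/2)$, $u(0) = u'(1) = 0$, with the hypothetical basis $\phi_1 = x$, $\phi_2 = x^2$ and $a(u,v) = \int_0^1 u'v'\,dx$. It uses the identity $\|u - v_h\|_a^2 = a(u,u) - 2a(u,v_h) + a(v_h,v_h)$ together with $a(u, v_h) = \langle f, v_h\rangle$, so that the exact solution enters only through the number $a(u,u) = 2/\pi^2$.

```python
import math

# Stiffness matrix and load vector for the assumed basis x, x^2.
K = [[1.0, 1.0], [1.0, 4.0 / 3.0]]
F = [4.0 / math.pi ** 2, 8.0 / math.pi ** 2 - 16.0 / math.pi ** 3]

# a(u, u) for the exact solution u = (4/pi^2) sin(pi x / 2).
a_uu = 2.0 / math.pi ** 2

def energy_error_sq(d):
    """||u - v_h||_a^2 for v_h = d[0]*x + d[1]*x^2."""
    quad = sum(d[i] * K[i][j] * d[j] for i in range(2) for j in range(2))
    lin = sum(F[j] * d[j] for j in range(2))   # a(u, v_h) = <f, v_h>
    return a_uu - 2.0 * lin + quad

# Galerkin coefficients from K c = F (Cramer's rule).
det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
c = [(F[0] * K[1][1] - K[0][1] * F[1]) / det,
     (K[0][0] * F[1] - K[1][0] * F[0]) / det]

best = energy_error_sq(c)

# Energy error of nearby members of V^h: every one should be larger.
perturbed = [energy_error_sq([c[0] + dx, c[1] + dy])
             for dx in (-0.1, 0.0, 0.1)
             for dy in (-0.1, 0.0, 0.1)
             if (dx, dy) != (0.0, 0.0)]
```

Every perturbed coefficient vector gives a strictly larger energy error than the Galerkin coefficients, as the projection argument predicts.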
$$\lim_{h\to 0}\|u_h - u\|_V = 0. \tag{10.16}$$
(10.17)
10.2 Properties of Galerkin approximations 349
The last term on the right-hand side is zero by (10.12) so, using the conti-
nuity of a,
$$u_h = \sum_{k=1}^{N} c_k\phi_k, \tag{10.19}$$
where $\phi_k$ is any basis of $V^h$, we can determine the coefficients $c_k$ from the
fact that
for a given function $u$. That is, we solve for $c_j$ the $N$ simultaneous equations
$$\sum_{j=1}^{N} c_j\phi_j(x_k) = u(x_k), \qquad k = 1, \ldots, N.$$
FIGURE 10.2. A function and its interpolate in the space spanned by (10.22); the horizontal axis is marked at 0, 1/3, 2/3 and 1
(10.21)
is a projection operator.
Example
Cl . ~ + C2 . ~ u( ~)
Cl . ~ + C2 . ~ u( ~ ).
Solving, we have
(10.23)
(10.24)
10.3 Other methods of approximation 351
$$\|u - u_h\|_V \le \frac{cM}{\alpha}\,h^{\beta}.$$
(10.25)
(10.26)
using the V-ellipticity and continuity of the bilinear form, we find that this
reduces to
Au f in n (10.27)
Bau =
Example
4. Consider the problem
$$-u'' + u = f \ \text{in } \Omega = (0,1), \qquad u(0) = u(1) = 0. \tag{10.33}$$
Equation (10.30) reads, for this problem,
$$\int_0^1 (-u'' + u)\,v\,dx = \int_0^1 fv\,dx,$$
$$\int_0^1 (u'v' + uv)\,dx = \int_0^1 fv\,dx$$
$$\int_0^1 (-uv'' + uv)\,dx = \int_0^1 fv\,dx$$
in which
$$\|r(v_h)\|_{L^2}^2 = \int_\Omega (Av_h - f)^2\,dx$$
or
$$\int_\Omega (Au_h - f)\,Av_h\,dx = 0 \tag{10.37}$$
Now suppose for definiteness that $u_h$ and $f$ are smooth enough for the
residual $r(u_h) = Au_h - f$ to belong to $H_0^1(\Omega)$; that is,
(10.38)
The methods of weighted residuals, least squares, and collocation all possess
advantages that make them, in principle at least, viable alternatives to
the Galerkin method. In particular, in all three cases a greater degree of
smoothness is expected of the approximate solution: if $A$ is an operator of
order $2m$, then both weighted residuals and least squares require that $U^h \subset
H^{2m}(\Omega)$, whereas in the case of the collocation method, the assumption
that $Au_h - f \in H_0^1(\Omega)$ will require that $u_h \in H^3(\Omega)$.
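To make the collocation idea concrete, here is a minimal sketch of point collocation for $-u'' + u = f$ on $(0,1)$ with $u(0) = u(1) = 0$, in which the residual $Au_h - f$ is forced to vanish at chosen points. Everything specific is an assumption made for illustration: the load $f = 1$, the smooth basis $\phi_1 = x(1-x)$, $\phi_2 = x^2(1-x)^2$, and the collocation points $1/4$ and $1/2$. The comparison solution $u = 1 - \cosh(x - \tfrac12)/\cosh\tfrac12$ was verified by substitution.

```python
import math

def A_phi1(x):
    # phi_1 = x(1 - x), so -phi_1'' + phi_1 = 2 + x - x^2.
    return 2.0 + x - x * x

def A_phi2(x):
    # phi_2 = x^2 (1 - x)^2, with phi_2'' = 2 - 12x + 12x^2.
    phi2 = x * x * (1.0 - x) ** 2
    return -(2.0 - 12.0 * x + 12.0 * x * x) + phi2

def f(x):
    return 1.0   # hypothetical load

points = [0.25, 0.5]   # hypothetical collocation points

# Collocation equations: sum_j c_j (A phi_j)(x_k) = f(x_k).
M = [[A_phi1(xk), A_phi2(xk)] for xk in points]
rhs = [f(xk) for xk in points]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
c1 = (rhs[0] * M[1][1] - M[0][1] * rhs[1]) / det
c2 = (M[0][0] * rhs[1] - M[1][0] * rhs[0]) / det

def u_h(x):
    return c1 * x * (1.0 - x) + c2 * (x * (1.0 - x)) ** 2

def u_exact(x):
    return 1.0 - math.cosh(x - 0.5) / math.cosh(0.5)

residuals = [c1 * A_phi1(xk) + c2 * A_phi2(xk) - f(xk) for xk in points]
```

Note that the basis functions must be twice differentiable in the classical sense here, which reflects the greater smoothness demanded of the trial space.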
The rationale behind opting for the Petrov-Galerkin procedure rather
than the standard Galerkin method is perhaps less clear, since the advantage
of greater smoothness is not present. However, there are various situations
in which the Petrov-Galerkin method provides approximations of far
superior quality. This is particularly true of convection-diffusion problems,
of the form
$$\int_0^1 (-u_hv_h'' + u_hv_h)\,dx = \int_0^1 fv_h\,dx$$
for all $v_h \in V^h \subset H_0^2(0, 1)$. So, for example, we could choose as a basis
for $U^h$ a set of piecewise-constant functions whereas for $V^h$ we need a
basis of functions that are at least continuously differentiable and which,
together with their first derivatives, vanish on the boundary.
In Exercise 10.10 the methods introduced here are applied to a simple
problem.
10.5 Exercises
The Galerkin method
10.1. The BVP $(xu')' = x$ in $\Omega = (1,2)$, $u(1) = u(2) = 0$, has the exact
solution $u(x) = \frac14x^2 - (3\ln x/4\ln 2) - \frac14$.
Use the Galerkin method to
find an approximate solution $u_h$ in the subspace of $H_0^1(1, 2)$ spanned
by $\phi_1(x) = (x - 1)(x - 2)$ and $\phi_2(x) = x(x - 1)(x - 2)$. Compare the
exact and approximate solutions by (a) sketching graphs of $u$ and $u_h$;
(b) evaluating the errors $\|e\|_{L^\infty}$, $\|e\|_{L^2}$, and $\|e\|_{H^1}$, where $e = u - u_h$.
10.2. Use the Galerkin method with basis functions ct>1 (x, y) = (-x 2 +
x)( _y2 + y) and ct>2(X, y) = (x 3 - ~X2 + ~x)(y3 - h
2 + ~y) to solve
the BVP
xy on 0 = (0,1) x (0,1),
U o on r.
that is, the error in the energy norm equals the error in the energy,
and therefore $\|u_h\|_a \le \|u\|_a$.
10.6. If $u$ minimizes the functional $J : V \to \mathbb{R}$ given by
$$P_hu = u_h = \sum_{k=1}^{N} c_k\phi_k(x),$$
$$A^*Au_h = A^*f \ \text{in } \Omega,$$
plus appropriate boundary conditions. If $A = -\nabla^2$ (the Laplacian),
show that this is equivalent to the problem
$$\nabla^4u_h = -\nabla^2f,$$
continuity requirements, which in turn lead to bases that are quite different
from those used in second-order problems.
The elements discussed in Sections 11.2 to 11.4 are all polygonal (in two
dimensions). Domains with curved boundaries can, however, be accommodated
by resorting to the use of isoparametric elements; exactly how this
is done is the subject of Section 11.5.
The final section of this chapter deals with an issue that is central to all
finite element computations, viz. numerical integration. For general problems,
involving inhomogeneous media, say, or for domains that are approximated
using isoparametric elements, the integrals that arise cannot be
evaluated exactly. For this reason it makes sense to resort to methods that
allow the integrals to be evaluated approximately, although with a degree
of accuracy that can be estimated and therefore improved upon if desired.
(11.2)
If $\{N_i\}_{i=1}^{N}$ is a basis for $V^h$, then we have seen in Chapter 10 that expansion
of $u_h$ and $v_h$ in terms of this basis and substitution in (11.2) leads to the
set of simultaneous linear equations
(11.3)
where, as before,
(11.4)
11.1 The finite element method for second-order problems 365
FIGURE 11.1. A polygonal domain in $\mathbb{R}^2$ and its subdivision into finite elements
$$\Omega_e \cap \Omega_f = \emptyset \ \text{for } e \ne f, \qquad \bigcup_{e=1}^{E}\bar\Omega_e = \bar\Omega.$$
To avoid complicating matters unnecessarily, we assume that the domain $\Omega$
is polygonal if it is a subset of $\mathbb{R}^2$. That is, the boundary $\Gamma$ of $\Omega$ is made up
of straight segments. Under these conditions, it is easy to see that the entire
domain can be covered exactly by polygonal elements (Figure 11.1). One
more condition is imposed on the subdivision of $\Omega$: it is required that every
side of the boundary of an element in $\mathbb{R}^2$ be either part of the boundary
$\Gamma$, or a side of another element. This condition rules out a situation such
as that shown in Figure 11.2, in which AB is a side of $\Omega_2$ but not of $\Omega_1$.
Nodal points. We next identify certain points called nodes or nodal points
in the subdivided domain; these points play a key role in the finite element
method, as will soon become evident. Nodes are allocated at least at the
vertices of elements, as shown in Figure 11.3(a), but in order to improve
the approximation, further nodes may be introduced, for example, at the
midpoints of the sides of elements as shown in Figure 11.3(b). In any case
there is a total of $G$ nodes, say, which are numbered $1, 2, \ldots, G$ and which
have position vectors $x_1, x_2, \ldots, x_G$. The set of elements and nodes that
366 11. The finite element method
FIGURE 11.3. Finite element meshes comprising elements and nodal points: panels (a) and (b)
Basis functions $N_i$. We are now ready to describe how the finite element
basis functions are formed. In carrying out this procedure it must be borne
in mind that the basis functions define a subspace of $V$, so that they must
be functions in $H^1(\Omega)$ (for second-order problems) that satisfy the essential
boundary conditions. The question of boundary conditions is left aside for
now, and we proceed to construct a set of basis functions with the following
properties.
(i) The functions $N_i$ are bounded and continuous, that is,
$N_i \in C(\bar\Omega)$; (11.5)
(ii) there is a total of $G$ basis functions, that is, one for each node, and
each function $N_i$ is nonzero only on those elements that are connected
to node $i$:
(11.6)
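$$N_i(x_j) = \begin{cases} 1 & \text{if } i = j,\\ 0 & \text{otherwise};\end{cases} \tag{11.7}$$
$$N_i^{(e)}(x_j) = \begin{cases} 1 & \text{if } i = j,\\ 0 & \text{otherwise},\end{cases} \tag{11.9}$$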
$i$ and $j$ running over all nodes in $\bar\Omega_e$. We call $N_i^{(e)}$ a local basis function.
These ideas are illustrated in Figure 11.4. It is not difficult to show that
Conditions (i) and (iv) ensure that the functions $N_i$ belong to $H^1(\Omega)$, as
required. We are thus going to set up basis functions that are piecewise
polynomials, and that have small supports, in that they are nonzero only
in a "small" region. It should be clear that we may regard a typical basis
function $N_i$ as having been built up by patching together the local basis
functions $N_i^{(e)}$ associated with node $i$, as shown in Figure 11.5. To distinguish
the basis functions $N_i$ from the local basis functions $N_i^{(e)}$, we refer
to the former as global basis functions.
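In one dimension the patching-together of local basis functions produces the familiar "hat" functions. The following sketch, with a hypothetical helper `hat` and a hypothetical uniform mesh of three elements on $(0,1)$, builds a global piecewise linear basis function from its two linear pieces and lets one check the nodal property (value 1 at its own node, 0 at every other node) and the small support.

```python
def hat(i, nodes, x):
    """Global piecewise linear basis function N_i on a 1D mesh.

    It is assembled from the linear local function on the element to the
    left of node i and the one on the element to the right; elsewhere it
    vanishes, so its support is at most two elements.
    """
    xi = nodes[i]
    if i > 0 and nodes[i - 1] <= x <= xi:
        return (x - nodes[i - 1]) / (xi - nodes[i - 1])    # left element
    if i < len(nodes) - 1 and xi <= x <= nodes[i + 1]:
        return (nodes[i + 1] - x) / (nodes[i + 1] - xi)    # right element
    return 0.0

nodes = [0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0]   # hypothetical mesh, G = 4 nodes

# Matrix of nodal values N_i(x_j): should be the identity.
nodal_values = [[hat(i, nodes, xj) for xj in nodes] for i in range(len(nodes))]

# The basis functions also sum to one at any point of the domain.
total_at_half = sum(hat(i, nodes, 0.5) for i in range(len(nodes)))
```

Because of the nodal property, the coefficient of $N_i$ in any expansion is simply the value of the function at node $i$.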
FIGURE 11.5. A basis function $N_i$ formed by patching together local basis functions $N_i^{(e)}$
Specific examples are given in the following section, but in the meantime
we observe that if we write
$$v_h(x) = \sum_{i=1}^{G} b_iN_i(x), \tag{11.10}$$
$$v_h(x_j) = \sum_{i=1}^{G} b_iN_i(x_j) = b_j; \tag{11.11}$$
In the same way we denote by $X^e$ the space spanned by the functions $N_i^{(e)}$.
That is,
$$\int_\Omega F(N_i, N_j)\,dx = \sum_{e=1}^{E}\int_{\Omega_e} F(N_i, N_j)\,dx = \sum_{e=1}^{E}\int_{\Omega_e} F(N_i^{(e)}, N_j^{(e)})\,dx$$
$$\sum_{e=1}^{E}\langle f^{(e)}, N_j^{(e)}\rangle.$$
In other words, the matrix K and vector F in (11.4) can now be expressed
in the form
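$$K = \sum_{e=1}^{E} K^{(e)} \quad\text{and}\quad F = \sum_{e=1}^{E} F^{(e)}, \tag{11.13}$$
in which
$$K_{ij}^{(e)} = a^{(e)}(N_i^{(e)}, N_j^{(e)}) \quad\text{and}\quad F_j^{(e)} = \langle f^{(e)}, N_j^{(e)}\rangle. \tag{11.14}$$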
FIGURE 11.6. The domain $(a, b)$ and its subdivision into elements, with nodes numbered $1, 2, 3, \ldots, E-1, E, E+1$
$$N_{i+1}^{(e)}(x) = \frac{x - x_i}{h_e} \tag{11.15}$$
The reference element. Instead of defining local basis functions for each
FIGURE 11.8. Reference element $\hat\Omega = (-1, 1)$ and the image $\Omega_e = (x_i, x_{i+1})$ under the affine map $F_e$
(11.16)
and (11.17)
or
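$$N_i^{(e)}(x) = \hat N_1(\xi), \qquad N_{i+1}^{(e)}(x) = \hat N_2(\xi), \tag{11.18}$$
in which $x$ and $\xi$ are related through (11.16) (Figure 11.9).
For a piecewise linear basis we thus define
$$\hat N_1(\xi) = \tfrac12(1 - \xi), \qquad \hat N_2(\xi) = \tfrac12(1 + \xi), \tag{11.19}$$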
11.2 One-dimensional problems 373
FIGURE 11.9. Local basis functions defined on the reference element $\hat\Omega$ and on the element $\Omega_e$
and from (11.18) and (11.16) we then recover (11.15) since, for example,
$$N_{i+2}^{(e)}(x) = \hat N_3(\xi).$$
A few typical piecewise quadratic basis functions $N_i$ that result from patching
together the quadratic local basis functions are also shown in Figure
11.10.
Polynomial bases of the form (11.19) and (11.20) are often referred to as
Lagrange bases or families, because of their close association with Lagrange
FIGURE 11.10. Quadratic local basis functions and piecewise quadratic global functions
Example
1. Consider the BVP
$$-u'' + u = \sin\pi x, \quad x \in \Omega = (0,1), \qquad u(0) = u(1) = 0. \tag{11.21}$$
The corresponding VBVP is: find $u \in V = H_0^1(0,1)$ such that
Now recall that $K_{ij}^{(e)} = 0$ if either node $i$ or node $j$ does not belong to
$\bar\Omega_e$. Hence the only nonzero contributions that have to be calculated
are $K_{22}^{(1)}$, $K_{22}^{(2)}$, $K_{23}^{(2)}$, $K_{33}^{(2)}$, $K_{33}^{(3)}$ (note that $K_{ij}$ is symmetric).
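$$K_{22}^{(1)} = \int_{\Omega_1}\left(\frac{dN_2^{(1)}}{dx}\frac{dN_2^{(1)}}{dx} + N_2^{(1)}N_2^{(1)}\right)dx = \int_{\hat\Omega}\left(\frac{d\hat N_2}{d\xi}\frac{d\hat N_2}{d\xi}\left(\frac{2}{h}\right)^2 + \hat N_2\hat N_2\right)\frac{h}{2}\,d\xi$$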
(11.22)
(11.23)
{1/3 (2/3
=}o N~~) NP) dX, + }1/3 (N~2) + N~2»)NF) d~
F,(l)
.
Fi(2)
PP)
$$\frac{\sqrt3}{36}(1, 5, 5, 1).$$
$$K_{22}c_2 + K_{23}c_3 = F_2,$$
$$K_{32}c_2 + K_{33}c_3 = F_3,$$
which gives $c_2 = c_3 = 0.0734$. Hence the approximate solution is
We compare this with the exact solution $u(x) = (1 + \pi^2)^{-1}\sin\pi x$
in Figure 11.13 and see that the finite element solution is a fair approximation
of the exact solution, given the very small subspace $V^h$
that has been used. Furthermore, the approximate solution, like the
exact, is symmetric about $x = \frac12$. The approximate solution could
of course be improved in two ways: by subdividing the domain into
a larger number of elements, and by using a higher-order element,
such as the quadratic element. Of course, either of these refinements
will result in a greater amount of computational work, which must
be taken into consideration. It is clearly of interest to know beforehand
by how much an approximate solution will improve as a result
of either of the two refinements mentioned, so that one may decide
whether the refinement is worth the effort. This is a question that is
addressed in the following chapter. In the next section we apply the
ideas developed here to second-order problems defined on domains $\Omega$
in $\mathbb{R}^2$.
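The whole of this example fits in a short script. The sketch below assumes three equal linear elements on $(0,1)$ and assembles the global matrix for $-u'' + u = \sin\pi x$ element by element. One further assumption is made about the load: it is computed from the piecewise linear interpolate of $\sin\pi x$, which reproduces the vector $\frac{\sqrt3}{36}(1, 5, 5, 1)$ quoted above. The dense solver `solve` is a hypothetical helper.

```python
import math

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(aug[r][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for r in range(k + 1, n):
            m = aug[r][k] / aug[k][k]
            for col in range(k, n + 1):
                aug[r][col] -= m * aug[k][col]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        s = sum(aug[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (aug[k][n] - s) / aug[k][k]
    return x

E = 3                       # number of elements
h = 1.0 / E
nodes = [e * h for e in range(E + 1)]
G = len(nodes)

# Element matrices for a linear element of length h.
stiff = [[1.0 / h, -1.0 / h], [-1.0 / h, 1.0 / h]]   # integral of N_I' N_J'
mass = [[h / 3.0, h / 6.0], [h / 6.0, h / 3.0]]      # integral of N_I N_J

# Assembly: K = sum of K^(e), placed according to the element's node numbers.
K = [[0.0] * G for _ in range(G)]
M = [[0.0] * G for _ in range(G)]
for e in range(E):
    dofs = (e, e + 1)
    for I in range(2):
        for J in range(2):
            K[dofs[I]][dofs[J]] += stiff[I][J] + mass[I][J]
            M[dofs[I]][dofs[J]] += mass[I][J]

# Load vector from the interpolated right-hand side (an assumption).
f_nodal = [math.sin(math.pi * x) for x in nodes]
F = [sum(M[i][j] * f_nodal[j] for j in range(G)) for i in range(G)]

# Essential boundary conditions: drop the two boundary nodes.
interior = range(1, G - 1)
K_red = [[K[i][j] for j in interior] for i in interior]
F_red = [F[i] for i in interior]
c = solve(K_red, F_red)

exact_at_node = math.sin(math.pi / 3.0) / (1.0 + math.pi ** 2)
```

The two interior coefficients come out equal, reflecting the symmetry of the problem, and changing `E` refines the mesh without any other modification.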
11.3 Two-dimensional problems 379
$$i \leftrightarrow 1, \qquad j \leftrightarrow 2, \qquad k \leftrightarrow 3$$
between global and local node numbers. The coordinates of each node can
likewise be expressed in global or local form, and we may write $x_I^{(e)}$ ($I =
1,2,3$) for $x_i$, $x_j$, $x_k$. Finally, the local basis functions may also be numbered
locally, in the form $N_I^{(e)}$. Wherever it is necessary to make the distinction,
local quantities are always indexed by upper case letters.
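[Figure: a triangular element with nodes $i$, $j$, $k$ in the $(x, y)$ plane and the reference triangle with local nodes 1, 2, 3 in the $(\xi, \eta)$ plane; the points $(1, 0)$ and $(0, 1)$ are marked on the reference axes]
$$T = \begin{pmatrix} x_2^{(e)} - x_1^{(e)} & x_3^{(e)} - x_1^{(e)} \\ y_2^{(e)} - y_1^{(e)} & y_3^{(e)} - y_1^{(e)} \end{pmatrix} \quad\text{and}$$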
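FIGURE 11.15. Assembly of the global stiffness matrix: the element entries $K_{11}^{(e)}$, $K_{13}^{(e)}$, $K_{12}^{(e)}$ go to row $i$ (columns $i$, $k$, $j$), $K_{33}^{(e)}$, $K_{32}^{(e)}$ to row $k$ (columns $k$, $j$), and $K_{22}^{(e)}$ to row $j$ (column $j$)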
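The assembly step illustrated in the figure amounts to a scatter-add driven by the correspondence between local and global node numbers. The sketch below is a hypothetical illustration: the helper `assemble`, the four-node mesh (a unit square split into two triangles) and the node ordering are all assumptions. The element matrix used is the standard stiffness matrix of the Laplacian for a linear triangle with a right angle at its first local node and legs of unit length.

```python
def assemble(n_nodes, connectivity, element_matrices):
    """Add each element matrix into the global matrix.

    connectivity[e][I] is the global number of local node I of element e.
    """
    K = [[0.0] * n_nodes for _ in range(n_nodes)]
    for nodes, Ke in zip(connectivity, element_matrices):
        for I, gi in enumerate(nodes):
            for J, gj in enumerate(nodes):
                K[gi][gj] += Ke[I][J]
    return K

# Linear triangle, Laplacian, right angle at local node 1, unit legs.
KE_RIGHT = [[1.0, -0.5, -0.5],
            [-0.5, 0.5, 0.0],
            [-0.5, 0.0, 0.5]]

# Unit square with nodes 0:(0,0), 1:(1,0), 2:(1,1), 3:(0,1), cut along 0-2.
# Each tuple lists the right-angle vertex first.
connectivity = [(1, 2, 0), (3, 0, 2)]

K = assemble(4, connectivity, [KE_RIGHT, KE_RIGHT])
row_sums = [sum(row) for row in K]
```

Nodes 0 and 2 are shared by both triangles, so their diagonal entries receive two contributions, while the entry coupling them is zero for this particular mesh.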
the function $\hat N_1$ is shown in Figure 11.16, whereas Figure 11.4 shows the
images of $\hat N_1$ on the elements attached to node $i$, and the basis function
$N_i(x)$ that results from patching together all local basis functions associated
with node $i$. Clearly the condition (11.9) is satisfied. The basis function
$N_i$ formed by patching together all the local functions $N_i^{(e)}$ associated with
node $i$ is the two-dimensional counterpart of the "hat" function in one dimension,
and is pyramidal in shape. Naturally $N_i$ is piecewise linear, and
is nonzero only on those elements that have node $i$ as a node.
Piecewise quadratic triangular elements are obtained by adding a further
three nodes to an element, at the midpoints of the sides, as in Figure 11.17.
The most general function in $P_2(\hat\Omega)$ has the form $f(\xi, \eta) =$
k = 0:   1
k = 1:   x   y
k = 2:   x^2   xy   y^2
k = 3:   x^3   x^2 y   xy^2   y^3
k = 4:   x^4   x^3 y   x^2 y^2   xy^3   y^4
[Figure: reference square element with local nodes 1, 2, 3, 4 in the $(\xi, \eta)$ plane]
$$x = F_e\boldsymbol{\xi} \equiv T\boldsymbol{\xi} + b,$$
in which the matrix $T$ is given by
$$T = \frac12\begin{pmatrix} x_2^{(e)} - x_1^{(e)} & y_2^{(e)} - y_1^{(e)} \\ x_4^{(e)} - x_1^{(e)} & y_4^{(e)} - y_1^{(e)} \end{pmatrix} \tag{11.26}$$
and $b$ is the position vector of the centroid of the rectangle $\bar\Omega_e$. Since affine
maps take straight lines to straight lines, it is worth noting that the most
general such map would transform the reference element in Figure 11.19 to
a parallelogram, so that parallelograms could be as easily accommodated.
Next, we set up bilinear local basis functions on $\hat\Omega$ that satisfy (11.9); just
as the reference element $(-1, 1) \times (-1, 1)$ may be regarded as the Cartesian
product of the one-dimensional reference element, in the same way local
basis functions satisfying (11.9) may be generated from products of the
functions (11.19). Thus we obtain
(11.27)
where $(\xi_i, \eta_i)$ are the coordinates of node $i$ on the reference element; in full
this reads
$$\hat N_1(\boldsymbol{\xi}) = \tfrac14(1 - \xi)(1 - \eta), \qquad \hat N_2(\boldsymbol{\xi}) = \tfrac14(1 + \xi)(1 - \eta),$$
$$\hat N_3(\boldsymbol{\xi}) = \tfrac14(1 + \xi)(1 + \eta), \qquad \hat N_4(\boldsymbol{\xi}) = \tfrac14(1 - \xi)(1 + \eta).$$
Then the functions $N_I^{(e)}$ are obtained by setting $N_I^{(e)}(x) = \hat N_I(\boldsymbol{\xi})$, with $x$
and $\boldsymbol{\xi}$ being related through (11.25). As in the case of triangular elements,
the positioning of the nodes and the choice of local basis functions ensures
that the basis functions $N_i$ will be continuous across element boundaries,
as shown in Figure 11.20. Higher-order approximations on rectangular elements
may be generated by once again appealing to Pascal's triangle.
Figure 11.21 shows the triangle, on which are marked the four terms that
give the bilinear approximation. By extending the diamond-shaped pattern
associated with the bilinear approximation, we arrive at a biquadratic approximation
that contains nine terms, so that nine nodes are required, as
shown in Figure 11.22. The local basis $\{N_I^{(e)}\}_{I=1}^{9}$ on $\bar\Omega_e$ spans $Q_2$; these
functions are again most conveniently found by constructing basis functions
on $\hat\Omega$, and then using the relationship $\hat N_I(\boldsymbol{\xi}) = N_I^{(e)}(x)$. The nine functions
on $\hat\Omega$ may be generated from products of the one-dimensional quadratic
FIGURE 11.22. A biquadratic local basis function
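The bilinear local basis functions on the reference square are easily coded in the product form $\frac14(1 + \xi\xi_I)(1 + \eta\eta_I)$, which reproduces the four functions written out in full above. The sketch below, with hypothetical names `NODES` and `N_hat`, checks the nodal property and the fact that the four functions sum to one.

```python
# Local nodes 1..4 of the reference square (-1, 1) x (-1, 1),
# stored with zero-based indices 0..3.
NODES = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]

def N_hat(I, xi, eta):
    """Bilinear local basis function attached to local node I (zero-based)."""
    xi_I, eta_I = NODES[I]
    return 0.25 * (1.0 + xi * xi_I) * (1.0 + eta * eta_I)

# Values of each basis function at each node: should be the identity matrix.
nodal_values = [[N_hat(I, *NODES[J]) for J in range(4)] for I in range(4)]

# Partition of unity at an arbitrary interior point.
total = sum(N_hat(I, 0.3, -0.7) for I in range(4))
```

A biquadratic basis can be generated in the same way from products of the three one-dimensional quadratic functions, giving nine functions in all.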
The basis functions $N_i$ formed by patching together the functions $N_i^{(e)}$ associated
with global node $i$ are piecewise biquadratic polynomials that are
continuous across interelement boundaries.
This concludes the discussion on elements for second-order problems in
two dimensions. We now work through a simple example involving rectangular
elements.
Example
We divide the domain into four square elements and choose for $X^h$
the space of piecewise bilinear functions, so that nodes are required
at the corners of elements only (see Figure 11.23).
Next we construct $V^h$. From (11.12) it is required that
Node and element layout:
9        8        7
   Ω4       Ω3
2        1        6
   Ω1       Ω2
3        4        5
FIGURE 11.23. The domain and finite element mesh for Example 2; the boundary labels are $\Gamma_1$ (top and right) and $\Gamma_2$ (bottom)
Now we require
Since all elements have the same geometry, the amount of computational
work can be reduced considerably by observing that many
of the integrals have the same value. Indeed, if nodes $i$ and $j$ both
belong to $\bar\Omega_e$, then a moment's thought will convince us that
1= J,
if I, J are adjacent,
I, J otherwise,
(1l.28)
have
(11.29)
using the chain rule and the rule $dx\,dy = |j|\,d\xi\,d\eta$ for changing variables
in area integrals; here $j = \det T$, where $T$ is given by (11.26).
For this element,
$$T = \frac14\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \quad\text{and}\quad T^{-1} = \begin{pmatrix} -4 & 0 \\ 0 & -4 \end{pmatrix}.$$
Also, $j = \frac{1}{16}$.
With all of this available and with the use of (11.27)
we can now evaluate (11.29), to obtain
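$$K_{11}^{(1)} = \int_{-1}^{1}\!\int_{-1}^{1}\left(\left[\tfrac14(\eta - 1)(-4)\right]^2 + \left[\tfrac14(\xi - 1)(-4)\right]^2\right)\frac{1}{16}\,d\xi\,d\eta = \frac23.$$
Similarly, $K_{12}^{(1)} = -\frac16$, $K_{13}^{(1)} = -\frac13$. Using (11.28) and (11.29) we get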
$$6K = \sum_{e=1}^{4} 6K^{(e)} = \begin{pmatrix} 16 & -2 & -2 & -2 \\ & 8 & -1 & -2 \\ & & 4 & -1 \\ \text{sym} & & & 8 \end{pmatrix}.$$
JJ;Oe N(l)
1
N(l)
1
dxdy if 1= J,
I~
Oe
Oe
N(l)
1
1
N,(l) dxdy
2
otherwise,
1/24
{ 1/72
1/144.
$$c = \begin{pmatrix} 0.1585 \\ 0.2568 \\ 0.4213 \\ 0.2568 \end{pmatrix}.$$
These two solutions are compared in Figure 11.24, where we see that
the approximation is quite favorable, notwithstanding the relatively
crude mesh.
3. Suppose that we wish to find an approximate solution, using finite
elements, to a two-dimensional problem in linear elasticity. The
variational problem takes the form (9.25), and it is assumed that
$u = u_1(x, y)e_1 + u_2(x, y)e_2$, so that the only nonzero components
of the strain are, from (8.7), $\epsilon_{11}$, $\epsilon_{12} = \epsilon_{21}$, and $\epsilon_{22}$. It is assumed
furthermore that the material is isotropic, so that the bilinear form
is given by (see Exercise 9.10)
(11.30)
[le
e
which is easily evaluated with the aid of expressions for $\hat N_I$ and the
affine maps (11.24) or (11.25). In Section 11.5 it is shown that this
transformation can be carried out without explicit inversion of the
map between the reference and the actual element. If the matrix thus
transformed is denoted as $B$, then we have finally (cf. (11.29))
$$\frac{d^4w}{dx^4} = \frac{f}{EI} \ \text{on } (0, L), \qquad w(0) = w'(0) = 0, \quad w(L) = w'(L) = 0. \tag{11.31}$$
The space of admissible functions is $V = H_0^2(\Omega)$, and the VBVP is: find
$w \in V$ such that
$$\int_0^L w''v''\,dx = \int_0^L (f/EI)\,v\,dx \quad\text{for all } v \in V.$$
It is clear then that the space $V^h$ must comprise functions whose second
derivatives exist, at least in a weak sense. By analogy with the set of
conditions (11.5) through (11.8) for second-order problems, we therefore
stipulate that the basis functions of $V^h$ must satisfy the following properties.
(i) The global basis functions comprise two sets, denoted by $N_i$ ($i =
1, \ldots, G$) and $M_j$ ($j = 1, \ldots, K$); these functions are bounded and
continuously differentiable, that is,
$$N_i(x)|_{\Omega_e} \equiv 0 \ \text{and}\ M_i(x)|_{\Omega_e} \equiv 0 \quad\text{if } x_i \notin \bar\Omega_e;$$
if $i = j$,
if $i \ne j$,
(iv) let $N_i^{(e)}$ and $M_i^{(e)}$ be, respectively, the restrictions of $N_i$ and $M_i$ to
$\Omega_e$; then $N_i^{(e)}$ and $M_i^{(e)}$ are polynomials.
This time it is clear from (iii) and (iv) that the local basis function $N_i^{(e)}$
defined on element $\Omega_e$ will have the properties
$$N_i^{(e)}(x_j) = \begin{cases} 1 & \text{if } i = j,\\ 0 & \text{otherwise},\end{cases} \qquad \frac{dN_i^{(e)}}{dx}(x_j) = 0 \ \text{at all nodal points } x_j,$$
$$v_h = \sum_{i=1}^{G} b_iN_i + \sum_{j=1}^{K} d_jM_j,$$
whereas
$$v_h'(x_j) = \sum_{i=1}^{G} b_iN_i'(x_j) + \sum_{i=1}^{K} d_iM_i'(x_j) = d_j.$$
So both the value of a function and its derivative are interpolated by Hermite
basis functions.
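The way Hermite basis functions interpolate both values and slopes can be seen on a single element. The sketch below uses the standard cubic Hermite polynomials on the unit interval; since the explicit formulas are not reproduced here, treat these four cubics, and the helper names, as assumptions. Two of the functions carry nodal values, the other two carry nodal derivatives, scaled by the element length.

```python
def hermite_basis(s):
    """Cubic Hermite functions on 0 <= s <= 1, returned as (N1, M1, N2, M2).

    N1, N2 have unit value at s = 0, 1 respectively and zero slope at both
    ends; M1, M2 have unit slope at s = 0, 1 and zero value at both ends.
    """
    N1 = 1.0 - 3.0 * s ** 2 + 2.0 * s ** 3
    M1 = s - 2.0 * s ** 2 + s ** 3
    N2 = 3.0 * s ** 2 - 2.0 * s ** 3
    M2 = -s ** 2 + s ** 3
    return N1, M1, N2, M2

def hermite_interpolant(x, x0, x1, v0, d0, v1, d1):
    """Cubic on [x0, x1] matching values v0, v1 and derivatives d0, d1."""
    h = x1 - x0
    s = (x - x0) / h
    N1, M1, N2, M2 = hermite_basis(s)
    return v0 * N1 + h * d0 * M1 + v1 * N2 + h * d1 * M2

# Hypothetical test function: a cubic, which the interpolant must reproduce.
p = lambda x: x ** 3 - 2.0 * x + 1.0
dp = lambda x: 3.0 * x ** 2 - 2.0

x0, x1 = 1.0, 3.0
value_mid = hermite_interpolant(2.0, x0, x1, p(x0), dp(x0), p(x1), dp(x1))
```

Because four conditions determine a cubic uniquely, the interpolant coincides with any cubic from which the nodal data are sampled.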
The space X h is now simply defined by
Example
4. Returning to a one-dimensional problem such as (11.30) for the elastic
beam, suppose we try the simplest mesh, consisting of a set of elements
with nodes only at the interelement boundaries (Figure 11.25).
The restriction of the local basis functions to element $\Omega_e$ is required
to be polynomials whose values and slopes at the nodes are uniquely
11.4 Fourth-order problems and Hermite families of elements 395
[Figure: a one-dimensional mesh with nodes $i - 1$, $i$, $i + 1$]
$p(x_I)$, $p_x(x_I)$, $p_y(x_I)$, $p_{xx}(x_I)$, $p_{xy}(x_I)$, $p_{yy}(x_I)$ at the vertices ($I = 1, 2, 3$);
the normal derivative $p_\nu$ at the midpoints of the sides.
FIGURE 11.28. A curvilinear triangular element, with nodes numbered 1 to 6
vertex nodes, and f at the midside node. Thus the normal derivative is
uniquely determined at the interelement boundary, and so $V^h \subset C^1(\bar\Omega)$.
The six nodes $x_I^{(e)}$ of $\bar\Omega_e$ are the images of the six nodes of the reference
element, as shown in Figure 11.28, and as may be deduced by using the
properties of the local basis functions, and the sides of the reference element
are mapped to curves which are described by quadratic polynomials in $\xi$
and $\eta$. In this way we have used the basis functions to generate an element
with curved sides: this is known as an isoparametric element, and a mesh
of elements generated in this way is known as an isoparametric mesh. The
local basis functions on $\bar\Omega_e$ are generated in the usual way, by using the
relationship
11.5 Isoparametric elements 399
in which $x$ and $\xi$ are related through (11.32). In this way much of the
process developed for affine elements carries over to this more general case.
What must be recognized, though, is that the basis functions $N_I^{(e)}$ no longer
inherit the polynomial structure of the functions $\hat N_I$, for the simple reason
that the map $F_e$ is no longer affine. The manner in which computations are
carried out on the reference element is best illustrated through a concrete
example.
Example
$$K_{IJ}^{(e)} = \int_{\Omega_e}\nabla N_I^{(e)}\cdot\nabla N_J^{(e)}\,dx, \tag{11.33}$$
and is obtained from (11.32). This plays a key role in the evaluation
of (11.33) on the reference element, as does its determinant, which is
denoted by $j$:
$$j = \det J.$$
(11.34)
Now considering that the aim is to evaluate these terms on the reference
element, it follows that we have to transform the vectors $B_I$.
We have
$$\frac{\partial N_I^{(e)}}{\partial x_j} = \sum_{i=1}^{2}\frac{\partial\hat N_I}{\partial\xi_i}\frac{\partial\xi_i}{\partial x_j} \tag{11.35}$$
$$J^{-1}_{kl}(x) = \frac{\partial\xi_k}{\partial x_l},$$
given by
10 (j-2)B~J-l J- T BJ j ~ d'f/
1o(j-l)B~J-IJ-tBJ ~d'f/. (11.36)
Oe
(~
(
y
T)
4
X = LxINI(e)
I=l
(11.37)
The main aim then is to have available a systematic means of choosing the
sampling points and weights in such a way as to be able to minimize the
error $|\int_{\hat\Omega} f(\xi)\,d\xi - I_r(f)|$ for an integration scheme of given order, where
$I_r(f)$ represents the right-hand side of (11.37). The choice is usually carried
out in such a way that the integration scheme is exact for polynomials of
a given degree.
Order   $\xi_l$            $w_l$
1       $0$                $2$
2       $-1/\sqrt3$        $1$
        $1/\sqrt3$         $1$
3       $-\sqrt{3/5}$      $5/9$
        $0$                $8/9$
        $\sqrt{3/5}$       $5/9$
The weights and sampling points corresponding to Gauss quadrature are
chosen in an optimal fashion, so as to integrate exactly polynomials of
as high a degree as possible. Thus a polynomial of order $2r$ is integrated
exactly by a Gauss rule of order $r + 1$. Alternatively, a rule of order $r$
integrates exactly a polynomial of order $2r - 1$.
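The exactness claims can be checked directly from the tabulated points and weights. The sketch below stores the rules of order 1, 2 and 3 on $(-1, 1)$ in a hypothetical dictionary `GAUSS` and applies them to monomials; the test integrands are illustrative choices.

```python
import math

# Sampling points and weights on (-1, 1), keyed by the order r of the rule.
GAUSS = {
    1: [(0.0, 2.0)],
    2: [(-1.0 / math.sqrt(3.0), 1.0), (1.0 / math.sqrt(3.0), 1.0)],
    3: [(-math.sqrt(3.0 / 5.0), 5.0 / 9.0),
        (0.0, 8.0 / 9.0),
        (math.sqrt(3.0 / 5.0), 5.0 / 9.0)],
}

def gauss(f, order):
    """Approximate the integral of f over (-1, 1) by the rule of given order."""
    return sum(w * f(xi) for xi, w in GAUSS[order])

cubic = lambda x: 1.0 + x + x ** 2 + x ** 3   # exact integral: 8/3
quartic = lambda x: x ** 4                    # exact integral: 2/5

cubic_order1 = gauss(cubic, 1)      # one point: not exact for a cubic
cubic_order2 = gauss(cubic, 2)      # exact up to degree 2r - 1 = 3
quartic_order2 = gauss(quartic, 2)  # degree 4 exceeds 3: not exact
quartic_order3 = gauss(quartic, 3)  # exact up to degree 5
```

The rule of order 2 integrates the cubic exactly but not the quartic, in line with the degree $2r - 1$ stated above.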
Example
7. We show in this example how the sampling points and weights for the
scheme of order 2 may be obtained. Suppose that an arbitrary cubic
function $f(\xi) = a_0 + a_1\xi + a_2\xi^2 + a_3\xi^3$ is to be integrated exactly
over the interval $(-1, 1)$, using a scheme of order 2. Now
(11.38)
This must hold for all values of $a_0$ and $a_2$, and so it follows that
$w_1 = 1$ and $\xi_1 = 1/\sqrt3$.
$$\sum_{l,m=1}^{r} w_lw_m f(\xi_l, \eta_m).$$
in which $(\bar x, \bar y)$ are the coordinates of the centroid of the triangle (Figure
11.30). Likewise, a rule of order 3 may be defined by
$$\int_{\Omega_e} f(x, y)\,dx\,dy \approx \tfrac13A_e\sum_{l=1}^{3} f(\bar x_l, \bar y_l), \tag{11.40}$$
where $(\bar x_l, \bar y_l)$ ($l = 1, 2, 3$) are the coordinates of the midpoints of the sides
(Figure 11.30). It is not too difficult to show (see Exercise 17) that the rule
of order 1 is exact for polynomials of degree 1, while the rule of order 3 is
exact for polynomials of order 2.
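Both triangle rules are simple to implement and test. The sketch below assumes that the rule of order 1 takes the form $A_e f(\bar x, \bar y)$, with $A_e$ the area of the element and $(\bar x, \bar y)$ its centroid; the helper names and the test triangle with vertices $(0,0)$, $(1,0)$, $(0,1)$ are illustrative choices. On that triangle the exact integrals of $x$, $x^2$ and $xy$ are $1/6$, $1/12$ and $1/24$.

```python
def triangle_area(v):
    """Area of the triangle with vertices v = [(x1, y1), (x2, y2), (x3, y3)]."""
    (x1, y1), (x2, y2), (x3, y3) = v
    return 0.5 * abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))

def centroid_rule(f, v):
    """One-point rule: area times the value of f at the centroid."""
    xc = sum(p[0] for p in v) / 3.0
    yc = sum(p[1] for p in v) / 3.0
    return triangle_area(v) * f(xc, yc)

def midpoint_rule(f, v):
    """Three-point rule: (area / 3) times the sum of f at the side midpoints."""
    mids = [((v[i][0] + v[(i + 1) % 3][0]) / 2.0,
             (v[i][1] + v[(i + 1) % 3][1]) / 2.0) for i in range(3)]
    return triangle_area(v) / 3.0 * sum(f(x, y) for x, y in mids)

tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

lin_c = centroid_rule(lambda x, y: x, tri)          # degree 1: exact
quad_c = centroid_rule(lambda x, y: x * x, tri)     # degree 2: not exact
quad_m = midpoint_rule(lambda x, y: x * x, tri)     # degree 2: exact
cross_m = midpoint_rule(lambda x, y: x * y, tri)    # degree 2: exact
```

The centroid rule misses the integral of $x^2$, while the midpoint rule recovers all quadratic monomials on this triangle.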
11. 7 Bibliographical remarks 405
11.8 Exercises
The finite element method for second order problems
11.1. Assume that the space $X^e$ spanned by local basis functions belongs
to $H^1(\Omega_e)$, and that $X^h \subset C(\bar\Omega)$. Show that $X^h \subset H^1(\Omega)$. [Take
One-dimensional problems
11.3. Rework Example 1 using a mesh of two elements and the quadratic
local basis functions
$$\hat N_1(\xi) = \tfrac12\xi(\xi - 1), \qquad \hat N_2(\xi) = 1 - \xi^2, \qquad \hat N_3(\xi) = \tfrac12\xi(\xi + 1).$$
11.4. Let $X^h$ be the space spanned by piecewise linear functions, that is,
$X_e = P_1(\Omega_e)$, where $\Omega_e \subset \Omega \subset \mathbb{R}$. Let $f$ be any function defined on
$\Omega$, and assume that $f$ can be differentiated as many times as desired.
Let $f_h$ be the interpolate of $f$ in $X^h$. The purpose of this exercise
is to show that the interpolation error $e = f - f_h$ satisfies the error
bound
$$\|e\|_\infty = \max_{0 \le x \le 1} |f(x) - f_h(x)| \le \frac{h^2}{8} \max_{0 \le x \le 1} |f''(x)|,$$
where $h$ is the length of an element. Expand $e(x)$ in a Taylor series
about any point $x$ in $\Omega_e$, that is,
where $x_i$ is one of the nodes of $\Omega$. Assuming that $x_i$ is the node nearer
to $x$, obtain the error estimate.
11.5. Use Exercise 4 to estimate the error $\|f - f_h\|_\infty$ if $f$ is the function
$f(x) = x \sin \pi x$ on the domain $\Omega = (0, 1)$. Compute the actual error
using two, three and four elements, and compare with the estimate.
Plot a log-log graph of error vs. $h$ and plot the three points corresponding
to the three actual errors obtained. Do these points indicate
a quadratic rate of convergence?
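A possible starting point for the computation in Exercise 11.5 is sketched below. It assumes uniform meshes on $(0, 1)$, approximates the maximum norm by dense sampling, and uses the bound $\frac{h^2}{8}\max|f''|$ from Exercise 11.4; all names are illustrative.

```python
import math


def f(x):
    return x * math.sin(math.pi * x)


def f_second(x):
    """Second derivative of f, obtained by differentiating x*sin(pi*x) twice."""
    return 2 * math.pi * math.cos(math.pi * x) - math.pi**2 * x * math.sin(math.pi * x)


def interpolation_error(num_elements, samples_per_element=400):
    """Sampled maximum of |f - f_h| for the piecewise linear interpolant f_h."""
    h = 1.0 / num_elements
    err = 0.0
    for e in range(num_elements):
        x_left = e * h
        f_left, f_right = f(x_left), f(x_left + h)
        for s in range(samples_per_element + 1):
            t = s / samples_per_element
            x = x_left + t * h
            f_h = (1 - t) * f_left + t * f_right   # linear interpolant on the element
            err = max(err, abs(f(x) - f_h))
    return err


# Sampled maximum of |f''| on [0, 1].
max_f_second = max(abs(f_second(i / 4000)) for i in range(4001))

# For each mesh: (actual error, estimate h^2/8 * max|f''|).
results = {
    n: (interpolation_error(n), (1.0 / n) ** 2 / 8 * max_f_second)
    for n in (2, 3, 4)
}
```

The pairs in `results` can then be plotted against $h = 1/n$ on log-log axes to judge the observed rate.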
Two-dimensional elements
11.6. Show that the basis functions $N_i$ obtained by patching together quadratic
local basis functions $N_i^{(e)}$ on triangular elements are continuous.
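$$w(x) = \frac{f L^4}{EI}\left[\frac{1}{24}\left(\frac{x}{L}\right)^4 - \frac{1}{12}\left(\frac{x}{L}\right)^3 + \frac{1}{24}\left(\frac{x}{L}\right)^2\right].$$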
11.10. Prove Theorem 1.
Isoparametric elements
11.11. Prove that the isoparametric map from the reference element to a
parallelogram is necessarily affine.
11.12. Determine the range of values of $d$ for which the quadrilateral element
shown below has a Jacobian determinant which is everywhere
positive.
[Figure: quadrilateral element; labels 1 and $d$]
Numerical integration
11.13. Following the procedure used in Example 7, find the sampling points
and weights corresponding to a Gauss quadrature rule of order 3 on
the reference triangle.
11.14. Rework Example 2 using the method of Example 6, with $2 \times 2$ Gauss
quadrature.
11.15. The purpose of this exercise is to explore the consequences of under-
integration, the process whereby the terms in the stiffness matrix are
obtained by using an integration scheme of a lower order than that
required for exact integration. Consider an element in the form of the
reference square $(-1,1) \times (-1,1)$ (that is, $\Omega_e = \hat{\Omega}$) and suppose that
the bilinear form is that corresponding to the Laplacian operator.
(a) The basis functions (11.27) may be expressed in vectorial form
as
space, since the desired solution will be polluted by this vector. Highly
effective schemes exist for achieving this end.
11.16. Show that the integration rule (11.39) is exact for polynomials of
degree 1, while the rule (11.40) is exact for polynomials of degree 2.
12
Analysis of the finite element method
the interpolation error on a single element. This estimate takes the form of
a bound on the $H^m$-seminorm of $u - u_h$, in terms of geometrical properties
of the element. Then in Section 12.3 error estimates are derived for second-order
problems, in appropriate Sobolev norms. The final section of this
chapter is devoted to a discussion of the modifications that must be made to
the theory in order to accommodate the presence of curved boundaries, and
also to incorporate into the estimates the error due to numerical integration.
(12.1 )
12.1 Affine families of elements
[Figure: elements with small $h_e/\rho_e$ and large $h_e/\rho_e$]
(12.2)
Once a set of affine transformations has been constructed in this way for
each element, we need to focus attention only on the reference element
$\hat{\Omega}$ and the family of transformations $F_1, F_2, \ldots, F_E$, since these provide a
complete description of the mesh.
When two elements $\hat{\Omega}$ and $\Omega_e$ are related to each other by a transformation
of the type (12.1), (12.2), they are said to be affine-equivalent. Also,
a set of finite elements $\Omega_1, \ldots, \Omega_E$ is called an affine family if all elements
are affine-equivalent to a single reference element $\hat{\Omega}$.
It should be clear from the discussion in Section 11.3 that affine maps of
the form (12.1), (12.2) exist in $\mathbb{R}$, and in $\mathbb{R}^2$ from one triangle to another,
and as far as quadrilaterals go, most generally from one parallelogram to another.
Similar results hold in $\mathbb{R}^3$ for tetrahedra and 3-rectangles or "bricks".
We are thus assured that affine maps are always available for the elements
with which we are concerned.
The relative size and shape of an arbitrary element $\Omega_e$ are quantified in
a natural way by defining the constants
and
(12.1).
(12.5)
(12.6)
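To make the constants $h_e$ and $\rho_e$ concrete, here is a small sketch for triangles. The definitions themselves are not legible in this copy, so the code assumes the common convention that $h_e$ is the diameter of the element (its longest side) and $\rho_e$ is the diameter of the largest inscribed circle; both the convention and the function name are assumptions.

```python
import math


def triangle_h_and_rho(verts):
    """Return (h_e, rho_e) for a triangle, under the assumed definitions:
    h_e = longest side, rho_e = diameter of the inscribed circle."""
    (x1, y1), (x2, y2), (x3, y3) = verts
    sides = [math.dist(verts[i], verts[(i + 1) % 3]) for i in range(3)]
    area = abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2
    h_e = max(sides)
    # Inradius = 2 * area / perimeter, so the inscribed diameter is 4 * area / perimeter.
    rho_e = 4 * area / sum(sides)
    return h_e, rho_e


# A well-shaped element and a nearly degenerate ("flat") one.
equilateral = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
flat = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.01)]

ratio_equilateral = triangle_h_and_rho(equilateral)[0] / triangle_h_and_rho(equilateral)[1]
ratio_flat = triangle_h_and_rho(flat)[0] / triangle_h_and_rho(flat)[1]
```

Under these assumptions the ratio $h_e/\rho_e$ equals $\sqrt{3}$ for an equilateral triangle and grows without bound as the element flattens.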
Now suppose that $\{\hat{N}_I\}_{I=1}^{M}$ is a set of local basis functions defined on $\hat{\Omega}$
with the usual property that
$$\hat{N}_I(\xi_J) = \begin{cases} 1 & \text{if } J = I, \\ 0 & \text{otherwise,} \end{cases}$$
for nodal points $\xi_J$. The function $\hat{N}_I$ is a polynomial of degree $k$, say, that
can be mapped to $C(\Omega_e)$ using (12.6):
$$K_e^{-1} \hat{N}_I = N_I^{(e)}.$$
Here $\{N_I^{(e)}\}_{I=1}^{M}$ is the corresponding set of polynomial local basis functions
defined on $\Omega_e$; these functions also have the property that $N_I^{(e)}(x_I) = 1$ and
$N_I^{(e)}(x_J) = 0$ for $J \ne I$ since (12.5) implies that $\hat{N}_I(\xi_J) = N_I^{(e)}(x_J)$ (we
have in fact carried out this transformation for one- and two-dimensional
problems in Sections 11.2 and 11.3).
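As usual, $\{\hat{N}_I\}$ spans a space $\hat{X}$ (of polynomials, in our case) and so
we can construct a projection operator $\hat{\Pi}$ that maps any $\hat{v} \in C(\hat{\Omega})$ to its
interpolate in $\hat{X}$, according to
$$\hat{\Pi} : C(\hat{\Omega}) \to \hat{X}, \qquad \hat{\Pi}\hat{v} = \sum_{I=1}^{M} \hat{v}(\xi_I)\hat{N}_I. \qquad (12.7)$$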
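PROOF. We have
$$\Pi_e v = \sum_{I=1}^{M} v(x_I) N_I^{(e)}$$
by virtue of (12.8). Hence
$$\begin{aligned}
K_e \Pi_e v &= K_e\Big(\sum_{I=1}^{M} \hat{v}(\xi_I) N_I^{(e)}\Big) \\
&= \sum_{I=1}^{M} \hat{v}(\xi_I) K_e N_I^{(e)} \qquad (K_e \text{ is a linear operator}) \\
&= \sum_{I=1}^{M} \hat{v}(\xi_I) \hat{N}_I.
\end{aligned}$$
Similarly,
$$\hat{\Pi}\hat{v} = \hat{v} \quad \text{for any } \hat{v} \in P_k(\hat{\Omega}). \qquad (12.11)$$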
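Property (12.11) is easy to observe in one dimension. The sketch below assumes the reference interval $(-1, 1)$ with nodes $-1, 0, 1$ and the quadratic basis of Exercise 11.3, so that $k = 2$; this choice of element is an illustration, not the general setting of the text.

```python
# Nodes and quadratic local basis functions on the reference interval (-1, 1).
NODES = (-1.0, 0.0, 1.0)
BASIS = (
    lambda xi: 0.5 * xi * (xi - 1),
    lambda xi: 1 - xi**2,
    lambda xi: 0.5 * xi * (xi + 1),
)


def interpolate(v):
    """Return the interpolate of v: the sum over nodes of v(node) times the basis function."""
    nodal_values = [v(x) for x in NODES]
    return lambda xi: sum(c * N(xi) for c, N in zip(nodal_values, BASIS))


def quadratic(xi):
    """A member of P_2; interpolation should reproduce it exactly."""
    return 3 - 2 * xi + 5 * xi**2


def cubic(xi):
    """Not a member of P_2; its interpolate is xi, not xi**3."""
    return xi**3


SAMPLES = [-0.9, -0.5, -0.1, 0.3, 0.5, 0.8]
quadratic_gap = max(abs(interpolate(quadratic)(s) - quadratic(s)) for s in SAMPLES)
cubic_gap = max(abs(interpolate(cubic)(s) - cubic(s)) for s in SAMPLES)
```

The gap vanishes (to rounding) for the quadratic and is of order one for the cubic, which is the behaviour that (12.11) describes.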
The main result in this section is: for $v \in H^{k+1}(\Omega_e)$ and $\Pi_e$ satisfying
the preceding properties, the interpolation error in the $H^m$-norm can be
estimated by
$$|v|^2_{s,\Omega} = \sum_{|\alpha| = s} \int_\Omega [D^\alpha v(x)]^2\,dx$$
(recall also that the Sobolev norm $\|\cdot\|_{s,\Omega}$ is given by $\|v\|^2_{s,\Omega} = \sum_{r=0}^{s} |v|^2_{r,\Omega}$).
Here and subsequently the norm on $H^s(\Omega)$ is denoted by $\|\cdot\|_{s,\Omega}$ rather than
the more cumbersome $\|\cdot\|_{H^s(\Omega)}$. We start the development by recording
an important result that is required later.
This can always be done: set $|\alpha| = k$; then $D^\alpha p$ is a constant multiple of the coefficient of
$x^\alpha$, which can be solved for using (12.14). Having solved for all coefficients
of terms of order $k$, set $|\alpha| = k - 1$, and use (12.14) to solve for coefficients
of terms of order $k - 1$. Proceeding in this way, we find $p$ for any given $v$.
With p = p in (12.13), we have
(12.15)
and
(12.16)
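$$\begin{aligned}
|\hat{v}|^2_{s,\hat{\Omega}} &= \sum_{|\alpha| = s} \int_{\hat{\Omega}} (D^\alpha \hat{v}(\xi))^2\,d\xi \\
&= \sum_{|\alpha| = s} \int_{\Omega_e} (D^\alpha \hat{v}(\xi))^2\,|\det T_e|^{-1}\,dx \qquad (12.17)
\end{aligned}$$
(using the result from multivariable calculus that if $\xi_i = f_i(x_j)$, then $d\xi \equiv
d\xi_1\,d\xi_2 \cdots d\xi_n = |\det(\partial f_i/\partial x_j)|\,dx_1\,dx_2 \cdots dx_n$).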
By an application of the chain rule we have (see Exercise 12.6), for fixed
$x$ and $\xi$,
(12.18)
(since $\xi$ and $x$ are fixed, $D^\alpha \hat{v}(\xi)$ and $D^\alpha v(x)$ are simply real numbers).
Hence (12.17) becomes
$$|\hat{v}|^2_{s,\hat{\Omega}} \le \sum_{|\alpha| = s} \int_{\Omega_e} (D^\alpha v(x))^2\,\|T_e\|^{2s} (\det T_e)^{-1}\,dx$$
from which (12.15) follows, since $\|T_e\|$ and $\det T_e$ are constant. □
We come now to the interpolation error estimate for the seminorm
$|v - \Pi_e v|_{m,\Omega_e}$.
and
Let $\Pi_e$ and $\hat{\Pi}$ be the operators defined in (12.7) and (12.8). Then for any
affine-equivalent element $\Omega_e$ and for all functions $v \in H^{k+1}(\Omega_e)$,
(12.19)
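PROOF. We have, for all $\hat{v} \in H^{k+1}(\hat{\Omega})$ and all $\hat{p} \in P_k(\hat{\Omega})$, and using (12.11),
$$\begin{aligned}
|\hat{v} - \hat{\Pi}\hat{v}|_{m,\hat{\Omega}} \le \|\hat{v} - \hat{\Pi}\hat{v}\|_{m,\hat{\Omega}} &= \|\hat{v} - \hat{\Pi}\hat{v} + \hat{p} - \hat{\Pi}\hat{p}\|_{m,\hat{\Omega}} \\
&\le \|I(\hat{v} + \hat{p}) - \hat{\Pi}(\hat{v} + \hat{p})\|_{m,\hat{\Omega}} \\
&\le \|I(\hat{v} + \hat{p})\|_{m,\hat{\Omega}} + \|\hat{\Pi}(\hat{v} + \hat{p})\|_{m,\hat{\Omega}} \\
&\le (\|I\| + \|\hat{\Pi}\|)\,\|\hat{v} + \hat{p}\|_{k+1,\hat{\Omega}}.
\end{aligned}$$
The last line follows from the fact that $I$ and $\hat{\Pi}$ are bounded operators
from $H^{k+1}(\hat{\Omega})$ to $H^m(\hat{\Omega})$. The use of Theorem 2 now yields
(12.20)
$$\begin{aligned}
|v - \Pi_e v|_{m,\Omega_e} &\le \|T_e^{-1}\|^m\,|\det T_e|^{1/2}\,|K_e(v - \Pi_e v)|_{m,\hat{\Omega}} \\
&= \|T_e^{-1}\|^m\,|\det T_e|^{1/2}\,|\hat{v} - \hat{\Pi}\hat{v}|_{m,\hat{\Omega}}; \qquad (12.22)
\end{aligned}$$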
REMARKS.
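1. Since we wish to evaluate $|v - \Pi_e v|_{m,\Omega_e}$, it follows that both $v$ and $\Pi_e v$
must be in $H^m(\Omega_e)$ for this term to make sense. Equivalently, $\hat{v}$ and
$\hat{\Pi}\hat{v}$ must be in $H^m(\hat{\Omega})$. This accounts for the inclusions $H^{k+1}(\hat{\Omega}) \subset
H^m(\hat{\Omega})$ and $\hat{X} \subset H^m(\hat{\Omega})$. Note that $v \in H^{k+1}(\Omega_e)$ implies $\hat{v} \in
H^{k+1}(\hat{\Omega})$. The inclusion $H^{k+1}(\hat{\Omega}) \subset H^m(\hat{\Omega})$ of course holds if $m \le
k + 1$.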
(i) there exists a constant $\sigma$ such that $h_e/\rho_e \le \sigma$ for all elements;
( 12.24)
12.3 Error estimates for second-order problems
It is not difficult (see Exercise 12.5) to deduce this result, and in partic-
ular to show that it depends on Property (ii) of regular families of finite
elements.
Examples
1. Let $\Omega_e$ be the three-noded triangle in $\mathbb{R}^2$. The space $X_e$ spanned by
the local interpolation functions is $P_1(\Omega_e)$, so that $k = 1$. Assuming
that $v$ is smooth enough to belong to $H^2(\Omega_e)$, (12.24) gives
(12.25)
2. For problems such as those arising in linear elasticity, for which the
unknown variable is vector-valued, the set of results culminating in
(12.25) carries over virtually unchanged. We return to Chapter 9,
Example 3, for which case $V = \{v : v \in [H^1(\Omega)]^2,\ v = 0 \text{ on } \Gamma_1\}$.
This problem is posed on a domain in $\mathbb{R}^2$, so suppose that we make
use of four-noded rectangular elements, generated by a family of affine
maps (11.27) from the reference square.
Now the basis functions (11.27) corresponding to this element are
bilinear; thus the restriction to $\Omega_e$ of any function $v_h \in V^h$ will
belong to $Q_1(\Omega_e)$, and since $P_1 \subset Q_1 \subset P_2$ it follows that the value
of $k$ appropriate to this problem is $k = 1$. If the $H^m$-norm for vector-valued
functions is defined on $\Omega_e$ according to
where $N_i$ are the global basis functions that span $X^h$. As in Section 12.1,
we define a projection operator $\Pi_h$ that maps $v$ to its interpolant $v_h$ or
$\Pi_h v$:
$$\Pi_h : C(\Omega) \to X^h, \qquad \Pi_h v = \sum_{i=1}^{N} v(x_i) N_i. \qquad (12.26)$$
From the way in which the functions $N_i$ are constructed from local basis
functions $N_I^{(e)}$, it should be clear that the restriction of $\Pi_h v$ to any element
$\Omega_e$ is in fact $\Pi_e v$ (Figure 12.5):
(12.27)
THEOREM 5. Assume that all the conditions of Theorem 4 and its corollary
hold. Then there exists a constant c independent of h such that, for any
$v \in H^{k+1}(\Omega)$,
(12.28)
$$\begin{aligned}
&\le \Big(\sum_{e=1}^{E} C^2 h_e^{2(k+1-m)} |u|^2_{k+1,\Omega_e}\Big)^{1/2} \\
&\le C h^{k+1-m} \Big(\sum_{e=1}^{E} |u|^2_{k+1,\Omega_e}\Big)^{1/2} \\
&= C h^{k+1-m} |u|_{k+1,\Omega}.
\end{aligned}$$
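The exponent $k + 1 - m$ can be observed numerically. The following sketch assumes piecewise linear interpolation ($k = 1$) of $u(x) = \sin \pi x$ on uniform meshes of $(0, 1)$, with element integrals evaluated by the three-point Gauss rule; the test function, the meshes, and the function names are illustrative choices, not taken from the text.

```python
import math

# Three-point Gauss rule on (-1, 1).
GAUSS3 = [(-math.sqrt(0.6), 5 / 9), (0.0, 8 / 9), (math.sqrt(0.6), 5 / 9)]


def u(x):
    return math.sin(math.pi * x)


def du(x):
    return math.pi * math.cos(math.pi * x)


def interpolation_errors(num_elements):
    """Return (L2 norm, H1 seminorm) of u minus its piecewise linear interpolant."""
    h = 1.0 / num_elements
    l2_sq = h1_sq = 0.0
    for e in range(num_elements):
        x_left = e * h
        u_left, u_right = u(x_left), u(x_left + h)
        slope = (u_right - u_left) / h
        for xi, w in GAUSS3:
            x = x_left + 0.5 * h * (1 + xi)        # map the Gauss point to the element
            u_h = u_left + slope * (x - x_left)
            l2_sq += 0.5 * h * w * (u(x) - u_h) ** 2
            h1_sq += 0.5 * h * w * (du(x) - slope) ** 2
    return math.sqrt(l2_sq), math.sqrt(h1_sq)


def observed_rates(coarse, fine):
    """Estimate the convergence exponents for m = 0 and m = 1 from two meshes."""
    l2_c, h1_c = interpolation_errors(coarse)
    l2_f, h1_f = interpolation_errors(fine)
    log_ratio = math.log(fine / coarse)
    return math.log(l2_c / l2_f) / log_ratio, math.log(h1_c / h1_f) / log_ratio


rate_m0, rate_m1 = observed_rates(16, 32)
```

With $k = 1$ the estimate predicts rates close to 2 for $m = 0$ and close to 1 for $m = 1$.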
PROOF. From Theorem 2 of Chapter 10, with $v_h = \Pi_h u$ and (12.28) with
$m = 1$ we obtain
(12.31)
for some constant $C_1 > 0$. The finite element theory developed here is
applicable only to polygonal domains (in $\mathbb{R}^2$), but if it is known that the
estimate holds even for such a case, then we may set $r = s + 2$, and since
(12.32)
Example
$\ldots = f$ in $\Omega \subset \mathbb{R}^n$,
$u = 0$ on $\Gamma$.
and this problem has a unique solution. Similarly, the VBVP corresponding
to the approximate solution is: find $u_h \in V^h$ such that
in which the functions $\hat{N}_I$ are quadratic. Also shown in the figure is the
element $\tilde{\Omega}_e$ generated by the affine map $F_e$ from the reference triangle. The
definitions (12.3) and (12.4) of the quantities $h_e$ and $\rho_e$ are retained, but
these refer to the affine element $\tilde{\Omega}_e$, as shown in Figure 12.6. Then under
these conditions a family of isoparametric elements is said to be regular if
12.4 Isoparametric families and numerical integration
Thus a comparison with the definition given in Section 12.2 shows that a
family of regular isoparametric elements has to satisfy the criteria that are
set for affine families, but in addition $\Omega_e$ is required to be not very different
from $\tilde{\Omega}_e$, in the sense of (12.34). Under these conditions it is possible to
prove the following analogue of Theorem 4 and its corollary.
(12.35)
Thus the estimate (12.35) differs from (12.24) (with $k = 3$ there) only in
that the term $|v|_{2,\Omega_e}$ also appears on the right-hand side.
One of the reasons for using isoparametric families is that these permit
the construction of domains with curved boundaries. It is often the case,
though, that the actual curved boundary $\Gamma$ of the domain $\Omega$ cannot be
represented exactly using isoparametric elements. When attempting to arrive
at an error estimate of the kind (12.28) for second-order problems,
therefore, the theory must take account of the fact that the domain $\Omega_h$
which is defined by the finite element mesh may be distinct from $\Omega$. Such
a situation is of course also true in the case of affine families, which would
at best represent a polygonal approximation to a domain with a curved
boundary.
(12.36)
note that the norms are defined here on the domain $\Omega_h$. In going from
(12.35) to (12.36) we also make use of the elementary fact that $|v|_{2,\Omega_e} +
|v|_{3,\Omega_e} \le c\|v\|_{3,\Omega_e}$.
Numerical integration. We consider now the modifications that have
to be made to the standard theory, in the event that numerical integration
procedures of the kind discussed in Section 11.6 are used. Take, for example,
the problem of finding $u \in V = H_0^1(\Omega)$ that satisfies
$$a(u, v) = \langle \ell, v \rangle \quad \text{for all } v \in V, \qquad (12.37)$$
where
$$\langle \ell, v \rangle = \int_\Omega f v\,dx.$$
in which the bilinear form $a_h(u_h, v_h)$ and linear functional $\langle \ell_h, v_h \rangle$ are obtained
by integrating numerically over each element and summing over all
elements. For an integration rule of order $r$, therefore,
(12.42)
$$\int_{\Omega_e} f(x)\,dx\,dy \approx I_1(f) \equiv A_e f(\bar{x}),$$
where $A_e$ is the area of $\Omega_e$ and $\bar{x}$ the location of its centroid. If $X_e =
P_1(\Omega_e)$, where $X_e$ is the space spanned by the local basis functions, then
there exists a constant $\alpha_0$, independent of $h$, such that
$$a_h(v_h, v_h) \ge \alpha_0 \|v_h\|^2_{1,\Omega}.$$
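$$\begin{aligned}
\int_{\Omega_e} k \nabla v_h \cdot \nabla v_h\,dx\,dy &\approx I_1(k \nabla v_h \cdot \nabla v_h) \\
&= A_e k(\bar{x}) \nabla v_h(\bar{x}) \cdot \nabla v_h(\bar{x}) \\
&\ge A_e k_0 |\nabla v_h|^2(\bar{x}) \\
&= k_0 |v_h|^2_{1,\Omega_e}.
\end{aligned}$$
The desired result then follows from summing over all elements, and then
using the Poincaré-Friedrichs inequality (7.34). □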
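The element-level inequality in this argument can be illustrated directly. The sketch below assumes a three-noded triangle, on which the gradient of a linear function is constant, and a hypothetical coefficient $k(x, y) = 1 + x^2 + y^2$ with lower bound $k_0 = 1$; neither the coefficient nor the function names come from the text.

```python
def p1_gradient(verts, nodal_values):
    """Constant gradient of the linear function with the given values at the vertices."""
    (x1, y1), (x2, y2), (x3, y3) = verts
    v1, v2, v3 = nodal_values
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    gx = ((v2 - v1) * (y3 - y1) - (v3 - v1) * (y2 - y1)) / det
    gy = ((x2 - x1) * (v3 - v1) - (x3 - x1) * (v2 - v1)) / det
    return gx, gy


def element_forms(verts, nodal_values, k, k0):
    """Return (a_h(v, v) from the centroid rule, k0 times the squared H1 seminorm)."""
    (x1, y1), (x2, y2), (x3, y3) = verts
    area = abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2
    gx, gy = p1_gradient(verts, nodal_values)
    grad_sq = gx * gx + gy * gy
    xc, yc = (x1 + x2 + x3) / 3, (y1 + y2 + y3) / 3
    a_h = area * k(xc, yc) * grad_sq        # one-point rule at the centroid
    lower = k0 * area * grad_sq             # gradient is constant, so this is exact
    return a_h, lower


def k(x, y):
    """Hypothetical coefficient, bounded below by k0 = 1."""
    return 1 + x * x + y * y


a_h_value, lower_bound = element_forms(
    [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)], (1.0, 3.0, -2.0), k, k0=1.0
)
```

For this element the gradient is $(1, -3)$ and the area is 1, so the lower bound equals 10 while the one-point rule gives $k(\bar{x}) \cdot 10 \approx 15.56$.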
Theorem 8, together with the consistency error estimates, gives the fol-
lowing result.
THEOREM 10. Assume that the conditions of Theorem 8 hold. Then if the
solution $u \in H_0^1(\Omega)$ of the problem (12.37) belongs to $H^2(\Omega)$, and if the
data satisfy (12.37) and
$$k_{ji} = k_{ij}, \quad k_{ij} \in C(\Omega), \quad f \in H^2(\Omega),$$
then there exists a constant $C$ dependent on $u$, $k$, and $f$ but independent
of $h$,
12.5 Bibliographical remarks
12.6 Exercises
Affine families of elements
12.1. Complete the proof of Lemma 1 by showing that $\|T_e^{-1}\| \le \hat{h}/\rho_e$.
Local interpolation error estimates
12.2. Show that $I : H^{k+1}(\hat{\Omega}) \to H^m(\hat{\Omega})$ and $\hat{\Pi} : H^{k+1}(\hat{\Omega}) \to H^m(\hat{\Omega})$
are bounded operators, where $\hat{\Pi}$ is the projection operator defined in
Section 12.1. [Theorem 2 of Chapter 7 is useful when dealing with $\hat{\Pi}$.]
for some a > O. Show that this condition is satisfied if the smallest
angle (Je in an element is bounded below by some eonstantj that is,
[Table; the column headings are figures of three element types]
$\|u - u_h\|_{m,\Omega_e}$      $O(h_e^{2-m})$, $0 \le m \le 2$      ?      ?
Regularity                      $H^2(\Omega_e)$                      ?      ?
12.5. Derive the estimate
$$D^2 v(a, b) = \sum_{i,j=1}^{n} \frac{\partial^2 v}{\partial x_i\,\partial x_j}\,a_i b_j$$
12.8. The theory of Sections 12.1 and 12.2 does not enable us to obtain
optimal error estimates in the $L^2$-norm for second-order problems,
mainly because of the central role played by the inequality
$\|u - u_h\|_V \le C\|u - v_h\|_V$. It is possible, however, to obtain $L^2$ estimates
using what is known as the Aubin-Nitsche method. The method
is outlined in this exercise, for the problem (12.29).
Consider the auxiliary VBVP of finding $w \in V \subset H^1(\Omega)$ such that
and use the continuity of a and the results of Sections 12.1 and 12.2
to obtain the estimate
12.9. Use the result of Exercise 12.7 to obtain $L^2$-estimates of the error for
the problem
$$\nabla^2 u = f \ \text{in } \Omega \subset \mathbb{R}^2, \qquad u = 0 \ \text{on } \Gamma,$$
assuming that $f \in L^2(\Omega)$, and using linear or bilinear elements.
12.10. Suppose that we have to solve a fourth-order BVP defined on $\Omega =
(0, 1)$, and assume that we intend using the Hermite basis functions
described in Section 11.4. Verify that the theory developed in Sections
12.1 to 12.3 remains essentially unchanged except that, for example,
it is necessary to specify that $H^{k+1}(\hat{\Omega}) \subset C^1(\hat{\Omega})$ in Theorem 4. Derive
an estimate of the error in finite element approximations of the
problem
$$\frac{d^4 u}{dx^4} + ku = f \ \text{in } (0, 1), \qquad u(0) = u(1) = 0, \qquad u'(0) = u'(1) = 0,$$
assuming that $f \in L^2(0, 1)$, and using the cubic Hermite functions in
Example 4 of Chapter 11.
12.11. The purpose of this exercise is to derive the estimate (12.41) in Theorem 8.
Use the $V^h$-ellipticity of $a_h$ to obtain the inequality
12.12. Show that the bilinear form $a_h(\cdot, \cdot)$ defined by (12.40) is $V^h$-elliptic,
if $\Omega_e$ is the six-noded quadratic element and the integration rule used
is the three-point rule on the triangle.
References
[1] Adams R.A., Sobolev Spaces. Academic Press (New York) 1975.
[2] Apostol T.M., Mathematical Analysis: A Modern Approach to Ad-
vanced Calculus. Addison-Wesley (Reading, Mass.) 1957.
[3] Babuška I. and Aziz A.K., Survey Lectures on the Mathematical Foundations
of the Finite Element Method, in The Mathematical Foundations
of the Finite Element Method with Applications to Partial Differential
Equations (ed. A.K. Aziz). Academic Press (New York) 1972.
[4] Baiocchi C. and Capelo A., Variational and Quasi- Variational Inequal-
ities. Wiley (New York) 1984.
[5] Becker E.B., Carey G.F. and Oden J.T., Finite Elements, Volume 1:
An Introduction. Prentice-Hall (Englewood Cliffs, N.J.) 1981.
[6] Binmore K.G., Mathematical Analysis: A Straightforward Approach.
Cambridge University Press (Cambridge) 1977.
[7] Binmore K.G., The Foundations of Analysis: A Straightforward Introduction.
Book 2: Topological Ideas. Cambridge University Press (Cambridge)
1981.
[8] Brenner S. and Scott L.R., The Mathematical Theory of Finite Ele-
ment Methods. Springer-Verlag (New York) 1994.
[9] Burnett D.S., Finite Element Analysis. Addison-Wesley (Reading,
Mass.) 1987.
[10] Carey G.F. and Oden J.T., Finite Elements, Vol. 2: A Second Course.
Prentice-Hall (Englewood Cliffs, N.J.) 1983.
[11] Ciarlet P.G., The Finite Element Method for Elliptic Problems. North-Holland
(Amsterdam) 1978.
[12] Ciarlet P.G. and Raviart P.-A., Interpolation theory over curved elements
with applications to finite element methods. Computer Methods
in Applied Mechanics and Engineering 1 (1972) 217-249.
[14] Dhatt G. and Touzot G., The Finite Element Method Displayed. Wiley
(New York) 1984.
[18] Halmos P., Finite Dimensional Vector Spaces. Van Nostrand Reinhold
(New York) 1958.
[19] Hewitt E. and Stromberg K.R., Real and Abstract Analysis: A Modern
Treatment of the Theory of Functions of a Real Variable. Springer-Verlag
(New York) 1965.
[22] Hughes T.J.R., The Finite Element Method: Linear Static and Dynamic
Analysis. Prentice-Hall (Englewood Cliffs, N.J.) 1987.
[25] Kolmogorov A.N. and Fomin S.V., Elements of the Theory of Functions
and Functional Analysis. Volume 1: Metric and Normed Spaces.
Graylock Press (Rochester, N.Y.) 1957.
[26] Kolmogorov A.N. and Fomin S.V., Elements of the Theory of Functions
and Functional Analysis. Volume 2: Measure, Lebesgue Integrals
and Hilbert Space. Academic Press (New York) 1961.
[30] Lions J.L. and Magenes E., Non-Homogeneous Boundary-Value Problems
and Applications, Volume 1. Springer-Verlag (New York) 1972.
[31] Lipschutz S., Set Theory and Related Topics. Schaum Outline Series.
McGraw-Hill (New York) 1964.
[32] Loula A.F.D., Hughes T.J.R. and Franca L.P., Petrov-Galerkin formulations
of the Timoshenko beam problem. Computer Methods in
Applied Mechanics and Engineering 63 (1987) 115-132.
[33] Naylor A.W. and Sell G.R., Linear Operator Theory in Engineering
and Science. Springer-Verlag (Berlin) 1982.
[34] Nečas J., Les Méthodes Directes en Théorie des Équations Elliptiques.
Masson (Paris) 1967.
[37] Oden J.T. and Carey G.F., Finite Elements, Volume 4: Mathematical
Aspects. Prentice-Hall (Englewood Cliffs, N.J.) 1982.
[38] Oden J.T. and Reddy J.N., An Introduction to the Mathematical Theory
of Finite Elements. Wiley (New York) 1976.
[42] Roman P., Some Modern Mathematics for Physicists and Other Outsiders,
Volume 1: Algebra, Topology and Measure Theory. Pergamon
(Oxford) 1975.
[43] Roman P., Some Modern Mathematics for Physicists and Other Outsiders,
Volume 2: Functional Analysis with Applications. Pergamon
(Oxford) 1975.
[45] Rudin W., Real and Complex Analysis. 2nd edition. McGraw-Hill (New
York) 1974.
[47] Schwartz L., Mathematics for the Physical Sciences. Hermann (Paris)
1966.
[48] Showalter R.E., Hilbert Space Methods for Partial Differential Equations.
Pitman (Boston) 1977.
[51] Strang G. and Fix G.J., An Analysis of the Finite Element Method.
Prentice-Hall (Englewood Cliffs, N.J.) 1973.
[53] Zeidler E., Nonlinear Functional Analysis and Its Applications. Volume
IIA: Linear Monotone Operators. Springer-Verlag (Berlin) 1990.
[54] Zeidler E., Applied Functional Analysis: Applications of Mathematical
Physics. Springer-Verlag (Berlin) 1995.
[55] Zeidler E., Applied Functional Analysis: Main Principles and Their
Applications. Springer-Verlag (Berlin) 1995.
[56] Zienkiewicz O.C. and Taylor R.L., The Finite Element Method. Volume 1:
Basic Formulation and Linear Problems. McGraw-Hill (London)
1989.
[57] Zienkiewicz O.C. and Taylor R.L., The Finite Element Method. Vol-
ume 2: Solid and Fluid Mechanics, Dynamics and Nonlinearity.
McGraw-Hill (London) 1991.
Solutions to Exercises
Chapter 1
1.2. $A \cup C = \{1, 2, 9\}$ so $B \times (A \cup C) = \{(7, 1), (7, 2), (7, 9), (8, 1), (8, 2),
(8, 9)\}$; $A \cap C = \{1\}$ so $(A \cap C) \times B = \{(1, 7), (1, 8)\}$.
1.3. Let $x \in A \cap (B \cup C)$. Then $x \in A$ and $x \in B$ or $C$; i.e., $x \in A$ and
$x \in B$, or $x \in A$ and $x \in C$. Hence $x \in (A \cap B) \cup (A \cap C)$. The second
identity is proved in a similar way.
The rationals can be listed by writing down the numbers in the preceding
table in the order shown, omitting those already listed (e.g.,
omit $2/2 = 1$). This then gives a listing of all rationals whose numerator
and denominator add up to 2, then 3, and so on. In this
way all positive rationals are covered. Multiply by $-1$ to get negative
rationals.
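The listing procedure can be written as a short generator. This is a sketch only: the table it refers to is not reproduced here, so the order within each group of fixed numerator-plus-denominator (increasing numerator) is an assumption.

```python
from fractions import Fraction
from itertools import islice


def positive_rationals():
    """Yield every positive rational once, grouped by numerator + denominator = 2, 3, ..."""
    seen = set()
    total = 2
    while True:
        for numerator in range(1, total):
            q = Fraction(numerator, total - numerator)
            if q not in seen:          # omit values already listed, e.g. 2/2 = 1
                seen.add(q)
                yield q
        total += 1


first_ten = list(islice(positive_rationals(), 10))
```

Every positive rational $p/q$ in lowest terms appears by the time the group $p + q$ has been processed, which is why the listing is exhaustive.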
1.8. (i) Not open: $A = \{\pm 1/n\pi,\ n = 1, \ldots\}$ and for every $x \in A$ there is
a nhd $N(x)$ such that $N(x) - \{x\} \not\subset A$. Not closed: $0 \notin A$ is a point
of accumulation. (ii) Neither open nor closed. (iii) Open, not closed
since $\{\pm 1/n\pi\}$ are points of accumulation, but are not in $A$.
1.11. (i) -1,1/2, -1/6, 1/24, ... ; (ii) 1,0,1,0,1 ... ; (iii) -3,6/7,9/13,12/19, ...
1.13. $|(3n + 2)/(n - 1) - 3| = |5/(n - 1)| < 0.001$. Assume $n > 1$, so that
$5 < 0.001(n - 1) \Rightarrow n > 5001$. Take $N = 5001$.
1.16. $y = \inf A \Rightarrow a \le y \le x$ for all $x \in A$ and lower bounds $a$. Thus
$-x \le -y \le -a$ so that $-y$ is the least upper bound of $-A$.
1.18. (i) Let $p = \sup I$; then $x \le p$ for any $x \in I$. Let $J = \{ax : x \in I\}$;
since $a > 0$, $ax \le ap$. Hence $J$ is bounded above by $ap$. Let the
supremum of $J$ be $q$ (we must prove that $q = ap$). Since $ap$ is an
upper bound for $J$ and $q$ is the least upper bound, $q \le ap$. But for
1.21. (a) Not an equivalence relation, but a partial ordering; (b) equivalence
relation, not a partial ordering; (c) neither an equivalence relation
nor a partial ordering; (d) not an equivalence relation, but a partial
ordering.
1.22. {(2,2), (3,3), (4,4), (5,5), (6,6), (2,5), (5,2), (3,6), (6,3)}.
1.23. Take $c \in A_a \cap A_b$. Then $c \sim a$ implies that $a \sim c$. Also, $c \sim b$. Thus
$a \sim b$ by transitivity, and $b \sim a$ by symmetry; hence $a \in A_b$ and
$b \in A_a$. Take any $x \in A_a$: $x \sim a$; hence $x \sim b$, so $x \in A_b$. Thus
$A_a \subset A_b$. Similarly show that $A_b \subset A_a$.
Chapter 2
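2.4. $|f(\mathbf{x}) - f(\bar{\mathbf{x}})| = |(x^2 + 2y) - (\bar{x}^2 + 2\bar{y})| = |(x^2 - \bar{x}^2) + 2(y - \bar{y})| \le
|x^2 - \bar{x}^2| + 2|y - \bar{y}|$. Suppose that $|\mathbf{x} - \bar{\mathbf{x}}| < \delta$; i.e., $(x - \bar{x})^2 + (y - \bar{y})^2 < \delta^2$.
Then $|x^2 - \bar{x}^2| = |x - \bar{x}||x + \bar{x}| < \delta \cdot C$. Also, $|y - \bar{y}| < \delta$. Hence
$|f(\mathbf{x}) - f(\bar{\mathbf{x}})| < (C + 2)\delta$. Set $\delta = \epsilon/(C + 2)$.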
2.5. Set $f(\cdot) = d(\cdot, A)$. Then $|f(x) - f(y)| = |\inf_{z \in A} |x - z| - \inf_{z \in A} |y - z||
\le ||x - y| + \inf |y - z| - \inf |y - z|| = |x - y|$. Given $\epsilon > 0$,
choose $\delta = \epsilon$.
2.6. $|f(x_0) - f(x)| < \epsilon$ whenever $|x_0 - x| < \delta$, i.e., for $x \in (x_0 - \delta, x_0 + \delta)$.
Pick any such $x$: either $0 < f(x_0) - f(x) < \epsilon$ in which case $f(x) >
f(x_0) - \epsilon$ or $0 < f(x) - f(x_0) < \epsilon$ in which case $f(x) < f(x_0) + \epsilon$. For
the first case choose $\epsilon$ smaller than $f(x_0)$ so that $f(x)$ is positive. For
the second case $f(x) > f(x_0) > 0$.
2.7. Assume that $f(a) < 0$, $f(b) > 0$. Since $f(a) < 0$, there is an interval
$[a, c]$ in which $f(x) < 0$. Let the l.u.b. of such points $c$ be $e$; then
$f(e) \le 0$. We cannot have $f(e) < 0$ since we would then be able to
find an interval about $e$ for which $f(x) < 0$, which would imply that $e$
is not a l.u.b. Hence $f(e) = 0$. A similar argument applies if $f(a) > 0$
and $f(b) < 0$.
2.8. (a) $u \in C(-1, 1)$; (b) $u \in C^\infty([0, \pi] \times [0, 1])$; (c) $u \in C^1[0, 1]$.
2.9. $|u(x) - u(y)| = ||x| - |y|| \le |x - y|$ since $|x| = |x - y + y| \le
|x - y| + |y|$.
2.10. Choose $\delta = \epsilon/L$ in the definition of continuity.
2.11. $I = I_Q \cup I'$, where $I_Q$ and $I'$ are the subsets of rationals and irrationals.
$\mu(I') = \mu(I) - \mu(I_Q) = \mu(I)$.
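2.15. $f^+(x) = \begin{cases} 1, & 0 \le x \le 1, \\ 0 & \text{otherwise,} \end{cases}$ \quad $f^-(x) = \begin{cases} 1, & -1 \le x < 0, \\ 0 & \text{otherwise.} \end{cases}$
$\int_{\mathbb{R}} f^+\,dx = \int_{\mathbb{R}} f^-\,dx = 1$, so $\int_{\mathbb{R}} f\,dx = 0$.
$\int_{\mathbb{R}} g^+\,dx = +\infty$, $\int_{\mathbb{R}} g^-\,dx = 1$, so $\int_{\mathbb{R}} g\,dx = +\infty$.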
2.16. Use the fact that $|f| = f^+ + f^-$, and that integrability of $f$ implies
that of $f^+$ and $f^-$. For the converse use $f = f^+ - f^-$. Show that
$-\int f^+ - \int f^- \le \int f^+ - \int f^- \le \int f^+ + \int f^-$.
2.17. (a) ap> -1; (b) ap< -1.
3.14. No.
[Figure: panels $p = 1$, $p = 2$, $p = \infty$]
3.21. $\|x\|_2^2 = x^2 + y^2 = (|x| + |y|)^2 - 2|x||y| \le (|x| + |y|)^2 = \|x\|_1^2$. $\|x\|_1^2 =
x^2 + y^2 + 2|xy| \le 2(x^2 + y^2) = 2\|x\|_2^2$.
3.22. $\int |u^r v^r|\,dx \le \left[\int |u|^{r(p/r)}\right]^{r/p} \left[\int |v|^{r(q/r)}\right]^{r/q}$. Take $r$th roots of both
sides.
4.1. 2.
4.3. $\|u - w\| = \|u - u_n + u_n - w\| \le \|u - u_n\| + \|u_n - w\| < \epsilon + a$. The
inequality follows from the arbitrariness of $a$.
4.6. $\sup |u_n(x)| = 1/2$ at $x = 1/n$. Thus in $[0, 1]$, $u_n(x) \to 0$ pointwise but
$\|u_n - u\|_\infty = 1/2$, so convergence is not uniform. But convergence
is uniform in $(a, 1]$ $(a > 0)$: $\sup |u_n(x)| = na/(1 + n^2 a^2)$ at $x = a$
for $n > 1/a$ (check this by sketching $u_n(x)$) and $\sup |u_n(x)| \to 0$ as
$n \to \infty$.
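The two suprema can be checked by sampling. The sequence itself is not restated in this solution, so the sketch assumes $u_n(x) = nx/(1 + n^2 x^2)$, which is inferred from the expression $na/(1 + n^2 a^2)$ above and is therefore an assumption.

```python
def u(n, x):
    """Assumed sequence u_n(x) = n*x / (1 + n^2 * x^2)."""
    return n * x / (1 + (n * x) ** 2)


def sup_on(n, a, b, samples=20001):
    """Sampled supremum of u_n on [a, b] over a uniform grid containing both endpoints."""
    return max(u(n, a + (b - a) * i / (samples - 1)) for i in range(samples))


# The peak value 1/2 at x = 1/n does not decay, so convergence on [0, 1] is not uniform.
peaks = [u(n, 1.0 / n) for n in (1, 10, 100, 1000)]

# On [a, 1] with n > 1/a the function is decreasing, so the supremum sits at x = a.
a = 0.1
sup_far = {n: sup_on(n, a, 1.0) for n in (20, 200, 2000)}
```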
4.7. $\sup |u_n(x) - u(x)| < \epsilon$ for $n > N$. Hence $\int_a^b |u_n(x) - u(x)|^p\,dx \le
(\sup |u_n(x) - u(x)|)^p \cdot (b - a) < (b - a)\epsilon^p$.
4.8. $\|u\| = 0$ does not imply that $u = 0$; $|||\cdot|||$ is also not a norm.
4.9. $$\|u_n - u_m\|^2_{L^2} = \frac{n}{n+2} - \frac{2mn}{mn + m + n} + \frac{m}{m+2} = \frac{2(m - n)^2}{(m+2)(n+2)(mn + m + n)}.$$
Numerator $(m - n)^2 \le (m + n)^2$. Now show that $\|u_n - u_m\|^2_{L^2} \to 0$ as
$n, m \to \infty$.
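The algebraic identity in this solution can be confirmed in exact rational arithmetic. The sketch below checks only that the three-term expression equals the closed form; it makes no assumption about the sequence $u_n$ itself, and the function names are illustrative.

```python
from fractions import Fraction


def three_term_form(n, m):
    """n/(n+2) - 2mn/(mn+m+n) + m/(m+2), evaluated exactly."""
    return Fraction(n, n + 2) - Fraction(2 * m * n, m * n + m + n) + Fraction(m, m + 2)


def closed_form(n, m):
    """2(m-n)^2 / ((m+2)(n+2)(mn+m+n)), evaluated exactly."""
    return Fraction(2 * (m - n) ** 2, (m + 2) * (n + 2) * (m * n + m + n))


identity_holds = all(
    three_term_form(n, m) == closed_form(n, m)
    for n in range(1, 30)
    for m in range(1, 30)
)
```

Since the numerator grows like $(m + n)^2$ while the denominator grows like $m^2 n^2$, the closed form makes the decay to zero easy to see.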
4.11. $\{u_n\}$ is Cauchy, so $\sup |u_n(x) - u_m(x)| < \epsilon$ for $m, n > N$. For any
$x_0$, $|u_n(x_0) - u_m(x_0)| < \epsilon$, so $\{u_n(x_0)\}$ is a Cauchy sequence of
real numbers. $\mathbb{R}$ is complete, so $u_n(x_0) \to u(x_0)$, say, which defines
a function $u(x)$. The rest of the proof follows easily from the hints
given.
4.13. Assume $\{u_n\}$ convergent: $\|u_n - u\| < \epsilon$ for $n > N$. Also $\|u_m - u\| < \epsilon'$
for $m > N'$. Hence $\|u_n - u_m\| = \|(u_n - u) - (u_m - u)\| \le \|u_n - u\| +
\|u_m - u\| < \epsilon + \epsilon'$ for $n, m > N$ (assume $N > N'$).
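4.14. $\|u_n - u_m\|^2 = \int_{1/2}^{1/2 + 1/n} \left[n\left(x - \tfrac{1}{2}\right) - m\left(x - \tfrac{1}{2}\right)\right]^2 dx + \int_{1/2 + 1/n}^{1/2 + 1/m} \left[1 -
m\left(x - \tfrac{1}{2}\right)\right]^2 dx$. Show that this $\to 0$ as $m, n \to \infty$, so that $\{u_n\}$ is
Cauchy. Also, $\|u_n - u\|^2 = \int_{1/2}^{1/2 + 1/n} \left[n\left(x - \tfrac{1}{2}\right) - 1\right]^2 dx \to 0$ as $n \to \infty$.
So $u_n \to u$ in $L^2$.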
~1 ~1::;x<~E,
4.16. Let v(x) E C[~I, 1] be defined by v(x) = { I/E', ~E ::; x::; E,
+1, E<X::;1.
J
We have Ilu ~ vlli2 = J~« ~1 ~ C I )2 dx + o«1 ~ C I )2 dx = E3 /3 ~
E2 + E. Hence v can be made arbitrarily elose to u by choosing E small
enough.
4.17. $\|u - v\|_\infty = \sup |1 - v(x)|$, where $|v(x)| < 1$ and $v(0) = 0$. Hence
$\|u - v\|_\infty = 1$; neighborhoods of $u$ of radius less than 1 do not contain
members of $V$, so $u$ is not a point of accumulation.
4.22. Take $f \in L^p$. For given $\epsilon > 0$ choose a bounded function $g$ in $L^p$,
where $g$ has compact support, for example, $|g| \le M$ in $[a, b]$ and
$g = 0$ otherwise. Select $g$ so that $\|f - g\|_p < \epsilon$. Bounded functions
with compact support are dense in $L^p$, so we can find $\{h_n\}$ in $C_0$
such that $g = \lim h_n$ a.e. Assume that $|h_n| \le M$ in $[a, b]$ and 0
otherwise. Then $|g - h_n|^p \le (2M)^p$ on $[a, b]$ and $\|g - h_n\|_p \to 0$ from
the Dominated Convergence Theorem. Choose $n$ so that $\|g - h_n\| \le \epsilon$
and use the Minkowski inequality.
4.23. Suppose that there are two points $v_0$, $v_0'$ such that $\|u - v_0\| = \|u - v_0'\| = d$.
Then $w = (v_0 + v_0')/2$ is in $M$; hence, using the parallelogram law, it
can be shown that $d^2 \le \|u - w\|^2 < \tfrac{1}{2}\|u - v_0\|^2 + \tfrac{1}{2}\|u - v_0'\|^2 = d^2$,
a contradiction.
5.8. $Tx = \begin{pmatrix} -5 & -1 \\ -3 & -5 \end{pmatrix} x + \begin{pmatrix} 4 \\ 5 \end{pmatrix}$, assuming that $\{(0, 0), (1, 0), (0, 1)\}$
go to $\{(4, 5), (-1, 2), (3, 0)\}$.
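The matrix and translation vector in this solution can be verified against the three stated point correspondences. A minimal sketch, using plain tuples rather than a linear algebra library:

```python
# Matrix and translation from the solution to Exercise 5.8.
T = ((-5, -1),
     (-3, -5))
b = (4, 5)


def affine(x):
    """Apply x -> T x + b to a point of the plane."""
    return tuple(sum(T[i][j] * x[j] for j in range(2)) + b[i] for i in range(2))


images = [affine(p) for p in [(0, 0), (1, 0), (0, 1)]]
```

The image of the origin gives the translation vector, and the images of $(1, 0)$ and $(0, 1)$, after subtracting it, give the two columns of the matrix.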
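5.9. Let $Tu_1 = v_1$, $Tu_2 = v_2$. Then $T(\alpha u_1 + \beta u_2) = \alpha v_1 + \beta v_2$ by the
linearity of $T$. Hence $T^{-1}(\alpha v_1 + \beta v_2) = \alpha u_1 + \beta u_2$. But $\alpha T^{-1} v_1 =
\alpha u_1$, $\beta T^{-1} v_2 = \beta u_2 \Rightarrow T^{-1}(\alpha v_1 + \beta v_2) = \alpha T^{-1} v_1 + \beta T^{-1} v_2$.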
5.10. No; e.g., $d(x, B) + d(y, B) \ne d(x + y, B)$ in general. Null space is the
set $B$.
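5.12. $\|Ax\|_\infty = \max_{1 \le i \le n} \left|\sum_{j=1}^{n} A_{ij} x_j\right| \le \max_{1 \le i \le n} \sum_{j=1}^{n} |A_{ij}||x_j| \le
\max_{1 \le i \le n} \sum_{j=1}^{n} |A_{ij}| \max_{1 \le j \le n} |x_j| = \max_{1 \le i \le n} \sum_{j=1}^{n} |A_{ij}|\,\|x\|_\infty$.
Hence $\|A\| = \sup(\|Ax\|_\infty/\|x\|_\infty) \le \max_{1 \le i \le n} \sum_{j=1}^{n} |A_{ij}|$. Suppose
maximum occurs for $i = k$. Then for $x$ such that $x_j = +1$ if $A_{kj} \ge
0$, $x_j = -1$ if $A_{kj} < 0$ we have $\|Ax\|_\infty/\|x\|_\infty = \sum_{j=1}^{n} |A_{kj}|$.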
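The construction of the maximizing vector can be followed on a small matrix. The sketch below uses an arbitrary $3 \times 3$ example (an illustrative choice) and builds the sign vector described above for the row with the largest absolute sum.

```python
def max_row_sum(A):
    """Largest sum of absolute values along a row of A."""
    return max(sum(abs(a) for a in row) for row in A)


def sign_vector(A):
    """Vector with x_j = +1 if A_kj >= 0 and -1 otherwise, where k is the maximizing row."""
    k = max(range(len(A)), key=lambda i: sum(abs(a) for a in A[i]))
    return [1 if a >= 0 else -1 for a in A[k]]


def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def inf_norm(x):
    return max(abs(xi) for xi in x)


A = [[1, -2, 3],
     [-4, 5, -6],
     [7, 0, -1]]
x = sign_vector(A)
attained = inf_norm(matvec(A, x)) / inf_norm(x)
```

Here the second row has the largest absolute sum, 15, and the sign vector $(-1, 1, -1)$ attains it, so the upper bound is sharp.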
5.14. $\|Iu\| = \|u\|$; $I$ is bounded. Consider $u(x) = \sin nx$: $\|u\|_V = 1$ but
$\|Iu\|_W = 1 + n$ which cannot be bounded.
5.24. $v(y) = Pu(y) = \int_{-1}^{1} \exp(i(y - z))u(z)\,dz$; show that $Pv(x) \equiv
P^2 u(x) = Pu(x)$. $P$ is an orthogonal projection.
5.25. (i) x satisfies Ax = 1 where 1 = (1, ... ,1); (ii) x satisfies Ax = 0 =
(1,0, ... ,0).
5.30. If there are two elements $u_1$, $u_2$ such that $(u_1, v) = (u_2, v) = \langle \ell, v \rangle$,
then $(u_1 - u_2, v) = 0$. Set $v = u_1 - u_2$: $\|u_1 - u_2\|^2 = 0$ or
$u_1 = u_2$. $\|\ell\| = \sup(|\langle \ell, v \rangle|/\|v\|)$ (for $v \ne 0$) $= \sup((u, v)/\|v\|) \le
\sup(\|u\|\|v\|/\|v\|) = \|u\|$. Also, $|\langle \ell, u \rangle| = (u, u) = \|u\|^2 \le \|\ell\|\|u\|$ so
$\|\ell\| \ge \|u\|$. Hence $\|\ell\| = \|u\|$.
5.31. Take $f = |g|^{q-1} \operatorname{sgn} g$; then $|f|^p = |g|^q$, so $f \in L^p$, and $\|f\|_{L^p} =
\|g\|^{q-1}_{L^q}$. Then show that $\langle \ell_g, f \rangle = \|f\|_{L^p}\|g\|_{L^q}$.
5.32. $\ell = 0 \Leftrightarrow \langle \ell, v \rangle = 0$ for all $v \in X$. Given $v \in X$ there exists a
5.33. $|a(u, v)|^2 \le [\|u'\|\|v'\| + K\|u\|\|v\|]^2 \le (K^2\|u\|^2 + \|u'\|^2)(\|v\|^2 + \|v'\|^2)$,
using Cauchy-Schwarz.
6.8. $\phi_0(x) = \sqrt{1/2}$, $\phi_1(x) = \sqrt{3/2}\,x$, $\phi_2(x) = \tfrac{1}{2}\sqrt{5/2}\,(3x^2 - 1)$, $\phi_3(x) =
\tfrac{1}{2}\sqrt{7/2}\,(5x^3 - 3x)$.
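Orthonormality of these four functions on $(-1, 1)$ can be checked by integrating products of their coefficients exactly, monomial by monomial. The sketch assumes the $L^2(-1, 1)$ inner product; the coefficient lists (lowest degree first) are transcribed from the formulas above.

```python
import math

# Coefficients of phi_0, ..., phi_3, lowest degree first.
PHI = [
    [math.sqrt(1 / 2)],
    [0.0, math.sqrt(3 / 2)],
    [-0.5 * math.sqrt(5 / 2), 0.0, 1.5 * math.sqrt(5 / 2)],
    [0.0, -1.5 * math.sqrt(7 / 2), 0.0, 2.5 * math.sqrt(7 / 2)],
]


def inner(p, q):
    """L2(-1, 1) inner product of two polynomials given by coefficient lists."""
    total = 0.0
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:                  # odd powers integrate to zero
                total += a * b * 2.0 / (i + j + 1)
    return total


# Gram matrix; it should be the 4 x 4 identity up to rounding.
gram = [[inner(p, q) for q in PHI] for p in PHI]
```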
6.13. $(b, c) = (Ta, c) = (a, T^T c) = 0$ if $c \in N(T^T)$. Let $d \in R(T)^\perp$.
Then $(d, Tu) = 0 = (T^T d, u) \Rightarrow d \in N(T^T)$. Conversely, if $d \in
N(T^T)$, then if $Tu = v$ we have $(T^T d, u) = 0 = (d, v) \Rightarrow d \in
R(T)^\perp$. Hence $N(T^T) = R(T)^\perp \Rightarrow N(T^T)^\perp = R(T)$. $N(T^T) =
\{(1, 1, -1)\}$, $b = (\alpha, \beta, \alpha + \beta)$.
6.15. Let $B_1 = \{e_1, \ldots, e_n\}$ and $B_2 = \{f_1, \ldots, f_n\}$ be orthonormal bases
of $X$ and $\mathbb{R}^n$, respectively. For any $u \in X$ we have $u = \sum u_i e_i$, $u_i =
(u, e_i)$. Define the map $T : X \to \mathbb{R}^n$ by $T(u) = (u_1, \ldots, u_n)$. Then $T$ is
an isomorphism (show this) and $\|u\|^2_X = (u, u) = (\sum u_i e_i, \sum u_j e_j) =
\sum u_i^2 = \|Tu\|^2_{\mathbb{R}^n}$.
6.20. $0 \le \|u - \sum_{i=1}^{N} (u, \phi_i)\phi_i\|^2 = \|u\|^2 - \sum_{i=1}^{N} (u, \phi_i)^2$, hence $\sum_{i=1}^{N} (u, \phi_i)^2 \le
\|u\|^2$. Since sum is bounded, we can let $N \to \infty$.
6.21. Use the property $P\phi_k = \phi_k$ to show that $P^2 u = Pu$. Clearly $R(P) \subset
V$. Conversely, if $v \in V$, show that $Pv = v$ so that $R(P) = V$.
Orthogonality: take $v \in R(P)$ and $w \in N(P)$; then $(w, v) = (w, Pu)$.
Use this to show that $(w, v) = 0$.
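6.23. (a) Set $u(r, \theta) = R(r)\Theta(\theta)$ to get $(\Theta' \sin\theta)' + \lambda\Theta\sin\theta = 0$. Set
$\xi = \cos\theta$ to get Legendre's equation. General solution is $u(r, \theta) =
\sum_{n=0}^{\infty} [a_n r^n + b_n r^{-(n+1)}] P_n(\cos\theta)$.
(b) $a_n = (2n + 1)/2 \int_0^\pi f(\theta) P_n(\cos\theta)\,d\theta$.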
6.26. Let the minimizer be $u$, and set $w = u + \epsilon v$; then consider $R(w) =
R(\epsilon)$ over all $w$ that satisfy $(w, e_1) = (w, e_2) = \cdots = (w, e_{n-1}) = 0$.
Set $[dR/d\epsilon]_{\epsilon=0} = 0$; expand and differentiate to find that $\lambda = R(u)$
and $u = e_n$.
6.29. (c) Show that $H_n'(x) = 2xH_n(x) - H_{n+1}(x)$. Set $f(x) = \exp(-x^2)$ and
show that $f^{(n+1)} + 2xf^{(n)} + 2nf^{(n-1)} = 0$; multiply by $(-1)^{n+1}\exp(x^2)$
to get $H_{n+1} - 2xH_n + 2nH_{n-1} = 0$.
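Both identities can be checked on polynomial coefficients. The sketch below assumes the Hermite polynomials generated from $\exp(-x^2)$ as above and builds them from the standard explicit sum $H_n(x) = n! \sum_m (-1)^m (2x)^{n-2m} / (m!\,(n-2m)!)$ rather than from the recurrence, so that the check is not circular. The helper names are illustrative.

```python
from math import factorial


def hermite(n):
    """Integer coefficients of H_n (lowest degree first), from the explicit sum."""
    c = [0] * (n + 1)
    for m in range(n // 2 + 1):
        k = n - 2 * m
        c[k] = (-1) ** m * (factorial(n) // (factorial(m) * factorial(k))) * 2**k
    return c


def times_2x(c):
    """Coefficients of 2x * p(x)."""
    return [0] + [2 * a for a in c]


def derivative(c):
    """Coefficients of p'(x)."""
    return [k * a for k, a in enumerate(c)][1:]


def combine(*terms):
    """Coefficients of the sum of scale * polynomial over the given (scale, coeffs) pairs."""
    out = [0] * max(len(c) for _, c in terms)
    for scale, c in terms:
        for k, a in enumerate(c):
            out[k] += scale * a
    return out


def recurrence_residual(n):
    """H_{n+1} - 2x H_n + 2n H_{n-1}; should vanish identically."""
    return combine((1, hermite(n + 1)), (-1, times_2x(hermite(n))), (2 * n, hermite(n - 1)))


def derivative_residual(n):
    """H_n' - 2x H_n + H_{n+1}; should vanish identically."""
    return combine((1, derivative(hermite(n))), (-1, times_2x(hermite(n))), (1, hermite(n + 1)))
```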
Chapter 7
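7.15. Assume $\Omega \subset \mathbb{R}^2$; then left-hand side is $\int_\Omega \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right)\left(\frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2}\right) dx$.
Now $\int_\Omega \frac{\partial^2 u}{\partial x^2}\frac{\partial^2 v}{\partial x^2}\,dx = \int_\Gamma \left(\frac{\partial u}{\partial x}\frac{\partial^2 v}{\partial x^2} - u\frac{\partial^3 v}{\partial x^3}\right)\nu_x\,ds + \int_\Omega u\frac{\partial^4 v}{\partial x^4}\,dx$. Proceed
in this manner; use $\partial/\partial\nu = \nu_1\,\partial/\partial x + \nu_2\,\partial/\partial y$.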
7.16. Let $\{v_n\}$ be a sequence in $\mathcal{D}(\Omega)$ with limit $v \in H_0^1(\Omega)$. We have
$\|v_n\|_{L^2} \le c|v_n|_{H^1}$; $v_n \to v$ in $H^1$ implies that $\|v_n\|_{L^2} \to \|v\|_{L^2}$ and
$|v_n|_{H^1} \to |v|_{H^1}$. $|\cdot|_{H^1}$ is positive-definite since $|v|_1 = 0$ implies that
$\int |\nabla v|^2\,dx = 0$, so that $v = \text{const} = 0$, given the boundary value of
$v$.
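8.2. (a) $\frac{d}{dt}\int_{\Omega'} \rho v\,dx = \int_{\Omega'} Q\,dx + \int_{\Gamma'} t\,ds$. Use Cauchy's law $t = \sigma n$ and
the divergence theorem to rewrite the surface integral as $\int_{\Omega'} \operatorname{div}\sigma\,dx$.
The left-hand side equals $\int_{\Omega'} \rho\,\partial^2 u/\partial t^2\,dx$. Regroup and invoke the
arbitrariness of $\Omega'$ to obtain (8.5). (b) $\sigma_{ij} = \lambda(\operatorname{div} u)I_{ij} + 2\mu\epsilon_{ij}(u)$.
Substitute in (8.5).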
8.3. The argument is as in Example 2 of the Introduction: simply replace
f by f - ku, ku being the force of the foundation.
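8.4. (a) $\sum_j \partial\sigma_{\alpha j}/\partial x_j = \sum_\beta \partial\sigma_{\alpha\beta}/\partial x_\beta + \partial\sigma_{\alpha 3}/\partial z$, where $z = x_3$. Integrate
with respect to $z$ and use the definitions of $S_\alpha$ and $M_{\alpha\beta}$. (b) Follows
as part (a). (c) Differentiate (8.14)$_1$ with respect to $x_\alpha$, sum on
$\alpha$, and use (8.14)$_2$ to eliminate $S_\alpha$; this gives
$\sum_{\alpha,\beta} \partial^2 M_{\alpha\beta}/\partial x_\alpha \partial x_\beta = -q$. Next, use (8.13). This gives
$\sum_{\alpha,\beta} \partial^2 M_{\alpha\beta}/\partial x_\alpha \partial x_\beta = -D\left[\nu \sum_\alpha \partial^2(\nabla^2 w)/\partial x_\alpha^2 + (1 - \nu) \sum_{\alpha,\beta} \partial^4 w/\partial x_\alpha^2 \partial x_\beta^2\right]$.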
8.8. $\sum_{i,j,k,l} C_{ijkl}\xi_i\eta_j\xi_k\eta_l = \mu|\xi|^2|\eta|^2 + (\lambda + \mu)(\xi \cdot \eta)^2 = \mu|\xi|^2|\eta_\perp|^2 + (\lambda +
2\mu)(\xi \cdot \eta)^2$, where $\eta_\perp$ is the component of $\eta$ orthogonal to $\xi$. The
result follows from the independence of $\eta_\perp$ and $(\xi \cdot \eta)$. Pointwise stability:
$f \equiv \sum_{i,j,k,l} C_{ijkl}M_{ij}M_{kl} = (3\lambda + 2\mu)|M^S|^2 + 2\mu|M^D|^2$ [$M^S =
\frac{1}{3}(\operatorname{tr} M)I$ and $M^D = M - M^S$]. Show that $|M|^2 = |M^S|^2 + |M^D|^2$;
then $f \ge c|M|^2$ iff $3\lambda + 2\mu > k_0$ and $\mu > \mu_0$.
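The splitting used for pointwise stability in 8.8 can be checked numerically. The sketch below assumes the isotropic form $C_{ijkl} = \lambda\delta_{ij}\delta_{kl} + \mu(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk})$ and a symmetric $M$; the parameter values and the sample matrix are arbitrary illustrative choices.

```python
import itertools

lam, mu = 1.3, 0.7  # illustrative Lame parameters (assumed values)

def delta(i, j):
    """Kronecker delta."""
    return 1.0 if i == j else 0.0

def C(i, j, k, l):
    """Isotropic elasticity tensor (assumed form)."""
    return lam * delta(i, j) * delta(k, l) + mu * (
        delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k)
    )

def norm_sq(A):
    """Squared Frobenius norm |A|^2 of a 3x3 matrix."""
    return sum(A[i][j] ** 2 for i in range(3) for j in range(3))

# An arbitrary symmetric 3x3 matrix.
M = [[2.0, -1.0, 0.5], [-1.0, 0.3, 4.0], [0.5, 4.0, -1.5]]
trace = sum(M[i][i] for i in range(3))
MS = [[trace / 3.0 * delta(i, j) for j in range(3)] for i in range(3)]  # spherical part
MD = [[M[i][j] - MS[i][j] for j in range(3)] for i in range(3)]         # deviatoric part

# Full contraction sum_{ijkl} C_ijkl M_ij M_kl versus the split form.
f = sum(C(i, j, k, l) * M[i][j] * M[k][l]
        for i, j, k, l in itertools.product(range(3), repeat=4))
f_split = (3 * lam + 2 * mu) * norm_sq(MS) + 2 * mu * norm_sq(MD)

print(abs(f - f_split) < 1e-9)
print(abs(norm_sq(M) - norm_sq(MS) - norm_sq(MD)) < 1e-9)
```

The second check confirms that the spherical and deviatoric parts are orthogonal, which is the step the hint asks the reader to show.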
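8.9. $\frac{\partial^2 u}{\partial x_1^2}\nu_1^2 + 2\frac{\partial^2 u}{\partial x_1 \partial x_2}\nu_1\nu_2 + \frac{\partial^2 u}{\partial x_2^2}\nu_2^2 = g$. Have to check
$\sum_{|\alpha|=2} b_\alpha a^\alpha = \nu_1^2 a_1^2 + 2\nu_1\nu_2 a_1 a_2 + \nu_2^2 a_2^2 = (\nu_1 a_1 + \nu_2 a_2)^2 = (\nu_1^2 + \nu_2^2)^2 \ne 0$ if $a = \nu$.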
8.11. $|r| + |\beta - 3| \ne 0$.
8.12. $ku \cdot v - t \cdot v = 0$. $n = 2$; $b_{11} = k$, $b_{22} = -c_{11} = 1$; all other components
are zero. So (8.33) is satisfied.
8.22. (b) From (a), $A : N(A)^\perp \to R(A)$ is bounded. Hence, using the
Banach theorem, $A^{-1} : R(A) \to N(A)^\perp$ is linear and bounded, so
$\|A^{-1}v\| \le K\|v\|$ for all $v \in R(A)$; setting $v = Au$ we have $\|u\| \le K\|Au\|$
for $u \in N(A)^\perp$. If $\{v_n\}$ is a Cauchy sequence in $R(A)$ with limit $v$,
then with $u_n = A^{-1}v_n$ we have $\|u_m - u_n\| \le K\|v_m - v_n\| \to 0$ as
$m, n \to \infty$; so $\{u_m\}$ is a Cauchy sequence in $N(A)^\perp$. $N(A)^\perp$ is closed;
9.8. (b) VBVP is: $\int_0^1 u''v''\, dx + [h_1 v(1) - g_1 v'(1) - h_0 v(0) + g_0 v'(0)] =
\int_0^1 fv\, dx$, $v \in H^2(0, 1)$; so $P = P_1(0, 1)$. Hence $Q = \{v \in H^2(0, 1) :
\int_0^1 v\, dx = \int_0^1 xv\, dx = 0\}$. $Q$-ellipticity is tricky, but see Rektorys
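9.14. $J(\theta u + (1 - \theta)v) = \frac{1}{2}\{\theta^2 a(u, u) + (1 - \theta)^2 a(v, v) + 2\theta(1 - \theta)a(u, v)\} -
\theta\langle \ell, u\rangle - (1 - \theta)\langle \ell, v\rangle$. $a(u - v, u - v) > 0$ for $u \ne v$ since $a$ is $V$-elliptic, so
$2a(u, v) < a(u, u) + a(v, v)$. Use this to obtain strict convexity of $J$.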
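9.15. $J(\theta u + (1 - \theta)v) = \theta J(u) + (1 - \theta)J(v) - \frac{1}{2}\theta(1 - \theta)a(u - v, u - v)$. The
term $a(u - v, u - v)$ on the right is nonnegative, so $J$ is convex. To show that $u$ is a minimizer: from
convexity, $J(v) - J(u) \ge \theta^{-1}[J(u + \theta(v - u)) - J(u)] \to \langle DJ(u), v - u\rangle$
when $\theta \to 0$ (see Example 15).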
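The identity behind 9.15 is easy to confirm in a finite-dimensional setting. The sketch below assumes $a(u, v) = u^T A v$ with a symmetric positive-definite matrix $A$ and $J(u) = \frac{1}{2}a(u, u) - \langle \ell, u\rangle$; the matrix, the vector `ell`, and the sample points are arbitrary illustrative choices.

```python
def a(u, v, A):
    """Symmetric bilinear form a(u, v) = u^T A v."""
    return sum(u[i] * A[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def J(u, A, ell):
    """Quadratic functional J(u) = a(u, u)/2 - <ell, u>."""
    return 0.5 * a(u, u, A) - sum(l * x for l, x in zip(ell, u))

A = [[4.0, 1.0], [1.0, 3.0]]   # symmetric positive-definite (assumed example)
ell = [1.0, -2.0]
u, v = [0.5, 1.5], [-1.0, 2.0]
d = [ui - vi for ui, vi in zip(u, v)]

max_gap = 0.0
for theta in [0.0, 0.25, 0.5, 0.9, 1.0]:
    w = [theta * ui + (1 - theta) * vi for ui, vi in zip(u, v)]
    # Right-hand side of the identity: convex combination minus the quadratic gap.
    rhs = (theta * J(u, A, ell) + (1 - theta) * J(v, A, ell)
           - 0.5 * theta * (1 - theta) * a(d, d, A))
    max_gap = max(max_gap, abs(J(w, A, ell) - rhs))

print(max_gap < 1e-10)
```

Because $a(d, d) > 0$ for $d \ne 0$, the subtracted term is what makes the convexity strict for $0 < \theta < 1$.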
10.5. $\|u - u_h\|_a^2 = a(u - u_h, u - u_h) = \|u\|_a^2 - \|u_h\|_a^2 - 2a(u_h, u - u_h)$. The
last term is zero.
$\psi_j)\, dx$ and $F_j = \int_0^1 (\sin x)(-\psi_j'' + \psi_j)\, dx$. Collocation: solve
$\sum_{k=1}^N (-\phi_k''(x_i) + \phi_k(x_i))a_k = f(x_i)$, $i = 1, \ldots, N$.
(c) Solve $M^T a = F$, where $M_{ij} = \int_0^1 \phi_i(-\psi_j'' + \psi_j)\, dx$ and $F_j =
\int_0^1 f\psi_j\, dx$.
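The collocation system $\sum_{k=1}^N (-\phi_k''(x_i) + \phi_k(x_i))a_k = f(x_i)$ can be illustrated with a small script. Everything specific here is an assumption made for the sketch, not taken from the exercise: the interval $(0, 1)$ with homogeneous Dirichlet conditions, the basis $\phi_k(x) = \sin(k\pi x)$, equally spaced collocation points, and a manufactured right-hand side whose exact coefficients are known.

```python
import math

def solve(Mat, rhs):
    """Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(Mat, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            m = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= m * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def collocate(f, N):
    """Collocation for -u'' + u = f with phi_k(x) = sin(k*pi*x) (assumed basis)."""
    pts = [i / (N + 1) for i in range(1, N + 1)]  # assumed collocation points
    # (-phi_k'' + phi_k)(x) = (k^2 pi^2 + 1) sin(k pi x)
    Mat = [[((k * math.pi) ** 2 + 1) * math.sin(k * math.pi * x)
            for k in range(1, N + 1)] for x in pts]
    return solve(Mat, [f(x) for x in pts])

# Manufactured right-hand side whose exact solution is sin(pi x) + 2 sin(2 pi x).
def f(x):
    return ((math.pi ** 2 + 1) * math.sin(math.pi * x)
            + 2 * (4 * math.pi ** 2 + 1) * math.sin(2 * math.pi * x))

coeffs = collocate(f, 3)
print([round(c, 8) for c in coeffs])
```

Since the manufactured solution lies in the span of the first two basis functions, the computed coefficients should be close to 1, 2, and 0 up to rounding.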
Chapter 11
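11.1. Must show that a function $v_i$, say, exists such that $\int_\Omega v_i\phi\, dx =
-\int_\Omega v\, \partial\phi/\partial x_i\, dx$. For each $\Omega_e$, $\int_{\Omega_e} (\partial v/\partial x_i)\phi\, dx = \int_{\Gamma_e} v\nu_i\phi\, ds -
\int_{\Omega_e} v\, \partial\phi/\partial x_i\, dx$ since $v|_{\Omega_e} \in H^1(\Omega_e)$.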
11.2. Optimal B = 5.
11.10. One needs to solve a system of 21 equations uniquely, for any given
right-hand side. Equivalently, show that any polynomial for which
$D^\alpha p = 0$ for $|\alpha| \le 2$ at the vertices, and $\partial p/\partial\nu = 0$ at the midpoints is
identically zero. See Ciarlet [11] (Theorem 2.2.11) for full details.
11.11. $x = \sum_{A=1}^4 x_A N_A(\xi, \eta)$, where $N_A$ are given by (11.27). Substitute
and use the geometry of the parallelogram to verify that $x = A\xi + b$
for suitable $A$ and $b$.
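The claim in 11.11 can be checked numerically. The sketch below assumes that (11.27) gives the usual bilinear shape functions $N_A = \frac{1}{4}(1 + \xi_A\xi)(1 + \eta_A\eta)$ on $[-1, 1]^2$ with corners numbered counterclockwise; that form, the sample parallelogram, and the function names are assumptions made for illustration.

```python
def shape(xi, eta):
    """Bilinear shape functions on [-1, 1]^2 (assumed form of (11.27))."""
    corners = [(-1, -1), (1, -1), (1, 1), (-1, 1)]
    return [0.25 * (1 + cx * xi) * (1 + cy * eta) for cx, cy in corners]

def isoparametric(nodes, xi, eta):
    """x = sum_A x_A N_A(xi, eta)."""
    N = shape(xi, eta)
    return tuple(sum(n * p[d] for n, p in zip(N, nodes)) for d in range(2))

def affine_parts(nodes):
    """Columns of A and the vector b, valid when the nodes form a parallelogram."""
    x1, x2, x3, x4 = nodes
    b = tuple((x1[d] + x2[d] + x3[d] + x4[d]) / 4 for d in range(2))
    col_xi = tuple((-x1[d] + x2[d] + x3[d] - x4[d]) / 4 for d in range(2))
    col_eta = tuple((-x1[d] - x2[d] + x3[d] + x4[d]) / 4 for d in range(2))
    return col_xi, col_eta, b

# Parallelogram: x3 - x2 = x4 - x1, nodes listed counterclockwise.
nodes = [(0.0, 0.0), (2.0, 1.0), (3.0, 3.0), (1.0, 2.0)]
col_xi, col_eta, b = affine_parts(nodes)

max_err = 0.0
for xi, eta in [(-1, -1), (0.3, -0.7), (0.5, 0.5), (1, 0.2)]:
    x = isoparametric(nodes, xi, eta)
    x_aff = tuple(col_xi[d] * xi + col_eta[d] * eta + b[d] for d in range(2))
    max_err = max(max_err, max(abs(x[d] - x_aff[d]) for d in range(2)))

print(max_err < 1e-12)
```

The map is affine because the coefficient of the $\xi\eta$ term is $\frac{1}{4}(x_1 - x_2 + x_3 - x_4)$, which vanishes exactly when the four nodes form a parallelogram.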
Chapter 12
12.3. Let the triangle have angles $\alpha, \beta, \gamma$ with $\theta_e = \alpha \le \beta \le \gamma$. Let the
sides opposite $\alpha, \beta, \gamma$ be $a, b, c$, respectively. Then $a \le b \le c$ and
$h_e = c$. The largest circle inscribed in the triangle touches all sides.
Draw a sketch and show that $h_e = (\rho_e/2)(\cot \alpha/2 + \cot \beta/2)$. Now
$\alpha < \pi/2$, $\beta < \pi/2$; so $\cot \beta/2 \le \cot \alpha/2$. Hence $h_e/\rho_e \le \cot \alpha/2 \le \sigma$
if we prescribe $\alpha \ge \theta_0$, so that $\sigma = \cot \theta_0/2$.
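The geometric relation in 12.3 can be verified for a concrete triangle. The sketch below takes $h_e$ to be the longest side and $\rho_e$ to be the diameter of the inscribed circle, as in the hint; the angles chosen and the function name are illustrative assumptions.

```python
import math

def triangle_measures(alpha, beta, c=1.0):
    """Given the two smaller angles and the longest side c, return (h_e, rho_e)."""
    gamma = math.pi - alpha - beta
    # Law of sines gives the remaining sides.
    a = c * math.sin(alpha) / math.sin(gamma)
    b = c * math.sin(beta) / math.sin(gamma)
    area = 0.5 * a * b * math.sin(gamma)
    inradius = 2.0 * area / (a + b + c)
    return c, 2.0 * inradius  # h_e = longest side, rho_e = inscribed diameter

alpha, beta = math.radians(35), math.radians(60)   # alpha <= beta <= gamma
h_e, rho_e = triangle_measures(alpha, beta)
formula = 0.5 * rho_e * (1 / math.tan(alpha / 2) + 1 / math.tan(beta / 2))

print(abs(h_e - formula) < 1e-12)
print(h_e / rho_e <= 1 / math.tan(alpha / 2))
```

Prescribing a lower bound $\theta_0$ on the smallest angle then bounds $h_e/\rho_e$ by $\cot(\theta_0/2)$, since the cotangent is decreasing on $(0, \pi/2)$.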
12.4. $k = 2$: $O(h_e^{3-m})$, $0 \le m \le 3$, $u \in H^3(\Omega_e)$;
$k = 3$: $O(h_e^{4-m})$, $0 \le m \le 4$, $u \in H^4(\Omega_e)$.
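12.6. $D\hat{v}(a) = \sum_i \frac{\partial \hat{v}}{\partial \hat{x}_i} a_i = \sum_{i,j} \frac{\partial v}{\partial x_j}\frac{\partial x_j}{\partial \hat{x}_i} a_i = \sum_{i,j} \frac{\partial v}{\partial x_j} T_{ji} a_i = Dv(Ta)$. Proceed
in the same way for higher derivatives. Then for $k = 2$, for
example, $|D^\alpha \hat{v}(\hat{x})| \le \|D^2 \hat{v}\| = \sup |D^2 \hat{v}(a, b)|$ $(\|a\| \le 1, \|b\| \le 1)$
$= \sup |D^2 v(Ta, Tb)| = \sup \left|D^2 v\left(\frac{Ta}{\|T\|}, \frac{Tb}{\|T\|}\right)\right| \cdot \|T\|^2$. Use $\|Ta\| \le
\|T\|\|a\| \le \|T\|$.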
12.10. Using the Hermite basis functions and making appropriate changes
(e.g., replace $C(\Omega)$ by $C^1(\Omega)$), the estimate (12.24) remains valid.
The VBVP is: find $u \in H_0^2(0, 1)$ such that $\int_0^1 (u''v'' + k(x)uv)\, dx =
\int_0^1 fv\, dx$ for all $v \in H_0^2(0, 1)$. We obtain an error estimate from
$\|u - u_h\|_{2,\Omega} \le K\|u - \tilde{u}_h\|_{2,\Omega} = K\left(\sum_e \|u - \tilde{u}_h\|_{2,\Omega_e}^2\right)^{1/2} \le K h^{k-1}$