Texts in Applied Mathematics 27

Editors
J.E. Marsden
L. Sirovich
M. Golubitsky
W. Jäger
F. John (deceased)

Advisor
G. Iooss

Springer Science+Business Media, LLC


Texts in Applied Mathematics

1. Sirovich: Introduction to Applied Mathematics.
2. Wiggins: Introduction to Applied Nonlinear Dynamical Systems and Chaos.
3. Hale/Koçak: Dynamics and Bifurcations.
4. Chorin/Marsden: A Mathematical Introduction to Fluid Mechanics, 3rd ed.
5. Hubbard/West: Differential Equations: A Dynamical Systems Approach: Ordinary Differential Equations.
6. Sontag: Mathematical Control Theory: Deterministic Finite Dimensional Systems.
7. Perko: Differential Equations and Dynamical Systems, 2nd ed.
8. Seaborn: Hypergeometric Functions and Their Applications.
9. Pipkin: A Course on Integral Equations.
10. Hoppensteadt/Peskin: Mathematics in Medicine and the Life Sciences.
11. Braun: Differential Equations and Their Applications, 4th ed.
12. Stoer/Bulirsch: Introduction to Numerical Analysis, 2nd ed.
13. Renardy/Rogers: A First Graduate Course in Partial Differential Equations.
14. Banks: Growth and Diffusion Phenomena: Mathematical Frameworks and Applications.
15. Brenner/Scott: The Mathematical Theory of Finite Element Methods.
16. Van de Velde: Concurrent Scientific Computing.
17. Marsden/Ratiu: Introduction to Mechanics and Symmetry.
18. Hubbard/West: Differential Equations: A Dynamical Systems Approach: Higher-Dimensional Systems.
19. Kaplan/Glass: Understanding Nonlinear Dynamics.
20. Holmes: Introduction to Perturbation Methods.
21. Curtain/Zwart: An Introduction to Infinite-Dimensional Linear Systems Theory.
22. Thomas: Numerical Partial Differential Equations: Finite Difference Methods.
23. Taylor: Partial Differential Equations: Basic Theory.
24. Merkin: Introduction to the Theory of Stability.
25. Naber: Topology, Geometry, and Gauge Fields: Foundations.
26. Polderman/Willems: Introduction to Mathematical Systems Theory: A Behavioral Approach.
27. Reddy: Introductory Functional Analysis: with Applications to Boundary Value Problems and Finite Elements.

B. Daya Reddy

Introductory Functional
Analysis
With Applications to Boundary Value Problems
and Finite Elements

With 145 Illustrations

Springer
B. Daya Reddy
Department of Mathematics and
Applied Mathematics
University of Cape Town
7700 Rondebosch
South Africa

Series Editors

J.E. Marsden
Control and Dynamical Systems, 116-81
California Institute of Technology
Pasadena, CA 91125
USA

L. Sirovich
Division of Applied Mathematics
Brown University
Providence, RI 02912
USA

M. Golubitsky
Department of Mathematics
University of Houston
Houston, TX 77204-3476
USA

W. Jäger
Department of Applied Mathematics
Universität Heidelberg
Im Neuenheimer Feld 294
69120 Heidelberg
Germany

Mathematics Subject Classification (1991): 46-01, 65N30

Library of Congress Cataloging-in-Publication Data

Reddy, B. Dayanand, 1953-
Introductory functional analysis: with applications to boundary
value problems and finite elements / B. Daya Reddy.
p. cm. - (Texts in applied mathematics; 27)
Includes bibliographical references and index.
1. Functional analysis. I. Title. II. Series.
QA320.R433 1997
515'.7-dc21 97-24052

Printed on acid-free paper.


©1998 Springer Science+Business Media New York
Originally published by Springer-Verlag New York, Inc. in 1998
Softcover reprint of the hardcover 1st edition 1998
All rights reserved. This work may not be translated or copied in whole or in part without the
written permission of the publisher (Springer Science+Business Media, LLC), except for brief
excerpts in connection with reviews or scholarly analysis. Use in connection with any form of
information storage and retrieval, electronic adaptation, computer software, or by similar or
dissimilar methodology now known or hereafter developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even if
the former are not especially identified, is not to be taken as a sign that such names, as understood
by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.
Production managed by Anthony K. Guardiola; manufacturing supervised by Joe Quatela.
Camera-ready copy prepared from the author's LaTeX files.

9 8 7 6 5 4 3 2 1
ISBN 978-1-4612-6824-6 ISBN 978-1-4612-0575-3 (eBook)
DOI 10.1007/978-1-4612-0575-3 SPIN 10557902
Series Preface

Mathematics is playing an ever more important role in the physical and
biological sciences, provoking a blurring of boundaries between scientific
disciplines and a resurgence of interest in the modern as well as the classical
techniques of applied mathematics. This renewal of interest, both in research
and teaching, has led to the establishment of the series: Texts in Applied
Mathematics (TAM).
The development of new courses is a natural consequence of a high level of
excitement on the research frontier as newer techniques, such as numerical
and symbolic computer systems, dynamical systems, and chaos, mix with and
reinforce the traditional methods of applied mathematics. Thus, the purpose
of this textbook series is to meet the current and future needs of these advances
and encourage the teaching of new courses.
TAM will publish textbooks suitable for use in advanced undergraduate and
beginning graduate courses, and will complement the Applied Mathematical
Sciences (AMS) series, which will focus on advanced textbooks and research
level monographs.

Preface

A proper understanding of the theory of boundary value problems, as
opposed to a knowledge of techniques for solving specific problems or classes
of problems, requires some background in functional analysis. The same is
true of the finite element method: there is much that can be learned and
practised - for example, the basic theory of the method, computational
aspects, and so on - without knowledge of even the most basic notions of
functional analysis. But for anyone wishing to gain a proper understanding
of qualitative aspects of boundary value problems, or of aspects of the finite
element method such as those that lead to the development of error esti-
mates, some background in functional analysis is an essential prerequisite.
The issue of an adequate mathematical background is somewhat more
straightforward in the case of students of mathematics who have taken
courses in real and complex analysis, followed by a course in functional
analysis. Such students are ideally equipped to follow courses that deal with
existence theory for boundary value problems, and with qualitative aspects
of the finite element method. This text has arisen out of a recognition,
though, that there are many students, researchers, and practitioners who
have not been exposed to the kind of mathematical background just referred
to, but who nevertheless wish to become acquainted with the basic notions
of functional analysis and its application to the kinds of problems that arise
typically in physics and engineering.
Up to the mid-1970s the availability of source material to which such in-
dividuals could refer, at least in the English language, was limited almost
entirely to the standard texts on real and functional analysis, written by
mathematicians for mathematicians having the standard background. The
task facing the engineer or applied scientist was thus quite daunting.
Fortunately the situation has progressed markedly since then. There is now
available a wide range of texts that present functional analysis, often with
one or more applications taken from engineering and physics, in a manner
accessible to readers not having the standard prerequisites. The styles dif-
fer, sometimes quite considerably, from one text to another, although this
is not a bad thing given the diversity of interests and backgrounds of the
potential readership.
This text is a further addition to the set of books that present functional
analysis and its applications to nonspecialists. The approach taken
is, first, to assume that readers have no more by way of relevant background
than elementary courses in linear algebra, vector analysis, and differential
equations, and wish to learn the elements of linear functional analysis.
The book begins with an introductory chapter, which is somewhat in
the nature of a prologue, and which presents in mostly descriptive form
a motivation for studying functional analysis from the viewpoint of those
involved in the study of problems from physics and engineering. The re-
mainder of the book is then divided into parts: Part I is devoted to linear
functional analysis, Part II to an introduction to elliptic boundary value
problems, and Part III comprises a study of the finite element method.
Two applications are treated in detail in this text: elliptic boundary
value problems and the finite element method. In both cases any prior
exposure to these areas will represent an advantage to those using this
book; indeed, it is expected that such prior exposure will in many cases
have provided the motivation to study the material presented here. The
presentation of these applications starts more or less at the beginning,
so that those having no background in these areas could use this text to
acquire such background. On the other hand, it may be the case that the
motivation to learn functional analysis arises from an interest in an area of
application other than those treated in this text. Such readers might well
prefer to focus on Part I of the book.
The incorporation of applications and other illustrative material is ap-
proached in two distinct ways. In Part I of the book new concepts, often
of an abstract nature, are rendered more accessible by the copious use
of concrete worked examples. There is little reference in this part of the
book to applications in physics and engineering, for the simple reason that
such examples are less well suited to laying bare the essential features of
the many new concepts that accompany any introduction to functional
analysis. In Parts II and III, it is appropriate and desirable to illustrate
abstract concepts by recourse to concrete problems and examples taken
from physics and engineering, and this is the approach taken here. I have
used as examples problems such as heat conduction, as well as problems in
solid and structural mechanics - elasticity, beams, and plates - and return
regularly in Parts II and III to these examples in order to motivate and
illustrate aspects of the theory of elliptic boundary value problems, and
finite elements.
The style adopted in this text differs from that to be found in most texts
on analysis, in that it is adapted to the goal of making the subject matter
accessible. Thus proofs are sometimes omitted when these are felt to shed
little additional light on the relevant topic. It will also be found in places
that the presentation of detailed mathematical argument is eschewed in
favor of a more descriptive approach, again for the purpose of rendering
the material more accessible.
In addition to the many examples, each chapter ends with a collection of
exercises for the reader. Some of these consolidate material presented in the
chapters, and many exercises serve the purpose of amplification and sup-
plementation. In both cases the exercises are to be regarded as an essential
component of the text. Solutions to most of the exercises are presented at
the end of the text.
Many individuals have assisted, in various ways, in the completion of this
book. I am particularly grateful to Christiaan le Roux, Jean Lubuma, and
Sizwe Mabizela, all of whom gave most generously of their time in reading
and criticizing a preliminary version of the text. They offered detailed criti-
cism on aspects of style and substance, and pointed out a number of errors.
David Davidson organized a study group which worked through most of
the book; I found the comments of this group of engineering scientists very
helpful indeed. Weimin Han deserves special thanks for his constructive
suggestions; so too does Brendt Wohlberg, who offered many suggestions
for improving the text, located errors, and also provided me with valuable
advice on the preparation of figures by computer. Shaun Courtney's expert
guidance in the mysteries of Unix, and his very willing assistance with a
variety of LaTeX problems, are much appreciated. Most of the figures were
prepared by Bruce Bassett and Jill Goode, while Diane Laugksch assisted
me in the typing of drafts of sections of the book. I am most grateful to
these individuals for their cheerful assistance.
I express my thanks to the staff at Springer-Verlag New York for their
expert guidance and assistance with editorial aspects, as well as their advice
on the preparation of the manuscript using LaTeX.
Finally, I acknowledge with gratefulness the moral support and forbear-
ance of my wife Shaada and son Jordi, who have had to spend many
evenings and weekends without my company in order that I might bring
this project to fruition.

B.D.R.
Cape Town
April 1997
Contents

Series Preface

Preface

Introduction

I Linear Functional Analysis

1 Sets
1.1 The algebra of sets
1.2 Sets of numbers
1.3 $\mathbb{R}^n$ and its subsets
1.4 Relations, equivalence classes, and Zorn's lemma
1.5 Theorem proving
1.6 Bibliographical remarks
1.7 Exercises

2 Sets of functions and Lebesgue integration
2.1 Continuous functions
2.2 Measure of sets in $\mathbb{R}^n$
2.3 Lebesgue integration and the space $L^p(\Omega)$
2.4 Bibliographical remarks
2.5 Exercises

3 Vector spaces, normed, and inner product spaces
3.1 Vector spaces and subspaces
3.2 Inner product spaces
3.3 Normed spaces
3.4 Metric spaces
3.5 Bibliographical remarks
3.6 Exercises

4 Properties of normed spaces
4.1 Sequences
4.2 Convergence of sequences of functions
4.3 Completeness
4.4 Open and closed sets, completion
4.5 Orthogonal complements in Hilbert spaces
4.6 Bibliographical remarks
4.7 Exercises

5 Linear operators
5.1 Operators
5.2 Linear operators, continuous, and bounded operators
5.3 Projections
5.4 Linear functionals
5.5 Bilinear forms
5.6 Bibliographical remarks
5.7 Exercises

6 Orthonormal bases and Fourier series
6.1 Finite-dimensional spaces
6.2 Finite-dimensional inner product and normed spaces
6.3 Linear operators on finite-dimensional spaces
6.4 Fourier series in Hilbert spaces
6.5 Sturm-Liouville problems
6.6 Bibliographical remarks
6.7 Exercises

7 Distributions and Sobolev spaces
7.1 Distributions
7.2 Derivatives of distributions
7.3 The Sobolev spaces $H^m(\Omega)$
7.4 Boundary values of functions and trace theorems
7.5 The spaces $H_0^m(\Omega)$ and $H^{-m}(\Omega)$
7.6 Bibliographical remarks
7.7 Exercises

II Elliptic Boundary Value Problems

8 Elliptic boundary value problems
8.1 Differential equations, boundary conditions, and initial conditions
8.2 Linear elliptic operators
8.3 Normal boundary conditions
8.4 Green's formulas and adjoint problems
8.5 Existence, uniqueness, and regularity of solutions
8.6 Bibliographical remarks
8.7 Exercises

9 Variational boundary value problems
9.1 A simple variational boundary value problem
9.2 Formulation of variational boundary value problems
9.3 Existence, uniqueness, and regularity of solutions
9.4 Minimization of functionals
9.5 Bibliographical remarks
9.6 Exercises

10 Approximate methods of solution
10.1 The Galerkin method
10.2 Properties of Galerkin approximations
10.3 Other methods of approximation
10.4 Bibliographical remarks
10.5 Exercises

III The Finite Element Method

11 The finite element method
11.1 The finite element method for second-order problems
11.2 One-dimensional problems
11.3 Two-dimensional problems
11.4 Fourth-order problems and Hermite families of elements
11.5 Isoparametric elements
11.6 Numerical integration
11.7 Bibliographical remarks
11.8 Exercises

12 Analysis of the finite element method
12.1 Affine families of elements
12.2 Local interpolation error estimates
12.3 Error estimates for second-order problems
12.4 Isoparametric families and numerical integration
12.5 Bibliographical remarks
12.6 Exercises

References

Solutions to Exercises

Index

Introduction

The usefulness of functional analysis may not be immediately evident to
users of mathematics who have hitherto not encountered this branch of the
subject. Indeed physicists, engineering scientists, and other applied
mathematicians are often put off by what they perceive to be an unnecessarily
high degree of abstraction inherent in functional analysis, and the
conclusion is often reached that such a branch of mathematics could not possibly
be of any use in an area of endeavor in which concrete solutions to concrete
problems are sought.
However, there are many areas in which a knowledge of functional anal-
ysis is indispensable if one hopes to be able to probe deeply into the nature
of a problem. In this book we try to convey some idea of the circumstances
under which the student or researcher, equipped with not much more than
the basics of functional analysis, can gain a great deal of insight into the
properties of boundary value problems and their approximation. Not that
what is being proposed is in any way a panacea: while learning about the
power of functional analysis, it is equally important to be aware of its
limitations; in other words, it is important to know which questions can
conceivably be answered by adopting such an approach, and which cannot.
The introduction to functional analysis presented in this text is directed
somewhat towards those aspects of the subject that are relevant to the
qualitative treatment of boundary value problems and their approximation
by finite elements. The precise manner in which one may call upon func-
tional analysis as a useful tool in these applications is the subject of Parts
II and III of this work. However, it is rather unsatisfactory to postpone
until then an indication of how this branch of mathematics interacts with
such applications, and in which ways it is useful. For this reason we present
in this introductory chapter an overview of how boundary value problems
are encountered, what kinds of mathematical questions arise in their
treatment, and where functional analysis fits into the general scheme of things.
The treatment is deliberately sketchy in its mathematical detail, since the
aim here is to identify important beacons or landmarks, rather than to flesh
out all their mathematical features; this latter task forms the bulk of this
text.
Boundary value problems almost always arise as mathematical models of
some real-life situation, whether of a physical, biological, economic, or other
nature. We place the planned excursion in a concrete physical context in
order to be able to show how the mathematics interacts with the physical
dictates of the problem. The main vehicle chosen for the discussion in
this chapter is the physical problem of heat conduction or, equivalently, of
diffusion and, subsequently, its steady (that is, time-independent) variants.
When arriving at the steady-state case we are able to make contact also
with other problems that have the same mathematical formulation, viz.
electrostatics, and the problem of the deflection of an elastic membrane.
We proceed now to examine the various stages that arise in the consid-
eration of these physical problems, and their mathematical realizations.

STAGE I: CONSTRUCTION OF A MATHEMATICAL MODEL


Example 1: A model for heat conduction. Consider a medium through
which heat is flowing. The aim is first to construct a mathematical model
of this physical problem which, on the one hand, is a sufficiently accurate
representation of the situation, yet which is simple enough to yield to
mathematical analysis.
Positions of points in the medium are denoted by the position vector x
relative to some origin O. A Cartesian coordinate system is chosen with
origin at O, so that the coordinates of the point x are (x, y, z). Here we use
the vector x and the triple (x, y, z) interchangeably. If the time is denoted
by t, then the aim of the exercise is to find the temperature distribution
u(x, t) in the body, due to the presence of heat sources within the body
and flux of heat through its surface (see Figure 1). Two equations suffice
for a realistic model of this situation: an equation representing balance
(or conservation) of energy, and a constitutive equation, which contains
information about how heat flows in the medium itself.
Balance of energy states the following: assuming that there are no other
types of energy present,

The rate of change of thermal energy in a body =
the heat generated by sources in the body +
the flow of heat into the body from outside. (1)


FIGURE 1. The problem of heat conduction

The next stage is to translate each of these terms into mathematical form.
We do this by applying balance of energy to an arbitrary part $\Omega'$ of the
body $\Omega$; the arbitrary region has a bounding surface $\Gamma'$ (see Figure 1).
Now the thermal energy in a body is quantified by the heat capacity $c$,
which is the amount of heat generated per unit mass, and per unit rise in
temperature. If the mass density is denoted by $\rho$ and the temperature by $u$,
then the total thermal energy in $\Omega'$ at a particular time is therefore given
by

\[ \int_{\Omega'} c(\mathbf{x})\,\rho(\mathbf{x})\,u(\mathbf{x},t)\,dV. \tag{2} \]

We emphasize here that the body is nonhomogeneous; that is, its properties
vary with position in the body, so that $c$ and $\rho$ are functions of position. In
the preceding expression, and henceforth, $dV$ denotes the volume element
$dx\,dy\,dz$ and $\int_{\Omega'}$ is shorthand for the triple integral $\iiint_{\Omega'}$. In this or other
problems in which the domain is two-dimensional, $dV$ is interpreted as the
area element $dx\,dy$, and $\int_{\Omega'}$ as the area integral $\iint_{\Omega'}$.
Heat may be generated inside the body as a result of a heat source (for
example, a chemical reaction). This is given in the form of a function $f(\mathbf{x},t)$,
which represents the amount of heat generated per unit volume, per unit
time. Thus the total heat generated by such a source in $\Omega'$ is given by

\[ \int_{\Omega'} f(\mathbf{x},t)\,dV. \tag{3} \]

Finally, the flux or flow of heat is represented by a vector $\mathbf{q}(\mathbf{x},t)$, called
the heat flux, which specifies both the magnitude and direction of the flow
of heat. The flow of heat across the surface $\Gamma'$ into the part $\Omega'$ of the body
is given by

\[ -\int_{\Gamma'} \mathbf{q}\cdot\boldsymbol{\nu}\,dA, \tag{4} \]

since it is only the normal component of heat flux that will actually enter
the body. Here $\boldsymbol{\nu}$ is the outward unit normal to the boundary and $dA$
denotes the element of surface area. This surface integral may be converted
into a volume integral by using the divergence theorem of Gauss, according
to which

\[ -\int_{\Gamma'} \mathbf{q}\cdot\boldsymbol{\nu}\,dA = -\int_{\Omega'} \operatorname{div}\mathbf{q}\,dV. \]
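
The identity above can be checked numerically on a simple region. The following Python sketch is an illustration only and is not taken from the text: it assumes the unit cube in place of $\Omega'$ and a hypothetical polynomial flux field $\mathbf{q} = (x^2, xy, yz)$, and it compares the inflow $-\int_{\Gamma'} \mathbf{q}\cdot\boldsymbol{\nu}\,dA$ with $-\int_{\Omega'} \operatorname{div}\mathbf{q}\,dV$ using the midpoint rule.

```python
# Illustrative check of the divergence theorem on the unit cube [0, 1]^3.
# The cube and the flux field q are assumptions chosen for simplicity.

def q(x, y, z):
    """Hypothetical heat flux field q = (x^2, x*y, y*z)."""
    return (x * x, x * y, y * z)


def div_q(x, y, z):
    """Divergence of q, worked out by hand: 2x + x + y."""
    return 3.0 * x + y


def inflow_through_surface(flux, n=20):
    """Midpoint-rule value of -(surface integral of flux . nu) over the cube."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    outflow = 0.0
    for a in mids:
        for b in mids:
            # Opposite faces in pairs: the outward normal is +e_i on the
            # face at coordinate 1 and -e_i on the face at coordinate 0.
            outflow += (flux(1.0, a, b)[0] - flux(0.0, a, b)[0]) * h * h
            outflow += (flux(a, 1.0, b)[1] - flux(a, 0.0, b)[1]) * h * h
            outflow += (flux(a, b, 1.0)[2] - flux(a, b, 0.0)[2]) * h * h
    return -outflow


def minus_volume_integral(divergence, n=20):
    """Midpoint-rule value of -(volume integral of the divergence)."""
    h = 1.0 / n
    mids = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for x in mids:
        for y in mids:
            for z in mids:
                total += divergence(x, y, z)
    return -total * h ** 3


surface_value = inflow_through_surface(q)
volume_value = minus_volume_integral(div_q)
print(surface_value, volume_value)
```

For this particular field every integrand is at most linear in the integration variables, so the midpoint rule is exact and both values agree with the exact value $-2$ up to rounding error.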

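By putting together these various components of the equation of balance
of energy (1) we therefore obtain the equation

\[ \frac{d}{dt}\int_{\Omega'} c(\mathbf{x})\,\rho(\mathbf{x})\,u(\mathbf{x},t)\,dV = \int_{\Omega'} f(\mathbf{x},t)\,dV - \int_{\Omega'} \operatorname{div}\mathbf{q}\,dV. \]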

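Now the time derivative may be taken inside the integral, since the limits
of integration are fixed; it then becomes the partial derivative $\partial/\partial t$, and
we now have, upon rearrangement,

\[ \int_{\Omega'} \left[ c(\mathbf{x})\,\rho(\mathbf{x})\,\frac{\partial u}{\partial t} + \operatorname{div}\mathbf{q} - f(\mathbf{x},t) \right] dV = 0. \]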

Since the volume under consideration is arbitrary, and the functions appear-
ing in the integrand are assumed to be sufficiently smooth, the integrand
must vanish in order for this equation to hold true. This observation leads
to the preliminary form

\[ c\rho\,\frac{\partial u}{\partial t} + \operatorname{div}\mathbf{q} = f \tag{5} \]

of the heat equation.


Clearly a further equation is required, since there are two unknowns:
the temperature $u$ and the heat flux $\mathbf{q}$. Physically also, it is clear that we
need another equation that will characterize the particular heat-conducting
properties of the material under consideration. We choose a simple form
of such an equation, viz. Fourier's law, which states that the heat flux is
linearly related to the temperature gradient; that is,

\[ \mathbf{q} = -K\nabla u, \tag{6} \]

where the positive scalar function $K$ is known as the thermal conductivity.
The minus sign is introduced to accommodate the fact that heat flows from
hot to cold. Now substitution of Fourier's law in the energy equation and
division throughout by $c\rho$ give, finally,

\[ \frac{\partial u}{\partial t} - \frac{1}{c\rho}\operatorname{div}(K\nabla u) = Q, \tag{7} \]

where we have set $Q = f/(c\rho)$. In full, equation (7) reads

\[ \frac{\partial u}{\partial t} - \frac{1}{c\rho}\left[ \frac{\partial}{\partial x}\left(K\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(K\frac{\partial u}{\partial y}\right) + \frac{\partial}{\partial z}\left(K\frac{\partial u}{\partial z}\right) \right] = Q. \]

This partial differential equation (PDE) is the standard heat equation in
its full unsteady, nonhomogeneous form.
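
As a quick sanity check on equation (7), one can confirm numerically that a known solution makes the residual vanish. The Python sketch below is illustrative and not part of the text: it assumes one space dimension, constant $c$, $\rho$ and $K$ (with arbitrarily chosen values), no source ($Q = 0$), and the trial function $u(x,t) = e^{-k\pi^2 t}\sin(\pi x)$ with $k = K/(c\rho)$. The derivatives are approximated by central differences.

```python
import math


def heat_residual(u, x, t, c, rho, K, Q=0.0, h=1e-3):
    """Central-difference estimate of u_t - (1/(c*rho)) * (K*u_x)_x - Q
    in one space dimension, assuming K is constant."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / (h * h)
    return u_t - K / (c * rho) * u_xx - Q


# Illustrative material constants (assumed values, not from the text).
c, rho, K = 2.0, 3.0, 1.5
k = K / (c * rho)


def u_trial(x, t):
    """Trial solution exp(-k*pi^2*t) * sin(pi*x) of the source-free equation."""
    return math.exp(-k * math.pi ** 2 * t) * math.sin(math.pi * x)


residual = heat_residual(u_trial, x=0.3, t=0.2, c=c, rho=rho, K=K)
print(abs(residual))  # small; limited by the O(h^2) differencing error
```

The residual is not exactly zero because the finite differences carry an error of order $h^2$; shrinking $h$ moderately reduces it until rounding error takes over.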
To the PDE must be added information about the conditions on the
boundary, and also initial conditions. There are many kinds of boundary
conditions (BCs): for example, suppose that the temperature is prescribed
on a part $\Gamma_u$ of the boundary, whereas on the remainder $\Gamma_q$ the heat flux
is given (Figure 1). That is,

\[ u = \bar{u}(\mathbf{x},t) \ \text{on } \Gamma_u \quad \text{and} \quad \mathbf{q}\cdot\boldsymbol{\nu} = \bar{q} \ \text{on } \Gamma_q, \]

where $\bar{u}$ and $\bar{q}$ are prescribed functions. The second of these conditions can
be simplified, using Fourier's law once again, so that it reads

\[ \frac{\partial u}{\partial \nu} = g \ \text{on } \Gamma_q, \]

where $g = -\bar{q}/K$ and $\partial/\partial\nu = \boldsymbol{\nu}\cdot\nabla$ denotes the normal derivative. In the
event that the part $\Gamma_q$ of the boundary is insulated, $g = 0$.
Finally, we add an initial condition (IC), which specifies the temperature
at time $t = 0$; that is,

\[ u(\mathbf{x},0) = u_0(\mathbf{x}), \]

with $u_0$ being a given function. We have now arrived at an initial boundary
value problem (IBVP) for heat conduction, which may be succinctly
summarized as in Box 1.

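Box 1: THE INITIAL BOUNDARY VALUE PROBLEM FOR HEAT CONDUCTION

PDE: $\dfrac{\partial u}{\partial t} - \dfrac{1}{c\rho}\operatorname{div}(K\nabla u) = Q$ in $\Omega$, $t > 0$

BCs: $u = \bar{u}(\mathbf{x},t)$ on $\Gamma_u$ and $\dfrac{\partial u}{\partial \nu} = g(\mathbf{x},t)$ on $\Gamma_q$

IC: $u(\mathbf{x},0) = u_0(\mathbf{x})$ in $\Omega$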
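
To see how the three ingredients of Box 1 fit together, the following Python sketch time-steps a one-dimensional instance of the IBVP. It is a minimal illustration under stated assumptions, not a method described in the text: the domain is $0 < x < 1$, the coefficients are constant so that the PDE reads $u_t - k\,u_{xx} = 0$, the temperature is prescribed as zero at $x = 0$, the end $x = 1$ is insulated ($g = 0$), and the initial temperature is $u_0(x) = \sin(\pi x/2)$. The scheme is the standard explicit (forward-time, centered-space) finite-difference method, which is swapped in here purely for illustration.

```python
import math


def solve_heat_1d(k=1.0, n=50, t_end=0.1):
    """Explicit finite differences for u_t - k*u_xx = 0 on 0 < x < 1 with
    u(0, t) = 0, du/dx(1, t) = 0 and u(x, 0) = sin(pi*x/2)."""
    dx = 1.0 / n
    # Choose the number of steps so that r = k*dt/dx^2 stays near 0.4,
    # inside the stability limit r <= 1/2 of the explicit scheme.
    steps = math.ceil(t_end * k / (0.4 * dx * dx))
    dt = t_end / steps
    r = k * dt / (dx * dx)
    x = [i * dx for i in range(n + 1)]
    u = [math.sin(0.5 * math.pi * xi) for xi in x]  # initial condition
    for _ in range(steps):
        new = u[:]
        new[0] = 0.0  # prescribed temperature at x = 0
        for i in range(1, n):
            new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
        # Insulated end: a mirrored ghost value u[n+1] = u[n-1] enforces g = 0.
        new[n] = u[n] + 2.0 * r * (u[n - 1] - u[n])
        u = new
    return x, u


x, u = solve_heat_1d()
# Exact solution of this particular problem, for comparison.
exact = [math.exp(-(0.5 * math.pi) ** 2 * 0.1) * math.sin(0.5 * math.pi * xi)
         for xi in x]
max_error = max(abs(a - b) for a, b in zip(u, exact))
print(max_error)
```

The comparison with the exact solution $e^{-k\pi^2 t/4}\sin(\pi x/2)$ is possible only because the data were chosen to make it available; for general data the numerical solution would be the only one at hand.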


Later on, when discussing elliptic problems and their approximation by
finite elements, we are more concerned with problems that are independent
of time. This is the so-called steady case which is appropriate if, for
example, the data such as the source term and the boundary terms are
independent of time. In this case the time derivative disappears from the
PDE, the initial condition is redundant, and we are left with a boundary
value problem (BVP) for $u(\mathbf{x})$, summarized in Box 2.

Box 2: THE BOUNDARY VALUE PROBLEM FOR STEADY HEAT CONDUCTION

PDE: $-\dfrac{1}{c\rho}\operatorname{div}(K\nabla u) = Q$

BCs: $u = \bar{u}(\mathbf{x})$ on $\Gamma_u$ and $\dfrac{\partial u}{\partial \nu} = g(\mathbf{x})$ on $\Gamma_q$

Finally, this BVP takes an even simpler form if the problem is homogeneous,
that is, if the density, specific heat, and thermal conductivity are
constant. The problem now becomes that shown in Box 3; the PDE there
is known as the Poisson equation and the operator $\nabla^2$ on the left-hand side
is the Laplacian, defined by

\[ \nabla^2 u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}. \tag{8} \]

The constant $k = K/(c\rho)$ is known as the thermal diffusivity. If there is
no source term, so that the right-hand side of the PDE is zero, then the
resulting equation is known as Laplace's equation.

Box 3: POISSON'S EQUATION

PDE: $-k\nabla^2 u = Q$

BCs: $u = \bar{u}(\mathbf{x})$ on $\Gamma_u$ and $\dfrac{\partial u}{\partial \nu} = g(\mathbf{x})$ on $\Gamma_q$
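
A one-dimensional instance of the problem in Box 3 can be solved with a few lines of code. The Python sketch below is an illustration under assumptions that are not in the text: the domain is $0 < x < 1$, so the PDE becomes $-k\,u'' = Q$ with constant $Q$; the value $\bar{u}$ is prescribed at $x = 0$ and the normal derivative $g$ at $x = 1$; the second derivative is replaced by central differences; and the resulting tridiagonal system is solved by the Thomas algorithm. The function names are hypothetical.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system (lower[0] and upper[-1] are ignored)."""
    n = len(rhs)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        pivot = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / pivot
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / pivot
    solution = [0.0] * n
    solution[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        solution[i] = d_prime[i] - c_prime[i] * solution[i + 1]
    return solution


def solve_poisson_1d(k=1.0, Q=1.0, u_bar=0.0, g=0.0, n=20):
    """Central differences for -k*u'' = Q on (0, 1), u(0) = u_bar, u'(1) = g."""
    h = 1.0 / n
    # Unknowns are the nodal values u_1, ..., u_n; u_0 = u_bar is known.
    lower = [-k / h ** 2] * n
    diag = [2.0 * k / h ** 2] * n
    upper = [-k / h ** 2] * n
    rhs = [Q] * n
    rhs[0] += k * u_bar / h ** 2   # known boundary value moved to the right
    lower[-1] = -2.0 * k / h ** 2  # ghost node: u_{n+1} = u_{n-1} + 2*h*g
    upper[-1] = 0.0
    rhs[-1] += 2.0 * k * g / h
    x = [i * h for i in range(n + 1)]
    return x, [u_bar] + thomas(lower, diag, upper, rhs)


x, u = solve_poisson_1d()
# For k = Q = 1, u_bar = g = 0 the exact solution is u(x) = x - x^2/2.
max_error = max(abs(ui - (xi - 0.5 * xi * xi)) for xi, ui in zip(x, u))
print(max_error)
```

With these data the exact solution is a quadratic, which central differences reproduce exactly at the nodes, so the reported error is at the level of rounding. That will not be the case for general $Q$.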

Example 2: Electrostatics. It is important to realize that PDEs such as
those in Boxes 1 through 3 do not represent only one physical situation. As
mentioned earlier, the heat equation is also known as the diffusion equation
because it serves as a model for diffusion. Likewise, the Poisson equation
models a wide range of physical phenomena.
To make the point we consider as a further example the case of
electrostatics. Suppose that we are given a distribution of stationary electric
charges in a region $\Omega$ in space; this distribution may be specified by a scalar
function $\rho$ which gives the charge per unit volume, or charge density, at any
point. The charge density in turn gives rise to a vector force field known
as the electric field, and denoted by $\mathbf{E}$. The electric field at a point $\mathbf{x}$ gives
the force per unit charge acting on a charge located at $\mathbf{x}$.
Now, just as we considered in Example 1 the relationship between the
flux of heat through the boundary $\Gamma'$ of an arbitrary region $\Omega'$ and the
change of heat inside that region, in the same way we may consider the
relationship between the flux of the electric field through $\Gamma'$ (using the
same notation as in Example 1), and the total charge inside the region
enclosed by $\Gamma'$. The result is Gauss's law, which states that

\[ \int_{\Gamma'} \mathbf{E}\cdot\boldsymbol{\nu}\,dA = 4\pi \int_{\Omega'} \rho\,dV; \tag{9} \]

that is, the flux of the electric field $\mathbf{E}$ through any closed surface equals $4\pi$
times the total charge enclosed by that surface. By exploiting the divergence
theorem of Gauss, the surface integral on the left-hand side of (9) can be
converted to a volume integral, and this way we arrive at the counterpart
of (5); that is,

\[ \operatorname{div}\mathbf{E} = 4\pi\rho. \tag{10} \]

We require some additional information in order to solve this problem
since (10) is a single equation involving an unknown vector-valued quantity.
The requisite information is provided by the fact that the line integral of
the electric field between any two points in space is path-independent; that
is, given points $\mathbf{x}_1$ and $\mathbf{x}_2$ and a curve $\mathbf{x}(s)$ joining these two points, with
$\mathbf{x}(s_1) = \mathbf{x}_1$ and $\mathbf{x}(s_2) = \mathbf{x}_2$, the value of the integral

\[ \int_{s_1}^{s_2} \mathbf{E}\cdot\boldsymbol{\tau}\,ds \]

is independent of the curve chosen to join $\mathbf{x}_1$ and $\mathbf{x}_2$. Here $\boldsymbol{\tau}$ is the unit
tangent vector along the curve (Figure 2). An immediate mathematical
consequence is that it is possible to express the electric field as the gradient
of an electric potential function $\phi$; that is,

\[ \mathbf{E} = -\nabla\phi. \tag{11} \]

The minus sign takes care of the fact that the electric field points in the
direction of decreasing potential. Figure 2 shows schematically the curves
of constant potential, and a few of the electric field vectors, in the vicinity
of a uniformly charged disk.

FIGURE 2. Curves of constant potential and electric field vectors in the plane
normal to, and passing through the center of, a uniformly charged disk

Now we are ready to formulate the problem of determining the electric
field: by substituting (11) in (10) we obtain again a Poisson equation

\[ -\nabla^2\phi = 4\pi\rho. \tag{12} \]

Naturally this differential equation will have to be supplemented by suitable
boundary conditions involving either the potential or the electric field; we
then arrive at a problem exactly as that given in Box 3, with the obvious
changes in notation.
Once $\phi$ has been found from (12), the electric field can then be determined
from (11).
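
The step from potential to field in (11) is easy to carry out numerically. The Python sketch below is illustrative only and rests on assumptions not made in the text: the potential is that of a hypothetical point charge at the origin, $\phi = Q_0/|\mathbf{x}|$ (in the units implied by the factor $4\pi$ in (10)), and the gradient is approximated by central differences. The result is compared with the familiar inverse-square field $Q_0\,\mathbf{x}/|\mathbf{x}|^3$.

```python
import math


def electric_field(phi, point, h=1e-5):
    """Approximate E = -grad(phi) at a point by central differences."""
    field = []
    for i in range(3):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        field.append(-(phi(forward) - phi(backward)) / (2.0 * h))
    return field


charge = 2.0  # hypothetical point charge located at the origin


def phi_point_charge(p):
    """Potential charge / r of a point charge (valid away from the origin)."""
    return charge / math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)


point = (1.0, 2.0, 2.0)  # here r = 3
E = electric_field(phi_point_charge, point)
E_exact = [charge * xi / 27.0 for xi in point]  # charge * x / r^3
print(E, E_exact)
```

The computed field points away from the (positive) charge, that is, in the direction of decreasing potential, in line with the sign convention in (11).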
We see then that two problems which differ vastly in a physical sense
have a common mathematical structure.

Box 4: THE COMMON STRUCTURE OF THE HEAT CONDUCTION AND ELECTROSTATIC PROBLEMS

                                 Heat Conduction    Electrostatics
Basic variable ($u$ or $\phi$)   temperature        potential
Flux quantity (q or E)           heat flux          electric field
Balance law                      thermal energy     Gauss's law
Constitutive law                 Fourier's law      potential law

This commonality is summarized in Box 4. Of course the relation (11) is
not a constitutive law; indeed, the region in question is assumed to be a
vacuum! But (11) plays the same role mathematically as does Fourier's law;
so it is not inappropriate to place it alongside Fourier's law in the table.

Example 3: A model for deformation of a membrane. We conclude
the set of examples in this chapter with one which is discussed again in later
chapters: this is the problem of determining the shape at equilibrium of a
thin elastic membrane. Such a membrane is initially planar, and occupies
a two-dimensional domain $\Omega$, as shown in Figure 3. It is fixed along part of
its boundary. The membrane is subjected to a transverse force f per unit
area, as a result of which it takes up a nonplanar shape. The problem is to
find the deformed shape of the membrane, which is given by the function
u(x).
In this problem the main unknown is the transverse displacement, again
represented by u (see Figure 3). The terms and equations introduced for
the heat problem remain valid, provided that they are interpreted correctly.
First, all functions of time alone vanish, since we are dealing with a steady
problem. Second, balance of energy is replaced by the principle of balance
of forces (actually, this is the time-independent version of the principle of
balance of momentum), which states that the net total force acting on any
part of the membrane is zero.
A constitutive equation is required in order to characterize the behavior
of the material comprising the membrane. This equation now expresses
the fact that the vertical force depends not on the displacement u, but

FIGURE 3. Deformation of a thin elastic membrane

rather on the displacement gradient $\nabla u$, which is what characterizes local
deformation of the membrane.
By considering the balance of forces acting on an arbitrary section of the
membrane, as shown in Figure 3, we eventually arrive at the PDE in Box
2 (if the membrane is nonhomogeneous) or the Poisson equation of Box 3
(if it is homogeneous).

BVPs, as opposed to IBVPs, are particularly relevant to later developments,
since they are representative examples of elliptic equations (the
notion of an elliptic problem is explored in some depth in Chapter 8). In
order to focus the rest of this discussion on the main goals, we proceed
further by taking one of the BVPs, rather than the original IBVP, as a
representative example.

An alternative formulation for BVPs: the variational problem.
The preceding developments lead to a particular kind of mathematical
model, viz. one involving a PDE (plus boundary and possibly also initial
conditions). But although this is a commonly adopted form of the model,
it is not unique. There are other, more or less equivalent, ways of putting
the model into mathematical form, and it may be that one of the alternatives
would be more appropriate, depending on how we would want to
pursue the mathematical investigation, and depending also on the types of
approximations that we might wish to - or be forced to - consider.
An important alternative, and one which is featured heavily later on
since it is at the heart of the finite element method of approximation, is
that of the variational problem. Traditionally this meant that the problem
was formulated as one in which it was required to minimize a particular
functional; however, the term "variational" has a wider significance that
is explored in Chapter 9. We begin the discussion here with the original
understanding of a variational problem as being the same thing as a
minimization problem, and subsequently consider an alternative variational
formulation.
This exploration is carried out in the context of the simple boundary
value problem associated with the Poisson equation. We simplify matters
even further by assuming that the temperature (or the potential in the case
of electrostatics, or the displacement in the case of the membrane problem)
is prescribed to be zero on the entire boundary.
The first stage is to introduce a functional J, that is, an operator that
maps a function v(x) to a real number, defined by

$$J(v) = \frac{1}{2} \int_\Omega |\nabla v|^2 \, dV - \int_\Omega f v \, dV. \tag{13}$$

In the context of the membrane problem, for example, this represents, to
within a constant, the total potential energy of the membrane; the first
term on the right-hand side represents the strain energy or stored energy
due to deformation, and the second term represents the potential energy
of the force.
Now it can be shown (and this is done in Chapter 9) that the problem of
Box 3, with the modified boundary condition, is equivalent to the following.

Find u such that $J(u) \le J(v)$ for all admissible v.

Again, in the context of the membrane problem, this represents the statement
of the principle of minimum potential energy. Exactly what is meant
by an admissible function is a matter that takes up some time when BVPs
are discussed in full detail, but it suffices for this preliminary overview that
we consider functions which satisfy two properties:

(i) they are continuously differentiable; that is, the functions and their
derivatives are continuous on $\bar{\Omega}$, where $\bar{\Omega}$ denotes the region $\Omega$
together with its boundary $\Gamma$. In this way the integrand in (13) makes
sense;

(ii) they satisfy the boundary condition u = 0 on $\Gamma_u$.

Then the minimization problem may be summarized concisely as in Box 5.

Box 5: MINIMIZATION PROBLEM FOR THE POISSON EQUATION

Find a function u in X that satisfies

$$J(u) \le J(v) \quad \text{for all functions } v \text{ in } X,$$

where

$$J(v) = \frac{1}{2} \int_\Omega |\nabla v|^2 \, dV - \int_\Omega f v \, dV$$

and

$$X = \{\text{continuously differentiable functions on } \bar{\Omega} \text{ that vanish on } \Gamma_u\}.$$
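A one-dimensional analogue makes the minimization statement concrete. The following is a sketch under assumptions that are ours, not the text's: the domain is taken to be the interval (0, 1), the source is f = 1, and u(x) = x(1 - x)/2 is the solution of -u'' = 1 with u(0) = u(1) = 0. The helper names `integrate`, `J`, and `perturbed` are hypothetical. The code evaluates J(u) and checks that adding admissible perturbations (sine modes that vanish at both ends) only raises the value of J.

```python
import math


def integrate(g, n=2000):
    """Composite midpoint rule for the integral of g over (0, 1)."""
    h = 1.0 / n
    return h * sum(g((k + 0.5) * h) for k in range(n))


def J(v, dv, f):
    """1D version of (13): J(v) = 1/2 int (v')^2 dx - int f v dx."""
    return 0.5 * integrate(lambda x: dv(x) ** 2) - integrate(lambda x: f(x) * v(x))


def f(x):
    return 1.0


def u(x):
    """Assumed exact minimizer: solves -u'' = 1, u(0) = u(1) = 0."""
    return 0.5 * x * (1.0 - x)


def du(x):
    return 0.5 - x


def perturbed(eps, k):
    """J(u + eps * sin(k pi x)); the perturbation vanishes at x = 0 and x = 1."""
    v = lambda x: u(x) + eps * math.sin(k * math.pi * x)
    dv = lambda x: du(x) + eps * k * math.pi * math.cos(k * math.pi * x)
    return J(v, dv, f)


J_u = J(u, du, f)  # analytically equal to -1/24
print(J_u, perturbed(0.1, 1), perturbed(-0.1, 2))
```

For this example the difference J(u + εv) - J(u) works out analytically to ε² times half the integral of (v')², which is positive for every nonzero perturbation.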

Now, when one wishes to find the minimum of a function of a single variable
h(x), say, then of course this minimum (assuming it exists) is characterized
by the necessary condition $h'(x_0) = 0$, $x_0$ being the point at which the
minimum is attained. The case such as that in Box 5, where it is required to
find a function that minimizes a given functional, is not dissimilar, despite
its greater generality. Indeed, suppose we assume that a minimum does
exist, and that this minimum is achieved at the function u. If we replace v
by $u + \epsilon v$, where v is an arbitrary member of X, then we may treat
$J(u + \epsilon v)$ as a function of the single variable $\epsilon$, and write $J(u + \epsilon v) \equiv F(\epsilon)$,
say. A minimum is then achieved at $\epsilon = 0$, so the condition for a minimum
is that

$$\frac{d}{d\epsilon} F(\epsilon) \Big|_{\epsilon = 0} = 0 \quad \text{or} \quad \frac{d}{d\epsilon} J(u + \epsilon v) \Big|_{\epsilon = 0} = 0.$$

This is very easy to work out, since

$$J(u + \epsilon v) = \frac{1}{2} \epsilon^2 \int_\Omega |\nabla v|^2 \, dx + \epsilon \int_\Omega \nabla u \cdot \nabla v \, dx + \frac{1}{2} \int_\Omega |\nabla u|^2 \, dx - \epsilon \int_\Omega f v \, dx - \int_\Omega f u \, dx.$$

Differentiation with respect to $\epsilon$, and evaluation at $\epsilon = 0$, result in the
equation

$$\int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx. \tag{14}$$

Thus we have arrived at an alternative variational formulation. Because
of its close association with the minimization problem, and because
minimization problems are the subject of the calculus of variations, (14) is also
known as a variational boundary value problem (VBVP). This problem is
summarized in Box 6.

Box 6: THE VARIATIONAL BOUNDARY VALUE PROBLEM
FOR THE POISSON EQUATION

Find a function u in X that satisfies

$$\int_\Omega \nabla u \cdot \nabla v \, dV = \int_\Omega f v \, dV$$

for all $v \in X$.
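The variational statement of Box 6 can be tested on a one-dimensional model. The sketch below assumes, purely for illustration, the interval (0, 1), the source f = 1, and the function u(x) = x(1 - x)/2 with u(0) = u(1) = 0; the names `integrate`, `weak_residual`, and `bubble` are hypothetical. It evaluates the difference between the two sides of the variational equation for several test functions that vanish on the boundary, and for one function that does not.

```python
def integrate(g, n=2000):
    """Composite midpoint rule for the integral of g over (0, 1)."""
    h = 1.0 / n
    return h * sum(g((k + 0.5) * h) for k in range(n))


def u_prime(x):
    """Derivative of the assumed solution u(x) = x (1 - x) / 2."""
    return 0.5 - x


def f(x):
    return 1.0


def weak_residual(v, dv):
    """int u' v' dx - int f v dx; zero for every admissible test function v."""
    return integrate(lambda x: u_prime(x) * dv(x)) - integrate(lambda x: f(x) * v(x))


def bubble(k):
    """Test function v = x^k (1 - x), which vanishes at x = 0 and x = 1."""
    v = lambda x: x ** k * (1.0 - x)
    dv = lambda x: k * x ** (k - 1) - (k + 1) * x ** k
    return v, dv


residuals = [weak_residual(*bubble(k)) for k in (1, 2, 3)]

# v(x) = x does not vanish at x = 1, so it is not admissible; the residual
# then picks up the boundary term u'(1) v(1) = -1/2 from integration by parts.
inadmissible = weak_residual(lambda x: x, lambda x: 1.0)
print(residuals, inadmissible)
```

The nonzero residual for the inadmissible function shows why membership of X (vanishing on the boundary) is part of the formulation.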

We show in Chapter 9, the subject of which is VBVPs, that although
minimization problems may be formulated as VBVPs, it is possible to derive
the VBVP corresponding to a PDE and boundary conditions without going
via the minimization problem. Indeed, in some cases there may in fact be
no minimization problem, so the VBVP is in this sense a more fundamental
formulation.
As an example of how the VBVP is derived directly from the classical
formulation, we return to the Poisson equation in Box 3; multiplying this
equation throughout by an arbitrary function v in X, where X is as defined
previously, and then integrating over $\Omega$, we obtain

$$-\int_\Omega v \nabla^2 u \, dV = \int_\Omega f v \, dV.$$

Now the left-hand side may be transformed by using the divergence theorem,
according to which

$$-\int_\Omega v \nabla^2 u \, dV = -\int_\Gamma \frac{\partial u}{\partial \nu} v \, dA + \int_\Omega \nabla u \cdot \nabla v \, dV.$$

The boundary integral $\int_\Gamma$ may be broken up into two parts: an integral
over $\Gamma_u$, and an integral over the complementary part $\Gamma_q$. Since v belongs
to X by assumption, and since all members of X have the property that
they vanish on $\Gamma_u$, the integral over $\Gamma_u$ vanishes. Likewise, we have the
boundary condition $\partial u / \partial \nu = 0$ on $\Gamma_q$, so that the integral over $\Gamma_q$ also
vanishes. Thus the entire boundary integral is zero, and we arrive in this
way at the problem in Box 6.

STAGE II: CONSTRUCTION OF A SOLUTION (IF POSSIBLE)

Having formulated the problem, one obvious second step would be to solve
it in closed form. If an exact solution is available, then most people would
agree that the task has been essentially completed, and all that remains is
to use the solution to obtain information about the system that has been
modeled. Certainly for the case of a homogeneous medium, that is, one
whose properties are independent of position, the problem can often be
solved in closed form if the data are reasonably "nice", and if the geometry
of the boundary $\Gamma$ is very simple: for example, a rectangle (or parallelepiped
in three dimensions), circular cylinder, or sphere. In this case the method
of separation of variables can be used to obtain a series solution of the heat
equation; this is a method that most students encounter in elementary
courses on PDEs or Fourier series.
But such a happy situation is exceptional; in general it is not possible to
solve even relatively simple BVPs, such as those given previously, in closed
form, if the geometry of the domain is complex, for example. Thus we are
left with the problem of trying to obtain information about the solution,
even though there is little prospect of finding that solution. It is at this
stage that tools such as functional analysis come into their own.

STAGE III: WELL-POSEDNESS OF THE PROBLEM

Now, rather than pursuing the possibly fruitless goal of an exact solution,
we first attempt to gain qualitative information about the problem. What
kind of qualitative information is required? Well, it is generally agreed that
above anything else it is necessary to know the answers to the following
questions.
Does a solution exist?

If so, is this solution unique?

Does the solution depend continuously on the data?

The first two questions need no clarification; in the case of the third we
are asking whether the solution changes by only a small amount if the
data are changed by a small amount (of course, what is meant by "small
amount" must be made clear). A problem for which small changes in data
cause wild fluctuations in the solution is clearly unstable, and one would
be tempted in such a case to reconsider whether the mathematical model
does indeed represent reality adequately and, if so, exactly how to interpret
such sensitivity.
If the answer to all three questions is in the affirmative, then the problem
is said to be well-posed. Armed with such knowledge, which can only properly
be obtained with the aid of the tools of functional analysis, the process
of seeking approximate solutions can then proceed from a firm base.
It is possible in some circumstances to construct counterexamples which
demonstrate that a problem does not have a unique solution. It is also
possible in such cases to obtain necessary conditions for existence of a solution.
That is, with a minimum of manipulation we can establish conditions
satisfied either by the solution or by the data, assuming that a solution
does exist. Take as an example the BVP in Box 3, but assume that the
boundary condition takes the form

$$\frac{\partial u}{\partial \nu} = g \quad \text{on } \Gamma; \tag{15}$$

in other words, if the physical problem is heat conduction, the heat flux is
given on the entire boundary.
Now in fact it is easy to see that this problem does not have a unique
solution; indeed, if u is a solution, then so is u + c, where c is any constant,
since

$$-\nabla^2 (u + c) = -\nabla^2 u = f \quad \text{on } \Omega$$

and

$$\frac{\partial}{\partial \nu} (u + c) = \frac{\partial u}{\partial \nu} = g \quad \text{on } \Gamma.$$

Physically, we may add a constant temperature to the body, and the resulting
temperature distribution would still be consistent with the mathematical
model.
It is possible to go even further, and to show that a solution will exist only
if the data satisfy a particular condition; to see this we integrate the Poisson
equation over $\Omega$ and make use of the divergence theorem and the boundary
condition (15), to find that

$$\int_\Omega f \, dV = -\int_\Omega \operatorname{div}(\nabla u) \, dV = -\int_\Gamma \frac{\partial u}{\partial \nu} \, dA = -\int_\Gamma g \, dA.$$

That is,

$$\int_\Omega f \, dV + \int_\Gamma g \, dA = 0. \tag{16}$$

Thus it is not possible to find a solution to the problem unless the data
f and g satisfy the compatibility condition (16). For the problem of heat
conduction this asserts that the net amount of heat generated in the body
must be zero; such a condition makes perfect sense since otherwise we could
not expect to have a steady problem.
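In one dimension the compatibility condition is easy to evaluate. The sketch below is an illustration under our own assumptions: the domain is the interval (0, 1), whose boundary consists of the two endpoints, so the boundary integral in (16) reduces to g(0) + g(1), with the outward normal derivative equal to -u'(0) at the left end and u'(1) at the right end. The names `integrate` and `compatibility_residual` are hypothetical.

```python
def integrate(g, n=2000):
    """Composite midpoint rule for the integral of g over (0, 1)."""
    h = 1.0 / n
    return h * sum(g((k + 0.5) * h) for k in range(n))


def compatibility_residual(f, g_left, g_right):
    """1D form of (16): int_0^1 f dx + g(0) + g(1).

    A solution of -u'' = f with du/dnu = g at both ends can exist only if
    this quantity is zero.
    """
    return integrate(f) + g_left + g_right


# Data generated by u(x) = x^2 (and equally by u(x) = x^2 + c for any c):
# f = -u'' = -2, g(0) = -u'(0) = 0, g(1) = u'(1) = 2.
consistent = compatibility_residual(lambda x: -2.0, 0.0, 2.0)

# Heat generated everywhere but an insulated boundary: no steady state.
inconsistent = compatibility_residual(lambda x: 1.0, 0.0, 0.0)
print(consistent, inconsistent)
```

The first data set comes from an actual function and so passes the test; the second cannot correspond to any steady solution.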

STAGE IV: CONSTRUCTION OF APPROXIMATE SOLUTIONS

Assuming that a closed-form solution is out of the question, but that the
problem has nevertheless been shown to be well-posed, the next stage
is that of constructing the next best thing: a good approximation.
Perhaps the two best-known procedures for achieving this are the
finite difference and finite element methods. The finite element method has
been a great success, particularly since the advent of high-speed computers,
and it is the main topic of Part III of this book. We therefore focus on this
particular method of approximation in this section.
The first point to bear in mind about the finite element method is that
it takes as a point of departure the variational formulation of the problem,
for example, that given in Boxes 5 or 6, rather than the formulation (such
as that in Box 3) as a PDE. The method is in fact a special case of the
Ritz-Galerkin method, whose essence can be described succinctly. We could start
with Box 5 or 6; choosing the minimization problem, we pose this not on
the whole of the space X - remember that this problem in general will not
be solvable in closed form - but instead on a finite-dimensional subspace
$X_n$ of X. In other words, functions $\phi_1, \phi_2, \ldots, \phi_n$, all of which belong to
X, are selected as basis functions, and $X_n$ is defined by

$$X_n = \{\text{all functions of the form } v_n = b_1 \phi_1 + \cdots + b_n \phi_n, \quad b_1, \ldots, b_n \text{ constant}\}.$$

Such a representation of v is substituted in the functional J, which now
reads

$$\begin{aligned}
J(v_n) &= \frac{1}{2} \int_\Omega \nabla v_n \cdot \nabla v_n \, dV - \int_\Omega f v_n \, dV \\
&= \frac{1}{2} \int_\Omega (b_1 \nabla\phi_1 + \cdots + b_n \nabla\phi_n) \cdot (b_1 \nabla\phi_1 + \cdots + b_n \nabla\phi_n) \, dV - \int_\Omega f (b_1 \phi_1 + b_2 \phi_2 + \cdots + b_n \phi_n) \, dV \\
&= \frac{1}{2} \sum_{i,j=1}^{n} b_i b_j \int_\Omega \nabla\phi_i \cdot \nabla\phi_j \, dV - \sum_{j=1}^{n} b_j \int_\Omega f \phi_j \, dV
\end{aligned}$$

or, more concisely,

$$J(v_n) = \frac{1}{2} \mathbf{b}^T K \mathbf{b} - \mathbf{b}^T \mathbf{F},$$

where $\mathbf{b}^T = [b_1, b_2, \ldots, b_n]$ and K and F are, respectively, the n x n
symmetric matrix and n x 1 vector with components

$$K_{ij} = \int_\Omega \nabla\phi_i \cdot \nabla\phi_j \, dV, \qquad F_i = \int_\Omega f \phi_i \, dV.$$

So the essence of the Ritz-Galerkin method is that the problem of minimizing
a functional over an infinite-dimensional space X of candidate functions
has been reduced to the very simple - and solvable! - problem of minimizing
a quadratic function of n variables. If the minimizing vector is denoted
by a, then this vector is the solution to the set of simultaneous equations

$$K \mathbf{a} = \mathbf{F}; \tag{17}$$

this set of equations has a unique solution since K is positive-definite and
therefore has an inverse. Once solved, the approximate solution may be
found from $u_n = a_1 \phi_1 + \cdots + a_n \phi_n$.
The finite element method, as has been mentioned already, is a special
case of the Ritz-Galerkin method in which the basis functions $\phi_i$ are
constructed by a particular process which has the virtues that, first, it is
systematic, and second, the quality of the approximation improves with
increase of the dimension n of the basis. The method is introduced in Part
III.
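The Ritz-Galerkin steps leading to (17) can be sketched for a one-dimensional model problem. Everything specific below is an assumption made for illustration and is not the finite element construction of Part III: the problem is -u'' = f on (0, 1) with u(0) = u(1) = 0, the source is the monomial f(x) = x^m, and the basis functions are the polynomials x^i (1 - x), each of which vanishes at both ends. The names `stiffness`, `load`, `solve`, and `ritz` are hypothetical. Exact rational arithmetic is used so that the entries of K and F, obtained from elementary integrals of monomials, carry no rounding error.

```python
from fractions import Fraction


def stiffness(n):
    """K[i][j] = int_0^1 phi_i' phi_j' dx for phi_i = x^i (1 - x), i = 1..n."""
    K = [[Fraction(0)] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            K[i - 1][j - 1] = (
                Fraction(i * j, i + j - 1)
                - Fraction(i * (j + 1) + (i + 1) * j, i + j)
                + Fraction((i + 1) * (j + 1), i + j + 1)
            )
    return K


def load(n, m):
    """F[i] = int_0^1 x^m phi_i dx for the assumed source f(x) = x^m."""
    return [Fraction(1, i + m + 1) - Fraction(1, i + m + 2) for i in range(1, n + 1)]


def solve(K, F):
    """Gaussian elimination without pivoting (adequate here: K is positive-definite)."""
    n = len(F)
    A = [row[:] + [rhs] for row, rhs in zip(K, F)]
    for c in range(n):
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= factor * A[c][k]
    a = [Fraction(0)] * n
    for r in range(n - 1, -1, -1):
        s = A[r][n] - sum(A[r][k] * a[k] for k in range(r + 1, n))
        a[r] = s / A[r][r]
    return a


def ritz(n, m):
    """Coefficients a of u_n = a_1 phi_1 + ... + a_n phi_n, from K a = F."""
    return solve(stiffness(n), load(n, m))


print(ritz(1, 0))  # f = 1: exact solution x(1 - x)/2 = (1/2) phi_1
print(ritz(2, 1))  # f = x: exact solution (x - x^3)/6 = (phi_1 + phi_2)/6
```

In both cases the exact solution happens to lie in the chosen subspace, so the Ritz-Galerkin coefficients reproduce it exactly; for general data one obtains only an approximation.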

STAGE V: QUALITY OF APPROXIMATE SOLUTIONS

Although many practitioners would regard Stage IV as a suitable point at
which to conclude proceedings, it is nevertheless of the utmost importance
to obtain some information about the quality of the approximation. Without
such information the approximate solution itself is of little use, since
it could conceivably bear very little resemblance to the exact solution. We
need to know whether in some sense the approximation is a good one, and
also how it could be improved.
Once again a knowledge of functional analysis is required in order to pursue
this line of enquiry satisfactorily. Suppose that the exact solution to
one of the problems discussed previously is denoted by u, and the approximate
solution, obtained using the Ritz-Galerkin finite element method, is
denoted by $u_n$. What we first of all require is some qualitative information
about the error $u - u_n$. Now since u is unknown, it is of course not possible
to evaluate the error exactly. So we have instead to be satisfied with an
estimate of the error. To complete the picture, one would hope to obtain
from such an estimate not only an idea about the size of the error, but also
some information about whether the approximate procedure gives a family
of solutions that converge to the exact solution and, if so, what the rate of
convergence is.

FIGURE 4. The norm of a continuous function

The finite element method is ideally suited to such a convergence analysis.
In the framework of functional analysis one has to be precise first of all
about the set, or space, of functions X to which the solution belongs. We
gave a rather simple example earlier, but it is shown that it is necessary to
be a lot more careful about the choice of space. Such a space is generally
a normed space, meaning that there is an operation $\|\cdot\|$, called a norm,
defined on the space. The norm measures the magnitude of the function in
a manner analogous to that in which the length $|\mathbf{a}|$ of a vector $\mathbf{a}$ is defined
by $|\mathbf{a}| = \sqrt{\mathbf{a} \cdot \mathbf{a}}$. For example, if X is the space of all bounded continuous
functions, then one way of defining a norm on X is according to

$$\|u\| = \max\{|u(x)| \text{ for all points } x \text{ in } \Omega\}$$

(see Figure 4). With the aid of a norm, the procedure of obtaining an error
estimate and the rate of convergence can be phrased in more concrete form.
For example, the Ritz-Galerkin procedure entails the replacement of X by
an n-dimensional approximation $X_n$ which is constructed in such a way
that the error estimate

$$\|u - u_n\| \le C n^{-p} \tag{18}$$
holds, where C is a positive constant not related to n, and p is a positive
constant, also independent of n. There is a wealth of information contained
in this inequality, if it can be derived for the problem at hand. First, it
confirms that we may improve the approximation by increasing the size of
$X_n$; that is, $\|u - u_n\|$ decreases with an increase in n. Second, the sequence
of approximations obtained by a progressive increase in the dimension n of
$X_n$ converges to u, in the sense that $\|u - u_n\|$ approaches zero as n gets
larger. Third, we get to know not only that the method converges, but also
about the rate of convergence. This is given by the number p: the larger p
is, the faster will be the approach to the exact solution.
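When errors can be measured for two values of n, the exponent p in an estimate of the form "error is bounded by C times a negative power of n" can be estimated from their ratio. The sketch below is an illustration under assumptions of our own: instead of a Ritz-Galerkin solution it uses the piecewise-linear interpolant of sin(πx) on n equal subintervals of (0, 1) as a stand-in for the approximation, and it measures the error in the maximum norm by sampling. The names `interp_error` and `observed_rate` are hypothetical.

```python
import math


def interp_error(n, samples=50):
    """Sampled max-norm error of the piecewise-linear interpolant of sin(pi x)
    on n equal subintervals of (0, 1)."""
    h = 1.0 / n
    worst = 0.0
    for e in range(n):
        x0 = e * h
        u0 = math.sin(math.pi * x0)
        u1 = math.sin(math.pi * (x0 + h))
        for s in range(1, samples):
            t = s / samples
            exact = math.sin(math.pi * (x0 + t * h))
            approx = (1.0 - t) * u0 + t * u1
            worst = max(worst, abs(exact - approx))
    return worst


def observed_rate(n):
    """Estimate p from the errors at n and 2n, assuming error ~ C n^(-p)."""
    return math.log(interp_error(n) / interp_error(2 * n)) / math.log(2.0)


print(interp_error(8), interp_error(16), observed_rate(8))
```

For this stand-in the observed exponent is close to 2, in line with the classical bound for linear interpolation of a smooth function; the rate for an actual Ritz-Galerkin scheme depends on the problem, the basis, and the norm.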
So we see that, even though an exact solution may be elusive, the combination
of a careful qualitative analysis, using the tools of functional analysis,
together with an approximate solution procedure, will yield extremely
useful information about the problem. The finite element method in particular
provides a systematic way of defining the finite-dimensional spaces
$X_n$, for increasing values of n.
In summary, then, the overall impression gained is that although functional
analysis will not in general provide a means or technique for actually
finding closed-form solutions to boundary value problems, it can help us to
develop theories that will throw light on the nature of solutions to problems.
Given that closed-form solutions may be impossible to achieve, by whatever
method, the need for an understanding of some of these qualitative
properties of solutions is compelling.

Outline of the rest of this book


The next seven chapters constitute Part I of this book and provide a system-
atic development of the basic functional analysis required for a reasonably
in-depth study of BVPs and their approximation.
In Part II, which comprises Chapters 8 and 9, we present a detailed
study of elliptic BVPs in both the classical (that is, PDE) and the
variational forms. This part of the book can therefore be thought of as
supplying the technical detail that was missing in the heuristic overview of
BVPs presented earlier. The opportunity is also taken in these chapters to
introduce further physical examples, taken mostly from mechanics, and to
use these examples to illustrate the theory.
Finally, Part III is concerned with methods of obtaining approximate
solutions. The focus is on the Galerkin and finite element methods, and
both the computational and mathematical aspects are studied so that the
steps leading to the equation (17) are presented in detail and the analysis
necessary to arrive at error estimates of the form (18) is also developed.
Part I

Linear Functional
Analysis
1
Sets

Functional analysis conventionally takes as its starting point the idea of
the existence of collections of mathematical objects, for example, numbers,
vectors, or functions. Such collections, which are known as sets, are endowed
with additional structure, and when this is done it becomes possible to
elaborate on their properties and build up a coherent theory.
Sets are clearly basic to a proper study of mathematics, and for this
reason we start the study of functional analysis with some introductory
aspects of set theory. After a review of the algebra of sets in Section 1.1,
we go on to take a closer look at sets of numbers and of n-tuples, since
these are fundamental to many of the later developments in this work. In
Chapter 2 we discuss some important sets whose members are functions.
This chapter ends, in Section 1.4, by introducing further set-theoretic and
logical notions which are required in the proofs of theorems.

1.1 The algebra of sets


A set is any well-defined collection of objects. These objects - in the present
context mainly numbers, vectors, or functions - are called members or
elements of the set. A set is usually denoted by a capital letter, for example,
A, and if the object x is a member of A we write

$$x \in A,$$

which is read "x is an element of A" or "x belongs to A". Likewise, the
expression

$$x \notin A$$

reads "x is not an element of A". Various ways of defining sets are given
in the following examples.

Examples

1. The set A of all positive integers less than or equal to 5 is given by

$$A = \{1, 2, 3, 4, 5\}. \tag{1.1}$$

Here we have defined A simply by listing its elements. A more
sophisticated, but concise, way of defining the set would be to write

$$A = \{n : n \text{ is an integer}, \ 1 \le n \le 5\}$$

or, if we denote by $\mathbb{Z}$ the set of all integers,

$$A = \{n : n \in \mathbb{Z}, \ 1 \le n \le 5\}. \tag{1.2}$$

The expression in brackets is read "the set of all n, n being an integer,
such that $1 \le n \le 5$."
2. The preceding example was of a finite set, that is, a set comprising a
finite number of elements. An infinite set is one that does not have a
finite number of elements; for example, the set $\mathbb{Z}^+$ of all nonnegative
integers, that is,

$$\mathbb{Z}^+ = \{n : n \in \mathbb{Z}, \ n \ge 0\},$$

is an infinite set (as is $\mathbb{Z}$ itself).

3. The empty or null set is the set with no elements, and is denoted by
$\emptyset$. For example,

$$\{n : n \in \mathbb{Z}, \ n < 1 \text{ and } n > 1\} = \emptyset.$$

Subsets, equal sets. If A and B are two sets, we say that A is a subset
of B if each element of A is an element of B. This is denoted by

$$A \subset B.$$

According to this definition every set is, of course, a subset of itself, and so
in order to distinguish subsets that do not coincide with the set in question,
we say that A is a proper subset of B if A is indeed a subset of B and if,
furthermore, B also contains elements that do not belong to A. If it is
desirable to indicate that A is a subset of B which is possibly the set B
itself, we write

$$A \subseteq B.$$

If A is not a subset of B, this is indicated by

$$A \not\subset B.$$

For example, if A is given by (1.1) or (1.2), then C = {1, 2, 3} is a proper
subset of A, but $D \not\subset A$, where D = {1, 2, 6}.
Two sets A and B are equal if they contain exactly the same elements.
When this is the case, we write

$$A = B.$$

According to the definition of a subset, it is clear that two sets A and B
are equal if, and only if,

$$A \subset B \quad \text{and} \quad B \subset A.$$

For example, if

$$A = \{x : x^2 = 4\} \quad \text{and} \quad B = \{2, -2\},$$

then A = B.
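These definitions map directly onto Python's built-in `set` type, which we use here purely as an illustration. In the sketch below the set-builder description of the roots of x² = 4 is evaluated over a finite window of integers; that window, and the variable names, are assumptions of the example rather than part of the text.

```python
A = {1, 2, 3, 4, 5}          # the set in (1.1)
C = {1, 2, 3}
D = {1, 2, 6}

C_is_subset = C <= A         # subset (possibly equal)
C_is_proper_subset = C < A   # proper subset
D_is_subset = D <= A         # False, because 6 is not in A

# {x : x^2 = 4}, searched over the integers -10..10 only (assumed window).
roots = {x for x in range(-10, 11) if x * x == 4}
B = {2, -2}

# Equality is the same as mutual inclusion.
equal_by_inclusion = roots <= B and B <= roots
print(C_is_proper_subset, D_is_subset, roots == B, equal_by_inclusion)
```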

Union, intersection, difference. We make the assumption from now
on that all sets under discussion are subsets of a single fixed set called the
universal set, which is denoted by U. The definition of U varies from one
context to another.
The union of two sets A and B, written $A \cup B$, is the set consisting of
all elements that are in A or in B. That is,

$$A \cup B = \{x : x \in A \text{ or } x \in B\}.$$

This is shown graphically in Figure 1.1(a), in the form of a Venn diagram;
the universal set is represented by the rectangle and its subsets are points
or areas within the rectangle.
The intersection of two sets A and B, written $A \cap B$, is the set of all
elements that belong to both A and B (Figure 1.1(b)). In other words,

$$A \cap B = \{x : x \in A \text{ and } x \in B\}.$$

The difference of two sets A and B, written A - B, is the set of all elements
of A that do not belong to B (Figure 1.2(a)):

$$A - B = \{x : x \in A, \ x \notin B\}.$$



FIGURE 1.1. (a) The union and (b) the intersection of two sets

FIGURE 1.2. (a) The difference A - B of two sets A and B; (b) the complement
A' of a set A

The complement of a set A, denoted by A', is the set of all elements not in
A (Figure 1.2(b)). That is,

$$A' = \{x \in U : x \notin A\}.$$

Example
4. Let $A = \{x : x \in \mathbb{Z}, \ 1 \le x \le 10\}$ and $B = \{9, 10, 11, 12\}$. Then

$$A \cup B = \{x : x \in \mathbb{Z}, \ 1 \le x \le 12\},$$

$$A \cap B = \{9, 10\},$$

$$A - B = \{x : x \in \mathbb{Z}, \ 1 \le x \le 8\},$$

and

$$A' = \{\ldots, -3, -2, -1, 0, 11, 12, 13, \ldots\}$$

(the universal set is taken here to be $\mathbb{Z}$).
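Example 4 can be reproduced with Python sets. Since a Python set must be finite, the sketch below replaces the universal set of all integers by the finite window -3, ..., 13; this truncation, like the variable names, is an assumption of the illustration, so the computed complement shows only the part of A' that falls inside the window.

```python
A = set(range(1, 11))        # {1, ..., 10}
B = {9, 10, 11, 12}
U = set(range(-3, 14))       # finite stand-in for the universal set Z

union = A | B                # A ∪ B
intersection = A & B         # A ∩ B
difference = A - B           # A - B
complement = U - A           # A' relative to the truncated universal set

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
print(sorted(complement))
```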


Countable sets. It is necessary at times to know whether the elements
of an infinite set can be "labeled" by positive integers. In other words, we
wish to distinguish infinite sets of the form

$$\{a_1, a_2, a_3, \ldots\} \tag{1.3}$$

from those that cannot be labeled as in (1.3). A set that can be put in
one-to-one correspondence, in other words, labeled, with positive integers
is called a countable set. Of course, any finite set A is countable, for if A
has m members we may label them $a_1, a_2, \ldots, a_m$.

Examples
5. The set of functions

is countable.
6. As shown in the next section, sets such as the set of points

on the real line are not countable. That is, the set of real numbers
between 0 and 1 cannot be labeled in the form $a_1, a_2, \ldots$.

Cartesian products. Given two sets A and B, their Cartesian product
$A \times B$ is the set of all ordered pairs (a, b), where $a \in A$ and $b \in B$. That is,

$$A \times B = \{(a, b) : a \in A, \ b \in B\}.$$

For example, if

$$A = \{1, 2, 3\} \quad \text{and} \quad B = \{7, 8\}, \tag{1.4}$$

then

$$A \times B = \{(1, 7), (1, 8), (2, 7), (2, 8), (3, 7), (3, 8)\}.$$

If we think of the members of A as lying on a horizontal axis and the
members of B as lying on the vertical axis of a pair of Cartesian coordinate
axes, then $A \times B$ may be represented by a set of points in the plane, as
shown in Figure 1.3. The idea of the Cartesian product may be extended
to products of more than two sets. For example, the Cartesian product
$A_1 \times A_2 \times A_3 \times \cdots \times A_n$ is defined to be the set of all ordered n-tuples

$$(a_1, a_2, \ldots, a_n), \quad \text{where } a_i \in A_i \text{ for } i = 1, \ldots, n.$$

Note that, in general, $B \times A \ne A \times B$. For example, if A and B are as in
(1.4), then

$$B \times A = \{(7, 1), (7, 2), (7, 3), (8, 1), (8, 2), (8, 3)\} \ne A \times B.$$
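The Cartesian products above can be enumerated with `itertools.product` from the Python standard library. The sketch is only an illustration; the variable names are ours.

```python
from itertools import product

A = [1, 2, 3]
B = [7, 8]

AxB = set(product(A, B))     # ordered pairs (a, b) with a in A, b in B
BxA = set(product(B, A))     # ordered pairs (b, a)

# Products of more than two sets give ordered n-tuples, here triples.
AxBxA = set(product(A, B, A))

print(sorted(AxB))
print(sorted(BxA))
print(AxB == BxA, len(AxBxA))
```

The two products contain the same number of pairs but differ as sets, because the order within each pair matters.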

FIGURE 1.3. The Cartesian product A x B of the two sets in (1.4)

1.2 Sets of numbers


Although we encounter a wide variety of sets in this work, sets of numbers
are of particular importance, and pervade the developments that follow.
For this reason we set out in this section some of the salient definitions and
properties of sets of numbers.
We have already come across the set $\mathbb{Z}$ of all integers, defined by

$$\mathbb{Z} = \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}.$$

Occasionally we also make use of the set $\mathbb{N}$ of natural numbers or positive
integers:

$$\mathbb{N} = \{1, 2, 3, \ldots\}.$$

A rational number is a number that can be expressed as the ratio of two
integers. We denote the set of rational numbers by $\mathbb{Q}$, so that

$$\mathbb{Q} = \{x : x = p/q, \ p, q \in \mathbb{Z}, \ q \ne 0\}.$$

An important property of $\mathbb{Q}$, which we record here, is the following: the
set $\mathbb{Q}$ of rational numbers is a countably infinite set. In other words, the
rational numbers can be put in one-to-one correspondence with the positive integers.
You are asked to show this in Exercise 1.6 at the end of this chapter.
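One way to make the countability of the rationals plausible is to write down an explicit labeling. The generator below is a sketch of one such scheme, chosen by us and not necessarily the one intended in the exercise: it lists 0 first and then, for each value of p + q = 2, 3, 4, ..., the fractions p/q in lowest terms together with their negatives. Every rational number appears exactly once, so its position in the list serves as its label. The name `rationals` is hypothetical.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd


def rationals():
    """Yield every rational number exactly once."""
    yield Fraction(0)
    for s in count(2):                 # s = p + q
        for p in range(1, s):
            q = s - p
            if gcd(p, q) == 1:         # skip duplicates such as 2/2 or 2/4
                yield Fraction(p, q)
                yield Fraction(-p, q)


first_terms = list(islice(rationals(), 7))
print(first_terms)
```

Restricting to fractions in lowest terms is what keeps the correspondence one-to-one.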
Real numbers that do not belong to $\mathbb{Q}$ are called irrational numbers.
For example, $\sqrt{2}$ cannot be written in the form p/q for $p, q \in \mathbb{Z}$, and so is
irrational (see later, Section 5). This brings us to $\mathbb{R}$, the set of real numbers;
$\mathbb{R}$ consists of all rational as well as irrational numbers. It is convenient to
think of $\mathbb{R}$ as being represented by an infinitely long line, called the real
line. The origin is chosen to be at some point on this line, and once the
location of the number 1 is fixed, the scale will be determined, and every
real number then corresponds to a point on the real line. It should be clear
that $\mathbb{R}$ is an uncountable set.

Assuming the rational numbers to be known, the existence of the real
numbers is not, mathematically speaking at any rate, a fact that can be
deduced in an obvious way. The construction of $\mathbb{R}$ is a process that was
carried out in accordance with modern notions of rigor as recently as the
late nineteenth century. We omit the detailed arguments that are required
for a proper treatment of this subject, and appeal instead to well-known
and intuitively obvious properties of the real numbers.

Subsets of $\mathbb{R}$. Very often we deal not with the whole real line but only a
portion of it, called an interval. Thus, if a and b are two points on $\mathbb{R}$ such
that $a \le b$, then we define
the open interval $(a, b) = \{x : x \in \mathbb{R}, \ a < x < b\}$;
the closed interval $[a, b] = \{x : x \in \mathbb{R}, \ a \le x \le b\}$;
the half-open intervals $(a, b] = \{x : x \in \mathbb{R}, \ a < x \le b\}$ and
$[a, b) = \{x : x \in \mathbb{R}, \ a \le x < b\}$.
Thus the terms "open" and "closed" indicate, respectively, that the endpoints
of the interval are excluded from or included in the set. There are
more technical definitions of open and closed sets however, which, although
consistent with the preceding definitions, are mathematically more sound.
We discuss these shortly.

Complex numbers. The set of complex numbers is denoted by $\mathbb{C}$, and is
defined to be the set of numbers of the form $z = a + bi$, where $i = \sqrt{-1}$ and
a and b are real numbers. The number a is called the real part of z, and is
written Re(z), whereas b is called the imaginary part of z and is written
Im(z). The modulus of a complex number z is defined as $(a^2 + b^2)^{1/2}$, and is
written $|z|$. The complex conjugate of z is the complex number $\bar{z} = a - bi$, so
that $|z|^2 = z\bar{z}$. Complex numbers are conveniently represented graphically
with respect to a pair of axes that correspond to the real and imaginary
parts, respectively, as shown in Figure 1.4; this is referred to as the complex
plane. Clearly one may set up a correspondence between $\mathbb{C}$ and the
Cartesian product $\mathbb{R} \times \mathbb{R}$.
Continuing the geometrical interpretation, the argument $\theta$ of a complex
number $z = a + bi$ is the angle defined by

$$\theta \equiv \arg z = \arctan(b/a), \qquad -\pi < \theta \le +\pi.$$

The last condition ensures that $\theta$ is uniquely defined.
There is a close relationship, in the complex plane, between the exponential
function exp and trigonometric functions. Indeed, we have

$$\exp(i\theta) \equiv e^{i\theta} = \cos\theta + i \sin\theta,$$

and so every complex number $z = a + bi$ can be written in polar form as

$$z = r e^{i\theta},$$

FIGURE 1.4. Graphical representation of complex numbers

FIGURE 1.5. Neighborhood of a point c in $\mathbb{R}$ and of a complex number w

where $r = |z|$.
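The quantities just defined are available in Python's standard `cmath` module, which we use here as an illustration. One point deserves care: read literally, arctan(b/a) only returns angles between -π/2 and π/2, so a quadrant correction is needed to obtain an argument in the stated range; the two-argument arctangent used by `cmath.phase` supplies it. The sample value z = -1 + i and the variable names are assumptions of the sketch.

```python
import cmath
import math

z = complex(-1.0, 1.0)             # a = -1, b = 1 (second quadrant)
a, b = z.real, z.imag

modulus = abs(z)                   # (a^2 + b^2)^(1/2)
conjugate = z.conjugate()          # a - bi
theta = cmath.phase(z)             # atan2(b, a); here 3*pi/4
naive = math.atan(b / a)           # principal arctan only; here -pi/4

polar = cmath.rect(modulus, theta) # r * exp(i * theta), should reproduce z
euler = complex(math.cos(theta), math.sin(theta))  # cos(theta) + i sin(theta)

print(modulus, theta, naive)
print(polar, cmath.exp(1j * theta), euler)
```

For this z the naive arctangent lands in the fourth quadrant, whereas the phase places the angle in the second quadrant where z actually lies.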
Open sets. Given any point c on the real line and a positive number $\epsilon$, the
open interval $(c - \epsilon, c + \epsilon) = \{x : c - \epsilon < x < c + \epsilon\}$ is called a neighborhood
of c. Likewise, if w is any complex number and $\epsilon$ a positive real number,
a neighborhood of w is the set $\{z \in \mathbb{C} : |w - z| < \epsilon\}$. Neighborhoods are
illustrated in Figure 1.5.
Now, let $\mathbb{K}$ represent either $\mathbb{R}$ or $\mathbb{C}$, and let X be a subset of $\mathbb{K}$. Then c
is called an interior point of X if we can find a neighborhood of c, all of
whose points belong to X. A set $X \subset \mathbb{K}$ is called an open set if every point
of X is an interior point.

Examples
7. The open interval $(a, b)$ is an open set: for any point $c$ in $(a, b)$ we
can define a neighborhood lying entirely in $(a, b)$ by choosing $\epsilon$ to be
less than both $|c - a|$ and $|c - b|$. Thus every point in $(a, b)$ is an interior
point.
On the other hand, the closed interval $[a, b]$ is not an open set: the
points $a$ and $b$ are such that, no matter how small we choose $\epsilon$, it
is not possible to find neighborhoods of $a$ and $b$, all of whose points
lie in $[a, b]$. Thus $a$ and $b$ are not interior points and so $[a, b]$ is not
open. Similar considerations apply to the half-open intervals $[a, b)$
and $(a, b]$; the points $a$ and $b$, respectively, are not interior points.

8. The real line $\mathbb{R}$ is an open set since every point in $\mathbb{R}$ has a neighborhood
that lies in $\mathbb{R}$.
9. A simple example of an open set in $\mathbb{C}$ is the disk of radius $r$ and
center $z_0$, defined by $D(z_0; r) = \{z \in \mathbb{C} : |z - z_0| < r\}$.

Points of accumulation and closed sets. In order to give a rigorous
definition of closed sets in $\mathbb{R}$ and $\mathbb{C}$ we define first a point of accumulation.
Let $X$ be a set in $\mathbb{K}$, where $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$, and $c$ a point in $\mathbb{K}$ ($c$ does
not necessarily belong to $X$). Then $c$ is called a point of accumulation of
$X$ if every neighborhood of $c$ contains at least one point of $X$ distinct from
$c$. Furthermore, a set $X \subset \mathbb{K}$ is a closed set if it contains all of its points of
accumulation. It is possible to show also (see Exercise 1.9) that a set $X$ is
closed if and only if its complement $X' = \mathbb{K} - X$ (here the universal set is
$\mathbb{K}$) is open.
Finally, we define the closure $\bar{X}$ of a set $X \subset \mathbb{K}$ to be the union of $X$
and all its points of accumulation. Clearly $\bar{X}$ is a closed set; however, if $X$
is not closed, the operation of closure is a means of constructing the closed
set that is nearest to $X$.

Examples

10. The set {1,~, 1,~, 1,~, ... } has two points of accumulation, namely,
1 and 0. Since these do not belong to the set, it is not closed. On the
other hand, the closed set {1, 2, 3, ... } has no points of accumulation.

11. Consider the interval $(a, b)$: according to the preceding definition,
every point in $(a, b)$ is a point of accumulation. Furthermore, $a$ and
$b$ are also points of accumulation of $(a, b)$ since every neighborhood
of these two points contains members of $(a, b)$. But $a$ and $b$ do not
belong to the set, and so it is not closed. The closure of $(a, b)$, on the
other hand, is $[a, b]$.

12. The closed interval $[a, b]$ is a closed set.

13. The unit circle $S = \{z \in \mathbb{C} : |z| = 1\}$ is a closed set in $\mathbb{C}$; every
member of $S$ is a point of accumulation of $S$.

Sequences of numbers. The concept of a sequence is central to analysis.
In due course we deal with sequences in fairly arbitrary spaces; to set the
stage for these later developments, and in order to be able to deal here
with some further ideas which are pertinent to real or complex numbers,
we introduce sequences in this elementary setting. First, however, we clarify
the manner in which the symbol $\infty$ is employed.
The nature of $\infty$, which represents infinity, is often misunderstood. To
begin with, $\infty$ is not a number: it is merely a means of representing
unboundedness. For example, the set of positive even integers can be written
as $\{2, 4, \ldots\}$ or as $\{2n\}_{n=1}^{\infty}$, the second representation indicating that $n$
increases indefinitely. Another example concerns the real line: since $(a, b)$
denotes an open interval in $\mathbb{R}$, using this same notation we may write
$$\mathbb{R} = (-\infty, \infty)$$
to indicate that the "interval" corresponding to $\mathbb{R}$ is not bounded.


Let $\mathbb{K}$ be, as before, either $\mathbb{R}$ or $\mathbb{C}$, and let $X$ be any subset of $\mathbb{K}$. A
sequence $\{u_1, u_2, \ldots, u_n, \ldots\}$ in $X$ is a countable set of elements of $X$ with
the general element $u_n$ of $X$ being associated with a positive integer $n$. If
the sequence has a finite number of elements, it is called a finite sequence;
otherwise it is called an infinite sequence.
Most of the time we deal with infinite sequences, and we generally use the
notation $\{u_n\}_{n=1}^{\infty}$ or simply $\{u_n\}$ to denote the infinite sequence
$\{u_1, u_2, \ldots, u_n, \ldots\}$.

Example

14. Sequences may be described either by displaying them or by giving a
formula for determining the general element. For example, let $X$ be
the closed interval $[0,1]$; the sequence
$$\{1, 0, \tfrac{1}{2}, 0, \tfrac{1}{3}, 0, \ldots\}$$
is defined by actually displaying the first few elements; alternatively,
it could be defined by stating that the $n$th term $x_n$ of the sequence
is
$$x_n = \begin{cases} 0 & \text{for even } n, \\ 2/(n+1) & \text{for odd } n. \end{cases}$$

Ultimately what is of most interest about sequences is the way in which
they behave as $n$ gets progressively larger; this brings us to the next topic,
namely, that of convergence of sequences.

Convergence of sequences. We begin with sequences in $\mathbb{R}$. Consider
then the sequence of real numbers $\{x_n\} = \{1/n\}_{n=1}^{\infty}$. As $n$ gets larger
the term $x_n$ gets closer to zero and we say, loosely at this stage, that
the sequence converges to 0. On the other hand, the sequence
$\{2^n\} = \{2, 4, 8, \ldots\}$ increases indefinitely: no matter what number $N$ we specify, it
will always be possible to choose a value of $n$ such that $2^n$ will be greater
than $N$. This sequence is said to diverge.
The preceding examples behave in a fairly obvious way, and so they could
be discussed without recourse to a rigorous definition of convergence. Later
on it is necessary to discuss convergence of arbitrary sequences in normed
spaces, and the definitions that are introduced in that context are fairly
obvious generalizations of the definitions used in the rather specialized case
of real and complex numbers.
Let $X$ be a subset of $\mathbb{K}$, and $\{u_n\}$ a sequence in $X$. Let $u$ be a number in
$X$, and form the sequence $\{|u_1 - u|, |u_2 - u|, \ldots, |u_n - u|, \ldots\}$ (in the case
$\mathbb{K} = \mathbb{C}$, $|\cdot|$ of course represents the modulus of a complex number). If the
number $|u_n - u|$ approaches 0 as $n$ gets larger, we agree to call the sequence
convergent. Another, more formal, way of stating this is as follows. Pick any
positive number $\epsilon$; then $\{u_n\}$ is said to converge to $u$ if it is always possible
to make $|u_n - u|$ smaller than $\epsilon$ simply by choosing $n$ large enough, larger
than some number $N$, say. That is, a sequence $\{u_n\}$ in a subset $X$ of $\mathbb{K}$
is convergent if there is a member $u \in X$ such that, given any $\epsilon > 0$, a
number $N$, possibly depending on $\epsilon$, can be found such that

$$|u_n - u| < \epsilon \quad \text{for all } n > N. \tag{1.5}$$


If this is the case, we write $u_n \to u$ (which is read "$u_n$ converges to $u$"),
and $u$ is called the limit of the sequence. Yet another way of stating (1.5)
informally is

$$\lim_{n \to \infty} |u_n - u| = 0 \quad \text{or} \quad \lim_{n \to \infty} u_n = u, \tag{1.6}$$

which is read "the limit as $n$ tends to $\infty$ of $u_n$ is $u$". Note, however, that
by (1.6) we mean (1.5); it is also worth noting that the symbol $\lim_{n \to \infty}$ is
synonymous with "$n$ becomes arbitrarily large".

Examples
15. Consider the sequence $\{a_n\} = \{(3n^2 - 1)/(n^2 - 5n)\}_{n=6}^{\infty}$. As $n$ gets
very large we would expect this sequence to approach the limit 3
(since the terms $3n^2$ and $n^2$ dominate the numerator and denominator,
respectively). We check this by asking whether, for any $\epsilon > 0$, a
number $N$ can be found such that
$$|a_n - 3| = \left|\frac{15n - 1}{n^2 - 5n}\right| < \epsilon$$
whenever $n > N$. This is equivalent to seeking $N$ such that

$$\epsilon(n^2 - 5n) > 15n - 1 \quad \text{or} \quad \epsilon n^2 - 5(\epsilon + 3)n + 1 > 0 \tag{1.7}$$

for $n > N$. Denote the left-hand side of the second inequality in (1.7) by $f(n)$
and treat $n$ as a real number; then the graph of $f(n)$ is as shown in Figure 1.6,
and $f$ has roots $n_1$ and $n_2$, with $n_2 > 5$ (since $f(5) = -74 < 0$). If we choose
$N = n_2$, then clearly $f(n) > 0$ for $n > N$, or $|a_n - 3| < \epsilon$ for $n > N$, so
that $a_n \to 3$.

FIGURE 1.6. The function $f(n)$ in Example 15

FIGURE 1.7. The sequence in Example 16

16. Consider the sequence $\{z_n \in \mathbb{C} : z_n = a^n e^{in\theta}\}_{n=0}^{\infty}$ in which $a$ and
$\theta$ are fixed, and $0 < a < 1$. The sequence is shown in Figure 1.7.
To confirm that $z_n \to 0$ as $n \to \infty$, we choose $\epsilon > 0$ and consider
whether a number $N$ can be found such that $|z_n| < \epsilon$ when $n > N$.
Since $|z_n| = a^n$, this amounts to checking whether there exists $N$
such that $a^n < \epsilon$ for all $n > N$. The answer is affirmative; indeed, by
taking the logarithms of both sides we see that $a^n < \epsilon$ holds when
$n > \log\epsilon / \log a$ (the inequality reverses because $\log a < 0$), so it
suffices to choose $N = \log\epsilon / \log a$.
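
The thresholds found in Examples 15 and 16 can be spot-checked numerically. The sketch below is an illustration only: the function names, the value $\epsilon = 0.01$, the choice $a = 0.9$, and the finite range of $n$ that is tested are all assumptions made here, and a finite check is of course not a proof of convergence.

```python
import math


def a(n: int) -> float:
    """Example 15: a_n = (3n^2 - 1) / (n^2 - 5n), defined for n >= 6."""
    return (3 * n**2 - 1) / (n**2 - 5 * n)


def threshold_15(eps: float) -> float:
    """Larger root of f(n) = eps*n^2 - 5*(eps + 3)*n + 1, used as N."""
    b = 5 * (eps + 3)
    return (b + math.sqrt(b * b - 4 * eps)) / (2 * eps)


def threshold_16(eps: float, a_mod: float) -> float:
    """Example 16: N = log(eps) / log(a) for 0 < a < 1."""
    return math.log(eps) / math.log(a_mod)


eps = 0.01

# Example 15: every tested n beyond N should satisfy |a_n - 3| < eps.
N15 = threshold_15(eps)
start15 = max(6, math.floor(N15) + 1)
ok15 = all(abs(a(n) - 3) < eps for n in range(start15, start15 + 5000))

# Example 16: |z_n| = a^n, so it is enough to test a^n < eps beyond N.
a_mod = 0.9
N16 = threshold_16(eps, a_mod)
start16 = math.floor(N16) + 1
ok16 = all(a_mod**n < eps for n in range(start16, start16 + 5000))
```

In both cases the term with index equal to the integer part of $N$ still violates the inequality, which shows that these particular thresholds cannot be lowered by a whole step.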
FIGURE 1.8. Upper and lower bounds of a set $A$

Supremum and infimum. Suppose that the set $A$ is a subset of $\mathbb{R}$: if
there is a real number $p$ such that $p \ge x$ for all points $x$ in $A$, then we call
$p$ an upper bound of $A$, and say that $A$ is bounded above by $p$. Similarly,
if there is a real number $q$ such that $q \le x$ for all $x$ in $A$, then $q$ is called a
lower bound of $A$, and we say that $A$ is bounded below by $q$. Note that $A$
can have many upper and lower bounds. If $A$ has both an upper and a lower
bound, then it is said to be bounded. Thus another way of characterizing a
bounded set $A$ is as one for which there exists a positive number $M$ such
that

$$|x| \le M \quad \text{for all } x \in A.$$

Now suppose that there is a number $m$ which belongs to $A$ and that also
is an upper bound of $A$. We call $m$ the maximum of the set $A$ and we write

$$\max A = m.$$

Similarly, if there is a number $n$ that belongs to $A$ and which, furthermore,
is a lower bound of $A$, then this number is called the minimum of $A$ and
we write

$$\min A = n.$$

Examples
17. Let $A$ be the closed unit interval $[0,1] = \{x : x \in \mathbb{R},\ 0 \le x \le 1\}$.
Then any number $a \ge 1$ is an upper bound, any number $b \le 0$ is a
lower bound, and

$$\max A = 1, \qquad \min A = 0.$$

18. Let $A = (0,1) = \{x : x \in \mathbb{R},\ 0 < x < 1\}$. In this case $A$ has no
maximum or minimum, although it is bounded; the numbers 0 and 1
are lower and upper bounds, respectively, but do not belong to $A$.
The preceding examples illustrate once again the essential difference between
closed and open intervals: closed intervals have minima and maxima
whereas open intervals do not. Still, we would like to be able to express the
fact that, from the point of view of boundedness, a set such as $(0,1)$ is not
that different from $[0,1]$ in that it does have a least upper bound, which is
the smallest of all its upper bounds, and a greatest lower bound, which is
the largest of all its lower bounds, even though these bounds do not belong
to the set.
In general the supremum or least upper bound of a set $A$ is a number
$p'$ which is an upper bound of $A$, and which satisfies $p' \le p$ for all upper
bounds $p$. When $p'$ exists, we write

$$p' = \sup A.$$

Similarly, the infimum or greatest lower bound of $A$ is a number $q'$ which is
a lower bound of $A$, and which satisfies $q' \ge q$ for all lower bounds $q$. We
normally write this as

$$q' = \inf A.$$

We note that when $\max A$ exists then clearly

$$\max A = \sup A.$$

Likewise, if $\min A$ exists, then

$$\min A = \inf A.$$

Examples

19. Let $A = (0,1]$. Then $\max A = \sup A = 1$, and $\inf A = 0$ although
$\min A$ does not exist.

20. Let $A$ be the positive real line $\mathbb{R}^+ = \{x : x \in \mathbb{R},\ x \ge 0\}$. Then
$\inf \mathbb{R}^+ = \min \mathbb{R}^+ = 0$ and $\sup \mathbb{R}^+$ does not exist, since $\mathbb{R}^+$ is not
bounded above.

The Bolzano-Weierstrass theorem. The question naturally arises as
to whether it is possible to characterize those subsets for which all sequences
contain a point of accumulation in the subset. The answer, in the case of
subsets of $\mathbb{R}$, is surprisingly straightforward: all that is required is for the
set to be closed and bounded. This result is the substance of the following
theorem.

THEOREM 1 (THE BOLZANO-WEIERSTRASS THEOREM). Let $[a,b]$ be a
closed and bounded interval on the real line, and $\{x_n\}$ a sequence in $[a, b]$.
Then this sequence has a point of accumulation in $[a, b]$.

PROOF. Let $c_1$ be the infimum or greatest lower bound of the sequence
$\{x_1, x_2, \ldots\}$. Next, let $c_2$ be the infimum of the sequence $\{x_2, x_3, \ldots\}$.
Continuing in this way, we denote by $c_n$ the infimum of the sequence starting
with $x_n$, that is, $\{x_n, x_{n+1}, \ldots\}$.
Clearly $\{c_1, c_2, \ldots\}$ is a monotone increasing sequence, and furthermore
this sequence is bounded above (by $b$). It follows (Exercise 1.14) that this
sequence $\{c_n\}$ converges to a limit $c$, say, which lies in $[a, b]$. We show next
that $c$ is in fact a point of accumulation of the sequence $\{x_n\}$.
To do this, choose any $\epsilon > 0$, and choose also a positive integer $N$; then
there exists $m \ge N$ such that

$$|c_m - c| < \epsilon.$$

Now $c_m$ is the infimum of the set of numbers $\{x_m, x_{m+1}, \ldots\}$, so that there
exists $k \ge m$ such that

$$|x_k - c_m| < \epsilon.$$

Hence it follows, using the triangle inequality, that

$$|x_k - c| \le |x_k - c_m| + |c_m - c| < 2\epsilon.$$

The assertion is thus proved. $\Box$


The Bolzano-Weierstrass theorem is in fact valid for arbitrary closed and
bounded subsets in $\mathbb{R}$, and in $\mathbb{C}$. This may be shown by modifying the
preceding proof appropriately, and is left as an exercise.
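
The construction used in the proof, namely the infima $c_n$ of the tails of the sequence, can be imitated on a finite stretch of a sequence. The Python sketch below is only an illustration under stated assumptions: the sequence $x_n = (-1)^n(1 + 1/n)$ in $[-2, 2]$ is a hypothetical example chosen here, the function name tail_infima is invented, and minima over a finite list merely stand in for true infima of infinite tails.

```python
def tail_infima(xs):
    """c[i] = min(xs[i:]): a finite stand-in for inf{x_n, x_{n+1}, ...}."""
    c = [0.0] * len(xs)
    running = float("inf")
    for i in range(len(xs) - 1, -1, -1):   # sweep from the end of the list
        running = min(running, xs[i])
        c[i] = running
    return c


# Hypothetical example sequence in [a, b] = [-2, 2]: x_n = (-1)^n (1 + 1/n).
xs = [(-1) ** n * (1 + 1 / n) for n in range(1, 2001)]
c = tail_infima(xs)

# The tail infima never decrease, as the proof asserts.
monotone = all(c[i] <= c[i + 1] for i in range(len(c) - 1))

# For this x_n the tail infima approach -1; look for a term x_k, with index
# at least N, that lies within 2*eps of that limit.
limit = -1.0
eps, N = 0.01, 500
k = next(i for i in range(N, len(xs)) if abs(xs[i] - limit) < 2 * eps)
```

Here the value $-1$ plays the role of the point of accumulation $c$ in the proof; it is supplied by hand for this particular sequence rather than computed.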

Compactness. A set $X \subset \mathbb{K}$ is said to be compact if every sequence of
elements in $X$ has a point of accumulation in $X$. In other words, a set is
called compact if it has a property which by Theorem 1 is possessed by all
closed and bounded intervals. It is in fact a key property of $\mathbb{K}$ that subsets
of $\mathbb{K}$ are compact if and only if they are closed and bounded:

compact $\equiv$ closed + bounded.

1.3 $\mathbb{R}^n$ and its subsets

In the previous section we dealt in some detail with subsets of the real line
or intervals. Here we extend some of the concepts to higher-dimensional
regions. We start with a description of $\mathbb{R}^2$, which is defined by $\mathbb{R}^2 = \mathbb{R} \times \mathbb{R}$
so that members of $\mathbb{R}^2$ are ordered pairs of real numbers:

$$\mathbb{R}^2 = \{(x, y) : x, y \in \mathbb{R}\}.$$

Just as $\mathbb{R}$ may be represented geometrically by a line, with members of $\mathbb{R}$
being points on the line, in the same way $\mathbb{R}^2$ may be thought of as a plane
extending indefinitely in all directions. If we use the notation $\mathbf{x} = (x, y)$
to denote a typical member of $\mathbb{R}^2$, then clearly $\mathbf{x}$ represents a point in the
plane with coordinates $(x, y)$, as shown in Figure 1.9. This plane is known
as the Cartesian plane.

FIGURE 1.9. The Cartesian plane $\mathbb{R}^2$
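The situation just described is easily generalized to higher dimensions: for
example, the set $\mathbb{R}^3 \equiv \mathbb{R} \times \mathbb{R} \times \mathbb{R}$ is the set of all ordered triples $\mathbf{x} = (x, y, z)$
of real numbers; that is,

$$\mathbb{R}^3 = \{\mathbf{x} = (x, y, z) : x, y, z \in \mathbb{R}\}.$$

As with $\mathbb{R}^2$, we simply represent a typical member of $\mathbb{R}^3$ by $\mathbf{x}$. Geometrically
we can regard $\mathbb{R}^3$ as being synonymous with three-dimensional space,
any member $\mathbf{x} \in \mathbb{R}^3$ being a point in this space with coordinates $x$, $y$, and
$z$.
Generally, we define $\mathbb{R}^n$ to be the set of all ordered $n$-tuples of real numbers:

$$\mathbb{R}^n = \{\mathbf{x} = (x_1, x_2, \ldots, x_n) : x_i \in \mathbb{R},\ i = 1, \ldots, n\}.$$

When working in two or three dimensions it is often convenient to use the
alternative notations

$$\mathbf{x} = (x, y) \ \text{for } \mathbf{x} \in \mathbb{R}^2 \quad \text{and} \quad \mathbf{x} = (x, y, z) \ \text{for } \mathbf{x} \in \mathbb{R}^3,$$

depending on circumstances.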

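Open sets in $\mathbb{R}^n$. The generalization to higher dimensions of the interval
on the real line is the domain in $\mathbb{R}^n$; in order to describe exactly what we
mean by a domain, however, it is necessary first of all to generalize to $\mathbb{R}^n$
the definition of an open set introduced earlier. Recall that a neighborhood
of a point $c$ in $\mathbb{R}$ is an open interval of points $x$ satisfying $|x - c| < \epsilon$. Now
we can read this inequality as "the distance from $x$ to $c$ is less than $\epsilon$", and
so it follows that all we need in order to extend the idea of a neighborhood
to $\mathbb{R}^n$ is a means of measuring distance. In $\mathbb{R}^2$ the distance between two
points $\mathbf{x}$ and $\mathbf{y}$ is defined by

$$|\mathbf{x} - \mathbf{y}| = \left[(x_1 - y_1)^2 + (x_2 - y_2)^2\right]^{1/2},$$

where $(x_1, x_2)$ are the coordinates of $\mathbf{x}$ and $(y_1, y_2)$ the coordinates of $\mathbf{y}$;
hence, we can define a neighborhood of a point $\mathbf{c}$ in $\mathbb{R}^2$ to be the set of
points that are a distance less than $\epsilon$ away from $\mathbf{c}$, for some $\epsilon > 0$; that is,
if we denote a neighborhood of $\mathbf{c}$ by $N(\mathbf{c}; \epsilon)$, then

$$N(\mathbf{c}; \epsilon) = \{\mathbf{x} : \mathbf{x} \in \mathbb{R}^2,\ |\mathbf{x} - \mathbf{c}| < \epsilon\},$$

and $\epsilon$ is called, for obvious reasons, the radius of the neighborhood (Figure
1.10). We immediately generalize to $\mathbb{R}^n$ and define a neighborhood of a
point $\mathbf{c}$ in $\mathbb{R}^n$ to be the set

$$N(\mathbf{c}; \epsilon) = \{\mathbf{x} : \mathbf{x} \in \mathbb{R}^n,\ |\mathbf{x} - \mathbf{c}| < \epsilon\}$$

where $\epsilon$ is the radius of the neighborhood and the distance $|\mathbf{x} - \mathbf{c}|$ from $\mathbf{x}$
to $\mathbf{c}$ is defined by

$$|\mathbf{x} - \mathbf{c}| = \left[(x_1 - c_1)^2 + (x_2 - c_2)^2 + \cdots + (x_n - c_n)^2\right]^{1/2}.$$

FIGURE 1.10. A neighborhood of the point $\mathbf{c}$ in $\mathbb{R}^2$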
Now that we have at our disposal the concept of a neighborhood in $\mathbb{R}^n$,
we can define open sets in $\mathbb{R}^n$ simply by modifying appropriately the definition
given in Section 1.2 for subsets of $\mathbb{R}$; specifically, $\mathbf{c} \in \mathbb{R}^n$ is called
an interior point of a set $\Omega$ of points in $\mathbb{R}^n$ if we can always find a neighborhood
of $\mathbf{c}$, all of whose points belong to $\Omega$. The situation is depicted
in Figure 1.11 for the case $n = 2$, where $\Omega$ is defined to be the set of all
points lying inside but not on the curve $\Gamma$. We can define a neighborhood
$N(\mathbf{c}; \epsilon)$ lying entirely inside $\Omega$ by choosing $\epsilon$ to be less than or equal to $d$,
the shortest distance from $\mathbf{c}$ to the boundary $\Gamma$. Finally, a subset $\Omega$ of $\mathbb{R}^n$
is an open set if every point of $\Omega$ is an interior point.

FIGURE 1.11. A neighborhood of $\mathbf{c}$, an interior point of $\Omega$

Points of accumulation and closed sets in $\mathbb{R}^n$. As with open sets,
closed sets in $\mathbb{R}^n$ are defined in much the same way as their counterparts
in $\mathbb{R}$. Specifically, if $\Omega$ is a subset of $\mathbb{R}^n$ and $\mathbf{c}$ is a point in $\mathbb{R}^n$ (not necessarily
in $\Omega$ though), then $\mathbf{c}$ is called a point of accumulation of $\Omega$ if
every neighborhood of $\mathbf{c}$ contains at least one point of $\Omega$ distinct from $\mathbf{c}$.
Then, if the set $\Omega$ contains all of its points of accumulation, it is called
a closed set. The closure $\bar{\Omega}$ of a set $\Omega$ is defined to be the union of $\Omega$
and its points of accumulation. The boundary $\mathrm{bdy}\,\Omega$ of $\Omega$ is defined by
$\mathrm{bdy}\,\Omega = \bar{\Omega} - \{\text{interior points of } \Omega\}$.

Example

21. The unit square $\bar{\Omega} = \Omega \cup \Gamma = \{\mathbf{x} : \mathbf{x} \in \mathbb{R}^2,\ 0 \le x_1 \le 1,\ 0 \le x_2 \le 1\}$
is closed (see the previous example); however $\Omega$ is not closed since all
the points lying on $\Gamma$ are limit points of $\Omega$ but do not belong to $\Omega$.

Compactness. A set $\Omega \subset \mathbb{R}^n$ is said to be compact if every sequence of
elements in $\Omega$ has a point of accumulation in $\Omega$. As in the case of $\mathbb{R}$ and $\mathbb{C}$,
$\Omega$ is compact if and only if it is closed and bounded.

Domains in $\mathbb{R}^n$. We now describe the kinds of sets in $\mathbb{R}^n$ that are of
greatest relevance. First, we define a connected set $\Omega$ in $\mathbb{R}^n$ to be a set
which has the property that every pair of points in $\Omega$ can be connected by
a curve that lies entirely in $\Omega$. Examples of connected and disconnected sets
are shown in Figure 1.12.
We define next a domain in $\mathbb{R}^n$ to be an open connected set in $\mathbb{R}^n$.
Domains are central to the consideration of boundary value problems, as the
examples in the Introduction indicate. Our interest is exclusively confined
to domains in $\mathbb{R}$, $\mathbb{R}^2$, and occasionally in $\mathbb{R}^3$; in the case of $\mathbb{R}^2$ and $\mathbb{R}^3$ the
boundary $\Gamma$ (that is, the curve (in $\mathbb{R}^2$) or surface (in $\mathbb{R}^3$) within which all
points of the domain lie) is assumed to be sufficiently smooth, in the sense
that it possesses no cusps or suchlike singularities. Examples of admissible
and inadmissible domains are also shown in Figure 1.12.
Later, it is necessary to be more precise about what is meant by an
admissible domain, and there we define what is called a Lipschitz domain;
this is in a sense the standard "nice" domain with which one works in the
context of boundary value problems.
FIGURE 1.12. Connected and disconnected sets, and admissible and inadmissible
domains

1.4 Relations, equivalence classes, and Zorn's lemma

Having acquired some familiarity with sets of a simple nature, we return
now to abstract set theory, and develop some ideas that are useful later.
We return to the notion of ordered pairs, and the concept of a relation.

Relations. Formally, if we have two sets $A$ and $B$ with Cartesian product
$A \times B$, then a relation is a subset of $A \times B$. However, this formal definition
obscures an intuitively simple interpretation of what constitutes a
relation. Essentially we wish to formalize the notion that, given two sets,
some members of the one set may be related to some members of the other
in a special way. A few examples should help to clarify these ideas.

Examples

22. Let $A$ be the set of all men (in a given community, say), and $B$ the
set of all women. Then for $a \in A$ and $b \in B$, "$a$ is the husband of $b$"
defines a relation on $A \times B$.
23. Let $A = \{2, 3, 4\}$, $B = \{3, 4, 5, 6\}$, and consider the relation "$y$ is
divisible by $x$", for $(x, y) \in A \times B$. Then the subset making up this
relation is

$$\{(2,4), (2,6), (3,3), (3,6), (4,4)\}.$$
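
The relation of Example 23 can be generated mechanically as a subset of the Cartesian product. The Python sketch below is an illustration only; the variable names are chosen here and a relation is represented as a set of ordered pairs (tuples).

```python
from itertools import product

A = {2, 3, 4}
B = {3, 4, 5, 6}

# The Cartesian product A x B as a set of ordered pairs.
cartesian = set(product(A, B))

# The relation "y is divisible by x" picks out a subset of A x B.
relation = {(x, y) for (x, y) in cartesian if y % x == 0}
```

Representing the relation as a set of pairs mirrors the formal definition: a relation is nothing more than a subset of $A \times B$.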

The kinds of relations that are particularly useful are those that have
some well-defined structure built into them, and we now consider some
of these special types of relations, with particular reference to the case in
which the ordered pairs come from a single set; that is, given a set $A$,
we consider relations on $A \times A$. We use the notation "$\sim$" to indicate the
relationship between two elements of a set $A$.
Given a set $A$, a relation $\sim$ on $A$ is
reflexive if $a \sim a$,
symmetric if $a \sim b$ implies that $b \sim a$,
antisymmetric if $a \sim b$ and $b \sim a$ imply that $a = b$,
transitive if $a \sim b$ and $b \sim c$ imply that $a \sim c$,
for all $a, b, c \in A$.

Equivalence relations. A relation that is reflexive, symmetric, and transitive
is called an equivalence relation.

Partial and linear orderings. A relation $\sim$ on a set $A$ is a partial ordering
if it is reflexive, antisymmetric, and transitive. It is conventional to
denote a partial ordering by the suggestive symbol "$\le$" rather than the
generic "$\sim$", since the standard operation "$\le$" on the set $\mathbb{R}$ in fact defines
a partial ordering on that set.
Finally, a partially ordered set is a pair $(A, \le)$, where $A$ is a set and $\le$ is
a partial ordering on $A$.
If $(A, \le)$ is a partially ordered set and $\le$ also satisfies the condition

$$x \le y \ \text{or} \ y \le x \quad \text{for every } x, y \in A,$$

then the set $A$ is said to be linearly ordered, and $\le$ is called a linear ordering
on $A$.

Examples
24. Consider the relation "$<$" on the real line. This is not reflexive since,
for any real number $x$, $x \not< x$. It is also not symmetric, though it is
transitive: $x < y$ and $y < z$ imply that $x < z$.
25. As mentioned earlier, the operation "$\le$" defines a partial ordering on
$\mathbb{R}$; it is reflexive ($x \le x$), antisymmetric ($x \le y$ and $y \le x$ imply that
$x = y$), and transitive ($x \le y$ and $y \le z$ imply that $x \le z$). Note that
it is not, however, an equivalence relation, since it is not symmetric
($x \le y$ does not imply that $y \le x$).

26. Let $F$ be a family of sets; that is, $F$ is a set whose members are
themselves sets. Then set inclusion $\subset$ is a partial ordering on $F$; note
in particular that for any two sets $A$ and $B$ in $F$, $A \subset B$ and $B \subset A$
imply that $A = B$.
FIGURE 1.13. Illustration of the concept of a partition

27. Let $A$ be the set of triangles in the plane, and let "$\sim$" be the relation
defined by "is similar to". Then this defines an equivalence relation
on $A$.
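
The four defining properties can be tested by brute force on a finite set. The following Python sketch is illustrative and rests on stated assumptions: the helper names are invented here, the set $A$ is a small sample of integers rather than the whole real line, and a check on a finite sample can refute a property but cannot establish it for an infinite set.

```python
from itertools import product


def is_reflexive(A, rel):
    return all(rel(a, a) for a in A)


def is_symmetric(A, rel):
    return all(rel(b, a) for a, b in product(A, repeat=2) if rel(a, b))


def is_antisymmetric(A, rel):
    return all(a == b for a, b in product(A, repeat=2)
               if rel(a, b) and rel(b, a))


def is_transitive(A, rel):
    return all(rel(a, c) for a, b, c in product(A, repeat=3)
               if rel(a, b) and rel(b, c))


def classify(A, rel):
    """Return the four properties of rel on A as a dict of booleans."""
    return {
        "reflexive": is_reflexive(A, rel),
        "symmetric": is_symmetric(A, rel),
        "antisymmetric": is_antisymmetric(A, rel),
        "transitive": is_transitive(A, rel),
    }


A = list(range(-3, 4))   # finite sample: -3, -2, ..., 3
less_than = classify(A, lambda x, y: x < y)
less_equal = classify(A, lambda x, y: x <= y)
same_parity = classify(A, lambda x, y: (x - y) % 2 == 0)
```

On this sample, "$<$" fails reflexivity and symmetry, "$\le$" satisfies the three conditions of a partial ordering, and "has the same parity as" satisfies the three conditions of an equivalence relation. Note that "$<$" is antisymmetric only vacuously, since $x < y$ and $y < x$ never hold together.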
Partitions and equivalence classes. Let $A$ be any set, and suppose that
it is possible to define subsets $A_1, A_2, \ldots$ of $A$ which have the properties
that
(i) the sets $A_i$ are pairwise disjoint; that is, $A_i \cap A_j = \emptyset$ for all
$i, j = 1, 2, \ldots$ such that $j \ne i$;
(ii) $A_1 \cup A_2 \cup \cdots = A$.
Then the family of sets $\{A_1, A_2, \ldots\}$ is called a partition of $A$. The motivation
for this name is easily understood if one considers Figure 1.13, which illustrates
the concept for the case of a set $A$ in $\mathbb{R}^2$.

Examples
28. Let $X = \{1, 2, 3, \ldots, 9\}$, $A = \{1, 4, 7\}$, $B = \{2, 3, 5, 6\}$, and
$C = \{7, 8, 9\}$. Then $\{A, B, C\}$ is not a partition of $X$: $X = A \cup B \cup C$ but
$A \cap C = \{7\} \ne \emptyset$.
29. Consider the plane $\mathbb{R}^2$ and the family of subsets $A_a$ defined by
$A_a = \{\mathbf{x} \in \mathbb{R}^2 : x_2 = a\}$. Thus $A_a$ is the set of points lying on the horizontal
line $x_2 = a$. Then $\{A_a : a \in \mathbb{R}\}$ defines a partition of $\mathbb{R}^2$.
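
Conditions (i) and (ii) can be checked directly for a finite family of finite sets. The Python sketch below is an illustration; the function name is_partition is an assumption introduced here, and the second family (with $C$ replaced by $\{8, 9\}$) is a hypothetical repair of Example 28 that is not taken from the text.

```python
def is_partition(X, family):
    """True if the sets in `family` are pairwise disjoint and their union is X."""
    sets = [set(s) for s in family]
    pairwise_disjoint = all(
        sets[i].isdisjoint(sets[j])
        for i in range(len(sets))
        for j in range(i + 1, len(sets))
    )
    covers = set().union(*sets) == set(X)
    return pairwise_disjoint and covers


X = set(range(1, 10))
A, B, C = {1, 4, 7}, {2, 3, 5, 6}, {7, 8, 9}

example_28 = is_partition(X, [A, B, C])      # A and C share the element 7
repaired = is_partition(X, [A, B, {8, 9}])   # hypothetical fix: drop 7 from C
```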
It turns out that there is a close relation between partitions and equivalence
relations, and we explore this next. Suppose that $\sim$ defines an equivalence
relation on $A$, and for each $a \in A$, define the set $A_a$ by

$$A_a = \{x \in A : x \sim a\}.$$

Then $A_a$ is called an equivalence class determined by $a$. In Example 29,
the equivalence relation $\mathbf{x} \sim \mathbf{y}$ may be defined on $\mathbb{R}^2$ by $x_2 = y_2$; then the
horizontal line $A_a$ passing through $a$ is the equivalence class defined by $a$.
We show that the family of equivalence classes in fact defines a partition
of $A$.
First, we note the result that if $b \in A_a$, then $A_b = A_a$. To see this,
observe that by definition, $b \sim a$. Now take any $x \in A_b$, so that $x \sim b$;
then since the relation is transitive, $x \sim a$. Thus $A_b \subset A_a$. A repetition
of the argument, starting with $x \in A_a$, yields the result that $A_a \subset A_b$, so
that $A_b = A_a$, as desired.
We note also, by way of a preliminary result (see Exercise 1.23), that if
two equivalence classes $A_a$ and $A_b$ contain at least one element in common,
then they are in fact equal; that is,

$$A_a \cap A_b \ne \emptyset \implies A_a = A_b.$$

We are now ready to prove the main result.

THEOREM 2. Let $\sim$ be an equivalence relation on a set $A$. Then the equivalence
classes defined by $\sim$ constitute a partition of $A$.

PROOF. We must show that the set of equivalence classes $\{A_a : a \in A\}$
satisfies

(i) $\bigcup \{A_a : a \in A\} = A$;

(ii) $A_a \cap A_b = \emptyset$ whenever $A_a \ne A_b$.

To prove (i), let $B = \bigcup \{A_a : a \in A\}$. Then any $b \in B$ belongs to some
$A_a$, for a suitable choice of $a$, and hence $b$ belongs also to $A$. Thus $B \subseteq A$.
Next, take any $c \in A$. By reflexivity we have $c \sim c$, or $c$ belongs to $A_c$, and
hence $c$ belongs also to $B$. It follows that (i) holds.
The proof of (ii) follows directly from the result of Exercise 1.23, which
implies that if $A_a \ne A_b$, then $A_a$ and $A_b$ must be disjoint. $\Box$
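
Theorem 2 can be observed on a small finite example. In the Python sketch below, the function name equivalence_classes, the set $A = \{0, 1, \ldots, 9\}$, and the relation "has the same remainder on division by 3" are all assumptions chosen for illustration; the code simply forms $A_a = \{x \in A : x \sim a\}$ for each $a$ and keeps the distinct classes.

```python
def equivalence_classes(A, rel):
    """Distinct classes A_a = {x in A : x ~ a}, in order of first appearance."""
    classes = []
    for a in A:
        cls = frozenset(x for x in A if rel(x, a))
        if cls not in classes:
            classes.append(cls)
    return classes


A = list(range(10))
classes = equivalence_classes(A, lambda x, y: (x - y) % 3 == 0)

# (i) the classes cover A; (ii) distinct classes are disjoint.
covers = frozenset().union(*classes) == frozenset(A)
disjoint = all(
    c1.isdisjoint(c2)
    for i, c1 in enumerate(classes)
    for c2 in classes[i + 1:]
)
```

If the function is given a relation that is not an equivalence relation, the sets it returns need not form a partition, which is a quick way to see where each hypothesis of the theorem is used.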

Upper and lower bounds, maximal and minimal elements. The
analogy between the operation $\le$ on the real line, and the notion of a
partially ordered set, may be exploited further by extending to the more
general situation, in an appropriate way, properties on the real line that
arise from the use of $\le$. Thus, suppose that $(P, \le)$ is a partially ordered
set, and $A$ a subset of $P$. Then a member $u$ in $P$ is called an upper bound
of $A$ if $x \le u$ for all $x \in A$. An element $m$ in $P$ is called a maximal element
of $P$ if $x \in P$ and $m \le x$ imply that $m = x$. The dual concepts of a lower
bound and a minimal element are defined similarly (cf. the definitions of
infimum and minimum in Section 1.2).
The notions of upper bounds and maximal elements bring us to a fundamental
axiom of mathematics, known as Zorn's Lemma. Very often one
finds in functional analysis, and also in other branches of mathematics such
as algebra and topology, that the elementary notions of set theory, as have
been presented in this chapter, are not sufficient to allow proofs or definitions
to be constructed satisfactorily, or at all. Just as it is necessary in
elementary mechanics to invoke a number of self-evident truths, or axioms
(for example, Newton's laws of motion), in order to construct a theory of
moving bodies, in the same way it becomes necessary to introduce into the
general mathematical framework axioms of set theory that are deemed to
be generally acceptable, in order to be able to proceed with the business
of constructing theories. Zorn's Lemma is one such axiom.

ZORN'S LEMMA. Let $A$ be a nonempty partially ordered set. If every linearly
ordered subset of $A$ has an upper bound, then $A$ has a maximal element.

We encounter an application of Zorn's Lemma in Chapter 6, in order to
prove a theorem about orthonormal bases in Hilbert spaces.
It is possible, instead of invoking Zorn's Lemma in a particular situation,
to make use of one of many alternative axioms, should one of these alternatives
prove to be more appropriate to the situation at hand. We do not
get into a detailed discussion here about the various alternative axioms
since these are rather peripheral to further developments; but we mention
for completeness one such alternative, the Axiom of Choice, which is often
encountered. First, we introduce the notion of a choice function.

Choice functions. Let $F = \{A_1, A_2, \ldots\}$ be a family of sets, and suppose
that we choose from each set $A_i$ a member $a_i$, say. The resulting set
of choices $\{a_1, a_2, \ldots, a_i, \ldots\}$ is known as a choice function. We deal with
functions or maps in detail in Chapter 5, and it suffices for now to recall
that a function $f$ from a set $X$ to a set $Y$ is a rule that associates with
each member $x \in X$ exactly one member $y \in Y$. In the present context
the choice function acts on the family $F$, and associates with each member
$A_i$ exactly one member $a_i$ of that set. Note that as we run through the
various permutations or choices, we actually set up the Cartesian product
$A_1 \times A_2 \times \cdots$.
Now one may ask whether, for a given family of sets, there are any choice
functions; in other words, we should like to know whether it is always
possible to select one member from each of the sets in $F$. The question is
trivial for finite sets, but for infinite sets it is not, and it turns out that this
question cannot be answered in general using only the usual axioms of set
theory. This motivates the introduction of the Axiom of Choice.
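
For a finite family of finite sets the choice functions can simply be listed, which makes the link with the Cartesian product concrete. The Python sketch below is illustrative only: the function name choice_functions and the sample family are assumptions made here, and each choice function is represented as a tuple $(a_1, \ldots, a_k)$ with $a_i$ taken from $A_i$. Nothing of this kind is available for arbitrary infinite families, which is exactly where the axiom is needed.

```python
from itertools import product


def choice_functions(family):
    """All choice functions for a finite family, each as a tuple (a_1, ..., a_k)."""
    # Sorting only fixes the order of enumeration; it does not affect the result set.
    return list(product(*[sorted(A_i) for A_i in family]))


family = [{1, 2}, {3}, {4, 5, 6}]
choices = choice_functions(family)

# If any member of the family is empty, no choice function exists.
none_possible = choice_functions([{1, 2}, set()])
```

The number of choice functions equals the product of the sizes of the sets, so it drops to zero as soon as one set in the family is empty; this is why the axiom is stated for families of nonempty sets.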

Axiom of Choice. Let $F = \{A_1, A_2, \ldots\}$ be a family of nonempty sets.
Then there exists at least one choice function for the family. That is, the
Cartesian product of any nonempty family of nonempty sets is a nonempty
set.
This rather innocent-looking axiom has as a consequence some very important
and deep results in mathematics, as the following shows.

THEOREM 3. Zorn's Lemma and the Axiom of Choice are equivalent axioms.

We omit the rather lengthy proof of this theorem.

1.5 Theorem proving

We end this chapter with a collection of notions and procedures that are
central to proving results in mathematics. No doubt these will have been
previously encountered in various guises, but it is well to reiterate these
notions in one place, in order for there to be clarity about what constitutes
a proof, and about some of the ideas that are used in constructing proofs.

Necessity and sufficiency. Suppose that we are given two mathematical
statements, labeled $A$ and $B$, and suppose we are told that if $A$ holds true,
then so does $B$. For example, $A$ may be the statement "$a = 2$", and $B$
the statement "$a^3 = 8$". We may put this another way by stating that
statement $A$ implies statement $B$, and by writing

$$A \Rightarrow B.$$

Yet another way of expressing the relation between $A$ and $B$ is to assert
that $B$ holds if $A$ holds, or that a sufficient condition for $B$ to hold is that
$A$ holds. These three different ways of stating the same fact are summarized
in the following.

$A \Rightarrow B$

$B$ holds if $A$ holds

a sufficient condition for $B$ to hold is that $A$ holds

Now consider the converse, that is, the case in which $B$ implies $A$, or $B \Rightarrow A$.
Of course we could simply go back and transpose $A$ and $B$ in all the
preceding statements; but it is useful to consider this relationship from a
different angle. Specifically, we may now state that $B \Rightarrow A$ is the same as
stating that $B$ holds only if $A$ holds, which is to say that, if $A$ does not
hold, then neither does $B$. A third way of making this assertion is to state
that a necessary condition for $B$ to hold is that $A$ holds. Going back to the
simple example, we can state that a necessary condition for $a^3 = 8$ to hold
is $a = 2$. We summarize again.

$B \Rightarrow A$

$B$ holds only if $A$ holds

a necessary condition for $B$ to hold is that $A$ holds

This brings us to the third possibility, which is that statements $A$ and $B$
imply each other. In this case the two statements are equivalent, and we
write $A \Leftrightarrow B$; furthermore, for $B$ to hold it is now necessary and sufficient
that $A$ holds. We now have

$A \Leftrightarrow B$

$B$ holds iff $A$ holds

a necessary and sufficient condition for $B$ to hold is that $A$ holds

The term "iff" is shorthand for "if and only if". In the context of proofs of
theorems and the like, when faced with the task of showing that statement
$A$ is true if and only if $B$ is true, the typical approach is a two-stage one:

• sufficiency (if): assume that $B$ holds, and show that this implies $A$;

• necessity (only if): assume that $A$ holds, and show that this implies
$B$.

Example

30. Let A be the statement "$a^2 > 4$" and B the statement "$a > 2$".
Assume first that B is true; then clearly A is true. Thus B is a
sufficient condition for A to hold. Conversely, assume that A is true;
this implies that $|a| > 2$; that is, $a < -2$ or $a > 2$. Thus A is not a
sufficient condition for B to hold; alternatively, B is not a necessary
condition for A to hold (since $a < -2$ would also be acceptable). The
two statements are therefore not equivalent. On the other hand, if A
is the statement "$a^2 > 4$ and $a > 0$", and B is the statement "$a > 2$",
then A and B are equivalent.
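
The implications in Example 30 can be spot-checked numerically. The sketch below is only an illustration, not a proof: it tests the statements on a finite grid of sample values, and the helper `implies` and the grid itself are assumptions introduced here.

```python
# Statements from Example 30, written as predicates on a real number a.
def A(a):
    return a * a > 4

def B(a):
    return a > 2

def A_strong(a):
    # The strengthened statement "a^2 > 4 and a > 0".
    return a * a > 4 and a > 0

def implies(p, q, samples):
    """True if q(a) holds at every sample a where p(a) holds."""
    return all(q(a) for a in samples if p(a))

# Sample points from -5 to 5 in steps of 0.25 (an arbitrary choice).
samples = [k / 4 for k in range(-20, 21)]

b_implies_a = implies(B, A, samples)   # B is sufficient for A
a_implies_b = implies(A, B, samples)   # fails: A is not sufficient for B
counterexamples = [a for a in samples if A(a) and not B(a)]
equivalent_strong = (implies(A_strong, B, samples)
                     and implies(B, A_strong, samples))
```

On this grid every counterexample to "A implies B" is a value with $a < -2$, which is exactly the case singled out in the example.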

Reductio ad absurdum or proof by contradiction. The method
of reductio ad absurdum is an ancient strategy for constructing proofs. It
exploits the fact that the statement "if A holds, then B holds" is equivalent
to the statement "if B does not hold, then A does not hold". Faced with
the task of proving that A implies B, the procedure starts off by assuming
that B does not hold. The task is then to show that this implies that A is
not valid, usually by obtaining a contradiction of the original assumption.

Example

31. A classical example of proof by contradiction is the proof that $\sqrt{2}$ is
irrational. We begin by assuming that $\sqrt{2}$ is rational. Set $x = \sqrt{2}$;
since this is rational by assumption, we may write $x = p/q$ for some
integers p and q with q nonzero. It may also be assumed that p and q
have no common divisor (if they do, this may be divided out). Thus
$x^2 = 2 = p^2/q^2$, or $p^2 = 2q^2$, which implies that $p^2$ is even. Therefore
p is even, and since it is divisible by 2, $p^2$ is divisible by 4. Since
$q^2 = p^2/2$, $q^2$ is therefore even, and hence so is q. But then 2 is a
common divisor of p and q, which constitutes a contradiction. Thus
$\sqrt{2}$ is irrational.
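
As a sanity check on Example 31 (not a substitute for the proof), the following sketch searches for integers p and q with $p^2 = 2q^2$ up to a bound, and also checks the parity step "if $p^2$ is even then p is even" on a finite range. The function name and the bounds are illustrative assumptions.

```python
from math import gcd, isqrt

def rational_square_roots_of_two(max_q):
    """Return all reduced fractions (p, q), 1 <= q <= max_q, with p*p == 2*q*q."""
    hits = []
    for q in range(1, max_q + 1):
        # If p*p == 2*q*q for a positive integer p, then p is the integer
        # square root of 2*q*q, so this is the only candidate to test.
        p = isqrt(2 * q * q)
        if p * p == 2 * q * q and gcd(p, q) == 1:
            hits.append((p, q))
    return hits

hits = rational_square_roots_of_two(10_000)

# Parity step used in the proof: an even square has an even root.
parity_ok = all(p % 2 == 0 for p in range(1, 1000) if (p * p) % 2 == 0)
```

The search comes back empty for every bound, as the proof guarantees; the code only confirms this for the finitely many denominators it tries.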

1.6 Bibliographical remarks


There is a wide range of books that deal with the subject matter of this
chapter. Very readable accounts of the real and complex number systems
are to be found in the texts by Apostol [2], Binmore [6, 7], Lang [29], and
Royden [44]. Oden [36] presents a fairly detailed account of the algebra of
sets, with an applications-oriented readership in mind. The text by Lip-
schutz [31] in the Schaum's Outline Series provides an accessible account
of set theory, replete with hundreds of examples and exercises, which would
take the reader some way beyond the contents of this chapter. Finally, the
first chapter of the monograph by Hewitt and Stromberg [19] provides a
treatment that is more detailed and somewhat more advanced, although
very well written, of set theory and of the real and complex numbers.

1.7 Exercises
The algebra of sets

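1.1. Let $A = \{x \in \mathbb{Z} : x^2 - x - 6 = 0\}$ and $B = \{x \in \mathbb{Z} : x^2 < 10\}$. List the
elements of A and B. What are $A \cup B$, $A \cap B$, $A \cap \mathbb{Z}^+$, and $A - \mathbb{Z}^+$?

1.2. Let $A = \{1, 2\}$, $B = \{7, 8\}$, and $C = \{9, 1\}$. Find $B \times (A \cup C)$ and
$(A \cap C) \times B$.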

1.3. Show that

$$A \cap (B \cup C) = (A \cap B) \cup (A \cap C),$$
$$A \cup (B \cap C) = (A \cup B) \cap (A \cup C).$$

Illustrate these identities.

1.4. Let n(A) denote the number of elements of a finite set A. Prove that

$$n(A \cup B) = n(A) + n(B) - n(A \cap B).$$

How would you generalize this identity to $n(A \cup B \cup C)$?

1.5. The power set of a set A, denoted by $2^A$ or $P(A)$, is the set of all
subsets of A. What are P(A) and P(B) if $A = \{1, 2, 3\}$ and
$B = \{\{1, 2\}, 3\}$?
Sets of numbers
1.6. Show that the set $\mathbb{Q}$ of rational numbers is countable. [Hint: Set up
a table of the form

    1/1  1/2  1/3
    2/1  2/2  2/3
    3/1             ].

1.7. Find all the points of accumulation of the following subsets of $\mathbb{R}$:
(i) $[a, b]$; (ii) $\mathbb{Q}$; (iii) $(0, 1) \cup \{2\}$.

1.8. Which of the following subsets of $\mathbb{R}$ are closed, open, or neither?
(i) $A = \{x : \sin(1/x) = 0\}$; (ii) $A = \{x : x\sin(1/x) = 0\}$;
(iii) $A = \{x : \sin(1/x) > 0\}$.

1.9. Show that a set $I \subset \mathbb{R}$ is closed if and only if its complement is open.

1.10. Find all the points of accumulation of the set
$A = \{z \in \mathbb{C} : z = x + iy,\ x^2 - y^2 < 1\}$, and determine whether this set
is open or closed.

1.11. Write down the first few terms of the following sequences.

(i) $\{(-1)^n/n\}_{n=1}^{\infty}$;
(ii) $\{\tfrac{1}{2}(1 - (-1)^n)\}_{n=1}^{\infty}$;
(iii) $\{3n^2/(5n^2 - 6)\}_{n=1}^{\infty}$.

1.12. Determine which of the following sequences are convergent, and find
their limits.

(i) $\dfrac{4 - 2n - 3n^2}{2n^2 + n}$; (iii) $\dfrac{n}{1 + n}$.

1.13. The sequence $\{(3n + 2)/(n - 1)\}$ converges to 3 as $n \to \infty$. Find the
smallest integer N such that

$$\left| \frac{3n + 2}{n - 1} - 3 \right| < \epsilon$$

whenever $n > N$, for the case $\epsilon = 0.001$.

1.14. A sequence $\{u_n\}$ is bounded if there are constants M and N such that
$M \le u_n \le N$ for all n. Also, $\{u_n\}$ is monotone increasing if $u_{n+1} \ge u_n$
for all n, and monotone decreasing if $u_{n+1} \le u_n$ for all n. Show
that every bounded monotone (increasing or decreasing) sequence
converges, and that the limit is the supremum (or infimum).

1.15. Find max A, min A, sup A, and inf A when

(i) $A = \{1/n : n = 1, 2, 3, \ldots\}$;
(ii) $A = \{x : 0 < x^2 < 1\}$;
(iii) $A = \{x : (x - a)(x - b)(x - c) < 0,\ a < b < c\}$;
(iv) $A = \{|z^2 + 1| : z \in \mathbb{C},\ |z| \le 1\}$.
1.16. Show that

inf A = - sup( -A).

1.17. Suppose that A and B are two sets of real numbers that are bounded
above, with sup A = a and sup B = b. Let C be the set of real numbers
formed by considering all products of the form xy, where $x \in A$ and
$y \in B$. Give a counterexample to show that, in general, $\sup C \ne ab$.

1.18. Show that the supremum has the following properties.

(i) If $I \subset \mathbb{R}$ and a is any positive real number, then

$$\sup_{x \in I} ax = a \sup_{x \in I} x;$$

(ii) if $I \subset \mathbb{R}$ and a is any real number, then

$$\sup_{x \in I} (a + x) = a + \sup_{x \in I} x.$$

Subsets of $\mathbb{R}^n$

1.19. Determine the points of accumulation of the following sets and estab-
lish which of these sets are open, closed, or neither.

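(i) $\Omega = \{\mathbf{x} : \mathbf{x} \in \mathbb{R}^2,\ 0 \le x \le 1,\ 0 < y \le 1\}$;
(ii) $\Omega = \{\mathbf{x} : \mathbf{x} \in \mathbb{R}^3,\ x^2 + y^2 + z^2 < a^2,\ z > 0\}$.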

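1.20. The diameter dia($\Omega$) of a set $\Omega$ in $\mathbb{R}^n$ is defined by

$$\mathrm{dia}\,\Omega = \sup\{|\mathbf{x} - \mathbf{y}| : \mathbf{x}, \mathbf{y} \in \Omega\}.$$

Find dia($\Omega$) for the sets in Exercise 1.19.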

Relations, equivalence classes, and Zorn's lemma

1.21. Which of the following statements defines (i) an equivalence relation;
and (ii) a partial ordering on the set of natural numbers $\mathbb{N}$?
(a) $\beta$ is a multiple of $\alpha$; (b) $\alpha\beta$ is the square of a number;
(c) $\alpha + 2\beta = 6$; (d) $\alpha$ divides $\beta$.

1.22. Let $\sim$ be the relation on the set $A = \{2, 3, 4, 5, 6\}$ defined by the
statement "$|a - b|$ is divisible by 3". Write $\sim$ as a set of ordered pairs,
that is, as a subset of $A \times A$, and represent it graphically as a set of
points in the plane.

1.23. If $\sim$ defines an equivalence relation on a set A, show that if
$A_a \cap A_b \ne \emptyset$, then $A_a = A_b$.

1.24. Consider the relation on $\mathbb{Z} \times \mathbb{Z}$ in which $a \sim b$ is defined by
$|a_1| + |a_2| = |b_1| + |b_2|$. Show that this is an equivalence relation, and
illustrate the manner in which $\mathbb{Z} \times \mathbb{Z}$ is partitioned.
2
Sets of functions and Lebesgue
integration

In due course we endow sets with particular properties and on the basis of
these assumed properties construct a theory for special kinds of sets such
as Hilbert spaces. In the development of this theory it is not necessary to
appeal to the precise character of a set: the basic axioms, and the
theorems that follow from these axioms, apply equally to sets whose members
are numbers or matrices or functions. Before embarking on the task of
describing this general framework, however, we first introduce two important
examples of sets, or spaces (as they are usually called when endowed with
additional properties) of functions: these are the spaces of continuous
functions, and the $L^p$ spaces of functions whose pth powers are integrable. With
these at our disposal it is possible in subsequent chapters to illustrate
aspects of the general theory, using as special examples sets such as $\mathbb{R}$ or $\mathbb{R}^n$
which were introduced in the last chapter, as well as spaces of functions.
In Section 2.1 the concept of continuity is introduced, and the space
$C^m(\Omega)$ of m-times continuously differentiable functions is defined.
There are of course many well-behaved functions that are not
continuous, and that also feature in the developments to follow; an example is
the Heaviside step function. These functions need to be characterized in
an alternative manner, and this is done by exploiting not the degree of
continuity or smoothness of the function, but rather its integrability. This
process leads naturally to the definition of the $L^p$ spaces. In order to
discuss these adequately it is necessary first, however, to extend the definition
of the integral encountered in elementary courses on calculus; this is the
Riemann integral, and it is not adequate for our purposes. Its extension,
known as the Lebesgue integral, in turn relies on an acquaintance with the
notion of Lebesgue measure, which is the subject of Section 2.2. Section
2.3 is then devoted to a discussion of the $L^p$ spaces.

FIGURE 2.1. Examples of continuous and discontinuous functions

2.1 Continuous functions


Continuous functions occur in great abundance in applied mathematics and
engineering. This is not surprising, since many natural phenomena that are
modeled mathematically involve quantities which may be represented (per-
haps approximately) by continuous functions. Our aim in this section is to
describe, rather informally at first, the concept of continuity, and subse-
quently to arrive at a mathematically suitable definition of a continuous
function.
We begin by considering two arbitrary functions f and g whose graphs
are shown in Figure 2.1. The function f(x) is continuous, by which we mean
that it is possible to draw its graph without lifting one's pen. On the other
hand, the function g(x) is discontinuous, in that its graph has a break.
Roughly speaking, then, a continuous function of a single variable may
be characterized as one whose graph is an uninterrupted curve. Another
type of discontinuous function is one that is unbounded at some point. For
example, the function h shown in Figure 2.1 is not continuous at x = 0
since h(x) "tends to infinity" as x approaches zero. Again, it is not possible
to represent h(x) by an uninterrupted curve.
The preceding examples give a qualitative feel for what constitutes a
continuous function. However, for subsequent work we need a definition
of continuity that agrees with our intuition and which is also sufficiently
robust to be used in any mathematical situation. The following is a suitable
definition.

Continuous functions of one variable. Let f be a function on an
interval I (open or closed) of the real line. Then f is continuous at a point
$x_0 \in I$ if, given any positive number $\epsilon$, no matter how small, it is possible
to find a positive number $\delta$ (depending on $\epsilon$ and on the point $x_0$) such that

$$|f(x) - f(x_0)| < \epsilon \quad \text{for all } x \text{ in } I \text{ with } |x - x_0| < \delta. \qquad (2.1)$$

If f is continuous at all points in I then f is said to be continuous on I.
Generally the number $\delta$ will vary from point to point for a given value of
$\epsilon$, but if it so happens that $\delta$ depends only on $\epsilon$ and not on $x_0$, then we say
that f is uniformly continuous on I.

FIGURE 2.2. Graphical interpretation of the definition of continuity

The preceding definition of continuity has the advantage of a very simple
geometrical meaning. To see this, consider the graph of the function f
shown in Figure 2.2. Choose a positive number $\epsilon$ and draw lines parallel to
the x axis at heights $f(x_0) \pm \epsilon$. Then f is continuous at $x_0$ if we can find
a positive number $\delta$ such that the graph of f(x) lies inside the horizontal
band bounded by $f(x_0) \pm \epsilon$ (that is, $|f(x) - f(x_0)| < \epsilon$) for all values of x
in the vertical band $|x - x_0| < \delta$. In other words, the whole portion of the
graph lying in the vertical band is contained in the horizontal band. For
the function f shown in Figure 2.2 this is clearly possible at any point $x_0$
lying in I, no matter how small a value of $\epsilon$ is chosen. Thus f is continuous.
On the other hand, the function g in Figure 2.2 is continuous at all points
in I except at $x = x_0$: no matter how small we make $\delta$, the vertical band
will always include a portion of the graph that lies outside the horizontal
band.

Examples

1. $f(x) = x^2$ is continuous on [0,1]: to see this, consider

$$|f(x) - f(x_0)| = |x^2 - x_0^2| = |(x - x_0)(x + x_0)| \le 2|x - x_0| \quad \text{for } x, x_0 \in [0,1].$$

If $|x - x_0| < \delta$, then $|f(x) - f(x_0)| < 2\delta$. Hence, if $\epsilon$ is given, it
suffices to take $\delta = \tfrac{1}{2}\epsilon$ to guarantee that $|f(x) - f(x_0)| < \epsilon$ whenever
$|x - x_0| < \delta$. Since $\delta$ does not depend on $x_0$, f is also uniformly
continuous on [0,1].
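
The choice $\delta = \epsilon/2$ in Example 1 can be probed numerically. The sketch below is an illustration on a finite grid only; the grid size and the helper name `worst_gap` are assumptions made here, and a grid check cannot replace the estimate above.

```python
def f(x):
    return x * x

def worst_gap(eps, n=200):
    """Largest |f(x) - f(x0)| over grid points x, x0 in [0, 1] with |x - x0| < eps/2."""
    delta = eps / 2  # the delta proposed in Example 1, independent of x0
    grid = [i / n for i in range(n + 1)]
    worst = 0.0
    for x0 in grid:
        for x in grid:
            if abs(x - x0) < delta:
                worst = max(worst, abs(f(x) - f(x0)))
    return worst

# For each epsilon, the worst observed gap should stay below epsilon.
gaps = {eps: worst_gap(eps) for eps in (0.5, 0.1, 0.02)}
```

Because the same $\delta$ is used at every $x_0$ in the grid, the check reflects uniform continuity rather than continuity at a single point.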

2. $f(x) = 1/x$ is continuous on the half-open interval (0,1]. To show
this, we begin by considering

$$|f(x) - f(x_0)| = \frac{|x_0 - x|}{|x|\,|x_0|}$$

for an arbitrary fixed $x_0$ in (0,1]. Let $0 < \delta < x_0$; then every x in the
interval $|x - x_0| < \delta$ satisfies $x > x_0 - \delta > 0$ and we have

$$|f(x) - f(x_0)| < \frac{\delta}{x_0(x_0 - \delta)}.$$

Then if $\epsilon$ is given, we choose $\delta = \epsilon x_0^2/(1 + \epsilon x_0)$ (this is found by
setting $\delta/x_0(x_0 - \delta) = \epsilon$). Thus f is continuous for any $x_0$ in (0,1].
Note that $\delta$ depends on $\epsilon$ and on $x_0$, so we have not been able to
prove that f is uniformly continuous. Indeed, it can be shown (see
Exercise 2.3) that f is not uniformly continuous.
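
To see how the $\delta$ of Example 2 behaves, the sketch below evaluates $\delta = \epsilon x_0^2/(1 + \epsilon x_0)$ at a few points $x_0$ and samples the interval $|x - x_0| < \delta$ to confirm the bound. The sample points, the value of $\epsilon$, and the helper names are illustrative assumptions.

```python
def f(x):
    return 1.0 / x

def delta_for(eps, x0):
    """The delta from Example 2: eps * x0**2 / (1 + eps * x0)."""
    return eps * x0 * x0 / (1 + eps * x0)

def check(eps, x0, steps=1000):
    """Sample x strictly inside (x0 - delta, x0 + delta), restricted to (0, 1],
    and test whether |f(x) - f(x0)| < eps at every sample."""
    d = delta_for(eps, x0)
    ok = True
    for i in range(1, steps):
        x = x0 - d + 2 * d * i / steps
        if 0 < x <= 1:
            ok = ok and abs(f(x) - f(x0)) < eps
    return ok

eps = 0.1
points = [1.0, 0.5, 0.1, 0.01]
deltas = [delta_for(eps, x0) for x0 in points]
all_ok = all(check(eps, x0) for x0 in points)
```

The computed values of $\delta$ shrink rapidly as $x_0$ approaches 0, which is the numerical face of the remark that no single $\delta$ works for all $x_0$ in (0,1].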

For functions of more than one variable the preceding ideas are easily
extended. For example, consider a function $f(\mathbf{x}) \equiv f(x, y)$ of two variables
defined on an open subset $\Omega$ of $\mathbb{R}^2$, as shown in Figure 2.3. To check
for continuity at a point $\mathbf{x}_0 = (x_0, y_0)$ in $\Omega$ we choose a positive number $\epsilon$
and construct a pair of horizontal planes at heights $f(\mathbf{x}_0) \pm \epsilon$ above the xy
plane. Then $f(\mathbf{x})$ is continuous at $\mathbf{x}_0$ if it is always possible to construct a
cylinder of radius $\delta$ (that is, the set of points $\mathbf{x}$ for which $|\mathbf{x} - \mathbf{x}_0| < \delta$), this
radius depending on $\epsilon$, such that the part of the surface lying within the
cylinder is contained in the horizontal band $|f(\mathbf{x}) - f(\mathbf{x}_0)| < \epsilon$. This is but
a special case of the general definition of continuity for functions
of any number of variables, which we now state.

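Continuity in $\mathbb{R}^n$. A function $f(\mathbf{x})$ defined on a subset $\Omega$ of $\mathbb{R}^n$ is
continuous at a point $\mathbf{x}_0$ in $\Omega$ if, for every positive number $\epsilon$, no matter how
small, it is possible to find a positive number $\delta$ (depending on $\epsilon$ and $\mathbf{x}_0$)
such that

$$|f(\mathbf{x}) - f(\mathbf{x}_0)| < \epsilon \quad \text{whenever } |\mathbf{x} - \mathbf{x}_0| < \delta \text{ and } \mathbf{x} \in \Omega. \qquad (2.2)$$

If (2.2) holds for every $\mathbf{x}_0$ in $\Omega$, then f is said to be continuous on $\Omega$.
Furthermore, if $\delta$ does not depend on $\mathbf{x}_0$, then f is said to be uniformly
continuous on $\Omega$.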

The space $C(\Omega)$. For any domain $\Omega$ in $\mathbb{R}^n$ the collection of all continuous
functions defined on $\Omega$ forms a set, or space, which is denoted by $C(\Omega)$.
For functions defined on a subset $\Omega = (a, b)$ of the real line, we simply
write $C(a, b)$. The space of functions that are continuous on the closed set
$\bar{\Omega} = \Omega \cup \Gamma$ ($\Omega$ and its boundary $\Gamma$) is denoted by $C(\bar{\Omega})$, and by $C[a, b]$
for functions on the closed interval. There is more than a mere technical
difference between continuous functions defined on open and closed sets;
for example, $u(x) = 1/x$ is continuous on (0,1) but not on [0,1]. We return
to this point momentarily.

FIGURE 2.3. Continuity of a function of two variables

The spaces $C^m(\Omega)$ and $C^\infty(\Omega)$. Among all the continuous functions
defined on a subset $\Omega$ of $\mathbb{R}^n$, some have the property that their first
derivatives and possibly some derivatives of higher order are also continuous. It is
important to identify such functions, and so we introduce the space $C^m(\Omega)$
of functions which, together with all of their derivatives up to and including
those of order m, are continuous on $\Omega$. That is,

$$C^m(\Omega) = \{u : u, du/dx, \ldots, d^m u/dx^m \text{ are all continuous functions}\}$$

for $\Omega = (a, b) \subset \mathbb{R}$,

$$C^m(\Omega) = \{u : u, \partial u/\partial x, \partial u/\partial y, \ldots, \partial^m u/\partial x^k \partial y^{m-k}\ (k = 0, \ldots, m) \text{ are all continuous functions}\}$$

for $\Omega \subset \mathbb{R}^2$, and so on.
We define $C^\infty(\Omega)$ to be the space of functions all of whose derivatives
exist and are continuous on $\Omega$.
Clearly the inclusions

$$C^\infty(\Omega) \subset \cdots \subset C^{m+1}(\Omega) \subset C^m(\Omega) \subset \cdots \subset C^1(\Omega) \subset C(\Omega)$$

hold, so that the spaces $C^m$ constitute a gradation which permits
continuous functions to be classified according to their degree of smoothness: for
any function in $C^m(\Omega)$, the higher the value of m, the smoother the
function.

FIGURE 2.4. The function in Example 3, and its first and second derivatives

Examples

3. The function

$$u(x) = \begin{cases} 0, & -1 \le x < 0, \\ x^2, & 0 \le x \le 1, \end{cases}$$

belongs to $C^1[-1, 1]$ since u and du/dx are both continuous, but
$d^2u/dx^2$ is discontinuous (Figure 2.4).

4. $u(x) = \sin x$ belongs to $C^\infty(-\infty, \infty)$ since u and all of its derivatives
are continuous on the whole real line.

5. The wedge-shaped function shown in Figure 2.5 is a member of $C(\Omega)$,
where $\Omega = (0,1) \times (0,1)$ is the unit square in $\mathbb{R}^2$, but not of $C^1(\Omega)$,
since $\partial u/\partial x$ is not a continuous function.

Continuous functions on compact sets. We saw earlier that the
function $u(x) = 1/x$ is continuous on the open set (0,1), but not on the
closed set [0,1], as a result of the singularity at x = 0. It turns out that
continuous functions defined on compact sets (recall from Chapter 1 that
these are closed and bounded sets in $\mathbb{R}^n$) may be characterized further, in
the sense that they are necessarily bounded on such sets.
A function f defined on a set $\Omega$ in $\mathbb{R}^n$ is said to be bounded if it is
possible to find a number M > 0 such that

$$|f(\mathbf{x})| \le M \quad \text{for all } \mathbf{x} \in \Omega;$$

in other words, the function does not "blow up" anywhere. Another way
of characterizing boundedness is to consider the set $f(\Omega)$ of values of $f(\mathbf{x})$,
as $\mathbf{x}$ ranges over all points in $\Omega$; that is,

$$f(\Omega) = \{y \in \mathbb{R} : f(\mathbf{x}) = y \text{ for some } \mathbf{x} \in \Omega\}.$$

This situation is depicted in Figure 2.6, for a function of a single variable.
$f(\Omega)$ is called the range or image of f, a concept which is explored in
greater detail in Chapter 5. With the notion of the range at our disposal it
is possible to characterize a bounded function f as being one for which

$\sup f(\Omega)$ and $\inf f(\Omega)$ exist (that is, are finite).

FIGURE 2.5. The wedge-shaped function of Example 5

FIGURE 2.6. The image f(I) of the function f defined on an interval I in $\mathbb{R}$

Continuous functions behave in a special way on compact sets, as the


next theorem shows.

THEOREM 1. Let $\Omega$ be a bounded domain (that is, a bounded, open,
connected set) in $\mathbb{R}^n$, and f a continuous function defined on the compact set
$\bar{\Omega}$. Then

(a) f is bounded on $\bar{\Omega}$ and, furthermore, f achieves its supremum and
infimum on $\bar{\Omega}$;

(b) f is uniformly continuous on $\bar{\Omega}$.

Note that part (a) of the theorem states that there is a point $\mathbf{z}$, say, in
$\bar{\Omega}$, such that $f(\mathbf{z}) = \sup f(\bar{\Omega}) = \max f(\bar{\Omega})$; that is, $f(\mathbf{z}) \ge f(\mathbf{x})$ for all points
$\mathbf{x} \in \bar{\Omega}$; a similar interpretation applies with respect to the infimum.

PROOF OF THEOREM 1. We prove (a).
We show first that the function has a maximum; thus we have to show
that the function f is indeed bounded above, that is, that $f(\mathbf{x}) \le M$ for
some number M and all $\mathbf{x}$ in $\bar{\Omega}$.
If f is not bounded from above, then corresponding to each positive
integer m it is possible to find a point $\mathbf{x}_m$ in $\bar{\Omega}$ such that $f(\mathbf{x}_m) > m$.
By the Bolzano-Weierstrass Theorem applied to sets in $\mathbb{R}^n$, the sequence
$\{\mathbf{x}_m\}$ constructed in this way has a point of accumulation $\mathbf{c}$ that lies in $\bar{\Omega}$.
Furthermore, since f is continuous, given $\epsilon = 1$ there exists $\delta > 0$ such that

$$|f(\mathbf{x}) - f(\mathbf{c})| < 1 \quad \text{whenever } |\mathbf{x} - \mathbf{c}| < \delta.$$

In particular, this applies to the infinitely many points of the sequence
that lie within a distance $\delta$ of $\mathbf{c}$, so that

$$|f(\mathbf{x}_m) - f(\mathbf{c})| < 1,$$

whence

$$m < f(\mathbf{x}_m) < f(\mathbf{c}) + 1.$$

For m sufficiently large this is a contradiction. Thus f is bounded from
above.
Suppose then that the least upper bound of all the values of $f(\mathbf{x})$ is $\bar{f}$.
Then, given a positive integer m we can find a point $\mathbf{y}_m$ in $\bar{\Omega}$ such that

$$|f(\mathbf{y}_m) - \bar{f}| < 1/m.$$

Denote by $\mathbf{d}$ a point of accumulation of the sequence of points $\{\mathbf{y}_m\}$; then
$f(\mathbf{d}) \le \bar{f}$. The proof of the theorem follows if we can show that in fact
$f(\mathbf{d}) = \bar{f}$. Given $\epsilon$, there exists $\delta$ such that

$$|f(\mathbf{y}_m) - f(\mathbf{d})| < \epsilon \quad \text{whenever } |\mathbf{y}_m - \mathbf{d}| < \delta.$$

This is true for infinitely many values of m since $\mathbf{d}$ is a point of
accumulation. But

$$|f(\mathbf{d}) - \bar{f}| \le |f(\mathbf{d}) - f(\mathbf{y}_m)| + |f(\mathbf{y}_m) - \bar{f}| < \epsilon + 1/m.$$

This holds for every $\epsilon$ and for arbitrarily large m; hence $|f(\mathbf{d}) - \bar{f}| = 0$,
as was to be shown.
The proof for the minimum is carried out in much the same way. □

Examples

6. Consider the function $u(x) = \sin x$ defined on $[0, 2\pi]$, which is closed.
The supremum of u(x) is 1 which is achieved at $x = \pi/2$, whereas
the infimum is -1 which is achieved at $x = 3\pi/2$. Theorem 1 tells us
that u is uniformly continuous.

7. Let $u(x) = 1/x$; we have seen earlier that this function is continuous
on the open interval (0,1), but that it is not uniformly continuous
there (see also Exercise 2.3). It is not continuous on [0,1];
furthermore, inf u = 1 (at x = 1), but sup u does not exist.

8. Note that Theorem 1 gives sufficient conditions for a function to be
bounded and uniformly continuous. These are not necessary
conditions, however; for example, if $u(x) = x^2$ on (0,1), then sup u = 1,
inf u = 0, and the function u is uniformly continuous, although it
achieves its supremum and infimum (at x = 1 and x = 0,
respectively) outside the set (0,1), which is open.
Lipschitz continuous functions. A function f defined on a set $\Omega$ in
$\mathbb{R}^n$ is said to be Lipschitz continuous (or simply Lipschitz) if there exists a
constant L > 0 such that

$$|f(\mathbf{x}) - f(\mathbf{y})| \le L|\mathbf{x} - \mathbf{y}| \quad \text{for all } \mathbf{x}, \mathbf{y} \in \Omega. \qquad (2.3)$$

It is straightforward to show (Exercise 2.10) that every Lipschitz function
is uniformly continuous, although of course the converse is not true. This
may be better appreciated by considering the interpretation of Lipschitz
continuity for functions of a single variable (Figure 2.7): (2.3) states that
the magnitude of the slope of the chord joining any two points on the graph
of a Lipschitz function is bounded above by a constant L which is
independent of the two points.
We see also that the definition of Lipschitz continuity does not require that
the derivative exist at every point. However it is not difficult to show that,
if $\Omega$ is a compact set, then every continuously differentiable function on $\Omega$
is Lipschitz.
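
The chord-slope reading of (2.3) lends itself to a small experiment. The sketch below compares $x^2$ on [0,1], whose chord slopes stay below 2, with $\sqrt{x}$ on [0,1]; the latter is an example chosen here for illustration (it does not appear in the text) of a uniformly continuous function whose chord slopes grow without bound near 0. Only chords between adjacent grid points are examined, so the numbers are lower bounds on the true supremum of chord slopes.

```python
from math import sqrt

def max_chord_slope(f, a, b, n):
    """Largest |f(y) - f(x)| / |y - x| over adjacent points of a uniform grid
    that divides [a, b] into n cells."""
    h = (b - a) / n
    return max(abs(f(a + (i + 1) * h) - f(a + i * h)) / h for i in range(n))

# x**2 on [0, 1]: slopes remain bounded (by 2) however fine the grid.
square_slopes = [max_chord_slope(lambda x: x * x, 0.0, 1.0, n) for n in (10, 100, 1000)]

# sqrt(x) on [0, 1]: the chord from 0 to h has slope 1/sqrt(h), which grows as h shrinks.
sqrt_slopes = [max_chord_slope(sqrt, 0.0, 1.0, n) for n in (10, 100, 1000)]
```

For $\sqrt{x}$ no constant L can bound these slopes, so the function fails (2.3) even though Theorem 1 makes it uniformly continuous on the compact interval [0,1].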

2.2 Measure of sets in $\mathbb{R}^n$

Many functions that occur in practical applications are not continuous, and
cannot therefore be accommodated in one of the spaces $C^m(\Omega)$. A simple
example is the Heaviside step function, which has many applications in
physics and engineering, and which is defined by

$$H(x) = \begin{cases} 0, & x \le 0, \\ 1, & x > 0. \end{cases}$$

FIGURE 2.7. A Lipschitz continuous function of a single variable

FIGURE 2.8. The Heaviside step function H(x) and its integral, the ramp
function R(x)

Though functions like H(x) are not continuous, they do nevertheless possess
the important property that they are integrable; that is, their integrals exist.
For example, the integral of H(x) is the ramp function R(x) shown in Figure
2.8; clearly, $R(x) \in C(-\infty, \infty)$.
Our aim is to set up a space of functions that may be classified according
to whether they, and their powers, are integrable. That is, for a given
function f we investigate the range of exponents p for which the integral

$$\int_a^b |f(x)|^p \, dx$$

is meaningful (that is, finite), where $p \ge 1$ is a real number. This permits
the introduction of the spaces $L^p(a, b)$ or, more generally, $L^p(\Omega)$. Now recall
that in the case of the spaces $C^m$ it is possible to obtain a precise idea of
the degree of smoothness of a function by determining the largest value
of m for which it belongs to $C^m$. The smoothness of two functions may
then, for example, be compared by determining the largest numbers m of
the spaces $C^m$ of which they are members. In the same way, we will see
that the $L^p$ spaces are also "nested", in the sense that $L^p \subset L^q$ for the
case in which p > q; thus these spaces also provide a means of comparing
functions, this time through their integrability.
In order to give such spaces a proper treatment it is necessary first of
all to discuss the notion of Lebesgue measure. This in turn allows us to
introduce the notion of Lebesgue integration, which is a generalization of
the "standard" Riemann integration, and in so doing to go on to introduce
the spaces $L^p(\Omega)$.

FIGURE 2.9. The basic idea behind (a) Riemann and (b) Lebesgue integration

Measure theory is a well-established branch of mathematics, and Lebesgue
measure is but one example of a measure. It is an important example,
though, and it is also intuitively the easiest to grasp. There is no need to
make reference subsequently to any other measure than that of Lebesgue,
so rather than give a general treatment of the subject, this section is
restricted to an overview of the theory of Lebesgue measure that is concise,
but which nevertheless suffices for our purposes.
In order to appreciate the need to extend the notion of Riemann
integration, we return first to the definition of the Riemann integral. Restricting
the discussion for now to functions of a single variable, consider a function
f defined on the interval [a, b]. The Riemann integral is based on the idea
of dividing [a, b] into a finite number N of subintervals, the kth subinterval
having length $\Delta x_k$, and then considering sums of the form

$$f(x_1)\Delta x_1 + f(x_2)\Delta x_2 + \cdots + f(x_N)\Delta x_N,$$

as shown in Figure 2.9(a). This sum represents an approximation to the
area under the graph of f. If the function is sufficiently well-behaved - for
example, piecewise continuous - then the approximation may be improved
by increasing N, that is, by refining the subdivision of [a, b], so that in the
limit, as N gets very large, we arrive at the Riemann integral, which is
usually denoted by

$$\int_a^b f(x) \, dx.$$
The Riemann integral is the integral used in everyday applications, and
it is generally adequate for most purposes, but it also suffers from certain
deficiencies. For example, there are certain "nasty" functions that we are
unable to deal with using the Riemann integral: an example is the function

$$u(x) = \begin{cases} 1, & x \text{ is rational}, \\ 0, & x \text{ is irrational}, \end{cases} \qquad (2.4)$$

defined on the interval [0,1]. With the more general Lebesgue integral we
avoid these problems; the Lebesgue integral is able to handle functions
like (2.4) and, furthermore, gives the same result as the Riemann integral
if the function is Riemann-integrable. Also, under mild conditions, limits of
Lebesgue-integrable functions are again Lebesgue-integrable.
Although it might seem rather pedantic to abandon the Riemann
integral for the preceding reasons - after all, how often are we required to
integrate something like the function defined in (2.4)? - we demonstrate
later that spaces of Lebesgue-integrable functions possess properties which
allow them to be classified as Banach spaces or Hilbert spaces, with the
fortunate consequence that it is then possible to draw on the vast
reservoir of results for such spaces. From a practical point of view, Riemann
and Lebesgue integrals coincide when the former exists, so all we will have
done would be to broaden the class of functions that can be integrated.
Since the question of whether the integral of a function f makes sense
depends very much on the function, a suitable alternative approach to the
Riemann integral might be to approximate f by a very simple function, the
integral of which can be computed without any difficulty. Then, in contrast
to the Riemann integral, the approximation to the integral of f can be
progressively improved, not by further subdivisions of the domain, but by
refining the approximation to f (see Figure 2.9(b)). The approximating
functions that serve this purpose are indeed known as simple functions,
and are defined to be functions that take on a finite number of values.
Provided that we have no problems with the subsets $M_k$ on which they
take their constant values, the integral of f can be approximated by a sum
of the form

$$\sum_{k=1}^{N} a_k \, \mu(M_k),$$

in which $a_k$ is the constant value taken on $M_k$ and $\mu(M_k)$ denotes the
"size", or measure, of $M_k$. By a process of
refinement which leads to a progressive improvement in the approximation
of f (this is shown schematically in Figure 2.9(b)) in the limit as N goes
to infinity, we arrive at the integral of f.
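
The contrast between the two strategies can be made concrete for a function where both are easy to compute. The sketch below integrates $x^2$ over [0,1] in two ways: a left-endpoint Riemann sum that subdivides the domain, and a simple-function sum that subdivides the range into levels and weights each level by the length of the set on which it is taken. The test function, the choice of levels $k/n$, and the function names are assumptions made for illustration; for this monotone function the sets $M_k$ are intervals, so their measure is just a length.

```python
from math import sqrt

def riemann_sum(f, a, b, n):
    """Left-endpoint Riemann sum: subdivide the domain [a, b] into n equal pieces."""
    h = (b - a) / n
    return sum(f(a + k * h) * h for k in range(n))

def simple_function_sum(n):
    """Approximate the integral of x**2 on [0, 1] by a simple function.

    The range [0, 1] is cut into n levels y_k = k/n. The set M_k on which
    y_k <= x**2 < y_{k+1} is the interval [sqrt(y_k), sqrt(y_{k+1})), whose
    measure is its length. The simple function takes the value y_k on M_k.
    """
    total = 0.0
    for k in range(n):
        y_lo, y_hi = k / n, (k + 1) / n
        measure = sqrt(y_hi) - sqrt(y_lo)
        total += y_lo * measure
    return total

approximations = [
    (n, riemann_sum(lambda x: x * x, 0.0, 1.0, n), simple_function_sum(n))
    for n in (10, 100, 1000)
]
```

Both columns approach 1/3 from below as n grows. For a well-behaved function like this one the two constructions agree in the limit; the second becomes essential only when the sets $M_k$ are too irregular for the first to work.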
Now for this strategy to work we have to have available a means of
measuring the size of sets such as $M_k$, even for approximations of fairly
nasty functions, in which case $M_k$ may take on a rather complex form. So
the problem of constructing an adequate definition of the integral has been
transferred to one of formulating a mathematically acceptable definition of
the size of a set.
It can be shown that, rather surprisingly, not all subsets of $\mathbb{R}^n$ can be
assigned a size that is independent of rotations and translations; this is
known as the Banach-Tarski paradox, the essence of which is that it is
possible to break up a ball of radius r into a finite number of complex
pieces, move them around, and reassemble them to get two balls of radius
r! Those special sets that do not suffer from such shortcomings, and which
will do for evaluations of integrals, are known as measurable sets. So we
have to consider next the issue of identifying those subsets of a set $\Omega$
(which would be $\mathbb{R}^n$ or a subset of $\mathbb{R}^n$) that can be used in an appropriate
definition of the integral.
Suppose that this family of subsets is denoted by $\mathcal{M}$; what properties
do we want the members of this family to have? First, $\Omega$ itself should
be a member of this family. Second, if M belongs to $\mathcal{M}$, that is, M is a
measurable set, then we would like its complement $\Omega - M$ to be measurable
as well. Next, this family should contain the open subsets of $\Omega$. And finally,
the intuitive notion of size dictates that if $\{M_1, M_2, \ldots\}$ is a countable
family of measurable subsets that belong to $\mathcal{M}$ with the property that
the sets $M_k$ are mutually disjoint, then the size of $M_1 \cup M_2 \cup \cdots$ may be
evaluated by determining the size of each of the sets $M_k$, and adding the
result; that is, we require that

$$\mu(M_1 \cup M_2 \cup \cdots) = \mu(M_1) + \mu(M_2) + \cdots.$$

This property of $\mu$ is known as countable additivity.
To summarize, we require that the following categories of sets all belong
to $\mathcal{M}$:

1. $\Omega$ itself;

2. $\Omega - M$, for $M \in \mathcal{M}$;

3. all open sets in $\Omega$; and

4. $M_1 \cup M_2 \cup \cdots$, for any countable family $\{M_1, M_2, \ldots\}$ of disjoint
sets in $\mathcal{M}$.

The members $M_k$ of a family $\mathcal{M}$ of sets that satisfy the properties 1 through
4 are known as measurable sets, and $\Omega$ is called a measurable space. It is
on such spaces that the Lebesgue measure is defined.

Lebesgue measure. In order to define the Lebesgue measure we start
with the familiar: in $\mathbb{R}^n$, define an n-cell to be any set of the form

$$C = \{\mathbf{x} : a_i < x_i < b_i,\ i = 1, 2, \ldots, n\}.$$

Thus a 1-cell is an open interval in $\mathbb{R}$, a 2-cell is the area bounded by a
rectangle in $\mathbb{R}^2$, and so on. We further define the n-volume of C, denoted
vol(C), by

$$\mathrm{vol}(C) = (b_1 - a_1)(b_2 - a_2) \cdots (b_n - a_n).$$

Of course, by the 1-volume of a 1-cell (interval) in $\mathbb{R}$ we understand the
length of the interval, the 2-volume of a 2-cell (rectangle) in $\mathbb{R}^2$ is its area,
and so on. Then the Lebesgue measure $\mu$ on $\mathbb{R}^n$ is a quantity (a function,
to be precise) that measures the content of a set. It is defined to have the
following properties.

1. $\mu(\emptyset) = 0$;

2. $\mu(C) = \mathrm{vol}(C)$ for any n-cell C;

3. if $\{M_1, M_2, \ldots\}$ is a collection of mutually disjoint sets in $\mathcal{M}$, then

$$\mu(M_1 \cup M_2 \cup \cdots) = \mu(M_1) + \mu(M_2) + \cdots;$$

4. the measure of any measurable set can be approximated from above
by open sets; that is, for any measurable M,

$$\mu(M) = \inf\{\mu(O) : M \subset O,\ O \text{ is open}\};$$

5. the measure of any measurable set can be approximated from below
by compact sets (recall that these are closed and bounded sets in $\mathbb{R}^n$);
that is, for any measurable M,

$$\mu(M) = \sup\{\mu(C) : C \subset M,\ C \text{ is compact}\}.$$

Thus we see that Lebesgue measure is a very natural extension of the
familiar notions of length, area, and volume, with the added quality that
the careful definition also permits the measurement of the content of a very
wide range of sets in $\mathbb{R}^n$.
This definition of measure also suffices for the construction of a theory
of integration that works for a range of functions much wider than that
accommodated by Riemann integration.

Sets of measure zero. Suppose that P is a property that may or may
not hold at a point x in $\mathbb{R}^n$. For example, for a given function f,
P could be the property "f(x) = 0". If $\mu$ is the Lebesgue measure on
$\mathbb{R}^n$, then we say that a property holds almost everywhere
(abbreviated a.e.) if the set of points at which it does not hold has
measure zero. For example, if f and g are two functions that are equal
everywhere except at a number of isolated points in $\mathbb{R}$, then
f = g a.e. (Figure 2.10). The same would apply to two functions that are
defined in $\mathbb{R}^2$, and differ only on a curve in $\mathbb{R}^2$, such as
the boundary $\Gamma$ of a domain $\Omega$.
Sets of measure zero are readily identifiable, at least in the kinds of
applications with which we are concerned. For example, any countable set
has zero measure. So in particular, for the sets $\mathbb{Z}$ and $\mathbb{Q}$
of integers and rational numbers, respectively, we have

$$\mu(\mathbb{Z}) = \mu(\mathbb{Q}) = 0.$$
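
The claim that a countable set has zero measure can be seen from a standard
covering argument, sketched here for a countable set $E \subset \mathbb{R}$.
The enumeration $x_1, x_2, \ldots$, the tolerance $\varepsilon$, and the open
set $O_\varepsilon$ are notation introduced for this illustration only; the
second inequality uses countable subadditivity of $\mu$, which is assumed here
rather than derived.

```latex
\begin{align*}
E &= \{x_1, x_2, x_3, \ldots\} \subset \mathbb{R}, \qquad
O_\varepsilon = \bigcup_{k=1}^{\infty}
  \Bigl(x_k - \frac{\varepsilon}{2^{k+1}},\; x_k + \frac{\varepsilon}{2^{k+1}}\Bigr), \\
\mu(E) &\le \mu(O_\varepsilon)
  \le \sum_{k=1}^{\infty} \frac{\varepsilon}{2^{k}} = \varepsilon
  \quad \text{for every } \varepsilon > 0,
  \qquad \text{so } \mu(E) = 0.
\end{align*}
```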
FIGURE 2.10. Functions f and g that are equal almost everywhere

FIGURE 2.11. (a) Disjoint sets: $\Omega_1 \cap \Omega_2 = \emptyset$ and
$\mu(\Omega_1 \cap \Omega_2) = 0$; (b) nonoverlapping sets:
$\Omega_1 \cap \Omega_2 = \emptyset$,
$\overline{\Omega}_1 \cap \overline{\Omega}_2 \neq \emptyset$, and
$\mu(\overline{\Omega}_1 \cap \overline{\Omega}_2) = 0$.

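Another example concerns disjoint or nonoverlapping sets: if $\Omega_1$ and
$\Omega_2$ satisfy $\Omega_1 \cap \Omega_2 = \emptyset$ or, more generally, if
$\Omega_1$ and $\Omega_2$ are nonoverlapping in the sense that
$\overline{\Omega}_1 \cap \overline{\Omega}_2$ is a nonempty set of zero
measure, then

$$\mu(\Omega_1 \cup \Omega_2) = \mu(\Omega_1) + \mu(\Omega_2).$$

This is illustrated in Figure 2.11.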

2.3 Lebesgue integration and the space $L^p(\Omega)$


Before we begin discussion of the integral itself, we have to select a suitable
family or space of functions for which integration can be usefully defined.
Returning to the motivation given at the beginning of the previous section,
we say that a function f defined on a measurable set $\Omega$ in
$\mathbb{R}^n$ is measurable if the inverse image $f^{-1}(M)$ of any measurable
set M (in $\mathbb{R}$, of course) is itself measurable. The idea is illustrated
in Figure 2.12.
FIGURE 2.12. The definition of a measurable function

Before continuing further we consider a few examples of measurable
functions.

Examples
9. Any continuous function is measurable: in particular, if M is the
   interval (c, d) (Figure 2.12), then it is possible to show that
   $f^{-1}(c, d)$ is also open, and hence measurable. To see this, denote
   by J the set $f^{-1}(M)$ and take any point $x_0$ in J, so that
   $y_0 = f(x_0)$ lies in M. Since M is open, it is possible to choose
   $\varepsilon > 0$ such that the neighborhood
   $\{y : |y - y_0| < \varepsilon\}$ lies entirely in M. By the continuity
   of f it follows that there exists $\delta > 0$ such that
   $|f(x) - f(x_0)| < \varepsilon$ whenever $|x - x_0| < \delta$. Every such
   x therefore satisfies $f(x) \in M$, that is, $x \in J$. Thus J is open.
10. Consider the Heaviside function H defined by

    $$H(x) = \begin{cases} 1 & \text{if } x \ge 0, \\ 0 & \text{if } x < 0, \end{cases}$$

    and shown in Figure 2.13; if we choose M as shown in the figure,
    then $H^{-1}(M) = \{x : x \ge 0\}$, which is measurable; on the other
    hand, if we choose the measurable set L, then
    $H^{-1}(L) = \{x : x < 0\}$, which again is measurable. Continuing in
    this way, we can verify that the sets $H^{-1}(M)$ for measurable M are
    all measurable. Thus H is a measurable function.
11. Let $\Omega$ be a measurable set, and E a measurable subset of
    $\Omega$; then the characteristic function $\chi_E$ of E is defined by

    $$\chi_E(x) = \begin{cases} 1 & \text{if } x \in E, \\ 0 & \text{if } x \notin E. \end{cases} \tag{2.5}$$
    This is illustrated in Figure 2.14. It can be shown that $\chi_E$ is a
    measurable function (see Exercise 2.12) provided that E is measurable.

FIGURE 2.13. The Heaviside step function H(x)

FIGURE 2.14. (a) The characteristic function $\chi_E$, and (b) a simple function s

12. We return to the example given in (2.4), and observe that this can be
    written in the alternative form $u = \chi_{\mathbb{Q}}$, where
    $\mathbb{Q}$ is the set of rational numbers. Since $\mathbb{Q}$ is a
    measurable set (with Lebesgue measure zero, since it is countable), it
    follows that the function u is measurable, by Example 11.
With the characteristic function at our disposal we can now define simple
functions s: these are functions on $\Omega$ that take on only a finite number
of values. In other words, suppose that $M_1, M_2, \ldots, M_N$ is a partition
of $\Omega$; then each simple function is a measurable function of the form

$$s(x) = \sum_{k=1}^{N} a_k \chi_{M_k}(x), \tag{2.6}$$

where $a_k$ is the value of s on $M_k$. These notions are illustrated in Figure
2.14. Since sums of measurable functions are measurable, we can conclude
that every simple function is measurable.

The Lebesgue integral. We are now in a position to define the Lebesgue


integral, and we begin by doing so for simple functions, for which case the
definition takes on an intuitively obvious character.
FIGURE 2.15. The Lebesgue integral of a simple function

The (Lebesgue) integral of a simple function s on $\Omega$ is defined by

$$\int_\Omega s \, dx = \sum_{k=1}^{N} a_k \, \mu(M_k),$$

where $M_k$ are measurable and pairwise disjoint (Figure 2.15). A special
case is the integral over a measurable subset E of $\Omega$ with finite
measure; it suffices to put $s = \chi_E$, and we obtain

$$\int_E dx := \int_\Omega \chi_E \, dx = \mu(E).$$
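
As a small illustration of this definition in one dimension, the following
Python sketch stores a simple function as a list of (value, interval) pairs
and sums the products of each value with the measure of its set. The
representation, the name `integrate_simple`, and the restriction to intervals
(whose Lebesgue measure is their length) are assumptions made for this
example only.

```python
from fractions import Fraction

def integrate_simple(pieces):
    """Lebesgue integral of a simple function s = sum_k a_k * chi_{M_k}.

    `pieces` is a list of (a_k, (left, right)) pairs; each M_k is assumed to
    be an interval, so its measure is right - left. The intervals are assumed
    to be pairwise disjoint (this is not checked).
    """
    return sum(a * (right - left) for a, (left, right) in pieces)

# s takes the value 2 on (0, 1/2), the value 5 on (1/2, 1), and 0 elsewhere.
s = [(2, (Fraction(0), Fraction(1, 2))),
     (5, (Fraction(1, 2), Fraction(1)))]
total = integrate_simple(s)   # 2 * 1/2 + 5 * 1/2

# Special case s = chi_E with E = (0, 3): the integral is mu(E).
measure_of_E = integrate_simple([(1, (Fraction(0), Fraction(3)))])
```

Exact rational arithmetic is used so that the sums can be compared with hand
calculations without rounding.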

Returning to Example 12, we find now that it is a trivial matter to integrate
the function u: indeed

$$\int_{\mathbb{R}} u \, dx = \int_{\mathbb{R}} \chi_{\mathbb{Q}} \, dx = \mu(\mathbb{Q}) = 0.$$

Here, then, is one example of a function that is not Riemann-integrable,
but which is indeed Lebesgue-integrable.
We now extend the definition of the integral to more general classes of
functions; this is achieved by approaching measurable functions as limits
of sequences of simple functions.
We begin by introducing the notion of a nondecreasing sequence of simple
functions: this is a sequence $S = \{s_1, s_2, \ldots, s_k, \ldots\}$ of simple
functions which have the property that

$$s_1(x) \le s_2(x) \le \cdots \le s_k(x) \le \cdots .$$

An example is shown in Figure 2.16. A word about nomenclature: the
sequence is termed "nondecreasing" rather than "increasing" since the latter
would refer to functions having the property $s_1 < s_2 < \cdots$; in other
words the possibility of equality would be excluded. To obtain the Lebesgue
integral of a measurable function f we first set up a sequence of nondecreasing
simple functions that approximate f; next, we evaluate the integrals of
these simple functions (this is a well-defined procedure) and, finally, take
the limit to obtain the integral of f. In order for this intuitively
reasonable strategy to work we must first be sure, though, that it will always
be possible to set up such sequences; this guarantee is given in the following
result.

FIGURE 2.16. A nondecreasing sequence of simple functions that approximate a
measurable function f

LEMMA 1. If f is a nonnegative measurable function on $\mathbb{R}^n$, then it
is possible to find a nondecreasing sequence S of simple functions on
$\mathbb{R}^n$ such that

$$\lim_{k \to \infty} s_k(x) = f(x) \quad \text{at all points } x \text{ in } \mathbb{R}^n. \tag{2.7}$$

The meaning of (2.7) should be clear from the discussion of sequences in
Chapter 1: for each k, $s_k(x)$ is a real or complex number, so (2.7) is a
statement concerning the convergence of a sequence of real or complex
numbers. Simply put, a rather arbitrary function may be approximated as
closely as we wish by simple functions. Figure 2.16 illustrates this notion
in one dimension.
We come at last to the definition of the integral, and begin by defining
the integral for nonnegative functions, that is, functions that satisfy the
condition $f(x) \ge 0$ a.e. on their domain. Suppose that f is a measurable
function defined on a measurable set $\Omega$, and further that f is
nonnegative on $\Omega$: $f(x) \ge 0$ a.e.; then the Lebesgue integral of f over
$\Omega$ is defined by

$$\int_\Omega f \, dx = \lim_{k \to \infty} \int_\Omega s_k \, dx, \tag{2.8}$$

where $s_k$ are nondecreasing simple functions that approximate f in the
sense of Lemma 1. The definition (2.8) should be considered in the light
of the discussion of sequences in Chapter 1. Assuming that we are dealing
with real-valued functions (the same argument applies to complex-valued
functions), $\int_\Omega f \, dx$ is a real number, and if we set
$a_k = \int_\Omega s_k \, dx$, then $\{a_k\}$ is a sequence of real numbers. The
definition (2.8) then states that $a_k \to \int_\Omega f \, dx$ as
$k \to \infty$. Thus, in contrast to the approach taken with the
Riemann integral, the Lebesgue integral of f may be obtained as the limit
of integrals of simple functions that approximate f more and more closely
as the limit is approached.
As mentioned earlier, functions that are Riemann-integrable are also
Lebesgue-integrable, and the two integrals coincide. Indeed, for well-behaved
functions (for example, piecewise continuous functions) it is clear
that the Lebesgue integral, like the Riemann integral, amounts to the area
under the graph of the function. However, as we indicated earlier, there are
Lebesgue-integrable functions which are not Riemann-integrable.

Example
13. Suppose that we wish to integrate the function f shown in Figure
    2.16. Now this function is piecewise continuous, and it is well known
    from elementary integration that its integral is the area under the
    triangle, and is equal to $\tfrac{1}{2}$. However, the purpose of this
    example is to show how the definition (2.8) may be deployed in practice,
    so we in fact construct a sequence of nondecreasing simple functions that
    converge to f.
    There are many different ways of constructing the requisite family of
    simple functions; we consider just one, in which the first member $s_1$
    is as shown in Figure 2.16; the second member, $s_2$, is constructed in
    a similar manner, ensuring that it too satisfies $s_2 \le f$. The process
    may now be continued in a fairly obvious manner.
    Considering next the integrals of the simple functions, we see that

    $$\int_{\mathbb{R}} s_1(x) \, dx = \tfrac{1}{4} \quad \text{and} \quad \int_{\mathbb{R}} s_2(x) \, dx = \tfrac{3}{8}.$$

    In fact it is possible to show (see Exercise 2.14) that

    $$\int_{\mathbb{R}} s_k(x) \, dx = \frac{1}{2} - \frac{1}{2^{k+1}},$$

    so that as k goes to $\infty$ we approach the value given by the Riemann
    integral.
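
The limiting process in Example 13 can be checked numerically. Since the
figure is not reproduced here, the sketch below assumes that f(x) = x on
[0, 1] and that $s_k$ is the lower step function on $2^k$ equal subintervals;
this choice is an assumption, although it does reproduce the integrals quoted
in the example. The name `lower_step_integral` is hypothetical.

```python
from fractions import Fraction

def lower_step_integral(k):
    """Integral of the k-th lower step function for f(x) = x on [0, 1].

    [0, 1) is split into 2**k intervals of equal length; on the j-th one
    the simple function takes the value j / 2**k, the infimum of f there.
    """
    n = 2 ** k
    width = Fraction(1, n)                       # measure of each interval
    values = [Fraction(j, n) for j in range(n)]  # value a_j on the j-th interval
    return sum(a * width for a in values)        # sum of a_j * mu(M_j)

integrals = [lower_step_integral(k) for k in range(1, 7)]
closed_form = [Fraction(1, 2) - Fraction(1, 2 ** (k + 1)) for k in range(1, 7)]
```

Under these assumptions the two lists agree term by term, and the integrals
increase toward 1/2 without reaching it.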
To complete the theory of the Lebesgue integral we now extend the
treatment to include functions that are not necessarily nonnegative. Suppose
then that f is any measurable function; then f may be decomposed into
its positive part $f^+$ and negative part $f^-$ (Figure 2.17), which are
defined by

$$f^+(x) = \begin{cases} f(x) & \text{if } f(x) \ge 0, \\ 0 & \text{otherwise} \end{cases}
\quad \text{and} \quad
f^-(x) = \begin{cases} 0 & \text{if } f(x) \ge 0, \\ -f(x) & \text{otherwise.} \end{cases}$$
FIGURE 2.17. The positive and negative parts of a function

More concisely, we can write

$$f^+ = \tfrac{1}{2}(f + |f|), \qquad f^- = \tfrac{1}{2}(|f| - f),$$

so that

$$f = f^+ - f^- .$$
We observe that both $f^+$ and $f^-$ are nonnegative functions, so that the
preceding theory applies to these two components of f. It is possible to
show that $f^+$ and $f^-$ are both measurable if f is, and so we may define
the Lebesgue integral of f by

$$\int_\Omega f \, dx = \int_\Omega f^+ \, dx - \int_\Omega f^- \, dx. \tag{2.9}$$

It is at this point that we can clarify the need to define the integral of a
function in terms of its positive and negative parts; first we need to note
that the integral, as defined in (2.8), need not be finite; that is, it is
possible to have $\int_\Omega f \, dx = +\infty$ for a nonnegative function.
Continuing this line of argument, it is quite conceivable that evaluation of
the right-hand side of (2.9) will give $\infty - \infty$, which is of course
meaningless. It is therefore to be understood that the notation
$\int_\Omega f \, dx$ makes sense only if at least one of the terms
on the right-hand side of (2.9) is finite. We go one step further, and give a
special name to those functions f for which $f^+ + f^- = |f|$ has a finite
integral.
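
The pointwise identities behind this decomposition can be spot-checked with a
few sample values. The helper names below are hypothetical, and each helper
acts on a single value y = f(x) rather than on a function.

```python
def positive_part(y):
    """f^+ evaluated from a value y = f(x): (y + |y|) / 2."""
    return (y + abs(y)) / 2

def negative_part(y):
    """f^- evaluated from a value y = f(x): (|y| - y) / 2."""
    return (abs(y) - y) / 2

samples = [-3.0, -0.5, 0.0, 2.0]
for y in samples:
    p, n = positive_part(y), negative_part(y)
    assert p >= 0 and n >= 0          # both parts are nonnegative
    assert p - n == y                 # f = f^+ - f^-
    assert p + n == abs(y)            # |f| = f^+ + f^-
```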

Integrable functions. A measurable function f is said to be integrable
on a measurable set $\Omega$ in $\mathbb{R}^n$ if

$$\int_\Omega |f| \, dx < \infty.$$

A word about notation is in order. In multivariable calculus it is
customary to write multiple integrals as

$$\int \cdots \int u(x_1, \ldots, x_n) \, dx_1 \cdots dx_n$$

for functions of n variables; so, for example, a double integral is written as

$$\iint_\Omega u(x, y) \, dx \, dy.$$

We adopt the convention throughout that multiple integrals are written in
the concise form

$$\int_\Omega u(x) \, dx,$$

the context making clear the dimension of the domain over which the
integral is taken. This convention has in fact been implicit in the
developments leading to the Lebesgue integral; we made no distinction there
between integrals taken over $\mathbb{R}$ and over $\mathbb{R}^n$, for any n.
It is also worth bearing in mind that since sets of measure zero are
irrelevant in the evaluation of integrals, integrals may be defined over open
sets or over their closures. So, for example, it makes no difference whether
an integral is defined over an open interval (a, b), or over [a, b].
All the usual properties of Riemann integrals extend to Lebesgue inte-
grals, and we summarize without proof some of these properties.

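THEOREM 2. Let u(x) and v(x) be Lebesgue-integrable functions on
$\Omega \subset \mathbb{R}^n$. Then

(a) $\int_\Omega [\alpha u(x) + \beta v(x)] \, dx = \alpha \int_\Omega u(x) \, dx + \beta \int_\Omega v(x) \, dx$
    for constants $\alpha$, $\beta$;

(b) if $u(x) \le v(x)$ for almost all $x \in \Omega$, then

    $$\int_\Omega u(x) \, dx \le \int_\Omega v(x) \, dx;$$

(c) if u(x) is bounded below and above by numbers m and M, respectively, then

    $$m \mu(\Omega) \le \int_\Omega u(x) \, dx \le M \mu(\Omega);$$

(d) $|u|$ is also integrable, and

    $$\left| \int_\Omega u \, dx \right| \le \int_\Omega |u| \, dx.$$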

The following theorem is a powerful tool in functional analysis.

THEOREM 3 (THE LEBESGUE DOMINATED CONVERGENCE THEOREM).
Let $u_1, u_2, \ldots, u_k, \ldots$ be a sequence of measurable functions and
suppose that $|u_k(x)| \le v(x)$ a.e. for each k, where v is an integrable
function. Suppose that $u(x) = \lim_{k \to \infty} u_k(x)$ a.e. Then u is
integrable and

$$\int_\Omega u \, dx = \lim_{k \to \infty} \int_\Omega u_k \, dx.$$

The usefulness of this theorem lies in the very mild conditions that are
placed on $u_k$.
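
A simple worked instance may help fix ideas. The sequence $u_k(x) = x^k$ on
$\Omega = (0, 1)$ and the dominating function $v = 1$ are choices made for
this illustration and do not come from the text.

```latex
\begin{align*}
u_k(x) &= x^k \ \text{on } \Omega = (0,1), \qquad |u_k(x)| \le v(x) = 1, \qquad
  \int_\Omega v\,dx = 1 < \infty, \\
u(x) &= \lim_{k\to\infty} x^k = 0 \ \text{for } 0 < x < 1, \\
\lim_{k\to\infty} \int_0^1 x^k\,dx &= \lim_{k\to\infty} \frac{1}{k+1} = 0 = \int_0^1 u\,dx .
\end{align*}
```

Here the limit and the integral can be interchanged, as the theorem predicts.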

The spaces $L^p(\Omega)$. Let p be a real number with $p \ge 1$. A function
u(x) defined on a subset $\Omega$ of $\mathbb{R}^n$ is said to belong to
$L^p(\Omega)$ if u is measurable and if the (Lebesgue) integral

$$\int_\Omega |u(x)|^p \, dx$$

exists (that is, is finite). The case p = 2 is special in many ways, as the
developments in Chapter 3 and beyond make clear; functions in $L^2(\Omega)$
have the property that

$$\int_\Omega |u(x)|^2 \, dx < \infty,$$

and for this reason are referred to as square-integrable.
Of course, every bounded continuous function defined on a bounded set
$\Omega$ belongs to $L^p(\Omega)$, but there are many other functions that have
this property, as we show in the following.

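Examples

14. The step function H(x) defined by

    $$H(x) = \begin{cases} 0, & x < 0, \\ 1, & 0 \le x \end{cases}$$

    belongs to $L^p(a, b)$ for any $p \ge 1$ and finite a < 0 and b > 0 since

    $$\int_a^b |H(x)|^p \, dx = \int_0^b dx = b < \infty.$$

15. The function $u(x) = x^{-1/3}$ belongs to $L^p(0, 1)$ for any p < 3, since

    $$\int_0^1 |u(x)|^p \, dx = \int_0^1 x^{-p/3} \, dx = \frac{3}{3 - p} \Bigl[ x^{(3-p)/3} \Bigr]_0^1 ,$$

    which is finite for p < 3.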
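
The threshold p = 3 in Example 15 can be explored numerically by integrating
over (eps, 1) and letting eps shrink. The sketch below uses the antiderivative
directly; the function name `truncated_integral` and the sample values of eps
are assumptions made for illustration.

```python
import math

def truncated_integral(p, eps):
    """Integral of x**(-p/3) over (eps, 1), from the antiderivative."""
    q = (3.0 - p) / 3.0            # x**(-p/3) integrates to x**q / q when q != 0
    if q == 0.0:                   # p = 3: the antiderivative is log(x)
        return -math.log(eps)
    return (1.0 - eps ** q) / q

for p in (1, 2, 3, 4):
    print(p, [truncated_integral(p, eps) for eps in (1e-3, 1e-6, 1e-12)])
```

For p < 3 the printed values settle toward 3/(3 - p), whereas for p = 3 and
p = 4 they keep growing as eps decreases.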


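Some results that are frequently useful are embodied in the following
theorem.

THEOREM 4. Suppose that $\Omega$ is a bounded domain in $\mathbb{R}^n$. Then

(a) if $u \in L^p(\Omega)$ and $p \ge p' \ge 1$, then $u \in L^{p'}(\Omega)$;

(b) if $u \in L^p(\Omega)$, then the integrals

    $$\int_\Omega |u(x)| \, dx \quad \text{and} \quad \int_\Omega u(x) \, dx$$

    are finite;

(c) if $u, v \in L^2(\Omega)$, then the integral

    $$\int_\Omega u(x) v(x) \, dx$$

    is finite.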

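PROOF. The proof of (a) relies on the inequality

$$\left[ \int_\Omega |u(x)|^{p'} \, dx \right]^{1/p'} \le \mu(\Omega)^{1/p' - 1/p} \left[ \int_\Omega |u(x)|^p \, dx \right]^{1/p},$$

which holds if $p \ge p'$ (see Exercise 3.22 later for a derivation). Since
$\Omega$ is bounded, $\mu(\Omega)$ is finite. If u belongs to $L^p(\Omega)$,
then the integral on the right is finite, and hence so is the integral
on the left. Thus $u \in L^{p'}(\Omega)$ also.
Part (b) is a trivial consequence of (a): set $p' = 1$; then we have, for
$u \in L^p(\Omega)$,

$$\left| \int_\Omega u(x) \, dx \right| \le \int_\Omega |u(x)| \, dx \le \mu(\Omega)^{1 - 1/p} \left[ \int_\Omega |u(x)|^p \, dx \right]^{1/p} < \infty.$$

Part (c) is a result of the inequality

$$\left| \int_\Omega u(x) v(x) \, dx \right| \le \left[ \int_\Omega |u(x)|^2 \, dx \right]^{1/2} \left[ \int_\Omega |v(x)|^2 \, dx \right]^{1/2},$$

which arises again in Chapter 3 (Theorem 2) in the guise of the
Cauchy-Schwarz inequality. $\Box$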

There is a source of ambiguity in the definition of the space $L^p(\Omega)$
which we must remove in order to deal with it meaningfully. Suppose that f(x)
and g(x) are two measurable functions that are equal a.e. (as in Figure
2.10); then

$$\int_\Omega |f(x) - g(x)|^p \, dx = 0.$$

It follows that $L^p(\Omega)$ can be partitioned into equivalence classes, each
class comprising all those functions that are equal a.e. to a given one. In
order to be able to define $L^p(\Omega)$ as a normed space (in the next
chapter) it is necessary to regard the elements of this space not as functions,
but rather as the equivalence classes of functions defined here.
Notwithstanding this distinction, it is common practice to speak of the members
of $L^p(\Omega)$ as functions; this is a harmless abuse of language provided
that the precise nature of the space is properly understood.

Complex-valued functions. The theory presented here can be extended
in a very obvious way to functions that are complex-valued. Given a
function f that is of the form f(x) = u(x) + iv(x), we say that f is measurable
if u and v are. Furthermore, f is integrable if u and v are, and

$$\int_\Omega f \, dx = \int_\Omega u \, dx + i \int_\Omega v \, dx.$$

The definition of the $L^p$ spaces still stands, if the notation $|f|$ is
interpreted as the modulus of a complex number: $|f|^2 = u^2 + v^2$.

The space $L^\infty(\Omega)$. If we let $p \to \infty$, then we may define the
space $L^\infty(\Omega)$ to be the space of all measurable functions on
$\Omega$ that are bounded almost everywhere on $\Omega$ (that is, except
possibly on subsets of zero measure):

$$L^\infty(\Omega) = \{u : |u(x)| \le k \text{ a.e. on } \Omega \text{ for some } k \in \mathbb{R}\}.$$

Clearly for a bounded domain $\Omega$, $L^\infty(\Omega)$ is a subset of
$L^p(\Omega)$ for all $p \ge 1$, since any $u \in L^\infty(\Omega)$ satisfies

$$\int_\Omega |u(x)|^p \, dx \le \int_\Omega k^p \, dx < \infty,$$

so that $u \in L^p(\Omega)$ also.

Example
16. The function

    $$u(x) = \begin{cases} x^2, & 0 \le x < 1,\ x \ne \tfrac{1}{2}, \\ \to \infty, & x = \tfrac{1}{2}, \end{cases}$$

    is bounded a.e. on (0, 1) since $u(x) \to \infty$ only on a set of measure
    zero (the point $x = \tfrac{1}{2}$).
FIGURE 2.18. The relationship between the $L^p$ spaces and spaces of continuous
functions

It is interesting to note that although we have
$L^\infty(\Omega) \subset \cdots \subset L^p(\Omega) \subset \cdots \subset L^1(\Omega)$,
the space $C(\Omega)$ of continuous functions is not a subset of any of
the $L^p$ spaces. For example, the function $u(x) = x^{-1}$ belongs to
C(0, 1) but not to $L^\infty(0, 1)$ since it is not bounded. But the space of
bounded continuous functions (equivalently, the space $C(\overline{\Omega})$ of
continuous functions defined on a compact set $\overline{\Omega}$) is a subset
of $L^\infty(\Omega)$.
Figure 2.18 shows schematically how the spaces $C^m(\Omega)$ and $L^p(\Omega)$
are related.

2.4 Bibliographical remarks


A good treatment of the concept of continuity may be found in Apostol
[2], Binmore [6], and Lang [29]. The treatment of measure and integra-
tion given here is somewhat superficial, although it suffices for subsequent
needs. There exist many readable accounts of the Lebesgue theory, notable
examples being Kolmogorov and Fomin [26], Reed and Simon [40], Roy-
den [44], and Rudin [45], and Roman [42] gives an account that should be
particularly accessible to nonspecialists in mathematics.
2.5 Exercises
Continuous functions and the space $C^m(\Omega)$
2.1. Sketch and discuss the continuity of the functions

(a) $u(x) = x/(x^2 - 1)$, $-\infty < x < \infty$.

(b) u(x) = { ~~ - Ixl)/x, ~: ~ .

2.2. Show that the following functions are continuous on the intervals
given:
(a) polynomials of degree k defined on the interval [a, b];
(b) the function $u(x) = x^{1/2}$ on $[0, \infty)$.
Is either of these functions uniformly continuous?
2.3. (a) Show that f(x) = 1/x is not uniformly continuous on (0, 1).
[Hint: recall that f(x) - f(y) = (y - x)/xy. Show, for example
by choosing x = 1/n and y appropriately, that the distance
$|x - y|$ can be made arbitrarily small although $|f(x) - f(y)|$ is
large.]
(b) Show, on the other hand, that f(x) = 1/x is uniformly
continuous on [a, b], where b > a > 0.
2.4. Show that $f(\mathbf{x}) = x^2 + 2y$ is continuous at any point
$\mathbf{x} = (x, y)$ in $\mathbb{R}^2$.
2.5. Let E be a closed connected set in $\mathbb{R}^2$, and for any point x in
$\mathbb{R}^2$ define the function f by f(x) = d(x, E), where d(x, E) is the
distance between x and E, defined by

$$d(x, E) = \inf\{|x - y| : y \in E\}.$$

Draw a sketch that illustrates the function f, and show that f is
continuous.
2.6. If f is continuous at a point $x_0$ in [a, b] with $f(x_0) > 0$, show that
there is a neighborhood $(x_0 - h, x_0 + h)$ about $x_0$ in which f is positive.
2.7. Prove Bolzano's Theorem, which states that if f(x) is a continuous
function on [a, b] with f(a)f(b) < 0 (that is, f(a) and f(b) have
different signs), then there is at least one point c in [a, b] such that
f(c) = 0. [Use the result in Exercise 2.6.]
2.8. To which spaces $C^m(\Omega)$ do the following functions belong?

(a) u(x) = { ~'(I + x),


(b) $u(x) = (\sin x)(1 - y)$, $(x, y) \in [0, \pi] \times [0, 1]$.

()
cux
"() = {O,1, 00<S; xx<S; l~ '
2.9. Examine the continuity of u(x) = r on the unit disk, where
$r^2 = x^2 + y^2$.
2.10. Show that every Lipschitz function is uniformly continuous.
Measure of sets in $\mathbb{R}^n$
2.11. Let I be an interval in $\mathbb{R}$, and consider the subset of all
irrational numbers in I. Is this set measurable? If so, calculate its measure.
2.12. Show that the characteristic function $\chi_E$ of a set E is a measurable
function if and only if E is itself measurable.
Lebesgue integration and the spaces $L^p(\Omega)$
2.13. Prove Lemma 1.
2.14. Verify that the integral of the nth simple function approximating the
function f in Example 13 has the value $\tfrac{1}{2} - \tfrac{1}{2^{n+1}}$.
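2.15. For the function f defined by

$$f(x) = \begin{cases} -1 & \text{if } -1 \le x < 0, \\ +1 & \text{if } 0 \le x \le 1, \\ 0 & \text{if } |x| > 1, \end{cases}$$

find $f^+$ and $f^-$ and determine the integral using (2.8). Repeat the
exercise for the case in which

$$g(x) = \begin{cases} -1 & \text{if } -1 \le x < 0, \\ +1 & \text{if } x \ge 0, \\ 0 & \text{if } x < -1. \end{cases}$$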
2.16. Show that the Lebesgue integral of f exists if and only if that of $|f|$
exists, and that

$$\left| \int_\Omega f \, dx \right| \le \int_\Omega |f| \, dx.$$

2.17. What relationship must be satisfied by $\alpha$ and p in order that
$u(x) = x^\alpha$ belongs to $L^p(0, 1)$? And for u to belong to
$L^p(1, \infty)$?
2.18. For what values of the real number $\alpha$ does the complex-valued
function f defined by $f(x) = x^\alpha (1 - xi)$ belong to $L^2(0, 1)$?
2.19. Prove Theorem 4(c), which states that if $u, v \in L^2(\Omega)$, then
$\int_\Omega u(x) v(x) \, dx$ is finite.
3
Vector spaces, normed, and inner
product spaces

From elementary courses in vector algebra and analysis we know that the
idea of a vector as a directed line segment is not sufficient for us to build up
a nontrivial theory, let alone be of use in concrete applications. Additional
structure has to be added: we agree to add together vectors using the
parallelogram law, and we define various forms of multiplication of vectors,
for example, the scalar (dot) product and the vector (cross) product. Once
these properties have been adopted, it becomes possible to construct a
fairly sophisticated theory.
The same is true of sets in general. A set without structure is sterile,
and not of much use from the point of view of the analyst. The question of
what kinds of properties to assume is generally answered by looking at the
properties of simple sets like $\mathbb{R}$ or the set of vectors, and by
generalizing accordingly. This process of generalization is a recurrent theme
in the next few chapters, and in this chapter we begin the process by defining
first a vector space to be, broadly speaking, an arbitrary set whose members
behave as vectors. Then we show how properties such as "length",
"distance" and "scalar product" can be defined for vector spaces, leading to
the notions of normed and inner product spaces.

3.1 Vector spaces and subspaces


We are familiar with the idea of a set being a collection of objects, all of
which have a specified property. In most applications, though, it is useful
to be able to add together multiples of members of a set and to have the
assurance that the result of such an operation will yield something that is
also a member of that set. This is the essence of a vector space.

FIGURE 3.1. Vector addition and subtraction

Suppose we generalize from the behavior of vectors, starting by first
reviewing some familiar properties of the set V of all vectors in
three-dimensional space. Given vectors u, v, w and real numbers $\alpha$,
$\beta$ we know that:

1. $\alpha u + \beta v$ is also a vector (sums of multiples of vectors are
   also vectors, as shown in Figure 3.1);

2. u + v = v + u, and u + (v + w) = (u + v) + w (when adding vectors
   together, the result does not depend on the order in which the
   addition is carried out);

3. there is a vector 0 called the zero vector that has the property
   u + 0 = u for all vectors u;

4. there is a vector -u, called the negative of u, that has the property
   u + (-u) = 0 (we normally write this as u - u = 0). This in turn
   defines subtraction: by the difference u - v we then mean the vector
   u + (-v) (Figure 3.1);

5. $(\alpha\beta)u = \alpha(\beta u)$;
6. $(\alpha + \beta)u = \alpha u + \beta u$, and
   $\alpha(u + v) = \alpha u + \alpha v$;
7. $1 \cdot u = u$ (this, with 6, tells us that
   $u = (1 + 0)u = 1 \cdot u + 0 \cdot u = u + 0 \cdot u$, so that
   $0 \cdot u = 0$).

Now all of these properties of vectors are readily generalized to any set,
and this is what we do next.

Vector space. Let X be a set, and let $\mathbb{K}$ be either the set
$\mathbb{R}$ of real numbers or the set $\mathbb{C}$ of complex numbers, either
of these being referred to here as scalars, for convenience. Then X is called
a vector space (or linear space) if it has an operation + called addition, an
operation of multiplication by a scalar, and satisfies the following axioms.
VS1. for all $u, v \in X$, and scalars $\alpha$, $\beta$, $\alpha u + \beta v$
is also a member of X;

VS2. u + v = v + u and u + (v + w) = (u + v) + w for all $u, v, w \in X$;

VS3. there is an element 0 of X called the zero element that has the
property

u + 0 = u for all $u \in X$;

VS4. for every $u \in X$ there is an element -u that satisfies u + (-u) = 0;
then by the difference u - v we understand u + (-v);

VS5. $(\alpha\beta)u = \alpha(\beta u)$ for all scalars $\alpha$, $\beta$ and
for all $u \in X$;

VS6. $(\alpha + \beta)u = \alpha u + \beta u$, and
$\alpha(u + v) = \alpha u + \alpha v$ for all scalars $\alpha$, $\beta$ and
for all $u, v \in X$;

VS7. $1 \cdot u = u$.
When $\mathbb{K}$ is chosen to be the real numbers, then X is called a real
vector space, whereas it is referred to as a complex vector space if
$\mathbb{K}$ is chosen to be $\mathbb{C}$. These two sets do not exhaust the
choices of scalars that may be made, but they more than suffice for our needs.
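
As an informal check of Axioms VS2 through VS7, the following Python sketch
tests them on a few sample n-tuples with the usual componentwise operations.
The function names `add`, `scale`, and `check_axioms` and the sample data are
hypothetical, and a finite spot check of this kind is of course not a proof.

```python
from fractions import Fraction
from itertools import product

def add(x, y):
    """Componentwise addition of two n-tuples."""
    return tuple(a + b for a, b in zip(x, y))

def scale(alpha, x):
    """Multiplication of an n-tuple by the scalar alpha."""
    return tuple(alpha * a for a in x)

def check_axioms(vectors, scalars):
    """Return True if the sampled instances of VS2 through VS7 all hold."""
    zero = tuple(Fraction(0) for _ in vectors[0])
    for u, v, w in product(vectors, repeat=3):
        if add(u, v) != add(v, u):                      # VS2, commutativity
            return False
        if add(u, add(v, w)) != add(add(u, v), w):      # VS2, associativity
            return False
    for u in vectors:
        if add(u, zero) != u:                           # VS3
            return False
        if add(u, scale(-1, u)) != zero:                # VS4
            return False
        if scale(1, u) != u:                            # VS7
            return False
    for (a, b), (u, v) in product(product(scalars, repeat=2),
                                  product(vectors, repeat=2)):
        if scale(a * b, u) != scale(a, scale(b, u)):    # VS5
            return False
        if scale(a + b, u) != add(scale(a, u), scale(b, u)):      # VS6
            return False
        if scale(a, add(u, v)) != add(scale(a, u), scale(a, v)):  # VS6
            return False
    return True

vectors = [(Fraction(1), Fraction(-2), Fraction(3)),
           (Fraction(0), Fraction(1, 2), Fraction(4)),
           (Fraction(-5), Fraction(7, 3), Fraction(0))]
scalars = [Fraction(2), Fraction(-1, 3), Fraction(0)]
all_hold = check_axioms(vectors, scalars)
```

Rational numbers are used instead of floats so that equalities such as the
distributive laws are tested exactly rather than up to rounding.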

Examples

1. We start with a trivial example: the set V of vectors in $\mathbb{R}^3$ is
   a vector space. Indeed, V served as a model for setting up the axioms of a
   vector space.

2. The set $\mathbb{R}^n$ of n-tuples is a real vector space, with addition
   defined by

   $$x + y = (x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n) \quad \text{for } x, y \in \mathbb{R}^n$$

   and scalar multiplication by

   $$\alpha x = (\alpha x_1, \alpha x_2, \ldots, \alpha x_n).$$

   The zero element is 0 = (0, ..., 0) and the element -x is given by
   $-x = (-x_1, \ldots, -x_n)$.

3. The set $\mathbb{C}^n$ of n-tuples of complex numbers is a complex vector
   space, the operations of addition and scalar multiplication being defined
   as in the case of $\mathbb{R}^n$, with the scalars now being complex numbers.
4. The set $C^m(\Omega)$ of m-times continuously differentiable functions on
   $\Omega$ is a vector space. For, if u and v are two such functions, then so
   is the function $\alpha u + \beta v$ defined by
   $(\alpha u + \beta v)(x) = \alpha u(x) + \beta v(x)$. The zero element is
   simply the zero function and the function -u is the function satisfying
   $(-u)(x) = -1 \cdot u(x)$. It is a real or complex vector space, according
   as the functions are real- or complex-valued.

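5. The space $L^p(\Omega)$ is a vector space for $1 \le p < \infty$; this
   follows from the Minkowski inequality for integrals

   $$\left[ \int_\Omega |u \pm v|^p \, dx \right]^{1/p} \le \left[ \int_\Omega |u|^p \, dx \right]^{1/p} + \left[ \int_\Omega |v|^p \, dx \right]^{1/p}, \tag{3.1}$$

   the derivation of which is discussed in Exercise 3.6. If we replace u
   by $\alpha u$ and v by $\beta v$ in (3.1) then we see that

   $$\begin{aligned}
   \left[ \int_\Omega |\alpha u + \beta v|^p \, dx \right]^{1/p}
   &\le \left[ \int_\Omega |\alpha u|^p \, dx \right]^{1/p} + \left[ \int_\Omega |\beta v|^p \, dx \right]^{1/p} \\
   &= \left[ |\alpha|^p \int_\Omega |u|^p \, dx \right]^{1/p} + \left[ |\beta|^p \int_\Omega |v|^p \, dx \right]^{1/p} \\
   &= |\alpha| \left[ \int_\Omega |u|^p \, dx \right]^{1/p} + |\beta| \left[ \int_\Omega |v|^p \, dx \right]^{1/p},
   \end{aligned}$$

   and this last expression is finite since u and v belong to $L^p(\Omega)$.
   Hence $\alpha u + \beta v \in L^p(\Omega)$. The remaining axioms are readily
   shown to be valid.
   The space $L^\infty(\Omega)$ is likewise a vector space, as is easily
   verified. As in the case of $C^m(\Omega)$, the spaces $L^p(\Omega)$ are real
   or complex vector spaces, according as the functions are real- or
   complex-valued. If complex-valued, then $|\cdot|$ in the Minkowski
   inequality is interpreted as the modulus of a complex number.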

6. The set X of all nonnegative continuous functions, defined by

   $$X = \{u : u \in C(\Omega),\ u(x) \ge 0,\ x \in \Omega\},$$

   is not a vector space since, for example, $\alpha u$ is not a member of X
   for negative values of $\alpha$.

Since all vector spaces are sets, it is natural to enquire whether subsets of
vector spaces are also vector spaces. This is not always true, but in those
cases in which it is true we give the subset a special name.

Subspace. A subspace Y of a vector space X is a subset of X that is also
a vector space.
FIGURE 3.2. Planes passing through the origin are subspaces of $\mathbb{R}^3$

Examples
7. Consider the vector space $\mathbb{R}^3$; all points of the form (x, y, 0)
   form a subspace of $\mathbb{R}^3$ (the xy plane, in common parlance) since
   sums of multiples of points in the xy plane also lie in this plane. Indeed,
   the set of points of any plane or line passing through the origin is a
   subspace of $\mathbb{R}^3$ (Figure 3.2).
8. The set $P_3[0, 1]$ of polynomials of degree $\le 3$ forms a subset of
   C[0, 1] and constitutes a subspace: for any polynomials
   $p(x), q(x) \in P_3[0, 1]$, $\alpha p(x) + \beta q(x)$ is also a polynomial
   of degree $\le 3$, and therefore belongs to $P_3[0, 1]$.
9. The set $C(\overline{\Omega})$ of bounded continuous functions forms a
   subspace of $L^p(\Omega)$ (see Section 2.3) for $1 \le p \le \infty$.
Sum of subspaces. Given two subspaces V, W of a vector space X, we
define the sum of V and W, denoted by V + W, to be the set of all members
of X of the form v + w with $v \in V$ and $w \in W$. In other words,

$$V + W = \{u \in X : u = v + w \text{ for } v \in V,\ w \in W\}.$$

The set V + W is also a subspace of X since if u and $\bar{u}$ are members of
V + W, so that u = v + w and $\bar{u} = \bar{v} + \bar{w}$ with
$v, \bar{v} \in V$ and $w, \bar{w} \in W$, then it follows that

$$\alpha u + \beta \bar{u} = \alpha(v + w) + \beta(\bar{v} + \bar{w}) = \underbrace{(\alpha v + \beta \bar{v})}_{\in V} + \underbrace{(\alpha w + \beta \bar{w})}_{\in W},$$

so that $\alpha u + \beta \bar{u}$ is also in V + W.

FIGURE 3.3. Direct sum of subspaces


Example

10. Let $X = \mathbb{R}^3$ and
    $V = \{x \in \mathbb{R}^3 : x = (\alpha, 0, 0),\ \alpha \in \mathbb{R}\}$,
    $W = \{x \in \mathbb{R}^3 : x = (\alpha, \beta, 0),\ \alpha, \beta \in \mathbb{R}\}$
    (that is, V is the x axis and W the xy plane). Then the sum of V and W is
    the subspace of X consisting of the points $x = (\alpha, \beta, 0)$ for real
    numbers $\alpha$, $\beta$.
Direct sum of subspaces. If V and W are subspaces of a vector space
X, then X is said to be the direct sum of V and W if (i) it is the sum of
V and W; and (ii) V and W have only the zero element in common, that
is, $V \cap W = \{0\}$. The direct sum is denoted by $V \oplus W$.

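Example

11. Let $X = \mathbb{R}^3$, $U = \{x \in \mathbb{R}^3 : x = (\alpha, \beta, 0)\}$,
    $V = \{x \in \mathbb{R}^3 : x = (0, \beta, \gamma)\}$, and
    $W = \{x \in \mathbb{R}^3 : x = (0, 0, \gamma)\}$. Then clearly

    $$X = U + V \quad \text{and} \quad X = U + W.$$

    But $U \cap V = \{x : x = (0, \beta, 0)\} \ne \{0\}$, whereas
    $U \cap W = \{0\}$. Thus $X = U \oplus W$ (Figure 3.3).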
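
The practical difference between the two sums in Example 11 shows up when a
specific vector is decomposed. The vector (1, 1, 1) and the parameter t below
are chosen purely for illustration.

```latex
\begin{align*}
\text{in } U \oplus W:&\quad (1,1,1) = \underbrace{(1,1,0)}_{\in U} + \underbrace{(0,0,1)}_{\in W}
  \quad\text{(the only possibility)}, \\
\text{in } U + V:&\quad (1,1,1) = \underbrace{(1,t,0)}_{\in U} + \underbrace{(0,1-t,1)}_{\in V}
  \quad\text{for every real } t .
\end{align*}
```

The freedom in t comes from the common elements $(0, \beta, 0)$ of U and V.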
The question of when an arbitrary member u of a vector space X has a
unique representation u = v + w for $v \in V$, $w \in W$ is easily resolved,
as we show in the next result.

THEOREM 1. Let X be a vector space. Then $X = V \oplus W$ if and only if
for any $u \in X$ there are unique members $v \in V$ and $w \in W$ such that
u = v + w.

The proof of this theorem is treated in Exercise 3.4.

3.2 Inner product spaces


It is now possible to generalize to vector spaces many of the fundamental
concepts of vector algebra and analysis, and we start with the concept of
an inner product or scalar product. Recall that the scalar product
$u \cdot v$ of two vectors u and v is a real number given by

$$u \cdot v = |u||v| \cos\theta,$$

where $\theta$ is the angle between u and v (Figure 3.4). The inner product has
the following properties. It is symmetric ($u \cdot v = v \cdot u$), linear
($(\alpha u + \beta v) \cdot w = \alpha u \cdot w + \beta v \cdot w$), and
positive-definite ($u \cdot u \ge 0$ and $u \cdot u = 0$ iff u = 0).
Furthermore, the scalar product in turn provides a means of measuring the
length or norm of a vector: indeed, for any vector u,
$|u| = (u \cdot u)^{1/2}$. And finally, with the scalar product at our disposal
it is possible to measure the distance between two points x and y in
$\mathbb{R}^3$: if this distance is denoted by d(x, y), then

$$d(x, y) = \sqrt{(y - x) \cdot (y - x)} = |y - x| = \sqrt{(y_1 - x_1)^2 + (y_2 - x_2)^2 + (y_3 - x_3)^2}.$$

The function $d(\cdot, \cdot)$, being a device for measuring distances between
points, is called the metric. The concepts of inner product, norm, and metric
are defined in much the same way for arbitrary vector spaces. In this section
we take the first step in this direction, and deal with the inner product.

FIGURE 3.4. The inner product of two vectors

Inner product and inner product space. Let X be a complex vector
space; then the inner product $(u, v)$ of $u, v \in X$ is an operation that satisfies
the following axioms, for all $u, v, w \in X$ and $\alpha, \beta \in \mathbb{C}$.

CIP1. $(u, v) \in \mathbb{C}$ (the inner product is complex-valued).

CIP2. $(v, u) = \overline{(u, v)}$ (the operation is Hermitian).

CIP3. $(\alpha u + \beta v, w) = \alpha (u, w) + \beta (v, w)$ (it is linear in the first slot).

CIP4. $(u, u) \geq 0$ and $(u, u) = 0$ iff $u = 0$ (it is positive-definite).


A vector space X endowed with an inner product $(\cdot, \cdot)$ is called an inner
product space. Since it is the vector space together with the inner product
that defines the inner product space, the conventional and more proper
notation is $(X, (\cdot, \cdot))$; however, when it is clear which particular inner product
is being used, or when the details of the inner product are not pertinent,
the inner product space is denoted simply by X.
Although the inner product is in general a complex number, the inner
product $(u, u)$ of any member with itself is always a real number, from
Axiom CIP2, since this axiom tells us that $(u, u) = \overline{(u, u)}$. It follows then
that the axiom of positive-definiteness CIP4 makes complete sense (it would
not if $(u, u)$ were complex).

Linearity. Linearity is a notion that recurs on a regular basis, given that
the focus of this work is on linear functional analysis and its applications.
It is as well, therefore, to spend a moment considering its attributes. The
Axiom CIP3 of linearity, which applies only to the first slot of the inner
product, is actually made up of two parts: it states that the inner product
is

additive, that is, $(u + v, w) = (u, w) + (v, w)$, (3.2)

and

homogeneous, that is, $(\alpha u, v) = \alpha (u, v)$. (3.3)

We thus have the relationship

additivity + homogeneity = linearity.

These two components of linearity can easily be deduced from CIP3, simply
by setting $\alpha = \beta = 1$ (to obtain the property of additivity) and then
by setting $\beta = 0$ (to obtain homogeneity). Conversely, the properties of
additivity and homogeneity may be combined to give CIP3, by replacing u
and v in (3.2) with $\alpha u$ and $\beta v$, and then by invoking homogeneity.
Whether linearity is defined with respect to the first or second slot in the
inner product does make a difference for complex inner product spaces. To
see this we note first of all that, with the use of the inner product axioms
and the properties of complex conjugation,

$$(u, v + w) = \overline{(v + w, u)} = \overline{(v, u) + (w, u)} = \overline{(v, u)} + \overline{(w, u)} = (u, v) + (u, w)$$

so that additivity in the second slot follows. However, the inner product
is not homogeneous in the second slot since, for any complex number $\alpha$,
Axioms CIP2 and CIP3 give

$$(u, \alpha v) = \overline{(\alpha v, u)} = \overline{\alpha (v, u)} = \bar{\alpha}\, \overline{(v, u)} = \bar{\alpha} (u, v).$$
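
The difference between the two slots can be checked numerically. The following is a minimal sketch, assuming the standard inner product $(u, v) = \sum_i u_i \overline{v_i}$ on $\mathbb{C}^n$ (this particular inner product, the function name `inner`, and the sample vectors are illustrative choices, not taken from the text):

```python
def inner(u, v):
    """Assumed inner product on C^n, linear in the first slot: sum of u_i * conj(v_i)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


# Illustrative data (hypothetical values).
u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 0.5j]
alpha = 2 + 3j

scaled_u = [alpha * x for x in u]
scaled_v = [alpha * x for x in v]

# First slot: (alpha u, v) should equal alpha (u, v).
first_slot_error = abs(inner(scaled_u, v) - alpha * inner(u, v))

# Second slot: (u, alpha v) should equal conj(alpha) (u, v), not alpha (u, v).
second_slot_error = abs(inner(u, scaled_v) - alpha.conjugate() * inner(u, v))
naive_second_slot_error = abs(inner(u, scaled_v) - alpha * inner(u, v))

print(first_slot_error, second_slot_error, naive_second_slot_error)
```

The first two printed values should be zero up to rounding, while the third should not, which is the failure of homogeneity in the second slot.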


The property of linearity implies that the inner product of the zero
element with any member v is zero; that is,

$(0, v) = 0$ for all $v \in X$.

This follows from the observation that, for any $u \in X$, $(u, v) = (u + 0, v) =
(u, v) + (0, v)$. Comparison of the left- and right-hand sides gives the desired
result. In the same way it follows that $(v, 0) = 0$ for any $v \in X$.

Real inner product spaces. It is possible to define an inner product
on real vector spaces as well; indeed, this is a very important special case
to which frequent reference is made. To do so, it suffices to change $\mathbb{C}$ to $\mathbb{R}$
in the preceding set of axioms, which then conveniently reduce to the set
of axioms appropriate to real vector spaces. For convenience we summarize
these here in one place:
for $u, v, w \in X$ and $\alpha, \beta \in \mathbb{R}$,

RIP1. $(u, v) \in \mathbb{R}$;

RIP2. $(v, u) = (u, v)$ (the operation is symmetric);

RIP3. $(\alpha u + \beta v, w) = \alpha (u, w) + \beta (v, w)$ (linearity);

RIP4. $(u, u) \geq 0$ and $(u, u) = 0$ iff $u = 0$ (positive-definiteness).

We note in particular that the property of Hermitian symmetry reduces
to that of plain symmetry, for the real case, whereas the other axioms are
unchanged. For real inner product spaces we have in addition the property
of linearity in the second slot; this follows from (3.2) and (3.3), bearing in
mind that $\alpha$ is a real number in this case.
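
Linearity in the second slot for the real case can be written out in one chain. The following is a sketch that uses only the symmetry and first-slot linearity axioms of a real inner product, with $\alpha, \beta \in \mathbb{R}$ and $u, v, w \in X$:

```latex
\begin{align*}
(u, \alpha v + \beta w) &= (\alpha v + \beta w, u) && \text{(symmetry)} \\
                        &= \alpha (v, u) + \beta (w, u) && \text{(linearity in the first slot)} \\
                        &= \alpha (u, v) + \beta (u, w) && \text{(symmetry)}.
\end{align*}
```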
In general we refer simply to inner product spaces, and the context makes
clear whether the space is real or complex. General results are always proved
for complex inner product spaces, since these then apply a fortiori to real
spaces.

Examples

12. Let $X = \mathbb{R}^3$; then the conventional or Euclidean scalar product
defined by

$$x \cdot y = x_1 y_1 + x_2 y_2 + x_3 y_3$$

for $x = (x_1, x_2, x_3)$ and $y = (y_1, y_2, y_3)$ satisfies the (real) inner
product axioms.

13. Consider the space $L^2(a, b)$ of square-integrable functions defined on
the interval $(a, b)$. An inner product for $L^2(a, b)$ may be defined by

$(u, v) = \int_a^b u(x) \overline{v(x)}\,dx$ for $u, v \in L^2(a, b)$. (3.4)

We have, in particular,

$$(v, u) = \int_a^b v(x) \overline{u(x)}\,dx = \int_a^b \overline{u(x)}\, v(x)\,dx = \int_a^b \overline{u(x)}\; \overline{\overline{v(x)}}\,dx$$
$$= \int_a^b \overline{u(x) \overline{v(x)}}\,dx = \overline{\int_a^b u(x) \overline{v(x)}\,dx} = \overline{(u, v)}$$

using the properties of complex numbers, including the property
that, for complex integrands, the complex conjugate of the integral
equals the integral of the complex conjugate (can you see why?).
Thus Axiom CIP2 is satisfied. Second,

$$(\alpha u + \beta v, w) = \int_a^b [\alpha u(x) + \beta v(x)] \overline{w(x)}\,dx = \alpha \int_a^b u(x) \overline{w(x)}\,dx + \beta \int_a^b v(x) \overline{w(x)}\,dx = \alpha (u, w) + \beta (v, w)$$

and so Axiom CIP3 is satisfied. Finally,

$$(u, u) = \int_a^b u(x) \overline{u(x)}\,dx = \int_a^b |u(x)|^2\,dx,$$

which is clearly non-negative since it is the integral of a non-negative
function. The only function u for which this integral vanishes is u(x) = 0 a.e.
(recall the properties of the Lebesgue integral), and so Axiom CIP4
is satisfied.
FIGURE 3.5. Orthogonal vectors

Orthogonality. We continue the abstraction of properties of vectors
in three-dimensional space, and recall next the concept of orthogonality:
two vectors u and v are orthogonal if $u \cdot v = 0$, that is, if they are at
right angles to each other (Figure 3.5). Since we have at our disposal the
concept of an inner product, it is a very simple matter to extend this
notion of orthogonality to any inner product space, irrespective of whether
the geometric interpretation applies. Thus two members u, v of an inner
product space X are said to be orthogonal if

$$(u, v) = 0.$$

When this is the case we write, as in the case of vectors, $u \perp v$.

Example

14. Consider the functions $u(x) = \sin x$ and $v(x) = \cos x$, with $u, v \in
L^2(-\pi, \pi)$. Making use of the inner product (3.4) (but bearing in
mind that we are dealing with real-valued functions here) we find
that

$$(u, v) = \int_{-\pi}^{\pi} \sin x \cos x\,dx = 0$$

and so u and v are orthogonal in $L^2(-\pi, \pi)$.
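
The orthogonality of sin and cos on $(-\pi, \pi)$ can also be seen numerically. The sketch below approximates the real form of the inner product (3.4) by the midpoint rule; the function name `l2_inner` and the number of subintervals are illustrative assumptions:

```python
import math


def l2_inner(u, v, a, b, n=10000):
    """Approximate the real L2(a, b) inner product of u and v by the midpoint rule.

    n is the (assumed) number of equal subintervals used in the approximation.
    """
    h = (b - a) / n
    return h * sum(u(a + (k + 0.5) * h) * v(a + (k + 0.5) * h) for k in range(n))


# (sin, cos) should be close to zero; (sin, sin) should be close to pi.
sin_cos = l2_inner(math.sin, math.cos, -math.pi, math.pi)
sin_sin = l2_inner(math.sin, math.sin, -math.pi, math.pi)
print(sin_cos, sin_sin)
```

The second value is included only for contrast: sin is not orthogonal to itself, and $(u, u)$ is strictly positive as the positive-definiteness axiom requires.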

The Cauchy-Schwarz inequality. We return for a moment to vectors
and observe another property of the dot product. Recall that for two vectors
u and v with an angle $\theta$ between them,

$$u \cdot v = |u||v| \cos\theta$$

or

$$u \cdot v = (u \cdot u)^{1/2} (v \cdot v)^{1/2} \cos\theta.$$

But $|\cos\theta| \leq 1$, and so it follows that

$$|u \cdot v| \leq (u \cdot u)^{1/2} (v \cdot v)^{1/2}.$$

This property in fact holds for any inner product space, as the next result
shows.

THEOREM 2 (THE CAUCHY-SCHWARZ INEQUALITY). If u and v are members
of an inner product space X with inner product $(\cdot, \cdot)$, then

$$|(u, v)| \leq (u, u)^{1/2} (v, v)^{1/2}.$$ (3.5)

PROOF. We assume that neither u nor v is zero; for the case in which either
of these is zero, (3.5) is satisfied trivially. The proof then follows from the
observation that, for any complex number $\alpha$,

$$(u - \alpha v, u - \alpha v) \geq 0,$$

using Axiom CIP4. Upon expansion and use of the axioms of linearity and
Hermitian symmetry this becomes

$$0 \leq (u, u) - (\alpha v, u) - (u, \alpha v) + (\alpha v, \alpha v)$$
$$= (u, u) - (\alpha v, u) - \overline{(\alpha v, u)} + (\alpha v, \alpha v)$$
$$= (u, u) - 2\,\mathrm{Re}[\alpha (v, u)] + |\alpha|^2 (v, v),$$

where Re z denotes the real part of a complex number z. Now $\alpha$ is arbitrary,
so if we choose $\alpha$ to be equal to $\overline{(v, u)}/(v, v)$, then $|\alpha| = |(v, u)|/(v, v)$
(remember that $(v, v)$ is real) and

$$0 \leq (u, u) - 2\,\mathrm{Re}[\overline{(v, u)} (v, u)/(v, v)] + |(v, u)|^2/(v, v)$$
$$= (u, u) - |(v, u)|^2/(v, v).$$

The desired result is then obtained by rearranging the terms, multiplying
throughout by $(v, v)$, taking the square root of both sides, and noting that
$|(v, u)| = |(u, v)|$. □
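
As a sanity check rather than a proof, the inequality (3.5) can be tested on randomly generated vectors. The sketch below assumes the standard inner product on $\mathbb{C}^4$; the helper names, the dimension, and the random seed are illustrative choices:

```python
import random


def inner(u, v):
    """Assumed inner product on C^n, linear in the first slot."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


def length(u):
    """(u, u)^(1/2); (u, u) is real and non-negative, so its real part is used."""
    return inner(u, u).real ** 0.5


def random_vector(n, rng):
    """A vector in C^n with real and imaginary parts drawn uniformly from [-1, 1]."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


def cauchy_schwarz_gap(u, v):
    """Return (u,u)^(1/2) (v,v)^(1/2) - |(u, v)|, which (3.5) says is non-negative."""
    return length(u) * length(v) - abs(inner(u, v))


rng = random.Random(0)
gaps = [cauchy_schwarz_gap(random_vector(4, rng), random_vector(4, rng))
        for _ in range(1000)]
smallest_gap = min(gaps)

# When v is a scalar multiple of u the two sides coincide (up to rounding).
u = random_vector(4, rng)
v = [(2 - 1j) * x for x in u]
equality_gap = cauchy_schwarz_gap(u, v)

print(smallest_gap, equality_gap)
```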

3.3 Normed spaces


At the beginning of the previous section we introduced the concept of a
norm by drawing an analogy with the notion of the length of a vector. As
with the definition of an inner product, this notion may be abstracted in
a natural way if we start from scratch with an arbitrary vector space X: a
norm $\|\cdot\|$ on X is an operation that satisfies the following axioms for any
members u, v of X, and scalars (real or complex, as appropriate) $\alpha$:

N1. $\|u\| \in \mathbb{R}$.

N2. $\|u\| \geq 0$ and $\|u\| = 0$ iff $u = 0$ (positive-definiteness).

N3. $\|\alpha u\| = |\alpha| \|u\|$ (positive homogeneity).

N4. $\|u + v\| \leq \|u\| + \|v\|$ (triangle inequality).

FIGURE 3.6. Axioms N3 and N4 as they apply to vectors


A few remarks about these axioms are in order. The first asserts that, by
analogy with the length of a vector, the norm is expected to yield a real
number; furthermore, Axiom N2 asserts that this quantity will always be
positive, except for the case of the zero element, in which case it is zero.
The third axiom, that of positive homogeneity, is again a straightforward
abstraction of a property of vectors, as shown in Figure 3.6. It is
also clear from this figure why it is positive homogeneity, and not simply
homogeneity as in the case of the inner product, that is required. Axiom
N3 applies equally to real and complex spaces, with $|\alpha|$ being interpreted
appropriately.
Finally, the triangle inequality abstracts the situation that is summarized
in the parallelogram law for addition of vectors (Figure 3.6).

Examples
15. Let $X = \mathbb{R}^3$; then the usual or Euclidean norm defined on $\mathbb{R}^3$ is

$\|x\| = (x_1^2 + x_2^2 + x_3^2)^{1/2}$ for $x = (x_1, x_2, x_3)$.

The extension to $\mathbb{R}^n$ is obvious.

16. Norms, like inner products, are not unique quantities. A case in point
is $\mathbb{R}^n$, on which it is possible to define a whole family of norms: for
each real number p in the range $1 \leq p < \infty$ the quantity $\|\cdot\|_p$ defined
by

$$\|x\|_p = \left[\sum_{i=1}^n |x_i|^p\right]^{1/p}$$

is a norm on $\mathbb{R}^n$, the Euclidean norm corresponding to the case p = 2.
Axioms N2 and N3 are seen by inspection to hold, and the triangle
inequality is a consequence of the Minkowski inequality for sums; for
$1 \leq p < \infty$,

$$\left[\sum_{i=1}^n |x_i \pm y_i|^p\right]^{1/p} \leq \left[\sum_{i=1}^n |x_i|^p\right]^{1/p} + \left[\sum_{i=1}^n |y_i|^p\right]^{1/p}.$$

The proof of this inequality is outlined in Exercise 3.19. The case
$p = \infty$ may also be included in this family if we define $\|\cdot\|_\infty$ on $\mathbb{R}^n$
by

$$\|x\|_\infty = \max_{1 \leq i \leq n} |x_i|;$$

this is also a norm on $\mathbb{R}^n$.


17. Let $X = L^p(\Omega)$ with $1 \leq p < \infty$. The standard norm on $L^p$ is defined
by

$$\|u\|_{L^p} = \left[\int_\Omega |u(x)|^p\,dx\right]^{1/p},$$ (3.6)

which is of course a well-defined quantity for any $u \in L^p(\Omega)$. Axioms
N1 through N3 are easily shown to hold, and the triangle inequality
follows from the Minkowski inequality for integrals (3.1) which can
be written, using the notation (3.6), as

$$\|u \pm v\|_{L^p} \leq \|u\|_{L^p} + \|v\|_{L^p}.$$

For convenience, and when there is no danger of ambiguity, we denote
the $L^p$-norm simply by $\|\cdot\|_p$ rather than the more cumbersome $\|\cdot\|_{L^p}$.
18. Consider the space $L^\infty(\Omega)$ of bounded measurable functions, that is,
functions u that satisfy $|u(x)| \leq k$ a.e. on $\Omega$. Because we are dealing
with equivalence classes of functions, the notion of the supremum
is meaningless, and has to be replaced by the essential supremum,
defined to be the greatest lower bound of the constants k that bound
$|u|$ almost everywhere:

$$\operatorname*{ess\,sup}_{x \in \Omega} |u(x)| = \inf\{k : |u(x)| \leq k \text{ a.e.}\}.$$

Then $L^\infty(\Omega)$ is a normed space, with norm $\|\cdot\|_{L^\infty}$ defined by

$$\|u\|_{L^\infty} = \operatorname*{ess\,sup}_{x \in \Omega} |u(x)|.$$

The first three norm axioms obviously hold; to verify the triangle
inequality we note that, for any two functions u and v in $L^\infty(a, b)$,
u(x) and v(x) are simply real or complex numbers, so that

$$|u(x) + v(x)| \leq |u(x)| + |v(x)|;$$

thus, recalling the properties of the supremum in Chapter 1, and bearing
in mind that these properties carry over to the essential supremum,
we have (Figure 3.7)

$$\|u + v\|_{L^\infty} = \operatorname{ess\,sup} |u(x) + v(x)|$$
$$\leq \operatorname{ess\,sup} (|u(x)| + |v(x)|)$$
$$\leq \operatorname{ess\,sup} |u(x)| + \operatorname{ess\,sup} |v(x)|$$
$$= \|u\|_{L^\infty} + \|v\|_{L^\infty}.$$

Here, too, it is convenient to denote the $L^\infty$-norm simply by $\|\cdot\|_\infty$
when this is unlikely to be ambiguous.

FIGURE 3.7. The triangle inequality for functions in $L^\infty(a, b)$

19. Since the space $C(\bar{\Omega})$ of bounded continuous functions is a subspace
of $L^\infty(\Omega)$, it follows that we may endow $C(\bar{\Omega})$ with the sup-norm
$\|\cdot\|_\infty$, thereby making it a normed space. In the case of continuous
functions it suffices of course to make use of the supremum, or indeed
the maximum, rather than the essential supremum. Under these
circumstances it suffices also simply to denote this norm by $\|\cdot\|_\infty$; thus,

$\|u\|_\infty = \max_{x \in \bar{\Omega}} |u(x)|$ for $u \in C(\bar{\Omega})$. (3.7)
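
The family of p-norms on $\mathbb{R}^n$ described in Example 16 is easy to experiment with. The following is a minimal sketch; the function name `p_norm` and the sample vectors are illustrative, and the triangle inequality is only spot-checked for a few values of p, not proved:

```python
def p_norm(x, p):
    """Return the p-norm of a sequence of real numbers.

    p may be any real number with p >= 1, or float('inf') for the max-norm.
    """
    if p == float("inf"):
        return max(abs(xi) for xi in x)
    if p < 1:
        raise ValueError("p must satisfy p >= 1")
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)


# Illustrative vectors (hypothetical values).
x = [3.0, -4.0, 0.0]
y = [1.0, 2.0, -2.0]
x_plus_y = [a + b for a, b in zip(x, y)]

for p in (1, 2, 3, float("inf")):
    # Spot-check of the triangle inequality (axiom N4) for this p.
    holds = p_norm(x_plus_y, p) <= p_norm(x, p) + p_norm(y, p) + 1e-12
    print(p, p_norm(x, p), holds)
```

Different values of p assign different lengths to the same vector, which is the sense in which norms are not unique quantities.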

Normed space. A vector space X with a norm $\|\cdot\|$ defined on it is called
a normed space. Since it is the vector space together with the norm that
defines the normed space, the conventional and more proper notation is
$(X, \|\cdot\|)$; however, when it is clear which particular norm is being used,
or when the details of the norm are not pertinent, the normed space is
denoted simply by X.

The norm generated by an inner product. The norm is a primitive
concept that does not require for its definition the existence of an inner
product. Indeed, a norm is any operation $\|\cdot\|$ that satisfies N1 through N4.
But if we have an inner product space $(X, (\cdot, \cdot))$, then a norm $\|\cdot\|$ on X
may be defined according to

$$\|u\| = (u, u)^{1/2}$$ (3.8)

and we say that $\|\cdot\|$ in (3.8) is the norm generated by the inner product.
The analogy with vectors in $\mathbb{R}^3$ is clear: given the scalar (inner) product
defined for vectors, the norm or length of a vector u is given by

$$|u| = (u \cdot u)^{1/2}.$$

The question now arises: is $\|\cdot\|$ defined in (3.8) really a norm? That is, does
it satisfy all of the norm axioms? The answer, of course, is yes: first, the
quantity $(u, u)$ is real, so that N1 is satisfied. Second, positive-definiteness
of $\|u\|$ follows directly from the positive-definiteness of the inner product.
Positive homogeneity is verified by considering that, for any complex $\alpha$,

$$\|\alpha u\|^2 = (\alpha u, \alpha u) = \alpha \bar{\alpha} (u, u) = |\alpha|^2 \|u\|^2$$

using properties CIP2 and CIP3 of the inner product. Finally, in order to
show that the triangle inequality is satisfied we consider

$$\|u + v\|^2 = (u + v, u + v)$$
$$= (u, u) + 2\,\mathrm{Re}(u, v) + (v, v)$$
$$= \|u\|^2 + 2\,\mathrm{Re}(u, v) + \|v\|^2$$
$$\leq \|u\|^2 + 2|(u, v)| + \|v\|^2$$
$$\leq \|u\|^2 + 2\|u\|\|v\| + \|v\|^2 \quad \text{(using the Cauchy-Schwarz inequality)}$$
$$= (\|u\| + \|v\|)^2.$$

The desired result is now obtained by taking the square root of both sides.
With the understanding that the norm on an inner product space is that
generated by the inner product, the Cauchy-Schwarz inequality may be
written in the alternative form

$$|(u, v)| \leq \|u\|\|v\|.$$

The parallelogram law. The preceding discussion shows that every
inner product space is automatically a normed space, so it is natural
to enquire whether the converse is true: in other words, can a norm be
used to generate an inner product? The answer, unfortunately, is negative.
To see this, we introduce an identity that is valid on any inner product space, and
which may be used to provide a partial answer to the question. This identity
is known as the parallelogram law as a result of its interpretation in $\mathbb{R}^2$
(Figure 3.8). Let X be an inner product space with norm $\|u\| = (u, u)^{1/2}$;
then

$$\|u + v\|^2 + \|u - v\|^2 = 2(\|u\|^2 + \|v\|^2)$$ (3.9)

for all $u, v \in X$. Comparison of (3.9) with Figure 3.8 should make clear
the reason for referring to this identity as the parallelogram law. Indeed,
from the cosine rule in $\mathbb{R}^2$, $\|u - v\|^2 = \|u\|^2 + \|v\|^2 - 2\|u\|\|v\| \cos\theta$ and
$\|u + v\|^2 = \|u\|^2 + \|v\|^2 - 2\|u\|\|v\| \cos(180° - \theta)$; adding, we obtain (3.9).
The proof for the more general case is easily carried out (see Exercise
3.10). Since the parallelogram law holds for any norm generated by an
inner product it follows that, for any normed space X, if the norm does not
satisfy the parallelogram law, then there is no inner product that generates
this norm. When this is so, the space X is not an inner product space.

FIGURE 3.8. The parallelogram law in two-dimensional space

Example
20. Consider $C[0, 1]$ with the sup-norm $\|\cdot\|_\infty$. Then choosing

$u(x) = 1$ and $v(x) = x$

we have

$$\|u + v\|_\infty = \sup |u(x) + v(x)| = \sup |1 + x| = 2,$$
$$\|u - v\|_\infty = \sup |1 - x| = 1.$$

Thus

$$\|u + v\|_\infty^2 + \|u - v\|_\infty^2 = 5.$$

On the other hand,

$\|u\|_\infty = \sup |1| = 1$ and $\|v\|_\infty = \sup |x| = 1$

and so

$$\|u\|_\infty^2 + \|v\|_\infty^2 = 2.$$

The right-hand side of (3.9) is therefore 4, whereas the left-hand side
is 5. The parallelogram law does not hold, and so $C[0, 1]$ with the
sup-norm is not an inner product space. In the same way we can show
that $C(\bar{\Omega})$ with the sup-norm is also not an inner product space.
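
The arithmetic of Example 20 can be reproduced with a short script. This is a sketch only: the sup-norm is approximated by sampling on a uniform grid (an assumption that happens to be exact here, because the maxima of the functions involved are attained at the endpoints of [0, 1]), and the name `sup_norm` is illustrative:

```python
def sup_norm(f, a=0.0, b=1.0, n=1000):
    """Approximate the sup-norm of f on [a, b] by sampling n + 1 equally spaced points."""
    return max(abs(f(a + (b - a) * k / n)) for k in range(n + 1))


def u(x):
    return 1.0


def v(x):
    return x


# Left- and right-hand sides of the parallelogram law (3.9) in the sup-norm.
lhs = sup_norm(lambda x: u(x) + v(x)) ** 2 + sup_norm(lambda x: u(x) - v(x)) ** 2
rhs = 2 * (sup_norm(u) ** 2 + sup_norm(v) ** 2)
print(lhs, rhs)
```

The two printed values differ, in agreement with the conclusion that the sup-norm on C[0, 1] is not generated by any inner product.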
Equivalent norms. It has already been pointed out that a norm is not a
unique object, in the sense that a variety of norms may be defined on any
given vector space. Suppose then that two alternative norms $\|\cdot\|_A$ and $\|\cdot\|_B$
are defined on a vector space X. These norms are said to be equivalent to
each other if there are positive constants m and M such that

$$m\|u\|_A \leq \|u\|_B \leq M\|u\|_A$$ (3.10)

for all $u \in X$.

FIGURE 3.9. The triangle inequality in $\mathbb{R}^2$

Example
21. Consider the case $X = \mathbb{R}^2$, with the norms $\|\cdot\|_2$ and $\|\cdot\|_\infty$ defined
in Examples 15 and 16. Since $|x_1| \leq \|x\|_2$ and $|x_2| \leq \|x\|_2$, it follows
that $\max_i |x_i| = \|x\|_\infty \leq \|x\|_2$. Furthermore, $|x_1| \leq \|x\|_\infty$ and $|x_2| \leq
\|x\|_\infty$; squaring and adding, we find that $\|x\|_2^2 \leq 2\|x\|_\infty^2$. Thus

$$\|x\|_\infty \leq \|x\|_2 \leq \sqrt{2}\,\|x\|_\infty$$

and $\|\cdot\|_2$ and $\|\cdot\|_\infty$ are equivalent norms.
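
The two-sided bound of Example 21 can be spot-checked on sample points of $\mathbb{R}^2$. The sketch below is illustrative only (random sampling with a fixed seed and a small rounding tolerance are assumptions of the example, and a finite sample proves nothing); it also records the two points at which the bounds are attained:

```python
import math
import random


def norm_2(x):
    """Euclidean norm on R^2."""
    return math.hypot(x[0], x[1])


def norm_inf(x):
    """Max-norm on R^2."""
    return max(abs(x[0]), abs(x[1]))


TOL = 1e-12  # allowance for floating-point rounding

rng = random.Random(1)
points = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(1000)]

bounds_hold = all(
    norm_inf(x) <= norm_2(x) + TOL and norm_2(x) <= math.sqrt(2) * norm_inf(x) + TOL
    for x in points
)

# The lower bound is attained at (1, 0) and the upper bound at (1, 1).
lower_attained = abs(norm_2((1.0, 0.0)) - norm_inf((1.0, 0.0))) < TOL
upper_attained = abs(norm_2((1.0, 1.0)) - math.sqrt(2) * norm_inf((1.0, 1.0))) < TOL
print(bounds_hold, lower_attained, upper_attained)
```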

3.4 Metric spaces


The final geometrical property that we wish to abstract is the notion of a
metric. The motivation comes, as before, from the situation in $\mathbb{R}^3$, in which
the distance $d(x, y)$ between two points x and y is given by

$$d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2}.$$

A further property enjoyed by points in $\mathbb{R}^3$ is the triangle inequality (Figure
3.9): the length of one side of a triangle is less than or equal to the sum of
the lengths of the other two sides. That is,

$$d(x, z) \leq d(x, y) + d(y, z).$$

As with the concepts of inner product and norm, we define a metric $d(\cdot, \cdot)$
on an abstract set to have properties analogous to those relevant to distances
between points in $\mathbb{R}^n$. One important respect in which the metric
differs from both the norm and inner product is that it does not require
the structure of a vector space for its definition.

Metric and metric space. Let X be a set. If u and v are two members
of X, a metric on X is a real number $d(u, v)$ with the following properties,
for any $u, v, w \in X$.

M1. $d(u, v) \geq 0$ and $d(u, v) = 0$ if and only if v = u.

M2. $d(v, u) = d(u, v)$.

M3. $d(u, w) \leq d(u, v) + d(v, w)$.

A set X with a metric $d(\cdot, \cdot)$ defined on it is called a metric space. In order
to emphasize the particular metric defined on a set, a metric space may
alternatively be denoted by $(X, d)$.

The metric generated by a norm. The metric requires less structure
on the underlying set for its definition than do the norm and inner product.
Rather than defining a metric from scratch, we always work with normed
(including inner product) spaces, and define the corresponding metric by

$$d(u, v) = \|u - v\|.$$ (3.11)

When $d(\cdot, \cdot)$ is thus defined, we say that $d(\cdot, \cdot)$ is the metric generated by
the norm $\|\cdot\|$. That (3.11) does indeed satisfy the axioms for a metric is
easily shown, and is left as an exercise.
There are nevertheless many examples of metrics that are not generated
by norms, as some of the following examples show.

Examples

22. The discrete metric on any set X is defined by

$$d(u, v) = \begin{cases} 0 & \text{if } v = u \\ 1 & \text{otherwise.} \end{cases}$$ (3.12)

Provided that the set X is nonempty, the definition of this metric
makes sense, and it may be checked that it satisfies the axioms for a
metric.

23. Let $X = \mathbb{R}^2$, and define $d(\cdot, \cdot)$ by

$$d(x, y) = \begin{cases} 0 & \text{if } y = x \\ |x| + |y| & \text{otherwise.} \end{cases}$$
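
The metric axioms M1 through M3 can be checked by brute force on a small finite set. The following is a sketch for the discrete metric (3.12); the checker `is_metric_on`, the sample set, and the contrasting function `squared_difference` are hypothetical additions for illustration, and a check on a finite sample is not a proof for arbitrary X:

```python
from itertools import product


def discrete_metric(u, v):
    """The discrete metric (3.12): 0 if the points coincide, 1 otherwise."""
    return 0 if u == v else 1


def squared_difference(u, v):
    """A non-example for real numbers: it violates the triangle inequality M3."""
    return (u - v) ** 2


def is_metric_on(points, d):
    """Check axioms M1, M2, and M3 for d on every triple drawn from points."""
    for u, v, w in product(points, repeat=3):
        if d(u, v) < 0 or (d(u, v) == 0) != (u == v):  # M1
            return False
        if d(v, u) != d(u, v):  # M2
            return False
        if d(u, w) > d(u, v) + d(v, w):  # M3
            return False
    return True


sample = [0, 1, 2, 3, 4]
discrete_ok = is_metric_on(sample, discrete_metric)
squared_ok = is_metric_on(sample, squared_difference)
print(discrete_ok, squared_ok)
```

The squared difference fails because, for instance, the "distance" from 0 to 2 exceeds the sum of those from 0 to 1 and from 1 to 2.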

3.5 Bibliographical remarks


The concept of a vector space is a purely algebraic one, requiring as it does
only a set of rules for combining elements and multiplying them by scalars.
Further details on vector spaces may be found in texts on linear algebra,
and good sources are Hoffman and Kunze [20] and Strang [50], as well as
the text by Lang [29].

Good accounts of metric, normed, and inner product spaces may be
found in Kolmogorov and Fomin [25], Kreyszig [27], Naylor and Sell [33],
Rektorys [41], Roman [43], Smirnov [49], and Zeidler [54].

3.6 Exercises
Vector spaces and subspaces
3.1. Which of the following are vector spaces?
(a) the set of $m \times n$ matrices;
(b) the set of $m \times m$ matrices with determinant equal to 1;
(c) the set of points $X = \{x : x = (x_1, x_2) \in \mathbb{R}^2,\ x_2 \geq 0\}$ (that is,
the upper half plane);
(d) the set of solutions to the differential equation

$$a(x) \frac{d^2 u}{dx^2} + b(x) \frac{du}{dx} + c(x) u = 0, \quad 0 < x < 1;$$

(e) the set of solutions to the differential equation

$$a(x) \frac{d^2 u}{dx^2} + b(x) \frac{du}{dx} + c(x) u + d(x) = 0, \quad 0 < x < 1.$$
3.2. Consider the vector space $\mathbb{R}^2$ of ordered pairs. Which of the following
subsets of $\mathbb{R}^2$ are subspaces?

(a) $V = \{x = (x, y) : x = 0\}$;
(b) $V = \{x = (x, y) : x + y = 1\}$.

3.3. Which of the following subsets of $C[a, b]$ are subspaces?

(a) $V = \{u \in C[a, b] : u(a) = u(b) = 0\}$;
(b) $V = \{u \in C[a, b] : u(a) = u(b) = 1\}$;
(c) $V = \{u \in C[a, b] : \int_a^b u(x)\,dx = 0\}$;
(d) $V = \{u \in C[0, 1] : u(x) = u(y)$ for all x, y such that $x + y = 1\}$.

3.4. Prove Theorem 1, which states that if V and W are subspaces of a
vector space X, then $X = V \oplus W$ if and only if every $u \in X$ has the
unique representation

$$u = v + w$$

for some $v \in V$, $w \in W$.

3.5. Let $X = C[-1, 1]$, $V = \{v \in C[-1, 1] : v(x) = v(-x)\}$ (the set of even
functions), and $W = \{w \in C[-1, 1] : w(x) = -w(-x)\}$ (the set of odd
functions). Verify that $X = V \oplus W$.

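3.6. The purpose of this exercise is to prove the Minkowski inequality for
integrals

$$\left[\int_\Omega |u \pm v|^p\,dx\right]^{1/p} \leq \left[\int_\Omega |u|^p\,dx\right]^{1/p} + \left[\int_\Omega |v|^p\,dx\right]^{1/p}.$$

(a) Show that $\alpha\beta \leq (\alpha^p/p) + (\beta^q/q)$, where $1/p + 1/q = 1$ and
$\alpha, \beta \in \mathbb{R}$. [Consider the following sketch. Show that area A =
$\alpha^p/p$, area B = $\beta^q/q$.] Set $\alpha = u(x) / \left[\int_\Omega |u(x)|^p\,dx\right]^{1/p}$ and
$\beta = v(x) / \left[\int_\Omega |v(x)|^q\,dx\right]^{1/q}$, integrate and manipulate to get
the Hölder inequality

$$\int_\Omega |uv|\,dx \leq \left[\int_\Omega |u|^p\,dx\right]^{1/p} \left[\int_\Omega |v|^q\,dx\right]^{1/q}.$$ (3.13)

[Sketch: the curve $y = x^{p-1}$, or $x = y^{q-1}$, with the regions A and B.]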
(b) Use the identity

to obtain the Minkowski inequality.

Inner product spaces

3.7. If (u,w) = (v,w) for all w E X, show that u = v.

3.8. Consider the space $C^m[0, 1]$ with inner product $(\cdot, \cdot)_m$ defined by

Given $u(x) = x^3$ and $v(x) = 1 - (3x^2/2)$, show that u and v are
orthogonal with respect to the inner product $(\cdot, \cdot)_0$. Are they orthogonal
with respect to $(\cdot, \cdot)_1$? Verify the Cauchy-Schwarz inequality using
the inner product $(\cdot, \cdot)_1$.

Normed spaces

3.9. For a normed space X show that

$$\big|\, \|u\| - \|v\| \,\big| \leq \|u - v\|$$ for all $u, v \in X$.

3.10. Prove the parallelogram law

$$\|u + v\|^2 + \|u - v\|^2 = 2(\|u\|^2 + \|v\|^2),$$

where X is an inner product space.

3.11. Let u and v be nonzero elements in a real inner product space X.
Show that

$$\|u + v\| = \|u\| + \|v\|$$

if and only if $v = \alpha u$ for some real number $\alpha > 0$.

3.12. If X is a real inner product space, show that $\|x - y\| + \|y - z\| =
\|x - z\|$ if and only if $y = \alpha x + (1 - \alpha) z$, where $0 \leq \alpha \leq 1$, and $\|\cdot\|$ is
the norm generated by the inner product on X. Interpret this result
for the case $X = \mathbb{R}^2$.

3.13. Show that the quantity

$$\|u\| = \left[\int_a^b \left(\frac{du}{dx}\right)^2 dx\right]^{1/2}, \quad u \in X$$

satisfies the norm axioms for the case in which X is the space

$$X = \{u : u \in C^1[a, b],\ u(a) = u(b) = 0\}.$$

3.15. Show that

where u, v are members of a complex inner product space.



3.16. A subset V of a linear space X is said to be convex if, for every
$u, v \in V$, $\alpha u + (1 - \alpha) v$ is also in V, where $0 \leq \alpha \leq 1$. Show that the
closed ball $B = \{u \in X : \|u\| \leq 1\}$ is convex. What does B look like
when X = C(0, 1) with the sup-norm?

[Figure: a convex set V, containing the point $\alpha u + (1 - \alpha) v$ between u and v, and a nonconvex set V]

3.17. Let X be a real inner product space. Show that $u \perp v$ in X if and
only if $\|u + \alpha v\| = \|u - \alpha v\|$ for all real numbers $\alpha$, where $\|\cdot\|$ is the
norm generated by the inner product on X. Illustrate this result in
$\mathbb{R}^2$.
3.18. The distance from a point x in a normed space X to a closed and
bounded subset B of X is defined by $d(x, B) = \inf\{\|x - y\| : y \in B\}$.
Calculate $d(x, B)$ if $X = \mathbb{R}^2$, x = (1, 1), B is the closed disk of radius
~ and center (~, 0), and X has (i) the Euclidean norm; and (ii) the
norm $\|\cdot\|_1$ (see Example 16).
3.19. The purpose of this exercise is to show that

$$\|x\|_p = \left[|x_1|^p + \cdots + |x_n|^p\right]^{1/p}$$

defines a norm on $\mathbb{R}^n$, for $1 \leq p < \infty$. In Exercise 3.6(a) set

sum over 1 to n, and manipulate to get the Hölder inequality for sums

$$\sum_{i=1}^n |x_i y_i| \leq \|x\|_p \|y\|_q.$$

Use the identity in Exercise 3.6(b) to obtain the Minkowski inequality
for sums

$$\left[\sum_{i=1}^n |x_i \pm y_i|^p\right]^{1/p} \leq \|x\|_p + \|y\|_p,$$

which confirms that $\|\cdot\|_p$ is a norm for $\mathbb{R}^n$.

3.20. For any normed space V, the closed ball with center 0 and radius r is
defined by $B(0, r) = \{u \in V : \|u\| \leq r\}$. Sketch $B(0, r)$ for the case
in which $V = \mathbb{R}^2$ with the norms $\|\cdot\|_p$ for p = 1, 2, and $\infty$.

3.21. Show that $\|\cdot\|_1$ and $\|\cdot\|_2$ are equivalent norms on $\mathbb{R}^2$.
3.22. The aim of this exercise is to show that, for a bounded domain $\Omega$,
(3.14)

for $p > r \geq 1$, so that if $u \in L^p(\Omega)$, then $u \in L^r(\Omega)$ also. First, let
p, q, r be real numbers such that

$\frac{1}{p} + \frac{1}{q} = \frac{1}{r}$ or $\frac{1}{(p/r)} + \frac{1}{(q/r)} = 1$. (3.15)

Replace u by $u^r$ and v by $v^r$ in Hölder's inequality (3.13) and use
(3.15) to obtain the generalization

(3.16)

of Hölder's inequality. Then use (3.16) to obtain (3.14).

3.23. Show by means of a counterexample that the $L^1$-norm does not
generate an inner product.

Metric spaces

3.24. Let $D = \{z \in \mathbb{C} : |z| \leq 1\}$ be the closed unit disk in the complex
plane, and define

$$d(z, w) = \begin{cases} |z - w| & \text{if } \arg z = \arg w \text{ or if one of } z \text{ and } w \text{ is zero,} \\ |z| + |w| & \text{otherwise.} \end{cases}$$

Verify that $d(\cdot, \cdot)$ defines a metric on D. This space is called the
"French railroad space"; sketch a picture of the action of $d(\cdot, \cdot)$ to see
why this is so.
3.25. Verify that (3.12) does indeed satisfy the axioms for a metric.
4
Properties of normed spaces

Normed and inner product spaces possess a wealth of properties, and these
in turn allow sophisticated theories to be developed and applied in a variety
of contexts. Some of these properties are introduced in this chapter.
Arguably the most basic concept, and one which pervades most discus-
sions involving normed spaces, is that of convergence of sequences. Se-
quences were introduced in Chapter 1, in the context of real and complex
numbers. We show in Section 4.1 that the definition of convergence of a
sequence in a normed space is a natural extension of that given in Chapter
1.
In Section 4.2 we focus attention on sequences in spaces of functions;
these are a special case which occurs so often in the future as to warrant
devoting some time to the elucidation of their characteristics.
The notion of completeness pervades functional analysis, and complete
normed and inner product spaces are sufficiently important to be given
special names: a complete normed space is called a Banach space and a
complete inner product space is known as a Hilbert space. We describe
completeness in Section 4.3, and then show in Section 4.4 how completeness
of a space is related to the closedness of that space. We also discuss in this
section the issue of how to complete a space that lacks this property.
Finally, in Section 4.5, we discuss further properties of inner product
spaces. In particular, we extend to arbitrary Hilbert spaces a property that
is fairly obvious in three-dimensional space. $\mathbb{R}^3$ may be decomposed into two
orthogonal subspaces (a simple example, once a set of Cartesian axes has
been introduced, would be the xy-plane and the z-axis), and every vector
may be written uniquely as the sum of orthogonal components in these
two subspaces, as shown in Figure 4.1. The generalization of this notion
to arbitrary Hilbert spaces is known as the projection theorem, which also
features later on.

FIGURE 4.1. An example of orthogonal decomposition of a vector in $\mathbb{R}^3$

4.1 Sequences
Sequences of numbers were defined in Chapter 1; here we look at sequences
in normed spaces generally. A sequence in a normed space X is an ordered
set in X whose members can be labeled with positive integers. We write
$\{u_1, u_2, \ldots\}$ or $\{u_k\}_{k=1}^\infty$.

Example

1. By way of moving away from sequences of numbers, consider the
sequence of functions described by (Figure 4.2)

$$\{u_n\}_{n=1}^\infty \subset C[a, b], \quad u_n(x) = n(x - a).$$

Ultimately what is of most interest about sequences is the way in which
they behave as n gets progressively larger; this brings us to the next topic,
namely, that of convergence.

Convergence of sequences. The notion of convergence of a sequence of
elements in a normed space carries over in a natural way from the definition
for sequences of numbers. Let Y be a subset of a normed space X, then,
and suppose that $\{u_n\}$ is a sequence in Y. Let u belong to Y, and form
the sequence of real numbers $\{\|u_1 - u\|, \|u_2 - u\|, \ldots, \|u_n - u\|, \ldots\}$. If the
sequence of numbers $\|u_n - u\|$ converges to zero as n gets larger, we agree
to call the sequence convergent. Another, more formal way of stating this
is as follows: pick any positive number $\epsilon$. Then $\{u_n\}$ is said to converge to
some element $u \in Y$ if, for any $\epsilon > 0$, it is always possible to make $\|u_n - u\|$
smaller than $\epsilon$ simply by choosing n large enough, larger than some number
N, say (Figure 4.3). The groundwork for a precise definition of convergence
has now been laid.

FIGURE 4.2. The sequence of functions with general member $u_n(x) = n(x - a)$

FIGURE 4.3. Convergence of a sequence to a point u

Convergence of a sequence in a normed space. A sequence $\{u_n\}$ in
a subset Y of a normed space X is convergent if there is a member $u \in Y$
for which, given any $\epsilon > 0$, a number N can be found such that

$\|u_n - u\| < \epsilon$ for all $n > N$. (4.1)

If this is the case, we write $u_n \to u$ (which is read "$u_n$ converges to u"),
and u is called the limit of the sequence. Yet another way of stating (4.1)
informally is

$\lim_{n \to \infty} \|u_n - u\| = 0$ or $\lim_{n \to \infty} u_n = u$, (4.2)

which is read "the limit as n tends to $\infty$ of $u_n$ is u". Note, however, that
by (4.2) we mean (4.1).
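
The role of N in (4.1) can be made concrete with a simple sequence. The sketch below uses an illustrative sequence that is not taken from the text, $u_n = (1/n, 1 - 1/n)$ in $\mathbb{R}^2$ with the Euclidean norm and limit u = (0, 1), for which $\|u_n - u\| = \sqrt{2}/n$; the function names are likewise hypothetical:

```python
import math


def u(n):
    """n-th member of the illustrative sequence u_n = (1/n, 1 - 1/n) in R^2."""
    return (1.0 / n, 1.0 - 1.0 / n)


LIMIT = (0.0, 1.0)


def distance(x, y):
    """Euclidean norm of x - y in R^2."""
    return math.hypot(x[0] - y[0], x[1] - y[1])


def find_N(eps):
    """Return an N such that ||u_n - u|| < eps for all n > N.

    Since ||u_n - u|| = sqrt(2)/n, any integer n > sqrt(2)/eps will do.
    """
    return math.floor(math.sqrt(2) / eps)


for eps in (0.5, 0.1, 0.01):
    N = find_N(eps)
    # Spot-check the definition (4.1) on the next few hundred members.
    ok = all(distance(u(n), LIMIT) < eps for n in range(N + 1, N + 300))
    print(eps, N, ok)
```

A smaller tolerance demands a larger N, but for every positive tolerance some N exists, which is what convergence requires.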

Equivalent norms and convergence. The notion of two equivalent
norms $\|\cdot\|_A$ and $\|\cdot\|_B$ on a normed space X was defined in (3.10). A
useful attribute of equivalent norms is that properties of convergence carry
over from one to the other. More precisely, if $\{u_n\}$ is a sequence in X and
$u_n \to u$ with respect to $\|\cdot\|_A$, in the sense that $\lim_{n \to \infty} \|u - u_n\|_A = 0$,
then $u_n \to u$ with respect to $\|\cdot\|_B$ as well. To see this, we note from (3.10)
that

$$\|u - u_n\|_B \leq M \|u - u_n\|_A \to 0 \quad \text{as } n \to \infty.$$

4.2 Convergence of sequences of functions


When discussing convergence of sequences in normed spaces whose members
are functions, it is particularly important to specify which norm is
being used, as convergence with respect to one norm does not necessarily
imply convergence with respect to another. We are acquainted so far with
two types of norms when dealing with spaces of functions: the sup-norm in
Chapter 3, Examples 18 and 19, and the $L^p$-norm (3.6). As we show in this
section, the type of convergence associated with the sup-norm (namely, uniform
convergence) implies convergence in the $L^p$-norm, but not vice versa.
We begin with a discussion of pointwise and uniform convergence.
Suppose that we know that a sequence $\{u_n(x)\}$ of continuous functions
converges to a limit at each point $x \in \Omega \subset \mathbb{R}^d$. This implies the following: if
we fix x, then the sequence of real numbers $u_n(x)$ (n = 1, 2, ...) converges
to a real number u(x), say, and this in turn defines a function u. In other
words, for every $\epsilon > 0$ there exists a number N > 0 such that

$|u_n(x) - u(x)| < \epsilon$ whenever $n > N$. (4.3)

Of course N will depend on x and on the number €. If we now move to


another value of x the statement (4.3) may not be true for the same N.
However, if we can find a number N independent of x such that (4.3) holds
for all x E n, then we say that U n converges uniformly to u on n. We now
define these concepts formally.

Pointwise and uniform convergence. A sequence $\{u_n\}$ of functions defined on a subset $\Omega$ of $\mathbb{R}^d$ converges pointwise to $u(x)$ if for every $\epsilon > 0$ there exists a number $N$ depending on $x$ and $\epsilon$ such that (4.3) holds. If $N$ does not depend on the value of $x$, then $u_n$ is said to converge uniformly to $u$ on $\Omega$; this is written as $\lim_{n\to\infty} u_n = u$ (uniformly).
Note that we are using $\mathbb{R}^d$ rather than $\mathbb{R}^n$ here, for obvious notational reasons!
Uniform convergence has a very simple geometrical interpretation which is illustrated in Figure 4.4 for the case $\Omega = [a, b]$: according to the definition, for any given $\epsilon$ all the functions $u_n(x), u_{n+1}(x), \ldots$ lie in the "tube" of height $2\epsilon$ located symmetrically about the limit function $u(x)$, for $n$ greater than a number $N$ which of course depends on $\epsilon$, but not on $x$.

FIGURE 4.4. An illustration of the concept of uniform convergence

Now that uniform convergence has been defined, one might ask how it is related to the formal definition (4.1) of convergence in terms of a norm. To answer this question, consider a sequence $\{u_n\}$ of functions that belong to the normed space $C[a, b]$ with the norm

$$\|u\|_\infty = \sup |u(x)|, \quad x \in [a, b].$$

Suppose that this sequence is convergent in the sup-norm; that is, given any $\epsilon > 0$ it is possible to find a number $N$ such that

$$\|u_n - u\|_\infty = \sup |u_n(x) - u(x)| < \epsilon \qquad (4.4)$$

whenever $n > N$, the supremum being taken over all $x \in [a, b]$. But since $|u_n(x) - u(x)| \le \sup |u_n(x) - u(x)|$, it follows that (4.3) also holds, with $N$ independent of $x$.
In other words, convergence in the sup-norm implies uniform convergence. Conversely, suppose that $\{u_n\}$ is a uniformly convergent sequence, so that (4.3) holds. Then $\epsilon$ is an upper bound for $|u_n(x) - u(x)|$, for any $x$ in $[a, b]$. But this implies that the least upper bound or supremum of $|u_n(x) - u(x)|$ cannot exceed $\epsilon$, so that

$$\|u_n - u\|_\infty = \sup_{x \in [a, b]} |u_n(x) - u(x)| \le \epsilon \quad \text{for } n > N,$$

or alternatively, since $\epsilon$ is arbitrary,

$$\lim_{n\to\infty} \left[\sup |u_n(x) - u(x)|\right] = 0.$$

That is, uniform convergence implies convergence in the sup-norm. This useful result can be proved in much the same way for functions defined on domains $\Omega$ in $\mathbb{R}^d$ and so we simply record the general result.

THEOREM 1. A sequence of functions $\{u_n\}$, where $u_n \in C(\Omega)$ and $\Omega$ is a domain in $\mathbb{R}^d$, converges uniformly to $u$ if and only if

$$\lim_{n\to\infty} \left[\sup_{x \in \Omega} |u_n(x) - u(x)|\right] = 0. \qquad (4.5)$$
FIGURE 4.5. Nonuniform convergence of the sequence in Example 2

Examples

2. Let $u_n = x^n$, defined on $[0, 1]$. This sequence converges pointwise to $0$ for $0 \le x < 1$, and to $1$ at $x = 1$. If we set $u(x) = 0$, $0 \le x < 1$, and $u(x) = 1$ for $x = 1$, then

$$\sup |u_n(x) - u(x)| = 1 \quad \text{for all } n,$$

this supremum being approached, though not attained, as $x$ tends to $1$ from below (Figure 4.5). Hence the sequence does not converge uniformly on $[0, 1]$. However, it does converge uniformly to zero on $[0, b]$, where $0 < b < 1$, since in this case $\sup |u_n(x) - u(x)| = b^n$ which goes to $0$ as $n \to \infty$.

3. Consider the sequence $\{u_n(x) = n^2 x (1 - x)^n\}$ defined on $[0, 1]$; the larger $n$ is, the larger and the closer to the $y$-axis the maximum value of $u_n(x)$ will be. For each fixed $x \in [0, 1]$ the sequence converges to zero; but as $n$ increases the supremum of $|u_n(x) - u(x)| = |u_n(x)|$, attained at $x = 1/(n + 1)$, also increases (Figure 4.6). Condition (4.5) cannot be satisfied, and so we do not have uniform convergence. But convergence is uniform on any interval $[a, 1]$ where $0 < a < 1$; indeed, for sufficiently large $n$ the undesirable behavior of the maximum value of $u_n$ will fall outside the interval $[a, 1]$.
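
The behavior described in Examples 2 and 3 can be observed numerically. The following Python sketch is an illustration only: the helper names (`sup_on_grid`, `example2_gap`, `example3_peak`) and the grid size are assumptions made here, and sampling on a grid gives only a lower bound for a true supremum.

```python
def sup_on_grid(f, a, b, points=10001):
    """Largest sampled value of |f| on a uniform grid over [a, b].

    This is only a lower bound for the true supremum.
    """
    step = (b - a) / (points - 1)
    return max(abs(f(a + i * step)) for i in range(points))


def example2_gap(n, b):
    """Sampled sup of |x**n - 0| on [0, b]; for 0 < b < 1 the exact value is b**n."""
    return sup_on_grid(lambda x: x ** n, 0.0, b)


def example3_peak(n):
    """Exact maximum of n**2 * x * (1 - x)**n on [0, 1], attained at x = 1/(n + 1)."""
    x = 1.0 / (n + 1)
    return n ** 2 * x * (1.0 - x) ** n


if __name__ == "__main__":
    for n in (5, 50, 500):
        # The first column shrinks (uniform convergence on [0, 0.9]);
        # the second grows (no uniform convergence on [0, 1]).
        print(n, example2_gap(n, 0.9), example3_peak(n))
```

On $[0, b]$ the sampled gap for Example 2 shrinks like $b^n$, whereas the peak in Example 3 grows roughly like $n/e$, which is why condition (4.5) fails on the whole interval.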

FIGURE 4.6. The sequence of functions in Example 3

There is a close connection between the notions of continuity and convergence, in the context of functions. Continuous functions have of course been defined in Chapter 2, and this definition, which is encapsulated in (2.2), may be referred to as the $\epsilon$-$\delta$ definition of continuity, for obvious reasons. Depending on the context, it is often convenient to have available a definition of continuity that is based on sequential considerations. Such a definition does exist, and goes as follows. Suppose that we have a domain $\Omega$ in $\mathbb{R}^d$, and that $\{x_n\}_{n=1}^\infty$ is a convergent sequence of points in $\Omega$, with limit $x$ in $\Omega$. Then a function $f$ defined on $\Omega$ is said to be continuous at $x$ if, for every such sequence,

$$\lim_{n\to\infty} f(x_n) = f(x). \qquad (4.6)$$

FIGURE 4.7. Illustration of the sequential definition of continuity

What this definition states is that if one takes a sequence of points that converges, then these points are mapped to a sequence of numbers (real or complex) $f(x_1), f(x_2), \ldots$ which, first, converges, and second, the limit of which coincides with $f(x)$. These ideas are illustrated in Figure 4.7. Now there is little point in having alternative definitions of the same concept unless these are equivalent, so it is essential that we establish the connection between the $\epsilon$-$\delta$ definition of continuity and the sequential definition. These are in fact equivalent, as the following theorem confirms.

THEOREM 2. Let $\Omega$ be a domain in $\mathbb{R}^d$, and let $f$ be a function defined on $\Omega$. Let $x$ be a point in $\Omega$. The function $f$ is continuous at $x$ if and only if, for every sequence $\{x_n\}$ of elements in $\Omega$ that converges to $x$, (4.6) holds.

PROOF. First assume that the $\epsilon$-$\delta$ definition (2.2) is valid. Then given any $\epsilon > 0$, there exists $\delta > 0$ such that $|f(y) - f(x)| < \epsilon$ when $y \in \Omega$ and $|y - x| < \delta$. If we now take any sequence $\{x_n\}$ of elements in $\Omega$ that converges to $x$, then from the definition of convergence we know that $|x_n - x| < \delta$ for all $n \ge N$, for suitably large $N$. Thus, taking $y$ to be the point $x_n$, we see that $|f(x_n) - f(x)| < \epsilon$ for $n \ge N$, which is just another way of stating (4.6). Thus the $\epsilon$-$\delta$ definition implies the sequential definition of continuity.
Conversely, assume that (4.6) is valid. It suffices to prove then that, given $\epsilon > 0$, there exists $N$ such that whenever $y \in \Omega$ and $|y - x| < 1/N$ then $|f(y) - f(x)| < \epsilon$. We use the method of proof by contradiction. Suppose that this assertion is false. Then, for some $\epsilon$ and for every positive integer $n$ there exists $x_n \in \Omega$ such that $|x_n - x| < 1/n$ but $|f(x_n) - f(x)| \ge \epsilon$. This in turn implies that $\{x_n\}$ converges to $x$ while $f(x_n)$ does not converge to $f(x)$, which contradicts (4.6). The theorem is thus proved. □

$L^p$-convergence. We continue the discussion of convergence of sequences of functions, and move on to the larger normed space $L^p(\Omega)$ with the usual $L^p$-norm defined in (3.6), and with $1 \le p < \infty$. The definition (4.1) states that a sequence $\{u_n\} \subset L^p(\Omega)$ converges in the $L^p$-norm to an element $u \in L^p(\Omega)$ if for any given $\epsilon > 0$ it is possible to find a number $N$ such that

$$\|u_n - u\|_{L^p} < \epsilon \quad \text{whenever } n > N, \qquad (4.7)$$

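or, written out using (3.6),

$$\left[\int_\Omega |u_n(x) - u(x)|^p\,dx\right]^{1/p} < \epsilon \quad \text{whenever } n > N, \qquad (4.8)$$

or

$$\lim_{n\to\infty} \int_\Omega |u_n(x) - u(x)|^p\,dx = 0. \qquad (4.9)$$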

This type of convergence is referred to as $L^p$-convergence, and in the case $p = 1$ it is referred to as convergence in the mean. It is important to note that although uniform convergence implies $L^p$-convergence (if $\Omega$ is bounded; see Exercise 4.7), the converse is not true. The relationship between uniform, $L^p$-, and pointwise convergence can be summarized as follows.

UNIFORM CONVERGENCE ⇒ POINTWISE CONVERGENCE
UNIFORM CONVERGENCE ⇒ $L^p$ CONVERGENCE
POINTWISE CONVERGENCE ⇎ $L^p$ CONVERGENCE

Example

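4. Let $\{u_n\} = \{(1 + nx)^{-1}\}_{n=1}^\infty$. This sequence converges to $0$ on $[0, 1]$ in the $L^2$-norm since

$$\|u_n - 0\|_{L^2}^2 = \int_0^1 \frac{dx}{(1 + nx)^2} = \frac{1}{n + 1},$$

which goes to zero as $n \to \infty$. It can be shown that $u_n \to 0$ in the $L^p$-norm for any $p > 1$.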
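
Example 4 also illustrates that $L^p$-convergence does not imply uniform convergence: since $u_n(0) = 1$ for every $n$, the sup-norm of $u_n$ on $[0, 1]$ never decreases. The Python sketch below is a rough numerical check under stated assumptions; the function names and the number of quadrature cells are choices made here for illustration, and the closed form is $\int_0^1 (1 + nx)^{-2}\,dx = 1/(n + 1)$.

```python
import math


def l2_norm_closed_form(n):
    """L2(0, 1) norm of u_n(x) = 1/(1 + n x); the integral of u_n**2 is 1/(n + 1)."""
    return math.sqrt(1.0 / (n + 1))


def l2_norm_midpoint(n, cells=20000):
    """Midpoint-rule approximation of the same norm (cells is an illustrative choice)."""
    h = 1.0 / cells
    total = sum((1.0 + n * (i + 0.5) * h) ** -2 for i in range(cells))
    return math.sqrt(total * h)


def sup_norm(n):
    """Sup-norm of u_n on [0, 1]; u_n decreases in x, so the maximum is u_n(0) = 1."""
    return 1.0 / (1.0 + n * 0.0)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, l2_norm_closed_form(n), l2_norm_midpoint(n), sup_norm(n))
```

The first two columns agree and tend to zero, while the last column stays equal to 1.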

4.3 Completeness
As we have seen, convergent sequences all have the property that the distance between successive members of a sequence, measured by means of some appropriate norm, becomes progressively smaller, and the sequence approaches a definite limit which is, moreover, a member of the normed space concerned. Unfortunately, the situation is not always so clear-cut: some normed spaces have the deficiency that, although it is possible to set up sequences in these spaces with the property that the distance between successive members becomes progressively smaller, the sequence does not in fact have a limit in this space. For example, suppose we take a look at the half-open interval $(0, 1]$ with the norm $\|\cdot\| = |\cdot|$, and consider the sequence $\{u_n\} = \{1/n\}_{n=1}^\infty$. This sequence behaves in all respects as a convergent sequence, and converges to $0$, but $0$ is not in the space $(0, 1]$!
This behavior is undesirable for a number of reasons, and we always make a strong distinction between spaces in which sequences that behave as convergent sequences do in fact converge to a limit and, on the other hand, those spaces in which the limits of such sequences are possibly "missing".
In order to proceed with the discussion, we first need to have a means of identifying sequences with the property that the distance between successive members decreases. These are called Cauchy sequences, and their definition makes no reference to the notion of convergence, or of a limit, since it is possible for such sequences not to converge.

Cauchy sequence. A sequence $\{u_n\}$ in a subset $Y$ of a normed space $X$ is called a Cauchy sequence if

$$\lim_{m,n\to\infty} \|u_m - u_n\| = 0 \qquad (4.10)$$

or, more formally, if for any given $\epsilon > 0$ there exists a number $N$ such that

$$\|u_m - u_n\| < \epsilon \quad \text{whenever } m, n > N. \qquad (4.11)$$

Every convergent sequence is a Cauchy sequence (see Exercise 4.13), but the point has been made that not every Cauchy sequence is convergent, for the simple reason that, although the members may be converging to a limit, the limit may not be part of the space. When this is so, then we say that the space is incomplete. The situation may be remedied, however, by adding to the space those elements that are the limits of Cauchy sequences but which were not originally in the space. This process is called completion of the space, which is then said to be complete. We discuss completions in more detail in the next section, but we first define formally a complete space, and then give some simple but important examples of complete spaces.

Complete space. A subset Y of a normed space X is complete if every


Cauchy sequence in Y converges to an element of Y.
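
As an illustration of an incomplete subset that does not appear in the text above, take $Y$ to be the rationals viewed as a subset of $\mathbb{R}$. The Python sketch below generates Newton iterates for $\sqrt{2}$ in exact rational arithmetic; the function names are hypothetical, and a finite computation can only suggest, not prove, the Cauchy property. The iterates cluster ever more tightly, yet none of them squares to 2, and their limit $\sqrt{2}$ is not rational.

```python
from fractions import Fraction


def newton_sqrt2(steps):
    """Rational Newton iterates x_{k+1} = (x_k + 2/x_k)/2, starting from x_0 = 1."""
    x = Fraction(1)
    iterates = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        iterates.append(x)
    return iterates


def largest_tail_gap(iterates, start):
    """Largest |x_m - x_n| over all m, n >= start, computed exactly."""
    tail = iterates[start:]
    return max(tail) - min(tail)


if __name__ == "__main__":
    xs = newton_sqrt2(6)
    for start in range(1, 6):
        print(start, float(largest_tail_gap(xs, start)))
    # No rational iterate is an exact square root of 2.
    print(all(x * x != 2 for x in xs))
```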

Example

5. The set $\mathbb{R}$ of real numbers with the norm $\|\cdot\| = |\cdot|$ is complete, as is any closed interval of $\mathbb{R}$. The completeness of $\mathbb{R}$ is taken as a fundamental property of the real number system, whereas the completeness of closed intervals follows from the equivalence between closedness and completeness, in a sense made precise in Section 4.4.

6. The set $\mathbb{R}^n$ with any of the norms $\|\cdot\|_p$ defined by $\|x\|_p = \left[\sum_{i=1}^n |x_i|^p\right]^{1/p}$ for $1 \le p < \infty$ is complete, as is $\mathbb{R}^n$ with the norm $\|\cdot\|_\infty$ defined by $\|x\|_\infty = \max_{1 \le i \le n} |x_i|$. This follows from the completeness of $\mathbb{R}$ (see Exercise 4.12).

7. The space $C[0, 1]$ with the integral norm $\|u\|^2 = \int_0^1 u^2\,dx$ is not complete. To see this, consider the sequence $\{u_n\}$ defined by

OS:x<~,
~<:::x<:::1.

It is readily verified that $\{u_n\}$ is a Cauchy sequence; however, its "limit" $u(x)$ is the discontinuous function (Figure 4.8)

0< X < ~,
u(x) = { ~: ~<:::x<:::1.

Hence $C[0, 1]$ (and in general $C[a, b]$) is not complete in the $L^2$-norm, and indeed it is not complete in the $L^p$-norm for any $p$ such that $1 \le p < \infty$. In a similar way we may show that $C(\Omega)$ with the $L^p$-norm $\|u\| = \left[\int_\Omega |u|^p\,dx\right]^{1/p}$ is not complete, for $1 \le p < \infty$.

FIGURE 4.8. The sequence in Example 7

8. The space $C[a, b]$ with the sup-norm $\|u\|_\infty = \sup\{|u(x)|,\ x \in [a, b]\}$ is complete. We may show this by demonstrating first of all that every Cauchy sequence converges uniformly to some function $u(x)$, and that this limiting function is necessarily continuous (see Exercise 4.11). Similarly, $C(\Omega)$ with the norm $\|u\|_\infty = \sup\{|u(x)|,\ x \in \Omega\}$ is complete. This example and the previous one demonstrate that completeness of a space depends crucially on the choice of norm.

9. The space $L^2(\Omega)$ with the usual $L^2$-norm is complete. We do not give the proof, as this would take us too far afield. It is important to note that the notion of Lebesgue integration is essential for the completeness of $L^2$; the space of functions that are Riemann-square integrable is not complete, since it is possible to construct Cauchy sequences of Riemann-integrable functions whose limits are not Riemann-integrable. Generally, for any $p \ge 1$ the space $L^p(\Omega)$ with the $L^p$-norm is complete (this holds for $L^\infty(\Omega)$ as well).

Completeness is an extremely important property, because complete spaces


possess many useful characteristics that are absent from incomplete spaces.
Fortunately, most of the spaces of functions with which we work are com-
plete. Normed and inner product spaces that are complete have special
names, which are introduced here.

Banach and Hilbert spaces. A complete normed space is called a Banach space; a complete inner product space is called a Hilbert space. Since every inner product defines a norm, every Hilbert space is a Banach space.

Examples

10. $\mathbb{R}^n$ with the norm $\|\cdot\|_p$ defined in Example 6, and with $1 \le p \le \infty$, is a Banach space, and $\mathbb{R}^n$ with the norm $\|\cdot\|_2$ is a Hilbert space.

11. The space $C[a, b]$ with the norm $\|\cdot\|_\infty$ is a Banach space, as are the spaces $L^p(a, b)$ with the $L^p$-norm. The space $L^2(a, b)$ with the $L^2$-norm is a Hilbert space.

FIGURE 4.9. Open neighborhoods in $\mathbb{R}^2$ and in $C[0, 1]$

4.4 Open and closed sets, completion


The notion of completeness, which we have just met, is an example of a topological property of a normed space. Topological properties of a space are those that are based on the concept of distances between points. Naturally we may use a norm to measure distances between points, although the broader concept of a topological space does not require a norm for its definition; a normed space is just a special case of a topological space.
In this section we discuss a few more topological concepts that are necessary for a proper understanding of later material. Common to all of these concepts (and indeed to topological considerations in general) is the idea of an open set, the definition of which in turn depends on the idea of a neighborhood. We have come across both neighborhoods and open sets in the context of the particular spaces introduced earlier; here we generalize to normed spaces.

Neighborhood. Let $X$ be a normed space and let $u_0$ be an arbitrary point in $X$. The set

$$N(u_0, \epsilon) = \{u \in X : \|u - u_0\| < \epsilon\}$$

is called an open neighborhood of $u_0$ with radius $\epsilon$, where $\epsilon > 0$ (Figure 4.9). Similarly, the set

$$\bar{N}(u_0, \epsilon) = \{u \in X : \|u - u_0\| \le \epsilon\}$$

is called a closed neighborhood of $u_0$ with radius $\epsilon$.

Examples
12. We have already met neighborhoods of points in $\mathbb{R}^n$ (see Chapter 1, Section 3); there, the norm used is of course the Euclidean norm.

13. Consider the space $C[0, 1]$ with the sup-norm $\|u\|_\infty = \sup |u(x)|$. An open neighborhood of the function $u_0(x)$ of radius $\epsilon$ is the set of all continuous functions on $[0, 1]$ for which (Figure 4.9)

$$\|u - u_0\|_\infty = \sup_{x \in [0, 1]} |u(x) - u_0(x)| < \epsilon.$$

14. In the normed space $L^2(0, 1)$ with the usual $L^2$-norm, an open neighborhood of $u_0$ is the set of all functions $u \in L^2(0, 1)$ for which

$$\|u - u_0\|_{L^2} = \left[\int_0^1 [u(x) - u_0(x)]^2\,dx\right]^{1/2} < \epsilon.$$

This open neighborhood is unfortunately not easy to represent graphically.
As can be seen from the definition and the preceding examples, the idea of an open neighborhood is generalized in an almost trivial way from the concept in $\mathbb{R}^n$. This is a common feature of functional analysis in normed spaces, and we encounter it many more times. Indeed, we proceed now to the idea of open and closed subsets in normed spaces, and essentially generalize what was introduced in Chapter 1.

Open and closed sets. A subset $Y$ of a normed space $X$ is an open set if, for every point $v$ in $Y$, there is an open neighborhood $N(v, \epsilon)$ of $v$ that lies entirely in $Y$. A point $w$ in $X$ is a point of accumulation of $Y$ if every open neighborhood of $w$, no matter how small, contains at least one point $v$ in $Y$ distinct from $w$. Finally, $Y$ is a closed set if it contains all of its points of accumulation. We also define the closure $\bar{Y}$ of a set $Y$ to be the union of $Y$ and all of its points of accumulation.

Examples
15. To start with, it may be worth reviewing some of the examples in Sections 1.2 and 1.3.

16. The set $B(u_0, r) := \{u : u \in X,\ \|u - u_0\| < r\}$, where $X$ is any normed space, is called the open ball with center $u_0$ and radius $r$, and is an open set. Indeed, for any point $v$ in $B(u_0, r)$ the open neighborhood $N(v, \epsilon)$ lies entirely in $B(u_0, r)$ provided that $\epsilon$ is less than $d$, the shortest distance from $v$ to the boundary of $B(u_0, r)$ (Figure 4.10). More formally, set

$$S(u_0, r) = \{u \in X,\ \|u - u_0\| = r\};$$

then

$$d = \inf\{\|v - u\|,\ u \in S(u_0, r)\}$$

and we require $\epsilon < d$.

FIGURE 4.10. The open ball $B(u_0, r)$ with center $u_0$ and radius $r$

17. The set $\bar{B}(u_0, r) = \{u \in X,\ \|u - u_0\| \le r\} = B(u_0, r) \cup S(u_0, r)$ is called the closed ball of radius $r$, and is a closed set since, for any $v \in \bar{B}(u_0, r)$, the neighborhood $N(v, \epsilon)$ contains points of $\bar{B}(u_0, r)$ other than $v$. Hence every member of $\bar{B}(u_0, r)$ is a point of accumulation. Furthermore, for any point $u_1$ not in $\bar{B}(u_0, r)$ we can always construct a neighborhood $N(u_1, \epsilon)$ that contains no point of $\bar{B}(u_0, r)$. Indeed, let $l = \inf\{\|u_1 - v\|,\ v \in \bar{B}(u_0, r)\}$; then we simply choose $\epsilon < l$. Thus $\bar{B}(u_0, r)$ contains all its points of accumulation and is therefore
closed. Although these concepts all apply to normed spaces in general (and in particular apply to open and closed balls that often look nothing like balls), the basic ideas may nevertheless be more readily assimilated by considering their interpretation in $\mathbb{R}^2$, endowed with the Euclidean norm.

18. Consider the space $C[a, b]$ with the sup-norm, and let $V = \{u \in C[a, b],\ |u(x)| \le 1\}$. Then $V$ is closed, since $V$ is in fact the closed ball of radius 1, centered at $u_0(x) = 0$, as can be seen in Figure 4.11.

There is yet another way of characterizing closed sets, namely, by looking at the limits of convergent sequences. This characterization is described in the following theorem.

FIGURE 4.11. The closed ball of unit radius in $C[a, b]$

THEOREM 3. A subset $Y$ of a normed space $X$ is closed if and only if every convergent sequence of points in $Y$ has its limit in $Y$.

PROOF. First assume that $Y$ is closed, and consider the convergent sequence $\{u_n\} \subset Y$; when viewed as a sequence in $X$, $\{u_n\}$ has a limit $u$, say, in $X$. We want to show that $u \in Y$.
Now by assumption, given $\epsilon > 0$, there exists a number $N$ such that

$$\|u_n - u\| < \epsilon \quad \text{whenever } n > N.$$

If $u_n = u$ for some $n$, then $u \in Y$ trivially. Otherwise, for every $\epsilon > 0$ the neighborhood $N(u, \epsilon)$ contains at least one member of $Y$ (that is, a member of the sequence $\{u_n\}$) distinct from $u$. Thus $u$ is a point of accumulation of $Y$ and, since $Y$ is closed, it follows that $u \in Y$.
Conversely, assume that every convergent sequence has its limit in $Y$. Let $u_0$ be a point in $\bar{Y}$; then for each $n$ there is at least one member of $Y$, $u_n$ say, in the neighborhood $N(u_0, 1/n)$. This holds for all values of $n$, so that $\lim_{n\to\infty} u_n = u_0$ (Figure 4.12). But since every convergent sequence has its limit in $Y$ by assumption, it follows that $u_0 \in Y$ and so $\bar{Y} = Y$; that is, $Y$ is closed. □
It appears from the foregoing result that closed sets, like complete spaces, have no "holes" in them, in the sense that convergent sequences have their limits in the sets. On the other hand, it would seem that open sets have the same deficiencies as incomplete spaces; for example, the interval $(0, 1) \subset \mathbb{R}$ is an open set and is also incomplete, whereas $[0, 1]$ is both closed and complete. Still, it is not clear under what circumstances a closed subset of a normed space is complete; we clarify this in the next result.

THEOREM 4. Let X be a complete normed space and Y a subset of X.


Then Y is complete if and only if Y is closed in X.

PROOF. This is not too difficult, and is left as an exercise (see Exercise 4.21). □

FIGURE 4.12. An illustration of the argument used in the proof of Theorem 3

Example

19. Let $X = C[0, 1]$ and $Y = P[0, 1]$, the set of all polynomials on the unit interval. First, we recall that $C[0, 1]$ is complete in the sup-norm $\|\cdot\|_\infty$. Now $P[0, 1]$ is a subspace of $C[0, 1]$, but $P$ is not closed in $C$. To see this, consider the fact that $u(x) = e^x$ is a point of accumulation of $P$: for any $\epsilon > 0$ we can always find at least one polynomial $p(x) \in P$ lying in a neighborhood of $u$; indeed, given any $\epsilon > 0$, it is possible to find a polynomial $p(x) = 1 + x + x^2/2! + \cdots + x^n/n!$ such that

$$\|u - p\|_\infty = \sup |e^x - p(x)| < \epsilon$$

for sufficiently large $n$. But the point of accumulation $e^x$ does not belong to $P[0, 1]$ and so, since $P[0, 1]$ does not contain all its points of accumulation, it is not closed.
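
The sup-norm distance in Example 19 is easy to tabulate. The following Python sketch is illustrative only; the function names and the grid size are assumptions made here, and a grid maximum merely samples the supremum over $[0, 1]$.

```python
import math


def taylor_exp(n, x):
    """Value at x of the polynomial 1 + x + x**2/2! + ... + x**n/n!."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))


def sup_gap(n, points=1001):
    """Sampled sup of |exp(x) - p_n(x)| over a uniform grid on [0, 1]."""
    grid = (i / (points - 1) for i in range(points))
    return max(abs(math.exp(x) - taylor_exp(n, x)) for x in grid)


if __name__ == "__main__":
    for n in (2, 5, 10):
        # The gap shrinks rapidly, yet exp itself is never a polynomial.
        print(n, sup_gap(n))
```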

Compact sets. We met compact sets in $\mathbb{R}^n$ earlier, in Chapter 1; there, a set $S$ in $\mathbb{R}^n$ was defined to be compact if every sequence in $S$ has a point of accumulation in $S$. That definition of compactness carries over without modification to arbitrary normed spaces: if $X$ is a normed space and $S$ a subset of $X$, then $S$ is said to be compact if every sequence in $S$ has a point of accumulation in $S$.
Although it is reassuring to observe that the definition of compactness is valid for any normed space, a word of caution is in order, in that not all of the properties of compact sets in $\mathbb{R}^n$ carry over to arbitrary normed spaces. One of these is worth pointing out here: for subsets of $\mathbb{R}^n$ compactness is equivalent to closedness + boundedness, but the result does not hold in arbitrary normed spaces. Certainly every compact set is closed and bounded (and hence complete), but the converse is not true. This observation emphasizes the need to exercise caution when generalizing from the particular; such generalizations, or illustrations of abstract concepts in $\mathbb{R}$ or $\mathbb{R}^n$, are often very helpful, but there are times when one's intuition can be misleading.
Apart from the results given in Chapters 1 and 2, compactness does not play much of a role in subsequent developments, and we do not pursue the topic further here.
Earlier we described in a vague fashion how an incomplete set $Y$ may be made complete by adding to it those limits of Cauchy sequences that were not originally in $Y$. The resulting set $\tilde{Y}$, say, is then called the completion of $Y$. We conclude this section by recording some properties of the completion $\tilde{Y}$ of an incomplete set $Y$, but in order to do this it is first of all necessary to define a few more topological concepts.

Dense sets. If $X$ and $Y$ are two subsets of a normed space, then $Y$ is said to be dense in $X$ if the closure of $Y$ is $X$, that is, if $\bar{Y} = X$. This definition implies of course that every member of $X$ is either a member of $Y$ or a point of accumulation of $Y$, so that a neighborhood $N(u_0, \epsilon)$ of any point $u_0$ in $X$ contains at least one member of $Y$. This in turn leads to an alternative definition of a dense set: $Y$ is dense in $X$ if and only if there are points in $Y$ arbitrarily close to each point in $X$, or, given any point $u_0 \in X$ and any number $\epsilon > 0$, it is possible to find $v \in Y$ such that

$$\|u_0 - v\| < \epsilon.$$

Example
20. The set $\mathbb{Q}$ of rational numbers is dense in $\mathbb{R}$; the closure $\bar{\mathbb{Q}}$ ($= \mathbb{R}$) of $\mathbb{Q}$ consists of all rational and irrational numbers.

21. The Weierstrass theorem states that, for any $u \in C[a, b]$ and for every $\epsilon > 0$, it is possible to find a polynomial $p \in P[a, b]$, $p(x) = a_0 + a_1 x + \cdots$, such that

$$\|u - p\|_\infty < \epsilon.$$

That is, every bounded continuous function can be approximated arbitrarily closely by a polynomial; we refer to this as uniform approximation. Stated otherwise, the space $P[a, b]$ is dense in $C[a, b]$.
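
One concrete way to produce the approximating polynomials of Example 21 is Bernstein's construction, which is brought in here as an illustration and is not the argument used in the text. On $[0, 1]$ the degree-$n$ Bernstein polynomial of $f$ is $B_n f(x) = \sum_{k=0}^{n} f(k/n) \binom{n}{k} x^k (1 - x)^{n-k}$; a general interval $[a, b]$ can be handled by a linear change of variable. The Python sketch below samples the sup-norm error for a test function chosen here (`kink`, a hypothetical name); the grid size is likewise an assumption.

```python
import math


def bernstein(f, n, x):
    """Value at x in [0, 1] of the degree-n Bernstein polynomial of f."""
    return sum(f(k / n) * math.comb(n, k) * x ** k * (1.0 - x) ** (n - k)
               for k in range(n + 1))


def sup_error(f, n, points=201):
    """Sampled sup of |f(x) - B_n f(x)| over a uniform grid on [0, 1]."""
    grid = (i / (points - 1) for i in range(points))
    return max(abs(f(x) - bernstein(f, n, x)) for x in grid)


def kink(x):
    """A continuous function that is not a polynomial (illustrative choice)."""
    return abs(x - 0.5)


if __name__ == "__main__":
    for n in (5, 20, 200):
        # The sampled error decreases as the degree grows.
        print(n, sup_error(kink, n))
```

The error decreases slowly for this non-smooth function, which is consistent with the theorem: it guarantees that some polynomial lies within any prescribed $\epsilon$, not that low degrees suffice.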
We go on now to the issue of characterizing some of the subsets of $L^p$ that are dense in this space. The discussion is confined to a subset of the space of bounded continuous functions comprising those continuous functions that, roughly speaking, are zero in a neighborhood of the boundary.

Functions with compact support. Suppose that a function $u$ defined on a domain $\Omega$ is nonzero only for points belonging to a proper subset $K$ of $\Omega$. Let $\bar{K}$ be the closure of $K$. Then $\bar{K}$ is called the support of $u$. We say that $u$ has compact support on $\Omega$ if its support $\bar{K}$ is a compact (that is, a closed and bounded) subset of $\Omega$. The set of continuous functions with compact support is denoted by $C_0(\Omega)$.

FIGURE 4.13. The support of the function in Example 22

Example

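22. Let $\Omega = (-1, 1)$ and define $u$ to be the function

$$u(x) = \begin{cases} x, & 0 \le |x| \le \tfrac{1}{4}, \\ \tfrac{1}{2} - x, & \tfrac{1}{4} < x < \tfrac{1}{2}, \\ -\tfrac{1}{2} - x, & -\tfrac{1}{2} < x < -\tfrac{1}{4}, \\ 0, & \tfrac{1}{2} \le |x| < 1. \end{cases}$$

Then $K$ will be the open set given by $K = (-\tfrac{1}{2}, 0) \cup (0, \tfrac{1}{2})$ and $\bar{K} = [-\tfrac{1}{2}, \tfrac{1}{2}]$ (note that $x = 0$ and $x = \pm\tfrac{1}{2}$ are points of accumulation of $K$). Since $\bar{K}$ is a closed bounded subset of $\Omega$, it follows that $u(x)$ has compact support on $\Omega$ (see Figure 4.13).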

23. Consider next the function $u(x) = \sin \pi x$ on $\Omega = (-1, 1)$. Here $K = (-1, 0) \cup (0, 1)$ and $\bar{K} = [-1, 1]$ which is not a subset of $\Omega$. Hence $u(x)$ does not have compact support on $\Omega$.

Dense sets in $L^p$. We are now able to show that the space $C_0(-\infty, \infty)$ of continuous functions having compact support is dense in $L^p(-\infty, \infty)$ for $1 \le p < \infty$. This implies that any set $X$ which contains $C_0(-\infty, \infty)$ is itself dense in $L^p(-\infty, \infty)$.

Take any $u \in L^p(-\infty, \infty)$ and define the sequence of functions $u_n$ by

$$u_n(x) = \begin{cases} n, & u(x) > n, \\ u(x), & -n \le u(x) \le n, \\ -n, & u(x) < -n. \end{cases}$$

For each value of $n$ we have $|u_n|^p \le |u|^p$, so that $u_n \in L^p(-\infty, \infty)$. Furthermore, $u_n \to u$ (a.e.) as $n \to \infty$, and

$$|u - u_n|^p \le (|u| + |u_n|)^p \le 2^p |u|^p.$$

Now we apply the Dominated Convergence Theorem (Theorem 3 of Chapter 2) to deduce that $\|u - u_n\|_{L^p} \to 0$ as $n \to \infty$. We thus see that bounded functions are dense in $L^p(-\infty, \infty)$ for $1 \le p < \infty$.
Now take a bounded function $v$ in $L^p(-\infty, \infty)$ and set $v_n(x) = v(x)$ for $-n \le x \le n$, and $v_n(x) = 0$ otherwise. By repeating the argument in the previous paragraph we can conclude that bounded functions with compact support are dense in $L^p(-\infty, \infty)$ for $1 \le p < \infty$.
With these preliminary results, the following theorem can now be proved (see Exercise 4.23).

THEOREM 5. The space $C_0(-\infty, \infty)$ is dense in $L^p(-\infty, \infty)$ for $1 \le p < \infty$.
In Chapter 7 we come across a rather special class of functions with compact support; this is the space $C_0^\infty(\Omega)$ which consists of functions that, together with all their derivatives, are continuous and have compact support. It is in fact possible, by an appropriate modification and extension of the proof of Theorem 5, to prove the following.

THEOREM 6. The space $C_0^\infty(\Omega)$ is dense in $L^p(\Omega)$ for $1 \le p < \infty$, where $\Omega$ is any open set in $\mathbb{R}^n$.
The proof may be found in some of the texts referred to at the end of this chapter.

Separable space. A normed space $X$ is said to be separable if it contains a dense set $Y$ that is countable. Recall from Chapter 1 that a countable set is one whose elements can be put in one-to-one correspondence with integers. It follows then that a separable space is one with the following property: there exists a countable subset $\{v_1, v_2, \ldots\}$ such that for each $\epsilon > 0$ and for each $u$ in $X$, there is a member $v_n$, say, with $\|u - v_n\| < \epsilon$.

Example
24. We have seen that the set of rationals $\mathbb{Q}$ is dense in $\mathbb{R}$; since $\mathbb{Q}$ is countable, it follows that $\mathbb{R}$ is separable. In the same way, $\mathbb{R}^n$ is separable: a countable dense subset is the set $\mathbb{Q}^n$ of $n$-tuples of rational numbers.

25. From the Weierstrass theorem (Example 21) we see that the countable set $Q[a, b]$ of polynomials with rational coefficients is dense in $C[a, b]$. Furthermore, from Theorem 6 we know that $C[a, b]$ is dense in $L^p(a, b)$. It follows that $Q[a, b]$ is dense in $L^p(a, b)$ for $1 \le p < \infty$, and that $L^p(a, b)$ is therefore separable.

Completion of a set. We return now to the idea of completion of a set. Recall from Theorem 4 that an arbitrary subset $Y$ of a Banach space $X$ is itself complete if and only if $Y$ is closed. It follows that the closure $\bar{Y}$ of $Y$ is complete; furthermore, according to the definition of a dense set, $Y$ is dense in $\bar{Y}$. What we have described is of course a way of completing an incomplete set $Y$. In future, by the completion $\tilde{Y}$ of a set $Y$, we always mean the closure of $Y$, so that $\tilde{Y} = \bar{Y}$. This definition obviously relies on the fact that $Y$ must be a subset of a complete space $X$.
Recall from Example 7 that the space $C(\Omega)$ with the norm

$$\|u\| = \left[\int_\Omega |u|^p\,dx\right]^{1/p}$$

is not complete. The completion of this space is actually $L^p(\Omega)$, which is obtained by adding to $C(\Omega)$ those limits of Cauchy sequences (for example, the function $u(x)$ in Example 7) that are not in $C(\Omega)$. Thus $C(\Omega)$ is dense in $L^p(\Omega)$. In Chapter 7 we show how this result is obtained as a special case of a much broader result, the essence of which is that the space $C^m(\Omega)$ is dense in the Sobolev space $H^m(\Omega)$ of functions that, together with their derivatives of order $\le m$, are square-integrable. The result quoted here is for the special case $m = 0$.

4.5 Orthogonal complements in Hilbert spaces


In this section attention is confined to inner product spaces. We exploit the concept of orthogonality in Hilbert spaces and present results that are generalizations of well-known geometrical situations in $\mathbb{R}^3$.
Let $X$ be an inner product space and $Y$ any subspace of $X$; the orthogonal complement $Y^\perp$ of $Y$ is defined to be the set

$$Y^\perp = \{w \in X : (w, v) = 0 \text{ for all } v \in Y\}; \qquad (4.12)$$

that is, $Y^\perp$ consists of all those members of $X$ that are orthogonal to every member of $Y$. If $w$ belongs to $Y^\perp$, we say that $w$ is orthogonal to $Y$ and write $w \perp Y$. Since $(v, v) = 0$ implies that $v = 0$, it is clear that the only member of both $Y$ and $Y^\perp$ is the zero element: $Y \cap Y^\perp = \{0\}$.

Example
26. A canonical example of orthogonal complements is provided in $\mathbb{R}^3$. Let $Y$ be the real line (the $x_3$-axis, say). Then $Y^\perp$ is the $x_1 x_2$-plane since points in the $x_1 x_2$-plane are orthogonal to every member of $Y$, as shown in Figure 4.1.

Our main aim in this section is to show that if $H$ is any Hilbert space and $M$ is a closed subspace of $H$, then $H = M \oplus M^\perp$; that is, every $u \in H$ has the unique representation

$$u = v + w \quad \text{with } v \in M \text{ and } w \in M^\perp$$

(recall the discussion of direct sums in Section 3.1). Before doing so, however, it is necessary to prove another intuitively obvious result, which is embodied in the following theorem.

THEOREM 7. (a) Let $H$ be a Hilbert space and $M$ a closed subspace of $H$. Then for every $u \in H$ it is possible to find a unique member $v_0$ in $M$ such that

$$d \equiv \|u - v_0\| = \inf\{\|u - v\|,\ v \in M\}.$$

Moreover, $u - v_0 \perp v$ for all $v \in M$.
(b) If $S$ is also a closed subspace of $H$ with $S \subset M$ and $S \ne M$, then there is a member $v$ in $M$ such that $v \ne 0$ and $v \perp S$.

REMARK. Part (a) of the theorem says that, provided $M$ is a closed subspace, we can always find a unique point $v_0$ in $M$ that is closer to $u$ than any other point in $M$. Furthermore, this point may be found by "dropping a perpendicular" from $u$ on to $M$. Part (b) is illustrated in Figure 4.14 for the case in which $H = \mathbb{R}^3$, $M$ is the $xy$-plane, and $S$ the $x$-axis.

PROOF. (a) By definition of the infimum, there exists a sequence $\{v_n\}$ in
M such that

$d_n = \|u - v_n\| \to d.$ (4.13)

If we can show that $\{v_n\}$ is a Cauchy sequence, then it follows from the
completeness of M (see Theorem 4) that the limit $v_0$, say, of the sequence
will lie in M. To show that $\{v_n\}$ is Cauchy, consider

$\|v_n - v_m\|^2 = \|(v_n - u) - (v_m - u)\|^2 = -\|(v_n - u) + (v_m - u)\|^2 + 2(\|v_n - u\|^2 + \|v_m - u\|^2),$

using the parallelogram law, Exercise 3.10. Now

$\|(v_n - u) + (v_m - u)\|^2 = 4\|\tfrac{1}{2}(v_n + v_m) - u\|^2 \geq 4d^2,$

since $\tfrac{1}{2}(v_n + v_m)$ belongs to M,

FIGURE 4.14. Illustration of part (b) of Theorem 7

and hence

$\|v_n - v_m\|^2 \leq -4d^2 + 2(d_n^2 + d_m^2) \to 0$

from (4.13). Hence $\{v_n\}$ is a Cauchy sequence, and $v_n \to v_0$ in M. But since
$v_0$ is in M we must have

$\|u - v_0\| \geq d;$

furthermore,

$\|u - v_0\| = \|u - v_n + v_n - v_0\| \leq \|u - v_n\| + \|v_n - v_0\| = d_n + \|v_n - v_0\| \to d.$

Hence $\|u - v_0\| = d$. The proof that $v_0$ is unique is left as an exercise (see
Exercise 4.23).
To show that $(u - v_0, v) = 0$ for all $v \in M$, consider any point $v_0 + \alpha v$
in M; clearly,

$d^2 \leq \|u - (v_0 + \alpha v)\|^2 = \|u - v_0\|^2 - 2\alpha(u - v_0, v) + \alpha^2\|v\|^2.$ (4.14)

Now suppose that $u - v_0$ is not orthogonal to v, and that $(u - v_0, v) = \beta \neq 0$.
Also, take $\alpha$ to be equal to $\beta/\|v\|^2$ (recall that $\alpha$ is arbitrary). Then
we obtain, from (4.14),

$d^2 \leq \|u - v_0\|^2 - \beta^2/\|v\|^2;$ (4.15)

but the right-hand side of (4.15) is less than $d^2$ since $\|u - v_0\|^2 = d^2$ and
$\beta^2/\|v\|^2 > 0$.
This leads to a contradiction, and so we must have $\beta = 0$.
(b) Choose $v \in M$ such that $v \notin S$, and let w be the point in S closest
to v (we are applying part (a) of the theorem to S). Then $\bar{v} = v - w$ is
such a point. □
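
Part (a) can be illustrated numerically in the setting of the Remark, with $H = \mathbb{R}^3$ and M the xy-plane. The following is a minimal sketch in Python and is not part of the original text; the helper names `dot` and `project_onto_span` and the sample point `u` are assumptions chosen for illustration, and the basis of M is assumed to be orthonormal.

```python
def dot(a, b):
    """Euclidean inner product on R^n."""
    return sum(x * y for x, y in zip(a, b))

def project_onto_span(u, basis):
    """Closest point to u in span(basis); the basis is assumed orthonormal."""
    v0 = [0.0] * len(u)
    for e in basis:
        c = dot(u, e)                       # coefficient of u along e
        v0 = [v + c * x for v, x in zip(v0, e)]
    return v0

e1, e2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)   # orthonormal basis of M (the xy-plane)
u = (3.0, -2.0, 5.0)                        # hypothetical point of H = R^3

v0 = project_onto_span(u, [e1, e2])         # closest point to u in M
w = [a - b for a, b in zip(u, v0)]          # w = u - v0
```

Here `w` plays the role of $u - v_0$: its inner product with each basis vector of M vanishes, which is the orthogonality statement of the theorem in this finite-dimensional case.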
We are now ready to prove the following theorem.

THEOREM 8 (THE PROJECTION THEOREM). Let M be a closed subspace
of a Hilbert space H. Then every $u \in H$ can be uniquely written in the form

$u = v + w, \quad v \in M, \quad w \in M^\perp;$

that is, $H = M \oplus M^\perp$.

PROOF. First of all, M is complete by Theorem 4. Second, according to
Theorem 7, for each $u \in H$ there is a unique v in M such that $u - v$ is
orthogonal to M. That is,

$u = v + w, \quad w = u - v \in M^\perp.$ (4.16)

To prove that there is only one such w, suppose that (4.16) holds for two
elements $w_1$ and $w_2$ in $M^\perp$. Then

$u = v_1 + w_1 = v_2 + w_2$ with $v_1, v_2 \in M$, so that $w_1 - w_2 = v_2 - v_1 \in M \cap M^\perp = \{0\},$

and so $w_1 - w_2 = 0$, or $w_1 = w_2$. This proves the theorem. □


We conclude this section with a result that will prove useful later on.

THEOREM 9. Let M be a subspace of a Hilbert space H. Then $M^\perp = \{0\}$
if and only if M is dense in H.
In order to prove Theorem 9 we introduce the following property of
orthogonal complements.

LEMMA 1. $M^{\perp\perp} = \overline{M}$, where $\overline{M}$ denotes the closure of M and $M^{\perp\perp} = (M^\perp)^\perp$.

PROOF OF LEMMA 1. First, we note that $M \subset M^{\perp\perp}$, since if v is any point
in M, then $v \perp M^\perp$ so that v also belongs to $M^{\perp\perp}$. Furthermore, since
$M^{\perp\perp}$ is closed (see Exercise 4.24), clearly $\overline{M} \subset M^{\perp\perp}$. All that remains is to
show that $\overline{M} = M^{\perp\perp}$. Suppose that $\overline{M} \neq M^{\perp\perp}$; from Theorem 7(b), there
is a nonzero point $w \in M^{\perp\perp}$ such that $w \perp \overline{M}$. Since $M \subset \overline{M}$ this means
that $w \in M^\perp$. But $w \in M^\perp \cap M^{\perp\perp}$ implies that $w = 0$, a contradiction.
Hence $M^{\perp\perp} = \overline{M}$. □

PROOF OF THEOREM 9. First assume that M is dense in H; then for any
$u \in H$ and given any $\epsilon > 0$, we can always find a point v, say, in M such
that $\|v - u\| < \epsilon$. In particular, if $u \in M^\perp$ then $(u, v) = 0$, so that

$\|u\|^2 \leq \|u\|^2 + \|v\|^2 = \|v - u\|^2 < \epsilon^2.$

Since $\epsilon$ is arbitrary we must have $u = 0$. Conversely, assume that $M^\perp = \{0\}$;
then $M^{\perp\perp} = \{0\}^\perp = H$, so that from Lemma 1, $\overline{M} = H$. That is, M
is dense in H. □

Example

27. Since $C(\overline{\Omega})$ is dense in $L^2(\Omega)$ with the $L^2$-norm, it follows from Theorem 9
that if $u \in L^2(\Omega)$ and $(u, v) = 0$ for all $v \in C(\overline{\Omega})$, then $u = 0$.
That is,

$\int_\Omega u(x)v(x)\,dx = 0$ for all $v \in C(\overline{\Omega}) \implies u(x) = 0.$

4.6 Bibliographical remarks


The results covered in this chapter are usually given a detailed treatment in
books on functional analysis. Good references include Binmore [6], Kreyszig
[27], Lang [29], Naylor and Sell [33], Oden [36], Roman [43], and Smirnov
[49]; these texts contain a wealth of information on normed and inner product
spaces. Naylor and Sell [33], in particular, treat in detail the issue of
dense subspaces of $L^p$. Lang [29] has a simple and elegant proof of the
Weierstrass Approximation Theorem.

4.7 Exercises
Sequences

4.1. Calculate d(x, B) (see Exercise 3.18) if $X = \mathbb{R}$, $x = 1$ and
$B = \{x_n = 3 + (-1)^n n/(n^2 + 1) : n = 1, 2, 3, \ldots\} \cup \{3\}$.

4.2. Let X be an inner product space and suppose that $\{u_n\}$ and $\{v_n\}$
are convergent sequences in X with limits u and v, respectively, convergence
being defined via the norm generated by the inner product
on X. Show that $(u_n, v_n) \to (u, v)$, and deduce that $(u_n, v) \to (u, v)$
and that $\|u_n\| \to \|u\|$.

4.3. If $u_n \to u$ in a normed space X and $\|u_n - w\| \leq \alpha$ for some $w \in X$
and $\alpha \in \mathbb{R}$, show that $\|u - w\| \leq \alpha$.
Convergence of sequences of functions

4.4. Determine intervals on which the following sequences of functions
converge pointwise: (a) $u_n(x) = x^n$; (b) $u_n(x) = 1/(1 + n^2x^2)$.

4.5. Show that the following sequences converge pointwise to 0 on the
intervals given, but that they do not converge in the mean.

(a) $u_n(x) = \begin{cases} 0, & 0 \leq x \leq 1/n, \\ n, & 1/n < x < 2/n, \\ 0, & 2/n \leq x \leq 1; \end{cases}$
(b) $u_n(x) = n^{3/2} x e^{-n^2x^2}$ on $[-1, 1]$.

4.6. Does the sequence $u_n(x) = nx/(1 + n^2x^2)$, $n = 1, 2, \ldots$, converge
uniformly in $[0, 1]$? In $(a, 1]$ (for $0 < a < 1$)?

4.7. Show that uniform convergence of a sequence of functions implies $L^p$
convergence. Give a counterexample to show that the converse does
not hold.

4.8. Let $a > 0$ be a fixed real number, and define $\|u\| = \sup\{|u(x)| : |x| \leq a\}$
and $|||u||| = \min(1, \|u\|)$ on the space $C(-\infty, \infty)$. Why is $\|\cdot\|$ not
a norm? Is $|||\cdot|||$ a norm?

Completeness

4.9. Show that the sequence $u_n(x) = x^{1/n}$ is a Cauchy sequence in $L^2(0, 1)$.

4.10. Consider the sequence un(x) = x n in the space[}(O, 1). Is this a


Cauchy sequence?

4.11. The purpose of this exercise is to show that $C[a, b]$ is complete with
respect to the sup-norm. Let $u_n(x)$ be a Cauchy sequence; show that
$u_n(x_0)$ is a Cauchy sequence of real numbers for every fixed $x_0$ in
$[a, b]$ and deduce that $u_n(x_0)$ converges to a number $u(x_0)$, say. Next,
show that $u_n(x)$ converges uniformly to the function $u(x)$. Finally,
since $u_n \to u$ uniformly, we have

$|u_n(x) - u(x)| < \epsilon$ for all $n > N$;

use the triangle inequality to show that

$|u(x) - u(y)| \leq |u(x) - u_n(x)| + |u_n(x) - u_n(y)| + |u_n(y) - u(y)|$

and deduce from this result and the continuity of $u_n$ that u is continuous.

4.12. Show that $\mathbb{R}^n$ with the norm $\|\cdot\|_p$ ($1 \leq p \leq \infty$) is complete.

4.13. Show that every convergent sequence is a Cauchy sequence.

4.14. Consider $C[0, 1]$ with the $L^2$-norm. Show that the sequence $\{u_n\}$ is
a Cauchy sequence, where $u_n(x)$ is as shown in the following figure.
Next, show that if $u_n$ converges to $u(x)$, then we should have

$u(x) = \begin{cases} 0, & 0 \leq x < \tfrac{1}{2}, \\ 1, & \tfrac{1}{2} \leq x \leq 1, \end{cases}$

so that $C[0, 1]$ with the $L^2$-norm is not complete.

4.15. Show that the set $Y = \{v \in L^2(0, 1) : \int_0^1 |v(x)|\,dx = 1\}$ is complete.

Open and closed sets, completion

4.16. Show that the function

$u(x) = \begin{cases} -1, & -1 \leq x < 0, \\ +1, & 0 \leq x \leq 1, \end{cases}$

is a point of accumulation of $C[-1, 1]$ with respect to the $L^2$-norm.

4.17. Consider the space $C[a, b]$ with the sup-norm, and let M be the subset
consisting of functions v satisfying $v(a) = 0$ and $|v(x)| < 1$. Is the
function $u(x) = 1$ a point of accumulation of M?

4.18. Find the smallest value of r such that the function $v(x) = \cos 2\pi x$
lies in the closed ball with center $u_0$ and radius r in the space $C[0, 1]$
with the sup-norm, where $u_0(x) = \sin 2\pi x$.

4.19. Show that a set Y in a normed space X is closed if and only if its
complement Y' = X - Y is open.

4.20. Prove Theorem 4, which states that if X is a Banach space and Y a


subset of X, then Y is complete if and only if Y is closed in X.

4.21. Let W, X, and Y be normed spaces, and suppose that W is dense in


X and X is dense in Y. Show that W is dense in Y.

4.22. Prove Theorem 5.

Orthogonal complements and the projection theorem

4.23. Show that the element $v_0$ in Theorem 7 is unique.

4.24. If Y is a subset of an inner product space X, show that $Y^\perp$ is a closed
subspace of X. [Hint: let $\{u_n\}$ be a convergent sequence in $Y^\perp$ with
limit $u_0$.]

4.25. Where in Lemma 1 is the completeness of H used?

4.26. If X and Y are subsets of an inner product space W and $X \subset Y$,
show that $Y^\perp \subset X^\perp$.
5
Linear operators

In the preceding chapters we have acquainted ourselves with some of the basic
structures of normed and inner product spaces. We come now to another
fundamental concept in functional analysis, namely, that of a mapping or
operator from one space to another. At the most primitive level one requires
only two sets in order to define an operator from one of them to the
other, and these sets need not have any algebraic or topological structure
for the definition to make sense. Obviously, though, the really interesting
and useful properties of operators come to the fore when the two sets are
given additional structure: if the two sets are vector spaces, we can introduce
the concept of a linear operator, and if the sets are normed spaces as
well, then it is possible to construct a rich theory of linear operators on
such spaces. After a general introduction to operators in Section 5.1, we
discuss the theory of linear operators on normed spaces in Section 5.2.
Projections are a class of operators that feature strongly in later chapters
when we discuss approximations of boundary value problems. Apart from
this, much of the geometrical structure of Hilbert spaces is laid bare with
the aid of projection operators acting on these spaces. For these reasons
we devote Section 5.3 to a discussion of projection operators on Hilbert
spaces.
Operators that map members of a specified space into the real or complex
numbers are special, and are given a special name: these are called
functionals, and are discussed in Section 5.4. Finally, we discuss in Section
5.5 operators that map pairs of elements into the real or complex numbers
in a linear fashion; these are known as bilinear forms. Linear functionals
and bilinear forms play a central role in the study of linear boundary value
problems, as we show in Chapter 9 and subsequently.

FIGURE 5.1. The function f(x) = sin x considered as a mapping

5.1 Operators
The subject of this chapter is not entirely unfamiliar; we have all come
across both linear and nonlinear operators in earlier courses on linear algebra,
differential equations, and so on. Here we continue the process of
generalizing from the familiar. Consider the function f(x), defined on the
interval $I = [-\pi/2, \pi/2]$, as shown in Figure 5.1. This familiar situation is
really just an example of the action of an operator: specifically, we have
defined f to be something that acts on any member x in I, and produces a
real number sin x. Furthermore, the image sin x lies in the set $J = [-1, 1]$.
More formally we write all of this as follows:

$f : I \to \mathbb{R}, \quad f(x) = \sin x.$

Here the first expression reads "f maps elements of I to elements of $\mathbb{R}$"
and the second expression tells how f does this: f acts on x to produce
sin x. The set I is called the domain of the operator f, written D(f). The
set $\mathbb{R}$ in which f(x) takes its values is called the image space, whereas the
subset $J \subset \mathbb{R}$ consisting of all real numbers that are images of members of I under the
mapping f is called the range of f, written R(f). We now generalize.
Let X and Y be two sets, and suppose that a rule is given whereby an
element u of X is mapped or transformed to an element v of Y. This rule
is called an operator or transformation or mapping and we write, for an
operator T,

$T : X \to Y, \quad Tu = v$ (or $T(u) = v$), $\quad u \in X, \ v \in Y.$

The first expression reads "T maps elements of X to elements of Y" while
the second reads "T acts on u to produce v". We refer to Y as the image
space, X is called the domain of T, written D(T), and we write R(T) for
the range of T, which consists of all those elements of Y that are images of
members of X. In other words,
$R(T) = \{v : v \in Y, \ Tu = v \text{ for some } u \in X\}.$
Finally, the element v is called the image of u under the mapping T. These
concepts are illustrated in Figure 5.2.

FIGURE 5.2. Illustration of concepts associated with a mapping $T : X \to Y$

If the range of T happens to be all of Y, then T is called a surjective
operator, and we say that T maps X onto Y. Otherwise T maps X into Y.
Assume that the image space of T contains the zero element; then the
null space N(T) of T is the set of all elements of D(T) whose image is zero:
$N(T) = \{u \in X : Tu = 0\}.$
The inverse image of a member $v \in Y$ is denoted by $T^{-1}(v)$, and is the set
of all $u \in X$ such that $Tu = v$:
$T^{-1}(v) = \{u \in X : Tu = v\}.$
Likewise, the inverse image of a subset W of Y is denoted by $T^{-1}(W)$, and
is the set of all $u \in X$ such that $Tu \in W$ (Figure 5.3):
$T^{-1}(W) = \{u \in X : Tu \in W\}.$
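
Since these definitions require nothing more than two sets and a rule, they can be tried out on finite sets. The following Python sketch is an illustration only and is not taken from the text; the operator `T`, the sets `X` and `Y`, and the helper `inverse_image` are all hypothetical choices.

```python
def T(u):
    """Hypothetical operator on a finite set of integers: u -> u*u - 4."""
    return u * u - 4

X = range(-3, 4)     # domain D(T) = {-3, ..., 3}
Y = range(-4, 6)     # image space {-4, ..., 5}; it contains the zero element

range_T = {T(u) for u in X}                 # R(T): all images of members of X
null_space = {u for u in X if T(u) == 0}    # N(T): members of X mapped to zero

def inverse_image(W):
    """Set of all u in X whose image T(u) lies in the subset W of Y."""
    W = set(W)
    return {u for u in X if T(u) in W}

is_surjective = range_T == set(Y)           # T is onto only if R(T) is all of Y
```

In this sketch R(T) is a proper subset of Y, so T maps X into Y but not onto Y, and the inverse image of an element outside R(T) is the empty set.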

FIGURE 5.3. The inverse images of an element and of a set

Examples
1. All functions of a real variable are operators from a subset of $\mathbb{R}$ to $\mathbb{R}$,
for example, the operator or function f(x) = sin x discussed at the
beginning of this section. In the same way, the function
$f : \mathbb{R}^2 \to \mathbb{R}, \quad f(\mathbf{x}) = f(x, y) = x^2 + y^2$
is an operator that maps the point $\mathbf{x}$ in $\mathbb{R}^2$ to the real number $x^2 + y^2$.
This number is never negative, and indeed corresponding to any nonnegative real
number r there are numbers x and y such that $f(\mathbf{x}) = r$ (think about
a circle with radius $\sqrt{r}$). Thus R(f) = the set of all nonnegative real
numbers, and hence f is not surjective. The inverse image of the set
(-1, 0) is the empty set, whereas the inverse image of the point +1
is the set of all points on the unit circle.
2. An n × m matrix is an operator from $\mathbb{R}^m$ to $\mathbb{R}^n$. For example, the
operator

$Tx = \begin{pmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$

is a 2 × 3 matrix that transforms a member of $\mathbb{R}^3$ to a member of $\mathbb{R}^2$.
Whether T is surjective depends on the entries $T_{ij}$ in T; for example,
if $T_{11} = T_{21} = 1$ and all other $T_{ij} = 0$, then $Tx = (x_1, x_1)$ and the
range of T is the subset of $\mathbb{R}^2$ described by the straight line running
through the origin at 45°. The null space of T consists of all points
x for which $Tx = (x_1, x_1) = (0, 0)$: thus $N(T) = \{x \in \mathbb{R}^3 : x_1 = 0\}$,
which is the $x_2x_3$-plane.
3. Various examples of differential operators were presented in the Introduction,
and reappear later; recall that these are operators that
consist of combinations of ordinary or partial derivative operators.
For example, if $u \in C^2(\Omega)$ with $\Omega$ a domain in $\mathbb{R}^2$, then the Laplacian
operator $\Delta$ is defined by

$\Delta u = \dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} = f$

for a problem involving two variables x and y only. Thus f, the image
of u, is a continuous function. To be specific, if $\Omega \subset \mathbb{R}^2$ and $u(x, y) = x^2y^3$,
then the image of u is the function f defined by
$f(x, y) = 2y^3 + 6x^2y.$
The question of whether $\Delta$ is a surjective operator is a question that
is taken up in Chapter 8; this is equivalent to asking whether there
exists a solution to the equation $\Delta u = f$.
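
The image computed in Example 3 can be checked numerically. The sketch below, in Python, is an illustration and not part of the text: it replaces the Laplacian by the standard five-point central-difference approximation, and the helper name `laplacian_fd`, the step size `h`, and the sample points are assumptions.

```python
def u(x, y):
    """The function u(x, y) = x^2 y^3 of Example 3."""
    return x**2 * y**3

def f(x, y):
    """Image of u under the Laplacian, as computed in Example 3."""
    return 2 * y**3 + 6 * x**2 * y

def laplacian_fd(g, x, y, h=1e-3):
    """Five-point central-difference approximation of g_xx + g_yy at (x, y)."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4 * g(x, y)) / h**2

points = [(0.5, 1.0), (-1.2, 0.3), (2.0, -0.7)]   # hypothetical sample points
errors = [abs(laplacian_fd(u, x, y) - f(x, y)) for x, y in points]
```

Because u is a polynomial of degree at most three in each variable, the difference formula is exact here apart from rounding error, so the entries of `errors` should be very small.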
Two operators $S : X \to Y$ and $T : X \to Y$ are said to be equal if for
every $u \in X$ we have

$Su = Tu.$
When this is the case, we write S = T.
The sum of two operators $S : X \to Y$ and $T : X \to Y$ is defined to be
the operator satisfying

$(S + T)u = Su + Tu, \quad u \in X,$
where Y is a vector space. That is, S + T has the same effect on any
member u of X as would be obtained by applying S and T separately, and
then adding together the results. In order for the definitions of the sum of
operators, and of equality of operators, to make sense, the domains of the
two operators S and T must be equal, as must the image spaces.
The composition or product TS of two operators $S : X \to Y$ and
$T : Y \to W$ is defined to be the operator satisfying

$TS : X \to W, \quad (TS)u = T(Su)$ for all $u \in X$.

That is, the element $(TS)u \in W$ is found by first obtaining the element
$Su \in Y$, and then by the action of T on Su. Note that the composition
TS is meaningless if the element Su does not belong to the domain of T.
Furthermore, in general $TS \neq ST$; in fact, ST may be quite meaningless.

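Example
4. Let $X = \mathbb{R}^3$, $Y = \mathbb{R}^2$, $W = \mathbb{R}$, and let $T : X \to Y$ and $S : Y \to W$
be the matrices

$T = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 3 & 2 \end{pmatrix}, \quad S = \begin{pmatrix} 1 & 2 \end{pmatrix}.$

Then for any $\mathbf{x} = (x, y, z)$ in $\mathbb{R}^3$,

$(ST)\mathbf{x} = S \begin{pmatrix} 1 & 2 & 1 \\ 2 & 3 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 & 2 \end{pmatrix} \begin{pmatrix} x + 2y + z \\ 2x + 3y + 2z \end{pmatrix} = 5x + 8y + 5z.$

Note that the operator TS is meaningless since for any $\mathbf{x}$ in $\mathbb{R}^2$
we have $S\mathbf{x} \in \mathbb{R}$ and $T(S\mathbf{x})$ makes no sense.

FIGURE 5.4. An injective operator and its inverse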
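
The computation in Example 4 can be reproduced with a few lines of Python. This is a sketch for illustration only; the helper `matvec` and the sample vector are assumptions, and matrices are represented simply as lists of rows.

```python
def matvec(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("dimension mismatch: operator not defined on this input")
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

T = [[1, 2, 1],
     [2, 3, 2]]     # T : R^3 -> R^2
S = [[1, 2]]        # S : R^2 -> R

def ST(x):
    """Composition (ST)x = S(Tx), defined for x in R^3."""
    return matvec(S, matvec(T, x))

x = [1, -1, 2]      # hypothetical sample point (x, y, z)
result = ST(x)      # one-component vector holding 5x + 8y + 5z

try:
    matvec(T, matvec(S, [1, 1]))   # TS: Sx is a single number, but T acts on R^3
    ts_defined = True
except ValueError:
    ts_defined = False
```

The failed call mirrors the remark that TS is meaningless: the output of S does not belong to the domain of T.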

The identity operator is an operator from a set X into itself, which maps
each element of X to the same element. That is,

$I : X \to X, \quad Iu = u$ for all $u \in X$.

The zero operator 0 is an operator $0 : X \to Y$ which maps every element
of X to the zero element in Y (we assume of course in this definition that
Y has a zero element):

$0 : X \to Y, \quad 0u = 0$ for all $u \in X$.

Example

5. Let $X = Y = \mathbb{R}^3$; then the identity operator $I : \mathbb{R}^3 \to \mathbb{R}^3$ is simply
the 3 × 3 identity matrix. The zero operator from $\mathbb{R}^3$ to $\mathbb{R}^2$ is the
2 × 3 matrix containing all zeros.

Injective (one-to-one) and invertible operators. An operator
$T : X \to Y$ is one-to-one or injective if no two distinct elements of X are
mapped to the same element in Y. That is, T is one-to-one if

$u_1 \neq u_2$ implies that $Tu_1 \neq Tu_2$

or, equivalently, if $Tu_1 = Tu_2$ implies that $u_1 = u_2$, for all $u_1, u_2 \in X$
(Figure 5.4). From this definition it is evident that each v in the range of T
is the image of exactly one element u in X. We may accordingly define an
operator $T^{-1}$, called the inverse of T, which maps v back to u. The inverse
is then defined by

$T^{-1} : R(T) \to X, \quad T^{-1}(Tu) = u.$ (5.1)

In view of the definition of the composition of two operators, (5.1) indicates
that

$T^{-1}T = I.$

In the same way, by starting with $T^{-1}$ we find that

$TT^{-1} = I,$
that is, $(T^{-1})^{-1} = T$. If the range of T is all of Y (that is, T is surjective)
and T is also one-to-one, then T is said to be bijective; $T^{-1}$ is a one-to-one
operator from Y onto X, and we say that T is invertible.

FIGURE 5.5. The function f(x) = sin x

Examples

6. The function $f : [-\pi/2, \pi/2] \to [-1, 1]$, $f(x) = \sin x$, is one-to-one
since to each value of $f(x) = \sin x$ there corresponds only one point
x. However, if the domain of f is the whole real line, then we see from
Figure 5.5 that f is not one-to-one since, for any $x_1$,

$f(x_1 + 2n\pi) = f(x_1), \quad n = 1, 2, \ldots.$

Returning to the case in which $D(f) = [-\pi/2, \pi/2]$, the inverse
function $f^{-1} : [-1, 1] \to [-\pi/2, \pi/2]$ is defined by $f^{-1}(y) = \arcsin y$.
7. Any nonsingular n × n matrix $T : \mathbb{R}^n \to \mathbb{R}^n$ is one-to-one, with inverse
$T^{-1}$ being the usual matrix inverse.
8. The operator $T = d/dx : C^1[a, b] \to C[a, b]$ is not one-to-one since
there are infinitely many functions, all differing from each other by
a constant, which have the same image or derivative. However, if we
choose the domain of T to be $X = \{u \in C^1[a, b] : u(a) = 0\}$, then T
is invertible with inverse $T^{-1}$ defined by

$(T^{-1}v)(x) = \int_a^x v(y)\,dy.$
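
As a numerical illustration of Example 8, the Python sketch below approximates $T^{-1}$ by the trapezoidal rule and applies it to the derivative of a function vanishing at a. It is not part of the text; the function name `T_inverse`, the choice u(x) = sin x with a = 0, and the number of quadrature steps are assumptions.

```python
import math

def T_inverse(v, a, x, steps=2000):
    """Approximate (T^{-1} v)(x), the integral of v from a to x (trapezoidal rule)."""
    h = (x - a) / steps
    total = 0.5 * (v(a) + v(x))
    for k in range(1, steps):
        total += v(a + k * h)
    return total * h

a = 0.0
u = math.sin          # u(a) = 0, so u lies in the restricted domain X
v = math.cos          # v = Tu = du/dx, differentiated by hand

sample_points = (0.25, 0.5, 1.0)
recovered = [T_inverse(v, a, x) for x in sample_points]
exact = [u(x) for x in sample_points]
```

Up to quadrature error, `recovered` agrees with `exact`, which reflects the relation $T^{-1}(Tu) = u$ on the restricted domain. Without the condition u(a) = 0 the integral would recover u only up to an additive constant.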

Restriction and extension. Suppose that we are given an operator T
from X to Y, so that D(T) = X. Let U be a subset of X. Then the
restriction of T to U is the operator $T|_U$ defined by

$T|_U : U \to Y, \quad T|_U u = Tu$ for all $u \in U$.

Thus $T|_U$ is an operator with domain U which has the same action on
members of U as does T.
Suppose next that X is a subset of a bigger set V. Then an extension of
T to V is an operator $\tilde{T}$ with the property
$\tilde{T} : V \to Y, \quad \tilde{T}|_X = T.$
That is, $\tilde{T}u = Tu$ for $u \in X$, so that T is the restriction of $\tilde{T}$ to X.

5.2 Linear operators, continuous, and bounded operators
Linear operators. A linear operator T is an operator whose domain X is
a vector space, and which is

(a) additive: $T(u + v) = T(u) + T(v)$ for all $u, v \in X$; and
(b) homogeneous: $T(\alpha u) = \alpha T(u)$ for all $u \in X$, $\alpha \in \mathbb{K}$.

Here $\mathbb{K}$ is the field (either $\mathbb{R}$ or $\mathbb{C}$) over which the vector space is defined. We
may summarize (a) and (b) in one statement by defining a linear operator
to be one that satisfies

$T(\alpha u + \beta v) = \alpha T(u) + \beta T(v)$ for all $u, v \in X$, $\alpha, \beta \in \mathbb{K}$.

Examples
9. The differential operator $d^n/dx^n : C^n[a, b] \to C[a, b]$ is linear since
$\dfrac{d^n}{dx^n}(\alpha u + \beta v) = \alpha \dfrac{d^n u}{dx^n} + \beta \dfrac{d^n v}{dx^n}.$
Similar considerations apply to partial differential operators of all
orders, which are also linear operators.

10. The operator $f : \mathbb{R} \to \mathbb{R}$, $f(x) = \sin x$, is not a linear operator since,
in general, $f(x + y) = \sin(x + y) \neq f(x) + f(y) = \sin x + \sin y$
(take $x = y = \pi/2$, for example).
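
The combined condition $T(\alpha u + \beta v) = \alpha T(u) + \beta T(v)$ lends itself to a quick numerical spot check. The Python sketch below is illustrative only: a check on finitely many sample points can refute linearity but cannot prove it, and the helper `is_linear_on_samples`, the sample values, and the operator `scaling` are assumptions.

```python
import math

def is_linear_on_samples(T, samples, scalars, tol=1e-12):
    """Test T(a*u + b*v) == a*T(u) + b*T(v) on the given sample points only."""
    for u in samples:
        for v in samples:
            for a in scalars:
                for b in scalars:
                    if abs(T(a * u + b * v) - (a * T(u) + b * T(v))) > tol:
                        return False
    return True

samples = [-2.0, -0.5, 0.0, 1.0, math.pi / 2]
scalars = [-1.0, 0.5, 3.0]

def scaling(x):
    """Hypothetical linear operator on R: multiplication by 4."""
    return 4.0 * x

scaling_ok = is_linear_on_samples(scaling, samples, scalars)
sine_ok = is_linear_on_samples(math.sin, samples, scalars)   # operator of Example 10
```

The sine operator fails the check, in agreement with Example 10, while the scaling operator passes on every sampled combination.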
We note that if X and Y are vector spaces and $T : X \to Y$ is linear,
then N(T) and R(T) are subspaces of X and Y, respectively.
Suppose that $T : X \to Y$ is a linear one-to-one operator, and that $Tu_0 = 0$
for some $u_0 \in X$. Since T is linear we have, for any element u,
$T(u) = T(u + 0) = T(u) + T(0)$, which implies that $T(0) = 0$. But since T is one-to-one,
the inverse image of $0 \in Y$ must be a unique element in X, from
which it follows that $u_0 = 0$.
Conversely, suppose that $T : X \to Y$ is a linear operator with the property
that $Tu_0 = 0$ only for $u_0 = 0$. Then for any two distinct elements
$u, v \in X$, $T(u) - T(v) = T(u - v) \neq 0$ since $u - v \neq 0$ by hypothesis,
and so two distinct elements do not have the same image. Hence T is one-to-one.
We summarize all of this in the following important theorem.

THEOREM 1. A linear operator T is one-to-one if and only if $N(T) = \{0\}$.

Example
11. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be the operator defined by an n × n matrix. It is
easily shown that T is a linear operator; the question of whether T
is one-to-one is equivalent to asking whether the equation

$Tx = y$
has a unique solution x for a given y. According to Theorem 1 this
question may be answered by considering the equation

$Tx_0 = 0;$
if the only element $x_0$ satisfying this equation is $x_0 = 0$, then T is one-to-one.
Equivalently, we may check whether the matrix is nonsingular.
For example, if $T : \mathbb{R}^2 \to \mathbb{R}^2$ is given by

then

which has the solution $x_0 = \gamma(-2, 1)$ for $\gamma \in \mathbb{R}$:

$N(T) = \{x_0 \in \mathbb{R}^2 : x_0 = \gamma(-2, 1) \text{ for some } \gamma \in \mathbb{R}\}.$

It follows that the equation $Tx = y$ will not have a unique solution;
in fact, if $x_1$ is any solution, then $x_1 + \gamma(-2, 1)$ is also a solution. We
observe also that T is singular, in that its determinant is zero.
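
The behaviour described in Example 11 can be reproduced with any singular matrix whose null space is spanned by (-2, 1). The Python sketch below uses one such matrix; it is a hypothetical choice made for illustration and is not claimed to be the matrix used in the text, and the helper `apply` is likewise an assumption.

```python
# Hypothetical singular matrix (not taken from the text) whose null space
# is spanned by (-2, 1).
T = [[1.0, 2.0],
     [2.0, 4.0]]

def apply(T, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(t * xi for t, xi in zip(row, x)) for row in T]

det = T[0][0] * T[1][1] - T[0][1] * T[1][0]     # determinant of T

gammas = [-3.0, 0.5, 2.0]
images = [apply(T, [-2.0 * g, 1.0 * g]) for g in gammas]   # T applied to gamma(-2, 1)

x1 = [1.0, 1.0]                      # one solution of T x = y, with y = T x1
y = apply(T, x1)
x2 = [x1[0] - 2.0 * 0.5, x1[1] + 1.0 * 0.5]     # x1 + gamma(-2, 1) with gamma = 0.5
same_image = apply(T, x2) == y       # a second, distinct solution of T x = y
```

Every multiple of (-2, 1) is mapped to zero, the determinant vanishes, and two distinct points share the same image, so by Theorem 1 this operator is not one-to-one.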

Isomorphisms. Two vector spaces X and Y are said to be isomorphic, or
more precisely, algebraically isomorphic, to each other if there exists a linear
bijective map T from X onto Y; this map is then called an isomorphism. It
follows that the inverse operator $T^{-1}$ is also an isomorphism from Y onto
X.
Isomorphisms are useful, in that they provide information about whether
it is possible to put elements of one space X in one-to-one correspondence
with elements of another space Y. This in turn establishes that the two
spaces concerned are "alike", in a rough sense. It should be borne in mind,
though, that there are many attributes of a space, such as its topological
properties, which cannot be inferred simply from its isomorphic relationship
to another.

Example

12. To emphasize the point that two isomorphic spaces can be quite different
in nature, consider the case in which $X = \mathbb{R}^n$ and $Y = P_{n-1}[a, b]$,
the space of polynomials of degree less than or equal to $n - 1$. An arbitrary
member of Y is of the form $p(x) = a_0 + a_1x + \cdots + a_{n-1}x^{n-1}$,
and is therefore defined uniquely by the n numbers $a_0, \ldots, a_{n-1}$. Let
$T : X \to Y$ be the operator that associates with the point
$a = (a_0, \ldots, a_{n-1})$ the polynomial p(x) introduced earlier; then clearly T
is linear and bijective, and is hence an isomorphism. Thus $X = \mathbb{R}^n$
and $Y = P_{n-1}[a, b]$ are isomorphic to each other.

All of the previous considerations depend only on the algebraic structure
of X and Y; in other words, we have required no more of X and Y than that
they be linear spaces. When an operator maps elements from one normed
space to another, though, many further interesting properties emerge. We
take a look first at continuous operators.
Before giving a general definition of a continuous operator, it may be
helpful to recall the discussion of continuous functions in Section 4 of Chapter 1.
We defined a function $f : \mathbb{R} \to \mathbb{R}$ to be continuous at $x_0$ if, given
any $\epsilon > 0$, it is always possible to find a $\delta > 0$ such that $|f(x_0) - f(x)| < \epsilon$
whenever $|x_0 - x| < \delta$. Now suppose we rephrase this in the language of
open sets: f is continuous at $x_0$ if, given any $\epsilon > 0$, it is always possible to
find $\delta > 0$ such that the image of any point in the neighborhood $N(x_0, \delta)$
lies in the neighborhood $N(f(x_0), \epsilon)$. This is precisely how we define a continuous
operator in any normed space.

Continuous operator. Let $T : X \to Y$ be an operator from a normed
space X to a normed space Y; then T is continuous at $u_0 \in X$ if for every
$\epsilon > 0$ there is a positive number $\delta$, possibly depending on $u_0$ and $\epsilon$, such
that

$\|Tu_0 - Tu\| < \epsilon$ whenever $\|u_0 - u\| < \delta$. (5.2)

If (5.2) holds for every $u_0 \in X$, then we simply say that T is continuous on
X. Furthermore, if $\delta$ does not depend on $u_0$, then T is said to be uniformly
continuous on X.

FIGURE 5.6. A neighborhood and its image

The situation is shown schematically in Figure 5.6. Choose some point
$u_0$ and a number $\epsilon > 0$; then T is continuous if a number $\delta$ can be found
such that the image of the points lying inside the neighborhood $N(u_0, \delta)$
is contained in the open ball of radius $\epsilon$ and center $Tu_0$. At this point we
draw attention to the norms used in (5.2); since $u_0$ and u are in X, the
norm used when evaluating $\|u_0 - u\|$ is the norm defined on X; on the other
hand, the norm used in the evaluation of $\|Tu_0 - Tu\|$ must be that defined
on Y. When wishing to emphasize the distinction we write, for example,
$\|u_0 - u\|_X$ and $\|Tu_0 - Tu\|_Y$. Generally, though, it is expected that there
will be no confusion about which norm should be used.

Examples

13. Let $X = Y = \mathbb{R}$ and let $f : \mathbb{R} \to \mathbb{R}$. Then the definition of continuity
given previously coincides with that given in Section 2.1 if we use the
norm $\|\cdot\| = |\cdot|$ on $\mathbb{R}$.

14. Let $X = Y = C[0, 1]$ with the sup-norm, and define $T : C[0, 1] \to C[0, 1]$ by

$Tu(x) = \int_0^x u(y)\,dy, \quad x \in [0, 1]$

(for example, if $u(x) = \cos x$, then $Tu(x)$ is the continuous function
$\sin x$). Now

$\|Tu_0 - Tu\| = \sup_{x \in [0,1]} \left| \int_0^x u_0(y)\,dy - \int_0^x u(y)\,dy \right|$

and so we have to estimate the term inside $|\cdots|$. We find that

$\left| \int_0^x (u_0(y) - u(y))\,dy \right| \leq \int_0^x |u_0(y) - u(y)|\,dy \leq (x - 0) \sup_{y \in [0,1]} |u_0(y) - u(y)|$
(using Theorem 2, Chapter 2)

and so

$\|Tu_0 - Tu\|_\infty \leq \sup_{x \in [0,1]} x \, \sup_{y \in [0,1]} |u_0(y) - u(y)| \leq 1 \cdot \sup_{y \in [0,1]} |u_0(y) - u(y)| = \|u_0 - u\|_\infty.$
Hence, given any $\epsilon > 0$, we simply choose $\delta = \epsilon$: if $\|u_0 - u\|_\infty < \delta$, then
$\|Tu_0 - Tu\|_\infty < \epsilon$, which shows that T is continuous. Since $\delta$
does not depend on $u_0$, T is in fact uniformly continuous.
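
The estimate $\|Tu_0 - Tu\|_\infty \leq \|u_0 - u\|_\infty$ of Example 14 can be observed on a grid. In the Python sketch below, which is illustrative and not part of the text, functions are represented by their values at equally spaced points, the integral is replaced by a cumulative trapezoidal sum, and the perturbation added to $u_0$ is an arbitrary assumption.

```python
import math

N = 1000                                  # number of subintervals of [0, 1]
xs = [k / N for k in range(N + 1)]

def T(values):
    """Cumulative trapezoidal approximation of x -> integral of u from 0 to x."""
    out = [0.0]
    for k in range(1, len(values)):
        out.append(out[-1] + 0.5 * (values[k - 1] + values[k]) / N)
    return out

def sup_norm(values):
    """Discrete sup-norm: largest absolute grid value."""
    return max(abs(v) for v in values)

u0 = [math.cos(x) for x in xs]
u = [math.cos(x) + 0.01 * math.sin(7 * x) for x in xs]   # hypothetical perturbation of u0

input_gap = sup_norm([a - b for a, b in zip(u0, u)])        # ||u0 - u||
output_gap = sup_norm([a - b for a, b in zip(T(u0), T(u))])  # ||Tu0 - Tu||
```

The last grid value of `T(u0)` approximates sin 1, as the parenthetical remark in the example suggests, and `output_gap` does not exceed `input_gap`.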
15. Let $X = \mathbb{R}^n$ and $Y = \mathbb{R}^m$ and consider the linear operator T from
X to Y represented by an m × n matrix. We endow both $\mathbb{R}^n$ and $\mathbb{R}^m$
with the Euclidean norm $\|x\|_2$ introduced in Example 15, Chapter 3.
Now consider the image a of x under the mapping T. That is,

$Tx = a$

or

$\sum_{j=1}^{n} T_{ij}x_j = a_i, \quad 1 \leq i \leq m.$

If y is another point in $\mathbb{R}^n$ with image $b \in \mathbb{R}^m$, then

$\|b - a\|^2 = \sum_{i=1}^{m} \Big( \sum_{j=1}^{n} T_{ij}(y_j - x_j) \Big)^2 \leq \sum_{i=1}^{m} \Big( \sum_{j=1}^{n} T_{ij}^2 \Big) \sum_{j=1}^{n} (y_j - x_j)^2$
(using the Cauchy-Schwarz inequality in $\mathbb{R}^n$)
$= k^2\|y - x\|^2$, where $k^2 = \sum_{i=1}^{m} \sum_{j=1}^{n} T_{ij}^2$.

Hence

$\|Ty - Tx\| \leq k\|y - x\|,$

so for given $\epsilon > 0$ we simply choose $\delta = \epsilon/k$; then $\|Ty - Tx\| < \epsilon$
whenever $\|y - x\| < \delta$ and T is thus continuous on X.
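
The bound $\|Ty - Tx\| \leq k\|y - x\|$ of Example 15 can be tested on random points. The Python sketch below is an illustration rather than a proof; the matrix `T`, the sampling range, and the helper names are assumptions.

```python
import math
import random

def matvec(T, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(t * xi for t, xi in zip(row, x)) for row in T]

def norm2(x):
    """Euclidean norm."""
    return math.sqrt(sum(xi * xi for xi in x))

T = [[1.0, -2.0, 0.5],
     [0.0, 3.0, 1.0]]          # hypothetical 2 x 3 matrix, T : R^3 -> R^2

k = math.sqrt(sum(t * t for row in T for t in row))   # k^2 = sum of all T_ij^2

random.seed(0)
bound_holds = True
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(3)]
    y = [random.uniform(-5, 5) for _ in range(3)]
    diff = [b - a for a, b in zip(matvec(T, x), matvec(T, y))]   # Ty - Tx
    gap = norm2([yi - xi for xi, yi in zip(x, y)])               # ||y - x||
    if norm2(diff) > k * gap + 1e-12:
        bound_holds = False
```

The constant k computed here is generally not the smallest constant for which the bound holds; it is simply the one delivered by the Cauchy-Schwarz argument.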


Continuous operators can also be characterized in terms of open sets, as
the following theorem shows.

THEOREM 2. An operator T (not necessarily linear) from a normed space
X into a normed space Y is continuous if and only if the inverse image $S_0$
of any open subset S of Y is an open subset of X.

PROOF. Figure 5.7 illustrates the assertion of the theorem. Suppose that
T is continuous, and for any $u_0 \in S_0$ let $v_0 = Tu_0$. Since S is open, there
is a neighborhood $N(v_0, \epsilon)$ of $v_0$ contained entirely in S. By the continuity
of T, $u_0$ has a neighborhood $N_0(u_0, \delta)$ that is mapped into $N(v_0, \epsilon)$. Thus,
$N_0 \subset S_0$ since $N_0$ is part of the inverse image of S; so $S_0$ is open. Conversely,
assume that the inverse image of every open set in Y is an open set in X.
Then in particular for every $u_0 \in X$ and any neighborhood $N = N(Tu_0, \epsilon)$
of $Tu_0$, the inverse image $N_0$, say, of N is open. Hence $N_0$ also contains
a neighborhood of center $u_0$ and, by definition of $N_0$, the image of this
neighborhood lies in N. Since $u_0$ was arbitrary, T is continuous. □

Isometries. Spaces that are isomorphic to each other were introduced
earlier. With the notion of a norm available it is possible to take one step
further the idea of two spaces being alike in some sense. Let X and Y
be normed spaces, and suppose that there exists an operator $T : X \to Y$
which is linear and bijective (in other words, an isomorphism) and which,
furthermore, has the property that

$\|Tu\|_Y = \|u\|_X$ for any $u \in X$.

Then T is called an isometry, and X and Y are said to be isometrically
isomorphic. This in turn implies, from the linearity of T, that

$\|Tu - Tv\| = \|T(u - v)\| = \|u - v\|$

for any members u and v of X; that is, T preserves the distances between
elements. The situation is depicted schematically in Figure 5.8.

FIGURE 5.7. An equivalent definition of continuity

Completions revisited. Recall that we discussed in Section 4.4 the notion
of completion of an incomplete subset S of a normed space X, and
defined the completion of such a set to be the closure $\overline{S}$ of S or, equivalently,
the union of S with the limits of all Cauchy sequences in S (these may of
course have their limits either in S, or in the larger space X). It turns out
that this treatment of completion is a somewhat simplified version of the
full story, which can be given now that we have acquired some background
in operator theory.
In general, by the completion of a subset S of a normed space X with
norm $\|\cdot\|_X$ is meant any complete subset $S^*$ of a normed space $X^*$ with
norm $\|\cdot\|_{X^*}$, with the property that $S^*$ has a dense subset Y that is
isometrically isomorphic to S.
We restricted the definition in Section 4.4 to the special case in which
S = Y, and in which the isometric mapping is just the identity. The general
procedure given here allows one to find a completion even when the
incomplete space is not given as a subset of a complete normed space, but
for our purposes the description given in Section 4.4 suffices.

FIGURE 5.8. Two isometrically isomorphic spaces

Bounded operators. The concept of a bounded operator is closely connected
with that of a continuous operator. Let $T : X \to Y$ be a linear
operator; we say that T is bounded if it is possible to find a number K > 0
such that

$\|Tu\| \leq K\|u\|$ for all u in X.

For $u \neq 0$ we see that $K \geq \|Tu\|/\|u\|$. So the set $\{\|Tu\|/\|u\| : u \neq 0\}$
is bounded above by K, and its least upper bound, taken over all nonzero members
u of X, is called the norm of T, and is written $\|T\|$. That is,

$\|T\| = \sup\{\|Tu\|/\|u\|, \ u \neq 0\}$ (5.3)

and so we can write

$\|Tu\| \leq \|T\|\,\|u\|$

for a bounded linear operator.


We now show that what we have called the "norm" of T does indeed
satisfy the norm axioms. First, there is the question of the space on which
this quantity is a norm; this is the set L(X, Y) of all bounded linear operators
from X to Y, which forms a vector space; to see this we note that if
T and S belong to L(X, Y), then $\alpha T + \beta S$, defined by

$(\alpha T + \beta S)u = \alpha Tu + \beta Su$

for all $u \in X$ and $\alpha, \beta \in \mathbb{R}$ or $\mathbb{C}$, also belongs to L(X, Y). Also, I is the
identity element and the zero operator the zero element.
The quantity defined in (5.3) does indeed obey all of the norm axioms:
$\|T\| \geq 0$ and $\|T\|$ is zero if and only if T = 0 (the zero operator). The
triangle inequality follows from

$\|T + S\| = \sup_{u \neq 0} \dfrac{\|(T + S)u\|}{\|u\|} = \sup_{u \neq 0} \dfrac{\|Tu + Su\|}{\|u\|} \leq \sup_{u \neq 0} \left( \dfrac{\|Tu\|}{\|u\|} + \dfrac{\|Su\|}{\|u\|} \right) \leq \|T\| + \|S\|.$

Thus L(X, Y) is a normed space with its norm defined by (5.3).
Examples
16. Let $T : \mathbb{R}^2 \to \mathbb{R}^3$; then the space
$\mathcal{L}(\mathbb{R}^2, \mathbb{R}^3)$ of linear operators from
$\mathbb{R}^2$ to $\mathbb{R}^3$ is equivalent to the space of all real
$3 \times 2$ matrices. If $\mathbb{R}^2$ and $\mathbb{R}^3$ are equipped
with the 1-norm

\[ \|x\|_1 = \sum_{i=1}^{n} |x_i| \]

in which $n = 2$ or $3$, respectively, then

\[
\|T\|_1 = \sup\left\{ \frac{\|Tx\|_1}{\|x\|_1} : x \neq 0 \right\}
        = \sup\{\|Tx\|_1 : \|x\|_1 = 1\} \quad \text{(see Exercise 5.11)}.
\]

Now, for $\|x\|_1 = 1$,

\[
\begin{aligned}
\|Tx\|_1 = \sum_{i=1}^{3} |T_{i1}x_1 + T_{i2}x_2|
  &\leq \sum_{i=1}^{3} \left( |T_{i1}|\,|x_1| + |T_{i2}|\,|x_2| \right) \\
  &= |x_1| \sum_{i=1}^{3} |T_{i1}| + |x_2| \sum_{i=1}^{3} |T_{i2}| \\
  &\leq \left( \max_j \sum_{i=1}^{3} |T_{ij}| \right) (|x_1| + |x_2|) \\
  &\leq \max_j \sum_{i=1}^{3} |T_{ij}|,
\end{aligned}
\]

so that

\[ \|T\|_1 = \max_{\|x\|_1 = 1} \|Tx\|_1 \leq \max_j \sum_{i=1}^{3} |T_{ij}|. \tag{5.4} \]

Suppose that this maximum is attained for $j = 1$, and choose
$x = (1, 0)$; for this choice of $x$

\[ \|Tx\|_1 = \sum_{i=1}^{3} |T_{i1}|, \]

so that we have equality in (5.4). Generally,
$\|Tx\|_1 = \sum_{i=1}^{3} |T_{ip}|$ when $x$ is the $p$th standard basis
vector, where $p$ denotes the column in which $\sum_{i=1}^{3} |T_{ip}|$ is
a maximum. Hence

\[ \|T\|_1 = \max_j \sum_{i=1}^{3} |T_{ij}|; \]

this quantity is known, for obvious reasons, as the maximum column
sum of $T$.
This special result is easily generalized to the case in which $T$ is an
$m \times n$ matrix; in other words, when
$T : \mathbb{R}^n \to \mathbb{R}^m$. When both $\mathbb{R}^n$ and
$\mathbb{R}^m$ have the norm $\|\cdot\|_1$ then, following the steps in this
example, we find that

\[ \|T\|_1 = \max_{1 \leq j \leq n} \sum_{i=1}^{m} |T_{ij}|. \]

Other matrix norms are also possible, depending on the choices of
norms for the domain and image spaces. In Exercise 5.12, for example,
you are asked to repeat this example for the case in which the two
spaces are endowed with the max-norm.
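
As a quick numerical illustration of the maximum column sum formula, the following Python sketch computes $\|T\|_1$ for a small matrix and checks the bound $\|Tx\|_1 \leq \|T\|_1 \|x\|_1$ on randomly sampled vectors. The matrix entries and the helper names (`one_norm`, `apply`, `max_column_sum`) are illustrative assumptions, not taken from the text.

```python
import random

def one_norm(x):
    """1-norm of a vector given as a list of floats."""
    return sum(abs(xi) for xi in x)

def apply(T, x):
    """Matrix-vector product Tx, with T stored as a list of rows."""
    return [sum(t * xj for t, xj in zip(row, x)) for row in T]

def max_column_sum(T):
    """Operator 1-norm of T: the largest column sum of absolute values."""
    n_cols = len(T[0])
    return max(sum(abs(row[j]) for row in T) for j in range(n_cols))

# Illustrative 3 x 2 matrix (an assumption, not from the text).
T = [[1.0, -2.0],
     [0.5, 4.0],
     [-3.0, 1.0]]

norm_T = max_column_sum(T)

# The bound ||Tx||_1 <= ||T||_1 ||x||_1 should hold for every sampled x.
random.seed(0)
samples = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(1000)]
bound_holds = all(one_norm(apply(T, x)) <= norm_T * one_norm(x) + 1e-12
                  for x in samples)

# Equality is attained at the basis vector that selects the largest column.
attained = one_norm(apply(T, [0.0, 1.0]))
```

Here the second column has the larger absolute sum, so the supremum is attained at $x = (0, 1)$, mirroring the choice $x = (1, 0)$ made in the example when the first column is the largest.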

17. Let $T = d/dx : C^1[0, 1] \to C[0, 1]$ with the sup-norm defined on
$C^1$ and $C$. $T$ is not a bounded operator; to show this, we need only
consider $u(x) = \sin nx$ with $n \geq 2$. Then $\|u\| = 1$ and
$\|du/dx\| = \|n\cos nx\| = n$.
It follows that $\|Tu\|$ can take on arbitrarily large values (for any
chosen constant $K$, we simply choose $n$ big enough to invalidate the
statement $\|Tu\| = n \leq K\|u\| = K$). This result may be extended in an
obvious way to show that all ordinary and partial differential operators are
unbounded in the sup-norm.
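
The growth of $\|Tu\|/\|u\|$ can be observed numerically. The sketch below approximates the sup-norm on $[0, 1]$ by sampling a uniform grid (an approximation, and an assumption of this illustration) and evaluates the ratio for $u(x) = \sin nx$ at a few values of $n$; the function names are hypothetical.

```python
import math

def sup_norm(f, n_points=10001):
    """Approximate the sup-norm of f on [0, 1] by sampling a uniform grid."""
    return max(abs(f(k / (n_points - 1))) for k in range(n_points))

def growth_ratio(n):
    """Return ||u'|| / ||u|| for u(x) = sin(n x) in the sampled sup-norm."""
    u = lambda x: math.sin(n * x)
    du = lambda x: n * math.cos(n * x)
    return sup_norm(du) / sup_norm(u)

# The ratio is close to n, so no single constant K can bound it for all n.
ratios = {n: growth_ratio(n) for n in (2, 10, 50, 250)}
```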
The connection between bounded and continuous linear operators is one
that is exploited very often. Suppose that $T : X \to Y$ is a bounded linear
operator; then there exists $K > 0$ such that

\[ \|Tu - Tv\| = \|T(u - v)\| \leq K\|u - v\|, \quad u, v \in X. \]

Given any $\epsilon > 0$, we set $\delta = \epsilon/K$ to obtain

\[ \|Tu - Tv\| < \epsilon \quad \text{whenever } \|u - v\| < \delta. \]

Thus $T$ is continuous if it is bounded. Now suppose instead that $T$ is
continuous; then, with $u_0 = 0$ and $\epsilon = 1$ in (5.2) we can always
find a $\delta > 0$ such that

\[ \|Tu\| < 1 \quad \text{whenever } \|u\| < \delta. \]

In particular, assume that $u \neq 0$ (the case $u = 0$ is trivial), and set
$z = \delta u/2\|u\|$; then $\|z\| = \delta/2$, hence $\|Tz\| < 1$, and so

\[ 1 > \|Tz\| = \tfrac{1}{2}\|T(\delta u/\|u\|)\| = \frac{\delta}{2\|u\|}\|Tu\|. \]

That is,

\[ \|Tu\| < 2\delta^{-1}\|u\|. \]

For the case $u = 0$ we have $Tu = 0$ and so $\|Tu\| \leq 2\delta^{-1}\|u\|$
holds trivially. Thus $\|Tu\| \leq 2\delta^{-1}\|u\|$ for all $u \in X$, and
so $T$ is bounded. We thus have the following theorem.

THEOREM 3. A linear operator $T$ from a normed space $X$ to a normed
space $Y$ is continuous if and only if it is bounded.

Theorem 3 is very useful when it needs to be shown that an operator is
continuous; it is frequently more convenient to show boundedness, which
in turn implies continuity.

Example

18. Let $T : C[a, b] \to C[a, b]$ be defined by
$Tu(x) = \int_a^b (x + \xi)u(\xi)\,d\xi$.
Using the sup-norm we have, assuming $b > a > 0$,

\[
\begin{aligned}
|Tu(x)| &= \left| x\int_a^b u(\xi)\,d\xi + \int_a^b \xi u(\xi)\,d\xi \right| \\
        &\leq \left| x\int_a^b u(\xi)\,d\xi \right| + \left| \int_a^b \xi u(\xi)\,d\xi \right| \\
        &\leq |b| \sup|u(\xi)|\,|b - a| + |b| \sup|u(\xi)|\,|b - a|
\end{aligned}
\]

(see Theorem 2.2). Hence $\|Tu\|_\infty \leq 2b(b - a)\|u\|_\infty$ and so
$T$ is bounded and consequently continuous.
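
The bound $\|Tu\|_\infty \leq 2b(b - a)\|u\|_\infty$ can be checked numerically for a particular function. The sketch below uses the midpoint rule for the integrals and a sampled grid for the sup-norm; the interval $[1, 2]$, the choice $u(x) = \cos 3x$, and the helper name `integral_operator` are assumptions made only for this illustration.

```python
import math

def integral_operator(u, a, b, n=2000):
    """Return x -> integral_a^b (x + xi) u(xi) d(xi), via the midpoint rule."""
    h = (b - a) / n
    nodes = [a + (k + 0.5) * h for k in range(n)]
    m0 = h * sum(u(xi) for xi in nodes)        # approximates integral of u
    m1 = h * sum(xi * u(xi) for xi in nodes)   # approximates integral of xi * u
    return lambda x: x * m0 + m1

a, b = 1.0, 2.0                      # illustrative interval with b > a > 0
u = lambda x: math.cos(3.0 * x)      # illustrative continuous function

Tu = integral_operator(u, a, b)
grid = [a + k * (b - a) / 500 for k in range(501)]
sup_u = max(abs(u(x)) for x in grid)
sup_Tu = max(abs(Tu(x)) for x in grid)
bound = 2.0 * b * (b - a) * sup_u    # right-hand side 2b(b - a)||u||
```

For this choice the computed `sup_Tu` is well below `bound`, which is consistent with the estimate being a (not necessarily sharp) upper bound.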

It is a simple but nonetheless extremely important fact that if a linear
operator $T$ is continuous, then for any convergent sequence $\{u_n\}$ in its
domain, $T(\lim_{n \to \infty} u_n)$ and $\lim_{n \to \infty} T(u_n)$ yield
the same result. Note that this says two things: if $u_n \to u$, then the
sequence $\{T(u_n)\}$ in the range of $T$ converges, and the limit is the
same as $T(u)$.

THEOREM 4. Let $A : X \to Y$ be a bounded linear operator and let $\{u_n\}$
be a convergent sequence in $X$ with limit $u$. Then $Au_n \to Au$ in $Y$.

PROOF. We have

\[ \|Au_n - Au\|_Y = \|A(u_n - u)\|_Y \leq \|A\|\,\|u_n - u\|_X. \]

Hence

\[ \lim_{n \to \infty} \|Au_n - Au\|_Y \leq \|A\| \lim_{n \to \infty} \|u_n - u\|_X = 0; \]

that is, $Au_n \to Au$. □


Open mappings. In Theorem 2 it was seen that a continuous operator
$T$ may be characterized by the property that the inverse image of an open
set in $R(T)$, the range of $T$, is itself an open set. This does not of
course imply that $T$ maps open sets to open sets; for example, the operator
$T$ defined by

\[ T : (0, 2\pi) \to \mathbb{R}, \quad T(x) = \sin x \tag{5.5} \]

is continuous, but it maps the open set $(0, 2\pi)$ onto the closed set
$[-1, 1]$.
Those special operators that do map open sets to open sets are called
open mappings; more formally, an operator $T : X \to Y$, where $X$ and $Y$
are normed spaces, is called an open mapping if the image of every open
set in $X$ is an open set in $Y$.
The question remains: under what conditions is an operator an open
mapping? The answer is provided by one of the important theorems in
functional analysis, the open mapping theorem. We now state this theorem,
omitting the rather technical and lengthy proof.

THEOREM 5 (THE OPEN MAPPING THEOREM). Let $X$ and $Y$ be Banach
spaces, and let $T : X \to Y$ be a bounded linear operator from $X$ onto $Y$.
Then $T$ is an open mapping.

Note the essential ingredients: $X$ and $Y$ have to be Banach spaces, and $T$
must be bounded and surjective. In the example (5.5) the first requirement
was met, but not the second, in that $T$ was not surjective.
The open mapping theorem has as a consequence a result that proves
very useful when we study the existence of solutions to boundary value
problems in Chapter 8. This is the so-called Banach theorem.

THEOREM 6 (THE BANACH THEOREM). A bounded linear one-to-one
operator $T$ from a Banach space $X$ onto a Banach space $Y$ has a continuous
inverse $T^{-1}$.

PROOF. Since $T$ is one-to-one with range all of $Y$, the inverse
$T^{-1} : Y \to X$ exists, and it remains to show that it is continuous. Let
$S$ be any open set in $X$; then its inverse image under the mapping $T^{-1}$
is $(T^{-1})^{-1}(S) = T(S)$, since $T$ is bijective.
But from the Open Mapping Theorem we know that $T(S)$ is open. Hence,
by Theorem 2, $T^{-1}$ is continuous. □

5.3 Projections
Consider the following situation in $\mathbb{R}^3$, shown in Figure 5.9.
Given any vector $x = (x_1, x_2, x_3)$ we define an operator $P$ which has
the property that

\[ Px = (x_1, x_2, 0). \]

FIGURE 5.9. Projection of a point $x$ onto the $x_1x_2$-plane

That is, $P$ projects any vector onto the $x_1x_2$-plane. It follows that if
$y$ is a vector of the form $(y_1, y_2, 0)$, then $Py = y$, so that
$R(P) = \{y : y = (\alpha, \beta, 0)\}$ and
$N(P) = \{y : y = (0, 0, \gamma)\}$, where $\alpha$, $\beta$, and $\gamma$
are real numbers. Furthermore, the only vector common to $R(P)$ and $N(P)$
is the zero vector. More generally, $P$ has the property that
$P^2x \equiv P(Px) = Px$ for all points $x$ in $\mathbb{R}^3$. This is a
simple and standard example of a projection operator on $\mathbb{R}^3$; we
now generalize to arbitrary vector spaces.

Projection operators. A linear operator $P : X \to X$, where $X$ is a
vector space, is called a projection operator, or simply a projection, if

\[ P^2 = P; \quad \text{that is, } P(Pu) = Pu \text{ for all } u \in X. \]

Example

19. Let $X = C[0, 1]$ and define the operator $P : C[0, 1] \to C[0, 1]$ by

\[ Pu = u(0)(1 - x) + u(1)x. \]

That is, $P$ maps a continuous function to its linear interpolate, as
shown in Figure 5.10. To see that $P$ is a projection operator, note
that

\[
\begin{aligned}
P(Pu) &= Pv, \quad \text{where } v = Pu = u(0)(1 - x) + u(1)x, \\
      &= v(0)(1 - x) + v(1)x = u(0)(1 - x) + u(1)x = Pu.
\end{aligned}
\]
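
The idempotence $P(Pu) = Pu$ of the interpolation operator is easy to confirm numerically. In the sketch below, functions are represented as Python callables; the helper name `interpolate` and the particular choice of $u$ are assumptions made for illustration.

```python
import math

def interpolate(u):
    """Return Pu, the linear interpolant of u between x = 0 and x = 1."""
    u0, u1 = u(0.0), u(1.0)
    return lambda x: u0 * (1.0 - x) + u1 * x

u = lambda x: math.exp(x) * math.cos(2.0 * x)   # illustrative choice of u
Pu = interpolate(u)
PPu = interpolate(Pu)                           # apply P a second time

# P(Pu) and Pu should agree at every point of [0, 1].
grid = [k / 100 for k in range(101)]
max_gap = max(abs(PPu(x) - Pu(x)) for x in grid)
```

Applying `interpolate` twice changes nothing because $Pu$ already agrees with $u$ at the two endpoints that determine the interpolant.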

The characterization of the range and null space of a projection operator
carries over in an obvious way from projections on $\mathbb{R}^3$ to those on
vector spaces in general, and the main ideas are embodied in the following
result.

FIGURE 5.10. A continuous function $u$ and its linear interpolate $Pu$

THEOREM 7. Let $P : X \to X$ be a linear projection operator on a vector
space $X$. Then $R(P) \cap N(P) = \{0\}$ and every member of $X$ has the
unique representation $u = v + w$ for some $v \in R(P)$ and $w \in N(P)$.
That is, $X = R(P) \oplus N(P)$.

PROOF. We recall from Section 3.1 the definition of the direct sum
$U \oplus V$ of two subspaces $U$, $V$ of a vector space $X$: $X = U + V$
and $U \cap V = \{0\}$.
Suppose then that $u \in R(P) \cap N(P)$. Then since $u \in R(P)$ there is a
$v \in X$ such that $Pv = u$. Hence $Pv = P^2v = Pu = 0$, since
$u \in N(P)$ as well. Thus $u = 0$.
To show that $R(P) + N(P) = X$, let $u$ be any member of $X$, and let
$Pu = v$. If we set $w = u - v$, then
$Pw = Pu - Pv = P(Pu - Pv) = P(v - Pv) = Pv - Pv = 0$. Hence $u = v + w$
with $v \in R(P)$ and $w \in N(P)$.
Thus $X = R(P) \oplus N(P)$. The uniqueness of the representation follows
from Theorem 3.1. □

Example

20. Let $X = C[-1, 1]$ and define $P$ to be the projection that maps any
$u \in C[-1, 1]$ to its even part (Figure 5.11):

\[ Pu = v, \quad \text{where } v(x) = \tfrac{1}{2}[u(x) + u(-x)]. \]

The range of $P$ is then

\[ R(P) = \{v \in C[-1, 1] : v(x) = v(-x)\}, \]

that is, the space of all even functions, whereas the null space of $P$ is
the space of all odd functions:

\[ N(P) = \{v \in C[-1, 1] : v(x) = -v(-x)\}. \]

Clearly $X = R(P) \oplus N(P)$, since every continuous function can be
represented as the sum of an odd and an even function, and furthermore
the only function that is in $R(P) \cap N(P)$ is the zero function.
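
The decomposition into even and odd parts can be sketched in a few lines of Python. The helper names `even_part` and `odd_part`, and the choice $u(x) = e^x$ (whose even and odd parts are $\cosh x$ and $\sinh x$), are assumptions made for this illustration.

```python
import math

def even_part(u):
    """Pu: the even part of u on [-1, 1]."""
    return lambda x: 0.5 * (u(x) + u(-x))

def odd_part(u):
    """u - Pu: the odd part of u, which lies in the null space of P."""
    return lambda x: 0.5 * (u(x) - u(-x))

u = lambda x: math.exp(x)          # illustrative choice of u
v, w = even_part(u), odd_part(u)

grid = [-1.0 + k / 50 for k in range(101)]
recon_err = max(abs(v(x) + w(x) - u(x)) for x in grid)      # u = v + w
proj_err = max(abs(even_part(v)(x) - v(x)) for x in grid)   # P(Pu) = Pu
null_err = max(abs(even_part(w)(x)) for x in grid)          # P(u - Pu) = 0
cosh_err = max(abs(v(x) - math.cosh(x)) for x in grid)      # even part is cosh
```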
FIGURE 5.11. Decomposition of a continuous function into its odd and even
parts (panels: the function $v$, its even part $Pv$, and its odd part $v - Pv$)

Orthogonal projections. There is a further property of projection
operators on $\mathbb{R}^n$ that is easily generalized if the vector space is
also an inner product space: this is the concept of an orthogonal projection.
We define an orthogonal projection operator to be a projection $P$ on an
inner product space $X$ with the property that

\[ R(P) \perp N(P); \]

that is, $(u, v) = 0$ for $u \in R(P)$ and $v \in N(P)$. The situation in
$\mathbb{R}^3$ is obvious, as Figure 5.9 shows; if $P$ is the projection
operator that maps vectors onto the $x_1x_2$-plane, then $R(P)$ is the
$x_1x_2$-plane, $N(P)$ is the $x_3$-axis, and $R(P) \perp N(P)$.
Since we now have at our disposal a normed space (the norm being
generated by the inner product), it is natural to enquire into the continuity
of projection operators. We have the following result.

THEOREM 8. An orthogonal projection $P : X \to X$ on an inner product
space $X$ is continuous.

PROOF. Any $u \in X$ can be expressed in the form $u = v + w$, where
$v \in R(P)$ and $w \in N(P)$. Furthermore, $(v, w) = 0$, so it follows
(why?) that $\|u\|^2 = \|v\|^2 + \|w\|^2$, hence
$\|Pu\|^2 = \|v\|^2 \leq \|u\|^2$. Thus $P$ is bounded, hence continuous. □

Up to now we have discussed the situation that obtains when we are given
a projection operator. What of the converse situation? Suppose we are
given a subspace Y of an inner product space X. Is it possible to define an
orthogonal projection P with the property that R(P) = Y? The answer lies
in a logical extension to Theorem 8 of Chapter 4, the Projection Theorem,
as we now show.

THEOREM 9. Let $Y$ be a closed linear subspace of a Hilbert space $H$. Then
there is exactly one orthogonal projection $P : H \to H$ with $R(P) = Y$.
Moreover, $N(P) = Y^\perp$.

PROOF. We know that $H = Y \oplus Y^\perp$ and that every $u \in H$ has the
unique representation

\[ u = v + w, \quad \text{where } v \in Y,\ w \in Y^\perp. \]

Now define $P : H \to H$ by

\[ P(v + w) = v; \]

that is, $P$ is a projection with range $R(P) = Y$. It is not difficult to
show that $P$ is an orthogonal projection; so all that remains is to show
that $P$ is unique. Let $Q$ be another orthogonal projection with
$R(Q) = Y$. Now, by Theorem 8 of Chapter 4, $N(Q) = Y^\perp$, and

\[
\begin{aligned}
Pv &= v = Qv \quad \text{for } v \in Y, \\
Pw &= 0 = Qw \quad \text{for } w \in Y^\perp.
\end{aligned}
\]

Hence $P(v + w) = Q(v + w)$ for all $u = v + w \in H$; in other words,
$P = Q$. □
So provided that $H$ is a Hilbert space, for any given closed subspace $Y$ it
is possible to set up a unique orthogonal projection $P$ onto $Y$. By Theorem
8 of Chapter 4, we then have

\[ H = R(P) \oplus N(P) = Y \oplus Y^\perp. \]

Note the close relationship between Theorem 9 and the Projection Theorem.
This section is concluded with a similar extension of Theorem 7, also
from Chapter 4.

THEOREM 10. Let $Y$ be a closed subspace of a Hilbert space $H$, and let $P$
be the orthogonal projection from $H$ onto $Y$. Then for any $u_0 \in H$

\[ \|u_0 - Pu_0\| \leq \|u_0 - v\| \quad \text{for all } v \in Y; \]

that is,

\[ \|u_0 - Pu_0\| = \min_{v \in Y} \|u_0 - v\|. \]

PROOF. Since $H = R(P) \oplus N(P)$ from Theorem 7, clearly
$u_0 - Pu_0 \in N(P) = Y^\perp$. Hence

\[ (u_0 - Pu_0, Pu_0 - v) = 0 \quad \text{for all } v \in Y \]

since $Pu_0 - v \in Y$ also, and so

\[
\begin{aligned}
\|u_0 - v\|^2 &= \|u_0 - Pu_0 + Pu_0 - v\|^2 \\
              &= \|u_0 - Pu_0\|^2 + \|Pu_0 - v\|^2 \\
              &\geq \|u_0 - Pu_0\|^2,
\end{aligned}
\]

which proves the theorem. □
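
The best-approximation property can be illustrated in the finite-dimensional Hilbert space $\mathbb{R}^3$. In the sketch below, $Y$ is a plane spanned by two orthonormal vectors; the vectors `q1`, `q2`, the point `u0`, and the helper names are assumptions chosen for the illustration, and the comparison against randomly sampled members of $Y$ is a spot check rather than a proof.

```python
import math
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def combine(c1, q1, c2, q2):
    """Return the linear combination c1*q1 + c2*q2."""
    return [c1 * a + c2 * b for a, b in zip(q1, q2)]

# Orthonormal basis of an illustrative plane Y in R^3 (an assumption).
q1 = [1.0, 0.0, 0.0]
q2 = [0.0, 0.6, 0.8]

def project(u):
    """Orthogonal projection of u onto Y = span{q1, q2}."""
    return combine(dot(u, q1), q1, dot(u, q2), q2)

u0 = [2.0, -1.0, 3.0]
Pu0 = project(u0)
residual = [a - b for a, b in zip(u0, Pu0)]   # u0 - P u0, lies in Y-perp
best = norm(residual)

# Compare with the distance from u0 to many other members of Y.
random.seed(1)
others = [combine(random.uniform(-5, 5), q1, random.uniform(-5, 5), q2)
          for _ in range(1000)]
is_minimal = all(best <= norm([a - b for a, b in zip(u0, v)]) + 1e-12
                 for v in others)
```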

Example

21. Consider the space $L^2(-1, 1)$. Let $V$ be the subspace of
$L^2(-1, 1)$ consisting of all even functions; that is,

\[ V = \{v \in L^2(-1, 1) : v(-x) = v(x) \text{ a.e. on } (-1, 1)\}. \]

Then $V$ is a subspace of $L^2(-1, 1)$, and in fact $V$ is closed (show
this). According to Theorem 9 there is an orthogonal projection $P$ from
$L^2(-1, 1)$ onto $V$; in other words, $R(P) = V$. This projection is
defined by

\[ P : L^2 \to L^2, \quad Pv = \tfrac{1}{2}[v(x) + v(-x)]. \]

$P$ is clearly a projection; it is linear and $P^2 = P$. The null space of
$P$ is the set of odd functions, as characterized in Example 20.
We easily verify that $P$ is an orthogonal projection: if $u \in R(P)$ and
$v \in N(P)$, then

\[ (u, v) = \int_{-1}^{1} u(x)v(x)\,dx = 0 \]

since the product $uv$ is an odd function. Hence $R(P) \perp N(P)$. So
from Theorem 7,

\[ L^2(-1, 1) = R(P) \oplus N(P); \]

that is, every function in $L^2(-1, 1)$ can be represented uniquely as the
sum of an even and an odd function that also belong to $L^2(-1, 1)$.

5.4 Linear functionals


Linear functionals and dual spaces. Let $\mathbb{K}$ denote either
$\mathbb{R}$ or $\mathbb{C}$; then a linear functional $\ell$ on a vector
space $X$ is defined to be any linear operator that maps elements of $X$ to
$\mathbb{K}$; that is, $\ell : X \to \mathbb{K}$. Now we have seen that
the set $\mathcal{L}(X, Y)$ of all bounded linear operators from a normed
space $X$ to a normed space $Y$ is itself a normed space, with norm defined
by (5.3). When $Y = \mathbb{K}$, $\mathcal{L}(X, \mathbb{K})$ is the space
of bounded linear functionals on $X$, and is given a special name: this is
called the dual space of $X$, and is denoted by $X'$. That is,

\[ X' = \mathcal{L}(X, \mathbb{K}), \]

and for any $\ell \in X'$ there is a constant $K > 0$ such that

\[ \|\ell(u)\| = |\ell(u)| \leq K\|u\| \quad \text{for all } u \in X. \]

The second expression states that $\ell$ is bounded and hence continuous.
It is customary, when dealing with bounded linear functionals, to denote
the action of such a functional $\ell$ on an element $u$ by

\[ \langle \ell, u \rangle \quad \text{instead of the usual } \ell(u); \]

we adopt this custom.
Using the definition (5.3) of an operator norm we see that the norm
$\|\ell\|_{X'}$ of a member of $X'$ is given by

\[ \|\ell\|_{X'} = \sup \frac{|\langle \ell, v \rangle|}{\|v\|}, \quad v \neq 0. \]

Most of the time we deal with the case $\mathbb{K} = \mathbb{R}$, and it is
this case that is the focus in examples.

Examples

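22. Let $\ell : L^2(a, b) \to \mathbb{R}$ be defined by

\[ \langle \ell, u \rangle = \int_a^b u(x)\,dx. \]

Then $\ell$ is a linear functional:
$\langle \ell, \alpha u + \beta v \rangle = \alpha\langle \ell, u \rangle + \beta\langle \ell, v \rangle$.
Furthermore, using the Cauchy-Schwarz inequality on $L^2$,

\[
|\langle \ell, u \rangle| = \left| \int_a^b 1 \cdot u(x)\,dx \right|
  \leq \left( \int_a^b 1^2\,dx \right)^{1/2} \left( \int_a^b u^2\,dx \right)^{1/2}
  = (b - a)^{1/2}\|u\|_{L^2},
\]

and so $\ell$ is bounded, and is thus a member of the dual space
$[L^2(a, b)]'$.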

23. One reason for the importance of functionals is exemplified by the
Dirac delta "function" $\delta$, which occurs in many branches of physics
and engineering. This quantity is commonly defined to be zero everywhere,
with a "spike" at the origin; that is, $\delta(x) = 0$ for $x \neq 0$,
$\delta(x) \to \infty$ at $x = 0$. Furthermore, $\delta$ is assumed to have
the property

\[ \int_{-\infty}^{\infty} \delta(x)u(x)\,dx = u(0) \]

for any continuous function $u$ (Figure 5.12).

FIGURE 5.12. The Dirac delta

However, it is not possible to construct a function in the ordinary sense
having the properties just described. Rather, $\delta$ is more correctly
defined as a bounded linear functional on the space of continuous functions
$C[a, b]$ (here $b > 0 > a$), with

\[ \delta : C[a, b] \to \mathbb{R}, \quad \langle \delta, u \rangle = u(0). \]

Thus the Dirac delta has a sampling property; it acts on a continuous
function in such a way as to produce the value of that function at
$x = 0$ (or at any other chosen point, with a small modification). The
boundedness of $\delta$ follows from

\[ |\langle \delta, u \rangle| = |u(0)| \leq \sup_{a \leq x \leq b} |u(x)| = \|u\|_\infty. \]

When approached in this way, there is no difficulty whatsoever in
dealing with the Dirac delta; it is simply an operator that acts on a
continuous function to produce its value at the origin.
The Riesz Representation Theorem. Suppose that $X$ is an inner
product space, and $u$ a member of $X$. The inner product $(u, v)$ of $u$
with an arbitrary member $v$ of $X$ can be interpreted as the action of a
functional; indeed, for given $u$ we have simply to define the functional
$\ell$ by

\[ \langle \ell, v \rangle = (u, v) \tag{5.6} \]

for any $v$. Linearity of $\ell$ is obvious; in addition, $\ell$ is
bounded, with $\|\ell\| = \|u\|_X$. To see this, first observe that

\[ |\langle \ell, v \rangle| = |(u, v)| \leq K\|v\| \]

using the Cauchy-Schwarz inequality, with $K = \|u\|$, so that
$\|\ell\| \leq \|u\|$ (cf. equation (5.3)). Secondly,
$|\langle \ell, u \rangle| = |(u, u)| = \|u\|^2$ so that
$\|\ell\| \geq |(u, u)|/\|u\| = \|u\|$. Thus $\|\ell\| = \|u\|$.

A result of crucial importance in functional analysis states that for a
bounded linear functional on a Hilbert space $H$, the converse also holds
true: given $\ell$, it is always possible to find an element $u$ in $H$ such
that (5.6) holds. This is the essence of the following theorem, which is
regarded as one of the "big" theorems in linear functional analysis.

THEOREM 11 (RIESZ REPRESENTATION THEOREM). Let $H$ be a Hilbert
space and $\ell$ a bounded linear functional on $H$. Then there exists a
unique element $u$ in $H$ such that

\[ \langle \ell, v \rangle = (v, u) \quad \text{for all } v \in H. \tag{5.7} \]

Furthermore,

\[ \|\ell\| = \|u\|. \tag{5.8} \]

PROOF. Assume that $\ell \neq 0$ since, if $\ell = 0$, then (5.7) and (5.8)
hold with $u = 0$. First, observe that if a representation (5.7) exists,
then the element $u$ must be nonzero. Second, for any $v$ in $H$ for which
$\langle \ell, v \rangle = 0$ we must have $(u, v) = 0$. This implies that
$u$ must be orthogonal to any member of the null space of $\ell$; that is,
$u \in N(\ell)^\perp$. We are thus led to show the existence of $u$ by
considering $N(\ell)$ and $N(\ell)^\perp$.
Now $N(\ell)$ is a closed subspace of $H$ (see Exercise 5.16). Furthermore,
since $\ell \neq 0$ by assumption, $N(\ell) \neq H$ (if $N(\ell)$ were equal
to $H$ this would imply that $\langle \ell, v \rangle = 0$ for all
$v \in H$). Thus $N(\ell)$ is a proper subspace of $H$ and so, by the
Projection Theorem (Theorem 8, Chapter 4), $N(\ell)^\perp \neq \{0\}$.
Hence there must be at least one nonzero element, $u_0$, say, in
$N(\ell)^\perp$. Set

\[ z = \langle \ell, v \rangle u_0 - \langle \ell, u_0 \rangle v \quad \text{for any } v \in H; \]

then

\[ \langle \ell, z \rangle = \langle \ell, v \rangle\langle \ell, u_0 \rangle - \langle \ell, u_0 \rangle\langle \ell, v \rangle = 0 \]

and so $z \in N(\ell)$. Also, since $u_0 \in N(\ell)^\perp$ we have

\[
0 = (u_0, z) = (u_0, \langle \ell, v \rangle u_0 - \langle \ell, u_0 \rangle v)
  = \langle \ell, v \rangle (u_0, u_0) - \langle \ell, u_0 \rangle (u_0, v),
\]

which implies that

\[ \langle \ell, v \rangle = \frac{\langle \ell, u_0 \rangle (u_0, v)}{\|u_0\|^2}. \tag{5.9} \]

Finally, we set

\[ u = \frac{\langle \ell, u_0 \rangle}{\|u_0\|^2}\,u_0, \tag{5.10} \]

and from (5.9) we see that the element $u$ defined by (5.10) satisfies (5.7).
The existence of $u$ has been proved.
The proof that $u$ is unique, and the derivation of (5.8), are more
straightforward than the existence proof, and are left as exercises (see
Exercise 5.30). □

In practice it is in many instances a difficult task to construct the element
$u$ related to a linear functional $\ell$ by (5.7); but there are situations
in which this can be done, and we give some examples.

Examples

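24. Let $\ell$ be a linear functional on $\mathbb{R}^n$; then
$\langle \ell, x \rangle$ is a real number and according to Theorem 11 we
can always find a point $y \in \mathbb{R}^n$ such that

\[ \langle \ell, x \rangle = x \cdot y. \]

For example, if $\ell$ is defined by

\[ \langle \ell, x \rangle = x_1 + \cdots + x_n \quad \text{for } x = (x_1, \ldots, x_n) \in \mathbb{R}^n, \]

then we may find $y = (y_1, \ldots, y_n)$ from

\[ y_i = y \cdot e_i = \langle \ell, e_i \rangle, \quad i = 1, \ldots, n, \]

where $e_i$ is the standard $i$th basis element of $\mathbb{R}^n$. Hence

\[ y_i = 0 + \cdots + 1 + \cdots + 0 = 1 \quad \text{and so } y = (1, 1, \ldots, 1). \]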
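
The recipe $y_i = \langle \ell, e_i \rangle$ translates directly into code. In the sketch below a functional on $\mathbb{R}^n$ is represented by a Python callable; the helper name `representer`, the dimension $n = 4$, and the test vector are assumptions made for illustration.

```python
def representer(functional, n):
    """Return y with functional(x) = x . y, using y_i = functional(e_i)."""
    basis = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    return [functional(e) for e in basis]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# The functional from the example: the sum of the components.
ell = lambda x: sum(x)

y = representer(ell, 4)              # expected to be the vector of ones
x = [0.5, -2.0, 3.0, 1.25]           # arbitrary test vector
agrees = dot(x, y) == ell(x)         # <l, x> = x . y
```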

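25. Let $\ell$ be a linear functional on $L^2(0, 1)$ defined by

\[ \ell : L^2 \to \mathbb{R}, \quad \langle \ell, v \rangle = \int_0^{1/2} v(x)\,dx. \]

Then according to Theorem 11 there exists a unique $u \in L^2(0, 1)$
with the property that

\[ \langle \ell, v \rangle = (u, v) \quad \text{or} \quad \int_0^1 u(x)v(x)\,dx = \int_0^{1/2} v(x)\,dx. \]

Clearly $u(x)$ is the function

\[ u(x) = \begin{cases} 1, & 0 < x \leq \tfrac{1}{2}, \\ 0, & \tfrac{1}{2} < x < 1. \end{cases} \tag{5.11} \]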
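
A numerical spot check of this representer: the sketch below compares $\langle \ell, v \rangle$ with $(u, v)$ for a few trial functions, using the midpoint rule. The quadrature routine, the trial functions, and all names are assumptions of this illustration; agreement is only up to quadrature error.

```python
import math

def integrate(f, lo, hi, n=4000):
    """Midpoint-rule approximation of the integral of f over (lo, hi)."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

u = lambda x: 1.0 if x <= 0.5 else 0.0      # the representer (5.11)

def functional(v):
    """<l, v>: the integral of v over (0, 1/2)."""
    return integrate(v, 0.0, 0.5)

def inner(f, g):
    """L^2(0, 1) inner product of f and g."""
    return integrate(lambda x: f(x) * g(x), 0.0, 1.0)

trial_functions = [lambda x: x,
                   lambda x: math.sin(5.0 * x),
                   lambda x: math.exp(-x)]
gaps = [abs(functional(v) - inner(u, v)) for v in trial_functions]
norm_u = math.sqrt(inner(u, u))             # equals ||l|| by (5.8)
```

The value of `norm_u` approximates $\|u\|_{L^2} = 1/\sqrt{2}$, which by (5.8) is also the norm of the functional.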
We have seen from the Riesz Representation Theorem that there is a
one-to-one correspondence between bounded linear functionals on
a Hilbert space $H$ and members of $H$. In other words, it is possible to
set up a bijective linear map (that is, one that is one-to-one with
range all of $H'$) $J : H \to H'$, called the Riesz map of $H$. In addition,
from (5.8), $\|u\| = \|\ell\| = \|Ju\|$. Thus any Hilbert space $H$ and its
dual $H'$ are isometrically isomorphic to each other, and may be identified
with each other in this sense. This correspondence is written

\[ H \cong H', \]

meaning that $H$ is identified with its dual.

Example

26. In view of the preceding remarks the correspondence

\[ L^2(\Omega) \cong [L^2(\Omega)]' \tag{5.12} \]

exists. For example, if $\ell : L^2(0, 1) \to \mathbb{R}$ with

\[ \langle \ell, v \rangle = \int_0^{1/2} v(x)\,dx, \]

then according to Example 25, $u$ given by (5.11) is the unique member
of $L^2(0, 1)$ corresponding to $\ell$; that is, $\ell$ is identified with
$u$, and $\|\ell\| = \|u\|_{L^2}$.
The dual space of $L^p(\Omega)$. The equivalence (5.12) has its counterpart
in the case of the spaces $L^p(\Omega)$ which, with the exception of
$p = 2$, are not Hilbert spaces. For any $p$ such that $1 \leq p < \infty$,
set $q = p/(p - 1)$ (that is, $1/p + 1/q = 1$, with the usual rule
$1/0 = \infty$). Note that the case $p = \infty$ is excluded.
Now let $g$ be any function in $L^q$ and define a functional $\ell_g$ on
$L^p$ according to

\[ \langle \ell_g, f \rangle = \int_\Omega fg\,dx \quad \text{for all } f \in L^p. \]

Then, using Hölder's inequality (3.13),

\[ |\langle \ell_g, f \rangle| \leq \|f\|_p\,\|g\|_q \]

(here the $L^p$-norm is denoted by $\|\cdot\|_p$) so that $\ell_g$ is a
bounded linear functional on $L^p$, with

\[ \|\ell_g\| \leq \|g\|_q. \]

It is possible to go one step further, and to show that in fact

\[ \|\ell_g\| = \|g\|_q; \]

this step is treated in Exercise 5.31.
Now it is clear from this argument that it is possible to associate with
each $g$ in $L^q$ a bounded linear functional $\ell_g$ on $L^p$. What of the
converse? Is it the case that every bounded linear functional on $L^p$ has
the form $\ell_g$ for some $g \in L^q$? The answer is affirmative, and is
embodied in a result known as Riesz's Theorem.

THEOREM 12 (RIESZ'S THEOREM). Let $\ell$ be a bounded linear functional
on $L^p(\Omega)$, for $1 \leq p < \infty$. Then there is a function $g$ in
$L^q(\Omega)$ such that

\[ \langle \ell, f \rangle = \int_\Omega fg\,dx \quad \text{for all } f \in L^p. \tag{5.13} \]

Furthermore,

\[ \|\ell\| = \|g\|_q. \]

Though Theorem 12 can be proved using techniques that are elementary,
the proof is rather lengthy and is omitted. What is important, though, is
to note that $[L^p]'$ and $L^q$ are isometrically isomorphic; that is,

\[ [L^p(\Omega)]' \cong L^q(\Omega), \quad \frac{1}{p} + \frac{1}{q} = 1, \quad 1 \leq p < \infty. \]

Although in particular $[L^1(\Omega)]' \cong L^\infty(\Omega)$, this
relationship is not symmetric; in fact it can be shown that
$[L^\infty(\Omega)]'$ is larger than $L^1(\Omega)$.

Weak and weak* convergence. The existence of bounded linear
functionals on a normed space $X$ makes it possible to introduce an
alternative form of convergence of sequences in $X$. Let $\{u_n\}$ be a
sequence in $X$; then this sequence is said to converge weakly to a member
$u$ of $X$ if

\[ \langle \ell, u_n \rangle \to \langle \ell, u \rangle \quad \text{as } n \to \infty, \text{ for all } \ell \in X'. \]

When this is the case, $u$ is referred to as the weak limit of $\{u_n\}$,
and we write $u_n \rightharpoonup u$.
When wishing to distinguish between weak convergence and the conventional
notion of convergence defined in (4.2), the latter is referred to as
strong or norm convergence. There is a difference between strong and weak
convergence; on the one hand strong convergence implies weak convergence:
indeed, if $u_n \to u$ (strongly), then for any bounded linear functional
$\ell$

\[ |\langle \ell, u_n - u \rangle| \leq \|\ell\|\,\|u_n - u\| \to 0 \quad \text{as } n \to \infty. \]

However, the converse is not true. Take, for example, the function
$u(x) = 0$ and the sequence defined by $u_n(x) = \cos nx$, on the interval
$[0, 2\pi]$; then $u_n \in L^2(0, 2\pi)$ (for example), and it can be shown
that $u_n \rightharpoonup 0$. On the other hand,
$\|u_n - 0\|_{L^2}^2 = \pi$, that is, $\|u_n - 0\|_{L^2} = \sqrt{\pi}$, for
all values of $n \geq 1$, so $u_n$ does not converge strongly to $0$.
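
The contrast between weak and strong convergence for $u_n(x) = \cos nx$ can be observed numerically. The sketch below pairs $u_n$ with one fixed test function $g(x) = x^2$ (a single illustrative choice, which of course does not prove weak convergence) and also evaluates $\|u_n\|_{L^2}^2$; the quadrature routine and names are assumptions of this illustration. For this $g$ the exact pairing is $4\pi/n^2$.

```python
import math

def integrate(f, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of f over (lo, hi)."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

g = lambda x: x * x    # illustrative fixed test function in L^2(0, 2*pi)

def pairing(n):
    """(u_n, g) for u_n(x) = cos(n x) on (0, 2*pi)."""
    return integrate(lambda x: math.cos(n * x) * g(x), 0.0, 2.0 * math.pi)

def norm_squared(n):
    """||u_n||^2 in L^2(0, 2*pi)."""
    return integrate(lambda x: math.cos(n * x) ** 2, 0.0, 2.0 * math.pi)

# The pairings shrink toward 0 while the squared norms stay at pi.
pairings = {n: pairing(n) for n in (1, 4, 16, 64)}
norms_sq = {n: norm_squared(n) for n in (1, 4, 16, 64)}
```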
It follows from the Riesz Representation Theorem that in a Hilbert space
$H$, $u_n \rightharpoonup u$ if and only if $(u_n, v) \to (u, v)$ for all
$v \in H$.
In the context of the spaces $L^p$, the notion of weak convergence takes on
a more concrete form in the light of Riesz's Theorem. Indeed, the
correspondence (5.13) implies that a sequence $\{u_n\}$ in $L^p$
($1 \leq p < \infty$) is weakly convergent with limit $u$ if and only if

\[ \lim_{n \to \infty} \int_\Omega u_n g\,dx = \int_\Omega ug\,dx \quad \text{for all } g \in L^q(\Omega). \]

We mention briefly also the concept of weak* convergence, which applies
to sequences of bounded linear functionals. Suppose that $X$ is a normed
space with dual $X'$; then a sequence $\{\ell_n\}$ in $X'$ is said to
converge weakly* to an element $\ell$ in $X'$ if

\[ \langle \ell_n, u \rangle \to \langle \ell, u \rangle \quad \text{as } n \to \infty, \]

for all $u \in X$. In this case we write
$\ell_n \overset{*}{\rightharpoonup} \ell$.

5.5 Bilinear forms


Another special type of operator that occurs very frequently in the study
of boundary value problems is one that maps a pair of elements to the real
or complex numbers, and which is linear in each of its slots. This is called
a bilinear form. Since we deal exclusively with real-valued bilinear forms,
the definition and examples are restricted to this case. From the discussion
of linear functionals it should be clear, though, that the extension to
complex-valued forms is immediate.
If $X$ and $Y$ are vector spaces, a bilinear form
$a : X \times Y \to \mathbb{R}$ is defined to be an operator with the
properties

\[
\begin{aligned}
a(\alpha u + \beta w, v) &= \alpha a(u, v) + \beta a(w, v), \quad u, w \in X,\ v \in Y, \\
a(u, \alpha v + \beta w) &= \alpha a(u, v) + \beta a(u, w), \quad u \in X,\ v, w \in Y,
\end{aligned}
\tag{5.14}
\]

where $\alpha$ and $\beta$ are real numbers.

Examples

27. Let $X = Y = \mathbb{R}^3$, and let $A$ be any $3 \times 3$ matrix; then
the operator defined by $a(x, y) = x \cdot Ay$ is a bilinear form. In
particular, for any inner product space $X$ the inner product
$(\cdot, \cdot) : X \times X \to \mathbb{R}$ is a bilinear form. An example
of a form that is not bilinear is

\[ a(x, y) = |x| + |y|; \]

here, in general,

\[ a(\alpha x + \beta z, y) = |\alpha x + \beta z| + |y| \neq \alpha a(x, y) + \beta a(z, y) \]

(this last expression being equal to
$\alpha(|x| + |y|) + \beta(|z| + |y|)$).

28. Let $X = Y = C^1[a, b]$. Then the operator defined by

\[ a : C^1[a, b] \times C^1[a, b] \to \mathbb{R}, \quad a(u, v) = \int_a^b (uv + u'v')\,dx \]

is a bilinear form.
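
The linearity property (5.14) in the first slot can be checked directly for the matrix form $a(x, y) = x \cdot Ay$, and seen to fail for $|x| + |y|$. In the sketch below the matrix, vectors, and scalars are illustrative assumptions, and $|\cdot|$ is taken to be the Euclidean length, which is also an assumption since the text does not specify it.

```python
def matvec(A, y):
    """Matrix-vector product Ay, with A stored as a list of rows."""
    return [sum(a_ij * yj for a_ij, yj in zip(row, y)) for row in A]

def dot(x, y):
    return sum(p * q for p, q in zip(x, y))

def lincomb(alpha, x, beta, z):
    """Return alpha*x + beta*z."""
    return [alpha * p + beta * q for p, q in zip(x, z)]

# Illustrative 3 x 3 matrix (an assumption, not from the text).
A = [[2.0, -1.0, 0.0],
     [1.0, 3.0, 4.0],
     [0.0, 5.0, -2.0]]

def a(x, y):
    """Bilinear form a(x, y) = x . (A y)."""
    return dot(x, matvec(A, y))

def abs_sum(x, y):
    """The form |x| + |y| (Euclidean lengths), which is not bilinear."""
    return dot(x, x) ** 0.5 + dot(y, y) ** 0.5

x, z, y = [1.0, 2.0, -1.0], [0.0, 1.0, 3.0], [2.0, -1.0, 1.0]
alpha, beta = 3.0, -2.0

# Linearity in the first slot holds for a ...
lhs = a(lincomb(alpha, x, beta, z), y)
rhs = alpha * a(x, y) + beta * a(z, y)

# ... but fails for |x| + |y|.
lhs_bad = abs_sum(lincomb(alpha, x, beta, z), y)
rhs_bad = alpha * abs_sum(x, y) + beta * abs_sum(z, y)
```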

Continuous bilinear forms. Suppose that we are given a bilinear form
$a : X \times Y \to \mathbb{R}$, where now $X$ and $Y$ are normed linear
spaces. Consider the expression $a(u, v)$; if there is a positive number $K$
such that

\[ |a(u, v)| \leq K\|u\|\,\|v\| \quad \text{for all } u \in X,\ v \in Y, \tag{5.15} \]

then $a$ is called a continuous bilinear form (this definition should be
compared with that of bounded operators, or bounded linear functionals).
Later on it is shown that differential equations have associated bilinear
forms, and in order for problems to be well-posed in a certain sense it is
essential that these bilinear forms be continuous.

Example

29. We discuss here an example that is typical of a class of problems
that appears later. Denote by $H^1(c, d)$ the vector space of functions
that, together with their first derivatives, are square-integrable on
the interval $(c, d)$. The reason for the notation $H^1$ becomes clear in
Chapter 7, where it is also shown that $H^1(c, d)$ is in fact a Hilbert
space if it is endowed with the inner product
$(u, v)_{H^1} = \int_c^d [uv + u'v']\,dx$; the associated norm is then
given by

\[ \|u\|_{H^1}^2 = \int_c^d [u^2 + (u')^2]\,dx. \tag{5.16} \]

Now define the bilinear form $a(\cdot, \cdot)$ by

\[ a : H^1(c, d) \times H^1(c, d) \to \mathbb{R}, \quad a(u, v) = \int_c^d [u'v' + \kappa uv]\,dx, \]

where $\kappa(x)$ is a bounded continuous function satisfying
$K_1 \geq \kappa(x) \geq K_2 > 0$ for $x \in (c, d)$ and constants $K_1$ and
$K_2$. To show that $a$ is continuous, consider

\[
\begin{aligned}
|a(u, v)| &= \left| \int_c^d [u'v' + \kappa uv]\,dx \right| \\
          &\leq \left| \int_c^d u'v'\,dx \right| + \int_c^d |\kappa uv|\,dx \\
          &\leq \left| \int_c^d u'v'\,dx \right| + K_1 \int_c^d |u|\,|v|\,dx \\
          &= |(u', v')_{L^2}| + K_1 (|u|, |v|)_{L^2} \\
          &\leq \|u'\|_{L^2}\|v'\|_{L^2} + K_1 \|u\|_{L^2}\|v\|_{L^2}
\end{aligned}
\]

(using the Cauchy-Schwarz inequality).

Now from (5.16) it is clear that $\|u\|_{L^2} \leq \|u\|_{H^1}$, and
likewise $\|u'\|_{L^2} \leq \|u\|_{H^1}$; thus

\[
|a(u, v)| \leq \|u\|_{H^1}\|v\|_{H^1} + K_1 \|u\|_{H^1}\|v\|_{H^1}
          = (1 + K_1)\|u\|_{H^1}\|v\|_{H^1}
\]

so that $a(\cdot, \cdot)$ is continuous, with constant $K = 1 + K_1$.

In practice it is always desirable to find the smallest constant $K$ for
which the inequality (5.15) holds. In this example it is possible to
find a better constant $K = \max(1, K_1)$. This is discussed further in
Example 33.

H-elliptic bilinear forms. Given a bilinear form
$a : H \times H \to \mathbb{R}$, where $H$ is an inner product space, we say
that $a$ is $H$-elliptic if there exists a constant $\alpha > 0$ such that

\[ a(v, v) \geq \alpha\|v\|_H^2 \quad \text{for all } v \in H. \]

Thus for an $H$-elliptic form, $a(v, v)$ is always nonnegative, and takes
the value 0 only for the case in which $v = 0$. In other words, it is
positive-definite.

Example

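30. Consider Example 29 again; this bilinear form is $H^1$-elliptic since

\[
\begin{aligned}
|a(v, v)| &= \left| \int_c^d [(v')^2 + \kappa v^2]\,dx \right| \\
          &\geq \left| \int_c^d [(v')^2 + K_2 v^2]\,dx \right| \\
          &\geq \alpha \left| \int_c^d [(v')^2 + v^2]\,dx \right| = \alpha\|v\|_{H^1}^2,
\end{aligned}
\]

where $\alpha = \min(1, K_2)$.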
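
The continuity and ellipticity estimates of Examples 29 and 30 can be spot-checked numerically for particular functions. In the sketch below the interval $(0, 1)$, the coefficient $\kappa(x) = 1 + x$ (so that $K_1 = 2$ and $K_2 = 1$), the two functions with hand-supplied derivatives, and the midpoint quadrature are all assumptions chosen for illustration.

```python
import math

def integrate(f, lo, hi, n=4000):
    """Midpoint-rule approximation of the integral of f over (lo, hi)."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

c, d = 0.0, 1.0
kappa = lambda x: 1.0 + x           # illustrative coefficient on (0, 1)
K1, K2 = 2.0, 1.0                   # K1 >= kappa(x) >= K2 > 0 on (0, 1)

# Illustrative functions, each paired with its derivative supplied by hand.
u = lambda x: math.sin(math.pi * x)
du = lambda x: math.pi * math.cos(math.pi * x)
v = lambda x: x * x
dv = lambda x: 2.0 * x

def a(f, df, g, dg):
    """a(f, g) = integral of f'g' + kappa f g over (c, d)."""
    return integrate(lambda x: df(x) * dg(x) + kappa(x) * f(x) * g(x), c, d)

def h1_norm(f, df):
    """H^1 norm from (5.16)."""
    return math.sqrt(integrate(lambda x: f(x) ** 2 + df(x) ** 2, c, d))

# Continuity with K = 1 + K1, and ellipticity with alpha = min(1, K2).
continuity_ok = abs(a(u, du, v, dv)) <= (1.0 + K1) * h1_norm(u, du) * h1_norm(v, dv)
ellipticity_ok = all(a(f, df, f, df) >= min(1.0, K2) * h1_norm(f, df) ** 2 - 1e-12
                     for f, df in ((u, du), (v, dv)))
```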
The Riesz Representation Theorem for linear functionals has a counterpart
for bilinear forms that proves useful later. Suppose that we are given a real
inner product space $H$ and a continuous, $H$-elliptic bilinear form $a$ on
$H$; then for any given $u \in H$ it is possible to define a bounded linear
functional $\ell$ on $H$ according to the rule

\[ a : H \times H \to \mathbb{R}, \quad a(u, v) = \langle \ell, v \rangle \quad \text{for all } v \in H. \tag{5.17} \]

From the continuity of $a$ we find that

\[ \|\ell\| \leq K\|u\|, \tag{5.18} \]

in which $K$ is the constant appearing in (5.15). Furthermore, from the
$H$-ellipticity of $a(\cdot, \cdot)$ we have

\[ \alpha\|u\|^2 \leq a(u, u) = \langle \ell, u \rangle \leq \|\ell\|\,\|u\| \]

or

\[ \|u\| \leq (1/\alpha)\|\ell\|. \tag{5.19} \]

In this sense, then, $u$ and $a$ generate the bounded linear functional
$\ell$. We now prove the converse assertion: namely, given a bilinear form
$a$ and a linear functional with suitable properties, there exists a unique
$u \in H$ satisfying (5.17). This is the Lax-Milgram theorem.

THEOREM 13 (THE LAX-MILGRAM THEOREM). Let $H$ be a Hilbert space
and let $a : H \times H \to \mathbb{R}$ be a continuous, $H$-elliptic
bilinear form defined on $H$. Then, given any continuous linear functional
$\ell$ on $H$, there exists a unique element $u$ in $H$ such that (5.17)
holds for all $v \in H$; moreover, $u$ satisfies (5.19).

The proof of this theorem is rather lengthy, and is made more digestible
by breaking it up into a series of five lemmas.

LEMMA 1. Given any $u$ in $H$ there is a unique element $w$ in $H$ such that

\[ a(u, v) = (w, v) \quad \text{for all } v \in H. \tag{5.20} \]

PROOF. Given any $u \in H$, $a(u, \cdot)$ is a bounded linear functional on
$H$ since

\[ a(u, \cdot) : H \to \mathbb{R}, \quad |a(u, v)| \leq K'\|v\|, \]

where $K' = K\|u\|$. Hence, according to the Riesz Representation Theorem
there is a unique element $w$ in $H$ such that $a(u, v) = (w, v)$. □

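LEMMA 2. Let $A$ be the operator that associates $u$ with $w$:

\[ A : H \to H, \quad Au = w. \tag{5.21} \]

Then $A$ is a bounded linear operator.

PROOF. Let $u_1, u_2 \in H$; then according to Lemma 1 there are elements
$w_1$ and $w_2$ in $H$ that satisfy

\[ a(u_1, v) = (w_1, v), \quad a(u_2, v) = (w_2, v) \quad \text{for all } v \in H. \]

Since $a$ is bilinear,

\[
a(\alpha u_1 + \beta u_2, v) = \alpha a(u_1, v) + \beta a(u_2, v)
                             = (\alpha w_1 + \beta w_2, v). \tag{5.22}
\]

Furthermore, from the definition of $A$ we have

\[ Au_1 = w_1, \quad Au_2 = w_2. \tag{5.23} \]

But from (5.20) through (5.22) we see that $A$ maps
$\alpha u_1 + \beta u_2$ to $\alpha w_1 + \beta w_2$:

\[ A(\alpha u_1 + \beta u_2) = \alpha w_1 + \beta w_2 = \alpha Au_1 + \beta Au_2. \]

The linearity of $A$ thus follows. $A$ is bounded since, choosing
$u \neq 0$, setting $v = Au$ in (5.20), and using the continuity of $a$ and
(5.21),

\[ K\|u\|\,\|Au\| \geq a(u, Au) = (w, Au) = \|Au\|^2 \implies \|Au\| \leq K\|u\|. \]

If $u = 0$, we simply choose $w = 0$. □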
LEMMA 3. $A$ is one-to-one with bounded inverse $A^{-1}$.

PROOF. Let $R(A)$ denote the range of $A$ (of course, $R(A) \subset H$). We
show that $Az = 0$ only for $z = 0$ and use Theorem 1 to show that $A$ is
one-to-one. Let $z$ be such that $Az = 0$. Then, since $A$ by definition
maps $z$ to a member $Az$ of $H$ such that $a(z, v) = (Az, v)$ we have

\[ a(z, v) = (0, v) = 0 \quad \text{for all } v \in H. \]

In particular, for $v = z$,

\[ 0 = a(z, z) \geq \alpha\|z\|^2 \]

so that $\|z\| = 0$ or $z = 0$. Hence $A$ is one-to-one, and its inverse
$A^{-1} : R(A) \to H$ exists. Furthermore, $A^{-1}$ is linear since $A$ is
linear (see Exercise 5.9), and $A^{-1}$ is bounded since, with $w = Au$,

\[ \alpha\|u\|^2 \leq a(u, u) = (w, u) \leq \|w\|\,\|u\| \quad \text{(using the Cauchy-Schwarz inequality)}, \]

so that $\|A^{-1}w\| = \|u\| \leq \alpha^{-1}\|w\|$ for all $w \in R(A)$. □

LEMMA 4. $R(A)$ is a complete space.

PROOF. Let $\{w_k\}$ be a Cauchy sequence in $R(A)$. Since $R(A)$ is a subset
of $H$, $\{w_k\}$ is a Cauchy sequence in $H$ too, and so it converges in
$H$; that is, there is a $w \in H$ such that

\[ \lim_{k \to \infty} \|w_k - w\| = 0 \quad \text{in } H. \]

It is necessary to show that $w$ is in $R(A)$. To do this, let $u_k$ be
defined by $Au_k = w_k$. Then

\[
\|u_k - u_l\| = \|A^{-1}w_k - A^{-1}w_l\| = \|A^{-1}(w_k - w_l)\|
              \leq \|A^{-1}\|\,\|w_k - w_l\|
\]

so that

\[ \lim_{k,l \to \infty} \|u_k - u_l\| \leq \|A^{-1}\| \lim_{k,l \to \infty} \|w_k - w_l\| = 0 \]

($\{w_k\}$ is a Cauchy sequence in $H$). Hence $\{u_k\}$ is also a Cauchy
sequence in $H$, with limit $u$ in $H$. Furthermore, since $Au_k = w_k$ we
have

\[ \lim_{k \to \infty} Au_k = \lim_{k \to \infty} w_k = w \quad \text{or} \quad w = A(\lim_{k \to \infty} u_k) = Au \]

(using Theorem 4). Hence $w$ is in the range of $A$, and since $w$ is the
limit of an arbitrary Cauchy sequence, $R(A)$ is complete. □

LEMMA 5. $R(A) = H$; that is, $A$ is bijective.

PROOF. Suppose that $R(A)$ is a proper subspace of $H$, so that there is
a nonzero element $u_0$ that lies in $R(A)^\perp$ (recall the Projection
Theorem, Theorem 8 of Chapter 4). Then

\[ (u_0, z) = 0 \quad \text{for all } z \in R(A). \]

Using Lemma 2 we set $w_0 = Au_0$ so that $w_0 \in R(A)$. Then from Lemma
1 we have $a(u_0, v) = (w_0, v)$ for all $v$ in $H$. In particular, if we set
$v = u_0$, then

\[ \alpha\|u_0\|^2 \leq a(u_0, u_0) = (w_0, u_0) = 0 \quad \text{since } w_0 \in R(A),\ u_0 \in R(A)^\perp. \]

Hence $u_0 = 0$, which is a contradiction, so that $R(A)^\perp = \{0\}$ and
$R(A) = H$. □
Finally, we gather together all the pieces of information to give the
following.

PROOF OF THE LAX-MILGRAM THEOREM. Lemma 1 shows that for any
given $u \in H$ there is a unique $w \in H$ defined by (5.20). This lemma does
not prove the converse; indeed, we define the operator $A$ by (5.20), and in
order to prove that the converse is true it is necessary to show that $A$ is
bijective. This is done in Lemmas 3, 4, and 5. Hence we conclude that given
any $w \in H$ there exists a unique $u \in H$ such that

$a(u, v) = (w, v)$ for all $v \in H$. (5.24)

By the Riesz Representation Theorem, every bounded linear functional $\ell$
can be expressed in the form

$\langle \ell, v \rangle = (w, v)$ for all $v \in H$, (5.25)

with $\|\ell\| = \|w\|$. Thus (5.24) and (5.25) imply (5.17), and (5.19) follows
from the $H$-ellipticity of $a$ and the continuity of $\ell$. This proves the theorem. □
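
The statement just proved can be illustrated in the simplest possible setting. The following is a minimal sketch, assuming $H = \mathbb{R}^2$ with the Euclidean inner product and the bilinear form $a(u, v) = (Mu) \cdot v$ for a nonsymmetric matrix $M$; the matrix, the right-hand side `w`, the constant `ALPHA`, and all function names are illustrative assumptions and do not come from the text. For this choice $a(z, z) = 2\|z\|^2$, so the form is elliptic with $\alpha = 2$, and the solution should satisfy $\alpha \|u\| \leq \|w\|$, the bound obtained in the proof of Lemma 3.

```python
from fractions import Fraction

ALPHA = 2  # ellipticity constant of the illustrative form below (assumption)

def a(u, v):
    """Illustrative bilinear form a(u, v) = (M u) . v with M = [[2, 1], [-1, 2]]."""
    Mu = (2 * u[0] + u[1], -u[0] + 2 * u[1])
    return Mu[0] * v[0] + Mu[1] * v[1]

def solve(w):
    """Solve a(u, v) = (w, v) for all v, that is M u = w, by Cramer's rule."""
    det = Fraction(2 * 2 - 1 * (-1))
    return ((2 * w[0] - 1 * w[1]) / det, (2 * w[1] + 1 * w[0]) / det)

w = (1, 3)
u = solve(w)

# By linearity in v it is enough to check a(u, v) = (w, v) on a basis of R^2.
residuals = [a(u, e) - (w[0] * e[0] + w[1] * e[1]) for e in [(1, 0), (0, 1)]]
norm_u_sq = u[0] ** 2 + u[1] ** 2
norm_w_sq = w[0] ** 2 + w[1] ** 2
print(u, residuals, norm_u_sq, norm_w_sq)
```

Exact rational arithmetic is used so that the residuals vanish identically rather than up to rounding.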

5.6 Bibliographical remarks


Operators from one set to another are usually known as functions when the
sets have no structure. Much of the material covered in Section 5.1 falls
into this category, and can usually be found in texts that discuss set theory.
Good accounts of the material in Section 5.1 may be found in Naylor and
Sell [33], Oden [36], Kreyszig [27], and Roman [42].
Section 5.2 makes use of both algebraic and topological properties of
sets. The definition of a linear operator requires only the algebraic notion
of a linear space, whereas the definition of a continuous operator obviously
needs a normed space. The above-mentioned texts by Naylor and Sell,
Oden, and Kreyszig are good references for further reading, as are Volume 2
of the pair of texts by Roman [43], and Zeidler [54]. The same applies to
the material of Sections 5.3 and 5.4; these texts are all good references. A
detailed account, including proofs, of the equivalence of $[L^p(\Omega)]'$ and $L^q(\Omega)$
may be found in the book by Hewitt and Stromberg [19].
A good source for discussions of bilinear forms, including the Lax-Milgram
Theorem, is Rektorys [41]. This theorem has been generalized to the case
of a bilinear form $B : H \times Y \to \mathbb{R}$, where $H$ and $Y$ are distinct Hilbert
spaces, by Babuška (see Babuška and Aziz [3] for this result); an account
of this generalization is also given by Oden [36].

5.7 Exercises
Operators

5.1. Describe the range and null space of the following operators.

(a) $M : (-1, 1) \to \mathbb{R}^2$, $M(x) = (x, \sqrt{1 - x^2})$.

(b) $K : L^2(0, 1) \to \mathbb{R}$, $Ku = \int_0^1 [u(x)]^2 \, dx$.

(c) $f : (0, \pi/2) \to \mathbb{R}$, $f(x) = \tan x$.

5.2. Find the null space of the matrix $S$ and of the matrix $T$. [Matrix entries illegible in the source.]

5.3. Which of the following operators is one-to-one? surjective?

(a) $K : C[0, 1] \to C[0, 1]$, $Ku = \int_0^x u(y) \, dy$.

(b) $T : \mathbb{R}^2 \to \mathbb{R}^2$, $T(\mathbf{x}) = (y, x)$.

5.4. The operator $f : M \to \mathbb{C}$ is defined by $f(z) = z^2$, where $M$ is the
subset of $\mathbb{C}$ defined by $M = \{z = x + iy \in \mathbb{C} : xy \geq 1, \; x > 0, \; y > 0\}$.
Sketch the domain of $f$ and show that the image of the curve $xy = 1$
under the mapping $f$ is the line $\operatorname{Im} f(z) = 2$. Hence illustrate the
range of $f$.
5.5. Describe the compositions $ST$ and $TS$ for the operators
(a) $T : \mathbb{R}^2 \to \mathbb{R}^2$, $S : \mathbb{R}^2 \to \mathbb{R}^2$; $T(\mathbf{x}) = (x, -y)$, and $S(\mathbf{x}) = (2y, x)$.
(b) $T : \mathbb{R} \to \mathbb{R}$, $T(x) = \sin x$ and $S : \mathbb{R} \to \mathbb{R}$, $S(x) = x^2 - 1$.
5.6. Suppose that $S : U \to V$ and $T : V \to W$ are invertible operators.
Show that $TS$ is invertible and that $(TS)^{-1} = S^{-1} T^{-1}$.
Linear operators, bounded, and continuous operators
5.7. Which of the following are linear operators?

(a) $T : L^2(-1, 1) \to L^2(-1, 1)$, $Tu = \int_{-1}^{1} K(x, y) u(y) \, dy$;

(b) $T : C^1[a, b] \to C[a, b]$, $Tu = x^2 \, \partial u / \partial x + 2u$;
(c) $M : \mathbb{R}^2 \to \mathbb{R}$, $M(\mathbf{x}) = xy$.
5.8. An operator $T : \mathbb{R}^n \to \mathbb{R}^n$ is called an affine transformation if
$T\mathbf{x} = A\mathbf{x} + \mathbf{b}$, where $A$ is an $n \times n$ matrix and $\mathbf{b}$ an $n \times 1$ vector.
Find the affine transformation in $\mathbb{R}^2$ that takes the triangle with
vertices at $(0, 0)$, $(0, 1)$, and $(1, 0)$ to the triangle with vertices at
$(4, 5)$, $(-1, 2)$, $(3, 0)$.
5.9. If $T : U \to V$ is an invertible linear operator, where $U$ and $V$ are
vector spaces, show that $T^{-1}$ is linear.
5.10. If $d(\cdot, B)$ in Exercise 3.18 is regarded as an operator on $\mathbb{R}^n$, is this a
linear operator? What is its null space?
5.11. Show that the norm of a bounded linear operator can equivalently be
defined by

$\|T\| = \sup\{\|Tu\| : \|u\| = 1\}$

or by

$\|T\| = \sup\{\|Tu\| : \|u\| \leq 1\}$.

5.12. Let $X$ be the space $\mathbb{R}^n$ with the norm $\|\mathbf{x}\|_\infty = \max_{1 \leq j \leq n} |x_j|$. If
$A : X \to X$ is a linear operator represented by an $n \times n$ matrix, show
that

$\|A\|_\infty = \max_i \sum_{j=1}^{n} |A_{ij}|$.

Determine $\|A\|_\infty$ for the given matrix $A$. [Matrix entries illegible in the source.]

5.13. The $2 \times 2$ matrix $A$ has elements $A_{11} = A_{22} = a$, $A_{12} = A_{21} = b$,
with $a > 0$ and $b > 0$. Show that $\|A\|_2 = a + b$.
5.14. Show that the identity operator $I : X \to X$ is continuous, where $X$
is any normed space. If $V$ is the normed space $C^1[a, b]$ with the sup-norm,
and $W$ is the space $C^1[a, b]$ with the norm $\|u\|_W = \|u\|_\infty + \|u'\|_\infty$,
produce an example to show that $I : V \to W$ is not continuous.
5.15. If $T : U \to V$ and $S : V \to W$ are bounded linear operators, show
that $ST : U \to W$ is bounded with $\|ST\| \leq \|S\| \|T\|$.
5.16. Show that the null space $N(T)$ of a linear operator $T : U \to V$ is
closed if $T$ is a bounded linear operator.
5.17. An operator $T : U \to V$ is bounded below if there exists a constant $K > 0$
such that

$\|Tu\|_V \geq K \|u\|_U$, $u \in U$.

If $T$ is a bounded below linear operator, show that $T$ is one-to-one,
and that $T^{-1} : R(T) \to U$ is a bounded operator.

5.18. Show that the operator $D : C_0^1[0, 1] \to C[0, 1]$, $Du = du/dx$ is
bounded below, where $C_0^1$ is the space of functions in $C^1$ that are
zero at $x = 0$ and $x = 1$. Use the sup-norm. [Hint: consider $u(x) =
\int_0^x u'(y) \, dy$.]
Projections
5.19. If P : U -> U is a projection operator, show that (I - P) is also a
projection. How are R(I - P) and N(I - P) related to R(P) and
N(P)?

5.20. Show that $\|P\| = 1$ if $P$ is an orthogonal projection.

5.21. Give an example of a nonlinear operator $P$ that satisfies $P^2 = P$.
5.22. Show that $N(P) = R(P)^\perp$ if $P$ is an orthogonal projection on an
inner product space.
5.23. Let T be the transformation defined by

if lxi< 1,
Tu= {
°
u(x)
otherwise.

Show that T is an orthogonal projection. What are the range and


null space of T?
5.24. Show that the operator $P : L^2(-1, 1) \to L^2(-1, 1)$ defined by

$Pu(x) = \frac{1}{2} \int_{-1}^{1} e^{i(x - y)} u(y) \, dy$

is a projection. Is $P$ an orthogonal projection?


Linear functionals and the Riesz Representation Theorem


5.25. Let $A$ be a positive-definite symmetric $n \times n$ matrix; that is, $\mathbf{x}^T A \mathbf{x} > 0$
for all nonzero vectors $\mathbf{x}$. Then the space $\mathbb{R}^n$ is a Hilbert space when
endowed with the inner product

$(\mathbf{x}, \mathbf{y}) = \sum_{i,j=1}^{n} A_{ij} x_i y_j$.

Given a functional $\ell : \mathbb{R}^n \to \mathbb{R}$, find the element $\mathbf{x}$ such that
$\langle \ell, \mathbf{y} \rangle = (\mathbf{x}, \mathbf{y})$ when $\ell$ is defined by:

(a) $\langle \ell, \mathbf{y} \rangle = y_1 + y_2 + \cdots + y_n$;

(b) $\langle \ell, \mathbf{y} \rangle = y_1$.
5.26. For each $f \in L^2(0, 1)$ let $u(x)$ be the solution of $u'' + u' - 2u = f$
with $u(0) = u(1) = 0$. Define the functional $\ell$ by

$\ell : L^2(0, 1) \to \mathbb{R}$, $\langle \ell, f \rangle = \int_0^1 u(x) \, dx$.

Show that $\ell$ is a bounded linear functional, and find the function $u$,
the value of $\langle \ell, f \rangle$, and the element $g$ such that $\langle \ell, f \rangle = (g, f)$, when
$f(x) = 2x$.
5.27. Repeat Exercise 5.26 for the differential equation $u'' - 2u' + u = f$.
5.28. If $X$ is a normed space (not necessarily complete), prove that $X'$ is
a Banach space.
5.29. Where in the proof of the Riesz Representation Theorem is the
completeness of the Hilbert space $H$ first used, and where is it used
subsequently?
5.30. Complete the proof of Theorem 11 by showing that $u$ is unique and
that $\|\ell\| = \|u\|$.
5.31. For any $p$ and $q$ such that $1 \leq p < \infty$ and $1/p + 1/q = 1$, let $g$ be a
function in $L^q$ and define a functional $\ell_g$ on $L^p$ according to

$\langle \ell_g, f \rangle = \int_\Omega f g \, dx$ for all $f \in L^p$.

Show that $\|\ell_g\| = \|g\|$. [Hint: in the prelude to Theorem 12 it is shown
that $\|\ell_g\| \leq \|g\|$; choose $f = |g|^{q-1} \operatorname{sgn} g$, show that $|f|^p = |g|^q$, and
hence that $\|\ell_g\| \geq \|g\|$.]
5.32. If $Y$ is a dense subset of a normed space $X$ and $\ell$ is a member of $X'$,
show that
$\langle \ell, v \rangle = 0$ for all $v$ in $Y$ implies that $\ell = 0$.

Bilinear forms and the Lax-Milgram Theorem

5.33. Show that the constant $K$ in Example 29 can be improved upon, in
that $K = \max(1, \kappa_1)$.

5.34. If $a : X \times X \to \mathbb{R}$ is a continuous bilinear form on an inner product
space $X$, show that

$\lim_{n \to \infty} a(u_n, v_n) = a(u, v)$

if $u_n \to u$ and $v_n \to v$.

5.35. Let $\ell : H_0^1(0, 1) \to \mathbb{R}$ and $a : H_0^1(0, 1) \times H_0^1(0, 1) \to \mathbb{R}$ be defined by

$\langle \ell, v \rangle = \int_0^1 (-1 - 4x) v \, dx$, $a(u, v) = \int_0^1 (x + 1) u' v' \, dx$,

where

$H_0^1(0, 1) = \{v \in L^2(0, 1) : v' \in L^2(0, 1), \; v(0) = v(1) = 0\}$;

this is a Hilbert space (see Chapter 7) with the inner product

$(u, v)_{H^1} = \int_0^1 (uv + u'v') \, dx = (u, v)_{L^2} + (u', v')_{L^2}$.

Show that $\ell$ is continuous, that $a$ is continuous and $H_0^1$-elliptic, and
verify that the unique element $u$ satisfying

$a(u, v) = \langle \ell, v \rangle$ for all $v \in H_0^1(0, 1)$

is $u(x) = x^2 - x$. [Hint: it may be necessary to use integration by parts.
You may assume that a constant $C > 0$ exists such that $\|v\|_{L^2} \leq
C \|v'\|_{L^2}$.]
5.36. Let $a : H \times H \to \mathbb{R}$ be a continuous, $H$-elliptic bilinear form, and
define the bilinear form $\tilde{a} : H \times H \to \mathbb{R}$ by

$\tilde{a}(u, v) = a(u, v) + (u, \kappa v)_{L^2}$.

If $H = H_0^1(0, 1)$ and $\kappa(x)$ is continuous and satisfies $0 < \kappa_1 \leq \kappa(x) \leq
\kappa_2$ for some constants $\kappa_1$ and $\kappa_2$, show that $\tilde{a}$ is continuous and
$H$-elliptic.
6
Orthonormal bases and Fourier series

In vector algebra it is often the case that computations are carried out using
the components of vectors. A set of three mutually orthogonal unit vectors
$\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ is selected as a basis, and every vector $\mathbf{a}$ can then be written
as $\mathbf{a} = \alpha \mathbf{i} + \beta \mathbf{j} + \gamma \mathbf{k}$, the coefficients $\alpha, \beta, \gamma$ being the components of $\mathbf{a}$
relative to the chosen basis. In Section 6.1 we start the process of extending
this notion to vector spaces in general, by introducing finite-dimensional
vector spaces. In Section 6.2 the vector space is endowed with an inner
product or a norm, and this in turn permits the investigation of various
properties that such inner product or normed spaces have by virtue of their
being finite-dimensional. Section 6.3 is devoted to an examination of linear
operators acting on finite-dimensional spaces; these are always continuous,
and they also inherit in general the simple nature of their domains.
These concepts are extended to infinite-dimensional spaces in Section 6.4;
if the space concerned is a Hilbert space, then the idea of an orthonormal
basis carries over in a natural way from the finite-dimensional situation.
The question of how one generates bases in infinite-dimensional spaces is
partially answered by considering Sturm-Liouville problems, the topic of
Section 6.5; these eigenvalue problems have a number of interesting properties,
the most relevant of which is that their eigenfunctions form orthonormal
bases in $L^2$.

6.1 Finite-dimensional spaces
In this section we discuss vector spaces that have the property that every
member can be expressed as a finite sum of multiples (that is, a linear
combination) of a selected subset of members of that space. The motivation for
endowing vector spaces with this property once again comes from elementary
vector algebra; every vector in three dimensions can be represented as
a sum of multiples of three noncoplanar vectors.

Linear combination. Let $X$ be a linear space and $\mathbb{K}$ the set of real or
complex numbers. Let $\{u_1, u_2, \ldots, u_n\}$ be a set of elements in $X$. The
expression

$\alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n$,

where $\alpha_1, \ldots, \alpha_n \in \mathbb{K}$, is said to be a linear combination of the elements
$u_1, \ldots, u_n$. Note that $\alpha_1 u_1 + \cdots + \alpha_n u_n \in X$.
In this section attention is restricted to finite linear combinations; as
long as we do this, the theory that arises is purely algebraic. Infinite linear
combinations of the form $\sum_{i=1}^{\infty} \alpha_i u_i$ require topological tools for their
treatment and the discussion of this situation is postponed to Section 6.4.

Linear dependence, independence. Let $X$ be a linear space, and let
$\{u_1, \ldots, u_n\}$ be a finite set of elements of $X$. Then this set is linearly
dependent if there exist numbers $\alpha_1, \alpha_2, \ldots, \alpha_n$ in $\mathbb{K}$, not all of which are zero,
such that

$\alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_n u_n = 0$. (6.1)

The set $\{u_1, \ldots, u_n\}$ is linearly independent if (6.1) holds only when all of
the $\alpha_i$ are zero. In other words, a set is linearly dependent if one of its
elements can be written as a linear combination of the others; for if $\alpha_k$ is
nonzero, then (6.1) may be rewritten in the form

$u_k = -(1/\alpha_k)[\alpha_1 u_1 + \cdots + \alpha_{k-1} u_{k-1} + \alpha_{k+1} u_{k+1} + \cdots + \alpha_n u_n]$;

for a linearly independent set this is not possible.

Examples
1. Let $X = \mathbb{R}^2$, and consider the vectors $a_1 = (2, 1)$ and $a_2 = (1, 2)$. To
test for linear dependence, consider the linear combination

$\alpha_1 a_1 + \alpha_2 a_2 = (2\alpha_1 + \alpha_2) e_1 + (\alpha_1 + 2\alpha_2) e_2 = 0$,

where $e_1 = (1, 0)$ and $e_2 = (0, 1)$. Accordingly, we must have

$2\alpha_1 + \alpha_2 = 0$, $\alpha_1 + 2\alpha_2 = 0$.

FIGURE 6.1. The vectors in Example 1

The only possible solution to these two equations is $\alpha_1 = \alpha_2 = 0$, and
so $a_1$ and $a_2$ are linearly independent. Graphically this is easy to see
(Figure 6.1), in that it is not possible to express $a_2$ as a multiple of
$a_1$.
Now suppose that we also have the vector $a_3$ as shown in Figure 6.1.
This set is linearly dependent since, whatever the length and direction
of $a_3$, it is always possible to express it in the form $a_3 = \beta_1 a_1 + \beta_2 a_2$
for some $\beta_1, \beta_2$. Hence there exist scalars $\beta_1$, $\beta_2$, and $\beta_3 = -1$ such
that $\beta_1 a_1 + \beta_2 a_2 + \beta_3 a_3 = 0$.
2. Let $X = L^2(0, 1)$ and consider the functions $u_k$ ($k = 1, 2, 3$) defined
by $u_1(x) = \cosh x$, $u_2(x) = \sinh x$, $u_3(x) = e^x$. Then the equation

$\sum_{i=1}^{3} \alpha_i u_i = 0$ or $\alpha_1 \cosh x + \alpha_2 \sinh x + \alpha_3 e^x = 0$

is satisfied for any nonzero $\alpha_i$ that are related to each other by $\alpha_1 =
\alpha_2$, $\alpha_3 = -\alpha_1$, and so the set is linearly dependent.

Basis, dimension. A finite set $\{u_1, \ldots, u_n\}$ of elements of a vector space
$X$ is said to span $X$ if every $u \in X$ can be written in the form $u =
\alpha_1 u_1 + \cdots + \alpha_n u_n$ for some numbers $\alpha_i$, $i = 1, \ldots, n$ in $\mathbb{K}$. A set $\{u_1, \ldots, u_n\}$
of elements of $X$ is said to be a basis of $X$ if and only if

(i) the set spans $X$, and

(ii) the set $\{u_1, \ldots, u_n\}$ is linearly independent.

The number of elements that form a basis is called the dimension of $X$.
We write $\dim X$ for the dimension of $X$. If $\{u_i\}_{i=1}^{n}$ is a basis for a vector
space $X$ and

$u = \sum_{i=1}^{n} \alpha_i u_i$,

then $\alpha_i$ ($i = 1, \ldots, n$) are called the components of $u$ relative to the basis
$\{u_i\}$. Note that the components change with a change of basis. We stress
the fact that the preceding definition applies only when $\{u_1, \ldots, u_n\}$ is a
finite set; exactly what is meant by an infinite-dimensional space becomes
clear later. We also note that, although the dimension of a space is fixed, it
is possible to construct many different bases. These points should become
clearer in the following examples.

Examples
3. Consider the space $\mathbb{R}^3$: the set $\{e_i\}_{i=1}^{3} = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$
is linearly independent and also spans $\mathbb{R}^3$; hence $\{e_i\}_{i=1}^{3}$ is a basis
for $\mathbb{R}^3$ and $\dim \mathbb{R}^3 = 3$. Consider the point $x = 2e_1 + 3e_3$; this
has components $(2, 0, 3)$ relative to the basis $\{e_i\}$. But if we choose
instead the basis $\{f_i\}_{i=1}^{3}$ defined by $f_1 = e_1 + e_2 + 2e_3$, $f_2 = e_1 -
e_2 + e_3$, $f_3 = 2e_1 + e_2$, then the components of $x$ relative to this
basis are found from the fact that

$x = f_1 + f_2$,

so that $x$ has components $(1, 1, 0)$ with respect to the basis $\{f_i\}$.

4. Consider the space $P_3[0, 1]$ of polynomials of degree at most 3 defined
on the interval $[0, 1]$. Set $p_k(x) = x^k$, $k = 0, \ldots, 3$. Then $\{p_k\}_{k=0}^{3}$ is
linearly independent since

$\sum_{i=0}^{3} \alpha_i p_i = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \alpha_3 x^3 = 0$

holds only if all the $\alpha_i$ are zero. Furthermore, every polynomial in
$P_3[0, 1]$ can be expressed in the form

$p(x) = \sum_{i=0}^{3} \alpha_i p_i = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \alpha_3 x^3$,

so that $\{p_i\}_{i=0}^{3}$ spans the space. Thus $E \equiv \{p_i\}_{i=0}^{3}$ forms a basis
for $P_3[0, 1]$ and $\dim P_3[0, 1] = 4$. The components of the polynomial
$p(x) = 2x - x^2 + x^3$ relative to the basis $E$ are $\{\alpha_i\} = \{0, 2, -1, 1\}$.
But relative to the basis $F = \{(1 - x), (1 + x), x^2, x^3\}$ the components
of $p$ are easily shown to be $\{-1, 1, -1, 1\}$ so that

$p(x) = -1 \cdot (1 - x) + 1 \cdot (1 + x) - x^2 + x^3$.
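
Finding components relative to a new basis, as in Examples 3 and 4, amounts to solving a small linear system. The following is a minimal sketch under the assumption that every vector (or polynomial) is stored as a tuple of its components relative to the standard basis; the helper names `solve` and `components` are illustrative and not from the text.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a square, nonsingular linear system exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Raises StopIteration if the matrix is singular (the vectors are not a basis).
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def components(basis, x):
    """Components of x relative to `basis`, all given in standard coordinates."""
    n = len(basis)
    # Column j of the system matrix holds the standard coordinates of basis[j].
    matrix = [[basis[j][i] for j in range(n)] for i in range(n)]
    return solve(matrix, x)

# Example 3: the basis {f1, f2, f3} of R^3 and the point x = 2 e1 + 3 e3.
f = [(1, 1, 2), (1, -1, 1), (2, 1, 0)]
print(components(f, (2, 0, 3)))

# Example 4: polynomials stored as coefficient tuples (1, x, x^2, x^3).
F = [(1, -1, 0, 0), (1, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
print(components(F, (0, 2, -1, 1)))
```

The same routine serves both examples because, once a reference basis is fixed, a polynomial in $P_3[0, 1]$ is handled exactly like a vector in $\mathbb{R}^4$.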

The following theorem describes an obvious but important property of
finite-dimensional spaces.

THEOREM 1. Let $X$ be a finite-dimensional linear space with $\dim X = n$.
Then any subset of $X$ containing more than $n$ members is linearly dependent.

PROOF. Let $B = \{v_1, v_2, \ldots, v_n\}$ be a basis for $X$, and let $S = \{u_1, \ldots,
u_n, u_{n+1}, \ldots, u_{n+k}\}$ be any set of $(n + k)$ elements in $X$. Then by definition
there are scalars $A_{ij}$ such that

$u_i = \sum_{j=1}^{n} A_{ij} v_j$, $i = 1, \ldots, n + k$.

For any set of scalars $\beta_1, \ldots, \beta_{n+k}$ we have

$\sum_{i=1}^{n+k} \beta_i u_i = \sum_{i=1}^{n+k} \beta_i \sum_{j=1}^{n} A_{ij} v_j$.

Thus if $\beta_1 u_1 + \cdots + \beta_{n+k} u_{n+k} = 0$, then we must have

$\sum_{j=1}^{n} \gamma_j v_j = 0$, where $\gamma_j = \sum_{i=1}^{n+k} A_{ij} \beta_i$;

but since $\{v_j\}$ is linearly independent, this implies that $\gamma_j = 0$, or

$\sum_{i=1}^{n+k} A_{ij} \beta_i = 0$, or $A^t \boldsymbol{\beta} = 0$,

where $A$ is an $(n + k) \times n$ matrix and $\boldsymbol{\beta}$ an $(n + k) \times 1$ column vector.
From a standard result for sets of linear algebraic equations, every set of $n$
homogeneous (that is, right-hand side equal to 0) equations in $(n + k)$
unknowns has a nontrivial solution; hence there are scalars $\beta_1, \ldots, \beta_{n+k}$, not
all zero, such that $\sum_{i=1}^{n+k} \beta_i u_i = 0$, so that $\{u_i\}_{i=1}^{n+k}$ is linearly dependent. □

6.2 Finite-dimensional inner product and normed spaces
Concepts such as linear dependence of a set and finite dimension of a space
are algebraic: they require for their definition only the concept of a vector
space. But if the vector space happens also to be an inner product space,
it is possible to deduce a number of useful properties.

First, it is simple to check whether a set $\{u_i\}_{i=1}^{k}$ in an inner product space
is linearly dependent. Confining attention to real inner product spaces,
suppose that

$\alpha_1 u_1 + \cdots + \alpha_k u_k = 0$, $u_i \in X$, $\alpha_i \in \mathbb{R}$, $i = 1, \ldots, k$. (6.2)

Take the inner product of both sides of this equation with $u_1$ to obtain

$A_{11} \alpha_1 + A_{12} \alpha_2 + \cdots + A_{1k} \alpha_k = 0$,

where $A_{ij} = (u_i, u_j) = (u_j, u_i) = A_{ji}$. By successively taking the inner
product of (6.2) with each of the members $u_i$, we eventually find that

$\sum_{j=1}^{k} A_{ij} \alpha_j = 0$, $i = 1, \ldots, k$, or $A \boldsymbol{\alpha} = 0$, (6.3)

where $A$ is the symmetric matrix with entries $A_{ij}$ and $\boldsymbol{\alpha}$ is the column
vector $[\alpha_1, \ldots, \alpha_k]^t$. Now a necessary and sufficient condition for (6.3) to
have a nontrivial solution is that $\det A = 0$; hence, the set $\{u_1, \ldots, u_k\}$ is
linearly dependent if and only if $\det A = 0$.
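
The determinant test can be tried out directly on vectors in $\mathbb{R}^n$. The following is a minimal sketch, assuming the Euclidean inner product; the function names and the particular third vector `a3` are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def dot(a, b):
    return sum(Fraction(x) * Fraction(y) for x, y in zip(a, b))

def gram_matrix(vectors):
    """Matrix with entries A_ij = (u_i, u_j)."""
    return [[dot(u, v) for v in vectors] for u in vectors]

def det(m):
    """Determinant by Gaussian elimination in exact Fraction arithmetic."""
    m = [row[:] for row in m]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def is_linearly_dependent(vectors):
    return det(gram_matrix(vectors)) == 0

a1, a2 = (2, 1), (1, 2)   # the vectors of Example 1
a3 = (3, 3)               # hypothetical third vector, here a1 + a2
print(det(gram_matrix([a1, a2])))
print(is_linearly_dependent([a1, a2, a3]))
```

Exact fractions are used because a floating-point determinant that is merely close to zero would make the dependence test unreliable.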

Examples

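5. Let $X = \mathbb{R}^2$ with $a_1$, $a_2$, and $a_3$ as in Example 1. Then with $(a, b) \equiv
a \cdot b$, $A_{ij} = a_i \cdot a_j$, $a_3 = \beta_1 a_1 + \beta_2 a_2 = (2\beta_1 + \beta_2, \beta_1 + 2\beta_2)$, and

$\det A = \det \begin{pmatrix} 5 & 4 & 5\beta_1 + 4\beta_2 \\ 4 & 5 & 4\beta_1 + 5\beta_2 \\ 5\beta_1 + 4\beta_2 & 4\beta_1 + 5\beta_2 & 5\beta_1^2 + 8\beta_1\beta_2 + 5\beta_2^2 \end{pmatrix}$,

which is easily shown to be identically zero for any values of $\beta_1$ and
$\beta_2$ (the third row is $\beta_1$ times the first plus $\beta_2$ times the second).
Hence the set $\{a_1, a_2, a_3\}$ is linearly dependent.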

6. The functions $u_1 = 1$, $u_2 = x$, $u_3 = x^2$ are linearly independent in
$L^2(-1, 1)$ since

$\det A = \det \begin{pmatrix} 2 & 0 & 2/3 \\ 0 & 2/3 & 0 \\ 2/3 & 0 & 2/5 \end{pmatrix} \neq 0$.

Orthonormal sets and bases. If $X$ is an inner product space, a set
$\{\phi_1, \ldots, \phi_k, \ldots\}$ of elements in $X$ is said to be an orthonormal set if the
elements are mutually orthogonal and have unit length; that is,

$(\phi_i, \phi_j) = 1$ if $i = j$, and $0$ otherwise. (6.4)

Any orthonormal set is linearly independent. To see this, consider

$\alpha_1 \phi_1 + \alpha_2 \phi_2 + \cdots + \alpha_k \phi_k = 0$;

now take the inner product with $\phi_1$ to obtain $\alpha_1 \cdot 1 + 0 + \cdots + 0 = 0$. Thus
$\alpha_1 = 0$. In the same way, by taking the inner product with each $\phi_k$ in turn,
we find that all the $\alpha_k$ are zero.
Now suppose that $X$ is a finite-dimensional inner product space with
$\dim X = n$. Then a basis $\{\phi_1, \ldots, \phi_n\}$ of $X$ whose elements satisfy (6.4) is
said to be an orthonormal basis.

Examples

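7. The set $\{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$ forms an orthonormal basis for $\mathbb{R}^3$.
8. Consider the space $L^2(-1, 1)$. The infinite set

$\{\phi_k : \phi_k(x) = \sin k\pi x, \; k = 1, 2, \ldots\}$

is an orthonormal set since

$(\phi_k, \phi_l) = \int_{-1}^{1} \sin k\pi x \, \sin l\pi x \, dx = 1$ if $k = l$, and $0$ otherwise.

But $L^2(-1, 1)$ is not finite-dimensional, so talk of an orthonormal
basis is premature at this stage.

9. Consider again $L^2(-1, 1)$, but this time as a space of complex-valued
functions. The set $\{u_0, u_1, \ldots\}$ in which $u_k(x) = (1/\sqrt{2}) e^{ik\pi x}$ is an
orthonormal set since

$(u_k, u_l) = \frac{1}{2} \int_{-1}^{1} e^{ik\pi x} \, \overline{e^{il\pi x}} \, dx = \frac{1}{2} \int_{-1}^{1} e^{i(k - l)\pi x} \, dx$,

which is 0 if $k \neq l$, and equal to 1 if $k = l$.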

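One of the main advantages of orthonormal bases over other bases is
that computations involving the former are much simpler. For example, if
$\{\phi_i\}_{i=1}^{n}$ is an orthonormal basis for a real inner product space $X$ and $u, v \in X$
have components $u_i$ and $v_i$ relative to this basis, then

$(u, v) = \left( \sum_{i=1}^{n} u_i \phi_i, \sum_{j=1}^{n} v_j \phi_j \right) = \sum_{i=1}^{n} u_i v_i$.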
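
This simplification is easy to check numerically. The following is a minimal sketch, assuming $X = \mathbb{R}^2$ with the Euclidean inner product and an orthonormal basis obtained by rotating the standard one; the rotation angle and the vectors `u` and `v` are arbitrary illustrative choices.

```python
import math

def inner(u, v):
    return sum(x * y for x, y in zip(u, v))

c, s = math.cos(0.3), math.sin(0.3)   # arbitrary rotation angle (assumption)
phi = [(c, s), (-s, c)]               # an orthonormal basis of R^2

u, v = (2.0, -1.0), (0.5, 4.0)
u_comp = [inner(u, p) for p in phi]   # components u_i = (u, phi_i)
v_comp = [inner(v, p) for p in phi]   # components v_i = (v, phi_i)

direct = inner(u, v)
via_components = sum(a * b for a, b in zip(u_comp, v_comp))
print(direct, via_components)
```

Both numbers agree up to rounding, whichever orthonormal basis is used to compute the components.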

Gram-Schmidt orthonormalization. Given any nonorthonormal basis
$S = \{\psi_i\}_{i=1}^{n}$, it is possible to construct from $S$ an orthonormal basis
$E = \{\phi_i\}_{i=1}^{n}$, using the Gram-Schmidt orthonormalization procedure. We
illustrate the procedure for vectors in three dimensions and then generalize
from that.

FIGURE 6.2. The Gram-Schmidt orthonormalization process

Let $\{\psi_1, \psi_2, \psi_3\}$ be any basis for $\mathbb{R}^3$, and construct $\phi_1$ from

$\phi_1 = \psi_1 / \|\psi_1\|$.

Next, project $\psi_2$ onto the plane orthogonal to $\phi_1$ (Figure 6.2), using for
this purpose the projection operator $P_2$ defined by

$P_2 u = u - (u, \phi_1) \phi_1$.

Then set

$\phi_2 = P_2 \psi_2 / \|P_2 \psi_2\|$.

Finally, project $\psi_3$ onto the line orthogonal to both $\phi_1$ and $\phi_2$, using the
projection operator $P_3$ defined by

$P_3 u = u - (u, \phi_1) \phi_1 - (u, \phi_2) \phi_2$.

Then set

$\phi_3 = P_3 \psi_3 / \|P_3 \psi_3\|$.

The resulting basis $\{\phi_k\}_{k=1}^{3}$ is orthonormal.
Generally, the procedure may be summarized as follows. Given a basis
$\{\psi_i\}_{i=1}^{n}$, form an orthonormal basis $\{\phi_i\}_{i=1}^{n}$ from

$\phi_i = \dfrac{P_i \psi_i}{\|P_i \psi_i\|}$,

the projection operators $P_i$ being defined by

$P_1 u = u$, $P_i u = u - \sum_{k=1}^{i-1} (u, \phi_k) \phi_k$ for $i = 2, 3, \ldots, n$.
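
The general procedure translates almost line for line into code. The following is a minimal sketch for vectors in $\mathbb{R}^n$ with the Euclidean inner product, using floating-point arithmetic; the function names and the sample basis are illustrative assumptions, and no safeguard is included for input vectors that are linearly dependent.

```python
import math

def inner(u, v):
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt(psi):
    """Orthonormalize the linearly independent vectors psi[0], psi[1], ..."""
    phi = []
    for vec in psi:
        # P_i u = u - sum_k (u, phi_k) phi_k, applied to u = psi_i
        projected = list(vec)
        for p in phi:
            coeff = inner(vec, p)
            projected = [x - coeff * y for x, y in zip(projected, p)]
        norm = math.sqrt(inner(projected, projected))
        phi.append([x / norm for x in projected])
    return phi

basis = [(1.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]  # illustrative basis of R^3
phi = gram_schmidt(basis)
for row in phi:
    print(row)
```

This is the classical form of the procedure, in which every coefficient is computed from the original vector $\psi_i$ exactly as in the formula for $P_i$; for badly conditioned bases a numerically more robust variant would normally be preferred.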


k=1

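We conclude this section with an important result on the completeness of
finite-dimensional normed spaces. To prove this result the following lemma
is required.

LEMMA 1. Let $\{u_1, \ldots, u_n\}$ be a linearly independent set of members of a
normed space $X$. Then there is a constant $c > 0$ such that for every choice
of scalars $\alpha_1, \ldots, \alpha_n$,

$\|\alpha_1 u_1 + \cdots + \alpha_n u_n\| \geq c (|\alpha_1| + \cdots + |\alpha_n|)$.

THEOREM 2. Every finite-dimensional normed space $X$ is complete.

PROOF. Let $\{u_k\}_{k=1}^{\infty}$ be a Cauchy sequence in $X$; the aim then is to show
that $u_k \to u$ in $X$. Suppose that $\dim X = n$, and let $\{e_1, \ldots, e_n\}$ be any
basis for $X$. Then each $u_k$ can be expressed in the form

$u_k = \alpha_{k1} e_1 + \cdots + \alpha_{kn} e_n$,

where $\alpha_{ki}$ ($i = 1, \ldots, n$) are the components of $u_k$. Using Lemma 1 and the
fact that $\{u_k\}$ is Cauchy, it follows that for any given $\epsilon > 0$, there exists
$N$ such that

$\epsilon > \|u_k - u_l\| = \left\| \sum_{i=1}^{n} (\alpha_{ki} - \alpha_{li}) e_i \right\| \geq c \sum_{i=1}^{n} |\alpha_{ki} - \alpha_{li}|$

for $k, l > N$. Since

$|\alpha_{ki} - \alpha_{li}| \leq \sum_{i=1}^{n} |\alpha_{ki} - \alpha_{li}| < \dfrac{\epsilon}{c}$

we see that $\{\alpha_{ki}\}$ is a Cauchy sequence in $\mathbb{K}$ for each fixed $i$. Hence $\alpha_{ki}$
converges to an element $\alpha_i$, say. Now define

$u = \alpha_1 e_1 + \cdots + \alpha_n e_n$.

Then

$\|u_k - u\| = \left\| \sum_{i=1}^{n} (\alpha_{ki} - \alpha_i) e_i \right\| \leq \sum_{i=1}^{n} |\alpha_{ki} - \alpha_i| \|e_i\|$,

but $\alpha_{ki} \to \alpha_i$ for each $i$, so that $u_k \to u$ in $X$; hence $X$ is complete. □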

Weak and strong convergence. The notion of weak convergence was
introduced in Section 5.4, and the point was also made there that strong
convergence implies weak convergence of a sequence. The converse is generally
not true, except for the particular case in which the space is finite-dimensional.
This is the subject of the following theorem.

THEOREM 3. Let $\{u_k\}$ be a weakly convergent sequence in a normed space
$X$, with weak limit $u$. If $\dim X < \infty$, then $\{u_k\}$ converges strongly to $u$.

PROOF. Suppose that $\dim X = n$ and let $\{e_i\}_{i=1}^{n}$ be a basis for $X$. Then
we may express $u_k$ and $u$ in the form

$u_k = \sum_{i=1}^{n} \alpha_{ki} e_i$ and $u = \sum_{i=1}^{n} \alpha_i e_i$.

Now

$\langle \ell, u_k \rangle \to \langle \ell, u \rangle$ (6.5)

by assumption, for every $\ell \in X'$. Take in particular the $n$ functionals
$\ell_1, \ldots, \ell_n$ defined by

$\langle \ell_i, e_j \rangle = 1$ for $i = j$, and $0$ otherwise.

Then it follows that $\langle \ell_i, u_k \rangle = \alpha_{ki}$ and $\langle \ell_i, u \rangle = \alpha_i$, and (6.5) implies that
$\alpha_{ki} \to \alpha_i$ as $k \to \infty$, for each $i$. Thus

$\|u_k - u\| = \left\| \sum_{i=1}^{n} (\alpha_{ki} - \alpha_i) e_i \right\| \leq \sum_{i=1}^{n} |\alpha_{ki} - \alpha_i| \|e_i\| \to 0$

as $k \to \infty$. Thus $\lim_{k \to \infty} u_k = u$. □

6.3 Linear operators on finite-dimensional spaces

We turn now to the consideration of linear operators whose domains are
finite-dimensional spaces. As might be expected, the nature of such operators
is heavily influenced by the fact that their domains have finite
dimension. For example, it turns out that if $T$ is a linear operator on a
finite-dimensional normed space, then $T$ is always continuous, as the next
theorem shows.

THEOREM 4. Let $T : X \to Y$ be a linear operator, where $X$ and $Y$ are
normed spaces, and $X$ has finite dimension. Then $T$ is bounded, and hence
continuous.

PROOF. Let $\{e_1, \ldots, e_n\}$ be a basis for $X$; then any $u \in X$ has the
representation $u = \alpha_1 e_1 + \cdots + \alpha_n e_n$ for certain scalars $\alpha_1, \ldots, \alpha_n$, and so

$\|Tu\| = \|T(\alpha_1 e_1 + \cdots + \alpha_n e_n)\| = \|\alpha_1 Te_1 + \cdots + \alpha_n Te_n\|$
$\leq |\alpha_1| \|Te_1\| + \cdots + |\alpha_n| \|Te_n\|$
$\leq M (|\alpha_1| + \cdots + |\alpha_n|)$,

where $M = \max\{\|Te_1\|, \ldots, \|Te_n\|\}$. From Lemma 1 there is a constant
$c > 0$ such that

$\|u\| = \|\alpha_1 e_1 + \cdots + \alpha_n e_n\| \geq c (|\alpha_1| + \cdots + |\alpha_n|)$,

so that

$\|Tu\| \leq \dfrac{M}{c} \|u\|$.

Thus $T$ is bounded, hence continuous. □
There is a very simple relationship among the dimensions of the domain,
null space, and range of a linear operator when the operator acts on a
finite-dimensional space, as we now show.

THEOREM 5. Let $T : X \to Y$ be a linear operator with $\dim X = n$ and
$\dim N(T) = k \leq n$, where $N(T)$ is the null space of $T$. Then

(a) if $\{e_1, \ldots, e_k\}$ is a basis for $N(T)$ and $\{e_1, \ldots, e_k, e_{k+1}, \ldots, e_n\}$ is a
basis for $X$, then $\{Te_{k+1}, \ldots, Te_n\}$ is a basis for $R(T)$, the range of
$T$;

(b) $\dim N(T) + \dim R(T) = \dim X$.

PROOF. (a) The elements $Te_1, Te_2, \ldots, Te_n$ certainly span $R(T)$ since any
$v \in R(T)$ satisfies, for some $u \in X$,

$v = Tu = T\left( \sum_{i=1}^{n} \alpha_i e_i \right) = \sum_{i=1}^{n} \alpha_i Te_i$,

where $\alpha_i$ are the components of $u$ relative to the basis $\{e_i\}$. Since $e_1, \ldots, e_k$
are in $N(T)$ we have $Te_1 = \cdots = Te_k = 0$ so that $\{Te_{k+1}, \ldots, Te_n\}$ spans
$R(T)$. We show next that this set is linearly independent. Suppose that
there are scalars $\beta_{k+1}, \ldots, \beta_n$ such that

$\sum_{i=k+1}^{n} \beta_i (Te_i) = 0$;

by the linearity of $T$,

$T\left( \sum_{i=k+1}^{n} \beta_i e_i \right) = 0$

so that the sum $\sum_{i=k+1}^{n} \beta_i e_i$ belongs to $N(T)$, and may therefore be
represented in the form

$\sum_{i=k+1}^{n} \beta_i e_i = \sum_{j=1}^{k} \gamma_j e_j$

for some scalars $\gamma_1, \ldots, \gamma_k$. It follows that, if we set $\beta_1 = -\gamma_1, \ldots, \beta_k = -\gamma_k$,
then this expression may be rewritten in the form

$\sum_{i=1}^{n} \beta_i e_i = 0$.

But $\{e_1, \ldots, e_n\}$ is linearly independent, hence $\beta_1 = \cdots = \beta_n = 0$. So
$\{Te_{k+1}, \ldots, Te_n\}$ is linearly independent and, since it spans $R(T)$, it forms
a basis for $R(T)$. Part (b) is a trivial consequence of (a). □

For the special case in which $T : X \to Y$ with $\dim X = \dim Y = n$, we can
deduce from Theorem 5 the following.

COROLLARY TO THEOREM 5. Let $T : X \to Y$ be a linear operator with
$\dim X = \dim Y = n$. Then $N(T) = \{0\}$ if and only if $R(T) = Y$, and
when this is so, $T$ is one-to-one and surjective, with a unique inverse $T^{-1}$.
Together with Theorem 4, it follows that $T$ is an isomorphism of $X$ onto
$Y$.

Example

10. Let $X = \mathbb{R}^3$, $Y = \mathbb{R}^2$, and let $T : X \to Y$ be the matrix
$T = \begin{pmatrix} 1 & 0 & 2 \\ 3 & 4 & 2 \end{pmatrix}$. The null space of $T$ consists of all vectors $\mathbf{x}$ satisfying

$T\mathbf{x} = 0$ or $x_1 + 2x_3 = 0$, $3x_1 + 4x_2 + 2x_3 = 0$.

It is not difficult to see that $N(T)$ consists of all vectors of the
form $\mathbf{x} = \alpha(-2, 1, 1)$ so that $\dim N(T) = 1$. We should then have
$\dim R(T) = 3 - 1 = 2$; this is borne out by the fact that $\{\mathbf{x}, \mathbf{y}, \mathbf{z}\} =
\{(-2, 1, 1), (1, 2, 0), (1, 1, 1)\}$ forms a basis for $X$. Now $\mathbf{x}$ spans $N(T)$
so that $\{T\mathbf{y}, T\mathbf{z}\}$ forms a basis for $Y$, as is readily verified.
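
The verification asked for at the end of Example 10 can be carried out mechanically. The following is a minimal sketch using plain integer arithmetic; the helper name `apply` and the variable `det_range` are illustrative, while the matrix and the three vectors are those of the example.

```python
def apply(matrix, x):
    """Image of the vector x under the matrix, returned as a tuple."""
    return tuple(sum(a * b for a, b in zip(row, x)) for row in matrix)

T = [(1, 0, 2), (3, 4, 2)]
x, y, z = (-2, 1, 1), (1, 2, 0), (1, 1, 1)

Tx, Ty, Tz = apply(T, x), apply(T, y), apply(T, z)

# Ty and Tz are linearly independent in R^2 exactly when this
# 2 x 2 determinant is nonzero.
det_range = Ty[0] * Tz[1] - Ty[1] * Tz[0]
print(Tx, Ty, Tz, det_range)
```

Since `Tx` is the zero vector and the determinant is nonzero, the null space has dimension 1 and the range has dimension 2, in agreement with part (b) of Theorem 5.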

Isomorphisms. We have seen that finite-dimensional spaces all "look" the
same, in that their elements are uniquely described by the specification of
the components relative to a given basis. The situation is even simpler, as it
turns out: it is possible to set up a one-to-one correspondence between the
elements of any $n$-dimensional real inner product space $X$ and the elements
of $\mathbb{R}^n$ in such a way that the elements thus related have the same lengths.
More precisely, let $T : X \to \mathbb{R}^n$ be a linear operator from $X$ to $\mathbb{R}^n$; we
assert that it is possible to define an isometric isomorphism $T$ between
these two spaces: that is, $T$ is bounded, bijective, and if $Tu = v$, then
$\|u\| = \|Tu\| = \|v\|$ (see Section 5.2). So $X$ and $\mathbb{R}^n$ are, to all intents and
purposes, one and the same thing.

THEOREM 6. Let $X$ be any finite-dimensional inner product space with
$\dim X = n$. Then $X \cong \mathbb{R}^n$; that is, there exists an isometric isomorphism
from $X$ to $\mathbb{R}^n$.

The proof of this theorem is the subject of Exercise 6.15.

Representation of linear operators by matrices. An $m \times n$ matrix
is a linear operator from $\mathbb{R}^n$ to $\mathbb{R}^m$. It is natural to ask, then, whether
there exists any way in which a linear operator from one arbitrary
finite-dimensional space to another can be represented by a matrix. This is easily
done, as we now show.
Let $T : X \to Y$, where $\dim X = n$ and $\dim Y = m$. Let $\{e_1, \ldots, e_n\}$ and
$\{f_1, \ldots, f_m\}$ be bases for $X$ and $Y$, respectively. Then if $u$ is any member
of $X$ with image $v \in Y$ under the mapping $T$, there are scalars $\alpha_1, \ldots, \alpha_n$
and $\beta_1, \ldots, \beta_m$ such that $u$ and $v$ have the representations

$u = \alpha_1 e_1 + \cdots + \alpha_n e_n$, $v = \beta_1 f_1 + \cdots + \beta_m f_m$.

Since $Tu = v$ and $T$ is linear, it follows that

$\alpha_1 Te_1 + \cdots + \alpha_n Te_n = \beta_1 f_1 + \cdots + \beta_m f_m$. (6.6)

Now $Te_j$ is in $Y$ for $j = 1, \ldots, n$, and so it is possible to express $Te_j$ in the
form

$Te_j = \sum_{i=1}^{m} T_{ij} f_i$, (6.7)

where $T_{ij}$ are scalars. We form a matrix $\mathbf{T}$ with components $T_{ij}$; then $\mathbf{T}$ is
the matrix of $T$ relative to the bases $\{e_i\}$ and $\{f_j\}$, and (6.6) becomes

$\sum_{i=1}^{m} \sum_{j=1}^{n} T_{ij} \alpha_j f_i = \sum_{i=1}^{m} \beta_i f_i$

or

$\sum_{i=1}^{m} \left( \sum_{j=1}^{n} T_{ij} \alpha_j - \beta_i \right) f_i = 0$.

Since $f_1, \ldots, f_m$ form a linearly independent set we have

$\sum_{j=1}^{n} T_{ij} \alpha_j = \beta_i$ or $\mathbf{T} \boldsymbol{\alpha} = \boldsymbol{\beta}$, (6.8)

where $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_n)$ and $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_m)$. It follows that if the matrix
corresponding to a linear operator is known, then (6.8) can be used to find
the components of the image of any member of the domain of the operator.
Example

11. Let $X = P_2[0, 1]$ and $Y = P_1[0, 1]$, where $P_k[0, 1]$ is the set of
polynomials of degree at most $k$ on $[0, 1]$; $\dim P_k[0, 1] = k + 1$. Suppose
that we choose as bases for $X$ and $Y$

$\{e_1, e_2, e_3\}$ and $\{f_1, f_2\}$,

where $e_1 = f_1 = 1$, $e_2 = f_2 = x$, $e_3 = x^2$, and let $T$ be the derivative
operator $d/dx$; then

$Te_1 = 0$, $Te_2 = 1$, $Te_3 = 2x$.

The matrix $\mathbf{T}$ corresponding to $d/dx$ is found from (6.7):

$0 = T_{11} + T_{21} x$,
$1 = T_{12} + T_{22} x$,
$2x = T_{13} + T_{23} x$.

The elements $T_{ij}$ are found by equating coefficients of $x^0$ and $x^1$, and
are

$T_{11} = T_{21} = T_{22} = T_{13} = 0$, $T_{12} = 1$, $T_{23} = 2$,

so that $\mathbf{T} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}$.

Thus if we are given any polynomial $p$ in $P_2[0, 1]$, the coefficients $\beta_i$
of its derivative may be found from (6.8); if $p(x) = 5 - x + 3x^2$, for
example, then

$\boldsymbol{\beta} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} 5 \\ -1 \\ 3 \end{pmatrix} = \begin{pmatrix} -1 \\ 6 \end{pmatrix}$,

or $dp/dx = Tp = -1 + 6x$.
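
The construction of Example 11 extends to polynomials of any degree. The following is a minimal sketch, assuming the monomial bases $\{1, x, \ldots, x^n\}$ for $P_n$ and $\{1, x, \ldots, x^{n-1}\}$ for $P_{n-1}$; the function names `derivative_matrix` and `apply` are illustrative.

```python
def derivative_matrix(n):
    """Matrix of d/dx from P_n to P_(n-1) relative to the monomial bases."""
    # Column j holds the components of d/dx (x^j) = j x^(j-1).
    return [[j if j == i + 1 else 0 for j in range(n + 1)] for i in range(n)]

def apply(matrix, alpha):
    """Components beta = T alpha of the image, as in (6.8)."""
    return [sum(t * a for t, a in zip(row, alpha)) for row in matrix]

T = derivative_matrix(2)
beta = apply(T, [5, -1, 3])   # p(x) = 5 - x + 3x^2
print(T, beta)
```

For $n = 2$ the routine reproduces the $2 \times 3$ matrix found above, and applying it to the components of $p$ gives those of $dp/dx$.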

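Linear functionals. Linear functionals on finite-dimensional spaces have a
particularly simple structure; in fact, they inherit the finite-dimensionality
of their domain $X$, so that $\dim X' = \dim X$ is finite. To see this, let
$\{e_1, \ldots, e_n\}$ be a basis for the $n$-dimensional normed space $X$, and define a
total of $n$ linear functionals $\ell_1, \ldots, \ell_n$ on $X$ by

$\langle \ell_j, e_k \rangle = 1$ if $j = k$, and $0$ otherwise. (6.9)

We claim that the set $L = \{\ell_1, \ldots, \ell_n\}$ thus defined is a basis for $X'$; indeed,
$L$ is linearly independent since, if

$\alpha_1 \ell_1 + \cdots + \alpha_n \ell_n = 0$, (6.10)

then this implies that

$\sum_{i=1}^{n} \alpha_i \langle \ell_i, u \rangle = 0$ for all $u \in X$,

which in turn gives

$0 = \sum_{i=1}^{n} \alpha_i \langle \ell_i, e_j \rangle = \alpha_j$, $j = 1, \ldots, n$,

using (6.9). Thus (6.10) holds only if all $\alpha_j = 0$. Secondly, every $\ell \in X'$
has the unique representation

$\ell = \beta_1 \ell_1 + \cdots + \beta_n \ell_n$, where $\beta_i = \langle \ell, e_i \rangle$.

Indeed, for any $u = \alpha_1 e_1 + \cdots + \alpha_n e_n$ in $X$,

$\langle \ell, u \rangle = \langle \ell, \alpha_1 e_1 + \cdots + \alpha_n e_n \rangle = \sum_{i=1}^{n} \alpha_i \beta_i$.

On the other hand,

$\langle \ell_i, u \rangle = \langle \ell_i, \alpha_1 e_1 + \cdots + \alpha_n e_n \rangle = \alpha_i$.

Hence $\langle \ell, u \rangle = \sum_{i=1}^{n} \beta_i \langle \ell_i, u \rangle$ or

$\ell = \sum_{i=1}^{n} \beta_i \ell_i$,

as asserted. Thus $\{\ell_1, \ldots, \ell_n\}$ spans $X'$, so that $\dim X' = n$. It can be
shown, furthermore, that $X$ and $X'$ are isomorphic to each other. Of course,
if $X$ is an inner product space, then by virtue of its finite-dimensionality it
is a Hilbert space and thus $X$ and $X'$ are isometrically isomorphic according
to the Riesz Representation Theorem.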
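
The dual basis defined by (6.9) can be computed explicitly in a small case. The following is a minimal sketch, assuming $X = \mathbb{R}^2$ and representing each functional by a row of coefficients acting through the dot product, so that the dual basis consists of the rows of the inverse of the matrix whose columns are $e_1$ and $e_2$. The basis used and the functional $\langle \ell, y \rangle = y_1 + y_2$ are illustrative choices.

```python
from fractions import Fraction

def dual_basis_2d(e1, e2):
    """Rows r1, r2 with r_j . e_k = 1 if j == k and 0 otherwise."""
    det = Fraction(e1[0] * e2[1] - e2[0] * e1[1])
    r1 = (e2[1] / det, -e2[0] / det)
    r2 = (-e1[1] / det, e1[0] / det)
    return r1, r2

def pair(row, x):
    """Value of the functional represented by `row` at the vector x."""
    return row[0] * x[0] + row[1] * x[1]

e1, e2 = (2, 1), (1, 2)          # illustrative basis of R^2
l1, l2 = dual_basis_2d(e1, e2)

ell = (1, 1)                     # the functional <ell, y> = y1 + y2
betas = (pair(ell, e1), pair(ell, e2))   # beta_i = <ell, e_i>
recombined = tuple(betas[0] * a + betas[1] * b for a, b in zip(l1, l2))
print(l1, l2, betas, recombined)
```

Recombining the dual basis with the coefficients $\beta_i = \langle \ell, e_i \rangle$ recovers the original functional, which is the representation asserted in the text.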

6.4 Fourier se ries in Hilbert spaces


Our main aim in this section is to extend the idea of a basis to arbitrary
Hilbert spaces, including spaces of functions. Now, generally speaking these
spaces are not finite-dimensional; for example, it is not possible to find a
finite set of functions in $L^2(\Omega)$ that spans $L^2(\Omega)$. The best that can be
done is to construct an infinite sequence of functions with the property
that any member of the space can be approximated arbitrarily closely by
a finite linear combination of these functions, provided that a sufficiently
large number of functions is used. This leads to the idea of a basis consisting
of a countably infinite set. We work with such sets in inner product spaces
and, although not necessary, the resulting theory is rendered more tidy if
it is developed in the framework of orthonormal sets; these are sets of the
form $\{\phi_1, \phi_2, \ldots, \phi_k, \ldots\}$ for which

\[ (\phi_i, \phi_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases} \]

Maximal orthonormal set, basis. Let $X$ be any inner product space
and let $\Phi = \{\phi_i\}_{i=1}^{\infty}$ be an orthonormal set in $X$. Then we say that $\Phi$ is a
maximal orthonormal set in $X$ if there is no nonzero member $\phi$ of
$X$ that is orthogonal to all the $\phi_i$. That is, $\Phi$ is maximal if $(\phi, \phi_i) = 0$ for
all $i$ implies that $\phi = 0$; it is not possible to add to $\Phi$ a further nonzero
element that is orthogonal to all existing members of $\Phi$.
A maximal orthonormal set in a Hilbert space $H$ is called an orthonormal
basis for $H$.
It is clear that we are generalizing from the finite-dimensional case; indeed,
since every finite-dimensional inner product space is complete and
therefore a Hilbert space, the preceding definition of an orthonormal basis
is equivalent to that given in Section 6.2.

Example
12. Let $H = L^2(-1, 1)$ and consider the set $\Phi_1 = \{\sin \pi x, \sin 2\pi x, \ldots\}$.
$\Phi_1$ is an orthonormal set since

\[ (\phi_k, \phi_l) = \int_{-1}^{1} \sin k\pi x \, \sin l\pi x \, dx = \begin{cases} 1 & \text{if } k = l, \\ 0 & \text{if } k \neq l. \end{cases} \]

But $\Phi_1$ is not maximal; in particular, any even function $u_e(x)$ is
orthogonal to all $\phi_k$ since $\int_{-1}^{1} u_e(x) \sin k\pi x \, dx = 0$, $\sin k\pi x$ being an
odd function. But it can be shown that the set

\[ \Phi_2 = \{1/\sqrt{2}\} \cup \{\sin k\pi x, \cos k\pi x\}_{k=1}^{\infty} \]

is a maximal orthonormal set. Since $L^2(-1, 1)$ is complete, $\Phi_2$ is an
orthonormal basis.
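
The claims of Example 12 can be checked numerically for the first few members of the sets involved. This is a sketch under stated assumptions: the inner product is approximated by Gauss-Legendre quadrature, only the modes $k = 1, 2, 3$ are tested, and the helper name `inner` is hypothetical.

```python
import numpy as np

# Gauss-Legendre nodes and weights on [-1, 1]; 200 points is ample for these smooth integrands.
x, w = np.polynomial.legendre.leggauss(200)

def inner(f, g):
    """Approximate the L2(-1, 1) inner product of two real-valued functions."""
    return float(np.sum(w * f(x) * g(x)))

# First few members of Phi_2: the constant 1/sqrt(2), then sin(k pi x), cos(k pi x), k = 1, 2, 3.
basis = [lambda t: np.full_like(t, 1 / np.sqrt(2))]
for k in range(1, 4):
    basis.append(lambda t, k=k: np.sin(k * np.pi * t))
    basis.append(lambda t, k=k: np.cos(k * np.pi * t))

# Gram matrix of inner products; orthonormality means it equals the identity.
gram = np.array([[inner(f, g) for g in basis] for f in basis])

# An even function is orthogonal to every sin(k pi x), so Phi_1 alone cannot be maximal.
even = lambda t: t**2
sine_coeffs = [inner(even, lambda t, k=k: np.sin(k * np.pi * t)) for k in range(1, 4)]
```

A finite check of this kind illustrates orthonormality only; it says nothing about maximality of $\Phi_2$, which requires the argument referred to in the text.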

Let $\{\phi_k\}$ be an orthonormal set in an inner product space $X$. Then, for
any $u \in X$ the numbers $u_k = (u, \phi_k)$ are called the Fourier coefficients
of $u$ with respect to $\{\phi_k\}$. These are the infinite-dimensional counterparts
of the components $u_k$ of an element $u$ of a finite-dimensional space $X$. If
$\{\phi_1, \ldots, \phi_n\}$ is an orthonormal basis for an inner product space $X$ with
dimension $n$, then for any $u \in X$,

\[ u = \sum_{k=1}^{n} u_k \phi_k, \quad \text{where } u_k = (u, \phi_k). \]

Precisely under what conditions the expression

\[ u = \sum_{k=1}^{\infty} (u, \phi_k) \phi_k \tag{6.11} \]

is valid for an element $u$ of an infinite-dimensional inner product space $X$
is essentially the subject of this section. These conditions are discussed in
a moment, but first we must digress and make clear exactly what is meant
by an infinite sum of the form $\sum_{k=1}^{\infty} \alpha_k u_k$.
Suppose that $\{u_k\}$ is a sequence in an inner product space and $\{\alpha_k\}$ is
a sequence of real or complex numbers, and define the corresponding $n$th
partial sum $s_n$ by

\[ s_n = \sum_{k=1}^{n} \alpha_k u_k. \tag{6.12} \]

Now suppose that we generate $s_1, s_2, \ldots$ using (6.12); then the series
$\sum_{k=1}^{\infty} \alpha_k u_k$ is said to converge to an element $u$ if the sequence $\{s_n\}$ of
partial sums converges to $u$. That is, we write $u = \sum_{k=1}^{\infty} \alpha_k u_k$ if, given any
$\epsilon > 0$, it is possible to find a number $N$ such that

\[ \|s_n - u\| < \epsilon \quad \text{whenever } n > N \]

or, more briefly,

\[ \lim_{n \to \infty} \|s_n - u\| = 0. \]

It is in this sense that the expression (6.11) must be interpreted: any partial
sum $s_n$ is an approximation to $u$, and this approximation improves as $n$
increases.

The Best Approximation Theorem. We show next that the Fourier
coefficients of a function are indeed special numbers in the following sense.
Suppose that $v$ is an arbitrary member of the inner product space $X$,
and we wish to find the best possible approximation to $v$ in the finite-dimensional
subspace $\Phi$ spanned by $\{\phi_k\}_{k=1}^{n}$. The first question is, how
does one measure such an approximation, and the second is, what is that
best approximation?
To answer the first question, let $\bar{v}$ be the element in $\Phi$ that is "closest" to
$v$. A reasonable way of making this assertion mathematically is to interpret
this to mean that

\[ \|v - \bar{v}\| \le \|v - w\| \quad \text{for any } w \in \Phi. \]

This is of course precisely the topic of Theorem 7, Chapter 4, according
to which such an element $\bar{v}$ exists, and is in fact unique. Here the task is
one of characterizing $\bar{v}$, given that $\Phi$ is finite-dimensional. Now since any
member $w$ of $\Phi$ can be expressed in the form

\[ w = \sum_{k=1}^{n} c_k \phi_k, \]

it follows that the issue is one of determining what the coefficients $c_k$ must
be; it turns out that the coefficients that provide the best approximation
are precisely the Fourier coefficients.

THEOREM 7 (THE BEST APPROXIMATION THEOREM). Let $X$ be an inner
product space and $\{\phi_k\}_{k=1}^{\infty}$ an orthonormal set in $X$. Let $v$ be a member of
$X$ and let $s_n$ and $t_n$ denote the partial sums

\[ s_n = \sum_{k=1}^{n} v_k \phi_k \quad \text{and} \quad t_n = \sum_{k=1}^{n} c_k \phi_k, \]

where $v_k = (v, \phi_k)$ is the $k$th Fourier coefficient of $v$ and $c_k$ are arbitrary
real or complex numbers. Then

(a) (BEST APPROXIMATION)

\[ \|v - s_n\| \le \|v - t_n\|; \tag{6.13} \]

(b) (BESSEL'S INEQUALITY)

\[ \sum_{k=1}^{\infty} |v_k|^2 \text{ converges, and } \sum_{k=1}^{\infty} |v_k|^2 \le \|v\|^2; \tag{6.14} \]

(c) (PARSEVAL'S FORMULA)

\[ \sum_{k=1}^{\infty} |v_k|^2 = \|v\|^2 \quad \text{if and only if} \quad \|v - s_n\| \to 0 \text{ as } n \to \infty. \]

PROOF. To prove (a), consider

\[ \|v - t_n\|^2 = (v - t_n, v - t_n) = \|v\|^2 - (v, t_n) - (t_n, v) + \|t_n\|^2. \]

Now

\[ \|t_n\|^2 = \sum_{k=1}^{n} \sum_{l=1}^{n} c_k \bar{c}_l (\phi_k, \phi_l) = \sum_{k=1}^{n} |c_k|^2; \]

next,

\[ (v, t_n) = \sum_{k=1}^{n} \bar{c}_k (v, \phi_k) = \sum_{k=1}^{n} \bar{c}_k v_k. \]

Likewise, $(t_n, v) = \sum_{l=1}^{n} c_l \bar{v}_l$. Assembling all these terms, we find that

\[ \|v - t_n\|^2 = \|v\|^2 + \sum_{k=1}^{n} \left( -c_k \bar{v}_k - \bar{c}_k v_k + |c_k|^2 \right)
 = \|v\|^2 + \sum_{k=1}^{n} |v_k - c_k|^2 - \sum_{k=1}^{n} |v_k|^2. \tag{6.15} \]

The inequality (6.13) follows by comparing this equation with that obtained
by setting $c_k = v_k$, for which case $t_n = s_n$ and the second term on the
right-hand side is zero.
Parts (b) and (c) follow readily from (a), and are treated in Exercise
6.20. □

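Example

13. Let $X = L^2(-1, 1)$ and let $\Phi$ be the subspace of $X$ spanned by the
orthonormal set $\{1/\sqrt{2}, \cos \pi x, \sin \pi x\}$. Consider the function $v(x) =
x^2$. Then the Fourier coefficients of $v$ are

\[ (v, 1/\sqrt{2}) = \int_{-1}^{1} \frac{1}{\sqrt{2}} x^2 \, dx = \frac{\sqrt{2}}{3}, \]

\[ (v, \cos \pi x) = \int_{-1}^{1} x^2 \cos \pi x \, dx = -4/\pi^2, \]

\[ (v, \sin \pi x) = \int_{-1}^{1} x^2 \sin \pi x \, dx = 0, \]

and so, multiplying each coefficient by the corresponding basis function,

\[ \bar{v} = \frac{\sqrt{2}}{3} \cdot \frac{1}{\sqrt{2}} - \frac{4}{\pi^2} \cos \pi x = \frac{1}{3} - \frac{4}{\pi^2} \cos \pi x. \]

The approximation $\bar{v}$ of $v$ is shown in Figure 6.3.
The error in the approximation is found from

\[ \|v - \bar{v}\|_{L^2}^2 = \int_{-1}^{1} \left[ x^2 - \left( \frac{1}{3} - \frac{4}{\pi^2} \cos \pi x \right) \right]^2 dx = \frac{8}{45} - \frac{16}{\pi^4} \approx 0.0135, \]

so that the relative error is

\[ \frac{\|v - \bar{v}\|_{L^2}}{\|v\|_{L^2}} \approx 0.18. \]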
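
The computation in Example 13 can be reproduced numerically. The sketch below assumes Gauss-Legendre quadrature for the $L^2(-1,1)$ inner product; it forms the best approximation as the sum of each Fourier coefficient times its basis function, and then perturbs the coefficients by an arbitrary amount (0.05, chosen only for illustration) to show that the error grows, as (6.15) predicts. All variable names are hypothetical.

```python
import numpy as np

# Quadrature nodes and weights on [-1, 1].
x, w = np.polynomial.legendre.leggauss(200)

def inner(f, g):
    """Approximate L2(-1, 1) inner product of two real functions sampled at the nodes."""
    return float(np.sum(w * f * g))

v = x**2
# The orthonormal set {1/sqrt(2), cos(pi x), sin(pi x)} sampled at the nodes.
phis = [np.full_like(x, 1 / np.sqrt(2)), np.cos(np.pi * x), np.sin(np.pi * x)]

# Fourier coefficients v_k = (v, phi_k) and the best approximation sum_k v_k phi_k.
coeffs = [inner(v, phi) for phi in phis]
v_bar = sum(c * phi for c, phi in zip(coeffs, phis))

err_sq = inner(v - v_bar, v - v_bar)        # squared L2 error of the best approximation
rel_err = np.sqrt(err_sq / inner(v, v))     # relative L2 error

# Any other choice of coefficients does worse: by (6.15) the squared error
# increases by sum_k |v_k - c_k|^2, here 3 * 0.05**2.
perturbed = sum((c + 0.05) * phi for c, phi in zip(coeffs, phis))
err_sq_perturbed = inner(v - perturbed, v - perturbed)
```

Note that the constant term of the approximation is the coefficient $\sqrt{2}/3$ multiplied by the basis function $1/\sqrt{2}$, that is, $1/3$, which is the mean value of $x^2$ on $(-1, 1)$.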

The link between Theorem 7 and the Projection Theorem is obviously
a close one, and in fact it can be shown that the partial sum $s_n$ is the
orthogonal projection of $v$ onto $\Phi$. This point is taken further in Exercise
6.21.
The next thing we wish to do is to extend Theorem 7 to the case in which
the orthonormal set is infinite, and to establish the conditions under which
(6.11) is valid.

FIGURE 6.3. The function v in Example 13 and its approximation

THEOREM 8 (THE FOURIER SERIES THEOREM). Let $H$ be a Hilbert space,
and let $\Phi = \{\phi_k\}_{k=1}^{\infty}$ be an orthonormal set in $H$. Then any $u \in H$ can be
expressed in the form

\[ u = \sum_{k=1}^{\infty} (u, \phi_k) \phi_k \tag{6.16} \]

if and only if $\Phi$ is an orthonormal basis, that is, a maximal orthonormal
set.

PROOF. Assume first that $\Phi$ is an orthonormal basis. What we are required
to show is that if $s_n$ denotes, as before, the partial sum $s_n = \sum_{k=1}^{n} u_k \phi_k$,
where $u_k = (u, \phi_k)$, then $s_n \to u$ as $n \to \infty$. We begin by showing that
$\{s_n\}$ is a Cauchy sequence in $H$.
Let $n \ge m$, and consider

\[ \|s_n - s_m\|^2 = \|s_n\|^2 - (s_n, s_m) - (s_m, s_n) + \|s_m\|^2. \]

Now recall that in the proof of the Best Approximation Theorem it was
shown that $\|s_n\|^2 = \sum_{k=1}^{n} |u_k|^2$. Furthermore,

\[ (s_m, s_n) = \left( \sum_{k=1}^{m} u_k \phi_k, \sum_{l=1}^{n} u_l \phi_l \right)
 = \left( \sum_{k=1}^{m} u_k \phi_k, \sum_{l=1}^{m} u_l \phi_l + \sum_{l=m+1}^{n} u_l \phi_l \right) \]
\[ = \left( \sum_{k=1}^{m} u_k \phi_k, \sum_{l=1}^{m} u_l \phi_l \right) + \left( \sum_{k=1}^{m} u_k \phi_k, \sum_{l=m+1}^{n} u_l \phi_l \right)
 = \sum_{k=1}^{m} |u_k|^2, \]

using the orthonormality of $\{\phi_k\}$. Likewise, $(s_n, s_m) = \sum_{k=1}^{m} |u_k|^2$. For
convenience, set $a_n = \sum_{k=1}^{n} |u_k|^2$. Then

\[ \|s_n - s_m\|^2 = a_n - a_m = |a_n - a_m|; \]

the last term arises from the fact that $a_m \le a_n$. Now from Bessel's
inequality (6.14), $|a_n| \le \|u\|^2$, and since the right-hand side is independent
of $n$, the sequence $\{a_n\}$ is bounded; it is also monotone increasing, hence
(see Exercise 1.14) it is convergent, and therefore also Cauchy (in $\mathbb{K}$). The
sequence $\{s_n\}$ is thus also a Cauchy sequence (in $H$), and by the completeness
of $H$ this sequence converges, to a member $u'$, say, of $H$. It remains
to show that $u' = u$.
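Consider

\[ (u - u', \phi_l) = \left( u - \lim_{n \to \infty} \sum_{k=1}^{n} u_k \phi_k, \phi_l \right)
 = \lim_{n \to \infty} \left( u - \sum_{k=1}^{n} u_k \phi_k, \phi_l \right) \quad \text{(using Exercise 4.2)} \]
\[ = \lim_{n \to \infty} \left[ (u, \phi_l) - (u, \phi_l) \right] = 0. \]

This holds for all $l$, and $\{\phi_l\}$ is a basis (that is, a maximal orthonormal
set), so it follows that we must have $u' = u$. This proves the first part of
the theorem.
Conversely, suppose that every $u \in H$ has the form (6.16). Then we have,
for $u \in H$,

\[ \|u\|^2 = \left( \sum_{k=1}^{\infty} u_k \phi_k, \sum_{l=1}^{\infty} u_l \phi_l \right) = \sum_{k=1}^{\infty} |u_k|^2 = \sum_{k=1}^{\infty} |(u, \phi_k)|^2. \tag{6.17} \]

Now if $\Phi$ is not maximal, then there is a vector $\phi_0$, with $\|\phi_0\| = 1$, such
that $(\phi_0, \phi_k) = 0$ for all $k$. But from (6.17),

\[ 1 = \|\phi_0\|^2 = \sum_{i=1}^{\infty} |(\phi_0, \phi_i)|^2 = 0, \]

a contradiction. Hence $\Phi$ is an orthonormal basis. □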
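
The convergence asserted by the Fourier Series Theorem, together with Bessel's inequality and Parseval's formula, can be observed numerically. The following sketch assumes $H = L^2(-1,1)$, the orthonormal basis consisting of $1/\sqrt{2}$, $\sin k\pi x$ and $\cos k\pi x$, and the test function $u(x) = x$ (an arbitrary illustrative choice); the truncation levels 1, 5, 20, 40 are likewise arbitrary.

```python
import numpy as np

# Quadrature on [-1, 1]; 400 nodes resolve the highest mode used below.
x, w = np.polynomial.legendre.leggauss(400)

def inner(f, g):
    """Approximate L2(-1, 1) inner product of real functions sampled at the nodes."""
    return float(np.sum(w * f * g))

u = x.copy()
norm_sq = inner(u, u)   # equals 2/3 for u(x) = x

def partial_sum(n_modes):
    """Return the partial sum using the constant and the first n_modes sine/cosine
    pairs, together with the accumulated sum of squared Fourier coefficients."""
    phi0 = np.full_like(x, 1 / np.sqrt(2))
    c0 = inner(u, phi0)
    s = c0 * phi0
    energy = c0**2
    for k in range(1, n_modes + 1):
        for phi in (np.sin(k * np.pi * x), np.cos(k * np.pi * x)):
            c = inner(u, phi)
            s = s + c * phi
            energy += c**2
    return s, energy

errors, gaps = [], []
for n in (1, 5, 20, 40):
    s, energy = partial_sum(n)
    errors.append(inner(u - s, u - s))   # ||u - s_n||^2
    gaps.append(norm_sq - energy)        # ||u||^2 - sum |u_k|^2, nonnegative by Bessel
```

The two lists agree, reflecting the identity $\|u - s_n\|^2 = \|u\|^2 - \sum_{k \le n} |u_k|^2$ obtained from (6.15) with $c_k = u_k$, and both decrease towards zero as more modes are included.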


The Fourier Series Theorem gives explicit conditions under which the
elementary notion of representation of an element in terms of a basis is valid
for infinite-dimensional spaces. The theorem does not guarantee, though,
that it will always be possible to find such bases. Such a guarantee is
provided by the next theorem.

THEOREM 9. Every Hilbert space $H$ has an orthonormal basis.

PROOF. The proof relies on an application of Zorn's Lemma, which was
introduced in Chapter 1; the lemma is applied to the family of orthonormal
sets in $H$. Let this family be denoted by $\mathcal{C}$; then it is a partially ordered
set if as the partial ordering operation $\preceq$ we use set inclusion, so that two
orthonormal sets $S_1$ and $S_2$ are ordered according to $S_1 \subseteq S_2$. The
collection $\mathcal{C}$ is nonempty since the set $S = \{v/\|v\|\}$, where $v$ is any nonzero
member of $H$, is trivially an orthonormal set.
Now let $\{S_n\}$ be a linearly ordered subset of $\mathcal{C}$; then $\bigcup_n S_n$ is itself an
orthonormal set, and furthermore it contains each $S_n$, so that it is therefore
an upper bound for $\{S_n\}$. Every linearly ordered subset of $\mathcal{C}$ has an upper
bound, so by Zorn's Lemma $\mathcal{C}$ has a maximal element, that is, an orthonormal
set not contained in any other orthonormal set. This is of course the
definition of an orthonormal basis. □

If the Hilbert space $H$ is known to be separable (see Chapter 4, Section 4),
then the proof of existence of an orthonormal basis is much more
straightforward. Indeed, if $H$ is separable, then we can pick at least one
countable set $U = \{u_n\}$ in $H$ that is dense in $H$. Now retain in $U$ only
sufficient elements in order for the resulting set, $U'$, say, to consist of
linearly independent elements with the same span as $U$. Then the span of $U'$
is also dense in $H$. Finally, apply the Gram-Schmidt procedure to this set.

THEOREM 10. Let $H$ be a separable Hilbert space. Then it has an orthonormal
basis.

We are now assured of the existence of orthonormal bases in Hilbert
spaces; it remains only to find ways of constructing such bases. There are
various procedures for doing this, and we explore one such approach, in
the context of the space $L^2(\Omega)$, which is simple, and which has considerable
relevance to physical problems. This is the subject of Sturm-Liouville
problems.

6.5 Sturm-Liouville problems


Separation of variables. The method of separation of variables is a
popular technique for solving simple linear initial boundary value prob-
lems. The use of this method leads to an eigenvalue problem whose solu-
tion is essential to the solution of the problem as a whole, and it is such
eigenvalue problems that are examples of Sturm-Liouville problems, the
eigenfunctions of which, it is shown, are candidates for orthonormal bases.
To review the method of separation of variables we return to the Intro-
duction, and to the problem of heat conduction. This problem, which is

formulated in Box 1 in that chapter, is considered in a somewhat simplified
form, in that the following assumptions are made: the medium is homogeneous,
heat sources are absent, and the problem is one-dimensional in
space. We choose as the domain the interval $(-\ell, \ell)$, so that the problem
becomes one of finding the temperature $u$ that satisfies the PDE

\[ \frac{\partial u}{\partial t} - \kappa \frac{\partial^2 u}{\partial x^2} = 0 \quad \text{in } (-\ell, \ell), \tag{6.18} \]

in which $\kappa$, the thermal conductivity, is a positive constant. The temperature
is assumed to be zero at each end, so that the boundary conditions
are

\[ u(-\ell, t) = 0 \quad \text{and} \quad u(\ell, t) = 0. \tag{6.19} \]

The initial temperature is specified by a given function $f$, so that

\[ u(x, 0) = f(x). \tag{6.20} \]

The essence of the method of separation of variables is to seek a solution
of the form

\[ u(x, t) = M(x) N(t); \]

substitution in (6.18) and rearrangement leads to the equation

\[ \frac{N'(t)}{N(t)} = \frac{\kappa M''(x)}{M(x)}, \]

and since the left-hand side depends only on $t$ and the right-hand side only
on $x$, it follows that each side of this equation is equal to a constant, which
we denote by $-\lambda$, the minus sign being inserted for future convenience. The
boundary conditions (6.19) become $M(-\ell) = M(\ell) = 0$ and so the first
problem becomes one of finding $M$ and $\lambda$ that satisfy

\[ -\kappa M''(x) = \lambda M(x), \qquad M(-\ell) = 0, \quad M(\ell) = 0. \tag{6.21} \]

The second problem involves finding $N$ that satisfies

\[ N'(t) + \lambda N(t) = 0. \tag{6.22} \]

It is equation (6.21) which is the prototype of the Sturm-Liouville
eigenvalue problem.
Eigenvalue problems. Let $L$ be a linear operator (the precise specification
of its domain is not too important right now); then the eigenvalue
problem for the operator $L$ is the problem of finding $u$ and $\lambda$ that satisfy

\[ Lu = \lambda u, \tag{6.23} \]

where $\lambda$ is in general a complex number. If $L$ is a differential operator
defined on a domain $\Omega$ with boundary $\Gamma$, then it is also necessary to specify
homogeneous boundary conditions of the form

\[ Bu = 0, \tag{6.24} \]

in which $B$ is also a linear operator. Alternatively, (6.23) could represent a
matrix eigenvalue problem, of the kind that is encountered in elementary
courses in linear algebra.
The defining features of an eigenvalue problem are, first, that $u = 0$ is
a solution, known as the trivial solution; second, there are special nonzero
values of $\lambda$, called eigenvalues, for which (6.23) and (6.24) have nontrivial
solutions. In the context of matrix problems these solutions are known as
eigenvectors whereas for differential equations they are known as
eigenfunctions. In either case they are determined only up to a multiplicative
constant; that is, if $u$ is an eigenfunction, then so is $au$ for any nonzero number $a$,
since (6.23) gives $aLu = a \cdot \lambda u$ or $L(au) = \lambda(au)$. Because of this
indeterminacy, it is customary to normalize eigenfunctions in some convenient
manner.
Returning to problem (6.21), we now seek a solution to this eigenvalue
problem in the form $M(x) = e^{ax}$. Substitution in (6.21)$_1$ leads to the
equation $-\kappa a^2 = \lambda$, so that there are two possible solutions, viz. $a =
\pm i\sqrt{\lambda/\kappa}$ (assuming that $\lambda > 0$). So the most general solution of (6.21)$_1$ is
a linear combination of the solutions corresponding to the two values of $a$.
Since $e^{iy} = \cos y + i \sin y$ for any real number $y$, this general solution may
be expressed in the alternative form

\[ M(x) = A \cos \alpha x + B \sin \alpha x \]

in which $\alpha = \sqrt{\lambda/\kappa}$. The boundary conditions (6.21)$_2$ are imposed next:
these give

\[ \left. \begin{array}{l} M(-\ell) = 0 \\ M(\ell) = 0 \end{array} \right\} \;\Rightarrow\;
   \begin{array}{l} A \cos \alpha\ell - B \sin \alpha\ell = 0 \\ A \cos \alpha\ell + B \sin \alpha\ell = 0 \end{array} \]

or, in matrix form,

\[ \begin{pmatrix} \cos \alpha\ell & -\sin \alpha\ell \\ \cos \alpha\ell & \sin \alpha\ell \end{pmatrix}
   \begin{pmatrix} A \\ B \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \]

In order for this set of equations to have a nontrivial solution it is necessary
and sufficient that the determinant of the matrix be zero; that is, we require
that

\[ \cos \alpha\ell \, \sin \alpha\ell = 0 \quad \text{or} \quad \sin 2\alpha\ell = 0, \]

the solution of which is $2\alpha\ell = k\pi$ ($k = 0, 1, 2, \ldots$), so that the problem
(6.21) has an infinite sequence of eigenvalues $\lambda_k$, $k = 0, 1, 2, \ldots$, where

\[ \lambda_k = \kappa \alpha_k^2 = \kappa \left( \frac{k\pi}{2\ell} \right)^2. \]

It also follows that there is an infinite sequence of eigenfunctions, denoted
here for convenience by $M_k(x)$, where

\[ M_k(x) = A_k \cos \alpha_k x + B_k \sin \alpha_k x \]

and $\alpha_k = k\pi/2\ell$. We are now able to return to the problem (6.22), which
is considered in the form

\[ N'(t) + \lambda_k N(t) = 0. \]

The solution of this equation, for each value of $k$, is

\[ N_k(t) = \exp(-\lambda_k t), \]

up to a multiplicative constant.

The general solution of (6.18) and (6.19) may now be obtained by adding
up the linear combinations of the possible solutions; we set aside for now
the issue of convergence of the infinite sum that results, and express the
general solution in the form

\[ u(x, t) = \sum_{k=0}^{\infty} [A_k \cos \alpha_k x + B_k \sin \alpha_k x] \exp(-\lambda_k t). \]

All that remains is to obtain the constants $A_k$ and $B_k$. These may be found
by using the last remaining condition to be satisfied, which is the initial
condition; from (6.20), then,

\[ f(x) = u(x, 0) = M(x) N(0) = \sum_{k=0}^{\infty} [A_k \cos \alpha_k x + B_k \sin \alpha_k x]. \tag{6.25} \]

The representation (6.25) of the function $f$ is known as the eigenfunction
expansion of $f$. The coefficients may be found by exploiting the orthogonality
properties of the trigonometric functions; indeed, with the $L^2$-inner
product on $(-\ell, \ell)$ denoted by $(\cdot, \cdot)$, recall that

\[ (\sin \alpha_k x, \sin \alpha_j x) = (\cos \alpha_k x, \cos \alpha_j x) = \begin{cases} \ell & \text{if } k = j, \\ 0 & \text{otherwise,} \end{cases}
   \qquad (\cos \alpha_k x, \sin \alpha_j x) = 0. \tag{6.26} \]

So if we take the inner product of each side of (6.25) with $\sin \alpha_k x$ and with
$\cos \alpha_k x$ in turn, we find that

\[ A_k = \frac{1}{\ell} (f, \cos \alpha_k x), \qquad B_k = \frac{1}{\ell} (f, \sin \alpha_k x), \tag{6.27} \]

and the coefficients $A_k$ and $B_k$ are all thus determined.
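
The eigenvalues $\kappa (k\pi/2\ell)^2$ of problem (6.21) can be compared with those of a finite-difference discretization. The sketch below is an illustration only: the values of `kappa` and `ell` are arbitrary, the second-difference matrix is a standard approximation that is not part of the text, and the comparison starts at $k = 1$ because $k = 0$ gives $\lambda = 0$, for which the boundary conditions leave only the trivial solution $M = 0$.

```python
import numpy as np

kappa, ell = 1.5, 2.0        # illustrative values, not taken from the text
n = 400                      # number of interior grid points on (-ell, ell)
h = 2 * ell / (n + 1)        # grid spacing

# Standard second-difference approximation of -kappa * M'' with M(-ell) = M(ell) = 0.
A = kappa / h**2 * (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))

# Smallest four eigenvalues of the symmetric matrix (eigvalsh returns them in ascending order).
numerical = np.linalg.eigvalsh(A)[:4]

# Values kappa * (k pi / (2 ell))^2 for k = 1, ..., 4.
k = np.arange(1, 5)
exact = kappa * (k * np.pi / (2 * ell)) ** 2

rel_err = np.abs(numerical - exact) / exact
```

Refining the grid (increasing `n`) reduces `rel_err`, since the second-difference approximation is second-order accurate in `h`.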


The relevance of the problem just discussed, and of Sturm-Liouville problems
in general, may now be explored a little further. First, it has been seen
that the eigenfunctions of the Sturm-Liouville problem (6.21) are the set of
trigonometric functions $\Phi = \{\cos \alpha_k x\}_{k=0}^{\infty} \cup \{\sin \alpha_k x\}_{k=1}^{\infty}$; second, in order
to complete the solution of the problem it is required to expand the function
$f(x)$ in terms of these trigonometric functions. The appearance of the
$L^2$-inner product in (6.27) is not accidental; indeed, the question of whether
the function $f$ may be represented in the form (6.25) is equivalent to asking
whether the set $\Phi$ forms a basis for $L^2(-\ell, \ell)$ (this set is indeed orthogonal
but not orthonormal, although easily orthonormalized). The answer is
affirmative, and this is the relevance of Sturm-Liouville problems to the
subject of orthonormal bases: the eigenfunctions of Sturm-Liouville problems
constitute orthonormal bases for $L^2$. These considerations are precisely
what motivate elementary Fourier analysis, and indeed (6.25) together with
(6.27) simply gives the Fourier series representation of $f$.
We turn now to a more detailed study of Sturm-Liouville problems, the
objective being to work towards this general result.

Sturm-Liouville problems. A Sturm-Liouville operator $L$ is a linear
operator of the form

\[ Lu = \frac{1}{w} \left[ -(pu')' + qu \right], \]

defined on an interval $[a, b]$ of the real line. Here $p$, $p'$, $q$, and $w$ are
continuous real-valued functions on $[a, b]$ that satisfy

\[ p(x) > 0, \quad q(x) \ge 0, \quad w(x) > 0 \quad \text{on } [a, b]. \tag{6.28} \]

Let $B_1$ and $B_2$ be linear operators that specify boundary values of a
continuous function, and that are defined by

\[ B_1 u = \alpha_1 u(a) + \beta_1 u'(a), \qquad B_2 u = \alpha_2 u(b) + \beta_2 u'(b). \tag{6.29} \]

The constants $\alpha_i$ and $\beta_i$ satisfy

\[ \alpha_i \ge 0, \quad \beta_i \ge 0, \quad \text{and} \quad \alpha_i + \beta_i > 0. \tag{6.30} \]

Then a regular Sturm-Liouville problem is an eigenvalue problem of the
form

\[ Lu = \lambda u \ \text{on } (a, b), \qquad B_1 u = 0, \quad B_2 u = 0. \tag{6.31} \]

Equation (6.31)$_1$ is encountered in the form

\[ -(pu')' + qu = \lambda w u, \]

rather than in the form in which $w$ is found on the left-hand side.

If any of the conditions in the definition differ from those given here,
whether with respect to the boundedness of the interval, the requirements
(6.28), or the conditions (6.30), the problem is then known as a singular
Sturm-Liouville problem.
The problem (6.31) is considered in the space $L^2(a, b)$ endowed with the
inner product $(\cdot, \cdot)$ defined by

\[ (u, v) = \int_a^b u(x) \overline{v(x)} w(x) \, dx; \]

because of its role in the definition of the inner product, $w$ is called a
weighting function.
Now the first issue that needs to be resolved is that concerning the domain
$D(L)$ of the operator $L$. The problem is posed in $L^2(a, b)$, and of
course not all members of this space have derivatives in the classical sense.
It follows that $D(L)$ has to be a proper subspace of $L^2(a, b)$, and it suffices
to take

(6.32)

Since the space $C_0^{\infty}(a, b)$ is contained in $D(L)$, and since $C_0^{\infty}(a, b)$ is dense
in $L^2(a, b)$ (see Chapter 4, Theorem 6 and the discussion that follows it),
it follows that $D(L)$ is dense in $L^2(a, b)$.

Examples
14. The problem (6.21) is a Sturm-Liouville problem with $[a, b] = [-\ell, \ell]$,
$p(x) = \kappa$, $q(x) = 0$, and $w(x) = 1$. With regard to the boundary
conditions, $\alpha_1 = \alpha_2 = 1$ and $\beta_1 = \beta_2 = 0$.
15. Legendre's equation arises when the method of separation of variables
is applied to problems having spherical symmetry (see Exercise 6.23).
This problem takes the form

\[ -[(1 - x^2) u']' = \lambda u \ \text{on } (-1, 1), \qquad u(-1) \text{ and } u(1) \text{ are finite}, \tag{6.33} \]

and is a singular Sturm-Liouville problem since the boundary conditions
do not conform to the structure of (6.29) and (6.31)$_2$. Nevertheless,
many of the properties of regular problems hold in this case
as well.

Symmetric operators. It turns out that Sturm-Liouville operators are
examples of what are known as symmetric operators, and symmetric operators
have many of the nice properties that symmetric matrices possess in
linear algebra.
Let $L$ be a linear operator defined on a Hilbert space $H$, with domain
$D(L)$. Then $L$ is said to be a symmetric operator if

\[ (Lu, v) = (u, Lv) \quad \text{for all } u, v \in D(L). \tag{6.34} \]

It is important to bear in mind that the definition applies to members of
the domain of $L$, and given that boundary conditions play a role in the
choice of the domain (as in (6.32)), these will be crucial in determining
whether a given operator is symmetric.
The next two results concerning symmetric operators are direct generalizations
of the situation that pertains for matrices.

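LEMMA 2. The eigenvalues of a symmetric linear operator are real.

PROOF. Consider the eigenvalue problem $Lu = \lambda u$. Then

\[ (\lambda - \bar{\lambda})(u, u) = \lambda (u, u) - \bar{\lambda} (u, u)
 = (\lambda u, u) - (u, \lambda u) = (Lu, u) - (u, Lu) = 0, \]

using (6.34). Since $u \neq 0$ we have $(u, u) > 0$, and hence $\lambda = \bar{\lambda}$. □

LEMMA 3. Let $L$ be a symmetric linear operator defined on a Hilbert space
$H$. Then the eigenfunctions corresponding to two distinct eigenvalues are
orthogonal.

PROOF. Let $\lambda_1$ and $\lambda_2$ be eigenvalues of $L$ with eigenfunctions $u_1$ and $u_2$,
respectively. Then $Lu_i = \lambda_i u_i$ ($i = 1, 2$) and so, since $\lambda_2$ is real by Lemma 2,

\[ (\lambda_1 - \lambda_2)(u_1, u_2) = \lambda_1 (u_1, u_2) - \lambda_2 (u_1, u_2)
 = (\lambda_1 u_1, u_2) - (u_1, \lambda_2 u_2) = (Lu_1, u_2) - (u_1, Lu_2) = 0. \]

Since $\lambda_2 \neq \lambda_1$ by assumption, it follows that $(u_1, u_2) = 0$. □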
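
Lemmas 2 and 3 can be illustrated in the matrix setting that they generalize. The sketch below uses a small complex Hermitian matrix (an arbitrary illustrative choice), which is symmetric in the sense of (6.34) for the inner product $(u, v) = \sum_i u_i \bar{v}_i$. A general-purpose eigenvalue routine that makes no symmetry assumption is used deliberately, so that realness and orthogonality are observed rather than imposed.

```python
import numpy as np

# A Hermitian matrix: it equals its own conjugate transpose, so (Lu, v) = (u, Lv).
L = np.array([[2.0, 1.0j, 0.0],
              [-1.0j, 3.0, 1.0 - 1.0j],
              [0.0, 1.0 + 1.0j, 1.0]])
is_hermitian = np.allclose(L, L.conj().T)

# General eigen-solver: columns of eigvecs are eigenvectors.
eigvals, eigvecs = np.linalg.eig(L)

# Lemma 2: the eigenvalues are real (imaginary parts vanish up to rounding).
max_imag = float(np.max(np.abs(eigvals.imag)))

# Lemma 3: eigenvectors for distinct eigenvalues are orthogonal, so the
# Gram matrix of the eigenvectors has negligible off-diagonal entries.
gram = eigvecs.conj().T @ eigvecs
off_diag = float(np.max(np.abs(gram - np.diag(np.diag(gram)))))
```

For this matrix the three eigenvalues are distinct, so Lemma 3 applies to every pair of eigenvectors.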
Properties of Sturm-Liouville operators. We begin by establishing
that $L$ is symmetric; indeed, for any $u$ and $v$ in $D(L)$ we have

\[ (Lu, v) - (u, Lv) = \int_a^b \left[ -(pu')' \bar{v} + q u \bar{v} + u (p \bar{v}')' - q u \bar{v} \right] dx \]
\[ = \int_a^b \left[ -(pu')' \bar{v} + (p \bar{v}')' u \right] dx \]
\[ = \int_a^b \left[ (p \bar{v}' u)' - (p \bar{v} u')' \right] dx \]
\[ = \left[ p \bar{v}' u - p u' \bar{v} \right]_a^b \]
\[ = p(b) \left[ u(b) \bar{v}'(b) - u'(b) \bar{v}(b) \right] - p(a) \left[ u(a) \bar{v}'(a) - u'(a) \bar{v}(a) \right]. \tag{6.35} \]

Now since $v$ belongs to $D(L)$, so does $\bar{v}$ since the coefficients in the boundary
terms are all real. It follows that $B_1 u = B_1 \bar{v} = 0$ or, recasting this in matrix
form after using (6.29),

\[ \begin{pmatrix} u(a) & u'(a) \\ \bar{v}(a) & \bar{v}'(a) \end{pmatrix}
   \begin{pmatrix} \alpha_1 \\ \beta_1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \]

From the set of conditions (6.30) at least one of $\alpha_1$ and $\beta_1$ must be nonzero,
and this is only possible if the matrix is singular, that is, if

\[ u(a) \bar{v}'(a) - u'(a) \bar{v}(a) = 0. \]

Repeating the exercise for the boundary condition $B_2 u = 0$, we
obtain for that case

\[ u(b) \bar{v}'(b) - u'(b) \bar{v}(b) = 0. \]

From these two equations it follows that the right-hand side of (6.35) is
zero, as desired. This result, together with a related result, is summarized
in the following theorem.

THEOREM 11.

(a) The Sturm-Liouville operator $L$ is symmetric;

(b) The Sturm-Liouville operator $L$ is positive; that is, $(Lu, u) \ge 0$ for
all $u \in D(L)$.

The proof of part (b) is deferred to Exercise 6.25, as is the proof of the
following corollary.

COROLLARY TO THEOREM 11. The eigenvalues of $L$ are all nonnegative,
and form a countable set.

Thus we have established that the eigenvalues of $L$ may be arranged in the
sequence $0 \le \lambda_1 \le \lambda_2 \le \cdots$. It can further be shown that $\lambda_n \to \infty$ as
$n \to \infty$, although we do not pursue this result here.
We come now to the main result of this section.

THEOREM 12. The eigenfunctions of a regular Sturm-Liouville problem
form an orthonormal basis for $L^2(a, b)$.

By way of preparing for the proof of this theorem, we introduce the Rayleigh
quotient $R$, a functional defined on $D(L)$ by

\[ R(v) = \frac{(Lv, v)}{\|v\|^2} \quad \text{for all nonzero } v \in D(L). \]

Note that $R(v) \ge 0$ by the positivity of $L$.

LEMMA 4. The minimum of $R(v)$ over all functions $v \in D(L)$ that are
orthogonal to the first $n$ eigenfunctions is $\lambda_{n+1}$. That is,

\[ \min \{ R(v) : v \in D(L), \ (v, \phi_1) = (v, \phi_2) = \cdots = (v, \phi_n) = 0 \} = \lambda_{n+1}. \]

Furthermore, the minimizing function is $\phi_{n+1}$.
The proof of this lemma is treated in Exercise 6.26.
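
The minimum property of the Rayleigh quotient has an exact counterpart for symmetric matrices, which can be explored numerically. The following sketch assumes a finite-difference matrix approximating $-u''$ on $(-1, 1)$ with zero boundary values as a stand-in for $L$; the matrix size, the number of constraints `m`, and the random sampling are illustrative choices only.

```python
import numpy as np

n = 50
h = 2.0 / (n + 1)
# Symmetric second-difference matrix approximating -u'' on (-1, 1) with zero boundary values.
A = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

# Ascending eigenvalues lam and orthonormal eigenvectors (columns of Q).
lam, Q = np.linalg.eigh(A)

def rayleigh(v):
    """Rayleigh quotient (Av, v) / ||v||^2 for a real vector v."""
    return float(v @ A @ v) / float(v @ v)

m = 3   # impose orthogonality to the first m eigenvectors
rng = np.random.default_rng(1)
quotients = []
for _ in range(200):
    v = rng.standard_normal(n)
    v -= Q[:, :m] @ (Q[:, :m].T @ v)   # remove components along the first m eigenvectors
    quotients.append(rayleigh(v))

min_sampled = min(quotients)   # never falls below the (m+1)th eigenvalue
attained = rayleigh(Q[:, m])   # the (m+1)th eigenvector attains that value
```

Random sampling cannot locate the minimum, but every sampled quotient respects the lower bound, and the bound is attained by the next eigenvector, in line with Lemma 4.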

PROOF OF THEOREM 12. We have already established from Lemma 3 and
the Corollary to Theorem 11 that there is a countable infinity of
eigenfunctions of the Sturm-Liouville problem, and that these are mutually
orthogonal. Since the eigenfunctions $\{\phi_1, \phi_2, \ldots\}$ are determined up to a
multiplicative constant, we assume once and for all that these are normalized,
so that $\|\phi_k\| = 1$, $k = 1, 2, \ldots$.
It therefore remains to show that, given any member $u$ of $L^2(a, b)$ and
a Sturm-Liouville problem defined on $D(L)$, the eigenfunctions $\{\phi_k\}_{k=1}^{\infty}$
form an orthonormal basis for $L^2(a, b)$ in the sense that for every $u \in L^2(a, b)$

\[ u = \sum_{k=1}^{\infty} u_k \phi_k \quad \text{or} \quad \lim_{n \to \infty} \|u - s_n\| = 0, \]

where $u_k = (u, \phi_k)$ and $s_n = \sum_{k=1}^{n} u_k \phi_k$ (recall Theorem 8 and its proof).
It is convenient to introduce the remainder $r_n = u - s_n$, and we now
estimate $R(r_n)$. For $k = 1, \ldots, n$ we have

\[ (u - s_n, \phi_k) = (u, \phi_k) - (s_n, \phi_k) = u_k - u_k = 0. \]

In other words, the remainder is orthogonal to the first $n$ eigenfunctions.
Now according to Lemma 4, the smallest possible value of $R(v)$ over $v$ in
the subspace of functions that are orthogonal to the first $n$ eigenfunctions,
is $\lambda_{n+1}$. It follows, therefore, that since $r_n$ belongs to this subspace, $R(r_n)$
cannot be less than $\lambda_{n+1}$:

\[ R(r_n) = \frac{(Lr_n, r_n)}{\|r_n\|^2} \ge \lambda_{n+1}. \tag{6.36} \]

Suppose now that we are able to show that the numerator $(Lr_n, r_n)$ is
bounded independently of $n$; then as $n \to \infty$ the right-hand side of (6.36)
goes to infinity, and hence $r_n \to 0$. The theorem is thus proved if it can be
shown that $(Lr_n, r_n)$ is bounded. This is achieved by showing that $(Lr_n, r_n)$
can be bounded above by $(Lu, u)$; indeed, we have

\[ (Lu, u) = (L(r_n + s_n), r_n + s_n) \]
\[ = (Lr_n, r_n) + (Ls_n, r_n) + (Lr_n, s_n) + (Ls_n, s_n) \]
\[ = (Lr_n, r_n) + (Ls_n, r_n) + (r_n, Ls_n) + (Ls_n, s_n) \quad \text{(using the symmetry of } L) \]
\[ = (Lr_n, r_n) + (Ls_n, s_n) + 2 \,\mathrm{Re}\, (Ls_n, r_n) \]
\[ \ge (Lr_n, r_n) + 2 \,\mathrm{Re}\, (Ls_n, r_n) \quad \text{since } (Ls_n, s_n) \ge 0. \]

Now it can be shown that in fact $(Ls_n, r_n) = 0$ (see Exercise 6.27), and so
we are left with the inequality

\[ (Lr_n, r_n) \le (Lu, u), \]

which shows that $(Lr_n, r_n)$ is bounded. This concludes the proof of the
theorem. □
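
The key estimate in the proof, namely that $\|r_n\|^2 \le (Lu, u)/\lambda_{n+1}$, can be checked on a concrete case. The sketch below assumes the operator $Lu = -u''$ on $(-1, 1)$ with zero boundary values (problem (6.21) with $\kappa = 1$ and $\ell = 1$), whose normalized eigenfunctions are taken here to be $\sin(k\pi(x+1)/2)$ with eigenvalues $(k\pi/2)^2$, $k = 1, 2, \ldots$, and the test function $u(x) = 1 - x^2$; these specific choices are illustrative and not taken from the text.

```python
import numpy as np

# Quadrature on [-1, 1] for the unweighted L2 inner product (w = 1).
x, w = np.polynomial.legendre.leggauss(300)

def inner(f, g):
    """Approximate L2(-1, 1) inner product of real functions sampled at the nodes."""
    return float(np.sum(w * f * g))

u = 1 - x**2                 # satisfies u(-1) = u(1) = 0
Lu = np.full_like(x, 2.0)    # Lu = -u'' = 2
energy = inner(Lu, u)        # (Lu, u) = 8/3

s = np.zeros_like(x)
checks = []
for n in range(1, 11):
    phi = np.sin(n * np.pi * (x + 1) / 2)   # nth normalized eigenfunction
    s = s + inner(u, phi) * phi             # partial sum s_n
    r = u - s                               # remainder r_n
    lam_next = ((n + 1) * np.pi / 2) ** 2   # eigenvalue number n + 1
    checks.append((inner(r, r), energy / lam_next))
```

Each pair in `checks` holds the squared norm of the remainder and the bound obtained from (6.36); the bound decays like $1/n^2$, which forces the remainder to zero.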

Examples
16. Theorem 12 confirms that the set of eigenfunctions $\{\cos \alpha_k x\}_{k=0}^{\infty} \cup
\{\sin \alpha_k x\}_{k=1}^{\infty}$ of the Sturm-Liouville problem (6.21) forms a basis for
$L^2(-\ell, \ell)$. In this form the basis is orthogonal, but not orthonormal;
however, it is easily converted to an orthonormal basis by using the
relations (6.26), according to which $\|\cos \alpha_k x\| = \|\sin \alpha_k x\| = \sqrt{\ell}$ and,
for the case $k = 0$, $\|1\| = \sqrt{2\ell}$. The orthonormal basis is thus

\[ \{1/\sqrt{2\ell}\} \cup \{(1/\sqrt{\ell}) \cos \alpha_k x\}_{k=1}^{\infty} \cup \{(1/\sqrt{\ell}) \sin \alpha_k x\}_{k=1}^{\infty}. \]

17. We return to the problem (6.21), but this time express the solution
in the form

rather than rewriting it in terms of trigonometric functions. By following
the same procedure as before (that is, substituting in the boundary
conditions and seeking a nontrivial solution) we find that the
eigenvalues are $\lambda_n = (n\pi/\ell)^2$, $n = 0, \pm 1, \pm 2, \ldots$ so that the set of
eigenfunctions is now

and this forms a basis for the space $L^2(-\ell, \ell)$ of complex-valued functions
(see also Example 9). Normalization is easy, since

18. The problem associated with Legendre's equation (6.33) is a singular
Sturm-Liouville problem. Nevertheless, by a minor modification of
the theory it can be shown that the properties of the regular problem
carry over. In particular, the eigenfunctions of Legendre's equation,
the Legendre polynomials $P_n$, form an orthogonal basis for $L^2(-1, 1)$:
these are defined by

\[ P_n(x) = \frac{1}{2^n n!} \frac{d^n}{dx^n} (x^2 - 1)^n, \quad n = 0, 1, 2, \ldots. \]

Since

\[ (P_n, P_n) = \int_{-1}^{1} P_n^2(x) \, dx = \frac{2}{2n + 1}, \]

the corresponding orthonormal set is $\{\phi_n(x), \ n = 0, 1, \ldots\}$ where
$\phi_n = [(2n + 1)/2]^{1/2} P_n$. It is also worth noting that the Legendre
polynomials can be obtained by applying the Gram-Schmidt procedure
to the set $\{1, x, x^2, \ldots\}$ of monomials (cf. Exercise 6.8).
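
The remark that the Gram-Schmidt procedure applied to the monomials reproduces the (normalized) Legendre polynomials can be verified for the first few degrees. This is a sketch using NumPy's polynomial classes, with the $L^2(-1,1)$ inner product evaluated exactly by integrating the product polynomial; the helper `inner` and the cut-off at degree 4 are illustrative choices.

```python
import numpy as np
from numpy.polynomial import Polynomial, Legendre

def inner(p, q):
    """L2(-1, 1) inner product of two real polynomials, via an exact antiderivative."""
    antideriv = (p * q).integ()
    return float(antideriv(1.0) - antideriv(-1.0))

# Gram-Schmidt applied to 1, x, x^2, x^3, x^4.
orthonormal = []
for n in range(5):
    p = Polynomial([0.0] * n + [1.0])          # the monomial x^n
    for e in orthonormal:
        p = p - inner(p, e) * e                # subtract the projection onto e
    orthonormal.append(p / float(np.sqrt(inner(p, p))))

# Normalized Legendre polynomials sqrt((2n + 1) / 2) * P_n in the monomial basis.
expected = [float(np.sqrt((2 * n + 1) / 2)) * Legendre.basis(n).convert(kind=Polynomial)
            for n in range(5)]

# Largest coefficient discrepancy between the two families.
max_diff = max(float(np.max(np.abs((a - b).coef))) for a, b in zip(orthonormal, expected))
```

Both families have positive leading coefficients, so they agree without any sign adjustment.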

6.6 Bibliographical remarks


Finite-dimensional spaces are given a detailed treatment in the texts by
Halmos [18], Hoffman and Kunze [20] and Lang [28]. The texts by Noble
[35] and by Strang [50] also give good, applications-oriented expositions of
linear algebra. The subject of orthonormal bases in Hilbert spaces is developed
in a systematic fashion in Naylor and Sell [33] and Oden [36], as
are the other topics in this chapter. Other good treatments of the topic
of orthonormal bases are to be found in Reed and Simon [40] and Apostol
[2], the latter devoting attention to the situation in $L^2$. The proof of
Lemma 1 may be found in Naylor and Sell. The text by Zauderer [52] is
a good source for applications-oriented material on Sturm-Liouville problems,
whereas Naylor and Sell treat the functional analytic aspects.

6.7 Exercises
Finite-dimensional spaces

6.1. Which of the following subsets of $\mathbb{R}^3$ is linearly independent?
(a) $\{(1, 4, 9), (1, 0, 9), (-1, 4, -9)\}$; (b) $\{(1, 4, 9), (1, 0, 9), (2, 4, 8)\}$.

6.3. Let $X$ be the set of solutions to the ordinary differential equation

\[ u'' - 2u' + u = 0, \quad u \in C^2[0, 1]. \]

Show that $X$ is a vector space, find $\dim X$, and display a basis for $X$.

6.4. Let M be the vector space of all real 3 x 3 matrices and K the subset
of matrices of the form

for all nonzero real numbers $\alpha$, $\beta$, $\gamma$, $\delta$. What are $\dim M$ and $\dim K$?
Display a basis for $K$.

6.5. Let $V$ and $W$ be proper subspaces of a finite-dimensional vector space
$X$ such that $X = V \oplus W$. Show that

\[ \dim(V \oplus W) = \dim V + \dim W. \]

6.6. Let $V$ be a subspace of a linear space $X$, with $\dim X = n$. Show (a)
that every linearly independent subset of $V$ is part of a basis for $X$;
and (b) that if $V$ is a proper subspace of $X$, then $\dim V < \dim X$.

Finite-dimensional inner product and normed spaces

6.7. Apply the Gram-Schmidt procedure to the vectors
$\{(1, 0, 1), (1, 0, -1), (0, 3, 4)\}$ to obtain an orthonormal basis for
$\mathbb{R}^3$.

6.8. Let $V$ be the four-dimensional subspace of $L^2(-1, 1)$ spanned by
$\{1, x, x^2, x^3\}$. Use the Gram-Schmidt procedure to construct an
orthonormal basis for $V$.

6.9. Test the set $\{u_1, u_2\} = \{e^x, e^{-3x}\}$ for linear dependence in $L^2(0, 1)$
by evaluating $\det A$ where $A_{ij} = (u_i, u_j)_{L^2}$.

6.10. Show that any norm $\|\cdot\|_1$ on a finite-dimensional space $X$ is equivalent
to any other norm $\|\cdot\|_2$ on $X$ (recall the definition (3.10) of equivalent
norms). [Lemma 1 may be useful.]

Linear operators on finite-dimensional spaces

6.11. Let $X$ and $Y$ be the spaces of polynomials of degree 3 and 1, respectively,
and let $\{1 + t, t(1 + t), 1 + t^3\}$ and $\{1, t\}$ be bases for $X$ and
$Y$, respectively. Let $T : X \to Y$ be the linear operator defined by

\[ Tp = d^2 p / dx^2. \]

Find the matrix corresponding to $T$.

6.12. Let $X$ be the linear space of all functions of the form $u(x) = \alpha +
\beta \cos x + \gamma \sin x$, $0 \le x \le 2\pi$, and define $T : X \to X$ by

\[ Tu = \int_0^{2\pi} [1 + \cos(x - \xi)] u(\xi) \, d\xi. \]

Find the matrix corresponding to $T$.

6.13. Let $T$ be an $n \times m$ matrix with transpose $T^t$, and consider the equation
$Ta = b$, where $a \in \mathbb{R}^m$ and $b \in \mathbb{R}^n$. Suppose that we wish to
solve for $a$: show that a necessary condition for such a solution to
exist is

\[ (c, b) = 0 \quad \text{for all } c \in N(T^t); \]

that is, $b \in N(T^t)^{\perp}$. (Note that $(x, Ty) = (T^t x, y)$.) This shows
that $R(T) \subset N(T^t)^{\perp}$. Show that $R(T) = N(T^t)^{\perp}$.
Determine $N(T^t)$ for the matrix

T~[!-~n
and hence find the general form of the vector b such that Ta = b.
6.14. Find a basis for the null space of the functional R. : lR3 ---> lR, (R., x) =
aIXI + a2X2 + a3X3, where aI 1= 0,

6.15. Prove Theorem 6, which states that there exists an isometric isomorphism
from any n-dimensional inner product space to $\mathbb{R}^n$. Show that
this does not hold in general for finite-dimensional normed spaces by
verifying that $(\mathbb{R}^2, \|\cdot\|_1)$ and $(\mathbb{R}^2, \|\cdot\|_2)$ are not isometrically
isomorphic.
6.16. Let $X = \mathbb{R}^3$ with the norm $\|\cdot\|_1$. If $\ell$ is as defined in Exercise 6.14,
find $\|\ell\|$.
Fourier series in Hilbert spaces
6.17. The set $\{1/\sqrt{2\pi},\ \cos kx,\ \sin kx,\ k = 1, 2, \ldots\}$ is an orthonormal basis
for $L^2(-\pi, \pi)$. Find the Fourier coefficients $u_i$ if (i) $u(x) = 1$; (ii)

$$u(x) = \begin{cases} -1, & -\pi \le x \le 0, \\ \phantom{-}1, & 0 < x \le \pi. \end{cases}$$

6.18. Determine the first three terms of the expansion $u = \sum_{k=0}^{\infty} u_k e_k$ on
$[-1, 1]$ when $e_k$ are the normalized Legendre polynomials and

$$u(x) = \begin{cases} -1, & -1 \le x \le 0, \\ \phantom{-}x, & 0 < x \le 1. \end{cases}$$

6.19. The trigonometric Fourier series representation

$$u(x) = \frac{u_0}{\sqrt{2}} + \sum_{k=1}^{\infty} \left(u_{2k}\cos k\pi x + u_{2k-1}\sin k\pi x\right)$$

can be written in complex form as $u(x) = \sum_{k=-\infty}^{\infty} c_k e^{ikx}$. Express
the coefficients $c_k$ in terms of $u_k$.
6.20. (a) If $\Phi$ is an orthonormal set in an inner product space X, derive
the Bessel inequality (part (b) of Theorem 7). [Use (6.15) to
show that the inequality in (6.14) holds for finite sums. Then
consider why it should hold for an infinite sum.]
(b) Derive Parseval's formula (part (c) of Theorem 7).
6.21. Let $\{\phi_1, \phi_2, \ldots, \phi_n\}$ be an orthonormal set in a Hilbert space H, and
let $V = \operatorname{span}\{\phi_1, \ldots, \phi_n\}$. Show that the orthogonal projection P of
H onto V is given by

$$Pu = \sum_{k=1}^{n} (u, \phi_k)\,\phi_k.$$

6.22. Let V be the subspace of $L^2(-1, 1)$ spanned by the first four orthonormal
Legendre polynomials $\phi_n = (n + \tfrac{1}{2})^{1/2} P_n(x)$, $n = 0, 1, \ldots, 3$.
Write out $\phi_n$ explicitly, find the orthogonal projection of $u(x) = x^4$
onto V, and verify the inequality $\|u - Pu\| \le \|u - v\|$, $v \in V$, for any
suitable choice of v.
Sturm-Liouville problems
6.23. Spherical coordinates $(r, \theta, \phi)$ are related to Cartesian coordinates
$(x, y, z)$ through

$$x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta,$$

and the Laplacian operator in spherical coordinates is given by

$$\nabla^2 u = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial u}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial u}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 u}{\partial\phi^2}.$$

Consider the problem of steady heat conduction in a body with spherical
symmetry.
(a) Assume that heat sources are absent, and that there is no dependence
on $\phi$; then use the method of separation of variables
to obtain two problems with independent variables r and $\theta$,
respectively. By making the change of variable $\xi = \cos\theta$, show
that the second of these equations reduces to Legendre's equation
(Example 15). Hence obtain the general solution to this
problem.
(b) A known temperature distribution $f(\theta)$ is maintained over the
surface of a sphere of unit radius. Find the steady-state temperature
at any point in the region $\{(r, \theta, \phi) : r < 1\}$.
6.24. Determine the eigenvalues (approximately) and the eigenfunctions of
the problem

$$v'' + \lambda v = 0, \quad v(0) = 0, \quad v'(1) + \beta v(1) = 0,$$

where $\beta > 0$. Verify that the properties of Sturm-Liouville eigenpairs
are satisfied. Which boundary conditions in the IBVP for the heat
equation would give rise to this problem?
6.25. Show that the Sturm-Liouville operator is positive, that its eigenvalues
are nonnegative, and that they form a countable set. [For the last
part of this exercise use the fact that $L^2$ is separable, together with
the properties of the set of eigenfunctions.]
6.26. If R represents the Rayleigh quotient, show that
$\min\{R(u) : (u, \phi_1) = \cdots = (u, \phi_{n-1}) = 0\} = \lambda_n$, where $(\cdot, \cdot)$ is the usual weighted inner
product. Show also that R is minimized by the nth eigenfunction $\phi_n$.
6.27. Show that $(Ls_n, r_n) = 0$ in the proof of Theorem 12.
6.28. Modify, where necessary, the theory for regular Sturm-Liouville problems
to accommodate the following singular problems.
(a) $-(pu')' + qu = \lambda wu$, $x \in (-\infty, \infty)$;
(b) $-(pu')' = \lambda wu$ with $u(-L) = u(L)$, $u'(-L) = u'(L)$.
6.29. The Schrödinger operator S defined by

$$Su = -\frac{d^2u}{dx^2} + a^2x^2u, \quad x \in (-\infty, \infty),$$

arises in the study of the harmonic oscillator, and is a singular
Sturm-Liouville operator.

(a) Show that S is symmetric on the space $C_0^2(-\infty, \infty)$ of twice
continuously differentiable functions with compact support.
(b) Consider the Sturm-Liouville problem $Su = \lambda u$; make the changes
of variables $y = \sqrt{a}\,x$ and $\mu = \lambda/a$ to obtain the equation

$$-\frac{d^2u}{dy^2} + y^2u = \mu u; \qquad (6.37)$$

then make the further change of variable $v = u\exp(-x^2/2)$ to
obtain the Hermite differential equation

$$v'' - 2yv' + \lambda v = 0. \qquad (6.38)$$



(c) The Hermite polynomials $H_n$ may be defined by the Rodrigues
formula

$$H_n(x) = (-1)^n e^{x^2}\frac{d^n}{dx^n}e^{-x^2}, \quad n = 0, 1, 2, \ldots.$$

Show that $H_n' = 2nH_{n-1}$ and that $H_{n+1} - 2xH_n + H_n' = 0$,
and that $H_n$ is a solution of (6.37) with $\lambda = 2n$. The functions
$\phi_n$ $(n = 0, 1, \ldots)$ defined by

$$\phi_n(x) = (\sqrt{\pi}\,2^n n!)^{-1/2} H_n(x)\exp(-x^2/2)$$

are an orthonormal collection of eigenfunctions for the problem
(6.38), and form a basis for $L^2(-\infty, \infty)$.
7
Distributions and Sobolev spaces

In the Introduction the relevance of qualitative information about boundary
value problems and their solutions was discussed, by way of motivating the
introduction to functional analysis. Given the problem of finding a function
u that satisfies a partial differential equation and one or more boundary
conditions, it is clearly of great value to know beforehand whether such a
solution exists and, if so, whether it is unique, and finally how smooth this
function is.
When approaching such issues one is essentially dealing with the properties
of an operator (in this case a partial differential operator) A from one
space of functions X to another space Y, so to start with it is necessary to
choose suitable spaces for X and Y. The spaces $C^m(\Omega)$ would appear at
first sight to be appropriate since they are spaces of m-times continuously
differentiable functions and we are, after all, dealing with differential
equations. However, these spaces suffer from the drawback that, although $C(\bar{\Omega})$
with the sup-norm, and $C^m(\bar{\Omega})$ with a corresponding norm, are Banach
spaces, these are not Hilbert spaces. Also, the spaces $C^m(\Omega)$ are not the
natural spaces in which to consider variational formulations of problems,
and such formulations lie at the heart of our treatment of boundary value
problems and their approximation by finite elements.
The Sobolev spaces provide, as we show, a very natural setting for boundary
value problems. First of all, there is a category of Sobolev spaces that
are Hilbert spaces; second, it is possible to obtain quite general results
regarding existence and uniqueness of solutions in a variational setting,
using these spaces. A third advantage is that, like the spaces $C^m(\Omega)$, Sobolev
spaces provide a means of characterizing the degree of smoothness of
functions. Finally, and perhaps of most importance, is the fact that approximate
solution methods such as the Galerkin and finite element methods are most
conveniently and correctly formulated in finite-dimensional subspaces of
Sobolev spaces.
In order to discuss Sobolev spaces it is necessary first of all to learn a little
about distributions. This necessary background is provided in Sections 7.1
and 7.2. Then in Section 7.3 we introduce Sobolev spaces, and discuss some
of their more important properties. The apparently innocuous question of
how one obtains the value of a function on the boundary of a domain,
given the function on the domain, is shown in Section 7.4 to be a nontrivial
problem. We show that a function has to satisfy certain requirements in
order for its values on the boundary to be defined unambiguously. Finally, in
Section 7.5 we discuss the Sobolev spaces $H_0^m(\Omega)$ of functions that, together
with their derivatives of order less than m, vanish on the boundary. We also
introduce in this section the space $H^{-m}(\Omega)$ (for $m > 0$), which is defined
to be the dual space of $H_0^m(\Omega)$.

7.1 Distributions
In this section and in those that follow it is often necessary to deal with
partial derivatives of all orders, and when discussing general ideas the
notation can sometimes become very clumsy. As a prelude to the main topic
of this chapter the very useful multi-index notation for partial derivatives
is introduced.

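Multi-index notation. Let $\mathbb{Z}^n_+$ denote the set of all ordered n-tuples of
nonnegative integers: a member of $\mathbb{Z}^n_+$ is usually denoted by $\alpha$ or $\beta$, where,
for example,

$$\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n),$$

each component $\alpha_i$ being a nonnegative integer.
We denote by $|\alpha|$ the sum $|\alpha| = \alpha_1 + \alpha_2 + \cdots + \alpha_n$ and by $D^\alpha u$ the
partial derivative

$$D^\alpha u = \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1}\,\partial x_2^{\alpha_2} \cdots \partial x_n^{\alpha_n}}.$$

Thus if $|\alpha| = m$, then $D^\alpha u$ denotes one of the mth partial derivatives of u.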

Examples

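1. If n = 3, then a multi-index $\alpha \in \mathbb{Z}^3_+$ is an ordered triple of nonnegative
integers. For example, $\alpha = (1, 0, 3)$ belongs to $\mathbb{Z}^3_+$, with
$|\alpha| = \alpha_1 + \alpha_2 + \alpha_3 = 1 + 0 + 3 = 4$. Furthermore, in this case the partial
derivative $D^\alpha u$ is the fourth derivative defined by

$$D^\alpha u = \frac{\partial^4 u}{\partial x_1^1\,\partial x_2^0\,\partial x_3^3} = \frac{\partial^4 u}{\partial x_1\,\partial x_3^3}.$$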
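2. Let n = 2, and consider the expression

$$I = \sum_{|\alpha| \le 2} a_\alpha D^\alpha u,$$

where $a_\alpha$ are given functions of x and y. Thus

$$I = \sum_{|\alpha| = 0} a_\alpha D^\alpha u + \sum_{|\alpha| = 1} a_\alpha D^\alpha u + \sum_{|\alpha| = 2} a_\alpha D^\alpha u.$$

When $|\alpha| = 0$ the only possibility is $\alpha = (0, 0)$ (remember that n = 2
here, so we are dealing with ordered pairs); the other values are

$|\alpha| = 1$: $\alpha = (0, 1)$ and $(1, 0)$,
$|\alpha| = 2$: $\alpha = (2, 0)$ and $(1, 1)$ and $(0, 2)$.

Suppose now that the functions $a_\alpha$ are given as

$$a_{00} = 1, \quad a_{10} = a_{01} = 2x, \quad a_{20} = a_{02} = 1, \quad a_{11} = x^2,$$

where we have written, for example, $a_{10}$ for $a_{(1,0)}$. Then

$$\sum_{|\alpha| = 1} a_\alpha D^\alpha u = a_{10}\frac{\partial u}{\partial x^1\,\partial y^0} + a_{01}\frac{\partial u}{\partial x^0\,\partial y^1} = 2x\left(\frac{\partial u}{\partial x} + \frac{\partial u}{\partial y}\right),$$

$$\sum_{|\alpha| = 2} a_\alpha D^\alpha u = a_{20}\frac{\partial^2 u}{\partial x^2\,\partial y^0} + a_{11}\frac{\partial^2 u}{\partial x^1\,\partial y^1} + a_{02}\frac{\partial^2 u}{\partial x^0\,\partial y^2} = 1 \cdot \frac{\partial^2 u}{\partial x^2} + x^2\frac{\partial^2 u}{\partial x\,\partial y} + 1 \cdot \frac{\partial^2 u}{\partial y^2}.$$

Collecting all terms, it turns out that

$$\sum_{|\alpha| \le 2} a_\alpha D^\alpha u = \frac{\partial^2 u}{\partial x^2} + x^2\frac{\partial^2 u}{\partial x\,\partial y} + \frac{\partial^2 u}{\partial y^2} + 2x\left(\frac{\partial u}{\partial x} + \frac{\partial u}{\partial y}\right) + u.$$

Hence $\sum_{|\alpha| \le k} a_\alpha D^\alpha u$ is, in general, shorthand for a linear combination
of partial derivatives of u, up to and including those of order k.
The advantages of using multi-index notation should be evident from
this simple example.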
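The bookkeeping behind a sum over $|\alpha| \le k$ can be sketched in a few lines of Python. The helper name `multi_indices` is hypothetical and introduced only for illustration; it lists the multi-indices in $\mathbb{Z}^n_+$ grouped by their order $|\alpha|$.

```python
from itertools import product


def multi_indices(n, k):
    """Return all multi-indices alpha in Z^n_+ with |alpha| <= k, grouped by |alpha|."""
    by_order = {m: [] for m in range(k + 1)}
    # Each component lies between 0 and k; keep the tuples whose sum is at most k.
    for alpha in product(range(k + 1), repeat=n):
        order = sum(alpha)
        if order <= k:
            by_order[order].append(alpha)
    return by_order


if __name__ == "__main__":
    # n = 2, k = 2 gives one, two and three multi-indices of order 0, 1 and 2.
    for order, alphas in multi_indices(2, 2).items():
        print(order, alphas)
```

For n = 2 and k = 2 this yields six multi-indices in total, one for each term of the expression I in Example 2.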
FIGURE 7.1. An example of a member of $\mathcal{D}(\Omega)$

In Section 5.4 we discussed an example that showed that the Dirac delta
$\delta$ is not a function at all, but is more correctly viewed as a continuous linear
functional, in that it operates on a continuous function u to produce a real
number, namely, u(0):

$$\delta : C[-1, 1] \to \mathbb{R}, \qquad (\delta, u) = u(0).$$

The Dirac delta belongs to a rather special space of functionals called
distributions, and these in turn play a central role in the definition of Sobolev
spaces. In order to introduce distributions formally, we first set up a space
of very smooth functions on which these distributions can operate.

The space $\mathcal{D}(\Omega)$. For reasons that become evident later, it is desirable to
consider the action of distributions not on all of $C(\Omega)$, but on only the small
subset $C_0^\infty(\Omega)$ of infinitely differentiable functions with compact support;
the notion of functions having compact support was, of course, introduced
in Section 4.4. In the context of distribution theory it is conventional to
use the notation $\mathcal{D}(\Omega)$ for $C_0^\infty(\Omega)$, and to refer to $\mathcal{D}(\Omega)$ as the space of test
functions, because it is against functions in this space that distributions
are tested, in a sense to be made precise.

Example
3. A canonical example of a member of $\mathcal{D}(\Omega)$ is the function

$$\phi(x) = \begin{cases} \ldots, & |x| \ge a, \\ \ldots, & |x| < a, \end{cases}$$

defined on $\Omega = (-b, b)$, where $b > a > 0$, as shown in Figure 7.1. It
is not difficult to show that $\phi$ is infinitely differentiable, and that the
support of $\phi$ and all of its derivatives is the set $[-a, a]$.
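A test function of this kind can be sketched in Python. The specific formula used here, $\exp(-a^2/(a^2 - x^2))$ for $|x| < a$ and zero otherwise, is the standard bump function and is an assumption made for illustration; it need not coincide with the expression in Example 3. The name `bump` is likewise hypothetical.

```python
import math


def bump(x, a=0.5):
    """Assumed bump function: exp(-a^2 / (a^2 - x^2)) for |x| < a, zero for |x| >= a."""
    d = a * a - x * x
    if d <= 0.0:
        # Outside (or on the edge of) the support [-a, a].
        return 0.0
    return math.exp(-a * a / d)


if __name__ == "__main__":
    for x in (-0.6, -0.5, -0.25, 0.0, 0.25, 0.5, 0.6):
        print(f"{x:+.2f}  {bump(x):.6e}")
```

The values decay to zero extremely quickly as $|x|$ approaches a, which is what allows every derivative to vanish at the edge of the support.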
It is possible to provide the space $\mathcal{D}(\Omega)$ with a topology known as an
inductive limit topology, but such considerations are rather complicated.
Fortunately, the only topological concept we require is the notion of
convergence of sequences in $\mathcal{D}(\Omega)$, for which the following definition suffices.

Convergence in $\mathcal{D}(\Omega)$. Let $\{\phi_n\}$ be a sequence of functions in $\mathcal{D}(\Omega)$. Then
this sequence is said to converge to $\phi \in \mathcal{D}(\Omega)$ if
(a) there is a fixed compact set K in $\Omega$ that contains the supports of all
$\phi_n$; and
(b) the sequence $\{D^\alpha\phi_n\}$ converges uniformly on K to $D^\alpha\phi$ for any $\alpha$.

Distributions. We define a distribution on a domain $\Omega$ in $\mathbb{R}^n$ to be a
continuous linear functional on $\mathcal{D}(\Omega)$. That is, a distribution is a continuous
linear map from $\mathcal{D}(\Omega)$ to $\mathbb{R}$. Thus the space of distributions is the dual
space of $\mathcal{D}(\Omega)$ and, in keeping with the notation introduced in Section 5.4
for dual spaces, we denote the space of distributions by $\mathcal{D}'(\Omega)$.
Again, the topological notions that are required are best defined through
the actions of sequences. Thus, to say that f is continuous on $\mathcal{D}(\Omega)$ means
that for every convergent sequence $\{\phi_n\}$ in $\mathcal{D}(\Omega)$, with limit $\phi$,

$$(f, \phi_n) \to (f, \phi) \quad \text{as } n \to \infty.$$

Example

4. By now the idea of the Dirac delta as a distribution should be a
familiar one. In fact, $\delta$ belongs to $\mathcal{D}'(-a, a)$ for any $a > 0$ since it is
more generally defined by

$$\delta : \mathcal{D}(-a, a) \to \mathbb{R}, \qquad (\delta, \phi) = \phi(0), \quad \phi \in \mathcal{D}(\Omega),$$

and is therefore a continuous linear functional on $\mathcal{D}(\Omega)$.

Regular distributions. It is not only highly irregular objects such as the
Dirac delta that are distributions. In fact, there are many ordinary functions
that can be identified with distributions. All we require of a function
f is that the integral $\int_K |f(x)|\,dx$ be finite on every compact subset K of $\Omega$.
When this is so, f is said to be locally integrable on $\Omega$, and a distribution
F associated with f can then be defined in a very natural way by

$$F : \mathcal{D}(\Omega) \to \mathbb{R}, \qquad (F, \phi) = \int_\Omega f\phi\,dx, \quad \phi \in \mathcal{D}(\Omega).$$

If the support of $\phi$ is $K \subset \Omega$, then

$$|(F, \phi)| = \left|\int_\Omega f\phi\,dx\right| = \left|\int_K f\phi\,dx\right| \le \sup_{x \in K}|\phi(x)| \int_K |f(x)|\,dx,$$

which is finite, and so we are assured that $(F, \phi)$ has meaning. Under these
circumstances F is said to be the distribution generated by f. In future the
different notation for a function (f) and its associated distribution (F) is
dispensed with, and we simply write f for both quantities. Whether f is
the function or the distribution that it generates is clear from the context;
for example, f in the expression $\int_\Omega f\phi\,dx$ is clearly a function whereas f
in $(f, \phi)$ is a distribution.

Examples

5. Every bounded continuous function is locally integrable and hence
generates a distribution, but there are also many irregular and discontinuous
functions that are locally integrable. One such example is
the function $f(x) = |x|^{-1/2}$ on $[-1, 1]$. This function has a singularity
at the origin, but is locally integrable since the integral

$$\int_a^b |x|^{-1/2}\,dx$$

exists and is bounded for every closed interval $[a, b]$ in $[-1, 1]$. Thus f
generates the distribution, also denoted by f, and defined by

$$(f, \phi) = \int_{-1}^{1} |x|^{-1/2}\phi(x)\,dx.$$

6. The step function H(x), defined on $[-1, 1]$ by

$$H(x) = \begin{cases} 0, & -1 \le x < 0, \\ 1, & 0 \le x \le 1, \end{cases}$$

is locally integrable and generates the distribution H that satisfies

$$(H, \phi) = \int_{-1}^{1} H(x)\phi(x)\,dx = \int_0^1 \phi(x)\,dx.$$
A distribution that is generated by a locally integrable function is called
a regular distribution. If a distribution cannot be generated by a locally
integrable function, it is then said to be a singular distribution. An important
example of a singular distribution is the Dirac delta; it is not difficult
to show (see Exercise 7.2) that there does not exist a locally integrable
function that generates $\delta$.
It is possible to define in a very natural way the product of a function
and a distribution. Specifically, if $\Omega \subset \mathbb{R}^n$, u belongs to $C^\infty(\Omega)$, and f is a
distribution on $\Omega$, then by uf we understand the distribution satisfying

$$((uf), \phi) = (f, u\phi) \quad \text{for all } \phi \in \mathcal{D}(\Omega). \qquad (7.1)$$

Note that (7.1) is a generalization of the trivial identity

$$\int_\Omega [u(x)f(x)]\phi(x)\,dx = \int_\Omega f(x)[u(x)\phi(x)]\,dx$$

that holds when f is locally integrable.

Example
7. The distribution $u\delta$ on $(-1, 1)$, where $u(x) = x$, satisfies

$$(x\delta, \phi) = (\delta, x\phi) = (x\phi)|_{x=0} = 0 \quad \text{for all } \phi \in \mathcal{D}(-1, 1).$$

Hence $x\delta = 0$, the zero distribution.
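The product rule (7.1) translates directly into code if a distribution is modeled as a Python callable that takes a test function and returns a number. This is only a sketch: the names `delta`, `times` and `bump` are hypothetical, and the bump function is an assumed stand-in for a member of $\mathcal{D}(-1, 1)$.

```python
import math


def delta(phi):
    """Dirac delta as a functional: (delta, phi) = phi(0)."""
    return phi(0.0)


def times(u, f):
    """Product of a smooth function u and a distribution f, following (7.1):
    ((uf), phi) = (f, u * phi)."""
    return lambda phi: f(lambda x: u(x) * phi(x))


def bump(x, a=0.5):
    """Assumed test function with support [-a, a] inside (-1, 1)."""
    d = a * a - x * x
    return math.exp(-a * a / d) if d > 0.0 else 0.0


# The distribution x * delta of Example 7.
x_delta = times(lambda x: x, delta)

if __name__ == "__main__":
    print(delta(bump))    # phi(0)
    print(x_delta(bump))  # (x * phi)(0)
```

Applying `x_delta` to any test function evaluates $x\phi(x)$ at the origin, so the result is zero regardless of the choice of $\phi$.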

7.2 Derivatives of distributions


Quantities such as the Dirac delta and the Heaviside step function do not
have derivatives in the ordinary sense. However, if they are treated as
distributions it is possible to extend the concept of a derivative in such a way
that any number of derivatives can be defined for these quantities and,
indeed, for any distribution. Furthermore, we can define the derivative of a
distribution in such a way that, should the distribution be (generated by)
a continuously differentiable function, then the classical notion of a derivative
is recovered. To this end we appeal to Green's theorem; the classical
version of this well-known theorem states that the identity

$$\int_\Omega u\frac{\partial v}{\partial x_i}\,dx = \int_\Gamma uv\nu_i\,ds - \int_\Omega v\frac{\partial u}{\partial x_i}\,dx \qquad (7.2)$$

holds for all functions u, v in $C^1(\bar{\Omega})$, where $\nu_i$ is the ith component of
the outward unit normal vector $\nu$ to the boundary $\Gamma$ of a domain $\Omega$, the
boundary being assumed to be sufficiently smooth (Figure 7.2). The
one-dimensional version of (7.2), the integration-by-parts formula, states that

$$\int_a^b uv'\,dx = [uv]_a^b - \int_a^b vu'\,dx. \qquad (7.3)$$

Indeed, (7.3) is just a special case of (7.2) with $\Omega = (a, b)$, $\Gamma = \{a, b\}$, and
$\nu = \pm 1$ at $x = b$ and $a$, respectively.
The theorem is easily generalized to a result involving partial derivatives
of order m of functions $u, v \in C^m(\bar{\Omega})$ (see Exercise 7.4): by replacing u by
$D^\alpha u$ in (7.2), and with $|\alpha| = m$, one can show that

$$\int_\Omega (D^\alpha u)v\,dx = (-1)^{|\alpha|}\int_\Omega uD^\alpha v\,dx + \int_\Gamma h(u, v)\,ds, \qquad (7.4)$$

where $h(u, v)$ is an expression involving a sum of products of derivatives of
u and v of order less than m.

FIGURE 7.2. A domain $\Omega$ with boundary $\Gamma$ and outward unit normal $\nu$
Now replace v in (7.4) by a function $\phi$ belonging to $\mathcal{D}(\Omega)$. Then since
$\phi = 0$ on the boundary, (7.4) becomes

$$\int_\Omega (D^\alpha u)\phi\,dx = (-1)^{|\alpha|}\int_\Omega uD^\alpha\phi\,dx. \qquad (7.5)$$

Since u is m-times continuously differentiable it generates a distribution,
also denoted by u, so that

$$(u, \phi) = \int_\Omega u\phi\,dx$$

or, since $D^\alpha\phi$ also belongs to $\mathcal{D}(\Omega)$,

$$(u, D^\alpha\phi) = \int_\Omega uD^\alpha\phi\,dx.$$

Furthermore, $D^\alpha u$ is continuous so it is able to generate a regular
distribution (denoted by $D^\alpha u$) satisfying

$$(D^\alpha u, \phi) = \int_\Omega (D^\alpha u)\phi\,dx.$$

Hence (7.5) can be written as

$$(D^\alpha u, \phi) = (-1)^{|\alpha|}(u, D^\alpha\phi) \quad \text{for all } \phi \in \mathcal{D}(\Omega). \qquad (7.6)$$

We take (7.6) as the basis for defining the derivative of any distribution
f, as follows. The $\alpha$th distributional or generalized partial derivative of a
distribution f is defined to be a distribution, denoted by $D^\alpha f$, that satisfies

$$(D^\alpha f, \phi) = (-1)^{|\alpha|}(f, D^\alpha\phi) \quad \text{for all } \phi \in \mathcal{D}(\Omega). \qquad (7.7)$$
Thus we use the same notation for the generalized derivative of a distribution
as that used for the conventional derivative of a function. Of course,
if the function belongs to $C^m(\Omega)$, then the generalized derivative coincides
with the conventional $\alpha$th partial derivative for $|\alpha| \le m$, as can be seen
immediately from (7.5) and (7.6).
For the special case of first derivatives the multi-index notation can be
dispensed with, in which case (7.7) becomes

$$\left(\frac{\partial f}{\partial x_i}, \phi\right) = -\left(f, \frac{\partial\phi}{\partial x_i}\right).$$

Furthermore, for the case $\Omega = (a, b) \subset \mathbb{R}$ all derivatives are with respect to
x only, and so (7.7) becomes

$$\left(\frac{d^m f}{dx^m}, \phi\right) = (-1)^m\left(f, \frac{d^m\phi}{dx^m}\right). \qquad (7.8)$$

Examples

8. The first generalized derivative of the Heaviside step function H(x)
is the distribution H' satisfying, for all test functions $\phi$,

$$\begin{aligned}
(H', \phi) &= (-1)^1\left(H, \frac{d\phi}{dx}\right) \\
&= -\int_{-1}^{1} H(x)\frac{d\phi}{dx}\,dx \qquad \text{(H is locally integrable)} \\
&= -\int_0^1 \frac{d\phi}{dx}\,dx = -[\phi]_0^1 = \phi(0) = (\delta, \phi)
\end{aligned}$$

so that, symbolically, $H' = \delta$; that is, the derivative H' of the step
function is the Dirac delta.
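The chain of equalities in Example 8 can be checked numerically. The sketch below assumes a particular test function (the standard bump with support $[-0.5, 0.5]$, which is not taken from the text) and uses a composite Simpson rule; the helper names `phi`, `dphi` and `simpson` are hypothetical.

```python
import math


def phi(x, a=0.5):
    """Assumed test function: smooth bump supported on [-a, a]."""
    d = a * a - x * x
    return math.exp(-a * a / d) if d > 0.0 else 0.0


def dphi(x, a=0.5):
    """Classical derivative of phi (zero outside the support)."""
    d = a * a - x * x
    p = phi(x, a)
    if d <= 0.0 or p == 0.0:
        return 0.0
    return p * (-2.0 * a * a * x / (d * d))


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0


# (H', phi) = -(H, phi') = -int_0^1 phi'(x) dx, because H = 1 on [0, 1] and 0 on [-1, 0).
pairing_with_H_prime = -simpson(dphi, 0.0, 1.0)
# (delta, phi) = phi(0).
pairing_with_delta = phi(0.0)

if __name__ == "__main__":
    print(pairing_with_H_prime, pairing_with_delta)
```

The two numbers agree up to quadrature error, which illustrates $H' = \delta$ for this one test function; the identity itself holds for every $\phi \in \mathcal{D}(-1, 1)$.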

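9. The ramp function R(x) on $\Omega = (-1, 1) \times (-1, 1) \subset \mathbb{R}^2$ is defined by

$$R(x) = \begin{cases} xy & \text{if } x \ge 0,\ y \ge 0, \\ 0 & \text{if } x < 0 \text{ or } y < 0. \end{cases}$$

The generalized derivative $D^{(1,0)}R = \partial R/\partial x$ is found from

$$\begin{aligned}
-\left(R, \frac{\partial\phi}{\partial x}\right) &= -\int_{-1}^{1}\int_{-1}^{1} R(x)\frac{\partial\phi}{\partial x}\,dx\,dy \qquad \text{(R is locally integrable)} \\
&= -\int_0^1\int_0^1 xy\frac{\partial\phi}{\partial x}\,dx\,dy = \int_0^1\int_0^1 y\phi\,dx\,dy
\end{aligned}$$

after using Green's theorem (7.2). Furthermore, $D^{(1,1)}R = \partial^2 R/\partial x\,\partial y$
is found from

$$\begin{aligned}
(-1)^2\left(R, \frac{\partial^2\phi}{\partial x\,\partial y}\right) &= \int_0^1\int_0^1 xy\frac{\partial^2\phi}{\partial x\,\partial y}\,dx\,dy \\
&= \int_0^1\int_0^1 \phi\,dx\,dy \qquad \text{(applying Green's theorem twice)} \\
&= \int_\Omega H(x)\phi(x)\,dx\,dy,
\end{aligned}$$

where H is the two-dimensional step function:

$$H(x) = \begin{cases} 1 & \text{if } x \ge 0,\ y \ge 0, \\ 0 & \text{if } x < 0 \text{ or } y < 0. \end{cases}$$

Hence

$$\left(\frac{\partial^2 R}{\partial x\,\partial y}, \phi\right) = (H, \phi) \quad \text{so that } D^{(1,1)}R = H.$$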

Weak derivatives. Suppose that a function u is locally integrable so that
it generates a distribution, also denoted by u, that satisfies

$$(u, \phi) = \int_\Omega u\phi\,dx \quad \text{for all } \phi \in \mathcal{D}(\Omega).$$

Furthermore, the distribution u possesses distributional derivatives of all
orders: in particular, the derivative $D^\alpha u$ is defined by (7.7). Of course $D^\alpha u$
may or may not be a regular distribution; if it is a regular distribution,
then naturally it is generated by a locally integrable function so that

$$(D^\alpha u, \phi) = \int_\Omega D^\alpha u(x)\phi(x)\,dx. \qquad (7.9)$$

It follows in this case from (7.7) and (7.9) that the functions u and $D^\alpha u$
are related by

$$\int_\Omega D^\alpha u(x)\phi(x)\,dx = (-1)^m\int_\Omega u(x)D^\alpha\phi(x)\,dx$$

for $|\alpha| = m$. We call the function (more precisely, the equivalence class of
functions; see the discussion following Example 10) $D^\alpha u$ obtained in this
way the $\alpha$th weak derivative of the function u. Of course, if u is sufficiently
smooth to belong to $C^m(\Omega)$, then its weak derivatives $D^\alpha u$ coincide with
its classical derivatives for $|\alpha| \le m$.
A remark concerning notation is in order here. We have reached the stage
where $D^\alpha u$ may represent the classical partial derivative of a function, or
the weak partial derivative of a function, or the generalized derivative of a
distribution (possibly generated by a function). For the most part it should
be clear from the context exactly which derivative is being used, but should
there be any danger of ambiguity it is made quite clear exactly what $D^\alpha u$
stands for. The same applies to the notation $\partial u/\partial x$, and the like; this may
represent any one of the various types of derivatives.

FIGURE 7.3. The function $u(x) = |x|$ and its weak derivative

Example
10. The function $u(x) = |x|$ belongs to $C[-1, 1]$, but the classical derivative
u' does not exist, in that it is not defined at the origin. However,
the weak derivative of u is the function

$$u' = \begin{cases} -1 & \text{for } -1 \le x < 0, \\ +1 & \text{for } 0 \le x \le 1 \end{cases}$$

(see Figure 7.3), since the identity $\int_{-1}^{1} u'\phi\,dx = -\int_{-1}^{1} u\phi'\,dx$ is easily
shown to hold. Note furthermore that $u' \in L^2(-1, 1)$, and is therefore
of course locally integrable.
The preceding example illustrates one fundamental difference between classical
and weak derivatives. The classical derivative, if it exists, is a function
defined pointwise on an interval, so it must be at least continuous. A weak
derivative, on the other hand, need only be locally integrable. Thus any
function v differing from a weak derivative u' on a set of measure zero
(for example, at a finite number of points in the real line) is itself a weak
derivative of u.

Distributional differential equations. Since we now have at our disposal
the concept of the derivative of a distribution, it is natural to consider
next differential equations involving distributions. For example, suppose
that we are required to find the distribution g that satisfies

$$g' = f \qquad (7.10)$$

for a given distribution f, on some interval of the real line. If f and g
were ordinary functions (for example, $f \in C[a, b]$ and $g \in C^1[a, b]$), then
(7.10) would be a simple first order differential equation. Since f and g are
actually distributions, we go back to the definition (7.8) of a generalized
derivative; then (7.10) really reads

$$(g', \phi) = (f, \phi) \quad \text{or} \quad -(g, \phi') = (f, \phi) \quad \text{for all } \phi \in \mathcal{D}(a, b). \qquad (7.11)$$

If by (7.10) we understand (7.11), then (7.10) is said to be a distributional
differential equation.
The same procedure applies in the case of more general differential equations.
For example, suppose that we are required to find the distribution g
satisfying

$$Ag = f, \qquad (7.12)$$

where A is the (generalized) differential operator given by

We interpret (7.12) as a differential equation involving generalized derivatives
of g, and seek g such that

$$(Ag, \phi) = (f, \phi) \quad \text{for all } \phi \in \mathcal{D}(a, b),$$

which is equivalent to

$$(g, A^*\phi) = (f, \phi), \quad \phi \in \mathcal{D}(a, b),$$

the operator $A^*$ resulting from successive applications of (7.1) and (7.7);
thus

(7.13)

Generally, for partial differential equations involving distributions the same
procedure is adopted. The problem of finding a distribution g satisfying

$$Ag = f, \qquad (7.14)$$

where $Ag = \sum_{|\alpha| \le k} a_\alpha D^\alpha g$, is equivalent to the problem of finding g such
that

$$(Ag, \phi) = (f, \phi) \quad \text{or} \quad (g, A^*\phi) = (f, \phi), \qquad (7.15)$$

where $A^*$ is obtained, as in (7.13), by repeated application of (7.1) and
(7.7).
Naturally one would expect that if f is continuous (that is, a distribution
generated by a continuous function), then the solution g should be a function
that is k times continuously differentiable. This is indeed so; in other
words, when the distributions involved are generated by sufficiently differentiable
functions, we recover the classical concept of a differential equation.
In this case g is called a classical solution. More generally, though, if
f is a regular distribution generated by a function that is locally integrable
but not continuous, or indeed if it is a singular distribution, then equation
(7.14) cannot be expected to have any meaning in the classical sense. The
solution in this case is called a weak or generalized solution.
When g satisfies an equation of the form (7.14) or (7.15) we say that $Ag = f$
in the sense of distributions, or that g satisfies (7.14) distributionally.

Examples

11. The equation

$$xg' = 0 \quad \text{on } \Omega = (-1, 1) \qquad (7.16)$$

has the classical solution g = constant. But if (7.16) is regarded as a
distributional differential equation, then the weak solution is

$$g(x) = c_1H(x) + c_2,$$

where $c_1$ and $c_2$ are constants. We check as follows: $g' = c_1\delta$ so that

$$(xg', \phi) = (g', x\phi) = c_1(\delta, x\phi) = c_1[(x\phi)(0)] = 0;$$

hence $xg' = 0$ in the sense of distributions.

12. The equation $g'' = \delta'$ has no classical solution on $(-1, 1)$ but its weak
solution is

This is verified by considering the fact that

$$(g'', \phi) = (H'', \phi) = (H, \phi'') = \int_0^1 \phi''\,dx = -\phi'(0) = -(\delta, \phi') = (\delta', \phi).$$

7.3 The Sobolev spaces $H^m(\Omega)$


Before we actually get down to discussing Sobolev spaces, it is appropriate
at this stage to elaborate on the degree of smoothness that we expect the
boundary $\Gamma$ of a domain $\Omega$ in $\mathbb{R}^n$ $(n \ge 2)$ to have, since some results
concerning Sobolev spaces hold only when the boundary is sufficiently smooth.
Let $\Omega$ be a domain in $\mathbb{R}^n$ $(n \ge 2)$ with boundary $\Gamma$. Let $x_0$ be an arbitrary
point on $\Gamma$ and construct $B(x_0, \epsilon)$, the open ball of radius $\epsilon$, center $x_0$, for
some $\epsilon > 0$; that is, $B(x_0, \epsilon) = \{x \in \mathbb{R}^n : |x - x_0| < \epsilon\}$.

FIGURE 7.4. A local coordinate system for classifying the boundary of a domain,
and examples of Lipschitz and non-Lipschitz domains

Next, set up a coordinate system $(\xi_1, \ldots, \xi_n)$ such that the segment
$\Gamma \cap B(x_0, \epsilon)$ can be expressed as the function

$$\xi_n = f(\xi_1, \ldots, \xi_{n-1}).$$

If the function f is m-times continuously differentiable for every $x_0 \in \Gamma$,
we say that $\Gamma$ is of class $C^m$; $\Gamma$ is said to be Lipschitz if f is
Lipschitz-continuous, that is, if there is a constant k such that

$$|f(\xi) - f(\eta)| \le k|\xi - \eta|,$$

where $\xi = (\xi_1, \ldots, \xi_{n-1})$ and $\eta = (\eta_1, \ldots, \eta_{n-1})$ (recall that a
Lipschitz-continuous function is uniformly continuous). The situation is illustrated in
Figure 7.4 for n = 2. Unless otherwise stated $\Gamma$ is assumed to be Lipschitz;
this includes, in $\mathbb{R}^2$, boundaries that are triangular, rectangular, and annular,
whereas in $\mathbb{R}^3$ tetrahedra and cubes are Lipschitz. Boundaries that
are not Lipschitz include those with cusps and those that have the domain
$\Omega$ on both sides, as shown in Figure 7.4.
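The difference between a corner and a cusp can be seen by estimating the Lipschitz constant of the local boundary function f on a grid. The sketch below uses two hypothetical boundary functions chosen for illustration: $f(\xi) = |\xi|$ (a corner) and $f(\xi) = \sqrt{|\xi|}$ (a cusp). On a uniform grid, the largest difference quotient over all pairs of grid points equals the largest quotient between neighbouring points, so only neighbours are compared.

```python
import math


def lipschitz_estimate(f, lo, hi, n):
    """Largest difference quotient of f between neighbouring points of a uniform grid
    with n subintervals on [lo, hi]."""
    h = (hi - lo) / n
    best = 0.0
    for i in range(n):
        s, t = lo + i * h, lo + (i + 1) * h
        best = max(best, abs(f(t) - f(s)) / (t - s))
    return best


def corner(xi):
    """Hypothetical boundary with a corner at the origin."""
    return abs(xi)


def cusp(xi):
    """Hypothetical boundary with a cusp at the origin."""
    return math.sqrt(abs(xi))


if __name__ == "__main__":
    for n in (100, 10_000):
        print(n, lipschitz_estimate(corner, -1.0, 1.0, n),
              lipschitz_estimate(cusp, -1.0, 1.0, n))
```

For the corner the estimate stays at 1 however fine the grid, so a single constant k works. For the cusp the estimate keeps growing as the grid is refined, so no finite k can satisfy the Lipschitz condition.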
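The Sobolev spaces $H^m(\Omega)$. The Sobolev space of order m, denoted by
$H^m(\Omega)$, is defined to be the space consisting of those functions in $L^2(\Omega)$
that, together with all their weak partial derivatives up to and including
those of order m, belong to $L^2(\Omega)$:

$$H^m(\Omega) = \{u : D^\alpha u \in L^2(\Omega) \text{ for all } \alpha \text{ such that } |\alpha| \le m\}.$$

We consider real-valued functions only, and make $H^m(\Omega)$ an inner product
space by introducing the Sobolev inner product $(\cdot, \cdot)_{H^m}$ defined by

$$(u, v)_{H^m} = \int_\Omega \sum_{|\alpha| \le m} (D^\alpha u)(D^\alpha v)\,dx \quad \text{for } u, v \in H^m(\Omega).$$

This inner product in turn generates the Sobolev norm $\|\cdot\|_{H^m}$ defined by

$$\|u\|^2_{H^m} = (u, u)_{H^m} = \int_\Omega \sum_{|\alpha| \le m} (D^\alpha u)^2\,dx.$$

Note that $H^0(\Omega) = L^2(\Omega)$, and furthermore that we may write $(u, v)_{H^m}$
as

$$(u, v)_{H^m} = \sum_{|\alpha| \le m} (D^\alpha u, D^\alpha v)_{L^2};$$

in other words, the Sobolev inner product $(u, v)_{H^m}$ is equal to the sum of
the $L^2$-inner products of $D^\alpha u$ and $D^\alpha v$ over all $\alpha$ such that $|\alpha| \le m$. Of
course, we could also write

$$\|u\|^2_{H^m} = \sum_{|\alpha| \le m} \|D^\alpha u\|^2_{L^2};$$

when written out in full for the case m = 2, and for a function u defined
on $\Omega \subset \mathbb{R}^2$, this becomes

$$\|u\|^2_{H^2} = \int_\Omega \left[u^2 + \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 + \left(\frac{\partial^2 u}{\partial x^2}\right)^2 + \left(\frac{\partial^2 u}{\partial x\,\partial y}\right)^2 + \left(\frac{\partial^2 u}{\partial y^2}\right)^2\right] dx.$$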

Examples

13. Consider the function u(x) defined on $\Omega = (0, 2)$ by

$$u(x) = \begin{cases} \ldots, & 0 < x \le 1, \\ \ldots, & 1 < x < 2. \end{cases}$$

Then

$$v(x) \equiv u'(x) = \begin{cases} 2x, & 0 < x \le 1, \\ 4x - 2, & 1 < x < 2, \end{cases}$$

which is a continuous function. The (weak) derivative of this function
is

$$w(x) \equiv u''(x) = \begin{cases} 2, & 0 < x \le 1, \\ 4, & 1 < x < 2. \end{cases}$$

FIGURE 7.5. The function u of Example 13, and its derivatives

By inspection u, u', and u'' all belong to $L^2(0, 2)$; however, the (generalized)
derivative of u'' is $u''' = 2\delta(x - 1) \notin L^2(0, 2)$. Hence u is a
member of $H^2(0, 2)$, the function v belongs to $H^1(0, 2)$, and w belongs
to $L^2(0, 2) = H^0(0, 2)$ (see Figure 7.5). The respective norms
of these functions are given by

$$\|u\|^2_{H^2} = \int_0^2 [u^2 + (u')^2 + (u'')^2]\,dx = 71.37,$$

$$\|v\|^2_{H^1} = \int_0^2 [v^2 + (v')^2]\,dx = 39,$$

$$\|w\|^2_{H^0} = \int_0^2 w^2\,dx = 20.$$
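The integrals for v and w in Example 13 involve only piecewise polynomials, so they can be evaluated exactly. The sketch below is one way to do this with rational arithmetic; the helper names and the representation of a piecewise polynomial as a list of (lower limit, upper limit, coefficients) triples are assumptions made for illustration.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (p[i] multiplies x**i)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_int(p, lo, hi):
    """Exact integral of the polynomial p over [lo, hi]."""
    lo, hi = Fraction(lo), Fraction(hi)
    return sum(c * (hi ** (i + 1) - lo ** (i + 1)) / (i + 1) for i, c in enumerate(p))


def l2_norm_sq(pieces):
    """Squared L2 norm of a piecewise polynomial given as (lo, hi, coefficients) triples."""
    return sum(poly_int(poly_mul(p, p), lo, hi) for lo, hi, p in pieces)


# v = 2x on (0, 1] and 4x - 2 on (1, 2); its weak derivative is w = 2 and 4 respectively.
v = [(0, 1, [0, 2]), (1, 2, [-2, 4])]
w = [(0, 1, [2]), (1, 2, [4])]

w_H0_sq = l2_norm_sq(w)                  # integral of w^2
v_H1_sq = l2_norm_sq(v) + l2_norm_sq(w)  # integral of v^2 + (v')^2

if __name__ == "__main__":
    print(w_H0_sq, v_H1_sq, float(v_H1_sq))
```

The computation gives 20 for w and 116/3 (about 38.67) for v, consistent with the rounded value 39 quoted in the example.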

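14. The function u defined on $\Omega = (-1, 1) \times (-1, 1)$ by

$$u(x) = \begin{cases} x & \text{for } x > 0, \\ 0 & \text{for } x \le 0, \end{cases}$$

is shown in Figure 7.6; this function belongs to $H^1(\Omega)$. To see this,
we start by evaluating its first derivatives: for $\phi \in \mathcal{D}(\Omega)$,

$$\begin{aligned}
\int_\Omega \frac{\partial u}{\partial y}\phi\,dx\,dy &= -\int_\Omega u\frac{\partial\phi}{\partial y}\,dx\,dy = -\int_0^1\left[\int_{-1}^{1} x\frac{\partial\phi}{\partial y}\,dy\right]dx \\
&= -\int_0^1 x\,[\phi(x, 1) - \phi(x, -1)]\,dx = 0.
\end{aligned}$$

FIGURE 7.6. The function u in Example 14

Hence $\partial u/\partial y = 0$. Secondly,

$$\begin{aligned}
\int_\Omega \frac{\partial u}{\partial x}\phi\,dx\,dy &= -\int_\Omega u\frac{\partial\phi}{\partial x}\,dx\,dy = -\int_{-1}^{1}\left(\int_0^1 x\frac{\partial\phi}{\partial x}\,dx\right)dy \\
&= -\int_{-1}^{1}\left([x\phi]_0^1 - \int_0^1 \phi\,dx\right)dy \\
&= \int_{-1}^{1}\int_0^1 \phi\,dx\,dy = \int_\Omega H(x)\phi(x, y)\,dx\,dy.
\end{aligned}$$

Hence $\partial u/\partial x = H(x)$, the Heaviside step function in the x direction.
We can show next that $\partial^2 u/\partial x^2 = \delta_x$, the two-dimensional Dirac
delta defined by $\delta_x(\phi) = \phi(0, y)$, so that $\partial^2 u/\partial x^2 \notin L^2(\Omega)$. Hence
$u \in H^1(\Omega)$.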
The picture that emerges is that the spaces $H^m(\Omega)$ provide a very logical
means for characterizing the degree of smoothness of a function. When
dealing with the spaces $C^m(\bar{\Omega})$, by "degree of smoothness" is understood
"how many times can the function be differentiated?" In the case of Sobolev
spaces "degree of smoothness" is understood to mean "how many times can
the function be differentiated (weakly) before it ceases to belong to $L^2(\Omega)$?"
The following theorem summarizes the most important properties of the
space $H^m(\Omega)$.

THEOREM 1. Let $H^m(\Omega)$ be the Sobolev space of order m, and $\Omega$ a bounded
domain with Lipschitz boundary. Then

(i) $H^r(\Omega) \subseteq H^m(\Omega)$ if $r \ge m$;
(ii) $H^m(\Omega)$ is a Hilbert space with respect to the norm $\|\cdot\|_{H^m}$;
(iii) $H^m(\Omega)$ is the completion or closure, with respect to the norm $\|\cdot\|_{H^m}$,
of the space $C^\infty(\bar{\Omega})$.

PROOF. Only Parts (i) and (ii) are proved; the proof of (iii) is rather long
and technical, and its details may be found in the references at the end of
this chapter.

(i) If u E HT(n), then D"'u belongs to L 2 (D.) for all a such that lai::::; T,
and thus for all a such that lai::::; m. So U E Hm(D.), and HT(n) ~ Hm(D.).

(ii) We know that $H^m(\Omega)$ is an inner product space, so what remains to
be shown is that $H^m(\Omega)$ is complete. Let $\{u_k\}$ be a Cauchy sequence in
$H^m(\Omega)$. We have to show that $u_k$ converges to a function $u$ in $H^m(\Omega)$.
First, by definition

$$\lim_{k,l\to\infty} \|u_k - u_l\|_{H^m} = 0$$

or, using the definition of the $H^m$-norm,

$$\lim_{k,l\to\infty} \sum_{|\alpha|\le m} \int_\Omega (D^\alpha u_k - D^\alpha u_l)^2\,dx = 0.$$

Since each term in this sum is nonnegative, it follows that

$$\lim_{k,l\to\infty} \|D^\alpha u_k - D^\alpha u_l\|_{L^2} = 0 \quad \text{for all } \alpha,\ |\alpha| \le m.$$

Hence $\{D^\alpha u_k\}$ is a Cauchy sequence in $L^2(\Omega)$ for each $\alpha$ such that $|\alpha| \le m$.
Since $L^2$ is complete, it follows that $D^\alpha u_k$ converges to a function $u^{(\alpha)}$, say,
that belongs to $L^2$. In particular, for $|\alpha| = 0$, $u_k$ converges to a function
$u$, say, in $L^2$.
We show next that $u$ is in $H^m(\Omega)$. Consider

$$\int_\Omega \Big(\lim_{k\to\infty} D^\alpha u_k\Big)\phi\,dx = \Big(\lim_{k\to\infty} D^\alpha u_k, \phi\Big)_{L^2} = \lim_{k\to\infty} (D^\alpha u_k, \phi)_{L^2} = \lim_{k\to\infty} (-1)^{|\alpha|}(u_k, D^\alpha\phi)_{L^2}$$

$$= (-1)^{|\alpha|}\Big(\lim_{k\to\infty} u_k, D^\alpha\phi\Big)_{L^2} = (-1)^{|\alpha|}\int_\Omega u\,D^\alpha\phi\,dx,$$

where we have used the result of Exercise 2 of Chapter 4, (7.7), and the fact
that $D^\alpha u_k$ is a regular distribution. Thus $u^{(\alpha)}$ is the $\alpha$th weak derivative
of $u$ and since $u$, as well as all of its weak derivatives of order $\le m$, is in
$L^2(\Omega)$, $u$ belongs to $H^m(\Omega)$. Hence $H^m(\Omega)$ is complete. □

Part (iii) of the theorem has an important interpretation: from the defi-
nition of the completion of a space (Section 4.3) we know that $C^\infty(\bar\Omega)$ is
dense in $H^m(\Omega)$; hence, for any $u \in H^m(\Omega)$ it is always possible to find an
infinitely differentiable function $f(x)$, say, that is arbitrarily close to $u$ in
the sense that

$$\|u - f\|_{H^m} < \epsilon$$

for any given $\epsilon > 0$. In other words, every member of $H^m(\Omega)$ is either a
member of $C^\infty(\bar\Omega)$, or may be approximated arbitrarily closely by a function
from this space.

Example
15. Refer to Example 13: from what was said there we conclude that,
given any $\epsilon > 0$, it is possible to find functions $f$, $g$, and $h$ in $C^\infty(\bar\Omega)$
that satisfy

$$\Big[\int_\Omega (u - f)^2 + (u' - f')^2 + (u'' - f'')^2\,dx\Big]^{1/2} < \epsilon,$$

$$\Big[\int_\Omega (v - g)^2 + (v' - g')^2\,dx\Big]^{1/2} < \epsilon,$$

$$\Big[\int_\Omega (w - h)^2\,dx\Big]^{1/2} < \epsilon.$$

When $m = 0$ we can deduce the following property of $H^0(\Omega) = L^2(\Omega)$
from Theorem 1.

COROLLARY TO THEOREM 1. $L^2(\Omega)$ is the completion, with respect to the
$L^2$-norm, of the space $C^\infty(\bar\Omega)$.

It is worth recalling that this result is contained also in Theorem 6 of Chapter 4.

The Sobolev Embedding Theorem. A glance at the examples discussed
earlier in this section may lead one to wonder whether it is true that mem-
bers of $H^m(\Omega)$ are simply functions that, together with their derivatives
of order $\le m - 1$, are continuous. After all it is not easy, for example, to
conceive of a function in $H^1(\Omega)$ that is not continuous. A famous theorem
due to Sobolev asserts that, as we would expect, all members of $H^1(a, b)$
are indeed continuous functions, but that this does not hold for higher-
dimensional domains.

Before stating the theorem we give a simple example to show that in-
tuition would be misleading. Let $\Omega$ be the disc of radius $\tfrac12$ in $\mathbb{R}^2$, and let
$u = \ln(\ln(1/r))$, where $r^2 = x^2 + y^2$. Then, using polar coordinates $(r, \theta)$,

$$\iint_\Omega u^2\,dx\,dy = \int_0^{2\pi}\!\!\int_0^{1/2} [\ln(\ln(1/r))]^2\, r\,dr\,d\theta = \int_0^{2\pi}\!\!\int_{\ln 2}^{\infty} (e^{-t}\ln t)^2\,dt\,d\theta$$

(making the change in coordinates $t = -\ln r$), which is easily shown to be
bounded. Furthermore,

$$\int_0^{2\pi}\!\!\int_0^{1/2} \big[(\partial u/\partial x)^2 + (\partial u/\partial y)^2\big]\, r\,dr\,d\theta = \int_0^{2\pi}\!\!\int_0^{1/2} (\ln r)^{-2}\,d(\ln r)\,d\theta = 2\pi/\ln 2.$$

Hence $\|u\|_{H^1}$ is finite and so $u$ belongs to $H^1(\Omega)$. But $u$ is not continuous
at the origin.
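The two claims of this example can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `simpson`, `u_radial`, and `dirichlet_energy` are hypothetical, and the Dirichlet integral is evaluated over the annulus $\epsilon < r < 1/2$ using the substitution $s = \ln r$, under which the radial integrand becomes $s^{-2}$.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def u_radial(r):
    """u = ln(ln(1/r)) as a function of the radius r, 0 < r < 1/2."""
    return math.log(math.log(1.0 / r))

def dirichlet_energy(eps):
    """Integral of |grad u|^2 over the annulus eps < r < 1/2.

    With s = ln r, the radial integrand r * (du/dr)^2 dr becomes s**-2 ds.
    """
    radial = simpson(lambda s: s ** -2, math.log(eps), -math.log(2.0))
    return 2.0 * math.pi * radial

limit = 2.0 * math.pi / math.log(2.0)
for eps in (1e-2, 1e-8, 1e-32, 1e-128):
    # The energy stays below 2*pi/ln 2 while u(eps) keeps growing.
    print(eps, dirichlet_energy(eps), u_radial(eps))
print("limit 2*pi/ln 2 =", limit)
```

As $\epsilon$ decreases, the computed energy increases towards $2\pi/\ln 2$ but never exceeds it, whereas the value of $u$ at radius $\epsilon$ grows without bound, which is the behaviour described in the example.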
Let $X$ and $Y$ be two Banach spaces, with $X \subseteq Y$. The notion of $X$
being embedded in $Y$ goes a little further than the fact that $X$ is a subset
of $Y$, in the following sense. Let $u$ be any member of $X$; then of course
$u$ is also a member of $Y$, and this may be represented by making use of
the identity operator $\iota : X \to Y$, that simply takes a member of $X$ to the
same member, viewed as an element in $Y$: that is, $\iota(u) = u$. This exercise
is of more than trivial interest because the two spaces $X$ and $Y$ are, in
general, endowed with different norms $\|\cdot\|_X$ and $\|\cdot\|_Y$, so we may enquire
as to whether the operator $\iota$ is bounded, that is, whether it is the case that
$\|\iota(u)\|_Y = \|u\|_Y \le K\|u\|_X$ for some constant $K > 0$. When this is the case,
then $X$ is said to be continuously embedded in $Y$. The following theorem
gives conditions under which Sobolev spaces are embedded in spaces of
continuous functions.

THEOREM 2 (THE SOBOLEV EMBEDDING THEOREM). Let $\Omega$ be a bounded
domain in $\mathbb{R}^n$ with a Lipschitz boundary $\Gamma$. If $m - k > n/2$, then every
function in $H^m(\Omega)$ belongs to $C^k(\bar\Omega)$. Furthermore, the embedding

$$H^m(\Omega) \subset C^k(\bar\Omega) \qquad (7.17)$$

is continuous.

REMARK. Some care has to be exercised in the interpretation of Theorem 2.
Recall that members of $H^m(\Omega)$ are equivalence classes of functions, given
that they are members of $L^2$, whereas continuous functions, by contrast,
are defined unambiguously. The embedding (7.17) has therefore to be in-
terpreted in the sense that each member of $H^m(\Omega)$ may be identified with
a function in $C^k(\bar\Omega)$, possibly after changing its values on a set of measure
zero.
According to the Sobolev Embedding Theorem, if $n = 1$ so that $\Omega$ is
a subset of the real line, then the functions in $H^1(\Omega)$ are continuous. For
domains that are subsets of the plane, though, $n = 2$ and we require that
a function be a member of $H^2(\Omega)$ in order to guarantee its continuity.
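The dimension counting in these remarks can be written as a one-line check. The sketch below is illustrative only; the function name `embeds_in_Ck` and its parameter `p` (which defaults to 2, the Hilbert space case considered here) are assumptions of this example. Note that the condition is sufficient: a result of `False` means only that the theorem gives no guarantee.

```python
def embeds_in_Ck(m, k, n, p=2):
    """Return True when m - k > n/p, the sufficient condition of the theorem."""
    return m - k > n / p

# H^1 on an interval: continuity is guaranteed.
print(embeds_in_Ck(m=1, k=0, n=1))
# H^1 in the plane: no guarantee (compare the ln ln(1/r) example).
print(embeds_in_Ck(m=1, k=0, n=2))
# H^2 in the plane: continuity is guaranteed.
print(embeds_in_Ck(m=2, k=0, n=2))
```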

An alternative definition of Sobolev spaces. The definition of Sobolev
spaces used here is one that is phrased in terms of generalized derivatives,
and whether these belong to $L^2$. An alternative definition takes as a start-
ing point the spaces of $m$-times continuously differentiable functions; these
are not complete with respect to the norm $\|\cdot\|_{H^m}$, and the Sobolev space
$H^m$ is defined as precisely the completion of $C^m(\Omega)$ in this norm, for $m \ge 1$.
That the two definitions are in fact equivalent is a well-known result, which
is contained in the following theorem.

THEOREM 3. Let $\Omega$ be a bounded domain. Then $H^m(\Omega)$ is the completion
or closure, with respect to the norm $\|\cdot\|_{H^m}$, of the space $C^m(\Omega)$ of $m$-times
continuously differentiable functions that have a finite norm $\|\cdot\|_{H^m}$.

The main point about Theorem 3 is that every function in $H^m(\Omega)$ can
be approximated arbitrarily closely by a member of $C^m(\Omega)$.
We conclude this section with an important and frequently useful in-
equality.

THEOREM 4 (THE POINCARÉ INEQUALITY). Let $\Omega$ be a domain in $\mathbb{R}^n$
with a Lipschitz boundary. Then there exist constants $C_1$ and $C_2$ such
that, for any $u \in H^1(\Omega)$,

$$\|u\|^2_{L^2} \le C_1 \sum_{|\alpha|=1} \int_\Omega |D^\alpha u|^2\,dx + C_2 \Big[\int_\Omega u(x)\,dx\Big]^2. \qquad (7.18)$$

More generally, there exist constants $C_3$ and $C_4$ such that, for any
$u \in H^k(\Omega)$,

(7.19)

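PROOF. We prove (7.19) for the case $\Omega = (a, b) \subset \mathbb{R}$; the higher-dimensional
results follow in a similar way. Thus for $n = 1$ we have to derive the in-
equality

$$\|u\|^2_{H^k} \le C_1 \int_a^b \Big(\frac{d^k u}{dx^k}\Big)^2 dx + C_2 \sum_{j=0}^{k-1} \Big(\int_a^b \frac{d^j u}{dx^j}\,dx\Big)^2. \qquad (7.20)$$

Let $\xi$ and $\eta$ be two points in $(a, b)$ with $\eta < \xi$, so that

$$u(\xi) - u(\eta) = \int_\eta^\xi u'\,dx$$

and so, using the Cauchy-Schwarz inequality,

$$[u(\xi) - u(\eta)]^2 = \Big[\int_\eta^\xi u'\,dx\Big]^2 \le \Big[\int_\eta^\xi 1^2\,dx\Big]\Big[\int_\eta^\xi (u')^2\,dx\Big] \le (b - a)\int_a^b (u')^2\,dx.$$

Now we integrate with respect to $\xi$, keeping $\eta$ fixed, and then with respect
to $\eta$, to get

$$2(b - a)\int_a^b u^2\,dx - 2\int_a^b u(\xi)\,d\xi \int_a^b u(\eta)\,d\eta \le (b - a)^3 \int_a^b (u')^2\,dx$$

or

$$\int_a^b u^2\,dx \le C_1 \int_a^b (u')^2\,dx + C_2 \Big(\int_a^b u\,dx\Big)^2, \qquad (7.21)$$

where $C_1 = \tfrac12 (b - a)^2$ and $C_2 = 2/(b - a)$. Since $u \in H^k(a, b)$, (7.21) is still
valid if we replace $u$ by $u'$, or by $u''$, and so on, up to $d^{k-1}u/dx^{k-1}$. That
is, we also have

$$\int_a^b \Big(\frac{d^{k-1}u}{dx^{k-1}}\Big)^2 dx \le C_1 \int_a^b \Big(\frac{d^k u}{dx^k}\Big)^2 dx + C_2 \Big(\int_a^b \frac{d^{k-1}u}{dx^{k-1}}\,dx\Big)^2.$$

To obtain (7.20) for $k = 1$ we add $\int_a^b (u')^2\,dx$ to both sides of (7.21) to get

$$\|u\|^2_{H^1} \le (1 + C_1)\int_a^b (u')^2\,dx + C_2 \Big(\int_a^b u\,dx\Big)^2. \qquad (7.22)$$

Next, to get (7.20) for $k = 2$ we add $\int_a^b (u'')^2\,dx$ to both sides of (7.22) and
use (7.21) with $u$ replaced by $u'$, to obtain

$$\|u\|^2_{H^2} \le \big(1 + C_1(1 + C_1)\big)\int_a^b (u'')^2\,dx + C_2 \Big(\int_a^b u\,dx\Big)^2 + C_2(1 + C_1)\Big(\int_a^b u'\,dx\Big)^2.$$

Continuing in this manner we can derive (7.20) for any value of $k$. □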
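Inequality (7.21) is easy to test numerically on an interval. The sketch below is an illustration under stated assumptions, not part of the original text: the helper names `simpson` and `poincare_sides` are hypothetical, the integrals are approximated by Simpson's rule, and the constants are the values $C_1 = \tfrac12(b-a)^2$ and $C_2 = 2/(b-a)$ quoted in the proof.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def poincare_sides(u, du, a, b):
    """Return (left, right) sides of (7.21) for u with derivative du on (a, b)."""
    length = b - a
    c1 = 0.5 * length ** 2
    c2 = 2.0 / length
    lhs = simpson(lambda x: u(x) ** 2, a, b)
    rhs = c1 * simpson(lambda x: du(x) ** 2, a, b) + c2 * simpson(u, a, b) ** 2
    return lhs, rhs

cases = {
    "x^2 on (0,1)": (lambda x: x * x, lambda x: 2.0 * x, 0.0, 1.0),
    "sin(3x) on (0,2)": (lambda x: math.sin(3 * x), lambda x: 3 * math.cos(3 * x), 0.0, 2.0),
    "constant 1 on (0,1)": (lambda x: 1.0, lambda x: 0.0, 0.0, 1.0),
}
for name, (u, du, a, b) in cases.items():
    lhs, rhs = poincare_sides(u, du, a, b)
    print(name, lhs, "<=", rhs)
```

The constant function shows why the mean-value term cannot be dropped: its derivative vanishes, so the right-hand side consists of the term $C_2(\int u\,dx)^2$ alone.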

The Sobolev spaces $W^{m,p}(\Omega)$. The Sobolev spaces $H^m(\Omega)$ have been
defined by taking as a point of departure the Hilbert space $L^2(\Omega)$; in this
way it has been possible to introduce a family of Hilbert spaces, of which
$L^2$ is a special case, viz. the case $m = 0$. In much the same way it is possible
to take as a point of departure the spaces $L^p(\Omega)$ for $1 \le p \le \infty$, and in
this way to introduce, for each $p$, a Sobolev space that is a Banach space.
Thus, for $m = 0, 1, \ldots$, the Sobolev space $W^{m,p}(\Omega)$ is defined to be the
space of functions that, together with all their weak derivatives up to and
including those of order $m$, belong to $L^p(\Omega)$:

$$W^{m,p}(\Omega) = \{u : D^\alpha u \in L^p(\Omega) \text{ for all } \alpha \text{ such that } |\alpha| \le m\}. \qquad (7.23)$$

Clearly we have $H^m(\Omega) = W^{m,2}(\Omega)$. The space $W^{m,p}(\Omega)$ is a normed space
when endowed with the Sobolev norm $\|\cdot\|_{W^{m,p}}$, or simply $\|\cdot\|_{m,p}$, defined
by

$$\|u\|^p_{m,p} = \sum_{|\alpha|\le m} \int_\Omega |D^\alpha u|^p\,dx$$

for $1 \le p < \infty$, and by

$$\|u\|_{m,\infty} = \sum_{|\alpha|\le m} \operatorname{ess\,sup} |D^\alpha u|$$

for $p = \infty$. Recalling the definition (3.6) of the $L^p$-norm, the Sobolev norm
may be expressed in the alternative form

$$\|u\|^p_{m,p} = \sum_{|\alpha|\le m} \|D^\alpha u\|^p_{L^p}.$$

That is, the $p$th power $\|u\|^p_{m,p}$ of the Sobolev norm is equal to the sum of the $p$th powers of
the $L^p$-norms of $D^\alpha u$ over all $\alpha$ such that $|\alpha| \le m$.
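As a concrete instance of this definition, the sketch below evaluates the $W^{1,p}$-norm of a function of one variable by numerical quadrature. It is an illustration only: the helper names `simpson` and `sobolev_norm_1p` are hypothetical, and the derivative is supplied by hand rather than computed weakly.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def sobolev_norm_1p(u, du, a, b, p):
    """W^{1,p}(a, b) norm: (int |u|^p + int |u'|^p)^(1/p), for 1 <= p < infinity."""
    integral = simpson(lambda x: abs(u(x)) ** p + abs(du(x)) ** p, a, b)
    return integral ** (1.0 / p)

# u(x) = x on (0, 1): the norm is (1/(p+1) + 1)^(1/p).
for p in (1, 2, 4):
    print(p, sobolev_norm_1p(lambda x: x, lambda x: 1.0, 0.0, 1.0, p))
```

For $p = 2$ the value printed is the $H^1$-norm of $u(x) = x$ on $(0, 1)$, namely $(4/3)^{1/2}$.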
Theorems 1 to 3 all have counterparts for the spaces $W^{m,p}(\Omega)$, and some
of these extensions are given in the following theorem.

THEOREM 5. Let $W^{m,p}(\Omega)$ be the Sobolev space defined by (7.23), and $\Omega$
a bounded domain with Lipschitz boundary. Then
(i) $W^{m,p}(\Omega)$ is a Banach space with respect to the norm $\|\cdot\|_{m,p}$;
(ii) $W^{m,p}(\Omega)$ is the completion, with respect to the norm $\|\cdot\|_{W^{m,p}}$, of the
space $C^m(\Omega)$ of $m$-times continuously differentiable functions that
have a finite norm $\|\cdot\|_{W^{m,p}}$;
(iii) $W^{m,p}(\Omega)$ is the completion, with respect to the norm $\|\cdot\|_{W^{m,p}}$, of the
space $C^\infty(\bar\Omega)$;
(iv) for nonnegative integers $m$ and $k$, and $p$ such that $1 \le p \le \infty$, $W^{m,p}(\Omega)$
is continuously embedded in $C^k(\bar\Omega)$ if $m - k > n/p$.

All of the discussions later on involving boundary value problems and
their approximation by finite elements take place in a Hilbert space context,
and the spaces $H^m(\Omega)$ suffice for these purposes. The focus in this chapter
therefore remains on this special case of $W^{m,p}(\Omega)$.

7.4 Boundary values of functions and trace theorems
Traces of functions in $H^m(\Omega)$. Later on, when dealing with boundary
value problems, we are concerned not only with values of functions on an
open domain $\Omega$, but also with their values on the boundary $\Gamma$. Now, in
the case of continuous functions defined on $\bar\Omega = \Omega \cup \Gamma$, one simply finds
the values of a function $u$ on the boundary by evaluating $u$ on $\Gamma$; we then
write this as $u|_\Gamma$. For example, if $\bar\Omega = [0, 1]$ and $u = x + 2$, then $u|_\Gamma$
consists of $u$ evaluated at $x = 0$ and $x = 1$: $u|_\Gamma = \{u(0), u(1)\} = \{2, 3\}$.
Similarly, if $\Omega$ is the square $\{(x, y) : |x| < 1,\ |y| < 1\}$ with boundary
$\Gamma = \{(x, y) : |x| = 1,\ |y| \le 1\} \cup \{(x, y) : |y| = 1,\ |x| \le 1\}$, and we choose
$u \in C(\bar\Omega)$ as in Example 14, then the restriction of $u$ to the boundary is
the function $u|_\Gamma$ shown in Figure 7.6. This procedure may be formalized by
introducing an operator $\gamma$, called the trace operator: $\gamma$ is a linear operator
that acts on a continuous function $u \in C(\bar\Omega)$ to produce its restriction to
the boundary $\Gamma$; that is,

$$\gamma(u) = u|_\Gamma. \qquad (7.24)$$

Note that since $u$ is continuous, its restriction to $\Gamma$ is a continuous function
on $\Gamma$, so that $u|_\Gamma \in C(\Gamma)$. Thus if $\Omega \subset \mathbb{R}^2$, the graph of $u|_\Gamma$ can be drawn
as a continuous curve above $\Gamma$ (as in Figure 7.6). The case of $\Omega \subset \mathbb{R}$ is of
course a degenerate special case.
We are interested in the problem of how to define $u|_\Gamma$ when $u$ belongs to
$L^2(\Omega)$ or, more generally, to one of the Sobolev spaces $H^m(\Omega)$. Note that
functions belonging to $H^m(\Omega)$ are defined on $\Omega$ and not on $\bar\Omega$ since $\Gamma$ is a set
of measure zero and members of $H^m(\Omega)$ are in fact equivalence classes of
functions, two functions being equivalent if they differ on a set of measure
zero. One way of defining $u|_\Gamma$ (or $\gamma(u)$) for a function $u$ in $L^2(\Omega)$ would be
to set up a sequence $\{u_k\}$ of continuous functions on $\bar\Omega$ that converges to
$u$ in the $L^2$-norm (recall from the remarks following Theorem 6 of Chapter
4, or indeed, from Theorem 3, that $C(\bar\Omega)$ is dense in $L^2(\Omega)$). Since $u_k$ is in
$C(\bar\Omega)$ we can unambiguously define, as in (7.24),

$$\gamma(u_k) = u_k|_\Gamma,$$

so we hope to be able to define $\gamma(u)$ according to

$$\gamma(u) = \lim_{k\to\infty} \gamma(u_k).$$

However, a glance back to Chapter 5, Theorem 4 shows that this is tan-
tamount to requiring that $\gamma$ be a continuous (or bounded) linear operator
from $C(\bar\Omega)$ to $C(\Gamma)$, and we hope to extend $\gamma$ to a bounded operator from
$L^2(\Omega)$ to $L^2(\Gamma)$. But this is not possible. It can be shown that there is no
continuous mapping of the kind we are looking for. In other words, it is
not possible to define unambiguously $u|_\Gamma$ when $u$ is in $L^2(\Omega)$, as is further
illustrated by the following example.

Example

16. Let $u(x) = 1$ on $\Omega = (0, 1)$; $u$ is actually continuous and it would
seem logical to define $u|_\Gamma$ by $u(0) = u(1) = 1$. But if we set up the
sequence $\{u_k\}_{k=3}^\infty$ defined by (Figure 7.7)

$0 \le x < 1/k,$
$1/k \le x < 1 - 1/k,$
$1 - 1/k \le x \le 1,$

that converges to $u$ in $L^2(0, 1)$, we find that $\gamma(u_k) \equiv u_k|_\Gamma = \{u_k(0), u_k(1)\}
= \{(-1)^{k+1}, (-1)^{k+1}\}$; thus $u_k|_\Gamma$ oscillates between $-1$ and $+1$
and $\lim_{k\to\infty} u_k|_\Gamma$ does not exist.
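The behaviour described in this example can be reproduced with a short computation. The sketch below is not the sequence of the text, whose explicit formula is not reproduced here; it assumes a hypothetical sequence `u_k` that equals 1 on the middle interval and ramps linearly to $(-1)^{k+1}$ at $x = 0$ and $x = 1$, which has the stated boundary values. The function name `l2_distance_to_one` is likewise an assumption of this example.

```python
import math

def u_k(x, k):
    """Hypothetical u_k: 1 in the middle, linear ramps to (-1)**(k+1) at both ends."""
    end = (-1.0) ** (k + 1)
    if x < 1.0 / k:
        return end + (1.0 - end) * k * x
    if x > 1.0 - 1.0 / k:
        return end + (1.0 - end) * k * (1.0 - x)
    return 1.0

def l2_distance_to_one(k, n=20000):
    """Midpoint-rule approximation of ||u_k - 1|| in L^2(0, 1)."""
    h = 1.0 / n
    total = sum((u_k((i + 0.5) * h, k) - 1.0) ** 2 for i in range(n))
    return math.sqrt(total * h)

for k in (4, 5, 16, 17, 256, 257):
    # The L^2 distance shrinks, but the boundary values keep alternating.
    print(k, l2_distance_to_one(k), (u_k(0.0, k), u_k(1.0, k)))
```

For this assumed sequence the $L^2$ distance to $u = 1$ is $(8/(3k))^{1/2}$ for even $k$ and zero for odd $k$, so the sequence converges in $L^2(0, 1)$ while its boundary values alternate between $-1$ and $+1$.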
As the preceding example shows, even though an apparently obvious value
for $u|_\Gamma$ may be assigned ($u(0) = u(1) = 1$), there is still ambiguity that re-
sults from the fact that $\gamma$ is not a continuous operator. This has far-reaching
consequences, as can well be appreciated by the following considerations.
If $\gamma : L^2(\Omega) \to L^2(\Gamma)$ were a continuous operator, then we would have

$$\|\gamma(u)\|_{L^2(\Gamma)} \le C\|u\|_{L^2(\Omega)} \qquad (7.25)$$

for some constant $C > 0$. Thus if $u$ and $v$ are two functions in $L^2(\Omega)$
(they could be continuous functions) that are close in the sense that
$\|u - v\|_{L^2(\Omega)} < \epsilon$ for some small $\epsilon > 0$, then (7.25) gives immediately

$$\|u|_\Gamma - v|_\Gamma\|_{L^2(\Gamma)} = \|\gamma(u - v)\|_{L^2(\Gamma)} \le C\|u - v\|_{L^2(\Omega)} < C\epsilon,$$

so that $u|_\Gamma$ and $v|_\Gamma$ are correspondingly close. However, if $\gamma$ is not continu-
ous there is no guarantee that this situation would obtain. This is obviously
untenable if we are to develop a coherent theory of boundary value prob-
lems.
FIGURE 7.7. A sequence of continuous functions with nonconvergent boundary
values

All is not lost, however; if a function $u$ belongs to $C^1(\bar\Omega)$, then it can
be shown that the operator $\gamma$ mapping $u$ to its value on $\Gamma$ is a continuous
operator from $C^1(\bar\Omega)$ to $C(\Gamma)$, with respect to the norms $\|\cdot\|_{H^1(\Omega)}$ and
$\|\cdot\|_{L^2(\Gamma)}$. That is,

$$\gamma : C^1(\bar\Omega) \to C(\Gamma)$$

satisfies

$$\|\gamma(u)\|_{L^2(\Gamma)} \le C\|u\|_{H^1(\Omega)} \qquad (7.26)$$

for some constant $C > 0$ (note the norms used). The proof of this result is
contained in the next lemma.

LEMMA 1. Let $\Omega$ be a domain with Lipschitz boundary. Then the estimate
(7.26) holds for all functions $u \in C^1(\bar\Omega)$.

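PROOF. We prove the result for the case $n = 2$; the proof for the more gen-
eral case follows in a similar way. We consider a local piece of the boundary
and set up coordinates $(\xi, \eta)$ so that this can be represented in the form

$$\eta = f(\xi), \quad \xi \in [-a, a],$$

where $f$ is a Lipschitz function. It follows that there exists a number $b > 0$
such that the set

$$S = \{(\xi, \eta) : -a \le \xi \le a,\ f(\xi) - b \le \eta \le f(\xi)\}$$

belongs to $\bar\Omega$ (Figure 7.8).

FIGURE 7.8. The subset $S$ of $\Omega$

Now let $u \in C^1(\bar\Omega)$. Then

$$u(\xi, f(\xi)) = \int_s^{f(\xi)} \frac{\partial u}{\partial\eta}(\xi, \eta)\,d\eta + u(\xi, s),$$

where $f(\xi) - b \le s \le f(\xi)$. We use the elementary inequality $(\alpha + \beta)^2 \le
2\alpha^2 + 2\beta^2$ to obtain

$$u(\xi, f(\xi))^2 \le 2\Big(\int_s^{f(\xi)} \frac{\partial u}{\partial\eta}\,d\eta\Big)^2 + 2u(\xi, s)^2. \qquad (7.27)$$

Now the integral in (7.27) may be simplified by applying the Cauchy-
Schwarz inequality, to give

$$\Big(\int_s^{f(\xi)} \frac{\partial u}{\partial\eta}\,d\eta\Big)^2 \le \Big(\int_s^{f(\xi)} 1^2\,d\eta\Big)\Big(\int_s^{f(\xi)} \Big(\frac{\partial u}{\partial\eta}\Big)^2 d\eta\Big) = (f(\xi) - s)\int_s^{f(\xi)} \Big(\frac{\partial u}{\partial\eta}\Big)^2 d\eta \le b\int_{f(\xi)-b}^{f(\xi)} \Big(\frac{\partial u}{\partial\eta}\Big)^2 d\eta,$$

using the fact that $s \ge f(\xi) - b$. After substitution in (7.27) we next
integrate with respect to $s$ over the interval $[f(\xi) - b, f(\xi)]$ to obtain

$$b\,u(\xi, f(\xi))^2 \le 2b^2\int_{f(\xi)-b}^{f(\xi)} \Big(\frac{\partial u}{\partial\eta}\Big)^2 d\eta + 2\int_{f(\xi)-b}^{f(\xi)} u(\xi, s)^2\,ds$$

and integrate again, this time with respect to $\xi$; this gives

$$b\int_{-a}^{a} u(\xi, f(\xi))^2\,d\xi \le 2b^2\int_S \Big(\frac{\partial u}{\partial\eta}\Big)^2 d\xi\,d\eta + 2\int_S u^2\,d\xi\,d\eta. \qquad (7.28)$$

If $\Gamma$ is a $C^1$ boundary, then $f \in C^1$ and the differential of arc length is
given by

$$ds = [1 + (f')^2]^{1/2}\,d\xi.$$

Furthermore, $f'$ is bounded so that $1 \le [1 + (f')^2]^{1/2} \le C$, where $C$ is
a constant independent of $\xi$. Substitution in the left-hand side of (7.28)
yields

$$b\int_{-a}^{a} u(\xi, f(\xi))^2\,d\xi = b\int_{\Gamma_S} \frac{u(\xi, f(\xi))^2}{[1 + (f')^2]^{1/2}}\,ds \ge C_1\int_{\Gamma_S} u(\xi, f(\xi))^2\,ds,$$

for some constant $C_1$, where $\Gamma_S$ is the portion of the boundary correspond-
ing to the interval $\xi \in [-a, a]$. The right-hand side is easily estimated, and
(7.28) becomes

$$\int_{\Gamma_S} u^2\,ds \le C\,\|u\|^2_{H^1(S)}$$

for a (different) constant $C$. So the inequality is established for the domain $S$; in order to obtain (7.26)
we simply sum over all such patches.
In the event that $\Gamma$ is merely Lipschitzian, it is still the case that $f'$ is
bounded and the proof carries over virtually unchanged. □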
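A one-dimensional analogue of this argument can be checked numerically. On an interval $(a, b)$, writing $u(b) = \int_s^b u'\,dx + u(s)$, applying $(\alpha + \beta)^2 \le 2\alpha^2 + 2\beta^2$ and the Cauchy-Schwarz inequality, and integrating over $s$ gives $u(b)^2 \le (b - a)\int_a^b (u')^2\,dx + \frac{2}{b-a}\int_a^b u^2\,dx$. The sketch below tests this bound; the constant is derived for this illustration and is not quoted from the text, and the helper names `simpson` and `trace_check` are hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def trace_check(u, du, a, b):
    """Return (u(b)^2, C * ||u||_{H^1}^2) with C = max(b - a, 2/(b - a))."""
    length = b - a
    c = max(length, 2.0 / length)
    h1_sq = simpson(lambda x: u(x) ** 2 + du(x) ** 2, a, b)
    return u(b) ** 2, c * h1_sq

print(trace_check(math.exp, math.exp, 0.0, 1.0))

# No such bound is possible with the L^2 norm alone: u = x^n has u(1) = 1
# for every n, while its squared L^2 norm 1/(2n + 1) tends to zero.
for n_pow in (1, 10, 40):
    l2_sq = simpson(lambda x: x ** (2 * n_pow), 0.0, 1.0)
    print(n_pow, "u(1)^2 = 1.0", "||u||^2_{L^2} =", l2_sq)
```

The second loop illustrates why the $H^1$-norm appears on the right-hand side of (7.26): the boundary value of $x^n$ stays equal to 1 although its $L^2$-norm becomes arbitrarily small.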

THEOREM 6 (THE TRACE THEOREM). Let $\Omega$ be a bounded domain in $\mathbb{R}^n$
with a Lipschitz boundary $\Gamma$. Then
(i) there exists a unique bounded linear operator $\gamma$ that maps $H^1(\Omega)$ into
$L^2(\Gamma)$; that is,

$$\gamma : H^1(\Omega) \to L^2(\Gamma), \qquad \|\gamma(u)\|_{L^2(\Gamma)} \le C\|u\|_{H^1(\Omega)},$$

with the property that if $u \in C^1(\bar\Omega)$, then $\gamma(u) = u|_\Gamma$ in the conven-
tional sense;
(ii) the range of $\gamma$ is dense in $L^2(\Gamma)$.
PROOF. We prove (i). The proof follows immediately from (7.26) and
the fact that $H^1(\Omega)$ and $L^2(\Gamma)$ are the completions of $C^1(\bar\Omega)$ and $C(\Gamma)$,
respectively, in the appropriate norms. Indeed, for any $u \in H^1(\Omega)$ we can
set up a sequence $\{u_k\}$ in $C^1(\bar\Omega)$ that converges to $u$ in the $H^1$-norm. Thus

$$\lim_{k\to\infty} \|u_k - u\|_{H^1} = 0,$$

and so, using (7.26), one sees that $\{\gamma(u_k)\}$ is a Cauchy sequence in $L^2(\Gamma)$ and
therefore converges to $v$, say, in $L^2(\Gamma)$. Define $\gamma(u) = v$; then

$$\|\gamma(u)\|_{L^2(\Gamma)} = \Big\|\gamma\Big(\lim_{k\to\infty} u_k\Big)\Big\| = \lim_{k\to\infty} \|\gamma(u_k)\| \le C\lim_{k\to\infty} \|u_k\|_{H^1(\Omega)} = C\|u\|_{H^1(\Omega)}.$$

Part (ii) of the theorem implies that, although the range of $\gamma$ is not all
of $L^2(\Gamma)$, any member of $L^2(\Gamma)$ can be approximated arbitrarily closely by
a function lying in the range of $\gamma$. □
The trace theorem enables us to define unambiguously $\gamma(u)$ or $u|_\Gamma$, pro-
vided that $u$ is smooth enough to be in $H^1(\Omega)$. Now suppose that $u$ is even
smoother, so that $u$ belongs to $H^2(\Omega)$. Then $u$ is a member of $H^1(\Omega)$ and
so in fact is $D^\alpha u$ for $|\alpha| = 1$:

$$u,\ \frac{\partial u}{\partial x_1},\ \ldots,\ \frac{\partial u}{\partial x_n} \in H^1(\Omega).$$

This means that the boundary values of the first derivatives of $u$ can also
be defined unambiguously, using the trace theorem.
The argument can be generalized to the space $H^m(\Omega)$; indeed, when
$m > 1$ then for any $u \in H^m(\Omega)$ we have $D^\alpha u \in H^1(\Omega)$ for $|\alpha| \le m - 1$. By
the trace theorem the value of $D^\alpha u$ on the boundary is well-defined and
belongs to $L^2(\Gamma)$:

$$\gamma(D^\alpha u) \in L^2(\Gamma), \quad |\alpha| \le m - 1.$$

Furthermore, if $u$ is in fact $m$-times continuously differentiable, then $D^\alpha u$
is at least continuously differentiable for $|\alpha| \le m - 1$ and

$$\gamma(D^\alpha u) = D^\alpha u|_\Gamma.$$

We introduce the notation $\gamma_\alpha$ to denote the operator that, when applied
to a member $u$ of $H^m(\Omega)$, gives the trace or boundary value of $D^\alpha u$ for
$|\alpha| \le m - 1$:

$$\gamma_\alpha : H^m(\Omega) \to L^2(\Gamma), \qquad \gamma_\alpha(u) = \gamma(D^\alpha u). \qquad (7.29)$$

If $u \in C^m(\bar\Omega)$, then $\gamma_\alpha(u) = \gamma(D^\alpha u) = D^\alpha u|_\Gamma$. Clearly $\gamma_\alpha$ is a bounded
operator.
A word about notation is in order at this point. Henceforth we deal with
boundary values of a function only if these boundary values can be defined
unambiguously, in the sense of Theorem 6 (or its extension to (7.29));
when referring to the value of a function $u$ or that of its derivatives on the
boundary we simply write $u, \partial u/\partial x, \ldots$, instead of $\gamma(u), \gamma_{(1,0,\ldots)}u$, it being
understood that the boundary values are to be interpreted in the sense of
the trace theorem. So if we see, for example,

$$u = u_0 \ \text{on}\ \Gamma,$$

this means that $\gamma(u)$ takes on the value $u_0$ a.e. on $\Gamma$. Sometimes, in order
to make this clearer, we may write "$u = u_0$ in the sense of traces".
Now that the issue of boundary values of functions in $H^m(\Omega)$ has been
clarified, it is fairly straightforward to extend Green's theorem, equation
(7.2), to functions in $H^1(\Omega)$ (see Exercise 7.14): given functions $u, v \in
H^1(\Omega)$, the identity

$$\int_\Omega u\,\frac{\partial v}{\partial x_i}\,dx = \int_\Gamma u v \nu_i\,ds - \int_\Omega \frac{\partial u}{\partial x_i}\,v\,dx \qquad (7.30)$$

holds for $i = 1, 2, \ldots, n$. From this identity we can deduce higher-order
identities; for example, if $u$ is replaced by $\partial u/\partial x_i$ (assuming now that
$u \in H^2(\Omega)$) and the resulting equation is summed over $i$ from 1 to $n$, then
we find that

$$\int_\Omega \nabla u \cdot \nabla v\,dx = \int_\Gamma \frac{\partial u}{\partial\nu}\,v\,ds - \int_\Omega (\nabla^2 u)v\,dx$$

for $u \in H^2(\Omega)$, $v \in H^1(\Omega)$, where $\nabla^2$ is the Laplacian (see (8)).

We conclude this section with a set of inequalities that are useful later.

THEOREM 7. Let $\Omega$ be a bounded domain in $\mathbb{R}^n$ with Lipschitz boundary
$\Gamma$ if $n \ge 2$. Then
(i) for any $u \in H^1(\Omega)$ there are positive constants $C_1$ and $C_2$ such that

(7.31)

for $n = 1$, and

(7.32)

for $n \ge 2$;
(ii) for any $u \in H^2(\Omega)$ there exists a constant $C_3$ such that

$$\|u\|^2_{H^2} \le C_3\Big(\sum_{|\alpha|=2} \int_\Omega |D^\alpha u|^2\,dx + \int_\Gamma u^2\,ds\Big). \qquad (7.33)$$

7.5 The spaces $H_0^m(\Omega)$ and $H^{-m}(\Omega)$

The space $H_0^m(\Omega)$ is a subspace of $H^m(\Omega)$ that arises frequently in bound-
ary value problems because members of $H_0^m(\Omega)$ are distinguished by the
fact that certain of their derivatives vanish on the boundary. We define
$H_0^m(\Omega)$ to be the completion, in the Sobolev norm $\|\cdot\|_{H^m}$, of the space
$C_0^m(\Omega)$ of functions with continuous derivatives of order $\le m$, all of which
have compact support in $\Omega$. In other words, $H_0^m(\Omega)$ is formed by taking the
union of $C_0^m(\Omega)$ and all those limits of Cauchy sequences in $C_0^m(\Omega)$ that
are not in $C_0^m(\Omega)$.
Since $D^\alpha u_k = 0$ on $\Gamma$ ($|\alpha| \le m$) for each member of a Cauchy sequence
$\{u_k\}$ in $C_0^m(\Omega)$, it suggests that the limit of such a Cauchy sequence, which
of course belongs to $H_0^m(\Omega)$, also satisfies $D^\alpha u = 0$ on the boundary. This
is borne out by the following theorem, which also gives other properties of
$H_0^m(\Omega)$.

THEOREM 8. Let $\Omega$ be a bounded domain in $\mathbb{R}^n$ with a sufficiently smooth
boundary $\Gamma$ and let $H_0^m(\Omega)$ be the completion of $C_0^m(\Omega)$ in the norm $\|\cdot\|_{H^m}$.
Then
(a) $H_0^m(\Omega)$ is also the completion of $C_0^\infty(\Omega)$ in the norm $\|\cdot\|_{H^m}$;
(b) $H_0^m(\Omega) \subset H^m(\Omega)$;
(c) if $u \in H^m(\Omega)$ belongs to $H_0^m(\Omega)$, then
$D^\alpha u = 0$ on $\Gamma$, $|\alpha| \le m - 1$.

PROOF. The proof of (a) is similar to that of Theorem 1 (iii). Part (b) is
obvious. To prove (c), we use the continuity of the trace operator: let $\{u_k\}$
be a Cauchy sequence in $C_0^m(\Omega)$ with limit $u$ in $H_0^m(\Omega)$. Then from the
definition of $\gamma_\alpha$ we have

$$\gamma_\alpha(u_k) = D^\alpha u_k|_\Gamma = 0, \quad |\alpha| \le m - 1.$$

Hence

$$\lim_{k\to\infty} \gamma_\alpha(u_k) = \gamma_\alpha\Big(\lim_{k\to\infty} u_k\Big) = \gamma_\alpha(u) = D^\alpha u = 0. \qquad □$$

Part (c) of Theorem 8 is particularly useful in characterizing members
of $H_0^m(\Omega)$, as the following example shows.

Example
17. The function u defined by

u(x) = { ~'~, + ~, 2x -

(2 - x?,
is a member of $H^2(0, 2)$, as Figure 7.9 shows. Also, $u$ and $du/dx$ are
equal to zero on the boundary $x = 0$, $x = 2$. Hence $u \in H_0^2(0, 2)$.

FIGURE 7.9. The function in Example 17

FIGURE 7.10. The construction used in the derivation of (7.34)

Equivalent norms on $H_0^m(\Omega)$. We begin with a famous inequality that
serves as a basis for defining a norm on $H_0^1(\Omega)$ that is equivalent to the
standard $H^1$-norm.

THEOREM 9 (THE POINCARÉ-FRIEDRICHS INEQUALITY). Let $\Omega$ be a
bounded domain in $\mathbb{R}^n$. Then there exists a constant $C > 0$ such that

$$\int_\Omega u^2\,dx \le C\int_\Omega |\nabla u|^2\,dx \quad \text{for all } u \in H_0^1(\Omega). \qquad (7.34)$$

PROOF. The inequality is first established for the case $u \in C_0^\infty(\Omega)$, after
which the density of this space in $H_0^1(\Omega)$ may be used to obtain (7.34). We
focus on the situation in which $n = 2$, for convenience. Let $G = [a, b] \times [c, d]$
be a rectangle that includes $\Omega$ as a proper subset, as in Figure 7.10, and
note that

$$u(x, y) = \int_c^y \frac{\partial u}{\partial t}(x, t)\,dt \quad \text{for all } (x, y) \in G$$

because $u(x, c) = 0$. From the Cauchy-Schwarz inequality we have

$$u^2(x, y) = \Big(\int_c^y 1 \cdot \frac{\partial u}{\partial t}(x, t)\,dt\Big)^2 \le \int_c^y 1^2\,dt \int_c^y \Big(\frac{\partial u}{\partial t}(x, t)\Big)^2 dt \le (d - c)\int_c^d \Big(\frac{\partial u}{\partial t}(x, t)\Big)^2 dt.$$

Integrating over $G$, and bearing in mind that $u = 0$ outside of $\Omega$, we obtain

$$\int_\Omega u^2\,dx\,dy \le (d - c)^2 \int_\Omega \Big(\frac{\partial u}{\partial y}\Big)^2 dx\,dy.$$

The inequality (7.34) may now be obtained by repeating the argument, this
time integrating in the $x$-direction, and then adding.
The extension to functions in $H_0^1(\Omega)$ is left as an exercise (Exercise
7.16). □
At this stage it is convenient to introduce a family of seminorms on
$H^m(\Omega)$. A seminorm $|\cdot|$ satisfies all the norm axioms except that of positive-
definiteness (Axiom N2 in Section 3.3), in that $|u| \ge 0$, but $|u| = 0$ does
not imply that $u = 0$. The quantity $|\cdot|_m$ defined on $H^m(\Omega)$ by

$$|u|_m^2 = \sum_{|\alpha|=m} \int_\Omega |D^\alpha u|^2\,dx \qquad (7.35)$$

is a seminorm; indeed, $|u|_m = 0$ implies that $D^\alpha u = 0$ for $|\alpha| = m$, which
of course does not imply that $u$ itself is zero.
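The distinction between the seminorm $|\cdot|_1$ and a norm, and the role of the zero boundary condition, can be seen in a small one-dimensional computation. The sketch below is illustrative only: the helper names `simpson`, `l2_sq`, and `seminorm1_sq` are hypothetical, and the constant $(b - a)^2$ is the one obtained by repeating the argument in the proof of Theorem 9 on an interval, not a value quoted from the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def l2_sq(u, a, b):
    """Squared L^2(a, b) norm of u."""
    return simpson(lambda x: u(x) ** 2, a, b)

def seminorm1_sq(du, a, b):
    """Squared seminorm |u|_1^2 on (a, b), computed from the derivative du."""
    return simpson(lambda x: du(x) ** 2, a, b)

a, b = 0.0, 1.0

# u = sin(pi x) vanishes at both ends, so the Poincare-Friedrichs bound applies.
lhs = l2_sq(lambda x: math.sin(math.pi * x), a, b)
rhs = (b - a) ** 2 * seminorm1_sq(lambda x: math.pi * math.cos(math.pi * x), a, b)
print("sin(pi x):", lhs, "<=", rhs)

# u = 1 does not vanish on the boundary: |u|_1 = 0 although u is not zero.
print("constant 1:", l2_sq(lambda x: 1.0, a, b), seminorm1_sq(lambda x: 0.0, a, b))
```

The constant function has zero seminorm but nonzero $L^2$-norm, so no inequality of the form (7.34) can hold on all of $H^1(a, b)$; the restriction to functions vanishing on the boundary is essential.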
The relevance of the seminorm to the present discussion is that, with
the aid of the Poincaré-Friedrichs inequality, it is possible to show that $|\cdot|_1$
is in fact a norm on $H_0^1(\Omega)$.

COROLLARY TO THEOREM 9. The quantity $|\cdot|_1$ is a norm on $H_0^1(\Omega)$,
equivalent to the standard $H^1$-norm.

This result is treated in Exercise 7.17; note in particular that (7.34) can be
expressed in the form

$$\|u\|^2_{L^2} \le C|u|_1^2.$$

It is possible to extend Theorem 9 and its Corollary to the spaces $H_0^m(\Omega)$
for any $m \ge 1$; this is also discussed in Exercise 7.17, and the result is
summarized in the following.

THEOREM 10. Let $\Omega$ be a bounded domain in $\mathbb{R}^n$. Then there exists a
constant $C > 0$ such that

$$\|u\|^2_{L^2} \le C|u|_m^2 \quad \text{for all } u \in H_0^m(\Omega). \qquad (7.36)$$

Furthermore, $|\cdot|_m$ is a norm on $H_0^m(\Omega)$, equivalent to the standard $H^m$-
norm.

The Space $H^{-m}(\Omega)$. In Section 5.4 we discovered that the space $L^2(\Omega)$ is
self-dual. The question now arises as to how we can characterize $[H^m(\Omega)]'$,
the space of bounded linear functionals on $H^m(\Omega)$. Now we would hope
to find out about $[H^m(\Omega)]'$ by considering functionals $\ell$ on $\mathcal{D}(\Omega)$ (that is,
distributions), and by looking at the limits of $\langle \ell, \phi_k \rangle$ as $k \to \infty$, where
$\{\phi_k\}$ is a Cauchy sequence in $\mathcal{D}(\Omega)$. There is a complication here, however,
in that $\mathcal{D}(\Omega)$ is not dense in $H^m(\Omega)$, so that not every $u \in H^m(\Omega)$ is
the limit of a Cauchy sequence $\{\phi_k\}$ in $\mathcal{D}(\Omega)$. This dilemma is resolved
by restricting attention instead to the dual of $H_0^m(\Omega)$; $\mathcal{D}(\Omega)$ is dense in
$H_0^m(\Omega)$, by Theorem 8(a), and this property is used to define $H_0^m(\Omega)'$ in the
following theorem. Before stating the theorem we introduce the convention
whereby the dual of $H_0^m(\Omega)$ is denoted by $H^{-m}(\Omega)$:

$$H^{-m}(\Omega) = [H_0^m(\Omega)]'.$$

As shown in the following, this notation makes complete sense.

THEOREM 11. A distribution $q$ is in the dual space $H^{-m}(\Omega)$ of $H_0^m(\Omega)$ if
and only if it can be expressed in the form

$$q = \sum_{|\alpha|\le m} D^\alpha q_\alpha, \qquad (7.37)$$

where $q_\alpha$ are functions in $L^2(\Omega)$.

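PROOF. Let $f$ be any function in $L^2(\Omega)$ $(= [L^2(\Omega)]')$; then, for any $\phi \in
\mathcal{D}(\Omega)$,

$$|\langle D^\alpha f, \phi \rangle| = |\langle f, D^\alpha\phi \rangle| = \Big|\int_\Omega f\,(D^\alpha\phi)\,dx\Big| \le \|f\|_{L^2}\|D^\alpha\phi\|_{L^2} \le \|f\|_{L^2}\|\phi\|_{H^m} \qquad (7.38)$$

using the Cauchy-Schwarz inequality. If $\{\phi_k\}$ is any Cauchy sequence in
$\mathcal{D}(\Omega)$ with limit $u$ in $H_0^m(\Omega)$, then by replacing $\phi$ with $\phi_k$ in (7.38) and
taking the limit as $k \to \infty$, we see that $D^\alpha f$ is a bounded linear functional
on $H_0^m(\Omega)$ for $f \in L^2(\Omega)$ and $|\alpha| \le m$. That is, $D^\alpha f$ belongs to $[H_0^m(\Omega)]' =
H^{-m}(\Omega)$.
Conversely, if $q$ belongs to $H^{-m}(\Omega)$, then by the Riesz Representation
Theorem there is a $u \in H_0^m(\Omega)$ such that

$$\langle q, \phi \rangle = (u, \phi)_{H^m} \quad \text{for all } \phi \in \mathcal{D}(\Omega).$$

Now, for any $v \in H_0^m(\Omega)$, let $\{\phi_k\} \subset \mathcal{D}(\Omega)$ with $\lim_{k\to\infty} \phi_k = v$. Then

$$(u, \phi_k)_{H^m} = \int_\Omega \sum_{|\alpha|\le m} (D^\alpha u)(D^\alpha\phi_k)\,dx = \sum_{|\alpha|\le m} (-1)^{|\alpha|}\langle D^\alpha(D^\alpha u), \phi_k \rangle.$$

Hence, as $k \to \infty$ we have

$$\langle q, v \rangle = \Big\langle \sum_{|\alpha|\le m} (-1)^{|\alpha|} D^\alpha(D^\alpha u), v \Big\rangle$$

so that $q$ is of the form

$$q = \sum_{|\alpha|\le m} (-1)^{|\alpha|} D^\alpha(D^\alpha u),$$

which gives the desired result since $D^\alpha u \in L^2(\Omega)$. □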

Example
18. Theorem 11 gives a useful way of characterizing the negative Sobolev
spaces $H^{-m}(\Omega)$; indeed, (7.37) indicates that if we differentiate a
member of $L^2(\Omega)$ up to $m$ times, we get a functional $q$ on $H_0^m$. For
example, take

$$H(x) = \begin{cases} 0, & -1 < x < 0, \\ 1, & 0 < x < 1, \end{cases}$$

which belongs to $L^2(\Omega)$. We know that $H' = \delta$, the Dirac delta, so
we conclude that

$$\delta \in H^{-1}(-1, 1).$$

This should give some idea of the nature of members of $H^{-m}(\Omega)$; as
$m$ gets larger, we find progressively more irregular distributions in
$H^{-m}(\Omega)$. We note here that
$H^m(\Omega) \subset H^{m-1}(\Omega) \subset \cdots \subset H^0(\Omega) = L^2(\Omega) \subset H^{-1}(\Omega) \subset \cdots \subset H^{-m}(\Omega)$
(recall that $[H^0(\Omega)]' = H^0(\Omega)$).
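The identity $H' = \delta$ used in this example means that $-\int_{-1}^{1} H\phi'\,dx = \phi(0)$ for every test function $\phi$. The sketch below checks this for one particular test function; it is an illustration only, and the names `phi`, `dphi`, `heaviside`, and `action_of_H_prime`, as well as the choice of the bump function $\exp(-1/(1 - x^2))$, are assumptions of this example.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def phi(x):
    """Smooth test function with compact support in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def dphi(x):
    """Classical derivative of phi."""
    if abs(x) >= 1.0:
        return 0.0
    return phi(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def heaviside(x):
    """Step function: 0 for x <= 0, 1 for x > 0."""
    return 1.0 if x > 0.0 else 0.0

def action_of_H_prime():
    """<H', phi> = -integral over (-1, 1) of H * phi'."""
    return -simpson(lambda x: heaviside(x) * dphi(x), -1.0, 1.0)

print(action_of_H_prime(), phi(0.0))
```

The two printed numbers agree to within quadrature error, both being approximately $e^{-1}$, which is the value $\langle \delta, \phi \rangle = \phi(0)$.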

7.6 Bibliographical remarks

The definition of a distribution given in Section 7.1 is a simplified one
that avoids a number of complicated topological considerations, but which
is adequate for our purposes. The classical reference to distributions is
Schwartz [46], who was responsible for developing much of the theory. The
book [47] by Schwartz on mathematical methods also contains an account
of distributions. Other references worth consulting are those by Dautray
and Lions [13], Oden and Reddy [38], Roman [42], Showalter [48], and
Zeidler [54, 53].
We have restricted attention to the Hilbert space $W^{m,2}(\Omega) \equiv H^m(\Omega)$.
A comprehensive treatment of the Sobolev spaces $W^{m,p}(\Omega)$ may be found
in the books by Adams [1], Dautray and Lions [13], Oden and Reddy [38],
Showalter [48], and Zeidler [54, 53]. Very accessible treatments may also be
found in the works by Rektorys [41] and, at a more advanced level, Nečas
[34].
It is possible to define Sobolev spaces $W^{s,p}(\Omega)$ for real values of $s$. These
spaces are important in certain classes of boundary value problems, partic-
ularly nonlinear problems. Of some relevance to the treatment given in this
chapter is the boundary space $W^{1/2,2}(\Gamma)$ or $H^{1/2}(\Gamma)$, the chief property of
this space being that it is the range of the trace operator (recall that the
range of $\gamma$ is merely dense in $L^2(\Gamma)$). A proper treatment of this space, and
of fractional Sobolev spaces in general, is necessarily a lengthy process, and
is omitted. Certainly it suffices in later chapters to work with $L^2(\Gamma)$.

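7.7 Exercises
Distributions

7.1. If $\alpha$ is a multi-index in $\mathbb{Z}^n_+$, define $\alpha!$ and $x^\alpha$ by

$$\alpha! = \alpha_1!\,\alpha_2! \cdots \alpha_n!, \qquad x^\alpha = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}.$$

Verify that the conventional Taylor expansion of a function $f$ about
the origin takes the form

$$f(x) = \sum_{|\alpha|=0}^{\infty} \frac{x^\alpha}{\alpha!}\,D^\alpha f(0).$$

[Expand the right-hand side for the case $n = 2$, by working out the
first few terms.]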

7.2. Show that the Dirac delta $\delta$ is not generated by a locally integrable
function $\delta(x)$, as follows. Let $\phi_a(x)$ be the test function defined by

$$\phi_a(x) = \begin{cases} \exp\big[a^2/(x^2 - a^2)\big], & |x| < a, \\ 0, & b > |x| \ge a, \end{cases}$$

for $b > a > 0$. Assume that a function $\delta(x)$ exists, and show that

Consider the limit as $a \to 0$ and obtain a contradiction.

7.3. Prove that the only continuous function $f$ for which $\langle f, \phi \rangle =
\int f(x)\phi(x)\,dx = 0$ for all $\phi \in \mathcal{D}(\Omega)$, is the zero function. [Use the
result in Exercise 2.6.]
Derivatives of distributions
7.4. Use the standard form of Green's theorem to show that

$$\int_\Omega (D^\alpha u)v\,dx = (-1)^{|\alpha|}\int_\Omega u(D^\alpha v)\,dx + \int_\Gamma h(u, v)\,ds$$

for $|\alpha| = m$, where $u, v \in C^m(\bar\Omega)$, and $h(u, v)$ is a function of $u$ and
$v$.

7.5. Show that $(\operatorname{sgn})' = 2\delta$ on $(-1, 1)$, where

$$\operatorname{sgn} x = \begin{cases} x/|x| & \text{for } x \ne 0, \\ 0 & \text{for } x = 0. \end{cases}$$

7.6. Show that $f'' = a\delta - a^2(\sin ax)H$ on $(-1, 1)$, where $f$ is (the dis-
tribution generated by) the function $f(x) = (\sin ax)H(x)$, and $a$ is
constant.
7.7. Show that the function

$$f(x) = \begin{cases} x, & -1 < x \le 0, \\ x + c, & 0 < x \le 1, \end{cases}$$

has a generalized derivative given by $f' = df/dx + c\delta = 1 + c\delta$.


7.8. Let f be defined on n = (-1,1) x (-1,1) C ]R2 by

fex) = { xy, if xy ~ 0,
0, if xy< O.

+1,
=Ei' f jaxay ~ g(x) ~ {
x ~ 0, y ~ 0,
Show ,hat n",l' f -1, x ~ 0, y ~ 0,
0, otherwise.

7.9. Find the solution to the distributional differential equation

[Try $u$ in the form $u = Hf$, where $H$ is the step function and $f \in
C^2(-1, 1)$.]
The Sobolev spaces $H^m(\Omega)$
7.10. To which spaces $H^m(\Omega)$ do the following functions $u$ belong?

0, 0< x< 1,
(a) u'(x) = { x-I, 1 :S x< 2,
x3 - x2 - 3, 2 :S x < 3;
xy,
x(2 - y),
°°:S:S x :S:S °:S<
x
1, y < 1,
1, 1 y :S 2,
{
(b) u(x, y) = X O:Sx<~, y=l,
371", x =~, y = 1,
x, ~ < x :S 1, Y = 1.
7.11. Show that the functions

$$u(x) = \begin{cases} x, & 0 \le x \le 1, \\ 2 - x, & 1 \le x \le 2, \end{cases} \qquad \text{and} \qquad v(x) = \sin \pi x,$$

are orthogonal in $L^2(0, 2)$. Investigate whether they are orthogonal
in $H^1(0, 2)$. Find the distance between $u$ and $v$ in $L^2(0, 2)$ and in
$H^1(0, 2)$.
7.12. For the functions $u$ and $u - v$ in Exercise 7.11, verify the Cauchy-
Schwarz inequality for $L^2(0, 2)$ and $H^1(0, 2)$.
7.13. Use the Sobolev Embedding Theorem to show that the function

$$u(x) = \begin{cases} x^2 y^2, & x > 0,\ y > 0, \\ 0, & \text{otherwise}, \end{cases}$$

is continuous on $\Omega = (-1, 1) \times (-1, 1)$.

Boundary values of functions in $H^m(\Omega)$
7.14. Starting with (7.2), derive Green's theorem (7.30) for functions in
$H^1(\Omega)$. [Apply (7.2) to sequences $u_n$, $v_n$ in $C^1(\bar\Omega)$ and use the fact
that $D^\alpha u_n \to D^\alpha u$ in $L^2$, together with the continuity of the inner
product and of the trace operator.]
7.15. Derive the Green's formula

$$\int_\Omega \nabla^2 u\,\nabla^2 v\,dx = \int_\Omega (\nabla^4 u)v\,dx + \int_\Gamma \Big[(\nabla^2 u)\frac{\partial v}{\partial\nu} - \frac{\partial}{\partial\nu}(\nabla^2 u)\,v\Big]\,ds$$

for $u \in H^4(\Omega)$, $v \in H^2(\Omega)$.

The spaces $H_0^m(\Omega)$ and $H^{-m}(\Omega)$

7.16. Complete the proof of Theorem 9 by extending the inequality from
$C_0^\infty(\Omega)$ to $H_0^1(\Omega)$. Show also that $|\cdot|_1$ is a norm on $H_0^1(\Omega)$.
7.17. Show that the seminorm (7.35) is a norm on $H_0^m(\Omega)$, equivalent to
the standard $H^m$-norm.

7.18. Use Green's theorem to show that

7.19. Since $H^{-m}(\Omega)$ consists of continuous linear functionals on $H_0^m(\Omega)$,
the norm on $H^{-m}(\Omega)$ is defined by (see Section 5.4)

$$\|f\|_{H^{-m}} = \sup \frac{|\langle f, v \rangle|}{\|v\|_{H^m}}, \quad v \in H_0^m(\Omega).$$

Under what conditions is the Dirac delta a member of $H^{-m}(\Omega)$?

7.20. In the space $H^1(\Omega)$ show that the orthogonal complement of $H_0^1(\Omega)$
is the subspace of functions $u \in H^1(\Omega)$ for which $\nabla^2 u = u$ (distribu-
tionally). Find a basis for $H_0^1(\Omega)^\perp$ for the case $\Omega = (0, 1) \subset \mathbb{R}$.

7.21. Show that $u(x) = \ln x$ is a member of $L^2(0, 1)$, and hence that $v(x) =
1/x$ belongs to $H^{-1}(0, 1)$.
Part II

Elliptic Boundary Value Problems

8
Elliptic boundary value problems

In this chapter we return to the topic of the Introduction, and set about
the process of developing a mathematically coherent framework for bound-
ary value problems. Section 8.1 sets the stage by introducing a range of
problems involving differential equations; we saw some examples in the
Introduction, and here the opportunity is taken to introduce a few more.
In the remaining four sections we build up towards a general theory for
the existence, uniqueness, and regularity of solutions to elliptic boundary
value problems. The problem is posed as one involving an elliptic operator
from one Sobolev space to another. To the uninitiated, the ideas discussed
here may seem esoteric at times; rather than discuss techniques for solving
boundary value problems, the results obtained are of a qualitative nature.
This is precisely the program of investigation that was proposed in the
Introduction, and the intention is that the motivating ideas of that chapter,
together with the theory developed here, convey the relevance of these
qualitative results to a proper understanding of the problem.

8.1 Differential equations, boundary conditions, and initial conditions
The main ideas of this section have in fact already been stated in the
Introduction, albeit rather succinctly. Here we expand on many of those
notions, and introduce a few more definitions relevant to the study of
boundary value problems.

Differential equations. Differential equations are the lifeblood of any
mathematical modeling process in which real-life situations are translated
into mathematical language through the use of functions of position or
time, or both. The assumption that such functions are differentiable to
some extent, together with sets of equations that represent natural laws
(conservation or balance laws, for example) and that capture in
mathematical form the behavior of particular media, leads to mathematical
models in the form of differential equations.
At the heart of a differential equation is an unknown function $u$, say,
that could be a function of one or more independent variables
$x_1, x_2, \ldots, x_n, t$. The variables $x_1, x_2, \ldots, x_n$, of which there are
invariably three or fewer, usually refer to coordinates of a point in space.
As before we use $\mathbf{x} = (x, y, z)$ rather than $\mathbf{x} = (x_1, x_2, x_3)$ for a point
in $\mathbb{R}^3$, whenever this is more convenient. The variable $t$ refers in a
physical context to time.
A differential equation (DE) is any equation involving the independent
variables $x_1, x_2, \ldots, x_n, t$, a function $u$ of these variables, and some of
the derivatives of $u$ with respect to these variables. If there is only one
independent variable, then the DE is called an ordinary differential equation
or ODE; on the other hand, if there are two or more independent variables,
it is a partial differential equation (PDE).
In addition to the variables mentioned there may be other given functions
that appear in the DE; these, together with any other information that is
given beforehand, constitute the data of the problem.
The order of a DE is defined to be the order of the highest derivative
appearing in the equation.
It may be the case that the unknown function is vector- rather than
scalar-valued. If, for example, the unknown function $\mathbf{u}$ has components
$u_i$ ($i = 1, 2, 3$), each of which is a function of $\mathbf{x} \in \mathbb{R}^3$ and $t$, then there will
be not one but a system of three PDEs defining the problem, one equation
for each unknown.
A DE (or a system of DEs) is linear if it can be written in the form
$Au = f$, where $A$ is a linear operator. Otherwise it is a nonlinear DE.

Examples

1. Biological population growth. Suppose that we wish to model the
change in a biological population with time. The population at time
$t$ is denoted by $u(t)$, and since there is only one independent variable
$t$, it is an ODE that will model this process.
A simple example of such a model is one in which it is assumed that
the rate of change of population depends on the current population,
and on the difference between the birth rate per capita $b(u)$ and the
death rate per capita $d(u)$. The functions $b$ and $d$ constitute part of
the data of the problem, and the ODE corresponding to this model
is given by

\[ \frac{du}{dt} = [b(u) - d(u)]\,u. \tag{8.1} \]

This is a first-order nonlinear ODE, since the operator
$Au \equiv du/dt - [b(u) - d(u)]u$ is nonlinear.

2. Heat conduction or diffusion. The unsteady heat or diffusion equation

\[ \frac{\partial u}{\partial t} - \frac{1}{c\rho}\,\mathrm{div}\,(K \nabla u) = Q \tag{8.2} \]

derived in the Introduction (see equation (2) there) is an example of
a second-order PDE. The functions $c$, $\rho$, $K$, and $Q$ constitute part of
the data of the problem, as does the domain on which the problem
is posed. Note that this is a linear PDE since the operator $A$ defined
by $Au = \partial u/\partial t - (1/c\rho)\,\mathrm{div}\,(K\nabla u)$ is linear.
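
The difference between Examples 1 and 2 can be probed numerically through the superposition property $A(\alpha u + \beta v) = \alpha Au + \beta Av$, which every linear operator must satisfy. The following Python sketch is an illustration only and is not taken from the text: it represents functions of a single variable as polynomial coefficient lists, uses the steady one-dimensional form of the operator in (8.2) with constant data, and assumes the hypothetical rate $b(u) - d(u) = 1 - u$ in (8.1). All function names and numerical values are our own choices.

```python
# Polynomials are coefficient lists [c0, c1, c2, ...] for c0 + c1*s + c2*s**2 + ...

def add(p, q):
    n = max(len(p), len(q))
    p = p + [0.0] * (n - len(p))
    q = q + [0.0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(c, p):
    return [c * a for a in p]

def mul(p, q):
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def deriv(p):
    # Exact derivative of a polynomial; a constant differentiates to [0.0]
    return [k * p[k] for k in range(1, len(p))] or [0.0]

def heat_operator(u, K=2.0, c_rho=4.0):
    # Steady 1D form of the operator in (8.2), constant K (assumed data):
    # A u = -(1/(c*rho)) * (K u')'
    return scale(-K / c_rho, deriv(deriv(u)))

def population_operator(u):
    # Operator of (8.1) with the assumed rate b(u) - d(u) = 1 - u:
    # A u = du/dt - (1 - u) u
    growth = mul(add([1.0], scale(-1.0, u)), u)
    return add(deriv(u), scale(-1.0, growth))

def superposition_defect(A, u, v, alpha=2.0, beta=-3.0):
    # Largest coefficient of A(alpha u + beta v) - alpha A(u) - beta A(v)
    lhs = A(add(scale(alpha, u), scale(beta, v)))
    rhs = add(scale(alpha, A(u)), scale(beta, A(v)))
    diff = add(lhs, scale(-1.0, rhs))
    return max(abs(c) for c in diff)

u = [1.0, 2.0, 0.0, 1.0]   # 1 + 2s + s^3
v = [0.0, -1.0, 4.0]       # -s + 4s^2
print(superposition_defect(heat_operator, u, v))        # vanishes: linear operator
print(superposition_defect(population_operator, u, v))  # does not vanish: nonlinear
```

The first defect vanishes because differentiation and multiplication by constants are linear operations; the second does not, because of the quadratic term $u^2$ in the assumed growth law.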

3. The Poisson equation. The assumption that heat conduction is steady
(that is, time-independent), and that the medium is homogeneous (so
that $c$, $\rho$, and $K$ in (8.2) are constant) leads to the second-order PDE
known as the Poisson equation; this is given by

\[ -\frac{K}{c\rho}\,\nabla^2 u = Q, \]

in which

\[ \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \]

is the Laplacian operator in $\mathbb{R}^3$. Recall from the Introduction that this
equation also arises, on a domain in two dimensions, in the problem
of the deflection of an elastic membrane.

4. One-dimensional heat conduction. An example of a spatial ODE may
be obtained by specializing Example 2 to a situation in which, first,
the conduction is steady (so that time disappears as a variable) and,
second, all the data depend on one variable, $x$, say. Then the problem
of steady one-dimensional heat conduction corresponds to

\[ -\frac{1}{c\rho}\frac{d}{dx}\left( K\frac{du}{dx} \right) = Q. \tag{8.3} \]

Note that the left-hand side of (8.3) has the form of a Sturm-Liouville
operator (recall Section 6.5).

5. Linear elasticity. This next example is new, and yields a system of
PDEs in which the unknown is a vector-valued function.

FIGURE 8.1. The deformation of an elastic body

An elastic body is defined to be a solid which, once deformed, will
revert to its original shape if the forces causing the deformation are
removed. As in the derivation of the heat equation in the Introduction,
the equations of elasticity are obtained from a balance law, viz.
balance of momentum, together with a constitutive law, viz. Hooke's
law.
The analogue of the temperature $u$ is the displacement $\mathbf{u}$, whereas the
function analogous to the heat flux $\mathbf{q}$ is the stress $\boldsymbol{\sigma}$. The displacement
is a vector with components $(u_1, u_2, u_3)$, and $\mathbf{x} + \mathbf{u}(\mathbf{x}, t)$ gives the
position at time $t$ occupied by a material particle originally located
at $\mathbf{x}$ (Figure 8.1). The stress is a second-order tensor, but can be
regarded as a symmetric $3 \times 3$ matrix with components $\sigma_{ij}$ for the
purposes of this discussion. The stress characterizes the internal forces
at any point in a body in a very simple way according to Cauchy's
law, which states that the (vector) force per unit area $\mathbf{t}$ acting on a
surface with unit normal $\boldsymbol{\nu}$ is given by

\[ \mathbf{t} = \boldsymbol{\sigma}\boldsymbol{\nu}. \tag{8.4} \]

Proceeding in a manner completely analogous to that for the case of
the heat equation (cf. (2) through (7) in the Introduction, and see
also Exercise 8.2), we obtain Cauchy's equation of motion

\[ \rho \frac{\partial^2 \mathbf{u}}{\partial t^2} - \mathrm{div}\,\boldsymbol{\sigma} = \mathbf{Q} \tag{8.5} \]

in which $\rho$ is the mass density and $\mathbf{Q}$ is a prescribed body force per
unit volume. Just as the operator div maps a vector to a scalar, when
applied to a matrix it produces a vector, according to the formula

\[ (\mathrm{div}\,\boldsymbol{\sigma})_i = \sum_{j=1}^{3} \frac{\partial \sigma_{ij}}{\partial x_j}. \]

In component form, (8.5) is therefore the set of equations

\[ \rho \frac{\partial^2 u_i}{\partial t^2} - \sum_{j=1}^{3} \frac{\partial \sigma_{ij}}{\partial x_j} = Q_i \qquad (i = 1, 2, 3). \tag{8.6} \]

The analogue of Fourier's law is the elasticity law, sometimes known
as the generalized Hooke's law. Just as Fourier's law relates the heat
flux $\mathbf{q}$ to derivatives of the temperature, in the same way the elasticity
law relates the stress to certain derivatives of the displacement. These
derivatives are contained in the symmetric strain tensor or matrix $\boldsymbol{\epsilon}$,
which measures deformation in the body, and whose components are
given by

\[ \epsilon_{ij}(\mathbf{u}) = \frac{1}{2}\left( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \right). \tag{8.7} \]

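The constitutive law for linear elastic materials then states that the
stress depends linearly on the strain at every point of the body; that
is,

\[ \boldsymbol{\sigma} = \mathbf{C}\boldsymbol{\epsilon}(\mathbf{u}), \tag{8.8} \]

so that $\mathbf{C}$ is a linear operator that takes strains to stresses, and is
known as the elasticity tensor. When written out in component form
(8.8) becomes

\[ \sigma_{ij} = \sum_{k,l=1}^{3} C_{ijkl}\,\epsilon_{kl}(\mathbf{u}), \]

so that in general each component of $\boldsymbol{\sigma}$ depends on every component
of $\boldsymbol{\epsilon}$. In practice this dependence can be narrowed down quite
considerably, and we in fact focus on one special but very important case,
viz. that corresponding to isotropic elasticity. For this case (8.8) reads

\[ \boldsymbol{\sigma} = \lambda\,(\mathrm{tr}\,\boldsymbol{\epsilon})\,\mathbf{I} + 2\mu\,\boldsymbol{\epsilon}, \]

in which $\lambda$ and $\mu$ are material coefficients known as Lamé's constants,
and the trace $\mathrm{tr}\,\mathbf{M}$ of a matrix $\mathbf{M}$ is defined by $\mathrm{tr}\,\mathbf{M} = \sum_{i=1}^{3} M_{ii}$.
Thus the components of the elasticity tensor are given by

\[ C_{ijkl} = \lambda\,\delta_{ij}\delta_{kl} + \mu\,(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}). \tag{8.9} \]

It is of course possible to express the stress directly as a function of
the displacement, by writing

\[ \boldsymbol{\sigma} = \Diamond\mathbf{u}, \]

where the elasticity operator $\Diamond$ is defined by

\[ \Diamond\mathbf{u} = \lambda\,[\mathrm{tr}\,\boldsymbol{\epsilon}(\mathbf{u})]\,\mathbf{I} + 2\mu\,\boldsymbol{\epsilon}(\mathbf{u}). \]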

When written out in full, this reads

and so on. Thus although the constitutive equation is a little more
complex than that for the case of heat conduction, the structure is
exactly the same; for heat conduction the operator in question is $\nabla$
whereas for elasticity it is $\Diamond$.
Continuing the analogy, we eliminate the flux (in this case the stress)
from (8.5) to obtain a system of equations in the components of $\mathbf{u}$;
these are known as Navier's equations, and are found by straightforward
substitution to be

\[ \rho \frac{\partial^2 \mathbf{u}}{\partial t^2} - \mathrm{div}\,[\mathbf{C}\boldsymbol{\epsilon}(\mathbf{u})] = \mathbf{Q} \tag{8.10} \]

or, after substituting for $\mathbf{C}$,

\[ \rho \frac{\partial^2 \mathbf{u}}{\partial t^2} - \mu \nabla^2 \mathbf{u} - (\lambda + \mu)\,\nabla(\mathrm{div}\,\mathbf{u}) = \mathbf{Q}. \tag{8.11} \]

This represents a set of three second-order linear PDEs in the three
components of $\mathbf{u}$; their structure should be compared with that of
the heat equation (Box 1 in the Introduction).
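
The chain from displacement gradient to strain to stress can be traced on a small numerical example. The Python sketch below is illustrative and not part of the text: it evaluates the strain (8.7) from an arbitrary displacement gradient, computes the isotropic stress $\lambda(\mathrm{tr}\,\boldsymbol{\epsilon})\mathbf{I} + 2\mu\boldsymbol{\epsilon}$ directly, and compares it with the contraction $\sum_{k,l} C_{ijkl}\epsilon_{kl}$, assuming the standard isotropic form $C_{ijkl} = \lambda\delta_{ij}\delta_{kl} + \mu(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk})$. The numerical values of the gradient and of $\lambda$, $\mu$ are hypothetical.

```python
def delta(i, j):
    # Kronecker delta
    return 1.0 if i == j else 0.0

def strain(grad_u):
    # (8.7): eps_ij = (du_i/dx_j + du_j/dx_i) / 2
    return [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(3)]
            for i in range(3)]

def stress_isotropic(eps, lam, mu):
    # sigma = lam * tr(eps) * I + 2 * mu * eps
    trace = sum(eps[i][i] for i in range(3))
    return [[lam * trace * delta(i, j) + 2.0 * mu * eps[i][j] for j in range(3)]
            for i in range(3)]

def elasticity_tensor(lam, mu):
    # Assumed isotropic form: C_ijkl = lam d_ij d_kl + mu (d_ik d_jl + d_il d_jk)
    return [[[[lam * delta(i, j) * delta(k, l)
               + mu * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k))
               for l in range(3)] for k in range(3)]
             for j in range(3)] for i in range(3)]

def stress_from_tensor(C, eps):
    # sigma_ij = sum over k, l of C_ijkl * eps_kl
    return [[sum(C[i][j][k][l] * eps[k][l] for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

# Hypothetical displacement gradient du_i/dx_j and Lame constants
grad_u = [[0.01, 0.02, 0.00],
          [0.00, -0.03, 0.04],
          [0.05, 0.00, 0.02]]
lam, mu = 1.5, 0.7

eps = strain(grad_u)
s_direct = stress_isotropic(eps, lam, mu)
s_tensor = stress_from_tensor(elasticity_tensor(lam, mu), eps)
max_gap = max(abs(s_direct[i][j] - s_tensor[i][j])
              for i in range(3) for j in range(3))
print(max_gap)  # agreement up to rounding error
```

Both routes give the same symmetric stress matrix, which is the sense in which the fourth-order tensor $\mathbf{C}$ collapses to the two constants $\lambda$ and $\mu$ in the isotropic case.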

6. Deflection of a plate. The next example also comes from linear
elasticity, and concerns the special case in which the body is a thin
plate. That is, one of its dimensions, in the $z$ direction, say, is very
much smaller than the other two, and the body occupies the region
$\Omega \times (-h/2, h/2)$, where $\Omega$ is a domain in $\mathbb{R}^2$, so that geometrically
the plate is flat (Figure 8.2). It is assumed that external forces act
only in the $z$ direction. This set of circumstances allows various
assumptions to be made about the deformation of the plate. First, the
midsurface $\Omega$ is assumed to undergo a displacement with components
$u_1(x,y,0) = u_2(x,y,0) = 0$ and $u_3(x,y,0) \equiv w(x,y)$. Second,
we invoke a key geometrical assumption known as the Kirchhoff-Love
hypothesis: this states that sections of the plate that are straight, and
normal to the midplane $\Omega$, remain straight and normal after deformation.
The Kirchhoff-Love hypothesis has an immediate consequence,
which is that the inplane displacements can be expressed in terms of
the transverse displacement; indeed, from Figure 8.2 we see that

\[ u_1(x,y,z) = -z\frac{\partial w}{\partial x} \quad \text{and} \quad u_2(x,y,z) = -z\frac{\partial w}{\partial y}, \tag{8.12} \]

to which is added

\[ u_3(x,y,z) = w(x,y). \tag{8.13} \]

FIGURE 8.2. A thin elastic plate

The governing equation for an elastic plate is obtained by imposing
these assumptions on the elasticity equations. First, we adopt the
convention that Greek suffixes range over 1 and 2. Next, we define
the components $S_\alpha$ and $M_{\alpha\beta}$ of the shear force vector $\mathbf{S}$ and bending
moment matrix $\mathbf{M}$ by

\[ S_\alpha = \int_{-h/2}^{h/2} \sigma_{3\alpha}\,dz \quad \text{and} \quad M_{\alpha\beta} = \int_{-h/2}^{h/2} z\,\sigma_{\alpha\beta}\,dz. \]

These are quantities that are averaged over the thickness of the plate;
their interpretations are illustrated in Figure 8.2.
The shear force is eventually eliminated, but a constitutive equation
is required for $\mathbf{M}$. This may be derived from the generalized Hooke's
law, which together with (8.12) becomes

\[ M_{\alpha\beta} = -D\left[ \nu\,(\nabla^2 w)\,I_{\alpha\beta} + (1 - \nu)\frac{\partial^2 w}{\partial x_\alpha \partial x_\beta} \right]. \tag{8.14} \]

Here $\nabla^2$ is the two-dimensional Laplace operator, $I_{\alpha\beta}$ are the
components of the $2 \times 2$ identity matrix, and $D$ is called the bending
stiffness; it depends on the material and the geometry, and is defined
by $D = Eh^3/[12(1 - \nu^2)]$, in which $E$ and $\nu$ are material constants
known, respectively, as Young's modulus and Poisson's ratio (nothing
to do with the Poisson equation!). These two constants may be
expressed in terms of the Lamé moduli if desired.
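Assuming static (time-independent) behavior and an external force
per unit area $q$ acting only in the vertical direction, the use of (8.7)
together with the definitions of shear force and bending moment can
be shown (see Exercise 8.4) to lead to the pair of equations

\[
\begin{aligned}
\sum_{\beta=1}^{2} \frac{\partial M_{\alpha\beta}}{\partial x_\beta} - S_\alpha &= 0, \\
\sum_{\alpha=1}^{2} \frac{\partial S_\alpha}{\partial x_\alpha} + q &= 0.
\end{aligned}
\tag{8.15}
\]

Finally, elimination of $S_\alpha$ from these equations, and use of the
constitutive equation (8.14) leads to the linear fourth-order PDE

\[ D\nabla^4 w = q, \]

in which $\nabla^4$, the biharmonic operator, is defined by

\[ \nabla^4 w = \frac{\partial^4 w}{\partial x^4} + 2\frac{\partial^4 w}{\partial x^2 \partial y^2} + \frac{\partial^4 w}{\partial y^4}. \tag{8.16} \]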
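
The action of the biharmonic operator can be checked on a simple trial deflection. The Python sketch below is illustrative and not part of the text: it applies the standard 13-point central-difference stencil for $w_{xxxx} + 2w_{xxyy} + w_{yyyy}$ to the hypothetical polynomial $w = x^4 + x^2y^2 + y^4$, for which the exact value is $24 + 2 \cdot 4 + 24 = 56$ at every point, and then evaluates the load that the plate equation would associate with that deflection. The grid spacing is called `d` to avoid confusion with the plate thickness $h$; the material values are made up.

```python
def bending_stiffness(E, h, nu):
    # D = E h^3 / (12 (1 - nu^2))
    return E * h**3 / (12.0 * (1.0 - nu**2))

def biharmonic(w, x, y, d):
    # 13-point central-difference stencil for w_xxxx + 2 w_xxyy + w_yyyy
    centre = 20.0 * w(x, y)
    edges = -8.0 * (w(x + d, y) + w(x - d, y) + w(x, y + d) + w(x, y - d))
    corners = 2.0 * (w(x + d, y + d) + w(x + d, y - d)
                     + w(x - d, y + d) + w(x - d, y - d))
    far = (w(x + 2 * d, y) + w(x - 2 * d, y)
           + w(x, y + 2 * d) + w(x, y - 2 * d))
    return (centre + edges + corners + far) / d**4

def w_trial(x, y):
    # Hypothetical trial deflection, chosen so the exact answer is known
    return x**4 + x**2 * y**2 + y**4

value = biharmonic(w_trial, 0.3, -0.2, 0.1)
D = bending_stiffness(E=12.0, h=1.0, nu=0.0)   # made-up material data
q = D * value   # load balanced by this deflection, from D * biharmonic(w) = q
print(value, q)
```

For polynomials of this degree the stencil is exact apart from rounding, so the printed value agrees with 56 to many digits; for a general smooth $w$ the error is of second order in the spacing.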

7. Deflection of a beam. A body that is rectilinear in shape, and whose
length is considerably greater than its two other dimensions, is known
as a beam. The equations of elasticity, when applied to beams, simplify
in much the same way as they do for plates, the difference being that
the theory for beams is one-dimensional.
Consider then inplane deflection of the beam shown in Figure 8.3; it
has length $L$, breadth $b$, and depth $d$, and its breadth and depth are
assumed to be much smaller than its length. The beam is subjected
to a force of intensity $q$ per unit length. The assumptions underlying
beam theory are very similar to those for plates, so these are discussed
only briefly. First, the midplane $z = 0$ of the beam is identified,
and it is assumed that the midplane displacements are of the form
$u_1(x,y,0) = u_2(x,y,0) = 0$, and $u_3(x,y,0) \equiv w(x)$. The analogue
of the Kirchhoff-Love assumption is the Euler-Bernoulli hypothesis,
according to which plane sections that are normal to the midplane
before deformation remain plane and normal; by analogy with (8.12)
we thus obtain

\[ u_1(x,y,z) = -z\,w'(x), \qquad u_2(x,y,z) = 0, \qquad u_3(x,y,z) = w(x). \]

Thus in particular, $\epsilon_{11} = -z w''$.

FIGURE 8.3. Inplane deformation of a beam

Next, we define the bending moment $M$ and shear force $S$ according
to

\[ M(x) = \int_A \sigma_{11}\, z \, dy\, dz, \qquad S(x) = \int_A \sigma_{31} \, dy\, dz; \]

by carrying out a series of manipulations similar to those that lead to
(8.15) for plates (see Exercise 8.4), we find that two of the equilibrium
equations give

\[
\begin{aligned}
M' - S &= 0, \\
S' + q &= 0.
\end{aligned}
\tag{8.17}
\]

The constitutive equation for the bending moment comes from the
assumption that Poisson's ratio $\nu$ is very nearly zero; thus from (8.14),
for example, $\sigma_{11} = E\epsilon_{11}$ and so, after substitution for $\epsilon_{11}$,
multiplication by $z$ and integration with respect to $y$ and $z$, we find that

\[ M = -EI w'', \tag{8.18} \]

in which $I$ is a property of the cross-sectional area known as the
second moment of area, and is defined by $I = \int_A z^2 \, dy\, dz = bd^3/12$.
The constitutive equation for the shear stress may be found from
(8.17)$_1$, and is

\[ S = -EI w'''. \tag{8.19} \]

Elimination of $S$ from (8.17) thus leads to the governing equation

\[ EI \frac{d^4 w}{dx^4} = q \tag{8.20} \]

for the deflection of a beam.
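
Two of the formulas in the beam example lend themselves to a quick numerical check. The Python sketch below is illustrative and not part of the text: it approximates the second moment of area of a rectangular section by a midpoint rule and compares it with $bd^3/12$, and it verifies by finite differences that a candidate deflection satisfies (8.20). The candidate $w(x) = q x^2 (L - x)^2/(24EI)$ is the classical solution for a uniformly loaded beam clamped at both ends; the clamped ends and all numerical values are assumptions made for the sake of the example.

```python
def second_moment_of_area(b, d, n=200):
    # Midpoint rule for I = integral of z^2 over (-b/2, b/2) x (-d/2, d/2);
    # the integrand does not depend on y, so the y-integral contributes b.
    dz = d / n
    total = 0.0
    for k in range(n):
        z = -d / 2.0 + (k + 0.5) * dz
        total += z * z * dz
    return b * total

def deflection(x, q, E, I, L):
    # Assumed clamped-clamped beam under a uniform load q
    return q * x**2 * (L - x)**2 / (24.0 * E * I)

def fourth_derivative(f, x, s):
    # Five-point central difference for the fourth derivative
    return (f(x - 2 * s) - 4 * f(x - s) + 6 * f(x)
            - 4 * f(x + s) + f(x + 2 * s)) / s**4

b, d, L, E, q = 0.12, 1.0, 1.0, 1.0, 1.0   # made-up data
I_exact = b * d**3 / 12.0
I_num = second_moment_of_area(b, d)

def w(x):
    return deflection(x, q, E, I_exact, L)

# Residual of (8.20) at an interior point; should vanish up to rounding
residual = E * I_exact * fourth_derivative(w, 0.4, 0.05) - q
print(I_num, I_exact, residual)
```

Since the candidate deflection is a quartic, the five-point difference reproduces its fourth derivative exactly apart from rounding, so the residual is negligible.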

Specification of the domain of interest. Physical conditions invariably
dictate that a DE is required to be satisfied only on an open subset $\Omega$ of
$\mathbb{R}^n$ and, if time is present as a variable, over a prescribed length of time.
It follows that a proper description of the physical system must include, in
addition to the DE, a statement indicating the spatial and temporal ranges
of interest. For example, (8.1) needs to be supplemented by a statement to
the effect that we require $u(t)$ for $t$ lying in the range $0 < t \le T$ or $(0, T]$,
where $t = 0$ represents some datum and $T$ is the longest time of interest.
If $t = 0$ is taken to be the present, and a solution is required for all time
in the future, then the range of $t$ is $(0, \infty)$. Similarly, if for example, the
problem has to do with heat conduction in a slab occupying the region
$(0,1) \times (0,1) \times (0,1)$, and if we require a solution for all time, then (8.2)
has to be supplemented by the statements

\[ \mathbf{x} \in \Omega = (0,1)^3 \quad \text{and} \quad t \in (0, \infty). \]

Boundary conditions and initial conditions. Once the domain of
interest has been specified, the next stage in the formulation of the problem
involves the specification of the unknown function and possibly some of its
derivatives on the boundary $\Gamma$ and at the initial time $t = 0$ (if time is
present as a variable). The former are known as boundary conditions (BCs)
and the latter are called initial conditions (ICs). Once again, these are
normally dictated by physical considerations. These ideas were of course
stated in the Introduction, in the context of the heat equation, but they
are reiterated here in this more general context.
If the domain of specification of a DE is purely spatial and denoted
by $\Omega$, then only boundary conditions need to be specified, and the DE
together with the set of BCs is called a boundary value problem (BVP).
A special kind of BVP is one defined on an interval $[a, b]$ of the real line;
then $\Omega = (a, b)$, $\Gamma = \{a, b\}$, and boundary conditions are given at $x = a$
and $x = b$. This kind of problem is called, for obvious reasons, a two-point
boundary value problem.
When the domain is purely temporal, the problem consists of an ODE
defined for $t \in (0, T)$ ($T$ may be infinity) and one or more initial conditions
that specify the unknown function and possibly some of its derivatives at
$t = 0$. This kind of problem is known as an initial value problem (IVP).
Finally, when the domain is both spatial and temporal, the problem
comprises a PDE (or a set of PDEs) together with boundary conditions
and initial conditions. This problem is called an initial boundary value
problem (IBVP).

Examples

8. Population dynamics. Returning to Example 1, the complete
specification of the problem becomes: find $u(t)$ satisfying (8.1), with
$u(0) = u_0$. Thus the initial population is prescribed, and this is an
initial value problem, which is summarized in Box 1.

Box 1: THE IVP FOR POPULATION GROWTH

ODE: $du/dt = [b(u) - d(u)]u$, $t \in (0, \infty)$

IC: $u(0) = u_0$
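
Once the data $b$, $d$, and $u_0$ are fixed, the IVP of Box 1 can be integrated numerically. The Python sketch below is illustrative and not part of the text: it assumes a constant birth rate $b(u) = 1$ and a death rate $d(u) = u/10$ (a logistic model with carrying capacity 10), advances the solution with the classical fourth-order Runge-Kutta method, and compares the result with the known closed-form logistic solution. The rate functions, the step count, and the final time are all hypothetical choices.

```python
import math

def rk4(f, u0, t_end, steps):
    # Classical fourth-order Runge-Kutta for the autonomous ODE du/dt = f(u)
    h = t_end / steps
    u = u0
    for _ in range(steps):
        k1 = f(u)
        k2 = f(u + 0.5 * h * k1)
        k3 = f(u + 0.5 * h * k2)
        k4 = f(u + h * k3)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u

def birth(u):
    return 1.0          # assumed constant per capita birth rate

def death(u):
    return u / 10.0     # assumed per capita death rate, growing with crowding

def rhs(u):
    # Right-hand side of (8.1): [b(u) - d(u)] u
    return (birth(u) - death(u)) * u

u0, T = 1.0, 5.0
u_num = rk4(rhs, u0, T, 500)

# Closed-form logistic solution for growth rate 1 and carrying capacity 10
u_exact = 10.0 / (1.0 + (10.0 / u0 - 1.0) * math.exp(-T))
print(u_num, u_exact)
```

With these assumed data the population rises from $u_0 = 1$ toward the equilibrium $u = 10$, at which the birth and death rates balance.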

9. Heat conduction. Suppose for example that the domain $\Omega$ is the
cylindrical region $r < a$ and $0 < z < L$, where $r^2 = x^2 + y^2$. Suppose
further that the ends $z = 0$ and $z = L$ are insulated and the temperature
is a prescribed constant on the curved part of the boundary
(Figure 8.4). In this case it is more convenient to use cylindrical
coordinates $(r, \theta, z)$; then if the initial temperature is known, and is
given by the function $f(r, \theta, z)$, the initial boundary value problem
corresponding to heat conduction is summarized as in Box 2.

FIGURE 8.4. Heat conduction in a cylindrical domain

Box 2: THE IBVP FOR HEAT CONDUCTION

PDE: $\dfrac{\partial u}{\partial t} - \dfrac{1}{c\rho}\,\mathrm{div}\,(K\nabla u) = Q$

BCs: $\dfrac{\partial u}{\partial z}(r, \theta, 0, t) = \dfrac{\partial u}{\partial z}(r, \theta, L, t) = 0$
$u(a, \theta, z, t) = c$

IC: $u(r, \theta, z, 0) = f(r, \theta, z)$

10. One-dimensional steady heat conduction. Suppose that Example 4
applies to heat conduction in a bar such as that shown in Figure 8.5,
and that the circumstances along the longitudinal sides of the bar
are consistent with the assumption that all variables depend only on
$x$ (for example, the conditions on the surfaces $x = 0$ and $x = l$ are
independent of $y$ and $z$).

We give an example of the kinds of boundary conditions that may be
specified at the ends $x = 0$ and $x = l$ of the slab. Suppose then that
the end $x = 0$ is held at a prescribed temperature, and that at the
other end $x = l$ the heat flux is proportional to the difference between
the ambient temperature $u_a$ and the temperature $u(l)$ at that end of
the bar; this condition is known as Newton's law of cooling. The full
two-point BVP is then as summarized in Box 3.

FIGURE 8.5. One-dimensional heat conduction in a bar

Box 3: THE TWO-POINT BVP FOR STEADY 1D HEAT CONDUCTION

ODE: $-\dfrac{1}{c\rho}\dfrac{d}{dx}\left( K \dfrac{du}{dx} \right) = Q$

BCs: $u(0) = 0$ and $-Ku'(l) = a(u(l) - u_a)$

The constant $a$ is assumed positive; this makes physical sense, since
heat then flows from a high to a low temperature.
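
The two-point BVP of Box 3 is simple enough to be solved by finite differences and compared with its exact solution. The Python sketch below is illustrative and not part of the text: it assumes that $K$, $c\rho$, and $Q$ are constants, discretizes the ODE with central differences on a uniform grid, imposes Newton's law of cooling at $x = l$ through a ghost node, and solves the resulting tridiagonal system by elimination. All function names and numerical values are hypothetical.

```python
def solve_tridiagonal(sub, diag, sup, rhs):
    # Thomas algorithm; sub[0] and sup[-1] are not used
    n = len(diag)
    diag = diag[:]
    rhs = rhs[:]
    for i in range(1, n):
        m = sub[i] / diag[i - 1]
        diag[i] -= m * sup[i - 1]
        rhs[i] -= m * rhs[i - 1]
    x = [0.0] * n
    x[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (rhs[i] - sup[i] * x[i + 1]) / diag[i]
    return x

def steady_heat_bar(K, c_rho, Q, a, u_a, length, n):
    # Unknowns are u_1, ..., u_n on a grid of spacing h; u_0 = 0 is prescribed.
    h = length / n
    source = c_rho * Q                 # the ODE reads -K u'' = c*rho*Q
    sub = [-K / h**2] * n
    diag = [2.0 * K / h**2] * n
    sup = [-K / h**2] * n
    rhs = [source] * n
    # Last row: ghost node eliminated using -K u'(l) = a (u(l) - u_a)
    sub[-1] = -2.0 * K / h**2
    diag[-1] = (2.0 * K + 2.0 * h * a) / h**2
    rhs[-1] = source + 2.0 * a * u_a / h
    return [0.0] + solve_tridiagonal(sub, diag, sup, rhs)

def exact_solution(x, K, c_rho, Q, a, u_a, length):
    # u = -s x^2 / (2K) + C x, with C fixed by the condition at x = l
    s = c_rho * Q
    C = (s * length + a * s * length**2 / (2.0 * K) + a * u_a) / (K + a * length)
    return -s * x**2 / (2.0 * K) + C * x

params = dict(K=2.0, c_rho=1.0, Q=3.0, a=0.5, u_a=4.0, length=1.0)  # made-up data
n = 20
u_h = steady_heat_bar(n=n, **params)
max_error = max(abs(u_h[i] - exact_solution(i * params["length"] / n, **params))
                for i in range(n + 1))
print(max_error)
```

Because the exact solution is a quadratic for constant data, the central differences reproduce it at the grid points up to rounding; with variable $K$ or $Q$ the same scheme would be second-order accurate instead.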

11. Elasticity. Suppose that the elastic body under consideration is the
bar shown in Figure 8.6; this bar is fixed at the end $x = 0$, it is
subjected to a time-independent (vectorial) force per unit area $\mathbf{f}(y, z)$
at the end $x = L$, and on the remainder of its surface there are no
forces acting. To specify the force boundary conditions we make use
of (8.4) and (8.8), with the appropriate choice of $\boldsymbol{\nu}$. In this way we
arrive at the boundary value problem in Box 4.

FIGURE 8.6. Deformation of an elastic bar

Box 4: THE BVP FOR LINEAR ELASTICITY

PDE:

BCs: $\mathbf{u}(0, y, z) = \mathbf{0}$

$(\Diamond\mathbf{u})(L, y, z)\,\mathbf{e}_x = \mathbf{f}(y, z)$
$(\Diamond\mathbf{u})(x, y, z)\,\mathbf{e}_y = \mathbf{0}$ for $y = \pm d$

$(\Diamond\mathbf{u})(x, y, z)\,\mathbf{e}_z = \mathbf{0}$ for $z = \pm h$

12. Elastic plate. The fourth-order plate problem requires two boundary
conditions at each point on the boundary, as we show in the theory
that follows. These are of two kinds: those in which the displacement
or its first derivatives are prescribed, and those in which the shear
force or bending moment along the boundary are prescribed. We take
a concrete example to show what form some of these boundary
conditions can take.
Consider then the rectangular plate shown in Figure 8.7. It is
constrained against motion along the ends $x = \pm h$, whereas the other two
ends $y = \pm l$ rest on supports that permit rotation, but not vertical
displacement. The boundary conditions along $x = \pm h$ therefore
stipulate that the displacement and slope are both zero; in other words,
$w = 0$ and $\partial w/\partial x = 0$.
In order to write down the boundary conditions at the other two
ends we must first be clear about what it is that they stipulate. One
of the conditions is straightforward: $w = 0$ there. But the condition
that these ends are free to rotate is equivalent to stating that the
plate experiences no restraining moment or couple there. Referring
to Figure 8.2, we see that it is the moment $M_{11}$ that is required to
be zero. From (8.14) this is

\[ M_{11} = -D\left( \frac{\partial^2 w}{\partial x^2} + \nu \frac{\partial^2 w}{\partial y^2} \right). \]

But since $w = 0$ along the edge $y = \pm l$, it follows that $\partial^2 w/\partial x^2 = 0$
there. So the condition $M_{11} = 0$ becomes, along that edge,
$\partial^2 w/\partial y^2 = 0$.
The boundary value problem for the rectangular plate is summarized
in Box 5.

FIGURE 8.7. Boundary conditions for a rectangular plate

Box 5: THE BVP FOR A PLATE

PDE: $D\nabla^4 w = q$ in $\Omega = (-h, h) \times (-l, l)$

BCs: $w(\pm h, y) = 0$, $(\partial w/\partial x)(\pm h, y) = 0$, $y \in [-l, l]$

$w(x, \pm l) = 0$, $(\partial^2 w/\partial y^2)(x, \pm l) = 0$, $x \in [-h, h]$
Although all classes of problems introduced here are important in their
own right, subsequent discussions are limited to boundary value problems
in order to keep the scope of this work within reasonable limits. Certainly
BVPs provide the ideal vehicle with which to introduce and motivate the
finite element method later on; more generally, the study of BVPs
presented here may be regarded as a suitable prerequisite to the study of
time-dependent problems.
We begin in earnest the study of BVPs in the following section, which is
devoted to a study of an important class of (ordinary or partial) differential
operators called elliptic operators. The corresponding DE together with an
appropriate set of boundary conditions is referred to as an elliptic boundary
value problem.

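8.2 Linear elliptic operators

Let $A$ be a partial differential operator of even order $2m$ in $n$ variables, and
of the form

\[ Au = \sum_{|\alpha|,|\beta| \le m} (-1)^{|\alpha|} D^\alpha \left( a_{\alpha\beta}(\mathbf{x})\, D^\beta u \right), \qquad \mathbf{x} \in \Omega \subset \mathbb{R}^n, \tag{8.21} \]

where $\Omega$ is an open bounded set in $\mathbb{R}^n$ (recall the discussion of multi-index
notation in Section 7.1). The coefficients $a_{\alpha\beta}$ are real-valued functions of
position, and $D^\alpha$ represents a partial differential operator of order $|\alpha|$; that
is,

\[ D^\alpha = \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1}\, \partial x_2^{\alpha_2} \cdots \partial x_n^{\alpha_n}}. \]

The term $(-1)^{|\alpha|}$ is not essential, but is included here for future
convenience.
The operator $A$ is assumed to occur in a PDE (or system of PDEs) of
the form

\[ Au = f, \]

where $f$ lies in the range of $A$. For now we restrict attention to scalar-valued
functions $u$, and make the extension to vector-valued functions (that occur
in elasticity, for example) later.
The classification of $A$ depends only on the coefficients of the
highest-order derivatives, that is, the derivatives of order $2m$, and the terms
involving these derivatives are said to constitute the principal part of $A$,
denoted by $A_0$, and which for the operator (8.21) is given by

\[ A_0 u \equiv \sum_{|\alpha|,|\beta| = m} a_{\alpha\beta}\, D^{\alpha+\beta} u. \]

Let $\boldsymbol{\xi}$ be a vector in $\mathbb{R}^n$, and let

\[ \boldsymbol{\xi}^\alpha = \xi_1^{\alpha_1} \xi_2^{\alpha_2} \cdots \xi_n^{\alpha_n}. \]

Then

(i) $A$ is elliptic at $\mathbf{x}_0 \in \Omega$ if

\[ \sum_{|\alpha|,|\beta| = m} a_{\alpha\beta}(\mathbf{x}_0)\, \boldsymbol{\xi}^{\alpha+\beta} \ne 0 \quad \text{for all } \boldsymbol{\xi} \ne \mathbf{0}; \tag{8.22} \]

(ii) $A$ is elliptic if it is elliptic at all points in $\Omega$;

(iii) $A$ is strongly elliptic if there exists a number $\mu > 0$ such that

\[ \sum_{|\alpha|,|\beta| = m} a_{\alpha\beta}(\mathbf{x}_0)\, \boldsymbol{\xi}^{\alpha+\beta} \ge \mu |\boldsymbol{\xi}|^{2m} \tag{8.23} \]

holds at every point $\mathbf{x}_0$ in $\Omega$, and for all $\boldsymbol{\xi} \in \mathbb{R}^n$. Here
$|\boldsymbol{\xi}| = (\xi_1^2 + \cdots + \xi_n^2)^{1/2}$ is the length of the vector $\boldsymbol{\xi}$.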
For the case in which $A$ is a second-order operator (that is, $m = 1$), the
notation can be simplified. Indeed, suppose that the problem is posed in
$\mathbb{R}^n$; then (8.21) takes the form

\[ Au = -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\left( a_{ij}(\mathbf{x}) \frac{\partial u}{\partial x_j} \right) + \sum_{j=1}^{n} a_j \frac{\partial u}{\partial x_j} + a_0 u = f \quad \text{in } \Omega \tag{8.24} \]

for suitable coefficients $a_{ij}$, $a_j$, and $a_0$, and the condition of ellipticity is
examined by considering, instead of (8.22) and (8.23), the conditions

\[ \sum_{i,j=1}^{n} a_{ij}(\mathbf{x}_0)\, \xi_i \xi_j \ne 0 \quad \text{for all } \boldsymbol{\xi} \ne \mathbf{0}, \tag{8.25} \]

for ellipticity, and

\[ \sum_{i,j=1}^{n} a_{ij}(\mathbf{x}_0)\, \xi_i \xi_j \ge \mu |\boldsymbol{\xi}|^2 \tag{8.26} \]

for strong ellipticity.

These ideas are best appreciated by looking at a few examples.

Examples
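13. Consider the operator that appears in the steady, nonhomogeneous
heat equation (that is, the steady version of Example 9), and assume
that the problem is plane, so that $n = 2$. The operator $A$ is thus
(ignoring the coefficient $1/(c\rho)$) given by

\[ Au = -\mathrm{div}\,(K\nabla u) = -\frac{\partial}{\partial x}\left( K \frac{\partial u}{\partial x} \right) - \frac{\partial}{\partial y}\left( K \frac{\partial u}{\partial y} \right), \]

so that, in the notation of (8.24), $a_{11} = a_{22} = K$ and $a_{12} = a_{21} = 0$.
The principal part of this operator is

\[ K\left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right) \quad \text{or} \quad K\nabla^2 u. \]

The left-hand side of (8.25) is equal to $K(\xi_1^2 + \xi_2^2)$, and so this operator
is strongly elliptic, with $\mu = K$ in (8.26).
14. The biharmonic operator given by (8.16) is strongly elliptic: $a_{\alpha\beta} = 1$
only when $\alpha = \beta = (2,0)$ or $(0,2)$ or $(1,1)$; so, for $\Omega \subset \mathbb{R}^2$ and writing
$\boldsymbol{\xi} = (\xi, \eta)$,

\[ \sum_{|\alpha|,|\beta| = 2} a_{\alpha\beta}\, \boldsymbol{\xi}^{\alpha+\beta} = \xi^4 + 2\xi^2\eta^2 + \eta^4 = |\boldsymbol{\xi}|^4. \]

15. The operator

\[ A = (1 - x)\frac{\partial^2}{\partial x^2} + 3\frac{\partial^2}{\partial y^2} - y\frac{\partial}{\partial x} \]

is elliptic only in the half plane $x < 1$; to see this, we evaluate

\[ \sum_{|\alpha|,|\beta| = 1} a_{\alpha\beta}\, \boldsymbol{\xi}^{\alpha+\beta} = (1 - x)\xi^2 + 3\eta^2; \]

this expression is nonzero for all nonzero vectors $\boldsymbol{\xi} = (\xi, \eta)$ provided
that $x < 1$. However, for any point $(x_0, y_0)$ in the half plane $x \ge 1$
this expression is zero for all vectors of the form $\boldsymbol{\xi} = (\sqrt{3}, \sqrt{x_0 - 1})$.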
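
For second-order operators in two variables, conditions (8.25) and (8.26) reduce to a statement about the eigenvalues of the symmetric part of the $2 \times 2$ coefficient matrix at the point in question. The Python sketch below is illustrative and not part of the text: the quadratic form in (8.25) is nonzero for all nonzero $\boldsymbol{\xi}$ exactly when both eigenvalues have the same strict sign, and when both are positive the smaller one serves as the constant $\mu$ of (8.26) at that point. The function names and sample points are our own; a uniform $\mu$ over all of $\Omega$, as required for strong ellipticity, is not checked here.

```python
import math

def symbol(a, xi):
    # Left-hand side of (8.25): sum over i, j of a_ij * xi_i * xi_j
    n = len(xi)
    return sum(a[i][j] * xi[i] * xi[j] for i in range(n) for j in range(n))

def classify_point(a):
    # Eigenvalues of the symmetric part of a 2 x 2 coefficient matrix
    p, r = a[0][0], a[1][1]
    s = 0.5 * (a[0][1] + a[1][0])
    mean = 0.5 * (p + r)
    radius = math.sqrt((0.5 * (p - r)) ** 2 + s ** 2)
    lowest, highest = mean - radius, mean + radius
    if lowest > 0.0:
        return ("elliptic", lowest)   # (8.26) holds at this point with mu = lowest
    if highest < 0.0:
        return ("elliptic", None)     # symbol never vanishes, but is negative
    return ("not elliptic", None)

def heat_coefficients(K):
    # Example 13: a_11 = a_22 = K, a_12 = a_21 = 0
    return [[K, 0.0], [0.0, K]]

def example_15_coefficients(x):
    # Example 15: principal coefficients (1 - x) and 3
    return [[1.0 - x, 0.0], [0.0, 3.0]]

print(classify_point(heat_coefficients(2.5)))
print(classify_point(example_15_coefficients(0.5)))
print(classify_point(example_15_coefficients(2.0)))

# Direction along which the symbol of Example 15 vanishes at x0 = 2
x0 = 2.0
xi = (math.sqrt(3.0), math.sqrt(x0 - 1.0))
print(symbol(example_15_coefficients(x0), xi))   # zero up to rounding
```

The sample points reproduce the conclusions of Examples 13 and 15: the heat operator is elliptic with $\mu = K$, while the operator of Example 15 is elliptic at $x = 0.5$ but not at $x = 2$.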

The definition of elliptic operators has deliberately been confined to
operators of even order, since it is possible to show that all elliptic operators
in $\mathbb{R}^n$ are of even order when $n \ge 2$. It is also worth noting that the
operators that occur in physically realistic problems such as those discussed
in Section 8.1 are always of even order.
Though the definitions (8.22) and (8.23) are given in the context of PDEs
involving a single scalar-valued function, the extension to systems of PDEs
is immediate. Exercise 8.8 addresses this point in the context of the
elasticity problem.

8.3 Normal boundary conditions

Boundary conditions cannot be specified arbitrarily; there must be
restrictions on their number, the order of the differential operators appearing
in them, and so on, if the boundary value problem is to admit a solution. For
example, if two boundary conditions are identical or, in any case, not
independent of each other, the formulation is defective. Similar considerations
apply if two boundary conditions are contradictory; for example, suppose
we have a domain $\Omega \subset \mathbb{R}^2$ with boundary $\Gamma$, and let the two boundary
conditions be

\[ u = g, \tag{8.27} \]
\[ \nabla u \cdot \mathbf{s} \equiv du/ds = h, \tag{8.28} \]

where $du/ds$ is the tangential derivative, $\mathbf{s}$ being the unit tangent vector
to the curve defining the boundary. The condition (8.27) implies that
$du/ds = dg/ds$, which contradicts (8.28), unless $dg/ds = h$ (Figure 8.8).
Hence these two equations are inadmissible as boundary conditions when
specified together.

FIGURE 8.8. A pair of contradictory boundary conditions

In order to avoid situations such as these, we restrict
the manner in which boundary conditions might be specified. First, recall
that we restrict attention to boundary value problems involving differential
equations of even order $2m$ ($m = 1, 2, \ldots$), say, and the boundary is
assumed to be smooth (that is, of class $C^\infty$). Then the following restrictions
are imposed on the boundary conditions.
(i) A total of $m$ conditions must be specified at each point of the
boundary. These are written in the form

\[
\begin{aligned}
B_0 u &= g_0, \\
B_1 u &= g_1, \\
&\;\;\vdots \\
B_{m-1} u &= g_{m-1},
\end{aligned}
\tag{8.29}
\]

where $g_0, g_1, \ldots, g_{m-1}$ are given functions and $B_0, B_1, \ldots, B_{m-1}$ are
a set of linear differential operators called boundary operators. (The
boundary conditions are numbered $0, 1, 2, \ldots$ rather than $1, 2, \ldots$ for
reasons of convenience, as becomes apparent.) The $j$th boundary
operator is of the form

\[ B_j u = \sum_{|\alpha| \le q_j} b_\alpha^{(j)} D^\alpha u, \]

that is, it is a linear operator of order $q_j$. The coefficients $b_\alpha^{(j)}$ are
given functions of $\mathbf{x}$ for $\mathbf{x} \in \Gamma$. We assume that $b_\alpha^{(j)}$ and $g_j$ are
smooth functions;
(ii) the order of the highest derivative appearing in each boundary
condition must be less than the order of the PDE: in other words,

\[ 0 \le q_j \le 2m - 1 \quad \text{for } j = 0, 1, \ldots, m - 1; \]

(iii) $q_i \ne q_j$ for $i \ne j$; that is, no two boundary conditions should have
differential operators of the same order;

(iv) the final requirement is a restriction on the coefficients of the highest
order derivatives, the principal part of $B_j$. We require that

\[ \sum_{|\alpha| = q_j} b_\alpha^{(j)} \boldsymbol{\nu}^\alpha \ne 0 \quad \text{for all } \mathbf{x} \in \Gamma. \tag{8.30} \]

For second-order problems these conditions may once again be simplified.
First, from (i) and (ii) we have a single boundary condition, which is of
order at most equal to one. This condition may therefore be expressed in
the form

\[ Bu = \sum_{j=1}^{n} b_j \frac{\partial u}{\partial x_j} + cu, \tag{8.31} \]

in which $b_j$ ($j = 1, \ldots, n$) and $c$ are real-valued functions. Finally,
requirement (iv), when recast using the notation in (8.31), becomes

\[ \sum_{j=1}^{n} b_j \nu_j \ne 0 \quad \text{or} \quad \mathbf{b} \cdot \boldsymbol{\nu} \ne 0. \tag{8.32} \]

Requirements (i) through (iii) are self-explanatory but the fourth
requirement needs some explanation, which is best done by means of a simple
example. Suppose that we have a second-order problem with the boundary
condition

\[ \nabla u \cdot \mathbf{a} = h \]

specified on $\Gamma \subset \mathbb{R}^2$, where $\mathbf{a}$ is an arbitrary unit vector; $\nabla u \cdot \mathbf{a}$ is the
directional derivative in the direction of $\mathbf{a}$, and is equal to
$a_x\, \partial u/\partial x + a_y\, \partial u/\partial y$. Clearly $b_j = a_j$ in (8.31), and (8.32) yields
the condition

\[ \mathbf{b} \cdot \boldsymbol{\nu} \ne 0. \]

Thus (8.30) or (8.32) requires that the vector $\mathbf{a}$ should not be orthogonal to
$\boldsymbol{\nu}$; this condition ensures that we do not have a situation such as that which
occurred with the pair of boundary conditions (8.27) and (8.28) discussed
earlier. There, $\mathbf{a} = \mathbf{s}$ and the two conditions are contradictory.
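
Requirement (8.32) can be tested pointwise on a concrete boundary. The Python sketch below is illustrative and not part of the text: it takes $\Gamma$ to be the unit circle, samples the outward normal $\boldsymbol{\nu}$ and the tangent $\mathbf{s}$ at equally spaced points, and reports the smallest value of $|\mathbf{b} \cdot \boldsymbol{\nu}|$ for several hypothetical choices of the coefficient vector $\mathbf{b}$ in (8.31). A value close to zero signals that the condition fails somewhere on $\Gamma$.

```python
import math

def min_abs_b_dot_nu(b_field, samples=360):
    # Smallest |b . nu| over sample points of the unit circle
    worst = float("inf")
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        nu = (math.cos(theta), math.sin(theta))    # outward unit normal
        s = (-math.sin(theta), math.cos(theta))    # unit tangent
        b = b_field(nu, s)
        worst = min(worst, abs(b[0] * nu[0] + b[1] * nu[1]))
    return worst

def normal_derivative(nu, s):
    return nu                        # b = nu: the usual flux-type condition

def tangential_derivative(nu, s):
    return s                         # b = s: the situation of (8.28)

def fixed_direction(nu, s):
    return (1.0, 0.0)                # b = a, a constant unit vector

def oblique(nu, s):
    # Hypothetical oblique derivative with a nonzero normal component
    return (nu[0] + 0.5 * s[0], nu[1] + 0.5 * s[1])

for name, field in [("normal", normal_derivative),
                    ("tangential", tangential_derivative),
                    ("fixed direction", fixed_direction),
                    ("oblique", oblique)]:
    print(name, min_abs_b_dot_nu(field))
```

The normal and oblique choices keep $\mathbf{b} \cdot \boldsymbol{\nu}$ away from zero everywhere, the tangential choice violates (8.32) at every point, and a fixed direction $\mathbf{a}$ violates it at the two points of the circle where $\mathbf{a}$ is tangent to $\Gamma$.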
When Conditions (i) to (iv) are satisfied, the set $\{B_0, B_1, \ldots, B_{m-1}\}$ is
said to be a set of normal boundary conditions. An important special case of
a set of normal boundary conditions arises when the order $q_j$ of the highest
derivative in the jth boundary condition is equal to $j$, for $j = 0, 1, \ldots, m - 1$;
such a set of boundary conditions is called a Dirichlet system of order m.
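As a rough illustration, the order requirements can be checked mechanically. The sketch below is ours and not part of the text: the function names are hypothetical, requirement (i) is assumed to ask for exactly m boundary conditions (as the second-order discussion above suggests), and requirement (iv) is not checked because it depends on the coefficients and on the normal.

```python
def satisfies_order_requirements(orders, m):
    """Check requirements (i)-(iii) for boundary operators of orders q_j.

    orders: list of the orders q_0, ..., q_{m-1} (illustrative input format)
    m:      half the order of the PDE (the PDE has order 2m)
    Requirement (iv) involves the coefficients and is NOT checked here.
    """
    if len(orders) != m:                              # (i) assumed: m conditions
        return False
    if any(q < 0 or q > 2 * m - 1 for q in orders):   # (ii) 0 <= q_j <= 2m - 1
        return False
    return len(set(orders)) == len(orders)            # (iii) distinct orders


def is_dirichlet_system(orders, m):
    """A Dirichlet system of order m has q_j = j for j = 0, ..., m - 1."""
    return list(orders) == list(range(m))


print(satisfies_order_requirements([0, 1], 2), is_dirichlet_system([0, 1], 2))
print(satisfies_order_requirements([1, 1], 2))
```

For a fourth-order problem (m = 2) the orders [0, 1] pass both checks, whereas [1, 1] fails requirement (iii).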

Examples
16. As observed in Example 12, the PDE corresponding to the plate
problem requires two boundary conditions to be specified at each
point on the boundary. One possibility is to specify that the displacement
and the slope are both zero along $\Gamma$; in other words, the
plate is clamped along its edge. In this case the boundary conditions
are

$B_0 u \equiv u = 0$,
$B_1 u \equiv \nabla u \cdot \nu = \partial u/\partial \nu = 0$,

which is a Dirichlet system of order 2 since $q_0 = 0$ and $q_1 = 1$. The
system

$\partial u/\partial x = g_0$,
$\partial u/\partial y = g_1$,

on the other hand, violates requirement (iii) since $q_0 = q_1 = 1$.

17. The m boundary conditions need not take the form of m equations,
each of which applies to the whole of $\Gamma$. The requirement is that
m conditions be specified at each point
in $\Gamma$. We have already seen in Example 12 how it is possible - and
indeed often dictated by the physical description of the problem -
that different boundary conditions may be prescribed on different
parts of $\Gamma$. The boundary conditions in that example are specified on
two complementary parts $\Gamma_1$ and $\Gamma_2$ of the boundary:
$\Gamma_1 = \{(x, y) : x = \pm h,\ y \in [-l, l]\}$ and
$\Gamma_2 = \{(x, y) : y = \pm l,\ x \in [-h, h]\}$.
These are known as mixed boundary conditions.
18. Consider the two-point BVP

$\frac{d^4 u}{dx^4} + 2\frac{d^2 u}{dx^2} + 3\frac{du}{dx} + u = f$ on $\Omega \equiv (0, 1)$,
$u(0) = 0$, $u(1) = 0$,
$u'(0) = 1$, $u'(1) = 2$.

The boundary conditions form a normal set (note that requirement
(iv) is trivial in the case of two-point BVPs); in fact, the boundary
conditions constitute a Dirichlet system of order 2. We observe also
that the BCs can be written in the format (8.29) if $B_0$ and $B_1$ are
regarded as maps from $C^2(\bar{\Omega})$, say, to $\mathbb{R}^2$, and are defined by

$B_0 u = (u(0), u(1))$, $B_1 u = (u'(0), u'(1))$;

then we have

$B_0 u = (0, 0)$, $B_1 u = (1, 2)$.

The conditions for a set of boundary conditions to be normal in the case
of vector-valued functions may be extended from the scalar case, although
the end result is less straightforward. We carry out this extension for the
case of elasticity, but rather than make allowance for the most general set
of conditions possible, we confine attention to those cases that are likely to
occur in practice.
Boundary conditions for problems of elasticity are almost always ex-
pressed as conditions involving the displacement u or the surface traction
t (equation (8.4)). Now recall that the elasticity operator is one of sec-
ond order, so that a single boundary condition is required at each point
along the boundary. However, because we are dealing with a vector-valued
unknown variable, it follows that it is a single vector-valued boundary con-
dition that is required. In other words, we require a total of n conditions,
corresponding to the n components of the vector.
For convenience we assume that the n components of the boundary conditions
are referred to a local basis made up of the unit outward normal,
and either one or two unit tangent vectors, according as the domain
is in $\mathbb{R}^2$ or $\mathbb{R}^3$. These bases are denoted by $\{\nu_k\}_{k=1}^{2} \equiv \{\nu, s\}$ and
$\{\nu_k\}_{k=1}^{3} = \{\nu, s_1, s_2\}$, respectively (Figure 8.9). The case of nonsmooth
but otherwise Lipschitz boundaries may be treated as shown in Figure 8.9.
The vectors u and t are resolved relative to this basis, and the boundary
conditions are assumed to be, most generally, linear combinations of the
normal and tangential components of u and t; that is, for a domain in $\mathbb{R}^n$,

$\sum_{l=1}^{n} (b_{kl}\, \nu_l \cdot u + c_{kl}\, \nu_l \cdot t) = g_k$, $k = 1, \ldots, n$, (8.33)

in which $b_{kl}$ and $c_{kl}$ are most generally sets of functions. Note that these
two matrices do not contain derivative operators, and that t is a function
of the displacement through (8.4) and (8.8).
As in the case of scalar problems, the functions $b_{kl}$ and $c_{kl}$ cannot be
specified arbitrarily. For example, it is necessary that conditions be specified for
the normal component and for each of the tangential components. Such a requirement is
met by specifying that the functions appearing in the boundary conditions
(8.33) satisfy the condition:

for boundary condition k, the coefficients $b_{kk}$ and $c_{kk}$ are not both zero. (8.34)

This requirement rules out the possibility of a set of boundary conditions
in which not all n components of u appear in the boundary condition.

FIGURE 8.9. Local bases for the formulation of boundary conditions

Example

19. A very common boundary condition encountered in problems of elasticity
is that in which the displacement is specified at every point on
the boundary, so that $u = g$. For this case $b = I$ and $c = 0$. A
second common condition is one in which the surface traction t is
specified, so that $t \equiv \sigma\nu = g$. In this case $c = I$ and $b = 0$.
Consider a domain in $\mathbb{R}^2$; then the pair of boundary conditions

$u \cdot \nu = 0$,
$t \cdot s = 0$

corresponds to a situation such as that shown in Figure 8.10, in which
frictionless sliding is possible along the boundary; for this case
$b_{11} = c_{22} = 1$ and all other components are zero. The pair of conditions

$u \cdot \nu = 0$,
$t \cdot \nu = 0$,

on the other hand, is not acceptable since no conditions are specified
in respect to tangential components.
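Condition (8.34) is easy to test once the coefficient arrays are written down. The following sketch is an illustration only: the function name is hypothetical, the coefficients are taken to be constants, and the index convention (component 1 normal, component 2 tangential) is our reading of the local basis $\{\nu, s\}$.

```python
def satisfies_8_34(b, c):
    """Condition (8.34): for each k, b[k][k] and c[k][k] are not both zero.

    b, c: n-by-n coefficient arrays (nested lists) from (8.33), taken
    here to be constants for simplicity.
    """
    return all(b[k][k] != 0 or c[k][k] != 0 for k in range(len(b)))


# Frictionless sliding: u . nu = 0 and t . s = 0, so b_11 = c_22 = 1.
b_slide = [[1, 0], [0, 0]]
c_slide = [[0, 0], [0, 1]]

# Rejected pair: u . nu = 0 and t . nu = 0, so b_11 = c_21 = 1.
b_bad = [[1, 0], [0, 0]]
c_bad = [[0, 0], [1, 0]]

print(satisfies_8_34(b_slide, c_slide))  # True
print(satisfies_8_34(b_bad, c_bad))      # False
```

The second pair fails because both $b_{22}$ and $c_{22}$ vanish, which is the statement that nothing is prescribed in the tangential direction.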

We return to scalar problems. Having ensured that the boundary conditions
are consistent and contain no ambiguities, we must now ensure
also that they are compatible with the partial differential equation of the
problem. Intuitively it should be clear that one cannot expect an arbitrary
set of boundary conditions to be compatible with the PDE, and that it is
therefore necessary that further restrictions be placed on them in order to
ensure that the problem as a whole is well-posed.

FIGURE 8.10. A typical mixed boundary condition in elasticity

Let $\mathbf{s}$ be a unit tangent vector to $\Gamma$ at a point x, and let $\nu$ be the outward
unit normal at this point. Now consider the pair of equations

$\sum_{|\alpha|, |\beta| = m} a_{\alpha\beta}(x) \left[ \mathbf{s} - i\nu \frac{d}{ds} \right]^{\alpha+\beta} u(s) = 0$, $s > 0$, (8.35)

$\sum_{|\alpha| = q_j} b^{(j)}_\alpha(x) \left[ \mathbf{s} - i\nu \frac{d}{ds} \right]^{\alpha} u(s) \Big|_{s=0} = 0$, $j = 0, \ldots, m - 1$, (8.36)

that involve only the principal parts of A and of $B_j$ (recall that
$a^\alpha = a_1^{\alpha_1} a_2^{\alpha_2} \cdots a_n^{\alpha_n}$ for any vector a in $\mathbb{R}^n$). The set $\{B_0, B_1, \ldots, B_{m-1}\}$ of
boundary operators is compatible with A, and is said to cover A at x,
if the only solution of (8.35), (8.36) is u(s) = 0. We require that $\{B_j\}$ cover
A at every point x in $\Gamma$.
Precisely why a requirement such as the covering condition should ensure
compatibility between $B_j$ and A is not an obvious matter; the details are
lengthy, and may be pursued in the references given at the end of this
chapter.

Example
20. Consider the Poisson equation

$-\nabla^2 u = f$ in $\Omega \subset \mathbb{R}^2$;

the most general normal boundary condition is of the form

$B_0 u = a \frac{\partial u}{\partial x} + b \frac{\partial u}{\partial y} + cu = g$ on $\Gamma$, (8.37)

and so we must investigate the restrictions placed on a, b, and c by
the covering condition. At a point x on the boundary with tangent
$\mathbf{s} = (\sigma, 0)$, equation (8.35) gives (with $a_{20} = a_{02} = -1$)

$-\sigma^2 u + \frac{d^2 u}{ds^2} = 0$ (8.38)

whereas (8.36) gives

$a\sigma u(0) - i b u'(0) = 0$. (8.39)

A general solution of (8.38) is $u(s) = c_1 e^{\sigma s} + c_2 e^{-\sigma s}$, and since we
require u(s) to be finite as $s \to \infty$, we must have $c_1 = 0$. The use of
(8.39) now gives

$(a + ib)\sigma c_2 = 0$,

so that $c_2 = 0$ and hence u(s) = 0 provided that $a \ne 0$ or $b \ne 0$, so
that (8.37) covers A at x for any such values of a and b, and for any c. In order to
investigate the covering condition at other points on the boundary,
we simply introduce new axes x, y so that $\nu = (0, 1)$ relative to these
axes, at the point under consideration.
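The last step of this example can be checked numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, a and b are real constants, and the decaying solution is normalized so that $c_2 = 1$. It evaluates the left-hand side of (8.39) for $u(s) = e^{-\sigma s}$ and reports whether the boundary condition forces $c_2 = 0$.

```python
def boundary_factor(a, b, sigma):
    """Left-hand side of (8.39) for u(s) = exp(-sigma*s), i.e. with c2 = 1.

    Here u(0) = 1 and u'(0) = -sigma, so the result equals (a + 1j*b)*sigma.
    """
    u0 = 1.0
    du0 = -sigma
    return a * sigma * u0 - 1j * b * du0


def covers(a, b, sigma=1.0):
    """True when (8.39) forces c2 = 0, that is, when the factor is nonzero."""
    return abs(boundary_factor(a, b, sigma)) > 0.0


print(covers(1.0, 0.0), covers(0.0, 1.0), covers(0.0, 0.0))
```

Only the degenerate choice a = b = 0 leaves $c_2$ undetermined; the coefficient c of the zeroth-order term plays no part, since only principal parts enter (8.36).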

8.4 Green's formulas and adjoint problems


In this and the following sections we concern ourselves with boundary value
problems of the form

$Au = f$ in $\Omega \subset \mathbb{R}^n$,
$B_0 u = g_0$, $B_1 u = g_1$, ..., $B_{m-1} u = g_{m-1}$ on $\Gamma$, (8.40)

where A is a linear elliptic partial differential operator of order 2m, of the
form

$Au = \sum_{|\alpha| \le m} (-1)^{|\alpha|} D^\alpha \left( \sum_{|\beta| \le m} a_{\alpha\beta}(x) D^\beta u \right)$, $x \in \Omega \subset \mathbb{R}^n$; (8.41)

the coefficients $a_{\alpha\beta}$ are functions of x, are smooth, and satisfy the condition
for ellipticity. The set $B_0, B_1, \ldots, B_{m-1}$ of boundary operators is of the
form

$B_j u = \sum_{|\alpha| \le q_j} b^{(j)}_\alpha D^\alpha u$ (8.42)

and constitutes a set of normal boundary conditions that cover A. The
coefficients $b^{(j)}_\alpha$ are also assumed to be smooth functions. We refer to (8.40)
through (8.42) as a regularly elliptic boundary value problem of order 2m.
In the case of second-order problems, (8.41) and (8.42) can be expressed
in the form (8.24) with the single boundary condition

$Bu = \sum_{j=1}^{n} b_j \frac{\partial u}{\partial x_j} + cu = g$ on $\Gamma$,

in which the ellipticity of A and the normality of the boundary operator B
are defined through (8.25), (8.26), and (8.32).
A central question of the theory of elliptic boundary value problems relates
to the conditions under which one may expect a unique solution of
(8.40) to exist. In other words, given data in the form of the functions
$f$, $a_{\alpha\beta}$, $b^{(j)}_\alpha$, $g_j$, as well as the geometry of the domain $\Omega$, under what
conditions can we expect to find a unique solution? Furthermore, if such a
solution exists, then it is equally important to know something about the
regularity or smoothness of this solution. If, for example, f belongs to $H^r(\Omega)$
and the functions $g_j$ are members of the boundary spaces $H^{s_j}(\Gamma)$, we would
like to know the largest integer $\sigma$ for which the solution u belongs to $H^\sigma(\Omega)$,
since this conveys information about the degree of smoothness of u. As one
would expect, the regularity of u depends very much on that of the data:
the smoother the data, the smoother u can be expected to be.
Before we can discuss questions of existence and uniqueness in any detail
it is necessary to introduce the concept of a Green's formula associated with
the operator A.

Green's formula and the formal adjoint operator. With the operator
A given by (8.41), we denote by A* the operator defined by

$A^* v = \sum_{|\alpha| \le m} (-1)^{|\alpha|} D^\alpha \left( \sum_{|\beta| \le m} a_{\beta\alpha}(x) D^\beta v \right)$.

A* is referred to as the formal adjoint of A. The relevance of the formal
adjoint is that if Green's theorem (7.4) is applied to the integral $\int_\Omega vAu\,dx$,
then we obtain

$\int_\Omega vAu\,dx = \int_\Omega uA^*v\,dx + \int_\Gamma F(u, v)\,ds$ (8.43)

in which F(u, v) represents boundary terms that arise from the application
of the theorem. If A* = A, that is, $a_{\alpha\beta} = a_{\beta\alpha}$, the operator A is then said
to be formally self-adjoint.

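In the case of second-order problems, two successive applications of
Green's theorem (7.2) yield, for fixed i and j,

$-\int_\Omega v \frac{\partial}{\partial x_i}\left( a_{ij} \frac{\partial u}{\partial x_j} \right) dx = -\int_\Gamma v a_{ij} \frac{\partial u}{\partial x_j} \nu_i\,ds + \int_\Omega a_{ij} \frac{\partial u}{\partial x_j} \frac{\partial v}{\partial x_i}\,dx$
$= -\int_\Gamma \left[ v a_{ij} \frac{\partial u}{\partial x_j} \nu_i - u a_{ij} \frac{\partial v}{\partial x_i} \nu_j \right] ds - \int_\Omega u \frac{\partial}{\partial x_j}\left( a_{ij} \frac{\partial v}{\partial x_i} \right) dx.$

By summing over i and j we therefore find that (8.43) holds with

$A^* v = -\sum_{i,j=1}^{n} \frac{\partial}{\partial x_i}\left( a_{ji}(x) \frac{\partial v}{\partial x_j} \right)$, (8.44)

and

$F(u, v) = -\sum_{i,j=1}^{n} a_{ij} \left( v \frac{\partial u}{\partial x_j} \nu_i - u \frac{\partial v}{\partial x_i} \nu_j \right)$, (8.45)

so that A is formally self-adjoint if $a_{ji} = a_{ij}$.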
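In coefficient terms, (8.44) says that passing from A to A* transposes the array $a_{ij}$. A minimal sketch of this bookkeeping follows; the function names are hypothetical and the coefficients are taken to be constants stored as nested lists. Symmetry of the array is the sufficient condition for formal self-adjointness stated above.

```python
def adjoint_coefficients(a):
    """Coefficients of A* in (8.44): the transpose of the array a_ij."""
    n = len(a)
    return [[a[j][i] for j in range(n)] for i in range(n)]


def has_symmetric_coefficients(a):
    """Sufficient condition for A* = A from (8.44): a_ji == a_ij."""
    return adjoint_coefficients(a) == [list(row) for row in a]


print(has_symmetric_coefficients([[1, 1], [1, 1]]))   # True
print(has_symmetric_coefficients([[1, 2], [0, 1]]))   # False
```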

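Examples
21. Consider the second-order ordinary differential operator

$A = -\frac{d^2}{dx^2} + 1$;

using integration by parts we have, for sufficiently smooth u and v
and for $\Omega = (0, 1)$,

$\int_0^1 \left( -v \frac{d^2 u}{dx^2} + vu \right) dx = -\left[ v \frac{du}{dx} \right]_0^1 + \int_0^1 \left( \frac{dv}{dx}\frac{du}{dx} + vu \right) dx$
$= -\left[ v \frac{du}{dx} \right]_0^1 + \left[ \frac{dv}{dx} u \right]_0^1 + \int_0^1 \left( -\frac{d^2 v}{dx^2} + v \right) u\,dx.$

The Green's formula is thus

$\int_0^1 vAu\,dx = \underbrace{\left[ -v \frac{du}{dx} + \frac{dv}{dx} u \right]_0^1}_{F(u,v)} + \int_0^1 \underbrace{\left( -\frac{d^2 v}{dx^2} + v \right)}_{A^*v} u\,dx$, (8.46)

and since A* = A, A is formally self-adjoint.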

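22. Consider next the operator defined by

$Au = -\frac{\partial^2 u}{\partial x^2} - 2\frac{\partial^2 u}{\partial x \partial y} - \frac{\partial^2 u}{\partial y^2}$. (8.47)

Since A is a second-order operator and this problem is posed on $\mathbb{R}^2$,
(8.44) and (8.45) can be used; thus

$Au = -\sum_{i,j=1}^{2} \frac{\partial}{\partial x_i}\left( a_{ij} \frac{\partial u}{\partial x_j} \right)$,

where $a_{11} = a_{12} = a_{21} = a_{22} = 1$, so that A is formally self-adjoint.
Furthermore, from (8.45),

$F(u, v) = (\nu_x + \nu_y)\left[ u \left( \frac{\partial v}{\partial x} + \frac{\partial v}{\partial y} \right) - v \left( \frac{\partial u}{\partial x} + \frac{\partial u}{\partial y} \right) \right]$.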

23. The analogue of (8.43) is readily derived for the elasticity problem.
We disregard dependence on time as before, and write the system of
PDEs (8.10) or (8.11) corresponding to the elasticity problem in the
form

$A\mathbf{u} = \mathbf{f}$; (8.48)

the elasticity operator is denoted here by A in keeping with the
notation of this section, and is defined by the composition
$A(\cdot) = -\mathrm{div}\, C\epsilon(\cdot)$. To obtain (8.43) we take the scalar product of Au with
an arbitrary smooth vector function v, integrate, and use Green's
theorem to obtain

$\int_\Omega A\mathbf{u} \cdot \mathbf{v}\,dx = -\int_\Omega \mathrm{div}[C\epsilon(\mathbf{u})] \cdot \mathbf{v}\,dx$
$= -\int_\Gamma [C\epsilon(\mathbf{u})]\nu \cdot \mathbf{v}\,ds + \int_\Omega [C\epsilon(\mathbf{u})] \cdot \epsilon(\mathbf{v})\,dx$ (8.49)

in which the scalar product of two matrices $\sigma$ and $\tau$ has been written
as $\sigma \cdot \tau = \sum_{i,j=1}^{n} \sigma_{ij}\tau_{ij}$. The details of the derivation of (8.49)
are discussed in Exercise 8.16. Now another application of Green's
theorem, this time to the volume integral on the right-hand side of
(8.49), yields

$\int_\Omega [C\epsilon(\mathbf{u})] \cdot \epsilon(\mathbf{v})\,dx = \int_\Gamma \mathbf{u} \cdot [C\epsilon(\mathbf{v})]\nu\,ds - \int_\Omega \mathrm{div}[C\epsilon(\mathbf{v})] \cdot \mathbf{u}\,dx$, (8.50)

in which symmetry properties of the components $C_{ijkl}$ are exploited
(see Exercise 8.15). Putting together (8.49) and (8.50) we have, finally,

$\int_\Omega A\mathbf{u} \cdot \mathbf{v}\,dx = \int_\Gamma \underbrace{\left( -[C\epsilon(\mathbf{u})]\nu \cdot \mathbf{v} + [C\epsilon(\mathbf{v})]\nu \cdot \mathbf{u} \right)}_{F(\mathbf{u},\mathbf{v})} ds + \int_\Omega \underbrace{-\mathrm{div}[C\epsilon(\mathbf{v})]}_{A^*\mathbf{v}} \cdot\, \mathbf{u}\,dx$. (8.51)

Comparison with the definition reveals that the elasticity operator A
is formally self-adjoint. The boundary term in (8.51) may be rewritten
in a more readily recognizable form if we recall that the dependence
of the surface traction t on displacement is, after combining (8.4) and
(8.8),

$\mathbf{t}(\mathbf{u}) = [C\epsilon(\mathbf{u})]\nu$.

It therefore follows that F may be written in the more compact form

$F(\mathbf{u}, \mathbf{v}) = -\mathbf{t}(\mathbf{u}) \cdot \mathbf{v} + \mathbf{t}(\mathbf{v}) \cdot \mathbf{u}$. (8.52)

It turns out that the boundary integral appearing in a Green's formula
can be expressed very concisely in terms of four sets of boundary
operators. One of these sets is $B_j$, which forms part of the description of
the original BVP. The second set of boundary operators is denoted by
$S_j$ ($j = 0, \ldots, m - 1$) and has the property that the 2m operators

$\{B_0, B_1, \ldots, B_{m-1}, S_0, S_1, \ldots, S_{m-1}\}$ (8.53)

form a Dirichlet system of order 2m. Given these two sets of operators, it
is possible to write the Green's formula in the form

$\int_\Omega vAu\,dx = \int_\Omega uA^*v\,dx + \sum_{j=0}^{m-1} \int_\Gamma (S_j u\, B_j^* v - B_j u\, S_j^* v)\,ds$, (8.54)

where $S_j$ and $B_j$ are as previously defined, and the operators $B_j^*$ and
$S_j^*$ ($j = 0, 1, \ldots, m - 1$), which are uniquely defined, have the properties:

$B_j^*$ is of order $2m - 1 - p_j$, where $p_j$ is the order of $S_j$;
$S_j^*$ is of order $2m - 1 - q_j$, where $q_j$ is the order of $B_j$; (8.55)
the system $B_0^*, B_1^*, \ldots, B_{m-1}^*, S_0^*, S_1^*, \ldots, S_{m-1}^*$
is a Dirichlet system of order 2m.

We return to the previous examples to illustrate these ideas.


Examples

24. In the Green's formula (8.46) we wish to express the boundary term
in the form

$\int_\Gamma (S_0 u\, B_0^* v - B_0 u\, S_0^* v)\,ds$

(remember that m = 1 here). Exactly what form this integral takes
depends of course on the boundary condition. Suppose that this problem
has the boundary condition

$u(0) = u(1) = 0$ or $B_0 u \equiv (u(0), u(1)) = (0, 0)$.

Thus $q_0 = 0$, and so $S_0$ must be of order 1 for $\{B_0, S_0\}$ to be a
Dirichlet system of order 2. Furthermore, $S_0^*$ must be of order
$2m - 1 - q_0 = 1$ and $B_0^*$ must be of order $2m - 1 - p_0 = 0$. By inspection
of (8.46) we have the correspondence

$B_0^* u = B_0 u = (u(0), u(1))$, $S_0^* u = S_0 u = (-u'(0), -u'(1))$.

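25. In the Green's formula corresponding to the operator in Example 22,
the function F(u, v) ought to be expressible as

$F(u, v) = S_0 u\, B_0^* v - B_0 u\, S_0^* v$.

Suppose that we are given the boundary condition $\partial u/\partial \nu = g$, so
that $B_0 = \partial/\partial \nu$. This is a first-order operator ($q_0 = 1$) so in order
that $\{B_0, S_0\}$ form a Dirichlet system of order 2 the operator $S_0$ must
be of order 0; that is,

$S_0 u = \beta u$ ($p_0 = 0$)

for some function $\beta$. Next, $S_0^*$ must be of order $2m - 1 - q_0 = 2 - 1 - 1 = 0$
and $B_0^*$ must be of order $2m - 1 - p_0 = 2 - 1 - 0 = 1$.
Thus $B_0^*$ and $S_0^*$ must be of the form

$S_0^* v = \gamma v$,
$B_0^* v = \rho v + \sigma \frac{\partial v}{\partial x} + \tau \frac{\partial v}{\partial y}$

for some functions $\gamma$, $\rho$, $\sigma$, and $\tau$, from which it follows that

$(\beta u)\left( \rho v + \sigma \frac{\partial v}{\partial x} + \tau \frac{\partial v}{\partial y} \right) - (\gamma v) \frac{\partial u}{\partial \nu} = F(u, v)$.

Since $\partial u/\partial \nu = \nu_x \partial u/\partial x + \nu_y \partial u/\partial y$ we obtain, by equating
coefficients,

$u\, \partial v/\partial x$: $\beta\sigma = \nu_x + \nu_y$
$u\, \partial v/\partial y$: $\beta\tau = \nu_x + \nu_y$
$uv$: $\beta\rho = 0$
$v\, \partial u/\partial x$: $-\gamma\nu_x = -\nu_x - \nu_y$
$v\, \partial u/\partial y$: $-\gamma\nu_y = -\nu_x - \nu_y$.

The last two of these equations give, after using the fact that
$\nu_x^2 + \nu_y^2 = 1$,

$\gamma = (\nu_x + \nu_y)^2$.

This leaves three equations with the four unknowns $\beta$, $\rho$, $\sigma$, $\tau$. One
of these may be chosen arbitrarily, so we set $\beta = 1$. Then

$\rho = 0$, $\sigma = \tau = \nu_x + \nu_y$.

Hence the boundary integral can be written in the form

$\int_\Gamma \left[ \underbrace{u}_{S_0 u}\, \underbrace{(\nu_x + \nu_y)\left( \frac{\partial v}{\partial x} + \frac{\partial v}{\partial y} \right)}_{B_0^* v} - \underbrace{\frac{\partial u}{\partial \nu}}_{B_0 u}\, \underbrace{(\nu_x + \nu_y)^2 v}_{S_0^* v} \right] ds$.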

26. The analogue of (8.54) in the case of the elasticity problem may
be formulated by considering the specific form (8.52) taken by the
boundary integrand F(u, v). First we denote the left-hand side of
the boundary conditions (8.33) by $B_i$ ($i = 1, \ldots, n$), so that this set
of equations reads $B_i u = g_i$ (remember that t also depends on u).
Then (8.52) is expressible in the form

$F(\mathbf{u}, \mathbf{v}) = \sum_{i=1}^{n} (S_i u\, B_i^* v - B_i u\, S_i^* v)$, (8.56)

in which the new operators $S_i$, $B_i^*$, $S_i^*$ are defined in exactly the same
way as in (8.54) and (8.55), with m = 1; thus for $i = 1, \ldots, n$, $\{B_i, S_i\}$
forms a Dirichlet system of order 2, $B_i^*$ is of order $1 - p_i$, where $p_i$ is
the order of $S_i$, $S_i^*$ is of order $1 - q_i$, where $q_i$ is the order of $B_i$, and
$\{B_i^*, S_i^*\}$ forms a Dirichlet system of order 2.
Suppose then that for a problem posed in $\mathbb{R}^2$ (n = 2) the pair of
boundary conditions is

$\mathbf{u} \cdot \nu = 0$,
$\mathbf{t} \cdot \mathbf{s} = 0$.

Thus $B_1 u = \mathbf{u} \cdot \nu$ and $B_2 u = \mathbf{t}(\mathbf{u}) \cdot \mathbf{s} = [C\epsilon(\mathbf{u})\nu] \cdot \mathbf{s}$. By resolving
the vectors into their tangential and normal components, denoted,
respectively, by subscripts s and $\nu$, (8.52) can be recast in the following
form, in which the forms of the various operators are plain.

$F(\mathbf{u}, \mathbf{v}) = -t_\nu(\mathbf{u}) v_\nu - t_s(\mathbf{u}) v_s + t_\nu(\mathbf{v}) u_\nu + t_s(\mathbf{v}) u_s$.

Note that the boundary operators $S_j$ have to be partial differential
operators of such orders as to make (8.53) a Dirichlet system, but further
than that they are not unique. Indeed, in the last example we saw that the
function $\beta$ could be chosen arbitrarily. However, once $S_j$ are fixed then so
are the forms that the sets of operators $S_j^*$ and $B_j^*$ take.
With each regularly elliptic problem of the form (8.40) may be associated
an adjoint problem

$A^* u = f^*$ in $\Omega \subset \mathbb{R}^n$,
$B_0^* u = g_0^*$, $B_1^* u = g_1^*$, ..., $B_{m-1}^* u = g_{m-1}^*$ on $\Gamma$,

where $f^*, g_0^*, g_1^*, \ldots, g_{m-1}^*$ are given functions.
Like the original problem, the adjoint problem is also a regularly elliptic
boundary value problem of order 2m. We soon show that the adjoint
problem plays a key role in determining whether the original problem has
solutions, and whether these are unique.

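Example

27. Returning to Example 22, A* = A, and the operator $B_0^*$ is given in
Example 25. The adjoint problem is thus

$-\frac{\partial^2 u}{\partial x^2} - 2\frac{\partial^2 u}{\partial x \partial y} - \frac{\partial^2 u}{\partial y^2} = f^*$ in $\Omega$,
$\frac{\partial u}{\partial \nu} + \nu_x \frac{\partial u}{\partial y} + \nu_y \frac{\partial u}{\partial x} = g_0^*$ on $\Gamma$.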

8.5 Existence, uniqueness, and regularity of solutions

We come now to the main topic of this chapter, namely, the discussion
of well-posedness of solutions to problems of the form (8.40). In order to
keep the discussion as simple as possible we confine attention to problems
having homogeneous boundary conditions, that is, problems for which
$g_0, g_1, \ldots, g_{m-1}$ are all zero. This is no real restriction, since it is not
difficult to show (see Exercise 8.19) that any problem with nonhomogeneous
boundary conditions can be converted to one with homogeneous boundary
conditions in a fairly straightforward manner.
We also assume that the domain $\Omega$ is bounded, and is smooth (in the
language of Section 7.3, the boundary is assumed to be $C^\infty$). This assumption,
although rather restrictive, permits the development of a fairly general
existence theory.
Thus we consider the problem

$Au = f$ in $\Omega \subset \mathbb{R}^n$,
$B_0 u = 0$, $B_1 u = 0$, ..., $B_{m-1} u = 0$ on $\Gamma$, (8.57)

where A and $B_j$ are given by (8.41) and (8.42). Our aim is to settle the
questions of

(a) existence: under what circumstances (8.57) has a solution u that belongs
to $H^s(\Omega)$, s being an integer greater than or equal to 2m;

(b) uniqueness: whether there is only one such solution;

(c) continuous dependence on the data: whether the solution depends
continuously on the data in the sense that the estimate

$\|u\|_{H^s(\Omega)} \le C \|f\|_{H^{s-2m}(\Omega)}$ (8.58)

holds for some constant C > 0, independent of the solution; and

(d) regularity: to establish the largest value of s for which $u \in H^s(\Omega)$.

If the problem (8.57) has a unique solution that depends continuously on
the data, then the problem is said to be well-posed. Regularity is a
supplementary issue, the goal of which is to establish the maximum smoothness
of the solution consistent with the data. Note that if u belongs to $H^s(\Omega)$,
then $Au \in H^{s-2m}(\Omega)$ since A is a differential operator of order 2m.
The inequality (8.58) has the following implication. Suppose that problem
(8.57) is considered with two different sets of data $f_1$ and $f_2$, and that
the solutions corresponding to these two sets of data are, respectively, $u_1$
and $u_2$. Since A is linear it follows that

$Au_1 = f_1$ and $Au_2 = f_2$

imply that

$A(\Delta u) = \Delta f$,

where $\Delta u = u_2 - u_1$ and $\Delta f = f_2 - f_1$. So $\Delta u$ is a solution to the problem
with data $\Delta f$, and the inequality (8.58) then gives

$\|\Delta u\| \le C \|\Delta f\|$,

from which it can be concluded that if $f_1$ and $f_2$ are close to each other in
the sense that $\|\Delta f\|$ is small, $\|\Delta f\| < \epsilon$, say, where $\epsilon$ is a small number,
then $\|\Delta u\| < C\epsilon$ so that $u_1$ and $u_2$ are correspondingly close.
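A small numerical experiment makes this implication concrete. The sketch below rests on several assumptions that are not in the text: the model problem is $-u'' = f$ on (0, 1) with u(0) = u(1) = 0, it is discretized by central differences on a uniform grid, and both $\Delta u$ and $\Delta f$ are measured in a discrete $L^2$ norm rather than in the Sobolev norms of (8.58). For this discretization the ratio $\|\Delta u\| / \|\Delta f\|$ cannot exceed the reciprocal of the smallest eigenvalue of the difference operator, which is close to $1/\pi^2$.

```python
import math


def solve_dirichlet(f_values, h):
    """Solve -u'' = f on (0, 1), u(0) = u(1) = 0, by central differences.

    f_values holds f at the interior nodes x_i = i*h, i = 1, ..., n.
    The tridiagonal system is solved with the Thomas algorithm.
    """
    n = len(f_values)
    diag = [2.0 / h**2] * n
    off = -1.0 / h**2
    rhs = list(f_values)
    # Forward elimination.
    for i in range(1, n):
        w = off / diag[i - 1]
        diag[i] -= w * off
        rhs[i] -= w * rhs[i - 1]
    # Back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - off * u[i + 1]) / diag[i]
    return u


def l2_norm(values, h):
    """Discrete L2 norm on the uniform grid."""
    return math.sqrt(h * sum(v * v for v in values))


n = 99
h = 1.0 / (n + 1)
x = [(i + 1) * h for i in range(n)]
f1 = [math.sin(math.pi * xi) for xi in x]
f2 = [fi + 1e-3 * math.cos(7.0 * xi) for fi, xi in zip(f1, x)]  # perturbed data

u1 = solve_dirichlet(f1, h)
u2 = solve_dirichlet(f2, h)
du = [b - a for a, b in zip(u1, u2)]
df = [b - a for a, b in zip(f1, f2)]

ratio = l2_norm(du, h) / l2_norm(df, h)
print(ratio <= 0.25)  # True
```

The bound 0.25 is deliberately generous; any constant C above roughly $1/\pi^2$ would serve for this model problem.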
The question of existence and uniqueness of a solution is best approached
by adopting the language of linear operator theory (Chapter 5). First, we
denote by $N(B_j)$ the null space of the boundary operator $B_j$; that is, if $B_j$
is regarded as an operator from $H^s(\Omega)$ to $L^2(\Gamma)$, then

$N(B_j) = \{u \in H^s(\Omega) : B_j u = 0 \text{ on } \Gamma\}$, $j = 0, 1, \ldots, m - 1$.

It now follows that a solution of (8.57), if it exists, will belong to the
subspace of $H^s(\Omega)$ consisting of all functions that are also in $N(B_j)$. We
consequently take the domain of A to be the space D(A) defined by

$D(A) = H^s(\Omega) \cap N(B_0) \cap \cdots \cap N(B_{m-1})$
$= \{u \in H^s(\Omega) : B_j u = 0 \text{ on } \Gamma,\ j = 0, \ldots, m - 1\}$, (8.59)

so that problem (8.57) now reads: find u that satisfies

$A : D(A) \to H^{s-2m}(\Omega)$, $Au = f$ in $\Omega$. (8.60)

Our first task is to determine the set of functions f in $H^{s-2m}(\Omega)$ for which
(8.60) admits a solution. That is, we must identify R(A), the range of A.
This enables us to solve the problem of the existence of a solution. We
find that R(A) is not all of $H^{s-2m}(\Omega)$; there are functions f in $H^{s-2m}(\Omega)$
that do not lie in R(A), and for which no solution exists. The situation
is shown diagrammatically in Figure 8.11. The second task is to ascertain
the conditions under which the solution is unique; in other words, we wish
to know the conditions under which A is one-to-one. For this purpose we
define the null space N(A) of A by

$N(A) = \{u \in D(A) : Au = 0\}$
$= \{u \in H^s(\Omega) : Au = 0 \text{ in } \Omega,\ B_j u = 0 \text{ on } \Gamma\}$.

Clearly if $N(A) \ne \{0\}$, then we cannot expect to have a unique solution
since, if $u_0$ is a solution, so is $u_0 + w$ for any $w \in N(A)$ because
$A(u_0 + w) = Au_0 + Aw = Au_0 = f$. So elements of N(A) have to be excluded from the
domain of A in order to ensure uniqueness. This is no problem, since we
have simply to introduce the orthogonal complement $N(A)^\perp$ of N(A) with
respect to the $L^2$-inner product, which is defined by

$N(A)^\perp = \{v \in D(A) : (v, w)_{L^2} = 0 \text{ for all } w \in N(A)\}$.

Now it can be shown that N(A) is finite-dimensional, and hence complete,
so that by the Projection Theorem (Theorem 8 of Chapter 4) we have

$D(A) = N(A) \oplus N(A)^\perp$;

in other words, every $u \in D(A)$ is of the form u = v + w for $v \in N(A)^\perp$ and
$w \in N(A)$, and furthermore $N(A) \cap N(A)^\perp = \{0\}$. Since N(A) and $N(A)^\perp$
have in common only the zero element, we simply restrict the domain of A
to $N(A)^\perp$ to ensure uniqueness.

FIGURE 8.11. The various spaces occurring in the problem (8.60)

Similar remarks apply of course to the adjoint problem

$A^* u = f^*$ in $\Omega \subset \mathbb{R}^n$,
$B_0^* u = 0$, $B_1^* u = 0$, ..., $B_{m-1}^* u = 0$ on $\Gamma$; (8.61)

we define

$D(A^*) = \{u \in H^s(\Omega) : B_0^* u = B_1^* u = \cdots = B_{m-1}^* u = 0 \text{ on } \Gamma\}$

by analogy with (8.59), and rephrase (8.61) as the problem of finding u
that satisfies

$A^* : D(A^*) \to H^{s-2m}(\Omega)$, $A^* u = f^*$ in $\Omega$.

The null space N(A*) of A* and its orthogonal complement $N(A^*)^\perp$ are
then

$N(A^*) = \{w \in D(A^*) : A^* w = 0\}$,
$N(A^*)^\perp = \{v \in D(A^*) : (v, w)_{L^2} = 0 \text{ for all } w \in N(A^*)\}$.

Like N(A), the space N(A*) is finite-dimensional. Indeed, for most problems
of practical interest

$\dim N(A) = \dim N(A^*)$.

We are not particularly concerned with solutions to the adjoint problem,
but when discussing the existence of solutions to (8.60) it is necessary to
call on properties of the space $N(A^*)^\perp$. We now give a few examples.

Examples
28. Consider the problem

$Au = -u'' = f$ in $\Omega = (0, 1)$,
$B_0 u = (u(0), u(1)) = (0, 0)$.

Assume that $f \in L^2(0, 1)$, so that a solution $u \in H^2(0, 1)$ is sought.
Also,

$N(B_0) = \{u \in H^2(\Omega) : u(0) = u(1) = 0\} = D(A)$.

The null space of A is the set of solutions to the problem

$w'' = 0$ in (0, 1), $w(0) = w(1) = 0$;

the only solution to this problem is w = 0 so that N(A) = {0} and
a solution, if it exists, will be unique. Alternatively, suppose that the
boundary condition is

$B_0 u = (u'(0), u'(1)) = (0, 0)$;

then $N(A) = \{w : w(x) = \text{const.}\}$ so that

$N(A)^\perp = \{ v : (v, w)_{L^2} = 0 \text{ or } \int_0^1 v(x)\,dx = 0 \}$.

The operator $-d^2/dx^2$ is formally self-adjoint and $B_0^* = B_0$, so that
N(A*) = N(A).
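For the second boundary condition of this example the splitting $D(A) = N(A) \oplus N(A)^\perp$ amounts to separating a function into its mean value and a zero-mean remainder. A minimal sketch, assuming polynomial u stored as a coefficient list [c0, c1, c2, ...] and using hypothetical helper names, is:

```python
from fractions import Fraction


def integral_01(p):
    """Exact integral over (0, 1) of a polynomial [c0, c1, c2, ...]."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))


def split(u):
    """Split u = v + w with w constant (in N(A)) and v of zero mean."""
    mean = integral_01(u)      # (u, 1) / (1, 1), and (1, 1) = 1 on (0, 1)
    w = [mean]
    v = [Fraction(u[0]) - mean] + [Fraction(c) for c in u[1:]]
    return v, w


u = [3, 0, 6]                  # u(x) = 3 + 6x^2, with mean value 5
v, w = split(u)
print(w[0], integral_01(v))    # 5 0
```

Only the orthogonality is illustrated here; the boundary conditions that define D(A) are not imposed on the sample polynomial.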
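29. Consider the problem

$Au = -\frac{\partial^2 u}{\partial x^2} - 2\frac{\partial^2 u}{\partial x \partial y} - \frac{\partial^2 u}{\partial y^2} = f$ in $\Omega$,
$B_0 u = \frac{\partial u}{\partial \nu} = 0$ on $\Gamma$.

Clearly

$N(A) = \{w : w(x) = \text{const.}\}$

from which it follows that

$N(A)^\perp = \{ v : \int_\Omega v(x)\,dx = 0 \}$.

The self-adjointness of A has been established in Example 22, and
$B_0$ and $B_0^*$ are given in Example 25, as

$B_0 u = \frac{\partial u}{\partial \nu}$, $B_0^* v = (\nu_x + \nu_y)\left( \frac{\partial v}{\partial x} + \frac{\partial v}{\partial y} \right)$,

so that the condition $B_0^* w = 0$ is the same as

$\frac{\partial w}{\partial x} + \frac{\partial w}{\partial y} = 0$ on $\Gamma$.

The null space of A* = A is the set of solutions to

$A^* w = 0$ in $\Omega$, $B_0^* w = 0$ on $\Gamma$,

and this is given by

$N(A^*) = \{ w : w(x) = \alpha + \beta(x - y),\ \alpha, \beta \in \mathbb{R} \}$

and

$N(A^*)^\perp = \{ v : \int_\Omega v(x)[\alpha + \beta(x - y)]\,dx\,dy = 0 \}$

or, since $\alpha$ and $\beta$ are arbitrary,

$N(A^*)^\perp = \{ v : \int_\Omega v(x)\,dx\,dy = 0,\ \int_\Omega v(x)(x - y)\,dx\,dy = 0 \}$.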

We are now in a position to state the main result of this section.

THEOREM 1. Consider the regularly elliptic boundary value problem (8.57),
with $s \ge 2m$, and posed on a bounded domain $\Omega$ with smooth boundary $\Gamma$.
Then
(i) (uniqueness) assuming that the solution u exists, it is unique if
$u \in N(A)^\perp$, that is, if

$(u, w)_{L^2} = 0$ for all $w \in N(A)$; (8.62)

(ii) (existence) there exists at least one solution if and only if $f \in N(A^*)^\perp$,
that is, if

$(f, v)_{L^2} = 0$ for all $v \in N(A^*)$; (8.63)

(iii) (continuous dependence on data) if a unique solution exists, then
there is a constant C > 0, independent of u, such that

$\|u\|_{H^s(\Omega)} \le C \|f\|_{H^{s-2m}(\Omega)}$.

FIGURE 8.12. The domain and range of the operator A in Theorem 1

REMARKS. 1. The theorem states that A is a surjective operator from D(A)
onto the subspace of functions in $H^{s-2m}$ that satisfy (8.63). Furthermore,
A is one-to-one if its domain is restricted to the subspace of functions that
satisfy (8.62) (Figure 8.12).
2. Theorem 1, in a slightly modified form, is referred to as the Closed Range
Theorem, since Part (ii) of the theorem is equivalent to the requirement that
R(A) be closed. This equivalence is made apparent in the proof.
3. Part (ii) of the theorem expresses the fact that the data cannot be
specified arbitrarily: they have to satisfy (8.63) if a solution is to exist.
This is known as a compatibility condition, and when (8.63) is satisfied we
say that the data are compatible with the operator A. The condition is,
however, trivial in the event that N(A*) = {0}.
4. Part (iii) may be interpreted also as a regularity result, in the sense that
it shows that $u \in H^{s+2m}(\Omega)$ if $f \in H^s(\Omega)$.

PROOF. (i) Take any $w \in N(A)$ and assume that there are two solutions
$u_1$, $u_2$ satisfying

$(u_1, w)_{L^2} = (u_2, w)_{L^2} = 0$;

that is, $u_1$ and $u_2$ belong to $N(A)^\perp$. Since $Au_1 = Au_2 = f$, we have
$A(u_1 - u_2) = 0$ so that $u_1 - u_2 \in N(A)$. At the same time $u_1 - u_2$ belongs
to the subspace $N(A)^\perp$. But $D(A) = N(A) \oplus N(A)^\perp$
from Chapter 4, Theorem 8, and since $N(A) \cap N(A)^\perp = \{0\}$ it follows that
$u_1 - u_2 = 0$, or $u_1 = u_2$. Hence the solution is unique.
(ii) First assume that (8.57) has a solution. Then for any $v \in N(A^*)$ we
have, using Green's formula (8.54),

$(f, v)_{L^2} = (Au, v)_{L^2} = (u, A^*v)_{L^2} + \sum_{j=0}^{m-1} \int_\Gamma (S_j u\, B_j^* v - B_j u\, S_j^* v)\,ds$
$= (u, 0)_{L^2} + \sum_{j=0}^{m-1} \int_\Gamma (S_j u \cdot 0 - 0 \cdot S_j^* v)\,ds = 0.$

Hence $f \in N(A^*)^\perp$.
We sketch the proof of the converse and leave some of the details to
the exercises. The aim is to prove that if $f \in N(A^*)^\perp$, then $f \in R(A)$;
that is, $N(A^*)^\perp \subset R(A)$. First we note from (i) that, since A is one-to-one
from $N(A)^\perp$ onto R(A), it is possible to define the inverse operator
$A^{-1} : R(A) \to N(A)^\perp$. Second, it can be shown (see Exercise 8.22) that both A
and $A^{-1}$ are bounded operators, and furthermore that R(A) is closed. It
follows then from Chapter 4, Lemma 1 that $R(A)^{\perp\perp} = \overline{R(A)} = R(A)$.
Next, if $v \in R(A)^\perp$ and $u \in D(A)$, then

$(v, Au)_{L^2} = 0 = (u, A^*v)_{L^2} + \sum_{j=0}^{m-1} \int_\Gamma S_j u\, B_j^* v\,ds$,

so that $v \in N(A^*)$ (since u is arbitrary we must have $A^*v = 0$ and $B_j^* v = 0$).
Hence $R(A)^\perp \subset N(A^*)$, which implies that $N(A^*)^\perp \subset R(A)^{\perp\perp} = R(A)$
(see Exercise 4.26 and Lemma 1, Chapter 4), which completes the proof.
(iii) Once again we use the fact that A is a bounded, one-to-one linear
operator from $N(A)^\perp$ onto R(A); then (Exercise 8.23) there is a constant
C > 0 such that

$\|u\|_{H^s(\Omega)} \le C \|Au\|_{H^{s-2m}(\Omega)} = C \|f\|_{H^{s-2m}(\Omega)}$.

Examples

30. Consider the Poisson problem

$-k\nabla^2 u = f$ in $\Omega$,
$u = 0$ on $\Gamma$.

In this case $A = A^* = -k\nabla^2$, so A is formally self-adjoint. Assume
that $f \in L^2(\Omega)$, and take s = 2 (m = 1 here). Thus

$N(A) = N(A^*) = \{u \in H^2(\Omega) : -k\nabla^2 u = 0 \text{ in } \Omega,\ u = 0 \text{ on } \Gamma\} = \{0\}$

which should come as no surprise if one considers the various physical
problems for which the Poisson equation is a model; whether it is the
membrane problem or that of steady heat conduction, clearly one will
expect that the solution in the absence of any forcing function f, with
u prescribed to be zero along the boundary, is going to be zero.
Returning to Theorem 1, (8.62) and (8.63) are satisfied identically. It
follows that $-k\nabla^2$ is one-to-one from D(A) onto $L^2(\Omega)$. Furthermore,
from Part (iii) of the theorem there is a constant C > 0 such that

$\|u\|_{H^2(\Omega)} \le C \|f\|_{L^2(\Omega)}$.

31. Consider now the problem

$-k\nabla^2 u = f$ in $\Omega$,
$\partial u/\partial \nu = 0$ on $\Gamma$.

This would correspond physically to the problem of a membrane constrained
around its boundary in such a way that the slope there is
zero, or in the case of heat conduction, to a medium that is perfectly
insulated along its boundary.
In this case N(A) = N(A*) = {c}, c being a constant function. From
Part (ii) of the theorem we thus deduce that there exists a solution
if and only if

$(f, c) = 0$, or $c \int_\Omega f\,dx = 0$, or $\int_\Omega f\,dx = 0$,

since c is arbitrary. Physically, this compatibility condition means
that the net force on the membrane must be zero, or in the case
of heat conduction, the net heat source must be zero. Again this
condition makes physical sense: in the case of the membrane, there
is no constraint against vertical motion along the boundary, so that
the membrane would fly off unless the forces acting on it were in
equilibrium.
From (i) the solution is unique if we prescribe the condition

$(u, c) = 0$ or $\int_\Omega u\,dx = 0$.

Such a condition would serve to determine the value of any arbitrary
constant in the solution.
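The roles of the compatibility and uniqueness conditions can be seen in a one-dimensional analogue of this example. The sketch below makes several assumptions of its own: the problem is $-u'' = f$ on (0, 1) with $u'(0) = u'(1) = 0$ (so k = 1), f is a polynomial stored as a coefficient list [c0, c1, ...], and the function names are hypothetical. Integrating once gives $u'(x) = -\int_0^x f$, so $u'(1) = 0$ holds only when $\int_0^1 f\,dx = 0$; the remaining constant is fixed by requiring $\int_0^1 u\,dx = 0$.

```python
from fractions import Fraction


def antiderivative(p):
    """Antiderivative vanishing at x = 0 of a polynomial [c0, c1, ...]."""
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(p)]


def evaluate(p, x):
    return sum(Fraction(c) * Fraction(x) ** k for k, c in enumerate(p))


def solve_neumann(f):
    """Solve -u'' = f on (0, 1) with u'(0) = u'(1) = 0 and zero mean.

    Returns None when the compatibility condition int_0^1 f dx = 0 fails.
    """
    F = antiderivative(f)                  # F(x) = int_0^x f, so u' = -F
    if evaluate(F, 1) != 0:                # u'(1) = -int_0^1 f must vanish
        return None
    u = [-c for c in antiderivative(F)]    # one solution, with u(0) = 0
    mean = evaluate(antiderivative(u), 1)  # int_0^1 u dx
    u[0] -= mean                           # enforce (u, 1) = 0 for uniqueness
    return u


print(solve_neumann([1]))        # None: f = 1 has nonzero integral
print(solve_neumann([-1, 2]))    # f = 2x - 1 integrates to zero
```

For f = 2x - 1 the routine returns the coefficients of $u(x) = -1/12 + x^2/2 - x^3/3$, which satisfies both boundary conditions and has zero mean.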
32. We return to the problem of elasticity, and show that Theorem 1
is applicable to this problem as well. Suppose that the boundary
condition is

$u_\nu = u_s = 0$ or, equivalently, $\mathbf{u} = \mathbf{0}$ on $\Gamma$.

We first investigate the structure of
$N(A) = \{\mathbf{u} : A\mathbf{u} = \mathbf{0} \text{ in } \Omega,\ \mathbf{u} = \mathbf{0} \text{ on } \Gamma\}$ (recall (8.48)).
Now let $(\cdot, \cdot)$ denote the inner product on
$[L^2(\Omega)]^n$, with $(\mathbf{u}, \mathbf{v}) \equiv \int_\Omega \mathbf{u} \cdot \mathbf{v}\,dx$, and consider the inner product
(Au, u), where $\mathbf{u} \in N(A)$; this inner product is of course zero since
Au = 0, and so (8.9) and (8.49) give

$0 = (A\mathbf{u}, \mathbf{u}) = \int_\Omega \sum_{i,j,k,l=1}^{n} C_{ijkl}\, \epsilon_{ij}(\mathbf{u})\, \epsilon_{kl}(\mathbf{u})\,dx$. (8.64)

Now in order that various features of realistic elastic materials be
encapsulated in the specification of $C$, it is necessary that this tensor
possess a property akin to that of positive-definiteness in the case of
matrices. In the present context this is known as pointwise stability;
the elasticity tensor is said to be pointwise stable if there exists a
constant $C_0 > 0$ such that

\[ \sum_{i,j,k,l=1}^{n} C_{ijkl}M_{ij}M_{kl} \ge C_0 \sum_{i,j=1}^{n} M_{ij}M_{ij} \quad\text{for all matrices } M. \tag{8.65} \]

For an isotropic elastic material, pointwise stability is equivalent to
the requirement that (Exercise 8.8)

\[ \mu > 0 \quad\text{and}\quad \lambda + 2\mu > 0. \]


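Returning to (8.64), and assuming that the elasticity tensor is
pointwise stable, we now have

\[ 0 = (Au, u) \ge C_0 \int_\Omega |\epsilon(u)|^2\,dx, \tag{8.66} \]

where $|\cdot|$ represents the norm of a matrix; that is,
$|\epsilon|^2 = \sum_{i,j=1}^{n} \epsilon_{ij}\epsilon_{ij}$.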

Now define the norm on $[H^1(\Omega)]^n$ in an obvious way, according to

and define also the norm on the space $[L^2(\Omega)]^{n\times n}$ of
matrix-valued functions whose components are in $L^2$, by

We would like next to bound (8.66) from below in terms of a Sobolev
norm, in order to conclude that $u = 0$, and herein lies a problem: the
right-hand side of (8.66) contains only specific first derivatives of the
displacement. This impasse can fortunately be resolved by appealing
to a result known as Korn's inequality, which plays a vital role in
analyses of problems in elasticity, and according to which there is a
constant $C_2 > 0$ such that

(8.67)

whenever $v = 0$ on a part $\Gamma_v$ of the boundary $\Gamma$, with
$\mu(\Gamma_v) \ne 0$. Putting (8.66) and (8.67) together we therefore find
that $\|u\|_{H^1} = 0$, so that $u = 0$.

FIGURE 8.13. An elastic body subject to a traction boundary condition and a
rigid body displacement

Thus the only member of the null space N(A) is the zero element,
and so according to Theorem 1, the problem (8.48) together with the
boundary condition u = 0 has a unique solution, and furthermore
there is a constant C > 0 such that

33. A more interesting situation arises when the boundary condition is
given by

\[ t(u) = 0 \text{ on } \Gamma. \]
Physically, the body is not constrained against movement anywhere
on its boundary, so we would expect an element of nonuniqueness in
the solution, inasmuch as the body could be translated and rotated
from whatever its current position is, without affecting its state at all
(Figure 8.13). Such a motion, which takes place without adding any
deformation to the body, is known as a rigid body displacement. Its
most general form is

\[ u(x) = a + b \times x, \]
and it is easy to verify that $\epsilon(u) = 0$ for such a displacement field.

For the problem with a traction boundary condition, the most general
solution of the problem $Au = 0$ in $\Omega$ and $t(u) = 0$ on $\Gamma$ is
$\epsilon = 0$, in other words, a rigid body displacement, and so

\[ N(A) = \{u : u(x) = a + b \times x,\ a, b \in \mathbb{R}^n\}. \]


A solution therefore exists, according to Part (ii) of Theorem 1, if
and only if the force $Q$ satisfies the condition

\[ \int_\Omega Q \cdot [a + b \times x]\,dx = 0 \quad\text{for all } a, b \in \mathbb{R}^n \]

or, equivalently, if

\[ \int_\Omega Q\,dx = 0 \quad\text{and}\quad \int_\Omega Q \times x\,dx = 0. \]

These conditions stipulate that $Q$ may not be specified arbitrarily,
but rather that the net total force and total couple acting on the
body be zero (Figure 8.13), a condition that makes physical sense.
Uniqueness is also subject to a condition: the solution is unique only
if it is in $N(A)^\perp$, that is, if it satisfies

\[ \int_\Omega u \cdot [a + b \times x]\,dx = 0 \quad\text{for all } a, b \in \mathbb{R}^n \]

or, equivalently, if

\[ \int_\Omega u\,dx = 0 \quad\text{and}\quad \int_\Omega u \times x\,dx = 0. \]

These two conditions suffice to ensure that $u$ contains no rigid body
displacement.
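
The claim that a rigid body displacement carries no strain can be checked
numerically. The sketch below is an illustration under stated assumptions
rather than part of the text: it works in three dimensions, approximates
the displacement gradient by central differences (exact up to rounding for
a displacement that is linear in x), and uses the hypothetical helper names
cross, rigid, and strain.

```python
# Illustrative sketch: strain of u(x) = a + b x x, evaluated by central differences.

def cross(b, x):
    """Vector product b x x in three dimensions."""
    return (b[1] * x[2] - b[2] * x[1],
            b[2] * x[0] - b[0] * x[2],
            b[0] * x[1] - b[1] * x[0])


def rigid(a, b):
    """Return the rigid body displacement field u(x) = a + b x x."""
    def u(x):
        c = cross(b, x)
        return tuple(a[i] + c[i] for i in range(3))
    return u


def strain(u, x, h=1e-3):
    """Approximate eps_ij = (du_i/dx_j + du_j/dx_i) / 2 at the point x."""
    grad = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        up, um = u(xp), u(xm)
        for i in range(3):
            grad[i][j] = (up[i] - um[i]) / (2 * h)
    return [[0.5 * (grad[i][j] + grad[j][i]) for j in range(3)] for i in range(3)]


u_rigid = rigid(a=(1.0, -2.0, 0.5), b=(0.3, 0.7, -1.1))
eps_rigid = strain(u_rigid, (0.4, -0.2, 0.9))

# For contrast, a uniaxial stretch u = (x_1, 0, 0) has eps_11 = 1.
eps_stretch = strain(lambda x: (x[0], 0.0, 0.0), (0.4, -0.2, 0.9))
```

The gradient of a + b x x is the skew-symmetric matrix associated with b,
so its symmetric part vanishes identically, whereas the stretch has a
nonzero strain component.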

8.6 Bibliographical remarks


The concepts in Section 8.1 are elementary, and are normally encountered
in beginning courses on differential equations. Further details, including
various techniques for finding solutions, may be found in texts such as that
of Zauderer [52], for example.
The theory of elliptic boundary value problems developed in Sections
8.2 and 8.3 draws heavily on the account given in the extended survey
by Babuska and Aziz ([3], Chapter 3, which was written by B. Kellogg).
This account is based in turn on the treatment of Lions and Magenes [30];
indeed, the presentation given in Sections 8.2 and 8.3 avoids some rather
delicate technical issues, full details of which may be found in [30], and
concentrates on the aspects that are most accessible, and most relevant,
to readers of this text. Accessible treatments of an alternative approach
to regularity, using what is known as the method of differentials, may be
found in the monographs by Zeidler [53] and by Dautray and Lions [13]. The
latter text may also be consulted for further details of Korn's inequalities.
The article by Horgan [21] summarizes the major results concerning Korn's
inequalities for bounded domains, and discusses bounds on the constants
appearing in the inequalities.
Attention has been focused deliberately on those aspects of the theory of
elliptic boundary value problems that are relevant to the primary objective,
viz. that of presenting the theory of variational boundary value problems
and their approximation by finite elements. Some of the more complex
topics that have been omitted include the question of well-posedness in the
presence of nonhomogeneous boundary conditions, and in the presence of
data in $H^{-r}(\Omega)$ for $r > 0$. The latter would cover problems such as
$-\nabla^2 u = f$ in $\Omega$ where, for example, $f$ is a Dirac delta.
Naturally the solution $u$ is correspondingly irregular. These topics require
some knowledge of Sobolev spaces $H^s(\Omega)$ and $H^s(\Gamma)$ for which $s$
is real; the theory of such spaces is covered in the references to Sobolev
spaces given at the end of Chapter 7.
We have assumed the boundary to be of class $C^\infty$; when the boundary is
less smooth (for example, Lipschitz or polygonal) then the theory on
regularity becomes more complicated, although in many cases the results look
similar to those given here. For a comprehensive treatment of problems in
nonsmooth domains the monograph by Grisvard [17] is recommended.

8.7 Exercises
Differential equations, boundary conditions, and initial conditions

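8.1. For each of the following differential equations specify the order of
the equation, state whether it is linear, and sketch the spatial domain
$\Omega$.

(a) $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial u}{\partial x}\dfrac{\partial u}{\partial y} = y$ in $\Omega = \{x \in \mathbb{R}^2 : x^2 + y^2 < 1,\ y > 0\}$;

(b) in $\Omega = \{x \in \mathbb{R}^2 : x > 0,\ y >$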

8.2. The purpose of this exercise is to derive Navier's equation (8.11) for
elastic bodies, by retracing the steps employed in the Introduction
(equations (0.1) through (0.7)) in the derivation of the heat equation.
(a) The balance law in this case is balance of linear momentum,
which states that the rate of change of total momentum equals
the total force acting on the body. Express this balance law
in mathematical form, and obtain Cauchy's equation of motion
(8.5).
(b) Eliminate the stress from Cauchy's equation, using the
constitutive equation (8.8) and (8.9), to obtain Navier's equation.
8.3. Using the general approach of the Introduction, obtain the Helmholtz
equation

for the behavior of a membrane that is connected to a foundation
with stiffness $k$; that is, the foundation exerts a resisting force that
is proportional to the displacement of the membrane, the coefficient
of proportionality being the stiffness $k$.
8.4. The purpose of this exercise is to fill in some of the missing details in
the derivation of the plate equation (Example 12, Box 5).
(a) Assuming static (time-independent) behavior and an external
force acting only transversely, consider the first two of equations
(8.6), that is, $\sum_{j=1}^{3} \partial\sigma_{\alpha j}/\partial x_j = 0$
($\alpha = 1, 2$); multiply by $z$ and integrate to obtain (8.15)$_1$.
(b) Consider next the third of equations (8.7), that is,
$-\sum_{j=1}^{3} \partial\sigma_{3j}/\partial x_j = Q_3$, and integrate to
obtain (8.15)$_2$.
(c) Use Parts (a) and (b) together with the constitutive equation
(8.14) to derive the biharmonic equation in Box 5.
8.5. The problem of an elastic beam (Example 7), being a fourth-order
differential equation, requires two boundary conditions at each end.
Sketch and formulate the boundary conditions corresponding to the
situations in which
(a) the end of the beam is unable to rotate, but may displace verti-
cally;
(b) the end of the beam is unable to displace vertically, but is free
to rotate.
Linear elliptic operators
8.6. Find the regions in the $xy$ plane in which the operator

\[ A = (1 - x)^2\frac{\partial^4}{\partial x^4} + 2(1 - x)(1 - y)\frac{\partial^4}{\partial x^2\,\partial y^2} + (1 - y)^2\frac{\partial^4}{\partial y^4} \]

is (i) elliptic; (ii) strongly elliptic.

8.7. Show that the operator A defined by

is not elliptic anywhere in ]R3.

8.8. In the context of elasticity the definition of ellipticity given in Section
8.2 is extended in a very natural way to systems of PDEs involving
the displacement vector as unknown variable. Suppose we consider
only time-independent second-order problems in $\mathbb{R}^3$. Then clearly the
principal part of Navier's equation can be written in the form

where the coefficients $C_{ijkl}$ are defined by (8.9). The elasticity
operator is then said to be elliptic if for all vectors $\xi$ and $\eta$,

\[ \sum_{i,j,k,l=1}^{3} C_{ijkl}\,\xi_i\eta_j\xi_k\eta_l \ge 0. \]

Furthermore, it is said to be strongly elliptic if this inequality holds
strictly for all nonzero vectors. Show that the operator in Navier's
equation is strongly elliptic, and that it is also pointwise stable (see
(8.65)) if and only if the Lamé constants satisfy $\mu \ge \mu_0 > 0$ and
$3\lambda + 2\mu \ge k_0 > 0$, for constants $\mu_0$ and $k_0$.

Normal boundary conditions

8.9. Express the boundary condition

on $\Gamma$ in the form $\sum_{|\alpha| \le q} b_\alpha D^\alpha u = g$. Is it normal?

8.10. Show that in the theory of elastic plates, the boundary condition
$S_1 = 0$ along the edge $x = L$ of the plate can be expressed in the
form

Write the equation in the form $\sum b_\alpha D^\alpha u = 0$ and investigate
whether it fails to be a normal boundary condition for any values of $\nu$.

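8.11. Determine the conditions under which the pair of boundary conditions

\[ B_0 u = u, \qquad B_1 u = \alpha\frac{\partial^3 u}{\partial x^3} + \beta\frac{\partial^3 u}{\partial x^2\,\partial y} + \gamma\frac{\partial^3 u}{\partial x\,\partial y^2} + \frac{\partial^3 u}{\partial y^3}, \]

cover the biharmonic operator $A = \nabla^4$, at a point on the boundary
with normal $\nu = (0, 1)$.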
8.12. An elastic body occupies the domain $\Omega = (0,1) \times (0,1)$. The sides
$x = 0$, $x = 1$, and $y = 1$ are traction-free, whereas the side $y = 0$
is constrained by a flexible foundation, in the sense that the normal
component of the surface traction acting on the boundary is
proportional to the normal component of displacement; the tangential
component of displacement is zero along this side. Do the boundary
conditions along $y = 0$ satisfy (8.34)?
8.13. Consider again the elastic body discussed in Exercise 8.12, but this
time suppose that the boundary condition along $y = 0$ is that
corresponding to Coulomb friction: the normal component of displacement
is zero, whereas the tangential component of traction is proportional
to the normal component of traction. Formulate this boundary
condition.
Green's formulas and adjoint problems
8.14. Show that the Green's formula for the operator $A$ defined by

\[ \frac{d^4 u}{dx^4} = f \quad\text{in } \Omega = (0, 1) \]

is

\[ \int_0^1 v\,u''''\,dx = \int_0^1 u\,v''''\,dx + \left[u'''v - u''v' + u'v'' - uv'''\right]_0^1. \]

Given that $B_0 u = (u'(0), u''(1))$ and $B_1 u = (u'''(0), u'''(1))$, find the
operators $B_j^*$, $S_j$, and $S_j^*$ ($j = 0, 1$).
8.15. Show that the Green's formula for the Laplacian operator $Au = \nabla^2 u$
can be expressed in the form

\[ \int_\Omega (\nabla^2 u)\,v\,dx = \int_\Omega u\,(\nabla^2 v)\,dx + \int_\Gamma (v\,\nabla u \cdot \nu - u\,\nabla v \cdot \nu)\,ds. \]

Given that $B_0 = \partial/\partial\nu$, identify the boundary operators
$B_0^*$, $S_0$, $S_0^*$.


8.16. The purpose of this exercise is to derive the identities (8.49) and
(8.50). First, use (7.2) to show that

\[ \sum_{i,j=1}^{n} \int_\Omega \frac{\partial\sigma_{ij}}{\partial x_j}\,v_i\,dx = \sum_{i,j=1}^{n} \int_\Gamma \sigma_{ij}\,\nu_j\,v_i\,ds - \sum_{i,j=1}^{n} \int_\Omega \sigma_{ij}\,\frac{\partial v_i}{\partial x_j}\,dx, \tag{8.68} \]

where $\sigma_{ij}$ are the elements of a matrix $\sigma$. Show furthermore that if
$\sigma$ is symmetric, then the integrand over $\Omega$ on the right-hand side of
(8.68) is in fact equal to
$\sigma \cdot \epsilon(v) = \sum_{i,j=1}^{n} \sigma_{ij}\epsilon_{ij}(v)$. Next, use (8.8)
to obtain (8.49). Apply Green's theorem again to find (8.50).
8.17. Derive the Green's formula for the operator $A$ given by

\[ Au = \nabla^4 u = f \text{ in } \Omega, \qquad u = g_0, \quad \partial u/\partial\nu = g_1 \text{ on } \Gamma. \]

Existence, uniqueness, and regularity of solutions


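8.18. Consider the BVP

\[ Au = f \text{ in } \Omega, \qquad B_j u = g_j \text{ on } \Gamma \quad (j = 0, 1, \ldots, m - 1), \]

where $A$ is a $2m$th-order operator. Let $\phi$ be a known function in
$C^{2m}(\Omega)$ such that $B_j\phi = g_j$ on $\Gamma$. Show that the BVP can be
transformed to the problem

\[ Aw = \bar{f} \text{ in } \Omega, \qquad B_j w = 0 \text{ on } \Gamma, \]

where $w = u - \phi$ and $\bar{f} = f - A\phi$.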

8.19. Investigate the existence, uniqueness, and regularity of solutions to
the problem of an elastic beam, which is described by

\[ \frac{d^4 u}{dx^4} = f \text{ in } (0, 1), \qquad u''(0) = u''(1) = 0, \qquad u'''(0) = u'''(1) = 0. \]

In particular, determine the conditions that must be placed on the
loading $f$.
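8.20. Investigate the existence, uniqueness, and regularity of solutions to
the problem

\[ \frac{\partial^2 u}{\partial x^2} + 2\frac{\partial^2 u}{\partial x\,\partial y} + \frac{\partial^2 u}{\partial y^2} = f \text{ in } \Omega, \qquad \frac{\partial u}{\partial\nu} = 0 \text{ on } \Gamma. \]

If $\Omega = (-1, 1) \times (-1, 1)$, show that any loading $f$ satisfying
$f(x, y) = f(y, x)$ with $f$ odd in $x$ or $y$ is compatible.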

8.21. Verify that $\epsilon(u) = 0$ for the rigid body displacement
$u(x) = a + b \times x$.

8.22. The purpose of this exercise is to fill in some of the details of the
proof of Theorem 1.

(a) Show that $A : H^s(\Omega) \to H^{s-2m}(\Omega)$, $A$ as in (8.41), is a bounded
operator if the coefficients have bounded derivatives of all orders.
(b) Use the fact that $A$ is one-to-one from $N(A)^\perp$ onto its range,
so that $A$ has an inverse $A^{-1} : R(A) \to N(A)^\perp$. Now use the
Banach Theorem, Theorem 6 of Chapter 5, to conclude that $A^{-1}$
is bounded. Use the boundedness of $A^{-1}$ to show that $R(A)$ is
closed.

8.23. Investigate the conditions under which unique solutions exist to the
elasticity problem with boundary conditions given in Exercises 8.12
and 8.13.
9
Variational boundary value problems

In the preceding few sections we have built up a theory of regularly elliptic
BVPs, in which the typical problem involves finding a function $u$ that
satisfies

\[ \text{PDE:}\quad Au = f \quad\text{in } \Omega, \]
\[ \text{BCs:}\quad B_0 u = g_0,\ \ldots,\ B_{m-1} u = g_{m-1} \quad\text{on } \Gamma, \]

where $A$ is an elliptic operator of order $2m$ in a domain $\Omega$, whereas the
boundary conditions are normal, and cover $A$. The question of well-posedness
of solutions to elliptic BVPs has been settled, at least for the case of a
smooth domain and homogeneous BCs; provided that certain conditions are met, a
unique solution exists. Furthermore, if $f \in H^s(\Omega)$, then $u$ is smooth
enough to belong to $H^{s+2m}(\Omega)$.
In this chapter we broaden the concept of a boundary value problem
by introducing what is known as a variational boundary value problem
(VBVP). The variational formulation is a weaker one than the conventional
formulation, since it demands less smoothness of the solution u. Neverthe-
less, there is a VBVP corresponding to every BVP, and vice versa, so that
we have the option of formulating a problem in either of these two settings.
We start by examining a typical VBVP in Section 9.1; we take a simple
example and show explicitly the relationship between the variational and
conventional formulations. Then in Section 9.2 the general features of
VBVPs are examined: how they are formulated and how they are related to
BVPs. In Section 9.3 we consider the questions of existence and uniqueness
of solutions to VBVPs. Finally, we show in Section 9.4 that certain VBVPs
can be formulated alternatively as minimization problems, in which it is
required to find the function that minimizes a given functional.

9.1 A simple variational boundary value problem


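In the present context we understand a variational boundary value problem
to be one of the form: find a function $u$ that belongs to a Hilbert space $V$,
and that satisfies the equation

\[ a(u, v) = \langle \ell, v \rangle \]

for all functions $v$ in $V$, where $a$ is a bilinear form and $\ell$ a linear
functional. Before discussing general ideas, we consider the following simple,
concrete example of a VBVP.
Find $u \in H_0^1(\Omega)$, $\Omega \subset \mathbb{R}^2$, that satisfies

\[ \int_\Omega \nabla u \cdot \nabla v\,dx\,dy = \int_\Omega f v\,dx\,dy \quad\text{for all } v \in H_0^1(\Omega). \tag{9.1} \]

Here $V = H_0^1(\Omega)$,

\[ a(u, v) = \int_\Omega \nabla u \cdot \nabla v\,dx = \int_\Omega \left( \frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y} \right) dx\,dy \]

and

\[ \langle \ell, v \rangle = \int_\Omega f v\,dx\,dy. \]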
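The first question we ask is: in what sense is (9.1) equivalent to a BVP, and
what does this BVP look like? This is resolved by observing first that since
$v$ in (9.1) is arbitrary, we can set $v = \phi \in \mathcal{D}(\Omega)$ (note that
$\mathcal{D}(\Omega) \subset H_0^1(\Omega)$), to give

\[ a(u, \phi) = \int_\Omega \left( \frac{\partial u}{\partial x}\frac{\partial \phi}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial \phi}{\partial y} \right) dx\,dy = \langle \ell, \phi \rangle. \tag{9.2} \]

Suppose for definiteness that $f$ is in $L^2(\Omega)$; then $f$ is locally
integrable and generates a regular distribution, also denoted $f$, so that

\[ \langle \ell, \phi \rangle = \langle f, \phi \rangle = \int_\Omega f\phi\,dx\,dy. \tag{9.3} \]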



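Now the functions $\partial u/\partial x$ and $\partial u/\partial y$ appearing in (9.2)
belong to $L^2(\Omega)$ (since $u \in H_0^1(\Omega)$) and also generate regular
distributions $\partial u/\partial x$ and $\partial u/\partial y$, from which it
follows that

\[ a(u, \phi) = \left\langle \frac{\partial u}{\partial x}, \frac{\partial \phi}{\partial x} \right\rangle + \left\langle \frac{\partial u}{\partial y}, \frac{\partial \phi}{\partial y} \right\rangle, \tag{9.4} \]

the right-hand side indicating the action of the distributions
$\partial u/\partial x_i$ on $\partial\phi/\partial x_i$. Furthermore, from the definition
of the generalized derivative of a distribution,

\[ \left\langle \frac{\partial u}{\partial x_i}, \frac{\partial \phi}{\partial x_i} \right\rangle = -\left\langle \frac{\partial^2 u}{\partial x_i^2}, \phi \right\rangle, \tag{9.5} \]

$\partial^2 u/\partial x_i^2$ being a distribution, although not necessarily regular.
Bringing together (9.2) through (9.5) we thus obtain

\[ \langle -\nabla^2 u - f, \phi \rangle = 0 \quad\text{for all } \phi \in \mathcal{D}(\Omega); \tag{9.6} \]

in other words (9.1) implies the problem of finding $u \in H_0^1(\Omega)$ that
satisfies the Poisson equation

\[ -\nabla^2 u = f \quad\text{in } \Omega \tag{9.7} \]

in the sense of distributions (see Section 7.2). We could even go one step
further, and make use of the fact that $\mathcal{D}(\Omega)$ is dense in
$H_0^1(\Omega)$ to argue, using (9.6), that (9.7) makes sense in $H^{-1}(\Omega)$,
the dual space of $H_0^1(\Omega)$.
Furthermore, since $u \in H_0^1(\Omega)$ it vanishes on the boundary, and we have

\[ u = 0 \text{ on } \Gamma, \tag{9.8} \]

in the sense of traces.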


It is important to remember that by (9.7) we mean (9.6). That is, the
PDE (9.7) may only make sense when viewed as a distributional differential
equation. For example, suppose that we consider the physical context of
the membrane problem, in which case $f$ represents the force acting on the
membrane, and suppose further that this force is a point load of intensity
$P$ (Figure 9.1) acting at $x = 0$. Then instead of (9.3) we have

\[ \langle \ell, v \rangle = P\langle \delta, v \rangle = P v(0), \tag{9.9} \]

where $\delta$ is the Dirac singular distribution, and the same procedure leads
to the equation

\[ -\nabla^2 u = P\delta, \]

which, as we know from Section 7.2, only has meaning in the distributional
sense.

FIGURE 9.1. A membrane subjected to a point force

As (9.7) and (9.8) stand, a solution is sought in the space $H_0^1(\Omega)$.
Whether this solution coincides with a "classical" solution of the kind
discussed in Chapter 8 depends on the smoothness of $f$. If $f \in H^s(\Omega)$
with $s \ge 0$, then $u \in H^{s+2}(\Omega)$, and so the solution to the VBVP is the
same as that of the classical BVP.
So far we have shown that the VBVP (9.1) implies (9.7) (in the sense of
distributions) and (9.8). What of the converse? Suppose that we start with
the Dirichlet problem for the Poisson equation, that is,

\[ -\nabla^2 u = f \quad\text{in } \Omega, \tag{9.10} \]
\[ u = 0 \quad\text{on } \Gamma, \tag{9.11} \]

with $f \in L^2(\Omega)$, and we wish to derive the corresponding VBVP. First
we select $V$ to be $H_0^1(\Omega)$ (the general procedure for selecting this space
is discussed in detail in the following section); next, we multiply (9.10) by
an arbitrary function $v$ from $H_0^1(\Omega)$ and integrate over $\Omega$, to obtain

\[ -\int_\Omega (\nabla^2 u)\,v\,dx = \int_\Omega f v\,dx. \tag{9.12} \]

Green's theorem in the form (7.30) is now applied to the left-hand side of
(9.12), to reduce this to

\[ -\int_\Omega (\nabla^2 u)\,v\,dx = -\int_\Gamma \frac{\partial u}{\partial\nu}\,v\,ds + \int_\Omega \nabla u \cdot \nabla v\,dx. \]

Since $v \in H_0^1(\Omega)$ the boundary integral vanishes and so (9.1) is seen to
hold.
To summarize, then: the solution to the Dirichlet problem (9.10) and
(9.11) satisfies the VBVP (9.1). Conversely, the VBVP (9.1) implies the
problem (9.7) and (9.8) or (9.10) and (9.11) provided that this problem is
interpreted in the broader sense of seeking $u \in H_0^1(\Omega)$ that satisfies (9.6).

Thus the variational formulation contains all the information found in
the classical formulation and more, since we are able, when dealing with
VBVPs, to work in a larger space and also to consider very irregular data
such as that given by (9.9). This is an important consideration since
physical problems may well require that data be modeled using distributions
such as the Dirac delta: the case discussed earlier of the membrane
subjected to a point force is one such example, and there are other similar
examples, such as in heat conduction, in which one might want to consider
a point heat source of the form $P\delta$. Whereas the classical formulation does
not permit a treatment of such problems, the variational formulation offers
a natural setting.
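
How a point load enters a variational problem can be seen in a small
computation. The sketch below is an illustration under stated assumptions,
not a method taken from the text: it treats the one-dimensional analogue
(a taut string on (0, 1) with fixed ends), discretizes the problem of
finding u with the integral of u'v' equal to P v(x0) using piecewise-linear
hat functions on a uniform mesh, and places the load at a mesh node. The
function names solve_tridiagonal and point_load_string are hypothetical.
The load vector has the single nonzero entry P, since the functional only
samples the test function at x0.

```python
# Illustrative sketch: Galerkin solution of  int u'v' dx = P v(x0)  on (0, 1).

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] and upper[i] are the off-diagonals in row i."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def point_load_string(n_elements, load_node, P):
    """Values of the Galerkin solution at the interior nodes 1, ..., n_elements - 1."""
    h = 1.0 / n_elements
    m = n_elements - 1                  # essential BCs remove the two end nodes
    lower = [-1.0 / h] * m
    diag = [2.0 / h] * m
    upper = [-1.0 / h] * m
    rhs = [0.0] * m
    rhs[load_node - 1] = P              # <l, phi_i> = P * phi_i(x0)
    return solve_tridiagonal(lower, diag, upper, rhs)


n, k, P = 8, 3, 2.0
x0 = k / n
u_h = point_load_string(n, k, P)

# Distributional solution of -u'' = P*delta(x - x0), u(0) = u(1) = 0.
def exact(x):
    return P * x * (1 - x0) if x <= x0 else P * x0 * (1 - x)

error = max(abs(u_h[i - 1] - exact(i / n)) for i in range(1, n))
```

In this one-dimensional setting the exact solution is piecewise linear with
a kink at the load point, so it lies in the discrete space and the computed
nodal values agree with it up to rounding.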

9.2 Formulation of variational boundary value problems

The ideas developed in the previous section are readily applicable to BVPs
of arbitrary order. We confine attention to regularly elliptic BVPs of order
$2m$, and go on now to discuss details of the general procedure for
formulating the corresponding variational boundary value problems.
In anticipation of the fact that each boundary condition plays a role that
depends on the order of the condition, we partition the set of boundary
conditions into two subsets:
(i) essential boundary conditions, which are those of order $< m$;
(ii) natural boundary conditions, which are those of order $\ge m$.
The reason for making the distinction is this: the aim is to formulate a
VBVP in which the solution is required only to be in a subspace of $H^m(\Omega)$.
If this is so, then by the trace theorem it is only the derivatives of order
less than $m$ that make sense as boundary values; thus the set of essential
boundary conditions may be included in the description of the space in
which a solution is sought (as with the inclusion of (9.11) in the problem
description by choosing $V = H_0^1(\Omega)$). The natural boundary conditions, on
the other hand, have to be accommodated in a different way.
As in the case of the theory in Section 8.5, we restrict our considerations
to bounded domains $\Omega$ having smooth boundaries $\Gamma$.
In order to simplify matters, attention is confined to problems with
homogeneous essential boundary conditions. This assumption does not imply
any restriction on the class of problems that may be considered, since it
is a straightforward matter to convert any problem with nonhomogeneous
boundary conditions to one whose boundary conditions are homogeneous,
as has already been discussed in Exercise 8.18.
So if the BCs are written down in the order of the highest derivatives
appearing in each one, so that the first $p$ BCs are essential, then the BVP
to be considered has the form

\[ \text{PDE:}\quad Au = f \quad\text{in } \Omega, \tag{9.13} \]
\[ \text{BCs:}\quad B_0 u = 0,\ \ldots,\ B_{p-1} u = 0 \quad\text{(essential)}, \tag{9.14} \]
\[ \phantom{\text{BCs:}}\quad B_p u = g_p,\ \ldots,\ B_{m-1} u = g_{m-1} \quad\text{(natural)}. \tag{9.15} \]

The first step is to define a space $V$ in which the solution to the VBVP
is to be sought. This corresponds to the space $H_0^1(\Omega)$ in problem (9.1).
The space $V$ is known as the space of admissible functions, and is defined by

\[ V = \{v \in H^m(\Omega) : v \text{ satisfies all essential boundary conditions}\} \]

or

\[ V = \{v \in H^m(\Omega) : B_j v = 0 \text{ on } \Gamma,\ j = 0, \ldots, p - 1\}. \]

As with the simple example worked through earlier, the next step is to
multiply both sides of (9.13) by an arbitrary function $v$ from $V$, integrate,
and use Green's theorem to reduce the expression so obtained to one of the
form

\[ a(u, v) = \langle \ell, v \rangle \tag{9.16} \]

in which the bilinear form $a$ is given by

\[ a(u, v) = \int_\Omega \sum_{|\alpha|, |\beta| \le m} a_{\alpha\beta}(x)\,D^\beta u\,D^\alpha v\,dx + \text{boundary terms}. \]

Although the essential BCs are taken care of by the requirement that $u \in V$,
the natural BCs are substituted into (9.16) directly. Once the formulation
(9.16) is arrived at we may disregard any smoothness initially assumed of
$u$, and pose the VBVP: find $u \in V$ that satisfies (9.16) for all $v \in V$. Since
the VBVP is derived from the setting (9.13) through (9.15), every solution
of (9.13) through (9.15) is a solution of the VBVP. Conversely, it can be
shown that every solution of (9.16) solves the classical problem, possibly
in a weak or distributional sense.

Examples
1. Consider the problem

\[ -\nabla^2 u + au = f \text{ in } \Omega, \qquad \partial u/\partial\nu + bu = g \text{ on } \Gamma, \tag{9.17} \]

where $a$ and $b$ are continuous functions and it is assumed that
$f \in L^2(\Omega)$ and $g \in L^2(\Gamma)$. This problem arises in steady heat
conduction, in which the heat source is temperature-dependent, and of the form
$f - au$, and there is Newton cooling on the boundary. In this problem
$m = 1$, so that the boundary condition is a natural one. The space
of admissible functions is thus $V = H^1(\Omega)$. Multiplying both sides of
the PDE by $v \in H^1(\Omega)$, integrating, and using Green's theorem, we
get

\[ \int_\Omega (\nabla u \cdot \nabla v + auv)\,dx - \int_\Gamma \frac{\partial u}{\partial\nu}\,v\,ds = \int_\Omega f v\,dx. \]

The introduction of the natural boundary condition into the boundary
term reduces this equation to the VBVP of finding $u \in H^1(\Omega)$
that satisfies

\[ \underbrace{\int_\Omega (\nabla u \cdot \nabla v + auv)\,dx + \int_\Gamma buv\,ds}_{a(u,v)} = \underbrace{\int_\Omega f v\,dx + \int_\Gamma g v\,ds}_{\langle \ell, v \rangle} \tag{9.18} \]

for all $v \in H^1(\Omega)$. Thus the solution to problem (9.17) for $f \in L^2(\Omega)$
also solves the VBVP (9.18).
Conversely, if $u$ is a solution of (9.18), then upon setting
$v = \phi \in \mathcal{D}(\Omega)$ we get

(9.19)

so that (9.17)$_1$ is satisfied distributionally.


The interpretation of the boundary integrals in (9.18) is less
straightforward, though, unless we assume that $u \in H^2(\Omega)$, in which case
Green's theorem may be used to obtain

\[ 0 = \int_\Gamma \left( bu - g + \frac{\partial u}{\partial\nu} \right) v\,ds - \int_\Omega (\nabla^2 u - au + f)\,v\,dx = \int_\Gamma \left( bu - g + \frac{\partial u}{\partial\nu} \right) v\,ds \]

using (9.19). The boundary value $\partial u/\partial\nu$ is, of course, well-defined
since $u \in H^2(\Omega)$ by assumption, and so $\partial u/\partial\nu \in L^2(\Gamma)$.
By choosing a function $v \in V$ that has a trace $\varphi \in C^\infty(\Gamma)$ (this is
always possible), and by exploiting the density of test functions in
$L^2(\Gamma)$, the boundary condition (9.17)$_2$ can be shown to be valid in
$L^2(\Gamma)$, and hence holds almost everywhere on $\Gamma$.
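
The passage from (9.17) to (9.18) can be checked on a small example. The
sketch below is an illustration under stated assumptions, not part of the
text: it takes the one-dimensional analogue on (0, 1), where the boundary
consists of the two end points and the boundary integrals reduce to point
values, picks constant coefficients a = 1 and b = 2 and the manufactured
solution u = 1 + x^2, and verifies the weak identity exactly with rational
arithmetic. The polynomial helpers poly_mul, poly_add, poly_diff, poly_eval,
and integrate_01 are hypothetical names introduced here.

```python
# Illustrative sketch: exact check of the weak form (9.18) in one dimension.
from fractions import Fraction

# A polynomial is a coefficient list p, with p[k] multiplying x**k.

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, c in enumerate(p):
        for j, d in enumerate(q):
            out[i + j] += c * d
    return out

def poly_add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [c + d for c, d in zip(p, q)]

def poly_diff(p):
    return [k * p[k] for k in range(1, len(p))] or [0]

def poly_eval(p, x):
    return sum(c * x ** k for k, c in enumerate(p))

def integrate_01(p):
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

a, b = 1, 2                      # assumed constant coefficients
u = [1, 0, 1]                    # manufactured solution u = 1 + x^2
f = [-1, 0, 1]                   # f = -u'' + a*u = x^2 - 1
g0 = -poly_eval(poly_diff(u), 0) + b * poly_eval(u, 0)   # outward normal at x = 0 is -1
g1 = poly_eval(poly_diff(u), 1) + b * poly_eval(u, 1)    # outward normal at x = 1 is +1

def lhs(v):
    """a(u, v): interior integral plus the boundary term b*u*v at both ends."""
    interior = poly_add(poly_mul(poly_diff(u), poly_diff(v)),
                        [a * c for c in poly_mul(u, v)])
    ends = b * (poly_eval(u, 0) * poly_eval(v, 0) + poly_eval(u, 1) * poly_eval(v, 1))
    return integrate_01(interior) + ends

def rhs(v):
    """<l, v>: interior load plus the boundary data g at both ends."""
    return integrate_01(poly_mul(f, v)) + g0 * poly_eval(v, 0) + g1 * poly_eval(v, 1)

v1 = [1, 1, 0, 1]                # test function 1 + x + x^3
v2 = [0, 1]                      # test function x
```

Note that the test functions are not required to vanish at the end points:
the boundary condition is natural, so it is carried by the boundary terms
rather than by the choice of space.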

2. We consider next an example involving the biharmonic equation

\[ \nabla^4 w = f \quad\text{in } \Omega, \]

where

\[ \nabla^4 w = \frac{\partial^4 w}{\partial x^4} + 2\frac{\partial^4 w}{\partial x^2\,\partial y^2} + \frac{\partial^4 w}{\partial y^4}. \]

Recall from Section 8.1 (Box 5) that physically this equation represents
the behavior of a flat plate with stiffness $D$ subject to a transverse
force $Q$ per unit area, with $f = Q/D$. For simplicity we confine
attention here to a rectangular plate such as that shown in Figure
8.6.
Suppose that the plate is supported on its entire boundary in such
a way that rotation is permitted, but the boundary is constrained
against displacement (as in the second boundary in Section 8.1, Box
5). Then there are two boundary conditions, the first of which is $w = 0$
on $\Gamma$. To formulate the second boundary condition we must consider
the edges $x = \pm h$ and $y = \pm l$ separately. For the edges $y = \pm l$ we
have, as in Box 5, the condition $\partial^2 w/\partial y^2 = 0$. By a similar
argument, that essentially entails reversing the roles of $x$ and $y$, we
arrive at the condition $\partial^2 w/\partial x^2 = 0$ along the edge $x = \pm h$.
In summary, we require that

\[ w = 0 \quad\text{on } \Gamma, \]
\[ \frac{\partial^2 w}{\partial x^2} = 0 \quad\text{for } x = \pm h,\ y \in [-l, l], \tag{9.20} \]
\[ \frac{\partial^2 w}{\partial y^2} = 0 \quad\text{for } y = \pm l,\ x \in [-h, h]. \]

In this problem $m = 2$, of course, which accounts for the two boundary
conditions. The condition $w = 0$ is an essential BC whereas the
remaining two are natural conditions. Hence

\[ V = \{v \in H^2(\Omega) : v = 0 \text{ on } \Gamma\}. \]

To obtain the bilinear form corresponding to this problem, we first
observe that

after two applications of Green's theorem. Similarly,

Finally,

(this decomposition is carried out in order to preserve the symmetry
inherent in the biharmonic operator).

Now $v$ is assumed to be in $V$ so, in particular, $v = 0$ on $\Gamma$. Thus it
follows that the first terms on the right-hand sides of (9.21) through
(9.23) vanish. This leaves boundary terms involving second derivatives
of $w$. The terms involving $\partial^2 w/\partial x^2$ and $\partial^2 w/\partial y^2$
all vanish, either due to (9.20)$_{2,3}$ or because $w = 0$ along $x = \pm h$
implies that $\partial^2 w/\partial y^2 = 0$ there, with a similar argument along
$y = \pm l$. To see that the term involving the mixed derivative
$\partial^2 w/\partial x\,\partial y$ also vanishes, note that this can be written as

\[ \frac{\partial}{\partial x}\left( \frac{\partial w}{\partial y} \right), \]

and $\partial w/\partial y$ vanishes along $x = \pm h$. The other two sides are treated
in the same way, by swapping $x$ and $y$.
Thus all the boundary terms vanish, and we finally obtain the VBVP:
find $w \in V$ such that

(9.24)

In Exercise 9.3 we show how the VBVP may be arrived at in a more
direct way, which also allows the boundary conditions to be applied
more easily.

3. We return to the problem of the deformed elastic bar, summarized in
Box 4, Chapter 8, and discussed further in Example 11 of that chapter.
The same procedure applies as that adopted for scalar-valued
functions, so we begin by identifying the essential boundary
conditions: there is only one, that is,

\[ u = 0 \quad\text{on } \Gamma_1, \]

where $\Gamma_1 = \{x : x = 0,\ y \in (-d, d),\ z \in (-h, h)\}$ (Figure 9.2). So
the space of admissible functions is

\[ V = \{v : v_i \in H^1(\Omega),\ v = 0 \text{ on } \Gamma_1\}. \]

FIGURE 9.2. The domain of Example 3, and its boundary

Now fortunately much of the work entailed in deriving the VBVP
appropriate to this problem has already been done in Chapter 8,
in the course of arriving at the adjoint problem. Indeed, by taking
the inner product of the left-hand side of (8.10) (without the time
derivative) with an arbitrary function $v \in V$, integrating, and using
Green's theorem, we arrive at (8.50). The boundary term

\[ -\int_\Gamma [C\epsilon(u)]\,\nu \cdot v\,ds \]

may be written as a sum of integrals over the parts $\Gamma_1, \ldots, \Gamma_4$ making
up $\Gamma$; now the integrals over $\Gamma_1$, $\Gamma_2$, and $\Gamma_3$ vanish, either because
$v = 0$ (on $\Gamma_1$) or because the surface traction vanishes (on $\Gamma_2$ and
$\Gamma_3$). The integral over $\Gamma_4$ becomes simply $\int_{\Gamma_4} f \cdot v\,ds$, after
substitution of the natural boundary conditions, and the desired VBVP is: find
$u \in V$ such that

\[ \underbrace{\int_\Omega [C\epsilon(u)] \cdot \epsilon(v)\,dx}_{a(u,v)} = \underbrace{\int_{\Gamma_4} f \cdot v\,ds}_{\langle \ell, v \rangle}. \tag{9.25} \]


4. Consider the problem of deflection of a linear elastic beam; the
differential equation for this problem has been derived in Chapter 8,
Example 7, and various boundary conditions have been discussed in
Exercise 8.5. Take, for example, the case in which the beam is
constrained against displacement and rotation at one end, whereas at
the other it is constrained merely against rotation, and is subjected
to a shear force of magnitude $S_L$ at that end, as shown in Figure 9.3.

FIGURE 9.3. The beam corresponding to Example 4

The boundary conditions are thus

\[ w(0) = 0, \qquad w'(L) = 0, \qquad w'(0) = 0, \qquad w'''(L) = -S_L/EI. \]

Of these, all except the condition $w'''(L) = -S_L/EI$ are essential
conditions, so it follows that the space of admissible functions is

\[ V = \{v \in H^2(0, L) : v(0) = v'(0) = v'(L) = 0\}. \]

In order to obtain the VBVP we multiply the left-hand side of the
differential equation (8.20) by an arbitrary function $v$ and integrate
twice by parts; this gives

\[ \int_0^L w''''v\,dx = \left[w'''v\right]_0^L - \int_0^L w'''v'\,dx = \left[w'''v - w''v'\right]_0^L + \int_0^L w''v''\,dx. \]

Now with the assumption that the function $v$ belongs to $V$, the
boundary term reduces to the single term
$w'''(L)v(L) = -(S_L/EI)\,v(L)$, after imposition of the natural boundary
condition. After rearranging terms we therefore arrive at the VBVP: find
$w \in V$ that satisfies

\[ \underbrace{\int_0^L w''v''\,dx}_{a(w,v)} = \underbrace{\int_0^L q v\,dx + (S_L/EI)\,v(L)}_{\langle \ell, v \rangle}, \]

where $q = f/EI$.
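
The sign of the shear term in this VBVP is easy to get wrong, so a direct
check may be useful. The following sketch is an illustration under stated
assumptions, not part of the text: it takes a constant load q, writes S for
S_L/EI, builds the polynomial that satisfies w'''' = q together with
w(0) = w'(0) = w'(L) = 0 and w'''(L) = -S, and then verifies the weak
identity exactly for two admissible test functions. The helper names
poly_mul, poly_diff, poly_eval, integrate, and beam_solution are
hypothetical.

```python
# Illustrative sketch: exact check of the beam VBVP for a constant load q.
from fractions import Fraction

# A polynomial is a coefficient list p, with p[k] multiplying x**k.

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, c in enumerate(p):
        for j, d in enumerate(q):
            out[i + j] += c * d
    return out

def poly_diff(p):
    return [k * p[k] for k in range(1, len(p))] or [0]

def poly_eval(p, x):
    return sum(c * x ** k for k, c in enumerate(p))

def integrate(p, L):
    """Integral of the polynomial p over (0, L)."""
    return sum(Fraction(c) * L ** (k + 1) / (k + 1) for k, c in enumerate(p))

def beam_solution(q, S, L):
    """w with w'''' = q, w(0) = w'(0) = w'(L) = 0 and w'''(L) = -S."""
    c3 = -(S + q * L) / 6                       # from w'''(L) = -S
    c2 = -(q * L ** 2 / 12 + 3 * c3 * L / 2)    # from w'(L) = 0
    return [0, 0, c2, c3, q / 24]

L, q, S = Fraction(2), Fraction(3), Fraction(5)
w = beam_solution(q, S, L)
w2 = poly_diff(poly_diff(w))                    # second derivative w''

def lhs(v):
    """a(w, v): integral of w'' v'' over (0, L)."""
    return integrate(poly_mul(w2, poly_diff(poly_diff(v))), L)

def rhs(v):
    """<l, v>: integral of q v over (0, L) plus S * v(L)."""
    return q * integrate(v, L) + S * poly_eval(v, L)

# Two test functions with v(0) = v'(0) = v'(L) = 0.
v1 = [0, 0, 3 * L, -2]                          # 3 L x^2 - 2 x^3, with v1(L) = L^3
v2 = [0, 0, L ** 2, -2 * L, 1]                  # x^2 (L - x)^2, with v2(L) = 0
```

The first test function does not vanish at x = L, so it exercises the shear
term; the second vanishes there and tests the interior terms alone.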

9.3 Existence, uniqueness, and regularity of solutions

Existence and uniqueness of solutions to VBVPs. Earlier, in Chapter 8,
we discussed the conditions under which solutions to regularly elliptic
BVPs exist and are unique. The results there apply, of course, to what is
referred to as the classical formulation, that consists of a PDE and a
collection of homogeneous boundary conditions.
Now in much the same way we wish to know the conditions under which
a unique solution to the corresponding variational boundary value problem
may be found. Just as the issues of existence and uniqueness of the solution
to (8.57) depend on various properties of the differential operators $A$ and
$B_0, \ldots, B_{m-1}$, in the case of VBVPs these issues can be expected to be tied
closely to properties of the bilinear form $a(\cdot, \cdot)$ and the linear
functional $\ell$.
It turns out that there is exactly one solution to a VBVP of the form
(9.16) provided that $\ell$ is continuous and provided that $a$ is continuous and
$V$-elliptic: recall from Section 5.5 that a bilinear operator $a$ is continuous
if there is a constant $M > 0$ such that

\[ |a(u, v)| \le M\,\|u\|_V\,\|v\|_V \quad\text{for all } u, v \in V, \]

and $V$-elliptic if there is a constant $\alpha > 0$ such that

\[ a(v, v) \ge \alpha\,\|v\|_V^2 \quad\text{for all } v \in V, \tag{9.26} \]

$V$ being the space of admissible functions and $\|\cdot\|_V$ the norm on this space.
Without further ado we present the basic existence and uniqueness theorem
for VBVPs, after which a few specific examples are considered.

THEOREM 1. Let $V$ be a Hilbert space and let $a(\cdot,\cdot) : V \times V \to \mathbb{R}$ be a continuous, $V$-elliptic bilinear form on $V$. Furthermore, let $\ell : V \to \mathbb{R}$ be a continuous linear functional on $V$. Then

(i) the VBVP of finding $u \in V$ that satisfies

\[ a(u,v) = \langle \ell, v \rangle \quad \text{for all } v \in V, \tag{9.27} \]

has one and only one solution;

(ii) the solution depends continuously on the data, in the sense that

\[ \|u\|_V \le \frac{1}{\alpha} \|\ell\|_{V'}, \tag{9.28} \]

where $\|\cdot\|_{V'}$ is the norm in the dual space $V'$ of $V$ and $\alpha$ is the constant in (9.26).

PROOF. The proof of this theorem follows from the Lax-Milgram theorem (Theorem 5.13). Since $a$ is continuous and $V$-elliptic, every bounded linear functional, and in particular the functional $\ell$, can be expressed in the form

\[ \langle \ell, v \rangle = a(u,v), \]

where $u$ is unique. This proves Part (i); Part (ii) follows by setting $v = u$ in (9.26) and using (9.27). This gives

\[ \alpha \|u\|_V^2 \le a(u,u) = \langle \ell, u \rangle \le \|\ell\|_{V'} \|u\|_V, \]

the last inequality coming from the fact that $\ell$ is bounded. Dividing throughout by $\|u\|_V$ (the case $u = 0$ being trivial), we obtain (9.28). □

Recall from the discussion in Section 8.5 the significance of a result such as (9.28). This inequality assures us that a small change in the functional $\ell$ leads to a correspondingly small change in the solution.
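The estimate (9.28) can be checked by hand in a finite-dimensional setting. The following Python sketch is an illustration only and is not taken from the text: it assumes $V = \mathbb{R}^2$ with the Euclidean norm and a hypothetical nonsymmetric matrix `A`, so that $a(u,v) = v \cdot Au$ and $\langle \ell, v \rangle = b \cdot v$. For this choice $a(v,v) = 2|v|^2$, so $\alpha = 2$, and $\|\ell\|_{V'} = |b|$. The names `A`, `b`, `bilinear`, and `solve` are assumptions made for the example.

```python
import math

# Hypothetical nonsymmetric matrix defining a(u, v) = v . (A u).
A = [[2.0, 1.0],
     [-1.0, 2.0]]

def bilinear(u, v):
    """Evaluate a(u, v) = v^T A u."""
    Au = [A[0][0] * u[0] + A[0][1] * u[1],
          A[1][0] * u[0] + A[1][1] * u[1]]
    return v[0] * Au[0] + v[1] * Au[1]

def solve(b):
    """Solve A u = b by Cramer's rule, i.e. a(u, v) = b . v for all v."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def norm(x):
    """Euclidean norm on R^2."""
    return math.hypot(x[0], x[1])

alpha = 2.0          # ellipticity constant: the skew part of A cancels in a(v, v)
b = [1.0, 3.0]       # data defining the functional <l, v> = b . v
u = solve(b)
print(norm(u), norm(b) / alpha)   # left value should not exceed the right one
```

Symmetry of the form is not needed here, which mirrors the fact that Theorem 1 rests on the Lax-Milgram theorem rather than on a minimization argument.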
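The inequality (9.28) may be expressed in an alternative form if $\ell$ is given by

\[ \langle \ell, v \rangle = \int_\Omega f v \, dx + \int_\Gamma g v \, ds, \]

where $f$ is in $L^2(\Omega)$ and $g \in L^2(\Gamma)$ (as in (9.18)); for then we have

\[
\begin{aligned}
\alpha \|u\|_V^2 \le a(u,u) = \langle \ell, u \rangle &= (f,u)_{L^2(\Omega)} + (g,u)_{L^2(\Gamma)} \\
&\le \|f\|_{L^2(\Omega)} \|u\|_{L^2(\Omega)} + \|g\|_{L^2(\Gamma)} \|u\|_{L^2(\Gamma)} \\
&\le c\, \|u\|_{H^1(\Omega)} \left( \|f\|_{L^2(\Omega)} + \|g\|_{L^2(\Gamma)} \right)
\end{aligned}
\]

using the Cauchy-Schwarz inequality and the trace theorem, $c$ being a constant. Since $V$ is a subspace of a Sobolev space $H^m(\Omega)$, the norm $\|\cdot\|_V$ is the $H^m$-norm, and of course $\|u\|_{L^2} \le \|u\|_{H^1} \le \|u\|_V$. Hence

\[ \|u\|_V \le C \left( \|f\|_{L^2(\Omega)} + \|g\|_{L^2(\Gamma)} \right), \qquad C = c/\alpha. \tag{9.29} \]

We now show how Theorem 1 is applied to actual problems.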

Examples
5. Consider the BVP

\[ -\nabla^2 u + ku = f \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \Gamma, \tag{9.30} \]

where $k(x)$ is continuous on $\Omega$. We assume first of all that

\[ 0 < M_1 \le k(x) \le M_2 \tag{9.31} \]

for some constants $M_1$, $M_2$. The VBVP corresponding to (9.30) is: find $u \in H_0^1(\Omega)$ that satisfies

\[ \underbrace{\int_\Omega (\nabla u \cdot \nabla v + kuv) \, dx}_{a(u,v)} = \underbrace{\int_\Omega f v \, dx}_{\langle \ell, v \rangle}, \qquad v \in H_0^1(\Omega). \tag{9.32} \]

If it can be shown that $\ell$ is continuous and that $a$ is both continuous and $H_0^1$-elliptic, then we are guaranteed the existence of a unique solution to (9.32). First, $\ell$ is continuous since

\[
\begin{aligned}
|\langle \ell, v \rangle| = \left| \int_\Omega f v \, dx \right| &\le \|f\|_{L^2} \|v\|_{L^2} && \text{(Cauchy-Schwarz inequality)} \\
&\le \|f\|_{L^2} \|v\|_{H^1};
\end{aligned}
\]

if we set $\|f\|_{L^2} = K$, then $|\langle \ell, v \rangle| \le K \|v\|_{H^1}$ and so $\ell$ is bounded, and hence continuous.

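Next, $a$ is continuous since

\[
\begin{aligned}
|a(u,v)| &= \left| \int_\Omega (\nabla u \cdot \nabla v + kuv) \, dx \right| \\
&\le \left| \int_\Omega \nabla u \cdot \nabla v \, dx \right| + \left| \int_\Omega kuv \, dx \right| && \text{(triangle inequality)} \\
&\le \sum_{i=1}^n \left| \left( \frac{\partial u}{\partial x_i}, \frac{\partial v}{\partial x_i} \right)_{L^2} \right| + M_2 \left| (|u|,|v|)_{L^2} \right| && \text{(from (9.31))} \\
&\le \sum_{i=1}^n \left\| \frac{\partial u}{\partial x_i} \right\|_{L^2} \left\| \frac{\partial v}{\partial x_i} \right\|_{L^2} + M_2 \|u\|_{L^2} \|v\|_{L^2} \\
&\le M \left[ \sum_{i=1}^n \left\| \frac{\partial u}{\partial x_i} \right\|_{L^2} \left\| \frac{\partial v}{\partial x_i} \right\|_{L^2} + \|u\|_{L^2} \|v\|_{L^2} \right] \\
&\le M \|u\|_{H^1} \|v\|_{H^1},
\end{aligned}
\]

in which $M = \max\{1, M_2\}$. The last line is arrived at by defining the vector $\mathbf{u} = (\|\partial u/\partial x_1\|_{L^2}, \ldots, \|\partial u/\partial x_n\|_{L^2}, \|u\|_{L^2})$ in $\mathbb{R}^{n+1}$, and by defining similarly a vector $\mathbf{v}$. Then the penultimate line is simply $M \mathbf{u} \cdot \mathbf{v}$, and from the Cauchy-Schwarz inequality, this is bounded above by $M |\mathbf{u}| |\mathbf{v}|$; the observation that $|\mathbf{u}| = \|u\|_{H^1}$ (similarly for $\mathbf{v}$) yields the last line. Thus $a$ is continuous.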

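Finally, to show that $a$ is $H_0^1$-elliptic, consider

\[
\begin{aligned}
a(v,v) &= \int_\Omega (\nabla v \cdot \nabla v + k v^2) \, dx \\
&\ge \int_\Omega (\nabla v \cdot \nabla v + M_1 v^2) \, dx && \text{(from (9.31))} \\
&\ge \begin{cases} M_1 \displaystyle\int_\Omega (\nabla v \cdot \nabla v + v^2) \, dx & \text{if } M_1 < 1, \\[1ex] \displaystyle\int_\Omega (\nabla v \cdot \nabla v + v^2) \, dx & \text{if } M_1 \ge 1. \end{cases}
\end{aligned}
\]

In other words,

\[ a(v,v) \ge \alpha \|v\|_{H^1}^2, \]

where $\alpha = \min(1, M_1)$. Thus $a$ is $H_0^1$-elliptic. All requirements are met, and so a unique solution to (9.32) exists.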
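Although the example above is concerned only with existence and uniqueness, the VBVP (9.32) is also the natural starting point for computation. The sketch below is a hedged illustration rather than part of the text: it assumes $\Omega = (0,1)$, a constant coefficient $k$, piecewise-linear "hat" trial functions on a uniform mesh, and a simple lumped quadrature $h f(x_i)$ for the load integral. The function names and the manufactured solution $u = \sin \pi x$ are assumptions made for the example.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] couples row i to i-1, upper[i] to i+1."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[n - 1] = d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def galerkin_1d(n, k, f):
    """Galerkin solution of -u'' + k u = f on (0,1), u(0) = u(1) = 0,
    using n elements and hat functions phi_i at the interior nodes."""
    h = 1.0 / n
    m = n - 1                                   # number of interior nodes
    nodes = [(i + 1) * h for i in range(m)]
    diag = [2.0 / h + k * 2.0 * h / 3.0] * m    # a(phi_i, phi_i)
    off = [-1.0 / h + k * h / 6.0] * m          # a(phi_i, phi_{i+1})
    rhs = [h * f(x) for x in nodes]             # lumped approximation of <l, phi_i>
    return nodes, solve_tridiagonal(off, diag, off, rhs)

k = 1.0
f = lambda x: (math.pi ** 2 + k) * math.sin(math.pi * x)   # so that u = sin(pi x)
nodes, uh = galerkin_1d(100, k, f)
err = max(abs(ui - math.sin(math.pi * xi)) for xi, ui in zip(nodes, uh))
print(err)
```

The matrix assembled here is symmetric and positive definite precisely because $a$ is symmetric and $H_0^1$-elliptic, which is why the elimination needs no pivoting.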

6. Consider once again the BVP (9.30), but this time assume only that $k(x)$ is nonnegative and bounded above, so that

\[ 0 \le k(x) \le M_2 \]

(note that the Poisson equation $-\nabla^2 u = f$ is a particular example of this situation). Continuity of $\ell$ and $a$ follows from the same arguments as those used in Example 5, but the proof of $H_0^1$-ellipticity does not apply here since we made use in Example 5 of the condition $k(x) \ge M_1$. In order to show that $a$ is $H_0^1$-elliptic, we exploit the Poincaré-Friedrichs inequality (7.34); from (9.32) we have

\[ a(v,v) = \int_\Omega (\nabla v \cdot \nabla v + k v^2) \, dx \ge \int_\Omega \nabla v \cdot \nabla v \, dx \]

and (7.34) gives

\[ (C + 1) \int_\Omega \nabla v \cdot \nabla v \, dx \ge \int_\Omega (v^2 + \nabla v \cdot \nabla v) \, dx = \|v\|_{H^1}^2 \]

so that

\[ a(v,v) \ge \frac{1}{C + 1} \|v\|_{H^1}^2 \]

and $a$ is $H_0^1$-elliptic.
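As a concrete instance, assume that (7.34) has the form $\int_\Omega v^2 \, dx \le C \int_\Omega |\nabla v|^2 \, dx$, which is how it is used above, and take $\Omega = (0,1)$. The best constant for $H_0^1(0,1)$ is known to be $C = 1/\pi^2$, attained by $v = \sin \pi x$, so the argument of this example yields an explicit ellipticity constant:

```latex
\[
\int_0^1 v^2 \, dx \le \frac{1}{\pi^2} \int_0^1 (v')^2 \, dx
\quad \Longrightarrow \quad
a(v,v) \ge \int_0^1 (v')^2 \, dx
\ge \frac{1}{1 + 1/\pi^2} \, \|v\|_{H^1}^2
= \frac{\pi^2}{\pi^2 + 1} \, \|v\|_{H^1}^2
\approx 0.908 \, \|v\|_{H^1}^2 .
\]
```

The constant depends on the domain only through $C$; a larger domain gives a larger $C$ and hence a smaller ellipticity constant.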

7. Returning to Example 3, which concerns the deformation of an elastic bar, we consider the case of an isotropic material for which $C$ is given by (8.9). The bilinear form $a(\cdot,\cdot)$ is defined in (9.25); first consider the question of its continuity. Using (8.9) we have

\[
\begin{aligned}
|a(u,v)| &= \left| \int_\Omega \sum_{i,j,k,l=1}^n C_{ijkl}\, \epsilon_{ij}(u)\, \epsilon_{kl}(v) \, dx \right| \\
&= \left| \int_\Omega [\lambda (\operatorname{div} u)(\operatorname{div} v) + 2\mu\, \epsilon(u) \cdot \epsilon(v)] \, dx \right| \\
&= \left| \lambda (\operatorname{div} u, \operatorname{div} v)_{L^2} + 2\mu (\epsilon(u), \epsilon(v))_{L^2} \right| \\
&\le \lambda \|\operatorname{div} u\|_{L^2} \|\operatorname{div} v\|_{L^2} + 2\mu \|\epsilon(u)\|_{L^2} \|\epsilon(v)\|_{L^2}.
\end{aligned}
\]

Given that all the norms are of various combinations of first derivatives of $u$ and $v$, it follows that there is a constant $M > 0$ such that

\[ |a(u,v)| \le M \|u\|_{H^1} \|v\|_{H^1}. \]

The proof of $V$-ellipticity follows very closely the arguments in Chapter 8, Example 32; indeed, from (8.66) and Korn's inequality (8.67) we see that the bilinear form is $V$-elliptic provided that the part $\Gamma_1$ of the boundary on which $u = 0$ is not empty. Thus this problem has a unique solution.

8. Consider next the problem of an elastic plate whose boundary is rigidly clamped; the problem is thus one of finding $w$ that satisfies

\[ \cdots \ \text{ in } \Omega, \qquad w = 0, \quad \frac{\partial w}{\partial \nu} = 0 \ \text{ on } \Gamma. \]

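Both boundary conditions are essential, and so the space of admissible functions is $V = H_0^2(\Omega)$, whereas the bilinear form $a(\cdot,\cdot)$ and linear functional $\ell$ are as in (9.24).
To show that $a$ is continuous, consider the first term in the bilinear form. We have, using the Cauchy-Schwarz inequality and the definition of the $H^2$-norm,

\[ \left| \int_\Omega \frac{\partial^2 w}{\partial x^2} \frac{\partial^2 v}{\partial x^2} \, dx \right| \le \left\| \frac{\partial^2 w}{\partial x^2} \right\|_{L^2} \left\| \frac{\partial^2 v}{\partial x^2} \right\|_{L^2} \le \|w\|_{H^2} \|v\|_{H^2}. \]

The remaining two terms in the integrand may be bounded similarly, and it is then trivial to show that $|a(w,v)| \le 4 \|w\|_{H^2} \|v\|_{H^2}$.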

To show that $a$ is $V$-elliptic we simply note that

\[ a(v,v) \ge \sum_{|\alpha| = 2} \int_\Omega (D^\alpha v)^2 \, dx, \]

and the desired bound is obtained by applying (7.36). Continuity of the linear functional is easy to prove, so we find that the problem of a plate that is clamped all around its boundary has a unique solution.

9. An example of a problem that does not have a unique solution is the BVP

\[ -\nabla^2 u = f \ \text{ in } \Omega, \qquad \frac{\partial u}{\partial \nu} = 0 \ \text{ on } \Gamma. \tag{9.33} \]

The corresponding VBVP is: find $u \in H^1(\Omega)$ such that

\[ \int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx \tag{9.34} \]

for all $v \in H^1(\Omega)$. Thus (9.34) is similar in many respects to Example 5, one exception being that the space of admissible functions is $H^1(\Omega)$ and not $H_0^1(\Omega)$. As before, $\ell$ and $a$ are continuous; but $a$ is not $H^1$-elliptic: indeed, the inequality

\[ a(v,v) \ge \alpha \|v\|_{H^1}^2 \]

ceases to hold for any nonzero function $v$ that is constant (for which case $a(v,v) = 0$). Hence, although the lack of $H^1$-ellipticity does not necessarily mean that a unique solution does not exist (Theorem 1 gives sufficient conditions for the existence of a solution; if these conditions are not satisfied, it does not imply nonexistence or nonuniqueness), we are unable to guarantee the existence of a unique solution.
Now the problem (9.33) has in fact been treated previously, in Chapter 8, Example 31; indeed, recall that in order for a unique solution to exist, it was necessary and sufficient that the data $f$ and the solution $u$ satisfy

\[ \int_\Omega f \, dx = 0 \quad \text{and} \quad \int_\Omega u \, dx = 0. \tag{9.35} \]

The reason why we cannot prove existence of a unique solution to the corresponding VBVP (9.34) is essentially that the conditions (9.35) are not built into the statement of (9.34) as it stands. First of all, the space $H^1(\Omega)$ in which $u$ is sought is too large; for uniqueness we must restrict attention to the subspace of $H^1(\Omega)$ consisting of elements orthogonal to constants, that is, elements satisfying (9.35)$_2$. Second, the compatibility condition (9.35)$_1$ is a necessary condition for existence of a solution, since we get from (9.34), setting $v = c$, a constant,

\[ 0 = \int_\Omega f c \, dx \quad \text{or} \quad \int_\Omega f \, dx = 0. \]

We show that (9.35)$_1$ is in fact also a sufficient condition for existence, as in Section 8.5.

THEOREM 2. Let $V$ be a closed subspace of $H^m(\Omega)$, let $a$ be a continuous bilinear form on $V \times V$, and $\ell$ a continuous linear functional on $V$. Let $P$ be a closed subspace of $V$ such that

\[ a(u + p, v + \bar{p}) = a(u,v) \quad \text{for all } u, v \in V \text{ and } p, \bar{p} \in P. \tag{9.36} \]

Also, denote by $Q$ the subspace of $V$ consisting of functions orthogonal to $P$ in the $L^2$ inner product; that is,

\[ Q = \left\{ v \in V : \int_\Omega v p \, dx = 0 \text{ for all } p \in P \right\}, \]

and assume that $a(\cdot,\cdot)$ is $Q$-elliptic: there is a constant $\alpha > 0$ such that

\[ a(q,q) \ge \alpha \|q\|_V^2 \quad \text{for } q \in Q, \]

the norm on $Q$ being the same as that on $V$. Then

(i) there exists a unique solution to the problem of finding $u \in Q$ such that

\[ a(u,v) = \langle \ell, v \rangle \quad \text{for all } v \in V \tag{9.37} \]

if and only if the compatibility condition

\[ \langle \ell, p \rangle = 0 \quad \text{for } p \in P \tag{9.38} \]

holds;
(ii) (continuous dependence on the data) the solution $u$ satisfies an estimate of the form (9.28), namely $\|u\|_V \le \alpha^{-1} \|\ell\|_{V'}$.

REMARK. Note that Theorem 1 is a special case for which $P = \{0\}$, so that $Q = V$. Also, observe that for Example 9 we have $V = H^1(\Omega)$, $P = P_0$, the set of constant functions, and $Q$ consists of functions satisfying (9.35)$_2$, whereas (9.38) reduces to (9.35)$_1$.

PROOF. First we show the necessity of (9.38). Assuming that (9.37) holds, we have from (9.36),

\[ a(u, v + p) = a(u,v) = \langle \ell, v + p \rangle \]

so that, subtracting this from (9.37) and using the linearity of $\ell$, (9.38) follows.
Conversely, assume that (9.38) holds; we want to show that (9.37) has exactly one solution. This we do by showing first that (9.37) is equivalent to the problem of finding $u \in Q$ such that

\[ a(u,q) = \langle \ell, q \rangle \quad \text{for all } q \in Q. \tag{9.39} \]

Once this has been done, the proof follows that in Theorem 1, since $a(\cdot,\cdot)$ is continuous and $Q$-elliptic and, furthermore, it is readily shown that $Q$ is a closed subspace and therefore complete in the $H^m$-norm.
Now since $P$ is closed in $V$, which is in turn closed in $H^m(\Omega)$ and hence also in $L^2(\Omega)$, we have by Theorem 8 of Chapter 4 that

\[ V = P \oplus Q. \]

Hence we can write any $v \in V$ uniquely in the form $v = p + q$ for $p \in P$ and $q \in Q$. Since (9.38) is assumed to hold we have, from (9.36) and (9.38),

\[ a(u,p) = 0 \quad \text{and} \quad \langle \ell, p \rangle = 0 \]

so that, if $u \in Q$ is a solution of (9.39), then

\[ a(u,q) + a(u,p) = \langle \ell, q \rangle + \langle \ell, p \rangle \]

or, with $p + q = v$,

\[ a(u,v) = \langle \ell, v \rangle \quad \text{for all } v \in V. \]

Hence a solution of (9.39) is also a solution of (9.37). Conversely, (9.37) holds a fortiori for all $v \in Q$ since $Q$ is in $V$. Hence a solution of (9.37) is also a solution of (9.39), so that problems (9.37) and (9.39) are equivalent.
The remainder of the proof of Part (i), as well as the proof of Part (ii), now follow in much the same way as that of Theorem 1. □
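The roles of the compatibility condition (9.38) and of the restricted space $Q$ are easy to see in a small discrete analogue. The sketch below is an illustration under stated assumptions and is not taken from the text: it replaces $a(u,v) = \int u'v' \, dx$ on $H^1$ by the path-graph Laplacian matrix $K$ (an assumption of this example), whose null space consists of the constant vectors, so that $P$ corresponds to constants, (9.38) to `sum(b) == 0`, and $Q$ to the vectors of zero mean. The names `neumann_chain_solve` and `apply_K` are hypothetical.

```python
def neumann_chain_solve(b, tol=1e-12):
    """Solve K u = b for the path-graph Laplacian K and return the
    solution of zero mean (the discrete counterpart of u in Q).

    K annihilates constants, so a solution exists only if sum(b) == 0."""
    if abs(sum(b)) > tol:
        raise ValueError("compatibility condition sum(b) = 0 is violated")
    n = len(b)
    u = [0.0] * n                      # fix u[0] = 0 to select one representative
    if n > 1:
        u[1] = u[0] - b[0]             # first row:  u0 - u1 = b0
    for i in range(1, n - 1):
        u[i + 1] = 2.0 * u[i] - u[i - 1] - b[i]   # interior rows
    mean = sum(u) / n
    return [ui - mean for ui in u]     # remove the component in P (constants)

def apply_K(u):
    """Multiply by the path-graph Laplacian (free ends, no boundary values fixed)."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i] - u[i - 1] if i > 0 else 0.0
        right = u[i] - u[i + 1] if i < n - 1 else 0.0
        out.append(left + right)
    return out

b = [1.0, -2.0, 0.5, 0.5]              # compatible data: entries sum to zero
u = neumann_chain_solve(b)
print(u, apply_K(u))
```

The last row of the system is never used in the recursion; it holds automatically because the rows of $K$ sum to zero and the data are compatible, which is the discrete form of the necessity argument in the proof.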

Example

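10. We return to (9.33), but consider instead the nonhomogeneous boundary condition

\[ \frac{\partial u}{\partial \nu} = g \quad \text{on } \Gamma. \]

The VBVP is now: find $u \in H^1(\Omega)$ such that

\[ \int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx + \int_\Gamma g v \, ds \quad \text{for all } v \in H^1(\Omega). \tag{9.40} \]

Here $V = H^1(\Omega)$. Of course, (9.40) does not have a unique solution as it stands; we note that

\[ a(u + p, v + \bar{p}) = \int_\Omega \nabla(u + p) \cdot \nabla(v + \bar{p}) \, dx = a(u,v) \quad \text{for all } p, \bar{p} \in P_0, \]

where $P_0$ is the space of constant functions. Thus $P = P_0$ and the solution must be sought in the space

\[ Q = \left\{ q \in H^1(\Omega) : \int_\Omega q \, dx = 0 \right\}. \tag{9.41} \]

We know that $a(\cdot,\cdot)$ and $\langle \ell, \cdot \rangle$ are continuous; to show that $a$ is $Q$-elliptic we use the Poincaré inequality (7.19) with (9.41) to show that

\[ a(q,q) \ge \alpha \|q\|_{H^1}^2 \quad \text{for all } q \in Q. \]

Hence there exists a unique solution $u$ in $Q$ to (9.40) if and only if

\[ \int_\Omega f \, dx + \int_\Gamma g \, ds = 0. \]

Furthermore, we have

\[ \|u\|_{H^1} \le \frac{1}{\alpha} \|\ell\|_{V'}. \tag{9.42} \]

Since $\langle \ell, v \rangle$ is given by the right-hand side of (9.40) we can in fact express (9.42) in the alternative form (9.29).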

11. In Example 7, if the boundary condition is

\[ t = 0 \quad \text{on } \Gamma, \]

then $\Gamma_1 = \emptyset$ and this is the same situation as that encountered in Chapter 8, Example 33. Returning to Theorem 2, which holds for vector-valued functions with minor modifications, we first note that $V = [H^1(\Omega)]^3$, and the space of functions $P$ coincides with the set of rigid body displacements:

\[ P = \{ v : v(x) = a + b \times x, \ a, b \in \mathbb{R}^3 \}. \]

Thus $Q$, the space that is $L^2$-orthogonal to $P$, is given by

\[ Q = \left\{ u \in V : \int_\Omega u \cdot (a + b \times x) \, dx = 0 \text{ for all } a, b \in \mathbb{R}^3 \right\} \]

or, since $a$ and $b$ are arbitrary,

\[ Q = \left\{ u \in V : \int_\Omega u \, dx = 0, \ \int_\Omega u \times x \, dx = 0 \right\}. \]

Korn's inequality (8.67) holds on $Q$, and thus the bilinear form is $Q$-elliptic. According to Theorem 2, a unique solution exists in $Q$ provided that the compatibility condition (9.38) is satisfied: this is precisely the pair of conditions

\[ \int_\Omega f \, dx = 0 \quad \text{and} \quad \int_\Omega f \times x \, dx = 0 \]

encountered in Chapter 8, Example 33.

Regularity of solutions. A few words are in order regarding the regularity or degree of smoothness of the solution to a VBVP. Recall that we discussed this issue in Section 8.5 in the context of the conventional or classical formulation, and we showed there that the solution $u$ belongs to $H^s(\Omega)$ (or rather, a subspace $\tilde{H}^s(\Omega)$ of $H^s(\Omega)$) if the data $f$ are in $H^{s-2m}(\Omega)$. Here $s \ge 2m$.
In the context of variational boundary value problems we also showed earlier that, when the linear functional $\ell$ is of the form

\[ \langle \ell, v \rangle = \int_\Omega f v \, dx \tag{9.43} \]

with $f \in L^2(\Omega)$, then the problem of finding $u \in V$ that satisfies

\[ a(u,v) = \langle \ell, v \rangle, \quad v \in V \tag{9.44} \]

is equivalent to the problem of finding $u \in V$ that satisfies

\[ Au = f \tag{9.45} \]

for appropriate $A$. Thus, for a differential operator of order $2m$ we conclude that $u$ is "almost" in $H^{2m}(\Omega)$, in the sense that $u$ lies in the domain of an operator of order $2m$. If $A$ contains all derivatives of order $2m$, then $u$ in fact belongs to $H^{2m}(\Omega)$.
The preceding characterization of the smoothness of solutions of VBVPs can be strengthened considerably with the aid of the existence and uniqueness results given earlier in this section, and the theory of Section 8.5. Indeed, suppose that (9.43) through (9.45) hold, and that $a$ and $\ell$ satisfy the requirements for existence and uniqueness; since the solution $u$ is unique, we may conclude from (9.45) and from Theorem 1 of Chapter 8 that $u$ in fact belongs to $H^{2m}(\Omega)$, or rather the subspace of $H^{2m}(\Omega)$ consisting of functions that satisfy all of the homogeneous boundary conditions. Going one step further, if $f$ is even more smooth so that it lies in $H^{s-2m}(\Omega)$, say, with $s \ge 2m$, then we are assured that $u$ is in $H^s(\Omega)$. We summarize these observations in the following theorem.

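THEOREM 3. Let $\Omega$ be a smooth domain and let $u \in V$ be a solution of the VBVP

\[ a(u,v) = \int_\Omega f v \, dx, \quad v \in V, \]

where $V \subset H^m(\Omega)$. If $f \in H^{s-2m}(\Omega)$ with $s \ge 2m$, then $u \in H^s(\Omega)$ and the estimate

\[ \|u\|_{H^s} \le C \|f\|_{H^{s-2m}} \]

holds.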

9.4 Minimization of functionals


It should now be clear that an elliptic boundary value problem of the form

\[ Au = f \ \text{ in } \Omega, \qquad B_j u = g_j \ \text{ on } \Gamma \quad (j = 0, \ldots, m-1) \tag{9.46} \]

can be posed in the alternative form of a variational boundary value problem

\[ u \in V, \quad a(u,v) = \langle \ell, v \rangle, \quad v \in V. \tag{9.47} \]

The formulation (9.47) has been shown also to hold certain advantages over (9.46).
It turns out that if the bilinear form in (9.47) is symmetric, that is, if

\[ a(v,u) = a(u,v) \quad \text{for all } u, v \in V \]

(and this has been the case for the examples treated here), then a third formulation is possible: the variational boundary value problem is equivalent to a minimization problem, in which a function $u$ in $V$ is sought that renders the value of a functional $J : V \to \mathbb{R}$ a minimum; that is,

\[ J(u) \le J(v) \quad \text{for all } v \in V, \tag{9.48} \]

where $J$ is defined by

\[ J(v) = \tfrac{1}{2} a(v,v) - \langle \ell, v \rangle. \tag{9.49} \]



The formulation (9.48) is often taken as a starting point, particularly in physical problems for which $J(v)$ represents the energy of a system, and the solution of the problem is that which renders the energy a minimum. For example, returning to the membrane problem (9.1), the quantity $\frac{1}{2} a(v,v)$ represents the strain energy of the membrane, that is, the energy that the membrane possesses by virtue of its having sustained a displacement $v$. The quantity $-\int_\Omega f v \, dx$ represents the potential energy of the loading. Adding these two together, we obtain

\[ J(v) = \frac{1}{2} \int_\Omega |\nabla v|^2 \, dx - \int_\Omega f v \, dx, \]

the total potential energy of the system, whose minimum characterizes the solution; this minimum is sought in $H_0^1(\Omega)$.
Identical remarks apply to the elasticity problem; the total potential energy is again given by

\[ J(v) = \tfrac{1}{2} a(v,v) - \langle \ell, v \rangle \]

in which $a$ and $\ell$ are defined in (9.25). Of course, if the elastic body undergoes a rigid body displacement, then $a(v,v) = 0$, as would be expected; the body has undergone no deformation, and therefore possesses no strain energy.
The aim of this section, then, is to investigate the consequences of formulating problems such as (9.48), (9.49), and to show how these relate to VBVPs. Before embarking on this task, it is instructive to review the situation for functions of a single variable, wherein many of the ideas for general functionals are present.
Consider a $C^1$ function $f : \mathbb{R} \to \mathbb{R}$, and suppose that we wish to locate the minimum value of $f(x)$, as well as the point $x_0$ at which the minimum occurs. That is, we wish to find $x_0$ such that

\[ f(x_0) \le f(x) \quad \text{for all } x \in \mathbb{R}. \]

From elementary calculus we know that a necessary condition for $f$ to have a minimum at $x_0$ is that

\[ f'(x_0) = 0; \]

this condition is also satisfied at a maximum and at a horizontal point of inflection, so it is not sufficient on its own. The point $x_0$ is a minimum if the function $f$ is convex, that is, if a straight line drawn between any two points on the curve of $f(x)$ lies on or above the graph of the function. Mathematically, $f$ is convex if, for $0 < \theta < 1$,

\[ f(y + \theta(x - y)) \le f(y) + \theta (f(x) - f(y)) \]

or

\[ f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y). \]


FIGURE 9.4. Examples of a convex and a strictly convex function

The minimum may not be unique. For example, consider the function $g$ shown in Figure 9.4; the minimum value of $g(x)$ occurs at all points between $a$ and $b$. But for a strictly convex function, that is, one for which

\[ f(\theta x + (1 - \theta) y) < \theta f(x) + (1 - \theta) f(y), \quad 0 < \theta < 1, \ x \ne y, \]

the minimum is unique; this is the case for the function $f$ in Figure 9.4.
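The distinction between convexity and strict convexity can be explored numerically. The sketch below is illustrative only: the two functions are hypothetical stand-ins for the curves of Figure 9.4, chosen here as a function that is flat on $[-1, 1]$ (so $a = -1$, $b = 1$) and the parabola $x^2$. The helper `convexity_gap` is an assumption of this example; it returns the right-hand side minus the left-hand side of the convexity inequality, which is nonnegative for a convex function.

```python
def g(x):
    """Convex but not strictly convex: zero on [-1, 1], quadratic outside."""
    return max(0.0, abs(x) - 1.0) ** 2

def f(x):
    """Strictly convex."""
    return x * x

def convexity_gap(func, x, y, theta):
    """theta*func(x) + (1-theta)*func(y) - func(theta*x + (1-theta)*y)."""
    return (theta * func(x) + (1.0 - theta) * func(y)
            - func(theta * x + (1.0 - theta) * y))

# On the flat part of g the inequality holds with equality, so the
# minimizer is not unique; for f the gap is strictly positive.
print(convexity_gap(g, -1.0, 1.0, 0.5), convexity_gap(f, -1.0, 1.0, 0.5))
```

A sampled check of this kind can refute convexity by exhibiting a negative gap, but it cannot prove it; the proof for a given functional must still be carried out analytically.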
Remarkably, all of these ideas extend in a very simple way to functionals defined on arbitrary spaces. In order to see how this is done we first introduce the required generalizations of convex functions and their derivatives.

Convex functionals. Let $J : V \to \mathbb{R}$ be a functional defined on a vector space $V$. Then $J$ is said to be convex if

\[ J(\theta u + (1 - \theta) v) \le \theta J(u) + (1 - \theta) J(v), \]

and strictly convex if

\[ J(\theta u + (1 - \theta) v) < \theta J(u) + (1 - \theta) J(v), \]

for all $u, v \in V$ with $u \ne v$, and for $0 < \theta < 1$.

Gateaux derivative. A functional $J : V \to \mathbb{R}$ on a normed space $V$ is said to be Gateaux-differentiable, or simply differentiable, at $u \in V$ if there exists an operator $DJ : V \to V'$ defined by

\[ \langle DJ(u), v \rangle = \lim_{\theta \to 0} \theta^{-1} [J(u + \theta v) - J(u)] \tag{9.50} \]

for all $v \in V$. Equivalently, $DJ$ is defined by

\[ \langle DJ(u), v \rangle = \frac{d}{d\theta} [J(u + \theta v)]_{\theta = 0} \tag{9.51} \]

(see Exercise 9.12). The operator $DJ$ is called the gradient of $J$ and $DJ(u) : V \to \mathbb{R}$ is the (Gateaux) derivative of $J$ at $u$. Observe from (9.50) or (9.51) that $DJ$ maps $V$ to its dual space $V'$, so that $DJ(u)$ is required to be a bounded linear functional on $V$.
The Gateaux derivative does not always exist; it may be verified, for example, that if $J$ is defined by $J : \mathbb{R}^2 \to \mathbb{R}$,

then

lim 8- 1 [J(x
IJ~O
+ 8y) - J(x)] = yUY2'

which is not linear in y.

Examples

12. If $V$ is an interval in $\mathbb{R}$, then we see that $DJ$ reduces to the conventional derivative: $DJ(x) = dJ/dx$. Furthermore, if $V \subset \mathbb{R}^n$, then according to (9.50) or (9.51) we have (noting that $(\mathbb{R}^n)' = \mathbb{R}^n$)

\[ \langle DJ(x), y \rangle = \sum_{i=1}^n \frac{\partial J}{\partial x_i} y_i = \nabla J \cdot y; \]

that is, the Gateaux derivative is the directional derivative (see Exercise 9.13).
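The identification of the Gateaux derivative with the directional derivative in $\mathbb{R}^n$ is readily verified numerically. The sketch below is an illustration, not part of the text: the functional `J` on $\mathbb{R}^2$ and its hand-computed gradient `grad_J` are hypothetical, and the limit in (9.50) is approximated by a central difference quotient (a substitute for the one-sided quotient of (9.50), chosen here only for better accuracy).

```python
def J(x):
    """Hypothetical smooth functional on R^2: x0^2 + 3 x0 x1 + x1^4."""
    return x[0] ** 2 + 3.0 * x[0] * x[1] + x[1] ** 4

def grad_J(x):
    """Gradient of J computed by hand."""
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0] + 4.0 * x[1] ** 3]

def gateaux(func, x, y, theta=1e-6):
    """Central difference approximation of <DJ(x), y>."""
    xp = [xi + theta * yi for xi, yi in zip(x, y)]
    xm = [xi - theta * yi for xi, yi in zip(x, y)]
    return (func(xp) - func(xm)) / (2.0 * theta)

x = [1.0, -2.0]
y = [0.5, 0.25]
exact = sum(gi * yi for gi, yi in zip(grad_J(x), y))   # grad J . y
approx = gateaux(J, x, y)
print(exact, approx)
```

The same difference quotient applies unchanged to a discretized functional on a function space, where `x` and `y` would hold nodal values of $u$ and $v$.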

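13. Let $J : H^1(\Omega) \to \mathbb{R}$ be defined by

\[ J(v) = \frac{1}{2} \int_\Omega \nabla v \cdot \nabla v \, dx - \int_\Omega f v \, dx. \]

Then $J$ is convex: indeed,

\[
\begin{aligned}
J(\theta u + (1 - \theta) v) &= \tfrac{1}{2} \theta^2 \int_\Omega |\nabla u|^2 \, dx + \tfrac{1}{2} (1 - \theta)^2 \int_\Omega |\nabla v|^2 \, dx \\
&\quad + \theta (1 - \theta) \int_\Omega \nabla u \cdot \nabla v \, dx - \theta \int_\Omega f u \, dx - (1 - \theta) \int_\Omega f v \, dx.
\end{aligned}
\]

Now $\int_\Omega (\nabla u - \nabla v) \cdot (\nabla u - \nabla v) \, dx \ge 0$ for $v \ne u$ (equality occurring for the case in which $u - v$ is a constant function), so that

\[ \int_\Omega \nabla u \cdot \nabla v \, dx \le \frac{1}{2} \int_\Omega (|\nabla u|^2 + |\nabla v|^2) \, dx. \]

Hence

\[
\begin{aligned}
J(\theta u + (1 - \theta) v) &\le \tfrac{1}{2} \theta^2 \int_\Omega |\nabla u|^2 \, dx + \tfrac{1}{2} (1 - \theta)^2 \int_\Omega |\nabla v|^2 \, dx \\
&\quad + \tfrac{1}{2} \theta (1 - \theta) \int_\Omega (|\nabla u|^2 + |\nabla v|^2) \, dx - \theta \int_\Omega f u \, dx - (1 - \theta) \int_\Omega f v \, dx \\
&= \theta J(u) + (1 - \theta) J(v).
\end{aligned}
\]

We also observe that $J$ is strictly convex on $H_0^1(\Omega)$, since the only constant function in $H_0^1(\Omega)$ is $u = 0$.
To find the derivative of $J$ we use

\[
\begin{aligned}
\langle DJ(u), v \rangle &= \frac{d}{d\theta} \left[ \int_\Omega \left[ \tfrac{1}{2} (|\nabla u|^2 + 2\theta \nabla u \cdot \nabla v + \theta^2 |\nabla v|^2) - f(u + \theta v) \right] dx \right]_{\theta = 0} \\
&= \int_\Omega (\nabla u \cdot \nabla v + \theta |\nabla v|^2 - f v) \, dx \, \Big|_{\theta = 0}
\end{aligned}
\]

or

\[ \langle DJ(u), v \rangle = \int_\Omega (\nabla u \cdot \nabla v - f v) \, dx. \]

Note that $DJ$ is an operator from $H^1(\Omega)$ to $[H^1(\Omega)]'$, and so $DJ(u)$ is a bounded linear functional on $H^1(\Omega)$.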

We are now in a position to demonstrate the relationship between minimization problems and VBVPs, and start with the following fundamental result.

THEOREM 4. Let $J$ be a convex differentiable functional defined on a subspace $V$ of a normed space $X$. An element $u \in V$ is a solution of the minimization problem

\[ J(u) \le J(v) \quad \text{for all } v \in V \tag{9.52} \]

if and only if $u$ is a solution of the VBVP of finding $u \in V$ that satisfies

\[ \langle DJ(u), v \rangle = 0 \quad \text{for all } v \in V. \tag{9.53} \]

PROOF. We show first that (9.52) implies (9.53). Assume that (9.52) holds; then, replacing $v$ by $u + \theta v$ for any $v \in V$ and $\theta \in (0,1)$, we have

\[ J(u + \theta v) - J(u) \ge 0. \]

Dividing by $\theta$ and allowing $\theta$ to go to zero, we obtain

\[ \langle DJ(u), v \rangle \ge 0. \tag{9.54} \]

But $v$ is arbitrary, so (9.54) holds if we replace $v$ by $-v$. Using the linearity of $DJ(u)$ we get $\langle DJ(u), v \rangle \le 0$, and so $\langle DJ(u), v \rangle = 0$.
To show that (9.53) implies (9.52), we start with

\[ J(\theta v + (1 - \theta) u) = J(u + \theta (v - u)) \le \theta J(v) + (1 - \theta) J(u) = J(u) + \theta (J(v) - J(u)) \]

by the convexity of $J$. Hence

\[ J(v) - J(u) \ge \frac{J(u + \theta (v - u)) - J(u)}{\theta} \]

so that, as $\theta \to 0$,

\[ J(v) - J(u) \ge \langle DJ(u), v - u \rangle = 0; \]

hence (9.53) implies (9.52). □

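Examples

14. Suppose that $X = H^1(\Omega)$, $V = H_0^1(\Omega)$, and

\[ J(v) = \frac{1}{2} \int_\Omega \nabla v \cdot \nabla v \, dx - \int_\Omega f v \, dx. \tag{9.55} \]

We found $DJ(u)$ in the previous example, so it follows that the problem of finding $u \in H_0^1(\Omega)$ that minimizes (9.55) is equivalent to the problem of finding $u \in H_0^1(\Omega)$ that satisfies

\[ \langle DJ(u), v \rangle = 0 \quad \text{or} \quad \int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx \quad \text{for all } v \in H_0^1(\Omega). \tag{9.56} \]

We recognize (9.56) as a VBVP.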

15. The preceding example is just a special case of the general minimization problem that involves quadratic functionals of the form

\[ J : V \to \mathbb{R}, \quad J(v) = \tfrac{1}{2} a(v,v) - \langle \ell, v \rangle \tag{9.57} \]

in which $a(\cdot,\cdot)$ is a symmetric bilinear form on $V$ and $\ell$ is a linear functional on $V$. Here $V$ will generally be a subspace of a Sobolev space $H^m(\Omega)$, or perhaps of $[H^m(\Omega)]^n$ in the case of a problem such as that of elasticity. When $J$ takes the form (9.57) then we have

\[
\begin{aligned}
\langle DJ(u), v \rangle &= \lim_{\theta \to 0} \theta^{-1} \left[ \tfrac{1}{2} a(u + \theta v, u + \theta v) - \langle \ell, u + \theta v \rangle - \tfrac{1}{2} a(u,u) + \langle \ell, u \rangle \right] \\
&= \lim_{\theta \to 0} \theta^{-1} \left[ \theta (a(u,v) - \langle \ell, v \rangle) + \tfrac{1}{2} \theta^2 a(v,v) \right] \\
&= a(u,v) - \langle \ell, v \rangle
\end{aligned}
\]

using the bilinearity and symmetry of $a(\cdot,\cdot)$ and the linearity of $\ell$. Hence the problem of minimizing (9.57) is equivalent to the VBVP of finding $u \in V$ satisfying

\[ a(u,v) = \langle \ell, v \rangle \quad \text{for all } v \in V, \tag{9.58} \]

assuming that $J(\cdot)$ is convex. That is not usually a problem; if $a(\cdot,\cdot)$ is $V$-elliptic, for example, then $J$ is strictly convex (see Exercise 9.14).
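The equivalence between minimizing (9.57) and solving (9.58) can be observed directly in finite dimensions. In the sketch below, which is an illustration under stated assumptions and not part of the text, $V = \mathbb{R}^2$, $a(u,v) = v \cdot Au$ with a hypothetical symmetric positive definite matrix `A`, and $\langle \ell, v \rangle = b \cdot v$. The functional is minimized by steepest descent with exact line search (one of many possible methods, chosen here for brevity), and the minimizer is then seen to satisfy $Au = b$.

```python
A = [[4.0, 1.0],
     [1.0, 3.0]]     # hypothetical symmetric positive definite matrix
b = [1.0, 2.0]       # hypothetical data

def matvec(M, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def J(v):
    """Quadratic functional J(v) = 1/2 a(v, v) - <l, v>."""
    return 0.5 * dot(v, matvec(A, v)) - dot(b, v)

def minimize(v, iters=200):
    """Steepest descent: the search direction r = b - A v is minus DJ(v)."""
    for _ in range(iters):
        r = [bi - avi for bi, avi in zip(b, matvec(A, v))]
        rr = dot(r, r)
        if rr < 1e-30:                      # first variation vanishes: stop
            break
        step = rr / dot(r, matvec(A, r))    # exact line search for a quadratic
        v = [vi + step * ri for vi, ri in zip(v, r)]
    return v

u = minimize([0.0, 0.0])
print(u, matvec(A, u))   # the second vector should reproduce b
```

The stopping test is exactly the discrete form of (9.58): the iteration ends when $a(u,v) - \langle \ell, v \rangle$ vanishes for every $v$.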
To summarize, then, any VBVP of the form (9.58) in which $a(\cdot,\cdot)$ is symmetric and $V$-elliptic is equivalent to the problem of minimizing the functional (9.57) and vice versa. This equivalence, as a matter of interest, explains the reason for the terminology "variational" in the expression "variational boundary value problem". The classical calculus of variations is concerned with the problem of minimizing functionals of a general nature and the expression $\langle DJ(u), v \rangle$ is known in that theory as the first variation of $J$. A necessary condition for a minimum is that the first variation vanish, that is, $\langle DJ(u), v \rangle = 0$, and this is what we call a variational BVP. It is important to note, though, that problems of the form (9.58) are referred to as VBVPs even if $a(\cdot,\cdot)$ is not symmetric, in which case there is no corresponding minimization problem.
We close this section with a theorem that gives conditions for the existence and uniqueness of solutions to minimization problems involving functionals of the form (9.57). Of course, existence and uniqueness could be discussed in terms of the equivalent VBVP (9.58), using the theory of Section 9.3. But for completeness we discuss problem (9.57) on its own, and show in fact that the requirements for well-posedness of Theorems 1 and 5 coincide.

THEOREM 5. Let $J : V \to \mathbb{R}$ be the functional given by (9.57), in which $V$ is a closed subspace of a Hilbert space $H$. Assume that $a(\cdot,\cdot)$ is bilinear, symmetric, continuous, and $V$-elliptic, and that $\ell$ is bounded and linear. Then the problem of finding $u \in V$ that minimizes $J(v)$ over all $v \in V$ has one and only one solution.

PROOF. We start by observing that $a(\cdot,\cdot)$ defines an inner product on $V$; indeed, if we write $a(u,v) \equiv (u,v)_a$, then

\[ (u,v)_a = (v,u)_a, \quad (\beta u + \gamma v, w)_a = \beta (u,w)_a + \gamma (v,w)_a, \]

and the positive-definiteness of $(\cdot,\cdot)_a$ follows from the continuity and $V$-ellipticity of $a$, in that

\[ \alpha \|u\|_V^2 \le (u,u)_a \le M \|u\|_V^2. \tag{9.59} \]

Thus $(u,u)_a \ge 0$ and $(u,u)_a = 0$ if and only if $u = 0$. Furthermore, the norm $\|u\|_a \equiv (u,u)_a^{1/2}$ is equivalent to the standard norm on $V$, as (9.59) indicates, so it follows that the space $V$ with the inner product $(\cdot,\cdot)_a$ is a Hilbert space.
We now apply the Riesz Representation Theorem using $(\cdot,\cdot)_a$: corresponding to the functional $\ell$ there exists $l \in V$ such that $\langle \ell, v \rangle = (l,v)_a$. Hence (9.57) reads

\[ J(v) = \tfrac{1}{2} (v,v)_a - (l,v)_a = \tfrac{1}{2} \|v - l\|_a^2 - \tfrac{1}{2} \|l\|_a^2, \tag{9.60} \]

where $\|\cdot\|_a$ is the norm generated by $(\cdot,\cdot)_a$. From (9.60) it is clear that the problem amounts to one of finding $u \in V$ such that

\[ \|u - l\|_a \le \|l - v\|_a \quad \text{for all } v \in V. \]

By Theorem 6 of Chapter 4 such an element exists and is unique. Indeed since $l \in V$ we have $u = l$. □
We remark in conclusion that Theorem 5 is equivalent to Theorem 1 when the bilinear form is symmetric, but that Theorem 1 alone is of use if $a(\cdot,\cdot)$ is nonsymmetric.

9.5 Bibliographical remarks


Good accounts and further examples of the theory covered in Sections 9.1 to 9.3 may be found in Dautray and Lions ([13], Chapter VII) and in Rektorys [41]. The discussion of non-V-elliptic problems that includes Theorem 2 is adapted from the treatment of this topic by Nečas [34] and Rektorys [41].
The discussion in Section 9.4 of the minimization of convex functionals has focused only on those aspects pertinent to our main goals. The subject is huge, and in itself contains many interesting applications of functional analysis. For more details the texts by Glowinski [16] and Zeidler [55] are good sources.
We have avoided discussion of problems such as (9.52) when $V$ is a convex subset but not a subspace. For example, consider the problem of finding a function that satisfies

\[
\begin{aligned}
-\nabla^2 u - f \ge 0, \quad u &\ge g, \\
(u - g)(-\nabla^2 u - f) &= 0, \\
u &= 0 \ \text{ on } \Gamma.
\end{aligned}
\]

This corresponds to the problem of finding the shape of a membrane stretched over an obstacle, as shown in Figure 9.5. This set of equations describes the fact that the membrane has to be above or on the obstacle ($u - g \ge 0$), and that the force acting on the membrane is either zero, when there is no contact, or positive, at those points at which there is contact ($-\nabla^2 u - f \ge 0$). The second equation indicates that these quantities cannot both be positive; one either has contact, in which case $u - g = 0$ and the net force is positive, or the membrane lies above the obstacle, in which case the net force is necessarily zero.
It can be shown (Exercise 9.17) that the corresponding VBVP is the variational inequality: find $u \in K$ such that

\[ \int_\Omega \nabla u \cdot \nabla (v - u) \, dx - \int_\Omega f (v - u) \, dx \ge 0 \quad \text{for all } v \in K, \tag{9.61} \]

where $K$ is the convex subset

\[ K = \{ v \in H_0^1(\Omega) : v \ge g \ \text{a.e. in } \Omega \} \tag{9.62} \]

(note that $K$ is not a subspace), and that the corresponding minimization problem is: find $u \in K$ such that

\[ J(u) \le J(v) = \frac{1}{2} \int_\Omega \nabla v \cdot \nabla v \, dx - \int_\Omega f v \, dx \]

for all $v \in K$. For a detailed account of variational inequalities see, for example, Baiocchi and Capelo [4] and Glowinski [16]. The book by Duvaut and Lions [15] is devoted to a thorough study of variational inequalities that arise in mechanics and physics.
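To make the complementarity structure of the obstacle problem concrete, the following sketch solves a one-dimensional finite-difference analogue. It is an illustration under stated assumptions and is not drawn from the references cited: the domain is taken as $(0,1)$, the load $f = -8$ and the flat obstacle $g = -0.5$ are hypothetical data, and the method used is projected Gauss-Seidel, in which each nodal value is first updated as in the unconstrained problem and then pushed back onto the constraint $u \ge g$.

```python
def obstacle_1d(n, f, g, sweeps=5000):
    """Projected Gauss-Seidel for the discrete obstacle problem on (0,1):
    -u'' - f >= 0, u >= g, (u - g)(-u'' - f) = 0, u(0) = u(1) = 0."""
    h = 1.0 / n
    x = [i * h for i in range(n + 1)]
    u = [max(0.0, g(xi)) for xi in x]       # feasible starting guess
    u[0] = u[n] = 0.0                        # boundary condition
    for _ in range(sweeps):
        for i in range(1, n):
            free = 0.5 * (u[i - 1] + u[i + 1] + h * h * f(x[i]))
            u[i] = max(g(x[i]), free)        # project onto the set K
    return x, u

def residual(x, u, f):
    """Discrete -u'' - f at the interior nodes (the net contact force)."""
    h = x[1] - x[0]
    return [(-u[i - 1] + 2.0 * u[i] - u[i + 1]) / (h * h) - f(x[i])
            for i in range(1, len(u) - 1)]

load = lambda s: -8.0        # uniform downward load (assumed)
obstacle = lambda s: -0.5    # flat obstacle below the membrane (assumed)
x, u = obstacle_1d(20, load, obstacle)
r = residual(x, u, load)
print(min(u), max(r))
```

At each interior node either the membrane touches the obstacle and the residual is positive, or the membrane is free and the residual vanishes, which is the discrete counterpart of the second equation above.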

FIGURE 9.5. A membrane stretched over an obstacle

9.6 Exercises
Formulation of variational boundary value problems

9.1. Formulate the VBVP corresponding to

\[ [k(x) u''(x)]'' - [d(x) u'(x)]' + c(x) u(x) = f(x) \ \text{ in } (0,1), \]
\[ u(0) = 0, \quad u'(0) = 0, \quad (ku'')(1) = \alpha, \quad [-(ku'')' + du'](1) = \beta. \]

9.2. Find the VBVP corresponding to

f
9 on r,

in which $\partial u / \partial \tau \equiv \nabla u \cdot \tau$ is the oblique directional derivative in the direction of the unit vector $\tau$, which is not generally tangential to the boundary $\Gamma$.

9.3. The VBVP for the plate problem may be derived in a manner that facilitates imposition of the natural boundary conditions, in the following way.

(a) Equations (8.15) imply that $\sum_{\alpha,\beta=1}^2 \partial^2 M_{\alpha\beta} / \partial x_\alpha \partial x_\beta = -q$. Multiply this equation by an arbitrary function $v$ and use Green's theorem to obtain the identity

(b) Assuming that the same conditions as in Example 2 hold, derive the VBVP simply by imposing the natural boundary condition $M_{11} = 0$ or $M_{22} = 0$ on $\Gamma$, and by defining $V$ as in the example. Show that the bilinear form becomes

How would you reconcile this expression with that given in (9.24)?

Existence, uniqueness, and regularity of solutions


9.4. Verify that the VBVP in Exercise 9.1 has a unique solution if the functions $k$, $d$, and $c$ are all strictly positive, with $k \in C^2[0,1]$, $c \in C[0,1]$, and $d \in C^1[0,1]$.
9.5. Consider the BVP for nonhomogeneous, anisotropic heat conduction with a temperature-dependent heat source (Chapter 8), viz.

\[ \cdots = f \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \Gamma; \]

the coefficients $k_{ij}$ of the thermal conductivity matrix are such that the operator is strongly elliptic. Derive the corresponding VBVP and show that the bilinear form is $V$-elliptic provided that $b(x) \ge 0$. Show also that the bilinear form is continuous provided that $|k_{ij}(x)| \le K$.
9.6. For a plate occupying a domain $\Omega$ with arbitrary, nonrectangular boundary $\Gamma$, it can be shown (see, for example, Rektorys [41], Chapter 23) that the moment acting on the boundary is given by $M_n = n^T M n = \sum_{\alpha,\beta=1}^2 M_{\alpha\beta} n_\alpha n_\beta$. In this exercise the unit normal is denoted by $n$, to avoid confusion with Poisson's ratio $\nu$.

Consider the problem of a plate that is simply supported on its boundary, so that the displacement and moment are zero along the boundary. Show that the moment boundary condition becomes

\[ \nu \nabla^2 w + (1 - \nu) \, \partial^2 w / \partial n^2 = 0 \quad \text{on } \Gamma. \]

If the second boundary condition is $w = 0$, show that the corresponding bilinear form (after an appropriate definition of the space $V$) is unchanged from that in Exercise 9.3. Show also that $a(\cdot,\cdot)$ is $V$-elliptic provided that $\nu$ lies in the range $0 \le \nu < 1$. [Use the inequality (7.33).]
9.7. Show that the bilinear form associated with the BVP

\[ -(pu')' + ru = f \ \text{ in } \Omega = (0,1), \qquad u(0) = 0, \quad u'(1) + u(1) = 0, \]

is $V$-elliptic and continuous; here $p$ and $r$ satisfy the usual conditions for the left-hand side to be a Sturm-Liouville operator.

9.8. Investigate the well-posedness of the problem of an elastic beam that has the set of boundary conditions

(a) as in Example 4;
(b) $u''(0) = g_0$, $u''(1) = g_1$, $u'''(0) = h_0$, $u'''(1) = h_1$.
9.9. Show that the bilinear form in Example 7 is $V$-elliptic provided that the Lamé constants satisfy the conditions given in Exercise 8.8.

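9.10. Derive the identity

\[ \sum_{i,j,k,l=1}^n C_{ijkl}\, \epsilon_{ij}(u)\, \epsilon_{kl}(v) = \lambda (\operatorname{div} u)(\operatorname{div} v) + 2\mu\, \epsilon(u) \cdot \epsilon(v) \]

used in Example 7.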

9.11. An elastic cylinder is subjected to a body force $f$ in its domain $\Omega = S \times (0,L)$, where $S = \{ (r,\theta) : 0 \le \theta < 2\pi, \ r < R \}$. The curved boundary $r = R$ is free of applied forces, and on its two ends the cylinder is restrained from axial displacement on the boundary, and a system of (ideally) frictionless bearings results in there being no tangential force there. Use Theorem 2 to investigate the conditions under which this problem has a unique solution.

Minimization of functionals
9.12. Show that an equivalent definition of the Gateaux derivative is

$$ \langle DJ(u), v \rangle = \frac{d}{d\theta}\big[J(u + \theta v)\big]_{\theta=0}. $$

9.13. Consider the functional $J : \mathbb{R}^n \to \mathbb{R}$; show that

$$ \langle DJ(x), y \rangle = \sum_{i=1}^{n} \frac{\partial J}{\partial x_i}\, y_i. $$

9.14. If $a : V \times V \to \mathbb{R}$ is a V-elliptic, symmetric bilinear form and
$\ell : V \to \mathbb{R}$ is a linear functional, show that

$$ J(v) = \tfrac12 a(v,v) - \langle \ell, v \rangle $$

is strictly convex.

9.15. Show that $J(v) = \tfrac12 a(v,v) - \langle \ell, v \rangle$ is convex if a is positive, that is, if
$a(v,v) \ge 0$ for all $v \in V$. Hence prove the converse of Theorem 3: if
u satisfies $a(u,v) = \langle \ell, v \rangle$ for all $v \in V$, then u minimizes J.

9.16. Formulate the minimization problem corresponding to the VBVP of


Exercise 9.1.

9.17. Consider, in the context of Theorem 5, the situation in which V is a
closed and convex subset of H. Verify that the theorem still holds, but
that the condition (9.58) for a minimum is replaced by the variational
inequality

$$ u \in V \quad\text{and}\quad a(u, v - u) \ge \langle \ell, v - u \rangle \quad \text{for all } v \in V. $$

The obstacle problem (9.61), (9.62) is a special case of this abstract
problem.
10
Approximate methods of solution

In the two preceding chapters we have devoted considerable attention to
various aspects of boundary value problems. The stage has now been reached
where we can quite justifiably ask: how does one actually obtain solutions?
The answer is rather disappointing, unfortunately; except for problems
involving very simple PDEs and geometries, it is quite impossible, using
existing methods, to obtain exact solutions to most BVPs in either the
conventional or variational formulations.
This state of affairs naturally leads to the question of whether it is
possible to obtain approximate solutions. Here matters are far more encouraging,
in that there are available many good methods for finding approximate
solutions. Some, such as the finite difference method, are based on the classical
formulation whereas others, such as the Galerkin method, take as their
starting point the variational formulation.
The methods that make use of variational formulations have enjoyed a
great upsurge in popularity in the past three decades, particularly since
the establishment of the finite element method, which is probably the best
known special case of the Galerkin method. In the sections that follow we
show how approximate solutions to VBVPs or, equivalently, to the
corresponding minimization problems, can be obtained. The emphasis is on
the Galerkin method and the finite element method, although we also give
some indication of other related methods in Section 10.3.

10.1 The Galerkin method


The basic idea behind the Galerkin method is an extremely simple one.
Consider the VBVP of finding $u \in V$ that satisfies

$$ a(u,v) = \langle \ell, v \rangle \quad \text{for all } v \in V, \qquad (10.1) $$

where V is a subspace of a Hilbert space H. We assume for convenience that
all spaces are defined over the real numbers. The difficulty in trying to solve
(10.1) lies with the fact that V is a very large space (infinite-dimensional,
in the language of Chapter 6), with the result that it is not possible to set
up a practical method for finding the solution. But suppose that, instead
of posing the problem in V, we pick a few linearly independent functions
$\phi_1, \phi_2, \ldots, \phi_N$ in V and define the space $V^h$ to be the finite-dimensional
subspace of V spanned by the functions $\phi_i$. That is,

$$ V^h = \operatorname{span}\{\phi_1, \phi_2, \ldots, \phi_N\}. \qquad (10.2) $$

The index h is a parameter that lies between 0 and 1, and whose magnitude
gives some indication of how close $V^h$ is to V; h is related to the dimension
of $V^h$, and as the number N of basis functions chosen gets larger, h gets
smaller (for example, we could set h = 1/N). In the limit, as $N \to \infty$, $h \to 0$,
and we would like to choose $\{\phi_i\}$ in such a way that $V^h$ will approach
V, in a manner made precise later.
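Having defined the space $V^h$, problem (10.1) is now posed in $V^h$ instead
of in V. That is, we try to find a function $u_h \in V^h$ that satisfies

$$ a(u_h, v_h) = \langle \ell, v_h \rangle \quad \text{for all } v_h \in V^h. \qquad (10.3) $$

This is the essence of the Galerkin method. In order to solve for $u_h$, we
simply note that both $u_h$ and $v_h$ must be linear combinations of the basis
functions of $V^h$, so that

$$ u_h = \sum_{i=1}^{N} c_i \phi_i \quad\text{and}\quad v_h = \sum_{j=1}^{N} d_j \phi_j. \qquad (10.4) $$

Of course, since $v_h$ is arbitrary, so are the coefficients $d_j$. Substitution of
(10.4) in (10.3) and use of the fact that a is bilinear and $\ell$ is linear, lead to
the equation

$$ \sum_{i=1}^{N}\sum_{j=1}^{N} a(\phi_i,\phi_j)\, c_i d_j = \sum_{j=1}^{N} \langle \ell, \phi_j \rangle\, d_j $$

or, more concisely,

$$ \sum_{j=1}^{N} \Big( \sum_{i=1}^{N} K_{ij} c_i - F_j \Big) d_j = 0, \qquad (10.5) $$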
in which

$$ K_{ij} = a(\phi_i, \phi_j), \qquad F_j = \langle \ell, \phi_j \rangle. \qquad (10.6) $$

Note that $K_{ij}$ and $F_j$ can be evaluated in practice since the $\phi_i$ are known
functions and the forms of a and $\ell$ are also known.
Since the coefficients $d_j$ are arbitrary, it follows that (10.5) only holds if
the term in brackets is zero. The problem is thus reduced to one of solving
the set of simultaneous linear equations

$$ \sum_{i=1}^{N} K_{ij} c_i = F_j, \qquad j = 1, \ldots, N \qquad (10.7) $$

or, more compactly,

$$ K^T c = F, \qquad (10.8) $$

in which K and F are, respectively, the matrix and vector with entries $K_{ij}$
and $F_i$. Once these equations are solved, the approximate solution $u_h$ can
be found from the first of equations (10.4).

Examples

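1. Consider the BVP

$$ -\frac{d^2u}{dx^2} = \sin\frac{\pi x}{2} \ \text{in } \Omega = (0,1), \qquad u(0) = u'(1) = 0. $$

The corresponding VBVP is: find $u \in V$ such that

$$ \int_\Omega u'v'\,dx = \int_\Omega \sin(\pi x/2)\, v\,dx \quad \text{for all } v \in V, $$

where $V = \{v \in H^1(0,1) : v(0) = 0\}$ (note that $u'(1) = 0$ is a
natural boundary condition). Now define $V^h$ to be the subspace of V
spanned by the N monomials

$$ \phi_i(x) = x^i, \qquad i = 1, 2, \ldots, N. $$

Then

$$ K_{ij} = a(\phi_i,\phi_j) = \int_\Omega \phi_i'\phi_j'\,dx = \frac{ij}{i+j-1} $$

and

$$ F_i = \langle \ell, \phi_i \rangle = \int_\Omega \sin(\pi x/2)\,\phi_i\,dx. $$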
FIGURE 10.1. Exact and approximate solutions to the problem in Example 1

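Suppose that we take N = 2; then we obtain the set of simultaneous
equations

$$ Kc = F \iff \begin{bmatrix} 1 & 1 \\ 1 & \tfrac43 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 0.405 \\ 0.295 \end{bmatrix} $$

(K is symmetric; that is, $K^T = K$) and the solution to these equations is

$$ c_1 = 0.738, \qquad c_2 = -0.33. $$

The approximate solution is thus

$$ u_h(x) = c_1\phi_1(x) + c_2\phi_2(x) = 0.738x - 0.33x^2. $$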

This problem can be solved in closed form, and the exact solution is

$$ u(x) = (2/\pi)^2 \sin(\pi x/2), $$

which is compared with the approximate solution in Figure 10.1. We
see that even the crude approximation in a two-dimensional subspace
produces in this case a solution that compares very favorably with
the exact solution.
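
The calculation in Example 1 can be reproduced numerically. What follows is a minimal Python sketch rather than the procedure of the text: the helper names `simpson`, `solve`, and `galerkin_monomials` are hypothetical, the load integrals are approximated by composite Simpson quadrature, and the stiffness entries use the closed form $\int_0^1 \phi_i'\phi_j'\,dx = ij/(i+j-1)$ for the monomial basis.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def galerkin_monomials(N):
    """Galerkin coefficients for -u'' = sin(pi x/2), u(0) = u'(1) = 0,
    in the space spanned by phi_i(x) = x**i, i = 1..N."""
    K = [[i * j / (i + j - 1) for j in range(1, N + 1)] for i in range(1, N + 1)]
    F = [simpson(lambda x, i=i: math.sin(math.pi * x / 2) * x**i, 0.0, 1.0)
         for i in range(1, N + 1)]
    return solve(K, F)  # K is symmetric here, so K^T c = F and K c = F coincide

c = galerkin_monomials(2)
u_h = lambda x: sum(ck * x ** (k + 1) for k, ck in enumerate(c))
u_exact = lambda x: (2 / math.pi) ** 2 * math.sin(math.pi * x / 2)
print([round(ck, 3) for ck in c])
print(abs(u_h(0.5) - u_exact(0.5)))
```

Monomials are convenient for a hand calculation, but the matrix K they generate becomes badly conditioned as N grows, so the sketch should only be trusted for small N.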

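2. Consider the VBVP of finding $u \in H_0^1(\Omega)$ that satisfies

$$ \int_\Omega \nabla u \cdot \nabla v\,dx = \int_\Omega f v\,dx \quad \forall v \in H_0^1(\Omega), $$

where $f(x,y) = xy$. [The corresponding BVP is

$$ -\nabla^2 u = xy \ \text{in } \Omega, \qquad u = 0 \ \text{on } \Gamma.] $$

Here $\Omega$ is the unit square $(0,1) \times (0,1)$ in $\mathbb{R}^2$. We now choose as a
basis for $V^h$ the set of functions

$$ \phi_1 = \sin\pi x \sin\pi y, \quad \phi_2 = \sin\pi x\sin 2\pi y, \quad \phi_3 = \sin 2\pi x \sin\pi y, \quad \phi_4 = \sin 2\pi x\sin 2\pi y, $$

which of course all belong to $H_0^1(\Omega)$. The next step is to evaluate

$$ K_{ij} = a(\phi_i,\phi_j) \quad\text{and}\quad F_i = (f,\phi_i)_{L^2}, $$

which is straightforward if we make use of the identity

$$ \int_0^1 \sin n\pi x \sin m\pi x\,dx = \int_0^1 \cos n\pi x\cos m\pi x\,dx = \begin{cases} 0 & \text{if } n \ne m, \\ \tfrac12 & \text{if } n = m. \end{cases} $$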

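Then

$$ K_{ij} = \int_0^1\!\!\int_0^1 \Big( \frac{\partial\phi_i}{\partial x}\frac{\partial\phi_j}{\partial x} + \frac{\partial\phi_i}{\partial y}\frac{\partial\phi_j}{\partial y} \Big)\,dx\,dy, $$

and because of the orthogonality of the trigonometric functions the
only nonzero terms of $K_{ij}$ are

$$ K_{ii} = \int_0^1\!\!\int_0^1 \Big[ \Big(\frac{\partial\phi_i}{\partial x}\Big)^2 + \Big(\frac{\partial\phi_i}{\partial y}\Big)^2 \Big]\,dx\,dy
 = \pi^2 \int_0^1\!\!\int_0^1 n^2 \cos^2 n\pi x\,\sin^2 m\pi y\,dx\,dy + \pi^2\int_0^1\!\!\int_0^1 m^2 \sin^2 n\pi x\,\cos^2 m\pi y\,dx\,dy, $$

where n and m take the values:

    i   1  2  3  4
    n   1  1  2  2
    m   1  2  1  2

After carrying out the integration we obtain

$$ K_{ii} = \frac{\pi^2}{4}(n^2 + m^2) \quad\text{or}\quad K = \frac{\pi^2}{4}\begin{bmatrix} 2&0&0&0\\0&5&0&0\\0&0&5&0\\0&0&0&8\end{bmatrix}. $$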
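Similarly, since $\int_0^1 x \sin n\pi x\,dx = (-1)^{n+1}/(n\pi)$,

$$ F_i = \int_0^1\!\!\int_0^1 xy \sin n\pi x \sin m\pi y\,dx\,dy = \frac{(-1)^{n+m}}{nm\,\pi^2}, \qquad\text{that is,}\qquad F = \frac{1}{\pi^2}\big(1, -\tfrac12, -\tfrac12, \tfrac14\big). $$

Hence the solution is

$$ c_i = \frac{F_i}{K_{ii}}, \qquad c = \frac{4}{\pi^4}\big(\tfrac12, -\tfrac1{10}, -\tfrac1{10}, \tfrac1{32}\big), $$

and so

$$ u_h = \frac{4}{\pi^4}\Big[ \tfrac12 \sin\pi x\sin\pi y - \tfrac1{10}\big(\sin\pi x\sin2\pi y + \sin2\pi x\sin\pi y\big) + \tfrac1{32}\sin2\pi x\sin2\pi y \Big]. $$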
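
The load integrals and coefficients of Example 2 are easy to check by quadrature. The sketch below is an illustration under stated assumptions: the names `simpson`, `x_sin_moment`, and `modes` are hypothetical, and it uses the fact that the integrand $xy \sin n\pi x \sin m\pi y$ separates into a product of two one-dimensional integrals, each equal to $(-1)^{n+1}/(n\pi)$.

```python
import math

def simpson(g, a, b, n=400):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def x_sin_moment(n):
    """Quadrature value of the integral of x*sin(n*pi*x) over (0, 1)."""
    return simpson(lambda x: x * math.sin(n * math.pi * x), 0.0, 1.0)

# (n, m) pairs for phi_1, ..., phi_4, in the order used in Example 2
modes = [(1, 1), (1, 2), (2, 1), (2, 2)]

# Diagonal stiffness entries K_ii = (pi^2 / 4) (n^2 + m^2)
K_diag = [math.pi ** 2 / 4 * (n * n + m * m) for n, m in modes]

# Load entries: the double integral factorizes into two 1D moments
F = [x_sin_moment(n) * x_sin_moment(m) for n, m in modes]

# K is diagonal, so each coefficient is simply F_i / K_ii
c = [Fi / Kii for Fi, Kii in zip(F, K_diag)]

for (n, m), Fi, ci in zip(modes, F, c):
    print(n, m, Fi * math.pi ** 2, ci * math.pi ** 4 / 4)
```

The printed columns are $\pi^2 F_i$ and $\pi^4 c_i / 4$, scaled so that they can be compared directly with the entries of F and c written as fractions.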

We recall from Section 5.5 that the bilinear form $a(\cdot,\cdot)$ defines an inner
product on V if a is symmetric and V-elliptic; indeed, the properties
of linearity and symmetry are obvious, whereas the property of
positive-definiteness comes from the V-ellipticity of a:

$$ a(v,v) \ge \alpha\|v\|_V^2 > 0 \quad\text{for all nonzero } v. \qquad (10.9) $$

Furthermore, we have seen in the proof of Theorem 5, Chapter 9, that if
a is also continuous, then the norm $\|v\|_a = a(v,v)^{1/2}$ generated by this inner
product is equivalent to the standard norm on V, so that if V is complete
with respect to the standard norm, it is also complete with respect to the
norm $\|\cdot\|_a$. As before, this inner product is denoted by $(\cdot,\cdot)_a$ and referred
to as the energy inner product (the rationale behind this terminology has
been discussed in Section 9.4), and the corresponding norm is called the
energy norm.
Now if the set of basis functions $\{\phi_i\}_{i=1}^N$ is chosen in such a way that they
are orthogonal with respect to the energy inner product, then the system of
equations (10.7) simplifies considerably, since

$$ K_{ij} = a(\phi_i,\phi_j) = (\phi_i,\phi_j)_a = 0 \quad\text{for } i \ne j, $$

and so

$$ K_{ii}c_i = F_i, \quad\text{or}\quad c_i = F_i/K_{ii}. $$

This is in fact the case in Example 2.
However, a word of warning is appropriate. Although for the preceding
example it was quite simple to find a basis that was orthogonal with respect
to $(\cdot,\cdot)_a$, in general this is quite difficult. One could of course choose any
non-orthogonal basis and use the Gram-Schmidt procedure of Section 6.2
to orthogonalize or even orthonormalize, but for all except the most trivial
problems this is a laborious procedure, and little is to be gained from it.
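
To make the Gram-Schmidt remark concrete, here is a small Python sketch that orthogonalizes the monomials $x, x^2$ of Example 1 with respect to the energy inner product $(u,v)_a = \int_0^1 u'v'\,dx$. It is an illustration only: polynomials are stored as coefficient lists (entry k multiplies $x^k$), and the helper names `deriv`, `mul`, `integrate01`, `energy`, and `gram_schmidt` are hypothetical.

```python
def deriv(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * p[k] for k in range(1, len(p))]

def mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    r = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

def integrate01(p):
    """Exact integral of a polynomial over (0, 1)."""
    return sum(ck / (k + 1) for k, ck in enumerate(p))

def energy(p, q):
    """Energy inner product (p, q)_a = integral of p' q' over (0, 1)."""
    return integrate01(mul(deriv(p), deriv(q)))

def gram_schmidt(basis):
    """Classical Gram-Schmidt in the energy inner product.
    All coefficient lists are assumed to have the same length."""
    ortho = []
    for p in basis:
        w = list(p)
        for q in ortho:
            coef = energy(p, q) / energy(q, q)
            w = [wi - coef * qi for wi, qi in zip(w, q)]
        ortho.append(w)
    return ortho

# phi_1 = x and phi_2 = x^2, padded to a common length
basis = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ortho = gram_schmidt(basis)
print(ortho)                         # second function is x^2 - x
print(energy(ortho[0], ortho[1]))    # off-diagonal entry of K vanishes
```

With the orthogonalized pair the matrix K is diagonal, with entries $(x,x)_a = 1$ and $(x^2 - x, x^2 - x)_a = 1/3$, so each coefficient follows from a single division, at the price of the extra work of orthogonalizing.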
The problem of constructing a basis $\{\phi_i\}_{i=1}^N$ in such a way that $V^h$
approaches V as $N \to \infty$ can be rather awkward. Remember that although
orthonormal bases for spaces such as $L^2$ are well known, at least for spaces
of functions on the real line or on simple two- and three-dimensional
domains (see, for example, Section 6.4), when using the Galerkin method we
are required to find bases for spaces V that are subspaces of Sobolev spaces
$H^m(\Omega)$, and that are defined on domains $\Omega$ which may be quite irregular
in shape. A very simple and elegant method for constructing such bases is
provided by the finite element method. This is the topic of discussion in
the next two chapters.

The Rayleigh-Ritz method. The Rayleigh-Ritz method is very closely
linked to the Galerkin method. It takes as its starting point the
minimization problem (9.52) and, as with the Galerkin method, proceeds to pose
this problem on a finite-dimensional subspace. That is, problem (9.52) is
replaced by the problem of finding $u_h \in V^h$ such that

$$ J(u_h) \le J(v_h) \quad \text{for all } v_h \in V^h, $$

where $V^h$ is a finite-dimensional subspace of V. If $\{\phi_k\}_{k=1}^N$ is a basis for
$V^h$, then substitution of $v_h = \sum_{k=1}^N c_k\phi_k$ in the expression for J yields the
function

$$ J(c_1,\ldots,c_N) = J\Big(\sum_{k=1}^N c_k\phi_k\Big), $$

which is a function of the N variables $c_1,\ldots,c_N$. In order to minimize
$J(v_h)$, therefore, we require that

$$ \frac{\partial J}{\partial c_k} = 0, \qquad k = 1,\ldots,N, $$

and this yields a set of N simultaneous algebraic equations in the N
unknowns $c_1,\ldots,c_N$. Solution of these equations then gives the components
$c_k$ of $u_h$. In particular, if J is given by (9.57), then

$$ J(c_k) = \tfrac12\sum_{i,j=1}^N K_{ij}c_ic_j - \sum_{j=1}^N F_jc_j, $$

where $K_{ij}$ and $F_j$ are defined by (10.6), and minimization with respect to
the $c_k$ yields the set of linear equations

$$ \sum_{j=1}^N K_{ij}c_j = F_i, \qquad i = 1,\ldots,N, $$

which is precisely (10.7). Here, though, K is always symmetric.
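
The step from the quadratic function of the coefficients to the linear system can be written out explicitly. The short derivation below is added for convenience and uses only the symmetry $K_{ij} = K_{ji}$; differentiating the double sum produces one term for each of the two positions in which $c_k$ can appear.

```latex
\frac{\partial J}{\partial c_k}
  = \tfrac12 \sum_{j=1}^{N} K_{kj} c_j + \tfrac12 \sum_{i=1}^{N} K_{ik} c_i - F_k
  = \sum_{j=1}^{N} K_{kj} c_j - F_k = 0, \qquad k = 1, \ldots, N .
```

Without symmetry the two sums would not combine, which is one way of seeing why the Rayleigh-Ritz method is tied to symmetric bilinear forms whereas the Galerkin method is not.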

10.2 Properties of Galerkin approximations


In Section 10.1 we introduced the Galerkin method and illustrated how the
method is used in practice. It is not very satisfactory, however, simply to
leave things at that; we ought to know, first of all, whether the Galerkin
method always works and, if so, how significantly the approximate solution
differs from the exact solution. Also, we would like to be confident that as
the number of functions $\phi_i$ in the basis of $V^h$ is increased, $V^h$ approaches
in some sense the space V and $u_h$ approaches the exact solution u. This last
consideration is of course one of convergence of the approximate solution
as $h \to 0$ (or as $N \to \infty$).

Existence and uniqueness. The question of existence and uniqueness of
a solution is easily resolved, in view of the results in Section 9.3. Since $V^h$ is
a finite-dimensional subspace of a Hilbert space, it is necessarily complete.
Assuming then that a is a V-elliptic continuous bilinear form on $V \times V$ and
$\ell$ is a continuous linear functional on V, the same obviously holds true when their
domains are restricted to $V^h$. Hence, since it has already been shown that a
unique solution to (10.1) exists, the same will hold true for the approximate
problem (10.3). This information is recorded in the following theorem.

THEOREM 1. Let $V^h$ be a finite-dimensional subspace of a Hilbert space V,
$a : V^h \times V^h \to \mathbb{R}$ a continuous, V-elliptic bilinear form, and $\ell : V^h \to \mathbb{R}$
a bounded linear functional. Then there exists a unique function $u_h \in V^h$
that satisfies

$$ a(u_h, v_h) = \langle \ell, v_h\rangle \quad\text{for all } v_h \in V^h. $$

Furthermore, if $\ell$ is of the form

$$ \langle \ell, v_h \rangle = \int_\Omega f v_h\,dx $$

with $f \in L^2(\Omega)$, then

$$ \|u_h\|_V \le \frac{1}{\alpha}\|f\|_{L^2}, $$

where $\alpha$ is the constant in (10.9).

Errors in Galerkin approximations. Having established the conditions
under which an approximate solution can be found, we proceed now to
characterize the error e, which is defined to be the difference between the
exact and approximate solutions:

$$ e = u - u_h. $$

For this purpose we return to (10.1). Since $V^h$ is a subspace of V it is in
order to choose v to be a member of $V^h$; denoting this member by $v_h$ we
have

$$ a(u, v_h) = \langle \ell, v_h\rangle. \qquad (10.10) $$

Furthermore, the Galerkin approximation $u_h$ satisfies (10.3). Subtracting
(10.3) from (10.10) and making use of the bilinearity of a, it is found that

$$ a(u, v_h) - a(u_h, v_h) = 0 \qquad (10.11) $$

or

$$ a(u - u_h, v_h) = 0; $$

that is,

$$ a(e, v_h) = 0. \qquad (10.12) $$

This seemingly innocuous result has a useful geometrical interpretation in
the event that a is symmetric; for then the inner product $(\cdot,\cdot)_a = a(\cdot,\cdot)$
is available, and the theory of Sections 5.3 and 6.4 on orthogonal
projections comes into play. Indeed, according to Exercise 6.21, if $\{\phi_k\}_{k=1}^N$ is
an orthonormal basis of $V^h$ with respect to $(\cdot,\cdot)_a$, then the orthogonal
projection onto $V^h$ is defined by

$$ Pv = \sum_{k=1}^N (v,\phi_k)_a\,\phi_k. \qquad (10.13) $$

But if we set $v_h = \phi_k$ in equation (10.11) we find that

$$ (u,\phi_k)_a = (u_h,\phi_k)_a, $$

so that

$$ Pu = \sum_{k=1}^N (u_h,\phi_k)_a\,\phi_k = Pu_h = u_h. \qquad (10.14) $$

Hence the orthogonal projection of the solution u onto $V^h$, with respect to
the inner product $(\cdot,\cdot)_a$, is the approximate solution $u_h$. Clearly, then, the
error $e = u - u_h = u - Pu$ belongs to N(P), that is,

$$ (e, v_h)_a = 0 \quad\text{for all } v_h \in V^h, $$

which confirms (10.12). In other words, relative to the inner product $(\cdot,\cdot)_a$
the error is orthogonal to the subspace $V^h$.
The geometrical analogy may be carried a step further. It would appear
that the distance $\|u - v_h\|$, when measured using the norm $\|\cdot\|_a$, is a
minimum when $v_h = u_h$. That this is indeed so is borne out by the following.
We have

$$ \begin{aligned} \|u - v_h\|_a^2 &= a(u - u_h + u_h - v_h,\ u - u_h + u_h - v_h) \\ &= a(e + (u_h - v_h),\ e + (u_h - v_h)) \\ &= a(e,e) + 2a(e, u_h - v_h) + a(u_h - v_h, u_h - v_h), \end{aligned} $$

using the bilinearity and symmetry of a. Now the second term on the right-hand side is
zero, since e is orthogonal to all members of $V^h$ with respect to the inner
product $(\cdot,\cdot)_a$. Thus

$$ \|u - v_h\|_a^2 = \|e\|_a^2 + \|u_h - v_h\|_a^2, $$

and for fixed u and $u_h$ (and hence for fixed e) we conclude that $\|u - v_h\|_a$
is smallest when $v_h = u_h$; that is,

$$ \|u - u_h\|_a \le \|u - v_h\|_a \quad\text{for all } v_h \in V^h. \qquad (10.15) $$

In other words, the function in $V^h$ that is closest to u is the Galerkin
approximation. In this sense the Galerkin approximation is the best
approximation to u in $V^h$. Of course we could have deduced (10.15) directly
from Theorem 7 of Chapter 4 and (10.14).
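
Both the orthogonality of the error and the best-approximation property can be checked numerically for Example 1, where $a(u,v) = \int_0^1 u'v'\,dx$ and $V^h$ is spanned by $x$ and $x^2$. The Python sketch below is an illustration under stated assumptions: the names `simpson`, `du`, and `energy_dist_sq` are hypothetical, the Galerkin coefficients are obtained from the exact load values $F_1 = 4/\pi^2$ and $F_2 = 8/\pi^2 - 16/\pi^3$, and the three perturbations are arbitrary.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

# Galerkin coefficients of Example 1 from the 2x2 system
#   c1 + c2 = F1,   c1 + (4/3) c2 = F2
F1 = 4 / math.pi ** 2
F2 = 8 / math.pi ** 2 - 16 / math.pi ** 3
c2 = 3 * (F2 - F1)
c1 = F1 - c2

# Derivative of the exact solution u(x) = (2/pi)^2 sin(pi x / 2)
du = lambda x: (2 / math.pi) * math.cos(math.pi * x / 2)

def energy_dist_sq(d1, d2):
    """||u - v_h||_a^2 for v_h = d1*x + d2*x^2."""
    return simpson(lambda x: (du(x) - d1 - 2 * d2 * x) ** 2, 0.0, 1.0)

# Galerkin orthogonality: a(e, phi_j) for phi_1 = x and phi_2 = x^2
orth = [
    simpson(lambda x: (du(x) - c1 - 2 * c2 * x) * 1.0, 0.0, 1.0),
    simpson(lambda x: (du(x) - c1 - 2 * c2 * x) * 2 * x, 0.0, 1.0),
]

# Best approximation: any other member of V^h is farther from u
best = energy_dist_sq(c1, c2)
perturbed = [energy_dist_sq(c1 + s, c2 + t)
             for s, t in [(0.05, 0.0), (0.0, 0.05), (-0.03, 0.02)]]

print(orth)
print(best, perturbed)
```

The two entries of `orth` vanish up to quadrature error, and every perturbed distance exceeds `best` by exactly the energy norm squared of the perturbation, as the Pythagorean identity above predicts.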

Convergence of Galerkin approximations. As mentioned earlier, each
value of the parameter h defines a subspace $V^h$ of V: the smaller h is, the
larger the dimension of $V^h$ will be. We could use for h the reciprocal of
the number of basis functions that span $V^h$, although in Chapter 11 we
give h a more geometrical meaning, in the context of the finite element
method. In any case, h lies in (0,1) and equation (10.3) can be regarded
as representing a family of Galerkin approximations, each value of h
having associated with it a problem (10.3). Corresponding to this family of
problems we have also a family of solutions $u_h$, and once again a particular
solution $u_h$ is associated with each value of h. Of course, if we define h by
h = 1/N, where $N = \dim V^h$, then h cannot take on all values in (0,1) but
only those of the form 1/N for integer N.
With these ideas at our disposal, it is quite simple to give a definition
of the convergence of a family of Galerkin approximations; we say that the
family of solutions $u_h$ converges to the exact solution u if

$$ \lim_{h\to0}\|u_h - u\|_V = 0. \qquad (10.16) $$

The task of proving convergence, once a basis or a family of bases has
been identified, is made easier by a deceptively simple result, which has
far-reaching implications.

LEMMA 1 (CÉA'S LEMMA). Let V be a closed subspace of a Hilbert space,
and let a and $\ell$ be, respectively, a continuous, V-elliptic bilinear form and
a bounded linear functional on V. Then there exists a constant C,
independent of h, such that

$$ \|u - u_h\|_V \le C \inf_{v_h\in V^h}\|u - v_h\|_V. \qquad (10.17) $$

Consequently, a sufficient condition for the Galerkin approximation $u_h$ to
converge to the solution u of problem (10.1) is that there exists a family
$\{V^h\}$ of subspaces with the property that

$$ \inf_{v_h\in V^h}\|u - v_h\|_V \to 0 \quad\text{as } h \to 0. \qquad (10.18) $$

PROOF. Since the bilinear form a is V-elliptic by assumption, we have
(denoting the norm on V simply by $\|\cdot\|$, for convenience)

$$ \begin{aligned} \alpha\|u - u_h\|^2 &\le a(u - u_h, u - u_h) \\ &= a(u - u_h,\ u - v_h - u_h + v_h) \\ &= a(u - u_h, u - v_h) - a(e, u_h - v_h). \end{aligned} $$

The last term on the right-hand side is zero by (10.12) so, using the
continuity of a,

$$ \alpha\|u - u_h\|^2 \le a(u - u_h, u - v_h) \le M\|u - u_h\|\,\|u - v_h\|. $$

Dividing by $\alpha\|u - u_h\|$ and taking the infimum over $v_h \in V^h$, the
inequality (10.17) now follows, with $C = M/\alpha$. □


Céa's Lemma effectively transforms the problem of estimating the error
$u - u_h$ to one of estimating the distance of u from the subspace $V^h$. That is,
we can gain an idea of the quality of the approximation $u_h$ by estimating
how far off u is from $V^h$. Since $\inf_{v_h\in V^h}\|u - v_h\| \le \|u - v_h\|$ for any
particular $v_h \in V^h$, it also follows that one may obtain a suitable estimate
by choosing $v_h$ in an appropriate and convenient way. It turns out that
the most convenient choice is to make use of the interpolate of u; this is
a function $\tilde u_h$ in $V^h$ whose value coincides with that of u at N points
$x_1, x_2, \ldots, x_N$ in $\Omega$. Since $\tilde u_h$ has the representation

$$ \tilde u_h = \sum_{k=1}^N c_k\phi_k, \qquad (10.19) $$

where $\{\phi_k\}$ is any basis of $V^h$, we can determine the coefficients $c_k$ from the
fact that

$$ \tilde u_h(x_k) = u(x_k), \qquad k = 1,\ldots,N, \qquad (10.20) $$

for a given function u. That is, we solve for $c_j$ the N simultaneous equations

$$ \sum_{j=1}^N c_j\phi_j(x_k) = u(x_k), \qquad k = 1,\ldots,N. $$

FIGURE 10.2. A function and its interpolate in the space spanned by (10.22)

We observe that the operator $P : V \to V$ defined by

$$ Pu = \tilde u_h \qquad (10.21) $$

is a projection operator.

Example

3. Choose $V = H^1(0,1)$ and $V^h = P_1(0,1)$, the space of polynomials of
degree not greater than 1, which is spanned by the functions

$$ \phi_1(x) = x, \qquad \phi_2(x) = 1 - x. \qquad (10.22) $$

Suppose that we require the interpolate $\tilde u_h$ to be equal to u at the
points $x_1 = \tfrac13$ and $x_2 = \tfrac23$ (Figure 10.2); then (10.20) and (10.22)
give

$$ c_1\cdot\tfrac13 + c_2\cdot\tfrac23 = u(\tfrac13), \qquad c_1\cdot\tfrac23 + c_2\cdot\tfrac13 = u(\tfrac23). $$

Solving, we have

$$ c_1 = 2u(\tfrac23) - u(\tfrac13), \qquad c_2 = 2u(\tfrac13) - u(\tfrac23). $$
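
The interpolation conditions of Example 3 amount to a 2 x 2 linear system, which the following Python sketch solves by Cramer's rule. The function name `interpolate_p1` and the test function $u(x) = \sin x$ are assumptions made for illustration; the basis $\phi_1 = x$, $\phi_2 = 1 - x$ and the points $\tfrac13$, $\tfrac23$ are those of the example.

```python
import math

def interpolate_p1(u, x1=1/3, x2=2/3):
    """Coefficients (c1, c2) of c1*x + c2*(1 - x) that match u at x1 and x2."""
    # Rows of the system: c1*x_k + c2*(1 - x_k) = u(x_k), k = 1, 2
    a11, a12 = x1, 1 - x1
    a21, a22 = x2, 1 - x2
    det = a11 * a22 - a12 * a21
    b1, b2 = u(x1), u(x2)
    c1 = (b1 * a22 - a12 * b2) / det   # Cramer's rule, first unknown
    c2 = (a11 * b2 - a21 * b1) / det   # Cramer's rule, second unknown
    return c1, c2

u = math.sin
c1, c2 = interpolate_p1(u)
u_tilde = lambda x: c1 * x + c2 * (1 - x)

# The interpolate reproduces u at the two interpolation points
print(u_tilde(1/3) - u(1/3), u_tilde(2/3) - u(2/3))
# and the coefficients agree with the closed-form expressions of Example 3
print(c1 - (2 * u(2/3) - u(1/3)), c2 - (2 * u(1/3) - u(2/3)))
```

Applying `interpolate_p1` to the interpolate itself returns the same coefficients, which is the statement that the operator P of (10.21) is a projection.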

With the choice $v_h = \tilde u_h$, the interpolate (10.19) of u, (10.17) yields

$$ \|u - u_h\|_V \le C\|u - \tilde u_h\|_V, \qquad (10.23) $$

and so the problem of convergence reduces to one of finding out whether
$\tilde u_h \to u$ as $h \to 0$, and if so at what rate this occurs. We have thus
reduced the problem of convergence of Galerkin approximations to one of
convergence of interpolates.
In the case of the finite element method, which is distinguished by the
fact that the basis functions are piecewise polynomials, we show that the
distance between u and its interpolate $\tilde u_h$ satisfies an inequality of the form

$$ \|u - \tilde u_h\|_V \le c\,h^\beta, \qquad (10.24) $$

where the constant c is independent of h, and $\beta$ is positive. Then (10.23)
immediately implies that

$$ \|u - u_h\|_V \le \frac{cM}{\alpha}h^\beta. \qquad (10.25) $$

Hence, as the approximation is progressively improved so that N gets larger
and h gets smaller, we can expect $u_h$ to converge to u at a rate that is
determined by the magnitude of $\beta$. This is expressed in the form

$$ \|u - u_h\|_V = O(h^\beta), \qquad (10.26) $$

and we say that the convergence of $u_h$ to u is of order $\beta$. Clearly the aim
is to have $\beta$ as large as possible.
A result such as (10.24) is called an interpolation error estimate, for
obvious reasons, whereas (10.25) or (10.26) is called a Galerkin (asymp-
totic) error estimate. Our means of examining the convergence of Galerkin
approximations is always via estimates of the form (10.25).
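
An interpolation error estimate of this kind can be observed numerically. The Python sketch below is an illustration under stated assumptions, not a result from the text: it takes $u(x) = \sin(\pi x/2)$ on (0,1), interpolates it by continuous piecewise-linear functions on a uniform mesh of width h, measures the error in the $H^1$-seminorm (the derivative part of the $H^1$-norm), and estimates $\beta$ from two successive meshes. The helper names are hypothetical.

```python
import math

def simpson(g, a, b, n=20):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def h1_seminorm_interp_error(u, du, n_elems):
    """L2 norm of u' minus the derivative of the piecewise-linear
    interpolate of u on a uniform mesh of n_elems elements of (0, 1)."""
    h = 1.0 / n_elems
    total = 0.0
    for i in range(n_elems):
        xl, xr = i * h, (i + 1) * h
        slope = (u(xr) - u(xl)) / h   # derivative of the interpolate on this element
        total += simpson(lambda x: (du(x) - slope) ** 2, xl, xr)
    return math.sqrt(total)

u = lambda x: math.sin(math.pi * x / 2)
du = lambda x: (math.pi / 2) * math.cos(math.pi * x / 2)

errors = {n: h1_seminorm_interp_error(u, du, n) for n in (8, 16, 32)}

# Observed order: if error ~ c h^beta then halving h divides the error by 2^beta
beta = math.log(errors[8] / errors[16]) / math.log(2.0)
print(errors, beta)
```

For this smooth u the observed order is close to 1, so that, for this particular choice of subspaces, an estimate of the form (10.24) holds with $\beta = 1$ in the derivative part of the norm.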
We conclude this section by noting that the constant in the error estimate
(10.25) can be improved in the event that a is symmetric. Indeed, returning
to (10.15), this inequality implies that

$$ \|u - u_h\|_a^2 \le \|u - v_h\|_a^2; $$

using the V-ellipticity and continuity of the bilinear form, we find that this
reduces to

$$ \|u - u_h\|_V \le \Big(\frac{M}{\alpha}\Big)^{1/2}\|u - v_h\|_V, $$

which represents an improvement over (10.17) since $M \ge \alpha$.

10.3 Other methods of approximation


The Galerkin-Rayleigh-Ritz approach is very widely used, particularly in
the context of finite element methods. But a number of other approximate
methods also exist, and although these are perhaps not as ubiquitous as the
Galerkin method, they are nevertheless worth knowing about as they make
interesting, viable, and in some cases superior alternative approaches. We
discuss the Petrov-Galerkin method, the method of weighted residuals, the
method of least squares, collocation methods, and $H^{-1}$-methods.
We start by returning to the BVP

$$ Au = f \ \text{in } \Omega, \qquad B_0u = B_1u = \cdots = B_{m-1}u = 0 \ \text{on } \Gamma, \qquad (10.27) $$

in which, as before, A is an elliptic operator of order 2m. For convenience
attention is confined to problems with homogeneous boundary conditions.
As in Chapter 9, we multiply both sides of (10.27) by a function v and
integrate to obtain

$$ (Au, v) = (f, v), \qquad (10.28) $$

in which $(\cdot,\cdot)$ represents the $L^2$-inner product. Now in Chapter 9 Green's
theorem was used to shift half the derivatives in Au over to v, and in so
doing to arrive at the VBVP

$$ a(u,v) = (f,v) \qquad (10.29) $$

for all $v \in V$ (V being a subspace of $H^m(\Omega)$). The other methods discussed
here all rely on the observation that there are other ways besides (10.29) of
formulating VBVPs. At one extreme we could consider (10.28) as it stands
and seek

$$ u \in U, \qquad (Au, v) = (f, v) \quad\text{for all } v \in V, \qquad (10.30) $$

in which, for example,

$$ U = \{u \in H^{2m}(\Omega) : B_0u = \cdots = B_{m-1}u = 0 \text{ on } \Gamma\}. \qquad (10.31) $$

In this case no derivatives of u will have been shifted over to v.


A second alternative is to pose the VBVP in the form (10.29), but to
seek the approximate solution $u_h$ in a space $U^h$ which is distinct from
that in which the admissible functions $v_h$ are sought; this is known as
the Petrov-Galerkin method, to distinguish it from the standard Galerkin
method which has hitherto been the main focus of attention.
Continuing with the process of considering alternative formulations, at
the other extreme we could shift all derivatives over to v by repeated
application of Green's theorem, and then pose the problem of finding $u \in L^2(\Omega)$
such that

$$ (u, A^*v) = (f, v) \quad\text{for all } v \in V, \qquad (10.32) $$

where $A^*$ is the formal adjoint of A and $V = H_0^{2m}(\Omega)$, for example.
All of these new approaches are characterized by the feature that the
solution is sought in a particular space, and the equation is required to be
satisfied for all functions v belonging to a space that is generally distinct
from the solution space. The Petrov-Galerkin method is based on (10.29),
and we find that weighted residual, least squares and collocation methods
are based on (10.30) whereas $H^{-1}$-methods are based on (10.32).

Example
4. Consider the problem

$$ -u'' + u = f \ \text{in } \Omega = (0,1), \qquad u(0) = u(1) = 0. \qquad (10.33) $$

Equation (10.30) reads, for this problem,

$$ \int_0^1 (-u'' + u)v\,dx = \int_0^1 fv\,dx, $$

where $U = H^2(0,1)\cap H_0^1(0,1)$ and $V = L^2(0,1)$. Integrating by parts
once we obtain

$$ \int_0^1 (u'v' + uv)\,dx = \int_0^1 fv\,dx $$

with $U = V = H_0^1(0,1)$. Finally, by integrating by parts once more
we may pose the problem of finding $u \in L^2(0,1)$ such that

$$ \int_0^1 (-uv'' + uv)\,dx = \int_0^1 fv\,dx $$

for all $v \in H_0^2(0,1)$.


We now proceed to discuss the group of approximate methods based on
the formulation (10.30). Given finite-dimensional subspaces $U^h \subset U$ and
$V^h \subset V$, where U is given by (10.31) and $V = L^2(\Omega)$, consider the problem
of finding a function $u_h \in U^h$ such that

$$ (Au_h - f, v_h) = 0 \qquad (10.34) $$

for all $v_h \in V^h$. The expression $r(u_h) = Au_h - f$ is called the residual; if
$u_h$ is the exact solution, then of course the residual vanishes.

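The method of weighted residuals. In this method $U^h$ and $V^h$ are
chosen such that $\dim U^h = \dim V^h = N$, say, with bases

$$ \{\phi_k\}_{k=1}^N \ \text{for } U^h \quad\text{and}\quad \{\psi_k\}_{k=1}^N \ \text{for } V^h. \qquad (10.35) $$

Then

$$ u_h = \sum_{k=1}^N c_k\phi_k \quad\text{and}\quad v_h = \sum_{k=1}^N b_k\psi_k, \qquad (10.36) $$

where the coefficients $b_k$ are arbitrary in view of the arbitrariness of $v_h$.
Substitution of (10.36) in (10.34) yields a set of N simultaneous equations

$$ \sum_{k=1}^N M_{kl}c_k = F_l, \qquad l = 1,\ldots,N, $$

in which

$$ M_{kl} = (A\phi_k, \psi_l), \qquad F_l = (f,\psi_l), $$

$(\cdot,\cdot)$ denoting the $L^2$-inner product.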

The method of least squares. This method entails finding a function
$u_h \in U^h$ that minimizes the residual, the magnitude of the residual being
measured in the $L^2$-norm (hence the name of the method). That is, we
define a functional

$$ J(v_h) = \|r(v_h)\|_{L^2}^2 = \int_\Omega (Av_h - f)^2\,dx $$

and we seek $u_h \in U^h$ such that

$$ J(u_h) \le J(v_h) \quad\text{for all } v_h \in U^h. $$

From Section 9.4 this is equivalent to the problem of finding $u_h \in U^h$ such
that

$$ \langle DJ(u_h), v_h\rangle = 0 \quad\text{for all } v_h \in U^h, $$

or

$$ \int_\Omega (Au_h - f)\,Av_h\,dx = 0 \quad\text{for all } v_h \in U^h. \qquad (10.37) $$

Comparison with (10.34) indicates that the method of least squares is
equivalent to the method of weighted residuals if in the latter we choose
the space $V^h$ to be the span of the functions $\{A\phi_k\}_{k=1}^N$, where $\{\phi_k\}_{k=1}^N$ is
the basis for $U^h$.

Collocation methods. The idea behind these methods is to force the
residual to vanish at a finite number of points in the domain $\Omega$. That is, if
$\dim U^h = N$, then we seek $u_h \in U^h$ such that

$$ Au_h - f = 0 \quad\text{at } x = x_k, \quad k = 1, 2, \ldots, N. $$
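
As an illustration, the following Python sketch applies collocation to the model problem (10.33), $-u'' + u = f$ on (0,1) with $u(0) = u(1) = 0$. Everything specific in it is an assumption made for the example: the basis $\phi_k(x) = x^k(1-x)$, which satisfies both boundary conditions, the equally spaced collocation points, the manufactured load $f = (\pi^2 + 1)\sin\pi x$ (so that the exact solution is $u = \sin\pi x$), and the helper names `solve` and `collocation`.

```python
import math

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def collocation(N, f):
    """Collocation for -u'' + u = f, u(0) = u(1) = 0, with phi_k = x**k * (1 - x)."""
    def phi(k, x):
        return x ** k * (1 - x)

    def A_phi(k, x):
        # phi_k = x**k - x**(k+1); second derivative computed term by term
        d2 = -(k + 1) * k * x ** (k - 1)
        if k >= 2:
            d2 += k * (k - 1) * x ** (k - 2)
        return -d2 + phi(k, x)

    pts = [j / (N + 1) for j in range(1, N + 1)]      # interior collocation points
    M = [[A_phi(k, x) for k in range(1, N + 1)] for x in pts]
    c = solve(M, [f(x) for x in pts])                  # residual vanishes at each point
    return lambda x: sum(ck * phi(k, x) for k, ck in enumerate(c, start=1))

f = lambda x: (math.pi ** 2 + 1) * math.sin(math.pi * x)
u_h = collocation(3, f)
err = max(abs(u_h(i / 50) - math.sin(math.pi * i / 50)) for i in range(51))
print(u_h(0.5), err)
```

Note that the basis functions must be twice differentiable for $A\phi_k$ to be evaluated pointwise, which is more smoothness than the Galerkin method asks of its basis.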

The collocation method is also a variant of the method of weighted
residuals, as can be appreciated from the following considerations: corresponding
to each $v_h$ in $V^h$ we can define a linear functional $\ell_h$ such that

$$ \langle \ell_h, w\rangle = (w, v_h). $$

Now suppose for definiteness that $u_h$ and f are smooth enough for the
residual $r(u_h) = Au_h - f$ to belong to $H_0^1(\Omega)$; that is,

$$ Au_h - f \in H \subset H_0^1(\Omega), $$

where H is assumed to be dense in $H_0^1(\Omega)$. Then $\ell_h$ is a linear functional
on H, and $\ell_h$ lies in the dual space $H'$ of H and we must find $u_h \in U^h$
such that

$$ \langle \ell_h, Au_h - f\rangle = 0. \qquad (10.38) $$

The collocation method arises from choosing the functionals $\ell_h$ to be the
N Dirac delta functionals $\delta_{x_j}$, $j = 1,\ldots,N$, defined by

$$ \langle \delta_{x_j}, v\rangle = v(x_j). $$

The Petrov-Galerkin method. This method takes as a point of
departure the formulation (10.29); but whereas U = V, the subspaces $U^h$ and
$V^h$ are distinct. Suppose that these two finite-dimensional subspaces have
bases $\{\phi_i\}_{i=1}^N$ and $\{\psi_i\}_{i=1}^N$; then, following the sequence of manipulations
that lead to (10.8), we once again arrive at the equation

$$ K^Tc = F, $$

but this time the matrix K and vector F are defined by

$$ K_{ij} = a(\phi_i,\psi_j), \qquad F_j = (f, \psi_j). $$

The methods of weighted residuals, least squares, and collocation all possess
advantages that make them, in principle at least, viable alternatives to
the Galerkin method. In particular, in all three cases a greater degree of
smoothness is expected of the approximate solution: if A is an operator of
order 2m, then both weighted residuals and least squares require that
$U^h \subset H^{2m}(\Omega)$, whereas in the case of the collocation method, the assumption
that $Au_h - f \in H_0^1(\Omega)$ will require that $u_h \in H^3(\Omega)$.
The rationale behind opting for the Petrov-Galerkin procedure rather
than the standard Galerkin method is perhaps less clear, since the
advantage of greater smoothness is not present. However, there are various
situations in which the Petrov-Galerkin method provides approximations of far
superior quality. This is particularly true of convection-diffusion problems,
of the form

$$ -k u'' + a u' = f, $$
in which the standard heat or diffusion equation is supplemented by a term


involving the first derivative, and which accounts for convective transport.
When a is much bigger than k, so that convective effects dominate, the
Galerkin method gives solutions that are oscillatory, and bear little relation
to the exact solution. The Petrov-Galerkin method, on the other hand, is
one way of overcoming this deficiency, by a judicious choice of $U^h$ and $V^h$,
the latter often being chosen to be of the form $V^h = U^h \oplus W^h$ in which
$W^h$ is a space of functions that, when added to $U^h$, serve to overcome the
drawbacks of the standard approach.
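
The oscillatory behaviour of the Galerkin method is easy to exhibit. The Python sketch below is an illustration resting on several assumptions that go beyond the text: the model problem is $-ku'' + au' = 0$ on (0,1) with $u(0) = 0$, $u(1) = 1$; the Galerkin method is applied with piecewise-linear hat functions on a uniform mesh of width h, which is known to give the three-point stencil $-(k/h)(u_{i-1} - 2u_i + u_{i+1}) + (a/2)(u_{i+1} - u_{i-1}) = 0$; and the remedy shown is the simplest one, an artificial diffusion $ah/2$ added to k, which reproduces upwind differencing and stands in here for a suitably upwinded choice of test functions. The names `solve` and `convection_diffusion_nodes` are hypothetical.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def convection_diffusion_nodes(k, a, n, artificial=0.0):
    """Nodal values for -k u'' + a u' = 0, u(0) = 0, u(1) = 1, on n uniform
    elements, with an optional artificial diffusion added to k."""
    h = 1.0 / n
    kk = k + artificial
    lower, diag, upper = -kk / h - a / 2, 2 * kk / h, -kk / h + a / 2
    m = n - 1                                  # number of interior nodes
    A = [[0.0] * m for _ in range(m)]
    b = [0.0] * m
    for i in range(m):
        A[i][i] = diag
        if i > 0:
            A[i][i - 1] = lower
        if i < m - 1:
            A[i][i + 1] = upper
    b[m - 1] = -upper * 1.0                    # boundary value u(1) = 1 moved to the right
    return [0.0] + solve(A, b) + [1.0]

k, a, n = 0.01, 1.0, 10                        # mesh Peclet number a*h/(2k) = 5
u_gal = convection_diffusion_nodes(k, a, n)
u_up = convection_diffusion_nodes(k, a, n, artificial=a / (2 * n))
print(u_gal)
print(u_up)
```

With these values the Galerkin nodal values alternate in sign, whereas the exact solution increases monotonically from 0 to 1; the modified scheme restores monotonicity, at the cost of smearing the boundary layer.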

$H^{-1}$-methods. Unlike the methods discussed previously, the $H^{-1}$-method
is based on the formulation (10.32) in which all derivatives are transferred
to v by means of successive applications of Green's theorem. Working in
finite-dimensional subspaces, we now seek $u_h \in U^h$ such that

$$ (u_h, A^*v_h) = (f, v_h) \quad\text{for all } v_h \in V^h, \qquad (10.39) $$

where $U^h \subset L^2(\Omega)$ and $V^h \subset H_0^{2m}(\Omega)$. When using this approach we
are free to choose $U^h$ in such a way that some or all of its members are
discontinuous, while at the same time the functions making up $V^h$ have to
belong to $H_0^{2m}(\Omega)$.
Returning to the earlier example (10.33), the $H^{-1}$-method entails finding
$u_h \in U^h \subset L^2(0,1)$ such that

$$ \int_0^1 (-u_hv_h'' + u_hv_h)\,dx = \int_0^1 fv_h\,dx $$

for all $v_h \in V^h \subset H_0^2(0,1)$. So, for example, we could choose as a basis
for $U^h$ a set of piecewise-constant functions whereas for $V^h$ we need a
basis of functions that are at least continuously differentiable and which,
together with their first derivatives, vanish on the boundary.
In Exercise 10.10 the methods introduced here are applied to a simple
problem.

10.4 Bibliographical remarks


There is a large body of literature on the Galerkin and Rayleigh-Ritz
methods, with texts ranging from those that deal mainly with computational
aspects to those that also consider questions of convergence. The texts by
Dhatt and Touzot [14] and Hughes [22] are good examples of the former,
and Johnson [23] and Strang and Fix [51] cover both aspects. The books by
Becker, Carey and Oden [5] and Oden and Carey [37] all give full coverage
of the Galerkin method, although they emphasize the use of the method
as part of the finite element method. Further details on the alternative
methods discussed in Section 10.3 may be found in Carey and Oden [10]
and in Rektorys [41]. The Petrov-Galerkin method has been the subject
of extensive research in the context of the finite element method, and is
discussed in the text by Johnson [23]. An interesting application of the
method to the finite element analysis of elastic beams is presented in the
paper by Loula, Hughes, and Franca [32].

10.5 Exercises
The Galerkin method
10.1. The BVP $(xu')' = x$ in $\Omega = (1,2)$, $u(1) = u(2) = 0$, has the exact
solution $u(x) = \tfrac14x^2 - (3\ln x/4\ln 2) - \tfrac14$.
Use the Galerkin method to
find an approximate solution $u_h$ in the subspace of $H_0^1(1,2)$ spanned
by $\phi_1(x) = (x-1)(x-2)$ and $\phi_2(x) = x(x-1)(x-2)$. Compare the
exact and approximate solutions by (a) sketching graphs of u and $u_h$;
(b) evaluating the errors $\|e\|_{L^\infty}$, $\|e\|_{L^2}$, and $\|e\|_{H^1}$, where $e = u - u_h$.
10.2. Use the Galerkin method with basis functions $\phi_1(x,y) = (-x^2 + x)(-y^2 + y)$
and $\phi_2(x,y) = (x^3 - \tfrac32x^2 + \tfrac12x)(y^3 - \tfrac32y^2 + \tfrac12y)$ to solve
the BVP

$$ -\nabla^2u = xy \ \text{on } \Omega = (0,1)\times(0,1), \qquad u = 0 \ \text{on } \Gamma. $$

10.3. Let $J : H \to \mathbb{R}$ be a functional on a Hilbert space H defined by

$$ J(v) = \tfrac12a(v,v) - \langle \ell, v\rangle, \qquad v \in H, $$

where $a(\cdot,\cdot)$ is continuous and H-elliptic and $\ell(\cdot)$ is continuous, and
let $\{u_n\}$ be a sequence in H such that

$$ \lim_{n\to\infty}J(u_n) = J(u) \le J(v). $$

Then $\{u_n\}$ is called a minimizing sequence. The aim of this exercise is
to show how such a minimizing sequence can be generated, and also
to show that $u_n \to u$ in the norm $\|\cdot\|_a$.
Let $\{\phi_k\}_{k=1}^\infty$ be an orthonormal basis for H with respect to the inner
product $(\cdot,\cdot)_a = a(\cdot,\cdot)$, and let $H^{(n)} = \operatorname{span}\{\phi_1,\ldots,\phi_n\}$. Let $u_n$ be
the minimizer of J in the space $H^{(n)}$. Show that $u_n = \sum_{k=1}^n \langle \ell,\phi_k\rangle\phi_k$
and that $\langle \ell,\phi_k\rangle = (u,\phi_k)_a$, where u is the minimizer of J in H. Hence
deduce that $u_n$ is the nth partial sum of the Fourier series expansion
for u, and conclude that $\|u_n - u\|_a \to 0$ and $\|u_n - u\|_H \to 0$.
10.4. Use the Rayleigh-Ritz method with basis function $\phi_1(x,y) = (x^2 - \alpha^2)(y^2 - \beta^2)$
to find an approximate solution to the problem of minimizing
the functional

    $J : H_0^2(\Omega) \to \mathbb{R}$,    $J(v) = \tfrac12 \int_{-\alpha}^{\alpha} \int_{-\beta}^{\beta} \left[ (\nabla^2 v)^2 - 2\,\dfrac{q}{D}\, v \right] dx\,dy$,

corresponding to the problem of deflection of an elastic plate occupying
the domain $\Omega = (-\alpha, \alpha) \times (-\beta, \beta)$. The corresponding classical
problem is

    $\nabla^4 u = q/D$ in $\Omega$,
    $u = \partial u/\partial \nu = 0$ on $\Gamma$,

and the exact solution satisfies $u(0,0) = 0.0202\, q\alpha^4/D$ at the origin,
if $\alpha = \beta$. Compare this with the approximate solution.
Properties of Galerkin approximations
10.5. Given the VBVP

    $a(u, v) = \langle \ell, v\rangle$,    $v \in V$,

in which $a$ is symmetric and $V$-elliptic, show that the Galerkin
approximation $u_h$ satisfies

    $\|u - u_h\|_a^2 = \|u\|_a^2 - \|u_h\|_a^2$;

that is, the error in the energy norm equals the error in the energy,
and therefore $\|u_h\|_a \le \|u\|_a$.
10.6. If $u$ minimizes the functional $J : V \to \mathbb{R}$ given by

    $J(v) = \tfrac12 a(v,v) - \langle \ell, v\rangle$,

show that $J(u) = -\tfrac12 a(u,u)$.

10.7. Verify that the operator $P$ defined by (10.21) is a projection.

10.8. Let $V^h$ be the subspace of $H^1(-1,1)$ spanned by the three functions

    $\phi_1(x) = \tfrac12 x(x - 1)$,    $\phi_2(x) = 1 - x^2$,    $\phi_3(x) = \tfrac12 x(x + 1)$,

and define the interpolate $u_h$ of $u \in H^1(-1,1)$ by

    $P_h u = u_h = \sum_{k=1}^{3} a_k \phi_k(x)$,

where $a_k = u(x_k)$ and $x_1 = -1$, $x_2 = 0$, $x_3 = +1$. Find $u_h(x)$ if


u(x) = sin(7f/2)(x - ~), and evaluate Ilu - UhliHl.
Other methods of approximation
10.9. Use the Green's formula

    $(Au, v)_{L^2} = (u, A^*v) + G(u, v)$

(see Section 8.4) to show that the least-squares problem (10.36) is
equivalent to the higher-order problem

    $A^*A u_h = A^* f$ in $\Omega$,

plus appropriate boundary conditions. If $A = -\nabla^2$ (the Laplacian),
show that this is equivalent to the problem

    $\nabla^4 u_h = -\nabla^2 f$,

where $\nabla^4$ is the biharmonic operator.


10.10. Consider the problem

    $-u'' + u = \sin x$ in $\Omega = (0,1)$,
    $u(0) = u(1) = 0$.

(a) Reformulate this problem in a manner suitable for solution using
the method of weighted residuals. Then find an approximate
solution in a two-dimensional subspace spanned by polynomials
of appropriate order (that is, make suitable choices for $U^h$ and
$V^h$ in (10.35)).
(b) Find approximate solutions using the method of least squares
and the method of collocation.
(c) Apply integration by parts successively to transform the problem
into one suitable for solution using $H^{-1}$ methods. Again
choose suitable subspaces $U^h$ and $V^h$, and find an approximate
solution.
Part III

The Finite Element Method

11
The finite element method

In practical situations the determination of suitable basis functions for use
in the Galerkin method can be extremely difficult, especially in cases for
which the domain $\Omega$ does not have a simple shape. The finite element
method overcomes this difficulty by providing a systematic means for
generating basis functions on domains of fairly arbitrary shape. What makes
the method especially attractive is the fact that these basis functions are
piecewise polynomials that are nonzero only on a relatively small part of $\Omega$,
so that computations may be carried out in a modular fashion, which is well
suited to computer-based approaches. As we show a little later, the family
of spaces $V^h$ ($h \in (0,1)$) defined by the finite element procedure possesses
the property that $V^h$ approaches $V$ as $h$ approaches zero, in an appropriate
sense. This is, of course, an indispensable property for convergence of the
Galerkin method.
In Section 11.1 we outline the steps that lead to the construction of finite
element bases for second-order BVPs defined on domains in $\mathbb{R}$ and $\mathbb{R}^2$. This
is the simplest and most commonly encountered class of problems, and most
of the features of finite element approximations can be demonstrated in this
context. The generalization to higher-order BVPs and to problems in $\mathbb{R}^n$
for $n \ge 3$ follows in a natural way, and is covered in detail in all texts
dealing with the finite element method.
Then in Sections 11.2 and 11.3 we describe, for problems in $\mathbb{R}$ and $\mathbb{R}^2$,
respectively, some commonly used basis functions. We also show by means
of examples how these are used to solve actual boundary value problems.
Section 11.4 is devoted to the construction of bases that are relevant
to fourth-order problems; for this case it is necessary to impose additional
continuity requirements, which in turn lead to bases that are quite different
from those used in second-order problems.
The elements discussed in Sections 11.2 to 11.4 are all polygonal (in two
dimensions). Domains with curved boundaries can, however, be accommodated
by resorting to the use of isoparametric elements; exactly how this
is done is the subject of Section 11.5.
The final section of this chapter deals with an issue that is central to all
finite element computations, viz. numerical integration. For general problems,
involving inhomogeneous media, say, or for domains that are approximated
using isoparametric elements, the integrals that arise cannot be
evaluated exactly. For this reason it makes sense to resort to methods that
allow the integrals to be evaluated approximately, although with a degree
of accuracy that can be estimated and therefore improved upon if desired.

11.1 The finite element method for second-order problems

Suppose that we have a VBVP of the form: find $u \in V$ such that

    $a(u,v) = \langle \ell, v\rangle$ for all $v \in V$.    (11.1)

For a second-order BVP the space of admissible functions $V$ consists of all
those functions in $H^1(\Omega)$ that satisfy the essential boundary conditions.
A Galerkin approximation $u_h$ to the solution of (11.1) may be sought
by constructing a finite-dimensional subspace $V^h$ of $V$, which is spanned
by a finite number of basis functions $N_i$, and we then pose the problem of
finding $u_h \in V^h$ that satisfies

    $a(u_h, v_h) = \langle \ell, v_h\rangle$ for all $v_h \in V^h$.    (11.2)

If $\{N_i\}_{i=1}^{N}$ is a basis for $V^h$, then we have seen in Chapter 10 that expansion
of $u_h$ and $v_h$ in terms of this basis and substitution in (11.2) leads to the
set of simultaneous linear equations

    $\sum_{j=1}^{N} K_{ij} c_j = F_i$,    $i = 1, \dots, N$,    (11.3)

or, in matrix form,

    $Kc = F$,

where, as before,

    $K_{ij} = a(N_i, N_j)$,    $F_i = \langle \ell, N_i\rangle$.    (11.4)
FIGURE 11.1. A polygonal domain in $\mathbb{R}^2$ and its subdivision into finite elements

The aim of this section is to describe a method for constructing those
special bases $\{N_i\}_{i=1}^{N}$ that are associated with the finite element method.

The finite element mesh. We start by partitioning the domain $\Omega$ into
a finite number $E$ of subdomains $\Omega_1, \Omega_2, \dots, \Omega_E$, called finite elements.
These elements are nonoverlapping and cover $\Omega$, in the sense that

    $\Omega_e \cap \Omega_f = \emptyset$ for $e \ne f$,    $\bigcup_{e=1}^{E} \bar\Omega_e = \bar\Omega$.

To avoid complicating matters unnecessarily, we assume that the domain $\Omega$
is polygonal if it is a subset of $\mathbb{R}^2$. That is, the boundary $\Gamma$ of $\Omega$ is made up
of straight segments. Under these conditions, it is easy to see that the entire
domain can be covered exactly by polygonal elements (Figure 11.1). One
more condition is imposed on the subdivision of $\Omega$: it is required that every
side of the boundary of an element in $\mathbb{R}^2$ be either part of the boundary
$\Gamma$, or a side of another element. This condition rules out a situation such
as that shown in Figure 11.2, in which $AB$ is a side of $\Omega_2$ but not of $\Omega_1$.

Nodal points. We next identify certain points called nodes or nodal points
in the subdivided domain; these points play a key role in the finite element
method, as will soon become evident. Nodes are allocated at least at the
vertices of elements, as shown in Figure 11.3(a), but in order to improve
the approximation, further nodes may be introduced, for example, at the
midpoints of the sides of elements as shown in Figure 11.3(b). In any case
there is a total of $G$ nodes, say, which are numbered $1, 2, \dots, G$ and which
have position vectors $\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_G$. The set of elements and nodes that
make up the domain $\Omega$ is called a finite element mesh.

FIGURE 11.2. An inadmissible subdivision

FIGURE 11.3. Finite element meshes comprising elements and nodal points (panels (a) and (b))

Basis functions $N_i$. We are now ready to describe how the finite element
basis functions are formed. In carrying out this procedure it must be borne
in mind that the basis functions define a subspace of $V$, so that they must
be functions in $H^1(\Omega)$ (for second-order problems) that satisfy the essential
boundary conditions. The question of boundary conditions is left aside for
now, and we proceed to construct a set of basis functions with the following
properties.
(i) The functions $N_i$ are bounded and continuous, that is,

    $N_i \in C(\bar\Omega)$;    (11.5)

(ii) there is a total of $G$ basis functions, that is, one for each node, and
each function $N_i$ is nonzero only on those elements that are connected
to node $i$:

    $N_i(\mathbf{x}) = 0$ for $\mathbf{x} \in \Omega_e$ if node $i$ does not belong to $\bar\Omega_e$;    (11.6)

(iii) $N_i$ is equal to 1 at node $i$, and equal to zero at the other nodes:

    $N_i(\mathbf{x}_j) = 1$ if $i = j$, and $0$ otherwise;    (11.7)

(iv) the restriction $N_i^{(e)}$ of $N_i$ to $\Omega_e$ is a polynomial:

    $N_i|_{\Omega_e} \equiv N_i^{(e)}$,    $N_i^{(e)} \in P_k(\Omega_e)$ for some $k \ge 1$,    (11.8)

where $P_k(\Omega_e)$ is the space of polynomials of degree at most $k$ on $\Omega_e$.

From (iii) and (iv) it is clear that the function $N_i^{(e)}$ defined on element
$\Omega_e$ will have the property that

    $N_i^{(e)}(\mathbf{x}_j) = 1$ if $i = j$, and $0$ otherwise,    (11.9)

$i$ and $j$ running over all nodes in $\Omega_e$. We call $N_i^{(e)}$ a local basis function.
These ideas are illustrated in Figure 11.4. It is not difficult to show that
Conditions (i) and (iv) ensure that the functions $N_i$ belong to $H^1(\Omega)$, as
required. We are thus going to set up basis functions that are piecewise
polynomials, and that have small supports, in that they are nonzero only
in a "small" region. It should be clear that we may regard a typical basis
function $N_i$ as having been built up by patching together the local basis
functions $N_i^{(e)}$ associated with node $i$, as shown in Figure 11.5. To distinguish
the basis functions $N_i$ from the local basis functions $N_i^{(e)}$, we refer
to the former as global basis functions.
FIGURE 11.4. Local and global basis functions

FIGURE 11.5. A basis function $N_i$ formed by patching together local basis functions $N_i^{(e)}$


Specific examples are given in the following section, but in the meantime
we observe that if we write

    $v_h(\mathbf{x}) = \sum_{i=1}^{G} b_i N_i(\mathbf{x})$,    (11.10)

then as a consequence of (11.7),

    $v_h(\mathbf{x}_j) = \sum_{i=1}^{G} b_i N_i(\mathbf{x}_j) = b_j$;    (11.11)

that is, the coefficient $b_j$ is simply the value of $v_h$ at node $j$.
We denote by $X^h$ the space spanned by the basis functions $\{N_i\}_{i=1}^{G}$.
Note that nothing has as yet been done about the essential boundary
conditions, which are required to form part of the definition of $V^h$. This is now
addressed by simply setting

    $V^h = \{v_h \in X^h : v_h \text{ satisfies essential BCs}\}$
        $= \mathrm{span}\,\{N_i : N_i \text{ satisfies essential BCs}\}$.    (11.12)

In the same way we denote by $X_e$ the space spanned by the functions $N_i^{(e)}$:

    $X_e = \mathrm{span}\,\{N_i^{(e)}\} = \{v_h|_{\Omega_e} : v_h \in X^h\}$;

that is, $X_e$ is the space consisting of the restriction of all functions in $X^h$
to $\Omega_e$. In view of Condition (iv) we see that $X_e$ consists of all polynomials
up to a given degree.

The approximate solution. We go straight to (11.3) and (11.4) and
note that the bilinear form $a(N_i, N_j)$ will be an integral over $\Omega$, and that
this integral can in turn be written as a sum of integrals over $\Omega_e$. Denoting
the integrand by $F(N_i, N_j)$, we thus have

    $a(N_i, N_j) = \int_{\Omega} F(N_i, N_j)\,dx$
        $= \sum_{e=1}^{E} \int_{\Omega_e} F(N_i, N_j)\,dx$
        $= \sum_{e=1}^{E} \int_{\Omega_e} F(N_i^{(e)}, N_j^{(e)})\,dx$
        $= \sum_{e=1}^{E} a^{(e)}(N_i^{(e)}, N_j^{(e)})$,

in which (11.8) has been introduced in the penultimate line. Likewise, if
the integrand appearing in $\langle \ell, N_j\rangle$ is denoted by $g(N_j)$, then

    $\langle \ell, N_j\rangle = \int_{\Omega} g(N_j)\,dx = \sum_{e=1}^{E} \int_{\Omega_e} g(N_j^{(e)})\,dx = \sum_{e=1}^{E} \langle \ell^{(e)}, N_j^{(e)}\rangle$.

In other words, the matrix $K$ and vector $F$ in (11.4) can now be expressed
in the form

    $K = \sum_{e=1}^{E} K^{(e)}$ and $F = \sum_{e=1}^{E} F^{(e)}$,    (11.13)

in which

    $K_{ij}^{(e)} = a^{(e)}(N_i^{(e)}, N_j^{(e)})$ and $F_j^{(e)} = \langle \ell^{(e)}, N_j^{(e)}\rangle$.    (11.14)

So the actual evaluation of $K_{ij}$ and $F_j$ reduces to the evaluation of a number
of matrices $K_{ij}^{(e)}$ and vectors $F_j^{(e)}$ for each element, and then summing these
contributions over all elements. Condition (ii) (equation (11.6)) results
in an additional simplifying feature: since $N_i = 0$ on all elements that do
not have node $i$ as a node, clearly $K_{ij}^{(e)} = 0$ if nodes $i$ and $j$ do not both belong to
$\Omega_e$. It follows that a judicious numbering of nodes will result in the matrix
$K$ having a banded structure in which all nonzero entries are clustered
around the main diagonal. From a computational viewpoint this represents
a distinct advantage.
The matrix K is known as the stiffness matrix and F is known as the
load vector; this terminology derives from the early development of the
finite element method in the context of structural mechanics, and tends to
be used whatever the actual physical context. Likewise, the matrix K(e)
and vector F(e) are referred to, respectively, as the element stiffness matrix
and element load vector.
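
The summation in (11.13) can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function name `assemble`, the zero-based node numbering, and the made-up 2 x 2 element matrix and element vector are all assumptions introduced here and do not come from the text.

```python
def assemble(num_nodes, elements, element_matrix, element_vector):
    """Sum element contributions K^(e), F^(e) into global K and F, as in (11.13).

    elements[e] lists the global node numbers (zero-based) of element e;
    element_matrix(e) and element_vector(e) return K^(e) and F^(e) in
    local numbering.
    """
    K = [[0.0] * num_nodes for _ in range(num_nodes)]
    F = [0.0] * num_nodes
    for e, nodes in enumerate(elements):
        Ke = element_matrix(e)
        Fe = element_vector(e)
        for I, i in enumerate(nodes):        # local index I -> global index i
            F[i] += Fe[I]
            for J, j in enumerate(nodes):    # local index J -> global index j
                K[i][j] += Ke[I][J]
    return K, F


# Hypothetical data: three two-node elements on a line, all sharing the same
# illustrative element matrix and element vector.
elements = [(0, 1), (1, 2), (2, 3)]
K, F = assemble(4, elements,
                lambda e: [[1.0, -1.0], [-1.0, 1.0]],
                lambda e: [0.5, 0.5])
```

With sequential node numbering the assembled matrix in this sketch is tridiagonal, which is an instance of the banded structure described above.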
In the next section we give details of a systematic procedure for setting
up local basis functions $N_i^{(e)}$, and hence also the basis functions $N_i$, for the
case of one-dimensional problems.
FIGURE 11.6. The domain $(a, b)$ and its subdivision into elements (nodes numbered $1, 2, 3, \dots, E-1, E, E+1$)

FIGURE 11.7. A typical member of $X^h$

11.2 One-dimensional problems

Consider a second-order problem defined on a subset $\Omega = (a, b)$ of the real
line. The domain is divided into elements $\Omega_1, \Omega_2, \dots, \Omega_E$, each element $\Omega_e$
being a segment of length $h_e$, say. Suppose now that we would like $X^h$
to be the space of piecewise polynomials of degree one, that is, piecewise
straight lines. Then it follows that $X_e$ is the same as $P_1(\Omega_e)$. Furthermore,
since every straight line is of the form $f(x) = a + bx$, a knowledge of the
value of $f$ at two points in $\Omega_e$ suffices to determine $f$ uniquely on $\Omega_e$. With
this in mind we define nodal points at the ends of all elements, so that
each element has two nodal points. As shown in Figure 11.6, this simple
arrangement allows the nodes to be numbered sequentially in such a way
that $\Omega_e$ will be connected to nodes $e$ and $e + 1$.
The space $X_e$ has dimension two, so the next step is to define two basis
functions $N_i^{(e)}$ and $N_{i+1}^{(e)}$ that satisfy (11.9). The only functions that fit all
requirements are

    $N_i^{(e)}(x) = \dfrac{x_{i+1} - x}{h_e}$,    $N_{i+1}^{(e)}(x) = \dfrac{x - x_i}{h_e}$.    (11.15)

By patching together the functions in the manner outlined in the previous
section we see that each basis function $N_i(x)$ is a piecewise linear "hat"
function made up of the local basis functions associated with node $i$. Hence
every function $v_h$ in $X^h$ is a piecewise linear function of the form

    $v_h(x) = \sum_{i=1}^{G} d_i N_i(x)$

in which $d_i$ is the value of $v_h$ at node $i$ (Figure 11.7).
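
The hat functions and the expansion of $v_h$ can be illustrated with a short Python sketch. The names `hat` and `v_h`, the zero-based node indices, and the sample nodal values are assumptions made for this illustration only.

```python
def hat(i, x, nodes):
    """Global piecewise linear basis function N_i evaluated at x.

    nodes is the increasing list of nodal coordinates; i is zero-based.
    """
    x_i = nodes[i]
    # Rising branch on the element to the left of node i.
    if i > 0 and nodes[i - 1] <= x <= x_i:
        return (x - nodes[i - 1]) / (x_i - nodes[i - 1])
    # Falling branch on the element to the right of node i.
    if i < len(nodes) - 1 and x_i <= x <= nodes[i + 1]:
        return (nodes[i + 1] - x) / (nodes[i + 1] - x_i)
    return 0.0


def v_h(x, d, nodes):
    """Piecewise linear function with nodal values d, as a sum of d_i * N_i(x)."""
    return sum(d[i] * hat(i, x, nodes) for i in range(len(nodes)))


# Hypothetical mesh of three equal elements on (0, 1) and arbitrary nodal values.
nodes = [0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0]
d = [0.0, 2.0, -1.0, 0.5]
values_at_nodes = [v_h(x, d, nodes) for x in nodes]
```

In this sketch `values_at_nodes` reproduces `d`, which is the statement that each coefficient $d_i$ is the value of $v_h$ at node $i$.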

The reference element. Instead of defining local basis functions for each
element as in (11.15), matters can be simplified considerably by setting up
a reference element $\hat\Omega$, say, which is isolated from the actual finite element
mesh and which is referred to its own coordinate system $\xi$. The reference
element extends from $\xi = -1$ to $\xi = +1$ and has the same system of nodal
points as the elements $\Omega_e$ in the actual mesh (two nodes in this case). This
situation is shown in Figure 11.8. Each element $\Omega_e$ can now be thought of
as having been generated by an invertible map $F_e$ from $\hat\Omega$ to $\Omega_e$, of the
form

    $x = F_e(\xi) = x_e + \tfrac12 h_e \xi$,    (11.16)

where $x_e$ is the coordinate of the center of $\Omega_e$ ($x_e = \tfrac12(x_i + x_{i+1})$ in Figure
11.8) and $h_e$ is the length of $\Omega_e$. Thus as $\xi$ goes from $-1$ to $+1$, $x$ goes
from $x_i$ to $x_{i+1}$. In particular, nodes 1 and 2 on the reference element are
mapped, respectively, to nodes $i$ and $i + 1$ connected to element $\Omega_e$.
The map $F_e$ is an example of an affine map; a map or function $f$ defined
on the real line is said to be affine if it is of the form $f(x) = a + bx$,
where $a$ and $b$ are constants. Such a map transforms an interval most
generally by stretching and translating it, the constant term accounting for
the translation (Figure 11.8).

FIGURE 11.8. Reference element $\hat\Omega$ and the image $\Omega_e$ under the affine map $F_e$
One advantage of introducing a reference element in this way is that we
can define, once and for all, local basis functions $\hat N_1$ and $\hat N_2$ on $\hat\Omega$ that have
the requisite properties, and having done this simply use (11.16) to map
$\hat N_1$ and $\hat N_2$ to $N_i^{(e)}$ and $N_{i+1}^{(e)}$, respectively, for each element, by defining
$N_i^{(e)}$ and $N_{i+1}^{(e)}$ to be functions on $\Omega_e$ satisfying

    $N_i^{(e)}(F_e(\xi)) = \hat N_1(\xi)$ and $N_{i+1}^{(e)}(F_e(\xi)) = \hat N_2(\xi)$,    (11.17)

or

    $N_i^{(e)}(x) = \hat N_1(\xi)$,    $N_{i+1}^{(e)}(x) = \hat N_2(\xi)$,    (11.18)

in which $x$ and $\xi$ are related through (11.16) (Figure 11.9).

FIGURE 11.9. Local basis functions defined on the reference element $\hat\Omega$ and on the element $\Omega_e$

For a piecewise linear basis we thus define

    $\hat N_1(\xi) = \tfrac12(1 - \xi)$,
    $\hat N_2(\xi) = \tfrac12(1 + \xi)$,    (11.19)

and from (11.18) and (11.16) we then recover (11.15) since, for example,

    $N_i^{(e)}(x) = \hat N_1(\xi) = \tfrac12(1 - \xi) = \tfrac12\left(1 - \dfrac{2(x - x_e)}{h_e}\right) = \dfrac{x_{i+1} - x}{h_e}$,

as is readily verified. In future, local basis functions are always defined
on a reference element, with the assumption that the actual local basis
functions can be recovered by means of a relation such as (11.17). Later
we show that the process of evaluating the matrices and vectors defined in
(11.13) and (11.14) is also rendered more straightforward by carrying out
such integrations on the reference element.
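
As a rough check on the relation between the reference element and an actual element, the following Python sketch maps between $\xi$ and $x$ using (11.16) and evaluates the local basis functions through the linear reference functions of (11.19). The function names and the sample element $(1/3, 2/3)$ are assumptions chosen for illustration.

```python
def F_e(xi, x_left, x_right):
    """Affine map (11.16): reference coordinate xi in [-1, 1] to x in [x_left, x_right]."""
    x_c = 0.5 * (x_left + x_right)   # center of the element
    h = x_right - x_left             # length of the element
    return x_c + 0.5 * h * xi


def F_e_inverse(x, x_left, x_right):
    """Inverse of the affine map: x in [x_left, x_right] to xi in [-1, 1]."""
    x_c = 0.5 * (x_left + x_right)
    h = x_right - x_left
    return 2.0 * (x - x_c) / h


def N1_hat(xi):
    return 0.5 * (1.0 - xi)


def N2_hat(xi):
    return 0.5 * (1.0 + xi)


def local_basis(x, x_left, x_right):
    """Local basis functions on the element, obtained from the reference functions."""
    xi = F_e_inverse(x, x_left, x_right)
    return N1_hat(xi), N2_hat(xi)


# Hypothetical element (1/3, 2/3): the mapped functions agree with (11.15).
x_l, x_r = 1.0 / 3.0, 2.0 / 3.0
n_left, n_right = local_basis(0.4, x_l, x_r)
direct = ((x_r - 0.4) / (x_r - x_l), (0.4 - x_l) / (x_r - x_l))
```

Here `direct` evaluates the formulas of (11.15) without the reference element, and the two pairs of values agree to rounding error.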
The procedure is readily extended to higher-order approximations. For
example, suppose that we wish to construct a space of piecewise quadratic
functions, with the restrictions to each element being a quadratic function
(that is, a member of $P_2(\Omega_e)$). Every function $f \in X_e$ is thus of the form
$f(x) = a + bx + cx^2$ and so specification of $f$ at three points in $\Omega_e$
determines the function uniquely in $\Omega_e$. Hence we place nodes at the ends as
well as at the midpoints of elements, so that an arbitrary element will have
associated with it nodes $i$, $i + 1$, and $i + 2$.
Next, a reference element $\hat\Omega$ is set up with nodes at the ends and at the
center, and we define quadratic basis functions

    $\hat N_1(\xi) = \tfrac12 \xi(\xi - 1)$,
    $\hat N_2(\xi) = 1 - \xi^2$,    (11.20)
    $\hat N_3(\xi) = \tfrac12 \xi(\xi + 1)$,

as shown in Figure 11.10.
Naturally the functions in (11.20) satisfy (11.9), as do the local basis
functions $N_i^{(e)}$, $N_{i+1}^{(e)}$, $N_{i+2}^{(e)}$, which are generated using

    $N_i^{(e)}(x) = \hat N_1(\xi)$,    $N_{i+1}^{(e)}(x) = \hat N_2(\xi)$,    $N_{i+2}^{(e)}(x) = \hat N_3(\xi)$.

A few typical piecewise quadratic basis functions $N_i$ that result from patching
together the quadratic local basis functions are also shown in Figure
11.10.
FIGURE 11.10. Quadratic local basis functions and piecewise quadratic global functions

Polynomial bases of the form (11.19) and (11.20) are often referred to as
Lagrange bases or families, because of their close association with Lagrange
interpolation. Generally, if the interval $(-1, 1)$ has a total of $K + 1$ equally
spaced nodes, then the Lagrange polynomial corresponding to node $I$ is
given by

    $\hat N_I(\xi) = \prod_{J=1,\ J \ne I}^{K+1} \dfrac{\xi - \xi_J}{\xi_I - \xi_J}$;

that the requirement (11.9) is indeed met can be deduced by inspection.
This general formula allows piecewise cubic and higher-order approximations
to be generated in a systematic manner.
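
The general Lagrange formula is easy to try out numerically. The sketch below is an illustration under stated assumptions: it uses the standard product form of the Lagrange polynomial, zero-based node indices, and the hypothetical helper names `equally_spaced_nodes` and `lagrange_basis`.

```python
def equally_spaced_nodes(K):
    """K + 1 equally spaced nodes on the reference interval [-1, 1]."""
    return [-1.0 + 2.0 * J / K for J in range(K + 1)]


def lagrange_basis(I, xi, nodes):
    """Lagrange polynomial for node I (zero-based), evaluated at xi."""
    value = 1.0
    for J, xi_J in enumerate(nodes):
        if J != I:
            value *= (xi - xi_J) / (nodes[I] - xi_J)
    return value


# Cubic case (K = 3): each basis function is 1 at its own node and 0 at the others.
cubic_nodes = equally_spaced_nodes(3)
delta = [[lagrange_basis(I, xi_J, cubic_nodes) for xi_J in cubic_nodes]
         for I in range(4)]

# Quadratic case (K = 2): the first function coincides with xi * (xi - 1) / 2.
quad_nodes = equally_spaced_nodes(2)
sample = lagrange_basis(0, 0.3, quad_nodes)
```

The table `delta` is the identity matrix up to rounding, which is the property required by (11.9).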
We now show how an approximate solution to a two-point BVP may be
found, using piecewise linear basis functions.

Example
1. Consider the BVP

    $-u'' + u = \sin \pi x$,    $x \in \Omega = (0,1)$,    (11.21)
    $u(0) = u(1) = 0$.

The corresponding VBVP is: find $u \in V = H_0^1(0,1)$ such that

    $\int_0^1 (u'v' + uv)\,dx = \int_0^1 v \sin \pi x\,dx$ for all $v \in H_0^1(0,1)$,

and the Galerkin approximation is: find $u_h \in V^h$ such that

    $\int_0^1 (u_h' v_h' + u_h v_h)\,dx = \int_0^1 v_h \sin \pi x\,dx$ for all $v_h \in V^h$,

where $V^h = \{v_h \in X^h : v_h(0) = v_h(1) = 0\}$.
Suppose that we use three elements for this problem. Since $X^h$ is to
be spanned by piecewise linear basis functions, nodes are required at
the ends of elements only, and the basis functions $N_i$ are formed by
patching together the functions defined in (11.15). These functions
span $X^h$.
Next, we construct $V^h$ by requiring that (see (11.12))

    $V^h = \mathrm{span}\,\{N_2, N_3\}$,

since $N_1$ and $N_4$ do not satisfy (11.21)$_2$. Hence every member of $V^h$
is a linear combination of $N_2$ and $N_3$ (Figure 11.11). From (11.13)
and (11.14) we have, for $i$ and $j = 2, 3$,

    $K_{ij} = \sum_{e=1}^{3} K_{ij}^{(e)} = \int_0^{1/3} \left[ N_i^{(1)\prime} N_j^{(1)\prime} + N_i^{(1)} N_j^{(1)} \right] dx$
        $+ \int_{1/3}^{2/3} \left[ N_i^{(2)\prime} N_j^{(2)\prime} + N_i^{(2)} N_j^{(2)} \right] dx$
        $+ \int_{2/3}^{1} \left[ N_i^{(3)\prime} N_j^{(3)\prime} + N_i^{(3)} N_j^{(3)} \right] dx$,

using the fact that the restriction of $N_i$ to element $\Omega_e$ is $N_i^{(e)}$. Of
course, $N_i^{(e)} = 0$ if node $i$ is not a node of $\Omega_e$.
Now recall that $K_{ij}^{(e)} = 0$ if either node $i$ or node $j$ does not belong to
$\Omega_e$. Hence the only nonzero contributions that have to be calculated
are $K_{22}^{(1)}$, $K_{22}^{(2)}$, $K_{23}^{(2)}$, $K_{33}^{(2)}$, $K_{33}^{(3)}$ (note that $K_{ij}$ is symmetric).

FIGURE 11.11. A typical member of $V^h$
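The computational work is facilitated by evaluating integrals on the
reference element; thus

    $K_{22}^{(1)} = \int_{\Omega_1} \left( \dfrac{dN_2^{(1)}}{dx} \dfrac{dN_2^{(1)}}{dx} + N_2^{(1)} N_2^{(1)} \right) dx$
        $= \int_{\hat\Omega} \left( \dfrac{d\hat N_2}{d\xi} \dfrac{d\hat N_2}{d\xi} \left( \dfrac{2}{h_1} \right)^2 + \hat N_2 \hat N_2 \right) \dfrac{h_1}{2}\,d\xi$,    (11.22)

using the fact that

    $\dfrac{dN_2^{(1)}}{dx} = \dfrac{d\hat N_2}{d\xi} \dfrac{d\xi}{dx}$ and $d\xi = \dfrac{2}{h_1}\,dx$

(here $h_e = \tfrac13$ for all elements). We now substitute for $\hat N_2$ from (11.19)
and integrate between $\xi = -1$ and $\xi = +1$ to get $K_{22}^{(1)} = 28/9$. After
transforming to $\hat\Omega$, it is found that $K_{33}^{(2)}$ is exactly the same as (11.22);
$K_{22}^{(2)}$ and $K_{33}^{(3)}$ differ only in that $\hat N_2$ is replaced by $\hat N_1$ in (11.22), but
since $(\hat N_1)^2$ and $(\hat N_2)^2$ have the same integrals, as do $(\hat N_1')^2$ and $(\hat N_2')^2$,
we arrive at the same answer. Hence

    $K_{22}^{(2)} = K_{33}^{(2)} = K_{33}^{(3)} = \dfrac{28}{9}$.

Collecting all terms, then, we have

    $K_{ij} = \sum_{e=1}^{3} K_{ij}^{(e)} \implies K_{22} = K_{33} = \dfrac{56}{9}$,    $K_{23} = K_{32} = K_{23}^{(2)} = -\dfrac{53}{18}$.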
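
The value $28/9$ can be checked numerically by carrying out the integration in (11.22) on the reference element. The sketch below uses two-point Gauss quadrature, which is a choice made here for illustration (the text evaluates the integral analytically) and which is exact for the quadratic integrands involved; the names `element_stiffness`, `N_hat`, and `dN_hat` are assumptions.

```python
import math

# Two-point Gauss rule on [-1, 1]: points +-1/sqrt(3), both weights equal to 1.
GAUSS_POINTS = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))

# Linear reference basis functions of (11.19) and their constant derivatives.
N_hat = (lambda xi: 0.5 * (1.0 - xi), lambda xi: 0.5 * (1.0 + xi))
dN_hat = (-0.5, 0.5)


def element_stiffness(h):
    """Element matrix for the form int (u'v' + uv) dx on an element of length h."""
    Ke = [[0.0, 0.0], [0.0, 0.0]]
    for I in range(2):
        for J in range(2):
            for xi in GAUSS_POINTS:
                # Integrand of (11.22): derivatives pick up the factor (2/h)^2.
                integrand = (dN_hat[I] * dN_hat[J] * (2.0 / h) ** 2
                             + N_hat[I](xi) * N_hat[J](xi))
                Ke[I][J] += integrand * h / 2.0   # Jacobian dx = (h/2) d(xi)
    return Ke


Ke = element_stiffness(1.0 / 3.0)
diagonal_entry = Ke[1][1]       # corresponds to 28/9
off_diagonal_entry = Ke[0][1]   # coupling between the two nodes of one element
```

Under these assumptions the diagonal entries equal $28/9$ and the off-diagonal entry equals $-53/18$, so that summing over the two elements sharing an interior node gives a diagonal global entry of $56/9$.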

Next we have to evaluate

    $F_i = \int_0^1 N_i \sin \pi x\,dx$.    (11.23)

Now, although evaluation of this integral over each element is a simple
enough matter for this problem, it can be quite tedious in general,
and in the case of more complicated data, impossible to carry out
exactly. We observe, however, that it is a piecewise linear approximation
to the exact solution that is being sought, so it would make
sense to replace $f(x) = \sin \pi x$ by its linear interpolate $f_h$ in $X^h$. This
would enable evaluation of the terms $F_i$ to be carried out very easily
while preserving the piecewise linear nature of the approximation.
The interpolate of $f$ is, we recall, a function of the form

    $f_h(x) = \sum_{i=1}^{4} f_i N_i(x)$,    $f_i = f(x_i)$,

with the property that $f_h$ is linear between nodes and equal to $f$ at
the nodes (Figure 11.12).

FIGURE 11.12. The interpolate of the function $f(x) = \sin(\pi x)$
When $f(x) = \sin \pi x$ we have $f_1 = f_4 = 0$ and $f_2 = f_3 = \sqrt{3}/2$.
Equation (11.23) is thus replaced by

    $F_i = \int_0^{1/3} \tfrac{\sqrt3}{2} N_2^{(1)} N_i^{(1)}\,dx + \int_{1/3}^{2/3} \tfrac{\sqrt3}{2} \left( N_2^{(2)} + N_3^{(2)} \right) N_i^{(2)}\,dx + \int_{2/3}^{1} \tfrac{\sqrt3}{2} N_3^{(3)} N_i^{(3)}\,dx$,

the three integrals being $F_i^{(1)}$, $F_i^{(2)}$, and $F_i^{(3)}$, respectively.
Transforming each of these integrals to integrals on $\hat\Omega$ as before, and
evaluating, we find that

    $F_1^{(1)} = \tfrac{\sqrt3}{36}$,  $F_2^{(1)} = \tfrac{\sqrt3}{18}$,  $F_2^{(2)} = F_3^{(2)} = \tfrac{\sqrt3}{12}$,  $F_3^{(3)} = \tfrac{\sqrt3}{18}$,  $F_4^{(3)} = \tfrac{\sqrt3}{36}$,

the other components of $F_i^{(e)}$ being zero. Thus

    $F = \tfrac{\sqrt3}{18} \left[ (\tfrac12, 1, 0, 0) + (0, \tfrac32, \tfrac32, 0) + (0, 0, 1, \tfrac12) \right] = \tfrac{\sqrt3}{36}(1, 5, 5, 1)$,

the three bracketed vectors coming from $F^{(1)}$, $F^{(2)}$, and $F^{(3)}$, respectively.
Finally, then, we have to solve

    $K_{22} c_2 + K_{23} c_3 = F_2$,
    $K_{32} c_2 + K_{33} c_3 = F_3$,

which gives $c_2 = c_3 = 0.0734$. Hence the approximate solution is

    $u_h(x) = 0.0734\,[N_2(x) + N_3(x)]$.

FIGURE 11.13. The exact and approximate solutions to Example 1

We compare this with the exact solution $u(x) = (1 + \pi^2)^{-1} \sin \pi x$
in Figure 11.13 and see that the finite element solution is a fair
approximation of the exact solution, given the very small subspace $V^h$
that has been used. Furthermore, the approximate solution, like the
exact, is symmetric about $x = \tfrac12$. The approximate solution could
of course be improved in two ways: by subdividing the domain into
a larger number of elements, and by using a higher-order element,
such as the quadratic element. Of course, either of these refinements
will result in a greater amount of computational work, which must
be taken into consideration. It is clearly of interest to know beforehand
by how much an approximate solution will improve as a result
of either of the two refinements mentioned, so that one may decide
whether the refinement is worth the effort. This is a question that is
addressed in the following chapter. In the next section we apply the
ideas developed here to second-order problems defined on domains $\Omega$
in $\mathbb{R}^2$.
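
The whole of Example 1 can be reproduced in a few lines. The sketch below is an illustration rather than the procedure of the text: it assumes the closed-form element matrices for linear elements of length $h$ (stiffness plus mass for the form $\int (u'v' + uv)\,dx$, and the mass matrix applied to the nodal values of the interpolated load), uses zero-based node indices, and solves the resulting 2 x 2 system by Cramer's rule.

```python
import math

h = 1.0 / 3.0
nodes = [0.0, h, 2.0 * h, 1.0]

# Element matrix for int (u'v' + uv) dx with linear basis functions (assumed closed form).
Ke = [[1.0 / h + h / 3.0, -1.0 / h + h / 6.0],
      [-1.0 / h + h / 6.0, 1.0 / h + h / 3.0]]
# Element mass matrix, used to integrate the interpolated load f_h against N_i.
Me = [[h / 3.0, h / 6.0],
      [h / 6.0, h / 3.0]]

f = [math.sin(math.pi * x) for x in nodes]   # nodal values of the load

# Assembly over the three elements; element e joins nodes e and e + 1.
K = [[0.0] * 4 for _ in range(4)]
F = [0.0] * 4
for e in range(3):
    for I in range(2):
        i = e + I
        F[i] += Me[I][0] * f[e] + Me[I][1] * f[e + 1]
        for J in range(2):
            K[i][e + J] += Ke[I][J]

# Essential boundary conditions: keep only the interior nodes (indices 1 and 2,
# that is, nodes 2 and 3 in the numbering of the text) and solve by Cramer's rule.
det = K[1][1] * K[2][2] - K[1][2] * K[2][1]
c2 = (F[1] * K[2][2] - K[1][2] * F[2]) / det
c3 = (K[1][1] * F[2] - K[2][1] * F[1]) / det

u_exact_at_node = math.sin(math.pi / 3.0) / (1.0 + math.pi ** 2)
print(round(c2, 4), round(c3, 4), round(u_exact_at_node, 4))
```

The computed coefficients agree with the value 0.0734 quoted above, and the exact solution at $x = 1/3$ is about 0.0797, which gives a sense of the error of the three-element approximation at the interior nodes.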
11.3 Two-dimensional problems

We have already given a rough indication in Section 11.1 of how a finite
element mesh is constructed for problems defined on domains in $\mathbb{R}^2$. The
subject is taken up again here, and the ideas in Section 11.2 are generalized
to two-dimensional second-order problems.
Recall from Section 11.1 that the domain $\Omega \subset \mathbb{R}^2$ is assumed to be
polygonal, so that the boundary $\Gamma$ is made up of straight segments. Next,
$\Omega$ is partitioned into elements $\Omega_1, \Omega_2, \dots, \Omega_E$, and it is required that every
side of an element be either part of the boundary $\Gamma$, or the side of another
element. We proceed now to look at a few commonly used elements.

Triangular elements. Triangles are the simplest polygonal shapes in $\mathbb{R}^2$,
so it is not surprising that triangular elements possess features that make
for very simple means of approximation.
Suppose that we want $X^h$ to be the space of piecewise polynomials of
degree 1, so that we require $X_e$ to be the space $P_1(\Omega_e)$ (see the corresponding
discussion at the beginning of Section 11.2). The most general function
in $P_1(\Omega_e)$ is of the form $f(x, y) = a + bx + cy$, so that if the values of the
function are known at three points, then it is uniquely determined (in other
words $a$, $b$, and $c$ can be found). Three nodal points are thus required, and
these are placed at the vertices of the triangle. This positioning of nodes
ensures that if any of the sides $ij$, $jk$, or $ki$ of $\Omega_e$ is shared with an adjacent
element $\Omega_f$, say, the piecewise linear function formed by patching together
the functions defined on $\Omega_e$ and $\Omega_f$ will be continuous across the interface
of these elements. Rather than having to deal with an element that has
nodes numbered in some arbitrary way, the development of the theory is
made considerably more convenient by adopting a local numbering system
when evaluating the element stiffness matrix and load vector, since the
numbering system is then identical to that on the reference element. Once
these have been evaluated, the components can then simply be placed in
the correct rows and columns of the global matrix and vector by recalling
the global node numbers of the element. Thus consider again the triangle
shown in Figure 11.14; we assign local node numbers 1 through 3 to the
three nodes in the manner shown, and record the association

    $i \leftrightarrow 1$
    $j \leftrightarrow 2$
    $k \leftrightarrow 3$

between global and local node numbers. The coordinates of each node can
likewise be expressed in global or local form, and we may write $\mathbf{x}_I^{(e)}$ ($I = 1, 2, 3$)
for $\mathbf{x}_i$, $\mathbf{x}_j$, $\mathbf{x}_k$. Finally, the local basis functions may also be numbered
locally, in the form $N_I^{(e)}$. Wherever it is necessary to make the distinction,
local quantities are always indexed by upper case letters.
FIGURE 11.14. A triangular element and the corresponding reference element (reference nodes 2 and 3 at $(1,0)$ and $(0,1)$ in the $\xi$-$\eta$ plane)

FIGURE 11.15. Assembly of the global stiffness matrix (the entries $K_{IJ}^{(e)}$ of the element matrix are placed in rows and columns $i$, $j$, $k$ of $K$)

The element stiffness matrix $K^{(e)}$ here is a $3 \times 3$ matrix, and the manner
in which its contribution to $K$ is made (according to (11.13)) is as shown
in Figure 11.15. The process whereby $K^{(e)}$ and $F^{(e)}$ are computed for each
element, and then added to the global matrix, is known as assembly. The
reference triangular element $\hat\Omega$ is shown in Figure 11.14. Now it is required,
as in the one-dimensional case, to construct a map $F_e$ that will transform
$\hat\Omega$ to the element $\Omega_e$, and the most general map that takes triangles to
triangles is the affine map, which is of the form

    $f(\boldsymbol{\xi}) = T\boldsymbol{\xi} + \mathbf{b}$,    (11.24)

in which $T$ is a constant matrix and $\mathbf{b}$ a constant vector.
The reference element is a right-angled isosceles triangle, and it is not
difficult to verify that the transformation

    $x = x_1^{(e)}(1 - \xi - \eta) + x_2^{(e)}\xi + x_3^{(e)}\eta$,
    $y = y_1^{(e)}(1 - \xi - \eta) + y_2^{(e)}\xi + y_3^{(e)}\eta$,

or

    $\mathbf{x} = \mathbf{x}_1^{(e)}(1 - \xi - \eta) + \mathbf{x}_2^{(e)}\xi + \mathbf{x}_3^{(e)}\eta$,

maps the nodal points 1, 2, 3 of $\hat\Omega$ to local nodal points 1, 2, 3 of $\Omega_e$, and
indeed maps each point $\boldsymbol{\xi} \in \hat\Omega$ to a point $\mathbf{x} \in \Omega_e$. The matrix $T$ and vector
$\mathbf{b}$ in (11.24) are thus seen to be given by

    $T = \begin{pmatrix} x_2^{(e)} - x_1^{(e)} & x_3^{(e)} - x_1^{(e)} \\ y_2^{(e)} - y_1^{(e)} & y_3^{(e)} - y_1^{(e)} \end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix} x_1^{(e)} \\ y_1^{(e)} \end{pmatrix}$,

whereas the inverse of this transformation is

    $\xi = \dfrac{1}{2A_e} \left[ (y_3^{(e)} - y_1^{(e)})(x - x_1^{(e)}) - (x_3^{(e)} - x_1^{(e)})(y - y_1^{(e)}) \right]$,
    $\eta = \dfrac{1}{2A_e} \left[ -(y_2^{(e)} - y_1^{(e)})(x - x_1^{(e)}) + (x_2^{(e)} - x_1^{(e)})(y - y_1^{(e)}) \right]$,

where $A_e$ is the area of $\Omega_e$.
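
The affine map between the reference triangle and an actual triangle, together with its inverse, can be sketched as follows. The function names and the sample vertex coordinates are assumptions chosen for illustration; the vertices are taken in counterclockwise order so that the determinant of $T$ equals twice the area.

```python
def triangle_map(xi, eta, verts):
    """Map a point (xi, eta) of the reference triangle to the triangle with vertices verts."""
    (x1, y1), (x2, y2), (x3, y3) = verts
    x = x1 * (1.0 - xi - eta) + x2 * xi + x3 * eta
    y = y1 * (1.0 - xi - eta) + y2 * xi + y3 * eta
    return x, y


def triangle_inverse_map(x, y, verts):
    """Inverse map: point (x, y) of the triangle to reference coordinates (xi, eta)."""
    (x1, y1), (x2, y2), (x3, y3) = verts
    # det T = twice the area for counterclockwise vertex ordering.
    two_area = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    xi = ((y3 - y1) * (x - x1) - (x3 - x1) * (y - y1)) / two_area
    eta = (-(y2 - y1) * (x - x1) + (x2 - x1) * (y - y1)) / two_area
    return xi, eta


# Hypothetical triangle with local nodes 1, 2, 3 listed counterclockwise.
verts = [(1.0, 1.0), (4.0, 2.0), (2.0, 5.0)]
corner_images = [triangle_map(0.0, 0.0, verts),
                 triangle_map(1.0, 0.0, verts),
                 triangle_map(0.0, 1.0, verts)]
x, y = triangle_map(0.2, 0.3, verts)
xi_back, eta_back = triangle_inverse_map(x, y, verts)
```

In this sketch the three reference nodes are mapped to the three vertices, and applying the inverse map to an interior image point returns the original reference coordinates.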


The next step is to define local basis functions $\hat N_1$, $\hat N_2$, $\hat N_3$ on $\hat\Omega$, and then
to obtain the basis functions on $\Omega_e$ from (11.17), or

    $N_I^{(e)}(\mathbf{x}) = \hat N_I(\boldsymbol{\xi})$,    $I = 1, 2, 3$.

The local basis functions on $\hat\Omega$ must satisfy (11.9), and are

    $\hat N_1(\boldsymbol{\xi}) = 1 - \xi - \eta$,
    $\hat N_2(\boldsymbol{\xi}) = \xi$,
    $\hat N_3(\boldsymbol{\xi}) = \eta$;

the function $\hat N_1$ is shown in Figure 11.16, whereas Figure 11.4 shows the
images of $\hat N_I$ on the elements attached to node $i$, and the basis function
$N_i(\mathbf{x})$ that results from patching together all local basis functions associated
with node $i$. Clearly the condition (11.9) is satisfied. The basis function
$N_i$ formed by patching together all the local functions $N_i^{(e)}$ associated with
node $i$ is the two-dimensional counterpart of the "hat" function in one
dimension, and is pyramidal in shape. Naturally $N_i$ is piecewise linear, and
is nonzero only on those elements that have node $i$ as a node.

FIGURE 11.16. The local basis function $\hat N_1$ on the reference triangle $\hat\Omega$

Piecewise quadratic triangular elements are obtained by adding a further
three nodes to an element, at the midpoints of the sides, as in Figure 11.17.
The most general function in $P_2(\hat\Omega)$ has the form
$f(\xi, \eta) = a_1 + a_2\xi + a_3\eta + a_4\xi^2 + a_5\xi\eta + a_6\eta^2$, and is thus uniquely determined on
$\hat\Omega$ (and hence on $\Omega_e$) by its values at the six nodes. Furthermore, the basis
functions $N_i$ formed by patching together all local basis functions $N_i^{(e)}$ are
continuous (see Exercise 11.6). Some of the local basis functions are shown
in Figure 11.17.

FIGURE 11.17. A triangular element with quadratic local basis functions

The Pascal triangle. The task of constructing bases of ever-increasing
orders on elements in $\mathbb{R}^2$ is greatly facilitated by making use of the Pascal
triangle; this is an arrangement in triangular form of the terms in a polynomial,
the $k$th row containing all the terms of the form $x^p y^q$ such that
$p + q = k$ (Figure 11.18). Thus the terms in the polynomial of degree $k$
can be identified by inspection, as can their number, and therefore also the
number of nodal points required.

    $k = 0$:    $1$
    $k = 1$:    $x$    $y$
    $k = 2$:    $x^2$    $xy$    $y^2$
    $k = 3$:    $x^3$    $x^2 y$    $xy^2$    $y^3$
    $k = 4$:    $x^4$    $x^3 y$    $x^2 y^2$    $xy^3$    $y^4$

FIGURE 11.18. The Pascal triangle
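
Counting terms in the Pascal triangle is easily automated. In the sketch below the helper names `pascal_row` and `monomials_up_to` are assumptions; each monomial $x^p y^q$ is represented by its pair of exponents $(p, q)$.

```python
def pascal_row(k):
    """Exponent pairs (p, q) with p + q = k, that is, row k of the Pascal triangle."""
    return [(k - q, q) for q in range(k + 1)]


def monomials_up_to(k):
    """All exponent pairs appearing in a complete polynomial of degree k."""
    return [pq for row in range(k + 1) for pq in pascal_row(row)]


# Number of terms, and hence of nodal points, for degrees 1, 2, and 3.
counts = [len(monomials_up_to(k)) for k in (1, 2, 3)]
```

The counts 3, 6, and 10 match the three nodes of the linear triangle and the six nodes of the quadratic triangle discussed above; in general a complete polynomial of degree $k$ in two variables has $(k+1)(k+2)/2$ terms.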

Rectangular elements. We turn now to a second category of finite ele-


ments, namely, those that are rectangular or, more generally, quadrilateral
in shape. If we are to adhere to the policy of having nodal points at least
at the vertices of elements, then clearly the simplest rectangular element
will be one with Jour nodes, one node at each corner (Figure 11.19). The
question now arises: what kind of space of polynomials Xe can be defined
on n e so that any function in Xe is uniquely determined by its values at
the four vertices? Functions in P1(n e ) are completely determined by three
nodal values, so they will not do. On the other hand, quadratic functions
require six nodal values, which is more than we have at our disposal. The
solution to the problem is to examine the first few terms of the polynomial
J(x, y) = al + a2X + a3Y + a4XY + a5x2 + a6y 2 + ...
and to resolve that four terms be retained. The constant and linear terms
are obviously required, and it remains to decide which additional term to
retain, in order to arrive at a total of four terms. It is inadvisable to keep the
384 11. The finite element method
k

TI
4 3

~IG
~

1 2

FIGURE 11.19. A reet angular element and eorresponding reference element

terms involving $x^2$ or $y^2$, since this would result in a lopsided
approximation in which a quadratic term appears for only one of the
coordinates. However, there is no objection to retaining the term involving
$xy$: this ensures that the coordinates x and y are equally represented, for
then the approximation is of the form

$$f(x, y) = a_1 + a_2 x + a_3 y + a_4 xy.$$

We call f(x, y) a bilinear polynomial; in general, the space of polynomials
containing terms of degree $\le k$ in each of the variables is denoted by
$Q_k(\Omega)$, so that f(x, y) is a member of $Q_1(\Omega)$. Note that the
inclusions $P_k(\Omega) \subset Q_k(\Omega) \subset P_{2k}(\Omega)$ hold for
$\Omega \subset \mathbb{R}^2$. The situation now is that
$X_e = Q_1(\Omega_e)$, so that $X^h$ consists of piecewise bilinear
polynomials.
As before, we set up a reference element $\hat\Omega$, which this time is the
square (-1,1) x (-1,1), shown in Figure 11.19. The reference element is
mapped onto an arbitrary rectangular element $\Omega_e$ by the affine
transformation

$$x = F_e\boldsymbol{\xi} \equiv T\boldsymbol{\xi} + b \quad\text{or}\quad \begin{pmatrix} x \\ y \end{pmatrix} = T \begin{pmatrix} \xi \\ \eta \end{pmatrix} + b, \qquad (11.25)$$

in which the matrix T is given by

$$T = \frac12 \begin{pmatrix} x_2^{(e)} - x_1^{(e)} & y_2^{(e)} - y_1^{(e)} \\ x_4^{(e)} - x_1^{(e)} & y_4^{(e)} - y_1^{(e)} \end{pmatrix} \qquad (11.26)$$

and b is the position vector of the centroid of the rectangle $\Omega_e$.
Since affine maps take straight lines to straight lines, it is worth noting
that the most general such map would transform the reference element in
Figure 11.19 to a parallelogram, so that parallelograms could be as easily
accommodated.
FIGURE 11.20. A piecewise bilinear global basis function

Next, we set up bilinear local basis functions on $\hat\Omega$ that satisfy
(11.9); just as the reference element (-1, 1) x (-1, 1) may be regarded as
the Cartesian product of the one-dimensional reference element with itself,
in the same way local basis functions satisfying (11.9) may be generated
from products of the functions (11.19). Thus we obtain

$$\hat N_i(\xi, \eta) = \tfrac14 (1 + \xi_i \xi)(1 + \eta_i \eta), \qquad (11.27)$$

where $(\xi_i, \eta_i)$ are the coordinates of node i on the reference
element; in full this reads

$$\hat N_1(\boldsymbol{\xi}) = \tfrac14 (1 - \xi)(1 - \eta),$$
$$\hat N_2(\boldsymbol{\xi}) = \tfrac14 (1 + \xi)(1 - \eta),$$
$$\hat N_3(\boldsymbol{\xi}) = \tfrac14 (1 + \xi)(1 + \eta),$$
$$\hat N_4(\boldsymbol{\xi}) = \tfrac14 (1 - \xi)(1 + \eta).$$
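
The four bilinear functions can be checked numerically. The sketch below is an illustration only: it assumes the reference nodes are numbered counterclockwise from (-1, -1), uses zero-based indices, and the names `NODES`, `bilinear_basis` and `bilinear_gradient` are hypothetical.

```python
# Reference-node coordinates (xi_i, eta_i); assumed counterclockwise
# numbering starting at (-1, -1), with zero-based indices 0..3.
NODES = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]


def bilinear_basis(i, xi, eta):
    """Value of the i-th bilinear function, (1/4)(1 + xi_i xi)(1 + eta_i eta)."""
    xi_i, eta_i = NODES[i]
    return 0.25 * (1.0 + xi_i * xi) * (1.0 + eta_i * eta)


def bilinear_gradient(i, xi, eta):
    """Partial derivatives of the i-th function with respect to xi and eta."""
    xi_i, eta_i = NODES[i]
    return (0.25 * xi_i * (1.0 + eta_i * eta),
            0.25 * eta_i * (1.0 + xi_i * xi))


if __name__ == "__main__":
    # Each function equals 1 at its own node and 0 at the other three.
    for i in range(4):
        print([bilinear_basis(i, *node) for node in NODES])
```

Because every function is the product of two one-dimensional linear functions, the four values at any point of the square also sum to one.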

Then the functions $N_I^{(e)}$ are obtained by setting
$N_I^{(e)}(x) = \hat N_I(\boldsymbol{\xi})$, with x and $\boldsymbol{\xi}$
being related through (11.25). As in the case of triangular elements,
the positioning of the nodes and the choice of local basis functions ensures
that the basis functions $N_i$ will be continuous across element boundaries,
as shown in Figure 11.20. Higher-order approximations on rectangular
elements may be generated by once again appealing to Pascal's triangle.
Figure 11.21 shows the triangle, on which are marked the four terms that
give the bilinear approximation. By extending the diamond-shaped pattern
associated with the bilinear approximation, we arrive at a biquadratic
approximation that contains nine terms, so that nine nodes are required, as
shown in Figure 11.22. The local basis $\{N_I^{(e)}\}_{I=1}^{9}$ on
$\Omega_e$ spans $Q_2$; these functions are again most conveniently found by
constructing basis functions on $\hat\Omega$, and then using the relationship
$\hat N_I(\boldsymbol{\xi}) = N_I^{(e)}(x)$. The nine functions on
$\hat\Omega$ may be generated from products of the one-dimensional quadratic
FIGURE 11.21. Pascal's triangle as a tool for generating bases in $Q_k$

FIGURE 11.22. A biquadratic local basis function

functions, thus ensuring that (11.9) is satisfied. Indeed, suppose that we
denote by $\hat n_1, \hat n_2, \hat n_3$ the three functions given in
(11.20); then the nine basis functions for the nine-noded element follow from

$$\hat N_1(\xi, \eta) = \hat n_1(\xi)\,\hat n_1(\eta), \qquad \hat N_2(\xi, \eta) = \hat n_3(\xi)\,\hat n_1(\eta), \qquad \ldots$$

The basis functions $N_i$ formed by patching together the functions
$N_i^{(e)}$ associated with global node i are piecewise biquadratic
polynomials that are continuous across interelement boundaries.
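
As a rough illustration of the product construction, the sketch below builds the nine biquadratic functions from one-dimensional quadratics. It assumes that the functions of (11.20) are the standard quadratic Lagrange functions on (-1, 1) with nodes at -1, 0 and 1 (the same ones listed in Exercise 11.3), and it labels each two-dimensional function by a pair of zero-based indices (a, b) rather than by a single node number; the names `quad_1d`, `POINTS_1D` and `biquadratic_basis` are hypothetical.

```python
# Nodes of the one-dimensional quadratic element on (-1, 1).
POINTS_1D = (-1.0, 0.0, 1.0)


def quad_1d(a, t):
    """Quadratic Lagrange function attached to node POINTS_1D[a] (a = 0, 1, 2)."""
    if a == 0:
        return 0.5 * t * (t - 1.0)
    if a == 1:
        return 1.0 - t * t
    if a == 2:
        return 0.5 * t * (t + 1.0)
    raise ValueError("a must be 0, 1 or 2")


def biquadratic_basis(a, b, xi, eta):
    """Product function attached to the node (POINTS_1D[a], POINTS_1D[b])."""
    return quad_1d(a, xi) * quad_1d(b, eta)


if __name__ == "__main__":
    # Tabulate the function attached to the corner (-1, -1) at all nine nodes.
    for eta in POINTS_1D:
        print([biquadratic_basis(0, 0, xi, eta) for xi in POINTS_1D])
```

Each product equals one at its own node and zero at the other eight, which is the interpolation property that the text requires of a local basis.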
This concludes the discussion on elements for second-order problems in
two dimensions. We now work through a simple example involving rectan-
gular elements.

Example

2. Consider the problem

$$-\nabla^2 u = 2 - (x^2 + y^2) \quad \text{in } \Omega = (0,1) \times (0,1),$$
$$u = 0 \quad \text{on } \Gamma_1,$$
$$\partial u/\partial n = 0 \quad \text{on } \Gamma_2,$$

where $\Gamma_1$ and $\Gamma_2$ are the parts of the boundary $\Gamma$ shown
in Figure 11.23.
The corresponding VBVP is: find $u \in V$ such that

$$\int_\Omega \nabla u \cdot \nabla v \, dxdy = \int_\Omega [2 - (x^2 + y^2)]\,v \, dxdy \quad \text{for all } v \in V,$$

where $V = \{v \in H^1(\Omega) : v = 0 \text{ on } \Gamma_1\}$, and the
approximate problem is: find $u_h \in V^h \subset V$ such that

$$\int_\Omega \nabla u_h \cdot \nabla v_h \, dxdy = \int_\Omega [2 - (x^2 + y^2)]\,v_h \, dxdy \quad \text{for all } v_h \in V^h.$$

We divide the domain into four square elements and choose for $X^h$
the space of piecewise bilinear functions, so that nodes are required
at the corners of elements only (see Figure 11.23).
Next we construct $V^h$. From (11.12) it is required that

$$V^h = \operatorname{span}\{N_i \in X^h : N_i(x) = 0 \text{ on } \Gamma_1\} = \operatorname{span}\{N_1, N_2, N_3, N_4\},$$

the functions $N_i$ being piecewise bilinear functions; the restriction of
$N_i$ to $\Omega_e$ is of course $N_i^{(e)}$.
FIGURE 11.23. The domain and finite element mesh for Example 2

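Now we require

$$K_{ij} = \int_\Omega \nabla N_i \cdot \nabla N_j \, dxdy, \qquad i, j = 1, \ldots, 4.$$

Since all elements have the same geometry, the amount of computational
work can be reduced considerably by observing that many of the integrals
have the same value. Indeed, if nodes I and J both belong to $\Omega_e$,
then a moment's thought will convince us that

$$K_{IJ}^{(e)} = \begin{cases} K_{11}^{(1)} & \text{if } I = J, \\ K_{12}^{(1)} & \text{if } I, J \text{ are adjacent}, \\ K_{13}^{(1)} & \text{otherwise}, \end{cases} \qquad (11.28)$$

so in fact only three integrals have to be evaluated.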

The four nodes associated with element 1 are numbered 1 through 4,
so that in this case the local and global node numbers coincide. We
have

(11.29)

using the chain rule and the rule $dxdy = |j|\,d\xi d\eta$ for changing
variables in area integrals; here $j = \det T$, where T is given by (11.26).
For this element,

$$T = \frac14 \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \quad \text{and} \quad T^{-1} = \begin{pmatrix} -4 & 0 \\ 0 & -4 \end{pmatrix}.$$

By inverting (11.25) we obtain $\xi$ and $\eta$ in terms of x and y, and find
that

$$\partial\xi/\partial x = \partial\eta/\partial y = -4, \qquad \partial\xi/\partial y = \partial\eta/\partial x = 0.$$

Also, $j = 1/16$.
With all of this available and with the use of (11.27)
we can now evaluate (11.29), to obtain

$$K_{11}^{(1)} = \int_{-1}^{1}\int_{-1}^{1} \left( \left[\tfrac14(\eta - 1)(-4)\right]^2 + \left[\tfrac14(\xi - 1)(-4)\right]^2 \right) \tfrac{1}{16} \, d\xi d\eta = \tfrac23.$$

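Similarly, $K_{12}^{(1)} = -\tfrac16$, $K_{13}^{(1)} = -\tfrac13$. Using
(11.28) and (11.29) we get

$$6K = \begin{pmatrix} 4 & -1 & -2 & -1 \\ & 4 & -1 & -2 \\ & & 4 & -1 \\ \text{sym} & & & 4 \end{pmatrix} + \begin{pmatrix} 4 & 0 & 0 & -1 \\ & 0 & 0 & 0 \\ & & 0 & 0 \\ \text{sym} & & & 4 \end{pmatrix} + \begin{pmatrix} 4 & 0 & 0 & 0 \\ & 0 & 0 & 0 \\ & & 0 & 0 \\ \text{sym} & & & 0 \end{pmatrix} + \begin{pmatrix} 4 & -1 & 0 & 0 \\ & 4 & 0 & 0 \\ & & 0 & 0 \\ \text{sym} & & & 0 \end{pmatrix}$$

$$= \begin{pmatrix} 16 & -2 & -2 & -2 \\ & 8 & -1 & -2 \\ & & 4 & -1 \\ \text{sym} & & & 8 \end{pmatrix}.$$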
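
The element integrals and the assembly step can be reproduced with a short script. This is a sketch under stated assumptions, not the book's procedure: the element integrals are evaluated with a 2 x 2 Gauss rule (exact here, since the integrand is quadratic in each variable), the element connectivity in `ELEMENTS` is inferred from the node numbering of Figure 11.23 with each element's nodes listed in cyclic order, and all function and variable names are hypothetical.

```python
import math

# Reference nodes of the bilinear element, counterclockwise from (-1, -1).
NODES = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
# Two-point Gauss abscissae on (-1, 1); both weights are equal to 1.
GAUSS = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))

# Assumed connectivity: global nodes of each element in cyclic order.
ELEMENTS = [(1, 2, 3, 4), (1, 4, 5, 6), (1, 6, 7, 8), (1, 8, 9, 2)]
# Nodes that do not lie on Gamma_1 (the unknowns of the problem).
FREE = (1, 2, 3, 4)


def ref_gradient(i, xi, eta):
    """Derivatives of the i-th bilinear function with respect to xi and eta."""
    xi_i, eta_i = NODES[i]
    return (0.25 * xi_i * (1.0 + eta_i * eta),
            0.25 * eta_i * (1.0 + xi_i * xi))


def element_stiffness(hx, hy):
    """Stiffness matrix of the Laplacian on an axis-parallel hx-by-hy rectangle."""
    j = 0.25 * hx * hy  # |det T| for the affine map from the reference square
    K = [[0.0] * 4 for _ in range(4)]
    for xi in GAUSS:
        for eta in GAUSS:
            # Chain rule: d/dx = (2/hx) d/dxi and d/dy = (2/hy) d/deta.
            grads = [(g[0] * 2.0 / hx, g[1] * 2.0 / hy)
                     for g in (ref_gradient(i, xi, eta) for i in range(4))]
            for a in range(4):
                for b in range(4):
                    K[a][b] += (grads[a][0] * grads[b][0]
                                + grads[a][1] * grads[b][1]) * j
    return K


def assemble_reduced(h=0.5):
    """Assemble the stiffness matrix restricted to the free nodes."""
    Ke = element_stiffness(h, h)
    K = [[0.0] * len(FREE) for _ in FREE]
    for conn in ELEMENTS:
        for a, ga in enumerate(conn):
            for b, gb in enumerate(conn):
                if ga in FREE and gb in FREE:
                    K[FREE.index(ga)][FREE.index(gb)] += Ke[a][b]
    return K


if __name__ == "__main__":
    for row in assemble_reduced():
        print([round(6.0 * v, 10) for v in row])
```

Multiplying the assembled matrix by 6 should reproduce the integer entries of the final matrix displayed above, and the first row of `element_stiffness(0.5, 0.5)` contains the three distinct values 2/3, -1/6 and -1/3.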
The next task is to evaluate $F_i = \iint_\Omega f N_i \, dxdy$. As in the
one-dimensional case, we replace f(x,y) by its interpolate $f_h(x,y)$ which
is given by

$$f_h(x, y) = \sum_{i=1}^{9} f_i N_i(x, y), \qquad f_i = f(x_i, y_i).$$

Then

$$F_i \approx \iint_\Omega f_h N_i \, dxdy = \sum_{j=1}^{9} f_j \iint_\Omega N_i N_j \, dxdy.$$

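Thus we need integrals of the form
$\iint_{\Omega_e} N_I^{(e)} N_J^{(e)} \, dxdy$, and once again a great deal
of effort can be avoided by noting that

$$\iint_{\Omega_e} N_I^{(e)} N_J^{(e)} \, dxdy = \begin{cases} \iint_{\Omega_1} N_1^{(1)} N_1^{(1)} \, dxdy & \text{if } I = J, \\ \iint_{\Omega_1} N_1^{(1)} N_2^{(1)} \, dxdy & I \text{ and } J \text{ adjacent}, \\ \iint_{\Omega_1} N_1^{(1)} N_3^{(1)} \, dxdy & \text{otherwise}, \end{cases} \;=\; \begin{cases} 1/24 \\ 1/72 \\ 1/144. \end{cases}$$

Hence, taking cognizance of the relationship between local and global
node numbers, we have

$$F_1 = \tfrac{1}{24}(f_1) + \tfrac{1}{72}(f_2 + f_4) + \tfrac{1}{144}(f_3) = 0.0938,$$
$$F_2 = \tfrac{1}{24}(f_2) + \tfrac{1}{72}(f_1 + f_3) + \tfrac{1}{144}(f_4) = 0.0972,$$
$$F_3 = \tfrac{1}{24}(f_3) + \tfrac{1}{72}(f_2 + f_4) + \tfrac{1}{144}(f_1) = 0.1007,$$
$$F_4 = \tfrac{1}{24}(f_4) + \tfrac{1}{72}(f_1 + f_3) + \tfrac{1}{144}(f_2) = 0.0972.$$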

Finally, we solve $Kc = F$ to obtain

$$c = \begin{pmatrix} 0.1585 \\ 0.2568 \\ 0.4213 \\ 0.2568 \end{pmatrix}.$$

The approximate solution is

$$u_h(x, y) = \sum_{i=1}^{4} c_i N_i(x, y),$$

and the exact solution is

$$u(x, y) = \tfrac12 (x^2 - 1)(y^2 - 1).$$

These two solutions are compared in Figure 11.24, where we see that
the approximation is quite favorable, notwithstanding the relatively
crude mesh.

FIGURE 11.24. The exact and approximate solutions to Example 2

3. Suppose that we wish to find an approximate solution, using finite
elements, to a two-dimensional problem in linear elasticity. The
variational problem takes the form (9.25), and it is assumed that
$u = u_1(x, y)e_1 + u_2(x, y)e_2$, so that the only nonzero components
of the strain are, from (8.7), $\epsilon_{11}$,
$\epsilon_{12} = \epsilon_{21}$, and $\epsilon_{22}$. It is assumed
furthermore that the material is isotropic, so that the bilinear form
is given by (see Exercise 9.10)

$$a(u_h, v_h) = \int_\Omega [\lambda (\operatorname{div} u_h)(\operatorname{div} v_h) + 2\mu\, \epsilon(u_h) \cdot \epsilon(v_h)] \, dxdy.$$

The purpose of this example is to give some idea of the changes
that are necessary in the event that the principal unknown is
vector-valued; we focus on the task of constructing the element stiffness
matrix. Now in problems of this nature, it makes a great deal of
sense to arrange the nonzero components of the strain matrix in the
form of a vector, also denoted by $\epsilon$, and defined by

(11.30)

On element $\Omega_e$ the displacement is approximated according to

$$u_h(x) = \sum_{I=1}^{N} c_I N_I(x),$$

using a local numbering system; the vector $c_I$ represents the value of
$u_h$ at node I. Superscripts (e) which identify $c_I$ and $N_I$ as
quantities associated with $\Omega_e$ are omitted without any danger of
ambiguity.

Substitution in (11.30) then yields the representation

$$\epsilon(u_h) = \sum_{I=1}^{N} B_I c_I \equiv Bc,$$

in which the matrix $B_I$ contains derivatives of $N_I$ with respect to
x and y, and B and c are the matrix and vector defined by
$B = [B_1 \cdots B_N]$, $c^t = [c_1^t \cdots c_N^t]$. The contribution
$K^{(e)}$ of $\Omega_e$ to the stiffness matrix is now found from

$$\int_{\Omega_e} [\lambda (\operatorname{div} u_h)(\operatorname{div} v_h) + 2\mu\, \epsilon(u_h) \cdot \epsilon(v_h)] \, dxdy = d^t \underbrace{\left[ \int_{\Omega_e} [\lambda B^t C^t C B + 2\mu B^t B] \, dxdy \right]}_{K^{(e)}} c,$$

where d is the vector of nodal variables corresponding to the arbitrary
function v.
As before, it makes sense to evaluate this integral on the reference
element, and to do so it is necessary to express B as a function of $\xi$
and $\eta$. This is achieved by observing that a typical term in $B_I$ is,
for example, $\partial N_I/\partial x$, and we have

$$\frac{\partial N_I}{\partial x} = \frac{\partial \hat N_I}{\partial \xi}\frac{\partial \xi}{\partial x} + \frac{\partial \hat N_I}{\partial \eta}\frac{\partial \eta}{\partial x},$$

which is easily evaluated with the aid of expressions for $\hat N_I$ and the
affine maps (11.24) or (11.25). In Section 11.5 it is shown that this
transformation can be carried out without explicit inversion of the
map between the reference and the actual element. If the matrix thus
transformed is denoted as $\hat B$, then we have finally (cf. (11.29))

11.4 Fourth-order problems and Hermite families of elements

The introduction to the finite element method presented in this chapter
has been, up to now, geared towards second-order problems, for the simple
reason that these problems lead to the simplest examples of the method;
the space in which second-order problems are formulated is a subspace V
of $H^1(\Omega)$, and it suffices to construct finite-dimensional spaces
$V^h$ that are spanned by continuous functions. In the case of fourth-order
problems, however, the situation is less straightforward. The "parent" space
for the variational problem is $H^2(\Omega)$, and because not all continuous
functions belong to this space (consider, for example, the basis functions
constructed in the last two sections), it follows that we have to go a stage
further in order to obtain suitable finite element subspaces $V^h$ that
satisfy $V^h \subset V \subset H^2(\Omega)$.
Consider, for example, the problem of an elastic beam that is constrained
against both displacement and rotation at its two ends. The problem is thus
(see also (8.20))

$$\frac{d^4 w}{dx^4} = \frac{f}{EI} \ \text{ on } (0, L), \qquad w(0) = w'(0) = 0, \qquad w(L) = w'(L) = 0. \qquad (11.31)$$

The space of admissible functions is $V = H_0^2(\Omega)$, and the VBVP is:
find $w \in V$ such that

$$\int_0^L w'' v'' \, dx = \int_0^L (f/EI)\, v \, dx \quad \text{for all } v \in V.$$

It is clear then that the space $V^h$ must comprise functions whose second
derivatives exist, at least in a weak sense. By analogy with the set of
conditions (11.5) through (11.8) for second-order problems, we therefore
stipulate that the basis functions of $V^h$ must satisfy the following
properties.
(i) The global basis functions comprise two sets, denoted by $N_i$
(i = 1, ..., G) and $M_j$ (j = 1, ..., K); these functions are bounded
and continuously differentiable, that is,

$$N_i, \; M_j \in C^1(\bar\Omega);$$

(ii) each of the functions $M_i$ and $N_i$ is nonzero only on those elements
that are connected to node i:

$$M_i(x)|_{\Omega_e} \equiv 0 \ \text{ and } \ N_i(x)|_{\Omega_e} \equiv 0 \quad \text{if } x_i \notin \bar\Omega_e;$$

(iii) the basis functions have the properties

$$N_i(x_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j, \end{cases} \qquad N_i'(x_j) = 0, \qquad M_i(x_j) = 0, \qquad M_i'(x_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \ne j; \end{cases}$$

(iv) let $N_i^{(e)}$ and $M_i^{(e)}$ be, respectively, the restrictions of
$N_i$ and $M_i$ to $\Omega_e$; then $N_i^{(e)}$ and $M_i^{(e)}$ are
polynomials.

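This time it is clear from (iii) and (iv) that the local basis function
$N_i^{(e)}$ defined on element $\Omega_e$ will have the properties

$$N_i^{(e)}(x_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise}, \end{cases} \qquad N_i^{(e)\prime}(x_j) = 0 \ \text{ at all nodal points } x_j,$$

and the local function $M_i^{(e)}$ will have the properties

$$M_i^{(e)}(x_j) = 0 \ \text{ at all nodal points } x_j, \qquad M_i^{(e)\prime}(x_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise}. \end{cases}$$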
A basis that satisfies the properties (i) through (iv) is known as a Hermite
family. The conditions (i) and (iv) ensure that the Hermite basis functions
$N_i$ and $M_i$ belong to $H^2(\Omega)$, as required. In contrast to (11.10)
and (11.11), if we now write

$$v_h = \sum_{i=1}^{G} b_i N_i + \sum_{j=1}^{K} d_j M_j,$$

then it follows from the construction of the basis functions that

$$v_h(x_j) = \sum_{i=1}^{G} b_i N_i(x_j) + \sum_{i=1}^{K} d_i M_i(x_j) = b_j,$$

whereas

$$v_h'(x_j) = \sum_{i=1}^{G} b_i N_i'(x_j) + \sum_{i=1}^{K} d_i M_i'(x_j) = d_j.$$

So both the value of a function and its derivative are interpolated by
Hermite basis functions.
The space X h is now simply defined by

Example
4. Returning to a one-dimensional problem such as (11.31) for the elastic
beam, suppose we try the simplest mesh, consisting of a set of elements
with nodes only at the interelement boundaries (Figure 11.25).
The restrictions of the local basis functions to element $\Omega_e$ are
required to be polynomials whose values and slopes at the nodes are uniquely
FIGURE 11.25. Local and global Hermite basis functions

determined. Thus an allowance must be made for four coefficients to
be determined in each element, and so it follows that the local basis
functions are cubic polynomials. On the element $\Omega_1 = (x_1, x_2)$, for
example, the basis functions are

These are illustrated in Figure 11.25. Global basis functions corresponding
to an arbitrary node are also shown in the figure.
The boundary conditions in (11.31) will require that
$N_1 = M_1 = N_{E+1} = M_{E+1} = 0$, so that

$$V^h = \operatorname{span}\{N_2, \ldots, N_E, M_2, \ldots, M_E\}.$$
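
For readers who want something concrete to experiment with, the sketch below implements the standard cubic Hermite functions on an element (x1, x2). This is an assumption: these are the usual textbook formulas written in terms of the scaled coordinate s = (x - x1)/h, and they may differ in notation from the expressions intended in Example 4. The function names are hypothetical.

```python
def hermite_basis(x, x1, x2):
    """Values (N1, M1, N2, M2) of the cubic Hermite functions at x.

    N1, N2 carry the nodal values and M1, M2 the nodal slopes at x1 and x2.
    """
    h = x2 - x1
    s = (x - x1) / h
    return (1.0 - 3.0 * s**2 + 2.0 * s**3,
            h * (s - 2.0 * s**2 + s**3),
            3.0 * s**2 - 2.0 * s**3,
            h * (s**3 - s**2))


def hermite_derivative(x, x1, x2):
    """Derivatives with respect to x of (N1, M1, N2, M2) at x."""
    h = x2 - x1
    s = (x - x1) / h
    return ((6.0 * s**2 - 6.0 * s) / h,
            1.0 - 4.0 * s + 3.0 * s**2,
            (6.0 * s - 6.0 * s**2) / h,
            3.0 * s**2 - 2.0 * s)


def hermite_interpolate(x, x1, x2, w1, dw1, w2, dw2):
    """Cubic matching the values w1, w2 and the slopes dw1, dw2 at the nodes."""
    n1, m1, n2, m2 = hermite_basis(x, x1, x2)
    return w1 * n1 + dw1 * m1 + w2 * n2 + dw2 * m2


if __name__ == "__main__":
    # Interpolate p(x) = x^3 on (1, 3) from its nodal values and slopes.
    print(hermite_interpolate(2.0, 1.0, 3.0, 1.0, 3.0, 27.0, 27.0))
```

Since a cubic is fixed by two values and two slopes, interpolating any cubic in this way reproduces it exactly; the usage line above evaluates the interpolant of x^3 at the element midpoint.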

5. Suppose that it is required to solve the variational problem (9.24)
corresponding to deflection of a plate. The space $V^h$ is required to
belong to $H^2(\Omega)$, and once again this is achieved by constructing
basis functions that are continuously differentiable.
Consider a triangular element with nodes at the vertices (Figure
11.26), and suppose that we begin by naively extending Example
4, in that the local basis functions on $\Omega_e$ are assumed to be
complete cubic functions, that is, members of $P_3(\Omega_e)$. A cubic
function of two variables has 10 terms, so it follows that if the function
u as well as its first derivatives $u_x \equiv \partial u/\partial x$ and
$u_y \equiv \partial u/\partial y$ are going to be the unknown nodal
variables, this will account for 9 of the 10 coefficients in
the cubic polynomial (3 nodes, and 3 unknowns per node). A fourth
node is therefore introduced at the centroid of the element, and only
FIGURE 11.26. The restriction of a piecewise cubic function

the value of the function is interpolated at this point. Adopting a
local numbering convention, the function u has the representation

$$u = \sum_{I=1}^{3} \left( b_I N_I^{(e)} + c_I M_I^{(e)} + d_I L_I^{(e)} \right) + b_4 N_4^{(e)}$$

in which $b_I$, $c_I$, and $d_I$ are, respectively, the values of u, $u_x$,
and $u_y$ at node I, and $N_I^{(e)}$, $M_I^{(e)}$, and $L_I^{(e)}$ are the
corresponding local basis functions. Next we examine whether this local
basis will allow a global basis of $C^1$ functions to be constructed.
Consider any one of the edges of the triangle, and set up coordinates
$(s, \nu)$ with axes parallel to the tangent and normal to this edge
(Figure 11.26). The function u is cubic and, when transformed to the
coordinate system $(s, \nu)$, is a cubic function of s along the edge
$\nu = 0$. Since u and its first derivatives are specified at the two nodes
that constitute the boundary of this edge, it follows that a unique cubic
function may be defined along the edge (since u and the tangential
derivative $u_s \equiv \partial u/\partial s$ are known at the nodes). Thus,
when the function u over the domain $\Omega$ is obtained by patching
together its restrictions to the various elements, the resulting function
will be continuous across adjacent elements.
For this function also to be continuously differentiable across the
element boundary, it is necessary that the normal derivative
$u_\nu \equiv \partial u/\partial \nu$ be continuous there. But $u_\nu$ is a
quadratic function of s, and is therefore not uniquely determined by its two
nodal values. So the global function is not continuously differentiable and,
as things stand, is not a candidate for a Hermite basis.
The remedy to the problem encountered in Example 5 lies in increasing
the degree of polynomial approximation on the element. For this purpose
FIGURE 11.27. The degrees of freedom corresponding to a quintic polynomial

the following result is required.

THEOREM 1. Let $\Omega_e \subset \mathbb{R}^2$ be an arbitrary triangular
element with nodes 1,2,3 at the vertices and 4,5,6 at the midpoints of the
sides. Then any complete polynomial function p of degree 5 is uniquely
determined by the nodal values:

$$p(x_I), \quad p_x(x_I),\ p_y(x_I), \quad p_{xx}(x_I),\ p_{xy}(x_I),\ p_{yy}(x_I) \quad \text{at the vertices } (I = 1,2,3),$$
$$p_\nu \quad \text{at the midpoints of the sides}.$$

Inspection of Pascal's triangle (Figure 11.18) verifies that 21 nodal values
are required in order to determine a quintic polynomial uniquely. This
element is normally depicted as in Figure 11.27, in which the various
degrees of freedom are denoted by different symbols. The proof of this
result is left to Exercise 11.10. Bearing in mind the main aim, which is
that of constructing a basis which is piecewise polynomial and in
$C^1(\bar\Omega)$, and hence in $H^2(\Omega)$, it remains to verify that the
function obtained by patching together the quintic interpolation of Theorem
1 will fulfill this purpose. Consider one of the sides which constitutes a
boundary between elements: by transformation to the coordinates $(s, \nu)$
we see that the function u is a quintic polynomial g(s), say, along this
edge. Now g, g', and g'' are all determined uniquely at the vertex nodes,
and thus the function g is uniquely determined along the edge (since a
quintic has six coefficients). Thus we know that $V^h \subset C(\bar\Omega)$.
Next, consider the restriction to the edge of the normal derivative
$\partial u/\partial \nu$, and denote this function by f(s). After
transformation to the coordinates $(s, \nu)$ and evaluation at $\nu = 0$,
clearly f will be a quartic polynomial in s. A total of five of its values
are uniquely determined: f and f' at the two
FIGURE 11.28. A curvilinear triangular element

vertex nodes, and f at the midside node. Thus the normal derivative is
uniquely determined at the interelement boundary, and so
$V^h \subset C^1(\bar\Omega)$.

11.5 Isoparametric elements

For domains in $\mathbb{R}^2$ the finite elements discussed up to now have
been restricted to two geometrical types, viz. triangles and rectangles (or,
more generally, parallelograms). Such elements are of course adequate for
the discretization of domains with polygonal boundaries; but for boundaries
of more general shape, and for curved boundaries in particular, it is
necessary to extend the ideas of the earlier sections. This section will
give an idea of how this extension is carried out, in the context of
Lagrangian elements in $\mathbb{R}^2$.
Consider the six-noded reference triangle $\hat\Omega$ which is shown in
Figure 11.17, together with the set of quadratic local basis functions. We
now generate an element $\Omega_e$ by choosing six nodes $x_I^{(e)}$
(I = 1, ..., 6) and by stipulating that $\Omega_e$ be the image of
$\hat\Omega$ under the map $F_e : \hat\Omega \to \Omega_e$ defined by

$$x = F_e(\boldsymbol{\xi}) \equiv \sum_{I=1}^{6} x_I^{(e)} \hat N_I(\boldsymbol{\xi}). \qquad (11.32)$$

The six nodes $x_I^{(e)}$ of $\Omega_e$ are the images of the six nodes of
the reference element, as shown in Figure 11.28, and as may be deduced by
using the properties of the local basis functions, and the sides of the
reference element are mapped to curves which are described by quadratic
polynomials in $\xi$ and $\eta$. In this way we have used the basis
functions to generate an element with curved sides: this is known as an
isoparametric element, and a mesh of elements generated in this way is known
as an isoparametric mesh. The local basis functions on $\Omega_e$ are
generated in the usual way, by using the relationship

$$N_I^{(e)}(x) = \hat N_I(\boldsymbol{\xi}),$$

in which x and $\boldsymbol{\xi}$ are related through (11.32). In this way
much of the process developed for affine elements carries over to this more
general case. What must be recognized, though, is that the basis functions
$N_I^{(e)}$ no longer inherit the polynomial structure of the functions
$\hat N_I$, for the simple reason that the map $F_e$ is no longer affine.
The manner in which computations are carried out on the reference element is
best illustrated through a concrete example.

Example

6. Consider the VBVP

where $V^h \subset V \subset H^1(\Omega)$. The contribution to the stiffness
matrix from element e is thus

$$K_{IJ}^{(e)} = \int_{\Omega_e} \nabla N_I^{(e)} \cdot \nabla N_J^{(e)} \, dx, \qquad (11.33)$$

using a local numbering system, in which I and J range over 1 to 6
for a quadratic triangle.
The 2 x 2 Jacobian matrix J is defined by

$$J_{kl} = \frac{\partial x_k}{\partial \xi_l}$$

and is obtained from (11.32). This plays a key role in the evaluation
of (11.33) on the reference element, as does its determinant, which is
denoted by j:

$$j = \det J.$$

It is required that $j(\boldsymbol{\xi}) > 0$ for all
$\boldsymbol{\xi} \in \hat\Omega$, in order that the map (11.32) be
invertible, and to maintain the orientation of the reference
element (for invertibility alone, $j \ne 0$ would suffice). We also observe
that for isoparametric elements j is in general a function defined on
$\hat\Omega$; for affine maps it is constant.
For computational purposes the integrand of (11.33) is best expressed
in matrix form; thus, denoting by $B_I$ the 2 x 1 vector consisting of
the components of $\nabla N_I^{(e)}$, (11.33) becomes

$$K_{IJ}^{(e)} = \int_{\Omega_e} B_I^t B_J \, dx. \qquad (11.34)$$

Now considering that the aim is to evaluate these terms on the reference
element, it follows that we have to transform the vectors $B_I$.
We have

$$\frac{\partial N_I^{(e)}}{\partial x_j} = \sum_{i=1}^{2} \frac{\partial \hat N_I}{\partial \xi_i} \frac{\partial \xi_i}{\partial x_j}$$

or, in matrix form,

$$B_I^t = \hat B_I^t J^{-1}, \qquad (11.35)$$

where $\hat B_I = [\partial \hat N_I/\partial \xi_1 \;\; \partial \hat N_I/\partial \xi_2]^t$.
This is very convenient, except for one problem: the elements of the inverse
Jacobian $J^{-1}$ are given by

$$J^{-1}_{kl}(x) = \frac{\partial \xi_k}{\partial x_l},$$

and evaluation of this matrix would require that (11.32) be inverted,
which is a nontrivial task in general.
Fortunately there is a way around this difficulty; indeed, if J is written
for convenience in the form
$J = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, then its inverse is
given by

$$J^{-1} = \frac{1}{j} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}, \qquad j = ad - bc.$$

Thus the elements of $J^{-1}$ can be expressed entirely in terms of partial
derivatives of $x_k$ with respect to $\xi_l$, which are easily evaluated.
This now clears the way for the evaluation of (11.34) since direct
substitution of (11.35), together with transformation to the reference
element, gives

$$K_{IJ}^{(e)} = \int_{\hat\Omega} (j^{-2})\, \hat B_I^t\, (jJ^{-1})(jJ^{-1})^t\, \hat B_J \; j \, d\xi d\eta = \int_{\hat\Omega} (j^{-1})\, \hat B_I^t\, (jJ^{-1})(jJ^{-1})^t\, \hat B_J \, d\xi d\eta, \qquad (11.36)$$

where $jJ^{-1}$ is the matrix with entries d, -b, -c, a.
It is important to note that the integrand is no longer a polynomial;
rather, due to the presence of the Jacobian determinant in the denominator,
it is a rational function. In practice integrals such as
that appearing in (11.36) are evaluated approximately using numerical
integration, a procedure that will be discussed in the next section.
The procedure for generating isoparametric elements from a reference
square is unaltered. Whereas there was no point in starting with the
three-noded triangle, since an isoparametric map would simply take a
triangle to
FIGURE 11.29. Isoparametric maps from a reference square element

a triangle, in the case of a square reference element there is every reason
to begin with the four-noded element: the basis functions are bilinear, so
that the map

$$x = \sum_{I=1}^{4} x_I \hat N_I(\boldsymbol{\xi}),$$

with $\hat N_I$ given by (11.27), will give most generally an arbitrary
quadrilateral of the kind shown in Figure 11.29. Thus while this element
does not have curved sides, the isoparametric concept permits the
possibility of working with quadrilaterals whose sides are not parallel.
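
A minimal sketch of the four-noded isoparametric map may help fix ideas. It assumes the bilinear functions (11.27) with reference nodes numbered counterclockwise from (-1, -1) and zero-based indices; the Jacobian entries a, b, c, d follow the 2 x 2 notation of Example 6, and the inverse is formed from the explicit 2 x 2 formula rather than by inverting the map. All function names and the sample trapezoid are hypothetical.

```python
NODES = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]


def shape(i, xi, eta):
    """Bilinear function attached to reference node i."""
    xi_i, eta_i = NODES[i]
    return 0.25 * (1.0 + xi_i * xi) * (1.0 + eta_i * eta)


def shape_grad(i, xi, eta):
    """Derivatives of the i-th function with respect to xi and eta."""
    xi_i, eta_i = NODES[i]
    return (0.25 * xi_i * (1.0 + eta_i * eta),
            0.25 * eta_i * (1.0 + xi_i * xi))


def isoparametric_map(coords, xi, eta):
    """Image (x, y) of the reference point (xi, eta)."""
    x = sum(shape(i, xi, eta) * coords[i][0] for i in range(4))
    y = sum(shape(i, xi, eta) * coords[i][1] for i in range(4))
    return x, y


def jacobian(coords, xi, eta):
    """Entries (a, b, c, d) of J = [[dx/dxi, dx/deta], [dy/dxi, dy/deta]]."""
    grads = [shape_grad(i, xi, eta) for i in range(4)]
    a = sum(g[0] * p[0] for g, p in zip(grads, coords))
    b = sum(g[1] * p[0] for g, p in zip(grads, coords))
    c = sum(g[0] * p[1] for g, p in zip(grads, coords))
    d = sum(g[1] * p[1] for g, p in zip(grads, coords))
    return a, b, c, d


def physical_gradients(coords, xi, eta):
    """Return j = det J and the gradients (dN/dx, dN/dy) of all four functions."""
    a, b, c, d = jacobian(coords, xi, eta)
    j = a * d - b * c
    grads = []
    for i in range(4):
        g_xi, g_eta = shape_grad(i, xi, eta)
        # J^{-1} = (1/j) [[d, -b], [-c, a]] holds dxi/dx, dxi/dy, deta/dx, deta/dy.
        grads.append(((g_xi * d - g_eta * c) / j,
                      (-g_xi * b + g_eta * a) / j))
    return j, grads


if __name__ == "__main__":
    trapezoid = [(0.0, 0.0), (2.0, 0.0), (1.5, 1.0), (0.0, 1.0)]
    for point in ((0.0, -1.0), (0.0, 0.0), (0.0, 1.0)):
        print(point, physical_gradients(trapezoid, *point)[0])
```

For the sample trapezoid the determinant j varies with eta, which illustrates the remark in Example 6 that j is constant only for affine maps.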
The next step up is the nine-noded element; here the isoparametric map
does lead to an element with curved sides, as shown in Figure 11.29.
The ideas embodied in Example 6 lie at the heart of finite element
computations. Indeed, the convenience of this procedure lies not only in the
ease with which the element stiffness matrix may be evaluated for arbitrary
elements, but also in the fact that the same procedure may be used
to carry out these computations also for elements generated by affine maps:
(11.32) may be used to generate an element $\Omega_e$ simply by choosing the
reference element geometry (triangle or square), the number of nodes and
their placement, and the coordinates of the nodal points. The rest of the
computations follow as in Example 6.

11.6 Numerical integration

A key stage in the implementation of the finite element method is the
construction of the stiffness matrix and load vector, and these require that
a number of terms be integrated, generally over the reference element. Now
while such integrations can be carried out in closed form for simple
problems such as the Poisson equation, more complex problems arising, for
example, from the modelling of non-homogeneous media, will give integrands
which may well not be integrable in closed form, particularly if the
coefficients representing the non-homogeneities are anything other than
straightforward functions. Such complex integrands will also arise from the
use of isoparametric elements, as we have seen in the previous section.
There is thus the need to find an alternative, possibly approximate, way
of computing integrals. Two criteria which any such alternative must meet
are: (a) its degree of accuracy must be known; and (b) it must be amenable
to easy implementation in finite element computer programs. The basis of
most numerical integration schemes is the identification of selected points,
known as sampling points, at which the value of the function is sampled,
and the specification of a set of weights, one for each sampling point.
Suppose that integration is to be carried out over one of the reference
elements $\hat\Omega$; then if the sampling points are denoted by
$\boldsymbol{\xi}_l$ (l = 1, ..., r) and the weights by $w_l$
(l = 1, ..., r), a numerical integration formula of order r is defined to be
a formula of the kind

$$\int_{\hat\Omega} f(\boldsymbol{\xi}) \, d\xi \approx \sum_{l=1}^{r} w_l f(\boldsymbol{\xi}_l). \qquad (11.37)$$

The main aim then is to have available a systematic means of choosing the
sampling points and weights in such a way as to be able to minimize the
error $|\int_{\hat\Omega} f(\boldsymbol{\xi}) \, d\xi - I_r(f)|$ for an
integration scheme of given order, where $I_r(f)$ represents the righthand
side of (11.37). The choice is usually carried out in such a way that the
integration scheme is exact for polynomials of a given degree.

One-dimensional problems. For integration of functions over the interval
(-1, 1), Gauss quadrature is a popular option. The Gauss quadrature
rule may be defined for any order, though schemes up to those of order
three are most common in finite element calculations. Sampling points and
weights for the rules of orders 1, 2 and 3 are as follows:

Order   Sampling points $\xi_l$   Weights $w_l$

1       0                         2

2       $-1/\sqrt{3}$             1
        $1/\sqrt{3}$              1

3       $-\sqrt{3/5}$             5/9
        0                         8/9
        $\sqrt{3/5}$              5/9

The weights and sampling points corresponding to Gauss quadrature are
chosen in an optimal fashion, so as to integrate exactly polynomials of
as high a degree as possible. Thus a polynomial of degree 2r is integrated
exactly by a Gauss rule of order r + 1. Alternatively, a rule of order r
integrates exactly a polynomial of degree 2r - 1.
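
The exactness claim is easy to test numerically. The sketch below stores the three tabulated rules in a dictionary and applies them to monomials; the names `GAUSS_RULES`, `gauss_integrate` and `exact_monomial_integral` are hypothetical and exist only for this illustration.

```python
import math

# Sampling points and weights of the Gauss rules of orders 1, 2 and 3.
GAUSS_RULES = {
    1: [(0.0, 2.0)],
    2: [(-1.0 / math.sqrt(3.0), 1.0), (1.0 / math.sqrt(3.0), 1.0)],
    3: [(-math.sqrt(3.0 / 5.0), 5.0 / 9.0),
        (0.0, 8.0 / 9.0),
        (math.sqrt(3.0 / 5.0), 5.0 / 9.0)],
}


def gauss_integrate(f, order):
    """Approximate the integral of f over (-1, 1) with the rule of given order."""
    return sum(w * f(x) for x, w in GAUSS_RULES[order])


def exact_monomial_integral(n):
    """Exact integral of xi**n over (-1, 1)."""
    return 0.0 if n % 2 else 2.0 / (n + 1)


if __name__ == "__main__":
    for r in (1, 2, 3):
        for n in range(2 * r + 1):
            err = abs(gauss_integrate(lambda t, n=n: t**n, r)
                      - exact_monomial_integral(n))
            print(f"order {r}, degree {n}: error {err:.2e}")
```

For each order r the error is at rounding level up to degree 2r - 1 and becomes visibly nonzero at degree 2r.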

Example

7. We show in this example how the sampling points and weights for the
scheme of order 2 may be obtained. Suppose that an arbitrary cubic
function $f(\xi) = a_0 + a_1\xi + a_2\xi^2 + a_3\xi^3$ is to be integrated
exactly over the interval (-1, 1), using a scheme of order 2. Now

$$\int_{-1}^{1} f(\xi) \, d\xi = 2a_0 + \tfrac23 a_2,$$

so it is required to find sampling points $\xi_1$ and $\xi_2$, and weights
$w_1$ and $w_2$, such that

$$w_1 f(\xi_1) + w_2 f(\xi_2) = 2a_0 + \tfrac23 a_2. \qquad (11.38)$$

Suppose that we simplify matters by assuming that the sampling
points are located symmetrically about the origin, and the two weights
are equal, so that $\xi_2 = -\xi_1$ and $w_2 = w_1$; then the odd powers
of $\xi$ cancel and we obtain

$$2w_1 (a_0 + a_2 \xi_1^2) = 2a_0 + \tfrac23 a_2.$$

This must hold for all values of $a_0$ and $a_2$, and so it follows that
$w_1 = 1$ and $\xi_1 = 1/\sqrt{3}$.

Gauss quadrature is closely related to properties of the Legendre
polynomials, which were introduced in Chapter 6. Indeed, it can be shown
that the sampling points and weights corresponding to an integration rule of
order r are given by

$$\xi_l = l\text{th zero of the Legendre polynomial } P_r, \qquad w_l = \frac{2}{(1 - \xi_l^2)\,[P_r'(\xi_l)]^2}, \qquad l = 1, \ldots, r.$$

Integration over a square reference element. The extension of the
Gauss quadrature rule to the reference element (-1, 1) x (-1, 1) is
straightforward, and relies simply on the application of the
one-dimensional rule in each coordinate direction: thus

$$\int_{\hat\Omega} f(\xi, \eta) \, d\xi d\eta = \int_{-1}^{1}\int_{-1}^{1} f(\xi, \eta) \, d\xi d\eta \approx \int_{-1}^{1} \sum_{l=1}^{r} w_l f(\xi_l, \eta) \, d\eta \approx \sum_{l=1}^{r}\sum_{m=1}^{r} w_l w_m f(\xi_l, \eta_m) = \sum_{l,m=1}^{r} w_l w_m f(\xi_l, \eta_m).$$
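
The double sum translates directly into a pair of nested loops. The sketch below uses the two-point rule and is intended purely as an illustration; `gauss_square`, `POINTS_2` and `WEIGHTS_2` are hypothetical names.

```python
import math

# Two-point Gauss rule on (-1, 1).
POINTS_2 = (-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))
WEIGHTS_2 = (1.0, 1.0)


def gauss_square(f, points=POINTS_2, weights=WEIGHTS_2):
    """Tensor-product Gauss rule for f(xi, eta) on (-1, 1) x (-1, 1)."""
    total = 0.0
    for xi, w_l in zip(points, weights):
        for eta, w_m in zip(points, weights):
            total += w_l * w_m * f(xi, eta)
    return total


if __name__ == "__main__":
    # The exact value of the integral of xi^2 eta^2 over the square is 4/9.
    print(gauss_square(lambda xi, eta: xi**2 * eta**2))
```

With r points per direction the rule is exact for any polynomial whose degree in each variable separately does not exceed 2r - 1, which is why a 2 x 2 rule suffices for products of bilinear gradients on rectangles.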

Integration over triangles. An integration rule of order 1 may be defined
on a triangle $\Omega_e$ by

$$\int_{\Omega_e} f(x, y) \, dxdy \approx A_e f(\bar x, \bar y) \qquad (11.39)$$

in which $A_e$ is the area of the triangle and $(\bar x, \bar y)$ are the
coordinates of its centroid (Figure 11.30). Likewise, a rule of order 3 may
be defined by

$$\int_{\Omega_e} f(x, y) \, dxdy \approx \tfrac13 A_e \sum_{l=1}^{3} f(\bar x_l, \bar y_l), \qquad (11.40)$$

where $(\bar x_l, \bar y_l)$ (l = 1,2,3) are the coordinates of the
midpoints of the sides (Figure 11.30). It is not too difficult to show (see
Exercise 17) that the rule of order 1 is exact for polynomials of degree 1,
while the rule of order 3 is exact for polynomials of degree 2.
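
Both triangle rules can be tried out on the unit right triangle, for which the exact integrals of x, x^2 and xy are 1/6, 1/12 and 1/24. The sketch below is illustrative only; the function names are hypothetical and the vertices are passed as a list of three (x, y) pairs.

```python
def triangle_area(vertices):
    """Area of the triangle with the given three (x, y) vertices."""
    (x1, y1), (x2, y2), (x3, y3) = vertices
    return 0.5 * abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))


def centroid_rule(f, vertices):
    """One-point rule: area times the value of f at the centroid."""
    cx = sum(v[0] for v in vertices) / 3.0
    cy = sum(v[1] for v in vertices) / 3.0
    return triangle_area(vertices) * f(cx, cy)


def midpoint_rule(f, vertices):
    """Three-point rule: area/3 times the sum of f at the side midpoints."""
    midpoints = [((vertices[i][0] + vertices[(i + 1) % 3][0]) / 2.0,
                  (vertices[i][1] + vertices[(i + 1) % 3][1]) / 2.0)
                 for i in range(3)]
    return triangle_area(vertices) / 3.0 * sum(f(x, y) for x, y in midpoints)


if __name__ == "__main__":
    unit = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    print(centroid_rule(lambda x, y: x, unit))       # linear integrand
    print(midpoint_rule(lambda x, y: x * x, unit))   # quadratic integrand
    print(centroid_rule(lambda x, y: x * x, unit))   # one-point rule, quadratic
```

The last line shows the one-point rule applied to a quadratic, where it no longer reproduces the exact value 1/12.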
FIGURE 11.30. Integration rules on the reference triangle (1-point
integration and 3-point integration)

11.7 Bibliographical remarks


The basic ideas set out in Sections 11.1 and 11.2 may be found in most
books on finite elements, though the style and emphasis vary considerably
from one book to another. The subject first gained prominence through
its use in the solution of problems in solid and structural mechanics, and
the vast majority of texts, though covering most of the essential ideas, are
directed at those whose interests lie in mechanics. For some insight into
the 'real' applications of the method the books of Zienkiewicz and Taylor
[56, 57] are recommended. Burnett [9] also gives examples of a number of
physical applications, many of them from outside mechanics. The Finite
Element Handbook [24] is an encyclopaedic work which covers just about
every aspect of the subject, including a wealth of physical applications.
Computational considerations have not been discussed in detail in this
exposition. This is a huge topic in its own right, and a number of texts
provide comprehensive coverage of the procedures which are relevant to
writing efficient computer programs. The texts [9, 56, 57] are again good
sources, as are the works by Hughes [22] and by Dhatt and Touzot [14].
Much valuable theoretical background may also be found in these works.
Other useful elementary texts include those by Becker, Carey and Oden
[5], and by Johnson [23]. The latter provides a fairly comprehensive
treatment of convection-diffusion problems, for which it is necessary to
deviate from the standard Galerkin-based approach.

11.8 Exercises
The finite element method for second order problems

11.1. Assume that the space $X_e$ spanned by local basis functions belongs
to $H^1(\Omega_e)$, and that $X^h \subset C(\bar\Omega)$. Show that
$X^h \subset H^1(\Omega)$. [Take any $v \in X^h$: apply Green's theorem to
$\int_{\Omega_e} (\partial v/\partial x_i)\psi \, dx$, where
$\psi \in C_0^\infty(\Omega)$; then sum over all elements.]
11.2. The half-bandwidth of a symmetric matrix K is the smallest number
B corresponding to which $K_{ij} = 0$ for all i and for all j satisfying
$|j - i + 1| > B$. Number the nodes in the finite element mesh shown,
in such a way that the resulting stiffness matrix has as small a
half-bandwidth as possible.

One-dimensional problems

11.3. Rework Example 1 using a mesh of two elements and the quadratic
local basis functions
$\hat N_1(\xi) = \tfrac12 \xi(\xi - 1)$, $\hat N_2(\xi) = 1 - \xi^2$,
$\hat N_3(\xi) = \tfrac12 \xi(\xi + 1)$.
11.4. Let $X^h$ be the space spanned by piecewise linear functions, that is,
$X_e = P_1(\Omega_e)$, where $\Omega_e \subset \Omega \subset \mathbb{R}$.
Let f be any function defined on $\Omega$, and assume that f can be
differentiated as many times as desired. Let $f_h$ be the interpolate of f
in $X^h$. The purpose of this exercise is to show that the interpolation
error $e = f - f_h$ satisfies the error bound

$$\|e\|_\infty = \max_{0 \le x \le 1} |f(x) - f_h(x)| \le \frac{h^2}{8} \max_{0 \le x \le 1} |f''(x)|$$

where h is the length of an element. Expand e(x) in a Taylor series
about any point $\bar x$ in $\Omega_e$, that is,

$$e(x) = e(\bar x) + e'(\bar x)(x - \bar x) + \tfrac12 e''(z)(x - \bar x)^2$$

where z is a point between x and $\bar x$. Select $\bar x$ to be the point
at which e is a maximum; then derive the result

$$|e(\bar x)| = \tfrac12 |e''(z)| (x_i - \bar x)^2$$

where $x_i$ is one of the nodes of $\Omega_e$. Assuming that $x_i$ is the
node nearer to $\bar x$, obtain the error estimate.

11.5. Use Exercise 4 to estimate the error $\|f - f_h\|_\infty$ if $f$ is the function
$f(x) = x \sin \pi x$ on the domain $\Omega = (0, 1)$. Compute the actual error
using two, three and four elements, and compare with the estimate.
Plot a log-log graph of error vs. $h$ and plot the three points
corresponding to the three actual errors obtained. Do these points indicate
a quadratic rate of convergence?

Two-dimensional elements

11.6. Show that the basis functions $N_i$ obtained by patching together quadratic
local basis functions $N_i^{(e)}$ on triangular elements are continuous.

11.7. It is possible to eliminate the interior node in elements such as the
nine-noded quadrilateral, and in so doing to arrive at an element
which has nodal points only at the vertices and the midpoints of
the sides. Using Pascal's triangle, consider which terms should be
contained in such an approximation, and derive the local basis
functions. This eight-noded element is known as a serendipity element,
presumably as a result of its accidental discovery.

11.8. Rework Example 2 using the mesh shown below:

Fourth-order problems and Hermite families of elements

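11.9. Using a mesh of two elements, find an approximate solution to the
beam problem (11.31), and compare this with the exact solution

$w(x) = \frac{f L^4}{EI} \left[ \frac{1}{24} \left( \frac{x}{L} \right)^4 - \frac{1}{12} \left( \frac{x}{L} \right)^3 + \frac{1}{24} \left( \frac{x}{L} \right)^2 \right].$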
11.10. Prove Theorem 1.

Isoparametric elements

11.11. Prove that the isoparametric map from the reference element to a
parallelogram is necessarily affine.

11.12. Determine the range of values of $d$ for which the quadrilateral
element shown below has a Jacobian determinant which is everywhere
positive.

[Figure: the quadrilateral element of Exercise 11.12, labelled with the dimensions 1 and $d$]

Numerical integration
11.13. Following the procedure used in Example 7, find the sampling points
and weights corresponding to a Gauss quadrature rule of order 3 on
the reference triangle.
11.14. Rework Example 2 using the method of Example 6, with $2 \times 2$ Gauss
quadrature.
11.15. The purpose of this exercise is to explore the consequences of under-
integration, the process whereby the terms in the stiffness matrix are
obtained by using an integration scheme of a lower order than that
required for exact integration. Consider an element in the form of the
reference square $(-1, 1) \times (-1, 1)$ (that is, $\Omega_e = \hat\Omega$) and suppose that
the bilinear form is that corresponding to the Laplacian operator.
(a) The basis functions (11.27) may be expressed in vectorial form
as

find the constant vectors a, b, c and d.


(b) The element stiffness matrix is given by (11.33) and the integrand
may be expressed in the alternative form $(\nabla N)(\nabla N)^T$,
where $\nabla N$ is the $4 \times 2$ matrix with entries $\partial N_I/\partial \xi_k$. Evaluate
the stiffness matrix by integrating exactly, and show that the
null space of this matrix is spanned by the single vector a.
(c) Evaluate the stiffness matrix again, this time using a one-point
integration rule with sampling point (0,0) and weight w = 4.
Show that the resulting matrix has a two-dimensional null space,
spanned by a and d.
Underintegration has an obvious economical advantage when large
problems are required to be solved; but in making use of this
procedure, it is necessary to remove the additional vector d from the null
space, since the desired solution will be polluted by this vector. Highly
effective schemes exist for achieving this end.

11.16. Show that the integration rule (11.39) is exact for polynomials of
degree 1, while the rule (11.40) is exact for polynomials of degree 2.
12
Analysis of the finite element method

Chapter 11 has been devoted to a detailed account of the finite element
method, with the focus being on the basic ideas underlying the method, as
well as a number of issues that arise in practice. The goal of this chapter is
to take developments a step forward, and to provide a mathematical
justification for the method. In other words, we return to the problem posed
in Chapter 10, in the context of the Galerkin method: given a variational
boundary value problem with solution $u$ and approximate solution $u_h$,
estimate the error $u - u_h$, and determine the rate of convergence of $u_h$ to $u$ as
$h \to 0$. This problem is now addressed in the context of the finite element
method.
It was seen in Chapter 10 that the error $\|u - u_h\|_V$ is bounded, up to
a multiplicative constant, by the shortest distance from $u$ to the subspace
$V^h$ (Theorem 2, Chapter 10). This is Cea's Lemma, and it forms the
cornerstone of the analysis of the finite element method; indeed, since this
shortest distance is in turn bounded above by the distance $\|u - \tilde u_h\|_V$
between $u$ and its interpolate $\tilde u_h \in V^h$, sharp estimates of the interpolation
error will suffice to obtain a knowledge of the finite element approximation
error.
The aim of this chapter, therefore, is to obtain such interpolation
estimates. The theory is developed in the context of elements that are obtained
by affine maps from a reference element, so that the domain $\Omega$ is assumed to
have a boundary that is polygonal in $\mathbb{R}^2$, and polyhedral in $\mathbb{R}^3$. Otherwise
the theory presented here is quite general in nature.
Section 12.1 is devoted to a discussion of affine families of elements, and
of interpolation operators. In Section 12.2 the aim is to derive estimates of
the interpolation error on a single element. This estimate takes the form of
a bound on the $H^m$-seminorm of $u - \tilde u_h$, in terms of geometrical properties
of the element. Then in Section 12.3 error estimates are derived for
second-order problems, in appropriate Sobolev norms. The final section of this
chapter is devoted to a discussion of the modifications that must be made to
the theory in order to accommodate the presence of curved boundaries, and
also to incorporate into the estimates the error due to numerical integration.

FIGURE 12.1. Generation of a finite element mesh by a family of affine maps

12.1 Affine families of elements


In this section we start to set up the machinery that is vital to a proper
development of error estimates for finite element approximations.

Affine-equivalent elements. We consider a situation in which a domain
$\Omega$ has been partitioned into $E$ finite elements, all elements being of the
same geometrical type (for example, all triangles) and having the same
degree of approximation (for example, all three-noded triangles). Such a
finite element mesh may be generated simply by setting up a single
reference element $\hat\Omega$, say, and by mapping or transforming $\hat\Omega$ into each one of
the elements $\Omega_e$ in turn (Figure 12.1).
The basic idea has been encountered in Chapter 11, and is very simple.
First, define the reference element $\hat\Omega$, this element being of the same
geometrical type as the elements that make up $\Omega$. Next, define an affine
transformation, that is, a transformation that maps straight lines into straight
lines, by

$x = F_e(\xi) = T_e \xi + b_e$ (12.1)

so that $F_e$ maps each point $\xi$ of $\hat\Omega$ to a point $x$ of $\Omega_e$. Here $T_e$ is an
invertible $n \times n$ matrix and $b_e$ is a translation vector. We also require of
$F_e$ that it maps the nodal point $\xi_I$ of $\hat\Omega$ to the (locally numbered) nodal
point $x_I^{(e)}$ of $\Omega_e$:

$F_e(\xi_I) = x_I^{(e)}.$ (12.2)

FIGURE 12.2. The constants $h_e$ and $\rho_e$ associated with an element (one element with small $h_e/\rho_e$, one with large $h_e/\rho_e$)

Once a set of affine transformations has been constructed in this way for
each element, we need to focus attention only on the reference element
$\hat\Omega$ and the family of transformations $F_1, F_2, \ldots, F_E$, since these provide a
complete description of the mesh.
When two elements $\hat\Omega$ and $\Omega_e$ are related to each other by a transformation
of the type (12.1), (12.2), they are said to be affine-equivalent. Also,
a set of finite elements $\Omega_1, \ldots, \Omega_E$ is called an affine family if all elements
are affine-equivalent to a single reference element $\hat\Omega$.
It should be clear from the discussion in Section 11.3 that affine maps of
the form (12.1), (12.2) exist in $\mathbb{R}$, and in $\mathbb{R}^2$ from one triangle to another,
and as far as quadrilaterals go, most generally from one parallelogram to
another. Similar results hold in $\mathbb{R}^3$ for tetrahedra and 3-rectangles or "bricks".
We are thus assured that affine maps are always available for the elements
with which we are concerned.
The relative size and shape of an arbitrary element $\Omega_e$ are quantified in
a natural way by defining the constants

$h_e = \operatorname{diam}(\Omega_e) = \max \{ |x - y| : x, y \in \Omega_e \}$ (12.3)

and

$\rho_e = \sup \{ \text{diameters of all spheres contained in } \Omega_e \}.$ (12.4)

When dealing with the reference element $\hat\Omega$ we denote the corresponding
constants by $\hat h$ and $\hat\rho$. These quantities are illustrated in Figure 12.2;
whereas $h_e$ gives some idea of the "size" of $\Omega_e$, the ratio $h_e/\rho_e$ gives an
indication of how "thin" the element is.
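For a triangle the two constants can be written down explicitly, and the following short Python sketch (not part of the original text) evaluates them. It assumes the standard geometric facts that the diameter of a triangle is the length of its longest edge and that the largest inscribed circle has diameter 4 x area / perimeter; the function name `triangle_size_constants` and the sample vertices are chosen purely for illustration.

```python
import math


def triangle_size_constants(p1, p2, p3):
    """Return (h_e, rho_e) for the triangle with vertices p1, p2, p3.

    h_e   : diameter of the element, i.e. the length of its longest edge
    rho_e : diameter of the inscribed circle, 4 * area / perimeter
    """
    a = math.dist(p2, p3)
    b = math.dist(p1, p3)
    c = math.dist(p1, p2)
    # Twice the signed area, from the cross product of two edge vectors.
    cross = (p2[0] - p1[0]) * (p3[1] - p1[1]) - (p3[0] - p1[0]) * (p2[1] - p1[1])
    area = 0.5 * abs(cross)
    h_e = max(a, b, c)
    rho_e = 4.0 * area / (a + b + c)
    return h_e, rho_e


# A well-shaped element and a thin one; the ratio h_e / rho_e tells them apart.
samples = [
    ("right isosceles", ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))),
    ("thin", ((0.0, 0.0), (1.0, 0.0), (0.5, 0.01))),
]
for name, tri in samples:
    h_e, rho_e = triangle_size_constants(*tri)
    print(f"{name}: h_e = {h_e:.4f}, rho_e = {rho_e:.4f}, ratio = {h_e / rho_e:.2f}")
```

The thin triangle has a much larger ratio than the right isosceles one, which is the sense in which $h_e/\rho_e$ measures how "thin" an element is.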
We now summarize some useful properties of the affine transformation
(12.1).

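LEMMA 1. Let $F_e : \hat\Omega \to \Omega_e$ be the affine map from $\hat\Omega$ to $\Omega_e$ defined by
(12.1), for $\hat\Omega, \Omega_e \subset \mathbb{R}^n$. If the matrix norm $\|T_e\|$ is defined by

$\|T_e\| = \sup_{\xi \ne 0} \frac{\|T_e \xi\|}{\|\xi\|}$

with $\|\xi\| = \left( \sum_{i=1}^n \xi_i \xi_i \right)^{1/2}$ for any $\xi \in \mathbb{R}^n$, then

$\|T_e\| \le \frac{h_e}{\hat\rho}$ and $\|T_e^{-1}\| \le \frac{\hat h}{\rho_e}.$

PROOF. Let $z = \hat\rho \, \xi / \|\xi\|$; then $\|z\| = \hat\rho$ and, for $\xi \ne 0$,

$\|T_e\| = \sup_{\xi \ne 0} \frac{\|T_e \xi\|}{\|\xi\|} = \sup_{\|z\| = \hat\rho} \frac{\| (\|\xi\|/\hat\rho) \, T_e z \|}{\|\xi\|} = \sup_{\|z\| = \hat\rho} \frac{\|T_e z\|}{\hat\rho}.$

Now pick any two points $\xi$ and $\eta$ in $\hat\Omega$ that lie on the sphere of diameter
$\hat\rho$; then $\|\xi - \eta\| = \hat\rho$ and so

$\|T_e\| = \hat\rho^{-1} \sup \|T_e(\xi - \eta)\|$
$= \hat\rho^{-1} \sup \|(T_e \xi + b_e) - (T_e \eta + b_e)\|$
$= \hat\rho^{-1} \sup \|x - y\| \le h_e/\hat\rho.$

The second inequality follows similarly (see Exercise 12.1). □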
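The two bounds of Lemma 1 are easy to check numerically for triangles in the plane. The sketch below is an illustration only, under the following assumptions: the reference element is taken to be the triangle with vertices (0,0), (1,0), (0,1); the matrix norm is the spectral norm, computed in closed form for a 2 x 2 matrix; and the helper names (`size_constants`, `check_lemma1`) are hypothetical.

```python
import math


def size_constants(p1, p2, p3):
    """Diameter h and inscribed-circle diameter rho of a triangle."""
    a, b, c = math.dist(p2, p3), math.dist(p1, p3), math.dist(p1, p2)
    area = 0.5 * abs((p2[0] - p1[0]) * (p3[1] - p1[1])
                     - (p3[0] - p1[0]) * (p2[1] - p1[1]))
    return max(a, b, c), 4.0 * area / (a + b + c)


def singular_values_2x2(t):
    """Largest and smallest singular values of a 2 x 2 matrix t."""
    (a, b), (c, d) = t
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d   # entries of t^T t
    s_max = math.sqrt(0.5 * (p + r + math.sqrt((p - r) ** 2 + 4.0 * q * q)))
    s_min = abs(a * d - b * c) / s_max                       # |det t| = s_max * s_min
    return s_max, s_min


def check_lemma1(p1, p2, p3):
    """Return (||T_e||, h_e/rho_hat, ||T_e^{-1}||, h_hat/rho_e)."""
    h_hat, rho_hat = size_constants((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
    h_e, rho_e = size_constants(p1, p2, p3)
    # Columns of T_e are the edge vectors p2 - p1 and p3 - p1; b_e = p1.
    t_e = ((p2[0] - p1[0], p3[0] - p1[0]), (p2[1] - p1[1], p3[1] - p1[1]))
    s_max, s_min = singular_values_2x2(t_e)
    return s_max, h_e / rho_hat, 1.0 / s_min, h_hat / rho_e


for tri in [((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)),
            ((0.2, 0.1), (1.5, 0.4), (0.6, 1.1)),
            ((0.0, 0.0), (1.0, 0.0), (0.5, 0.01))]:
    norm_t, bound_t, norm_inv, bound_inv = check_lemma1(*tri)
    print(f"||T|| = {norm_t:.4f} <= {bound_t:.4f}: {norm_t <= bound_t};  "
          f"||T^-1|| = {norm_inv:.4f} <= {bound_inv:.4f}: {norm_inv <= bound_inv}")
```

For a thin element the norm of the inverse map becomes large, and the bound $\hat h/\rho_e$ grows with it, which is why the ratio $h_e/\rho_e$ reappears in the error estimates that follow.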
Mappings of functions. Suppose that we are given a continuous function
$v$ defined on $\Omega_e$; making use of the affine map (12.1), we can set up an
operator $K_e : C(\Omega_e) \to C(\hat\Omega)$ that maps $v$ to a function $\hat v$ in $C(\hat\Omega)$, the
function $\hat v$ being defined by

$\hat v(\xi) = v(x),$ (12.5)

where $x = F_e(\xi)$ (Figure 12.3). The operator $K_e$ is invertible with inverse
$K_e^{-1}$, so that

$v = K_e^{-1} \hat v.$ (12.6)

Now suppose that $\{\hat N_I\}_{I=1}^M$ is a set of local basis functions defined on $\hat\Omega$
with the usual property that

$\hat N_I(\xi_J) = 1$ if $J = I$, and $\hat N_I(\xi_J) = 0$ otherwise,

for nodal points $\xi_J$. The function $\hat N_I$ is a polynomial of degree $k$, say, that
can be mapped to $C(\Omega_e)$ using (12.6):

$K_e^{-1} \hat N_I = N_I^{(e)}.$

Here $\{N_I^{(e)}\}_{I=1}^M$ is the corresponding set of polynomial local basis functions
defined on $\Omega_e$; these functions also have the property that $N_I^{(e)}(x_I) = 1$ and
$N_I^{(e)}(x_J) = 0$ for $J \ne I$ since (12.5) implies that $\hat N_I(\xi_J) = N_I^{(e)}(x_J)$ (we
have in fact carried out this transformation for one- and two-dimensional
problems in Sections 11.2 and 11.3).

FIGURE 12.3. The map $K_e$
As usual, $\{\hat N_I\}$ spans a space $\hat X$ (of polynomials, in our case) and so
we can construct a projection operator $\hat\Pi$ that maps any $\hat v \in C(\hat\Omega)$ to its
interpolate in $\hat X$, according to

$\hat\Pi : C(\hat\Omega) \to \hat X, \quad \hat\Pi \hat v = \sum_{I=1}^M \hat v(\xi_I) \hat N_I.$ (12.7)

Similarly, we define the projection operator $\Pi_e$ by

$\Pi_e : C(\Omega_e) \to X_e, \quad \Pi_e v = \sum_{I=1}^M v(x_I) N_I^{(e)},$ (12.8)

where $X_e = \operatorname{span}\{N_I^{(e)}\}$ and $\Pi_e v$ is the interpolate of $v$ in $X_e$. We come
now to a crucial question about such interpolations: given a function $v$ in
$C(\Omega_e)$ and its image $K_e v$ or $\hat v$ in $C(\hat\Omega)$, are $\hat\Pi(K_e v)$ and $K_e(\Pi_e v)$ the same
functions? That is, if we map $v$ to $\hat v$ and then interpolate in $\hat\Omega$, is this the
same as first interpolating $v$ and then mapping it? A glance at the sketch
in Figure 12.3 (for linear interpolations) would seem to indicate that this
is plausible; we now prove the assertion.

THEOREM 1. Let $\hat\Omega$ and $\Omega_e$ be affine-equivalent finite elements. Then the
interpolation operators $\hat\Pi$ and $\Pi_e$ are such that

$\hat\Pi(K_e v) = K_e(\Pi_e v)$ or $\hat\Pi \hat v = \widehat{\Pi_e v}.$

PROOF. We have

$\Pi_e v = \sum_{I=1}^M v(x_I) N_I^{(e)}$

by virtue of (12.8). Hence

$K_e(\Pi_e v) = K_e \left( \sum_{I=1}^M v(x_I) N_I^{(e)} \right)$
$= \sum_{I=1}^M v(x_I) K_e N_I^{(e)}$ ($K_e$ is a linear operator)
$= \sum_{I=1}^M \hat v(\xi_I) \hat N_I,$

which is precisely $\hat\Pi \hat v$. □
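The commutation property of Theorem 1 can also be observed numerically. The following sketch is an illustration under stated assumptions rather than part of the original argument: it works in one dimension, takes the reference element to be the interval (-1, 1) with the three quadratic nodes -1, 0, 1, and uses an arbitrarily chosen element (0.3, 0.8) and test function $v = \exp$. All function names are hypothetical.

```python
import math

REF_NODES = (-1.0, 0.0, 1.0)  # quadratic element on the reference interval


def ref_basis(i, xi):
    """Lagrange basis function N_i on the reference element, evaluated at xi."""
    value = 1.0
    for j, node in enumerate(REF_NODES):
        if j != i:
            value *= (xi - node) / (REF_NODES[i] - node)
    return value


def make_affine_map(a, b):
    """Return F_e and its inverse for the element (a, b)."""
    t, shift = 0.5 * (b - a), 0.5 * (a + b)
    return (lambda xi: t * xi + shift), (lambda x: (x - shift) / t)


def pullback(v, f_e):
    """K_e v: the function xi -> v(F_e(xi)) on the reference element."""
    return lambda xi: v(f_e(xi))


def interpolate_ref(w):
    """Reference interpolation operator applied to a function w of xi."""
    return lambda xi: sum(w(n) * ref_basis(i, xi) for i, n in enumerate(REF_NODES))


def interpolate_elem(v, f_e, f_e_inv):
    """Pi_e v, built from the mapped basis functions N_i^(e) = N_i o F_e^{-1}."""
    nodes = [f_e(n) for n in REF_NODES]
    return lambda x: sum(v(xn) * ref_basis(i, f_e_inv(x)) for i, xn in enumerate(nodes))


f_e, f_e_inv = make_affine_map(0.3, 0.8)
v = math.exp
left = interpolate_ref(pullback(v, f_e))                   # Pi_hat(K_e v)
right = pullback(interpolate_elem(v, f_e, f_e_inv), f_e)   # K_e(Pi_e v)
max_gap = max(abs(left(-1.0 + 0.1 * k) - right(-1.0 + 0.1 * k)) for k in range(21))
print(f"max |Pi_hat(K_e v) - K_e(Pi_e v)| at the sample points: {max_gap:.2e}")
```

The two functions agree up to rounding error, as the theorem asserts; the agreement relies on the map being affine, so that the mapped basis functions remain polynomials of the same degree.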

12.2 Local interpolation error estimates


Recall from the discussion of the convergence of Galerkin approximations
in Chapter 10 that the error $\|u - u_h\|$, measured in some appropriate norm,
can be bounded above by the interpolation error $\|u - \tilde u_h\|$, where $\tilde u_h$ is the
interpolate of $u$ in $V^h$. The task of estimating the Galerkin error
consequently reduces to one of estimating the interpolation error. We go one
step further towards obtaining such an estimate by deriving in this section
an estimate of the interpolation error $\|v - \Pi_e v\|$ for functions defined on a
single finite element $\Omega_e$. Once this estimate has been found, it can be used
to obtain an estimate for functions defined over the entire domain $\Omega$.
As before, the finite-dimensional space $X_e$ spanned by local basis
functions $N_I^{(e)}$ contains polynomials of degree $\le k$, for some $k \ge 1$. In other
words, either $X_e = P_k(\Omega_e)$ or (as in the case of rectangular elements in
$\mathbb{R}^2$) $X_e = Q_l(\Omega_e)$ with $l$ large enough so that $P_k(\Omega_e) \subset Q_l(\Omega_e)$. We show
eventually that an interpolation error estimate in the $H^m$-norm can be
derived for a function $v$ that is smooth enough to be in $H^{k+1}(\Omega_e)$, and so
consider the situation in which there are two spaces $H^{k+1}(\Omega_e)$ and $H^m(\Omega_e)$
with $k + 1 \ge m$, and a projection operator $\Pi_e$ that maps members of
$H^{k+1}(\Omega_e)$ to $H^m(\Omega_e)$, the images $\Pi_e v$ all lying in $X_e$ (Figure 12.4):

$\Pi_e : H^{k+1}(\Omega_e) \to H^m(\Omega_e), \quad R(\Pi_e) = X_e.$ (12.9)

The projection operator $\Pi_e$ is defined by (12.8), and since $P_k(\Omega_e) \subset X_e$ by
assumption, it has the property that

$\Pi_e v = v$ for any $v \in P_k(\Omega_e).$ (12.10)

Similarly,

$\hat\Pi \hat v = \hat v$ for any $\hat v \in P_k(\hat\Omega).$ (12.11)

FIGURE 12.4. The action of the operator $\Pi_e$

The main result in this section is: for $v \in H^{k+1}(\Omega_e)$ and $\Pi_e$ satisfying
the preceding properties, the interpolation error in the $H^m$-norm can be
estimated by

$\|v - \Pi_e v\|_{m,\Omega_e} \le C h_e^{k+1-m} |v|_{k+1,\Omega_e},$

where $h_e$ is defined in (12.3) and $|\cdot|_{s,\Omega_e}$ denotes the Sobolev seminorm:

$|v|_{s,\Omega_e}^2 = \sum_{|\alpha| = s} \int_{\Omega_e} [D^\alpha v(x)]^2 \, dx$

(recall also that the Sobolev norm $\|\cdot\|_{s,\Omega_e}$ is given by $\|v\|_{s,\Omega_e}^2 = \sum_{r=0}^s |v|_{r,\Omega_e}^2$).
Here and subsequently the norm on $H^s(\Omega)$ is denoted by $\|\cdot\|_{s,\Omega}$ rather than
the more cumbersome $\|\cdot\|_{H^s(\Omega)}$. We start the development by recording
an important result that is required later.

THEOREM 2. There is a constant $C$, depending only on the geometry of $\Omega$,
such that for all $v \in H^{k+1}(\Omega)$,

$\inf_{p \in P_k(\Omega)} \|v + p\|_{k+1,\Omega} \le C |v|_{k+1,\Omega}.$ (12.12)

PROOF. We use the Poincaré inequality (7.19); replacing $u$ by $v + p$ and
noting that $D^\alpha p = 0$ for $|\alpha| = k + 1$, we have

$\|v + p\|_{k+1}^2 \le C \left( |v|_{k+1}^2 + \sum_{|\alpha| < k+1} \left\{ \int_\Omega D^\alpha (v + p) \, dx \right\}^2 \right).$ (12.13)

Now construct a polynomial $\bar p$ in $P_k(\Omega)$ that has the property that

$\int_\Omega D^\alpha (v + \bar p) \, dx = 0$ for $|\alpha| \le k.$ (12.14)

This can always be done: set $|\alpha| = k$; then $D^\alpha \bar p$ equals the coefficient of
$x^\alpha$, that can be solved for using (12.14). Having solved for all coefficients
of terms of order $k$, set $|\alpha| = k - 1$, and use (12.14) to solve for coefficients
of terms of order $k - 1$. Proceeding in this way, we find $\bar p$ for any given $v$.
With $p = \bar p$ in (12.13), we have

$\inf_{p \in P_k(\Omega)} \|v + p\|_{k+1}^2 \le \|v + \bar p\|_{k+1}^2 \le C |v|_{k+1}^2,$

from which (12.12) follows. □


Next we need to know how the seminorms of the functions $v$ and of $\hat v$
are related.

THEOREM 3. Let $\Omega_e$ and $\hat\Omega$ be two affine-equivalent open subsets of $\mathbb{R}^n$.
Then for any functions $v \in H^s(\Omega_e)$ and $\hat v = K_e v \in H^s(\hat\Omega)$,

(12.15)

and

(12.16)

where $T_e$ is the matrix occurring in the affine map (12.1).

PROOF. We prove (12.15); (12.16) is proved in a similar fashion. Now

$|\hat v|_{s,\hat\Omega}^2 = \sum_{|\alpha| = s} \int_{\hat\Omega} (D^\alpha \hat v(\xi))^2 \, d\xi = \sum_{|\alpha| = s} \int_{\Omega_e} (D^\alpha \hat v(\xi))^2 \, |\det T_e|^{-1} \, dx$ (12.17)

(using the result from multivariable calculus that if $\xi_i = f_i(x_j)$, then
$d\xi \equiv d\xi_1 \, d\xi_2 \cdots d\xi_n = |\det(\partial f_i/\partial x_j)| \, dx_1 \, dx_2 \cdots dx_n$).
By an application of the chain rule we have (see Exercise 12.6), for fixed
$x$ and $\xi$,

(12.18)

(since $\xi$ and $x$ are fixed, $D^\alpha \hat v(\xi)$ and $D^\alpha v(x)$ are simply real numbers).
Hence (12.17) becomes

$|\hat v|_{s,\hat\Omega}^2 \le \sum_{|\alpha| = s} \int_{\Omega_e} (D^\alpha v(x))^2 \, \|T_e\|^{2s} (\det T_e)^{-1} \, dx$

from which (12.15) follows, since $\|T_e\|$ and $\det T_e$ are constant. □
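The factor $\|T_e\|^{2s} (\det T_e)^{-1}$ appearing in the last display can be seen at work in one dimension, where the affine map is $x = h\xi + a$ and $T_e$ reduces to the element length $h$. In that case the change of variables gives $|\hat v|_{1,(0,1)} = h^{1} h^{-1/2} |v|_{1,(a,a+h)}$, with equality. The following sketch checks this numerically; the choices $v = \sin$, $a = 0.25$, $h = 0.1$, the composite Simpson rule and the central-difference derivative are all assumptions made for illustration.

```python
import math


def simpson(g, lo, hi, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * step)
    return total * step / 3.0


def derivative(g, x, eps=1e-5):
    """Central-difference approximation to g'(x)."""
    return (g(x + eps) - g(x - eps)) / (2.0 * eps)


def h1_seminorm(g, lo, hi):
    """Approximate |g|_1 on (lo, hi), the L2 norm of the first derivative."""
    return math.sqrt(simpson(lambda x: derivative(g, x) ** 2, lo, hi))


a, h, s = 0.25, 0.1, 1           # element (a, a + h); seminorm of order s = 1
v = math.sin
v_hat = lambda xi: v(h * xi + a)  # K_e v on the reference interval (0, 1)

lhs = h1_seminorm(v_hat, 0.0, 1.0)
rhs = h ** s * h ** -0.5 * h1_seminorm(v, a, a + h)
print(f"|v_hat|_1 = {lhs:.10f},  h^s h^(-1/2) |v|_1 = {rhs:.10f}")
```

In higher dimensions the equality becomes an inequality, since $\|T_e\|$ only bounds the stretching of derivatives in each direction.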
We come now to the interpolation error estimate for the seminorm
$|v - \Pi_e v|_{m,\Omega_e}$.

THEOREM 4. Let k and m be nonnegative integers such that

and

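Let $\hat\Pi$ and $\Pi_e$ be the operators defined in (12.7) and (12.8). Then for any
affine-equivalent element $\Omega_e$ and for all functions $v \in H^{k+1}(\Omega_e)$,

$|v - \Pi_e v|_{m,\Omega_e} \le \hat C \, \frac{h_e^{k+1}}{\rho_e^m} \, |v|_{k+1,\Omega_e},$ (12.19)

where $h_e$ and $\rho_e$ are defined in (12.3) and (12.4), and $\hat C$ is a constant
depending on $\hat\Omega$ and $\hat\Pi$.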

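PROOF. We have, for all $\hat v \in H^{k+1}(\hat\Omega)$ and all $\hat p \in P_k(\hat\Omega)$, and using (12.11),

$|\hat v - \hat\Pi \hat v|_{m,\hat\Omega} \le \|\hat v - \hat\Pi \hat v\|_{m,\hat\Omega} = \|\hat v - \hat\Pi \hat v + \hat p - \hat\Pi \hat p\|_{m,\hat\Omega}$
$\le \|I(\hat v + \hat p) - \hat\Pi(\hat v + \hat p)\|_{m,\hat\Omega}$
$\le \|I(\hat v + \hat p)\|_{m,\hat\Omega} + \|\hat\Pi(\hat v + \hat p)\|_{m,\hat\Omega}$
$\le \underbrace{(\|I\| + \|\hat\Pi\|)}_{\hat C} \, \|\hat v + \hat p\|_{k+1,\hat\Omega}.$

The last line follows from the fact that $I$ and $\hat\Pi$ are bounded operators
from $H^{k+1}(\hat\Omega)$ to $H^m(\hat\Omega)$. The use of Theorem 2 now yields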

(12.20)

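From Theorem 1 we have $\hat\Pi(K_e v) = K_e(\Pi_e v)$, so that

$\hat v - \hat\Pi \hat v = K_e v - \hat\Pi(K_e v) = K_e(v - \Pi_e v);$ (12.21)

consequently, using (12.16) (replace $v$ by $v - \Pi_e v$ and set $s = m$) and
(12.21) we obtain

$|v - \Pi_e v|_{m,\Omega_e} \le \|T_e^{-1}\|^m \, |\det T_e|^{1/2} \, |K_e(v - \Pi_e v)|_{m,\hat\Omega}$
$= \|T_e^{-1}\|^m \, |\det T_e|^{1/2} \, |\hat v - \hat\Pi \hat v|_{m,\hat\Omega};$ (12.22)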

furthermore, from (12.15) with s = k + 1,


(12.23)

Finally, substituting (12.20) in (12.22), then (12.23) in that result we obtain

which, with the use of Lemma 1, leads to (12.19). □

REMARKS.

1. Since we wish to evaluate $|v - \Pi_e v|_{m,\Omega_e}$, it follows that both $v$ and $\Pi_e v$
must be in $H^m(\Omega_e)$ for this term to make sense. Equivalently, $\hat v$ and
$\hat\Pi \hat v$ must be in $H^m(\hat\Omega)$. This accounts for the inclusions
$H^{k+1}(\hat\Omega) \subset H^m(\hat\Omega)$ and $\hat X \subset H^m(\hat\Omega)$. Note that $v \in H^{k+1}(\Omega_e)$ implies
$\hat v \in H^{k+1}(\hat\Omega)$. The inclusion $H^{k+1}(\hat\Omega) \subset H^m(\hat\Omega)$ of course holds if
$m \le k + 1$.

2. In evaluating the interpolant $\Pi_e v$ of $v$, it is necessary to know the
nodal values of $v$. This in turn requires that $v$ be continuous, so that
we must have $v \in H^{k+1}(\Omega_e) \subset C(\Omega_e)$ or equivalently, $\hat v \in H^{k+1}(\hat\Omega) \subset C(\hat\Omega)$.
By the Sobolev Embedding Theorem, this inclusion holds if
$k + 1 > n/2$ for a problem in $\mathbb{R}^n$.

The two parameters $h_e$ and $\rho_e$ appearing in (12.19) may be reduced to
one if attention is restricted to finite elements for which the ratio $h_e/\rho_e$ is
bounded above, so that elements are not allowed to become too "flat". For
this purpose we introduce the notion of a regular family of finite elements.
A family $\{\Omega_1, \ldots, \Omega_E\}$ of finite elements is said to be regular if

(i) there exists a constant $\sigma$ such that $h_e/\rho_e \le \sigma$ for all elements;

(ii) the diameters $h_e$ approach zero.

In the case of regular families the error estimate of Theorem 4 can be
expressed in terms of a norm; this is recorded in the following.

COROLLARY TO THEOREM 4. Let the conditions of Theorem 4 hold, and
let $\{\Omega_1, \ldots, \Omega_E\}$ be a regular family of finite elements. Then there is a
constant $C$ such that, for any element $\Omega_e$ in the family, and all functions
$v \in H^{k+1}(\Omega_e)$,

$\|v - \Pi_e v\|_{m,\Omega_e} \le C h_e^{k+1-m} |v|_{k+1,\Omega_e}.$ (12.24)

It is not difficult (see Exercise 12.5) to deduce this result, and in
particular to show that it depends on Property (ii) of regular families of finite
elements.

Examples
1. Let $\Omega_e$ be the three-noded triangle in $\mathbb{R}^2$. The space $X_e$ spanned by
the local interpolation functions is $P_1(\Omega_e)$, so that $k = 1$. Assuming
that $v$ is smooth enough to belong to $H^2(\Omega_e)$, (12.24) gives

$\|v - \Pi_e v\|_{m,\Omega_e} \le C h_e^{2-m} |v|_{2,\Omega_e}.$ (12.25)

We confirm that the conditions of Theorem 3 hold: $H^{k+1}(\hat\Omega) = H^2(\hat\Omega) \subset C(\hat\Omega)$
by the Sobolev Embedding Theorem. Second, the
estimate (12.25) holds for all $m$ such that $m \le k + 1$; that is, $0 \le m \le 2$.

2. For problems such as those arising in linear elasticity, for which the
unknown variable is vector-valued, the set of results culminating in
(12.25) carries over virtually unchanged. We return to Chapter 9,
Example 3, for which case $V = \{v : v \in [H^1(\Omega)]^2, \ v = 0 \text{ on } \Gamma_1\}$.
This problem is posed on a domain in $\mathbb{R}^2$, so suppose that we make
use of four-noded rectangular elements, generated by a family of affine
maps (11.27) from the reference square.
Now the basis functions (11.27) corresponding to this element are
bilinear; thus the restriction to $\Omega_e$ of any function $v_h \in V^h$ will
belong to $Q_1(\Omega_e)$, and since $P_1 \subset Q_1 \subset P_2$ it follows that the value
of $k$ appropriate to this problem is $k = 1$. If the $H^m$-norm for
vector-valued functions is defined on $\Omega_e$ according to

$\|v\|_{m,\Omega_e}^2 = \|v_1\|_{m,\Omega_e}^2 + \|v_2\|_{m,\Omega_e}^2,$

then for all functions $v \in [H^2(\Omega_e)]^2$ there exists a constant $C$ such
that

12.3 Error estimates for second-order problems


Having established properties of finite element interpolations over
individual elements, we turn now to the question of interpolation of a function
defined on the entire domain $\Omega$. Specifically, we have a function $v \in C(\Omega)$,
and we construct its interpolant $v_h$ in the finite element space $X^h$ according
to

$v_h(x) = \sum_{i=1}^G v(x_i) N_i(x),$

where $N_i$ are the global basis functions that span $X^h$. As in Section 12.1,
we define a projection operator $\Pi_h$ that maps $v$ to its interpolant $v_h$ or
$\Pi_h v$:

$\Pi_h : C(\Omega) \to X^h, \quad \Pi_h v = \sum_{i=1}^G v(x_i) N_i.$ (12.26)

From the way in which the functions $N_i$ are constructed from local basis
functions $N_I^{(e)}$, it should be clear that the restriction of $\Pi_h v$ to any element
$\Omega_e$ is in fact $\Pi_e v$ (Figure 12.5):

$\Pi_h v|_{\Omega_e} = \Pi_e v.$

FIGURE 12.5. Global interpolation of a function

Since we are primarily interested in this section in obtaining error estimates
for second-order problems we must estimate $\|u - v_h\|_{1,\Omega}$ for any $v_h \in V^h$,
in accordance with Cea's Lemma. We choose for convenience $v_h = \Pi_h u$,
and so seek an estimate of the interpolation error $\|u - \Pi_h u\|_{1,\Omega}$ (recall that
$m = 1$ for second-order problems). In the same way as $\|u - \Pi_e u\|_{m,\Omega_e}$ is
estimated in terms of the parameter $h_e$, a suitable parameter is required
for the global estimate. For this purpose, suppose that we are dealing with
a regular family of finite elements, and set

$h = \max_{1 \le e \le E} h_e.$ (12.27)

The constant $h$ is called the mesh parameter, and is a measure of how
refined the mesh is: the smaller $h$ is, the larger the number of elements for
a given domain $\Omega$. Hence, if it is possible to obtain an interpolation error
estimate of the form

then we are assured of convergence as $h \to 0$, provided that $\beta > 0$.


The mesh parameter provides a natural way of quantifying the dimension
of the spaces $X^h$ or $V^h$ that occur in Galerkin approximations. Recall
from Chapter 10 that we discussed the notion of a family of problems,
parametrized by a real parameter $h$. The idea is that for each value of $h$
the approximate solution is sought in a finite-dimensional space $V^h$, with
the hope that the error $\|u - u_h\|$ approaches zero as $h \to 0$. At the time
$h$ was thought of as being, for example, $1/(\dim V^h)$. In the context of the
finite element method, though, the mesh parameter gives a measure of how
fine the subdivision of $\Omega$ is: the smaller $h$ is, the finer the subdivision, the
larger the number of elements and nodal points, and hence the larger the
dimension of $V^h$. Furthermore, there is now a clear sense in which $V^h$ can
be said to approach $V$ as $h \to 0$: it is required that $\|v - \Pi_h v\| \to 0$ as
$h \to 0$. Consequently we may use $h$, as defined in (12.27), as a measure of
the size of the subspace $V^h$ relative to $V$.
The following global interpolation error estimate establishes the precise
sense in which $V^h \to V$.

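THEOREM 5. Assume that all the conditions of Theorem 4 and its corollary
hold. Then there exists a constant $c$ independent of $h$ such that, for any
$v \in H^{k+1}(\Omega)$,

$\|v - \Pi_h v\|_{m,\Omega} \le c h^{k+1-m} |v|_{k+1,\Omega}$ for $m = 0, 1.$ (12.28)

PROOF. When $m = 1$, then $\hat X \subset H^1(\hat\Omega)$ and $X^h \subset C(\Omega)$ imply that
$X^h \subset H^1(\Omega)$ (see Exercise 12.7). Hence $\Pi_h u \in H^1(\Omega)$ with $\Pi_h u|_{\Omega_e} = \Pi_e u$
and we thus have, applying the Corollary to Theorem 4 with $m = 0$ or 1,

$\|u - \Pi_h u\|_{m,\Omega} = \left( \sum_{e=1}^E \|u - \Pi_e u\|_{m,\Omega_e}^2 \right)^{1/2}$
$\le \left( \sum_{e=1}^E C^2 h_e^{2(k+1-m)} |u|_{k+1,\Omega_e}^2 \right)^{1/2}$
$\le C h^{k+1-m} \left( \sum_{e=1}^E |u|_{k+1,\Omega_e}^2 \right)^{1/2}$
$= C h^{k+1-m} |u|_{k+1,\Omega}.$

This proves the theorem. □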
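The rates in the global estimate can be observed in a simple experiment. The sketch below is an illustration only, not taken from the text: it assumes a uniform mesh on (0, 1), piecewise linear interpolation ($k = 1$) of $u(x) = \sin \pi x$, and a three-point Gauss rule to approximate the norms. With these choices one expects the $L^2$ error ($m = 0$) to decay roughly like $h^2$ and the $H^1$-seminorm error ($m = 1$) roughly like $h$.

```python
import math

# Three-point Gauss rule on (-1, 1): (abscissa, weight) pairs.
GAUSS3 = ((-math.sqrt(0.6), 5.0 / 9.0), (0.0, 8.0 / 9.0), (math.sqrt(0.6), 5.0 / 9.0))


def interpolation_errors(u, du, n_elem):
    """L2 norm and H1 seminorm of u - Pi_h u for piecewise linear Pi_h on (0, 1)."""
    h = 1.0 / n_elem
    err0_sq = 0.0
    err1_sq = 0.0
    for e in range(n_elem):
        xl, xr = e * h, (e + 1) * h
        ul, ur = u(xl), u(xr)
        slope = (ur - ul) / h                  # derivative of the interpolant
        for xi, w in GAUSS3:
            x = 0.5 * (xl + xr) + 0.5 * h * xi
            interp = ul + slope * (x - xl)
            err0_sq += 0.5 * h * w * (u(x) - interp) ** 2
            err1_sq += 0.5 * h * w * (du(x) - slope) ** 2
    return math.sqrt(err0_sq), math.sqrt(err1_sq)


u = lambda x: math.sin(math.pi * x)
du = lambda x: math.pi * math.cos(math.pi * x)

previous = None
for n_elem in (4, 8, 16, 32, 64):
    e0, e1 = interpolation_errors(u, du, n_elem)
    line = f"E = {n_elem:3d}  L2 error = {e0:.3e}  H1 error = {e1:.3e}"
    if previous is not None:
        # Observed order of convergence between successive meshes.
        line += (f"  rates: {math.log2(previous[0] / e0):.2f},"
                 f" {math.log2(previous[1] / e1):.2f}")
    print(line)
    previous = (e0, e1)
```

Halving $h$ should reduce the two errors by factors close to 4 and 2 respectively, in agreement with the exponent $k + 1 - m$.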

Finally, we come to the error estimate for second-order problems.

THEOREM 6. Consider the VBVP of finding $u \in V$ such that

$a(u, v) = \langle \ell, v \rangle$ for all $v \in V \subset H^1(\Omega),$ (12.29)

where $a(\cdot, \cdot)$ is continuous and $V$-elliptic and $\langle \ell, \cdot \rangle$ is continuous on $V$. If $u_h$
is the finite element approximation of the solution in $V^h$, then there exists
a constant $C$ independent of $h$ such that

$\|u - u_h\|_{1,\Omega} \le C h^k |u|_{k+1,\Omega}.$

PROOF. From Theorem 2 of Chapter 10, with $v_h = \Pi_h u$ and (12.28) with
$m = 1$ we obtain

$\|u - u_h\|_{1,\Omega} \le (M/\alpha) \|u - \Pi_h u\|_{1,\Omega} \le C h^k |u|_{k+1,\Omega}$

with $C = cM/\alpha$. □

It may happen in practice that the solution $u$ is not smooth enough to
belong to $H^{k+1}(\Omega)$. For example, if we know from the theory of elliptic
BVPs that $u$ is in $H^2(\Omega)$, then the use of quadratic six-noded triangles for
a problem in $\mathbb{R}^2$ means that $k = 2$ or $k + 1 = 3$, and the seminorm $|v|_{3,\Omega}$
in (12.28) does not necessarily make sense. We overcome this problem by
going back to Section 12.2, and by noting that the entire theory developed
there still holds if we replace $k + 1$ by $r$, and hence also $k$ by $r - 1$, where
$r \le k + 1$ is any positive integer. Specifically, we do this in Theorems 2
and 4, and in the Corollary to Theorem 4. Of course, $r$ must be such that
$H^r(\hat\Omega) \subset C(\hat\Omega)$ (that is, $r > n/2$ and $r \ge m$). The estimate (12.24) then
reads, for $v \in H^r(\Omega_e)$,

$\|v - \Pi_e v\|_{m,\Omega_e} \le C h_e^\mu |v|_{r,\Omega_e},$

where $\mu = k + 1 - m$ if $r \ge k + 1$ (since in this case $v \in H^{k+1}(\Omega_e)$ also)
and $\mu = r - m$ if $r < k + 1$. Coming to the global estimate (12.28), we may
alter this accordingly so that, for $v \in H^r(\Omega)$,

(12.30)

where $\alpha = \min(k, r - 1)$.


We make one more improvement to the error estimate (12.30). As it
stands, it involves the unknown quantity $|u|_{r,\Omega}$ on the right-hand side. This
dependence on $u$ is easily removed, however, if we know that the solution
depends continuously on the data. The theory of Chapter 8 leads to the
result that if the original PDE is of the form $Au = f$ with $f \in H^s(\Omega)$ and
with $\Omega$ having a smooth boundary, then the solution $u$ lies in $H^{s+2}(\Omega)$ and

$\|u\|_{s+2,\Omega} \le C_1 \|f\|_{s,\Omega}$ (12.31)

for some constant $C_1 > 0$. The finite element theory developed here is
applicable only to polygonal domains (in $\mathbb{R}^2$), but if it is known that the
estimate holds even for such a case, then we may set $r = s + 2$, and since

the dependence on $|u|_r$ in (12.30) may be removed.

COROLLARY TO THEOREM 6. Let the conditions for Theorem 6 hold, and
let the data $f$ be given in $H^s(\Omega)$, $s \ge 0$. Furthermore, assume that (12.31)
holds. Then a constant $C$ exists such that, as $h \to 0$,

$\|u - u_h\|_{1,\Omega} \le C h^\beta \|f\|_{s,\Omega},$ (12.32)

where $\beta = \min(k, s + 1)$.

According to the theorem and its corollary, since the order of convergence
$\beta$ is governed by the smaller of $k$ and $s + 1$, when $s \le k - 1$ then convergence
is governed by the smoothness of $f$. For example, if $f$ is only in
$L^2(\Omega) = H^0(\Omega)$, then it suffices to use elements that contain only polynomials of
degree $\le 1$ (such as two-noded elements in $\mathbb{R}$, three-noded triangles, and
four-noded rectangles in $\mathbb{R}^2$).
For problems posed on domains in $\mathbb{R}$ the issue of the smoothness of the
boundary does not arise, and so the estimate (12.32) holds in such cases.

Example

4. Consider the problem

$-\nabla^2 u = f$ in $\Omega \subset \mathbb{R}^n$,
$u = 0$ on $\Gamma$.

The corresponding VBVP is: find $u \in H_0^1(\Omega)$ such that

$\int_\Omega \nabla u \cdot \nabla v \, dx = \int_\Omega f v \, dx$ for all $v \in H_0^1(\Omega)$,

and this problem has a unique solution. Similarly, the VBVP
corresponding to the approximate solution is: find $u_h \in V^h$ such that

$\int_\Omega \nabla u_h \cdot \nabla v_h \, dx = \int_\Omega f v_h \, dx$ for all $v_h \in V^h$,

and this problem also has a unique solution. Here $V^h$ consists of
those piecewise polynomial functions in $X^h$ that satisfy the boundary
condition, so that $V^h \subset H_0^1(\Omega)$. If $f \in H^s(\Omega)$, then the error is
estimated by

$\|u - u_h\|_{1,\Omega} \le C h^\beta \|f\|_{s,\Omega},$

where $\beta = \min(k, s + 1)$. Thus if linear ($k = 1$) elements are used,
the error is of order $h$ since $s + 1$ will not be less than 1.
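A one-dimensional version of this example makes the predicted first-order convergence visible. The sketch below is an assumption-laden illustration rather than the book's own computation: it solves $-u'' = f$ on (0, 1) with $u(0) = u(1) = 0$ using linear elements on a uniform mesh, takes $f(x) = \pi^2 \sin \pi x$ so that the exact solution is $u(x) = \sin \pi x$, evaluates the load vector with a three-point Gauss rule, and solves the tridiagonal system by the Thomas algorithm. The function names are hypothetical.

```python
import math

GAUSS3 = ((-math.sqrt(0.6), 5.0 / 9.0), (0.0, 8.0 / 9.0), (math.sqrt(0.6), 5.0 / 9.0))


def solve_poisson_1d(f, n_elem):
    """Nodal values of the linear finite element solution of -u'' = f, u(0) = u(1) = 0."""
    h = 1.0 / n_elem
    n = n_elem - 1                               # number of interior nodes
    load = [0.0] * (n_elem + 1)
    for e in range(n_elem):
        for xi, w in GAUSS3:
            x = e * h + 0.5 * h * (1.0 + xi)
            weighted_f = 0.5 * h * w * f(x)
            load[e] += weighted_f * 0.5 * (1.0 - xi)      # left hat function
            load[e + 1] += weighted_f * 0.5 * (1.0 + xi)  # right hat function
    # Thomas algorithm for the stiffness matrix (1/h) * tridiag(-1, 2, -1).
    diag = [2.0 / h] * n
    rhs = load[1:n_elem]
    off = -1.0 / h
    for i in range(1, n):
        m = off / diag[i - 1]
        diag[i] -= m * off
        rhs[i] -= m * rhs[i - 1]
    sol = [0.0] * n
    sol[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        sol[i] = (rhs[i] - off * sol[i + 1]) / diag[i]
    return [0.0] + sol + [0.0]


def h1_error(du, nodal, n_elem):
    """H1 seminorm of u - u_h, with u_h piecewise linear through the nodal values."""
    h = 1.0 / n_elem
    total = 0.0
    for e in range(n_elem):
        slope = (nodal[e + 1] - nodal[e]) / h
        for xi, w in GAUSS3:
            x = e * h + 0.5 * h * (1.0 + xi)
            total += 0.5 * h * w * (du(x) - slope) ** 2
    return math.sqrt(total)


f = lambda x: math.pi ** 2 * math.sin(math.pi * x)
du = lambda x: math.pi * math.cos(math.pi * x)

for n_elem in (4, 8, 16, 32):
    err = h1_error(du, solve_poisson_1d(f, n_elem), n_elem)
    print(f"h = 1/{n_elem:<3d} |u - u_h|_1 = {err:.4e}")
```

Each halving of $h$ should roughly halve the error, consistent with $\beta = \min(k, s + 1) = 1$ for linear elements.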
FIGURE 12.6. The triangle generated by a quadratic isoparametric map

12.4 Isoparametric families and numerical integration
The theory that culminates in the error estimate (12.28) is based entirely
on the assumption that finite element meshes are generated by affine maps
from a reference element. The theory therefore does not take into account
deviations in the form of isoparametric maps, nor indeed does it account
for errors induced by numerical integration. In this section we give some
indication of how these deviations are accommodated in the error estimates.

Isoparametric maps. There are various complications that arise when
dealing with this more general family of elements: in particular, the
Jacobian matrix $J$ defined in (11.33) is no longer constant. The theory
appropriate to isoparametric maps is outlined for the special case of the six-noded
triangle, shown in Figure 12.6. This element is of course obtained by the
map

$x = F_e(\xi) = \sum_{I=1}^6 x_I \hat N_I(\xi),$ (12.33)

in which the functions $\hat N_I$ are quadratic. Also shown in the figure is the
element $\tilde\Omega_e$ generated by the affine map $\tilde F_e$ from the reference triangle. The
definitions (12.3) and (12.4) of the quantities $h_e$ and $\rho_e$ are retained, but
these refer to the affine element $\tilde\Omega_e$, as shown in Figure 12.6. Then under
these conditions a family of isoparametric elements is said to be regular if

1. there exists a constant $\sigma$ such that

$\frac{h_e}{\rho_e} \le \sigma$ for $e = 1, \ldots, E$;

2. the quantities $h_e$ approach zero;

3. if $x_{IJ}$ and $\tilde x_{IJ}$ are, respectively, the coordinates of the midpoint nodes
of $\Omega_e$ and $\tilde\Omega_e$, then

$\|x_{IJ} - \tilde x_{IJ}\| = O(h_e^2)$ for $1 \le I < J \le 3$. (12.34)

Thus a comparison with the definition given in Section 12.2 shows that a
family of regular isoparametric elements has to satisfy the criteria that are
set for affine families, but in addition $\Omega_e$ is required to be not very different
from $\tilde\Omega_e$, in the sense of (12.34). Under these conditions it is possible to
prove the following analogue of Theorem 4 and its corollary.

THEOREM 7. For any regular family of isoparametric elements generated
by the map (12.33) corresponding to the six-noded triangle, and for any
function $v \in H^3(\Omega_e)$, there exists a constant $C$ such that

(12.35)

for integers $m \le 3$.

Thus the estimate (12.35) differs from (12.24) (with $k = 3$ there) only in
that the term $|v|_{2,\Omega_e}$ also appears on the right-hand side.
One of the reasons for using isoparametric families is that these permit
the construction of domains with curved boundaries. It is often the case,
though, that the actual curved boundary $\Gamma$ of the domain $\Omega$ cannot be
represented exactly using isoparametric elements. When attempting to
arrive at an error estimate of the kind (12.28) for second-order problems,
therefore, the theory must take account of the fact that the domain $\Omega_h$
which is defined by the finite element mesh may be distinct from $\Omega$. Such
a situation is of course also true in the case of affine families, which would
at best represent a polygonal approximation to a domain with a curved
boundary.

Let $\Omega_h$ be the domain represented by a regular family of isoparametric
elements, and let $\Omega$ be the actual domain (Figure 12.7). The space $V^h$ in
which approximate solutions are sought is now defined as a subspace of
$H^1(\Omega_h)$ (for second-order problems). Then for a second-order problem and
a regular isoparametric mesh comprising six-noded triangles, Theorem 7
may be used to derive the counterpart to (12.28); that is,

(12.36)

note that the norms are defined here on the domain $\Omega_h$. In going from
(12.35) to (12.36) we also make use of the elementary fact that
$|v|_{2,\Omega_e} + |v|_{3,\Omega_e} \le c \|v\|_{3,\Omega_e}$.

FIGURE 12.7. The domain $\Omega$ and its approximation $\Omega_h$
Numerical integration. We consider now the modifications that have
to be made to the standard theory, in the event that numerieal integration
proeedures ofthe kind discussed in Section 11.6 are used. Take, for example,
the problem of finding U E V = HJ(n) that satisfies
a(u,v) = (i,v) for all v E V, (12.37)
where

$a(u, v) = \int_\Omega k\nabla u \cdot \nabla v\,dx$,

k being a matrix of functions; thus the integrand reads, when expanded,

The matrix k is assumed to be symmetric and the coefficients $k_{ij}$ are such
that the bilinear form is continuous and V-elliptic (and hence also
$V^h$-elliptic): in particular we assume that $k_{ij} \in C(\bar\Omega)$, and that a
constant $k_0 > 0$ exists such that
(12.38)
for any vector a. The linear functional is assumed to be given by

$\langle \ell, v\rangle = \int_\Omega f v\,dx$.

The discrete problem entails finding $u_h \in V^h$ such that

$a(u_h, v_h) = \langle \ell, v_h\rangle$ for all $v_h \in V^h$. (12.39)

Now if numerical integration is used to evaluate the integrals, the discrete
problem that is solved is not in fact (12.39), but rather the problem

$a_h(u_h,v_h) = \langle \ell_h, v_h\rangle$ for all $v_h \in V^h$,

in which the bilinear form $a_h(u_h,v_h)$ and linear functional $\langle \ell_h, v_h\rangle$
are obtained by integrating numerically over each element and summing over
all elements. For an integration rule of order r, therefore,

$a_h(u_h,v_h) = \sum_{e}\sum_{l=1}^{r} w_l\,k(\xi_l)\nabla u_h(\xi_l)\cdot\nabla v_h(\xi_l)$,
(12.40)
$\langle \ell_h, v_h\rangle = \sum_{e}\sum_{l=1}^{r} w_l\,f(\xi_l)\,v_h(\xi_l)$.
Since $a \ne a_h$ and $\ell \ne \ell_h$, the theory leading to Theorem 6 needs to be
modified in order to arrive at an error estimate. In particular, Céa's Lemma
(Theorem 10.2) does not hold any longer, and must be replaced by a suitable
extension; this is provided by the following result.

THEOREM 8 (STRANG'S LEMMA). Suppose that the bilinear form $a_h(\cdot,\cdot)$
is uniformly $V^h$-elliptic, in the sense that a constant $\alpha$, independent of h,
exists such that

Then there exists a constant C independent of h such that

The proof of this theorem is discussed in Exercise 12.10. We see that it
reduces to Céa's Lemma in the event that integration is exact, since in
that case $a_h = a$ and $\ell_h = \ell$.
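For orientation, the estimate usually called Strang's first lemma is commonly stated in the form below (see, e.g., Ciarlet [11]). This is the standard textbook statement, written with $\ell$ and $\ell_h$ for the exact and numerically integrated linear functionals and with the norm of V assumed throughout; the arrangement of terms in (12.41) may differ in detail.

```latex
\[
\|u - u_h\|_V \;\le\; C \left(
  \inf_{v_h \in V^h} \left\{ \|u - v_h\|_V
    + \sup_{w_h \in V^h} \frac{|a(v_h, w_h) - a_h(v_h, w_h)|}{\|w_h\|_V} \right\}
  + \sup_{w_h \in V^h} \frac{|\langle \ell, w_h\rangle - \langle \ell_h, w_h\rangle|}{\|w_h\|_V}
\right)
\]
```

When $a_h = a$ and $\ell_h = \ell$ both suprema vanish and only the best-approximation term remains, which is the content of Céa's Lemma.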
There are thus two additional tasks that need to be carried out in order
to arrive at an error estimate: the two new terms on the right-hand side of
(12.41) have to be estimated, and it is necessary also to establish conditions
under which the approximate bilinear form $a_h$ is $V^h$-elliptic. The former is
usually achieved by deriving consistency error estimates of the form

(12.42)

in which $I_h$ denotes the interpolation operator defined in (12.26) and $C_1$
and $C_2$ are constants that depend, respectively, on k and u, and on f.
These estimates would then permit the necessary extension of Theorem
5. The theory leading to the desired estimates is rather complex, and the
details are omitted.
We examine the issue of $V^h$-ellipticity, for the special case of an affine
family generated by the three-noded reference triangle, and with the use
of one-point integration on this triangle; recall that such a rule is exact for
polynomials of degree one.

THEOREM 9. Suppose that integration on the triangle $\Omega_e$ is carried out
using the rule

$\int_{\Omega_e} f(x)\,dx\,dy \approx I_e(f) \equiv A_e f(\bar{x})$,

where $A_e$ is the area of $\Omega_e$ and $\bar{x}$ the location of its centroid. If
$X_e = P_1(\Omega_e)$, where $X_e$ is the space spanned by the local basis functions,
then there exists a constant $\alpha_0$, independent of h, such that
$a_h(v_h, v_h) \ge \alpha_0\|v_h\|_V^2$.

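PROOF. For $v_h|_{\Omega_e} \in P_1(\Omega_e)$, the vector $\nabla v_h$ is constant on $\Omega_e$;
therefore, using (12.38),

$\int_{\Omega_e} k\nabla v_h \cdot \nabla v_h\,dx\,dy \approx I_e(k\nabla v_h \cdot \nabla v_h)$
$= A_e\,k(\bar{x})\nabla v_h(\bar{x}) \cdot \nabla v_h(\bar{x})$
$\ge A_e k_0 |\nabla v_h|^2(\bar{x})$
$= k_0 |v_h|_{1,\Omega_e}^2$.
The desired result then follows from summing over all elements, and then
using the Poincaré-Friedrichs inequality (7.34). □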
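The one-point rule used in Theorem 9 can be tried out numerically. The following is a minimal sketch (the function names are illustrative, not taken from the text) showing that the centroid rule reproduces the integral of a degree-one polynomial exactly, while a quadratic integrand is only approximated.

```python
from typing import Callable, Tuple

Point = Tuple[float, float]


def triangle_area(p1: Point, p2: Point, p3: Point) -> float:
    """Area of the triangle with vertices p1, p2, p3 (half the cross product)."""
    return 0.5 * abs((p2[0] - p1[0]) * (p3[1] - p1[1])
                     - (p3[0] - p1[0]) * (p2[1] - p1[1]))


def centroid_rule(f: Callable[[float, float], float],
                  p1: Point, p2: Point, p3: Point) -> float:
    """One-point rule: area of the triangle times f evaluated at the centroid."""
    xc = (p1[0] + p2[0] + p3[0]) / 3.0
    yc = (p1[1] + p2[1] + p3[1]) / 3.0
    return triangle_area(p1, p2, p3) * f(xc, yc)


# Unit right triangle with vertices (0,0), (1,0), (0,1).
tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))

# Degree one: the exact integral of 1 + 2x + 3y is 1/2 + 2/6 + 3/6 = 4/3.
linear = centroid_rule(lambda x, y: 1.0 + 2.0 * x + 3.0 * y, *tri)

# Degree two: the exact integral of x^2 is 1/12; the rule gives (1/2)(1/9) = 1/18.
quadratic = centroid_rule(lambda x, y: x * x, *tri)

print(linear, quadratic)
```

Exactness for linear integrands is what makes the integrand in the proof, which is $k$ times a constant, easy to bound from below at the single point $\bar{x}$.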
Theorem 8, together with the consistency error estimates, gives the
following result.

THEOREM 10. Assume that the conditions of Theorem 8 hold. Then if the
solution $u \in H_0^1(\Omega)$ of the problem (12.37) belongs to $H^2(\Omega)$, and if the
data satisfy (12.37) and
$k_{ji} = k_{ij}$, $k_{ij} \in C(\bar\Omega)$, $f \in H^2(\Omega)$,
then there exists a constant C dependent on u, k, and f but independent
of h,

12.5 Bibliographical remarks


This chapter draws heavily on the work by Ciarlet [11], which may be
consulted for further details of the topics presented here, and for extensions
of the theory. The texts by Brenner and Scott [8], Oden and Reddy [38], Oden
and Carey [37], Raviart and Thomas [39], and Strang and Fix [51] are also
very useful sources for the mathematical theory of finite elements, as is the
Finite Element Handbook [24]. Johnson [23] provides a useful expository
account of finite element methods for convection-diffusion problems, and
for hyperbolic problems generally.
The interpolation theory for isoparametric elements is discussed by
Ciarlet [11], mainly in the context of the six-noded triangle. The original, and
more general, treatment is by Ciarlet and Raviart [12]. Likewise, the
developments leading to error estimates when numerical integration is used are
treated in detail in [11] and in [51].

12.6 Exercises
Affine families of elements
12.1. Complete the proof of Lemma 1 by showing that $\|T_e^{-1}\| \le \hat{h}/\rho_e$.
Local interpolation error estimates
12.2. Show that $I : H^{k+1}(\Omega) \to H^m(\Omega)$ and $\hat\Pi : H^{k+1}(\Omega) \to H^m(\Omega)$
are bounded operators, where $\hat\Pi$ is the projection operator defined in
Section 12.1. [Theorem 2 of Chapter 7 is useful when dealing with $\hat\Pi$.]

12.3. Consider a regular family of triangular finite elements in $\mathbb{R}^2$, that is,
one for which

for some $\sigma > 0$. Show that this condition is satisfied if the smallest
angle $\theta_e$ in an element is bounded below by some constant; that is,

for some $\theta_0 > 0$.
This is known as Zlámal's condition; it ensures that elements are not
too severely distorted.
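The quantities in Exercise 12.3 are easy to compute for a given triangle. The sketch below is only an illustration under stated assumptions: $h_e$ is taken to be the longest side and $\rho_e$ the diameter of the inscribed circle (the convention used by Ciarlet [11]; the chapter's own definition should be checked), and the function name is hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def triangle_measures(p1: Point, p2: Point, p3: Point) -> Tuple[float, float, float]:
    """Return (h, rho, theta_min) for the triangle p1 p2 p3.

    h         : longest side (assumed definition of the element diameter)
    rho       : diameter of the inscribed circle (assumed definition)
    theta_min : smallest interior angle, in radians
    """
    a = math.dist(p2, p3)  # side opposite p1
    b = math.dist(p1, p3)  # side opposite p2
    c = math.dist(p1, p2)  # side opposite p3
    s = 0.5 * (a + b + c)  # semi-perimeter
    area = 0.5 * abs((p2[0] - p1[0]) * (p3[1] - p1[1])
                     - (p3[0] - p1[0]) * (p2[1] - p1[1]))
    h = max(a, b, c)
    rho = 2.0 * area / s  # inradius is area / s, so the diameter is 2 * area / s

    def angle(opposite: float, s1: float, s2: float) -> float:
        # Law of cosines, clamped to guard against rounding outside [-1, 1].
        cosine = (s1 * s1 + s2 * s2 - opposite * opposite) / (2.0 * s1 * s2)
        return math.acos(max(-1.0, min(1.0, cosine)))

    theta_min = min(angle(a, b, c), angle(b, a, c), angle(c, a, b))
    return h, rho, theta_min


# A well-shaped element and a sliver with a very small angle.
good = triangle_measures((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
sliver = triangle_measures((0.0, 0.0), (1.0, 0.0), (0.5, 0.01))

for name, (h, rho, theta) in (("good", good), ("sliver", sliver)):
    print(name, "rho/h =", rho / h, "min angle (deg) =", math.degrees(theta))
```

For the sliver both the ratio $\rho_e/h_e$ and the smallest angle are close to zero, which is the kind of distortion that a lower bound on the smallest angle rules out.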

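12.4. Complete the following table.

[Column headings: three element diagrams, not recoverable from the source]

Largest k for which $P_k(\Omega_e) \subset X_e$     k = 1             ?    ?

$\|u - u_h\|_{m,\Omega_e}$, $0 \le m \le 2$            $O(h_e^{2-m})$     ?    ?

Regularity                                  $H^2(\Omega_e)$        ?    ?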
12.5. Derive the estimate

and explain where in the derivation the condition $h_e \to 0$ is required.
Also show that the constant C is proportional to $\sigma^a$ in Exercise 12.4,
for some positive number a, and explain how this affects the error
estimate.
12.6. The purpose of this exercise is to derive the relation (12.18) for
functions defined on domains in $\mathbb{R}^2$. We start by defining the Fréchet
derivative $\mathcal{D}v$ of a function v to be the linear map

$\mathcal{D}v : \mathbb{R}^2 \to \mathbb{R}$, $\mathcal{D}v(a) = \sum_{i=1}^{2} \frac{\partial v}{\partial x_i} a_i$.

The second Fréchet derivative is defined to be the bilinear map

$\mathcal{D}^2 v(a, b) = \sum_{i,j=1}^{2} \frac{\partial^2 v}{\partial x_i \partial x_j} a_i b_j$

and higher derivatives are defined similarly. Generally $\mathcal{D}^k v$ is an
operator from $\mathbb{R}^2 \times \cdots \times \mathbb{R}^2$ (k times) to $\mathbb{R}$, and it is linear in each slot.
The space of all kth Fréchet derivatives is a normed space with norm
$\|\mathcal{D}^k v\| = \sup |\mathcal{D}^k v(a^{(1)}, \ldots, a^{(k)})|$, $\|a^{(i)}\| \le 1$. (i)
Clearly then, we have
$|D^\alpha v(x)| \le \|\mathcal{D}^k v\|$ for $|\alpha| = k$; (ii)
for example, if $\alpha = (1,1)$, then

$|D^\alpha v(x)| = |\partial^2 v/\partial x\,\partial y| = |\mathcal{D}^2 v(e_1, e_2)|$
$\le \sup\{|\mathcal{D}^2 v(a, b)| : \|a\| \le 1, \|b\| \le 1\}$.

To derive (12.18), show that

$\mathcal{D}^k \hat{v}(a^{(1)}, \ldots, a^{(k)}) = \mathcal{D}^k v(T_e a^{(1)}, \ldots, T_e a^{(k)})$

and then use (i) and (ii). [Carry out the calculation first for k = 1
and k = 2; then proceed to the general case.]
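As a hedged illustration of the hint, the case k = 1 can be written out as follows. It is assumed here (the definitions are not repeated in this exercise) that the element map is affine, $x = T_e\hat{x} + b_e$, and that $\hat{v}(\hat{x}) = v(x)$; $\mathcal{D}$ denotes the Fréchet derivative of the exercise.

```latex
\[
\frac{\partial \hat v}{\partial \hat x_j}
  = \sum_{i=1}^{2} \frac{\partial v}{\partial x_i}\,
    \frac{\partial x_i}{\partial \hat x_j}
  = \sum_{i=1}^{2} \frac{\partial v}{\partial x_i}\,(T_e)_{ij},
\]
\[
\mathcal{D}\hat v(a)
  = \sum_{j=1}^{2} \frac{\partial \hat v}{\partial \hat x_j}\, a_j
  = \sum_{i=1}^{2} \frac{\partial v}{\partial x_i}
    \sum_{j=1}^{2} (T_e)_{ij}\, a_j
  = \mathcal{D}v(T_e a),
\]
\[
|\mathcal{D}\hat v(a)| \le \|\mathcal{D}v\|\,\|T_e a\|
  \le \|T_e\|\,\|\mathcal{D}v\| \quad \text{for } \|a\| \le 1 .
\]
```

The case k = 2 repeats the chain rule once more and produces one factor of $\|T_e\|$ per slot, which suggests the power $\|T_e\|^k$ in the general case.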
Error estimates for second-order problems
12.7. Show that if $\hat{X} \subset H^1(\hat\Omega)$ and $X^h \subset C(\bar\Omega)$, then $X^h \subset H^1(\Omega)$.

12.8. The theory of Sections 12.1 and 12.2 does not enable us to obtain
optimal error estimates in the $L^2$-norm for second-order problems,
mainly because of the central role played by the inequality
$\|u - u_h\|_V \le C\|u - v_h\|_V$. It is possible, however, to obtain $L^2$
estimates using what is known as the Aubin-Nitsche method. The method
is outlined in this exercise, for the problem (12.29).
Consider the auxiliary VBVP of finding $w \in V \subset H^1(\Omega)$ such that

$a(w, v) = (u - u_h, v)_{L^2}$ for all $v \in V$,

and let $w_h$ be the interpolate of w in $V^h$. Show that

and use the continuity of a and the results of Sections 12.1 and 12.2
to obtain the estimate

where $\beta = \min(k, r - 1)$ and $\gamma = \min(k, p - 1)$. Finally, use Theorem
1 of Chapter 8 (remember that $Aw = u - u_h$) to obtain the estimate

where 11 = min(2k, k + 1, r).



12.9. Use the result of Exercise 12.7 to obtain $L^2$-estimates of the error for
the problem

$\nabla^2 u = f$ in $\Omega \subset \mathbb{R}^2$,
$u = 0$ on $\Gamma$,
assuming that $f \in L^2(\Omega)$, and using linear or bilinear elements.
12.10. Suppose that we have to solve a fourth-order BVP defined on $\Omega =
(0,1)$, and assume that we intend using the Hermite basis functions
described in Section 11.4. Verify that the theory developed in Sections
12.1 to 12.3 remains essentially unchanged except that, for example,
it is necessary to specify that $H^{k+1}(\Omega) \subset C^1(\bar\Omega)$ in Theorem 4.
Derive an estimate of the error in finite element approximations of the
problem

$\frac{d^4 u}{dx^4} + ku = f$ in (0,1),
$u(0) = u(1) = 0$,
$u'(0) = u'(1) = 0$,

assuming that $f \in L^2(0,1)$, and using the cubic Hermite functions in
Example 4 of Chapter 11.

Isoparametric elements and numerical integration

12.11. The purpose of this exercise is to derive the estimate (12.41) in
Theorem 8. Use the $V^h$-ellipticity of $a_h$ to obtain the inequality

$\alpha\|u_h - v_h\|_V^2 \le a(u - v_h, u_h - v_h) + a(v_h, u_h - v_h)$
$\quad - a_h(v_h, u_h - v_h) + \langle \ell_h, u_h - v_h\rangle - \langle \ell, u_h - v_h\rangle$;

then use the continuity of a and the triangle inequality to derive
(12.41).

12.12. Show that the bilinear form $a_h(\cdot,\cdot)$ defined by (12.40) is $V^h$-elliptic,
if $\Omega_e$ is the six-noded quadratic element and the integration rule used
is the three-point rule on the triangle.
References

[1] Adams R.A., Sobolev Spaces. Academic Press (New York) 1975.
[2] Apostol T.M., Mathematical Analysis: A Modern Approach to Ad-
vanced Calculus. Addison-Wesley (Reading, Mass.) 1957.
[3] Babuška I. and Aziz A.K., Survey Lectures on the Mathematical
Foundations of the Finite Element Method, in The Mathematical
Foundations of the Finite Element Method with Applications to Partial
Differential Equations (ed. A.K. Aziz). Academic Press (New York) 1972.
[4] Baiocchi C. and Capelo A., Variational and Quasi-Variational
Inequalities. Wiley (New York) 1984.
[5] Becker E.B., Carey G.F. and Oden J.T., Finite Elements, Volume 1:
An Introduction. Prentice-Hall (Englewood Cliffs, N.J.) 1981.
[6] Binmore K.G., Mathematical Analysis: A Straightforward Approach.
Cambridge University Press (Cambridge) 1977.
[7] Binmore K.G., The Foundations of Analysis: A Straightforward
Introduction. Book 2: Topological Ideas. Cambridge University Press
(Cambridge) 1981.
[8] Brenner S. and Scott L.R., The Mathematical Theory of Finite Ele-
ment Methods. Springer-Verlag (New York) 1994.
[9] Burnett D.S., Finite Element Analysis. Addison-Wesley (Reading,
Mass.) 1987.

[10] Carey G.F. and Oden J.T., Finite Elements, Vol. 2: A Second Course.
Prentice-Hall (Englewood Cliffs, N.J.) 1983.

[11] Ciarlet P.G., The Finite Element Method for Elliptic Problems.
North-Holland (Amsterdam) 1978.

[12] Ciarlet P.G. and Raviart P.-A., Interpolation theory over curved
elements with applications to finite element methods. Computer Methods
in Applied Mechanics and Engineering 1 (1972) 217-249.

[13] Dautray R. and Lions J.-L., Mathematical Analysis and Numerical
Methods for Science and Technology, Vol. 2: Functional and
Variational Methods. Springer-Verlag (Berlin) 1988.

[14] Dhatt G. and Touzot G., The Finite Element Method Displayed. Wiley
(New York) 1984.

[15] Duvaut G. and Lions J.L., Inequalities in Mechanics and Physics.
Springer-Verlag (Berlin) 1976.

[16] Glowinski R., Numerical Methods for Nonlinear Variational Problems.
Springer-Verlag (Berlin) 1984.

[17] Grisvard P., Elliptic Problems in Nonsmooth Domains. Pitman
(London) 1985.

[18] Halmos P., Finite Dimensional Vector Spaces. Van Nostrand Reinhold
(New York) 1958.

[19] Hewitt E. and Stromberg K.R., Real and Abstract Analysis: A Modern
Treatment of the Theory of Functions of a Real Variable.
Springer-Verlag (New York) 1965.

[20] Hoffman K. and Kunze R., Linear Algebra. Addison-Wesley (Reading,
Mass.) 1973.

[21] Horgan C.O., Korn's inequalities and their applications in continuum
mechanics, SIAM Review 37 (1995) 491-511.

[22] Hughes T.J.R., The Finite Element Method: Linear Static and
Dynamic Analysis. Prentice-Hall (Englewood Cliffs, N.J.) 1987.

[23] Johnson C., Numerical Solutions of Partial Differential Equations by
the Finite Element Method. Cambridge University Press (Cambridge)
1987.

[24] Kardestuncer H. (ed.), Finite Element Handbook. McGraw-Hill (New
York) 1987.

[25] Kolmogorov A.N. and Fomin S.V., Elements of the Theory of
Functions and Functional Analysis. Volume 1: Metric and Normed Spaces.
Graylock Press (Rochester, N.Y.) 1957.

[26] Kolmogorov A.N. and Fomin S.V., Elements of the Theory of
Functions and Functional Analysis. Volume 2: Measure, Lebesgue Integrals
and Hilbert Space. Academic Press (New York) 1961.

[27] Kreyszig E., Introductory Functional Analysis with Applications.
Wiley (New York) 1978.

[28] Lang S., Introduction to Linear Algebra. 2nd edition. Springer-Verlag
(New York) 1986.

[29] Lang S., Undergraduate Analysis. Springer-Verlag (New York) 1983.

[30] Lions J.L. and Magenes E., Non-Homogeneous Boundary-Value
Problems and Applications, Volume 1. Springer-Verlag (New York) 1972.

[31] Lipschutz S., Set Theory and Related Topics. Schaum Outline Series.
McGraw-Hill (New York) 1964.

[32] Loula A.F.D., Hughes T.J.R. and Franca L.P., Petrov-Galerkin
formulations of the Timoshenko beam problem. Computer Methods in
Applied Mechanics and Engineering 63 (1987) 115-132.

[33] Naylor A.W. and Sell G.R., Linear Operator Theory in Engineering
and Science. Springer-Verlag (Berlin) 1982.

[34] Nečas J., Les Méthodes Directes en Théorie des Équations Elliptiques.
Masson (Paris) 1967.

[35] Noble B., Applied Linear Algebra. Prentice-Hall (Englewood Cliffs,
N.J.) 1969.

[36] Oden J.T., Applied Functional Analysis: An Introductory Treatment
for Students of Mechanics and Engineering Science. Prentice-Hall
(Englewood Cliffs, N.J.) 1979.

[37] Oden J.T. and Carey G.F., Finite Elements, Volume 4: Mathematical
Aspects. Prentice-Hall (Englewood Cliffs, N.J.) 1982.

[38] Oden J.T. and Reddy J.N., An Introduction to the Mathematical
Theory of Finite Elements. Wiley (New York) 1976.

[39] Raviart P.-A. and Thomas J.M., Introduction à l'Analyse Numérique
des Équations aux Dérivées Partielles. Masson (Paris) 1983.

[40] Reed M. and Simon B., Methods of Modern Mathematical Physics I:
Functional Analysis. Academic Press (New York) 1980.

[41] Rektorys K., Variational Methods in Mathematics, Science and
Engineering. 2nd edition. D. Reidel (Dordrecht) 1980.

[42] Roman P., Some Modern Mathematics for Physicists and Other
Outsiders, Volume 1: Algebra, Topology and Measure Theory. Pergamon
(Oxford) 1975.

[43] Roman P., Some Modern Mathematics for Physicists and Other
Outsiders, Volume 2: Functional Analysis with Applications. Pergamon
(Oxford) 1975.

[44] Royden H.L., Real Analysis. 3rd edition. Collier-Macmillan (London)
1988.

[45] Rudin W., Real and Complex Analysis. 2nd edition. McGraw-Hill (New
York) 1974.

[46] Schwartz L., Théorie des Distributions. Hermann (Paris) 1950.

[47] Schwartz L., Mathematics for the Physical Sciences. Hermann (Paris)
1966.

[48] Showalter R.E., Hilbert Space Methods for Partial Differential
Equations. Pitman (Boston) 1977.

[49] Smirnov V.I., A Course of Higher Mathematics, Volume 5: Integration
and Functional Analysis. Pergamon (Oxford) 1964.

[50] Strang G., Linear Algebra and its Applications. Academic Press (New
York) 1976.

[51] Strang G. and Fix G.J., An Analysis of the Finite Element Method.
Prentice-Hall (Englewood Cliffs, N.J.) 1973.

[52] Zauderer E., Partial Differential Equations of Applied Mathematics.
2nd edition. Wiley (New York) 1989.

[53] Zeidler E., Nonlinear Functional Analysis and Its Applications.
Volume IIA: Linear Monotone Operators. Springer-Verlag (Berlin) 1990.

[54] Zeidler E., Applied Functional Analysis: Applications of Mathematical
Physics. Springer-Verlag (Berlin) 1995.

[55] Zeidler E., Applied Functional Analysis: Main Principles and Their
Applications. Springer-Verlag (Berlin) 1995.

[56] Zienkiewicz O.C. and Taylor R.L., The Finite Element Method.
Volume 1: Basic Formulation and Linear Problems. McGraw-Hill
(London) 1989.

[57] Zienkiewicz O.C. and Taylor R.L., The Finite Element Method.
Volume 2: Solid and Fluid Mechanics, Dynamics and Nonlinearity.
McGraw-Hill (London) 1991.
Solutions to Exercises

Chapter 1

1.1. $A = \{-2, 3\}$, $B = \{-3,-2,-1,0,1,2,3\}$. $A \cup B = B$; $A \cap B = A$;
$A \cap \mathbb{Z}^+ = \{3\}$, $A - \mathbb{Z}^+ = \{-2\}$.

1.2. $A \cup C = \{1, 2, 9\}$ so $B \times (A \cup C) = \{(7,1), (7,2), (7,9), (8,1), (8,2),
(8,9)\}$; $A \cap C = \{1\}$ so $(A \cap C) \times B = \{(1,7), (1,8)\}$.
1.3. Let $x \in A \cap (B \cup C)$. Then $x \in A$ and $x \in B$ or $C$; i.e., $x \in A$ and
$x \in B$, or $x \in A$ and $x \in C$. Hence $x \in (A \cap B) \cup (A \cap C)$. The second
identity is proved in a similar way.

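1.4. $n(A \cup B \cup C) = n(A) + n(B) + n(C) - 2n(A \cap B \cap C) - n(A \cap B - C)
- n(B \cap C - A) - n(C \cap A - B)$. (Elements common to all three sets are
counted three times in $n(A) + n(B) + n(C)$, so they must be subtracted twice.)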

1.5. $P(A) = \{A, \emptyset, \{1\}, \{2\}, \{3\}, \{1,2\}, \{2,3\}, \{1,3\}\}$;
$P(B) = \{B, \emptyset, \{\{1,2\}\}, \{3\}\}$.
1.6. Consider the table
1/1 1/2 1/3 1/4 1/5 .. .
2/1 2/2 2/3 2/4 2/5 .. .
3/1 3/2 3/3 3/4 3/5 .. .
4/1 ...

The rationals can be listed by writing down the numbers in the preceding
table in the order shown, omitting those already listed (e.g.,
omit 2/2 = 1). This then gives a listing of all rationals whose
numerator and denominator add up to 2, then 3, and so on. In this
way all positive rationals are covered. Multiply by -1 to get negative
rationals.
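The listing described in Solution 1.6 can be sketched as a generator. This is an illustration only: the direction in which each diagonal of the table is traversed is an assumption (the arrows of the original figure are not available), and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import islice
from math import gcd
from typing import Iterator


def positive_rationals() -> Iterator[Fraction]:
    """Yield every positive rational exactly once, diagonal by diagonal.

    Diagonal number `total` holds the fractions whose numerator and
    denominator add up to `total`; entries not in lowest terms (such as
    2/2) are skipped because their value was listed on an earlier diagonal.
    """
    total = 2
    while True:
        for numerator in range(1, total):
            denominator = total - numerator
            if gcd(numerator, denominator) == 1:
                yield Fraction(numerator, denominator)
        total += 1


first = list(islice(positive_rationals(), 9))
print(first)
```

Each positive rational p/q in lowest terms appears on diagonal p + q, so it is reached after finitely many steps, which is what countability requires.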

1.7. (i) $[a, b]$; (ii) $\mathbb{R}$; (iii) $[0,1]$.

1.8. (i) Not open: $A = \{\pm 1/n\pi,\ n = 1, \ldots\}$ and for every $x \in A$ there is
a nhd $N(x)$ such that $N(x) - \{x\} \not\subset A$. Not closed: $0 \notin A$ is a point
of accumulation. (ii) Neither open nor closed. (iii) Open, not closed
since $\{\pm 1/n\pi\}$ are points of accumulation, but are not in A.

1.9. Assume I is closed. Let $x \in I'$; since $x \notin I$, the distance d from x to I
is positive. Hence we can set up a neighborhood of radius $\epsilon < d$ about
x that lies entirely in $I'$. Hence $I'$ is open. Conversely, assume $I'$ is
open. We always have $I \subset \bar{I}$, so we want to show that $\bar{I} \subset I$. Let
$x \in \bar{I}$ and assume $x \notin I$. Then x is in $I'$. Since $I'$ is open, there is a
neighborhood N of x with $N \cap I = \emptyset$, which is a contradiction. Thus
$x \in I$ and so $\bar{I} \subset I$.

1.10. Points of accumulation: $A = \{z : x^2 - y^2 = 1\}$. A is open.

1.11. (i) -1,1/2, -1/6, 1/24, ... ; (ii) 1,0,1,0,1 ... ; (iii) -3,6/7,9/13,12/19, ...

1.12. (a) Converges to -3/2; (b) not convergent; (c) converges to 1.

1.13. $|(3n + 2)/(n - 1) - 3| = |5/(n - 1)| < 0.001$. Assume $n > 1$, so that
$5 < 0.001(n - 1) \Rightarrow n > 5001$. Take $N = 5001$.
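The threshold in Solution 1.13 can be checked with exact rational arithmetic. The sketch below (with an illustrative helper name) shows that the bound holds with equality at n = 5001, so the strict inequality first holds at n = 5002.

```python
from fractions import Fraction


def distance_from_limit(n: int) -> Fraction:
    """Exact value of |(3n + 2)/(n - 1) - 3| for an integer n > 1."""
    return abs(Fraction(3 * n + 2, n - 1) - 3)


tolerance = Fraction(1, 1000)
at_5001 = distance_from_limit(5001)  # equals 5/5000
at_5002 = distance_from_limit(5002)  # equals 5/5001

print(at_5001 == tolerance, at_5002 < tolerance)
```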

1.14. Suppose $u_n$ is monotone increasing, with sup = m. For any $\epsilon > 0$
there exists N such that $|u_n - m| < \epsilon$ for $n > N$, so $u_n \to m$. The
same reasoning applies if $u_n$ is monotone decreasing.

1.15. (i) $\max A = 1 = \sup A$, $\min A$ is undefined, $\inf A = 0$. (ii) $\max A$, $\min A$
undefined; $\sup A = 1$, $\inf A = -1$. (iii) $\min A$, $\max A$ do not exist,
$\inf A = -\infty$; $\sup A = c$. (iv) $|z^2 + 1| \le |z^2| + 1 = |z|^2 + 1 \le 2$.
Maximum achieved at $z = \pm 1$. Minimum is achieved at $z = \pm i$.

1.16. $y = \inf A \Rightarrow a \le y \le x$ for all $x \in A$ and lower bounds a. Thus
$-x \le -y \le -a$ so that $-y$ is the least upper bound of $-A$.

1.17. Take $A = (-1,0)$ and $B = (-2,0)$; then $a = b = 0$. But $C = (0,2)$ so
that $\sup C = 2 \ne ab$.

1.18. (i) Let $p = \sup I$; then $x \le p$ for any $x \in I$. Let $J = \{ax : x \in I\}$;
since $a > 0$, $ax \le ap$. Hence J is bounded above by ap. Let the
supremum of J be q (we must prove that q = ap). Since ap is an
upper bound for J and q is the least upper bound, $q \le ap$. But for
any $y \in J$ we have $y \le q \Rightarrow a^{-1}y \le a^{-1}q$. But $I = \{a^{-1}y : y \in J\}$,
hence $a^{-1}q$ is an upper bound for I. Thus $p \le a^{-1}q$ or $ap \le q$. Since
$q \le ap$ also, we have ap = q.

1.19. (i) Closed. (ii) Open. Set of limit points is $\Omega \cup \{x : x^2 + y^2 + z^2 =
a^2, z > 0\} \cup \{x : x^2 + y^2 < a^2, z = 0\}$.

1.20. (a) $\sqrt{2}$; (b) 2a.

1.21. (a) Not an equivalence relation, but a partial ordering; (b) equivalence
relation, not a partial ordering; (c) neither an equivalence relation
nor a partial ordering; (d) not an equivalence relation, but a partial
ordering.

1.22. {(2,2), (3,3), (4,4), (5,5), (6,6), (2,5), (5,2), (3,6), (6,3)}.

1.23. Take $c \in A_a \cap A_b$. Then $c \sim a$ implies that $a \sim c$. Also, $c \sim b$. Thus
$a \sim b$ by transitivity, and $b \sim a$ by symmetry; hence $a \in A_b$ and
$b \in A_a$. Take any $x \in A_a$: $x \sim a$; hence $x \sim b$, so $x \in A_b$. Thus
$A_a \subset A_b$. Similarly show that $A_b \subset A_a$.

1.24. $A_a$ is the set of ordered pairs of integers lying on the "diamond"
$\{z : |x| + |y| = \text{const}\}$ on which a is located.

Chapter 2

2.1. (a) not continuous at $x = \pm 1$; (b) continuous on $(-\infty, 0]$.

2.2. (a) Suppose that $|x - y| < \delta$. Then $|p(x) - p(y)| = |a_1(x - y) + a_2(x^2 -
y^2) + \cdots + a_k(x^k - y^k)| \le |x - y|[|a_1| + |a_2||x + y| + \cdots + |a_k||x^{k-1} +
x^{k-2}y + \cdots + y^{k-1}|] < \delta C$ since the term in square brackets is bounded
above. Set $\delta = \epsilon/C$.
(b) For $0 \le y \le x$ we have $\sqrt{y} \le \sqrt{x} \Rightarrow 2y \le 2\sqrt{xy} \Rightarrow x - 2\sqrt{xy} + y \le
x - y$ or $(\sqrt{x} - \sqrt{y})^2 \le x - y$. If $|x - y| < \delta$, then $|u(x) - u(y)| < \delta^{1/2}$.
For given $\epsilon$ set $\delta = \epsilon^2$.

2.3. (b) $|f(x) - f(y)| = |x^{-1} - y^{-1}| = |y - x|/|xy|$. But $x > a$, $y > a$, so
$xy > a^2$ or $1/xy < 1/a^2$. Hence $|f(x) - f(y)| < a^{-2}|x - y|$.

2.4. $|f(\mathbf{x}) - f(\bar{\mathbf{x}})| = |(x^2 + 2y) - (\bar{x}^2 + 2\bar{y})| = |(x^2 - \bar{x}^2) + 2(y - \bar{y})| \le
|x^2 - \bar{x}^2| + 2|y - \bar{y}|$. Suppose that $|\mathbf{x} - \bar{\mathbf{x}}| < \delta$; i.e., $(x - \bar{x})^2 + (y - \bar{y})^2 < \delta^2$.
Then $|x^2 - \bar{x}^2| = |x - \bar{x}||x + \bar{x}| < \delta \cdot C$. Also, $|y - \bar{y}| < \delta$. Hence
$|f(\mathbf{x}) - f(\bar{\mathbf{x}})| < (C + 2)\delta$. Set $\delta = \epsilon/(C + 2)$.

2.5. Set $f(\cdot) = d(\cdot, A)$. Then $|f(x) - f(y)| = |\inf_{z \in A}|x - z| - \inf_{z \in A}|y - z||
\le ||x - y| + \inf|y - z| - \inf|y - z|| = |x - y|$. Given $\epsilon > 0$,
choose $\delta = \epsilon$.

2.6. $|f(x_0) - f(x)| < \epsilon$ whenever $|x_0 - x| < \delta$, i.e., for $x \in (x_0 - \delta, x_0 + \delta)$.
Pick any such x: either $0 < f(x_0) - f(x) < \epsilon$ in which case $f(x) >
f(x_0) - \epsilon$ or $0 < f(x) - f(x_0) < \epsilon$ in which case $f(x) < f(x_0) + \epsilon$. For
the first case choose $\epsilon$ smaller than $f(x_0)$ so that f(x) is positive. For
the second case $f(x) > f(x_0) > 0$.

2.7. Assume that $f(a) < 0$, $f(b) > 0$. Since $f(a) < 0$, there is an interval
[a, c] in which $f(x) < 0$. Let the l.u.b. of such points c be $\xi$; then
$f(\xi) \le 0$. We cannot have $f(\xi) < 0$ since we would then be able to
find an interval about $\xi$ for which $f(x) < 0$, which would imply that $\xi$
is not a l.u.b. Hence $f(\xi) = 0$. A similar argument applies if $f(a) > 0$
and $f(b) < 0$.

2.8. (a) $u \in C(-1, 1)$; (b) $u \in C^\infty([0, \pi] \times [0,1])$; (c) $u \in C^1[0,1]$.
2.9. $|u(x) - u(y)| = ||x| - |y|| \le |x - y|$ since $|x| = |x - y + y| \le
|x - y| + |y|$.
2.10. Choose $\delta = \epsilon/L$ in the definition of continuity.

2.11. $I = I_Q \cup I'$, where $I_Q$ and $I'$ are the subsets of rationals and
irrationals. $\mu(I') = \mu(I) - \mu(I_Q) = \mu(I)$.

2.12. Let M be an arbitrary measurable set in $\mathbb{R}$. If $1 \in M$, $0 \notin M$, then
$\chi_E^{-1}(M) = E$; $1 \notin M$, $0 \in M \Rightarrow \chi_E^{-1}(M) = E'$; $1 \notin M$, $0 \notin M \Rightarrow
\chi_E^{-1}(M) = \emptyset$; $1 \in M$, $0 \in M \Rightarrow \chi_E^{-1}(M) = \operatorname{dom} \chi_E$. Thus $\chi_E^{-1}(M)$ is
a measurable set. Conversely, if E is not measurable, then $\chi_E$ cannot
be measurable.

2.13. Put $\delta_n = 2^{-n}$. For each n and for every x there is an integer $k_n$
such that $k_n\delta_n \le x < (k_n + 1)\delta_n$. Set $\phi_n(x) = k_n(x)\delta_n$ if $0 \le x <
n$, $\phi_n(x) = n$ for $n \le x$. Then $x - \delta_n < \phi_n(x) \le x$ if $0 \le x \le n$;
furthermore $0 \le \phi_1 \le \phi_2 \le \cdots \le x$ and $\phi_n(x) \to x$ as $n \to \infty$, for
$x \in [0, \infty]$. Set $s_n = \phi_n \circ f$.
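The staircase functions of Solution 2.13 can be sketched in a few lines. One assumption is made explicit here: for $x \ge n$ the value is capped at n, as in the usual form of this construction, and the function name `phi` is illustrative.

```python
import math


def phi(n: int, x: float) -> float:
    """Staircase approximation of the identity on [0, infinity).

    For 0 <= x < n the value is k * 2**-n, where k is the integer with
    k * 2**-n <= x < (k + 1) * 2**-n; for x >= n the value is capped at n
    (an assumed convention, see the lead-in).
    """
    if x < 0:
        raise ValueError("phi is only defined for x >= 0")
    if x >= n:
        return float(n)
    delta = 2.0 ** (-n)
    return math.floor(x / delta) * delta


# The values increase with n and approach x from below.
for n in (1, 2, 3, 6, 10):
    print(n, phi(n, 0.7))
```

Composing with a nonnegative measurable f, as in the last step of the solution, then gives simple functions $s_n = \phi_n \circ f$ that increase to f.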

2.14. First calculate $\int_{\mathbb{R}} s_n(x)\,dx = \sum_{k=0}^{2^n - 1} (k/2^n)(1/2^n) = (1/2^{2n}) \sum_{k=0}^{2^n - 1} k$.
Then use the formula $\sum_{k=1}^{m-1} k = m(m - 1)/2$.

2.15. $f^+(x) = 1$ for $0 \le x \le 1$ and 0 otherwise; $f^-(x) = 1$ for $-1 \le x < 0$
and 0 otherwise.
$\int_{\mathbb{R}} f^+\,dx = \int_{\mathbb{R}} f^-\,dx = 1$, so $\int_{\mathbb{R}} f\,dx = 0$.
$\int_{\mathbb{R}} g^+\,dx = +\infty$, $\int_{\mathbb{R}} g^-\,dx = 1$, so $\int_{\mathbb{R}} g\,dx = +\infty$.
2.16. Use the fact that $|f| = f^+ + f^-$, and that integrability of f implies
that of $f^+$ and $f^-$. For the converse use $f = f^+ - f^-$. Show that
$-\int f^+ - \int f^- \le \int f^+ - \int f^- \le \int f^+ + \int f^-$.
2.17. (a) $ap > -1$; (b) $ap < -1$.

2.18. All real a except a = -~, -~.

2.19. Consider $0 \le \int_\Omega |u(x) - \alpha v(x)|^2\,dx$ for any $\alpha \in \mathbb{R}$. Expand and then
choose $\alpha = \int_\Omega uv\,dx / \int_\Omega |v|^2\,dx$.
Chapter 3
3.1. (a) Vector space; (b) not a vector space; (c) not a vector space; (d)
vector space; (e) not a vector space.
3.2. (a) Subspace; (b) not a subspace: $0 \notin V$.
3.3. (a) Subspace; (b) not a subspace; (c) subspace; (d) subspace.
3.4. Suppose that $U = V \oplus W$, and let $u = v_1 + w_1 = v_2 + w_2$ for
$v_1, v_2 \in V$ and $w_1, w_2 \in W$. Then $v_1 - v_2 = w_2 - w_1$. But $v_1 - v_2 \in V$
and $w_2 - w_1 \in W$, so that $v_1 - v_2 = w_2 - w_1 = 0$, or $v_1 = v_2$, $w_1 = w_2$.
Conversely, suppose that $u = v + w$ for $v \in V$, $w \in W$ with v and
w uniquely defined. If $V \cap W \ne \{0\}$, then there exists $z \in V \cap W$
with $z \ne 0$. Hence we can write $u = (v + z) + (w - z)$ so that the
decomposition of u is not unique, a contradiction.
3.5. For any $u \in C[-1,1]$, $u(x) = v(x) + w(x)$, where $v(x) = \frac{1}{2}(u(x) +
u(-x))$ and $w(x) = \frac{1}{2}(u(x) - u(-x))$. Thus $v \in V$ and $w \in W$. Also,
$V \cap W = \{v : v \text{ is even and odd}\} = \{0\}$.
3.6. $\alpha\beta \le \text{area } A + \text{area } B$, hence $\alpha\beta \le \alpha^p/p + \beta^q/q$ since
$A = \int_0^\alpha x^{p-1}\,dx = \alpha^p/p$, etc. The proof now follows easily from the hints given.
3.7. $(u,w) = (v,w) \Rightarrow (u - v, w) = 0$ for all w. Set $w = u - v$.
3.8. $(u,v)_0 = 0$; $(u,v)_1 = (u,v)_0 + (u',v')_0 \ne 0$.
3.9. $\|u\| = \|u - v + v\| \le \|u - v\| + \|v\|$. Repeat with v.
3.10. $\|u + v\|^2 + \|u - v\|^2 = (u + v, u + v) + (u - v, u - v)$. Expand and
rearrange.

3.11. If $v = \alpha u$, then $\|u + v\| = (u + \alpha u, u + \alpha u)^{1/2} = (1 + \alpha)\|u\|$. But
$\|u\| + \|v\| = (1 + \alpha)\|u\|$. Conversely, assume that $\|u + v\| = \|u\| + \|v\|$.
Then $\|u + v\|^2 = \|u\|^2 + \|v\|^2 + 2(u,v) = (\|u\| + \|v\|)^2 = \|u\|^2 +
\|v\|^2 + 2\|u\|\|v\|$. Hence $\|u\|\|v\| = (u,v)$ or $(\hat{u}, \hat{v}) = 1$, where $\hat{u} =
u/\|u\|$, $\hat{v} = v/\|v\|$. Suppose $\hat{v} \ne \hat{u}$; then $\hat{v} = \hat{u} + w$, and $1 = (\hat{u}, \hat{u} +
w) = 1 + (\hat{u}, w) \Rightarrow (\hat{u}, w) = 0$. Also, $\|\hat{v}\|^2 = 1 = 1 + \|w\|^2 + 2(\hat{u}, w)$;
i.e., $\|w\| = 0 \Rightarrow w = 0$. Hence $\hat{v} = \hat{u}$ or $v = \alpha u$ for some $\alpha$.
3.12. Assume that $\|x - y\| + \|y - z\| = \|x - z\|$. Square and rearrange to get
$(\hat{a}, \hat{b}) = 1$, where $\hat{a} = a/\|a\|$, $a = x - y$, $b = y - z$. Thus $\hat{b} = \hat{a}$ which
gives $y = \alpha x + (1 - \alpha)z$, where $\alpha = \|y - z\|/(\|x - y\| + \|y - z\|)$. The
converse is straightforward.

3.13. Verify that $(\cdot,\cdot)$ defined by $(u, v) = \int u'v'\,dx$ is an inner product on
X.

3.14. No.

3.15. Expand the right-hand side and simplify.

3.16. $\|\alpha u + (1 - \alpha)v\| \le \alpha\|u\| + (1 - \alpha)\|v\| \le 1$.

3.17. $\|u\|^2 + 2\alpha(u,v) + \alpha^2\|v\|^2 = \|u\|^2 - 2\alpha(u,v) + \alpha^2\|v\|^2$. The result
follows from this.

3.18. (i) $(\sqrt{5} - 1)/2$; (ii) 1.

3.20. [Figure with panels p = 1, p = 2, p = $\infty$.]

3.21. $\|x\|_2^2 = x^2 + y^2 = (|x| + |y|)^2 - 2|x||y| \le (|x| + |y|)^2 = \|x\|_1^2$. $\|x\|_1^2 =
x^2 + y^2 + 2|xy| \le 2(x^2 + y^2) = 2\|x\|_2^2$.
3.22. $\int |u^r v^r|\,dx \le [\int |u|^{r(p/r)}]^{r/p} [\int |v|^{r(q/r)}]^{r/q}$. Take rth roots of both
sides.

3.23. Follow the argument of Example 20.


Chapter 4

4.1. 2.

4.2. $|(u_n, v_n) - (u, v)| = |(u_n - u, v_n - v) + (u, v_n - v) + (v, u_n - u)| \le
\|u_n - u\|\|v_n - v\| + \|u\|\|v_n - v\| + \|v\|\|u_n - u\| \to 0$ as $n \to \infty$.
Set $v_n = v$ (i.e. the sequence v, v, ...) to get $(u_n, v) \to (u, v)$. Finally,
$|(u_n, v) - (u, v)| \le |(u_n - u, v)| \le \|u_n - u\|\|v\|$, hence $(u_n, v) \to
(u, v)$. Set $v_n = u_n$ to get the final result.

4.3. $\|u - w\| = \|u - u_n + u_n - w\| \le \|u - u_n\| + \|u_n - w\| < \epsilon + a$. The
inequality follows from the arbitrariness of a.

4.4. (a) (-1,1]; (b) (-00,00).

4.5. (a) $u_n(x) \to 0$ pointwise. But $\|u_n - u\|_{L^2}^2 = \int_{1/n}^{2/n} n^2\,dx = n \to \infty$
as $n \to \infty$; (b) $u_n(x) \to 0$ pointwise since $u_n(x) = n^{3/2}x/\exp(n^2x^2) =
n^{3/2}x/[1 + n^2x^2 + \frac{1}{2}n^4x^4 + \cdots] \to 0$ as $n \to \infty$. But $\|u_n - u\|_{L^2}^2 =
\int_{-n}^{n} y^2 \exp(-2y^2)\,dy$ (setting y = nx) $= -\frac{1}{4}\{[y\exp(-2y^2)]_{-n}^{n}
- \int_{-n}^{n} \exp(-2y^2)\,dy\} \to -\frac{1}{4}(0 - \sqrt{\pi/2})$ as $n \to \infty$.

4.6. $\sup|u_n(x)| = 1/2$ at $x = 1/n$. Thus in [0,1], $u_n(x) \to 0$ pointwise but
$\|u_n - u\|_\infty = 1/2$, so convergence is not uniform. But convergence
is uniform in (a,1] (a > 0): $\sup|u_n(x)| = na/(1 + n^2a^2)$ at $x = a$
for $n > 1/a$ (check this by sketching $u_n(x)$) and $\sup|u_n(x)| \to 0$ as
$n \to \infty$.
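The claims in Solution 4.6 can be checked numerically. The sketch below assumes $u_n(x) = nx/(1 + n^2x^2)$, which is the form implied by the stated supremum $na/(1 + n^2a^2)$ at x = a; the grid-based supremum is an approximation and the helper names are illustrative.

```python
def u(n: int, x: float) -> float:
    """Assumed sequence u_n(x) = n x / (1 + n^2 x^2)."""
    return n * x / (1.0 + (n * x) ** 2)


def sup_on_grid(n: int, a: float, b: float, points: int = 20001) -> float:
    """Approximate sup |u_n| on [a, b] by sampling a uniform grid."""
    step = (b - a) / (points - 1)
    return max(abs(u(n, a + i * step)) for i in range(points))


for n in (1, 10, 100):
    # On [0, 1] the supremum stays at 1/2; on [0.1, 1] it decays with n.
    print(n, sup_on_grid(n, 0.0, 1.0), sup_on_grid(n, 0.1, 1.0))
```

The first column of suprema does not decrease, which is the failure of uniform convergence on [0,1]; the second column tends to zero, in line with uniform convergence away from the origin.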
4.7. $\sup|u_n(x) - u(x)| < \epsilon$ for $n > N$. Hence $\int_a^b |u_n(x) - u(x)|^p\,dx \le
(\sup|u_n(x) - u(x)|)^p \cdot (b - a) < (b - a)\epsilon^p$.

4.8. $\|u\| = 0$ does not imply that u = 0; $|||\cdot|||$ is also not a norm.
4.9. $\|u_n - u_m\|_{L^2}^2 = \frac{n}{n+2} - \frac{2mn}{mn+m+n} + \frac{m}{m+2} = \frac{2(m-n)^2}{(m+2)(n+2)(mn+m+n)}$.
Numerator $(m - n)^2 \le (m + n)^2$. Now show that $\|u_n - u_m\|_{L^2}^2 \to 0$ as
$n, m \to \infty$.

4.10. $\|u_n - u_m\|_{L^1} = \int_0^1 |x^n - x^m|\,dx = \frac{1}{n+1} - \frac{1}{m+1}$ (taking m > n)
$= \frac{m-n}{(n+1)(m+1)} \le \frac{1}{n+1} \to 0$ as $n, m \to \infty$. Hence $\{u_n\}$ is a
Cauchy sequence.

4.11. $\{u_n\}$ is Cauchy, so $\sup|u_n(x) - u_m(x)| < \epsilon$ for $m, n > N$. For any
$x_0$, $|u_n(x_0) - u_m(x_0)| < \epsilon$, so $\{u_n(x_0)\}$ is a Cauchy sequence of
real numbers. $\mathbb{R}$ is complete, so $u_n(x_0) \to u(x_0)$, say, which defines
a function u(x). The rest of the proof follows easily from the hints
given.

4.12. Let $\{x_k\}$ be a Cauchy sequence in $\mathbb{R}^n$: $\|x_k - x_l\| < \epsilon$ for $k, l > N$;
i.e., $\sum_i |x_{ki} - x_{li}|^p < \epsilon^p$. Hence $|x_{ki} - x_{li}|^p < \epsilon^p$ for each i. But $\mathbb{R}$ is
complete so $x_{ki} \to x_i$, say. Hence $x_k \to x$ in $\mathbb{R}^n$.

4.13. Assume $\{u_n\}$ convergent: $\|u_n - u\| < \epsilon$ for $n > N$. Also $\|u_m - u\| < \epsilon'$
for $m > N'$. Hence $\|u_n - u_m\| = \|(u_n - u) - (u_m - u)\| \le \|u_n - u\| +
\|u_m - u\| < \epsilon + \epsilon'$ for $n, m > N$ (assume $N > N'$).

4.14. $\|u_n - u_m\|^2 = \int_{1/2}^{1/2+1/n} [n(x - \frac{1}{2}) - m(x - \frac{1}{2})]^2\,dx + \int_{1/2+1/n}^{1/2+1/m} [1 -
m(x - \frac{1}{2})]^2\,dx$. Show that this $\to 0$ as $m, n \to \infty$, so that $\{u_n\}$ is
Cauchy. Also, $\|u_n - u\|^2 = \int_{1/2}^{1/2+1/n} [n(x - \frac{1}{2}) - 1]^2\,dx \to 0$ as $n \to \infty$.
So $u_n \to u$ in $L^2$.

4.15. Take $v_n \in Y$ with $v_n \to v$. It is required to show that $v \in Y$. From
Exercises 3.9 and 3.22, $|\|v_n\|_{L^1} - \|v\|_{L^1}| \le \|v_n - v\|_{L^1} \le c\|v_n - v\|_{L^p}$.
Thus $|1 - \|v\|_{L^1}| < \epsilon$ so that $v \in Y$.

~1 ~1::;x<~E,
4.16. Let v(x) E C[~I, 1] be defined by v(x) = { I/E', ~E ::; x::; E,
+1, E<X::;1.
J
We have Ilu ~ vlli2 = J~« ~1 ~ C I )2 dx + o«1 ~ C I )2 dx = E3 /3 ~
E2 + E. Hence v can be made arbitrarily elose to u by choosing E small
enough.

4.17. $\|u - v\|_\infty = \sup|1 - v(x)|$, where $|v(x)| < 1$ and $v(0) = 0$. Hence
$\|u - v\|_\infty = 1$; neighborhoods of u of radius less than 1 do not contain
members of V, so u is not a point of accumulation.

4.18. $v \in B(u_0, r) \Rightarrow \|u_0 - v\|_\infty \le r$; i.e., $\sup|\sin 2\pi x - \cos 2\pi x| \le
r$. $\sup|u_0 - v| = \sqrt{2}$ (at x = 3/8) so we require $r \ge \sqrt{2}$.

4.19. Cf. solution to Exercise 1.9.

4.20. Assume that Y is complete, and let v be a point of accumulation of Y.
Then each open ball $B(v, 1/n)$, n = 1, 2, ..., contains a point $v_n$, say,
in Y. The sequence $\{v_n\}$ is convergent, hence Cauchy, in Y. Since Y
is complete, $v \in Y$. Hence Y contains all its points of accumulation,
and is closed. Conversely, assume that Y is closed, and let $\{v_n\}$ be a
Cauchy sequence in Y. Then $\{v_n\}$ is a Cauchy sequence in X, and so
converges to v in X. From Theorem 3 of Chapter 4, v is in Y also, so
Y is complete.

4.21. W dense in X $\Rightarrow$ for any $v \in X$ there is a $w \in W$ such that $\|w - v\| <
\epsilon$. Similarly, for any $u \in Y$ there is $v \in X$ such that $\|v - u\| < \epsilon$.
Hence $\|u - w\| \le \|u - v\| + \|v - w\| < 2\epsilon$, so W is dense in Y.

4.22. Take $f \in L^p$. For given $\epsilon > 0$ choose a bounded function g in $L^p$,
where g has compact support, for example, $|g| \le M$ in [a, b] and
g = 0 otherwise. Select g so that $\|f - g\|_p < \epsilon$. Bounded functions
with compact support are dense in $L^p$, so we can find $\{h_n\}$ in $C_0$
such that $g = \lim h_n$ a.e. Assume that $|h_n| \le M$ in [a, b] and 0
otherwise. Then $|g - h_n|^p \le (2M)^p$ on [a, b] and $\|g - h_n\|_p \to 0$ from
the Dominated Convergence Theorem. Choose n so that $\|g - h_n\| \le \epsilon$
and use the Minkowski inequality.

4.23. Suppose that there are two points $v_0$, $v_0'$ such that $\|u - v_0\| =
\|u - v_0'\| = d$. Then $w = (v_0 + v_0')/2$ is in M; hence, using the
parallelogram law, it can be shown that $d^2 \le \|u - w\|^2 < \frac{1}{2}\|u - v_0\|^2 +
\frac{1}{2}\|u - v_0'\|^2 = d^2$, a contradiction.

4.24. Consider $\{u_n\} \subset Y^\perp$ with limit $u_0$ in X. We must show that $u_0 \in Y^\perp$
also. By definition $(u_n, v) = 0$ for any $v \in Y$; thus $0 = \lim_{n\to\infty}(u_n, v) =
(\lim_{n\to\infty} u_n, v) = (u_0, v) \Rightarrow u_0 \in Y^\perp$.

4.25. Theorem 7(b), which requires completeness of H, is used in Lemma
1.
4.26. Let $u \in X$ and $w \in Y^\perp$. Then $u \in Y$ also, so $(u, w) = 0$. u is arbitrary;
hence $w \in X^\perp \Rightarrow Y^\perp \subset X^\perp$.
Chapter 5
5.1. (i) $R(M)$ = points on the upper unit semicircle, $N(M) = \emptyset$; (ii)
$R(K) = [0, \infty)$, $N(K) = \{0\}$; (iii) $R(f) = (0, \infty)$, $N(f) = \emptyset$.
5.2. $N(S) = \{0\}$; $N(T) = \{\alpha(-8, 4, 1)\}$.
5.3. (i) One-to-one, not surjective; (ii) one-to-one, surjective ($T$ is a
reflection about a line at 45° through the origin).
5.5. (i) $ST(x) = S(x, -y) = (-2y, x)$; $TS(x) = T(2y, x) = (2y, -x)$; (ii)
$ST(x) = S(\sin x) = \sin^2 x - 1$, $TS(x) = T(x^2 - 1) = \sin(x^2 - 1)$.
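The two compositions in 5.5 can be checked mechanically. The sketch below is an illustration only: the definitions $S(x, y) = (2y, x)$, $T(x, y) = (x, -y)$ and $S(x) = x^2 - 1$, $T(x) = \sin x$ are inferred from the solution rather than quoted from the exercise, and the helper `compose` is a hypothetical name.

```python
import math

# Assumed operators, inferred from the solution text (not quoted from the exercise).
def S2(p):
    x, y = p
    return (2 * y, x)          # S(x, y) = (2y, x)

def T2(p):
    x, y = p
    return (x, -y)             # T(x, y) = (x, -y)

def S1(x):
    return x * x - 1.0         # S(x) = x^2 - 1

def T1(x):
    return math.sin(x)         # T(x) = sin x

def compose(f, g):
    """Return the composite operator u -> f(g(u))."""
    return lambda u: f(g(u))

ST2, TS2 = compose(S2, T2), compose(T2, S2)
ST1, TS1 = compose(S1, T1), compose(T1, S1)

p = (3.0, 5.0)
st_point = ST2(p)   # S applied to T(p)
ts_point = TS2(p)   # T applied to S(p)
```

The point of the exercise is visible immediately: `st_point` and `ts_point` differ, so $ST \ne TS$ in general.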
5.6. $S^{-1} : V \to U$ and $T^{-1} : W \to V$ exist. Clearly $TS : U \to W$ is
one-to-one onto $W$, so $(TS)^{-1}$ exists. Furthermore, $(TS)u = w$ $\Rightarrow$
$u = (TS)^{-1}w$. But $(TS)u = T(Su) = w$, so $Su = T^{-1}w$ and $u =
S^{-1}T^{-1}w$. Hence $(TS)^{-1} = S^{-1}T^{-1}$.

5.7. (i) linear; (ii) linear; (iii) nonlinear.

5.8. $Tx = \begin{pmatrix} -5 & -1 \\ -3 & -5 \end{pmatrix} x + \begin{pmatrix} 4 \\ 5 \end{pmatrix}$,
assuming that $\{(0,0), (1,0), (0,1)\}$
go to $\{(4,5), (-1,2), (3,0)\}$.
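The matrix and shift in 5.8 follow from the images of the three points: the shift is the image of the origin, and each column of the matrix is the image of a unit vector minus that shift. A minimal sketch, with hypothetical helper names:

```python
def affine_from_three_points(images):
    """Build T(x) = A x + b on R^2 from the images of (0,0), (1,0), (0,1)."""
    p0, p1, p2 = images
    b = p0                                    # T(0) = b
    col1 = (p1[0] - p0[0], p1[1] - p0[1])     # A e_1 = T(e_1) - b
    col2 = (p2[0] - p0[0], p2[1] - p0[1])     # A e_2 = T(e_2) - b
    A = ((col1[0], col2[0]),
         (col1[1], col2[1]))
    return A, b

def apply_affine(A, b, x):
    """Evaluate A x + b for a 2x2 matrix A stored as a tuple of rows."""
    return (A[0][0] * x[0] + A[0][1] * x[1] + b[0],
            A[1][0] * x[0] + A[1][1] * x[1] + b[1])

A, b = affine_from_three_points([(4, 5), (-1, 2), (3, 0)])
```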
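5.9. Let $Tu_1 = v_1$, $Tu_2 = v_2$. Then $T(\alpha u_1 + \beta u_2) = \alpha v_1 + \beta v_2$ by the
linearity of $T$. Hence $T^{-1}(\alpha v_1 + \beta v_2) = \alpha u_1 + \beta u_2$. But $\alpha T^{-1}v_1 =
\alpha u_1$, $\beta T^{-1}v_2 = \beta u_2$ $\Rightarrow$ $T^{-1}(\alpha v_1 + \beta v_2) = \alpha T^{-1}v_1 + \beta T^{-1}v_2$.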
5.10. No; e.g., $d(x, B) + d(y, B) \ne d(x + y, B)$ in general. Null space is the
set $B$.

5.11. For $u \ne 0$, $\|T\| = \sup(\|Tu\|/\|u\|) = \sup \|T(u/\|u\|)\|$ ($T$ is linear)
$= \sup \|Tu\|$, $\|u\| = 1$. To prove the second result, consider $\|Tu\| \le
\|T\|\|u\|$. For every $\epsilon > 0$, there is a $u_0$ such that $\|Tu_0\| > (\|T\| -
\epsilon)\|u_0\|$. If $\|u\| \le 1$, then $\|Tu\| \le \|T\|\|u\| \le \|T\|$ $\Rightarrow$ $\sup \|Tu\| \le
\|T\|$, $\|u\| \le 1$. But if we put $u_1 = u_0/\|u_0\|$, then $\|Tu_1\| = \|u_0\|^{-1}\|Tu_0\|
> \|T\| - \epsilon$, so for $\|u\| \le 1$, $\sup \|Tu\| \ge \|Tu_1\| > \|T\| - \epsilon$. Since $\epsilon$ is
arbitrary, $\sup \|Tu\| \ge \|T\|$, and the two inequalities give equality.

5.12. $\|Ax\|_\infty = \max_{1\le i\le n} \bigl|\sum_{j=1}^n A_{ij}x_j\bigr| \le \max_{1\le i\le n} \sum_{j=1}^n |A_{ij}||x_j| \le
\max_{1\le i\le n} \sum_{j=1}^n |A_{ij}| \max_{1\le j\le n} |x_j| = \max_{1\le i\le n} \sum_{j=1}^n |A_{ij}| \, \|x\|_\infty$.
Hence $\|A\| = \sup(\|Ax\|_\infty/\|x\|_\infty) \le \max_{1\le i\le n} \sum_{j=1}^n |A_{ij}|$. Suppose
maximum occurs for $i = k$. Then for $x$ such that $x_j = +1$ if $A_{kj} \ge 0$,
$x_j = -1$ if $A_{kj} < 0$ we have $\|Ax\|_\infty/\|x\|_\infty = \sum_{j=1}^n |A_{kj}|$.
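The argument in 5.12 is constructive: the sign vector built from the row with the largest absolute sum attains the bound. The sketch below reproduces that construction for an arbitrary sample matrix; the matrix entries and function names are illustrative assumptions.

```python
def inf_norm(x):
    """Max norm of a vector."""
    return max(abs(v) for v in x)

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def max_row_sum(A):
    """Upper bound from 5.12: max over rows of the sum of absolute entries."""
    return max(sum(abs(a) for a in row) for row in A)

def maximizing_vector(A):
    """Sign vector x_j = +1 if A_kj >= 0, -1 otherwise, for the maximal row k."""
    k = max(range(len(A)), key=lambda i: sum(abs(a) for a in A[i]))
    return [1.0 if a >= 0 else -1.0 for a in A[k]]

A = [[1.0, -2.0, 3.0],
     [-4.0, 5.0, -6.0],
     [7.0, 8.0, -9.0]]      # sample data, chosen only for illustration

x = maximizing_vector(A)
ratio = inf_norm(matvec(A, x)) / inf_norm(x)
```

For this matrix the ratio equals `max_row_sum(A)`, so the supremum is attained and the induced norm is the maximum absolute row sum.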

5.13. For $x \ne 0$, $(\|Ax\|/\|x\|)^2 = (a + b)^2 - 2ab(x - y)^2/(x^2 + y^2)$. Take
the supremum (at $y = x$) to find $\|A\|_2$.

5.14. $\|Iu\| = \|u\|$; $I$ is bounded. Consider $u(x) = \sin nx$: $\|u\|_V = 1$ but
$\|Iu\|_W = 1 + n$ which cannot be bounded.

5.15. $\|ST(u)\| = \|S(Tu)\| \le \|S\|\|Tu\| \le \|S\|\|T\|\|u\|$.

5.16. Let $\{u_n\} \subset N(T)$ with limit $u$ in $U$. Then $Tu_n = 0$. Thus $0 =
\lim_{n\to\infty} Tu_n = T(\lim_{n\to\infty} u_n) = Tu$ $\Rightarrow$ $u \in N(T)$.

5.17. $T$ is one-to-one since, if $Tu_1 = Tu_2 = v$, then $\|Tu_1 - Tu_2\| = 0 \ge
K\|u_1 - u_2\|$. $\|T^{-1}v\| = \|u\| \le K^{-1}\|Tu\| = K^{-1}\|v\|$.
5.18. $u(x) = \int_0^x u'(s)\,ds \le \sup_{0\le x\le 1}|u'(x)| = \|Du\|$. Take sup of both
sides.
5.19. $(I - P)(I - P) = I^2 - PI - IP + P^2 = I - P$. Range: $R(I - P) =
N(P)$, $R(P) = N(I - P)$.
5.20. From Theorem 8, $\|Pu\| \le \|u\|$. Thus $\|P\| \le 1$. But for $u \in R(P)$ we
have $Pu = u$, so $\|Pu\| = \|u\|$. Hence $\|P\| = 1$.
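The identities in 5.19 and 5.20 can be seen on the smallest nontrivial example. The sketch below assumes, purely for illustration, that $P$ is the orthogonal projection of $\mathbb{R}^2$ onto the line spanned by $(1, 1)$; all names are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matsub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def apply(M, x):
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]

def norm(x):
    return math.sqrt(x[0] ** 2 + x[1] ** 2)

I2 = [[1.0, 0.0], [0.0, 1.0]]
P = [[0.5, 0.5], [0.5, 0.5]]   # assumed example: projection onto span{(1, 1)}
Q = matsub(I2, P)              # complementary projection I - P

u = [3.0, -1.0]
Pu = apply(P, u)
```

Here `Q` is again idempotent, `P` applied to `Q` vanishes (so $R(I - P) \subset N(P)$), and vectors already in $R(P)$ are left unchanged, which is what forces $\|P\| = 1$.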
5.21. Take, for example, the map on $\mathbb{R}^2$ that takes a point $x$ to the point
in $B(0, 1)$ closest to $x$. This is a projection, but the map is not
homogeneous.
5.22. Let $u \in N(P)$. By definition $(u, v) = 0$ for $v \in R(P)$. Hence $N(P) \subset
R(P)^\perp$. Let $u \in R(P)^\perp$. Then $(u, z) = 0$ for $z \in R(P)$. By Theorem
9, $u = v + w$ for $v \in R(P)$, $w \in N(P)$, so $Pu = Pv + Pw = Pv = v$.
Also, $0 = (u, z) = (v, z) + (w, z) = (v, z)$; hence $v = 0$. Thus $Pu =
0$ $\Rightarrow$ $u \in N(P)$.
5.23. $T$ is a projection since $T$ is linear and $T^2u = Tv$ (where $v = u(x)$ if
$|x| < 1$ and $0$ otherwise) $= v = Tu$. $R(T) = \{u \in L^2(\mathbb{R}) : u(x) = 0$
for $|x| \ge 1\}$, $N(T) = \{u \in L^2(\mathbb{R}) : u(x) = 0$ for $|x| < 1\}$.

5.24. $v(y) = Pu(y) = \int_{-1}^{1} \exp(i(y - z))u(z)\,dz$; show that $Pv(x) =
P^2u(x) = Pu(x)$. $P$ is an orthogonal projection.
5.25. (i) x satisfies Ax = 1 where 1 = (1, ... ,1); (ii) x satisfies Ax = 0 =
(1,0, ... ,0).

5.26. u(x) = e3~1 (_e 3 - 2X + eX) - 2x + 2, (l, n = I~ u(x) dx.


e3~1 (~e - e3 ) = 101 g(x)2x dx; so 9 satisfies Jo1 (gj - u)dx = O.
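5.28. Let $\{\ell_n\}$ be a Cauchy sequence in $X'$. Then for any $u \in X$, $|\langle \ell_n, u\rangle -
\langle \ell_m, u\rangle| \le \|\ell_n - \ell_m\|\|u\| \to 0$ as $m, n \to \infty$, so $\{\langle \ell_n, u\rangle\}$ is a Cauchy
sequence in $\mathbb{R}$, with limit $\langle \ell, u\rangle$, say. Complete the proof by showing
that $\ell$ is bounded and linear, and $\ell_n \to \ell$ in $X'$.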

5.29. In the use of the projection theorem.

5.30. If there are two elements $u_1$, $u_2$ such that $(u_1, v) = (u_2, v) = \langle \ell, v\rangle$,
then $(u_1 - u_2, v) = 0$. Set $v = u_1 - u_2$: $\|u_1 - u_2\|^2 = 0$ or
$u_1 = u_2$. $\|\ell\| = \sup(|\langle \ell, v\rangle|/\|v\|)$ (for $v \ne 0$) $= \sup((u, v)/\|v\|) \le
\sup(\|u\|\|v\|/\|v\|) = \|u\|$. Also, $|\langle \ell, u\rangle| = (u, u) = \|u\|^2 \le \|\ell\|\|u\|$ so
$\|\ell\| \ge \|u\|$. Hence $\|\ell\| = \|u\|$.
5.31. Take $f = |g|^{q-1}\,\mathrm{sgn}\,g$; then $|f|^p = |g|^q$, so $f \in L^p$, and $\|f\|_{L^p} =
\|g\|_{L^q}^{q-1}$. Then show that $\langle \ell_g, f\rangle = \|f\|_{L^p}\|g\|_{L^q}$.
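5.32. $\ell = 0$ $\Leftrightarrow$ $\langle \ell, v\rangle = 0$ for all $v \in X$. Given $v \in X$ there exists a
sequence $\{v_n\}$ in $Y$ such that $v_n \to v$. Thus $\langle \ell, v\rangle = \langle \ell, \lim_{n\to\infty} v_n\rangle =
\lim_{n\to\infty}\langle \ell, v_n\rangle = 0$.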

5.33. $|a(u, v)|^2 \le [\|u'\|\|v'\| + K_1\|u\|\|v\|]^2 \le (K_1^2\|u\|^2 + \|u'\|^2)(\|v\|^2 + \|v'\|^2)$,
using Cauchy-Schwarz.

5.34. Cf. Exercise 4.2.

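5.35. $|\langle \ell, v\rangle| = \bigl|\int_0^1(-1 - 4x)v(x)\,dx\bigr| = |(-1 - 4x, v)_{L^2}| \le \|-1 - 4x\|_{L^2}\|v\|_{L^2}
\le k\|v\|_{H^1}$. $|a(u, v)| \le 2\bigl|\int_0^1 u'v'\,dx\bigr| \le 2\|u'\|_{L^2}\|v'\|_{L^2} \le 2\|u\|_{H^1}\|v\|_{H^1}$,
hence continuous. $a(v, v) \ge \int_0^1 (v')^2\,dx$. Now $\|v'\|_{L^2}^2 \ge C^{-2}\|v\|_{L^2}^2$ so
$(C^{-2} + 1)\|v'\|_{L^2}^2 \ge C^{-2}\|v\|_{H^1}^2$. $\int_0^1(-1 - 4x)v\,dx = \int_0^1 (x + 1)u'v'\,dx =
[(x + 1)u'v]_0^1 - \int_0^1 (u' + (x + 1)u'')v\,dx$ $\Rightarrow$ $\int_0^1 \{(x + 1)u'' + u' - (1 +
4x)\}v\,dx = 0$.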
5.36. $|\tilde a(u, v)| \le |a(u, v)| + |(u, \kappa v)_{L^2}| \le K\|u\|\|v\| + K'\|u\|\|v\|$ where $K' =
\sup|\kappa(x)|$. $\tilde a(v, v) = a(v, v) + (v, \kappa v) \ge \alpha\|v\|^2 + \beta(v, v) \ge \alpha\|v\|^2$
where $\beta = \inf \kappa(x)$.
Chapter 6
6.1. (a) Linearly dependent; (b) linearly independent.

6.2. $\sum_{k=1}^{n} a_k e^{ikx} = 0$ $\Rightarrow$ $\sum_{k=1}^{n} a_k \cos kx = 0$ and $\sum_{k=1}^{n} a_k \sin kx = 0$,
which holds only for all $a_k = 0$. Hence $\{e^{ikx}\}$ is linearly independent.
6.3. If $u, v \in X$, then $(\alpha u + \beta v)'' - 2(\alpha u + \beta v)' + (\alpha u + \beta v) = \alpha(u'' -
2u' + u) + \beta(v'' - 2v' + v) = 0$, hence $\alpha u + \beta v \in X$. $\dim X = 2$. Basis
for $X$ is $\{u_1(x) = e^x,\ u_2(x) = xe^x\}$.
6.4. dimM = 9, dimK = 4.
6.5. Let $\dim V = m$ with basis $\{v_1, \ldots, v_m\}$ and $\dim W = n$ with basis
$\{w_1, \ldots, w_n\}$. Every $u \in V \oplus W$ is of the form $u = v + w$ for some
$v \in V$, $w \in W$. But $v = \sum_i \alpha_i v_i$ and $w = \sum_j \beta_j w_j$ so $u =
\sum_i \alpha_i v_i + \sum_j \beta_j w_j$. Hence $B = \{v_1, \ldots, v_m, w_1, \ldots, w_n\}$ spans $V \oplus W$.
It remains to show that $B$ is linearly independent.

6.7. $\phi_1 = (1/\sqrt2)(1, 0, 1)$, $\phi_2 = (1/\sqrt2)(1, 0, -1)$, $\phi_3 = (0, 1, 0)$.
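Orthonormal sets of the kind listed in 6.7 are produced by the Gram-Schmidt procedure. The sketch below is a minimal implementation; the starting vectors $(1,0,1)$, $(1,0,0)$, $(0,1,0)$ are an assumption chosen because they lead to the vectors above, and are not taken from the exercise statement.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors, in the given order."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(v, q)                        # component of v along q
            w = [wi - c * qi for wi, qi in zip(w, q)]
        length = math.sqrt(dot(w, w))
        if length < 1e-12:
            raise ValueError("vectors are linearly dependent")
        basis.append([wi / length for wi in w])
    return basis

# Hypothetical input set; only the output is compared with the solution.
phi = gram_schmidt([(1.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
```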

6.8. $\phi_0(x) = \sqrt{1/2}$, $\phi_1(x) = \sqrt{3/2}\,x$, $\phi_2(x) = \tfrac12\sqrt{5/2}\,(3x^2 - 1)$, $\phi_3(x) =
\tfrac12\sqrt{7/2}\,(5x^3 - 3x)$.
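The four functions in 6.8 are the normalized Legendre polynomials, so their Gram matrix in $L^2(-1, 1)$ should be the identity. The sketch below checks this numerically with a composite Simpson rule; the quadrature routine and its step count are illustrative choices, not part of the text.

```python
import math

# Orthonormal polynomials as listed in the solution to 6.8.
phi = [
    lambda x: math.sqrt(1.0 / 2.0),
    lambda x: math.sqrt(3.0 / 2.0) * x,
    lambda x: 0.5 * math.sqrt(5.0 / 2.0) * (3.0 * x * x - 1.0),
    lambda x: 0.5 * math.sqrt(7.0 / 2.0) * (5.0 * x ** 3 - 3.0 * x),
]

def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def gram_matrix():
    """Matrix of L^2(-1, 1) inner products (phi_i, phi_j)."""
    return [[simpson(lambda x: p(x) * q(x), -1.0, 1.0) for q in phi]
            for p in phi]

G = gram_matrix()
```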

6.9. $A_{11} = \tfrac12(e^2 - 1)$, $A_{12} = A_{21} = \tfrac12(1 - e^{-2})$, $A_{22} = \tfrac16(1 - e^{-6})$. $\det A \ne 0$.

6.10. Consider $I : X_1 \to X_2$: $\|Iu\|_2 = \|u\|_2 \le k\|u\|_1$ (show this using
Lemma 1; see also Theorem 4). Similarly, $\|u\|_1 \le K\|u\|_2$ if we
consider $I : X_2 \to X_1$.

6.11. $T_{12} = 2$, $T_{23} = 6$, others zero.

6.12. $T_{11} = 2\pi$, $T_{22} = \cos x$, others zero.

6.13. $(b, c) = (Ta, c) = (a, T^Tc) = 0$ if $c \in N(T^T)$. Let $d \in R(T)^\perp$.
Then $(d, Tu) = 0 = (T^Td, u)$ $\Rightarrow$ $d \in N(T^T)$. Conversely, if $d \in
N(T^T)$, then if $Tu = v$ we have $(T^Td, u) = 0 = (d, v)$ $\Rightarrow$ $d \in
R(T)^\perp$. Hence $N(T^T) = R(T)^\perp$ $\Rightarrow$ $N(T^T)^\perp = R(T)$. $N(T^T) =
\{(1, 1, -1)\}$, $b = (\alpha, \beta, \alpha + \beta)$.

6.14. $(\alpha_2, -\alpha_1, 0)$, $(\alpha_3, 0, -\alpha_1)$.

6.15. Let $B_1 = \{e_1, \ldots, e_n\}$ and $B_2 = \{f_1, \ldots, f_n\}$ be orthonormal bases
of $X$ and $\mathbb{R}^n$, respectively. For any $u \in X$ we have $u = \sum u_ie_i$, $u_i =
(u, e_i)$. Define the map $T : X \to \mathbb{R}^n$ by $T(u) = (u_1, \ldots, u_n)$. Then $T$ is
an isomorphism (show this) and $\|u\|_X^2 = (u, u) = (\sum u_ie_i, \sum u_je_j) =
\sum u_i^2 = \|Tu\|_{\mathbb{R}^n}^2$.

6.16. 111:'11 = max lail·


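6.17. (i) $u(x) = \sqrt{2\pi}\,(1/\sqrt{2\pi})$; (ii) $u(x) = \sum_{k=1}^{\infty}(2/k)(1 - (-1)^k)\sin kx$.

6.18. $u_0 = -\sqrt2/4$, $u_1 = 5\sqrt3/6\sqrt2$, $u_2 = \sqrt5/8\sqrt2$.

6.19. $c_k = \tfrac12(u_{2k} - iu_{2k-1})$ for $k = 1, 2, \ldots$, $c_k = \tfrac12(u_{2k} + iu_{2k-1})$ for
$k = -1, -2, \ldots$, $c_0 = u_0/\sqrt2$.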

6.20. $0 \le \|u - \sum_{i=1}^{N}(u, \phi_i)\phi_i\|^2 = \|u\|^2 - \sum_{i=1}^{N}(u, \phi_i)^2$, hence $\sum_{i=1}^{N}(u, \phi_i)^2 \le
\|u\|^2$. Since the sum is bounded, we can let $N \to \infty$.

6.21. Use the property $P\phi_k = \phi_k$ to show that $P^2u = Pu$. Clearly $R(P) \subset
V$. Conversely, if $v \in V$, show that $Pv = v$ so that $R(P) = V$.
Orthogonality: take $v \in R(P)$ and $w \in N(P)$; then $(w, v) = (w, Pu)$.
Use this to show that $(w, v) = 0$.

6.22. See Exercise 6.8. $Pu = \sum_{k=0}^{3}(u, \phi_k)\phi_k = (\sqrt2/5)\,\phi_0 + (8/35)\sqrt{5/2}\,\phi_2$.



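6.23. (a) Set $u(r, \theta) = R(r)\Theta(\theta)$ to get $(\Theta' \sin\theta)' + \lambda\Theta\sin\theta = 0$. Set
$\xi = \cos\theta$ to get Legendre's equation. General solution is $u(r, \theta) =
\sum_{n=0}^{\infty}[a_nr^n + b_nr^{-(n+1)}]P_n(\cos\theta)$.
(b) $a_n = \frac{2n+1}{2}\int_0^{\pi} f(\theta)P_n(\cos\theta)\,d\theta$.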

6.24. Eigenvalues satisfy.J>: cos( .;x;;C) + ßsin( .;x;;C) = O. vdx) = [(C/2)+


(1/2ß) cos 2(.;x;;C)]-1/2 sin( .;x;;C). Heat equation: u(O, t) = 0,
(Bu/Bx + ßu)(C, t) = O.
6.25. Use integration by parts and the boundary conditions to show that
$(Lu, u) \ge 0$. Nonnegativity of the eigenvalues follows from $0 \le (Lu, u)
\le \lambda(u, u)$. Since $L^2$ is separable there is at most a countable number
of nonzero mutually orthogonal vectors.

6.26. Let the minimizer be $u$, and set $w = u + \epsilon v$; then consider $R(w) =
R(\epsilon)$ over all $w$ that satisfy $(w, e_1) = (w, e_2) = \cdots = (w, e_{n-1}) = 0$.
Set $[dR/d\epsilon]_{\epsilon=0} = 0$; expand and differentiate to find that $\lambda = R(u)$
and $u = e_n$.

6.27. $(Ls_n, r_n) = (L\sum_{k=1}^{n} u_k\phi_k, r_n) = (\sum_{k=1}^{n} u_k\lambda_k\phi_k, r_n) = 0$ since $(r_n, \phi_k) =
0$ (Proof of Theorem 6.12).
6.28. Return to (6.34): for symmetry of $L$, (a) $p(x) \to 0$ as $x \to \pm\infty$; (b)
$p(-L) = p(L)$.

6.29. (c) Show that $H_n'(x) = 2xH_n(x) - H_{n+1}(x)$. Set $f(x) = \exp(-x^2)$ and
show that $f^{(n+1)} + 2xf^{(n)} + 2nf^{(n-1)} = 0$; multiply by $(-1)^{n+1}\exp(x^2)$
to get $H_{n+1} - 2xH_n + 2nH_{n-1} = 0$.
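Both relations in 6.29 can be verified exactly on polynomial coefficients. The sketch below generates the Hermite polynomials from the three-term recurrence, starting from $H_0 = 1$ and $H_1 = 2x$, and then checks the derivative identity. Polynomials are stored as coefficient lists with the constant term first; the helper names are hypothetical.

```python
def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def scale(p, c):
    return trim([c * a for a in p])

def mul_x(p):
    """Multiply a polynomial by x."""
    return trim([0] + list(p))

def deriv(p):
    return trim([k * p[k] for k in range(1, len(p))] or [0])

def hermite(n_max):
    """Return [H_0, ..., H_n_max] using H_{n+1} = 2x H_n - 2n H_{n-1}."""
    H = [[1], [0, 2]]
    for n in range(1, n_max):
        H.append(add(scale(mul_x(H[n]), 2), scale(H[n - 1], -2 * n)))
    return H[:n_max + 1]

H = hermite(6)
# Right-hand side of the identity H_n' = 2x H_n - H_{n+1}, for n = 0..5.
rhs = [add(scale(mul_x(H[n]), 2), scale(H[n + 1], -1)) for n in range(6)]
```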

Chapter 7

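7.1. $|\alpha| = 0$ $\Rightarrow$ $\alpha = (0, 0)$, $(x^\alpha/\alpha!)D^\alpha f(0) = f(0)$. $|\alpha| = 1$ $\Rightarrow$
$\dfrac{x^1y^0}{1!\,0!}D^{(1,0)}f(0) + \dfrac{x^0y^1}{0!\,1!}D^{(0,1)}f(0) = x\left.\dfrac{\partial f}{\partial x}\right|_0 + y\left.\dfrac{\partial f}{\partial y}\right|_0$, etc.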

7.2. $\int_{-a}^{a} \delta(x)\phi_a(x)\,dx \le e^{-1}\int_{-a}^{a} \delta(x)\,dx$ since $\sup\phi_a(x) = e^{-1}$. If $\delta$ were
locally integrable, then $\lim_{a\to0}\int_{-a}^{a} \delta(x)\,dx = 0$. But left-hand side
$= \phi(0) = e^{-1}$.

7.3. $f(x)\phi(x) \in C(\Omega)$. Assume $f \ne 0$, but $\int f\phi\,dx = 0$. In particular, if
$f(x_0) \ne 0$, then $f(x) \ne 0$ for all $x \in (x_0 - h, x_0 + h)$ for some $h$.
Choose arbitrary $\phi$ with compact support inside $(x_0 - h, x_0 + h)$; can
always find $\phi$ such that $\int f\phi\,dx \ne 0$, a contradiction.

7.4. Consider $\Omega \subset \mathbb{R}^2$, for example; for $|\alpha| = m$, $\int_\Omega (D^\alpha u)v\,dx = \int_\Omega (\partial^m u/
\partial x^k\partial y^{m-k})v\,dx$, where $0 \le k \le m$. Use Green's theorem repeatedly.

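7.5. $((\mathrm{sgn})', \phi) = -(\mathrm{sgn}, \phi') = -\int_{-1}^{0}(-1)\phi'\,dx - \int_0^1(+1)\phi'\,dx = [\phi]_{-1}^{0} -
[\phi]_0^1 = 2\phi(0) = 2(\delta, \phi)$.
7.6. $((\sin ax \cdot H(x))'', \phi) = (\sin ax \cdot H(x), \phi'') = (H(x), \phi''\sin ax)
= \int_0^1 \phi''\sin ax\,dx = [\phi'\sin ax - a\phi\cos ax]_0^1 - \int_0^1 a^2\phi\sin ax\,dx
= a\phi(0) - a^2(\sin ax \cdot H(x), \phi)$.

7.7. $(f', \phi) = -(f, \phi') = -\int_{-1}^{0} x\phi'(x)\,dx - \int_0^1 (x + c)\phi'(x)\,dx = -[x\phi]_{-1}^{0} +
\int_{-1}^{0} \phi(x)\,dx - [(x + c)\phi]_0^1 + \int_0^1 \phi(x)\,dx = c\phi(0) + \int_{-1}^{1} 1 \cdot \phi(x)\,dx =
(c\delta, \phi) + (1, \phi)$.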
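7.8. Set $A = \{x : -1 < x < 0,\ -1 < y < 0\}$, $B = \{x : 0 < x < 1,\ 0 <
y < 1\}$, $C = A \cup B$, with boundaries $\partial A$, $\partial B$. Then

$(D^{(1,1)}f, \phi) = (f, D^{(1,1)}\phi) = \int_C xy\,\dfrac{\partial^2\phi}{\partial x\,\partial y}\,dx\,dy$
$= \int_{\partial A} xy\,\nu_x\dfrac{\partial\phi}{\partial y}\,ds + \int_{\partial B} xy\,\nu_x\dfrac{\partial\phi}{\partial y}\,ds - \int_C y\,\dfrac{\partial\phi}{\partial y}\,dx\,dy$
$= -\int_C y\,\dfrac{\partial\phi}{\partial y}\,dx\,dy = \int_C \phi\,dx\,dy$.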

7.9. Solution of homogeneous equation is $u(x) = e^{-x}$. Now $(u' + u, \phi) =
-(u, \phi') + (u, \phi) = -(H, f\phi') + (H, f\phi)$ (using $u = Hf$) $= (\delta, \phi)$ after
integrating. Left-hand side $= f(0)\phi(0) + \int_0^1 (f' + f)\phi\,dx$ $\Rightarrow$ $f(x) =
e^{-x}$. Hence $u(x) = (c + H(x))e^{-x}$.
7.10. (a) $u \in H^2(0, 3)$; (b) $u \in H^1((0, 1) \times (0, 2))$.
7.11. $u \perp v$ in $H^1(0, 2)$.
7.13. $D^\alpha u \in L^2(\Omega)$ for $|\alpha| = 2$; so $m = 2 > n/2 = 1$.
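7.14. Consider $\{u_n\}, \{v_n\} \subset C^1(\bar\Omega)$ such that $u_n \to u$ and $v_n \to v$ in the
$H^1$-norm with $u, v \in H^1(\Omega)$ ($H^1$ is the closure of $C^1$). Then $D^\alpha u_n \to
D^\alpha u$, $D^\alpha v_n \to D^\alpha v$ in $L^2$, for $|\alpha| \le 1$. Also, $v_n \to v$ and $u_n \to u$
in $L^2(\Gamma)$. Thus, for example, $(\partial u_n/\partial x_i, v_n)_{L^2(\Omega)} = (u_n, v_n\nu_i)_{L^2(\Gamma)} -
(u_n, \partial v_n/\partial x_i)_{L^2(\Omega)}$. Take $\lim_{n\to\infty}$.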

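7.15. Assume $\Omega \subset \mathbb{R}^2$; then left-hand side is $\int_\Omega \left(\dfrac{\partial^2u}{\partial x^2} + \dfrac{\partial^2u}{\partial y^2}\right)\left(\dfrac{\partial^2v}{\partial x^2} + \dfrac{\partial^2v}{\partial y^2}\right)dx$.
Now $\int_\Omega \dfrac{\partial^2u}{\partial x^2}\dfrac{\partial^2v}{\partial x^2}\,dx = \int_\Gamma \left(\dfrac{\partial^2v}{\partial x^2}\dfrac{\partial u}{\partial x} - \dfrac{\partial^3v}{\partial x^3}\,u\right)\nu_x\,ds + \int_\Omega u\,\dfrac{\partial^4v}{\partial x^4}\,dx$. Proceed
in this manner; use $\partial/\partial\nu = \nu_1\,\partial/\partial x + \nu_2\,\partial/\partial y$.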
7.16. Let $\{v_n\}$ be a sequence in $\mathcal{D}(\Omega)$ with limit $v \in H_0^1(\Omega)$. We have
$\|v_n\|_{L^2} \le c|v_n|_{H^1}$; $v_n \to v$ in $H^1$ implies that $\|v_n\|_{L^2} \to \|v\|_{L^2}$ and
$|v_n|_{H^1} \to |v|_{H^1}$. $|\cdot|_{H^1}$ is positive-definite since $|v|_1 = 0$ implies that
$\int |\nabla v|^2\,dx = 0$, so that $v = \text{const} = 0$, given the boundary value of
$v$.

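7.17. Show that $(u, v) \equiv \int_\Omega \sum_{|\alpha|=m} D^\alpha u\,D^\alpha v\,dx$ is an inner product. In
particular, $(u, u) = 0$ $\Rightarrow$ $\int_\Omega (D^\alpha u)^2\,dx = 0$ for $|\alpha| = m$, hence $D^\alpha u =
0$ for $|\alpha| = m$. But $u \in H_0^m(\Omega)$; so $u = 0$.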

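7.18. $\|\nabla^2v\|_{L^2}^2 = \int_\Omega \left[\left(\dfrac{\partial^2v}{\partial x^2}\right)^2 + 2\,\dfrac{\partial^2v}{\partial x^2}\dfrac{\partial^2v}{\partial y^2} + \left(\dfrac{\partial^2v}{\partial y^2}\right)^2\right]dx$.
But $\int_\Omega \left(\dfrac{\partial^2v}{\partial x\,\partial y}\right)^2dx = -\int_\Omega \dfrac{\partial^3v}{\partial x^2\,\partial y}\dfrac{\partial v}{\partial y}\,dx = \int_\Omega \dfrac{\partial^2v}{\partial y^2}\dfrac{\partial^2v}{\partial x^2}\,dx$.
7.19. Require $\sup|(\delta, v)|$ to be defined, i.e., $v$ continuous. Hence $m > n/2$.
For example, $\delta : H_0^1(\Omega) \to \mathbb{R}$ is not defined for $\Omega \subset \mathbb{R}^2$.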

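7.20. $u \in H_0^1(\Omega)^\perp$ $\Rightarrow$ $(u, v)_{H^1} = 0$ for all $v \in H_0^1(\Omega)$; i.e., $0 = \int_\Omega (uv +
\sum_{|\alpha|=1} D^\alpha u\,D^\alpha v)\,dx$. Set $v = \phi \in \mathcal{D}(\Omega)$: $0 = \int (u - \nabla^2u)\phi\,dx$ using
Green's theorem $\Rightarrow$ $\nabla^2u = u$. Since $\mathcal{D}(\Omega)$ is dense in $H_0^1$, we can
extend this result in the usual way. $u \in H_0^1(\Omega)^\perp$ for $\Omega = (0, 1)$ $\Rightarrow$
$u'' - u = 0$. Basis for $H_0^1(\Omega)^\perp$ is $\{e^x, e^{-x}\}$.

7.21. $\int (\ln x)^2\,dx = x(\ln x)^2 - 2x\ln x + 2x$. Then use Theorem 9.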


Chapter 8
8.1. (a) Second order, nonlinear, 0. = upper unit semicircle. (b) Fourth-
order, linear, 0. = triangle with vertices at (0,0), (1,0), (0, I).

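8.2. (a) $\dfrac{d}{dt}\int_{\Omega'} \rho v\,dx = \int_{\Omega'} f\,dx + \int_{\Gamma'} t\,ds$. Use Cauchy's law $t = \sigma\nu$ and
the divergence theorem to rewrite the surface integral as $\int_{\Omega'} \mathrm{div}\,\sigma\,dx$.
The left-hand side equals $\int_{\Omega'} \rho\,\partial^2u/\partial t^2\,dx$. Regroup and invoke the
arbitrariness of $\Omega'$ to obtain (8.5). (b) $\sigma_{ij} = \lambda(\mathrm{div}\,u)I_{ij} + 2\mu\epsilon_{ij}(u)$.
Substitute in (8.5).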
8.3. The argument is as in Example 2 of the Introduction: simply replace
f by f - ku, ku being the force of the foundation.
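8.4. (a) $\sum_j \partial\sigma_{\alpha j}/\partial x_j = \sum_\beta \partial\sigma_{\alpha\beta}/\partial x_\beta + \partial\sigma_{\alpha3}/\partial z$, where $z = x_3$. Integrate
with respect to $z$ and use the definitions of $S_\alpha$ and $M_{\alpha\beta}$. (b) Follows
as part (a). (c) Differentiate (8.14)$_1$ with respect to $x_\alpha$, sum on
$\alpha$, and use (8.14)$_2$ to eliminate $S_\alpha$; this gives $\sum_{\alpha,\beta} \partial^2M_{\alpha\beta}/\partial x_\alpha\partial x_\beta =
-q$. Next, use (8.13). This gives $\sum_{\alpha,\beta} \partial^2M_{\alpha\beta}/\partial x_\alpha\partial x_\beta = -D[\nu\sum_\alpha
\partial^2(\nabla^2w)/\partial x_\alpha^2 + (1 - \nu)\sum_{\alpha,\beta} \partial^4w/\partial x_\alpha^2\partial x_\beta^2]$.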

8.5. (a) $w' = 0$, $w''' = 0$; (b) $w = 0$, $w'' = 0$.

8.6. Elliptic in $A = \{x : x > 1,\ y > 1\} \cup \{x : x < 1,\ y < 1\}$; strongly
elliptic in any open subset of $A$.

8.7. LIOtI, IßI=l aa:ßE,Ot+ß = -(1 + x 2 )e


+ 3.,,2 + 2(1 + X2)(2 = 0 at any Xo
for any E, such that e
= [3.,,2 + 2(1 + z~)(2l1(1 + x6).

8.8. $\sum_{i,j,k,l} C_{ijkl}\xi_i\eta_j\xi_k\eta_l = \mu|\xi|^2|\eta|^2 + (\lambda + \mu)(\xi\cdot\eta)^2 = \mu|\xi|^2|\eta_\perp|^2 + (\lambda +
2\mu)(\xi\cdot\eta)^2$, where $\eta_\perp$ is the component of $\eta$ orthogonal to $\xi$. The
result follows from the independence of $\eta_\perp$ and $(\xi\cdot\eta)$. Pointwise stability:
$f \equiv \sum_{i,j,k,l} C_{ijkl}M_{ij}M_{kl} = (3\lambda + 2\mu)|M_S|^2 + 2\mu|M_D|^2$ [$M_S =
\tfrac13(\mathrm{tr}\,M)I$ and $M_D = M - M_S$]. Show that $|M|^2 = |M_S|^2 + |M_D|^2$:
then $f \ge c|M|^2$ iff $3\lambda + 2\mu > k_0$ and $\mu > \mu_0$.
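8.9. $\dfrac{\partial^2u}{\partial x_1^2}\nu_1^2 + 2\,\dfrac{\partial^2u}{\partial x_1\,\partial x_2}\nu_1\nu_2 + \dfrac{\partial^2u}{\partial x_2^2}\nu_2^2 = g$. Have to check $\sum_{|\alpha|=2} b_\alpha a^\alpha =
\nu_1^2a_1^2 + 2\nu_1\nu_2a_1a_2 + \nu_2^2a_2^2 = (\nu_1a_1 + \nu_2a_2)^2 = (\nu_1^2 + \nu_2^2)^2 \ne 0$ if $a = \nu$.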

8.10. Use (8.13) and (8.14). The BC can be rewritten as ~ +V a~~~2 = o.


With 10:1 = 3, Lbav a = b(3,0)V? + b(I,2)VI Vi = VI [V? + vviJ f= 0
along X = L, for which VI = 1, V2 = o.

8.11. Irl + Iß - 31 f= o.
8.12. ku·v-t·v = O. n = 2j bn = k, b22 = -Cn = Ij all other components
are zero. So (8.33) is satisfied.

8.13. u· v = 0, t· s = {tt· v with t = UV, and u = Ce(v).


8.14. [u"'v - u"v' + u'v" - uvlllM = [-BIUSiv - BouSov + SIuBiv +
SouBovM·
8.15. B o= 8/[)v, So = -SQ = 1.
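8.16. Set $\partial v_i/\partial x_j = e_{ij}$; then since $\sigma$ is symmetric, $\sum_{i,j} \sigma_{ij}e_{ij} = \sum_{i,j} \sigma_{ji}e_{ij}$
(i); also, by swapping indices, $\sum_{i,j} \sigma_{ij}e_{ij} = \sum_{i,j} \sigma_{ji}e_{ji}$ (ii); add (i)
and (ii) to get desired result. To obtain (8.49), use the fact that
$\int_\Omega \sum_{k,l} \sigma_{kl}\epsilon_{kl}(u)\,dx = \int_\Omega \sum_{k,l} \sigma_{kl}(\partial u_k/\partial x_l)\,dx = \int_\Gamma \sum_{k,l} \sigma_{kl}\nu_lu_k\,ds -
\int_\Omega \sum_{k,l} (\partial\sigma_{kl}/\partial x_l)u_k\,dx$. Set $\sigma = C\epsilon(v)$.
8.19. $N(A) = \{u : u(x) = ax + b\} = N(A^*)$. Solution exists if $\int_0^1 f(x)\,dx =
\int_0^1 xf(x)\,dx = 0$. Solution is unique if $\int_0^1 u(x)\,dx = \int_0^1 xu(x)\,dx = 0$.
8.20. $N(A) = \{u : u \text{ const.}\}$; unique solution if $\int_\Omega u(x)\,dx = 0$. $N(A^*) =
\{u : u(x) = \alpha_1 + \alpha_2(x - y)\}$; solution exists if $\int_\Omega f\,dx = \int_\Omega (x -
y)f\,dx = 0$. If $\Omega = (-1, 1) \times (-1, 1)$, then $\int_\Omega f\,dx = 0$ if $f$ is odd in
$x$ or $y$; $\int_\Omega (x - y)f(x)\,dx = 0$ if $f(x, y) = f(y, x)$.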

8.22. (b) From (a), $A : N(A)^\perp \to R(A)$ is bounded. Hence, using the
Banach theorem, $A^{-1} : R(A) \to N(A)^\perp$ is linear, bounded $\Rightarrow$ $\|A^{-1}v\|
\le K\|v\|$ for all $v \in R(A)$, so setting $v = Au$ we have $\|u\| \le K\|Au\|$
for $u \in N(A)^\perp$. If $\{v_n\}$ is a Cauchy sequence in $R(A)$ with limit $v$,
then with $u_n = A^{-1}v_n$ we have $\|u_m - u_n\| \le K\|v_m - v_n\| \to 0$ as
$m, n \to \infty$; so $\{u_m\}$ is a Cauchy sequence in $N(A)^\perp$. $N(A)^\perp$ is closed;
so $u_m \to u$ in $N(A)^\perp$. Since $A$ is continuous, $v_n = Au_n$ $\Rightarrow$ $v = Au$.
Hence $v \in R(A)$ $\Rightarrow$ $R(A)$ is closed.
8.23. Flexible foundation: $N(A) = \{0\}$ and a unique solution exists.
Coulomb friction: $N(A) = \{c_1e_1 + c_2e_2\}$, where $c_1$ and $c_2$ are constants.
A unique solution exists if and only if $f_1 = f_2 = 0$. If friction
is not limiting, then $N(A) = \{0\}$.
Chapter 9
9.1. $V = \{v \in H^2(0, 1) : v(0) = 0,\ v'(0) = 0\}$.
$\int_0^1 [ku''v'' + du'v' + cuv]\,dx = \int_0^1 fv\,dx + \beta v(1) + \alpha v'(1)$.
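9.2. Let angle between $\tau$ and $\nu$ be $\beta$. Boundary term in VBVP is
$\int_\Gamma v\,\nabla u\cdot\nu\,ds$, $v \in H^1(\Omega)$. But $\tau = \nu\cos\beta + s\sin\beta$ ($s$ = tangent
$= (-\nu_2, \nu_1)$), or $\tau = (\nu_1\cos\beta - \nu_2\sin\beta,\ \nu_2\cos\beta + \nu_1\sin\beta)$ $\Rightarrow$
$\nu = (\tau_1\cos\beta + \tau_2\sin\beta,\ -\tau_1\sin\beta + \tau_2\cos\beta)$. Boundary term is thus
$\int_\Gamma v(g\cos\beta - \nabla u\cdot\mu\sin\beta)\,ds$, where $\mu$ is normal to $\tau$.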
9.3. a(w,v) follows by direct substitution of (8.13).
9.4. For continuity of $a$, use the Sobolev Embedding Theorem to obtain
$|v(1)| \le C\|v\|_{H^1} \le C\|v\|_{H^2}$, etc.
9.5. $a(v, v) \ge \mu\int_\Omega \sum_i \dfrac{\partial v}{\partial x_i}\dfrac{\partial v}{\partial x_i}\,dx$ using strong ellipticity. Complete by using
the Poincaré-Friedrichs inequality.
9.6. Use (8.13) to obtain the first part. For the second part return to
Exercise 9.3: the remaining boundary term is $\int_\Gamma M_\nu(w)\,\partial v/\partial\nu\,ds = 0$.
Use the identity $a^2 + 2\nu ab + b^2 \ge (1 - \nu)(a^2 + b^2)$ and (7.18) to show
that $a(v, v) \ge (1 - \nu)\int_\Omega \sum_{|\alpha|=2} |D^\alpha v|^2\,dx$. Here $\nu$ is Poisson's ratio.
Use the Poincaré inequality (7.18) to get $\int_\Omega (\partial v/\partial x)^2\,dx \le c\int_\Omega [(\partial^2v/\partial x^2)^2 +
(\partial^2v/\partial x\,\partial y)^2]\,dx$, etc. Then apply the Poincaré-Friedrichs inequality to
obtain a similar bound on $\int_\Omega v^2\,dx$. This leads to $a(v, v) \ge C(1 -
\nu)\|v\|_{H^2}^2$.

9.7. $a(u, v) = \int_0^1 (pu'v' + ruv)\,dx + p(1)u(1)v(1)$. V-ellipticity: use Theorem
1, Chapter 7, to get $a(v, v) \ge \alpha\|v\|_{H^1}^2$, $\alpha = \min(p_0, r_0)$. Continuity:
$a(u, v) \le \int_0^1 (p_1u'v' + r_1uv)\,dx + p_1u(1)v(1) \le k\int_0^1 (u'v' + uv)\,dx +
p_1\|u\|_\infty\|v\|_\infty$ ($k = \max(p_1, r_1)$) $\le k(u, v)_{H^1} + p_1K\|u\|_{H^1}\|v\|_{H^1}$
(Sobolev Embedding Theorem) $\le (k + p_1K)\|u\|_{H^1}\|v\|_{H^1}$.

9.8. (b) VBVP is: $\int_0^1 u''v''\,dx + [h_1v(1) - g_1v'(1) - h_0v(0) + g_0v'(0)] =
\int_0^1 fv\,dx$, $v \in H^2(0, 1)$; so $P = P_1(0, 1)$. Hence $Q = \{v \in H^2(0, 1) :
\int_0^1 v\,dx = \int_0^1 xv\,dx = 0\}$. Q-ellipticity is tricky, but see Rektorys
[39], Chapter 35. A unique solution exists if and only if $0 = \langle \ell, p\rangle =
\int_0^1 fp\,dx + [g_1p'(1) - h_1p(1) + h_0p(0) - g_0p'(0)]$ for all $p \in P_1(0, 1)$.

9.9. See Exercise 8.8: use Korn's inequality.

9.11. $a(u + p, v + p) = a(u, v)$ for $p = p_re_r + p_\theta e_\theta$, where $p_r, p_\theta \in P_0(\Omega)$.
$Q = \{v \in V : \int_\Omega v_r\,dx = \int_\Omega v_\theta\,dx = 0\}$; a unique solution exists if
$f$ satisfies $\int_\Omega f_r\,dx = \int_\Omega f_\theta\,dx = 0$.
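9.12. $\langle DJ(u), v\rangle = \lim_{\theta\to0} \theta^{-1}[J(u + \theta v) - J(u)]$ by definition. Set $f(\theta) =
J(u + \theta v)$ for any given $u, v$. Then $\langle DJ(u), v\rangle = \lim_{\theta\to0} \theta^{-1}[f(\theta) -
f(0)] = f'(0) = (d/d\theta)J(u + \theta v)|_{\theta=0}$.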
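The definition in 9.12 can be tried out on a finite-dimensional stand-in. The sketch below assumes, for illustration only, the quadratic functional $J(v) = \tfrac12 v^TAv - b^Tv$ with a symmetric matrix $A$, for which $\langle DJ(u), v\rangle = (Au - b)\cdot v$; the data and function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def J(A, b, v):
    """Quadratic functional J(v) = 1/2 v^T A v - b^T v (assumed model problem)."""
    return 0.5 * dot(v, matvec(A, v)) - dot(b, v)

def gateaux_fd(A, b, u, v, theta=1e-6):
    """Difference quotient [J(u + theta v) - J(u)] / theta from the definition."""
    shifted = [ui + theta * vi for ui, vi in zip(u, v)]
    return (J(A, b, shifted) - J(A, b, u)) / theta

def gateaux_exact(A, b, u, v):
    """Closed form (A u - b) . v, valid because A is symmetric."""
    return dot(matvec(A, u), v) - dot(b, v)

A = [[2.0, 1.0], [1.0, 3.0]]     # symmetric sample matrix
b = [1.0, -1.0]
u = [0.5, 2.0]
v = [1.0, -1.0]

approx = gateaux_fd(A, b, u, v)
exact = gateaux_exact(A, b, u, v)
```

The difference quotient differs from the closed form by a term of order $\theta$, which is the behaviour the limit in the definition describes.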
9.13. $\partial J/\partial x_i = \lim_{\theta\to0}[J(x + \theta(0, \ldots, y_i, \ldots, 0)) - J(x)]/(\theta y_i)$ ($y_i$ in $i$th slot).
Multiply by $y_i$ and sum over $i$ to get result.

9.14. $J(\theta u + (1 - \theta)v) = \tfrac12\{\theta^2a(u, u) + (1 - \theta)^2a(v, v) + 2\theta(1 - \theta)a(u, v)\} -
\theta\langle \ell, u\rangle - (1 - \theta)\langle \ell, v\rangle$. $a(u - v, u - v) > 0$ since $a$ is V-elliptic, so
$2a(u, v) < a(u, u) + a(v, v)$. Use this to obtain strict convexity of $J$.
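9.15. $J(\theta u + (1 - \theta)v) = \theta J(u) + (1 - \theta)J(v) - \tfrac12\theta(1 - \theta)a(u - v, u - v)$. The
quantity $a(u - v, u - v)$ in the last term is nonnegative, so $J$ is convex.
To show that $u$ is a minimizer: from convexity, $J(v) - J(u) \ge
\theta^{-1}[J(u + \theta(v - u)) - J(u)] \to \langle DJ(u), v - u\rangle$
when $\theta \to 0$ (see Example 15).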
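The identity at the start of 9.15 is purely algebraic, so it can be checked on a small example. The sketch below again uses the assumed model $a(u, v) = u^TAv$ with symmetric positive-definite $A$ and $\langle \ell, v\rangle = b^Tv$; the numbers are arbitrary sample data.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def a(A, u, v):
    """Bilinear form a(u, v) = u^T A v (assumed model)."""
    return dot(u, matvec(A, v))

def J(A, b, v):
    return 0.5 * a(A, v, v) - dot(b, v)

def convexity_gap(A, b, u, v, theta):
    """J(theta u + (1-theta) v) minus the right-hand side of the identity in 9.15."""
    w = [theta * ui + (1.0 - theta) * vi for ui, vi in zip(u, v)]
    d = [ui - vi for ui, vi in zip(u, v)]
    rhs = (theta * J(A, b, u) + (1.0 - theta) * J(A, b, v)
           - 0.5 * theta * (1.0 - theta) * a(A, d, d))
    return J(A, b, w) - rhs

A = [[2.0, 1.0], [1.0, 3.0]]
b = [1.0, -1.0]
u = [0.5, 2.0]
v = [-1.0, 0.25]

gaps = [convexity_gap(A, b, u, v, t) for t in (0.1, 0.5, 0.9)]
```

Since the subtracted term is nonnegative for $0 \le \theta \le 1$, the identity immediately gives $J(\theta u + (1 - \theta)v) \le \theta J(u) + (1 - \theta)J(v)$.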

9.16. $J(v) = \tfrac12\int_0^1 [k(v'')^2 + d(v')^2 + cv^2]\,dx - \int_0^1 fv\,dx - \beta v(1) - \alpha v'(1)$.


9.17. Since $u$ is a minimizer, $J(u) \le J((1 - \theta)u + \theta v)$ for $0 < \theta < 1$ (since
$V$ must be convex). Expand and rearrange to get $a(u, v - u) - \langle \ell, v -
u\rangle + \tfrac12\theta a(v - u, v - u) \ge 0$. Let $\theta \to 0$.
Chapter 10

10.3. $u_n$ satisfies $a(u_n, v) = \langle \ell, v\rangle$ or $(u_n, \phi_k)_a = \langle \ell, \phi_k\rangle$. Also, $u_n =
\sum_{k=1}^{n}(u_n, \phi_k)_a\phi_k = \sum_{k=1}^{n}\langle \ell, \phi_k\rangle\phi_k$. Now $J(u) = -\tfrac12\|u\|_a^2$ (show this);
but $J(u_n) = \tfrac12\|u_n - u\|_a^2 - \tfrac12\|u\|_a^2$; hence $\|u_n - u\|_a \to 0$. The result
$\|u_n - u\|_H \to 0$ follows from continuity of $a(\cdot, \cdot)$.

10.5. $\|u - u_h\|_a^2 = a(u - u_h, u - u_h) = \|u\|_a^2 - \|u_h\|_a^2 - 2a(u_h, u - u_h)$. The
last term is zero.
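The orthogonality used in 10.5 can be observed in a small discrete analogue. The sketch below assumes $a(u, v) = u^TAv$ on $\mathbb{R}^3$ with a symmetric positive-definite $A$, takes a two-dimensional trial space, and solves the $2 \times 2$ Galerkin system by Cramer's rule; matrix, data and basis vectors are all illustrative choices.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def a(A, u, v):
    """Energy inner product a(u, v) = u^T A v (assumed model)."""
    return dot(u, matvec(A, v))

def galerkin_2d(A, b, p1, p2):
    """Galerkin solution in span{p1, p2}: a(u_h, p_i) = <l, p_i>, i = 1, 2."""
    k11, k12, k22 = a(A, p1, p1), a(A, p1, p2), a(A, p2, p2)
    f1, f2 = dot(b, p1), dot(b, p2)
    det = k11 * k22 - k12 * k12
    c1 = (f1 * k22 - k12 * f2) / det
    c2 = (k11 * f2 - k12 * f1) / det
    return [c1 * x + c2 * y for x, y in zip(p1, p2)]

A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
u = [1.0, 2.0, 3.0]            # chosen exact solution
b = matvec(A, u)               # load consistent with a(u, v) = <l, v>
u_h = galerkin_2d(A, b, [1.0, 0.0, 0.0], [0.0, 1.0, 1.0])

err = [x - y for x, y in zip(u, u_h)]
cross_term = a(A, u_h, err)                         # a(u_h, u - u_h)
energy_gap = a(A, err, err) - (a(A, u, u) - a(A, u_h, u_h))
```

Both `cross_term` and `energy_gap` vanish up to rounding, which is the statement that the error in the energy norm is the difference of the energies of $u$ and $u_h$.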

10.6. $a(u, v) = \langle \ell, v\rangle$ so $a(u, u) = \langle \ell, u\rangle$; hence $J(u) = -\tfrac12a(u, u)$.

10.8. Uh(X) = (V2/2)( -CP1(X) - CP2(X) + CP3(X)) = (V2/2)(x 2 + ~x - 1).


10.9. Replace $v$ by $Av_h$ in Green's formula ($G(u, v) = 0$); (10.37) gives
$(Av_h, f) = (Av_h, Au_h)$ $\Rightarrow$ $(v_h, A^*f) = (v_h, A^*Au_h)$. If $A = -\nabla^2$,
then $A^* = -\nabla^2$.

10.10. (a) $\int_0^1 (-u_h'' + u_h - \sin x)v_h\,dx = 0$, $v_h \in V^h \subset L^2(0, 1)$, $u_h \in U^h \subset
H^2(0, 1) \cap H_0^1(0, 1)$.
(b) Least squares: solve $M^Ta = F$, where $M_{ij} = \int_0^1 (-\phi_i'' + \phi_i)(-\psi_j'' +
\psi_j)\,dx$ and $F_j = \int_0^1 (\sin x)(-\psi_j'' + \psi_j)\,dx$. Collocation: solve $\sum_{k=1}^{N}(-\phi_k''(x_i) +
\phi_k(x_i))a_k = f(x_i)$, $i = 1, \ldots, N$.
(c) Solve $M^Ta = F$, where $M_{ij} = \int_0^1 \phi_i(-\psi_j'' + \psi_j)\,dx$ and $F_j =
\int_0^1 f\psi_j\,dx$.
Chapter 11

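11.1. Must show that a function $v_i$, say, exists such that $\int_\Omega v_i\phi\,dx =
-\int_\Omega v\,\partial\phi/\partial x_i\,dx$. For each $\Omega_e$, $\int_{\Omega_e} (\partial v/\partial x_i)\phi\,dx = \int_{\Gamma_e} v\nu_i\phi\,ds -
\int_{\Omega_e} v\,\partial\phi/\partial x_i\,dx$ since $v|_{\Omega_e} \in H^1(\Omega_e)$.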
11.2. Optimal B = 5.

11.4. $e_{\max}$ exists at point $x$, where $e' = 0$; then $e(x_i) = 0 = e(x) +
\tfrac12e''(z)(x_i - x)^2$ $\Rightarrow$ $|e(x)| = \tfrac12|e''(z)|(x_i - x)^2$. If $i$ is node nearer
to $x$, then $|x_i - x| \le \tfrac12h$. Hence $|e(x)| \le \tfrac18h^2|e''(z)|$. Maximize over
all elements to get result.

11.5. $f''(x) = 2\pi\cos\pi x - \pi^2x\sin\pi x$. $|\text{Max. value}| = |2\pi|$ at $x = 0, 1$. Hence
$\|e\|_\infty \le \tfrac18h^2 \cdot 2\pi = \pi h^2/4$. See whether $\log\|e\|_\infty \approx 2\log h + \text{const}$.
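The suggested check of the slope in a log-log plot can be carried out numerically. The sketch below assumes $f(x) = x\sin\pi x$ on $[0, 1]$, a function whose second derivative matches the expression quoted in 11.5 but which is not stated in the solution itself, and estimates the convergence rate of piecewise-linear interpolation from two uniform meshes. Function names and sample counts are illustrative.

```python
import math

def f(x):
    # Assumed test function; its second derivative is 2*pi*cos(pi x) - pi^2 x sin(pi x).
    return x * math.sin(math.pi * x)

def max_interp_error(n, samples=50):
    """Sampled max-norm error of the linear interpolant on n equal elements of [0, 1]."""
    h = 1.0 / n
    err = 0.0
    for e in range(n):
        x0 = e * h
        f0, f1 = f(x0), f(x0 + h)
        for s in range(samples + 1):
            t = s / samples
            linear = (1.0 - t) * f0 + t * f1      # interpolant on this element
            err = max(err, abs(f(x0 + t * h) - linear))
    return err

def observed_rate(n_coarse, n_fine):
    """Slope of log(error) against log(h) between two meshes."""
    e_c, e_f = max_interp_error(n_coarse), max_interp_error(n_fine)
    return math.log(e_c / e_f) / math.log(n_fine / n_coarse)

rate = observed_rate(16, 32)
```

A value of `rate` close to 2 is what the relation $\log\|e\|_\infty \approx 2\log h + \text{const}$ predicts.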
11.7. Retain $1, \xi, \eta, \xi^2, \xi\eta, \eta^2, \xi^2\eta, \xi\eta^2$. Then, for example, if node 5 is
located at $(\xi, \eta) = (0, -1)$, $N_5(\xi, \eta) = \tfrac12(1 - \xi^2)(1 - \eta)$.

11.10. One needs to solve a system of 21 equations uniquely, for any given
right-hand side. Equivalently, show that any polynomial for which
D"p = 0 for lai::::: 2 at the vertices, and p" = 0 at the midpoints is
identically zero. See Ciarlet [11] (Theorem 2.2.11) for full details.
11.11. $x = \sum_{A=1}^{4} x_AN_A(\xi, \eta)$, where $N_A$ are given by (11.27). Substitute
and use the geometry of the parallelogram to verify that $x = A\xi + b$
for suitable $A$ and $b$.

11.12. j = H2d - (e + TJ)(I- d)] > 0 for all ~ E n if d > ~.


11.15. (a) a = [1 1 1 IjT, b = 2[-1 1 1 - IjT, c = 2[1 1 - 1 - I]T, d =
[1 -11 _1]T. (b) C'ilN)T = [-~b+TJd ~c+~d].

11.16. Show by direct integration over reference element; for example, $\int_{\Omega_e} (a +
b_1x + b_2y)\,dx\,dy = A_e(a + b^Tx)$ where $b = [b_1\ b_2]^T$ and $x =
(1/6)[1\ 1]^T$, for a polynomial of degree one.

Chapter 12

12.1. IIT;;-lll = sup IIT;:-I Y II/lIylI, Y f O. Set z = Pey/llyll; thcn IIT;111 =


supllp;IT;;-lzll· Pick x,y in Oe such that IIx - ylI = Pe: IIT;111 =
p;l sup IIT;;-l(x - b + b - y)11 = p;1 sup IIx - yll = h/ Pe.

12.2. //lv//m,1l = /lv/lm,1l ~ /lv/lk+l,n since m ~ k + 1. /lrrv/lm,n =


/lI:iV(Xi)~i/lmll ~ I:i/iJ(Xi)"/~i/lmll ~ CsUP/iJ(Xi)/ (C is inde-
pendent of iJ). ' ,

12.3. Let the triangle have angles o:,ß,"'( with (Je = 0: ~ ß ~ "'(. Let the
sides opposite 0:, ß, "'( be a, b, c, respectively. Then a ~ b ~ c and
h e = c. The largest cirele inscribed in the tri angle touches all sides.
Draw a sketch and show that h e = (Pe/2) (cot 0:/2 + cotß/2). Now
0: < 7r/2,ß < 7r/2; so cotß/2 ~ coto:/2. Hence he/Pe ~ coto:/2 ~ a
if we prescribe 0: ~ Bo, so that a = cot Bo/2.

12.4. k = 2 O(h~-m), 0 ~ m ~ 3, u E H 3 (n e )
k = 3 O(h!-m), 0 ~ m ~ 4, u E H 4 (n e ).

12.5. /Iv - TIev/l;" = I:;:o Iv - TIevli ~ C2h2(k+l)[aOh~ + a 2h;;2 + ... +


a2mh-2ml/v/2
e k+l <
- C2eh2(k+l-m)[h2m+h2m-2+···+1l/vI2
e e k+l (where
e = max(ao, .. . ,a- 2m )). Given K > 0 we can always find E > 0 such
that the term in square brackets < 1 + K provided h e < E.

12.6. 'OiJ(a) = I:i :;, ai = I:i,j t:, ~ai = I:i,j t:] Tjiai = 'Ov(Ta). Pro-
ceed in the same way for higher derivatives. Then for k = 2, for
example, ID"'iJ(x) I ~ /l'0 2iJ/I = sup 1'0 2 iJ(a, b)/ (/la/l ~ I, /lbll ~ 1)
= sup/'0 2 v(Ta,Tb)1 = SUpl'02V(~~ 1~1)1·/lTII2. Use IITa/l ~
IIT/Iliall ~ IITII·

12.7. Any v E X h also belongs to L 2 (n), so it is required to find Vi E L 2 (n)


such that Jn Vicj; dx = - Jn vß/ßx; dx Vcj; E '0(0.). Use Green's theo-
rem applied to the function ßW/ßXi' where w = vln e ; then sum over
all elements to get In ViCP dx = - In Ußcj;/ßXi dx = I::=1 Jarle Wcj;Vi ds;
the boundary integrals vanish.

12.8. a(w,e) = /lell1,2' where e = u - Uh. Also, a(wh,e) = 0; so a(w-


wh,e) = lIe/l1,2' Hence /le 11 1,2 ~ Kllw-wh//I,n/lel/I,n ~ KCh/Lllwllp,nM3I1ullr,n
for w E HP(n), u E Hr(n), where J-l = min(k,p - 1) and ß =
min(k,r - 1). Since Aw = e, we have w E H 2 (n) and I/w/l2m,n ~
ellellv; so /lel/L2 ~ C1h v llull r ,n.

12.9. $\|u - u_h\|_{L^2} \le C_1h^\nu\|u\|_{r,\Omega}$, where $\nu = \min(2, r)$ for linear or bilinear
elements.

12.10. Using the Hermite basis functions and making appropriate changes
(e.g., replace C(n) by CI(n)), the estimate (12.24) remains valid.
The VBVP is: find u E Hg(O, 1) such that I;
(u"v" + k(x)uv) dx =
101 fv dx for all v E Hg(O,I). We obtain an error estimate from
lIu - uhl/2,n ~ Kllu - uhll2,rl = K (I:e /Iu - uhll~,nY/2 ~ Kh;-1

(Ee lul~+l,oY/2 = Kh k- 1 u lk+1,O, provided that Pk(O) c


I Xc
H 2 (0) and Hk+l(O) C C 1 (0) and u E Hk+l(n).
12.12. This follows as in Theorem 9. In particular k"ilv . v ~ kol"ilvl 2 so
that a~(vh,vh) ~ koE~=lWtl"ilVh(~l)12 = kolvhltoe' since "ilvh E
[P1(n e )J2 and a rule of order three is exact for quadratic functions.
Index

additivity, 88 for beams, 263


adjoint problem, 286 bending stiffness (D), 261
affine family, 413 Bessel's Inequality, 193, 210
affine map, 372, 384, 412 Best Approximation Theorem, 192
in R?, 380 biharmonic equation, 312
affine-equivalent element, 412 biharmonic operator C'V 4 ), 262
affine-equivalent elements, 413 bijective operator, 139
almost everywhere (a.e.), 66 bilinear form, 163
assembly, 380 V-elliptic, 165, 316
Aubin-Nitsche method, 433 continuous, 164, 316
Axiom of Choice, 45 bilinear polynomials (Qk), 384
equivalence with Zorn's Lemma, biological population dynamics, 256,
45 264
biquadratic basis, 385
balance Bolzano--Weierstrass Theorem, 36,
of energy, 2, 3, 9 60
of forces, 9 boundary
of moment um, 9 insulated, 5
Banach space, 115 boundary conditions (BC), 5, 264
Banach Theorem, 151 essential, 309
Banach-Tarski paradox, 65 for elastic plate, 268
basis, 177 homogeneous, 309
basis function, 16, 367 in elasticity, 276
finite element, 367 natural, 309
bending moment (M), 261 nonhomogeneous, 298
464 Index

boundary operators, 273 complex number


boundary value problem (BVP), imaginary part, 29
5,6,13 modulus,29
homogeneous, 6 real part, 29
two-point, 264 complex-valued function, 77
variational, 306 connected set, 40
bounded fUllction, 60 consistency error estimate, 429
constitutive equation, 2, 9
C(a, b) or C[a, b], 56 constitutive law, 259
c(n), 56 continuity, 54
C(ri),56 equivalence of t: - 8 and limit
not an inner product space, definitions, 111
97 in]Rn,56
as a complete space, 114 limit definition, 111
cm(n),57 of a function of several vari-
as a vector space, 84 ables, 56
COO(n),57 continuous dependence on data,
C8"(n),216 14, 287
calculus of variations, 332 for elliptic BVP, 292
Cartesian product, 27 continuous functions, 54
Cauchy sequence, 113 measurability of, 68
Cauchy's equation of motion, 258 on compact sets, 58
Cauchy's law, 258 continuous operator, 143
Cauchy-Schwarz inequality, 91, 96 convergence, 17, 33
Cea's Lemma, 348, 411 in V(n), 217
characteristic function (XE), 68 in LP, 112
choice function, 45 in the mean, 112
closed ball, 118 of sequences, 33
closed neighborhood, 116 of sequences of functions, 108
closed set, 31, 116, 117 pointwise, 108
in ]Rn, 40 rate of, 18
closure of a set, 31 uniform, 108
collocation methods, 354 convergence of interpolates, 350
compact set, 120 convergence of sequences, 106
in]Rn,40 convex function, 327
compactness, 37 convex functional, 328
complete space, 114 convex set, 103
completeness, 113 countable additivity, 65
equivalence with closedness, covering condition
119 of boundary operators, 278
of C[a, b], 129
of finite-dimensional spaces, v(n), 216
183 distribution (V'(n)), 217
completion, 124, 146 data, 256
complex conjugate, 29 dense sets, 121

inLP,122 eigenfunetion, 199


density,3 eigenfunction expansion, 200
of Co in V, 123 eigenfunctions
differential equation (DE), 256 orthogonal, 203
linear, 256 eigenvalue, 199
order of, 256 problem, 198
ordinary (ODE), 256 elastie
partial (PDE), 256 bar: well-posedness ofVBVP,
diffusion, 2 319
equation, 7 beam, 262, 393
steady,2 membrane, 2, 9, 10
diffusion equation, 257 elastie plate, 260
dimension, 177 well-posedness of VBVP, 320
of domain in finite-dimensional elastieity
space, 185 isotropie, 259
Dirae delta, 157, 216, 217 operator (0), 260
direet sum, 86 elliptie, 300
Diriehlet system, 283 strongly elliptie, 300
tensor, 259
of boundary eonditions, 274
eleetrostaties, 2, 7
diseonneeted set, 40
elliptie operator, 270
displaeement, 9, 258
elliptie problem, 10
distanee from a point to aset (d(x, B)),
embedding, 232
103
eontinuous, 232
distribution, 214
energy inner produet, 344
derivative of, 219
energy norm, ~144
generated by a loeally inte- equivalenee dass, 43
grable funetion, 218 equivalenee relation, 42
in H-m, 246
equivalent norms, 97
produet with smooth funetion, and eonvergenee, 107
218 on H'[{'(f!), 244
regular, 217 error, 17
singular, 218 error estimate, 17, 18, 406
distributional derivative, 220 for seeond-order problems, 423
distributional differential equation, interpolation, 351
223, 307 with numerical integration, 430
divergenee theorem, 4 error estimates
domain, 40, 134, 135 for fourth-·order problems, 434
Lipsehitz, 226 loeal interpolation, 416
of dass cm, 226 essential supremum (ess sup), 94
of Sturm-Liouville problem, Euler-Bernoulli hypothesis, 262
202 existenee, 14
with eurved boundary, 427 of solutions, 287, 316
dual spaee, 157 to minimization problem, 332
of LP, 161 to elliptie BVP, 292

extension of an operator, 140 Gram-Schmidt orthonormalization,


181
family of problems, 422 greatest lower bound (inf), 36
finite difference method, 16 Green's formula, 280
finite element mesh, 365, 367 Green's theorem, 219, 242
finite element method, 16
for second-order problems, 364 Hm(n),226
finite elements H-m(n),246
regular family, 420 H-1-methods, 356
finite-dimensional space, 176 half-bandwidth, 406
formal adjoint, 352 harmonie oscillator, 211
operator, 280 heat capacity, 3
formally self-adjoint operator, 280 heat conduction, 2, 5, 15, 16, 265
one-dimensional, 257
Fourier coefficients, 191
steady, 2, 6, 257
Fourier Series Theorem, 194
heat equation, 4, 198, 257
Fourier's law, 4
unsteady,5
fourth-order problems, 392
heat ftux, 3
Frechet derivative, 432
he at source, 3
functions
Heaviside step funetion, 61
bounded continuous, 18
generalized derivative of, 221
even, 153
measurability of, 68
odd, 153
Hermite
positive and negative parts,
differential equation, 211
72
polynomials, 212
with compact support, 121,
basis functions, 394
216
families of elements, 392, 394
functional, 11
Hermitian, 88
Gateaux-differentiable, 328
Hilbert space, 115
Hölder inequality, 101
Galerkin approximations, 364 for sums, 103
convergence, 348 homogeneity, 88
errors in, 346 homogeneous medium, 257
Galerkin method, 340 Hooke's law, 259
properties of approximations,
345 identity operator, 138
Gateaux derivative, 328 image, 59, 135
Gauss quadrat ure image space, 134, 135
in one dimension, 402 inductive limit topology, 216
Gauss's law, 7 infimum (inf), 35, 36
generalized partial derivative, 220 infinity 00, 32
global basis function, 367 initial boundary value problem (IBVP) ,
global interpolation, 422 5,264
error estimate, 423 initial condition, 5
gradient of a functional, 328 initial conditions (les), 264

initial value problem (IVP), 264 LP(n), 62, 67, 75


injective operator, 138 as a vector space, 84
inner product, 87 LP(a, b), 62
defined by abilinear form, 344 C(X, Y), 147
inner product space, 87 Lagrange bases, 373
finite-dimensional, 179 Lame's constants, 259
real,89 Laplace's equation, 6
integrable function, 61, 73 Laplacian, 6
integration by parts, 219 in spherical coordinates, 210
interior point, 30 operator, 137
in $\mathbb{R}^n$, 39 Lax-Milgram Theorem, 166
interpolate, 349 least upper bound, 36
interpolation error, 411 Lebesgue Dominated Convergence
for isoparametric elements, 427 Theorem, 74, 123
interpolation operators Lebesgue integral, 53, 64, 67, 69
$\Pi_h$, 421 of a measurable function, 70
($\hat{\Pi}$ and $\Pi_e$), 415 of a simple function, 70
interval, 29 Lebesgue measure, 54, 65
closed, 29 Legendre polynomials, 207
half-open, 29 and Gauss quadrature, 404
open, 29 Legendre's equation, 202, 207
into, 135 limit of a sequence, 33, 107
inverse image, 135 linear
inverse operator, 138 combination, 176
irrational numbers, 28 dependence, 176
irrationality of $\sqrt{2}$, 48 elasticity, 257
isometric isomorphism, 146, 161 functional, 156
isometry, 145 on finite-dimensional space,
isomorphisms, 142, 186 189
in finite-dimensional spaces, independence, 176
187 interpolate, 377
isoparametric elements operator, 140
triangular, 398 bounded iff continuous, 150
quadrilateral, 400 on finite-dimensional space,
184
Jacobian matrix, 399 ordering, 42
space, 82
Kirchhoff-Love hypothesis, 260 linearity, 88
Korn's inequality, 295, 320, 325 Lipschitz
continuous function, 61
$L^2(\Omega)$ uniform continuity of, 80
as an inner product space, 90 domain, 40
as the completion of $C^\infty(\Omega)$, load vector, 370
231 element, 370
$L^\infty(\Omega)$, 77 local basis functions, 415
on reference element, 372 of an operator, 147


on square reference element, on $L^\infty(\Omega)$, 94
385 on $L^p(\Omega)$, 94
on triangular element, 381 normal boundary conditions, 274
piecewise quadratic, 373 normal derivative, 5
local numbering system, 379 normed space, 18, 92, 95
locally integrable function, 217 norms
equivalent, 97
mapping, 134 on $\mathbb{R}^n$, 93, 103
mass density ($\rho$), 258 null space, 135
matrix numerical integration, 402
representing linear operator, on square, 404
187 on triangle, 404
maximal element, 44 order, 402
maximum, 35
measurable function, 67 one-to-one, 186
measurable set, 65 one-to-one operator, 138
measurable space, 65 onto, 135
measure, 61 open ball, 117
mesh parameter, 422 open mapping, 151
method of least squares, 354 Open Mapping Theorem, 151
method of weighted residuals, 353 open neighborhood, 116
metric, 98 open set, 30, 116, 117
generated by a norm, 99 in $\mathbb{R}^n$, 39
metric space, 99 operator, 134
minimization of functionals, 326 bijective, 139
minimization problem, 11 bounded, 146
equivalence with VBVP, 330 continuous, 143
minimizing sequence, 357 differential, 136
minimum, 35 identity, 138
Minkowski inequality injective, 138
for integrals, 84, 101 inverse, 138
for sums, 103 linear, 140
multi-index notation, 214 matrix, 136
one-to-one, 138
Navier's equations, 260 projection, 152
necessary condition, 46 symmetric, 203
neighborhood, 30, 116 uniformly continuous, 143
in $\mathbb{R}^n$, 38, 39 operators
nodal points, 365 composition of, 137
nonhomogeneous, 3 equal, 137
norm, 18, 92 product of, 137
generated by an inner product, 95 sum of, 137
ordered n-tuples ($\mathbb{R}^n$), 38
matrix, 148 ordered pairs ($\mathbb{R}^2$), 37
ordered triples ($\mathbb{R}^3$), 38 Rayleigh-Ritz method, 345


orthogonal complement, 124 rectangular elements, 383
of $H_0^1(\Omega)$ in $H^1(\Omega)$, 251 reductio ad absurdum, 47
orthogonal projection, 154 reference element, 371, 384, 412
on Hilbert spaces, 155 triangular, 380
orthogonality, 91 regular family
orthonormal basis, 181, 190 of isoparametric elements, 426
eigenfunctions of Sturm-Liouville regularity of solutions, 287, 325
operator, 204 relation, 41
in Hilbert space, 196 antisymmetric, 42
orthonormal set, 180 reflexive, 42
maximal, 190 symmetric, 42
transitive, 42
parallelogram law, 96, 102 restriction of an operator, 140
Parseval's Formula, 193, 210 Riemann integral, 63
partial ordering, 42 Riesz map, 161
partial sum, 191 Riesz Representation Theorem, 159
partition, 43 Riesz's Theorem, 162
Pascal triangle, 383 rigid body displacement, 296, 324
Petrov-Galerkin method, 355 Ritz-Galerkin method, 16
piecewise linear function, 371
Poincaré-Friedrichs Inequality, 244
Poincaré Inequality, 233 sampling points, 402
point of accumulation, 31, 117 Schrödinger operator, 211
in $\mathbb{R}^n$, 39 seminorm, 245
pointwise stable, 300 Hilbert space as, 197
Poisson equation, 6, 257 separation of variables, 197
Poisson's ratio ($\nu$), 261 separation of variables, 197
positive homogeneity, 92 bounded, 50
potential energy, 327 convergence of, 106
principal part, 270 convergent, 33
projection, 152 finite, 32
orthogonal, 154 of numbers, 32
Projection Theorem, 127, 155, 194 in normed spaces, 106
proof by contradiction, 47 infinite, 32
monotone, 50
quality of approximation, 17 serendipity element, 407
quintic polynomial, 397 set, 23
complement of, 26
$\mathbb{R}^n$, 37 countable, 26, 66
as a complete space, 114 elements of, 23
as a vector space, 83 empty ($\emptyset$), 24
ramp function, 62 finite, 24
generalized derivative of, 221 infinite, 24
range, 59, 135 linearly ordered, 42
null ($\emptyset$), 24 Strang's Lemma, 429


of complex numbers ($\mathbb{C}$), 29 stress, 258
of integers ($\mathbb{Z}$), 24 strictly convex function, 327
of measure zero, 66 strong convergence, 162
of natural numbers ($\mathbb{N}$), 28 strongly elliptic operator, 270
of nonnegative integers ($\mathbb{Z}_+$), Sturm-Liouville operator
24 positive, 204
of rational numbers ($\mathbb{Q}$), 28 symmetry of, 204
of real numbers ($\mathbb{R}$), 28 Sturm-Liouville problem, 201
partially ordered, 42 regular, 201
universal, 25 singular, 202
sets subset, 24
difference of, 25 proper, 24
equal, 25 subspace, 84
intersection of, 25 sufficient condition, 46
of functions, 53 sum of subspaces, 85
of numbers, 28 supremum (sup), 35, 36
union of, 25 surjective, 135, 186
shear force (S), 261
for beams, 263 temperature, 3
simple function, 64, 69 test functions, 216
Sobolev Embedding Theorem, 232 thermal conductivity, 4
Sobolev inner product $(u, v)_{H^m}$, thermal diffusivity, 6
227 trace, 236
Sobolev space of a matrix, 259
$H^m(\Omega)$, 226 trace operator
as completion of $C^m(\Omega)$, 233 $\gamma$, 236
as completion of $C^\infty(\Omega)$, $\gamma_0$, 241
229 as continuous map from $H^1(\Omega)$
$H_0^m(\Omega)$, 243 into $L^2(\Gamma)$, 240
$W^{m,p}(\Omega)$, 235 Trace Theorem, 240
alternative definition, 233 traces
as a Hilbert space, 229 in the sense of, 242
solution transformation, 134
distributional, 225 triangle inequality, 93
generalized, 225 triangular elements, 379, 381
weak, 225
space of admissible functions, 310 underintegration, 408
span, 177 uniform continuity, 55, 56
square-integrable function, 75 uniqueness, 14
steady-state, 2 of solution, 287, 316
stiffness matrix, 370 to elliptic BVP, 291
element, 370 to minimization problem,
strain, 259 332
strain energy, 326 unit ball, 104
upper bound
of a partially ordered set, 44

variational boundary value problem (VBVP), 13, 306
continuous dependence on data, 316
equivalence to classical problem, 307
existence of solution, 316
formulation, 309
uniqueness of solution, 316
variational inequality, 334
variational problem, 10
vector space, 82

$W^{m,p}(\Omega)$
as a Banach space, 235
as completion of $C^\infty(\Omega)$, 235
as completion of $C^m(\Omega)$, 235
continuous embedding in $C^k(\Omega)$, 235
$W^{s,p}(\Omega)$ for real s, 248
weak convergence, 162
in finite-dimensional spaces, 184
weak derivative, 222
weak* convergence, 162
Weierstrass Theorem, 121, 124
weighting function, 202
weights, 402
well-posedness, 14

Young's modulus (E), 261

$\mathbb{Z}_+^n$, 214
zero operator, 138
Zlámal's condition, 431
Zorn's Lemma, 45, 197
equivalence with Axiom of Choice, 45
