Pablo Pedregal - Functional Analysis, Sobolev Spaces, and Calculus of Variations (UNITEXT, 157) - Springer (2024)


UNITEXT 157

Pablo Pedregal

Functional
Analysis,
Sobolev Spaces,
and Calculus
of Variations
UNITEXT

La Matematica per il 3+2

Volume 157

Editor-in-Chief
Alfio Quarteroni, Politecnico di Milano, Milan, Italy
École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland

Series Editors
Luigi Ambrosio, Scuola Normale Superiore, Pisa, Italy
Paolo Biscari, Politecnico di Milano, Milan, Italy
Ciro Ciliberto, Università di Roma “Tor Vergata”, Rome, Italy
Camillo De Lellis, Institute for Advanced Study, Princeton, NJ, USA
Victor Panaretos, Institute of Mathematics, École Polytechnique Fédérale de
Lausanne (EPFL), Lausanne, Switzerland
Lorenzo Rosasco, DIBRIS, Università degli Studi di Genova, Genova, Italy
Center for Brains Mind and Machines, Massachusetts Institute of Technology,
Cambridge, Massachusetts, US
Istituto Italiano di Tecnologia, Genova, Italy
The UNITEXT - La Matematica per il 3+2 series is designed for undergraduate
and graduate academic courses, and also includes books addressed to PhD students
in mathematics, presented at a sufficiently general and advanced level so that the
student or scholar interested in a more specific theme would get the necessary
background to explore it.
Originally released in Italian, the series now publishes textbooks in English
addressed to students in mathematics worldwide.
Some of the most successful books in the series have evolved through several
editions, adapting to the evolution of teaching curricula.
Submissions must include at least 3 sample chapters, a table of contents, and
a preface outlining the aims and scope of the book, how the book fits in with the
current literature, and which courses the book is suitable for.
For any further information, please contact the Editor at Springer:
[email protected]
THE SERIES IS INDEXED IN SCOPUS
***
UNITEXT is glad to announce a new series of free webinars and interviews
handled by the Board members, who rotate in order to interview top experts in their
field.
Access this link to subscribe to the events:
https://cassyni.com/s/springer-unitext
Pablo Pedregal
Department of Mathematics
University of Castilla-La Mancha
Ciudad Real, Spain

ISSN 2038-5714 ISSN 2532-3318 (electronic)


UNITEXT
ISSN 2038-5722 ISSN 2038-5757 (electronic)
La Matematica per il 3+2
ISBN 978-3-031-49245-7 ISBN 978-3-031-49246-4 (eBook)
https://doi.org/10.1007/978-3-031-49246-4

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland
AG 2024
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Paper in this product is recyclable.


To María José, for being always there
Preface

This textbook was born with the objective of providing a first, basic reference
for three important areas of Applied Analysis that are difficult to separate because
of their many interconnections; yet each has its own personality and has grown
to become an independent area of very lively research. Those three areas occur
in the title of the book. There is, at least, a fourth one, Partial Differential
Equations (PDEs), which is also quite close to the other three but has such a
strong temperament, and attracts so much attention, that there are many popular and
fundamental references covering this field in connection with the others. Though
there is no reasonable way to avoid talking about PDEs in this text, our intention is to
focus on variational methods and techniques, since there are many more references
covering PDEs. In particular, the popular textbook [Br11] has been an inspiration for
us.
We should also warn readers from the very beginning that our variational
techniques are oriented more towards the direct method than towards other
techniques that could be placed under the umbrella of indirect methods, meaning
by this all the fundamental results related to optimality. We will of course also cover
the Euler-Lagrange equation, but we will not elaborate on it in any special manner.
As a matter of fact, some of these ideas are very hard to extend to the higher-
dimensional situation. On the other hand, the material covered on Sobolev spaces is
also oriented and motivated by the needs of variational problems, and hence some
basic facts about these important spaces of functions are not proved or treated here,
though they can be shown very easily once readers have mastered the basics.
As such, this book can serve several purposes according to where the emphasis is
placed. Ideally, and this is how it was conceived, it is to be used as a first course
in the Calculus of Variations without assuming that students have been previously
exposed to Functional Analysis or Sobolev spaces. It is therefore developed with no
prerequisites on those basic areas of Analysis. But it can also serve for a first contact
with Functional Analysis and one of its main applications to variational problems,
or for a first approach to weak derivatives and Sobolev spaces and their main
applications to variational problems and PDEs. More specifically:


• A first course in Calculus of Variations assuming basic knowledge of Functional
Analysis and Sobolev spaces: Chaps. 3, 4, 8, 9.
• A basic course in Functional Analysis: Chaps. 1, 2, 3, 5, 6.
• An introductory course on Sobolev spaces: Chaps. 1, 2, 3, 7.
Pretending that there are no other areas of Applied Analysis involved in such an
endeavor would be narrow-minded. To name just one important example, there are
also abundant references to the classic subject of Convex Analysis, as treated in
classic sources like [EkTe99] or [Ro97]. Even within the field of the Calculus of
Variations, we do not aim to include everything there is to learn. There is, for instance,
a lot of material related to the indirect method that we hardly treat. All of this is
very well represented in the encyclopedic work [GiHil96]. The final Appendix is an
attempt both to enlarge the perspective of students after covering this material, and
to describe, in a non-exhaustive way, the many other areas of Analysis directly or
indirectly related to variational problems.
The book is addressed to students and newcomers to this broad area of
optimization in the continuous, infinite-dimensional case. Depending on needs
and allotted time, the first part of the book may already serve as a nice way to
taste the three main subjects: Functional Analysis, Sobolev spaces, and the Calculus
of Variations, albeit in a one-dimensional situation. This part includes the basic notions
of Banach spaces, Lebesgue spaces and their primary properties; one-dimensional
weak derivatives and Sobolev spaces, the dual space, weak compactness, Hilbert
spaces and their basic fundamental properties; the Lax-Milgram lemma, the Hahn-
Banach theorem in its various forms, and a short introduction to Convex Analysis
and the fundamental concept of convexity. The final chapter of this first part
deals with one-dimensional variational problems and insists on convexity as the
fundamental, unavoidable structural assumption. The second part focuses on some
of the paradigmatic results in Functional Analysis related to operators, understood
as mappings between infinite-dimensional spaces. Classic results like the Banach-
Steinhaus principle, the open mapping and closed graph theorems, and the standard
concepts of linear continuous operators, in parallel with the usual concepts of
Linear Algebra, are examined. The important class of compact operators is dealt
with in the second chapter of this part. Finally, the third part focuses on higher-
dimensional problems: we start by introducing multidimensional Sobolev spaces and
their basic properties in order to use them in scalar, multidimensional variational
problems; a presentation of the main facts for this class of problems follows; and a
final chapter aims to cover more specialized material where students are exposed
to the subtleties of the higher-dimensional case. At any rate, each chapter is like a
first introduction to the particular field being treated. Even chapters devoted to
variational techniques are far from complete in any sense, so readers
interested in more information or in deepening their knowledge should look for
additional resources.
The scope of the subject of this book is so broad that the relevant literature is truly
overwhelming and unaffordable in finite time. Since this is a source addressed to
students with no previous background in these areas, references are restricted to
textbooks, avoiding more specialized bibliography.
The style of the book is definitely pedagogical, in that a special effort is made to
convey to students the reasons why things are done the way they are. We try to
justify each step in search of new concepts on the grounds that it is necessary to
advance in the understanding of new challenges. Proofs of central results are,
however, a bit technical at times, but that is also a main part of the training
sought by the text. There is a good collection of exercises, aimed at helping
to mature the ideas, techniques, and concepts. It is not particularly abundant, to
avoid the frustration of not getting through them all, but it is sufficient for that purpose.
Some exercises are taken from certain steps in proofs. Thinking of a one-semester
course, there is not enough time to go through many more exercises; additional ones
can be found in the bibliography provided.
Prerequisites include, in addition to Linear Algebra, Multivariate Calculus, and
Differential Equations, a good background in more advanced areas such as Measure
Theory or Topology. A partial list of fundamental facts from these areas that are used
explicitly in some proofs is:
1. The multivariate integration-by-parts formula.
2. Measure Theory: dominated convergence theorem, Fatou's lemma, approxima-
tion of measurable sets by open and compact sets.
3. Lebesgue measure in $\mathbb{R}^N$.
4. Fubini's theorem.
5. Arzelà-Ascoli theorem.
6. Radon-Nikodym theorem.
7. Tychonoff's theorem on compact sets in a product space.
8. Urysohn's lemma and the Tietze extension theorem.
9. Egorov's and Luzin's theorems.
A final remark is worth stating. Since our emphasis in this book is on variational
methods and techniques, whenever possible we have given proofs with a
variational flavor, and hence such proofs are sometimes not the standard ones
found in other sources.
I would like to thank the members of the Editorial Board of the series UNITEXT
for their support; my gratitude goes especially to Francesca Bonadei and Francesca
Ferrari for their constant assistance.

Ciudad Real, Spain Pablo Pedregal


November 2023
Contents

1 Motivation and Perspective . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1


1.1 Some Finite-Dimensional Examples. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Basic Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 More Advanced Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.4 The Model Problem, and Some Variants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5 The Fundamental Issues for a Variational Problem . . . . . . . . . . . . . . . . . . 16
1.6 Additional Reasons to Care About Classes of Functions . . . . . . . . . . . . 20
1.7 Finite Versus Infinite Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.8 Brief Historical Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

Part I Basic Functional Analysis and Calculus of Variations


2 A First Exposure to Functional Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.2 Metric, Normed and Banach Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3 Completion of Normed Spaces. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.4 $L^p$-Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.5 Weak Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.6 One-Dimensional Sobolev Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.7 The Dual Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.8 Compactness and Weak Topologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.9 Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
2.10 Completion of Spaces of Smooth Functions with Respect
to Integral Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
2.11 Hilbert Spaces. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
2.12 Some Other Important Spaces of Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 84
2.13 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3 Introduction to Convex Analysis: The Hahn-Banach and
Lax-Milgram Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93


3.2 The Lax-Milgram Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94


3.3 The Hahn-Banach Theorem: Analytic Form . . . . . . . . . . . . . . . . . . . . . . . . . 98
3.4 The Hahn-Banach Theorem: Geometric Form . . . . . . . . . . . . . . . . . . . . . . . 102
3.5 Some Applications. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
3.6 Convex Functionals, and the Direct Method . . . . . . . . . . . . . . . . . . . . . . . . . 108
3.7 Convex Functionals, and the Indirect Method. . . . . . . . . . . . . . . . . . . . . . . . 112
3.8 Stampacchia’s Theorem: Variational Inequalities . . . . . . . . . . . . . . . . . . . . 114
3.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4 The Calculus of Variations for One-dimensional Problems . . . . . . . . . . . . . 121
4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
4.2 Convexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
4.3 Weak Lower Semicontinuity for Integral Functionals . . . . . . . . . . . . . . . 125
4.4 An Existence Result . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
4.5 Some Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
4.6 Optimality Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
4.7 Some Explicit Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
4.8 Non-existence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
4.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157

Part II Basic Operator Theory


5 Continuous Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
5.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
5.2 The Banach-Steinhaus Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
5.3 The Open Mapping and Closed Graph Theorems. . . . . . . . . . . . . . . . . . . . 172
5.4 Adjoint Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
5.5 Spectral Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
5.6 Self-Adjoint Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
5.7 The Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
5.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
6 Compact Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
6.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
6.2 The Fredholm Alternative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
6.3 Spectral Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
6.4 Spectral Decomposition of Compact, Self-Adjoint Operators. . . . . . . 203
6.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207

Part III Multidimensional Sobolev Spaces and Scalar Variational Problems
7 Multidimensional Sobolev Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
7.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
7.2 Weak Derivatives and Sobolev Spaces. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
7.3 Completion of Spaces of Smooth Functions of Several
Variables with Respect to Integral Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221

7.4 Some Important Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221


7.5 Domains for Sobolev Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
7.6 Traces of Sobolev Functions: The Space $W_0^{1,p}(\Omega)$ . . . . . . . . . . . . 231
7.7 Poincaré’s Inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
7.8 Weak and Strong Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
7.9 Higher-Order Sobolev Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
7.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
8 Scalar, Multidimensional Variational Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 245
8.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
8.2 Abstract, Quadratic Variational Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
8.3 Scalar, Multidimensional Variational Problems . . . . . . . . . . . . . . . . . . . . . . 249
8.4 A Main Existence Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
8.5 Optimality Conditions: Weak Solutions for PDEs . . . . . . . . . . . . . . . . . . . 254
8.6 Variational Problems in Action . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
8.7 Some Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
8.8 Higher-Order Variational Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
8.9 Non-existence and Relaxation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
8.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
9 Finer Results in Sobolev Spaces and the Calculus of Variations. . . . . . . . 281
9.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
9.2 Variational Problems Under Integral Constraints . . . . . . . . . . . . . . . . . . . . 285
9.3 Sobolev Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
9.4 Regularity of Domains, Extension, and Density . . . . . . . . . . . . . . . . . . . . . 298
9.5 An Existence Theorem Under More General Coercivity
Conditions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
9.6 Critical Point Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
9.7 Regularity. Strong Solutions for PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
9.8 Eigenvalues and Eigenfunctions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
9.9 Duality for Sobolev Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
9.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321
A Hints and Solutions to Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
A.1 Chapter 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
A.2 Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
A.3 Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
A.4 Chapter 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
A.5 Chapter 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
A.6 Chapter 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
A.7 Chapter 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
A.8 Chapter 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 360
A.9 Chapter 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368

B So Much to Learn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373


B.1 Variational Methods and Related Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
B.2 Partial Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
B.3 Sobolev Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
B.4 Functional Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
Chapter 1
Motivation and Perspective

1.1 Some Finite-Dimensional Examples

Most likely, our readers will already have some experience with optimization
problems of some kind, either from Advanced Calculus courses or even from
some exposure to Mathematical Programming. The following are some typical
examples.
1. Minimizing the distance to a set. Let

   $$\rho(x, y) = \sqrt{(x - x_0)^2 + (y - y_0)^2}$$

   be the distance function to a given point

   $$P = (x_0, y_0) \in \mathbb{R}^2.$$

   Suppose we are given a set $\Omega \subset \mathbb{R}^2$ with $(x_0, y_0) \notin \Omega$, and would like to find the
   closest point to $P$ in $\Omega$. With some experimentation for different types of sets $\Omega$,
   can our readers conclude under what conditions on $\Omega$ one can ensure that there
   is one and only one point in $\Omega$ which is closest to $P$? Does the situation change
   in three-dimensional space $\mathbb{R}^3$, or in $\mathbb{R}^N$, no matter how big $N$ is?
2. Assume we have two moving points in the plane. The first one moves along the
   X-axis from right to left, starting at $x_0 > 0$ with velocity $-u_0$, $u_0 > 0$; while
   the second moves vertically from bottom to top, starting at $y_0 < 0$ with speed
   $v_0 > 0$. When will those objects be closest to each other? What will the
   distance between them be at that moment?


3. A thief, after committing a robbery at a jewelry store, hides in a park which is the
   convex hull of the four points

   $$(0, 0), \ (-1, 1), \ (1, 3), \ (2, 1).$$


   The police, looking for him, get into the park and organize the search
   according to the function

   $$\rho(x_1, x_2) = x_1 - 3x_2 + 10$$

   indicating the density of surveillance. Recommend to the thief the best point through
   which he can escape, or the best point where he can stay hidden.
4. Divide a positive number $a$ into $n$ parts in such a way that the sum of the
   corresponding squares is minimal. Use this fact to prove the inequality

   $$\left( \sum_{i=1}^{n} \frac{x_i}{n} \right)^2 \le \sum_{i=1}^{n} \frac{x_i^2}{n}$$

   for an arbitrary vector

   $$x = (x_1, \dots, x_n) \in \mathbb{R}^n.$$

5. Prove Hölder's inequality

   $$\sum_{k=1}^{n} x_k y_k \le \left( \sum_{k=1}^{n} x_k^p \right)^{1/p} \left( \sum_{k=1}^{n} y_k^q \right)^{1/q}, \qquad \frac{1}{p} + \frac{1}{q} = 1,$$

   for $p > 1$, $x_k, y_k > 0$, by maximizing the function

   $$\sum_{k=1}^{n} x_k y_k$$

   in the $x_k$'s for fixed, given $y_k$'s, under the constraint

   $$\sum_{k=1}^{n} x_k^p = b,$$

   for a fixed, positive number $b$.


6. The Cobb-Douglas utility function $u$ is of the form

   $$u(x_1, x_2, x_3) = x_1^{\alpha_1} x_2^{\alpha_2} x_3^{\alpha_3}, \qquad 0 \le \alpha_i, \quad \alpha_1 + \alpha_2 + \alpha_3 = 1, \quad x_i \ge 0.$$

   If a consumer has resources given by $\bar{x}_i$ for commodity $X_i$, and their respective
   prices are $p_i$, formulate and solve the problem of maximizing satisfaction
   measured through such a utility function.
7. Find the best parabola through the four points $(-1, 4)$, $(0, 1)$, $(1, 1)$ and $(1, -1)$
   by minimizing the least-squares error.
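Inequalities like the one in problem 5 are easy to probe numerically before attempting a proof. The following sketch in Python (our own illustration; the helper name `holder_lhs_rhs` is not from the text) evaluates both sides of Hölder's inequality for a concrete pair of positive vectors:

```python
def holder_lhs_rhs(x, y, p):
    # Hölder's inequality: sum(x_k * y_k) <= (sum x_k^p)^(1/p) * (sum y_k^q)^(1/q),
    # where q is the conjugate exponent determined by 1/p + 1/q = 1.
    q = p / (p - 1.0)
    lhs = sum(a * b for a, b in zip(x, y))
    rhs = sum(a ** p for a in x) ** (1.0 / p) * sum(b ** q for b in y) ** (1.0 / q)
    return lhs, rhs

# For p = 3 (so q = 3/2) and positive vectors x, y:
lhs, rhs = holder_lhs_rhs([1.0, 2.0, 3.0], [0.5, 0.25, 4.0], p=3.0)
# lhs = 13.0, and lhs <= rhs as the inequality predicts
```

The constrained maximization suggested in problem 5 reveals, moreover, that equality holds exactly when the $x_k^p$ are proportional to the $y_k^q$.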

Many more examples could be stated. Readers should already know how to solve
these problems.
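Problem 7, for instance, reduces to a small linear system: with the ansatz $y = c_0 + c_1 x + c_2 x^2$, minimizing the sum of squared errors leads to the $3 \times 3$ normal equations $(A^{\mathsf T} A)c = A^{\mathsf T} y$. A minimal Python sketch (the helper `fit_parabola` is ours, not from the text) solves them by Gaussian elimination:

```python
def fit_parabola(pts):
    # Least-squares fit of y = c0 + c1*x + c2*x^2: build the 3x3 normal
    # equations (A^T A) c = A^T y and solve them by Gaussian elimination.
    M = [[0.0] * 3 for _ in range(3)]
    r = [0.0] * 3
    for x, y in pts:
        row = [1.0, x, x * x]
        for i in range(3):
            r[i] += row[i] * y
            for j in range(3):
                M[i][j] += row[i] * row[j]
    for k in range(3):                      # forward elimination with pivoting
        p = max(range(k, 3), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        r[k], r[p] = r[p], r[k]
        for i in range(k + 1, 3):
            f = M[i][k] / M[k][k]
            r[i] -= f * r[k]
            for j in range(k, 3):
                M[i][j] -= f * M[k][j]
    c = [0.0] * 3
    for i in (2, 1, 0):                     # back substitution
        c[i] = (r[i] - sum(M[i][j] * c[j] for j in range(i + 1, 3))) / M[i][i]
    return c

c = fit_parabola([(-1, 4), (0, 1), (1, 1), (1, -1)])  # c ≈ [1, -2, 1]
```

For the four data points of problem 7 the solution is $c = (1, -2, 1)$, i.e. the parabola $y = (x-1)^2$, which interpolates the first two points and splits the disagreement between the two data values at $x = 1$ evenly.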
After examining with some care all of these problems, we can draw some
conclusions.
• Important ingredients of every optimization problem are the cost function, the
function to be minimized or maximized, and the constraints that feasible vectors
need to comply with to be even considered. The collection of these vectors is the
feasible set of the problem.
• It is important, before starting to look for optimal solutions, to be sure that those
are somewhere within the feasible set of the problem. This is the issue of the
existence of optimal solutions. Sometimes optimization problems may not be
well-posed, and may lack optimal solutions.
• In all of these examples, we deal with functions and vectors, no matter how
many components they may have. The set of important techniques to treat this
kind of optimization problem lies in the area universally known as Mathematical
Programming and Operations Research.
The class of optimization problems we would like to consider in this text shares
some fundamental ingredients with those above: objective or cost functions, and
constraints. But it differs fundamentally in the last feature: we would like to
explore optimization problems for infinite-dimensional vectors. Vectors are no
longer vectors in the usual sense: they become functions; objective functions
become functionals; and Mathematical Programming, as the discipline dealing with
finite-dimensional optimization problems, becomes the Calculus of Variations, one of
the main areas of Analysis in charge of infinite-dimensional optimization problems.
The independent variables in problems, instead of being vectors, are, as already pointed
out, functions. This passage from finite to infinite dimension is so big a change that
it is not surprising that methods and techniques are so different from those at the
heart of Mathematical Programming.
Let us start looking at some examples.

1.2 Basic Examples

Almost every formula to calculate some geometric or physical quantity through an
integral can furnish an interesting problem in the Calculus of Variations. Our readers
will know about the following typical cases:
1. The area enclosed by the graph of a non-negative function .u(x) between two
values a and b:
 b
A=
. u(x) dx.
a
4 1 Motivation and Perspective

2. The length of the graph of the function .u(x) between two values a and b:
 b √
L=
. 1 + u' (x)2 dx.
a

3. The volume of the solid of revolution generated by the graph of the function .u(x)
around the X-axis between the two values a and b:
 b
V =π
. u(x)2 dx.
a

4. The area of the surface of revolution of the same piece of graph around the X-axis:

   S = 2\pi \int_a^b u(x) \sqrt{1 + u'(x)^2}\,dx.

5. The area of a piece of the graph of a function u(x, y) of two variables over a subset Ω ⊂ R^2 where the variables (x, y) move, given by the integral

   S = \int_\Omega \sqrt{1 + |\nabla u(x, y)|^2}\,dx\,dy.   (1.1)

Some examples coming from physical quantities follow.


1. The work done by a conservative force field F(x), where

   F = (F_1(x), F_2(x), F_3(x)),   x = (x_1, x_2, x_3),

in going from a point P to another one Q is independent of the path followed, and corresponds to the difference of potential energy. But if the field is non-conservative, the work depends on the path, and is given by

   W = \int_a^b F(x(t)) \cdot x'(t)\,dt,   P = x(a),   Q = x(b).

2. The flux of a field F(x) through a surface S is given by

   F = \int_S F(x) \cdot n(x)\,dS(x),

where n(x) is the normal to S, and dS represents the element of surface area.
3. Moments of inertia. If a body of density ρ(x) occupies a certain region Ω in space, then, given a certain axis r, the moment of inertia with respect to r is given by the integral

   \int_\Omega r(x)^2 \rho(x)\,dx,

where r(x) is the distance from x to the axis r.


Let us focus on one of these examples, say, the area enclosed by a portion of the graph of the function u(x), x ∈ (a, b), i.e.

   A = \int_a^b u(x)\,dx.   (1.2)

There are essentially two ingredients in this formula: the two integration limits, and the function u(x). We can therefore regard the quantity A as a function of the upper limit b, keeping the other ingredients fixed:

   A(b) = \int_a^b u(x)\,dx.

This is actually the viewpoint adopted when dealing with the Fundamental Theorem
of Calculus to conclude that if .u(x) is continuous, then the “function” .A(b) is
differentiable and .A' (b) = u(b). We can then think about the problem of finding
the extreme values of .A(b) when b runs freely in .R, or in some preassigned interval
.[α, β].

But we can also think of A in (1.2) as a “function” of .u(x), keeping the two
end-points a and b fixed, and compare the corresponding integrals for different
choices of functions .u(x). If we clearly determine a collection .A of such competing
functions .u(x), we could ask ourselves which one of those functions realizes the
minimum or the maximum of A. It is important to select the class A in a sensible way, for otherwise the optimization may be pointless. Suppose we accept every function u(x) one could think of, and try to figure out the one providing the maximum of the integral. It is pretty clear that there is no such function because the supremum is infinite: we can always find a function u(x) for which the integral A is bigger than any preassigned number, no matter how big. This problem is useless. Assume, instead, that we only admit, to compete for the minimum, those functions u(x) defined on the interval [a, b] such that u(a) = u(b) = 0, and whose graph has length at most a positive number L > b − a.
a bit of experimentation, one realizes that the problem looks non-trivial and quite
interesting, and that the function realizing this minimum, if there is one such
function, must enjoy some interesting geometric property related to area and length
of the graph.
We hope this short discussion may have helped in understanding the kind of
situations that we would like to address in this text. We are interested in considering
sets of functions; a way to assign a number to each one of those, a functional; decide
6 1 Motivation and Perspective

if there is one such function which realizes the minimum or maximum possible; and
if there is one, find it, or derive interesting properties of it. Optimization problems of this sort are identified as variational problems, for reasons that will be understood later.
Almost as important as studying variational problems is proposing interesting, meaningful problems or cases. As we have seen in the above discussion, it is not always easy to propose a relevant variational problem. We are going to see next a number of examples that have had a historical impact on Science and Engineering.

1.3 More Advanced Examples

The previous section has served to make us understand the kind of optimization
problems we would like to examine in this text. We will describe next some of the
paradigmatic examples that have played a special role historically, or that are among
the distinguished set of examples that are used in most of the existing textbooks
about the Calculus of Variations.
1. Transit problems.
2. Geodesics.
3. Dirichlet’s principle.
4. Minimal surfaces.
5. Isoperimetric problems.
6. Hamiltonian mechanics.
We will see through these examples how important it is to be able to formulate a problem in precise mathematical terms that enable us to compare different alternatives, and decide which one is better; to argue whether there are optimal solutions, and, eventually, find them. It is an initial, preliminary step that requires a good deal of practice, and with which students usually have difficulties.

1.3.1 Transit Problems

The very particular example that is universally accepted as marking the birth of the
Calculus of Variations as a discipline on its own is the brachistochrone. We will talk
about its relevance in the development of this field later in the final section of this
chapter.
Given two points in a plane at different heights, find the profile of the curve along which a unit mass, under the action of gravity and without friction, takes the shortest time to go from the highest point to the lowest. Let u(x) be one such

feasible profile so that

   u(0) = 0,   u(a) = A,

with .a, A > 0, and consider curves joining the two points .(0, 0), .(a, A) in the
plane. It is easy to realize that we can restrict attention to graphs of functions like
the one represented by .u(x) because curves joining the two points which are not
graphs cannot provide a minimum transit time under the given conditions. We need
to express, for such a function .u(x), the time spent by the unit mass in going from
the highest point to the lowest. From elementary kinematics, we know that
 
   dt = \frac{ds}{\sqrt{2gh}}, \qquad T = \int dt = \int \frac{ds}{\sqrt{2gh}},

where t is time, s is arc-length (distance), g is gravity, and h is height. We also know that

   ds = \sqrt{1 + u'(x)^2}\,dx,

and can identify h with u(x). Altogether we find that the total transit time T is given by the integral

   T = \frac{1}{\sqrt{2g}} \int_0^a \frac{\sqrt{1 + u'(x)^2}}{\sqrt{u(x)}}\,dx.

An additional simplification can be implemented if we place the X-axis vertically, instead of horizontally, without changing the setup of the problem. Then we would have

   T = \frac{1}{\sqrt{2g}} \int_0^A \frac{\sqrt{1 + u'(x)^2}}{\sqrt{x}}\,dx,   (1.3)

and the problem we would like to solve is

   Minimize in u(x):   \int_0^A \frac{\sqrt{1 + u'(x)^2}}{\sqrt{x}}\,dx

subject to

   u(0) = 0,   u(A) = a.

Notice that this new function u is the inverse of the old one, and that positive,
multiplicative constants do not interfere with the optimization process.
To stress the meaning of our problem, suppose, for the sake of definiteness, that
we take a = A = 1. In this particular situation, we have three easy possibilities of a function passing through the two points (0, 0), (1, 1), namely

   u_1(x) = x,   u_2(x) = x^2,   u_3(x) = 1 − \sqrt{1 − x^2},

a straight line, a parabola, and an arc of a circle, respectively. Let us compare the transit times for the three, and decide which yields the smallest value. According to (1.3), we have to compute (approximately) the value of the three integrals

   T_1 = \int_0^1 \frac{\sqrt{2}}{\sqrt{x}}\,dx, \qquad T_2 = \int_0^1 \frac{\sqrt{1 + 4x^2}}{\sqrt{x}}\,dx, \qquad T_3 = \int_0^1 \frac{dx}{\sqrt{x(1 − x^2)}}.

Note that all three are improper integrals: each integrand has, at least, an asymptote at zero (the third one also at x = 1). Yet the value of each integral is finite. The smallest of the three is the one corresponding to the parabola u_2. But the important issue is still to find the best among all admissible profiles.
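As a quick numerical check of this comparison (a sketch, not part of the original discussion; the common factor 1/√(2g) is dropped since it does not affect the ordering), the three integrals can be evaluated after removing the singularity at x = 0: a closed form for T_1, the substitution x = t² for T_2, and the substitution x = sin θ for T_3, which turns it into a Beta integral.

```python
import math

def simpson(f, a, b, n=2000):
    # composite Simpson rule with n (even) subintervals
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * k * h) for k in range(1, n // 2))
    return s * h / 3

# T1, straight line u1(x) = x: integrand sqrt(2)/sqrt(x), closed form 2*sqrt(2)
T1 = 2 * math.sqrt(2)

# T2, parabola u2(x) = x^2: substituting x = t^2 gives 2 * int_0^1 sqrt(1 + 4 t^4) dt
T2 = 2 * simpson(lambda t: math.sqrt(1 + 4 * t ** 4), 0.0, 1.0)

# T3, circular arc u3(x) = 1 - sqrt(1 - x^2): with x = sin(theta) the integral
# becomes int_0^{pi/2} sin(theta)^(-1/2) dtheta, a Beta integral with closed form
T3 = (math.sqrt(math.pi) / 2) * math.gamma(1 / 4) / math.gamma(3 / 4)

print(T1, T2, T3)  # roughly 2.83, 2.59, 2.62
```

The parabola indeed gives the smallest of the three values, consistent with the statement above, while the straight line is the worst of the three candidates.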

1.3.2 Geodesics

It is well-known that geodesics in free, flat Euclidean space are straight lines. However, when distances are distorted by the action of some agent, the shortest curves joining two given points, the so-called geodesics, may no longer be straight lines. More specifically, if locally at a point x ∈ R^N distances are measured by the formula

   \|v\|^2 = v^T A(x) v,

where v ∈ R^N, and the symmetric, positive-definite matrix A(x) changes with the point x, then the total length of a curve

   x(t) : [0, 1] → R^N

would be given by the integral

   \int_0^1 \big( x'(t)^T A(x(t))\, x'(t) \big)^{1/2} dt.

Hence, shortest paths between two points P_0 and P_1 would correspond to curves x(t) realizing the minimum of the problem

   Minimize in x(t):   \int_0^1 \big( x'(t)^T A(x(t))\, x'(t) \big)^{1/2} dt

subject to

   x(0) = P_0,   x(1) = P_1.

The classical Euclidean case corresponds to A = 1, the identity matrix; but even for a situation in the plane R^2 with

   A = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix},

it is not clear whether geodesics would again be straight lines. What do you think?
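One way to experiment with this question (a numerical sketch; the discretization and the particular perturbed competitor below are arbitrary choices, not taken from the discussion above) is to approximate the length functional by summing segment lengths measured in the norm induced by A, and compare the straight segment from P_0 = (0, 0) to P_1 = (1, 1) with some other path joining the same points.

```python
import math

def length(path, A, n=2000):
    # discretized length: sum over segments of sqrt(dx^T A dx), A = diag(A[0], A[1])
    total = 0.0
    prev = path(0.0)
    for k in range(1, n + 1):
        cur = path(k / n)
        dx, dy = cur[0] - prev[0], cur[1] - prev[1]
        total += math.sqrt(A[0] * dx * dx + A[1] * dy * dy)
        prev = cur
    return total

A = (1.0, 2.0)  # the diagonal of the matrix above
straight = length(lambda t: (t, t), A)
bumped = length(lambda t: (t, t + 0.2 * math.sin(math.pi * t)), A)
print(straight, bumped)
```

The straight segment beats this particular competitor; the reader is invited to try other perturbations before settling the question.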

1.3.3 Dirichlet’s Principle

Another problem that played a major role in the development of the Calculus of Variations is Dirichlet's principle, in which we seek the function

   u(x) : Ω ⊂ R^N → R,

for a preassigned domain Ω, that minimizes the functional

   \frac{1}{2} \int_\Omega |\nabla u(x)|^2\,dx   (1.4)

among all such functions u(x) complying with u = u_0 on the boundary ∂Ω of Ω, where the function u_0 is given a priori. We are hence forcing functions to take some given values on ∂Ω. Note that u(x) is a function of several variables, and

   \nabla u(x) = \Big( \frac{\partial u}{\partial x_1}(x), \ldots, \frac{\partial u}{\partial x_N}(x) \Big), \qquad |\nabla u(x)|^2 = \sum_{i=1}^{N} \frac{\partial u}{\partial x_i}(x)^2.

In certain circumstances, if graphs of such functions u(x) are identified with shapes of elastic membranes fixed at the boundary ∂Ω according to the values of u_0, the integral in (1.4) provides a measure of the elastic energy stored by the membrane when it adopts the shape determined by the graph of u(x). The minimizer would correspond to the function u(x) whose graph stores the least energy possible under such circumstances, and so it would yield the shape adopted by the membrane naturally.
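A minimal one-dimensional sketch of this principle (the competitors below are arbitrary choices): for N = 1, Ω = (0, 1), and boundary values u(0) = 0, u(1) = 1, the functional (1.4) becomes (1/2)∫₀¹ u'(x)² dx, and the energies of a few admissible profiles can be compared by quadrature.

```python
def dirichlet_energy(du, n=10000):
    # (1/2) * int_0^1 du(x)^2 dx by the midpoint rule; du is the derivative u'
    h = 1.0 / n
    return 0.5 * sum(du((k + 0.5) * h) ** 2 for k in range(n)) * h

E_affine = dirichlet_energy(lambda x: 1.0)      # u(x) = x,   energy 1/2
E_square = dirichlet_energy(lambda x: 2.0 * x)  # u(x) = x^2, energy 2/3
print(E_affine, E_square)
```

The affine function, which is the harmonic function in one dimension with these boundary values, has the lower energy, as Dirichlet's principle predicts.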

1.3.4 Minimal Surfaces

One of the most fascinating variational examples is the one trying to minimize the area functional S in (1.1),

   S(u) = \int_\Omega \sqrt{1 + |\nabla u(x, y)|^2}\,dx\,dy.

It is one of the main functionals that has stirred, and is still stirring, a great deal of research, both in Analysis and in Geometry. Typically, the underlying variational problem tries to minimize the area S(u) among those functions sharing the same values on the boundary ∂Ω of a given domain Ω ⊂ R². The graph of a given function u(x, y) is a minimal surface if it is a minimizer of S(u) among those functions sharing with u its boundary values on ∂Ω.
Because surface area is closely related to surface tension, minimal surfaces represent shapes adopted, for instance, by soap films, which settle into configurations minimizing surface energy.

1.3.5 Isoperimetric Problems

We know from Calculus, and it has been recalled above, that many quantities associated with geometric objects can be expressed through integrals. The graph of a certain function u(x) for x ∈ [x_0, x_1] determines some important geometric quantities, up to positive multiplicative constants, like the area enclosed by it,

   A = \int_{x_0}^{x_1} u(x)\,dx;

the length of its graph,

   L = \int_{x_0}^{x_1} \sqrt{1 + u'(x)^2}\,dx;

the area of the surface of revolution around the X-axis,

   S = \int_{x_0}^{x_1} u(x) \sqrt{1 + u'(x)^2}\,dx;

and the volume of the solid of revolution around the X-axis,

   V = \int_{x_0}^{x_1} u(x)^2\,dx.

Typically, the optimization problem consisting in minimizing (or maximizing) one of these integrals under fixed end-point conditions is uninteresting, either because the optimal answer is easily found and the problem becomes trivial, or because the corresponding extreme value is not finite. For instance, the minimum of the length of a graph passing through two given points in the plane is trivially attained by the straight line through those two points, while the maximum length is infinite.

It is much more interesting to use two of those functionals to set up important variational problems. The classical one is the following:
Find the optimal function u(x) for x ∈ [−1, 1] that minimizes the integral

   \int_{-1}^{1} u(x)\,dx

among those with

   u(−1) = u(1) = 0,

and having a given length L > 2 of its graph, i.e. among those respecting the condition

   L = \int_{-1}^{1} \sqrt{1 + u'(x)^2}\,dx.
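To get a feel for this constrained problem, one can compare two admissible profiles of exactly the same length (a numerical sketch; the circular-arc parameterization below is a standard construction, not taken from the text): a circular arc through (−1, 0) and (1, 0), and a triangular profile with matching length.

```python
import math

# circular arc through (-1, 0) and (1, 0) with half-opening angle alpha:
# radius R = 1/sin(alpha), length 2*R*alpha,
# enclosed area = sector minus triangle = R^2 * (alpha - sin(alpha)*cos(alpha))
alpha = math.pi / 3
R = 1.0 / math.sin(alpha)
L_arc = 2.0 * R * alpha
area_arc = R ** 2 * (alpha - math.sin(alpha) * math.cos(alpha))

# triangular profile (-1,0) -> (0,h) -> (1,0) with the same length:
# 2 * sqrt(1 + h^2) = L_arc determines the height h
h = math.sqrt((L_arc / 2.0) ** 2 - 1.0)
L_tri = 2.0 * math.sqrt(1.0 + h * h)
area_tri = h  # area of the triangle with base 2 and height h

print(L_arc, area_arc, area_tri)
```

The arc encloses more area than the triangle of the same length, hinting at the circular nature of the optimal profile.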

This is a much more interesting and fascinating problem. In general terms, an isoperimetric problem is one in which we try to minimize an integral functional among a set of functions restricted by demanding, among other possible constraints, a condition set up with another integral functional. There are various possibilities, playing, for example, with the quantities A, L, S and V above.
There are, in particular, two very classical situations.
1. Dido’s problem. Given a rope of length .L > 0, determine the maximum area that
can be enclosed by it, and the shape to do so. It admits some interesting variants.
The following is an elementary way to state the situation. Let (u(t), v(t)) be the two components of a plane, closed curve σ, parameterized in such a way that

   u(a) = u(b) = 0,   v(a) = v(b) = 0.

The length of σ is given by

   L(σ) = \int_a^b \sqrt{u'(t)^2 + v'(t)^2}\,dt,

while the area enclosed by it is, according to Green's theorem,

   A(σ) = \frac{1}{2} \int_a^b \big( u(t) v'(t) − u'(t) v(t) \big)\,dt.

We would like to find the optimal curve for the problem

   Minimize in σ:   L(σ)   under   A(σ) = α,   σ(a) = σ(b) = (0, 0).

2. The hanging cable. This time we have a uniform, homogeneous cable of total
length L that is to be suspended from its two end-points between two points
at the same height, and separated a distance H apart. We will necessarily have
.L > H . We would like to figure out the shape that the hanging cable will adopt

under the action of its own weight. If we assume that such a profile will be the result of minimizing the potential energy associated with any such admissible profile, represented by the graph of a function u(x), then, since potential energy is proportional to height,

   dP = u(x)\,ds,   ds = \sqrt{1 + u'(x)^2}\,dx.

The full potential energy contained in such a feasible profile will then be

   P = \int_0^H u(x) \sqrt{1 + u'(x)^2}\,dx.

Constraints now should account for the fact that the cable has total length L, in addition to demanding

   u(0) = 0,   u(H) = 0.

The constraint on the length reads

   L = \int_0^H \sqrt{1 + u'(x)^2}\,dx,

coming from integrating the arc-length ds over the full interval [0, H]. We then seek the optimal shape corresponding to the problem

   Minimize in u(x):   \int_0^H u(x) \sqrt{1 + u'(x)^2}\,dx

under the constraints

   u(0) = u(H) = 0,   L = \int_0^H \sqrt{1 + u'(x)^2}\,dx.

1.3.6 Hamiltonian Mechanics

There is a rich tradition of variational methods in Mechanics. If x(t) represents the state of a certain mechanical system, with velocities x'(t), then the dynamics of the system evolves in such a way that the action integral

   \int_0^T L(x(t), x'(t))\,dt

is minimized. The integrand

   L(x, y) : R^N × R^N → R

is called the lagrangian of the system. The hamiltonian H(x, y) is defined through the formula

   H(x, y) = \sup_{z \in R^N} \{ z \cdot y − L(x, z) \}.

H, defined through this formula, is called the conjugate function of L with respect to the variable y, as the variable x here plays the role of a vector of parameters. It is interesting to note that, under hypotheses that we do not bother to specify,

   L(x, y) = \sup_{z \in R^N} \{ z \cdot y − H(x, z) \}

as well.
We will learn later to write the Euler-Lagrange (E-L) system that critical paths of the functional with integrand L(x, y) above ought to verify. It reads

   −\frac{d}{dt} L_y(x(t), x'(t)) + L_x(x(t), x'(t)) = 0.   (1.5)
Paths which are solutions of this system will also evolve according to the hamiltonian equations

   y'(t) = −\frac{\partial H}{\partial x}(x(t), y(t)), \qquad x'(t) = \frac{\partial H}{\partial y}(x(t), y(t)).

In fact, the definition of the hamiltonian leads to the relations

   H(x, y) = z(x, y) \cdot y − L(x, z(x, y)), \qquad y − L_y(x, z(x, y)) = 0,

where this z is the optimal vector (depending on x and y) in the definition of the hamiltonian. Then it is easy to check that

   L_y(x, z) = y, \qquad z = H_y(x, y), \qquad H_x(x, y) = −L_x(x, z).

If we put

   x' = z, \qquad y = L_y(x, x'),

it is immediate to check, through these identities, the equivalence between the E-L system and Hamilton's equations. Note that system (1.5) becomes

   y' = −H_x(x, y).
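The conjugacy formula defining H can be checked numerically in a simple case (a sketch with an arbitrarily chosen lagrangian, not from the text): for L(x, z) = z²/2 − x²/2, a harmonic oscillator, the supremum over z is attained at z = y and yields H(x, y) = y²/2 + x²/2.

```python
def H_numeric(x, y, L, zmin=-10.0, zmax=10.0, n=200001):
    # brute-force approximation of sup_z { z*y - L(x, z) } on a fine grid
    best = float("-inf")
    for k in range(n):
        z = zmin + (zmax - zmin) * k / (n - 1)
        best = max(best, z * y - L(x, z))
    return best

L = lambda x, z: 0.5 * z * z - 0.5 * x * x   # harmonic-oscillator lagrangian

for x, y in [(0.0, 1.0), (1.0, -0.5), (-2.0, 3.0)]:
    exact = 0.5 * y * y + 0.5 * x * x        # the hamiltonian
    assert abs(H_numeric(x, y, L) - exact) < 1e-6
print("conjugate check passed")
```

The brute-force supremum matches the analytic hamiltonian to high accuracy, and the maximizing z equals y, exactly the relation z = H_y(x, y) above.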

1.4 The Model Problem, and Some Variants

We have emphasized in the preceding sections the interest, both from a purely analytical and from a more applied standpoint, of studying variational problems of the general form

   I(u) = \int_\Omega F(x, u(x), \nabla u(x))\,dx

under typical additional conditions like prescribing the values of competing fields u(x) on ∂Ω. Here we have:
• Ω ⊂ R^N is a typical subset, which cannot be too weird;
• feasible fields

   u(x) : Ω → R^n

should be differentiable with gradient or differential (jacobian) matrix

   \nabla u(x) : Ω → R^{n×N};

• the integrand

   F(x, u, z) : Ω × R^n × R^{n×N} → R

is assumed to enjoy smoothness properties that will be specified as they are needed.
1.4 The Model Problem, and Some Variants 15

Dimensions N and n make the problem quite different, ranging from tractable to pretty hard. We will refer to:
1. N = n = 1: scalar (n = 1) variational problems in dimension N = 1;
2. N = 1, n > 1: vector problems in dimension one;
3. N > 1, n = 1: scalar, multidimensional problems;
4. N, n > 1: vector, multidimensional problems.
In general terms we will refer to the scalar case when either of the two dimensions N or n is 1, while we will simply classify a problem as vectorial if both dimensions are greater than unity. We will focus mainly on the scalar case in this text, while the vector case is left for another book. Note how all of the examples examined earlier in the chapter fall into the category of scalar problems.
To stress how variational problems of the above class are just a first step to more general situations of indisputable interest in Science and Engineering, consider a scalar, uni-dimensional variational problem

   Minimize in u(t):   \int_0^T F(t, u(t), u'(t))\,dt

under the end-point conditions

   u(0) = u_0,   u(T) = u_T.

It is obvious that we can also write the problem in the form

   Minimize in v(t):   \int_0^T F(t, u(t), v(t))\,dt

under the constraints

   u(0) = u_0,   u(T) = u_T,   u'(t) = v(t).

Notice how we are regarding the function .v(t) as our “variable” for the optimization
problem, while .u(t) would be obtained from .v(t) through integration. In particular,
there is the integral constraint

   \int_0^T v(t)\,dt = u_T − u_0

that must be respected by feasible functions v(t). The obvious relationship

   u'(t) = v(t)

can immediately be generalized to

   u'(t) = f(t, u(t), v(t))

for an arbitrary function

   f(t, u, v) : (0, T) × R × R → R.

This is not just added generality for its own sake: it gives rise to a new class of optimization problems of tremendous impact in Engineering, identified as optimal control problems (of ODEs).
There is another important generalization that our readers should keep in mind.
So far integrands in functionals have been assumed to depend explicitly on the first
derivative .u' or gradient .∇u of competing functions. This does not have to be so.
In fact, the highest derivative occurring explicitly in a variational problem indicates
the order of the problem. Usually, first-order problems are the ones studied because
these families of problems are the most common. However, one can consider zero-order and higher-order problems. A functional of the form

   \int_0^1 F(t, u(t))\,dt

would correspond to a zero-order problem, assuming that no derivative participates explicitly in additional constraints, whereas

   \int_0^1 F(t, u(t), u'(t), u''(t))\,dt

with an explicit dependence of F on the variable u'' would indicate a second-order example.

1.5 The Fundamental Issues for a Variational Problem

What are the main concerns when facing any of the examples we have just
examined, or any other such problem for that matter? The first and fundamental
issue is to know if there is a particular feasible function realizing the minimum. If
there is such a function (there might be several of them), it will be quite special in
some sense. How can one go about showing whether there is a minimizer for one of
our variational problems? Let us back up to a finite-dimensional situation to see what we can learn from there.
Suppose we have a function

   F(x) : R^N → R,

and we would like to convince ourselves that there is at least one vector x_0 ∈ R^N with the property

   F(x_0) = \min_{x \in R^N} F(x).

We would probably look immediately, especially if we were good Calculus students, at the system of critical points

   \nabla F(x) = 0.   (1.6)

That is not a bad idea. However, we are well aware that, even if x_0 must be a solution of this non-linear system, provided F is differentiable, there might be other solutions of (1.6). On the other hand, we do not have the slightest idea about what the equivalent of the critical-point system (1.6) would be for a functional like the one in the previous section. And if F is not differentiable, we would feel at a loss.
There are some precautions one should always take before moving on. Let us put

   m = \inf_{x \in R^N} F(x).

We would definitely like m ∈ R to be a real number. It could however happen that m = −∞ (the case m = +∞ does not make much sense; why?). If m = −∞, it means that the problem is not well-posed, in the sense that the values of F can decrease indefinitely, and moreover, if F is continuous,

   \liminf_{|x| \to \infty} F(x) = −∞.

These situations are not interesting, as there could be no minimizers. How can one avoid them? A standard condition is known as coercivity: such a function F is said to be coercive if, on the contrary,

   \liminf_{|x| \to \infty} F(x) = +∞.   (1.7)

This condition suffices to ensure that m ∈ R.


Once we know that m ∈ R, one thing one can always find (establish the existence of) is a minimizing sequence {x_j} for F. This is simply a sequence of vectors {x_j} such that

   F(x_j) \searrow m = \inf_{x \in R^N} F(x).

This is nothing but the definition itself of the infimum of a collection of numbers.
Another matter is how to find in practice, for specific functions F , such a sequence.

In some sense, that minimizing sequence should be telling us where to look for minimizers. What is sure, under the coercivity condition (1.7), is that {x_j} is a bounded collection of vectors. As such, it admits some converging subsequence {x_{j_k}} (Heine-Borel theorem):

   x_{j_k} → x_0  as  k → ∞.

This is a fundamental compactness property. If F is continuous, then

   F(x_0) = \lim_{k \to \infty} F(x_{j_k}) = m,

and x_0 becomes one desired minimizer. Once we are sure that there are vectors where the function F achieves its minimum value, we know that those must be solutions of the system of critical points (1.6), and we can start from there to look for those minimum points.
Let us mimic this process with a typical integral functional of the form
 1
.I (u) = F (u(x), u' (x)) dx (1.8)
0

where

F (x, z) : R × R → R,
. u(x) : [0, 1] → R,

and there could be further conditions, most probably in the form of end-point
restrictions, for admissible functions. The density F is assumed to be, to begin
with, just continuous, and for the sake of definiteness we have taken the interval
of integration to be the unit interval .[0, 1] ⊂ R. The functional I is supposed not to
be identically .+∞, i.e. there is a least one admissible function u with .I (u) ∈ R. Set

m = inf I (u),
.
A

where .A stands for the class of functions that we would allow in the minimization
process, essentially .C1 -functions (in order to have a derivative) complying with
additional restrictions. The first concern is to guarantee that .m ∈ R, i.e. .m > −∞.
In the case of a function of several variables .F (x), we discussed above that the
coercivity condition (1.7) suffices to ensure this property. What would the substitute
be for a functional like .I (u)? Note how this requires to determine in a precise way
what it means for a function

u(x) : [0, 1] → R
.
1.5 The Fundamental Issues for a Variational Problem 19

to go to “infinity”. In the case of a finite-dimensional vector .x, it clearly means that

Σ
N
|x|2 =
. xi2 → ∞, x ∈ RN , x = (x1 , x2 , . . . , xN ).
i=1

One can think of the condition, which is possibly the most straightforward general-
ization of this finite-dimensional case,

. sup |u(x)| → ∞.
x∈[0,1]

But there might be other possibilities like demanding, for instance,


 1
. |u(x)| dx → ∞.
0

What is pretty clear is that a certain “measure” of the overall size of the function
must tend to infinity, and as the size of a feasible function .u ∈ A tends to infinity,
the numbers .I (u) ought to go to infinity as well.
Assume that we have somehow solved this problem, determined clearly how to
calculate the global size .|u| ∈ R+ of a function .u ∈ A, and checked that the
functional .I (u) is coercive in the sense

. lim inf I (u) = +∞,


|u|→∞

in such a way that minimizing sequences .{uj } ⊂ A enjoy the compactness property

{|uj |} ⊂ R,
.

a bounded collection of numbers. At this stage, in the finite-dimensional situa-


tion, one resorts to the fundamental compactness property that there should be
subsequences converging to some vector. This innocent-looking assertion encloses
various deep concepts. We need to grasp how to find a replacement of this
compactness property in the infinite-dimensional scenario.
Given that functions .u(x) are in fact numbers for each point in the domain .x ∈
[0, 1], the most direct way to argue is possibly as follows. If the property .{|uj |} ⊂ R
implies that .{uj (x)} is a bounded sequence of numbers for every such x, we can
certainly apply the compactness property in .R to conclude that there will be a certain
subsequence .jk such that .ujk (x) → u(x) where we have put .u(x) for the limit,
anticipating thus the limit function. There are two crucial issues:
1. The subsequence .jk which is valid for a certain .x1 ∈ [0, 1] might not be valid
for another point .x2 ∈ [0, 1], that is to say, the choice of .jk may depend on
the point x. We can start taking subsequences of subsequences in a typical
diagonal argument, but given that the interval .[0, 1] is not countable, there is no
20 1 Motivation and Perspective

way that this process could lead anywhere. Unless we find a way, or additional
requirements in the situation, to tune the subsequence .jk for all points .x ∈ [0, 1]
at once so that

ujk (x) → u(x) for all x ∈ [0, 1],


.

it is impossible to find a limit function in this way.


2. Even if we could resolve the previous step, there is still the issue to check whether
the limit function .u(x) ∈ A. In particular, u must be a function for which the
functional I can be computed.
Example 1.1 To realize that these difficulties are not something fancy or patholog-
ical, let us examine the following elementary example. Put

.uj (x) = sin(2j π x), x ∈ [0, 1].

It is not difficult to design variational problem like (1.8) for which sequences similar
to .uj are minimizing. For each .x ∈ [0, 1], .{uj (x)} is obviously a sequence of
number in the interval .[−1, 1], and so bounded. But there is no way to tune a single
subsequence .jk to have that .{sin(2jk π x)} is convergent for all x at once. Note that
the set of numbers

.{sin(2j π x) : j ∈ N}

is dense in .[−1, 1] for every .x ∈ (0, 1). In this case, there does not seem to be
a reasonable way to define a limit function for this sequence. The solution to this
apparent dead-end is surprising and profound.
To summarize our troubles, we need to address the following three main issues:
1. we need to find a coherent way to define the size of a function; there might be
various ways to do so;
2. we need some compactness principle capable of determining a function from a
sequence of functions whose sizes form a bounded sequence of positive numbers;
3. limit functions of bounded-in-size sequences should enjoy similar properties to
the members of the sequences.
Before we can even talk about how to tackle variational problems like (1.8), there
is no way out but to address these three fundamental issues. This is the core of our
initial exposure to Functional Analysis.

1.6 Additional Reasons to Care About Classes of Functions

Let us start making some explicit calculations to gain some familiarity with simple
variational principles.
1.6 Additional Reasons to Care About Classes of Functions 21

Let us focus on some scalar, one-dimensional examples of the form


 x1
I (u) =
. F (x, u(x), u' (x)) dx. (1.9)
x0

Note how in this case

Ω = (x0 , x1 ),
. x ∈ (x0 , x1 ),
u(x) : (x0 , x1 ) → R, u' (x) : (x0 , x1 ) → R,

and additional standard conditions will most likely limit the values of u or its
derivative at one or both end-points

x = x0 ,
. x = x1 .

Different problems will obviously depend on the integrand

F (x, u, z) : (x0 , x1 ) × R × R → R.
.

To compute an integral like the one in (1.9), for a given function .u(x), we clearly
need to ensure that u is differentiable in .(x0 , x1 ), and plug its derivative .z = u' (x)
as the third argument in F ; hence, we must restrict ourselves to look for the infimum
of I in (1.9) among differentiable functions.
Let us explore the following two examples:
1. the variational problem
 1
Minimize in u(x) :
. u(x)2 (2 − u' (x))2 dx
0

subjected to

u(0) = 0,
. u(1) = 1;

2. the variational problem


 1
Minimize in u(x) :
. u' (x)6 (u(x)3 − x)2 dx
−1

subjected to

u(−1) = −1,
. u(1) = 1.
22 1 Motivation and Perspective

Concerning the first, we notice that the integrand is non-negative, being a square,
and so it would vanish for functions .u(x) so that

u(x)(2 − u' (x)) = 0


.

for every point .x ∈ (0, 1). Hence, either .u(x) = 0, or else .u' (x) = 2. Given the
end-point conditions

u(0) = 0,
. u(1) = 1,

that we must respect, it is easy to conclude that



0, 0 ≤ x ≤ 1/2,
u(x) =
.
2x − 1, 1/2 ≤ x ≤ 1,

is one possible optimal solution of the problem. But .u is not differentiable !! The
same problem pops up for any minimizer. If we insist in restricting attention to
differentiable functions, .m = 0 is the unattainable infimum of the problem: .u can
be approximated by .C1 -functions in an arbitrary way by rounding the corner of the
graph of .u, but the value .m = 0 can never be reached. The situation is like solving
the equation .x 2 = 2 in the field of rational numbers.
The second example is more sophisticated. It is a typical situation of the
phenomenon discovered by Lavrentiev in the 1920s. As in the first example, the
integrand is non-negative, and the only way to lead integrals to zero, under the given
end-point conditions, is by taking

.u(x) = 3
x. (1.10)

This unique minimizer is not differentiable either at .x = 0. However, the situation is


more dramatic than in the previous example. Suppose we pretend to approximate the
minimizer in (1.10) by a family of differentiable functions. The most straightforward
way would be to select
√ a small .h > 0,√and use a piece of a straight line interpolating
the two points .(−h, 3 −h) and .(h, 3 h) in the interval .[−h, h]. The slope of this
secant is
√ √
3
h − 3 −h 1
. = √ 3 2
.
2h h

The resulting function is not differentiable either at .±h, but those two discontinuities
can be rounded off without any trouble. The most surprising fact is that if we put
 1
I (u) =
. u' (x)6 (u(x)3 − x)2 dx
−1
1.7 Finite Versus Infinite Dimension 23

and

u(x), −1 ≤ x ≤ −h, h ≤ x ≤ 1,
uh (x) =
. ,
h−2/3 x, −h ≤ x ≤ h,

then, due to evenness,

I(u_h) = 2 ∫_0^h h^(−4) (h^(−2) x³ − x)² dx.

A careful computation yields

I(u_h) = (2/h) (1/3 + 1/7 − 2/5) = 16/(105 h),

and we clearly see that I(u_h) → ∞ as h → 0! It is a bit shocking, but this
elementary computation indicates that the behavior of this apparently innocent-
looking functional is quite different depending on the set of competing functions
considered. In particular, and this would require some more work than the simple
computation we have just performed, the infimum of I(u) over the class of
piecewise affine functions over [−1, 1] cannot vanish. And yet, when we permit
the function u(x) in (1.10) to compete for the infimum, the minimum is zero!
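The blow-up can be verified numerically: the computation above gives I(u_h) = (2/h)(1/3 + 1/7 − 2/5) = 16/(105h). The following sketch (function name I_h is ours) integrates the explicit expression for I(u_h) by the midpoint rule and compares it with that closed form.

```python
import numpy as np

def I_h(h, n=100_000):
    # midpoint rule for  I(u_h) = 2 ∫_0^h h^{-4} (h^{-2} x^3 - x)^2 dx
    x = (np.arange(n) + 0.5) * h / n
    return 2 * np.sum(h**-4 * (h**-2 * x**3 - x)**2) * (h / n)

for h in (0.1, 0.01, 0.001):
    print(h, I_h(h), 16 / (105 * h))  # the two columns agree; both blow up as h → 0
```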
What we have tried to emphasize with these two examples is the fundamental
importance, in variational problems, of specifying the class of functions over which
a given functional is to be minimized. Historically, it took quite a while to realize
this, but variational problems were at the core of the effort that pushed mathematicians
to formalize spaces of functions where “points” are no longer vectors of a given
dimension, but rather full functions. This led to the birth of an independent,
fundamental discipline in Mathematics known universally as Functional Analysis.

1.7 Finite Versus Infinite Dimension

One of the main points that students must make an effort to understand is the
crucial distinction between finite and infinite dimension. Though this is not always
an issue, Functional Analysis focuses on discovering the places where an infinite-
dimensional scenario is different, even drastically different, and what those main
differences are or might be. Mastering where caution is necessary takes time because
genuine infinite-dimensional phenomena may sometimes be counterintuitive, and one
needs to educate intuition in this regard.
To stress this point, let us present two specific examples. We will see that the
first one is identical in finite and infinite dimension, whereas in the second, the
framework of infinite dimension permits new phenomena that need to be tamed.

This is part of our objective with this text. For the next statement, it is not
indispensable to know much about metric spaces as the proof is pretty transparent.
We will recall the basic definitions in the next chapter.
Theorem 1.1 (The Contraction Mapping Principle) Let .H be a complete metric
space under the distance function

.d(x, y) : H × H → R+ .

Suppose the mapping T : H → H is a contraction in the sense

d(Tx, Ty) ≤ K d(x, y),   0 < K < 1.                                  (1.11)

Then there is a unique fixed point for T, i.e. a unique x ∈ H such that Tx = x.
Proof Select, in an arbitrary way, x₀ ∈ H, and define recursively the sequence

x_k = Tx_{k−1},   k ≥ 1.

We claim that {x_k} is a Cauchy sequence. In fact, if j < k, by the triangle inequality
and the repeated use of the contraction inequality (1.11),

d(x_k, x_j) ≤ Σ_{l=j}^{k−1} d(x_{l+1}, x_l)
           ≤ Σ_{l=j}^{k−1} K^l d(Tx₀, x₀)
           = d(Tx₀, x₀) Σ_{l=j}^{k−1} K^l
           = d(Tx₀, x₀) (K^j − K^k)/(1 − K).

Since K < 1, we clearly see that d(x_k, x_j) → 0 as k, j → ∞, and thus, the
sequence converges to a limit element x. Since

d(T^{k+1} x₀, Tx) ≤ K d(T^k x₀, x),

the left-hand side converges to d(x, Tx), and the right-hand side converges to 0, we
conclude that indeed x is a fixed point of T. The same inequality (1.11) excludes the
possibility of having more than one such fixed point.
Note how the statement and the proof of such a result make no distinction
between finite and infinite dimension. As a general rule, when a fact can be shown in
a metric space regardless of any other underlying structure, that result will be valid
regardless of dimension.
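The iteration in the proof of Theorem 1.1 is also a practical algorithm. A minimal sketch (our own illustration, not from the text): T(x) = cos x maps [0, 1] into itself with |T'(x)| = |sin x| ≤ sin 1 < 1 there, so the iterates converge to its unique fixed point.

```python
import math

def fixed_point(T, x0, tol=1e-12, max_iter=10_000):
    # the iteration from the proof: x_k = T(x_{k-1})
    x = x0
    for _ in range(max_iter):
        x_new = T(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# T(x) = cos x is a contraction on [0, 1] (|T'(x)| = |sin x| <= sin 1 < 1),
# so Theorem 1.1 guarantees a unique fixed point there.
x_star = fixed_point(math.cos, 1.0)
print(x_star)  # ~0.739085, the solution of cos x = x
```

The error contracts by the factor K = sin 1 ≈ 0.84 at each step, so the tolerance 1e−12 is reached in under two hundred iterations.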
Possibly, from a more practical viewpoint, one of the main differences between
finite and infinite-dimension refers to compactness. This is always a headache in
Applied Analysis, and the main reason for the introduction and analysis of weak
convergence. In some sense, in an infinite dimension context there are far too many
dimensions where things may hide away or vanish. We refer to Example 1.1. For
the sequence of trigonometric functions

{sin(jx)},   x ∈ [−π, π],

there are at least two facts that strike one when compared with the situation in finite
dimension:
1. there cannot be a subsequence converging in a natural manner to anything;
2. any finite subset (of an arbitrary number of elements) of those sin-functions is
always linearly independent.
We have already emphasized the first fact earlier. Concerning the second, suppose
we could find that

Σ_{j=1}^{k} λ_j sin(jx) = 0,   x ∈ [−π, π],

for a collection of numbers λ_j, and arbitrary k. If we multiply this identity by
sin(lx), 1 ≤ l ≤ k, and integrate over [−π, π], we would definitely have

0 = Σ_{j=1}^{k} λ_j ∫_{−π}^{π} sin(jx) sin(lx) dx.

But an elementary computation yields that

∫_{−π}^{π} sin(jx) sin(lx) dx = 0

when j ≠ l. Hence

0 = λ_l ∫_{−π}^{π} sin²(lx) dx,

which implies λ_l = 0 for all l. This clearly means that such an infinite collection
of trigonometric functions is linearly independent: any vector space containing
them must be infinite-dimensional.
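The orthogonality relations used in this argument are easy to check numerically; the sketch below (our own) approximates ∫_{−π}^{π} sin(jx) sin(lx) dx by the midpoint rule, which is essentially exact for trigonometric polynomials over a full period.

```python
import numpy as np

def inner(j, l, n=100_000):
    # midpoint rule for ∫_{-π}^{π} sin(jx) sin(lx) dx
    x = -np.pi + (np.arange(n) + 0.5) * (2 * np.pi / n)
    return np.sum(np.sin(j * x) * np.sin(l * x)) * (2 * np.pi / n)

print(inner(2, 5))  # ~0: distinct frequencies are orthogonal
print(inner(3, 3))  # ~π: ∫ sin²(lx) dx = π
```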

1.8 Brief Historical Background

The development and historical interplay between Functional Analysis and the
Calculus of Variations is one of the most fascinating chapters of the History
of Mathematics. It seems appropriate, then, to write a few paragraphs on this
subject addressed to young students, with the sole purpose of igniting the spark of
curiosity. It may be a rather presumptuous attitude on our part, twenty-first century
mathematicians, to look back on History and marvel at the difficulties that our
former colleagues found in understanding concepts and ideas that are so easily
conveyed today in advanced mathematics lectures all over the world. We should
never forget, though, that this ease in grasping and learning deep and profound
concepts and results is the outcome of a lot of work and dedication by many of
the most brilliant minds of the nineteenth and twentieth centuries. Modesty, humility, and awe
must be our right attitudes. Our comments in this section are essentially taken from
[BiKr84, Kr94I, Kr94II]. In [FrGi16], there is a much more detailed description
of the early history of the Calculus of Variations, and its interplay with Functional
Analysis and other areas. See also [Go80].
The truth is that concepts so ingrained in our mathematical mentality today,
like that of “function”, took quite a while to be recognized and
universally adopted. Great mathematicians like Euler, Lagrange, Fourier, Dirichlet,
Cauchy, among others, but especially Riemann and Weierstrass, contributed in a
very fundamental way. The concept of “space” was, however, well behind that of
function. The remarkable rise of non-euclidean geometries (Gauss, Lobachevsky,
Bolyai, Klein) had a profound impact on the idea of what might be a “space”. Some
help came from Mechanics through the work of Lagrange, Liouville, Hamilton,
Jacobi, Poincaré, . . . ; and from Geometry (Cayley, Grassmann, Riemann, etc). It
was Riemann who actually introduced, for the first time in his doctoral thesis in
1851, the idea of a “function space”. There were other pioneers like Dedekind
and Méray, but it was the fundamental work of Cantor with his set theory that
made possible the whole edifice of Functional Analysis as we think about this subject
today. We cannot forget the Italian school, as it is considered the place where
Functional Analysis started at the end of the nineteenth century and the beginning
of the twentieth. Names like Ascoli, Arzelà, Betti, Beltrami, Cremona, Dini,
Peano, Pincherle, Volterra, should not be forgotten. But also the French school, in
the heart of the twentieth century, contributed immensely to the establishment of
Functional Analysis as a discipline on its own. Here we ought to bring to mind
Baire, Borel, Darboux, Fréchet, Goursat, Hadamard, Hermite, Jordan, Lebesgue,
Picard, among many others.
The Calculus of Variations, and the theory of integral equations for that
matter, provided the appropriate ground for the success of functional analytical
techniques in solving quite impressive problems. Variational problems, formulated
in a somewhat primitive form, started almost simultaneously with the development
of Calculus. Yet the issue of making sure that there was a special curve or surface
furnishing the minimum value possible for a certain quantity (a functional defined

on a space) was out of the question until the second half of the nineteenth century. It
was not at all easy to appreciate the distinction between the unquestionable physical
evidence of the existence of minimizers in Mechanics, and the need for a rigorous
proof that it is so.
There are two fundamental years for the Calculus of Variations: 1696 is
universally accepted as the birth of the discipline with J. Bernoulli’s brachistochrone
problem; while 1744 is regarded as the beginning of its theory with the publication
by Euler of his necessary condition for a minimum. It was Lagrange who pushed
Euler’s ideas beyond, and invented the “method of variations” that gave its name
to the discipline. Later, Jacobi and Weierstrass discovered sufficiency conditions
for an extremum, and Legendre introduced the second variation.
One of the most fundamental chapters of the mutual interplay between the
Calculus of Variations and Functional Analysis was Dirichlet’s principle, or more
generally Partial Differential Equations. It reads:
There exists a function u(x) that minimizes the functional

D(u) = ∫_Ω |∇u(x)|² dx,

for Ω ⊂ R^N, among all continuously differentiable functions in Ω, continuous
up to the boundary ∂Ω, and taking prescribed values on ∂Ω given by f. Such a
function u(x) is the solution of the problem

Δu = 0 in Ω,   u = f on ∂Ω,   u ∈ C²(Ω).

Its rigorous analysis had an unquestionable impact on Analysis, and it led to
several interesting methods, due to Neumann, Poincaré, and Schwarz, to show the
existence of such a minimizer u. But it also stirred ideas, initiated by Hilbert,
that culminated in the direct method of the Calculus of Variations. These methods
do not deal with the Euler-Lagrange equation or system, but place themselves in a
true functional analytical setting by “suitably generalizing the concept of solution”.
This idea, expressed by Hilbert in a paper in 1900, became a guide for the Calculus
of Variations for the twentieth century, and culminated in an important first step,
along with the notion of semicontinuity, in the work of Tonelli at the beginning of
the last century. There were other fundamental problems that also had a very clear
influence in Analysis: Plateau’s problem and minimal surfaces, and Morse theory.
These areas started to attract so much attention that they eventually developed
into independent fields of their own. They are part of a different story, just as is
the field of integral equations.
It was precisely the important work of Fredholm on integral equations that
attracted the interest of Hilbert and other colleagues in Göttingen. A great deal of
work on integral equations was carried out in those first decades of the twentieth
century in Göttingen, where a large number of Hilbert’s disciples, comprising the
Hilbert school, simplified, further explored and even extended Hilbert’s intuition,
and paved the way towards the year 1933, by which time Functional Analysis had come to be
clearly considered an independent, important branch of Mathematics. There are
some other personal, fundamental contributions to Functional Analysis that cannot
be justly missed or ignored: Banach, Hahn, and Riesz; and some later figures like
Schauder, Stone, or Von Neumann.
This is just a glimpse of the origins of these two fundamental areas of Analysis.
We have not touched on ideas, since that would require a full treatise. We have not
mentioned every mathematician who deserves it. Again, brevity has forced us to name
just the characters regarded as key figures in the early development of these parts
of Mathematics, hoping to have kindled in readers a genuine interest in learning
more.

1.9 Exercises

1. Prove Hölder’s inequality

   Σ_{k=1}^{n} x_k y_k ≤ (Σ_{k=1}^{n} x_k^p)^{1/p} (Σ_{k=1}^{n} y_k^q)^{1/q},   1/p + 1/q = 1,

for p > 1, x_k, y_k > 0.


(a) Argue that the maximization problem

   Maximize in x = (x₁, x₂, . . . , x_n):   x · y

subjected to the condition

   Σ_{k=1}^{n} x_k^p = b,

where

   b > 0,   y = (y₁, y₂, . . . , y_n),

are given, has a unique solution.


(b) Find such a solution.
(c) By interpreting appropriately the maximal property of the found solution,
show Hölder’s inequality.
2. For the pth-norm (p ≥ 1)

   ||x||_p^p = Σ_j |x_j|^p,   x = (x₁, x₂, . . . , x_n),

show the inequality

   ||x + y||_p ≤ ||x||_p + ||y||_p,

by looking at the optimization problem

   Minimize in (x, y) ∈ R^{N+N}:   ||x||_p + ||y||_p

subject to x + y = z, for a fixed vector z ∈ R^N.


(a) Show that for every z ∈ RN , there is always a unique optimal feasible pair
(x, y).
(b) Use optimality to find such optimal pair.
(c) Conclude the desired inequality.
3. (a) Consider a variational problem of order zero of the form

   ∫_Ω F(x, u(x)) dx.

Argue how to find the minimizer under no further constraint. Derive
optimality conditions. Why can this same strategy not be implemented for
a first-order problem?
(b) For a fixed function f(x), consider the variational problem

   Minimize in χ(x):   ∫_Ω χ(x) f(x) dx

among all characteristic functions χ with

   ∫_Ω χ(x) dx = |Ω| s,   s ∈ (0, 1).

Describe the minimizers.


4. If the cost of flying in the air at height h > 0 over ground (h = 0) is given by
e^(−ah), write precisely the optimization problem providing the minimum cost of
flying a distance L at ground level.
5. Fermat proposed in 1662 the principle of propagation of light: the path taken
by light between two points is always the one minimizing the transit time. If
the velocity of light in any medium is the quotient c/n where n(x, y) is the
refractive index of the medium at point (x, y), and c is the velocity of light in
vacuum, write the variational problem that furnishes the path followed by a ray
of light going from point (0, y0 ) to (L, yL ).

6. Newton’s solid of minimal resistance. The integral

   ∫ y(x) y'(x)³ / (1 + y'(x)²) dx

provides the global resistance experienced by the solid of revolution obtained
by the rotation of the graph (x, y(x)) around the X-axis between two points
(0, H), (L, h), h < H. Write the same previous functional for profiles
described in the form (x(y), y) between x(H) = 0, x(h) = L.
7. Minimal surfaces of revolution. One can restrict the minimal surface problem
to the class of surfaces of revolution generated by graphs of functions u(x) of
a single variable that rotate around the X-axis between two values x = 0 and
x = L. Write the functional to be minimized for such a problem.
8. Geodesics in a cylinder and in a sphere. Write the functional yielding the length
of a curve σ whose image is contained either in the cylinder

x² + y² = 1

or in the sphere

x² + y² + z² = 1.

Discuss the several possibilities of the mutual position of the initial and final
points in each case.
9. Consider the canonical parametrization (x, y, u(x, y)) of the graph of a smooth
function

u(x, y),
. (x, y) ∈ Ω ⊂ R2 .

Write the functional with integrand yielding the product of the sizes of the two
generators of the tangent plane at each point of such a graph. Can you figure
out at least one minimizer?
10. Let

    u(t) : [0, 1] → R^N

be a smooth map, and consider the functional

    ∫_0^1 F(u(t), u'(t)) dt

for a certain integrand

    F(u, v) : R^N × R^N → R.

Write down the variational problem of finding the minimum of the above
functional over all possible reparametrizations of the given path u preserving
the values u(0) and u(1).
(a) How can the problem be simplified for smooth, bijective parametrizations?
(b) Deduce for which integrands F (u, v) the corresponding functional is
independent of reparametrizations, and check that this is so for the one
yielding the length of the curve {u(t) : t ∈ [0, 1]}.
Part I
Basic Functional Analysis
and Calculus of Variations
Chapter 2
A First Exposure to Functional Analysis

2.1 Overview

We focus in this chapter on the three main basic issues that are indispensable for
tackling variational problems on a solid foundation, namely:
1. define the size of a function;
2. examine a compactness principle valid for bounded (with respect to that size)
sequences of functions;
3. study whether limit functions enjoy similar properties to the members of a
sequence.
From the very beginning, and as is usual in Mathematics, we will adopt an abstract
viewpoint to gain as much generality as possible with the same effort. Fundamental
spaces of functions and sequences will be explored as examples to illustrate
concepts, fine points in results, counterexamples, etc. Spaces that are fundamental
to Calculus of Variations and Partial Differential Equations (PDE, for short) will
also be introduced as they will play a central role in our discussion. In particular, we
will introduce the concept of weak derivative and weak solution of a PDE, and start
studying the important Sobolev spaces.
From the strict perspective of Functional Analysis, we will introduce Banach
and Hilbert spaces, the dual space of a given Banach space, weak topologies, and
the crucial principle of weak compactness. Some other fundamental concepts of
Functional Analysis will be deferred until later chapters.

2.2 Metric, Normed and Banach Spaces

Since sets of functions taking values in a vector space (over a certain field K which
most of the time will be R if not explicitly stated otherwise) inherit the structure

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024


P. Pedregal, Functional Analysis, Sobolev Spaces, and Calculus of Variations,
La Matematica per il 3+2 157, https://doi.org/10.1007/978-3-031-49246-4_2

of a vector space over the same field, we will assume that an abstract set .E is such a
vector space.
Definition 2.1 A non-negative function

|| · || : E → R+
.

is called a norm in E if:

N1 ||x|| = 0 if and only if x = 0;
N2 triangle inequality: for every pair x, y ∈ E,

   ||x + y|| ≤ ||x|| + ||y||;

N3 for every scalar λ and vector x ∈ E,

   ||λx|| = |λ| ||x||.

A norm is always a way to measure the size of vectors, and those three properties
specify how it should behave with respect to linear combinations, the basic operation
in a vector space.
If a set .E is not a vector space, we can still define a distance function

d(x, y) : E × E → R+
.

to measure how far from each other elements of E are. Such a distance function must
comply with some basic properties to maintain a certain order in E, namely:

1. d(x, y) = 0 if and only if x = y;
2. symmetry: for every pair x, y ∈ E, d(x, y) = d(y, x);
3. triangle inequality:

   d(x, z) ≤ d(x, y) + d(y, z)

for any three elements x, y, z ∈ E.


If || · || is a norm in E, then the formula

   d(x, y) = ||x − y||

yields a distance in E, with the additional properties

1. d(x + z, y + z) = d(x, y);
2. d(λx, λy) = |λ| d(x, y).

Conversely, if

d(·, ·) : E × E → R+
.

is a distance function enjoying the above two properties, then

||x|| = d(0, x)
.

is a norm in E. The topology in E is determined through this distance function. The
collection of balls

   {x ∈ E : ||x|| < r}

for positive r makes up a basis of neighborhoods at 0. Their translations to each
vector x ∈ E form a basis of neighborhoods at x.
Definition 2.2
1. A complete metric space is a metric space .(E, d) in which Cauchy sequences
always converge to a point in the same space.
2. A Banach space .E is a normed vector space which is complete with respect to its
norm. Every Banach space is a complete metric space under the distance induced
by the norm.
Example 2.1 The first example is, of course, that of a finite-dimensional vector
space K^N in which one can select various ways to measure the size of vectors:

   ||x||₂² = Σ_i x_i²,   x = (x₁, x₂, . . . , x_N),

   ||x||_p^p = Σ_i |x_i|^p,   p ≥ 1,

   ||x||_∞ = max_i |x_i|.

K^N is, definitely, complete if K is.
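The norms of Example 2.1 can be sketched in a few lines of Python (our own illustration; the function name p_norm is ours). Note also how ||x||_p approaches ||x||_∞ as p grows.

```python
import numpy as np

def p_norm(x, p):
    # the norms of Example 2.1 on K^N (here K = R)
    x = np.abs(np.asarray(x, dtype=float))
    if p == np.inf:
        return x.max()
    return (x**p).sum() ** (1 / p)

x = [3.0, -4.0]
print(p_norm(x, 2))       # 5.0
print(p_norm(x, 1))       # 7.0
print(p_norm(x, np.inf))  # 4.0
print(p_norm(x, 50))      # already very close to the sup-norm 4.0
```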

Example 2.2 One of the easiest non-finite-dimensional normed spaces that can be
shown to be complete is

   l∞ (= l∞(R)) = {x = (x_n)_{n∈N} : sup_n |x_n| = ||x||∞ < ∞}.

If a sequence

   {x^{(j)}} ⊂ l∞

is a Cauchy sequence, then

   ||x^{(j)} − x^{(k)}||∞ → 0   as j, k → ∞.

In particular, for every n ∈ N,

   |x_n^{(j)} − x_n^{(k)}| ≤ ||x^{(j)} − x^{(k)}||∞.

Hence

   {x_n^{(j)}} ⊂ R

is a Cauchy sequence in R, and consequently it converges to, say, x_n. It is then easy
to check that

   x = (x_n)_{n∈N}

belongs to l∞, and x^{(j)} → x, that is to say,

   ||x^{(j)} − x||∞ → 0   as j → ∞.

l∞ is not a finite-dimensional vector space, as it does not admit a basis with a finite
number of sequences.
Example 2.3 This is an extension of Example 2.2. Let X be a non-empty set, and
take

   l∞(X) = {f : X → R : f(X) is bounded in R},

and put

   ||f||∞ = sup{|f(x)| : x ∈ X}.

Just as above, (l∞(X), || · ||∞) is a Banach space.


Example 2.4 Another fundamental example is that of the space of bounded,
continuous functions C⁰(Ω) in an open set Ω of R^N, under the sup-norm. In fact, if X
is a general topological space, and C⁰(X) denotes the space of bounded, continuous
functions as a subspace of l∞(X), then because C⁰(X) is a closed subspace of l∞(X)
(this is a simple exercise), it becomes a Banach space in its own right under the sup-
norm.
The most important examples in Applied Analysis are, by far, the .Lp -spaces. As
a prelude to their study, we look first at spaces .lp for .p ≥ 1, in a similar way as
with .l∞ .

Example 2.5 We first define

   l_p = {x = (x_n)_{n∈N} : ||x||_p^p ≡ Σ_n |x_n|^p < ∞},   p > 0.

It turns out that l_p is a Banach space precisely because

   ||x||_p = (Σ_n |x_n|^p)^{1/p},   p ≥ 1,

is a norm in the vector space of sequences of real numbers without any further
condition. The triangle inequality was checked in an exercise in the last chapter.
The other two conditions are trivial to verify. Hence l_p, for exponent p ≥ 1,
becomes a normed vector space. The reason why it is complete is exactly as in
the l∞ case. Simply note that

   |x_n^{(j)} − x_n^{(k)}| ≤ ||x^{(j)} − x^{(k)}||_p,   x^{(j)} = (x_n^{(j)}),   x^{(k)} = (x_n^{(k)}),

for every n. Hence if {x^{(j)}} is a Cauchy sequence in l_p, each component is a Cauchy
sequence of real numbers. The case p = 2 is very special, as we will see later. For
exponent p ∈ (0, 1) the sets l_p are not, in general, vector spaces. See Exercise 26
below.
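The failure for p ∈ (0, 1) can already be seen in two dimensions: the expression (Σ |x_i|^p)^{1/p} violates the triangle inequality there. A small sketch of our own, with p = 1/2, x = (1, 0) and y = (0, 1):

```python
def quasi_norm(x, p):
    # the expression (Σ |x_i|^p)^(1/p); for p < 1 it is no longer a norm
    return sum(abs(t)**p for t in x) ** (1 / p)

p = 0.5
x, y, s = (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)   # s = x + y
print(quasi_norm(s, p))                        # (1 + 1)^2 = 4.0
print(quasi_norm(x, p) + quasi_norm(y, p))     # 1.0 + 1.0 = 2.0 < 4.0
```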

2.3 Completion of Normed Spaces

Every normed space can be densely embedded in a Banach space. Every metric
space can be densely embedded in a complete metric space. The process is similar
to the passage from the rationals to the reals. Recall Example 2.3 above.
Theorem 2.1
1. Let (E, d) be a metric space. A complete metric space (Ê, d̂) and an isometry

   ⏀ : (E, d) → (Ê, d̂)

can be found such that ⏀(E) is dense in Ê. (Ê, d̂) is unique modulo isometries.
2. Every normed space (E, || · ||) can be densely embedded in a Banach space
(Ê, || · ||) through a linear isometry

   ⏀ : (E, || · ||) → (Ê, || · ||).

Proof Choose a point p ∈ E, and define

   ⏀ : (E, d) → l∞(E),   ⏀(q)(x) = d(q, x) − d(p, x).

Because of the triangle inequality, it is true that

   |d(q, x) − d(p, x)| ≤ d(q, p)

for every x ∈ E, and

   ||⏀(q)||∞ ≤ d(q, p).

In a similar way, for every pair q_i, i = 1, 2,

   ||⏀(q₁) − ⏀(q₂)||∞ ≤ d(q₁, q₂).

However,

   |⏀(q₁)(q₂) − ⏀(q₂)(q₂)| = d(q₁, q₂),

and, hence,

   ||⏀(q₁) − ⏀(q₂)||∞ = d(q₁, q₂).

Take

   (Ê, d̂) = (⏀(E), || · ||∞)

where closure is meant in the Banach space .l∞ (E). The uniqueness is straightfor-
ward. In fact, suppose that

   ⏀_i : (E, d) → ⏀_i(E) ⊂ (E_i, d_i),   i = 1, 2,

with ⏀_i(E) dense in E_i. The mapping

   ⏀ = ⏀₂ ∘ ⏀₁⁻¹ : ⏀₁(E) → ⏀₂(E)

can be extended, by completeness and because it is an isometry, to E₁; call the extension

   ⏀ : (E₁, d₁) → (E₂, d₂),

so that

   ⏀|_{⏀₁(E)} = ⏀₂ ∘ ⏀₁⁻¹;

and, vice versa, there is

   Θ : (E₂, d₂) → (E₁, d₁),   Θ|_{⏀₂(E)} = ⏀₁ ∘ ⏀₂⁻¹.

Therefore

   ⏀ ∘ Θ = 1|_{E₂},   Θ ∘ ⏀ = 1|_{E₁},

in respective dense subsets. The claim is proved, then, through completeness.


Since (E, || · ||) can be understood as a metric space under the distance

   d(x, y) = ||x − y||,

the second part is a consequence of the first. Let

   ⏀ : (E, || · ||) → (Ê, d̂)

be such an isometry with ⏀(E) densely embedded in Ê. If x, y are elements of Ê,
one can take Cauchy sequences {x_j}, {y_j} in E such that

   ⏀(x_j) → x,   ⏀(y_j) → y.

In that case {x_j + y_j} is also a Cauchy sequence and, by the completeness of
Ê, ⏀(x_j + y_j) → x + y. The same is correct for the product by scalars. It is
straightforward to check the independence of these manipulations from the chosen
Cauchy sequences. Finally,

   ||x|| = d̂(x, 0) = lim_{j→∞} d̂(⏀(x_j), 0) = lim_{j→∞} ||x_j||

and

   d̂(x, y) = ||x − y||.



This abstract result informs us that every time we have a norm on a vector
space, we can automatically consider its completion with respect to that norm. The
resulting space becomes a Banach space in which the starting one is dense. This is
often the way to tailor new spaces for specific purposes.

2.4 Lp -Spaces

Let Ω ⊂ R^N be an open subset. dx will indicate the Lebesgue measure in R^N. We
assume that readers have a sufficient basic background in Measure Theory to know
what it means for a function

   f(x) : Ω → R

to be measurable and integrable, and to have a finite essential supremum. All
functions considered are measurable without further notice. In fact, it is usually
written

   L¹(Ω) = {f(x) : Ω → R : f is integrable},

and one defines

   ||f||₁ = ∫_Ω |f(x)| dx.

Definition 2.3 Let p ≥ 1. We put

   L^p(Ω) = {f(x) : Ω → R : |f(x)|^p is integrable},

and

   L∞(Ω) = {f(x) : Ω → R : esssup_{x∈Ω} |f(x)| < +∞}.

The main objective of this section is to show that L^p(Ω) is a Banach space for every
p ∈ [1, +∞] with norm

   ||f||_p^p = ∫_Ω |f(x)|^p dx,   ||f||∞ = esssup_{x∈Ω} |f(x)|.             (2.1)

This goal proceeds in two main steps:

1. show that L^p(Ω) is a vector space, and || · ||_p is a norm; and
2. show that L^p(Ω) is complete under this p-th norm.

It is trivial to check that

   ||λf||_p = |λ| ||f||_p

for every scalar λ ∈ R, and every f ∈ L^p(Ω). More involved is to ensure that

   f + g ∈ L^p(Ω)

as soon as both f and g are functions in L^p(Ω). But in fact, this is a direct
consequence of the fact that || · ||_p is a norm; more particularly, of the triangle
inequality N2 in Definition 2.1. Indeed, given a large vector space E, if || · || is a
norm over E that can sometimes take on the value +∞, then

   E_{||·||} = {x ∈ E : ||x|| < +∞}

is a vector subspace of E. All we have to care about, then, is that the p-th norm given
above is indeed a norm in the space of measurable functions over Ω. Both N1 and
N3 are trivial. We focus on N2. This is in turn a direct consequence of an important
inequality for numbers.
Lemma 2.1 (Young’s Inequality) For real numbers a and b, and exponent p > 1,
we have

   |ab| ≤ (1/p)|a|^p + (1/q)|b|^q,   1/p + 1/q = 1.

Proof It is elementary to check that the function log x is concave for positive x.
This exactly means that

   t₁ log a₁ + t₂ log a₂ ≤ log(t₁a₁ + t₂a₂),   t_i ≥ 0,   t₁ + t₂ = 1,   a_i > 0.

For the choice

   t₁ = 1/p,   t₂ = 1/q,   a₁ = |a|^p,   a₂ = |b|^q,

we find, through the basic properties of logarithms, that

   log |ab| = (1/p) log |a|^p + (1/q) log |b|^q ≤ log((1/p)|a|^p + (1/q)|b|^q).

The increasing monotonicity of the logarithmic function implies the claimed
inequality. ⨆

Another inequality, necessary here but fundamental in its own right, follows. It is a
consequence of the previous one.

Lemma 2.2 (Hölder’s Inequality) If f ∈ L^p(Ω) and g ∈ L^q(Ω), i.e.

   ||f||_p < ∞,   ||g||_q < ∞,

with 1 = 1/p + 1/q, then the product fg is integrable (fg ∈ L¹(Ω)), and

   ||fg||₁ ≤ ||f||_p ||g||_q.



Proof Apply Young’s inequality to the choice

   a = |f(x)|/||f||_p,   b = |g(x)|/||g||_q,

for an arbitrary point x ∈ Ω, to obtain

   |f(x)||g(x)| / (||f||_p ||g||_q) ≤ (1/p) |f(x)|^p/||f||_p^p + (1/q) |g(x)|^q/||g||_q^q.

Rearranging terms and integrating over Ω, we arrive at our conclusion. ⨆
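Hölder's inequality is easy to test numerically. The sketch below (our own) takes Ω = (0, 1), represents f and g as random step functions on a uniform partition, and compares ||fg||₁ with ||f||_p ||g||_q for the conjugate pair p = 3, q = 3/2.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 3.0, 1.5          # conjugate exponents: 1/3 + 2/3 = 1

n = 1000                 # Ω = (0, 1) split into n cells; f, g are step functions
dx = 1.0 / n
f = rng.normal(size=n)
g = rng.normal(size=n)

def norm(h, r):
    # ||h||_r for a step function constant on each cell of width dx
    return (np.sum(np.abs(h)**r) * dx) ** (1 / r)

lhs = np.sum(np.abs(f * g)) * dx      # ||f g||_1
rhs = norm(f, p) * norm(g, q)         # ||f||_p ||g||_q
print(lhs, "<=", rhs)
```

Equality would require |f|^p and |g|^q to be proportional, so for random data the inequality is strict.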



We are now ready to show the following.
Proposition 2.1 For p ∈ [1, +∞], the p-th norm in (2.1) is a norm in the space of
measurable functions in Ω, and consequently L^p(Ω) is a vector space.

Proof The limit cases p = 1 and p = ∞ are straightforward (left as an exercise).
Take 1 < p < ∞, and let f and g be two measurable functions defined in Ω. Then
for each individual point x ∈ Ω, we have

   |f(x) + g(x)|^p ≤ |f(x) + g(x)|^{p−1} |f(x)| + |f(x) + g(x)|^{p−1} |g(x)|.

We apply Hölder’s inequality to both terms to find

   ||f + g||_p^p ≤ || |f + g|^{p−1} ||_q ||f||_p + || |f + g|^{p−1} ||_q ||g||_p.      (2.2)

If we note that q = p/(p − 1), then it is immediate to check that

   || |f + g|^{p−1} ||_q = ||f + g||_p^{p−1},

and taking this identity to (2.2), we conclude the proof. ⨆



The second step is equally important.
Proposition 2.2 L^p(Ω) is a Banach space for all p ∈ [1, +∞], and every open
subset Ω ⊂ R^N.

Proof The case p = ∞ was already examined in the previous section. Take a finite
p ≥ 1, and suppose that {f_j} is a Cauchy sequence in the vector space L^p(Ω), i.e.

   ||f_j − f_k||_p → 0   as j, k → ∞.

It is then easy to argue that for a.e. x ∈ Ω,

   |f_j(x) − f_k(x)| → 0   as j, k → ∞.                               (2.3)



Indeed, if for each fixed ε > 0, we let

   Ω_{ε,j,k} ≡ {x ∈ Ω : |f_j(x) − f_k(x)| ≥ ε},

then

   lim_{j,k→∞} |Ω_{ε,j,k}| = 0.

If, seeking a contradiction, this were not true, then for some ε > 0, we could find
δ > 0, such that

   lim_{j,k→∞} |Ω_{ε,j,k}| ≥ δ.

But then

   0 < δε^p ≤ lim_{j,k→∞} ∫_{Ω_{ε,j,k}} |f_j(x) − f_k(x)|^p dx
            ≤ lim_{j,k→∞} ∫_Ω |f_j(x) − f_k(x)|^p dx
            = lim_{j,k→∞} ||f_j − f_k||_p^p,

which is a contradiction. Equation (2.3) is therefore correct, and there is a limit
function f(x). For k, fixed but arbitrary, set

   g_j(x) = |f_j(x) − f_k(x)|^p.

Each g_j is integrable, non-negative, and

   ∫_Ω g_j(x) dx = ∫_Ω |f_j(x) − f_k(x)|^p dx.

Hence {g_j} is bounded in L¹(Ω). By the classical Fatou lemma, we can conclude
that

   ∫_Ω |f(x) − f_k(x)|^p dx ≤ lim inf_{j→∞} ||f_j − f_k||_p^p

for all such k. Taking limits in k on both sides of this inequality, we deduce that

   lim sup_{k→∞} ∫_Ω |f(x) − f_k(x)|^p dx ≤ lim_{j,k→∞} ||f_j − f_k||_p^p = 0.

Then

   ||f − f_k||_p → 0,

and f is the limit function. ⨆



There are many more important properties of these fundamental spaces, some of
which will be stated and proved later.

2.5 Weak Derivatives

Our intention of not losing sight of variational principles pushes us to move beyond
L^p-spaces, as we need to cope with derivatives as an essential ingredient. To be more
specific, suppose we consider the one-dimensional functional

   I(u) = ∫_{x₀}^{x₁} F(u(x), u'(x)) dx                               (2.4)

for competing functions

   u(x) : [x₀, x₁] ⊂ R → R,

where the integrand

   F(u, z) : R × R → R

is supposed smooth and regular. At first sight, as soon as we write the derivative
u'(x), we seem to have no alternative but to restrict attention to C¹-functions.
Otherwise, we do not know what the derivative u'(x) might mean at those points
where u is not differentiable. The truth is that, because the value of the integral in
(2.4) is determined uniquely when the integrand

   F(u(x), u'(x))

is defined except, possibly, at a finite number of points, we realize that we can permit
piecewise C¹ functions to enter into the optimization process. Indeed, we can allow
many more functions.
It all starts with the well-known integration-by-parts formula that is taught in elementary Calculus courses. That formula, in turn, is a direct consequence of the product rule for differentiation and the Fundamental Theorem of Calculus. It reads
$$\int_a^b f'(x)g(x)\,dx = f(x)g(x)\Big|_{x=a}^{x=b} - \int_a^b f(x)g'(x)\,dx. \tag{2.5}$$

In such basic courses, one is told that this formula is valid whenever both functions $f$ and $g$ are continuously differentiable in the interval $[a, b]$. This is indeed so. Suppose, however, that $f$ is only continuous in $[a, b]$ but fails to be differentiable at some points, and that we could, somehow, find another function $F$, not even continuous, just integrable, in such a way that
$$\int_a^b F(x)g(x)\,dx = -\int_a^b f(x)g'(x)\,dx \tag{2.6}$$
holds for every continuously differentiable function $g$ with $g(a) = g(b) = 0$.


Example 2.6 Take
$$a = -1, \quad b = 1, \quad f(x) = |x|,$$
which is not differentiable at the origin, and
$$F(x) = \begin{cases} -1, & -1 \le x < 0,\\ 1, & 0 < x \le 1.\end{cases}$$

We will take $g(x)$ to be an arbitrary continuously differentiable function with
$$g(-1) = g(1) = 0.$$

Because integrals enjoy the additivity property with respect to intervals of integration, we can certainly write
$$\int_{-1}^1 F(x)g(x)\,dx = \int_{-1}^0 F(x)g(x)\,dx + \int_0^1 F(x)g(x)\,dx,$$
and then
$$\int_{-1}^0 F(x)g(x)\,dx = -\int_{-1}^0 g(x)\,dx, \qquad \int_0^1 F(x)g(x)\,dx = \int_0^1 g(x)\,dx.$$

On the other hand,
$$\int_{-1}^1 f(x)g'(x)\,dx = \int_{-1}^0 (-x)g'(x)\,dx + \int_0^1 x g'(x)\,dx,$$
and a "true" integration by parts in these two integrals separately clearly yields
$$\int_{-1}^1 f(x)g'(x)\,dx = \int_{-1}^0 g(x)\,dx - \int_0^1 g(x)\,dx = -\int_{-1}^1 F(x)g(x)\,dx.$$

We therefore see that formula (2.6) is formally correct if we put $f'(x) = F(x)$. Notice that the difficulty at $x = 0$ does not have any relevance in these computations, because what happens at a single point (a set of measure zero) is irrelevant for integrals.
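The computation in Example 2.6 is easy to confirm numerically. The following sketch (our own; any $C^1$ test function $g$ vanishing at $\pm 1$ would do) approximates both sides of (2.6) with the trapezoid rule:

```python
import numpy as np

# Numerical check of the identity (2.6) from Example 2.6:
#   int_{-1}^{1} F(x) g(x) dx  =  - int_{-1}^{1} f(x) g'(x) dx,
# with f(x) = |x|, F(x) = sign(x). The test function g below is our own
# choice: any C^1 function with g(-1) = g(1) = 0 works.
def integrate(y, x):
    # composite trapezoid rule
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

x = np.linspace(-1.0, 1.0, 200_001)
f, F = np.abs(x), np.sign(x)
g = (1 - x**2) * np.exp(x)                 # g(-1) = g(1) = 0
dg = (1 - 2 * x - x**2) * np.exp(x)        # g'(x)

lhs = integrate(F * g, x)
rhs = -integrate(f * dg, x)
print(lhs, rhs)                            # the two values agree to high accuracy
```

The agreement of the two printed values is exactly the statement that $F = \operatorname{sign}$ acts as the weak derivative of $|x|$ against this test function.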
After this example, we suspect that formula (2.6) hides a more general concept
of derivative.
Definition 2.4 Let
$$f(x) : (a, b) \to \mathbb{R}$$
be a measurable, integrable function. Another such function
$$F(x) : (a, b) \to \mathbb{R}$$
is said to be the weak derivative of $f$, and we write $f' = F$, if (2.6) holds for every continuously differentiable function
$$g(x) : [a, b] \to \mathbb{R}, \quad \text{with } g(a) = g(b) = 0.$$

In this way, we would say that $F(x) = x/|x|$ is the weak derivative of $f(x) = |x|$, even if $F$ is not defined at $x = 0$ and $f$ is not differentiable at that same point. We can conclude that formula (2.6) is always correct as long as the derivative of $f$ is understood in a weak sense. Of course, if $f$ is continuously differentiable, then its standard derivative is also its weak derivative.

Example 2.7 Even though the function $f(x) = \sqrt{|x|}$ has a derivative
$$f'(x) = \frac{x}{2|x|\sqrt{|x|}}, \quad x \ne 0,$$
that is not defined at $x = 0$, and, in fact, blows up at this point, this function $f'(x)$ is also the weak derivative of $f$ in the full interval $[-1, 1]$. Indeed, it is easy to check that the formula of integration by parts
$$\int_{-1}^1 f'(x)g(x)\,dx = -\int_{-1}^1 f(x)g'(x)\,dx$$
is correct for every $C^1$-function $g$ vanishing at $x = \pm 1$.
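This identity can be tested numerically despite the blow-up of $f'$ at the origin. The sketch below (our own, using a symmetric midpoint rule so that $f'$ is never evaluated at $x = 0$) approximates both sides:

```python
import numpy as np

# Example 2.7 check: f(x) = sqrt(|x|) has the unbounded classical derivative
#   f'(x) = x / (2 |x| sqrt(|x|)),  x != 0,
# which is still integrable and acts as the weak derivative on [-1, 1]:
#   int f'(x) g(x) dx = - int f(x) g'(x) dx  for C^1 g with g(+-1) = 0.
N = 400_000
h = 2.0 / N
x = -1.0 + h * (np.arange(N) + 0.5)            # cell midpoints, never exactly 0
f = np.sqrt(np.abs(x))
df = x / (2 * np.abs(x) * np.sqrt(np.abs(x)))  # blows up at 0 but is integrable
g = (1 - x**2) * np.exp(x)                     # our own test function, g(+-1) = 0
dg = (1 - 2 * x - x**2) * np.exp(x)

lhs = float(np.sum(df * g) * h)                # midpoint rule for both integrals
rhs = -float(np.sum(f * dg) * h)
print(lhs, rhs)
```

Note also that $\int_{-1}^1 |f'(x)|\,dx = 2$ is finite, which is what makes $f'$ an admissible weak derivative.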



2.6 One-Dimensional Sobolev Spaces

We are ready to introduce the fundamental Sobolev spaces of weakly differentiable functions that are so indispensable for PDEs and the Calculus of Variations. We will restrict attention now to the one-dimensional situation, and defer the treatment of the more involved higher-dimensional case.
Definition 2.5 Let $J = (a, b) \subset \mathbb{R}$ be a finite interval, and take $1 \le p \le \infty$. The Sobolev space $W^{1,p}(J)$ is the subspace of functions $f$ of $L^p(J)$ with a weak derivative $f'$ which is also a function in $L^p(J)$, namely
$$W^{1,p}(J) = \{f \in L^p(J) : f' \in L^p(J)\}.$$

The first thing to do is to determine the norm in $W^{1,p}(J)$, and to ensure that $W^{1,p}(J)$ is indeed a Banach space.

Proposition 2.3 For $J$ and $p$ as above, $W^{1,p}(J)$ is a Banach space under the norm
$$\|f\|_{1,p} = \|f\|_p + \|f'\|_p.$$
Equivalently (for finite $p$), we can also take
$$\|f\|_{1,p}^p = \int_J [|f(x)|^p + |f'(x)|^p]\,dx.$$

Proof The fact that $\|\cdot\|_{1,p}$ is a norm in $W^{1,p}(J)$ is a direct consequence of the fact that $\|\cdot\|_p$ is one in $L^p(J)$. To show that $W^{1,p}(J)$ is a Banach space is now easy, given that convergence in the norm $\|\cdot\|_{1,p}$ means convergence in $L^p(J)$ of functions and derivatives. In fact,
$$\|f_j - f_k\|_{1,p} \to 0 \quad \text{as } j, k \to \infty$$
is equivalent to
$$\|f_j - f_k\|_p,\ \|f_j' - f_k'\|_p \to 0 \quad \text{as } j, k \to \infty.$$
There are then functions $f, g \in L^p(J)$ with
$$f_j \to f, \quad f_j' \to g$$
in $L^p(J)$. It remains to check that $g = f'$. To this end, take an arbitrary $C^1$-function
$$\varphi(x) : J \to \mathbb{R}, \quad \varphi(a) = \varphi(b) = 0.$$

From
$$\int_J f_j(x)\varphi'(x)\,dx = -\int_J f_j'(x)\varphi(x)\,dx,$$
by taking limits in $j$, we find (justification: left as an exercise)
$$\int_J f(x)\varphi'(x)\,dx = -\int_J g(x)\varphi(x)\,dx.$$
The arbitrariness of $\varphi$ in this identity exactly means that $f' = g$, hence
$$f \in W^{1,p}(J), \quad f_j \to f$$
in the Sobolev space. $\square$



In dimension one, there is essentially a unique source of functions in $W^{1,p}(J)$. It is a consequence of the Fundamental Theorem of Calculus.

Lemma 2.3 Let $g \in L^p(J)$, and put
$$f(x) = \int_a^x g(s)\,ds.$$
Then $f \in W^{1,p}(J)$, and $f' = g$.


Proof We first check that $f$, so defined, belongs to $L^p(J)$. Let $p \ge 1$ be finite. Note that for each $x \in J$, by Hölder's inequality applied to the two factors $g \in L^p(J)$ and the characteristic function of the interval of integration,
$$|f(x)| \le \int_a^x |g(s)|\,ds \le \|g\|_p |x - a|^{(p-1)/p} \le \|g\|_p |J|^{(p-1)/p}.$$
The arbitrariness of $x \in J$ shows that $f \in L^\infty(J)$ and, in particular, $f \in L^p(J)$ for every $p$.
Let us check that the formula of integration by parts is correct. So take a $C^1$-function $\varphi(x)$ with $\varphi(a) = \varphi(b) = 0$. Then
$$\int_a^b f(x)\varphi'(x)\,dx = \int_a^b \int_a^x g(s)\varphi'(x)\,ds\,dx.$$
Fubini's theorem permits us to interchange the order of integration in this double integral to find that
$$\int_a^b f(x)\varphi'(x)\,dx = \int_a^b \int_s^b g(s)\varphi'(x)\,dx\,ds.$$

If we change the name of the dummy variables of integration to avoid any confusion, we can also write
$$\int_a^b f(x)\varphi'(x)\,dx = \int_a^b g(x)\int_x^b \varphi'(s)\,ds\,dx,$$
and the Fundamental Theorem of Calculus applied to the smooth function $\varphi$ (recall that $\varphi(b) = 0$) yields
$$\int_a^b f(x)\varphi'(x)\,dx = -\int_a^b g(x)\varphi(x)\,dx.$$
The arbitrariness of $\varphi$ implies that indeed
$$f' = g \in L^p(J), \quad f \in W^{1,p}(J). \qquad \square$$

A funny consequence of this lemma amounts to the validity of the Fundamental Theorem of Calculus for functions in $W^{1,p}(J)$.

Corollary 2.1 Let $f \in W^{1,p}(J)$. Then for every point $c \in J$,
$$f(x) - f(c) = \int_c^x f'(s)\,ds.$$
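Lemma 2.3 and Corollary 2.1 are easy to illustrate numerically: start from a discontinuous $g$, build $f$ by integration, and check the weak-derivative identity against a test function. A sketch (the step function $g$ and the test function $\varphi$ are our own choices):

```python
import numpy as np

# Lemma 2.3 in action: g is a step function in L^p(0, 1) (not continuous),
# f(x) = int_0^x g(s) ds, and we verify the weak-derivative identity
#   int f(x) phi'(x) dx = - int g(x) phi(x) dx
# for a C^1 test function phi with phi(0) = phi(1) = 0.
x = np.linspace(0.0, 1.0, 100_001)
dx = x[1] - x[0]
g = np.where(x < 0.5, 1.0, -2.0)                         # discontinuous g
f = np.concatenate([[0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * dx)])  # f = int g

phi = np.sin(np.pi * x)
dphi = np.pi * np.cos(np.pi * x)

def trap(y):
    return float(np.sum(0.5 * (y[1:] + y[:-1])) * dx)    # trapezoid rule

lhs = trap(f * dphi)
rhs = -trap(g * phi)
print(lhs, rhs)                                          # both approximate 1/pi
```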

2.6.1 Basic Properties

It is also important to study some basic properties of these spaces. In particular, we would like to learn to appreciate how much more regular functions in $W^{1,p}(J)$ are compared to functions in $L^p(J) \setminus W^{1,p}(J)$. One main point, from the perspective of the variational problems we would like to study, is to check how end-point constraints like
$$u(x_0) = u_0, \quad u(x_1) = u_1$$
for given values $u_0, u_1 \in \mathbb{R}$ can be enforced. Note that such a condition is meaningless for functions in $L^p(J)$, $J = (x_0, x_1)$, because functions in $L^p$-spaces are defined except on negligible subsets, and the individual points $x_0$ and $x_1$ do have measure zero. As usual, $J$ is a bounded interval, either $(a, b)$ or $(x_0, x_1)$.
Proposition 2.4 Let $p \in [1, \infty]$. Every function in $W^{1,p}(J)$ is absolutely continuous. If $p > 1$, every bounded set in $W^{1,p}(J)$ is equicontinuous.

Proof Our starting point is the Fundamental Theorem of Calculus of Corollary 2.1:
$$f(x) - f(y) = \int_y^x f'(s)\,ds, \qquad |f(x) - f(y)| \le \int_y^x |f'(s)|\,ds. \tag{2.7}$$
If we use Hölder's inequality in the integral on the right-hand side, for the factors $|f'|$ and the characteristic function of the interval of integration as before, we deduce that
$$|f(x) - f(y)| \le \|f'\|_p |x - y|^{(p-1)/p}. \tag{2.8}$$

This inequality clearly implies that every uniformly bounded set (in particular, a single function) in $W^{1,p}(J)$ is equicontinuous, if $p > 1$. For the case $p = 1$, the inequality breaks down because the size of $|x - y|$ is lost on the right-hand side. However, (2.7) still implies that a single function in $W^{1,1}(J)$ is absolutely continuous (why is the argument not valid for an infinite number of functions in a uniformly bounded set in $W^{1,1}(J)$?). Inequality (2.8) is expressed by saying that a bounded sequence of functions in $W^{1,p}(J)$ with $1 < p < \infty$ is uniformly Hölder continuous with exponent
$$\alpha = 1 - 1/p \in (0, 1).$$
For $p = \infty$, the exponent is $\alpha = 1$, and we say that the set is uniformly Lipschitz. $\square$
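Estimate (2.8) can be watched in action on a concrete function with an unbounded derivative. In the sketch below (the function $f(x) = x^{3/4}$ and the sampling grid are our own choices), the worst sampled Hölder ratio with exponent $\alpha = 1 - 1/p = 1/2$ stays below $\|f'\|_2$:

```python
import numpy as np

# Estimate (2.8): |f(x) - f(y)| <= ||f'||_p |x - y|^{(p-1)/p}.
# Sketch for f(x) = x^{3/4} on [0, 1] with p = 2: f'(x) = (3/4) x^{-1/4}
# blows up at 0 yet lies in L^2(0, 1), so f is (1/2)-Hölder continuous.
N = 200_000
h = 1.0 / N
m = h * (np.arange(N) + 0.5)                    # midpoints, avoiding x = 0
norm_df = float(np.sum((0.75 * m**-0.25) ** 2) * h) ** 0.5   # ||f'||_2

xs = np.linspace(0.0, 1.0, 401)
fx = xs ** 0.75
X, Y = np.meshgrid(xs, xs)
FX, FY = np.meshgrid(fx, fx)
mask = X > Y
holder_ratio = float(np.max(np.abs(FX - FY)[mask] / (X - Y)[mask] ** 0.5))
print(holder_ratio, norm_df)                    # ratio never exceeds the norm
```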


As a consequence of this proposition, we can define the following important subspace of $W^{1,p}(J)$: since functions in $W^{1,p}(J)$ are continuous, their values at individual points are well-defined.

Definition 2.6 The space $W_0^{1,p}(J)$ is the subspace of functions of $W^{1,p}(J)$ whose end-point values vanish:
$$W_0^{1,p}(J) = \{u \in W^{1,p}(J) : u(a) = u(b) = 0\}, \quad J = (a, b).$$

Proposition 2.4, together with the classic Arzelà–Ascoli theorem, implies that every bounded sequence in $W^{1,p}(J)$, with $p > 1$, admits a subsequence converging, in the $L^\infty(J)$-norm, to some function in this space. This deserves closer attention.

2.6.2 Weak Convergence

Suppose that
$$f_j \to f \text{ in } L^\infty(J), \quad f_j \in W^{1,p}(J), \quad f \in L^\infty(J). \tag{2.9}$$

In this situation, there are two issues that we would like to clarify:
1. Is it true that in fact $f \in W^{1,p}(J)$, i.e. there is some $f' \in L^p(J)$ so that
$$\int_J f(x)\varphi'(x)\,dx = -\int_J f'(x)\varphi(x)\,dx$$
for every $\varphi \in C_0^1(J)$?¹
2. If so, in what sense do the derivatives $f_j'$ converge to $f'$?

Under (2.9), we can write
$$\int_J f_j'(x)\varphi(x)\,dx = -\int_J f_j(x)\varphi'(x)\,dx \to -\int_J f(x)\varphi'(x)\,dx. \tag{2.10}$$

Since this is correct for arbitrary functions $\varphi$ in $C_0^1(J)$, one can look at $\varphi$ as a functional variable, and regard all those integrals as "linear operations" on $\varphi$. In particular, Hölder's inequality implies that
$$\left|\int_J f_j'(x)\varphi(x)\,dx\right| \le \|f_j'\|_p\|\varphi\|_q, \qquad \frac{1}{p} + \frac{1}{q} = 1.$$
The boundedness of $\{f_j\}$ in $W^{1,p}(J)$ indicates that the sequence of linear operations
$$\langle T_j, \varphi\rangle = \int_J f_j'(x)\varphi(x)\,dx$$
is such that
$$|\langle T_j, \varphi\rangle| \le C\|\varphi\|_q, \tag{2.11}$$

with $C$ independent of $j$. This situation somehow forces us to care about uniformly bounded sequences of linear operations.
Suppose one could argue somehow that, under (2.11), there is some function $g \in L^p(J)$ with
$$\langle T_j, \varphi\rangle = \int_J f_j'(x)\varphi(x)\,dx \to \int_J g(x)\varphi(x)\,dx.$$
Then (2.10) would imply that
$$\int_J g(x)\varphi(x)\,dx = -\int_J f(x)\varphi'(x)\,dx$$

1 The subscript 0 means that functions in this space vanish at the end-points of J.



for all $\varphi \in C_0^1(J)$, and this would precisely mean that indeed
$$f \in W^{1,p}(J), \quad f' = g.$$

Once we know this, the Fundamental Theorem of Calculus of Corollary 2.1 leads to the fact that
$$\int_x^y f_j'(s)\,ds = f_j(y) - f_j(x) \to f(y) - f(x) = \int_x^y f'(s)\,ds,$$
for arbitrary points $x$ and $y$ in $J$. This suggests that the convergence at the level of the derivatives $f_j'$ to $f'$ is such that
$$\int_x^y f_j'(s)\,ds \to \int_x^y f'(s)\,ds$$

for arbitrary points $x, y \in J$. This is indeed a notion of convergence as good as any other, but definitely different from the usual point-wise convergence, as the following example shows.

Example 2.8 We recover Example 1.1, where
$$u_j(x) = \sin(2j\pi x), \quad x \in [0, 1].$$

We already argued that there is no way to find a function $u(x)$ so that
$$u_j(x) \to u(x)$$
for a.e. $x \in [0, 1]$, not even for arbitrary subsequences. Yet, it is not difficult to check that
$$\int_a^b u_j(x)\,dx \to 0 = \int_a^b 0\,dx$$
for every arbitrary pair of points $a < b$ in $[0, 1]$ (left as an exercise). What this fact means is that the trivial, identically vanishing function is, somehow, determined by the sequence $\{u_j\}$ by passage to the limit in a special way. This is called weak convergence, or convergence in the mean. Note that what makes this possible is the persistent cancellation of the positive and negative parts of $u_j$ when computing the integral. In other words,
$$\left|\int_a^b u_j(x)\,dx\right| \to 0 \quad \text{as } j \to \infty,$$

while it is not true that
$$\int_a^b |u_j(x)|\,dx \to 0.$$
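The two displayed limits are easy to confirm numerically (a sketch of our own; the subinterval $(a, b)$ is arbitrary):

```python
import numpy as np

# Example 2.8, numerically: u_j(x) = sin(2 j pi x) cancels in the mean,
#   | int_a^b u_j dx | -> 0  for every subinterval (a, b) of [0, 1],
# while the absolute integrals do not vanish:
#   int_a^b |u_j| dx -> (b - a) * 2/pi.
a, b = 0.0, 0.37
x = np.linspace(a, b, 1_000_001)
dx = x[1] - x[0]

def trap(y):
    return float(np.sum(0.5 * (y[1:] + y[:-1])) * dx)    # trapezoid rule

u = np.sin(2 * 100 * np.pi * x)        # j = 100
weak = trap(u)                          # tiny: persistent cancellation
mass = trap(np.abs(u))                  # stays of order (b - a) * 2/pi
print(weak, mass)
```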

Example 2.9 It is important to start realizing why the exponent $p = 1$ in Lebesgue and Sobolev spaces is so special. The main issue that we would like to emphasize, from the scope of our interest in variational problems, refers to weak compactness. We can understand the difficulty with a very simple example. Take
$$u_j(x) = j\chi_{[0,1/j]}(x) = \begin{cases} j, & 0 \le x \le 1/j,\\ 0, & \text{else},\end{cases}$$
and put
$$U_j(x) = \int_0^x u_j(y)\,dy.$$
A simple computation and picture convince us that the sequence $\{U_j\}$ is uniformly bounded in $W^{1,1}(0, 1)$, and yet there cannot be a limit in whatever reasonable sense we may define it: it would be sensible to expect the function $U \equiv 1$ to be such a limit, but then we would have
$$0 = \int_0^1 U'(t)\,dt, \qquad \lim_{j\to\infty}\int_0^1 U_j'(x)\,dx = 1,$$
which would be pretty weird for a limit function $U$. The phenomenon happening with the sequence of derivatives $\{u_j\}$ is informally referred to as a concentration phenomenon at $x = 0$: the derivatives concentrate their full finite "mass" in a smaller and smaller region around $x = 0$ as $j$ becomes larger and larger; this concentration of mass forces a breaking of continuity at that point.
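A few lines of code make the concentration in Example 2.9 explicit (a small sketch of our own):

```python
# Example 2.9: u_j = j * indicator of [0, 1/j] keeps unit mass while
# concentrating at 0, and U_j(x) = int_0^x u_j = min(jx, 1) converges
# pointwise (for x > 0) to the discontinuous step U = 1 on (0, 1],
# which cannot belong to W^{1,1}(0, 1).
def U(j, x):
    return min(j * x, 1.0)

for j in (10, 100, 1000):
    mass = j * (1.0 / j)               # int_0^1 u_j dx = 1 for every j
    print(j, mass, U(j, 0.0), U(j, 0.25))
```

The derivative mass never decreases, yet it escapes into an ever-smaller neighborhood of the origin: exactly the failure of weak compactness in $W^{1,1}$.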
We see that we are forced to look more closely into linear operations on Banach
spaces. This carries us directly into the dual space.

2.7 The Dual Space

This section is just a first incursion into the relevance of the dual space.

Definition 2.7 Let $\mathbb{E}$ be a Banach space. The dual space $\mathbb{E}'$ is defined to be the collection of all linear and continuous functionals defined on $\mathbb{E}$:
$$\mathbb{E}' = \{T : \mathbb{E} \to \mathbb{R} : T \text{ linear and continuous}\}.$$



A clear criterion to decide whether a linear operation is continuous follows.

Proposition 2.5 A linear functional
$$T : \mathbb{E} \to \mathbb{R}$$
belongs to $\mathbb{E}'$ if and only if there is $C > 0$ with
$$|T(x)| \le C\|x\| \tag{2.12}$$
for all $x \in \mathbb{E}$.
Proof On the one hand, if $x_j \to x$ in $\mathbb{E}$, then
$$|Tx_j - Tx| = |T(x_j - x)| \le C\|x_j - x\| \to 0$$
as $j \to \infty$, and $T$ is continuous. On the other hand, put
$$C = \sup_{\|x\|=1} |T(x)|, \tag{2.13}$$
and suppose that $C = +\infty$, so that there is a certain sequence $x_j$ with
$$\|x_j\| = 1, \quad T(x_j) \to \infty.$$
In this case
$$\frac{1}{T(x_j)}x_j \to 0,$$
and yet, due to linearity,
$$T\left(\frac{1}{T(x_j)}x_j\right) = 1,$$
which contradicts the continuity of $T$. The constant $C$ in (2.13) is then finite, and, again through linearity, the estimate (2.12) is correct. $\square$

This proposition is the basis for the natural possibility of regarding $\mathbb{E}'$ as a Banach space in its own right by putting
$$\|T\| = \sup_{\|x\|=1} |T(x)|$$
or, in an equivalent way,
$$\|T\| = \inf\{M : |T(x)| \le M\|x\| \text{ for all } x \in \mathbb{E}\}.$$



Proposition 2.6 If $\mathbb{E}$ is a normed space (not necessarily complete), then $\mathbb{E}'$ is a Banach space endowed with this norm.

The proof of this fact is elementary, and is left as an exercise.
Quite often, we write
$$T(x) = \langle T, x\rangle,$$
and refer to it as the duality pair or product, to stress the symmetric roles played by $T$ and $x$ in such an identity. As a matter of fact, if $x' \in \mathbb{E}'$ is an arbitrary element, the duality pair and the corresponding estimate
$$\langle x', x\rangle \le \|x'\|\,\|x\|$$
between $\mathbb{E}$ and its dual $\mathbb{E}'$ immediately yield that $x \in \mathbb{E}$ can be interpreted, in a canonical way, as an element of the bidual $\mathbb{E}'' = (\mathbb{E}')'$. It is worth distinguishing those spaces for which there is nothing else in $\mathbb{E}''$.

Definition 2.8 A Banach space $\mathbb{E}$ is said to be reflexive if $\mathbb{E} = \mathbb{E}''$.
The most important example of a dual pair for us is the following.

Theorem 2.2 (Riesz Representation Theorem) Let $p \in [1, \infty)$, and let $\Omega \subset \mathbb{R}^N$ be an open subset of finite Lebesgue measure. The dual space of $L^p(\Omega)$ can be identified with $L^q(\Omega)$, $1 = 1/p + 1/q$ (these exponents $p$ and $q$ are called conjugate of each other), through
$$\Theta : L^q(\Omega) \to L^p(\Omega)', \qquad \langle\Theta(g), f\rangle = \int_\Omega f(x)g(x)\,dx \tag{2.14}$$
for $g \in L^q(\Omega)$ and $f \in L^p(\Omega)$.
Proof Through Hölder’s inequality, it is elementary to check that the mapping . Θ in
(2.14) is well-defined, linear, and because

|| Θ (g)|| ≤ ||g||q ,
.

it is also continuous.
We claim that . Θ is onto. To see this, take .T ∈ Lp (Ω )' . For a measurable subset
.A ⊂ Ω , if .χA (x) stands for its characteristic function, given that it belongs to .L (Ω ),
p

we define a measure m in .Ω through formula

. m(A) = <T , χA >. (2.15)

It is immediate to check that the set function m is indeed a measure, and, moreover,
it is absolutely continuous with respect to the Lebesgue measure because if A is
negligible, .χA = 0 in .Lp (Ω ), and so (2.15) yields .m(A) = 0. The Radon-Nykodim

theorem guarantees that there is some $g \in L^1(\Omega)$ such that
$$\langle T, \chi_A\rangle = \int_\Omega \chi_A(x)g(x)\,dx.$$
This identification easily extends to linear combinations of characteristic functions,² and by density to all $f \in L^p(\Omega)$:
$$\langle T, f\rangle = \int_\Omega f(x)g(x)\,dx, \tag{2.16}$$
provided it is true that $g \in L^q(\Omega)$.


The case $p = \infty$ is clearly correct, since we already know that $g \in L^1(\Omega)$. In particular, (2.16) is valid for uniformly bounded functions $f$. Suppose $p < \infty$. Put
$$E_n = \{|g| \le n\}, \quad n \in \mathbb{N},$$
and
$$f_n(x) = \begin{cases} \dfrac{|g(x)|^q}{g(x)}, & x \in E_n,\\[4pt] 0, & \text{else}.\end{cases}$$

Since each $f_n$ is uniformly bounded, by our comments above,
$$\langle T, f_n\rangle = \int_\Omega f_n(x)g(x)\,dx = \int_{E_n} |g(x)|^q\,dx.$$
The continuity of $T$ allows us to write
$$\int_{E_n} |g(x)|^q\,dx \le \|T\|\,\|f_n\|_p = \|T\|\,\|\chi_{E_n}g\|_q^{q/p},$$
and so
$$\|\chi_{E_n}g\|_q \le \|T\|.$$
By the monotone convergence theorem, we conclude that
$$\|g\|_q \le \|T\|. \qquad \square$$


2 These are called simple functions.
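The extremal functions $f_n$ used in the proof can be illustrated in a discrete setting (our own simplification, not in the text): on a finite set with counting measure, the functional $T(f) = \sum_i f_i g_i$ attains its norm $\|g\|_q$ exactly at $f^* = \operatorname{sign}(g)|g|^{q-1}$, while Hölder's inequality caps every other ratio:

```python
import numpy as np

# Discrete sketch of the Riesz duality: on n points with counting measure,
# T(f) = sum_i f_i g_i has operator norm ||g||_q on l^p, attained at
# f* = sign(g) |g|^{q-1}, the analogue of the f_n in the proof.
rng = np.random.default_rng(0)
p, q = 3.0, 1.5                               # conjugate: 1/p + 1/q = 1
g = rng.normal(size=50)
norm_g_q = np.sum(np.abs(g)**q) ** (1 / q)

fstar = np.sign(g) * np.abs(g)**(q - 1)       # the extremal function
attained = np.sum(fstar * g) / np.sum(np.abs(fstar)**p) ** (1 / p)

ratios = []                                   # Hölder: every f gives a smaller ratio
for _ in range(200):
    f = rng.normal(size=50)
    ratios.append(np.sum(f * g) / np.sum(np.abs(f)**p) ** (1 / p))
print(attained, norm_g_q, max(ratios))
```

The identity $(q-1)p = q$ is what makes the extremal ratio collapse to $(\sum|g_i|^q)^{1/q}$ exactly.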



The space $L^\infty(\Omega)$ is very special concerning its dual space. All we can say at this stage is that it contains $L^1(\Omega)$, but it is indeed much larger.

Corollary 2.2 $L^p(\Omega)$ is reflexive in the finite case $1 < p < \infty$.
Example 2.10 The precise identification of the dual space of $C^0(\Omega)$ is technical and delicate, though very important. It requires fundamental tools from Measure Theory. Formally, such a dual space can be identified with the space of signed measures with finite total variation on the $\sigma$-algebra of the Borel subsets of $\Omega$. It can be generalized to a compact Hausdorff space, or even to a locally compact such space.

2.8 Compactness and Weak Topologies

We remind readers that one of our main initial concerns is that of establishing a suitable compactness principle that may enable us to extract a convergent (in some appropriate sense) subsequence from a uniformly bounded sequence. One possibility is to study directly the nature of compact sets of functions in a given Banach space. This is, however, pretty fruitless from this perspective, as the topology associated with the norm in a Banach space is usually too fine, and so conditions on compact sets are rather demanding. In the case of $L^p$-spaces, such a compactness criterion is known and important; yet, as we are saying, it is pretty inoperable in practice, at least from the perspective of our needs in this text.
On the other hand, we have already anticipated how, thanks to the Arzelà–Ascoli theorem, bounded sets in $W^{1,p}(J)$ are precompact in $L^\infty(J)$ (if $p > 1$), and sequences converge to functions which indeed remain in $W^{1,p}(J)$. Moreover, we have also identified the kind of convergence that takes place at the level of derivatives:
$$\int_J \chi_{(x,y)}(s)f_j'(s)\,ds \to \int_J \chi_{(x,y)}(s)f'(s)\,ds, \tag{2.17}$$
for arbitrary points $x, y \in J$, where, as usual, $\chi_{(x,y)}(s)$ designates the characteristic function of the interval $(x, y)$ in $J$. It is easy to realize that (2.17) can be extended to simple functions (linear combinations of characteristic functions), and presumably, by some density argument, to more general functions. This short discussion motivates the following fundamental concept.
Definition 2.9 Let $\mathbb{E}$ be a Banach space, with dual $\mathbb{E}'$.
1. We say that a sequence $\{x_j\} \subset \mathbb{E}$ converges weakly to $x$, and write $x_j \rightharpoonup x$, if for every individual $x' \in \mathbb{E}'$ we have
$$\langle x', x_j\rangle \to \langle x', x\rangle$$
as numbers.
2. We say that a sequence $\{x_j'\} \subset \mathbb{E}'$ converges weakly * to $x'$, and write $x_j' \stackrel{*}{\rightharpoonup} x'$, if for every individual $x \in \mathbb{E}$ we have
$$\langle x_j', x\rangle \to \langle x', x\rangle$$
as numbers.
This may look, at first sight, like too general or abstract a concept. It starts to be more promising if we interpret it in our $L^p$-spaces. In this setting, we would say, after the Riesz representation theorem, that $f_j \rightharpoonup f$ in $L^p(\Omega)$ if
$$\int_\Omega g(x)f_j(x)\,dx \to \int_\Omega g(x)f(x)\,dx$$
for every $g \in L^q(\Omega)$. If we write this condition in the form
$$\left|\int_\Omega g(x)(f_j(x) - f(x))\,dx\right| \to 0,$$
and compare it to
$$\int_\Omega |g(x)(f_j(x) - f(x))|\,dx \to 0,$$

we see that the place where the absolute value is put, either outside or inside the integral sign, determines the kind of convergence we are considering (recall Example 2.8). In the first case, we are talking about weak convergence, while in the second about strong or norm convergence. In the case of weak convergence, a persistent cancellation phenomenon in the integrals is not excluded; indeed, this is the whole point of weak convergence, whereas cancellation is impossible in the second.
In the particular case in which $\Omega$ is a finite interval $J$ of $\mathbb{R}$, and $g$ is the characteristic function of a subinterval $(x, y)$, we would find that weak convergence implies
$$\int_x^y f_j(s)\,ds \to \int_x^y f(s)\,ds,$$

exactly the convergence of derivatives that we had anticipated for bounded sequences in the Sobolev space $W^{1,p}(J)$.

Proposition 2.7 Every bounded sequence in $W^{1,p}(J)$, for $p > 1$, admits a subsequence converging weakly in $W^{1,p}(J)$, and strongly in $L^\infty(J)$, to some function in $W^{1,p}(J)$.

Proof We have already argued why this statement is correct. Such a bounded sequence $\{f_j\}$ in $W^{1,p}(J)$ is equicontinuous, and hence, for a suitable subsequence which we do not care to relabel, it converges to some $f$ in $L^\infty(J)$. Indeed, we showed in Sect. 2.6 that $f \in W^{1,p}(J)$, and that
$$\int_J \chi(x)f_j'(x)\,dx \to \int_J \chi(x)f'(x)\,dx, \tag{2.18}$$
for every characteristic function $\chi$ of a subinterval of $J$. All that remains is to check that (2.18) can be extended from $\chi(x)$ to an arbitrary $g \in L^q(J)$, if $q$ is the conjugate exponent of $p$. This is done through a standard density argument, since the class of simple functions (linear combinations of characteristic functions) is dense in any $L^q(\Omega)$. The argument is left as an exercise. $\square$

Possibly the most important reason to care about weak topologies is the crucial result that follows.

Theorem 2.3 (Banach-Alaoglu-Bourbaki Theorem) Let $\mathbb{E}$ be a Banach space (it suffices that it be a normed space). The unit ball of the dual $\mathbb{E}'$,
$$B(\mathbb{E}') = \{x' \in \mathbb{E}' : \|x'\| \le 1\},$$
is compact in the weak * topology.


Proof Let $L(\mathbb{E}, \mathbb{K})$ denote the set of all mappings from $\mathbb{E}$ into $\mathbb{K}$, i.e.
$$L(\mathbb{E}, \mathbb{K}) = \mathbb{K}^{\mathbb{E}}.$$
It is clear that $\mathbb{E}' \subset L(\mathbb{E}, \mathbb{K})$, since $\mathbb{E}'$ is the subset of $L(\mathbb{E}, \mathbb{K})$ of those mappings that are linear and continuous. It is also easy to realize that
$$B(\mathbb{E}') \subset \Delta = \prod_{x\in\mathbb{E}} \Delta_x, \qquad \Delta_x = \{\lambda \in \mathbb{K} : |\lambda| \le \|x\|\},$$
because, for $x' \in B(\mathbb{E}')$, it is true that
$$|\langle x', x\rangle| \le \|x\|.$$
The set $\Delta$ is compact in the product topology in $\mathbb{K}^{\mathbb{E}}$, because each factor $\Delta_x$, for all $x \in \mathbb{E}$, is (Tychonoff's theorem). It suffices to check that $B(\mathbb{E}') \subset \Delta$ is closed (under the weak * topology).
(under the weak * topology).

To this end, let $x_j' \to x'$ with $x_j' \in B(\mathbb{E}')$,³ with
$$\langle x_j', x\rangle \to \langle x', x\rangle \tag{2.19}$$
for each individual $x \in \mathbb{E}$. It is straightforward to check the linearity of $x'$ if each $x_j'$ is linear. If $x_j' \in B(\mathbb{E}')$,
$$|\langle x_j', x\rangle| \le \|x\|$$
for all $j$ and $x \in \mathbb{E}$. This implies, through (2.19), that
$$|\langle x', x\rangle| \le \|x\|,$$
i.e. $x' \in \mathbb{E}'$ (it is continuous), and $x' \in B(\mathbb{E}')$. $\square$



We can immediately write a corollary which is the version we will invoke most of the time. If a Banach space is reflexive, then $\mathbb{E} = \mathbb{E}''$, and $\mathbb{E}$ turns out to be the dual of its dual. In this case, the weak star topology in $\mathbb{E}''$ becomes the weak topology in $\mathbb{E}$.

Corollary 2.3 Let $\mathbb{E}$ be a reflexive Banach space, and $\{x_j\}$ a bounded sequence in $\mathbb{E}$. There is always a subsequence converging weakly in $\mathbb{E}$.

After this corollary, the following fundamental fact is at our disposal.

Corollary 2.4 Let $J$ be a bounded interval in $\mathbb{R}$, and $\Omega$ an open, bounded subset of $\mathbb{R}^N$. Let $p > 1$ be a finite exponent.
1. Every bounded sequence in $L^p(\Omega)$ admits a weakly convergent subsequence.
2. Every bounded sequence in $W^{1,p}(J)$ admits a subsequence converging weakly in $W^{1,p}(J)$ and strongly in $L^\infty(J)$.
Example 2.11 We go back to our favorite Example 2.8, but this time we put
$$u_j'(x) = \sin(2j\pi x), \quad x \in [0, 1].$$
As before, we still have
$$u_j' \rightharpoonup 0 \text{ in } L^2([0, 1]),$$
but we can take
$$u_j(x) = 1 - \frac{1}{2j\pi}\cos(2j\pi x),$$
and it is pretty obvious that
$$u_j \to u \text{ in } L^\infty([0, 1]), \quad u(x) \equiv 1.$$
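Both convergences can be checked numerically (a sketch of our own):

```python
import numpy as np

# Example 2.11 check: u_j(x) = 1 - cos(2 j pi x)/(2 j pi) converges to
# u = 1 uniformly (strongly in L^inf), while u'_j(x) = sin(2 j pi x)
# converges to 0 only weakly: its L^2-norm does not go to zero.
j = 200
x = np.linspace(0.0, 1.0, 1_000_001)
dx = x[1] - x[0]
u = 1 - np.cos(2 * j * np.pi * x) / (2 * j * np.pi)
du = np.sin(2 * j * np.pi * x)

sup_dist = float(np.max(np.abs(u - 1.0)))     # = 1/(2 j pi), small for large j
l2_sq = float(np.sum(0.5 * ((du**2)[1:] + (du**2)[:-1])) * dx)  # -> 1/2, not 0
print(sup_dist, l2_sq)
```

So the functions converge strongly while their derivatives converge only weakly: precisely the situation of Proposition 2.7.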

Example 2.12 Let $J$ be any finite interval in $\mathbb{R}$ of length $L$, and take $t \in (0, 1)$. There is always a sequence of characteristic functions $\{\chi_j(x)\}$ in $J$ such that
$$\chi_j(x) \rightharpoonup t. \tag{2.20}$$
Note that the limit function is a constant. This weak convergence means that one can always find a sequence $\{J_j\}$ of subsets of $J$ such that
$$\int_{J_j} g(x)\,dx \to t\int_J g(x)\,dx$$
for every integrable function $g$. Take any characteristic function $\chi(x)$ in the unit interval $J_1 = [0, 1]$ such that
$$\int_{J_1} \chi(x)\,dx = t.$$

If $J = (a, a + L)$, then the linear function $x \mapsto Lx + a$ is a bijection between $J_1$ and $J$, and hence
$$\int_J \chi\!\left(\frac{1}{L}(y - a)\right) dy = Lt.$$
If $\chi$ is regarded as a 1-periodic function in $\mathbb{R}$, and we put $\chi_j(x) = \chi(2jx)$, then it is easy to argue that the sequence of functions
$$y \mapsto \chi_j\!\left(\frac{1}{L}(y - a)\right)$$
enjoys property (2.20).
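The construction can be tested numerically on $J = (0, 1)$ (a sketch of our own, with $t = 0.3$ and $g(x) = e^x$ as sample choices):

```python
import numpy as np

# Example 2.12 check on J = (0, 1): chi is the 1-periodic indicator of
# [0, t), and chi_j(x) = chi(2 j x) oscillates faster and faster; then
#   int chi_j g dx -> t * int g dx  for integrable g, i.e. chi_j -> t weakly.
t, j = 0.3, 200
x = np.linspace(0.0, 1.0, 2_000_001)
dx = x[1] - x[0]
chi_j = ((2 * j * x) % 1.0 < t).astype(float)   # characteristic function
g = np.exp(x)

def trap(y):
    return float(np.sum(0.5 * (y[1:] + y[:-1])) * dx)

print(trap(chi_j * g), t * trap(g))             # close for large j
```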
All we have said in this and previous sections about $W^{1,p}(J)$ is applicable to Sobolev spaces $W^{1,p}(J; \mathbb{R}^n)$ of paths
$$\mathbf{u}(t) : J \to \mathbb{R}^n, \quad \mathbf{u} = (u_1, u_2, \ldots, u_n),$$
with components $u_i \in W^{1,p}(J)$. We state the main way in which weakly convergent sequences in one-dimensional Sobolev spaces are manipulated in this more general context. It will be invoked later in the book.

Proposition 2.8 Let $\{\mathbf{u}_j\} \subset W^{1,p}(J; \mathbb{R}^n)$ be such that $\mathbf{u}_j' \rightharpoonup \mathbf{U}$ in $L^p(J; \mathbb{R}^n)$, and $\mathbf{u}_j(x_0) \to \mathbf{U}_0$ in $\mathbb{R}^n$, for some $x_0 \in J$. Then $\mathbf{u}_j \rightharpoonup \mathbf{u}$ in $W^{1,p}(J; \mathbb{R}^n)$, where
$$\mathbf{u}(x) = \mathbf{U}_0 + \int_{x_0}^x \mathbf{U}(s)\,ds.$$
The proof is left as an exercise.

2.9 Approximation

Another important chapter of Lebesgue and Sobolev spaces is to specify explicitly in what sense smooth functions are dense in these spaces, and how these approximation procedures allow us to extend basic results from smooth to non-smooth functions. One initial result in this direction is a classical fact in Measure Theory, which in turn makes use of fundamental results in Topology like Urysohn's lemma or the Tietze extension theorem. Parallel and important results are also Egorov's and Luzin's theorems.

Theorem 2.4 The set of continuous, real functions with compact support in the real line $\mathbb{R}$ is dense in $L^1(\mathbb{R})$.
This theorem, which is also valid for $\mathbb{R}^N$ and which we take for granted here, will be our basic result upon which to build more sophisticated approximation facts.

Lemma 2.4
1. Let $f(x) \in L^1(\mathbb{R})$. There is a sequence $\{f_j(x)\}$ of smooth ($C^\infty$), compactly supported functions such that $f_j \to f$ in $L^1(\mathbb{R})$.
2. Let $f(x) \in W^{1,1}(J)$ for a finite interval $J$. There is a sequence $\{f_j(x)\}$ of smooth ($C^\infty$) functions such that $f_j \to f$ in $W^{1,1}(J)$.
The basic technique to achieve this makes use of the concepts of "mollifier" and convolution that we explain next. Take a smooth $C^\infty$-function $\rho(z)$, even, non-negative, supported in the symmetric interval $[-1, 1]$, and with
$$\int_{\mathbb{R}} \rho(z)\,dz = \int_{-1}^1 \rho(z)\,dz = 1. \tag{2.21}$$
One of the most popular choices is
$$\rho(z) = \begin{cases} C\exp\left(\dfrac{1}{z^2 - 1}\right), & |z| \le 1,\\[4pt] 0, & |z| \ge 1,\end{cases}$$
2.9 Approximation 65

for a positive constant $C$ chosen to ensure that (2.21) holds. Define
$$f_j(x) = \int_{\mathbb{R}} \rho_j(x - y)f(y)\,dy, \qquad \rho_j(z) = j\rho(jz).$$
Note that
$$\int_{\mathbb{R}} \rho_j(z)\,dz = 1, \qquad f_j(x) = \int_{\mathbb{R}} \rho_j(y)f(x - y)\,dy.$$
Either of the two formulas to write $f_j(x)$ is called the convolution product of $\rho_j$ and $f$, and it is written
$$f_j = \rho_j * f.$$

Such an operation between functions enjoys amazing properties. The family $\{\rho_j\}$ is called a mollifier. We claim the following:
1. each $f_j$ is smooth, and
$$\frac{d^k}{dx^k}f_j = \frac{d^k\rho_j}{dx^k} * f;$$
2. if the support of $f$ is compact, so is the support of $f_j$, and all of these supports are contained in a fixed compact set;
3. convergence: $f_j \to f$ in $L^1(\mathbb{R})$.
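The three claims can be watched in action; the sketch below (our own discretization of the convolution) mollifies a discontinuous indicator function with the bump $\rho$ above and observes the $L^1$-distance shrink:

```python
import numpy as np

def rho(z):
    # the standard C^inf bump supported in [-1, 1] (unnormalized here;
    # normalization is enforced discretely below)
    out = np.zeros_like(z)
    inside = np.abs(z) < 1
    out[inside] = np.exp(1.0 / (z[inside]**2 - 1.0))
    return out

dx = 1e-3
x = np.arange(-2.0, 2.0 + dx, dx)
f = ((x >= -0.5) & (x <= 0.5)).astype(float)    # discontinuous f in L^1

def mollify(f, j):
    n = int(1.0 / (j * dx))                     # half-width of supp rho_j in grid steps
    z = np.arange(-n, n + 1) * dx
    ker = rho(j * z)
    ker /= ker.sum() * dx                       # enforce int rho_j = 1 discretely
    return np.convolve(f, ker, mode='same') * dx

err = {j: np.sum(np.abs(mollify(f, j) - f)) * dx for j in (5, 20)}
print(err)                                      # L^1 distance shrinks as j grows
```

Each mollified function is smooth, supported in $[-0.5 - 1/j,\ 0.5 + 1/j]$, and ever closer to $f$ in $L^1$, exactly as claims 1-3 assert.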
Concerning derivatives, it is easy to realize that
$$\frac{1}{h}[f_j(x + h) - f_j(x)] = \int_{\mathbb{R}} \frac{1}{h}[\rho_j(x - y + h) - \rho_j(x - y)]f(y)\,dy,$$
for arbitrary $x, h \in \mathbb{R}$. Since each $\rho_j'$ is uniformly bounded (for $j$ fixed), the difference quotients of $\rho_j$ are dominated, and by dominated convergence, as $h \to 0$, we find
$$\lim_{h\to 0}\frac{1}{h}[f_j(x + h) - f_j(x)] = \int_{\mathbb{R}} \rho_j'(x - y)f(y)\,dy.$$
The same reasoning is valid for the derivatives of any order.


The assertion about the supports is also elementary. If the support of f is
contained in an interval .[α, β], then the support of .fj must be a subset of .[α −
1/j, β + 1/j ]. This is easy to realize, because if the distance of x to .[α, β] is larger
than .1/j , the supports of the two factors in the integral defining .fj are disjoint, and
hence the integral vanishes.
Finally, for the convergence we follow a very classical and productive technique, based on Theorem 2.4, consisting in proving the desired convergence for continuous functions with compact support, and then taking advantage of the density assertion of Theorem 2.4. Assume then that $g(x)$ is a real function which is continuous and with compact support. In particular, $g$ is uniformly continuous. If we examine the difference (bear in mind that the integral of every $\rho_j$ is unity)
$$g_j(x) - g(x) = \int_{\mathbb{R}} \rho_j(y)(g(x - y) - g(x))\,dy,$$
then, because the support of $\rho_j$ is the interval $[-1/j, 1/j]$ and by the uniform continuity of $g$, given $\varepsilon > 0$, a $j_0$ can be found so that
$$|g(x - y) - g(x)| \le \varepsilon, \quad j \ge j_0,$$
for every $y$ with $|y| \le 1/j$, and all $x \in \mathbb{R}$. Thus
$$|g_j(x) - g(x)| \le \varepsilon$$
for all $x \in \mathbb{R}$ if $j \ge j_0$. If we bear in mind, because of our second claim above about the supports, that the supports of all the $g_j$'s and of $g$ are contained in a fixed compact set, then the previous uniform estimate implies that $g_j \to g$ in $L^1(\mathbb{R})$.
Proof of Lemma 2.4 In the context of the preceding discussion, consider $f \in L^1(\mathbb{R})$ arbitrary, and let $\varepsilon > 0$ be given. By Theorem 2.4, select a continuous function $g$ with compact support such that
$$\|f - g\|_{L^1(\mathbb{R})} \le \varepsilon.$$
We write
$$\|f_j - f\|_{L^1(\mathbb{R})} \le \|f_j - g_j\|_{L^1(\mathbb{R})} + \|g_j - g\|_{L^1(\mathbb{R})} + \|g - f\|_{L^1(\mathbb{R})}.$$
The last term is smaller than $\varepsilon$, while the second one can also be made that small, provided $j$ is taken sufficiently large. Concerning the first one, we expand
$$\int_{\mathbb{R}} |f_j(x) - g_j(x)|\,dx \le \int_{\mathbb{R}}\int_{\mathbb{R}} \rho_j(x - y)|f(y) - g(y)|\,dy\,dx.$$
A change of the order of integration⁴ carries us directly to
$$\|f_j - g_j\|_{L^1(\mathbb{R})} \le \|f - g\|_{L^1(\mathbb{R})} \le \varepsilon$$
as well. The first item in the lemma is proved.

4 We leave it to the interested readers to check the technical conditions required for the validity of this fact.

Concerning the second item, it is achieved by integration. If we apply the first item to the $L^1(\mathbb{R})$-function $f'(x)\chi_J(x)$, where $\chi_J(x)$ is the characteristic function of $J$, we find a sequence of smooth functions $g_j$ with $f' - g_j$ tending to zero in $L^1(\mathbb{R})$. We put
$$f_j(x) = \int_{-\infty}^x g_j(y)\,dy.$$
Then $f_j \in W^{1,1}(\mathbb{R})$,
$$f_j'(x) = g_j(x), \qquad \|f_j' - f'\|_{L^1(\mathbb{R})} \to 0,$$
and, as usual, the convergence of $f_j - f$ to zero takes place, through integration of its derivative, in $L^\infty(\mathbb{R})$. The restriction of all functions to $J$ yields the result. $\square$

This approximation fact, which is just one example of a whole family of such results, permits us to generalize results that are very classical when smoothness assumptions are guaranteed to the context of functions in Sobolev spaces. Sometimes there is no other way of showing the validity of such results. The following is the product rule and the generalization of the integration-by-parts formula for functions in W^{1,1}(J).
Proposition 2.9 Let J = [α, β] be a finite interval. Assume f and g are functions belonging to W^{1,1}(J).

1. The product fg also belongs to the same space, and

(fg)' = f'g + fg'.

2. We have

∫_J f'(x)g(x) dx = − ∫_J f(x)g'(x) dx + f(β)g(β) − f(α)g(α).

In particular, if one of the two functions, either f or g, belongs to W_0^{1,1}(J) (recall Definition 2.6), then

∫_J f'(x)g(x) dx = − ∫_J f(x)g'(x) dx.  (2.22)

Proof Suppose first that the factor g is smooth, and so belongs to W^{1,1}(J). The weak derivative of f is such that

∫_J f'(x)g(x)φ(x) dx = − ∫_J f(x)(g(x)φ(x))' dx

for every smooth function φ with compact support in J, or vanishing at both end-points α and β. Note that these facts at the end-points of J are also valid for the product gφ if g is smooth. But we know that the product rule holds when the factors are smooth and we are dealing with classical derivatives. Then

∫_J f'(x)g(x)φ(x) dx = − ∫_J [f(x)g'(x)φ(x) + f(x)g(x)φ'(x)] dx,

which can be reorganized in the form

∫_J f(x)g(x)φ'(x) dx = − ∫_J (f'(x)g(x) + f(x)g'(x))φ(x) dx.

The arbitrariness of φ indicates that the product fg admits as a weak derivative f'g + fg', which is a function in L^1(J).

If g is a general function in W^{1,1}(J), we proceed by approximation. Thanks to our previous lemma, we can find a sequence g_j of smooth functions converging to g in W^{1,1}(J). For a smooth test function φ(x), we write, by our previous step,

∫_J f(x)g(x)φ'(x) dx = lim_{j→∞} ∫_J f(x)g_j(x)φ'(x) dx
  = − lim_{j→∞} ∫_J (f'(x)g_j(x) + f(x)g_j'(x))φ(x) dx
  = − ∫_J (f'(x)g(x) + f(x)g'(x))φ(x) dx.

The arbitrariness of φ implies our result. The formula of integration by parts is a straightforward consequence of the product rule. ⨆
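Formula 2 of Proposition 2.9 can be checked by quadrature. This is the editor's own numerical sanity check, not from the text: the chosen f(x) = |x − 1/2| belongs to W^{1,1}(0, 1) with weak derivative sign(x − 1/2), and g(x) = cos(πx) is a smooth factor.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 200001)
dx = x[1] - x[0]

def trap(vals):
    # trapezoid rule on the uniform grid x
    return dx * (vals.sum() - 0.5 * (vals[0] + vals[-1]))

f  = np.abs(x - 0.5)            # f in W^{1,1}(0,1), not differentiable at 1/2
fp = np.sign(x - 0.5)           # its weak derivative
g  = np.cos(np.pi * x)          # a smooth factor
gp = -np.pi * np.sin(np.pi * x)

lhs = trap(fp * g)
rhs = -trap(f * gp) + f[-1] * g[-1] - f[0] * g[0]   # boundary terms at 1 and 0
print(lhs, rhs)                 # both equal -2/pi by a direct computation
```

Both sides evaluate to −2/π, so the weak derivative behaves under integration by parts exactly as a classical one would.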

2.10 Completion of Spaces of Smooth Functions with Respect to Integral Norms

We use here Theorem 2.1 to build Lebesgue spaces and Sobolev spaces in dimension
one. The process for Sobolev spaces in higher dimension is similar, as we will check
later in the book.
Consider the vector space C^∞(Ω) of smooth functions in an open set Ω ⊂ R^N. As in Sect. 2.4, we define the pth-norm of functions in C^∞(Ω) for 1 ≤ p ≤ ∞. Young's and Hölder's inequalities imply that the pth-norm is indeed a norm in this space of smooth functions, and hence, by Theorem 2.1, we can consider its completion L̃^p(Ω) with respect to this norm. The only point that deserves some special comment is the fact that the pth-norm is just a seminorm, not a norm, in this completion, and it becomes again a norm on the quotient space

L^p(Ω) = L̃^p(Ω)/N^p,

where N^p is the class of functions that vanish except on a negligible set. The concepts of subnorm and seminorm will be introduced and used in Chap. 4.

To be honest, this process yields a Banach subspace of the true L^p(Ω), in which smooth functions are dense (with respect to the pth-norm). To be sure that such a subspace is in fact the full Banach space L^p(Ω) requires checking that in these spaces smooth functions are also dense. But this demands some extra work focused on approximation, which is interesting but technical, as we have seen in the preceding section.
Concerning Sobolev spaces, the process is similar, but the completion procedure in Theorem 2.1 is performed with respect to the Sobolev norm in Proposition 2.3. The resulting space is a Banach subspace (again after taking the quotient over the class of a.e. null functions) of the true Sobolev space W^{1,p}(J). They are the same spaces, but this requires checking that the space of smooth functions is dense in W^{1,p}(J), which once more is an important approximation procedure.

The process in higher dimension is not more difficult, though this important approximation fact of Sobolev functions by smooth functions is more involved. We will recall these ideas later in Chap. 7 when we come to study higher-dimensional Sobolev spaces, and stress how, from the viewpoint of variational problems, one can work directly with these completions of smooth functions under the appropriate norms.

2.11 Hilbert Spaces

There is a very special class of Banach spaces which share with finite-dimensional
Euclidean spaces one very fundamental feature: an inner product.
Definition 2.10 Let H be a vector space.

1. An inner or scalar product in H is a bilinear, symmetric, positive definite, non-degenerate form

⟨·, ·⟩ : H × H → R.

The norm in H is induced by the inner product through the formula

||x||² = ⟨x, x⟩ ≥ 0.  (2.23)

2. A Hilbert space H is a vector space endowed with an inner product which is a Banach space under the corresponding norm (2.23).

The properties required for a function

⟨·, ·⟩ : H × H → R

to be an inner product are, explicitly:

1. symmetry: for every x, y ∈ H,

⟨x, y⟩ = ⟨y, x⟩;

2. linearity in each entry: for every fixed y ∈ H,

⟨y, ·⟩ : H → R,   ⟨·, y⟩ : H → R

are linear functionals;

3. positive definiteness:

⟨x, x⟩ ≥ 0,   ⟨x, x⟩ = 0 ⟺ x = 0.

The triangle inequality N3 for the norm coming from an inner product is a consequence of the Cauchy-Schwarz inequality, valid for every scalar product:

|⟨x, y⟩| ≤ ||x|| ||y||.

The model Hilbert space is L²(Ω) for an open subset Ω ⊂ R^N. It is, by far, the most important Hilbert space in Applied Analysis and Mathematical Physics. Similarly, if J is an interval in R, the Sobolev space W^{1,2}(J) is also a Hilbert space.
Proposition 2.10 The mapping

⟨f, g⟩ = ∫_Ω f(x)g(x) dx

defines an inner product on the set of measurable functions defined in Ω, with norm

||f||₂² = ∫_Ω f(x)² dx.

The space L²(Ω) of square-integrable, measurable functions is a Hilbert space under this inner product. In a similar way, the map

⟨f, g⟩ = ∫_J [f(x)g(x) + f'(x)g'(x)] dx

determines an inner product in W^{1,2}(J). This space is typically referred to as H¹(J).

All that the proof of this proposition requires, knowing already that L²(Ω) and W^{1,2}(J) are Banach spaces under their respective 2-norms, is to check that, in both cases, the formula determining the inner product is symmetric, bilinear, and positive definite. This is elementary.
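These inner products can be exercised numerically. The following sketch is the editor's illustration (the particular functions sin(πx) and x(1 − x) are arbitrary choices): it evaluates the W^{1,2}(J) inner product by quadrature on J = [0, 1] and confirms the Cauchy-Schwarz inequality for it.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 100001)
dx = x[1] - x[0]
trap = lambda v: dx * (v.sum() - 0.5 * (v[0] + v[-1]))   # trapezoid rule

# two elements of W^{1,2}(0,1), given together with their derivatives
f,  fp = np.sin(np.pi * x), np.pi * np.cos(np.pi * x)
g,  gp = x * (1 - x),       1 - 2 * x

def inner(u, up, v, vp):
    # the inner product <u, v> = ∫ (u v + u' v') dx on W^{1,2}(J)
    return trap(u * v + up * vp)

fg     = inner(f, fp, g, gp)
norm_f = np.sqrt(inner(f, fp, f, fp))
norm_g = np.sqrt(inner(g, gp, g, gp))
print(fg, norm_f * norm_g)    # Cauchy-Schwarz: |<f,g>| <= ||f|| ||g||
```

For this particular pair the two numbers come out remarkably close (about 1.402 versus 1.411), a reminder that Cauchy-Schwarz is sharp exactly when the two vectors are nearly proportional.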
The fact that a Banach space can be endowed with an inner product under which it becomes a Hilbert space may look like something accidental, or simply convenient. But it is not so, as an inner product allows for fundamental and profound facts:

1. In a Hilbert space, one can talk about orthogonal projections onto subspaces, and even onto convex, closed subsets.
2. Orthogonality also has important consequences, as one can consider orthonormal bases.
3. The inner product permits us to identify a Hilbert space with its own dual whenever convenient. In this sense, we can say that H = H'.
4. The extension of many facts from multivariate Calculus to infinite dimension is possible thanks to the inner product.

We treat these four important issues successively.

2.11.1 Orthogonal Projection

We start with one of the most remarkable and fundamental tools. Recall that a convex set K of a vector space H is one such that the convex combinations

t x₁ + (1 − t) x₀ ∈ K

whenever x₁, x₀ ∈ K and t ∈ [0, 1].

Proposition 2.11 (Projection Onto a Convex Set) Let K ⊂ H be a closed and convex subset of a Hilbert space H. For every x ∈ H, there is a unique y ∈ K such that

||x − y|| = min_{z∈K} ||x − z||.

Moreover, y is characterized by the condition

(y − x) · (z − y) ≥ 0

for every z ∈ K.

Proof Let y_j ∈ K be a minimizing sequence for the problem

m = inf_{z∈K} ||z − x||²,   ||y_j − x||² ↘ m.

From the usual parallelogram law, which is also valid in a general Hilbert space,

||x + y||² + ||x − y||² = 2||x||² + 2||y||²,   x, y ∈ H,

we find that

||y_j − y_k||² = 2||y_j − x||² + 2||y_k − x||² − 4||x − (1/2)(y_j + y_k)||².

Since K is convex, the middle point

(1/2)(y_j + y_k)

belongs to K as well. Hence, by the definition of m,

||y_j − y_k||² ≤ 2||y_j − x||² + 2||y_k − x||² − 4m.

The right-hand side converges to zero as j, k → ∞ because {y_j} is minimizing. This implies that {y_j} is a Cauchy sequence, and so it converges to some y in K (because K is closed), which is the minimizer. The uniqueness follows from the strict convexity of the norm in a Hilbert space (or again from the parallelogram rule).

For arbitrary z ∈ K, consider the convex combination

tz + (1 − t)y ∈ K,   t ∈ [0, 1].

The function

φ(t) = ||x − tz − (1 − t)y||²,   t ∈ [0, 1],

attains its minimum at t = 0, and so φ'(0) ≥ 0. But

φ'(0) = 2(x − y) · (y − z).

Conversely, if for every z ∈ K we have

0 ≤ (x − y) · (y − z) = −||x − y||² + (x − y) · (x − z),

then, by the Cauchy-Schwarz inequality,

||x − y||² ≤ (x − y) · (x − z) ≤ ||x − y|| ||x − z||

for all such z. This implies that

||x − y|| = min_{z∈K} ||x − z||,

as desired. ⨆

This lemma permits us to consider the map

π_K : H → K,

called the projection onto K, defined precisely by putting

π_K(x) = y,

the unique vector y in the statement of the last proposition. The projection π_K is a continuous map. In fact, by the characterization of the projection in the previous proposition, we have

(π_K x₁ − x₁) · (π_K x₂ − π_K x₁) ≥ 0,
(π_K x₂ − x₂) · (π_K x₁ − π_K x₂) ≥ 0.

Adding these two inequalities, we can write

||π_K x₁ − π_K x₂||² = (π_K x₁ − π_K x₂) · (π_K x₁ − π_K x₂) ≤ (x₁ − x₂) · (π_K x₁ − π_K x₂),

and, again by the Cauchy-Schwarz inequality,

||π_K x₁ − π_K x₂|| ≤ ||x₁ − x₂||.
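Both the variational characterization and the non-expansiveness just derived can be tested in the simplest concrete Hilbert space, R^5 with the Euclidean inner product. This is the editor's sketch, not from the text; the closed convex set is taken to be the unit ball, for which the projection has the explicit formula x/max(1, ||x||).

```python
import numpy as np

rng = np.random.default_rng(0)

def proj(x):
    # projection of x onto the closed unit ball K = {z : ||z|| <= 1}
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

max_expansion = 0.0    # sup over samples of ||pi(x1) - pi(x2)|| / ||x1 - x2||
min_char = np.inf      # inf over samples of (pi(x) - x) . (z - pi(x)), z in K
for _ in range(200):
    x1, x2 = 3.0 * rng.normal(size=(2, 5))
    y1, y2 = proj(x1), proj(x2)
    max_expansion = max(max_expansion,
                        np.linalg.norm(y1 - y2) / np.linalg.norm(x1 - x2))
    z = rng.normal(size=5)
    z /= max(1.0, np.linalg.norm(z))          # a point of K
    min_char = min(min_char, np.dot(y1 - x1, z - y1))
print(max_expansion, min_char)   # expect <= 1 and >= 0 (up to rounding)
```

The first number never exceeds 1 (non-expansiveness) and the second never drops below 0 (the characterization (y − x) · (z − y) ≥ 0 of Proposition 2.11).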

The very particular case in which K is a closed subspace is especially relevant.

Corollary 2.5 If K is a closed subspace of a Hilbert space H, the orthogonal projection π_K is a linear, continuous operator characterized by the condition

⟨x − π_K x, y⟩ = 0

for every y ∈ K.

2.11.2 Orthogonality

The existence of an inner product leads to the fundamental geometric concept of orthogonality.

Definition 2.11 A sequence of vectors {x_j} of a Hilbert space H is said to be an orthonormal basis if

||x_j|| = 1 for all j,   ⟨x_j, x_k⟩ = 0 for all pairs j ≠ k,

and the subspace spanned by it is dense in H.

Hilbert spaces that admit an orthonormal basis are necessarily separable, in the sense of the following definition.

Definition 2.12 A Banach space E is said to be separable if there is a countable subset which is dense.
The fundamental facts about orthogonality in Hilbert spaces can be specified
through the following two results.
Proposition 2.12 Every separable Hilbert space admits orthonormal bases.

Proof Let {x_j} be dense in the Hilbert space H, and let H_j be the subspace spanned by {x_k}_{k≤j}, so that {H_j} is a non-decreasing sequence of finite-dimensional subspaces whose union is dense in H. By the Gram-Schmidt orthonormalization process, we can produce a sequence of orthonormal vectors that span the successive subspaces H_j. This full collection of vectors is an orthonormal basis of H according to Definition 2.11. ⨆
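The Gram-Schmidt process invoked in this proof is completely constructive, as the following sketch in R^6 shows (editor's illustration; the random vectors stand in for the dense family {x_j}).

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize, in order, a family of vectors (Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = v - sum(np.dot(v, b) * b for b in basis)   # subtract projections
        norm = np.linalg.norm(w)
        if norm > 1e-12:        # drop vectors already in the current span
            basis.append(w / norm)
    return basis

rng = np.random.default_rng(1)
vectors = [rng.normal(size=6) for _ in range(4)]
basis = gram_schmidt(vectors)

# the Gram matrix of the output should be the identity
gram = np.array([[np.dot(a, b) for b in basis] for a in basis])
ortho_error = np.abs(gram - np.eye(len(basis))).max()
print(len(basis), ortho_error)
```

By construction the k-th orthonormal vector lies in the span H_k of the first k inputs, which is exactly the property the proof of Proposition 2.12 uses.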

The following is also a fundamental feature of Hilbert spaces.

Proposition 2.13 Let H be a separable Hilbert space, and {x_j} an orthonormal basis.

1. For every x ∈ H,

x = Σ_{j=1}^{∞} ⟨x, x_j⟩ x_j,   ||x||² = Σ_{j=1}^{∞} |⟨x, x_j⟩|².

2. If the sequence of numbers {a_j} is square-summable,

Σ_{j=1}^{∞} a_j² < ∞,

then

x = Σ_{j=1}^{∞} a_j x_j ∈ H,   ||x||² = Σ_{j=1}^{∞} a_j²,   a_j = ⟨x, x_j⟩.

Proof Let x ∈ H be given, put H_j for the subspace spanned by {x_k}_{k≤j}, and

x^{(j)} = Σ_{k=1}^{j} ⟨x, x_k⟩ x_k.

If π_{H_j} is the orthogonal projection onto H_j, we claim that

x^{(j)} = π_{H_j} x.  (2.24)

To check this, consider the optimization problem

Minimize in z = (z_k) ∈ R^j:   ||x − Σ_{k=1}^{j} z_k x_k||².

We can actually write, taking into account the orthogonality of {x_k} and the properties of the inner product,

||x − Σ_{k=1}^{j} z_k x_k||² = ⟨x − Σ_{k=1}^{j} z_k x_k, x − Σ_{k=1}^{j} z_k x_k⟩
  = ||x||² − 2 Σ_k z_k ⟨x, x_k⟩ + Σ_k z_k².

It is an elementary Vector Calculus exercise to find that the optimal solution z exactly corresponds to

z_k = ⟨x, x_k⟩.

This means that indeed (2.24) holds. In particular, thanks to Corollary 2.5, we can conclude that

⟨x − x^{(j)}, x^{(j)}⟩ = ⟨x − π_{H_j} x, π_{H_j} x⟩ = 0.  (2.25)

Bearing in mind this fact, we further realize that

0 ≤ ⟨x − x^{(j)}, x − x^{(j)}⟩ = ||x||² − ⟨x, x^{(j)}⟩,

that is to say,

Σ_{k=1}^{j} |⟨x, x_k⟩|² ≤ ||x||²

for arbitrary j. This implies that the series of non-negative numbers

Σ_{k=1}^{∞} |⟨x, x_k⟩|²

is convergent. But, again bearing in mind the orthogonality of {x_j}, we can also conclude, for k < j, that

||x^{(j)} − x^{(k)}||² = Σ_{l=k+1}^{j} |⟨x, x_l⟩|².

Hence {x^{(j)}} is a Cauchy sequence, and it converges to some x̄. It remains to show that in fact x̄ = x.

The condition

x̄ = Σ_{k=1}^{∞} ⟨x, x_k⟩ x_k

implies, just as we did with x, that

π_{H_j} x̄ = Σ_{k=1}^{j} ⟨x, x_k⟩ x_k = π_{H_j} x.

Then

π_{H_j}(x − x̄) = 0

for all j, and hence the difference x − x̄ is a vector which is orthogonal to the full basis {x_j}. This implies that x − x̄ must be the vanishing vector, for otherwise the subspace spanned by the full basis could not be dense in H, as it would admit a non-vanishing orthogonal vector.

The second part of the statement requires exactly the same ideas. ⨆

It is quite instructive to look at some explicit cases of orthonormal bases. The
first one is mandatory.

Example 2.13 Consider the Hilbert space

l² ≡ l²(R) = {x = (x_k) : Σ_k x_k² < ∞},

with inner product

⟨x, y⟩ = Σ_k x_k y_k.

If we let e_k ∈ l² be the sequence whose entries all vanish except for a 1 in the k-th place, then it is evident that the countable collection {e_k : k ∈ N}, the canonical basis, is an orthonormal basis of l².

Notice how Proposition 2.13 establishes that every separable real Hilbert space is isomorphic to l².
The most important and popular example is possibly that of the trigonometric
basis in .L2 (−π, π ).
Example 2.14 Consider the family of trigonometric functions

{1} ∪ {cos(kx), sin(kx)}_{k∈N}.

It is elementary to check that they belong to L²(−π, π): by the double-angle formula, we know that

∫_{−π}^{π} cos²(kx) dx = (1/2) ∫_{−π}^{π} [1 + cos(2kx)] dx = π,
∫_{−π}^{π} sin²(kx) dx = (1/2) ∫_{−π}^{π} [1 − cos(2kx)] dx = π,

for all k ∈ N. Moreover, the elementary trigonometric formulas

cos(α ± β) = cos α cos β ∓ sin α sin β,
sin(α ± β) = sin α cos β ± cos α sin β,

suitably utilized, lead to the identities

∫_{−π}^{π} cos(kx) cos(jx) dx = 0,   k ≠ j,
∫_{−π}^{π} sin(kx) sin(jx) dx = 0,   k ≠ j,
∫_{−π}^{π} cos(kx) sin(jx) dx = 0,   for all k, j.

All these calculations prove that the family of functions

F = {1/√(2π)} ∪ {(1/√π) cos(kx), (1/√π) sin(kx)}_{k∈N}  (2.26)

is orthonormal. In fact, if we let

y_k(x) = cos(kx),   z_k(x) = sin(kx) = −(1/k) y_k'(x),   k ∈ N,

it is easy to realize that

y_k'' + k² y_k = 0,   z_k'' + k² z_k = 0.  (2.27)

But then, relying on two integrations by parts for which the end-point contributions drop out,

k² ∫_{−π}^{π} y_k(x)y_j(x) dx = −∫_{−π}^{π} y_k''(x)y_j(x) dx
  = ∫_{−π}^{π} y_k'(x)y_j'(x) dx
  = −∫_{−π}^{π} y_k(x)y_j''(x) dx
  = j² ∫_{−π}^{π} y_k(x)y_j(x) dx.

If k ≠ j, we conclude that y_k(x) and y_j(x) are orthogonal in L²(−π, π), without the need to compute the definite integrals explicitly. Similar manipulations lead to the other two orthogonality relations.
Even more is true: the Fourier family of functions (2.26) is indeed a basis for L²(−π, π). This can be checked directly by showing that Fourier partial sums of the form

a₀ + Σ_{k=1}^{N} [a_k cos(kx) + b_k sin(kx)]

with

a_k = (1/√π) ∫_{−π}^{π} f(x) cos(kx) dx,   a₀ = (1/√(2π)) ∫_{−π}^{π} f(x) dx,
b_k = (1/√π) ∫_{−π}^{π} f(x) sin(kx) dx,

converge, in L²(−π, π), to an arbitrary function f ∈ L²(−π, π) as N → ∞, though it requires a good deal of fine work. But similar trigonometric families of functions are shown to be bases of L²-spaces as a remarkable consequence of (2.27). We will later explore this in Theorem 6.4.
There are other fundamental examples of orthonormal bases that are introduced in the exercises below.
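Parseval's identity from Proposition 2.13 can be watched at work for the trigonometric family. The sketch below is the editor's own illustration (f(x) = x is an arbitrary choice): it accumulates the squared coefficients of f against the orthonormal family (2.26) and compares the sum with ||f||², which for f(x) = x equals 2π³/3.

```python
import numpy as np

x = np.linspace(-np.pi, np.pi, 40001)
dx = x[1] - x[0]
trap = lambda v: dx * (v.sum() - 0.5 * (v[0] + v[-1]))   # trapezoid rule

f = x.copy()                    # f(x) = x, a function in L^2(-pi, pi)
norm2 = trap(f * f)             # ||f||^2 = 2*pi^3 / 3

# partial sums of squared coefficients against the orthonormal family (2.26)
parseval = trap(f)**2 / (2.0 * np.pi)     # constant-term contribution (here 0)
for k in range(1, 201):
    ak = trap(f * np.cos(k * x)) / np.sqrt(np.pi)
    bk = trap(f * np.sin(k * x)) / np.sqrt(np.pi)
    parseval += ak**2 + bk**2

print(norm2, parseval)   # both approach 2*pi^3/3 ≈ 20.67
```

The partial sums stay below ||f||² (Bessel's inequality) and close the gap like 1/N, since here the squared sine coefficients are 4π/k².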

2.11.3 The Dual of a Hilbert Space

The inner product in a Hilbert space H also has profound consequences for the dual H'.

Proposition 2.14 (Fréchet-Riesz Representation Theorem) For every x' ∈ H' there is a unique x ∈ H such that

⟨x', y⟩ = ⟨x, y⟩,   y ∈ H.

We are using brackets here for two different things at first sight: the left-hand side corresponds to the duality between H' and H, while on the right-hand side, it is the inner product in H.
Proof Let x' ∈ H' be arbitrary, and put

H₀ = {y ∈ H : ⟨x', y⟩ = 0} ⊂ H,

a closed subspace of H. If H₀ = H, then x' = 0, and we can, obviously, take x = 0 as well. Assume then that H₀ is not the full H.

Take x₀ ∈ H \ H₀, a non-vanishing vector, and put

X₀ = (x₀ − π_{H₀} x₀) / ||x₀ − π_{H₀} x₀||.

The norm in the denominator cannot vanish precisely because x₀ ∉ H₀. For arbitrary y ∈ H, the combination

y − (⟨x', y⟩ / ⟨x', X₀⟩) X₀  (2.28)

belongs to H₀, because x' applied to it vanishes. The number in the denominator does not vanish because, once again, x₀ ∉ H₀. By Corollary 2.5, applied to x = x₀, K = H₀, and y the combination in (2.28), we conclude that

⟨X₀, y − (⟨x', y⟩ / ⟨x', X₀⟩) X₀⟩ = 0

for all y ∈ H, and this identity lets us see that it suffices to take

x = ⟨x', X₀⟩ X₀.

Recall that X₀ is unitary. ⨆



This theorem clearly establishes that a Hilbert space can always be identified, in
a canonical way through the inner product, with its own dual. In particular, every
Hilbert space is reflexive.

2.11.4 Basic Calculus in a Hilbert Space

The structure of a Hilbert space through its inner product allows for many facts similar to those in finite-dimensional spaces. The following definition refers to a functional I : H → R where H is a Hilbert space.

Definition 2.13

1. Such a functional I is said to be Gateaux-differentiable if every section, for arbitrary x, y ∈ H,

ε ↦ I(x + εy)

is differentiable as a function of the single variable ε.

2. I is differentiable if it is Gateaux-differentiable, and the operation

(x, y) ↦ T(x, y) = (d/dε) I(x + εy)|_{ε=0}

is continuous in x, and linear in y, in such a way that we can write

T(x, y) = ⟨T̃x, y⟩

for a certain continuous T̃ : H → H. We put

I'(x) = T̃x,

and refer to it as the derivative of I. Note that I' : H → H is a continuous, in general non-linear, operation.

3. A certain element x ∈ H is said to be critical for a differentiable functional I if I'(x) = 0.

4. The functional I as above is Fréchet differentiable if it is differentiable and, for every x ∈ H,

(1/||y||) |I(x + y) − I(x) − ⟨I'(x), y⟩| → 0 as y → 0.  (2.29)

The concepts of differentiability and Fréchet differentiability can be localized at individual vectors x, but since most of the time they are used in a global way, we do not consider those local definitions. On the other hand, we are incorporating the continuity of the derivative into the differentiability. Usually this is not done in other sources. Again, in most of the important examples and applications, the continuity of the derivative is required, and thus we have judged it more transparent to proceed in this way.
A parallelism with functions of several variables can be clearly established. If

f(x) : R^N → R

is a function, then Gateaux-differentiability amounts to the possibility of computing all directional derivatives; differentiability means linearity in those directional derivatives, and continuity of partial derivatives; and Fréchet-differentiability amounts to plain differentiability. The well-known criterion in finite dimension that the continuity of partial derivatives implies differentiability also has a counterpart in infinite dimension.
Lemma 2.5 A functional I : H → R defined on a Hilbert space is differentiable if and only if it is Fréchet-differentiable.

Proof Suppose I is differentiable with continuous derivative I' : H → H. Let x, y ∈ H, and consider the real function

g : R → R,   g(s) = I(x + sy).

We know that this function is continuously differentiable and, besides, because I is differentiable,

g'(s) = ⟨I'(x + sy), y⟩.

By the mean-value theorem (for real functions),

g(1) − g(0) = g'(s₀),   I(x + y) − I(x) = ⟨I'(x + s₀y), y⟩,

where s₀ ∈ (0, 1) will most likely depend on x and y. Hence, the quotient Q in (2.29) becomes

Q = ⟨I'(x + s₀y) − I'(x), y/||y||⟩,

and

0 ≤ |Q| ≤ ||I'(x + s₀y) − I'(x)|| → 0

as ||y|| → 0, thanks to the continuity of I'. ⨆
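A quadratic functional makes these notions concrete. The following is the editor's finite-dimensional sketch (R^4 playing the role of H): for I(x) = (1/2)⟨Ax, x⟩ − ⟨b, x⟩ with A symmetric, the Gateaux derivative is T(x, y) = ⟨Ax − b, y⟩, so I'(x) = Ax − b, and the Fréchet remainder (1/||y||)|I(x + y) − I(x) − ⟨I'(x), y⟩| = (1/2)|⟨Ay, y⟩|/||y|| indeed tends to 0 with y.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))
A = A + A.T                      # symmetric matrix
b = rng.normal(size=4)

def I(x):                        # I(x) = (1/2)<Ax, x> - <b, x>
    return 0.5 * x @ A @ x - b @ x

def I_prime(x):                  # its derivative: I'(x) = Ax - b
    return A @ x - b

x, y = rng.normal(size=(2, 4))
eps = 1e-6
# the Gateaux (directional) derivative by a centered difference in eps
T_xy = (I(x + eps * y) - I(x - eps * y)) / (2.0 * eps)
gap = abs(T_xy - I_prime(x) @ y)
print(T_xy, gap)                 # gap is tiny: T(x, y) = <I'(x), y>
```

The centered difference is exact for quadratics up to rounding, which is why the gap is at the level of floating-point noise.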



As a consequence of this result, and to avoid confusion with other equivalent definitions, we introduce the concept of a C¹-functional as another way to identify differentiable functionals. We also define some other usual concepts.

Definition 2.14 Let I : H → R be defined on a Hilbert space H.

1. I is said to be C¹ if it is differentiable (and then the derivative I' : H → H is continuous).
2. I is said to be coercive if

lim_{||x||→∞} I(x) = +∞.

3. I is said to be locally Lipschitz if it is differentiable and

||I'(x) − I'(y)|| ≤ M||x − y||,   x, y ∈ K,  M = M(K) > 0,  (2.30)

for every bounded subset K ⊂ H.

The preceding lemma has at least two important consequences that it is convenient to highlight, just as in finite dimension. The first refers to the orthogonality of the derivative I' with respect to the level sets of I; the second, to the relevance of the flow of the derivative I'. Both are direct consequences of the chain rule.

Let I : H → R be a differentiable functional, and γ : (−ε, ε) → H a differentiable curve with continuous tangent vector

γ'(s) ∈ H,   s ∈ (−ε, ε),

for some positive ε.


Lemma 2.6

1. The composition

g(t) = I(γ(t)) : (−ε, ε) → R

is differentiable, and

g'(t) = ⟨I'(γ(t)), γ'(t)⟩,   t ∈ (−ε, ε).

2. For every x ∈ H, the vector I'(x) is orthogonal to the level set of I through x.

3. Suppose, in addition to the previous assumptions, that I is coercive and locally Lipschitz. Then, for an arbitrary initial vector x₀ ∈ H, the differential infinite-dimensional system

x'(t) = −I'(x(t)),   x(0) = x₀,  (2.31)

is defined for every positive time t > 0, and, for every such t > 0,

(d/dt) I(x(t)) = −||I'(x(t))||².

Moreover, if we write

x'(t; x₀) = −I'(x(t; x₀)),   x(0; x₀) = x₀,

to stress the dependence of the solution x on the initial condition x₀, then the mapping x(t; x₀) is continuous in x₀.
Given the information we already have on differentiable functionals, and how everything parallels, almost word by word, the finite-dimensional setting, the proof of this lemma follows exactly as in that elementary situation. The unique solvability of the gradient differential system (2.31) rests, as in the finite-dimensional case, on the contraction principle, Theorem 1.1.
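The gradient flow (2.31) and the energy-dissipation identity (d/dt) I(x(t)) = −||I'(x(t))||² can be simulated. This is the editor's finite-dimensional sketch (not from the text): for the convex quadratic I(x) = (1/2)⟨Ax, x⟩ − ⟨b, x⟩ with A symmetric positive definite, explicit Euler steps of x' = −I'(x) decrease I monotonically and drive x to the unique critical point A⁻¹b.

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.normal(size=(4, 4))
A = M @ M.T + np.eye(4)            # symmetric positive definite
b = rng.normal(size=4)

I    = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b         # I'(x)

x = rng.normal(size=4)
dt = 1e-3
values = [I(x)]
for _ in range(10000):             # explicit Euler steps of x' = -I'(x)
    x = x - dt * grad(x)
    values.append(I(x))

monotone = all(v2 <= v1 + 1e-12 for v1, v2 in zip(values, values[1:]))
dist = np.linalg.norm(x - np.linalg.solve(A, b))   # distance to critical point
print(monotone, dist)
```

The monotone decay of I along the flow is exactly what item 3 of the lemma predicts; the step size must only be small relative to the local Lipschitz constant of I'.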
Finally, in practice, it may not be possible to clearly see the derivative I'(x) from the computation of the directional derivatives T(x, y) given by its definition

T(x, y) = (d/dε) I(x + εy)|_{ε=0}.

The following basic lemma informs us how to do so.


Lemma 2.7 For a fixed element x ∈ H, the unique minimizer of the quadratic functional

y ↦ (1/2)||y||² − T(x, y)  (2.32)

is precisely y = I'(x).

Proof This is checked easily by writing

T(x, y) = ⟨T̃x, y⟩

and completing squares:

(1/2)||y||² − ⟨T̃x, y⟩ = (1/2)||y − T̃x||² − (1/2)||T̃x||². ⨆



2.12 Some Other Important Spaces of Functions

In this chapter, we have introduced the most important spaces of functions utilized in Applied Analysis. There are some important variants that are worth mentioning, as well as spaces of a different nature. We do not devote more time to studying these, as they are the subject of more advanced courses in Functional Analysis. There is additional important material in the final Appendix.
1. If (Ω, σ, μ) is an abstract measure space, one can define the corresponding L^p(dμ)-spaces, 1 ≤ p ≤ ∞, as one would anticipate:

L^p(dμ) = {f : Ω → R, σ-measurable : ∫_Ω |f(x)|^p dμ(x) < ∞}

for finite p, with the corresponding definition for p = ∞. In particular, if Ω ⊂ R^N, and w(x) : Ω → R is a positive, integrable function, regarded in this context as a weight, then

L^p(w) = {f : Ω → R : ∫_Ω |f(x)|^p w(x) dx < ∞}.

Even the spaces l^p may be understood in this context as L^p-spaces with respect to the counting measure in N.
2. So far we have used integration to provide ways to measure the size of functions in suitable classes of measurable functions. These collections of functions are rather huge sets, including functions which may exhibit rather irregular behavior. There are ways to restrict attention to regular classes of functions like

C^m(Ω) = {f : Ω → R with continuous derivatives up to order m},

with m ∈ N, and Ω an open subset of R^N. However, given that such functions, or some of their derivatives, can blow up as we approach the boundary of Ω, there does not seem to be a unified way to measure their size only with derivatives and without introducing integration in any way. One possibility is to write

p_K(f) = max_{x∈K, |α|≤m} |∇^α f(x)|

for a compact set K ⊂ Ω. This of course does not give a full measure of f, as it provides no information on the function outside the set K. The functions p_K(f) are not norms, but the full collection p_{K_j}(f), for an increasing sequence of compact sets K_j ⊂ Ω with

|Ω \ ∪_j K_j| → 0,   j → ∞,

can be used in such a way that C^m(Ω) becomes a topological vector space, something more general than a Banach space. One could try to consider the subspaces

C^m(Ω̄) = {f : Ω → R with continuous derivatives up to the boundary ∂Ω},

and consider the previous ways of measuring the size of f for K = Ω̄ (if Ω is bounded).
3. Functions of bounded variation. In a finite interval J = [x_l, x_r] ⊂ R, we consider the class P of all finite partitions P of the form

x_l = x₀ < x₁ < x₂ < · · · < x_{k−1} < x_k = x_r,

and put

V(u) = sup_P Σ_j |u(x_j) − u(x_{j−1})|.

The set of measurable functions u with finite V(u) is the class BV(J) of functions of bounded variation in J. The quantity V(u) + |u(x_l)| turns out to be a norm in BV(J) under which it becomes a Banach space. The interesting point is that each function u ∈ BV(J) determines a linear, continuous functional T_u on C(J) through the classical Riemann-Stieltjes integral

T_u(v) = ∫_J v du,

in such a way that BV(J) becomes the dual of C(J) under this identification.
4. Complete metric spaces are also a class of objects more general than Banach spaces, as they need not be vector spaces where one can take linear combinations. Yet complete metric spaces enjoy some fundamental properties because of the nature of the underlying distance function. As a matter of fact, topological vector spaces that can be shown to be, even locally, metrizable, i.e. whose topology comes from the balls of a distance function, do share with metric spaces some of these remarkable properties.
5. Spaces of distributions. We simply mention here the fundamental spaces of
distributions because of their crucial role in modern Analysis. There is more
information in the final Appendix.

2.13 Exercises

1. Show that L¹(Ω) and L^∞(Ω) are Banach spaces for every open subset Ω ⊂ R^N.

2. Prove that if f_j → f in L^p(Ω) for p ≥ 1, and g ∈ C(Ω̄) is a continuous function up to the boundary of Ω, then

∫_Ω f_j(x)g(x) dx → ∫_Ω f(x)g(x) dx.

3. Find a counterexample: a bounded sequence {u_j} in W^{1,1}(0, 1) which is not equicontinuous.
4. Take

u_j(x) = sin(2jπx),   x ∈ [0, 1].

(a) Check that for arbitrary points 0 ≤ a < b ≤ 1, we always have

∫_a^b u_j(x) dx → 0.

Draw a picture of the graphs of the initial members of the sequence to visualize the cancellation property as j → ∞.

(b) Calculate the limit of the integrals

∫_a^b u_j²(x) dx

as j → ∞ in an arbitrary subinterval (a, b) ⊂ [0, 1]. Do you notice something a bit unexpected if these results are interpreted in terms of weak convergence?
5. Let {f_j} be a bounded sequence in L^p(J) for p > 1, and J a finite interval of R. Suppose there is f in the same space such that

∫_a^b f_j(x) dx → ∫_a^b f(x) dx

for every pair a < b in J. Argue that

∫_J g(x)f_j(x) dx → ∫_J g(x)f(x) dx

for every g ∈ L^q(J), q being the conjugate exponent.


6. Prove Proposition 2.6.
7. Finish the proof of Proposition 2.7.
8. Prove Proposition 2.8.
9. Prove the second part of Proposition 2.13.

10. For E, the collection of continuous functions in a fixed interval, say [0, 1], consider the two norms

||f||₁ = ∫₀¹ |f(t)| dt,   ||f||_∞ = sup_{t∈[0,1]} |f(t)|.

(a) Show that E₁ = (E, ||·||₁) is not complete, but E_∞ = (E, ||·||_∞) is.
(b) If for each g ∈ E we put

⟨T_g, f⟩ = ∫₀¹ g(t)f(t) dt,

then T_g ∈ E'₁. Find its norm.
(c) Argue that T_g ∈ E'_∞ as well, and

||T_g|| = ||g||₁.

(d) Consider the linear functional on E given by

f ↦ δ_{1/2}(f) = f(1/2).

Show that δ_{1/2} ∈ E'_∞ but δ_{1/2} ∉ E'₁.
(e) Prove that

H = {f ∈ E : ∫₀¹ f(t) dt = 0}

is a closed hyperplane both in E₁ and in E_∞.


11. Let E be a normed space with norm ||·||.

(a) Show that the norm is uniformly continuous.
(b) Every linear map

T : (R^n, ||·||_∞) → E

is continuous.
(c) Every Cauchy sequence is bounded.
(d) Every norm in R^n is a continuous function, and there is M > 0 with

||x|| ≤ M||x||_∞.

Conclude that every norm in R^n is equivalent to the sup-norm, and that all norms in a finite-dimensional vector space are equivalent.

12. (a) If x ∈ l^p for some p > 0, then x ∈ l^q for all q > p, and

lim_{q→∞} ||x||_q = ||x||_∞.

(b) For every bounded and measurable function f(t), t ∈ [a, b],

lim_{p→∞} ||f||_p = ||f||_∞.

13. Let E be an infinite-dimensional vector space with an algebraic basis {e_i}, i ∈ I, in such a way that each x ∈ E can be written in a unique way as

x = Σ_{i∈I} x_i e_i

with only a finite number of the x_i's being non-zero. Define

||x||_∞ = max{|x_i| : i ∈ I},   ||x||₁ = Σ_{i∈I} |x_i|.

(a) Argue that ||·||₁ and ||·||_∞ are not equivalent.
(b) For an arbitrary norm ||·|| in E, there is always a linear functional which is not continuous.
14. In the vector space

K = {u ∈ C¹[0, 1] : u(0) = 0},

we consider

||u|| = sup_{t∈[0,1]} |u(t) + u'(t)|.

Show that it is a norm in K equivalent to the norm

||u||_∞ + ||u'||_∞.

15. Let

T : L¹(R) → L^p(R),   1 < p ≤ ∞,

be linear. Show the equivalence of the following two assertions:

(a) There is some w ∈ L^p(R) such that

T(u) = u ∗ w,   u ∈ L¹(R);

(b) T is continuous and

T(u) ∗ v = T(u ∗ v),   u, v ∈ L¹(R).

16. If {x_j} is an orthonormal basis for a separable Hilbert space H, argue that x_j ⇀ 0
    in H.
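A numerical illustration of this fact (a sketch, with the orthonormal sine system of L^2(0, 1) and a fixed test function y): the pairings <x_j, y> are Fourier coefficients of y, so Bessel's inequality forces them to 0, which is exactly weak convergence to 0.

```python
import numpy as np

# For the orthonormal system x_j(t) = sqrt(2) sin(j*pi*t) in L^2(0,1),
# <x_j, y> are Fourier coefficients of y and must tend to 0 (Bessel).
t = np.linspace(0.0, 1.0, 100001)
y = t * (1.0 - t)                          # a fixed element of L^2(0,1)

coefs = [np.trapz(np.sqrt(2.0) * np.sin(j * np.pi * t) * y, t)
         for j in (1, 5, 25, 125)]
print(coefs)                               # decays towards 0
```

Note that ||x_j|| = 1 for every j, so the sequence does not converge to 0 in norm: this is a genuinely weak phenomenon.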
17. Let the kernel

k(x, y) : Ω × Ω → R,
.

be given by the formula


Σ
. k(x, y) = wi (x)wi (y), x, y ∈ Ω ⊂ RN ,
i

where {wi } is a finite collection of smooth functions. Explore if the formula



<f, g> =
. f (x)k(x, y)g(y) dx dy
Ω ×Ω

defines an inner-product in the space of measurable functions.


18. In the Hilbert space L^2(−1, 1), show that the family of polynomials

            L_j(t) = (d^j/dt^j)[(t^2 − 1)^j]

    is an orthogonal family.⁵

    (a) Compute the minimum of the set of numbers

            { ∫_{−1}^{1} |t^3 − a_2 t^2 − a_1 t − a_0|^2 dt : a_0, a_1, a_2 ∈ R }.

    (b) Find the maximum of the set of numbers

            ∫_{−1}^{1} t^3 p(t) dt

    where p(t) is subjected to the constraints

            ∫_{−1}^{1} p(t) dt = ∫_{−1}^{1} t p(t) dt = ∫_{−1}^{1} t^2 p(t) dt = 0,    ∫_{−1}^{1} p(t)^2 dt = 1.

⁵ Except for normalizing constants, these are the Legendre polynomials.
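The orthogonality can be confirmed with exact polynomial arithmetic; a sketch using numpy's polynomial helpers (the Gram matrix of L_0, ..., L_4 should be diagonal):

```python
import numpy as np
from numpy.polynomial import polynomial as P

def L(j):
    # coefficients of L_j(t) = d^j/dt^j [(t^2 - 1)^j]
    return P.polyder(P.polypow([-1.0, 0.0, 1.0], j), j)

def inner(p, q):
    # integral over [-1, 1] of p(t) q(t) dt, via exact antiderivatives
    anti = P.polyint(P.polymul(p, q))
    return P.polyval(1.0, anti) - P.polyval(-1.0, anti)

polys = [L(j) for j in range(5)]
gram = np.array([[inner(p, q) for q in polys] for p in polys])
print(np.round(gram, 6))   # off-diagonal entries vanish
```

For instance L_1(t) = 2t, so gram[1, 1] = ∫_{−1}^{1} 4t^2 dt = 8/3, while all mixed products integrate to 0.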



19. Let J be a finite, closed interval of R. A weight p(t) over J is a positive,
    continuous function such that all monomials t^j are integrable with respect to
    p(t), i.e.

            ∫_J t^j p(t) dt < ∞.

    Let P_j(t) be the resulting orthogonal family coming from {t^j : j = 0, 1, 2, ...}
    after applying the Gram-Schmidt process to it with respect to the inner product

            <f, g> = ∫_J f(t) g(t) p(t) dt.

    (a) For every j, it is true that

            <t P_{j−1}(t), P_j(t)> = <P_j, P_j>.

    (b) There are two sequences of numbers {a_j}, {b_j}, with b_j > 0, such that

            P_j(t) = (t − a_j) P_{j−1}(t) − b_j P_{j−2}(t),    j ≥ 2.

    (c) P_j has j different real roots in J.

20. Consider the complex Hilbert space H of 2π-periodic, complex-valued func-
    tions u : [−π, π] → C under the inner product

            <u, v> = ∫_{−π}^{π} u(t) v̄(t) dt,

    where z̄ stands for the conjugate of the complex number z. Check that the
    exponential system

            { (1/√(2π)) exp(ijt) }_{j=0,±1,±2,...}

    is an orthonormal basis for H.
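Orthonormality (not completeness) is quick to confirm by quadrature; a sketch, where the normalizing constant 1/√(2π) is the standard one:

```python
import numpy as np

# e_j(t) = exp(ijt)/sqrt(2*pi) under <u, v> = integral of u conj(v) on [-pi, pi]
t = np.linspace(-np.pi, np.pi, 4001)

def e(j):
    return np.exp(1j * j * t) / np.sqrt(2.0 * np.pi)

def inner(u, v):
    return np.trapz(u * np.conj(v), t)

print(abs(inner(e(3), e(3))))   # ~ 1
print(abs(inner(e(3), e(5))))   # ~ 0
```

On a uniform periodic grid the trapezoidal rule is exact (up to rounding) for these integer-frequency exponentials, so the printed values are 1 and 0 to machine precision.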


21. The Haar system. Consider the function

            ψ(t) = 1 for 0 ≤ t < 1/2,    ψ(t) = −1 for 1/2 ≤ t < 1,    ψ(t) = 0 else.

    For a couple of integers n, m ∈ Z, define

            ψ_{m,n}(t) = 2^{n/2} ψ(2^n t − m).

    (a) Show that this collection of functions is an orthonormal basis of L^2(R).
    (b) Show that the family of functions

            {1} ∪ {ψ_{m,n}(t) : n ∈ N ∪ {0}, 0 ≤ m < 2^n}

        is an orthonormal basis of L^2(0, 1).
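The orthonormality in part (b) can be verified exactly on a dyadic midpoint grid, since the Haar functions are piecewise constant on dyadic intervals; a sketch for the levels n = 0, 1, 2:

```python
import numpy as np

def psi(t):
    return np.where((t >= 0) & (t < 0.5), 1.0,
                    np.where((t >= 0.5) & (t < 1.0), -1.0, 0.0))

def haar(m, n, t):
    return 2.0**(n / 2.0) * psi(2.0**n * t - m)

# midpoint grid on (0,1): exact integration of piecewise-constant functions
N = 2**12
t = (np.arange(N) + 0.5) / N
family = [np.ones(N)] + [haar(m, n, t) for n in range(3) for m in range(2**n)]
gram = np.array([[np.mean(f * g) for g in family] for f in family])
print(np.allclose(gram, np.eye(len(family))))   # True
```

The Gram matrix of the 8 functions {1, ψ_{0,0}, ψ_{0,1}, ψ_{1,1}, ψ_{0,2}, ..., ψ_{3,2}} is exactly the identity.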


22. Wavelets. Let ψ(t) be a real function. For fixed numbers a > 1 and b > 0,
    define

            ψ_{m,n}(t) = (1/√(a^m)) ψ((t − nb)/a^m),

    for n, m ∈ N. Find conditions on the function ψ in such a way that the family
    {ψ_{m,n}} is an orthonormal basis of L^2(R).
23. Find the orthogonal complement of H01 (J ) in H 1 (J ), for an interval J .
24. Use the orthogonality mechanism around (2.27), to find another two families of
orthogonal functions in L2 (0, 1).
25. For J = [0, 1], let E be the set of measurable functions defined on J. Set

            d(f, g) = ∫_0^1 |f(t) − g(t)| / (1 + |f(t) − g(t)|) dt.

    Show that d is a distance for which E becomes a complete metric space.


26. Consider the space

            X = {f(x) : (0, 1) → R, measurable : ∫_0^1 |f(x)|^{1/2} dx < +∞}.

    (a) Show that X is a vector space.
    (b) Argue that

            ||f||_{1/2} = ∫_0^1 |f(x)|^{1/2} dx

        is not a norm.
    (c) Check that

            d(f, g) = ∫_0^1 |f(x) − g(x)|^{1/2} dx

        is a distance on X that makes it a complete metric space.
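Reading the formula in (b) as ||f||_{1/2} = ∫_0^1 |f(x)|^{1/2} dx (an assumption, since the display is garbled in this copy), the failure is one of homogeneity: ||λf||_{1/2} = |λ|^{1/2} ||f||_{1/2}, not |λ| ||f||_{1/2}. A numerical sketch:

```python
import numpy as np

# With ||f||_{1/2} = integral of |f|^{1/2}, positive homogeneity fails:
# ||2f||_{1/2} = sqrt(2) ||f||_{1/2} != 2 ||f||_{1/2}.
x = np.linspace(0.0, 1.0, 100001)
f = x**2                                   # any concrete element of X

def half(f):
    return np.trapz(np.sqrt(np.abs(f)), x)

print(half(f), half(2 * f), 2 * half(f))
```

Here half(f) = ∫_0^1 x dx = 1/2, while half(2f) = √2/2 ≈ 0.707 rather than 1.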



27. (Jordan-von Neumann Theorem) Let E be a normed space. Show that the norm
    comes from an inner product if and only if the parallelogram identity

            ||x + y||^2 + ||x − y||^2 = 2||x||^2 + 2||y||^2,    x, y ∈ E,

    holds.
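A quick sketch of the "only if" direction in R^2: the Euclidean norm satisfies the identity exactly, while the sup-norm already fails for the standard basis vectors.

```python
import numpy as np

def gap(norm, x, y):
    # parallelogram defect: zero exactly when the identity holds for x, y
    return norm(x + y)**2 + norm(x - y)**2 - 2*norm(x)**2 - 2*norm(y)**2

x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])
g_euclid = gap(np.linalg.norm, x, y)            # 0.0: inner-product norm
g_sup = gap(lambda v: np.abs(v).max(), x, y)    # -2.0: sup-norm fails
print(g_euclid, g_sup)
```

A single pair with a non-zero defect is enough to conclude that the sup-norm cannot come from an inner product.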
28. Multidimensional Fourier series. Check that the family of functions

            { (2π)^{−N/2} exp(ij · x) }_{j∈Z^N}

    is an orthonormal basis for the Hilbert space L^2([−π, π]^N; C).


29. A simple obstacle problem. Use the projection theorem, Proposition 2.11, to
    prove that there is a unique solution to the variational problem

            Minimize in u ∈ H_0^1(0, 1):    ∫_0^1 u'(x)^2 dx

    subject to u(x) ≥ φ(x), where φ(x) is a continuous function with φ(1), φ(0) <
    0. Write explicitly a condition characterizing the minimizer.
30. For the functional

            E : H_0^1(0, 1) → R,    E(u) = ∫_0^1 [ψ(u'(x)) + φ(u(x))] dx,

    where ψ and φ are C^1, real functions such that

            |ψ(u)|, |φ(u)| ≤ C(1 + u^2)

    for a positive constant C, write a quadratic variational problem whose unique
    solution yields precisely E'(u) for an arbitrary u ∈ H_0^1(0, 1). We will later
    calculate explicitly such a derivative.
31. For p ∈ (0, 1), consider the function

            d(f, g) = ∫_Ω |f(x) − g(x)|^p dx,

    for f, g ∈ C^∞(Ω), where Ω ⊂ R^N is an open set as regular as it may be
    necessary.

    (a) Check that (C^∞(Ω), d) is a metric space.
    (b) Consider its completion, and show that L^p(Ω) is a complete metric space
        under the same distance function.
Chapter 3
Introduction to Convex Analysis: The Hahn-Banach and Lax-Milgram Theorems

3.1 Overview

Before we start diving into integral functionals, it is important to devote some time
to understand relevant facts for abstract variational problems. Because in such a
situation we do not assume any explicit form of the underlying functional, these
results cannot be as fine as those that can be shown when we materialize functionals
and spaces. However, the general route to existence of minimizers is essentially
the same for all kinds of functionals: it is called the direct method of the Calculus
of Variations. This chapter can be considered then as a brief introduction to the
fundamental field of Convex Analysis.
The classical Hahn-Banach theorem is one of those basic chapters of Functional
Analysis that needs to be known. In particular, it is the basic tool to prove one of
the most important existence results of minimizers under convexity and coercivity
assumptions in an abstract form. There are several versions of this important
theorem: one analytic, dealing with the extension of linear functionals; and two
geometric ones that focus on separation principles for convex sets. There are many
applications of these fundamental results that are beyond the scope of this text.
Some of them will be mentioned in the final Appendix of the book.
We also discuss two further fundamental results that readers ought to know. The
first one is basic for quadratic functionals in an abstract, general format: the Lax-
Milgram theorem. Its importance in variational problems and Partial Differential
Equations cannot be overstated. The second is also an indispensable tool in
Analysis: Stampacchia’s theorem for variational inequalities.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024


P. Pedregal, Functional Analysis, Sobolev Spaces, and Calculus of Variations,
La Matematica per il 3+2 157, https://doi.org/10.1007/978-3-031-49246-4_3

3.2 The Lax-Milgram Lemma

One first fundamental situation is concerned with quadratic functionals over a general Hilbert space H.

Definition 3.1 Let H be a (real) Hilbert space.

• A bilinear form on H is a mapping

        A(u, v) : H × H → R

  that is linear in each variable separately. If A is symmetric, i.e.

        A(u, v) = A(v, u),    u, v ∈ H,

  the function

        a(u) = (1/2) A(u, u)

  is identified as its associated quadratic form.
• The bilinear form A is continuous if there is a positive constant C > 0 with

        |A(u, v)| ≤ C ||u|| ||v||

  for every pair u, v ∈ H. If A is symmetric, then

        2a(u) ≤ C ||u||^2,    u ∈ H.

• The bilinear form A is coercive if there is a positive constant c > 0 such that

        c ||u||^2 ≤ A(u, u)

  for every u ∈ H. If, in addition, A is symmetric,

        c ||u||^2 ≤ 2a(u),    u ∈ H.

If a(u) is the quadratic form coming from a certain bilinear, symmetric form, we would like to look at the variational problem

        Minimize in u ∈ H :    a(u) − <U, u>                    (3.1)

for a given U ∈ H. The following classic result provides a very clear answer to such quadratic problems.

Theorem 3.1 (Lax-Milgram) Let A(u, v) be a symmetric, continuous, coercive bilinear form over a Hilbert space H, with associated quadratic form a(u). The previous variational problem (3.1) admits a unique minimizer u ∈ H that is determined by the condition

        A(u, v) = <U, v>                    (3.2)

for every v ∈ H.
Proof We start by making sure that the variational problem (3.1) is well-posed in the sense that if we put

        m = inf_{u∈H} (a(u) − <U, u>)

then m ∈ R. Indeed, by the coercivity property we can write

        a(u) − <U, u> ≥ (c/2) ||u||^2 − <U, u>,

and completing squares in the right-hand side,

        a(u) − <U, u> ≥ (c/2) ||u − (1/c)U||^2 − (1/(2c)) ||U||^2.

The resulting inequality shows that

        m ≥ −(1/(2c)) ||U||^2.

Let, then, {u^(j)} be a minimizing sequence:

        a(u^(j)) − <U, u^(j)> ↘ m.

In particular, by the calculations just indicated,

        (c/2) ||u^(j) − (1/c)U||^2 ≤ a(u^(j)) − <U, u^(j)> + (1/(2c)) ||U||^2,

which shows that {u^(j)} is uniformly bounded in H, and hence, for some subsequence which we do not care to relabel, we will have u^(j) ⇀ u for some u ∈ H. Again by the coercivity property, we find

        0 ≤ a(u^(j) − u) = (1/2) A(u^(j), u^(j)) − A(u^(j), u) + (1/2) A(u, u),

that is,

        A(u^(j), u) − (1/2) A(u, u) ≤ a(u^(j)).

If we take limits in j, because A(·, u) is a linear functional,

        A(u, u) − (1/2) A(u, u) ≤ lim inf_{j→∞} a(u^(j)),

and

        a(u) ≤ lim inf_{j→∞} a(u^(j)).                    (3.3)

Again, because <U, ·> is a linear operation, we can conclude, by the very definition of m as the infimum for our problem, that

        m ≤ a(u) − <U, u> ≤ lim inf_{j→∞} [a(u^(j)) − <U, u^(j)>] = m.

Consequently

        m = a(u) − <U, u>,

and u is truly a minimizer.


For the second part of the proof, we proceed in two steps. We first show that (3.2) holds indeed for the previous minimizer u. To this aim, take v ∈ H, and consider the one-dimensional section

        g(ε) = I(u + εv),    I(u) = a(u) − <U, u>.

Because u is a (global) minimizer for I, ε = 0 ought to be a global minimizer for g. On the other hand, it turns out, taking advantage of the bilinearity and symmetry of A, that g is quadratic, and so smooth. Consequently, g'(0) = 0, and a quick computation yields

        0 = g'(0) = A(u, v) − <U, v>.

The arbitrariness of v leads to (3.2).


Secondly, we argue that there cannot be two different vectors u_i, i = 1, 2, for which (3.2) holds. If it were true that

        A(u_i, v) = <U, v>,    v ∈ H, i = 1, 2,

we would definitely have

        A(u_1 − u_2, v) = 0

for every v ∈ H; in particular, for v = u_1 − u_2, we would find

        a(u_1 − u_2) = 0.

The coercivity of the quadratic form a means that u_1 = u_2. □



The Lax-Milgram lemma is a very powerful tool to deal with quadratic functionals
and linear optimality conditions (linear differential equations) because it does not
prejudge the nature of the underlying Hilbert space H: it is valid for every Hilbert
space, and every coercive quadratic functional. The Lax-Milgram lemma can also
be shown through Stampacchia's theorem (Exercise 10), which is treated in the final
section of the chapter.
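In finite dimension the statement is transparent, and easy to test numerically; a minimal sketch, where a small symmetric positive definite matrix A stands in for the bilinear form (the matrix and right-hand side are chosen here only for illustration):

```python
import numpy as np

# Finite-dimensional Lax-Milgram: for A symmetric positive definite, the
# minimizer of J(u) = 0.5 u.A u - U.u is the solution of A u = U, i.e. (3.2).
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])                 # symmetric, positive definite
U = np.array([1.0, 2.0])

u_star = np.linalg.solve(A, U)             # candidate from condition (3.2)

def J(u):
    return 0.5 * u @ A @ u - U @ u

# every perturbation strictly increases J, since
# J(u_star + v) = J(u_star) + 0.5 v.A v and A is positive definite
rng = np.random.default_rng(1)
print(all(J(u_star + rng.normal(size=2)) > J(u_star) for _ in range(200)))
```

The coercivity constant c of the theorem is here the smallest eigenvalue of A.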
In the previous proof, we have already anticipated some of the key ideas that will
guide us for more general situations.
Example 3.1 The first example is mandatory: take

        A(u, v) = <u, v>,

the inner product itself. In this case the variational problem (3.1) becomes the one examined in (2.32). There is nothing to be added.
Example 3.2 The prototypical example, in dimension one, that can be treated under the Lax-Milgram lemma is the following. Let H be the Hilbert subspace H_0^1(J) of functions in H^1(J) vanishing at the two end-points of J, for a finite interval J = (x_0, x_1) of R. Consider the bilinear form

        A(u, v) = ∫_J [α(x) u'(x) v'(x) + β(x) u(x) v(x)] dx,

where the functions α(x) and β(x) will be further restricted in the sequel. A is definitely symmetric. It is bounded if both functions α and β belong to L^∞(J). It is coercive if the same two functions are positive and bounded away from zero. The application of Theorem 3.1 directly leads to the following result.

Corollary 3.1 Suppose there is a constant C > 0 such that

        0 < C < min_{x∈J} (α(x), β(x)) ≤ max_{x∈J} (α(x), β(x)) ≤ 1/C.

For every function γ(x) ∈ L^2(J), the variational problem

        Minimize in u(x) ∈ H_0^1(J):    ∫_J [ (1/2) α(x) u'(x)^2 + (1/2) β(x) u(x)^2 − γ(x) u(x) ] dx

admits a unique minimizer u which is characterized by the condition

        ∫_J [α(x) u'(x) v'(x) + β(x) u(x) v(x) − γ(x) v(x)] dx = 0                    (3.4)

for every v ∈ H_0^1(J).

This last integral condition is typically called the weak form of the boundary-value (Sturm-Liouville) problem

        −[α(x) u'(x)]' + β(x) u(x) = γ(x) in J,    u(x_0) = u(x_1) = 0.                    (3.5)

3.3 The Hahn-Banach Theorem: Analytic Form

The analytic form of the Hahn-Banach theorem is typically based on Zorn’s lemma,
which is taken for granted here. The relevant concepts to understand Zorn’s lemma
are gathered in the following statement.
Definition 3.2 Let P be a partially ordered set.

• A subset Q is totally ordered if for every pair u, v in Q, either u ≤ v or v ≤ u.
• An element w ∈ P is an upper bound for a subset Q ⊂ P if u ≤ w for every u ∈ Q.
• An element m ∈ P is maximal if there is no element u ∈ P with m ≤ u except m itself.
• P is said to be inductive if every chain (i.e. totally ordered subset) Q in P has an upper bound.
With these concepts we have:
Lemma 3.1 (Zorn’s Lemma) Every non-empty, inductive, ordered set has a max-
imal element.
In addition, we need to define what is understood by a seminorm in a (real) vector space E. It is a real, non-negative function complying with N2 and N3, but not with N1, in Definition 2.1 in Chap. 2.
Definition 3.3 A function

        p(x) : E → R

is called a subnorm in E if:

N2  for every pair x, y ∈ E,

        p(x + y) ≤ p(x) + p(y);

N3' for every positive scalar λ and vector x ∈ E,

        p(λx) = λ p(x).

If, in addition, p is non-negative, then p is called a seminorm.


We are now ready to prove the analytic form of the Hahn-Banach theorem.

Theorem 3.2 Let p be a subnorm over a (real) vector space E; M, a subspace of E; and T_0 a linear functional on M with

        T_0(x) ≤ p(x),    x ∈ M.

Then a linear extension T of T_0 to all of E can be found, i.e. T|_M = T_0, so that

        T(x) ≤ p(x),    x ∈ E.                    (3.6)

Proof The standard proof of this result proceeds in two steps. The first one deals with the case in which

        E = M ⊕ <x_0>,

where <x_0> stands for the subspace spanned by some x_0 ∈ E. In this situation, every x ∈ E can be written in a unique way in the form m + λx_0, and, by linearity,

        T(x) = T(m) + λT(x_0) = T_0(m) + λT(x_0).

Thus, it suffices to specify how to select the number μ = T(x_0) appropriately to ensure inequality (3.6) in this case, that is to say

        T_0(m) + λμ ≤ p(m + λx_0),

or

        λμ ≤ p(m + λx_0) − T_0(m)

for every m ∈ M and every λ ∈ R. For λ > 0, this last inequality becomes

        μ ≤ p(x_0 + (1/λ)m) − T_0((1/λ)m);

while for λ < 0, we should have

        μ ≥ −p(−x_0 − (1/λ)m) + T_0(−(1/λ)m).

If we let

        v = −(1/λ)m,    u = (1/λ)m,

it suffices to guarantee that there is a real number μ such that

        T_0(v) − p(v − x_0) ≤ μ ≤ −T_0(u) + p(u + x_0)

for every u, v ∈ M. But for every such arbitrary pair, by hypothesis and the subadditivity property of p,

        T_0(u) + T_0(v) = T_0(u + v) ≤ p(u + v) ≤ p(u + x_0) + p(v − x_0).

This finishes our first step.


For the second step, consider the class

        G = {(F, T_F) : F is a subspace, M ⊂ F ⊂ E, T_F is linear in F,
             T_F|_M = T_0, T_F(x) ≤ p(x), x ∈ F}.

G is an ordered set under the order relation

        (F_1, T_{F_1}) < (F_2, T_{F_2}) when F_1 ⊂ F_2, T_{F_2}|_{F_1} = T_{F_1}.

G is non-empty because (M, T_0) ∈ G. It is inductive too. To this end, let (F_i, T_{F_i}) be a chain in G. Set F = ∪_i F_i, which is a subspace of E as well. For x ∈ F, define

        T_F(x) = T_{F_i}(x),    x ∈ F_i.

There is no ambiguity in this definition, in case x belongs to several of the F_i's. By Zorn's lemma, there is a maximal element (H, T_H) in G. Suppose H is not all of E, and let x_0 ∈ E \ H. By our first step applied to the direct sum

        G = H ⊕ <x_0>,

we could find T_G with

        T_G|_H = T_H,    T_G(x) ≤ p(x), x ∈ G.

This would contradict the maximality of (H, T_H) in G, and hence H must be the full space E. □


Among the most important consequences of Theorem 3.2 are the following.

Corollary 3.2 Let E be a Banach space, with dual E'.

1. Let F be a linear subspace of E. If T_0 : F → R is linear and continuous (T_0 ∈ F'), there is T ∈ E' with

        T|_F = T_0,    ||T|| = ||T_0||.

2. For every non-vanishing vector x_0 ∈ E, and real α, there is T_0 ∈ E' with

        <T_0, x_0> = α,    |<T_0, x>| ≤ (|α|/||x_0||) ||x||, x ∈ E.

   In particular, for α = ||x_0||^2 there is T_0 ∈ E' with

        <T_0, x_0> = ||x_0||^2,    ||T_0|| = ||x_0||.

3. For every x ∈ E,

        ||x|| = sup_{T∈E': ||T||=1} <T, x> = max_{T∈E': ||T||=1} <T, x>.

Proof For the first part, apply directly Theorem 3.2 to the choice

        p(x) = ||T_0|| ||x||.

Take, for the second, M = <x_0> and

        T_0(λx_0) = λα.

Then, it is obvious that

        T_0(λx_0) ≤ (|α|/||x_0||) ||λx_0||.

Apply Theorem 3.2 with

        p(x) = (|α|/||x_0||) ||x||.

The third statement is a consequence of the second one, and is left as an exercise.



3.4 The Hahn-Banach Theorem: Geometric Form

A main tool for the study of the geometric form of the Hahn-Banach theorem and its
consequences concerning the separation of convex sets is the Minkowski functional
of an open convex set. Recall the following.
Definition 3.4

1. A convex set C of a vector space E is one such that convex combinations of elements of C stay in C:

        tx + (1 − t)y ∈ C,    x, y ∈ C, t ∈ [0, 1].

2. The convexification of a set C is given by

        co(C) = { Σ_i λ_i x_i : x_i ∈ C, λ_i ∈ [0, 1], Σ_i λ_i = 1, the sum finite }.

It is immediate to check that the convexification of any set is always convex.

Definition 3.5 For a subset C ⊂ E, we define the Minkowski functional of C as

        p_C(x) : E → [0, +∞],    p_C(x) = inf{ρ > 0 : x ∈ ρC}.

Important properties of p_C depend on conditions on C.

Lemma 3.2 Let C ⊂ E be open, convex, with 0 ∈ C. Then p_C is a subnorm (Definition 3.3), there is a positive constant M with

        0 ≤ p_C(x) ≤ M ||x||,    x ∈ E,

and

        C = {x ∈ E : p_C(x) < 1}.

Proof Property N3' in Definition 3.3 is immediate. If C is open, and 0 ∈ C, there is some positive radius r > 0 such that the closed ball B_r ⊂ C. This means that

        x ∈ (||x||/r) B_r ⊂ (||x||/r) C,

and, by definition,

        0 ≤ p_C(x) ≤ M ||x||,    M = 1/r.

Suppose x ∈ C. Because C is open, for some ε > 0, (1 + ε)x ∈ C, i.e.

        p_C(x) ≤ 1/(1 + ε) < 1.

If, on the other hand, p_C(x) < 1, then ρ ∈ (0, 1) can be found so that

        x ∈ ρC,    (1/ρ)x ∈ C.

Since

        0, (1/ρ)x ∈ C,

by convexity

        x = ρ(1/ρ)x + (1 − ρ)0 ∈ C.

We finally check that p_C complies with N2. We already know that for positive ε and x ∈ E,

        (1/(p_C(x) + ε)) x ∈ C,

because

        p_C( (1/(p_C(x) + ε)) x ) < 1.

For every couple x, y ∈ E, and ε > 0, by the convexity of C, we will have that

        (t/(p_C(x) + ε)) x + ((1 − t)/(p_C(y) + ε)) y ∈ C

for every t ∈ [0, 1], i.e.

        p_C( (t/(p_C(x) + ε)) x + ((1 − t)/(p_C(y) + ε)) y ) < 1.

By selecting t in such a way that

        t/(p_C(x) + ε) = (1 − t)/(p_C(y) + ε),    t = (p_C(x) + ε)/(p_C(x) + p_C(y) + 2ε),

we will have

        p_C(x + y) ≤ p_C(x) + p_C(y) + 2ε.

The arbitrariness of ε implies the triangle inequality. □
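The definition of p_C can also be exercised numerically; a sketch with C the open unit ball of the l^1-norm in R^2 (so that p_C should reproduce ||·||_1), computing the infimum in the definition by bisection:

```python
import numpy as np

def in_C(x):
    return np.abs(x).sum() < 1.0           # C: open unit ball of the l^1 norm

def p_C(x, tol=1e-10):
    # Minkowski functional p_C(x) = inf{rho > 0 : x in rho*C}, by bisection
    lo, hi = 0.0, 1.0
    while not in_C(x / hi):                # enlarge hi until x/hi lands in C
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if in_C(x / mid) else (mid, hi)
    return hi

x, y = np.array([0.3, -1.2]), np.array([0.5, 0.2])
print(p_C(x))                                  # ~ ||x||_1 = 1.5
print(p_C(x + y) <= p_C(x) + p_C(y) + 1e-8)    # subadditivity, property N2
```

For this C the functional recovers the norm exactly; for a general open convex set containing 0 it is merely a subnorm, as the lemma states.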



We now introduce the concept of separation of sets.

Definition 3.6 Let E be a Banach space with dual E', and F, G two subsets of E. For T ∈ E', the hyperplane

        M = {T = α},    α ∈ R,

separates F from G if

        <T, x> ≤ α, x ∈ F;    <T, x> ≥ α, x ∈ G.

The separation is strict if ε > 0 can be found such that

        <T, x> ≤ α − ε, x ∈ F;    <T, x> ≥ α + ε, x ∈ G.

One preliminary step corresponds to the particular case in which one of the sets is a singleton.

Lemma 3.3 Let C be a non-empty, open, convex set in E, and x_0 ∉ C. Then a functional T ∈ E' can be found with

        <T, x> < <T, x_0>,    x ∈ C.

In this way, the closed hyperplane {T = <T, x_0>} separates C from {x_0}.

Proof Without loss of generality, through a translation, we may suppose that 0 ∈ C. Let p = p_C be the Minkowski functional for C, and let G be the one-dimensional subspace generated by x_0. Define

        T_0 ∈ G',    <T_0, x_0> = 1.

Then, for t > 0, we have

        <T_0, tx_0> = t ≤ t p(x_0) = p(tx_0)

because x_0 ∉ C; whereas for t ≤ 0,

        <T_0, tx_0> = t ≤ 0 ≤ p(tx_0).

The analytic form of the Hahn-Banach theorem implies that T_0 can be extended to all of E in such a way that

        <T, x> ≤ p(x),    x ∈ E,

if T ∈ E' is such an extension. Note that T is continuous because, for some M > 0,

        p(x) ≤ M ||x||,    x ∈ E.

In particular, for every x ∈ C,

        <T, x> ≤ p(x) < 1 = <T, x_0>.    □



We are now ready to deal with the first form of the geometric version of the Hahn-Banach theorem.

Theorem 3.3 Let F, G be two convex, disjoint subsets of the Banach space E with dual E'. If at least one of the two is open, there is a closed hyperplane separating them from each other.

Proof Suppose F is open, and put

        C = F − G = {x − y : x ∈ F, y ∈ G} = ∪_{y∈G} (F − y).

It is elementary to check that C is convex, open (because each F − y is), and 0 ∉ C. By our Lemma 3.3, conclude that there is T ∈ E' such that

        <T, z> < 0,    z ∈ C.

Since every difference

        z = x − y,    x ∈ F, y ∈ G,

belongs to C, we find

        <T, x> − <T, y> < 0.

If ρ ∈ R is chosen in such a way that

        sup_{x∈F} <T, x> ≤ ρ ≤ inf_{y∈G} <T, y>,

we will have that the hyperplane {T = ρ} separates F from G. □




Asking for a strict separation between F and G demands much more restrictive assumptions on both sets.

Theorem 3.4 Let F, G be two disjoint, convex sets of E. Suppose F is closed, and G, compact. There is, then, a closed hyperplane that strictly separates them from each other.

Proof We again consider the difference set C as in the proof of the first version. We leave as an exercise to show that C is, in addition to convex, closed. The more restrictive hypotheses are to be used in a fundamental way here. Since 0 ∉ C, there is some positive r such that the open ball B_r, centered at 0 with radius r, has an empty intersection with C. By Theorem 3.3 applied to C and B_r, we find a non-null T ∈ E' so that

        <T, x − y> ≤ <T, rz>,    x ∈ F, y ∈ G, z ∈ B_1.

This yields, by taking the infimum in z ∈ B_1,

        (1/r) <T, x − y> ≤ −||T||,    x ∈ F, y ∈ G,

that is to say

        <T, x> + r||T||/2 ≤ <T, y> − r||T||/2,    x ∈ F, y ∈ G.

As before, for ρ selected to ensure that

        sup_{x∈F} <T, x> + r||T||/2 ≤ ρ ≤ inf_{y∈G} <T, y> − r||T||/2,

we will have that the hyperplane {T = ρ} strictly separates F from G. □

3.5 Some Applications

Our first application is a helpful fact that sometimes may have surprising consequences. Its proof is a direct application of Theorem 3.4 when the closed set is a closed subspace, and the compact set is a singleton.

Corollary 3.3 Let M be a subspace of a Banach space E with dual E'. If the closure M̄ of M is not the full space E, then there is a non-null T ∈ E' such that

        <T, x> = 0,    x ∈ M.

Proof For the choice

        F = M̄,    G = {x_0}, x_0 ∈ E \ M̄,

by Theorem 3.4, we would find T ∈ E' and ρ ∈ R such that

        <T, x> < ρ < <T, x_0>,    x ∈ M̄.

Since

        λx ∈ M if x ∈ M,

we realize that the only value of ρ compatible with the previous left-hand side inequality is ρ = 0, and then the inequality must be an equality. □

One of the most appealing applications of this corollary consists in the conclusion that a subspace M is dense in E, M̄ = E, if the assumption

        <T, x> = 0 for every x ∈ M

implies that T ≡ 0 as an element of E'.


There is a version of the preceding corollary for general convex sets, not necessarily subspaces, that is usually utilized in a negative form.

Corollary 3.4 Let F be a convex set of a Banach space E with dual E'. If a further set G cannot be separated from F, in the sense that

        <T, x> + ρ ≥ 0 for every x ∈ F and some T ∈ E'

implies

        <T, x> + ρ ≥ 0 for every x ∈ G,

then G ⊂ F.

There is even a version of this result in the dual of a Banach space.

Corollary 3.5 Let F be a convex set of the dual space E' of a Banach space E. If a further subset G ⊂ E' cannot be separated from F, in the sense that

        <T, x> + ρ ≥ 0 for every T ∈ F and some x ∈ E

implies

        <T, x> + ρ ≥ 0 for every T ∈ G,

then G ⊂ F.

The following form is, however, usually better adapted to practical purposes.

Corollary 3.6 Let F be a set of the dual space E' of a Banach space E. Suppose a further subset G ⊂ E' cannot be separated from F in the following sense: whenever

        <T, x> + ρ < 0 for some T ∈ G, x ∈ E, ρ ∈ R,

then there is T̂ ∈ F with

        <T̂, x> + ρ < 0.

Then G ⊂ co(F).
A fundamental consequence for variational methods is the following.

Theorem 3.5 Every convex, closed set C in a Banach space E is weakly closed.

Proof Let

        u_j ⇀ u,    u_j ∈ C,

with C convex and closed in E. Suppose u ∉ C. We can apply Theorem 3.4, and conclude the existence of an element T ∈ E' and some ε > 0 such that

        <T, u> < ε,    <T, v> ≥ ε

for all v ∈ C. This is a contradiction because each v = u_j ∈ C, and then we would have

        <T, u_j> ≥ ε,    <T, u> < ε,

which is impossible if

        lim_{j→∞} <T, u_j> = <T, u>.    □

3.6 Convex Functionals, and the Direct Method

The Lax-Milgram lemma (Theorem 3.1) is a very efficient tool to deal with quadratic
functionals and linear boundary-value problems for PDEs. If more general situations
are to be examined, one needs to rely on more general assumptions, and convexity
stands as a major structural property. We gather here the main concepts related to
the minimization of abstract functionals.

Definition 3.7 A functional I : E → R, defined in a Banach space E, is:

1. convex if

        I(tu_1 + (1 − t)u_0) ≤ tI(u_1) + (1 − t)I(u_0)

   for every

        u_1, u_0 ∈ E,    t ∈ [0, 1];

   it is strictly convex if, in addition, the identity

        I(tu_1 + (1 − t)u_0) = tI(u_1) + (1 − t)I(u_0)

   is only possible when

        u_1 = u_0,    or    t(1 − t) = 0;

2. lower semicontinuous if

        I(u) ≤ lim inf_{j→∞} I(u_j)

   whenever u_j → u in E;

3. coercive if

        I(u) → ∞ when ||u|| → ∞;

4. bounded from below if there is a constant c ∈ R such that

        c ≤ I(u),    u ∈ E.

It is not difficult to check that under our assumptions in Theorem 3.1, one can show the strict convexity of the functional

        I : H → R,    I(u) = a(u) − <U, u>.                    (3.7)

Indeed, let

        u_1, u_0 ∈ H,    t ∈ [0, 1].

We would like to conclude that

        I(tu_1 + (1 − t)u_0) ≤ tI(u_1) + (1 − t)I(u_0),

and that equality can only occur if u_1 = u_0, or t(1 − t) = 0. Since the second contribution to I is linear in u, it does not perturb its convexity. It suffices to check that indeed

        a(tu_1 + (1 − t)u_0) ≤ ta(u_1) + (1 − t)a(u_0),

with equality under the same circumstances as above. The elementary properties of the bilinear form A lead to

        ta(u_1) + (1 − t)a(u_0) − a(tu_1 + (1 − t)u_0) = t(1 − t) a(u_1 − u_0).

Because the quadratic form a is coercive, it is strictly positive (except at the zero vector), and this equality implies the strict convexity of I.

This strict convexity of I, together with its coercivity, has two main consequences that hold true regardless of the dimension of H (exercise):

1. I admits a unique (global) minimizer;
2. local and global minimizers are exactly the same vectors.
There is another fundamental property that has been used for the quadratic functional I in (3.7). It has been explicitly stated in (3.3). It can also be regarded in a general Banach space. Because it will play a special role for us, we have separated it from our general definition above.

Definition 3.8 Let I : E → R be a functional over a Banach space E. I is said to enjoy the (sequential) weak lower semicontinuity property if

        I(u) ≤ lim inf_{j→∞} I(u_j)

whenever u_j ⇀ u in E.

Note that Definition 3.7 introduces lower semicontinuity with respect to convergence in the Banach space itself. We will refer to this, to distinguish it from weak lower semicontinuity, as strong lower semicontinuity whenever appropriate.
Before we focus on convexity, let us describe the direct method of the Calculus of Variations in general terms.

Proposition 3.1 Let I : E → R be a functional over a reflexive Banach space E, bounded from below by some finite constant, and complying with the two conditions:

1. coercivity:

        I(u) → ∞ when ||u|| → ∞;

2. weak lower semicontinuity according to Definition 3.8.

Then there is a global minimizer for I in E.

Proof Let m ∈ R be the infimum of I over E, and let {u_j} be minimizing:

        I(u_j) ↘ m.

The coercivity condition implies that {u_j} is uniformly bounded in E, and, hence, by the weak compactness principle (Theorem 2.3), there is a subsequence (not relabeled) converging weakly to some u ∈ E: u_j ⇀ u. The weak lower semicontinuity property leads to

        m ≤ I(u) ≤ lim inf_{j→∞} I(u_j) = m,

and u becomes a global minimizer for I in E. □



This principle is quite easy to understand, and it isolates requirements for a functional to attain its infimum. However, we would like to stress that convexity comes into play concerning the weak lower semicontinuity property.

Proposition 3.2 Let I : E → R be a convex, (strongly) lower semicontinuous functional over a Banach space E. Then it is weakly lower semicontinuous.

Proof The proof is based on one of our main consequences of the Hahn-Banach theorem, specifically Theorem 3.5. We showed there that convex, closed sets in Banach spaces are also weakly closed.

Suppose u_j ⇀ u, but

        I(u) > lim inf_{j→∞} I(u_j).

Select m ∈ R such that

        lim inf_{j→∞} I(u_j) < m < I(u).

The sublevel set

        C = {v ∈ E : I(v) ≤ m}

is convex and closed, precisely because I is convex and lower semicontinuous. Hence, by Theorem 3.5, C is also weakly closed. But then the two facts

        u_j ⇀ u,    lim inf_{j→∞} I(u_j) < m

imply u ∈ C, which is impossible if m < I(u). □



As a main consequence, we have an abstract existence theorem for convex functionals that is a direct consequence of Propositions 3.2 and 3.1.

Corollary 3.7 Let I : E → R be a convex, lower semicontinuous functional over a reflexive Banach space E, that is bounded from below by some finite constant, and coercive in the sense

        I(u) → ∞ when ||u|| → ∞.

Then there is a global minimizer for I in E.

Though this corollary yields a first existence result for a large class of integral functionals, we will focus on an independent, more specific strategy that can be generalized to tackle larger classes of problems, as we will see later in the book, and broader notions of convexity that are central to vector problems.

Conditions to ensure uniqueness of minimizers are, typically, much more restrictive and are usually associated with strict convexity.

Proposition 3.3 Assume, in addition to the hypotheses in Corollary 3.7, that the functional I : E → R is strictly convex in the sense of Definition 3.7. Then there is a unique minimizer for I in E.
Proof The proof of uniqueness based on strict convexity is quite standard. Suppose
there could be two minimizers u1, u0,

I(u1) = I(u0) = m = min I.

Then we would have

m ≤ I((1/2)u1 + (1/2)u0) ≤ (1/2)I(u1) + (1/2)I(u0) = m.

Hence

I((1/2)u1 + (1/2)u0) − (1/2)I(u1) − (1/2)I(u0) = 0

and from the strict convexity, we realize that the only alternative is u1 = u0. ⨆

Even for more specific classes of functionals, like the ones we will consider in the
next chapter, uniqueness is always associated with this last proposition, so that strict
convexity of functionals needs to be enforced. This is, however, not so for existence,
since we can have existence results even though the functional I is not convex.
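The midpoint argument in the proof of Proposition 3.3 is easy to check numerically. The following short sketch (an illustration added here, not part of the text; the function and points are arbitrary choices) verifies the strict midpoint inequality for the strictly convex f(x) = x², which is exactly what rules out two distinct minimizers.

```python
# Numerical sketch of the midpoint argument behind Proposition 3.3.
# For a strictly convex f, two distinct points u0 != u1 always satisfy
# f((u0 + u1)/2) < (f(u0) + f(u1))/2, so they cannot both attain the minimum.

def f(x):
    return x * x  # strictly convex on R

u0, u1 = -2.0, 3.0                 # two hypothetical distinct minimizers
mid = f(0.5 * u0 + 0.5 * u1)       # value at the midpoint
avg = 0.5 * f(u0) + 0.5 * f(u1)    # average of the two values
assert mid < avg                   # strict: contradicts both sharing the minimal value
```

If both points shared the minimal value m, the chain m ≤ mid < avg = m would be contradictory.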

3.7 Convex Functionals, and the Indirect Method

The direct method of the Calculus of Variations expressed in Proposition 3.1,
or more specifically in Corollary 3.7, allows one to prove existence of minimizers
“directly”, without any further computation except possibly those directed towards
checking coercivity and/or convexity; no calculation is involved in dealing with
the derivative of the functional. In contrast, taking advantage of optimality
conditions in the spirit of Lemma 2.7, to decide if there are minimizers, falls under
the action of techniques that we could include in what might be called the indirect
method.
We already have a very good example to understand this distinction between the
direct and the indirect method in the important Lax-Milgram Theorem 3.1, though
in this particular, fundamental case both are treated simultaneously. Recall that in
this context we are dealing with a symmetric, continuous, coercive bilinear form A(u, v)
over a Hilbert space H, with associated quadratic form

a(u) = (1/2) A(u, u).

1. The direct method would ensure, based on the coercivity and the convexity of
a(u), that there is a unique minimizer of problem (3.1)

Minimize in u ∈ H :    a(u) − <U, u>    (3.8)

for any given U ∈ H.


2. The indirect method would focus on the condition of optimality (3.2)

A(u, v) = <U, v>

for every v ∈ H, and, assuming that we are able to find, independently of the direct
method, one solution u, argue, again based on the convexity of the quadratic
functional in (3.8), that such element u is indeed a minimizer for problem (3.8).
Said differently, the direct method points to a solution, the minimizer, of the
conditions of optimality, under smoothness of the functional. As a matter of fact, a
minimizer of a variational problem, regardless of how it has been obtained, will be a
solution of the optimality conditions, under hypotheses guaranteeing smoothness of the
functional. From this perspective, we say that existence of solutions of the optimality
conditions is necessary for the existence of minimizers, even in the absence of
convexity. But convexity is required to guarantee that existence of solutions of the
optimality conditions is sufficient for (global) minimizers.
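In finite dimensions the two routes can be compared directly. The sketch below (our illustration, not from the text; the matrix and data are random choices) builds a symmetric, coercive bilinear form from a positive definite matrix A, solves the optimality condition Au = U (the indirect route), and checks that the solution minimizes the quadratic functional (the direct route).

```python
import numpy as np

# A finite-dimensional Lax-Milgram sketch: A(u, v) = v.(A u) with A symmetric
# positive definite, and q(v) = (1/2) v.A v - U.v the associated functional.

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # symmetric and coercive
U = rng.standard_normal(n)

u = np.linalg.solve(A, U)            # indirect route: solve A u = U

def q(v):
    return 0.5 * v @ A @ v - U @ v   # quadratic functional

# direct route: q(u) lies below q(v) for a cloud of trial vectors v
trials = rng.standard_normal((200, n))
assert all(q(u) <= q(v) + 1e-12 for v in trials)
```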
Consider the variational problem

Minimize in u ∈ H :    I(u)    (3.9)

where H is a Hilbert space, and I is differentiable (Definition 2.13).


Proposition 3.4
1. If u ∈ H is a minimizer for (3.9), then

<I'(u), U> = 0    (3.10)

for every U ∈ H.
2. If u ∈ H is a solution of (3.10), and I is convex, then u is an optimal solution
for (3.9).
3. If I is strictly convex, then each of the problems (3.9) and (3.10) has a unique solution.
Proof Though the proof is pretty elementary, its significance goes well beyond
that simplicity: (3.10) is the abstract expression of the fundamental Euler-Lagrange
equation in its weak form. We will review it in more specific frameworks in
subsequent chapters.
If u is indeed a minimizer for (3.9), then ε = 0 has to be a (global) minimizer for
each section

ε ∈ R ↦ I(u + εU).

In particular, its derivative at ε = 0 which, by Lemma 2.7, is given by (3.10), must
vanish. This does not require any convexity. On the other hand, if I is convex, then

I(u + U) ≥ I(u) + <I'(u), U>

for every u, U in H (exercise below). In particular, if u complies with (3.10),
then we immediately see that

I(u + U) ≥ I(u)

for every U, and u becomes a minimizer. The uniqueness has already been treated
in the last section under strict convexity. This is left as an exercise. ⨆

The relevance of the simple ideas in this section will be better appreciated in those
cases when the direct method for integral functionals is inoperative because of a lack
of appropriate coercivity properties. In such a situation, the indirect method may be
explored to see if the existence of global minimizers can be reestablished.

3.8 Stampacchia’s Theorem: Variational Inequalities

Stampacchia’s theorem is a fundamental statement to deal with bilinear, not
necessarily symmetric, forms in a Hilbert space H. It is the simplest example of a
variational inequality, and it is typically used to derive the Lax-Milgram lemma as an
easy corollary (Exercise 10). As we have shown above, we have followed a different,
independent, more variational route to prove that crucial lemma. However, it is
important to know this other fundamental result by Stampacchia. Its proof is a
beautiful application of the classical contraction principle, Theorem 1.1.
Theorem 3.6 Let A(u, v) be a continuous, coercive bilinear form in a Hilbert
space H, and let K ⊂ H be non-empty, closed, and convex. For every T ∈ H',
there is a unique u ∈ K such that

A(u, v − u) ≥ <T, v − u>

for every v ∈ K. If, in addition, A is symmetric, the vector u is characterized as the
unique minimizer of the quadratic functional

v ↦ (1/2) A(v, v) − <T, v>

over K.
Proof Through the Riesz-Fréchet representation theorem, Proposition 2.14, it is
easy to show the existence of a linear, continuous operator A : H → H such that

A(u, v) = <Au, v>,    ||Au|| ≤ C||u||,    u, v ∈ H,    (3.11)

and, moreover,

<Au, u> ≥ (1/C) ||u||²,    u ∈ H,    (3.12)

for some positive constant C. Similarly, through the same procedure, there is some
t ∈ H with

<T, v> = <t, v>,    v ∈ H.

Note that the left-hand side is the duality pair in the Hilbert space H, while the
right-hand side is the inner product in H. There is, we hope, no confusion in using
the same notation. In these new terms, we are searching for a vector u ∈ K such that

<Au, v − u> ≥ <t, v − u>,    v ∈ K.    (3.13)

If for a positive r, we recast (3.13) in the form

<rt − rAu + u − u, v − u> ≤ 0,    v ∈ K,

we realize, after Proposition 2.11, concerned with the orthogonal projection onto a
convex set, that the previous inequality is equivalent to the functional equation

u = πK(rt − rAu + u).

We would like to interpret this equation as a fixed point for the mapping

u ↦ Tu ≡ πK(rt − rAu + u).

The contraction principle, Theorem 1.1, guarantees the existence of a unique fixed
point for such a mapping, provided that it is a contraction. Since every projection
is nonexpansive (check the discussion after the proof of Proposition 2.11), we find
that

||Tu1 − Tu2||² ≤ ||u1 − u2 − rA(u1 − u2)||²
             = ||u1 − u2||² − 2r<u1 − u2, A(u1 − u2)> + r²||A(u1 − u2)||²
             ≤ ||u1 − u2||² − (2r/C)||u1 − u2||² + C²r²||u1 − u2||²
             = (1 − 2r/C + r²C²) ||u1 − u2||²,

by (3.11) and (3.12). If we select

0 < r < 2/C³,

we clearly see that T becomes indeed a contraction, and it admits a unique fixed
point u, which is the vector sought, as remarked earlier.
The symmetric case and the minimization property are left as an exercise, though
it takes us back to the Lax-Milgram theorem. ⨆
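The fixed-point iteration in the proof is easy to run in a toy setting. The sketch below (a numerical illustration added here; the matrix, box, data and step size are our choices) takes K = [0, 1]² in R², a coercive but non-symmetric matrix in place of the bilinear form, iterates u ↦ πK(rt − rAu + u) until it stabilizes, and then checks the variational inequality on a grid of K.

```python
import numpy as np

# Stampacchia's iteration on R^2 with K = [0, 1]^2.
# <Au, u> = 2|u|^2 (coercive) while A is not symmetric.
A = np.array([[2.0, 1.0], [-1.0, 2.0]])
t = np.array([3.0, -1.0])
r = 0.1                                  # small enough to get a contraction

def proj(w):
    return np.clip(w, 0.0, 1.0)          # orthogonal projection onto the box K

u = np.zeros(2)
for _ in range(1000):
    u = proj(u + r * (t - A @ u))        # T u = pi_K(r t - r A u + u)

# verify <Au, v - u> >= <t, v - u> for sampled v in K
grid = [np.array([a, b]) for a in np.linspace(0, 1, 21)
        for b in np.linspace(0, 1, 21)]
assert all((A @ u - t) @ (v - u) >= -1e-8 for v in grid)
```

Here the unconstrained solution of Au = t lies outside K, so the variational inequality is genuinely active on the boundary of the box.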

3.9 Exercises

1. Prove the third part of Corollary 3.2 relying on the other two parts.
2. For a finite set of vectors

{x1 , x2 , . . . , xk } ⊂ E
.

in a normed space E, show that the three numbers

sup{ ||Σj λj xj|| : λj = ±1 },
sup{ ||Σj λj xj|| : −1 ≤ λj ≤ 1 },
sup{ Σj |<x', xj>| : x' ∈ E', ||x'|| ≤ 1 },

are identical.
3. Show that a difference set C = F − G is closed if F is closed and G, compact.

4. Let

f(x) : RN → R

be coercive and strictly convex. Prove that f admits a unique local minimum,
which is also global.
5. (a) Let a function

f(x) : Rn → R

be C1. Show that it is convex if and only if

f(x + y) ≥ f(x) + ∇f(x) · y

for arbitrary x, y ∈ Rn.
(b) Suppose the functional I : E → R is differentiable and convex, where E is
a Hilbert space. Prove that

I(u + U) ≥ I(u) + <I'(u), U>

for every u, U in E. If, in addition, I is strictly convex, then the equality

I(u + U) = I(u) + <I'(u), U>

is only valid if U ≡ 0. Based on this, prove the third item of Proposition 3.4.
6. Show that if u is a smooth solution of (3.5), then (3.4) is correct for every
v ∈ H01 (J ).
7. Consider the Hilbert space H = H01(0, 1).
(a) Argue that

<u, v> = ∫_0^1 u'(t)v'(t) dt

is an equivalent inner product in H, and regard H as endowed with it.
(b) Consider the bilinear form

A(u, v) = ∫_0^1 t u'(t)v'(t) dt,

and study the corresponding Lax-Milgram problem (3.1) for U = 0.
Show that there is no minimizer for it. What is the explanation for such
a situation?

8. In the context of the Lax-Milgram theorem, consider the map T : H → H
taking each U ∈ H into the unique solution u. Show that this is a linear
continuous operator.
9. Show Stampacchia’s theorem for the symmetric case by defining on H a new,
equivalent inner product

<u, v>A = A(u, v),

and using the orthogonal projection theorem for it. In this same vein, argue that
the solution u is characterized as the unique minimizer of the corresponding
quadratic functional in the statement of Stampacchia’s theorem.
10. Prove the Lax-Milgram lemma from Stampacchia’s theorem.
11. For f(x) ∈ L2(J), define the functional I : H01(J) → R through

I(u) = ∫_J F(v(x)) dx,

−(u'(x) + v'(x))' + f(x) = 0 in J,    v ∈ H01(J).

Explore conditions on the integrand F : R → R allowing for the direct method
to be applied.
12. For an open, convex subset D in RN, let u(x) : D → R be a convex function.
Let (X, μ) be a probability space, and

vj(x) : X → R,    1 ≤ j ≤ N,

integrable functions such that the joint mapping

x ↦ (vj(x))j ∈ D

and the composition

x ↦ u(v1(x), v2(x), . . . , vN(x))

is integrable (with respect to μ).
(a) Argue that

(∫_X vj(x) dμ(x))_j ∈ D.

(b) Prove Jensen’s inequality

u(∫_X v1(x) dμ(x), . . . , ∫_X vN(x) dμ(x)) ≤ ∫_X u(v1(x), . . . , vN(x)) dμ(x).

(c) Figure out a suitable choice of N, D, u, X and μ, to prove that the geometric
mean of m positive numbers rj never exceeds the arithmetic mean:

Πj rj^{λj} ≤ Σj λj rj,    λj > 0,    Σj λj = 1.

13. Explore conditions on the integrand

F(u, v) : R2 → R

to guarantee that the functional

I(u) = ∫_J ∫_J F(u(x), u(y)) dx dy

is convex. J is an interval in R.
14. (a) Consider a functional of the form

I(u) = ∫_0^1 Φ(∫_0^1 W(x, y, u(y)) dy) dx

for certain functions

W(x, y, u) : (0, 1)2 × R → R,    Φ(t) : R → R.

Derive a set of sufficient conditions for the functional I to be weak lower
semicontinuous.
(b) The generalization to a functional of the form

I(u) = ∫_0^1 Φ(∫_0^1 W(x, y, u(x), u(y)) dy) dx

is much more difficult. Try to argue why.


15. Let functions

fi(x, u) : (0, 1) × R → R,    1 ≤ i ≤ n,    f(x, u) = (fi(x, u))i,

be given, and F : Rn → R. Consider the functional

I(u) = F(∫_J f(x, u(x)) dx).

Explore both the direct and indirect methods for this family of functionals.
16. Let {ui} be an orthonormal basis of L2(Ω) for an open set Ω ⊂ RN, and consider

sN : L2(Ω) → R,    sN(f)² = ∫_{Ω×Ω} f(x) (Σ_{i=1}^N ui(x)ui(y)) f(y) dx dy.

(a) Check that each sN is a semi-norm in L2(Ω).
(b) Show that the usual topology of L2(Ω) (the one associated with its
usual inner product) is the coarsest one that makes the collection {sN}N
continuous.
17. For the functional

E : H01(0, 1) → R,    E(u) = ∫_0^1 [ψ(u'(x)) + φ(u(x))] dx,

where ψ and φ are C1 real functions such that

|ψ(u)|, |φ(u)| ≤ C(1 + u²)

for a positive constant C, calculate, as explicitly as possible, its derivative E'(u)
for an arbitrary u ∈ H01(0, 1).
Chapter 4
The Calculus of Variations for
One-dimensional Problems

4.1 Overview

The goal of this chapter is to study one-dimensional variational problems where one
tries to minimize an integral functional of the form

I(u) = ∫_J F(x, u(x), u'(x)) dx    (4.1)

among a certain class of functions (paths) A. Here J is a finite interval in R, and
the integrand

F(x, u, v) : J × Rn × Rn → R

is a function whose properties are the main object of our concern. Fundamental
ingredients to be examined are function spaces where competing paths

u(x) : J → Rn
.

are to be taken from. Additional constraints, typically in the form of prescribed end-
point conditions, are also to be taken into account. All of these various requirements
will be hauled into the collection of competing paths .A.
Our goal is to understand these two principal issues for functionals of this integral
type:
1. isolate properties on the integrand F to ensure the existence of at least one
minimizer in a natural function space, under typical sets of constraints; and
2. determine additional properties, the so-called optimality conditions, that such
minimizers should comply with precisely because they are optimal solutions of
a given variational problem.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 121
P. Pedregal, Functional Analysis, Sobolev Spaces, and Calculus of Variations,
La Matematica per il 3+2 157, https://doi.org/10.1007/978-3-031-49246-4_4

As a general rule, existence theorems for minimizers of functionals can be
established in Banach spaces; however, optimality conditions involving derivatives
require, at least in a first step, an inner product, and so we will look at them in a
Hilbert space scenario. Optimality conditions in Banach spaces demand additional
training that we will overlook here.
Though variants can, and should, be accepted and studied, our basic model
problem will be

Minimize in u ∈ A :    I(u)

where I(u) is given in (4.1), and A is a subset of a suitable Sobolev space, usually
determined through fixed, preassigned values at the end-points of J:

u(x0) = u0,    u(x1) = u1,    J = (x0, x1) ⊂ R,    u0, u1 ∈ Rn.

There are many more topics one can look at even for one-dimensional problems:
important issues hardly end with this introductory chapter. As usual, we will indicate
some of these in the final Appendix of the book.
Since convexity will play a central role in this book henceforth, we will remind
readers of the main inequality involving convex functions: Jensen’s inequality. It
has already been examined in Exercise 12 of Chap. 3. Convexity for functionals has
already been treated in the same chapter. We will reserve a more practical look at
convexity until Chap. 8 where functions and integrands depend on several variables.
Though this can be the case for vector, uni-dimensional problems too, we defer a
more systematic discussion until then. Most of the interesting examples examined
in this chapter correspond to uni-dimensional, scalar problems and so convexity is
mainly required for real functions of one variable.

4.2 Convexity

Though often convexity is defined for functions taking on the value +∞, and this is
quite efficient, we will only consider real functions defined on subsets.
Definition 4.1 A function

φ : D ⊂ RN → R

is convex if D is a convex set, and

φ(t1 x1 + t2 x2) ≤ t1 φ(x1) + t2 φ(x2),    x1, x2 ∈ D,  t1, t2 ≥ 0,  t1 + t2 = 1.    (4.2)

If there is no mention of the set D, one considers it to be the whole space RN or
the natural domain of definition of φ (which must be a convex set). Moreover, such
a function φ is said to be strictly convex if the equality in (4.2),

φ(t1 x1 + t2 x2) = t1 φ(x1) + t2 φ(x2),    x1, x2 ∈ D,  t1, t2 ≥ 0,  t1 + t2 = 1,

can only occur when either x1 = x2 or t1 t2 = 0.


Though we will be more explicit in Chap. 8 when dealing with high-dimensional
variational problems, we anticipate two basic facts which are easily checked:
1. Every linear (affine) function is always convex, though not strictly convex.
2. The supremum of convex functions is also convex.
One main consequence is that the supremum of a family of linear functions is always
convex. In fact, every convex function arises in this way, as Proposition 4.1 below
shows. To dig deeper into these issues, we introduce the following concept.
Definition 4.2 For a function φ(x) : RN → R, we define

CL φ = sup{ψ : ψ ≤ φ, ψ linear},
CC φ = sup{ψ : ψ ≤ φ, ψ convex}.

It is obvious that

CL φ ≤ CC φ ≤ φ,

and that if φ is convex, then CC φ = φ. The following important fact is an
unexpected consequence of Theorems 3.3 or 3.4. In the finite-dimensional case,
both are equivalent.
Proposition 4.1 A function φ(x) : RN → R is convex if and only if

φ ≡ CL φ ≡ CC φ.

Proof Suppose φ is a convex function. Its epigraph

E = {(x, t) ∈ RN+1 : t ≥ φ(x)}

is, then, a convex set. This is straightforward to check. If we assume that there is a
point y where CL φ(y) < φ(y), then the two convex sets

E,    {(y, CL φ(y))}

are disjoint with the first, closed, and the second, compact. Theorem 3.4 guarantees
that there is a closed hyperplane in RN+1 separating them. Hyperplanes in RN+1
are of the form

u · x + u xN+1 = c,    u, x ∈ RN,  u, c ∈ R;

and hence there is such u ∈ RN, u ∈ R, with

u · x + u t > c  when t ≥ φ(x),
u · y + u CL φ(y) < c.    (4.3)

We claim that u cannot be zero, for if it were, we would have

u · y < c < u · x

with no restriction on x ∈ RN. This is clearly impossible. Suppose that, without loss
of generality, u is positive, and define

Lx = −(1/u) u · x + (1/u) c.

Then (4.3) for t = φ(x) would lead to Lx < φ(x) for every x, and yet Ly > CL φ(y),
which is a contradiction with the definition of CL φ. ⨆
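Proposition 4.1 suggests a way of computing convex envelopes numerically: approximate CL φ by the supremum of a large family of affine minorants. The sketch below (our illustration, not from the text; the double-well function and the grids are arbitrary choices) does this for φ(x) = (x² − 1)², whose convexification vanishes on [−1, 1].

```python
import numpy as np

# Discrete sup of affine minorants for the double-well phi(x) = (x^2 - 1)^2.
x = np.linspace(-2.0, 2.0, 401)
phi = (x**2 - 1.0)**2

slopes = np.linspace(-20.0, 20.0, 801)
# for each slope s, the largest intercept b with s*x + b <= phi on the grid
b = np.min(phi[None, :] - slopes[:, None] * x[None, :], axis=1)
env = np.max(slopes[:, None] * x[None, :] + b[:, None], axis=0)

assert np.all(env <= phi + 1e-9)             # a minorant of phi
assert np.all(np.diff(env, 2) >= -1e-8)      # convex: nonnegative 2nd differences
assert abs(env[200]) < 1e-6                  # the envelope vanishes at x = 0
```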

The most fundamental fact involving convexity is Jensen’s inequality. It is almost
an immediate consequence of Proposition 4.1.
Theorem 4.1 If φ : RN → R is a convex function, and μ is a probability measure
supported in RN, then

φ(∫_{RN} x dμ(x)) ≤ ∫_{RN} φ(x) dμ(x).

If φ is strictly convex, and

∫_{RN} φ(x) dμ(x) = φ(∫_{RN} x dμ(x)),

then μ is a Dirac mass δz supported at

z = ∫_{RN} x dμ(x).

Proof The fact that μ is a positive measure with total mass 1 implies that if L is a
linear function such that L ≤ φ, then, because integration is a linear operation,

L(∫_{RN} x dμ(x)) = ∫_{RN} L(x) dμ(x) ≤ ∫_{RN} φ(x) dμ(x).

If we take the supremum on the left-hand side among all linear functions with L ≤
φ, by Proposition 4.1, we have our inequality.
If, in fact, the last inequality turns out to be an equality, then the support of μ
must be contained where φ equals CL φ; but the strict convexity of φ implies that
the support of μ must be a singleton. ⨆

It is not difficult to generalize these concepts and results in a suitable way to
functions that can take on the value +∞, by restricting all operations to the domain
of φ.
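For a discrete probability measure μ = Σ wk δ_{xk}, Jensen's inequality is a finite sum and can be verified directly. A short numerical sketch (our addition, not part of the text; the atoms and weights are random) with the convex φ(x) = |x|²:

```python
import numpy as np

# Jensen's inequality for a discrete measure and phi(x) = |x|^2 on R^3.
rng = np.random.default_rng(1)
xs = rng.standard_normal((10, 3))        # atoms x_k
w = rng.random(10)
w /= w.sum()                             # probability weights, summing to 1

def phi(x):
    return float(x @ x)                  # strictly convex

mean = w @ xs                            # the barycenter: integral of x dmu
lhs = phi(mean)
rhs = sum(wk * phi(xk) for wk, xk in zip(w, xs))
assert lhs <= rhs + 1e-12
```

By the strict-convexity part of Theorem 4.1, equality here would force all atoms to coincide.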

4.3 Weak Lower Semicontinuity for Integral Functionals

So far, we have been exposed to abstract, general functionals defined over a Banach
or Hilbert space, and have anticipated the central role played by the convexity
property of the functional under consideration. When one restricts attention to
special classes of functionals, finer results can typically be shown. We start now
the study of integral functionals of the type

I(u) = ∫_J F(x, u(x), u'(x)) dx    (4.4)

defined over suitable subsets of Sobolev spaces of competing functions or paths .u.
Our model problem will insist on end-point conditions of the form

u(x0) = u0,    u(x1) = u1

if J = (x0, x1) is a finite interval in R, and u0, u1 ∈ Rn, though other restrictions
can be adapted to variational methods. The properties of the integrand

F(x, u, U) : J × Rn × Rn → R    (4.5)

will be crucial to our results. In particular, we are mostly interested in understanding
for which such densities the direct method, Proposition 3.1, can be implemented to
produce existence results for minimizers for I. One way or another, the convexity of
I or of F will have to play a central role.
To be more specific, and for the sake of definiteness, once J is given, take a finite
p > 1, and let

E = W^{1,p}(J; Rn)

be our basic, underlying Banach space where we will consider our variational
problems. For p ≠ 2, it is a Banach space, while for p = 2, E becomes
H = H^1(J; Rn), a Hilbert space. Our model problem will be

Minimize in u ∈ E = W^{1,p}(J; Rn) :    I(u) = ∫_J F(x, u(x), u'(x)) dx

subject to

u(x0) = u0,    u(x1) = u1.

Strictly speaking, the class

A = {u ∈ W^{1,p}(J; Rn) : u(x0) = u0, u(x1) = u1}

is not a vector space. However, if we set

L(x) : J → Rn,    L(x) = ((x − x0)/(x1 − x0)) u1 + ((x1 − x)/(x1 − x0)) u0,

then

A = L + W0^{1,p}(J; Rn),    W0^{1,p}(J; Rn) = {u ∈ W^{1,p}(J; Rn) : u(x0) = u(x1) = 0},

and W0^{1,p}(J; Rn) is a vector subspace of E. Competing paths for our model problem
can then be written in the form L + u, with u ∈ W0^{1,p}(J; Rn).
We will not insist on this point any longer and, without further notice, we will work
with the class A as if it were a Banach subspace of E.
The direct method Proposition 3.1 will guide us. In fact, we could already state a
first result through Proposition 4.1. But since we are after a more general existence
result which is specific for our integral functionals and, hence, it can guide us in
other situations, we will follow a different strategy.
Let us focus on functional (4.4) for an integrand as in (4.5). F(x, u, U) is
assumed to be continuous in the pairs (u, U), but only measurability in the spatial
variable x is assumed. It is clear that if

F(x, ·, ·) : Rn × Rn → R

is convex for each x ∈ J, then the corresponding functional I in (4.4) is also
convex. In this way, and under additional suitable assumptions, we could rely on
Proposition 4.1, as already indicated, for weak lower semicontinuity, and eventually
have an existence result through Corollary 3.7. However, our emphasis is placed on
the fact that only convexity of F(x, u, U) with respect to the variable U is necessary
for our functional I to be weakly lower semicontinuous in Sobolev spaces. Our main
result on weak lower semicontinuity follows.
Theorem 4.2 Let

F(x, u, U) : J × Rn × Rn → R

be continuous in (u, U) for a.e. x ∈ J, and measurable in x for every pair (u, U).
The associated functional I in (4.4) is weak lower semicontinuous in W^{1,p}(J; Rn)
if and only if F(x, u, ·) is convex for a.e. x ∈ J and every u ∈ Rn.
The proof of this main result is our goal in the rest of this section.
The most interesting part for us is the sufficiency. The necessity part is, however,
relevant too because it informs us that there can be no surprises concerning
convexity: weak lower semicontinuous functionals cannot escape convexity. This is
in deep contrast with the vector case in which one considers competing fields

u ∈ W 1,p (Ω, Rn ),
. Ω ⊂ RN ,

and both dimensions n and N are at least two. This requires, as a preliminary step,
a serious concern about high-dimensional Sobolev spaces. We will treat these in
the third part of the book. For vector problems, the necessity part of Theorem 4.2
dramatically fails, and thus opens the door to more general functionals than those
having a convex dependence on .U.
We begin by arguing the necessity part. Suppose the functional I in (4.4) is weak
lower semicontinuous, and let

a ∈ J,    u0, U0, U1 ∈ Rn.    (4.6)

Take δ > 0 sufficiently small so that

Ja,δ = [a − δ/2, a + δ/2] ⊂ J = (x0, x1).

For each j, divide Ja,δ into 2j equal subintervals of length δ/(2j), and let χj(x) be
the characteristic function of the odd subintervals, so that 1 − χj(x), restricted to
Ja,δ, is the characteristic function of the even subintervals of such family. Define

uj(x) = u0,                                                               x ∈ (x0, a − δ/2),
        u0 + ∫_{a−δ/2}^x χj(s) ds U1 + ∫_{a−δ/2}^x (1 − χj(s)) ds U0,     x ∈ Ja,δ,
        u0 + (δ/2)(U1 + U0),                                              x ∈ (a + δ/2, x1).

We claim that each uj ∈ W^{1,p}(J; Rn). In fact, it is easy to check that it is continuous,
as it is continuous in each subinterval

(x0, a − δ/2),    Ja,δ,    (a + δ/2, x1),

separately, and it matches across common end-points. On the other hand, the
derivative u'j(x) turns out to be, except in a finite number of points,

u'j(x) = 0,                                  x ∈ (x0, a − δ/2),
         χj(x) U1 + (1 − χj(x)) U0,          x ∈ Ja,δ,
         0,                                  x ∈ (a + δ/2, x1),

and this is definitely a path in Lp(J; Rn). In addition, if we recall Example 2.12, we
conclude that χj ⥛ 1/2 in Ja,δ, and u'j ⥛ U where

U(x) = 0,                        x ∈ (x0, a − δ/2),
       (1/2) U1 + (1/2) U0,      x ∈ Ja,δ,
       0,                        x ∈ (a + δ/2, x1).

If, finally, we define

u(x) = u0 + ∫_{x0}^x U(s) ds,

then, by Proposition 2.8, we can conclude that

uj ⥛ u in W^{1,p}(J; Rn).

If our functional I in (4.4) is truly weak lower semicontinuous, we ought to have

I(u) ≤ lim inf_{j→∞} I(uj).    (4.7)

Let us examine the two sides of this inequality. For the left-hand side, we find

I(u) = ∫_{x0}^{a−δ/2} F(x, u0, 0) dx
     + ∫_{a−δ/2}^{a+δ/2} F(x, u0 + (x/2)(U1 + U0), (1/2)(U1 + U0)) dx
     + ∫_{a+δ/2}^{x1} F(x, u0 + (δ/2)(U1 + U0), 0) dx.

Similarly,

I(uj) = ∫_{x0}^{a−δ/2} F(x, u0, 0) dx
      + ∫_{a−δ/2}^{a+δ/2} F(x, uj(x), χj(x)U1 + (1 − χj(x))U0) dx
      + ∫_{a+δ/2}^{x1} F(x, u0 + (δ/2)(U1 + U0), 0) dx.

Concerning the central contribution, it is reasonable to expect (proposed as an
exercise) that the difference

∫_{a−δ/2}^{a+δ/2} F(x, uj(x), χj(x)U1 + (1 − χj(x))U0) dx
  − ∫_{a−δ/2}^{a+δ/2} F(x, u0 + (x/2)(U1 + U0), χj(x)U1 + (1 − χj(x))U0) dx

tends to zero as j → ∞, because the convergence uj → u takes place in
L∞(J; Rn), and F is continuous with respect to the u-variable. Since the first
and third terms of the above decompositions are identical for both sides in
inequality (4.7), such inequality and the preceding remarks lead to

∫_{a−δ/2}^{a+δ/2} F(x, u0 + (x/2)(U1 + U0), (1/2)(U1 + U0)) dx
  ≤ lim inf_{j→∞} ∫_{a−δ/2}^{a+δ/2} F(x, u0 + (x/2)(U1 + U0), χj(x)U1 + (1 − χj(x))U0) dx.

But the integral in the limit on the right-hand side of this inequality can be broken
into two parts,

∫_{Ja,δ ∩ {χj =1}} F(x, u0 + (x/2)(U1 + U0), U1) dx
  + ∫_{Ja,δ ∩ {χj =0}} F(x, u0 + (x/2)(U1 + U0), U0) dx.

For δ sufficiently small, these two integrals together essentially amount to δ times

(1/2) F(a, u0, U1) + (1/2) F(a, u0, U0),

and the weak lower semicontinuity inequality becomes

F(a, u0, (1/2)(U1 + U0)) ≤ (1/2) F(a, u0, U1) + (1/2) F(a, u0, U0).
The arbitrariness in (4.6) leads to the claimed convexity. There are, however, a few
technical steps to be covered for a full proof that are left as exercises.
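The necessity construction can be watched numerically. With U1 = 1, U0 = −1 on the whole of J = (0, 1) and the non-convex integrand F(v) = (v² − 1)², the sawtooth paths with slopes ±1 give I(uj) = 0, while the weak limit u ≡ 0 has I(u) = F(0) = 1, so weak lower semicontinuity fails without convexity. A sketch (our illustration, not from the text, using a midpoint-rule quadrature):

```python
import numpy as np

# Oscillating slopes +-1 versus their weak limit 0 for F(v) = (v^2 - 1)^2.
def I(du, h):
    # midpoint-rule approximation of the integral of F(u') over (0, 1)
    return float(np.sum((du**2 - 1.0)**2) * h)

n = 1 << 12
h = 1.0 / n
x = (np.arange(n) + 0.5) * h             # midpoints of the quadrature cells
for j in (4, 16, 64):
    du = np.where(np.floor(j * x) % 2 == 0, 1.0, -1.0)   # sawtooth derivative
    assert I(du, h) < 1e-12              # I(u_j) = 0 along the whole sequence
    assert abs(np.sum(du) * h) < 1e-12   # mean of u_j' is 0: the weak limit is u = 0

assert I(np.zeros(n), h) == 1.0          # yet I(u) = F(0) = 1 for the limit u = 0
```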
For the sufficiency part, our main tool is the classical Jensen’s inequality,
Theorem 4.1. It was also treated in Exercise 12 of Chap. 3. Turning back to the
sufficiency part of Theorem 4.2, suppose

uj ⥛ u in W^{1,p}(J; Rn).    (4.8)

For a positive integer l, write

∫_J F(x, u(x), u'(x)) dx = Σ_{i=1}^l ∫_{Ji} F(x, u(x), u'(x)) dx

where the full interval of integration J is the disjoint union of the Ji’s and the
measure of each Ji is 1/l. Then

∫_J F(x, u(x), u'(x)) dx ∼ Σ_{i=1}^l (1/l) F(l ∫_{Ji} x dx, l ∫_{Ji} u(x) dx, l ∫_{Ji} u'(x) dx).

Because of the weak convergence (4.8), and the continuity of F,

lim_{j→∞} F(l ∫_{Ji} x dx, l ∫_{Ji} uj(x) dx, l ∫_{Ji} u'j(x) dx)
  = F(l ∫_{Ji} x dx, l ∫_{Ji} u(x) dx, l ∫_{Ji} u'(x) dx),

for each i. But for the probability measure μi,j determined through the formula

<φ, μi,j> = l ∫_{Ji} φ(u'j(x)) dx,

Jensen’s inequality applied to

φ = F(l ∫_{Ji} x dx, l ∫_{Ji} uj(x) dx, ·)

implies

F(l ∫_{Ji} x dx, l ∫_{Ji} uj(x) dx, l ∫_{Ji} u'j(x) dx)
  ≤ l ∫_{Ji} F(l ∫_{Ji} x dx, l ∫_{Ji} uj(x) dx, u'j(x)) dx.

The right-hand side, when l is large, is essentially the integral

l ∫_{Ji} F(x, uj(x), u'j(x)) dx.

If we put all of our ingredients together, we arrive at the claimed weak lower
semicontinuity result

∫_J F(x, u(x), u'(x)) dx ≤ lim_{j→∞} ∫_J F(x, uj(x), u'j(x)) dx,

as the limit, for l large, of

Σ_{i=1}^l (1/l) F(l ∫_{Ji} x dx, l ∫_{Ji} u(x) dx, l ∫_{Ji} u'(x) dx)
  = lim_{j→∞} Σ_{i=1}^l (1/l) F(l ∫_{Ji} x dx, l ∫_{Ji} uj(x) dx, l ∫_{Ji} u'j(x) dx)
  ≤ lim_{j→∞} Σ_{i=1}^l ∫_{Ji} F(l ∫_{Ji} x dx, l ∫_{Ji} uj(x) dx, u'j(x)) dx
  ∼ lim_{j→∞} Σ_{i=1}^l ∫_{Ji} F(x, uj(x), u'j(x)) dx
  = lim_{j→∞} ∫_J F(x, uj(x), u'j(x)) dx.

4.4 An Existence Result

Based on our main weak lower semicontinuity result, Theorem 4.2, we can now
prove a principal result for existence of optimal solutions for variational problems
with integral functionals of the kind

Minimize in u ∈ A :    I(u) = ∫_J F(x, u(x), u'(x)) dx    (4.9)

where A ⊂ W^{1,p}(J; Rn) is a non-empty subset encoding constraints to be
respected. The exponent p is assumed to be strictly greater than unity, and finite.
Theorem 4.3 Let

F(x, u, U) : J × Rn × Rn → R

be continuous in (u, U) for a.e. x ∈ J, and measurable in x for every pair (u, U).
Suppose that the following three main conditions hold:
1. coercivity and growth: there are two positive constants C− ≤ C+, and an
exponent p > 1, with

C−(|U|^p − 1) ≤ F(x, u, U) ≤ C+(|U|^p + 1)

for every (x, u, U) ∈ J × Rn × Rn;
2. convexity: for a.e. x ∈ J, and every u ∈ Rn, we have that

F(x, u, ·) : Rn → R

is convex;
3. the set A is weakly closed, and for some x0 ∈ J, the set of values

{u(x0) : u ∈ A}

is bounded.
Then there are optimal solutions for the corresponding variational principle (4.9).
Proof At this stage the proof does not show any surprise, and follows the guide of
the direct method, Proposition 3.1. If m is the value of the associated infimum, the
coercivity and growth condition implies that m ∈ R. Let {uj} be a minimizing
sequence. Through the coercivity condition, we can ensure that, possibly for a
suitable subsequence, the sequence of derivatives {u'j} is bounded in Lp(J; Rn).
Indeed, for such a minimizing sequence we would have, thanks to the coercivity
condition,

||u'j||^p_{Lp(J;Rn)} ≤ 1 + (1/C−) I(uj).

Since I(uj) ↘ m ∈ R, this last inequality implies that the sequence of derivatives
{u'j} is uniformly bounded in Lp(J; Rn). If we write

uj(x) = uj(x0) + ∫_{x0}^x u'j(s) ds,

the condition assumed on the point x0 ∈ J and all feasible paths in A, together with
the previous uniform bound on derivatives, imply that {uj} is uniformly bounded in
W^{1,p}(J; Rn), and hence uj ⥛ u for some u, which belongs to A, if this subset is
weakly closed. Our weak lower semicontinuity result, Theorem 4.2, guarantees that

m ≤ I(u) ≤ lim inf_{j→∞} I(uj) ≤ m,

and u becomes a true minimizer for our problem. ⨆



There are several remarks worth considering.
1. We know that the convexity condition on the variable U is unavoidable for
weak lower semicontinuity; however, both the coercivity condition and the weak
closedness of the feasible set A may come in various forms. In particular, only the
coercivity inequality
C (|U|^p − 1) ≤ F (x, u, U) (4.10)

is necessary for an existence result. However, since in such a situation I may take
on the value +∞ somewhere in the space W^{1,p}(J; Rn), one needs to make sure
that the feasible set A, where I is supposed to take on finite values, is non-empty,
i.e. I is not identically +∞, and then

m = inf_{u∈A} I(u),

is a real number. In this case, we can talk about minimizing sequences {uj} with
I(uj) ↘ m ∈ R, and proceed with the same proof of Theorem 4.3.

2. The coercivity condition in Theorem 4.3 is too rigid in practice. It can be replaced
in the statement of that result by the more flexible one that follows, and the
conclusion is valid in the same way: there is C > 0 and an exponent p > 1
such that
C (||u'||_{Lp(J;Rn)} − 1) ≤ I(u). (4.11)

3. The uniqueness of minimizers can only be achieved under much more restrictive
conditions, as it is not sufficient to strengthen the convexity of the integrand
F(x, u, U) with respect to U to strict convexity. This was already indicated in

the comments around Proposition 3.3. In fact, the following statement is a direct
consequence of that result.
134 4 The Calculus of Variations for One-dimensional Problems

Proposition 4.2 In addition to all assumptions in Theorem 4.3, suppose that the
integrand .F (x, u, U) is jointly strictly convex with respect to pairs .(u, U). Then
there is a unique minimizer for problem (4.9).
However, just as we have noticed above concerning coercivity, sometimes the
direct application of Proposition 3.3 may be more flexible than Proposition 4.2.

4.5 Some Examples

Convexity and coercivity are the two main ingredients that guarantee the existence
of optimal solutions for standard variational problems according to Theorem 4.3.
We will proceed to cover some examples without further comment on these two
fundamental properties of functions. Later, when dealing with higher-dimensional
problems that are often more involved, we will try to be more systematic.
We examine next several examples with care. In many cases, the convexity
requirement is easier to check than the coercivity.
Example 4.1 An elastic string of unit length is supported on its two end-points
at the same height .y = 0. Its equilibrium profile under the action of a vertical
load of density .g(x) for .x ∈ [0, 1] is the one minimizing internal energy which is
approximated, given that the values of .y ' are assumed to be reasonably small, by the
functional
E(y) = ∫₀¹ [ (1/2) y'(x)² + g(x)y(x) ] dx.

We assume .g ∈ L2 (0, 1). For the end-point conditions

y(0) = y(1) = 0,
.

existence of minimizers can be achieved through our main existence theorem,


though the coercivity hypothesis requires some work. In fact, the initial form of
coercivity in Theorem 4.3 can hardly be used directly. However, it is possible to
establish (4.11) for .p = 2. As a matter of fact, the ideas that follow can be used in
many situations to properly adjust the coercivity condition. From
y(x) = ∫₀^x y'(s) ds

it is easy to find, through Hölder’s inequality, that


y(x)² ≤ x ∫₀¹ y'(s)² ds,

and
∫₀¹ y(x)² dx ≤ ∫₀¹ y'(x)² dx.

It is even true that


∫₀¹ y(x)² dx ≤ (1/2) ∫₀¹ y'(x)² dx,

but this better estimate does not yield any real improvement on the coercivity
condition we are seeking. Then for the second term in E(y), we can write, for
arbitrary ε > 0,
∫₀¹ |g(x)| |y(x)| dx ≤ (1/(2ε²)) ∫₀¹ g(x)² dx + (ε²/2) ∫₀¹ y(x)² dx
≤ (1/(2ε²)) ∫₀¹ g(x)² dx + (ε²/2) ∫₀¹ y'(x)² dx.

We are using the inequality


ab = (εa)(b/ε) ≤ (ε²/2) a² + (1/(2ε²)) b², (4.12)

valid for arbitrary positive numbers ε, a, b. If we take this estimate to the functional
E(y), we see that

E(y) ≥ (1/2) ∫₀¹ y'(x)² dx − ∫₀¹ |g(x)| |y(x)| dx
≥ ((1 − ε²)/2) ∫₀¹ y'(x)² dx − (1/(2ε²)) ∫₀¹ g(x)² dx.

Since ε > 0 is arbitrary, we can take, say, 2ε² = 1, and then


E(y) ≥ (1/4) ∫₀¹ y'(x)² dx − ∫₀¹ g(x)² dx.

This inequality implies the necessary coercivity. Though the integrand

F(x, y, Y) = (1/2) Y² + g(x)y
is not jointly strictly convex with respect to .(y, Y ), and hence we cannot apply
directly Proposition 4.2, it is not difficult to argue, through Proposition 3.3, that

there is a unique minimizer because the functional, not the integrand,

E : H01 (0, 1) → R
.

is strictly convex.
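To see the direct method at work numerically, one can minimize a finite-difference discretization of E by successive relaxation. The sketch below is illustrative only (the grid size, the number of sweeps, and the choice g ≡ 1 are arbitrary); with g ≡ 1 the minimizer of E solves −y'' + g = 0, that is, y(x) = (x² − x)/2, which the discrete minimizer reproduces at the nodes.

```python
# Illustrative sketch: relaxation for the discretized string energy
#   E_h(y) = sum_i [ (y[i+1]-y[i])^2/(2h) + h*g*y[i] ],  y[0] = y[n] = 0.
# Each sweep performs an exact minimization in each coordinate y[i],
# which amounts to Gauss-Seidel for the discrete equation y'' = g.
def minimize_string_energy(n=20, g=1.0, sweeps=5000):
    h = 1.0 / n
    y = [0.0] * (n + 1)                      # end-point conditions built in
    for _ in range(sweeps):
        for i in range(1, n):
            y[i] = (y[i - 1] + y[i + 1]) / 2 - h * h * g / 2
    return y

y = minimize_string_energy()
print(y[10])   # value at x = 0.5; close to (0.25 - 0.5)/2 = -0.125
```

Since each inner update exactly minimizes the discrete energy in the coordinate y[i], the energy decreases monotonically along the sweeps.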
Example 4.2 A much-studied scalar uni-dimensional problem is of the form
I(u) = ∫₀¹ [ a u'(x)² + g(u(x)) ] dx, a > 0, (4.13)

under typical end-point conditions, with a real function g which is assumed to be just
continuous. For simplicity, we also take vanishing end-point conditions. Once again
coercivity becomes the main point to be clarified for the application of Theorem 4.3.
Uniqueness is, however, compromised unless convexity conditions are imposed on
the function g. As in the preceding example, we see that
|u(x)| ≤ √x (∫₀¹ u'(s)² ds)^{1/2} ≤ ||u'||_{L²(0,1)}.

The second term in .I (u) is bounded by

max_{U∈Ju} |g(U)|, Ju = [−||u'||_{L²(0,1)}, ||u'||_{L²(0,1)}].

Suppose that

|g(U)| ≤ b|U| + C, b < a, C ∈ R, U ∈ Ju.

Then the previous maximum can be bounded by

b ||u'||_{L²(0,1)} + C,

and then

I(u) ≥ (a − b) ||u'||_{L²(0,1)} − C.

Again, this inequality fits (4.11) better than the coercivity condition in Theorem 4.3.
This last theorem ensures that I in (4.13) admits global minimizers.
Example 4.3 In the context of Sect. 1.3.6, the integrand

L(x, x') = (1/2) m|x'|² − (1/2) k|x|², m, k > 0,

is the Lagrangian for a simple harmonic oscillator. The corresponding integral is


the action integral. In these problems in Mechanics only the initial conditions are
enforced so that feasible paths are required to comply with

x(0) = x0 ,
. x' (0) = x'0 ,

for given vectors .x0 , x'0 ∈ RN . We consider then the variational problem
Minimize in x ∈ A : A(x) = ∫₀^T L(x(t), x'(t)) dt (4.14)

where this time


A = { x ∈ H¹(0, T; R^N) : x(0) = x0, (1/ε) ∫₀^ε |x'(t) − x'0|² dt → 0 as ε → 0 }.

Without loss of generality, for the sake of simplicity, we can take

.x0 = x'0 = 0,

and in this case .A becomes a vector space.


We would like to show that the value of the minimum m in (4.14) vanishes, and
that the only minimizer is the trivial path .x ≡ 0. This requires some computations
to assess the relative size of the two terms in the Lagrangian. Using ideas as in the
previous examples, we can write
|x(t)|² ≤ t ∫₀^t |x'(s)|² ds.
0

From this inequality, and by using Fubini’s theorem,


∫₀^T |x(t)|² dt ≤ ∫₀^T t ∫₀^t |x'(s)|² ds dt
= ∫₀^T ( ∫_s^T t dt ) |x'(s)|² ds
= ∫₀^T ((T² − s²)/2) |x'(s)|² ds.

If, relying on this estimate, we compare the two terms in the action integral .A(x),
we would find
∫₀^T ( (m/k) |x'(t)|² − |x(t)|² ) dt ≥ ∫₀^T ( m/k − (T² − t²)/2 ) |x'(t)|² dt
≥ ( m/k − T²/2 ) ∫₀^T |x'(t)|² dt.

Since the action integral has to be minimized almost instantaneously as the


movement progresses, we can assume that the time horizon T is small, namely,
if
T < √(2m/k), (4.15)

then .A(x) is non-negative on .A, and it can only vanish if .x' ≡ 0, i.e. .x ≡ 0 in .[0, T ].
For larger values of T, one can proceed in successive subintervals of length smaller
than the right-hand side in (4.15). This shows that once the oscillator reaches the
rest state, it needs an external agent to set it in motion again.
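The sign discussion above can be tested numerically. In the sketch below the values m = k = 1 and the trial path x(t) = 1 − cos(2πt/T), which satisfies x(0) = 0 and x'(0) = 0, are arbitrary choices; the action is positive for T below the bound √(2m/k) and may turn negative for large T.

```python
import math

# Sketch (m = k = 1 and the test path are arbitrary choices): evaluate the
# action of x(t) = 1 - cos(2*pi*t/T) by the midpoint rule.  The path
# satisfies x(0) = 0 and x'(0) = 0, so it is feasible.
def action(T, m=1.0, k=1.0, n=10000):
    w = 2 * math.pi / T
    h = T / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * h
        xp = w * math.sin(w * t)          # x'(t)
        x = 1.0 - math.cos(w * t)         # x(t)
        total += (0.5 * m * xp * xp - 0.5 * k * x * x) * h
    return total

print(action(1.0))    # positive: T = 1 < sqrt(2m/k) here
print(action(10.0))   # negative: no guarantee for large T
```

This matches the statement above: non-negativity of the action is only guaranteed below the threshold (4.15), not beyond it.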
Example 4.4 The behavior of non-linear, elastic materials (rubber for instance) is
much more complicated to understand and simulate than their linear counterparts.
Suppose that an elastic stick is of length one in its natural rest configuration. When
subjected to an elongation of its right-end point .x = 1 to some other value, say,
.L > 0, maintaining .x = 0 in its position, it will respond seeking its minimum

internal energy under the new end-point conditions. We postulate that each possible
deformation of the bar is described by a function

u(x) : (0, 1) → R,
. u(0) = 0, u(1) = L, u' (x) > 0,

with internal potential energy given by the functional


E(u) = ∫₀¹ [ (1/2) u'(x)² + α/u'(x) + g(x)u(x) ] dx, α > 0,

where the second term aims at assigning an infinite energy to compressing some
part of the bar to zero volume, and at penalizing interpenetration of matter (u' ≤ 0);
and the third one accounts for a bulk load with density .g(x). The optimal solutions
of the variational problem

. Minimize in u(x) ∈ H 1 (0, 1) : E(u)

under

u(0) = 0,
. u(1) = L,

will be taken as the configurations of minimal energy, and they will be the ones that
can be adopted by the bar under the indicated end-point conditions. It is elementary
to check that the integrand

F(x, u, U) = (1/2) U² + α/U + g(x)u
is strictly convex with respect to U in the positive part .U > 0, linear with respect
to u, while the functional .E(u) is coercive in .H 1 (0, 1), just as one of the examples
above. We can therefore conclude that there is exactly one minimizer of the problem.
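The convexity mechanism behind this uniqueness claim can be illustrated with a one-line computation (the values g ≡ 0, α = 0.5, L = 2 and the slope perturbation are arbitrary choices): for the convex density W(s) = s²/2 + α/s on s > 0, an oscillating slope profile with average L always costs more than the homogeneous slope L, by Jensen's inequality.

```python
# Sketch (g = 0, alpha = 0.5, L = 2 chosen arbitrarily): Jensen's inequality
# for the convex energy density W(s) = s^2/2 + alpha/s, s > 0.  Mixing two
# slopes with average L always costs more than the homogeneous slope L.
def W(s, alpha=0.5):
    return 0.5 * s * s + alpha / s

L = 2.0
homogeneous = W(L)                               # deformation u(x) = L*x
oscillating = 0.5 * (W(L - 0.7) + W(L + 0.7))    # same average slope L
print(homogeneous, oscillating)                  # homogeneous is smaller
```

With g ≡ 0 the homogeneous deformation u(x) = Lx is thus the unique minimizer, in line with the strict convexity of F in U on U > 0.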
Unfortunately, many of the geometrical or physical variational problems, some of
which were introduced in the initial chapter, do not fall under the action of our main
existence theorem because the exponent p for the coercivity in that statement is
assumed greater than 1, and quite often functionals with a geometric meaning have
only linear growth on .u' . The trouble with .p = 1 in our main result Theorem 4.3
is intimately related to the difficulty explained in Example 2.9. The coercivity
condition in the statement of that theorem would lead to a uniform bound of a
minimizing sequence in .W 1,1 (J ; RN ), but this is not sufficient to ensure a limit
function which would be a candidate for minimizer. In such cases, an additional
argument of some kind ought to be invoked for the existence of a minimizer; or
else, one can examine the indirect method (ideas in Sect. 3.7).
Example 4.5 Let us deal with the problem for the brachistochrone described in
Sect. 1.3. The problem is
Minimize in u(x) : B(u) = ∫₀^A √(1 + u'(x)²) / √x dx

subject to

u(0) = 0,
. u(A) = a.

The factor 1/√x, even though it blows up at x = 0, is not a problem, as we would
have a coercivity constant of tremendous force. The difficulty lies with the linear
growth on u' of the integrand. For some minimizing sequence {uj}, we need to rule
out the possibility shown in Example 2.9. Unfortunately, there is no general rule for
this since each particular example may require a different idea, if indeed there are
minimizing sequences not concentrating. This would be impossible for a variational
problem for which concentration of derivatives is “essential” to the minimization
process. Quite often, the geometrical or physical meaning of the situation may lead
to a clear argument about the existence of an optimal solution. In this particular
instance, we can interpret the problem as a shortest distance problem with respect

to the measure
dm = dx/√x.

Note that this measure assigns distances in the real line in the form
∫_δ^ε dx/√x = 2(√ε − √δ).

Therefore our problem consists in finding the shortest path, the geodesic with
respect to this new way of measuring distances, between the points (0, 0) and
(A, a). As such, there should be an optimal solution. We will later
calculate it.
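The distance formula for the measure dm can be confirmed by quadrature (the interval end-points below are arbitrary):

```python
import math

# Numerical check (delta and eps values arbitrary) of the distance formula for
# dm = dx/sqrt(x): the measure of [delta, eps] is 2(sqrt(eps) - sqrt(delta)).
def mass(delta, eps, n=100000):
    h = (eps - delta) / n
    return sum(h / math.sqrt(delta + (i + 0.5) * h) for i in range(n))

print(mass(0.1, 0.9))                           # midpoint-rule value
print(2 * (math.sqrt(0.9) - math.sqrt(0.1)))    # exact value
```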
Example 4.6 Suppose

F(x) : R3 → R3
.

is a smooth vector field in three-dimensional space, representing a certain force


field. The work done by it when a unit-mass particle goes from a point .x(0) = x0 to
.x(1) = x1 is measured by the path integral

W(x) = ∫₀¹ F(x(t)) · x'(t) dt.

We well know that if .F is conservative so that there is an underlying potential

φ(x) : R3 → R,
. F = ∇φ,

then such work is exactly the potential difference


∫₀¹ F(x(t)) · x'(t) dt = ∫₀¹ (d/dt)[φ(x(t))] dt = φ(x1) − φ(x0),

and so the work functional W(x) is independent of the path x. In the jargon of
variational problems, we would say that W is a null-Lagrangian. But if F is non-
conservative, then .W (x) will depend on .x, and hence one may wonder, in principle,
about the cheapest path in terms of work done through it. We would like, hence, to
consider the variational problem
Minimize in x(t) : W(x) = ∫₀¹ F(x(t)) · x'(t) dt

under the given end-point conditions. The integrand corresponding to the problem
is

F (x, X) = F(x) · X
.

which is linear in the derivative variable .X. The natural functional space for the
problem is then .W 1,1 ((0, 1); RN ). Paths .x(t) in this space are absolutely continuous
and W(x) is well-defined for them. Can we ensure that there are always paths with
minimal work joining any two points x0, x1? Theorem 4.3 cannot be applied, and
there does not seem to be any valid argument available to ensure the existence
of optimal paths. We will look at the problem below from the point of view of
optimality (indirect method).
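The null-Lagrangian phenomenon is easy to observe numerically. In the sketch below, the fields, paths and end-points are arbitrary choices: the work of a conservative field is the same along two different paths with common end-points, while a non-conservative field gives different values.

```python
# Illustration (the fields, paths and end-points are arbitrary choices):
# the work functional is path-independent for a conservative field
# F = grad(phi), but not for a non-conservative one.
def work(F, path, n=20000):
    total = 0.0
    for i in range(n):
        x0, x1 = path(i / n), path((i + 1) / n)
        mid = [(a + b) / 2 for a, b in zip(x0, x1)]    # midpoint rule
        total += sum(f * (b - a) for f, a, b in zip(F(mid), x0, x1))
    return total

grad_phi = lambda p: (p[1], p[0], 0.0)    # gradient of phi(x, y, z) = x*y
rot = lambda p: (-p[1], p[0], 0.0)        # a non-conservative field

straight = lambda t: (t, t, 0.0)          # path from (0,0,0) to (1,1,0)
bent = lambda t: (t, t ** 3, 0.0)         # another path, same end-points

print(work(grad_phi, straight), work(grad_phi, bent))  # both near phi(1,1,0) = 1
print(work(rot, straight), work(rot, bent))            # different values
```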

4.5.1 Existence Under Constraints

In a good proportion of situations, one is interested in minimizing a certain integral


functional under additional constraints other than fixed end-point conditions. These
can be of a global nature, quite often expressed through an additional integral
functional; or of a local nature by demanding that constraints are respected in a
pointwise fashion. At the level of existence of optimal solutions such constraints
can always be incorporated into the feasible set .A of our main result Theorem 4.3.
We present two examples to cover those two situations.
Example 4.7 In the context of Example 4.1 above, where the profile adopted by
a linear, elastic string subjected to a vertical load is given by the minimizer of the
functional
E(y) = ∫₀¹ [ (1/2) y'(x)² + g(x)y(x) ] dx

under end-point constraints, we may complete the situation with an additional


interesting ingredient. Suppose that there is a rigid obstacle whose profile is
represented by the continuous function

φ(x) : (0, 1) → R
.

in such a way that feasible configurations .y(x) must comply with

y(x) ≥ φ(x),
. x ∈ (0, 1).

In this case, the admissible set for the optimization problem is

.A = {y(x) ∈ H 1 (0, 1) : y(x) ≥ φ(x), y(0) = y0 , y(1) = y1 },



and obviously we need to enforce

. φ(0) < y0 , φ(1) < y1 ,

if we want .A to be non-empty. Theorem 4.3 can be applied. The only new feature,
compared to Example 4.1, is to check if the feasible set of this new situation is
weakly closed. But since weak convergence in .H 1 (0, 1) implies convergence in

L∞(0, 1), an inequality not involving derivatives like

yj (x) ≥ φ(x)
.

is preserved if .yj → y uniformly in .(0, 1). Such problems have been studied quite
a lot in the literature and are referred to as obstacle problems. (See Exercise 29).
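A discrete counterpart of this obstacle problem can be sketched by projecting the relaxation used for the unconstrained string onto the constraint (the grid, the load g ≡ 1 and the obstacle level are arbitrary choices):

```python
# Sketch (grid, load and obstacle level all arbitrary): projected relaxation
# for the discretized obstacle problem.  Each unconstrained coordinate update
# (as for the plain string energy) is clamped at the obstacle, so the
# iterates stay feasible.
def obstacle_profile(n=40, g=1.0, phi=-0.02, sweeps=8000):
    h = 1.0 / n
    y = [0.0] * (n + 1)                     # y(0) = y(1) = 0
    for _ in range(sweeps):
        for i in range(1, n):
            free = (y[i - 1] + y[i + 1]) / 2 - h * h * g / 2
            y[i] = max(free, phi)           # projection onto y >= phi
    return y

y = obstacle_profile()
print(min(y))   # the unconstrained string would dip to about -0.125,
                # so here it rests on the obstacle level -0.02
```

The coincidence set, where y touches the obstacle, is visible as the nodes where the clamp is active.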
Example 4.8 We recover here the problem of the hanging cable introduced in the
first chapter. We are concerned with minimizing the functional
∫₀^H u(x) √(1 + u'(x)²) dx

under the conditions


u(0) = u(H) = 0, L = ∫₀^H √(1 + u'(x)²) dx,

and 0 < H < L. Since both integrands involved in the two functionals share linear
growth in the derivative variable, the natural space in which to set up the problem is
W^{1,1}(0, H). In addition, the global integral constraint must be part of the feasible
set of competing functions


A = { u(x) ∈ W^{1,1}(0, H) : u(0) = u(H) = 0, L = ∫₀^H √(1 + u'(x)²) dx }.

There are two main difficulties for the use of Theorem 4.3 in this situation: one is
the coercivity, where the growth exponent p = 1 of the integrands keeps us from
using it directly; the other is the weak closedness of A, which fails as well. Can
readers provide an explicit sequence {uj} ⊂ A (for H = 1 and L = 2/√2) such
that its weak limit u does not belong to A? (Exercise 14). Even so, the physical
interpretation of the problem clearly supports the existence of a unique optimal
solution which, after all, would be the profile adopted by the cable in practice.

4.6 Optimality Conditions

In abstract terms, optimality conditions require working in a Hilbert space, to be able
to compute derivatives as in Lemma 2.7. This would lead us to restrict attention to
the particular case in which the exponent for coercivity is p = 2 in Theorem 4.3,
so that the corresponding integral functional would admit minimizing sequences
in the Hilbert space H¹(J; R^N). This is another aspect in which studying integral
functionals is much more flexible than doing so for classes of functionals of
unspecified nature. One can actually look at optimality conditions in a general
Sobolev space .W 1,p (J ; RN ) for any value of .p > 1, under appropriate assumptions.
Yet, since, typically, optimality demands more restrictive working conditions and
many more technicalities, we will content ourselves with a general discussion in the
Hilbert space scenario for growth exponent p = 2, and try to expand the scope of
this particular situation to other cases through examples. Under a simple coercivity
condition like (4.11), where one could allow functionals whose finite set, the set of
functions where the functional attains a finite value, is not a vector space, optimality
is especially delicate.
We will therefore content ourselves with examining optimality under the action of
Theorem 4.3 for the particular case p = 2. Namely, the following is a corollary of
that theorem with the only difference in the statement that we restrict attention to the
case .p = 2 so that the underlying functional space is the Hilbert space .H 1 (J ; Rn ).
Corollary 4.1 Let

F (x, u, U) : J × Rn × Rn → R
. (4.16)

be continuous in .(u, U) for a.e. .x ∈ J , and measurable in x for every pair .(u, U).
Suppose that the following three main conditions hold:
1. coercivity and growth: there are positive constants .C− ≤ C+ with
C− (|U|² − 1) ≤ F(x, u, U) ≤ C+ (|U|² + 1),

for .(x, u, U) ∈ J × Rn × Rn ;
2. convexity: for a.e. .x ∈ J , and every .u ∈ Rn , we have that

.F (x, u, ·) : Rn → R

is convex;
3. the set

A ⊂ H 1 (J ; Rn )
.

is weakly closed, and for some .x0 ∈ J , the set of numbers

{u(x0 ) : u ∈ A}
.

is bounded.
Then there are optimal solutions for the variational problem

Minimize in u ∈ A : I(u) = ∫_J F(x, u(x), u'(x)) dx. (4.17)

We place the following discussion in the context of Sect. 2.11.4, and assume that the
integrand .F (x, u, U) in (4.16) is, in addition to hypotheses in the previous corollary,
smooth (at least .C1 ) with respect to pairs

(u, U) ∈ Rn × Rn
.

for a.e. x ∈ J. Moreover, as we describe the process, we will be introducing
assumptions as they are needed, without pretending to isolate the most general
setting.
One of the most studied situations is concerned with end-point boundary value
problems in which, in the context of Theorem 4.3 or Corollary 4.1,

A = {u ∈ H 1 (J ; Rn ) : u(x0 ) = u0 , u(x1 ) = u1 }
. (4.18)

for two fixed vectors .u0 , u1 ∈ Rn . Note how the third main condition in
Theorem 4.3 or Corollary 4.1 is automatically fulfilled. Suppose we have found,
through Corollary 4.1 or otherwise, a minimizer .u ∈ A. The basic idea was stated
in Proposition 3.4. We would like to “make variations” based on .u by perturbing it
with .v in the underlying subspace

H01 (J ; Rn ) = {v ∈ H 1 (J ; Rn ) : v(x0 ) = v(x1 ) = 0}


.

that has been introduced earlier in the text. The combination .u + sv for arbitrary real
s is a feasible path for our variational problem, and then we should have

. I (u) ≤ I (u + sv)

for all real s. This means that the real function

s ↦ I(u + sv) (4.19)

has a global minimum for .s = 0, and hence


(d/ds) I(u + sv) |_{s=0} = 0,

provided (4.19) is differentiable. We can differentiate formally under the integral


sign to find

∫_J [ Fu(x, u(x), u'(x)) · v(x) + FU(x, u(x), u'(x)) · v'(x) ] dx = 0 (4.20)

for every such v ∈ H0¹(J; Rn). If we recall that functions in H¹(J; Rn) are
continuous and uniformly bounded, for the integrals in (4.20) to be well-defined,
and for (4.19) to be differentiable, we only require, for some positive constant C, that

.|Fu (x, u, U)| ≤ C(|U|2 + 1), |FU (x, u, U)| ≤ C(|U| + 1).

Consider the orthogonal complement, in the Hilbert space .H 1 (J ; Rn ), to the


subspace .e generated by the constant vectors .ei of the canonical basis of .Rn , i.e.


e⊥ = { w(x) ∈ H¹(J; Rn) : ∫_J w(x) dx = 0 }.

For every .w ∈ e⊥ , its primitive


v(x) = ∫_{x0}^{x} w(y) dy

belongs to .H01 (J ; Rn ) and

v'(x) = w(x), x ∈ J.

On the other hand, if we let


⏀(x) = ∫_{x0}^{x} Fu(y, u(y), u'(y)) dy, Ψ(x) = FU(x, u(x), u'(x)),

(4.20) can be recast in the form



∫_J [⏀'(x) · v(x) + Ψ(x) · v'(x)] dx = 0.

Bearing in mind that

v(x0 ) = v(x1 ) = 0,
.

an integration by parts in the first term leads to



∫_J [⏀(x) − Ψ(x)] · w(x) dx = 0

for every w ∈ e⊥. Conclude that

⏀(x) − Ψ(x) = c, a constant vector in J,

or by differentiation

Fu(x, u(x), u'(x)) − (d/dx) FU(x, u(x), u'(x)) = 0 in J.
We have just proved the following important statement.
Theorem 4.4 Suppose that the integrand .F (x, u, U) is .C1 with respect to pairs
.(u, U) ∈ R × R , and that
n n

C− (|U|² − 1) ≤ F(x, u, U) ≤ C+ (|U|² + 1),

|Fu(x, u, U)| ≤ C+ (|U|² + 1), |FU(x, u, U)| ≤ C+ (|U| + 1),

for some positive constants .C− ≤ C+ , and every .(x, u, U) ∈ J × Rn × Rn . Let


.u ∈ A, given in (4.18), be a minimizer of our variational problem (4.17). Then

FU (x, u(x), u' (x))


.

is absolutely continuous in J , and

Fu(x, u(x), u'(x)) − (d/dx) FU(x, u(x), u'(x)) = 0 a.e. x in J. (4.21)
The differential system in this statement is universally known as the Euler-Lagrange
system of optimality. In the scalar case .n = 1, the second-order differential equation
in this result together with the end-point conditions is referred to as a Sturm-
Liouville problem.
One main application of this result stresses the fact that if the minimizer .u is the
outcome of Corollary 4.1, because the suitable hypotheses hold, then .u must be a
solution of the corresponding Euler-Lagrange problem (4.21). This is the variational
method to show existence of solutions of (4.21). However, sometimes one might be
interested in going the other way, i.e. from solutions of (4.21), assuming that this
is possible, to minimizers for (4.17). The passage from minimizers .u for (4.17)
to solutions of (4.21) does not require convexity, and the process is known as
the necessity of optimality conditions expressed in (4.21) for minimizers; for the
passage from solutions .u of (4.21) to minimizers of (4.17) convexity is unavoidable,

and the process is labeled as sufficiency of the Euler-Lagrange problem (4.21) for
minimizers of the variational problem (4.17). Though this discussion is formally
like the one in Proposition 3.4, in the case of integral functionals the result can be
much more explicit, to the point that we can work in the most general Sobolev space
.W
1,1 (J ; Rn ).

Theorem 4.5 Let the integrand F(x, u, U), as above, be C¹ and convex in pairs
(u, U). Suppose the path

u(x) ∈ W 1,1 (J ; Rn )
.

is such that (4.21) holds

Fu(x, u(x), u'(x)) − (d/dx) FU(x, u(x), u'(x)) = 0 a.e. x in J, (4.22)

the two terms in this system belong to .L1 (J ; Rn ), and

u(x0 ) = u0 ,
. u(x1 ) = u1 ,

with .u0 , u1 ∈ Rn . Then .u is a minimizer for the problem



Minimize in u ∈ A : I(u) = ∫_J F(x, u(x), u'(x)) dx (4.23)

in the feasible set

A = { u ∈ W^{1,1}(J; Rn) : u(x0) = u0, u(x1) = u1 }. (4.24)

Proof Let

v ∈ W01,1 (J ; Rn ),
.

which is the subspace of .W 1,1 (J ; Rn ) in (4.24) corresponding to .u0 = u1 = 0. We


know that .v ∈ L∞ (J ; Rn ) and, moreover, the formula of integration by parts (2.22)
guarantees that
∫_J (d/dx) FU(x, u(x), u'(x)) · v(x) dx = − ∫_J FU(x, u(x), u'(x)) · v'(x) dx

is correct. This identity implies, through (4.22), that



∫_J [ Fu(x, u(x), u'(x)) · v(x) + FU(x, u(x), u'(x)) · v'(x) ] dx = 0. (4.25)

On the other hand, it is well-known that a characterization of convexity for .C1 -


functions, like .F (x, ·, ·), for a.e. .x ∈ J , is

F (x, u + v, U + V) ≥ F (x, u, U) + Fu (x, u, U) · v + FU (x, u, U) · V


.

for every .u, U, v, V ∈ Rn . In particular, for a.e. .x ∈ J ,

F(x, u(x) + v(x), u'(x) + v'(x)) ≥ F(x, u(x), u'(x))
+ Fu(x, u(x), u'(x)) · v(x)
+ FU(x, u(x), u'(x)) · v'(x).

Upon integration on .x ∈ J , bearing in mind (4.25), we find

I (u + v) ≥ I (u).
.

The arbitrariness of .v ∈ W01,1 (J ; Rn ) implies our conclusion. ⨆



As usual, uniqueness of minimizers is associated with some form of strict
convexity.
Proposition 4.3 Assume, in addition to hypotheses in Theorem 4.5, that the inte-
grand F(x, u, U) is jointly strictly convex on pairs (u, U). Then the path u in
Theorem 4.5 is the unique minimizer of the problem (4.23)–(4.24).
The proof is just like that of other such results, for example Proposition 4.2.

4.7 Some Explicit Examples

The use of optimality conditions in specific examples may serve, among other
important purposes beyond the scope of this text, two main objectives: one is
to try to find minimizers when the direct method delivers them; another one is
to explore if existence of minimizers can be shown once we have calculated them
in the context of the indirect method (Sect. 3.7). We will only consider end-point
conditions as additional constraints in the feasible set A; other kinds of constraints
ask for the introduction of multipliers, and are a bit beyond the scope of this text.
See some additional comments below in Example 4.11.
Example 4.9 We start by looking at the first examples of Sect. 4.5. It is very easy
to conclude that the unique minimizer .y(x) for Example 4.1 is the unique solution
of the differential problem

. − y '' (x) + g(x) = 0 in (0, 1), y(0) = y(1) = 0. (4.26)



It is elementary to find that such a solution is given explicitly by the formula


y(x) = ∫₀^x (x − s)g(s) ds − x ∫₀¹ (1 − s)g(s) ds. (4.27)
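Formula (4.27) is easy to confirm numerically; in the sketch below the choice g(s) = s is arbitrary, and the check is that y vanishes at the end-points and that a second difference recovers y'' = g (recall that −y'' + g = 0 means y'' = g):

```python
# Quick check of formula (4.27) with the arbitrary choice g(s) = s, using
# midpoint quadrature for both integrals in the formula.
def g(s):
    return s

def y(x, n=2000):
    h1 = x / n
    first = sum((x - (i + 0.5) * h1) * g((i + 0.5) * h1) for i in range(n)) * h1
    h2 = 1.0 / n
    second = sum((1 - (i + 0.5) * h2) * g((i + 0.5) * h2) for i in range(n)) * h2
    return first - x * second

h = 1e-3
print(y(0.0), y(1.0))                                 # both approximately 0
print((y(0.5 + h) - 2 * y(0.5) + y(0.5 - h)) / h**2)  # approximately g(0.5)
```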

For Example 4.2, we find the optimality differential equation

. − 2au'' (x) + g ' (u(x)) = 0 in (0, 1), u(0) = u(1) = 0.

This time it is impossible to know, without looking at the functional it comes from,
whether this equation admits solutions. Note that the problem is non-linear as soon as g is
non-quadratic. According to our discussion above, we can certify that this non-linear
differential problem admits solutions (possibly in a non-unique way) whenever

|g(u)| ≤ b|u| + C,
. b < a, C ∈ R.

For Example 4.3, there is not much to say. Optimality conditions become a
differential system

mx''(t) + kx(t) = 0 for t > 0, x(0) = x'(0) = 0,

whose unique solution is the trivial one x ≡ 0. Finally, for Example 4.4, we find the
non-linear differential problem

−u''(x) − (2α/u'(x)³) u''(x) + g(x) = 0 in (0, 1), u(0) = 0, u(1) = L > 0.

It is not easy to find explicitly the solution .u(x) of this problem even for special
choices of the function .g(x). What we can be sure about, based on strict convexity
arguments as we pointed out earlier, is that there is a unique solution.
Example 4.10 We analyze next the classic problem of the brachistochrone for
which we are interested in finding the optimal solution, if any, of the variational
problem
Minimize in u(x) : B(u) = ∫₀^A √(1 + u'(x)²) / √x dx

subject to

u(0) = 0,
. u(A) = a.

We have already insisted that the direct method cannot be applied in this case due
to the linear growth of the integrand

F(x, u, U) = (1/√x) √(1 + U²),

with respect to the U -variable, though it is strictly convex. We still have the
hope to find and show the unique minimizer of the problem through Theorem 4.5
and Proposition 4.3. This requires first to find the solution of the corresponding
conditions of optimality, and afterwards, check if hypotheses in Theorem 4.5 hold
to conclude.
The Euler-Lagrange equation for this problem reads
(d/dx) [ (1/√x) · u'(x)/√(1 + u'(x)²) ] = 0 in (0, A),

or

u'(x) / √( x(1 + u'(x)²) ) = 1/c (4.28)

for c, a constant. After a bit of algebra, we have


u'(x)² = x/(c² − x), u(x) = ∫₀^x √( s/(c² − s) ) ds.

The constant c would be selected by demanding


a = ∫₀^A √( s/(c² − s) ) ds.

This form of the solution is not particularly appealing. One can find a better
equivalent form by making an attempt to describe the solution in parametric form
with the goal of introducing a good change of variable in the integral defining .u(x),
and calculating it in a more explicit form. Indeed, if we take

s(r) = c² sin²(r/2) = (c²/2)(1 − cos r),
and perform the corresponding change of variables in the definition of .u(x), we find
u(τ) = c² ∫₀^τ sin²(r/2) dr = (c²/2)(τ − sin τ),

where the upper limit in the integral is

x = (c²/2)(1 − cos τ).
We, then, see that the parametric form of the optimal profile is

(t(τ), u(τ)) = C(1 − cos τ, τ − sin τ), τ ∈ [0, τ1],

where the constants C and τ1 are chosen to ensure that

t(τ1) = A, u(τ1) = a.

This family of curves is well-known in Differential Geometry: they are arcs of


cycloids, and enjoy quite remarkable properties.
It is easy to realize that this arc of cycloid u(x) is in fact a Lipschitz function
belonging to W^{1,∞}(0, A), and hence it is a function in W^{1,1}(0, A) too. Since the
integrand does not depend explicitly upon u, assumptions in Theorem 4.5 and
Proposition 4.3 hold true, and we can conclude that this is the only solution of the
brachistochrone problem.
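The constants of the cycloid can be fitted numerically. The sketch below is illustrative (the target end-point (A, a) = (1, 1) is an arbitrary choice, and the parametrization (C(1 − cos t), C(t − sin t)) on 0 ≤ t ≤ t1 is the one obtained above, with the ratio (t − sin t)/(1 − cos t) increasing in t so that bisection applies):

```python
import math

# Sketch (the end-point (A, a) = (1, 1) is illustrative): fit a cycloid arc
# (C(1 - cos t), C(t - sin t)), 0 <= t <= t1, through (A, a) by bisection
# on the monotone ratio (t - sin t)/(1 - cos t).
def fit_cycloid(A, a, iters=100):
    ratio = a / A
    lo, hi = 1e-9, 2 * math.pi - 1e-9
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (mid - math.sin(mid)) / (1 - math.cos(mid)) < ratio:
            lo = mid
        else:
            hi = mid
    t1 = 0.5 * (lo + hi)
    return A / (1 - math.cos(t1)), t1

C, t1 = fit_cycloid(1.0, 1.0)
print(C * (1 - math.cos(t1)), C * (t1 - math.sin(t1)))  # recovers (1.0, 1.0)
```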
One of the most surprising features of this family of curves is its tautochrone
property: the transit time of the mass falling along an arc of a cycloid is independent
of the point where the mass starts to fall from! No matter how high or how low you
place the mass, it will take the same time to reach the lowest point.
This is not hard to argue, once we have gone through the above computations.
Notice that the cycloid is parameterized by

(t(τ), u(τ)) = C(1 − cos τ, τ − sin τ)

for some constant C and τ ∈ [0, τ1], so that it is easy to realize that u'(τ) = t(τ). If
we go back to (4.28), and bear in mind this identity, it is easy to find that indeed
we go back to (4.28), and bear in mind this identity, it is easy to find that indeed
√( t / (1 + u'(t)²) ) = c̃, (4.29)

a constant. Because in the variable τ the interval of integration for the transit time
is fixed, we see that for the optimal profile, the cycloid, the transit time, whose
integrand is the inverse of (4.29), is a constant independent of the initial height.
This implies the tautochrone feature.
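The tautochrone feature can also be observed numerically. The sketch below makes the classical mechanical assumptions explicit (the fall coordinate is the first component of the parametrization, the speed after falling from the start height is v = √(2g(x − x_start)), and the values C = 1, g = 9.8 are arbitrary normalizations); the transit time down to the lowest point τ = π comes out essentially the same for different starting parameters.

```python
import math

# Sketch of the tautochrone property (normalization x = C(1 - cos tau),
# u = C(tau - sin tau), free-fall speed v = sqrt(2 g (x - x_start)), and
# C = 1, g = 9.8 are assumptions): transit time from tau0 to the bottom.
def transit_time(tau0, C=1.0, g=9.8, n=100000):
    h = (math.pi - tau0) / n
    total = 0.0
    for i in range(n):
        tau = tau0 + (i + 0.5) * h
        ds = 2 * C * math.sin(tau / 2)                  # arclength element
        v = math.sqrt(2 * g * C * (math.cos(tau0) - math.cos(tau)))
        total += ds / v * h
    return total

print(transit_time(0.5), transit_time(2.0))   # nearly equal values
```

Classically this common value equals π√(C/g), whatever the starting point on the arc.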
Example 4.11 Optimality conditions for problems in which additional constraints,
other than end-point conditions, are to be preserved are more involved. A formal
treatment of these is beyond the scope of this text. However, it is not that hard
to learn how to deal with them informally in practice. Constraints are dealt with
through multipliers, which are additional, auxiliary variables to guarantee that

constraints are taken into account in a fundamental way. There is one multiplier
for each restriction, either in the form of equality or inequality. If there are local
constraints to be respected at each point x of an interval, then we will have one (or
more) multipliers that are functions of x. If, on the contrary, we only have global,
integral constraints to be respected then each such constraint will involve just one
(or more) unknown number as multiplier.
To show these ideas in a particular example, let us consider the classical problem
of the hanging cable, introduced in Chap. 1,
Minimize in u(x) :   ∫₀^H u(x)√(1 + u'(x)²) dx

under the constraints

u(0) = u(H) = 0,   L = ∫₀^H √(1 + u'(x)²) dx,

with L > H. Since there is an additional global, integral constraint to be respected, in addition to end-point conditions, we set up the so-called augmented integrand

F(u, U) = u√(1 + U²) + λ√(1 + U²),        (4.30)

where λ is the unknown multiplier in charge of keeping track of the additional integral restriction. Suppose we look at the variational problem, in spite of not knowing yet the value of λ,

Minimize in u(x) :   ∫₀^H (u(x) + λ)√(1 + u'(x)²) dx

subject to just u(0) = u(H) = 0, through its corresponding Euler-Lagrange equation

−[(u(x) + λ) u'(x)/√(1 + u'(x)²)]' + √(1 + u'(x)²) = 0 in (0, H),
u(0) = u(H) = 0.

The solution set of this problem will be a one-parameter family uλ(x) of solutions. The one we seek is uλ0(x) so that

L = ∫₀^H √(1 + u'λ0(x)²) dx.

We expect to be able to argue, through Theorem 4.5 and Proposition 4.3, that this
function .uλ0 (x) is the unique minimizer we are searching for.

Because the integrand in (4.30) does not depend on the independent variable, the corresponding Euler-Lagrange equation admits an integration in the form (Exercise 16)

(u(x) + λ)√(1 + u'(x)²) − u'(x)(u(x) + λ) u'(x)/√(1 + u'(x)²) = c.

Some elementary but careful algebra carries us to

u'(x) = (1/c)√((u(x) + λ)² − c²),

and to

du/√((u(x) + λ)² − c²) = dx/c.

A typical hyperbolic trigonometric change of variables leads to

u(x) = c cosh(x/c + d) − λ,

for constants c, d and λ. These are determined to match the three numerical constraints of the problem, namely,

u(0) = u(H) = 0,   L = ∫₀^H √(1 + u'(x)²) dx.

This adjustment, beyond the precise calculation or approximation, informs us that the shape is that of a certain hyperbolic cosine, which in this context is called a catenary. To argue that this catenary is indeed the minimizer of the problem, we note that the interpretation of the problem yields c > 0 necessarily, i.e. u(x) + λ0 > 0, and the integrand

F(u, U) = (u + λ0)√(1 + U²)

is strictly convex in U. However, as a function of the pair (u, U) it is not convex, and hence we cannot apply Theorem 4.5 directly. The unique solution is, nonetheless, the one we have calculated.
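The three constants can be pinned down numerically. A minimal sketch (function name and the bisection bracket are my own choices): symmetry of u(0) = u(H) = 0 forces d = −H/(2c), so the length constraint collapses to the scalar equation 2c sinh(H/(2c)) = L, which is monotone in c and can be solved by bisection.

```python
import math

def catenary_constants(H, L):
    """Constants c, d, lam for u(x) = c*cosh(x/c + d) - lam on [0, H] with
    u(0) = u(H) = 0 and prescribed length L > H.  The end-point conditions
    give d = -H/(2c); the length constraint 2c sinh(H/(2c)) = L is solved
    for c by bisection (the left side decreases in c)."""
    f = lambda c: 2.0 * c * math.sinh(H / (2.0 * c)) - L
    a, b = 1e-3, 1e3          # f > 0 near a, f < 0 near b for L > H
    for _ in range(200):
        m = 0.5 * (a + b)
        if f(m) > 0.0:
            a = m
        else:
            b = m
    c = 0.5 * (a + b)
    d = -H / (2.0 * c)
    lam = c * math.cosh(H / (2.0 * c))
    return c, d, lam
```

For H = 1, L = 2 the resulting curve satisfies all three constraints to machine precision.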
Example 4.12 The case in which no boundary condition whatsoever is imposed
in a variational problem is especially interesting. The following corollary is similar
to the results presented earlier in the chapter. The only point that deserves some
comment is the so-called natural boundary condition (4.33) below that requires a
more careful use of the integration-by-parts step. See Exercises 8, 12.

Corollary 4.2
1. Let

F(x, u, U) : J × Rⁿ × Rⁿ → R        (4.31)

be continuous in (u, U) for a.e. x ∈ J, and measurable in x for every pair (u, U). Suppose that the following two main conditions hold:

a. coercivity and growth: there are positive constants C− ≤ C+ with

C−(|U|² − 1) ≤ F(x, u, U) ≤ C+(|U|² + 1),

for (x, u, U) ∈ J × Rⁿ × Rⁿ;
b. convexity: for a.e. x ∈ J, and every u ∈ Rⁿ, we have that

F(x, u, ·) : Rⁿ → R

is convex.

Then there are optimal solutions for the variational problem

Minimize in u ∈ H¹(J; Rⁿ) :   I(u) = ∫_J F(x, u(x), u'(x)) dx.        (4.32)

2. If, in addition,

|Fu(x, u, U)| ≤ C(|U|² + 1),   |FU(x, u, U)| ≤ C(|U| + 1)

for some constant C, and u ∈ H¹(J; Rⁿ) is a minimizer of the previous problem, then

FU(x, u(x), u'(x))

is absolutely continuous in J,

Fu(x, u(x), u'(x)) − (d/dx)FU(x, u(x), u'(x)) = 0 a.e. x in J,

and

FU(x, u(x), u'(x))|x=x0,x1 = 0.        (4.33)
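The natural boundary condition (4.33) can be watched emerging numerically. The sketch below is a hypothetical discretization with the illustrative choice F(x, u, U) = ½U² + ½(u − x)², minimized by plain gradient descent with both end-points left free; at the discrete optimum the end slopes, i.e. FU = u' at x = 0 and x = 1, come out approximately zero.

```python
def natural_bc_minimizer(n=30, iters=20000):
    """Minimize the discretization of I(u) = int_0^1 [u'^2/2 + (u-x)^2/2] dx
    with NO boundary conditions imposed, by gradient descent on the nodal
    values; the natural boundary conditions u'(0) = u'(1) = 0 are not
    enforced but appear at the optimum."""
    h = 1.0 / (n - 1)
    xs = [k * h for k in range(n)]
    u = [0.0] * n
    step = 0.45 * h            # stable: discrete Hessian norm is about 4/h
    for _ in range(iters):
        g = [0.0] * n
        for k in range(n):
            g[k] += h * (u[k] - xs[k])           # from the (u - x)^2/2 term
            if k > 0:
                g[k] += (u[k] - u[k - 1]) / h    # from the u'^2/2 term
            if k < n - 1:
                g[k] -= (u[k + 1] - u[k]) / h
        for k in range(n):
            u[k] -= step * g[k]
    return u, h
```

Besides the free-end slopes, the interior Euler-Lagrange residual −u'' + u − x vanishes at the discrete optimum.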

4.8 Non-existence

Readers may have realized that our main existence theorem cannot be applied to some examples of variational problems included in the first chapter, not because integrands fail to be convex but because they fail to comply with the coercivity condition for an exponent p > 1. Indeed, most of those interesting problems have a linear growth at infinity, so that, at first sight, existence of minimizers is compromised. Taking for granted that there are no troubles concerning the weak closedness of the set of competing functions or paths, we see that there may be two reasons for the lack of minimizers, depending on whether coercivity or convexity fails.
The simplest example of a variational problem without minimizer because of
lack of convexity is due to Bolza. Suppose we wish to
Minimize in u(t) :   ∫₀¹ [½(|u'(t)| − 1)² + ½u(t)²] dt

under u(0) = u(1) = 0. The integrand for this problem is the density

f(u, U) = ½(|U| − 1)² + ½u²,

which is not convex in the variable U. Coercivity is not an issue, as f has quadratic growth in both variables. The reason for the lack of minimizers resides in the fact that the two non-negative contributions to the functional are incompatible with each other: there is no way to reconcile the vanishing of the two terms, since either you insist on having |u'| = 1, or else u = 0. Both things cannot happen simultaneously. Yet, one can achieve the infimum value m = 0 through a minimizing sequence in which slopes ±1 oscillate faster and faster to drive the term with u² to zero. This is essentially the behavior of minimizing sequences due to lack of convexity, and we refer to it as a fine oscillatory process.
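The fine oscillatory process is easy to watch numerically. The sketch below evaluates the Bolza functional on the sawtooth u_n with slopes ±1 and period 1/n (a standard minimizing sequence; the function name is mine): the gradient term vanishes exactly, and only ∫ ½u² survives, which shrinks like 1/n².

```python
import math

def bolza_value(n, m=20_000):
    """Value of I(u) = int_0^1 [ (|u'|-1)^2/2 + u^2/2 ] dt on the sawtooth
    u_n with slopes +-1 and period 1/n; since |u_n'| = 1 a.e., the first
    term is 0 and only int u^2/2 remains, computed by midpoint quadrature."""
    h = 1.0 / m
    total = 0.0
    for k in range(m):
        t = (k + 0.5) * h
        s = t * n - math.floor(t * n)         # fractional part of t*n
        u = min(s, 1.0 - s) / n               # distance to nearest multiple of 1/n
        total += 0.5 * u * u
    return total * h
```

For n = 1 one gets the exact value 1/24, and I(u_n) → 0 as n grows, even though no u attains the value 0.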
A more involved, but similar, phenomenon takes place with the problem in which we try to minimize the functional

I(u) = ∫₀¹ (|u1'(x)| |u2'(x)| + |u(x) − (x, x)|²) dx,   u = (u1, u2),

in the feasible set of functions

A = {u ∈ H¹((0, 1); R²) : u(0) = 0, u(1) = (1, 1)}.

It should be noted, however, that lack of convexity does not always lead to lack of
minimizers. As a matter of fact, there are some interesting results about existence
without convexity. See the final Appendix.

There is also a classical simple example, due to Weierstrass, for which the lack of uniform coercivity is the reason for the lack of minimizers. Let us look at

Minimize in u(t) :   ∫₋₁¹ ½ t²u'(t)² dt

under

u(−1) = −1,   u(1) = 1.

We notice that the factor t² accounting for coercivity degenerates at the origin. It is precisely this lack of coercivity, even at a single point, that is responsible for the lack of minimizers. If there were some minimizer u, then it would be a solution of the corresponding Euler-Lagrange equation, which is very easily written down as

(t²u'(t))' = 0 in (−1, 1).

The general solution of this second-order differential equation is

u(t) = a/t + b,   a, b constants.

The end-point conditions lead to b = 0, a = 1, and so the minimizer would be

u(t) = 1/t,

which blows up at t = 0, and hence does not belong to any of the spaces W^{1,p}(−1, 1).
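One can also see that the infimum here is 0, approached but never attained. A minimal sketch (the family u_ε is a standard choice, not from the text): take u_ε(t) = t/ε clipped to [−1, 1], which satisfies the end-point conditions and gives I(u_ε) = ε/3 → 0, while any u with I(u) = 0 would have to be constant on each side of the origin, incompatible with the boundary values.

```python
def weierstrass_value(eps, m=100_000):
    """I(u) = int_{-1}^{1} t^2 u'(t)^2 / 2 dt on u_eps(t) = clamp(t/eps, -1, 1):
    u' = 1/eps on (-eps, eps) and 0 elsewhere, so the exact value is eps/3;
    computed here by midpoint quadrature."""
    h = 2.0 / m
    total = 0.0
    for k in range(m):
        t = -1.0 + (k + 0.5) * h
        du = 1.0 / eps if abs(t) < eps else 0.0
        total += 0.5 * t * t * du * du
    return total * h
```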

Example 4.13 The work done by a non-conservative force field. Suppose that we consider the functional

W(x) = ∫₀^T F(x(t)) · x'(t) dt

furnishing the work performed by a non-conservative force field F(x) in going from a point x0 to xT in a time interval [0, T] through a continuous path

x(t) : [0, T] → Rⁿ,   x(0) = x0,   x(T) = xT.

It is easy, but surprising, to check that the Euler-Lagrange system is void! It does not provide any information. This is so regardless of whether the force field is conservative. To better comprehend the situation for a non-trivial, non-conservative force field F, suppose that the system of ODEs

X'(t) = −F(X(t)),   t > 0,

has a periodic solution passing through a given point xP ∈ Rⁿ with period P > 0, so that

X(0) = xP,   X(P) = xP.

In this situation, the integral

∫₀^P F(X(t)) · X'(t) dt = −∫₀^P |F(X(t))|² dt

is a strictly negative fixed quantity M < 0. If a fixed time interval T > 0 is given, and we take λ = jP/T for an arbitrary positive integer j, then the path

x(t) = X(λt) = X(jP t/T),   t ∈ [0, T],

is such that

x(0) = x(T) = xP,

and by an easy change of variables, and periodicity, we find that

W(x) = jM.

In this way, we see that if m is the infimum of the work functional under the given conditions, then m = −∞. It suffices to select a sequence of paths xj(t) as indicated, reserving an initial time subinterval to go from x0 to xP, and another final time subinterval to move back from xP to xT, after going around the periodic solution X j times. Note that it is impossible for a gradient field to have periodic integral curves such as the above curve X.
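The winding construction is concrete for the rotational field F(x, y) = (−y, x) (a hypothetical choice of non-conservative field, not from the text): X(t) = (cos t, −sin t) solves X' = −F(X) with period P = 2π, M = −2π, and winding j times yields work −2πj.

```python
import math

def winding_work(j, T=1.0, m=50_000):
    """Work W = int_0^T F(x(t)) . x'(t) dt of the non-conservative field
    F(x, y) = (-y, x) along x(t) = X(j*P*t/T), with X(t) = (cos t, -sin t)
    solving X' = -F(X) and P = 2*pi; midpoint quadrature in t.
    The exact value is j*M = -2*pi*j, unbounded below as j grows."""
    w = j * 2.0 * math.pi / T
    h = T / m
    total = 0.0
    for k in range(m):
        t = (k + 0.5) * h
        x, y = math.cos(w * t), -math.sin(w * t)
        dx, dy = -w * math.sin(w * t), -w * math.cos(w * t)
        total += (-y) * dx + x * dy
    return total * h
```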
Is there anything else interesting to do with such non-convex examples? Is there
anything else to be learnt from them? We will defer the answer to these appealing
questions until Chap. 8 in the context of multidimensional variational problems.

4.9 Exercises

1. Let H be a vector space, and I : H → R, a functional. Show that if I is strictly


convex, then it cannot have more than one minimizer, and every local minimizer
of I is, in fact, a global minimizer.
2. Let

φa(u) = u⁴ + au³ + (3/8)a⁴u²,

where a is a real positive parameter. Study the family of problems

Minimize in u(t) :   ∫₀¹ φa(u'(t)) dt

under end-point conditions

u(0) = 0,   u(1) = α.

We are especially interested in the dependence of minimizers on a and on α.


3. Argue that all minimizers of the functional

I(u) = ∫_a^b [½u'(x)² + (1/3)u(x)³] dx

are convex functions in (a, b).


4. Provide full details for the necessity part of the proof of Theorem 4.2.
5. An integrand

F (x, z) : RN × RN → R
.

is a null-Lagrangian if the integrals

∫_a^b F(x(t), x'(t)) dt

only depend upon the starting point xa and the final point xb , and not on the
path itself

x(t) : [a, b] → RN
.

as long as

x(a) = xa ,
. x(b) = xb .

(a) Argue that for a conservative force field

F(x) : RN → RN ,
.

the integrand

F(x, z) = F(x) · z

is a null-Lagrangian. What is the form of the Euler-Lagrange system in this case?
case?
(b) Study the corresponding variational problem for a non-conservative force
field.
6. Provide all missing technical details for a full proof of Theorem 4.4.
7. Determine the infimum of the functional

∫₀¹ |u'(x)|^α dx

over the class of Lipschitz functions (W^{1,∞}(0, 1)) with u(0) = 0, u(1) = 1, in terms of the exponent α ∈ R.
8. Derive optimality conditions for a variational problem like the ones in Sect. 4.6
under periodic boundary conditions

u(x0 ) − u(x1 ) = 0.
.

9. Examine the form of optimality conditions for a variational problem depending


explicitly on second derivatives of the form

∫_J F(x, u(x), u'(x), u''(x)) dx

under end-point conditions involving the values of u and u' at both end-points
of J .
10. The typical way to proceed in the proof of Theorem 4.4 from (4.20) focuses on
an integration by parts on the second term rather than on the first. Follow this
alternative route, and provide the details to find the same final result.
11. For each positive ε, consider the problem that consists in minimizing in pairs (u1(t), u2(t)) the functional

∫₀¹ [½(u1(t)u2(t) − 1)² + (ε/2)(u1'(t)² + u2'(t)²)] dt

under

u1(0) = 2,   u2(0) = 1/2,
u1(1) = −1/2,   u2(1) = −2.

(a) Argue that there is an optimal pair uε = (u1,ε, u2,ε) for each ε.
(b) Examine the corresponding Euler-Lagrange system.
(c) Explore the convergence of such optimal paths uε as ε ↘ 0.

12. For a regular variational problem for an integral functional of the form

I(u) = ∫₀¹ F(u(t), u'(t)) dt,

derive optimality conditions for the problem in which only the left-hand side
condition u(0) = u0 is prescribed but the value u(1) is free. Pay special
attention, precisely, to the optimality condition at this free end-point (natural
boundary condition).
13. Explore how to enforce Neumann end-point conditions

u'(x0) = u0',   u'(x1) = u1'

for arbitrary numbers u0', u1' in a variational problem of the particular form

∫_J (½u'(x)² + F(x, u(x))) dx,   J = (x0, x1).

14. Show that the subset

A = {u(x) ∈ W^{1,1}(0, 1) : u(0) = u(1) = 0, √2 = ∫₀¹ √(1 + u'(x)²) dx}

is not weakly closed in W^{1,1}(0, 1).
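A numerical illustration of the mechanism behind this exercise (the sawtooth family is a standard choice, not prescribed by the text): each u_n with slopes ±1 and period 1/n lies in A, since its graph has length exactly √2, yet u_n → 0 uniformly, and the limit has length 1 ≠ √2.

```python
import math

def sawtooth_data(n, m=50_000):
    """Length int_0^1 sqrt(1 + u_n'^2) dx and sup-norm of the sawtooth u_n
    with slopes +-1 and period 1/n (so u_n(0) = u_n(1) = 0 and |u_n'| = 1
    a.e., making the length exactly sqrt(2))."""
    h = 1.0 / m
    length, sup = 0.0, 0.0
    for k in range(m):
        x = (k + 0.5) * h
        s = x * n - math.floor(x * n)
        u = min(s, 1.0 - s) / n
        sup = max(sup, u)
        length += math.sqrt(2.0)      # sqrt(1 + u_n'^2) with |u_n'| = 1
    return length * h, sup
```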


15. Check that the unique solution of problem (4.26) is given by formula (4.27).
16. Show that if the integrand F for an integral functional does not depend on the independent variable x, F = F(u, U), then the Euler-Lagrange system reduces to

F(u(x), u'(x)) − u'(x) · FU(u(x), u'(x)) = constant.

17. For paths

x(t) ∈ H¹([0, T]; Rⁿ),   x(0) = x0,

consider the functional

E(x) = ∫₀^T ½|x'(t) − f(x(t))|² dt,

where f(x) : Rⁿ → Rⁿ is a continuous mapping.

(a) If f is Lipschitz,

|f(z1) − f(z2)| ≤ M|z1 − z2|,   M > 0,

show that there is a unique minimizer X(t) for the problem.



(b) If, in addition, f is C¹, then the unique minimizer X is the unique solution of the Cauchy problem

X'(t) = f(X(t)),   X(0) = x0.

18. Let the C¹ mapping

f(x, y) : Rⁿ × Rᵐ → Rⁿ

be given, complying with the conditions:

(a) uniform Lipschitzianity with respect to y:

|f(z1, y) − f(z2, y)| ≤ M|z1 − z2|

with a positive constant M valid for every y;
(b) coercivity with respect to y: for every compact set K ⊂ Rⁿ,

lim_{|y|→∞} sup_{x∈K} |f(x, y)| = ∞.

Consider the controllability situation

x'(t) = f(x(t), y(t)) in [0, T],   x(0) = x0,   x(T) = xT.

It is not clear if, or under what restrictions, there is a solution of this problem.

(a) Study the surjectivity of the mapping

y ∈ Rᵐ ↦ x(T),

where x(t) is the unique solution of

x'(t) = f(x(t), y) in [0, T],   x(0) = x0,

by looking at the variational problem

Minimize in (x(t), y) ∈ H¹([0, T]; Rⁿ) × Rᵐ :   ∫₀^T ½|x'(t) − f(x(t), y)|² dt

subject to

x(0) = x0,   x(T) = xT.

i. show that there is always an optimal pair

(x, y) ∈ H¹([0, T]; Rⁿ) × Rᵐ;

ii. write optimality conditions.

(b) Do the same for variable paths y(t).
(c) Examine the particular case of a linear map

f(x, y) : Rⁿ × R → Rⁿ,   f(x, y) = Ax + by,

for a constant matrix A and a non-vanishing vector b.


19. For a standard variational problem of the form

∫₀¹ F(t, u(t), u'(t)) dt,

explore general optimality conditions of the form

u(s, t) : (−ε, ε) × H¹([0, 1]; Rᴺ) → Rᴺ

if u(0, t) is a true minimizer under, say, end-point conditions.


20. For a standard variational problem of the form

I(u) = ∫₀¹ F(t, u(t), u'(t)) dt,

transform the functional

Î(ψ) = I(uψ)

for competing fields of the form

uψ(t) = u(ψ⁻¹(t)),

where u is fixed, while ψ varies in the class of diffeomorphisms of the interval [0, 1] leaving both end-points invariant, by performing the change of variables ψ(s) = t in the integral defining I(uψ). Investigate optimality conditions for Î(ψ).
21. For a variational problem of the form

∫₀¹ F(t, u(t), u'(t)) dt

under a point-wise constraint

φ(u(t)) = 0 for all t ∈ [0, 1],   φ : Rᴺ → Rⁿ,

examine optimality conditions with respect to inner variations of the form

u(ψ(t)),   ψ(t) : [0, 1] → [0, 1],   ψ(0) = 0,   ψ(1) = 1.

22. For a finite number n, and a finite interval J = (x0, xT) ⊂ R, consider

fi(x, u, v) : J × R × R → R,   f(x, u, v) = (fi(x, u, v))i : J × R × R → Rⁿ,

and another function F : Rⁿ → R. Explore optimality conditions for the functional

I(u) = F(∫_J f(x, u(x), u'(x)) dx)

subject to typical end-point conditions

u(x0) = u0,   u(xT) = uT,

assuming every necessary regularity hypothesis on the functions involved. Look specifically at the case of products and quotients, for which n = 2, and

P(u) = (∫_J f1(x, u(x), u'(x)) dx)(∫_J f2(x, u(x), u'(x)) dx),

Q(u) = (∫_J f1(x, u(x), u'(x)) dx) / (∫_J f2(x, u(x), u'(x)) dx).

23. If a non-singular symmetric matrix-field

A(x) : Rᴺ → Rᴺ×ᴺ

determines how to measure distances locally around the point x ∈ Rᴺ, then we already described that the distance functional

∫₀¹ [x'(t)ᵀ A(x(t)) x'(t)]^{1/2} dt

provides the length of the curve

x(t) : [0, 1] → Rᴺ,   x(0) = P,   x(1) = Q,

joining the two given points. Geodesics are the paths minimizing such a functional. Study the corresponding Euler-Lagrange system, and identify the Christoffel symbols Γ^k_{ij} of Differential Geometry when the system is written in the form

x_k''(t) − Σ_{i,j=1}^N Γ^k_{ij} x_i'(t) x_j'(t) = 0.

Start with the case N = 2, and try to figure out the general case from there.
24. Consider the functional

I(u) = ∫₀¹ |u'(t)|² dt,   u : [0, 1] → Rᴺ,   u ∈ H¹(0, 1; Rᴺ),

and define

Ii(u) = inf_ψ I(uψ),   uψ = u ∘ ψ,

where ψ runs through all diffeomorphisms of the unit interval preserving its two end-points.

(a) For each feasible u fixed, define

Î(ψ) = I(uψ),

and compute the optimal ψ.
(b) Calculate

inf_ψ Î(ψ),

for each fixed u. Is Ii(u) a typical integral functional?
(c) Redo the same calculations with the functional

I(u) = ∫₀¹ |u'(t)| dt,   u : [0, 1] → Rᴺ,   u ∈ W^{1,1}(0, 1; Rᴺ).

What is the main difference between the two cases?


25. Suppose the family of functions {aε(x)} is strictly positive and uniformly bounded:

0 < m ≤ aε(x) ≤ M < +∞,   x ∈ [0, 1].

Consider the variational problem

Minimize in H¹(0, 1) :   Qε(u) = ∫₀¹ (aε(x)/2) u'(x)² dx

subject to u(x) − x ∈ H₀¹(0, 1).

(a) Check that there is a unique minimizer uε, which forms a uniformly bounded sequence in H¹(0, 1), and therefore, for a suitable subsequence (not relabeled), there is some feasible u ∈ H¹(0, 1), u(x) − x ∈ H₀¹(0, 1), such that uε ⇀ u.
(b) Use optimality to derive a closed form for uε.
(c) See how to arrange things in such a way that the limit function u becomes the minimizer of a similar variational problem

Minimize in H¹(0, 1) :   Q(u) = ∫₀¹ (a(x)/2) u'(x)² dx

under the same end-point conditions.
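A numerical sketch of the mechanism in Exercise 25, under the hypothetical choice of aε alternating between 2 and 1/2 on cells of width 1/(2n): optimality (part (b)) gives u'(s) = c/aε(s) with c = 1/∫₀¹ ds/aε(s), and the weak limit is the linear function x, the minimizer for the constant harmonic-mean coefficient.

```python
def oscillating_minimizer_value(n, x, m=100_000):
    """Evaluate at the point x the minimizer of int_0^1 a(s) u'(s)^2 / 2 ds,
    u(0) = 0, u(1) = 1, where a alternates between 2 and 1/2 on cells of
    width 1/(2n).  From optimality, u(x) = c * int_0^x ds/a(s) with
    c = 1 / int_0^1 ds/a(s); both integrals by midpoint quadrature."""
    def a(s):
        return 2.0 if int(s * 2 * n) % 2 == 0 else 0.5
    h = 1.0 / m
    inv_total = sum(h / a((k + 0.5) * h) for k in range(m))
    c = 1.0 / inv_total
    mx = int(round(x * m))
    return c * sum(h / a((k + 0.5) * h) for k in range(mx))
```

As n grows, uε(x) approaches x: the oscillating minimizers converge weakly to the minimizer of the homogenized problem.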


Part II
Basic Operator Theory
Chapter 5
Continuous Operators

Though we have insisted on the pivotal role played historically by the Calculus of Variations in the initial steps of Functional Analysis as a discipline on its own, Functional Analysis has grown to be an immense and fruitful field, crucial to many other areas of Mathematics and Science. In particular, Operator Theory is one of those fundamental parts of Functional Analysis, and we treat it briefly in the following two chapters. We focus on some of the more relevant results that applied analysts and applied mathematicians should know. Most of the concepts and results in this chapter are abstract. But they are so important for fundamental applications in Analysis, when spaces become either Lebesgue or Sobolev spaces, as we will see in later chapters, that they need to be taken very seriously.

5.1 Preliminaries

We start by considering linear, continuous operators between Banach spaces, which are a natural extension of the ideas discussed earlier in Sect. 2.7 about the dual space.
Proposition 5.1 Let E, F be Banach spaces. A linear map T : E → F is continuous if and only if there is a positive constant M such that

||Tu|| ≤ M||u||,   u ∈ E.

The proof of this fact is easy and similar to the one in Sect. 2.7. It makes the
following definition sensible.
Definition 5.1 If T : E → F is continuous, we put

||T|| = inf{M > 0 : ||Tu|| ≤ M||u||} = sup_{||u||=1} ||Tu||.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 169
P. Pedregal, Functional Analysis, Sobolev Spaces, and Calculus of Variations,
La Matematica per il 3+2 157, https://doi.org/10.1007/978-3-031-49246-4_5

The space of linear continuous operators from .E to .F, written .L(E, F), becomes a
Banach space under this norm. This is straightforward to check.
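In finite dimensions the supremum in Definition 5.1 is attained (the unit sphere is compact), and can be approximated by brute force. A small sketch for an operator on (R², Euclidean norm); the particular matrix is an arbitrary illustrative choice:

```python
import math

def op_norm_2x2(a, b, c, d, samples=200_000):
    """Estimate ||T|| = sup over ||u|| = 1 of ||Tu|| for T = [[a, b], [c, d]]
    on R^2 with the Euclidean norm, by sampling the unit circle; for a
    matrix this supremum equals the largest singular value."""
    best = 0.0
    for k in range(samples):
        th = 2.0 * math.pi * k / samples
        x, y = math.cos(th), math.sin(th)
        best = max(best, math.hypot(a * x + b * y, c * x + d * y))
    return best
```

For T = [[2, 1], [0, 1]] the exact value is the largest singular value √(3 + √5).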
Note that a bijection T : E → F is an isomorphism if there is a positive constant M > 0 such that

(1/M)||u|| ≤ ||Tu|| ≤ M||u||,   u ∈ E.

In particular, two different norms || · ||1, || · ||2 on the same underlying space E are equivalent if there is a constant M > 0 with

(1/M)||u||1 ≤ ||u||2 ≤ M||u||1,   u ∈ E.
Complete metric spaces enjoy quite remarkable properties. Since Banach spaces are
a subclass of the class of complete metric spaces, we are interested in stressing some
fundamental properties of those spaces related to Baire category methods. There are
various equivalent definitions of Baire spaces. We stick to the following.
Definition 5.2 A topological space .X is a Baire space if the countable intersection
of open and dense subsets of .X is dense in .X. Alternatively, .X is a Baire space if for
every sequence of closed subsets with empty interior, the union of them all still has
empty interior.
The fundamental result we are interested in is the fact that every complete metric
space is a Baire space.
Theorem 5.1 Every complete metric space is a Baire space.
Proof Let X be a complete metric space with distance function d(·, ·), and let Oj be open and dense. We aim to show that O = ∩j Oj is dense too. To this end, choose an additional arbitrary open, non-empty set G, and show that G ∩ O is non-empty.
Take some x0 ∈ G and, given that G is open, r0 > 0 so that

B_{r0}(x0) ≡ {x ∈ X : d(x, x0) ≤ r0} ⊂ G.

Since O1 is open and dense, we can find x1 and r1 > 0 with

B_{r1}(x1) ≡ {x ∈ X : d(x, x1) ≤ r1} ⊂ B_{r0}(x0) ∩ O1,   0 < r1 < r0/2.

In a recursive way, we find sequences {xj} and {rj} such that

B_{rj+1}(xj+1) ⊂ B_{rj}(xj) ∩ Oj+1,   rj+1 < rj/2.

It is then clear that {xj} is a Cauchy sequence because, for all j,

d(xj+1, xj) ≤ rj,

and, hence, if j < k,

d(xk, xj) ≤ Σ_{i=j}^{k−1} d(xi, xi+1) ≤ Σ_{i=j}^{k−1} ri ≤ rj Σ_{i=0}^{k−1−j} 2^{−i} = rj(2 − 2^{j−k+1}) ≤ 2rj.

The completeness of the space leads to a limit x. Since the balls {B_{rj}(xj)} form a nested, decreasing sequence with B_{rj}(xj) ⊂ G ∩ Oj, the limit x belongs to G ∩ O. This proves the result. ⊓⊔


5.2 The Banach-Steinhaus Principle

The Banach-Steinhaus principle enables one to derive a uniform bound from a pointwise bound, which is quite remarkable given that, in general, this is far from automatic.
Theorem 5.2 Let E and F be two Banach spaces, and {Tj}j∈J a family (not necessarily a sequence) in L(E, F). The following two assertions are equivalent:

1. there is a constant C > 0 such that

||Tj x|| ≤ C||x||,   x ∈ E,   j ∈ J;

2.

sup_{j∈J} ||Tj x|| < ∞,   x ∈ E.        (5.1)

Proof Suppose (5.1) holds, and set, for each i ∈ N,

Fi = {x ∈ E : ||Tj x|| ≤ i for all j ∈ J}.

It is clear that each Fi is closed, and E = ∪i Fi, precisely by (5.1). Since every Banach space is a complete metric space, by Theorem 5.1 and Definition 5.2, there must be some i0 with Fi0 having a non-empty interior. This means that there are some x0 ∈ E and ρ > 0 with

||Tj(x0 + ρx)|| ≤ i0,   j ∈ J,   ||x|| ≤ 1.

This inequality can be recast as

||Tj|| ≤ (1/ρ)(i0 + ||Tj x0||),

which is our conclusion. ⊓⊔



An interesting corollary is the following.

Corollary 5.1 Let E, F be two Banach spaces, and {Tj} a sequence in L(E, F), such that the limit of {Tj x} exists for every x ∈ E, and is denoted by Tx. Then T ∈ L(E, F) and

||T|| ≤ lim inf_{j→∞} ||Tj||.

Proof Directly from Theorem 5.2, we conclude that T ∈ L(E, F). Since it is true that

||Tj x|| ≤ ||Tj|| ||x||

for every j, we can conclude the inequality in the statement. ⊓⊔



Remark 5.1 It is important, however, to realize that it is not true, in general, that

||Tj − T|| → 0.
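A standard illustration of Remark 5.1, sketched with finite lists standing in for ℓ² sequences (the function name is mine): the coordinate functionals Tj x = xj converge to 0 pointwise on ℓ², yet each has norm 1, witnessed by the basis vector ej.

```python
def coord_functional(j, x):
    """T_j x = x_j on l^2, with sequences given as finite lists padded by
    zeros: T_j x -> 0 for every fixed x in l^2 (its entries tend to 0),
    while ||T_j|| = 1 for all j, so T_j -> 0 pointwise but not in norm."""
    return x[j] if j < len(x) else 0.0
```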

5.3 The Open Mapping and Closed Graph Theorems

These two are among the classical results in Functional Analysis that ought to be
studied in a first course.
Theorem 5.3 Every surjective, linear, continuous operator T : E → F between two Banach spaces E and F is open, i.e. the image T(O) of every open set O in E is open in F.
Proof It is easy to argue that it suffices to show that there is some ρ > 0 such that

Bρ ⊂ T(B1),        (5.2)

where the first ball is a ball in F centered at 0, while the second ball is the unit ball in E (exercise). To show (5.2), put

Fi = i cl(T(B1)),   i = 1, 2, . . . ,

where cl denotes the closure in F. Since T is onto, it is true that F = ∪i Fi. By the Baire category result, Theorem 5.1, there must be some i0 with Fi0 having non-empty interior, for otherwise the union would have empty interior, which is not the case. There are, therefore, ρ > 0 and y0 ∈ F with

B4ρ(y0) ⊂ cl(T(B1)).

Due to the symmetry of cl(T(B1)) with respect to changes of sign, it is easy to have

B4ρ(0) ⊂ cl(T(B1)) + cl(T(B1)) = 2 cl(T(B1)),

or

B2ρ(0) ⊂ cl(T(B1)).        (5.3)

Based on (5.3), we would like to prove that

Bρ ≡ Bρ(0) ⊂ T(B1).

To this end, let y ∈ F be such that ||y|| ≤ ρ; we want to find x ∈ E with

||x|| < 1,   Tx = y.

Because of (5.3), for ε > 0 arbitrary, a vector z ∈ E can be found so that

||z|| < 1/2,   ||y − Tz|| < ε.

In particular, for ε = ρ/2, there is some z1 ∈ E such that

||z1|| < 1/2,   ||y − Tz1|| < ρ/2.

But this same argument applied to the difference y − Tz1, instead of y, and for ε = ρ/4, yields z2 ∈ E with

||z2|| < 1/4,   ||(y − Tz1) − Tz2|| < ρ/4.

Proceeding recursively, we would find a sequence {zi} with the properties

||zi|| < 2^{−i},   ||y − T(Σ_{k=1}^{i} zk)|| < ρ2^{−i},

for every i. Consequently, the sequence of partial sums

xi = Σ_{k=1}^{i} zk

is a Cauchy sequence, hence convergent to some x ∈ B1, and, by continuity of T,

Tx = y,

as desired. ⊓⊔

There are at least three important consequences of this fundamental result, one of which is the closed graph theorem, which we discuss in the third place.

Corollary 5.2 If T : E → F is a linear, continuous operator between two Banach spaces that is a bijection, then T⁻¹ : F → E is linear and continuous too.
Proof We investigate the consequences of injectivity together with property (5.2). Indeed, if T is one-to-one and x is such that ||Tx|| < ρ, then necessarily ||x|| < 1. For an arbitrary non-zero x ∈ E, the vector

x̃ = (ρ/(2||Tx||)) x

is such that ||Tx̃|| = ρ/2 < ρ, and hence ||x̃|| < 1, that is,

||x|| < (2/ρ)||Tx||.

This is the continuity of T⁻¹. ⊓⊔



A particularly interesting situation of this kind occurs when a single underlying space turns out to be a Banach space under two different norms.

Corollary 5.3 If a vector space E is a Banach space under two different norms || · ||1 and || · ||2, and there is a constant C > 0 with

||x||2 ≤ C||x||1,   x ∈ E,

then the two norms are equivalent, i.e. there is a further constant c > 0 with

||x||1 ≤ c||x||2,   x ∈ E.

Proof The proof is hardly in need of any clarification. Simply apply the previous corollary to the identity as a continuous bijection between the two Banach spaces (E, || · ||i), i = 1, 2. ⊓⊔
It is of paramount importance that E is a Banach space under both norms.

Finally, we have the closed graph theorem. For an operator T : E → F, its graph G(T) is the subset of the product space given by

G(T) = {(x, Tx) ∈ E × F : x ∈ E}.

Theorem 5.4 Let T : E → F be a linear operator between two Banach spaces. The following two assertions are equivalent:
1. T is continuous;
2. G(T) is closed in E × F.
Proof If T is continuous, then its graph is closed. This is even true for non-linear operators. For the converse, consider in E the graph norm

||x||0 = ||x||E + ||Tx||F.

Relying on the assumption of the closedness of the graph, it is easy to check that E is also a Banach space under || · ||0 (exercise). We trivially have

||x||E ≤ ||x||0,   x ∈ E.

By the preceding corollary, the two norms are equivalent:

||x||0 ≤ c||x||E,   x ∈ E,

for some positive constant c. But then

||Tx||F ≤ c||x||E,   x ∈ E,

and T is continuous. ⊓⊔

5.4 Adjoint Operators

There is a canonical operation associated with a linear, continuous mapping T : E → F between two Banach spaces with respective dual spaces E', F'. If S ∈ F', so that S : F → R is linear and continuous, the composition

S ∘ T : E → R

is linear and continuous as well, and hence it belongs to E'.

Definition 5.3 Let T : E → F be linear and continuous. The map T' : F' → E' defined through the identity

<T'S, x> = <S, Tx>,   x ∈ E,   S ∈ F',

is called the adjoint operator of T.

It is very easy to check that

||T'||_{L(F',E')} = ||T||_{L(E,F)}.

The duality between a Banach space E and its dual E' can be understood as a bilinear mapping

<·, ·> : E × E' → R.

Definition 5.4
1. For a subspace M of a Banach space E, we define its orthogonal subspace

M⊥ = {S ∈ E' : <S, x> = 0 for all x ∈ M}.

Similarly, if M is now a subspace of the dual E', then

M⊥ = {x ∈ E : <S, x> = 0 for all S ∈ M}.

2. Given a linear, continuous operator T : E → F, there are two special subspaces, one in E and another one in F, associated with T:

N(T) = {x ∈ E : Tx = 0},   R(T) = {Tx : x ∈ E}.

It is instructive to relate the various concepts in this definition. To do this in an efficient way, we first show the following, which is also a nice fact in its own right.

Proposition 5.2 If M is a subspace of a Banach space E, then

(M⊥)⊥ = cl(M),        (5.4)

where cl(M) denotes the closure of M. However, if M is a subspace of the dual E', then

(M⊥)⊥ ⊃ cl(M).        (5.5)

If E is reflexive, (5.5) becomes an equality too.



Proof If M is a subspace of the Banach space E, it is easy to check that M ⊂ (M⊥)⊥, and that (M⊥)⊥ is closed. In particular, (M⊥)⊥ is a Banach space itself (under the same norm as in E). According to Corollary 3.3, to show (5.4) it suffices to prove that for a linear functional S ∈ E' such that

<S, x> = 0 for all x ∈ M        (5.6)

we have

S|_{(M⊥)⊥} ≡ 0.        (5.7)

But (5.6) exactly means that S ∈ M⊥, and so (5.7) holds by definition.
The proof of (5.5) is similar. However, equality might not hold because linear continuous functionals over E', i.e. elements of E'', vanishing on M may not belong to E, and hence are not elements of M⊥. If E is assumed to be reflexive, the equality is recovered as before. ⊓⊔

Proposition 5.3 Let T : E → F be a linear, continuous operator between two Banach spaces, and T' : F' → E' its adjoint. Then

N(T) = R(T')⊥,   N(T') = R(T)⊥,
N(T)⊥ ⊃ R(T'),   N(T')⊥ = cl(R(T)),

cl denoting the closure. If E is reflexive, then

N(T)⊥ = cl(R(T')).

Proof The first two identities are essentially a consequence of the definitions. Concerning the other two, they are implied by the first two if we bear in mind Proposition 5.2, including the situation when E is reflexive. ⊓⊔

We wrap up this section by commenting briefly on the double adjoint T'' : E'' → F''. If we start with T : E → F, we know that the adjoint operator T' : F' → E' is determined through

<T'S, x> = <S, Tx>,   x ∈ E,   S ∈ F'.

Once we have T' : F' → E', we can talk about T'' : E'' → F'', characterized by

<T''x, S> = <x, T'S>,   x ∈ E'',   S ∈ F'.

In particular, since E ⊂ E'' is a closed subspace, if we take x ∈ E we would have, for every S ∈ F',

<T''x, S> = <S, Tx>.

This identity yields, again because F is a closed subspace of F'', that T''x = Tx. This fact can be interpreted by saying that T'' : E'' → F'' is an extension of T : E → F. If E is reflexive, T'' and T are the same operator, though with a different target space when F is not reflexive.

5.5 Spectral Concepts

For a Banach space E, we denote by L(E) the collection of linear, continuous operators from E to itself; 1 ∈ L(E) stands for the identity map.
Definition 5.5 Let .T ∈ L(E) for a Banach space .E.
1. The resolvent .ρ(T) of .T is the subset of real numbers .λ such that .T − λ1 is a
bijection from .E onto .E. We put

Rλ = (T − λ1)−1 ,
. λ ∈ ρ(T).

2. The spectrum .σ (T) is defined to be

σ (T) = R \ ρ(T).
.

The spectral radius .ν(T) is the number

    ν(T) = sup{|λ| : λ ∈ σ(T)}.

3. A real number .λ ∈ R is an eigenvalue of .T if

. N(T − λ1) /= {0}.

If .λ is an eigenvalue of .T, the non-trivial subspace .N(T − λ1) is, precisely, its
eigenspace. The set of all eigenvalues of .T is denoted .e(T).
Note that we always have

e(T) ⊂ σ (T).
.

Some basic properties follow. Let .T ∈ L(E) for a certain Banach space .E.
Proposition 5.4
1. The resolvent .ρ(T) is open in .R.
2. For every pair .λ, μ ∈ ρ(T),

Rλ − Rμ = (λ − μ)Rμ Rλ .
. (5.8)

In particular,

    lim_{λ→μ} (Rλ − Rμ)/(λ − μ) = Rμ².

3. The spectrum .σ (T) is a compact set of .R.


4. The spectral radius ν(T) is no larger than the norm: ν(T) ≤ ||T||.
Proof We first check that the class of isomorphisms in .L(E) is open in this Banach
space. Indeed, if .T ∈ L(E) is an isomorphism, the ball

    B = {S ∈ L(E) : ||S − T|| < 1/||T⁻¹||}

is fully contained in the same class. To check this, we can write


    S = T − (T − S) = T(1 − T⁻¹(T − S)). (5.9)

But if .S ∈ B,

    ||T⁻¹(T − S)|| < 1,

and then (see exercise below)

    1 − T⁻¹(T − S)

is an isomorphism. Equation (5.9) shows that so is .S. Since the resolvent of .T is the
pre-image of the class of isomorphisms, an open subset in .L(E) by our argument
above, through the continuous map

    ψ : R → L(E),    ψ(λ) = T − λ1,

we can conclude that the resolvent is an open set of .R.


Formula (5.8) is straightforward. In fact,

    Rλ − Rμ = Rμ(Rμ⁻¹Rλ − 1) = Rμ(Rμ⁻¹ − Rλ⁻¹)Rλ = (λ − μ)Rμ Rλ,

since Rμ⁻¹ − Rλ⁻¹ = (T − μ1) − (T − λ1) = (λ − μ)1.

The spectrum is a closed set of .R, given that its complement, the resolvent, has been
shown to be open. Moreover, it is easy to prove that

    σ(T) ⊂ [−||T||, ||T||], (5.10)

which, by the way, proves the last statement of the proposition. The inclusion (5.10) is a direct consequence of the exercise mentioned above: if |λ| > ||T||, then ||(1/λ)T|| < 1, hence 1 − (1/λ)T is an isomorphism, and so is T − λ1, i.e. λ ∈ ρ(T). ⨆
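In finite dimensions this inclusion is easy to observe numerically: the operator norm of a matrix on Euclidean space is its largest singular value, and every eigenvalue lies within that bound. A minimal sketch, assuming numpy is available (the matrix and its size are arbitrary choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))

# Operator norm of A on Euclidean space: the largest singular value.
op_norm = np.linalg.norm(A, 2)

# Spectral radius: the largest modulus among the eigenvalues.
spec_radius = max(abs(np.linalg.eigvals(A)))

# nu(T) <= ||T||, as in item 4 of the proposition.
assert spec_radius <= op_norm + 1e-12
```

For non-normal matrices the inequality is typically strict; for self-adjoint operators equality holds (see Remark 5.2 below in the self-adjoint section).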


5.6 Self-Adjoint Operators

Since the dual space H' of a Hilbert space H can be identified canonically with itself, H' = H, given an operator T ∈ L(H), its adjoint operator T' can also be regarded as a member of L(H), and is determined through the condition

    <Tx, y> = <x, T'y>,    x, y ∈ H.

Definition 5.6 Let .T ∈ L(H) for a Hilbert space .H. .T is self-adjoint if .T' = T, or,
equivalently, if

    <Tx, y> = <x, Ty>,    x, y ∈ H.

In addition, a self-adjoint operator .T is positive if

. <Tx, x> ≥ 0, x ∈ H.

Theorem 5.5 For a self-adjoint operator .T ∈ L(H),

    σ(T) ⊂ [m, M],    m = inf_{||x||=1} <Tx, x>,    M = sup_{||x||=1} <Tx, x>.

Moreover, both end-points m and M belong to .σ (T).


It is very easy to argue that .e(T) is a subset of the interval .[m, M]. Indeed, if there
is a non-null vector .x such that

(T − λ1)x = 0,
. Tx = λx,

which can be assumed to have norm one, ||x|| = 1, we would trivially have

    <Tx, x> = λ||x||² = λ,

and hence λ ∈ [m, M]. However, given that e(T) ⊂ σ(T), what this theorem ensures is more than that. But it requires the self-adjointness of the operator T. For
self-adjoint operators .T, it is true that

e(T) ⊂ σ (T) ⊂ [m, M].


.

Proof The main tool for the proof is the Lax-Milgram lemma (Theorem 3.1). Assume that λ < m. By our above remark that λ cannot be an eigenvalue, T − λ1 is injective. Take ε > 0 so that m − λ ≥ ε. In this case the bilinear form

    A(x, y) = <Tx − λx, y>

turns out to be symmetric (because T is self-adjoint) and continuous, while the associated quadratic form

    <Tx − λx, x> ≥ ε||x||²

is coercive. By the Lax-Milgram lemma, for every given element .y ∈ H, there is .x


with

    <Tx − λx, z> = <y, z>

for every .z ∈ H, i.e.

Tx − λx = y,
.

and .T − λ1 is onto. This means that .λ ∈ ρ(T).


If .λ > M, argue in the same way with the bilinear form

    A(x, y) = <λx − Tx, y>.

For the last part of the proof, suppose, seeking a contradiction, that .m ∈ ρ(T), so
that .T − m1 is a bijection and .(T − m1)−1 is continuous. By definition of m, there
is a sequence of unit vectors .xj , .||xj || = 1, such that

    <Txj − mxj, xj> → 0. (5.11)

If we could derive from this convergence that

    Txj − mxj → 0, (5.12)

that would be the contradiction with the continuity of .(T − m1)−1 and the fact that
the .xj ’s are unit vectors.
Concluding (5.12) from (5.11) involves the non-negativity of the bilinear,
possibly degenerate, form

    A(x, y) = <Tx − mx, y>.

This is nothing but the classical Cauchy-Schwarz inequality, which does not require the strict positivity of the underlying bilinear form to be valid. We include it next for the sake of completeness and clarity. ⨆


Lemma 5.1 Let A(u, v) be a symmetric, bilinear, non-negative form in a Hilbert space H. Then

    A(u, v)² ≤ A(u, u)A(v, v),    u, v ∈ H.

Proof The proof is elementary. For an arbitrary pair u, v ∈ H, consider the quadratic, non-negative polynomial

    P(t) = A(u + tv, u + tv) = A(v, v)t² + 2A(u, v)t + A(u, u),

and conclude by demanding the non-positivity of its discriminant. ⨆
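The point of the lemma is that no strict positivity is needed. This can be probed numerically with a degenerate (rank-deficient) Gram matrix, which defines a symmetric, non-negative, but non-definite form. A sketch assuming numpy (the sizes and random samples are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
C = rng.standard_normal((3, 5))
G = C.T @ C            # 5x5 positive semi-definite, rank at most 3

# A(u, v) = u.G.v is symmetric, bilinear, non-negative, and degenerate.
for _ in range(1000):
    u = rng.standard_normal(5)
    v = rng.standard_normal(5)
    lhs = (u @ G @ v) ** 2
    rhs = (u @ G @ u) * (v @ G @ v)
    assert lhs <= rhs * (1 + 1e-9) + 1e-9   # A(u,v)^2 <= A(u,u) A(v,v)
```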



If we use this lemma for the usual bilinear form

    A(x, y) = <Tx − mx, y>,

taking u = xj and v = Txj − mxj, and bearing in mind (5.11), we see that

    ||Txj − mxj||⁴ = A(xj, Txj − mxj)² ≤ A(xj, xj)A(Txj − mxj, Txj − mxj)
                  ≤ <Txj − mxj, xj> ||T − m1|| ||Txj − mxj||²,

so that (5.12) follows, and the proof of this item is finished. The endpoint M is treated in the same way.
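In finite dimensions, the theorem reduces to the familiar Rayleigh-quotient picture for symmetric matrices: the values <Tx, x> over unit vectors fill [m, M], and both endpoints are attained at eigenvectors. A quick numerical sketch, assuming numpy (the matrix and sample count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
T = (B + B.T) / 2                  # a self-adjoint operator on R^5

eigs = np.linalg.eigvalsh(T)       # real eigenvalues, ascending order
m, M = eigs[0], eigs[-1]           # extreme values of <Tx, x> on the sphere

# Sample the quadratic form on random unit vectors: always inside [m, M].
for _ in range(1000):
    x = rng.standard_normal(5)
    x /= np.linalg.norm(x)
    q = x @ T @ x
    assert m - 1e-10 <= q <= M + 1e-10
```

Here m and M are themselves eigenvalues, mirroring the last statement of Theorem 5.5.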
Corollary 5.4 The only self-adjoint operator with a spectrum reduced to .{0} is the
trivial operator .T ≡ 0.
Proof By Theorem 5.5, if σ(T) = {0}, we can conclude that the two numbers m = M = 0 vanish. Hence

    <Tz, z> = 0 for every z ∈ H.

In particular, for arbitrary .x, .y,

    0 = <T(x + y), x + y>
      = <Tx, x> + <Tx, y> + <Ty, x> + <Ty, y>
      = 2<Tx, y>,

and .T ≡ 0. ⨆

Remark 5.2 In the context of the two previous results, it can be shown that for a
self-adjoint operator .T, it is true that

    ||T|| = max{|m|, |M|}.

5.7 The Fourier Transform

The importance of the Fourier transform in Analysis can hardly be overstated. In this section, we look at it from the viewpoint of the theory of continuous operators. This is but a timid introduction to this fundamental integral transform.
Definition 5.7 For a function f(x) ∈ L¹(R^N; C), we define

    Ff(y) = f̂(y) : R^N → C

through

    f̂(y) = ∫_{R^N} f(x) e^{−2πi x·y} dx,    y ∈ R^N.

Here i is, of course, the imaginary unit. Recall that the inner product in L²(R^N; C) is given by

    <f, g> = ∫_{R^N} f(x) ḡ(x) dx,

where the bar represents complex conjugation.


We immediately notice that for every individual y ∈ R^N, we have

    |f̂(y)| ≤ ∫_{R^N} |f(x)| dx,

and so

    ||f̂||_{L∞(R^N;C)} ≤ ||f||_{L¹(R^N;C)}.

However, the full power of the Fourier transform is shown when it is understood as
an isometry in .L2 (RN ; C). To this end, we state several basic tools, whose proof is
deferred until we see how to apply them to our purposes, and are sure they deserve
the effort.
Lemma 5.2 (Riemann-Lebesgue) If f ∈ L¹(R^N; C), then

    lim_{|y|→∞} ∫_{R^N} f(x) e^{−2πi x·y} dx = 0.

Lemma 5.3 (The Inverse Transform) If f, f̂ ∈ L¹(R^N; C), and

    lim_{r→∞} f(x/r) = f(0),

then

    f(0) = ∫_{R^N} f̂(y) dy.

In general, for almost every individual x ∈ R^N,

    f(x) = ∫_{R^N × R^N} f(z) e^{2πi(x−z)·y} dz dy.

This lemma permits the following coherent definition.


Definition 5.8 For a function f ∈ L¹(R^N; C), we define

    F̄(f)(y) ≡ f̌(y) = ∫_{R^N} f(x) e^{2πi x·y} dx.

The previous lemma can be interpreted in the sense that

    F̄(f) = F⁻¹(f),

when both transforms are appropriately interpreted in L²(R^N; C).


Proposition 5.5 (Plancherel's Theorem in L¹ ∩ L²) If f, g, f̂, ĝ belong to L¹(R^N; C) ∩ L²(R^N; C), then

    <f̂, ĝ> = <f, g>.

In particular, ||f̂||₂ = ||f||₂.


Define, for f ∈ L²(R^N; C),

    Fj f(y) = ∫_{[−j,j]^N} f(x) e^{−2πi x·y} dx = F(f χj),

where .χj stands for the characteristic function of the box .[−j, j ]N . This definition
is legitimate because

    f χj ∈ L¹(R^N; C) ∩ L²(R^N; C).

Moreover, we claim that

    F(f χj) ∈ L¹(R^N; C).

To show this, we use the Riemann-Lebesgue lemma for .F(f χj )(y) and all of its
partial derivatives of any order. Indeed, if j is fixed, since .f χj has compact support,

so does the product of it with every complex polynomial .P (x), and hence

f χj P ∈ L1 (RN ; C).
.

In this case, we can differentiate .F(f χj )(y) with respect to .y any number of times.
For instance,

    ∂F(f χj)/∂yk (y) = ∫_{R^N} f(x) χj(x) (−2πi xk) e^{−2πi x·y} dx,    x = (xk)k, y = (yk)k.

This implies that .F(f χj )(y) is .C∞ , and, thanks to Lemma 5.2 applied to .f χj P ,
the limit of all partial derivatives of .F(f χj )(y) vanishes as .|y| → ∞. This, in
particular, means that

    F(f χj)(y) ∈ L¹(R^N; C) ∩ L²(R^N; C).

Proposition 5.5 permits us to ensure that

    ||F(f χj)||₂ = ||f χj||₂ ≤ ||f||₂ (5.13)

for all j . This fact means that operators .Fj are well-defined from .L2 (RN ; C) to
itself.
We are now ready to show the main fact of the theory of the Fourier transform in
.L (R ; C). We will apply Corollaries 5.1 and 5.2.
2 N

Proposition 5.6 The limit

    F = lim_{j→∞} Fj in L²(R^N; C)

defines an isometry in this same Hilbert space

    ||Ff|| = ||f||,    f ∈ L²(R^N; C). (5.14)

Identity (5.14) is known universally as Plancherel’s identity.
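A discrete analogue of (5.14) is the unitarity of the normalized discrete Fourier transform, which can be checked directly; this is only an illustration of the identity, not part of the proof. It assumes numpy, and the random vectors are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(3)
f = rng.standard_normal(1024) + 1j * rng.standard_normal(1024)
g = rng.standard_normal(1024) + 1j * rng.standard_normal(1024)

fh = np.fft.fft(f, norm="ortho")   # unitary normalization of the DFT
gh = np.fft.fft(g, norm="ortho")

# <f^, g^> = <f, g>: the discrete Plancherel/Parseval identity.
assert np.allclose(np.vdot(gh, fh), np.vdot(g, f))
# In particular, the 2-norm is preserved.
assert np.isclose(np.linalg.norm(fh), np.linalg.norm(f))
```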


Proof According to Corollary 5.1, we need to check that the limit of .Fj f exists in
L2 (RN ; C) for every individual function f in the same space. Let .j < k be two large
.

positive integers. By our calculations above redone for the difference .f (χj − χk ),
we can conclude that

||Fj f − Fk f || = ||f (χj − χk )|| → 0,


. j, k → ∞,

i.e. .{Fj f } is a Cauchy sequence in .L2 (RN ; C), and, as such, it converges to an
element

Ff ∈ L2 (RN ; C).
.

By Corollary 5.2, .F is a continuous operator in such Hilbert space, and, according


to (5.13),

.||F|| ≤ 1.

More is true because (5.13) implies that

    ||F(f χK)|| = ||f χK||

for every compact subset K ⊂ R^N, and, by density, via the dominated convergence theorem,

    ||Ff|| = ||f||.


It remains, therefore, to prove Lemmas 5.2 and 5.3, and Proposition 5.5. We now examine them.
Proof (Lemma 5.2) This important fact is essentially Exercise 16 in Chap. 2. We
argue in several steps:
1. Suppose first that f is the characteristic function of a box B = Π_k [ak, bk]. Then

    ∫_B e^{−2πi x·y} dx = Π_k ∫_{ak}^{bk} e^{−2πi xk yk} dxk
                        = Π_k [1/(−2πi yk)] e^{−2πi xk yk} |_{ak}^{bk} → 0

as |yk| → ∞. Since at least one of the coordinates yk must tend to infinity, the
claim is correct.
2. By linearity, the conclusion is valid for linear combinations of characteristic
functions of boxes.
3. By density, our claim becomes true for arbitrary functions in the space L¹(R^N; C). The argument goes like this. Take

    f ∈ L¹(R^N; C),

and for arbitrary ε > 0 find a linear combination χ of characteristic functions of boxes with ||f − χ||₁ ≤ ε. In this manner

    ∫_{R^N} f(x) e^{−2πi x·y} dx = ∫_{R^N} (f(x) − χ(x)) e^{−2πi x·y} dx + ∫_{R^N} χ(x) e^{−2πi x·y} dx.

The first term cannot be any greater than ε in absolute value, while the second can be made arbitrarily small by taking |y| sufficiently large, according to our previous step.
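For the characteristic function of [0, 1], the transform is explicit, |f̂(y)| = |sin(πy)|/(π|y|), and the decay asserted by the lemma is visible numerically. A sketch assuming numpy (the sample points, half-integers where |sin(πy)| = 1, are arbitrary choices):

```python
import numpy as np

# f = characteristic function of [0, 1]; |f^(y)| = |sin(pi y)| / (pi |y|),
# which numpy's normalized sinc computes as |sinc(y)|.
def fhat_abs(y):
    return abs(np.sinc(y))

ys = np.array([1.5, 10.5, 100.5, 1000.5])
vals = fhat_abs(ys)

# The transform decays to zero as |y| grows.
assert all(vals[i] > vals[i + 1] for i in range(len(vals) - 1))
assert vals[-1] < 1e-3
```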


Proof (Lemma 5.3) We start by looking at the Fourier transform of the special function

    φN(x) = e^{−π|x|²} = Π_k e^{−π xk²},

where its product structure enables us to focus on the one-dimensional version

    φ1(x) ≡ φ(x) = e^{−π x²}.

Once we have its Fourier transform φ̂(y), then

    φ̂N(y) = Π_k φ̂(yk).

For the one-dimensional case, we write

    φ̂(y) = ∫_R φ(x) e^{−2πi xy} dx.

If we differentiate formally with respect to y, we are led to

    φ̂'(y) = ∫_R φ(x)(−2πi x) e^{−2πi xy} dx = i ∫_R φ(x)(−2π x) e^{−2πi xy} dx.

If we notice that φ'(x) = −2π x φ(x), then

    φ̂'(y) = i ∫_R φ'(x) e^{−2πi xy} dx,

and an integration by parts takes us to (contributions coming from infinity vanish)

    φ̂'(y) = −2πy ∫_R φ(x) e^{−2πi xy} dx = −2πy φ̂(y).

We conclude that φ̂(y) = cφ(y) for some constant c. If we examine this identity at y = 0, we see that

    c = φ̂(0) = ∫_R φ(x) dx = 1.

This last integral is well-known from elementary Calculus courses. Hence

    φ̂(y) = φ(y) = e^{−π y²},    φ̂N(y) = e^{−π |y|²}.

For positive r, if we put

g(x) = φN (x/r),
.

we can easily check that

    ∫_{R^N} f(x/r) φ̂N(x) dx = ∫_{R^N} f(x) ĝ(x) dx

by manipulating the change of scale in the natural manner. It is also elementary to realize, through Fubini's theorem, that the Fourier transform is formally self-adjoint, in the sense that

    ∫_{R^N} f̂(x) g(x) dx = ∫_{R^N} f(x) ĝ(x) dx,

and hence we arrive at

    ∫_{R^N} f(x/r) φ̂N(x) dx = ∫_{R^N} f̂(x) φN(x/r) dx,

for every positive r. If we take r to infinity, the right-hand side converges, because
fˆ ∈ L1 (RN ; C), to the integral of .fˆ, and we obtain our claim.
.

For a.e. x, we can apply the previous result translating the origin to the point x, and so

    f(x) = ∫_{R^N × R^N} f(z + x) e^{−2πi z·y} dz dy.

A natural change of variables in the inner integral leads directly to our formula. ⨆
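The central computation of this proof, that e^{−πx²} is its own Fourier transform, can be confirmed by discretizing the defining integral; the grid and the truncation of the real line below are arbitrary choices, and numpy is assumed:

```python
import numpy as np

# Discretize phi(x) = exp(-pi x^2) and approximate
# phi^(y) = int phi(x) exp(-2 pi i x y) dx by a Riemann sum.
x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]
phi = np.exp(-np.pi * x ** 2)

for y in [0.0, 0.5, 1.0, 2.0]:
    fhat = np.sum(phi * np.exp(-2j * np.pi * x * y)) * dx
    # phi is a fixed point of the transform: phi^(y) = exp(-pi y^2).
    assert abs(fhat - np.exp(-np.pi * y ** 2)) < 1e-8
```

The accuracy is excellent here because the Gaussian decays so fast that both the truncation and the discretization errors are negligible.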

Proof (Proposition 5.5) Plancherel's identity is a direct consequence of the more general fact concerning the behavior of the Fourier transform with respect to the inner product in L²(R^N; C), namely, if

    f, g ∈ L²(R^N; C),

then

    <f̂, ĝ> = <f, g>.

In fact, the left-hand side is

    <f̂, ĝ> = ∫_{R^N} [∫_{R^N} f(x) e^{−2πi x·y} dx] [∫_{R^N} ḡ(z) e^{2πi z·y} dz] dy,

and elementary manipulations lead to

    <f̂, ĝ> = ∫_{R^N} [∫_{R^N × R^N} f(x) ḡ(z) e^{2πi(z−x)·y} dx dz] dy
            = ∫_{R^N} ḡ(z) [∫_{R^N × R^N} f(x) e^{2πi(z−x)·y} dy dx] dz.

We conclude directly by Lemma 5.3. ⨆



In the proof of these fundamental facts we have used some of the basic properties that make the Fourier transform so useful in Applied Analysis: the behavior with respect to translations and changes of scale, the derivative of the transform, the transform of the derivative, etc.
As remarked at the beginning of the section, the importance of the Fourier transform goes well beyond what has been described here.

5.8 Exercises

1. Show that the two definitions of a Baire space in Definition 5.2 are indeed
equivalent.
2. Prove that if

Bρ ⊂ T(B1 )
.

for a linear, continuous map T : E → F between Banach spaces, and for some
ρ > 0, then the image under T of every open subset of E is open in F.
3. Show that if the graph of a linear operator T : E → F is closed in the product
space, then E under the graph norm

||u||G = ||u||E + ||Tu||F ,


.

is a Banach space too.



4. Let E be a Banach space, and T ∈ L(E) such that ||T|| < 1. Argue that 1 − T is an isomorphism, by checking that the series

    S = Σ_{i=0}^∞ T^i

is its inverse.
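A numerical sketch of this exercise for a matrix with ||T|| < 1, assuming numpy (the matrix and the scaling factor are arbitrary choices): the partial sums of the geometric series invert 1 − T.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5))
T = 0.4 * A / np.linalg.norm(A, 2)      # enforce ||T|| = 0.4 < 1

# Partial sums of the geometric series sum_i T^i.
S = np.zeros_like(T)
P = np.eye(5)                           # current power T^i
for _ in range(200):
    S = S + P
    P = P @ T

# S inverts 1 - T (the remainder is of size ||T||^200).
assert np.allclose(S @ (np.eye(5) - T), np.eye(5))
assert np.allclose((np.eye(5) - T) @ S, np.eye(5))
```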
5. Consider the operator

    T : L²(−π, π) → L²(−π, π),    Tf(x) = f(2x),

where functions in L2 (−π, π ) are regarded as extended to all of R by


periodicity.
(a) Check that T is linear, continuous, and compute its norm.
(b) Show that 0 ∈ σ (T) \ e(T).
(c) Calculate its adjoint T' .
6. Let H be a Hilbert space.
(a) If π : H → H is such that

π 2 = π,
. <π x, y> = <x, π y>,

for every x, y ∈ H, prove that π is linear continuous, and has norm 1.


Conclude that K = π(H) is a closed subspace, orthogonal to the kernel of
π , and π is the orthogonal projection onto K.
(b) If π ∈ L(H) is such that π 2 = π , argue that the four following assertions
are equivalent: π is the orthogonal projection onto π(H); π is positive; π
is self-adjoint; and π commutes with π ' .
7. For the operator

T : lp → lp ,
. T(x1 , x2 , . . . ) = (0, x1 , x2 , . . . ),

show that

    σ(T) = {λ : |λ| ≤ 1},    e(T) = ∅.
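A numerical sketch of why e(T) is empty while σ(T) is large, assuming numpy (the truncation lengths and the value of λ are arbitrary choices): for |λ| < 1 the geometric sequence is an eigenvector of the adjoint (left) shift, while finite sections of the right shift itself are nilpotent.

```python
import numpy as np

n = 2000
lam = 0.7                          # any real lam with |lam| < 1

# x = (1, lam, lam^2, ...) lies in every l^p; the LEFT shift (the
# adjoint T' of the right shift) maps it to lam * x.
x = lam ** np.arange(n)
left_shift_x = np.append(x[1:], 0.0)   # truncation tail lam^n is negligible
assert np.allclose(left_shift_x, lam * x)

# The right shift itself has no eigenvalues: it is injective (indeed an
# isometry), and its finite sections are nilpotent.
S = np.diag(np.ones(49), -1)       # 50 x 50 section of the right shift
assert np.all(np.linalg.matrix_power(S, 50) == 0)
```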

8. For the operator

T : lp → lp ,
. T(x1 , x2 , x3 , . . . ) = (x1 , x2 /2, x3 /3, . . . ),

find σ (T) and e(T). For λ ∈ ρ(T), calculate explicitly (T − λ1)−1 .
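A finite section of this diagonal operator suggests the answer, assuming numpy (the size and the value of λ are arbitrary choices): each 1/j is an eigenvalue with coordinate eigenvector, and for λ off the spectrum the resolvent is again diagonal. In the infinite-dimensional case, 0 joins σ(T) as the accumulation point of the eigenvalues 1/j, without being an eigenvalue.

```python
import numpy as np

n = 10
d = 1.0 / np.arange(1, n + 1)      # diagonal entries 1, 1/2, 1/3, ...
T = np.diag(d)

# Each 1/j is an eigenvalue, with coordinate vector e_j as eigenvector.
for j in range(n):
    e = np.zeros(n)
    e[j] = 1.0
    assert np.allclose(T @ e, d[j] * e)

# For lam outside the spectrum, (T - lam 1)^{-1} is the diagonal
# operator with entries 1/(1/j - lam).
lam = -0.5
R = np.diag(1.0 / (d - lam))
assert np.allclose(R @ (T - lam * np.eye(n)), np.eye(n))
```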



9. Suppose

    K(x, y) : [0, 1] × [0, 1] → R,    K(x, y) ≡ 0 if x < y,

is bounded, and continuous when x ≥ y. The integral operator

    Tu(x) = ∫_0^x K(x, y) u(y) dy

is known as the Volterra operator with kernel K.


(a) Show that T is linear and continuous from C[0, 1] to itself, and that the
successive powers Tj also are Volterra operators.
(b) Show, by induction, that if Kj is the kernel of T^j, then

    |Kj(x, y)| ≤ [M^j/(j − 1)!] (x − y)^{j−1}

if M > 0 is an upper bound of |K(x, y)|.


(c) Look for examples of kernels K(x, y) for which 0 ∈ e(T) and 0 ∈
/ e(T).
(d) For the particular example

    K(x, y) = χΔ(x, y),    Δ = {(x, y) ∈ [0, 1]² : x ≥ y},

the indicator function of Δ, calculate explicitly Kj, T^j, and (1 − T)⁻¹. In this way, the equation (1 − T)u = v can be solved for v ∈ C[0, 1].
(e) If v is differentiable, check that the equation (1 − T)u = v is equivalent to a linear differential equation with a suitable initial condition, and that the corresponding solution in the previous item coincides with the one coming from the theory of Differential Equations.
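For item (d), the kernel χΔ gives Tu(x) = ∫₀ˣ u(y) dy, and summing the Neumann series leads to the candidate formula (1 − T)⁻¹v(x) = v(x) + ∫₀ˣ e^{x−y} v(y) dy. This can be checked on a grid, assuming numpy (the grid resolution and the choice v(x) = cos 3x are arbitrary):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 2001)
h = x[1] - x[0]
v = np.cos(3.0 * x)

def T(u):
    # Volterra operator with kernel chi_Delta: (Tu)(x) = int_0^x u(y) dy,
    # computed by a cumulative trapezoid rule.
    return np.concatenate(([0.0], np.cumsum((u[1:] + u[:-1]) / 2) * h))

# Candidate inverse: u(x) = v(x) + e^x * int_0^x e^{-y} v(y) dy.
integrand = np.exp(-x) * v
I = np.concatenate(([0.0], np.cumsum((integrand[1:] + integrand[:-1]) / 2) * h))
u = v + np.exp(x) * I

# Check (1 - T)u = v on the grid, up to discretization error.
assert np.max(np.abs(u - T(u) - v)) < 1e-5
```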
10. Let T ∈ L(E), for a Banach space E, be such that the series

    Σ_{j=0}^∞ T^j(x)

converges in E for every x ∈ E (note the difference with Exercise 4 above).


(a) Show that 1 − T is an isomorphism of E and

    (1 − T)⁻¹(y) = Σ_{j=0}^∞ T^j(y).

(b) For arbitrary x0 ∈ E, consider the sequence

xj +1 = y + Txj .
.

Argue that it converges to (1 − T)−1 (y), regardless of how x0 is selected.


(c) If we put

x = lim xj ,
. xj +1 = y + Txj ,
j →∞

show that, for all j,

    ||x − xj|| ≤ ||(1 − T)⁻¹|| ||T^j|| ||x1 − x0||.

11. In the Banach space E = C[0, 1], endowed with the sup norm, let K be a
subspace of differentiable functions (C1 ). The derivative operator taking each
u ∈ K into its derivative u' ∈ E is continuous if and only if K is finite-
dimensional. In particular, if K ⊂ C1 [0, 1] is closed in E, then it has finite
dimension.
12. For functions

a(x) ∈ L∞ (Ω),
. b(x) ∈ L2 (Ω), Ω ⊂ RN ,

define the operator T ∈ L(L2 (Ω)) by putting

T(f )(x) = a(x)f (x) + b(x).


.

13. The Laplace transform is universally defined through the formula

    L(u)(s) = ∫_0^∞ u(t) e^{−st} dt,    s ∈ (0, ∞).

14. Use the Riesz representation theorem, Proposition 2.14, and the duality relations (5.3), to prove the non-symmetric version of the Lax-Milgram lemma: Let A(u, v) be a continuous, coercive bilinear form over a Hilbert space H. For every U ∈ H, there is a unique u ∈ H such that

    A(u, v) = <U, v>

for every v ∈ H.
15. For E = C[0, 1] under the sup-norm, consider the operator T taking each y(t) ∈
E into the unique solution x(t) of the differential problem

x ' (t) = x(t) + y(t) in [0, 1],


. x(0) = 0.

Check that it is linear, continuous, and find its norm, and its eigenvalues.
Chapter 6
Compact Operators

Compactness of an operator goes one step beyond continuity. Its treatment is fundamental in many areas of Applied Analysis, particularly Differential Equations and Variational Methods. Since compactness is one of the main concepts of Topology when applied to Analysis, compact operators will also play a fundamental role. In some sense, the class of compact operators is the one that allows us to shift our intuition about finite-dimensional spaces to the infinite-dimensional scenario.

6.1 Preliminaries

We start with the basic definition, properties, and examples.


Definition 6.1 A linear, continuous operator T : E → F between two Banach spaces is said to be compact if the image of the unit ball BE of E is relatively compact in F; equivalently, if from every uniformly bounded sequence {uj} in E one can extract a subsequence {ujk} such that {Tujk} converges in F. The set of compact operators from E to F is denoted by K(E, F).
Because uniformly bounded sequences in reflexive Banach spaces admit subse-
quences converging weakly, roughly speaking compact operators are those trans-
forming weak into strong convergence.
Proposition 6.1 For two given Banach spaces .E and .F, .K(E, F) is a closed
subspace of .L(E, F).
Proof The fact that .K(E, F) is a subspace of .L(E, F) is pretty obvious. Let

{Tk } ⊂ K(E, F),


. Tk → T in L(E, F),

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 193
P. Pedregal, Functional Analysis, Sobolev Spaces, and Calculus of Variations,
La Matematica per il 3+2 157, https://doi.org/10.1007/978-3-031-49246-4_6

and {uj} ⊂ E uniformly bounded. Since F is a complete metric space, it is a well-known fact in Topology that it suffices to check that the image T(BE) can be covered by a finite number of balls of arbitrarily small size. Take ε > 0, and k with

    ||Tk − T|| < ε/2.

Since Tk is compact, we can find finitely many vi ∈ F with

    Tk(BE) ⊂ ∪i B(vi, ε/2).

The triangle inequality immediately leads to

    T(BE) ⊂ ∪i B(vi, ε),

and the arbitrariness of ε implies T ∈ K(E, F). ⨆



There are several main initial examples of compact operators.
Example 6.1 A linear operator .T : E → F is of finite rank if the image
subspace .T(E) is finitely generated. Every linear operator of finite rank is compact.
According to the previous proposition, every limit of a sequence of finite-rank operators is also compact. The converse of this statement is known as the approximation problem: to decide whether every compact operator is a limit of a sequence of finite-rank operators. Though the answer is positive when, for instance, F is a Hilbert space, it is a delicate issue, and the answer is negative in general.

Example 6.2 Let J = (x0, x1) be a finite interval. The integration operator

    I : W^{1,p}(J; R^n) → L^p(J; R^n),    Iu(x) = ∫_{x0}^x u'(y) dy,

for p > 1 is compact. We already know this from Chap. 2.


Example 6.3 The prototype of an operation which is compact is integration, as
opposed to differentiation, as we have started to suspect from the previous example.
To clearly show this assertion, let Ω ⊂ R^N be a bounded open set, and let

    K(x, y) : Ω × Ω → R,    |K(x, y) − K(z, y)| ≤ M|x − z|^α k(y),    k(y) ∈ L²(Ω),

for a positive constant M, and exponent α > 0. Define the operator

    T : L²(Ω) → L²(Ω),    Tu(x) = ∫_Ω K(x, y) u(y) dy.

T is compact. This is easy to see because for a bounded sequence .{uj } in .L2 ( Ω ),
.

we can write, for an arbitrary pair of points x and z,

    |Tuj(x) − Tuj(z)| ≤ M̃ |x − z|^α ||k||_{L²(Ω)},

where the constant M̃ incorporates the constant M and the uniform bound for the sequence {uj}. This final inequality shows that the sequence of images {Tuj} is equicontinuous, and hence it admits some convergent subsequence.


Example 6.4 The compactness of an operator may depend on the spaces where it is considered. A convolution operator of the kind

    Tu(x) = ∫_{R^N} ρ(x − y) u(y) dy

with a smooth, compactly supported kernel

    ρ(z) : R^N → R,

is compact when regarded as

    T : L¹(R^N) → L∞(R^N).

It is a particular example of the situation in the preceding example for a kernel

K(x, y) = ρ(x − y).


.

However, it is not compact when considered from L¹(R^N) to itself, because the injection

    L∞(R^N) ∩ L¹(R^N) ⊂ L¹(R^N)

is not compact. If we restrict attention to a subspace L¹(Ω) for a bounded subset Ω ⊂ R^N, then it is compact though.

Example 6.5 The kind of integral operators considered in the last examples can be viewed in a much more general framework, and still retain their compactness. These are called Hilbert-Schmidt operators. Let

    K(x, y) : Ω × Ω → R,    Ω ⊂ R^N,

with

    K(x, y) ∈ L²(Ω × Ω),    ∫_{Ω×Ω} |K(x, y)|² dx dy < ∞,

and define the corresponding integral operator as before

    Tu(x) = ∫_Ω K(x, y) u(y) dy,    T : L²(Ω) → L²(Ω).

It is easy to check that T is continuous. To show its compactness, we take an orthonormal basis {vj} for L²(Ω), and realize that the set of all products {vi(x)vj(y)} makes an orthonormal basis for L²(Ω × Ω). In this way we can write

    K(x, y) = Σ_{i,j} ki,j vi(x) vj(y),    Σ_{i,j} |ki,j|² < ∞,

and

    Tu(x) = Σ_i (Σ_j ki,j ∫_Ω vj(y) u(y) dy) vi(x).

For a finite k, define

    Tk : L²(Ω) → L²(Ω),    Tk u(x) = Σ_{i=1}^k (Σ_{j=1}^k ki,j ∫_Ω vj(y) u(y) dy) vi(x).

It is clear that .Tk is a finite-rank operator; and, on the other hand, the condition
on the summability of the double series for the coefficients .ki,j (remainders tend to
zero) directly implies that

    ||Tk − T|| → 0,    k → ∞.

Conclude by Proposition 6.1.
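In a discretized setting, the expansion of K in orthonormal products corresponds to the singular value decomposition, and the operator-norm error of the rank-k truncation is exactly the first discarded singular value — a numerical sketch of why ||Tk − T|| → 0, assuming numpy (the kernel and grid below are arbitrary choices):

```python
import numpy as np

# Discretized Hilbert-Schmidt kernel on a grid.
x = np.linspace(0.0, 1.0, 300)
X, Y = np.meshgrid(x, x, indexing="ij")
K = np.exp(-(X - Y) ** 2) * np.sin(3 * X) * np.cos(2 * Y)

# The SVD plays the role of the expansion in products v_i(x) v_j(y).
U, s, Vt = np.linalg.svd(K)

errs = []
for k in [1, 3, 5, 10]:
    Kk = (U[:, :k] * s[:k]) @ Vt[:k]          # rank-k truncation
    err = np.linalg.norm(K - Kk, 2)           # operator-norm error
    assert np.isclose(err, s[k])              # equals the next singular value
    errs.append(err)

# The errors decrease, mirroring ||T_k - T|| -> 0.
assert errs == sorted(errs, reverse=True)
```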


Proposition 5.3 establishes a clear relationship between the kernel and range of an
operator .T and its adjoint .T' . In order to make use of such identities in the context of
compact operators, it is important to know whether the adjoint of a compact operator
is compact as well.
Theorem 6.1 Let .T ∈ L(E, F) for a couple of Banach spaces .E and .F. Then .T ∈
K(E, F) if and only if .T' ∈ K(F' , E' ).
Proof Suppose .T : E → F is compact. Let .{yk } be a sequence in the unit ball .B' of
' '
.F . We would like to show that .{T (yk )} admits some convergent subsequence. Our

main assumption, the compactness of .T, implies that .K = T(B) is compact in .F, if
.B is the unit ball of .E. Each .yk is, in particular, continuous in the compact set .K,

and because ||yk|| ≤ 1 for all k (yk ∈ B'), the whole family {yk} is equicontinuous when restricted to K. By the classical Arzelà-Ascoli theorem, there is a convergent

subsequence (not relabeled), and hence

    sup_{x∈B} |<yj, Tx> − <yk, Tx>| → 0

as k, j → ∞. This is exactly the same as saying

    ||T'yj − T'yk|| → 0

as .k, j → ∞; that is .{T' yk } is convergent in .E' .


Assume now that T' : F' → E' is compact. What we have just proved implies that T'' : E'' → F'' is compact. By our remarks at the end of Sect. 5.4, we know that T(B) = T''(B) if B is the unit ball of E. Moreover, T''(B) has a compact closure in F'', and F ⊂ F'' is a closed subspace. We can then deduce that T(B) has compact closure in F, and T : E → F is compact. ⨆


6.2 The Fredholm Alternative

For a Banach space E, we designate by K(E) the class of compact operators from E into itself. A first fundamental result, with a couple of basic facts about compact operators that are the basis for the Fredholm alternative, follows. Recall that 1 is the identity map.
Proposition 6.2 Let .T ∈ K(E).
1. If .T(E) is a closed subspace, then it has finite dimension.
2. The subspaces .N(T − 1) and .N(T' − 1) always have finite dimension (and it is
the same for both).
Proof If T(E) is closed, it is a Banach space itself, and T : E → T(E) turns out to be onto. According to the open mapping theorem, T is open, and so T(B), the image under T of the unit ball B in E, is a relatively compact neighborhood of zero in the space T(E). In this way, there is a finite set {ui} ⊂ T(E) such that

    T(B) ⊂ ∪i (ui + ½ T(B)) ⊂ M + ½ T(B)

if .M is the finite-dimensional space spanned by .{ui }. By Exercise 2 below, we


conclude that .T(E) has finite dimension.
The subspace .N(T−1) is closed, and the compact restriction of .T to this subspace
is the identity. Thus the image subspace .T(N) = N is closed. By the previous item,
such subspace must be finite dimensional. The same applies to the compact operator
'
.T .




We are ready to deal with one of the most remarkable facts about compact
operators.
Theorem 6.2 (Fredholm Alternative) Let .T ∈ K(E), and .λ ∈ R \ {0}. Then

.R(T − λ1) = N(T' − λ1)⊥ .

The usual, and useful, interpretation of this statement refers to the solvability of the
equation

.Tx − λx = y, y ∈ E, λ /= 0, (6.1)

in two “alternatives”:
1. either there is a unique solution x of (6.1) for every y ∈ E, in case N(T' − λ1) is trivial; or else
2. Equation (6.1) for y = 0 has a finite number of independent solutions, and the non-homogeneous equation (for y /= 0) admits solutions if and only if

    y ∈ N(T' − λ1)⊥ :    <y, x'> = 0 whenever T'x' − λx' = 0.
Proof By Proposition 5.3, it suffices to show that .R(T − λ1) is closed. Since .T is
compact if and only if .(1/λ)T is compact for non-zero .λ, we can, without loss of
generality, concentrate in showing that .R(T − 1) is closed.
Assume then that

Txj − xj → X,
. X ∈ E. (6.2)

We are supposed to show the existence of some .x ∈ E with

Tx − x = X.
.

Note that for arbitrary

zj ∈ N(T − 1),
.

we can replace .xj by .xj + zj without changing the limit .X

. T(xj + zj ) − (xj + zj ) → X. (6.3)

In addition, by Proposition 6.2, the subspace .N(T − 1) is finite-dimensional.


• We claim that .zj in the kernel of .T − 1 can be chosen so that .xj + zj remains a
bounded sequence. To this aim, consider the function

||xj + ·|| : N(T − 1) → R+ .


.

This is a convex, coercive function defined over a finite dimensional space. It


therefore attains its global minimum at some .zj ∈ N(T − 1) (possibly non-
unique if the norm is not strictly convex). Suppose the sequence of numbers
.{αj = ||xj + zj ||} is not bounded. In this case,

    (1/αj) T(xj + zj) − (1/αj)(xj + zj) → 0. (6.4)

But if, after extracting a subsequence not relabeled, due again to the compactness
of .T,

    (1/αj) T(xj + zj) → z,

then from (6.4),

    (1/αj)(xj + zj) → z

as well, so that .Tz = z, and .z ∈ N(T − 1). But because .N(T − 1) is a subspace
and .zj belongs to it, on the one hand,

    min_{u∈N(T−1)} ||(1/αj)(xj + zj) + u|| = (1/αj) min_{u∈N(T−1)} ||xj + u|| = 1, (6.5)

because this last minimum was attained at zj; but on the other, by taking u = −z in the minimum on the left-hand side, we see that such a minimum must be smaller than

||(1/αj )(xj + zj ) − z||


.

which tends to zero, a contradiction with (6.5). Hence, .{xj + zj } is bounded.


• By the compactness of .T, for a subsequence which we do not care to relabel,

T(xj + zj ) → Y,
. some Y ∈ E.

By (6.3),

xj + zj → Y − X,
.

and, then,

T(xj + zj ) → T(Y − X).


.

Putting these last two convergences together, and bearing in mind (6.2),

T(Y − X) − (Y − X) = X,
.

i.e. .X ∈ R(T − 1), as desired.




Remark 6.1 The fact that the dimension is the same for the two finite-dimensional spaces N(T − 1) and N(T' − 1), when T is compact, can be shown in an elementary way by using Theorem 6.2 (and also Proposition 5.3, and the fact that finite-dimensional subspaces always admit a complement).

6.3 Spectral Analysis

Recall that (Definition 5.5) a real number .λ is an eigenvalue of an operator .T ∈


L(E), and we write .λ ∈ e(T), if the subspace .N(T − λ1) is not the trivial subspace.
The resolvent .ρ(T) is the set of numbers .λ for which .T − λ1 is a bijection. Finally,
the spectrum .σ (T) is the complement of the resolvent.
Before we discuss the main result in this section, we prove a very helpful tool.
The distance function to a set U in a Banach space E is taken to be

    d(x, U) = inf_{y∈U} ||x − y||,    x ∈ E.

Lemma 6.1 (Riesz) Let .E be a Banach space, and .M, a proper, closed subspace.
For every ε > 0, there is x ∈ E with ||x|| = 1 and such that

    d(x, M) ≥ 1 − ε.

Proof By Corollary 3.3 of the Hahn-Banach theorem, there is .T ∈ E' with

||T || = 1,
. T |M = 0.

For ε > 0, there is some unit x, ||x|| = 1, such that

    ||T|| − ε ≤ |<T, x>|.

Then, for every y ∈ M,

    1 − ε = ||T|| − ε ≤ |<T, x>| = |<T, x − y>| ≤ ||T|| ||x − y|| = ||x − y||.

This is the conclusion of the lemma because .y ∈ M is arbitrary. ⨆




This lemma will be an indispensable tool for yet an intermediate step for our main
theorem which is a revision of some spectral concepts for compact operators. It very
clearly expresses the remarkable and surprising consequences of compactness.
Proposition 6.3 Let T ∈ K(E). N(T − 1) is the trivial subspace if and only if R(T − 1) is the full space, i.e. T − 1 is injective if and only if it is surjective.

Proof Suppose first that .N(T − 1) is trivial but .R(T − 1) is not the full .E. By
Theorem 6.2, the proper subspace .M ≡ R(T − 1) is closed, and so .M is a Banach
space on its own. Moreover, it is evident that .T(M) ⊂ M, but .T(M) cannot fill up
all of .M, precisely because .N(T − 1) is trivial and .R(T − 1) is not the full .E (this is
exactly as in Linear Algebra in finite dimension). In addition,

. T|M ∈ K(M).

We therefore see that we can define recursively the following sequence of subspaces

.Mj = (T − 1)^j E, j ≥ 0,

in such a way that it is a strictly nested, decreasing sequence of closed subspaces


(for the same reason indicated above). By Lemma 6.1, we can find a sequence .{xj }
in such a way that

.xj ∈ Mj , ||xj || = 1, dist(xj , Mj+1 ) ≥ 1/2. (6.6)
If for .j < k, we look at the difference

.Txj − Txk = (Txj − xj ) − (Txk − xk ) + (xj − xk ),

we realize that

.(Txj − xj ) − (Txk − xk ) − xk ∈ Mj+1 ,

and hence

.Txj − Txk = xj + z, z ∈ Mj+1 .

But (6.6) implies, then, that

.||Txj − Txk || ≥ 1/2
for all .j, k, which is impossible for a compact operator. This contradiction leads us
to conclude that, indeed, .R(T − 1) is the full space.

The converse can be argued through the adjoint .T' , the dual .E' and the relations
in Proposition 5.3. It is left as an exercise. ⨆

We are now ready for our main result concerning important spectral facts for
compact operators.
Theorem 6.3 Let .E be a (non-trivial) Banach space, and .T ∈ K(E).
1. .0 ∈ σ (T);
2. every non-vanishing .λ in .σ (T) is an eigenvalue of .T:

.σ (T) \ {0} ⊂ e(T), or σ (T) = e(T) ∪ {0};

3. for every .r > 0, the set

Er = {λ ∈ σ (T) : |λ| ≥ r}
.

is empty or finite;
4. the spectrum .σ (T) is a non-empty, countable, compact set of numbers with 0 as
the only possible accumulation point.
Proof If .0 ∈ ρ(T), then .T would be a bijection, which is impossible for a compact
operator on an infinite-dimensional Banach space.
Suppose a non-vanishing .λ is not an eigenvalue, so that .T − λ1 is injective. By
Proposition 6.3, it is also surjective, and so .λ ∈ ρ(T).
For the third item, we follow a proof similar to that of Proposition 6.3. Suppose,
for some fixed .r > 0, that the set .Er is infinite: .{λi } ⊂ Er . By the previous item,
each .λi is an eigenvalue of .T with some eigenvector .ei in such a way that the set .{ei }
is a linearly independent set. Put .Mn for the finite-dimensional subspace spanned by
.{ei }i=1,2,...,n so that

M1 ⊆ M2 ⊆ · · · ⊆ Mn ⊆ . . .
.

with proper inclusions. By Lemma 6.1, we can find a sequence .{xi } with

.||xi || = 1, xi ∈ Mi , ||xi − y|| ≥ 1/2
for every .y ∈ Mi−1 . We claim, then, that the sequence .{Txi } cannot possess a
convergent subsequence, which is impossible due to the compactness of .T. Indeed,
we can write for .j < k,

.||Txj − Txk || = ||Txj − (T − λk 1)xk − λk xk ||.

Because .(T − λk 1)xk cannot have a non-vanishing component along .ek , precisely
because .ek is an eigenvector associated with the eigenvalue .λk , we see that (.j < k)

.Txj − (T − λk 1)xk ∈ Mk−1

and so does this same vector divided by .λk . In this way

.||(1/λk )(Txj − (T − λk 1)xk ) − xk || ≥ 1/2,
or
.||Txj − Txk || = ||Txj − (T − λk 1)xk − λk xk || ≥ |λk |/2 ≥ r/2.
This is impossible for a compact operator .T, and this contradiction implies that each
.Er must be finite (or empty).

The last item is a direct consequence of the previous ones. ⨆


6.4 Spectral Decomposition of Compact, Self-Adjoint Operators

We would like to explore a main fact that follows from putting together two
fundamental properties, self-adjointness and compactness, in a given operator
.T from a Hilbert space into itself. The most fundamental consequence we would
like to highlight is that such operators are always diagonalizable.


Theorem 6.4 Let .T ∈ K(H) be a compact, self-adjoint operator on a separable
Hilbert space .H. There exists an orthonormal basis of .H made up of eigenvectors
of .T.

Proof By Theorem 6.3, there is a sequence of non-zero eigenvalues .λj converging
to 0. By Proposition 6.2 each subspace .N(T − λj 1) is of finite dimension. Put

.H0 = N(T), Hj = N(T − λj 1), j ≥ 1.

We will show that .H is the closure of the union of all the .Hj ’s, and they are mutually
orthogonal. Indeed, if .x ∈ Hj and .y ∈ Hk with .j /= k, then

.(λj − λk )<x, y> = <Tx, y> − <x, Ty> = 0

because .T is self-adjoint, and

.Tx = λj x, Ty = λk y.

Consequently, because .λj /= λk , we conclude that .x and .y are orthogonal.


Let .M be the subspace spanned by the union of the .Hj , .j ≥ 0. We claim that
the closure of .M is the full .H. Suppose, seeking a contradiction, that it is not so.
Let .M⊥ be the orthogonal complement to the subspace .M. Because .T(M) ⊂ M, if
.y ∈ M and .x ∈ M⊥ , then

.<Ty, x> = <y, Tx> = 0.

The arbitrariness of .x and .y implies that .T(M⊥ ) ⊂ M⊥ , as well. Thus .M⊥ is a
Hilbert space on its own, and . T|M⊥ is a compact, self-adjoint operator. But since we
have already recorded all of the eigenvalues of .T in the initial sequence .{λj }, including
.λ0 = 0, the restriction of .T to .M⊥ cannot have eigenvectors. According to the last
item of Theorem 6.3, the only possibility left is for .M⊥ to be trivial. Conclude then
that the closure of .M is .H by Corollary 3.3.
If we organize over each .Hj orthonormal bases, the union of them all will be an
orthonormal basis for the full space .H, according to our proof here. The existence of
an orthonormal basis for .H0 is guaranteed by Proposition 2.12 in case it is an infinite-
dimensional subspace, while the .Hj ’s, for .j ≥ 1, are finite dimensional. ⨆

As in the finite-dimensional setting, this fundamental result informs us that
there is a special basis for .H in which the compact, self-adjoint operator .T admits
an especially convenient representation: if

.x = Σj xj ej , Tej = λj ej ,

i.e. .{ej } is an orthonormal basis of .H made up of eigenvectors, then

.Tx = Σj λj xj ej .
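In finite dimensions this diagonal representation can be verified directly. The following sketch (an illustration not taken from the text, using NumPy) checks it for a random symmetric matrix playing the role of the self-adjoint operator .T.

```python
import numpy as np

rng = np.random.default_rng(0)

# A symmetric matrix plays the role of a self-adjoint operator T.
A = rng.standard_normal((5, 5))
T = (A + A.T) / 2

# Orthonormal eigenbasis: T e_j = lambda_j e_j.
lams, E = np.linalg.eigh(T)      # columns of E are the eigenvectors e_j

x = rng.standard_normal(5)
xj = E.T @ x                     # coordinates x_j = <x, e_j>

# T x = sum_j lambda_j x_j e_j, computed purely from the eigenpairs.
Tx_spectral = E @ (lams * xj)

assert np.allclose(Tx_spectral, T @ x)
```

The identity `E @ (lams * xj)` is exactly the sum .Σj λj xj ej written in matrix form.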

Example 6.6 Trigonometric bases. This is related to Example 2.14. Take .H to be
the Hilbert space .L2 (−π, π ), and define the operator .T : H → H,

.Tu(x) = ∫_{−π}^{x} (y − x)u(y) dy − ((x + π)/(2π)) ∫_{−π}^{π} (y − π )u(y) dy.

It is very easy to check that .T is well-defined.



Moreover, it is compact because .T is essentially an integral operator (check the
examples in Sect. 6.1). It is very easy to realize that, in fact, if we put .U (x) = Tu(x)
then

.− U '' (x) = u(x), U (−π ) = U (π ) = 0,

i.e. U is the unique minimizer of the variational problem

Minimize in .U (x) ∈ H01 (−π, π ): ∫_{−π}^{π} [ (1/2) U ' (x)^2 − u(x)U (x) ] dx.

However, the image of .T is contained in the subspace .L20 (−π, π ) of functions with
vanishing average

.∫_{−π}^{π} f (x) dx = 0.

This is very easily checked because this integral condition holds for functions
.U ∈ H01 (−π, π ). Restricted to this subspace .H = L20 (−π, π ), .T is also compact. .T
is self-adjoint too because if for .v(x) ∈ H, we put .V (x) = Tv(x), then


.<Tu, v> = ∫_{−π}^{π} U (x)v(x) dx = − ∫_{−π}^{π} U (x)V '' (x) dx

and two successive integrations by parts yield


.<Tu, v> = − ∫_{−π}^{π} U '' (x)V (x) dx = <u, Tv>.

According to Theorem 6.4, the full collection of, suitably normalized, eigenfunc-
tions for .T furnishes an orthonormal basis for .L2 (−π, π ). Such eigenfunctions, and
their respective eigenvalues, are the possible non-trivial solutions of the problem

. − U '' (x) = λU (x), U (−π ) = U (π ) = 0.

It is an elementary exercise in Differential Equations to find that the only non-trivial


solutions of this family of problems correspond to

.λ = j^2 /4, j = 1, 2, . . . ,

and

.Uj (x) = (1/√π ) sin( (1/2) jx + (1/2) jπ ).

This family of trigonometric functions is an orthonormal basis for .L2 (−π, π ). Note
how different it is from the one in Example 2.14, in spite of both being orthonormal
bases of the same space.
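The orthonormality of this trigonometric family, and the eigenfunction equation itself, can be checked numerically; the following sketch (illustrative, not from the text) does so on a fine grid.

```python
import numpy as np

# U_j(x) = sin(j*(x+pi)/2)/sqrt(pi), j = 1..5, sampled on (-pi, pi).
n = 20000
x = np.linspace(-np.pi, np.pi, n + 1)
h = x[1] - x[0]
U = np.array([np.sin(j * (x + np.pi) / 2) / np.sqrt(np.pi) for j in range(1, 6)])

# Gram matrix of L2(-pi, pi) inner products; the U_j vanish at the endpoints,
# so the trapezoid rule reduces to h * (U @ U.T). It should be close to I.
G = h * (U @ U.T)
assert np.allclose(G, np.eye(5), atol=1e-5)

# Each U_j solves -U'' = (j^2/4) U; check with a second difference for j = 3.
j = 3
Upp = (U[j-1, 2:] - 2 * U[j-1, 1:-1] + U[j-1, :-2]) / h**2
assert np.allclose(-Upp, (j**2 / 4) * U[j-1, 1:-1], atol=1e-4)
```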
Example 6.7 A typical Hilbert-Schmidt operator. In the context of Example 6.5,

.K(x, y) : Ω × Ω → R, Ω ⊂ RN ,

with

.K(x, y) ∈ L2 (Ω × Ω), ∫_{Ω×Ω} |K(x, y)|^2 dx dy < ∞,

define the corresponding integral operator

.Tu(x) = ∫_Ω K(x, y)u(y) dy, T : L2 (Ω) → L2 (Ω).

If the kernel K is symmetric, then the operator .T is compact and self-adjoint. As a
consequence of Theorem 6.4, there is a basis of .L2 (Ω) made up of eigenfunctions
of .T. Eigenfunctions are solutions of the integral equation

.∫_Ω K(x, y)u(y) dy = λu(x).

Of particular importance is the case of a convolution kernel

.K(x, y) = ρ(x − y).

It is hard to solve these integral equations even for simple situations.
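Although closed-form solutions are rare, such integral eigenvalue problems are easy to attack numerically: discretizing the kernel on a grid turns the integral equation into a symmetric matrix eigenvalue problem. The sketch below (an illustration with a Gaussian bump ρ chosen for convenience, not taken from the text) also shows the rapid accumulation of eigenvalues at 0 predicted by compactness.

```python
import numpy as np

# Discretize the convolution kernel K(x, y) = rho(x - y) on (0, 1).
n = 400
x = (np.arange(n) + 0.5) / n            # midpoint grid

def rho(t):
    return np.exp(-t**2)                # an even, smooth bump (illustrative)

K = rho(x[:, None] - x[None, :])        # symmetric since rho is even

# The integral operator is approximated by the symmetric matrix (1/n) K,
# whose eigenvalues approximate the lambda's of the integral equation.
lam, _ = np.linalg.eigh(K / n)
lam = lam[np.argsort(-np.abs(lam))]

# Compactness in action: the eigenvalues decay rapidly towards 0.
assert abs(lam[9]) < 1e-4 * abs(lam[0])
```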


Example 6.8 Definition of a function of an operator. If .T is a compact, self-adjoint
operator from a Hilbert space into itself, we already know there is a special orthonormal
basis .{ej } very well adapted to .T in such a way that if

.x = Σj xj ej , Tej = λj ej , (6.7)

then

.Tx = Σj λj xj ej .

Given a real continuous function

. f (λ) : R → R, f (0) = 0,

we can define the operator .f (T), which is a linear, compact, self-adjoint operator
uniquely determined by putting

.f (T)x = Σj f (λj )xj ej

under (6.7). See an explicit example below.
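In finite dimensions this construction is the usual functional calculus for symmetric matrices. The sketch below (illustrative, not from the text) takes .f (λ) = √λ, which satisfies .f (0) = 0 on the spectrum of a positive semidefinite matrix, and checks the defining property of the square root.

```python
import numpy as np

rng = np.random.default_rng(1)

# A symmetric positive semidefinite T, and f(lambda) = sqrt(lambda), f(0) = 0.
B = rng.standard_normal((4, 4))
T = B @ B.T                      # symmetric, eigenvalues >= 0

lam, E = np.linalg.eigh(T)
lam = np.clip(lam, 0.0, None)    # guard against tiny negative round-off

# f(T) x = sum_j f(lambda_j) x_j e_j  becomes  E diag(f(lam)) E^T.
fT = E @ np.diag(np.sqrt(lam)) @ E.T

# The defining property of the square root: f(T) f(T) = T.
assert np.allclose(fT @ fT, T)
```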

6.5 Exercises

1. Let E, F, G, H be four Banach spaces. Suppose that

.S ∈ L(E, F), T ∈ K(F, G), U ∈ L(G, H).

Argue that the composition U ◦ T ◦ S ∈ K(E, H).


2. Show that if U and M are a bounded set and a subspace, respectively, in a
Banach space E, such that

.U ⊂ M + (1/2) U,

then U ⊂ M.
3. Let T ∈ L(H) with H, a Hilbert space. Show that σ (T) = σ (T' ), and that

.N(T) = R(T' )⊥ , N(T' ) = R(T)⊥ .

4. (a) Define the logarithm of a symmetric, positive-definite matrix A. Calculate


log A for
     ⎛  3   1   1 ⎞
.A = ⎜ −4  −1  −2 ⎟ .
     ⎝  2   1   2 ⎠

(b) Argue that for the operator

.T : L2 (0, 1) → L2 (0, 1), U = Tu,
.−U '' = u in (0, 1), U (0) = U (1) = 0,

its square root is well-defined.



5. Argue that there is a basis of L2 (0, 1) composed of solutions of the problems

.− u''j (x) + uj (x) = λj uj (x) in (0, 1), u'j (0) = u'j (1) = 0,

for a collection of numbers λj tending to ∞.


6. Let E be an infinite-dimensional, separable Hilbert space with

.E = ∪i Ei , Ei ⊂ Ei+1 ,

and each Ei of finite dimension. Let πi be the orthogonal projection onto Ei ,


and πi⊥ , the orthogonal projection onto its orthogonal complement. Argue that

.πi⊥ x → 0

for every x, and yet it is not true that ||πi⊥ || → 0. Does this fact contradict the
Banach-Steinhaus principle?
7. Let E be a separable Hilbert space with

.E = ∪i Ei , Ei ⊂ Ei+1 ,

and each Ei of finite dimension. Let πi be the orthogonal projection onto Ei .


Show that an operator T ∈ L(E) is compact if and only if

.||πi T − T|| → 0, i → ∞.

8. Suppose T ∈ L(H), H a Hilbert space, can be written



.T(x) = Σ_{i=1}^{∞} ai πi (x)

where πi are the orthogonal projections onto finite-dimensional, mutually
orthogonal subspaces Ki , and ai are distinct, non-vanishing numbers converging to
0. Show that T is compact, commutes with its adjoint, the ai ’s are its non-null
eigenvalues, and

.Ki = ker(T − ai 1).

9. Let T : l2 → l2 be defined by

.T(x1 , x2 , x3 , . . . ) = (2x1 , x3 , x2 , 0, 0, . . . ).

Prove that T is compact, and self-adjoint. Calculate σ (T), the corresponding


eigen-spaces, the orthogonal projections, and the spectral decomposition of T.

10. For the kernel



           ⎧ (1 − x)y, 0 ≤ y ≤ x ≤ 1,
.K(x, y) = ⎨
           ⎩ (1 − y)x, 0 ≤ x ≤ y ≤ 1,

consider the integral operator T : L2 (0, 1) → L2 (0, 1) associated with K.


(a) If λ /= 0 is an eigenvalue and u, a corresponding eigen-function, argue that
u is C∞ and

.λu'' + u = 0 in (0, 1), u(0) = u(1) = 0.

(b) Conclude that

.σ (T) = {(j π )^{−2} : j ∈ N} ∪ {0}.

Find the associated eigen-functions. Is 0 an eigenvalue?


(c) Calculate T' , and give conditions on v so that the equation

.(T − λ1)u = v

is solvable for λ = (j π )^{−2} .


11. Investigate whether the operators of Exercises 7 and 8 of Chap. 5 are compact.
12. For a continuous function f (x) : [0, 1] → [0, 1], define the operator

.Tf (u) = u(f (x)), u ∈ C([0, 1]).

Prove that Tf is compact if and only if f is constant.


13. Show that the integral equation
.u(x) − ∫_0^π sin(x + y)u(y) dy = v(x), x ∈ [0, π ],

always has a solution for every v ∈ C([0, π ]).


14. For an orthonormal basis {ej } in a Hilbert space H, let T ∈ L(H) be such that

.Σj ||Tej ||^2 < ∞.

Prove that T is compact. Give an example of a compact operator T in l2 of the


form

.T(x) = (tj xj )j , x = (x1 , x2 , . . . ),

such that

.Σj ||Tej ||^2 = ∞, ej = χ{j } .

15. Let T be a compact operator in a reflexive Banach space E. Argue that the image
of the unit ball B of E under T is closed. For the particular case E = C([−1, 1])
and the Volterra operator

.T(u)(x) = ∫_{−1}^{x} u(y) dy,

show that T(B) is not closed by looking at the sequence

          ⎧ 0,    −1 ≤ x ≤ 0,
.uj (x) = ⎨ j x,   0 ≤ x ≤ 1/j,
          ⎩ 1,    1/j ≤ x ≤ 1.

16. Argue that the injection operator

.ι : W 1,p (J ) → Lp (J ), ι(u) = u

is a compact operator.
17. Use the Riesz lemma (Lemma 6.1) to argue that if E is a normed space such that
the unit ball B(0, 1) is compact, then E is finite-dimensional.
18. Let H = L2 (0, 1), and

. T : H → H, Tf (s) = sf (s).

(a) Check that T is linear, continuous and self-adjoint with ||T|| = 1.


(b) Show that σ (T) = [0, 1], and calculate e(T). Is T compact?
19. The operator T determined by
.f (x) |→ Tf (x) = ∫_0^x f (y) dy − x ∫_0^1 f (y) dy

yields the unique solution u(x) of the problem

.(u' (x) − f (x))' = 0 in (0, 1), u(0) = u(1) = 0.

(a) Show that this is so.


(b) Prove that T : L2 (0, 1) → L2 (0, 1) is compact.
(c) Show that the operator derivative

.f (x) |→ T' f (x) ≡ (Tf (x))'

is not compact.
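Several of these exercises can be explored numerically. For instance, the spectrum claimed in Exercise 10(b) is easy to confirm by discretizing the kernel K on a grid (an illustrative sketch, not part of the exercise):

```python
import numpy as np

# Discretize the kernel of Exercise 10 on a midpoint grid of (0, 1).
n = 400
x = (np.arange(n) + 0.5) / n
X, Y = np.meshgrid(x, x, indexing="ij")
K = np.where(Y <= X, (1 - X) * Y, (1 - Y) * X)   # symmetric Green's kernel

# Eigenvalues of the integral operator are approximated by those of (1/n) K.
lam = np.sort(np.linalg.eigvalsh(K / n))[::-1]

# Compare with the exact spectrum (j*pi)^(-2), j = 1, 2, 3.
exact = 1.0 / (np.arange(1, 4) * np.pi) ** 2
assert np.allclose(lam[:3], exact, atol=1e-3)
```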
Part III
Multidimensional Sobolev Spaces
and Scalar Variational Problems
Chapter 7
Multidimensional Sobolev Spaces

7.1 Overview

We have introduced one-dimensional Sobolev spaces in Chap. 2. The fundamental


concept that is required is that of weak derivative. In the case of higher dimension,
we need to talk about weak partial derivatives. The basic identity that allows us to
introduce such a fundamental concept is again the integration-by-parts formula, which
in the multidimensional framework is a consequence of the classical divergence
theorem and the product rule. Though many concepts are similar to the one-
dimensional situation, technical issues are much more involved.
Again, our initial motivation comes from our interest to deal with variational
problems where one tries to minimize an integral functional of the form

.I (u) = ∫_Ω F (x, u(x), ∇u(x)) dx,

where .Ω ⊂ RN is a certain domain; feasible functions

.u(x) : Ω → R

should, typically, comply with further conditions like having preassigned boundary
values around .∂Ω; and the integrand

.F (x, u, u) : Ω × R × RN → R

determines in a crucial way the nature of the associated functional I .


Our first main task focuses on defining a weak gradient
.∇u(x) = ( ∂u/∂x1 (x), ∂u/∂x2 (x), . . . , ∂u/∂xN (x) )

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 215
P. Pedregal, Functional Analysis, Sobolev Spaces, and Calculus of Variations,
La Matematica per il 3+2 157, https://doi.org/10.1007/978-3-031-49246-4_7

that takes the place of the variable .u ∈ RN in the integrand .F (x, u, u) when
computing the cost .I (u) of every competing function u. Sobolev spaces will then
be the class of all those functions whose weak derivatives, i.e. weak gradient, enjoy
certain integrability conditions. This chapter concentrates on setting the foundations
for such Banach spaces, while the next chapter builds on Sobolev spaces to explore
scalar, multidimensional, variational problems.
One of the main reasons why multidimensional Sobolev spaces are much more
sophisticated than their one-dimensional counterpart is that the class of domains
in .RN where Sobolev spaces can be considered is infinitely more varied than in .R.
On the other hand, there is no genuine multidimensional version of a fundamental
theorem of Calculus as such, but every such result is based one way or another on the
classical fundamental theorem of Calculus, which is essentially one-dimensional.
The study of Sobolev spaces in the multidimensional setting is a quite rich
and fascinating subject. Though in this chapter we focus our attention on those
fundamental facts necessary for their use in minimization problems for standard
variational problems, we will come back to them in the final chapter to explore
some other crucial properties of functions belonging to these spaces to enlarge their
use in variational problems and other important situations.

7.2 Weak Derivatives and Sobolev Spaces

Suppose

.u(x), φ(x) : Ω → R

are two smooth functions, with .φ having compact support within .Ω. By the classical
divergence theorem, we have
.∫_Ω div[φ(x)u(x)ei ] dx = ∫_{∂Ω} φ(x)u(x)ei · n(x) dS(x), (7.1)

where .n(x) stands for the outer, unit normal to .∂Ω, and .{ei }i=1,2,...,N is the canonical
basis of .RN . The right-hand side of (7.1) vanishes because .φ, having compact
support contained in .Ω, vanishes over .∂Ω, while the left-hand side can be written,
through the product rule, as

.∫_Ω [u(x)∇φ(x) · ei + φ(x)∇u(x) · ei ] dx = 0,

i.e.

.∫_Ω [u(x) ∂φ/∂xi (x) + φ(x) ∂u/∂xi (x)] dx = 0,

for every .i = 1, 2, . . . , N . If we regard .φ as a varying test function, this identity,
written in the usual form

.∫_Ω u(x) ∂φ/∂xi (x) dx = − ∫_Ω φ(x) ∂u/∂xi (x) dx

permits us to declare a function .ui (x) as the weak ith-partial derivative of u if

.∫_Ω u(x) ∂φ/∂xi (x) dx = − ∫_Ω φ(x)ui (x) dx

for every smooth, compactly supported .φ. The coherence of such a concept is clear,
for if u is indeed smooth with a gradient (in the usual sense) .∇u(x) then

.∇u(x) = u(x), u = (u1 , u2 , . . . , uN ).

But there are non-differentiable functions u in the classical sense which admit weak
gradients. We write .∇u(x) to designate the vector of weak, partial derivatives .u(x)
of u, and then we recover the formula of integration by parts
⎛ ⎛
. u(x)∇φ(x) dx = − φ(x)∇u(x) dx (7.2)
Ω Ω

which is valid for every smooth function .φ with compact support in .Ω.
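A one-dimensional sanity check of this defining identity can be carried out symbolically; the sketch below (illustrative, not from the text) verifies that .u(x) = |x| on .(−1, 1) has weak derivative .sign(x), using a particular test function vanishing at the boundary.

```python
import sympy as sp

x = sp.symbols('x')

# u(x) = |x| on (-1, 1) has weak derivative sign(x): check the identity
#   int u * phi' dx = - int sign(x) * phi dx
# for a smooth test function phi vanishing at x = -1 and x = 1.
phi = (1 - x**2)**2 * (1 + x)

# integral of u * phi', split at 0 where |x| changes formula
lhs = sp.integrate(-x * sp.diff(phi, x), (x, -1, 0)) \
    + sp.integrate(x * sp.diff(phi, x), (x, 0, 1))

# - integral of sign(x) * phi, split the same way
rhs = -(sp.integrate(-phi, (x, -1, 0)) + sp.integrate(phi, (x, 0, 1)))

assert sp.simplify(lhs - rhs) == 0
```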
Mimicking the one-dimensional situation, we introduce the multidimensional
Sobolev spaces.
Definition 7.1 Let .Ω be an open subset of .RN , and let exponent .p ∈ [1, +∞]
be given. The Sobolev space .W 1,p (Ω) is defined as the collection of functions of
.Lp (Ω) admitting a weak gradient .∇u, in the above sense, which is a vector with
components in .Lp (Ω) as well. In compact form, we write

.W 1,p (Ω) = {u ∈ Lp (Ω) : ∇u ∈ Lp (Ω; RN )}.

The norm in .W 1,p (Ω) is

. ||u||W 1,p (Ω) = ||u||Lp (Ω) + ||∇u||Lp (Ω;RN ) .

Equivalently, we can take, in a more explicit way,

.||u||^p_{W 1,p (Ω)} = ∫_Ω [|u(x)|^p + |∇u(x)|^p ] dx.

As in the one-dimensional situation, the exponent .p = 2 is a very special case. To


highlight it, we use a special notation, which is universally accepted, and put

.H 1 (Ω) ≡ W 1,2 (Ω).

There is an inner product in .H 1 (Ω) given by

.<u, v> = ∫_Ω [u(x)v(x) + ∇u(x) · ∇v(x)] dx,

with associated norm

.||u||^2_{H 1 (Ω)} = ∫_Ω [u^2 (x) + |∇u(x)|^2 ] dx.

Our first statement asserts that these spaces are Banach spaces. In addition, .H 1 (Ω)
is a Hilbert space.
Proposition 7.1 .W 1,p (Ω) is a Banach space for every exponent .p ∈ [1, +∞].
.H 1 (Ω) is a separable Hilbert space.

The proof of this fact is exactly like the one for Proposition 2.3.
Remark 7.1 Most of the basic properties of multi-dimensional Sobolev spaces are
inherited from the corresponding Lebesgue spaces. Features like reflexivity and
separability hold for .W 1,p (Ω) if .p ∈ (1, +∞), while .W 1,1 (Ω) is also separable.
When one works with weak derivatives, duality is not that important. In fact, there is
no standard way to identify the dual of .W 1,p (Ω). It is more important to identify the
dual spaces of certain subspaces of .W 1,p (Ω) as we will see. On the other hand, we
have not paid much attention to the separability property except for Hilbert spaces.
Since even just measurable functions are nearly continuous according to Luzin’s
theorem, when deciding in practice if a given function belongs to a certain Sobolev
space, one would differentiate such a function paying attention to places where such
differentiation is not legitimate, and proceed from there. However, it may not be so
easy to finally decide if a given function belongs to a Sobolev space in the multi-
dimensional scenario because the set of singularities of derivatives may be quite
intricate, and its size, and integrability properties, at various dimensions may be
crucial in the final decision. This may be a quite delicate and specialized issue well
beyond the scope of this text. There are however easier situations in which one can
decide in a more direct way.
Proposition 7.2 Let .u(x) be a function in .L∞ (Ω) that is continuous and differen-
tiable except in a singular subset .ω of .Ω of measure zero that admits a family of
regular subsets .Ωɛ (in the sense that the divergence theorem is valid in .Ω \ Ωɛ ) with

.ω ⊂ Ωɛ , |∂Ωɛ |, |Ωɛ | → 0

as .ɛ ↘ 0, and its gradient .∇u(x) ∈ Lp (Ω) for some .p ∈ [1, ∞]. Then .u ∈
W 1,p (Ω), and its weak gradient is .∇u(x).
Proof Let .φ(x) be a smooth, compactly-supported function in .Ω, and take the
family of subsets .Ωɛ indicated in the statement such that u is continuous and
differentiable in .Ω\Ωɛ and .|∂Ωɛ |, |Ωɛ | ↘ 0. By our hypotheses, the three integrals
.∫_{Ωɛ} φ(x)∇u(x) dx, ∫_{∂Ωɛ} φ(x)u(x)n(x) dS(x), ∫_{Ωɛ} ∇φ(x)u(x) dx,

converge to zero as .ɛ ↘ 0. The first one because it can be bounded from above by

.( ∫_{Ωɛ} |φ(x)|^q dx )^{1/q} ( ∫_{Ωɛ} |∇u(x)|^p dx )^{1/p} ,

and this integral converges to zero if .|∇u|^p is integrable and .|Ωɛ | ↘ 0 (this is
even so if .p = 1). The second and third ones are straightforward if .u ∈ L∞ (Ω) is
continuous.
On the other hand, by the regularity properties assumed on u and on .Ω \ Ωɛ , we
can apply the divergence theorem in .Ω \ Ωɛ , and find
.∫_{Ω\Ωɛ} ∇φ(x)u(x) dx = − ∫_{Ω\Ωɛ} φ(x)∇u(x) dx + ∫_{∂Ωɛ} φ(x)u(x)n(x) dS(x),

where .n(x) is the unit, “inner” normal to the hypersurface .∂Ωɛ .


With these facts, let us examine the following chain of identities:

.∫_Ω ∇φ(x)u(x) dx = lim_{ɛ→0} ( ∫_{Ω\Ωɛ} ∇φ(x)u(x) dx + ∫_{Ωɛ} ∇φ(x)u(x) dx )
                  = lim_{ɛ→0} ∫_{Ω\Ωɛ} ∇φ(x)u(x) dx
                  = lim_{ɛ→0} ( − ∫_{Ω\Ωɛ} φ(x)∇u(x) dx + ∫_{∂Ωɛ} φ(x)u(x)n(x) dS(x) )
                  = − lim_{ɛ→0} ∫_{Ω\Ωɛ} φ(x)∇u(x) dx
                  = − lim_{ɛ→0} ( ∫_{Ω\Ωɛ} φ(x)∇u(x) dx + ∫_{Ωɛ} φ(x)∇u(x) dx )
                  = − ∫_Ω φ(x)∇u(x) dx.
The arbitrariness of .φ in this final identity implies our claim. ⨆



Example 7.1 The family of functions

.ui (x) = xi /|x|, i = 1, 2, . . . , N,

are the components of the mapping

.u(x) = x/|x|.

They are differentiable except at the origin. Off this point, the functions are smooth,
and their gradients are easily calculated as

.∇u(x) = (1/|x|) 1 − (1/|x|^3 ) x ⊗ x, (x ⊗ x)ij = xi xj , x = (xi )i ,

where, as usual, .1 is the identity matrix of the appropriate dimension (.N × N). It is
then elementary to realize that

.|∇u(x)| ≤ C/|x|,

for some positive constant C. We therefore see, according to Proposition 7.2, where
.Ωɛ is the ball .Bɛ of radius .ɛ around the origin, that the Sobolev space to which these
functions belong depends on the integrability of the function

.1/|x|^p .

Lemma 7.1 For .R > 0, the integral

.∫_{BR} |z|^s dz, BR = {x ∈ RN : |x| ≤ R},

is finite whenever .s > −N.


According to this elementary lemma, we conclude that the family of functions in
this example belongs to .W 1,p (Ω) for a bounded domain containing the origin at least
for .1 ≤ p < N; and if the origin does not belong to .Ω, for every p.
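In spherical coordinates, Lemma 7.1 reduces to the radial integral .∫_0^R r^{s+N−1} dr (times the surface measure .N ωN of the unit sphere), which is finite exactly when .s + N − 1 > −1, i.e. .s > −N. A symbolic check of two convergent cases (illustrative, not from the text):

```python
import sympy as sp

r, R = sp.symbols('r R', positive=True)

# Radial reduction of int_{B_R} |z|^s dz for N = 3: int_0^R r^(s+N-1) dr.
N = 3

# A convergent case, s = -2 > -N (this is |x|^(-p) with p = 2 < N = 3):
assert sp.integrate(r**(-2 + N - 1), (r, 0, R)) == R

# Another convergent case with a fractional exponent, s = -5/2 > -3:
assert sp.integrate(r**(sp.Rational(-5, 2) + N - 1), (r, 0, R)) == 2 * sp.sqrt(R)

# At s = -N the radial integrand is 1/r, whose antiderivative log r blows up
# at the origin, so the integral diverges -- in line with Lemma 7.1.
```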

Example 7.2 For a bounded domain .Ω ⊂ R2 with non-empty intersection with the
Y -axis, consider the function
x1
u(x1 , x2 ) =
. .
|x1 |

It is clear that the second partial derivative vanishes; however, the first one has a
jump all along the part J of the Y -axis within .Ω. The point is that this set J cannot
be covered by a sequence like the one in Proposition 7.2. Indeed, it is impossible
for .|∂Ωɛ | to be made arbitrarily small; after all, the length, the 1-dimensional
measure, of J does not vanish. This function u does not belong to any Sobolev
space, but to some more general space whose study goes beyond the goal of this
text.

7.3 Completion of Spaces of Smooth Functions of Several Variables with Respect to Integral Norms

As in Sect. 2.10, we can talk about the vector space .C∞ (Ω) of smooth functions in
an open subset .Ω ⊂ RN . Endowed with the integral norm

.||u||^p = ∫_Ω [ |u(x)|^p + |∇u(x)|^p ] dx, u ∈ C∞ (Ω), p ≥ 1,

the space is not complete. But it can be immediately completed according to


Theorem 2.1. The space so obtained is then complete with respect to the above
norm. In it, smooth functions are dense.
We could legitimately call the resulting space .W 1,p (Ω), and argue that functions
u in this space admit a weak gradient .∇u ∈ Lp (Ω; RN ). What we cannot know in
advance is whether this space is the full set of functions in .Lp (Ω) with such a weak
gradient. This would require to work harder to be convinced that smooth functions
are dense, under this norm, in the set of all functions in .Lp (Ω) with a weak gradient
in .Lp (Ω; RN ), much in the same way as in Sect. 2.29. Under additional assumptions
on .Ω, this density fact can be shown to hold. See Corollary 9.1 below.
From the viewpoint of our interest in variational problems, one can work in the
completion of .C∞ (Ω) with respect to the above norm, and proceed from there.
Nothing essentially changes in the analysis that we will perform in the next chapter.

7.4 Some Important Examples

We describe in this section some important examples of multi-dimensional Sobolev


functions of special relevance.

Suppose we have a measurable kernel

.K(x, y) : Ω × Ω → R

with the fundamental property that

.K(·, y) ∈ W 1,1 (Ω)

for a.e. .y ∈ Ω. In addition, there are symmetric, measurable, non-negative kernels

.K0 (x, y), K1 (x, y)

with

.|K(x, y)| ≤ K0 (x, y), |∇x K(x, y)| ≤ K1 (x, y),

and the measurable functions



.μi (x) = ∫_Ω Ki (x, y) dy, i = 0, 1, (7.3)

turn out to be bounded in .Ω (they belong to .L∞ (Ω)). For a given measurable
function .f (y), define

.u(x) = ∫_Ω K(x, y)f (y) dy, x ∈ Ω, (7.4)

through the corresponding Hilbert-Schmidt type operator.


Lemma 7.2 If .f (y) ∈ Lp (Ω) for some .p ≥ 1, then .u(x) defined through (7.4),
under all of the previous assumptions, belongs to .W 1,p (Ω), and

.∇u(x) = ∫_Ω ∇x K(x, y)f (y) dy. (7.5)

Proof We first check that .u(x) defined through (7.4) belongs to .Lp (Ω). We have

.|u(x)| ≤ ∫_Ω |K(x, y)| |f (y)| dy ≤ ∫_Ω K0 (x, y) |f (y)| dy.

Since .μ0 (x) is finite for a.e. .x ∈ Ω, according to (7.3), the measure

.dνx (y) = (K0 (x, y)/μ0 (x)) dy

is a probability measure. By Jensen’s inequality applied to the function .| · |^p , which
is convex if .p ≥ 1,

.| ∫_Ω |f (y)| dνx (y) |^p ≤ ∫_Ω |f (y)|^p dνx (y) = (1/μ0 (x)) ∫_Ω K0 (x, y) |f (y)|^p dy.

Hence

.|u(x)|^p ≤ μ0 (x)^{p−1} ∫_Ω K0 (x, y) |f (y)|^p dy.

If we integrate in .x ∈ Ω, we arrive at

.||u||^p_{Lp (Ω)} ≤ ||μ0 ||^{p−1}_{L∞ (Ω)} ∫_Ω ∫_Ω K0 (x, y) |f (y)|^p dy dx.

By Fubini’s theorem, and the symmetry of .K0 , we can also write

.||u||^p_{Lp (Ω)} ≤ ||μ0 ||^p_{L∞ (Ω)} ||f ||^p_{Lp (Ω)} .

Concerning derivatives, it is immediate to check formula (7.5) testing it against a


smooth function .φ(x) with compact support in .Ω as in (7.2). In checking this, it is
also crucial to use Fubini’s theorem to interchange the order of integration. Finally
each weak partial derivative in (7.5) belongs to .Lp (Ω), by the same ideas as with
(7.4) replacing the kernel .K0 (x, y) by .K1 (x, y). ⨆

One fundamental particular case follows. Define the newtonian potential .w(x) of
an integrable function .f (y) : Ω → R by the formula

.w(x) = ∫_Ω Θ(x − y)f (y) dy, x ∈ RN , (7.6)

where

                          ⎧ (1/(N (2 − N )ωN )) |x − y|^{2−N} , N > 2,
.Θ(x − y) = Θ(|x − y|) = ⎨
                          ⎩ (1/(2π )) log |x − y|, N = 2.

The positive number .ωN is the measure of the unit ball in .RN , and .N ωN the surface
measure of the unit sphere. Note that w in (7.6) is defined for every .x ∈ RN , though

.f (y) is only defined for .y ∈ Ω. It is true however that we can also rewrite (7.6) in
the form

.w(x) = ∫_{RN} Θ(x − y)χΩ (y)f (y) dy

for the characteristic function .χΩ (y) defined as 1 for points .y ∈ Ω, and vanishing
off .Ω.
According to our general discussion above, we need to check that

.∫_Ω Θ(x − y) dy

is uniformly bounded in .x ∈ Ω, to conclude that .w(x) defined through (7.6) belongs
to .Lp (Ω) if f belongs to this same space. Notice that we can take

.K0 (x, y) = Θ(x − y).

Now, if .Ω is bounded, and .Ω − Ω ⊂ BR for some positive .R > 0, then

.∫_Ω Θ(x − y) dy = (1/(N (2 − N )ωN )) ∫_{x−Ω} |z|^{2−N} dz
                 ≤ (1/(N (2 − N )ωN )) ∫_{BR} |z|^{2−N} dz,

and this last integral is finite according to Lemma 7.1, and independent of .x ∈ Ω.
The case .N = 2 can also be checked separately. Consequently, if .f ∈ Lp (Ω), so
does .w(x) in (7.6). But more is true.
Lemma 7.3 If .f ∈ Lp (Ω) with .Ω, bounded, and .p ≥ 1, then its newtonian
potential .w(x) given by (7.6) belongs to .W 1,p (Ω).
Proof After our previous calculations, all we need to find is a symmetric upper
bound .K1 (x, y),

.|∇x Θ(x − y)| ≤ K1 (x, y),

with

.∫_Ω K1 (x, y) dy

bounded uniformly in .x ∈ Ω. It is an elementary exercise to find that

.∇x Θ(x − y) = (1/(N ωN )) |x − y|^{−N} (x − y),

and hence, we can take

.K1 (x, y) = (1/(N ωN )) |x − y|^{1−N} .

Lemma 7.1 enables us to conclude that



.∫_Ω K1 (x, y) dy

is uniformly bounded in .x ∈ Ω, and hence .w(x) ∈ W 1,p (Ω). ⨆



Remark 7.2 One can explore the possibility of a further differentiation. Indeed, we
are led to compute the hessian matrix of .w(x) in (7.6) through the formula
.∇x^2 Θ(x − y) = (1/(N ωN |x − y|^N )) ( 1N − N ((x − y)/|x − y|) ⊗ ((x − y)/|x − y|) ),

where .1N is the identity matrix of size .N × N, and, as already introduced earlier,

.u ⊗ v = (ui vj )ij , u = (ui ) ∈ RN , v = (vj ) ∈ RN

is the tensor product of two vectors in .RN . Hence

.|∇x^2 Θ(x − y)| ≤ K2 (x, y) ≡ CN |x − y|^{−N}

for a certain constant .CN . However, the integral



.∫_Ω K2 (x, y) dy

is no longer finite according to Lemma 7.1. We can easily check, though, that

.Δx Θ(x − y) = tr(∇x^2 Θ(x − y)) = 0

for every .y except when .x = y. We will come back to these facts in the final chapter.
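The vanishing of .Δx Θ away from the singularity can be confirmed symbolically; the sketch below (illustrative, not from the text) does so for .N = 3, where .Θ(x) = −1/(4π |x|) since .ω3 = 4π/3, and hence .N (2 − N )ω3 = −4π.

```python
import sympy as sp

x, y, z = sp.symbols('x y z', real=True)

# N = 3: Theta = |x|^(2-N) / (N*(2-N)*omega_N) = -1/(4*pi*|x|).
r = sp.sqrt(x**2 + y**2 + z**2)
Theta = -1 / (4 * sp.pi * r)

# Away from the origin, Theta is harmonic: its Laplacian vanishes.
lap = sp.diff(Theta, x, 2) + sp.diff(Theta, y, 2) + sp.diff(Theta, z, 2)
assert sp.simplify(lap) == 0
```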
Another interesting example is the generalization of the results in Sect. 2.29 to a
higher dimensional situation. Specifically, take a smooth function .ρ(x) : RN → R,
supported in the unit ball .B, and with

. ρ(x) dx = 1.
B

The typical choice is similar to the one in the above-mentioned section,

          ⎧ C exp( 1/(|x|^2 − 1) ), |x| < 1,
.ρ(x) = ⎨
          ⎩ 0, |x| ≥ 1,

for a suitable positive constant C. Define

.K(x, y) = ρ(x − y), x, y ∈ RN ,

and

.u(x) = ∫_{RN} ρ(x − y)f (y) dy (7.7)

for .f ∈ Lp (RN ). As a direct corollary of Lemma 7.2, based on the smoothness of


.ρ, the following is immediate.

Corollary 7.1
1. For .f ∈ Lp (RN ), the function u in (7.7) belongs to the space .W 1,p (RN ) ∩
C∞ (RN ).
2. Moreover if

.ρj (z) = j^N ρ(j z), fj (x) = ∫_{RN} ρj (x − y)f (y) dy

for .f ∈ W 1,p (RN ), then .fj → f in .W 1,p (RN ).


The proof is similar to the one for Lemma 2.4.
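A discrete one-dimensional sketch of this mollification (illustrative, not from the text): convolving a step function with the scaled bump .ρj (in 1-D, .ρj (z) = j ρ(j z) so that .ρj still integrates to 1) produces smooth functions agreeing with the step away from the jump.

```python
import numpy as np

h = 1e-3
z = np.linspace(-1.0, 1.0, 2001)

def bump(t):
    # the standard bump exp(1/(t^2 - 1)) on |t| < 1, zero elsewhere
    out = np.zeros_like(t)
    m = np.abs(t) < 1
    out[m] = np.exp(1.0 / (t[m] ** 2 - 1.0))
    return out

j = 10
rj = bump(j * z)                 # supported in |z| <= 1/j
rj /= rj.sum() * h               # normalize: integral of rho_j equals 1

f = (z > 0).astype(float)        # step function with a jump at the origin
fj = np.convolve(f, rj, mode="same") * h

# f_j agrees with f at distance > 1/j from the jump ...
assert abs(fj[np.searchsorted(z, 0.5)] - 1.0) < 1e-6
assert abs(fj[np.searchsorted(z, -0.5)]) < 1e-6
# ... and passes smoothly through 1/2 at the jump.
assert abs(fj[np.searchsorted(z, 0.0)] - 0.5) < 0.05
```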

7.5 Domains for Sobolev Spaces

Sobolev spaces can be defined for any open set .Ω ⊂ RN . Yet, from the viewpoint of
variational problems, fundamental properties of Sobolev functions, which cannot
be dispensed with, can only be shown under some reasonable properties of the
sets where Sobolev functions are defined and considered. There are three main facts
that Sobolev functions should comply with to be of use in variational problems of
interest with sufficient generality:
1. Sobolev functions in .W 1,p (Ω) should have traces on the boundary .∂Ω, i.e.
typical variational problems demand preassigned boundary values for competing
functions;
2. coercivity often asks, under boundary conditions, for the L^p(Ω)-norm of a function in W^{1,p}(Ω) to be controlled by the L^p(Ω)-norm of its gradient (Poincaré's inequality);

3. weak convergence of derivatives must imply strong convergence of functions: the injection from W^{1,p}(Ω) into L^p(Ω) must be a compact operator.

There are various ways to formulate geometric properties of subsets Ω for which the three properties above hold. We have adopted the one we believe is most economical in terms of technicalities. It makes multidimensional Sobolev spaces arise naturally, and in a transparent form, from their one-dimensional version, as we focus on the restriction of functions to one-dimensional fibers or sections.
One initial technical lemma will be helpful. Curiously, it is usually known as the
Fundamental Lemma of the Calculus of Variations.
Lemma 7.4 Let Φ(x) be a measurable, locally integrable function defined in an open set D ⊂ R^N. If for every φ(x) smooth, and with compact support in D, we have that

    ∫_D Φ(x) φ(x) dx = 0,     (7.8)

then Φ(x) = 0 for a.e. x ∈ D.


Proof The proof is straightforward if we are allowed to use general measure-theoretic results. Other possibilities use approximation by mollifiers to generalize (7.8) to functions φ ∈ L^∞(D), as in Lemma 2.4. We seek a contradiction between (7.8) and the fact that Φ might not vanish somewhere. Suppose we could find some non-negligible set C where Φ is strictly positive. We can approximate C by non-negligible compact subsets K from within. If φ is then selected strictly positive and with compact support in K, we would find that the integral in (7.8) would be strictly positive, a contradiction. The same argument works if Φ is negative in a non-negligible subset. ⨆

The following proposition is the key to our definition of a domain below. It clearly explains why Sobolev functions in W^{1,p}(Ω) are in fact Sobolev functions on one-dimensional fibers of a generalized cylinder Ω.
Proposition 7.3 Let Γ be an open piece of the hyperplane x_N = 0 in R^N, and J_{x'} an open interval in R for a.e. x' ∈ Γ, such that

    Ω = Γ + J_{x'} e_N = {x' + t e_N : x' ∈ Γ, t ∈ J_{x'}}     (7.9)

is an open subset (a cylinder with e_N-axis and variable vertical fibers) of R^N. If u ∈ W^{1,p}(Ω), the function of one variable

    t ∈ J_{x'} ↦ u(x' + t e_N) = u(x', t)

belongs to W^{1,p}(J_{x'}) for a.e. x' ∈ Γ, and hence it is absolutely continuous (as a function of one variable).

Proof If u(x) ∈ W^{1,p}(Ω), then

    ∫_Ω [ u(x) ∂φ/∂x_N(x) + φ(x) ∂u/∂x_N(x) ] dx = 0     (7.10)

for every test function φ(x), smooth and compactly supported in Ω. In particular, we can take test functions of the product form

    φ(x) = φ'(x') φ_N(x', x_N),   x = (x', x_N),  x' ∈ Γ,

for arbitrary φ', smooth and compactly supported in Γ. By taking smooth test functions of this form in (7.10), we realize that

    ∫_Γ φ'(x') ∫_{J_{x'}} [ u(x) ∂φ_N/∂x_N(x) + φ_N(x) ∂u/∂x_N(x) ] dx_N dx' = 0.

If we put

    Φ(x') ≡ ∫_{J_{x'}} [ u(x', x_N) ∂φ_N/∂x_N(x', x_N) + φ_N(x', x_N) ∂u/∂x_N(x', x_N) ] dx_N,

by the preceding lemma we conclude that

    ∫_{J_{x'}} [ u(x', x_N) ∂φ_N/∂x_N(x', x_N) + φ_N(x', x_N) ∂u/∂x_N(x', x_N) ] dx_N = 0

for a.e. x' ∈ Γ. The arbitrariness of φ_N implies our result. Note that for fixed x' ∈ Γ, the test function φ_N(x', x_N) can be taken to be of the product form too (Exercise 1). ⨆

All we need for this proof to be valid is that the last partial derivative ∂u/∂x_N belong to L^p(Ω).
The preceding result should make our definition below appear natural. We designate by {e_i} the canonical basis of R^N, and by π_i the i-th coordinate projection of R^N onto R^{N−1}, i = 1, 2, . . . , N.
Definition 7.2 We will say that an open subset .Ω ⊂ RN is a domain if it enjoys the
following “cylinder” property:
There is a finite number n, independent of i, such that for every .i = 1, 2, . . . , N , and for
every .x ' ∈ πi Ω, there is .Ji,x ' ⊂ R which is a finite union of at most n open intervals (some
of which could share end-points), with

.Ω = πi Ω + Ji,x ' ei = {x ' + tei : x ' ∈ πi Ω, t ∈ Ji,x ' }, (7.11)

for every .i = 1, 2, . . . , N .

Any reasonable set will fall under this definition. Singular sets violating the condition are those for which no finite n works.
Proposition 7.3 can be strengthened in the sense that partial derivatives can be measured along any orthogonal system of coordinates. Once functions possess weak partial derivatives with respect to one such system, they have weak partial derivatives with respect to every other such system too.
Proposition 7.4 Let .Ω ⊂ RN be a domain, and .u ∈ W 1,p (Ω). For a.e. pair of
points .x, y ∈ Ω, the one-dimensional section

f (t) : Jx,y ⊂ R → R,
. Jx,y = {t ∈ R : tx + (1 − t)y ∈ Ω},
f (t) = u(tx + (1 − t)y),

is absolutely continuous, and for a.e. t ∈ [0, 1],

    f'(t) = ∇u(t x + (1 − t) y) · (x − y).

Proof For an arbitrary rotation .R in .RN , .RT R = 1, define the function

uR (z) : RT Ω → R,
. uR (z) = u(Rz).

It is easy to check that this new function u_R belongs to W^{1,p}(R^T Ω) (Exercise 3), and that

    ∇u_R(z) = R^T ∇u(R z).

Proposition 7.3 can then be applied to u_R to conclude, due to the arbitrariness of R, that u is also absolutely continuous along a.e. (one-dimensional) line intersecting Ω. ⨆
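The chain rule ∇u_R(z) = R^T ∇u(Rz) used in the proof is easy to test numerically; in the sketch below, the function u and the rotation angle are arbitrary choices made for this illustration.

```python
import numpy as np

def u(x):
    # An arbitrary smooth test function on R^2.
    return np.sin(x[0]) * x[1]**2 + x[0] * x[1]

def grad_fd(f, x, h=1e-6):
    # Central-difference gradient.
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

a = 0.8                                    # rotation angle (arbitrary)
R = np.array([[np.cos(a), -np.sin(a)],
              [np.sin(a),  np.cos(a)]])
uR = lambda z: u(R @ z)

z = np.array([0.3, -1.1])
lhs = grad_fd(uR, z)
rhs = R.T @ grad_fd(u, R @ z)
print(np.max(np.abs(lhs - rhs)))           # ~ 0 up to finite-difference error
```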

In practice, domains are quite often defined through functions, in such a way that the regularity or smoothness of such functions determines the regularity of their associated domains.
Definition 7.3 We will say that a domain .Ω ⊂ RN is a .C1 -domain if there is a
.C -function .φ(x) : R
1 N → R such that

1. Ω = {φ > 0}, R^N \ Ω = {φ < 0}, ∂Ω = {φ = 0};
2. for some positive ε, the function φ has no critical point in the strip {−ε < φ < ε} around ∂Ω.
The domain

Ωɛ = {−ɛ < φ}
.

is an extension of .Ω. The normal direction to .∂Ω is given by the gradient .∇φ,
which, by hypothesis, is non-singular over .∂Ω. Every regular domain according to
Definition 7.3 is a domain according to Definition 7.2.
If Ω is a C¹-domain, the signed-distance function to ∂Ω is one standard choice for φ:

    φ(x) = dist(x, ∂Ω).

A ball is, of course, a prototypical case:

    B_R(x_0) = {x ∈ R^N : |x − x_0| < R},   φ(x) = R − |x − x_0|.

The definition of a domain through a smooth function φ permits some interesting manipulations with functions u ∈ W^{1,p}(Ω), as we will see later. For instance, the following fact may be interesting in some circumstances.
Lemma 7.5 Let Ω ⊂ R^N be a C¹-domain with bounded boundary determined by φ(x). Then there is a sequence {η_j} of C¹-functions defined in Ω such that:

1. η_j(x) : Ω → [0, 1], η_j = 0 on ∂Ω;
2. if ω_j = |Ω \ {η_j = 1}|, then ω_j ↘ 0 as j → ∞;
3. |∇η_j| ≤ C/ω_j, for some constant C (that may depend on Ω).
Proof Select a C¹-function P(y) in [0, 1] with the four properties

    P(0) = P'(0) = P'(1) = 0,   P(1) = 1,

for instance the polynomial

    P(y) = 3y² − 2y³.

Extend it by putting

    P(y) = 1 for y ≥ 1,   P(y) = 0 for y ≤ 0.

The resulting function is C¹ in all of R. For a positive integer j, put

    P_j(y) = P(j y),   η_j(x) = P_j(φ(x)).

It is clear that each P_j is C¹, and that

    {η_j = 0} ⊃ R^N \ (Ω ∪ ∂Ω),   {η_j = 1} = {φ ≥ 1/j},

at least for large j . In this way, by the standard co-area formula or Cavalieri’s
principle,

ωj =|Ω \ {ηj = 1}|


.

=|{0 < φ ≤ 1/j }|


⎛ 1/j
= |{φ = s}| ds.
0

The smoothness of .φ and the boundedness of .∂Ω imply that indeed .ωj \ 0. Finally,

∇ηj (x) = Pj' (φ(x))∇φ(x).


.

This last product vanishes except when .0 < φ < 1/j . For points .x in this region,
we find

∇ηj (x) = j P ' (j φ(x))∇φ(x).


.

Every factor in this product is bounded by a constant, independent of j , except the


factor j itself. But the relationship between j and .ωj involves a further constant
because .ωj is of the order of .1/j . ⨆
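The construction of the proof can be tried out numerically. The sketch below is an illustration of ours: it takes the unit disc with φ(x) = 1 − |x| (so that Ω = {φ > 0}), uses the cubic P above, and checks that η_j = P(jφ) stays in [0, 1] while |∇η_j| is of order j (here |P'| ≤ 3/2 and |∇φ| = 1).

```python
import numpy as np

def P(y):
    # The cubic of the proof, extended by 0 below 0 and by 1 above 1.
    y = np.clip(y, 0.0, 1.0)
    return 3 * y**2 - 2 * y**3

def eta(x1, x2, j):
    phi = 1.0 - np.hypot(x1, x2)       # phi > 0 exactly on the unit disc
    return P(j * phi)

# Sample the disc: values of eta_j stay in [0, 1].
t = np.linspace(-0.999, 0.999, 401)
X1, X2 = np.meshgrid(t, t)
inside = np.hypot(X1, X2) < 1
j = 20
E = eta(X1, X2, j)
print(E[inside].min(), E[inside].max())

# |grad eta_j| = j |P'(j phi)| |grad phi| <= (3/2) j here; finite differences
# at a point of the transition layer {0 < phi < 1/j}:
h = 1e-6
x = (0.97, 0.1)
g1 = (eta(x[0] + h, x[1], j) - eta(x[0] - h, x[1], j)) / (2 * h)
g2 = (eta(x[0], x[1] + h, j) - eta(x[0], x[1] + -h, j)) / (2 * h)
print(np.hypot(g1, g2))                # stays below the bound 1.5 * j = 30
```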

There is no particular difficulty in raising the degree of smoothness of Ω and of φ, though one would have to replace the polynomial P by a better auxiliary function. Sometimes one may need a slightly different version of these cut-off functions {η_j}, which requires a slight modification of the above proof.
Lemma 7.6 Let Ω ⊂ R^N be a C¹-domain with bounded boundary determined by φ(x). Then there is a sequence {η_j} of C¹-functions defined in Ω such that:

1. there is a compact set K ⊂ Ω such that supp(η_j) ⊂ K for all j;
2. if ω_j = |Ω \ {η_j = 1}|, then ω_j ↘ 0 as j → ∞;
3. |∇η_j| ≤ C/ω_j, for some constant C (that may depend on Ω).
We next explore the validity of the three main features we need for Sobolev
functions to be of use in variational problems.

7.6 Traces of Sobolev Functions: The Space W_0^{1,p}(Ω)

One vital ingredient in variational problems is the boundary condition around .∂Ω
imposed on competing functions. This forces us to examine in what sense functions
in Sobolev spaces can take on boundary values. It is another important consequence
of Proposition 7.3, and our definition of a feasible domain Definition 7.2. Note that

functions in L^p(Ω) cannot have, in general, traces over (N−1)-dimensional sets in R^N, as these have a vanishing N-dimensional Lebesgue measure.

We first need an interesting technical result involving the boundary ∂Ω of a domain, to facilitate proofs; it is implicit in the definition of a domain. In fact, this property could be made part of Definition 7.2.
Lemma 7.7 Let Ω be a domain according to Definition 7.2. Then for a.e. point x in the boundary ∂Ω, there is some (most likely more than one) i ∈ {1, 2, . . . , N}, possibly depending on x, such that

    x = π_i x + t e_i,   t ∈ ∂J_{i,π_i x}.     (7.12)

Proof In the context of Definition 7.2, consider the set

    ∪_i (π_i Ω + ∂J_{i,x'} e_i) = ∪_i {x' + t e_i : x' ∈ π_i Ω, t ∈ ∂J_{i,x'}},

which is a subset of ∂Ω. Each of the sets, for fixed i,

    {x' + t e_i : x' ∈ π_i Ω, t ∈ ∂J_{i,x'}}

covers the part of ∂Ω that is projected onto π_i Ω, and hence our result is proved once we realize that the set

    ∂Ω \ ∪_i ( π_i^{−1}(π_i Ω) ∩ ∂Ω )

is negligible in ∂Ω. ⨆

Our main result in this section follows.
Proposition 7.5 Let Ω ⊂ R^N be a domain, and u ∈ W^{1,p}(Ω). Then the restriction

    u|_∂Ω : ∂Ω → R

is well-defined and measurable.


Proof Through Lemma 7.7, consider a partition of ∂Ω into N subsets Γ_i, i = 1, 2, . . . , N, so that for every x ∈ Γ_i, (7.12) holds. In this way, we can restrict attention to the case in which the decomposition (7.12) holds for a particular i, and show that Sobolev functions in W^{1,p}(Ω) have a trace for a.e. x ∈ Γ_i. Consider the case, without loss of generality, i = N, and put

    Γ = Γ_N,   x' = π_N x,   J_{x'}, a connected subinterval of J_{N,π_N x}.

We can write, through Proposition 7.3, for every pair of numbers y and z in J_{x'},

    u(x', y) − u(x', z) = ∫_z^y ∂u/∂x_N(x', s) ds = ∫_{J_{x'}} χ_{[z,y]}(s) ∂u/∂x_N(x', s) ds,

and, by Hölder's inequality if p > 1,

    |u(x', y) − u(x', z)| ≤ ( ∫_{J_{x'}} |∂u/∂x_N(x', s)|^p ds )^{1/p} |y − z|^{1/q},   1/p + 1/q = 1.

Since the quantities

    ( ∫_{J_{x'}} |∂u/∂x_N(x', s)|^p ds )^{1/p}

are finite for a.e. x' (because the integral of their p-th power is finite), we conclude that u is absolutely continuous along a.e. such fiber, and hence it is well-defined at a.e. point of the form

    x' + t e_N,   t ∈ J_{x'}.

This implies our conclusion over Γ = Γ_N, since points in this set are exactly of this form. The case p = 1 is argued in the same way, though the above Hölder's inequality is not valid. ⨆

Once we have shown that Sobolev functions in .W 1,p (Ω), for a certain domain .Ω,
have traces on the boundary .∂Ω, one can isolate the following important subspace.
Definition 7.4 For a domain .Ω ⊂ RN , the subspace

1,p
W0 (Ω) ⊂ W 1,p (Ω)
.

is the closure in W^{1,p}(Ω) of the subspace C_c^∞(Ω) of smooth functions with compact support in Ω.
It is evident that smooth functions with compact support in Ω have a vanishing trace on ∂Ω. In fact, one can, equivalently, define W_0^{1,p}(Ω) as the class of functions with a vanishing trace on ∂Ω. If we adopt the above definition, however, this must be a result to be proved.
Proposition 7.6 Every function in W_0^{1,p}(Ω) has a vanishing trace on ∂Ω.

Proof It consists in the realization that the proof of Proposition 7.5 above is valid line by line for a full sequence {u_j} converging in W^{1,p}(Ω), in such a way that we conclude that there is pointwise convergence of their traces on ∂Ω. If these vanish for a sequence of compactly supported functions, so does the trace of the limit function. ⨆

There still remains the issue of whether every function in W^{1,p}(Ω) with a vanishing trace on ∂Ω belongs, as a matter of fact, to W_0^{1,p}(Ω). This amounts to showing that such functions can be approximated, in the norm of W^{1,p}(Ω), by a sequence of smooth functions with compact support contained in Ω. This issue is technical and requires some smoothness of ∂Ω, as in other situations. Though we treat below some other similar points, we will take for granted, under the appropriate smoothness of Ω, that W_0^{1,p}(Ω) is exactly the subspace of W^{1,p}(Ω) with a vanishing trace on ∂Ω. Under this equivalence, we find that (7.2) is correct for u ∈ W_0^{1,p}(Ω) and φ ∈ W^{1,q}(Ω). In particular, it is correct for u, φ ∈ H¹(Ω) with one of the two in H_0^1(Ω).
The most important point is that W_0^{1,p}(Ω) is a Banach space in its own right, under the same norm; H_0^1(Ω) is a Hilbert space with the same inner product as H¹(Ω). It suffices to check that W_0^{1,p}(Ω) is a closed subspace of W^{1,p}(Ω).

Proposition 7.7 W_0^{1,p}(Ω) is closed in W^{1,p}(Ω).
Proof Suppose we are facing a situation where

    u_j → u in W^{1,p}(Ω),   u_j ∈ W_0^{1,p}(Ω),

and we would like to conclude that necessarily u ∈ W_0^{1,p}(Ω).
By Definition 7.4, we know that there is a sequence {φ_j} of smooth functions with compact support contained in Ω and such that

    ‖φ_j − u_j‖ → 0 as j → ∞ in W^{1,p}(Ω).

On the other hand,

    ‖u_j − u‖ → 0 as j → ∞ in W^{1,p}(Ω).

It is immediate to conclude that

    ‖φ_j − u‖ → 0 as j → ∞ in W^{1,p}(Ω),

and u ∈ W_0^{1,p}(Ω), again by Definition 7.4. ⨆

Another natural and relevant question is which functions, defined on the boundary ∂Ω of a domain Ω ⊂ R^N, can be attained as the trace of a Sobolev function in Ω. There is a very precise answer to this question that is a bit beyond this first course

about Sobolev spaces. A practical answer, sufficient in practice most of the time, is that such trace functions are, of course, of the form

    u|_∂Ω,   u ∈ W^{1,p}(Ω).

In this way, fixed boundary values around .∂Ω for a certain variational problem are
given by providing a specific Sobolev function .u0 ∈ W 1,p (Ω), and then feasible
functions .u ∈ W 1,p (Ω) are asked to comply with the requirement

    u − u_0 ∈ W_0^{1,p}(Ω).

Remark 7.3 Note that W_0^{1,p}(R^N) = W^{1,p}(R^N).
The following is another remarkable but natural result.
Proposition 7.8

1. If u ∈ W_0^{1,p}(Ω), then its extension by zero, indicated by the bar operator ū,

       ū(x) = u(x) for x ∈ Ω,   ū(x) = 0 for x ∉ Ω,

   is a function in W^{1,p}(R^N). Moreover,

       ∇ū(x) = ∇u(x) for x ∈ Ω,   ∇ū(x) = 0 for x ∉ Ω,

   and

       ‖ū‖_{W^{1,p}(R^N)} = ‖u‖_{W^{1,p}(Ω)}.

2. If Ω is a regular domain (according to Definition 7.3), and the extended function ū belongs to W^{1,p}(R^N), then u ∈ W_0^{1,p}(Ω).

Proof Suppose first that u ∈ W_0^{1,p}(Ω), and let u_j ∈ C_c^∞(Ω) be such that

    ‖u − u_j‖_{W^{1,p}(Ω)} → 0.

It is evident that ū_j ∈ C_c^∞(R^N), and that

    ‖ū − ū_j‖_{W^{1,p}(R^N)} = ‖u − u_j‖_{W^{1,p}(Ω)} → 0.

Hence ū ∈ W^{1,p}(R^N). This part does not require any smoothness of Ω.

Conversely, assume that ū ∈ W^{1,p}(R^N). By Remark 7.3, there is a sequence {u_j} of smooth functions with compact support such that

    ‖ū − u_j‖_{W^{1,p}(R^N)} → 0.

But since

    ‖u_j‖_{W^{1,p}(R^N \ Ω)} ≤ ‖ū − u_j‖_{W^{1,p}(R^N)},

we conclude that u_j → 0 in W^{1,p}(R^N \ Ω). If Ω is regular and φ is its defining function, by Lemma 7.6 (or rather its C^∞-version) there is a sequence {η_j} with those stated properties. Then the sequence {v_j = u_j η_j} ⊂ C_c^∞(Ω) and

    ‖u_j − v_j‖_{W^{1,p}(Ω)} → 0.

Therefore

    ‖u − v_j‖_{W^{1,p}(Ω)} ≤ ‖u − u_j‖_{W^{1,p}(Ω)} + ‖u_j − v_j‖_{W^{1,p}(Ω)}
                          ≤ ‖ū − u_j‖_{W^{1,p}(R^N)} + ‖u_j − v_j‖_{W^{1,p}(Ω)},

and u ∈ W_0^{1,p}(Ω). ⨆

7.7 Poincaré’s Inequality

Our manipulations in the proof of Proposition 7.5 lead in a natural way to the
following remarkable fact. Recall that .πi , .i = 1, 2, . . . , N, is the i-th canonical
coordinate projection so that .1 − πi is the projection onto the i-th axis.
Proposition 7.9 Suppose .Ω ⊂ RN is a domain such that at least one of the N
projections .(1 − πi )Ω is a bounded set of .R. Then there is a constant .C > 0
(depending on p and on the size of this projection in .R) such that, for every
u ∈ W_0^{1,p}(Ω), we have

    ‖u‖_{L^p(Ω)} ≤ C ‖∇u‖_{L^p(Ω;R^N)}.

Proof Suppose, without loss of generality, that the index i is the last one N , so
that the diameter of the projection .(1 − πN )Ω onto the last axis is not greater than
.L > 0. We can write, with the notation in the proof of Proposition 7.5 above, for

a.e. .x ' ∈ πN Ω,
    u(x', y) = ∫_z^y ∂u/∂x_N(x', s) ds = ∫_{J_{x'}} χ_{[z,y]}(s) ∂u/∂x_N(x', s) ds,
7.7 Poincaré’s Inequality 237

if the point (x', z) ∈ ∂Ω, and hence u(x', z) = 0. The diameter of the set J_{x'} is not greater than L for a.e. x' ∈ π_N Ω. Again by Hölder's inequality,

    |u(x', y)| ≤ ( ∫_{J_{x'}} |∂u/∂x_N(x', s)|^p ds )^{1/p} |y − z|^{1/q},   1/p + 1/q = 1,

that is to say,

    |u(x', y)| ≤ ( ∫_{J_{x'}} |∂u/∂x_N(x', s)|^p ds )^{1/p} L^{1/q},

for all y ∈ J_{x'}. Therefore

    ∫_{J_{x'}} |u(x', y)|^p dy ≤ L^{p/q+1} ∫_{J_{x'}} |∂u/∂x_N(x', s)|^p ds.

A further integration with respect to x' leads to

    ‖u‖^p_{L^p(Ω)} = ∫_{π_N Ω} ∫_{J_{x'}} |u(x', y)|^p dy dx'
                   ≤ L^p ∫_{π_N Ω} ∫_{J_{x'}} |∂u/∂x_N(x', s)|^p ds dx'
                   = L^p ‖∂u/∂x_N‖^p_{L^p(Ω)}.

It is then clear that

    ‖u‖_{L^p(Ω)} ≤ L ‖∇u‖_{L^p(Ω;R^N)}.

Notice that p/q + 1 = p. The case p = 1 is also correct, and the proof requires some very minor adjustments. ⨆
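In one dimension the inequality, with the constant L produced by the proof, can be checked numerically. The sketch below is an illustration of ours: for the chosen u(x) = sin(πx/L), vanishing at the endpoints of (0, L), the ratio ‖u‖/‖u′‖ for p = 2 is exactly L/π, comfortably below L.

```python
import numpy as np

L, p = 3.0, 2
n = 100000
h = L / n
x = np.linspace(0.0, L, n + 1)
u = np.sin(np.pi * x / L)              # u(0) = u(L) = 0
du = np.gradient(u, h)                 # numerical derivative

# Discrete L^p norms (simple Riemann sums).
norm_u = (np.sum(np.abs(u)**p) * h)**(1 / p)
norm_du = (np.sum(np.abs(du)**p) * h)**(1 / p)
print(norm_u, L * norm_du)             # the first is smaller, as predicted
print(norm_u / norm_du)                # ratio close to L / pi
```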

This result indicates that when boundary values on ∂Ω are preassigned, the size of functions is somehow incorporated in the norm of the gradient. In particular, we see that the L^p-norm of the gradient,

    ‖∇u‖_{L^p(Ω)} = ( ∫_Ω |∇u(x)|^p dx )^{1/p},

is truly a norm on the space W_0^{1,p}(Ω), and

    ∫_Ω ∇u(x) · ∇v(x) dx

is a genuine inner product in H_0^1(Ω). Poincaré's inequality is the second of the main points we set for ourselves before proceeding to examine scalar, multidimensional variational problems.

7.8 Weak and Strong Convergence

One main ingredient allowing an important reduction of the convexity requirement on integrands for multidimensional variational problems is to show that weak convergence in W^{1,p}(Ω) implies strong convergence in L^p(Ω). We will look at these properties from a broader perspective in the final chapter, but for now we treat this issue in a right-to-the-point manner. Our plan is to complete Corollary 2.4.
Proposition 7.10 Let p > 1, and suppose {u_j} is a bounded sequence of functions in W^{1,p}(Ω) for a domain Ω ⊂ R^N. There is a subsequence, not relabeled, and a function u ∈ W^{1,p}(Ω) such that u_j ⇀ u in W^{1,p}(Ω), and u_j → u in L^p(Ω).
Proof The N sequences of functions

{∂uj /∂xi },
. i = 1, 2, . . . , N,

are bounded in L^p(Ω). By the first part of Corollary 2.4, there are functions

    u^{(i)} ∈ L^p(Ω),   i = 0, 1, 2, . . . , N,

with

    ∂u_j/∂x_i ⇀ u^{(i)},  i = 1, 2, . . . , N,   u_j ⇀ u^{(0)}.

We first claim that u ≡ u^{(0)} belongs to W^{1,p}(Ω), and its i-th weak partial derivative is precisely u^{(i)}. To this end, take a test function φ, and write for each j and i, since u_j ∈ W^{1,p}(Ω),

    ∫_Ω [ u_j(x) ∂φ/∂x_i(x) + φ(x) ∂u_j/∂x_i(x) ] dx = 0.

By the claimed weak convergences, a direct passage to the limit in j leads to

    ∫_Ω [ u(x) ∂φ/∂x_i(x) + φ(x) u^{(i)}(x) ] dx = 0.

This identity, valid for arbitrary test functions φ, means exactly, given that each u^{(i)} belongs to L^p(Ω), that u ∈ W^{1,p}(Ω), u^{(i)} is the i-th partial derivative of u, and hence u_j ⇀ u in W^{1,p}(Ω).
It remains to show the fundamental additional fact that u_j → u (strong) in L^p(Ω). We know that

    ∫_{Ω'} ∫_{J_{x'}} |∂u_j/∂x_N(x', x_N)|^p dx_N dx' = ‖∂u_j/∂x_N‖^p_{L^p(Ω)} ≤ M < ∞,
    ∂u_j/∂x_N ⇀ ∂u/∂x_N,

for a positive constant M, independent of j. This weak convergence means that

    ∫_Ω ( ∂u_j/∂x_N(x) − ∂u/∂x_N(x) ) φ(x) dx → 0     (7.13)

for all φ ∈ L^q(Ω). By (7.11) in Definition 7.2,

    Ω = {x' + x_N e_N : x' ∈ Ω', x_N ∈ J_{x'}},   Ω' = π_N Ω ⊂ R^{N−1},

and we can recast (7.13) in the form

    ∫_{Ω'} ∫_{J_{x'}} ( ∂u_j/∂x_N(x', x_N) − ∂u/∂x_N(x', x_N) ) φ(x', x_N) dx_N dx' → 0.

In particular, we can take φ of the product form

    φ(x', x_N) = ψ(x') φ(x_N),

to find

    ∫_{Ω'} ψ(x') ( ∫_{J_{x'}} ( ∂u_j/∂x_N(x', x_N) − ∂u/∂x_N(x', x_N) ) φ(x_N) dx_N ) dx' → 0.

Due to the arbitrariness of ψ, thanks to Lemma 7.4, we can conclude that, for a.e. x' ∈ Ω',

    ∂u_j/∂x_N(x', ·) ∈ L^p(J_{x'}),   ∂u_j/∂x_N(x', ·) ⇀ ∂u/∂x_N(x', ·).

From Proposition 2.7 and the observations before its statement, we can conclude that

    u_j(x', ·) + v_j(x') → u(x', ·)

for certain measurable functions v_j, independent of x_N, and for a.e. x' ∈ Ω'. For this fact to be precisely true, one would have to partition the domain Ω into subsets where the transversal sets J_{x'} of R are single intervals (Exercise 2 below). Our conclusion means exactly that we have the pointwise convergence

    u_j(x) + v_j(x') → u(x)

for a.e. .x ∈ Ω, and a.e. .x ' ∈ Ω' . There is nothing keeping us from going over this
argument with a different partial derivative

∂/∂xi ,
. i = 1, 2, . . . , N − 1,

so that we would conclude that in fact the functions v_j(x') can be taken as constants v_j, independent of x. Since, on the other hand, we indeed know that u_j ⇀ u in L^p(Ω), by uniqueness of limits we conclude that v_j → 0, because weak and strong convergence coincide for constants, and u_j → u strongly in L^p(Ω). ⨆
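The dichotomy of Proposition 7.10 is well illustrated by oscillating sequences. The one-dimensional example below is a choice of ours: u_j(x) = sin(jx)/j is bounded in W^{1,2}(0, 2π), the functions converge strongly to 0 in L², but the derivatives cos(jx) only converge weakly to 0: their L² norm stays at √π while their pairing with a fixed test function vanishes.

```python
import numpy as np

n = 200000
h = 2 * np.pi / n
x = np.linspace(0.0, 2 * np.pi, n, endpoint=False)

for j in (1, 10, 100):
    u = np.sin(j * x) / j
    du = np.cos(j * x)
    l2_u = np.sqrt(np.sum(u**2) * h)    # -> 0: strong convergence of u_j
    l2_du = np.sqrt(np.sum(du**2) * h)  # stays at sqrt(pi): no strong limit
    pairing = np.sum(du) * h            # du against phi = 1: -> 0 (weak limit 0)
    print(j, l2_u, l2_du, pairing)
```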

With this result we complete our initial analysis of first-order, multidimensional Sobolev spaces, which permits us to deal with the most pertinent issues about scalar variational problems. We will do so in the next chapter. We include a final section to briefly describe how to set up, in an inductive manner, higher-order Sobolev spaces. These will allow us to deal with higher-order variational problems.

7.9 Higher-Order Sobolev Spaces

Once we have defined Sobolev spaces of first order, involving first-order weak derivatives belonging to Lebesgue spaces, it is easy to move forward and define second-order, and then higher-order, Sobolev spaces.
Definition 7.5 Let .Ω be an open subset of .RN , and let exponent .p ∈ [1, +∞]
be given. The Sobolev space .W 2,p (Ω) is defined as the collection of functions in
.W
1,p (Ω) whose partial derivatives

∂u
. ∈ Lp (Ω), i = 1, 2, . . . , N,
∂xi

admit a weak gradient

    ∇(∂u/∂x_i) = ( ∂/∂x_1 (∂u/∂x_i), ∂/∂x_2 (∂u/∂x_i), . . . , ∂/∂x_N (∂u/∂x_i) ).

As in the smooth case, the full collection of weak second partial derivatives can be
arranged in the weak hessian
    ∇²u = ( ∂²u / (∂x_i ∂x_j) )_{i,j = 1, 2, . . . , N},

which is always a symmetric .N × N-matrix. In compact form, we write

.W 2,p (Ω) = {u ∈ Lp (Ω) : ∇u ∈ Lp (Ω; RN ), ∇ 2 u ∈ Lp (Ω; RN ×N )}.

W^{2,p}(Ω) is a Banach space under the norm

    ‖u‖^p = ∫_Ω ( |u(x)|^p + |∇u(x)|^p + |∇²u(x)|^p ) dx,

or, equivalently,

    ‖u‖ = ‖u‖_{L^p(Ω)} + ‖∇u‖_{L^p(Ω;R^N)} + ‖∇²u‖_{L^p(Ω;R^{N×N})}.

The space H²(Ω) = W^{2,2}(Ω) is a separable Hilbert space with the inner product

    ⟨u, v⟩ = ∫_Ω ( u(x) v(x) + ∇u(x) · ∇v(x) + ∇²u(x) : ∇²v(x) ) dx.

In a similar, inductive way, one can define Sobolev spaces .W m,p (Ω) for .m ≥ 1,
by demanding that derivatives of order .m − 1 belong to .W 1,p (Ω). Recall that the
product .A : B of two .N × N-matrices is, as usual,

A : B = tr(AT B).
.

The fact that the weak hessian is a symmetric matrix, because of the equality of
mixed partial derivatives, is also a direct consequence of the same fact for smooth
functions through the integration-by-parts formula.
Most of the important facts for functions in W^{2,p}(Ω) can be deduced from parallel facts for functions in W^{1,p}(Ω), applied to each partial derivative. Possibly one of the points worth highlighting, from the viewpoint of variational problems, is the fact that functions in W^{2,p}(Ω) admit traces both for the functions themselves and for all their first partial derivatives. In particular, if Ω has a smooth boundary with outer unit normal n, then

the normal derivative

    ∂u/∂n = ∇u · n

is well-defined at points of ∂Ω. In particular, we can also talk about the space W_0^{2,p}(Ω), which is the subspace of W^{2,p}(Ω) with

    u = ∇u = 0 on ∂Ω,

or equivalently

    u = ∂u/∂n = 0 on ∂Ω.

From this standpoint, it is important to understand the distinction between the two spaces

    W_0^{2,p}(Ω),   W_0^{1,p}(Ω) ∩ W^{2,p}(Ω).

Definition 7.6 Let .Ω ⊂ RN be a domain.


1. The subspace

       W_0^{2,p}(Ω) ⊂ W^{2,p}(Ω)

   is the closure in W^{2,p}(Ω) of the subspace of smooth functions with compact support in Ω.
2. The subspace

       W_0^{1,p}(Ω) ∩ W^{2,p}(Ω) ⊂ W^{2,p}(Ω)

   is the closure in W^{2,p}(Ω) of the subspace of smooth functions vanishing on ∂Ω.
We will come back to these spaces in the final part of the next chapter, when dealing
with second-order variational problems.

7.10 Exercises

1. Let Ω ⊂ RN be an open subset, and let

    {(x'_0, t) : t ∈ J} ⊂ Ω

for some fixed x'_0 and compact interval J ⊂ R. Show that if φ(x_N) is a test function with support contained in J, then there is another test function ψ(x') such that

    φ(x) = ψ(x') φ(x_N)

is a test function in Ω with ψ(x'_0) = 1.


2. Argue that a connected domain Ω in R^N can be partitioned into disjoint parts Ω_k, k ∈ K, so that over each part Ω_k the decomposition in (7.11) is such that J_{x'} is a true single interval in R.
3. For an arbitrary rotation R in RN , RT R = 1 (1, the identity matrix), define the
function

uR (z) : RT Ω → R,
. uR (z) = u(Rz),

if Ω ⊂ RN is a domain, and u ∈ W 1,p (Ω). Prove that uR ∈ W 1,p (RT Ω), and

∇uR (z) = RT ∇u(Rz).


.

4. In the field of PDEs, spaces of functions where not all partial derivatives have
the same integrability properties need to be considered. To be specific, consider
the space

∂u
{u ∈ L2 (Ω) :
. ∈ L2 (Ω)}, x = (x1 , x2 ), Ω ⊂ R2 ,
∂x1

but nothing is required about ∂u/∂x_2. Show that it is a Hilbert space under the inner product

    ⟨u, v⟩ = ∫_Ω [ u(x) v(x) + ∂u/∂x_1(x) ∂v/∂x_1(x) ] dx,   x = (x_1, x_2).

5. Show rigorously that:
   (a) if u ∈ W^{1,p}(Ω) and ψ ∈ C_c^∞(R^N), then the product uψ belongs to W^{1,p}(Ω), and the product rule holds:

       ∇(uψ) = ∇u ψ + u ∇ψ in Ω;

   (b) if u ∈ W_0^{1,p}(Ω) and ψ ∈ C_c^∞(R^N), then the product uψ ∈ W_0^{1,p}(Ω) too.
N

6. Given a domain Ω ⊂ R^N and a positive volume fraction t ∈ (0, 1), show that there is always a sequence of characteristic functions {χ_j(x)} of subsets of Ω such that χ_j ⇀ t.
7. Let Ω ⊂ RN be a bounded, regular domain with a unit, outer normal field n on
∂Ω.

   (a) Define the subspace L²_div(Ω) of fields in L²(Ω; R^N) with a weak divergence in L²(Ω).
   (b) Consider the further subspace

       H = {F ∈ L²_div(Ω; R^N) : div F = 0},

   and check that it is the orthogonal complement of the image, under the gradient operator, of H_0^1(Ω).
8. Take Ω = (0, 1)2 ⊂ R2 . Declare a measurable function

u(x1 , x2 ) : Ω → R,
.

as mildly differentiable with respect to x_1, with mild derivative u_1(x_1, x_2), if

    lim_{h→0} (1/h²) ∫_Ω |u(x_1 + h, x_2) − u(x_1, x_2) − h u_1(x_1, x_2)|² dx_1 dx_2 = 0.

In a similar way declare a function u_2(x_1, x_2) to be the mild derivative of u(x_1, x_2) with respect to x_2. Explore the differences between mild and weak differentiability, starting with smooth functions.
9. Mimic the passage from the rationals to the reals with the absolute value, to
define Sobolev spaces from test functions.
10. Show that W^{1,p}(R^N) = W_0^{1,p}(R^N).
11. Argue how to define the sequence in Lemma 7.6 from the sequence in (the
proof of) Lemma 7.5. Provide the details for the proofs of C∞ -versions of
Lemmas 7.5, and 7.6.
12. Redo the proof of Lemma 2.4 in the higher-dimensional setting Corollary 7.1.
13. Let B ⊂ R² be the unit disc in R². Isolate the conditions for functions of a single variable u = u(x_1) to be elements of H¹(B). What would the situation be for a general, regular domain Ω ⊂ R²?
14. With the notation and the ideas of the previous exercise, consider the set of
functions

L = {u(x1 ) : u ∈ W 1,∞ (−1, 1)} ⊂ H 1 (B).


.

Argue that it is a subspace of H 1 (B) that is not closed (under the norm of
H 1 (B)).
Chapter 8
Scalar, Multidimensional Variational
Problems

8.1 Preliminaries

Once we have established a solid functional-analytic foundation, we are ready to tackle multidimensional variational problems in which we aim to minimize the value of the standard integral functional

    ∫_Ω F(x, u(x), ∇u(x)) dx

among a given set A of competing functions. The main situation we will explore is that in which feasible functions in A are determined through their preassigned values on ∂Ω.
We will start with the particular, fundamental case of quadratic functionals which
builds upon the Lax-Milgram theorem of Chap. 3. This is rather natural and does not
require any new fundamental fact. After that, we will focus on the three important
aspects as in Chap. 4, namely,
1. weak lower semicontinuity, the direct method, and one main existence result;
2. optimality conditions in the Hilbert-space scenario, and weak solutions of PDEs;
3. explicit examples.
We will wrap up the chapter with a look at the most important example of a second-order problem that is important in applications: the bi-harmonic operator. We will cover, in such a case, the two issues of existence of optimal solutions and optimality much more rapidly.
Since at this point we already have non-negligible training in most of the abstract underlying issues, proofs dwell on more technical facts, and they are sometimes shorter than the ones for previous similar results.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 245
P. Pedregal, Functional Analysis, Sobolev Spaces, and Calculus of Variations,
La Matematica per il 3+2 157, https://doi.org/10.1007/978-3-031-49246-4_8

8.2 Abstract, Quadratic Variational Problems

Our abstract discussion on the Lax-Milgram lemma in Sect. 3.2 can be applied
directly to quadratic, multi-dimensional variational problems of the form
    Minimize in u ∈ H_0^1(Ω):   ∫_Ω [ (1/2) ∇u(x)^T A(x) ∇u(x) + f(x) u(x) ] dx

under vanishing Dirichlet boundary conditions around .∂Ω. The important ingredi-
ents to ensure the appropriate hypotheses in Theorem 3.1 are

    A(x) symmetric,   |A(x)| ≤ C, C > 0, x ∈ Ω,
    c|u|² ≤ u^T A(x) u,   c > 0, x ∈ Ω, u ∈ R^N.

In addition .f (x) ∈ L2 (Ω). Under these assumptions the following is a direct


application of Theorem 3.1.
Corollary 8.1 The variational problem just described admits a unique minimizer
u ∈ H_0^1(Ω) that is characterized by the condition

∫_Ω [∇v(x)^T A(x)∇u(x) + f(x)v(x)] dx = 0                (8.1)

for every v ∈ H_0^1(Ω).


The main specific example is, of course, the Dirichlet principle described in
Sect. 1.4, and also mentioned in Sect. 1.8, because of its historical relevance in the
development of Functional Analysis and the Calculus of Variations. It corresponds
to the choice

A(x) ≡ 1,   f(x) ≡ 0,

which is obviously covered by Theorem 3.1 and Corollary 8.1. Due to its
significance, we state it separately as a corollary.

Corollary 8.2 (Dirichlet's Principle) For every domain Ω, there is a unique
function u(x) ∈ H^1(Ω) which is a minimizer for the problem

Minimize in v(x) ∈ H^1(Ω):   (1/2) ∫_Ω |∇v(x)|^2 dx

under u − u_0 ∈ H_0^1(Ω), for a given u_0 ∈ H^1(Ω). This unique function u is
determined as a (weak) solution of Laplace's equation

Δu(x) = 0 in Ω,   u = u_0 on ∂Ω.
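The Dirichlet principle has a transparent discrete analogue that can be tested numerically: on a grid, each Gauss-Seidel update for the 5-point discrete Laplacian is the exact minimization of the discrete Dirichlet energy with respect to one nodal value, so sweeps can only decrease the energy, and the limit solves the discrete Laplace equation. The sketch below is only illustrative; grid size, boundary datum, and sweep count are arbitrary choices, not anything prescribed by the text:

```python
# Discrete Dirichlet principle on the unit square: Gauss-Seidel sweeps for the
# 5-point Laplacian monotonically decrease the discrete Dirichlet energy.
n = 17                         # grid points per side (illustrative choice)
h = 1.0 / (n - 1)

# boundary datum u0(x, y) = x*y (itself harmonic); interior starts at zero
u = [[(i * h) * (j * h) if i in (0, n - 1) or j in (0, n - 1) else 0.0
      for j in range(n)] for i in range(n)]

def energy(u):
    """Discrete Dirichlet energy (1/2) sum |grad u|^2 (forward differences)."""
    e = 0.0
    for i in range(n - 1):
        for j in range(n - 1):
            e += 0.5 * ((u[i + 1][j] - u[i][j]) ** 2 +
                        (u[i][j + 1] - u[i][j]) ** 2)
    return e

energies = [energy(u)]
for sweep in range(200):
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            # averaging the four neighbours is the exact minimizer of the
            # energy with respect to the single value u[i][j]
            u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] +
                              u[i][j - 1] + u[i][j + 1])
    energies.append(energy(u))

assert all(a >= b - 1e-12 for a, b in zip(energies, energies[1:]))
assert abs(u[n // 2][n // 2] - 0.25) < 1e-2   # u(1/2, 1/2) = 1/4 for u = xy
```

Since the boundary datum xy is discretely harmonic, the sweeps reproduce u(x, y) = xy in the interior.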
As we comment below, much more regularity for the harmonic function u can be
derived if one can count on further regularity of the domain Ω and of the boundary
datum u_0.
Several variations of this model problem can be treated. We name a few, some of
which are proposed as exercises in the final section. A corresponding corollary
holds for each of them.
• The linear term

  ∫_Ω f(x)u(x) dx

  can be of the form

  ∫_Ω F(x) · ∇u(x) dx

  for a field

  F(x) ∈ L^2(Ω; R^N),

  or a combination of both. Even more generally, it could also be

  <f, u>,   f ∈ H^{-1}(Ω),

  if we declare the space H^{-1}(Ω) to be the dual of H_0^1(Ω). More on this in the
  final chapter.
• Boundary conditions can easily be changed to a non-vanishing situation, as in
  the Dirichlet principle, by selecting some appropriate specific function u_0 ∈
  H^1(Ω), and defining the bilinear form over functions of the form u + u_0 for
  u ∈ H_0^1(Ω).
• The bilinear form, i.e. the integral functional, can also incorporate a quadratic
  term in u,

  ∫_Ω [ (1/2) ∇u(x)^T A(x)∇u(x) + (1/2) a(x)u(x)^2 + f(x)u(x) ] dx,

  for a non-negative coefficient a(x).
• One can also consider the same variational problem without an explicit
  boundary condition:

  Minimize in u ∈ H^1(Ω):
  ∫_Ω [ (1/2) ∇u(x)^T A(x)∇u(x) + (1/2) u(x)^2 + f(x)u(x) ] dx.
  The minimization process in this situation leads to the so-called natural
  boundary condition, as it is found as a result of the optimization process itself,
  and not imposed in any way.
• Boundary conditions may come in the Neumann form for the normal derivative:

  Minimize in u ∈ H^1(Ω):
  ∫_Ω [ (1/2) ∇u(x)^T A(x)∇u(x) + (1/2) u(x)^2 + f(x)u(x) ] dx

  under

  ∇u(x) · n(x) = h(x) on ∂Ω,

  for a function h defined on ∂Ω, and n(x), the unit outer normal to ∂Ω. Here Ω
  is supposed to have a boundary which is a smooth, compact hypersurface of
  dimension N − 1. This problem would require, however, a much more detailed
  analysis to define Lebesgue spaces on manifolds for the functions h.
• The matrix field A(x) might not be symmetric.
There are two important issues to explore once one has shown existence of optimal
solutions for a certain variational principle.
1. In the first place, it is most crucial to establish optimality conditions. These are
   additional fundamental requirements that optimal solutions must comply with
   precisely because they are optimal solutions for a given variational problem;
   in the multidimensional case, they typically involve partial differential
   equations (PDEs). These are initially formulated in a weak form; in the case of
   quadratic functionals, such optimality conditions are given as a fundamental
   part of the application of the Lax-Milgram lemma. In Corollary 8.1, it is (8.1).
   We say that (8.1) is the weak form of the linear, elliptic PDE

   −div[A(x)∇u(x)] + f(x) = 0 in Ω,   u = 0 on ∂Ω,

   because by formally multiplying this equation by an arbitrary function v ∈
   H_0^1(Ω) and integrating by parts, we recover (8.1).
2. The second important issue is to derive further regularity properties (better
   integrability, continuity, smoothness, analyticity, etc.) for the optimal solution
   u from additional conditions on the ingredients of the problem: the domain Ω,
   and A(x) and f(x). Typically, this regularity issue is delicate and requires
   quite a bit of fine work.
We will use these quadratic problems as models for more general integral
functionals. However, once one is off the quadratic framework, the existence of
optimal solutions is much more involved, as there is no general result as neat as
the Lax-Milgram lemma.
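The equivalence between the weak form and minimization can be watched in one dimension with piecewise-linear finite elements: for A = 1 and a given f, the nodal values solving the tridiagonal weak-form system minimize the discrete quadratic energy. Everything below (mesh, the choice f ≡ 1, the lumped load vector) is an illustrative assumption, not a construction from the text:

```python
import random

# 1D model of Corollary 8.1 with A = 1, f = 1 on (0, 1), u(0) = u(1) = 0:
# minimize J(u) = (1/2) int u'^2 + int f u over piecewise-linear functions.
# The minimizer solves the weak-form system K u = -b.
n = 49                              # interior nodes, h = 1/50
h = 1.0 / (n + 1)
K = [[0.0] * n for _ in range(n)]   # stiffness matrix of -u''
for i in range(n):
    K[i][i] = 2.0 / h
    if i + 1 < n:
        K[i][i + 1] = K[i + 1][i] = -1.0 / h
b = [h for _ in range(n)]           # load vector for f = 1

def solve_tridiag(K, rhs):
    """Thomas algorithm for a tridiagonal system K x = rhs."""
    m = len(rhs)
    low = [K[i][i - 1] for i in range(1, m)]
    diag = [K[i][i] for i in range(m)]
    up = [K[i][i + 1] for i in range(m - 1)]
    rhs = rhs[:]
    for i in range(1, m):
        fac = low[i - 1] / diag[i - 1]
        diag[i] -= fac * up[i - 1]
        rhs[i] -= fac * rhs[i - 1]
    x = [0.0] * m
    x[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        x[i] = (rhs[i] - up[i] * x[i + 1]) / diag[i]
    return x

def J(u):
    """Discrete quadratic energy (1/2) u^T K u + b^T u."""
    Ku = [sum(K[i][j] * u[j] for j in range(n)) for i in range(n)]
    return 0.5 * sum(u[i] * Ku[i] for i in range(n)) + sum(b[i] * u[i] for i in range(n))

u_star = solve_tridiag(K, [-x for x in b])   # weak form: K u + b = 0

# exact solution of -u'' + 1 = 0 is u = (x^2 - x)/2; nodal values are exact here
assert abs(u_star[24] + 0.125) < 1e-9        # node at x = 1/2

random.seed(0)                               # any perturbation raises the energy
for _ in range(5):
    V = [random.uniform(-1.0, 1.0) for _ in range(n)]
    assert J(u_star) <= J([u_star[i] + 0.1 * V[i] for i in range(n)]) + 1e-12
```

The solved nodal vector is simultaneously the unique energy minimizer and the unique solution of the discrete weak form, mirroring Corollary 8.1.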
8.3 Scalar, Multidimensional Variational Problems

We are now ready to tackle scalar, multi-dimensional variational problems of the
form

I(u) = ∫_Ω F(x, u(x), ∇u(x)) dx                          (8.2)

under typical constraints prescribing boundary values

u = u_0 on ∂Ω                                            (8.3)

for a fixed, given function u_0. Since such variational problems will be set up in
Sobolev spaces W^{1,p}(Ω) for exponents p > 1, the underlying set Ω will always
be taken to be a bounded domain according to Definition 7.2. Moreover, the
function u_0 will belong to the same space W^{1,p}(Ω), and competing functions
u ∈ W^{1,p}(Ω) will be further restricted by putting

u − u_0 ∈ W_0^{1,p}(Ω).

This is our formal way to enforce (8.3). More specifically, we will be concerned
with the variational problem

Minimize in u ∈ A:   I(u) = ∫_Ω F(x, u(x), ∇u(x)) dx

where A ⊂ W^{1,p}(Ω) is a non-empty, weakly closed subset. We would like to
learn what structural conditions on the integrand

F(x, u, 𝐮) : Ω × R × R^N → R,                            (8.4)

and on the feasible set A of competing functions, guarantee the success of the
direct method, Proposition 3.1. In this section we focus on the weak lower
semicontinuity property. This will be a direct corollary of the following general
result.

Theorem 8.1 Let

F(u, v) : R^m × R^n → R

be continuous and bounded from below. Consider the associated integral
functional

E(u, v) = ∫_Ω F(u(x), v(x)) dx

for pairs

(u, v) ∈ L^p(Ω; R^m × R^n).

Then E is strong-weak lower semicontinuous, i.e.

E(u, v) ≤ lim inf_{j→∞} E(u_j, v_j)

whenever

u_j → u in L^p(Ω; R^m),   v_j ⇀ v in L^p(Ω; R^n),        (8.5)

if and only if F(u, ·) is convex for every u ∈ R^m.


Proof We follow along the lines of the similar proof for dimension one, Theorem 4.2.
Suppose we have the convergence (8.5). For each j, by Jensen's inequality we have

F( ũ_j, (1/|Ω̃|) ∫_Ω̃ v_j(x) dx ) ≤ (1/|Ω̃|) ∫_Ω̃ F(ũ_j, v_j(x)) dx

for arbitrary subsets Ω̃ ⊂ Ω, where

ũ_j = (1/|Ω̃|) ∫_Ω̃ u_j(x) dx.

In particular, if {Ω_i}_i is a finite, arbitrary partition of Ω, and

ũ_j^(i) = (1/|Ω_i|) ∫_{Ω_i} u_j(x) dx,

we will have

Σ_i |Ω_i| F( ũ_j^(i), (1/|Ω_i|) ∫_{Ω_i} v_j(x) dx ) ≤ Σ_i ∫_{Ω_i} F(ũ_j^(i), v_j(x)) dx
                                                   ≤ ∫_Ω F(u_j(x), v_j(x)) dx + R_j,

where

R_j = Σ_i R_{j,i},   R_{j,i} = ∫_{Ω_i} |F(u_j(x), v_j(x)) − F(ũ_j^(i), v_j(x))| dx.
If we select the partition {Ω_i}_i of Ω, depending on j, in such a way that

u_j(x) − Σ_i χ_{Ω_i}(x) ũ_j^(i) → 0

pointwise in Ω, then the sequence of functions

r_j(x) = Σ_i χ_{Ω_i}(x) |F(u_j(x), v_j(x)) − F(ũ_j^(i), v_j(x))|,   R_j = ∫_Ω r_j(x) dx,

converges pointwise to zero in Ω. Let E ⊂ Ω be a measurable subset in which the
sequence of remainders {r_j} is uniformly bounded. If all the previous integrals
are restricted to E, then, by the Lebesgue dominated convergence theorem,

R_j(E) ≡ ∫_E r_j(x) dx → 0,

and, after taking limits in j, we find that

∫_E F(u(x), v(x)) dx ≤ lim inf_{j→∞} ∫_E F(u_j(x), v_j(x)) dx.

Since F can be assumed, without loss of generality, to be non-negative, F ≥ 0,
then

∫_E F(u(x), v(x)) dx ≤ lim inf_{j→∞} ∫_Ω F(u_j(x), v_j(x)) dx,

and the arbitrariness of E, filling out all of Ω, leads to the sufficiency.
For the necessity, we invoke a generalization of Example 2.12 in the higher-
dimensional setting. Given a domain Ω ⊂ R^N and a positive volume fraction
t ∈ (0, 1), there is always a sequence of characteristic functions {χ_j(x)} of
subsets of Ω, of measure t|Ω|, such that χ_j ⇀ t. This is left as an exercise. If we
now consider the sequence of pairs

(u, χ_j(x)v_1 + (1 − χ_j(x))v_0),

for vectors u ∈ R^m, v_1, v_0 ∈ R^n, we see that

χ_j(x)v_1 + (1 − χ_j(x))v_0 ⇀ tv_1 + (1 − t)v_0,
and by the strong-weak lower semicontinuity we would conclude that

|Ω| F(u, tv_1 + (1 − t)v_0) ≤ lim inf_{j→∞} ∫_Ω F(u, χ_j(x)v_1 + (1 − χ_j(x))v_0) dx
                            = lim inf_{j→∞} ∫_Ω [χ_j(x)F(u, v_1) + (1 − χ_j(x))F(u, v_0)] dx
                            = |Ω| (tF(u, v_1) + (1 − t)F(u, v_0)). ⨆

Note how this proof has turned out much simpler than its one-dimensional twin,
Theorem 4.2, because in the current higher-dimensional situation we have
decoupled the two sets of variables u and v, i.e. they are unrelated.
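The necessity argument can be made concrete in one dimension. With Ω = (0, 1) and t = 1/2, take v_j = ±1 on alternating intervals of length 1/(2j): then v_j ⇀ 0 weakly but not strongly, and for the non-convex F(v) = (v² − 1)² the functional drops below its value at the weak limit, so strong-weak lower semicontinuity fails; a convex integrand behaves as Theorem 8.1 predicts. A small numerical illustration (the quadrature rule and the specific F, G are our own choices):

```python
import math

# Oscillation on Omega = (0, 1): v_j = +/-1 on alternating intervals of length
# 1/(2j) (characteristic-function construction with t = 1/2), so v_j -> 0
# weakly but not strongly.
def v_j(x, j):
    return 1.0 if int(2 * j * x) % 2 == 0 else -1.0

def integrate(g, n=20000):
    """Midpoint rule on (0, 1)."""
    return sum(g((k + 0.5) / n) for k in range(n)) / n

j = 100

# weak convergence to 0: the average against a test function is nearly 0
assert abs(integrate(lambda x: v_j(x, j) * math.sin(math.pi * x))) < 1e-2

# non-convex F(v) = (v^2 - 1)^2: E(v_j) = 0 for every j while E(0) = 1, so
# E(weak limit) > lim inf E(v_j): weak lower semicontinuity fails
F = lambda v: (v * v - 1.0) ** 2
assert integrate(lambda x: F(v_j(x, j))) == 0.0
assert integrate(lambda x: F(0.0)) == 1.0

# convex G(v) = v^2: E(weak limit) = 0 <= lim E(v_j) = 1, as the theorem says
G = lambda v: v * v
assert integrate(lambda x: G(0.0)) <= integrate(lambda x: G(v_j(x, j)))
```

The oscillating sequence is exactly the kind of minimizing sequence that a non-convex integrand produces, with no strong limit to pass the energy to.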
Corollary 8.3 Suppose the integrand in (8.4) is measurable in x, continuous in
pairs (u, 𝐮), and bounded from below. Then functional (8.2) is weakly lower
semicontinuous in W^{1,p}(Ω) if and only if F is convex in 𝐮 for a.e. x ∈ Ω and
every u ∈ R.

Proof The sufficiency part of this corollary is a direct consequence of the
previous theorem if we identify

U_j(x) = (x, u_j(x))

in such a way that the strong convergence of {U_j} in L^p(Ω; R^{N+m}) is equivalent
to the convergence of {u_j} in L^p(Ω; R^m). The necessity is a bit more involved
than the property shown in Theorem 8.1, precisely because of the remark made
right after its proof. But that remark is a clear indication that the statement should
be correct; and indeed it is. Its proof would require finer calculations, as in the
proof of Theorem 4.2. We do not insist on this point because our main interest is
in the sufficiency part. ⨆

8.4 A Main Existence Theorem

Corollary 8.3 is the cornerstone of a general existence theorem for optimal
solutions of variational problems with an integral functional of type (8.2),

I(u) = ∫_Ω F(x, u(x), ∇u(x)) dx,

under prescribed boundary values on ∂Ω for admissible u's, and additional
assumptions ensuring coercivity in appropriate Sobolev spaces.

Theorem 8.2 Let

F(x, u, 𝐮) : Ω × R × R^N → R
be measurable with respect to x, and continuous with respect to pairs (u, 𝐮).
Suppose, in addition, that:
1. there are p > 1 and C > 0 with

   C(|𝐮|^p − 1) ≤ F(x, u, 𝐮)                              (8.6)

   for every triplet (x, u, 𝐮);
2. the dependence

   𝐮 |→ F(x, u, 𝐮)

   is convex for every pair (x, u).
Then, for every u_0 ∈ W^{1,p}(Ω) furnishing boundary values, the variational
problem

Minimize in u ∈ W^{1,p}(Ω):   I(u)

under

u − u_0 ∈ W_0^{1,p}(Ω)

admits optimal solutions.


Proof The proof reduces to the use of the direct method, Proposition 3.1, together
with Corollary 8.3 for the weak lower semicontinuity.
• Note how the coercivity condition (8.6), in addition to the Dirichlet boundary
  condition u − u_0 ∈ W_0^{1,p}(Ω) imposed on competing functions, implies that
  minimizing sequences are uniformly bounded in W^{1,p}(Ω). In fact, if {u_0 + u_j}
  is minimizing with

  {u_j} ⊂ W_0^{1,p}(Ω),

  then

  ||∇u_0 + ∇u_j||^p_{L^p(Ω)} ≤ 1 + (1/C) I(u_0 + u_j),

  if I is the corresponding integral functional. Thus, if {u_0 + u_j} is truly
  minimizing with a value m ∈ R for the infimum,

  ||∇u_0 + ∇u_j||_{L^p(Ω)} ≤ ( 1 + (m + 1)/C )^{1/p},
and

||∇u_j||_{L^p(Ω)} ≤ ||∇u_0 + ∇u_j||_{L^p(Ω)} + ||∇u_0||_{L^p(Ω)}
                ≤ ( 1 + (m + 1)/C )^{1/p} + ||∇u_0||_{L^p(Ω)},

for all j large, with a constant on the right-hand side independent of j.
• Because p > 1, there are subsequences converging weakly in W^{1,p}(Ω) and
  strongly in L^p(Ω) to limit functions u that, therefore, comply with the same
  boundary condition, and are hence admissible.
• The convexity of the integrand with respect to the gradient variable yields,
  according to Corollary 8.3, a functional I which is weakly lower
  semicontinuous in W^{1,p}(Ω).
We are therefore in possession of all the required ingredients for the direct
method, Proposition 3.1, to work, and conclude the existence of minimizers. ⨆

Boundary conditions allow for some variations. Some of these are proposed in the
exercises, as well as specific problems for concrete integrands. One important
such situation is that of the natural boundary condition (Exercise 1). Another
important, even more general situation corresponds to a feasible set A which is
weakly closed in the appropriate space. Such a subset A must be weakly closed to
ensure that weak limits of sequences in A stay within A. A paradigmatic situation
corresponds to the classical obstacle problem, Exercise 29. We will see some
other relevant situations in the next chapter.
Uniqueness of optimal solutions requires, as usual, more demanding conditions
on the integrand: joint strict convexity with respect to pairs (u, 𝐮). In general, it
is not sufficient to have strict convexity only with respect to 𝐮. As a matter of
fact, lack of strict, or even plain, convexity with respect to u, while keeping the
same strictly convex dependence on 𝐮, yields a lot of flexibility to study
interesting problems.

Proposition 8.1 Suppose, in addition to the hypotheses in Theorem 8.2, that the
integrand F is jointly strictly convex with respect to pairs (u, 𝐮). Then there is a
unique optimal solution of the corresponding variational problem under given
Dirichlet boundary conditions determined by u_0 ∈ W^{1,p}(Ω).
The proof is similar to that of Proposition 3.3.

8.5 Optimality Conditions: Weak Solutions for PDEs

We now come to the last of the main issues concerning variational problems, that
of deriving optimality conditions that optimal solutions of problems must comply
with precisely because they are optimal. As usual, the natural scenario to derive
optimality is that of Hilbert spaces, though it can be generalized to Banach spaces.
For this reason, we restrict attention in this section to the case p = 2, and so we
will be working in the space H^1(Ω). For the sake of simplicity, we avoid the
irrelevant dependence of the integrand F(x, u, 𝐮) on x. We are not writing the
most general conditions possible, but will be content with understanding the
nature of such optimality conditions. Note that no convexity is assumed in the
next statement of optimality conditions. It is a statement of necessary conditions.
Theorem 8.3 Suppose the integrand

F(u, 𝐮) : R × R^N → R

is continuously differentiable with partial derivatives

F_u(u, 𝐮),   F_𝐮(u, 𝐮),

and, for some constant C > 0,

|F(u, 𝐮)| ≤ C(|u|^2 + |𝐮|^2),
|F_u(u, 𝐮)| ≤ C(|u| + |𝐮|),   |F_𝐮(u, 𝐮)| ≤ C(|u| + |𝐮|).

If a function v ∈ H^1(Ω) is a minimizer for the problem

Minimize in u ∈ H^1(Ω):   I(u) = ∫_Ω F(u(x), ∇u(x)) dx

under

u − u_0 ∈ H_0^1(Ω),   u_0 ∈ H^1(Ω), given,

then

∫_Ω [F_u(v(x), ∇v(x))V(x) + F_𝐮(v(x), ∇v(x)) · ∇V(x)] dx = 0        (8.7)

for every V ∈ H_0^1(Ω).


Condition (8.7) is the weak form of the PDE known as the Euler-Lagrange
equation of the problem:

−div[F_𝐮(v(x), ∇v(x))] + F_u(v(x), ∇v(x)) = 0 in Ω,   v = u_0 on ∂Ω.

Note how (8.7) is precisely what you find when you formally multiply this
differential equation by an arbitrary V ∈ H_0^1(Ω), and perform an integration by
parts, as usual. Solutions v of (8.7) are called critical functions (of the
corresponding functional).
Proof Note that, under the bounds assumed on the sizes of F(u, 𝐮) and its partial
derivatives F_u(u, 𝐮) and F_𝐮(u, 𝐮), the integral in (8.7) is well-defined for every
V ∈ H_0^1(Ω).
The proof of (8.7) is very classical. It follows the basic strategy in the
corresponding abstract result, Proposition 3.4. Let v ∈ H^1(Ω) be a true minimizer
for the variational problem in the statement, and let V ∈ H_0^1(Ω) be an arbitrary
element. For every real s ∈ R, the section

g(s) = I(v + sV)

is a function with a global minimum at s = 0. If g is differentiable, then we
should have g'(0) = 0. We have

g(s) = ∫_Ω F(v(x) + sV(x), ∇v(x) + s∇V(x)) dx.

Since the integrand F(u, 𝐮) is continuously differentiable, and the result of the
formal differentiation with respect to s,

∫_Ω [ F_u(v(x) + sV(x), ∇v(x) + s∇V(x))V(x)
      + F_𝐮(v(x) + sV(x), ∇v(x) + s∇V(x)) · ∇V(x) ] dx,

is well-defined because of our upper bounds on the sizes of F and its partial
derivatives, it is easy to argue that g is indeed differentiable, with derivative given
by this last integral. The condition for a global minimum, g'(0) = 0, then becomes
exactly (8.7). ⨆
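The key step in this proof, that g'(0) equals the weak-form integral (8.7), can be verified numerically for a concrete (not necessarily optimal) v and test function V. Here we pick, purely for illustration, F(u, 𝐮) = (1/2)|𝐮|² + (1/4)u⁴ on Ω = (0, 1), so F_u = u³ and F_𝐮 = 𝐮:

```python
import math

v = lambda x: math.sin(math.pi * x)            # arbitrary smooth choice
dv = lambda x: math.pi * math.cos(math.pi * x)
V = lambda x: x * (1.0 - x)                    # V vanishes at 0 and 1
dV = lambda x: 1.0 - 2.0 * x

n = 4000                                       # midpoint quadrature points

def I(s):
    """I(v + sV) for F(u, u') = (1/2)u'^2 + (1/4)u^4."""
    total = 0.0
    for k in range(n):
        x = (k + 0.5) / n
        u, du = v(x) + s * V(x), dv(x) + s * dV(x)
        total += (0.5 * du * du + 0.25 * u ** 4) / n
    return total

# weak-form integral (8.7): int [F_u(v, v') V + F_u'(v, v') V'] dx
weak = 0.0
for k in range(n):
    x = (k + 0.5) / n
    weak += (v(x) ** 3 * V(x) + dv(x) * dV(x)) / n

eps = 1e-5
g_prime = (I(eps) - I(-eps)) / (2 * eps)       # central difference for g'(0)
assert abs(g_prime - weak) < 1e-6
assert 1.0 < weak < 2.0                        # g'(0) != 0: this v is not critical
```

The agreement is exact up to the finite-difference error, since differentiating the quadrature sum in s commutes with the finite sum; note that g'(0) ≠ 0 here, so this particular v is not a critical function.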

A final fundamental issue is that of the sufficiency of optimality conditions.
Suppose we have a function v ∈ H^1(Ω) for which (8.7) holds for every V ∈
H_0^1(Ω). Under what conditions on the integrand F(u, 𝐮) can we be sure that v is
in fact a minimizer of the corresponding variational problem? This again involves
convexity. One main line of research, which we will barely touch upon in the next
chapter, is to derive results on the existence of critical functions which are not
minimizers, for different families of integral functionals.
Theorem 8.4 Suppose the integrand

F(u, 𝐮) : R × R^N → R

is as indicated in Theorem 8.3, and that it is jointly convex in pairs (u, 𝐮).
Suppose that a certain function v ∈ H^1(Ω), with v − u_0 ∈ H_0^1(Ω) and u_0
prescribed, is such that (8.7) holds for every V ∈ H_0^1(Ω). Then v is a minimizer
for the problem

Minimize in u ∈ H^1(Ω):   I(u) = ∫_Ω F(u(x), ∇u(x)) dx
under

u − u_0 ∈ H_0^1(Ω).

Proof The proof is essentially the one for Theorem 4.5 adapted to a
multidimensional situation. Suppose the feasible function v is such that (8.7)
holds for every V ∈ H_0^1(Ω). The sum v + V is also feasible for the variational
problem, and, because of the convexity assumed on F, for a.e. x ∈ Ω,

F(v(x) + V(x), ∇v(x) + ∇V(x)) ≥ F(v(x), ∇v(x))
                               + F_u(v(x), ∇v(x))V(x)
                               + F_𝐮(v(x), ∇v(x)) · ∇V(x).

Upon integration in x, taking into account (8.7), we see that I(v + V) ≥ I(v) for
arbitrary V ∈ H_0^1(Ω), and v is indeed a minimizer of the problem. ⨆

8.6 Variational Problems in Action

We have seen that the two main ingredients that ensure existence of optimal
solutions for a typical variational problem associated with an integral functional

∫_Ω F(x, u(x), ∇u(x)) dx,   F(x, u, 𝐮) : Ω × R × R^N → R,

are the convexity and coercivity of the integrand F(x, u, 𝐮) with respect to the
gradient variable 𝐮. Though readers may think that, once this is understood, one
can be reputed to know a lot about variational problems, the truth is that the
application of Theorem 8.2 to concrete examples might turn out to be more
challenging than anticipated.
One needs to deal with convexity in the most efficient way, and this requires
knowing the main operations among functions that preserve convexity, in order to
have a clear picture of how to build more sophisticated convex functions from
simple ones. We assume that readers are familiar with the use of the positivity of
the hessian to check convexity in smooth cases. This is taught in Multivariate
Calculus courses. Oftentimes, however, the application of such a criterion, though
clear as a procedure, may not be so easy to implement. For instance, the function

F(𝐮) = (1/2)|𝐮|^2 + |𝐮| 𝐮 · v,   v ∈ R^N,  |v| ≤ 1/2,

is convex, but this is not so straightforward to check. The same comments apply
to coercivity. At times, convexity may be quite straightforward, and it is
coercivity
that turns out to be more involved. We aim at giving some practical hints that
may help in dealing with examples.
Recall again (Definition 4.1) that a function

φ : D ⊂ R^N → R

is convex if D is a convex set, and

φ(t_1 x_1 + t_2 x_2) ≤ t_1 φ(x_1) + t_2 φ(x_2),   x_1, x_2 ∈ D,  t_1, t_2 ≥ 0,  t_1 + t_2 = 1.

If there is no mention of the set D, one considers it to be the whole space R^N or
the natural domain of definition of φ (which must be a convex set).
The following facts are elementary, but they are the principles used to build new
convex functions from old ones. We have already mentioned them in Sect. 4.1.

Proposition 8.2
1. Every linear (affine) function is convex (and concave).
2. A linear combination of convex functions with positive scalars is convex. In
   particular, the sum of convex functions is convex, and so is the product by a
   positive number.
3. The composition of a convex function with an increasing, convex function of
   one variable is convex.
4. The supremum of convex functions is convex.
We already know (Proposition 4.1) that, when φ is convex, then

C_L φ = sup{ψ : ψ ≤ φ, ψ linear} ≡ φ.

If it is not, we isolate the following important concept.

Definition 8.1 For a function φ(x) : R^N → R, we define the convex function

C_L φ ≡ C_C φ ≡ Cφ

as its convexification. It is the largest convex function not bigger than φ itself.
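For a function of one variable, the convexification can be computed as the lower convex hull of the graph. A small sketch for the double-well φ(t) = (t² − 1)², whose convexification is 0 on [−1, 1] and φ itself outside; the sampling grid and Andrew's monotone-chain hull routine are our own illustrative choices:

```python
# Convexification of phi(t) = (t^2 - 1)^2 on [-2, 2] via the lower convex hull
# of sampled graph points.
def lower_hull(points):
    """Lower convex hull (Andrew's monotone chain); points sorted by x."""
    hull = []
    for px, py in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # pop the middle point if it lies on or above the chord to (px, py)
            if (x2 - x1) * (py - y1) - (px - x1) * (y2 - y1) <= 0:
                hull.pop()
            else:
                break
        hull.append((px, py))
    return hull

phi = lambda t: (t * t - 1.0) ** 2
ts = [-2.0 + 4.0 * k / 4000 for k in range(4001)]    # includes -1, 0, 1 exactly
hull = lower_hull([(t, phi(t)) for t in ts])

def C_phi(t):
    """Evaluate the piecewise-linear convexification at t in [-2, 2]."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= t <= x2:
            return y1 + (y2 - y1) * (t - x1) / (x2 - x1)

assert abs(C_phi(0.0)) < 1e-9                # C phi = 0 on [-1, 1]
assert abs(C_phi(1.5) - phi(1.5)) < 1e-9     # C phi = phi where phi is convex
assert phi(0.0) == 1.0                       # phi sits strictly above C phi at 0
```

The flat segment of the hull between (−1, 0) and (1, 0) is exactly the region where φ fails to be convex and the convexification replaces it by an affine function.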
Concerning coercivity, we just want to mention a main principle that plays an
important role in many different contexts, whenever some contributions in
inequalities need to be absorbed by other terms. It is just a simple generalization
of inequality (4).

Lemma 8.1 For ε > 0 and u, v ∈ R^N,

u · v ≤ (ε^2/2)|u|^2 + (1/(2ε^2))|v|^2.
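The lemma is the classical ε-version of Young's inequality and follows from expanding 0 ≤ |εu − v/ε|². A quick randomized spot-check:

```python
import random

# Lemma 8.1: u . v <= (eps^2/2)|u|^2 + (1/(2 eps^2))|v|^2, a consequence of
# 0 <= |eps u - v/eps|^2 = eps^2 |u|^2 - 2 u.v + |v|^2 / eps^2.
random.seed(1)
for _ in range(1000):
    u = [random.uniform(-10, 10) for _ in range(3)]
    v = [random.uniform(-10, 10) for _ in range(3)]
    eps = random.uniform(0.1, 10.0)
    dot = sum(a * b for a, b in zip(u, v))
    rhs = (eps ** 2 / 2) * sum(a * a for a in u) \
        + sum(b * b for b in v) / (2 * eps ** 2)
    assert dot <= rhs + 1e-6
```

Choosing ε small makes the |u|² contribution as small as desired, at the price of a large constant in front of |v|²: exactly how one term "absorbs" another in coercivity estimates.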
Some examples follow.

Example 8.1 The size of a vector .u →


| |u| is a convex function. Indeed, because

.|u| = sup v · u,
|v|=1

it is the supremum of linear functions. Hence, all functions of the form .φ(u) =
f (|u|) for a non-decreasing, convex function f of a single variable, are convex. In
particular,

|u|p , p ≥ 1,
. 1 + |u|q , q ≥ 2.

Example 8.2 If ψ(𝐮) is convex, then φ(𝐮) = ψ(A𝐮) is convex as well, for every
matrix A. In this way the functions

|A𝐮|^p, p ≥ 1,   1 + |A𝐮|^q, q ≥ 2,

are convex too.

Example 8.3 This is the function written above:

F(𝐮) = (1/2)|𝐮|^2 + |𝐮| 𝐮 · v,   v ∈ R^N,  |v| ≤ 1/2.        (8.8)

One first realizes that, for an additional constant vector w ∈ R^N, the quadratic
function

𝐮 |→ (1/2)|𝐮|^2 + (𝐮 · w)(𝐮 · v) = 𝐮^T ( (1/2)1 + w ⊗ v ) 𝐮

is convex (because the quadratic form is positive semidefinite) provided |w| ≤ 1,
|v| ≤ 1/2. Hence, if we take the supremum in w, the resulting function is convex
too, i.e. F(𝐮) in (8.8) is convex.

8.7 Some Examples

Quadratic variational problems are associated with linear Euler-Lagrange
equations, and can be tackled within the scope of the Lax-Milgram lemma, as we
have already seen. In this section, we would like to mention some non-linear
problems that come directly from our basic results for scalar, multidimensional
variational problems. Some examples are based on the ones in the last section.
1. One first easy example is the one corresponding to the integrand

   F(x, u, 𝐮) = (1/2)|𝐮|^2 + |𝐮| 𝐮 · F(x)
   for a given field F(x) such that

   |F(x)| ≤ 1/2,   x ∈ Ω.

   It admits the variant

   F(x, u, 𝐮) = (1/2)|𝐮|^2 + √(1 + |𝐮|^2) 𝐮 · F(x).

   The existence of minimizers under standard Dirichlet boundary conditions is
   straightforward. Coercivity can be treated through Lemma 8.1. The associated
   Euler-Lagrange equations are a bit intimidating. In the first case, the equation
   can be written in the form

   div[ ∇u + |∇u| ( 1 + (∇u/|∇u|) ⊗ (∇u/|∇u|) ) F ] = 0,

   where the rank-one matrix u ⊗ v, for two vectors u, v, is given by

   (u ⊗ v)_{ij} = u_i v_j.

   Note that the existence of a unique minimizer for the integral functional with
   one of these two integrands immediately yields the existence of (weak)
   solutions of, for example, this last non-linear PDE.
2. Our second example is of the form

   F(x, u, 𝐮) = (1/2)(|𝐮| + f(x))^2 + u g(x),

   with f ∈ L^1(Ω) and g ∈ L^2(Ω). Once again, the existence of a unique
   minimizer does not require any special argument, and such a minimizer is,
   therefore, a (weak) solution of the non-linear PDE

   div( ∇u + f ∇u/|∇u| ) = g

   under typical boundary conditions.
3. Consider

   F(x, u, 𝐮) = (1/2)|𝐮|^2 + f(u),

   where f(u) is a real, non-negative, continuous function. The existence of
   minimizers is straightforward. Without further conditions on the function f, it
   is not possible to guarantee uniqueness. Compare the two cases

   f(u) = (1/2)u^2,   f(u) = (1/2)(|u| − 1)^2.
4. For an exponent p greater than 1, but different from 2, the parallel functional
   for the pth-Dirichlet problem is

   Minimize in u(x) ∈ W^{1,p}(Ω):   (1/p) ∫_Ω |∇u(x)|^p dx

   under prescribed boundary conditions u − u_0 ∈ W_0^{1,p}(Ω), for a given
   u_0 ∈ W^{1,p}(Ω). The associated Euler-Lagrange equation reads

   div(|∇u|^{p−2} ∇u) = 0 in Ω.

   This non-linear PDE is known as the pth-laplacian equation, for obvious
   reasons, and it is probably the best studied one after Laplace's. Even for the
   case p = 1 a lot is known, though the analysis is much more delicate. The
   existence of a unique minimizer for the underlying variational problem is,
   however, a direct consequence of our analysis in this chapter.
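In one dimension the pth-laplacian is easy to test: with boundary values u(0) = 0, u(1) = 1, the flux |u'|^{p−2}u' must be constant, so u(x) = x for every p. Plain gradient descent on the discretized p-Dirichlet energy (p = 4 here; mesh, step size, and iteration count are illustrative choices) recovers this:

```python
# Discrete p-Dirichlet energy, p = 4:  sum_i (h/4) |(u_{i+1} - u_i)/h|^4,
# with u_0 = 0, u_n = 1 fixed. By strict convexity and Jensen's inequality its
# unique minimizer has equal slopes, i.e. it is the linear interpolant u_i = ih.
n = 10
h = 1.0 / n
u = [(i * h) ** 2 for i in range(n + 1)]      # non-linear initial guess

def grad(u):
    """Gradient of the discrete energy with respect to interior nodes."""
    g = [0.0] * (n + 1)
    for i in range(1, n):
        sl = (u[i] - u[i - 1]) / h            # slope on the left element
        sr = (u[i + 1] - u[i]) / h            # slope on the right element
        g[i] = sl ** 3 - sr ** 3
    return g

step = 0.002
for _ in range(50000):
    g = grad(u)
    for i in range(1, n):
        u[i] -= step * g[i]

assert max(abs(u[i] - i * h) for i in range(n + 1)) < 1e-6
```

The same descent works for any p > 1 after replacing the cubes by |s|^{p−2}s; for p near 1 the energy flattens and convergence degrades, a numerical shadow of the delicate p = 1 theory mentioned above.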
5. The functional for non-parametric minimal surfaces

   ∫_Ω √(1 + |∇u(x)|^2) dx

   faces the same delicate issue of linear growth, so that W^{1,1}(Ω) is the natural
   space. It is also a problem that requires a lot of expertise. One can always try
   a small regularization of the form

   ∫_Ω [ (ε/2)|∇u|^2 + √(1 + |∇u(x)|^2) ] dx

   for some small, positive parameter ε. Or, even better,

   F_ε(𝐮) = √(1 + |𝐮|^2)   for |𝐮| ≤ ε^{−1},
   F_ε(𝐮) = ε|𝐮|^2 + ((1 + ε^2)^{−1/2} − 2)|𝐮| + ε(1 + ε^2)^{−1/2} + ε^{−1}   for |𝐮| ≥ ε^{−1}.

   The form of this integrand has been adjusted in the zone |𝐮| ≥ ε^{−1} in such a
   way that it has quadratic growth, and the overall resulting integrand F_ε turns
   out to be at least C^1. The variational problem for F_ε has a unique minimizer
   u_ε. This is the easy part. The whole point of such a regularization is to
   investigate the behavior of the minimizers u_ε as ε tends to zero. This is
   typically reserved for a much more specialized analysis.
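That F_ε is C¹ can be checked by matching the value and the derivative of the two branches at the junction t = 1/ε, working with the radial profile t = |𝐮| (which is all that matters for smoothness across the junction). A quick verification for several ε:

```python
import math

# F_eps as a function of t = |u|: sqrt branch for t <= 1/eps, quadratic branch
# for t >= 1/eps. Both value and derivative match at the junction t = 1/eps.
def branches(eps):
    t = 1.0 / eps
    c = (1.0 + eps ** 2) ** (-0.5)
    inner = math.sqrt(1.0 + t * t)                 # sqrt(1 + t^2)
    outer = eps * t * t + (c - 2.0) * t + eps * c + 1.0 / eps
    d_inner = t / math.sqrt(1.0 + t * t)           # derivative of sqrt branch
    d_outer = 2.0 * eps * t + c - 2.0              # derivative of quadratic
    return inner, outer, d_inner, d_outer

for eps in (0.5, 0.1, 0.01):
    fi, fo, di, do = branches(eps)
    assert abs(fi - fo) < 1e-8 and abs(di - do) < 1e-8
```

Both branches give the common value √(1 + ε²)/ε and the common derivative (1 + ε²)^{−1/2} at t = 1/ε, so the coefficients printed above indeed glue the quadratic extension onto the minimal-surface integrand in a C¹ way.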
6. Our final examples are of the form

   F(u, 𝐮) = (a(u)/2)|𝐮|^2,   F(u, 𝐮) = 𝐮^T A(u)𝐮,

   where the respective matrices involved,

   A(u) = (1/2)a(u)1,   A(u),

   need to be, regardless of their dependence on u, uniformly bounded and
   positive definite. The existence of minimizers can then be derived directly
   from our results.

8.8 Higher-Order Variational Principles

Once we have the spaces W^{2,p}(Ω) at our disposal, we can treat variational
problems of second order, whose model problem is of the form

Minimize in u ∈ W^{2,p}(Ω):   I(u) = ∫_Ω F(x, u(x), ∇u(x), ∇^2 u(x)) dx,

typically under boundary conditions on competing functions that might involve u
and/or ∇u on ∂Ω. In full generality, the density

F(x, u, 𝐮, U) : Ω × R × R^N × R^{N×N} → R                    (8.9)

is supposed to be, at least, continuous in triplets (u, 𝐮, U) and measurable in
x ∈ Ω. Here Ω ⊂ R^N is a smooth, regular, bounded domain.

We could certainly cover, for second-order problems, facts similar to the ones
stated and proved in Sects. 8.3, 8.4, and 8.5 for first-order problems, but we trust,
just as we did with higher-order Sobolev spaces, that the experience and maturity
gained with first-order problems will be sufficient to clearly see the validity of the
following result, which is a way to summarize the main points about a problem
like our model problem above. It is by no means the most general such result.
There might be several variants concerning constraints to be respected, either as
boundary conditions on ∂Ω, or otherwise. Though it is a long statement, we
believe it is a good way to sum up the fundamental facts.
We are given a density F as in (8.9), measurable in the variable x and continuous
in triplets (u, 𝐮, U). We consider the corresponding variational problem

Minimize in u ∈ A ⊂ W^{2,p}(Ω):   I(u) = ∫_Ω F(x, u(x), ∇u(x), ∇^2 u(x)) dx,

where we have added the restriction set A, typically incorporating boundary
conditions for feasible functions.

Theorem 8.5
1. Suppose that
   (a) there are a constant c > 0 and an exponent p > 1 such that

       c(|U|^p − 1) ≤ F(x, u, 𝐮, U),

       or, more generally,

       c( ||u||_{W^{2,p}(Ω)} − 1 ) ≤ I(u);

   (b) the function

       U |→ F(x, u, 𝐮, U)

       is convex, for every fixed (x, u, 𝐮);
   (c) the class A is weakly closed as a subset of W^{2,p}(Ω).
   Then our variational problem admits optimal solutions.
2. Suppose the integrand

   F(u, 𝐮, U) : R × R^N × R^{N×N} → R

   is continuously differentiable with partial derivatives

   F_u(u, 𝐮, U),   F_𝐮(u, 𝐮, U),   F_U(u, 𝐮, U)

   and, for some constant C > 0,

   |F(u, 𝐮, U)| ≤ C(|U|^2 + |𝐮|^2 + |u|^2),   |F_u(u, 𝐮, U)| ≤ C(|U| + |𝐮| + |u|),
   |F_𝐮(u, 𝐮, U)| ≤ C(|U| + |𝐮| + |u|),   |F_U(u, 𝐮, U)| ≤ C(|U| + |𝐮| + |u|).

   If a function v ∈ H^2(Ω), with v − u_0 ∈ H_0^2(Ω), is a minimizer for the
   problem

   Minimize in u ∈ H^2(Ω):   I(u) = ∫_Ω F(u(x), ∇u(x), ∇^2 u(x)) dx

   under

   u − u_0 ∈ H_0^2(Ω),   u_0 ∈ H^2(Ω), given,
   then

   ∫_Ω [ F_u(v(x), ∇v(x), ∇^2 v(x))V(x) + F_𝐮(v(x), ∇v(x), ∇^2 v(x)) · ∇V(x)
         + F_U(v(x), ∇v(x), ∇^2 v(x)) : ∇^2 V(x) ] dx = 0

   for every V ∈ H_0^2(Ω), i.e. v is a weak solution of the fourth-order problem

   div[div F_U(v, ∇v, ∇^2 v)] − div F_𝐮(v, ∇v, ∇^2 v) + F_u(v, ∇v, ∇^2 v) = 0 in Ω,
   v = u_0,  ∇v = ∇u_0 on ∂Ω.

3. Assume the integrand F(u, 𝐮, U) and its partial derivatives

   F_u(u, 𝐮, U),   F_𝐮(u, 𝐮, U),   F_U(u, 𝐮, U)

   enjoy the properties of the previous item and that, in addition, F is convex
   with respect to triplets (u, 𝐮, U). If a certain function v ∈ H^2(Ω), complying
   with v − u_0 ∈ H_0^2(Ω) and u_0 prescribed, is a weak solution of the
   corresponding fourth-order Euler-Lagrange equation just indicated in the
   previous item, then v is a minimizer of the associated variational problem.
4. If, in addition to all of the hypotheses indicated in each particular situation,
   the integrand F is strictly convex in (u, 𝐮, U), there is a unique minimizer and
   a unique weak solution of the Euler-Lagrange equation, and they are the same
   function.
We will be content with explicitly looking at the most important such example:
the bi-laplacian.

Example 8.4 For a regular, bounded domain Ω ⊂ R^N, and a given function
u_0 ∈ H^2(Ω), we want to briefly study the second-order variational problem

Minimize in u ∈ H^2(Ω):   I(u) = (1/2) ∫_Ω |Δu(x)|^2 dx

under the typical Dirichlet boundary conditions

u − u_0 ∈ H_0^2(Ω).

Recall that

Δu(x) = Σ_{i=1}^N ∂^2 u/∂x_i^2 (x),
and, hence, the integrand we are dealing with is

F(x, u, 𝐮, U) = (1/2)(tr U)^2.

It is elementary to realize that the associated Euler-Lagrange problem is
concerned with the bi-laplacian or bi-harmonic operator

Δ^2 u = 0 in Ω,   u = u_0,  ∂u/∂n = ∂u_0/∂n on ∂Ω.           (8.10)

By introducing v = u − u_0 ∈ H_0^2(Ω), so that u = v + u_0, and resetting notation
v |→ u, we see that we can work instead, without loss of generality, with a
variational problem with integrand

F(x, u, 𝐮, U) = (1/2)(tr U)^2 + a(x) tr U + F_0(x, u(x), ∇u(x))      (8.11)

but vanishing boundary conditions u ∈ H_0^2(Ω). For the sake of simplicity, we
will simply take F_0 ≡ 0, and will take a(x) ∈ L^2(Ω). We will, therefore, focus
on the optimization problem

Minimize in u(x) ∈ H_0^2(Ω):   I(u) = ∫_Ω [ (1/2)|Δu(x)|^2 + a(x)Δu(x) ] dx.   (8.12)

We will take advantage of this special example to stress the following general
fact. If the integrand F of an integral functional I is coercive and (strictly)
convex, then I inherits these properties from F. But an integral functional I may
be coercive and strictly convex in its domain of definition while neither of these
properties holds for the integrand of I pointwise.
It is clear that the integrand F in (8.11) is neither coercive in the variable U nor
strictly convex, because some of the entries of the hessian of u do not occur in F.
Yet we check below that the functional I itself is both coercive and strictly
convex in H_0^2(Ω).

Proposition 8.3 The functional I : H_0^2(Ω) → R in (8.12) is coercive, smooth,
and strictly convex. Consequently, there is a unique minimizer for (8.12) and for
its corresponding Euler-Lagrange problem (i.e. for (8.10)).
Proof The proof only requires one important, but elementary, fact: for every
u ∈ H_0^2(Ω), it is true that

∫_Ω |∇^2 u(x)|^2 dx = ∫_Ω Δu(x)^2 dx.                        (8.13)
266 8 Variational Problems

A more general form of this equality is proved in the next chapter. This identity implies that we can also represent our functional in (8.12) by

    I(u) = ∫_Ω [ (1/2)|∇²u(x)|² + a(x)Δu(x) ] dx.

But the integrand for this representation is

    F̃(x, u, u, U) = (1/2)|U|² + a(x) tr U,

which is point-wise coercive and strictly convex in the variable U. We then conclude by Theorem 8.5.
To show (8.13), notice that, through a standard density argument, given that C_c^∞(Ω) is, by definition (Definition 7.6), dense in H₀²(Ω), it suffices to check that for every pair i, j, and for every u ∈ C_c^∞(Ω), it is true that

    ∫_Ω (∂/∂x_i)(∂u/∂x_j) · (∂/∂x_j)(∂u/∂x_i) dx = ∫_Ω (∂²u/∂x_j²)(∂²u/∂x_i²) dx.

This identity is checked immediately after two integrations by parts (or just by using the product rule for smooth functions), bearing in mind that the boundary values of u and ∇u vanish around ∂Ω. □
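Identity (8.13) can also be verified symbolically on a concrete example; we take u = (1 − x² − y²)² on the unit disk B ⊂ R² (our own choice, not the text's), which lies in H₀²(B) since both u and ∇u vanish on ∂B.

```python
import sympy as sp

x, y, r, t = sp.symbols('x y r t', real=True)
u = (1 - x**2 - y**2)**2   # u and ∇u vanish on ∂B, so u ∈ H²₀(B)

hess_sq = sum(h**2 for h in sp.hessian(u, (x, y)))   # |∇²u|², Frobenius norm squared
lap_sq = (sp.diff(u, x, 2) + sp.diff(u, y, 2))**2    # (Δu)²

def disk_integral(f):
    # integrate over the unit disk in polar coordinates
    g = sp.expand(f.subs({x: r*sp.cos(t), y: r*sp.sin(t)}))
    return sp.simplify(sp.integrate(g * r, (r, 0, 1), (t, 0, 2*sp.pi)))

lhs, rhs = disk_integral(hess_sq), disk_integral(lap_sq)
print(lhs, rhs)   # both equal 64*pi/3
```

Both integrals come out equal (to 64π/3), as (8.13) predicts.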

We can, of course, consider variational problems of order higher than two.

8.9 Non-existence and Relaxation

We have stressed the fundamental concept of convexity throughout this text. Together with coercivity, these are the two building blocks upon which the existence of optimal solutions to variational problems is proved. Two natural questions arise: what may happen in problems when one of these two ingredients fails? What behavior is expected when the integrand is not convex, or when the integrand is not coercive? There are two terms that refer precisely to these two anomalies: oscillation, when convexity fails; and concentration, when coercivity does. The first indicates a persistent oscillatory phenomenon that is unavoidable in a minimization process, similar to weak convergence that is not strong; in the second, a loss of mass at infinity is part of the minimization process, and is associated with the lack of coercivity. In some cases, both phenomena may produce a combined effect. We are especially interested in having a first contact with these two phenomena.

Example 8.5 Consider the situation

    Minimize in u ∈ H¹(B):   ∫_B [ arctan²(|∇u(x)|) + u(x)² ] dx,

under u = 1 on ∂B, where B is the unit ball in R^N. Though the form of this problem may look a bit artificial, it is designed to convey the concentration effect. We claim that the infimum of the problem vanishes. If this is indeed so, then it is clear that there cannot be a function u realizing it, because of the incompatibility of the functional with the boundary datum. Minimizing sequences must then jump abruptly from zero to meet the boundary datum in small boundary layers around ∂B. Indeed, the following is a minimizing sequence for the problem:

    u_j(x) = f_j(|x|),   f_j(t) = 0 for t ∈ [0, 1 − 1/j],   f_j(t) = j(t − 1) + 1 for t ∈ [1 − 1/j, 1].

It definitely shows the concentration phenomenon we are talking about.
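The collapse of the infimum can be watched numerically. The sketch below (dimension N = 2 and the grid resolution are our own choices) evaluates the functional at the boundary-layer profiles u_j above; the values decrease towards zero even though every competitor equals 1 on ∂B.

```python
import numpy as np

def I(j, N=2, n=200001):
    # I(u_j) = ∫_B arctan²(|∇u_j|) + u_j² dx for the radial profile f_j,
    # written as an integral over r in [0, 1] against the measure 2π r dr (N = 2)
    r = np.linspace(0.0, 1.0, n)
    f = np.where(r < 1 - 1/j, 0.0, j*(r - 1) + 1)
    fp = np.where(r < 1 - 1/j, 0.0, float(j))
    g = (np.arctan(fp)**2 + f**2) * 2*np.pi * r**(N - 1)
    return float(np.sum(0.5*(g[:-1] + g[1:]) * np.diff(r)))   # trapezoid rule

print([round(I(j), 4) for j in (2, 10, 100, 1000)])   # decreasing towards 0
```

Since arctan² stays bounded by (π/2)² while the boundary layer shrinks, the energy of the layer vanishes as j → ∞, exactly the concentration effect described above.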


We plan to spend more time with the oscillatory effect that is a consequence of the lack of convexity. To this aim we make the following definition.

Definition 8.2 Let φ(u) : R^N → R be continuous. We define its convex hull as the function

    φ^#(u) = inf { (1/|B|) ∫_B φ(∇u(x)) dx : u ∈ C^∞(B), u(x) = u·x on ∂B },

where B is the unit ball in R^N.

A first piece of immediate information is the following.


Proposition 8.4
1. We always have

    Cφ ≤ φ^# ≤ φ.

2. If φ is convex, then φ = φ^#.

Proof The inequality φ^# ≤ φ is trivial because the linear function u(x) = u·x is, of course, a valid competitor for φ^#(u).

Let u be one of the competing functions in the definition of φ^#, so that u(x) = u·x on ∂B. Define a measure μ_u through the formula

    <ψ, μ_u> = (1/|B|) ∫_B ψ(∇u(x)) dx.

It is clear that every such μ_u is a probability measure, compactly supported in R^N and with barycenter u. By Jensen's inequality, since Cφ is convex and μ_u is a probability measure with first moment u,

    Cφ(u) ≤ (1/|B|) ∫_B Cφ(∇u(x)) dx ≤ (1/|B|) ∫_B φ(∇u(x)) dx.

By taking the infimum in u, we conclude that Cφ ≤ φ^#. □



The fundamental and surprising property is the following.
Theorem 8.6 The function φ^# is always convex. Consequently, Cφ ≡ φ^#.

Proof As in the proof of the last proposition, we put

    F = { μ_u : u ∈ C^∞(B), u(x) = u·x on ∂B },

and G, the set of all probability measures with first moment u and compact support in R^N. It is therefore clear that F ⊂ G. We will use Corollary 3.6 to show that in fact G ⊂ co(F) in the Banach space C₀(B) and its dual, which was identified in Example 7.3 of Chap. 2. Though we did not prove this fact, it is however clear that both sets F and G, being sets of probability measures, are subsets of the dual of C₀(B). Suppose then that, for a continuous function ψ, a probability measure μ ∈ G, and a real number ρ, we have

    <ψ, μ> + ρ < 0.

This implies that ψ must be less than −ρ somewhere in B, for otherwise the previous inequality would be impossible. By continuity, this same inequality ψ < −ρ still holds in a certain ball B_r(u). In this case we can find, for instance, a continuous radial function

    u(x) = u(|x|),   ∇u(x) = u'(|x|) x/|x|,

with

    u(t) : [0, 1] → R,   u(1) = 0,   |u'| ≤ r,

in such a way that

    u·x + u(|x|) ∈ F,   supp(∇u) ⊂ B_r(u).

It is then elementary to realize that for such u, the corresponding μ_u is an element of F and, by construction,

    <ψ, μ_u> + ρ < 0. □

A fundamental interpretation of the last theorem leads to the remarkable conclusion that the two numbers

    inf { ∫_B Cφ(∇u(x)) dx : u ∈ C^∞(B), u(x) = u·x on ∂B }

and

    inf { ∫_B φ(∇u(x)) dx : u ∈ C^∞(B), u(x) = u·x on ∂B }

are equal, because both numbers are equal to |B| Cφ(u). This is the simplest example of what, in the jargon of the Calculus of Variations, is called a relaxation theorem. Since this is a more advanced topic, we just state here a more general such fact for the scalar case, and leave a more complete discussion for a more specialized treatise.
Let

    F(x, u, u) : Ω × R × R^N → R

be measurable with respect to x, continuous with respect to the pairs (u, u), and such that there are p > 1 and C > c > 0 with

    c(|u|^p − 1) ≤ F(x, u, u) ≤ C(|u|^p + 1)

for every triplet (x, u, u). As usual, we put

    I(u) = ∫_Ω F(x, u(x), ∇u(x)) dx.

Let CF(x, u, u) stand for the convexification of F(x, u, ·) for each fixed pair (x, u). Then Theorem 8.2 can be applied to the (convexified) functional

    CI(u) = ∫_Ω CF(x, u(x), ∇u(x)) dx,

and for a given u₀ ∈ W^{1,p}(Ω) there is, at least, one minimizer v of the corresponding problem.

Theorem 8.7 Under the indicated hypotheses, we have

    CI(v) = min{ CI(u) : u − u₀ ∈ W₀^{1,p}(Ω) } = inf{ I(u) : u − u₀ ∈ W₀^{1,p}(Ω) }.

What this statement claims is that the problem for the non-convex integrand F may fail to admit a minimizer for lack of convexity, but if we replace the integrand by its convexification, we recover minimizers while the value of the infimum does not change.
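A one-dimensional toy instance of this relaxation phenomenon (our own example, not taken from the text) is the functional I(u) = ∫₀¹ (u′² − 1)² + u² dx with u(0) = u(1) = 0: the gradient part is a nonconvex double well whose convexification vanishes on [−1, 1], so the relaxed problem has minimum 0 at u ≡ 0, while for the original integrand sawtooth competitors with slopes ±1 only approach that value.

```python
import numpy as np

def I(u, x):
    # I(u) = ∫₀¹ (u'(x)² − 1)² + u(x)² dx via midpoint finite differences
    du = np.diff(u) / np.diff(x)
    um = 0.5 * (u[:-1] + u[1:])
    return float(np.sum(((du**2 - 1)**2 + um**2) * np.diff(x)))

x = np.linspace(0.0, 1.0, 60001)
for j in (1, 10, 100):
    # sawtooth u_j with slopes ±1 and amplitude 1/(2j): u_j → 0 uniformly,
    # yet the well term (u_j'² − 1)² vanishes identically, so I(u_j) → 0
    u = np.abs(((x * j) % 1.0) - 0.5) / j
    print(j, I(u, x))
print("I(0) =", I(np.zeros_like(x), x))   # the relaxed integrand would give 0 here
```

No function attains the infimum: u ≡ 0 pays (0 − 1)² = 1 in the well term, while any u with u′ = ±1 a.e. cannot vanish identically; minimizing sequences must oscillate faster and faster.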
There are several other important ways to show the convexity of φ^#. These are important for the proof of the previous statement, even in a vector situation. As such, they need to be known by anyone interested in deepening their insight into the Calculus of Variations. Since this is a primer on variational techniques, we do not treat them, but just mention them.

The first one is based on the classical Carathéodory theorem, and on a curious construction of a Lipschitz function whose graph is some sort of pyramid. Recall the concept of the convexification of a set in Definition 3.4.
Theorem 8.8 For every set C ⊂ R^N, we always have

    co(C) = { Σ_{i=0}^N t_i u_i : t_i ≥ 0, Σ_{i=0}^N t_i = 1, u_i ∈ C }.

The important point here is that, in spite of limiting to N + 1 the number of terms in convex combinations of elements of C, one does not lose any vector of co(C).¹

Proposition 8.5 Let

    u = Σ_{i=0}^N t_i u_i,   t_i ≥ 0,   Σ_{i=0}^N t_i = 1,   u_i ∈ R^N.

Then there is a simplex X ⊂ R^N and a Lipschitz function u(x) : X → R such that

    u(x) = u·x on ∂X

and

    |{ ∇u ∉ { u_i : i = 0, 1, . . . , N } }| = 0.

By putting together these two results, one can show, much more explicitly than we
have done above, Theorem 8.6.
The second one involves a fundamental construction whose interest goes well
beyond our scalar problems here.

¹ We are using the same letter C for two different things: a set in R^N and the convexification of a function. We hope this will not lead to any confusion, as the context makes clear, we believe, which one we are referring to in each case.
8.10 Exercises 271

Lemma 8.2 Let Ω be a bounded, regular domain, u_i ∈ R^N for i = 0, 1, and t ∈ (0, 1). Put u = t u_1 + (1 − t) u_0. There is a sequence of functions u_j(x) : Ω → R with the following properties:
1. u_j(x) = u·x on ∂Ω;
2. {u_j} is a bounded sequence (of Lipschitz functions) in W^{1,∞}(Ω);
3. as j → ∞,

    |{ x ∈ Ω : ∇u_j(x) ∉ { u_1, u_0 } }| → 0.

8.10 Exercises

1. Consider the variational problem

    Minimize in u ∈ W^{1,p}(Ω):   ∫_Ω F(u(x), ∇u(x)) dx

where the integrand F is assumed to comply with the hypotheses of Theorem 8.2, and no condition is imposed on ∂Ω. Argue that the problem admits an optimal solution, and derive the so-called natural boundary condition for the problem.
2. Show that for every

    F(x) : Ω ⊂ R^N → R^N,   F ∈ L²(Ω; R^N),

f(x) ∈ L²(Ω), and u₀ ∈ H¹(Ω), there is a unique solution u ∈ H¹(Ω) of the variational problem

    Minimize in v ∈ H¹(Ω):   ∫_Ω [ (1/2)|∇v(x) − F(x)|² + f(x)v(x) ] dx

under the boundary condition v − u₀ ∈ H₀¹(Ω). Write with care the associated Euler-Lagrange equation.
3. Study the problem

    Minimize in u ∈ H¹(Ω):   ∫_Ω [ (1/2)∇u(x)ᵀA(x)∇u(x) + (1/2)u(x)² + f(x)u(x) ] dx

under no boundary condition whatsoever, where the matrix field A enjoys the usual conditions that guarantee convexity and coercivity. By looking at conditions of optimality, derive the corresponding natural boundary conditions that minimizers of this problem should comply with. Based on this analysis, try to write down a variational problem that formally would yield a (weak) solution of the problem

    Δu = 0 in Ω,   ∂u/∂n = h on ∂Ω,

where n is the unit, outer normal to ∂Ω.
where n is the unit, outer normal to ∂Ω.
4. Consider the variational problem associated with the functional

    ∫_Ω [ (a/2)|∇u|² + (∇u·F)(∇u·G) ] dx

under usual Dirichlet boundary conditions. Give conditions on the two fields

    F(x), G(x) : Ω ⊂ R^N → R^N,

so that the problem admits a unique solution. Explore the form of the Euler-Lagrange equation.
5. Argue whether the non-linear PDE

    −div(a∇u + b|∇u|F) = 0

can be the Euler-Lagrange equation of a certain functional for suitable constants (or functions) a and b, and field F. Find it if the answer is affirmative.
6. Consider the family of radial variational problems with integrands of the form

    F(u) = f(|u|)

for f : [0, ∞) → R. Explore further conditions on f guaranteeing the existence of minimizers under standard Dirichlet boundary conditions.
7. Obstacle problems. Show that the problem

    Minimize in u(x) ∈ H₀¹(Ω):   (1/2) ∫_Ω |∇u(x)|² dx

under

    u(x) ≥ φ(x) in Ω

admits a unique minimizer, provided φ, the obstacle, is a continuous function that is strictly negative around ∂Ω. The set where u = φ is known as the coincidence set, and its unknown boundary as the corresponding free boundary.

8. Consider a variational problem of the form

    ∫_Ω [ (1/2)|∇u(x)|² + ∇u(x)·F(x) ] dx

under a typical Dirichlet condition around ∂Ω, where the field F is divergence-free: div F = 0 in Ω. Explore the corresponding Euler-Lagrange equation, and try to explain the behavior you find.
9. Let A(x) be a matrix field in Ω, not necessarily symmetric. Argue that the two quadratic variational problems with identical linear and zeroth-order parts but with quadratic parts

    uᵀA(x)u,   (1/2)uᵀ(A(x) + A(x)ᵀ)u

are exactly the same. Conclude that the Euler-Lagrange equation of such a quadratic problem always yields a symmetric problem.
10. Let Ω ⊂ R^N be a bounded, regular domain with a unit, outer normal field n on ∂Ω, and consider the space H ≡ L²_div(Ω) of Exercise 7 of the last chapter.
(a) Use the Lax-Milgram theorem for the bilinear form associated with

    ∫_Ω F(x)ᵀA(x)F(x) dx,   F ∈ H,

for a coercive, continuous matrix field A(x) : Ω → R^{N×N}.
(b) Derive optimality conditions for the variational problem

    Minimize in F ∈ H:   ∫_Ω [ (1/2)F(x)ᵀA(x)F(x) + F(x)·G(x) ] dx,

under

    F(x)·n(x) = f(x) on ∂Ω.

11. Consider the variational problem for the functional

    ∫_Ω (1/2)|u_{x₁}(x) + a(x)u_{x₂}(x)|² dx

where

    a ∈ L²(Ω),   Ω ⊂ R²,   x = (x₁, x₂).

(a) Explore whether one can directly apply the fundamental existence theorem of this chapter under standard Dirichlet conditions on ∂Ω.
(b) Perturb the previous functional in the form

    ∫_Ω [ (1/2)|u_{x₁}(x) + a(x)u_{x₂}(x)|² + (ε/2)u_{x₂}(x)² ] dx

for a small, positive parameter ε. Show that this time there is a unique minimizer u_ε of the problem.
(c) What changes if we start instead with a functional

    ∫_Ω (1/2)|u_{x₁}(x) + a(u(x))u_{x₂}(x)|² dx

where this time a is a real function, as regular as necessary?


12. As in Exercise 1, consider the variational problem

    Minimize in u ∈ W^{1,p}(Ω):   ∫_Ω F(u(x), ∇u(x)) dx

where the integrand F is assumed to comply with the hypotheses of Theorem 8.2, and we demand u = u₀ only on a given portion Γ ⊂ ∂Ω of the boundary. Argue that the problem admits an optimal solution, and derive the corresponding optimality conditions.
13. Argue what the optimal solutions are of the variational problem

    Minimize in u ∈ L(Ω):   ∫_Ω F(x, u(x)) dx

under

    ∫_Ω u(x) dx = |Ω| u₀,

where u₀ belongs to the range of u.


14. Among all surfaces S_u that are the graph of a function u of two variables

    u(x, y) : Ω → R,   Ω ⊂ R²,

complying with some prescribed boundary values given by a specific function u₀(x, y),

    u(x, y) = u₀(x, y),   (x, y) ∈ ∂Ω,

write the functional furnishing the flux of a vector field

    F(x, y, z) : R³ → R³

through S_u. Argue whether there is an optimal such surface, and examine the corresponding equation of optimality.
15. Consider the functional

    I(u) = ∫_Ω (1/2)(det ∇²u(x))² dx,   u : Ω ⊂ R² → R.

(a) Check that the function

    F(A) = (1/2)(det A)²,   A ∈ R^{2×2},

is not convex, and that, therefore, our main result for second-order problems, Theorem 8.5, cannot, in principle, be applied.
(b) Calculate, nonetheless, the corresponding Euler-Lagrange equation.
16. Consider the functional

    ∫_Ω [ (α₁χ(x) + α₀(1 − χ(x))) / 2 ] |∇u(x)|² dx

where χ and 1 − χ are the characteristic functions of two smooth subsets Ω₁ and Ω₀ sharing a smooth boundary Γ within Ω.
(a) Show that there is a unique optimal solution for u − u₀ ∈ H₀¹(Ω), provided that α₁, α₀ are non-negative.
(b) Use optimality conditions to describe u through the so-called transmission condition across Γ.
17. Consider the quadratic functional

    ∫_Ω (a_ε(x)/2) |∇u(x)|² dx

with

    a_ε ⇀ a₀ in Ω,   0 < c ≤ a_ε, a₀ ≤ C.

(a) Show that there is a unique solution u_ε under a typical Dirichlet boundary condition around ∂Ω.
(b) Show that there is a function u₀ ∈ H¹(Ω) such that u_ε ⇀ u₀ in H¹(Ω) as ε → 0.
(c) Suppose that for each v ∈ H₀¹(Ω) one could find a sequence {v_ε} ⊂ H₀¹(Ω) such that

    a_ε ∇v_ε → a₀ ∇v in L²(Ω).

Show that the limit function u₀ is the minimizer of the limit quadratic functional

    ∫_Ω (a₀(x)/2) |∇u(x)|² dx.

In this context, we say that the initial quadratic functional Γ-converges to this last quadratic functional.
18. Let Ω be an open subset of R^N and u ∈ W^{1,p}(Ω). Consider the variational problem

    Minimize in U ∈ W^{1,p}(R^N):   ||U||_{W^{1,p}(R^N\Ω)} + ||U − u||_{W^{1,p}(Ω)}.

(a) Show that there is a unique minimizer û ∈ W^{1,p}(R^N).
(b) Suppose that Ω is so regular that the divergence theorem holds. Deduce the corresponding transmission condition, as in Exercise 16 above.
(c) Explore the effect of using a positive parameter ε to change the functional to

    ||U||_{W^{1,p}(R^N\Ω)} + (1/ε) ||U − u||_{W^{1,p}(Ω)}.

Suppose that u admits an extension U in the sense that

    U ∈ W^{1,p}(R^N),   U = u in Ω.

If û_ε is the unique minimizer for each ε, what behavior would you expect for û_ε as ε → 0?
19. This exercise is similar to the previous one for the particular case p = 2. Let Ω be an open subset of R^N and u ∈ H¹(Ω). Consider the variational problem

    I_ε(v) = (1/2) ∫_{R^N} [ χ_{R^N\Ω}(x)|∇v(x)|² + (1/ε) χ_Ω(x)|∇v(x) − ∇u(x)|² ] dx

over the full space v ∈ H¹(R^N).
(a) Prove that there is a unique minimizer v_ε ∈ H¹(R^N).
(b) Write down the weak formulation of the underlying Euler-Lagrange equation of optimality, and interpret it as a linear operation

    E_ε : H¹(Ω) → H¹(R^N),   u ↦ v_ε = E_ε u.

(c) Let m_ε be the value of the minimum, and suppose that it is a uniformly bounded sequence of numbers. Use the Banach-Steinhaus principle to show that there is a limit operator E : H¹(Ω) → H¹(R^N) which is linear and continuous.
(d) Interpret the operation E : H¹(Ω) → H¹(R^N) as an extension operator.
20. Let F(x) be a tangent vector field to ∂Ω, which can be assumed as smooth as necessary. Explore how to deal with the variational problem

    Minimize in u:   (1/2) ∫_Ω |∇u(x)|² dx

subject to

    F·∇u = 0 on ∂Ω.

Can it be done in a consistent way? Is it a well-posed problem?


21. For the following examples, explore whether the basic existence theorem, under Dirichlet boundary conditions, can be applied, and write the underlying PDE of optimality. Argue whether optimality conditions hold.

    ∫_Ω (1/2)( |∂u/∂x₁| + |∂u/∂x₂| )² dx₁ dx₂,

    ∫_Ω (1/2)( 2(∂u/∂x₁)² + (∂u/∂x₂)² − 2(∂u/∂x₁)(∂u/∂x₂) ) dx₁ dx₂,

    ∫_Ω (1/2)| (∂u/∂x₁)² + (∂u/∂x₂)² − 2(∂u/∂x₁)(∂u/∂x₂) | dx₁ dx₂,

    ∫_Ω (1/2)( |∂u/∂x₁|⁴ + |∂u/∂x₂|⁴ )^{1/2} dx₁ dx₂,

    ∫_Ω (1/2)( 2(∂u/∂x₁)⁴ + (∂u/∂x₂)⁴ − 2((∂u/∂x₁)(∂u/∂x₂))² )^{1/2} dx₁ dx₂,

    ∫_Ω (1/2)[ ( (∂u/∂x₁)⁴ + (∂u/∂x₂)² )^{1/2} + (∂u/∂x₂)² ] dx₁ dx₂,

    ∫_Ω (1/2)[ ( (∂u/∂x₁)⁴ + (∂u/∂x₂)² )^{1/2} + ( (∂u/∂x₁)² + (∂u/∂x₂)⁴ )^{1/2} ] dx₁ dx₂,

    ∫_Ω [ (1/2)(∂u/∂x₁)² + (1/2)(∂u/∂x₂)² + 7 exp( −((∂u/∂x₁) − 1)⁴ − (∂u/∂x₂)² ) ] dx₁ dx₂,

    ∫_Ω |∂u/∂x₁| |∂u/∂x₂| dx₁ dx₂,

    ∫_Ω ( 1 + (∂u/∂x₁)² )^{1/2} ( 1 + (∂u/∂x₂)² )^{1/2} dx₁ dx₂,

    ∫_Ω ( 1 + (∂u/∂x₁)² + (∂u/∂x₂)² )^{1/2} dx₁ dx₂.

22. Other boundary conditions. In the following three situations, investigate the interplay between the functional and the boundary condition. For the sake of simplicity, take in all three cases the usual quadratic functional

    (1/2) ∫_Ω |∇u(x)|² dx.

(a) Mixed boundary condition. The boundary ∂Ω of the domain Ω is divided into two sets Γ₀ and Γ_n, where a Dirichlet condition u = u₀ and a Neumann condition ∇u·n = 0, respectively, are imposed.
(b) Robin boundary condition. This time we seek an explicit linear dependence between a Dirichlet and a Neumann condition:

    ∇u·n = γu on ∂Ω,

where γ is a non-null constant.
(c) A mixed boundary condition with a free boundary between the Dirichlet and Neumann conditions. Take as boundary condition u ≤ u₀ on ∂Ω.
23. If Ω ⊂ R³, write the Euler-Lagrange equation for the functional

    E(u) = (1/2) ∫_Ω |∇u(x) ∧ ∇w(x) − F(x)|² dx

where the function w and the field F are given so that the gradient ∇w of w and F are parallel at every point x ∈ Ω:

    ∇w(x) ∥ F(x),   x ∈ Ω.

Here u ∧ v stands for the cross or vector product in R³.


Chapter 9
Finer Results in Sobolev Spaces
and the Calculus of Variations

9.1 Overview

This chapter focuses on some important issues which arise when one keeps working
with Sobolev spaces in variational problems and PDEs. We can hardly cover all of
the important topics, but have tried to select those which, we believe, are among
the ones that might make up a second round on Sobolev spaces and variational
problems.
The first situation we will be dealing with is that of variational problems under additional constraints, other than boundary values, in the form of global, integral conditions. In general terms, we are talking about problems of the form

    Minimize in u ∈ A:   ∫_Ω F(x, u(x), ∇u(x)) dx

where the set A of competing functions is a subset of some W^{1,p}(Ω) which, in addition to boundary conditions around ∂Ω, incorporates some integral constraints given in the form

    ∫_Ω F(x, u(x), ∇u(x)) dx ≤ (=) f,

where

    F(x, u, u) : Ω × R × R^N → R^d,   f ∈ R^d.

The various components of the vector-valued map .F are assumed to be smooth. We


have already seen one important such example, though constraints are imposed in a

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 281
P. Pedregal, Functional Analysis, Sobolev Spaces, and Calculus of Variations,
La Matematica per il 3+2 157, https://doi.org/10.1007/978-3-031-49246-4_9

point-wise fashion, in Exercise 29 of the last chapter. For such a situation

    F = (1/2)|u|²,   A = { u ∈ H₀¹(Ω) : u ≥ ψ in Ω },
where the continuous function .ψ represents the obstacle. Another fundamental
problem is that in which we limit the size of feasible u’s in some suitable Lebesgue
space. The existence of minimizers for such problems is essentially covered by our
results in the previous chapter. We will also look at optimality conditions.
A second important point is that of better understanding the quantitative relationship between the norm of a function u in W^{1,p}(Ω) and its norm in the Lebesgue space L^q(Ω), for a domain Ω ⊂ R^N. This is directly related to variational problems of the form

    ∫_Ω [ (1/p)|∇u(x)|^p + (1/p)|u(x)|^p − (λ/q)|u(x)|^q ] dx   (9.1)

for different exponents p, q, and a positive constant λ. Note that if λ ≤ 0, the unique minimizer of the previous functional in W₀^{1,p}(Ω) is the trivial function. However, the problem becomes meaningful (non-trivial) if λ > 0. Indeed, the functional in (9.1) is closely related to the constrained variational problem

    Minimize in u ∈ W^{1,p}(Ω):   ||u||^p_{W^{1,p}(Ω)}

under the integral constraint

    ||u||_{L^q(Ω)} = C,   C > 0, given.

To better realize what is at stake in such a situation, suppose we take Ω to be all of space, Ω = R^N. For real, positive r, we put

    u_r(x) = u(rx),   u ∈ W^{1,p}(R^N).

A simple computation leads to

    ||∇u_r||_{L^p(Ω)} = r^{1−N/p} ||∇u||_{L^p(Ω)},   ||u_r||_{L^q(Ω)} = r^{−N/q} ||u||_{L^q(Ω)},

for each such r, and exponents p and q. These identities show that u_r ∈ W^{1,p}(R^N) for all such r; and, moreover, if the norms of ∇u in L^p(R^N) and u in L^q(R^N) are to be comparable for every function u in W^{1,p}(R^N), then we necessarily need to have

    1/N = 1/p − 1/q,   (9.2)

for otherwise the limits as r → 0 or r → ∞ would invalidate the comparison. Condition (9.2) yields the relationship between the exponent p for the Sobolev space, the exponent q for the Lebesgue space, and the dimension N of the domain.
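These scaling identities are easy to sanity-check numerically; the sketch below uses a Gaussian on R² and the arbitrary choices r = 2, p = 3, q = 4 (all ours), comparing the norm ratios with the predicted powers r^{1−N/p} and r^{−N/q}.

```python
import numpy as np

N, p, q, r = 2, 3.0, 4.0, 2.0
s = np.linspace(-6.0, 6.0, 1201)
dx = s[1] - s[0]
X, Y = np.meshgrid(s, s, indexing="ij")
u = np.exp(-(X**2 + Y**2))            # a fixed u ∈ W^{1,p}(R²)
ur = np.exp(-(r*X)**2 - (r*Y)**2)     # the rescaling u_r(x) = u(r x)

def grad_Lp(f, p):
    gx, gy = np.gradient(f, dx)
    return (np.sum(np.hypot(gx, gy)**p) * dx**2) ** (1/p)

def Lq(f, q):
    return (np.sum(np.abs(f)**q) * dx**2) ** (1/q)

print(grad_Lp(ur, p) / grad_Lp(u, p), r**(1 - N/p))   # both ≈ 2^{1/3}
print(Lq(ur, q) / Lq(u, q), r**(-N/q))                # both ≈ 2^{-1/2}
```

For a Gaussian the ratios are exact analytically; the small discrepancy left is finite-difference and quadrature error.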
The situation for a different (in particular, bounded) domain is much more involved and technical. Indeed, the usual way of looking at the issue is by resorting to the case of the whole space R^N through the use of an appropriate extension operator Υ defined so that

    Υ : W^{1,p}(Ω) → W^{1,p}(R^N),   ||Υu||_{W^{1,p}(R^N)} ≤ C_Ω ||u||_{W^{1,p}(Ω)},   (9.3)

for a constant C_Ω depending exclusively on the domain Ω (and the dimension N). If this is the case, and assuming that we first establish an inequality

    ||v||_{L^q(R^N)} ≤ C_{N,p,q} ||v||_{W^{1,p}(R^N)}

for every v ∈ W^{1,p}(R^N), we would have

    ||u||_{L^q(Ω)} ≤ ||Υu||_{L^q(R^N)} ≤ C_{N,p,q} ||Υu||_{W^{1,p}(R^N)} ≤ C_{N,p,q} C_Ω ||u||_{W^{1,p}(Ω)}

for every u ∈ W^{1,p}(Ω). Though there is, apparently, no full characterization of those domains that admit such extension operators with the fundamental property in (9.3), there are various ways to construct them based on additional smoothness of their boundaries. Most of them require a good deal of technicalities. One way relies on the use of partitions of unity, as well as on other more intuitive extension techniques like reflection; this process typically requires C¹-smoothness. Other techniques only demand Lipschitzianity of ∂Ω. In fact, there is a particular case in which such an extension procedure is especially easy: from Proposition 7.8, we already know that for functions in W₀^{1,p}(Ω) the extension by zero off Ω makes the function an element of W^{1,p}(R^N), or of W^{1,p}(Ω̃) for any bigger domain Ω̃, for that matter. We will therefore start with the study of the relationship between the norms in L^q(Ω) and W^{1,p}(Ω) for functions in W₀^{1,p}(Ω). We will also briefly describe a different technique that is sufficient to deal with most regular domains, and is based on Definition 7.3 and ODE techniques.
One of the most studied scalar variational problems corresponds to the integrand

    F(u, u) = (1/2)|u|² + f(u)

where the continuous function f(u) is placed under varying sets of assumptions. One
could even allow for an explicit .x-dependence, though for simplicity we will stick
to the above model problem. To fix ideas, we could think of f as a polynomial
of a certain degree. If .f (u) is strictly convex and bounded from below by some
constant, then our results in the previous chapter cover the situation, and we can
conclude that there is a unique critical function (a unique solution of the underlying

Euler-Lagrange equation), which is the unique minimizer of the problem. If we are willing to move one step further and dispense with these assumptions, then more understanding is required concerning the two basic ingredients of the direct method:
1. the convexity requirement in Theorem 8.2 does not demand the convexity of f(u), since the convexity of F(u, u) with respect to u suffices;
2. the coercivity condition in that same result (for p = 2) might not hold true if f(u) is not bounded from below.

Therefore, there are two main issues to be considered for a functional like

    I(u) = ∫_Ω [ (1/2)|∇u(x)|² + f(u(x)) ] dx.   (9.4)

One is to understand more general coercivity conditions than those in Theorem 8.2 that may still guarantee the existence of minimizers. The second is to explore under what circumstances this last functional may admit critical functions other than minimizers. Note that the Euler-Lagrange equation for such a functional is

    −Δu + f'(u) = 0 in Ω.

The first question leads to investigating further integrability properties of Sobolev functions, while the second pushes in the direction of finding methods to show the existence of critical functions, solutions of the above PDE, that do not rely on minimization.

Another final fundamental point we would like to consider is that of the further regularity of weak solutions of PDEs, be they minimizers or just critical functions of a certain functional. We will concentrate on the following basic problem:

    Δu ∈ L²(Ω) ⟺ u ∈ H²(Ω).

This is a quite delicate topic that we plan to address by means of considering higher-order variational problems. Second-order boundary-value problems are also important and appealing. We have already had a chance to explore the important case of the bi-harmonic operator in the last chapter, which is associated with the second-order functional

    ∫_Ω (1/2)|Δu(x)|² dx.

Their nature depends in a very fundamental way on the boundary conditions that are enforced in concrete variational problems.

9.2 Variational Problems Under Integral Constraints

Variational problems in which integral constraints, in addition to boundary values, are to be respected as part of feasibility can be interesting and important. We have already mentioned and proposed one such example in the last chapter. As a matter of fact, the main existence result for such problems comes from a direct generalization of Theorem 8.2, modeled after Theorem 4.3, which we transcribe next.

Suppose that

    F(x, u, u) : Ω × R × R^N → R

is measurable in x and continuous in (u, u). Consider the variational problem

    Minimize in u(x) ∈ A:   I(u) = ∫_Ω F(x, u(x), ∇u(x)) dx

where the feasible set A is a subset of W^{1,p}(Ω).

Theorem 9.1 If the three conditions:
1. I is non-trivial, and coercive in the sense that

    I(u) → ∞ whenever ||u||_{W^{1,p}(Ω)} → ∞;

2. F(x, u, ·) is convex for every pair (x, u);
3. A is weakly closed in W^{1,p}(Ω);
hold, then the above variational problem admits minimizers.

The proof should be clear at this stage, right after the direct method. As usual, the coercivity of the functional I is derived from the coercivity of the integrand F, just as in Theorem 8.2. Our emphasis here is on some typical examples of feasible sets A complying with the weak closedness in the previous statement.

Proposition 9.1 Let

    F(x, u, u) : Ω × R × R^N → R^d

be a mapping whose components are measurable in x and continuous in (u, u).
1. If every component of F is convex in u, then

    A = { u ∈ W₀^{1,p}(Ω) : ∫_Ω F(x, u(x), ∇u(x)) dx ≤ f }

is weakly closed for every fixed f ∈ R^d.
2. If every component of F is linear in u (in particular, for those components of F independent of u), then

    A = { u ∈ W₀^{1,p}(Ω) : ∫_Ω F(x, u(x), ∇u(x)) dx = f }

is weakly closed for every fixed f ∈ R^d.


One can obviously mix the equality/inequality conditions for different components of F, asking for the linearity/convexity of the corresponding components. The proof amounts to the realization that, due to the weak lower semicontinuity of the family of functionals

    ∫_Ω F(x, u(x), ∇u(x)) dx,

the admissible sets A defined through inequalities turn out to be weakly closed. For the case of linearity, note that both F and −F are convex in u.

As indicated above, the particular case of the obstacle problem

    A = { u ∈ H₀¹(Ω) : u ≥ ψ in Ω }

falls under the action of the preceding proposition, and so does

    A = { u ∈ W^{1,p}(Ω) : ||u||_{L^p(Ω)} = k }

for a fixed positive constant k. In both cases, there are unique minimizers for the minimization of the L²(Ω)-norm and the L^p(Ω)-norm of the gradient, respectively (if p > 1).
Optimality for such constrained problems involves the use of multipliers. This can be set up in full generality, but since the most studied examples correspond to very specific situations, we will content ourselves with looking at the following example.

Example 9.1 Let us investigate the problem

    Minimize in u ∈ H₀¹(Ω):   (1/2) ∫_Ω |∇u(x)|² dx

subject to the condition

    ∫_Ω u(x)² dx = 1.

It is easy to see how the combination of Theorem 9.1 and Proposition 9.1 immediately yields the existence of a global minimizer u₀ for this constrained variational problem. Let

    m = (1/2) ∫_Ω |∇u₀(x)|² dx > 0.

If u is an arbitrary, non-trivial function in H₀¹(Ω), then it is elementary to realize that

    ū ≡ u / ||u||_{L²(Ω)}

is feasible for our constrained variational problem, and hence

    (1/2) ∫_Ω |∇ū(x)|² dx ≥ m,

that is,

    ∫_Ω [ (1/2)|∇u(x)|² − m u(x)² ] dx ≥ 0.

This inequality exactly means that u₀ is also a global minimizer of the augmented functional

    Ĩ(u) = ∫_Ω [ (1/2)|∇u(x)|² − m u(x)² ] dx,

because Ĩ(u₀) = 0. The function u₀ is, then, a solution of the corresponding Euler-Lagrange equation

    −Δu = 2mu in Ω,   u = 0 on ∂Ω.

We say that the problem

    −Δu = λu in Ω,   u = 0 on ∂Ω   (9.5)

is the optimality condition for the initial constrained problem for a certain value of the multiplier λ (= 2m). Note how Eq. (9.5) can be interpreted by saying that u = u₀ is an eigenfunction of the (negative) Laplace operator corresponding to the eigenvalue λ.

The obstacle problem is different because constraints are enforced in a point-wise manner, and so multipliers would not be vectors but functions. We will not treat this kind of situation here.

9.3 Sobolev Inequalities

One fundamental chapter in Sobolev spaces is that of understanding how much better functions u in $W^{1,p}(\Omega)$ are compared to functions in Lebesgue spaces $L^p(\Omega)$, depending on the exponent p and the dimension N, where $\Omega \subset \mathbb{R}^N$. The basic tool is again the fundamental theorem of Calculus used as in the proofs of Propositions 7.3, 7.5 and 7.9. Once again we can write

$$u(x', x) = u(x', y) + \int_y^x \frac{\partial u}{\partial x_N}(x', z)\,dz \tag{9.6}$$

for a.e. $x' \in \pi_N\Omega$, and $x, y \in \mathbb{R}$ such that

$$(x', x),\ (x', y) \in \Omega.$$

If, as usual, we put

$$\Omega = \{(x', x) \in \mathbb{R}^N : x' \in \pi_N\Omega,\ x \in J_{x'}\},$$

and

$$y(x') = \inf J_{x'}, \qquad (x', y(x')) \in \partial\Omega,$$

then (9.6) becomes

$$|u(x', x)| \le |u(x', y(x'))| + \int_{J_{x'}} \left|\frac{\partial u}{\partial x_N}(x', z)\right| dz.$$

At this initial stage, we are not especially interested in measuring the effect of the term $|u(x', y(x'))|$ at the boundary, which will take us to introducing spaces of functions over boundaries of sets, so we will assume, to begin with, that $u \in W_0^{1,p}(\Omega)$. If this is so, then

$$u(x', y(x')) = 0,$$

and

$$|u(x)| = |u(x', x)| \le \int_{J_{x'}} \left|\frac{\partial u}{\partial x_N}(x', z)\right| dz. \tag{9.7}$$

We define the function $u_N(x')$ as the right-hand side of this inequality,

$$u_N(x') = \int_{J_{x'}} \left|\frac{\partial u}{\partial x_N}(x', z)\right| dz,$$

and realize that we could redo all of the above computation along any of the coordinate axes $e_i$ to find

$$|u(x)| \le u_i(x'_i), \qquad i = 1, 2, \dots, N, \tag{9.8}$$

where

$$u_i(x'_i) \equiv \int_{J_{x'_i}} \left|\frac{\partial u}{\partial x_i}(x'_i, z)\right| dz.$$

We would like to profit from inequalities (9.8) as much as possible. We know that each right-hand side in (9.8) belongs to $L^1$ of its domain,

$$u_i(x'_i) : \pi_i\Omega \to \mathbb{R}, \qquad \|u_i\|_{L^1(\pi_i\Omega)} = \left\|\frac{\partial u}{\partial x_i}\right\|_{L^1(\Omega)},$$

provided $u \in W^{1,1}(\Omega)$. But (9.8) also implies

$$|u(x)|^{N/(N-1)} \le \prod_i u_i(x'_i)^{1/(N-1)}, \tag{9.9}$$

and we have the following abstract technical fact.


Lemma 9.1 Suppose we have $N\,(\ge 2)$ functions $f_i \in L^{N-1}(\pi_i\Omega)$ for open sets $\pi_i\Omega \subset \mathbb{R}^{N-1}$, $i = 1, 2, \dots, N$, and put

$$f(x) = \prod_i f_i(\pi_i x).$$

Then $f \in L^1(\Omega)$, and

$$\|f\|_{L^1(\Omega)} \le \prod_i \|f_i\|_{L^{N-1}(\pi_i\Omega)}.$$

Proof The first case $N = 2$ does not require any comment. Take $N = 3$, and put, to better see the structure of the situation,

$$f(x_1, x_2, x_3) = f_1(x_2, x_3)\, f_2(x_1, x_3)\, f_3(x_1, x_2),$$

with the three factors on the right being functions in $L^2$ of the corresponding domains. For a.e. pair $(x_2, x_3) \in \pi_1\Omega$, we see that

$$\int_{J_{\pi_1 x}} |f(x_1, x_2, x_3)|\,dx_1 \le |f_1(x_2, x_3)| \left(\int_{J_{\pi_1 x}} |f_2(x_1, x_3)|^2\,dx_1\right)^{1/2} \left(\int_{J_{\pi_1 x}} |f_3(x_1, x_2)|^2\,dx_1\right)^{1/2}.$$

Integration in the variables $(x_2, x_3) \in \pi_1\Omega$, together with Hölder's inequality, leads to the desired inequality. The general case proceeds by induction (exercise). ⨆
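For $N = 3$ this is the classical Loomis-Whitney-type inequality. A quick numerical sanity check of its discrete analogue on the unit cube (our own illustration, not part of the text) can be run as follows.

```python
import numpy as np

# Numerical sanity check (ours) of Lemma 9.1 for N = 3 on the unit cube:
# f(x1,x2,x3) = f1(x2,x3) f2(x1,x3) f3(x1,x2) should satisfy
# ||f||_{L^1} <= ||f1||_{L^2} * ||f2||_{L^2} * ||f3||_{L^2}.
rng = np.random.default_rng(0)
n = 40
# random nonnegative "sections" on the three coordinate hyperplanes
f1, f2, f3 = (rng.random((n, n)) for _ in range(3))
h2 = 1.0 / n**2                     # area element for 2-d midpoint sums
h3 = 1.0 / n**3                     # volume element for 3-d sums
# axes ordered (x1, x2, x3); each factor is constant in its missing variable
f = f1[None, :, :] * f2[:, None, :] * f3[:, :, None]
lhs = np.abs(f).sum() * h3
rhs = np.prod([np.sqrt((g**2).sum() * h2) for g in (f1, f2, f3)])
print(lhs <= rhs)                   # the discrete inequality holds
```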

If we go back to our situation in (9.9) and apply Lemma 9.1 to it, we conclude that $u \in L^{N/(N-1)}(\Omega)$, and, since

$$\|u\|^{N/(N-1)}_{L^{N/(N-1)}(\Omega)} = \big\||u|^{N/(N-1)}\big\|_{L^1(\Omega)},$$

we arrive at

$$\|u\|^{N/(N-1)}_{L^{N/(N-1)}(\Omega)} \le \prod_i \left\|\frac{\partial u}{\partial x_i}\right\|^{1/(N-1)}_{L^1(\Omega)}.$$

Finally, because

$$\left\|\frac{\partial u}{\partial x_i}\right\|_{L^1(\Omega)} \le \|\nabla u\|_{L^1(\Omega;\mathbb{R}^N)},$$

we find

$$\|u\|_{L^{N/(N-1)}(\Omega)} \le \|\nabla u\|_{L^1(\Omega;\mathbb{R}^N)}.$$

Lemma 9.2 Let $\Omega \subset \mathbb{R}^N$ be a domain, and $u \in W_0^{1,1}(\Omega)$. Then $u \in L^{N/(N-1)}(\Omega)$, and

$$\|u\|_{L^{N/(N-1)}(\Omega)} \le \|\nabla u\|_{L^1(\Omega;\mathbb{R}^N)}.$$

This is the kind of result we are looking for. It asserts that functions in $W_0^{1,1}(\Omega)$ enjoy much better integrability properties, since they belong to the more restrictive class of functions $L^{N/(N-1)}(\Omega)$.

9.3.1 The Case of Vanishing Boundary Data

Assume that $\Omega \subset \mathbb{R}^N$ is a domain, and $u \in W_0^{1,p}(\Omega)$. We will divide our discussion into three cases, depending on the relationship between the exponent p of integrability of derivatives and the dimension N of the domain, according to the following classification.
1. The subcritical case $1 \le p < N$.
2. The critical case $p = N$.
3. The supercritical case $p > N$.
We will treat the three cases successively.

9.3.1.1 The Subcritical Case

We place ourselves in the situation where the integrability exponent p is strictly smaller than the dimension N. We will use Lemma 9.2 recursively in the following way.

Suppose $u \in W_0^{1,p}(\Omega)$ with $1 < p < N$. Take an exponent $\gamma > 1$, not yet selected. Consider the function

$$U(x) = |u(x)|^{\gamma-1} u(x),$$

and suppose that $\gamma$ is such that $U \in W_0^{1,1}(\Omega)$. It is easy to calculate that

$$\nabla U(x) = \gamma |u(x)|^{\gamma-1}\, \nabla u(x).$$

If we want this gradient to be integrable ($U \in W_0^{1,1}(\Omega)$), since $\nabla u$ belongs to $L^p(\Omega)$, we need, according to Hölder's inequality, that $|u|^{\gamma-1} \in L^{p/(p-1)}(\Omega)$. Given that $u \in L^p(\Omega)$, this forces us to take $\gamma = p$. If we do so, then Lemma 9.2 implies that in fact $U = |u|^{\gamma-1}u$ belongs to $L^{N/(N-1)}(\Omega)$, that is, $u \in L^{pN/(N-1)}(\Omega)$. Once we know this, we can play the same game with a different exponent $\gamma$, in fact, with

$$\frac{p}{p-1}(\gamma - 1) = \frac{pN}{N-1}.$$

If we so choose $\gamma$, the same previous computations lead to another U to which we can apply Lemma 9.2. By the new choice of $\gamma$, the factor $|u|^{\gamma-1} \in L^{p/(p-1)}(\Omega)$. The conclusion is that $u \in L^{\gamma N/(N-1)}(\Omega)$. Proceeding recursively, we realize that we can move on with this bootstrap argument to reach an exponent $\gamma$ such that

$$\frac{p}{p-1}(\gamma - 1) = \frac{\gamma N}{N-1},$$

and, in such a case, $u \in L^{\gamma N/(N-1)}(\Omega)$. Some elementary arithmetic yields

$$\gamma = \frac{p(N-1)}{N-p}, \qquad u \in L^{Np/(N-p)}(\Omega).$$
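The bootstrap above can be traced numerically. The sketch below (our own illustration, with hypothetical helper names) iterates the exponent update $q \mapsto \gamma N/(N-1)$ with $\gamma = 1 + q(p-1)/p$, starting from $q = p$, and watches it climb to the fixed point $p^* = Np/(N-p)$.

```python
from fractions import Fraction

# Illustrative iteration (ours, not the book's text): each application of
# Lemma 9.2 to |u|^(gamma-1)u raises the integrability exponent q of u;
# the fixed point of the update is the critical exponent p* = N*p/(N-p).
def bootstrap_exponents(p, N, steps=100):
    q = Fraction(p)                        # start: u in L^p
    history = [q]
    for _ in range(steps):
        gamma = 1 + q * Fraction(p - 1, p) # forced by Hölder's inequality
        q = gamma * Fraction(N, N - 1)     # gained by Lemma 9.2
        history.append(q)
    return history

p, N = 2, 3
qs = bootstrap_exponents(p, N)
p_star = Fraction(N * p, N - p)            # = 6 for p = 2, N = 3
print(float(qs[1]), float(qs[-1]), float(p_star))
```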

Theorem 9.2 If $u \in W_0^{1,p}(\Omega)$, and $1 \le p < N$, then $u \in L^{Np/(N-p)}(\Omega)$, and

$$\|u\|_{L^{Np/(N-p)}(\Omega)} \le C\|\nabla u\|_{L^p(\Omega)}$$

for a positive constant C depending on p and N (but independent of $\Omega$; exercise below).

The exponent

$$p^* = \frac{pN}{N-p}, \qquad 1 \le p < N,$$

is identified as the critical exponent.


For bounded domains, we can state a more complete result.

Theorem 9.3 If $\Omega \subset \mathbb{R}^N$ is a bounded domain and $1 \le p < N$, then

$$W_0^{1,p}(\Omega) \subset L^q(\Omega), \qquad q \in [1, p^*],$$

and

$$\|u\|_{L^q(\Omega)} \le C\|\nabla u\|_{L^p(\Omega)},$$

for a constant depending on p, q, N, and $\Omega$.

The proof is a combination of the case $q = p^*$ shown in the previous result, and Hölder's inequality when $q < p^*$ ($\Omega$ bounded).

9.3.1.2 The Critical Case

When $p = N$, notice that the critical exponent $p^* = \infty$. As a matter of fact, in this situation the recursive process above, corresponding to the proof of Theorem 9.2, never ends (exercise), and we can conclude that $u \in L^q(\Omega)$ for every q, in case $\Omega$ is bounded. This is, however, different from saying that $u \in L^\infty(\Omega)$. In fact, for $N > 1$, there are functions in $W^{1,N}(\mathbb{R}^N)$ which do not belong to $L^\infty(\mathbb{R}^N)$.

Theorem 9.4 Every function $u \in W_0^{1,N}(\Omega)$ belongs to $L^q(\Omega)$ for every finite q.

9.3.1.3 The Supercritical Case

Let us now deal with the supercritical case $p > N$. To better understand and appreciate the relevance of the power of integrability being strictly larger than the dimension, let us recall first the case $N = 1$. The conclusion in this case is exactly inequality (2.8), where we showed easily that

$$|u(x) - u(y)| \le \|u'\|_{L^p(J)}\, |x - y|^{1-1/p}, \qquad p > 1,$$

for $x, y \in J \subset \mathbb{R}$. As a matter of fact, our approach to higher-dimensional Sobolev spaces relies on the fact that one-dimensional sections along coordinate axes are continuous for a.e. such line. Recall Proposition 7.3 and, even more so, Proposition 7.4. We proceed in several steps.

Step 1. By Proposition 7.8, we can assume that $u \in W^{1,p}(\mathbb{R}^N)$, so that the domain $\Omega$ is all of space. We regard, then, u as defined for every vector $x \in \mathbb{R}^N$.

Step 2. Suppose that Q is any cube with axes parallel to the coordinate axes, and such that $0 \in Q$. For any point $x \in Q$, we have, according to Proposition 7.4, that

$$u(x) - u(0) = \int_0^1 \nabla u(rx) \cdot x\,dr.$$

If we integrate with respect to $x \in Q$, we arrive at

$$\int_Q u(x)\,dx - |Q|\,u(0) = \int_Q \int_0^1 \nabla u(rx)\cdot x\,dr\,dx.$$

If we interchange the order of integration in the last integral, we can also write

$$\int_Q u(x)\,dx - |Q|\,u(0) = \int_0^1 \int_Q \nabla u(rx)\cdot x\,dx\,dr.$$

We can perform the change of variables

$$y = rx, \qquad dy = r^N dx,$$

in the inner integral, for each $r \in (0, 1)$ fixed, and find that

$$\int_Q u(x)\,dx - |Q|\,u(0) = \int_0^1 \frac{1}{r^{N+1}} \int_{rQ} \nabla u(y)\cdot y\,dy\,dr.$$

For the inner integral, after using the classic Cauchy-Schwarz inequality and Hölder's inequality, bearing in mind that $rQ \subset Q$ for $r \in (0, 1)$,

$$\int_{rQ} \nabla u(y)\cdot y\,dy \le \|\nabla u\|_{L^p(Q;\mathbb{R}^N)}\, r^{1+N(p-1)/p}\, s^{1+N(p-1)/p}$$

if $|Q| = s^N$. Altogether, performing the integration with respect to r, we find

$$\left|\int_Q u(x)\,dx - |Q|\,u(0)\right| \le \frac{1}{1 - N/p}\, \|\nabla u\|_{L^p(Q;\mathbb{R}^N)}\, s^{1+N(p-1)/p}.$$

Dividing through by $|Q| = s^N$, we finally get

$$\left|\frac{1}{|Q|}\int_Q u(x)\,dx - u(0)\right| \le \frac{s^{1-N/p}}{1 - N/p}\, \|\nabla u\|_{L^p(Q;\mathbb{R}^N)}.$$

Step 3. Because in the previous step the cube Q was simply assumed to contain the origin 0, and the two sides of the last inequality are translation invariant, we can conclude that

$$\left|\frac{1}{|Q|}\int_Q u(x)\,dx - u(y)\right| \le \frac{s^{1-N/p}}{1 - N/p}\, \|\nabla u\|_{L^p(Q;\mathbb{R}^N)}$$

is also correct for any such cube Q and any $y \in Q$. In particular, if y and z are two given, arbitrary points, and Q is a cube containing them with side, say, $s = 2|y - z|$, then

$$|u(y) - u(z)| \le \left|\frac{1}{|Q|}\int_Q u(x)\,dx - u(z)\right| + \left|\frac{1}{|Q|}\int_Q u(x)\,dx - u(y)\right| \le C|y - z|^{1-N/p}\, \|\nabla u\|_{L^p(\mathbb{R}^N;\mathbb{R}^N)},$$

for a constant C depending on N and p but independent of u.


Theorem 9.5 If $\Omega \subset \mathbb{R}^N$ is a domain, and $p > N$, every function $u \in W_0^{1,p}(\Omega)$¹ is such that

$$|u(y) - u(z)| \le C|y - z|^{1-N/p}\, \|\nabla u\|_{L^p(\Omega;\mathbb{R}^N)}$$

for every $y, z \in \Omega$, and a constant C independent of u. In particular, bounded sets of $W_0^{1,p}(\Omega)$ are equicontinuous, and

$$W_0^{1,p}(\Omega) \subset L^\infty(\Omega).$$

¹ This is understood in the sense that u can be redefined on a negligible set in such a way that the modified function complies with this statement.
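The one-dimensional prototype of this estimate is easy to check numerically. The sketch below (our own illustration, not from the text) verifies, for a smooth sample function on $J = (0, 1)$ and $p = 4$, that all increments are dominated by $\|u'\|_{L^p(J)}|x - y|^{1-1/p}$, exactly as Hölder's inequality predicts.

```python
import numpy as np

# Numerical check (ours) of the 1-d Morrey-type estimate
# |u(x) - u(y)| <= ||u'||_{L^p(J)} |x - y|^(1 - 1/p) on J = (0, 1),
# which follows from Hölder applied to u(x) - u(y) = int_y^x u'(t) dt.
p = 4.0
t = np.linspace(0.0, 1.0, 20001)
dt = t[1] - t[0]
u = np.sin(3 * np.pi * t) + t**2             # a smooth test function
du = np.gradient(u, t)                       # numerical derivative
norm_du_p = (np.sum(np.abs(du)**p) * dt)**(1.0 / p)
# compare increments against the Hölder bound on a subgrid of pairs
idx = np.arange(0, t.size, 500)
lhs = np.abs(u[idx][:, None] - u[idx][None, :])
rhs = norm_du_p * np.abs(t[idx][:, None] - t[idx][None, :])**(1 - 1 / p)
holds = bool(np.all(lhs <= rhs + 1e-8))
print(holds)
```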

9.3.2 The General Case


Note that for the particular case $\Omega = \mathbb{R}^N$, the two spaces $W^{1,p}(\mathbb{R}^N)$ and $W_0^{1,p}(\mathbb{R}^N)$ are identical, and so all of our results are valid for this special domain (without boundary). Recall Remark 7.3. In this section, we would like to explore the extension of the same inequalities we have examined in the previous section to functions u in $W^{1,p}(\Omega)$ without demanding that they have a vanishing trace around $\partial\Omega$. This task typically restricts further the domains $\Omega$ for which that goal can be achieved. Since the treatment of the general situation is rather technical, though important, and relies on the extension property indicated in the Introduction of this chapter, we will not cover it. Instead, and as an example of how the inequality in Theorem 9.2 demands the explicit presence of the $L^1$-norm of the function u, we will redo the above calculations for a cube $\Omega = Q$ in $\mathbb{R}^N$ with edges of size $|Q|^{1/N}$.
We retake identity (9.6) to write

$$|u(x', x)| \le |u(x', y)| + \int_y^x \left|\frac{\partial u}{\partial x_N}(x', z)\right| dz. \tag{9.10}$$

The numbers x and y belong to $J_{x'}$ where, according to (7.11),

$$\Omega = \pi_N\Omega \times J_{N,x'} e_N,$$

and the same can be done for every coordinate direction $e_i$, $i = 1, 2, \dots, N$.
If we integrate this inequality in the variable y over the two subsets

$$I_{x'} = J_{x'} \cap (\inf J_{x'}, x), \qquad D_{x'} = J_{x'} \cap (x, \sup J_{x'}),$$

we arrive at

$$|I_{x'}|\,|u(x', x)| \le \int_{I_{x'}} |u(x', y)|\,dy + \int_{I_{x'}} \int_y^x \left|\frac{\partial u}{\partial x_N}(x', z)\right| dz\,dy,$$

$$|D_{x'}|\,|u(x', x)| \le \int_{D_{x'}} |u(x', y)|\,dy + \int_{D_{x'}} \int_x^y \left|\frac{\partial u}{\partial x_N}(x', z)\right| dz\,dy.$$

If we use Fubini's theorem in the two double integrals (assuming the partial derivative extended by zero when necessary), we find that

$$|I_{x'}|\,|u(x', x)| \le \int_{I_{x'}} |u(x', y)|\,dy + \int_{I_{x'}} (z - \inf J_{x'}) \left|\frac{\partial u}{\partial x_N}(x', z)\right| dz,$$

$$|D_{x'}|\,|u(x', x)| \le \int_{D_{x'}} |u(x', y)|\,dy + \int_{D_{x'}} (\sup J_{x'} - z) \left|\frac{\partial u}{\partial x_N}(x', z)\right| dz,$$

or even

$$|I_{x'}|\,|u(x', x)| \le \int_{I_{x'}} |u(x', y)|\,dy + \int_{J_{x'}} (z - \inf J_{x'}) \left|\frac{\partial u}{\partial x_N}(x', z)\right| dz,$$

$$|D_{x'}|\,|u(x', x)| \le \int_{D_{x'}} |u(x', y)|\,dy + \int_{J_{x'}} (\sup J_{x'} - z) \left|\frac{\partial u}{\partial x_N}(x', z)\right| dz.$$

If we add up these two inequalities, we have

$$d(J_{x'})\,|u(x', x)| \le \int_{J_{x'}} |u(x', y)|\,dy + d(J_{x'}) \int_{J_{x'}} \left|\frac{\partial u}{\partial x_N}(x', z)\right| dz,$$

where

$$d(J_{x'}) = \sup J_{x'} - \inf J_{x'}$$

is the diameter of $J_{x'}$. Dividing through by this positive number $d(J_{x'})$, we are led to the inequality

$$|u(x', x)| \le \frac{1}{d(J_{x'})}\int_{J_{x'}} |u(x', z)|\,dz + \int_{J_{x'}} \left|\frac{\partial u}{\partial x_N}(x', z)\right| dz.$$

This is a possible replacement of inequality (9.7) for the general situation of a function u in $W^{1,1}(\Omega)$. Suppose, in particular, that $\Omega = Q$ is a certain fixed cube. Then

$$|u(x', x)| \le |Q|^{-1/N}\int_{J_{x'}} |u(x', z)|\,dz + \int_{J_{x'}} \left|\frac{\partial u}{\partial x_N}(x', z)\right| dz.$$

If we now define

$$u_i(x'_i) = |Q|^{-1/N}\int_{J_{x'_i}} |u(x'_i, z)|\,dz + \int_{J_{x'_i}} \left|\frac{\partial u}{\partial x_i}(x'_i, z)\right| dz$$

for $i = 1, 2, \dots, N$, we have exactly the same inequality (9.8). Since

$$\|u_i\|_{L^1(\pi_i Q)} \le |Q|^{-1/N}\|u\|_{L^1(Q)} + \|\nabla u\|_{L^1(Q;\mathbb{R}^N)},$$

the same use of Lemma 9.1 in this situation yields

$$\|u\|_{L^{N/(N-1)}(Q)} \le |Q|^{-1/N}\|u\|_{L^1(Q)} + \|\nabla u\|_{L^1(Q;\mathbb{R}^N)}.$$

It does not seem possible to extend this inequality to more general domains than just cubes.

The following is a full summary of the fundamental results of this section.

Theorem 9.6 Let $\Omega \subset \mathbb{R}^N$ be a $C^1$-domain (its boundary $\partial\Omega$ is a $C^1$-manifold of dimension $N - 1$). Let $p \in [1, +\infty]$.
1. Subcritical case $1 \le p < N$:

$$W^{1,p}(\Omega) \subset L^{p^*}(\Omega), \qquad \frac{1}{p^*} = \frac{1}{p} - \frac{1}{N}.$$

2. Critical case $p = N$:

$$W^{1,p}(\Omega) \subset L^q(\Omega), \qquad q \in [p, +\infty).$$

3. Supercritical case $p > N$:

$$W^{1,p}(\Omega) \subset L^\infty(\Omega).$$

Moreover,
1. if $\Omega$ is bounded, and $p < N$, then

$$W^{1,p}(\Omega) \subset L^q(\Omega), \qquad q \in [1, p^*];$$

2. all the above injections are continuous;
3. in the supercritical case $p > N$, we have the inequality

$$|u(x) - u(y)| \le C\|u\|_{W^{1,p}(\Omega)}\, |x - y|^{1-N/p} \tag{9.11}$$

valid for every $u \in W^{1,p}(\Omega)$ and every couple $x, y \in \Omega$, with a constant C depending only on $\Omega$, p and the dimension N; in particular, u admits a continuous representative in $\overline{\Omega}$.
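The three regimes of Theorem 9.6 can be packaged as a small lookup helper. The sketch below is our own illustration (names and return conventions are ours, not the book's), returning the supremum of admissible Lebesgue exponents in each case.

```python
from math import inf

# Illustrative helper (ours, not the book's) encoding the three regimes of
# Theorem 9.6 for W^{1,p}(Omega), Omega a bounded C^1-domain in R^N.
def sobolev_target(p, N):
    """Return (regime, supremum of exponents q with W^{1,p} contained in L^q)."""
    if p < N:
        return ("subcritical", N * p / (N - p))  # q up to p* = Np/(N-p)
    if p == N:
        # every finite q works, but q = infinity itself is excluded
        return ("critical", inf)
    # p > N: u is bounded and Hölder continuous of exponent 1 - N/p
    return ("supercritical", inf)

print(sobolev_target(2, 3))   # ('subcritical', 6.0): H^1 embeds in L^6 in R^3
```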
It is also important to highlight how these results, together with the ideas in the proof of Proposition 7.10, also yield the following fine point.

Theorem 9.7 For a bounded and $C^1$ domain $\Omega \subset \mathbb{R}^N$, the following are compact injections:

$$W^{1,p}(\Omega) \subset L^q(\Omega), \qquad q \in [1, p^*],$$
$$W^{1,p}(\Omega) \subset L^q(\Omega), \qquad q \in [1, +\infty),$$
$$W^{1,p}(\Omega) \subset C(\overline{\Omega}),$$

in the subcritical, critical and supercritical cases, respectively.



Proof The supercritical case is a direct consequence of (9.11) and the classical Arzelà-Ascoli theorem based on equicontinuity. Both the subcritical and critical cases are a consequence of the same proof of Proposition 7.10. Note that this explains why the limit exponent $p^*$ can be included in the subcritical case $p < N$, while it is not so for the critical case $p = N$. ⨆

9.3.3 Higher-Order Sobolev Spaces

One can definitely apply Sobolev inequalities to the first partial derivatives of functions in $W^{2,p}(\Omega)$, thus leading to better properties of functions in this space. We simply state here one main general result, which does not deserve any further comment as it can be deduced inductively on the order of derivation.

Theorem 9.8 Suppose $\Omega \subset \mathbb{R}^N$ is a $C^1$, bounded domain. Let $u \in W^{k,p}(\Omega)$.
1. If $k < N/p$ and $1/q = 1/p - k/N$, then $u \in L^q(\Omega)$, and

$$\|u\|_{L^q(\Omega)} \le C\|u\|_{W^{k,p}(\Omega)}$$

for a positive constant C independent of u, and depending on k, p, N and $\Omega$.
2. If $k > N/p$, put

$$\alpha = \begin{cases} [N/p] + 1 - N/p, & N/p \text{ is not an integer},\\ \text{any positive value} < 1, & N/p \text{ is an integer}.\end{cases}$$

Then² $u \in C^{k-[N/p]-1,\alpha}(\overline{\Omega})$, and

$$|u_\gamma(x) - u_\gamma(y)| \le C\|u\|_{W^{k,p}(\Omega)}\, |x - y|^\alpha,$$

where $u_\gamma$ is any derivative of u of order $k - [N/p] - 1$, x, y are two arbitrary points in $\Omega$, and the constant C only depends on k, p, N, $\alpha$, and $\Omega$.
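The exponent bookkeeping in Theorem 9.8 is easy to mechanize. The following sketch (our own illustration, with hypothetical helper names) computes, for given k, p, N, either the Lebesgue exponent q or the order and Hölder exponent of the resulting classical derivatives.

```python
from math import floor

# Illustrative arithmetic for Theorem 9.8 (helper and names are ours, not the
# book's): for u in W^{k,p}(Omega), either k < N/p and u gains integrability,
# or k > N/p and u gains classical Hölder smoothness.
def higher_order_embedding(k, p, N):
    ratio = N / p
    if k < ratio:                          # Lebesgue regime: 1/q = 1/p - k/N
        return ("Lebesgue", 1.0 / (1.0 / p - k / N))
    if k > ratio:                          # Hölder regime
        m = k - floor(ratio) - 1           # derivatives of order m are Hölder
        alpha = floor(ratio) + 1 - ratio if ratio != floor(ratio) else None
        return ("Holder", m, alpha)        # alpha = None: any exponent < 1
    return ("borderline",)                 # k = N/p

print(higher_order_embedding(2, 2, 5))     # W^{2,2}(R^5): L^q with q = 10
print(higher_order_embedding(2, 2, 3))     # W^{2,2}(R^3): C^{0,1/2}
```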

9.4 Regularity of Domains, Extension, and Density

We have examined above two particular situations where functions in a Sobolev space over $\Omega$ can be extended in an orderly way to functions on all of space $\mathbb{R}^N$. We would like to briefly describe, without entering into details, a more general method by which this extension can be carried through. Basically, the extension of functions

² The notation $[r]$ indicates the integer part of the number r.

defined in $\Omega$ to a bigger set $\tilde\Omega$ requires some regular transformation $\Phi : \tilde\Omega \setminus \Omega \to \Omega$ to define the extension through

$$U(x) = u(\Phi(x)) \quad \text{for } x \in \tilde\Omega \setminus \Omega.$$

Suppose $\Omega$ is a $C^1$-domain according to Definition 7.3, with $\phi$ its defining $C^1$-function and $\epsilon > 0$ the strip parameter around $\partial\Omega$. We learn from ODE courses that the flow

$$\Phi(t, x) : [0, T) \times \mathbb{R}^N \to \mathbb{R}^N, \qquad \Phi'(t, x) = \nabla\phi(\Phi(t, x)), \quad \Phi(0, x) = x,$$

corresponding to the dynamical system

$$x'(t) = \nabla\phi(x(t)) \tag{9.12}$$

is a continuous mapping in the variable x. Moreover, we also define the function $t(x)$ through the following condition:

$$t(x) : \{-\epsilon < \phi \le 0\} \to \{0 \le \phi < \epsilon\}, \qquad \phi(\Phi(t(x), x)) = -\phi(x).$$

This function is as regular as the flow $\Phi$, and, moreover,

$$t(x) = 0 \quad \text{for every } x \in \partial\Omega.$$

Note that integral curves of (9.12) travel perpendicularly into $\Omega$.


Though the following result can be treated in the generality of Definition 7.3, for the sake of simplicity and to avoid some additional technicalities, we will assume that the function $\phi$ is $C^2$, so as to rely on the $C^1$-feature of the flow mapping $\Phi$.

Proposition 9.2 Let $\Omega$, $\phi$, and $\epsilon$ be as in Definition 7.3, with $\phi$ a $C^2$-function and $\partial\Omega$ bounded. Let $u \in W^{1,p}(\Omega)$, and define

$$U(x) = \begin{cases} u(x), & x \in \Omega,\\[2pt] \left(1 - \frac{1}{\epsilon}\,\phi(\Phi(t(x), x))\right) u(\Phi(t(x), x)), & x \in \{-\epsilon < \phi < 0\},\\[2pt] 0, & x \in \mathbb{R}^N \setminus \Omega_\epsilon.\end{cases}$$

Then $U \in W^{1,p}(\mathbb{R}^N)$ is an extension of u, and

$$\|U\|_{W^{1,p}(\mathbb{R}^N)} \le C\|u\|_{W^{1,p}(\Omega)},$$

for some constant C depending on $\Omega$ through $\phi$ and $\epsilon$.



With the help of the extension operation, it is easy to show the density of smooth functions in Sobolev spaces. The proof utilizes the ideas of Sect. 2.29 in a higher-dimensional framework, in the spirit of Corollary 7.1.

Corollary 9.1 The restrictions of functions in $C^\infty(\mathbb{R}^N)$ with compact support make up a dense subspace of $W^{1,p}(\Omega)$ if $\Omega$ is $C^1$.

9.5 An Existence Theorem Under More General Coercivity Conditions

Though more general examples could be dealt with, we will restrict attention to functionals of type (9.4),

$$I(u) = \int_\Omega \left(\frac12|\nabla u(x)|^2 + f(u(x))\right) dx, \tag{9.13}$$

to understand how, taking advantage of the better integrability properties that we have learnt in previous sections, one can relax the coercivity condition in our Theorem 8.2 and yet retain the existence of minimizers for the functional under typical Dirichlet boundary conditions. We therefore stick in this section to the model problem

$$\text{Minimize in } u(x) \in H^1(\Omega): \quad \int_\Omega \left(\frac12|\nabla u(x)|^2 + f(u(x))\right) dx \tag{9.14}$$

under

$$u - u_0 \in H_0^1(\Omega), \qquad u_0 \in H^1(\Omega), \text{ given}.$$

Our goal is then to find more general conditions on the function $f(u)$ that still allow us to retain existence of minimizers. In particular, some of these conditions permit non-linearities $f(u)$ not bounded from below. In all cases, the main concern is to recover the necessary coercivity. The basic idea is to arrange things so that the contribution coming from the function f can be absorbed by the term involving the square of the gradient. Some of the techniques utilized are typical in this kind of calculation.
Theorem 9.9
1. Suppose $f(u) : \mathbb{R} \to \mathbb{R}$ is a continuous, bounded-from-below function. Then there are minimizers for (9.14).
2. Let $f(u)$ be such that

$$|f(u)| \le \lambda u^2 + C, \qquad 0 \le \lambda < \frac{1}{8C_P^2},$$

where $C_P > 0$ is the best constant for Poincaré's inequality in $H_0^1(\Omega)$. Then there are minimizers for (9.14).
3. If the function $f(u)$ is such that

$$|f(u)| \le c|u|^r + C, \qquad r < 2,\ c > 0,$$

then there are minimizers for our problem.


Proof The first situation is immediate because the coercivity condition (8.6) for $p = 2$ is valid as soon as $f(u)$ is bounded from below by some constant $C \in \mathbb{R}$, i.e.

$$\frac12|u|^2 + C \le \frac12|u|^2 + f(u).$$

The existence of minimizers is then a direct consequence of Theorem 8.2.

The second possibility is also easy. Note that, by Poincaré's inequality applied to the difference $u - u_0 \in H_0^1(\Omega)$,

$$\begin{aligned}
\int_\Omega |f(u(x))|\,dx &\le \lambda\|u\|^2_{L^2(\Omega)} + C|\Omega|\\
&\le 2\lambda\|u - u_0\|^2_{L^2(\Omega)} + 2\lambda\|u_0\|^2_{L^2(\Omega)} + C|\Omega|\\
&\le 2\lambda C_P^2\|\nabla(u - u_0)\|^2_{L^2(\Omega;\mathbb{R}^N)} + 2\lambda\|u_0\|^2_{L^2(\Omega)} + C|\Omega|\\
&\le 4\lambda C_P^2\|\nabla u\|^2_{L^2(\Omega;\mathbb{R}^N)} + 4\lambda C_P^2\|\nabla u_0\|^2_{L^2(\Omega;\mathbb{R}^N)} + 2\lambda\|u_0\|^2_{L^2(\Omega)} + C|\Omega|\\
&= 4\lambda C_P^2\|\nabla u\|^2_{L^2(\Omega;\mathbb{R}^N)} + \tilde{C},
\end{aligned}$$

where $\tilde{C}$ is a constant independent of u. Hence,

$$\left(\frac12 - 4\lambda C_P^2\right)\|\nabla u\|^2_{L^2(\Omega;\mathbb{R}^N)} - \tilde{C} \le \int_\Omega \left(\frac12|\nabla u(x)|^2 + f(u(x))\right) dx,$$

with

$$\frac12 - 4\lambda C_P^2 > 0.$$

This is the coercivity required to recover existence of minimizers through the direct method.
For the third case, by Young's inequality (Lemma 2.1), applied, for an arbitrary $\delta > 0$ and $x \in \Omega$, to the factors

$$a = \delta|u(x)|^r, \qquad b = \frac{1}{\delta},$$

with the exponents

$$p = \frac{2}{r} > 1, \qquad q = \frac{2}{2-r},$$

we see that

$$|u(x)|^r = |ab| \le \frac{\delta^p}{p}|u(x)|^2 + \frac{1}{q\delta^q}.$$

An integration in $\Omega$ leads to

$$\int_\Omega |u(x)|^r\,dx \le \frac{\delta^p}{p}\|u\|^2_{L^2(\Omega)} + \frac{|\Omega|}{q\delta^q}.$$

From this inequality, and following along the calculations of the previous case, we find that

$$\int_\Omega \left(\frac12|\nabla u(x)|^2 + f(u(x))\right) dx \ge \left(\frac12 - \frac{4c\delta^p}{p}C_P^2\right)\|\nabla u\|^2_{L^2(\Omega;\mathbb{R}^N)} + \tilde{C}(\delta),$$

where the constant $\tilde{C}(\delta)$ turns out to be

$$\tilde{C}(\delta) = -\frac{4c\delta^p}{p}C_P^2\|\nabla u_0\|^2_{L^2(\Omega;\mathbb{R}^N)} - \frac{2c\delta^p}{p}\|u_0\|^2_{L^2(\Omega)} - \frac{c|\Omega|}{q\delta^q} - C.$$

Since we still have the parameter $\delta$ at our disposal, it suffices to select it in such a way that

$$0 < \delta < \left(\frac{p}{8cC_P^2}\right)^{1/p},$$

to ensure the necessary coercivity, and to conclude the proof. ⨆
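The pointwise Young-inequality step in this proof is easy to sanity-check numerically. The snippet below (our own illustration, not from the text) verifies that with $a = \delta|u|^r$, $b = 1/\delta$, $p = 2/r$ and $q = 2/(2-r)$, the bound $|u|^r \le (\delta^p/p)|u|^2 + 1/(q\delta^q)$ holds for every sampled value of u.

```python
import numpy as np

# Numerical sanity check (ours, not the book's) of the Young-inequality step:
# with a = delta*|u|^r, b = 1/delta, and conjugate exponents p = 2/r,
# q = 2/(2 - r), one gets |u|^r = a*b <= a^p/p + b^q/q
#                                     = (delta^p/p)*|u|^2 + 1/(q*delta^q).
r, delta = 1.5, 0.7
p, q = 2.0 / r, 2.0 / (2.0 - r)
u = np.linspace(-50.0, 50.0, 100001)
lhs = np.abs(u)**r
rhs = (delta**p / p) * u**2 + 1.0 / (q * delta**q)
holds = bool(np.all(lhs <= rhs + 1e-12))
print(round(1 / p + 1 / q, 12), holds)   # exponents are conjugate; bound holds
```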


9.6 Critical Point Theory

In this section, we become interested in solutions of the critical function equation (the Euler-Lagrange equation) not coming from minimization, for a functional of the same kind as (9.13); that is to say, in critical functions that are not minimizers. The emphasis is placed on the fundamental compactness property of Palais-Smale that permits translating some of the intuitive ideas for functions of several (finitely many) variables to the infinite-dimensional setting. Unfortunately, checking details in a complete way requires quite a good deal of calculations with constants, exponents, norms in different spaces, various easy estimates and inequalities with numbers and functions, etc. In particular, we will have an opportunity to see the relevance of Poincaré's and Sobolev inequalities. The basic relevant concepts are exactly as in the finite-dimensional case.
Definition 9.1 Let $E : \mathbb{H} \to \mathbb{R}$ be a $C^1$-functional defined in a Hilbert space $\mathbb{H}$.
1. A vector $u \in \mathbb{H}$ is critical for E if $E'(u) = 0$.
2. A number c is a critical value for E if there is $u \in \mathbb{H}$ such that

$$E(u) = c, \qquad E'(u) = 0.$$

The basic heuristic principle to detect critical points which are not minimizers is the following. When one is seeking a minimizer, one is led to minimize "in every possible direction or way". If, on the other hand, we look for critical points (where the derivative or gradient must vanish) of a function or functional that may not be minimizers, the first attempt would be to minimize "in all but one direction or dimension". This is the simplest version of a primitive min-max principle: minimize the maximum across a family of objects of a finite dimension. It can be made specific in the following intuitive way.

Let $E : \mathbb{H} \to \mathbb{R}$ be a $C^1$-functional over a Hilbert space $\mathbb{H}$. Take two vectors $u_0$, $u_1$ in $\mathbb{H}$, and consider the class $\Gamma$ of smooth curves, regarded as one-dimensional objects,

$$\Gamma = \{\gamma(t) : [0, 1] \to \mathbb{H} \;:\; \gamma(0) = u_0,\ \gamma(1) = u_1\}.$$

For each such curve $\gamma \in \Gamma$, we would like to detect the maximum of the composition with E,

$$\max_{t\in[0,1]} E(\gamma(t)),$$

which is obviously attained at some $t_M$ that will, most likely, depend on $\gamma$. Then, we take the minimum (infimum) of all these maximum values across the set of curves in $\Gamma$,

$$c = \inf_{\gamma\in\Gamma}\, \max_{t\in[0,1]} E(\gamma(t)),$$

and trust that this value ought to be critical. There are three main concerns:
1. c is a finite value (it suffices that E be bounded from below, though this is not always the case);
2. c is indeed a true critical value, so that there is some $u \in \mathbb{H}$ with $E'(u) = 0$ and $E(u) = c$;
3. c is not a critical value for a minimizer.



The subtle issue is to ensure that c is a critical value. This would be almost immediate in a finite-dimensional setting, but it requires some caution in an infinite-dimensional situation that, as in the case of weak convergence, is related to lack of compactness. The nature of the functional E needs to be strengthened, but we will see exactly in which step and what that reinforcement should be.

Imagine that a certain $c \in \mathbb{R}$ is not a critical value for the functional E. This means that the derivative cannot vanish in the region

$$\{c - \epsilon \le E(u) \le c + \epsilon\} \equiv \{u \in \mathbb{H} : E(u) \in [c - \epsilon, c + \epsilon]\},$$

at least for some small positive $\epsilon$. If we are working in finite dimension, this implies that the derivative $E'$ is uniformly away from zero in such a region, i.e. there is some $\delta > 0$ (depending on $\epsilon$) such that

$$\{c - \epsilon \le E(u) \le c + \epsilon\} \subset \{\|E'\| \ge \delta\}.$$

However, in an infinite-dimensional setting, there might be a sequence $\{u_j\}$ such that

$$E(u_j) \in [c - \epsilon, c + \epsilon], \qquad E'(u_j) \to 0.$$

Such a sequence could even be uniformly bounded in $\mathbb{H}$, but only converge in a weak sense to some u. The point is that, because the convergence is only weak, we lose all information concerning $E(u)$ and $E'(u)$. There is no way to ensure whether u is a critical point for E. This is precisely what needs to be adjusted for functionals defined in infinite-dimensional spaces.
Definition 9.2 A $C^1$-functional $E : \mathbb{H} \to \mathbb{R}$ defined on a Hilbert space $\mathbb{H}$ is said to enjoy the Palais-Smale property if, for every sequence $\{u_j\} \subset \mathbb{H}$ such that $\{E(u_j)\}$ is a bounded sequence of numbers and $E'(u_j) \to 0$ (strongly) in $\mathbb{H}$, we can always find some subsequence converging in $\mathbb{H}$. Sequences $\{u_j\}$ with

$$|E(u_j)| \le M, \qquad E'(u_j) \to 0,$$

for some fixed M, are called Palais-Smale sequences (for E).


Before proceeding, since this concept may sound a bit artificial to readers, it is interesting to isolate the main family of functionals that comply with the Palais-Smale condition. Recall that $\mathbf{1} : \mathbb{H} \to \mathbb{H}$ is the identity operator.

Proposition 9.3 Suppose that Palais-Smale sequences are uniformly bounded (for instance, if E is coercive) for a $C^1$-functional $E : \mathbb{H} \to \mathbb{R}$, and

$$E' = \mathbf{1} + K, \qquad K : \mathbb{H} \to \mathbb{H} \text{ a compact operator}.$$

Then E enjoys the Palais-Smale property.



Proof If $\{u_j\}$ is a Palais-Smale sequence for E, it is bounded, and, for some subsequence, not relabeled, we would have the weak convergence $u_j \rightharpoonup u$. The compactness of K would lead to a further subsequence, if necessary, with $Ku_j \to \tilde{u}$. Thus

$$u_j = E'(u_j) - Ku_j \to -\tilde{u}$$

strongly. ⨆

We now focus on the most studied situation, in which we take $\mathbb{H} = H_0^1(\Omega)$ for a bounded domain $\Omega \subset \mathbb{R}^N$, and $E : \mathbb{H} \to \mathbb{R}$ defined by

$$E(u) = \int_\Omega \left(\frac12|\nabla u(x)|^2 + f(u(x))\right) dx \tag{9.15}$$

with f a $C^1$-function. Most of the statements of the technical assumptions for the non-linearity $f(u)$ would require distinguishing between the dimensions $N = 1, 2$ and $N > 2$. As a rule, we will assume in what follows that $N > 2$, and will leave the cases $N = 1, 2$ for the exercises.


Lemma 9.3
1. If

$$|f'(u)| \le c|u|^r + C, \qquad c > 0, \quad r < \frac{N+2}{N-2}, \tag{9.16}$$

then the functional E in (9.15) is $C^1$, and $E' = \mathbf{1} + K$ with K compact.
2. If, in addition to (9.16),

$$|f(u)| \le \lambda|u|^2 + C,$$

with

$$0 \le \lambda < \frac{1}{8C_P^2},$$

where $C_P$ is the best constant in Poincaré's inequality in $H_0^1(\Omega)$, then Palais-Smale sequences are uniformly bounded in $H_0^1(\Omega)$, and, by Proposition 9.3, the functional E in (9.15) complies with the Palais-Smale condition.
3. If, in addition to (9.16),

$$f(u) - \lambda f'(u)u \ge C, \qquad \lambda < \frac12, \tag{9.17}$$

then Palais-Smale sequences are uniformly bounded in $H_0^1(\Omega)$ (though this time the functional might not be coercive), and, again by Proposition 9.3, the functional E in (9.15) complies with the Palais-Smale condition.
Proof In order to apply Proposition 9.3 to the functional (9.15), we need to check three things:
1. E is $C^1$;
2. $E' = \mathbf{1} + K$ with K compact;
3. Palais-Smale sequences are uniformly bounded in $\mathbb{H}$.

For the first point, and according to Definitions 2.13 and 2.14, we need to calculate the directional derivative

$$\frac{d}{d\epsilon}\, I(u + \epsilon U)\Big|_{\epsilon=0}$$

and make sure that it is a continuous operation in u and linear in U. Based on the differentiability of f, it is easy to conclude that

$$\frac{d}{d\epsilon}\int_\Omega \left(\frac12|\nabla u(x) + \epsilon\nabla U(x)|^2 + f(u(x) + \epsilon U(x))\right) dx\,\Big|_{\epsilon=0}$$

is given by the expression

$$\int_\Omega \left(\nabla u(x)\cdot\nabla U(x) + f'(u(x))U(x)\right) dx,$$

which is indeed continuous in u, since $f'$ is, and linear in U. Moreover, by Lemma 2.7, the derivative can be identified as the unique minimizer of the quadratic variational problem that consists in minimizing in $H_0^1(\Omega)$ the functional

$$\int_\Omega \left[\frac12|\nabla U(x)|^2 - \nabla u(x)\cdot\nabla U(x) - f'(u(x))U(x)\right] dx.$$

The unique solution U of this problem is furnished, for instance, by the versions of the Lax-Milgram lemma in Sect. 8.2, and can be identified with the unique solution $U\,(= I'(u))$ (for given u) of the linear PDE problem

$$-\operatorname{div}(\nabla U - \nabla u) - f'(u) = 0 \text{ in } \Omega, \qquad U = 0 \text{ on } \partial\Omega.$$

We clearly see that $U = I'(u)$ is u plus the operation K taking u into the solution $v \in H_0^1(\Omega)$ of the problem

$$-\Delta v = f'(u) \text{ in } \Omega, \qquad v = 0 \text{ on } \partial\Omega.$$



By writing its weak formulation, using v itself as a test function, we see that

$$\int_\Omega |\nabla v(x)|^2\,dx = \int_\Omega f'(u(x))v(x)\,dx.$$

Using Hölder's inequality on the integral on the right-hand side, with exponents

$$p = \frac{2N}{N-2}, \qquad q = \frac{2N}{N+2},$$

we arrive at

$$\|\nabla v\|^2_{L^2(\Omega)} \le \|v\|_{L^{2N/(N-2)}(\Omega)}\, \|f'(u)\|_{L^{2N/(N+2)}(\Omega)}.$$

The corresponding embedding inequality leaves us with

$$\|\nabla v\|_{L^2(\Omega)} \le \|f'(u)\|_{L^{2N/(N+2)}(\Omega)},$$

and the bound on the size of $f'(u)$ leads us to conclude that

$$\|\nabla v\|_{L^2(\Omega)} \le c\|u\|_{L^{(N+2)/(N-2)}(\Omega)} + C$$

for new constants $c > 0$ and C. Due to the compact embedding of $H_0^1(\Omega)$ into $L^2(\Omega)$, we see that K is indeed compact.

In case the upper bounds for $f'$ and f can be established in relationship to Poincaré's inequality, the coercivity shown in the proof of the second part of Theorem 9.9 furnishes the final ingredient: Palais-Smale sequences are uniformly bounded.
For the last item, suppose $\{u_j\}$ is a Palais-Smale sequence,

$$E(u_j) \le M, \qquad E'(u_j) \to 0,$$

that is,

$$\int_\Omega \left(\frac12|\nabla u_j(x)|^2 + f(u_j(x))\right) dx \le M,$$

$$\int_\Omega \left(\nabla u_j(x)\cdot\nabla U(x) + f'(u_j(x))U(x)\right) dx \to 0,$$

uniformly on bounded sets of test functions U in $H_0^1(\Omega)$. In particular, for

$$U = \frac{1}{\|\nabla u_j\|_{L^2(\Omega)}}\, u_j,$$

we can deduce that

$$\int_\Omega \left(|\nabla u_j(x)|^2 + f'(u_j(x))u_j(x)\right) dx \le \|\nabla u_j\|_{L^2(\Omega)}$$

for j sufficiently large. From all this information, we conclude that for the combination

$$Q \equiv \int_\Omega \left(\frac12|\nabla u_j(x)|^2 + f(u_j(x))\right) dx - \lambda \int_\Omega \left(|\nabla u_j(x)|^2 + f'(u_j(x))u_j(x)\right) dx$$

we find, through (9.17),

$$\left(\frac12 - \lambda\right)\|\nabla u_j\|^2_{L^2(\Omega)} + C \le Q \le M + \lambda\|\nabla u_j\|_{L^2(\Omega)}.$$

Since the coefficient in front of the square is strictly positive, this inequality implies the uniform boundedness of the sequence of numbers $\{\|\nabla u_j\|\}$, as claimed. ⨆

We finally need to face the min-max principle announced above. It is universally known as the mountain-pass lemma, as this term intuitively describes the situation. Recall the definition of the class of paths $\Gamma$ when the two vectors $u_0$ and $u_1$ are given.

Theorem 9.10 Let $\mathbb{H}$ be a Hilbert space and E a $C^1$-functional defined on $\mathbb{H}$ that satisfies the Palais-Smale condition. If there are $u_0$, $u_1$ in $\mathbb{H}$, and

$$0 < r < \|u_0 - u_1\|,$$

such that

$$\max\{E(u_0), E(u_1)\} < m_r \equiv \inf_{\|u - u_0\| = r} E(u),$$

then the value

$$c = \inf_{\gamma\in\Gamma}\, \max_{t\in[0,1]} E(\gamma(t)) \ge m_r$$

is a critical value for E.


Proof Since each element γ in Γ must intersect the sphere where mr is
the infimum, because r < ||u0 − u1||, we clearly see that c ≥ mr, and, hence, c
cannot be a critical value corresponding to a (local) minimum.
9.6 Critical Point Theory 309

We will proceed by contradiction, assuming that c is not a critical value. If this is
so, and E verifies the Palais-Smale condition, we claim that there are

ε < ½ ( mr − max{E(u0), E(u1)} ),   δ > 0,

with the property

E(u) ∈ [c − ε, c + ε]   implies   ||E'(u)|| ≥ δ. (9.18)

Suppose this were not the case, so that we could find uj with E(uj) − c → 0 and
yet E'(uj) → 0. These conditions exactly mean that {uj} would be a Palais-Smale
sequence, and therefore we would be able to find an accumulation vector u which,
by continuity of E and E', would be a critical point at level c. Since this situation is
impossible, we can certainly find some such ε and δ for which (9.18) is correct.
By definition of c, there should be a path γ ∈ Γ such that

E(γ(t)) ≤ c + ε,   t ∈ [0, 1].

The idea is to use the flow of E to produce a new feasible path γ̃ such that

E(γ̃(t)) ≤ c − ε.

This would be the contradiction, since again the definition of c makes this
impossible. But calculations need to be performed quantitatively in a very precise
way.
For each fixed .t ∈ [0, 1] for which

E(γ(t)) > c − ε,

consider the infinite-dimensional gradient system

σ'(s) = −E'(σ(s)),   σ(0) = γ(t).

We will assume that E' is locally Lipschitz continuous to avoid some further
technicalities. This requirement is not necessary, but if E' does not satisfy this
local Lipschitz condition one needs to modify the argument appropriately. By Lemma 2.6,
the previous gradient system is defined for all s positive, and its solution .σ (s; γ (t))
depends continuously on the initial datum .γ (t). Furthermore, again by Lemma 2.6,

for r > 0,

E(σ(r)) − E(γ(t)) = ∫₀^r ⟨E'(σ(s)), σ'(s)⟩ ds = − ∫₀^r ||E'(σ(s))||² ds ≤ −rδ²,

while σ(s) stays in the region {E ∈ [c − ε, c + ε]}. Thus, while this is correct,

E(σ(r)) ≤ E(γ(t)) − rδ² ≤ c − ε, (9.19)

as soon as

( E(γ(t)) − c + ε ) / δ² ≤ r.
If we let r(t) be the (continuous) minimum of these values of r for which (9.19)
holds, we find a new continuous path

γ̃(t) = σ(r(t); γ(t))

with

E(γ̃(t)) ≤ c − ε,   t ∈ [0, 1].

This contradicts the definition of c; consequently, c must be a critical value. ⨆
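The deformation step in the proof is easy to visualize in finite dimensions: integrating the gradient system σ'(s) = −E'(σ(s)) with an explicit Euler scheme decreases E at the rate ||E'(σ(s))||², which is precisely the quantitative fact exploited above. A minimal sketch, assuming an illustrative functional E(x) = |x|²/2 on R² and an arbitrary step size (neither comes from the text):

```python
import numpy as np

def grad_flow(E_grad, x0, step=1e-2, n_steps=500):
    """Explicit Euler discretization of sigma'(s) = -E'(sigma(s))."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x - step * E_grad(x)
    return x

# Illustrative choice: E(x) = |x|^2 / 2, so E'(x) = x and the flow contracts to 0.
E = lambda x: 0.5 * float(np.dot(x, x))
E_grad = lambda x: x

x0 = np.array([1.0, -2.0])
x_end = grad_flow(E_grad, x0)
assert E(x_end) < E(x0)   # E strictly decreases along the flow
```

Along any trajectory, E(σ(r)) − E(σ(0)) = −∫₀^r ||E'(σ(s))||² ds, which is the inequality used above to push the path γ below the level c − ε.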



One of the principal applications of the results above, one that justifies the central
role of the Palais-Smale condition in infinite-dimensional critical point theory,
refers again to the situation around (9.15), with

E : H¹₀(Ω) → R,   E(u) = ∫_Ω ( ½|∇u(x)|² − f(u(x)) ) dx.

Note the change of sign in front of f(u).


Theorem 9.11 Consider this last problem with

|f'(u)| ≤ c|u|^r + C,   c > 0,  1 < r < (N + 2)/(N − 2), (9.20)

lim_{u→0} f(u)/u² = 0, (9.21)

0 < f(u) ≤ λ f'(u) u,   0 < λ < ½,  |u| ≥ R, (9.22)

for some positive R. Then there is some non-trivial critical function for the problem.
Proof Since condition (9.22) clearly implies (9.16), and thus, by Proposition 9.3
and Lemma 9.3, the functional enjoys the Palais-Smale property, all that is left to
do, in order to apply Theorem 9.10, is to find the two functions u0, u1 for which
the situation in that theorem holds. We will show that u0 ≡ 0 with E(0) = 0 is a
strict local minimizer, and that for every v ∈ H¹₀(Ω) there is always some t with
E(tv) ≤ 0. Taking u1 = tv, the fact that the trivial function is a strict local
minimum places us exactly in the situation of Theorem 9.10, and the result follows.
It is clear that .E(0) = 0. We want to show that for some positive .ρ, .E(u) > 0,
whenever .0 < ||u|| ≤ ρ. Recall that

||u|| = ||∇u||_{L²(Ω)}.

Let ε > 0 be arbitrary. From (9.21), there is some δ = δ(ε) > 0 with the property

|f(u)| ≤ (ε/2) u²,   |u| ≤ δ.
On the other hand, from (9.20) by integration, we find, for some constant C = C(ε),

|f(u)| ≤ C|u|^{r+1},   |u| ≥ δ.

Altogether, we have

|f(u)| ≤ (ε/2) u² + C|u|^{r+1}

for all u. Bearing this estimate in mind, we proceed to estimate E(u). Indeed,

E(u) ≥ ½ ||∇u||²_{L²(Ω)} − (ε/2) ||u||²_{L²(Ω)} − C ||u||^{r+1}_{L^{r+1}(Ω)}.
We now invoke two facts. First, Poincaré's inequality

||u||_{L²(Ω)} ≤ C_P ||∇u||_{L²(Ω)};

secondly, the Sobolev inequality

||u||_{L^{r+1}(Ω)} ≤ C_S ||∇u||_{L²(Ω)}

for some constant. Note that r + 1 < 2N/(N − 2). We can hence estimate

E(u) ≥ ( ½ − (ε/2) C_P² ) ||∇u||²_{L²(Ω)} − C C_S^{r+1} ||∇u||^{r+1}_{L²(Ω)}.

If we take ε sufficiently small so that the coefficient in front of ||∇u||²_{L²(Ω)} is positive,
and realizing that r + 1 > 2, we can certainly conclude that E(u) > 0 for all non-trivial
u in a certain ball around the trivial function. This is exactly what is meant by saying
that u ≡ 0 is a strict local minimum.
Finally, it is elementary to check (Exercise 4) that (9.22) implies

f(u) ≥ a|u|^p + b,   a > 0,  b ∈ R,  p = 1/λ > 2.
By using this estimate in our functional, we arrive at the upper bound

E(u) ≤ ½ ||∇u||²_{L²(Ω)} − a ||u||^p_{L^p(Ω)} − b|Ω|.
Replacing u by tu and letting t grow, we find that the right-hand side converges
to −∞ as t → ∞, because p > 2. Consequently, the same is correct for E(tu).
In particular, for each given u, there is some t sufficiently large so that E(tu) < 0.
This was the other necessary ingredient. ⨆
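The final scaling argument can be tested numerically in one dimension. For the illustrative choice f(u) = u⁴/4, which satisfies (9.22) with λ = 1/4, a finite-difference evaluation of E(tu) = ∫ ( ½|t u'|² − f(tu) ) dx shows the t² gradient term eventually overwhelmed by the t⁴ term; the grid and the test function below are assumptions for illustration only:

```python
import numpy as np

def energy(u, h):
    """Finite-difference value of E(u) = ∫ (1/2)|u'|^2 - u^4/4 dx on (0, 1)."""
    du = np.diff(u) / h                          # forward differences for u'
    return h * (0.5 * np.sum(du ** 2) - 0.25 * np.sum(u[:-1] ** 4))

n = 200
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
u = np.sin(np.pi * x)                            # a fixed direction in H_0^1(0, 1)

values = [energy(t * u, h) for t in (1.0, 5.0, 10.0, 20.0)]
# E(tu) = t^2 A - t^4 B with A, B > 0: positive for small t, then down to -infinity.
assert values[0] > 0 and values[-1] < 0
```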

We will see more specific examples in the exercises.

9.7 Regularity. Strong Solutions for PDEs

We start with an extension of the identity (8.13).


Lemma 9.4 Let Ω ⊂ R^N be a regular, smooth domain whose boundary ∂Ω is
a C²-smooth, (N − 1)-dimensional manifold, possibly with several connected
components, with unit outer normal direction n(x).
1. If u ∈ H²(Ω) ∩ H¹₀(Ω), then

∫_Ω |∇²u(x)|² dx = ∫_Ω |Δu(x)|² dx + ∫_∂Ω H(x) |∇u(x) · n(x)|² dS(x),

where H(x) is the curvature of ∂Ω at x.


2. If u ∈ H²(Ω) with

∇u(x) · n(x) = 0 on ∂Ω,

then

∫_Ω |∇²u(x)|² dx = ∫_Ω |Δu(x)|² dx + ∫_∂Ω ½ ∇( |∇u|² ) · n dS(x).

3. In particular, if u ∈ H²₀(Ω), then

∫_Ω |∇²u(x)|² dx = ∫_Ω |Δu(x)|² dx.

Proof The main computation involves a typical integration by parts. Proceeding by
density, just as in Definition 7.6, suppose for the time being that u is C^∞-smooth in Ω̄.
Then

∫_Ω |∇²u(x)|² dx = ∫_Ω tr( ∇²u(x) ∇²u(x) ) dx = Σᵢ ∫_Ω ∇(∂u/∂xᵢ) · ∇(∂u/∂xᵢ) dx.

We can use the divergence theorem for smooth functions and domains twice in each
of the terms of the last sum to write, keeping track of boundary contributions and
putting n = (nᵢ),

∫_Ω ∇(∂u/∂xᵢ) · ∇(∂u/∂xᵢ) dx = − ∫_Ω (∂u/∂xᵢ) ∂(Δu)/∂xᵢ dx + ∫_∂Ω (∂u/∂xᵢ) ∇(∂u/∂xᵢ) · n dS(x)

   = ∫_Ω (∂²u/∂xᵢ²) Δu dx − ∫_∂Ω (∂u/∂xᵢ) nᵢ Δu dS(x) + ∫_∂Ω (∂u/∂xᵢ) ∇(∂u/∂xᵢ) · n dS(x).

The sum in the index i carries us, through the first identity above, to

∫_Ω |∇²u(x)|² dx = ∫_Ω |Δu(x)|² dx + ∫_∂Ω ( ∇u · ∇²u n − (∇u · n) Δu ) dS(x). (9.23)

If u = 0 on ∂Ω, then

|∇u| n = ∇u,   ∇u · n = |∇u|,

on ∂Ω, because this boundary becomes a part of the level set {u = 0}, and then

∫_Ω |∇²u(x)|² dx = ∫_Ω |Δu(x)|² dx + ∫_∂Ω |∇u| ( n·∇²u n − Δu ) dS(x).

But under our smoothness assumptions (Exercise 1 below)

n·∇²u n − Δu = −Δ|_{∂Ω} u + H ∇u · n (9.24)

at ∂Ω, where Δ|_{∂Ω} is the Laplace operator over ∂Ω. If u = 0 on ∂Ω, then

Δ|_{∂Ω} u = 0,

and this yields our first formula.


For the second situation, simply note that, going back to formula (9.23), the second
term in the surface integral vanishes, and, since ∇u · ∇²u n = ½ ∇(|∇u|²) · n, we find

∫_Ω |∇²u(x)|² dx = ∫_Ω |Δu(x)|² dx + ∫_∂Ω ½ ∇( |∇u|² ) · n dS(x).

In both cases, a standard density argument yields the claimed formulas for functions
in the respective Sobolev spaces. ⨆

For f(x) ∈ L²(Ω), with Ω a C²-domain as in the previous lemma, we would
like to consider the second-order problem

Minimize in u(x) ∈ H²(Ω) ∩ H¹₀(Ω):   ∫_Ω ( |∇²u(x)|² + 2f(x) Δu(x) ) dx.

This is a standard second-order variational problem that does not require any special
consideration.
Proposition 9.4 The previous variational problem admits a unique minimizer .ũ ∈
H 2 (Ω) ∩ H01 (Ω).
Proof The functional is well-defined; it is coercive in its feasible set, and strictly
convex. Theorem 8.5 applies. Note that indeed it is a quadratic, strictly convex
functional on its admissible set. ⨆


The point is that, due to Lemma 9.4, our second-order variational problem above
is exactly the same, except for the constant term

∫_Ω |f(x)|² dx,

as

Minimize in u(x) ∈ H²(Ω) ∩ H¹₀(Ω):   ∫_Ω |Δu(x) + f(x)|² dx + ∫_∂Ω H(x) |∇u(x) · n(x)|² dS(x).

This variational problem is a bit special in that the functional incorporates a term
which is a surface integral, and we have not explored how to deal with such terms.
Some examples have been considered in connection with Neumann boundary conditions.
Yet, because of the equivalence of these two variational problems, as indicated, we
do know that the minimizer ũ in Proposition 9.4 must also be a minimizer for the
same variational problem in the second form. In particular, the surface integral

∫_∂Ω H(x) |∇ũ(x) · n(x)|² dS(x) (9.25)

is well-defined and finite. If we now look at the restricted variational problem

Minimize in u ∈ H²₀(Ω):   ∫_Ω |Δũ(x) + Δu(x) + f(x)|² dx,

because ũ + u, for a feasible u ∈ H²₀(Ω), does not change the boundary value of ũ or
of its normal derivative involved in the surface integral (9.25), we conclude that the
trivial function u ≡ 0 is a minimizer of this last problem. As such, we can examine
optimality conditions as derived in Sect. 8.8, to argue that

∫_Ω ( Δũ(x) + f(x) ) Δv(x) dx = 0 (9.26)

for every feasible v ∈ H²₀(Ω). Since we know that the classical problem

Δv = g in B,   v = 0 on ∂B

for arbitrary smooth functions g and any ball B has a unique smooth solution v, we
realize that (9.26) implies that indeed

Δũ(x) + f(x) = 0 (9.27)

for a.e. x ∈ Ω. This is essentially a consequence of Lemma 7.4.



Definition 9.3 We say that a function u ∈ H²(Ω) is a strong solution of Δu + f = 0,
for a function f ∈ L²(Ω), if

Δu(x) + f(x) = 0

for a.e. x ∈ Ω.
Note how different the definitions of weak and strong solutions for the same PDE
are. This definition can, of course, be generalized to many more families of PDEs.
Our arguments above (9.27) show that ũ is a strong solution of Δu + f = 0, and,
hence, it is also the weak solution of the problem.
Theorem 9.12 Let u ∈ H¹₀(Ω) be the unique minimizer of the problem

Minimize in v ∈ H¹₀(Ω):   ∫_Ω ( ½|∇v(x)|² − f(x)v(x) ) dx

where f ∈ L²(Ω), and Ω is a C²-domain. Then u ∈ H²(Ω) and Δu + f = 0 a.e.
in Ω.
Proof Under the smoothness assumptions on Ω, the function ũ ∈ H¹₀(Ω) above is a strong
solution, i.e.

Δũ(x) + f(x) = 0

for a.e. x ∈ Ω. Therefore, for an arbitrary v ∈ H¹₀(Ω),

∫_Ω ( Δũ(x) + f(x) ) v(x) dx = 0.

But the integration-by-parts formula applied to the first term yields

∫_Ω [ ∇ũ(x) · ∇v(x) − f(x)v(x) ] dx = 0.

The arbitrariness of v ∈ H¹₀(Ω) in this equality implies that ũ must be the weak
solution of our problem, which belongs, then, to H²(Ω). ⨆

This result is the basis of the regularity theory for PDEs: under suitable smoothness
assumptions on .Ω and the right-hand side .f ∈ L2 (Ω) of the PDE, the unique weak
solution .u ∈ H01 (Ω) turns out to belong to .H 2 (Ω). From here one can translate
further regularity on f and on .Ω into more regularity for the weak solution u. The
smoothness of .∂Ω is unavoidable.

9.8 Eigenvalues and Eigenfunctions

Possibly, the most important case and application of Theorems 6.4 and 6.3 (see also
Example 6.6) is that of the eigenvalues and eigenfunctions of the Laplace operator
under vanishing Dirichlet boundary conditions. Consider the operator, for a given
bounded domain .Ω ⊂ RN ,

T : L²(Ω) → L²(Ω),   U = Tu,   −ΔU = u in Ω,  U ∈ H¹₀(Ω).

In other words, for each .u ∈ L2 (Ω), U is the unique minimizer in .H01 (Ω) of the
functional
∫_Ω ( ½|∇U(x)|² − U(x)u(x) ) dx.

In particular, we have that

∫_Ω [ ∇U(x) · ∇V(x) − u(x)V(x) ] dx = 0 (9.28)

for every .V ∈ H01 (Ω).


The operator .T is compact and self-adjoint:
1. self-adjointness: for .u, v ∈ L2 (Ω) we easily realize, if we set .U = Tu, .V = Tv,
that

⟨u, V⟩ = ∫_Ω u(x)V(x) dx = ∫_Ω ∇U(x) · ∇V(x) dx = ∫_Ω U(x)v(x) dx = ⟨U, v⟩,

by (9.28);
2. compactness: this is a direct consequence of the fact that the operation

u ∈ L²(Ω) ↦ U ∈ H¹₀(Ω),   −ΔU = u in Ω,

is continuous, because by (9.28) for V = U itself,

||U||²_{H¹(Ω)} ≤ ||uU||_{L¹(Ω)} ≤ ||u||_{L²(Ω)} ||U||_{L²(Ω)},

and by Poincaré's inequality

||U||²_{H¹(Ω)} ≤ C ||u||_{L²(Ω)} ||U||_{H¹(Ω)};

the compactness of the injection .H01 (Ω) into .L2 (Ω) yields the compactness of
the operator .T.
As a direct consequence of those theorems recalled above, we conclude the
following classic result.
Theorem 9.13 There exists a sequence of positive numbers {λj}, λj → ∞, and a
sequence of functions

{uj} ⊂ H¹₀(Ω) ∩ C^∞(Ω)

that make up an orthonormal basis of L²(Ω), such that

−Δuj = λj uj in Ω. (9.29)

Proof We have already checked that the operator .T is compact and self-adjoint. It
is, in addition, positive which means that

<Tu, u>L2 (Ω) = ||Tu||2H 1 (Ω) ≥ 0


.

for every .u ∈ L2 (Ω). By Theorems 6.4 and 6.3, we can conclude the existence of a
sequence of non-vanishing eigenvalues .{1/λj }, which are positive because so is .T,
converging to zero, and a sequence of corresponding eigenfunctions .{uj }, i.e. (9.29)
holds. Finally, from the regularity results from the previous section, which may be
restricted to arbitrary compact, smooth subdomains of .Ω, utilized in a recursive way
through (9.29), we conclude the (interior) smoothness of each eigenfunction .uj . ⨅

For obvious reasons, the numbers .λj in (9.29) are called eigenvalues of the
Laplace operator (under vanishing Dirichlet boundary conditions) in .Ω, while the
corresponding .uj are the associated eigenfunctions.
It is interesting to realize, as we described in the Introduction to this chapter, that
solutions for the problem

. − Δ u = λu in Ω, u = 0 on ∂Ω,

correspond to critical functions for the functional


∫_Ω ( ½|∇u|² − ½ λ u² ) dx.

We now know that when λ ≤ 0, the corresponding integrand

(u, ∇u) ↦ ½|∇u|² − ½ λ u²

is convex, or even strictly convex if λ < 0, and hence the unique critical function
would be the unique minimizer, which is the trivial function. For λ > 0, this is no
longer true. However, the associated variational principle might not be well-posed in
the sense that the infimum might decrease to −∞. To recover a meaningful problem,
one needs to limit the size of the competing functions u:

Minimize in u ∈ H¹₀(Ω):   ½ ∫_Ω |∇u(x)|² dx

subject to

∫_Ω u(x)² dx = 1.

For such a variational problem, the set A of admissible functions would be

A = { u ∈ H¹₀(Ω) : ||u||_{L²(Ω)} = 1 }.

Because of the compact injection of H¹₀(Ω) into L²(Ω), the set A is weakly closed,
and hence there is a global minimizer u1 ∈ H¹₀(Ω) for such a problem.
Recall our discussion in Sect. 9.2. Indeed, Theorem 9.13 ensures that there is a
whole sequence of such values.
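For Ω = (0, 1) everything in Theorem 9.13 is explicit: the Dirichlet eigenpairs of −u'' = λu are λ_j = (jπ)² with u_j(x) = sin(jπx). A finite-difference sketch (the grid size is an illustrative choice) recovers the first few eigenvalues:

```python
import numpy as np

n = 400
h = 1.0 / n
# Second-difference matrix for -u'' on (0, 1) with Dirichlet boundary conditions.
A = (np.diag(2.0 * np.ones(n - 1))
     - np.diag(np.ones(n - 2), 1)
     - np.diag(np.ones(n - 2), -1)) / h ** 2

eigenvalues = np.sort(np.linalg.eigvalsh(A))
exact = np.array([(j * np.pi) ** 2 for j in (1, 2, 3)])
# The discrete spectrum approximates (j*pi)^2 with O(h^2) relative accuracy.
assert np.allclose(eigenvalues[:3], exact, rtol=1e-3)
```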

9.9 Duality for Sobolev Spaces

Even though duality is quite well-known and useful for Lebesgue spaces, it is not so
for Sobolev spaces. The main reason is that Sobolev spaces are the fundamental
function spaces for variational problems and PDEs involving weak derivatives, and
in this context duality does not play such a prominent role.
Possibly, the most used dual of a Sobolev space is the dual space of H¹₀(Ω),
which is designated by H⁻¹(Ω) to stress that functions in this space have one
"negative" weak derivative, in the sense that they can be identified with derivatives
of functions in L²(Ω) (which do not admit, in general, weak derivatives).

Proposition 9.5 Let T ∈ H⁻¹(Ω). Then there is F ∈ L²(Ω; R^N) such that,
formally, T = div F, that is to say,

⟨T, u⟩ = ∫_Ω F(x) · ∇u(x) dx (9.30)

for all u ∈ H¹₀(Ω).


As a matter of fact, the identification of the dual of a Hilbert space with itself leads
to a more explicit identification: for T ∈ H⁻¹(Ω), there is a unique v ∈ H¹₀(Ω)
such that

⟨T, u⟩ = ∫_Ω ∇v(x) · ∇u(x) dx, (9.31)

so that F = ∇v for a unique v ∈ H¹₀(Ω). Moreover,

||T||_{H⁻¹(Ω)} = ||v||_{H¹(Ω)}.

Formally, (9.31) can be written as

−Δv = T,   ||T||_{H⁻¹(Ω)} = ||v||_{H¹(Ω)}.

On the other hand, every field F ∈ L²(Ω; R^N) defines an element T of H⁻¹(Ω)
through the integral (9.30). Again, it can be represented by a unique v ∈ H¹₀(Ω)
such that

∫_Ω F(x) · ∇u(x) dx = ∫_Ω ∇v(x) · ∇u(x) dx (9.32)

for all u ∈ H¹₀(Ω). Condition (9.32) amounts to

div(F − ∇v) = 0 in Ω,   v ∈ H¹₀(Ω),

i.e. the function v ∈ H¹₀(Ω) representing the element div F ∈ H⁻¹(Ω) is the unique
minimizer of the quadratic problem

Minimize in u ∈ H¹₀(Ω):   ½ ∫_Ω |F(x) − ∇u(x)|² dx.

Even more generally, a given T ∈ H⁻¹(Ω) is identified with the unique solution
of the quadratic problem (recall Lemma 2.7)

Minimize in u ∈ H¹₀(Ω):   ½ ||u||² − ⟨T, u⟩.

The optimality condition for the minimizer v of this problem becomes exactly
(9.31). In practice, elements T of the dual space H⁻¹(Ω) are manipulated via their
unique representative v ∈ H¹₀(Ω) given by these conditions.
There are analogous identifications for elements of the dual spaces W^{−1,q}(Ω) of
W^{1,p}₀(Ω), for conjugate exponents p and q, although the identification cannot go as far
as with the Hilbert space H¹₀(Ω).
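In one dimension the representation (9.32) becomes completely explicit: gradients of H¹₀(0, 1) functions are exactly the zero-mean L² functions, so the representative v of div F satisfies v' = F − ∫₀¹ F. A numerical sketch of this projection (the grid and the choice of F are illustrative assumptions):

```python
import numpy as np

n = 1000
h = 1.0 / n
x = np.linspace(0.0, 1.0, n + 1)
F = np.exp(x)                                    # an illustrative field F in L^2(0, 1)

# Project F onto gradients of H_0^1(0, 1): v' = F - mean(F), integrated from v(0) = 0.
mean_F = np.sum(0.5 * (F[1:] + F[:-1]) * h)      # trapezoidal mean of F over (0, 1)
v_prime = F - mean_F
increments = 0.5 * (v_prime[1:] + v_prime[:-1]) * h
v = np.concatenate(([0.0], np.cumsum(increments)))

# Subtracting the mean is exactly what makes v satisfy both boundary conditions.
assert abs(v[-1]) < 1e-9
```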

9.10 Exercises

1. Let u(x) be a smooth function defined on a domain Ω ⊂ R^N limited by a
smooth (C²-) manifold ∂Ω such that

u|_∂Ω ≡ 0.

Let smooth functions (X(x), X_N(x)) be defined in a neighborhood U of any
point in ∂Ω in such a way that at every point of U, the gradients ∇Xᵢ(x) for i =
1, 2, ..., N − 1 are tangent to ∂Ω while ∇X_N(x) is the unit outer normal n(x)
to ∂Ω. Put

u(x) = U(X(x), X_N(x)).

(a) Argue that

∇²u = ∂²U/∂X_N² (X, X_N) ∇X_N ⊗ ∇X_N + ∂U/∂X_N (X, X_N) ∇²X_N

over U.
(b) Proceed by density to prove (9.24).
(b) Proceed by density to prove (9.24).
2. Show that the best constant in Poincaré's inequality in H¹₀(Ω) is the inverse
of the first positive eigenvalue of the operator −Δ by examining the constrained
variational problem

Minimize in u(x) ∈ H¹₀(Ω):   ∫_Ω |∇u(x)|² dx

under the integral constraint

∫_Ω u(x)² dx = positive constant.

3. Examine the various results in Sect. 9.6 and how they should be adapted for
dimensions N = 1 and N = 2.

4. Prove that if for a C¹-function f(x) we know that

0 < pf(x) ≤ xf'(x),   |x| ≥ R > 0,  p > 0,

then there are constants a > 0 and b such that

f(x) ≥ a|x|^p + b

for all real x.


5. Explore the eigenvalues and eigenfunctions of the Laplace operator under
Neumann boundary conditions following the same steps as in the case of
Dirichlet boundary conditions. The following fact (which we have not proved
in the text)

||u||_{L²(Ω)} ≤ C ||∇u||_{L²(Ω;R^N)}

is valid for functions in the space H¹(Ω) ∩ L²₀(Ω), where

L²₀(Ω) = { u ∈ L²(Ω) : ∫_Ω u(x) dx = 0 }.

This inequality is known as Wirtinger's inequality.


6. (a) Explore the constrained variational problem

Minimize in u ∈ H¹(Ω):   ∫_Ω |u(x)|² dx

subject to

∫_Ω |∇u(x)|² dx = 1.

Is there an optimal solution? Look for an easy counterexample in dimension 1.
(b) Consider the variational problem

Maximize in u ∈ H¹₀(Ω):   ∫_Ω |u(x)|² dx

subject to

∫_Ω |∇u(x)|² dx ≤ 1.

Is there an optimal solution?



7. For a real function

f(t) : [0, +∞) → [0, ∞)   (like f(t) = t),

use Theorem 9.13 to define f(Δ) through the spectrum of Δ.


8. Complete the induction argument of Lemma 9.1.
9. Keep track of the various constants in the recursive process associated with
Theorem 9.2, and conclude that it is independent of the domain Ω.
10. Show how in the critical case p = N , the recursive process associated with the
proof of Theorem 9.2 can be pushed to infinity.
11. Suppose u is a smooth minimizer for the Dirichlet integral

½ ∫_Ω |∇u(x)|² dx,   Ω ⊂ R^N,

under some (non-null) Dirichlet boundary conditions. Use inner variations of
the form

U_ε(x) = u(x + εΦ(x)),   Φ(x) : Ω → R^N,  Φ|_∂Ω = 0,

to deduce the corresponding optimality conditions.


12. The basic strategy of the direct method can be used in more general situations.
Consider the following one. It is a matter of
Minimize in F:   E(F) = ∫_Ω [ ½ F(x)ᵀ A(x) F(x) + A(x) · F(x) ] dx

where F(x) : Ω ⊂ R^N → R^N, and

div F = 0 in Ω,   F · n = 0 on ∂Ω.

(a) Define in a suitable way the space where a weak null-divergence can be
enforced, together with the normal component on the boundary.
(b) Apply the direct method to the proposed problem.
(c) Derive the form of optimality conditions in this case.
Appendix A
Hints and Solutions to Exercises

A.1 Chapter 1

1. Computations are not hard if performed with some care.


2. No comment.
3. (a) If there is no constraint to be respected, the minimum will be achieved by
minimizing the integrand pointwise for each x ∈ Ω: the optimal function
u(x) will be the pointwise minimizer,

u(x) = arg min_u F(x, u),

and optimality conditions will be given by

∂F/∂u (x, u) = 0 for x ∈ Ω.

As soon as there are global conditions to be preserved, this process can
hardly yield the optimal solution. Even a condition like

∫_Ω u(x) dx = u₀

spoils the above procedure, as is shown next.


(b) A bit of reflection leads to the minimizer χ_A, where the set A is described
by the condition on a constant M so that

A = { x ∈ Ω : f(x) ≤ M },   |A| = s|Ω|.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 325
P. Pedregal, Functional Analysis, Sobolev Spaces, and Calculus of Variations,
La Matematica per il 3+2 157, https://doi.org/10.1007/978-3-031-49246-4

4. If

h(x) : [0, L] → R₊,   h(0) = h(L) = 0

is an arbitrary path joining the departure point (0, 0) and the arrival point (L, 0),
its cost will be given by the integral

∫₀^L e^{−ah(x)} √(1 + h'(x)²) dx

where

√(1 + h'(x)²) dx

is the element of arc-length.


5. We have already used earlier in the chapter that

T = ∫ dt = ∫ ds/v.

In this particular situation we would have

T = ∫₀^L n(x, y(x)) √(1 + y'(x)²) dx

under the end-point conditions

y(0) = y₀,   y(L) = y_L.

6. It is just a matter of bearing in mind that

y'(x) x'(y) = 1,   dx = x'(y) dy

to find that the resistance functional can be written in the form

∫_H^h  y / ( 1 + x'(y)² ) dy.

7. From elementary courses in Calculus, it is known that the surface of revolution
generated by the graph of a function u(x) between x = 0 and x = L is given by the integral

S = 2π ∫₀^L u(x) √(1 + u'(x)²) dx.

8. A curve σ(t) with image contained in the cylinder of equation

x² + y² = 1

can obviously be parametrized in the form

σ(t) = (cos t, sin t, z(t))

with t ∈ [0, θ], θ ≤ π, if

P ↦ t = 0, z(0) = 0,   Q ↦ t = θ, z(θ) = L ∈ R,

if the points P and Q are, respectively, the initial and final points, and L can be
positive or negative. The functional providing the length of such a curve is then

∫₀^θ √(1 + z'(t)²) dt.
0

For the sphere, one can use spherical coordinates and write

σ(t) = (sin z(t) cos t, sin z(t) sin t, cos z(t)),   z(0) = π/2, z(θ) = φ ∈ [0, π],

if, without loss of generality,

(0, π/2), (θ, φ),   0 ≤ θ ≤ π,

are the spherical coordinates of the initial and final points.


9. It is easy to write the functional

∫_Ω √(1 + u_x(x, y)²) √(1 + u_y(x, y)²) dx dy.

Clearly a function for which u_x = u_y = 0, i.e. a constant function, is a
minimizer of this functional. It would be hard to give another one.
10. It is clear that all possible parametrizations of the path u(t) are of the form

v(s) = u(γ(s)),   t = γ(s),

with

γ(s) : [0, 1] → [0, 1],   γ(0) = 0, γ(1) = 1.

The variational problem would be

Minimize in γ:   ∫₀¹ F( u(γ(s)), γ'(s) u'(γ(s)) ) ds.

If γ is assumed bijective and smooth, then a change of variables leads to the
problem of minimizing in γ

∫₀¹ F( u(t), (1/γ'(t)) u'(t) ) γ'(t) dt.

Here we have replaced γ by its inverse γ⁻¹. We therefore see that those
integrands F(u, v) that are homogeneous of degree 1 in v are parametrization-
independent. A fundamental example is

F(u, v) = |v|,

whose associated functional is the length of the curve.

A.2 Chapter 2

1. That these two sets are vector spaces is a direct consequence of the triangle
inequality. As in other initial examples in the chapter, L^∞(Ω) is seen to be
complete. The case of L¹(Ω) is included in the proof of Proposition 2.2. It is a
good idea to review that proof for the case p = 1.
2. This is clear if we notice that every continuous function g in Ω always belongs
to L^q(Ω), even if q = ∞. More directly, simply use Hölder's inequality for
q < ∞.
3. This kind of example is easy to design once the underlying difficulty is
understood. For instance, take

uj(x) = j ∫₀^x χ_{(0,1/j)}(y) dy,   u'j(x) = j χ_{(0,1/j)}(x),   uj(0) = 0,

where χ_{(0,1/j)} is the characteristic function of the interval (0, 1/j) in the unit
interval (0, 1). It is easy to check that {uj} is uniformly bounded in W^{1,1}(0, 1)
but it is not equicontinuous because the limit of any subsequence cannot be
continuous.
4. (a) This is elementary.
(b) By using the double-angle formula, it is not difficult to conclude that

∫_a^b uj²(x) dx → (b − a)/2.

In terms of weak convergence, we see that

uj ⇀ 0,   uj² ⇀ ½,

which is a clear indication of the unexpected behavior of weak convergence
with respect to non-linearities.
5. Under the hypotheses assumed on {fj}, we know that there is some subsequence
jk and f̄ ∈ L^p(J) with

f_{jk} ⇀ f̄ in L^p(J).

But the convergences in arbitrary subintervals imply that

∫_a^b f̄(x) dx = ∫_a^b f(x) dx.

Since a and b are arbitrary, we can conclude that f̄ ≡ f as elements of L^p(J).
Since the convergence in subintervals takes place for the full sequence (no
subsequences), the same is correct at the level of weak convergence.
6. Let {x'j} be a Cauchy sequence in E'. Proceed in two steps:
(a) Show that the limit of {x'j(x)} exists for every individual x ∈ E. Define x'(x)
as the limit; x' is linear by definition.
(b) Show that the set of numbers {||x'j||} is bounded, and conclude that x' ∈ E'.
7. The proof of Proposition 2.7 can be finished after Exercise 5 above.
8. The vector situation in Proposition 2.8 is similar to the scalar case. It is a good
way to mature these ideas.
9. If one admits that

x = Σ_{j=1}^∞ aj xj ∈ H

because

||x||² ≤ Σ_j aj² < ∞,

then the other conclusions are a consequence of the first part of the proposition.
Based on the summability of the series of the aj's, it is easy to check that the
sequence of partial sums

{ Σ_{j=1}^N aj xj }_N

is convergent.
10. (a) The same example as in Exercise 3 is valid to show that E₁ is not complete.
(b) It is not difficult to realize that ||T_g|| = ||g||_∞. In fact, if we regard E₁ as a
(dense) subspace of L¹(0, 1), then it is clear that ||T_g|| = ||g||_∞.
(c) It suffices to take f = g/|g| on the subset of (0, 1) where g does not vanish.
(d) It is obvious that δ_{1/2} is a linear, continuous functional on E_∞. However, it
does not belong to E'₁ because functions can take on arbitrarily large values
off {1/2} without changing their value at this point.
(e) The subspace H is the zero-set of the linear, continuous functional T_g for
g ≡ 1. Since, by the previous items, T₁ belongs both to E'₁ and E'_∞, we
have our conclusion.
11. (a) This is a direct consequence of the triangle inequality and works as in
any vector space; namely

||x|| ≤ ||x − y|| + ||y||,   ||y|| ≤ ||x − y|| + ||x||,

for any pair of vectors x, y, and so

| ||x|| − ||y|| | ≤ ||x − y||.

(b) Let

z = Σ_i z_i e_i,   {e_i : i} ⊂ R^n the canonical basis.

Then, by linearity,

Tz = Σ_i z_i Te_i,

and

||Tz|| ≤ Σ_i |z_i| ||Te_i|| ≤ ||z||_∞ Σ_i ||Te_i||.

(c) This is immediate: for every k, j larger than some j₀, we will have

||x_j − x_k|| ≤ 1

if {x_j} is a Cauchy sequence. In particular,

||x_j|| ≤ ||x_j − x_{j₀}|| + ||x_{j₀}|| ≤ 1 + ||x_{j₀}||,

for every j larger than j₀.


(d) It is easy to realize, if {e_i} is the canonical basis of R^n, that if

x = Σ_i x_i e_i,

then

||x|| ≤ Σ_i |x_i| ||e_i|| ≤ ( Σ_i ||e_i|| ) ||x||_∞ = M ||x||_∞.

12. (a) It is clear that if x ∈ l^p for some positive p, i.e. the series

Σ_j |x_j|^p < ∞,   x = (x₁, x₂, ...),

converges, then all the terms, except a finite number, must be less than 1, and hence

|x_j|^q < |x_j|^p,  and x ∈ l^q.

Moreover, there is always a finite index set K ⊂ N such that

||x||_∞ = |x_k|,   |x_j| ≤ |x_k|,

for k ∈ K, and all j. Hence

||x||_q / ||x||_∞ = ( Σ_j |x_j|^q / |x_k|^q )^{1/q}

shares its limit with

#(K)^{1/q} → 1, as q → ∞.

(b) This is similar to the previous item.


13. (a) It is clear that

||x||_∞ ≤ ||x||₁.

However, the elements

x_k ≡ Σ_{i∈I_k} e_i

for a sequence of finite sets I_k with cardinality k → ∞, are such that

||x_k||₁ = k,   ||x_k||_∞ = 1.

(b) The index set I is infinite (because E has infinite dimension); select a
copy of N within I. Define a linear functional T over E by defining it on
the basis {e_i} and putting

T(e_n) = n||e_n||, n ∈ N,   T(e_i) = 0, i ∈ I \ N.

This linear functional cannot be continuous.

14. The given function ||·|| is a norm, because the only solution of the problem

u(t) + u'(t) = 0 in [0, 1],   u(0) = 0,

is the trivial function. The other conditions are inherited from the similar
properties of the absolute value. In particular,

||u|| ≤ ||u||_∞ + ||u'||_∞.

For the reverse inequality, bear in mind that

|u(t)| ≤ ||u|| + ∫₀^t |u(s)| ds,   |u'(t)| ≤ ||u|| + ∫₀^t |u'(s)| ds,

and use Gronwall's inequality.


15. (a) implies (b) is elementary. For the reverse implication, use a sequence {ρj}
in L¹(R), as in Sect. 2.29, with the property

T(ρj) ∗ v = T(ρj ∗ v) → T(v)

for all v. {ρj} is called an approximation of the identity in L¹(R). On the other
hand, show that some subsequence of {T(ρj)} converges weakly in L^p(R) to
some w.

16. Note that for fixed x ∈ H, the convergence of the series

Σ_j ⟨x_j, x⟩ x_j

implies that

⟨x_j, x⟩ → 0,

for all x ∈ H.
17. All conditions of an inner product are fine, except for the last one: note that if
a non-null function f(x) is orthogonal (in L²(Ω)) to the full set {w_i}, then the
inner product given by the double integral would vanish.
18. Integrate by parts in the product

∫_{−1}^1 L_j(t) L_k(t) dt,   j > k,

as many times as necessary, charging derivatives on L_k and checking that
contributions at the end-points vanish, until you see that the integral vanishes.
(a) They are asking about the projection of t^3 onto the subspace generated by {L0, L1, L2}, namely

π t^3 = Σ_{i=0}^{2} (<t^3, Li> / <Li, Li>) Li(t).

More specifically, the minimum sought is

∫_{−1}^{1} |t^3 − π t^3|^2 dt.

(b) The maximum is attained for

p = (t^3 − π t^3) / ||t^3 − π t^3||.
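As a numerical cross-check (not part of the book's argument), NumPy's Legendre module can carry out this projection. The classical Legendre polynomials Pk differ from the monic Li only by scalar factors, so the projected subspace, and hence the minimum, is the same:

```python
import numpy as np
from numpy.polynomial import legendre

# Legendre expansion of t^3: t^3 = (3/5) P1 + (2/5) P3
c = legendre.poly2leg([0.0, 0.0, 0.0, 1.0])

# projection onto span{P0, P1, P2}: drop the P3 component
proj = c.copy()
proj[3] = 0.0   # the projection is (3/5) P1, i.e. (3/5) t

# minimum: ||t^3 - pi t^3||^2 = (2/5)^2 ||P3||^2, with ||Pk||^2 = 2/(2k+1)
min_value = c[3] ** 2 * 2.0 / 7.0
print(c, min_value)
```

The projection turns out to be π t^3 = (3/5) t, and the minimum squared distance is 8/175.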

19. (a) It suffices to realize that the leading coefficient of all the Pj ’s is unity,
as a consequence of the Gram-Schmidt orthogonalization process, and so
tPj −1 (t) − Pj (t) has, at most, degree j − 1.
(b) One must show that

<Pj(t) − tPj−1(t), Pk(t)> = 0,   k ≤ j − 3.

Indeed,

<tPj−1(t), Pk(t)> = <Pj−1(t), tPk(t)> = 0

if k ≤ j − 3. This implies that

Pj(t) − tPj−1(t) = aj Pj−1(t) + bj Pj−2(t).

The inner product with Pj −2 , after the use of the previous item, yields the
sign of bj .
(c) Argue first that Pj ought to have at least one real root in J because
<Pj , 1> = 0. If we let

t1 < t2 < · · · < tq

be the roots of Pj in J at which Pj changes sign, and set

P(t) = Π_{i=1}^{q} (t − ti),

argue that <Pj , P > cannot vanish, which is absurd if q < j .


20. It is a simple exercise to check that the complex system of exponentials is orthonormal. To see that it is a basis for H, separate into real and imaginary parts.
21. Proceed in various steps:
(a) Show first that {ψm,n } is orthonormal by considering the two cases (m1 , n),
(m2 , n), first; and then (m1 , n1 ), (m2 , n2 ).
(b) Argue that every function in L2 (R) can be represented as a linear combina-
tion of the Haar system of wavelets in the following way.
(i) Define the subspace Hj as the set of square-integrable functions which
are constant on dyadic intervals of length 2−j . Note that

. · · · ⊂ H−2 ⊂ H−1 ⊂ H0 ⊂ H1 ⊂ H2 ⊂ . . . ,

and that

. ∩j Hj = {0}.

(ii) Show that H = ∪j Hj is dense in L2(R). Characteristic functions of the form f(x) = χ[a,b](x) can be approximated by elements of H if we put

a = k/2^n − a1,   b = l/2^n + b1,   a1, b1 < 2^{−n},
g(x) = χ[k/2^n, l/2^n](x) ∈ H,   ||f − g||² ≤ 2^{1−n}.

Since H is a subspace, linear combinations of characteristic functions (simple or step functions) belong to H too. Since simple functions are dense in L2(R), so is H.
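The dyadic approximation step can be illustrated numerically; the values a, b, n below are arbitrary choices made for this sketch:

```python
import numpy as np

# Approximate f = chi_[a,b] by the dyadic characteristic function
# g = chi_[k/2^n, l/2^n], with k/2^n the first dyadic point above a
# and l/2^n the last one below b, as in the hint.
a, b, n = 0.3, 0.8, 10
k = int(np.ceil(a * 2**n))
l = int(np.floor(b * 2**n))

# ||f - g||^2 in L2 equals the measure of the symmetric difference
err2 = (k / 2**n - a) + (b - l / 2**n)
print(err2)
```

Each of the two gaps has length below 2^{−n}, so the squared error stays below 2^{1−n}.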
22. Though a complete and clear answer to this problem would require further
knowledge about wavelet analysis, take the previous exercise as a model, and
write conditions on ψ in terms of the density in L2 (R) of a certain subspace
associated with ψ.
23. Such complement is spanned by functions u(t) such that

∫_J (u(t)v(t) + u'(t)v'(t)) dt = 0,   v ∈ H01(J).

By integrating by parts on the second term, we deduce that u = u''.


24. The same procedure shown in (2.27) is valid in more generality for a linear,
self-adjoint, differential operator L. Indeed, if

Luk + ωk² uk = 0 in (0, 1),   uk(0) = uk(1) = 0,

for a certain sequence of (pair-wise different) numbers ωk so that uk is not the trivial function, then

ωk² ∫_0^1 uk(x)uj(x) dx = − ∫_0^1 Luk(x) uj(x) dx
                        = − ∫_0^1 uk(x) Luj(x) dx
                        = ωj² ∫_0^1 uk(x)uj(x) dx.

If ωk ≠ ωj, we conclude that uk and uj must be orthogonal. The self-adjointness of L (see Chap. 5 below) means, through integration by parts, precisely that

∫_0^1 Lu(x)v(x) dx = ∫_0^1 u(x)Lv(x) dx,

under the end-point conditions

u(0) = u(1) = v(0) = v(1) = 0.

Operators of the form

Lu(x) = −(p(x)u'(x))' + q(x)u(x)

for arbitrary functions p(x) > 0 and q(x) fall within the scope of these computations (Sturm-Liouville eigenvalue problems).
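For the simplest case L = −d²/dx² (that is, p ≡ 1, q ≡ 0), the eigenfunctions are the sines uk(x) = sin(kπx) with ωk = kπ, and the orthogonality just derived can be checked by quadrature:

```python
import numpy as np

# Eigenfunctions of -u'' = omega^2 u on (0,1) with u(0) = u(1) = 0
x = np.linspace(0.0, 1.0, 100001)
h = x[1] - x[0]

def inner(f, g):
    # trapezoidal quadrature of the L2(0,1) inner product
    v = f * g
    return float(h * (v[:-1] + v[1:]).sum() / 2.0)

u2 = np.sin(2 * np.pi * x)   # omega = 2 pi
u3 = np.sin(3 * np.pi * x)   # omega = 3 pi
print(inner(u2, u3), inner(u2, u2))
```

Distinct eigenvalues give inner product 0, while ||uk||² = 1/2 for every k.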
25. For the triangle inequality, note that the function φ(r) = r/(1 + r) is strictly
increasing for positive r. Then

φ(|f − g|) ≤ φ(|f − h| + |h − g|)
           = (|f − h| + |h − g|) / (1 + |f − h| + |h − g|)
           ≤ φ(|f − h|) + φ(|h − g|).

For the completeness, reduce the case to L1 (J ).


26. (a) This is easy to argue because for positive numbers a and b, we always have

    √(a + b) ≤ √a + √b.

(b) However, || · ||_{1/2} is not a norm because it does not respect the triangle inequality. To check this, consider the situation for χ, the characteristic function of the interval (0, 1/2) in (0, 1), and 2 − χ.
(c) As in the first item above, it is easy to check that d(f, g) is indeed a distance. The completeness reduces to the same issue in Lp(0, 1) for p ≥ 1.
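For item (b), the suggested pair can be checked by direct computation. With ||h||_{1/2} = (∫_0^1 |h|^{1/2} dx)², the triangle inequality fails:

```python
import math

# f = chi_(0,1/2) and g = 2 - f on (0,1); note f + g = 2 identically
nf = (1.0 / 2.0) ** 2                          # int |f|^{1/2} = 1/2
ng = (1.0 / 2.0 + math.sqrt(2.0) / 2.0) ** 2   # g = 1 on (0,1/2), 2 on (1/2,1)
nfg = math.sqrt(2.0) ** 2                      # int |f+g|^{1/2} = sqrt(2)
print(nfg, nf + ng)                            # ||f+g|| exceeds ||f|| + ||g||
```

Here ||f + g||_{1/2} = 2 while ||f||_{1/2} + ||g||_{1/2} ≈ 1.707, so subadditivity is violated.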
27. Define

<x, y> = (1/4)(||x + y||² − ||x − y||²)
       = (1/2)||x||² + (1/2)||y||² − (1/2)||x − y||²,
and prove that it is an inner-product whose associated norm is || · ||, through the
following steps:
(a) Check through the parallelogram identity that

2<x, y> = <x + z, y> + <x − z, y>.

(b) By writing

<x + z, y> = (1/2)(<x + z + w, y> + <x + z − w, y>),

and choosing w in an appropriate way, conclude that

<x + z, y> = <x, y> + <z, y>.

(c) Show that

<2x, y> = 2<x, y>,

and then, writing an arbitrary λ in base 2, through the previous item, conclude that

<λx, y> = λ<x, y>.
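In a space whose norm does come from an inner product, the polarization formula above recovers it; a quick sanity check in R^5 (random vectors with a fixed seed):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
y = rng.standard_normal(5)
norm = np.linalg.norm

# parallelogram identity, the key hypothesis of the exercise
lhs = norm(x + y) ** 2 + norm(x - y) ** 2
rhs = 2 * norm(x) ** 2 + 2 * norm(y) ** 2

# polarization recovers the euclidean inner product
pol = (norm(x + y) ** 2 - norm(x - y) ** 2) / 4.0
print(lhs - rhs, pol - x @ y)
```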

28. If we rely on the similar result for the one-dimensional situation, one can deduce the same result for the closure of the subspace of L2([−π, π]^N; C) made up of linear combinations of the form

Σ_l Π_k f_{kl}(xk),   f_{kl} ∈ L2([−π, π]; C).

It is a standard separation-of-variables process.


29. The projection theorem Proposition 2.11 can be applied directly to this situation
under the following elements:
(a) H is H01(0, 1), while we take x to be the trivial function. The norm in H is precisely

||u||² = ∫_0^1 u'(x)² dx.

(b) K is the subset of H given by

K = {u ∈ H01(0, 1) : u(x) ≤ φ(x)},

which is clearly convex.

One can then conclude the existence of a unique function ū ∈ K realizing the minimum of the problem. This function is characterized by the condition

∫_0^1 ū'(x)[u(x) − ū(x)]' dx ≥ 0

for every u ∈ K.
30. The upper bounds given in the statement ensure that the functional E is well-
defined in H01 (0, 1). According to Definition 2.13, we need to calculate the
derivative of the section

t ↦ E(u + tv) = ∫_0^1 [ψ(u'(x) + tv'(x)) + φ(u(x) + tv(x))] dx,   v ∈ H01(0, 1),
0

which is explicitly given by the formula

<E'(u), v> = ∫_0^1 [ψ'(u'(x))v'(x) + φ'(u(x))v(x)] dx.

Lemma 2.7 informs us that the derivative E'(u) will be the unique minimizer of the problem

v ∈ H01(0, 1) ↦ ∫_0^1 [(1/2)v'(x)² − ψ'(u'(x))v'(x) − φ'(u(x))v(x)] dx.

31. This is a typical situation for the application of Theorem 2.1 of Chap. 2, in a
similar way as in Sect. 2.10 of the same chapter.

A.3 Chapter 3

1. This is immediate from the statement of Corollary 3.2.


2. If we let a, b, and c be those three numbers, respectively, then it is immediate to check that a ≤ b ≤ c. For this last inequality, note that there is always x' ∈ E' with

<x', xj> = λj ||xj||

for every particular choice of numbers λj ∈ [−1, 1]. In particular, there is always one such x̂' with

<x̂', Σ_j λj xj> = ||Σ_j λj xj||

if λj, with |λj| = 1, are the numbers furnishing the sup for a.


3. This is elementary if argued through sequences.
4. This is a fine exercise in Multivariate Calculus.
5. (a) This is an interesting exercise in manipulating the convexity inequality to arrive at the sought inequality.
(b) Check that the previous proof in a finite-dimensional scenario is valid in
an infinite-dimensional situation too. The application to Proposition 3.4 is
immediate.
6. Multiply (3.5) by an arbitrary v ∈ H01 (J ), and perform an integration by parts.
7. (a) This is standard. Check first that <u, v> is a seminorm, and then that the
associated norm only vanishes at the trivial function.

(b) The corresponding Lax-Milgram lemma is concerned with the minimization problem for the functional

(1/2) ∫_0^1 t u'(t)² dt

over H. It is enlightening to check that there cannot be a minimizer, since if there were one, it would be a solution of the Euler-Lagrange problem

−[t u'(t)]' = 0 in (0, 1),   u(0) = u(1) = 0.

It is elementary to check that there is no solution for this problem. The whole point is that the given bilinear form is not coercive, as the coercivity constant t degenerates as t moves closer to the origin.
8. The linearity of such a mapping is clear. For the continuity, use v = u in the
statement of the Lax-Milgram lemma, and conclude by the coercivity of the
bilinear form, and the Cauchy-Schwarz inequality for the inner product.
9. If the bilinear form A(u, v) is symmetric, continuous, and coercive, it defines an inner product in H which is equivalent to the one we already have. Apply the orthogonal projection Theorem 2.11 to the space

(H, <·, ·>_A),

as well as the Riesz-Fréchet representation theorem Proposition 2.14. The minimization property is nothing but the corresponding minimization of the distance with this new inner product (Theorem 2.11 again).
10. Apply Stampacchia's theorem for the choice K = H, which is a vector space, not just a convex set, to see that the inequality in Stampacchia's theorem becomes an equality.
11. If u'_j ⇀ u', and the vj are the corresponding solutions, check that there is some feasible v with v'_j ⇀ v', and then both limits u and v are related through the same differential problem. Since in this situation vj → v in L∞, the only requirement on the integrand F is its continuity.
12. (a) Through a limit process, one can assume that the probability measure μ is finitely supported,

μ = Σ_i μi δ_{xi},   μi > 0,   Σ_i μi = 1,   xi ∈ X.

In this case, the vector

(∫_X vj(x) dμ(x))_j = (Σ_i μi vj(xi))_j

belongs to D, directly from convexity.



(b) This is easily seen through the approximation procedure of the previous
item, and the definition of convexity.
(c) It suffices to use the convexity of the function u(r) = − log r.
13. In general terms, convexity of the functional I can be written in the form

∫_{J×J} f(t1u1(x) + t2u2(x), t1u1(y) + t2u2(y)) dx dy ≤
    t1 ∫_{J×J} f(u1(x), u1(y)) dx dy + t2 ∫_{J×J} f(u2(x), u2(y)) dx dy

for arbitrary t1, t2 ≥ 0, t1 + t2 = 1, and functions u1, u2. It is not easy to find a simple necessary and sufficient condition for this inequality to hold in general.
For instance, if functions u1 and u2 are taken to be constant, then a necessary
condition is that the function of one variable

u ↦ f(u, u)

must be convex, but this is far from being sufficient because it does not provide
any information on f off the diagonal. A bit of experimentation with simple
functions can help us realize that there is no simple answer.
14. (a) It suffices to take W(x, y, ·) convex for every pair (x, y) ∈ (0, 1)², and 0, increasing.
(b) If there is an explicit dependence of W (x, y, u, v) on the variable u, then
there is no way to derive any information of the behavior on the integrals
∫_0^1 W(x, y, uj(x), uj(y)) dy

when uj weakly converges to u. Note how this is not an issue in the


previous item.
15. One weak lower semicontinuity statement is the following. Assume each fi(x, u) is convex in u, and F is non-decreasing in the following sense:

F(x) ≤ F(y) when xi ≤ yi for all i.

Then the functional I is weakly lower semicontinuous. This can be generalized into a more general statement as follows. Suppose the function (−1)^i fi(x, u) is convex for each i, and

F(x) ≤ F(y) when (−1)^i xi ≤ (−1)^i yi.

Then the functional I is weakly lower semicontinuous.



16. (a) This is elementary since

N sN(f)² = Σ_{i=1}^{N} (∫_Ω f(x)ui(x) dx)²
         = Σ_{i=1}^{N} <f, ui>².

(b) From the last identity and the classical Cauchy-Schwarz inequality, it is clear that

sN(f) ≤ ||f||,

and so each sN is continuous. Moreover, because

f = Σ_i <f, ui> ui,

we can deduce that

||f||² = Σ_i <f, ui>² = lim_{N→∞} N sN(f)²,

and this limit is monotone. This suffices to finish.


17. This is the final part of Exercise 30 of Chap. 2. The derivative E'(u) can be given the explicit form

E'(u)(x) = ∫_0^x [ψ'(u'(y)) − (x − y)φ'(u(y))] dy
         − x ∫_0^1 [ψ'(u'(y)) − (1 − y)φ'(u(y))] dy.

A.4 Chapter 4

1. This is the infinite-dimensional analogue of Exercise 4 of Chap. 3. The argu-


ment is formally the same as the finite-dimensional case.
2. Write

φ''_a(u) = 12[(u + a/4)² + (a²/16)(a² − 1)].

If a² − 1 ≥ 0, φa is convex, and the unique minimizer of the problem is the linear function u(t) = αt. If a² − 1 < 0, there is a part of the graph of φa that is not convex, and so existence of minimizers is compromised. Take a = 1/2, and explore the corresponding variational problem, to check that there is a small interval J in such a way that if α ∉ J, the linear function still is the unique minimizer, but if α ∈ J, then there are infinitely many minimizers, and the linear function is not a minimizer any longer.
3. Check that the functional admits minimizers because the dependence with respect to u' is strictly convex. Minimizers are solutions of the differential equation u'' = u², which are convex functions of one variable.
4. In the proof of Theorem 4.2, there is a step where a certain difference of
integrals should be shown to converge to zero (indicated as a proposed exercise:
this one). Use the fact that for a.e. a ∈ J , the integrand F (x, u, U) is bounded
from above by some constant when triplets (x, u, U) belong to a box

[a − δ/2, a + δ/2] × Ju × JU

where Ju and JU are specific compact sets for the variables u and U, respectively. Use the dominated convergence theorem to conclude that such difference of integrals tends to zero.
5. (a) This is an elementary Multivariate Calculus exercise.
(b) This has already been shown in the final section of the chapter.
6. Missing technical details refer to the justification of differentiation under the
integral sign. This is correct under the bound assumed on the integrand F and
its partial derivatives.
7. It is not difficult to conclude that if m(α) is the value of the infimum, then m(α) = 1 for α = 0 and α ≥ 1, while m(α) = 0 otherwise.
8. Consider perturbations in the subspace

H1_per(J; Rn) = {v ∈ H1(J; Rn) : v(x0) − v(x1) = 0},

to conclude that (4.20) is valid for all v in this subspace. Proceed in two
successive steps:
(a) take first, in particular, v ∈ H01(J; Rn) to conclude (4.21) as in Theorem 4.4;
(b) for v ∈ H1_per(J; Rn) such that the constant vector

v0 = v(x0) = v(x1)

is free, and bearing in mind (4.21), go back to (4.20) and, by integrating by parts, conclude that this time the end-point contributions amount to

FU(x, u(x), u'(x))|_{x0}^{x1} · v0 = 0.
0

The arbitrariness of v0 leads to

FU(x, u(x), u'(x))|_{x0}^{x1} = 0.

This final condition, together with differential system (4.21) (and the periodicity
conditions for u), make up the optimality problem.
9. Mimicking the process around the proof of Theorem 4.21, one finds that

Fu(x, u(x), u'(x), u''(x)) − (d/dx) FU(x, u(x), u'(x), u''(x))
    + (d²/dx²) FZ(x, u(x), u'(x), u''(x)) = 0   a.e. x in J,

if F(x, u, U, Z) is the integrand for the second-order functional

∫_J F(x, u(x), u'(x), u''(x)) dx

under end-point conditions both for u and u'.


10. Perform an integration by parts on the second term in (4.20) to find that

∫_J [Fu(x, u(x), u'(x)) − (d/dx) FU(x, u(x), u'(x))] · v(x) dx = 0

because contributions from end-points vanish. The arbitrariness of v, other than end-point restrictions, implies (4.21).
11. (a) Corollary 4.1 can be applied to deduce the existence of solutions for each
fixed ε > 0.
(b) The Euler-Lagrange system becomes

−εu''1 + (u1u2 − 1)u2 = 0,   −εu''2 + (u1u2 − 1)u1 = 0,

which is a singularly-perturbed, second-order, non-linear differential system.
(c) The previous system is impossible to solve explicitly. However, its asymptotic limit as ε becomes smaller and smaller can be figured out. In fact, in the complement of a smaller and smaller symmetric interval with respect to t = 1/2, the solution will essentially follow the curve u1u2 − 1 = 0. In that central subinterval, due to symmetry, the curve will momentarily follow the principal diagonal of the plane.
12. Use the same ideas as in Exercise 8 above.
13. Neumann boundary conditions cannot be imposed directly in a first-order system because competing functions will, typically, have weak derivatives in a certain Lebesgue space, and these cannot admit individual point values. Instead, consider the problem for the modified functional

∫_J ((1/2)u'(x)² + F(x, u(x))) dx − u'_0 u(0) + u'_1 u(1)

minimized among all feasible u's in the appropriate space without any end-point conditions.
14. This is the typical situation with a saw-tooth sequence of functions where only slopes ±1 are used, and they alternate on smaller and smaller scales, while preserving the given end-point conditions. All these functions belong to A, and yet their weak limit, which is the trivial function, does not belong to A because the integral condition would yield 1 instead of √2.
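A sketch of the saw-tooth mechanism (zero boundary values are used here for simplicity; the exercise's own end-point data would only shift the picture): the functions converge uniformly to 0, while the length-type integral stays at √2 instead of dropping to 1.

```python
import numpy as np

def sawtooth(x, j):
    # distance to the nearest grid point k/j: slopes +-1 a.e., amplitude 1/(2j)
    return np.abs(x * j - np.round(x * j)) / j

x = np.linspace(0.0, 1.0, 200001)
for j in (10, 100):
    u = sawtooth(x, j)
    du = np.diff(u) / np.diff(x)
    length = float(np.sum(np.sqrt(1.0 + du ** 2) * np.diff(x)))
    print(j, u.max(), length)   # sup-norm -> 0, integral stays near sqrt(2)
```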
15. Checking that (4.27) is a solution of (4.26) is an interesting Calculus exercise.
On the other hand, it is elementary to argue that problem (4.26) can admit at
most one solution.
16. This exercise involves an interesting manipulation to find a way to perform a first integration in the Euler-Lagrange system. The statement itself yields the clue.
17. (a) In order to apply our main existence theorem for such a variational problem,
the main point to discuss is the coercivity. Note that the strict convexity of
f in the derivative is very easily checked. If we put f = f (y),
0 ≤ ((1/2)|z| − |f|)² = (1/4)|z|² − |z||f| + |f|²,

and

|z||f| ≤ (1/4)|z|² + |f|².

In this way, since

(1/2)|z − f|² ≥ (1/2)|z|² − |z||f|,

we can conclude that

(1/2)|z − f|² ≥ (1/4)|z|² − |f(y)|².    (A.1)
This is not exactly the coercivity condition required because our lower
bound incorporates an explicit dependence on the variable y. However, for
arbitrary t ∈ J = [0, T ], we have
|x(t) − x0|² ≤ t ∫_0^t |x'(s)|² ds,

due to Jensen's inequality

|(1/t) ∫_0^t x'(s) ds|² ≤ (1/t) ∫_0^t |x'(s)|² ds.

Then, by (A.1),

|x(t) − x0|² ≤ t ∫_0^t |x'(s)|² ds
            ≤ 4T E(x) + 4T ∫_0^t |f(x(s))|² ds.

For the last term, in turn, we find that

∫_0^t |f(x(s))|² ds ≤ 2 ∫_0^t |f(x(s)) − f(x0)|² ds + 2 ∫_0^t |f(x0)|² ds
                    ≤ 2M² ∫_0^t |x(s) − x0|² ds + 2T |f(x0)|².

Altogether, we obtain

|x(t) − x0|² ≤ 4T E(x) + 8M²T ∫_0^t |x(s) − x0|² ds + 8T²|f(x0)|².

If E(x) belongs to a bounded set of numbers, we can conclude, through Gronwall's lemma, that |x(t) − x0| is uniformly bounded, and in this case (A.1) becomes the appropriate necessary coercivity. More specifically, suppose {xj} is minimizing for E. In this case, the sequence of numbers {E(xj)} is certainly bounded, and by our hypotheses and our discussion above, {xj} is bounded uniformly:

xj(t) ∈ K for all j and all t ∈ J,

for a certain, fixed compact set K. Once we can count on this piece of information, (A.1) implies

(1/4) ∫_J |x'_j(s)|² ds ≤ E(xj) + ∫_J max_{y∈K} |f(y)|² ds
                        ≤ E(xj) + ∫_J [2M² max_{y∈K} |y − x0|² + 2|f(x0)|²] ds < ∞,

and so {xj} is uniformly bounded in H1(J; Rn). This is the main conclusion of coercivity.
(b) The Euler-Lagrange system can be rewritten in the form

e'(t) + ∇f(x(t))ᵀ e(t) = 0 in J,   e(T) = 0,

where x(t) is any minimizer of the problem, and

e(t) = x'(t) − f(x(t))

is the residual associated with such minimizer. Conclude that the only
solution for the previous linear problem for e is the trivial one.
18. The situation for a constant vector y or a variable path y(t) is the same. The
existence of an optimal path is shown, under the conditions given for the map f(x, y), by checking how the uniformity of the Lipschitz constant M permits one to reproduce exactly the proof in Exercise 17, both in the constant and variable cases. Note that the problem is first-order in x, but zero-order in y. Optimality
conditions become a differential-algebraic system of the form

e'(t) + ∇_x f(x(t), y(t))ᵀ e(t) = 0 in (0, T),
∇_y f(x(t), y(t))ᵀ e(t) = 0 in (0, T),

if

e(t) = x'(t) − f(x(t), y(t))

is the residual vector. One would like to be able to conclude, from these
optimality conditions, that e ≡ 0. However, it is not easy to give an explicit
answer unless we assume a more explicit form of f . In the case of a linear
mapping

f(x, y) = Ax + by,

those optimality conditions become

e' + Aᵀ e = 0,   bᵀ e = 0.

From here one can deduce that e ≡ 0 if and only if the Kalman rank condition

rank(b  Ab  A²b  · · ·  A^{n−1}b) = n

is verified.
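The rank condition is easy to test numerically; the pair (A, b) below is a hypothetical example (an integrator chain), not taken from the exercise:

```python
import numpy as np

A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
b = np.array([[0.0], [0.0], [1.0]])

# Kalman controllability matrix (b  Ab  A^2 b)
C = np.hstack([np.linalg.matrix_power(A, k) @ b for k in range(A.shape[0])])
print(np.linalg.matrix_rank(C))
```

Here the rank is 3 = n, so the rank condition holds and e ≡ 0 is forced.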
19. In this case, one considers apparently more general variations as indicated in
the statement. Consider the real function
s ∈ (−ε, ε) ↦ g(s) ≡ ∫_0^1 F(t, u(s, t), u_t(s, t)) dt,

and impose the condition that g'(0) = 0. Though one is considering more general variations, optimality reduces to the usual Euler-Lagrange system.
20. Regarding the path u as fixed, we perform a change of variables setting

v(t) = u(φ^{−1}(t)),

and consider the variational problem for φ


Minimize in φ:   ∫_0^1 F(u_φ(t), u'_φ(t)) dt

under the constraints

φ(0) = 0,   φ(1) = 1,   φ' > 0.

The functional to be minimized can be explicitly written, through the change of


variables s = φ^{−1}(t), as

∫_0^1 F(u(s), (1/φ'(s)) u'(s)) φ'(s) ds.

It is with respect to this integral as a functional of φ, for fixed u, that one should
write optimality conditions.
21. This is a continuation of the previous exercise to stress how inner-variations are
especially well-suited to study optimality under point-wise constraints. Note
how paths of the form

uψ(t) = u(ψ(t))

comply with the point-wise constraint if u itself does, for an arbitrary function ψ. In particular, if u is optimal, then one can explore optimality by looking at the real
function

s ∈ (−ε, ε) ↦ g(s) = I(u_{ψ_s}),   u_{ψ_s}(t) = u(t + sψ(t)),   ψ(0) = ψ(1) = 0,

and asking for g'(0) = 0.


22. This new situation does not deserve special comments. It is a matter of
following the formalism for optimality conditions by considering the sections
s ∈ (−ε, ε) ↦ g(s) = ∫_J F(x, u(x) + sv(x), u'(x) + sv'(x)) dx

for fixed v with v(1) = v(0) = 0, if u is indeed a minimizer.


23. For this problem we have the integrand

F(x, u) = [uᵀ A(x) u]^{1/2}.

It is not easy to write the corresponding Euler-Lagrange system in a meaningful, compact form. In fact, to identify the Christoffel symbols, it is better to derive the system in a fully developed form

F(x, u) = (Σ_{i,j} Aij(x) ui uj)^{1/2},   A = (Aij),   u = (ui).

For the two-dimensional case, we have

F(x, u) = (A11(x1, x2)u1² + 2A12(x1, x2)u1u2 + A22(x1, x2)u2²)^{1/2}.

It is easier to write the two-equation system for this particular case.


24. Through optimality conditions for the minimization problem in ψ for fixed u, it is easy to find that

Ii(u) = (∫_0^1 |u'(t)| dt)²,

which is not a local integral functional. The calculations for the second situation are trivial, and one finds that I = Ii.
25. The application of basic results in this chapter to the problem proposed for fixed ε is straightforward (though it requires some care in calculations). The basic property to define the limit problem is to write the weak limit of the quotients 1/aε(x) in the same quotient form

1/aε(x) ⇀ 1/a(x).

A.5 Chapter 5

1. This is a straightforward practice exercise.


2. Use linearity to translate the given fact about balls into the claimed statement: the map T is open.
3. This is just a direct argument.
4. Under the condition on T, it is standard to check that the series determining S
is well-defined. Then check easily that

(1 − T)S = S(1 − T)

is the identity.
5. (a) This is straightforward.
(b) This amounts to checking that T is injective but not onto.
(c) T' f (x) = f (x/2).
6. (a) To show that π is linear, argue as follows. For x, y ∈ H arbitrary, and a
scalar α, we have

<π(αx), y> = <αx, π y> = <αx, π² y>

and then

<π(αx), y> = α<x, π² y> = · · · = α<π x, y>.

The arbitrariness of y implies π(αx) = απ x, and in a similar way for sums


of vectors. For the continuity, bear in mind that

||π x||² = <π² x, x> = <π x, x> ≤ ||π x|| ||x||.

This inequality also shows that ||π|| ≤ 1. But notice that

||π² x|| = ||π x||,

and so ||π || = 1.
(b) Identify the four statements as (a), (b), (c), and (d), respectively. Then (a)
means that

<x − π x, π x> = 0,   <x, π x> = ||π x||² ≥ 0,

which is (b). (c) trivially implies (d).


7. Since ||Tx|| = ||x||, we have ||T|| = 1, and so

   σ(T) ⊂ {|λ| ≤ 1}.

To show the equality, for arbitrary λ with |λ| ≤ 1 prove that (T − λ1) is not
onto by trying to find the inverse image of, for instance, (1, 0, . . . ).
8. It is clear that e(T) = {1/j : j ∈ N}. Since ||T|| = 1,

   σ(T) ⊂ {|λ| ≤ 1}.   (A.2)

If λ has size not greater than unity, is non-zero, and is not one of the 1/j, then

(T − λ1)x = y,   xj = yj / (1/j − λ).

For j large, |λ − 1/j| ≥ |λ|/2, and then the vector x given by the above formula belongs to lp. The operator T − λ1 is bijective, and by the open-mapping theorem it is an isomorphism. This implies that (A.2) is an equality.
9. (a) The linearity and continuity are straightforward. For the composition, use Fubini's theorem to show that

T²u(x) = ∫_0^x K2(x, y)u(y) dy

where

K2(x, y) = ∫_y^x K(x, z)K(z, y) dz.

Keep iterating. In fact, if

Tu(x) = ∫_0^x K(x, y)u(y) dy,   Su(x) = ∫_0^x L(x, y)u(y) dy,

then

T(Su)(x) = ∫_0^x M(x, y)u(y) dy

where

M(x, y) = ∫_y^x K(x, z)L(z, y) dz.

(b) There is no special difficulty with induction based on the previous calcula-
tions.

(c) A constant kernel has a trivial kernel for the corresponding operator. The kernel

K(x, y) = x cos(πy/x)

has a constant function in the kernel of the corresponding operator.
(d) In the case K(x, y) = 1, the rule found in the first item leads to

Kj(x, y) = (1/j!)(x − y)^j,   j ≥ 0,

for the kernel of T^{j+1}. In this way, one can write

(1 − T)^{−1}u(x) = Σ_j T^j u(x)
                 = u(x) + Σ_j ∫_0^x (1/j!)(x − y)^j u(y) dy
                 = u(x) + ∫_0^x e^{x−y} u(y) dy.   (A.3)

(e) This is straightforward.
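A numerical sketch of item (d): discretize Tu(x) = ∫_0^x u(y) dy with trapezoidal weights and solve (1 − T)w = u for u ≡ 1. The exact resolvent gives w(x) = 1 + ∫_0^x e^{x−y} dy = e^x, which the discrete solve should reproduce:

```python
import numpy as np

n = 2000
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]

# trapezoidal-rule matrix for the Volterra operator Tu(x) = int_0^x u(y) dy
T = np.tril(np.full((n, n), h), -1)
T += np.diag(np.full(n, h / 2.0))
T[:, 0] -= h / 2.0          # half weight at y = 0; row 0 becomes all zeros

u = np.ones(n)
w = np.linalg.solve(np.eye(n) - T, u)   # w = (1 - T)^{-1} u
print(float(np.max(np.abs(w - np.exp(x)))))
```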


10. (a) Apply Corollary 5.1 to conclude that the limit operator

    Sy = Σ_j T^j y

is linear and continuous, and because (1 − T)S = 1, S is the inverse of (1 − T), which is therefore an isomorphism of E.
(b) Show that it is a Cauchy sequence based on the convergence of the sum of
powers of T.
(c) Write

    xj = Σ_{k=0}^{j−1} T^k y + T^j x0,   x = Σ_{k=0}^{∞} T^k y.

The difference x − xj becomes

    Σ_{k=j}^{∞} T^k y − T^j x0 = T^j ((1 − T)^{−1} y − x0).
k=j

11. Suppose {u'j} is an infinite family of linearly independent C0-functions in [0, 1], which we can assume uniformly bounded and all of the same size in terms of the sup-norm. This effect can be achieved by multiplying each derivative by an appropriate constant. Then we know that, at least for a subsequence and after subtracting a fixed function, u'j ⇀ 0, but this convergence cannot be strong (because the norms all have the same size, away from zero). Then uj → 0 (strongly), but their derivatives converge only weakly, and so the derivative operator cannot be continuous.
12. It is clear that T is linear and continuous. If a number λ belongs to the resolvent then T − λ1 must be a bijection:
(a) T − λ1 is injective: all level sets {a = λ} of a must be negligible;
(b) T − λ1 is onto: the function 1/(a(x) − λ) must belong to L∞(Ω).
Eigenvalues are those λ's for which the level set {a = λ} is non-negligible, while λ belongs to the resolvent when

1/(a(x) − λ) ∈ L∞(Ω).

13. Use the change of variables st = r to see that

    lim_{s→∞} L(u)(s) = 0.

Through the mean-value theorem for the exponential, and the same change of variables, prove that Lu is a continuous function, and that L is continuous. The property of the derivative of Lu is standard.
14. Define a linear operator T and its adjoint T', through the Riesz representation theorem, by putting

    A(u, v) = <Tu, v> = <u, T'v>.

Check, through the coercivity of A and the Cauchy-Schwarz inequality, that

    c||v|| ≤ ||T'v||.

The duality relations imply that the closure of R(T) is the full space H.
Conclude by checking that T is closed.
15. This is the same operator of the two final items of Exercise 9. It does not have
eigenvalues, and its norm is e − 1.

A.6 Chapter 6

1. This is immediate from the definitions upon some reflection.


2. Let u ∈ U. By hypothesis,

   u = m + (1/2)u1,   m ∈ M, u1 ∈ U.

By the same reason, there are

   m1 ∈ M, u2 ∈ U,   u1 = m1 + (1/2)u2,

i.e.

   u = m + (1/2)(m1 + (1/2)u2) = m + (1/2)m1 + (1/2²)u2.

Proceed in a similar manner to find that

   u = Σ_{i=0}^{n} (1/2^i) mi + (1/2^{n+1}) u_{n+1},   m = m0,
i=0

for every n. This decomposition of u implies, because U is bounded, that u ∈


M.
3. The orthogonality relations are immediate upon interpretation of them in a Hilbert space scenario. For the equality of spectra, it suffices to check that the resolvents are the same. Suppose that (T' − λ1)x = 0; then one concludes that

   <x, (T − λ1)y> = 0

for all y. If T − λ1 is a bijection, then this can only happen if x = 0, and (T' − λ1) is injective. Note that

   (T − λ1)' = T' − λ1,

and use the orthogonality relations to conclude.


4. (a) If A is a symmetric, positive matrix, it has positive eigenvalues λj > 0, and it is diagonalizable:

   A = PDP^{−1},   D = diag(λj),   P non-singular.

We can then define, in a coherent way,

   log A = P log D P^{−1},   log D = diag(log λj).



For the concrete matrix provided, it is elementary to find that

A = ( 1  0  1) (1 0 0) (−1 −1 −1)
    (−1  1 −2) (0 1 0) ( 3  2  1)
    (−1 −1  1) (0 0 2) ( 2  1  1)

and

log A = ( 1  0  1) (0 0   0  ) (−1 −1 −1)
        (−1  1 −2) (0 0   0  ) ( 3  2  1)
        (−1 −1  1) (0 0 log 2) ( 2  1  1)

i.e.

log A = log 2 · ( 2  1  1)
                (−4 −2 −2)
                ( 2  1  1)

It is easy to check that exp(log A) = A.
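The check can be carried out with NumPy, exponentiating log A through its power series:

```python
import numpy as np

P = np.array([[1.0, 0.0, 1.0], [-1.0, 1.0, -2.0], [-1.0, -1.0, 1.0]])
Pinv = np.array([[-1.0, -1.0, -1.0], [3.0, 2.0, 1.0], [2.0, 1.0, 1.0]])

A = P @ np.diag([1.0, 1.0, 2.0]) @ Pinv
logA = P @ np.diag([0.0, 0.0, np.log(2.0)]) @ Pinv

# exp(log A) via the power series  sum_k (log A)^k / k!
E = np.zeros((3, 3))
term = np.eye(3)
for k in range(1, 30):
    E += term
    term = term @ logA / k
print(np.allclose(E, A))
```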


(b) The formula

    Tu(t) = t ∫_0^1 u(s)(1 − s) ds − ∫_0^t u(s)(t − s) ds

yields an explicit form for the operator T which proves that T is compact. However, it is better, for other purposes, to use its definition in the statement. Two successive integrations by parts show that

    ∫ U'' V ds = ∫ U V'' ds,   U(0) = U(1) = V(0) = V(1) = 0,

and hence T is self-adjoint. In addition, one integration by parts leads to

    <Tu, u> = ∫_0^1 (Tu)'(s)² ds ≥ 0,

and T is positive as well. Consequently, for any continuous function f(r) : [0, ∞) → [0, ∞), the operator f(T) can be defined as in Example 6.8.
5. This exercise reduces to checking that the operator

T : L2(0, 1) → L2(0, 1),   U = Tu,

defined through

−U''(x) + U(x) = u(x) in (0, 1),   U'(0) = U'(1) = 0,   (A.4)

is compact, self-adjoint, and positive, just as in the second part of the previous
exercise. There is no explicit formula for T though.
(a) In the first place, T is well-defined. Note that problem (A.4) admits a unique solution for u ∈ L2(0, 1) as it is the minimizer of the quadratic functional

    ∫_0^1 [(1/2)U'(x)² + (1/2)U(x)² − u(x)U(x)] dx

under no boundary condition. Refer to Corollary 4.2.


(b) For the compactness, multiply (A.4) by U itself and integrate by parts to find

    ∫_0^1 [U'(x)² + U(x)²] dx = ∫_0^1 u(x)U(x) dx.

From here, we deduce that

    ||U|| ≤ ||u|| in L2(0, 1),
    ||U'||² ≤ ||u|| ||U|| ≤ ||u||² in L2(0, 1).

Hence, if u belongs to a bounded set in L2(0, 1), so does the set of derivatives U', and by Proposition 2.7, the functions U themselves belong to a compact set in L2(0, 1).
6. Since by definition, it is always true that

   1 = πi + πi^⊥,   πi → 1,

we should have the given statement. However, if it were true that ||πi⊥ || →
0, that would imply that the identity operator 1 would be compact, which is
impossible in an infinite-dimensional Hilbert space. There is no contradiction
with the Banach-Steinhaus principle.
7. Because πᵢ → 1 point-wise, if T is compact then T(B) is a compact set, and hence

.  ||πᵢT − T|| = sup_{x: ||x||=1} |(πᵢT − T)x| → 0,   i → ∞.

Conversely, if the norm in the statement tends to zero, T turns out to be a limit, in the operator norm, of a sequence of finite-rank operators, and hence compact.
8. The arguments are much in the spirit of Proposition 2.13. Bear in mind that
under the given hypotheses πᵢπⱼ = 0 for i ≠ j. For the compactness use the
fact that a limit of finite-rank operators is compact.
356 A Hints and Solutions to Exercises

9. This is a typical Linear Algebra exercise stated in an infinite-dimensional scenario.
10. (a) From

.  λu(x) = ∫₀ˣ (1 − x)y u(y) dy + ∫ₓ¹ (1 − y)x u(y) dy,
it is easy to check the first item by differentiating twice with respect to x.


(b) This is elementary from the previous item. The eigen-functions are

.  uⱼ(x) = sin(πjx).

(c) For integral operators of this kind, the adjoint is the integral operator
corresponding to the symmetric kernel
.  K'(x, y) = K(y, x) = { (1 − y)x, 0 ≤ x ≤ y ≤ 1;  (1 − x)y, 0 ≤ y ≤ x ≤ 1 } = K(x, y).

T is then self-adjoint. The Fredholm alternative ensures that the equation

.  (T − (jπ)⁻² 1)u = v

is solvable, provided <v, w> = 0 for all w such that

.  (T − (jπ)⁻² 1)w = 0,

i.e. v belongs to the orthogonal complement of the eigen-space associated with (jπ)⁻².
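These eigenvalues can be corroborated numerically by discretizing the kernel K with the midpoint rule; a rough sketch (the grid size and tolerance are arbitrary choices of this illustration):

```python
import numpy as np

# Midpoint discretization of the integral operator with kernel
#   K(x, y) = (1 - y) x for x <= y,  (1 - x) y for y <= x.
n = 400
x = (np.arange(n) + 0.5) / n
X, Y = np.meshgrid(x, x, indexing="ij")
T = np.where(X <= Y, (1 - Y) * X, (1 - X) * Y) / n   # quadrature weight h = 1/n

eigs = np.sort(np.linalg.eigvalsh(T))[::-1]
# The j-th eigenvalue approximates (j*pi)**(-2).
for j in range(1, 5):
    exact = (j * np.pi) ** (-2)
    assert abs(eigs[j - 1] - exact) < 0.01 * exact
```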
11. The one in Exercise 7 is not compact, for example, by Theorem 6.3: every
compact operator must have eigenvalues. However, the one in Exercise 8 is
compact. To argue this, consider the finite-rank operator

.  Tⱼx = (x₁, x₂/2, x₃/3, …, xⱼ/j, 0, 0, …),

and realize that

.  ||T − Tⱼ|| ≤ 1/(j + 1).
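In a finite-dimensional truncation the bound can be observed directly; a sketch representing the operator on the first n coordinates as a diagonal matrix (n is an arbitrary choice):

```python
import numpy as np

n, j = 50, 7
T = np.diag(1.0 / np.arange(1, n + 1))   # T x = (x1, x2/2, x3/3, ...)
Tj = T.copy()
Tj[j:, j:] = 0.0                          # finite-rank truncation T_j

# The spectral norm of the diagonal remainder is sup_{k > j} 1/k = 1/(j+1).
assert np.isclose(np.linalg.norm(T - Tj, 2), 1.0 / (j + 1))
```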

12. If f is non-constant, its image contains a certain non-negligible interval J ⊂ [0, 1]. The sequence uⱼ(x) = sin(jf(x)) is continuous, uniformly bounded, but it will not converge in the subinterval f⁻¹(J).

13. Since T is compact and self-adjoint, the Fredholm alternative ensures the result if the unique solution u(x) of the homogeneous equation

.  u(x) − ∫₀^π sin(x + y) u(y) dy = 0

is the trivial one. Write this integral equation in the form

.  u(x) = sin x ∫₀^π cos y u(y) dy + cos x ∫₀^π sin y u(y) dy = a sin x + b cos x,

and write u(y) = a sin y + b cos y back into the defining integrals,

.  a = ∫₀^π cos y (a sin y + b cos y) dy,
.  b = ∫₀^π sin y (a sin y + b cos y) dy,

to conclude that a = b = 0.
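Since a multiplies sin x and b multiplies cos x, a = ∫₀^π cos y u(y) dy and b = ∫₀^π sin y u(y) dy. Substituting u = a sin y + b cos y, the resulting homogeneous 2×2 system can be checked symbolically; a sketch with sympy:

```python
import sympy as sp

y, a, b = sp.symbols("y a b")
u = a * sp.sin(y) + b * sp.cos(y)

# a = int_0^pi cos(y) u(y) dy and b = int_0^pi sin(y) u(y) dy.
eq_a = sp.Eq(a, sp.integrate(sp.cos(y) * u, (y, 0, sp.pi)))
eq_b = sp.Eq(b, sp.integrate(sp.sin(y) * u, (y, 0, sp.pi)))

# The system reads a = (pi/2) b, b = (pi/2) a: only the trivial solution.
sol = sp.solve([eq_a, eq_b], [a, b])
assert sol == {a: 0, b: 0}
```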
14. It is very natural to argue that such a T is a limit of a sequence of finite-rank operators. For a counterexample, check examples of the form given in Exercise 8 of Chap. 5 that have been discussed in Exercise 11 above.
15. In a reflexive Banach space, the unit ball B is weakly compact. This suffices to prove the claim easily. In the explicit situation given, the sequence Tuⱼ converges, in the sup norm, to the function equal to 0 for x ≤ 0 and to x for x ≥ 0, whose derivative is discontinuous; hence, it cannot belong to the image of T.
16. This property has been sufficiently emphasized earlier.
17. This proof is a very typical application of the Riesz lemma. If the space is not finite-dimensional, a sequence can be found inductively in the unit sphere that cannot converge to anything. The argument is similar to the proof of Proposition 6.3.
18. (a) To calculate ||T||, evaluate T(fr ) for

.  f_r = χ_{[r,1]},   0 ≤ r ≤ 1.

(b) For any λ ∈ [0, 1], it is easy to realize that the equation

.  (s − λ)f(s) = g(s)

for f, g in L²(0, 1) is, in general, impossible to solve. However, there is no particular difficulty if λ ∉ [0, 1]. The operator does not have eigenvalues, and so it cannot be
compact (Theorem 6.3).
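Assuming T is the multiplication operator (Tf)(s) = s f(s) on L²(0, 1), which the equation in part (b) suggests, evaluating it on f_r = χ_{[r,1]} gives ||Tf_r||²/||f_r||² = (1 − r³)/(3(1 − r)) = (1 + r + r²)/3 → 1 as r → 1, so ||T|| = 1 although the supremum is not attained. A quick numeric check:

```python
# ||T f_r||^2 / ||f_r||^2 for f_r = indicator of [r, 1] and (Tf)(s) = s f(s):
#   int_r^1 s**2 ds / int_r^1 1 ds = (1 - r**3) / (3 * (1 - r)).
def ratio(r):
    return (1 - r**3) / (3 * (1 - r))

assert all(ratio(r) <= 1.0 for r in [0.0, 0.5, 0.9, 0.999])
assert ratio(0.999) > 0.99   # the supremum 1 is approached but never attained
```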
19. (a) This is just a simple calculation.
358 A Hints and Solutions to Exercises

(b) Both terms in Tf represent compact integral operators of Volterra type.
(c) The operator

.  f(x) ↦ f(x) − ∫₀¹ f(y) dy

is, however, not compact: the first term is just the identity, while the second contribution is compact.

A.7 Chapter 7

1. This is elementary. Take the open set π_N o, and select a test function ψ for such an open set in R^{N−1} with ψ(x'₀) = 1.
2. This is also a geometric property which is, at least, easy to visualize.
3. This is just a change-of-variables issue.
4. There is nothing surprising in this exercise.
5. (a) If ψ ∈ C_c^∞(R^N) and φ ∈ C_c^∞(o), the product ψφ is a smooth function with compact support in o, and hence

.  ∫_o u ∇(ψφ) dx = − ∫_o ∇u ψφ dx.

Thanks to the product rule for smooth functions,

.  ∫_o u ∇(ψφ) dx = ∫_o u (∇ψ φ + ∇φ ψ) dx.

If we reorganize these identities we find

.  ∫_o u ψ ∇φ dx = − ∫_o (∇u ψ + u ∇ψ) φ dx.

The arbitrariness of φ leads to the desired result.


(b) This is straightforward from the definition.
6. This can be accomplished through the same ideas as in Example 2.12 of
Chap. 2. Apply the one-dimensional construction in that example to the func-
tion

.  χ(x) = χₜ(x · n),

where χt is the 1-periodic function of one variable of that example, and n is any
unit normal vector.

7. (a) Just as in the case of the standard Sobolev spaces, one can define the Hilbert space

.  L²_div(o) = {F ∈ L²(o; R^N) : div F ∈ L²(o)},

where the weak divergence div F is defined through the usual integration-by-parts formula

.  ∫_o (div F) φ dx = − ∫_o F · ∇φ dx,

valid for every test function φ in o.


(b) This is straightforward because div F = 0 means, according to the previous formula,

.  ∫_o F · ∇φ dx = 0

for all test functions. These are dense in H₀¹(o).


8. This exercise is left for exploration to interested readers.
9. This is similar to arguments in Sects. 2.10 and 7.3.
10. Use Corollary 7.1 and Lemma 7.5 for an increasing sequence of balls {Bⱼ} with radii tending to ∞ to conclude that the set C_c^∞(R^N) is dense in W^{1,p}(R^N).

11. These facts are straightforward.


12. This is an interesting exercise but straightforward after the proof of Lemma 2.4.
13. It is easy to calculate that if a function u(x₁, x₂) ∈ H¹(B) is independent of x₂, then

.  ||∇u||²_{L²(B;R²)} = 2 ∫₋₁¹ u'(x₁)² √(1 − x₁²) dx₁.

Similarly,

.  ||u||²_{L²(B)} = 2 ∫₋₁¹ u(x₁)² √(1 − x₁²) dx₁.

In this way, one can define the Sobolev space with weight

.  L²_w(−1, 1) = {u(x), measurable : ∫₋₁¹ u(x)² w(x) dx < ∞},
.  H¹_w(−1, 1) = {u(x) ∈ L²_w(−1, 1) : u'(x) ∈ L²_w(−1, 1)},

for the weight function

.  w(x) = √(1 − x²),   x ∈ (−1, 1).

For a general domain the procedure is formally the same for the weight function

.  w(x) = |{x₁ = x}|.

This procedure can be implemented in a completely general way.


14. That L is a subspace is straightforward. It is also easy to argue that it is a subspace of H¹_w(−1, 1) with the ideas and notation of the previous exercise. To check that it is not closed, it suffices to find a function in H¹_w(−1, 1) which does not belong to L. The function

.  u(x) = ∫₀ˣ (1 − t²)^{−1/4} dt

is a good candidate.

A.8 Chapter 8

1. By looking with care at the integration by parts involved in deriving the weak form of the Euler-Lagrange equation in 8.7, one can conclude that

.  F_{∇u}(u(x), ∇u(x)) · n(x) = 0 on ∂o.   (A.5)

Proceed in two steps:


(a) use first a test function V ∈ H01 (o) for which the integration by parts does
not interfere with boundary contributions, to derive the PDE itself;
(b) once one can rely on the information coming from the previous PDE, use a
general test function V ∈ H 1 (o) to conclude (A.5).
2. This is a direct application of our main existence result. Since the functional is
quadratic, it can also be solved with the help of the Lax-Milgram lemma. The
optimality condition reads

.  div(∇u − F) = f in o,   u = u₀ on ∂o.

3. The first part is a consequence of the first exercise of this chapter. It amounts to
writing the natural boundary condition

.  ∇u · n = 0 on ∂o.

It is relevant to note the presence of the quadratic term in u to avoid the situation
in which minimizing sequences do not remain bounded in H 1 (o) by adding an
arbitrary constant to u. If one is interested in adjusting a non-vanishing normal
derivative, then the functional should be formally changed to
.  ∫_o ½|∇u(x)|² dx + ∫_{∂o} h(x)u(x) dS(x).

4. The integrand for this problem can be written in the form

.  ½ ∇uᵀ (a1 + F ⊗ G) ∇u.

It is therefore a quadratic variational problem, and one needs the condition

.  a > 2 ||F||_{L∞(o)} ||G||_{L∞(o)}

to guarantee convexity and coercivity. The Euler-Lagrange equation is


.  div( a∇u + ½(F · ∇u)G + ½(G · ∇u)F ) = 0.

5. The functional with integrand

.  u ↦ (a/2)|u|² + (b/2)|u| F · u

has the given Euler-Lagrange equation. The condition

.  a > ||bF||_{L∞(o)}

is required to ensure existence of solutions.


6. According to the third item of Proposition 8.2, the function f needs to be convex and increasing to ensure convexity of F. Moreover, one needs the asymptotic behavior

.  lim_{r→∞} f(r)/rᵖ = α > 0,   p > 1.

In fact, it suffices that

.  lim_{r→∞} f(r)/r = +∞.
7. This is similar to Exercise 29 of Chap. 2.

8. If the field F is divergence-free, then this term does not affect the underlying Euler-Lagrange equation. On the one hand, by the divergence theorem,

.  ∫_o ∇u(x) · F(x) dx = ∫_{∂o} u(x) F(x) · n(x) dS(x),

and this last integral is independent of u, as it only depends on its boundary values, which are fixed. This is a null-lagrangian, meaning that this second term does not affect the optimization process, as it is constant. On the other hand, it is elementary to realize that when writing the Euler-Lagrange equation, this term drops out precisely because F is divergence-free.
9. This is straightforward.
10. (a) The application of the Lax-Milgram lemma is direct in the space H.
(b) We have to change H slightly to incorporate the normal boundary condition.
In this case the subspace of L2 (o; RN ) made up of gradients of functions
in H 1 (o) is the orthogonal complement to H. Hence, optimality becomes

. curl(AF + G) = 0.

Therefore there exists u with ∇u = AF + G. This is formally equivalent, by solving for F, to

.  div[A⁻¹(∇u − G)] = 0 in o,   A⁻¹(∇u − G) · n = f on ∂o.

According to Exercise 3 of this same chapter, these are exactly the optimality conditions for the standard problem for the functional

.  ∫_o ( ½ ∇uᵀA⁻¹∇u − ∇uᵀA⁻¹G ) dx + ∫_{∂o} u f dS(x)

over H¹(o).
11. (a) The standard theorem cannot be applied because the matrix

.  A = ½ [ 1 a ; a a² ]

is not positive definite.
(b) The proposed perturbation transforms the previous matrix A into

.  A = ½ [ 1 a ; a a² + ε ].

This time the theorem can be applied and there is a unique solution.

(c) The situation is exactly the same. The main change for this new problem would affect the form of the Euler-Lagrange equation, which would be non-linear.
12. This is a straightforward generalization of Exercise 1. The corresponding optimality conditions would read

.  −div[F_{∇u}(u, ∇u)] + F_u(u, ∇u) = 0 in o,
   u = u₀ on r,   F_{∇u}(u, ∇u) · n = 0 on ∂o \ r.

13. This is a zeroth-order problem. A typical existence theorem would require the
convexity of the integrand F (x, ·) for a.e. x ∈ o, and the coercivity

.  C(|u|ᵖ − 1) ≤ F(x, u),   C > 0,  p > 1.

Optimality conditions would amount to

.  F_u(x, u_λ(x)) = λ, constant in o,

where such constant value λ is determined so that the integral constraint is respected:

.  ∫_o u_λ(x) dx = |o| u₀.

14. It is well-known from Vector Calculus courses that the flux of a vector field F
across a given surface S is given by the surface integral
.  ∫_S F(x) · n(x) dS(x),

where n(x) is the unit normal vector field to S (one of the possible two options).
If surface S is given by the graph of a certain function u(x, y) over o, then the
flux is
.  ∫_o F(x, y, u(x, y)) · (−uₓ(x, y), −u_y(x, y), 1) dx dy.

More explicitly, if we put

.  F(x, y, u(x, y)) = (F₁(x, y, u(x, y)), F₂(x, y, u(x, y)), F₃(x, y, u(x, y))),

the flux is given by


.  ∫_o [ F₃(x, y, u(x, y)) − F₁(x, y, u(x, y)) uₓ(x, y) − F₂(x, y, u(x, y)) u_y(x, y) ] dx dy.

After a few careful calculations, the equation of optimality is written

. div F(x, y, u(x, y)) = 0,

which forces optimal surfaces to have their image on the set where the
divergence of the vector field F vanishes. This is clearly impossible if this
possibility is forbidden by the function u0 furnishing the boundary values for
u. If F is divergence-free, the Euler-Lagrange equation becomes void, and the
integrand becomes a null-lagrangian.
15. (a) It is easy to check that the square of the determinant is not convex. For
instance, the section
.  t ↦ F( [ 0 1 ; 1 0 ] + t [ 1 0 ; 0 1 ] )

is not convex.
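With F(X) = (det X)², this section can be computed by hand: the determinant of the matrix above plus tI is t² − 1, so the section is g(t) = (t² − 1)², and g(0) = 1 > 0 = (g(−1) + g(1))/2 violates midpoint convexity. A numeric confirmation:

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])

def g(t):
    # Section of F(X) = det(X)**2 along A + t * Id: g(t) = (t**2 - 1)**2.
    return np.linalg.det(A + t * np.eye(2)) ** 2

# Midpoint convexity fails between t = -1 and t = 1.
assert g(0.0) > 0.5 * (g(-1.0) + g(1.0))
```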
(b) With a bit of care the equation becomes

.  ∇(det ∇²u) · (u_yy − u_xy, u_xx − u_xy) = 0 in o,

which is a highly non-linear equation.


16. (a) It is immediate to show the existence of a unique solution either through
the Lax-Milgram theorem (we are talking about a quadratic functional) or
by our main existence result in the chapter.
(b) Optimality conditions lead to
.  ∫_o [ α₁χ(x) + α₀(1 − χ(x)) ] ∇u(x) · ∇v(x) dx = 0

for all v ∈ H01 (o). Use successively test functions v with support in o1
and o0 to conclude that u must be harmonic separately in both sets. Once
this information is available, for a general v with arbitrary values on r, use
the divergence theorem and deduce the transmission condition

.  (α₁ − α₀) ∇u · n = 0 on r.

Both α’s could be functions of x.



17. (a) Once again, it is immediate to show existence of a unique solution for each
ε fixed, either through the Lax-Milgram lemma or our main existence result
for variational problems.
(b) Argue that the sequence of minimizers {uε } is uniformly bounded in
H 1 (o), and, hence, possibly for a subsequence (not relabeled) it converges
weakly to some limit u.
(c) Under the hypothesis suggested one would find that
.  0 = ∫_o a_ε ∇v_ε · ∇u_ε dx → ∫_o a₀ ∇v · ∇u dx.

Hence

.  ∫_o a₀ ∇v · ∇u dx = 0

for all v ∈ H01 (o), and this implies that u0 is the minimizer of the limit
functional.
18. (a) This is again a consequence of our main existence theorem for convex,
coercive variational problems.
(b) As in the mentioned exercise, the transmission condition would be

.  [ |∇û − ∇u|^{p−2}(∇û − ∇u) − |∇û|^{p−2}∇û ] · n = 0 on ∂o.

(c) Because the value of the functional for U is finite (it is important that U = u
in o), independently of ε, we can deduce that ûε → u in o.
19. (a) This is standard after main results in this chapter.
(b) This is a particular case of the same question in the previous exercise.
The main difference is the linearity of the operation taking each u to the
corresponding minimizer vε .
(c) Note that

.  m_ε = ½ ||∇v_ε||²_{L²(R^N\o)} + (1/2ε) ||∇v_ε − ∇u||²_{L²(o)},   (A.6)

and bear in mind that we can take

.  ||v_ε||²_{H¹(R^N\o)} = ||∇v_ε||²_{L²(R^N\o)},   ||v_ε||²_{H¹(R^N)} = ||∇v_ε||²_{L²(R^N)}.

Then

.  ||v_ε||²_{H¹(R^N)} = ||v_ε||²_{H¹(R^N\o)} + ||∇v_ε||²_{L²(o)},

and, on the one hand, by (A.6),

.  ||v_ε||²_{H¹(R^N\o)} ≤ 2m_ε,

whereas on the other,

.  ||∇v_ε||²_{L²(o)} ≤ ( ||∇v_ε − ∇u||_{L²(o)} + ||∇u||_{L²(o)} )²
                   ≤ 2 ||∇v_ε − ∇u||²_{L²(o)} + 2 ||∇u||²_{L²(o)}
                   ≤ 4 m_ε ε + 2 ||∇u||²_{L²(o)}.

Since we are interested in small values of ε, these computations show that there is a uniform constant C such that

.  ||E_ε u|| ≤ C ||u||.

(d) Conclude accordingly.


20. The natural possibility would be to consider the subspace

.  H = {u ∈ C^∞(o) : F · ∇u = 0 on ∂o}

of H¹(o), and its closure in H¹(o), which we designate by the same letter H.
However, it is not difficult to realize that H becomes the full H¹(o), because one can modify a given function u ∈ H¹(o) by a small amount (in the H¹(o)-norm) near the boundary to make it comply with the given boundary condition. This is similar to the situation of the L²(o)-closure of the set C_c^∞(o) of smooth functions with compact support in o, which is the full L²(o). The problem is not well-posed; it is ill-posed.
21. This is a practice exercise with typical integrands in the quadratic case. It
amounts to checking (strict) convexity and coercivity in each particular case, as
well as smoothness conditions to write the underlying Euler-Lagrange equation,
at least formally. We briefly comment on each case.
(a) The integrand is

.  F(u₁, u₂) = ½ (|u₁| + |u₂|)².
It is strictly convex and coercive.
(b) The integrand is quadratic, corresponding to the symmetric, positive definite matrix

.  A = [ 2 −1 ; −1 1 ].

(c) Similar to the previous one for the matrix

.  A = [ 1 −1 ; −1 1 ].

Though it is convex (not strictly), it is not coercive. Check the diagonal u₁ = u₂.
(d) The integrand is

.  F(u₁, u₂) = ½ (u₁⁴ + u₂⁴)^{1/2}.

It is strictly convex and coercive. Note that

.  F(u) = ½ ||u||₄²,   u = (u₁, u₂),   ||u||₄⁴ = u₁⁴ + u₂⁴.

All norms are equivalent in R², and the p-th norm is strictly convex for p > 1.
(e) The integrand is

.  F(u₁, u₂) = ½ [ (u₁², u₂²) [ 2 −1 ; −1 1 ] (u₁², u₂²)ᵀ ]^{1/2}.

It is strictly convex and coercive.


(f) The integrand is

.  F(u₁, u₂) = ½ [ (u₁⁴ + u₂²)^{1/2} + u₂² ].
It is strictly convex and coercive.
(g) Slight variation of the previous one.
(h) The integrand is

.  F(u₁, u₂) = ½ (u₁² + u₂²) + 7 exp(−(u₁ − 1)⁴ − u₂²).
It is coercive but not convex.
(i) This time

.  F(u₁, u₂) = |u₁| |u₂|.

It is neither convex nor coercive.


(j) The integrand is

.  F(u₁, u₂) = (1 + u₁²)^{1/2} (1 + u₂²)^{1/2}.

It has linear growth at infinity, which is not enough for coercivity. It is not convex either.
(k) This time

.  F(u₁, u₂) = (1 + u₁² + u₂²)^{1/2}.

It has linear growth, but it is strictly convex.
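Several of these claims can be spot-checked numerically; a sketch testing the matrices in (b) and (c) and the failure of convexity in (h):

```python
import numpy as np

# (b): A = [[2, -1], [-1, 1]] is positive definite.
eigs_b = np.linalg.eigvalsh(np.array([[2.0, -1.0], [-1.0, 1.0]]))
assert np.all(eigs_b > 0)

# (c): A = [[1, -1], [-1, 1]] is only positive semi-definite
# (it vanishes along the diagonal u1 = u2, hence no coercivity).
eigs_c = np.linalg.eigvalsh(np.array([[1.0, -1.0], [-1.0, 1.0]]))
assert np.isclose(eigs_c.min(), 0.0) and eigs_c.max() > 0

# (h): F(u) = |u|**2/2 + 7*exp(-(u1-1)**4 - u2**2) fails midpoint convexity.
def Fh(u1, u2):
    return 0.5 * (u1**2 + u2**2) + 7 * np.exp(-((u1 - 1) ** 4) - u2**2)

assert Fh(1.0, 0.0) > 0.5 * (Fh(0.0, 0.0) + Fh(2.0, 0.0))
```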


22. (a) The mixed boundary condition can be implemented by minimizing the
functional over the set of feasible functions

{u ∈ H 1 (o) : u = u0 on r0 }.
.

It requires examining the subspace of H¹(o) given by

{u ∈ H 1 (o) : u = 0 on r0 },
.

as the closure in H 1 (o) of the same subspace of smooth functions.


(b) Mimic the non-homogeneous Neumann boundary condition by minimizing the augmented functional
.  ½ ∫_o |∇u(x)|² dx + (γ/2) ∫_{∂o} u²(x) dS(x).

(c) If the boundary condition is restricted through an inequality, then the Dirichlet part of the condition will be the coincidence set u = u₀ on ∂o, while the Neumann part ∇u · n = 0 will correspond to its complement, where u < u₀ on ∂o.
23. This is a practice exercise. With a bit of care, one finds that the corresponding
Euler-Lagrange equation turns out to be

. div [(1 − ∇w(x) ⊗ ∇w(x)) ∇u(x)] = 0.

A.9 Chapter 9

1. Almost all main ingredients of the computations have been given in the
statement. The formula in the first part is just a careful use of the chain rule.
Note that, unless we differentiate twice with respect to the normal to ∂o, i.e.
with respect to XN , at least one derivative has to be computed tangentially,
and so it must vanish. In order to conclude (9.24), recall that we learn from
Differential Geometry that the quantity

.  ∇X_N ∇²X_N ∇X_N

yields the curvature H of a hyper-surface if it is given by a level set of XN , and


∇XN is unitary.
2. The optimality condition for the suggested constrained variational problem is

.  Au = λu in o,   u = 0 on ∂o,

for a positive multiplier λ. The smallest possible such λ is the first eigenvalue of the Laplacian. From this equation one concludes that

.  λ ∫_o u(x)² dx = ∫_o |∇u(x)|² dx.

On the other hand, the best constant C in Poincaré's inequality will be such that

.  ∫_o u(x)² dx ≤ C ∫_o |∇u(x)|² dx.
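In one dimension on (0, 1), for instance, the first Dirichlet eigenvalue of −d²/dx² is π², so the best Poincaré constant is C = 1/π²; a finite-difference sketch (the grid size is an arbitrary choice):

```python
import numpy as np

# Standard 3-point finite-difference Dirichlet Laplacian on (0, 1).
n = 200
h = 1.0 / (n + 1)
L = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2

lam1 = np.linalg.eigvalsh(L)[0]
# The first eigenvalue approximates pi**2; the best constant is 1/lam1.
assert abs(lam1 - np.pi**2) < 0.01
assert abs(1.0 / lam1 - 1.0 / np.pi**2) < 1e-4
```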

3. For the case N = 2, the exponent p = (N +2)/(N −2) cannot be taken directly,
so one needs to work with an arbitrary p, apply the corresponding inequalities,
and then take p to ∞.
4. This is an elementary Calculus exercise. For x ≥ R, write

.  f'(x) = (p/x) f(x) + g(x),   g(x) ≥ 0.

Use the explicit formula

.  f(x) = exp( ∫_R^x (p/s) ds ) [ f(R) + ∫_R^x exp( −∫_R^y (p/s) ds ) g(y) dy ]

for the solution of such a linear, first-order ODE to conclude, based on the fact that g(x) ≥ 0, that

.  f(x) ≥ (f(R)/Rᵖ) xᵖ,   x ≥ R.

From here, it is easy to deduce the final result.
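For a concrete illustration (assumed data: p = 2, R = 1, g ≡ 1, f(1) = 1), the formula gives f(x) = x²(1 + ∫₁ˣ y⁻² dy) = 2x² − x, and both the ODE and the bound f(x) ≥ f(R)(x/R)ᵖ = x² can be checked directly:

```python
import numpy as np

# Illustration with p = 2, R = 1, g(x) = 1, f(1) = 1 (assumed data).
p, R = 2.0, 1.0

def f(x):
    # Explicit solution f(x) = x**2 * (1 + int_1^x y**(-2) dy) = 2*x**2 - x.
    return 2.0 * x**2 - x

x = np.linspace(1.0, 10.0, 100)
fp = 4.0 * x - 1.0                       # f'(x)
assert np.allclose(fp, (p / x) * f(x) + 1.0)   # the ODE f' = (p/x) f + g
assert np.all(f(x) >= (x / R) ** p - 1e-12)    # lower bound f(R)*(x/R)**p
```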
5. The main point is to define the operator T in this new setting appropriately. Put U = Tu for the unique minimizer of the functional

.  ∫_o [ ½|∇U(x)|² − U(x)u(x) ] dx

over the space H¹(o) ∩ L²₀(o). This constraint on the mean value of functions
is necessary to have a substitute for Poincaré’s inequality through Wirtinger’s

inequality, and to have that the L2 -norm of the gradient is again a norm in
H 1 (o) ∩ L20 (o). Everything else is like in the Dirichlet case.
6. (a) This variational problem may not have solutions. Note that if {uⱼ} is a minimizing sequence, their gradients are uniformly bounded in L², and there is a weak limit u. But this weak limit may not comply with the constraint (being an equality constraint). In dimension 1, a simple example may help us in realizing the difficulty. The function

.  u_h(x) = √h − x/√h for 0 ≤ x ≤ h,   u_h(x) = 0 for h ≤ x ≤ 1,

is such that

.  ∫₀¹ u_h'(x)² dx = 1,   ∫₀¹ u_h(x)² dx = h²/3.

A minimizer u would be such that

.  ∫₀¹ u'(x)² dx = 1,   ∫₀¹ u(x)² dx = 0,

which is, obviously, impossible.
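The two integrals for u_h can be confirmed symbolically (with the breakpoint at x = h, which is what makes the stated values come out):

```python
import sympy as sp

x, h = sp.symbols("x h", positive=True)
u = sp.sqrt(h) - x / sp.sqrt(h)       # u_h on [0, h]; zero afterwards

grad_int = sp.integrate(sp.diff(u, x) ** 2, (x, 0, h))
mass_int = sp.integrate(u ** 2, (x, 0, h))

assert sp.simplify(grad_int - 1) == 0          # gradient integral equals 1
assert sp.simplify(mass_int - h**2 / 3) == 0   # mass integral equals h**2/3
```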


(b) It is easy to realize that the proposed problem is equivalent to

   Minimize in u ∈ H¹(o):   −∫_o |u(x)|² dx

subject to

.  ∫_o |∇u(x)|² dx ≤ 1.

This time the direct method furnishes minimizers, because we have an integral constraint on the derivative in the form of an inequality.
7. According to Theorem 9.13, the Laplace operator A (under Dirichlet boundary conditions) has a sequence of positive eigenvalues λⱼ and eigenfunctions uⱼ, which make up an orthonormal basis of L²(o). For an L²(o)-function u, write the decomposition

.  u = Σⱼ <u, uⱼ> uⱼ

and define

.  f(A)u = Σⱼ <u, uⱼ> f(λⱼ) uⱼ.
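The same spectral recipe is easy to exercise for a symmetric matrix in finite dimensions; a sketch checking it against scipy's matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2                      # symmetric "operator"

lam, V = np.linalg.eigh(A)             # eigenvalues and orthonormal basis

def f_of_A(f):
    # f(A) u = sum_j <u, u_j> f(lam_j) u_j, i.e. V f(Lam) V^T.
    return V @ np.diag(f(lam)) @ V.T

assert np.allclose(f_of_A(lambda t: t), A)       # f = id recovers A
assert np.allclose(f_of_A(np.exp), expm(A))      # f = exp matches expm
```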

8. No comment. This is a practice exercise.


9. No comment. This is a practice exercise.
10. No comment. This is a practice exercise.
11. It is a matter of computing the derivative

.  (d/dε)|_{ε=0} ∫_o ½ |∇u(x + εθ(x)) (1 + ε∇θ(x))|² dx.

With a bit of care, one finds that

.  0 = ∫_o [ ∇u(x)∇²u(x)θ(x) + (∇u(x) · ∇θ(x)) ∇u(x) ] dx

for all such θ. After rewriting this integral and using an integration by parts, we have

.  0 = ∫_o [ ∇u(x)∇²u(x)θ(x) − div[∇u(x) ⊗ ∇u(x)]θ(x) ] dx.

The arbitrariness of θ leads to

.  ∇u∇²u − div(∇u ⊗ ∇u) = 0.

Of course, this system reduces, after some calculations, to Au = 0.


12. (a) Define the space Hdiv (o) as the subspace of fields F(x) : o → RN in
L2 (o; RN ) such that
.  ∫_o F(x) · ∇φ(x) dx = 0

for all smooth functions φ in o. Note how this definition forces both
conditions div F = 0 in o and F · n = 0 on ∂o. The norm in this space
is just the norm of the ambient space L2 (o; RN ). It is easy to check that
Hdiv (o) is a closed subspace of L2 (o; RN ), and consequently it is also
weakly closed. Recall Theorem 3.5.
(b) If the matrix field A(x) is symmetric and uniformly positive-definite in the
sense

.  Fᵀ A(x) F ≥ C|F|²,   C > 0,  F ∈ R^N,

and the vector field A ∈ L∞ (o; RN ), then the given functional in the
statement is coercive, and strictly convex, and the direct method yields a
unique minimizer F̂ ∈ Hdiv (o) of the problem.
(c) To derive the corresponding optimality conditions, we perform variations
of the form

F̂ + εF,
. F ∈ Hdiv (o).

The computation

.  (d/dε)|_{ε=0} E(F̂ + εF)

gives

.  ∫_o ( F̂ A F + A · F ) dx = 0

for all such F ∈ Hdiv (o). According to the definition of Hdiv (o), as the
orthogonal complement of gradients of functions in H 1 (o) in L2 (o; RN ),
we deduce that

.  AF̂ + A = ∇φ,   φ ∈ H¹(o).

This can be written, equivalently, in a differential form

. curl(AF̂ + A) = 0 in o.
Appendix B
So Much to Learn

There is so much material one can look at after having covered the material in this
course, depending on preferences and level of understanding, that we have tried
to organize it according to the four main areas involved in this textbook: the three
occurring in the title, and PDEs. Hoping that these comments will be helpful to students, we have tried to avoid dispersion, and so we highlight a few sources, at most, for each mentioned subject. In some of these, we also provide some clues at our discretion. There is no aim at completeness or relevance in the selection that follows. On the other hand, we have tried to include general references accessible
to students; this explains why some more specialized material is not included.

B.1 Variational Methods and Related Fields

This is the broader section in this Appendix, as the Calculus of Variations has been
our main interest. Readers will see that, even so, we have left out far too many
interesting topics.

B.1.1 Some Additional Sources for the Calculus of Variations

There are a number of fundamental textbooks dealing with various viewpoints on variational methods that can support parts of this text, or go well beyond. We cite the following, some of which are very classic sources, though the goals and methods are
not uniform in them [Bu89, Cl13, Da08, Da15, FoLe07, GiMoSo98, Gi03, JoLi98,
Mo08, Mor96, Mo03, Ri18].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024 373
P. Pedregal, Functional Analysis, Sobolev Spaces, and Calculus of Variations,
La Matematica per il 3+2 157, https://doi.org/10.1007/978-3-031-49246-4

B.1.2 Introductory Courses

One may feel some curiosity about how variational methods can be introduced even at a more elementary level, or about how the subject is presented to other scientists and/or engineers. We just mention some [BuGiHil98, Da15, Ki18, Ko14, Kot14, Tr96].

B.1.3 Indirect Methods

Though our treatment of variational problems is essentially based on the direct method, there is a lot of fundamental classic material that can be integrated under the umbrella of the indirect method. Inner-variations and corresponding optimality
conditions are also relevant in this regard. Some of those issues are
• Legendre-Hadamard condition;
• Weierstrass necessary condition;
• Du Bois-Reymond Lemma;
• Weierstrass-Erdmann conditions.
The encyclopedic work [GiHil96] is a main reference for this area.

B.1.4 Convex and Non-smooth Analysis

Chapter 3 is a mere, brief introduction to Convex Analysis, so that readers know a bit about what this important area covers; though, again, we have not explored basic material that needs to be studied at some point like, for instance, the theory of conjugate convex functions. In addition to the classical books [EkTe99] or [Ro97],
there are excellent additional references like [AtBuMi14, BoLe06, Bo14, Cl90,
HiLe01, Mo18, MoNa14, Ta09].
Convex analysis deals, in a general abstract way, with convex sets and convex
functions, duality in convex optimization, complementary conditions and problems.
Techniques of non-smooth analysis are intimately connected to convex analysis.
We mention some important sources for non-smooth analysis in addition to those already cited: [Cl90, Io17].

B.1.5 Lagrangian and Hamiltonian Formalism

Historically, a lot of effort has been devoted to topics within the indirect method.
On the other hand, there is a rich tradition on variational methods in Mechanics.
Our main reference for this field is [GiHil96], where most of the knowledge in this

area is gathered. This is also a good source for everything related to variations in
the independent variable, and Noether’s conservation law theorems, as has already
been indicated above.

B.1.6 Variational Inequalities

These deal with optimality conditions in optimization problems where the set of
competing objects is a convex set of functions or fields but they do not support an
underlying linear structure. Two basic references are: [KiSt00, Ul11].

B.1.7 Non-existence and Young Measures

Ever since the pioneering work of Young, Young measures have been a main tool to tame non-convexity and non-existence. Our basic references are [CRV04, FlGo12, Pe97, Yo69].

B.1.8 Optimal Control

Optimal control is a fundamental part of optimization, of paramount importance in Engineering. Traditionally it is studied together with, or after, the Calculus of Variations for one-dimensional problems. Some basic textbooks are: [Be17, Bu14, Le14, Li12, Me09, Tr96, Yo69].
Optimal control is, however, much more than this: the part of the area in which the state equations or systems are PDEs is considerably more complicated, and its scope goes well beyond.

B.1.9 Γ-Convergence

The area of Γ-convergence deals with sequences of functionals and how their limit behavior can be explored in an orderly manner. There are not yet many textbooks on the subject [AtBuMi14, Br02, DM93].

B.1.10 Other Areas

There are a number of important topics that are part of more general fields but have a character of their own. There are not yet textbooks on them, as they are being intensely explored these days. Some of those are:
• Existence without convexity
• Regularity in variational problems
• Second-order optimality conditions
• Non-local functionals
• Constrained variational problems and multipliers
• Stochastic Calculus of Variations
• Variational problems in L∞
The two references [Ul11, Is16] treat some of these.

B.2 Partial Differential Equations

The interconnection between the Calculus of Variations and PDEs is so deep that
it is impossible to tell them apart. Variational methods are utilized constantly in
problems in Analysis where the main interest is the underlying PDEs themselves;
and, vice versa, fundamental motivation and techniques in variational problems
are constantly borrowed from the field of PDEs. We refer here to additional
sources where variational methods are at the background of viewpoints on prob-
lems [ACM18, Br13, Br11, CaVi15, CDR18, Cr18, Ev10, GiTr01, Kr08, MaOc19,
SVZZ13, Sa16, SBH19, Ta15].

B.2.1 Non-linear PDEs

Non-linear PDEs are sometimes quite different from their linear counterparts, and
typically much more difficult. This is almost always true in every part of Analysis.
In particular, quite often non-linear PDEs pose quite challenging problems to
researchers. Most of the references in the previous item have chapters dealing one
way or another with non-linear problems. Some other interesting resources are
[AmAr11, Co07].

B.2.2 Regularity for PDEs: Regularity of o Is Necessary

The theory of regularity either for variational problems or PDEs is quite delicate
and technical, but it should be known to a certain extent by every Applied Analyst.
Our basic reference here is [Gr11].

B.2.3 Numerical Approximation

The numerical approximation of solutions to PDEs and variational problems is


another fundamental chapter of the theory with a major relevance in applications.
The fundamental terms here are finite element analysis. We just name a few books
as a guide [Da11, LaBe13, Ma86, Wh17, ZTZ13].

B.3 Sobolev Spaces

The theory of Sobolev spaces is fundamental for Applied Analysis. In this text we have covered only the most basic facts, but a much deeper knowledge is required for a finer treatment of variational problems and PDEs. Some additional references where readers can keep on learning in this area are [Ad03, Ag15, Le17, Ma11, Ta07, Zi89].

B.3.1 Spaces of Bounded Variation, and More General Spaces


of Derivatives

Spaces of bounded variation are becoming a cornerstone in many technological applications. They are a first step beyond Sobolev spaces in that derivatives need not be measurable functions but can be measures. The area is quite mature at this point. Some relevant sources are [AFP00, Li19, Zi89].
There are other important topics that are being intensively investigated these days, such as fractional Sobolev spaces [RuSi96] or the interplay between PDEs and harmonic analysis [Ab12].
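For orientation, fractional Sobolev spaces are most often introduced through the Gagliardo seminorm (the standard definition for 0 < s < 1; see [RuSi96] for the general scales of spaces):

```latex
% Gagliardo seminorm defining W^{s,p}(\Omega), 0 < s < 1, 1 <= p < \infty
[u]_{W^{s,p}(\Omega)}^p = \int_\Omega \int_\Omega
\frac{|u(x) - u(y)|^p}{|x - y|^{N + sp}} \, dx \, dy,
\qquad
W^{s,p}(\Omega) = \left\{ u \in L^p(\Omega) : [u]_{W^{s,p}(\Omega)} < \infty \right\}.
```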

B.4 Functional Analysis

As we have tried to stress in the initial chapters of this text, the role played in Functional Analysis by problems in the Calculus of Variations has been, historically, crucial. Nowadays there is also a clear and important interaction in the other direction, to the extent that one cannot dispense with a solid foundation in Functional Analysis to understand the modern Calculus of Variations or the modern theory of PDEs. We name a few additional references for Functional Analysis, in addition to some other more specialized texts in the subsequent subsections: [Bo14, Ce10, Ci13, Fa16, LeCl03, Li16, Ov18, Sa17, Si18].
It is again important to stress that Functional Analysis is a very large and
fundamental area of Mathematical Analysis that we cannot cover in a few lines,
or with a few references.
Nevertheless, we include a few more important subareas of Functional Analysis
with some other resources.
• Distributions. The theory of distributions was a fundamental success for Applied Analysis, whose importance can hardly be overestimated [HaTr08, Mi18].
• Unbounded operators and Quantum Mechanics [Sch12].
• Topological vector spaces. Locally convex topological vector spaces [Vo20].
• Orlicz spaces. These spaces are a generalization of Lebesgue spaces, built by retaining the fundamental properties of the pth-power function that make Lebesgue spaces Banach spaces [HaHa19].
• Non-linear Analysis [AuEk84, Pa18, Ta09].
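The construction behind the Orlicz spaces mentioned above can be sketched as follows: the power t^p is replaced by a Young function Φ (convex, vanishing at 0, superlinear at infinity), and the Luxemburg norm makes the resulting space a Banach space (standard definitions, recalled here only as a pointer):

```latex
% Orlicz space L^\Phi via a Young function \Phi, with the Luxemburg norm;
% the choice \Phi(t) = t^p recovers the Lebesgue space L^p
L^{\Phi}(\Omega) = \left\{ u \text{ measurable} :
\int_\Omega \Phi\!\left( \frac{|u(x)|}{\lambda} \right) dx < \infty
\text{ for some } \lambda > 0 \right\},
\qquad
\| u \|_{L^{\Phi}(\Omega)} = \inf \left\{ \lambda > 0 :
\int_\Omega \Phi\!\left( \frac{|u|}{\lambda} \right) dx \le 1 \right\}.
```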
References

[Ab12] Abels, H.: Pseudodifferential and Singular Integral Operators. An Introduction
with Applications. De Gruyter Graduate Lectures. De Gruyter, Berlin (2012)
[Ad03] Adams, R.A., Fournier, J.J.F.: Sobolev Spaces. Pure and Applied Mathematics
(Amsterdam), vol. 140, 2nd edn. Elsevier, Amsterdam (2003)
[Ag15] Agranovich, M.S.: Sobolev Spaces, Their Generalizations and Elliptic Problems in
Smooth and Lipschitz Domains. Revised translation of the 2013 Russian original.
Springer Monographs in Mathematics. Springer, Cham (2015)
[AmAr11] Ambrosetti, A., Arcoya, D.: An introduction to nonlinear functional analysis and
elliptic problems. In: Progress in Nonlinear Differential Equations and Their Appli-
cations, vol. 82. Birkhäuser, Boston (2011)
[AFP00] Ambrosio, L., Fusco, N., Pallara, D.: Functions of Bounded Variation and Free
Discontinuity Problems. Oxford Mathematical Monographs. The Clarendon Press,
New York (2000)
[ACM18] Ambrosio, L., Carlotto, A., Massaccesi, A.: Lectures on Elliptic Partial Differential
Equations. Appunti. Scuola Normale Superiore di Pisa (Nuova Serie) [Lecture
Notes. Scuola Normale Superiore di Pisa (New Series)], vol. 18. Edizioni della
Normale, Pisa (2018)
[AtBuMi14] Attouch, H., Buttazzo, G., Michaille, G.: Variational Analysis in Sobolev and BV
Spaces. Applications to PDEs and Optimization. MOS-SIAM Series on Optimiza-
tion, vol. 17, 2nd edn. Society for Industrial and Applied Mathematics (SIAM),
Philadelphia (2014)
[AuEk84] Aubin, J.-P., Ekeland, I.: Applied Nonlinear Analysis. In: Pure and Applied Mathe-
matics (New York). Wiley, New York (1984)
[Be17] Bertsekas, D.P.: Dynamic Programming and Optimal Control, vol. I. 4th edn. Athena
Scientific, Belmont (2017)
[BiKr84] Birkhoff, G., Kreyszig, E.: The establishment of functional analysis. Historia Math.
11, 258–321 (1984)
[BoLe06] Borwein, J.M., Lewis, A.S.: Convex Analysis and Nonlinear Optimization. Theory
and Examples. CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC,
vol. 3, 2nd edn. Springer, New York (2006)
[Bo14] Botelho, F.: Functional Analysis and Applied Optimization in Banach Spaces.
Applications to Non-convex Variational Models. With contributions by Anderson
Ferreira and Alexandre Molter. Springer, Cham (2014)
[Br02] Braides, A.: Γ-Convergence for Beginners. Oxford Lecture Series in Mathematics
and its Applications, vol. 22. Oxford University Press, Oxford (2002)

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
P. Pedregal, Functional Analysis, Sobolev Spaces, and Calculus of Variations,
La Matematica per il 3+2 157, https://doi.org/10.1007/978-3-031-49246-4
[Br13] Bressan, A.: Lecture Notes on Functional Analysis. With Applications to Linear
Partial Differential Equations. Graduate Studies in Mathematics, vol. 143. American
Mathematical Society, Providence (2013)
[Br11] Brezis, H.: Functional Analysis, Sobolev Spaces and Partial Differential Equations.
Universitext. Springer, New York (2011)
[Bu14] Burns, J.A.: Introduction to the Calculus of Variations and Control—with Modern
Applications. Chapman & Hall/CRC Applied Mathematics and Nonlinear Science
Series. CRC Press, Boca Raton (2014)
[Bu89] Buttazzo, G.: Semicontinuity, Relaxation and Integral Representation in the Calculus
of Variations. Pitman Research Notes in Mathematics Series, vol. 207. Longman
Scientific & Technical, Harlow (1989); copublished in the United States with John
Wiley, New York
[BuGiHil98] Buttazzo, G., Giaquinta, M., Hildebrandt, S.: One-Dimensional Variational Prob-
lems. An Introduction. Oxford Lecture Series in Mathematics and its Applications,
vol. 15. The Clarendon Press/Oxford University Press, New York (1998)
[CaVi15] Cañada, A., Villegas, S.: A Variational Approach to Lyapunov Type Inequalities.
From ODEs to PDEs. With a foreword by Jean Mawhin. SpringerBriefs in Mathe-
matics. Springer, Cham (2015)
[CRV04] Castaing, C., Raynaud de Fitte, P., Valadier, M.: Young Measures on Topological
Spaces. With Applications in Control Theory and Probability Theory. Mathematics
and its Applications, vol. 571. Kluwer Academic, Dordrecht (2004)
[Ce10] Cerdà, J.: Linear Functional Analysis. Graduate Studies in Mathematics, vol.
116. American Mathematical Society, Providence, RI; Real Sociedad Matemática
Española, Madrid (2010)
[Ci13] Ciarlet, P.G.: Linear and Nonlinear Functional Analysis with Applications. Society
for Industrial and Applied Mathematics, Philadelphia (2013)
[CDR18] Cioranescu, D., Donato, P., Roque, M.P.: An Introduction to Second Order Partial
Differential Equations. Classical and Variational Solutions. World Scientific, Hack-
ensack (2018)
[Cl90] Clarke, F.H.: Optimization and Nonsmooth Analysis. Classics in Applied Mathe-
matics, vol. 5, 2nd edn. Society for Industrial and Applied Mathematics (SIAM),
Philadelphia (1990)
[Cl13] Clarke, F.: Functional Analysis, Calculus of Variations and Optimal Control. Gradu-
ate Texts in Mathematics, vol. 264. Springer, London (2013)
[Co07] Costa, D.G.: An Invitation to Variational Methods in Differential Equations.
Birkhäuser, Boston (2007)
[Cr18] Craig, W.: A Course on Partial Differential Equations. Graduate Studies in Mathe-
matics, vol. 197. American Mathematical Society, Providence (2018)
[Da08] Dacorogna, B.: Direct Methods in the Calculus of Variations. Applied Mathematical
Sciences, vol. 78, 2nd edn. Springer, New York (2008)
[Da15] Dacorogna, B.: Introduction to the Calculus of Variations, 3rd edn. Imperial College
Press, London (2015)
[DM93] Dal Maso, G.: An Introduction to Γ-Convergence. Progress in Nonlinear Differential
Equations and Their Applications, vol. 8. Birkhäuser, Boston (1993)
[Da11] Davies, A.J.: The Finite Element Method. An Introduction with Partial Differential
Equations, 2nd edn. Oxford University Press, Oxford (2011)
[EkTe99] Ekeland, I., Témam, R.: Convex Analysis and Variational Problems. Translated from
the French. Corrected Reprint of the 1976 English Edition. Classics in Applied
Mathematics, vol. 28. Society for Industrial and Applied Mathematics (SIAM),
Philadelphia (1999)
[Ev10] Evans, L.C.: Partial Differential Equations. Graduate Studies in Mathematics, vol.
19, 2nd edn. American Mathematical Society, Providence (2010)
[Fa16] Farenick, D.: Fundamentals of Functional Analysis. Universitext. Springer, Cham
(2016)
[FlGo12] Florescu, L.C., Godet-Thobie, C.: Young Measures and Compactness in Measure
Spaces. De Gruyter, Berlin (2012)
[FoLe07] Fonseca, I., Leoni, G.: Modern Methods in the Calculus of Variations: Lp Spaces.
Springer Monographs in Mathematics. Springer, New York (2007)
[FrGi16] Freguglia, P., Giaquinta, M.: The Early Period of the Calculus of Variations.
Birkhäuser/Springer, Cham (2016)
[GiHil96] Giaquinta, M., Hildebrandt, S.: Calculus of Variations. I. The Lagrangian Formalism.
II. The Hamiltonian Formalism. Grundlehren der Mathematischen Wissenschaften
[Fundamental Principles of Mathematical Sciences], vols. 310–311. Springer, Berlin
(1996)
[GiMoSo98] Giaquinta, M., Modica, G., Souček, J.: Cartesian Currents in the Calculus of
Variations. I. Cartesian Currents II. Variational Integrals. Ergebnisse der Mathematik
und ihrer Grenzgebiete. 3. Folge. A Series of Modern Surveys in Mathematics
[Results in Mathematics and Related Areas. 3rd Series. A Series of Modern Surveys
in Mathematics], vols. 37–38. Springer, Berlin (1998)
[GiTr01] Gilbarg, D., Trudinger, N.S.: Elliptic Partial Differential Equations of Second Order.
Reprint of the 1998 edition. Classics in Mathematics. Springer, Berlin (2001)
[Gi03] Giusti, E.: Direct Methods in the Calculus of Variations. World Scientific, River Edge
(2003)
[Go80] Goldstine, H.H.: A History of the Calculus of Variations from the 17th Through the
19th Century. Studies in the History of Mathematics and Physical Sciences, vol. 5.
Springer, New York (1980)
[Gr11] Grisvard, P.: Elliptic Problems in Nonsmooth Domains. Reprint of the 1985 original
[MR0775683]. With a foreword by Susanne C. Brenner. Classics in Applied
Mathematics, vol. 69. Society for Industrial and Applied Mathematics, Philadelphia
(2011)
[HaHa19] Harjulehto, P., Hästö, P.: Orlicz Spaces and Generalized Orlicz Spaces. Lecture
Notes in Mathematics, vol. 2236. Springer, Cham (2019)
[HaTr08] Haroske, D.D., Triebel, H.: Distributions, Sobolev Spaces, Elliptic Equations. EMS
Textbooks in Mathematics. European Mathematical Society (EMS), Zürich (2008)
[HiLe01] Hiriart-Urruty, J.B., Lemaréchal, C.: Fundamentals of Convex Analysis. Abridged
Version of Convex Analysis and Minimization Algorithms. I [Springer, Berlin, 1993;
MR1261420] and II [ibid.; MR1295240]. Grundlehren Text Editions. Springer,
Berlin (2001)
[Io17] Ioffe, A.D.: Variational Analysis of Regular Mappings. Theory and Applications.
Springer Monographs in Mathematics. Springer, Cham (2017)
[Is16] Ishikawa, Y.: Stochastic Calculus of Variations. For Jump Processes. De Gruyter
Studies in Mathematics, vol. 54, 2nd edn. De Gruyter, Berlin (2016)
[JoLi98] Jost, J., Li-Jost, X.: Calculus of Variations. Cambridge Studies in Advanced Mathe-
matics, vol. 64. Cambridge University Press, Cambridge (1998)
[Ki18] Kielhöfer, H.: Calculus of Variations. An Introduction to the One-Dimensional
Theory with Examples and Exercises. Translated from the 2010 German original.
Texts in Applied Mathematics, vol. 67. Springer, Cham (2018)
[KiSt00] Kinderlehrer, D., Stampacchia, G.: An Introduction to Variational Inequalities and
Their Applications. Reprint of the 1980 Original. Classics in Applied Mathematics,
vol. 31. Society for Industrial and Applied Mathematics (SIAM), Philadelphia
(2000)
[Ko14] Komzsik, L.: Applied Calculus of Variations for Engineers, 2nd edn. CRC Press,
Boca Raton (2014)
[Kot14] Kot, M.: A First Course in the Calculus of Variations. Student Mathematical Library,
vol. 72. American Mathematical Society, Providence (2014)
[Kr94I] Kreyszig, E.: On the calculus of variations and its major influences on the mathe-
matics of the first half of our century. I. Am. Math. Month. 101(7), 674–678 (1994)
[Kr94II] Kreyszig, E.: On the calculus of variations and its major influences on the mathemat-
ics of the first half of our century. II. Amer. Math. Month. 101(9), 902–908 (1994)
[Kr08] Krylov, N.V.: Lectures on Elliptic and Parabolic Equations in Sobolev Spaces. Grad-
uate Studies in Mathematics, vol. 96. American Mathematical Society, Providence
(2008)
[LaBe13] Larson, M.G., Bengzon, F.: The Finite Element Method: Theory, Implementa-
tion, and Applications. Texts in Computational Science and Engineering, vol. 10.
Springer, Heidelberg (2013)
[LeCl03] Lebedev, L.P., Cloud, M.J.: The Calculus of Variations and Functional Analysis.
With Optimal Control and Applications in Mechanics. Series on Stability, Vibration
and Control of Systems. Series A: Textbooks, Monographs and Treatises, vol. 12.
World Scientific, River Edge (2003)
[Le17] Leoni, G.: A First Course in Sobolev Spaces. Graduate Studies in Mathematics, vol.
181, 2nd edn. American Mathematical Society, Providence (2017)
[Le14] Levi, M.: Classical Mechanics with Calculus of Variations and Optimal Control. An
Intuitive Introduction. Student Mathematical Library, vol. 69. American Mathemat-
ical Society, Providence; Mathematics Advanced Study Semesters, University Park
(2014)
[Li12] Liberzon, D.: Calculus of Variations and Optimal Control Theory. A Concise
Introduction. Princeton University Press, Princeton (2012)
[Li16] Limaye, B.V.: Linear Functional Analysis for Scientists and Engineers. Springer,
Singapore (2016)
[Li19] Liflyand, E.: Functions of Bounded Variation and Their Fourier Transforms. Applied
and Numerical Harmonic Analysis. Birkhäuser/Springer, Cham (2019)
[MaOc19] Marin, M., Öchsner, A.: Essentials of Partial Differential Equations. With Applica-
tions. Springer, Cham (2019)
[Ma86] Marti, J.T.: Introduction to Sobolev Spaces and Finite Element Solution of Elliptic
Boundary Value Problems. Computational Mathematics and Applications. Aca-
demic Press [Harcourt Brace Jovanovich, Publishers], London (1986)
[Ma11] Maz’ya, V.: Sobolev Spaces with Applications to Elliptic Partial Differential
Equations. Second, revised and augmented edition. Grundlehren der Mathematis-
chen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 342.
Springer, Heidelberg (2011)
[Me09] Mesterton-Gibbons, M.: A Primer on the Calculus of Variations and Optimal Control
Theory. Student Mathematical Library, vol. 50. American Mathematical Society,
Providence (2009)
[Mi18] Mitrea, D.: Distributions, Partial Differential Equations, and Harmonic Analysis.
Second edition [MR3114783]. Universitext. Springer, Cham (2018)
[Mo18] Mordukhovich, B.S.: Variational Analysis and Applications. Springer Monographs
in Mathematics. Springer, Cham (2018)
[MoNa14] Mordukhovich, B.S., Nam, N.M.: An Easy Path to Convex Analysis and Appli-
cations. Synthesis Lectures on Mathematics and Statistics, vol. 14. Morgan &
Claypool, Williston (2014)
[Mo08] Morrey, C.B., Jr.: Multiple Integrals in the Calculus of Variations. Reprint of the
1966 edition [MR0202511]. Classics in Mathematics. Springer, Berlin (2008)
[Mor96] Morse, M.: The calculus of Variations in the Large. Reprint of the 1932 original.
American Mathematical Society Colloquium Publications, vol. 18. American Math-
ematical Society, Providence (1996)
[Mo03] Moser, J.: Selected Chapters in the Calculus of Variations. Lecture Notes by Oliver
Knill. Lectures in Mathematics ETH Zürich. Birkhäuser, Basel (2003)
[Ov18] Ovchinnikov, S.: Functional Analysis. An Introductory Course. Universitext.
Springer, Cham (2018)
[Pa18] Papageorgiou, N.S., Winkert, P.: Applied Nonlinear Functional Analysis. An Intro-
duction. De Gruyter Graduate. De Gruyter, Berlin (2018)
[Pe97] Pedregal, P.: Parametrized Measures and Variational Principles. Progress in Nonlin-
ear Differential Equations and Their Applications, vol. 30. Birkhäuser, Basel (1997)
[Ri18] Rindler, F.: Calculus of Variations. Universitext. Springer, Cham (2018)
[Ro97] Rockafellar, R.T.: Convex Analysis. Reprint of the 1970 Original. Princeton Land-
marks in Mathematics. Princeton Paperbacks. Princeton University Press, Princeton
(1997)
[RuSi96] Runst, T., Sickel, W.: Sobolev Spaces of Fractional Order, Nemytskij Operators, and
Nonlinear Partial Differential Equations. De Gruyter Series in Nonlinear Analysis
and Applications, vol. 3. Walter de Gruyter, Berlin (1996)
[Sa16] Salsa, S.: Partial Differential Equations in Action. From Modelling to Theory.
Unitext, vol. 99, 3rd edn. La Matematica per il 3+2. Springer, Cham (2016)
[SVZZ13] Salsa, S., Vegni, F.M.G., Zaretti, A., Zunino, P.: A Primer on PDEs. Models,
Methods, Simulations. Translated and Extended from the 2009 Italian edition.
Unitext, vol. 65. La Matematica per il 3+2. Springer, Milan (2013)
[Sa17] Sasane, A.: A Friendly Approach to Functional Analysis. Essential Textbooks in
Mathematics. World Scientific, Hackensack (2017)
[SBH19] Sayas, F.J., Brown, T.S., Hassell, M.E.: Variational Techniques for Elliptic Partial
Differential Equations. Theoretical Tools and Advanced Applications. CRC Press,
Boca Raton (2019)
[Sch12] Schmüdgen, K.: Unbounded Self-Adjoint Operators on Hilbert Space. Graduate
Texts in Mathematics, vol. 265. Springer, Dordrecht (2012)
[Si18] Siddiqi, A.H.: Functional Analysis and Applications. Industrial and Applied Mathe-
matics. Springer, Singapore (2018)
[Ta15] Taheri, A.: Function Spaces and Partial Differential Equations. Vol. 1. Classical
Analysis. Vol. 2. Contemporary Analysis. Oxford Lecture Series in Mathematics
and its Applications, vol. 40–41. Oxford University Press, Oxford (2015)
[Ta09] Takahashi, W.: Introduction to Nonlinear and Convex Analysis. Yokohama Publish-
ers, Yokohama (2009)
[Ta07] Tartar, L.: An Introduction to Sobolev Spaces and Interpolation Spaces. Lecture
Notes of the Unione Matematica Italiana, vol. 3. Springer, Berlin; UMI, Bologna
(2007)
[Tr96] Troutman, J.L.: Variational Calculus and Optimal Control. With the Assistance of
William Hrusa. Optimization with Elementary Convexity. Undergraduate Texts in
Mathematics, 2nd edn. Springer, New York (1996)
[Ul11] Ulbrich, M.: Semismooth Newton Methods for Variational Inequalities and Con-
strained Optimization Problems in Function Spaces. MOS-SIAM Series on Opti-
mization, vol. 11. Society for Industrial and Applied Mathematics (SIAM), Philadel-
phia; Mathematical Optimization Society, Philadelphia (2011)
[Vo20] Voigt, J.: A Course on Topological Vector Spaces. Compact Textbooks in Mathemat-
ics. Birkhäuser, Cham (2020)
[Wh17] Whiteley, J.: Finite Element Methods. A Practical Guide. Mathematical Engineering.
Springer, Cham (2017)
[Yo69] Young, L.C.: Lectures on the Calculus of Variations and Optimal Control Theory.
Foreword by Wendell H. Fleming. W. B. Saunders, Philadelphia (1969)
[Zi89] Ziemer, W.P.: Weakly Differentiable Functions. Sobolev Spaces and Functions of
Bounded Variation. Graduate Texts in Mathematics, vol. 120. Springer, New York
(1989)
[ZTZ13] Zienkiewicz, O.C., Taylor, R.L., Zhu, J.Z.: The Finite Element Method: Its Basis and
Fundamentals, 7th edn. Elsevier/Butterworth Heinemann, Amsterdam (2013)
Index

Symbols
ℓp-spaces, 38
C1-functionals, 82
Γ-convergence, 276
p-Laplacian, 261
Concentration, 266
Concentration phenomenon, 55, 139
Conjugate exponents, 57
Conjugate function, 13
Constraints, 141, 152
Continuity of partial derivatives, 81
Contraction-mapping principle, 24
A Convexity, 108, 148
Area, 3 Convolution, 64
Area of a graph, 4 Convolution kernel, 206
Area of revolution, 4 Convolution operator, 195
Arzelà-Ascoli theorem, 52, 298 Critical point for a functional, 80
Critical points, 17
B Critical value, 303
Baire category, 170 Curvature, 312, 369
Banach-Alaoglu-Bourbaki theorem, 61
Barycenter, 268
Basic density result for functions, 64
D
Bi-harmonic operator, 265
Dido’s problem, 11
Bounded variation, 85
Differentiability of a functional, 80
Brachistochrone, 6, 27, 139, 149
Direct method, 93
Bulk load, 138
Dirichlet boundary condition, 249
Dirichlet’s principle, 9, 27, 246
C Distance function, 36
Cancellation phenomenon, 60 Divergence theorem, 216
Carathéodory theorem, 270 Double adjoint, 177
Catenary, 153 Duality pair, 57
Cauchy-Schwarz inequality, 70 Dual of a Hilbert space, 79
Cavalieri’s principle, 231
Christoffel symbols, 164
Co-area formula, 231 E
Coercivity, 17, 19, 82, 133 Eigenfunction, 287
Coincidence set, 272, 368 Elastic materials, 138
Compactness, 18, 20, 25 Elastic membrane, 9
Complete metric space, 170 Epigraph, 123


Euler-Lagrange equation, 114 J


Euler-Lagrange system, 146 Jensen’s inequality, 119, 130, 223
Jordan-Von Neumann theorem, 92

F
Failure of convexity, 127 K
Fine oscillations, 155 Kalman’s rank condition, 347
Finite-rank operator, 194
First moment, 268
Flow of a dynamical system, 299 L
Flux, 4 Lack of coercivity, 156
Fréchet differentiability, 81 Lack of convexity, 155
Fréchet-Riesz representation theorem, 79 Laplace operator on a hyper-surface, 314
Free boundary, 272 Laplace’s equation, 246
Fubini’s theorem, 137, 223 Lavrentiev’s phenomenon, 22
Function of a compact operator, 206 Lebesgue spaces, 42
Fundamental Lemma of the Calculus of Lipschitz continuity, 52
Variations, 227 Local lipschitzianity, 82
Fundamental Theorem of Calculus, 51 Logarithm of an operator, 207
Luzin’s theorem, 218

G
M
Gateaux differentiability, 80
Mathematical Programming, 1, 3
Generalized cylinder, 227
Mechanics, 137
Geodesics, 8, 30, 164
Method of variations, 27
Gram-Schmidt process, 74
Mild derivative, 244
Graph length, 4
Minimal surfaces, 10, 27, 30, 261
Graph of an operator, 175
Minkowski functional, 102
Min-max principle, 303
Mixed-boundary condition, 278
H Mollifier, 64
Haar system, 90 Moment of inertia, 4
Hamiltonian mechanics, 13 Morse theory, 27
Hanging cable, 12, 142, 152 Mountain-pass lemma, 308
Harmonic function, 247 Multipliers, 148, 151, 286
Hessian, 257
Hilbert-Schmidt operator, 195, 206, 222
Hölder continuity, 52 N
Hölder’s inequality, 2, 43 Natural boundary condition, 153, 160, 248
Neumann boundary condition, 160, 248, 272
Newtonian potential, 223
I Non-conservative field, 156
Ill-posed problem, 277 Norm, 36
Indirect method, 374 Null-lagrangian, 140, 362
Inner product, 69
Inner-variations, 163, 374
Integral equation, 26, 27, 209 O
Integral functional, 121 Obstacle problem, 92, 142, 272, 286
Integral operator, 196 One-dimensional fibers, 227
Integration-by-parts formula, 46, 67 Operations Research, 3
Internal energy, 134 Optimal control problems, 16
Isomorphism, 170 Order of a variational problem, 16
Isoperimetric problem, 10 Orthogonal projection, 71
Orthonormal basis, 74 Subnorm, 99


Oscillating test functions, 276 Surface tension, 10
Oscillation, 266

T
P Tautochrone, 151
Palais-Smale condition, 304 Topological vector space, 85
Partitions of unity, 283 Total variation, 59
Periodic boundary conditions, 159 Transit problems, 6
Persistent cancellation, 54 Transmission condition, 275
Plancherel’s identity, 184, 185 Trigonometric basis, 77, 204
Plateau’s problem, 27 Tychonoff’s theorem, 61
Projection onto a convex set, 71, 73
Propagation of light, 29
U
Uniqueness, 133
R Utility function, 2
Radon-Nikodym theorem, 58
Reflexive spaces, 57
Relaxation theorem, 269 V
Riemann-Lebesgue lemma, 183 Variations, 144
Riemann-Stieltjes integral, 85 Vector, one-dimensional Sobolev spaces, 63
Riesz representation theorem, 57 Vector problems, 15
Riesz’s lemma, 200 Volterra operator, 191
Robin boundary condition, 278 Volume of revolution, 4

S W
Scalar problems, 15 Wavelet, 91
Second-order problem, 159 Weak convergence, 52
Semicontinuity, 27 Weak divergence, 244
Seminorm, 98 Weak lower semicontinuity, 110
Separability, 74 Weight function, 84
Separation of sets, 104 Well-posed problem, 277
Sequence of cut-off functions, 229 Wirtinger’s inequality, 322
Signed distance function, 230 Work, 4
Singularly-perturbed problem, 343
Size of a function, 20
Sobolev spaces with weights, 359 Y
Solid of minimal resistance, 30 Young’s inequality, 43
Square root of an operator, 207
Strict convexity, 112
Strong solution, 316 Z
Strong-weak lower semicontinuity, 250 Zorn’s lemma, 98
Sturm-Liouville problem, 98, 146, 336
