Nonlinear Functional Analysis

and its Applications

IV: Applications to Mathematical Physics

Springer Science+Business Media, LLC


Carl Friedrich Gauss (1777-1855)
Eberhard Zeidler

Nonlinear
Functional Analysis
and its Applications
IV: Applications to Mathematical Physics

Translated by Juergen Quandt

With 201 Illustrations

Springer
Eberhard Zeidler
Max-Planck-Institut für Mathematik
in den Naturwissenschaften
Inselstraße 22-26
D-04103 Leipzig
Germany

Juergen Quandt (Translator)
Department of Mathematics
University of Central Florida
Orlando, FL 32816
U.S.A.

Mathematics Subject Classification (1980): 46-XX

Library of Congress Cataloging-in-Publication Data


(Revised for vol. 4)
Zeidler, Eberhard.
Nonlinear functional analysis and its applications.
Vol. 3: Translated by Leo L. Boron; vol. 4:
Translated by Juergen Quandt.
Includes bibliographies and indexes.
Contents: v. 1. Fixed point theorems-
v. 3. Variational methods and optimization- v. 4.
Applications to mathematical physics.
1. Nonlinear functional analysis. I. Title.
QA321.5.Z4513 1985 515.7 83-20455
ISBN 978-1-4612-8926-5 ISBN 978-1-4612-4566-7 (eBook)
DOI 10.1007/978-1-4612-4566-7

Previous edition, Vorlesungen über nichtlineare Funktionalanalysis, Vols. I-III, published by BSB
B.G. Teubner Verlagsgesellschaft, 7010 Leipzig, Sternwartenstrasse 8, Deutsche Demokratische
Republik.

© 1988 by Springer Science+Business Media New York


Originally published by Springer-Verlag New York Inc. in 1988
Softcover reprint of the hardcover 1st edition 1988
All rights reserved. This work may not be translated or copied in whole or in part without the
written permission of the publisher (Springer Science+Business Media, LLC.),
except for brief excerpts in connection with reviews or scholarly analysis. Use in
connection with any form of information storage and retrieval, electronic adaptation, computer
software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
Typeset by Asco Trade Typesetting Ltd., Hong Kong.
Printed and bound by R. R. Donnelley & Sons, Harrisonburg, Virginia.

9 8 7 6 5 4 3 2 (Corrected second printing, 1997)


Dedicated in love to my sister's family

Edith, Günter, Jürgen, and Ulrich

So teach us to number our days, that we may apply our hearts unto wisdom.
Psalm 90, 12
And though I have the gift of prophecy, and understand all mysteries, and all
knowledge; and though I have all faith, so that I could remove mountains, and
have not charity, I am nothing.
I. Corinthians 13, 2
Preface

The main concern in all scientific work must be the human being himself. This,
one should never forget among all those diagrams and equations.
Albert Einstein

This volume is part of a comprehensive presentation of nonlinear functional
analysis, the basic content of which has been outlined in the Preface of Part
I. A Table of Contents for all five volumes may also be found in Part I. The
present Part IV and the following Part V contain applications to mathematical
physics. Our goals are the following:
(i) A detailed motivation of the basic equations in important disciplines of
theoretical physics.
(ii) A discussion of particular problems which have played a significant role
in the development of physics, and through which important mathe-
matical and physical insight may be gained.
(iii) A combination of classical and modern ideas.
(iv) An attempt to build a bridge between the language and thoughts of
physicists and mathematicians.
We shall always try to advance as soon as possible to the heart of the problem
under consideration and to concentrate on the basic ideas.
The treatment of mathematical physics in the mathematical literature is
often as follows. A mathematical problem is discussed and then followed only
by a short remark that physical interpretations exist. But for mathematicians
it is also very important to have a profound knowledge of the physical
background, because it is possible that a mathematical result lies in a parame-
ter space which is physically meaningless. Furthermore, it might happen that
mathematicians will, with a great deal of effort, determine solutions which are
actually unstable, i.e., which are never realized in nature. Finally, we should
mention that every mathematical model of processes in nature is based on
simplifications and approximations. Thereby it is important to learn the
particular model assumptions. For example, the addition of a small viscosity
term may greatly simplify the mathematical study, and this viscosity term
might actually yield a much better model of physical reality than the previous
idealization without viscosity.
It is therefore important that mathematicians are concerned about the
physical motivation of the problems they study. Our presentation might help
in this direction. In order to avoid confusion we clearly distinguish between
the mathematical formulation of the basic equations, their physical motiva-
tion, and the proof of purely mathematical results. The word "proof" is always
understood in the sense of a rigorous mathematical proof.
Although we emphasize a detailed discussion of the basic equations, this is
never a goal in itself. We therefore do not use an oversophisticated axiomatic
approach. Our main concern is the study of typical problems. We have tried
to select interesting and important problems in order to convey the fascination
which results from the interrelation between mathematics and physics.
At the same time, we have tried to cover a broad spectrum which might lead
to a diversified and colorful picture. We think that a broad scope in the
education of students is very important, so that later on, the researcher will
be able to apply different ways of thinking to the solutions of his very special
problems. Our main goal is also to emphasize the relation between classical
theories and modern developments in physics, including presently exciting
open problems. For a student, it is important to know such relations; otherwise,
modern theories may appear to him as unrelated. In the previous
volumes we have emphasized the unity of pure and applied mathematics. Here
we want to complement this picture by trying to show the unity of mathematics
and physics.
The recognition process in theoretical physics can very roughly be described
by the following scheme:

physical reality → mathematical model
                          ↓
physical interpretation ← mathematical result.

The treatment of mathematical problems may therefore be very difficult. As
a mathematician, however, one should not forget that the important intellectual
contribution in the finding of new insights into nature's secrets lies in
the process of passing from the physical reality to the mathematical model.
This, for example, can be seen from Einstein's struggle with the general theory
of relativity. He began with physical observations, and for their realization he
then tried to find a suitable mathematical setting.
A mathematician who wants to occupy himself with physics is confronted
with the following difficulty.
(a) In the physical literature the mathematical apparatus is not always correctly
handled. Even the best textbooks in theoretical physics contain
mathematical statements which are wrong or proofs which are not correct.
This has the consequence that many mathematicians restrict themselves
to purely mathematical considerations, or if they do treat physical subjects,
then physicists have difficulty in recognizing the familiar physical material.
(b) On the other hand, creative physicists have a number of objections to
mathematics. "The rigor of mathematics is a luxury. We need concrete
answers to our problems. The general results of mathematicians are often
useless for our purposes, or the formulation is so abstract that we cannot
recognize whether a mathematical result is useful to us or not. Thus we
have to develop our own methods and cannot wait until they are mathe-
matically justified. By experience, this takes much too long."
For a student who wants to learn both mathematics and physics, a schizophrenic
situation may occur, and he might be caught on the horns of a
dilemma.
Actually, both the above views are justified: the critique of the mathematician
and the critique of the physicist. Until today, for example, it has not been
possible to develop a mathematically rigorous quantum field theory. For
about 40 years, however, physicists have worked with dubious mathematical
methods and have obtained results which are in fantastic coincidence with
experiment. On the other hand, in cases where rigorous and transparent
mathematical theories exist, they are not always used by physicists. This
applies, for example, to the calculus of variations. The difficulties which
occurred during the creation of differential and integral calculus and variational
calculus in the eighteenth century, with concepts like "infinitesimally
small," "variation," and "virtual displacement," are still preserved in many
present-day physics textbooks and cause a great deal of trouble for the student.
Here, the words of the Austrian musician Gustav Mahler (1860-1911) are
appropriate: tradition = slovenliness. Some particular cases will be critically
discussed in this volume. This is not meant as a cheap polemic but rather we
hope to help the student. That many such difficulties exist is evident to the
author in his daily contact with mathematics and physics students.
By choosing particular examples, we also try to explain that specific forms
of mathematical and physical thinking exist, and that both their differences
and synthesis are important for effective research. Many important physical
theories were obtained, not by way of mathematical deduction but instead,
by ingenious physicists, in close contact with observed physical phenomena,
who developed their specific physical ideas and, from a mathematical point
of view, employed some very questionable mathematical tools. At the same
time, however, they developed a number of very fruitful mathematical ideas.
The different forms of mathematical and physical thinking make it very
difficult to find a common language for both disciplines. A student, for
example, might become desperate in trying to translate the standard ten-volume
work in theoretical physics of Landau and Lifšic (1962) into a precise
mathematical language. Experience shows that it is more useful for a student
to take a pragmatic point of view and get acquainted with both languages, i.e.,
the language of mathematics and the language of physics, and to look for as
many relations between them as possible.
In this volume we hope that the mathematician feels comfortable, because
his precise mathematical language is used whenever a purely mathematical
subject is treated, but also that, at the same time, he learns something about
the physical interpretation. On the other hand, we hope that the physicist
recognizes his familiar physical ideas and sees how these are related to important
modern mathematical concepts and also how these concepts can fruitfully
be applied.
Only the reader can decide whether we have succeeded in this or not. For
the future, however, it will be important that mathematicians and natural
scientists make greater efforts towards a better understanding, so that ideas
may flow in both directions. No abstract statements are needed here, but
rather the personal contact between mathematicians and natural scientists, in
order to help demolish barriers.
This volume has the same formal structure as the previous Parts I-III.
Details may be found in the Preface of Part I. We mention that at the end of
this volume a listing of all theorems may be found. The Appendix contains a
survey of the international system of units and the values of important
universal constants.
Mathematicians should bear in mind that, at the end of even the most
abstract theory, physicists need numbers which they can compare with experi-
ment. It is also very important for physicists to have a precise knowledge of
the orders of magnitude in which the various effects lie, so that they know
which of them to neglect. In order to give mathematicians a feeling for this,
we have added several tables throughout this volume.
In the Introduction to Part I we have already mentioned that in the study
of many problems the following steps are used:
(i) Translation of the problem into the language of functional analysis.
(ii) Applications of abstract functional analytic theorems.
(iii) To verify the assumptions in (ii) it is often necessary to apply deep and
very specific analytical tools.
According to our title "Nonlinear Functional Analysis," we concentrate on (i)
and (ii). The tools in (iii), which usually require very long proofs (e.g., a priori
estimates for linear elliptic systems in spaces of Hölder continuously differentiable
functions), are often used without proof, but we always say where the
corresponding proof may be found.
The sciences today are characterized by an increased splitting. In this
presentation we try to emphasize unifying principles and relations. If this is
done by a single author, then surely only with imperfection. Critical remarks
and suggestions are therefore very welcome.
For assistance in typing parts of this manuscript I am very thankful to
Ursula Abraham, Helga Hedwig, Kristina Friedrich, Hiltraud Lehmann, Rita
Löffler, Karin Quasthoff, Karla Rietz, Stefan Ackermann, Frank Benkert,
Werner Berndt, Stefan Friedrich, Bernd Fritzsche, Matthias Günther, Uwe
Heide, Jürgen Herrler, Jürgen Janassary, Bernd Kirstein, Wolfgang Kliesch,
Klaus Schenk, Rainer Schumann, and Friedemann Schuricht.
Also extremely helpful was the understanding and knowledgeable assistance
of the librarian of our Institute, Mrs. Ina Letzel, for which I wish to express
my sincerest thanks.
I would heartily like to thank Professor Friedrich Hirzebruch for his
repeated generous hospitality at the Max-Planck Institute for Mathematics
in Bonn.
I would also like to thank Juergen Quandt for his English translation.
Finally, my special thanks are due to Springer-Verlag for the harmonious
collaboration and the beautiful layout of the book.

Leipzig Eberhard Zeidler


Summer 1987

Preface to the Corrected Second Printing

I am very pleased that Springer-Verlag is publishing a corrected reprint of my
book. In this edition, I have made minor revisions and added a fairly comprehensive
list of recent references.
Nowadays, nonlinear mathematical techniques play a fundamental role in
all areas of natural sciences. The additional references should help the reader
to discover new ways of applying mathematics to the fascinating phenomena
of our real world. I hope that this volume will contribute to a better under-
standing between mathematicians and natural scientists.

Leipzig Eberhard Zeidler


Spring 1997
Translator's Preface

I would like to take this opportunity to thank Professor Eberhard Zeidler for
much advice during the course of the translation. I am very grateful to him
for a pleasant time of collaboration. I also want to thank Steve Kane for his
help with stylistic questions.
What follows are a few observations about the process of discovery. The
most difficult form of thinking seems to be the thinking about thinking. This
may be related to the problem of self-reference in logic. It appears to be in
principle impossible to completely analyze one's own process of thought, since
such an analysis interferes with the process to be analyzed. Nevertheless, the
conscious mind can look at the subconscious mind and with the aid of memory
also at the conscious mind.
To visualize the mental state of thought, one could use the analogy
awakenness = consciousness
dream = subconsciousness.
The conscious mind is subject to our will, as elements of thought serve certain
more or less vague images and signs, which are continuously combined by
the subconscious mind (i.e., involuntarily). According to logical rules and
controlled by the conscious mind, they are formed into certain chains.
Those which appeal to the conscious mind are kept as further elements of
thought. In this selection process the conscious mind is guided by a certain
feeling of harmony and simplicity and the desire to arrive at logically connected
entities. Discovery itself occurs suddenly (like a flash of lightning),
usually after a period of mental exertion (and possibly preceded by some
incubation time).
What are the necessary prerequisites for discovery?

(1) Character. The most important factor seems to be one's character. The
better it is, the more likely the conscious mind will be guided by a feeling
of harmony.
(2) Intelligence. The higher the intelligence, the faster the subconscious mind
is able to combine the elements of thought.
(3) Knowledge. The broader the knowledge, the more elements of thought
the subconscious mind can combine.
(4) Inspiration. Without being inspired by other people's ideas, one's thoughts
often proceed in circles.
(5) Persistence. To arrive at deep results (see below), a certain persistence over
long periods of time seems to be required.
How can the value of a discovery be judged?
A result is deep if it is basic (the most basic results being the universal laws
of nature). Such results assume a simple form (F = ma, E = mc², etc.). It seems
to be the hardest task to arrive at those discoveries.

Winter Park, FL Juergen Quandt


Spring 1987
Contents

Preface vii
Translator's Preface xiii

INTRODUCTION
Mathematics and Physics

APPLICATIONS IN MECHANICS 7

CHAPTER 58
Basic Equations of Point Mechanics 9
§58.1. Notations 10
§58.2. Lever Principle and Stability of the Scales 14
§58.3. Perspectives 17
§58.4. Kepler's Laws and a Look at the History of Astronomy 22
§58.5. Newton's Basic Equations 25
§58.6. Changes of the System of Reference and the Role of Inertial Systems 28
§58.7. General Point System and Its Conserved Quantities 32
§58.8. Newton's Law of Gravitation and Coulomb's Law of Electrostatics 35
§58.9. Application to the Motion of Planets 38
§58.10. Gauss' Principle of Least Constraint and the General Basic
Equations of Point Mechanics with Side Conditions 45
§58.11. Principle of Virtual Power 48
§58.12. Equilibrium States and a General Stability Principle 50
§58.13. Basic Equations of the Rigid Body and the Main Theorem about the
Motion of the Rigid Body and Its Equilibrium 52
§58.14. Foundation of the Basic Equations of the Rigid Body 55

§58.15. Physical Models, the Expansion of the Universe, and Its Evolution
~~-~ ~
§58.16. Legendre Transformation and Conjugate Functionals 65
§58.17. Lagrange Multipliers 67
§58.18. Principle of Stationary Action 69
§58.19. Trick of Position Coordinates and Lagrangian Mechanics 70
§58.20. Hamiltonian Mechanics 72
§58.21. Poissonian Mechanics and Heisenberg's Matrix Mechanics in
Quantum Theory 77
§58.22. Propagation of Action 81
§58.23. Hamilton-Jacobi Equation 82
§58.24. Canonical Transformations and the Solution of the Canonical
Equations via the Hamilton-Jacobi Equation 83
§58.25. Lagrange Brackets and the Solution of the Hamilton-Jacobi
Equation via the Canonical Equations 84
§58.26. Initial-Value Problem for the Hamilton-Jacobi Equation 87
§58.27. Dimension Analysis 89

CHAPTER 59
Dualism Between Wave and Particle, Preview of Quantum Theory,
and Elementary Particles 98
§59.1. Plane Waves 99
§59.2. Polarization 101
§59.3. Dispersion Relations 102
§59.4. Spherical Waves 103
§59.5. Damped Oscillations and the Frequency-Time Uncertainty Relation 104
§59.6. Decay of Particles 105
§59.7. Cross Sections for Elementary Particle Processes and the Main
Objectives in Quantum Field Theory 106
§59.8. Dualism Between Wave and Particle for Light 107
§59.9. Wave Packets and Group Velocity 110
§59.10. Formulation of a Particle Theory for a Classical Wave Theory 111
§59.11. Motivation of the Schrödinger Equation and Physical Intuition 112
§59.12. Fundamental Probability Interpretation of Quantum Mechanics 113
§59.13. Meaning of Eigenfunctions in Quantum Mechanics 114
§59.14. Meaning of Nonnormalized States 116
§59.15. Special Functions in Quantum Mechanics 117
§59.16. Spectrum of the Hydrogen Atom 118
§59.17. Functional Analytic Treatment of the Hydrogen Atom 121
§59.18. Harmonic Oscillator in Quantum Mechanics 122
§59.19. Heisenberg's Uncertainty Relation 123
§59.20. Pauli Principle, Spin, and Statistics 125
§59.21. Quantization of the Phase Space and Statistics 126
§59.22. Pauli Principle and the Periodic System of the Elements 127
§59.23. Classical Limiting Case of Quantum Mechanics and the
WKB Method to Compute Quasi-Classical Approximations 129
§59.24. Energy-Time Uncertainty Relation and Elementary Particles 130
§59.25. The Four Fundamental Interactions 134
§59.26. Strength of the Interactions 136

APPLICATIONS IN ELASTICITY THEORY 143

CHAPTER 60
Elastoplastic Wire 145
§60.1. Experimental Result 147
§60.2. Viscoplastic Constitutive Laws 149
§60.3. Elasto-Viscoplastic Wire with Linear Hardening Law 151
§60.4. Quasi-Statical Plasticity 154
§60.5. Some Historical Remarks on Plasticity 155

CHAPTER 61
Basic Equations of Nonlinear Elasticity Theory 158
§61.1. Notations 166
§61.2. Strain Tensor and the Geometry of Deformations 168
§61.3. Basic Equations 176
§61.4. Physical Motivation of the Basic Equations 180
§61.5. Reduced Stress Tensor and the Principle of Virtual Power 184
§61.6. A General Variational Principle (Hyperelasticity) 190
§61.7. Elastic Energy of the Cuboid and Constitutive Laws 198
§61.8. Theory of Invariants and the General Structure of Constitutive Laws
and Stored Energy Functions 202
§61.9. Existence and Uniqueness in Linear Elastostatics (Generalized
Solutions) 209
§61.10. Existence and Uniqueness in Linear Elastodynamics (Generalized
Solutions) 212
§61.11. Strongly Elliptic Systems 213
§61.12. Local Existence and Uniqueness Theorem in Nonlinear Elasticity via
the Implicit Function Theorem 215
§61.13. Existence and Uniqueness Theorem in Linear Elastostatics (Classical
Solutions) 221
§61.14. Stability and Bifurcation in Nonlinear Elasticity 221
§61.15. The Continuation Method in Nonlinear Elasticity and an
Approximation Method 224
§61.16. Convergence of the Approximation Method 227

CHAPTER 62
Monotone Potential Operators and a Class of Models with Nonlinear
Hooke's Law, Duality and Plasticity, and Polyconvexity 233
§62.1. Basic Ideas 234
§62.2. Notations 242
§62.3. Principle of Minimal Potential Energy, Existence, and Uniqueness 244
§62.4. Principle of Maximal Dual Energy and Duality 245
§62.5. Proofs of the Main Theorems 247
§62.6. Approximation Methods 252
§62.7. Applications to Linear Elasticity Theory 255
§62.8. Application to Nonlinear Hencky Material 256
§62.9. The Constitutive Law for Quasi-Statical Plastic Material 257
§62.10. Principle of Maximal Dual Energy and the Existence Theorem for
Linear Quasi-Statical Plasticity 259
§62.11. Duality and the Existence Theorem for Linear Statical Plasticity 262
§62.12. Compensated Compactness 264
§62.13. Existence Theorem for Polyconvex Material 273
§62.14. Application to Rubberlike Material 277
§62.15. Proof of Korn's Inequality 278
§62.16. Legendre Transformation and the Strategy of the General Friedrichs
Duality in the Calculus of Variations 284
§62.17. Application to the Dirichlet Problem (Trefftz Duality) 288
§62.18. Application to Elasticity 289

CHAPTER 63
Variational Inequalities and the Signorini Problem for Nonlinear
Material 296
§63.1. Existence and Uniqueness Theorem 296
§63.2. Physical Motivation 298

CHAPTER 64
Bifurcation for Variational Inequalities 303
§64.1. Basic Ideas 303
§64.2. Quadratic Variational Inequalities 305
§64.3. Lagrange Multiplier Rule for Variational Inequalities 306
§64.4. Main Theorem 308
§64.5. Proof of the Main Theorem 309
§64.6. Applications to the Bending of Rods and Beams 311
§64.7. Physical Motivation for the Nonlinear Rod Equation 315
§64.8. Explicit Solution of the Rod Equation 317

CHAPTER 65
Pseudomonotone Operators, Bifurcation, and the von Kármán Plate
Equations 322
§65.1. Basic Ideas 322
§65.2. Notations 325
§65.3. The von Kármán Plate Equations 326
§65.4. The Operator Equation 327
§65.5. Existence Theorem 332
§65.6. Bifurcation 332
§65.7. Physical Motivation of the Plate Equations 334
§65.8. Principle of Stationary Potential Energy and Plates with Obstacles 339

CHAPTER 66
Convex Analysis, Maximal Monotone Operators, and Elasto-
Viscoplastic Material with Linear Hardening and Hysteresis 348
§66.1. Abstract Model for Slow Deformation Processes 349
§66.2. Physical Interpretation of the Abstract Model 352
§66.3. Existence and Uniqueness Theorem 355
§66.4. Applications 358

APPLICATIONS IN THERMODYNAMICS 363

CHAPTER 67
Phenomenological Thermodynamics of Quasi-Equilibrium and
Equilibrium States 369
§67.1. Thermodynamical States, Processes, and State Variables 371
§67.2. Gibbs' Fundamental Equation 374
§67.3. Applications to Gases and Liquids 375
§67.4. The Three Laws of Thermodynamics 378
§67.5. Change of Variables, Legendre Transformation, and
Thermodynamical Potentials 385
§67.6. Extremal Principles for the Computation of Thermodynamical
Equilibrium States 387
§67.7. Gibbs' Phase Rule 391
§67.8. Applications to the Law of Mass Action 392

CHAPTER 68
Statistical Physics 396
§68.1. Basic Equations of Statistical Physics 397
§68.2. Bose and Fermi Statistics 402
§68.3. Applications to Ideal Gases 403
§68.4. Planck's Radiation Law 408
§68.5. Stefan-Boltzmann Radiation Law for Black Bodies 409
§68.6. The Cosmos at a Temperature of 10¹¹ K 411
§68.7. Basic Equation for Star Models 412
§68.8. Maximal Chandrasekhar Mass of White Dwarf Stars 412

CHAPTER 69
Continuation with Respect to a Parameter and a Radiation Problem
of Carleman 422
§69.1. Conservation Laws 422
§69.2. Basic Equations of Heat Conduction 423
§69.3. Existence and Uniqueness for a Heat Conduction Problem 425
§69.4. Proof of Theorem 69.A 426

APPLICATIONS IN HYDRODYNAMICS 431

CHAPTER 70
Basic Equations of Hydrodynamics 433
§70.1. Basic Equations 434
§70.2. Linear Constitutive Law for the Friction Tensor 436
§70.3. Applications to Viscous and Inviscid Fluids 438
§70.4. Tube Flows, Similarity, and Turbulence 439
§70.5. Physical Motivation of the Basic Equations 441
§70.6. Applications to Gas Dynamics 444

CHAPTER 71
Bifurcation and Permanent Gravitational Waves 448
§71.1. Physical Problem and Complex Velocity 451
§71.2. Complex Flow Potential and Free Boundary-Value Problem 454
§71.3. Transformed Boundary-Value Problem for the Circular Ring 456
§71.4. Existence and Uniqueness of the Bifurcation Branch 459
§71.5. Proof of Theorem 71.B 462
§71.6. Explicit Construction of the Solution 464

CHAPTER 72
Viscous Fluids and the Navier-Stokes Equations 479
§72.1. Basic Ideas 480
§72.2. Notations 485
§72.3. Generalized Stationary Problem 486
§72.4. Existence and Uniqueness Theorem for Stationary Flows 490
§72.5. Generalized Nonstationary Problem 491
§72.6. Existence and Uniqueness Theorem for Nonstationary Flows 494
§72.7. Taylor Problem and Bifurcation 495
§72.8. Proof of Theorem 72.C 500
§72.9. Bénard Problem and Bifurcation 505
§72.10. Physical Motivation of the Boussinesq Approximation 512
§72.11. The Kolmogorov 5/3-Law for Energy Dissipation in Turbulent
Flows 513
§72.12. Velocity in Turbulent Flows 515

MANIFOLDS AND THEIR APPLICATIONS 527

CHAPTER 73
Banach Manifolds 529
§73.1. Local Normal Forms for Nonlinear Double Splitting Maps 531
§73.2. Banach Manifolds 533
§73.3. Strategy of the Theory of Manifolds 535
§73.4. Diffeomorphisms 537
§73.5. Tangent Space 538
§73.6. Tangent Map 540
§73.7. Higher-Order Derivatives and the Tangent Bundle 541
§73.8. Cotangent Bundle 545
§73.9. Global Solutions of Differential Equations on Manifolds and Flows 546
§73.10. Linearization Principle for Maps 550
§73.11. Two Principles for Constructing Manifolds 554
§73.12. Construction of Diffeomorphisms and the Generalized Morse
Lemma 560
§73.13. Transversality 563
§73.14. Taylor Expansions and Jets 566
§73.15. Equivalence of Maps 571
§73.16. Multilinearization of Maps, Normal Forms, and Catastrophe
Theory 572
§73.17. Applications to Natural Sciences 579
§73.18. Orientation 582
§73.19. Manifolds with Boundary 584
§73.20. Sard's Theorem 587
§73.21. Whitney's Embedding Theorem 588
§73.22. Vector Bundles 589
§73.23. Differentials and Derivations on Finite-Dimensional Manifolds 595

CHAPTER 74
Classical Surface Theory, the Theorema Egregium of Gauss, and
Differential Geometry on Manifolds 609
§74.1. Basic Ideas of Tensor Calculus 615
§74.2. Covariant and Contravariant Tensors 617
§74.3. Algebraic Tensor Operations 621
§74.4. Covariant Differentiation 623
§74.5. Index Principle of Mathematical Physics 625
§74.6. Parallel Transport and Motivation for Covariant Differentiation 626
§74.7. Pseudotensors and a Duality Principle 627
§74.8. Tensor Densities 630
§74.9. The Two Fundamental Forms of Gauss of Classical Surface Theory 631
§74.10. Metric Properties of Surfaces 634
§74.11. Curvature Properties of Surfaces 636
§74.12. Fundamental Equations and the Main Theorem of Classical Surface
Theory 639
§74.13. Curvature Tensor and the Theorema Egregium 642
§74.14. Surface Maps 644
§74.15. Parallel Transport on Surfaces According to Levi-Civita 645
§74.16. Geodesics on Surfaces and a Variational Principle 646
§74.17. Tensor Calculus on Manifolds 648
§74.18. Affine Connected Manifolds 649
§74.19. Riemannian Manifolds 651
§74.20. Main Theorem About Riemannian Manifolds and the Geometric
Meaning of the Curvature Tensor 653
§74.21. Applications to Non-Euclidean Geometry 655
§74.22. Strategy for a Further Development of the Differential and Integral
Calculus on Manifolds 663
§74.23. Alternating Differentiation of Alternating Tensors 664
§74.24. Applications to the Calculus of Alternating Differential Forms 664
§74.25. Lie Derivative 673
§74.26. Applications to Lie Algebras of Vector Fields and Lie Groups 676

CHAPTER 75
Special Theory of Relativity 694
§75.1. Notations 699
§75.2. Inertial Systems and the Postulates of the Special Theory of Relativity 699
§75.3. Space and Time Measurements in Inertial Systems 700
§75.4. Connection with Newtonian Mechanics 702
§75.5. Special Lorentz Transformation 706
§75.6. Length Contraction, Time Dilatation, and Addition Theorem for
Velocities 708
§75.7. Lorentz Group and Poincaré Group 710
§75.8. Space-Time Manifold of Minkowski 713
§75.9. Causality and Maximal Signal Velocity 714
§75.10. Proper Time 717
§75.11. The Free Particle and the Mass-Energy Equivalence 719
§75.12. Energy Momentum Tensor and Relativistic Conservation Laws for
Fields 723
§75.13. Applications to Relativistic Ideal Fluids 726

CHAPTER 76
General Theory of Relativity 730
§76.1. Basic Equations of the General Theory of Relativity 730
§76.2. Motivation of the Basic Equations and the Variational Principle for
the Motion of Light and Matter 732
§76.3. Friedman Solution for the Closed Cosmological Model 736
§76.4. Friedman Solution for the Open Cosmological Model 741
§76.5. Big Bang, Red Shift, and Expansion of the Universe 742
§76.6. The Future of our Cosmos 745
§76.7. The Very Early Cosmos 747
§76.8. Schwarzschild Solution 756
§76.9. Applications to the Motion of the Perihelion of Mercury 758
§76.10. Deflection of Light in the Gravitational Field of the Sun 765
§76.11. Red Shift in the Gravitational Field 766
§76.12. Virtual Singularities, Continuation of Space-Time Manifolds, and
the Kruskal Solution 767
§76.13. Black Holes and the Sinking of a Space Ship 771
§76.14. White Holes 775
§76.15. Black-White Dipole Holes and Dual Creatures Without Radio
Contact to Us 775
§76.16. Death of a Star 776
§76.17. Vaporization of Black Holes 780

CHAPTER 77
Simplicial Methods, Fixed Point Theory, and Mathematical Economics 794
§77.1. Lemma of Sperner 797
§77.2. Lemma of Knaster, Kuratowski, and Mazurkiewicz 798
§77.3. Elementary Proof of Brouwer's Fixed-Point Theorem 799
§77.4. Generalized Lemma of Knaster, Kuratowski, and Mazurkiewicz 800
§77.5. Inequality of Fan 801
§77.6. Main Theorem for n-Person Games of Nash and the Minimax
Theorem 802
§77.7. Applications to the Theorem of Hartman-Stampacchia for
Variational Inequalities 803
§77.8. Fixed-Point Theorem of Kakutani 804
§77.9. Fixed-Point Theorem of Fan-Glicksberg 805
§77.10. Applications to the Main Theorem of Mathematical Economics
About Walras Equilibria and Quasi-Variational Inequalities 806
§77.11. Negative Retract Principle 808
§77.12. Intermediate-Value Theorem of Bolzano-Poincaré-Miranda 808
§77.13. Equivalent Statements to Brouwer's Fixed-Point Theorem 810

CHAPTER 78
Homotopy Methods and One-Dimensional Manifolds 817
§78.1. Basic Idea 818
§78.2. Regular Solution Curves 818
§78.3. Turning Point Principle and Bifurcation Principle 821
§78.4. Curve Following Algorithm 822
§78.5. Constructive Leray-Schauder Principle 823
§78.6. Constructive Approach for the Fixed-Point Index and the
Mapping Degree 824
§78.7. Parametrized Version of Sard's Theorem 828
§78.8. Theorem of Sard-Smale 829
§78.9. Proof of Theorem 78.A 830
§78.10. Parametrized Version of the Theorem of Sard-Smale 832
§78.11. Main Theorem About Generic Finiteness of the Solution Set 834
§78.12. Proof of Theorem 78.B 834

CHAPTER 79
Dynamical Stability and Bifurcation in B-Spaces 840
§79.1. Asymptotic Stability and Instability of Equilibrium Points 841
§79.2. Proof of Theorem 79.A 843
§79.3. Multipliers and the Fixed-Point Trick for Dynamical Systems 846
§79.4. Floquet Transformation Trick 848
§79.5. Asymptotic Stability and Instability of Periodic Solutions 851
§79.6. Orbital Stability 852
§79.7. Perturbation of Simple Eigenvalues 853
§79.8. Loss of Stability and the Main Theorem About Simple Curve
Bifurcation 856
§79.9. Loss of Stability and the Main Theorem About Hopf Bifurcation 860
§79.10. Proof of Theorem 79.F 863
§79.11. Applications to Ljapunov Bifurcation 867

Appendix 883
References 885
List of Symbols 933
List of Theorems 943
List of the Most Important Definitions 946
List of Basic Equations in Mathematical Physics 953
Index 959
INTRODUCTION

Mathematics and Physics

The more I have learned about physics, the more convinced I am that physics
provides, in a sense, the deepest applications of mathematics. The mathematical
problems that have been solved, or techniques that have arisen out of physics
in the past, have been the lifeblood of mathematics.... The really deep questions
are still in the physical sciences. For the health of mathematics at its research
level, I think it is very important to maintain that link as much as possible.
Sir Michael Atiyah (1984)
The theory of general relativity is a good example for the basic character in a
modern development of a theory. The original hypotheses become more and
more abstract and remote from experience, but thereby the goal of deriving
a maximum of results of our experience from a minimum of hypotheses has
become closer.
Albert Einstein (1955)
The nuclear physicist Niels Bohr (1885-1962) did not, as did Arnold Sommerfeld
(1868-1951), start from precisely defined mathematical assumptions, but,
starting from phenomena and guided by intuition, felt his way towards a new
physics. This form of research was also characteristic for the founder of modern
quantum mechanics, Werner Heisenberg (1901-1976).
Bartel Leendert van der Waerden (1981)
If one does not sometimes think the illogical, one will never discover new ideas
in science.
Max Planck (1945)
"I think this is so," says Cicha, "in the fight for new insights, the breaking
brigades are marching in the front row. The vanguard that does not look to left
nor to right, but simply forges ahead: those are the physicists. And behind them
there are following the various canteen men, all kinds of stretcher bearers, who
clear the dead bodies away or, simply put, get things in order. Well, those are
the mathematicians."
From the crime novel "Dead loves poetry" by the
Czech physicist Jan Klima (born in 1938)
The most vitally characteristic fact about mathematics, in my opinion, is its
quite peculiar relationship to the natural sciences, or, more generally, to any
science which interprets experience on a higher more than on a purely descriptive
level ....
I think that this is a relatively good approximation to truth-which is much
too complicated to allow anything but approximations-that mathematical
ideas originate in empirics, although the genealogy is sometimes long and
obscure. But, once they are so conceived, the subject begins to live a peculiar life
of its own and is better, compared to a creative one, governed by almost entirely
aesthetic motivations, than to anything else and, in particular, to an empirical
science ....
But there is a grave danger that the subject will develop along the line of least
resistance, that the stream, so far from its source, will separate into a multitude
of insignificant tributaries, and that the discipline will become a disorganized
mass of details and complexities. In other words, at a great distance from its
empirical sources, or after much "abstract" inbreeding a mathematical object is
in danger of degeneration. At the inception the style is usually classical; when it
shows signs of becoming baroque, then the danger signal is up ....
Whenever this stage is reached, the only remedy seems to be a rejuvenating
return to the source: the reinjection of more or less directly empirical ideas. I
am convinced that this was a necessary condition to conserve the freshness and
the vitality of the subject and that this will remain equally true in the future.
John von Neumann (1947)
For a mentor of Ph.D. candidates it would be most easy to educate a poor
applied mathematician. The next simplest thing would be to educate a poor pure
mathematician. Then an entire quantum gap lies between the education of a
good pure mathematician, and finally, an enormous quantum gap, the education
of a good applied mathematician. For the latter task (especially after the death
of John von Neumann) I would consider no one sufficiently qualified. The
knowledge and abilities which are nowadays required of a really successful
applied mathematician, presume an extraordinary high intellectual standard,
and, even for the career of our present-day students, it is almost impossible to
predict which parts of mathematics will prove most suited for applications.
Peter Hilton (1973)
Mathematics is not a deductive science - that's a cliche. When you try to prove
a theorem, you don't just list the hypotheses, and then start to reason. What you
do is trial-and-error, experimentation, and guesswork.
Paul Halmos (1985)
Between the ages of 12-16, I familiarized myself with the elements of mathematics.
In doing so I had the good fortune of discovering books which were not
too particular in their logical rigor.
At the age of 17, I entered the Polytechnic Institute of Zurich. There I had
excellent teachers (for example, Hurwitz and Minkowski), so that I really could
have obtained a sound mathematical education. However, most of the time
I worked in the physical laboratory, fascinated by the direct contact with
experience. The rest of the time I used, in the main, to study at home the works
of Kirchhoff, Helmholtz, Hertz, etc. The fact that I neglected mathematics to a
certain extent had its cause not merely in my stronger interest in the natural
sciences than in mathematics, but also in the following strange experience. I saw
that mathematics was split up into numerous specialities, each of which could
easily absorb the short life granted to us. Consequently, I saw myself in the
position of Buridan's ass which was unable to decide upon any specific bundle
of hay. This was obviously due to the fact that my intuition was not strong
enough in the field of mathematics in order to differentiate clearly that which
was fundamentally important, and that which is really basic, from the rest of the
more or less dispensable erudition, and it was not clear to me as a student that
the approach to a more profound knowledge of the basic principles of physics
is tied up with the most intricate mathematical methods. This only dawned upon
me gradually after years of independent scientific work. True enough, physics
was also divided into separate fields. In this field, however, I soon learned to
scent out that which was able to lead to fundamentals.
Albert Einstein (1955)

Dick Feynman was a profoundly original scientist. He refused to take anybody's
word for anything. This meant that he was forced to rediscover or
reinvent for himself almost the whole of physics. It took him five years of
concentrated work to reinvent quantum mechanics. At the end, he had a version
of quantum mechanics that he could understand. The calculations I did for Hans
Bethe, using the orthodox theory, took me several months of work and several
hundred sheets of paper. Dick could get the same answer, calculating on a
blackboard, in half an hour.
In orthodox physics it can be said: Suppose an electron is in this state at a
certain time, then you calculate what it will do next by solving a certain
differential equation (the Schrödinger equation). Instead of this, Dick said
simply: "The electron does whatever it likes." A history of the electron is any
possible path in space and time. The behavior of the electron is just the result
of adding together all the histories according to some simple rules that Dick
worked out. I had the enormous luck to be there at Cornell in 1948 when the
idea was newborn, and to be for a short time Dick's sounding board.
Dick distrusted my mathematics and I distrusted his intuition. Dick fought
against my scepticism, arguing that Einstein had failed because he stopped
thinking in concrete physical images and became a manipulator of equations. I
had to admit that was true. The great discoveries of Einstein's earlier years were
all based on direct physical intuition. Einstein's later unified theories failed
because they were only sets of equations without physical meaning.
Nobody but Dick could use his theory. Without success I tried to understand
him .... For two weeks I had not thought about physics, and then it
came bursting into my consciousness like an explosion. Feynman's pictures and
Schwinger's equations began sorting themselves out in my head with a clarity
they have never had before. I had no pencil or paper, but everything was so clear
I did not need to write it down. Feynman and Schwinger were just looking at
the same set of ideas from two different sides. Putting their methods together,
you would have a theory of quantum electrodynamics that combined the mathematical
precision of Schwinger with the practical flexibility of Feynman.
Freeman J. Dyson (1979)
There are mathematicians who reject a binding of mathematics to physics, and
who justify mathematical work solely by aesthetical satisfaction which, besides
all the difficulty of the material, mathematics is able to offer. Such mathemati-
cians are more likely to regard mathematics as a form of art than a science, and
this point of view of mathematical unselfishness can be characterized by the
slogan "l'art pour l'art".
On the other hand, there are physicists who regret that their science is so much
related to mathematics. They fear a loss of intuition in the natural sciences. They
consider the intimate relation with nature, the finding of ideas in nature itself,
which was given to Goethe (1749-1832) in such a high degree, as being destroyed
by mathematics, and their anger or sorrow is the more serious the more they
are forced to realize the inevitability of mathematics.
Both points of view deserve serious consideration; because not only people
with narrow minds have expressed such opinions. Yes, one can say that such a
radical inclination to one side or the other, if not caused by a lack of talent, is
sometimes evidence of a deeper perception of science, as if someone is interested
in both sciences, but at the same time is satisfied with obvious connections
between mathematics and physics ....
Mathematics is an organ of knowledge and an infinite refinement of language.
It grows from the usual language and world of intuition as does a plant from
the soil, and its roots are the numbers and simple geometrical intuitions. We do
not know which kind of content mathematics (as the only adequate language)
requires; we cannot imagine into what depths and distances this spiritual eye
(mathematics) will lead us.
Erich Kähler (1941)
Relations between mathematics and physics vary with time. Right now, and for
the past few years, harmony reigns and a honeymoon blossoms. However, I have
seen other times, times of divorce and bitter battles, when the sister sciences
declared each other as useless-or worse. The following exchange between a
famous theoretical physicist and an equally famous mathematician might have
been typical, some fifteen or twenty years ago:
Says the physicist: "I have no use for mathematics. All the mathematics I
ever need, I invent in one week."
Answers the mathematician: "You must mean the seven days it took the
Lord to create the world."
A slightly more reliable document is found in the preface of the first edition
of Hermann Weyl's book on group theory and quantum mechanics from 1928.
He writes: "I cannot abstain from playing the role of an (often unwelcome)
intermediary in this drama between mathematics and physics, which fertilize
each other in the dark, and deny and misconstrue one another when face to face."
This dramatic situation, described here by one of the great masters in both
sciences, is a result of recent times. At the time of Newton (1642-1727) disharmony
between mathematics and physics seemed unthinkable and unnatural,
since both were his brainchildren; and close symbiosis persisted through the
whole of the eighteenth century. The rift arose around 1800 and was caused by
the development of pure mathematics (represented by number theory) on the
one hand, and of a new kind of physics, independent of mathematics, which
developed out of chemistry, electricity and magnetism on the other. This rift
was widened in Germany under the influence of Goethe (1749-1832) and his
followers, Schelling (1775-1854) and Hegel (1770-1831) and their
"Naturphilosophie." ...
Our protagonists are Carl Friedrich Gauss (1777-1855), as the creator of
modern number theory, and Michael Faraday (1791-1867) as the inventor of
physics without mathematics (in the strict sense of the word).
It would be foolish, of course, to claim the nonexistence of number theory
before Gauss. An amusing document may illustrate the historical development.
Erich Hecke's famous "Lectures on the Theory of Algebraic Numbers" has on
its last page a "timetable," which chronologically lists the names and dates of
the great number theoreticians, starting with Euclid (300 B.C.) and ending with
Hermann Minkowski (1864-1909). As a physicist, I am impressed to find so
many familiar names in this Hall of Fame: Fermat (1601-1665), Euler (1707-1783),
Lagrange (1736-1813), Legendre (1752-1833), Fourier (1768-1830), and
Gauss (1777-1855). In fact, we cannot find a single great number theoretician
before Gauss, whom we would not count among the great physicists, provided
we disregard antiquity. Specialization starts after 1800 with names like Kummer,
Galois, and Eisenstein; who were all under the great influence of Gauss'
"Disquisitiones Arithmeticae." In this specific sense Gauss' book marks the dividing
line between mathematics as a universal science and mathematics as a union of
special disciplines, and between the "geometre" as a universal "savant" in the
sense of the eighteenth century and the specialized "mathematicien" of modern
times. As is typical for a man of transition, Gauss does not belong to either
category, he was universal and specialized. The struggle raged within him-and
made him suffer.
Res Jost (1984)
(Mathematics and Physics Since 1800: Discord and Sympathy)

Mathematics is an ancient art, and from the outset it has been both the most
highly esoteric and the most intensely practical of human endeavors. As long
ago as 1800 B.C., the Babylonians investigated the abstract properties of numbers;
and in Athenian Greece, geometry attained the highest intellectual status.
Alongside this theoretical understanding, mathematics blossomed as a day-to-day
tool for surveying lands, for navigation, and for the engineering of public
works. The practical problems and theoretical pursuits stimulated one another;
it would be impossible to disentangle these two strands.
Much the same is true today. In the twentieth century, mathematics has
burgeoned in scope and in diversity and has been deepened in its complexity
and abstraction. So profound has this explosion of research been that entire
areas of mathematics may seem unintelligible to laymen-and frequently to
mathematicians working in other subfields. Despite this trend towards-indeed
because of it-mathematics has become more concrete and vital than ever
before.
In the past quarter of a century, mathematics and mathematical techniques
have become an integral, pervasive, and essential component of science, tech-
nology, and business. In our technically oriented society, "innumeracy" has
replaced illiteracy as our principal educational gap. One could compare the
contributions of mathematics to our society with the necessity of air and food
for life. In fact, we could say that we live in the age of mathematics-that our
culture has been "mathematicized." No reflection of mathematics around us is
more striking than the omnipresent computer....
There is an exciting development taking place right now, reunification of
mathematics with theoretical physics....
In the last ten or fifteen years mathematicians and physicists realized that
modern geometry is in fact the natural mathematical framework for gauge
theory. The gauge potential of gauge theory is the connection of mathematics.
The gauge field is the mathematical curvature defined by the connection; certain
"charges" in physics are the topological invariants studied by mathematicians.
While the mathematicians and physicists worked separately on similar ideas,
they did not just duplicate each other's efforts. The mathematicians produced
general, far-reaching theories and investigated their ramifications. Physicists
worked out details of certain examples which turned out to describe nature
beautifully and elegantly. When the two met again, the results were more
powerful than either anticipated....
In mathematics we now have a new motivation to use specific insights from
the examples worked out by physicists. This signals the return to an ancient
tradition....
Mathematical research should be as broad and as original as possible, with
very long-range goals. We expect history to repeat itself: we expect that the most
profound and useful future applications of mathematics cannot be predicted
today, since they will arise from mathematics yet to be discovered.
Arthur M. Jaffe (1984)
(Ordering the Universe: the Role of Mathematics)
The mathematician is a very lonely man. Certainly, he has family and friends,
whose company he enjoys and whose understanding he needs; but his work, his
mathematical problems, i.e., that which makes up the essence of his conscious
life, he cannot share with anyone.
Krzysztof Maurin (1981)
APPLICATIONS IN MECHANICS

The book of nature is written in the language of mathematics.
Galileo Galilei (1564-1642)
Indolence destroys knowledge, let us live and work.
Johannes Kepler (1571-1630)
In 1605, Kepler recognized the orbit of the planet Mars as an ellipse, and, in 1638,
the triumphal march of the natural sciences began with Galileo's law of falling
bodies. Today we start to ask questions, as we see how little help the sciences
are able to offer in questions of humanity, and also, because their child,
technology, which gave rise to so many hopes, can so terribly be misused.
Wilhelm Blaschke (1946)
In the following two chapters we discuss the basic ideas of mechanics and the
dualism between wave and particle. This is fundamental for a deeper under-
standing of all problems, considered in Parts IV and V.
CHAPTER 58

Basic Equations of Point Mechanics

Lex prima: A stationary body will remain motionless, and a moving body will
continue to move in the same direction with unchanging speed unless it is acted
on by some force.
Lex secunda: The time-rate-of-change of the momentum of a body is propor-
tional to the force.
Lex tertia: If any object exerts a force on another object, then the second object
also exerts an equal and opposite force on the first.
Lex quarta: Forces are added like vectors.
Isaac Newton, Philosophiae Naturalis Principia Mathematica (London, 1687)
The same applies to the concept of force as does to any other physical concept:
Verbal definitions are meaningless; real definitions are given through a measuring
process.
Arnold Sommerfeld (1954)

Mechanics is the oldest physical discipline. Its ideas, however, have influenced
many other branches of physics. The goal in this chapter is to present some
general principles of point mechanics which are necessary to understand
elasticity theory, hydrodynamics, and many other branches of physics (statistical
physics, theory of relativity, electrodynamics, quantum mechanics and
quantum field theory, etc.). We will try to explain the close relation between
the results about variational problems of Part III and the basic principles of
mechanics. In particular, we explain the connection between Lagrange's multiplier
rule and the principle of least constraint and the principle of least
(stationary) action.
To introduce the reader to the basic ideas, we consider in Section 58.2 a simple,
but typical example: equilibrium state and motion of a balance, and its
stability. Many modern expositions begin with the principle of stationary
action. This principle, however, does not explicitly contain the most important
physical concept, the force. Also, the principle of stationary action, unlike
the principle of least constraint, does not admit the most general side
conditions, with nonlinear relations for the velocities. We therefore choose to
present the basic principles in the following order:
(i) Principle of the least constraint of Gauss (most general principle in
mechanics).
(ii) Principle of virtual power.
(iii) General stability principle for equilibrium states and, as a special case,
the principle of minimal potential energy.
(iv) Lagrange function and the principle of stationary action.
(v) Hamiltonian formalism (canonical equations and the partial differential
equation of Hamilton-Jacobi).
(vi) Poisson brackets and Poisson's mechanics.
Since the principle of virtual work often leads to misunderstandings, we
replace it with the principle of virtual power. For a discussion of this, see
Section 58.3. We hope our approach will highlight the relation between
elegant mathematical theories (Lagrangian and Hamiltonian formalism) on
the one hand, and physical reality on the other. Especially emphasized are
stability questions. Points (iv)-(vi) play an important role in the formulation
of physical theories other than classical mechanics.
We consider the following interesting applications:

(a) Motion of planets.
(b) Motion of a rigid body.
(c) Expanding universe in the context of classical mechanics.

In Section 58.21 we show that Poisson's formulation of classical mechanics
allows a simple deduction of quantum mechanics. As an application we
consider the quantum mechanical oscillator in the context of Heisenberg's
matrix mechanics. This implies Planck's famous formula about the quantization
of energy for the harmonic oscillator.
We choose here a presentation of classical mechanics, which later on might
help to understand modern generalizations such as the special and general
theory of relativity and quantum field theory. The most elegant mathematical
foundation for mechanics is formed by the geometry on symplectic manifolds.
This will be shown in Part V.

58.1. Notations
In classical mechanics one often works in the real linear three-dimensional
space $V_3$ of vectors $x, y, \ldots$. Intuitively, $x$ is an arrow in the usual
three-dimensional Euclidean space, and arrows obtained by translations are
identified (Fig. 58.1). In the usual way, let
$$xy \quad\text{and}\quad x \times y$$
denote the scalar product and the vector product, respectively. If we specify
a point $O$ as the coordinate origin and attach $x$ at $O$, then $x$ is called the
position vector and we identify the endpoint $P$ with $x$ (Fig. 58.2). A coordinate
system consists of a point $O$ and three linearly independent vectors $b_1, b_2, b_3$
(basis vectors). We can write
$$x = \xi^1 b_1 + \xi^2 b_2 + \xi^3 b_3,$$
where the numbers $\xi^1, \xi^2, \xi^3$ are called the coordinates of the point $x$. In
classical physics we shall use Einstein's sum convention, i.e., equal upper and
lower indices are summed from 1 to 3. In particular, we write
$$x = \xi^i b_i.$$

[Figures 58.1, 58.2, and 58.3]

In Cartesian coordinates the basis vectors are denoted by $e_1, e_2, e_3$, i.e., all $e_i$
are unit vectors, pairwise orthogonal, and oriented as in Figure 58.3.
The trajectory of a point is given by a map
$$t \mapsto x(t)$$
with real time parameter $t$ and position vector $x(t)$. A dot denotes derivatives
with respect to time.
The vectors
$$\dot{x}(t) \quad\text{and}\quad \ddot{x}(t)$$
are called velocity vector and acceleration vector at time $t$, respectively. If
$x(t) = \xi^i(t) b_i$, then
$$\dot{x}(t) = \dot{\xi}^i(t) b_i \quad\text{and}\quad \ddot{x}(t) = \ddot{\xi}^i(t) b_i.$$
The absolute values $|\dot{x}(t)|$ and $|\ddot{x}(t)|$ are called velocity and acceleration at time
$t$, respectively.

EXAMPLE 58.1 (Rotation About an Axis). Let $b$ be a unit vector. Consider a
rotation $y = Tx$ about the point $O$ and the axis of rotation $b$ with an angle $\varphi$,
which is measured clockwise (Fig. 58.4(a)).
We choose Cartesian coordinates with $e_3 = b$, and obtain for $x = \xi^i e_i$
$$y = \xi^1 f_1 + \xi^2 f_2 + \xi^3 e_3$$

[Figure 58.4, panels (a) and (b)]

with vectors
$$f_1 = (\cos\varphi) e_1 + (\sin\varphi) e_2, \qquad f_2 = -(\sin\varphi) e_1 + (\cos\varphi) e_2$$
(Fig. 58.4(b)). If we fix $x$ and choose $\varphi = \varphi(t)$ with $\varphi(0) = 0$, then we obtain a
rotational motion about the axis of rotation $b$ with $y(0) = x$. Differentiation
with respect to $t$ gives
$$\dot{f}_1 = \dot{\varphi} f_2, \qquad \dot{f}_2 = -\dot{\varphi} f_1.$$
For the velocity vector of this rotational motion we obtain
$$\dot{y}(t) = \dot{\varphi}(t)\, b \times y(t).$$
The number $\omega = \dot{\varphi}(t)$ is called the angular velocity at time $t$. Often, one writes
$$\boldsymbol{\omega}(t) = \dot{\varphi}(t) b$$
and calls $\boldsymbol{\omega}$ the angular velocity vector.
Expansion of $Tx$ with respect to $\varphi$ gives
$$Tx = x + \varphi\, b \times x + O(\varphi^2), \qquad \varphi \to 0.$$
The linear term $y = \varphi\, b \times x$ is called an infinitesimal rotation about the
angle $\varphi$. Furthermore,
$$y = (\boldsymbol{\varphi} \times x) + c$$
with $\boldsymbol{\varphi} = \varphi b$ and the constant vector $c$ is called an infinitesimal motion. It is
composed of an infinitesimal rotation and a translation.
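
The relations of Example 58.1 can be checked numerically. The following is a minimal sketch, assuming the axis $b = e_3$ as in the example; the sample vector `x`, the angle law $\varphi(t) = 0.7\,t$, and the helper names `cross`, `rotate_about_e3`, and `norm` are illustrative choices that do not appear in the text. It compares a central difference quotient of $y(t)$ with $\dot{\varphi}\, b \times y(t)$ and measures the remainder $Tx - x - \varphi\, b \times x$, which should be of order $\varphi^2$.

```python
import math

def cross(a, b):
    """Vector product a x b in Cartesian coordinates."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate_about_e3(x, phi):
    """Tx for the axis b = e3: x = xi^i e_i goes to xi^1 f_1 + xi^2 f_2 + xi^3 e_3."""
    c, s = math.cos(phi), math.sin(phi)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1], x[2])

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))

b = (0.0, 0.0, 1.0)   # axis of rotation (assumption: b = e3)
x = (1.0, 2.0, 3.0)   # hypothetical sample vector
phi_dot = 0.7         # hypothetical constant angular velocity

def y(t):
    """Rotational motion y(t) = T(phi(t)) x with phi(t) = phi_dot * t."""
    return rotate_about_e3(x, phi_dot * t)

# Velocity: central difference quotient versus phi_dot * (b x y(t)).
t, h = 0.5, 1e-5
numeric = tuple((p - m) / (2 * h) for p, m in zip(y(t + h), y(t - h)))
exact = tuple(phi_dot * c for c in cross(b, y(t)))
velocity_error = norm(tuple(n - e for n, e in zip(numeric, exact)))

# Linearization: Tx - x - phi * (b x x) should be of order phi^2.
phi = 1e-3
linear = tuple(xi + phi * ci for xi, ci in zip(x, cross(b, x)))
remainder = norm(tuple(r - a for r, a in zip(rotate_about_e3(x, phi), linear)))

print(velocity_error, remainder)
```

Halving `phi` should reduce `remainder` by roughly a factor of four, which is the numerical signature of the $O(\varphi^2)$ term.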
The space $V_3$ is an H-space with scalar product
$$(x|y) = xy.$$
For $x = \xi^i b_i$ and $y = \eta^j b_j$ we have
$$xy = \xi^i \eta^j\, b_i b_j.$$
Let $A\colon V_3 \to V_3$ be a linear operator. In a fixed coordinate system the equation
$$y = Ax$$
means that the relation
$$\eta^i = A^i_j \xi^j$$
holds for the coordinates. Such linear operators appear in the form of inertia
tensors, strain tensors, or stress tensors. The adjoint operator of $A$ is denoted
by $A^*\colon V_3 \to V_3$. By definition, we have $(x|Ay) = (A^*x|y)$, i.e.,
$$x(Ay) = y(A^*x) \quad\text{for all } x, y \in V_3.$$
Thus in Cartesian coordinates this gives
$$(A^*)^i_j = A^j_i \quad\text{for all } i, j.$$
For the F-derivative of a function $U\colon V_3 \to \mathbb{R}$ at a point $x$ we find
$$U'(x)h = h^i D_i U(x) \quad\text{for all } h \in V_3,$$
where $h = h^i b_i$ and $D_i = \partial/\partial \xi^i$. If we define linear functionals $b^i \in V_3^*$ through
$\langle b^i, h \rangle = h^i$, then
$$U'(x) = D_i U(x)\, b^i.$$
One also writes
$$U'(x) = \operatorname{grad} U(x)$$
for this.

for this. Obviously, W} is a basis for the dual space V3*. We call W} the dual
basis of {b;}. We agree to identify V3* with V3 • More precisely, forevery b* e V3*
there exists a unique b e V3 with
(b*,x) = (blx) = bx
and we identify b* with b. From
for all i,j

for all i,j.


These equations uniquely determine b1• Explicitly, we have
b1 = (b2 X b3)/b1 (b2 X b3).
The formulas for b2 and b 3 are obtained through cyclic permutations. In this
sense U'(x)e V3 holds. In Cartesian Coordinates it follows that b1 = b1 = e1•
Then
U'(x) = grad U(x) = D1V(x)e 1,

where i is summed from 1 to 3.
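
The explicit formula for the dual basis is easy to test on a concrete, non-orthogonal basis. The sketch below is only an illustration: the basis vectors and the function names `dot`, `cross`, and `dual_basis` are assumptions chosen for this example and are not taken from the text. It builds $b^1, b^2, b^3$ from the cross product formula and its cyclic permutations and evaluates the products $b^i b_j$.

```python
def dot(a, b):
    """Scalar product ab in Cartesian coordinates."""
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    """Vector product a x b in Cartesian coordinates."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dual_basis(b1, b2, b3):
    """Return (b^1, b^2, b^3) with b^1 = (b2 x b3) / (b1 (b2 x b3)), cyclically."""
    volume = dot(b1, cross(b2, b3))
    if volume == 0:
        raise ValueError("b1, b2, b3 are linearly dependent")
    return tuple(
        tuple(c / volume for c in cross(u, v))
        for u, v in ((b2, b3), (b3, b1), (b1, b2))
    )

# Hypothetical non-orthogonal basis of V_3, given by Cartesian coordinates.
basis = ((1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0))
dual = dual_basis(*basis)

# Matrix of the products b^i b_j; it should be the identity matrix.
gram = [[dot(dual[i], basis[j]) for j in range(3)] for i in range(3)]
print(dual)
print(gram)
```

For an orthonormal basis the function returns the basis itself, in agreement with $b^i = b_i = e_i$ in Cartesian coordinates.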


Using the previous considerations we now may apply functional analytic
methods in classical mechanics. This enables us to work coordinate free.
In particular, we can use the coordinate free differential calculus for the
F-derivative of Chapter 4. This greatly simplifies formulas and calculations.
We shall also use the well-known formulas of vector analysis, which may be
found, for example, in the handbook article Zeidler (1979).

58.2. Lever Principle and Stability of the Scales


In the history of physics many general results have been obtained in studying
concrete phenomena. Archimedes' lever principle (statics) and Kepler's laws
about the planetary motion together with Galilei's study of the free fall
(dynamics) were of great significance for the development of mechanics. Using
the example of the scales we try to show in this section how one is naturally
led to the fundamental concepts of torque, work, power, constraining force,
potential energy, kinetic energy, and total energy. Moreover, we consider the
central principle of virtual power and discuss stability criteria. In the following
section we explain connections with other general principles in mechanics,
which later on will be studied in greater detail.
We begin with the lever principle. Figure 58.5 shows the typical configuration
of a pair of scales, i.e., we consider the simplest model of a weighing
machine. It consists of three firmly connected points with masses $m_i$ and
corresponding vectors $y_i$, which are attached at the point $Q$. At the endpoint
$y_i$, the force $K_i$ is applied, $i = 1, 2, 3$, where $K_i$ is a vector. We first assume
that, except for the firm connection, all three points are able to move freely
(freely moving scales). In Section 58.13 we derive the following equilibrium
condition for this example of a rigid body: total force and total torque with
respect to $Q$ must be zero, i.e.,
$$K_1 + K_2 + K_3 = 0, \tag{1}$$
$$(y_1 \times K_1) + (y_2 \times K_2) + (y_3 \times K_3) = 0, \tag{2}$$
and the body must be at rest at some fixed time. Condition (1) is satisfied with
$K_3 = -K_1 - K_2$. From (2) follows
$$|K_1||y_1| = |K_2||y_2|. \tag{3}$$
This relation is called the lever principle and more suggestively it reads:
Force times force arm equals weight times weight arm.
For the gravitational force $K_i = -m_i g e_3$, $i = 1, 2$, with acceleration due to
gravity $g = 9.81\ \mathrm{m\,s^{-2}}$, formula (3) becomes
$$m_1 |y_1| = m_2 |y_2|. \tag{4}$$
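
The equilibrium conditions (1) and (2) can be evaluated for a concrete pair of scales. The following sketch makes several assumptions that are not in the text: the arms $y_1$ and $y_2$ are taken horizontal along $\mp e_1$ with lengths `l1` and `l2`, the vector $y_3$ is taken parallel to $e_3$ with length `d`, and the masses and lengths are invented sample values. With $K_3 = -K_1 - K_2$ the total force vanishes, and the total torque vanishes exactly when the lever condition $m_1 l_1 = m_2 l_2$ holds.

```python
G = 9.81  # acceleration due to gravity in m/s^2, as in the text

def cross(a, b):
    """Vector product a x b in Cartesian coordinates."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(*vectors):
    """Componentwise sum of several vectors."""
    return tuple(sum(components) for components in zip(*vectors))

def scales_residuals(m1, l1, m2, l2, d):
    """Return (total force, total torque) for the assumed geometry of the scales."""
    y = [(-l1, 0.0, 0.0), (l2, 0.0, 0.0), (0.0, 0.0, d)]  # assumed arm geometry
    k1 = (0.0, 0.0, -m1 * G)                      # K_1 = -m_1 g e_3
    k2 = (0.0, 0.0, -m2 * G)                      # K_2 = -m_2 g e_3
    k3 = tuple(-(a + b) for a, b in zip(k1, k2))  # K_3 = -K_1 - K_2, condition (1)
    forces = [k1, k2, k3]
    total_force = add(*forces)
    total_torque = add(*(cross(yi, ki) for yi, ki in zip(y, forces)))
    return total_force, total_torque

# Hypothetical sample values: 2 kg at 0.3 m balances 3 kg at 0.2 m.
balanced = scales_residuals(m1=2.0, l1=0.3, m2=3.0, l2=0.2, d=0.1)
unbalanced = scales_residuals(m1=2.0, l1=0.3, m2=3.0, l2=0.25, d=0.1)
print(balanced)
print(unbalanced)
```

In the second call the lever condition is violated, and the torque has a nonzero component along $e_2$, so the scales would start to rotate.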
[Figures 58.5 to 58.9]

Now we consider a balance, which is not freely moving and for the remainder
of this chapter make the natural assumption that the third point $y_3$ is the fixed
point of rotation $O$. We choose $O$ as in Figure 58.6 as coordinate origin. This
way we are able to neglect the motion of the third mass point $m_3$. The other
two mass points with masses $m_1$ and $m_2$ and with position vectors $x_1$ and $x_2$
are affected by the gravitational forces $K_1$ and $K_2$, respectively. The system
thus obtained only has one degree of freedom, characterized by the angle of
rotation $\alpha$ of Figure 58.7. We are now looking for a general principle, which
yields equilibrium conditions and the stability of the equilibrium positions.
To engineers, only the stable equilibrium positions are of interest in constructing
buildings, bridges, etc. In order to find such a principle, we use the concept
of work. For an arbitrary motion of the system
$$x_i = x_i(t), \qquad i = 1, 2,$$
which for $t = 0$ is shown by Figure 58.6, we define the work during the time
interval $[0, t]$ as
$$W(t) = \int_0^t (K_1 \dot{x}_1 + K_2 \dot{x}_2)\, dt.$$


Stability Principle 58.2. Let $i = 1, 2$. If
$$W(t) < 0$$
for all $|t| > 0$ which are sufficiently small and all motions $x_i = x_i(t)$ of the system
which satisfy $x_i(t) \neq x_i(0)$ for $|t| > 0$, then the system is in a stable equilibrium
for the position $(x_1(0), x_2(0))$.

This principle will be formulated in a general form in Section 58.12. Intuitively,
it means the following. If we set the system of Figure 58.6 into the
motion $x_i = x_i(t)$, then for $W(t) > 0$ the gravitational force has to do work,
i.e., in this case we gain work, whereas for $W(t) < 0$ we have to do work
ourselves. Thus we expect the stability of the equilibrium position to increase
if $W(t) < 0$ decreases. We now assume the lever condition (4). In the following
we will see that the stability of the configuration in Figure 58.6 increases, the
higher the point of rotation O lies above the rocking beam, i.e., the larger
$d = |y_3|$ is. The configuration in Figure 58.8, on the other hand, satisfies
$W(t) > 0$ for $|t| > 0$, i.e., as expected, this equilibrium position is unstable. If
the point of rotation O lies on the rocking beam, then this will imply that
$W(t) = 0$. Here, every configuration of Figure 58.9 is in an equilibrium position.
But all these equilibrium positions are unstable, which everybody knows
who once sat on a seesaw.

Principle of Virtual Power 58.3. Since $W(0) = 0$ and $W(t) < 0$ for small $|t| > 0$,
the function $W$ has a maximum at $t = 0$. Hence the Stability Principle 58.2
implies that $\dot W(0) = 0$ for a stable equilibrium, i.e.,
$$K_1 \dot x_1(0) + K_2 \dot x_2(0) = 0. \tag{5}$$
By definition, $\dot W(0)$ is the power of the forces $K_1$ and $K_2$ at time $t = 0$.
Therefore (5) is called the principle of virtual power. The attribute "virtual"
comes from the fact that in (5) we consider all mathematically possible motions
$x_i = x_i(t)$ of the system which are compatible with the side conditions, not
only those which are physically possible under the influence of the forces. The
equations for the physically possible motions will be studied in (12) below.

Proposition 58.4. If the point of rotation O of the scales lies properly above the
rocking beam, and if equilibrium condition (4) is satisfied, then Figure 58.6
corresponds to a stable equilibrium position.

PROOF. Let $x_i = \xi_i e_1 + \zeta_i e_3$. We define the potential energy as
$$U(x_1, x_2) = m_1 g \zeta_1 + m_2 g \zeta_2 + U_0,$$
where $U_0$ is an arbitrary constant. We choose $U_0$ such that $U(x_1(0), x_2(0)) = 0$.
Because of $K_i = -m_i g e_3 = -U_{x_i}$, we have
$$W(t) = \int_0^t (K_1 \dot x_1 + K_2 \dot x_2)\,dt = -\int_0^t \frac{dU}{dt}\,dt = -U(x_1(t), x_2(t)).$$
For the configuration in Figure 58.7 we obtain
$$x_1 = |x_1|\,(\sin(\gamma_1 + \alpha)\,e_1 - \cos(\gamma_1 + \alpha)\,e_3),$$
$$x_2 = -|x_2|\,(\sin(\gamma_2 - \alpha)\,e_1 + \cos(\gamma_2 - \alpha)\,e_3). \tag{6}$$
Figure 58.6 illustrates the meaning of the angle $\gamma_i$. All mathematically possible
motions are then obtained from all possible functions $\alpha = \alpha(t)$ with $\alpha(0) = 0$.
One finds
$$W(t) = -V(\alpha(t)) \tag{7}$$
with $U(x_1, x_2) = V(\alpha)$, where
$$V(\alpha) = m_1 g |x_1| (\cos\gamma_1 - \cos(\gamma_1 + \alpha)) + m_2 g |x_2| (\cos\gamma_2 - \cos(\gamma_2 - \alpha))$$
$$\phantom{V(\alpha)} = \alpha V'(0) + \alpha^2 V''(0)/2 + O(\alpha^3), \quad \alpha \to 0,$$
and
$$V'(0) = m_1 g |y_1| - m_2 g |y_2|,$$
$$V''(0) = d\,(m_1 g + m_2 g), \quad d = |y_3|.$$
Equation $\dot W(0) = 0$ implies
$$\dot W(0) = -V'(0)\,\dot\alpha(0) = 0.$$
Since $|\dot\alpha(0)|$ can be chosen arbitrarily, it follows that $V'(0) = 0$. This is equilibrium
condition (4). Moreover, for $d > 0$ we get $W(t) < 0$ if $|\alpha(t)| > 0$ is sufficiently
small. □
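The two derivatives used in the proof can be cross-checked by finite differences. The sketch below is not part of the original argument; it assumes that $|x_i|$ and $\gamma_i$ are determined by the horizontal arm $|y_i|$ and the height $d$ of O above the beam, and the names `make_potential` and `derivatives_at_zero` as well as the sample values are hypothetical.

```python
import math

G = 9.81  # acceleration due to gravity in m/s^2


def make_potential(m1, m2, arm1, arm2, d):
    """Return V(alpha) from the proof of Proposition 58.4.

    Assumption: |x_i| = hypot(arm_i, d) and gamma_i = atan2(arm_i, d),
    so that |x_i|*sin(gamma_i) = |y_i| and |x_i|*cos(gamma_i) = d.
    """
    r1, r2 = math.hypot(arm1, d), math.hypot(arm2, d)
    g1, g2 = math.atan2(arm1, d), math.atan2(arm2, d)

    def V(alpha):
        return (m1 * G * r1 * (math.cos(g1) - math.cos(g1 + alpha))
                + m2 * G * r2 * (math.cos(g2) - math.cos(g2 - alpha)))

    return V


def derivatives_at_zero(V, h=1e-3):
    """Central finite differences for V'(0) and V''(0)."""
    d1 = (V(h) - V(-h)) / (2.0 * h)
    d2 = (V(h) - 2.0 * V(0.0) + V(-h)) / h ** 2
    return d1, d2


# Sample data satisfying the lever condition (4): 2.0*0.3 == 3.0*0.2.
V = make_potential(2.0, 3.0, 0.3, 0.2, 0.1)
d1, d2 = derivatives_at_zero(V)
print(d1, d2)
```

For these sample values the first derivative vanishes up to rounding and the second derivative is close to $d(m_1 g + m_2 g)$, in agreement with the formulas of the proof.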

58.3. Perspectives
The simple example of the previous section already illustrates a number of
important ideas in mechanics, which will be discussed in a general form in the
following sections. Here we begin with several observations and hope that this
detailed discussion will help toward a better understanding of relations in physics.
(i) Principle of virtual power. The corresponding condition
$$\dot W(0) = 0$$
for the power has been discussed already in (5) above.
(ii) Principle of virtual work. The work $W(t)$ depends only on the angle $\alpha(t)$.
It is
$$W = -V'(0)\,\alpha + O(\alpha^2), \quad \alpha \to 0.$$
The linearization
$$\delta W = -V'(0)\,\alpha$$
is called virtual work. Often one writes $\delta\alpha$ instead of $\alpha$, i.e.,
$$\delta W = -V'(0)\,\delta\alpha.$$
Instead of $\dot W(0) = 0$ in (i) one may also require that
$$\delta W = 0.$$
This also leads to the equilibrium condition $V'(0) = 0$.
In this volume we prefer (i) over (ii), since (i) allows an immediate physical
interpretation, while (ii) is only an auxiliary mathematical construction. Interestingly,
physicists prefer (ii). The reason for this is simple. In order to really
understand the principle, (i) is most appropriate. But in many simple situations,
one can use (ii), together with intuitive geometrical arguments about the
nature of the virtual displacements, to obtain faster results than through exact
computations via (i).
In many textbooks, however, one finds formulations that are not very
elucidating. In the classical textbook by Sommerfeld (1954) one reads: "A
virtual displacement is an arbitrary and infinitely small change of the position
of the system, compatible with the side conditions.... The virtual work
of the reactions needed is equal to zero." In the language of Sommerfeld,
which incidentally was the language of the mathematicians at the beginning
of differential and integral calculus during the eighteenth century, one would
say: If the real function $y = f(x)$ has a minimum at $x_0$, then the "virtual
displacement"
$$\delta y = f'(x_0)\,\delta x$$
is equal to zero for "infinitely small" displacements $\delta x$. This means that the
Taylor expansion is only considered up to the linear term
$$f(x_0 + h) - f(x_0) = f'(x_0)\,h$$
and that this expression is set equal to zero with $h = \delta x$. Today we simply
write
$$f'(x_0) = 0.$$
In Budo (1954) one reads the obscure remark: "While the real displacement
always takes place during a certain time $dt$, the virtual displacement can
be regarded as timeless, $\delta t = 0$. The velocity of the virtual displacement is
infinitely large."

(iii) Principle of minimal potential energy. Using the relation
$$W(t) = -U(x_1(t), x_2(t)),$$
with $U(x_1(0), x_2(0)) = 0$ and $W(t) < 0$ for $|t| > 0$, we obtain from the Stability
Principle 58.2: If the potential energy $U$ has a strict minimum at $(x_1(0), x_2(0))$
with respect to all possible positions $(x_1(t), x_2(t))$, then $(x_1(0), x_2(0))$ corresponds
to a stable equilibrium position.
(iv) Dynamics and the principle of virtual power. We now want to study the
actual physical motion of our weighing machine. The reader who is familiar
with Newton's law "Force equals mass times acceleration" should be warned
that here the equations of motion are not of the form
$$m_i \ddot x_i = K_i, \quad i = 1, 2. \tag{8}$$
The reason for this is that, besides the forces $K_i$, additional so-called constraining
forces act, which are caused by the side conditions (linkings) and
which guarantee that during the motion the rigid fixings are preserved. For
time-independent side conditions, as in our case, the actual physical motion
$$x_i = x_i(t)$$
of the system is not determined by (8), but by the dynamical principle of virtual
power, which will be discussed in Section 58.11. This principle states that
$$\sum_{i=1}^{2} (m_i \ddot x_i(t_0) - K_i)\,v_i = 0 \tag{9}$$
for all $t_0$. Here $v_i = \dot y_i(t_0)$ are arbitrary velocity vectors, which correspond to
all mathematically possible motions $x_i = y_i(t)$ of the system which are compatible
with the side conditions and satisfy
$$y_i(t_0) = x_i(t_0)$$
at some fixed time $t_0$. One refers to $x_i = y_i(t)$ as the virtual motion. Among
all these virtual motions, the actual motion $x_i = x_i(t)$ of the system is determined
by (9).
If, in particular, we choose $t_0 = t$ and $v_i = \dot x_i(t)$, then we obtain from (9) that
$$\dot T(t) = \dot W(t), \tag{10}$$
where, by definition,
$$T = (m_1 \dot x_1^2 + m_2 \dot x_2^2)/2$$
is the kinetic energy. Because of $W(t) = -V(\alpha(t))$ we obtain from (10) the
fundamental energy conservation law
$$T(t) + V(\alpha(t)) = E_0, \tag{11}$$
with $E_0$ a constant which is called the total energy or simply the energy of the
system. From (6) follows
$$T = 2^{-1} \mu \dot\alpha^2$$
with $\mu = m_1 |x_1|^2 + m_2 |x_2|^2$. The equation of motion for $\alpha = \alpha(t)$ therefore
becomes
$$2^{-1} \mu \dot\alpha^2 + V(\alpha) = E_0 \tag{12}$$
with $V(\alpha) = \beta\alpha^2/2 + O(\alpha^3)$, $\alpha \to 0$, $\beta > 0$. The solution of (12) with $\alpha(0) = 0$ is
$$\sqrt{\frac{\mu}{2}} \int_0^{\alpha} \frac{d\alpha}{\sqrt{E_0 - V(\alpha)}} = t.$$
If, approximately, we set $V(\alpha) = \beta\alpha^2/2$, then we obtain
$$\alpha(t) = (\dot\alpha(0)/\omega) \sin \omega t, \quad \omega = \sqrt{\beta/\mu}. \tag{13}$$
These are the oscillations of the scales with angular frequency $\omega$.
In Figures 58.8 and 58.9 we have $\beta < 0$ and $V(\alpha) = 0$, respectively. This
changes the qualitative behavior of the motion. For example, Figure 58.9
yields rotations about O with constant angular velocity $\dot\alpha$.
If we use approximations of $V(\alpha)$ up to order three or four, then we obtain
elliptic functions in (13).
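The small-amplitude formula (13) can be compared with a direct numerical integration of the equation of motion $\mu\ddot\alpha + V'(\alpha) = 0$. This is only a sketch: the velocity Verlet scheme, the function names `scales` and `simulate`, and the sample data are our own choices and do not come from the text; the geometry assumes $|x_i|\sin\gamma_i = |y_i|$ and $|x_i|\cos\gamma_i = d$.

```python
import math

G = 9.81  # acceleration due to gravity in m/s^2


def scales(m1, m2, arm1, arm2, d):
    """Return (mu, beta, dV) for the scales; dV is the derivative V'(alpha)."""
    r1, r2 = math.hypot(arm1, d), math.hypot(arm2, d)
    g1, g2 = math.atan2(arm1, d), math.atan2(arm2, d)
    mu = m1 * r1 ** 2 + m2 * r2 ** 2
    beta = d * (m1 + m2) * G  # equals V''(0)

    def dV(a):
        return m1 * G * r1 * math.sin(g1 + a) - m2 * G * r2 * math.sin(g2 - a)

    return mu, beta, dV


def simulate(mu, dV, alpha_dot0, t_end, n):
    """Integrate mu*alpha'' = -V'(alpha), alpha(0) = 0, with velocity Verlet."""
    dt = t_end / n
    a, v = 0.0, alpha_dot0
    for _ in range(n):
        v_half = v - 0.5 * dt * dV(a) / mu
        a += dt * v_half
        v = v_half - 0.5 * dt * dV(a) / mu
    return a


mu, beta, dV = scales(2.0, 3.0, 0.3, 0.2, 0.1)   # satisfies condition (4)
omega = math.sqrt(beta / mu)
alpha_numeric = simulate(mu, dV, 0.01, 1.0, 10000)
alpha_formula = 0.01 / omega * math.sin(omega * 1.0)  # formula (13)
print(alpha_numeric, alpha_formula)
```

For small initial angular velocity the two values agree closely; for larger amplitudes the higher-order terms of $V$ become visible and the agreement degrades.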
(v) Dynamic stability. In Stability Principle 58.2 only the statics is taken
into consideration. A general stability analysis, however, needs to take the
dynamical nature of the problem into account, i.e., changes of the system
under small time-dependent perturbations of position and velocity. This
corresponds to the Ljapunov stability for equations of motion, which has been
discussed in Section 3.6. In many cases, stability results are difficult to obtain.
In our case, however, the conservation of energy admits a simple analysis. If
$\alpha(0)$, $\dot\alpha(0)$ are the initial data, then $E_0$ follows from (12). Furthermore, (12)
implies that the motion is possible only for
$$E_0 - V(\alpha) \geq 0.$$
In Figure 58.10 this corresponds to the dark printed $\alpha$-interval. This implies
the following:
If $|\alpha(0)|$ and $|\dot\alpha(0)|$ are sufficiently small, then the total energy
$|E_0|$ is small and hence $|\alpha(t)|$ remains small for all times.
This is stability of the system in the sense of Ljapunov. If the slope of $V$
increases, i.e., $\beta$ and $d = |y_3|$ in Figure 58.5 increase, then the interval in Figure
58.10, in which $\alpha$ varies, decreases. This implies:
The higher the point of rotation O lies above the rocking beam,
the more stable the equilibrium state of the scales becomes.

[Figure 58.10: graph of $V$ with the energy level $E_0$]
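The size of the admissible $\alpha$-interval can be made explicit. As a sketch under assumptions of our own: if condition (4) holds and $|x_i|\cos\gamma_i = d$, then expanding the cosines in $V$ gives $V(\alpha) = \beta(1 - \cos\alpha)$ with $\beta = d(m_1 + m_2)g$, so that $E_0 - V(\alpha) \geq 0$ means $|\alpha| \leq \arccos(1 - E_0/\beta)$. The function names below are hypothetical.

```python
import math

G = 9.81  # acceleration due to gravity in m/s^2


def beta(m1, m2, d):
    """beta = V''(0) = d*(m1 + m2)*g for the balanced scales."""
    return d * (m1 + m2) * G


def turning_angle(beta_value, e0):
    """Largest |alpha| with E0 - V(alpha) >= 0, for V = beta*(1 - cos(alpha)).

    Assumes 0 <= E0 < 2*beta, i.e. an oscillation rather than a full turn.
    """
    if not 0.0 <= e0 < 2.0 * beta_value:
        raise ValueError("expects 0 <= E0 < 2*beta")
    return math.acos(1.0 - e0 / beta_value)


E0 = 0.001                                    # same total energy in both cases
low_pivot = turning_angle(beta(2.0, 3.0, 0.1), E0)
high_pivot = turning_angle(beta(2.0, 3.0, 0.2), E0)
print(low_pivot, high_pivot)
```

For the same total energy, the larger value of $d$ gives the smaller interval, which is the quantitative content of the statement above.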
Finally, we will use our simple example to explain the basic ideas of
Lagrangian and Hamiltonian formalism.
(vi) Lagrangian formalism. We consider the Lagrange function
$$L(\alpha, \dot\alpha) = T - U = 2^{-1} \mu \dot\alpha^2 - V(\alpha).$$
The equations of motion are then the so-called Lagrangian equations
$$\frac{d}{dt} L_{\dot\alpha} = L_{\alpha}, \tag{14}$$
i.e.,
$$\mu \ddot\alpha + V'(\alpha) = 0.$$
Hence multiplication with $\dot\alpha$ and integration imply the energy conservation
law (12).
(vii) Hamiltonian formalism. We define the generalized momentum
$p = L_{\dot\alpha} = \mu\dot\alpha$ and the Hamilton function
$$H(\alpha, p) = p\dot\alpha - L = (p^2/2\mu) + V(\alpha).$$
One immediately observes that $H$ corresponds to the energy. The equation of
motion (14) then assumes the symmetric form
$$\dot p = -H_{\alpha}, \quad \dot\alpha = H_p. \tag{15}$$
This is a so-called Hamiltonian system.
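A Hamiltonian system of this type, with $\dot p = -V'(\alpha)$ and $\dot\alpha = p/\mu$, can be integrated step by step. The sketch below uses the symplectic Euler scheme with the quadratic approximation $V(\alpha) = \beta\alpha^2/2$; the scheme, the function names, and the numerical values of $\mu$ and $\beta$ are our own illustrative choices, not taken from the text.

```python
def symplectic_euler(alpha, p, mu, dV, dt, steps):
    """Integrate p' = -V'(alpha), alpha' = p/mu with the symplectic Euler scheme."""
    for _ in range(steps):
        p = p - dt * dV(alpha)          # momentum update uses the old angle
        alpha = alpha + dt * p / mu     # angle update uses the new momentum
    return alpha, p


def hamiltonian(alpha, p, mu, V):
    """H(alpha, p) = p^2/(2 mu) + V(alpha)."""
    return p * p / (2.0 * mu) + V(alpha)


MU, BETA = 0.35, 4.905                  # assumed sample values


def V(a):
    return 0.5 * BETA * a * a


def dV(a):
    return BETA * a


h_start = hamiltonian(0.1, 0.0, MU, V)
alpha_end, p_end = symplectic_euler(0.1, 0.0, MU, dV, 1e-3, 5000)
h_end = hamiltonian(alpha_end, p_end, MU, V)
print(h_start, h_end)
```

The value of $H$ is not reproduced exactly by the discrete scheme, but it stays close to its initial value over many periods, which reflects the statement that $H$ corresponds to the conserved energy.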


(viii) Geometrization of mechanics. In Figure 58.6 we describe the position
of the scales through $(x_1, x_2)$. This is a point in a six-dimensional space
$$V_6 = V_3 \times V_3.$$
This form of description is very convenient if, for example, one wants to define
the work $W(t)$ and the kinetic energy $T$. In fact, formula (6) implies that all
possible positions only depend on a single parameter $\alpha$. One may think of it
as the angle which describes the position of points on the unit circle. Hence,
the so-called configuration space $C$, i.e., the set of all possible positions $(x_1, x_2)$,
forms a curve in $V_6$. More precisely, $C$ is a one-dimensional $C^\infty$-manifold,
which is diffeomorphic to the unit circle. A motion
$$x_i = x_i(t)$$
corresponds to a curve in $C$, where $(\dot x_1(t), \dot x_2(t))$ is the tangent vector to this
curve. The vectors $(v_1, v_2)$ which appear in the principle of virtual power
(9) are precisely the tangent vectors to $C$ at the point $(x_1(t_0), x_2(t_0))$. The space
$V_6$ only plays the role of an auxiliary space. Important for the motion of the
system is the configuration space $C$. From a satisfactory mathematical theory
for mechanics one would expect that it is a geometrical theory for the manifold
$C$, i.e., the theory is independent of the choice of the parametrization $\alpha$. The
most elegant geometrical formulation of mechanics is obtained by formulating
and studying the Hamiltonian system (15) in an invariant, i.e., parameter
independent way, in the context of symplectic geometry on the cotangent
bundle of $C$. This will be discussed in Part V.
In all parts of physics one observes a trend towards a geometrization. This
is the realization of Einstein's program of finding a unification of physics. An
immediate and more formal reason for this is that physicists perform their
measurements in coordinate systems. But, of course, the physical essence of
the phenomena should be independent of the choice of the coordinate system.
A second and more profound reason is the following. The general theory of
relativity, as well as the modern gauge field theories, have led to the con-
sequence that all physical interactions can be described through the curva-
ture of suitable manifolds, and a unified theory of matter should be sought in
this direction.
The methods, demonstrated in this section by using the scales as an example,
can equally well be applied to all static constructions in technology. Naturally,
the computations become more complex if the number of degrees of freedom
of the system increases.

58.4. Kepler's Laws and a Look at the History of Astronomy

It is a sign for the power of mathematics that it is able to reproduce such a fine
and superior process as the motion of celestial bodies, that it has words and
symbols enough to determine the orbit of Jupiter for the next one hundred years
from amongst all these infinitely many orbits which one can imagine, and this
with an impressive accuracy.
Erich Kähler (1941)

It seems that a look at the stellar sky has fascinated people since the early
days and that it was one of the main reasons which led them to think about
the meaning of the universe and its scientific and mathematical description.
The ingenuity of astronomers and astrophysicists, who have gathered our
present knowledge about the universe, is admirable, and should shame all
master detectives in the world's literature. A first culminating point was the
discovery of the laws of planetary motions by the Prague astronomer and
mathematician Johannes Kepler (1571-1630) during the years from 1609 to
1619. At this time Kepler studied a tremendous amount of numerical data,
which had been collected by Tycho Brahe (1546-1601). The laws are:

(i) The planets move on elliptic orbits with the sun at one focus.
(ii) The line segment joining a planet and the sun sweeps out equal areas in
equal times.
(iii) The squares of the periods of revolution of two planets about the sun
are proportional to the cubes of the semimajor axes of the ellipses.
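Law (iii) is easy to test on data. The following sketch uses rounded, commonly quoted values for the semimajor axes (in astronomical units) and the periods of revolution (in years); these numbers are not taken from the text and are stated here as an assumption. In these units the ratio $T^2/a^3$ should be close to 1 for every planet.

```python
# Rounded values (assumed, not from the text): semimajor axis in AU, period in years.
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def kepler_ratio(a_au, period_years):
    """Return T^2 / a^3; Kepler's third law predicts the same value for all planets."""
    return period_years ** 2 / a_au ** 3


ratios = {name: kepler_ratio(a, t) for name, (a, t) in PLANETS.items()}
for name, ratio in ratios.items():
    print(f"{name:8s} T^2/a^3 = {ratio:.4f}")
```

The small deviations from 1 come from the rounding of the input data and, as discussed in this section, from the fact that the laws hold only approximately.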
These laws are true with respect to a Cartesian coordinate system $\Sigma$ with
the sun at its origin and whose axes are firmly connected with the stellar
sky. A vague impression of Kepler's tremendous scientific achievement is
obtained by noting that all the numerical data originated from the moving
planet Earth, i.e., in order to obtain (i) to (iii), Kepler first had to transform
them to $\Sigma$. This step was never taken by the great astronomers of antiquity.
For example, it was important to observe that Copernicus' (1473-1543)
hypothesis about a circular orbit of the planet Mars led to an average error
of 8 minutes of arc. Laws (i) to (iii) are of a kinematic nature, i.e., they describe
the motion, but not its cause. Isaac Newton's (1643-1727) path-breaking idea
was then to recognize that (i) to (iii) follow from a universal law (law of gravity)
and a general equation of motion (see Section 58.9). Newton based his work
on Kepler's results and Galilei's (1564-1642) observation that all bodies fall
at the same rate, i.e., receive a constant acceleration. This last observation led
Newton to assume that on the surface of the earth a gravitational force exists
which causes the free fall. Because of the interactions between the planets, (i)
to (iii) are only approximately true. Until 1781 the only known planets were
Mercury, Venus, Earth, Mars, Jupiter, and Saturn. In 1781, Herschel discovered
Uranus with a telescope and in 1811, in connection with a prize
problem of the Paris Academy, Delambre collected numerical data about the
motion of Uranus. It was noticed that several times during the previous
one hundred years, Uranus had been registered as a fixed star. But the
observed and the calculated data did not completely match each other. It was
expected that the deviations were caused by a still unknown planet. In 1845
and 1846, after long and complicated computations, two young astronomers,
the Englishman Adams (1819-1892) and the Frenchman Leverrier (1811-1877),
independently found the orbit of a new planet, called Neptune. Later,
in 1846, Galle (1812-1910) at the Berlin astronomical observatory discovered
it by following the numerical data contained in a letter by Leverrier. Jacobi
then wrote: "One can only admire, how it is possible to obtain such precise
results from so few and uncertain results. Those who call this discovery
accidental, should also be encouraged to make such accidental discoveries
themselves."
In 1930, the planet Pluto was discovered at the Flagstaff astronomical
observatory in Arizona (U.S.A.) as a result of a perturbation calculation for
Neptune. The following numbers should illustrate the orders of magnitude.
To get some idea of our solar system, we consider in Table 58.1 a model to
the scale of $1 : 10^9$. The sun then has a diameter of 1.4 m. The distance from
the moon to the earth equals about 30 diameters of the earth, i.e., 0.4 m in this
model.

Table 58.1. Model of the Solar System to the Scale of 1 m = $10^6$ km.

Planet     Diameter   Comparison   Distance from the sun
Mercury      5 mm     pea             58 m
Venus       12 mm     cherry         108 m
Earth       13 mm     cherry         149 m
Mars         7 mm     pea            229 m
Jupiter    143 mm     coconut        778 m
Saturn     121 mm     coconut       1400 m
Uranus      50 mm     apple         2900 m
Neptune     53 mm     apple         4500 m
Pluto       10 mm     cherry        5900 m

The mass of the earth is
$$m_{\text{earth}} = 6 \cdot 10^{24}\ \mathrm{kg}.$$
Moreover, one roughly has
$$m_{\text{sun}} : m_{\text{earth}} : m_{\text{moon}} = 3 \cdot 10^5 : 1 : 10^{-2}.$$
Table 58.2 contains a survey of the huge distances in the universe. Our
Milky Way has the form of a lens with a diameter of $10^5$ light years and a
thickness of maximally $1.5 \cdot 10^4$ light years. The sun is located almost at the
boundary of the Milky Way and travels around its center with 285 km/s. This
is about ten times as fast as the earth travels around the sun. For one
revolution about the center of the galaxy the sun takes about $200 \cdot 10^6$ years.
Since its creation $5 \cdot 10^9$ years ago, the sun has traveled only 25 times around
the center of the galaxy. Our galaxy consists of approximately $10^{11}$ suns with
a total mass of $2.5 \cdot 10^{11}$ sun masses, whereby 80 percent is located in the
center, and a black hole has probably been discovered there recently. The
number of galaxies is estimated as $10^{11}$ if the closed cosmological model
with finite volume is assumed. In the open model the volume is infinite (see
Chapter 76).

Table 58.2

Object                                   Travel time of light from the sun to the object
Earth                                    8 1/3 minutes
Pluto                                    5 1/2 hours
Next fixed star ($\alpha$ Centauri)      4 years
Sirius                                   9 years
Center of the Milky Way                  $3 \cdot 10^4$ years
Next major galaxy (Andromeda nebula)     $7 \cdot 10^5$ years
Quasars                                  $10^{10}$ years (approximate age of the universe)
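The first two rows of Table 58.2 can be cross-checked against the distances of Table 58.1, where 1 m in the model corresponds to $10^6$ km. The sketch below assumes the vacuum speed of light 299 792.458 km/s, which is not quoted in the text; the function name is hypothetical.

```python
C_KM_PER_S = 299_792.458  # speed of light in km/s (assumed value, not from the text)


def light_travel_seconds(distance_km):
    """Time in seconds that light needs to travel the given distance."""
    return distance_km / C_KM_PER_S


# Distances from Table 58.1: Earth 149 m and Pluto 5900 m in the model.
earth_minutes = light_travel_seconds(149e6) / 60.0
pluto_hours = light_travel_seconds(5900e6) / 3600.0
print(f"Earth: {earth_minutes:.2f} minutes, Pluto: {pluto_hours:.2f} hours")
```

The computed values are roughly 8.3 minutes for the earth and roughly 5.5 hours for Pluto.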
Already in 1802 Newton's theory of gravity was a great triumph. One year
earlier Piazzi, in Palermo, discovered the planetoid Ceres as a star of magnitude
eight and was able to follow its orbit for 9 degrees before losing it. The
young Gauss (1777-1855) then computed the entire orbit by employing new
methods of the calculus of observations; and using this result, Olbers rediscovered
Ceres in 1802.
As a consequence of the great success perturbation calculus has had in the
discovery of Neptune in 1846, a perturbation of the orbit of Mercury in the
form of a motion of the perihelion of $\Delta\varphi = 43''$ during 100 years greatly
intrigued astronomers, and at first led to the hypothesis of a planet Vulcan.
But it could never be observed. Today we know that the motion of the
perihelion cannot be explained with Newton's theory of gravity, but is a
consequence of the general theory of relativity, which was developed by
Einstein in 1915. From this theory the above value follows very accurately
(see Section 76.9).
The development of mechanics has greatly influenced the development of
mathematics. Newton (1643-1727) himself, in searching for a mathematical
formulation of his theory, created the differential and integral calculus. He did
this independently of Leibniz (1646-1716). Analytical mechanics was further
developed by Euler (1707-1783) and Lagrange (1736-1813), who both also
developed variational calculus as a mathematical discipline, and by Hamilton
(1805-1865) and Jacobi (1804-1851).
The famous n-body problem is the subject of Section 58.9c, and cosmological
questions, which presently are at the center of interest, will be discussed
in Section 58.15 and, more thoroughly, in Chapter 76.

58.5. Newton's Basic Equations


Galilei (1564-1642), in formulating his law of falling bodies, could do this by
using the old algebra. But for Newton, who created the general principles of
mechanics from Galilei's dynamics, including his now famous law of gravity, it
was clear from the beginning that only a mathematics containing the concept
of infinitesimally small changes would be able to draw conclusions from this new
mechanics.
Such a form of mathematics was known to Newton (1643-1727) since his
early days, but his notes on this remained in the hands of his closest friends,
and even when, during the height of his life, he published his new concepts about
the universe, "Principles of Natural Philosophy", he carefully avoided this new
calculus. He probably feared that his ideas would lose some of their strength by
formulating them in a completely new mathematical language. In this regard he
underestimated his fellow scientists; because when Leibniz (1646-1716), independently
of Newton, formulated the infinitesimal calculus, using a more elegant
formalism than Newton, this mathematics received an enthusiastic reception;
however, "Newton's principles", despite the brash presentation, were well received
on the European continent and translated back into the language of
differential and integral calculus.
Erich Kähler (1941)

According to Newton, the motion of $n$ mass points with position vectors $x_i$,
$i = 1, \ldots, n$, and masses $m_i$ is described by the fundamental differential
equations
$$m_i \ddot x_i = K_i, \quad i = 1, \ldots, n, \tag{16}$$
with forces $K_i = K_i(x_1, \ldots, x_n, \dot x_1, \ldots, \dot x_n, t)$. These are vector functions. In
concrete situations it is then up to the physicist to determine the force field
$K_i$. If position and velocity of the system at time $t_0$ are given, i.e.,
$$x_i(t_0) = a_i, \quad \dot x_i(t_0) = b_i, \quad i = 1, \ldots, n, \tag{17}$$
and if every $K_i$ is of class $C^1$ in a neighborhood of $(a_1, \ldots, a_n, b_1, \ldots, b_n, t_0)$,
then Theorem 3.A implies that in a neighborhood of $t_0$, equation (16) has a
unique solution with (17).

EXAMPLE 58.5 (Free Fall). Let O be a point on the surface of the earth and $e_3$
a unit vector perpendicular to the surface of the earth at O. We choose a
Cartesian coordinate system $e_1, e_2, e_3$ with its origin at O. From experience
we know that in a neighborhood of O, a gravitational force
$$K = -m g e_3$$
acts on a point of mass $m$. The equation of motion
$$m\ddot x = K, \quad x(0) = h e_3, \quad \dot x(0) = -v e_3$$
has the unique solution
$$x(t) = -2^{-1} g t^2 e_3 + x(0) + t\,\dot x(0). \tag{18}$$
For $\dot x(0) = 0$ this is the law of the free fall, discovered by Galilei. Conversely,
by using $m\ddot x = K$, one obtains the form of the gravitational force $K$ from (18).
For arbitrary $x(0)$ and $\dot x(0)$, formula (18) yields the motion of a stone, thrown
from an initial position $x(0)$ with the initial velocity $\dot x(0)$. From experiments
one finds for the gravitational acceleration $g = 9.81\ \mathrm{m/s^2}$.
Let $x = \xi_i e_i$ and $\zeta = \xi_3$. The function
$$U(x) = m g \zeta$$
is called the potential energy of the gravitational force. It is equal to weight
$mg$ times height $\zeta$ above the surface of the earth. For every motion $x = x(t)$
in the gravitational field, the energy conservation law holds:
$$\tfrac{1}{2} m \dot x^2(t) + U(x(t)) = \text{const} = E_0.$$
This follows, because differentiation with respect to $t$ of the left-hand side
yields
$$m \ddot x \dot x + U_x \dot x = (m\ddot x - K)\dot x = 0.$$

[Figure 58.11] [Figure 58.12]

In Section 58.8 we will show that in first-order approximation the force
$K = -m g e_3$ follows from Newton's general force of gravity.
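Formula (18) and the energy conservation law of this example can be evaluated directly. The sketch below represents vectors as tuples of components with respect to $e_1, e_2, e_3$; the function names and the sample initial data are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2


def position(t, x0, v0):
    """Formula (18): x(t) = -g t^2 e_3 / 2 + x(0) + t x'(0), as a 3-tuple."""
    return (x0[0] + t * v0[0],
            x0[1] + t * v0[1],
            x0[2] + t * v0[2] - 0.5 * G * t * t)


def velocity(t, v0):
    """Time derivative of (18)."""
    return (v0[0], v0[1], v0[2] - G * t)


def energy(m, t, x0, v0):
    """Kinetic energy plus potential energy U = m*g*zeta along the motion (18)."""
    x = position(t, x0, v0)
    v = velocity(t, v0)
    return 0.5 * m * sum(c * c for c in v) + m * G * x[2]


x0, v0 = (0.0, 0.0, 10.0), (1.0, 0.0, 3.0)   # a stone thrown from height 10 m
e_start = energy(2.0, 0.0, x0, v0)
e_later = energy(2.0, 1.3, x0, v0)
print(e_start, e_later)
```

Both energies agree up to rounding, as the conservation law requires.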
EXAMPLE 58.6 (Harmonic Oscillator). In first-order approximation consider
the most general motion
$$x(t) = \xi(t)\,e_1$$
of a mass point of mass $m$ on a straight line through the vector $e_1$ with force
$K(\xi) = k(\xi)\,e_1$ (Fig. 58.11). The equation of motion $m\ddot x = K$ here becomes
$$m\ddot\xi = k(\xi). \tag{19}$$
We choose a function $U$ with
$$U'(\xi) = -k(\xi)$$
and call $U$ the potential energy. We find that $K = -U_x$, i.e., $K = -\operatorname{grad} U$.
For every solution of (19) we obtain
$$\tfrac{1}{2} m \dot\xi^2(t) + U(\xi(t)) = \text{const} = E_0. \tag{20}$$
This is easily checked by differentiating (20). We call $\tfrac{1}{2} m \dot\xi^2$ the kinetic energy
and $E_0$ the energy. Equation (20) shows that the motion is possible only for
$$U(\xi) \leq E_0$$
(Fig. 58.12). Thereby $E_0$ is determined by prescribing $\xi(t_0)$ and $\dot\xi(t_0)$. Series
expansion yields
$$k(\xi) = k_0 - k_1 \xi + O(\xi^2), \quad \xi \to 0.$$
By only considering linear terms, we obtain
$$m\ddot\xi(t) - k_0 + k_1 \xi(t) = 0.$$
We now assume that $K$ is a restoring force as in the case of a spring, i.e., $k_1 > 0$.
Moreover, we assume that the force is equal to zero if the deflection $\xi$ is equal
to zero, i.e., $k_0 = 0$. Then the force $K$ and the potential energy $U$ are equal to
$$K = -k_1 \xi\,e_1, \quad U = 2^{-1} k_1 \xi^2.$$
This important physical system is called a harmonic oscillator. The equation
of motion becomes
$$\ddot\xi + \omega^2 \xi = 0, \quad \omega = \sqrt{k_1/m},$$
and the solution is
$$\xi(t) = \xi(0) \cos \omega t + \frac{\dot\xi(0)}{\omega} \sin \omega t. \tag{21}$$
These are periodic oscillations with so-called angular frequency $\omega$. Since the
trigonometric functions have period $2\pi$, the time needed for one oscillation is
equal to
$$T = 2\pi/\omega.$$
The number $\nu = 1/T$ is called the frequency. It is equal to the number of
oscillations during one unit of time.
The harmonic oscillator is the most simple oscillating system. More complicated
oscillating systems are often modeled by superposition of harmonic
oscillators (cf. Problem 58.11).
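The solution (21) and the period $T = 2\pi/\omega$ can be checked numerically. In the following sketch the function name `oscillator` and the sample values of $m$, $k_1$ and the initial data are hypothetical; the equation of motion is tested with a central difference quotient.

```python
import math


def oscillator(m, k1, xi0, xi_dot0):
    """Return (omega, period, xi) for m*xi'' + k1*xi = 0 with the given initial data."""
    omega = math.sqrt(k1 / m)
    period = 2.0 * math.pi / omega

    def xi(t):
        # Formula (21)
        return xi0 * math.cos(omega * t) + xi_dot0 / omega * math.sin(omega * t)

    return omega, period, xi


omega, period, xi = oscillator(1.0, 4.0, 0.5, 1.0)

# Residual of xi'' + omega^2 * xi at t = 0.7, using a central difference for xi''.
h, t = 1e-4, 0.7
xi_ddot = (xi(t + h) - 2.0 * xi(t) + xi(t - h)) / h ** 2
residual = xi_ddot + omega ** 2 * xi(t)
print(omega, period, xi(period), residual)
```

After one period the deflection returns to its initial value, and the residual of the equation of motion is zero up to discretization and rounding errors.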

The mass $m$ describes the inertia of the body and is measured in kilograms.
By definition, $10^3\ \mathrm{cm}^3$ of water have the mass of 1 kg at a temperature of 4°
Celsius. On the earth, masses are measured by comparison on scales using the
lever principle (4). Masses of celestial bodies are determined from their motions
using the gravitational law (see Section 58.8).
In Newtonian mechanics one assumes that the mass is constant and does
not depend on the motion and on the system of reference. An exception is
formed by rockets, which continuously eject mass, and whose equations of
motion are
$$\dot p = K \tag{22}$$
with momentum $p = m\dot x$. One should also note that the basic equation (16)
only describes motions of points which are not subject to constraints. The case
of constraints will be discussed in Section 58.10 (principle of the least constraint
of Gauss).
Forces that depend on velocities are called forces of friction. The unit of
the force is the Newton N. It is
$$1\ \mathrm{N} = 1\ \mathrm{kg\,m/s^2}.$$

58.6. Changes of the System of Reference and the Role of Inertial Systems
Physical processes are described by space and time coordinates, i.e., relative
to systems of reference. For a physical law one requires that it is precisely
stated for which systems of reference it is valid and how it is changed under
changes of the system of reference. Moreover, physicists are very much interested
in equivalent systems of reference. Thereby two systems $\Sigma$ and $\Sigma'$ are
called equivalent if and only if the form of the physical processes is the same
for both systems. More precisely, this means: If two physicists build the same
measuring device in $\Sigma$ and $\Sigma'$ and if for both systems the same initial and
boundary conditions are satisfied, then the physicists obtain the same results
of measurement. Not all systems of reference are equivalent. Consider, for
example, a physicist P in a closed box B, who drops a stone with initial velocity
zero. Then P will obtain very different results depending on whether B stands
on the surface of the earth, rides in an accelerated lift, or travels in a spacelab
(weightlessness).

[Figure 58.13]
In the following we discuss the difficulties connected with the system of
reference for the basic equation (16). To this end we introduce an auxiliary
mathematical construction, the absolute space. This means that, as in Section
58.1, we choose a fixed three-dimensional vector space $V_3$ as the absolute space
together with a fixed coordinate origin O. Every motion $x = x(t)$ with $x(t) \in V_3$
for all $t$, where $x(t)$ is attached at O, is then called an absolute motion.
Following Newton, we assume that the basic equation (16) holds true for the
absolute space. We then consider the situation of Figure 58.13, where $\Sigma'$ is a
moving Cartesian coordinate system with origin O' and axial vectors $e_i(t)$,
$i = 1, 2, 3$, whose absolute motion is given by the equation
$$\dot e_i(t) = \omega(t) \times e_i(t).$$
Suppose the absolute motion of O' is $x = a(t)$. We want to find out how
equation (16) looks in $\Sigma'$.
A physicist P' in $\Sigma'$ cannot observe the absolute motion $x = x(t)$ of a given
point, but only its relative motion $y = y(t)$ with
$$x(t) = a(t) + y(t). \tag{23}$$
We set $y(t) = \eta_i(t)\,e_i(t)$. For P' the axes $e_i$ are fixed. Thus P' measures the velocity
$$\dot y(t) = \dot\eta_i(t)\,e_i(t)$$
and the acceleration
$$\ddot y(t) = \ddot\eta_i(t)\,e_i(t).$$
Differentiation of (23) yields $\dot x = \dot a + \dot\eta_i e_i + \eta_i \dot e_i$, hence
$$\dot x = \dot a + \dot y + \omega \times y, \tag{24}$$
and
$$\ddot x = \ddot a + \ddot y + L(y, \dot y, \omega, \dot\omega), \tag{25}$$
with
$$L = \omega \times (2\dot y + \omega \times y) + \dot\omega \times y.$$
From $m_i \ddot x_i = K_i$ in the absolute space, we find as the equation of motion in
the new system $\Sigma'$
$$m_i \ddot y_i = K_i', \quad i = 1, \ldots, n, \tag{26}$$
with
$$K_i' = K_i - m_i \ddot a - m_i L(y_i, \dot y_i, \omega, \dot\omega).$$
For constant $\omega$ the motion (24) corresponds to a translation by $a$ and a
rotation about the axis $\omega$ with angular velocity $|\omega|$ (see Example 58.1). If $\omega(t)$
is time-dependent, then the axis of rotation and the angular velocity both
change in time. According to Section 58.14, equation (24) describes the most
general motion of a Cartesian coordinate system in the absolute space. We
obtain the important statement:

The transformed basic equations (26) have the same structure in an arbitrarily
moving Cartesian coordinate system as do Newton's basic equations (16) in the
absolute space. Only the forces need to be changed. In addition to $K_i$ we have
that $K_i'$ contains so-called inertial forces $-m_i \ddot a - m_i L$. Thus the basic equations
(16) can be used in every system of reference if the forces are suitably chosen.
If, for example, $\Sigma'$ performs a constant rotational motion, i.e., $a(t) = 0$ and
$\omega(t) = \text{const}$, then we obtain in $\Sigma'$ the equation of motion
$$m_i \ddot y_i = K_i - m_i\,\omega \times (\omega \times y_i) - 2 m_i (\omega \times \dot y_i)$$
for $i = 1, \ldots, n$. The second (or third) term on the right-hand side is the
centrifugal force (or Coriolis force). Those are inertial forces. In contrast to
the centrifugal force, the Coriolis force vanishes for a body which is at rest in
$\Sigma'$, i.e., for $\dot y_i = 0$.
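The two inertial forces of a uniformly rotating system can be evaluated for concrete vectors. The sketch below represents vectors by their components in $\Sigma'$; the helper names and the sample values are hypothetical.

```python
def cross(a, b):
    """Vector product a x b of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def scale(s, a):
    """Multiply the 3-tuple a by the scalar s."""
    return (s * a[0], s * a[1], s * a[2])


def inertial_forces(m, omega, y, y_dot):
    """Return (centrifugal, Coriolis) = (-m w x (w x y), -2 m w x y')."""
    centrifugal = scale(-m, cross(omega, cross(omega, y)))
    coriolis = scale(-2.0 * m, cross(omega, y_dot))
    return centrifugal, coriolis


omega = (0.0, 0.0, 2.0)                       # rotation about the e_3 axis
at_rest = inertial_forces(1.5, omega, (1.0, 0.0, 0.0), (0.0, 0.0, 0.0))
moving = inertial_forces(1.5, omega, (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(at_rest)
print(moving)
```

For the body at rest in $\Sigma'$ only the centrifugal force remains, and it points away from the axis of rotation; for the moving body the Coriolis force is perpendicular to both $\omega$ and the relative velocity.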
We now consider an important special case. We call $\Sigma'$ an inertial system
if and only if $\Sigma'$ performs only a purely translatory motion in the absolute
space with constant velocity, i.e., $\dot a(t) = \text{const}$ and $\omega(t) = 0$. Then (26)
becomes
$$m_i \ddot y_i = K_i, \quad i = 1, \ldots, n, \tag{27}$$
where, at the same time, $K_i$ is the force which acts on the particle in the
absolute space. We obtain: In all inertial systems the same force is applied to
the particle, and the equation of motion has the same form (principle of
relativity in the restricted sense). In classical mechanics one then assumes that
the following stronger postulate holds.

Galilean Principle of Relativity 58.7. For mechanical processes all inertial
systems are equivalent.

We will show that this principle greatly restricts the number of possible
force fields. Consider the motion of two mass points in the absolute space
under the influence of forces $K_i = K_i(x_1, x_2)$, i.e.,
$$m_i \ddot x_i = K_i(x_1, x_2), \qquad i = 1, 2. \tag{28}$$
After a transformation we find in the inertial system $\Sigma'$ the equation
$$m_i \ddot y_i = K_i(y_1 + a, y_2 + a). \tag{29}$$
The Principle of Relativity 58.7 then requires that the two differential
equations (28) and (29) have the same solutions for identical initial conditions
$$x_i(0) = y_i(0), \qquad \dot x_i(0) = \dot y_i(0), \qquad i = 1, 2.$$
This is the case if
$$K_i = A_i(x_1 - x_2),$$
i.e., if the forces only depend on the difference $x_1 - x_2$.
For the previous coordinate transformation we have either explicitly or
implicitly used the following three facts:
(a) There exists an absolute space.
(b) There exists an absolute time, i.e., there exists a rule of measurement to
set all clocks in such a way that under changes between two moving
coordinate systems the time remains unchanged.
(c) The mass mi remains unchanged under changes between two moving
systems of reference.
The correctness of these hypotheses can only be tested in physical experiments.
In the context of the special theory of relativity, (a) to (c) are not correct.
In fact, physicists were not able to find an absolute motion, because physical
motion can only be observed relative to a system of reference. Therefore, the
construction of the absolute space is only a useful mathematical tool. Actually,
only inertial systems exist in nature. Experience in classical celestial mechanics
shows that, as a good approximation of an inertial system, one can choose $\Sigma_{\text{sun}}$,
where the origin is the center of mass of our solar system and the axes are
firmly connected with the stellar sky (see Section 58.9). Then $\Sigma'$ is an inertial
system precisely if it performs a translatory motion with constant velocity
with respect to $\Sigma_{\text{sun}}$. The special theory of relativity, which will be discussed
in Chapter 75, begins with Einstein's principle of relativity: All inertial systems
are equivalent, not only with respect to mechanical processes but with respect
to all possible physical processes. Furthermore, the velocity of light in the
vacuum is the same for all inertial systems. This implies that (b) is no longer
valid. It also follows that in relativistic mechanics, which is based on
Einstein's principle of relativity, (c) is no longer true. But the effects of relativity
theory only become noticeable for velocities which are close to the velocity of light.
In atomic regions, classical mechanics must be replaced with quantum
mechanics. There, the concept of the classical orbit and the classical velocity
is replaced with probability theoretical expectation values (see Section 59.12).
But in this chapter we will work classically. In particular, we assume (a) to (c).

58.7. General Point System and its Conserved Quantities

We consider the motion of $n$ mass points in an arbitrary coordinate system,
i.e., we consider the Newtonian basic equation
$$m_i \ddot x_i = K_i, \qquad i = 1, \dots, n, \tag{30}$$
for given forces $K_i$ and masses $m_i$. To study this differential equation we
introduce a number of fundamental quantities. For a fixed motion $x_i = x_i(t)$
we define the kinetic energy
$$T = \sum_i m_i \dot x_i^2/2,$$
the total momentum
$$P = \sum_i m_i \dot x_i,$$
the angular momentum
$$N = \sum_i m_i (x_i \times \dot x_i),$$
the torque
$$M = \sum_i x_i \times K_i,$$
the work during the time interval $[t_0, t]$
$$W(t) = \int_{t_0}^{t} \sum_i K_i \dot x_i \, dt,$$
and the power
$$\dot W(t) = \sum_i K_i \dot x_i.$$
Moreover, we call
$$m = \sum_i m_i$$
the total mass and
$$y = \frac{1}{m} \sum_i m_i x_i$$
the center of mass of the system. All sums run from $i = 1$ to $n$.


We now assume that the forces admit a decomposition of the form
$$K_i = K_i^{(p)} + K_i^{(q)},$$
where a function $U$ exists with
$$K_i^{(p)}(x_1, \dots, x_n) = -U_{x_i}(x_1, \dots, x_n)$$
for all $i$. Here $U$ is called the potential or potential energy, and $U_x$ denotes
the F-derivative of $U$ with respect to $x$. In Cartesian coordinates $x = \sum_j \xi_j e_j$ we
have
$$U_x = \sum_j U_{\xi_j} e_j.$$

Note that in a simply connected region the $C^1$-function $U$ is determined only
up to an additive constant by the $C^2$-forces $K_i^{(p)}$. This constant may be
specified through a gauge. Gauge invariance in classical mechanics means that
the gauge does not cause any observable effects. This is because in the
equations of motion there occur only forces which are independent of the gauge.
In modern elementary particle theory, however, different gauges of the fields
are of central interest (gauge field theories). For example, the photon can be
interpreted as the elementary particle which carries the gauge information of
the electron-positron field. This naturally implies that the photon has no rest
energy. This will be discussed in Part V.
We call $K_i^{(p)}$ a conservative force. The knowledge of the potential energy $U$
greatly simplifies the computation of the work $W$, because then
$$W(t) = U(X(t_0)) - U(X(t)) + \int_{t_0}^{t} \sum_i K_i^{(q)} \dot x_i \, dt,$$
with $X = (x_1, \dots, x_n)$. This yields the following important interpretation of the
potential energy:

If all forces are conservative, i.e., $K_i^{(q)} = 0$ for all $i$, then the work $W$ done by
these forces is path independent and equal to the difference between the values
of the potential energy at the initial point and at the endpoint of the motion.

This motivates the expression potential energy in the sense of accumulated
energy. In particular, according to Example 58.5, the potential energy in the
gravitational field of the earth is equal to weight times height. When a stone
falls, we gain work, while lifting it requires work. This work
is always equal to the weight $mg$ times the height difference.

Theorem 58.A (Balance and Conservation Laws). For a solution of the basic
equation (30) we have the following:
(i) Energy balance: $(d/dt)(T + U) = \sum_i K_i^{(q)} \dot x_i$.
(ii) Angular momentum balance: $\dot N = M$.
(iii) Momentum balance: $\dot P = \sum_i K_i$.
(iv) Motion of the center of mass: $m \ddot y = \sum_i K_i$.

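PROOF. All claims are easily checked by differentiating. For example, we have
$$\frac{d}{dt}\bigl(T(t) + U(X(t))\bigr) = \sum_i \bigl(m_i \dot x_i(t) \ddot x_i(t) + U_{x_i}(X(t)) \dot x_i(t)\bigr) = \sum_i \bigl(K_i - K_i^{(p)}\bigr) \dot x_i = \sum_i K_i^{(q)} \dot x_i,$$
and also
$$\dot N = \sum_i m_i (\dot x_i \times \dot x_i + x_i \times \ddot x_i) = \sum_i x_i \times K_i = M. \qquad \Box$$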

Theorem 58.A has a number of consequences which are important not only for
mechanics. The following results can be extended
to much more general physical situations, as we shall see in later chapters.
In modern physics, the generalization to fields plays an important role and
will be discussed in Section 75.12 in connection with the relativistic
energy-momentum tensor.
Energy conservation law. The basic quantity
$$E(t) = T(t) + U(X(t))$$
is called total energy or simply energy of the motion at time $t$. The energy
balance (i) means that the change of energy in time is equal to the power of
the nonconservative forces. Integration yields
$$E(t) - E(t_0) = \int_{t_0}^{t} \sum_i K_i^{(q)} \dot x_i \, dt,$$
which shows that for positive work of nonconservative forces, energy is added
to the system. If no nonconservative forces are present, i.e., if all
$K_i^{(q)}$ are identically zero, then the most important physical theorem, the
energy conservation law
$$E(t) = \text{const},$$
is valid for the motion.

Momentum conservation law. If the total force is identically zero, then the
motion satisfies
P(t) = const.
Angular momentum conservation law. If the total torque is identically zero,
i.e., M(t) = 0, then the motion satisfies
N(t) = const.
Motion of the center of mass. Obviously, (iii) and (iv) in Theorem 58.A are
identical. According to (iv), the center of mass moves as if the entire mass
were concentrated there and as if all forces were acting on it. In the case of
momentum conservation the center of mass moves on a straight line with
constant velocity.
Necessary equilibrium condition. From (ii) and (iii) in Theorem 58.A we
obtain: If the system is in an equilibrium position, i.e.,
$$x_i(t) = \text{const} \quad \text{for all } i$$
is a solution of the basic equation (30), then the total force and total torque
vanish for this configuration.
As some important applications of the conservation laws above, we con-
sider:
(a) Planetary motion (Section 58.9).
(b) The rigid body (Section 58.13).

58.8. Newton's Law of Gravitation and Coulomb's Law of Electrostatics

58.8a. The Gravitational Law


We consider two mass points $P_1$ and $P_2$ in the absolute space with masses $m_1$
and $m_2$ and corresponding position vectors $x_1$ and $x_2$. According to Newton's
fundamental observation, a gravitational force $K$ acts from $P_1$ onto $P_2$ with
$$K = \frac{G m_1 m_2 (x_1 - x_2)}{|x_1 - x_2|^3}.$$
The direction of this force shows that it is an attracting force (Fig. 58.14).
Because of
$$|K| = \frac{G m_1 m_2}{r^2}$$
with $r = |x_1 - x_2|$ we find that this force is proportional to the product of the
masses and inversely proportional to the square of the distance. The quantity
$G$ is a universal constant and is called the gravitational constant. We have
$$G = 6.7 \cdot 10^{-11}\ \text{N m}^2/\text{kg}^2.$$
For fixed $x_1$ the gravitational force $K$ has the potential
$$U(x_2) = -\frac{G m_1 m_2}{|x_2 - x_1|},$$
i.e., $U = -G m_1 m_2/r$. In fact, we have
$$K = -U_{x_2}.$$
According to Newton's principle of
actio = reactio
we find that, conversely, the gravitational force $-K$ acts from $P_2$ onto $P_1$.

Figure 58.14
Proposition 58.8 (First-Order Approximation of the Gravitational Force). Let
$x_1$ and $x_2^0$ be fixed with $x_1 \ne x_2^0$ and let $x_2 = x_2^0 + x$. Moreover, we choose the
unit vector $e_3 = (x_2^0 - x_1)/|x_2^0 - x_1|$.
Then for $x \to 0$ we have
$$K = K_0 + O(|x|), \qquad U = U_0(x) + o(|x|)$$
with
$$K_0 = -g m_2 e_3, \qquad g = G m_1/|x_2^0 - x_1|^2, \qquad U_0 = \text{const} + g m_2 (x e_3).$$
Moreover, $U_0$ is the potential of $K_0$, i.e., $K_0 = -(U_0)_x$.

PROOF. This immediately follows from the Taylor expansion. $\Box$
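The order of the error in Proposition 58.8 can be checked numerically. The sketch below compares the exact gravitational force on $P_2$ with the constant force $K_0 = -g m_2 e_3$ for a small horizontal displacement and observes that halving the displacement roughly halves the error, as $O(|x|)$ suggests. The units ($G m_1 = 1$, $|x_2^0 - x_1| = 1$) and all names are assumptions of this illustration.

```python
import math

G_m1 = 1.0               # assumption: units with G * m1 = 1
m2 = 1.0
x1 = (0.0, 0.0, 0.0)
x2_0 = (0.0, 0.0, 1.0)   # reference point, so that e3 = (0, 0, 1)

def K(x):
    """Exact force on P2 located at x2_0 + x: K = -G m1 m2 (x2 - x1)/|x2 - x1|^3."""
    d = tuple(a + b - c for a, b, c in zip(x2_0, x, x1))
    r3 = math.sqrt(sum(c * c for c in d)) ** 3
    return tuple(-G_m1 * m2 * c / r3 for c in d)

g = G_m1 / 1.0 ** 2          # g = G m1 / |x2_0 - x1|^2
K0 = (0.0, 0.0, -g * m2)     # constant first-order force -g m2 e3

def error(h):
    """|K - K0| for the horizontal displacement x = (h, 0, 0)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(K((h, 0.0, 0.0)), K0)))

ratio = error(1e-2) / error(5e-3)
print(ratio)  # close to 2 for an error of first order in |x|
```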

Since $K$ only depends on the difference $x_1 - x_2$, the gravitational law above
is valid for any system of reference, according to Section 58.6. If, however, $\Sigma'$
is not an inertial system, then additional inertial forces occur.

Figure 58.15

EXAMPLE 58.9 (Special Case Earth and the Determination of Masses of
Celestial Bodies). We apply Proposition 58.8 to the earth (Fig. 58.15).
According to Problem 58.4, the gravitational force of the earth on a point $x_2$
with mass $m_2$ outside the earth is obtained by computing the gravitational
force which is exerted from the center of the earth $x_1$, assuming the total mass
of the earth $m_1$ is concentrated there. We set $x_1 = 0$. Moreover, we choose $x_2^0$
as in Figure 58.15 and use the first-order approximations $K_0$ and $U_0$ from
Proposition 58.8 in a neighborhood of the surface of the earth. This
corresponds to Example 58.5.
According to Example 58.5 the gravitational acceleration $g$ at the surface of the
earth can be determined from gravitational experiments. One obtains $g = 9.81\ \text{m/s}^2$.
The gravitational constant $G$ can be determined experimentally by using the
Cavendish torsion balance in order to measure the gravitational effect of balls. From
$$g = G m_1/|x_2^0|^2,$$
and the radius of the earth $|x_2^0| = 6378$ km one obtains the mass of the earth
$$m_1 = 6 \cdot 10^{24}\ \text{kg}.$$
The mass of the sun can be determined from the orbit of the earth and
Kepler's third law (40) below. If one knows the mass of the sun, then
analogously the mass of a planet can be determined from the planetary orbits and
(40). The mass of the earth can also be derived from (40) and the motion of
artificial satellites.
The rotating earth with angular velocity $\omega = 2\pi/\text{day} = 7.3 \cdot 10^{-5}\ \text{s}^{-1}$ is not
an inertial system (strictly speaking $\omega$ above is somewhat larger because of
the motion of the earth around the sun). Therefore, according to Section 58.6,
additional inertial forces occur at the surface of the earth, namely the
centrifugal force and the Coriolis force. These, however, are very small compared
with the gravitational force, and in first-order approximation they can be
neglected. Actually, the centrifugal force at the surface of the earth satisfies
the estimate
$$|m_2\, \omega \times (\omega \times x_2^0)| \le m_2 \omega^2 |x_2^0|.$$
Thus, at the surface of the earth we obtain for the ratio between the maximal
centrifugal force and the gravitational force $m_2 \omega^2 |x_2^0|/m_2 g \approx 3 \cdot 10^{-3}$.
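The numbers in Example 58.9 can be reproduced with a few lines of arithmetic. The sketch below uses the values of $g$, $G$, and the radius of the earth quoted in the text; the length of the day (86400 s) is an assumption added here.

```python
import math

G = 6.7e-11      # N m^2 / kg^2, value quoted in Section 58.8a
g = 9.81         # m / s^2
R = 6378e3       # radius of the earth in m
day = 86400.0    # assumption: length of one day in s

m_earth = g * R ** 2 / G        # from g = G m1 / R^2
omega = 2.0 * math.pi / day     # angular velocity of the rotating earth
ratio = omega ** 2 * R / g      # maximal centrifugal force / gravitational force

print(m_earth)  # mass of the earth in kg
print(omega)    # angular velocity in 1/s
print(ratio)    # dimensionless ratio of the two forces
```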

58.8b. Coulomb's Law

If, similarly as above, $P_1$ and $P_2$ are two points with charges $Q_1$ and $Q_2$ and
corresponding position vectors $x_1$ and $x_2$, then an electrostatic force $K$ acts
from $P_1$ onto $P_2$ with
$$K = -\frac{Q_1 Q_2 (x_1 - x_2)}{4\pi\varepsilon_0 |x_1 - x_2|^3},$$
and the dielectric constant $\varepsilon_0$ with
$$\frac{1}{4\pi\varepsilon_0} = 8.988 \cdot 10^9\ \text{N m}^2/\text{A}^2\text{s}^2.$$
This law was formulated by Coulomb in 1785, i.e., about one hundred years
after Newton's gravitational law.
The main difference between Coulomb's law and the gravitational law is
that in Coulomb's law a repelling force occurs for charges of equal sign, $Q_1 Q_2 > 0$,
and an attracting force occurs for charges of opposite sign, $Q_1 Q_2 < 0$. The similarity
of both laws, on the other hand, is the reason for the close relation between the
motion of planets around the sun and electrons around the atomic nucleus,
which in 1913 led to Bohr's atomic model (see Section 59.16). In Section 59.26
we show that for the motion of electrons in the atom the gravitational force
is very small compared to the electrostatic force and therefore plays no role.
The potential $U$ of $K$ is obtained similarly as in the previous section through
$$U(x_2) = \frac{Q_1 Q_2}{4\pi\varepsilon_0 |x_1 - x_2|}$$
for fixed $x_1$.
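To get a feeling for how small the gravitational force is compared to the electrostatic force inside an atom, one can compare the two laws for an electron and a proton. The sketch below uses the constants $G$ and $1/4\pi\varepsilon_0$ quoted in the text; the elementary charge and the two particle masses are standard reference values supplied here as assumptions.

```python
G = 6.7e-11        # N m^2 / kg^2, value quoted in Section 58.8a
k = 8.988e9        # 1/(4 pi eps0) in N m^2 / (A s)^2, value quoted above
e = 1.602e-19      # elementary charge in A s (assumed reference value)
m_e = 9.109e-31    # electron mass in kg (assumed reference value)
m_p = 1.673e-27    # proton mass in kg (assumed reference value)

# Both forces decay like 1/r^2, so their ratio does not depend on the distance.
ratio = (G * m_e * m_p) / (k * e * e)
print(ratio)  # gravitational force divided by electrostatic force
```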

58.9. Application to the Motion of Planets

We want to show which differential equation governs the planetary motion
in the solar system. Also, we want to derive Kepler's laws by explicitly solving
the two-body problem (sun and one planet).

58.9a. The n-Body Problem

We begin with the motion of $n$ mass points which are subject only to the
gravitational force (sun $x_1$ and $n - 1$ planets $x_2, \dots, x_n$). According to Newton
this motion is described in the absolute space by the equation
$$m_i \ddot x_i = K_i, \qquad i = 1, \dots, n, \tag{31}$$
with gravitational force
$$K_i = \sum_{j = 1,\, j \ne i}^{n} K_{ij}$$
and
$$K_{ij} = -G m_i m_j (x_i - x_j)/|x_i - x_j|^3.$$
This means that the force $K_i$ acting on the $i$th point is equal to the sum of the
gravitational forces acting from all other points on the $i$th point. All forces are
conservative. The potential is
$$U = \sum_{1 \le i < j \le n} U_{ij}$$
with
$$U_{ij} = -G m_i m_j/|x_i - x_j|.$$
In fact, we have $K_i = -U_{x_i}$.
If, according to Section 58.6, we pass from the absolute space to an arbitrary
inertial system with $x_i = y_i + a$, then we obtain (31) with $y_i$ instead of $x_i$, since
$K_i$ only depends on the differences $x_i - x_j$. Hence (31) holds in every inertial
system, i.e., Newton's law of gravitation satisfies Galilei's Principle of Relativity 58.7.
From Theorem 58.A we find for the center of mass
$$m \ddot y = \sum_i K_i = 0,$$
thus $y$ moves with constant velocity. Therefore there exists an inertial system
$\Sigma_0$ whose origin is the center of mass. We set
$$x_i = y_i + y$$
and find that in this system
$$\sum_i m_i y_i = \sum_i m_i (x_i - y) = 0.$$
Also we obtain (31) with $y_i$ instead of $x_i$.
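The conservation laws of Section 58.7 can be observed numerically for the gravitational system (31). The following sketch integrates a planar two-body configuration with the velocity-Verlet scheme and compares energy, total momentum, and angular momentum before and after the integration. The choice of integrator, the units with $G = 1$, and the initial data are assumptions of this illustration and are not taken from the text.

```python
import math

G = 1.0  # assumption: units chosen so that G = 1

def forces(masses, pos):
    """K_i = sum over j != i of K_ij, K_ij = -G m_i m_j (x_i - x_j)/|x_i - x_j|^3."""
    n = len(masses)
    f = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = pos[i][0] - pos[j][0]
            dy = pos[i][1] - pos[j][1]
            k = G * masses[i] * masses[j] / math.hypot(dx, dy) ** 3
            f[i][0] -= k * dx
            f[i][1] -= k * dy
            f[j][0] += k * dx  # K_ji = -K_ij
            f[j][1] += k * dy
    return f

def invariants(masses, pos, vel):
    """Energy E, total momentum P, and e3-component of the angular momentum N."""
    n = len(masses)
    T = sum(0.5 * m * (v[0] ** 2 + v[1] ** 2) for m, v in zip(masses, vel))
    U = -sum(G * masses[i] * masses[j]
             / math.hypot(pos[i][0] - pos[j][0], pos[i][1] - pos[j][1])
             for i in range(n) for j in range(i + 1, n))
    P = (sum(m * v[0] for m, v in zip(masses, vel)),
         sum(m * v[1] for m, v in zip(masses, vel)))
    N = sum(m * (x[0] * v[1] - x[1] * v[0]) for m, x, v in zip(masses, pos, vel))
    return T + U, P, N

def simulate(masses, pos, vel, dt, steps):
    """Velocity-Verlet integration of m_i x_i'' = K_i for planar motion."""
    pos = [list(p) for p in pos]
    vel = [list(v) for v in vel]
    f = forces(masses, pos)
    for _ in range(steps):
        for i, m in enumerate(masses):
            for k in range(2):
                vel[i][k] += 0.5 * dt * f[i][k] / m
                pos[i][k] += dt * vel[i][k]
        f = forces(masses, pos)
        for i, m in enumerate(masses):
            for k in range(2):
                vel[i][k] += 0.5 * dt * f[i][k] / m
    return pos, vel

# Hypothetical data: center of mass at rest in the origin, circular relative orbit.
masses = [3.0, 1.0]
pos0 = [(-0.25, 0.0), (0.75, 0.0)]
vel0 = [(0.0, -0.5), (0.0, 1.5)]
E0, P0, N0 = invariants(masses, pos0, vel0)
pos1, vel1 = simulate(masses, pos0, vel0, dt=1e-3, steps=2000)
E1, P1, N1 = invariants(masses, pos1, vel1)
print(E1 - E0, P1, N1 - N0)  # all three changes stay close to zero
```

Momentum and angular momentum are preserved up to rounding by this scheme, while the energy is only preserved up to a small discretization error.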

58.9b. Solution of the Two-Body Problem and Kepler's Laws

We now consider the important special case of the two-body problem, i.e., we
set $n = 2$. We have
$$m_1 \ddot y_1 = K_{12}, \qquad m_2 \ddot y_2 = K_{21}, \tag{32}$$
$$m_1 y_1 + m_2 y_2 = 0, \tag{33}$$
where $y_1$ and $y_2$ describe the motion of the sun and one planet, respectively.
We want to show that by using the conservation laws of Section 58.7,
equations (32) and (33) can be reduced to the simple system (37) below, which can
be solved explicitly. The much more difficult case of the general $n$-body
problem will be discussed a little further in Section 58.9c.
For the relative motion of the planet with respect to the sun
$$x = y_2 - y_1$$
we have
$$m_2 \ddot x = -G m m_2 x/|x|^3, \qquad m = m_1 + m_2. \tag{34}$$
This follows from (32) and (33). If one knows the motion of $x$, then one obtains
from (33) that
$$y_1 = -m_2 x/m, \qquad y_2 = m_1 x/m. \tag{35}$$
By using Theorem 58.A one can easily integrate (34). The angular momentum
conservation law and the energy conservation law immediately imply
$$m_2\, x(t) \times \dot x(t) = \text{const} = N, \qquad U(x(t)) + m_2 \dot x(t)^2/2 = \text{const} = E, \tag{36}$$
with $U(x) = -G m m_2/|x|$. Note that $U'(x) = G m m_2 x/|x|^3$ for $x \ne 0$.
Given $y_i(0)$, $\dot y_i(0)$ for $i = 1, 2$, we find $x(0)$, $\dot x(0)$ and hence from (36) the
angular momentum $N$ and the energy $E$ for the relative motion. Let $N \ne 0$,
and hence $x(0) \ne 0$.
Then the initial-value problem which corresponds to (34) has a unique
solution which, according to Theorem 58.A, satisfies equations (36). From (36)
follows
$$N x(t) = 0,$$
i.e., the motion occurs in a plane perpendicular to $N$. For a suitable choice
of basis vectors $e_1$, $e_2$, and $e_3 = N/|N|$ we therefore have
$$x = r e_r, \qquad e_r := (\cos\varphi) e_1 + (\sin\varphi) e_2$$
with polar coordinates $r = r(t)$, $\varphi = \varphi(t)$. Let $\varphi(0) = 0$. If we set
$$e_\varphi := -(\sin\varphi) e_1 + (\cos\varphi) e_2,$$
then we find $e_r e_\varphi = 0$, $e_r \times e_\varphi = e_3$ and
$$\dot x = \dot r e_r + r \dot\varphi e_\varphi.$$
From (36) we obtain the desired form of the equation of motion
$$r^2 \dot\varphi = |N|/m_2, \qquad m_2 (\dot r^2 + r^2 \dot\varphi^2)/2 - G m m_2/r = E. \tag{37}$$

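Theorem 58.B (Two-Body Problem). The solution of (37) is
$$r = p/(1 + \varepsilon \cos\varphi) \tag{38}$$
with
$$p = \frac{N^2}{G m m_2^2}, \qquad \varepsilon = \sqrt{1 + \frac{2 E N^2}{G^2 m^2 m_2^3}}$$
and
$$t = \frac{m_2}{|N|} \int_0^{\varphi} r(\psi)^2 \, d\psi. \tag{39}$$

PROOF. From (37) follows $d\varphi/dr = \dot\varphi/\dot r = F(r)$. Integration yields (38). $\Box$

Figure 58.16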

For $\varepsilon < 1$, $= 1$, $> 1$, the orbit (38) is an ellipse, parabola, or hyperbola,
respectively, with corresponding energies $E < 0$, $= 0$, $> 0$. The sun, i.e., $r = 0$,
always corresponds to a focus.
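The orbit formula (38) can be tested against a direct numerical integration of the relative motion (34). The sketch below assumes that the parameters have the standard form $p = N^2/(G m m_2^2)$ and $\varepsilon = (1 + 2EN^2/(G^2 m^2 m_2^3))^{1/2}$, which follows from (36) when $\varphi = 0$ is the point of closest approach. It then checks that $r(1 + \varepsilon\cos\varphi) = r + \varepsilon\,\xi$ stays equal to $p$ along a trajectory computed with the classical Runge-Kutta scheme. Units, initial data, and function names are assumptions of this illustration.

```python
import math

def orbit_parameters(G, m1, m2, x, v):
    """Angular momentum N (e3-component), energy E, and the conic parameters p, eps."""
    m = m1 + m2
    N = m2 * (x[0] * v[1] - x[1] * v[0])
    E = 0.5 * m2 * (v[0] ** 2 + v[1] ** 2) - G * m * m2 / math.hypot(x[0], x[1])
    p = N ** 2 / (G * m * m2 ** 2)
    eps = math.sqrt(1.0 + 2.0 * E * N ** 2 / (G ** 2 * m ** 2 * m2 ** 3))
    return N, E, p, eps

def rhs(G, m, s):
    """Right-hand side of x'' = -G m x/|x|^3 as a first-order system."""
    x, y, vx, vy = s
    r3 = math.hypot(x, y) ** 3
    return (vx, vy, -G * m * x / r3, -G * m * y / r3)

def rk4_step(G, m, s, dt):
    """One step of the classical fourth-order Runge-Kutta method."""
    def shifted(k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(G, m, s)
    k2 = rhs(G, m, shifted(k1, dt / 2))
    k3 = rhs(G, m, shifted(k2, dt / 2))
    k4 = rhs(G, m, shifted(k3, dt))
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

# Hypothetical data: G = 1, start on the e1 axis with purely tangential velocity.
G, m1, m2 = 1.0, 3.0, 1.0
state = (1.0, 0.0, 0.0, 2.4)
N, E, p, eps = orbit_parameters(G, m1, m2, state[:2], state[2:])

max_dev = 0.0
for _ in range(4000):
    state = rk4_step(G, m1 + m2, state, 2e-3)
    r = math.hypot(state[0], state[1])
    # r (1 + eps cos(phi)) = r + eps * xi, because cos(phi) = xi / r
    max_dev = max(max_dev, abs(r + eps * state[0] - p))

print(p, eps, E, max_dev)
```

Since $E < 0$ for these data, the computed orbit is an ellipse with $\varepsilon < 1$, in line with the classification above.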
We now consider the case of the ellipse $\varepsilon < 1$ (Fig. 58.16). We want to show
that for the relative motion $x = x(t)$ all three Kepler laws of Section 58.4 are
satisfied.
(i) First law. Using Cartesian coordinates $\xi = r\cos\varphi$, $\eta = r\sin\varphi$,
$r^2 = \xi^2 + \eta^2$ we obtain from (38) the equation of an ellipse
$$\frac{(\xi + \varepsilon a)^2}{a^2} + \frac{\eta^2}{b^2} = 1$$
with $a = p/(1 - \varepsilon^2)$ and $b = p/\sqrt{1 - \varepsilon^2}$. The sun corresponds to the focus
$\xi = 0$, $\eta = 0$.
(ii) Second law. From (37) follows that during the time interval $[t_1, t_2]$, the
position vector $x$ sweeps out the area
$$2^{-1} \int_{t_1}^{t_2} r^2 \dot\varphi \, dt = 2^{-1} (t_2 - t_1) |N|/m_2.$$
(iii) Third law. If $a$ and $b$ are the semimajor and semiminor axes of the ellipse,
respectively, then the area of the ellipse is equal to $\pi a b$ and for the period
of revolution $T$ we obtain $\pi a b = T|N|/2m_2$, i.e.,
$$\frac{T^2}{a^3} = \frac{4\pi^2}{G(m_1 + m_2)}. \tag{40}$$

Since the mass of the planet $m_2$ is small compared to the mass of the sun
$m_1$ we obtain approximately $T^2/a^3 = \text{const}$ for all planets.
Experience shows that, as a good approximation for an inertial system $\Sigma_0$,
one can choose a system whose origin is the center of mass of the solar system
and whose axes are firmly connected with the stellar sky.
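As mentioned in Example 58.9, the mass of the sun can be estimated from the orbit of the earth and Kepler's third law. The sketch below assumes that (40) has the form $T^2/a^3 = 4\pi^2/(G(m_1 + m_2))$, which follows from $\pi a b = T|N|/2m_2$ together with $b^2 = ap$ and $p = N^2/(G m m_2^2)$. The semimajor axis and the period of the earth's orbit are reference values supplied here as assumptions.

```python
import math

G = 6.7e-11            # N m^2 / kg^2, value quoted in Section 58.8a
a = 1.496e11           # semimajor axis of the earth's orbit in m (assumed reference value)
T = 365.25 * 86400.0   # period of revolution in s (assumed reference value)
m_earth = 6e24         # kg, from Example 58.9

total_mass = 4.0 * math.pi ** 2 * a ** 3 / (G * T ** 2)   # m1 + m2 from (40)
m_sun = total_mass - m_earth
print(m_sun)            # mass of the sun in kg
print(m_earth / m_sun)  # the planet's mass is indeed negligible here
```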

58.9c. Discussion of the n-Body Problem

Is the solar system stable? Properly speaking, the answer is still unknown, and
yet this question has led to very deep results which probably are more
important than the answer to the original question.
Jürgen Moser, Neue Zürcher Zeitung (May 14, 1975)

Conserved quantities. According to the previous discussion, the two-body
problem can be solved in two steps:
(i) Newton's second-order system of differential equations (31) is reduced to
a first-order system by using conservation laws.
(ii) This first-order system is solved by quadrature (computation of integrals).
The $n$-body problem, one of the most famous mathematical problems of
the eighteenth and nineteenth centuries, consisted in extending
the solution procedure for $n = 2$ to arbitrary $n$. This, however, already failed
for $n = 3$ (three-body problem). For $n = 3$ the general solution of (31) contains
exactly 18 constants. The goal of the mathematicians then was to find 18
conserved quantities
$$C_i(x_1, x_2, x_3, \dot x_1, \dot x_2, \dot x_3) = \text{const}, \qquad i = 1, \dots, 18,$$
from which the orbital motion $x_j = x_j(t)$ could have been determined by
solving for $x_j$ and using quadrature. But the energy law, the angular
momentum law, and the center of mass theorem of Section 58.7 only yield 10
conserved quantities $E$, $N$, $y(0)$, and $\dot y(0)$. Observe that except for the energy
$E$ all these quantities are vectors. In 1887, Bruns showed that no other
conserved quantities $C_i$ exist which depend algebraically on $x_j$ and $\dot x_j$. This result
was then generalized by Poincaré in 1889, who also excluded a number of
cases with analytic dependence.
Stability of the planetary system. The stability problem in general form is as
follows:
(a) Is it possible that the planets collide, fall into the sun, drastically change
their orbits, or, as an extreme situation, leave the solar system?
(b) Are small perturbations, e.g., caused by cosmic dust, able to cause one of
the cases in (a)?
This is a question of existential consequences, because already small
perturbations of the orbit of the earth would have catastrophic consequences for the
life on our planet. The great interest in explicit analytic solutions for the
$n$-body problem was motivated by the hope that through this the stability
problem could be solved. The complete answer is still open.
Lagrange (1736-1813) and Laplace (1749-1827), by using their perturbation
theory, as well as later astronomers were able to show that the semimajor
axes of the planetary orbits change only slightly during the course of
many hundreds of years (theorem of secular stability). Here Lagrange's
method of the variation of constants played an important role. Today, simple
variants of this method are taught in every course on ordinary differential
equations. In perturbation theory it is an important task to find suitable
coordinates for which a simple description of the problem is possible. For
example, one would like to separate the orbital motion during one revolution
and the secular change of the orbit. Usually, Cartesian coordinates are of no
use. A very important instrument in this direction is the theory created by
Hamilton (1805-1865) and Jacobi (1804-1851) which will be discussed in
Section 58.23. In 1858, Dirichlet (1805-1859) told his colleague Kronecker
(1823-1891) that he had found a proof for the stability of the planetary system.
After his death, however, no such notes could be found. Initiated by Weierstrass
and Mittag-Leffler, the Norwegian-Swedish King Oscar II offered, in 1885,
a prize for the solution of this problem, and in 1889 it went to the young,
ingenious Henri Poincaré (1854-1912). His prize memoir can be found in Acta
Mathematica, Vol. 13 (1890), pp. 5-270. He did not find a solution, but
developed a great number of new ideas, which led to the creation of a qualitative
theory for dynamical systems, algebraic topology, and differential topology.
Thereby Poincaré opened new dimensions in mathematics and greatly
influenced the mathematics of our century. The important work of Ljapunov
(1857-1918) on stability theory at the end of the last century was also
influenced by the $n$-body problem.
Series expansion. The prize problem on which Poincaré worked is described
by Mittag-Leffler in Vol. 7 (1885) of the Acta Mathematica as follows: Given
an arbitrary system of material points, all of which attract each other according
to Newton's law, and assuming that no collisions of any two points occur,
the problem is to expand the coordinates of each single point into infinite
series, which are composed of known functions of the time and which are
uniformly convergent for all times.
An important special result in this direction was obtained in 1912 by the
Finn K. F. Sundman. For the three-body problem he introduced a new
uniformizing variable $u$ and was able to expand the coordinates with respect to
this new variable. His series converge in the unit circle. For $-1 \le u \le 1$ these
series yield real solutions for the three-body problem which exist for all times
$-\infty < t < \infty$. The case of possible collisions is also included. A proof can be
found in Siegel and Moser (1971, M). Unfortunately, Sundman's method
cannot be generalized to the case $n > 3$.
Principal mathematical difficulties. Two typical difficulties occur in the
study of the $n$-body problem:
(a) The right-hand side of (31) becomes singular for $x_i = x_j$ (collisions).
(b) The formally computed perturbation series contain dangerously small
divisors.
If the existence theorem for ordinary differential equations (Theorem 3.A)
is applied to (31) with initial condition $x_i(0) \ne x_j(0)$ for $i \ne j$, then one obtains
the existence of a unique solution for a sufficiently small time interval. In order
to guarantee existence for all times, a priori estimates such as in Section 3.3
are needed to guarantee the boundedness of the solutions for all times
and to exclude collisions. The singularities in (a) above would then become
irrelevant.
The problem of small divisors has already been discussed in Section 3.8.
They greatly complicate the convergence proofs for the perturbation series in
celestial mechanics. In his prize memoir on the problem of King Oscar II,
Poincaré proved that the formally constructed series in celestial mechanics
may diverge. This, however, did not disturb him, because he found a way out
by inventing the concept of asymptotic series. Those are series which possibly
diverge but whose partial sums nevertheless yield good approximations for
large arguments. By definition, we write
$$f(x) \sim a_0 + \frac{a_1}{x} + \frac{a_2}{x^2} + \cdots, \qquad x \to +\infty,$$
i.e., this series is an asymptotic expansion of $f$ for $x \to +\infty$ if and only if, for
each fixed $n$, we have
$$f(x) = a_0 + \frac{a_1}{x} + \cdots + \frac{a_n}{x^n} + \frac{b_n(x)}{x^n}$$
for all $x > R$ with $b_n(x) \to 0$ as $x \to +\infty$. For example, one has

as $x \to +\infty$.
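The behavior of a divergent asymptotic series can be seen in a concrete case. The sketch below uses a standard textbook example chosen for this illustration (not necessarily the one intended above): the function $f(x) = \int_0^\infty e^{-t}\, x/(x+t)\, dt$ has the asymptotic expansion $\sum_k (-1)^k k!/x^k$, which diverges for every $x$. The integral is approximated by the composite Simpson rule on a truncated interval; the truncation point and the number of subintervals are assumptions.

```python
import math

def f(x, upper=60.0, n=6000):
    """Composite Simpson approximation of the integral of exp(-t) * x/(x+t) over [0, upper]."""
    h = upper / n
    def g(t):
        return math.exp(-t) * x / (x + t)
    total = g(0.0) + g(upper)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0

def partial_sum(x, n):
    """Partial sum a_0 + a_1/x + ... + a_n/x^n with a_k = (-1)^k k!."""
    return sum((-1) ** k * math.factorial(k) / x ** k for k in range(n + 1))

x = 10.0
fx = f(x)
errors = [abs(fx - partial_sum(x, n)) for n in range(31)]
print(errors[0], errors[9], errors[30])
```

For fixed $x$ the error first decreases with $n$ and then grows without bound, so the series is useful only when it is truncated early.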
Today, unlike in Poincaré's time, we are able to give convergence
proofs for series with small divisors by using the Kolmogorov-Arnold-Moser
theory. Here the technique of the hard implicit function theorem is used
(see Chapter 5). Very roughly, the following result holds:
(a) The formally constructed series for the perturbation of quasi-periodic
motions, which occur, e.g., in celestial mechanics, can diverge or converge.
(b) The convergence depends very sensitively on the particular parameter value.
In every neighborhood of a well-behaved parameter, no matter how small,
there exist badly behaved parameters with divergence.
(c) In a certain sense well-behaved situations are far more common than
badly behaved ones.
All those are typical difficulties for perturbations of quasi-periodic motions,
which are related to resonance phenomena. And mathematicians have to live
with this.

Explicitly solvable cases of the three-body problem. For a number of special
cases the solution of the three-body system has been known for some time.
(i) First special case of Lagrange. The three mass points form an equilateral
triangle at time $t = 0$. Then these three points move on similar ellipses,
whereby the triangle remains equilateral. This case is very closely related
to the two-body problem. It approximately occurs for the configuration
sun, Jupiter, Trojan group of asteroids.
(ii) Second special case of Lagrange. The three mass points with masses $m_1$,
$m_2$, and $m_3$ lie on a rotating straight line.
(iii) Restricted three-body problem. Here, in addition to (ii), it is assumed that
$m_3$ is very small. Strictly speaking, the limiting case $m_3 = 0$ is considered.
The other two points move on circular orbits around a common center
of mass. One may think of the sun, a planet, and an asteroid. This case
has already been discussed by Euler (1707-1783) and was rediscovered
by Jacobi in 1836. In great detail, Poincaré studied the special case $m_2 \ll m_1$
(see Fig. 3.12 in Chapter 3).
(iv) Hill's moon theory. At the end of the last century, Hill gave a number of
stability proofs for a class of planar three-body problems which model
the configuration of the sun, earth, and moon. Thereby he constructed
the so-called Hill's limiting curve, and showed that it cannot be crossed
by any body.
For the questions of this section we recommend the literature at the end of
this chapter under the heading "celestial mechanics".

58.10. Gauss' Principle of Least Constraint and the General Basic Equations
of Point Mechanics with Side Conditions

58.10a. Special Case


In order to explain the simple basic idea we begin with a special case. In an
arbitrary system of reference we consider the motion of a point with mass $m$
and position vector $x$ under the influence of a force $K$. If the point is able to
move freely, then the equation
$$m \ddot x = K \tag{41}$$
holds. Now let us consider the following general side conditions
$$f(x, t) = 0, \qquad g(x, \dot x, t) = 0. \tag{42}$$
Then equation (41) needs to be modified since, in general, the solutions of (41)
do not satisfy (42). Our goal is the modification
$$m \ddot x = K + Z,$$
where $Z$ is a so-called constraining force, whose task it is to guarantee (42) for
the motion. In technology such constraining forces occur for motions of
machines. There, they have a very real meaning. If $Z$ is too large, then the
machine can be torn apart. Let us consider, for example, a pendulum under
the influence of the gravitational force $K$. Then $Z$ acts as in Figure 58.17, i.e.,
it keeps the pendulum mass $m$ on the circular orbit. According to Gauss, the
basic equations are
$$m \ddot x = K + \lambda f_x + \mu g_{\dot x}, \tag{43}$$
i.e., the constraining force is equal to $Z = \lambda f_x + \mu g_{\dot x}$. Here we have
$$K = K(x, \dot x, t), \qquad \lambda = \lambda(x, \dot x, t), \qquad \mu = \mu(x, \dot x, t).$$
As we shall see below, (43) can be obtained from the general principle of least
constraint. We begin, however, by studying the mathematical consistency of
(43). In more formally written textbooks on theoretical physics this
is sometimes not included. We need to answer two questions:
(i) How can the functions $\lambda$ and $\mu$ be determined?
(ii) When do solutions of (43) satisfy the side conditions (42)?

Figure 58.17
We assume:
(H) $f$ and $g$ are $C^2$ and $f_x$, $g_{\dot x}$ are linearly independent for all considered
arguments.
By considered arguments, we mean all arguments in a fixed open, nonempty
set in the $(x, \dot x, t)$-space, i.e., in $V_3 \times V_3 \times \mathbb{R}$. We also allow the case that $g$ does
not appear. Then we assume that $f_x \ne 0$ for all considered arguments.
Let $x = x(t)$ be a motion which satisfies (42). Differentiation of the first
equation in (42) with respect to $t$ yields
$$f_x \dot x + f_t = 0. \tag{44}$$
Further differentiation with respect to $t$ implies
$$f_x \ddot x + \dot x f_{xx} \dot x + 2 f_{xt} \dot x + f_{tt} = 0, \qquad g_{\dot x} \ddot x + g_x \dot x + g_t = 0. \tag{45}$$
Inserting the basic equation (43) into (45), we obtain for fixed arguments
$(x, \dot x, t)$ a linear system of equations for $\lambda$ and $\mu$, which can uniquely be solved,
since the corresponding homogeneous system has only the trivial solution
$\lambda = \mu = 0$. In fact, solving the homogeneous system amounts to
finding the intersection of the vectors $\lambda f_x + \mu g_{\dot x}$ with
the orthogonal subspace
$$\{h : f_x h = 0,\ g_{\dot x} h = 0\}.$$

Proposition 58.10. Assume (H). If $x = x(t)$ is a solution of (43) which satisfies
the side conditions (42), (44) for some fixed time, then this solution satisfies the
side conditions for all times.

PROOF. From the construction of $\lambda$, $\mu$ it follows that $x = x(t)$ satisfies equations
(45). Integration of (45) yields (44) and (42). $\Box$

Corollary 58.11. The general existence and uniqueness theorem (Theorem 3.A)
about initial-value problems can be applied to (43), since
$$\lambda = \lambda(x, \dot x, t) \quad \text{and} \quad \mu = \mu(x, \dot x, t)$$
are $C^1$-functions as unique solutions of a linear system of equations with
$C^1$-coefficients.

The minimization problem
$$m(\ddot x - K/m)^2 = \min! \tag{46}$$
with side conditions (45) is called the principle of least constraint. Here only
$\ddot x$ varies, while $(x, \dot x, t)$ is considered a fixed parameter. Lagrange's multiplier
rule for real functions of Section 43.10 immediately implies (43).
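For the pendulum mentioned above, the multiplier $\lambda$ can be written down explicitly. The sketch below assumes the side condition $f(x) = |x|^2 - l^2 = 0$ (no condition $g$), so that $f_x = 2x$ and (45) reads $2x\ddot x + 2\dot x^2 = 0$; inserting $m\ddot x = K + 2\lambda x$ gives $\lambda = -(xK + m\dot x^2)/(2|x|^2)$. The resulting equation (43) is integrated with the classical Runge-Kutta scheme, and the code monitors how well the side condition is preserved, in the spirit of Proposition 58.10. All numerical values and names are assumptions of this illustration.

```python
import math

m, g, l = 1.0, 9.81, 1.0   # assumed sample values: mass, gravity, pendulum length
K = (0.0, -m * g)          # gravitational force in the plane of motion

def multiplier(x, v):
    """lambda from (45) for f(x) = |x|^2 - l^2: lambda = -(x.K + m |v|^2) / (2 |x|^2)."""
    xx = x[0] ** 2 + x[1] ** 2
    return -(x[0] * K[0] + x[1] * K[1] + m * (v[0] ** 2 + v[1] ** 2)) / (2.0 * xx)

def rhs(s):
    """Basic equation (43): m x'' = K + Z with Z = lambda f_x = 2 lambda x."""
    x, v = s[:2], s[2:]
    lam = multiplier(x, v)
    return (v[0], v[1], (K[0] + 2 * lam * x[0]) / m, (K[1] + 2 * lam * x[1]) / m)

def rk4_step(s, dt):
    """One step of the classical fourth-order Runge-Kutta method."""
    def shifted(k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(s)
    k2 = rhs(shifted(k1, dt / 2))
    k3 = rhs(shifted(k2, dt / 2))
    k4 = rhs(shifted(k3, dt))
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def energy(s):
    return 0.5 * m * (s[2] ** 2 + s[3] ** 2) + m * g * s[1]

# Start horizontally at rest; the side conditions (42) and (44) hold at t = 0.
state = (1.0, 0.0, 0.0, 0.0)
E0 = energy(state)
max_drift = 0.0
for _ in range(2000):
    state = rk4_step(state, 1e-3)
    max_drift = max(max_drift, abs(math.hypot(state[0], state[1]) - l))
E1 = energy(state)
print(max_drift, E1 - E0)
```

For the pendulum hanging at rest, `multiplier((0.0, -1.0), (0.0, 0.0))` gives a constraining force $Z = 2\lambda x$ that exactly balances the weight.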

58.10b. General Case

In an analogous fashion we now consider the general case of a motion of $n$
mass points with position vectors $x_i$, masses $m_i$, and forces $K_i$ acting on $x_i$.
Let the side conditions be given through the real functions
$$f^{(r)}(X, t) = 0, \quad r = 1, \dots, R, \qquad g^{(s)}(X, \dot X, t) = 0, \quad s = 1, \dots, S, \tag{47}$$
with $X = (x_1, \dots, x_n)$ and $R + S < 3n$. We assume:

(R) The side conditions (47) are regular, i.e., all $f^{(r)}$ and $g^{(s)}$ are $C^2$ for all
considered arguments, and the linear system of equations for $\ddot X$, which,
similarly as (45), is obtained from (47) by differentiation with respect to
time $t$, has maximal rank $R + S$ for all considered arguments.

By considered arguments we mean all arguments in a fixed, open, nonempty
set in the $(X, \dot X, t)$-space, i.e., in $\prod_{i=1}^{n} V_3 \times \prod_{i=1}^{n} V_3 \times \mathbb{R}$. We also allow the case that $g$ in
(47) does not appear. Then the maximal rank in (R) must be equal to $R$. If
$g$ does not appear in (47), then for every $t$, equations (47) describe a
$(3n - R)$-dimensional $C^2$-manifold $C$ in the $X$-space $\prod_{i=1}^{n} V_3$ (see the preimage theorem
of Section 4.18). $C$ is called the configuration space if, in addition,
every $f^{(r)}$ is independent of $t$.
As in (43), Gauss' basic equations are
\[ m_i \ddot{x}_i = K_i + \sum_r \lambda_r f^{(r)}_{x_i} + \sum_s \mu_s g^{(s)}_{\dot{x}_i} \tag{48} \]
for $i = 1, \ldots, n$ with
\[ K_i = K_i(X, \dot{X}, t), \quad \lambda_r = \lambda_r(X, \dot{X}, t), \quad \mu_s = \mu_s(X, \dot{X}, t). \]


Again, the functions $\lambda_r$ and $\mu_s$ are uniquely determined by the linear system for $\ddot{X}$ which, as in the case of (45), is obtained from (47) by differentiation with respect to $t$. Then a result analogous to Proposition 58.10 and Corollary 58.11 is valid. Finally, we must replace the principle of least constraint (46) with
\[ \sum_i m_i (\ddot{x}_i - K_i/m_i)^2 = \min!. \]

Experience shows that, in fact, Gauss' basic equations yield a correct representation of processes in nature with the corresponding constraining forces. Equations (47) represent the most general mechanical side conditions. The principle of least constraint is very natural, because, in the absence of side conditions, one immediately obtains Newton's equations $m_i \ddot{x}_i = K_i$, $i = 1, \ldots, n$.

58.11. Principle of Virtual Power


We now discuss the important special case of holonomic constraints, i.e., the
side conditions (47) do not depend on time and velocity. Our goal is, by using
a simple trick, to eliminate the constraining forces, which in many instances
are very disturbing.

Definition 58.12. All possible positions $X = (x_1, \ldots, x_n)$ of the system which are compatible with the side conditions form the configuration space $C$. By a virtual motion
\[ x_i = y_i(t), \quad i = 1, \ldots, n, \]
we mean any $C^1$-motion which satisfies the side conditions. Then
\[ v_i = \dot{y}_i(t) \]
is called the corresponding virtual velocity.
By an arbitrary virtual velocity $(v_1, \ldots, v_n)$ at the point $X$ we mean the virtual velocity $v_i = \dot{y}_i(t)$ of a virtual motion $Y = Y(t)$ which, at the fixed time $t$, is at the point $X$.

Geometrically, the virtual motions correspond precisely to the $C^1$-curves in $C$. The virtual velocities at the point $X \in C$ are precisely the tangent vectors to $C$ at $X$. The virtual $C^1$-motions which are physically possible under the influence of given forces will simply be called motions in the following. The difference between virtual motions and motions will play an important role in our further discussion.

Principle of Virtual Power 58.13. We consider $n$ points with position vectors $x_i$ and masses $m_i$. The force $K_i = K_i(X, \dot{X}, t)$ with $X = (x_1, \ldots, x_n)$ acts on $x_i$. For holonomic constraints we then obtain for the motion $X = X(t)$ of the points the equation
\[ \sum_{i=1}^{n} (m_i \ddot{x}_i(t) - K_i) v_i = 0 \tag{49} \]
for all $t$ and arbitrary virtual velocities $(v_1, \ldots, v_n)$ at each point $X(t)$.

This general principle, which goes back to d'Alembert (1717-1783), is very useful because of its invariant formulation, i.e., the concrete analytical form of the side conditions does not appear and all concepts involved have a physical interpretation. From (49), for example, it is possible to derive in a simple form the basic equations for the rigid body (see Section 58.14).
In the regular case (R) of Section 58.10, relation (49) follows from Gauss' equations. In fact, by differentiating with respect to $t$ one immediately obtains from (47) that
\[ \sum_i f^{(r)}_{x_i} v_i = 0. \]
Multiplication of (48) with $v_i$ and summation over $i$ yields (49).
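As an illustration of how (49) eliminates the constraining force, consider the plane pendulum. This is a sketch under assumptions that are not part of the text: a single mass $m$ on a rod of length $l$, the angle $\varphi$ measured from the downward vertical, and the constant gravitational force $K = (0, -mg)$.

```latex
% Position on the circle x^2 = l^2 and an arbitrary virtual velocity (tangent vector):
x = l(\sin\varphi, -\cos\varphi), \qquad
v = s\, l(\cos\varphi, \sin\varphi), \quad s \in \mathbb{R}.

% Acceleration along the actual motion \varphi = \varphi(t):
\ddot{x} = l\ddot{\varphi}\,(\cos\varphi, \sin\varphi)
         + l\dot{\varphi}^{2}\,(-\sin\varphi, \cos\varphi).

% Insert into the principle of virtual power (m\ddot{x} - K)v = 0:
(m\ddot{x} - K)\,v = s\,\bigl(m l^{2}\ddot{\varphi} + m g l \sin\varphi\bigr) = 0
\quad\text{for all } s
\;\Longrightarrow\;
\ddot{\varphi} = -\frac{g}{l}\sin\varphi .
```

The tension in the rod never enters the computation, because it is orthogonal to every virtual velocity.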


If one chooses in particular $v_i = \dot{x}_i(t)$ in (49), then one obtains for the motion the energy balance
\[ \dot{T} = \sum_i K_i \dot{x}_i \tag{50} \]
with kinetic energy $T = \sum_i m_i \dot{x}_i^2 / 2$. This equation is the key to the following stability discussion.
58.12. Equilibrium States and a General Stability Principle

In this section we consider the same situation as in the Principle of Virtual Power 58.13. We say the system is in an equilibrium state (equilibrium position) at $X_0$ if and only if every $C^2$-motion of the system $X = X(t)$ with $X(t_0) = X_0$ and $\dot{X}(t_0) = 0$ satisfies
\[ X(t) \equiv X_0. \]
We call
\[ W(t) = \int_{t_0}^{t} \sum_i K_i(Y(t), \dot{Y}(t), t)\, \dot{y}_i(t)\, dt \]
the virtual work. Here $X = Y(t)$ is an arbitrary virtual $C^1$-motion with $Y(t_0) = X_0$. We call $W(\cdot)$ nonpositive if and only if
\[ W(t) \le 0 \quad\text{for all } t. \]
Moreover, $W(\cdot)$ is called strictly negative if and only if
\[ W(t) < 0 \quad\text{for all } t \ne t_0 \]
and for all virtual $C^1$-motions $X = Y(t)$ with
\[ Y(t) \ne X_0 \quad\text{for all } t \ne t_0. \]

Proposition 58.14 (Equilibrium Condition). If $X_0$ is an equilibrium state, then
\[ \sum_i K_i(X_0, 0, t) v_i = 0 \quad\text{for all } t \tag{51} \]
and all virtual velocities $(v_1, \ldots, v_n)$ at $X_0$.
If, conversely, the virtual work $W(\cdot)$ is nonpositive in the sense of the above definition, then $X_0$ is an equilibrium state.

PROOF. From (49) with $\ddot{x}_i = 0$ follows (51). Conversely, suppose that $W(t) \le 0$ with $Y(t) = X(t)$, $X(t_0) = X_0$ and $\dot{X}(t_0) = 0$. By integration it follows from the energy balance (50) that
\[ T(t) - T(t_0) \le 0 \quad\text{and}\quad T(t_0) = 0. \]
This means that $\dot{x}_i(t) \equiv 0$. □

Definition 58.15. If $W(\cdot)$ is strictly negative, then $X_0$ is called a statically stable equilibrium state.

We want to motivate this definition and show: If the system is at rest at time $t_0$, then it remains at rest for all times $t \ge t_0$, i.e., from $X(t_0) = X_0$ and $\dot{X}(t_0) = 0$ it follows that the system remains at $X_0$. More precisely, we show that there exists no $C^1$-motion $X = X(t)$ with
\[ X(t) \ne X_0 \quad\text{for all } t > t_0. \]
Suppose such a motion exists. From $\dot{X}(t_0) = 0$ follows $T(t_0) = 0$, and the energy balance (50) yields
\[ T(t) = W(t) < 0 \quad\text{for all } t > t_0 \]
after integration. But this contradicts $T(t) \ge 0$.
As in Section 58.7 we now assume that the forces can be decomposed in the form
\[ K_i = K_i^{(p)} + K_i^{(q)} \]
with $K_i^{(p)}(X) = -U_{x_i}(X)$ for all $i$ and all $X$, i.e., $K_i^{(p)}$ has the $C^1$-potential $U$. This implies
\[ W(t) = U(X_0) - U(Y(t)) + W^{(q)}(t). \]
The condition $W(t) < 0$ for statical stability can now be written in the form
\[ -\Delta U + \Delta W^{(q)} < 0. \tag{52} \]

Definition 58.16. Suppose the function $U: C \to \mathbb{R}$ on the metric space $C$ has a local minimum at the point $X_0$. This minimum is called regular if and only if it is strict and there exists a neighborhood $V(X_0)$ such that the diameter of the set
\[ \{X \in V(X_0) : U(X) \le E\} \]
for fixed, real $E$ tends to zero as $E \to U(X_0)$.
For example, the real function $U = x^{2k}$, $k = 1, 2, \ldots$, has a regular minimum at $x = 0$, while for $U(x) \equiv 0$ this is not the case.

Theorem 58.C (Stability Principle of the Minimal Potential Energy). We assume holonomic constraints which only depend on the position $X$ and suppose that all forces have a potential, i.e., there exists a $C^1$-function $U$ with
\[ K_i(X) = -U_{x_i}(X) \quad\text{for all } i \text{ and all } X. \]
Then the following is true.
(a) If the system is in an equilibrium state at $X_0$, then $U$ on the configuration space $C$ has a critical point at $X_0$.
(b) If $U$ on the configuration space $C$ has a local, regular minimum at $X_0$, then $X_0$ is an equilibrium state, which is locally statically stable and stable in the sense of Ljapunov.

By "locally" statically stable we mean a local variant of Definition 58.15, for which only virtual motions $X = Y(t)$ in a spatial neighborhood of $X_0$ and a time neighborhood of $t_0$ are considered. By stability in the sense of Ljapunov we mean that small deflections from the equilibrium state with small initial velocities only cause small motions of the system. More precisely, we require that for every $\varepsilon > 0$ there exists a $\delta > 0$ such that
\[ |X(t_0) - X_0| + |\dot{X}(t_0)| < \delta \]
implies
\[ |X(t) - X_0| + |\dot{X}(t)| < \varepsilon \]
for all $C^1$-motions $X = X(t)$ and all $t$. Here we set $|X| = \sum_i |x_i|$.
PROOF. Ad (a). Let $X = Y(t)$ be a virtual $C^1$-motion with $Y(t_0) = X_0$. From (51) it follows that
\[ \frac{d}{dt} U(Y(t)) = -\sum_i K_i(Y(t))\, \dot{y}_i(t) = 0 \quad\text{for } t = t_0. \]
Ad (b). The local statical stability follows from
\[ W(t) = U(X_0) - U(Y(t)) < 0 \quad\text{for } Y(t) \ne X_0. \]
In order to prove the stability in the sense of Ljapunov we choose a motion $X = X(t)$ with $|X(t_0) - X_0| + |\dot{X}(t_0)| < \delta$. From the energy balance (50) follows the energy conservation theorem
\[ T(t) + U(X(t)) = \text{const} = E \quad\text{for all } t. \]
From $T(t) \ge 0$ it follows that
\[ U(X(t)) \le E \quad\text{for all } t. \]
Since $E \to U(X_0)$ as $\delta \to 0$, Definition 58.16 implies that the diameter of the region in which the motion occurs tends to zero as $\delta \to 0$. Moreover, it follows from
\[ T(t) + U(X_0) \le E \quad\text{for all } t \]
that $\sup_t |\dot{X}(t)| \to 0$ as $\delta \to 0$. □

58.13. Basic Equations of the Rigid Body and the Main Theorem about the Motion of the Rigid Body and its Equilibrium

We want to apply the previous observations and, in particular, the principle of virtual power to one of the most important mechanical models, namely the rigid body. Many objects in technology can, in first-order approximation, be considered as rigid bodies, e.g., buildings, bridges, ships, planes, space ships, etc.
We consider $n \ge 3$ points, which do not all lie on a straight line, with masses $m_i$ and position vectors $x_i$. Suppose the force
\[ K_i = K_i(X, \dot{X}, t) \]
with $X = (x_1, \ldots, x_n)$ acts on $x_i$. This system is called a rigid body if and only if the side conditions
\[ (x_i - x_j)^2 = c_{ij}, \quad i, j = 1, \ldots, n \tag{53} \]

are satisfied with fixed numbers $c_{ij} \ne 0$ for all $i \ne j$. This means that the distances between the points remain the same for all motions. In the following section we will see that the configuration space $C$, i.e., the set of all points with (53), forms a six-dimensional $C^\infty$-manifold in the $X$-space $\prod_{i=1}^{n} V_3$. In order to separate the motion of the center of mass
\[ y = \sum_i m_i x_i / m \]
with the total mass $m = \sum_i m_i$, we set
\[ x_i = y + z_i. \]
The angular momentum $N$ with respect to the center of mass is, by definition, equal to
\[ N = \sum_i m_i z_i \times \dot{z}_i. \]
Moreover, we define the inertia tensor $\theta$ through
\[ \theta\omega = \sum_i m_i \bigl(z_i^2\, \omega - (\omega z_i) z_i\bigr). \]
In an arbitrary coordinate system with the center of mass as the origin we have
\[ \theta\omega = (\theta^r_s \omega^s) b_r \]
with the components
\[ \theta^r_s = \sum_{k=1}^{n} m_k \bigl(z_k^2\, \delta^r_s - (b_s z_k) z_k^r\bigr), \quad r, s = 1, 2, 3. \]
Here we set $\omega = \omega^s b_s$ and $z_k = z_k^r b_r$. In Cartesian coordinates we obtain for the basis vectors $b_r b_s = \delta_{rs}$. Hence there we have $\theta^r_s = \theta^s_r$ for all $r, s$ and therefore
\[ \theta = \theta^*, \]
i.e., $\theta: V_3 \to V_3$ is a self-adjoint linear operator. Thus there exist three orthonormal eigenvectors $e_1, e_2, e_3$ of $\theta$ with
\[ \theta e_i = \theta_i e_i, \quad i = 1, 2, 3. \tag{54} \]
If we choose a Cartesian coordinate system with the center of mass as the origin and axes $e_i$, then these axes are called principal axes of inertia and the eigenvalues $\theta_i$ of $\theta$ are called principal moments of inertia. In this coordinate system $\theta$ has diagonal form, i.e.,
\[ \theta^r_s = \theta_r \delta^r_s \quad\text{for all } r, s. \]
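We shall see in the following section, by using the principle of virtual power, that the basic equations for the motion of the rigid body are
\[ m\ddot{y} = \sum_i K_i, \tag{55} \]
\[ \dot{N} = \sum_i z_i \times K_i, \tag{56} \]
\[ \dot{z}_i = \omega \times z_i, \quad i = 1, \ldots, n, \tag{57} \]
\[ N = \theta\omega. \tag{58} \]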
These equations have a very intuitive meaning. If
\[ \omega(t) = \text{const}, \]
then, according to Example 58.1, relation (57) corresponds to a rotational motion about the center of mass with axis of rotation $\omega$ and angular velocity $|\omega|$. In the general case, $\omega$ varies with time and we call $\omega(t)$ and $|\omega(t)|$ the momentary axis of rotation and the momentary angular velocity, respectively. According to (58), the angular momentum $N$ is obtained from the momentary axis of rotation $\omega$ by an application of the linear operator $\theta$, i.e., the inertia tensor. Relation (55) shows that the center of mass moves as if all forces acted on it. Altogether, the basic equations (55) to (58) represent the following statement.

The motion of the rigid body is the combination of the motion of the center of mass and a "rotational motion" about the center of mass, whereby, however, the axis of rotation and the rotational velocity depend on time. Such a motion is called a screw motion.

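If $T = \sum_i m_i \dot{z}_i^2 / 2$ is the kinetic energy with respect to the center of mass, then
\[ T = \sum_i m_i (\omega \times z_i)^2 / 2 = \omega\theta\omega / 2. \]
Therefore $\theta^{-1}$ exists. This follows because $\theta\omega = 0$ implies $\omega \times z_i = 0$ for all $i$, hence $\omega = 0$, since the points do not all lie on a straight line.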
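The relations between the inertia tensor, the angular momentum and the kinetic energy can be checked numerically. The sketch below works in Cartesian coordinates with three point masses; the masses, positions and the vector `omega` are arbitrary illustrative values, and the helper names are not taken from the text. It uses the identity $z \times (\omega \times z) = z^2 \omega - (\omega z) z$, which gives $N = \theta\omega$ and $\sum_i m_i (\omega \times z_i)^2 / 2 = \omega\theta\omega / 2$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def inertia_tensor(masses, zs):
    """theta[r][s] = sum_k m_k (z_k^2 delta_rs - z_k[r] z_k[s]) in Cartesian coordinates."""
    theta = [[0.0] * 3 for _ in range(3)]
    for m, z in zip(masses, zs):
        z2 = dot(z, z)
        for r in range(3):
            for s in range(3):
                theta[r][s] += m * (z2 * (1.0 if r == s else 0.0) - z[r] * z[s])
    return theta


# Illustrative data (assumed): three masses that are not collinear.
masses = [1.0, 2.0, 3.0]
xs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 2.0]]
total = sum(masses)
y = [sum(m * x[c] for m, x in zip(masses, xs)) / total for c in range(3)]
zs = [[x[c] - y[c] for c in range(3)] for x in xs]   # positions relative to the center of mass
omega = [0.3, -1.0, 0.5]

theta = inertia_tensor(masses, zs)
N_tensor = [dot(row, omega) for row in theta]        # N = theta * omega

# Direct evaluation with z_i' = omega x z_i.
N_direct = [0.0, 0.0, 0.0]
T_direct = 0.0
for m, z in zip(masses, zs):
    zdot = cross(omega, z)
    N_direct = [n + m * c for n, c in zip(N_direct, cross(z, zdot))]
    T_direct += 0.5 * m * dot(zdot, zdot)

T_tensor = 0.5 * dot(omega, N_tensor)
print(N_tensor, N_direct, T_tensor, T_direct)
```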
The following theorem and its corollary answer the central question: Which quantities uniquely determine the motion of the rigid body, and through which conditions are equilibrium states described? Recall that $X = (x_1, \ldots, x_n)$.

Theorem 58.D (Existence and Uniqueness Theorem for the Motion of the Rigid Body). Assume that we are given $C^1$-forces
\[ K_i = K_i(X, \dot{X}, t), \quad i = 1, \ldots, n, \quad n \ge 3, \]
the initial position
\[ x_1(t_0), \ldots, x_n(t_0), \]
the initial velocity $\dot{y}(t_0)$ of the center of mass, and the initial position $\omega(t_0)$ of the momentary axis of rotation together with the initial angular velocity $|\omega(t_0)|$. Then for all times $t$ in a neighborhood of $t_0$ there exists a unique $C^2$-motion $X = X(t)$ for the rigid body.

PROOF. From $x_i(t_0)$ we obtain $y(t_0)$ and thereby $z_i(t_0)$ and $\theta(t_0)$. Moreover, we have
\[ N(t_0) = \theta(t_0)\omega(t_0). \]
If we replace $\omega$ in (57) with $\theta^{-1}N$, then, according to Theorem 3.A, equations (55) to (57) have a unique solution. □

Corollary 58.17 (Equilibrium Condition). Assume the $C^1$-forces $K_i = K_i(X, \dot{X}, t)$ are given. For the positions $x_{10}, \ldots, x_{n0}$, the rigid body is in an equilibrium state if and only if the following three conditions are satisfied:
(i) The total force is zero, i.e.,
\[ \sum_i K_i(X_0, 0, t) = 0 \quad\text{for all } t. \]
(ii) The total torque is zero, i.e.,
\[ \sum_i x_{i0} \times K_i(X_0, 0, t) = 0 \quad\text{for all } t. \]
(iii) The body is at rest at time $t_0$, i.e., $\dot{x}_i(t_0) = 0$ for all $i$.

PROOF. If one inserts
\[ x_i(t) \equiv x_{i0} \]
into (55) and (56), then one obtains
\[ \ddot{y} = \dot{N} = 0 \]
and (i) to (iii) follow. Note that
\[ \sum_i x_i \times K_i = \sum_i (y + z_i) \times K_i. \]
Conversely, assume (i) to (iii). Then $x_i(t) \equiv x_{i0}$ is a solution of (55) to (58) and, according to Theorem 58.D, it is unique. □

58.14. Foundation of the Basic Equations of the Rigid Body

Lemma 58.18. Assume that $n \ge 3$ different points $x_1, \ldots, x_n$ are given, not all of which lie on a straight line. Then one obtains precisely all solutions of the system of linear equations
\[ (x_i - x_j)(h_i - h_j) = 0, \quad i, j = 1, \ldots, n \tag{59} \]
through
\[ h_i = a + \omega \times x_i \]
with arbitrary $a, \omega \in V_3$.

The proof will be given in Problem 58.1. From the subimmersion theorem (Theorem 4.1 in Section 4.16) it follows that the configuration space $C$ of the rigid body, described by (53), is a six-dimensional $C^\infty$-manifold. One can show that $C = SO(3) \times \mathbb{R}^3$, where $SO(3)$ is the Lie group of all orthogonal $(3 \times 3)$-matrices with determinant one (see Chapter 84).

Proposition 58.19. The most general possible motion of the rigid body is described by the differential equation
\[ \dot{x}_i(t) = a(t) + (\omega(t) \times x_i(t)), \quad i = 1, \ldots, n \tag{60} \]
for arbitrary but fixed vector functions $a(\cdot)$ and $\omega(\cdot)$.

PROOF. Equation (59) with $h_i = \dot{x}_i$ immediately follows from $(x_i - x_j)^2 = c_{ij}$ by differentiating with respect to $t$. Hence (60) follows from Lemma 58.18. Conversely, (60) satisfies equation (59) with $h_i = \dot{x}_i$. Integration of (59) yields $(x_i - x_j)^2 = \text{const}$. □
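The easy direction of Lemma 58.18 is simple to confirm numerically: every velocity field of the form $h_i = a + \omega \times x_i$ satisfies (59), because $(x_i - x_j)(\omega \times (x_i - x_j)) = 0$. The following sketch uses arbitrary illustrative points and vectors; none of the numerical values come from the text.

```python
def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def sub(u, v):
    return [p - q for p, q in zip(u, v)]


def rigid_velocities(a, omega, points):
    """Velocity field h_i = a + omega x x_i of a rigid motion."""
    return [[ai + ci for ai, ci in zip(a, cross(omega, x))] for x in points]


# Illustrative data (assumed): four points, not all on one line.
points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [1.0, 1.0, 3.0]]
a = [0.5, -1.0, 2.0]
omega = [0.2, 0.7, -1.1]

h = rigid_velocities(a, omega, points)

# Left-hand sides of (59) for all pairs (i, j).
residuals = [dot(sub(xi, xj), sub(hi, hj))
             for xi, hi in zip(points, h)
             for xj, hj in zip(points, h)]
max_residual = max(abs(r) for r in residuals)
print("largest violation of (59):", max_residual)
```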

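In order to obtain the basic equations for the rigid body we now use the principle of virtual power from Section 58.11. This principle tells us that
\[ \sum_i (m_i \ddot{x}_i - K_i) v_i = 0. \]
From Proposition 58.19, an arbitrary virtual velocity has the form $v_i = a + (\omega \times x_i)$. This implies
\[ a \sum_i (m_i \ddot{x}_i - K_i) + \omega \sum_i x_i \times (m_i \ddot{x}_i - K_i) = 0. \]
In what follows we write $\sum$ instead of $\sum_i$. Since $a$ and $\omega$ are arbitrary, it follows that
\[ \sum m_i \ddot{x}_i = \sum K_i, \tag{61} \]
\[ \sum x_i \times m_i \ddot{x}_i = \sum x_i \times K_i. \tag{62} \]
From this one easily derives the basic equations in the form of (55) to (58). Indeed, because of (61) and $\sum m_i z_i = 0$, we get
\[ m\ddot{y} = \sum K_i. \]
From (62) follows
\[ \sum (y + z_i) \times (m_i \ddot{y} + m_i \ddot{z}_i) = \sum (y + z_i) \times K_i. \]
Because of $\sum m_i z_i = 0$, the left-hand side equals $y \times m\ddot{y} + \dot{N}$, so that (55) yields (56). Furthermore, $\dot{x}_i = a + (\omega \times x_i)$ implies
\[ \dot{z}_i = a - \dot{y} + (\omega \times y) + (\omega \times z_i). \]
Because of $\sum m_i z_i = 0$ and $\sum m_i \dot{z}_i = 0$ we obtain
\[ a - \dot{y} + (\omega \times y) = 0, \]
hence $\dot{z}_i = \omega \times z_i$. This implies
\[ N = \sum m_i z_i \times (\omega \times z_i) = \sum \bigl(m_i \omega z_i^2 - m_i z_i (\omega z_i)\bigr) = \theta\omega. \quad □ \]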
i i

58.15. Physical Models, the Expansion of the Universe, and its Evolution after the Big Bang

Objects in the real world are very complicated, and a complete mathematical description, i.e., one taking all possible effects into account, is impossible in principle. Physicists, however, have found that many phenomena can be described with relatively simple models. In this section we describe this important way of physical thinking by looking at the interesting example of the expansion of the universe. A complete understanding of this phenomenon is possible only from the general theory of relativity, and will be discussed in Chapter 76. Many effects, however, can be understood in the context of a simple model, which only uses the electromagnetic black-body radiation and further results in statistical physics. This will now be discussed, and will perhaps create some interest in the reader for statistical physics and the general theory of relativity of Chapters 68 and 76. The key question is:
How did the universe evolve?
To answer this question in the context of the so-called standard model we need two experimental results:
(a) Hubble's law about the red shift in the spectrum of galaxies (Hubble, 1929).
(b) Discovery of the 3 K-radiation in 1965, for which Penzias and Wilson received the Nobel prize.
Looking at the spectrum of the galaxies, one observes a red shift which, if interpreted as a Doppler effect (Problem 58.6), leads to an escape velocity $V$ for the galaxies of
\[ V = HR. \tag{63} \]
Here $R$ is the distance between the galaxy and our Milky Way, and $H$ denotes the Hubble constant. According to presently available data, we have $H = H_0$ with
\[ H_0 = (15 \text{ km/s}) \text{ per } 10^6 \text{ light years} = 1/(2 \cdot 10^{10} \text{ years}). \]
This means that for every additional $10^6$ light years of distance the escape velocity of the galaxies increases by 15 km/s. This, however, is a very rough value.
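The two ways of writing $H_0$ can be reconciled with a short unit conversion. The sketch below uses rounded conversion factors (kilometers per light year, seconds per year) that are standard values but are not quoted in the text; the function name is illustrative.

```python
KM_PER_LIGHT_YEAR = 9.461e12    # rounded standard value
SECONDS_PER_YEAR = 3.156e7      # rounded standard value

# H0 = 15 km/s per 10^6 light years, converted to 1/s.
H0_per_second = 15.0 / (1.0e6 * KM_PER_LIGHT_YEAR)

# The reciprocal 1/H0, converted to years.
hubble_time_years = 1.0 / H0_per_second / SECONDS_PER_YEAR


def escape_velocity_km_s(distance_light_years):
    """Hubble's law V = H0 * R with R given in light years."""
    return 15.0 * distance_light_years / 1.0e6


print(f"1/H0 = {hubble_time_years:.2e} years")
print(f"V at 2 million light years = {escape_velocity_km_s(2.0e6)} km/s")
```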
The expansion of the universe described by (63) can be explained with the hypothesis that at time $t = 0$ an enormous explosion, called the Big Bang, occurred and created our universe. We shall see that the weak 3 K-radiation from the universe can be regarded as important experimental evidence for the hypothesis of the Big Bang. This radiation has its origin in the photon energy which existed at an early stage of the hot cosmos. Roughly speaking, this photon energy has become very thin as a consequence of the expansion of the universe, and today we can only observe a very weak radiation which requires sensitive antenna systems. As one would expect, this radiation is isotropic, i.e., no direction in the universe is preferred.
Another indication of the correctness of the Big Bang hypothesis is the currently observed ratio of
hydrogen : helium = 3 : 1.
This also follows from the standard model.
One important question, however, cannot presently be decided experimentally, namely: Is the volume of the universe finite or infinite? For an answer one would have to know a very precise value for the mass density $\rho_0$ of the present universe, but today the error margins for $\rho_0$ are too large. We shall show, however, that the age of the universe can be estimated by using the Hubble constant $H_0$.
The 3 K-radiation is an electromagnetic radiation in the universe which satisfies Planck's law about the black-body radiation with energy density
\[ E/V = \int_0^\infty \frac{8\pi h c \, d\lambda}{\lambda^5 (e^{hc/kT\lambda} - 1)} = 7.6 \cdot 10^{-16}\, [T]^4 \ (\text{J/m}^3). \tag{64} \]

Here $T = 3$ K is the so-called temperature of the radiation. As usual, $[T]$ denotes the value without dimension, i.e., $[T] = 3$. Experimentally, $E/V$ is measured and $T$ is calculated from (64). Equation (64) also determines the distribution of the energy over the wavelengths $\lambda$. In Section 68.4, we derive (64) from the Bose statistics for photons, and show that the particle density of the photons is given by
\[ N/V = 60.4\, (kT/hc)^3. \tag{65} \]
This corresponds to a density of
\[ 2 \cdot 10^7\, [T]^3 \text{ photons/m}^3. \]
Thus the mean energy of a photon is equal to
\[ \varepsilon = E/N = 3.7 \cdot 10^{-23}\, [T] \text{ J} \]
and hence
\[ \varepsilon = 2.7\, kT. \tag{66} \]
The universal constant $k$ is called the Boltzmann constant and has the value
\[ k = 1.380 \cdot 10^{-23} \text{ J/K}. \]
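The numerical coefficients in (64) to (66) can be reproduced from the closed forms of the black-body integrals. The sketch below assumes the standard results $E/V = 8\pi^5 k^4 T^4 / (15 h^3 c^3)$ and $N/V = 16\pi\zeta(3)(kT/hc)^3$, which are not written out in this section, and uses rounded values of $h$ and $c$; all variable names are illustrative.

```python
import math

h = 6.626e-34   # Planck constant in J s (rounded)
c = 2.998e8     # velocity of light in m/s (rounded)
k = 1.380e-23   # Boltzmann constant in J/K, as quoted in the text

# zeta(3) by direct summation; the truncation error is far below the
# precision needed here.
zeta3 = sum(1.0 / n**3 for n in range(1, 100000))

# E/V = energy_coeff * T^4, cf. (64)
energy_coeff = 8.0 * math.pi**5 * k**4 / (15.0 * h**3 * c**3)

# N/V = number_coeff * (kT/hc)^3, cf. (65)
number_coeff = 16.0 * math.pi * zeta3
photons_per_m3_per_K3 = number_coeff * (k / (h * c))**3

# Mean photon energy in units of kT, cf. (66)
mean_energy_over_kT = energy_coeff / (photons_per_m3_per_K3 * k)

print(f"E/V coefficient : {energy_coeff:.2e} J/(m^3 K^4)")
print(f"N/V prefactor   : {number_coeff:.1f}")
print(f"photon density  : {photons_per_m3_per_K3:.2e} per m^3 K^3")
print(f"mean energy     : {mean_energy_over_kT:.2f} kT")
```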
In the following, the electromagnetic radiation in the universe will simply be called photon gas. Section 68.4 implies that the entropy density of the photon gas is equal to
\[ S/V = aT^3, \tag{67} \]
where $a$ is a constant given by (68.25).
Now let us show that relations (63)-(67) above yield the main results about the evolution of our universe. In this connection the key equations below will be the equation of motion of a galaxy (70) and equation (74), which relates the time after the Big Bang to the corresponding energy density.

58.15a. Equation of Motion of the Galaxies

We consider a galaxy $\Gamma$ of mass $m$ located at a distance $R(t)$ from our Milky Way. It is very important that, according to Problem 58.5, the only force acting on $\Gamma$ is the gravitational force caused by the mass $M$ which is contained inside a ball of radius $R(t)$, i.e.,
\[ M = 4\pi R(t)^3 \rho(t)/3. \]
Here $\rho(t)$ is the mass density at time $t$. According to Problem 58.4, the actual motion is the same as if the entire mass $M$ of the ball were located at the center $R = 0$. The expansion of the universe can only be completely understood in the context of the general theory of relativity (see Chapter 76). But it is quite interesting that important results can already be obtained by using only simple facts from classical mechanics. To this end we choose $R(t)$ not too large, so that the space curvature can be neglected. Further, we take relativistic effects into account by choosing a mass density $\rho(t)$ which is related to the energy density $\varepsilon(t)$ by the Einstein relation
\[ \varepsilon(t) = \rho(t) c^2. \]
Here $c$ denotes the velocity of light (see Section 75.11). This energy density $\varepsilon(t)$ also contains all kinds of radiation. We assume that the escape motion
\[ R = R(t) \]
of the galaxy $\Gamma$ satisfies the classical energy conservation law, i.e.,
\[ \frac{m\dot{R}^2}{2} - \frac{GmM}{R} = \text{const} = \frac{mE_0}{2}. \tag{68} \]
The first term is the kinetic energy of the escape motion, and the second term, according to Section 58.8, is the potential energy of the gravitation. Let $t_0$ denote the present time and $t = 0$ the time of the Big Bang. We set
\[ H_0 = H(t_0), \quad R_0 = R(t_0) \]
and
\[ \rho_0 = \rho(t_0). \]
From (63) follows
\[ E_0 = R_0^2 H_0^2 - 2GM/R_0 = R_0^2 (H_0^2 - 8\pi G \rho_0 / 3). \tag{69} \]
For $R = R(t)$, equation (68) yields the equation of motion for the galaxy $\Gamma$:
\[ \int_{R(t)}^{R_0} \frac{dR}{\sqrt{E_0 + 2GM/R}} = t_0 - t. \tag{70} \]
From this first key equation we draw several important conclusions.
At time $t = 0$ of the Big Bang the galaxy $\Gamma$ is at the origin, i.e., $R(0) = 0$. Consequently, the age of the universe $t_0$ is given by
\[ t_0 = \int_0^{R_0} \frac{dR}{\sqrt{E_0 + 2GM/R}} < \frac{R_0}{\sqrt{E_0 + 2GM/R_0}} = 1/H_0 = 2 \cdot 10^{10} \text{ years}. \tag{71} \]
Thus it follows that the universe is less than 20 milliard years old. Note that the total mass $M$ of the ball of radius $R(t)$ remains constant for all times $t > 0$. At present we approximately have
\[ E_0 = 0, \]
as is shown below. Thus with (71) we obtain for the age of the universe
\[ t_0 = \int_0^{R_0} \sqrt{\frac{R}{2GM}}\, dR = \frac{2}{3} \sqrt{\frac{R_0^3}{2GM}} = \frac{2}{3H_0}. \]
This gives nearly 13 milliard years. Estimates obtained from the general theory of relativity will be given in Chapter 76.
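The age estimate can be explored numerically for other values of $E_0$. The sketch below introduces a dimensionless parameter `omega` $= 8\pi G\rho_0/(3H_0^2)$, which is an auxiliary quantity not used in the text. With (69) one has $E_0 = R_0^2H_0^2(1 - \text{omega})$ and $2GM/R_0 = R_0^2H_0^2\,\text{omega}$, so that the substitution $r = R/R_0$ turns the integral in (71) into $H_0 t_0 = \int_0^1 dr/\sqrt{1 - \text{omega} + \text{omega}/r}$. The integral is evaluated with a simple midpoint rule.

```python
def hubble_time_fraction(omega, steps=200000):
    """Return H0 * t0 = integral_0^1 dr / sqrt(1 - omega + omega / r).

    omega = 8*pi*G*rho_0 / (3*H0^2) is an auxiliary, dimensionless
    parameter; omega = 1 corresponds to E0 = 0.
    """
    dr = 1.0 / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr          # midpoint of the i-th subinterval
        total += dr / (1.0 - omega + omega / r) ** 0.5
    return total


HUBBLE_TIME_YEARS = 2.0e10          # 1/H0 as quoted in the text

fraction_flat = hubble_time_fraction(1.0)       # case E0 = 0
fraction_open = hubble_time_fraction(0.3)       # an illustrative case with E0 > 0

print(f"E0 = 0: t0 = {fraction_flat * HUBBLE_TIME_YEARS:.2e} years")
print(f"E0 > 0: t0 = {fraction_open * HUBBLE_TIME_YEARS:.2e} years")
```

In both cases the result stays below $1/H_0$, in agreement with the bound (71).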
We define the critical mass density
\[ \rho_{\text{crit}} = 3H_0^2 / 8\pi G = 5 \cdot 10^{-27} \text{ kg/m}^3. \]
It follows that
\[ E_0 = 8\pi G R_0^2 (\rho_{\text{crit}} - \rho_0)/3, \]
where $\rho_0 c^2$ is the value of the energy density of the present universe, and $\rho_0$ is the present mass density. From (70) we derive the following important alternative about the structure of our universe:

(72a) If $E_0 \ge 0$, i.e., $\rho_0 \le \rho_{\text{crit}}$, then the integrand in (70) has the form $O(1)$ or $O(\sqrt{R})$ for $R \to \infty$. Consequently, we have $R(t) \to +\infty$ as $t \to +\infty$, i.e., the galaxy may escape into infinity. Note that $t'(R) > 0$ for all $R > 0$ according to (70). Consequently the function $t \mapsto R(t)$ is strictly monotone increasing for $t > 0$.

(72b) If $E_0 < 0$, i.e., $\rho_0 > \rho_{\text{crit}}$, then the integrand in (70) becomes imaginary for large $R$. Therefore, $R(t)$ must remain bounded for $t \to +\infty$, i.e., the galaxy $\Gamma$ cannot escape into infinity.
In the context of the general theory of relativity one finds that for $\rho_0 < \rho_{\text{crit}}$ the universe is infinite, whereas for $\rho_0 > \rho_{\text{crit}}$ its volume is finite (see Chapter 76). We will show that $\rho_0 = 1.7 \cdot 10^{-27}$ kg/m$^3$, i.e.,
\[ \rho_0 \sim \rho_{\text{crit}} \]
and $E_0 \sim 0$. The uncertainty in the value of $\rho_0$, however, is currently so large that it is impossible to decide whether (72a) or (72b) is true, i.e., whether the universe is infinite or finite. If additional masses were discovered (interstellar matter, a small hypothetical mass of the large number of neutrinos in the universe, etc.), then this would favor the finite universe.

58.15b. Energy Density of the Present Universe

In one cubic meter, there are presently about $N_n$ nucleons (protons or neutrons) with $0.03 \le N_n \le 6$. For, say, $N_n = 1$, this yields a matter density for the present universe of
\[ \rho_n = 1.7 \cdot 10^{-27} \text{ kg/m}^3. \]
From (65) with $T = 3$ K we find that $5.4 \cdot 10^8$ photons are contained in one cubic meter. The ratio of the number of photons to the number of nucleons in one cubic meter is therefore presently about $10^9 : 1$. This is a very important consequence of the 3 K-radiation, as we shall see in Section 58.15f below. It is assumed that the photon-nucleon ratio has remained constant at all times. Formula (66) implies that the energy of the photons in one cubic meter is presently about $10^9 \cdot 2.7\, kT$. Dividing by $c^2$ we obtain the corresponding "photon mass density" of
\[ \rho_p = 10^{-30} \text{ kg/m}^3. \]
For the mass density of the present cosmos we find
\[ \rho_0 = \rho_n + \rho_p = 1.7 \cdot 10^{-27} \text{ kg/m}^3. \]
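The orders of magnitude quoted for the present universe can be reproduced in a few lines. The sketch below takes the photon density $2 \cdot 10^7 [T]^3$ per cubic meter and the mean photon energy $2.7\,kT$ from the text, assumes one nucleon per cubic meter as above, and uses rounded standard values for $c$, $G$, the nucleon mass and the length of a year; the variable names are illustrative.

```python
import math

k = 1.380e-23            # Boltzmann constant, J/K
c = 2.998e8              # velocity of light, m/s (rounded)
G = 6.674e-11            # gravitational constant, m^3/(kg s^2) (rounded)
NUCLEON_MASS = 1.67e-27  # kg (rounded)
SECONDS_PER_YEAR = 3.156e7

T = 3.0                                   # temperature of the radiation in K
photons_per_m3 = 2.0e7 * T**3             # photon density from (65)
photon_energy_density = photons_per_m3 * 2.7 * k * T   # J/m^3, using (66)

rho_p = photon_energy_density / c**2      # "photon mass density"
rho_n = 1.0 * NUCLEON_MASS                # one nucleon per cubic meter (assumed N_n = 1)
rho_0 = rho_n + rho_p

H0 = 1.0 / (2.0e10 * SECONDS_PER_YEAR)    # Hubble constant in 1/s
rho_crit = 3.0 * H0**2 / (8.0 * math.pi * G)

print(f"photons per m^3 : {photons_per_m3:.1e}")
print(f"rho_p           : {rho_p:.1e} kg/m^3")
print(f"rho_0           : {rho_0:.1e} kg/m^3")
print(f"rho_crit        : {rho_crit:.1e} kg/m^3")
print(f"rho_n / rho_p   : {rho_n / rho_p:.0f}")
```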
We assume that the entropy $S$ of the photon gas has remained constant since the Big Bang. Hence, formula (67) implies that $R^3(t) T^3 = \text{const}$, i.e., the temperature $T$ of the photon gas is given by
\[ T(t) = \text{const}/R(t). \tag{73} \]
At an early stage in its development the universe was therefore very hot. Currently the universe is not in a thermodynamical equilibrium, because there exist enormous temperature differences between the 3 K of the photon gas and the very hot matter of galaxies. But one assumes that at an early stage, there existed a common temperature for the universe, i.e., the universe was in a thermodynamical equilibrium. Actually, only a very gradually changing quasi-equilibrium was possible, because of the expansion of the universe (see Section 67.1).
The above considerations show that the ratio of the energy densities of nucleons and photons is currently of magnitude $10^3$, i.e., the nucleon energy dominates (matter cosmos). Since, according to (66), the mean energy of a photon depends linearly on the photon temperature, and the ratio of nucleons and photons has remained constant, it follows that at a photon temperature of approximately 3000 K, the ratio of the nucleon energy to the photon energy was equal to one. For higher temperatures, i.e., at an early stage of the cosmos, the photon energy dominated (radiation cosmos).

58.15c. Critical Temperature

Table 58.3 shows the so-called critical temperature. It is defined as
\[ T_{\text{crit}} = m_0 c^2 / k, \]
where $m_0$ is the rest mass of the particle. One assumes that, in a thermodynamical equilibrium, there exists a great number of particles with rest mass $m_0$ and hence rest energy $m_0 c^2$ above $T_{\text{crit}}$ (see Section 75.11). This is motivated as follows. Let $m_0 \ne 0$. For high temperatures experience shows us that we can use quasi-classical statistical physics. The particles in the early cosmos are regarded as an oscillating system. This is a fairly general assumption, as will be shown in Problem 58.11. The law of equipartition then implies that the mean energy of each vibrational degree of freedom is equal to
\[ E = kT \]
(see Problem 68.4). Moreover, the mean energy must be greater than the rest energy, i.e., $E \ge m_0 c^2$. This implies that
\[ T \ge T_{\text{crit}}. \]
For $m_0 = 0$ these observations show that $T$ has no lower bound. Thus we set $T_{\text{crit}} = 0$. In Table 58.3 photons and neutrinos have no rest mass; however, it is possible that neutrinos have a rest mass of 20 eV/$c^2$. In comparison, the electron mass is $5 \cdot 10^5$ eV/$c^2$.

Table 58.3
Particle                                                    Spin      Critical temperature $T_{\text{crit}}$
proton $p$, neutron $n$                                     ±1/2      $10^{13}$ K
$\pi$-mesons $\pi^0$, $\pi^\pm$                             0         $1.5 \cdot 10^{12}$ K
muons $\mu^\pm$                                             ±1/2      $10^{12}$ K
electron $e^-$, positron $e^+$                              ±1/2      $6 \cdot 10^{9}$ K
neutrinos $\nu_e$, $\bar\nu_e$, $\nu_\mu$, $\bar\nu_\mu$    1/2       0 K
photon $\gamma$                                             ±1        0 K
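The critical temperatures of Table 58.3 follow directly from $T_{\text{crit}} = m_0 c^2 / k$. The sketch below uses rounded rest energies in MeV and the Boltzmann constant in eV/K; these are standard values supplied here for illustration and are not quoted in the text.

```python
BOLTZMANN_EV_PER_K = 8.617e-5   # Boltzmann constant in eV/K (rounded)

# Rounded rest energies m0*c^2 in MeV (standard values, assumed here).
REST_ENERGY_MEV = {
    "proton": 938.27,
    "charged pion": 139.57,
    "muon": 105.66,
    "electron": 0.511,
}


def critical_temperature(rest_energy_mev):
    """T_crit = m0 c^2 / k in kelvin for a rest energy given in MeV."""
    return rest_energy_mev * 1.0e6 / BOLTZMANN_EV_PER_K


t_crit = {name: critical_temperature(e) for name, e in REST_ENERGY_MEV.items()}

for name, temperature in t_crit.items():
    print(f"{name:13s} T_crit = {temperature:.1e} K")
```

The results agree with the rounded entries of the table to within the accuracy stated there.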

58.15d. Development in Time of the Early Cosmos

At an early stage of the universe, time $t$ and mass density $\rho(t)$ were related by
\[ t = \sqrt{\frac{3}{32\pi G \rho(t)}}. \tag{74} \]
We prove this second key formula. From (68) follows
\[ \dot{R}^2 = E_0 + 2GM/R. \]
Because of $\dot{R} = HR$ we have
\[ H^2 = \frac{E_0}{R^2} + \frac{8\pi G \rho}{3}. \]
At the early stage of the cosmos ($R \to 0$, $t \to 0$, $\rho \to +\infty$), the energy density was mainly determined by the radiation energy of the photons and the other elementary particles. From (64) and (73) we find for the corresponding mass density $\rho = \varepsilon/c^2$ the relation
\[ \rho = \text{const} \cdot T^4 = \text{const} \cdot R^{-4}. \]
This implies
\[ H^2 = \frac{8\pi G \rho}{3} + O(\rho^{1/2}), \quad \rho \to +\infty. \]
In order to simplify the computations, we neglect the term $O(\rho^{1/2})$. We obtain
\[ H = CR^{-2}, \quad C = \text{const}, \]
and $\dot{R} = HR = C/R$. Integration yields
\[ C(t - t_1) = 2^{-1} (R(t)^2 - R(t_1)^2) \]
and hence
\[ t - t_1 = 2^{-1} (H(t)^{-1} - H(t_1)^{-1}). \]
For $t_1 \to 0$ we have $\rho(t_1) \to +\infty$ (Big Bang), hence $H(t_1)^{-1} \to 0$ and $t = 1/(2H(t))$. Together with $H^2 = 8\pi G\rho/3$ this implies (74).

58.15e. The Universe at a Temperature of 10 11 K

This temperature is chosen because, as a look at Table 58.3 shows, at this temperature essentially only photons, neutrinos, electrons, and positrons exist. In Section 68.6 we shall use methods of statistical physics and the results about the spin in Table 58.3 to show that the energy density of the universe at $10^{11}$ K is 9/2 times that of the photon gas. It follows therefore from (64) that the energy density is equal to
\[ \varepsilon(t) = 3.5 \cdot 10^{29} \text{ J/m}^3. \]
From (74) we obtain the corresponding time
\[ t = (3c^2 / 32\pi G \varepsilon(t))^{1/2} = 10^{-2} \text{ s}, \]
i.e., one hundredth of a second after the Big Bang the universe had the temperature $10^{11}$ K.
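The time of one hundredth of a second can be recomputed from the numbers given above. The sketch below takes the coefficient $7.6 \cdot 10^{-16}$ from (64) and the factor 9/2 from the text, and uses rounded standard values for $c$ and $G$; the function name is illustrative.

```python
import math

c = 2.998e8      # velocity of light, m/s (rounded)
G = 6.674e-11    # gravitational constant, m^3/(kg s^2) (rounded)


def time_after_big_bang(energy_density):
    """t = sqrt(3 c^2 / (32 pi G eps)), i.e. (74) with rho = eps / c^2."""
    return math.sqrt(3.0 * c**2 / (32.0 * math.pi * G * energy_density))


T = 1.0e11                                   # temperature in K
photon_energy_density = 7.6e-16 * T**4       # from (64), in J/m^3
energy_density = 4.5 * photon_energy_density # 9/2 times the photon gas

t = time_after_big_bang(energy_density)
print(f"energy density = {energy_density:.1e} J/m^3")
print(f"time           = {t:.1e} s")
```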

58.15f. Helium Synthesis


Computations in nuclear physics show that the helium synthesis began at a temperature of 900 million K. This mainly follows from the photon-nucleon ratio of about $10^9 : 1$. At this temperature the ratio of neutrons to protons is equal to 13:87 as a consequence of nuclear reactions. A hydrogen nucleus (or a helium nucleus) consists of one proton (or two protons and two neutrons). Furthermore, the mass of the proton is approximately equal to the mass of the neutron. Let us assume that all neutrons have been used in the formation of helium and the remaining protons are being used to form hydrogen. This yields
a mass ratio of helium to hydrogen of 26:74,
i.e., approximately 1 : 3. This is about the ratio which is currently observed in the universe, and is thus another reason to assume that the standard model, discussed above, is a correct model for the evolution of the universe.
A popular and elementary exposition about the development of the cosmos may be found in Weinberg (1977, M), and a thorough discussion of the mathematical-physical theory in the context of the general theory of relativity is contained in Weinberg (1972, M). We merely mention here that the helium synthesis took place 225 s after the Big Bang. Further interesting numbers can be found in Table 58.4.

Table 58.4
Object                                              Approximate age
Cosmos                                              maximal 20 milliard years
Our galaxy (Milky Way)                              10 milliard years
Sun and Earth                                       5 milliard years
Life on Earth                                       4 milliard years
Earth continents                                    3 milliard years
Earth atmosphere                                    1 milliard years
Explosion of life on Earth                          600 million years ago
  (fishes, plants, insects)
Dinosaurs appear                                    200 million years ago
Dinosaurs disappear; strong                         60 million years ago
  development of mammals
Man                                                 1-3 million years
Human history                                       5,000 years
Formation of modern science                         400 years ago
  through Galilei and Newton
Einstein's general theory of relativity             1915
Friedman's model of the expanding                   1922
  universe in the context of the
  general theory of relativity
Hubble's law about the red shift                    1929
Atomic bomb                                         1945
Discovery of the 3 K-radiation                      1965
Humans on the moon                                  1969

58.16. Legendre Transformation and Conjugate Functionals

In mechanics and thermodynamics the change of dependent and independent
variables plays an important role. This is called the Legendre transformation.
We want to explain the basic idea. Important applications will be discussed
in Section 58.20 (Hamiltonian formalism) and in Section 67.5 (thermodynamical
potentials). Let the real function
$$E = E(S, V)$$
be $C^2$ on $\mathbb{R}^2$. For the total differential we obtain
$$dE = T\,dS + P\,dV \tag{75}$$
with
$$T = E_S(S, V), \qquad P = E_V(S, V). \tag{76}$$
Our goal is to choose $T$ and $V$ as independent variables. More precisely, we
assume that the first equation in (76) can be solved for $S$, i.e.,
$$S = S(T, V).$$
Locally, as a consequence of the implicit function theorem (Theorem 4.B),
this is possible if $E_{SS}(S, V) \neq 0$. Globally, this is possible if
$S \mapsto E_S(S, V)$ is strictly monotone increasing with
$E_S(S, V) \to \pm\infty$ for $S \to \pm\infty$. In the thermodynamics of
Chapter 67 we have $V$ = volume, $T$ = absolute temperature, $E$ = inner
energy, $S$ = entropy, and $P$ = pressure.

The trick of the Legendre transformation is to find a new function
$$F = F(V, T)$$
which is related to $E$ in such a way that the total differential $dF$ assumes a
particularly simple form. Formally, $F$ can easily be obtained from the product
rule
$$d(ab) = a\,db + b\,da.$$
Equation (75) implies that
$$dE = d(TS) - S\,dT + P\,dV.$$
If we choose $F = E - TS$, then
$$dF = -S\,dT + P\,dV, \tag{77}$$
and hence
$$P = F_V(V, T). \tag{78}$$
The function $F$ is called free energy. In order to justify this formal computation
we consider the $C^1$-curves
$$S = S(t), \qquad V = V(t),$$
and compute $T(t)$, $P(t)$ from (76). For the time derivatives we find
$$\dot F = \dot E - \dot T S - T\dot S = T\dot S + P\dot V - \dot T S - T\dot S = -S\dot T + P\dot V.$$
If, in particular, we choose the curves in such a way that
$$T(t) = t, \qquad V = \text{const},$$
then we obtain
$$\dot F = F_T = -S.$$
If $T(t) = \text{const}$, $V(t) = t$ we obtain $\dot F = F_V = P$. This is (78).
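
The relations $F_T = -S$ and $F_V = P$ can be checked numerically. The sketch below assumes a hypothetical quadratic energy $E(S, V)$, chosen only because $T = E_S(S, V)$ can then be solved for $S$ by hand; it is not taken from the text. Derivatives of $F = E - TS$ are approximated by central differences.

```python
def E(S, V):
    # Hypothetical energy (an assumption for illustration, not from the text).
    return 0.5 * S**2 + S * V + V**2


def E_S(S, V):
    # T in (76)
    return S + V


def E_V(S, V):
    # P in (76)
    return S + 2.0 * V


def S_of(T, V):
    # Solve T = E_S(S, V) = S + V for S.
    return T - V


def F(T, V):
    # Free energy F = E - T S, expressed in the variables (T, V).
    S = S_of(T, V)
    return E(S, V) - T * S


def central_difference(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2.0 * h)


T0, V0 = 1.3, 0.4
S0 = S_of(T0, V0)
F_T = central_difference(lambda T: F(T, V0), T0)
F_V = central_difference(lambda V: F(T0, V), V0)

print("F_T =", F_T, "  -S =", -S0)
print("F_V =", F_V, "   P =", E_V(S0, V0))
```

Both printed pairs agree up to rounding, which is the content of (77) and (78) for this particular choice of $E$.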
Finally, we explain the connection with conjugate functionals, which
have been discussed in Chapter 51 of Part III. We begin with
$$E = E(S, V),$$
consider $V$ as a parameter, and set
$$E^*(S^*, V) = \sup_{S \in \mathbb{R}} \bigl(S^* S - E(S, V)\bigr). \tag{79}$$
From Section 51.1 it then follows that $E^*$ is the conjugate function to $E$, and
$S^*$ is the conjugate variable to $S$. We want to show that for a sufficiently regular
function $E$ in the above sense, the following holds:
$$E^* = -F, \qquad S^* = T.$$
In fact, the existence of a unique maximum in (79) and differentiation with
respect to $S$ immediately imply $S^* = E_S(S, V)$, hence $T = S^*$ and
$$E^*(S^*, V) = TS - E(S, V) = -F(T, V).$$
In Example 51.4 we showed that the following two conditions are sufficient
for the existence of a unique maximum in (79):
strict convexity
$$E_{SS}(S, V) > 0 \quad \text{for all } S, V$$
and coercivity
$$E(S, V)/|S| \to +\infty \quad \text{as } |S| \to \infty \text{ for all } S, V.$$
Transformation (79), however, can also be performed if $E$ is not sufficiently
regular.
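
The supremum in (79) can also be evaluated numerically. The sketch below assumes the hypothetical energy $E(S) = \cosh S$ at fixed $V$ (strictly convex and coercive, but not taken from the text) and locates the maximum of $S^* S - E(S)$ by ternary search, which is admissible here because that function is strictly concave. For this $E$ one has $T = \sinh S$, so the result can be compared with the closed form $T \operatorname{arsinh} T - \sqrt{1 + T^2}$.

```python
import math


def E(S):
    # Hypothetical energy at fixed V: strictly convex and coercive.
    return math.cosh(S)


def conjugate(S_star, lo=-30.0, hi=30.0, iterations=200):
    """Approximate E*(S*) = sup_S (S* S - E(S)) by ternary search.

    Returns the approximate supremum and the maximizer S.
    """
    def g(S):
        return S_star * S - E(S)

    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            lo = m1   # the maximum lies to the right of m1
        else:
            hi = m2   # the maximum lies to the left of m2
    S_max = 0.5 * (lo + hi)
    return g(S_max), S_max


T = 2.0
E_star, S_max = conjugate(T)
F = E(S_max) - T * S_max          # free energy F = E - T S at the maximizer
closed_form = T * math.asinh(T) - math.sqrt(1.0 + T * T)

print("E*(T)       =", E_star)
print("-F          =", -F)
print("closed form =", closed_form)
```

The maximizer satisfies $E_S(S) = T$ up to the accuracy of the search, so $S^* = T$ and $E^* = -F$ for this example.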

58.17. Lagrange Multipliers


Our goal in the following sections is this. In special situations, we want to use
the general principle of Gauss' least constraint to obtain more convenient
formulations of the basic equations of mechanics which can also be applied
to other physical theories, namely:
(i) Principle of stationary action and Lagrangian mechanics.
(ii) Canonical equations and Hamiltonian mechanics.
(iii) Poisson brackets and Poissonian mechanics.
(iv) Propagation of action and the Hamilton-Jacobi theory about the relations
between the canonical equations (ordinary differential equations) and
the Hamilton-Jacobi equation (first-order partial differential equation).
To recognize the connection between mechanics and a variational principle,
in the following section we consider the variational problem
$$\int_{t_1}^{t_2} L(y, \dot y, t)\,dt = \text{stationary!}, \qquad y(t_1) = a, \quad y(t_2) = b, \tag{80}$$
with the side condition
$$M_j(y) = 0, \qquad j = 1, \ldots, J, \tag{81}$$
where $y = (y^1, \ldots, y^n) \in \mathbb{R}^n$. The real numbers $t_1$ and $t_2$ and $a, b \in \mathbb{R}^n$ are given.
Moreover, let $y(t) \in G$ for all $t \in [t_1, t_2]$. We set
$$\bar L(y, \dot y, t) = L(y, \dot y, t) + \sum_{j=1}^{J} \lambda_j(t) M_j(y)$$
and $P(t) = (y(t), \dot y(t), t)$. The necessary conditions for a solution $y = y(t)$ of (80),
(81) are then
$$\frac{d}{dt} \bar L_{\dot y^i}(P(t)) - \bar L_{y^i}(P(t)) = 0, \qquad i = 1, \ldots, n \tag{82}$$
for all $t \in\, ]t_1, t_2[$. The functions $\lambda_j$ are called Lagrange multipliers. Our
assumptions are:
(H1) $G$ is a nonempty, open set in $\mathbb{R}^n$ and $1 \le J < n$.
(H2) The functions $M_j\colon G \to \mathbb{R}$, $j = 1, \ldots, J$ are $C^2$, and the matrix
$(\partial M_j/\partial y^i)$ has maximal rank $J$ at each point of $G$.
According to the preimage theorem of Section 4.18 this assumption guarantees
that the set of all $y \in G$ which satisfy the side conditions (81) forms an
$(n - J)$-dimensional $C^2$-manifold $M$ which is called the configuration space.
This fact is the key to our proof.
(H3) The function $L\colon G \times \mathbb{R}^{n+1} \to \mathbb{R}$ is $C^2$.

Theorem 58.E. Assume (H1)-(H3). If $y\colon [t_1, t_2] \to G$ is a $C^2$-solution of (80),
(81), then there exist functions $\lambda_j\colon [t_1, t_2] \to \mathbb{R}$ such that (82) holds.
Without side conditions (81) one obtains (82) with $\bar L = L$.

The importance of this theorem is that the case of side conditions can
formally be reduced to the case without side conditions by replacing $L$ with $\bar L$.
PROOF.

(I) No side conditions. For this case, (82) with $\bar L = L$ has already been
proved in Section 37.4b. There we had $n = 1$. But for $n > 1$ the proof
is the same if one only varies $y^i$ and leaves all other $y^k$ fixed.
(II) Side conditions. The proof idea is to eliminate the side conditions by
using $y = \varphi(q)$ and then to apply (I).
(II-1) Localization. Let $y = y(t)$ be a solution of (80), (81). We choose a fixed
solution point $y_0 = y(t_0)$ and vary $y(\cdot)$ only in a neighborhood of $y_0$.
Therefore it suffices to consider the case that $[t_1, t_2]$ is a sufficiently
small interval which contains $t_0$ in its interior and where $G$ is a small
neighborhood of $y_0$ in which the solution manifold $M$ of (81) has the
parameter representation
$$y = \varphi(q) \tag{83}$$
with $q = (q^1, \ldots, q^m)$ and $\varphi(q_0) = y_0$. Because of $\dim M = m$, the matrix
$(\partial\varphi^i(q)/\partial q^k)$ has rank $m = n - J$.
(II-2) Elimination. Using transformation (83) we obtain from (80), (81) the
new variational problem
$$\int_{t_1}^{t_2} N(q, \dot q, t)\,dt = \text{stationary!}, \qquad q(t_1) = a, \quad q(t_2) = b, \tag{84}$$
where
$$N(q, \dot q, t) = L(\varphi(q), \varphi'(q)\dot q, t).$$
According to (I) the necessary conditions for (84) are
$$\frac{d}{dt} N_{\dot q^k} - N_{q^k} = 0, \qquad k = 1, \ldots, m. \tag{85}$$
The chain rule implies for the partial F-derivatives
$$N_q h = L_y \varphi'(q) h + L_{\dot y} \varphi''(q) h \dot q,$$
$$N_{\dot q} h = L_{\dot y} \varphi'(q) h,$$
$$\frac{d}{dt} N_{\dot q} h = \Bigl(\frac{d}{dt} L_{\dot y}\Bigr) \varphi'(q) h + L_{\dot y} \varphi''(q) \dot q h.$$
From (85), i.e., from $(d/dt) N_{\dot q} - N_q = 0$ we therefore obtain
$$\Bigl(\frac{d}{dt} L_{\dot y} - L_y\Bigr) \varphi'(q) = 0. \tag{86a}$$
Differentiation of (81) with respect to $q$ yields
$$M_j'(\varphi(q)) \varphi'(q) = 0, \qquad j = 1, \ldots, J. \tag{86b}$$
(II-3) Lagrange multipliers. In component notation equation (86b) reads
$$\sum_{r=1}^{n} \frac{\partial M_j}{\partial y^r} \frac{\partial \varphi^r}{\partial q^k} = 0, \qquad k = 1, \ldots, m.$$
At the point $q_0$ this is a linear homogeneous system with coefficient
matrix $(\partial\varphi^r(q_0)/\partial q^k)$. This matrix has rank $m = n - J$. Thus the $J$
linearly independent vectors $M_j'$ span the entire solution space. Together
with (86a), relation (86b) therefore implies the linear dependence
$$\frac{d}{dt} L_{\dot y} - L_y = \sum_j \lambda_j M_j'.$$
This is (82) for $t = t_0$. □
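
The linear algebra of the last proof step can be illustrated for the simplest case $n = 2$, $J = 1$. The sketch below assumes the hypothetical side condition $M(y) = (y^1)^2 + (y^2)^2 - 1 = 0$ (the unit circle) with the parametrization $\varphi(q) = (\cos q, \sin q)$; neither is taken from the text. It checks (86b) and recovers the multiplier $\lambda$ for a vector that is orthogonal to $\varphi'(q_0)$, as required by (86a).

```python
import math


def phi(q):
    # Hypothetical parametrization of the unit circle.
    return (math.cos(q), math.sin(q))


def phi_prime(q):
    # Tangent vector of the constraint manifold.
    return (-math.sin(q), math.cos(q))


def M_prime(y):
    # Gradient of the hypothetical constraint M(y) = y1^2 + y2^2 - 1.
    return (2.0 * y[0], 2.0 * y[1])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def multiplier(v, y):
    """Return lam with v = lam * M'(y), assuming v is orthogonal to the tangent."""
    g = M_prime(y)
    return dot(v, g) / dot(g, g)


q0 = 0.7
y0 = phi(q0)

# (86b): the constraint gradient annihilates the tangent vector.
residual_86b = dot(M_prime(y0), phi_prime(q0))

# A stand-in for (d/dt) L_ydot - L_y: any vector satisfying (86a).
v = tuple(3.0 * c for c in M_prime(y0))
residual_86a = dot(v, phi_prime(q0))
lam = multiplier(v, y0)

print("residual of (86b):", residual_86b)
print("residual of (86a):", residual_86a)
print("recovered multiplier:", lam)
```

Since the solution space of the homogeneous system is one-dimensional here, every vector satisfying (86a) is a multiple of $M'$, and the factor is the Lagrange multiplier.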

58.18. Principle of Stationary Action


We now want to show that for a typical physical situation the equations of
motion which have been obtained from the principle of least constraint can
also be derived from a variational principle. This is an important result.
We consider here the following situation. Let $N$ mass points $x_i$ with
masses $m_i$ be given. Suppose the force $K_i$ acts on $x_i$, whereby a potential
$U = U(x_1, \ldots, x_N, t)$ exists, i.e.,
$$K_i = -U_{x_i}, \qquad i = 1, \ldots, N,$$
and assume that the motion satisfies the side conditions
$$f^{(r)}(x_1, \ldots, x_N) = 0, \qquad r = 1, \ldots, R \tag{87}$$
with $R < 3N$. Under suitable regularity conditions one obtains from the
principle of least constraint the equations of motion
$$m_i \ddot x_i = -U_{x_i} + \sum_{r=1}^{R} \lambda_r f^{(r)}_{x_i}, \qquad i = 1, \ldots, N. \tag{88}$$
The following observation is now important. We set
$$L = T - U \tag{89}$$
with kinetic energy
$$T = \sum_{i=1}^{N} m_i \dot x_i^2/2$$
and potential energy $U$. Moreover, we consider the variational problem
$$\int_{t_1}^{t_2} L\,dt = \text{stationary!}, \qquad x_i(t_1) = a_i, \quad x_i(t_2) = b_i, \quad i = 1, \ldots, N \tag{90a}$$
with side conditions
$$f^{(r)}(x_1, \ldots, x_N) = 0, \qquad r = 1, \ldots, R. \tag{90b}$$
Under suitable regularity conditions a solution of (90) then satisfies the
equations
$$\frac{d}{dt} \bar L_{\dot x_i} - \bar L_{x_i} = 0, \qquad i = 1, \ldots, N \tag{91}$$
with
$$\bar L = L + \sum_{r=1}^{R} \lambda_r f^{(r)},$$
according to Theorem 58.E. A comparison implies the key result that (91) is
equal to the equations of motion (88). Therefore (88) follows from the
variational principle (90), which is called the principle of stationary action.
The physical significance of the Lagrange multipliers $\lambda_r$ is the fact that they
yield constraining forces.

58.19. Trick of Position Coordinates and Lagrangian Mechanics

By generalizing Euler's method, Lagrange got the idea for his remarkable
formulas, where, in a single line, there is contained the solution of all problems
of analytic mechanics.
Carl Gustav Jakob Jacobi (1804-1851)

Lagrangian mechanics starts from the variational problem
$$\int_{t_1}^{t_2} L(q, \dot q, t)\,dt = \text{stationary!}, \qquad q(t_1) = a, \quad q(t_2) = b, \tag{92}$$
where $q = (q^1, \ldots, q^m) \in \mathbb{R}^m$ and $q(t) \in Q$ for all $t \in [t_1, t_2]$. We assume:
(H) The Lagrange function $L\colon Q \times \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}$ is $C^2$, where $Q$ is a nonempty,
open set in $\mathbb{R}^m$.
For a $C^2$-solution $q\colon [t_1, t_2] \to Q$ of (92), Theorem 58.E implies the famous
Lagrangian equation of motion
$$\frac{d}{dt} L_{\dot q}(P(t)) - L_q(P(t)) = 0 \tag{93}$$
for all $t \in [t_1, t_2]$ with $P(t) = (q(t), \dot q(t), t)$ or in component notation
$$\frac{d}{dt} L_{\dot q^k}(P(t)) - L_{q^k}(P(t)) = 0, \qquad k = 1, \ldots, m.$$
We want to show how (92) follows from the previous section.

(T) Trick of Lagrangian position coordinates $q$. Let a system of mass points be
given such as in Section 58.18. We set
$$L = T - U,$$
where $T$ is the kinetic energy and $U$ the potential energy. The important
idea is to introduce coordinates $q$ such that with respect to $q$ no side
conditions occur. Moreover, we transform $T$ and $U$ into these coordinates
and use (92), (93).
If, for example, the motion takes place on a circle, then the angle
$\varphi$ can be introduced as position coordinate $q$.
Method (T) precisely corresponds to our proof of Theorem 58.E, whereby
(93) corresponds to (85). Thus the basic idea is as follows. The side conditions
of Section 58.18 yield a configuration space $M$, which is a manifold, and which
is parametrized by the coordinate $q$. Thereby the side conditions vanish
automatically. In the language of manifolds this means that we pass to a chart.
Thereby $m$ is called the number of degrees of freedom of the system. The
variational problem (90) then becomes (92).
If $q = q(t)$ is a $C^2$-solution of the Lagrangian equations (93) on $[t_1, t_2]$, and
if $L$ does not depend on $t$, then
$$A = \dot q L_{\dot q} - L$$
is a conserved quantity, i.e., it is constant along $q = q(t)$. In fact, we have
$$\frac{d}{dt}(\dot q L_{\dot q} - L) = \ddot q L_{\dot q} + \dot q \frac{d}{dt} L_{\dot q} - L_q \dot q - L_{\dot q} \ddot q = 0.$$
Here we use the convention (102) below. In many cases we have $A = T + U$,
i.e., $A$ is the energy. The precise conditions for this are given in Remark 58.22
below.
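
The conservation of $A = \dot q L_{\dot q} - L$ can be observed numerically for motion on a circle with the angle $\varphi$ as position coordinate. The sketch below assumes a pendulum with hypothetical parameters `m`, `l`, `g` and the Lagrange function $L = m l^2 \dot\varphi^2/2 + m g l \cos\varphi$; the classical Runge-Kutta scheme used for the integration is a choice made here, not a method from the text.

```python
import math

m, l, g = 1.0, 1.0, 9.81   # hypothetical mass, length, gravitational acceleration


def L(phi, phi_dot):
    T = 0.5 * m * l**2 * phi_dot**2      # kinetic energy
    U = -m * g * l * math.cos(phi)       # potential energy
    return T - U


def A(phi, phi_dot):
    # A = q' L_{q'} - L with L_{q'} = m l^2 phi_dot
    return phi_dot * (m * l**2 * phi_dot) - L(phi, phi_dot)


def rhs(state):
    phi, phi_dot = state
    # Lagrange's equation (93): m l^2 phi'' + m g l sin(phi) = 0
    return (phi_dot, -(g / l) * math.sin(phi))


def rk4_step(state, dt):
    def shifted(s, k, c):
        return (s[0] + c * k[0], s[1] + c * k[1])

    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, dt / 2.0))
    k3 = rhs(shifted(state, k2, dt / 2.0))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(
        state[i] + dt / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i])
        for i in range(2)
    )


state = (1.0, 0.0)           # initial angle (rad) and angular velocity
A_start = A(*state)
for _ in range(2000):        # integrate over two time units
    state = rk4_step(state, 1e-3)
A_end = A(*state)

print("A at start:", A_start)
print("A at end:  ", A_end)
```

For this Lagrange function $A = T + U$, so the nearly constant value printed is the energy of the pendulum; the small remaining drift is a discretization error.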

Remark 58.20 (General Manifolds). We now use several important concepts
from the theory of manifolds which will be introduced in Chapter 73. By
restricting ourselves to open sets $Q$, our formulation above is local. In fact, we
can consider the more general case that the configuration space $M$ is a real
$m$-dimensional $C^3$-manifold which generally cannot be described through
a single chart. The formulation for the general $M$, however, is completely
analogous to (92), (93). In this case $q$ is a point on $M$ and $\dot q$ is a point in the
tangent space $TM_q$, since the tangent vector $\dot q(t)$ to the curve $t \mapsto q(t)$ at the
fixed point $q = q(t)$ lies in $TM_q$. Therefore $(q, \dot q)$ lies in the tangent bundle $TM$
which is a $C^2$-manifold. The Lagrange function
$$L\colon TM \times \mathbb{R} \to \mathbb{R}$$
must then be $C^2$. Furthermore, $L_q$ and $L_{\dot q}$ in (93) are the corresponding tangent
maps, i.e.,
$$L_q(q, \dot q, t)\colon TM_q \to \mathbb{R}, \tag{94}$$
$$L_{\dot q}(q, \dot q, t)\colon TM_q \to \mathbb{R}. \tag{95}$$
More precisely, $q \mapsto L(q, \dot q, t)$ is a $C^2$-map from $M$ into $\mathbb{R}$ with tangent map
(94) at the point $q$, and $\dot q \mapsto L(q, \dot q, t)$ is a $C^2$-map from $TM_q$ into $\mathbb{R}$ with tangent
map (95). Since, by definition, tangent maps are linear and continuous, it
follows that
$$L_q(q, \dot q, t),\ L_{\dot q}(q, \dot q, t) \in TM_q^*, \tag{96}$$
i.e., these are cotangent vectors (elements of the cotangent space $TM_q^*$).
The formulation of classical mechanics on manifolds in the context of
symplectic geometry will be discussed in Part V.
As an application of the Lagrangian formalism we consider in Problem 58.7
the circular pendulum. Lagrangian mechanics, i.e., the principle of stationary
action (92), is a model for many field theories in physics. In Chapter 76 we
will show that the equations of the general theory of relativity can be derived
from a variational principle. Moreover, Maxwell's electrodynamics and all
quantum field theories can also be obtained from such variational principles.
This will be discussed in Part V.

58.20. Hamiltonian Mechanics


Hamiltonian mechanics starts from the canonical equations
$$\dot q(t) = H_p(q(t), p(t), t), \qquad \dot p(t) = -H_q(q(t), p(t), t), \tag{97}$$
or more simply
$$\dot q = H_p, \qquad \dot p = -H_q.$$
In component notation this means
$$q = (q^1, \ldots, q^m), \quad p = (p_1, \ldots, p_m) \quad \text{with } p, q \in \mathbb{R}^m$$
and
$$\dot q^k = H_{p_k}, \qquad \dot p_k = -H_{q^k}, \qquad k = 1, \ldots, m. \tag{98}$$
We will see that in many cases $H$ is the energy. Moreover, $q$ describes the
position of the system and the components of $p$ are called generalized momenta.
Our assumptions are:
(H) The Hamilton function $H\colon G \times \mathbb{R} \to \mathbb{R}$ is $C^1$, where $G$ is a nonempty, open
set in $\mathbb{R}^{2m}$.
$G$ is called the phase space, and (97) is also called a Hamiltonian system.

STANDARD EXAMPLE 58.21 (Harmonic Oscillator). Newton's equations of
motion for the harmonic oscillator are
$$m\ddot q = -kq \tag{99}$$
with $q \in \mathbb{R}$ and $k, m > 0$. The solution is
$$q(t) = C \sin(\omega t + \alpha)$$
with the angular frequency $\omega = \sqrt{k/m}$. The Lagrange function $L = T - U$ is
$$L = (m\dot q^2 - kq^2)/2.$$
Lagrange's equation $(d/dt) L_{\dot q} - L_q = 0$ implies (99). In order to obtain the
form (97) we set
$$p = L_{\dot q},$$
i.e., $p = m\dot q$ is the momentum. Moreover, let
$$H = p\dot q - L,$$
hence $H = (m\dot q^2 + kq^2)/2 = T + U$. This is the energy. Transformation onto
$p$ yields
$$H(q, p) = \frac{p^2}{2m} + \frac{kq^2}{2}.$$
Thus the Hamiltonian equations $\dot q = H_p$, $\dot p = -H_q$ mean
$$\dot q = p/m, \qquad \dot p = -kq.$$
This is equivalent to (99).
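
The canonical equations of the harmonic oscillator can be integrated directly and compared with the solution $q(t) = C\sin(\omega t + \alpha)$. The sketch below assumes hypothetical values for $m$ and $k$ and uses the leapfrog scheme, which is a choice made here for illustration and not a method from the text.

```python
import math

m, k = 1.0, 4.0                      # hypothetical mass and spring constant
omega = math.sqrt(k / m)             # angular frequency


def H(q, p):
    return p * p / (2.0 * m) + k * q * q / 2.0


def leapfrog(q, p, dt, steps):
    """Integrate q' = H_p = p/m, p' = -H_q = -k q with the leapfrog scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * k * q        # half step for p
        q += dt * p / m              # full step for q
        p -= 0.5 * dt * k * q        # second half step for p
    return q, p


q0, p0 = 1.0, 0.0                    # corresponds to C = 1, alpha = pi/2
t_end, steps = 10.0, 10000
q1, p1 = leapfrog(q0, p0, t_end / steps, steps)
q_exact = math.sin(omega * t_end + math.pi / 2.0)

print("numerical q:", q1, " exact q:", q_exact)
print("H at start: ", H(q0, p0), " H at end:", H(q1, p1))
```

The numerical position follows the exact solution closely, and $H = T + U$ stays nearly constant along the trajectory.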

Similarly as in this simple example we now want to show how the canonical
equations can be obtained from the Lagrangian mechanics of Section 58.19.
The key is the Legendre transformation
$$p = L_{\dot q}(q, \dot q, t), \tag{100}$$
$$H = \langle p, \dot q\rangle - L, \tag{101}$$
i.e., the change from $(q, \dot q)$ and $L$ to $(q, p)$ and $H$. In component notation this
means that
$$p_k = L_{\dot q^k}(q, \dot q, t), \qquad k = 1, \ldots, m,$$
$$H = p_j \dot q^j - L,$$
where $j$ is summed from 1 to $m$.
In the following we often use the convenient notation
$$ab \overset{\text{def}}{=} \langle a, b\rangle \quad \text{for } a \in X^* \text{ and } b \in X. \tag{102}$$
In the case that $X = X^* = \mathbb{R}^m$ this means that $ab = a_i b^i$.
We assume that $L$ is $C^2$ and that (100) can be uniquely solved for $\dot q$, whereby
a $C^1$-map
$$\dot q = \dot q(q, p, t)$$
occurs. Then (101) becomes
$$H(q, p, t) = p\,\dot q(q, p, t) - L(q, \dot q(q, p, t), t).$$
Let $q = q(t)$ be a $C^2$-solution of the Lagrangian equation
$$\frac{d}{dt} L_{\dot q}(P(t)) - L_q(P(t)) = 0 \tag{103}$$
with $P(t) = (q(t), \dot q(t), t)$. Using the Legendre transformation (100) we obtain
$p(t)$, whereby
$$\dot q(t) = \dot q(q(t), p(t), t).$$
In order to obtain (97) we compute the partial F-derivatives
$$H_q h = p(\dot q_q h) - L_q h - L_{\dot q}(\dot q_q h) = -L_q h,$$
$$H_p k = k\dot q + p(\dot q_p k) - L_{\dot q}(\dot q_p k) = k\dot q,$$
for all $h, k \in \mathbb{R}^m$ and observe (102). It follows that
$$H_q = -L_q, \qquad H_p = \dot q,$$
i.e., more precisely
$$H_q(q, p, t) = -L_q(q, \dot q, t), \qquad H_p(q, p, t) = \dot q$$
with $\dot q = \dot q(q, p, t)$. From (103), i.e., from $\dot p = L_q = -H_q$ and $\dot q = H_p$ we
immediately obtain (97).
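
The identities $H_q = -L_q$ and $H_p = \dot q$ can be checked numerically for one degree of freedom. The sketch below assumes the hypothetical Lagrange function $L = m\dot q^2/2 - q^4/4$ (not taken from the text), for which (100) reads $p = m\dot q$ and can be solved for $\dot q$ by hand; partial derivatives are approximated by central differences.

```python
m = 2.0   # hypothetical mass


def U(q):
    # Hypothetical potential energy.
    return 0.25 * q**4


def L(q, q_dot):
    return 0.5 * m * q_dot**2 - U(q)


def q_dot_of(q, p):
    # Solve (100), p = L_{q'} = m q', for q'.
    return p / m


def H(q, p):
    # Legendre transformation (101): H = p q' - L with q' = q'(q, p).
    q_dot = q_dot_of(q, p)
    return p * q_dot - L(q, q_dot)


def central_difference(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2.0 * h)


q0, p0 = 0.7, 1.5
qd0 = q_dot_of(q0, p0)

H_q = central_difference(lambda q: H(q, p0), q0)
H_p = central_difference(lambda p: H(q0, p), p0)
L_q = central_difference(lambda q: L(q, qd0), q0)   # q' held fixed

print("H_q =", H_q, "  -L_q =", -L_q)
print("H_p =", H_p, "   q'  =", qd0)
```

Note that $L_q$ is taken at fixed $\dot q$ while $H_q$ is taken at fixed $p$; the two derivatives still differ only in sign, as the computation above predicts.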

Remark 58.22 (Energy). If $L = T - U$ with kinetic energy $T = T(q, \dot q, t)$ and
potential energy $U = U(q, t)$ and if $T$ is homogeneous of degree 2 with respect
to $\dot q$, then
$$H = T + U,$$
i.e., $H$ is equal to the energy.
For a proof we differentiate the identity
$$T(q, \alpha\dot q, t) = \alpha^2 T(q, \dot q, t) \quad \text{for all } \alpha \in \mathbb{R},$$
with respect to $\alpha$ at the point $\alpha = 1$. This implies $T_{\dot q}\dot q = 2T$. Moreover, we have
$p = L_{\dot q} = T_{\dot q}$, hence
$$H = p\dot q - L = 2T - T + U.$$

Remark 58.23 (Legendre Transformation and Differentials). The Legendre
transformation corresponds precisely to the natural transformation method
discussed in Section 58.16 where $\dot q$ is replaced with a new variable $p$. In fact,
we have
$$dL = L_q\,dq + L_{\dot q}\,d\dot q + L_t\,dt = L_q\,dq + p\,d\dot q + L_t\,dt$$
with $L_q\,dq = L_{q^i}\,dq^i$, etc. From
$$d(p\dot q) = (dp)\dot q + p\,d\dot q$$
we obtain
$$d(p\dot q - L) = -L_q\,dq + \dot q\,dp - L_t\,dt.$$
Letting $H = p\dot q - L$, we obtain $H_q = -L_q$, $H_p = \dot q$ and $H_t = -L_t$.

Remark 58.24 (General Manifolds). We now consider the same situation
as in Remark 58.20, i.e., $q$ varies on the configuration space $M$, where $M$ is a
real, $m$-dimensional $C^3$-manifold. The canonical equations (97) can then be
naturally formulated for this more general situation. From $p = L_{\dot q}(q, \dot q, t)$ and
(96) follows
$$p \in TM_q^*.$$
Thus $(q, p)$ is a point of the cotangent bundle $TM^*$, which is a $C^2$-manifold.
We now assume that the Hamilton function
$$H\colon TM^* \times \mathbb{R} \to \mathbb{R}$$
is $C^1$. The corresponding tangent maps are
$$H_q(q, p, t)\colon TM_q \to \mathbb{R}, \tag{104}$$
$$H_p(q, p, t)\colon TM_q^* \to \mathbb{R}. \tag{105}$$
More precisely, $q \mapsto H(q, p, t)$ is a $C^1$-map from $M$ into $\mathbb{R}$ with tangent map
(104), and $p \mapsto H(q, p, t)$ is a $C^1$-map from $TM_q^*$ into $\mathbb{R}$ with tangent map (105).
As an $m$-dimensional, real vector space the tangent space $TM_q$ is reflexive, i.e.,
$TM_q^{**} = TM_q$. In this sense we have
$$H_q(q, p, t) \in TM_q^*, \qquad H_p(q, p, t) \in TM_q,$$
and the canonical equations (97) are valid.
For fixed time $t$ the Legendre transformation (101) corresponds to the
change from
$$(q, \dot q) \in TM \quad \text{to} \quad (q, p) \in TM^*,$$
i.e., to a transformation $\varphi\colon TM \to TM^*$ which maps the tangent bundle onto
the cotangent bundle. Behind the Legendre transformation a duality principle
is hidden. The precise relation with conjugate functionals in the context of
duality theory of convex analysis has already been discussed in Section 51.1
of Part III.

Hamiltonian mechanics has a number of important advantages.

(i) The solution of the canonical equations (97) can often be greatly
simplified by using canonical transformations (Section 58.24).
(ii) By using Poisson brackets one can give simple criteria for the conserved
quantities (see Section 58.21).
(iii) In contrast to the Lagrangian equations, the canonical equations (97) are
first-order differential equations, i.e., they represent a dynamical system
for which the entire theory of dynamical systems is available.
(iv) The canonical equations are closely related to symplectic geometry. In
particular, this implies the following. If $q = q(t)$, $p = p(t)$ are solutions of
(97), then they can be interpreted as trajectories of a flow in the phase
space; this flow is volume preserving (Liouville's theorem) and forms the
starting point for classical statistical physics (see Part V).
(v) The solution theory for the canonical equations is closely related to the
solution theory of a partial differential equation, namely the Hamilton-
Jacobi differential equation for the action function (see Section 58.23).
(vi) The canonical formalism can be generalized to infinite-dimensional
systems. Thereby field theories can be studied in the language of canonical
equations (see Part V).
In summary, one can say the following. Up to now, physical experience
shows that the canonical formalism is a very flexible instrument to formulate
and study physical theories. The reason for this probably lies in the fact that
the connection with two fundamental physical quantities, namely energy $H$
and action $S$, is very close and explicit.
In Section 75.11 we will use the Hamiltonian formalism to develop the
mechanics of a free particle in the context of the special theory of relativity.
Thereby we naturally obtain Einstein's fundamental relation
$$E = mc^2$$
between energy $E$, mass $m$, and velocity of light $c$.
In the following sections of this chapter and in Section 74.26 we will study
the classical brackets of Poisson, Lagrange, and Lie and explain their meaning.
These brackets show that physics, in its essential parts, has a noncommutative
structure, which is especially obvious in quantum theory.
The difficulties, however, which occur in the mathematical formulation of
quantum field theory show that at present the true character of physical
phenomena is mathematically not completely understood. We are looking at
a puzzle which in parts has already been solved with an astounding
mathematical beauty, but the entire picture is still hidden.

58.21. Poissonian Mechanics and Heisenberg's Matrix Mechanics in Quantum Theory

Poissonian mechanics starts from the equation
$$\frac{d}{dt} F(P(t)) = \{H, F\}(P(t)) + F_t(P(t)), \tag{106}$$
with $P(t) = (q(t), p(t), t)$ and the so-called Poisson brackets
$$\{H, F\} = F_q H_p - F_p H_q; \tag{107}$$
recall our convention $ab = \langle a, b\rangle$. More precisely, we have
$$\{H, F\}(P) = F_q(P) H_p(P) - F_p(P) H_q(P)$$
with $P = (q, p, t)$. In component notation this becomes
$$\{H, F\} = F_{q^j} H_{p_j} - F_{p_j} H_{q^j},$$
where $j$ is summed from 1 to $m$. Thus, in particular, we obtain
$$\{p_i, q^j\} = \delta_i^j, \qquad \{p_i, p_j\} = \{q^i, q^j\} = 0 \tag{108}$$
for all $i, j = 1, \ldots, m$, and moreover, we find
$$\{H, F\} = -\{F, H\}.$$
Equation (106) means that $H$ is given and (106) holds for an arbitrary
smooth $F$. Precisely, we assume the following:
(H) The functions $F, H\colon G \times \mathbb{R} \to \mathbb{R}$ are $C^1$, where $G$ is an open, nonempty
set in $\mathbb{R}^{2m}$.
We now want to show that Poissonian mechanics and Hamiltonian mechanics
are equivalent. In fact, it immediately follows from the canonical
equations (97) that
$$\frac{d}{dt} F = F_q \dot q + F_p \dot p + F_t = F_q H_p - F_p H_q + F_t.$$
Therefore every solution $q = q(t)$, $p = p(t)$ of (97) is also a solution of (106).
Conversely, (97) follows from (106) for $F = q^k$ and $F = p_k$, since
$$\{H, q^k\} = H_{p_k} \quad \text{and} \quad \{H, p_k\} = -H_{q^k}.$$
The same observation is valid for manifolds, i.e., for the general situation
of Remark 58.24.
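
The bracket (107) is easy to evaluate numerically for one degree of freedom. The sketch below assumes the harmonic oscillator Hamilton function of Example 58.21 with hypothetical values of $m$ and $k$, and approximates the partial derivatives by central differences; the helper names are chosen here for illustration.

```python
def partial(f, args, index, h=1e-5):
    """Central-difference partial derivative of f with respect to args[index]."""
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


def poisson(H, F, q, p):
    """{H, F} = F_q H_p - F_p H_q for one degree of freedom, convention (107)."""
    F_q = partial(F, (q, p), 0)
    F_p = partial(F, (q, p), 1)
    H_q = partial(H, (q, p), 0)
    H_p = partial(H, (q, p), 1)
    return F_q * H_p - F_p * H_q


m, k = 1.0, 4.0   # hypothetical mass and spring constant


def H(q, p):
    return p * p / (2.0 * m) + k * q * q / 2.0


def Q(q, p):
    return q      # observable F = q


def P(q, p):
    return p      # observable F = p


q0, p0 = 0.3, -1.2
print("{p, q} =", poisson(P, Q, q0, p0))                       # relation (108)
print("{H, q} =", poisson(H, Q, q0, p0), " p/m  =", p0 / m)    # q' = H_p
print("{H, p} =", poisson(H, P, q0, p0), " -k q =", -k * q0)   # p' = -H_q
print("{H, H} =", poisson(H, H, q0, p0))                       # H is conserved
```

With $F = q$ and $F = p$ the bracket reproduces the right-hand sides of the canonical equations, and $\{H, H\} = 0$ expresses that a time-independent $H$ is a conserved quantity.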
The advantages of Poissonian mechanics are the following:
(i) Observables. If the functions $F = F(q, p, t)$ are called observables, then
(106) is a uniform formula for the evolution in time of all observables.
(ii) Conserved quantities. If
$$\{H, F\} + F_t \equiv 0$$
holds, then $F$ is a conserved quantity, because for every solution $q = q(t)$,
$p = p(t)$ of the canonical equations, (106) holds and thereby
$F(P(t)) = \text{const}$.
(iii) Generalization. One can use (106) for the formulation of the evolution in
time of more general physical systems, by replacing the classical Poisson
brackets (107) with other expressions. As we now want to show, one
thereby obtains a very elegant approach to the matrix mechanics developed
by Heisenberg (1925), which marked the beginning of quantum
mechanics. The problem which was facing the physicists then was the
following. Required was a physical theory which would lead to a
quantization of classical mechanics, i.e., this theory should contain many
features of classical mechanics but at the same time take quantum effects
into account, e.g., the quantization of energy of the harmonic oscillator,
which was postulated by Planck in 1900 in order to obtain the correct
radiation law.

EXAMPLE 58.25 (Matrix Mechanics of Heisenberg (1925)). We begin with the
following formal observations. The key idea is to define the Poisson brackets
as
$$\{H, F\} = \frac{i}{\hbar}(HF - FH),$$
where $h = 6.625 \cdot 10^{-34}$ J s is Planck's quantum of action and $\hbar = h/2\pi$. From
(106) it follows that the equations of motion for the observables $F = F(q, p, t)$
are given by
$$\frac{d}{dt} F = \{H, F\} + F_t.$$
In particular, the equations of motion are
$$\dot q^j = \{H, q^j\}, \qquad \dot p_j = \{H, p_j\} \tag{109}$$
for $j = 1, \ldots, m$. Analogously to (108), we assume
$$\{p_i, q^j\} = \delta_i^j I, \qquad \{p_i, p_j\} = \{q^i, q^j\} = 0 \tag{109a}$$
for $i, j = 1, \ldots, m$. These are the fundamental commutation relations of quantum
mechanics. What is the meaning of $q^j$ and $p_j$? Because of (109a), these
quantities do not commute. Following Heisenberg, we require that $q^j$, $p_j$ and $F(q, p, t)$
are self-adjoint infinite-dimensional matrices. Self-adjointness is assumed in
order to assure that the observables $F$ have real eigenvalues. Because of (109a)
we cannot use finite-dimensional matrices, since in that case we have
$$\operatorname{trace}(p_i q^j - q^j p_i) = 0 \quad \text{and} \quad \operatorname{trace}(\delta_i^j I) \neq 0 \ \text{for } i = j,$$
where trace denotes the sum of the diagonal elements.
In an exact functional analytic theory, which was formulated by John von
Neumann (1932), one requires that $q^j$, $p_j$ and $F(q, p, t)$ are self-adjoint operators
on a complex H-space. The theory is somewhat complicated, because the
operators are unbounded. Thus in computations one carefully has to watch
the domains of the operators.
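
The trace argument against finite-dimensional matrices in Example 58.25 can be illustrated with exact integer arithmetic. The two $3 \times 3$ matrices below are arbitrary hypothetical choices; the commutator of any pair of square matrices of equal size has trace zero, whereas the unit matrix has trace equal to the dimension.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(A):
    # Sum of the diagonal elements.
    return sum(A[i][i] for i in range(len(A)))


def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


# Arbitrary integer matrices standing in for p and q.
P = [[1, 2, 0], [0, -1, 3], [4, 0, 2]]
Q = [[0, 1, 1], [2, 0, -1], [1, 5, 0]]
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

print("trace(PQ - QP) =", trace(commutator(P, Q)))   # 0
print("trace(I)       =", trace(identity))           # 3
```

Hence $pq - qp$ can never be a nonzero multiple of the unit matrix in finite dimensions, which is why infinite-dimensional matrices or operators are needed.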

We consider an important application.

STANDARD EXAMPLE 58.26 (Quantum Mechanical Harmonic Oscillator). Our
goal is the formula
$$E = \hbar\omega(n + \tfrac12), \qquad n = 0, 1, \ldots \tag{110}$$
for the possible energy values of a quantum mechanical harmonic oscillator
with angular frequency $\omega$. In order to clearly present the simple basic idea,
we begin with a formal observation. The Hamilton function of the harmonic
oscillator is
$$H = \frac{p^2}{2m} + \frac{m\omega^2 q^2}{2}$$
according to Example 58.21. The previous example yields the equations of
motion
$$\dot p = \frac{i}{\hbar}(Hp - pH), \qquad \dot q = \frac{i}{\hbar}(Hq - qH). \tag{111}$$
Moreover, we assume that the commutation relation
$$\frac{i}{\hbar}(pq - qp) = I \tag{112}$$
is valid for the solution $q = q(t)$, $p = p(t)$ of (111), and
$$p = p^*, \qquad q = q^*, \qquad H = H^*.$$
From (111) and (112) follows
$$\dot p = -m\omega^2 q, \qquad p = m\dot q, \tag{113}$$
since
$$\dot q = \frac{i}{2m\hbar}(p^2 q - q p^2) = p/m.$$
A solution of (113), i.e., of $\ddot q + \omega^2 q = 0$ is
$$q(t) = \gamma(a e^{-i\omega t} + a^* e^{i\omega t})/\sqrt{2},$$
$$p(t) = m\dot q = i m \omega \gamma (a^* e^{i\omega t} - a e^{-i\omega t})/\sqrt{2}$$
with $\gamma = (\hbar/m\omega)^{1/2}$, where $a^*$ denotes the adjoint matrix to $a$. Then $q = q^*$,
$p = p^*$ and $H = H^*$ is automatically satisfied and (112) yields
$$aa^* - a^*a = I. \tag{114}$$
Moreover, we have
$$a = \frac{1}{\sqrt{2}}\Bigl(\frac{q(0)}{\gamma} + \frac{i\gamma p(0)}{\hbar}\Bigr), \qquad
a^* = \frac{1}{\sqrt{2}}\Bigl(\frac{q(0)}{\gamma} - \frac{i\gamma p(0)}{\hbar}\Bigr). \tag{115}$$
Letting
$$N = a^*a$$
we obtain from (114) that
$$H(q(t), p(t)) = \hbar\omega(N + \tfrac12).$$
We now show
$$N\varphi_n = n\varphi_n, \qquad n = 0, 1, \ldots \tag{116}$$
with
$$\varphi_n = (a^*)^n \varphi_0.$$
Therefore $H$ has the eigenvalues (110), which are interpreted as the possible
energy states of the oscillator, i.e.,
$$H\varphi_n = \hbar\omega(n + \tfrac12)\varphi_n, \qquad n = 0, 1, \ldots. \tag{117}$$
We prove (116) by induction. For $n = 0$ we postulate the existence of $\varphi_0$
with $a\varphi_0 = 0$. This implies
$$N\varphi_0 = a^*a\varphi_0 = 0.$$
From (114) we obtain
$$N\varphi_{n+1} = a^*aa^*\varphi_n = a^*\varphi_n + a^*N\varphi_n = \varphi_{n+1} + a^*N\varphi_n.$$
Hence (116) follows by induction.

In order to rigorously justify the formal computations, we choose the
complex H-space $X = L_2^{\mathbb{C}}(\mathbb{R})$ and define the symmetric operators
$$q_0, p_0\colon C_0^\infty(\mathbb{R}) \subseteq X \to X$$
through
$$(q_0\varphi)(x) = x\varphi(x), \qquad p_0\varphi = \frac{\hbar}{i}\frac{d\varphi}{dx}.$$
Then we choose $q(0)$ and $p(0)$ as the uniquely determined self-adjoint
extensions of $q_0$ and $p_0$, respectively. From (115) one finds then $a$ and $a^*$ and thereby
all other quantities. Here $a^*$ denotes the adjoint operator to $a$. Moreover, we
choose
$$\varphi_0(x) = c_0 e^{-x^2/2\gamma^2},$$
where the constant $c_0$ is obtained from the normalization condition
$(\varphi_0 | \varphi_0) = 1$. This actually gives $a\varphi_0 = 0$.
This realization of $p$ and $q$ through differential operators is not accidental.
It is closely related to the wave mechanical treatment of the harmonic
oscillator which we will consider in Section 59.18. In Part V we will show that
Heisenberg's matrix mechanics and Schrödinger's wave mechanics are only
two different realizations of the same abstract Hilbert space theory.
In quantum field theory, which will also be discussed in Part V, the quantity
$\varphi_n$ is interpreted as a state with $n$ particles, and $N$ is called the particle number
operator. The ground state $\varphi_0$ with $N\varphi_0 = 0$ is called the vacuum. Moreover,
$a^*$ and $a$ are called creation operator and annihilation operator, respectively.
This is motivated by the fact that the $n$-particle state $\varphi_n$ is obtained from the
vacuum $\varphi_0$ by an $n$-fold application of $a^*$, i.e., $\varphi_n = (a^*)^n\varphi_0$. Furthermore, we
have
$$a\varphi_{n+1} = aa^*\varphi_n = (I + N)\varphi_n = (1 + n)\varphi_n,$$
i.e., an application of $a$ to an $(n + 1)$-particle state yields an $n$-particle state.
This example shows that already purely algebraic operations imply important
quantum theoretical results. It is the starting point for applications of the
theory of operator algebras in modern quantum theory. In this direction we
recommend Bratteli and Robinson (1979, M) and Part V.
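
The algebra of $a$, $a^*$ and $N = a^*a$ can be explored with truncated matrices. The sketch below makes two assumptions that are not in the text: it works in the orthonormal basis of normalized states $\varphi_n/\sqrt{n!}$, in which $a$ has the entries $\sqrt{n+1}$ above the diagonal, and it cuts the infinite matrices off at a hypothetical dimension `d`. Energies are given in units of $\hbar\omega$.

```python
import math


def annihilation(d):
    """Truncated d x d matrix of a in the normalized number basis (assumption)."""
    a = [[0.0] * d for _ in range(d)]
    for n in range(d - 1):
        a[n][n + 1] = math.sqrt(n + 1)
    return a


def transpose(A):
    # For real matrices the adjoint is the transpose.
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    d = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]


d = 6                                   # hypothetical truncation dimension
a = annihilation(d)
a_star = transpose(a)
N = matmul(a_star, a)                   # particle number operator
aa_star = matmul(a, a_star)
C = [[aa_star[i][j] - N[i][j] for j in range(d)] for i in range(d)]

number_eigenvalues = [N[n][n] for n in range(d)]
energies = [N[n][n] + 0.5 for n in range(d)]      # H / (hbar * omega)
commutator_diagonal = [C[n][n] for n in range(d)]

print("eigenvalues of N:     ", number_eigenvalues)
print("energies / (hbar w):  ", energies)
print("diagonal of aa* - a*a:", commutator_diagonal)
```

The matrix $N$ is diagonal with entries $0, 1, \ldots, d - 1$, in agreement with (116). Relation (114) holds in every diagonal entry except the last one, where the truncation forces the value $-(d - 1)$ so that the trace of the commutator vanishes; this is the finite-dimensional obstruction mentioned in Example 58.25.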

58.22. Propagation of Action


If $q = q(t)$ is a solution of the Lagrangian equation
$$\frac{d}{dt} L_{\dot q} - L_q = 0,$$
then the action $\Delta S$ which during the time interval $[t_1, t_2]$ is transported by
this solution is defined as
$$\Delta S = \int_{t_1}^{t_2} L(q(t), \dot q(t), t)\,dt. \tag{118}$$
Since $L = T - U$ we obtain that $L$ has the dimension of an energy and the
action $\Delta S$ therefore has the dimension energy × time, hence J s (joule second).
This definition makes the expression "principle of the stationary action" for
(92) plausible.
If $q = q(t)$, $p = p(t)$ is a solution of the canonical equations
$$\dot q = H_p, \qquad \dot p = -H_q,$$
then the action $\Delta S$ which during the time interval $[t_1, t_2]$ is transported by
this solution is defined as
$$\Delta S = \int_{t_1}^{t_2} \bigl(p(t)\dot q(t) - H(q(t), p(t), t)\bigr)\,dt. \tag{119}$$
The Legendre transformation yields $L = p\dot q - H$. Therefore (119) and (118)
are consistent.
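
The consistency of (118) and (119) can be observed for the harmonic oscillator of Example 58.21. The sketch below assumes hypothetical values of $m$ and $k$, the particular solution $q(t) = \sin\omega t$, and Simpson's rule for the integrals (a numerical choice made here). For this solution $L = (k/2)\cos 2\omega t$, so the action over $[0, t_2]$ equals $(k/4\omega)\sin 2\omega t_2$.

```python
import math

m, k = 1.0, 4.0                      # hypothetical mass and spring constant
omega = math.sqrt(k / m)


def q(t):
    return math.sin(omega * t)       # solution with C = 1, alpha = 0


def q_dot(t):
    return omega * math.cos(omega * t)


def p(t):
    return m * q_dot(t)              # momentum p = L_{q'}


def L(t):
    return 0.5 * m * q_dot(t) ** 2 - 0.5 * k * q(t) ** 2


def H(t):
    return p(t) ** 2 / (2.0 * m) + 0.5 * k * q(t) ** 2


def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


t1, t2 = 0.0, 1.5
action_118 = simpson(L, t1, t2)
action_119 = simpson(lambda t: p(t) * q_dot(t) - H(t), t1, t2)
action_exact = k / (4.0 * omega) * math.sin(2.0 * omega * t2)

print("action from (118):", action_118)
print("action from (119):", action_119)
print("closed form:      ", action_exact)
```

Both integrands agree pointwise because $L = p\dot q - H$, so the two definitions give the same action.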

58.23. Hamilton-Jacobi Equation


Besides the canonical equations
$$\dot q = H_p, \qquad \dot p = -H_q \tag{120}$$
with $H = H(q, p, t)$ and $q, p \in \mathbb{R}^m$, $t \in \mathbb{R}$ we consider the so-called Hamilton-
Jacobi equation
$$S_t + H(q, S_q, t) = 0 \tag{121}$$
for $S = S(q, t)$. We shall show that (121) is closely related to the concept of
action. The simultaneous study of (120) and (121) has the following advantages:
(i) If an $m$-parametric solution of (121) is known, then one can obtain a
$2m$-parametric solution of (120) by using a canonical transformation, i.e.,
one obtains the general solution of the canonical equations. Thereby only
differentiation and elimination are used.
(ii) If, conversely, an $m$-parametric solution of (120) is known and, in addition,
Lagrange's bracket conditions are satisfied, then one obtains a solution
$S$ of (121) by evaluating a line integral. Thereby $S$ is the action which
corresponds to the solutions of (120).
(iii) The solution of the initial-value problem for (121) is a special case of (ii).
These points will be discussed in the following three sections. In the context
of the general theory of first-order partial differential equations, (120) is the
characteristic system of (121). In another form, equation (121) has already been
discussed in Section 37.4 of Part III. There, we gave an intuitive interpretation
of $S$ in connection with geometrical optics. Here we want to emphasize the
role of the Lagrange brackets. In Part V we will show that this important
concept is naturally related to symplectic geometry and leads to Lagrangian
manifolds in the context of this geometry.

58.24. Canonical Transformations and the Solution of the Canonical Equations via the Hamilton-Jacobi Equation

An important method for solving the canonical equations (120) is the use of
a variables transformation
$$a = a(q, p, t), \qquad b = b(q, p, t) \tag{122}$$
in order to obtain the new canonical equations
$$\dot a = \bar H_b, \qquad \dot b = -\bar H_a \tag{123}$$
with a modified Hamilton function $\bar H = \bar H(a, b, t)$. Such transformations (122)
are called canonical transformations. One tries to find the simplest form of $\bar H$.
Then a solution of the original canonical equations (120) is obtained by finding
a solution of (123) and using a back transformation. The simplest form of (123)
exists for $\bar H = 0$. Then
$$a = \text{const} \quad \text{and} \quad b = \text{const}$$
is a solution of (123). In fact, following Jacobi, such canonical transformations
can be constructed by setting
$$S_a(q, t, a) = b, \qquad S_q(q, t, a) = p. \tag{124}$$
Thereby $S = S(q, t, a)$ is a solution of the Hamilton-Jacobi equation (121),
which also depends on the $m$ parameters $a = (a^1, \ldots, a^m)$. Our assumptions
are:
(H1) In a neighborhood of $(q_0, p_0, t_0)$ we are given a real $C^1$-function
$H = H(q, p, t)$ with $q, p \in \mathbb{R}^m$ and $t \in \mathbb{R}$.
(H2) The real function $S = S(q, t, a)$ with $a \in \mathbb{R}^m$ is a complete integral of the
Hamilton-Jacobi equation (121), i.e., in a neighborhood of $(q_0, t_0, a_0)$
the function $S$ is $C^2$ and there a solution of (121) which satisfies the
additional property
$$\det S_{aq}(q_0, t_0, a_0) \neq 0. \tag{125}$$
We set $b_0 = S_a(q_0, t_0, a_0)$ and $p_0 = S_q(q_0, t_0, a_0)$.
Because of (125) we can solve the first equation in (124) locally for $q$ by using
the implicit function theorem (Theorem 4.B).

Theorem 58.F (Jacobi). Assume (H1) and (H2). By solving equation (124) one
obtains, for every fixed $(a, b) \in \mathbb{R}^{2m}$ in a neighborhood of $(a_0, b_0)$, a solution
$$q = q(t), \qquad p = p(t)$$
of the canonical equations (120) in a neighborhood of $t_0$.

PROOF. From (124) we obtain $q = q(t, a, b)$ and $p = p(t, a, b)$. We insert these
$C^1$-functions into (124). Furthermore, we insert $S = S(q, t, a)$ into (121).
Differentiation of (124) with respect to $t$ yields
$$S_{aq}\dot q + S_{at} = 0, \qquad \dot p = S_{qq}\dot q + S_{qt}.$$
Differentiation of (121) with respect to $a$ and $q$ yields
$$S_{ta} + H_p S_{qa} = 0, \qquad S_{tq} + H_q + H_p S_{qq} = 0.$$
Because of (125), the first equations give $\dot q = H_p$, and then the second
equations give $\dot p = -H_q$. □
The importance of this method is that in a number of significant cases one
can easily obtain a complete integral of the Hamilton-Jacobi equation by
using the separation ansatz
$$S(q, t) = \sum_{i=1}^{m} S_i(q_i) + S_{m+1}(t).$$

In the case of the harmonic oscillator we already showed this in Example 37.8.
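
As a reminder of how Jacobi's method works in practice, the following is a minimal worked sketch for the harmonic oscillator. The normalization (unit mass and unit frequency, so that $H = \tfrac12(p^2 + q^2)$) is an assumption made here for brevity and need not agree with the normalization used in Example 37.8.

```latex
% Assumption (not taken from the text): m = 1, unit mass, unit frequency.
\[
  H(q,p) = \tfrac12\,(p^2 + q^2), \qquad
  S(q,t,a) = -a\,t + \int_0^q \sqrt{2a - s^2}\,ds ,
  \qquad a > 0,\ |q| < \sqrt{2a}.
\]
% S is a complete integral of S_t + H(q, S_q) = 0:
\[
  S_t + \tfrac12 S_q^2 + \tfrac12 q^2
  = -a + \tfrac12\,(2a - q^2) + \tfrac12 q^2 = 0,
  \qquad
  S_{aq} = \frac{1}{\sqrt{2a - q^2}} \neq 0 .
\]
% Jacobi's equations S_a = b, S_q = p:
\[
  b = S_a = -t + \arcsin\frac{q}{\sqrt{2a}}
  \;\Longrightarrow\;
  q(t) = \sqrt{2a}\,\sin(t + b), \qquad
  p(t) = S_q = \sqrt{2a - q^2} = \sqrt{2a}\,\cos(t + b).
\]
% Check of the canonical equations:
\[
  \dot q = \sqrt{2a}\,\cos(t+b) = p = H_p, \qquad
  \dot p = -\sqrt{2a}\,\sin(t+b) = -q = -H_q .
\]
```

Here the parameter $a$ is the energy and $b$ is a phase; the formula for $p$ holds on the branch where $\cos(t + b) > 0$.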

Remark 58.27 (Global Version). Our formulation of Theorem 58.F was local.
But the same proof yields global solutions if equation (124) can be solved
globally for $q$, whereby $q = q(t, a, b)$ is $C^1$. We then have to assume that (125)
holds along $(q(t, a, b), t, a)$ and that the functions $H$ and $S$ are $C^1$ and $C^2$,
respectively.

58.25. Lagrange Brackets and the Solution of the Hamilton-Jacobi Equation via the Canonical Equations
Suppose we are given a family
$$q = q(t, a), \qquad p = p(t, a) \tag{126}$$
of solutions of the canonical equations
$$\dot q = H_p(q, p, t), \qquad \dot p = -H_q(q, p, t) \tag{127}$$
which depend on the parameter $a \in \mathbb{R}^m$. Here we set $\dot q = q_t$ and $\dot p = p_t$.
The question then is under which conditions this yields a solution of the
Hamilton-Jacobi equation
$$S_t + H(q, S_q, t) = 0. \tag{128}$$
The key assumption is:
(L) The Lagrange brackets satisfy
$$[a^i, a^j](t_0, a) = 0$$
for all $i, j = 1, \dots, m$ and all $a \in U$.
Here we define
$$[a^i, a^j] = p_{a^i} q_{a^j} - p_{a^j} q_{a^i}.$$
Recall that $pq = p_k q^k$, where $k$ is summed from 1 to $m$. Because of (126) these
brackets depend on $(t, a)$. The idea is as follows. Consider the line integral
$$\mathcal{S}(t, a) = \int_{(t_0, a_0)}^{(t, a)} (p\dot q - H)\,dt + (p q_{a^i})\,da^i \tag{129}$$
with $q = q(t, a)$ and $p = p(t, a)$. We solve $q = q(t, a)$ for $a$ and set
$$S(q, t) = \mathcal{S}(t, a(q, t)). \tag{130}$$
Condition (L) guarantees that the line integral in (129) is path independent.
In the following assumptions, we mean by "local" more precisely "in a
neighborhood of the point $(t_0, a_0) \in \mathbb{R}^{m+1}$". We set
$$q_0 = q(t_0, a_0) \quad\text{and}\quad p_0 = p(t_0, a_0)$$
and make the following assumptions:
(H1) The function $H: U(q_0, p_0, t_0) \subseteq \mathbb{R}^{2m+1} \to \mathbb{R}$ is $C^k$, $k \geq 1$.
(H2) The functions (126) are local $C^k$-functions from $\mathbb{R}^{m+1}$ into $\mathbb{R}^m$ and satisfy,
locally, the canonical equations (127).
Furthermore, we have
$$\det q_a(t_0, a_0) \neq 0$$
and (L) holds in a neighborhood $U$ of $a_0$.
The condition on the determinant is needed in order to solve the
equation $q = q(t, a)$ locally for $a$.

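Theorem 58.G. Assume (H1) and (H2). Then (130) is a local $C^k$-solution of the
Hamilton-Jacobi equation.

PROOF.
(I) The Lagrange brackets have the important property that for solutions
$q = q(t, a)$, $p = p(t, a)$ of the canonical equations the relation
$$\frac{\partial}{\partial t}[a^i, a^j] = 0 \tag{131}$$
holds. In order to simplify the notation we set
$$\alpha = a^i, \qquad \beta = a^j.$$
Moreover, we write $p_\alpha = \partial p/\partial\alpha$, etc. Then (131) follows from
$$\frac{\partial}{\partial t}[\alpha, \beta] = \dot p_\alpha q_\beta + p_\alpha \dot q_\beta - \dot p_\beta q_\alpha - p_\beta \dot q_\alpha,$$
and
$$\dot p_\beta = -H_{qq} q_\beta - H_{qp} p_\beta, \qquad \dot q_\beta = H_{pq} q_\beta + H_{pp} p_\beta.$$
Note that the symmetry of the matrix of the second-order partial
derivatives $H_{p_i q_j}$ implies that
$$(H_{qp} p_\alpha) q_\beta = p_\alpha (H_{pq} q_\beta).$$
From (131) and (L) we therefore obtain locally
$$[a^i, a^j](t, a) = 0. \tag{132}$$
(II) The integral (129) is path independent, since the integrability conditions
$$(p q_\alpha)_t = (p\dot q - H)_\alpha, \tag{133}$$
$$(p q_\alpha)_\beta = (p q_\beta)_\alpha \tag{134}$$
are satisfied. Actually, (134) follows immediately from $p_\beta q_\alpha = p_\alpha q_\beta$. This
is (132). On the other hand, (133) follows from (127), since
$$(p q_\alpha)_t = \dot p q_\alpha + p \dot q_\alpha,$$
$$(p\dot q - H)_\alpha = p_\alpha \dot q + p \dot q_\alpha - H_q q_\alpha - H_p p_\alpha.$$
(III) We show that $S$ in (130) is a solution of the Hamilton-Jacobi equation.
Write $\mathcal{S}(t, a)$ for the line integral (129), so that (130) reads
$S(q, t) = \mathcal{S}(t, a(q, t))$. From (129) follows
$$\mathcal{S}_t = p\dot q - H, \qquad \mathcal{S}_{a^i} = p q_{a^i},$$
and hence $\mathcal{S}_a h = p(q_a h)$. Formula (130) implies
$$S_q = \mathcal{S}_a a_q = p\, q_a a_q, \qquad S_t = \mathcal{S}_t + \mathcal{S}_a a_t = p\dot q - H + p\, q_a a_t.$$
Differentiation of the identity $q = q(t, a(q, t))$ with respect to $t$ and $q$ gives
$$0 = \dot q + q_a a_t, \qquad I = q_a a_q.$$
This implies $S_q = p$ and $S_t = -H$. □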

The physical interpretation of $S$ is obtained by inserting $a = a_0$ into (129).
We find
$$S(q(t, a_0), t) = \int_{t_0}^{t} (p\dot q - H)\,dt$$
along the solutions $q = q(t, a_0)$ and $p = p(t, a_0)$ of the canonical equations, i.e.,
$S$ corresponds to the action which propagates along the solution of the
canonical equations.

Remark 58.28 (Global Version). Our proof also yields a global version of
Theorem 58.G. In order to obtain the path independence of the integral in
(129), condition (L) must be valid in a simply connected region U. Moreover,
we need the fact that equation q = q(t, a) can be solved globally for a and
yields a $C^k$-map $a = a(t, q)$.

Remark 58.29 (Lagrangian Manifolds). In Part V we will give an invariant
definition for Lagrangian manifolds in connection with symplectic geometry.
There we will also show that Lagrange's bracket condition is equivalent to
the fact that for $t = t_0$ the family (126) yields a Lagrangian manifold in the
$(q, p)$-phase space. The key to this is the following formal observation.
Consider the equation
$$dp_k \wedge dq^k = 0, \tag{135}$$
where $k$ is summed from 1 to $m$. The symbol "$\wedge$" is used in the same way as
a product sign, where $\alpha \wedge \beta = -\beta \wedge \alpha$. From (135) and $q = q(t, a)$, $p = p(t, a)$
it follows formally that for fixed $t$ the relation
$$0 = \frac{\partial p_k}{\partial a^i}\frac{\partial q^k}{\partial a^j}\,da^i \wedge da^j
= \tfrac12\,(p_{a^i} q_{a^j} - p_{a^j} q_{a^i})\,da^i \wedge da^j
= \tfrac12\,[a^i, a^j]\,da^i \wedge da^j$$
holds, and hence
$$[a^i, a^j] = 0 \quad\text{for all } i, j.$$

The differential form on the left-hand side of (135) is the reason for the fact
that the canonical formalism can be formulated in the context of symplectic
geometry.

58.26. Initial-Value Problem for the Hamilton-Jacobi Equation

As an application of Theorem 58.G we consider the initial-value problem for
the Hamilton-Jacobi equation
$$S_t + H(q, S_q, t) = 0, \qquad S(q, t_0) = S_0(q), \tag{136}$$
where the function $S_0$ is given. We have to find the function $S$. By passing to
$S - S_0$ we can always ensure that $S_0 = 0$ for a changed function $H$. Thus we
can restrict ourselves to the case $S_0 = 0$. For the solution of (136) we use the
canonical equations
$$\dot q = H_p, \quad \dot p = -H_q, \qquad q(t_0) = a, \quad p(t_0) = 0. \tag{137}$$
Proposition 58.30. Let the function $H: U(q_0, 0, t_0) \subseteq \mathbb{R}^{2m+1} \to \mathbb{R}$ be $C^3$. Then the
initial-value problem (136) with $S_0 = 0$ has a unique $C^2$-solution in a
neighborhood of $(q_0, t_0)$.

PROOF.
(I) Existence. Using Theorem 4.D we solve (137) and obtain a $C^2$-family
$$q = q(t, a), \qquad p = p(t, a).$$
Because of
$$p(t_0, a) = 0,$$
the Lagrange brackets are identically zero for $t = t_0$. Furthermore, from
$q(t_0, a) = a$ it follows that
$$q_a(t_0, a) = I.$$
Theorem 58.G yields the existence of a solution of (136), and (130) implies
that $S(q, t_0) = 0$.
(II) Uniqueness. Let $S = S(q, t)$ be a $C^2$-solution of (136). We set
$$p(t) = S_q(q(t), t)$$
and determine $q = q(t)$ from
$$\dot q(t) = H_p(q(t), p(t), t), \qquad q(t_0) = a.$$
Differentiation with respect to $t$ gives
$$\dot p = S_{qq}\dot q + S_{qt}.$$
Differentiation with respect to $q$ in (136) with $S = S(q, t)$ yields the
equation
$$S_{tq} + H_q + H_p S_{qq} = 0.$$
Consequently,
$$\dot p(t) = -H_q(q(t), p(t), t), \qquad p(t_0) = 0,$$
i.e., $q(\cdot)$ and $p(\cdot)$ are the unique local solutions of (137).
We set $\sigma(t) = S(q(t), t)$. This implies
$$\dot\sigma = S_q \dot q + S_t$$
and hence
$$\dot\sigma(t) = p(t)\dot q(t) - H(q(t), p(t), t), \qquad \sigma(t_0) = 0.$$
Therefore, $\sigma(t)$ is locally unique.
Since $\det q_a(t, a) \neq 0$ in a neighborhood of $(t_0, a_0)$ it follows that there
exists a unique solution $q(\cdot)$ of (137) through each point $a \in U(q_0)$ for fixed
$t \in U(t_0)$. Therefore, $S$ is locally uniquely determined. □

58.27. Dimension Analysis


Every physical quantity has a dimension. A survey of the international
system of units and the numerical values of important universal constants can
be found in the Appendix.
The fact that physical quantities have a dimension greatly restricts the
number of possible physical laws. We explain this with an example.

EXAMPLE 58.31. Consider a pendulum of length $l$ and mass $M$. We are looking
for a formula for the period of oscillation $T$ of the pendulum. We expect that
$T$ depends on $l$, $M$ and the gravitational acceleration $g$. Thus we begin with
the ansatz
$$T = \xi\, l^{\alpha} M^{\beta} g^{\gamma},$$
where $\xi$ is a dimensionless constant. Passing to dimensions we obtain
$$\mathrm{s} = \mathrm{m}^{\alpha}\,\mathrm{kg}^{\beta}\,(\mathrm{m}\,\mathrm{s}^{-2})^{\gamma}.$$
This implies $\beta = 0$, $\gamma = -\tfrac12$, and $\alpha = -\gamma = \tfrac12$, i.e.,
$$T = \xi \sqrt{\frac{l}{g}}. \tag{138}$$
The constant $\xi$ has to be determined from experiments. In Problem 58.7 we
show that in first-order approximation, for small pendulum motions, (138)
actually follows from the Newtonian equations of motion, whereby $\xi = 2\pi$.
In this example we obtained a maximum of information from a minimum
of assumptions by using dimension analysis.
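
The bookkeeping in Example 58.31 amounts to a small linear system for the exponents, one equation per base unit. The following sketch solves that system exactly with rational arithmetic. The helper name `solve_exponents` and the ordering of the units (metre, kilogram, second) are choices made here for illustration and are not part of the text.

```python
from fractions import Fraction


def solve_exponents(rows, rhs):
    """Solve a small square linear system exactly by Gauss-Jordan elimination."""
    n = len(rows)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate the column from all other rows.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Unknowns (alpha, beta, gamma) in the ansatz T = xi * l**alpha * M**beta * g**gamma.
# Each row balances one base unit on both sides of the ansatz.
rows = [
    [1, 0, 1],   # metre:    alpha + gamma = 0
    [0, 1, 0],   # kilogram: beta = 0
    [0, 0, -2],  # second:   -2 * gamma = 1
]
rhs = [0, 0, 1]

alpha, beta, gamma = solve_exponents(rows, rhs)
print(alpha, beta, gamma)
```

The same routine applies to any ansatz with as many unknown exponents as independent base units; if there are more quantities than units, the system is underdetermined and dimensionless combinations remain free.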
We want to mention here an important point. Often in physics one uses
approximation calculations, since exact computations are too complicated or
cannot be performed at all. For this one needs to know exactly which terms
in an equation are "small." To this end one uses variables for which the
equations become dimensionless. Only such equations allow a precise
comparison of the relative magnitude of the particular terms. We will use this
method, for example, in connection with the motion of the perihelion of
Mercury in Section 76.9.

PROBLEMS
58.1. Proof of Lemma 58.18.
Solution: Obviously, the vectors
$$h_i = a + \omega \times x_i, \qquad i = 1, \dots, n$$
form a solution of (59). Conversely, every solution of (59) has this form. For
$n = 3$ this follows from the fact that the solution space is six-dimensional, since
the rank of the system of equations is three.
We use induction. Suppose the assertion of Lemma 58.18 is true for fixed
$n \geq 3$. The trick is then to set
$$y_i = x_i - x_1 \quad\text{and}\quad g_i = h_i - h_1 \quad\text{for } i = 2, \dots, n + 1.$$
Formula (59) implies
$$(y_i - y_j)(g_i - g_j) = 0, \qquad i, j = 2, \dots, n + 1,$$
so that
$$g_i = \bar a + \omega \times y_i, \qquad i = 2, \dots, n + 1$$
follows from the induction assumption. Moreover, (59) implies that $y_i g_i = 0$
for $i = 2, \dots, n + 1$, and hence
$$y_i \bar a = 0, \qquad i = 2, \dots, n + 1.$$
Consequently, $\bar a = 0$. Note that by hypothesis $\operatorname{span}\{y_2, \dots, y_{n+1}\}$ has
dimension three. If we now choose the vector $a$ so that
$$h_1 = a + \omega \times x_1,$$
we obtain $h_i = a + \omega \times x_i$ for $i = 1, \dots, n + 1$, i.e., the assertion of Lemma
58.18 is true for $n + 1$.
58.2. Gravitational potential of a mass point. Assume a point of mass $M$ is at $y$. It
exerts a gravitational force
$$K(x) = \frac{GMm\,(y - x)}{|y - x|^3}$$
on an arbitrary point of mass $m$ at $x$ with $x \neq y$. Compute the potential $U$ of $K$.
Solution: We must have $U'(x) = -K(x)$ for all $x \neq y$. This is satisfied for
$$U(x) = -\frac{GMm}{|y - x|}.$$
In order to verify this we use Cartesian coordinates $x = \sum_i \xi_i e_i$ and observe that
$$U'(x) = \sum_i \frac{\partial U(x)}{\partial \xi_i}\, e_i.$$

58.3. Gravitational potential of a body. A body in a bounded region $\Omega$ of $\mathbb{R}^3$ with
continuous density $\rho: \bar\Omega \to \mathbb{R}$ exerts a gravitational force
$$K(x) = \int_\Omega \frac{G m \rho(y)\,(y - x)}{|y - x|^3}\,dy$$
on a point $x$ of mass $m$. This expression follows from Problem 58.2 by summing
the forces whose origin is in the volume element $\Delta y$ with mass $M = \rho\,\Delta y$, and
then passing from the sum to the integral. Prove that the function
$$U(x) = -\int_\Omega \frac{G m \rho(y)}{|y - x|}\,dy$$
represents the corresponding potential in $\mathbb{R}^3$, i.e., $K(x) = -U'(x)$.
Hint: See Günter (1957, M), p. 78.
58.4. Gravitational potential of a spherical shell. Let $\Omega = \{y \in \mathbb{R}^3: 0 < r < |y| < R\}$.
For the gravitational potential $U$ of the previous problem with $\rho = \text{const}$
prove the following results:
(i) In the interior, i.e., for $|x| \leq r$, we have $U(x) = \text{const}$. Hence, because of
$K = 0$, there exists no gravitational force.
(ii) In the exterior, i.e., for $|x| \geq R$, the potential $U(x)$ behaves as if the entire
mass were located at the center $x = 0$. Hence, it follows that
$$U(x) = -\frac{G m \int_\Omega \rho(y)\,dy}{|x|}.$$
Hint: See, e.g., Mangoldt and Knopp (1957, M), Vol. 3, p. 429.
Statements (i) and (ii) remain true for rotationally symmetric, continuous
densities $\rho$. To see this, decompose $\Omega$ into small spherical shells of constant
density and use an approximation argument.
58.5. Gravitational force in a mine. If $R$ is the radius of the earth and one goes into a
mine at a distance $r$ from the center of the earth, then the gravitational effect
is the same as if the entire mass of the partial ball of radius $r$ were located
at the center of the earth. This same important phenomenon will be used in
Section 58.15 in connection with the expansion of the universe. More precisely,
prove the following. If $\Omega$ is a ball of radius $R$ with center $x = 0$ and the
continuous, radially symmetric density $\rho: \Omega \to \mathbb{R}$, then the gravitational
potential is equal to
$$U(x) = -\frac{G m \int_{|y| \leq r} \rho(y)\,dy}{|x|}$$
for $|x| \leq r \leq R$.
Solution: Decompose $\Omega$ into a ball of radius $r$ and a spherical shell. The
assertion then follows from Problem 58.4.
58.6. Doppler effect. A source $S$ moves on a straight line away from the observer $P$
with constant velocity $V$. At times $t = 0$ and $t = T$ the source $S$ emits signals
which travel with the velocity $c$. Then $P$ receives the two signals at a time
interval of $T + TV/c$. Now suppose that $S$ emits light continuously. What is
the relation between the wave lengths observed by $S$ and $P$?
Solution: $P$ receives the two signals at a time interval of $T + TV/c$. If $\lambda_S$ and
$\lambda_P$ are the wave lengths which $S$ and $P$ observe, respectively, then we obtain
$\lambda_S = Tc$ for a suitable choice of $T$, and
$$\lambda_P = Tc + TV = \lambda_S (1 + V/c).$$
Therefore $P$ observes a red shift. For the relative change in the wave length
we obtain
$$\frac{\lambda_P - \lambda_S}{\lambda_S} = \frac{V}{c} = Ht,$$
where $t$ is the running time of the signal and $H$ the so-called Hubble constant,
i.e., $H = V/R$ with $R = ct$. Here $R$ is the distance between $S$ and $P$ at the time
of emission of the signal.
58.7. Plane mathematical pendulum. As in Figure 58.18 we consider a pendulum of
length $l$ and mass $m$. Compute the motion of the pendulum and the period of
oscillation by using the Lagrangian formalism of Section 58.19. Furthermore,
compute the effective constraining forces.
Solution: The orbital motion $x = x(t)$ occurs on a circle, i.e.,
$$x(t) = l\,(e_1 \sin\varphi(t) - e_3 \cos\varphi(t)). \tag{139}$$
This motion is given by the angle $\varphi = \varphi(t)$. No side conditions appear in this
description. The kinetic energy is
$$E_{\text{kin}} = \tfrac12 m \dot x^2 = \tfrac12 m l^2 \dot\varphi^2.$$
The gravitational force $K = -mg e_3$ is applied to the pendulum. It has the
potential $U = mg\xi_3$, where $x = \xi_1 e_1 + \xi_2 e_2 + \xi_3 e_3$, since $K = -U_x$. The
Lagrange function is therefore
$$L = E_{\text{kin}} - U = \tfrac12 m l^2 \dot\varphi^2 + mgl \cos\varphi.$$
The Lagrangian equation
$$\frac{d}{dt} L_{\dot\varphi} - L_\varphi = 0$$
yields
$$m l^2 \ddot\varphi + mgl \sin\varphi = 0. \tag{140}$$
As an initial condition we choose
$$\varphi(t_0) = \varphi_0, \qquad \dot\varphi(t_0) = 0 \tag{141}$$
with $-\pi < \varphi_0 < \pi$, i.e., $\varphi_0$ corresponds to the turning point of the pendulum.

Figure 58.18

Since $L$ does not explicitly depend on time $t$, it follows from Section 58.19 that
$\dot\varphi L_{\dot\varphi} - L = E_{\text{kin}} + U$ is a conserved quantity. This is the energy conservation
law
$$\tfrac12 m l^2 \dot\varphi^2 - mgl \cos\varphi = \text{const} = E.$$
The fact that this expression is constant also follows directly by differentiation,
keeping (140) in mind. From (141) we obtain $E = -mgl \cos\varphi_0$, hence
$$\dot\varphi^2 = \frac{4g}{l}\left(\sin^2\frac{\varphi_0}{2} - \sin^2\frac{\varphi}{2}\right).$$
We set
$$\sin\frac{\varphi}{2} = k \sin\psi, \qquad k = \sin\frac{\varphi_0}{2}.$$
This way we obtain the elliptic integral
$$\sqrt{\frac{g}{l}}\; t = \int_0^{\psi} \frac{d\psi}{\sqrt{1 - k^2 \sin^2\psi}}.$$
This integral yields a periodic motion with $\varphi = 0$ for $t = 0$. The period of
oscillation $T$ satisfies $\varphi(T/4) = \varphi_0$, hence
$$\sqrt{\frac{g}{l}}\;\frac{T}{4} = \int_0^{\pi/2} \frac{d\psi}{\sqrt{1 - k^2 \sin^2\psi}}.$$
For small motions around the equilibrium position $\varphi = 0$, we have that $|\varphi_0|$
is small, i.e., $k^2$ is small. Expansion of the integrand yields
$$T = 2\pi\sqrt{\frac{l}{g}}\left(1 + \frac{k^2}{4} + O(k^4)\right), \qquad k \to 0.$$
If, for small motions around the equilibrium position, one replaces the
function $\sin\varphi = \varphi + O(\varphi^3)$ with $\varphi$, then one obtains from (140) the equation
of motion
$$\ddot\varphi + \frac{g}{l}\,\varphi = 0,$$
which has the solution
$$\varphi = \varphi_0 \sin\left(\sqrt{\frac{g}{l}}\; t + \alpha\right).$$
Here the period of oscillation satisfies
$$T = 2\pi\sqrt{\frac{l}{g}}.$$
In order to compute the effective constraining forces $Z$ which are acting on
the pendulum, we insert $\varphi = \varphi(t)$ into (139). This gives
$$m\ddot x = -mg e_3 + Z.$$
See Fig. 58.17 in Section 58.10.
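
The period formula of Problem 58.7 can be checked numerically. The sketch below evaluates the complete elliptic integral through the arithmetic-geometric mean (a standard identity, not used in the text) and compares the resulting period with the expansion $2\pi\sqrt{l/g}\,(1 + k^2/4)$. The function names and the sample values for length, gravitational acceleration and amplitude are illustrative assumptions.

```python
import math


def complete_elliptic_k(k):
    """Integral of 1/sqrt(1 - k^2 sin^2(psi)) over [0, pi/2].

    Uses the identity K(k) = pi / (2 * AGM(1, sqrt(1 - k^2))).
    """
    a, b = 1.0, math.sqrt(1.0 - k * k)
    for _ in range(30):  # the AGM iteration converges quadratically
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)


def pendulum_period(length, g, phi0):
    """Period for turning-point angle phi0, from sqrt(g/l) * T/4 = K(k)."""
    k = math.sin(phi0 / 2.0)
    return 4.0 * math.sqrt(length / g) * complete_elliptic_k(k)


def series_period(length, g, phi0):
    """Expansion T = 2*pi*sqrt(l/g) * (1 + k^2/4), neglecting O(k^4)."""
    k = math.sin(phi0 / 2.0)
    return 2.0 * math.pi * math.sqrt(length / g) * (1.0 + k * k / 4.0)


# Illustrative values: 1 m pendulum, g = 9.81 m/s^2, several amplitudes in radians.
for phi0 in (0.1, 0.5, 1.0):
    exact = pendulum_period(1.0, 9.81, phi0)
    approx = series_period(1.0, 9.81, phi0)
    print(f"phi0={phi0}: exact={exact:.6f} s, series={approx:.6f} s")
```

For small amplitudes the two values agree closely, and the difference grows with the amplitude, in line with the neglected $O(k^4)$ term.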

The following Problems 58.8-58.10 serve as a preparation for Problem 58.11 about
oscillating systems.

58.8. Complete system of eigenvectors for self-adjoint operators in finite-dimensional
H-spaces. Let $A: X \to X$ be a linear self-adjoint operator in an H-space $X$ over
$\mathbb{K} = \mathbb{R}, \mathbb{C}$ with $\dim X = N$, $1 \leq N < \infty$. Show that there exists a system
$\{x_1, \dots, x_N\}$ of eigenvectors of $A$ with
$$(x_i | x_j) = \delta_{ij} \quad\text{for all } i, j = 1, \dots, N.$$

Solution: This is a special case of Theorem 19.B. In order to obtain an
entirely elementary approach to oscillating systems we give a full proof.
(I) We begin with the eigenvalue equation
$$Ax = \lambda x, \qquad x \in X,\ x \neq 0,\ \lambda \in \mathbb{K}. \tag{142}$$
Because of
$$\lambda = \frac{(x|Ax)}{(x|x)} \quad\text{and}\quad (x|Ax) = (Ax|x) = \overline{(x|Ax)},$$
all eigenvalues of $A$ are real. The number $\lambda$ is an eigenvalue of $A$ if and
only if the inverse operator $(A - \lambda I)^{-1}$ does not exist, i.e.,
$$\det(A - \lambda I) = 0.$$
According to the fundamental theorem of algebra this polynomial
equation in $\lambda$ has a solution $\lambda = \lambda_1$. Hence it follows that an eigensolution
$$Ax_1 = \lambda_1 x_1, \qquad (x_1|x_1) = 1$$
exists.
(II) Letting
$$Y = \{y \in X: (y|x_1) = 0\},$$
we find that $\dim Y = N - 1$. Moreover, we obtain the key result
$$y \in Y \;\Rightarrow\; Ay \in Y. \tag{143}$$
In fact, if $y \in Y$, then
$$(Ay|x_1) = (y|Ax_1) = \lambda_1 (y|x_1) = 0.$$
Because of (143) we can now apply the same argument as in (I) to the
operator $A: Y \to Y$. This yields the existence of an eigensolution
$$Ax_2 = \lambda_2 x_2, \qquad (x_2|x_2) = 1, \qquad (x_2|x_1) = 0.$$
(III) Using $N$ analogous steps we find $N$ eigensolutions
$$Ax_i = \lambda_i x_i, \qquad i = 1, \dots, N.$$
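58.9. Normal form for symmetric matrices. Let $A^T$ in the following denote the
transposed matrix to $A$. Suppose we are given a real symmetric $(N \times N)$-matrix
$A = (a_{ij})$, i.e., $A = A^T$. Show that there exists a real orthogonal $(N \times N)$-matrix
$U$ with
$$U^T A U = \operatorname{diag}(\lambda_1, \dots, \lambda_N),$$
where the $\lambda_i$'s are the eigenvalues of $A$.
Solution: We choose $X = \mathbb{R}^N$ with $(x|y) = \sum_{i=1}^{N} \xi_i \eta_i$. Notice that $(x|y) = x^T y$
and hence
$$(Ax|y) = (Ax)^T y = x^T A^T y = x^T A y = (x|Ay),$$
i.e., the linear operator $A: X \to X$ is self-adjoint. According to Problem 58.8
there exists a system of eigenvectors of $A$ with $Ax_i = \lambda_i x_i$ and $x_i^T x_j = \delta_{ij}$ for
all $i, j = 1, \dots, N$. The matrix
$$U = (x_1, \dots, x_N)$$
with the columns $x_1, \dots, x_N$ satisfies $U^T U = I$ because of
$$U^T U = \begin{pmatrix} x_1^T \\ \vdots \\ x_N^T \end{pmatrix} (x_1, \dots, x_N) = (x_i^T x_j) = I,$$
i.e., $U$ is orthogonal. Finally, we get
$$U^T A U = \begin{pmatrix} x_1^T \\ \vdots \\ x_N^T \end{pmatrix} (\lambda_1 x_1, \dots, \lambda_N x_N) = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_N \end{pmatrix}.$$
Because of $\det U^T U = 1$ we find that
$$\det(U^T A U - \lambda I) = \det(U^T (A - \lambda I) U) = \det(A - \lambda I).$$
Hence the solutions of $\det(A - \lambda I) = 0$ are precisely all the $\lambda_i$'s.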
58.10. Diagonalization of two symmetric matrices. Suppose we are given two real
symmetric $(N \times N)$-matrices $A$ and $B$ with $x^T A x > 0$ for all $x \neq 0$. Prove that
there exists a nonsingular real $(N \times N)$-matrix $C$ with
$$C^T A C = I, \qquad C^T B C = \operatorname{diag}(\omega_1, \dots, \omega_N). \tag{144}$$
Moreover, show that the $\omega_i$'s are precisely the solutions of the generalized
characteristic equation
$$\det(B - \omega A) = 0. \tag{145}$$
The $\omega_i$'s are all positive if and only if $x^T B x > 0$ for all $x \neq 0$.
Solution: From Problem 58.9 there exists an orthogonal matrix $U$ with
$$U^T A U = \operatorname{diag}(\lambda_1, \dots, \lambda_N).$$
Since $x^T A x > 0$ for all $x \neq 0$, all $\lambda_i$ are positive. We set
$V = \operatorname{diag}(1/\sqrt{\lambda_1}, \dots, 1/\sqrt{\lambda_N})$, so that
$$V^T U^T A U V = I.$$
The matrix $B_1 = V^T U^T B U V$ is symmetric. Using Problem 58.9 once again we
obtain that there exists an orthogonal matrix $W$ with
$$W^T B_1 W = \operatorname{diag}(\omega_1, \dots, \omega_N).$$
Setting $C = UVW$, formula (144) follows.
The $\omega_i$'s are precisely the eigenvalues of $B_1$, i.e., the solutions of
$\det(B_1 - \omega I) = 0$. Observe then that $\det V = \det V^T$ and hence
$$\det(B_1 - \omega I) = \det(V^T U^T [B - \omega A] U V) = (\det V)^2 (\det U)^2 \det(B - \omega A).$$
This implies (145).
Notice, furthermore, that $x^T B x = y^T \operatorname{diag}(\omega_1, \dots, \omega_N)\, y$ for $x = Cy$.
58.11. General systems with small oscillations. We want to find a general model for a
mechanical system near an equilibrium state. Suppose that the system has $f$
degrees of freedom, i.e., we can describe the motion of the system by
$$q_i = q_i(t), \qquad i = 1, \dots, f,$$
where $q_1, \dots, q_f$ are suitable real coordinates. Suppose that the equilibrium
state corresponds to the origin $q_1 = \dots = q_f = 0$.
58.11a. Definition. An oscillating system with $f$ degrees of freedom is given by the
equations of motion
$$\sum_{j=1}^{f} a_{ij}\ddot q_j + \sum_{j=1}^{f} b_{ij} q_j = 0, \qquad i = 1, \dots, f. \tag{146}$$
Here the real matrices $A = (a_{ij})$ and $B = (b_{ij})$ are symmetric and the
corresponding quadratic forms are positive definite.
58.11b. Normal coordinates. Show that there exist new coordinates $r_1, \dots, r_f$ so that
(146) becomes
$$\ddot r_i + \omega_i r_i = 0, \qquad i = 1, \dots, f \tag{147}$$
with $\omega_i > 0$ for all $i$. Moreover, the frequencies $\omega_i$ are precisely the solutions
of the generalized characteristic equation
$$\det(B - \omega A) = 0.$$
Solution: We write (146) as the matrix equation
$$A\ddot q + Bq = 0.$$
Setting $q = Cr$ and multiplying by $C^T$ we obtain $C^T A C\ddot r + C^T B C r = 0$.
According to Problem 58.10 this is (147).
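
For two degrees of freedom the generalized characteristic equation $\det(B - \omega A) = 0$ is a quadratic in $\omega$ and can be solved directly. The sketch below does this for a hypothetical system of two unit masses coupled by three unit springs; the matrices and the function name are illustrative assumptions, not taken from the text. For an equation of the form $\ddot r + \omega r = 0$ with $\omega > 0$, the angular frequency of the oscillation is $\sqrt{\omega}$, which is printed as well.

```python
import math


def generalized_eigenvalues_2x2(A, B):
    """Roots omega of det(B - omega*A) = 0 for symmetric 2x2 matrices A and B."""
    (a11, a12), (_, a22) = A
    (b11, b12), (_, b22) = B
    # det(B - omega*A) = c2*omega^2 + c1*omega + c0
    c2 = a11 * a22 - a12 * a12
    c1 = -(b11 * a22 + b22 * a11 - 2.0 * b12 * a12)
    c0 = b11 * b22 - b12 * b12
    # For positive definite A the discriminant is nonnegative; guard against rounding.
    disc = max(c1 * c1 - 4.0 * c2 * c0, 0.0)
    root = math.sqrt(disc)
    return sorted([(-c1 - root) / (2.0 * c2), (-c1 + root) / (2.0 * c2)])


# Hypothetical example: two unit masses, each tied to a wall and to each other
# by springs of unit stiffness.
A = [[1.0, 0.0], [0.0, 1.0]]    # kinetic energy matrix
B = [[2.0, -1.0], [-1.0, 2.0]]  # potential energy matrix

omegas = generalized_eigenvalues_2x2(A, B)
print("omega:", omegas)
print("angular frequencies:", [math.sqrt(w) for w in omegas])
```

Both roots are positive here because $B$ is positive definite, as Problem 58.10 predicts.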
58.11c. Motivation. We would like to show that (146) holds under fairly general
hypotheses about the system. We start with a Lagrangian function $L = L(q, \dot q, t)$ and
assume that the system is homogeneous in time, i.e.,
$$L(q, \dot q, t + a) = L(q, \dot q, t) \quad\text{for all } a \in \mathbb{R}$$
and all arguments. This implies $L_t = 0$. From Taylor's theorem,
$$L(q, \dot q) = L(0, 0) + \sum_{i=1}^{f} (a_i \dot q_i + b_i q_i) + \tfrac12 \sum_{i,j=1}^{f} (a_{ij}\dot q_i \dot q_j - b_{ij} q_i q_j) + \cdots.$$
The Lagrangian equations
$$\frac{d}{dt} L_{\dot q_i} - L_{q_i} = 0, \qquad i = 1, \dots, f \tag{148}$$
coincide with (146) up to a constant term $b_i$. Since $q = 0$ should be a solution
of (148), we need
$$b_i = 0 \quad\text{for all } i.$$
Moreover, we set
$$a_i = 0 \quad\text{for all } i$$
because this does not affect (148). The term
$$T = \tfrac12 \sum a_{ij}\dot q_i \dot q_j = \tfrac12\, \dot q^T A \dot q$$
represents the kinetic energy. Thus, it is reasonable to postulate that $\dot q^T A \dot q > 0$
if $\dot q \neq 0$.
If $\omega_i \leq 0$, then (147) has unbounded solutions. Thus, we also require that
$\omega_i > 0$ for all $i$. This means that $q^T B q > 0$ if $q \neq 0$. By the way, the potential
energy of the system is
$$U = \tfrac12 \sum b_{ij} q_i q_j = \tfrac12\, q^T B q,$$
which has a strict minimum at $q = 0$. This implies the stability of the
equilibrium state $q = 0$. Notice that $L = T - U$.
58.11d. Physical insight. Our considerations show that, under fairly general
assumptions, an oscillating system with $f$ degrees of freedom can be reduced to $f$
independent harmonic oscillators in the sense of (147). This result plays an
important role in physics. In Section 58.15c, for example, we used this result
in the study of the early cosmos.
A detailed mathematical investigation of small oscillations may be found in
Gantmacher and Krein (1960, M).
58.12. Further problems. Numerous other problems may be found in Sommerfeld
(1954, M), Vol. 1 and Landau and Lifšic (1962, M), Vol. 1.

References to the Literature


Classical works: Kepler (1609, M), (1618, M), Galilei (1638, M), Newton (1687, M).
Collected works: Kepler (1939), Galilei (1890/1909), Newton (1779), (1967).
Essay about Galilei and Kepler: Blaschke (1957).
Physical mechanics: Sommerfeld (1954, M), Vol. 1, Landau and Lifšic (1962, M),
Vol. 1, Landau, Achieser and Lifšic (1970, M).
Mathematical mechanics: Frank and von Mises (1962, M), Arnold (1978, M),
Abraham and Marsden (1978, M).
Celestial mechanics: Wintner (1947, M), Sternberg (1969, L), Vol. 1-2, Siegel and
Moser (1971, M), Stumpff (1973, M), Vol. 1-3, Hagihara (1976, M), Vol. 1-5, Abraham
and Marsden (1978, M), Arnold (1987, S), Vol. III.
Similarity theory: Sedov (1959, M), Massey (1971, M).
Physics of gravitating systems: Fridman and Polyachenko (1984, M).
Small oscillations: Gantmacher and Krein (1960, M).
History of mechanics: Szabó (1987, M).
CHAPTER 59

Dualism Between Wave and Particle, Preview of Quantum Theory, and Elementary Particles

Except for atoms and emptiness nothing exists.
Demokrit (460 B.C.-371 B.C.)

There exists a limiting case of quantum theory which corresponds to classical
particle physics, and there exists another which corresponds to classical wave
mechanics. The alternatives which the limiting cases represent are not
compatible. Bohr was therefore right when he called the duality between the two
"pictures" (wave and particle) an example of complementarity.
Carl Friedrich von Weizsäcker (1973)

The last significant turn in quantum theory occurred after de Broglie's discovery
of matter waves in 1924, Heisenberg's formulation of quantum mechanics in
1925, and Schrödinger's general wave mechanical equation in 1926.
Wolfgang Pauli (1958)

Quantum theory so perfectly illustrates the fact that one might have understood
a certain subject with complete clarity, yet at the same time knows that one can
speak of it only allegorically and in pictures.
Werner Heisenberg (1901-1976)

In the previous chapter we considered the concept of particles. We begin this
chapter by introducing a number of basic concepts which are essential for
an understanding of wave phenomena in all parts of physics. We then will
discuss the relation between waves and particles, which has played an
important role in the development of modern physics. In 1925, Werner Heisenberg
formulated his matrix mechanics. This is a quantum mechanics which is
derived from classical mechanics by introducing particle quantization. This
theory has already been discussed in Section 58.21. Independently, in 1926,
Erwin Schrödinger formulated an equivalent wave mechanics which is derived
from wave quantization. The main objective of this chapter is to present a
survey. This, together with the previous chapter, might help the reader to
better understand many of the problems discussed later on. We thereby follow
the fascinating line of development which leads to the central problem of
modern physics: the creation of a unified theory for all four interactions in
nature. Quantum theory will be discussed in greater detail in Part V. Only
a minimal program is presented here. Some interesting problems that we
consider are:
(i) Spectrum of the hydrogen atom.
(ii) Quantum mechanical treatment of the harmonic oscillator in the context
of Schrödinger's wave mechanics.
(iii) Functional analytical deduction of Heisenberg's uncertainty relation.
Modern quantum theory is not conceivable without the special theory of
relativity, because for many elementary particle processes, which occur under
extreme conditions, relativistic effects are essential. The special and the general
theory of relativity are discussed in Chapters 75 and 76. Here we will use the
following results:
(a) Every free particle has a rest mass $m_0 \geq 0$. Its energy is equal to
$$E = \sqrt{m_0^2 c^4 + p^2 c^2}, \tag{1}$$
where $p$ is the momentum vector and $c$ the velocity of light.
(b) For $m_0 > 0$ every free particle with velocity vector $v$ has the mass
$$m = m_0 / \sqrt{1 - v^2/c^2} \tag{2}$$
and the energy
$$E = mc^2. \tag{3}$$
(c) Physical effects can travel with, at most, the velocity of light. The velocity
of particles with rest mass $m_0 > 0$ is always less than the velocity of light.
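
Formulas (1) and (3) give the same energy once the momentum is taken to be $p = mv$ with the mass $m$ from (2). The relation $p = mv$ is the standard relativistic momentum and is an assumption here, since it is not stated in the list above. The following sketch checks the agreement numerically; the electron rest mass is a rounded value and the chosen speed is arbitrary.

```python
import math

C = 299_792_458.0  # velocity of light in m/s (exact SI value)


def relativistic_mass(m0, v):
    """Formula (2): m = m0 / sqrt(1 - v^2/c^2), valid for |v| < c."""
    return m0 / math.sqrt(1.0 - (v / C) ** 2)


def energy_from_momentum(m0, p):
    """Formula (1): E = sqrt(m0^2 c^4 + p^2 c^2)."""
    return math.sqrt(m0**2 * C**4 + p**2 * C**2)


m0 = 9.1093837e-31  # electron rest mass in kg (rounded)
v = 0.6 * C         # arbitrary sample speed below c

m = relativistic_mass(m0, v)
p = m * v                                 # assumed relation p = m v
E_from_p = energy_from_momentum(m0, p)    # formula (1)
E_from_mass = m * C**2                    # formula (3)

print(E_from_p, E_from_mass)
```

The agreement reflects the identity $m_0^2 c^4 + m^2 v^2 c^2 = m^2 c^4$, which follows from (2).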
At several places in this chapter we will take the opportunity to introduce
the reader to the peculiarities and usefulness of physical thought.

59.1. Plane Waves


We begin with a function
$$y = A(t)$$
of the time variable $t$, which has period $T$. Then $T$ is called the period of
oscillation. Moreover, we define the frequency
$$\nu = 1/T$$
and the angular frequency
$$\omega = 2\pi\nu.$$
The frequency $\nu$ is equal to the number of oscillations during one unit of time.
We have
$$T = 2\pi/\omega = 1/\nu.$$

Definition 59.1. Let $y = W(\alpha)$ be a function of period $2\pi$. By a plane wave we
mean
$$y = W(kx - \omega t), \tag{4}$$
where $x$ denotes the position vector and $t$ the time. Moreover, $k$ is a
nonvanishing vector, which will be called the wave vector. The direction of $k$ is
called the direction of propagation, and $|k|$ is called the wave number. The wave
length $\lambda$ and the propagation velocity $c_P$ (phase velocity) are, by definition,
equal to
$$\lambda = 2\pi/|k|, \qquad c_P = \omega/|k|.$$
We call $\omega$ the angular frequency and define the period of oscillation $T$ and
the frequency $\nu$ as above through $T = 2\pi/\omega$ and $\nu = 1/T$.

The function $W$ might be a real function. One may think, for example, of
the fluctuations of pressure or density of sound waves in fluids or gases. But
$W$ can also be a vector function. In this case, one may think of electric and
magnetic fields of electromagnetic waves (light) or the displacement of a body
under elastic waves. Such waves are of particular physical importance, since
they allow a transport of energy and momentum without a transport of mass.
The intuitive meaning and the motivation of the definition above become
clear if we lay the $e_1$-axis of a Cartesian coordinate system in the direction of
$k$ and set
$$x = \xi e_1 + \eta e_2 + \zeta e_3.$$
We then obtain $kx = |k|\xi$, and moreover
$$y = W(|k|\xi - \omega t) = W(|k|(\xi - c_P t)).$$
Since $W$ has period $2\pi$ this last equation yields a function $y = y(\xi)$ of period
$\lambda = 2\pi/|k|$ at every fixed time $t$. Furthermore, one obtains a function $y = y(t)$
of period $T$ for every fixed space point $x$. The number
$$\alpha = kx - \omega t = |k|(\xi - c_P t)$$
is called the phase. All space-time points $(x, t)$ with equal phase yield the same
value $W(\alpha)$. If one moves along $k$ with velocity $c_P$, then $W$ always has the same
value. This justifies the expression phase velocity for $c_P$. From the specific form
of a physical model one often obtains a relation of the form
$$\omega = \omega(|k|), \tag{5}$$
which is called a dispersion relation. This is equivalent to $c_P = c_P(\lambda)$, i.e., it
represents a connection between wave length and propagation velocity. The
derivative
$$c_G = \omega'(|k|)$$
is called the group velocity $c_G$ for $|k|$. We shall motivate this expression below.
Actually, from a physical point of view, the group velocity is much more
important than the phase velocity. Dispersion relations of the more general
form
$$\omega = \omega(k, x, t)$$
are also used.

EXAMPLE 59.2. An important special case of plane waves are the harmonic or
monochromatic waves
$$y = W_0 \sin(kx - \omega t + \alpha_0).$$
Here $\alpha_0$ is called the phase displacement and $|W_0|$ is called the amplitude. The
quantity $W_0$ may be a real number or a vector, e.g., the vector of the electric
field strength. Often, one works with complex harmonic waves
$$y = W_0\, e^{i(kx - \omega t + \alpha_0)}. \tag{6}$$
Sometimes they are studied directly, as in the case of quantum mechanics, but
in other instances only the imaginary part, which is equal to the sinusoidal
oscillation above, is used.

59.2. Polarization
If $W$ in (4) is a vector function and $W_0$ in (6) is a vector, as in the case of
electromagnetic waves (light), then one can introduce a number of further
concepts. The plane wave is called transversal (or longitudinal) if and only if
$W$ is perpendicular to the direction of propagation $k$ (or parallel to $k$).
Maxwell's equations of Part V show that light waves are always transversal. Elastic
waves may be transversal or longitudinal. The direction of $W$ is called the
direction of polarization.
We now consider transversal waves and classify their polarization. Let $P$
be a fixed plane, perpendicular to $k$. We move along $k$ with velocity $c_P$, whereby
the projection of $W$ onto $P$ describes a curve $C$ with time period $T$. The
wave is called circularly polarized (or elliptically, linearly polarized) if and only if $C$ is a circle
(or an ellipse, straight line). For a linear polarization, the plane through $k$ and
$W$ is called a polarization plane. The polarization is important for an
understanding of a number of light phenomena (e.g., anomalous refraction of
Iceland spar, intensity of reflected light, birefringence). Circularly polarized light
has an angular momentum. In the context of quantum field theory this means
that the photon has spin $+1$ or $-1$. This plays an important role in quantum
statistics, which in Chapter 68 will be applied to Planck's radiation law and
the state of the cosmos following the Big Bang.

59.3. Dispersion Relations


We want to discuss (5). As we shall see in Sections 59.9 and 59.10 below, the
group velocity corresponds to the propagation velocity of wave packets and
to the propagation velocity of particle rays. Such rays correspond to the
first-order approximation of wave theories (e.g., geometrical optics). Therefore
the knowledge of the dispersion relation (5) is of particular interest for the
computation of the group velocity. This is done by solving the field
equations of the corresponding physical theory. In Chapter 71 we will compute
the dispersion relation for water waves. Thereby we solve a nonlinear, free
boundary-value problem in the context of bifurcation theory. In Part V we
show that from Maxwell's equations one obtains the linear dispersion relation
$$c = \lambda\nu$$
for the propagation of electromagnetic waves in vacuum (light), where $c$ is a
universal constant, namely the velocity of light in the vacuum. We therefore
have
$$\omega = c|k|.$$
This leads to the important fact that
$$c_G = c_P = c,$$
i.e., group velocity and phase velocity coincide here.

Light Quantum Hypothesis 59.3 (Einstein (1905b)). Light consists of photons
of energy $E = h\nu$ with momentum $|p| = E/c$ and rest mass $m_0 = 0$. In other
terms, we have
$$E = \hbar\omega, \qquad p = \hbar k, \tag{7}$$
where h is Planck's quantum of action and $\hbar = h/2\pi$.
The meaning of these notations will be motivated in Section 75.11 in
the context of the relativistic mechanics of free particles. Einstein used this
hypothesis in order to explain the photoelectric effect, for which he received
the Nobel prize in 1921. If a metal plate is exposed to light, then one obtains cathode
rays, i.e., electrons are ejected. In the context of classical physics it cannot be
understood why the energy of the electrons does not depend on the intensity
of light, but only on its frequency $\nu$. However, this immediately becomes clear
if with Einstein one assumes that the electron is ejected following a collision
with a photon, whereby a maximal photon energy of $E = h\nu$ occurs.
In fact, (7) represents a fundamental relation between the wave and particle
picture of microscopic quantum objects, which is valid not only for light.
The universal applicability of (7) was postulated in 1924 by de Broglie in
connection with his theory of matter waves. This occurred before Heisenberg
(1925) and Schrödinger (1926) gave their formulations of quantum mechanics.
In the nonrelativistic quantum theory of free particles (first quantization)
these free particles are described by complex wave functions
$$\psi = \psi_0 e^{i(kx - \omega t)}.$$

For particles without spin (e.g., π-mesons) $\psi_0$ is a complex number. For the
electron one has $\psi_0 \in \mathbb{C}^4$ (see the Klein-Gordon equation and the Dirac
equation in Part V). Energy and momentum of the particles follow here from
the same relation (7) as in the case of the photon. The number
$$\lambda = 2\pi/|k|$$
is called the de Broglie wave length of the corresponding matter wave. The
relativistic equation
$$E^2 = m_0^2 c^4 + p^2 c^2$$
with rest mass $m_0$, together with (7), implies the dispersion relation for matter
waves
$$\omega = c\sqrt{(m_0 c/\hbar)^2 + k^2}.$$
In fact, one observes phenomena of diffraction of electron rays at crystal
lattices which are compatible with this wave picture.
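The dispersion relation for matter waves can be explored numerically. The following is a minimal sketch, assuming SI values for the constants and a hypothetical wave number k; the function names are illustrative and do not come from the text. It evaluates the phase velocity $\omega/|k|$ and the group velocity $d\omega/d|k|$ and shows that their product equals $c^2$, so that the group velocity stays below c.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant in J s
C = 2.99792458e8         # velocity of light in vacuum in m/s
M_E = 9.1093837015e-31   # rest mass of the electron in kg


def omega(k, m0):
    """Dispersion relation omega = c * sqrt((m0 c / hbar)^2 + k^2)."""
    return C * math.sqrt((m0 * C / HBAR) ** 2 + k ** 2)


def phase_velocity(k, m0):
    """Phase velocity omega / k."""
    return omega(k, m0) / k


def group_velocity(k, m0):
    """Group velocity d(omega)/dk = c^2 k / omega (analytic derivative)."""
    return C ** 2 * k / omega(k, m0)


def de_broglie_wavelength(k):
    """De Broglie wave length lambda = 2 pi / |k|."""
    return 2.0 * math.pi / abs(k)


k = 1.0e11  # hypothetical wave number in 1/m, chosen only for illustration
c_p = phase_velocity(k, M_E)
c_g = group_velocity(k, M_E)
print("wave length [m]:", de_broglie_wavelength(k))
print("c_p / c:", c_p / C, " c_g / c:", c_g / C)
```

For rest mass zero the same functions reproduce the photon case, where both velocities coincide with c.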

59.4. Spherical Waves


Instead of plane waves, one often uses so-called spherical waves
$$y = W(K|x| - \omega t)$$
with $k = Kx/|x|$. This means that the direction of propagation k of the wave
is radial and $|k| = K$. For fixed time t, the space points x with constant phase
and constant W lie on spheres around the origin. For variable t, these spheres
travel with velocity $c_p = \omega/K$ (Fig. 59.1). The quantities $\lambda$, T, and $\nu$ are defined
as in the case of plane waves.

Figure 59.1

59.5. Damped Oscillations and the Frequency-Time Uncertainty Relation

By a damped oscillation which is switched on at time t = 0 we mean
$$y = W_0(x)w(t) \tag{8}$$
with
$$w(t) = \begin{cases} e^{-i\omega_0 t + i\alpha_0 - \gamma t} & \text{for } t > 0, \\ 0 & \text{for } t \le 0, \end{cases}$$
with real numbers $\omega_0$, $\alpha_0$, and $\gamma > 0$. If $W_0$ is real, then for t > 0 we obtain
the imaginary part
$$y = e^{-\gamma t}W_0(x)\sin(\alpha_0 - \omega_0 t),$$
which represents a damped sinusoidal oscillation. For large $\gamma > 0$ these oscillations
tend quickly to zero as $t \to +\infty$ (see Fig. 59.2). By definition, the mean
life-time $\Delta t$ of the damped oscillation is
$$\Delta t = \frac{1}{2\gamma}.$$
This is the time during which $e^{-\gamma t}$ has decreased from 1 at t = 0 to the value
$e^{-1/2} \approx 0.6$. The time $t_{1/2}$, during which $e^{-\gamma t}$ decreases from 1 to 1/2, is called the
half-life period. Because of $t_{1/2} \approx 1.4\,\Delta t$ one often finds that in the literature
there is no distinction made between $t_{1/2}$ and $\Delta t$. We call
$$\Delta\omega = \gamma \tag{9}$$
the half-width of the spectrum of (8). This will be motivated below. Hence the
important so-called frequency-time uncertainty relation
$$\Delta\omega\,\Delta t = \tfrac{1}{2} \tag{10}$$
is valid. If, according to (7), we assign the energy $E_0 = \hbar\omega_0$ to the frequency
$\omega_0$ and set $\Delta E = \hbar\Delta\omega$, then we obtain the so-called energy-time uncertainty
relation
$$\Delta E\,\Delta t = \frac{\hbar}{2} \tag{11}$$

Figure 59.2

Figure 59.3

whose fundamental importance for elementary particle physics we will discuss
in Section 59.24.
Now we shall motivate (9). For this, we let
$$f(\omega) = \frac{1}{2\pi}\int_0^\infty e^{i(\omega - \omega_0)t + i\alpha_0 - \gamma t}\,dt = \frac{e^{i\alpha_0}}{2\pi[\gamma - i(\omega - \omega_0)]}.$$
Using a Fourier transformation we obtain the frequency representation
$$y = \int_{-\infty}^{\infty} f(\omega)W_0(x)e^{-i\omega t}\,d\omega$$
for (8). In many physical theories the squares of the amplitudes are a measure
for the intensity of the processes. In our special case
$$|f(\omega)|^2 = \frac{1}{4\pi^2(\gamma^2 + (\omega - \omega_0)^2)}$$
is then a measure for the intensity of the angular frequency $\omega$ in the spectrum
(Fig. 59.3). According to its definition we now determine $\Delta\omega$ from
$$|f(\omega_0 + \Delta\omega)|^2/|f(\omega_0)|^2 = \tfrac{1}{2},$$
i.e., for $\omega_0 + \Delta\omega$ the intensity has decreased to one-half of
the maximum at $\omega_0$. This gives (9). Therefore, roughly, one may say that only
the angular frequencies $\omega$ with
$$\omega_0 - \Delta\omega \le \omega \le \omega_0 + \Delta\omega$$
provide a significant contribution to the damped oscillation.
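The half-width condition can be checked numerically. The sketch below is only an illustration under assumed values of $\gamma$ and $\omega_0$ (both hypothetical, in arbitrary units); it locates by bisection the frequency offset at which the intensity has dropped to one-half of its maximum and then forms the product with the mean life-time $1/2\gamma$.

```python
import math


def intensity(omega, omega0, gamma):
    """Spectral intensity 1 / (4 pi^2 (gamma^2 + (omega - omega0)^2))."""
    return 1.0 / (4.0 * math.pi ** 2 * (gamma ** 2 + (omega - omega0) ** 2))


def half_width(omega0, gamma, tol=1e-12):
    """Offset d > 0 with intensity(omega0 + d) / intensity(omega0) = 1/2.

    The ratio decreases monotonically in d, so bisection on [0, 10 gamma] works.
    """
    peak = intensity(omega0, omega0, gamma)
    lo, hi = 0.0, 10.0 * gamma
    while hi - lo > tol * gamma:
        mid = 0.5 * (lo + hi)
        if intensity(omega0 + mid, omega0, gamma) / peak > 0.5:
            lo = mid   # still above one-half: the half-width lies further out
        else:
            hi = mid
    return 0.5 * (lo + hi)


gamma = 3.0     # hypothetical damping constant
omega0 = 50.0   # hypothetical central angular frequency
d_omega = half_width(omega0, gamma)
d_t = 1.0 / (2.0 * gamma)   # mean life-time as defined in the text
print("half-width:", d_omega, " product:", d_omega * d_t)
```

The computed half-width agrees with $\gamma$ and the product with 1/2 up to the bisection tolerance, independently of the chosen $\omega_0$.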

59.6. Decay of Particles


In connection with decay processes we also have the concept of the mean
life-time of particles. What precisely does this mean? We think, for example,
of a β-decay (radioactive decay) of the neutron
$$n \to p + e^- + \bar\nu$$
into a proton, an electron, and an antineutrino. This is a typical process of
weak interaction. Theoretically, this was studied for the first time by Fermi in
1934. Let N(t) be the number of particles that have not yet decayed at time t. We
assume that w(0) = 0. Then the decay probability w(t) for a particle during the
time interval [0, t] is given by
$$w(t) = \gamma t + o(t), \qquad t \to 0, \tag{12}$$
where t is sufficiently small in order to guarantee that w(t) < 1. It follows that
$$N(t + h) = N(t) - N(t)w(h) = N(t)(1 - \gamma h + o(h)), \qquad h \to 0,$$
and this implies the differential equation
$$N' = -\gamma N.$$
Hence, we obtain the well-known decay law
$$N(t) = N(0)e^{-\gamma t}. \tag{13}$$
As in Section 59.5, we call
$$\Delta t = \frac{1}{2\gamma}$$
the mean life-time of a particle and $t_{1/2} \approx 1.4\,\Delta t$ the half-life period of the
substance. Again, one finds that in the literature there is often no distinction
made between $\Delta t$ and $t_{1/2}$. We have $N(\Delta t) \approx 0.6\,N(0)$ and $N(t_{1/2}) = 0.5\,N(0)$.
Moreover, $\gamma$ is called the decay coefficient.
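The passage from the step relation $N(t + h) = N(t)(1 - \gamma h + o(h))$ to the decay law can be illustrated numerically. The following sketch uses a hypothetical decay coefficient and initial particle number; it repeats the step relation with the o(h) term dropped (an explicit Euler scheme) and compares the result with the exponential law, and it also evaluates the mean life-time and the half-life period as defined in the text.

```python
import math


def surviving(n0, gamma, t):
    """Decay law N(t) = N(0) exp(-gamma t)."""
    return n0 * math.exp(-gamma * t)


def mean_life_time(gamma):
    """Mean life-time in the convention of the text: 1 / (2 gamma)."""
    return 1.0 / (2.0 * gamma)


def half_life(gamma):
    """Time after which one-half of the particles is left: ln(2) / gamma."""
    return math.log(2.0) / gamma


def stepwise_decay(n0, gamma, t, steps):
    """Apply N -> N (1 - gamma h) repeatedly with h = t / steps."""
    h = t / steps
    n = n0
    for _ in range(steps):
        n *= 1.0 - gamma * h
    return n


gamma = 0.2   # hypothetical decay coefficient
n0 = 1.0e6    # hypothetical initial number of particles
dt = mean_life_time(gamma)
t_half = half_life(gamma)
print("N(dt)/N(0)     =", surviving(n0, gamma, dt) / n0)
print("t_half / dt    =", t_half / dt)
print("stepwise/exact =", stepwise_decay(n0, gamma, 5.0, 100000) / surviving(n0, gamma, 5.0))
```

The ratio of half-life period to mean life-time is $2\ln 2 \approx 1.39$, which the text rounds to 1.4.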

59.7. Cross Sections for Elementary Particle Processes and the Main Objectives in Quantum Field Theory

In the case of cosmic rays or particle accelerators one has the following
idealized situation. A homogeneous flow of bombarding particles with particle
density $\rho$ (number of particles per volume) and constant velocity vector v hits
target particles, where a number $N_{\text{target}}$ of such target particles is
contained in a fixed given volume. Let $N_{\text{reaction}}$ denote the number of certain
reactions which take place there during the time interval [0, t]. One may think
of tracks in emulsions for cosmic rays. The vector $j = \rho v$ is called the vector
of the particle current density. The effective cross section is then defined as
$$\sigma = \frac{N_{\text{reaction}}}{N_{\text{target}} \cdot t \cdot |j|}.$$
Letting
$$N_{\text{effective}} = \sigma|j|,$$
we obtain that
$$N_{\text{reaction}} = N_{\text{effective}} \cdot N_{\text{target}} \cdot t. \tag{14}$$
In physics, the quantity $\sigma$ is the most important characterization of scattering
processes and has the dimension of an area. One has the following intuitive
interpretation:
The number of reactions with one target particle during a fixed time interval
is equal to the number of bombarding particles which penetrate a surface $\sigma$
during this time interval. Here, the surface is perpendicular to the homogeneous
flow of bombarding particles.
The larger $\sigma$ is, the bigger is the number of reactions. In Section 59.26 we
will give numerical values for $\sigma$. In Problem 59.3 we compute the effective
cross section for the classical Rutherford scattering of α-particles at atomic
nuclei. Using these scattering experiments, Rutherford in 1911 was able to
determine the atomic structure.
By
$$w(t) = N_{\text{effective}} \cdot t = \sigma|j|t \tag{15}$$
we define the reaction probability w(t) during the time interval [0, t]. Strictly
speaking, one needs to add the term o(t) as $t \to 0$ to (14) and (15) similarly as
in the case of (12).
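The bookkeeping behind (14) and (15) amounts to simple arithmetic. The sketch below uses purely hypothetical numbers for the particle density, velocity, cross section, and number of target particles (none of them are taken from the text) and shows that the definition of the effective cross section and relation (14) are inverse to each other.

```python
def effective_cross_section(n_reaction, n_target, t, current_density):
    """sigma = N_reaction / (N_target * t * |j|)."""
    return n_reaction / (n_target * t * current_density)


def expected_reactions(sigma, current_density, n_target, t):
    """N_reaction = N_effective * N_target * t with N_effective = sigma * |j|."""
    n_effective = sigma * current_density
    return n_effective * n_target * t


def reaction_probability(sigma, current_density, t):
    """w(t) = sigma * |j| * t, meaningful only as long as w(t) < 1."""
    return sigma * current_density * t


# Hypothetical beam and target data in SI units, for illustration only.
rho = 1.0e12        # bombarding particles per cubic metre
speed = 1.0e7       # metres per second
j = rho * speed     # magnitude of the particle current density
sigma = 1.0e-28     # assumed cross section in square metres
n_target = 1.0e20   # target particles in the fixed volume
t = 10.0            # seconds

n_reaction = expected_reactions(sigma, j, n_target, t)
print("reactions:", n_reaction)
print("recovered sigma:", effective_cross_section(n_reaction, n_target, t, j))
print("reaction probability per target particle:", reaction_probability(sigma, j, t))
```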
The first main task of quantum field theory is to compute the decay and
reaction probabilities, i.e., the decay coefficients $\gamma$ and the effective cross
sections $\sigma$. For this, the formalism of the S-matrix (scattering matrix) is used
by physicists without a rigorous mathematical justification. A second main
task would be to predict the specific properties of all elementary particles
(mass, charge, spin, isospin, strangeness, etc.). This, however, lies in the distant
future.

59.8. Dualism Between Wave and Particle for Light


In connection with the previous observations let us make some brief remarks
about the physical meaning of the dualism between wave and particle. The
mathematical theory will be considered in Part V. For our discussion here we
choose light as a typical example. As in the case of elementary particles,
physicists and mathematicians are still puzzled.
(i) Fermat's principle of geometrical optics of 1662. In connection with
reflection and refraction, light behaves like a particle ray, whose motion in
geometrical optics is determined by Fermat's variational principle of the least
(stationary) light time. From a mathematical point of view, this principle is
completely analogous to the principle of the least (stationary) action of classical
mechanics. At the same time, the principle of stationary action is the most
general variational principle in physics and is true for numerous field theories;
in particular, it determines the mathematical structure of the relativistic
theories for elementary particles (see Part V).
(ii) Electromagnetic waves. In Maxwell's electrodynamics of 1873 light is
described by waves of coupled electric and magnetic fields. Thereby diffrac-
tion, interference, and polarization effects can be explained. Geometrical
optics in this context is obtained by using asymptotic expansions with respect
to the small wave length $\lambda$ and taking only the first-order approximation into
consideration. The light rays are here the stream lines of the vector field of the
energy density (see Chapter 40).
Maxwell's theory, which will be discussed in Part V, is an ingenious theory.
First, up until today no corrections have been necessary either in connection
with the special theory of relativity or in connection with quantum electro-
dynamics. Second, it is the first physical theory, in which two apparently very
different phenomena, namely electricity and magnetism, are connected with
each other by using the notion of a field. Third, it is a gauge field theory and
hence a model for the modern gauge field theories of elementary particles, in
connection with which one tries to develop a unified theory for all interactions
using the idea of gauge fields. Finally, it enabled Maxwell to predict that light
is an electromagnetic wave. The experimental confirmation came in 1888
through Heinrich Hertz. This was nine years after Maxwell's death.
(iii) Planck's radiation law and quantum theory. At the end of the nineteenth
century it was a famous physical problem to find the correct radiation law to
describe the energy distribution in the spectrum of radiating bodies (see
Section 68.4). To find this law by means of thermodynamic methods, Planck
in 1900 made the revolutionary assumption that the energy of the harmonic
oscillator cannot take on all values, but only discrete ones with
$$\Delta E = \hbar\omega.$$
In 1925, Heisenberg showed that the precise values are
$$E = \hbar\omega\left(n + \tfrac{1}{2}\right), \qquad n = 0, 1, \ldots$$
(see Example 58.26). For quantum field theory it is of particular interest that
there exists a nonzero energy for the ground state n = 0. The physical interpretation
of this is that the vacuum itself possesses physical properties, which
can be observed experimentally, e.g., through the fine structure (Lamb shift)
in the hydrogen spectrum or through the magnetic anomaly of the electron.
Moreover, the evaporation of black holes of Section 76.17 is based on these
quantum phenomena.
(iv) Photons and statistical physics. Other than in geometrical optics, light
of all wave lengths behaves during the photoelectric effect like a particle. This
light quantum hypothesis, which was formulated by Einstein in 1905, follow-
ing Planck's quantum hypothesis, can easily be used together with the Bose
statistics to derive Planck's radiation law (see Chapter 68).
(v) Einstein's special theory of relativity (1905). The fact that the velocity of
light is constant for all inertial systems leads to the relativistic structure of
time and the development of relativistic mechanics (see Chapter 75).
(vi) Einstein's general theory of relativity (1916). Light is deflected by masses,
i.e., it behaves like a particle with respect to relativistic gravitation (see
Chapter 76).
(vii) Quantum electrodynamics. This theory, which was created by Feynman,
Schwinger, and Tomonaga during the late 1940s, uses a quantum field to
describe the interactions between electrons, positrons, and photons. It is
obtained from an equation which results from a combination of Maxwell's
equations with Dirac's spinor equation for the electron and the positron. The
formalism of quantum theory consists of two steps. The first quantization
yields relativistic field equations, which, in connection with an abstract Hilbert
space theory, contain particle as well as wave aspects. A probability theoretical
interpretation of physical processes is important. The second quantization
yields quantum fields. A consistent mathematical theory for this has yet to be
found. Because of very strong singularities, the formalism which is presently
used leads to mathematically meaningless expressions. Physicists, however,
have found a regularization procedure (renormalization of charge and mass),
which, formally applied, leads to accurate agreement with experiment. In
this theory, the photon is a quantum, which can be regarded as a simultaneous
generalization of wave and particle. Precisely speaking, the quantum is the
primary physical phenomenon, and our pictures of particles and waves, which
come from our macroscopic experience, are possible approximations for a
description of microscopic phenomena.
The unusual structure of the quantum is reflected in the historical development
of quantum mechanics. Heisenberg in 1925 used particle quantization, i.e., a
quantization of classical mechanics, to arrive at quantum mechanics. Schrödinger
in 1926, on the other hand, obtained his quantum mechanics completely
independently of Heisenberg, by using the idea of matter waves. Actually, both
theories are only different realizations of the same abstract Hilbert space
theory (Heisenberg picture and Schrödinger picture; see Part V).
(viii) Gauge field theory. This modern theory starts from the Dirac equation
for electron and positron. According to the principle of greatest simplicity,
one automatically obtains the electromagnetic field and the photon as a quantum
of this field by assuming the gauge invariance of the theory. Roughly speaking,
the photon in gauge field theory is obtained for nothing. It is needed to carry
the information about the gauge of the electron-positron field. This results
in electromagnetic interaction (see Part V).
Quantum electrodynamics and the gauge field theory for the photon form
the model for the modern theories which describe all interactions in the
microcosmos. All this leads to the same mathematical difficulties.
The previous points (i)-(viii) show the very interesting phenomenon that
the light, which is necessary for our biological existence, is also the light of
our physical knowledge. Any significant physical theory is in an essential way
connected with light.

59.9. Wave Packets and Group Velocity


We want to show that wave packets propagate with group velocity and set
$K = |k|$. By definition, a wave packet is obtained by superposition of harmonic
waves
$$y = \int_{K_0}^{K_0 + \Delta K} A(K)e^{i(K\xi - \omega(K)t)}\,dK,$$
where $\Delta K > 0$ is small. We content ourselves with a rough argument, and
approximate the integral with the trapezoid formula
$$y = 2^{-1}[f(K_0) + f(K_0 + \Delta K)]\Delta K,$$
where
$$f(K) = A(K)e^{i(K\xi - \omega(K)t)}.$$
Taylor expansion at the point $K_0$ yields, for $K = K_0 + \Delta K$,
$$K\xi - \omega(K)t = K_0\xi - \omega(K_0)t + (\xi - \omega'(K_0)t)\Delta K + o(\Delta K), \qquad \Delta K \to 0.$$
Using the phase velocity $c = \omega(K_0)/K_0$ and the group velocity $c_g = \omega'(K_0)$
we obtain in first-order approximation
$$y = B(\xi, t)e^{iK_0(\xi - ct)} \tag{16}$$
with
$$B(\xi, t) = 2^{-1}\Delta K\,[A(K_0) + A(K_0 + \Delta K)e^{i\Delta K(\xi - c_g t)}]. \tag{17}$$
This can be viewed as a harmonic wave, for which, likewise, the amplitude
varies according to a harmonic wave law. Denoting the wave lengths of $K_0$
and $\Delta K$ by
$$\lambda_0 = 2\pi/K_0 \quad\text{and}\quad \lambda = 2\pi/\Delta K,$$
respectively, and letting $\Delta K \ll K_0$, we obtain
$$\lambda_0 \ll \lambda.$$
This means that the wave length of the harmonic wave (16) is significantly
smaller than the wave length of the amplitude change (17). This situation is
pictured in Figure 59.4.
Motivated by this argument, we define the propagation velocity of the wave
packet as $c_g = \omega'(K_0)$.
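The rough argument can be tried out numerically. The following sketch is an illustration only: the dispersion relation $\omega(K) = K^2$, the constant spectral amplitude, and all numerical values are hypothetical choices, and the split into a carrier wave and a slowly varying amplitude follows the first-order Taylor expansion described above. The code compares the trapezoid value with the factorized first-order form and checks that the modulus of the amplitude is unchanged when one moves along with the group velocity.

```python
import cmath


def omega(K):
    """Hypothetical dispersion relation, chosen only for illustration."""
    return K ** 2


def omega_prime(K):
    """Derivative of the hypothetical dispersion relation."""
    return 2.0 * K


def amplitude(K):
    """Hypothetical constant spectral amplitude A(K)."""
    return 1.0


def packet_trapezoid(xi, t, K0, dK):
    """Trapezoid approximation of the wave packet integral over [K0, K0 + dK]."""
    def f(K):
        return amplitude(K) * cmath.exp(1j * (K * xi - omega(K) * t))
    return 0.5 * (f(K0) + f(K0 + dK)) * dK


def packet_first_order(xi, t, K0, dK):
    """Carrier wave times slowly varying amplitude, to first order in dK."""
    c = omega(K0) / K0          # phase velocity
    c_g = omega_prime(K0)       # group velocity
    carrier = cmath.exp(1j * K0 * (xi - c * t))
    envelope = 0.5 * dK * (amplitude(K0)
                           + amplitude(K0 + dK) * cmath.exp(1j * dK * (xi - c_g * t)))
    return envelope * carrier


K0, dK = 10.0, 0.01
xi, t = 3.0, 2.0
exact = packet_trapezoid(xi, t, K0, dK)
approx = packet_first_order(xi, t, K0, dK)
print("difference:", abs(exact - approx))

# Moving along with the group velocity leaves the modulus of the amplitude unchanged.
s = 0.7
shifted = packet_first_order(xi + omega_prime(K0) * s, t + s, K0, dK)
print("moduli:", abs(approx), abs(shifted))
```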

Figure 59.4

59.10. Formulation of a Particle Theory for a Classical Wave Theory

Our starting point is a wave theory with the dispersion relation
$$\omega = \omega(k, x, t) \tag{18}$$
to which we assign the so-called eikonal equation
$$-S_t = \omega(S_x, x, t) \tag{19}$$
and the canonical equations
$$\dot{x} = \omega_k(k, x, t), \qquad \dot{k} = -\omega_x(k, x, t). \tag{20}$$
Here $\omega$ has to be replaced with (18). By definition, the particle rays x = x(t)
are obtained as a solution of (20).
Let us motivate this general procedure. We will be guided by the geometrical
optics of Chapter 40, and begin with the function
$$y = W_0(x, t)e^{iS(x,t)},$$
where S is called an eikonal. In first-order approximation Taylor expansion
of S at the point $(x_0, t_0)$ yields
$$S(x, t) = \alpha_0 + k(x - x_0) - \omega(t - t_0) + \cdots$$
with
$$k = S_x(x_0, t_0), \qquad \omega = -S_t(x_0, t_0). \tag{21}$$
Thus in first-order approximation we obtain the harmonic wave
$$y = W_0 e^{i(k(x - x_0) - \omega(t - t_0) + \alpha_0)} + \cdots.$$
Equations (18) and (21) then yield the eikonal equation (19), which has the
form of the Hamilton-Jacobi equation. According to Section 58.23, equations
(20) are the corresponding canonical equations.

EXAMPLE 59.4. We consider the important special case
$$\omega = \omega(|k|)$$
which, for example, occurs for light. From (20) we immediately obtain k =
const and the particle rays
$$x = c_g\frac{k}{|k|}t + x(0)$$
with $c_g = \omega'(|k|)$. This corresponds to particles which propagate with group
velocity $c_g$.

59.11. Motivation of the Schrödinger Equation and Physical Intuition

In particular, I would like to mention that I was mainly inspired by the thought-
ful dissertation of Mr. Louis de Broglie (Paris, 1924). The main difference here
lies in the following. De Broglie thinks of travelling waves, while, in the case of
the atom, we are led to standing waves.... I am most thankful to Hermann Weyl
with regard to the mathematical treatment of the equation (of the hydrogen
atom).
Erwin Schrödinger (1926)

The basic equation of quantum mechanics is the Schrödinger equation
$$i\hbar\psi_t = -\frac{\hbar^2}{2m}\Delta\psi + U(x)\psi. \tag{22}$$
More precisely, Schrödinger (1926) began with the stationary equation
$$E\varphi = -\frac{\hbar^2}{2m}\Delta\varphi + U(x)\varphi \tag{23}$$
which follows from (22) through
$$\psi(x, t) = \varphi(x)e^{-iEt/\hbar}.$$
From (23) Schrödinger derived the spectrum of the hydrogen atom. This will
be discussed in Section 59.16. In this case U is the potential of the electrostatic
attracting force of the atomic nucleus, and from the eigenvalues E of (23) one
obtains the energy levels of the electron of the hydrogen atom.
We now want to motivate (23) by using an argument in the spirit of Schrödinger's
ideas. We begin with the classical energy relation
$$E = \frac{p^2}{2m} + U(x) \tag{24}$$
for a particle of mass m and with momentum vector p, which is located in a
force field with potential U. Formally, we consider a de Broglie matter wave
$$\psi = \psi_0 e^{i(kx - \omega t)} \tag{25}$$
with
$$E = \hbar\omega, \qquad p = \hbar k. \tag{26}$$
From (24) we obtain the dispersion relation
$$\hbar\omega = \frac{\hbar^2 k^2}{2m} + U(x). \tag{27}$$
Our goal is the following:
(S) We are looking for a partial differential equation, which has $\psi$ in (25) with
(26), (27) as a solution.
Such an equation is obtained most easily by substituting
$$p \Rightarrow -i\hbar\frac{\partial}{\partial x} \tag{28}$$
in (24). This immediately yields the Schrödinger equation (22).
It is, however, quite surprising that it is possible to obtain such a fundamental
equation in such a simple and formal way. The discovery of the
Schrödinger equation is an example of the power of physical intuition. It is
not the mathematical formalism which plays the important role, but rather
the use of physical images and concepts which have been tested in connection
with other physical phenomena.
As we shall see in Part V, substitution (28) also directly implies the Schrödinger
equation for many-particle systems and the relativistic equation for the
electron, i.e., the Dirac equation. In the same way we obtain, in Section 59.24,
the Klein-Gordon equation for mesons. Substitution (28) represents a general
quantization procedure (first quantization).
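Goal (S) can be tested numerically in the simplest situation. The sketch below is restricted to one space dimension and a constant potential, and it uses units with $\hbar = m = 1$; all of these are simplifying assumptions made for illustration, as are the numerical values. It approximates the derivatives by central differences and evaluates the residual of the Schrödinger equation for a plane wave, once with the dispersion relation satisfied and once with a deliberately wrong frequency.

```python
import cmath

HBAR = 1.0   # units with hbar = 1, an assumption of this sketch
MASS = 1.0   # units with m = 1, an assumption of this sketch


def plane_wave(k, omega):
    """Return the function psi(x, t) = exp(i (k x - omega t))."""
    return lambda x, t: cmath.exp(1j * (k * x - omega * t))


def schrodinger_residual(psi, x, t, potential, h=1e-3):
    """Residual i hbar psi_t + hbar^2/(2m) psi_xx - U psi via central differences."""
    psi_t = (psi(x, t + h) - psi(x, t - h)) / (2.0 * h)
    psi_xx = (psi(x + h, t) - 2.0 * psi(x, t) + psi(x - h, t)) / h ** 2
    return 1j * HBAR * psi_t + HBAR ** 2 / (2.0 * MASS) * psi_xx - potential * psi(x, t)


k = 2.0   # hypothetical wave number
U = 0.5   # hypothetical constant potential
omega_ok = HBAR * k ** 2 / (2.0 * MASS) + U / HBAR   # dispersion relation satisfied
res_ok = abs(schrodinger_residual(plane_wave(k, omega_ok), 0.3, 0.7, U))
res_bad = abs(schrodinger_residual(plane_wave(k, omega_ok + 1.0), 0.3, 0.7, U))
print("residual with dispersion relation:", res_ok)
print("residual with wrong frequency:    ", res_bad)
```

The first residual is of the size of the discretization error, while the second is of order one, which reflects that the plane wave solves the equation exactly when its frequency obeys the dispersion relation.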

59.12. Fundamental Probability Interpretation of Quantum Mechanics

It is remarkable that Schrödinger used his equation (23) without knowing the
right interpretation of the complex-valued wave function $\psi$. This interpretation
was discovered by Max Born (1926) in a fundamental paper, in which he
used the time-dependent Schrödinger equation (22) to develop a quantum
mechanical scattering theory. This statistical interpretation of $\psi$, which has
led to a fundamental change in the way of physical thinking, will now be
discussed.
We use the scalar product
$$(\varphi|\psi) = \int_{\mathbb{R}^3} \bar{\varphi}\psi\,dx.$$
In the following only complex-valued functions $\psi = \psi(x, t)$ are considered
which belong to the complex H-space $L_2^{\mathbb{C}}(\mathbb{R}^3)$ at all times $t \in \mathbb{R}$, and satisfy
the normalization condition
$$(\psi|\psi) = 1.$$
We interpret such functions as the particle state at time t.
(i) Probability of presence for particles. The number
$$\int_G |\psi(x, t)|^2\,dx$$
is the probability of finding a particle at time t in the region G.
(ii) Expectation value and dispersion of physical quantities in state $\psi$. If we
choose $A\psi$ either equal to $\xi_j\psi$ or equal to $-i\hbar\,\partial\psi/\partial\xi_j$, then
$$\bar{A} = (\psi|A\psi), \qquad (\Delta A)^2 = (\psi|(A - \bar{A}I)^2\psi) \tag{29}$$
are the expectation value $\bar{A}$ and the dispersion $(\Delta A)^2$ for the position
component $\xi_j$ and the momentum component $p_j$ at time t, respectively.
Thereby $\psi$ must be chosen at time t. Furthermore, $\Delta A$ is called the mean
fluctuation or the mean error for the physical quantity at state $\psi$ which
belongs to A.
This coincides with the correspondence principle
$$\xi_j \Rightarrow \xi_j, \qquad p_j \Rightarrow -i\hbar\,\partial/\partial\xi_j, \tag{30}$$
which has already been used in (28). In (29) one naturally assumes that the
right-hand sides exist.
Unlike in classical mechanics, a particle state need not correspond to
a well-defined position state and momentum state. Instead, only expectation
values exist.

59.13. Meaning of Eigenfunctions in Quantum Mechanics

Equations (29) describe a general quantum mechanical principle. Similarly
as in (30) one assigns operators A to physical quantities and then computes
$\bar{A}$ and $\Delta A$ according to (29). The expectation value $\bar{A}$ is called sharp if and
only if
$$\Delta A = 0.$$
States with sharp expectation values are of particular physical interest.
The mathematical meaning of $\bar{A}$ and $\Delta A$ is clear from the Chebychev
inequality
$$p(|A - \bar{A}| \le \alpha) \ge 1 - \frac{(\Delta A)^2}{\alpha^2} \qquad\text{for all } \alpha > 0,$$
where the left-hand side denotes the probability of measuring a value A with
$|A - \bar{A}| \le \alpha$. The simple proof of this may be found in Section 68.1. In
particular, if $\Delta A = 0$, then
$$p(A = \bar{A}) = 1,$$
and in the case that $\Delta A > 0$, we obtain, for example,
$$p(|A - \bar{A}| \le 4\Delta A) \ge \tfrac{15}{16}.$$
In general, the Chebychev inequality tells us that the measurement value A is
closer to the mean value $\bar{A}$ the smaller the dispersion $(\Delta A)^2$ is.
The great importance of eigenfunctions and eigenvalues in quantum physics
results from the following. If
$$A\psi = \lambda\psi$$
with $(\psi|\psi) = 1$, then, at all times t, the sharp expectation value $\bar{A} = \lambda$ corresponds
to the state $\psi$, since
$$\bar{A} = (\psi|A\psi) = \lambda, \qquad (\Delta A)^2 = (\psi|(A - \lambda I)^2\psi) = 0.$$

EXAMPLE 59.5 (Hamilton Operator). According to (30) the Hamilton operator
$$H = -\frac{\hbar^2}{2m}\Delta + U$$
is assigned to the classical Hamilton function $\mathcal{H} = (p^2/2m) + U$. Since the
classical Hamilton function represents the energy of the particle, we interpret
$\bar{H}$ as the energy expectation value. If $\varphi = \varphi(x)$, as in the case of the
hydrogen atom and the harmonic oscillator below, is an eigenfunction of H
with $(\varphi|\varphi) = 1$, i.e., $H\varphi = E\varphi$, then
$$\psi(x, t) = e^{-iEt/\hbar}\varphi(x)$$
is a normalized solution of the Schrödinger equation $i\hbar\psi_t = H\psi$ and we obtain
$$H\psi = E\psi.$$
Thus for the state $\psi$ there exists the sharp expectation value $\bar{H} = E$ at all
times t.

EXAMPLE 59.6 (Angular Momentum). We assign the operator A = N to the
classical angular momentum $\mathcal{N} = x \times p$ with
$$N = x \times p \quad\text{and}\quad p = -i\hbar\,\partial/\partial x.$$
Explicitly, we have $N = \sum_{i=1}^{3} N_i e_i$ and
$$N_3 = \xi_1 p_2 - \xi_2 p_1,$$
where $N_1$ and $N_2$ are found from cyclic permutations. The operator of the
square of the angular momentum is given by $N^2 = N_1^2 + N_2^2 + N_3^2$. In spherical
coordinates this means
$$N_3 = -i\hbar\,\partial/\partial\varphi,$$
$$N^2 = -\frac{\hbar^2}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\frac{\partial}{\partial\vartheta}\right) + \frac{N_3^2}{\sin^2\vartheta}.$$
For the Hamilton operator of Example 59.5 we obtain
$$H = \frac{1}{2m}\left[\frac{N^2}{r^2} - \frac{\hbar^2}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right)\right] + U.$$
These expressions are important for the treatment of the hydrogen atom.
Let $Y_l^m$ denote the surface harmonics which are discussed in the following
section. Then, in spherical coordinates, we obtain for every differentiable
function
$$\psi(x, t) = R(r, t)Y_l^m(\varphi, \vartheta),$$
the relations
$$N^2\psi = \hbar^2 l(l + 1)\psi, \qquad N_3\psi = \hbar m\psi, \tag{31}$$
i.e., for these states the square of the angular momentum $N^2$ has the sharp
expectation value $\hbar^2 l(l + 1)$ and $N_3$ has the sharp expectation value $\hbar m$, where
l = 0, 1, ... and m = l, l - 1, ..., -l. In this way one obtains the quantization
of the angular momentum, which is important for the understanding of so
many physical processes.

59.14. Meaning of Nonnormalized States


For the plane wave
$$\psi = \psi_0 e^{i(kx - \omega t)}$$
with $E = \hbar\omega$ and $p = \hbar k$ the previous interpretation fails, since $(\psi|\psi) = \infty$.
The interpretation now is as follows. One uses $\psi$ in scattering experiments
and assigns to this function the current density vector
$$j = \rho v$$
of a parallel particle current with density $\rho$ and velocity vector v, where
$$v = p/m.$$
During the time interval [0, t] exactly N particles with
$$N = |j|Ft$$
penetrate a surface F perpendicular to the direction of propagation. The
physical meaning of current density vectors will be discussed more thoroughly
in Section 69.1 (cf. also Chapter 87).

59.15. Special Functions in Quantum Mechanics


In preparation for the quantum mechanical treatment of the harmonic oscillator
and the hydrogen atom, we consider the following special functions of
the real variable $\xi$. Moreover, we set $D = d/d\xi$.
(i) Hermite functions (harmonic oscillator functions)
$$H_n(\xi) = \frac{(-1)^n}{\sqrt{2^n n!\sqrt{\pi}}}\,e^{\xi^2/2}D^n e^{-\xi^2}.$$
For n = 0, 1, ... they form a complete orthonormal system in $L_2(-\infty, \infty)$ and
solve the differential equation
$$-y'' + \xi^2 y = (2n + 1)y.$$
(ii) Legendre polynomials
$$P_n(\xi) = \sqrt{\frac{2n + 1}{2^{2n+1}(n!)^2}}\,D^n(1 - \xi^2)^n.$$
For n = 0, 1, ... they form a complete orthonormal system in $L_2(-1, 1)$ and
solve the differential equation
$$-((1 - \xi^2)y')' = n(n + 1)y.$$
(iii) Generalized Legendre polynomials
$$P_l^m(\xi) = \sqrt{\frac{(l - m)!}{(l + m)!}}\,(1 - \xi^2)^{m/2}D^m P_l(\xi).$$
For fixed m = 0, 1, ... and l = m, m + 1, ... they form a complete orthonormal
system in $L_2(-1, 1)$ and solve the differential equation
$$-((1 - \xi^2)y')' + m^2(1 - \xi^2)^{-1}y = l(l + 1)y.$$
(iv) Surface harmonics
$$Y_l^m(\varphi, \vartheta) = \frac{1}{\sqrt{2\pi}}P_l^{|m|}(\cos\vartheta)e^{im\varphi}.$$
For l = 0, 1, ... and m = l, l - 1, ..., -l they form a complete orthonormal
system in $L_2(S^2)$, where $S^2$ is the surface of the unit ball in $\mathbb{R}^3$ and $\varphi$, $\vartheta$ denote
spherical coordinates. These functions solve the differential equations (31).
(v) Laguerre functions
$$L_n^\alpha(\xi) = c_n^\alpha e^{\xi/2}\xi^{-\alpha/2}D^n(e^{-\xi}\xi^{n+\alpha}).$$
The positive constants $c_n^\alpha$ are chosen such that
$$\int_0^\infty (L_n^\alpha)^2\,d\xi = 1.$$
For fixed $\alpha > -1$ and n = 0, 1, ..., these functions form a complete orthonormal
system in $L_2(0, \infty)$ and solve the differential equation
$$-4(\xi y')' + (\xi + \alpha^2/\xi)y = 2(2n + 1 + \alpha)y.$$
The proofs for this may be found in Triebel (1972, M).
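The orthonormality of the harmonic oscillator functions in (i) can be checked numerically. The following sketch is an illustration only. It assumes the standard three-term recurrence for the Hermite polynomials $(-1)^n e^{\xi^2}D^n e^{-\xi^2}$, which is not stated in the text, and it replaces the integral over the real line by the trapezoid rule on a finite interval; the interval and the step count are arbitrary choices.

```python
import math


def hermite_function(n, xi):
    """Normalized harmonic oscillator function of index n at the point xi.

    Uses the recurrence H_{j+1} = 2 xi H_j - 2 j H_{j-1} with H_0 = 1 for the
    polynomial part (an assumption of this sketch), then multiplies by
    exp(-xi^2 / 2) and divides by sqrt(2^n n! sqrt(pi)).
    """
    h_prev, h = 0.0, 1.0
    for j in range(n):
        h_prev, h = h, 2.0 * xi * h - 2.0 * j * h_prev
    norm = math.sqrt(2.0 ** n * math.factorial(n) * math.sqrt(math.pi))
    return h * math.exp(-xi * xi / 2.0) / norm


def inner_product(m, n, a=-12.0, b=12.0, steps=4800):
    """Trapezoid approximation of the L2 inner product on [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps + 1):
        xi = a + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * hermite_function(m, xi) * hermite_function(n, xi)
    return total * h


for m in range(3):
    print([round(inner_product(m, n), 10) for n in range(3)])
```

The printed matrix of inner products should be close to the identity matrix; the functions decay so quickly that truncating the interval does not matter at this accuracy.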

59.16. Spectrum of the Hydrogen Atom


The spectrum of the hydrogen atom consists of discrete lines with wave lengths
$\lambda_{mn} = 2\pi c/\omega_{mn}$ and corresponding angular frequencies
$$\omega_{mn} = R\left(\frac{1}{n^2} - \frac{1}{m^2}\right), \tag{32}$$
where m, n = 1, 2, ... and m > n. This was empirically discovered by Balmer
in 1885. Later on Rydberg experimentally determined the so-called Rydberg
frequency $R = 2.07 \cdot 10^{16}\ \mathrm{s}^{-1}$. Thus the problem was to find a theoretical
explanation for this surprisingly simple relation.
The first step was taken by Niels Bohr (1913). The hydrogen atom consists
of a proton with charge |e| and an electron with charge $e = -1.602 \cdot 10^{-19}$ A s.
Bohr postulated that the electron can move only on discrete circular orbits
(Fig. 59.5). He determined these orbits similarly as in the Kepler problem for
the planetary motion, by replacing Newton's gravitational law with Coulomb's
law of electrostatics. Moreover, he added the important nonclassical condition
that the angular momentum has the quantized form
$$|N| = n\hbar, \qquad n = 1, 2, \ldots.$$

Figure 59.5
This led to the orbital energies
$$E_n = \frac{E_1}{n^2}, \qquad n = 1, 2, \ldots \tag{33}$$
with
$$E_1 = -\frac{m_e e^4}{2(4\pi\varepsilon_0\hbar)^2},$$
where $m_e = 9.1 \cdot 10^{-31}$ kg is the mass of the electron and $\varepsilon_0$ is the dielectric
constant. This gives
$$E_1 = -13.6\ \mathrm{eV}.$$
With this energy the electron of the lowest orbit is bound to the nucleus. This
energy is needed in order to ionize the hydrogen atom, i.e., to eject the electron.
Moreover, this gives roughly the order of magnitude for the energies which
occur in chemical reactions per atom. For the radius of the lowest orbit Bohr
obtained
$$r_0 = 4\pi\varepsilon_0\hbar^2/m_e e^2 = 5 \cdot 10^{-11}\ \mathrm{m}.$$
The velocity of the electron in the lowest orbit n = 1 is
$$v_0 = c(e^2/4\pi\varepsilon_0\hbar c) = c/137.$$
This velocity is small compared with the velocity of light c. This explains why
in this problem relativistic effects can be neglected. Moreover, in Section 59.26,
we show that gravitational effects play no role. We will prove all these relations
above in Problem 59.1.
A comparison of (33) and (32) yields the simple relation
$$\hbar\omega_{mn} = E_m - E_n \tag{34}$$
with the correct, i.e., experimentally observed, Rydberg frequency R in (32).
This fundamental relation of atomic spectroscopy can be explained as follows.
Through outer stimulation the electron is caused to jump into a higher orbit;
on falling back to a lower orbit it emits the energy difference $\Delta E$ in the form of a photon of energy
$$\Delta E = \hbar\omega_{mn}.$$
This precisely corresponds to Einstein's photon hypothesis.
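The numerical values quoted for Bohr's model can be reproduced with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: it uses SI values of the natural constants and the standard Bohr expression $E_1 = -m_e e^4/2(4\pi\varepsilon_0\hbar)^2$ for the lowest orbital energy, and the function names are hypothetical. It evaluates the ground state energy in eV, the radius and velocity of the lowest orbit, and the Rydberg frequency $R = -E_1/\hbar$.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant in J s
E_CHARGE = 1.602176634e-19  # elementary charge in A s
M_E = 9.1093837015e-31      # mass of the electron in kg
EPS0 = 8.8541878128e-12     # dielectric constant of the vacuum in A s / (V m)
C = 2.99792458e8            # velocity of light in m/s


def bohr_radius():
    """Radius of the lowest orbit: 4 pi eps0 hbar^2 / (m_e e^2)."""
    return 4.0 * math.pi * EPS0 * HBAR ** 2 / (M_E * E_CHARGE ** 2)


def ground_state_velocity():
    """Velocity in the lowest orbit: e^2 / (4 pi eps0 hbar)."""
    return E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * HBAR)


def orbital_energy(n):
    """Orbital energy E_n = E_1 / n^2 in joules (standard Bohr expression for E_1)."""
    e1 = -M_E * E_CHARGE ** 4 / (2.0 * (4.0 * math.pi * EPS0 * HBAR) ** 2)
    return e1 / n ** 2


def line_frequency(m, n):
    """Angular frequency of the spectral line for m > n from hbar omega = E_m - E_n."""
    return (orbital_energy(m) - orbital_energy(n)) / HBAR


rydberg = -orbital_energy(1) / HBAR
print("E_1 [eV]    :", orbital_energy(1) / E_CHARGE)
print("r_0 [m]     :", bohr_radius())
print("c / v_0     :", C / ground_state_velocity())
print("R [1/s]     :", rydberg)
print("omega_32    :", line_frequency(3, 2))
```

Dividing the energy in joules by the elementary charge converts it to electron volts, which gives the value of about -13.6 eV quoted above.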
The fact, however, that the electron should travel in fixed stable orbits was
not clear at all, because then, according to Maxwell's theory, the electron
should radiate as an accelerated charge and thereby lose energy and consequently
tumble into the nucleus. The solution to this famous problem was
found, independently, by Pauli and Schrödinger in 1926. Pauli thereby used
Heisenberg's matrix mechanics of 1925, which corresponded to a particle
quantization, while Schrödinger used his wave equation.
We now discuss the hydrogen atom by using the Schrödinger equation
$$i\hbar\psi_t = -\frac{\hbar^2}{2m_e}\Delta\psi + U\psi. \tag{35}$$
As potential we choose the potential of the Coulombforce between the nucleus


and the electron, i.e.,
e2
U= - - - .
4ne 0 r

In spherical coordinates we consider the functions


$\psi = e^{-iE_n t/\hbar}\varphi_{nlm}(r, \phi, \vartheta)$ (36)
with

$\varphi_{nlm} = \frac{c_{nl}}{r}\, L^{2l+1}_{n-l-1}\!\left(\frac{2r}{nr_0}\right) Y_l^m(\phi, \vartheta),$

where $c_{nl}$ is the normalization factor, and $E_n$ and $r_0$ have the same meaning as in Bohr's atomic model. Moreover,
we have n = 1, 2, ..., l = 0, 1, ..., n - 1 and m = l, l - 1, ..., -l.
From Section 59.15 one finds through explicit computations that $\psi$ is a
normalized solution of (35) with

$H\psi = E_n\psi,$
$N^2\psi = \hbar^2 l(l + 1)\psi,$
$N_3\psi = \hbar m\psi,$
where the Hamilton operator H corresponds to the right-hand side of (35).
Therefore $\psi$ corresponds to an electron state which has the sharp energy value
$E_n$ for all times t, the sharp value $\hbar^2 l(l + 1)$ for the square of the angular
momentum, and the sharp value $\hbar m$ for the $N_3$-component of the angular momentum.
The numbers n, l, and m are called quantum numbers. They completely
characterize the states (36). Experimentally, these quantum numbers can be
observed in the spectrum if U is perturbed by an electric or magnetic field.
One then obtains energy values $E_{nlm}$, which are perturbations of $E_n$ and
depend on l and m. For the spectrum this leads to a splitting of the spectral
lines, according to (34). In Part V we discuss this by using group-theoretical
methods.
In quantum mechanics the classical electron orbits vanish. The number

$w = \int_G |\psi(x, t)|^2\,dx$

is the probability of finding the electron at time t in the region G. We choose
$\psi$ as in (36). Then this probability is time independent. For n = 1 and
$G = \{x \in \mathbb{R}^3 : r \le |x| \le R\}$ we have

$w = \int_r^R W(\rho)\,d\rho.$
The function $W(r) = (2r r_0^{-3/2})^2 e^{-2r/r_0}$ has a maximum at $r = r_0$. Therefore,
roughly speaking, for n = 1 the probability of presence of the electron is
maximal in the neighborhood of Bohr's classical electron orbit.
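The location of the maximum can be checked numerically. The following sketch assumes the radial density of the ground state in the form $W(r) = 4r^2 r_0^{-3} e^{-2r/r_0}$ and uses units with $r_0 = 1$; the function names and the simple midpoint rule are choices made here for illustration.

```python
import math


def radial_density(r, r0=1.0):
    """Assumed ground state radial density W(r) = 4 r^2 / r0^3 * exp(-2 r / r0)."""
    return 4.0 * r**2 / r0**3 * math.exp(-2.0 * r / r0)


def integrate(f, a, b, steps=20000):
    """Midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


def argmax_on_grid(f, a, b, steps=20000):
    """Grid point in [a, b] at which f is largest."""
    best = max(range(steps + 1), key=lambda k: f(a + k * (b - a) / steps))
    return a + best * (b - a) / steps


if __name__ == "__main__":
    print("total probability:", integrate(radial_density, 0.0, 40.0))
    print("most probable radius / r_0:", argmax_on_grid(radial_density, 0.0, 10.0))
    print("probability of 0.5 r_0 <= |x| <= 1.5 r_0:",
          integrate(radial_density, 0.5, 1.5))
```

The total probability comes out as 1, and the density peaks at $r = r_0$, in agreement with the statement above.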

The energy values $E_n$ are all negative. For every E > 0 one finds solutions
of the Schrödinger equation with
$\psi = e^{-iEt/\hbar}\varphi(x)$

and $H\varphi = E\varphi$ by using Gauss' hypergeometric function (see Landau and
Lifšic (1962, M), §36). These, however, cannot be normalized. They correspond
to free electrons in the field of the hydrogen nucleus. Using the picture of
planetary motion, $E_n < 0$ corresponds to orbital ellipses (planets) and E > 0
corresponds to orbital hyperbolas (comets).

59.17. Functional Analytic Treatment of the Hydrogen Atom

Quantum mechanics has had a significant influence on the development of
linear functional analysis, especially the spectral theory of unbounded, self-adjoint
operators, which was created by John von Neumann in 1929. In
1932 his classical monograph "The Mathematical Foundations of Quantum
Mechanics" appeared. Interestingly, in his spectral theory of bounded, self-adjoint
operators at the turn of the century, Hilbert intuitively used the notion
of the spectrum, without knowing that this mathematical concept was closely
related to the theory of atomic spectra.
The functional analytic treatment, which will now be discussed, allows
us, among other things, to show in which sense the explicit solutions of
the previous section describe all solutions. We use the complex H-space
$X = L_2^{\mathbb{C}}(\mathbb{R}^3)$. It consists of all complex-valued, measurable functions $\varphi: \mathbb{R}^3 \to \mathbb{C}$
with scalar product

$(\varphi|\psi) = \int_{\mathbb{R}^3} \bar{\varphi}\psi\,dx.$

Every $\varphi \in X$ with $(\varphi|\varphi) = 1$ will be interpreted as a stationary state of the
electron. The Hamilton operator
$H = -\frac{\hbar^2}{2m_e}\Delta + U$
of the hydrogen atom is a self-adjoint, half-bounded operator, whose domain
D(H) is the Sobolev space $W_2^2(\mathbb{R}^3)$, which is dense in $L_2^{\mathbb{C}}(\mathbb{R}^3)$. The eigenvalues
of H are precisely all $E_n$. The corresponding eigenvectors $\varphi_{nlm}$ form a complete
orthonormal system in a subspace $X_0$ of X. Every $\varphi \in X_0$ with $(\varphi|\varphi) = 1$ can
therefore be written as
$\varphi = \sum_{nlm} c_{nlm}\varphi_{nlm}.$

This series converges in X, where

$1 = \sum_{nlm} |c_{nlm}|^2.$

We interpret $\varphi$ as the electron state in which the electron will be in $\varphi_{nlm}$ with
probability $|c_{nlm}|^2$. The evolution in time of $\varphi$ is given by
$\psi(t) = e^{-itH/\hbar}\varphi,$ (37)
i.e., $\psi = \varphi$ for t = 0. The exponential function is here to be understood in
the sense of functional calculus. Formal differentiation of (37) yields the
Schrödinger equation
$i\hbar\psi_t = H\psi,$ (38)
which, however, will not be used for our development of the theory. The
reason is the following. While (37) is meaningful for all initial states $\varphi \in X$,
equation (38) only holds for all $\psi$ with $\varphi \in D(H)$. Also the operator H has the
continuous spectrum $[0, \infty[$, which corresponds to the free electron of the
previous section. All proofs of this can be found in Triebel (1972, M).

59.18. Harmonic Oscillator in Quantum Mechanics


Parallel to the classical harmonic oscillator of Section 58.5 we consider here
the one-dimensional Schrödinger equation
$i\hbar\psi_t = -\frac{\hbar^2}{2m}\psi_{\xi\xi} + U\psi$ (39)

with the potential of the harmonic oscillator $U = m\omega^2\xi^2/2$. This equation
follows from (22) if we restrict ourselves to only one space coordinate, i.e., if
we are looking for $\psi = \psi(\xi, t)$. With regard to the probability interpretation
of Section 59.12 one then has to replace all space integrals with integrals over
$\mathbb{R}$.
We consider the function
$\psi = e^{-iE_n t/\hbar} H_n(\xi/\xi_0)/\sqrt{\xi_0},$ (40)
with n = 0, 1, ..., $\xi_0 = \sqrt{\hbar/m\omega}$, and
$E_n = \hbar\omega(n + \tfrac{1}{2}).$ (41)
With results of Section 59.15 one explicitly verifies that these are solutions of
(39). The quantized energy values $E_n$ were computed for the first time by
Heisenberg (1925) using his matrix mechanics (see Example 58.26). These
values correspond to Planck's quantum hypothesis $\Delta E = \hbar\omega$ of 1900.
According to Section 59.12 one obtains the expectation values
$\bar{\xi} = 0$ and $\bar{p} = 0$
for the position and momentum component of (40). The mean error is
$\Delta\xi = \xi_0\sqrt{(1 + 2n)/2},$
and
$\Delta p\,\Delta\xi = \hbar(1 + 2n)/2 \ge \hbar/2.$
This is a special case of Heisenberg's uncertainty relation of the following
section.
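The product of the mean errors can be verified numerically. The sketch below assumes that $H_n$ in (40) denotes the normalized Hermite functions, works in units with $\hbar = m = \omega = 1$ (so that $\xi_0 = 1$), and uses a midpoint rule together with a central difference quotient; all names are hypothetical.

```python
import math


def hermite_function(n, x):
    """Normalized Hermite function h_n(x), orthonormal in L2(R), via the
    three-term recurrence h_{k+1} = sqrt(2/(k+1)) x h_k - sqrt(k/(k+1)) h_{k-1}."""
    h_prev = 0.0
    h = math.pi ** -0.25 * math.exp(-x * x / 2.0)
    for k in range(n):
        h_prev, h = h, (math.sqrt(2.0 / (k + 1)) * x * h
                        - math.sqrt(k / (k + 1)) * h_prev)
    return h


def uncertainty_product(n, half_width=12.0, steps=6000, eps=1e-5):
    """Delta p * Delta xi for the nth state in units hbar = m = omega = 1.

    Both expectation values vanish, so the mean errors are the square roots
    of the integrals of x^2 h_n^2 and of (h_n')^2."""
    dx = 2.0 * half_width / steps
    x2 = 0.0
    p2 = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * dx
        f = hermite_function(n, x)
        df = (hermite_function(n, x + eps) - hermite_function(n, x - eps)) / (2 * eps)
        x2 += x * x * f * f * dx
        p2 += df * df * dx
    return math.sqrt(x2 * p2)


if __name__ == "__main__":
    for n in range(4):
        print(n, uncertainty_product(n), (1 + 2 * n) / 2)
```

For n = 0, 1, 2, 3 the computed products agree with $(1 + 2n)/2$, and the lower bound 1/2 is attained only in the ground state n = 0.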
The functional analytic treatment is similar to the previous section. We
choose the complex H-space $X = L_2^{\mathbb{C}}(\mathbb{R})$. The Hamilton operator
$H = -\frac{\hbar^2}{2m}\frac{d^2}{d\xi^2} + U$
with domain $D(H) = C_0^\infty(\mathbb{R})$ is essentially self-adjoint, i.e., it has a self-adjoint
closure $\bar{H}$. This operator has precisely all $E_n$ as eigenvalues. The functions $\psi$
in (40) with t = 0 are the corresponding eigenvectors in X and form a complete
orthonormal system. The proof can be found in Triebel (1972, M).

59.19. Heisenberg's Uncertainty Relation


We begin with the proof of a general functional analytic result and start from
the commutation relation
$AB\psi - BA\psi = iC\psi$ for all $\psi \in D$. (42)
Our goal is
$\Delta A\,\Delta B \ge |\bar{C}|/2.$ (43)
Our assumptions are:
(H1) X is a complex H-space with a dense linear subspace D.
(H2) The operators A, B, C: D → D are linear and symmetric, i.e., $(A\psi|\psi) = (\psi|A\psi)$ for all $\psi \in D$, etc.
We assign the following numbers to each $\psi \in D$:
$\bar{A} = (\psi|A\psi), \quad \Delta A = \|(A - \bar{A}I)\psi\|,$
and analogously for B, C.

Theorem 59.A (Abstract Uncertainty Relation). If (H1), (H2) and (42) hold,
then (43) is valid for every $\psi \in D$ with $(\psi|\psi) = 1$.

PROOF. For $a, b \in \mathbb{R}$ and $\psi \in D$ one has

$i\bar{C} = (\psi|(AB - BA)\psi)$
$= (\psi|(A - aI)(B - bI)\psi) - ((A - aI)(B - bI)\psi|\psi)$
$= 2i\,\mathrm{Im}(\psi|(A - aI)(B - bI)\psi).$

The Schwarz inequality yields

$|\bar{C}| \le 2\|(A - aI)\psi\|\,\|(B - bI)\psi\|.$
For $a = \bar{A}$ and $b = \bar{B}$ we obtain (43). □

EXAMPLE 59.7 (Uncertainty of Momentum and Position). We choose

$X = L_2^{\mathbb{C}}(\mathbb{R}^3)$ and $D = C_0^\infty(\mathbb{R}^3)$.
By $\psi \in X$ we mean $\psi = \psi(x, t)$ for fixed t. Moreover, let A be the momentum
operator
$p_j = -i\hbar\,\partial/\partial\xi_j,$
and B the position operator $\xi_j$. This implies the commutation relation (42)
with $C = -\hbar I$, i.e., in short
$p_j\xi_j - \xi_j p_j = -i\hbar I.$ (44)
From Theorem 59.A we obtain the famous Heisenberg uncertainty relation
from 1927:
$\Delta p_j\,\Delta\xi_j \ge \hbar/2.$ (45)
Since D is dense in $W_2^2(\mathbb{R}^3)$, relation (45) is valid also for all normalized
$\psi \in W_2^2(\mathbb{R}^3)$ by passing to limits.
The fundamental relation (45) means that for normalized quantum mechan-
ical states, position component and momentum component cannot exactly be
measured at the same time. There will always be a mean error which satisfies
(45).
According to Section 59.14 we assign the energy $E = \hbar\omega$ and the sharp
momentum $p = \hbar k$ to the state
$\psi = \psi_0 e^{i(kx - \omega t)}.$
The impossibility of the normalization $(\psi|\psi) = 1$ can now be interpreted as
follows. Because of the sharp momentum it is not possible to localize the free
particle. Because of $(\psi|\psi) = \infty$ it is not possible to define a probability for the
particle to be in the region G.
EXAMPLE 59.8 (Uncertainty of the Angular Momentum Components). We
choose X and D as in the previous example. Let $N_j$ denote the jth component
of the angular momentum operator $N = x \times p$ with $p = -i\hbar\,\partial/\partial x$. Then the
commutation rule (42) holds with $A = N_1$, $B = N_2$, $C = \hbar N_3$, in short
$N_1N_2 - N_2N_1 = i\hbar N_3.$ (46)
This implies
$\Delta N_1\,\Delta N_2 \ge \hbar|\bar{N}_3|/2$
for all normalized $\psi \in W_2^2(\mathbb{R}^3)$. In 1925, during the formulation of his matrix
mechanics, Heisenberg noticed that the commutation rules form the essential
parts of quantum mechanics. Mathematicians had been aware of such commutation
rules for a long time in connection with the theory of Lie algebras (see
Section 74.26). In fact, behind (46) hides the Lie algebra of the rotation group
SO(3), which is responsible for the angular momentum, and also the Lie
algebra of the group SU(2), which is responsible for spin and isospin of the
elementary particles. As we shall see in Part V, one can therefore obtain
important results in quantum theory by using groups and their representation
theory. For example, the quark model for elementary particles follows from
the representation theory of the group SU(3).
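A finite-dimensional illustration of Theorem 59.A may be helpful. The sketch below does not use the differential operators $N_j$; instead it assumes the standard spin-1/2 matrices $S_j = \hbar\sigma_j/2$ (Pauli matrices), which satisfy the same commutation rule as (46) on $X = D = \mathbb{C}^2$. Units with $\hbar = 1$ and all helper names are choices made here.

```python
import math

HBAR = 1.0  # units with hbar = 1 (assumption)

# Spin-1/2 matrices S_j = (hbar/2) * sigma_j
S1 = [[0, HBAR / 2], [HBAR / 2, 0]]
S2 = [[0, -1j * HBAR / 2], [1j * HBAR / 2, 0]]
S3 = [[HBAR / 2, 0], [0, -HBAR / 2]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def apply(a, psi):
    return [a[0][0] * psi[0] + a[0][1] * psi[1],
            a[1][0] * psi[0] + a[1][1] * psi[1]]


def inner(phi, psi):
    """Scalar product (phi|psi), conjugate-linear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(phi, psi))


def normalize(psi):
    norm = math.sqrt(inner(psi, psi).real)
    return [x / norm for x in psi]


def mean(a, psi):
    """Expectation value (psi|A psi) of a symmetric matrix A."""
    return inner(psi, apply(a, psi)).real


def spread(a, psi):
    """Mean error ||(A - mean(A) I) psi||."""
    m = mean(a, psi)
    v = apply(a, psi)
    d = [v[0] - m * psi[0], v[1] - m * psi[1]]
    return math.sqrt(inner(d, d).real)


if __name__ == "__main__":
    psi = normalize([1, 0.5 + 0.5j])
    print("left-hand side :", spread(S1, psi) * spread(S2, psi))
    print("right-hand side:", HBAR * abs(mean(S3, psi)) / 2)
```

For every normalized vector the product of the two mean errors is at least $\hbar|\bar{S}_3|/2$, with equality for the eigenvectors of $S_3$.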

59.20. Pauli Principle, Spin and Statistics


We assign a spin quantum number S to all elementary particles which may
assume the values S = n/2 with n = 0, 1, .... Furthermore, we distinguish
between the following spin positions
$S_z = S, S - 1, \ldots, -S$
for a fixed elementary particle. Physically, this means that the particle has a
spin. Specifying a fixed $\xi_3$-axis, it may assume states for which the spin vector
has the sharp value $\hbar S_z$ for $S_3$ and the sharp value $\hbar^2 S(S + 1)$ for $S^2$.
In 1925, the electron spin was hypothetically introduced by Goudsmit and
Uhlenbeck, in order to explain the fine structure in the splitting of the spectral
lines caused by a magnetic field. Experimentally, this then was confirmed
by Gerlach and Stern in 1927. They sent hydrogen atoms through an inhomogeneous
magnetic field. If the electron in this experiment is in the ground
state of the hydrogen atom, then one has n = 1, l = 0, and m = 0. It follows
that $N_3 = 0$. According to classical views of electrodynamics, such an atom
without electron angular momentum cannot have a magnetic dipole moment.
The ray therefore should not be affected by the magnetic field. But, actually,
Gerlach and Stern observed a splitting into two particle rays. This corresponds
to an electron spin of $S = \tfrac{1}{2}$ with both spin positions $S_z = \pm\tfrac{1}{2}$. In Part
V we will show that the electron spin is a typical effect in the theory of relativity
and automatically follows from the relativistic equation of the electron, which
was formulated by Dirac in 1928. Moreover, this equation also implies the
existence of the positron, the antiparticle of the electron.
For photons, one obtains S = 1 with the two spin positions $S_z = \pm 1$. The
possible value $S_z = 0$ does not occur. This follows from the fact that light
corresponds to transversal waves. In Section 68.4 we will derive the correct
radiation law by making essential use of the fact that photons can only have
two spin positions.
Particles with integer (or half-integer) spin are called bosons (or
fermions). The electron is a fermion, the photon is a boson. The following
principle is a basic natural law.

Pauli Principle 59.9. In a system of fermions two particles can never be in the
same quantum state.

In Chapter 68 we show how it follows from this principle that bosons (resp.
fermions) satisfy the Bose statistics (resp. Fermi statistics). As we shall see, this
yields, e.g., the radiation law, the critical mass for white dwarfs, as well as
results about the structure of the cosmos following the Big Bang. In the context
of axiomatic quantum field theory, this principle can also be mathematically
deduced from suitable axioms. This may be found in Streater and Wightman
(1964, M).

59.21. Quantization of the Phase Space and Statistics


In Chapter 68 we consider quantum statistics. There we use, besides the Pauli
principle, another very successful general principle. In order to explain this,
we consider in a fixed region G of $\mathbb{R}^3$, with volume V(G), elementary particles
which are of the same kind. Let p denote the vector of the particle momentum.
Moreover, let $\Omega = G \times G_p$ be a region in the six-dimensional phase space.
A point $(x, p) \in \Omega$ describes the state of a particle with position $x \in G$ and
momentum $p \in G_p$.

Principle of Quantization of the Phase Space 59.10. The maximal number of
particles which can be in $\Omega$ is
$N = gV(\Omega)/h^3.$
Thereby g is the number of possible spin positions of a particle described by the
spin quantum number $S_z$.

For example, g = 2 for electrons and photons. One can also write

$N = \frac{g}{h^3}\int_\Omega dx\,dp = \frac{gV(G)}{h^3}\int_{G_p} dp.$

This principle can be extended to arbitrary 2n-dimensional phase spaces.
Intuitively, it states that in a cell of volume $h^n$ there can never be two particles
at the same time, having the same quantum state.
We shall use Principle 59.10 in the form of a postulate. We want to motivate
this principle. As in Example 58.6, we consider the classical harmonic oscillator
with motion
$q = C\sin(\omega t + \alpha)$
and momentum
$p = m\dot{q} = m\omega C\cos(\omega t + \alpha).$
The energy of this motion is
$E = \frac{p^2}{2m} + \frac{m}{2}\omega^2q^2 = \frac{m}{2}C^2\omega^2.$

In the (q, p)-phase space, this motion corresponds to an ellipse which is the
boundary of a region with measure

$\int dq\,dp = \pi m\omega C^2 = 2\pi E/\omega$
(Fig. 59.6(a)). All motions with an energy $E \in [E_0, E_0 + \Delta E]$ cover the surface

$\int dq\,dp = 2\pi\,\Delta E/\omega$

(Fig. 59.6(b)). If according to the principle above we set this surface equal to
$h = 2\pi\hbar$, then we obtain
$\Delta E = \hbar\omega.$ (47)
This is precisely Planck's condition for the quantization of the energy of
the harmonic oscillator. We therefore may think of Principle 59.10 as a
generalization of (47).

Figure 59.6 (panels (a) and (b))
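The area computation behind (47) and the counting rule of Principle 59.10 can be sketched in a few lines. The function names are hypothetical, the value of Planck's constant is supplied here, and it is assumed that the cell size is Planck's constant $h = 2\pi\hbar$.

```python
import math

H_PLANCK = 6.62607015e-34            # Planck's constant h, J s (assumed value)
HBAR = H_PLANCK / (2 * math.pi)      # reduced constant


def oscillator_energy(mass, omega, amplitude):
    """E = (m/2) C^2 omega^2 for the motion q = C sin(omega t + alpha)."""
    return 0.5 * mass * amplitude**2 * omega**2


def ellipse_area(mass, omega, amplitude):
    """Phase space area enclosed by the orbit: semi-axes C and m omega C."""
    return math.pi * mass * omega * amplitude**2


def energy_step(omega, cell=H_PLANCK):
    """Energy width dE of a ring of area `cell`: 2 pi dE / omega = cell."""
    return cell * omega / (2 * math.pi)


def max_particles(g, position_volume, momentum_volume):
    """N = g V(G) V(G_p) / h^3 for a product region in six-dimensional phase space."""
    return g * position_volume * momentum_volume / H_PLANCK**3


if __name__ == "__main__":
    m, w, c = 2.0, 3.0, 0.5
    print(ellipse_area(m, w, c), 2 * math.pi * oscillator_energy(m, w, c) / w)
    print(energy_step(w), HBAR * w)
```

Both printed pairs coincide: the enclosed area equals $2\pi E/\omega$, and a ring of area h corresponds to the energy step $\hbar\omega$.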

59.22. Pauli Principle and the Periodic System of the Elements

In 1869, great progress was made in chemistry, when, independently, Mendelejev
and Meyer were able to systematically order the chemical elements
according to phenomenological criteria. Table 59.1 shows the beginning of
the periodic system. In horizontal direction the atomic number Z increases,
while in vertical direction elements one below the other behave similarly.

Table 59.1
1 H | | | | | | | 2 He
1s | | | | | | | 1s^2 = K
3 Li | 4 Be | 5 B | 6 C | 7 N | 8 O | 9 F | 10 Ne
K | K | K | K | K | K | K | K
2s | 2s^2 | 2s^2, 2p | 2s^2, 2p^2 | 2s^2, 2p^3 | 2s^2, 2p^4 | 2s^2, 2p^5 | 2s^2, 2p^6 = L
11 Na | 12 Mg | 13 Al | 14 Si | 15 P | 16 S | 17 Cl | 18 Ar
K | K | K | K | K | K | K | K
L | L | L | L | L | L | L | L
3s | 3s^2 | 3s^2, 3p | 3s^2, 3p^2 | 3s^2, 3p^3 | 3s^2, 3p^4 | 3s^2, 3p^5 | 3s^2, 3p^6 = M

In 1925, Pauli formulated his Principle 59.9 in order to explain the shell structure
of the atoms, in the context of Bohr's atomic model and its further
development by Sommerfeld. This was even before the discovery of quantum
mechanics. Thereby one roughly obtains the following picture.
(i) The atomic number Z is equal to the number of electrons and equal to
the number of protons in the nucleus.
(ii) The number of neutrons in the nucleus may vary for fixed Z. Thereby
isotopes occur.
(iii) An electron state is characterized by four quantum numbers n = 1, 2, ...
(orbit), l = 0, 1, ..., n - 1 (angular momentum), m = l, l - 1, ..., -l and
$S_z = \pm\tfrac{1}{2}$ (spin). Two electrons cannot coincide in all four quantum numbers
(Pauli in 1925 did not use the spin).
(iv) For energetic reasons the orbits with n = 1, 2, ... are filled successively.
(v) The chemical behavior is mainly determined by the outer electrons. The
similarity of chemical elements is a consequence of the same number of
outer electrons. Table 59.1 shows the number of electrons in different
orbits. In horizontal direction, new electrons are added continuously.
Thereby s and p stand for l = 0 and l = 1, respectively. Furthermore, $ns^k$
means that the number of s-electrons in the nth orbit is equal to k. The
maximal number of s-electrons in one orbit is equal to 2 because of
l = 0, m = 0, $S_z = \pm\tfrac{1}{2}$.
The maximal number of p-electrons in one orbit is equal to six because of
l = 1, m = -1, 0, 1 and $S_z = \pm\tfrac{1}{2}$.
In the vertical lines of Table 59.1 we have the same number of outer s-
and p-electrons. This results in a similar chemical behavior of the corresponding
elements. The inert gases 2 He, 10 Ne, and 18 Ar have only
closed shells denoted by K, L, M, respectively. This is the reason for their
chemical inactivity.
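The counting argument in (iii) and (v) can be reproduced by enumerating the admissible quadruples of quantum numbers. This is a minimal sketch with hypothetical function names; it only counts states and says nothing about the energetic order in which they are filled.

```python
from fractions import Fraction


def electron_states(n):
    """All quadruples (n, l, m, s_z) with l = 0..n-1, m = -l..l, s_z = +-1/2."""
    half = Fraction(1, 2)
    return [(n, l, m, s)
            for l in range(n)
            for m in range(-l, l + 1)
            for s in (half, -half)]


def subshell_capacity(l):
    """Maximal number of electrons with fixed n and l: (2l + 1) values of m, 2 spins."""
    return 2 * (2 * l + 1)


def shell_capacity(n):
    """Maximal number of electrons in the nth orbit under the Pauli principle."""
    return len(electron_states(n))


if __name__ == "__main__":
    print("s-electrons:", subshell_capacity(0))   # l = 0
    print("p-electrons:", subshell_capacity(1))   # l = 1
    for n in (1, 2, 3):
        print("n =", n, "->", shell_capacity(n), "states")
```

The enumeration gives 2 s-electrons and 6 p-electrons per orbit, hence 2 states for n = 1 (the K shell) and 8 for n = 2 (the L shell). For n = 3 it yields 18 states, since it also counts l = 2, whereas Table 59.1 lists only the 3s- and 3p-electrons.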
Starting with potassium with Z = 19, the irregularities in the filling of the orbits
begin. This follows from the fact that for large Z the electron interaction
becomes stronger and the energetic situations more complicated.
For a mathematical treatment in the context of quantum mechanics one
has to consider the Schrödinger equation for many-particle systems (see Part
V). The Pauli principle then corresponds to the fact that only such wave
functions are allowed which are skew-symmetric with respect to the space
variables and the spin variables. The solution of many-electron equations,
taking the electron interaction into account, is only approximately possible
by using Ritz' method for eigenvalue problems (see Chapter 22). The situation
becomes even more complicated for molecular calculations. Today, with the
fastest computers, one can only approximately compute relatively small molecules
by using the Schrödinger equation. The numerical treatment of large
molecules is presently impossible. The main problem in quantum chemistry
is to find more effective methods and to build faster computers.

59.23. Classical Limiting Case of Quantum Mechanics and the WKB Method to Compute Quasi-Classical Approximations

As in Section 59.10 we are looking for the solution of the Schrödinger equation
(22) in the form
$\psi = e^{iS/\hbar}.$
From (22) we obtain the eikonal equation
$S_t + \frac{S_x^2}{2m} + U - \frac{i\hbar}{2m}\Delta S = 0.$ (48)
For $\hbar = 0$ this is the Hamilton-Jacobi equation, which corresponds to the
classical energy
$E = \frac{p^2}{2m} + U.$
The method of Wentzel, Kramers, and Brillouin (WKB method) consists in
writing S as an expansion of the form
$S = S_0 + \hbar S_1 + \hbar^2 S_2 + \cdots$
and determining the $S_n$ successively from (48). This is a very effective method
of determining quasi-classical approximate solutions of the Schrödinger equation.
Note that $\hbar = 1.054 \cdot 10^{-34}$ J s is very small. Actually, one has to use
dimensionless quantities in order to determine the smallness of the perturbations
in concrete problems. The 0th approximation $S_0$ is the solution of (48)
with $\hbar = 0$ (classical Hamilton-Jacobi equation).
This method illustrates the fact that quantum mechanics becomes classical
mechanics as $\hbar \to 0$.
As we have already discussed in Problem 40.11, Maslov succeeded (a few
years ago) in developing a global form of this method by finding a procedure
to continue the asymptotic expansion beyond singularities (caustics).

59.24. Energy-Time Uncertainty Relation and Elementary Particles

The main difference between classical mechanics and quantum mechanics is
that the position component $\xi_1$ and the momentum component $p_1$ cannot be
precisely determined at the same time. According to Section 59.19 we have
the following estimate for the mean errors $\Delta\xi_1$ and $\Delta p_1$:
$\Delta\xi_1\,\Delta p_1 \ge \hbar/2.$ (49)
If the energy of the quantum system is not sharply determined, then, analogously
to (49), we postulate the so-called energy-time uncertainty relation
$\Delta E\,\Delta t \ge \hbar/2.$ (50)
This inequality states that during the time $\Delta t$ the energy E can only be
determined up to a mean error of $\Delta E$, whereby (50) holds. As a motivation,
physicists point to the fact that in the special theory of relativity, position x
and momentum p are replaced by the contravariant four vectors (x, ct) and
(p, E/c) (see Section 75.11). Hence not only x and p, but also ct and E/c
correspond to each other. In the sense of this correspondence, (50) follows
from (49).
If one considers concrete physical processes, then one often finds, in place
of the inequality (49), a relation
$\Delta\xi_1\,\Delta p_1 \sim \hbar,$ (51)
i.e., the product on the left-hand side has an order of magnitude of $\hbar$. Analogously,
one has
$\Delta E\,\Delta t \sim \hbar.$ (52)
The following two examples should illustrate this more precisely.

EXAMPLE 59.11 (Life-Time of Particles Which are Generated in Accelerators).
Reactions in particle accelerators are described by writing the effective cross
sections, which are determined experimentally, as a function of the particle
energy E. Thereby one often observes resonances such as in Figure 59.7.
One refers to them as excited quantum states (one particle or several bound
particles) with energy $2\Delta E$, mass $m = 2\Delta E/c^2$ and a life-time
$\Delta t = \hbar/2\Delta E.$ (53)
This formula is motivated by our observations about quasi-stationary processes
(damped oscillations) of Section 59.5. There we saw in (10) that the
intensity of a damped oscillation is mainly restricted to frequencies
$\omega \in [\omega_0 - \Delta\omega, \omega_0 + \Delta\omega]$ with
$2\Delta\omega\,\Delta t = 1.$
The energy formula (53) is obtained from the relation
$E = \hbar\omega$
for the de Broglie matter waves.

Figure 59.7

EXAMPLE 59.12 (Field Quanta and π-Mesons). We want to explain the
physical ideas behind Yukawa's meson theory for the nuclear forces. Our
starting point is the relativistic equation
$E^2 = m_0^2c^4 + c^2p^2$ (54)
between energy E, rest mass $m_0$, and momentum vector p of a free particle.
Here c is the velocity of light. In order to obtain a corresponding quantum
mechanical wave equation, we use the same quantization procedure as in
Section 59.11, i.e., we introduce the substitution
$E \Rightarrow i\hbar\frac{\partial}{\partial t}, \quad p \Rightarrow -i\hbar\frac{\partial}{\partial x},$
and obtain from (54) the so-called Klein-Gordon equation
$\left(\frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \Delta + \mu^2\right)\psi = 0$ (55)

with
$\mu = m_0c/\hbar.$
This equation has the radially symmetric, stationary solution
$\psi(x) = Q\gamma\frac{e^{-\mu r}}{r}$ (56)
with r = |x|. For a suitable choice of the constants Q and $\gamma$ this, actually, is a
solution of the equation
$(-\Delta + \mu^2)\psi = \delta,$
in the sense of distributions, where $\delta$ is Dirac's delta distribution, which is
concentrated at the origin (see A2 (62)).
We now want to develop the theory of nuclear forces analogously to
classical electrodynamics. In Maxwell's theory, which will be discussed in Part
V, equation (55) with $\mu = 0$ is the differential equation for the electrical
potential $\psi$. The mechanical potential U of the electrostatic Coulomb force,
which is applied from one charge Q to another charge $Q_1$, is equal to

$U = Q_1Q\gamma\frac{e^{-\mu r}}{r}$ (57)

with $\mu = 0$ and $\gamma = 1/4\pi\varepsilon_0$, according to Section 58.8b. An atomic nucleus
consists of nucleons, i.e., uncharged neutrons and protons with positive elementary
charge. One assumes that the atomic nucleus is kept together by
strong nuclear forces which exist between the nucleons. Scattering experiments
show that this nuclear force has an extremely short radius of action
which is approximately the nuclear radius, i.e.,
$R = 1.4$ fm $= 1.4 \cdot 10^{-15}$ m. (58)

This nuclear radius was determined by Ernest Rutherford in 1911. He used
scattering experiments with α-particles (helium nuclei). In order to describe
the nuclear force in analogy to the electrostatic force, we use (57) with free
parameters $Q_1$, Q, and $\gamma$. The radius of action of the force, corresponding to
the potential U, is defined as $R = 1/\mu$. This is motivated by the fact that $e^{-\mu r}$
has already greatly decreased for $r > 1/\mu$. Together with (58) we obtain the
fundamental relation
$R = 1/\mu = \hbar/m_0c = 1.4 \cdot 10^{-15}$ m. (59)

Formula (57) represents the so-called Yukawa potential. In the case of the
electromagnetic force we have $\mu = 0$. Physically, this corresponds to the fact
that, as in the case of the gravitational force, the electrostatic Coulomb force
has an infinite radius of action R.
So far, we have only considered the forces themselves, without looking for
the mechanism through which these forces are transmitted. This will be done
right now in the context of the idea of exchange forces. In classical electrodynamics
one assumes that electromagnetic interactions are transmitted
through the electromagnetic field. In quantum field theory, on the other hand,
one starts from the assumption that this transmission is done by field quanta
which are different for every interaction. The field quantum for the electromagnetic
interaction is the photon. According to Einstein's photon hypothesis
of Section 59.3, we obtain for the photon E = c|p|. A comparison with (54)
yields $m_0 = 0$. This coincides with our observation above. There we have
$\mu = 0$, and hence $m_0 = 0$. We now pose the question: Which field quantum
transmits the nuclear force? To answer this question we interpret the mass $m_0$
in (55) as the rest mass of the field quantum. From (59) we obtain the rest
energy
$E_0 = m_0c^2 = \hbar c/R = 2 \cdot 10^{-11}$ J $= 130$ MeV.
Therefore $m_0$ corresponds to about 260 rest masses of the electron. The
life-time of the field quantum can be computed from (53) as
$\Delta t_0 = \hbar/E_0 = 0.5 \cdot 10^{-23}$ s. (60)
In general, the life-time of an unstable particle gets smaller, the greater the
energy gets. In 1935, Yukawa formulated the hypothesis that the nuclear force
is transmitted through mesons with the rest mass above. Actually, in 1947,
such particles were discovered in cosmic rays and then, in 1948, artificially
generated in laboratories at Berkeley. Until 1950, the mesons $\pi^+$, $\pi^0$, and $\pi^-$
with corresponding charges |e|, 0, and -|e| were discovered in accordance with
Yukawa's prediction. In 1949, Yukawa received the Nobel prize for his meson
theory.
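The estimates for the field quantum of the nuclear force can be recomputed from (59) and (60). The sketch below uses rounded SI constants supplied here and a hypothetical function name; the text rounds its results to about one significant digit, so only the orders of magnitude should be compared.

```python
# Rounded SI constants (assumed values, for illustration only)
HBAR = 1.054571817e-34      # reduced Planck constant, J s
C = 2.99792458e8            # velocity of light, m/s
EV = 1.602176634e-19        # 1 eV in J
M_E = 9.1093837015e-31      # electron rest mass, kg


def field_quantum(radius_of_action):
    """Rest mass, rest energy and life-time of a field quantum whose force has
    the radius of action R = hbar / (m_0 c), see (59) and (60)."""
    m0 = HBAR / (radius_of_action * C)      # rest mass in kg
    e0 = m0 * C**2                          # rest energy in J, equals hbar c / R
    return {
        "rest_mass_kg": m0,
        "rest_energy_J": e0,
        "rest_energy_MeV": e0 / EV / 1e6,
        "electron_masses": m0 / M_E,
        "lifetime_s": HBAR / e0,            # equals R / c
    }


if __name__ == "__main__":
    for key, value in field_quantum(1.4e-15).items():
        print(key, "=", value)
```

For R = 1.4 fm this gives a rest energy of roughly 140 MeV, a rest mass of roughly 275 electron masses, and a life-time of roughly $0.5 \cdot 10^{-23}$ s, consistent with the rounded values above.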
We shall now look at a picture which illustrates the mechanism for nuclear
forces. This also will show why there must be π-mesons with three different
charges. It is a consequence of the different charges of the neutron and the
proton. We consider the interaction between two protons in the atomic
nucleus, as schematically pictured in Figure 59.8(a). The proton p emits a
$\pi^+$-meson at A. Since $\pi^+$ carries a positive elementary charge, the proton p
changes into an electrically neutral neutron n. At B, the $\pi^+$-meson is captured
by another neutron n, which thereby changes into a proton. Analogously, (b)
and (c) in Figure 59.8 can be understood. For all these processes, the charge
conservation is strictly satisfied, but not so the energy conservation. One
therefore calls these π-mesons virtual particles. It is important then that
because of the energy-time uncertainty relation the classical energy conservation
is not required, but instead we give the following interpretation of Figure
59.8(a). The energy of the proton p at A cannot be determined as a sharp value,
but, according to (50), may vary in the time interval $\Delta t$ about the value $\Delta E$ with
$\Delta E\,\Delta t \ge \hbar/2.$ (61)
An analogous result holds for the neutron n at B. This is why during a time
interval $\Delta t$ a $\pi^+$-meson may fly from A to B with total energy $\Delta E$ and
momentum vector $p_0$. According to the relativistic energy formula we have
$(\Delta E)^2 = m_0^2c^4 + c^2p_0^2.$ (62)

Figure 59.8 (panels (a), (b), and (c))

In this sense the nuclear force is an exchange force which is caused by the
exchange of π-mesons.
We want to test the consistency of this picture. According to the theory of
relativity of Chapter 75 a physical effect can travel at most with the velocity
of light c. The π-meson, which flies through the nucleus, can do this with at
most the velocity c, i.e., $R/\Delta t \le c$. From (59) this means
$\Delta t \ge R/c = \hbar/m_0c^2$

and (62) implies $\Delta E \ge m_0c^2$, hence

$\Delta E\,\Delta t \ge \hbar.$
This is a sharpening of (61).
According to modern views, protons and neutrons consist of quarks, and
the nuclear force is a consequence of the quark interaction, which is caused
by the exchange of eight gluons.
The diagrams in Figure 59.8 are special cases of the so-called Feynman
diagrams. We shall show in Part V that, in the context of quantum field theory,
one obtains a mathematical formalism for the computation of quantum
processes. Thereby the Feynman diagrams are graphical representations to
give an intuitive description of purely analytical expressions, which appear in
connection with perturbation calculus. For higher-order perturbations much
more complicated diagrams than those in Figure 59.8 appear. Roughly speaking,
several particles are involved.
In comparing physical and mathematical thinking, it is interesting that
Feynman based his development of quantum electrodynamics, during the late
1940s, on some intuitive physical ideas, which he described by his diagrams.
Thereby, he used these diagrams in a virtuoso fashion in order to compute
physical effects. Dyson wrote in his book "Disturbing the Universe" that he
had great difficulty in following Feynman's intuition and in translating it back
into a mathematical language which he could understand.
Physical thinking does not seem to follow the lines of a mathematical
formalism, but rather proceeds in images, which are closely related to real
phenomena. Only thereafter, a mathematical formulation is used. It is known
that during the last century, Faraday used the image of electric and magnetic
fields, long before Maxwell gave a precise mathematical description.

59.25. The Four Fundamental Interactions


There exist four fundamental interactions in nature: strong, electromagnetic,
weak, and gravitational. In Tables 59.2 and 59.3 we give a survey.

Table 59.2
Interaction | Relative strength | Radius of action | Example | Particle onto which the force is exerted | Field quanta
Strong | 10 | $10^{-15}$ m | proton as bound state of three quarks | quarks (hadrons) | 8 gluons
Electromagnetic | $10^{-2}$ | ∞ | atomic force, chemical binding | electrically charged particles | photon
Weak | $10^{-5}$ | ≪ $10^{-16}$ m | β-decay of the neutron (radioactive decay) | hadrons, leptons, e.g., electron, neutrino | vector bosons $W^\pm$, $Z^0$
Gravitational | $10^{-39}$ | ∞ | planetary motion | particles with mass | graviton

The most significant achievement of modern elementary particle physics is
the unified theory for the weak and the electromagnetic interaction, the theory
of the electroweak interaction. For this, Glashow, Salam, and Weinberg received
the Nobel prize in 1979. This theory can be viewed as a continuation
of Maxwell's theory, in which the electric and magnetic interactions are combined
into an electromagnetic interaction. The mathematical apparatus is also
analogous to Maxwell's theory. Maxwell's theory can be viewed as a U(1)-gauge
field theory, whereas the electroweak theory is a gauge field theory with the
group SU(2) x U(1) (see Part V). The field quanta of the electroweak interaction
are three heavy vector bosons $W^+$, $W^-$, $Z^0$ with approximately 100
proton masses and the photon. These predicted heavy particles were detected
in 1983 at CERN (Geneva) with the 270/270 GeV proton-antiproton collider.
According to quantum chromodynamics, protons, neutrons, and π-mesons
consist of quarks, which may carry the three so-called color charges which
are called red, green, and blue. Similarly to the photon, which describes the
interaction between electrically charged particles, one assumes here the existence
of eight gluons, which transmit the interaction between quarks with
color charge. One expects that this causes the strong interaction. Quantum
chromodynamics is an SU(3)-gauge field theory. Since the Lie algebra of SU(3)
is eight-dimensional, this formalism yields the existence of eight gluons.
In order to unify the strong interaction with the electroweak interaction,
gauge field theories with larger groups G have been proposed, e.g., with
G = SU(5) or G = SO(lO). These theories (grand unification theories) imply
a proton decay and the existence of magnetic monopoles. For the proton
decay one has a half-life period of approximately 1030 years, i.e., during this

Table 59.3

Field quanta   Charge      Mass/Proton mass   Spin   Mean life-time
Gluon          0           0                  1      $\infty$?
Photon         0           0                  1      $\infty$
$W^\pm$        $\pm|e|$    80                 1      $< 10^{-23}$ s
$Z^0$          0           90                 1      $< 10^{-23}$ s

time half of the protons have decayed. At present, serious efforts are being made
to find an experimental proof. The field quanta of the unified theory, the
so-called leptoquarks, have approximately $10^{14}$ proton masses. The
magnetic monopoles should have $10^{16}$ proton masses. This is already of the order
of magnitude of the mass of bacteria. Only during the first $10^{-35}$ s after the Big Bang
was the available energy large enough to create such heavy particles. Also,
one expects that these magnetic monopoles catalyze the proton decay.
The unified theories assume that, for large energies of greater than $10^{15}$ GeV
per particle, only one interaction exists. Such high energies were present only
during an extremely short time after the Big Bang. Then, during the cooling-off
process of the cosmos, the different interactions crystallized analogously to
the crystallization of substances from a fluid. This crystallization goes along
with a loss of symmetry. Therefore the present interactions have different
symmetry groups (symmetry breaking) (see Section 76.7 and Part V).

59.26. Strength of the Interactions


First we compare the gravitational and electromagnetic interaction. Using
classical methods, we compute the gravitational force and the electromagnetic
force between two protons. From (63a) and (63b) below we will obtain that
the ratio between these two forces is approximately equal to $10^{-39} : 10^{-2}$. We
use the formula
$$F = \alpha \frac{\hbar c}{r^2}. \tag{63}$$
The value of the gravitational force between two protons of mass
$m = 1.7 \cdot 10^{-27}$ kg at a distance $r$ is equal to (63) with the dimensionless quantity
$$\alpha = Gm^2/\hbar c = 5 \cdot 10^{-39}. \tag{63a}$$
According to Coulomb's law of Section 58.8, the value of the electrostatic
repelling force between two protons with positive elementary charge
$|e| = 1.6 \cdot 10^{-19}$ A s at a distance $r$ is equal to (63) with
$$\alpha = e^2/4\pi\varepsilon_0\hbar c = 1/137. \tag{63b}$$
This is the so-called Sommerfeld fine structure constant. The effects of quantum
electrodynamics are obtained from perturbation computations with respect
to powers of $\alpha = 1/137$. For the strong interaction, such a perturbation
formalism fails, because the parameters involved are substantially larger.
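
The two dimensionless quantities (63a) and (63b) can be checked numerically. The following is a minimal sketch, assuming rounded SI values for the constants and assuming that the common force law has the form $\alpha\hbar c/r^2$, which is the form consistent with both (63a) and (63b); the function names are chosen for illustration only.

```python
import math

# Physical constants in SI units, rounded to four digits (assumed values).
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.055e-34       # reduced Planck constant, J s
C_LIGHT = 2.998e8      # speed of light, m/s
EPS0 = 8.854e-12       # vacuum permittivity, A s V^-1 m^-1
M_PROTON = 1.673e-27   # proton mass, kg
E_CHARGE = 1.602e-19   # elementary charge, A s


def alpha_gravity(mass=M_PROTON):
    """Dimensionless gravitational coupling G m^2 / (hbar c), cf. (63a)."""
    return G * mass**2 / (HBAR * C_LIGHT)


def alpha_electromagnetic(charge=E_CHARGE):
    """Fine structure constant e^2 / (4 pi eps0 hbar c), cf. (63b)."""
    return charge**2 / (4.0 * math.pi * EPS0 * HBAR * C_LIGHT)


def force(alpha, r):
    """Force alpha * hbar * c / r^2 in newtons (assumed form of the force law)."""
    return alpha * HBAR * C_LIGHT / r**2


if __name__ == "__main__":
    a_g, a_em = alpha_gravity(), alpha_electromagnetic()
    print(f"gravitational alpha:        {a_g:.2e}")
    print(f"1 / electromagnetic alpha:  {1.0 / a_em:.1f}")
    print(f"ratio of the two forces:    {a_g / a_em:.1e}")
```

Since both forces share the factor $\hbar c/r^2$, their ratio does not depend on the distance $r$; it is simply the ratio of the two values of $\alpha$.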
Since there exists no precise mathematical quantum field theory, it is not
possible to find exact characteristics for the strengths of the strong and weak
interaction. One heuristic method is to calculate certain effects by using rough
quantum field theoretical approximation procedures, and to form dimensionless
quantities analogous to Sommerfeld's fine structure constant [see Bogoljubov
and Sirkov (1980, M), §10, Landau and Lifsic (1962, M), Vol. 4b, §145
(Fermi theory of the weak interaction) and Becher and Böhm (1981, M) (gauge
field theories)]. This procedure, which is still quite uncertain and arbitrary,
yields the relative strengths of the interactions given in Table 59.2.

Table 59.4

Interaction       Typical effective          Typical energy         Typical mean life-times
                  cross sections $\sigma$    fluctuations of        of resonances
                  in m$^2$                   resonances in MeV      in seconds
Strong            $10^{-30}$                 $10^{2}$               $10^{-23}$
Electromagnetic   $10^{-33}$                 $10^{-3}$              $10^{-18}$
Weak              $10^{-42}$                 $10^{-14}$             $10^{-7}$

In order to get an idea of the physical effects that are actually observed, we
look at the two most important quantities which can be directly determined
in experiments with particle accelerators: the effective cross section $\sigma$ with the
dimension of a surface, and the energy fluctuation $2\Delta E$ of resonances. The
concept of effective cross sections has already been discussed in Section 59.7.
If, according to (53), we assign mean life-times to the resonances we obtain
Table 59.4. It shows that, in fact, the interactions have different strengths.

PROBLEMS

59.1. Bohr's atomic model. Prove the formulas of Section 59.16.


Solution: The motion of the electron on a circular orbit of radius $r$ around the
proton is described by
$$x = r(\cos\omega t\, e_1 + \sin\omega t\, e_2)$$
with the orthonormal vectors $e_1$, $e_2$ and $e_3 = e_1 \times e_2$. This implies
$$\dot{x} = \omega r(-\sin\omega t\, e_1 + \cos\omega t\, e_2), \qquad \ddot{x} = -\omega^2 x.$$
For the angular momentum of the electron we obtain
$$N = m_e(x \times \dot{x}) = m_e\omega r^2 e_3.$$
The equation of motion for the electron is
$$m_e\ddot{x} = K \tag{64}$$
with Coulomb force
$$K = -\frac{e^2 x}{4\pi\varepsilon_0 |x|^3}$$
according to Section 58.8. From (64) follows
$$m_e\omega^2 r = \frac{e^2}{4\pi\varepsilon_0 r^2}. \tag{65}$$
The energy is equal to
$$E = \tfrac{1}{2} m_e\dot{x}^2 + U = \frac{m_e}{2}\omega^2 r^2 - \frac{e^2}{4\pi\varepsilon_0 r} = -\frac{e^2}{8\pi\varepsilon_0 r}.$$
The angular momentum quantization $|N| = n\hbar$ gives
$$m_e\omega r^2 = n\hbar, \qquad n = 1, 2, \ldots.$$
From (65) we obtain the orbit radii
$$r_n = r_0 n^2$$
with $r_0 = 4\pi\varepsilon_0\hbar^2/m_e e^2$. The energy is
$$E_n = -e^2/8\pi\varepsilon_0 r_0 n^2,$$
and the orbital velocity of the electron is equal to
$$V_n = |\dot{x}| = \omega r_n = \alpha c/n$$
with $\alpha = e^2/4\pi\varepsilon_0\hbar c = 1/137$.
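
The formulas of this solution can be verified numerically. The following sketch assumes rounded SI values for the constants; the function name `bohr_orbit` and its return convention are chosen for illustration only. It computes the radius, energy, velocity, and angular frequency of the $n$th orbit, so that the quantization condition and the balance between Coulomb force and centripetal acceleration can be checked independently.

```python
import math

# Physical constants in SI units, rounded to four digits (assumed values).
HBAR = 1.055e-34        # reduced Planck constant, J s
C_LIGHT = 2.998e8       # speed of light, m/s
EPS0 = 8.854e-12        # vacuum permittivity, A s V^-1 m^-1
M_ELECTRON = 9.109e-31  # electron mass, kg
E_CHARGE = 1.602e-19    # elementary charge, A s


def bohr_orbit(n):
    """Return (radius, energy, velocity, angular frequency) of the n-th orbit."""
    coulomb = E_CHARGE**2 / (4.0 * math.pi * EPS0)   # e^2 / (4 pi eps0)
    r0 = HBAR**2 / (M_ELECTRON * coulomb)            # Bohr radius r_0
    radius = r0 * n**2                               # r_n = r_0 n^2
    energy = -coulomb / (2.0 * radius)               # E_n = -e^2 / (8 pi eps0 r_n)
    alpha = coulomb / (HBAR * C_LIGHT)               # fine structure constant
    velocity = alpha * C_LIGHT / n                   # V_n = alpha c / n
    omega = velocity / radius                        # V_n = omega r_n
    return radius, energy, velocity, omega


if __name__ == "__main__":
    for n in (1, 2, 3):
        r, e, v, w = bohr_orbit(n)
        print(f"n={n}: r={r:.3e} m, E={e / E_CHARGE:.2f} eV, V={v:.3e} m/s")
```

For $n = 1$ this gives a radius of about $5.3 \cdot 10^{-11}$ m and an energy of about $-13.6$ eV.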
59.2. Scattering of charged particles. A particle with positive charge $Q$, mass $m$, and
initial velocity $V_\infty$ is scattered at a fixed particle with positive charge $Q_0$, as shown
in Figure 59.9. Prove that
$$q = \frac{\alpha}{mV_\infty^2}\cot\frac{\vartheta}{2}, \tag{66}$$
where $\vartheta$ is the scattering angle and $q$ the so-called collision parameter. Here
we have $\alpha = QQ_0/4\pi\varepsilon_0$.
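Solution: From Section 58.9b we obtain for the Kepler problem
$$m\ddot{x} = \alpha x/|x|^3 \tag{67}$$
with constant positive energy
$$E = \frac{m\dot{x}^2}{2} + \frac{\alpha}{|x|}$$
the orbital hyperbola
$$\frac{p}{r} = 1 + \varepsilon\cos(\varphi + 2\pi - \varphi_0/2)$$
where the eccentricity $\varepsilon$ satisfies
$$\varepsilon^2 = 1 + \frac{2E|N|^2}{m\alpha^2} \tag{68}$$
with the constant angular momentum
$$N = m(x \times \dot{x}).$$
Here $r$ and $\varphi$ are polar coordinates. Note that on the right-hand side of (67) we
have the Coulomb force.

Figure 59.9 (collision parameter $q$, initial velocity $V_\infty$)

The angle $\varphi_0 \in\, ]0, \pi[$ is chosen so that $r \to \infty$ as $\varphi \to -\pi$, i.e.,
$$1 + \varepsilon\cos(\pi \pm \varphi_0/2) = 0.$$
This implies that $r \to \infty$ as $\varphi \to -\pi + \varphi_0$. The scattering angle is $\vartheta = \pi - \varphi_0$,
and hence
$$\sin\vartheta/2 = 1/\varepsilon. \tag{69}$$
For $t \to -\infty$ we obtain $\dot{x}(t) \to V_\infty e_1$. This gives
$$E = mV_\infty^2/2, \qquad |N| = mqV_\infty. \tag{70}$$
From (68)-(70) follows (66).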

59.3. Rutherford's scattering formula. Instead of one particle in Problem 59.2 we now
consider a homogeneous particle current with velocity vector $V_\infty e_1$ and particle
density $\rho$ of particles with positive charge $Q$ and mass $m$. This current is scattered
at a fixed particle of positive charge $Q_0$. Compute the number of particles $\Delta N$
which, during the time interval $[0, t]$, are scattered through angles which lie in the
interval $[\vartheta, \vartheta + \Delta\vartheta]$.
Solution: According to Figure 59.9 and (66), the number $\Delta N$ is equal to the
number of particles which, during the time interval $[0, t]$, pass through the
circular ring surface $\Delta\sigma$, perpendicular to the vector $e_1$ with
$$\Delta\sigma = \pi[(q + \Delta q)^2 - q^2] = \pi\left(\frac{\alpha}{mV_\infty^2}\right)^2\left(\cot^2\frac{\vartheta + \Delta\vartheta}{2} - \cot^2\frac{\vartheta}{2}\right).$$
This gives
$$\Delta N = \rho V_\infty t\,|\Delta\sigma|. \tag{71}$$
The surface $\Delta\sigma$ is called the effective cross section. For $\Delta\vartheta \to 0$ we obtain,
in absolute value,
$$\frac{d\sigma}{d\vartheta} = \pi\left(\frac{\alpha}{mV_\infty^2}\right)^2\frac{\cos(\vartheta/2)}{\sin^3(\vartheta/2)}.$$
The value $d\sigma$ is called the differential effective cross section.
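
The relation between the collision parameter and the scattering angle, and the passage from the ring surface to the differential effective cross section, can be checked numerically. The sketch below assumes the collision parameter in the form $q = (\alpha/mV_\infty^2)\cot(\vartheta/2)$ and takes the ring surface in absolute value, since $q$ decreases as the scattering angle increases; all function names are chosen for illustration only.

```python
import math


def impact_parameter(theta, alpha, m, v_inf):
    """Collision parameter q for the scattering angle theta (radians)."""
    return alpha / (m * v_inf**2) / math.tan(theta / 2.0)


def ring_area(theta, dtheta, alpha, m, v_inf):
    """Area of the ring between q(theta) and q(theta + dtheta), taken positive."""
    q_inner = impact_parameter(theta + dtheta, alpha, m, v_inf)
    q_outer = impact_parameter(theta, alpha, m, v_inf)
    return math.pi * abs(q_outer**2 - q_inner**2)


def dsigma_dtheta(theta, alpha, m, v_inf):
    """Differential effective cross section with respect to theta (absolute value)."""
    prefactor = math.pi * (alpha / (m * v_inf**2)) ** 2
    return prefactor * math.cos(theta / 2.0) / math.sin(theta / 2.0) ** 3


def scattered_particles(rho, v_inf, t, area):
    """Number of particles crossing the ring surface during the time t."""
    return rho * v_inf * t * area


if __name__ == "__main__":
    theta, dtheta = 1.0, 1e-6
    quotient = ring_area(theta, dtheta, 1.0, 1.0, 1.0) / dtheta
    print(quotient, dsigma_dtheta(theta, 1.0, 1.0, 1.0))
```

The difference quotient of the ring surface agrees with the closed formula up to an error that vanishes with the angular increment.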



References to the Literature

Classical works: Planck (1900), Einstein (1905b), Heisenberg (1925), Schrödinger
(1926).
Physical quantum theory: Heisenberg (1937, M), Pauli (1958, S), Landau and Lifsic
(1962, M), Vols. 3, 4, Bogoljubov and Sirkov (1973, M), (1980, M), Itzykson and Zuber
(1980, M), Lee (1981, M), Frampton (1987, M).
Mathematical quantum theory: von Neumann (1932, M), Van der Waerden (1980,
M) (recommended as an introduction), Streater and Wightman (1964, M), Reed and
Simon (1972, M), Vols. 1-4, Triebel (1972, M).

General References to the Literature

General physics: Feynman, Leighton and Sands (1963, L) (Feynman Lectures, Vols.
1-3) (recommended as an introduction), Kittel (1965, L) (Berkeley Physics Course,
Vols. 1-5), Orear (1966, M), Hänsel and Neumann (1972, M), Vols. 1-7, Lüscher
(1980, M).
Experimental physics: Bergmann and Schaefer (1979, M), Vols. 1-4, Gerthsen
(1982, M).
Standard works in theoretical physics: Sommerfeld (1954, M), Vols. 1-6, Landau
and Lifsic (1962, M), Vols. 1-10.
Theoretical physics: Pauli (1973, M), Vols. 1-6, Macke (1962, M), Vols. 1-6,
Kompanejec (1961, M), Ludwig (1978, M), Vols. 1-4, Weller and Winkler (1974, M),
Vols. 1-2, Greiner and Müller (1986, M), Vols. 1-10 (including recent advances in
theoretical physics).
Mathematical physics: Courant and Hilbert (1959, M), Vols. 1-2 and Frank and
von Mises (1962, M), Vols. 1-2 (classical standard works), Morse and Feshbach
(1953, M), Vols. 1-2, Maurin (1967, M), (1976, M), Vols. 1-2, Reed and Simon (1971,
M), Vols. 1-4 (emphasis on quantum physics), Triebel (1981, S), Thirring (1983, M),
Vols. 1-4.
Handbook of physics: Geiger and Scheel (1926, M), Vols. 1-24 (classic), Flügge
(1956, M), Vols. 1-∞.
Encyclopedia: Encyclopedia of Mathematics and its Applications (1976), Vols. 1-∞,
Russian Encyclopedia of Mathematics (1977, M), Handbook of Applicable
Mathematics (1980, M), Vols. 1-6, Encyclopedia of Astronomy and Space (1976), Van
Nostrand's Scientific Encyclopedia (1976), Encyclopedic Dictionary of Mathematics
(1977) (especially recommended), Encyclopedic Dictionary of Physics (1977),
Encyclopedia of Mathematics (1987, M), Vols. 1-∞.
Lexica: Dictionary of Mathematics (1961, M), Vols. 1-2, Brockhaus ABC of Physics
(1971, M), Vols. 1, 2, Brockhaus ABC of Chemistry (1965, M), Vols. 1-2.
Four language dictionary of mathematics: Eisenreich and Sube (1982, M), Vols. 1-2
(35,000 termini).
Four language dictionary of physics: Eisenreich and Sube (1973) (75,000 termini).
Qualitative analysis of physical systems: Gitterman and Halpern (1981, M).
Rational mechanics: Truesdell (1977, M), Wang (1979, M).
International system of units: Oberdorfer (1969, M), Massey (1971, M).
Collections of problems in physics: Hajko and Schilling (1975, M), Vols. 1-6 (physics
in examples), Vogel (1977, M), Greiner and Müller (1986, M), Vols. 1-10.
Popular expositions about the development of modern physics, cosmology and
biology: Riedl (1976, M), Bresch (1978, M), Sullivan (1979, M), Kippenhahn (1980, M),
Unsöld (1981, M), Fritzsch (1982, M), (1983, M), Sexl (1982, M), Ivanov (1983, M)
(cybernetical methods in neuro-physiology, biology, cultural sciences, and the
humanities), Henbest (1984, M), Trefil (1983, M), (1984), Taube (1985, M) (cf. p. 793).
Philosophical problems in mathematics and natural sciences: Kähler (1941), (1979,
M), Weyl (1952, M), (1966, M), Born (1957, M), Blaschke (1957, M), Einstein (1965, M),
Planck (1945, M), (1967, M), Kline (1972, M), Monod (1970, M) and Bresch (1978, M)
(biology), von Weizsäcker (1973), (1976, M), (1976a), (1979, M), (1979a, M), Heisenberg
(1977, M), (1977a, M), (1978, M), (1980, M), (1981, M), Dyson (1979, M), Hofstadter
(1979, M), Prigogine (1979, M), Prigogine and Stengers (1981, M), Manin (1981, M),
Cronin (1981, M), Maurin (1981), (1982), Treder (1983, M), Albers (1985, M), Beckert
(1985), Tymoczko (1985, M), Hildebrandt and Tromba (1986, M).
History of physics: von Laue (1950, M), Mehra (1982, M).
History of natural sciences: Wussing (1983, M).
Nobel lectures: The Nobel prizes (1954ff, M).
Unsolved problems in mathematical physics: Simon (1984, S).
Unsolved problems in mathematics: Browder (1976a).
The journals Nature and Scientific American inform about recent developments in
science.
Survey about modern developments in mathematics: Jaffe (1984).
About current results in mathematics and the natural sciences one can consult the
Lecture Notes in mathematics, biomathematics, chemistry, computer science, control
and information sciences, economics, physics, and statistics. These lecture notes are
published by Springer-Verlag.
Current developments in physics may also be found in the two series "Progress
in Physics", Birkhäuser, Boston and "Frontiers in Physics", Benjamin, New York.
Consult also the summer institute series "Les Houches (1951ff)".
Fundamental formulas in physics: Menzel (1955, M).
Numerical recipes: Press (1986, M) (the art of scientific computing).
APPLICATIONS IN
ELASTICITY THEORY

As any human activity needs goals, mathematical research needs problems.


David Hilbert (1932)
Most mathematicians have an idea of the influence of hydrodynamics and
electromagnetism on the theory of complex functions and harmonic
potentials. The influence of elasticity is less well known. Elasticity led to a vast
range of mathematical problems involving linear algebra, differential
geometry, ordinary and partial differential equations (mostly nonlinear),
elliptic functions, and the calculus of variations.
Clifford Truesdell (1983)

In Chapters 60 to 66 we want to show that the following mathematical
concepts are closely related to problems in nonlinear elasticity and plasticity
theory:

convex functionals,
monotone potential operators,
pseudomonotone operators,
maximal monotone operators,
subgradients,
variational inequalities,
duality,
bifurcation.

The corresponding theories, which have been developed in Parts II and III,
will here be applied to solve a number of problems in elasticity and plasticity
theory. In Chapter 62, for instance, we will see that there exists a duality
between strain and stress which allows us to apply the duality theory of Part
III. In fact, the duality between strain and stress was one of the reasons for
the creation of a general duality theory. Strain and stress are the two
fundamental physical quantities of elasticity and plasticity theory.
The two most important concepts used in an elegant and effective descrip-
tion of convex problems in elasticity and hydrodynamics are:
(i) conjugate functionals (duality); and
(ii) subgradients (elastoplastic and viscoplastic material).
In case the considered functionals are not convex, one might be able to
apply the modern method of compensated compactness. In this direction we
will consider:
(a) polyconvex material in elasticity; and
(b) transonic flow in gas dynamics.
Most difficulties in elasticity and hydrodynamics arise from the fact that in
realistic situations in nature no convexity is available.
We emphasize that the application of methods of convex analysis and the
theory of monotone operators is only a first step. For a general theory of
elastic and plastic phenomena, which presently does not exist, it will be
necessary to substantially broaden the mathematical spectrum. Today, one is
convinced that the true character of elasticity lies beyond the concepts of
convexity and monotonicity.
CHAPTER 60

Elastoplastic Wire

Ut tensio sic vis.¹


Robert Hooke, De Potentia Restitutiva, (London, 1678)
The first mathematician to consider the nature of resistance of solids to rupture
was Galileo (1638) .... He endeavoured to determine the resistance of a beam,
one end of which is built into a wall, when the tendency to break arises from
its own or an applied weight; and he concluded that the beam tends to turn
about an axis perpendicular to its length, and in the plane of the wall. This
problem and, in particular, the determination of this axis is known as Galileo's
problem.
The history of the theory of elasticity started from Galileo's question.
Undoubtedly, the two great landmarks are the discovery of Hooke's law in 1660
(published in 1678), and the formulation of the general equations by Navier
(1821). Hooke's law provided the necessary experimental foundation for the
theory....
In the interval between the discovery of Hooke's law and that of the general
differential equations of elasticity by Navier, the attention of those mathe-
maticians who occupied themselves with our science was chiefly directed to the
solution and extension of Galileo's problem, and the related theories of the
vibrations of bars and plates and the stability of columns.
The first investigation of any importance is that of the elastic line or elastica
by Jacob Bernoulli (1705), in which the resistance to bending is a number
proportional to the curvature of the rod when bent. ...
Daniel Bernoulli suggested to Euler (by letter in 1742) that the differential
equation of the elastica could be found by making the square of the curvature
taken along the rod a minimum; and Euler (1744) was able to obtain the
differential equation and to classify various solutions of it (an early study of
elliptic integrals and elliptic functions) ....

¹ The force of a spring is directly proportional to its relative extension (strain). Robert Hooke
published this in the anagram ceiiinosssttuu.


Navier (1821) was the first to investigate the general equations of equilibrium
and vibration of elastic solids. He set out from the Newtonian conception of the
constitution of bodies, i.e., bodies are made up of small parts called "molecules"
which act upon each other by means of central forces ....
The studies of Cauchy (1789-1857) in elasticity were first prompted by his
being a member of the commission appointed to report upon a memoir by
Navier on elastic plates which was presented to the Paris Academy in August,
1820. By the autumn of 1822, Cauchy had discovered most of the elements of
the pure theory of elasticity (published in 1827). He had introduced the notion
of stress. He had also shown how to introduce both the stress tensor and the
strain tensor.... He had determined the equations of motion (or equilibrium) by
which the stress components are connected with the forces .... By means of
relations between stress components and strain components, he had eliminated
the stress components from the equations of motion and equilibrium, and had
arrived at equations in terms of the displacement. ... Cauchy obtained his
stress-strain relation (constitutive law) for isotropic materials by means of two
assumptions:
(i) that the relations in question are linear; and
(ii) that the principal planes of stress are normal to the principal axes of strain.
The experimental basis on which these assumptions can be made to rest is the
same as that on which Hooke's law rests, but Cauchy did not refer to it. The
methods used in these investigations are quite different from those of Navier's
memoir (1821). In particular, no use is made of material points and central forces.
The resulting equations differ from Navier's in one important respect: Navier's
equations contain a single constant to express the elastic behavior of an isotropic
body, while Cauchy's contain two such constants (today called the constants of
Lamé (1795-1870))....
Green (1793-1841) was dissatisfied with the hypothesis on which the theory
of elasticity was based, and he sought a new foundation in his paper (1839).
Starting from what is now called the "principle of minimal elastic potential
energy" he propounded a new method of obtaining the basic equations. The
revolution which Green effected in the elements of the theory is comparable in
importance with that produced by Navier's discovery of the basic equations.
Green supposed the stored energy function (density of the elastic potential
energy) to be capable of being expanded in powers and products of the
components of strain .... From this principle Green deduced the equations of
elasticity for anisotropic bodies, containing in the general case 21 constants. In the
case of isotropy there are two constants, and the equations are the same as those
of Cauchy.... (Green followed the pattern of the famous "La Mécanique
Analytique" of Lagrange (1788). Green's stored energy function corresponds to the
Lagrangian function in mechanics.)
The history of the mathematical theory of elasticity shows clearly that the
development of the theory has not been guided exclusively by considerations of
its utility for technical mechanics. Most of the men by whose researches it has
been founded and shaped have been more interested in natural philosophy than
in material progress, in trying to understand the world rather than in trying to
make it more comfortable.
A. Love (1906)

The mechanics of continua, which is based on Cauchy's (1827) general notion
of stress, has been applied so far only to liquid and solid elastic bodies. In regard
to plastic deformations, Saint Venant (1864) (based on experiments of Tresca
(1864)) has sketched a theory which, however, does not yield the necessary
number of equations in order to completely determine the motion.
But this paper leads to a complete system of equations of motion for plastic
bodies.
Richard von Mises (1913)
The mathematical theory of plasticity owes its development to the demand for
more realistic methods to determine the safety factors of structures or machine
parts, and to the need for better control in technological forming processes such
as rolling, drawing, and extruding.
William Prager and Philip Hodge (1951)

In order to understand the basic ideas of elasticity and plasticity theory, we
discuss in this chapter the simplest situation by choosing the wire as an
example. Special emphasis will be placed on a comparison of important
constitutive laws (stress-strain relations). We stress the possibility that plastic
behavior may be described by multivalued constitutive equations, i.e., more
precisely, by subgradients. The calculus of subgradients has been discussed in
Part III.
In Chapters 61 to 66 we shall generalize the results of this chapter to
three-dimensional bodies and give applications to some important problems.
The dynamic plasticity model of Section 60.3 will be substantially generalized
in Chapter 66. This model is based on the use of internal state variables.

60.1. Experimental Result


Consider a wire of length $l$ and cross section $C$, which consists of steel and is
stretched by $\Delta l$ under the influence of a tensile force $K > 0$ (Fig. 60.1). We
define
    $u = \Delta l$   (displacement),
    $\gamma = \Delta l/l$   (strain),
    $\sigma = K/C$   (stress).
By definition, the strain $\gamma$ is equal to the relative change of length. For a
compressive force $K < 0$, the stress $\sigma$ is negative. Experimentally, the
stress-strain relation
    $\sigma = \sigma(\gamma)$
exhibits the qualitative behavior of Figure 60.2. The points I to IV in Figure
60.2 carry the notations: I = proportional limit, II = elastic limit, III =
hardening limit, and IV = strength limit. If we slowly increase the tensile force

Figure 60.1 (wire of length $l + \Delta l$ under the tensile force $K$)

Figure 60.2 (stress-strain diagram with the points 0, I, II, III, IV, $P$, and 0')

$K$ from zero upwards, then the stress $\sigma$ increases, and the wire is stretched,
i.e., in Figure 60.2 we run through the curve from 0 to IV.
Between 0 and I there exists a linear relation between stress and strain
$$\sigma = E\gamma, \tag{1}$$
which is called Hooke's law. The material constant $E$ is called the elasticity
modulus. In fact, the wire is also subject to a reduction $\Delta d$ of its thickness $d$,
which is given by
$$\Delta d/d = -\mu\,\Delta l/l. \tag{2}$$
The material constant $\mu$ is called the Poisson number. In this chapter we
concentrate on the longitudinal deformation (1). The transverse contraction (2),
however, is important for the behavior of three-dimensional bodies, as we shall
see in Section 61.7.
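
To get a feeling for the orders of magnitude in Hooke's law (1) and in the transverse contraction (2), one can evaluate both for a thin wire. The following sketch uses hypothetical material data (an elasticity modulus of $2 \cdot 10^{11}$ N/m$^2$ and a Poisson number of 0.3, which are typical textbook values for steel and are assumptions here); the function name is chosen for illustration only, and the computation is meaningful only below the proportional limit.

```python
def wire_response(force, length, cross_section, thickness, modulus, poisson):
    """Elongation and thickness change of a wire in the linear range.

    Uses stress = force / cross_section, Hooke's law stress = modulus * strain,
    and the transverse contraction (thickness change) / thickness
    = -poisson * strain.
    """
    stress = force / cross_section
    strain = stress / modulus
    elongation = strain * length
    thickness_change = -poisson * strain * thickness
    return stress, strain, elongation, thickness_change


if __name__ == "__main__":
    # Hypothetical data: 1 m of wire, 1 mm^2 cross section, 1 mm thickness.
    result = wire_response(
        force=100.0,          # N
        length=1.0,           # m
        cross_section=1e-6,   # m^2
        thickness=1e-3,       # m
        modulus=2e11,         # N/m^2, assumed value for steel
        poisson=0.3,          # assumed value for steel
    )
    print(result)
```

With these data the wire is stretched by half a millimetre, while its thickness shrinks only by a fraction of a micrometre.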
Between I and II there exists the nonlinear relation
$$\sigma = \sigma(\gamma), \tag{3}$$
which is called a nonlinear Hooke's law. At point II a plastic behavior can be
observed. First, the wire begins to flow, i.e., one observes strains without
significant increase of stress. At point III, the so-called hardening limit, the
flow process slows down. One says the material has hardened. At point IV
the material breaks.
Typical for the plastic behavior is a so-called hysteresis effect. In order to
explain this let us consider a point $P$ in Figure 60.2 beyond the elastic limit
II. If at this point the force is diminished, i.e., the stress is decreased, one
does not run backwards through the diagram, but instead along the dotted
line. If during this process one reaches the force $K = 0$, and hence the stress
$\sigma = 0$, then we are at point 0'. This means that, although no force is present any
more, there still exists a nonzero residual deformation, which is called a plastic
residual deformation. While in the elastic region 0 to II there exists a unique
relation between stress $\sigma$ and strain $\gamma$, this is no longer true in the plastic region
beyond II. There only a multivalued relation exists between $\sigma$ and $\gamma$, which
depends on the chosen path, i.e., on the process. The objective of mathematical
plasticity theory is to model such multivalued constitutive laws. This will be
discussed during the following sections. Thereby the subgradient of convex
analysis of Part III will be the main tool to describe hysteresis effects.

Our observations have a purely phenomenological character, i.e., we do not
study the microphysical mechanisms, which are responsible for elasticity. In
1912, Max von Laue's experiments with X-rays proved that many solid
materials, especially all metals, have a crystalline structure. During the 1930s
one came to the conclusion that plastic behavior is caused by curve-like defects
in the lattice structure. Detailed information about this may be found in
Sommerfeld (1970, M, H), Chapter IX and in Kleinert (1987, P).

60.2. Viscoplastic Constitutive Laws


If the relation
$$\sigma = \sigma(\gamma) \tag{4}$$
holds, then we speak of linear and nonlinear elastic material if $\sigma(\cdot)$ is linear
and nonlinear, respectively (Figs. 60.3 and 60.4). We now want to show that
plastic, viscous, and viscoplastic behavior can uniformly be described through
$$\dot\gamma(t) \in \partial F(\sigma(t)) \tag{5}$$
with time parameter $t$. More precisely, we consider the processes
$$\gamma = \gamma(t), \qquad \sigma = \sigma(t) \qquad \text{for } 0 \le t \le t_0,$$
which occur slowly (quasi-dynamical processes). Relation (5) holds for almost
all $t \in [0, t_0]$.

EXAMPLE 60.1 (Ideal Plastic Behavior). We choose a fixed $\sigma_0 > 0$. Let
$$C = \{\sigma \in \mathbb{R}: |\sigma| \le \sigma_0\}$$
and let $\chi$ be the indicator function of $C$, i.e., $\chi(\sigma) = 0$ or $= +\infty$ for $\sigma \in C$ or
$\sigma \notin C$, respectively. Set $F = \chi$. According to Section 47.3 we have
$$\partial\chi(\sigma) = \begin{cases} \{0\} & \text{if } |\sigma| < \sigma_0, \\ \mathbb{R}_\pm & \text{if } \sigma = \pm\sigma_0, \\ \emptyset & \text{if } |\sigma| > \sigma_0. \end{cases}$$

Figure 60.3        Figure 60.4

Figure 60.5        Figure 60.6        Figure 60.7

Therefore, (5) is equivalent to
$$\dot\gamma(t) = 0 \ \text{ if } |\sigma(t)| < \sigma_0, \qquad \pm\dot\gamma(t) \ge 0 \ \text{ if } \sigma(t) = \pm\sigma_0, \tag{6}$$
and $|\sigma(t)| \le \sigma_0$ for all $t$. Figures 60.5 and 60.6 show two possibilities. In Figure
60.5 an increase of stress $\sigma$, at first, does not yield any strain $\gamma$. After reaching
the critical stress $\sigma_0$, the material flows. Figure 60.6 shows a typical hysteresis
effect. It is now important that no unique relation of the form (4) exists, but
according to (5) a number of different processes are possible.
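
The multivalued character of this law can be made concrete by representing the subgradient of the indicator function as an interval. The following is a minimal sketch; the representation of a set as a pair of bounds (with `None` for the empty set) and the function names are assumptions made for illustration.

```python
import math


def subdifferential_indicator(stress, critical_stress):
    """Subgradient of the indicator function of [-critical_stress, critical_stress].

    Returns a closed interval (lower, upper) of admissible values,
    or None if the set is empty.
    """
    if abs(stress) < critical_stress:
        return (0.0, 0.0)                 # only the value 0
    if stress == critical_stress:
        return (0.0, math.inf)            # nonnegative half-line
    if stress == -critical_stress:
        return (-math.inf, 0.0)           # nonpositive half-line
    return None                           # empty set beyond the critical stress


def is_admissible(strain_rate, stress, critical_stress):
    """Check whether a strain rate is compatible with the ideal plastic law."""
    interval = subdifferential_indicator(stress, critical_stress)
    if interval is None:
        return False
    lower, upper = interval
    return lower <= strain_rate <= upper


if __name__ == "__main__":
    print(is_admissible(0.0, 0.5, 1.0))    # below the critical stress: no flow
    print(is_admissible(2.0, 1.0, 1.0))    # at the critical stress: any flow rate >= 0
    print(is_admissible(0.0, 1.5, 1.0))    # beyond the critical stress: impossible
```

At the critical stress every nonnegative strain rate is admissible, which is why the stress alone does not determine the process.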
EXAMPLE 60.2 (Viscous Behavior). Let $F_1: \mathbb{R} \to \mathbb{R}$ be a convex $C^1$-function.
Then $\partial F_1(\sigma) = \{F_1'(\sigma)\}$. Therefore (5) with $F = F_1$ corresponds to the
constitutive law
$$\dot\gamma(t) = F_1'(\sigma(t)) \tag{7}$$
which models viscous flows. In contrast to (4), the strain velocity here depends
on the stress. If we choose especially
$$F_1(\sigma) = \begin{cases} 0 & \text{if } |\sigma| \le \sigma_0, \\ (|\sigma| - \sigma_0)^2/4\mu & \text{if } |\sigma| > \sigma_0, \end{cases}$$
with $\mu > 0$, then (7) corresponds to the process of Figure 60.7 with $\sigma(t) = t$.
Formally, $F_1$ becomes $\chi$ as $\mu \to 0$. Hence, formally, ideal plastic behavior can
be viewed as a limiting case of viscous behavior (Fig. 60.8).
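
The viscous law of this example can be integrated numerically for a prescribed stress history. The sketch below assumes the potential in the symmetric form, vanishing for stresses up to the critical stress and growing quadratically in the excess of the absolute stress beyond it; the trapezoidal integration and all names are illustrative choices. For the stress history equal to the time itself, the strain at time $t$ coincides with the value of the potential at $t$, which gives a simple check.

```python
import math


def potential(stress, critical_stress, mu):
    """Viscous potential: zero up to the critical stress, quadratic beyond it."""
    excess = max(abs(stress) - critical_stress, 0.0)
    return excess**2 / (4.0 * mu)


def potential_derivative(stress, critical_stress, mu):
    """Derivative of the potential, i.e., the strain rate for a given stress."""
    excess = max(abs(stress) - critical_stress, 0.0)
    return math.copysign(excess, stress) / (2.0 * mu)


def integrate_strain(stress_history, t_end, steps, critical_stress, mu):
    """Integrate the strain rate over [0, t_end] with the trapezoidal rule."""
    dt = t_end / steps
    strain = 0.0
    for k in range(steps):
        rate_left = potential_derivative(stress_history(k * dt), critical_stress, mu)
        rate_right = potential_derivative(stress_history((k + 1) * dt), critical_stress, mu)
        strain += 0.5 * dt * (rate_left + rate_right)
    return strain


if __name__ == "__main__":
    strain = integrate_strain(lambda t: t, 3.0, 3000, 1.0, 0.5)
    print(strain, potential(3.0, 1.0, 0.5))
```

Decreasing the parameter `mu` makes the strain grow faster once the critical stress is exceeded, in line with the formal limit toward ideal plastic behavior.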
EXAMPLE 60.3 (Viscoplastic Behavior). For $F = \chi + F_1$, Theorem 47.8 implies
that
$$\partial F(\sigma) = \partial\chi(\sigma) + F_1'(\sigma).$$

Figure 60.8        Figure 60.9        Figure 60.10

In this case (5) describes a superposition of Examples 60.1 and 60.2 of the form
$$\dot\gamma(t) \in \partial\chi(\sigma(t)) + F_1'(\sigma(t)).$$
For $F_1(\sigma) = \sigma^2/4\mu$ with $\mu > 0$ and piecewise constant $\dot\sigma$ one obtains, e.g.,
Figure 60.9. Combining (4) with (5) gives
$$\dot\gamma(t) \in E^{-1}\dot\sigma(t) + \partial F(\sigma(t)). \tag{8}$$
For $F = \chi$ one obtains processes of the form in Figure 60.10.

60.3. Elasto-Viscoplastic Wire with Linear Hardening Law

In Examples 60.1 and 60.3 the plastic limit $\sigma_0$ cannot be exceeded, since we
have $\partial\chi(\sigma) = \emptyset$ for $|\sigma| > \sigma_0$. However, such an exceeding by hardening effects
is actually observed in Figure 60.2. We want to show how one can describe
such hardening effects by introducing so-called internal state variables $e$, $p$ and
$q$, $r$. We set
$$\gamma = e + p, \qquad \sigma = q + r, \tag{9}$$
where $e$ is the elastic strain, $p$ is the viscoelastic strain, $q$ is the viscoplastic
stress, and $r$ is the hardening stress. The constitutive laws are given by linear
Hooke's laws
$$\sigma = Ae, \qquad r = Bp \tag{10}$$
with positive constants $A$ and $B$ and the viscoplastic law
$$\dot{p}(t) \in \partial F(q(t)) \tag{11}$$
for almost all $t \in [0, t_0]$ and fixed $t_0 > 0$. The constants $A$ and $B$ are called
elasticity modulus and hardening modulus, respectively. Our assumptions are:
(H1) The function $F: \mathbb{R} \to\, ]-\infty, \infty]$ is convex and lower semicontinuous
     with $F \not\equiv +\infty$.
(H2) We are given a function $t \mapsto K(t)$ describing the change in time of the
     outer force with $K \in W_2^1(0, t_0)$. This implies the stress
     $$\sigma(t) = K(t)/C.$$
     Thereby $\sigma(0)$ is known. Recall that $C$ is the cross section of the wire.
(H3) The initial data for $\gamma$, $e$, $p$, $r$, $q$ at time $t = 0$ are chosen so that (9) and
     (10) are satisfied at $t = 0$ and $F(q(0)) < \infty$ holds. It is enough to know
     the initial strain $\gamma(0)$, because all other initial data are then uniquely
     determined.

Theorem 60.A. If (H1) to (H3) hold, then there exists a unique solution $\gamma$, $e$, $p$,
$r \in W_2^1(0, t_0)$ which satisfies the given initial data.

PROOF. This is a special case of Theorem 66.A of Section 66.3 with U = r = R.
Moreover, we have
$$\gamma = Du$$
with $u = \Delta l$ and $D = D^* = 1/l$, and also
$$D^*\sigma = K/Cl.$$
We therefore replace $K$ in Section 66.1 with $K/Cl$. □
Recall that for any $f \in W_2^1(0, t_0)$ there exists a uniquely determined
continuous function $g: [0, t_0] \to \mathbb{R}$, which is equal to $f$ almost everywhere.
Moreover, $f$ has a generalized derivative $\dot{f}$ with $\int_0^{t_0} \dot{f}^2\,dt < \infty$. On the other hand,
one can modify $f$ on a set of measure zero, whereby $f$ remains unchanged as
an element of $W_2^1(0, t_0)$. Therefore we may always assume that $f: [0, t_0] \to \mathbb{R}$
is continuous. In this sense $f(0)$ is well defined.

Figure 60.11 (panels (a) and (b))

EXAMPLE 60.4 (Hardening Effect). Let $F = \chi$ as in Example 60.1. We want to
show that there occurs a hardening effect in our model. If $p$, $q$ travel on the
given curve in Figure 60.11(a), then $\gamma$, $\sigma$ run through the curve in Figure
60.11(b). The slope of the line segment $\overline{OP}$ is equal to $A$. This is a linear
Hooke's law. A so-called linear hardening law corresponds to the line segment
$\overline{PQ}$ with slope $B > 0$, for which the plastic limit $\sigma_0$ is exceeded. For $B = 0$,
however, $\overline{PQ}$ is horizontal, i.e., we have a pure flow process. Hence one obtains
the two unknown parameters $A$, $B$ in the constitutive law for the internal state
variables from the diagram for the experimentally measurable quantities $\gamma$ and
$\sigma$ in Figure 60.11(b). This figure is an idealization of the experimental result
in Figure 60.2.
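
The hardening effect and the hysteresis of this model can be reproduced with a few lines of code. The following sketch is not the method of the text; it assumes the indicator function as potential and replaces the time derivative of the internal strain by a difference quotient (an implicit Euler step, often called a return mapping), which keeps the internal stress within the plastic limit at every step. The function name and the sample data are chosen for illustration only.

```python
def simulate_wire(stress_path, elasticity, hardening, plastic_limit):
    """Return the strains belonging to a prescribed sequence of stresses.

    Internal variables: p (inelastic strain) and q = stress - hardening * p.
    In each step p is changed only as much as is needed to keep
    |q| <= plastic_limit; this requires hardening > 0.
    """
    p = 0.0
    strains = []
    for stress in stress_path:
        q_trial = stress - hardening * p          # internal stress if p stayed fixed
        if q_trial > plastic_limit:               # flow in the positive direction
            p = (stress - plastic_limit) / hardening
        elif q_trial < -plastic_limit:            # flow in the negative direction
            p = (stress + plastic_limit) / hardening
        strains.append(stress / elasticity + p)   # total strain = elastic + inelastic
    return strains


if __name__ == "__main__":
    # Load up to twice the plastic limit, then unload completely.
    path = [0.0, 0.5, 1.0, 1.5, 2.0, 1.0, 0.0]
    strains = simulate_wire(path, elasticity=10.0, hardening=2.0, plastic_limit=1.0)
    for stress, strain in zip(path, strains):
        print(f"stress={stress:.1f}  strain={strain:.3f}")
```

After complete unloading the strain does not return to zero; the remaining value is the plastic residual deformation described in Section 60.1.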
As in Figure 60.12, model (9) to (11) may be visualized as a parallel connection
of an elastic element $B$ with a viscoelastic element $F$, which both are
serially connected to an elastic element $A$. Here $A$ corresponds to the elastic
strain $e$, while $B$ and $F$ correspond to the viscoelastic strain $p$. The sum of the
quantities $e$ and $p$ gives the total strain
$$\gamma = e + p.$$
At the points $A$, $B$, $F$ of Figure 60.12, the strains $e$, $p$, $p$ produce the stresses
$\sigma$, $r$, $q$, respectively. We have that
$$\sigma = r + q.$$

Figure 60.12 (columns: strain, stress)

60.4. Quasi-Statical Plasticity


We distinguish here between:
(i) quasi-dynamical plasticity,¹
(ii) quasi-statical plasticity, and
(iii) statical plasticity.
In case (i), time-dependent processes for strain and stress of the form
$\gamma = \gamma(t), \quad \sigma = \sigma(t),$
with constitutive law
$\gamma(t) \in \partial F(\sigma(t))$ (12)

will be considered. The quasi-statical case (ii), on the other hand, corresponds
to the constitutive law
$\gamma \in \partial F(\sigma)$ (13)
with the important special case
$\gamma \in f(\sigma) + \partial\chi(\sigma),$ (14)
where $\chi$ denotes the indicator function for the set
$C = \{\sigma \in \mathbb{R} : |\sigma| \le \sigma_0\}, \quad \sigma_0 > 0.$
According to Example 60.1, equation (14) is equivalent to the constitutive law:
$\gamma = f(\sigma)$ if $|\sigma| < \sigma_0$,
$\gamma \ge f(\sigma_0)$ if $\sigma \ge \sigma_0$, (14*)
$\gamma \le f(-\sigma_0)$ if $\sigma \le -\sigma_0$,
i.e., the strain $\gamma$ becomes undetermined if the critical stress $\pm\sigma_0$ has been
reached (Fig. 60.13). Note that in the quasi-statical case the history of the

Figure 60.13

¹ Note that this terminology is not uniform in the literature. Sometimes quasi-dynamical
plasticity is called quasi-statical plasticity.



material does not play any role, which, in fact, represents a strong idealization
of the actual behavior of plastic materials. In Chapters 62 and 66, we will
discuss generalizations of (i) and (ii) to three-dimensional bodies and prove
the following:
quasi-statical plasticity ⇒ elliptic variational inequality,
quasi-dynamical plasticity ⇒ evolution variational inequality of first order.
From the mathematical point of view, the quasi-statical case is much simpler
than the quasi-dynamical case.
The basic ideas of statical plasticity will be discussed in Section 62.11, and
the following will be proved:
dual variational problems between strain and stress
+ plasticity condition
⇒ statical plasticity ⇒ elliptic variational inequality.

60.5. Some Historical Remarks on Plasticity


The basic experiments on the behavior of plastic material were performed by
Tresca (1864), nearly 200 years after the discovery of Hooke's law. Tresca also
formulated a plasticity condition (yield condition) which has the following
form:
$\mathrm{Tresca}(\tau) < \tau_*$ no plasticity,
$\mathrm{Tresca}(\tau) \ge \tau_*$ plasticity occurs, (15)
where $\mathrm{Tresca}(\tau)$ is equal to $\max\{|\tau_1 - \tau_2|, |\tau_2 - \tau_3|, |\tau_1 - \tau_3|\}$ and $\tau_1$,
$\tau_2$, $\tau_3$ denote the eigenvalues of the stress tensor $\tau$. This will be discussed more
thoroughly in the following chapter. The numbers $\lambda = \tau_1, \tau_2, \tau_3$ are the
solutions of the characteristic equation
$\det(\tau_{ij} - \lambda\delta_{ij}) = 0,$
where the $\tau_{ij}$ denote the components of $\tau$ in a Cartesian coordinate system.
Note that the matrix $(\tau_{ij})$ is symmetric. The critical value $\tau_*$ depends
on the material under consideration. Suppose $n_1$, $n_2$, $n_3$ are eigenvectors of $\tau$
which correspond to $\tau_1$, $\tau_2$, $\tau_3$. Following Cauchy (1827), the planes with
normal vectors $n_1$, $n_2$, $n_3$ are called the principal planes of stress, and $\tau_1$,
$\tau_2$, $\tau_3$ the principal stresses. Their physical meaning will be explained in
Section 61.3. The plasticity condition (15) postulates the following fundamental
law of plasticity:
plasticity occurs if the differences between the principal stresses
are large enough.

Theoretical work on plasticity goes back to the papers of Saint Venant
(1871) and Levy (1871). In 1909, Haar and von Karman proposed a so-called
statical model in plasticity, which was based on a variational principle for the
stresses and a side condition, namely, a plasticity condition. A modern formulation
of this, together with existence and uniqueness theorems, will be given
in Section 62.11.
The present flow theory in plasticity was founded by von Mises. In his paper
(1913) he
(i) replaced the Tresca plasticity condition with a new condition; and
(ii) formulated the equations of motion for an ideal plastic liquid (von Mises
liquid).
Von Mises' plasticity condition reads as follows:
$\mathrm{Mises}(\tau) < \tau_0^2$ no plasticity,
$\mathrm{Mises}(\tau) \ge \tau_0^2$ plasticity occurs, (16)
where
$\mathrm{Mises}(\tau) = \tfrac{1}{3}\big((\tau_1 - \tau_2)^2 + (\tau_2 - \tau_3)^2 + (\tau_1 - \tau_3)^2\big).$ (17)
This condition is easier to handle than the old Tresca condition, since, for one
thing, the function $\tau \mapsto \mathrm{Mises}(\tau)$ is differentiable, and, moreover, $\mathrm{Mises}(\tau)$ can
be expressed in terms of $\tau$. In Section 62.9 we will show that
$\mathrm{Mises}(\tau) = [\tilde\tau, \tilde\tau] = \sum_{i,j=1}^{3} \big(\tau_{ij} - \tfrac{1}{3}(\tau_{11} + \tau_{22} + \tau_{33})\delta_{ij}\big)^2.$
This is a consequence of the fact that $[\tilde\tau, \tilde\tau]$ is an invariant expression, i.e., a
special Cartesian coordinate system can be chosen in which $\tau_{ij} = \tau_i\delta_{ij}$. This
implies (17).
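As an illustration only, the two yield functions can be evaluated numerically for a given symmetric stress matrix. The following Python sketch is not taken from the text: the function names and the sample matrix are assumptions made for the example, and the principal stresses are computed with the standard closed-form trigonometric formula for symmetric 3×3 matrices. It checks that the principal-stress form (17) and the deviator form of Mises(τ) agree.

```python
import math

def principal_stresses(t):
    """Eigenvalues of a symmetric 3x3 matrix t (list of lists), in descending order."""
    q = (t[0][0] + t[1][1] + t[2][2]) / 3.0
    p1 = t[0][1] ** 2 + t[0][2] ** 2 + t[1][2] ** 2
    if p1 == 0.0:
        # Already diagonal: the diagonal entries are the eigenvalues.
        return sorted((t[0][0], t[1][1], t[2][2]), reverse=True)
    p2 = sum((t[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(t[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against rounding
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    e2 = 3.0 * q - e1 - e3  # the trace is the sum of the eigenvalues
    return [e1, e2, e3]

def tresca(t):
    """Largest difference between two principal stresses."""
    t1, t2, t3 = principal_stresses(t)
    return max(abs(t1 - t2), abs(t2 - t3), abs(t1 - t3))

def mises_principal(t):
    """Formula (17): one third of the sum of squared principal-stress differences."""
    t1, t2, t3 = principal_stresses(t)
    return ((t1 - t2) ** 2 + (t2 - t3) ** 2 + (t1 - t3) ** 2) / 3.0

def mises_deviator(t):
    """Sum of squares of the components of the stress deviator."""
    m = (t[0][0] + t[1][1] + t[2][2]) / 3.0
    return sum((t[i][j] - (m if i == j else 0.0)) ** 2
               for i in range(3) for j in range(3))

# Hypothetical sample stress matrix (principal stresses 5, 3, 1).
stress = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 5.0]]
print(tresca(stress), mises_principal(stress), mises_deviator(stress))
```

The deviator form needs no eigenvalue computation at all, which is one concrete sense in which the von Mises condition is easier to handle than the Tresca condition.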
Throughout this volume we will use the von Mises plasticity condition. The
convex structure of (16) enables us to use the methods of convex analysis in
plasticity theory.
The basic equations of motion for von Mises liquids and also the more
general basic equations for viscoplastic flows will be discussed in Part V (basic
equations of rheology for non-Newtonian fluids).
Linear quasi-statical plasticity theory was proposed by Hencky (1924), and
linear quasi-dynamical plasticity theory dates back to Prandtl (1924) and was
further developed by Reuss (1930).
A decisive stride towards a rigorous mathematical theory of plasticity was
made by Duvaut and Lions (1972, M), using the modern theory of varia-
tional inequalities in order to prove existence theorems. In doing so, they
paved the way for applications of nonlinear functional analysis in plasticity
and related subjects.
The very fruitful idea of internal state variables was introduced by Nguyen
(1973), and has been used already in Section 60.3 in order to describe harden-
ing effects.

We consider the following problems related to plasticity:
(i) Existence theorem in linear quasi-statical plasticity (Section 62.10).
(ii) Existence and uniqueness theorem in statical plasticity (Section 62.11).
(iii) A general quasi-dynamical model in plasticity based on internal state
variables (Chapter 66).
(iv) Plastic torsion (Part V).
(v) The basic equations for viscoplastic flows of non-Newtonian liquids,
existence and uniqueness for tube flows, and general flows (Part V). We
consider, for example, viscous Navier-Stokes liquids and more general
viscous Williamson liquids, viscoplastic Bingham liquids, and ideal plastic
von Mises liquids.

References to the Literature

Classical works in elasticity: Galilei (1638), Hooke (1678) (basic experiments),
Bernoulli (1705) and Euler (1744) (bending of beams), Navier (1821), Cauchy (1827),
(1828) (foundation of the general theory), Green (1839) (energy principle).
History of the theory of elasticity: Love (1906), Gurtin (1972, S), Truesdell (1968, M),
(1983, S), and Antman (1983, S).
Classical works in plasticity: Tresca (1864) (basic experiments), Saint Venant (1871),
Levy (1871), Haar and von Karman (1909) (statical plasticity theory), von Mises (1913)
(fundamental paper: quadratic plasticity condition and equations of motion for ideal
plastic liquids), Hencky (1924) (quasi-statical plasticity theory), Prandtl (1924) and
Reuss (1930) (quasi-dynamical plasticity theory), Moreau (1968), Duvaut and Lions
(1972, M) (variational inequalities), Nguyen (1973) (internal state variables).
History of plasticity theory: Hill (1950, M), Prager and Hodge (1951, M), Geiringer
(1972, S) (handbook article). (See also the detailed References to the Literature for
Chapter 66 on plasticity.)
CHAPTER 61

Basic Equations of
Nonlinear Elasticity Theory

The geometers who have investigated the equations of equilibrium or motion
of thin plates or surfaces have distinguished two kinds of forces, one produced
by extension or contraction and the other by the bending of surfaces.... It has
seemed to me that these two kinds of forces could be reduced to a single one,
which always ought to be called tension or pressure; this force acts upon each
element of a section, chosen at will, not only in a flexible surface but also in a
solid.
Augustin Louis Cauchy (1827)
The principal creator of three-dimensional elasticity is Cauchy (1789-1857).
Mainly for use in three-dimensional hydrodynamics, Euler (1707-1783) had
introduced general mappings of regions and had created the associated calculus
of partial derivatives, chain rules, Jacobian determinants, etc; he also had formulated
the general principles of linear and angular momentum and had shown
how to apply them to fluids.
Cauchy mastered all this and turned it to use in elasticity.... It is fair to say
that much of the algebra of vectors, matrices, and tensors grew out of Cauchy's
work on the strain, local rotation, and stress in elastic bodies.
Clifford Ambrose Truesdell (1983)
For a sufficiently small volume element, the general small change in the position
of a deformable body can be represented as the sum of a translation, a rotation,
and an extension or contraction in three orthogonal directions.
Hermann Helmholtz (1858)
The following principle, which goes back to Cauchy, is fundamental in stress
analysis. If one imagines a volume element which is taken from an elastic body,
then the outer forces, the inertial forces and the stress forces which act on the
surface of the volume element, are in an equilibrium.
Erich Trefftz (1928)
If methods can be found, which admit a complete understanding of a given
system, starting from a single atom and ending with the entire body, only then
will a deeper knowledge in elasticity theory be gained.
Adolf Busemann and Otto Föppl (1928)


There are many reasons why nonlinear elasticity is not widely known in the
scientific community:
(i) It is basically a new science whose mathematical structure is only now
becoming clear.
(ii) Reliable expositions of the theory often take a couple of hundred pages to
get to the heart of the matter.
(iii) Many expositions are written in a complicated indicial notation that boggles
the eye and turns the stomach.
Stuart Antman (1984)

The goal of elasticity theory is the computation of deformations of elastic
bodies and the corresponding stress forces. These deformations need not
necessarily be small. Unfortunately, at present, there is no general nonlinear
existence theory available. This makes the study of the literature quite difficult.
Lacking this comprehensive general theory, a great number of models are used
which are based on different approximation assumptions. These assumptions,
however, are often not explicitly formulated and their foundation seems
doubtful. Difficulties arise mainly from the fact that often there is no strict
distinction between the different regions which correspond to the undeformed
and deformed body.
We will try to present here an approach which might help the reader to
understand the different models and approximation assumptions from a
general and rigorous point of view. In what follows, we want to describe our
general strategy, which is represented schematically in Figure 61.1.

Basic Equations and Typical Difficulties

In Section 61.3 we formulate the general basic equations of nonlinear elasticity
theory. They consist of:
(a) time-dependent equations of motion for the deformed region of the elastic
body; and
(b) constitutive laws which describe the connection between deformation and the
crucial stress tensor $\tau$.
The constitutive laws reflect the specific properties of the different elastic
materials. The two main difficulties are the following:
(i) The stress tensor $\tau$ refers to the unknown deformed region of the body.
(ii) The nonlinear constitutive laws are by no means uniquely determined by
general physical principles.
In order to avoid difficulty (i) we use, in Section 61.5, the following crucial
procedure. We replace the stress tensor $\tau$ in the deformed region with the
reduced stress tensor $\sigma$ in the undeformed region, which is sometimes also
called the first Piola-Kirchhoff stress tensor. In contrast to Section 61.3, we
thereby obtain basic equations in the undeformed region of the body. The key
observation is that $\tau$ in the deformed region can be computed from $\sigma$ in the

[Figure 61.1. Schematic of the general strategy. Boxes: equations of motion for systems of mass points (Newton, 1687); theory of invariants; balance of momentum; balance of angular momentum; symmetry of the stress tensor $\tau$ (Cauchy, 1827); equations of motion for elastic bodies (Cauchy, 1827); constitutive law: strain-stress relation for specific materials (Hooke, 1678; Cauchy, 1827; theorem of Rivlin and Ericksen, 1955); basic equations of elastostatics; principle of virtual work (power) (Johann Bernoulli, 1717); basic equations of hydrodynamics (Euler, 1757; Navier, 1822; and Stokes, 1845); principle of stationary (minimal) energy (Green, 1839); calculus of variations (Euler, 1744; Lagrange, 1762).]


undeformed region. We thereby use the famous Piola transformation. It is
therefore sufficient to solve the basic equations in the undeformed region.
The stress forces, which are effective in the deformed body, depend on the
stress tensor $\tau$. It is important that these forces can also be computed by using
the reduced stress tensor $\sigma$. We emphasize, however, that one has to distinguish
strictly between $\tau$ and $\sigma$, which is sometimes not clearly expressed in the
literature.
As a consequence of general physical arguments $\tau$ is always symmetric,
whereas the same need not be true for $\sigma$. We also introduce the symmetric
second Piola-Kirchhoff tensor $S$ in the undeformed region. The advantage of
$S$ is that it allows an especially simple formulation of the constitutive laws, as
will be shown in Section 61.8.
In the stationary case, i.e., in the case of elastostatics, the basic equations
lead to the principle of virtual work or, equivalently, to the principle of virtual
power. This is discussed in Section 61.5.

Variational Approach

In order to reduce difficulty (ii) above we formulate, in Section 61.6, a class of
variational problems
(P) $\int_G L(x, u'(x))\,dx - \int_G Ku\,dx - \int_{\partial_2 G} Tu\,dO = \text{stationary!},$
$u = u_0$ on $\partial_1 G$,
which corresponds to the principle of stationary potential energy. In this case
we speak of hyperelastic material. The corresponding Euler equations are
(E) $\operatorname{div}\sigma + K = 0$ on $G$,
$u = u_0$ on $\partial_1 G$,
$\sigma n = T$ on $\partial_2 G$.
The important point is that (E) agrees with the basic equations of elastostatics.
The function $L$ represents the density of the elastic potential energy and is
also called the stored energy function. The advantage of this variational
approach is that from $L$, one immediately obtains the stress tensor $\sigma$ and the
constitutive law by using the process of differentiation, i.e.,
$\sigma = L_{u'}.$
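The passage from (P) to (E) can be sketched formally as follows. This is only an outline under assumptions not stated in the text: $u$ and $L$ are taken to be smooth, the test displacements $h$ vanish on $\partial_1 G$, Cartesian components are used, and $\sigma_{ij}$ abbreviates $\partial L/\partial(D_j u_i)$.

```latex
% Formal first variation of (P) in the direction h, with h = 0 on \partial_1 G
% (assumption: enough smoothness to integrate by parts).
\begin{aligned}
0 &= \int_G \sigma_{ij}\, D_j h_i \,dx - \int_G K_i h_i \,dx
     - \int_{\partial_2 G} T_i h_i \,dO \\
  &= -\int_G \bigl(D_j \sigma_{ij} + K_i\bigr)\, h_i \,dx
     + \int_{\partial_2 G} \bigl(\sigma_{ij} n_j - T_i\bigr)\, h_i \,dO .
\end{aligned}
% Since h is arbitrary in G and on \partial_2 G, this gives
% div \sigma + K = 0 on G and \sigma n = T on \partial_2 G.
```

The boundary term on $\partial_1 G$ drops out because $h = 0$ there, which is why the displacement condition $u = u_0$ is imposed directly rather than derived.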
Moreover, one finds
stability criteria;
duality relations between displacement $u$ and the stress tensor $\sigma$; and
an elegant method for constructing approximation models.
In Section 61.7 we will show how, for a number of models, one obtains explicit
expressions for the elastic energy, i.e., for $L$. We thereby reach our goal of
obtaining a maximum of information from a minimum of model assumptions.
Note that not all basic equations (E) of elastostatics are Euler equations of
a variational problem (P), i.e., elastic material need not be hyperelastic.

Theory of Invariants and General Constitutive Laws

In Section 61.8 we want to show how, in the framework of the exact theory,
the theory of invariants can be used to obtain the general structure of constitutive
laws and stored energy functions. We emphasize that
there exist serious restrictions on the form of the exact constitutive
laws.

An important role in this connection is played by the axiom of material frame
indifference, which, roughly speaking, postulates that constitutive laws are
invariant under rotations.
On the other hand, many constitutive laws, which are used in the literature,
do not satisfy these restrictions. Such constitutive laws only correspond to
approximation models.

Approximation Models

In order to obtain approximation models, the following two options are
available:
(a) One starts from the basic equations of elasticity (or the principle of virtual
work) and neglects appropriate terms.
(b) One uses the variational approach and neglects appropriate terms in the
variational problem, i.e., the stored energy function $L$ is replaced with an
appropriate approximation $L_{\mathrm{approx}}$.
Throughout this volume we will stress the fact that method (b) is much
simpler than method (a). In (a) one has to control approximations of a complex
system of equations (or, in the case of the principle of virtual work, an integral
identity with many components). In (b), on the other hand, only approximations
of a single function $L$ need to be controlled. Furthermore, the function
L has a significant physical meaning (density of elastic energy). Method (b)
corresponds to a general strategy in theoretical physics:
Use variational principles via Lagrangian functions L and try to obtain im-
portant physical and mathematical information from the structure of L.
Let us note that method (a) can only be recommended if the basic equations
(E) do not correspond to a variational principle. This, however, is an excep-
tional case.

General Existence Theory

In Chapters 61 and 62 our main goal will be the proof of general existence
theorems. We thereby consider two important ways to approach nonlinear
elasticity:
(I) Local existence and uniqueness of smooth solutions via the implicit function
theorem and the continuation method (Section 61.12).
(II) Compensated compactness and global existence for polyconvex material
(Section 62.13).
In connection with (I) we shall also explain how this method is related to
stability and bifurcation, and, in addition, we will present an important ap-
proximation method.

The simple key idea in (I) is the following. Write the basic equations of
elastostatics as an operator equation
(E) $F(u) = (K, u_0),$
where $u$ is the unknown displacement of the elastic body, $K$ denotes the given
density of outer forces, and $u_0$ is the given displacement of the boundary of
the body. It is important then that the linearization
$F'(\bar u)h = (\bar K, \bar u_0)$
of (E), at a so-called strongly stable state of the deformation $\bar u$, corresponds to
a linear strongly elliptic system. The general theory of such systems, which
will be discussed in Section 61.11, implies that, with respect to suitable function
spaces (Hölder spaces or Sobolev spaces), $F'(\bar u)$ is a bijective operator. From
the implicit function theorem (Theorem 4.B) we then immediately obtain that
equation (E) has unique solutions $u$ in a neighborhood of $\bar u$ if $K$ and $u_0$ lie in
a small neighborhood of $\bar K$ and $\bar u_0$, respectively. Here $F(\bar u) = (\bar K, \bar u_0)$.
If we start, for example, with a known strongly stable initial state $\bar u$,
say the rest state $\bar u = 0$, then this solution of equation (E) can be continued as
long as the continuation remains strongly stable. The discretization of this
continuation procedure yields an important approximation method, whose
convergence can easily be proved by using well-known methods for ordinary
differential equations in B-spaces. This will be done in Section 61.16.
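To make the idea of a discretized continuation concrete, here is a toy scalar analogue in Python. It is a sketch under strong assumptions and not the method of Section 61.16: the operator is replaced by the hypothetical scalar function F(u) = u + u³, "strongly stable" is replaced by the condition F'(u) > 0, the load is scaled by a parameter lam running from 0 to 1, and each Euler step is followed by a few Newton corrections (an addition made here for robustness).

```python
def continuation_solve(F, dF, K, steps=20, newton_iters=3):
    """Follow the solution of F(u) = lam * K from lam = 0 (rest state u = 0) to lam = 1."""
    u = 0.0
    for k in range(1, steps + 1):
        lam = k / steps
        # Euler predictor: differentiating F(u(lam)) = lam * K gives u'(lam) = K / F'(u).
        u += (K / dF(u)) / steps
        # Newton corrector at the new load level.
        for _ in range(newton_iters):
            u -= (F(u) - lam * K) / dF(u)
    return u

# Hypothetical toy "operator": strictly increasing, so F'(u) > 0 everywhere.
F = lambda u: u + u ** 3
dF = lambda u: 1.0 + 3.0 * u ** 2

u_final = continuation_solve(F, dF, K=10.0)
print(u_final, F(u_final))
```

The predictor step is the explicit Euler discretization of an ordinary differential equation in the load parameter, which is the link to ordinary differential equations mentioned above.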
In Section 61.14, we shall show that the loss of stability of an equilibrium
state can lead to bifurcation, i.e., to new equilibrium states.
The basic idea of (II) is the following. We begin by applying standard
arguments of existence theory for variational problems from Chapter 38, i.e.,
we choose a minimal sequence and select a weakly convergent subsequence.
The important fact here is that the functional in the variational problem
(the potential energy of the elastic body) is not convex. Thus we have the usual
difficulties in proving that the weak limit of the minimal sequence is a solution
of our problem. This difficulty can be overcome by using weak limit properties
of integral identities which typically contain terms of the form $\det A$ and $\operatorname{adj} A$.
In this connection, the skew symmetry of $\det A$ and $\operatorname{adj} A$ plays a decisive
role. In order to apply this technique it is crucial to have the polyconvexity
of the energy functional.

Existence Theory for Approximation Models


In (I) and (II) above the full basic equations of nonlinear elasticity are solved
without using any additional approximation assumptions. We will, however,
also consider several approximation models:
(i) Korn's inequality and generalized solutions in linear elastostatics (application
of the main theorem on quadratic variational problems (Theorem
22.A)).
(ii) Classical solutions in linear elastostatics (strongly elliptic systems).
(iii) Generalized solutions in linear elastodynamics (application of the main
theorem on linear second-order evolution equations (Theorem 24.A)).
(iv) Duality theory for a class of models with linear and nonlinear Hooke's
law, e.g., linear elastic material and nonlinear Hencky material (application
of the main theorem of duality theory for monotone potential operators
(Theorem 51.D)).
(v) Existence theorem in linear quasi-statical plasticity (quadratic variational
inequalities).
(vi) Existence and uniqueness theorem in linear statical plasticity via duality.

As we will show in Chapter 62, the advantage of duality theory is that it
leads to effective error estimates for approximation methods (Ritz method)
and that it allows a simple approach to statical plasticity theory.

Concrete Problems

In Chapters 63-66 the following topics will be discussed:
(α) Signorini problem for nonlinear material (application of the main theorem
on elliptic variational inequalities (Theorem 54.A)).
(β) Buckling of rods and beams and bifurcation theory for variational inequalities.
(γ) Buckling of plates (application of the main theorem on pseudomonotone
operators (Theorem 27.A) and of the main theorem of bifurcation theory
for potential operators (Theorem 45.A)).
(δ) Quasi-dynamical plasticity theory (application of the main theorem on
evolution variational inequalities of first order (Theorem 55.A)).

Invariant Formulation and Functional Analysis

Elastic properties of bodies do not depend on the choice of the coordinate
system. Thus the theory is developed in an invariant way, i.e., coordinate-free.
We thereby use methods of functional analysis in the space $V_3$. The stress
tensor, for example, is regarded as a linear operator
$\tau\colon V_3 \to V_3,$
while its coordinate representation $(\tau^i_j)$ plays only a secondary role. As
in vector calculus, this functional analytic approach simplifies the formulas.
One works with geometrical objects rather than with coordinates. The
coordinate formulas, however, can easily be obtained from these invariant
formulas and will also be given.

Strain Tensor, Stress Tensor, and Constitutive Laws

The two most important notions in elasticity are:
(i) strain tensor $\mathcal{E}$; and
(ii) stress tensor $\tau$.
As we will show in Section 61.2, $\mathcal{E}$ contains all the important information
about the geometry of deformations. The stress tensor $\tau$ describes the stress
forces acting in an elastic body. We note here that those two basic notions
were already introduced by Cauchy (1827). In Sections 61.3 and 61.4 we will
discuss the properties of $\tau$. The strain tensor $\mathcal{E}$ depends in a nonlinear fashion
on the displacement vector $u(x)$ of the elastic body, i.e.,
$\mathcal{E}(x) = \tfrac12\big(u'(x) + u'(x)^* + u'(x)^*u'(x)\big),$
where
$y = x + u(x)$
describes the deformation of the elastic body. The nonlinear relation between
$u$ and $\mathcal{E}$ is responsible for the fact that elasticity theory and hydrodynamics
are basically nonlinear theories. The key relation for the stress tensor $\tau$ is
$\int_{\partial H'} \tau n'\,dO = \text{stress force acting on } H'$
for all deformed subregions $H'$. Here, $n'$ is the outer unit normal vector to $\partial H'$.
For a homogeneous body, the constitutive law, i.e., the relation between strain
and stress, is given by
$S = B(\mathcal{E}),$
where $S$ denotes the second Piola-Kirchhoff tensor, i.e.,
$\tau(y) = y'(x)S(x)y'(x)^* \det x'(y).$
In the case of hyperelastic material, $B$ is a potential operator, i.e.,
$S = A'(\mathcal{E}),$
and for the stored energy function we obtain
$L = A(\mathcal{E}).$
These important relations show the connection between strain, stress, and
elastic potential energy.
It is a remarkable fact that the tensors $\mathcal{E}$, $\tau$, and $S$ are symmetric, i.e., they
are symmetric linear operators on the H-space $V_3$. This allows us to apply the
principal axis theorem to these operators, i.e., these operators have three
orthonormal eigenvectors with corresponding real eigenvalues in $V_3$. The
eigenvalues of $(I + 2\mathcal{E}(x))^{1/2}$ are called the principal strains at the point $x$ and
the eigenvalues of $\tau(y)$ are called the principal stresses at the point $y$. The
intuitive meaning of these notions will be explained in Sections 61.2b and
61.3b.

61.1. Notations
We use again the notations of Section 58.1 and add several new ones. Let
$L(V_3)$ and $L_{\mathrm{sym}}(V_3)$
denote the spaces of linear and of linear, symmetric operators
$\gamma\colon V_3 \to V_3,$
respectively. For $\gamma, \mu \in L(V_3)$ we define
$\operatorname{tr}\gamma = \gamma^i_i,$ (1)
$[\gamma, \mu] = \operatorname{tr}(\gamma\mu) = \gamma^i_j\mu^j_i,$ (2)
$\det\gamma = \det(\gamma^i_j).$ (3)
Here $\operatorname{tr}\gamma$ is called the trace of $\gamma$. As usual, the sum is taken over two equal
upper and lower indices from 1 to 3. For example,
$\operatorname{tr}\gamma = \sum_{i=1}^{3}\gamma^i_i.$

These definitions are independent of the coordinate system. This can be
verified explicitly. More elegantly, the invariance is deduced from the general
tensor calculus of Chapter 74, since no free indices occur in (1) and (2). For
symmetric $\gamma$ we have
$\operatorname{tr}\gamma = \lambda_1 + \lambda_2 + \lambda_3,$
where the $\lambda_i$ are the eigenvalues of $\gamma$.
On $L(V_3)$ we introduce a scalar product through
$(\gamma|\mu) = [\gamma, \mu^*]$ for all $\gamma, \mu \in L(V_3)$,
where the star denotes the adjoint operator on the H-space $V_3$. In Cartesian
coordinates this gives
$(\gamma|\mu) = \sum_{i,j=1}^{3}\gamma^i_j\mu^i_j.$

As a special case, we obtain in $L_{\mathrm{sym}}(V_3)$ the scalar product
$(\gamma|\mu) = [\gamma, \mu]$ for all $\gamma, \mu \in L_{\mathrm{sym}}(V_3)$.
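The bracket and the scalar product can be checked on small matrices. The following Python sketch assumes Cartesian coordinates, so that operators are represented by 3×3 matrices and the adjoint by the transpose; the function names and sample matrices are chosen for the example only.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def trace(a):
    return a[0][0] + a[1][1] + a[2][2]

def bracket(g, m):
    """[g, m] = tr(g m)."""
    return trace(matmul(g, m))

def scalar_product(g, m):
    """(g|m) = [g, m*]; in Cartesian coordinates the sum of g_ij * m_ij."""
    return bracket(g, transpose(m))

# Hypothetical integer sample matrices.
g = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
m = [[2, 0, 1], [1, 1, 0], [0, 5, 2]]
print(bracket(g, m), scalar_product(g, m))

# For symmetric operators the two expressions coincide.
s1 = [[1, 2, 3], [2, 0, 1], [3, 1, 4]]
s2 = [[0, 1, 0], [1, 2, 5], [0, 5, 1]]
print(bracket(s1, s2) == scalar_product(s1, s2))
```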
If, furthermore, $\gamma(\cdot)$ is a tensor field, i.e., $\gamma(x) \in L(V_3)$ for all $x$ in a subset of
$V_3$, then we define the divergence of $\gamma$,
$\operatorname{div}\gamma(x),$
in an invariant way through the integral formula
$\int_G \operatorname{div}\gamma\,dx = \int_{\partial G} \gamma n\,dO$ (4)
with outer unit normal vector $n$. Using integration by parts in Cartesian
coordinates, we obtain
$\operatorname{div}\gamma(x) = D^j\gamma^i_j(x)e_i$ (5)
with $x = \xi^je_j$ and $D^j = D_j = \partial/\partial\xi^j$. For smooth $\gamma(\cdot)$ the invariance of $\operatorname{div}\gamma(x)$
follows from
$\operatorname{div}\gamma(x) = \lim_{G \to x} \frac{1}{\operatorname{meas} G}\int_{\partial G} \gamma n\,dO.$
This relation is independent of the coordinate system.
Let $A\colon V_3 \to V_3$ be a bijective linear operator. We then define the operator
$\operatorname{adj} A = (\det A)A^{-1},$
and observe that
$\operatorname{adj}(AB) = \operatorname{adj} B\,\operatorname{adj} A, \qquad (\operatorname{adj} A)^* = \operatorname{adj} A^*.$
In Section 61.2 we will consider the Piola identity
$\operatorname{div}\operatorname{adj} y'(x)^* = 0,$
which plays a key role in nonlinear elasticity.
In Cartesian coordinates, the coordinates of $\operatorname{adj} A$ are the adjoint subdeterminants
of the matrix corresponding to $A^*$. We use this property to
define $\operatorname{adj} A$ for an arbitrary linear operator $A\colon V_3 \to V_3$. Then
$\operatorname{adj} A\colon V_3 \to V_3$
is a linear operator as well, and its definition is independent of the choice of
the Cartesian coordinate system.
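The stated rules for adj can be verified exactly on integer matrices. The Python sketch below assumes Cartesian coordinates (operators as 3×3 matrices, adjoint as transpose) and builds adj A from cofactors, so it also works when A is not invertible; the helper names and sample matrices are assumptions for the example.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def cofactor(a, i, j):
    # Signed 2x2 minor of a 3x3 matrix, written with cyclic index shifts.
    i1, i2, j1, j2 = (i + 1) % 3, (i + 2) % 3, (j + 1) % 3, (j + 2) % 3
    return a[i1][j1] * a[i2][j2] - a[i1][j2] * a[i2][j1]

def det3(a):
    return sum(a[0][j] * cofactor(a, 0, j) for j in range(3))

def adj(a):
    """Transposed cofactor matrix; equals det(a) * inverse(a) when a is invertible."""
    return [[cofactor(a, j, i) for j in range(3)] for i in range(3)]

# Hypothetical integer sample matrices.
A = [[2, 1, 0], [0, 1, 3], [1, 0, 1]]
B = [[1, 2, 0], [0, 1, 1], [2, 0, 1]]

d = det3(A)
scaled_identity = [[d if i == j else 0 for j in range(3)] for i in range(3)]
print(matmul(A, adj(A)) == scaled_identity)         # A adj A = (det A) I
print(adj(matmul(A, B)) == matmul(adj(B), adj(A)))  # adj(AB) = adj B adj A
print(transpose(adj(A)) == adj(transpose(A)))       # (adj A)* = adj A*
```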
Given two vectors $a, b \in V_3$, we define the dyadic product
$a \circ b$
as the uniquely determined linear operator on $V_3$ with
$(a \circ b)x = a(bx)$ for all $x \in V_3$.

Moreover, we will use the following notation:
$u$ displacement vector;
$\mathcal{E}$ strain tensor;
$\gamma$ linearized strain tensor;
$\tau$ stress tensor (in the deformed region);
$\sigma$ first Piola-Kirchhoff stress tensor (reduced stress tensor in the undeformed
region);
$S$ second Piola-Kirchhoff tensor (in the undeformed region);
$x$ position vector in the undeformed region;
$y$ position vector in the deformed region, $y = x + u(x)$;
$G$ undeformed region;
$G'$ deformed region (deformation of $G$);
$H$ undeformed subregion of $G$;
$H'$ deformation of $H$;
$F$ density of outer forces in the deformed region;
$K$ density of the outer forces with respect to the undeformed region;
$T$ density of the outer boundary forces with respect to the undeformed
region;
$L$ stored energy function;
$U$ elastic potential energy;
$\rho$ mass density of the deformed body;
$\rho_0$ mass density of the undeformed body;
$\tilde\gamma$ deviation of $\gamma$, $\tilde\gamma = \gamma - 3^{-1}(\operatorname{tr}\gamma)I$.

61.2. Strain Tensor and the Geometry of Deformations
Our first goal is to introduce the strain tensor $\mathcal{E}$. We thereby keep in mind that
(i) $\mathcal{E}$ is invariant under rigid motions (rotations and translations); and
(ii) $\mathcal{E}$ contains all necessary information about the geometry of the deformation.
We shall use the following two steps.
(a) First, we consider the so-called stretch tensor $E$, which is defined for all
orientation-preserving linear operators from $V_3$ to $V_3$.
(b) Second, we consider the linearizations of arbitrary deformations. The
corresponding stretch tensor then leads to $\mathcal{E}$.
A stretch tensor is a linear, symmetric, strongly positive operator
$E\colon V_3 \to V_3$. Such a tensor has a simple intuitive interpretation. From the
principal axis theorem it follows that there exists an orthonormal system of
eigenvectors $\{e_1, e_2, e_3\}$ with
$Ee_i = (1 + \lambda_i)e_i, \quad i = 1, 2, 3.$ (6)

The positive eigenvalues $1 + \lambda_i$ of $E$ are called principal strains. Thus $E$
describes a transformation with a strain of
$1 + \lambda_i$
in the direction of the $e_i$-axis for all $i$. These axes are called the principal axes
of strain.
We now consider, more generally, an arbitrary affine, orientation-preserving
map of $V_3$. Such a map has the form
$y = Ah + b,$ (7)
where $A \in L(V_3)$ and $\det A > 0$. The fixed element $b \in V_3$ describes a translation.
We want to represent transformation (7) as the product of a translation,
a rotation, and a stretch tensor. This decomposition is of central importance
in elasticity. If we consider physical quantities which occur in connection with
elastic deformations, then we expect that such quantities, for example, the
elastic energy, do not depend on the rotations and translations of the body,
but only on the strain described by the corresponding stretch tensor. As usual,
we mean by a rotation an operator $R \in L(V_3)$ with
$R^*R = I$ and $\det R = 1$.

The following proposition contains the key to the strain tensor $\mathcal{E}$ below.

Proposition 61.1 (Normal Form). If $A\colon V_3 \to V_3$ is a linear operator with
$\det A > 0$, then it can be uniquely written as
$A = RE,$
i.e., as a product of a stretch tensor $E$ and a rotation $R$. It is
$E = (A^*A)^{1/2}$ and $R = AE^{-1}$.

PROOF. The operator $A^*A$ is symmetric and strongly positive. This follows
from $(A^*A)^* = A^*A$ and
$(x|A^*Ax) = (Ax|Ax) \ge 0$
as well as from the fact that $\det A^*A = \det A^* \det A > 0$.
The uniqueness of $E$ is a consequence of $E = R^{-1}A$ and
$E^2 = E^*E = (R^{-1}A)^*(R^{-1}A) = A^*(R^{-1})^*R^{-1}A = A^*A.$
Hence we obtain $E = (A^*A)^{1/2}$.
The existence of the decomposition follows by letting
$E = (A^*A)^{1/2}$ and $R = AE^{-1}$. □
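The decomposition A = RE can be computed numerically. The Python sketch below does not form the square root of A*A as in the proof; instead it uses a different, well-known technique, the Newton iteration R ← ½(R + R^{-T}), which converges to the rotation factor when det A > 0, and then sets E = R*A. Cartesian coordinates are assumed (adjoint = transpose), and all names and the sample matrix are chosen for the example.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def cofactor(a, i, j):
    # Signed 2x2 minor of a 3x3 matrix, written with cyclic index shifts.
    i1, i2, j1, j2 = (i + 1) % 3, (i + 2) % 3, (j + 1) % 3, (j + 2) % 3
    return a[i1][j1] * a[i2][j2] - a[i1][j2] * a[i2][j1]

def det3(a):
    return sum(a[0][j] * cofactor(a, 0, j) for j in range(3))

def inverse(a):
    d = det3(a)
    return [[cofactor(a, j, i) / d for j in range(3)] for i in range(3)]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

def polar_decomposition(a, iterations=60):
    """Return (R, E) with a = R E, R a rotation, E symmetric; assumes det(a) > 0."""
    r = [row[:] for row in a]
    for _ in range(iterations):
        r_inv_t = transpose(inverse(r))
        r = [[0.5 * (r[i][j] + r_inv_t[i][j]) for j in range(3)] for i in range(3)]
    e = matmul(transpose(r), a)  # E = R^{-1} A = R* A
    # Symmetrize to remove rounding noise.
    e = [[0.5 * (e[i][j] + e[j][i]) for j in range(3)] for i in range(3)]
    return r, e

# Hypothetical sample operator with det A = 5 > 0.
A = [[2.0, 1.0, 0.0], [0.0, 1.0, 3.0], [1.0, 0.0, 1.0]]
R, E = polar_decomposition(A)
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
print(max_diff(matmul(R, E), A))                         # close to 0: A = R E
print(max_diff(matmul(transpose(R), R), identity))       # close to 0: R*R = I
print(max_diff(matmul(E, E), matmul(transpose(A), A)))   # close to 0: E^2 = A*A
```

The last check mirrors the uniqueness argument of the proof: whatever method produces R, the stretch factor must satisfy E² = A*A.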

61.2a. Taylor Expansion and Strain Tensor


We now study the local behavior of a $C^1$-map
$y = x + u(x),$ (8)
from the space $V_3$ into itself. The Taylor expansion at the point $x$ yields
$y(x + h) = x + u(x) + h + u'(x)h + o(\|h\|), \quad h \to 0,$ (9)
and thus the linearization of the map (8) at the point $x$ has the form
$y = b + Ah$
with
$b = x + u(x)$ and $A = I + u'(x)$.
Proposition 61.1 then shows that the corresponding local stretching can be
described by
$E = (A^*A)^{1/2}$
with
$A^*A = I + u'(x) + u'(x)^* + u'(x)^*u'(x).$
We set
$A^*A = I + 2\mathcal{E}(x).$

Definition 61.2. The linear operator $\mathcal{E}(x)\colon V_3 \to V_3$ with
$\mathcal{E}(x) = \tfrac12\big(u'(x) + u'(x)^* + u'(x)^*u'(x)\big)$
is called the strain tensor of the transformation (8) at the point $x$.
The linear operator $\gamma(x)\colon V_3 \to V_3$ with
$\gamma(x) = \tfrac12\big(u'(x) + u'(x)^*\big)$
is called the linearized strain tensor at $x$.
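A small numerical experiment shows the difference between the two tensors of Definition 61.2. The Python sketch below assumes Cartesian coordinates (so u'(x) is a 3×3 matrix and the adjoint is the transpose) and takes a pure rigid rotation about the third axis as a hypothetical test deformation. The strain tensor vanishes for it, while the linearized strain tensor does not.

```python
import math

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def strain_tensors(grad_u):
    """Return (strain, linearized_strain) for a 3x3 displacement gradient u'(x)."""
    gt = transpose(grad_u)
    gtg = matmul(gt, grad_u)
    strain = [[0.5 * (grad_u[i][j] + gt[i][j] + gtg[i][j]) for j in range(3)]
              for i in range(3)]
    linearized = [[0.5 * (grad_u[i][j] + gt[i][j]) for j in range(3)]
                  for i in range(3)]
    return strain, linearized

# Rigid rotation y = Q x by the (hypothetical) angle 0.3, i.e., u'(x) = Q - I.
theta = 0.3
c, s = math.cos(theta), math.sin(theta)
rotation = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
grad_u = [[rotation[i][j] - (1.0 if i == j else 0.0) for j in range(3)] for i in range(3)]

strain, linearized = strain_tensors(grad_u)
print(max(abs(v) for row in strain for v in row))  # essentially zero
print(linearized[0][0])                            # cos(theta) - 1, not zero
```

This illustrates why only the full strain tensor, and not its linearization, is invariant under finite rigid rotations.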

It is clear that both l(x) and y(x) are symmetric linear operators on V3 •
In the Cuture weshall calll and y simply strain tensors. We note, however,
that I and y will always refer to the strain tensor and the linearized strain
tensor, respectively. From
u'(x)h = (D1u1(x)hi)e1
we obtain in Cartesian coordinates the representation
lj == !(D1u 1 + D 1u1 + D 1u,.D1u") (10)
for the matrix (lj) which corresponds to l(x) according to Section 58.1. As
usual, we set Di = D1 = o/o~J and x = ~le1 , u" = u,. = u"(x). The matrix which
61.2. Strain Tensor and the Geometry of Deformations 171

corresponds to y(x) satisfies


yj = !(Diu• +D ui).
1 (11)
In order to give some intuitive meaning to $\gamma(x)$, we set
$\gamma_- = \tfrac12 (u'(x) - u'(x)^*)$.
From (9) it then follows that
$y(x + h) = (x + u(x)) + (h + \gamma(x)h + \gamma_-(x)h) + o(\|h\|)$, $h \to 0$. (12)
Passing to Cartesian coordinates we obtain
$\gamma_-(x)h = \omega \times h$ for all $h \in V_3$
with the fixed vector
$\omega = \tfrac12 \operatorname{curl} u(x)$.
Hence it follows from (12) that:
The linearization of the transformation $y = x + u(x)$ at the point $x$ is the sum of
(i) a translation by the vector $x + u(x)$;
(ii) the stretching $(I + \gamma(x))h$; and
(iii) the infinitesimal rotation $\gamma_-(x)h$.
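
The splitting of $u'(x)$ into a symmetric and an antisymmetric part can be checked numerically. The sketch below uses an arbitrary, assumed displacement gradient `grad_u` (entry `[i][j]` standing for $D_j u^i$) and an arbitrary vector `h`; it compares the antisymmetric part applied to `h` with the cross product of half the curl with `h`.

```python
# Displacement gradient u'(x) at a fixed point; the numbers are assumed
# and serve only as an illustration. grad_u[i][j] stands for D_j u^i.
grad_u = [[0.01, 0.04, -0.02],
          [0.00, 0.03, 0.05],
          [0.02, -0.01, 0.00]]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Symmetric part (linearized strain) and antisymmetric part of u'(x).
gamma = [[0.5 * (grad_u[i][j] + grad_u[j][i]) for j in range(3)]
         for i in range(3)]
gamma_minus = [[0.5 * (grad_u[i][j] - grad_u[j][i]) for j in range(3)]
               for i in range(3)]

# Half the curl of u, with (curl u)_1 = D_2 u^3 - D_3 u^2 and cyclically.
omega = [0.5 * (grad_u[2][1] - grad_u[1][2]),
         0.5 * (grad_u[0][2] - grad_u[2][0]),
         0.5 * (grad_u[1][0] - grad_u[0][1])]

h = [0.3, -1.2, 0.7]
rotation_part = matvec(gamma_minus, h)   # antisymmetric part applied to h
cross_product = cross(omega, h)          # (1/2 curl u) x h
stretch_part = matvec(gamma, h)          # symmetric part applied to h
full = matvec(grad_u, h)                 # u'(x) h
```

Here `rotation_part` and `cross_product` agree up to rounding, and `stretch_part` plus `rotation_part` reproduces `full`.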
For the change in volume we obtain
$\int_{H'} dy = \int_H J(x)\,dx = \int_H (1 + \operatorname{div} u)\,dx + o(\|u'\|)$,
where $J(x) = \det y'(x)$. Hence, in first-order approximation, $\operatorname{div} u(x)$ is the relative local change in volume. Roughly, one has that $\Delta V' = (1 + \operatorname{div} u(x))\,\Delta V$ and hence
$(\Delta V' - \Delta V)/\Delta V = \operatorname{div} u(x)$.


The observations above illustrate the intuitive meaning of curl u(x) and
div u(x), where u(x) denotes the displacement vector.
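
A short numerical sketch of the first-order volume formula: for an assumed small displacement gradient `grad_u` (entry `[i][j]` standing for $D_j u^i$), the determinant $J = \det(I + u'(x))$ is compared with $1 + \operatorname{div} u$. The numbers are illustrative only.

```python
def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Assumed small displacement gradient u'(x).
grad_u = [[2e-3, 1e-3, 0.0],
          [-1e-3, -1e-3, 3e-3],
          [0.0, 2e-3, 4e-3]]

# y'(x) = I + u'(x) and its determinant J(x).
y_prime = [[(1.0 if i == j else 0.0) + grad_u[i][j] for j in range(3)]
           for i in range(3)]
J = det3(y_prime)

div_u = sum(grad_u[i][i] for i in range(3))   # trace of u'(x)
relative_volume_change = J - 1.0              # exact local value
```

The difference between `relative_volume_change` and `div_u` is of second order in the entries of `grad_u`.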

61.2b. Geometrical Meaning of the Strain Tensor

Let $H$ be an undeformed subregion, $\Sigma$ an undeformed surface with unit normal vector $n$, and $C$ and $C_*$ two undeformed curves with parametrization
$x = x(\alpha)$ and $x = x_*(\alpha)$,
respectively. Let $s = s(\alpha)$ denote the arclength of $C$, and $\dot s$ the derivative with respect to $\alpha$.

Figure 61.2

We then consider a $C^1$-deformation
$y = x + u(x)$
from the undeformed region $G$ onto the deformed region $G'$. We set $J(x) = \det y'(x)$ and suppose that $J(x) > 0$ on $G$. Moreover, we set
$A = y'(x)$
and recall that $A = RE$, where $R$ is a rotation, and
$E^2 = A^*A = I + 2\mathcal{E}(x)$.
With $H'$, $\Sigma'$, $C'$, $C'_*$ we denote the deformations of the sets $H$, $\Sigma$, $C$, $C_*$, respectively (Fig. 61.2). The deformed curve $C'$ then has the parametrization
$y = x(\alpha) + u(x(\alpha))$.
Obviously, we obtain $\dot y = \dot x + u'\dot x$ and hence
$\dot y = A\dot x$, $\dot y_* = A\dot x_*$.
By observing that $\dot s^2 = (\dot x \mid \dot x)$ and $\dot s'^2 = (\dot y \mid \dot y)$, we find the first two of the following four basic formulas:
(I) $(\dot y \mid \dot y_*) = (\dot x \mid \dot x_*) + 2(\dot x \mid \mathcal{E}(x)\dot x_*)$,
(II) $\dot s'^2 = \dot s^2 + 2(\dot x \mid \mathcal{E}(x)\dot x)$,
(III) $dV' = \det A \, dV = \sqrt{\det(I + 2\mathcal{E}(x))}\, dV$,
(IV) $dO' = |(\operatorname{adj} A)^* n|\, dO = |\sqrt{\operatorname{adj}(I + 2\mathcal{E}(x))}\, n|\, dO$.
The definition of the operator $\operatorname{adj} A$ has been given in Section 61.1. It leads to the following integral formulas for the change in arclength, volume, and surface under the considered deformation:
(II*) $s'(\beta) = \int_0^\beta \sqrt{(\dot x \mid \dot x) + 2(\dot x \mid \mathcal{E}(x)\dot x)}\, d\alpha$, $x = x(\alpha)$,
(III*) $\int_{H'} dV' = \int_H \sqrt{\det(I + 2\mathcal{E}(x))}\, dV$,
(IV*) $\int_{\Sigma'} dO' = \int_\Sigma |\sqrt{\operatorname{adj}(I + 2\mathcal{E}(x))}\, n|\, dO$.
In fact, (III) and (IV) represent a short-hand writing for (III*) and (IV*), respectively. A proof will be given below. The key to (IV*) is the so-called Piola identity.
Let us note that the formulas above show that the change in arclength, volume, and surface can be described exclusively in terms of $A$, $\det A$, and $\operatorname{adj} A$. This observation will play a fundamental role in the theory of polyconvex materials.
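The following discussion should help to illustrate the intuitive idea behind these formulas. Let $\{e_1, e_2, e_3\}$ be an orthonormal system of eigenvectors of $E$ with corresponding eigenvalues $s_1$, $s_2$, $s_3$, which are called the principal strains at the point $x$. The $e_i$-axes are called the principal axes of strain at $x$. If $\varepsilon_i$ are the eigenvalues of the strain tensor $\mathcal{E}(x)$, then
$s_i = \sqrt{1 + 2\varepsilon_i}$, $i = 1, 2, 3$.
In terms of $s_i$ we obtain
$dV' = s_1 s_2 s_3 \, dV$,
$dO' = \sqrt{n_1^2 s_2^2 s_3^2 + n_2^2 s_1^2 s_3^2 + n_3^2 s_1^2 s_2^2}\; dO$,
where $n = \sum_{i=1}^3 n_i e_i$. If $\|\mathcal{E}(x)\|$ is small, i.e., if $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$ are small, then
$dV' = (1 + \varepsilon_1 + \varepsilon_2 + \varepsilon_3)\, dV + o(|\varepsilon|) = (1 + \operatorname{tr} \mathcal{E}(x))\, dV + o(|\varepsilon|)$, $\varepsilon \to 0$,
$dO' = (1 + n_1^2(\varepsilon_2 + \varepsilon_3) + n_2^2(\varepsilon_1 + \varepsilon_3) + n_3^2(\varepsilon_1 + \varepsilon_2))\, dO + o(|\varepsilon|)$,
where $|\varepsilon| = |\varepsilon_1| + |\varepsilon_2| + |\varepsilon_3|$.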
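
The exact and the linearized formulas for the volume and surface elements can be compared numerically. In the sketch below the eigenvalues `eps` of the strain tensor and the normal components `n` (with respect to the principal axes) are assumed sample values, not data from the text.

```python
import math

eps = [1e-3, -2e-3, 5e-4]                  # assumed small eigenvalues of the strain tensor
s = [math.sqrt(1 + 2 * e) for e in eps]    # principal strains s_i = sqrt(1 + 2 eps_i)
n = [2 / 3, 1 / 3, 2 / 3]                  # unit normal in the eigenbasis (assumed)

# Exact ratios dV'/dV and dO'/dO in terms of the principal strains.
dV_ratio = s[0] * s[1] * s[2]
dO_ratio = math.sqrt((n[0] * s[1] * s[2]) ** 2
                     + (n[1] * s[0] * s[2]) ** 2
                     + (n[2] * s[0] * s[1]) ** 2)

# First-order approximations for small strain.
dV_linear = 1 + sum(eps)
dO_linear = (1 + n[0] ** 2 * (eps[1] + eps[2])
             + n[1] ** 2 * (eps[0] + eps[2])
             + n[2] ** 2 * (eps[0] + eps[1]))
```

For these values the exact and linearized ratios differ only by terms of second order in `eps`.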

61.2c. Curve Deformation and the Strain Tensor

In order to illustrate the intuitive meaning of the components $\mathcal{E}^i_j$ of $\mathcal{E}(x)$, with respect to an arbitrary Cartesian coordinate system, we fix the point $x$ and choose an orthonormal system of vectors $\{e_1, e_2, e_3\}$. We then consider the straight lines
$\alpha \mapsto x + \alpha e_i$, $i = 1, 2, 3$
through the point $x$. These straight lines are deformed into the curves
$y_i = x + \alpha e_i + u(x + \alpha e_i)$, $i = 1, 2, 3$
with tangent vectors
$e'_i = e_i + u'(x)e_i$, $i = 1, 2, 3$
at the point $y = x + u(x)$. Denote the angle between $e'_i$ and $e'_j$ by $\varphi_{ij}$ (Fig. 61.3).

Figure 61.3

The components $\mathcal{E}^i_j$ of $\mathcal{E}(x)$ are given by
$\mathcal{E}(x)e_j = \mathcal{E}^i_j e_i$.
Setting $\mathcal{E}_{ij} = \mathcal{E}^i_j$ and $|\mathcal{E}| = \sum_{i,j=1}^3 |\mathcal{E}_{ij}|$, we obtain
$(e'_i \mid e'_j) = \delta_{ij} + 2\mathcal{E}_{ij}$, $i, j = 1, 2, 3$,
and hence
$\cos \varphi_{ij} = \dfrac{(e'_i \mid e'_j)}{|e'_i|\,|e'_j|} = \dfrac{2\mathcal{E}_{ij}}{\sqrt{1 + 2\mathcal{E}_{ii}}\,\sqrt{1 + 2\mathcal{E}_{jj}}}$
for all $i \ne j$. Note here that $\cos \varphi_{ij} = \sin(\pi/2 - \varphi_{ij})$, so that, for small $|\mathcal{E}|$, the two key formulas
$|e'_i| = 1 + \mathcal{E}_{ii} + o(|\mathcal{E}|)$, $|\mathcal{E}| \to 0$,
$\dfrac{\pi}{2} - \varphi_{ij} = 2\mathcal{E}_{ij} + o(|\mathcal{E}|)$, $i \ne j$
are valid.
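
The two key formulas can be illustrated with a few lines of Python. The displacement gradient `grad_u` below is an assumed sample (entry `[i][j]` standing for $D_j u^i$); the strain tensor is computed from Definition 61.2 and compared with the lengths of, and the angles between, the deformed coordinate directions.

```python
import math

# Assumed small displacement gradient u'(x); illustrative numbers only.
grad_u = [[1e-3, 2e-3, 0.0],
          [-1e-3, 0.0, 1e-3],
          [0.0, 3e-3, -2e-3]]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

# Strain tensor (u' + u'* + u'* u') / 2 from Definition 61.2.
strain = [[0.5 * (grad_u[i][j] + grad_u[j][i]
                  + sum(grad_u[r][i] * grad_u[r][j] for r in range(3)))
           for j in range(3)] for i in range(3)]

# Tangent vectors e'_i = e_i + u'(x) e_i, i.e. the columns of I + u'(x).
tangent = [[(1.0 if r == i else 0.0) + grad_u[r][i] for r in range(3)]
           for i in range(3)]
length = [math.sqrt(dot(t, t)) for t in tangent]

# Deviation of the angle between e'_i and e'_j from a right angle, i < j.
angle_defect = {}
for i in range(3):
    for j in range(i + 1, 3):
        cos_phi = dot(tangent[i], tangent[j]) / (length[i] * length[j])
        angle_defect[(i, j)] = math.pi / 2 - math.acos(cos_phi)
```

In this sketch `length[i] - 1` is close to the diagonal entry `strain[i][i]`, and `angle_defect[(i, j)]` is close to `2 * strain[i][j]`, with deviations of second order in the strain.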

61.2d. Volume Deformation and the Strain Tensor

Using the well-known transformation rule for integrals, we find that
$\int_{H'} dy = \int_H \det y'(x)\, dx$.
We then observe that $\det A = \det y'(x)$ and that
$\det(I + 2\mathcal{E}(x)) = \det A^* \det A = (\det A)^2$.

61.2e. The Piola Identities

The following two so-called Piola identities hold true for the undeformed region $G$:
(P1) $\operatorname{div}_x \operatorname{adj} y'(x)^* = 0$,
(P2) $\operatorname{div}_x \tau(y(x)) \operatorname{adj} y'(x)^* = J(x) \operatorname{div}_y \tau(y)$.
The index indicates the variable with respect to which div is applied. Suppose that $\tau: G' \to L(V_3)$ is $C^1$, i.e., $\tau(\cdot)$ is a $C^1$-tensor field on the deformed region $G'$. Differentiating the identity $y = y(x(y))$ with respect to $y$, we obtain the relation $I = y'(x)x'(y)$. This implies
$\operatorname{adj} y'(x)^* = (\det y'(x)^*)\, y'(x)^{*-1} = J(x)\, x'(y)^*$,
and thus we can write the Piola identities in the form
(P1*) $\operatorname{div}_x J(x)\, x'(y(x))^* = 0$,
(P2*) $\operatorname{div}_x J(x)\, \tau(y(x))\, x'(y(x))^* = J(x) \operatorname{div}_y \tau(y)$.
The transformation
$\sigma(x) = \tau(y) \operatorname{adj} y'(x)^* = J(x)\, \tau(y)\, x'(y)^*$, $y = y(x)$
is called the Piola transformation. It will play an important role in Section 61.5, where the reduced stress tensor will be introduced via the Piola transformation.
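
Identity (P1) can be checked numerically for a concrete deformation. The sketch below assumes a quadratic displacement (the coefficients are arbitrary), writes $y'(x)$ as a matrix with entries $D_j y^i$, uses the fact that $\operatorname{adj} y'(x)^* = J(x)\,x'(y)^*$ is then the cofactor matrix of $y'(x)$, and takes the divergence row by row with central differences.

```python
def deformation_gradient(x):
    """y'(x) for the assumed displacement
    u = (0.1 x1 x2 + 0.05 x3^2, 0.2 x2 x3 - 0.1 x1^2, 0.15 x1 x3 + 0.05 x2^2)."""
    x1, x2, x3 = x
    return [[1 + 0.1 * x2, 0.1 * x1, 0.1 * x3],
            [-0.2 * x1, 1 + 0.2 * x3, 0.2 * x2],
            [0.15 * x3, 0.1 * x2, 1 + 0.15 * x1]]

def cofactor(F):
    """Cofactor matrix of F, which equals det(F) times the inverse transpose."""
    return [[F[(i + 1) % 3][(j + 1) % 3] * F[(i + 2) % 3][(j + 2) % 3]
             - F[(i + 1) % 3][(j + 2) % 3] * F[(i + 2) % 3][(j + 1) % 3]
             for j in range(3)] for i in range(3)]

def row_divergence(field, x, d=1e-3):
    """Central-difference value of sum_j D_j field[i][j] for i = 0, 1, 2."""
    div = [0.0, 0.0, 0.0]
    for j in range(3):
        xp, xm = list(x), list(x)
        xp[j] += d
        xm[j] -= d
        fp, fm = field(xp), field(xm)
        for i in range(3):
            div[i] += (fp[i][j] - fm[i][j]) / (2 * d)
    return div

def piola_field(x):
    return cofactor(deformation_gradient(x))

point = [0.3, -0.2, 0.5]
div_at_point = row_divergence(piola_field, point)
```

Since the cofactor entries are quadratic polynomials for this choice of displacement, the central differences are exact up to rounding, and all three components of `div_at_point` vanish to that accuracy.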
Let us now prove (P1) and (P2). Identity (P1) is precisely formula (12.14) of Part I. Recall that this formula was the key to our simple analytic approach to the mapping degree of Chapter 12. In Cartesian coordinates (P1) and (P1*) read as

61.2f. Surface Deformation and the Strain Tensor

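We use the Piola transformation $\sigma(x) = \tau(y(x)) \operatorname{adj} y'(x)^*$. Formula (P2) then implies
$\int_{\partial H} \sigma n \, dO = \int_H \operatorname{div} \sigma \, dx = \int_H J(x) \operatorname{div}_y \tau(y(x))\, dx = \int_{H'} \operatorname{div}_y \tau(y)\, dy = \int_{\partial H'} \tau n' \, dO'$.
Setting $\tau = I$, we obtain
$\int_{\partial H} (\operatorname{adj} y'(x)^*)\, n \, dO = \int_{\partial H'} n' \, dO'$
and hence, taking norms,
$\int_{\partial H} |(\operatorname{adj} y'(x)^*)\, n| \, dO = \int_{\partial H'} dO'$,
i.e.,
$dO' = |(\operatorname{adj} y'(x)^*)\, n| \, dO$.
By definition of $\mathcal{E}$, there exists a rotation $R$ such that
$y'(x) = RE$, $E = \sqrt{I + 2\mathcal{E}(x)}$.
From $RR^* = I$ and $\det R = 1$ we obtain
$\operatorname{adj} R^* = R$,
and hence
$\operatorname{adj} y'(x)^* = \operatorname{adj}(E^*R^*) = R \operatorname{adj} E = R (\operatorname{adj} E^2)^{1/2}$.
Note that $E^* = E$. Thus
$dO' = |\sqrt{\operatorname{adj}(I + 2\mathcal{E}(x))}\, n| \, dO$.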

61.3. Basic Equations


In the general case, the motion of an elastic body is described by the following three equations.
(i) Deformation
$y = x + u(x, t)$, $(x, t) \in G \times [0, t_0]$. (13)
(ii) Equation of motion
$\rho(Q)\ddot y = \operatorname{div} \tau(Q) + F(Q)$, $Q = (y, t)$. (14)
(iii) Constitutive law
$\tau(Q) = \Phi(u(P), u_x(P), P)$, $P = (x, t)$. (15)
We are looking for
the displacement $u$.
In (14) div is applied with respect to $y$. Moreover, we have
$\ddot y = u_{tt}(x, t)$.
We now discuss the basic equations.

Figure 61.4

61.3a. Deformation

The deformation of the region $G$ occurs in such a way that at time $t$
the point $x \in G$ becomes the point $y$
(Fig. 61.4). The quantity $\ddot y$ is then the acceleration of the point $y$ at time $t$. The subregion $H$ of $G$ becomes $H'$ at time $t$. In order to obtain a regular deformation, we have to assume that formula (13) describes
an orientation-preserving diffeomorphism from $G$ onto $G'$
for every $t$. Because of the complexity of the basic equations in nonlinear elasticity, one often shows just the existence of the displacement $u$ without proving the diffeomorphism property.
We call $G$ and $G'$ the reference configuration and the deformed configuration, respectively. Moreover, the coordinates $x$ in $G$ and $y$ in $G'$ are called Lagrangian and Eulerian coordinates, respectively. In elasticity one frequently uses Lagrangian coordinates, while in hydrodynamics one usually works with Eulerian coordinates.
Differentiation of the identity $y = y(x(y, t), t)$ with respect to $y$ yields
$y_x(x, t)\, x_y(y, t) = I$, $\det y_x(x, t) \det x_y(y, t) = 1$,
where $y = x + u(x, t)$. These formulas will constantly be used in the following.

61.3b. Stress Tensor


The stress tensor
$\tau(Q): V_3 \to V_3$
should be a linear, symmetric operator. The physical meaning of $\tau(Q)$ is that
$\int_{\partial H'} \tau n' \, dO$ = stress force acting on $H'$ (16)
holds true, where $n'$ is the outer unit normal vector to the deformed subregion $H'$. A typical property of stress forces is that they can be described by surface integrals, i.e., they represent surface forces. Here, $dO$ denotes the surface element of $\partial H'$. Moreover, we have
$\int_{H'} F \, dy$ = outer force acting on $H'$, (17)
$\int_H \rho_0(x)\, dx$ = mass of $H$, (18)
$\int_{H'} \rho(y)\, dy$ = mass of $H'$. (19)
Let $\tau_1$, $\tau_2$, $\tau_3$ be the eigenvalues of the stress tensor $\tau$ at a fixed point $y$ with corresponding orthonormal system of eigenvectors $n_1$, $n_2$, $n_3$. The decomposition
$n' = \alpha_1 n_1 + \alpha_2 n_2 + \alpha_3 n_3$
of the outer unit normal vector $n'$ at the point $y$ yields
$\tau n' = \alpha_1 \tau_1 n_1 + \alpha_2 \tau_2 n_2 + \alpha_3 \tau_3 n_3 = \alpha_i(\tau_i n_i)$.
Thus, we obtain
$\int_{\partial H'} \alpha_i(\tau_i n_i)\, dO$ = stress force acting on $H'$.
This explains the intuitive meaning of the so-called principal stresses $\tau_1$, $\tau_2$, $\tau_3$ and principal axes of stress $n_1$, $n_2$, $n_3$.

61.3c. Mass Conservation

We postulate mass conservation. This gives
$\int_H \rho_0 \, dx = \int_{H'} \rho \, dy$
for all subregions $H$. Letting
$J(x, t) = \det y_x(x, t)$,
we obtain
$\int_{H'} \rho \, dy = \int_H \rho J \, dx$.
By contracting $H$ into a single point we find that
$\rho(y, t) J(x, t) = \rho_0(x)$
with $y = y(x, t)$. This is a consequence of the mean value theorem of integral calculus. Hence, for a given mass density $\rho_0(x)$ of the undeformed body, we have to set
$\rho(y, t) = \rho_0(x) J(x, t)^{-1}$ (19*)
in the basic equation (14).

61.3d. Constitutive Law

The constitutive law (15) describes the connection between
deformation (strain) and stress.
Constitutive laws depend critically on the material under consideration. In a rigorous theory, the constitutive law cannot be prescribed in an arbitrary fashion but must satisfy serious restrictions. This will be discussed in Section 61.8. The general form (15) of the constitutive law corresponds to approximation models in elasticity.

61.3e. Typical Difficulties

In Cartesian coordinates, the equation of motion (14) is equal to
$\rho(Q)\, u^i_{tt}(P) = D'_j \tau_i^j(Q) + F_i(Q)$, $i = 1, 2, 3$ (20)
with $y = \eta^j e_j$ and $D'_j = \partial/\partial \eta^j$.
The main difficulty with these basic equations is that the differentiation on the right-hand side of (20) is taken with respect to the unknown deformed region $G'$. Therefore (20) only apparently has a simple structure. Also, the number of constitutive laws which are being used in the literature is rather large. This is a consequence of the fact that these laws cannot be uniquely determined from general physical arguments.
The mathematical difficulties are caused by the physical reality that bodies may become plastic or even break under deformations.

61.3f. Initial and Boundary Conditions

In order to obtain the complete system of basic equations, we need to add initial and boundary conditions:
(a) The initial conditions are obtained by prescribing the initial values
$u(x, 0)$ and $u_t(x, 0)$ on $G$,
i.e., by prescribing the initial position and the initial velocity.
(b) In order to formulate the boundary conditions, we decompose the boundary $\partial G$ of the undeformed region into the two disjoint sets
$\partial G = \partial_1 G \cup \partial_2 G$
(Fig. 61.5). We then prescribe the displacement
$u(x, t)$ on $\partial_1 G$
and the stress forces on the part of the boundary that corresponds to $\partial_2 G$, for all times $t$.

Figure 61.5

If the displacement $u$ is known, then one obtains the stress tensor from the constitutive law (15) and the stress forces, which are acting on all parts of the deformed body, from the integral formula (16). These forces are of particular interest to the engineer, since they determine how the material in buildings, bridges, etc. is stressed.

61.3g. Elastostatics

If the displacement $u$ does not depend on time, then we have the common stationary problem. In this case the general basic equations above become the basic equations of elastostatics. The term $\rho \ddot y$ disappears in the equation of motion (14) and time $t$ does not explicitly appear in the basic equations. Here $y$ denotes the position of the point $x$ after the deformation.

61.4. Physical Motivation of the Basic Equations


Our observations are based on the following three points:

(i) Structure of the stress forces and the stress tensor.
(ii) Momentum.
(iii) Angular momentum.

We assume in the following that all functions, transformations, and regions are sufficiently regular.

61.4a. Stress Forces and Stress Tensor

Following Cauchy (1827) the main assumption of elasticity theory is that a deformation causes stress forces which can be described like this:
$\int_{\partial H'} \mathbf{t}(y, t; n')\, dO$ = stress force acting on $H'$.
The heuristic idea behind this is that the stress force
$\mathbf{t}\, \Delta O$,
which acts from the domain $G' - H'$ onto $H'$, is applied to the surface element $\Delta O$ of $\partial H'$, whereby this force depends on the outer unit normal vector $n'$ of $\partial H'$. In the following we shall simply write $n$ instead of $n'$. From Newton's principle of actio = reactio, we may, conversely, assume that the stress force
$-\mathbf{t}\, \Delta O$
acts from the domain $H'$ onto the surface element $\Delta O$ of $G' - H'$. We therefore postulate
$\mathbf{t}(y, t; -n) = -\mathbf{t}(y, t; n)$ (21)
for all $n$. Below, we shall prove the linearity of $\mathbf{t}$ with respect to $n$, i.e., there exists a linear operator $\tau: V_3 \to V_3$ with
$\mathbf{t}(y, t; n) = \tau(y, t)\, n$ (22)
for all $n$. The operator $\tau(y, t)$ is called stress tensor. Moreover, we shall demonstrate the symmetry of the stress tensor, i.e.,
$\tau(y, t) = \tau(y, t)^*$. (23)
All important information will follow from the integral form of the equations of motion, which we will now discuss.

61.4b. Momentum Balance and Angular Momentum Balance


In Section 58.7, i.e., in the context of point mechanics, we derived the following important balance laws for momentum and angular momentum:
Time derivative of the total momentum = total force.
Time derivative of the total angular momentum = total torque.
This will be our key to the derivation of the basic equations. If we decompose the deformed region $G'$ into small parts, then, after summation and passing to limits, the following assumptions make sense:
$\dfrac{d}{dt} \int_{H'} \rho u_t \, dy = \int_{H'} F \, dy + \int_{\partial H'} \mathbf{t}\, dO$, (24)
$\dfrac{d}{dt} \int_{H'} \rho u_t \times y \, dy = \int_{H'} F \times y \, dy + \int_{\partial H'} \mathbf{t} \times y \, dO$. (25)
Formulas (24) and (25) are the momentum balance and angular momentum balance, respectively. Note that $\dot y = u_t$. For the time derivatives we obtain
$\dfrac{d}{dt} \int_{H'} \rho u_t \, dy = \dfrac{d}{dt} \int_H \rho_0 u_t \, dx = \int_H \rho_0 u_{tt}\, dx = \int_{H'} \rho u_{tt}\, dy$,
$\dfrac{d}{dt} \int_{H'} \rho u_t \times y \, dy = \dfrac{d}{dt} \int_H \rho_0 u_t \times y \, dx = \int_H \rho_0 u_{tt} \times y \, dx = \int_{H'} \rho u_{tt} \times y \, dy$.
Here, observe that $\rho J = \rho_0$, $dy = J\,dx$, and $u_t \times \dot y = u_t \times u_t = 0$. From (22), which has yet to be proved, and (4) it follows that
$\int_{\partial H'} \mathbf{t}\, dO = \int_{\partial H'} \tau n \, dO = \int_{H'} \operatorname{div} \tau \, dy$.
The momentum balance (24) gives
$\int_{H'} (\rho u_{tt} - F - \operatorname{div} \tau)\, dy = 0$,
and by contracting $H'$ into a point, we obtain
$\rho u_{tt} - F - \operatorname{div} \tau = 0$.
This is the equation of motion (14).

61.4c. Existence of the Stress Tensor


We prove (22). From the momentum balance (24) follows
$\int_{H'} (\rho u_{tt} - F)\, dy = \int_{\partial H'} \mathbf{t}\, dO$. (26)
We choose $H'$ as a tetrahedron with outer unit normal vectors
$-e_1$, $-e_2$, $-e_3$, $n$,
side surfaces $S_1$, $S_2$, $S_3$, $S$, and volume $V$. This implies that
$S_i = S n_i$.
From (26) and the mean value theorem of integral calculus it follows that
$V(\rho u_{tt} - F) = S\, \mathbf{t}(\cdot, n) + S_i\, \mathbf{t}(\cdot, -e_i)$.
More precisely, this relation holds for the components at suitable points of the tetrahedron. Contracting the tetrahedron with a similarity transformation into a single point, we obtain after dividing by $S$ that
$\mathbf{t}(\cdot, n) = -n_i\, \mathbf{t}(\cdot, -e_i) = n_i\, \mathbf{t}(\cdot, e_i)$.
This follows since $V/S \to 0$, and hence we obtain the linearity of $n \mapsto \mathbf{t}(\cdot, n)$. Thus, there exists a linear operator $\tau: V_3 \to V_3$ such that
$\mathbf{t}(\cdot, n) = \tau n$ for all $n \in V_3$.
This is (22).

61.4d. Symmetry of the Stress Tensor

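We shall prove (23), i.e., the symmetry of $\tau$, by using the angular momentum balance and the key identity
$\int_{H'} (\tau^* - \tau)\, dy = \int_{H'} \operatorname{div} \tau \times y \, dy - \int_{\partial H'} \tau n \times y \, dO$
for arbitrary smooth $\tau \in L(V_3)$. The latter equation follows from passing to Cartesian coordinates and using integration by parts. In fact, with $y = \eta^j e_j$ we have that
$\int_{H'} \varepsilon_{mij}\, \eta^j D'_k \tau_i^k \, dy - \int_{\partial H'} \varepsilon_{mij}\, \eta^j \tau_i^k n_k \, dO = \int_{H'} \varepsilon_{mij} (\eta^j D'_k \tau_i^k - D'_k(\eta^j \tau_i^k))\, dy = -\int_{H'} \varepsilon_{mij}\, \tau_i^j \, dy = \int_{H'} \varepsilon_{mij}\, \tau_j^i \, dy$.
In order to prove
$\tau^* = \tau$,
we need
$\int_{H'} \operatorname{div} \tau \times y \, dy = \int_{\partial H'} \tau n \times y \, dO$
for all $H'$. But this follows from the angular momentum balance (25), i.e.,
$\int_{H'} \rho u_{tt} \times y \, dy = \int_{H'} F \times y \, dy + \int_{\partial H'} \tau n \times y \, dO$,
and from the equation of motion
$\rho u_{tt} = \operatorname{div} \tau + F$.
Note that $\mathbf{t} = \tau n$.
The motivation of the basic equations in the preceding section is now complete.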

61.5. Reduced Stress Tensor and the Principle of Virtual Power

The basic equations in Section 61.3 contain the stress tensor
$\tau(y, t)$
which corresponds to the deformed region $G'$. In order to avoid this complication, we introduce a new (reduced) stress tensor
$\sigma(x, t)$
in the undeformed region $G$, which has the following physical meaning:
$\int_{\partial H} \sigma(x, t)\, n \, dO$ = stress force acting on $H'$. (27)
On the other hand, we obtain the following relation
$\int_{\partial H'} \tau(y, t)\, n' \, dO$ = stress force acting on $H'$ (28)
for $\tau$. Note that, in (27), the integral is taken over the boundary of the undeformed subregion $H$ with outer unit normal vector $n$, and, in (28), over the boundary of the deformed subregion $H'$ with outer unit normal vector $n'$.
The fact that (28) can be transformed into (27), the linearity of $\tau$ implying the linearity of $\sigma$, is by no means trivial. As we will see, this follows from the Piola identity.
Moreover, we set
$\int_H K(x, t)\, dx$ = outer force acting on $H'$. (29)
In Section 61.3, we used
$\int_{H'} F(y, t)\, dy$ = outer force acting on $H'$. (30)

61.5a. Transformed Basic Equations

We shall see below that the basic equations for the deformed region of Section 61.3 imply the following new basic equations for the undeformed region which are easier to handle:
(i) Deformation
$y = x + u(x, t)$, $(x, t) \in G \times [0, t_0]$. (31)
(ii) Equation of motion
$\rho_0(x)\, u_{tt}(P) = \operatorname{div} \sigma(P) + K(P)$, $P = (x, t)$. (32)
(iii) Constitutive law
$\sigma(P) = \Omega(u(P), u_x(P), P)$. (33)
We are looking for the
displacement $u$.
Then one obtains the deformation from (31) and the stress forces, which are effective in the deformed body, from (27) and (33). In equation (32), div is used with respect to the variable $x$.
Furthermore, it is necessary to add the following initial and boundary conditions. As in Section 61.3, we use the decomposition
$\partial G = \partial_1 G \cup \partial_2 G$
of the boundary $\partial G$ of the undeformed region $G$ (Fig. 61.5).
(a) We prescribe the initial values
$u(x, 0)$ and $u_t(x, 0)$ on $G$,
i.e., the initial position and the initial velocity of the deformed body.
(b) We prescribe the displacement
$u(x, t)$ on $\partial_1 G$
and the stress forces
$\sigma(x, t)\, n$ on $\partial_2 G$
for all times $t$.
In Cartesian coordinates, the equation of motion (32) becomes
$\rho_0(x)\, u^i_{tt}(P) = D_j \sigma_i^j(P) + K_i(P)$, $i = 1, 2, 3$ (34)
with $x = \xi^j e_j$ and $D_j = \partial/\partial \xi^j$.

61.5b. Transformation of Forces

The fundamental relation between the forces is given by
$K(x, t) = J(x, t)\, F(y, t)$
with $y = y(x, t)$ and corresponding inverse $x = x(y, t)$. Recall that
$J(x, t) = \det y_x(x, t)$,
and
$\rho_0(x) = J(x, t)\, \rho(y, t)$.
If, for example, the gravitational force $F$ is applied to the deformed body, then
$F(y, t) = -\rho(y, t)\, g e_3$.
This implies that
$K(x) = -\rho_0(x)\, g e_3$,
i.e., the transformation of $F$ onto the undeformed region assumes a particularly simple form.

61.5c. The First Piola-Kirchhoff Tensor

By definition, the important relation between $\sigma$ and $\tau$ is given by the Piola transformation
$\sigma(x, t) = J(x, t)\, \tau(y, t)\, x_y(y, t)^*$.
The linear map $\sigma(x, t): V_3 \to V_3$ is called the reduced stress tensor or first Piola-Kirchhoff tensor. Recall that, according to Section 58.1, the space $V_3$ is an H-space and the star denotes the adjoint operator. In the future $\tau$ and $\sigma$ will simply be called stress tensors, but $\tau$ and $\sigma$ will always refer to the stress tensor and the reduced stress tensor, respectively.

61.5d. The Second Piola-Kirchhoff Tensor

The second Piola-Kirchhoff tensor $S$ is defined as
$S(x, t) = J(x, t)\, x_y(y, t)\, \tau(y, t)\, x_y(y, t)^*$.
Because of $(AB)^* = B^*A^*$ we have
$S^* = S$,
i.e., $S$ is symmetric, while for $\sigma$ this need not be true. In Cartesian coordinates
we can write

and

61.5e. Proof of the Transformation Formulas

The key here is the Piola identity
$\operatorname{div}_x \sigma(x, t) = J(x, t) \operatorname{div}_y \tau(y, t)$, (35)
which was proved in Section 61.2e. From (29) and (30) it follows for the transformation of forces that
$\int_H K \, dx = \int_H F J \, dx$.
Contracting $H$ into a single point, we find
$K = FJ$.
The basic equation (14) therefore implies that
$\rho_0 u_{tt} - \operatorname{div} \sigma - K = (\rho u_{tt} - \operatorname{div} \tau - F) J = 0$.
This is the transformed basic equation (32). Equality of (27) and (28) then follows from
$\int_{H'} \operatorname{div} \tau \, dy = \int_H (\operatorname{div} \tau) J \, dx = \int_H \operatorname{div} \sigma \, dx$.
According to (4), this implies that
$\int_{\partial H'} \tau n' \, dO = \int_{\partial H} \sigma n \, dO$.

61.5f. Basic Equations of Elastostatics


We now consider the stationary case. The displacement $u$ in this case does not depend on time $t$, and the basic equations of Section 61.5a attain the form:
(i) Deformation
$y = x + u(x)$, $x \in G$.
(ii) Equilibrium condition
$\operatorname{div} \sigma(x) + K(x) = 0$ on $G$. (36)
(iii) Constitutive law
$\sigma(x) = \Omega(u(x), u'(x), x)$.
In order to complete this system we must add boundary conditions. We thereby use the decomposition
$\partial G = \partial_1 G \cup \partial_2 G$
and prescribe the displacement
$u = u_0$ on $\partial_1 G$,
and the stress forces
$\sigma n = T$ on $\partial_2 G$.
Generally, these basic equations represent a complex system of nonlinear differential equations for the unknown displacement $u$.
In Cartesian coordinates, equilibrium condition (36) becomes
$D_j \sigma_i^j + K_i = 0$ on $G$, $i = 1, 2, 3$. (36*)
In Section 61.8 we will show that the constitutive law for a homogeneous body must have the natural form
$S = B(\mathcal{E})$,
i.e., the symmetric second Piola-Kirchhoff tensor $S$ depends only on the symmetric strain tensor $\mathcal{E}$. This explains the role of $S$ in nonlinear elasticity.

61.5g. Principle of Virtual Power in Elastostatics

Our goal is the transformation of the differential equation (36) into the integral identity
$\int_G [\sigma, h'^*]\, dx - \int_G Kh \, dx - \int_{\partial_2 G} Th \, dO = 0$ (37)
for all $h \in X(\partial_1 G)$. This represents the generalized form of the equilibrium condition (36).
Let $X(\partial_1 G)$ denote the set of all $C^1$-maps $h: G \to V_3$ with
$h = 0$ on $\partial_1 G$.
Multiplication of (36*) with $h^i$ and integration by parts yields
$\int_G \sigma_i^j D_j h^i \, dx - \int_G K_i h^i \, dx - \int_{\partial_2 G} T_i h^i \, dO = 0$ (37*)
for all $h \in X(\partial_1 G)$. This is (37).
In the literature the expression on the left-hand side of (37) is called virtual work. Therefore (37) is called the principle of virtual work. The transformation
$y = x + u(x) + t h(x)$ with $h \in X(\partial_1 G)$ (38)
describes a time-dependent perturbation of the deformation $y = x + u(x)$ with deformation velocity
$\dot y = h$.
The condition $h = 0$ on $\partial_1 G$ guarantees that all these perturbations satisfy the same boundary condition on $\partial_1 G$ as does $u$. Comparison with Section 58.11 then shows that equation (37) with $h = \dot y$ can be regarded as a generalization of the principle of virtual power of point mechanics. The equivalence between the principle of virtual work and virtual power has been discussed in Section 58.3.
We now consider the special case where $\sigma$ is symmetric. From
$\gamma(x) = \tfrac12 (h'(x) + h'(x)^*)$,
and equation (37) we obtain the equation
$\int_G [\sigma, \gamma]\, dx - \int_G Kh \, dx - \int_{\partial_2 G} Th \, dO = 0$ (39)
for all $h \in X(\partial_1 G)$. In fact, using $\gamma_j^i = \tfrac12 (D_j h^i + D^i h_j)$ and $\sigma_i^j = \sigma_j^i$ for all $i$, $j$, equation (37*) implies that
$\int_G \sigma_i^j \gamma_j^i \, dx - \int_G K_i h^i \, dx - \int_{\partial_2 G} T_i h^i \, dO = 0$ (39*)
for all $h \in X(\partial_1 G)$. This is (39).
In Section 66.4 this version of the principle of virtual work will be used in the study of plastic materials.
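
The passage from (37) to (39) rests on a pointwise identity that can be checked numerically. The sketch below assumes that the bracket means $[A, B] = \operatorname{tr}(AB)$, which is consistent with the component forms (37*) and (39*); the matrices `sigma` and `dh` are arbitrary sample values.

```python
def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def bracket(a, b):
    """Assumed pairing [A, B] = tr(AB)."""
    return sum(a[i][j] * b[j][i] for i in range(3) for j in range(3))

# Symmetric stress tensor and a non-symmetric derivative h'(x) (sample values).
sigma = [[2.0, 0.5, -1.0],
         [0.5, 1.0, 0.3],
         [-1.0, 0.3, 3.0]]
dh = [[0.2, -0.4, 0.1],
      [0.7, 0.0, -0.3],
      [0.5, 0.6, -0.1]]

dht = transpose(dh)
gamma = [[0.5 * (dh[i][j] + dht[i][j]) for j in range(3)] for i in range(3)]

integrand_37 = bracket(sigma, dht)     # [sigma, h'*]
integrand_39 = bracket(sigma, gamma)   # [sigma, gamma]

# For a non-symmetric sigma the two integrands differ in general.
sigma_ns = [row[:] for row in sigma]
sigma_ns[0][1] = 1.5
gap_ns = bracket(sigma_ns, dht) - bracket(sigma_ns, gamma)
```

For the symmetric `sigma` both integrands coincide, while the modified `sigma_ns` shows that symmetry is really needed.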

61.5h. Basic Formulas for the Stress Tensors

For the convenience of the reader we now summarize the formulas which connect the stress tensor $\tau$ to the first and second Piola-Kirchhoff tensors $\sigma$ and $S$, respectively.
For $y = x + u(x)$ we have
$\sigma(x) = \tau(y)\, x'(y)^* \det y'(x)$,
$S(x) = x'(y)\, \tau(y)\, x'(y)^* \det y'(x)$,
$\sigma(x) = y'(x)\, S(x)$.
The inverse formulas are
$\tau(y) = \sigma(x)\, y'(x)^* \det x'(y)$,
$\tau(y) = y'(x)\, S(x)\, y'(x)^* \det x'(y)$.
Note that $\det x'(y) = (\det y'(x))^{-1}$.
The star denotes the adjoint operator on the H-space $V_3$.
If the displacement $u$ depends on time $t$, then the same applies to the stress tensors $\tau$, $\sigma$, and $S$.
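
The formulas of this summary can be verified on sample matrices. In the following sketch `y_prime` plays the role of $y'(x)$ and `tau` that of a symmetric stress tensor; both are assumed illustrative values, and $x'(y)$ is obtained as the inverse matrix.

```python
def matmul(a, b):
    return [[sum(a[i][r] * b[r][j] for r in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def scaled(c, a):
    return [[c * a[i][j] for j in range(3)] for i in range(3)]

def cofactor(a):
    return [[a[(i + 1) % 3][(j + 1) % 3] * a[(i + 2) % 3][(j + 2) % 3]
             - a[(i + 1) % 3][(j + 2) % 3] * a[(i + 2) % 3][(j + 1) % 3]
             for j in range(3)] for i in range(3)]

def det3(a):
    c = cofactor(a)
    return sum(a[0][j] * c[0][j] for j in range(3))

def inverse(a):
    c, d = cofactor(a), det3(a)
    return [[c[j][i] / d for j in range(3)] for i in range(3)]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

y_prime = [[1.1, 0.2, 0.0], [-0.1, 0.9, 0.3], [0.0, 0.1, 1.2]]   # y'(x), assumed
tau = [[2.0, 0.5, -1.0], [0.5, 1.0, 0.3], [-1.0, 0.3, 3.0]]      # symmetric, assumed

J = det3(y_prime)
x_prime = inverse(y_prime)                                   # x'(y)
sigma = scaled(J, matmul(tau, transpose(x_prime)))           # first Piola-Kirchhoff
S = scaled(J, matmul(x_prime, matmul(tau, transpose(x_prime))))  # second Piola-Kirchhoff

sigma_from_S = matmul(y_prime, S)
tau_from_sigma = scaled(1.0 / J, matmul(sigma, transpose(y_prime)))
tau_from_S = scaled(1.0 / J, matmul(y_prime, matmul(S, transpose(y_prime))))
```

All three reconstructed tensors agree with the originals up to rounding, and `S` inherits the symmetry of `tau`, whereas `sigma` is not symmetric for this sample.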
61.6. A General Variational Principle (Hyperelasticity)

We want to show how a large class of problems in elastostatics can be obtained from a general variational principle. This variational approach to nonlinear elasticity is very important in the applications. At the same time it yields a general method to construct the stress tensor $\sigma$ and to obtain reasonable constitutive laws. Moreover, we find reasonable approximation models. The main idea is to consider those cases where the equilibrium condition $\operatorname{div} \sigma + K = 0$ of Section 61.5f is the Euler equation of a variational problem (principle of minimal or stationary potential energy).
Our starting point is the variational problem
$\int_G L(x, u'(x))\, dx - \int_G Ku \, dx - \int_{\partial_2 G} Tu \, dO = \text{stationary!}$, (40)
$u = u_0$ on $\partial_1 G$,
where $G$ is the undeformed region. Given $K$, $T$, and $u_0$, we are looking for the displacement $u$.
We call
$U = \int_G L(x, u'(x))\, dx$
the elastic potential energy and
$W = \int_G Ku \, dx + \int_{\partial_2 G} Tu \, dO$
the work of the outer forces. Relation (40) is known as the principle of stationary potential energy for elastic bodies. The function $L$, which represents the density of the elastic potential energy, is called the stored energy function. Moreover, we define the stress tensor $\sigma(x) \in L(V_3)$ through
$\sigma(x) = L_{u'}(x, u'(x))$. (41)
This identification has to be understood in the sense of the scalar product on $L(V_3)$, i.e.,
$L_{u'}(x, u'(x))\, h' = [\sigma(x), h'^*]$ for all $h' \in L(V_3)$.
Equation (41) is the required constitutive law. Our discussion below will show that this is a satisfactory definition of $\sigma$.
The deformation of $G$ follows from
$y = x + u(x)$.
According to (27) the stress forces, which act on the deformed body, are obtained from
$\int_{\partial H} \sigma n \, dO$ = stress force acting on $H'$.
Below, we will show that
$\sigma n = T$ on $\partial_2 G$.
This explains the physical meaning of $T$. Moreover, we have
$\int_H K \, dx$ = outer force acting on $H'$.
61.6a. Euler Equations and the Equilibrium Condition

In order to explain the relation with the basic equations of elastostatics of Section 61.5f, we consider the variational problem (40) with sufficiently smooth data.
(H1) $G$ is a bounded region in $\mathbb{R}^3$ with sufficiently smooth boundary, i.e., $\partial G \in C^{0,1}$.
(H2) The boundary of $G$ admits the decomposition
$\partial G = \partial_1 G \cup \partial_2 G$,
where $\partial_1 G$ and $\partial_2 G$ are disjoint, open subsets of $\partial G$.
(H3) The stored energy function $L$ is a $C^2$-map for all arguments, i.e., $L \in C^2(G \times L(V_3))$.
(H4) The functions $K: G \to V_3$ and $T: \partial_2 G \to V_3$ are continuous.

Theorem 61.A (Equilibrium Condition). Assume (H1) to (H4). Let $u: G \to V_3$ be a $C^2$-map. Then the following three statements are equivalent:
(a) $u$ is a solution of the variational problem (40).
(b) $u$ satisfies the variational equation
$\int_G [\sigma, h'^*]\, dx - \int_G Kh \, dx - \int_{\partial_2 G} Th \, dO = 0$ (42)
for all $h \in X(\partial_1 G)$.
(c) $u$ satisfies the equilibrium condition
$\operatorname{div} \sigma + K = 0$ on $G$, (43)
$\sigma n = T$ on $\partial_2 G$.
Recall that $X(\partial_1 G)$ is the set of all $C^1$-maps $h: G \to V_3$ with $h = 0$ on $\partial_1 G$.

Remark 61.3. In Cartesian coordinates, $u'(x)$ corresponds to the matrix $(D_j u^i(x))$. Equations (41), (42), (43) then become
$\sigma_i^j = \dfrac{\partial L}{\partial D_j u^i}$, (41*)
$\int_G \sigma_i^j D_j h^i \, dx - \int_G K_i h^i \, dx - \int_{\partial_2 G} T_i h^i \, dO = 0$, (42*)
$D_j \sigma_i^j + K_i = 0$ on $G$, $i = 1, 2, 3$, (43*)
$\sigma_i^j n_j = T_i$ on $\partial_2 G$.
Moreover, we have
$L_{u'u'}\, h'^2 = \dfrac{\partial^2 L}{\partial D_j u^i \, \partial D_r u^s}\, D_j h^i D_r h^s$.
The sums are taken over two equal indices from 1 to 3.

PROOF OF THEOREM 61.A. We use the standard arguments of variational calculus as described in Section 18.3. We set
$\varphi(t) = U(u + th) - W(u + th)$ for $h \in X(\partial_1 G)$.
Note that all $u + th$ are admitted as candidates in (40). This follows since $h = 0$ on $\partial_1 G$ and hence
$u + th = u_0$ on $\partial_1 G$.
Formula (40) means that
$\varphi'(0) = 0$.
This gives (42*) and hence (42). Integration by parts in (42*) yields
$-\varphi'(0) = \int_G (\operatorname{div} \sigma + K) h \, dx + \int_{\partial_2 G} (T - \sigma n) h \, dO = 0$
for all $h \in X(\partial_1 G)$. If we make the special choice $h^i \in C_0^\infty(G)$, then the variational lemma (Proposition 18.2) implies that
$\operatorname{div} \sigma + K = 0$ on $G$.
An analogous argument shows that
$T - \sigma n = 0$ on $\partial_2 G$. □
All this yields the following result:
(i) In the case of smooth solutions the Euler equation for the variational problem (40), i.e., the principle of stationary potential energy, is equivalent to the equilibrium condition (43).
(ii) Condition $\sigma n = T$ on $\partial_2 G$ is a so-called natural boundary condition, because it does not appear in the original variational problem (40), but is automatically obtained as a necessary solvability condition.
(iii) Equation (42) is called the variational equation or generalized equation for the Euler equation (43). It corresponds to the principle of virtual power which is also called the principle of virtual work (see Section 61.5g).
For $C^2$-maps $u: G \to V_3$, relations (42) and (43) are equivalent. The importance of equation (42) is that it remains meaningful if the displacement $u$ has fewer smoothness properties. In fact, one can find solutions of (42) which are no solutions of the classical equation (43). Such generalized solutions of differential equations have already been discussed in Parts II and III.
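
The equivalence of the variational equation and the equilibrium condition can be illustrated by a one-dimensional analogue, which is an assumption made here for illustration and not a model from the text: a bar on $[0, 1]$ with stored energy $L = \tfrac12 (u')^2$, so that $\sigma = u'$, fixed at $x = 0$, with constant force density `K` and prescribed stress `T` at $x = 1$. The classical solution of $\sigma' + K = 0$, $\sigma(1) = T$, $u(0) = 0$ is $u(x) = -Kx^2/2 + (T + K)x$.

```python
import math

K = 2.0   # constant force density (assumed)
T = 0.5   # prescribed stress at the free end x = 1 (assumed)

def du(x):
    """u'(x) for the classical solution u(x) = -K x^2 / 2 + (T + K) x."""
    return -K * x + T + K

def first_variation(h, dh, stress=du, cells=2000):
    """Midpoint-rule value of  int_0^1 (stress * h' - K h) dx - T h(1)."""
    width = 1.0 / cells
    total = 0.0
    for c in range(cells):
        x = (c + 0.5) * width
        total += (stress(x) * dh(x) - K * h(x)) * width
    return total - T * h(1.0)

# Test functions with h(0) = 0, together with their derivatives.
tests = [(lambda x: x * x, lambda x: 2.0 * x),
         (lambda x: math.sin(2.0 * x), lambda x: 2.0 * math.cos(2.0 * x))]
residuals = [first_variation(h, dh) for h, dh in tests]

# A stress field that violates the equilibrium condition (shifted by 1).
perturbed = first_variation(tests[0][0], tests[0][1],
                            stress=lambda x: du(x) + 1.0)
```

For the classical solution the residuals vanish up to quadrature error, whereas the shifted stress field produces a residual of about 1 for the first test function.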

61.6b. Stable Solutions

Definition 61.4. The $C^2$-solution $u$ of the variational problem (40) is called strictly stable if and only if
$\delta^2 U(u; h) = \int_G L_{u'u'}(x, u'(x))\, h'^2 \, dx > 0$ (44)
for all $h \in X(\partial_1 G)$ with $h \ne 0$, i.e., the second variation of the potential energy is strictly positive.
We want to motivate this definition. Let $u$ be an equilibrium position and consider a displacement
$v = u + th$ with $h \in X(\partial_1 G)$.
Moreover, let
$\varphi(t) = U(u + th) - W(u + th)$.
The work done by the outer forces and the elastic forces during a displacement of the body from $u$ to $v$ is equal to
$\Delta W = W(v) - W(u)$ and $-\Delta U = U(u) - U(v)$,
respectively. According to the general stability principle of Section 58.12, a stable equilibrium position $u$ must satisfy the relation
$\Delta W - \Delta U < 0$
for the entire work. This shows that
$\varphi(t) > \varphi(0)$ (45)
for all $h \in X(\partial_1 G)$ and all $t \ne 0$ in a neighborhood of zero. Condition (45) means that $\varphi$ has a strict local minimum at $t = 0$. This is satisfied if
$\varphi'(0) = 0$ (45a)
and
$\varphi''(0) > 0$. (45b)
Now observe that (45a) is precisely the variational equation (42). Moreover, (45b) corresponds to the strict stability condition (44).
We will show in Section 61.12b that a strongly stable solution $u$ of the variational problem (40) corresponds to a strict local minimum of the potential energy, i.e., $u$ is a solution of the minimum problem
$\int_G L \, dx - \int_G Ku \, dx - \int_{\partial_2 G} Tu \, dO = \min!$,
$u = u_0$ on $\partial_1 G$.

61.6c. Equilibrium of Torques

As we have seen, the Euler equation for the variational problem (40) yields
the equilibrium condition
$$\operatorname{div} \sigma + K = 0.$$

In integral form this becomes

(E*) $$\int_H K \, dx + \int_{\partial H} \sigma n \, dO = 0$$

for all subregions $H$. This condition states that the outer forces and the stress
forces are in equilibrium. Using the Piola transformation (35) we obtain
$$\operatorname{div} \sigma(x) = J(x) \operatorname{div}_y \tau(y), \qquad K(x) = J(x)F(y),$$
and hence
$$\operatorname{div}_y \tau(y) + F(y) = 0.$$
In integral form, this becomes the equilibrium condition for the forces

(E) $$\int_{H'} F \, dy + \int_{\partial H'} \tau n' \, dO = 0$$

for all deformed subregions $H'$. The equilibrium condition for the torques
reads as

(T) $$\int_{H'} F \times y \, dy + \int_{\partial H'} \tau n' \times y \, dO = 0$$

for all deformed subregions $H'$. According to Section 61.4d, the equilibrium
condition for the forces $\operatorname{div} \tau + F = 0$ implies the following:
The equilibrium condition for the torques (T) is equivalent to the symmetry
of the stress tensor $\tau$.
Condition (T) refers to the deformed body. In Problem 61.5 we show that
this implies the natural condition

(T*) $$\int_H K \times y \, dx + \int_{\partial H} \sigma n \times y \, dO = 0$$

for all undeformed subregions $H$.
We emphasize that for general stored energy functions $L$ in (40) condition
(T) need not be satisfied. In the following, we consider two important cases
where (T) is exactly or approximately satisfied.
Case 1: $L = A(\mathcal{E})$.
In this case, which is a physically important situation, the stored energy
function $L$ depends only on the strain tensor $\mathcal{E}$. We will show that then
condition (T) is satisfied. To this end, we choose a Cartesian coordinate system
and set
$$T_{ij} = \frac{\partial A}{\partial \mathcal{E}_{ij}}.$$

We have the symmetry condition
$$T_{ij} = T_{ji} \quad \text{for all } i, j,$$
and (10) implies that, summing over repeated indices,
$$\sigma_{ij} = \frac{\partial L}{\partial D_j u_i} = T_{rs}\,\frac{\partial \mathcal{E}_{rs}}{\partial D_j u_i} = T_{ij} + T_{rj} D_r u_i = T_{rj} D_r y_i$$
with $D_j = \partial/\partial \xi_j$. Note that $y = x + u(x)$. This yields
$$\sigma(x) = y'(x)T(x)$$

and hence $T(x) = x'(y)\sigma(x)$. From Section 61.5h it follows that $T$ is precisely
the second Piola-Kirchhoff tensor, i.e., $T = S$ and hence
$$S = A'(\mathcal{E}).$$

This identification has to be understood in the sense of the scalar product on
$L(V_3)$, i.e.,
$$A'(\mathcal{E})\gamma = [S, \gamma^*] \quad \text{for all } \gamma \in L(V_3).$$

The symmetry of $S$ implies the symmetry of the stress tensor
$$\tau(y) = y'(x)S(x)y'(x)^* \det x'(y),$$
i.e., condition (T) is satisfied.
Case 2: $\sigma$ is symmetric.
This means that $\sigma_{ij} = \sigma_{ji}$, and hence that
$$\frac{\partial L}{\partial D_j u_i} = \frac{\partial L}{\partial D_i u_j} \quad \text{for all } i, j.$$
Using $\operatorname{div} \sigma + K = 0$, the same computation as in Section 61.4d yields

$$\int_H K \times x \, dx + \int_{\partial H} \sigma n \times x \, dO = 0$$

for all undeformed subregions $H$. Thus it follows from $y = x + u(x)$ that for
small displacements, i.e., for small $\|u(x)\|$, the torque condition (T*) is satisfied
in first-order approximation.
Case 2 often occurs in approximation models.

61.6d. General Strategy of Hyperelasticity

We summarize the basic ideas of hyperelasticity. The starting point is the
variational problem

(P) $$\int_G L \, dx - \int_G Ku \, dx - \int_{\partial_2 G} Tu \, dO = \text{stationary}!,$$
$$u = u_0 \quad \text{on } \partial_1 G.$$

We are given the functions
$L = L(x, u')$ (stored energy function),
$K = K(x)$ (density of outer forces),
$T = T(x)$ (density of boundary forces),
$u = u_0(x)$ (displacement of the boundary part $\partial_1 G$),
and are looking for the displacement
$u = u(x)$.
(i) Deformation. If we have a solution $u = u(x)$ of (P), then the deformation
of the region $G$ has the form
$$y = x + u(x).$$
(ii) Constitutive law. The first Piola-Kirchhoff stress tensor is given by
$$\sigma(x) = L_{u'}(x, u'(x)).$$
From this we obtain the stress tensor
$$\tau(y) = \sigma(x)y'(x)^* \det x'(y).$$
In order to find exact models we need the symmetry of $\tau$. From Section
61.6c, this symmetry condition is satisfied if
$$L = A(x, \mathcal{E}),$$
where
$$\mathcal{E}(x) = \tfrac12\bigl(u'(x) + u'(x)^* + u'(x)^*u'(x)\bigr)$$
is the strain tensor.
(iii) Stress forces. Let $H$ be a subregion of $G$ and let $H'$ denote the corresponding
deformed set. Then we have

$$\int_{\partial H} \sigma n \, dO = \text{stress force acting on } H'.$$

This force is equal to

$$\int_{\partial H'} \tau n' \, dO.$$
(iv) Outer forces. We also have

$$\int_H K \, dx = \text{outer force acting on } H'.$$

Setting $F(y) = K(x) \det x'(y)$, this force is equal to

$$\int_{H'} F \, dy.$$

(v) Boundary forces. Let $B$ be a subregion of the boundary part $\partial_2 G$ and let
$B'$ denote the corresponding deformed set. We then have

$$\int_B T \, dO = \text{boundary force acting on } B'.$$

(vi) Euler equations. Sufficiently smooth solutions of the variational problem
(P) satisfy
$$\operatorname{div} \sigma + K = 0 \quad \text{on } G,$$
$$\sigma n = T \quad \text{on } \partial_2 G.$$
This means

$$\int_H K \, dx + \int_{\partial H} \sigma n \, dO = 0,$$

i.e., the outer forces and the stress forces are in equilibrium, and

$$\int_B T \, dO = \int_B \sigma n \, dO,$$

i.e., the outer boundary forces are equal to the corresponding stress
forces.
(vii) Stability. The deformation $y = x + u(x)$ is strictly stable if and only if

$$\int_G L_{u'u'}(x, u'(x))\, h'^2 \, dx > 0$$

for all nonzero $C^1$-maps $h\colon \overline{G} \to V_3$ with $h = 0$ on $\partial_1 G$.

If $u$ is strongly stable in the sense of Definition 61.18, then $u$ corresponds to
a strict local minimum of the original variational problem (P).

61.6e. A General Strategy for Obtaining Approximation Models

The easiest procedure to get approximation models in elasticity is the following:
(a) We start with an exact model (P) in hyperelasticity.
(b) We replace the stored energy function $L$ with an approximation function
$L_{\text{approx}}$.
(c) We consider the corresponding variational problem ($P_{\text{approx}}$).
This gives us a clear idea of the approximation process. In the following
chapters we will frequently use this strategy.
One advantage of hyperelasticity is that we can apply the methods of duality
theory. This will be explained in Chapter 62. The basic idea is to consider the
minimum problem
(P) $$E_{\text{pot}}(u) = \min!$$
for the potential energy together with the dual maximum problem
(P*) $$E_{\text{dual}}(\sigma) = \max!$$
for the dual energy, where $E_{\text{pot}}$ and $E_{\text{dual}}$ depend on the displacement $u$ and
the stress tensor $\sigma$, respectively.
In Section 62.9 we will apply this duality theory to quasi-statical plasticity.

61.7. Elastic Energy of the Cuboid and Constitutive Laws

To be able to apply the variational principle (P) of the previous section, we
need physically meaningful expressions for the elastic potential energy

$$U = \int_G L \, dx.$$

We want to show how such expressions can be obtained from general observations.
Our strategy is the following:
(i) We consider the stretching of a cuboid in a special coordinate system,
using special stresses and Hooke's law.
(ii) The calculated elastic energy in this case will be formulated in an invariant
way.
(iii) This invariant expression then is independent of the special situation.
In the case of an arbitrary body, $U$ is obtained by using a decomposition
of the body into cuboids and summation or integration.

61.7a. Deformation of a Cuboid

In Cartesian coordinates with basis vectors $e_i$ we consider an axially parallel
cuboid
$$C = \{\xi \in \mathbb{R}^3 : -l_i \le \xi_i \le l_i,\ i = 1, 2, 3\}.$$
Suppose the cuboid $C$ is stretched by the transformation
$$\eta_i = (1 + \lambda_i)\xi_i, \qquad i = 1, 2, 3.$$
We then obtain the displacements $u_i = \lambda_i \xi_i$ and
$$\gamma_{ij} = 2^{-1}(D_i u_j + D_j u_i) = \lambda_i \delta_{ij}.$$
The strain of $C$ in the direction of the $\xi_i$-axis is equal to
$$\Delta l_i / l_i = \lambda_i.$$
Suppose the side of the cuboid with $\xi_1 = \pm l_1$ has the surface $S_1$, and the tensile
force $\pm K_1 e_1$ with $K_1 \ge 0$ acts on it. This corresponds to the stress
$$\sigma_1 = K_1 / S_1.$$
According to Hooke's law (60.1) and (60.2) the stress $\sigma_1$ leads to an extension
$$\Delta l_1 / l_1 = \sigma_1 / E$$
and a lateral contraction
$$\Delta l_i / l_i = -\mu \sigma_1 / E, \qquad i = 2, 3.$$
Taking the lateral contractions into account, which are caused by $\sigma_2$ and
$\sigma_3$, one obtains
$$\Delta l_1 / l_1 = E^{-1}\sigma_1 - \mu E^{-1}(\sigma_2 + \sigma_3)$$
together with analogous expressions for $\Delta l_i / l_i$, $i = 2, 3$. This yields
$$\Delta l_i / l_i = (1 + \mu)E^{-1}\bigl[\sigma_i - \mu(1 + \mu)^{-1}(\sigma_1 + \sigma_2 + \sigma_3)\bigr] \tag{46}$$
for $i = 1, 2, 3$. Solving for $\sigma_i$, one finds
$$\sigma_i = \lambda(\gamma_{11} + \gamma_{22} + \gamma_{33}) + 2\kappa\gamma_{ii} \tag{47}$$
with the so-called Lame constants
$$\lambda = \frac{E\mu}{(1 + \mu)(1 - 2\mu)}, \qquad \kappa = \frac{E}{2(1 + \mu)}.$$
From Hooke's law (60.1) and (60.2), the material constants $E$ and $\mu$ can then
be experimentally determined. Table 61.1 contains several values. Recall that
on the surface of the earth 1 kp is the force which is caused by 1 kg, i.e.,
1 kp = 9.81 N. Table 61.1 shows that $\lambda > 0$ and $\kappa > 0$ are reasonable general
assumptions.

Table 61.1.
Material      E [kp/cm^2]      μ
Steel         2.18 · 10^6      0.29
Iron          2.17 · 10^6      0.28
Copper        1.23 · 10^6      0.35
Aluminium     0.74 · 10^6      0.34
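The passage from (46) to (47) and the formulas for the Lame constants can be checked numerically. The following is a minimal sketch using the steel row of Table 61.1; the function names and the sample stresses are illustrative assumptions, and `mu` denotes the material constant μ of the text.

```python
def lame_constants(E, mu):
    """Return (lam, kappa) computed from the material constants E and mu."""
    lam = E * mu / ((1.0 + mu) * (1.0 - 2.0 * mu))
    kappa = E / (2.0 * (1.0 + mu))
    return lam, kappa

def strains_from_stresses(sigma, E, mu):
    """Relation (46): principal strains from principal stresses."""
    total = sum(sigma)
    return [(1.0 + mu) / E * (s - mu / (1.0 + mu) * total) for s in sigma]

def stresses_from_strains(gamma, E, mu):
    """Relation (47): principal stresses from principal strains."""
    lam, kappa = lame_constants(E, mu)
    trace = sum(gamma)
    return [lam * trace + 2.0 * kappa * g for g in gamma]

E, mu = 2.18e6, 0.29                   # steel row of Table 61.1
lam, kappa = lame_constants(E, mu)
sigma = [1000.0, -200.0, 350.0]        # illustrative principal stresses
roundtrip = stresses_from_strains(strains_from_stresses(sigma, E, mu), E, mu)
print(lam, kappa, roundtrip)
```

The round trip returns the original stresses up to rounding, which confirms that (47) inverts (46) for these constants.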

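We now want to compute the elastic energy of the cuboid. During the time
interval $0 \le t \le 1$, we stretch the cuboid by letting
$$\eta_j = (1 + t\lambda_j)\xi_j.$$
The work which thereby is done equals

$$U = \sum_{j=1}^{3} \int_0^1 2K_j(t)\,\dot{\eta}_j(t)\,dt = \sum_{j=1}^{3} K_j(1)\lambda_j l_j.$$

Note that on the side $\xi_j = l_j$ we have $\dot{\eta}_j = \lambda_j l_j$, and
$$K_j(t) = tK_j(1),$$
since the stress forces, which act on all sides of the cuboid, depend linearly on
the dilatation at time $t$, according to (47). Let $V$ be the volume of the cuboid.
Because of
$$K_j(1) = \sigma_j S_j$$
and
$$V = 2 l_j S_j \quad \text{for each } j,$$
we find that
$$U = \tfrac12(\sigma_1\lambda_1 + \sigma_2\lambda_2 + \sigma_3\lambda_3)V. \tag{48}$$
If we define $\sigma \in L(V_3)$ through $\sigma e_i = \sigma_i e_i$, then (47) and (48) may be written in
the invariant form
$$\sigma = \lambda \operatorname{tr} \gamma\, I + 2\kappa\gamma, \tag{49}$$
$$U = \bigl(2^{-1}\lambda(\operatorname{tr} \gamma)^2 + \kappa[\gamma, \gamma]\bigr)V. \tag{50}$$
We interpret $U$ as the elastic energy which, by the stretching, is stored in the
cuboid.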
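The agreement between the work computed face by face and the invariant energy formula (50) can be checked numerically. The following is a minimal sketch, assuming a cuboid with sides at ξ_j = ±l_j (so the face areas are 4 l_i l_k and the volume is 8 l_1 l_2 l_3); all function names and numerical values are illustrative.

```python
def energy_by_work(stretch, half_lengths, lam, kappa):
    """Sum over j of K_j(1) * lambda_j * l_j with K_j(1) = sigma_j * S_j."""
    l1, l2, l3 = half_lengths
    faces = [4 * l2 * l3, 4 * l1 * l3, 4 * l1 * l2]          # face areas S_j
    trace = sum(stretch)
    sigma = [lam * trace + 2 * kappa * s for s in stretch]   # relation (47)
    return sum(sig * S * s * l
               for sig, S, s, l in zip(sigma, faces, stretch, half_lengths))

def energy_invariant(stretch, half_lengths, lam, kappa):
    """Formula (50): U = (lam/2 (tr gamma)^2 + kappa [gamma, gamma]) V."""
    l1, l2, l3 = half_lengths
    volume = 8 * l1 * l2 * l3
    trace = sum(stretch)
    norm_sq = sum(s * s for s in stretch)
    return (0.5 * lam * trace ** 2 + kappa * norm_sq) * volume

stretch = [1e-3, -2e-4, 5e-4]          # illustrative values of lambda_j
half_lengths = [1.0, 2.0, 3.0]         # illustrative values of l_j
u_work = energy_by_work(stretch, half_lengths, 1.2, 0.8)
u_inv = energy_invariant(stretch, half_lengths, 1.2, 0.8)
print(u_work, u_inv)
```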

61.7b. The Yield Condition in Plasticity

Our next goal is the plasticity condition (51) below which is also called the
yield condition. If we consider the extension of a wire, we observe the following
behavior:
$|\sigma| < \sigma_0$: no plasticity,
$|\sigma| \ge \sigma_0$: plasticity occurs,
where $\sigma$ is the stress of the wire. We want to generalize this condition to
three-dimensional bodies without using anything about the concrete microscopic
structure of the bodies. The way to do this is by using a purely
mathematical argument based on the invariants of the stress tensor $\sigma$.
We define the deviations $\bar{\gamma}$, $\bar{\sigma}$ of the tensors $\gamma$, $\sigma$ through
$$\bar{\gamma} = \gamma - 3^{-1} \operatorname{tr} \gamma\, I, \qquad \bar{\sigma} = \sigma - 3^{-1} \operatorname{tr} \sigma\, I.$$
This gives
$$\operatorname{tr} \bar{\gamma} = \operatorname{tr} \bar{\sigma} = 0,$$
and in the special coordinate system used above, this yields
$$3[\bar{\gamma}, \bar{\gamma}] = (\lambda_1 - \lambda_2)^2 + (\lambda_2 - \lambda_3)^2 + (\lambda_3 - \lambda_1)^2,$$
$$3[\bar{\sigma}, \bar{\sigma}] = (\sigma_1 - \sigma_2)^2 + (\sigma_2 - \sigma_3)^2 + (\sigma_3 - \sigma_1)^2.$$
Hence, we obtain
$$[\bar{\gamma}, \bar{\gamma}] = 0 \quad \text{if and only if} \quad \lambda_1 = \lambda_2 = \lambda_3,$$
i.e., the cuboid is stretched uniformly in all three axial directions. The greater
$[\bar{\gamma}, \bar{\gamma}]$ becomes, the more nonuniformly the cuboid is stretched.
Similarly, the greater $[\bar{\sigma}, \bar{\sigma}]$ becomes, the more nonuniform the stresses in
the cuboid become.

EXAMPLE 61.5 (The Plasticity Condition of von Mises (1913)). For the cuboid
we write
$$[\bar{\sigma}, \bar{\sigma}] < \sigma_0^2 \quad \text{no plasticity},$$
$$[\bar{\sigma}, \bar{\sigma}] \ge \sigma_0^2 \quad \text{plasticity occurs}. \tag{51}$$
This criterion is based on the experimental result that a strongly nonuniform
stress in the cuboid causes plasticity, whereas in the case of a strongly uniform
stress, i.e., for strong stresses with $[\bar{\sigma}, \bar{\sigma}] = 0$, no plasticity is observed.
Replacing $\bar{\sigma}$ in (51) with $\bar{\sigma}(x)$, we obtain the plasticity condition for general
three-dimensional bodies which will be used in the following chapters.
The constant $\sigma_0$ depends on the material.
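The yield test (51) is easy to evaluate for a given stress matrix. The following is a minimal sketch, assuming that the scalar product [a, b] is the componentwise sum of a_ij b_ij in Cartesian coordinates; the function names and the sample stress states are illustrative.

```python
def deviator(sigma):
    """sigma - (tr sigma / 3) I for a 3x3 matrix given as nested lists."""
    mean = (sigma[0][0] + sigma[1][1] + sigma[2][2]) / 3.0
    return [[sigma[i][j] - (mean if i == j else 0.0) for j in range(3)]
            for i in range(3)]

def inner(a, b):
    """Scalar product [a, b], assumed to be the sum of a_ij * b_ij."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))

def is_plastic(sigma, sigma0):
    """Test (51): plasticity occurs iff [dev, dev] >= sigma0^2."""
    dev = deviator(sigma)
    return inner(dev, dev) >= sigma0 ** 2

# Uniform (hydrostatic) stress: the deviator vanishes, however large the stress.
hydrostatic = [[500.0, 0.0, 0.0], [0.0, 500.0, 0.0], [0.0, 0.0, 500.0]]
# Uniaxial stress: strongly nonuniform.
uniaxial = [[300.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(is_plastic(hydrostatic, 100.0), is_plastic(uniaxial, 200.0))
```

For the uniaxial state the deviator has the diagonal (200, -100, -100), so [dev, dev] = 60000, which exceeds 200^2 but not 300^2.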

61.7c. The General Linear Model


EXAMPLE 61.6 (Linear Elasticity Theory). We consider a homogeneous and
isotropic body. According to (50), a decomposition of the body into small
cuboids suggests the following ansatz for the elastic energy:

$$U = \int_G 2^{-1}\lambda(\operatorname{tr} \gamma)^2 + \kappa[\gamma, \gamma] \, dx. \tag{52}$$

Here $\gamma$ depends on $x$. The energy expression (52) corresponds to the so-called
linear elasticity theory. Since we used the linear Hooke's law to motivate (50)
and (52), we have to assume that $\|u'(x)\|$ is small, i.e., that all derivatives $|D_i u_j|$
are small.
The constitutive law, which follows from (52) by using the general formula
(41), namely $\sigma = L_{u'}$, is equal to Hooke's law (49).

61.7d. Nonlinear Models


EXAMPLE 61.7 (Nonlinear Elasticity Theory). A general ansatz for $U$ is

$$U = \int_G A(x, \mathcal{E}(x)) \, dx.$$

For small $\|u'(x)\|$ the strain tensor $\mathcal{E}(x)$ of Section 61.2 can be approximately
replaced with $\gamma(x)$.
EXAMPLE 61.8 (Nonlinear Hencky Material). We write (52) in the form

$$U = \int_G 2^{-1}k(\operatorname{tr} \gamma)^2 + \kappa\varphi([\bar{\gamma}, \bar{\gamma}]) \, dx \tag{53}$$

with $k = \lambda + 2\kappa/3$ and $\varphi(t) = t$.

The corresponding nonlinear material behavior occurs for nonlinear functions
$\varphi$ in (53).
The idea for this ansatz is that for a nonuniform strain, i.e., for $[\bar{\gamma}, \bar{\gamma}] \neq 0$,
the cuboid in Section 61.7a exhibits nonlinear behavior.
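The constant k = λ + 2κ/3 in (53) can be recovered by splitting γ into its deviation and its trace part. The following short derivation is a sketch that uses only the definition of the deviation from Section 61.7b together with [I, I] = 3 and [γ̄, I] = tr γ̄ = 0.

```latex
\begin{aligned}
[\gamma, \gamma]
  &= \bigl[\bar{\gamma} + \tfrac13 (\operatorname{tr}\gamma) I,\;
           \bar{\gamma} + \tfrac13 (\operatorname{tr}\gamma) I\bigr]
   = [\bar{\gamma}, \bar{\gamma}] + \tfrac13 (\operatorname{tr}\gamma)^2, \\
\tfrac12 \lambda (\operatorname{tr}\gamma)^2 + \kappa [\gamma, \gamma]
  &= \tfrac12 \bigl(\lambda + \tfrac23 \kappa\bigr) (\operatorname{tr}\gamma)^2
     + \kappa [\bar{\gamma}, \bar{\gamma}].
\end{aligned}
```

Comparing the last line with (53) for φ(t) = t gives k = λ + 2κ/3.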

61.8. Theory of Invariants and the General Structure of Constitutive Laws and Stored Energy Functions

We now want to show that natural symmetry conditions restrict the possible
structure of constitutive laws. The main results will be Theorems 61.B and
61.C below.
Let $\gamma \in L_{\text{sym}}(V_3)$, i.e.,
$$\gamma\colon V_3 \to V_3$$
is a symmetric linear operator. Principal invariants of $\gamma$ are the coefficients of
the characteristic equation
$$\det(\gamma - \lambda I) = 0. \tag{54}$$

Proposition 61.9. Let $\gamma \in L_{\text{sym}}(V_3)$ and let $\lambda_1, \lambda_2, \lambda_3$ be the eigenvalues of $\gamma$. Then
the principal invariants of $\gamma$ are given by
$$\lambda_1 + \lambda_2 + \lambda_3 = \operatorname{tr} \gamma, \qquad \lambda_1\lambda_2\lambda_3 = \det \gamma,$$
$$\lambda_1\lambda_2 + \lambda_2\lambda_3 + \lambda_3\lambda_1 = \tfrac12\bigl((\operatorname{tr} \gamma)^2 - [\gamma, \gamma]\bigr) = \operatorname{tr}(\operatorname{adj} \gamma). \tag{55}$$

PROOF. Note that $\det(\gamma - \lambda I)$, $\operatorname{tr} \gamma$, and $[\gamma, \gamma]$ are invariants of $\gamma$, i.e., they are
independent of the chosen coordinate system. Let $\{e_1, e_2, e_3\}$ be an orthonormal
system of eigenvectors which correspond to the eigenvalues $\lambda_1, \lambda_2, \lambda_3$.
We then have
$$\gamma e_i = \lambda_i e_i, \qquad i = 1, 2, 3. \tag{56}$$
If we choose a Cartesian coordinate system with axes $e_1, e_2, e_3$, we obtain
$$\gamma(\xi_i e_i) = \eta_i e_i, \qquad \eta_i = \lambda_i \xi_i,$$
and hence equation (54) assumes the form
$$(\lambda_1 - \lambda)(\lambda_2 - \lambda)(\lambda_3 - \lambda) = 0.$$
□
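The identities (55) can be verified on a concrete symmetric matrix. The following is a minimal sketch in plain Python; the matrix `gamma` is an illustrative choice with eigenvalues 1, 3, 5, and all function names are assumptions made for this example.

```python
def trace(g):
    return g[0][0] + g[1][1] + g[2][2]

def det3(g):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (g[0][0] * (g[1][1] * g[2][2] - g[1][2] * g[2][1])
            - g[0][1] * (g[1][0] * g[2][2] - g[1][2] * g[2][0])
            + g[0][2] * (g[1][0] * g[2][1] - g[1][1] * g[2][0]))

def norm_sq(g):
    """[g, g], taken here as the sum of the squared components."""
    return sum(g[i][j] ** 2 for i in range(3) for j in range(3))

def principal_invariants(g):
    """Return (tr g, ((tr g)^2 - [g, g]) / 2, det g) for symmetric g."""
    t = trace(g)
    return t, (t * t - norm_sq(g)) / 2, det3(g)

def char_poly(g, x):
    """det(g - x I), evaluated directly."""
    shifted = [[g[i][j] - (x if i == j else 0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)

gamma = [[2, 1, 0], [1, 2, 0], [0, 0, 5]]      # eigenvalues 1, 3, 5
i1, i2, i3 = principal_invariants(gamma)
print(i1, i2, i3)
```

With eigenvalues 1, 3, 5 the three invariants are 1 + 3 + 5, 1·3 + 3·5 + 5·1, and 1·3·5, and the characteristic polynomial vanishes at each eigenvalue.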

61.8a. Isotropic Real Tensor Functions

Proposition 61.10. Let $f\colon L_{\text{sym}}(V_3) \to \mathbb{R}$ be an isotropic function, i.e.,
$$f(R^{-1}\gamma R) = f(\gamma) \tag{57}$$
for all rotations $R$ and all $\gamma \in L_{\text{sym}}(V_3)$.
Then $f$ is a function of the principal invariants, i.e., there exists a function
$g\colon \mathbb{R}^3 \to \mathbb{R}$ such that
$$f(\gamma) = g(\operatorname{tr} \gamma, [\gamma, \gamma], \det \gamma). \tag{58}$$

PROOF. Let $\gamma, \alpha \in L_{\text{sym}}(V_3)$. It suffices to show that the relation
$$\alpha = R^{-1}\gamma R \tag{59}$$
holds precisely if the principal invariants of $\gamma$ and $\alpha$ are equal. Here, $R$ denotes
a rotation.
From (59) it follows that
$$\det(\alpha - \lambda I) = \det(\gamma - \lambda I).$$
Hence, the principal invariants equal each other.
If, conversely, the principal invariants are equal, then the characteristic
equations are identical, and hence the same is true for the eigenvalues. It
follows that there exists a rotation which transforms the corresponding eigenvectors
of $\gamma$ and $\alpha$ into each other. This implies (59). □

To explain the intuitive meaning of (57), consider the equation
$$y = \gamma x.$$
A rotation $R$ sends $x$ and $y$ into $x_R = Rx$ and $y_R = Ry$, respectively. Hence,
we obtain
$$y_R = \gamma_R x_R \quad \text{with } \gamma_R = R\gamma R^{-1},$$
and condition (57) means that $f(\gamma_R) = f(\gamma)$.


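61.8b. Isotropic Tensor Functions

Proposition 61.11 (Rivlin and Ericksen (1955)). Let $T\colon L_{\text{sym}}(V_3) \to L_{\text{sym}}(V_3)$ be
an isotropic tensor function, i.e.,
$$R^{-1}T(\gamma)R = T(R^{-1}\gamma R) \tag{60}$$
for all rotations $R$ and all $\gamma \in L_{\text{sym}}(V_3)$.
Then there exist real functions $a, b, c\colon \mathbb{R}^3 \to \mathbb{R}$ such that
$$T(\gamma) = aI + b\gamma + c\gamma^2 \tag{61}$$
where
$$a = a(\operatorname{tr} \gamma, [\gamma, \gamma], \det \gamma),$$
together with analogous expressions for $b$ and $c$.

PROOF.

(I) Let $\{e_1, e_2, e_3\}$ be an orthonormal system of eigenvectors of $\gamma$ with
corresponding eigenvalues $\lambda_1, \lambda_2, \lambda_3$, i.e.,
$$\gamma e_i = \lambda_i e_i, \qquad i = 1, 2, 3.$$
We show that this is also an orthonormal system of eigenvectors for $T(\gamma)$,
i.e.,
$$T(\gamma)e_i = t_i e_i, \qquad i = 1, 2, 3.$$
We define a rotation by
$$Qe_1 = e_1, \qquad Qe_2 = -e_2, \qquad Qe_3 = -e_3.$$
From (60) and $Q^{-1}\gamma Q = \gamma$ it follows that
$$Q^{-1}T(\gamma)Q = T(\gamma).$$
In the language of matrices, this means
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}
\begin{pmatrix} t_{11} & t_{12} & t_{13} \\ t_{21} & t_{22} & t_{23} \\ t_{31} & t_{32} & t_{33} \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}
= \begin{pmatrix} t_{11} & -t_{12} & -t_{13} \\ -t_{21} & t_{22} & t_{23} \\ -t_{31} & t_{32} & t_{33} \end{pmatrix}
= \begin{pmatrix} t_{11} & t_{12} & t_{13} \\ t_{21} & t_{22} & t_{23} \\ t_{31} & t_{32} & t_{33} \end{pmatrix}$$
and hence $t_{12} = t_{13} = t_{21} = t_{31} = 0$. Similarly, we obtain $t_{23} = t_{32} = 0$.

(II) According to (I), there exists a rotation $R$ which diagonalizes both $\gamma$ and
$T(\gamma)$. From (60), i.e.,
$$R^{-1}T(\gamma)R = T(R^{-1}\gamma R)$$
with
$$R^{-1}T(\gamma)R = \begin{pmatrix} t_1 & 0 & 0 \\ 0 & t_2 & 0 \\ 0 & 0 & t_3 \end{pmatrix}, \qquad
R^{-1}\gamma R = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix},$$
it follows that
$$t_i = T_i(\lambda_1, \lambda_2, \lambda_3), \qquad i = 1, 2, 3,$$
with fixed functions $T_1, T_2, T_3$ for all $\gamma \in L_{\text{sym}}(V_3)$.
(III) Any permutation of the $\lambda_i$ leads to a corresponding permutation of the
$t_i$. This follows from (60) by using rotations like
$$R = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
(IV) Case 1: $\lambda_1 \neq \lambda_2 \neq \lambda_3 \neq \lambda_1$. Since
$$\begin{vmatrix} 1 & \lambda_1 & \lambda_1^2 \\ 1 & \lambda_2 & \lambda_2^2 \\ 1 & \lambda_3 & \lambda_3^2 \end{vmatrix}
= (\lambda_1 - \lambda_2)(\lambda_2 - \lambda_3)(\lambda_3 - \lambda_1) \neq 0,$$
the linear system
$$t_i = a + b\lambda_i + c\lambda_i^2, \qquad i = 1, 2, 3,$$
has a unique solution $a, b, c$. Thus, $a$ is a function of $\lambda_1, \lambda_2, \lambda_3, t_1, t_2, t_3$,
i.e., a function of $\lambda_1, \lambda_2, \lambda_3$, and hence a function of the principal
invariants of $\gamma$, according to Proposition 61.10. This proves assertion (61)
in Case 1.
Case 2: $\lambda_1 = \lambda_2$, $\lambda_2 \neq \lambda_3$. From (II) it follows that $t_1 = t_2$. We then
consider the linear system
$$t_i = a + b\lambda_i, \qquad i = 2, 3.$$
Case 3: $\lambda_1 = \lambda_2 = \lambda_3$. In this case we obtain from (II) that $t_1 = t_2 = t_3$,
and hence that $t_i = a$, $i = 1, 2, 3$. □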
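Case 1 of the proof can be made concrete by solving the 3x3 system for a, b, c in exact arithmetic. The following is a minimal sketch, assuming the illustrative tensor function T(γ) = γ^3 (so that t_i = λ_i^3); the helper name `quadratic_coefficients` is hypothetical. By the Cayley-Hamilton theorem, γ^3 = (tr γ)γ^2 - (λ_1λ_2 + λ_2λ_3 + λ_3λ_1)γ + (det γ)I, so the computed coefficients should be exactly the principal invariants up to sign.

```python
from fractions import Fraction

def quadratic_coefficients(lams, ts):
    """Solve t_i = a + b*lam_i + c*lam_i**2 (i = 1, 2, 3) for distinct lam_i.

    Uses Newton divided differences, which gives the same unique solution
    as inverting the Vandermonde matrix from Case 1.
    """
    l1, l2, l3 = (Fraction(x) for x in lams)
    t1, t2, t3 = (Fraction(x) for x in ts)
    d12 = (t2 - t1) / (l2 - l1)
    d23 = (t3 - t2) / (l3 - l2)
    c = (d23 - d12) / (l3 - l1)
    b = d12 - c * (l1 + l2)
    a = t1 - b * l1 - c * l1 * l1
    return a, b, c

# Illustrative choice (an assumption): T(gamma) = gamma^3, so t_i = lam_i^3.
lams = (1, 3, 5)
a, b, c = quadratic_coefficients(lams, [x ** 3 for x in lams])
print(a, b, c)
```

For the eigenvalues 1, 3, 5 one finds a = 15 = det γ, b = -23, and c = 9 = tr γ, so each coefficient depends on γ only through its principal invariants, as asserted in (61).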

61.8c. Structure of the Constitutive Laws for Homogeneous Bodies

We consider the constitutive law
$$\tau(y) = \Phi(I + u'(x)) \tag{62}$$
for a homogeneous body, i.e., $\Phi$ does not depend on $x$. It is reasonable then
to postulate
$$R^{-1}\Phi(R + Ru'(x))R = \Phi(I + u'(x)) \tag{63}$$
for all rotations $R$ and all $x$. This condition is called the axiom of material
frame indifference. In order to motivate (63), we consider the stress vector
$$T(y) = \tau(y)n, \qquad y = x + u(x). \tag{64}$$
Using a rotation $R$, we pass from $y$ to
$$y_R = Ry = Rx + Ru(x).$$
We set
$$\tau_R(y) = \Phi(R + Ru'(x)),$$
and observe that it is quite natural to require
$$RT(y) = \tau_R(y)Rn,$$
which implies
$$T(y) = (R^{-1}\tau_R(y)R)n. \tag{64*}$$
A comparison of (64) with (64*) yields (63).

Proposition 61.12. Condition (63) is equivalent to
$$S(x) = B(\mathcal{E}(x)), \tag{65}$$
where $S$ is the second Piola-Kirchhoff stress tensor and $\mathcal{E}$ the strain tensor.

PROOF.

(I) (63) ⇒ (65). We set $C = I + u'(x)$. From Proposition 61.1 it follows that
$$C = RU, \qquad U = (I + 2\mathcal{E}(x))^{1/2},$$
and Section 61.5h implies
$$CS(x)C^* = (\det C)\tau(y). \tag{66}$$
Noting $R^{-1} = R^*$, condition (63) then yields
$$\tau(y) = \Phi(C) = \Phi(RU) = R\Phi(U)R^{-1} = CU^{-1}\Phi(U)U^{-1}C^*.$$
This implies (65).
(II) (65) ⇒ (63). Reverse the argument. □

Proposition 61.12 shows that the most general constitutive law for homogeneous
bodies is given by (65).

61.8d. Structure of the Constitutive Laws for Homogeneous Isotropic Bodies

Let $y = x + u(x)$ denote a deformation of a homogeneous body. The body is
called isotropic if and only if the following natural conditions

(i) $y(x) = y(Rx)$,
(ii) $\tau(y) = \tau(Ry)$,
(iii) $S(Rx) = B(\mathcal{E}(Rx))$

are satisfied for all points $x$ and all rotations $R$.

Theorem 61.B (Rivlin-Ericksen Theorem (1955)). The most general constitutive
law for a homogeneous isotropic body is given by
$$S = aI + b\mathcal{E} + c\mathcal{E}^2, \tag{67}$$
where
$$a = a(\operatorname{tr} \mathcal{E}, [\mathcal{E}, \mathcal{E}], \det \mathcal{E}),$$
and there exist analogous expressions for $b$ and $c$. Here, $S$ is the second Piola-
Kirchhoff stress tensor and $\mathcal{E}$ is the strain tensor.

PROOF. Condition (i) implies
$$Rx + u(Rx) = x + u(x).$$
Differentiation with respect to $x$ yields
$$R + u'(Rx)R = I + u'(x)$$
and hence
$$I + 2\mathcal{E}(Rx) = (I + u'(Rx))^*(I + u'(Rx)) = R(I + u'(x))^*(I + u'(x))R^{-1}.$$
This gives
$$\mathcal{E}(Rx) = R\mathcal{E}(x)R^{-1}. \tag{68}$$
Similarly, we obtain from (ii) and (66) that
$$S(Rx) = RS(x)R^{-1}.$$
Finally, condition (iii) shows that $B$ is an isotropic tensor function. The assertion
then follows from Proposition 61.11. □

61.8e. The Stored Energy Function

From Proposition 61.12, the most general constitutive law for homogeneous
bodies is given by
$$S = B(\mathcal{E})$$
with $B\colon L_{\text{sym}}(V_3) \to L_{\text{sym}}(V_3)$. We then consider the special case where $B$ is a
potential operator with potential $A$, i.e.,
$$S = A'(\mathcal{E}). \tag{69}$$
Comparison with Section 61.6c shows that it makes sense to call $L = A(\mathcal{E})$ a
stored energy function. Moreover, using (68) it is reasonable to postulate that
the stored energy function of an isotropic body has the property
$$A(R^{-1}\mathcal{E}R) = A(\mathcal{E})$$
for all rotations $R$ and all $\mathcal{E}$, i.e., $A$ is isotropic.

Theorem 61.C. The stored energy function of a homogeneous isotropic body has
the form
$$L = L(\operatorname{tr} \mathcal{E}, [\mathcal{E}, \mathcal{E}], \det \mathcal{E}).$$

PROOF. This follows immediately from Proposition 61.10. □

61.8f. Special Constitutive Laws

EXAMPLE 61.13 (Saint Venant-Kirchhoff Material). We begin with the most
general constitutive law for homogeneous isotropic bodies
$$S = aI + b\mathcal{E} + c\mathcal{E}^2,$$
where $a = a(\operatorname{tr} \mathcal{E}, [\mathcal{E}, \mathcal{E}], \det \mathcal{E})$ and analogous expressions hold for $b$, $c$. We
assume that $|\mathcal{E}|$ is small. Taylor's theorem then implies that
$$a = a_0 + a_1 \operatorname{tr} \mathcal{E} + o(|\mathcal{E}|), \qquad \mathcal{E} \to 0.$$
We postulate that $\mathcal{E} = 0$ yields $S = 0$, i.e., $a_0 = 0$. Setting $a_1 = \lambda$ and $b(0) = 2\kappa$,
we obtain
$$S = \lambda \operatorname{tr} \mathcal{E}\, I + 2\kappa\mathcal{E} + o(|\mathcal{E}|), \qquad \mathcal{E} \to 0.$$
By definition, the linear term
$$S = \lambda \operatorname{tr} \mathcal{E}\, I + 2\kappa\mathcal{E} \tag{70}$$
corresponds to the Saint Venant-Kirchhoff material. Setting
$$A(\mathcal{E}) = \tfrac12\lambda(\operatorname{tr} \mathcal{E})^2 + \kappa[\mathcal{E}, \mathcal{E}], \tag{71}$$
we obtain
$$S = A'(\mathcal{E}).$$
Hence, $L = A(\mathcal{E})$ is the corresponding stored energy function.
If we replace the strain tensor $\mathcal{E}$ with the linearized strain tensor $\gamma$, then we
obtain linear elastic materials. This shows that the constitutive law for linear
elastic materials can be obtained from general considerations without using
Hooke's law as in Section 61.7.
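The relation S = A'(ℰ) between (70) and (71) can be checked with a difference quotient. The following is a minimal sketch, assuming that A'(ℰ)h equals the componentwise sum of S_ij h_ij for symmetric h; the function names and the sample tensors are illustrative.

```python
def trace(m):
    return m[0][0] + m[1][1] + m[2][2]

def stored_energy(e, lam, kappa):
    """A(E) = lam/2 (tr E)^2 + kappa [E, E], formula (71)."""
    return (0.5 * lam * trace(e) ** 2
            + kappa * sum(e[i][j] ** 2 for i in range(3) for j in range(3)))

def stress(e, lam, kappa):
    """S = lam tr(E) I + 2 kappa E, formula (70)."""
    t = trace(e)
    return [[lam * t * (i == j) + 2 * kappa * e[i][j] for j in range(3)]
            for i in range(3)]

def directional_derivative(e, h, lam, kappa, eps=1e-6):
    """Central difference quotient approximating A'(E)h."""
    plus = [[e[i][j] + eps * h[i][j] for j in range(3)] for i in range(3)]
    minus = [[e[i][j] - eps * h[i][j] for j in range(3)] for i in range(3)]
    return (stored_energy(plus, lam, kappa)
            - stored_energy(minus, lam, kappa)) / (2 * eps)

e = [[0.01, 0.002, 0.0], [0.002, -0.003, 0.001], [0.0, 0.001, 0.004]]
h = [[1.0, 0.5, 0.0], [0.5, 2.0, -1.0], [0.0, -1.0, 3.0]]
s = stress(e, 1.2, 0.8)
pairing = sum(s[i][j] * h[i][j] for i in range(3) for j in range(3))
print(pairing, directional_derivative(e, h, 1.2, 0.8))
```

Since A is quadratic, the central difference quotient agrees with the pairing of S and h up to rounding.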
EXAMPLE 61.14 (Rubberlike Material of Ogden (1972)). Set $A = I + u'(x)$. Let
$\lambda_1, \lambda_2, \lambda_3$ be the eigenvalues of $(A^*A)^{1/2}$, which are called the principal stretches
of $A$. If $\mu_1, \mu_2, \mu_3$ are the eigenvalues of $\mathcal{E}(x)$, then

$$\lambda_i = (1 + 2\mu_i)^{1/2} \quad \text{for all } i.$$

The stored energy function has the form

$$L = C(\lambda_1^p + \lambda_2^p + \lambda_3^p - 3) + D\bigl((\lambda_1\lambda_2)^q + (\lambda_2\lambda_3)^q + (\lambda_3\lambda_1)^q - 3\bigr) + f(\lambda_1\lambda_2\lambda_3),$$

where $p, q \ge 1$, $C, D > 0$, and $f\colon\ ]0, \infty[\ \to \mathbb{R}$ is convex with

$$\lim_{d \to +0} f(d) = +\infty.$$

In terms of $A$ this yields

$$L = C \operatorname{tr}(E^p - I) + D \operatorname{tr}(\operatorname{adj}(E^q) - I) + f(\det A), \tag{71*}$$

where $E = (A^*A)^{1/2}$ is the stretch tensor.

More generally, Ogden's material consists of sums of such expressions.
Mooney-Rivlin material corresponds to the special case $p = q = 2$.
EXAMPLE 61.15 (Polyconvex Material of Ball (1977)). From Section 61.2b, the
deformation of curves, surfaces, and volume elements depends on $A$, $\operatorname{adj} A$,
and $\det A$ where $A = I + u'(x)$. Thus, it is natural to assume that the stored
energy function has the form

$$L = P(A, \operatorname{adj} A, \det A).$$

The function $L$ is called polyconvex if and only if the function $P$ is convex with
respect to its three arguments, i.e., if the map

$$(A, B, d) \mapsto P(A, B, d)$$

is convex.
From Problem 61.1, Ogden's stored energy function in Example 61.14 is
polyconvex. In the special case of Mooney-Rivlin material, the polyconvexity
of $L$ follows easily from

$$L = C \operatorname{tr}(A^*A - I) + D \operatorname{tr}(\operatorname{adj}(A^*A) - I) + f(\det A)$$

(see Problem 61.1b).
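The Mooney-Rivlin expression can be evaluated directly from A without computing eigenvalues. The following is a minimal sketch, assuming the illustrative choice f(d) = -log d (convex on ]0, ∞[ and tending to +∞ as d → +0); the function names, the constants, and the sample deformation gradient are assumptions made for this example.

```python
import math

def gram(a):
    """Return A* A for a real 3x3 matrix A (A* is the transpose)."""
    return [[sum(a[k][i] * a[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def trace_adj(m):
    """tr(adj m), i.e. the sum of the three principal 2x2 minors."""
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]
            + m[0][0] * m[2][2] - m[0][2] * m[2][0]
            + m[1][1] * m[2][2] - m[1][2] * m[2][1])

def neg_log(d):
    """Hypothetical choice of the convex function f."""
    return -math.log(d)

def mooney_rivlin(a, C, D, f=neg_log):
    """L = C tr(A*A - I) + D tr(adj(A*A) - I) + f(det A)."""
    m = gram(a)
    tr = m[0][0] + m[1][1] + m[2][2]
    return C * (tr - 3) + D * (trace_adj(m) - 3) + f(det3(a))

stretches = (2.0, 0.5, 1.5)                     # illustrative principal stretches
diag = [[stretches[0], 0.0, 0.0], [0.0, stretches[1], 0.0], [0.0, 0.0, stretches[2]]]
print(mooney_rivlin(diag, 2.0, 3.0))
```

For a diagonal A the result coincides with the principal-stretch form of the stored energy with p = q = 2, and it vanishes for A = I with this choice of f.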

61.9. Existence and Uniqueness in Linear Elastostatics (Generalized Solutions)

We now want to show that the main theorem about quadratic variational
problems (Theorem 22.A) immediately leads to an existence and uniqueness
theorem in linear elastostatics.

61.9a. The Classical Problem

The principle of minimal potential energy for linear material yields the following
variational problem for the displacement vector $u$:

$$\int_G L(u'(x)) \, dx - \int_G Ku \, dx = \min!,$$
$$u = u_0 \quad \text{on } \partial G, \tag{72}$$
where
$$L = \tfrac12\lambda(\operatorname{tr} \gamma)^2 + \kappa[\gamma, \gamma], \qquad \gamma = \tfrac12(u'(x) + u'(x)^*).$$
Note that $\operatorname{tr} \gamma = \operatorname{div} u$. The corresponding Euler equations are
$$\operatorname{div} \sigma + K = 0,$$
$$\sigma = L_{u'} = \lambda \operatorname{tr} \gamma\, I + 2\kappa\gamma. \tag{73}$$
In terms of the displacement vector $u$, we obtain the so-called Lame equations
$$\kappa\Delta u + (\lambda + \kappa) \operatorname{grad} \operatorname{div} u + K = 0 \quad \text{in } G,$$
$$u = u_0 \quad \text{on } \partial G. \tag{74}$$

61.9b. The Key Inequality

We set

$$a(u, v) = \int_G \lambda \operatorname{tr} \gamma(u) \operatorname{tr} \gamma(v) + 2\kappa[\gamma(u), \gamma(v)] \, dx,$$

$$b(u) = \int_G Ku \, dx.$$

The key observation, in our existence proof, is the inequality

$$a(u, u) \ge 2\kappa \int_G [\gamma(u), \gamma(u)] \, dx \ge \kappa \int_G [u'(x), u'(x)] \, dx \tag{75}$$
for all $u \in C_0^\infty(G, V_3)$.
Here, $u \in C_0^\infty(G, V_3)$ means that the components of $u$ belong to $C_0^\infty(G)$.
This is a simple special case of Korn's inequality of Chapter 62. In fact, for
all $u_i \in C_0^\infty(G)$, integration by parts yields

$$\int_G D_i u_j D_j u_i \, dx = -\int_G (D_j D_i u_j) u_i \, dx = \int_G D_j u_j D_i u_i \, dx = \int_G (\operatorname{div} u)^2 \, dx \ge 0.$$

Hence

$$\int_G [\gamma(u), \gamma(u)] \, dx = \int_G \tfrac14 (D_i u_j + D_j u_i)(D_i u_j + D_j u_i) \, dx
\ge \tfrac12 \int_G D_i u_j D_i u_j \, dx = \tfrac12 \int_G [u'(x), u'(x)] \, dx.$$

Furthermore, the Poincare-Friedrichs inequality of Section 18.9 implies

$$\int_G [u'(x), u'(x)] \, dx \ge C \int_G u(x)^2 \, dx. \tag{76}$$

We set $X = \mathring{W}_2^1(G; V_3)$, i.e., $X$ consists of all vector functions $u\colon G \to V_3$ where,
in a fixed coordinate system, the components of $u$ belong to $\mathring{W}_2^1(G)$. This
definition is independent of the choice of the coordinate system. According to
(76), the space $X$ is a real H-space with scalar product

$$(u|v)_X = \int_G [u'(x), v'(x)] \, dx.$$

61.9c. The Generalized Problem

The generalized problem which belongs to the classical variational problem
(72) is
$$\tfrac12 a(w + u_0, w + u_0) - b(w + u_0) = \min!, \qquad w \in X. \tag{77}$$
The corresponding Euler equation
$$a(w + u_0, v) = b(v) \quad \text{for all } v \in X \tag{78}$$
is the generalized equation of the classical Lame equations (74). Our hypotheses
are the following:
(H1) $G$ is a bounded region in $\mathbb{R}^3$ with $\partial G \in C^{0,1}$.
(H2) We are given the density of outer forces $K \in L_2(G; V_3)$ and the boundary
displacement $u_0 \in W_2^1(G; V_3)$.

Theorem 61.D (Main Theorem of Linear Elastostatics). Assume (H1), (H2).
Then both problems (77) and (78) have the same unique solution $w \in X$. The
corresponding displacement vector is given by $u = w + u_0$.

PROOF. This is an immediate consequence of Theorem 22.A. The strong
positivity of $a(\cdot, \cdot)$ follows from Korn's inequality (75). □

Theorem 22.A also yields the convergence of the Ritz method, together with
error estimates.
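The Ritz method can be illustrated on a one-dimensional analogue. The following is a minimal sketch under several assumptions that are not from the text: the displacement has the form u(ξ_1)e_1 on the interval ]0, 1[ with u(0) = u(1) = 0, the force density K is constant, and the Ritz basis consists of the functions sin(nπx). For such displacements a(u, v) reduces to (λ + 2κ)∫u'v' dx, and b(v) = ∫Kv dx is taken with the sign that makes (77) agree with (72).

```python
import math

def ritz_coefficients(K, lam, kappa, n_terms):
    """Ritz coefficients for (lam + 2*kappa) u'' + K = 0 on ]0, 1[,
    u(0) = u(1) = 0, in the basis sin(n*pi*x), n = 1, ..., n_terms."""
    stiffness = lam + 2.0 * kappa
    coeffs = []
    for n in range(1, n_terms + 1):
        a_nn = stiffness * (n * math.pi) ** 2 / 2.0               # a(phi_n, phi_n)
        b_n = K * (1.0 - math.cos(n * math.pi)) / (n * math.pi)   # b(phi_n)
        coeffs.append(b_n / a_nn)    # the basis is a-orthogonal, so the system is diagonal
    return coeffs

def evaluate(coeffs, x):
    return sum(c * math.sin((n + 1) * math.pi * x) for n, c in enumerate(coeffs))

def exact(K, lam, kappa, x):
    """Closed-form solution of the one-dimensional model problem."""
    return K * x * (1.0 - x) / (2.0 * (lam + 2.0 * kappa))

K, lam, kappa = 1.0, 1.2, 0.8
err_1 = abs(evaluate(ritz_coefficients(K, lam, kappa, 1), 0.5) - exact(K, lam, kappa, 0.5))
err_25 = abs(evaluate(ritz_coefficients(K, lam, kappa, 25), 0.5) - exact(K, lam, kappa, 0.5))
print(err_1, err_25)
```

The error at the midpoint decreases as more basis functions are used, in line with the convergence statement; the sketch does not reproduce the error estimates of Theorem 22.A.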

61.10. Existence and Uniqueness in Linear Elastodynamics (Generalized Solutions)

We use the same notation as in Section 61.9. The classical problem of linear
elastodynamics is
$$\rho_0 u_{tt} = \operatorname{div} \sigma + K,$$
i.e.,
$$\rho_0 u_{tt} = \kappa\Delta u + (\lambda + \kappa) \operatorname{grad} \operatorname{div} u + K \quad \text{in } G \times\ ]0, \infty[,$$
$$u(x, t) = u_0(x, t) \quad \text{on } \partial G \times\ ]0, \infty[, \tag{79}$$
$$u(x, 0) = u_1(x) \quad \text{on } G,$$
$$u_t(x, 0) = u_2(x) \quad \text{on } G.$$

For the sake of simplicity, we set $u_0 = 0$. Letting $w = u - u_0$, the general case
can always be reduced to this special case.
With $(u|v) = \int_G \rho_0 uv \, dx$, the generalized problem reads as follows:
$$\frac{d^2}{dt^2}(u|v) + a(u, v) = b(v) \quad \text{for all } v \in C_0^\infty(G; V_3),$$
$$u(0) = u_1, \qquad u'(0) = u_2. \tag{80}$$

We obtain (80) from (79) by multiplying with $v \in C_0^\infty(G; V_3)$ and integrating
by parts.
Set $Q_T = G \times\ ]0, T[$ and

$$X = \mathring{W}_2^1(G; V_3), \qquad H = L_2(G; V_3).$$

Theorem 61.E (Main Theorem of Linear Elastodynamics). Let $G$ be a bounded
region in $\mathbb{R}^3$, and $T > 0$. Suppose we are given the mass density $\rho_0 \in C(\overline{G})$, the
density of outer forces $K \in L_2(Q_T)$, the initial position $u_1 \in X$, and the initial
velocity $u_2 \in H$.
Problem (80) then has a unique solution with
$$u' \in L_2(0, T; H), \qquad u'' \in L_2(0, T; X^*).$$

PROOF. This is an immediate consequence of Theorem 24.A. The key inequality
(75) yields the strong positivity of $a(\cdot, \cdot)$. □

In Section 24.2 we also proved the convergence of the corresponding
Galerkin method.

61.11. Strongly Elliptic Systems


The following results serve as a preparation for our general considerations in
the following sections on nonlinear elasticity. The key observation is that the
linearization of the equations of nonlinear elasticity at strongly stable solutions
yields linear strongly elliptic systems. In order to be able to apply the
implicit function theorem and the methods of bifurcation theory, we need
information about the solutions of such linear strongly elliptic systems.
Consider the system
$$-a_{ijkm} D_i D_j u_k + a_{jkm} D_j u_k = K_m \quad \text{in } G,$$
$$u_m = g_m \quad \text{on } \partial G, \qquad m = 1, \ldots, M, \tag{81}$$
together with the homogeneous adjoint system
$$-D_i D_j(a_{ijkm} u_m^*) - D_j(a_{jkm} u_m^*) = 0 \quad \text{in } G,$$
$$u_m^* = 0 \quad \text{on } \partial G, \qquad m = 1, \ldots, M. \tag{81*}$$
We employ the notation
$$x = (\xi_1, \ldots, \xi_N), \qquad D_i = \partial/\partial \xi_i,$$
and sum over two equal indices, where $i, j = 1, \ldots, N$ and $k, m = 1, \ldots, M$.
Moreover, let us make the following assumptions:
(H1) $G$ is a bounded region in $\mathbb{R}^N$ with $\partial G \in C^\infty$.
(H2) $a_{ijkm}, a_{jkm} \in C^\infty(\overline{G})$ for all $i, j, k, m$.
(H3) The system is strongly elliptic, i.e., there is a constant $c > 0$ such that
$$a_{ijkm}(x) d_i d_j v_k v_m \ge c|d|^2 |v|^2$$
for all $d \in \mathbb{R}^N$, $v \in \mathbb{R}^M$, $x \in \overline{G}$.
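The Lame system (74) is a standard instance of condition (H3). Writing (74) as -κΔu_m - (λ + κ)D_m D_k u_k = K_m, the quadratic form in (H3) becomes κ|d|^2|v|^2 + (λ + κ)(d·v)^2. The following is a minimal sketch that samples this form at random vectors; the function names and the sampling scheme are illustrative assumptions.

```python
import random

def lame_symbol(d, v, lam, kappa):
    """a_ijkm d_i d_j v_k v_m for the Lame operator:
    kappa |d|^2 |v|^2 + (lam + kappa) (d . v)^2."""
    dd = sum(x * x for x in d)
    vv = sum(x * x for x in v)
    dv = sum(x * y for x, y in zip(d, v))
    return kappa * dd * vv + (lam + kappa) * dv * dv

def ellipticity_margin(lam, kappa, samples=1000, seed=0):
    """Smallest sampled value of the symbol divided by |d|^2 |v|^2."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(samples):
        d = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        v = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        scale = sum(x * x for x in d) * sum(x * x for x in v)
        if scale > 1e-12:
            worst = min(worst, lame_symbol(d, v, lam, kappa) / scale)
    return worst

margin = ellipticity_margin(1.2, 0.8)
print(margin)     # never below kappa when lam + kappa >= 0
```

For λ + κ ≥ 0 the sampled quotient never drops below κ, so (H3) holds with c = κ; a random sample only illustrates the inequality and is no substitute for the algebraic argument.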
We are now looking for solutions of (81) with the given functions
$$K_m \in W_2^{k-2}(G), \qquad g_m \in W_2^{k-1/2}(\partial G)$$
for all $m$ and $k = 2, 3, \ldots$. We expect solutions of the form
$$u_m \in W_2^k(G) \quad \text{for all } m.$$
Proposition 61.16(ii) below shows that uniqueness implies the strongest possible
existence result.
Setting
$$Au = (K, g),$$
equation (81) defines an operator
$$A\colon W_2^k(G)^M \to W_2^{k-2}(G)^M \times W_2^{k-1/2}(\partial G)^M$$
for all $k = 2, 3, \ldots$. The spaces $W_2^{k-1/2}(\partial G)$ of boundary functions were introduced
in $A_2$(48). The important point is that the embedding
$$W_2^k(G) \subseteq W_2^{k-1/2}(\partial G)$$
is continuous and surjective. Conversely, there exists a continuous extension
operator
$$T\colon W_2^{k-1/2}(\partial G) \to W_2^k(G)$$
such that $Tu = u$ on $\partial G$. Using this fact one can reduce (81) to the homogeneous
case.

Proposition 61.16. Assume (H1)-(H3). Then:

(i) The operator $A$ is Fredholm of index zero for all $k = 2, 3, \ldots$.
If $K \in C^\infty(\overline{G})^M$ and $g \in C^\infty(\partial G)^M$, then each solution $u$ of (81) belongs to
$C^\infty(\overline{G})^M$.
(ii) If the homogeneous equation (81) has only the trivial solution $u = 0$, then
$A$ is a linear homeomorphism.
(iii) Let $K \in L_2(G)^M$ and $g = 0$ on $\partial G$. Then equation (81) has a solution if and
only if

$$\int_G \sum_{m=1}^{M} K_m u_m^* \, dx = 0$$

for all solutions $u^*$ of the adjoint equation (81*).

The homogeneous equations (81) and (81*) have an equal finite number of
linearly independent solutions.

PROOF. Use the same argument as in Chapter 22 for strongly elliptic equations
including regularization. The key is Gårding's inequality

(G) $$\int_G u_m L_m u \, dx \ge c \int_G (D_k u_m)^2 \, dx - d \int_G u_m^2 \, dx$$

for all $u \in C_0^\infty(G)^M$ with constants $c, d > 0$, where $L_m u$ denotes the left-hand
side of the first line in (81). Compare Browder (1954) and Nirenberg (1955). In
(G), we sum over $k, m = 1, \ldots, M$. □

We are now interested in a substantially stronger result, i.e., we are looking
for solutions
$$u_m \in C^{2,\alpha}(\overline{G}) \quad \text{for all } m, \qquad 0 < \alpha < 1,$$
for given functions
$$K_m \in C^\alpha(\overline{G}), \qquad g_m \in C^{2,\alpha}(\partial G) \quad \text{for all } m.$$
The use of Hölder spaces will be essential. We note that the following proposition
is wrong in the case $\alpha = 0$, which underlines the importance of Hölder
spaces in the theory of elliptic differential equations.
Let us make the following assumptions:
(A1) $G$ is a bounded region in $\mathbb{R}^N$ with $\partial G \in C^{2,\alpha}$ for fixed $0 < \alpha < 1$.
(A2) $a_{ijkm}, a_{jkm} \in C^\alpha(\overline{G})$ for all $i, j, k, m$.
(A3) The system (81) is strongly elliptic.
Now consider the operator
$$A\colon C^{2,\alpha}(\overline{G})^M \to C^\alpha(\overline{G})^M \times C^{2,\alpha}(\partial G)^M$$
defined through $Au = (K, g)$ by equation (81).

Proposition 61.17. Assume (A1)-(A3). Then:

(i) The operator $A$ is Fredholm of index zero.
(ii) If the homogeneous equation (81) has only the trivial solution $u = 0$, then $A$
is a linear homeomorphism.

PROOF. This is a profound and extremely sharp result in the theory of elliptic
partial differential equations. A detailed proof would be rather long. We sketch
here the main ideas for (ii).
(I) First, consider the case of $C^\infty$-coefficients. By using Proposition 61.16,
one obtains a solution $u \in W_2^2(G)^M$. The $L_p$-estimates in Agmon, Douglis,
and Nirenberg (1959), Part II yield $u \in W_p^2(G)^M$ for all $p \ge 2$. The Sobolev
embedding theorems then imply that $u \in C^{1,\beta}(\overline{G})^M$ and the result follows
from sharp Schauder a priori estimates in Agmon, Douglis, and Nirenberg
(1959), Part II, Theorem 9.3. See also Morrey (1966, M), Chapter 6.
(II) Now suppose the coefficients are in $C^\alpha(\overline{G})$. Note that the $C^{2,\alpha}(\overline{G})$-a priori
estimates in (I) depend only on an upper bound for the $C^\alpha(\overline{G})$-norms of
the coefficients. Use the smoothing operator of Section 18.14 in order to
approximate the $C^\alpha$-coefficients by $C^\infty$-coefficients and observe that the
$C^\alpha(\overline{G})$-norms of all the approximating coefficients are uniformly bounded.
This is a consequence of the properties of the smoothing operator. A compactness
argument together with the $C^{2,\alpha}$-a priori estimates yields the final
assertion. □

61.12. Local Existence and Uniqueness Theorem in Nonlinear Elasticity via the Implicit Function Theorem

61.12a. Basic Ideas

The basic idea in our approach to nonlinear elasticity is the following:

(i) In order to solve the basic equations of elastostatics, we linearize around
so-called strongly stable solutions, which correspond to known strongly
stable deformation states of the elastic body (e.g., the rest state $u = 0$).
(ii) It is important then that these linearizations correspond to linear strongly
elliptic systems which have unique solutions as a consequence of the strong
stability.
(iii) The Fredholm property of linear strongly elliptic systems implies that the
linearization corresponds to a bijective operator. Thus the implicit function
theorem (Theorem 4.B) can be applied. This leads to Theorem 61.F
below.
(iv) A known strongly stable solution of the nonlinear basic equations of
elastostatics may be continued via the implicit function theorem, as long
as this continuation remains strongly stable, i.e., the linearization corresponds
to a bijective operator.
The discretization of this continuation procedure yields an approximation
method, the convergence of which is proved in Theorem 61.H below.
This convergence proof uses standard arguments from the theory of
ordinary differential equations in B-spaces.

In order to obtain classical solutions, we work in Hölder spaces. Using the
results from Section 61.11, this approach also applies to Sobolev spaces, where
it yields generalized solutions.
To be able to give a simple physical interpretation in terms of stability
theory via potential energy, we use a variational approach. Our method,
however, is also applicable to the general basic equations of nonlinear
elastostatics, which do not correspond to variational problems. Let us call a
deformation state u admissible if and only if the linearization of the basic equations
at u yields a linear strongly elliptic system which has a unique solution, i.e.,
where the corresponding linear operator is bijective. Then, roughly speaking,
the following general and very natural local result follows from the implicit
function theorem.

In a sufficiently small neighborhood of a known admissible (e.g., strongly


stable) deformation state of an elastic body, there exist uniquely determined new
deformation states if we consider small changes in the outer forces and boundary
displacements.

The connection with bifurcation theory is the following. If the known
deformation state u is not admissible, then bifurcation may occur, i.e., small
changes in the outer forces and boundary displacements may lead to several
nonuniquely determined new deformation states of the elastic body. In this
case, nature chooses the new state with the "greatest stability" (e.g., the lowest
potential energy). Such bifurcation situations correspond, for example, to the
buckling of rods, beams, plates, and shells.
In the case of the variational approach it is crucial that the linearization
of the basic equation at u is equal to the Euler equation of the accessory
quadratic variational problem, i.e., we have the situation of Figure 61.6. Recall
that, in Section 29.12, we used accessory quadratic variational problems in
order to obtain general sufficient criteria for minima (eigenvalue criteria).

[Figure 61.6. Diagram: the variational problem (principle of minimal potential
energy) passes, by linearization at u, to the accessory quadratic variational
problem. The Euler equation of the variational problem (basic equation of
elastostatics), linearized at u, gives the linearized Euler equation, which
coincides with the linear Euler equation of the accessory quadratic problem.]

61.12b. Variational Problem and Strongly Stable States

We consider the principle of minimal potential energy

    $\int_G L(u'(x))\,dx - \int_G K u\,dx = \min!$,
                                                                (82)
    $u = g$ on $\partial G$.

Let us assume that a fixed Cartesian coordinate system is given and that the
sum is taken over two equal indices from 1 to 3. From Theorem 61.A, the
Euler equations to (82) are

    $-a_{ijkm} D_i D_j u_k = K_m$ in $G$,
                                                                (83)
    $u_m = g_m$ on $\partial G$, $m = 1, 2, 3$,

where

    $a_{ijkm} = \dfrac{\partial^2 L(u'(x))}{\partial D_i u_m \, \partial D_j u_k}$.

Let $L$ be $C^{\infty}$. The elastic potential energy of the body is then given by

    $U = \int_G L(u'(x))\,dx$.

An important role will be played by the second variation

    $\delta^2 U(u; h) = \int_G L_{u'u'}(u'(x))\,h'(x)^2\,dx = \int_G a_{ijkm} D_i h_m D_j h_k\,dx$

and by the so-called accessory quadratic variational problem

    $\tfrac{1}{2}\delta^2 U(u; h) - \int_G K h\,dx = \min!$,
                                                                (84)
    $h = g$ on $\partial G$

for the unknown function h. Set

A necessary condition for a solution u of (82) to exist is that

According to Section 18.17b, this leads to the Legendre-Hadamard condition

    $L_{u'u'}(u'(x))(v \circ d)^2 \ge 0$ for all $v, d \in V_3$,

where $v \circ d$ denotes the dyadic product. In components,

    $L_{u'u'}(u'(x))(v \circ d)^2 = a_{ijkm} d_i d_j v_k v_m$.

Definition 61.18. A function $u \in C^2(\bar G)^3$ is called strongly stable with respect to
$U$ if and only if

    $\delta^2 U(u; h) > 0$ for all nonzero $h \in \mathring{W}_2^1(G)^3$                (85)

and the strong Legendre-Hadamard condition is valid, i.e.,

    $L_{u'u'}(u'(x))(v \circ d)^2 > 0$                                (86)

for all nonzero $v, d \in V_3$ and all $x \in \bar G$.

From Gårding's inequality of Section 29.19 and Hestenes' theorem
(Proposition 22.39), it follows that (85) implies the existence of a constant $C > 0$ such
that

    $\delta^2 U(u; h) \ge C \|h\|^2$ for all $h \in \mathring{W}_2^1(G)^3$,                (85*)

where $\|h\|$ denotes the norm on $W_2^1(G)^3$.
Using a simple continuity argument, it follows from condition (86) that there
exists a constant $c > 0$ such that

    $L_{u'u'}(u'(x))(v \circ d)^2 \ge c\,|v|^2 |d|^2$ for all $v, d \in V_3$                (86*)

and all $x \in \bar G$, i.e., the system (83) is strongly elliptic.
According to (85*) and Theorem 29.L, each strongly stable $C^2(\bar G)$-solution
of the Euler equation (83) yields a strict local minimum of the original
variational problem (82) with respect to the space $C^1(\bar G)^3$.

61.12c. Local Continuation

In order to prove a local continuation theorem, we assume the following:

(H1) $G$ is a bounded region in $\mathbb{R}^3$ with $\partial G \in C^{2,\alpha}$ for fixed $0 < \alpha < 1$.
(H2) The stored energy function $L\colon \mathbb{R}^9 \to \mathbb{R}$ is $C^{\infty}$.
(H3) We know a strongly stable solution $\bar u \in C^{2,\alpha}(\bar G)^3$ of (83) with corresponding
density of outer forces $\bar K$ and boundary displacement $\bar g$.
Set

    $X = C^{2,\alpha}(\bar G)^3$, $Y = C^{\alpha}(\bar G)^3 \times C^{2,\alpha}(\partial G)^3$.

Theorem 61.F (Local Existence and Uniqueness). Assume (H1)-(H3). Then,
there exist neighborhoods

    $V(\bar u)$ in $X$ and $W(\bar K, \bar g)$ in $Y$

such that, for each $(K, g) \in W$, equation (83) has a unique solution $u \in V$.

PROOF.

(I) Equation (83) defines an operator $F\colon X \to Y$ by letting

    $F(u) = (K, g)$.

The linearized equation

    $F'(u)h = (K, g)$

corresponds to the linearization of (83), i.e.,

m = 1, 2, 3,
whereby we are looking for h. Formula (87*) can be written as

    $-D_i\bigl(a_{ijkm}(u'(x)) D_j h_k\bigr) = K_m$ in $G$,
                                                                (87)
    $h_m = g_m$ on $\partial G$, $m = 1, 2, 3$.

The key observation then is that (87) is precisely the Euler equation to
the accessory variational problem (84).
(II) Let $h \in X$ be a solution of

    $F'(\bar u)h = 0$,

i.e., $h$ is a solution of (87) with $u = \bar u$ and $K = 0$, $g = 0$. Integration by
parts yields

    $\delta^2 U(\bar u; h) = \int_G a_{ijkm} D_i h_m D_j h_k\,dx = -\int_G h_m D_i\bigl(a_{ijkm} D_j h_k\bigr)\,dx = 0$

and hence $h = 0$, since $\bar u$ is strongly stable.
(III) Proposition 61.17 shows that $F'(\bar u)\colon X \to Y$ is bijective. The assertion
then follows from the implicit function theorem (Theorem 4.B). □

EXAMPLE 61.19. Let

    $L = M(\gamma(u))$,

where $M\colon L_{\mathrm{sym}}(V_3) \to \mathbb{R}$ is $C^{\infty}$ and suppose that there exists a constant $c > 0$
such that
                                                                (88)
Then, each solution $u$ of (83) is strongly stable.
Condition (88) is fulfilled for linear materials, i.e.,

    $L = \tfrac{1}{2}\lambda(\operatorname{tr}\gamma)^2 + \kappa[\gamma, \gamma]$.

PROOF. In this case, we obtain

Korn's inequality (75), (76) implies

    $\int_G a_{ijkm} D_i h_m D_j h_k\,dx = \int_G M''(\gamma(u))\gamma(h)^2\,dx \ge \int_G c\,[\gamma(h), \gamma(h)]\,dx \ge C\|h\|^2$ for all $h \in C_0^{\infty}(G)^3$.

This is (85). Moreover, from

    $0 = a_{ijkm} d_i d_j v_k v_m \ge \tfrac{c}{4}(d_i v_m + v_i d_m)^2$

we obtain $d_i v_m = -v_i d_m$ for all $i, m$, and hence $d = 0$ or $v = 0$. This is (86). □
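
The strong Legendre-Hadamard condition (86) for the linear material of Example 61.19 can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes the coefficient formula $a_{ijkm} = \lambda\delta_{im}\delta_{jk} + \kappa(\delta_{ij}\delta_{mk} + \delta_{ik}\delta_{mj})$, obtained by differentiating $L = \tfrac12\lambda(\operatorname{tr}\gamma)^2 + \kappa[\gamma,\gamma]$ twice, and the function names are hypothetical.

```python
import random


def lame_coefficients(lam, kappa):
    """Return a[i][j][k][m] = d^2 L / d(D_i u_m) d(D_j u_k) for the linear material
    L = lam/2 * (tr gamma)^2 + kappa * [gamma, gamma] (assumed closed form)."""
    def delta(p, q):
        return 1.0 if p == q else 0.0

    r = range(3)
    return [[[[lam * delta(i, m) * delta(j, k)
               + kappa * (delta(i, j) * delta(m, k) + delta(i, k) * delta(m, j))
               for m in r] for k in r] for j in r] for i in r]


def legendre_hadamard_form(a, v, d):
    """Evaluate the quadratic form a_ijkm d_i d_j v_k v_m appearing in (86)."""
    r = range(3)
    return sum(a[i][j][k][m] * d[i] * d[j] * v[k] * v[m]
               for i in r for j in r for k in r for m in r)


def closed_form(lam, kappa, v, d):
    """Same form, simplified: lam*(v.d)^2 + kappa*(|v|^2 |d|^2 + (v.d)^2)."""
    dot = sum(x * y for x, y in zip(v, d))
    nv = sum(x * x for x in v)
    nd = sum(x * x for x in d)
    return lam * dot ** 2 + kappa * (nv * nd + dot ** 2)


if __name__ == "__main__":
    lam, kappa = 1.0, 0.5          # sample material constants (assumption)
    a = lame_coefficients(lam, kappa)
    rng = random.Random(0)
    for _ in range(5):
        v = [rng.uniform(-1, 1) for _ in range(3)]
        d = [rng.uniform(-1, 1) for _ in range(3)]
        print(legendre_hadamard_form(a, v, d), closed_form(lam, kappa, v, d))
```

In this sketch the form is bounded below by $\min(\kappa, \lambda + 2\kappa)\,|v|^2|d|^2$, so it is positive for all nonzero $v, d$ whenever $\kappa > 0$ and $\lambda + 2\kappa > 0$.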

EXAMPLE 61.20. Let

    $L = M(\mathcal{E}(u))$,

where $M\colon L_{\mathrm{sym}}(V_3) \to \mathbb{R}$ is $C^{\infty}$ and suppose that there exists a constant $c > 0$
such that
                                                                (89)
Then $u = 0$ is strongly stable. This is proved analogously to Example 61.19.
Condition (89) is fulfilled for Saint Venant-Kirchhoff materials, i.e.,

    $L = \tfrac{1}{2}\lambda(\operatorname{tr}\mathcal{E})^2 + \kappa[\mathcal{E}, \mathcal{E}]$.

Proposition 61.21. Suppose we have the situation of Example 61.19 or 61.20.
The trivial solution u = 0 (rest state) of the Euler equation (83) with K = 0 and
g = 0 is then strongly stable. From Theorem 61.F it follows that, for all sufficiently
small smooth boundary displacements g and densities of outer forces K, there
exists a unique classical solution u of (83).

61.13. Existence and Uniqueness Theorem in Linear Elastostatics (Classical Solutions)

Theorem 61.G. Choose $0 < \alpha < 1$. Let $G$ be a bounded region in $\mathbb{R}^3$ with
$\partial G \in C^{2,\alpha}$. Let

    $L = \tfrac{1}{2}\lambda(\operatorname{tr}\gamma)^2 + \kappa[\gamma, \gamma]$.

Then, for each $K \in C^{\alpha}(\bar G)^3$ and $g \in C^{2,\alpha}(\partial G)^3$, the Lamé equations (83) have a
unique solution $u \in C^{2,\alpha}(\bar G)^3$.

PROOF. This follows from Proposition 61.21 and the linearity of (83) in this
special case. □

61.14. Stability and Bifurcation in Nonlinear Elasticity

We consider a homogeneous and isotropic body. From Section 61.8, the
stored energy function must have the form

    $L = M(\mathcal{E})$.

Suppose that $M\colon L_{\mathrm{sym}}(V_3) \to \mathbb{R}$ is $C^{\infty}$. We use the notation of Section 61.12.
Let $\Sigma$ be the set of all

    $u \in X$, $(K, g) \in Y$,

which are solutions of the basic equations of nonlinear elasticity (83). Our
preceding results lead to the following clear picture. Let $P = (u, K, g)$.

[Figure 61.7. (a) Near a strongly stable equilibrium state, $\Sigma$ is locally a curve
over $(K, g)$. (b) A subset of $\Sigma$ showing strongly stable and unstable equilibrium
states, the boundary of stability, and a bifurcation state Q. Here u = displacement;
K = density of outer forces; g = boundary displacement.]

(i) The points $P$ of $\Sigma$ are equilibrium states of the elastic body. They
correspond to critical points of the potential energy

    $E_{\mathrm{pot}} = \int_G L\,dx - \int_G K u\,dx$.

(ii) If $P \in \Sigma$ and $u$ is strongly stable, then, by Theorem 61.F, $\Sigma$ behaves locally
like a curve (Fig. 61.7(a)).
(iii) Bifurcation may only occur at points $Q \in \Sigma$ which are not strongly stable
(Fig. 61.7(b) shows a subset of $\Sigma$).
(iv) By definition, the boundary of stability is given by all the states $u \in X$ which
are "almost" strongly stable, i.e.,

    $\delta^2 U(u; h) \ge 0$ for all $h \in \mathring{W}_2^1(G)^3$,

and $u$ is not strongly stable, i.e.,

    $\delta^2 U(u; h) = 0$ for some nonzero $h \in \mathring{W}_2^1(G)^3$,

or

    $L_{u'u'}(u'(x))(v \circ d)^2 = 0$

for some $x \in \bar G$ and some nonzero $v, d \in V_3$.
(v) Points $P \in \Sigma$ with

    $\delta^2 U(u; h) < 0$ for some nonzero $h \in \mathring{W}_2^1(G)^3$

correspond to unstable equilibrium states of the body. In this case, the
potential energy $E_{\mathrm{pot}}$ has a critical point at $u$ which is not a minimum.
Summarizing our observations we obtain the following basic principle of
nonlinear elasticity theory:

Loss of strong stability can lead to bifurcation, i.e., to new equilibrium
states.

Bifurcation, for example, can lead to the buckling of rods, beams, plates, and
shells. Of special interest are such bifurcation points where precisely one new
strongly stable branch occurs. Let us, for instance, look at point Q in Figure
61.7(b). Nature will follow the new strongly stable branch (e.g., a buckled state
of a plate). At point Q* in Figure 61.7(b), there occur two new strongly stable
branches. In this case, nature will follow the strongly stable branch with the
lower potential energy.
For the study of bifurcation problems one can use the main theorem of
bifurcation theory for potential operators (Theorem 29.K). Applications to
the theory of nonlinear plates will be considered in Chapter 65.
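
The picture of stable branches, unstable branches, and the boundary of stability can be illustrated on a one-dimensional toy model. The sketch below is an assumption for illustration only, not an elasticity model from the text: it uses the scalar potential $E(u) = u^4/4 - \lambda u^2/2$ with a load parameter $\lambda$, whose critical points play the role of equilibrium states and whose second derivative plays the role of the second variation.

```python
import math


def equilibria(lam):
    """Critical points of E(u) = u**4/4 - lam*u**2/2, i.e. roots of u**3 - lam*u = 0."""
    if lam > 0:
        r = math.sqrt(lam)
        return [-r, 0.0, r]
    return [0.0]


def energy(u, lam):
    """Toy potential energy."""
    return u ** 4 / 4.0 - lam * u ** 2 / 2.0


def second_variation(u, lam):
    """E''(u) = 3u^2 - lam; a positive value means 'strongly stable' in this toy model."""
    return 3.0 * u ** 2 - lam


def classify(lam):
    """List (state, energy, strongly_stable) for every equilibrium at load lam."""
    return [(u, energy(u, lam), second_variation(u, lam) > 0) for u in equilibria(lam)]


if __name__ == "__main__":
    for lam in (-1.0, 0.0, 1.0):
        print(lam, classify(lam))
```

For $\lambda < 0$ the rest state is the only equilibrium and is stable; at $\lambda = 0$ it sits on the boundary of stability; for $\lambda > 0$ it is unstable and two new stable branches with lower energy appear, mirroring the situation at the point Q* described above.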
In Example 61.20 we have shown that the natural condition

with $c > 0$ implies that the rest state $u = 0$, $K = 0$, $g = 0$ is strongly stable.
Hence, in a neighborhood of the origin, $\Sigma$ looks like a "curve" (Fig. 61.7(a)).
If we know $(u, K, g) \in \Sigma$, then all the interesting physical quantities can easily
be computed. The second Piola-Kirchhoff tensor is obtained from

    $S = M'(\mathcal{E})$,

and the stress tensor in the deformed region follows from

    $\tau(y) = (\det y'(x))^{-1}\, y'(x) S(x) y'(x)^*$, $y = x + u(x)$.

According to Sections 61.3 and 61.5, the stress forces acting on a deformed
subregion $H'$ are given by

    $\int_{\partial H} \sigma n\,dO = \int_{\partial H'} \tau n'\,dO'$

and the outer forces by

    $\int_H K\,dx = \int_{H'} K \det x'(y)\,dy$.

The model considered above is an exact model in nonlinear elasticity, i.e.,
the basic equations in Section 61.3 are strictly satisfied.
Our approach also applies to approximation models, where $L = L(u'(x))$
or, more generally, $L = L(u'(x), x)$ (inhomogeneous bodies). We then compute
the reduced stress tensor from

    $\sigma = L_{u'}$

and the stress tensor in the deformed region from

    $\tau(y) = (\det y'(x))^{-1}\, \sigma(x) y'(x)^*$.

Note, however, that in this general case, the symmetry of $\tau$ may be violated,
i.e., the basic requirement that "total torque = 0" may not hold.
In several situations it is an open problem as to what the global structure
of the set $\Sigma$ of equilibrium states and the region of strong stability look like.
Because of bifurcation, rupture, and plasticity in nature, we expect the global
structure of $\Sigma$ to be very complex.
In the special case of linear elasticity, $\Sigma$ is a plane in $X \times Y$ and all points
of $\Sigma$ are strongly stable, according to Example 61.19. This shows again that
linear elasticity is an unrealistic model for large displacements.

61.15. The Continuation Method in Nonlinear Elasticity and an Approximation Method

If one knows a strongly stable solution of the basic equation (83), for example,
the rest state $u = 0$, $K = 0$, $g = 0$, then one can try to continue this solution.
This can be done, for example, by replacing $K$ and $g$ in (83) with $tK$ and $tg$,
where $0 \le t \le 1$, i.e., we obtain an operator equation

    $F(u(t)) = (tK, tg)$, $u(0) = 0$.                                (90)

Intuitively, this means that we follow the curve $\Sigma$ in Figure 61.7(a) starting at
the origin. As in the proof of Theorem 61.F, the implicit function theorem tells
us that, at least for small $t$, there exists a unique solution curve

    $u = u(t)$.

The operator $F\colon X \to Y$ is $C^{\infty}$ and bounded on bounded sets. By the implicit
function theorem, the derivative $u'(t)$ exists. From (90) follows

    $F'(u(t))u'(t) = (K, g)$.                                (91)

Moreover, we can continue $u = u(t)$ as long as $u(t)$ remains strongly stable,
i.e., $F'(u(t))^{-1}$ exists on $Y$.
If we replace the differential quotient in (91) with the difference quotient,
then we find the natural approximation method

    $F'(u^t)u^{t+\Delta t} = F'(u^t)u^t + \Delta t\,(K, g)$.                                (92)

Suppose that $\Delta t$, $K$, and $g$ are given and that we have already computed $u^t$
for fixed $t = 0, \Delta t, 2\Delta t, \ldots$. We are then looking for the new deformation state

    $u^{t+\Delta t}$.

Our method starts with the rest state $u^t = 0$ if $t = 0$. Note that $t$ is an index.
In the proof of Theorem 61.F we computed the F-derivative $F'(u)h$ in terms
of a linear elliptic system. Therefore, we can translate (92) into a system of
differential equations. According to (87), equation (92) corresponds to the
system

    $-D_i\bigl(a_{ijkm}(u^t) D_j h_k\bigr) = \tilde K_m^t$ in $G$, $m = 1, 2, 3$,
                                                                (93)
    $h = u^t + \Delta t\, g$ on $\partial G$

for the unknown function

    $h = u^{t+\Delta t}$

with the new density of the outer forces

    $\tilde K_m^t = K_m^t + \Delta t\, K_m$.                                (94)

Here the old density of the outer forces $K_m^t$ is given by the equilibrium
condition

    $-D_i\bigl(a_{ijkm}(u^t) D_j u_k^t\bigr) = K_m^t$.                                (95)

Equation (93) represents a strongly elliptic system for determining $u^{t+\Delta t}$ if $u^t$
is strongly stable.
Moreover, equation (93) for determining $h = u^{t+\Delta t}$ is the Euler equation for
the accessory variational problem

    $\tfrac{1}{2}\delta^2 U(u^t; h) - \int_G \tilde K^t h\,dx = \min!$,
                                                                (93**)
    $h = u^t + \Delta t\, g$ on $\partial G$.

Hence the easiest way to compute $u^{t+\Delta t}$ is to solve (93**) approximately by a
Ritz method, whereby error estimates are obtained from the duality theory of
the following chapter.
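
The step (92) can be tried out on a scalar stand-in for the operator equation. The following is a minimal sketch under stated assumptions: the toy operator $F(u) = u + u^3$ (with $F'(u) = 1 + 3u^2 > 0$, so every state is "strongly stable") replaces the elasticity operator, there is no boundary datum $g$, and the function names are hypothetical.

```python
def F(u):
    """Toy stand-in for the elasticity operator; strictly increasing in u."""
    return u + u ** 3


def dF(u):
    """Derivative F'(u); its invertibility plays the role of strong stability."""
    return 1.0 + 3.0 * u ** 2


def continuation(K, n_steps):
    """Follow F(u(t)) = t*K from the rest state u(0) = 0 up to t = 1 using step (92)."""
    dt = 1.0 / n_steps
    u = 0.0  # rest state at t = 0
    for _ in range(n_steps):
        # (92): F'(u^t) u^{t+dt} = F'(u^t) u^t + dt*K, solved for u^{t+dt}
        u = (dF(u) * u + dt * K) / dF(u)
    return u


if __name__ == "__main__":
    K = 2.0  # F(1) = 2, so the exact state at t = 1 is u = 1
    for n in (10, 100, 1000):
        u_n = continuation(K, n)
        print(n, u_n, abs(u_n - 1.0), abs(F(u_n) - K))
```

Each step only requires solving a linear problem with the frozen operator $F'(u^t)$, which is the scalar counterpart of solving the linear strongly elliptic system (93).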

61.15a. Physical Interpretation

This approximation method admits a very natural physical interpretation. We
obtain the continuation $u^{t+\Delta t}$ from the known state $u^t$ by using:
(a) linear material corresponding to the elastic potential energy
$u \mapsto \tfrac{1}{2}\delta^2 U(u^t; u)$; and
(b) the new density of outer forces $\tilde K^t$.

Note that formula (94) for $\tilde K^t$ is very natural because (95) is exactly the
equilibrium condition for the displacement $u^t$ with respect to the linear
material in (a).
More precisely, we set

Then we have

i.e., $\mathcal{L}^t$ is the stored energy function to the linear material corresponding to
$u^t$ in (a). Furthermore, we introduce the stress tensor

    $\sigma^t(u) = \mathcal{L}^t_{u'}(u')$

which corresponds to this linear material.

Approximation Method 61.22. The method (93)-(95) corresponds to the
equation

    $-\operatorname{div} \sigma^t(u^{t+\Delta t}) = \tilde K^t$ in $G$,
                                                                (93*)
    $u^{t+\Delta t} = u^t + \Delta t\, g$ on $\partial G$

with

    $\tilde K^t = K^t + \Delta t\, K$,                                (94*)

where $K^t$ is obtained from

    $-\operatorname{div} \sigma^t(u^t) = K^t$.                                (95*)

We start with $u^t = 0$ and $K^t = 0$ if $t = 0$ and compute successively

    $u^{t+\Delta t}$ for $t = 0, \Delta t, 2\Delta t, \ldots$.

We are given the step length $\Delta t$, the density of outer forces $K$, and the
boundary displacement $g$.

We call

    $\mu^t = \sigma^t(u^{t+\Delta t}) - \sigma^t(u^t)$

the supplementary stress with respect to the state $u^t$. We then obtain the
equilibrium condition

    $-\operatorname{div} \mu^t = \Delta t\, K$,

i.e., the supplementary stress is in equilibrium with the outer force
corresponding to $\Delta t\, K$.

61.15b. The Singular Bifurcation Case


This method can be applied as long as $u^t$ remains strongly stable; it breaks
down if $u^t$ loses its strong stability. The same is true for $u(t)$.
In the bifurcation case for the basic equation (83), i.e., for

    $F(u(t)) = (tK, tg)$,

we can apply the methods of bifurcation theory from Chapter 8. If the
linearized system $F'(u(t))h = 0$ remains strongly elliptic, but loses its unique
solvability, then $F'(u(t))$ is a Fredholm operator of index zero, according to
Section 61.11, and hence we have to apply the Fredholm alternative from
Proposition 61.16 in order to get the branching equations of
Ljapunov-Schmidt. In concrete cases it may be hard to apply this method, because we
need explicit expressions for the solutions of the linearized problem.
There exists, however, another method for handling the bifurcation case.
It consists in an application of the main theorem of bifurcation theory for
potential operators (Theorem 29.K). This theorem tells us that each nontrivial
solution of the linearized problem corresponds to a bifurcation point and,
roughly speaking, that the number of branches is at least equal to the number
of the linearly independent nontrivial solutions of the linearized problem. This
approach has been studied in detail in Section 29.20. We will come back to
this in Chapter 65 in the context of buckling of plates.

61.16. Convergence of the Approximation Method

The following theorem proves the convergence of this approximation method.
We construct the continuous curve $t \mapsto v_{\Delta t}(t)$ by letting

    $v_{\Delta t}(n\,\Delta t) = u^{n\Delta t}$, $n = 0, 1, \ldots, N$

and using linear interpolation (Fig. 61.8(a)). Here $N\,\Delta t = T$.

[Figure 61.8. (a) The piecewise linear curve $v_{\Delta t}$ over the nodes $\Delta t, 2\Delta t, \ldots, T$.
(b) The neighborhood $\Omega$ of the solution curve.]

Let

    $0 < \beta < \alpha$,

and recall that $X_{\alpha} = X$, $Y_{\alpha} = Y$.
Theorem 61.H. Let $u = u(t)$, $0 \le t \le T$, be a solution curve of (90) such that $u(t)$
is strongly stable for all $t \in [0, T]$. Then

    $\lim_{\Delta t \to 0} \max_{0 \le t \le T} \|v_{\Delta t}(t) - u(t)\|_{X_{\beta}} = 0$,

i.e., $v_{\Delta t}$ converges to $u(\cdot)$ as $\Delta t \to 0$.

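PROOF. The key formulas in this simple proof are

    $u(t) - u(n\,\Delta t) = \int_{n\Delta t}^{t} F'(u(s))^{-1}(K, g)\,ds$,
                                                                (96)
    $v_{\Delta t}(t) - v_{\Delta t}(n\,\Delta t) = \int_{n\Delta t}^{t} F'(v_{\Delta t}(n\,\Delta t))^{-1}(K, g)\,ds$

for all $t \in J$ where $J = [n\,\Delta t, (n+1)\,\Delta t]$.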


(I) The inverse operator $F'(u(t))^{-1}$ exists for all $t \in [0, T]$. Using the
continuity of inverse formation (Problem 1.7) and the compactness of $[0, T]$,
there exists an open bounded neighborhood $\Omega$ of the solution curve in
$X$ such that

    $F'(u)^{-1}$ exists on $Y$ for all $u \in \Omega$ (Fig. 61.8(b)).

The operator $F\colon X \to Y$ is $C^2$ and bounded on bounded sets. From
Problem 1.7, the operator

    $u \mapsto F'(u)^{-1}$

is bounded and Lipschitz continuous on $\Omega$.
(II) Formula (96) implies that

    $\|u(t) - v_{\Delta t}(t)\| \le \|u(n\,\Delta t) - v_{\Delta t}(n\,\Delta t)\| + \dfrac{C\,\Delta t}{2} \max_{s \in J} \|u(s) - v_{\Delta t}(n\,\Delta t)\|$

for all $t \in J$. By using induction, $u(0) = v_{\Delta t}(0)$ and

    $C\,\Delta t + C^2(\Delta t)^2 + C^3(\Delta t)^3 + \cdots \le \dfrac{C\,\Delta t}{1 - C\,\Delta t}$,

we obtain that, for $\Delta t$ sufficiently small, $v_{\Delta t}$ remains in $\Omega$.


(III) Convergence of a subsequence. Since

    $\sup_{u \in \Omega} \|F'(u)^{-1}\| < \infty$,

equation (96) implies the equicontinuity of $\{v_{\Delta t}\}$. The embedding $X \subseteq X_{\beta}$
is compact. Hence the Arzelà-Ascoli theorem $A_1$(24) tells us that there
exists a subsequence $(v_{\Delta t})$ and a function $v(\cdot)$ such that

    $\max_{0 \le t \le T} \|v_{\Delta t}(t) - v(t)\|_{X_{\beta}} \to 0$ as $\Delta t \to 0$.                (97)

The operator $F\colon X_{\beta} \to Y_{\beta}$ has the same properties as $F\colon X \to Y$ by
Proposition 61.17. Letting $\Delta t \to 0$, we obtain from (96) that

    $v(t) = \int_0^t F'(v(s))^{-1}(K, g)\,ds$.

Differentiation of this integral shows that

    $F'(v(t))v'(t) = (K, g)$, $v(0) = 0$.

Thus, $u'(t)$ and $v'(t)$ are classical solutions of the same strongly elliptic
differential equation and hence $u'(t) = v'(t)$. Integration yields $u(t) = v(t)$.
(IV) The same argument proves that each subsequence which satisfies (97)
converges to $u(\cdot)$. The convergence principle in Proposition 10.13(1)
shows that (97) is valid for the original sequence. □
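
The convergence statement can be observed numerically on a scalar toy problem. The sketch below is illustrative only and rests on assumptions not in the text: the operator is the hypothetical $F(u) = u + u^3$, the exact curve $u(t)$ solving $F(u(t)) = tK$ is obtained by bisection, and the error is measured only at the nodes $n\,\Delta t$ rather than over the whole interpolated curve.

```python
def exact_state(t, K):
    """Solve u + u**3 = t*K for u >= 0 by bisection (assumes t*K >= 0)."""
    lo, hi = 0.0, max(t * K, 0.0)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid + mid ** 3 < t * K:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def max_nodal_error(K, n_steps):
    """Largest deviation |u^{n dt} - u(n dt)| over the nodes n = 1, ..., n_steps."""
    dt = 1.0 / n_steps
    u, worst = 0.0, 0.0
    for n in range(1, n_steps + 1):
        # step (92) for the scalar toy operator, where F'(u) = 1 + 3u^2
        u = u + dt * K / (1.0 + 3.0 * u ** 2)
        worst = max(worst, abs(u - exact_state(n * dt, K)))
    return worst


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, max_nodal_error(2.0, n))
```

In this toy setting the nodal error shrinks as the step length is reduced, which is the finite-dimensional shadow of the limit asserted in Theorem 61.H.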
Our approach to nonlinear elasticity in Sections 61.12-61.16 has been
strongly influenced by the work of Beckert (1975), (1984, S).

PROBLEMS

61.1. Polyconvex functions. Let $A \in L(V_3)$, i.e., the operator $A\colon V_3 \to V_3$ is linear. The
eigenvalues $\lambda_1, \lambda_2, \lambda_3$ of $(A^*A)^{1/2}$ are called the principal stretches of $A$.
61.1a. Show that the function $f\colon L(V_3) \to \mathbb{R}$,
    $f(A) = \operatorname{tr} A^*A = \lambda_1^2 + \lambda_2^2 + \lambda_3^2$
is convex.
Solution: $\delta^2 f(A)H^2 = 2 \operatorname{tr} H^*H \ge 0$ for all $H \in L(V_3)$.
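
The second variation quoted in the solution can be obtained by expanding $f$ along the line $A + sH$. The following short derivation is a worked sketch of that step; the auxiliary function $\varphi$ is introduced here for illustration.

```latex
\begin{aligned}
\varphi(s) &:= f(A + sH) = \operatorname{tr}\,(A + sH)^*(A + sH) \\
           &= \operatorname{tr} A^*A + 2s \operatorname{tr} A^*H + s^2 \operatorname{tr} H^*H, \\
\delta^2 f(A)H^2 &= \varphi''(0) = 2 \operatorname{tr} H^*H \ge 0 .
\end{aligned}
```

Here $\operatorname{tr} H^*A = \operatorname{tr} A^*H$ was used for real operators. Since $\varphi''(s) \ge 0$ for every $A$ and $H$, the function $f$ is convex along every line and hence convex.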
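61.1b. Show that the function $L = P(A, \operatorname{adj} A)$ with
    $L = C \operatorname{tr} A^*A + D \operatorname{tr} \operatorname{adj}(A^*A)$
      $= C(\lambda_1^2 + \lambda_2^2 + \lambda_3^2) + D\bigl((\lambda_1\lambda_2)^2 + (\lambda_2\lambda_3)^2 + (\lambda_1\lambda_3)^2\bigr)$
is polyconvex.
Solution: Use $\operatorname{adj}(A^*A) = (\operatorname{adj} A)^*(\operatorname{adj} A)$ and Problem 61.1a.
61.1c. Define a function $f\colon L(V_3) \to \mathbb{R}$ through
    $f(A) = h(\lambda_1, \lambda_2, \lambda_3)$.
Suppose that $h\colon \mathbb{R}^3_+ \to \mathbb{R}$ is symmetric, convex, and nondecreasing in each
argument. Show that $f$ is convex.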
Hint: Let the operators $A, B\colon V_3 \to V_3$ be linear, symmetric, and positive. Let
$\lambda_1 \ge \lambda_2 \ge \lambda_3$, $\mu_1 \ge \mu_2 \ge \mu_3$, and $\rho_1 \ge \rho_2 \ge \rho_3$ be the eigenvalues of $A$, $B$,
and $tA + sB$, respectively, where $0 \le t \le 1$, $s = 1 - t$. From the Courant
maximum-minimum principle we obtain that

(see Riesz and Nagy (1978), p. 239). Hence


    $f(tA + sB) \le h(t\lambda_1 + s\mu_1,\; t\lambda_2 + s\mu_2,\; t\lambda_3 + s\mu_3)$
               $\le t\,h(\lambda_1, \lambda_2, \lambda_3) + s\,h(\mu_1, \mu_2, \mu_3)$
               $= t f(A) + s f(B)$.
In the general case, this argument must be modified. See Ball (1977), p. 363.
61.2. The theorem of Cauchy. For a function $f\colon \prod_{i=1}^{N} V_3 \to \mathbb{R}$ the following two
statements are equivalent.
(i) $f$ is isotropic, i.e.,

    $f(Rx_1, \ldots, Rx_N) = f(x_1, \ldots, x_N)$

for all rotations and reflections $R$, and all $x_1, \ldots, x_N \in V_3$.
(ii) $f$ can be expressed as a function of the inner products $x_i x_j$, $i, j = 1, \ldots, N$.
Hint: See Truesdell and Noll (1965), p. 29.
61.3. Elastic energy of a cuboid with linear state of stress. Let $K = \sum_i K_i e_i$ be the
density of the outer forces and assume that

    $K_2 = K_3 = 0$

in Section 61.7a, i.e., the stress occurs only in the direction of the $e_1$-axis. We
call this a linear state of stress. Compute, analogously to (50), the elastic energy
$U$ of the cuboid.
Solution: We obtain

    $U = \tfrac{1}{2} E \gamma_{11}^2 V$

because from (48) follows that $U = \sigma_1 \gamma_{11} V/2$ and Hooke's law yields $\sigma_1 = E\gamma_{11}$.
Here, $V$ is the volume of the cuboid.
61.4. Elastic energy of a cuboid with planar state of stress. Let

    $K_3 = 0$

in Section 61.7a, i.e., no stress occurs in direction of the $e_3$-axis. Compute the
elastic energy of the cuboid.
Solution: Analogous to Section 61.7a we obtain
                                                i = 1, 2,        (98)

    $U = \dfrac{V}{2}\bigl(\bar\lambda(\gamma_{11} + \gamma_{22})^2 + 2\kappa(\gamma_{11}^2 + \gamma_{22}^2)\bigr)$        (99)

with $\bar\lambda = \mu E/(1 - \mu^2)$. Note that here, other than in the general case of
three-dimensional states of stress of Section 61.7a, the quantity $\bar\lambda$ occurs instead of
$\lambda = \mu E/\bigl((1 + \mu)(1 - 2\mu)\bigr)$.

Because of $\sigma_3 = 0$, we obtain the relation

    $\gamma_{ii} = \dfrac{1 + \mu}{E}\Bigl(\sigma_i - \dfrac{\mu}{1 + \mu}(\sigma_1 + \sigma_2)\Bigr)$, $i = 1, 2$.

This implies (98). Furthermore, it follows from (48) that

    $U = \dfrac{V}{2}(\gamma_{11}\sigma_1 + \gamma_{22}\sigma_2)$.

If we allow arbitrary rotations in the $(e_1, e_2)$-plane, then (99) takes the
invariant form

where the sum is taken over $i, j = 1, 2$.


61.5. Equilibrium of torques in elastostatics. We consider the basic equations of
elastostatics in Section 61.5f. In this stationary case we obtain the torque
condition

    $\int_{H'} F \times y\,dy + \int_{\partial H'} \tau n' \times y\,dO = 0$                (100)

for all deformed subregions $H'$ from the general balance of angular momentum
(25). This condition refers to the deformed body. In terms of the undeformed
body we expect the torque condition

    $\int_{H} K \times y\,dx + \int_{\partial H} \sigma n \times y\,dO = 0$                (100*)

for all undeformed subregions $H$. Prove (100*).
Solution: Using the equilibrium condition $K^j + D^k\sigma^j_k = 0$ and integrating by
parts we find the general identity

    $J := \int_H \varepsilon_{mij}\,\eta^i K^j\,dx + \int_{\partial H} \varepsilon_{mij}\,\eta^i \sigma^j_k n^k\,dO$
       $= \int_H \varepsilon_{mij}\bigl(\eta^i K^j + D^k(\eta^i \sigma^j_k)\bigr)\,dx$
       $= \int_H \varepsilon_{mij}\,\sigma^j_k D^k\eta^i\,dx$,

where $\eta^i$ denote the Cartesian components of $y$.
We now use the second Piola-Kirchhoff tensor $S$ given by

    $\sigma(x) = y'(x)S(x)$.

According to Section 61.4d the torque condition (100) implies the symmetry of
$\tau$. Now $S$ is also symmetric according to Section 61.5h. In Cartesian coordinates
we have

    $\sigma^j_k = S^r_k D_r\eta^j$

and $S^r_k = S^k_r$. Noting $\varepsilon_{mij} = -\varepsilon_{mji}$ we obtain

    $J = \int_H \varepsilon_{mij}\,S^r_k\,D_r\eta^j D^k\eta^i\,dx = 0$.

This is (100*).

References to the Literature

Monographs (mathematical point of view): von Mises (1962), Nečas and Hlaváček
(1981), Gurtin (1981), Ciarlet (1983), Marsden and Hughes (1983).
Monographs (physical point of view): Sommerfeld (1970) (classical standard work),
Prager (1961), Landau and Lifšic (1962), Vol. 7, Solomon (1968), Washizu (1968), Lurje
(1980), Grinčenko (1985, M), Vols. 1-6.
Handbook of Physics: Flügge (1956), Vol. VIa/1-4.
Rational Mechanics: Truesdell and Noll (1965, S), Truesdell (1977, M).
Thermomechanics: Ziegler (1983, M).
(See also References to the Literature to Chapter 62.)
CHAPTER 62

Monotone Potential Operators and a Class of Models with Nonlinear
Hooke's Law, Duality and Plasticity, and Polyconvexity

There is no science which did not develop from a knowledge of the phenomena;
but in order to gain something from this knowledge, it is necessary to be
a mathematician.
Daniel Bernoulli (1700-1782)

When solving variational problems with the Ritz method, it is important to
estimate the quality of the approximation for the minimal values. The Ritz
method yields upper bounds for those minimal values. In 1927, Erich Trefftz
introduced a method, applicable to the Dirichlet and related problems, which
allows approximations of the solution of a variational problem and at the same
time yields lower bounds for the minimal value.
In the following the same goal will be reached by using a different and more
fundamental approach. In general, we can assign to each minimization problem
a dual maximization problem, whose maximal value is equal to the minimal
value of the original problem.¹ The basic principle thereby corresponds to the
Legendre transformation of point mechanics.
Kurt Otto Friedrichs (1929)

¹ This is also called the principle of complementary energy.

When studying any physical problem in applied mathematics, three essential
stages are involved.
(i) Modeling: An appropriate mathematical model, based on the physics or the
engineering of the situation, must be found. Usually, these models are given
a priori by the physicists or the engineers themselves. However,
mathematicians can also play an important role in this process, especially
considering the increasing emphasis on nonlinear models of physical problems.
(ii) Mathematical study of the problem: A model usually involves a set of
ordinary or partial differential equations or an (energy) functional to be
minimized. One of the first tasks is to find a suitable function space in which
to study the problem. Then comes the study of existence and uniqueness or
nonuniqueness of solutions. An important feature of linear theories is the
existence of unique solutions depending continuously on the data
(well-posed problems in the sense of Hadamard (1865-1963)). But with nonlinear
problems, nonuniqueness is a prevalent phenomenon. For instance,
bifurcation of solutions is of special interest.
(iii) Numerical analysis of the model: By this is meant the description of, and the
mathematical analysis of, approximation schemes which can be run on a
computer in a "reasonable" time to get "reasonably accurate" numbers.
Philippe Ciarlet (1983)

Despite this recent progress, numerical analysis of nonlinear partial differential
equations in three dimensions, as well as many other frontier questions, still
await new mathematical methods.
Arthur Jaffe (1984)

Existence theorems under the assumption that the stored energy function L is
convex with respect to u' have been given by several authors. Unfortunately,
these results are only of mathematical interest, since convexity of L with respect
to u' is unacceptable physically....
A wide variety of realistic models of nonlinear elastic materials satisfy the
hypotheses of our existence theorem for polyconvex stored energy functions. In
particular, these include the Mooney-Rivlin material and the Ogden material.
John Ball (1977)

62.1. Basic Ideas

In this chapter we consider:
(i) a class of approximation models with convex stored energy function. We
investigate:
existence and uniqueness,
duality, and
approximation methods (Ritz method and Trefftz method,
projection-iteration method, and gradient method);
(ii) a class of exact models with polyconvex stored energy functions (existence
via compensated compactness).

Moreover, we consider:
(a) duality and statical plasticity;
(b) variational inequalities and quasi-statical plasticity.

A crucial analytic tool thereby is Korn's inequality, which will be proved later
on by using an equivalent norm on $L_2(G)$ via negative norms.
We shall observe the following:
(α) Duality theory in elasticity is a special case of the general duality theory
of Chapter 51 for monotone potential operators. One uses conjugate
functionals and the abstract Young inequality (Theorem 51.B).
(β) The classical version of duality theory in elasticity is a special case of a
general classical duality theory in the calculus of variations (the Friedrichs
duality). It corresponds to the Legendre transformation and the
Hamiltonian function in point mechanics.
(γ) Duality is the natural tool for a rigorous and elegant approach to
quasi-statical and statical plasticity.

62.1a. Convex Functionals and Monotone Operators

We begin by discussing the basic ideas of (i), and the convexity of the elastic
potential energy will thereby be of central interest. Our starting point is the
variational problem

    $\int_G M(\gamma(u))\,dx - \int_G K u\,dx - \int_{\partial_2 G} T u\,dO = \min!$,
                                                                (1)
    $u = u_0$ on $\partial_1 G$,

which corresponds to the principle of minimal potential energy. We set
$v = u - u_0$ and write (1) as

    $F(v) - b(v) = \min!$, $v \in X$                                (2)

with the Euler equation

    $F'(v) = b$, $v \in X$,                                (3)

where $X$ denotes a suitable function space (Sobolev space). It is important
that $M$ satisfies a so-called strong stability condition

with a constant $c > 0$. This condition guarantees that $M$ is convex. Thus the
energy functional $F$ is convex and the derivative

    $F'\colon X \to X^*$

is a strongly monotone Lipschitz-continuous potential operator. The proof of
this will be based on Korn's inequality. Hence the entire apparatus of the
theory of monotone potential operators of Parts II and III is available for the
Euler equation (3). Let us recall that the G-derivative of a convex functional
is a monotone operator.

62.1b. Convexity and Duality

In applying the abstract duality theory of Part III to elasticity theory we
obtain a picture of remarkable clarity with regard to both physical and
mathematical aspects. Let us describe the basic idea. First, we write the
original variational problem (1) in the form

(P)    $E_{\mathrm{pot}}(u) = \min!$, $u - u_0 \in X$,

where $u$ denotes the displacement of the elastic body, where

    $E_{\mathrm{pot}}(u) = \int_G M(\gamma(u))\,dx - \int_G K u\,dx - \int_{\partial_2 G} T u\,dO$

is the potential energy and where

    $\gamma(u)(x) = \tfrac{1}{2}\bigl(u'(x) + u'(x)^*\bigr)$

is the linearized strain tensor, which will also be written simply as $\gamma(x)$. The
constitutive law which corresponds to (P) is

(C)    $\sigma = M'(\gamma)$

with inverse constitutive law

    $\gamma = M'^{-1}(\sigma)$.

The key formula of convex analysis $M^{*\prime} = M'^{-1}$ from Part III then yields

(C*)    $\gamma = M^{*\prime}(\sigma)$,

where $M^*$ denotes the conjugate functional to $M$. Recall that $\sigma$ is the stress
tensor (first Piola-Kirchhoff tensor). We now define the so-called dual energy

    $E_{\mathrm{dual}}(\sigma) = -\int_G M^*(\sigma)\,dx + \int_G \bigl([\sigma, \gamma(u_0)] - K u_0\bigr)\,dx - \int_{\partial_2 G} T u_0\,dO$

and consider the dual variational problem

(P*)    $E_{\mathrm{dual}}(\sigma) = \max!$, $\sigma \in \Sigma$

for the stress tensor $\sigma$. The precise definition of the set $\Sigma$ will be given in Section
62.4. Roughly speaking, $\Sigma$ is characterized by the relation

    $\int_G [\sigma, \gamma(h)]\,dx - \int_G K h\,dx - \int_{\partial_2 G} T h\,dO = 0$ for all $h \in X$.

From Section 61.5g, this means that $\sigma \in \Sigma$ if and only if $\sigma$ satisfies the principle
of virtual work (power) for all "virtual displacements" $h \in X$. Intuitively, this
says that $\sigma \in \Sigma$ if and only if the stress tensor $\sigma$ is in an equilibrium with the
outer forces $K$ and $T$. In Section 62.4 we will show that for sufficiently smooth
$\sigma \in \Sigma$, the simpler classical expression

    $E_{\mathrm{dual}}(\sigma) = -\int_G M^*(\sigma)\,dx + \int_{\partial_1 G} (\sigma n) u_0\,dO$

is valid.
Let us summarize our observations.
(α) The original variational problem (P) refers to the potential energy with
respect to all possible displacements of the elastic body which satisfy the
given boundary displacement.
(β) The dual variational problem refers to the dual energy with respect to all
stress tensors which are in an equilibrium with the outer forces (volume
force $K$ and boundary force $T$).
Our main result is as follows.
The original variational problem (P) and the dual variational problem (P*)
have unique solutions $u$ and $\sigma$, respectively, which are related by the constitutive
law

    $\sigma = M'(\gamma(u))$,

i.e., $\sigma$ is precisely the stress tensor which is observed in the equilibrium state
corresponding to the displacement $u$.
Moreover, the extremal values of (P) and (P*) are the same, i.e.,

    $E_{\mathrm{pot}}(u) = E_{\mathrm{dual}}(\sigma)$.
In order to find an intuitive physical interpretation for the dual variational
principle, Iet us introduce the stress energy

E.,, ••• (u) = L M*(u)dx.

It is positive for all reasonable models. The dual problern then becomes

Edua 1(u) =f (un)u 0 d0- E.,,••• (u) = max!, uei:.


a,a
This means that if we consider all stresses which are in an equilibrium with
the outer forces, then, in an equilibrium state of the elastic body, the actual
stress corresponds to a maximal difference between the work done by the
boundary stress forces and the stress energy in the interior of the body.
The dual problern can also be written in the equivalent form
-Edua1(u) = min!, uei:.
This is called the principle of minimal complementary energy. We note the
remarkable fact that the integrand M* of Edual corresponds to the inverse
constitutive law (C*) above. Thus, we might say:
The dual variational problern refers to the inverse constitutive law ofthe elastic
body.
In Section 51.5 we found that the abstract duality theory for monotone
potential operators is based on the abstract Young inequality for conjugate
functionals in the sense of convex analysis. In Example 51.4 we further showed
that the classical Legendre transformation provides the motivation for con-
jugate functionals. This abstract duality theory is a generalization of very
simple classical results. We will discuss this classical background, which was
discovered by Friedrichs (1929), in Section 62.16. There it will be shown that
the general duality theory of Friedrichs leads to a quite remarkable relation
between elasticity theory and point mechanics. Table 62.1 illustrates how, by
applying the Legendre transformation, duality in elasticity can be viewed as
a generalization of duality in point mechanics.

Table 62.1
Point mechanics                        Elasticity theory
Lagrangian function                    Stored energy function $M$
Hamiltonian function                   Conjugate stored energy function $M^*$
  (conjugate Lagrangian function)
Position                               Displacement $u$
Velocity                               Strain tensor $\gamma$
Momentum                               Stress tensor $\sigma$
Legendre transformation                Constitutive law $\sigma = M'(\gamma)$
  (between velocity and momentum)

In applying the general theory of this chapter, we consider special stored
energy functions M which correspond to:
(a) linear elasticity theory, and
(b) nonlinear Hencky material.

62.1c. Generalized Solutions

The solution $u$ of the original variational problem (P) above with $u - u_0 \in X$,
which corresponds to the classical problem

$\int_G M(\gamma(u))\,dx - \int_G Ku\,dx - \int_{\partial_2 G} Tu\,dO = \min!,$
$u = u_0 \quad \text{on } \partial_1 G,$

is also a solution of the abstract Euler equation (3), as well as a generalized
solution of the classical Euler equation
$\operatorname{div}\sigma + K = 0 \quad \text{on } G, \qquad \sigma = M'(\gamma),$
$u = u_0 \quad \text{on } \partial_1 G,$
$\sigma n = T \quad \text{on } \partial_2 G,$
which is a mixed boundary-value problem. On the boundary parts $\partial_1 G$ and
$\partial_2 G$ we are given the displacement and the stress forces (boundary forces),
respectively.

62.1d. Approximation Methods

Since the original variational problem (P) is convex, we can use the Ritz
method (projection method) in order to obtain approximate solutions of (P).
The Ritz method, applied to the dual problem (P*), is the Trefftz method. In
the case of the variational problem (P) the Ritz method yields upper bounds for
the minimal value of (P). If we use both the Ritz method and the Trefftz
method, then we find two-sided error estimates for the minimal value of (P)
and error estimates for the solution $u$ of (P) in terms of the so-called energetic
norm, i.e., the norm on the space $X$.
Moreover, since the operator F', which appears in the abstract Euler
equation (3), is a strongly monotone and Lipschitz continuous potential
operator, we can apply:
the projection-iteration method, and
the gradient method (method of steepest descent).
Let us summarize the advantages of duality in elasticity theory. By using
duality, we obtain:
(a) approximate solutions for the displacements and stresses of the elastic
body;
(b) two-sided error estimates for the minimal potential energy of the elastic
body;
(c) error estimates for the displacements in terms of an energy norm (and
error estimates for the stresses);
(d) a natural approach to quasi-statical plasticity by using the principle of
maximal dual energy together with a plasticity inequality for the stress
tensor as a side condition; and
(e) a natural approach to statical plasticity by using both
the principle of maximal dual energy
+ the plasticity inequality for the stress tensor
and
the principle of minimal potential energy
+ the corresponding inequality for the strain tensor.
In this way we obtain a rigorous justification of the empirical theory of Haar
and von Karman (1909).

62.1e. Linear Elasticity

In the case of linear elasticity we have that
$M(\gamma) = \tfrac{1}{2}\lambda(\operatorname{tr}\gamma)^2 + \kappa[\gamma, \gamma],$
whereby it is clear that if the parameters $\lambda$ and $\kappa$ are positive, then $M$ is strongly
stable. Equation (2) assumes the special form
$\tfrac{1}{2}a(u, u) - b(u) = \min!, \quad u \in X,$
where $a\colon X \times X \to \mathbb{R}$ is a strongly positive, symmetric bilinear form. Here, the
strong positivity, i.e., the fact that
$a(v, v) \ge c\|v\|^2 \quad \text{for all } v \in X$
with fixed $c > 0$, follows from Korn's inequality. The existence and uniqueness
result then is an immediate consequence of the main theorem on quadratic
variational problems (Theorem 22.A).
In Section 61.9 we have used this approach for the boundary condition
(F)    $u = u_0$ on $\partial G$    (first boundary-value problem).
In this chapter we consider the more general boundary conditions
(M)    $u = u_0$ on $\partial_1 G$, $\sigma n = T$ on $\partial_2 G$    (mixed boundary-value problem)
and
(S)    $\sigma n = T$ on $\partial G$    (second boundary-value problem).
For (F) and (M) we obtain unique solutions $u$.
Note that in (S) only the boundary forces are given. We expect that there
will exist several deformation states $u$ which satisfy this condition, and that
there will be some equilibrium condition between the outer forces $K$ and
the boundary forces $T$. In fact, the displacement $u$ is only determined up to
infinitesimal rigid motions and a solution $u$ exists if and only if the equilibrium
condition

$\int_G K\,dx + \int_{\partial G} T\,dO = 0,$
$\int_G x \times K\,dx + \int_{\partial G} x \times T\,dO = 0$

is satisfied. The first condition means that the total force is equal to zero. The
second condition means that the total torque is equal to zero in the sense of
(7;ppro1 ) of Section 61.6c.

62.1f. Convexity and Approximation Models


In Section 61.8 we have seen that, in the context of the exact theory, the stored
energy function of a homogeneous body must have the form
$L = M(\mathcal{E}).$
If we replace the strain tensor
$\mathcal{E}(x) = \tfrac{1}{2}\bigl(u'(x) + u'(x)^* + u'(x)^*u'(x)\bigr)$
with the linearized strain tensor
$\gamma(x) = \tfrac{1}{2}\bigl(u'(x) + u'(x)^*\bigr),$
then we obtain our model (1). Consequently, this model has only the character
of an approximation.

62.1g. Polyconvexity
More general models are obtained from polyconvex stored energy functions
$L = P(A, \operatorname{adj} A, \det A),$
where $P\colon L(V_3) \times L(V_3) \times \,]0, \infty[\, \to \mathbb{R}$ is convex and
$A = I + u'(x).$
Hence
$E = (A^*A)^{1/2} = (I + 2\mathcal{E}(x))^{1/2}.$
In the special case of rubberlike Ogden material with
$L = C \operatorname{tr} E^p + D \operatorname{tr}\operatorname{adj} E^r + f(\det A) + \text{const}$
and $p, r \ge 1$, we obtain the exact stored energy function
$L = M(\mathcal{E}).$
In fact, because of $\det A = \det E$ and the definition of $E$, the quantity $L$ in this
case depends only on $\mathcal{E}$.
In Section 62.13 we prove a general existence theorem for polyconvex
material including the Ogden material. The key thereby is the important
recent method of compensated compactness whose basic idea we explain in
Section 62.12.
Recall that in the preceding chapter we have used the implicit function
theorem in order to obtain a rigorous local approach to nonlinear elasticity.
The variational approach via polyconvexity is global in nature.
But it is clear that there remain many open problems in nonlinear elasticity.
As in the preceding chapter, we use an invariant, i.e., coordinate-free,
approach. This simplifies formulas and a variety of arguments. The reader
who prefers coordinates can easily rewrite everything by using the results of
Section 61.1. The constitutive law
$\sigma = M'(\gamma),$
for example, becomes
$\sigma^i_j = \frac{\partial M(\gamma)}{\partial \gamma^j_i}, \quad i, j = 1, 2, 3.$

Before studying this chapter one should again take a short look at Section
61.6 on the variational approach to elasticity, and at Section 21.2 on Sobolev
spaces.

62.2. Notations
Let $G$ be a bounded region in the three-dimensional vector space $V_3$. Recall
that $V_3$ may be identified with $\mathbb{R}^3$. The sets $\partial_1 G$ and $\partial_2 G$ correspond to the
boundary decomposition
$\partial G = \partial_1 G \cup \partial_2 G,$
where $\partial_1 G$ and $\partial_2 G$ are open subsets of the boundary $\partial G$.
The space
$X = W_2^1(G, \partial_1 G; V_3),$
which plays a key role in this chapter, has the structure of a Sobolev space
(see Part II). More precisely, the space $X$ consists of all displacements $v\colon G \to V_3$
with components
$v_i \in W_2^1(G), \quad i = 1, 2, 3,$    (4)
$v_i = 0 \quad \text{on } \partial_1 G.$
The boundary condition is understood in the generalized sense (see Section
21.2).
If relation (4) holds in a fixed coordinate system, then, as a consequence of
the linearity of the coordinate transformations, it holds in every coordinate
system. Recall that

$(f \mid g)_{W_2^1(G)} = \int_G \Bigl(fg + \sum_{i=1}^3 D_i f\, D_i g\Bigr)\,dx.$

As a scalar product in $X$ we choose
$(v \mid w)_X = \sum_{i=1}^3 (v_i \mid w_i)_{W_2^1(G)}.$
This expression is independent of the choice of the Cartesian coordinate
system. Because of
$[v'(x), w'(x)^*] = \sum_{i,j=1}^3 D_i v_j\, D_i w_j$
we obtain the invariant equation

$(v \mid w)_X = \int_G \bigl(vw + [v'(x), w'(x)^*]\bigr)\,dx.$

If we consider (4) without the boundary condition $v_i = 0$ on $\partial_1 G$ for all $i$,
then we obtain the space
$W_2^1(G, V_3)$
instead of $W_2^1(G, \partial_1 G; V_3)$. The space $W_2^1(G, V_3)$ consists of all $v\colon G \to V_3$
whose components $v_i$ belong to $W_2^1(G)$ for all $i$. In particular,
$W_2^1(G, \partial G; V_3) = \mathring{W}_2^1(G, V_3).$
Let us motivate the definition of the space $X$. Given $u_0 \in W_2^1(G, V_3)$ we will
prove the existence of solutions
$u = u_0 + v, \quad v \in X$
of equation (2). These solutions may be viewed as generalized solutions of the
corresponding classical variational problem (1). The following two points are
important in this regard:
(i) Because of (4), we have that
$u = u_0 \quad \text{on } \partial_1 G,$
i.e., the boundary condition in (1) is satisfied in the generalized sense.
(ii) The variational problem (1) contains first-order partial derivatives. Because
of (4), $u$ has such derivatives in the generalized sense.
Let
$L_2(G, V_3)$
denote the space of all vector functions $K\colon G \to V_3$ with
$K_i \in L_2(G), \quad i = 1, 2, 3.$
Analogously, $L_2(\partial_2 G, V_3)$ is defined.
Moreover, let
$Y = L_2(G, L_{\mathrm{sym}}(V_3))$
be the space of all tensor functions $\gamma\colon G \to L_{\mathrm{sym}}(V_3)$ with
$\gamma^i_j \in L_2(G), \quad i, j = 1, 2, 3.$
As a scalar product in $Y$ we choose

$(\gamma \mid \mu)_Y = \int_G [\gamma(x), \mu(x)]\,dx.$

The space $Y$ will play an important role in duality theory.
The spaces $X$ and $Y$ are real H-spaces. This follows from the properties of
Sobolev spaces and the fact that, in Section 61.1, we introduced the scalar
product $[\gamma, \mu]$ on $L_{\mathrm{sym}}(V_3)$. We set
$|\mu| = [\mu, \mu]^{1/2}.$
All spaces above are defined in an invariant way, i.e., the definitions do not
depend on the choice of the coordinate system.

62.3. Principle of Minimal Potential Energy, Existence, and Uniqueness

We consider the variational problem for the displacement $u$ of an elastic body

$\int_G M(\gamma(u))\,dx - \int_G Ku\,dx - \int_{\partial_2 G} Tu\,dO = \min!,$    (5)
$u = u_0 \quad \text{on } \partial_1 G$

with the linearized strain tensor
$\gamma(u)(x) = \tfrac{1}{2}\bigl(u'(x) + u'(x)^*\bigr).$
The key to our approach is the so-called strong stability condition
$M''(\gamma)\mu^2 \ge c|\mu|^2 \quad \text{for all } \gamma, \mu \in L_{\mathrm{sym}}(V_3)$    (6)
and fixed $c > 0$. We assume, in addition, the growth conditions
$|M(\gamma)| \le \text{const}\,|\gamma|^2,$
$|M'(\gamma)\mu| \le \text{const}\,|\gamma||\mu|,$    (7)
$|M''(\gamma)\mu^2| \le \text{const}\,|\mu|^2$
for all $\gamma, \mu \in L_{\mathrm{sym}}(V_3)$. More precisely, we make the following assumptions:
(H1) $G$ is a bounded region in the three-dimensional vector space $V_3$ with
sufficiently smooth boundary, i.e., $\partial G \in C^{0,1}$. Intuitively, $G$ corresponds
to the region of the undeformed body.
(H2) The boundary of $G$ can be decomposed as $\partial G = \partial_1 G \cup \partial_2 G$ where $\partial_1 G$
and $\partial_2 G$ are disjoint, open subsets of $\partial G$, and $\partial_1 G$ is nonempty.
(H3) The map $M\colon L_{\mathrm{sym}}(V_3) \to \mathbb{R}$ is $C^2$ and satisfies the strong stability
condition (6) and the growth conditions (7). Let $M(0) = 0$ and $M'(0) = 0$.
(H4) On $G$ the density of outer forces $K \in L_2(G, V_3)$ is given.
(H5) On the boundary part $\partial_1 G$, the boundary displacement $u_0$ is given. More
precisely, let $u_0 \in W_2^1(G, V_3)$.
(H6) On the boundary part $\partial_2 G$, the density of the boundary stress forces
$T \in L_2(\partial_2 G, V_3)$ is given.

Theorem 62.A. If (H1) to (H6) hold, then the variational problem (5), i.e., more
precisely, the generalized problem

$\int_G M(\gamma(u))\,dx - \int_G Ku\,dx - \int_{\partial_2 G} Tu\,dO = \min!, \quad u - u_0 \in X$

has a unique solution. This solution is strictly stable.

The proof will be given in Section 62.5.


If (H1) to (H6) hold with $\partial_1 G = \emptyset$, then $\partial_2 G = \partial G$. In this special case, the
existence theorem for the so-called second boundary-value problem will be
proved in Problem 62.6. It is a typical property of this statically undetermined
problem that the outer forces $K$ and $T$ must satisfy an equilibrium condition
and that the displacements are only determined up to translations and
infinitesimal rotations, i.e., up to infinitesimal rigid motions. Intuitively, one would
expect that the displacements are determined up to translations and rotations.
The appearance of infinitesimal rotations results from the fact that our model
is not exact. This is because we replace the strain tensor $\mathcal{E}$ with the linearized
strain tensor $\gamma$.

62.4. Principle of Maximal Dual Energy and Duality


Let us write the principle of minimal potential energy (5) in the form
(P)    $E_{\mathrm{pot}}(u) = \min!, \quad u - u_0 \in X.$
In addition to this principle, we consider in this section the principle of
maximal dual energy
(P*)   $E_{\mathrm{dual}}(\sigma) = \max!, \quad \sigma \in \Sigma$
as an example of a dual variational problem. Theorem 62.B below then will
show that the solution of (P*) is the stress tensor $\sigma$ which corresponds to the
solution $u$ of (P).
In preparation we observe that, by passing to components, relation (61.41)
yields the constitutive law
$\sigma = M'(\gamma)$    (8)
for (5). This is to be understood in the sense of the scalar product on $L_{\mathrm{sym}}(V_3)$,
i.e.,
$M'(\gamma)\mu = [\sigma, \mu] \quad \text{for all } \mu \in L_{\mathrm{sym}}(V_3).$
The important point thereby is the following. Because of the strong stability
condition (6), it follows from Corollary 42.8 that the derivative $M'$ is strongly
monotone. Thus, the main theorem on monotone operators (Theorem 26.A)
shows that equation (8) has a unique solution $\gamma$, namely
$\gamma = M'^{-1}(\sigma).$    (9)

In addition, we will use the conjugate functional $M^*$ to $M\colon L_{\mathrm{sym}}(V_3) \to \mathbb{R}$.
From Proposition 51.5 we obtain the following key formula of convex analysis
$M'^{-1} = M^{*\prime}.$
More precisely, we find that

$M^*(\sigma) = \int_0^1 [\sigma, M'^{-1}(t\sigma)]\,dt$

for all $\sigma \in L_{\mathrm{sym}}(V_3)$. The inverse constitutive law (9) becomes
$\gamma = M^{*\prime}(\sigma).$    (9*)
This is a very natural condition:
In order to obtain the dual (inverse) constitutive law (9*) from the original
constitutive law (8), we must replace the stored energy function $M$ with its
conjugate functional $M^*$.
We now define the potential energy of the elastic body

$E_{\mathrm{pot}}(u) = \int_G M(\gamma(u))\,dx - \int_G Ku\,dx - \int_{\partial_2 G} Tu\,dO,$

and the corresponding dual energy

$E_{\mathrm{dual}}(\sigma) = -\int_G M^*(\sigma)\,dx + \int_G \bigl([\sigma, \gamma(u_0)] - Ku_0\bigr)\,dx - \int_{\partial_2 G} Tu_0\,dO.$

By definition, we have that $\sigma \in \Sigma$ if and only if $\sigma \in L_2(G, L_{\mathrm{sym}}(V_3))$ and

$\int_G [\sigma, \gamma(v)]\,dx = \int_G Kv\,dx + \int_{\partial_2 G} Tv\,dO$

for all $v \in X$. According to Section 61.5, this relation can be regarded as a
generalized form of the equilibrium condition
$\operatorname{div}\sigma + K = 0 \quad \text{on } G,$
$\sigma n = T \quad \text{on } \partial_2 G.$
Roughly speaking, one minimizes in (P) over all possible displacements $u$
which satisfy the boundary condition $u = u_0$ on $\partial_1 G$. In the dual problem (P*),
one maximizes over all possible stress tensors $\sigma$ which are in equilibrium
with the outer forces $K$ and $T$.
If $\sigma \in \Sigma$ and $\sigma$ is sufficiently smooth, then we obtain from integration by
parts the simpler expression

$E_{\mathrm{dual}}(\sigma) = -\int_G M^*(\sigma)\,dx + \int_{\partial_1 G} (\sigma n)u_0\,dO.$

In Problem 62.9 we will show that
$M^*(\sigma) = [\sigma, \gamma] - M(\gamma), \quad \gamma = M'^{-1}(\sigma)$
for all $\sigma \in L_{\mathrm{sym}}(V_3)$. Hence $M$ and $M^*$ correspond to the Lagrangian function
and the Hamiltonian function in point mechanics, respectively. In Section
62.16 we shall prove that duality in elasticity theory is a special case of a
general duality in the calculus of variations.
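
To illustrate the correspondence between the Legendre transformation of point mechanics and the conjugate stored energy function, let us work out the simplest quadratic case. The special choices $L(v) = \tfrac{1}{2}mv^2$ and $M(\gamma) = \kappa[\gamma, \gamma]$ with $m, \kappa > 0$ are assumptions made only for this worked example; they are not the general situation of this chapter.

```latex
% Point mechanics: assumed Lagrangian L(v) = (1/2) m v^2 with mass m > 0.
\begin{align*}
p &= L'(v) = mv, & v &= \frac{p}{m}, \\
H(p) &= pv - L(v) = \frac{p^2}{m} - \frac{p^2}{2m} = \frac{p^2}{2m}.
\end{align*}
% Elasticity: assumed stored energy M(\gamma) = \kappa[\gamma,\gamma] with \kappa > 0.
\begin{align*}
\sigma &= M'(\gamma) = 2\kappa\gamma, & \gamma &= M'^{-1}(\sigma) = \frac{\sigma}{2\kappa}, \\
M^*(\sigma) &= [\sigma,\gamma] - M(\gamma)
  = \frac{[\sigma,\sigma]}{2\kappa} - \frac{[\sigma,\sigma]}{4\kappa}
  = \frac{[\sigma,\sigma]}{4\kappa}.
\end{align*}
```

In both cases the derivative of the conjugate function inverts the original derivative, i.e., $H'(p) = v$ and $M^{*\prime}(\sigma) = \gamma$.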
The following theorem contains a precise formulation of duality between
displacements and stresses in elasticity.

Theorem 62.B (Duality). We make the assumptions (H1) to (H6) of Theorem
62.A above.
Problems (P) and (P*) then have the unique solutions $\bar u$ and $\bar\sigma$, respectively,
related through the constitutive law (8), i.e., $\bar\sigma = M'(\gamma(\bar u))$.
Furthermore, the two extremal values of (P) and (P*) equal each other, i.e.,
$E_{\mathrm{pot}}(\bar u) = E_{\mathrm{dual}}(\bar\sigma).$

Corollary 62.1 (Error Estimates). If $u$ and $\sigma$ satisfy the side conditions of (P)
and (P*), respectively, then we obtain the two-sided estimate for the minimal
potential energy
$E_{\mathrm{dual}}(\sigma) \le E_{\mathrm{pot}}(\bar u) \le E_{\mathrm{pot}}(u).$
Moreover, for the displacement $u$, we find the estimate
$\tfrac{1}{2}cc_1\|u - \bar u\|_X^2 \le E_{\mathrm{pot}}(u) - E_{\mathrm{dual}}(\sigma).$
The positive constants $c$ and $c_1$ thereby appear in the strong stability condition
(6) above and in Korn's inequality (11) below.

The advantage of these error estimates is that we can choose $u$ and $\sigma$
arbitrarily. Only the side conditions of (P) and (P*) need to be satisfied,
respectively.

62.5. Proof of the Main Theorems


The proofs of:
Theorem 62.A, and
Theorem 62.B together with Corollary 62.1
are easily obtained from
Theorem 42.A (main theorem on free convex minimum problems), and
Theorem 51.B (main theorem of the duality theory for monotone potential
operators).
The key is the important and nontrivial inequality of Korn (1907). This
inequality, which will be proved in Section 62.15, is intimately related to the
famous inequalities of Poincaré and Friedrichs of Part II. Thus, our proof
consists of two parts:
(i) concrete analytic substance (Korn's inequality);
(ii) abstract functional-analytic substance (Theorems 42.A and 51.B).

62.5a. Proof of Theorem 62.A


We set $\gamma = Du$, i.e.,
$Du(x) = \tfrac{1}{2}\bigl(u'(x) + u'(x)^*\bigr).$
Furthermore, we set $v = u - u_0$. The variational problem (5) then becomes
$F(v) - b(v) = \min!, \quad v \in X$    (10)
with

$F(v) = \int_G M(Dv + Du_0)\,dx - \int_G Ku_0\,dx - \int_{\partial_2 G} Tu_0\,dO,$
$b(v) = \int_G Kv\,dx + \int_{\partial_2 G} Tv\,dO.$
The following result is important.

Lemma 62.2 (Functional Analytic Formulation of Korn's Inequality). Through

$(v \mid w)_E = \int_G [Dv, Dw]\,dx$

an equivalent scalar product is defined on the space $X$.

Letting $\|v\|_E = (v \mid v)_E^{1/2}$, this result states that there exist positive constants
$c_1$ and $c_2$ such that
$c_1\|v\|_X^2 \le \|v\|_E^2 \le c_2\|v\|_X^2 \quad \text{for all } v \in X.$    (11)
Analogously as in Section 22.1 the norm $\|v\|_E$ is called an energy norm. The
left-hand inequality in (11) is Korn's inequality which will be proved in Section
62.15 (see (57)). The right-hand inequality in (11) follows easily from the
definition of $\|v\|_X$ in Section 62.1 and Hölder's inequality (see Problem 62.1).

Lemma 62.3. The functional $b\colon X \to \mathbb{R}$ is linear and continuous.

The simple proof follows from Hölder's inequality (see Problem 62.2).
We now study the functional $F$ which represents the elastic potential energy.
To do this we set
$\varphi(t) = F(v + tw) \quad \text{for all } t \in \mathbb{R} \text{ and fixed } v, w \in X.$
We then have that
$\varphi'(0) = \langle F'(v), w\rangle, \quad \varphi''(0) = \delta^2 F(v; w).$
Computation of $\varphi'(0)$ and $\varphi''(0)$ yields

$\langle F'(v), w\rangle = \int_G M'(Dv + Du_0)Dw\,dx,$
$\delta^2 F(v; w) = \int_G M''(Dv + Du_0)(Dw)^2\,dx \quad \text{for all } v, w \in X.$

As a consequence of stability condition (6) we then obtain that
$\delta^2 F(v; w) \ge c\|w\|_E^2 \quad \text{for all } v, w \in X.$    (12)
Notice Korn's inequality (11). We thus obtain the key result:

The strong stability condition (6) for the stored energy function $M$ implies
that the second variation of the elastic potential energy $F$ is uniformly strongly
positive.

In particular, according to Corollary 42.8, inequality (12) implies the convexity
of $F\colon X \to \mathbb{R}$.

Lemma 62.4. The functional $F\colon X \to \mathbb{R}$ is $C^1$ and convex.
The operator $F'\colon X \to X^*$ is a strongly monotone potential operator. More
precisely, we have that
$\langle F'(v) - F'(w), v - w\rangle \ge c\|v - w\|_E^2 \quad \text{for all } v, w \in X.$    (13)

PROOF. As in Section 42.7, the growth conditions (7) ensure that the differentiation
of $\varphi$ can be performed under the integral sign and that $F$, $F'$ are continuous.
This is a consequence of the majorant theorem on the differentiation of
parameter dependent integrals $A_2$(25).
Inequality (13) follows from Corollary 42.8. □

Theorem 42.A then implies that the minimum problem (10) is equivalent to
the Euler equation
$F'(u) - b = 0, \quad u \in X,$    (14)
which has a unique solution $u$. Formula (12) implies that
$\delta^2 F(u; w) > 0 \quad \text{for all } w \in X - \{0\},$
and this is the strict stability of $u$, according to Definition 61.4.
The proof of Theorem 62.A is complete.
In the context of approximation methods (projection-iteration methods
and gradient methods) the following additional information is useful.

Lemma 62.5. The operator $F'\colon X \to X^*$ is Lipschitz continuous.

PROOF. By passing to components, formula (7) implies that the second-order
partial derivatives of $M$ are bounded. Thus $M'$ is Lipschitz continuous, i.e.,
$|M'(\gamma)\tau - M'(\mu)\tau| \le \text{const}\,|\gamma - \mu||\tau|$
for all $\gamma, \mu, \tau \in L_{\mathrm{sym}}(V_3)$. Hölder's inequality implies that

$|\langle F'(u) - F'(v), w\rangle| \le \int_G |M'(Du + Du_0)Dw - M'(Dv + Du_0)Dw|\,dx$
$\le \text{const}\,\Bigl(\int_G |Du - Dv|^2\,dx\Bigr)^{1/2} \Bigl(\int_G |Dw|^2\,dx\Bigr)^{1/2}$
$= \text{const}\,\|u - v\|_E\|w\|_E \le \text{const}\,\|u - v\|_X\|w\|_X$

for all $u, v, w \in X$, and this means that
$\|F'(u) - F'(v)\| \le \text{const}\,\|u - v\|_X \quad \text{for all } u, v \in X.$ □

62.5b. Proof of Theorem 62.B

Let us show that we are in the situation of Theorem 51.B. To this end, recall
that
$X = W_2^1(G, \partial_1 G; V_3),$
and that $\gamma = Du$, where
$Du(x) = \tfrac{1}{2}\bigl(u'(x) + u'(x)^*\bigr).$
The operator
$D\colon X \to Y$
is linear and continuous, and on $X$ we impose the energy norm of Lemma 62.2
above. We then have that
$\|Dv\|_Y = \|v\|_E \quad \text{for all } v \in X.$
Using the operator $D$ we can write the original variational problem (10) in the
form
(P)    $H(Dv) - b(v) = \min!, \quad v \in X$
with the functional

$H(\gamma) = \int_G M(\gamma + Du_0)\,dx - b(u_0) \quad \text{for all } \gamma \in Y.$

According to Section 51.1, the corresponding dual problem reads as
(P*)   $-H^*(\sigma) = \max!, \quad \sigma \in \Sigma$
with
$\Sigma = \{\sigma \in Y : (\sigma \mid Dv)_Y = b(v) \text{ for all } v \in X\}$
and conjugate functional

$H^*(\sigma) = \int_0^1 (\sigma \mid H'^{-1}(t\sigma))_Y\,dt - H(H'^{-1}(0)).$

We identify the H-space $Y$ with its dual space $Y^*$.


In order to be able to apply Theorem 51.B to (P*) we have to study the
properties of $H$. We will show that the operator
$H'\colon Y \to Y^*$
is strongly monotone and Lipschitz continuous. The main theorem on monotone
operators (Theorem 26.A) then implies that the inverse operator
$H'^{-1}\colon Y^* \to Y$
is also strongly monotone and Lipschitz continuous. Finally, Proposition 51.5
implies that
$H^{*\prime} = H'^{-1}.$
Case 1: We begin with the special case $u_0 = 0$.
For the G-derivative of the functional $H\colon Y \to \mathbb{R}$ we find

$H'(\gamma)\mu = \int_G M'(\gamma)\mu\,dx \quad \text{for all } \gamma, \mu \in Y,$

and for the second variation, we obtain from stability condition (6) that

$\delta^2 H(\gamma; \mu) = \int_G M''(\gamma)\mu^2\,dx \ge c\|\mu\|_Y^2 \quad \text{for all } \gamma, \mu \in Y.$

Thus, Corollary 42.8 shows that the operator $H'\colon Y \to Y^*$ is strongly monotone.
The same argument as has been used for the operator $F'$ in Section 62.5a
then shows that $H'$ is Lipschitz continuous.
If we set
$\sigma = M'(\gamma)$
in the sense of the scalar product on $L_{\mathrm{sym}}(V_3)$, i.e.,
$[\sigma, \mu] = M'(\gamma)\mu \quad \text{for all } \mu \in L_{\mathrm{sym}}(V_3),$
we obtain
$H'(\gamma)\mu = (\sigma \mid \mu)_Y \quad \text{for all } \mu \in Y.$
Thus, we have
$H'(\gamma) = \sigma$
with $\sigma(x) = M'(\gamma(x))$ on $G$. This gives

$H^*(\sigma) = \int_G \int_0^1 [\sigma, M'^{-1}(t\sigma)]\,dt\,dx,$

i.e., we obtain the natural formula

$H^*(\sigma) = \int_G M^*(\sigma)\,dx.$

Theorem 62.B and Corollary 62.1 are then an immediate consequence of
Theorem 51.B.
Case 2: The general case, i.e., $u_0 \in W_2^1(G, V_3)$.
Analogous arguments apply. Note that now $H'(\gamma) = \sigma$ means that
$\sigma = M'(\gamma + Du_0)$, i.e.,
$\gamma = M'^{-1}(\sigma) - Du_0,$
and because of $M'^{-1}(0) = 0$ it follows that

$H^*(\sigma) = \int_G \int_0^1 [\sigma, M'^{-1}(t\sigma) - Du_0]\,dt\,dx + b(u_0),$

i.e.,

$H^*(\sigma) = \int_G \bigl(M^*(\sigma) - [\sigma, Du_0]\bigr)\,dx + b(u_0).$

Hence
$H^*(\sigma) = -E_{\mathrm{dual}}(\sigma).$
Thus the proof of Theorem 62.B and Corollary 62.1 is complete.

62.6. Approximation Methods


For approximate solutions of our model we can use, for example, Ritz'
method, Trefftz' method, projection-iteration methods, or gradient methods.
Let us explain this.

62.6a. The Ritz Method

In Section 42.5 of Part III, we studied the Ritz method for free convex
minimum problems. This method can be applied to the original variational
problem (10), i.e.,
$F(v) - b(v) = \min!, \quad v \in X.$    (15)
The idea of the Ritz method is to replace the space $X = W_2^1(G, \partial_1 G; V_3)$ with
a finite-dimensional subspace $X_n$, i.e., to consider the new variational problem
$F(v_n) - b(v_n) = \min!, \quad v_n \in X_n.$    (15_n)
For $n = 1, 2, \ldots$ we choose subspaces with
$X_1 \subseteq X_2 \subseteq \cdots \subseteq X$ and $\overline{\bigcup_n X_n} = X.$

The convergence in $X$ of this method and error estimates are obtained from
Theorem 42.A. Observe that, according to Lemma 62.4, the operator
$F'\colon X \to X^*$ is strongly monotone.
In order to obtain a handy formulation, let us write the original problem
(15) in the form
(P)    $E_{\mathrm{pot}}(u) = \min!, \quad u - u_0 \in X.$
The Ritz method (15_n) then becomes
(P_n)  $E_{\mathrm{pot}}(u_n) = \min!, \quad u_n - u_0 \in X_n$
with $n = 1, 2, \ldots$. All displacements $u_n$ which satisfy the boundary condition
$u_n = u_0 \quad \text{on } \partial_1 G$
are admitted as candidates in (P_n). More precisely, we have to make the ansatz

$u_n = u_0 + \sum_{j=1}^n d_j u^{(j)}$    (16)

with the unknown real coefficients $d_1, \ldots, d_n$ and the fixed given displacements
$u^{(j)} \in X$, $j = 1, 2, \ldots, n$, where
$u^{(j)} = 0 \quad \text{on } \partial_1 G \text{ for all } j.$
In the Appendix to Part II we have shown how such basis functions $u^{(j)}$ can
be chosen in the form of finite elements, which may be, for example, piecewise
linear functions, obtained from a triangulation of the region $G$ by using linear
interpolation. Problem (P_n) is of the form
$f(d_1, \ldots, d_n) = \min!.$
This leads to the generally nonlinear system of equations
$\frac{\partial f(d_1, \ldots, d_n)}{\partial d_j} = 0, \quad j = 1, 2, \ldots, n.$
If $u$ and $u_n$ are solutions of (P) and (P_n), respectively, then, by taking (13)
into account, Theorem 42.A yields the error estimate
$\tfrac{1}{2}cc_1\|u_n - u\|_X^2 \le \tfrac{1}{2}c\|u_n - u\|_E^2 \le E_{\mathrm{pot}}(u_n) - E_{\mathrm{pot}}(u).$
In order to eliminate the unknown value $E_{\mathrm{pot}}(u)$, we choose an arbitrary $\sigma_n \in \Sigma$.
Theorem 62.B then implies that $E_{\mathrm{dual}}(\sigma_n) \le E_{\mathrm{pot}}(u)$. Hence we obtain
$\tfrac{1}{2}cc_1\|u_n - u\|_X^2 \le \tfrac{1}{2}c\|u_n - u\|_E^2 \le E_{\mathrm{pot}}(u_n) - E_{\mathrm{dual}}(\sigma_n)$    (17)
for arbitrary fixed $\sigma_n \in \Sigma$. The advantage of this error estimate is that on the
right-hand side, only the known values $E_{\mathrm{dual}}(\sigma_n)$ and $E_{\mathrm{pot}}(u_n)$ appear. Moreover,
the energy $E_{\mathrm{pot}}(u)$ of the elastic body satisfies the estimate
$E_{\mathrm{dual}}(\sigma_n) \le E_{\mathrm{pot}}(u) \le E_{\mathrm{pot}}(u_n).$    (17*)
Inequalities (17) and (17*) contain the two key estimates of the Ritz method.
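
The two-sided estimate (17*) can be observed in a one-dimensional analogue. The following Python sketch is not taken from the text. It assumes a bar on the interval [0, 1] with stored energy M(γ) = ½κγ², constant volume force K, boundary displacement u_0 = 0 at x = 0, boundary force T at x = 1, and piecewise linear finite elements as basis functions; the names `ritz_bar` and `dual_energy` are hypothetical. In one dimension the equilibrium set Σ consists of the single stress σ(x) = T + K(1 − x), so the lower bound is exact in this toy case.

```python
import numpy as np

def ritz_bar(n, kappa=1.0, K=1.0, T=0.5):
    """Ritz method with n piecewise linear elements for the assumed 1D model
    E_pot(u) = int_0^1 (kappa/2 u'^2 - K u) dx - T u(1),  u(0) = 0."""
    h = 1.0 / n
    A = np.zeros((n, n))              # stiffness matrix for the free nodes 1..n
    for e in range(n):                # element e joins nodes e and e+1
        i, j = e - 1, e               # indices of these nodes among the free nodes
        A[j, j] += kappa / h
        if i >= 0:                    # node 0 is clamped and carries no unknown
            A[i, i] += kappa / h
            A[i, j] -= kappa / h
            A[j, i] -= kappa / h
    b = np.full(n, K * h)             # load of the volume force on interior nodes
    b[-1] = K * h / 2.0 + T           # last node: half a hat function plus boundary force
    d = np.linalg.solve(A, b)         # coefficients d_1, ..., d_n of the ansatz (16)
    e_pot = 0.5 * d @ A @ d - b @ d   # potential energy of the Ritz solution
    return d, e_pot

def dual_energy(kappa=1.0, K=1.0, T=0.5):
    """Dual energy -int_0^1 sigma^2 / (2 kappa) dx for sigma(x) = T + K (1 - x),
    using int_0^1 sigma^2 dx = T^2 + T K + K^2 / 3."""
    return -(T**2 + T * K + K**2 / 3.0) / (2.0 * kappa)

d, e_pot = ritz_bar(4)
e_dual = dual_energy()
print("E_dual =", e_dual, "<= E_pot(u_n) =", e_pot)
```

For this assumed model the gap E_pot(u_n) − E_dual(σ) equals K²h²/(24κ) with mesh width h = 1/n, so that refining the mesh closes the two-sided estimate quadratically.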

62.6b. The Trefftz Method

Ritz' method for the dual problem
(P*)   $E_{\mathrm{dual}}(\sigma) = \max!, \quad \sigma \in \Sigma$
is
(P_n^*)
where

and

This method is also called the Trefftz method.


The convergence of this method is a consequence of Theorem 42.A. From
Section 62.5b it follows that problem (P*) can be written as
$-H^*(\sigma) = \max!, \quad \sigma \in \Sigma,$
where the operator $H^{*\prime}\colon Y^* \to Y$ is strongly monotone. We note that $\Sigma$ is
obtained from a translation of a closed, linear subspace of the H-space $Y$.
Let us show that a combination of the Ritz method of Section 62.6a and
the Trefftz method can be very useful in practical computations. Suppose $u_n$
and $\sigma_n$ are solutions of (P_n) and (P_n^*), respectively. From Theorem 62.B it follows
that $E_{\mathrm{pot}}(u) = E_{\mathrm{dual}}(\sigma)$ for the solutions $u$ and $\sigma$ of (P) and (P*), respectively.
Our convergence proofs then show that
$E_{\mathrm{pot}}(u_n) \to E_{\mathrm{pot}}(u)$ and $E_{\mathrm{dual}}(\sigma_n) \to E_{\mathrm{dual}}(\sigma)$ as $n \to \infty$.
Thus, we obtain the key relation
$E_{\mathrm{pot}}(u_n) - E_{\mathrm{dual}}(\sigma_n) \to 0 \quad \text{as } n \to \infty.$
A good choice of $\sigma_n$ in the error estimates (17) and (17*) is therefore a solution
$\sigma_n$ of the Trefftz method (P_n^*).
These observations show the importance of duality theory for elasticity
theory.

62.6c. Projection-Iteration Method


According to Section 62.5a and Theorem 42.A, the original variational problem
(P) is equivalent to the operator equation
(E)    $F'(v) - b = 0, \quad v \in X$
with $u = v + u_0$. Moreover, as a consequence of Section 62.5a, the operator
$F'\colon X \to X^*$ is strongly monotone and Lipschitz continuous where $X$ is a real,
separable H-space. Hence, projection-iteration methods, which have been
studied in Section 25.4 of Part II, can be applied to (E). The algorithm is
(E_n^*)  $(u_{n+1} \mid u^{(j)})_E = (u_n \mid u^{(j)})_E - t\langle F'(u_n - u_0) - b, u^{(j)}\rangle$
for $j = 1, 2, \ldots, n$. We start with the known $u_0$ and compute successively the
quantities $u_{n+1}$ for $n = 0, 1, \ldots$ which are given by the ansatz (16).
The advantage of this projection-iteration method over the projection
method (Ritz' method) (P_n), (E_n) is that here, at each step, we only have to
solve linear systems of equations for the $d_1, \ldots, d_n$.
For practical computations one uses (E_n^*) with respect to a Cartesian
coordinate system. This gives

$(u \mid v)_E = \int_G \gamma^i_j(u)\gamma^j_i(v)\,dx,$
$\langle F'(u - u_0) - b, v\rangle = \int_G \sigma^i_j(u)\gamma^j_i(v)\,dx - \int_G K_j v^j\,dx - \int_{\partial_2 G} T_j v^j\,dO$

with
$\sigma^i_j(u) = \frac{\partial M(\gamma(u))}{\partial \gamma^j_i}.$
62.6d. Gradient Method


We may also apply gradient methods to the original problem (P), (E) as well
as to the Ritz equations (E_n). This method has been studied in Theorem 42.B.
It is important, in this connection, that the operator $F'\colon X \to X^*$ is strongly
monotone and Lipschitz continuous.
The Ritz problem (P_n) is a free convex optimization problem, and hence, in
order to solve (P_n), the entire apparatus of convex optimization is available.
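
The role of strong monotonicity and Lipschitz continuity can be seen in a finite-dimensional sketch of the gradient iteration v ← v − t(F'(v) − b). The Python code below is an illustration only and not the algorithm of Theorem 42.B: the operator F'(v) = Av + ½ tanh(v) on ℝ⁵, the right-hand side b, and the function name `gradient_method` are all hypothetical. The step size uses the standard sufficient condition 0 < t < 2c/L² for an operator with monotonicity constant c and Lipschitz constant L, under which the iteration map is a contraction.

```python
import numpy as np

def gradient_method(grad, b, v0, t, tol=1e-10, max_iter=10_000):
    """Iterate v <- v - t (F'(v) - b) until the residual norm drops below tol."""
    v = np.asarray(v0, dtype=float).copy()
    for _ in range(max_iter):
        residual = grad(v) - b
        if np.linalg.norm(residual) < tol:
            break
        v = v - t * residual
    return v

# Hypothetical model operator on R^5: F'(v) = A v + 0.5 tanh(v),
# where A is the symmetric tridiagonal matrix with 3 on the diagonal and -1 beside it.
n = 5
A = 3.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def grad(v):
    return A @ v + 0.5 * np.tanh(v)

eigenvalues = np.linalg.eigvalsh(A)   # sorted in ascending order
c = eigenvalues[0]                    # strong monotonicity constant (tanh is monotone)
L = eigenvalues[-1] + 0.5             # Lipschitz constant (0.5 tanh is 0.5-Lipschitz)
t = c / L**2                          # lies inside the interval 0 < t < 2c / L^2

b = np.ones(n)
v = gradient_method(grad, b, np.zeros(n), t)
print("residual norm:", np.linalg.norm(grad(v) - b))
```

The step size c/L² is chosen for simplicity; it is far from optimal, and in practice one would prefer a line search or one of the other convex optimization methods mentioned above.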

62.7. Applications to Linear Elasticity Theory


We consider a homogeneous, isotropic body. As a consequence of Example
61.6 we obtain the stored energy function
$M(\gamma) = \tfrac{1}{2}\lambda(\operatorname{tr}\gamma)^2 + \kappa[\gamma, \gamma]$
with the Lamé constants $\lambda, \kappa > 0$. If we set $\psi(t) = M(\gamma + t\mu)$ and compute
$\psi'(0)$ and $\psi''(0)$, then we obtain that
$M'(\gamma)\mu = \lambda \operatorname{tr}\gamma \operatorname{tr}\mu + 2\kappa[\gamma, \mu],$
$M''(\gamma)\mu^2 = 2M(\mu) \quad \text{for all } \gamma, \mu \in L_{\mathrm{sym}}(V_3).$
Thus the strong stability condition
$M''(\gamma)\mu^2 \ge 2\kappa[\mu, \mu]$
is trivially satisfied and all results of the previous sections can be applied. The
constitutive law $\sigma = M'(\gamma)$ becomes
$\sigma = \lambda \operatorname{tr}\gamma\, I + 2\kappa\gamma.$
This is precisely Hooke's law. Note that $[\sigma, \mu] = M'(\gamma)\mu$ and $[I, \mu] = \operatorname{tr}\mu$.
The inverse constitutive law $\gamma = M'^{-1}(\sigma)$ is
$\gamma = \lambda^* \operatorname{tr}\sigma\, I + 2\kappa^*\sigma$
with the dual Lamé constants
$\kappa^* = \frac{1}{4\kappa}, \quad \lambda^* = -\frac{\lambda}{2(2\kappa + 3\lambda)\kappa}.$
From Section 62.4 we thus obtain the dual stored energy function
$M^*(\sigma) = \tfrac{1}{2}\lambda^*(\operatorname{tr}\sigma)^2 + \kappa^*[\sigma, \sigma].$
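
The formulas of this section can be checked numerically. The following Python sketch is only an illustration: it represents tensors in L_sym(V_3) as symmetric 3×3 arrays, assumes the scalar product [a, b] = tr(ab), and uses the dual Lamé constants κ* = 1/(4κ) and λ* = −λ/(2κ(2κ + 3λ)), which follow from inverting Hooke's law. All function names are hypothetical.

```python
import numpy as np

def inner(a, b):
    """Scalar product [a, b] = tr(a b) on symmetric 3x3 tensors (assumed convention)."""
    return float(np.tensordot(a, b))

def M(gamma, lam, kappa):
    """Stored energy of linear elasticity."""
    return 0.5 * lam * np.trace(gamma) ** 2 + kappa * inner(gamma, gamma)

def hooke(gamma, lam, kappa):
    """Constitutive law sigma = M'(gamma), i.e., Hooke's law."""
    return lam * np.trace(gamma) * np.eye(3) + 2.0 * kappa * gamma

def dual_constants(lam, kappa):
    """Dual Lame constants (lam_star, kappa_star)."""
    kappa_star = 1.0 / (4.0 * kappa)
    lam_star = -lam / (2.0 * kappa * (2.0 * kappa + 3.0 * lam))
    return lam_star, kappa_star

def inverse_hooke(sigma, lam, kappa):
    """Inverse constitutive law gamma = M*'(sigma)."""
    lam_star, kappa_star = dual_constants(lam, kappa)
    return lam_star * np.trace(sigma) * np.eye(3) + 2.0 * kappa_star * sigma

def M_star(sigma, lam, kappa):
    """Dual stored energy function."""
    lam_star, kappa_star = dual_constants(lam, kappa)
    return 0.5 * lam_star * np.trace(sigma) ** 2 + kappa_star * inner(sigma, sigma)

# Usage: a random symmetric strain is recovered from its stress, and the
# relation M(gamma) + M*(sigma) = [sigma, gamma] holds for sigma = M'(gamma).
rng = np.random.default_rng(0)
B = rng.normal(size=(3, 3))
gamma = 0.5 * (B + B.T)
lam, kappa = 1.2, 0.8
sigma = hooke(gamma, lam, kappa)
print(np.allclose(inverse_hooke(sigma, lam, kappa), gamma))
print(M(gamma, lam, kappa) + M_star(sigma, lam, kappa) - inner(sigma, gamma))
```

The last printed value vanishes up to rounding, which is the equality case of the Young inequality for the conjugate pair M, M*.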

62.8. Application to Nonlinear Hencky Material


As an important example for a stored energy function, leading to a nonlinear
Hooke's law, we consider
M(y) = !k(tr y) 2 + KqJ([y, y])
with the material constant k = A. + 2K/3 and the strain deviator
y=y--Jtryl.
The expressionforM has been physically motivated in Example 61.7. In the
special case qJ(t) = t, we obtain the stored energy function of linear elasticity
theory, which has been considered in the previous section.

Proposition 62.6. The function $M: L_{\mathrm{sym}}(V_3) \to \mathbb{R}$ satisfies the strong stability
condition (6) and the growth conditions (7) if the material function $\varphi$ satisfies
the following properties:
(i) $\varphi: \mathbb{R}_+ \to \mathbb{R}$ is $C^2$ with $\varphi(0) = 0$.
(ii) There exist constants $a, b > 0$ and a natural number $n$ such that the following
inequalities hold on $\mathbb{R}_+$:
$$a \le \varphi'(t) \le 1, \qquad -b \le \varphi''(t) \le 0,$$
$$\frac{1}{n} \le \varphi'(t) + 2\varphi''(t)t \le n.$$

This proposition shows that all the results from Sections 62.3-62.6 can be
applied to this nonlinear model. Conditions (i) and (ii) are satisfied, for example,
for concave functions $\varphi$ which display an almost linear behavior in
neighborhoods of $t = 0$ and $t = +\infty$ (Fig. 62.1).
The constitutive law $\sigma = M'(\gamma)$ is
$$\sigma = \left(k - \tfrac{2}{3}\kappa\varphi'(\Gamma)\right)\operatorname{tr}\gamma\, I + 2\kappa\varphi'(\Gamma)\gamma$$
with $\Gamma = [\bar\gamma,\bar\gamma]$.

Figure 62.1. The material function $\varphi$ for nonlinear Hencky material and for linear material.

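PROOF. Let $\gamma, \mu \in L_{\mathrm{sym}}(V_3)$, and let $\psi(t) = M(\gamma + t\mu)$. Computation of the
derivatives $\psi'(0)$ and $\psi''(0)$ implies
$$M'(\gamma)\mu = k \operatorname{tr}\gamma \operatorname{tr}\mu + 2\kappa\varphi'(\Gamma)[\bar\gamma,\bar\mu],$$
$$M''(\gamma)\mu^2 = k(\operatorname{tr}\mu)^2 + 4\kappa\varphi''(\Gamma)[\bar\gamma,\bar\mu]^2 + 2\kappa\varphi'(\Gamma)[\bar\mu,\bar\mu].$$
Schwarz' inequality
$$[\bar\gamma,\bar\mu]^2 \le |\bar\gamma|^2|\bar\mu|^2 \quad \text{with } |\gamma|^2 = [\gamma,\gamma]$$
then yields
$$M''(\gamma)\mu^2 \ge k(\operatorname{tr}\mu)^2 + 2\kappa\varphi'(\Gamma)|\bar\mu|^2 - 4\kappa|\varphi''(\Gamma)|\,|\bar\gamma|^2|\bar\mu|^2$$
$$= k(\operatorname{tr}\mu)^2 + 2\kappa\left(\varphi'(\Gamma) + 2\varphi''(\Gamma)\Gamma\right)|\bar\mu|^2$$
$$\ge k(\operatorname{tr}\mu)^2 + \frac{2\kappa}{n}|\bar\mu|^2$$
$$\ge \frac{2\kappa}{n}|\mu|^2,$$
and this is the strong stability condition (6). In the last step we use $|\mu|^2 = |\bar\mu|^2 + \tfrac{1}{3}(\operatorname{tr}\mu)^2$
and $k \ge 2\kappa/3$. The growth conditions (7) follow
from Schwarz' inequality and the boundedness of $\varphi'$ and $\varphi''$. □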
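The estimate in this proof can be tested on a concrete material function. The sketch below is only an illustration under stated assumptions: it uses NumPy, the hypothetical choice $\varphi(t) = at + (1-a)\log(1+t)$ with $a = 1/2$ (which satisfies (i) and (ii) with $b = 1/2$ and $n = 3$), and hypothetical values of $\lambda$ and $\kappa$. None of these choices come from the text.

```python
import numpy as np

A_CONST = 0.5   # hypothetical lower bound a for phi'

def phi1(t):
    """phi'(t) for the assumed phi(t) = a t + (1 - a) log(1 + t)."""
    return A_CONST + (1.0 - A_CONST) / (1.0 + t)

def phi2(t):
    """phi''(t) for the same phi."""
    return -(1.0 - A_CONST) / (1.0 + t) ** 2

def inner(a, b):
    """Scalar product [a, b] = sum_ij a_ij b_ij."""
    return float(np.sum(a * b))

def deviator(g):
    """Strain deviator g - (1/3) tr(g) I."""
    return g - np.trace(g) / 3.0 * np.eye(3)

def second_variation(gamma, mu, lam, kappa):
    """M''(gamma) mu^2 as computed in the proof."""
    k = lam + 2.0 * kappa / 3.0
    gd, md = deviator(gamma), deviator(mu)
    big_gamma = inner(gd, gd)
    return (k * np.trace(mu) ** 2
            + 4.0 * kappa * phi2(big_gamma) * inner(gd, md) ** 2
            + 2.0 * kappa * phi1(big_gamma) * inner(md, md))

def random_symmetric(rng, scale):
    a = scale * rng.standard_normal((3, 3))
    return 0.5 * (a + a.T)

# Condition (ii): 1/n <= phi'(t) + 2 phi''(t) t <= n on a grid of t values.
t = np.linspace(0.0, 1000.0, 200001)
combo = phi1(t) + 2.0 * phi2(t) * t

rng = np.random.default_rng(1)
lam, kappa, n = 1.0, 0.8, 3                # hypothetical constants
margins = []
for _ in range(1000):
    gamma = random_symmetric(rng, scale=10.0 ** rng.uniform(-2, 2))
    mu = random_symmetric(rng, scale=1.0)
    lower = 2.0 * kappa / n * inner(mu, mu)
    margins.append(second_variation(gamma, mu, lam, kappa) - lower)
min_margin = min(margins)   # nonnegative if the strong stability bound holds
```

A nonnegative `min_margin` is consistent with the bound $M''(\gamma)\mu^2 \ge (2\kappa/n)|\mu|^2$ for the sampled tensors; it is a spot check, not a proof.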

62.9. The Constitutive Law for Quasi-Statical Plastic Material

We now want to generalize the one-dimensional quasi-statical plasticity
model from Section 60.4 to three-dimensional bodies (Fig. 62.2(b)). According
to Section 61.7 we assume the following plasticity condition of von Mises:
$$[\bar\sigma,\bar\sigma] < \sigma_0^2 \quad \text{no plasticity},$$
$$[\bar\sigma,\bar\sigma] = \sigma_0^2 \quad \text{plasticity occurs}, \tag{18}$$
where $\sigma_0$ is a given positive number called the plasticity limit.
Figure 62.2. Panels (a) and (b).

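In what follows we consider the H-space $L_{\mathrm{sym}}(V_3)$ with the scalar product
$[\sigma,\mu]$. Set
$$C = \{\sigma \in L_{\mathrm{sym}}(V_3): [\bar\sigma,\bar\sigma] \le \sigma_0^2\}$$
and let $\chi$ denote the indicator function of $C$, i.e.,
$$\chi(\sigma) = \begin{cases} 0 & \text{if } \sigma \in C, \\ +\infty & \text{if } \sigma \notin C. \end{cases}$$
In Section 62.4 we considered the constitutive law $\sigma = M'(\gamma)$ with the inverse
formula
$$\gamma = M^{*\prime}(\sigma) \tag{19}$$
(Fig. 62.2(a)). Here $\gamma, \sigma \in L_{\mathrm{sym}}(V_3)$. Motivated by Section 60.4, we describe the
corresponding quasi-statical plasticity by the constitutive law
$$\gamma \in M^{*\prime}(\sigma) + \partial\chi(\sigma) \tag{20}$$
(Fig. 62.2(b)). According to the definition of the subgradient in Section 47.3,
equation (20) is equivalent to
$$\chi(\mu) - \chi(\sigma) \ge [\gamma - M^{*\prime}(\sigma), \mu - \sigma] \tag{20*}$$
for all $\mu \in L_{\mathrm{sym}}(V_3)$. This, in turn, is equivalent to the following variational
inequality
$$[\gamma - M^{*\prime}(\sigma), \mu - \sigma] \le 0 \quad \text{for all } \mu \in C. \tag{20**}$$
In particular, this implies
$$\gamma = M^{*\prime}(\sigma) \quad \text{if } [\bar\sigma,\bar\sigma] < \sigma_0^2.$$
In Cartesian coordinates we set $\sigma_{ij} = \sigma^i_j$, etc., and thus, we get
$$[\sigma,\mu] = \sum_{i,j=1}^{3} \sigma_{ij}\mu_{ij},$$
$$[\bar\sigma,\bar\sigma] = \sum_{i,j=1}^{3} \left(\sigma_{ij} - \tfrac{1}{3}\delta_{ij}(\sigma_{11} + \sigma_{22} + \sigma_{33})\right)^2,$$
$$C = \{(\sigma_{ij}) \in \mathbb{R}^9_{\mathrm{sym}}: [\bar\sigma,\bar\sigma] \le \sigma_0^2\},$$
where $\mathbb{R}^9_{\mathrm{sym}}$ denotes the set of all real tuples $(\sigma_{ij})$ in $\mathbb{R}^9$ with $\sigma_{ij} = \sigma_{ji}$ for all $i, j$.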
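In Cartesian coordinates the set $C$ is easy to handle on a computer. The following sketch, assuming NumPy and symmetric 3x3 arrays, implements a membership test and the orthogonal projection onto $C$ with respect to the scalar product $[\cdot,\cdot]$. The projection keeps the spherical part and scales the deviator back onto the ball of radius $\sigma_0$; this construction and all function names are assumptions made for illustration. The check at the end uses the fact that the projection $p$ of $s$ onto a closed convex set is characterized by $[s - p, \mu - p] \le 0$ for all $\mu \in C$, which has the same form as the variational inequality for the normal direction $\gamma - M^{*\prime}(\sigma)$.

```python
import numpy as np

def inner(a, b):
    """Scalar product [a, b] = sum_ij a_ij b_ij."""
    return float(np.sum(a * b))

def deviator(s):
    """Deviator s - (1/3) tr(s) I."""
    return s - np.trace(s) / 3.0 * np.eye(3)

def in_C(s, sigma0, tol=1e-12):
    """Membership test for C = {s : [dev s, dev s] <= sigma0^2}."""
    d = deviator(s)
    return inner(d, d) <= sigma0 ** 2 + tol

def project_onto_C(s, sigma0):
    """Orthogonal projection onto C: keep the spherical part of s and
    scale the deviator back onto the ball of radius sigma0."""
    d = deviator(s)
    norm = np.sqrt(inner(d, d))
    factor = 1.0 if norm <= sigma0 else sigma0 / norm
    return (s - d) + factor * d

def random_symmetric(rng, scale=1.0):
    a = scale * rng.standard_normal((3, 3))
    return 0.5 * (a + a.T)

rng = np.random.default_rng(2)
sigma0 = 1.0
s = random_symmetric(rng, scale=5.0)
p = project_onto_C(s, sigma0)

# Largest sampled value of [s - p, mu - p] over mu in C; it should not be positive.
worst = max(
    inner(s - p, project_onto_C(random_symmetric(rng, 3.0), sigma0) - p)
    for _ in range(500)
)
```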
62.10. Principle of Maximal Dual Energy and the Existence Theorem for Linear Quasi-Statical Plasticity

As in Section 62.7 we consider linear material, i.e., we choose the stored energy
function
$$M(\gamma) = \tfrac{1}{2}\lambda(\operatorname{tr}\gamma)^2 + \kappa[\gamma,\gamma]$$
with the Lamé constants $\lambda, \kappa > 0$, and hence
$$M^*(\sigma) = \tfrac{1}{2}\lambda^*(\operatorname{tr}\sigma)^2 + \kappa^*[\sigma,\sigma]$$
with the dual Lamé constants $\kappa^* = 1/\kappa$ and $\lambda^* = -\lambda/\kappa(2\kappa + 3\lambda)$. Moreover,
we have the constitutive law
$$\gamma = \lambda^* \operatorname{tr}\sigma\, I + 2\kappa^*\sigma.$$
This implies
$$\bar\gamma = 2\kappa^*\bar\sigma,$$
which will be of particular interest for the statical model in the following
section.
Our goal is to generalize the variational inequality
$$\int_G [\gamma(u) - M^{*\prime}(\sigma), \mu - \sigma]\,dx \le 0 \tag{21}$$
in order to obtain a quasi-statical model in plasticity for which we are able
to prove an existence theorem. This plasticity model corresponds to the
constitutive law
$$\gamma = M^{*\prime}(\sigma)$$
in linear elasticity with the mixed boundary conditions
$$u = u_0 \text{ on } \partial_1 G, \qquad \sigma n = T \text{ on } \partial_2 G.$$
Note that (21) corresponds to (20**).

62.10a. The Smooth Case

We make the following assumptions:
(H1) Let $G$ be a bounded region in the three-dimensional vector space $V_3$ with
$\partial G \in C^1$. Suppose that there exists a decomposition of the boundary

with $\partial_1 G$ and $\partial_2 G$ sufficiently regular.
(H2) Suppose the two maps $K: \bar G \to V_3$ and $T: \partial_2 G \to V_3$ are continuous.
(H3) Let $\Sigma_1$ denote the set of all $C^1$-maps $\sigma: \bar G \to L_{\mathrm{sym}}(V_3)$ which satisfy the
two equilibrium conditions
$$\operatorname{div}\sigma + K = 0 \text{ on } G, \qquad \sigma n = T \text{ on } \partial_2 G,$$
and the plasticity condition
$$[\bar\sigma,\bar\sigma] \le \sigma_0^2 \text{ on } G$$
for some fixed $\sigma_0 > 0$.
(H4) Let $u: \bar G \to V_3$ and $u_0: \partial_1 G \to V_3$ be continuous maps with
$$u = u_0 \text{ on } \partial_1 G.$$
We consider the variational problem
$$-\int_G M^*(\sigma)\,dx + \int_{\partial_1 G} u_0(\sigma n)\,dO = \max!, \quad \sigma \in \Sigma_1. \tag{22}$$
This problem corresponds to the principle of maximal dual energy of Section
62.4 with the additional side condition $[\bar\sigma,\bar\sigma] \le \sigma_0^2$. Hence, the formulation of
(22) is very natural.

Proposition 62.7. Assume (H1) to (H4). If $\sigma$ is a solution of (22), then $\sigma$ satisfies
the variational inequality (21).

PROOF. Let $\sigma \in \Sigma_1$ be a solution of (22). Let $F(\sigma)$ denote the left-hand side of
(22) and set
$$\Phi(t) = F(\sigma + t(\mu - \sigma)), \quad t \in \mathbb{R},$$
where $\mu \in \Sigma_1$. Then $\Phi'(0) \le 0$, and this implies
$$-\int_G [M^{*\prime}(\sigma), \mu - \sigma]\,dx + \int_{\partial_1 G} u_0(\mu - \sigma)n\,dO \le 0$$
for all $\mu \in \Sigma_1$. Integration by parts then yields
$$\int_G [\gamma(u), \mu - \sigma]\,dx = \int_{\partial_1 G} u_0(\mu - \sigma)n\,dO - \int_G u \operatorname{div}(\mu - \sigma)\,dx$$
for all $\mu \in \Sigma_1$, and hence we obtain (21). Note that $\operatorname{div}\sigma = \operatorname{div}\mu = -K$. □

This proposition shows that we must solve the maximum problem (22) in
order to guarantee that the stress tensor $\sigma$ and the given displacement vector
$u$ in (H4) satisfy our plasticity model (21).

62.10b. Existence of a Generalized Solution

In order to solve (22) in a suitable H-space, we need a generalized definition
of the boundary integral
$$F_u(\sigma) = \int_{\partial_1 G} u(\sigma n)\,dO$$
in the case of weak assumptions on $\sigma$. To this end we set
$$\Sigma(\sigma_0) = \text{closure of } \Sigma_1 \text{ in the H-space } L_2(G, L_{\mathrm{sym}}(V_3)).$$
Using $\sigma_\varepsilon = S_\varepsilon\sigma$, where $S_\varepsilon$ is the smoothing operator from Section 18.14, the
definition of the set $\Sigma$ in Section 62.4 shows that
$$\Sigma(\sigma_0) = \{\sigma \in \Sigma: [\bar\sigma,\bar\sigma] \le \sigma_0^2 \text{ on } G\}.$$
Let us suppose the following:
(A1) $G$ is a bounded region in $V_3$ with $\partial G \in C^{0,1}$, and $\partial_1 G$ and $\partial_2 G$ are open
subsets of $\partial G$ with

(A2) We are given the densities of the outer forces $K$ and $T \in L_2(\partial_2 G, V_3)$
and a vector function $u_0 \in W_2^1(G, V_3)$, where $u_0$ on $\partial_1 G$ corresponds to
the given boundary displacement on $\partial_1 G$.
For all $C^1$-maps $u: \bar G \to V_3$ and all $\sigma \in \Sigma_1$, we obtain from integrating by
parts that
$$\int_G u \operatorname{div}\sigma\,dx = -\int_G [\sigma, \gamma(u)]\,dx + \int_{\partial_2 G} Tu\,dO + F_u(\sigma)$$
and hence
$$F_u(\sigma) = \int_G [\sigma, \gamma(u)]\,dx - \int_G Ku\,dx - \int_{\partial_2 G} Tu\,dO. \tag{23}$$
Obviously, this expression makes sense also in case that $u \in W_2^1(G, V_3)$ and
$\sigma \in \Sigma$. In this general situation we define $F_u(\sigma)$ through (23).
The embedding $W_2^1(G) \subseteq W_2^{1/2}(\partial G)$ is continuous. Thus, for $u \in W_2^1(G, V_3)$,
it follows that $u \in W_2^{1/2}(\partial G, V_3)$, and we can make the following more general
assumption.
(A2*) Replace $T \in L_2(\partial_2 G, V_3)$ with
$$T \in W_2^{1/2}(\partial G, V_3)^*.$$
The map $T: W_2^{1/2}(\partial G, V_3) \to \mathbb{R}$ is a continuous linear functional and we can
define
$$\int_{\partial_2 G} Tu\,dO = T(u).$$
In place of the classical variational problem (22), we now consider the more
general variational problem
$$-\int_G M^*(\sigma)\,dx + F_{u_0}(\sigma) = \max!, \quad \sigma \in \Sigma(\sigma_0). \tag{24}$$
In case that $\sigma_0 = +\infty$, formula (24) is identical to the principle of maximal
dual energy of Section 62.4.

Theorem 62.C (Linear Quasi-Statical Plasticity). Under assumptions (A1), (A2),
(A2*), the variational problem (24) has a unique solution.

PROOF. If we write (24) as a minimum problem, our problem represents a
strongly positive quadratic variational problem on the closed convex set $\Sigma(\sigma_0)$
(see Problem 62.8). The assertion then follows from Proposition 46.1. □

We interpret the solution $\sigma$ of (24) as the stress tensor of our quasi-statical
plasticity model.

62.11. Duality and the Existence Theorem for Linear Statical Plasticity

Quasi-statical models in plasticity theory have the disadvantage that the strain
tensor $\gamma$, and hence the displacement $u$, are undetermined in case the plasticity
limit $\sigma_0$ has been reached. This can already be seen in the one-dimensional
situation, pictured in Figure 62.2(b). The constitutive law of quasi-statical
models therefore represents a strong idealization. In this section we study a
so-called statical model in plasticity which yields unique stresses and displacements.
The idea is this:
Use the principles of minimal potential energy and maximal dual energy and
add the plasticity condition $[\bar\sigma,\bar\sigma] \le \sigma_0^2$ as a side condition.
Our assumptions are the following:
(A1) Conditions (H1) to (H6) of Section 62.3 are satisfied, and $\sigma_0 > 0$ is fixed.
(A2) We consider linear material, i.e., the stored energy function $M$ has the
same form as in Section 62.10. In particular, the inverse constitutive law
$\gamma = M^{*\prime}(\sigma)$ implies $\bar\gamma = 2\kappa^*\bar\sigma$ and hence
$$[\bar\gamma,\bar\gamma] = 4\kappa^{*2}[\bar\sigma,\bar\sigma].$$
This yields the following key relation in our approach: The stress tensor $\sigma$
satisfies the plasticity inequality
$$[\bar\sigma,\bar\sigma] \le \sigma_0^2$$
if and only if the strain tensor $\gamma$ satisfies the inequality
$$[\bar\gamma,\bar\gamma] \le 4\kappa^{*2}\sigma_0^2.$$
This observation motivates us to consider the following two variational
problems
$$E_{\mathrm{pot}}(u) = \min!, \quad u \in X(\sigma_0), \tag{25}$$
$$E_{\mathrm{dual}}(\sigma) = \max!, \quad \sigma \in \Sigma(\sigma_0), \tag{25*}$$
where we set
$$X(\sigma_0) = \{u: u - u_0 \in X \text{ and } [\bar\gamma,\bar\gamma] \le 4\kappa^{*2}\sigma_0^2 \text{ on } G\},$$
$$\Sigma(\sigma_0) = \{\sigma \in \Sigma: [\bar\sigma,\bar\sigma] \le \sigma_0^2 \text{ on } G\}.$$
We observe that if $\sigma_0 = \infty$, then (25) and (25*) coincide with problems (P) and
(P*) of Section 62.4, respectively.

Theorem 62.D (Linear Statical Plasticity). Assume (A1), (A2). Then problem
(25), as well as problem (25*), has a unique solution.

PROOF. The sets $\Sigma(\sigma_0)$ and $X(\sigma_0)$ are closed and convex in the H-spaces
$L_2(G, L_{\mathrm{sym}}(V_3))$ and $X$, respectively (Problem 62.8). Both problem (25) and
(25*) represent strongly positive quadratic variational problems, and the
assertion therefore follows from Proposition 46.1. □

We interpret the solutions $u$ and $\sigma$ as displacement vector and reduced stress
tensor, respectively.
This statical plasticity model is reasonable from both the physical and
the mathematical point of view.
(a) Instead of the "unrealistic" and arbitrary constitutive law in the quasi-statical
model, we start here with a general physical principle (minimization
of energy).
(b) We obtain a unique displacement $u$.
(c) The unique stress tensor $\sigma$ is the same as in the quasi-statical model, since
problem (25*) is identical to problem (24).
(d) As in the quasi-statical model, the zone of plasticity is given by the set of
all $x \in G$ with $[\bar\sigma(x),\bar\sigma(x)] = \sigma_0^2$.
(e) For implementation on computers, we can use the well-established methods
of convex optimization.
(f) This method can also be applied to nonlinear material. For example, in
Chapter 86, we consider plastic torsion of rods, consisting of nonlinear
Hencky material. There, we will also show the close relation of the torsion
problem with stationary conservation laws in hydrodynamics.
Theorem 62.D represents a rigorous justification of the classical empirical
theory of Haar and von Kármán (1909).
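To indicate what the methods of convex optimization look like in the simplest situation, the following sketch treats a pointwise, finite-dimensional analogue: for a fixed strain tensor $\gamma$ it maximizes $[\gamma,\sigma] - M^*(\sigma)$ over the von Mises set by projected gradient ascent. This is an assumption-laden toy model, not the function space problems (24) or (25*): the values of `lam_star`, `kappa_star`, and `sigma0` are hypothetical, the step size is chosen for these values only, and the projection is the one obtained by scaling the deviator.

```python
import numpy as np

def inner(a, b):
    """Scalar product [a, b] = sum_ij a_ij b_ij."""
    return float(np.sum(a * b))

def deviator(s):
    return s - np.trace(s) / 3.0 * np.eye(3)

def project_onto_C(s, sigma0):
    """Orthogonal projection onto {s : [dev s, dev s] <= sigma0^2}."""
    d = deviator(s)
    norm = np.sqrt(inner(d, d))
    factor = 1.0 if norm <= sigma0 else sigma0 / norm
    return (s - d) + factor * d

def dual_gradient(sigma, lam_star, kappa_star):
    """M*'(sigma) = lam* tr(sigma) I + 2 kappa* sigma."""
    return lam_star * np.trace(sigma) * np.eye(3) + 2.0 * kappa_star * sigma

def solve_pointwise(gamma, sigma0, lam_star, kappa_star, step=1.0, iters=300):
    """Projected gradient ascent for [gamma, sigma] - M*(sigma) over C."""
    sigma = np.zeros((3, 3))
    for _ in range(iters):
        ascent = gamma - dual_gradient(sigma, lam_star, kappa_star)
        sigma = project_onto_C(sigma + step * ascent, sigma0)
    return sigma

def random_symmetric(rng, scale=1.0):
    a = scale * rng.standard_normal((3, 3))
    return 0.5 * (a + a.T)

rng = np.random.default_rng(3)
lam_star, kappa_star, sigma0 = -0.05, 0.25, 1.0    # hypothetical values

gamma_large = random_symmetric(rng, 5.0)     # large strain
gamma_small = random_symmetric(rng, 0.01)    # small strain, elastic range
sigma_large = solve_pointwise(gamma_large, sigma0, lam_star, kappa_star)
sigma_small = solve_pointwise(gamma_small, sigma0, lam_star, kappa_star)

residual_large = gamma_large - dual_gradient(sigma_large, lam_star, kappa_star)
residual_small = gamma_small - dual_gradient(sigma_small, lam_star, kappa_star)

# Sampled check of [gamma - M*'(sigma), mu - sigma] <= 0 for mu in C.
worst = max(
    inner(residual_large,
          project_onto_C(random_symmetric(rng, 3.0), sigma0) - sigma_large)
    for _ in range(500)
)
```

In the small-strain case the computed stress stays inside the set and `residual_small` vanishes up to rounding, which matches $\gamma = M^{*\prime}(\sigma)$ below the plasticity limit.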

62.12. Compensated Compactness


There are four important methods to obtain existence results in nonlinear
functional analysis and, for that matter, for nonlinear partial differential
equations:
(i) Compactness.
(ii) Monotonicity.
(iii) Convex extremal problems.
(iv) Compensated compactness.
Prototypes for (i)-(iii) are the following:
(i) Leray-Schauder principle (Theorem 6.A) and the generalized Weierstrass
theorem for extremal problems (Theorem 38.B).
(ii) Main theorem for monotone operators (Theorem 26.A).
(iii) Minimization of continuous convex functionals or, more generally, of
weakly sequentially lower semicontinuous functionals (Theorem 38.A).
As we have seen in Chapter 42, there exists a close relation between (ii)
and (iii).
The method of compensated compactness stands for the following:
The lack of (i)-(iii) is compensated for by using additional information (side
conditions).
We will apply this method to two important problems.
(a) Existence theorem for polyconvex material in nonlinear elasticity (Section
62.13).
(b) Existence theorem for transonic flow (Part V).
In (a) we shall make essential use of properties of $\operatorname{adj} A$ and $\det A$ (Proposition
62.10).
In (b) we shall introduce an additional entropy condition. This will enable
us to apply the embedding theorem of Murat (Proposition 62.11). The method
of compensated compactness is especially transparent in case (b) of transonic
flows. There, we have, roughly, the following situation:
(α) The corresponding variational problem looks like
$$F(u) = \min!, \quad u \in K. \tag{26}$$
(β) If there exist subregions with a supersonic flow, then $F$ loses its convexity.
(γ) If we add an additional so-called entropy condition, then $K$ becomes
compact, according to Proposition 62.11 below. So we can apply the
generalized Weierstrass theorem (Theorem 38.B) to (26).
The entropy condition has a physical background. It guarantees that the
second law of thermodynamics will not be violated.

62.12a. Properties of adj A

Subsequently, we use simple identities for determinants and the method of
integration by parts. We make the following assumption:
(H) $G$ is a bounded region in $\mathbb{R}^3$ with $\partial G \in C^{0,1}$. Let $2 \le p < \infty$ and $q \le r < \infty$
where $p^{-1} + q^{-1} = 1$. We choose a fixed Cartesian coordinate system. Let
$A: V_3 \to V_3$ be a linear operator. Recall that the components of the linear
operator
$$\operatorname{adj} A: V_3 \to V_3$$
are the adjoint subdeterminants of $A^*$, i.e.,
$$(\operatorname{adj} A)_{ij} = \tfrac{1}{2}\varepsilon_{ikl}\varepsilon_{jmn}A_{mk}A_{nl} \tag{27}$$
where, as usual, $\varepsilon_{ikl} = \operatorname{sgn}\begin{pmatrix} 1 & 2 & 3 \\ i & k & l \end{pmatrix}$. We sum over two equal indices
from 1 to 3.
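The index formula for $\operatorname{adj} A$ can be checked against the identity $A\,\operatorname{adj} A = (\det A) I$. The sketch below assumes NumPy and the index placement $(\operatorname{adj} A)_{ij} = \tfrac{1}{2}\varepsilon_{ikl}\varepsilon_{jmn}A_{mk}A_{nl}$ with summation over repeated indices; the function names are illustrative only.

```python
import numpy as np

def levi_civita():
    """The permutation symbol eps[i, k, l] for indices 0, 1, 2."""
    eps = np.zeros((3, 3, 3))
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, j, k] = 1.0      # even permutations
        eps[i, k, j] = -1.0     # odd permutations
    return eps

def adj(A):
    """(adj A)_ij = 1/2 eps_ikl eps_jmn A_mk A_nl, summed over k, l, m, n."""
    eps = levi_civita()
    return 0.5 * np.einsum('ikl,jmn,mk,nl->ij', eps, eps, A, A)

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 3))
adjA = adj(A)
detA = np.linalg.det(A)
```

For a diagonal matrix with entries $a, b, c$ the formula returns the diagonal matrix with entries $bc, ac, ab$, as expected.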

Lemma 62.8. Assume (H). If
$$a_n \rightharpoonup b \text{ in } W_p^1(G)^3,$$
$$\operatorname{adj} a_n' \rightharpoonup c \text{ in } L_r(G)^9,$$
then $\operatorname{adj} b' = c$.

PROOF. Let $a \in W_p^1(G)^3$. Equation (27) yields
$$(\operatorname{adj} a')_{ij} = \tfrac{1}{2}\varepsilon_{ikl}\varepsilon_{jmn}D_k a_m D_l a_n. \tag{28}$$
(I) We begin by proving the key identity
$$\int_G (\operatorname{adj} a')_{ij}\varphi\,dx = -\frac{1}{2}\int_G \varepsilon_{ikl}\varepsilon_{jmn}a_m D_l a_n D_k\varphi\,dx \tag{29}$$
for all $\varphi \in C_0^\infty(G)$. The important fact is that no second-order derivatives
appear on the right-hand side. In case that $a \in C^\infty(\bar G)^3$, this follows from
(28), employing integration by parts and the fact that
$$\varepsilon_{ikl}\varepsilon_{jmn}D_k D_l a_n = 0.$$
In the general case, we use that $C^\infty(\bar G)$ is dense in $W_p^1(G)$ and that the
embedding $W_p^1(G) \subseteq L_q(G)$ is continuous. Note that $2 \le p < \infty$.
(II) The embedding $W_p^1(G) \subseteq L_q(G)$ is compact, and thus
$$a_n \to b \text{ in } L_q(G)^3.$$
From (29) follows
$$\int_G c_{ij}\varphi\,dx = -\frac{1}{2}\int_G \varepsilon_{ikl}\varepsilon_{jmn}b_m D_l b_n D_k\varphi\,dx$$
for all $\varphi \in C_0^\infty(G)$ and
$$\int_G (\operatorname{adj} b')_{ij}\varphi\,dx = \int_G c_{ij}\varphi\,dx$$
for all $\varphi \in C_0^\infty(G)$. Hence $\operatorname{adj} b' = c$. □


Lemma 62.9. Assume (H). If $a \in W_p^1(G)^3$ and
$$\operatorname{adj} a' \in L_r(G)^9, \tag{30}$$
then $\det a' \in L_1(G)$ and
$$\int_G (\det a')\varphi\,dx = -\int_G a_1(\operatorname{adj} a')_{i1}D_i\varphi\,dx \tag{31}$$
for all $\varphi \in C_0^\infty(G)$.

This result is typical for the method of compensated compactness. In case
that $a \in W_p^1(G)^3$, we merely know that
$$\det a': G \to \mathbb{R}$$
is a measurable function. Together with the additional information (30),
however, we obtain the much stronger result that $\det a' \in L_1(G)$. We set
$$b_i = (\operatorname{adj} a')_{i1}.$$
Equation (31) is the distributive form of the classical identity
$$\det a' = D_i(a_1 b_i).$$
This identity is a consequence of
$$\det a' = b_i D_i a_1 \tag{32}$$
and
$$D_i b_i = 0. \tag{33}$$
Equation (33) follows from (28). The simple proof idea is to use integration
by parts as well as density arguments in order to overcome a lack of
smoothness.
PROOF. Let $a \in C^\infty(\bar G)$. From (33) and integration by parts follows that
$$\int_G b_i D_i\psi\,dx = 0$$
for all $\psi \in C_0^\infty(G)$.
By using a density argument the same is true for $a \in W_p^1(G)^3$. Thus we obtain
the key identity
$$\int_G b_i(\varphi D_i c)\,dx = -\int_G b_i(D_i\varphi)c\,dx \tag{34}$$
for all $\varphi \in C_0^\infty(G)$ and $c \in C^\infty(\bar G)$. We then choose $c \in C^\infty(\bar G)$ with
$$c \to a_1 \text{ in } W_p^1(G),$$
and from $b_i \in L_q(G)$ and (32) we obtain that
$$b_i D_i c \to \det a' \text{ in } L_1(G).$$
Formula (31) therefore follows from (34). □

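Proposition 62.10. Assume (H). If
$$a_n \rightharpoonup b \text{ in } W_p^1(G)^3,$$
$$\operatorname{adj} a_n' \rightharpoonup c \text{ in } L_r(G)^9,$$
$$\det a_n' \rightharpoonup d \text{ in } L_s(G), \quad 1 < s < \infty,$$
then $\operatorname{adj} b' = c$ and $\det b' = d$.
This is the key to the proof of Theorem 62.E in the following section.
PROOF. Lemma 62.8 implies that $\operatorname{adj} b' = c$. By passing to the limit in (31), we
obtain
$$\int_G d\varphi\,dx = -\int_G b_1(\operatorname{adj} b')_{i1}D_i\varphi\,dx$$
for all $\varphi \in C_0^\infty(G)$, and hence $d = \det b'$, according to Lemma 62.9. Notice that
our assumptions imply that
$$a_n \to b \text{ in } L_p(G)^3,$$
$$\operatorname{adj} a_n' \rightharpoonup c \text{ in } L_q(G)^9. \qquad □$$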

62.12b. Embedding Theorem of Murat


Recall that
$$C_0^\infty(G)_+ = \{\varphi \in C_0^\infty(G): \varphi \ge 0 \text{ on } G\}.$$

Proposition 62.11 (Murat (1981)). Let $G$ be a bounded region in $\mathbb{R}^N$, $N \ge 1$, with
$\partial G \in C^{0,1}$.
(a) Suppose that
$$F_n \rightharpoonup F \text{ in } W_2^1(G)^* \text{ as } n \to \infty, \tag{35}$$
$$F_n(\varphi) \ge 0 \text{ on } C_0^\infty(G)_+ \text{ for all } n. \tag{36}$$
Then
$$F_n \to F \text{ in } W_p^1(G)^* \text{ as } n \to \infty \text{ for all } 2 < p \le \infty. \tag{37}$$
(b) Assertion (a) remains true if we replace $W_2^1(G)$ and $W_p^1(G)$ with $\mathring{W}_2^1(G)$ and
$\mathring{W}_p^1(G)$, respectively. Assumption $\partial G \in C^{0,1}$ drops out in this case.
Observation (b) can be stated as follows. If $G$ is a bounded region in $\mathbb{R}^N$,
$N \ge 1$, then the embedding
$$\mathring{W}_2^1(G)^*_+ \subseteq \mathring{W}_p^1(G)^*$$
is compact for all $2 < p \le \infty$. The key thereby is the positivity condition (36)
which compensates for the lack of compactness of the embedding $\mathring{W}_2^1(G)^* \subseteq
\mathring{W}_p^1(G)^*$.
Actually, Proposition 62.11(a) is stronger than the corresponding theorem
in Murat (1981) and goes back to Feistauer, Mandel, and Nečas (1984). We
need this stronger result in our approach to transonic flows of Chapter 86.
The proof of Proposition 62.11 makes essential use of interpolation theory
and a priori estimates due to Agmon, Douglis, and Nirenberg (1959) and
Meyers (1963). We also use standard properties of embeddings which may be
found in Section 21.11.
Since the embedding $W_q^1(G) \subseteq W_p^1(G)$, $1 \le p \le q \le \infty$, is continuous, the
dual embedding
$$W_p^1(G)^* \subseteq W_q^1(G)^*$$
is continuous as well. Hence it is sufficient to prove (37) for all $p$ with
$2 < p \le q_0 < \infty$, where $q_0$ is fixed. In particular, we can assume that $p < \infty$.
PROOF OF (a) IN A SPECIAL CASE. We assume that $\partial G \in C^1$ and that (36) holds
on $C^\infty(\bar G)_+$.
In this special case we can give an "elementary" proof. The general case will
be considered below.
(I) We set $\|\varphi\| = \max_{x \in \bar G}|\varphi(x)|$. The obvious inequality
$$-\|\varphi\| \le \varphi \le \|\varphi\| \text{ on } \bar G \text{ for all } \varphi \in C^\infty(\bar G)$$
and (36) imply
$$|F_n(\varphi)| \le F_n(1)\|\varphi\|.$$
From (35) follows that $F_n(1) \to F(1)$ as $n \to \infty$ and hence
$$|F_n(\varphi)| \le \text{const}\,\|\varphi\|$$
for all $n$.
Since $C^\infty(\bar G)$ is dense in $C(\bar G)$, we obtain the key relation
$$|F_n(\varphi)| \le \text{const}\,\|\varphi\| \quad \text{for all } \varphi \in C(\bar G).$$
The set $\{F_n\}$ is therefore bounded in $C(\bar G)^*$.
The proof then follows from the standard arguments (II) to (IV) below.
(II) The embedding
$$W_q^1(G) \subseteq C(\bar G)$$
is compact for all $q > N$. This yields the compactness of the dual
embedding
$$C(\bar G)^* \subseteq W_q^1(G)^* \quad \text{for all } q > N.$$
Thus, there exists a subsequence $(F_{n'})$ with
$$F_{n'} \to \tilde F \text{ in } W_q^1(G)^* \text{ as } n \to \infty.$$
For $q > 2$, the embedding $W_q^1(G) \subseteq W_2^1(G)$ is continuous. Hence the
embedding $W_2^1(G)^* \subseteq W_q^1(G)^*$ is continuous as well. Hypothesis (35)
therefore implies that
$$F_n \rightharpoonup F \text{ in } W_q^1(G)^* \text{ as } n \to \infty,$$
and thus, $\tilde F = F$.
The convergence principle of Proposition 10.13(1) then yields the convergence
of the original sequence, i.e.,
$$F_n \to F \text{ in } W_q^1(G)^* \text{ as } n \to \infty. \tag{38}$$
(III) If $p \ge q$, then (38) implies that
$$F_n \to F \text{ in } W_p^1(G)^* \text{ as } n \to \infty.$$
Observe the continuity of the embedding $W_p^1(G) \subseteq W_q^1(G)$ which implies
the continuity of the embedding $W_q^1(G)^* \subseteq W_p^1(G)^*$.
(IV) In case that $2 < p < q$ and $q < \infty$, we use the convergence trick of
interpolation theory (Section 21.17). Every $p$ can be represented as
$$\frac{1}{p} = \frac{1-t}{q} + \frac{t}{2}, \quad 0 < t < 1.$$
The well-known interpolation relation then yields
$$\|F\|_{1,p} \le C\|F\|_{1,q}^{1-t}\|F\|_{1,2}^{t} \quad \text{for all } F \in W_2^1(G)^*, \tag{39}$$
where $\|F\|_{1,r}$ denotes the norm on $W_r^1(G)^*$. From assumption (35) follows
that the sequence $(F_n)$ is bounded in $W_2^1(G)^*$. Assertion (37) then
follows from (38) and (39).

PROOF OF (b). Choose proper subregions
$$H \subset\subset A \subset\subset B \subset\subset G$$
with $C^\infty$-boundaries and set $Du\,D\varphi = \sum_{j=1}^{N} D_j u\,D_j\varphi$. By hypothesis,
$$F_n \rightharpoonup F \text{ in } \mathring{W}_2^1(G)^* \text{ as } n \to \infty, \tag{40}$$
$$F_n(\varphi) \ge 0 \text{ on } C_0^\infty(G)_+ \text{ for all } n. \tag{41}$$
(I) The trick in our proof is to represent $F_n$ in the form
$$-\Delta u_n = F_n, \quad u_n \in \mathring{W}_2^1(G),$$
i.e.,
$$\int_G Du_n\,D\varphi\,dx = F_n(\varphi) \quad \text{for all } \varphi \in C_0^\infty(G).$$
According to Theorem 25.1, the function $u_n$ is uniquely determined by
$F_n$, where
$$\|u_n\|_{\mathring{W}_2^1(G)} \le \text{const}\,\|F_n\|_{\mathring{W}_2^1(G)^*} \quad \text{for all } n.$$
From (40) follows that the sequence $(F_n)$ is bounded in $\mathring{W}_2^1(G)^*$, and
hence that
$$(u_n) \text{ is bounded in } \mathring{W}_2^1(G). \tag{42}$$
Below we shall prove the following:
There exists a sequence of subregions $H_1 \subset H_2 \subset \cdots \subset G$ such that
$\operatorname{meas}(G - H_k) \to 0$ as $k \to \infty$, and there exists a subsequence of $(u_n)$ such
that
$$(u_{n'}) \text{ converges in } W_s^1(H) \text{ for all } 1 < s < 2 \tag{43}$$
and all $H = H_k$, $k = 1, 2, \ldots$.
We show that (43) implies assertion (b). Let $p > 2$, and choose $s$ with
$p^{-1} + s^{-1} = 1$. We then have the splitting
$$F_n(\varphi) - F_m(\varphi) = \int_G (Du_n - Du_m)D\varphi\,dx$$
$$= \int_H (Du_n - Du_m)D\varphi\,dx + \int_{G-H} (Du_n - Du_m)D\varphi\,dx.$$
Observe that
$$\left|\int_H (Du_n - Du_m)D\varphi\,dx\right| \le \|u_n - u_m\|_{W_s^1(H)}\|\varphi\|_{W_p^1(G)}$$
and
$$\left|\int_{G-H} (Du_n - Du_m)D\varphi\,dx\right|
\le \left(\int_{G-H} |Du_n - Du_m|^2\,dx\right)^{1/2}\left(\int_{G-H} |D\varphi|^p\,dx\right)^{1/p}\left(\int_{G-H} dx\right)^{(1/2)-(1/p)}$$
$$\le (\operatorname{meas}(G - H))^{(1/2)-(1/p)}\|u_n - u_m\|_{W_2^1(G)}\|\varphi\|_{W_p^1(G)}.$$
This implies
$$\|F_{n'} - F_{m'}\| < \varepsilon \text{ in } \mathring{W}_p^1(G)^* \quad \text{if } n', m' \ge n_0(\varepsilon).$$
Notice (42) and (43). Hence
$$F_{n'} \to \tilde F \text{ in } \mathring{W}_p^1(G)^* \text{ as } n \to \infty.$$
From (40) follows that $\tilde F = F$. The convergence principle in Proposition
10.13(1) then yields the assertion
$$F_n \to F \text{ in } \mathring{W}_p^1(G)^* \text{ as } n \to \infty.$$
(II) In order to finish our proof it only remains to show that (43) holds true.
(II-1) Choose $\tilde\varphi \in C_0^\infty(G)$ with $\tilde\varphi = 1$ on $B$. From

for all $\varphi \in C_0^\infty(B)$
and hypothesis (40), we obtain the key estimate

for all $\varphi \in C_0^\infty(B)$
and all $n$. Since $C_0^\infty(B)$ is dense in $C(\bar A)$, we find that
$$(F_n) \text{ is bounded in } C(\bar A)^*.$$
Choose $q > \max\{N, 2\}$. The embedding $W_q^1(A) \subseteq C(\bar A)$ is then compact
and hence the embedding $C(\bar A)^* \subseteq W_q^1(A)^*$ is also compact. Thus there
exists a subsequence of $(F_n)$ such that
$$(F_{n'}) \text{ converges in } W_q^1(A)^*. \tag{44}$$
(II-2) The interior $L_p$-estimates in Agmon, Douglis, and Nirenberg (1959)
yield
$$\|u_n\|_{W_r^1(H)} \le \text{const}\,\|F_n\|_{W_q^1(A)^*} \quad \text{for all } n,$$
where $r^{-1} + q^{-1} = 1$. Hence
$$(u_{n'}) \text{ converges in } W_r^1(H). \tag{45}$$
(II-3) We prove that
$$(u_{n'}) \text{ converges in } W_s^1(H) \text{ for all } 1 < s < 2. \tag{46}$$
Observe that $1 < r < 2$. If $1 < s \le r$, then (46) is valid as a consequence
of the continuity of the embedding $W_r^1(H) \subseteq W_s^1(H)$.
In case that $r < s < 2$, we choose $0 < t < 1$ with
$$\frac{1}{s} = \frac{1-t}{r} + \frac{t}{2}.$$
Interpolation yields
$$\|u\|_{1,s} \le \text{const}\,\|u\|_{1,r}^{1-t}\|u\|_{1,2}^{t} \quad \text{for all } u \in W_2^1(H).$$
This, together with (42), (45) implies (46).
(II-4) Now choose a sequence $H_1 \subset H_2 \subset \cdots \subset G$. Formula (43) then follows
from a diagonal procedure.
PROOF OF (a) IN THE GENERAL CASE. The key to the proof is the orthogonal
projection (47), together with the splitting argument (49), (49*) below. In particular,
we will use a result due to Meyers (1963).
(I) Let $\varphi \in W_p^1(G)$, $p > 2$. Then $\varphi \in W_2^1(G)$. From Section 22.4, there exists
the orthogonal decomposition
$$\varphi = \varphi_1 + \varphi_2 \tag{47}$$
and $\varphi_1 \in C^\infty(G)$ with
$$\Delta\varphi_1 = 0 \text{ on } G, \quad \varphi_1 = \varphi \text{ on } \partial G. \tag{48}$$
A standard result in Hilbert space theory says that the mappings
$$\varphi \mapsto \varphi_i, \quad i = 1, 2 \tag{47*}$$
are orthogonal projections on the H-space $W_2^1(G)$, and hence are continuous
on $W_2^1(G)$. We need, however, a much stronger result here. The
paper of Meyers (1963) shows that there exists a real number $q_0 > 2$
such that the mappings (47*) are continuous from
$$W_r^1(G) \text{ to } W_r^1(G) \quad \text{if } 2 \le r \le q_0.$$
In the following we assume that $2 < p \le q_0$.
(II) We show that the set
$$K = \{\varphi_1: \|\varphi\|_{W_p^1(G)} \le 1\}$$
is relatively compact in $W_2^1(G)$. In fact, it follows from (48) that
$$\|\varphi_1\|_{W_2^1(G)} \le \text{const}\,\|\varphi\|_{W_2^{1/2}(\partial G)} \quad \text{for all } \varphi \in W_2^1(G),$$
according to Theorem 25.1. We then observe that the embedding
$W_p^1(G) \subseteq W_p^{1-1/p}(\partial G)$ is continuous and that the embedding
$$W_p^{1-1/p}(\partial G) \subseteq W_2^{1/2}(\partial G)$$
is compact. Denoting the closure of $K$ in $W_2^1(G)$ by $\bar K$, we find that $\bar K$
is compact.
(III) From hypothesis (35), the sequence $(F_n)$ is bounded in $W_2^1(G)^*$, and
$$|F_n(\varphi) - F_n(\psi)| \le \text{const}\,\|\varphi - \psi\|_{W_2^1(G)}$$
for all $n$ and all $\varphi, \psi \in W_2^1(G)$. This yields the equicontinuity of $\{F_n\}$ on
$W_2^1(G)$.
(IV-1) From (35),
$$F_n(\varphi) \to F(\varphi) \text{ as } n \to \infty \text{ for all } \varphi \in \bar K.$$
According to Problem 19.14, the compactness of $\bar K$ and the equicontinuity
of $\{F_n\}$ imply that this convergence is uniform, i.e.,
$$\sup_{\|\varphi\|_{1,p} \le 1} |F_n(\varphi_1) - F(\varphi_1)| \to 0 \text{ as } n \to \infty. \tag{49}$$
(IV-2) The embedding $W_2^1(G)^* \subseteq \mathring{W}_2^1(G)^*$ is continuous. From (35) follows
that
$$F_n \rightharpoonup F \text{ in } \mathring{W}_2^1(G)^* \text{ as } n \to \infty.$$
Proposition 62.11(b) then shows that
$$F_n \to F \text{ in } \mathring{W}_p^1(G)^* \text{ as } n \to \infty,$$
and hence
$$\sup_{\|\varphi\|_{1,p} \le 1} |F_n(\varphi_2) - F(\varphi_2)| \to 0 \text{ as } n \to \infty. \tag{49*}$$
Notice the continuity of the map $\varphi \mapsto \varphi_2$ from $W_p^1(G)$ to $\mathring{W}_p^1(G)$.
From (49), (49*) and $\varphi = \varphi_1 + \varphi_2$, we obtain the assertion
$$F_n \to F \text{ in } W_p^1(G)^* \text{ as } n \to \infty. \qquad □$$

62.13. Existence Theorem for Polyconvex Material


We consider the principle of minimal potential energy
$$\int_G L(I + u'(x))\,dx - \int_G Ku\,dx - \int_{\partial_2 G} Tu\,dO = \min!, \qquad u = u_0 \text{ on } \partial_1 G \tag{50}$$
in the case that the stored energy function has the form
$$L(A) = P(A, \operatorname{adj} A, \det A) \tag{51}$$
for all $A \in L(V_3)_+$. We set
$$L(V_3)_+ = \{A \in L(V_3): \det A > 0\},$$
and define the norm
$$|A| = (\operatorname{tr} A^*A)^{1/2}.$$
In Cartesian coordinates we have
$$|A|^2 = \sum_{i,j=1}^{3} A_{ij}^2.$$
The connection between $A$ and the displacement $u$ is given by
$$A = I + u'(x).$$
We now choose a fixed Cartesian coordinate system and make the following
assumptions:
(H1) $G$ is a bounded region in $\mathbb{R}^3$ with $\partial G \in C^{0,1}$ and there exists a decomposition

where $\partial_1 G$ and $\partial_2 G$ are open in $\partial G$.
(H2) Polyconvexity. There exists a convex function
$$P: L(V_3) \times L(V_3) \times\, ]0, \infty[\, \to \mathbb{R}$$
such that (51) is valid. Moreover, in a neighborhood of a fixed point, $P$
is bounded from above.
(H3) The limit $\det A \to 0$. If
$$A_n \to A \text{ and } B_n \to B \text{ as } n \to \infty$$
and $d_n \to +0$, then
$$P(A_n, B_n, d_n) \to +\infty \text{ as } n \to \infty.$$
(H4) Coerciveness. Assume that there exist constants $C > 0$ and $D$ such that
$$P(A, \operatorname{adj} A, \det A) \ge C(|A|^p + |\operatorname{adj} A|^r + |\det A|^s) + D$$
for all $A \in L(V_3)_+$. Here $2 \le p < \infty$, $q^{-1} + p^{-1} = 1$, $q \le r < \infty$, and
$1 < s < \infty$.
(H5) Outer forces. We are given
$$K \in L_q(G)^3, \quad T \in L_q(\partial_2 G)^3.$$
(H6) Boundary displacement. Let $V$ be the set of all
$$u \in W_p^1(G)^3$$
with the additional properties
$$\operatorname{adj}(I + u') \in L_r(G)^9, \quad \det(I + u') \in L_s(G),$$
$$\det(I + u'(x)) > 0 \text{ a.e. on } G.$$
We are given $u_0 \in V$.
Our final problem looks like
$$F(u) = \min!, \quad u \in U, \tag{52}$$
where
$$U = \{u \in V: u = u_0 \text{ on } \partial_1 G\}$$
and
$$F(u) = \int_G L\,dx - \int_G Ku\,dx - \int_{\partial_2 G} Tu\,dO.$$

Theorem 62.E (Ball (1977)). Assume (H1)-(H6) with $F(u_0) < \infty$.
Then there exists a solution of the variational problem (52).

PROOF. We use standard arguments of the calculus of variations, together with
the key result of Proposition 62.10.
(I) Preparations. Define
$$\tilde P(A, B, d) = \begin{cases} P(A, B, d) & \text{if } d > 0, \\ +\infty & \text{if } d \le 0. \end{cases}$$
The convex function $P$ in (H2) is continuous according to Proposition
47.5. Hence the function
$$\tilde P: L(V_3) \times L(V_3) \times \mathbb{R} \to\, ]-\infty, +\infty]$$
is continuous according to (H3).
Furthermore, we set
$$X = W_p^1(G)^3 \times L_r(G)^9 \times L_s(G)$$
and
$$Hu(x) = (x + u(x), \operatorname{adj}(I + u'(x)), \det(I + u'(x))). \tag{53}$$
(I-1) On $W_p^1(G)$ we introduce the equivalent norm
$$\left(\int_G |Dv|^p\,dx + \int_{\partial_1 G} |v|^p\,dO\right)^{1/p},$$
and consequently, we get
$$\|u\|_{1,p} \le \text{const}\,\|Du\|_p + \text{const} \quad \text{for all } u \in U. \tag{54}$$
Notice that $u = u_0$ on $\partial_1 G$.
(I-2) Theorem of Mazur. Let $(v_n)$ be a sequence in an arbitrary B-space $X$ with
weak convergence
$$v_n \rightharpoonup v \text{ in } X \text{ as } n \to \infty.$$
Then there exist convex linear combinations
$$w_n = \sum_{i=1}^{N(n)} \lambda_{ni}v_i$$
such that $N(n) \to \infty$ as $n \to \infty$, and we have strong convergence
$$w_n \to v \text{ in } X \text{ as } n \to \infty.$$
The proof may be found in Yosida (1965, M).
(I-3) Recall that in case
$$f_n \to f \text{ in } L_p(G) \text{ as } n \to \infty$$
and $1 < p < \infty$, there exists a subsequence such that
$$f_{n'}(x) \to f(x) \text{ a.e. on } G \text{ as } n \to \infty.$$
(II) Minimal sequence. Set
$$\mu = \inf_{u \in U} F(u).$$
From (H4), we have
$$-\infty < F(u) \le +\infty \quad \text{for all } u \in U.$$
Since $F(u_0) < \infty$, the value $\mu$ is finite. We choose a sequence $(u_n)$ in $U$
such that
$$F(u_n) \to \mu \text{ as } n \to \infty.$$
Coerciveness (H4) and (54) then yield that
$$(Hu_n) \text{ is bounded in } X.$$
(III) Weak convergence. Since $X$ is reflexive, there exists a subsequence, denoted
again by $(Hu_n)$, such that
$$Hu_n \rightharpoonup v \text{ in } X \text{ as } n \to \infty.$$
From Proposition 62.10 follows
$$v = Hu.$$
This is the key to our proof.
(IV) Strong convergence. We set $v_n = Hu_n$ and choose $(w_n)$ as in (I-2). By
passing to a subsequence, if necessary, we may assume that
$$w_n \to Hu \text{ in } X \text{ as } n \to \infty,$$
$$w_n(x) \to Hu(x) \text{ a.e. on } G \text{ as } n \to \infty.$$
(V) Lemma of Fatou. The convexity of $\tilde P$ yields
$$\tilde P(w_n) \le \sum_{i=1}^{N(n)} \lambda_{ni}\tilde P(Hu_i).$$
From the coerciveness condition (H4) we have that
$$\tilde P(w_n) \ge \text{const} \quad \text{for all } n,$$
and thus the lemma of Fatou A2(19c) yields
$$\int_G \tilde P(Hu)\,dx = \int_G \lim_{n\to\infty} \tilde P(w_n)\,dx \le \lim_{n\to\infty} \int_G \sum_{i=1}^{N(n)} \lambda_{ni}\tilde P(Hu_i)\,dx.$$
Hence
$$F(u) \le \lim_{n\to\infty} \sum_{i=1}^{N(n)} \lambda_{ni}F(u_i) \le \lim_{n\to\infty} F(u_{N(n)}) = \mu,$$
i.e.,
$$\int_G \tilde P(Hu)\,dx - \int_G Ku\,dx - \int_{\partial_2 G} Tu\,dO \le \mu.$$
(VI) We show that $u \in U$. We have $\tilde P(Hu) < \infty$. Hence the construction of $\tilde P$
yields
$$\det(I + u'(x)) > 0 \text{ a.e. on } G.$$
Since $u_n = u_0$ on $\partial_1 G$ for all $n$, and since convex linear combinations of
the $u_n$ converge to $u$ in $W_p^1(G)$, we also have that
$$u = u_0 \text{ on } \partial_1 G.$$
Observe the continuity of the embedding $W_p^1(G) \subseteq W_p^{1-1/p}(\partial G)$.
Consequently, $F(u) = \mu$ and $u \in U$, i.e., $u$ is a solution of (52). □

62.14. Application to Rubberlike Material


We consider Ogden's material from Example 61.15, and wish to show that
Theorem 62.E can be applied to such a material. The stored energy function
has the form
$$L = P(A, \operatorname{adj} A, \det A)$$
with $A = I + u'(x)$ and
$$P = C \operatorname{tr} E^p + D \operatorname{tr}(\operatorname{adj} E)^r + f(\det A) + \text{const}, \tag{55}$$
where $E = (A^*A)^{1/2}$. In the special case of Mooney-Rivlin material $p = r = 2$
we have
$$P = C|A|^2 + D|\operatorname{adj} A|^2 + f(\det A) + \text{const}.$$
Let us assume that:
(A1) The function $f:\, ]0, \infty[\, \to \mathbb{R}$ is convex and continuous with
$$f(d) \ge ad^s \quad \text{for all } d > 0,$$
where $0 < a < \infty$ and $1 < s < \infty$. Moreover, $f(d) \to +\infty$ as $d \to +0$.
(A2) The constants in (55) satisfy
$$2 \le p < \infty, \quad q \le r < \infty, \quad C, D > 0.$$

Proposition 62.12. Assume (A1), (A2). Then Theorem 62.E can be applied to
Ogden's rubberlike material (55).

PROOF. Let $\lambda_1, \lambda_2, \lambda_3$ be the eigenvalues of $(A^*A)^{1/2}$. The well-known inequality
$$\alpha(\lambda_1^2 + \lambda_2^2 + \lambda_3^2)^{1/2} \le (\lambda_1^p + \lambda_2^p + \lambda_3^p)^{1/p}$$
with a constant $\alpha > 0$ implies that
$$\alpha^p|A|^p \le \operatorname{tr} E^p.$$
Similarly, we obtain
$$\alpha^r|\operatorname{adj} A|^r \le \operatorname{tr}(\operatorname{adj} E)^r.$$
Thus the coerciveness condition is satisfied. Polyconvexity follows from Example
61.15. □
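The first inequality in this proof can be illustrated numerically. The sketch below assumes NumPy and uses the singular values of $A$ as the eigenvalues of $E = (A^*A)^{1/2}$. The explicit constant $\alpha = 3^{1/p - 1/2}$ is an assumption of this sketch (it comes from the power mean inequality for $p \ge 2$); the text only asserts that some $\alpha > 0$ exists.

```python
import numpy as np

def principal_stretches(A):
    """Eigenvalues of E = (A* A)^{1/2}, i.e. the singular values of A."""
    return np.linalg.svd(A, compute_uv=False)

def trace_E_power(A, p):
    """tr E^p = lambda_1^p + lambda_2^p + lambda_3^p."""
    return float(np.sum(principal_stretches(A) ** p))

def frobenius_norm(A):
    """|A| = (tr A* A)^{1/2}."""
    return float(np.sqrt(np.trace(A.T @ A)))

def alpha(p):
    """Assumed constant alpha = 3^(1/p - 1/2), valid for p >= 2."""
    return 3.0 ** (1.0 / p - 0.5)

rng = np.random.default_rng(4)
ratios = []
for p in (2.0, 3.0, 4.5):
    for _ in range(200):
        A = rng.standard_normal((3, 3))
        lhs = alpha(p) ** p * frobenius_norm(A) ** p
        ratios.append(lhs / trace_E_power(A, p))
max_ratio = max(ratios)     # should not exceed 1

# Mooney-Rivlin case p = 2: tr E^2 coincides with |A|^2.
A = rng.standard_normal((3, 3))
mooney_gap = abs(trace_E_power(A, 2.0) - frobenius_norm(A) ** 2)
```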

62.15. Proof of Korn's lnequality


Our goal is to give a simple proof of Kom's inequality. This is the most
important inequality in elasticity theory. Our approach follows Neeas and
Hlaväcek (1981). The key will be a standard result about equivalent norms on
the Lebesgue space L 2 (G) in terms of negative norms, which correspond to
norms on the dual Sobolev space Wi(G)* (Proposition 62.18).
We make the following assumptions:
(Hl) Gis a bounded region in V3 with oGeC0 •1•
(H2) We have the boundary decomposition

where o1 G and o2 G are open subsets of oG.


As in Section 62.2 we set
x = Wi(G,al G; v3>·
Recall that this space corresponds to the boundary condition
u = 0 on o1 G.
Moreover, recall that the linearized strain tensor is given by
y(u)(x) = f(u'(x) + u'(x)*),
and define
X 0 = {ueX: y(u) = 0 on-G},
which is a linear, closed subspace of the H-space X, consisting of all displace-
ments with vanishing linearized strain. Let Xci' denote the orthogonal comple-
ment to X 0 , i.e.,
62.1 S. Proor or Korn's Inequality 279

Theorem 62.F (Korn's Inequality). Assurne (H 1), (H2). Then there exists a con-
stant c1 > 0 such that

L [y(u), y(u)] dx :2: c 1 llull~ for all ueX~. (56)

CoroUary 62.13 (Special Korn's Ineqtiality). Assurne (Hl), (H2) with o1 G ::/:: 0.
Then there exists a constant c1 > 0 such that

L [y(u), y(u)] dx :2: c 1 llulli for all ueX. (57)

In Cartesian coordinates we have


yj(u) = i(Diu1 + Diu;),
[y(u), y(u)] = yjy/, (58)

llull~ = L uiu1 + Diu1D1uidx.

Moreover, we introduce

(59)

Corollary 62.14 (Equivalent Norm). Assume (H1), (H2) with $\partial_1 G \ne \emptyset$. Then, on the H-space $X$, the norms $\|u\|_X$ and $|u|_X$ are equivalent, i.e., there are constants $C, D > 0$ such that
$$C|u|_X \le \|u\|_X \le D|u|_X \quad\text{for all } u \in X.$$

Consequently, we can replace $\|u\|_X^2$ with $|u|_X^2$ in (57). For smooth functions, inequalities (56) and (57) go back to a fundamental paper of Korn (1907) on the existence of classical solutions in linear elasticity theory. We note that Korn's inequality is by no means trivial. Observe that, in contrast to the right-hand side of (57), the left-hand side of (57) contains only special combinations of the derivatives $D_i u_j$.
In the special case $\partial_1 G = \partial G$, we gave a simple proof of Korn's inequality (57) in Section 61.9b by using integration by parts and the inequality of Poincaré-Friedrichs. The general proof proceeds in several steps.
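
The integration-by-parts argument for the special case $\partial_1 G = \partial G$ can be sketched as follows. This is only an outline under stated assumptions: $u$ is taken in $C_0^\infty(G, V_3)$, the strain components are written with lower indices $\gamma_{ij} = \tfrac12(D_i u_j + D_j u_i)$ (a notational convenience, not the book's index placement), and repeated indices are summed from 1 to 3.

```latex
\begin{aligned}
\int_G \gamma_{ij}\gamma_{ij}\,dx
  &= \tfrac14 \int_G (D_i u_j + D_j u_i)(D_i u_j + D_j u_i)\,dx \\
  &= \tfrac12 \int_G D_i u_j\, D_i u_j\,dx
   + \tfrac12 \int_G D_i u_j\, D_j u_i\,dx, \\
\int_G D_i u_j\, D_j u_i\,dx
  &= -\int_G u_j\, D_i D_j u_i\,dx
   = \int_G D_j u_j\, D_i u_i\,dx
   = \int_G (\operatorname{div} u)^2\,dx \;\ge\; 0, \\
\int_G \gamma_{ij}\gamma_{ij}\,dx
  &\ge \tfrac12 \int_G D_i u_j\, D_i u_j\,dx .
\end{aligned}
```

The boundary terms vanish because $u$ has compact support in $G$. The Poincaré-Friedrichs inequality then bounds the right-hand side from below by a multiple of $\|u\|_X^2$, and a density argument extends the estimate from smooth functions to the whole space.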

62.15a. The Structure of the Space $X_0$

Lemma 62.15. Assume (H1), (H2) with $\partial_1 G = \emptyset$. Then $X_0$ consists of all infinitesimal rigid motions, i.e., we have $u \in X_0$ if and only if there exist vectors $a, b \in V_3$ such that
$$u(x) = a + b \times x \quad\text{on } G. \qquad (60)$$

PROOF. See Problem 62.3. □

Recall that $X = W_2^1(G, V_3)$ if $\partial_1 G = \emptyset$, i.e.,
$$X_0 = \{u \in W_2^1(G, V_3) : \gamma(u) = 0 \text{ on } G\}. \qquad (61)$$

This lemma therefore states the very natural result that the linearized strain tensor vanishes identically precisely for all infinitesimal rigid motions.

Lemma 62.16. Assume (H1), (H2) with $\partial_1 G \ne \emptyset$. Then $X_0 = \{0\}$.

PROOF. See Problem 62.4. □

Recall that
$$X_0 = \{u \in W_2^1(G, V_3) : \gamma(u) = 0 \text{ on } G \text{ and } u = 0 \text{ on } \partial_1 G\}. \qquad (62)$$
This lemma states the following natural result. If $\partial_1 G$ is nonempty, then all infinitesimal rigid motions $u$ with $u = 0$ on $\partial_1 G$ are trivial.
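
Both lemmas rest on two elementary facts about the map $x \mapsto b \times x$: its matrix is skew-symmetric, so the linearized strain of $u(x) = a + b \times x$ vanishes, and for $b \ne 0$ the matrix has rank 2, so $a + b \times x = 0$ can hold at most on a line. The following minimal sketch checks both facts in exact arithmetic; the function names are illustrative and not taken from the text.

```python
from fractions import Fraction


def cross(b, x):
    """Vector product b x x in R^3."""
    return (b[1] * x[2] - b[2] * x[1],
            b[2] * x[0] - b[0] * x[2],
            b[0] * x[1] - b[1] * x[0])


def skew_matrix(b):
    """Matrix B with B x = b x x; it is the constant gradient of u(x) = a + b x x."""
    b1, b2, b3 = b
    return [[0, -b3, b2],
            [b3, 0, -b1],
            [-b2, b1, 0]]


def linearized_strain(grad):
    """Symmetric part (grad + grad^T) / 2 of a 3x3 displacement gradient."""
    return [[Fraction(grad[i][j] + grad[j][i], 2) for j in range(3)]
            for i in range(3)]


def rank(matrix):
    """Rank via Gaussian elimination with exact fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


b = (1, -2, 3)
B = skew_matrix(b)
strain = linearized_strain(B)
strain_vanishes = all(entry == 0 for row in strain for entry in row)
print(strain_vanishes, rank(B))
```

The translation part $a$ does not enter the gradient, which is why only $b$ appears in the check.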

62.15b. Equivalent Norms on Sobolev Spaces

We now recall two important results on Sobolev spaces which were proved in Part II.

Proposition 62.17. Assume (H1), (H2) with $\partial_1 G \ne \emptyset$. Then

$$\left(\int_G D_i v\, D_i v\,dx + \int_{\partial_1 G} v^2\,dO\right)^{1/2}$$

and

$$\|v\|_{1,2} = \left(\int_G v^2 + D_i v\, D_i v\,dx\right)^{1/2}$$

are equivalent norms on the Sobolev space $W_2^1(G)$.

Let $f \in L_2(G)$. We set

$$F(v) = \int_G f v\,dx, \qquad H(v) = -\int_G f\, D_j v\,dx.$$

Hölder's inequality implies that $F, H: \mathring{W}_2^1(G) \to \mathbb{R}$ are continuous, linear functionals, i.e., $F, H \in \mathring{W}_2^1(G)^*$. The norms $\|F\|$ and $\|H\|$ of these functionals are denoted by $\|f\|_{-1,2}$ and $\|D_j f\|_{-1,2}$, respectively, i.e., we obtain the so-called negative norms

$$\|f\|_{-1,2} = \sup_{v \in B} \left|\int_G f v\,dx\right|,$$

$$\|D_j f\|_{-1,2} = \sup_{v \in B} \left|\int_G f\, D_j v\,dx\right|,$$

with the closed unit ball $B$ in $\mathring{W}_2^1(G)$, i.e.,

$$B = \{v \in \mathring{W}_2^1(G) : \|v\|_{1,2} \le 1\}.$$
Recall that the norm $\|v\|_2$ on $L_2(G)$ is given by

$$\|v\|_2 = \left(\int_G v^2\,dx\right)^{1/2}.$$

Proposition 62.18. Assume (H1). Then

$$\|f\|_* = \|f\|_{-1,2} + \sum_{j=1}^{3} \|D_j f\|_{-1,2}$$

is an equivalent norm on $L_2(G)$, i.e., there are constants $C, D > 0$ such that

$$C\|f\|_* \le \|f\|_2 \le D\|f\|_* \quad\text{for all } f \in L_2(G).$$

PROOF. See Problem 21.9. □

62.15c. The Algebraic Key Relation

The following lemma shows that the second partial derivatives of the displacements can be expressed in terms of the first partial derivatives of the linearized strain tensor.

Lemma 62.19. Set $\gamma_{rs} = \gamma^r_s$. There exist real constants $A_{klm,rst}$ such that
$$D_k D_l u_m = A_{klm,rst}\, D_t \gamma(u)_{rs} \quad\text{for all } u_m \in C_0^\infty(\mathbb{R}^3),$$
which satisfy the symmetry property $A_{klm,rst} = A_{klm,srt}$. This holds for all possible indices. The sum is taken over $r, s, t$ from 1 to 3.

PROOF. Let $\hat u_m$ be the Fourier transform of $u_m$. We use the key property of Fourier transforms that differentiation is transformed into multiplication, i.e., $D_l u_m$ is transformed into $i\xi_l \hat u_m$ and vice versa. Therefore, it is sufficient to prove
$$\xi_k \xi_l \hat u_m = \tfrac12 A_{klm,rst}\, \xi_t (\xi_r \hat u_s + \xi_s \hat u_r).$$

This is a purely algebraic problem. We make the ansatz
$$A_{klm,rst}\, \xi_t = P_{rs}(\xi_1, \xi_2, \xi_3)$$
with $P_{rs} = P_{sr}$ for all $r, s$. Note that $P_{rs}$ depends also on $k, l, m$. For example, let $m = 1$. Hence it is sufficient to find homogeneous polynomials $P_{rs}$ of first degree which satisfy the relations
$$\xi_k \xi_l = P_{11}\xi_1 + P_{12}\xi_2 + P_{13}\xi_3,$$
$$0 = P_{12}\xi_1 + P_{22}\xi_2 + P_{23}\xi_3,$$
$$0 = P_{13}\xi_1 + P_{23}\xi_2 + P_{33}\xi_3.$$
First, we consider the special case $k = 1$, $l = 2$. In this case, our relations are valid if we set
$$P_{11} = \xi_2 \quad\text{and}\quad P_{rs} = 0 \text{ for all the other indices.}$$
For arbitrary $k, l, m$, we may use the same argument. □
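
One explicit choice of the constants is given by the classical identity $D_k D_l u_m = D_l \gamma_{km} + D_k \gamma_{lm} - D_m \gamma_{kl}$; it is quoted here as a standard fact, not as the construction used in the proof above. The sketch below checks the Fourier-symbol form of this identity for all index triples on random integer data. The names `strain_symbol` and `identity_holds` are illustrative, and real integer vectors stand in for the frequency vector $\xi$ and the transformed displacement $\hat u$, which is sufficient because the identity is polynomial.

```python
from fractions import Fraction
from itertools import product
import random


def strain_symbol(xi, u_hat):
    """Symbol of the strain: E[r][s] = (xi_r * u_s + xi_s * u_r) / 2."""
    return [[Fraction(xi[r] * u_hat[s] + xi[s] * u_hat[r], 2) for s in range(3)]
            for r in range(3)]


def identity_holds(xi, u_hat):
    """Check xi_k xi_l u_m = xi_l E_km + xi_k E_lm - xi_m E_kl for all k, l, m."""
    E = strain_symbol(xi, u_hat)
    for k, l, m in product(range(3), repeat=3):
        lhs = xi[k] * xi[l] * u_hat[m]
        rhs = xi[l] * E[k][m] + xi[k] * E[l][m] - xi[m] * E[k][l]
        if lhs != rhs:
            return False
    return True


random.seed(0)
samples = [([random.randint(-9, 9) for _ in range(3)],
            [random.randint(-9, 9) for _ in range(3)]) for _ in range(100)]
all_ok = all(identity_holds(xi, u_hat) for xi, u_hat in samples)
print(all_ok)
```

For $k = 1$, $l = 2$, $m = 1$ the identity reduces to $D_1 D_2 u_1 = D_2 \gamma_{11}$, since the two remaining terms cancel.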

62.15d. Coerciveness of Strains

Lemma 62.20. Assume (H1), (H2) with $\partial_1 G = \emptyset$. Then there exists a constant $c > 0$ such that

$$\int_G [\gamma(u), \gamma(u)] + u^2\,dx \ge c\|u\|_X^2 \quad\text{for all } u \in X.$$

Observe that, in this case, $X = W_2^1(G, V_3)$. In Cartesian coordinates the lemma states that

$$\int_G \gamma(u)^i_j\, \gamma(u)^j_i + u_i u_i\,dx \ge c \int_G D_i u_j\, D_i u_j + u_i u_i\,dx$$

for all $u_i \in W_2^1(G)$ and all $i$. The key to our proof will be Proposition 62.18.

PROOF. Since $C_0^\infty(\mathbb{R}^3)$ is dense in $W_2^1(G)$, it suffices to show that the assertion is valid for all $u_i \in C_0^\infty(\mathbb{R}^3)$. In the following, all positive constants are denoted by $c$.
(I) From Proposition 62.18 follows the key relation

$$\|D_l u_m\|_2 \le c\Bigl(\|D_l u_m\|_{-1,2} + \sum_k \|D_k D_l u_m\|_{-1,2}\Bigr).$$

(II) Lemma 62.19 implies that

$$\|D_k D_l u_m\|_{-1,2} \le c \sum_{r,s,t} \|D_t \gamma_{rs}\|_{-1,2},$$

where $\gamma = \gamma(u)$. From Proposition 62.18 follows

$$\|D_t \gamma_{rs}\|_{-1,2} \le c\|\gamma_{rs}\|_2$$

and hence

$$\|D_k D_l u_m\|_{-1,2} \le c \sum_{r,s} \|\gamma_{rs}\|_2, \qquad \|D_l u_m\|_{-1,2} \le c\|u_m\|_2.$$
Inequality (I) yields

$$\|D_l u_m\|_2 \le c\Bigl(\sum_{r,s} \|\gamma_{rs}\|_2 + \|u_m\|_2\Bigr),$$

and this proves our assertion. □

62.15e. Proof of Theorem 62.F

In Section 21.2, we introduced a general proof strategy. In order to obtain equivalent norms (inequalities) we used compact embeddings. Here we want to apply this strategy. The key is the following well-known embedding theorem of Rellich:
The embedding $W_2^1(G) \subseteq L_2(G)$ is compact. (63)
Suppose that Theorem 62.F is not true. Then there exists a sequence $(u_n)$ in $X_0^\perp$ such that $\|u_n\|_X = 1$ and

$$\int_G [\gamma(u_n), \gamma(u_n)]\,dx < \frac{1}{n} \quad\text{for all } n. \qquad (64)$$

Using (63), we find a subsequence, again denoted by $(u_n)$, such that

$$u_n \to u \quad\text{in } L_2(G, V_3) \text{ as } n \to \infty.$$
From the coerciveness of strains (Lemma 62.20) and (64), it follows that $(u_n)$ is a Cauchy sequence in the H-space $X$. Hence we obtain the stronger convergence
$$u_n \to u \quad\text{in } X \text{ as } n \to \infty. \qquad (65)$$
Relation (64) yields $\gamma(u) = 0$ on $G$, i.e.,
$$u \in X_0.$$
From (65) and $u_n \in X_0^\perp$ for all $n$, we obtain
$$u \in X_0^\perp.$$
Since $X = X_0 \oplus X_0^\perp$, we get $u = 0$. At the same time, it follows from (65) and $\|u_n\|_X = 1$ for all $n$ that $\|u\|_X = 1$. This contradicts $u = 0$.
The proof of Theorem 62.F is complete.

Corollary 62.13 follows from Theorem 62.F and Lemma 62.16. Corollary 62.14 is an immediate consequence of Proposition 62.17. Observe that $u = 0$ on $\partial_1 G$ for $u \in X$.
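
The Cauchy property used in the proof of Theorem 62.F can be spelled out as follows. This is a sketch under two assumptions: the coerciveness estimate of Lemma 62.20 is applied to the difference $u_n - u_m$, which is legitimate because $X$ is a subspace of $W_2^1(G, V_3)$ with the same norm, and the elementary inequality $[\alpha - \beta, \alpha - \beta] \le 2[\alpha, \alpha] + 2[\beta, \beta]$ is used together with the linearity of $\gamma$.

```latex
\begin{aligned}
c\,\|u_n - u_m\|_X^2
  &\le \int_G [\gamma(u_n - u_m), \gamma(u_n - u_m)]\,dx
     + \int_G |u_n - u_m|^2\,dx \\
  &\le 2\int_G [\gamma(u_n), \gamma(u_n)]\,dx
     + 2\int_G [\gamma(u_m), \gamma(u_m)]\,dx
     + \|u_n - u_m\|_{L_2}^2 \\
  &< \frac{2}{n} + \frac{2}{m} + \|u_n - u_m\|_{L_2}^2
  \;\longrightarrow\; 0 \quad\text{as } n, m \to \infty .
\end{aligned}
```

The last term tends to zero because the subsequence converges in $L_2(G, V_3)$; passing to the limit in (64) along (65) also gives $\gamma(u) = 0$.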

62.16. Legendre Transformation and the Strategy of the General Friedrichs Duality in the Calculus of Variations

In 1929, Friedrichs discovered a general method for obtaining dual problems in the calculus of variations. Special cases of this so-called Friedrichs duality are:
(i) the Trefftz duality (1927) for the Dirichlet problem (Section 62.17); and
(ii) the duality in elasticity theory between displacement and stress (Section 62.18).
In order to explain the basic idea of this method, we consider a simple, but typical example. In the following all functions are assumed to be sufficiently smooth.

62.16a. The Original Variational Problem

We consider the variational problem

$$\text{(P)}\qquad F(u) = \int_G M(Du) - Ku\,dx - \int_{\partial_2 G} Tu\,dO = \text{stationary!},$$
$$u = u_0 \quad\text{on } \partial_1 G.$$
Here, $G$ is a bounded region in $\mathbb{R}^N$ with $\partial G \in C^{0,1}$ and the boundary decomposition

where $\partial_1 G$ and $\partial_2 G$ are open subsets of $\partial G$. We are looking for a solution
$$u: G \to \mathbb{R}$$
of (P). We set
$$Du = (D_1 u, \ldots, D_N u).$$
Moreover, for brevity, we write
$$an = \sum_{i=1}^{N} a_i n_i, \qquad \operatorname{div} a = \sum_{i=1}^{N} D_i a_i, \qquad DM' = \sum_{i=1}^{N} D_i M_{D_i u}.$$

Lemma 62.21. If $u$ is a solution of (P), then $u$ satisfies the Euler equation
$$DM'(Du) + K = 0 \quad\text{on } G,$$
$$u = u_0 \quad\text{on } \partial_1 G, \qquad (66)$$
$$nM'(Du) = T \quad\text{on } \partial_2 G,$$
where $n$ is the outer unit normal vector at boundary points of $G$.

PROOF. We use the standard technique of the calculus of variations described in Section 18.3. To this end we set
$$\varphi(t) = F(u + t\,\delta u), \qquad t \in \mathbb{R},$$
where the function $\delta u$ satisfies the boundary condition $\delta u = 0$ on $\partial_1 G$. Letting $\delta F = \varphi'(0)$, we obtain

$$\delta F = \int_G M' D\delta u - K\,\delta u\,dx - \int_{\partial_2 G} T\,\delta u\,dO.$$

If $u$ is a solution of (P), then $\delta F = 0$. Integration by parts yields

$$\delta F = -\int_G (DM' + K)\,\delta u\,dx + \int_{\partial_2 G} (nM' - T)\,\delta u\,dO = 0$$
for all $\delta u$, and hence we obtain (66). □

62.16b. Additional Side Condition

The first idea of Friedrichs was to replace (P) with the equivalent problem

$$\int_G M(\gamma) - Ku\,dx - \int_{\partial_2 G} Tu\,dO = \text{stationary!},$$
$$u = u_0 \quad\text{on } \partial_1 G, \qquad (67)$$
$$\gamma = Du \quad\text{on } G.$$
Here, we are looking for $u$ and $\gamma$.

62.16c. The Free Problem

The second idea of Friedrichs was to set

$$F(u, \gamma, \sigma, q) = \int_G M(\gamma) - Ku - \sigma(\gamma - Du)\,dx - \int_{\partial_1 G} q(u - u_0)\,dO - \int_{\partial_2 G} Tu\,dO$$

and to consider the new problem
$$F(u, \gamma, \sigma, q) = \text{stationary!}. \qquad (68)$$

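Here, the variation is taken over all $u, \gamma, \sigma, q$. The important point is that this is a free problem, i.e., no side conditions appear. We obtain it from the problem (67) with side conditions by using our general strategy from Part III: Reduce problems with side conditions to free problems by adding terms which contain the side conditions and Lagrange multipliers. In (68), the role of the Lagrange multipliers is played by the functions $\sigma$ and $q$.

Lemma 62.22. If $(u, \gamma, \sigma, q)$ is a solution of (68), then it satisfies the Euler equation
$$\sigma = M'(\gamma) \quad\text{(Legendre transformation)},$$
$$\operatorname{div} \sigma + K = 0 \quad\text{on } G \quad\text{(equilibrium condition)},$$
$$\gamma = Du \quad\text{on } G, \qquad (69)$$
$$\sigma n = q \quad\text{and}\quad u = u_0 \quad\text{on } \partial_1 G,$$
$$\sigma n = T \quad\text{on } \partial_2 G \quad\text{(equilibrium condition)}.$$

In terms of the components, the Legendre transformation becomes

$$\sigma_i = \frac{\partial M(\gamma)}{\partial \gamma_i}, \qquad i = 1, \ldots, N.$$

PROOF. Similarly, as in the proof of Lemma 62.21, we obtain

$$\delta F = \int_G M'\,\delta\gamma - K\,\delta u - \delta\sigma(\gamma - Du) - \sigma(\delta\gamma - D\delta u)\,dx - \int_{\partial_1 G} \delta q\,(u - u_0) + q\,\delta u\,dO - \int_{\partial_2 G} T\,\delta u\,dO,$$
where $\delta u = 0$ on $\partial_1 G$. Integration by parts yields the key relation

$$\delta F = \int_G (M' - \sigma)\,\delta\gamma - (K + \operatorname{div}\sigma)\,\delta u + (Du - \gamma)\,\delta\sigma\,dx + \int_{\partial_1 G} (\sigma n - q)\,\delta u - (u - u_0)\,\delta q\,dO + \int_{\partial_2 G} (\sigma n - T)\,\delta u\,dO. \qquad (70)$$

From $\delta F = 0$ we obtain (69). □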

62.16d. The Dual Problem

We make the following crucial assumption:
(H) The Legendre transformation
$$\sigma = M'(\gamma)$$
can be solved for $\gamma$, i.e., the inverse transformation
$$\gamma = M'^{-1}(\sigma)$$
exists.
In order to motivate the following definition of the function $H$, we consider the original problem for the special case $N = 1$. This problem corresponds to a problem in point mechanics. There, we have
$M$: Lagrangian function;$^1$
$u$: position;
$\gamma$: velocity ($\gamma = Du$);
$\sigma$: momentum.

Similarly, as in Section 58.20, we define the Hamiltonian function

$$H(\sigma) = \sigma\gamma - M(\gamma),$$

where $\sigma\gamma = \sum_{i=1}^{N} \sigma_i \gamma_i$. Moreover, we set

$$\Phi(\sigma) = -\int_G H(\sigma)\,dx + \int_{\partial_1 G} (\sigma n)\,u_0\,dO$$

and consider the dual problem
$$\text{(P*)}\qquad \Phi(\sigma) = \text{stationary!},$$
$$\operatorname{div}\sigma + K = 0 \quad\text{on } G,$$
$$\sigma n = T \quad\text{on } \partial_2 G.$$
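
A worked instance of the Legendre transformation may help to fix ideas. It is only an illustration under stated assumptions: $N = 1$ and a power-law function $M(\gamma) = \tfrac1p|\gamma|^p$ with $p > 1$, neither of which is singled out in the text; $q$ denotes the conjugate exponent with $\tfrac1p + \tfrac1q = 1$ and is unrelated to the multiplier $q$ above.

```latex
\begin{aligned}
M(\gamma) &= \tfrac1p |\gamma|^p, \qquad
\sigma = M'(\gamma) = |\gamma|^{p-2}\gamma, \qquad
\gamma = M'^{-1}(\sigma) = |\sigma|^{q-2}\sigma, \\
H(\sigma) &= \sigma\gamma - M(\gamma)
  = |\sigma|^{q} - \tfrac1p |\sigma|^{(q-1)p}
  = \Bigl(1 - \tfrac1p\Bigr)|\sigma|^{q}
  = \tfrac1q |\sigma|^{q}.
\end{aligned}
```

Here $(q-1)p = q$ was used. Assumption (H) holds because $M'$ is strictly increasing and onto, and for $p = 2$ one recovers $H(\sigma) = \tfrac12\sigma^2$, the case of the Dirichlet problem.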

Theorem 62.G (The Friedrichs Duality (1929)). Suppose that we have a sufficiently smooth situation and that (H) holds true.
If $u$ is a solution of the original problem (P), then $\sigma = M'(Du)$ is a solution of the dual problem (P*), and the corresponding extremal values are identical, i.e.,
$$F(u) = \Phi(\sigma).$$

PROOF. The trick is to compute $F(u, \gamma, \sigma, q)$ for the special choice

$$q = \sigma n, \qquad \gamma = M'^{-1}(\sigma)$$

and

$$\operatorname{div}\sigma + K = 0 \quad\text{on } G, \qquad \sigma n = T \quad\text{on } \partial_2 G.$$

This is motivated by (69). Integration by parts yields the key relation

$$F = \int_G M(\gamma) - (K + \operatorname{div}\sigma)\,u - \sigma\gamma\,dx + \int_{\partial G} (\sigma n)\,u\,dO - \int_{\partial_1 G} \sigma n\,(u - u_0)\,dO - \int_{\partial_2 G} Tu\,dO$$
$$= \int_G M(\gamma) - \sigma\gamma\,dx + \int_{\partial_1 G} (\sigma n)\,u_0\,dO = \Phi(\sigma).$$
Now let $\gamma = Du$. According to (70), we then obtain
$$\delta\Phi = \delta F = 0,$$
i.e., $\sigma$ is a solution of (P*). Moreover, we have
$$F(u) = F(u, Du, \sigma, \sigma n) = \Phi(\sigma). \qquad □$$

$^1$ More precisely, the Lagrangian function is equal to $L = M(\gamma) - Ku$, but this is not important for the following.

62.17. Application to the Dirichlet Problem (Trefftz Duality)

We consider the Dirichlet problem

$$\text{(P)}\qquad \int_G \tfrac12|Du|^2 - Ku\,dx - \int_{\partial_2 G} Tu\,dO = \text{stationary!},$$
$$u = u_0 \quad\text{on } \partial_1 G.$$
This corresponds to the original problem in Section 62.16 with
$$M(Du) = \tfrac12|Du|^2.$$
The Euler equation to (P) becomes
$$\Delta u + K = 0 \quad\text{on } G,$$
$$u = u_0 \quad\text{on } \partial_1 G, \qquad \frac{\partial u}{\partial n} = T \quad\text{on } \partial_2 G.$$
The Legendre transformation has the simple form
$$\sigma = M'(\gamma) = \gamma$$
with $\sigma = (\sigma_1, \ldots, \sigma_N)$. The Hamiltonian function is
$$H(\sigma) = |\sigma|^2 - \tfrac12|\sigma|^2 = \tfrac12|\sigma|^2,$$
i.e., $H(\sigma) = M(\sigma)$. Thus, the dual problem in Section 62.16 becomes

$$\text{(P*)}\qquad -\tfrac12\int_G |\sigma|^2\,dx + \int_{\partial_1 G} (\sigma n)\,u_0\,dO = \text{stationary!},$$
$$\operatorname{div}\sigma + K = 0 \quad\text{on } G, \qquad \sigma n = T \quad\text{on } \partial_2 G.$$

Theorem 62.G tells us that if $u$ is a solution of the Dirichlet problem (P), then
$$\sigma = Du$$
is a solution of the dual problem (P*). In this case,
$$\operatorname{div}\sigma + K = 0 \quad\text{means}\quad \Delta u + K = 0.$$
This is the duality of Trefftz (1927), which came up in Section 51.6 in a more general functional-analytic context.
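
The equality of the extremal values can be checked by hand in one dimension. The sketch below is a toy instance under stated assumptions, not an example from the text: $G = (0, 1)$, $\partial_1 G = \{0, 1\}$ (so $\partial_2 G$ is empty and $T$ does not occur), and $K$ is a constant `k`. Then $u'' + k = 0$ with $u(0) = a$, $u(1) = b$ has a quadratic solution, and the boundary integral $\int_{\partial_1 G} (\sigma n) u_0\,dO$ becomes $\sigma(1) b - \sigma(0) a$. All function names are illustrative; polynomials are stored as coefficient lists and integrated exactly.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_diff(p):
    """Derivative of a polynomial."""
    return [k * c for k, c in enumerate(p)][1:]


def poly_integral_01(p):
    """Exact integral of a polynomial over (0, 1)."""
    return sum(c / (k + 1) for k, c in enumerate(p))


def poly_eval(p, x):
    """Value of a polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(p))


def dirichlet_solution(a, b, k):
    """Coefficients of u with u'' + k = 0 on (0, 1), u(0) = a, u(1) = b."""
    a, b, k = Fraction(a), Fraction(b), Fraction(k)
    return [a, b - a + k / 2, -k / 2]


def primal_value(a, b, k):
    """F(u) = integral of (u')^2 / 2 - k * u over (0, 1)."""
    u = dirichlet_solution(a, b, k)
    du = poly_diff(u)
    return poly_integral_01(poly_mul(du, du)) / 2 - Fraction(k) * poly_integral_01(u)


def dual_value(a, b, k):
    """Phi(sigma) = -(1/2) * integral of sigma^2 + sigma(1) * b - sigma(0) * a, sigma = u'."""
    sigma = poly_diff(dirichlet_solution(a, b, k))
    boundary = poly_eval(sigma, 1) * Fraction(b) - poly_eval(sigma, 0) * Fraction(a)
    return -poly_integral_01(poly_mul(sigma, sigma)) / 2 + boundary


for data in [(0, 0, 1), (1, 3, 2), (-2, 5, 7)]:
    print(data, primal_value(*data), dual_value(*data))
```

The agreement is an instance of $\int \sigma^2\,dx = [\sigma u]_0^1 + \int K u\,dx$, which follows from one integration by parts and $u'' = -K$.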

62.18. Application to Elasticity

If we apply the general strategy of Section 62.16 to the principle of stationary potential energy

$$\text{(P)}\qquad \int_G M(Du) - Ku\,dx - \int_{\partial_2 G} Tu\,dO = \text{stationary!},$$
$$u = u_0 \quad\text{on } \partial_1 G,$$
then we obtain precisely the duality between the displacement $u$ and the stress tensor $\sigma$ of Section 62.4.
In this case, $\gamma = Du$ is the linearized strain tensor, i.e.,
$$Du = \tfrac12\bigl(u'(x) + u'(x)^*\bigr).$$
As in Section 62.16, we obtain the Legendre transformation
$$\sigma = M'(\gamma),$$
which is nothing other than the constitutive law. The Hamiltonian function becomes
$$H(\sigma) = [\sigma, \gamma] - M(\gamma),$$
and the dual problem becomes

$$\text{(P*)}\qquad -\int_G H(\sigma)\,dx + \int_{\partial_1 G} (\sigma n)\,u_0\,dO = \text{stationary!},$$
$$\operatorname{div}\sigma + K = 0 \quad\text{on } G, \qquad \sigma n = T \quad\text{on } \partial_2 G.$$
If $M$ satisfies the assumptions made in Section 62.3, then
$$H(\sigma) = M^*(\sigma),$$
according to Problem 62.9, i.e., the Hamiltonian function is the conjugate stored energy function.

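PROBLEMS

62.1. An inequality. Set $Z = W_2^1(G, V_3)$. Show that

$$\int_G [\gamma(u), \gamma(u)]\,dx \le \text{const}\,\|u\|_Z^2 \quad\text{for all } u \in Z.$$

Solution: In Cartesian coordinates we have

$$[\gamma(u), \gamma(u)] = \gamma^i_j \gamma^j_i, \qquad \gamma^i_j = \tfrac12(D_i u_j + D_j u_i).$$

Hölder's inequality yields

$$\int_G \gamma^i_j \gamma^j_i\,dx \le \text{const} \sum_{i,j,k,m} \int_G |D_i u_j|\,|D_k u_m|\,dx$$

$$\le \text{const} \sum_{i,j,k,m} \left(\int_G |D_i u_j|^2\,dx\right)^{1/2} \left(\int_G |D_k u_m|^2\,dx\right)^{1/2}$$

$$\le \text{const} \int_G \sum_{i,j} |D_i u_j|^2 + \sum_i |u_i|^2\,dx = \text{const}\,\|u\|_Z^2.$$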

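62.2. Proof of Lemma 62.3.

Solution: By Hölder's inequality

$$\left(\int_G Ku\,dx\right)^2 \le \left(\int_G K^2\,dx\right) \|u\|_X^2.$$

Moreover, Hölder's inequality and the continuous embedding $W_2^1(G) \subseteq L_2(\partial G)$ imply

$$\left(\int_{\partial_2 G} Tu\,dO\right)^2 \le \int_{\partial_2 G} T^2\,dO \int_{\partial_2 G} u^2\,dO \le c\left(\int_{\partial_2 G} T^2\,dO\right) \|u\|_X^2.$$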

62.3. Proof of Lemma 62.15.

Solution: Let $M(G)$ be the set of all $u \in W_2^1(G, V_3)$ which have the representation
$$u(x) = a + b \times x \quad\text{on } G$$
with $a, b \in V_3$, i.e., $M(G)$ is the set of all infinitesimal rigid motions on $G$.
(I) Smooth case. Let $u \in C^\infty(G, V_3)$ with
$$\gamma(u) = 0 \quad\text{on } G.$$
According to Lemma 62.19, the second-order partial derivatives of all the components $u_i$ vanish on $G$. Hence the $u_i$'s are polynomials of first degree, i.e.,
$$u_i = a_i + \sum_{j=1}^{3} b_{ij}\xi_j, \qquad i = 1, 2, 3.$$

Moreover, $\gamma^i_j = \tfrac12(D_i u_j + D_j u_i) = 0$ implies that
$$b_{ij} = -b_{ji} \quad\text{for all } i, j.$$
Hence $u \in M(G)$.
(II) General case. Let $u \in W_2^1(G, V_3)$ with $\gamma(u) = 0$ on $G$. Choose a proper subregion $H \subset\subset G$. Let $S_h$ be the smoothing operator from Section 18.14, and set
$$u_h = S_h u \quad\text{for all small } h > 0.$$
Then $u_h \in C^\infty(\bar H, V_3)$, and $\gamma(u) = 0$ on $G$ implies
$$\gamma(u_h) = S_h\gamma(u) = 0 \quad\text{on } H \text{ for all small } h > 0.$$
From (I) follows that $u_h \in M(H)$ for all small $h > 0$. Moreover, the properties of $S_h$ yield
$$u_h \to u \quad\text{in } W_2^1(H, V_3) \text{ as } h \to 0.$$
Since $M(H)$ is a finite-dimensional subspace of $W_2^1(H, V_3)$, it follows that $M(H)$ is closed and hence that $u \in M(H)$.
Because of the arbitrariness of $H$, we find that $u \in M(G)$.
62.4. Proof of Lemma 62.16.
Solution: Suppose that
$$a + b \times x = 0 \quad\text{on } \partial_1 G$$
with $\partial_1 G \ne \emptyset$. We have to show that $a = b = 0$.
If $b = 0$, then $a = 0$. So let $b \ne 0$. The coefficient matrix of the linear system
$$b \times x = -a$$
has rank 2. Hence $\partial_1 G$ is contained in a line. But this contradicts the fact that $\partial_1 G$ is open in $\partial G$.
62.5. Gradient method. Consider the original variational problem, which corresponds to the principle of minimal potential energy, and prove the convergence of the gradient method for this problem.
Hint: Use Section 62.6d.
62.6. Existence proof for the second boundary-value problem and the equilibrium condition. Consider the original variational problem (5), namely,

$$\text{(P)}\qquad \int_G M(\gamma(u))\,dx - \int_G Ku\,dx - \int_{\partial G} Tu\,dO = \min!,$$

with assumptions (H1) to (H6), as formulated in Section 62.3. Let $\partial_2 G = \partial G$, i.e., $\partial_1 G = \emptyset$. Then the boundary condition is
$$\sigma n = T \quad\text{on } \partial G.$$
In contrast to the first and mixed boundary-value problems, the boundary displacements are here completely unknown. We only know the boundary forces. We are looking for the displacements $u$ of the elastic body.
Show: There exists a solution $u$ if and only if the following two equilibrium conditions are satisfied:

$$\int_G K\,dx + \int_{\partial G} T\,dO = 0,$$
$$\int_G x \times K\,dx + \int_{\partial G} x \times T\,dO = 0, \qquad (71)$$
i.e., the total outer force and the total outer torque vanish (in the sense of $(T_{\text{approx}})$ of Section 61.6c). Moreover, the solution $u$ is uniquely determined up to infinitesimal rigid motions.
Solution: As in Section 62.5 we write the variational problem (P) in the form
$$F(u) - b(u) = \min!, \qquad u \in X \qquad (72)$$
with the Euler equation
$$\langle F'(u), v\rangle = b(v) \quad\text{for all } v \in X. \qquad (73)$$
The trick of the proof is to solve first the modified problem
$$F(u) - b(u) = \min!, \qquad u \in X_0^\perp. \qquad (74)$$
The space $X_0$ has been introduced in Section 62.15. According to Theorem 62.F (Korn's inequality), the energetic norm

$$\|u\|_E = \left(\int_G [D(u), D(u)]\,dx\right)^{1/2}, \qquad Du = \gamma(u)$$

is an equivalent norm on the closed subspace $X_0^\perp$ of $X = W_2^1(G, V_3)$. Analogously to Section 62.5a, we thus obtain that (74) has a unique solution $u$ and
$$\langle F'(u), v\rangle = b(v) \quad\text{for all } v \in X_0^\perp. \qquad (75)$$
We want to show that $u$ is also a solution of the original problem (72). Because of the strong stability condition (6) we obtain

$$\delta^2 F(u; v) = \int_G M''(Du)(Dv)^2\,dx \ge 0 \quad\text{for all } v \in X.$$

Hence $F: X \to \mathbb{R}$ is convex, according to Section 42.3, and problems (72) and (73), as well as problems (74) and (75), are equivalent. Important is then the fact that

$$F(u + w) = \int_G M(Du + Dw)\,dx = F(u) \quad\text{for all } w \in X_0,$$

since $Dw = 0$ for $w \in X_0$. This implies
$$\langle F'(u), w\rangle = 0 \quad\text{for all } w \in X_0.$$
Since $X = X_0 \oplus X_0^\perp$, it follows that $u$ is a solution of (73), and hence of (72), if and only if
$$b(w) = 0 \quad\text{for all } w \in X_0. \qquad (76)$$
Lemma 62.15 shows that $X_0$ consists precisely of all infinitesimal rigid motions
$$w(x) = a + b \times x \quad\text{on } G$$
with arbitrary vectors $a, b \in V_3$. Therefore, noting the definition of $b$ in (10), equation (76) is equivalent to

$$a\left(\int_G K\,dx + \int_{\partial G} T\,dO\right) + b\left(\int_G x \times K\,dx + \int_{\partial G} x \times T\,dO\right) = 0$$
for all $a, b \in V_3$. This is the equilibrium condition (71).
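
The step from (76) to (71) uses the identity $K \cdot (a + b \times x) = a \cdot K + b \cdot (x \times K)$. The sketch below checks it on a discrete stand-in: a finite list of point loads `(x, K)` replaces the volume and surface integrals, which is an assumption made purely for illustration, as are all function names.

```python
def dot(p, q):
    """Scalar product in R^3."""
    return sum(pi * qi for pi, qi in zip(p, q))


def cross(p, q):
    """Vector product p x q in R^3."""
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def add(p, q):
    return tuple(pi + qi for pi, qi in zip(p, q))


def work_of_rigid_motion(a, b, loads):
    """Discrete analogue of b(w): sum of K . (a + b x x) over point loads (x, K)."""
    return sum(dot(K, add(a, cross(b, x))) for x, K in loads)


def total_force(loads):
    result = (0, 0, 0)
    for _, K in loads:
        result = add(result, K)
    return result


def total_torque(loads):
    result = (0, 0, 0)
    for x, K in loads:
        result = add(result, cross(x, K))
    return result


# Two opposite forces acting along the line joining their points: balanced.
balanced = [((0, 0, 0), (1, 0, 0)), ((2, 0, 0), (-1, 0, 0))]
# A force couple: zero total force, but nonzero torque.
couple = [((0, 0, 0), (0, 1, 0)), ((1, 0, 0), (0, -1, 0))]

a, b = (3, -1, 2), (0, 0, 1)
for loads in (balanced, couple):
    lhs = work_of_rigid_motion(a, b, loads)
    rhs = dot(a, total_force(loads)) + dot(b, total_torque(loads))
    print(lhs, rhs, total_force(loads), total_torque(loads))
```

For the balanced system the work vanishes for every rigid motion, whereas the couple does nonzero work on a rotation about the third axis, so the analogue of (76) fails for it.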

62.7. Existence and uniqueness of solutions for the mixed boundary-value problem in linear elasticity theory (main theorem of linear elastostatics). Give a direct proof of Theorem 62.A in the special case of linear elasticity theory by using the main theorem about quadratic variational problems (Theorem 22.A). In this way we obtain an immediate generalization of Theorem 61.D of Section 61.9c (first boundary-value problem).
Solution: Set
$$\mathcal{S}(\gamma, \mu) = \lambda \operatorname{tr}\gamma \operatorname{tr}\mu + 2\kappa[\gamma, \mu]$$
for all $\gamma, \mu \in L_{\text{sym}}(V_3)$. The principle of minimal potential energy (5) then becomes the quadratic variational problem
$$\tfrac12 a(u, u) - b(u) = \min!, \qquad u - u_0 \in X, \qquad (77)$$
where

$$a(u, v) = \int_G \mathcal{S}(\gamma(u), \gamma(v))\,dx,$$

$$b(u) = \int_G Ku\,dx + \int_{\partial_2 G} Tu\,dO.$$

If we set $v = u - u_0$, then we obtain
$$\tfrac12 a(v, v) - f(v) = \min!, \qquad v \in X \qquad (78)$$
with $f(v) = b(v) - a(v, u_0)$. Obviously, we have
$$\mathcal{S}(\gamma, \gamma) \ge 2\kappa[\gamma, \gamma].$$
The key to the proof is Korn's inequality (57), i.e.,

$$a(v, v) = \int_G \mathcal{S}(\gamma(v), \gamma(v))\,dx \ge 2\kappa \int_G [\gamma(v), \gamma(v)]\,dx \ge 2\kappa c_1 \|v\|_X^2 \quad\text{for all } v \in X.$$

Hence the bilinear, symmetric functional $a: X \times X \to \mathbb{R}$ is strongly positive. According to Theorem 22.A, equation (78) has a unique solution.
62.8. Special closed sets in Sobolev spaces. The following two simple results will be used frequently. Let $G$ be a bounded region in $\mathbb{R}^N$ with $N \ge 1$ and choose a fixed number $c > 0$.
62.8a. Show that the set
$$S = \{u \in L_2(G) : |u| \le c \text{ a.e. on } G\}$$
is closed, bounded, and convex in the Lebesgue space $L_2(G)$.
As usual, "a.e." stands for "almost everywhere". Each function in $L_2(G)$ can be changed on a set of measure zero. In this sense we can write
$$S = \{u \in L_2(G) : |u| \le c \text{ on } G\}.$$
For the sake of brevity, we will use this notation in the future.
Solution: Obviously, $S$ is convex and bounded.
We show that $S$ is closed. Let $(u_n)$ be a sequence in $S$ with $u_n \to u$ in $L_2(G)$ as $n \to \infty$. From $A_2$(36), there exists a subsequence such that $u_{n'}(x) \to u(x)$ a.e. on $G$ as $n' \to \infty$. Hence $u \in S$.

62.8b. Assume $\partial G \in C^{0,1}$ and suppose that there exists a decomposition of the boundary

where $\partial_1 G$ and $\partial_2 G$ are open subsets of $\partial G$, and $\partial_1 G$ is nonempty. Choose a fixed function $u_0 \in W_2^1(G)$ and let $T$ denote the set of all functions $u \in W_2^1(G)$ with
$$|Du| \le c \text{ on } G \quad\text{and}\quad u = u_0 \text{ on } \partial_1 G,$$
where
$$|Du| = \left(\sum_{i=1}^{N} |D_i u|^2\right)^{1/2}.$$

Show that $T$ is a closed, bounded, and convex set in the Sobolev space $W_2^1(G)$.
Solution: The set $T$ is closed according to Problem 62.8a. Obviously, $T$ is convex.
On the space $W_2^1(G)$, an equivalent norm is given by

$$\left(\int_G |Du|^2\,dx + \int_{\partial_1 G} |u|^2\,dO\right)^{1/2}.$$

Thus, $T$ is bounded. Note that $u = u_0$ on $\partial_1 G$ and $u_0 \in L_2(\partial G)$.


62.9. The dual stored energy function $M^*$. Set $Z = L_{\text{sym}}(V_3)$ and consider a stored energy function $M: Z \to \mathbb{R}$ as in Section 62.3. Show that
$$M^*(\sigma) = [\sigma, \gamma] - M(\gamma) \quad\text{for all } \sigma \in Z,$$
where $\gamma = M'^{-1}(\sigma)$.
Solution: The space $Z$ is an H-space with scalar product $[\sigma, \mu]$. By Definition 51.1,
$$M^*(\sigma) = \sup_{\gamma \in Z} F(\gamma),$$

where $F(\gamma) = [\sigma, \gamma] - M(\gamma)$ for all $\gamma \in Z$. The strong stability condition (6) implies $F(\gamma) \to -\infty$ as $|\gamma| \to \infty$. Hence the concave function $F$ has a maximal point. This implies
$$M^*(\sigma) = F(\gamma), \qquad F'(\gamma) = 0.$$
The assertion now follows from

$$F'(\gamma)\mu = [\sigma, \mu] - M'(\gamma)\mu \quad\text{for all } \mu \in Z,$$

i.e., $F'(\gamma) = 0$ if and only if $\sigma = M'(\gamma)$.

References to the Literature

Classical works: Korn (1907) and Lichtenstein (1924) (existence proofs in linear elasticity theory), Ritz (1908) and Trefftz (1927) (approximation methods), Friedrichs (1929) (duality).
Dual variational principles in linear elasticity: Gurtin (1972, S), Washizu (1968, M).
Introduction to the existence theory in elasticity theory: Nečas and Hlaváček (1981, M), Ciarlet (1983, L), Marsden and Hughes (1983, M).
(A) Existence proofs in linear elasticity theory
Differential equations: Fichera (1972, S) (handbook article), Duvaut and Lions (1972, M), Nečas and Hlaváček (1981, M).
Semigroups and elastodynamics: Marsden and Hughes (1983, M).
Potential theory and singular integral equations: Kupradze (1976, M).
Plane elasticity, complex function theory, and singular integral equations: Mushelišvili (1954, M) (classical monograph), Babuška (1960, M).
Piecewise homogeneous bodies and singular integral equations: Jentsch (1977, S) (three-dimensional problems), Maul (1976, S) (two-dimensional problems).
Korn's inequality: Duvaut and Lions (1972, M), Nečas and Hlaváček (1981, M).
Linear strongly elliptic systems: Browder (1954), Nirenberg (1955), Agmon, Douglis, and Nirenberg (1959), Part II, Morrey (1966, M).
(B) Existence proofs in nonlinear elasticity theory
Implicit function theorem: Stoppelli (1954), Ciarlet and Rabier (1980, L), Ciarlet (1983, L), Marsden and Hughes (1983, M).
Continuation method: Beckert (1975), (1977), (1982), (1984, S), Beyer (1979), Benkert (1987) (structure of strongly stable solutions).
Polyconvex material and compensated compactness: Morrey (1952) (quasi-convexity and lower semicontinuity of multiple integrals), Ball (1977) (fundamental paper), Ciarlet (1983, L), Nečas (1983, L), Evans (1986) (partial regularity).
Method of compensated compactness: Tartar (1979), (1983), Dacorogna (1982, L) (recommended as an introduction), Murat (1981) and Feistauer, Mandel, and Nečas (1984) (embedding theorem of Murat), Murat (1987, S).
Semigroups and elastodynamics: Hughes, Kato, and Marsden (1977), Marsden and Hughes (1983, M), Ebin (1986).
Variational methods for approximation models: Beju (1972), Oden and Reddy (1976, M), Langenbach (1976, M) (monotone potential operators).
Global analysis and nonlinear elasticity: Marsden and Hughes (1983, M).
Symmetry and bifurcation: Chillingworth, Marsden, and Wan (1983).
Uniqueness in nonlinear elasticity and approximation methods: John (1972), (1985) (collected works).
Propagation of waves and the life-span of solutions of the instationary equations of nonlinear elasticity: John (1971), (1983).
(C) Approximation methods
Ciarlet (1971, M) (finite elements), Oden and Reddy (1976, M), Oden (1980, M), Nečas and Hlaváček (1981, M).
The boundary integral method: Cf. the References to the Literature for Chapter 22.
(See also the References to the Literature for Chapters 63-66.)
CHAPTER 63

Variational Inequalities and the Signorini Problem for Nonlinear Material

Signorini (1959) posed the problem, now bearing his name, that in simplest terms is to determine the displacements in a heavy, linearly elastic body resting on a rigid, frictionless horizontal plane. The essential difficulty of this problem is that the region of contact between the body and the plane is not known a priori. It is conceivable that the contact set could be especially complicated.
Fichera (1964) was the first to study the existence and uniqueness of this problem which is nonlinear because position fields, satisfying the governing equations, are subjected to a unilateral constraint that restricts their values lying in a half space.
Stuart Antman (1983)

With regard to the general nonlinear model of Section 62.3 we now consider boundary-value problems, for which the elastic body is supported on parts of its boundary. The boundary conditions thereby have the form of inequalities. In functional-analytic terms, this leads to convex variational problems on convex sets. The corresponding Euler equations are variational inequalities. We shall use Theorem 46.A of Part III in order to obtain a general existence and uniqueness theorem. Throughout, the same notation as in the previous chapter will be employed.

63.1. Existence and Uniqueness Theorem

Similarly, as in Section 62.3, we consider the variational problem

$$\int_G M(\gamma(u))\,dx - \int_G Ku\,dx - \int_{\partial_2 G} Tu\,dO = \min!,$$
$$u = 0 \quad\text{on } \partial_1 G, \qquad (1)$$
$$un \le 0 \quad\text{on } \partial_3 G,$$

where $n$ denotes the outer unit normal vector at the points of the boundary part $\partial_3 G$.

Figure 63.1. (a) undeformed state; (b) deformed state.

The boundary inequality "$un \le 0$ on $\partial_3 G$" means that we are in the situation of Figure 63.1. The undeformed elastic body rests on a rigid support which comes from the boundary part $\partial_3 G$. The possible deformations of the body are restricted by this support, i.e., for points $x$ of $\partial_3 G$ only those displacements $u$ are possible which have an obtuse angle with $n$.
More precisely, we can write (1) in the form

$$F(u) - b(u) = \min!, \qquad u \in C \qquad (2)$$

with
$$C = \{u \in X : un \le 0 \text{ on } \partial_3 G\}.$$

The H-space $X = W_2^1(G, \partial_1 G; V_3)$ has been defined in Section 62.2. Analogously to Section 62.3, our assumptions are the following:
(H1) $G$ is a bounded region in $\mathbb{R}^3$ with sufficiently smooth boundary, i.e., $\partial G \in C^{0,1}$.
(H2) The boundary of $G$ admits the decomposition

$$\partial G = \partial_1 G \cup \partial_2 G \cup \partial_3 G$$

with pairwise disjoint, open subsets $\partial_j G$ of $\partial G$, $j = 1, 2, 3$. Moreover, suppose that $\partial_1 G$ and $\partial_3 G$ are nonempty, while $\partial_2 G$ may be empty.
(H3) The stored energy function $M$ satisfies assumption (H3) of Section 62.3.
(H4) On $G$ the density of the outer forces $K \in L_2(G, V_3)$ is prescribed, and on $\partial_2 G$ the density of the outer boundary stress forces $T \in L_2(\partial_2 G, V_3)$ is given.
In connection with our problem (2) we consider the variational inequality
$$\langle F'(u) - b, v - u\rangle \ge 0 \quad\text{for all } v \in C. \qquad (3)$$
We are looking for a solution $u \in C$. From Section 62.5 it follows that (3) has the explicit form

$$\int_G [\sigma, \gamma(v) - \gamma(u)]\,dx - \int_G K(v - u)\,dx - \int_{\partial_2 G} T(v - u)\,dO \ge 0, \qquad (4)$$

where $\gamma(u)$ is the linearized strain tensor which corresponds to the displacement $u$ (see (62.5)), and
$$\sigma = M'(\gamma(u))$$
is the stress tensor (first Piola-Kirchhoff tensor).

Theorem 63.A. The variational problem (2) has a unique solution, which coincides with the unique solution of the variational inequality (3).

PROOF. We want to apply Theorem 46.A.
The set $C$ is closed and convex in $X$. This is a consequence of Problem 62.8b.
From Section 62.5a it follows that $b$ is a linear, continuous functional on $X$, and the operator $F': X \to X^*$ is strongly monotone and Lipschitz continuous. Consequently, the functional $F: X \to \mathbb{R}$ is strictly convex, continuous, and coercive, according to Section 42.3.
Theorem 46.A then yields the assertion. □

63.2. Physical Motivation

Let $u$ be a sufficiently smooth solution of problem (3). After integration by parts, the variational inequality (4) becomes

$$-\int_G (\operatorname{div}\sigma + K)(v - u)\,dx + \int_{\partial_2 G} (\sigma n - T)(v - u)\,dO + \int_{\partial_3 G} (\sigma n)(v - u)\,dO \ge 0 \qquad (5)$$
for all $v \in C$.
We specialize $v$. First we choose $v \in C_0^\infty(G, V_3)$, i.e., all components of $v$, in a fixed Cartesian coordinate system, belong to $C_0^\infty(G)$. Then we have $v \in C$ and hence (5) holds for all $v \in C_0^\infty(G, V_3)$. This yields the equilibrium condition
$$\operatorname{div}\sigma + K = 0 \quad\text{on } G. \qquad (6)$$
Next, we choose $v \in C^\infty(\bar G, V_3)$ with $v = 0$ on $\partial_1 G$, $v$ = arbitrary on $\partial_2 G$, and $v = u$ on $\partial_3 G$. Thus we have again that $v \in C$ and (5) implies
$$\sigma n = T \quad\text{on } \partial_2 G. \qquad (7)$$
Therefore (5) is reduced to

$$\int_{\partial_3 G} (\sigma n)(v - u)\,dO \ge 0 \quad\text{for all } v \in C. \qquad (8)$$

Let us now analyze this relation. We set $T = \sigma n$ on $\partial_3 G$. Then $T$ is the density of the outer boundary forces on $\partial_3 G$. Let $t$ be a tangent vector at a fixed point $x$ of $\partial_3 G$ and let $n$ denote the outer unit normal vector at $x$. We decompose $v$ at $x$ as
$$v = v_t t + v_n n.$$
If we choose $v$ with $v_n = u_n$ and $v_t$ = arbitrary, then it follows from (8) that
$$Tt = 0 \quad\text{on } \partial_3 G, \qquad (9)$$
i.e., the tangent component of the outer boundary forces vanishes on $\partial_3 G$. Therefore no friction occurs on $\partial_3 G$. If we choose $v$ with $v_t = u_t$ and $v_n \le 0$, then (8) implies the following relation at the points of $\partial_3 G$:
$$Tn = 0 \quad\text{if } un < 0,$$
$$Tn \le 0 \quad\text{if } un = 0, \qquad (10)$$
which means:
(i) If $un < 0$ at the point $x$ of $\partial_3 G$, i.e., in case there exists no contact between the elastic body and the rigid support at $x$, then no boundary force occurs at $x$ in the direction of $n$ (Fig. 63.1(b)).
(ii) If $un = 0$ at $x$, i.e., in case there exists a contact between the body and the support at $x$, then the boundary force at $x$ forms an obtuse angle with $n$.
These conditions are very natural. Finally, let us add the boundary condition
$$u = 0 \quad\text{on } \partial_1 G, \qquad (11)$$
which is an immediate consequence of the fact that $u \in C$ and of the definition of $C$ and $X$ above.
This shows that our original problem (2), which was solved in Theorem 63.A, can be regarded as a generalized problem to the classical problem (6), (7), (9)-(11). It is called the Signorini problem. The boundary conditions (7) and (10) do not explicitly appear in the original problem (2). They are natural boundary conditions.

PROBLEMS

63.1. Penalty technique for variational inequalities. Consider the variational inequality
$$\langle F'(u) - b, v - u\rangle \ge 0 \quad\text{for all } v \in C. \qquad (12)$$
We are looking for a solution $u \in C$, and analogously to Section 46.7, we also study the penalty equation
$$F'(u_n) + \varepsilon_n^{-1} P'(u_n) = b, \qquad u_n \in X. \qquad (13)$$
The variational inequality corresponds to the variational problem
$$F(u) - b(u) = \min!, \qquad u \in C, \qquad (12^*)$$
and the operator equation (13) corresponds to the variational problem
$$F(u_n) + \varepsilon_n^{-1} P(u_n) - b(u_n) = \min!, \qquad u_n \in X. \qquad (13^*)$$
We assume:
(i) $C$ is a nonempty, closed, convex set in the real H-space $X$.
(ii) The functionals $F, P: X \to \mathbb{R}$ are $C^1$. The operator $F': X \to X^*$ is strongly monotone and Lipschitz continuous.
(iii) The operator $P': X \to X^*$ is monotone and Lipschitz continuous. Moreover, there holds the key condition
$$P'(u) = 0 \iff u \in C.$$

(iv) We have that $b \in X^*$, and $(\varepsilon_n)$ is a sequence of positive numbers with $\varepsilon_n \to 0$ as $n \to \infty$.
Prove the following:
(a) Problems (12) and (12*) as well as (13) and (13*) are equivalent.
(b) For each $n \in \mathbb{N}$, there exists a unique solution $u_n$ of (13*), and the sequence $(u_n)$ converges to the unique solution of (12*) as $n \to \infty$.
This result has the advantage that the convex variational problem (12*) and the variational inequality (12) can be approximately solved by using the family of free variational problems (13*).
Hint: Use analogous arguments as in Section 46.7. See Nečas and Hlaváček (1981, M), p. 298.
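
The mechanism of the penalty technique is already visible in a scalar toy problem, given here as an illustration under stated assumptions rather than as part of the exercise: $X = \mathbb{R}$, $C = (-\infty, 0]$, $F(u) = \tfrac12 u^2$, and $P(u) = \tfrac12 [u]_+^2$ with $[u]_+ = \max(u, 0)$. Then $F'(u) = u$ is strongly monotone and Lipschitz continuous, $P'(u) = [u]_+$ is monotone and Lipschitz continuous, and $P'(u) = 0$ holds exactly for $u \in C$. The function names are hypothetical.

```python
def positive_part(t):
    """[t]_+ = max(t, 0)."""
    return max(t, 0.0)


def solve_penalty_equation(b, eps):
    """Solve the scalar penalty equation u + [u]_+ / eps = b in closed form."""
    if b <= 0.0:
        return b                      # the penalty term is inactive
    return b * eps / (1.0 + eps)      # u > 0, so u * (1 + 1 / eps) = b


def solve_variational_inequality(b):
    """Solution of (u - b) * (v - u) >= 0 for all v <= 0, i.e. the projection of b onto C."""
    return min(b, 0.0)


b = 2.0
u_exact = solve_variational_inequality(b)
for n in range(1, 7):
    eps = 10.0 ** (-n)
    u_eps = solve_penalty_equation(b, eps)
    residual = u_eps + positive_part(u_eps) / eps - b
    print(eps, u_eps, abs(u_eps - u_exact), residual)
```

For $b > 0$ the penalized solutions stay slightly outside $C$ and approach the boundary point $0$ at the rate $\varepsilon$, which mirrors the interpretation of the penalty term as an increasingly stiff elastic support.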
63.2. Penalty technique for the Signorini problem. Under the same assumptions as in
Section 63.1, consider the variational problem
$$\int_G M(\gamma(u))\,dx - \int_G Ku\,dx - \int_{\partial_2 G} Tu\,dO + \varepsilon_n^{-1} P(u) = \min!, \quad u \in X \tag{14}$$
with penalty functional
$$P(u) = \frac{1}{2} \int_{\partial_3 G} [un]_+^2\,dO.$$
Here, we set $[\varphi]_+ = \max(\varphi, 0)$. Since the assumptions of Problem 63.1 are satisfied,
one obtains a method for solving the Signorini problem.
Give a physical interpretation of (14).
Solution: As in Section 63.2, one shows that formula (14) corresponds to the
natural boundary condition
$$Tn = -\varepsilon_n^{-1} [un]_+ \quad \text{on } \partial_3 G.$$
This means that, in contrast to the original problem, the rigid support on $\partial_3 G$
is approximated by an elastic support. Because of
$$\varepsilon_n^{-1} \to \infty \quad \text{as } n \to \infty,$$
this elastic support becomes more and more rigid as $n \to \infty$.
63.3. General displacement on $\partial_1 G$. Prove a statement, analogous to Theorem 63.A, by
replacing the boundary condition in (1) with
$$u = u_0 \ \text{on } \partial_1 G, \qquad un \le 0 \ \text{on } \partial_3 G,$$
where $u_0 \in W_2^1(G, V_3)$ with $u_0 n \le 0$ on $\partial_3 G$ is given.
Solution: As in Section 62.5 set $u = u_0 + v$ with $v \in C$.

63.4. Second boundary-value problem. Consider problem (1) with $\partial_1 G = \emptyset$.
In this case, the body is supported on $\partial_3 G$, and the boundary stress forces are
prescribed on $\partial_2 G$. Moreover, we use the stored energy function of linear
elasticity, i.e.,

$$M(\gamma) = \tfrac{1}{2} \lambda (\operatorname{tr} \gamma)^2 + \kappa [\gamma, \gamma].$$


The original problem (1) then becomes

$$\int_G M(\gamma(u))\,dx - \int_G Ku\,dx - \int_{\partial_2 G} Tu\,dO = \min!,$$

with the corresponding functional-analytic formulation
$$F(u) - b(u) = \min!, \quad u \in C \tag{16}$$
and the corresponding variational inequality
$$(F'(u) - b, v - u) \ge 0 \quad \text{for all } v \in C. \tag{17}$$

Here we set $X = W_2^1(G, V_3)$ and
$$C = \{u \in X : un \le 0 \text{ on } \partial_3 G\}.$$
Let $R$ be the set of all infinitesimally small rigid motions, i.e., we have $u \in R$
if and only if
$$u(x) = a + b \times x \quad \text{on } G$$
with $a, b \in V_3$. Analogously to Problem 62.6, we consider the following
equilibrium condition
$$\int_G Ku\,dx + \int_{\partial_2 G} Tu\,dO \le 0 \quad \text{for all } u \in R \cap C \tag{18}$$
for the outer forces.


Prove: The variational problem (16) is equivalent to the corresponding
variational inequality (17).
Problem (16) has a solution if and only if the equilibrium condition (18) is
satisfied. Two solutions of (16) differ by an infinitesimal rigid motion.
Solution: Apply Proposition 54.3, and note that
$$F(u) = 0,\ u \in X \iff u \in R,$$
according to Lemma 62.15. The semicoerciveness follows from Korn's inequality
(Theorem 62.F).
See also Nečas and Hlaváček (1981, M), p. 301 for a penalty technique.

References to the Literature

Classical works: Signorini (1959), Fichera (1964).


Fichera (1972a, S, H), Kinderlehrer (1981), Nečas and Hlaváček (1981, M).

Signorini problem with friction: Haslinger (1983).
Signorini problem for polyconvex material: Ciarlet and Nečas (1984).
Variational inequalities in mechanics: Duvaut and Lions (1972, M), Kinderlehrer
and Stampacchia (1981, M), Friedman (1982, M), Nečas and Hlaváček (1986, M).
Approximation methods: Glowinski, Lions, and Trémolières (1976, M).
Strong regularity results for the Signorini problem via pseudodifferential operators:
Schumann (1987), (1988).
CHAPTER 64

Bifurcation for Variational Inequalities

On the occasion of the problem of the rope curve, we encountered another
equally outstanding problem. It concerns the bending of beams.
Jacob Bernoulli (1691)
The maximal load which a column is capable of bearing is indirectly proportional
to the square of the height.
Leonhard Euler (1744)
In 1691, Jacob Bernoulli proposed the problem of a bent beam, elastic bar,
or simply elastica. Bernoulli's problem was twofold: first derive the governing
equations, then solve them.
In mathematical practice today it is, unfortunately, often forgotten that to
derive basic equations is as much a mathematician's duty as it is to study their
properties.
Clifford Ambrose Truesdell (1983)
Bifurcation theory began with Euler's (1744) analysis of the planar equilibrium
configurations of the elastica subjected solely to end forces.
Stuart Antman (1983)

64.1. Basic Ideas


In Theorem 43.B we proved the existence of a bifurcation point for equations
with potential operators. Thereby we used a maximum principle. Here we will
apply the same proof technique to variational inequalities of the form
$$\lambda (F'(u), v - u) \ge (G'(u), v - u) \quad \text{for all } v \in C. \tag{1}$$
We are looking for $u \in C$ with $u \ne 0$ and $\lambda \in \mathbb{R}$, i.e., we are looking for
eigensolutions. Thereby $C$ is a closed convex cone in a real H-space. As in Part I,


Figure 64.1 (rod under a compressive force $P$; $P_0$ = buckling force)

we call $(\lambda, 0)$ a bifurcation point of (1) if and only if there exists a sequence of
solutions $(\lambda_n, u_n)$ of (1) with $u_n \ne 0$ for all $n \in \mathbb{N}$ and
$$(\lambda_n, u_n) \to (\lambda, 0) \quad \text{as } n \to \infty.$$

We then call $\lambda$ a bifurcation value.


Problems of the form (1) often occur in elasticity theory. Let $F$ be the elastic
potential energy of a body and $PG(u)$ the work done by the outer forces
depending on a parameter $P$, which can be changed, i.e., we increase, for
example, the compressive forces acting on a rod (Fig. 64.1). The variational
principle, which will be motivated below by using stability arguments, is
$$F(u) - PG(u) = \min!, \quad u \in C. \tag{2}$$
If there are no restrictions imposed on the displacements $u$, such as in Figure
64.1, then we have $C = X$. The Euler equation to (2) is
$$F'(u) - PG'(u) = 0. \tag{3}$$
In Figure 64.1 one observes experimentally that no buckling occurs for small
forces $P$, i.e., we have $u = 0$. If we increase the force $P$, then for a critical value
$P_0$ we observe buckling. Here, $P_0$ is the smallest eigenvalue of (3) and is called
the buckling force.
If, however, restrictions are imposed on the displacements through obstacles
as in Figure 64.2, then, for a convex set $C$, we obtain from the minimum
problem (2) the variational inequality (1) with $\lambda = P^{-1}$, according to Section
46.1. A concrete example is considered in Section 64.6. There we will show
that, as expected, the buckling force in Figure 64.2 is greater than in Figure
64.1.
We now want to motivate the minimum problem (2). Let $u$ be an equilibrium
position of the elastic body. In order that $u$ is stable, we require
$$[PG(v) - PG(u)] - [F(v) - F(u)] < 0 \tag{4}$$

Figure 64.2

for all admissible displacements $v \in C$ with $v \ne u$. As in Section 61.6b, the
expression on the left-hand side in (4) is equal to the work needed to bring the
body from state $u$ to state $v$. Note that the elastic work is equal to the negative
potential energy difference. Therefore, $[F(v) - F(u)]$ appears in (4) with a
minus sign. Inequality (4) corresponds to the general stability principle of
Section 58.12. From (4) follows (2). Our motivation for (2) is complete.
Besides the variational inequality
$$\text{(V)} \qquad \lambda (F'(u), v - u) \ge (G'(u), v - u) \quad \text{for all } v \in C,$$
we also consider its linearization at the point $u = 0$, that is
$$\text{(L)} \qquad \lambda (F''(0)u, v - u) \ge (G''(0)u, v - u) \quad \text{for all } v \in C.$$
This is a quadratic variational inequality. Our goal is the following important
physical result:
(L) has a greatest positive eigenvalue $\lambda_0$, and $\lambda_0$ is, at the same time, the greatest
bifurcation value of (V).
Physically, $P_0 = \lambda_0^{-1}$ corresponds to the buckling force. For rods, beams,
and columns the buckling force was computed for the first time by Euler
(1744). This will be discussed in (36).

64.2. Quadratic Variational Inequalities


We begin with the simplest case and consider
$$\lambda a(u, v - u) \ge b(u, v - u) \quad \text{for all } v \in C. \tag{5}$$
We are looking for $u \in C$ and $\lambda \in \mathbb{R}$. In addition, we consider the maximum
problem
$$\lambda_0 = \max_{u \in C - \{0\}} \frac{b(u, u)}{a(u, u)}. \tag{6}$$
We assume:
(H1) $C$ is a closed, convex cone in a real H-space $X$, i.e., from $u \in C$, $t \ge 0$
follows $tu \in C$, and the set $C$ is closed and convex.
(H2) The bilinear forms $a, b\colon X \times X \to \mathbb{R}$ are symmetric. Moreover, $a$ is
strongly positive, and $b$ is compact.
(H3) There exists an element $w \in C$ with $b(w, w) > 0$.

Proposition 64.1. Assume (H1)-(H3). Then the maximum problem for the Rayleigh
quotient (6) has a solution. The number $\lambda_0$ is the largest positive eigenvalue
of the quadratic variational inequality (5) and, at the same time, a bifurcation
value of (5).

Corollary 64.2 (Solution Set). For $u \ne 0$, problems (5) and (6) with $\lambda = \lambda_0$ are
equivalent. The solution set of (5) with $\lambda = \lambda_0$ is a closed cone.

These results have many applications in linear elasticity theory. An example


is considered in Section 64.6. In the following two sections we generalize
Proposition 64.1 to nonquadratic variational inequalities.
PROOF OF PROPOSITION 64.1. We set $v = 0$ in (5). It follows that $\lambda \le \lambda_0$.
(I) The trick of the proof is to replace the original problem (6) with the new
problem
$$b(u, u) = \max!, \quad u \in C, \quad a(u, u) \le r, \tag{7}$$
where $r > 0$. This maximum problem for a weakly sequentially continuous
functional over a closed, convex, bounded set has a solution $u$ according
to Section 38.3. Because of (H3) we get $u \ne 0$. For reasons of
homogeneity we therefore have a solution of (7) with $a(u, u) = r$.
Hence $u/\sqrt{r}$ is a solution of (6). Thus we obtain $\lambda_0 = b(u, u)/a(u, u)$.
(II) We show that the solution $u$ of (7) is also a solution of the variational
inequality (5) with $\lambda = \lambda_0$. For $r \to 0$ it follows then that $(\lambda_0, 0)$ is a
bifurcation point of (5).
For fixed $v \in C$ we set
$$w = (1 + \alpha)(u + \varepsilon(v - u))$$
and solve the quadratic equation $a(w, w) = r$ for $\alpha$. This yields
$$\alpha(\varepsilon) = -\varepsilon\,\frac{a(u, v - u)}{a(u, u)} + o(\varepsilon), \quad \varepsilon \to 0.$$
We have $w \in C$. From (7) we thus obtain that $b(w, w) \le b(u, u)$. This yields
(5) with $\lambda = b(u, u)/a(u, u)$, hence $\lambda = \lambda_0$. □

The proof of Corollary 64.2 will be given in Problem 64.1.
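A finite-dimensional check of Proposition 64.1 may help to fix ideas. The following Python sketch is an illustration only; all data are hypothetical and not taken from the text. We choose $X = \mathbb{R}^2$, $a(u, v) = u \cdot v$, $b(u, v) = 2u_1 v_1 + u_2 v_2$, and the closed convex cone $C = \{(x, y) : y \ge x \ge 0\}$. The sketch maximizes the Rayleigh quotient over unit vectors of $C$ by a simple scan, compares the result with the maximum over all of $X$, and tests the variational inequality (5) with $\lambda = \lambda_0$ on sample points of $C$.

```python
import math

# Hypothetical finite-dimensional data: X = R^2, a(u, v) = u . v,
# b(u, v) = 2 u1 v1 + u2 v2, and the closed convex cone
# C = {(x, y) : y >= x >= 0}, i.e. polar angles between 45 and 90 degrees.

def a(u, v):
    return u[0] * v[0] + u[1] * v[1]


def b(u, v):
    return 2.0 * u[0] * v[0] + u[1] * v[1]


def rayleigh(u):
    return b(u, u) / a(u, u)


def unit(theta):
    return (math.cos(theta), math.sin(theta))


n = 2000
cone_angles = [math.pi / 4 + (math.pi / 4) * i / n for i in range(n + 1)]
all_angles = [math.pi * i / n for i in range(n + 1)]

u0 = max((unit(t) for t in cone_angles), key=rayleigh)
lambda_cone = rayleigh(u0)                                # maximum over C
lambda_free = max(rayleigh(unit(t)) for t in all_angles)  # maximum over X


def gap(v):
    """lambda_0 * a(u0, v - u0) - b(u0, v - u0); (5) requires this to be >= 0."""
    d = (v[0] - u0[0], v[1] - u0[1])
    return lambda_cone * a(u0, d) - b(u0, d)


samples = [(s * math.cos(t), s * math.sin(t))
           for t in cone_angles[::100] for s in (0.0, 0.5, 1.0, 3.0)]
min_gap = min(gap(v) for v in samples)
print(lambda_cone, lambda_free, min_gap)
```

In this example the maximizer lies on the boundary ray $y = x$ of the cone, and the largest eigenvalue over $C$ is smaller than the largest eigenvalue over $X$, which is the finite-dimensional counterpart of the statement that obstacles cannot lower the buckling force $P_0 = \lambda_0^{-1}$.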

64.3. Lagrange Multiplier Rule for Variational Inequalities

In addition to the maximum problem
$$G(u) = \max!, \quad u \in C, \quad F(u) = r, \tag{8}$$
with fixed $r \in \mathbb{R}$, we consider the variational inequality
$$\lambda (F'(u), v - u) \ge (G'(u), v - u) \quad \text{for all } v \in C. \tag{9}$$

We want to show that the solutions of (8) are also solutions of (9). Important
is the nondegeneracy condition
$$(F'(u), u) \ne 0. \tag{10}$$
We assume:
(H1) $C$ is a closed, convex cone in the real B-space $X$.
(H2) $F, G\colon X \to \mathbb{R}$ are given, and $u$ is a solution of the maximum problem (8)
with (10).
(H3) The functional $F$ is continuous at the point $u$ and the F-derivatives $F'(u)$,
$G'(u)$ exist.

Proposition 64.3 (Lagrange Multiplier Rule). If (H1) to (H3) hold, then there
exists a real number $\lambda$ such that the solution $u$ of the maximum problem (8) is
also a solution of the variational inequality (9).

If we choose $C = X$, then we obtain
$$\lambda F'(u) - G'(u) = 0.$$
This is the Lagrange multiplier rule of Section 43.2, whereby the eigenvalue
$\lambda$ is called a Lagrange multiplier.
Proposition 64.3 is the key to our approach, since it allows us to reduce
eigenvalue problems for variational inequalities to maximum problems.
PROOF. From the proof of Proposition 43.6 there exist for each $v \in C$ two zero
sequences $(\alpha_n)$, $(\delta_n)$ with $\alpha_n > 0$ for all $n$ such that $F(u_n) = r$ with
$$u_n = u + \alpha_n(\beta + \delta_n)u + \alpha_n v$$
and
$$\beta = -(F'(u), v)/(F'(u), u).$$
Since the set $C$ is a convex cone, we have
$$x \in C,\ t > 0 \Rightarrow tx \in C \quad \text{and} \quad x, y \in C \Rightarrow x + y \in C.$$
Thus we obtain $u_n \in C$ for large $n$, i.e., $u_n$ is an admissible element for the
maximum problem (8). Consequently,
$$G(u_n) \le G(u) \quad \text{for all } n \ge n_0.$$
This means
$$G(u) + \alpha_n (G'(u), \beta u + v) + o(\alpha_n) \le G(u).$$
For $n \to \infty$ it follows that
$$(G'(u), \beta u + v) \le 0.$$
This is the variational inequality (9) with
$$\lambda = (G'(u), u)/(F'(u), u).$$ □

64.4. Main Theorem


We study the eigenvalue problem for the variational inequality
$$\lambda (F'(u), v - u) \ge (G'(u), v - u) \quad \text{for all } v \in C \tag{11}$$
on the real H-space $X$. We are looking for $u \in C$ with $u \ne 0$ and $\lambda \in \mathbb{R}$. Thereby
(11) is a perturbation of the quadratic variational inequality
$$\lambda a(u, v - u) \ge b(u, v - u) \quad \text{for all } v \in C. \tag{12}$$
Moreover we consider, as in Section 64.2, the maximum problem for the
Rayleigh quotient
$$\lambda_0 = \max_{u \in C - \{0\}} \frac{b(u, u)}{a(u, u)}. \tag{13}$$
Our assumptions are:
(A1) The set $C$ is a closed, convex cone in the real H-space $X$. The bilinear
forms $a, b\colon X \times X \to \mathbb{R}$ are symmetric. Furthermore, $a$ is strongly
positive and $b$ is compact.
There exists an element $w \in C$ with $b(w, w) > 0$.
(A2) The functionals $F, G\colon U(0) \subseteq X \to \mathbb{R}$ are F-differentiable with $F'(0) =
G'(0) = 0$. Moreover, $F$ is weakly sequentially lower semicontinuous,
and $G$ is weakly sequentially continuous.
(A3) For $u \to 0$ we have the approximation formulas
$$F(u) = \tfrac{1}{2} a(u, u) + o(\|u\|^2), \qquad F(u) = \tfrac{1}{2} (F'(u), u) + o(\|u\|^2),$$
$$G(u) = \tfrac{1}{2} b(u, u) + o(\|u\|^2), \qquad G(u) = \tfrac{1}{2} (G'(u), u) + o(\|u\|^2).$$
Theorem 64.A. Assume (A1) to (A3). Then the variational problem (13) has a
solution with $\lambda_0 > 0$, and $\lambda_0$ is the largest bifurcation value of (11) and the largest
eigenvalue of the linearized problem (12).

Corollary 64.4 (Linearization Principle). Assume (A2). If the second-order
F-derivatives $F''(0)$, $G''(0)$ exist and if $F(0) = G(0) = 0$, then also (A3) is valid
with
$$a(u, v) = (F''(0)u, v), \qquad b(u, v) = (G''(0)u, v)$$
for all $u, v \in X$.
If, in addition, (A1) is satisfied for these $a$ and $b$, then every positive bifurcation
value of (11) is an eigenvalue of the linearized problem (12).

Theorem 64.A goes back to Miersemann (1975).
In assumption (A1), we require the existence of a point $w \in C$ with $b(w, w) > 0$.
If we replace this condition with
$$b(w, w) < 0 \quad \text{for a } w \in C,$$

then we can substitute $G$ and $b$ with $-G$ and $-b$, respectively. Multiplication
with $(-1)$ means that, in (11) and (12), the number $\lambda$ is replaced by $-\lambda$ and
"$\ge$" by "$\le$". According to Theorem 64.A we also obtain an existence result
for this new problem.

64.5. Proof of the Main Theorem


We prove Theorem 64.A and follow Zeidler (1976). For a proof of Corollary
64.4, see Problem 64.2.
In the following we assume that $0 < r \le r_0$ with sufficiently small $r_0 > 0$.
The proof idea is as follows:
(i) We apply the Lagrange multiplier rule from Section 64.3.
(ii) For this we have to solve the maximum problem
$$\text{(M)} \qquad \max_{u \in C \cap \partial M_r} G(u) = G(u_r)$$
over the set $C \cap \partial M_r$.
(iii) In order to solve (M) we consider the simpler maximum problem
$$\text{(M*)} \qquad \max_{u \in C \cap M_r} G(u) = G(u_r)$$
over the weakly sequentially compact set $C \cap M_r$. Then, we apply the
so-called boundary trick, i.e., we show that the solution $u_r$ of (M*) lies on
the boundary $\partial M_r$. Consequently, $u_r$ is also a solution of (M).
(I) Geometrical preparation. We introduce the new equivalent norm $|u| =
a(u, u)^{1/2}$ on the H-space $X$. We define the ball $B_r = \{u \in X : |u| \le r\}$ and
the set
$$M_r = \{u \in X : F(u) \le r^2/2,\ |u| < 2r\}.$$
Because of (A3) we find that, for small $r$, the closed set $M_r$ lies in a small
open ball. Because of the continuity of $F$ we find
$$\partial M_r = \{u \in X : F(u) = r^2/2,\ |u| < 2r\}.$$
The functional $F$ is weakly sequentially lower semicontinuous. Therefore
the set $M_r$ is weakly sequentially compact. According to (A3) there exist
functions $\alpha = \alpha(r)$ and $\beta = \beta(r)$ with
$$B_\alpha \subseteq M_r \subseteq B_\beta \tag{14}$$
and $\alpha/r,\ \beta/r \to 1$ for $r \to 0$.

(II) We consider the three maximum problems
$$\max_{u \in C \cap B_1} b(u, u) = \lambda_0,$$
$$\max_{u \in C \cap M_r} b(u, u) = b(v_r, v_r),$$
$$\max_{u \in C \cap M_r} G(u) = G(u_r),$$
with corresponding solutions $v_r$ and $u_r$. According to Section 38.3, there
exist solutions to all these problems, because weakly sequentially continuous
functionals are maximized over weakly sequentially compact sets.
Important is now the condition
$$\lim_{r \to 0} \frac{(G'(u_r), u_r)}{r^2} = \lambda_0, \tag{15}$$
which will be proved at the end. From Section 64.2 follows that $\lambda_0 > 0$.
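(III) Boundary trick. We prove that $u_r \in \partial M_r$. Thus we obtain
$$\max_{u \in C \cap \partial M_r} G(u) = G(u_r). \tag{16}$$
Suppose we have $u_r \in \operatorname{int}(M_r \cap C)$. Since $C$ is a cone, we then also have
$$u_r + \varepsilon u_r \in M_r \cap C \quad \text{for small } \varepsilon > 0,$$
hence $G(u_r + \varepsilon u_r) \le G(u_r)$. For $\varepsilon \to 0$ we obtain $(G'(u_r), u_r) \le 0$, which
contradicts (15). Hence $u_r \in \partial M_r$.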
(IV) Lagrange multipliers. From $u_r \in \partial M_r$ and (A3) follows $(F'(u_r), u_r)/r^2 \to 1$
as $r \to 0$. Thus the nondegeneracy condition
$$(F'(u_r), u_r) \ne 0$$
is satisfied. Since the proof in Section 64.3 has a purely local character,
we may apply Proposition 64.3 to (16), according to (I), and obtain
$$(G'(u_r), v - u_r) \le \lambda_r (F'(u_r), v - u_r) \quad \text{for all } v \in C. \tag{17}$$
For $v = 0$ and $v = 2u_r$ this gives
$$\lambda_r = \frac{(G'(u_r), u_r)\, r^{-2}}{(F'(u_r), u_r)\, r^{-2}}. \tag{18}$$
Thus we obtain $\lambda_r \to \lambda_0$ as $r \to 0$. Moreover, we have $u_r \to 0$ as $r \to 0$.
Therefore $(\lambda_0, 0)$ is a bifurcation point of the original equation (11).
(V) We show that the number $\lambda_0$ is the largest bifurcation value of (11).
Suppose that in (17) we have
$$u_r \to 0 \quad \text{and} \quad \lambda_r \to \lambda_1 \quad \text{as } r \to 0 \text{ with } u_r \ne 0.$$
From (A3) follows
$$\lim_{r \to 0} \frac{(F'(u_r), u_r)}{|u_r|^2} = 1$$

and furthermore,
$$\lim_{r \to 0} \frac{(G'(u_r), u_r)}{|u_r|^2} = \lim_{r \to 0} \frac{b(u_r, u_r)}{|u_r|^2} \le \lambda_0.$$
Equation (18) implies $\lambda_1 = \lim_{r \to 0} \lambda_r \le \lambda_0$.
(VI) Proof of (15). For reasons of homogeneity we have
$$\max_{u \in C \cap B_r} b(u, u) = \lambda_0 r^2.$$
Because of (14) this implies
$$\lambda_0 \alpha^2 \le b(v_r, v_r) \le \lambda_0 \beta^2.$$
Thus we find
$$\lambda_0 = \lim_{r \to 0} \frac{b(v_r, v_r)}{r^2} = \lim_{r \to 0} \frac{2G(v_r)}{r^2}.$$
The second equation is a consequence of (A3). From $u_r \in C \cap M_r$ follows
$|u_r| \le \beta$, according to (14), hence
$$2G(v_r) \le 2G(u_r) = b(u_r, u_r) + o(|u_r|^2) \le \lambda_0 \beta^2 (1 + o(1)), \quad \beta \to 0.$$
For $r \to 0$ we thus obtain that $2G(u_r)/r^2 \to \lambda_0$, and (A3) implies (15).
The proof of Theorem 64.A is complete.

Let us note that this proof can be greatly simplified if we make the additional
assumption that
$$G(u) > 0,\ u \in C \ \Rightarrow\ (G'(u), u) > 0.$$

64.6. Applications to the Bending of Rods and Beams


As in Figure 64.3 we consider a rod or a beam of length $l$, which is clamped
at both boundary points. Moreover, we assume that the left boundary point
is fixed, and suppose that a compressive force of magnitude $P$ acts on the right
movable boundary point
$$\xi = l - \Delta l.$$
Moreover, we assume that obstacles exist as in Figure 64.3. For a mathematical
description we use the general observations of Section 64.1.
Let $u(\xi)$ denote the deflection of the beam at the point $\xi$. The length of the
buckled beam is
$$l_1 = \int_0^{l - \Delta l} \sqrt{1 + u'(\xi)^2}\,d\xi.$$

Figure 64.3 (right end point of the beam at $\xi = l - \Delta l$)

Since the beam is clamped at the boundary points we obtain
$$u(\xi) = u'(\xi) = 0 \quad \text{for } \xi = 0,\ l - \Delta l. \tag{19}$$
The work of the compressive force is obtained from the product of force times
length, that is
$$PG(u) = P\,\Delta l. \tag{20}$$
We work in a linearized theory, i.e., we make the following assumption:
$$|u|,\ |u'|,\ |u''| \text{ and } \Delta l \text{ are small, and we consider only lowest order terms.} \tag{21}$$
In the next section we will show that, in the context of (21), the elastic
potential energy of the beam satisfies
$$F(u) = \frac{1}{2} EJ \int_0^{l - \Delta l} u''^2\,d\xi. \tag{22}$$
Thereby $E$ is the modulus of elasticity of Section 60.1 and $J$ will be given in (32).
Besides the deflection $u$ we also have to determine the change in length $\Delta l$,
i.e., in (19) we have a free boundary condition. In order to eliminate this
difficulty, we replace (19) approximately with the boundary condition
$$u(\xi) = u'(\xi) = 0 \quad \text{for } \xi = 0,\ l, \tag{23}$$
because of the smallness of $\Delta l$. This means that we are looking for a solution
$u$ on $[0, l]$, which is not equal to zero on $[l - \Delta l, l]$ as in the real situation, but
only small. Moreover, we assume that $u = u(\xi)$ is the equation of the neutral
fiber, which is not subject to length changes. Thus we have $l_1 = l$, hence
$$l = \int_0^{l - \Delta l} \sqrt{1 + u'(\xi)^2}\,d\xi.$$

Together with (21) this yields the final formulas:
$$PG(u) = P\,\Delta l = \frac{1}{2} P \int_0^l u'^2\,d\xi, \qquad F(u) = \frac{1}{2} EJ \int_0^l u''^2\,d\xi.$$

To take the obstacles in Figure 64.3 into account, we choose two subsets
$M$ and $N$ of $[0, l]$ and assume
$$u(\xi) \ge 0 \ \text{ for } \xi \in M, \qquad u(\xi) \le 0 \ \text{ for } \xi \in N. \tag{24}$$
From the problem
$$F(u) - PG(u) = \min!,$$
which has been discussed in Section 64.1, we obtain the final problem:
$$\lambda a(u, u) - b(u, u) = \min!, \quad u \in C \tag{25}$$
with
$$X = \mathring{W}_2^2(0, l) \quad \text{and} \quad C = \{u \in X : \text{(24) holds}\}.$$
Moreover, we have $\lambda = 1/P$ and
$$a(u, v) = EJ \int_0^l u''v''\,d\xi, \qquad b(u, v) = \int_0^l u'v'\,d\xi.$$
With $X$, the boundary conditions (23) are taken into account. From Section
64.1 we then obtain the following variational inequality
$$\text{(P)} \qquad \lambda a(u, v - u) \ge b(u, v - u) \quad \text{for all } v \in C.$$
We are looking for $u \in C$ and $\lambda \in \mathbb{R}$. As in Section 64.2 we also consider
$$\lambda_0 = \max_{u \in C - \{0\}} \frac{b(u, u)}{a(u, u)}. \tag{26}$$
Theorem 64.B. Let $C \ne \{0\}$. Then (26) has a solution $\lambda_0 > 0$ and $\lambda_0$ is the largest
eigenvalue of (P) and, at the same time, a bifurcation value of (P). The solution
set of (P) with $\lambda = \lambda_0$ is a closed cone.

PROOF. Because of the continuous embedding $\mathring{W}_2^2(0, l) \subseteq C^1[0, l]$ it follows
that $C$ is closed. Obviously, $C$ is a convex cone.
From $A_2$(54) follows that $a(u, u)^{1/2}$ is an equivalent norm on $X$, and hence
$a$ is strongly positive.
The bilinear form $b$ is compact, because from
$$u_n \rightharpoonup u \quad \text{and} \quad v_n \rightharpoonup v \quad \text{in } X \text{ as } n \to \infty$$

follows $u_n \to u$ and $v_n \to v$ in $\mathring{W}_2^1(0, l)$, since the embedding $\mathring{W}_2^2(0, l) \subseteq \mathring{W}_2^1(0, l)$
is compact, hence $b(u_n, v_n) \to b(u, v)$ as $n \to \infty$.
Proposition 64.1 yields the assertion. □

We now want to give a physical motivation of Theorem 64.B. Let $P_0 = 1/\lambda_0$.
For
$$0 < P < P_0$$
there only exists the trivial solution $u = 0$ of (P), hence there occurs no
buckling. For $P = P_0$ we obtain a nontrivial solution. Therefore $P_0$ is the
buckling force. Without obstacles, we obtain $C = X$. Hence the eigenvalue $\lambda_0$
in (26) increases or remains equal if we pass from the obstacle case $C \subset X$ to
the case without obstacles $X = C$. As expected, this implies the following
natural result:
In the presence of obstacles the buckling force is at least as big as without
obstacles.
Roughly speaking, obstacles stabilize the beam.
When no obstacles are present, then the Euler equation for the classical
variational problem (25) is the well-known linearized rod (beam) equation
$$EJ u^{(4)} = -P u'' \ \text{ on } ]0, l[, \qquad u(\xi) = u'(\xi) = 0 \ \text{ for } \xi = 0,\ l. \tag{27}$$
The buckling force without obstacles is the smallest positive eigenvalue of (27).
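The relation between (26) and (27) can be tested numerically in the case without obstacles. The following Python sketch is an illustration under assumptions of our own: the values of $EJ$ and $l$ are hypothetical, and the trial deflection $u(\xi) = 1 - \cos(2\pi\xi/l)$ is our choice (it satisfies the clamped conditions (23) and solves the differential equation in (27) with $P = 4\pi^2 EJ/l^2$). The sketch evaluates $a(u, u)$ and $b(u, u)$ by the midpoint rule. By (26), the Rayleigh quotient of any admissible $u$ is a lower bound for $\lambda_0$, so its reciprocal is an upper bound for $P_0$; for this trial function it coincides with the classical value $4\pi^2 EJ/l^2$ for a beam clamped at both ends.

```python
import math

# Hypothetical data: bending stiffness EJ and length l in consistent units.
EJ = 2.0
l = 3.0
k = 2.0 * math.pi / l


def midpoint_integral(f, lower, upper, n=2000):
    """Composite midpoint rule for the integral of f over [lower, upper]."""
    h = (upper - lower) / n
    return h * sum(f(lower + (i + 0.5) * h) for i in range(n))


def du(x):
    """u'(x) for the trial deflection u(x) = 1 - cos(k x)."""
    return k * math.sin(k * x)


def d2u(x):
    """u''(x) for the same trial deflection."""
    return k * k * math.cos(k * x)


a_uu = EJ * midpoint_integral(lambda x: d2u(x) ** 2, 0.0, l)  # a(u, u)
b_uu = midpoint_integral(lambda x: du(x) ** 2, 0.0, l)        # b(u, u)

lam = b_uu / a_uu        # Rayleigh quotient, a lower bound for lambda_0
P_trial = 1.0 / lam      # corresponding upper bound for the buckling force
P_clamped = 4.0 * math.pi ** 2 * EJ / l ** 2
print(P_trial, P_clamped)
```

Replacing the trial function by any other element of $X$ can only increase `P_trial`, which is the Rayleigh quotient characterization (26) read in terms of $P = 1/\lambda$.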
As in the case of all linear eigenvalue models of elasticity theory, our model
also has the disadvantage that several solutions occur for the buckling force
$P_0$, i.e., there exists no uniquely determined state of buckling (Fig. 64.4(a)).
Experimentally, one observes the following fact for increasing forces $P > P_0$.
The deflections are uniquely determined by the force $P$ and they become larger
and larger (Fig. 64.4(b)). Such bifurcation branches exist in more realistic
nonlinear models.
Important, for our abstract model in Theorem 64.A, is the following:
(i) The elastic potential energy $u \mapsto a(u, u)$ is strongly positive.
(ii) The work $u \mapsto b(u, u)$ of the outer forces is compact.
Figure 64.4 (panels (a) and (b): $\max |u|$ plotted against the force, with $P_0$ marked)

In the case of the beam, we have as a typical structure:
(S) The elastic potential energy $a(u, u)$ contains derivatives which are of higher
order than those of the work $b(u, u)$ of the outer forces.
The proof of Theorem 64.B shows that (S) implies (ii), while (i) follows
from the fact that $a(u, u)^{1/2}$ is an equivalent norm on the Sobolev space $\mathring{W}_2^2(0, l)$.
In fact, (S) occurs in many other situations in elasticity theory. For example,
the following chapter will show that (S) occurs for plates. Thus we can also
apply Theorem 64.A to plates with obstacles (see Section 65.8).

64.7. Physical Motivation for the Nonlinear Rod Equation
We consider here the same situation as in the previous section, but without
obstacles. In order to obtain a better model, we choose the arclength $s$ as a
parameter. Suppose the equation for the neutral fiber of the rod or beam of
length $l$ is
$$\xi = \xi(s), \quad \zeta = \zeta(s). \tag{28}$$
Since $s$ is the arclength, we have $\xi'^2 + \zeta'^2 = 1$. Therefore it suffices to compute
$\zeta$ (Fig. 64.5). The angle between the tangent to the curve and the $\xi$-axis is
denoted by $\varphi(s)$. Thus we have
$$\xi'(s) = \cos \varphi(s), \quad \zeta'(s) = \sin \varphi(s), \tag{29}$$
and
$$\varphi(s) = \arcsin \zeta'(s). \tag{30}$$
As is well known, $\varphi'(s)$ is the curvature at the point $s$, and $r = 1/\varphi'$ is called
the radius of curvature. Our goal is the variational problem
$$\int_0^l \left( \tfrac{1}{2} A \varphi'^2 + P \cos \varphi \right) ds - Pl = \min!, \qquad \varphi(0) = \alpha, \quad \varphi(l) = \beta \tag{31}$$
for fixed given angles $\alpha$ and $\beta$. The material constant $A$ is called flexible

Figure 64.5

stiffness. If the rod lies symmetrically to the $\xi$-axis and has a constant cross
section $Q$, then we have
$$A = EJ. \tag{32}$$
Thereby $E$ is the modulus of elasticity of Section 60.1 and
$$J = \int_Q \zeta^2\,d\zeta\,d\eta.$$
Moreover, $l$ is the length of the rod and $P$ the outer force. For $P > 0$, we have
a compressive force. If $\varphi$ is expressed in terms of $\zeta$, then we obtain the following
variational problem which is equivalent to (31):
$$\int_0^l \left( \frac{1}{2} \frac{A \zeta''^2}{1 - \zeta'^2} + P \sqrt{1 - \zeta'^2} \right) ds - Pl = \min!, \qquad \zeta(0) = \zeta(l) = 0, \quad \varphi(0) = \alpha, \quad \varphi(l) = \beta. \tag{33}$$
Note that $\zeta'' = \varphi' \cos \varphi$. In Section 37.28j we have studied problems of type
(33) by making use of catastrophe theory. As we shall see during the next
section, the form (31) is better suited for the construction of explicit solutions.
Motivation of (31). The work $PG$ of the outer force is equal to force times
displacement, hence
$$PG = P(l - \xi(l)) = Pl - P \int_0^l \cos \varphi\,ds.$$
Moreover, we assume that the elastic energy $F$ depends only on the curvature
$\varphi'$, i.e.,
$$F = \int_0^l L(\varphi')\,ds.$$
We require $L(0) = 0$ and $L(-\varphi') = L(\varphi')$, i.e., $L$ does not depend on the sign
of the curvature. The Taylor expansion $L(\varphi') = 2^{-1} A \varphi'^2 + o(\varphi'^2)$ yields
approximately
$$F = \frac{1}{2} A \int_0^l \varphi'^2\,ds = \frac{1}{2} A \int_0^l \zeta''^2 (1 - \zeta'^2)^{-1}\,ds.$$
According to Section 64.1 we obtain (31) from
$$F - PG = \min!.$$
Motivation of (32). We want to compute the material constant $A$ more
accurately. Thereby we obtain another more general motivation for $F$. Consider
Figure 64.6. Suppose the rod lies symmetrically to the $\xi$-axis. We first
look at the $(\xi, \zeta)$-plane. Since the upper fiber is stretched and the lower fiber
shortened we make, with Jacob Bernoulli, the following assumptions:
(i) The $\xi$-axis remains unstretched, i.e., it corresponds to the so-called neutral
fiber.

Figure 64.6

(ii) Straight lines, perpendicular to the $\xi$-axis, pass without stretching into
straight lines, perpendicular to the neutral fiber.
At the point $A'$ of Figure 64.6 the neutral fiber behaves locally like a circle
of radius $r = 1/\varphi'$. For the distances we find
$$|AB| = |A'B'| = r\beta \quad \text{and} \quad |C'D'| = (r + \zeta(C))\beta.$$
For the dilatation $\gamma$ of $CD$ we therefore obtain
$$\gamma = (|C'D'| - |CD|)/|CD| = \zeta(C)/r,$$
hence $\gamma = \zeta(C)\varphi'$.
We now consider the three-dimensional case. Let $Q(\xi)$ be the cross section
of the rod for fixed $\xi$. Suppose that all planes parallel to the $(\xi, \zeta)$-plane behave
analogously to the $(\xi, \zeta)$-plane, whereby no dilatation occurs in the direction
of the $\eta$-axis. We choose a small axially parallel cuboid. According to Problem
61.3, its elastic energy is equal to
$$\tfrac{1}{2} E \gamma^2\,\Delta V = \tfrac{1}{2} E \zeta^2 \varphi'^2\,\Delta V.$$
Summation over the small cuboids yields the total elastic potential energy
$$F = \frac{1}{2} E \int_0^l J \varphi'^2\,ds \quad \text{with} \quad J(s) = \int_{Q(s)} \zeta^2\,d\zeta\,d\eta.$$
For a constant cross section we obtain $J = \text{const}$. This yields (32).

64.8. Explicit Solution of the Rod Equation


For a constant cross section of the rod the Euler equation for the variational
problem (31) is
$$\varphi''(s) = -\mu^2 \sin \varphi(s) \tag{34}$$

with $\mu = \sqrt{P/EJ}$. As usual, let $\operatorname{sn}(\cdot)$ denote the Jacobian elliptic sinus
amplitudinis.

Proposition 64.5. For each $k \in [0, 1]$ one obtains from
$$\sin \frac{\varphi}{2} = k \operatorname{sn}(\mu s, k) \tag{35}$$
a solution of (34) with $\varphi(0) = 0$ and $\varphi'(0) = 2k\mu$.

PROOF. After multiplication with $\varphi'$ and integration it follows from (34) that
$$\varphi'^2 = 2\mu^2 (\cos \varphi - \cos \varphi_0) = 4\mu^2 \left( \sin^2 \frac{\varphi_0}{2} - \sin^2 \frac{\varphi}{2} \right).$$
Letting $k = \sin(\varphi_0/2)$ and $\sin(\varphi/2) = k \sin \psi$ we obtain
$$\mu s = \int_0^{\psi} \frac{d\psi}{\sqrt{1 - k^2 \sin^2 \psi}}.$$
By inverting this elliptic integral we find $\psi = \operatorname{am}(\mu s, k)$ and
$$k \sin \psi = k \sin \operatorname{am}(\mu s, k) = k \operatorname{sn}(\mu s, k).$$
This is (35). □
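The first integral used in this proof can be checked numerically without evaluating elliptic functions. The following Python sketch is an illustration only; the parameter values $\mu = 1.5$, $k = 0.6$ and the function name `integrate_rod` are hypothetical. It integrates (34) with the initial data $\varphi(0) = 0$, $\varphi'(0) = 2k\mu$ by the classical Runge-Kutta method of order four and monitors the quantity $\varphi'^2/(4\mu^2) + \sin^2(\varphi/2)$, which by the proof must equal $k^2$. Since $|\operatorname{sn}| \le 1$, formula (35) also predicts that $|\sin(\varphi/2)|$ never exceeds $k$.

```python
import math


def integrate_rod(mu, k, s_end, steps=20000):
    """Classical RK4 for phi'' = -mu^2 sin(phi), phi(0) = 0, phi'(0) = 2 k mu."""
    h = s_end / steps
    phi, dphi = 0.0, 2.0 * k * mu
    path = [(0.0, phi, dphi)]

    def rhs(p, q):
        # First-order system: p' = q, q' = -mu^2 sin(p).
        return q, -mu * mu * math.sin(p)

    for i in range(steps):
        k1p, k1q = rhs(phi, dphi)
        k2p, k2q = rhs(phi + 0.5 * h * k1p, dphi + 0.5 * h * k1q)
        k3p, k3q = rhs(phi + 0.5 * h * k2p, dphi + 0.5 * h * k2q)
        k4p, k4q = rhs(phi + h * k3p, dphi + h * k3q)
        phi += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        dphi += h * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
        path.append(((i + 1) * h, phi, dphi))
    return path


mu, k = 1.5, 0.6  # hypothetical parameter values
path = integrate_rod(mu, k, s_end=10.0)

# First integral from the proof: phi'^2 / (4 mu^2) + sin^2(phi / 2) = k^2.
drift = max(abs(q * q / (4 * mu * mu) + math.sin(p / 2) ** 2 - k * k)
            for _, p, q in path)
# Consequence of (35): |sin(phi / 2)| is bounded by k and attains it.
peak = max(abs(math.sin(p / 2)) for _, p, _ in path)
print(drift, peak)
```

The integration interval is chosen long enough to contain several turning points of $\varphi$, so that the bound $|\sin(\varphi/2)| \le k$ is actually attained along the computed path.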

The bending of the rod was studied for the first time by Jacob Bernoulli
in 1691. From 1696 on, he and his younger brother Johann worked as
embittered rivals on a further development of variational calculus, after
Johann Bernoulli had posed the famous problem of the brachystochrone. The
variational principle (31) goes back to Johann's son, Daniel Bernoulli. In 1742,
he communicated it in a letter to Euler. In a wonderful piece of work, Euler
(1744) studied the differential equation (34) in great detail. The theory of
elliptic functions was not known at this time. It was developed only later
during the nineteenth century by Jacobi and Weierstrass. Nevertheless, using
great computational skills, Euler was able to classify nine qualitatively different
types of solutions. This classification of (35) may be found in Geckeler
(1928), p. 189. Interestingly, the family of solutions of (35) also contains such
strange rod curves as shown in Figure 64.7.
If we consider a pendulum as in Figure 64.8, then we again obtain the
differential equation (34) with
$$\mu^2 = g/L$$

Figure 64.7

Figure 64.8

Figure 64.9

and $g$ = gravitational acceleration, $L$ = length of the pendulum, $s$ = time.
Using this observation one can determine solutions of (34) experimentally and
give intuitive interpretations.
For small $\varphi$, equation (34) becomes approximately
$$\varphi'' + \mu^2 \varphi = 0, \qquad \varphi(0) = \varphi(l) = 0.$$
The smallest eigenvalue is $\mu = \pi/l$ with eigenfunction $\varphi = \sin \mu s$. Thus, for the
buckling force we obtain Euler's (1744) famous formula:
$$P_0 = \frac{EJ\pi^2}{l^2}. \tag{36}$$

Euler considered a column as in Figure 64.9. For the critical force $P = P_0$ the
column collapses. For a detailed discussion see Section 29.13.
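Formula (36) is easy to evaluate. The following Python sketch is an illustration with hypothetical data: a rod with circular cross section of radius 1 cm and length 2 m, and an assumed modulus of elasticity of $2.1 \cdot 10^{11}$ Pa (a value of the order of magnitude of steel). For a disc, $J = \int_Q \zeta^2\,d\zeta\,d\eta = \pi \rho^4/4$ with the radius $\rho$. The sketch computes $P_0$, recovers $\mu = \sqrt{P_0/EJ}$, and checks by central differences that $\varphi = \sin \mu s$ satisfies the linearized equation $\varphi'' + \mu^2 \varphi = 0$ with $\mu l = \pi$.

```python
import math


def euler_buckling_force(E, J, l):
    """Formula (36): P0 = E J pi^2 / l^2."""
    return E * J * math.pi ** 2 / l ** 2


# Hypothetical rod with circular cross section (SI units).
E = 2.1e11                      # Pa, assumed modulus of elasticity
radius = 0.01                   # m
J = math.pi * radius ** 4 / 4   # integral of zeta^2 over a disc of this radius
l = 2.0                         # m

P0 = euler_buckling_force(E, J, l)
mu = math.sqrt(P0 / (E * J))    # mu = sqrt(P / (E J)) as in (34)

# Central-difference residual of phi'' + mu^2 phi = 0 for phi = sin(mu s).
h = 1e-3


def residual(s):
    def phi(t):
        return math.sin(mu * t)
    return (phi(s + h) - 2 * phi(s) + phi(s - h)) / h ** 2 + mu ** 2 * phi(s)


max_residual = max(abs(residual(0.1 * i)) for i in range(1, 20))
print(P0, mu * l, max_residual)
```

Because $P_0$ is proportional to $l^{-2}$, doubling the length of the rod in this sketch reduces the computed buckling force by a factor of four, in agreement with Euler's statement quoted at the beginning of the chapter.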

PROBLEMS

64.1. Proof of Corollary 64.2.
Solution: The proof of Proposition 64.1 shows that every solution of (6) is also
a solution of (5) with $\lambda = \lambda_0$. Let, conversely, $u$ be a solution of (5) with $\lambda = \lambda_0$.
If we choose $v = 0$ and $v = 2u$, then it follows that $\lambda_0 = b(u, u)/a(u, u)$.
Let $S$ be the solution set of (5) with $\lambda = \lambda_0$. The continuity and homogeneity
of $a$ and $b$ imply that $S$ is a closed cone.
64.2. Proof of Corollary 64.4.
Hint: See Zeidler (1976), pp. 44, 49.

Figure 64.10. Rod in a tube $|u| \le a$ under the force $P$, with panels
(a) $0 < P < P_0$; (b) first critical buckling force, $P_0$; (c) $P_0 < P < P_1$; (d) second critical buckling force, $P_1$.
The heavily drawn situations are observed in the experiment up to symmetry with respect
to the $P$-axis.

64.3. Bifurcation for variational inequalities on convex sets, higher eigenvalues, and
applications to beams and plates. See the papers of Miersemann in the References
to the Literature to this chapter.
64.4. Bending of rods with obstacles and optimal control. Let us consider the situation
of Figure 64.10, namely, a rod under the action of a compressive force $P$ with
the side condition
$$|u| \le a$$
for the deflection $u$. According to Section 64.6, we obtain the following problem:
$$\frac{1}{2} \int_0^l (A w^2 - P v^2)\,d\xi = \min!, \qquad v = u', \quad w = v', \qquad |u| \le a, \qquad u(0) = u(l) = 0. \tag{37}$$
This is a typical optimal control problem to which the Pontrjagin maximum
principle applies (see Section 48.6). A careful study of this problem may be found
in the paper of Phu (1987), where the author introduces a new method, which he
calls the method of region analysis. In Phu (1987a) this method is also applied
to the optimal control of hydro-electric power stations. Figure 64.10 shows some
of the typical solutions which coincide with physical experiments. In those
experiments, the compressive force is increased and one observes an abrupt
change of the rod. More and more complex configurations occur.

A typical and interesting feature of the method of region analysis is the fact
that the optimal solutions depend only on the structure of the problem, and not
on the precise formulation of the problem. Thus, one only needs a rough modeling
of the actual situation.

References to the Literature

Classical works: Bernoulli (1691), Euler (1744).


Modern development: Antman (1983, S, B, H).
Article about rods in the handbook of physics: Geckeier (1928, B, H), Antman
(1972, B).
Bifurcation for rods: Antman (1977, S), Antman and Rosenfeld (1978, S), Reiss
(1977, S), Marsden (1980), Antman and Kenney (1981).
Bifurcation for variational inequalities and its applications: Miersemann (1981, S)
(recommended as an introduction), (1975), (1978), (1979), (1981), (1981a, b), Do (1975),
(1976), (1977), Dias (1975), Diasand Hemandez (1975), Zeidler (1976), Kueera, Neeas,
and Soucek (1978), Kueera (1982), (1982a).
Explicit solutions for the rod: Euler (1744), Geckeier (1928), Pflüger (1965, M).
Rope and bifurcation: Dickey (1976, L).
Rotating rods, chains, and bifurcation: Stuart (1976, S), Reeken (1977) (1979), Antman
and Nachman (1980).
Rod and catastrophe theory: Zeeman (1976), Golubitsky (1978, S), Poston and
Stuart (1978, M), Gilmore (1981, M).
Mooney-Rivlin material and the bending of rods: Ball (1977).
Bending of rods, optimal control, and the method of region analysis: Phu (1987),
(1987a) (optimal control of hydro-electric power stations).
Variational inequalities in mechanics: Panagiotopoulos (1985, M).
History: Antman (1983, S), Truesdell (1983, S).
CHAPTER 65

Pseudomonotone Operators, Bifurcation,
and the von Karman Plate Equations

The historical development of nonlinear plate and shell theories has not been
as felicitous as that of rod theories up to the creation of Kirchhoff's theory.
Kirchhoff (1824-1887) established a satisfactory linear theory of plates begun
by Navier (1785-1836). A popular nonlinear plate theory was finally developed
by von Karman in 1910.
These equations are "derived" by introducing, in the standard procedure of
mechanics, a number of geometric approximations that are roughly analogous
to the replacement of $\sin\varphi$ in the rod equation
$\varphi'' + \sin\varphi = 0$
by its cubic approximation $\varphi - \varphi^3/6$. Moreover, the von Karman equations
are based upon a certain linear stress-strain law.
Stuart Antman (1977)
The number of papers about plates is infinitely large, and gets larger and larger.
Mathematical folklore

65.1. Basic Ideas


In this chapter we consider a plate which is clamped at the boundary. Our
method of proof, however, can also be applied to other boundary conditions.
We use the following tools:
(I) Implicit function theorem (Theorem 4.B).
(P) Main theorem about pseudomonotone operators (Theorem 27.A).
(B) Main theorem of bifurcation theory for potential operators (Theorem
45.A).
The von Karman plate equations, which will be formulated in Section 65.3
and motivated in Section 65.7, lead to the operator equation

$w - pLw + Cw = f, \quad w \in X$ (1)
in the real H-space X. This equation has the following structure:
p = real parameter;
L = linear, symmetric, strongly continuous operator;
C = strongly continuous potential operator which is homogeneous of
degree 3;
f = fixed element in X.
Roughly speaking, these quantities allow the physical interpretation:
w => deflection of the plate perpendicular to the plate plane;
f => outer volume forces;
pL => outer boundary forces.
For p = 0, the outer boundary forces vanish, while for $|p| \to \infty$ they increase.
Let us investigate two problems:
(i) Existence. For arbitrary f and small |p|, i.e., for arbitrary volume forces
and small boundary forces, equation (1) has a solution.
For small $\|f\|$ and |p|, i.e., for small volume and boundary forces, the
solution is unique.
Because of possible bifurcations, there exist no uniqueness results for
arbitrary forces.
(ii) Bifurcation. Let f = 0, i.e., the volume forces vanish. Then equation (1) has
the trivial solution w = 0, which corresponds to a state of the plate without
buckling.
We are interested in buckled states which correspond to nontrivial
solutions $w \ne 0$. We will show that, in mathematical terms, the bifurcation
points $(p, 0) \in \mathbb{R} \times X$ of equation (1) with f = 0 are in a one-to-one
correspondence with the characteristic numbers p of the linearized problem
$w - pLw = 0, \quad w \in X.$ (2)
In physical terms, this means the following.
Precisely the characteristic numbers of equation (2) correspond to the
critical boundary forces at which buckling of the plate occurs. If one
increases the boundary forces, beginning with forces equal to zero, i.e., if p
increases from zero onwards, then buckling occurs for the first time at the
smallest characteristic number
$p_0 > 0$
of equation (2).
The knowledge of $p_0$ is very important to engineers in order to determine
the stability of plates. If
$0 < p < p_0$,
then engineers expect that the plate is stable for the boundary force
corresponding to p. By modifying the plate, for example, by
applying additional forces, they try to obtain the largest possible value for
$p_0$.
We now explain the basic ideas for the proofs of (i) and (ii).
(a) It is clear that (ii) is an immediate consequence of (B).
(b) The uniqueness result in (i) follows from (I) or directly from the Banach
fixed-point theorem.
(c) The general existence result in (i) is a consequence of (P), because the
operator
$A = I - pL + C$
is a strongly continuous perturbation of the identity and hence A is
pseudomonotone and bounded. It is important that we show the coerciveness
of the operator A. We thereby make essential use of the special structure
of the operator C. Namely, there exists a bilinear, bounded, and strongly
continuous operator
$B\colon X \times X \to X$
such that
$Cw = B(w, B(w, w))$ for all $w \in X$ (3)
and
$(Cw|w) = \|B(w, w)\|^2.$ (4)
Thus we have
$(Aw|w) = \|w\|^2 - p(Lw|w) + (Cw|w) \ge \|w\|^2 (1 - |p|\,\|L\|)$
and hence
$\dfrac{(Aw|w)}{\|w\|} \to +\infty$ as $\|w\| \to +\infty$ (5)
for small |p|. This is the coerciveness of A for small |p|.
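
The estimate behind (5) can be written out step by step. The following is only a sketch: it assumes that $\|L\|$ denotes the operator norm of L, uses the Schwarz inequality for $(Lw|w)$, and makes the smallness condition on |p| explicit as $|p|\,\|L\| < 1$, which is one sufficient choice.

```latex
\begin{aligned}
(Aw|w) &= \|w\|^2 - p\,(Lw|w) + \|B(w,w)\|^2
          && \text{by (4)} \\
       &\ge \|w\|^2 - |p|\,\|L\|\,\|w\|^2
          && \text{since } |(Lw|w)| \le \|L\|\,\|w\|^2,\ \|B(w,w)\|^2 \ge 0, \\
\frac{(Aw|w)}{\|w\|} &\ge \bigl(1 - |p|\,\|L\|\bigr)\,\|w\| \longrightarrow +\infty
          && \text{as } \|w\| \to +\infty, \text{ provided } |p|\,\|L\| < 1 .
\end{aligned}
```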


In Problem 65.2 we show how, in contrast to (i), the existence of solutions
for arbitrary boundary and volume forces can be proved.
In Section 65.7 we give a detailed mathematical and physical motivation
for the von Karman equations. In this connection, we use our general strategy
for obtaining approximation models in elasticity, which has been described
in Section 61.6e. The basic idea is as follows:
(a) We start with an exact stored energy function $L = \Lambda(\mathcal{E})$, where $\mathcal{E}$ is the
strain tensor (Saint Venant-Kirchhoff material).
(b) We replace $\mathcal{E}$ with an approximation, where, roughly speaking, we assume
that all effects perpendicular to the plate plane are small.
(c) We consider the principle of stationary potential energy with respect to
the approximate stored energy function.
(d) We compute the corresponding Euler equations, which represent a
nonlinear system of partial differential equations for the three displacements
$u_1$, $u_2$, and $u_3$.
(e) We simplify the Euler equations by introducing a function U, called the
Airy stress function. This function plays the role of a potential for $u_1$ and
$u_2$. In this way we obtain the von Karman plate equations for the two
unknown functions $w = u_3$ and U.

For readers who are interested in practical applications, we remark that,
for example, the construction of the Metro in Prague and the Prague Ice
Palace was based on the Signorini problem and the von Karman-Vlasov shell
equations, respectively.
There still remain many open problems in shell theory. This applies to
model building as well as to mathematical existence theory. For shells, the
stress depends more sensitively on higher order derivatives of the
displacements than is the case for plates. This is the reason why there
exist, for example, equations of degree eight and higher, in contrast to the
fourth-order elliptic equations for plates. Some deep work in shell theory has
been done by John (1965), (1971), (1985) (collected works), who found a priori
estimates for the stresses in shells. In this way, he rigorously derived interior
shell equations.

65.2. Notations
In this chapter we work in a fixed Cartesian coordinate system with
coordinates $\xi_1, \xi_2, \xi_3$ and a corresponding system of orthonormal basis vectors
$e_1, e_2, e_3$. Instead of the partial derivative
$\partial u / \partial \xi_i$
we simply write
$u_{,i}$.
This is a usual convention in elasticity theory. For example, we have

and

Let $\|\cdot\|_q$ and $\|\cdot\|_{m,q}$ denote the norms on $L_q(\Omega)$ and on the Sobolev space
$W_q^m(\Omega)$, respectively.

65.3. The von Karman Plate Equations


Let $\Omega$ be a bounded, simply connected region in the $(\xi_1, \xi_2)$-plane with
sufficiently regular boundary, i.e., $\partial\Omega \in C^{0,1}$. The von Karman plate equations
are
$\Delta^2 U = -2^{-1} E\,\varphi(w, w)$ on $\Omega$, (6)
$\Delta^2 w = 2aD^{-1}(\varphi(w, U + pU_0) + K_3)$ on $\Omega$, (7)
with the boundary conditions
$w = w_{,1} = w_{,2} = 0$ on $\partial\Omega$, (8)
$U = U_{,1} = U_{,2} = 0$ on $\partial\Omega$. (9)
We set
$\varphi(v, w) = v_{,11} w_{,22} + v_{,22} w_{,11} - 2 v_{,12} w_{,12}$
and note that
$D = 2Ea^3/3(1 - \mu^2)$,
E = modulus of elasticity, $\mu$ = Poisson number,
2a = thickness of the plate.
In order to simplify the notation we choose a physical system of units for which
$2^{-1} E = 2aD^{-1} = 1$.
We are looking for the real functions w and U.
In Section 65.7 we shall show that all displacements and stress forces of the
plate are obtained from a solution w, U, i.e.,
all physically interesting quantities of the plate follow from
a solution w, U of the von Karman equations.
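
The bracket $\varphi(v, w)$ only involves second derivatives, so its algebraic properties can be checked pointwise. The following Python sketch is an illustration and not part of the original text: the helper names `phi`, `hess_v`, `hess_w` and the sample polynomials are assumptions chosen for the check. It confirms the symmetry $\varphi(v, w) = \varphi(w, v)$ and the identity $\varphi(w, w) = 2(w_{,11} w_{,22} - w_{,12}^2)$.

```python
# Hypothetical helper: phi takes the second derivatives (f_11, f_12, f_22)
# of two functions at one point and returns
#   v,11 w,22 + v,22 w,11 - 2 v,12 w,12.
def phi(hess_v, hess_w):
    v11, v12, v22 = hess_v
    w11, w12, w22 = hess_w
    return v11 * w22 + v22 * w11 - 2.0 * v12 * w12


# Second derivatives of the sample polynomial v = x^2 * y.
def hess_v(x, y):
    return (2.0 * y, 2.0 * x, 0.0)


# Second derivatives of the sample polynomial w = x * y^2 + x^2.
def hess_w(x, y):
    return (2.0, 2.0 * y, 2.0 * x)


x, y = 0.5, -1.5
hv, hw = hess_v(x, y), hess_w(x, y)

sym_gap = phi(hv, hw) - phi(hw, hv)       # symmetry of the bracket
det_w = hw[0] * hw[2] - hw[1] ** 2        # determinant of the Hessian of w
diag_gap = phi(hw, hw) - 2.0 * det_w      # phi(w, w) = 2 det(Hessian of w)
```

Both gaps vanish for any choice of sample functions, since the two identities are purely algebraic in the second derivatives.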
Let us restrict ourselves here to the following remarks:
(i) Displacements. In the undeformed state, the plate is in the region
$G = \Omega \times\, ]-a, a[$
of $\mathbb{R}^3$ (Fig. 65.1). Thus $\Omega$ lies in the middle plane of the plate. Moreover, w is
the displacement of $\Omega$ in the direction of the $\xi_3$-axis, i.e., $(\xi_1, \xi_2, 0)$ is
deformed into $(\xi_1, \xi_2, w(\xi_1, \xi_2))$. The boundary condition (8) means that
the plate is clamped at the boundary.

Figure 65.1 (plate of thickness 2a)
(ii) Outer forces. The volume force
$\left(\int_G K_3\,dx\right) e_3 = \left(2a \int_\Omega K_3(\xi_1, \xi_2)\,d\xi_1\,d\xi_2\right) e_3$
and the boundary stress force
$2a \int_{\partial\Omega} (T_1(s)e_1 + T_2(s)e_2)\,ds$
act on the deformed plate, where s is the arclength of the boundary curve
which corresponds to $\partial\Omega$. We orient this curve in such a way that $\Omega$ lies
to the left of this curve.
The function $pU_0$ with the real parameter p, given on $\bar\Omega$, then describes
the boundary stress forces as
$p\,\frac{d}{ds} U_{0,2} = T_1, \quad -p\,\frac{d}{ds} U_{0,1} = T_2$ on $\partial\Omega$. (10)
We thereby make the additional assumption that
$\Delta^2 U_0 = 0$ on $\Omega$.
(iii) Stress forces. See Section 65.7. If one knows a solution w, U of the von
Karman equations, then the displacements in the direction of the $\xi_i$-axis
with i = 1, 2 are determined only up to infinitesimally small rigid motions.
This will be discussed in Section 65.7.

65.4. The Operator Equation


We now want to show that the generalized problem to the von Karman
equations leads to the operator equation
$w - pLw + Cw = f, \quad w \in X$ (11)
with $X = \mathring{W}_2^2(\Omega)$.
Roughly, the idea is:
(i) to eliminate the function U by solving equation (6) for U; and
(ii) to insert this expression into (7).
Classically, we thereby have to solve a boundary-value problem for the
equation

with w given.
Functional analytically, this can be done, very elegantly, by introducing on
the Sobolev space $X = \mathring{W}_2^2(\Omega)$ the equivalent energetic scalar product
$(v|w) = \int_\Omega \sum_{i,j=1}^{2} v_{,ij} w_{,ij}\,dx$
(see $A_2$ (53b)). Integration by parts yields
$(v|w) = \int_\Omega \Delta v\, \Delta w\,dx$ for all $v, w \in X$,
and multiplying (6) and (7) with Z and z, respectively, we again apply
integration by parts to obtain
$(U|Z) = -b(w, w, Z)$,
$(w|z) = b(w, U + pU_0, z) + F(z)$ for all $z, Z \in C_0^\infty(\Omega)$, (12)
where
$b(v, w, z) = \int_\Omega \varphi(v, w) z\,dx, \quad F(z) = \int_\Omega K_3 z\,dx.$


65.4a. The Generalized Problem


Definition 65.1. Set $X = \mathring{W}_2^2(\Omega)$. The generalized problem to the von Karman
plate equations (6)-(9) is the following. Suppose
$p \in \mathbb{R}, \quad K_3 \in L_2(\Omega), \quad U_0 \in W_2^2(\Omega)$
with $\Delta^2 U_0 = 0$ on $\Omega$ are given.
We are then looking for $U, w \in X$ such that equation (12) is satisfied.

Every classical solution of (6)-(9) satisfies (12), i.e., is a generalized solution.
If we reverse integration by parts in (12), then every sufficiently smooth
generalized solution is also a classical solution of (6)-(9). This is a consequence
of the arbitrariness of z and Z.
We use the space X, since it follows from $w, U \in X$ that the boundary
conditions (8), (9) are satisfied in the generalized sense.

65.4b. Elimination of the Airy Stress Function U via
the Theorem of Riesz
Our next goal is to obtain the system
$U = -B(w, w)$,
$w = B(w, U) + pLw + f$ (13)
from the generalized problem (12) by using the theorem of Riesz of Section
18.11. To this end we set
$(B(v, w)|z) = b(v, w, z)$,
$(Lw|z) = b(w, U_0, z)$, (14)
$(f|z) = F(z)$ for all $v, w, z \in X$.
For this to make sense we have to show that
$|b(v, w, z)| \le \text{const}\,\|z\|$,
$|b(w, U_0, z)| \le \text{const}\,\|z\|$, (15)
$|F(z)| \le \text{const}\,\|z\|$ for all $z \in X$
and arbitrarily fixed $v, w \in X$. In fact, if we fix $v, w \in X$ and suppose that the
first inequality in (15) holds true, then
$z \mapsto b(v, w, z)$
is a linear, continuous functional on the H-space X. According to the theorem
of Riesz, there exists an element $B(v, w) \in X$ such that the first line of (14) holds
true. In a similar way we obtain the second and third line of (14) from the
corresponding lines of (15).
The proof of (15) will be given below. From (12), (14) follows (13), since
$C_0^\infty(\Omega)$ is dense in X.

65.4c. Properties of the Operator Equation

We now set
$Cw = B(w, B(w, w))$.
Then the system (13) becomes
$w = -Cw + pLw + f, \quad w \in X$.
This is the operator equation (11).

Proposition 65.2. The generalized problem (12) to the von Karman equations is
equivalent to the operator equation (11), whereby the following holds:
(a) $L\colon X \to X$ is linear, symmetric, and strongly continuous.
(b) $B\colon X \times X \to X$ is bilinear, symmetric, and strongly continuous.
(c) $C\colon X \to X$ is a strongly continuous potential operator which satisfies the key
property
$(Cw|w) = \|B(w, w)\|^2$ for all $w \in X$.

PROOF. We use arguments analogous to those of Section 27.4. All constants will be
denoted by c.

The point is that the integral, which corresponds to b(v, w, z) in (12), contains
three factors. In order to estimate this integral, we will use Hölder's inequality
for three factors
$\left| \int_\Omega e f g\,dx \right| \le \|e\|_2\, \|f\|_4\, \|g\|_4,$ (16)
where $e \in L_2(\Omega)$ and $f, g \in L_4(\Omega)$, and also the following Sobolev embedding
theorem:
The embedding $\mathring{W}_2^2(\Omega) \subseteq \mathring{W}_4^1(\Omega)$ is compact. (17)

This is a consequence of our general results in $A_2$ (45).
In fact, (17) is the heart of the proof, since it implies the strong continuity
of B.
(I) Preparations. Integration by parts yields the symmetry relation
$\int_\Omega \varphi(v, w) U\,dx = \int_\Omega \varphi(U, w) v\,dx$ (18)
for all $v, w, U \in C_0^\infty(\Omega)$. Furthermore, we have the important divergence
representation
$\varphi(v, w) = D_1 \psi(v, w) + D_2 \chi(v, w)$ (19)
with
$\psi(v, w) = v_{,1} w_{,22} - v_{,2} w_{,12}$,
$\chi(v, w) = v_{,2} w_{,11} - v_{,1} w_{,12}$.

Again, using integration by parts we find that
$\int_\Omega \varphi(v, w) U\,dx = -\int_\Omega \bigl(\psi(v, w) U_{,1} + \chi(v, w) U_{,2}\bigr)\,dx$ (20)
for all $v, w, U \in C_0^\infty(\Omega)$.
Note that relations (18) and (20) also hold true if, in each case, two
of the functions only belong to $C^2(\bar\Omega)$.
In the following it is important that $\psi$ and $\chi$ in (20) only contain
first-order derivatives of v.
Let $Y = \mathring{W}_4^1(\Omega)$ with norm
$|v| = \|v\|_{1,4}$,
and recall that $\|v\|$ denotes the norm on $X = \mathring{W}_2^2(\Omega)$. From (16), (17), and
(20) we obtain the key estimate
$\left| \int_\Omega \varphi(v, w) U\,dx \right| \le c\, \|v\|_{1,4}\, \|w\|_{2,2}\, \|U\|_{1,4}$ (21)
for all $v, w, U \in \mathring{W}_2^2(\Omega)$. Notice that this is also true if $w \in W_2^2(\Omega)$.
In order to obtain (21), we first prove this relation for the space $C_0^\infty(\Omega)$.
Then we use the density of $C_0^\infty(\Omega)$ in $\mathring{W}_2^2(\Omega)$. In the case $w \in W_2^2(\Omega)$, we
need the density of $C^\infty(\bar\Omega)$ in $W_2^2(\Omega)$.
(II) Proof of (15). From (21) it follows that
$|F(z)| \le \|K_3\|_2\, \|z\|_2 \le c\,\|z\|$,
$|b(v, w, z)| \le c\,|v|\,\|w\|\,\|z\|$,
$|b(w, U_0, z)| \le c\,|w|\,\|z\|$ for all $z \in X$.
This is (15). Moreover, we obtain
$\|Lw\| \le c\,|w|$,
$\|B(v, w)\| \le c\,|v|\,\|w\|$ for all $v, w \in X$. (22)
(III) Proof of (a). Because of (17) the embedding $X \subseteq Y$ is compact.
Consequently, the weak convergence
$w_n \rightharpoonup w$ in X as $n \to \infty$
implies the strong convergence $w_n \to w$ in Y, i.e.,
$|w_n - w| \to 0$ as $n \to \infty$.
From (22) follows
$\|Lw_n - Lw\| \le c\,|w_n - w| \to 0$ as $n \to \infty$.
Therefore, the operator $L\colon X \to X$ is strongly continuous.
Equation (18) tells us that
$b(v, U_0, w) = b(w, U_0, v)$ for all $v, w \in X$.
Thus, L is symmetric.
(IV) Proof of (b). Because of the symmetry of $\varphi$ we have
$b(v, w, z) = b(w, v, z)$ for all $v, w, z \in X$,
i.e., B is symmetric. The strong continuity of B follows from (22) by using
a similar argument as in (III) for L. In this connection we note that there
exists a splitting
$\|B(e, f) - B(g, h)\| = \|B(e - g, f) + B(f - h, g)\| \le c\,(|e - g|\,\|f\| + |f - h|\,\|g\|)$,
and observe the boundedness of weakly convergent sequences.
(V) Proof of (c). The strong continuity of C follows from the strong continuity
of B. Equation (18) implies
$(B(v, w)|z) = (B(z, w)|v)$ for all $v, w, z \in X$.
Thus, according to Problem 65.1, it follows that $C\colon X \to X$ is a potential
operator. Moreover, setting $z = B(v, v)$, we obtain
$(Cv|v) = (B(z, v)|v) = (B(v, v)|z) = \|B(v, v)\|^2$. □
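
The key property in Proposition 65.2(c) rests only on the symmetries of the trilinear form b, so it can be illustrated in a finite-dimensional toy model. The sketch below is an assumption-laden stand-in, not the plate problem itself: X is replaced by $\mathbb{R}^3$ with the Euclidean scalar product, and b by a randomly generated, totally symmetric array `T` (all names are hypothetical). With $(B(v, w)|z) = b(v, w, z)$ and $Cw = B(w, B(w, w))$ one then finds $(Cw|w) = \|B(w, w)\|^2$ up to rounding.

```python
import itertools
import random

n = 3
random.seed(0)

# Hypothetical stand-in for b(v, w, z): a totally symmetric array T[i][j][k].
T = [[[0.0] * n for _ in range(n)] for _ in range(n)]
for i, j, k in itertools.combinations_with_replacement(range(n), 3):
    value = random.uniform(-1.0, 1.0)
    for a, b_, c in set(itertools.permutations((i, j, k))):
        T[a][b_][c] = value


def dot(u, v):
    """Euclidean scalar product, playing the role of (u|v)."""
    return sum(ui * vi for ui, vi in zip(u, v))


def B(v, w):
    """Riesz representative: (B(v, w)|z) = b(v, w, z) for all z."""
    return [sum(T[i][j][k] * v[i] * w[j] for i in range(n) for j in range(n))
            for k in range(n)]


def C(w):
    """Cubic operator C w = B(w, B(w, w))."""
    return B(w, B(w, w))


w = [0.3, -1.2, 0.7]
lhs = dot(C(w), w)                 # (Cw|w)
rhs = dot(B(w, w), B(w, w))        # ||B(w, w)||^2
```

In the toy model the total symmetry of `T` mirrors the symmetry of $\varphi$ together with relation (18); dropping that symmetry generally breaks the identity.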

65.5. Existence Theorem


Theorem 65.A (Existence). Set $X = \mathring{W}_2^2(\Omega)$. The operator equation (11), which
corresponds to the generalized problem for the von Karman plate equations, has
a solution for every $f \in X$ and every $p \in \mathbb{R}$ with sufficiently small |p|.

Corollary 65.3 (Uniqueness). If p is not a characteristic number of the linear
operator L in (11), then there exists a constant $r_0 > 0$ such that the operator
equation (11) has a unique solution $w \in X$ with
$\|w\| \le r_0$
for every $f \in X$ with sufficiently small norm.

PROOF OF THEOREM 65.A. Let $A = I - pL + C$. The operator $A\colon X \to X$ is a
strongly continuous perturbation of the identity. According to Section 27.2,
A is pseudomonotone and bounded. From (5) it follows that A is coercive.
The main theorem about pseudomonotone operators (Theorem 27.A) yields
the existence result. □

PROOF OF COROLLARY 65.3. We write equation (11) in the form
$F(w, p, f) = 0$.
Because of $F_w(0, p, 0) = I - pL$, the assertion follows from the implicit function
theorem (Theorem 4.B). □

The operator $L\colon X \to X$ is symmetric and compact. Therefore, the set of
characteristic numbers p of L is nonempty, at most countable, and $\pm\infty$ are
the only possible accumulation points.
The solutions in Theorem 65.A can be computed approximately by a
Galerkin procedure. According to Theorem 27.A, a subsequence of the Galerkin
sequence converges to a solution w of (11). Moreover, the entire Galerkin
sequence converges to w if the solution w of (11) is unique.
The physical meaning of these results has already been discussed in Section
65.1. Recall that f and p correspond to the volume and boundary forces,
respectively.
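
To see how coerciveness produces a solution for every right-hand side, one can look at a one-dimensional caricature of equation (11). The following Python sketch is a toy model under stated assumptions, not a Galerkin scheme for the plate: X is replaced by $\mathbb{R}$, L by multiplication with a number `ell`, and C by $w \mapsto c\,w^3$ with $c \ge 0$; the function name `solve_scalar_model` is hypothetical. For $|p|\,\ell < 1$ the map $A(w) = (1 - p\ell)w + c\,w^3$ is increasing and coercive, so bisection on a sufficiently large interval finds the solution of $A(w) = f$.

```python
def solve_scalar_model(p, ell, c, f, tol=1e-12):
    """Solve (1 - p*ell)*w + c*w**3 = f by bisection.

    Toy analogue of w - p*L*w + C*w = f; requires |p|*ell < 1 and c >= 0.
    """
    if abs(p) * ell >= 1.0 or c < 0.0:
        raise ValueError("coerciveness assumption violated")

    def A(w):
        return (1.0 - p * ell) * w + c * w ** 3

    # Coerciveness: A(w)*w >= (1 - |p|*ell) * w**2, so the root lies in [-R, R].
    R = abs(f) / (1.0 - abs(p) * ell) + 1.0
    lo, hi = -R, R
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if A(mid) < f:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


w = solve_scalar_model(p=0.5, ell=1.0, c=2.0, f=3.0)
residual = (1.0 - 0.5 * 1.0) * w + 2.0 * w ** 3 - 3.0
```

The a priori bound R plays the role that the coerciveness estimate (5) plays in the proof of Theorem 65.A.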

65.6. Bifurcation
We make the following assumptions:

(H) Let $K_3 = 0$, i.e., no outer volume forces are present. The function
$U_0 \in W_2^2(\Omega)$ is given as a solution of the differential equation $\Delta^2 U_0 = 0$ on $\Omega$.
According to (10), the boundary values of $pU_0$ are related to the boundary
stress forces.
The fundamental operator equation (11) then becomes
$w = pLw - Cw, \quad w \in X$ (23)
with the linearization
$w = pLw, \quad w \in X.$ (24)
Here, $p \in \mathbb{R}$.

Theorem 65.B (Bifurcation). Assume (H). Then (p, 0) is a bifurcation point of
(23) if and only if p is a characteristic number of (24).
Equation (24) has at least one characteristic number p.

Corollary 65.4 (Multiplicity). If the multiplicity of the characteristic number
$p_0 > 0$ of the linearized problem (24) is equal to $n \ge 1$, then the nonlinear
problem (23) has at least n different nontrivial solutions (p, w) with
$0 < \max_{x \in \bar\Omega} |w(x)| \le r$
for every r > 0.

PROOF. From Proposition 8.2 we know that if (p, 0) is a bifurcation point of
(23), then p is a characteristic number of (24).
The converse follows from the main theorem of bifurcation theory for
potential operators (see Theorem 45.A and Example 45.2).
Corollary 65.4 is a consequence of Theorem 45.A. Note that
$\max_{x \in \bar\Omega} |w(x)| \le \text{const}\,\|w\|_X$,
which is implied by the continuity of the embedding $\mathring{W}_2^2(\Omega) \subseteq C(\bar\Omega)$ and
$X = \mathring{W}_2^2(\Omega)$. □

The important physical interpretation of these bifurcation results has
already been given in Section 65.1. Note that because of $w \ne 0$, there exists a
buckling of the plate in the direction of the $\xi_3$-axis for each bifurcation solution
(p, w).
Since the operator C is analytic, one can also apply the apparatus of analytic
bifurcation theory of Chapter 8 to equation (23). For this, explicitly known
eigensolutions of the linearized equation (24) are needed. For example, in case
that the region $\Omega$ is a disk or a rectangle, such eigensolutions are known.
In addition, we can use the results of topological bifurcation theory of
Chapter 15 which concern the global behavior of the bifurcation branches in
the case that the multiplicity of the characteristic number p of (24) is odd.
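
The relation between characteristic numbers and bifurcation points can be made concrete in a scalar caricature of (23). The sketch below is a toy model and not the plate problem: X is replaced by $\mathbb{R}$, L by multiplication with `ell` > 0, and C by $w \mapsto c\,w^3$ with c > 0 (all names are assumptions). The linearization $w = p\,\ell\,w$ then has the single characteristic number $p_0 = 1/\ell$, and nontrivial solutions of $w = p\,\ell\,w - c\,w^3$ appear exactly for $p > p_0$.

```python
import math


def nontrivial_solutions(p, ell, c):
    """Nonzero real solutions of w = p*ell*w - c*w**3 (scalar toy model, c > 0)."""
    excess = p * ell - 1.0
    if excess <= 0.0:
        return []                      # only the trivial solution w = 0
    amplitude = math.sqrt(excess / c)
    return [-amplitude, amplitude]


ell, c = 0.25, 2.0
p0 = 1.0 / ell                          # characteristic number of w = p*ell*w
below = nontrivial_solutions(0.9 * p0, ell, c)
above = nontrivial_solutions(1.1 * p0, ell, c)
```

Here `below` is empty, while `above` contains two small solutions of opposite sign whose amplitude tends to zero as p decreases to $p_0$, which is the scalar picture of buckling at the critical boundary force.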

65.7. Physical Motivation of the Plate Equations


Our goal is to motivate the von Karman plate equations by using only the
natural model assumptions (Hl)-(H4) below.
We set $G = \Omega \times\, ]-a, a[$ and, as in Section 61.6, we start from the principle
of stationary potential energy
$\int_G \Lambda(\mathcal{E})\,dx - \int_G K u\,dx - \int_{\partial G} T u\,dO = \text{stationary!}$ (25)
with boundary conditions
$u_3 = u_{3,1} = u_{3,2} = 0$ on $\partial\Omega \times\, ]-a, a[$. (26)

Here, the displacement of the plate is given by
$u = u_1 e_1 + u_2 e_2 + u_3 e_3$
(Fig. 65.1 in Section 65.3) and $\mathcal{E}$ is the strain tensor with components
$\mathcal{E}_{ij} = \tfrac{1}{2}(u_{i,j} + u_{j,i} + u_{k,i} u_{k,j}), \quad i, j = 1, 2, 3.$
The sum is taken over k = 1, 2, 3.
We now formulate a number of assumptions. Roughly speaking, we require
the following:
All effects perpendicular to the plate plane, i.e., in the direction
of the $\xi_3$-axis, are small.
(H1) In the undeformed state, the plate corresponds to the set G, where
$G = \Omega \times\, ]-a, a[$. The thickness of the plate 2a is small compared with
the diameter of $\Omega$, where the region $\Omega$ lies in the $(\xi_1, \xi_2)$-plane. Suppose
that
$\mathcal{E}_{i3} = \mathcal{E}_{3i} = 0$ for i = 1, 2, 3,
i.e., there exist no extensions or contractions in the direction of the
$\xi_3$-axis.
(H2) The density of the outer volume forces has the form $K = \sum_{i=1}^{3} K_i e_i$ with
$K_1 = K_2 = 0, \quad K_3 = K_3(\xi_1, \xi_2).$
The density T of the outer boundary stress forces has the form
and for i = 1, 2.
We assume that no boundary stress forces are acting on the covering
surfaces of the plate, i.e.,
$T_i = 0$ on $\Omega \times \{\pm a\}$ for i = 1, 2, 3.
In the following we sum from 1 to 2 over two equal indices.
(H3) For the stored energy function, which represents the elastic potential
energy of the plate, we make the ansatz
$\Lambda = \tfrac{1}{2}\lambda(\mathcal{E}_{11} + \mathcal{E}_{22})^2 + \kappa\,\mathcal{E}_{ij}\mathcal{E}_{ij}$
with $\lambda = E\mu/(1 - \mu^2)$ and $\kappa = E/2(1 + \mu)$.
This expression was obtained in Problem 61.4 by considering a
cuboid, where the stress forces were not acting in the direction of the
$\xi_3$-axis. In order to obtain a nonlinear model, we replace here the
linearized strain tensor $\gamma$ of Problem 61.4 with the strain tensor $\mathcal{E}$.
(H4) The displacements satisfy the following assumptions. The displacement
$u_3$, i.e., the deflection of the plate in the $\xi_3$-direction, does not depend
on $\xi_3$. We set $w = u_3$, hence
$w = w(\xi_1, \xi_2)$.
We expand the displacements $u_1$ and $u_2$ with respect to $\xi_3$ and take only
linear terms into account, i.e.,
j = 1, 2.

In $\mathcal{E}_{ij}$ we only keep the quadratic terms with k = 3, i.e.,
$\mathcal{E}_{ij} = \tfrac{1}{2}(u_{i,j} + u_{j,i} + w_{,i} w_{,j}), \quad i, j = 1, 2, 3.$ (27)
This is motivated by the assumption that the first-order derivatives of
$u_1$ and $u_2$ are small compared with those of $u_3$.

We now study the purely mathematical consequences of these model as-


sumptions.
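Step 1: Computation of the displacements $u_i$ and of the strain tensor $\mathcal{E}_{ij}$.
From $\mathcal{E}_{13} = \mathcal{E}_{23} = 0$ and (27) it follows that $u_{i,3} = -u_{3,i}$, i = 1, 2, hence
$u_{i,3} = -w_{,i}$. Consequently, for i, j = 1, 2, we have
$u_i = \bar u_i - w_{,i}\,\xi_3,$ (28)
with
$\mathcal{E}_{ij} = \bar{\mathcal{E}}_{ij} - w_{,ij}\,\xi_3,$ (29)
$\bar{\mathcal{E}}_{ij} \overset{\text{def}}{=} \bar e_{ij} + \tfrac{1}{2} w_{,i} w_{,j}, \qquad \bar e_{ij} \overset{\text{def}}{=} \tfrac{1}{2}(\bar u_{i,j} + \bar u_{j,i}).$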
Step 2: Auxiliary quantities $\mu_{ij} = \partial\Lambda(\bar{\mathcal{E}})/\partial\bar{\mathcal{E}}_{ij}$.
We introduce these quantities in order to be able to express the first
variation below in a simple form. The physical interpretation of $\mu_{ij}$ will be
given at the end of this section. Explicitly, we have
i,j = 1, 2.
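By solving this system of equations we obtain
$\bar{\mathcal{E}}_{11} = E^{-1}(\mu_{11} - \mu\,\mu_{22}), \quad \bar{\mathcal{E}}_{22} = E^{-1}(\mu_{22} - \mu\,\mu_{11}),$
$\bar{\mathcal{E}}_{12} = \bar{\mathcal{E}}_{21} = E^{-1}(1 + \mu)\,\mu_{12}.$ (30)
All quantities $\mu_{ij}$ depend only on $\xi_1$ and $\xi_2$.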
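
Formula (30) can be cross-checked numerically against the ansatz (H3). The sketch below is only an illustration under an explicit assumption: the forward relation $\mu_{ij} = \lambda(\bar{\mathcal{E}}_{11} + \bar{\mathcal{E}}_{22})\delta_{ij} + 2\kappa\,\bar{\mathcal{E}}_{ij}$ is obtained here by differentiating $\Lambda$ from (H3) and is not quoted from the text; the function names and sample values are hypothetical. Applying (30) to these $\mu_{ij}$ should return the strains one started from.

```python
def stresses_from_strains(e11, e22, e12, E, mu):
    """mu_ij = lam*(e11 + e22)*delta_ij + 2*kappa*e_ij (assumed, derived from (H3))."""
    lam = E * mu / (1.0 - mu ** 2)
    kappa = E / (2.0 * (1.0 + mu))
    trace = e11 + e22
    return (lam * trace + 2.0 * kappa * e11,
            lam * trace + 2.0 * kappa * e22,
            2.0 * kappa * e12)


def strains_from_stresses(m11, m22, m12, E, mu):
    """Formula (30): strains expressed through the auxiliary quantities mu_ij."""
    return ((m11 - mu * m22) / E,
            (m22 - mu * m11) / E,
            (1.0 + mu) * m12 / E)


E, mu = 210.0, 0.3                      # sample values, chosen only for the check
strains = (1.0e-3, -2.0e-3, 0.5e-3)     # (E11, E22, E12)
recovered = strains_from_stresses(*stresses_from_strains(*strains, E, mu), E, mu)
max_error = max(abs(a - b) for a, b in zip(strains, recovered))
```

The round trip works because $\lambda + 2\kappa = E/(1 - \mu^2)$, which is the algebraic fact behind (30).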

Step 3: Computation of the first variation.


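The original variational problem (25) now becomes
$J \overset{\text{def}}{=} \int_\Omega \int_{-a}^{a} \Lambda(\mathcal{E})\,dx - 2a \int_\Omega K_3 w\,d\xi_1\,d\xi_2 - 2a \int_{\partial\Omega} T_i \bar u_i\,ds = \text{stationary!}$ (31)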

with boundary condition
$w = w_{,1} = w_{,2} = 0$ on $\partial\Omega$. (32)
We thereby replace all $\mathcal{E}_{ij}$ in $\Lambda(\mathcal{E})$ with the expressions in (29). The integrands
then depend on $\bar u_1$, $\bar u_2$, w, and $\xi_3$.
According to our standard procedure for variational problems, we replace
$\bar u_i$ and w with $\bar u_i + t h_i$ and $w + th$,
respectively, where $h_i, h \in C^\infty(\bar\Omega)$. For $w + th$ to be an admissible variation, we
assume that
$h = h_{,1} = h_{,2} = 0$ on $\partial\Omega$.
In this way, the function $w + th$ satisfies the boundary condition (32). The
integral in (31) now depends on the real parameter t, i.e., we obtain J = J(t).
We set $\delta J = J'(0)$.
If $\bar u_1$, $\bar u_2$, w is a solution of (31), (32), then
$\delta J = 0$.

After integration over [-a, a] with respect to $\xi_3$ and integration by parts, we
obtain the key relation
$\delta J = \int_\Omega (A_i h_i + C h)\,d\xi_1\,d\xi_2 + \int_{\partial\Omega} B_i h_i\,ds = 0$
with
$A_i = -2a(\mu_{i1,1} + \mu_{i2,2}), \quad i = 1, 2,$
$B_i = 2a(\mu_{ij} n_j - T_i),$
$C = D\,\Delta^2 w - 2aK_3 + A_i w_{,i} - 2a\,\mu_{ij} w_{,ij}.$
Here, $n = n_1 e_1 + n_2 e_2$ is the outer unit normal vector of $\partial\Omega$.
Step 4: The Euler equations.
Because of the arbitrariness of $h_i$ and h, we obtain, in the usual way, from
$\delta J = 0$, the Euler equations
$A_i = 0$ on $\Omega$, $\quad B_i = 0$ on $\partial\Omega$, $\quad i = 1, 2,$ (33)
$C = 0$ on $\Omega$. (34)

Step 5: Simplification of the Euler equation (33) by using Airy's stress
function V.
If we choose a smooth function V with
$\frac{d}{ds} V_{,2} = T_1$ and $-\frac{d}{ds} V_{,1} = T_2$ on $\partial\Omega$, (35)
and if we set
$\mu_{11} = V_{,22}, \quad \mu_{22} = V_{,11}$ on $\Omega$,
$\mu_{12} = -V_{,12}$ on $\Omega$, (36)
then $\mu_{ij}$ is a solution of (33).
However, in order to obtain the von Karman equations in a rigorous way,
we also need that the converse is true. This is sometimes overlooked in the
engineering literature. In fact, if $\partial\Omega$ is sufficiently smooth, then every smooth
solution $\mu_{ij}$ of (33) can be represented in the form (35), (36). This will be shown
in Problem 65.5.
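
The direct half of Step 5 is a short computation. The following sketch assumes that V is smooth enough for mixed partial derivatives to commute and, for the boundary terms, that the unit tangent belonging to the chosen orientation ($\Omega$ to the left of the curve) is $(d\xi_1/ds, d\xi_2/ds) = (-n_2, n_1)$; this sign convention is an assumption made for the illustration.

```latex
\begin{aligned}
\mu_{11,1} + \mu_{12,2} &= V_{,221} - V_{,122} = 0, \\
\mu_{21,1} + \mu_{22,2} &= -V_{,121} + V_{,112} = 0
  && \Longrightarrow\ A_i = 0 \text{ on } \Omega, \\[4pt]
\frac{d}{ds} V_{,2} &= -V_{,21}\,n_2 + V_{,22}\,n_1 = \mu_{11} n_1 + \mu_{12} n_2 = T_1, \\
-\frac{d}{ds} V_{,1} &= V_{,11}\,n_2 - V_{,12}\,n_1 = \mu_{21} n_1 + \mu_{22} n_2 = T_2
  && \Longrightarrow\ B_i = 0 \text{ on } \partial\Omega .
\end{aligned}
```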
Step 6: Decomposition of $V = U + pU_0$.
Suppose that the boundary forces have the form
$T_i = p\,T_i^0$ on $\partial\Omega$, i = 1, 2
with the real parameter p. Let $p \ne 0$. We want to show that it is possible to
find a uniquely determined decomposition of $V = U + pU_0$, where U is
independent of the boundary forces.
First, assume that V is given with the boundary condition (35). By definition,
U is the unique solution of the classical boundary-value problem
$\Delta^2 U = \Delta^2 V$ on $\Omega$,
$U = U_{,1} = U_{,2} = 0$ on $\partial\Omega$.
Now set $pU_0 = V - U$. Then,
$\Delta^2 U_0 = 0$ on $\Omega$, (37)
and
$pU_0 = V, \quad pU_{0,1} = V_{,1}, \quad pU_{0,2} = V_{,2}$ on $\partial\Omega$. (38)
Hence, the function $pU_0$ is uniquely determined by V. Moreover, (35) implies
that
$p\,\frac{d}{ds} U_{0,2} = T_1, \quad -p\,\frac{d}{ds} U_{0,1} = T_2$ on $\partial\Omega$. (39)

Conversely, if $U_0$ satisfies (37) and (39) and if we choose a function U with
the boundary condition
$U = U_{,1} = U_{,2} = 0$ on $\partial\Omega$,
then (38) holds and hence $V = U + pU_0$ satisfies (35).

Step 7: The second von Karman equation (7).


Inserting (36) into the second Euler equation (34), and using V = U + pU0 ,
we obtain (7).

Step 8: The first von Karman equation (6).

It is important here that the relation
$\bar e_{ij} = \tfrac{1}{2}(\bar u_{i,j} + \bar u_{j,i})$ (40)
implies the so-called compatibility condition
$\bar e_{11,22} + \bar e_{22,11} - 2\bar e_{12,12} = 0.$ (41)
Conversely, if all the $\bar e_{ij}$'s are given, then we obtain the $\bar u_i$'s from (40) in case
the compatibility condition (41) is satisfied. This will be shown in Problem
65.4.
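
That (40) implies (41) follows by inserting (40) and interchanging partial derivatives; the short computation below is a sketch assuming that $\bar u_1$, $\bar u_2$ are three times continuously differentiable.

```latex
\begin{aligned}
\bar e_{11,22} + \bar e_{22,11} - 2\,\bar e_{12,12}
  &= \bar u_{1,122} + \bar u_{2,211} - \bigl(\bar u_{1,212} + \bar u_{2,112}\bigr) \\
  &= \bigl(\bar u_{1,122} - \bar u_{1,212}\bigr) + \bigl(\bar u_{2,211} - \bar u_{2,112}\bigr) = 0 .
\end{aligned}
```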
The second von Karman equation is an equation for the two functions w
and U. We therefore need another equation. Suppose we know w and U. From
U we obtain $\mu_{ij}$ according to (36). From $\mu_{ij}$ we obtain $\bar{\mathcal{E}}_{ij}$ according to (30). In
order to determine the displacements $\bar u_1$ and $\bar u_2$, we have to solve equation
(40). Thus we need the compatibility condition (41).
In fact, the first von Karman equation is nothing other than (41). This
follows, since, according to Step 1,
$\bar e_{ij} = \bar{\mathcal{E}}_{ij} - \tfrac{1}{2} w_{,i} w_{,j},$
and from (30), we can express $\bar{\mathcal{E}}_{ij}$ in terms of $\mu_{ij}$. Formula (41) then yields (6).
Step 9: Computation of the physically important quantities.
Suppose we have a solution w, U of the von Karman equations (6)-(9). We
then need to show how to compute the displacements and the stresses of the
plate.

(i) Displacements. From (36) and (30) we find that
$(w, U) \Rightarrow V = U + pU_0 \Rightarrow \mu_{ij} \Rightarrow \bar{\mathcal{E}}_{ij} \Rightarrow \bar e_{ij}.$

The first von Karman equation (6) guarantees that the compatibility
condition is satisfied. Thus, we can use Problem 65.4 in order to obtain
the displacements $\bar u_1$ and $\bar u_2$ from equation (40). Up to infinitesimally small
rigid motions they are uniquely determined.
From (28) we obtain, in addition, the quantities
$u_1 = \bar u_1 - w_{,1}\,\xi_3$ and $u_2 = \bar u_2 - w_{,2}\,\xi_3$,
i.e., the displacements are known.
(ii) Stresses. According to Section 61.6, we derive the first Piola-Kirchhoff
stress tensor $\sigma$ from the variational principle (25) by using
$\sigma_i^j = \frac{\partial \Lambda}{\partial u_{i,j}}$ for i, j = 1, 2, 3.
If $\Omega'$ is a subregion of $\Omega$, then, after the deformation, the region
$H = \Omega' \times\, ]-a, a[$ becomes $H'$. On the deformed region $H'$, there acts the stress
force
$\int_{\partial H} \sigma n\,dO$
with
$\sigma n = \sum_{i,j=1}^{3} \sigma_i^j n_j e_i,$
where $n = \sum_{j=1}^{3} n_j e_j$ is the outer unit normal vector of the boundary of H.
This follows from Section 61.6.
In particular, we obtain that
$\sigma_i^j = \mu_{ij}$ for $\xi_3 = 0$ and i, j = 1, 2.
This yields the desired physical interpretation of the quantities $\mu_{ij}$.

Remark 65.5 (Alternate Approach). The von Karman equations can also be
obtained by showing that, under suitable assumptions, those equations
represent the first term in an asymptotic expansion derived from the basic equations
of three-dimensional nonlinear elasticity theory. This can be found in Ciarlet
and Rabier (1980, M).

65.8. Principle of Stationary Potential Energy
and Plates with Obstacles
Our goal is to formulate the principle of stationary potential energy in terms
of the vertical displacement $w = u_3$. We start from the fundamental operator
equation of Section 65.4, i.e., we consider again
$w - pLw + Cw = f, \quad w \in X$ (42)
with $X = \mathring{W}_2^2(\Omega)$.

Proposition 65.6 (Principle of Stationary Potential Energy for Plates). Equation
(42) is the Euler equation to the variational problem
$\tfrac{1}{2}\|w\|^2 + \tfrac{1}{4}\|V\|^2 - \tfrac{p}{2}(Lw|w) - (f|w) = \text{stationary!},$
$V = -B(w, w), \quad w \in X.$ (43)

PROOF. According to Problem 65.1, the operator C has the potential
$w \mapsto 4^{-1}\|B(w, w)\|^2$. □

From (14), we obtain $(Lw|w) = b(w, U_0, w)$ and $(f|w) = \int_\Omega K_3 w\,dx$. Hence
the variational problem (43) corresponds to the classical problem
$\int_\Omega \bigl[\tfrac{1}{2}(\Delta w)^2 + \tfrac{1}{4}(\Delta U)^2 - \tfrac{p}{2}\varphi(w, U_0) w - K_3 w\bigr]\,dx = \text{stationary!},$
$\Delta^2 U = -\varphi(w, w)$ on $\Omega$, (43*)
$w = w_{,1} = w_{,2} = 0$ on $\partial\Omega$,
$U = U_{,1} = U_{,2} = 0$ on $\partial\Omega$,
where
$\varphi(w, v) = w_{,11} v_{,22} + v_{,11} w_{,22} - 2 w_{,12} v_{,12}.$
The function $U_0$ satisfies
$p\,\frac{d}{ds} U_{0,2} = T_1, \quad -p\,\frac{d}{ds} U_{0,1} = T_2$ on $\partial\Omega$.
As a consequence of Section 65.3, $T_1$ and $T_2$ correspond to the boundary forces
and $K_3$ corresponds to the volume forces. Recall that we use a special physical
system of units where $2^{-1} E = 2aD^{-1} = 1$.
The advantage of the variational problem (43) over the operator equation
(42) is the fact that it allows a study of the more general problem of plates
with obstacles. This is pictured in Figure 65.2. We proceed here in the same
way as in the case of beams in Section 64.6. According to (43), the elastic
potential energy of the plate is
$F(w) = \tfrac{1}{2}\|w\|^2 + \tfrac{1}{4}\|U(w)\|^2$
with $U(w) = -B(w, w)$, and the work of the outer forces is given by
$pG(w) = \tfrac{p}{2}(Lw|w).$

Figure 65.2

Here we assume that there exist no outer volume forces, i.e., $K_3 = 0$, and hence
f = 0. For the second F-derivatives we obtain
$\langle F''(0)w, v\rangle = (w|v), \quad \langle G''(0)w, v\rangle = (Lw|v).$

Let C be the set of all $w \in X$ with
$w(x) \ge 0$ on $\Omega_1$,
$w(x) \le 0$ on $\Omega_2$,
where $\Omega_1$ and $\Omega_2$ are subsets of $\Omega$. This condition describes the obstacles. We
then study the variational problem
$F(w) - pG(w) = \min!, \quad w \in C.$

The corresponding variational inequality is
(V) $\lambda\langle F'(w), v - w\rangle \ge \langle G'(w), v - w\rangle$ for all $v \in C$
with $\lambda = 1/p$.
We make the following assumptions:
(H1) Let $\Omega$ be a simply connected region in $\mathbb{R}^2$ with $\partial\Omega \in C^{0,1}$. Moreover, let
$\Omega_1$ and $\Omega_2$ be subsets of $\Omega$ such that the set C is nontrivial, i.e., $C \ne \{0\}$.
(H2) There are no outer volume forces, i.e., $K_3 = 0$.
(H3) There exists a fixed vertical displacement $w \in X$ such that the work of
the outer boundary forces is positive, i.e., $(Lw|w) > 0$.
(H4) We are given $U_0 \in W_2^2(\Omega)$ with $\Delta^2 U_0 = 0$ on $\Omega$.

Theorem 65.C (Plates with Obstacles). Assume (H1)-(H4). The maximum problem

$$\lambda_0 = \max_{w \in C} \frac{(Lw|w)}{(w|w)} \tag{M}$$

then has a solution and $\lambda_0 > 0$ is the largest bifurcation value of the variational
inequality (V).

PROOF. Recall that $X = \mathring{W}_2^2(\Omega)$. The embedding $X \subseteq C(\bar\Omega)$ is continuous.
Hence $C$ is a closed, convex cone in $X$.
Proposition 65.2 shows that the convex functional $w \mapsto \|w\|^2$ is weakly
sequentially lower semicontinuous, and that $w \mapsto \|U(w)\|^2$ is weakly sequentially
continuous. Moreover, it follows that $w \mapsto (Lw|w)$ is weakly sequentially continuous.
Hence, the assertion follows from Theorem 64.A and Corollary 64.4. □

In physical terms, this theorem guarantees that there exists a buckling force,
which corresponds to the force parameter $p_0 = 1/\lambda_0$ of the boundary forces.
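
The effect of the obstacle cone $C$ on the maximum problem (M) can be seen in a finite-dimensional toy case. The following is only a sketch under stated assumptions: the symmetric 2×2 matrix `L` is a hypothetical stand-in for the operator $L$, the scalar product is Euclidean, and the cone is the set of vectors with nonnegative components. Because the quotient is homogeneous of degree zero, it suffices to scan unit vectors.

```python
import math

# Hypothetical symmetric 2x2 stand-in for the operator L; (w|w) is the Euclidean product.
L = [[1.0, -2.0],
     [-2.0, 1.0]]

def rayleigh(theta):
    """Quotient (Lw|w)/(w|w) for the unit vector w = (cos theta, sin theta)."""
    w = (math.cos(theta), math.sin(theta))
    Lw = (L[0][0] * w[0] + L[0][1] * w[1],
          L[1][0] * w[0] + L[1][1] * w[1])
    return Lw[0] * w[0] + Lw[1] * w[1]   # denominator (w|w) equals 1

def max_quotient(theta_min, theta_max, steps=2000):
    """Grid search for the maximum of the quotient over an angular sector."""
    best = -math.inf
    for k in range(steps + 1):
        theta = theta_min + (theta_max - theta_min) * k / steps
        best = max(best, rayleigh(theta))
    return best

lam_free = max_quotient(0.0, math.pi)       # no obstacle: all directions (w and -w agree)
lam_cone = max_quotient(0.0, math.pi / 2)   # toy cone C: both components >= 0

print("lambda_0 without obstacle:", lam_free, " p_0 =", 1.0 / lam_free)
print("lambda_0 on the cone C:   ", lam_cone, " p_0 =", 1.0 / lam_cone)
```

In this toy case the restriction to the cone lowers $\lambda_0$ and therefore raises the critical parameter $p_0 = 1/\lambda_0$, which matches the intuition that an obstacle can delay buckling.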
In (M), recall that

$$(w|w) = \int_\Omega (\Delta w)^2\,dx,$$

$$(Lw|w) = \int_\Omega \varphi(w, U_0)\,w\,dx.$$

The boundary forces acting on the plate are given by

$$2a \int_{\partial\Omega} (T_1 e_1 + T_2 e_2)\,ds$$
with
$$T_1 = p\,\frac{d}{ds} U_{0,2}, \qquad T_2 = -p\,\frac{d}{ds} U_{0,1} \quad\text{on } \partial\Omega,$$

and arbitrary $p > 0$. Here $s$ denotes the arclength of $\partial\Omega$. The buckling force
corresponds to $p = p_0$. The function $U_0$ has been introduced above.

Remark 65.7 (Multiply Connected Plates). The existence result (Theorem
64.A) as well as the bifurcation result (Theorem 64.B) and Theorem 64.C
remain valid if $\Omega$ is an arbitrary region, i.e., if $\Omega$ is not simply connected. In
fact, the proofs of Theorems 64.A-64.C do not use the fact that $\Omega$ is simply
connected. Thus these theorems describe properties of the vertical deflection
$w$ for arbitrary plates. In Step 9 of Section 65.7, however, we computed the
horizontal deflections $u_1$ and $u_2$ of the plate by solving the equations
$$\tfrac12(D_i u_j + D_j u_i) = e_{ij} \quad\text{on } \Omega, \quad i, j = 1, 2. \tag{E}$$
In the case that $\Omega$ is simply connected, we can solve (E) for $u_1$ and $u_2$. This is
because the integrability conditions, which correspond to the first von Karman
equation, are satisfied. If $\Omega$ is not simply connected, then a more careful
discussion of equation (E) is required.

PROBLEMS

65.1. Special potential operators. Let $B\colon X \times X \to X$ be a bilinear, bounded, symmetric
operator on the H-space $X$. Set $Cu = B(u, B(u, u))$ and assume that
$$(B(u, v)|w) = (B(w, v)|u) \quad\text{for all } u, v, w \in X.$$
Prove: The map $C\colon X \to X$ is a potential operator.
Solution: According to Section 41.3 we begin with
$$F(u) = \int_0^1 (C(tu)|u)\,dt = \tfrac14\,(B(u, u)|B(u, u)).$$
We find
$$F'(u)h = (B(h, u)|B(u, u)) = (Cu|h) \quad\text{for all } h \in X,$$
and hence $F' = C$.
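
The identity $F' = C$ can be checked numerically in a finite-dimensional model. The sketch below assumes $X = \mathbb{R}^3$ with the Euclidean scalar product and builds $B$ from hypothetical, fully symmetric coefficients `T[i][j][k]`, which guarantees the symmetry assumption of the problem. The directional derivative of $F$ is approximated by central differences.

```python
import itertools
import random

random.seed(0)
n = 3

# Hypothetical fully symmetric coefficients T[i][j][k].
# Then B(u, v)_k = sum_ij T_ijk u_i v_j is bilinear and symmetric,
# and (B(u, v)|w) = (B(w, v)|u) holds.
T = [[[0.0] * n for _ in range(n)] for _ in range(n)]
for i, j, k in itertools.combinations_with_replacement(range(n), 3):
    value = random.uniform(-1.0, 1.0)
    for a, b, c in set(itertools.permutations((i, j, k))):
        T[a][b][c] = value

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def B(u, v):
    return [sum(T[i][j][k] * u[i] * v[j] for i in range(n) for j in range(n))
            for k in range(n)]

def C(u):
    return B(u, B(u, u))

def F(u):
    b = B(u, u)
    return 0.25 * dot(b, b)          # F(u) = (1/4) ||B(u, u)||^2

def directional_derivative(u, h, eps=1e-5):
    """Central-difference approximation of F'(u)h."""
    up = [a + eps * b for a, b in zip(u, h)]
    um = [a - eps * b for a, b in zip(u, h)]
    return (F(up) - F(um)) / (2.0 * eps)

u = [random.uniform(-1.0, 1.0) for _ in range(n)]
v = [random.uniform(-1.0, 1.0) for _ in range(n)]
h = [random.uniform(-1.0, 1.0) for _ in range(n)]

symmetry_gap = abs(dot(B(u, v), h) - dot(B(h, v), u))
potential_gap = abs(directional_derivative(u, h) - dot(C(u), h))
print("symmetry assumption gap:", symmetry_gap)
print("|F'(u)h - (Cu|h)|      :", potential_gap)
```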

65.2. Existence theorem for plates with general boundary forces via a reduction trick.
In order to explain the reduction trick we drop the condition
$$\Delta^2 U_0 = 0 \quad\text{on } \Omega,$$
which was used in Step 6 of Section 65.7. Thus we have to add the term $-p\Delta^2 U_0$
to the right-hand side of equation (6). Moreover, we replace $U_0$ with $\rho U_0$, where
the function $\rho$ is chosen so that $U_0$ and $\rho U_0$ have the same boundary values
and normal derivatives on $\partial\Omega$. Then the corresponding boundary forces in (10)
do not change.
The trick is to make a convenient choice of the function $\rho$. Analogously to
Section 65.4, we then obtain the system
$$U = -B(w, w) + p\rho U_0,$$
$$w = pB(w, \rho U_0) + B(w, U) + f.$$
Let $h = (w, U)$. In the product space $\mathring{W}_2^2(\Omega) \times \mathring{W}_2^2(\Omega)$, we then obtain an operator
equation
$$h - pL_\rho h + Qh = f_0$$
with $f_0 = (p\rho U_0, f)$ and the linear operator $L_\rho$. Furthermore, $Q$ is homogeneous
of degree two. From
$$(B(w, w)|U) = (B(w, U)|w)$$

it follows that $(Qh|h) = 0$. Let $A = I - pL_\rho + Q$. Then we find

$$\frac{(Ah|h)}{\|h\|} \ge \|h\|\,(1 - |p|\,\|L_\rho\|).$$

Suppose that $\rho = 1$. For small $|p|$ the operator $A$ is then coercive. Moreover,
$A$ is a strongly continuous perturbation of the identity. Hence, $A$ is pseudomonotone.
From Theorem 27.A we thus obtain a solution $h$ of the equation
$$Ah = f_0$$
for every $f_0$.
Prove: After a suitable choice of $\rho$ one can always assure that the operator
norm $\|L_\rho\|$ becomes arbitrarily small. The key here is that the operator $A$ with
$\rho = \rho(p)$ is then coercive for arbitrary $p$, and one obtains an existence result for
the von Karman equations with arbitrary boundary forces.
Hint: See Nečas and Hlaváček (1981, M), p. 286.
65.3. Integrability conditions. Let $G$ be a simply connected region in $\mathbb{R}^N$ with $N \ge 1$.
Let $f_i \in C^1(G)$ for $i = 1, \ldots, N$. The differential equation
$$U_{,i} = f_i \quad\text{on } G, \quad i = 1, \ldots, N \tag{44}$$

then has a $C^2$-solution $U$ if and only if the so-called integrability conditions

$$f_{i,j} = f_{j,i} \quad\text{for all } i, j \tag{45}$$

are satisfied.
Up to an additive constant, the solution $U$ is uniquely determined.
In the case that (45) holds, we obtain a solution of (44) through

$$U(x) = U(x_0) + \int_{x_0}^{x} f\,dx, \qquad x \in G$$

for fixed $x_0 \in G$, whereby the line integral is path independent.
The necessity of (45) follows from $U_{,ij} = U_{,ji}$.
This classical result should be used in the following two problems.
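
Path independence under (45) can be illustrated numerically. The following sketch assumes $N = 2$, $G = \mathbb{R}^2$, and two hypothetical fields: `grad_field` is the gradient of $U(x, y) = x^2 y + \sin y$ and satisfies (45), while `rotation_field` violates it. The line integral is approximated by a midpoint rule along two different paths from $(0, 0)$ to $(1, 2)$.

```python
import math

def line_integral(field, path, steps=20000):
    """Midpoint-rule approximation of the line integral of `field` along `path`,
    where `path` maps t in [0, 1] to a point (x, y)."""
    total = 0.0
    for k in range(steps):
        x0, y0 = path(k / steps)
        x1, y1 = path((k + 1) / steps)
        fx, fy = field((x0 + x1) / 2.0, (y0 + y1) / 2.0)
        total += fx * (x1 - x0) + fy * (y1 - y0)
    return total

def grad_field(x, y):
    # f = grad U with U(x, y) = x^2 y + sin(y); here f_{1,2} = 2x = f_{2,1}.
    return (2.0 * x * y, x * x + math.cos(y))

def rotation_field(x, y):
    # f = (-y, x); here f_{1,2} = -1 differs from f_{2,1} = 1.
    return (-y, x)

def straight(t):
    return (t, 2.0 * t)

def corner(t):
    # First along the x-axis to (1, 0), then vertically to (1, 2).
    return (2.0 * t, 0.0) if t <= 0.5 else (1.0, 4.0 * t - 2.0)

exact = 1.0 ** 2 * 2.0 + math.sin(2.0)   # U(1, 2) - U(0, 0)
grad_straight = line_integral(grad_field, straight)
grad_corner = line_integral(grad_field, corner)
rot_straight = line_integral(rotation_field, straight)
rot_corner = line_integral(rotation_field, corner)

print("gradient field:", grad_straight, grad_corner, "exact:", exact)
print("rotation field:", rot_straight, rot_corner)
```

For the gradient field both paths give the same value up to discretization error; for the rotation field the two values differ, so no potential $U$ exists.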
65.4. Computation of the displacements from the linearized strain tensor. Let $G$ be a
simply connected region in $\mathbb{R}^N$. We consider the differential equation

$$\tfrac12(u_{i,j} + u_{j,i}) = \gamma_{ij} \quad\text{on } G, \quad i, j = 1, \ldots, N, \tag{46}$$

where the functions $\gamma_{ij} \in C^2(G)$ with $\gamma_{ij} = \gamma_{ji}$ for all $i, j$ are given.
Prove: Equation (46) has a $C^3$-solution $u = (u_1, \ldots, u_N)$ if and only if the
so-called compatibility conditions
(47)

are satisfied for all $i, j, k = 1, \ldots, N$.
The solution is uniquely determined up to infinitesimally small rigid motions,
i.e., up to functions
$$u_i = a_i + \sum_{j=1}^{N} \omega_{ij}\,\xi_j, \qquad i = 1, \ldots, N \tag{48}$$

with arbitrary constants $a_i$ and $\omega_{ij}$, where $\omega_{ij} = -\omega_{ji}$ for all $i, j$.
Solution: The trick is to define the functions
$$\omega_{ij} = \tfrac12(u_{i,j} - u_{j,i}).$$
From (46) it then follows that
(49)
(50)

(I) Necessity of (47). If $u$ is a solution of (46), then (50) holds, and the
compatibility condition (47) follows from the integrability condition
$$\omega_{ik,jr} = \omega_{ik,rj}.$$
(II) Sufficiency of (47). If (47) holds, then the integrability condition is satisfied
for equation (50). From Problem 65.3 it follows that (50) has a solution $\omega_{ik}$.
By making a suitable choice of the arbitrary constants of the solutions,
one can achieve that
for all k, i. (51)
Equation (50) implies that the integrability condition is satisfied for
equation (49). Thus (49) has a solution $u$, which, as a consequence of (51),
also satisfies the original equation (46).
(III) Uniqueness. Let $u$ be a solution of the original equation (46) with
$\gamma_{ij} = 0$ for all $i, j$.
From (50) it follows that
$$\omega_{ik} = \text{const} \quad\text{for all } i, k,$$
and equation (49) yields (48). Substituting (48) into the original equation
(46) with $\gamma_{ij} = 0$, we obtain that $\omega_{ij} + \omega_{ji} = 0$.
Conversely, the functions $u_i$ in (48) are always solutions of the original
equation (46) with $\gamma_{ij} = 0$.
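
The displays (47), (49), and (50) are not legible in this copy. A standard form that is consistent with the argument in (I)-(III) is sketched below. The index placement is an assumption and may differ from the original displays; the sketch only shows which relations the argument needs.

```latex
% Assumed reconstruction, with \omega_{ij} = \tfrac12 (u_{i,j} - u_{j,i}):
\begin{align*}
u_{i,j} &= \gamma_{ij} + \omega_{ij},
  && \text{(role of (49))} \\
\omega_{ik,j} &= \gamma_{ij,k} - \gamma_{kj,i},
  && \text{(role of (50))} \\
\gamma_{ij,kr} + \gamma_{kr,ij} - \gamma_{kj,ir} - \gamma_{ir,kj} &= 0.
  && \text{(role of (47))}
\end{align*}
% The second line follows from (46) by differentiation:
%   \gamma_{ij,k} - \gamma_{kj,i}
%     = \tfrac12 (u_{i,jk} + u_{j,ik}) - \tfrac12 (u_{k,ji} + u_{j,ki})
%     = \tfrac12 (u_{i,jk} - u_{k,ji}) = \omega_{ik,j}.
% The third line is the integrability condition
%   \omega_{ik,jr} = \omega_{ik,rj}
% written out by means of the second line.
```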
65.5. Equilibrium condition in plane elasticity theory and the Airy stress function V.
Let $\Omega$ be a bounded, simply connected region in $\mathbb{R}^2$ with smooth boundary, i.e.,
$\partial\Omega \in C^1$. For $i = 1, 2$ we consider the boundary-value problem
$$\sigma_{ij,j} = 0 \quad\text{on } \Omega, \tag{52}$$
$$\sigma_{ij} n_j = T_i \quad\text{on } \partial\Omega \tag{53}$$
with $\sigma_{12} = \sigma_{21}$. We sum over equal indices from 1 to 2. Let $n = (n_1, n_2)$
denote the outer unit normal vector to $\partial\Omega$. The system of equations corresponds
to the equilibrium condition
$$\operatorname{div}\sigma = 0 \quad\text{on } \Omega, \tag{52*}$$
$$\sigma n = T \quad\text{on } \partial\Omega \tag{53*}$$
for the stress tensor $\sigma$ in the plane. Suppose $T_1, T_2 \in C^1(\partial\Omega)$ are given.
Prove: The $C^1(\bar\Omega)$-functions $\sigma_{ij}$ are solutions of (52), (53) if and only if there
exists a $C^3(\Omega)$-function $V$ with
$$\sigma_{11} = V_{,22}, \qquad \sigma_{22} = V_{,11}, \qquad \sigma_{12} = -V_{,12} \quad\text{on } \Omega, \tag{54}$$
$$\frac{d}{ds} V_{,2} = T_1, \qquad -\frac{d}{ds} V_{,1} = T_2 \quad\text{on } \partial\Omega. \tag{55}$$
Here, $s$ denotes the arclength on $\partial\Omega$, i.e.,

Solution: From (54), (55) one immediately obtains (52), (53).
Let, conversely, $\sigma_{ij}$ be a solution of equation (52), (53). The trick is then to
construct functions $A$ and $B$ through
$$A_{,1} = -\sigma_{12}, \qquad A_{,2} = \sigma_{11}, \qquad B_{,1} = \sigma_{22}, \qquad B_{,2} = -\sigma_{12}. \tag{56}$$
The integrability conditions for this system are satisfied as a consequence of the
equilibrium condition (52). We construct the desired function $V$ through
$$V_{,1} = B, \qquad V_{,2} = A, \tag{57}$$
and note that, in this case, the integrability conditions are satisfied because of
(56). Finally, we find that
$$(56), (57) \Rightarrow (54) \quad\text{and}\quad (53) \Rightarrow (55).$$
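
The two implications at the end of the solution can be verified line by line. The sketch below assumes the classical form of the Airy relations, $\sigma_{11} = V_{,22}$, $\sigma_{22} = V_{,11}$, $\sigma_{12} = -V_{,12}$, and the parametrization $n = (d\xi_2/ds, -d\xi_1/ds)$ of the outer normal for a positively oriented boundary curve $s \mapsto (\xi_1(s), \xi_2(s))$. Both are assumptions about displays that are only partly legible in this copy.

```latex
% Integrability checks for A_{,1} = -\sigma_{12}, A_{,2} = \sigma_{11},
% B_{,1} = \sigma_{22}, B_{,2} = -\sigma_{12}, V_{,1} = B, V_{,2} = A:
\begin{align*}
A_{,12} = A_{,21} &\iff \sigma_{11,1} + \sigma_{12,2} = 0, \\
B_{,12} = B_{,21} &\iff \sigma_{21,1} + \sigma_{22,2} = 0, \\
V_{,12} = V_{,21} &\iff B_{,2} = A_{,1} \quad (\text{both equal } -\sigma_{12}).
\end{align*}
% Boundary condition, using n_1 = d\xi_2/ds and n_2 = -d\xi_1/ds (assumed):
\begin{align*}
T_1 &= \sigma_{11} n_1 + \sigma_{12} n_2
     = V_{,22}\,\frac{d\xi_2}{ds} + V_{,12}\,\frac{d\xi_1}{ds}
     = \frac{d}{ds} V_{,2}, \\
T_2 &= \sigma_{21} n_1 + \sigma_{22} n_2
     = -V_{,12}\,\frac{d\xi_2}{ds} - V_{,11}\,\frac{d\xi_1}{ds}
     = -\frac{d}{ds} V_{,1}.
\end{align*}
```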

65.6.* Bifurcation for rectangular plates and the general imperfection principle in
bifurcation theory. Consider a plate which is given by the rectangular region
$$\Omega = \,]0, a[\, \times \,]0, b[.$$
It is important that
(i) for noncritical ratios $a:b$ the smallest characteristic number $p_0 > 0$ of the
linearized problem is simple; and
(ii) there exist critical ratios $a:b$ for which $p_0$ is double.
In case (i) Theorem 8.A can be used in order to find a unique bifurcating
branch.
In case (ii) the branching equation has to be studied in detail. In order to
obtain a generic situation one has to use the method of Theorem 8.F (generic
point bifurcation) and methods of catastrophe theory.
The bifurcation diagrams are then rather complicated. This can be seen from
Chow and Hale (1982, M), p. 284.
Let us emphasize the following general imperfection principle in bifurcation
theory:
In order to obtain a complete and natural picture for the bifurcation of
mechanical and more general physical systems, one has to add imperfections.
Mathematically, this means that, in order to obtain a structurally stable situation
in the sense of catastrophe theory, one has to add additional parameters
(see Section 73.16).
In Section 37.28, this principle has already been discussed in connection with
the bifurcation of beams. In this case, one needs to add a small additional
vertical load. With regard to plates one has to add
(a) a small vertical load $\varepsilon K_3$; and
(b) a small vertical displacement $\alpha w_0$ in the case that no outer forces are present.
Hence, the complete (generic) bifurcation diagram depends on the following
parameters:
$a$, $b$ (side lengths of the plate), $p$ (boundary force),
$\varepsilon$ (small vertical load), $\alpha$ (imperfection of the rest state).
In this direction, study Chow and Hale (1982, M).

References to the Literature

Classical works: von Karman (1910), Friedrichs and Stoker (1942) (nonlinear
circular plates), Berger and Fife (1967).
Explicit computations: Dickey (1976, L).
Introduction: Vlasov (1964, M) and Pflüger (1965, M) (shells), Szillard (1974, M)
(plates).
Modern introduction: Ciarlet and Rabier (1980, L), Djubek, Kodnar, and Skaloud
(1983, M) (existence and numerical methods).
Handbook article about plates and shells: Naghdi (1972, B).
Existence theory: Vorovic (1955), Berger and Fife (1967), Langenbach (1976, M),
Ciarlet and Rabier (1980, L), Nečas and Hlaváček (1981, M).
Bifurcation for plates, imperfections, and catastrophe theory: Chow, Hale, and
Mallet-Paret (1975), Chow and Hale (1982, M).
Bifurcation for rectangular plates: Knightly and Sather (1974), List (1978), Recke
(1978), Chow and Hale (1982, M) (recommended as an introduction).
Bifurcation for circular plates: Dickey (1976, L), Antman (1980) (global branches).
Bifurcation for shells: Sather (1976, S), Recke (1978), Knightly and Sather (1980).
Cones and positive eigensolutions for the plate equation: Miersemann (1979).
Buckling of plates with obstacles: Do (1975), (1977), Miersemann (1975), (1981, S),
(1982).
Control of the stability of plates: Beckert (1972), Hofmann (1986) (general theory for
the control of the smallest eigenvalue of self-adjoint compact operators in H-spaces by
using a constructive approach via subgradient methods).
Derivation of the von Karman equations from three-dimensional elasticity via
asymptotic expansions: Ciarlet and Rabier (1980, L).
Numerical methods: Ciarlet (1977, M), Brezzi (1978), Djubek, Kodnar, and Skaloud
(1983, M), Bernadou and Boisserie (1982, M) (finite elements in thin shell theory).
A priori estimates for the stresses in shells and interior shell equations: John (1965),
(1971), (1985, S) (fundamental papers).
Shells: Koiter (1972), (1980), Ramm (1982, M), Niordson (1985, M), Antman (1986).
CHAPTER 66

Convex Analysis, Maximal Monotone
Operators, and Elasto-Viscoplastic
Material with Linear Hardening
and Hysteresis

It was shown by Moreau (1976) that for materials without hardening it is
inevitable, in general, to use irreflexive spaces of $L^\infty$-type or $C$-type. If some kind
of hardening of the material is involved, then the situation is much simpler since
satisfactory a priori estimates for the solutions of the problem are available under
natural assumptions.
Konrad Gröger (1979)

In this chapter we generalize the results of Chapter 60 about the wire to
three-dimensional bodies. Our goal is to clarify the following points.
(i) The subgradient of functionals, which was studied in Part III in the context
of convex analysis, can be used to formulate multivalued constitutive
laws which, in a mathematically elegant form, model plastic behavior and
more generally elasto-viscoplastic behavior.
(ii) We consider slow deformation processes of the form
$$u = u(t),$$
where $u$ and $t$ describe displacement and time, respectively. The basic idea
is to replace the classical strain-stress relation
$$\gamma = F(\sigma)$$
with the multivalued constitutive law
$$\dot\gamma(t) \in \partial F(\sigma(t)).$$
The dot stands for the time derivative, and $\partial F$ is the subgradient of $F$.
In this way we obtain first-order evolution equations which contain
multivalued maximal monotone operators. Hence we are able to apply the
main theorem about these equations (Theorem 55.A).
Recall that the subgradient of convex lower semicontinuous functionals
is maximal monotone (Theorem 47.F).
(iii) In order to model hardening effects which occur in physical experiments,
it is convenient to introduce so-called internal state variables.
(iv) In this chapter we will show that the hardening operator $B$ leads to a
mathematical regularization of the problem. In the proof of the main
theorem in Section 66.3, we shall make essential use of the fact that the
operator $PA$ is symmetric and positive. Note that $PA + B$ is symmetric
and strongly positive. Hence, according to the main theorem on monotone
operators (Theorem 26.A), the inverse operator $(PA + B)^{-1}$ exists, and
can be used in order to introduce an equivalent norm. This greatly
simplifies the original problem.
Summarizing, let us say:
The study of plastic and viscoplastic material automatically
leads to nonlinear problems.
The abstract model of Section 66.1 clearly shows that the heart of elasticity
and plasticity theory has a functional-analytic character. Therefore the use of
methods from functional analysis is natural.
In Section 66.4 we shall apply the abstract model to several concrete
situations.
The reader should also keep the following in mind:
(a) In Chapter 62, as well as the present chapter, we shall use models of plastic
behavior in which the displacements of the body are calculated. Such
behavior is close to elastic behavior (e.g., the slow extension or contraction
of wires).
(b) In Chapter 86 we consider viscoplastic liquids. We compute the slow
velocities of the liquid. A typical example in this regard is the transport of
extremely viscous liquids in pipe lines. This is one of the main problems
in the chemical industry.
We should like to recommend that the reader look again at Chapters 60
and 61 before studying this chapter. In particular, we recommend Section 60.3,
because the following abstract model is a generalization of the concrete situation
of wires which was considered there.
66.1. Abstract Model for Slow Deformation Processes


Our functional-analytic model for the description of slow deformation processes
for elasto-viscoplastic bodies with a linear hardening law will be described
in abstract terms below. The physical meaning of the model will be
discussed in Sections 66.2 and 66.4. We assume that all equations below are
valid for almost all times $t \in [0, t_0]$ with fixed $t_0 > 0$.

A characteristic property of our model is the fact that the observable strain
tensor $\gamma$ and the corresponding observable stress tensor $\sigma$ can be decomposed,
by (i) and (ii) below, into internal state variables $e$, $p$ and $q$, $r$, respectively,
where $p$ is the viscoplastic part of the strain tensor $\gamma$. These quantities are not
directly observable. Our abstract model consists of the following elements:
(i) Deformation process:
$$\gamma(t) = e(t) + p(t).$$
(ii) Stress process:
$$\sigma(t) = q(t) + r(t).$$
(iii) Viscoplastic constitutive law:
$$\dot p(t) \in \partial F(q(t)).$$
(iv) Linear elastic constitutive law:
$$\sigma(t) = Ae(t), \qquad r(t) = Bp(t).$$
(v) Linear relation between displacement and (linearized) strain tensor $\gamma$:
$$\gamma(t) = Du(t).$$
(vi) Equilibrium condition for the stress tensor $\sigma$ and the outer forces $K$:
$$D^*\sigma(t) = K(t).$$
(vii) Generalized inequality of Korn:
$$\|Du\|_\Gamma \ge d\,\|u\|_U \quad\text{for all } u \in U \text{ and fixed } d > 0.$$
(viii) Initial conditions at time $t = 0$:
$$u(0) = u_0, \qquad \gamma(0) = \gamma_0, \qquad p(0) = p_0,$$
$$\sigma(0) = \sigma_0, \qquad q(0) = q_0.$$
We assume that the initial values satisfy the following relations
$$D^*\sigma_0 = K(0),$$
(ix) Spaces: For all times $t \in [0, t_0]$, we have
$$u(t) \in U,$$
$$\gamma(t),\ e(t),\ p(t) \in \Gamma,$$
$$\sigma(t),\ q(t),\ r(t) \in \Gamma^*$$
with the following real, separable H-spaces:
$U$: H-space of the displacements $u$,
$U^*$: H-space of the outer forces $K$,
$\Gamma$: H-space of the strains $\gamma$,
$\Gamma^*$: H-space of the stresses $\sigma$.
As usual, the star denotes the dual space. Our Hilbert space model
clearly shows that there exists a duality between displacement $u$ and
outer force $K$ on the one side, and between strain $\gamma$ and stress $\sigma$ on the
other side. We note that the generalized Korn inequality is very natural
because it implies that
$$\gamma = 0 \;\Rightarrow\; Du = 0 \;\Rightarrow\; u = 0,$$
which means that the vanishing of strain implies the vanishing of
displacement.
(x) Operators. We assume:
$F\colon \Gamma^* \to\, ]-\infty, \infty]$ is convex, lower semicontinuous and $F \not\equiv +\infty$,
$A, B\colon \Gamma \to \Gamma^*$ are linear, continuous, symmetric, and strongly
positive,
$D\colon U \to \Gamma$ is linear and continuous.
Suppose the function
$$t \mapsto K(t)$$

is given which describes the change in time for the outer force. We are looking
for the change in time of
$$\gamma,\ e,\ p \quad\text{and}\quad \sigma,\ q,\ r.$$

If we eliminate $e$ and $r$ by using relations (i) to (vi), then we obtain the new
basic equations
$$D^*(q + Bp) = K,$$
$$q + (A + B)p = ADu, \tag{1}$$
$$\dot p \in \partial F(q).$$
Moreover, we find that
$$D^*\sigma_0 = K(0), \tag{2}$$
where
$$\sigma_0 = A(Du_0 - p_0).$$
Thus we are led to the following problem.

Problem 66.1. Assume that the outer force process
$$K \in W_2^1(0, t_0; U^*),$$
as well as the initial values of the displacement
$$u_0 \in U,$$
the plastic deformation
$$p_0 \in \Gamma,$$
and the viscoplastic stress
$$q_0 \in \Gamma^*$$
with (2) and $F(q_0) < \infty$ are given.
We are looking for a process
$$(u, p, q) \in W_2^1(0, t_0; U \times \Gamma \times \Gamma^*),$$
which satisfies equation (1) and the initial condition
$$(u, p, q)(0) = (u_0, p_0, q_0).$$
The spaces $W_2^1(0, t_0; X)$ with a real, separable H-space $X$ have been introduced
in Section 55.3. Recall that
the map $f\colon [0, t_0] \to X$ belongs to $W_2^1(0, t_0; X)$
if and only if $f$ is continuous and has a generalized derivative on $[0, t_0]$ with
$\dot f \in L_2(0, t_0; X)$, i.e.,

$$\int_0^{t_0} \|\dot f(t)\|^2\,dt < \infty.$$

If one knows a solution of Problem 66.1, then $\gamma$, $e$, and $r$ are easily obtained
from
$$\gamma = Du, \qquad e = \gamma - p, \qquad \sigma = Ae, \qquad r = \sigma - q.$$
From $q + (A + B)p = ADu$ we then obtain the simpler formula
$$r = Bp.$$

66.2. Physical Interpretation of the Abstract Model


An important concrete example will be considered in Section 66.4. Here we
only discuss some general aspects of the model. As in Figure 66.1, our model
can be interpreted as the parallel connection of a viscoplastic element $F$ and
an elastic element $B$, which are both subject to the plastic deformation $p$.
Furthermore, $p$ is related to the plastic stress $q$ through
$$\dot p \in \partial F(q)$$
and to the Hooke stress $r$ through
$$r = Bp.$$
These add up to the total stress
$$\sigma = q + r.$$
This total stress causes another elastic deformation
$$e = A^{-1}\sigma$$
so that the total deformation adds up to
$$\gamma = e + p.$$

Figure 66.1. Strain (deformation) and stress; $F$ = viscoplastic element; $A$, $B$ = elastic elements.
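
The interplay of the elements $F$, $A$, and $B$ can be made concrete in a scalar toy version of the model. The sketch below is an illustration under stated assumptions, not the method of the text: it takes $U = \Gamma = \mathbb{R}$, $D = 1$, $A = a$, $B = b$, and $F$ equal to the indicator function of the interval $[-c, c]$ (purely plastic element), and it advances the load in small implicit steps of return-mapping type. The function name `simulate` and all numerical values are hypothetical.

```python
def simulate(loads, a=2.0, b=1.0, c=1.0):
    """Scalar sketch: sigma = K (equilibrium with D = 1), sigma = q + b*p,
    and the plastic strain p changes only when q reaches the boundary of [-c, c]."""
    p = 0.0
    history = []
    for K in loads:
        sigma = K                       # D* sigma = K with D = 1
        q_trial = sigma - b * p         # q = sigma - r with r = B p, p kept frozen
        if abs(q_trial) <= c:
            q = q_trial                 # elastic step: q stays in C, p unchanged
        else:
            q = c if q_trial > 0 else -c    # plastic step: q sits on the boundary of C
            p = (sigma - q) / b             # from sigma = q + b p
        e = sigma / a                   # elastic strain e = A^{-1} sigma
        gamma = e + p                   # total strain gamma = e + p
        history.append((K, gamma, p, q))
    return history

n = 100
loading = [2.0 * k / n for k in range(n + 1)]            # K goes from 0 up to 2
unloading = [2.0 - 2.0 * k / n for k in range(1, n + 1)]  # and back down to 0
history = simulate(loading + unloading)

K_end, gamma_end, p_end, q_end = history[-1]
print("after the load cycle: K =", K_end, " gamma =", gamma_end,
      " p =", p_end, " q =", q_end)
```

After the load has returned to zero, a residual strain remains. This is the hysteresis effect, and the hardening constant $b$ is what makes the plastic step well defined, since the update divides by $b$.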
There exists a remarkable analogy between Figure 66.1 and electrical circuits.
In terms of electricity, we find that
strain ⇒ voltage,
stress ⇒ current,
strain-stress relation ⇒ Ohm's law.
In case of an electrical circuit, $A$, $B$, and $F$ in Figure 66.1 correspond to
switching elements.
As in the concrete case of the elastoplastic wire of Section 60.3, the operator
$B$ describes a hardening effect. The presence of the physically reasonable
hardening operator $B$ is very welcome from a mathematical point of view,
since the operator
$$PA + B$$
is symmetric and strongly positive and hence has an inverse.
In order to motivate the constitutive law
$$\dot p(t) \in \partial F(q(t))$$
in (iii) of Section 66.1 we now consider several important special cases of the
constitutive law
$$\dot\gamma(t) \in \partial F(\sigma(t))$$
between strain $\gamma$ and stress $\sigma$. In this direction, compare the concrete examples
of Section 60.2. Further concrete examples will be considered in Section 66.4.

EXAMPLE 66.2 (Plastic Behavior). Let $C$ be a convex, closed, nonempty set in
the real H-space $\Gamma^*$. Moreover, let $\chi$ be the indicator function of $C$, i.e.,

$$\chi(\sigma) = \begin{cases} 0 & \text{if } \sigma \in C, \\ +\infty & \text{if } \sigma \notin C. \end{cases}$$
If we set $F = \chi$, then the constitutive law
$$\dot\gamma(t) \in \partial\chi(\sigma(t)) \tag{3}$$
describes plastic behavior. Explicitly, relation (3) means that $\sigma(t) \in C$ and
$$(\sigma(t), \dot\gamma(t)) \ge (\sigma, \dot\gamma(t)) \quad\text{for all } \sigma \in C. \tag{4}$$
This condition admits a direct physical interpretation if we observe that
$$(\sigma, \gamma) \quad\text{with } \gamma = Du$$
is equal to the virtual work which corresponds to the displacement $u$ under
the action of the stress $\sigma$, without taking the outer force $K$ into account. If we
set $\delta\gamma = \dot\gamma(t)\,\Delta t$, then (4) assumes the form
$$(\sigma(t), \delta\gamma) \ge (\sigma, \delta\gamma) \quad\text{for all } \sigma \in C.$$
Therefore, (4) can be regarded as the principle of the maximal virtual plastic
work.
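
The principle of maximal virtual plastic work can be tested on a simple convex set. The sketch below assumes, purely for illustration, that the stress space is $\mathbb{R}^2$ with the Euclidean pairing and that $C$ is the closed disk of radius `c`. For a stress on the boundary of the disk, strain rates pointing along the outward normal satisfy the inequality against all sampled admissible stresses, whereas a tangential strain rate does not.

```python
import math
import random

random.seed(1)
c = 1.0                                           # hypothetical yield radius
sigma = (c * math.cos(0.7), c * math.sin(0.7))    # current stress on the boundary of C

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

def sample_admissible_stress():
    """Random point of the disk C = {tau : |tau| <= c}."""
    radius = c * math.sqrt(random.random())
    angle = 2.0 * math.pi * random.random()
    return (radius * math.cos(angle), radius * math.sin(angle))

def satisfies_max_work(stress, rate, trials=10000, tol=1e-12):
    """Check (stress, rate) >= (tau, rate) for many sampled tau in C."""
    work = dot(stress, rate)
    return all(work >= dot(sample_admissible_stress(), rate) - tol
               for _ in range(trials))

normal_rate = (2.5 * sigma[0], 2.5 * sigma[1])    # outward normal direction at sigma
tangential_rate = (-sigma[1], sigma[0])           # tangent direction at sigma

ok_normal = satisfies_max_work(sigma, normal_rate)
ok_tangential = satisfies_max_work(sigma, tangential_rate)
print("normal strain rate admissible:    ", ok_normal)
print("tangential strain rate admissible:", ok_tangential)
```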
By definition, the total virtual work, caused by the displacement $u$, is equal
to
$$(\sigma, \gamma) - (K, u) \quad\text{with } \gamma = Du,$$
which corresponds to our observations in Section 61.5. The principle of virtual
work is given by
$$(\sigma, Du) - (K, u) = 0 \quad\text{for all } u \in U.$$
This definition is reasonable, because it implies that
$$(D^*\sigma - K, u) = 0 \quad\text{for all } u \in U,$$
i.e.,
$$D^*\sigma = K,$$
and this is the equilibrium condition (vi) of Section 66.1.

EXAMPLE 66.3 (Viscous Behavior). Let $F_1\colon \Gamma^* \to \mathbb{R}$ be a convex $C^1$-functional
on the real H-space $\Gamma^*$. Then, by definition, the constitutive law $\dot\gamma(t) \in \partial F_1(\sigma(t))$,
i.e.,
$$\dot\gamma(t) = F_1'(\sigma(t))$$
describes viscous behavior. In addition, we assume the existence of the inverse
operator $(F_1')^{-1}\colon \Gamma^* \to \Gamma$.

EXAMPLE 66.4 (Viscoplastic Behavior). By setting $F = \chi + F_1$ with $\chi$ and $F_1$
as in the two preceding examples, we obtain from Theorem 47.B that
$$\partial F = \partial\chi + \partial F_1.$$
The constitutive law $\dot\gamma(t) \in \partial F(\sigma(t))$ then corresponds to
$$\dot\gamma(t) \in (\partial\chi + F_1')(\sigma(t)).$$
This is a superposition of plastic and viscous behavior.

66.3. Existence and Uniqueness Theorem


Theorem 66.A (Gröger (1979)). Problem 66.1 has a unique solution.

After some elementary transformations, this theorem is an immediate consequence
of the main theorem about first-order evolution equations with
maximal monotone operators (Theorem 55.A). The main step in the following
proof is (III).
PROOF.

(I) Operator properties. The operator
$$D^*AD \in L(U, U^*)$$
is linear, symmetric, and strongly positive, since
$$(D^*ADu, u) = (ADu, Du) \ge a\|Du\|^2 \ge ad^2\|u\|^2$$
for all $u \in U$ with $a, d > 0$. Consequently, the inverse operator
$$(D^*AD)^{-1} \in L(U^*, U)$$
exists (see Theorem 26.A). We define the operators $S \in L(U^*, \Gamma^*)$ and
$Q, P \in L(\Gamma^*, \Gamma^*)$ through
$$S = AD(D^*AD)^{-1}, \qquad Q = SD^*, \qquad P = I - Q.$$
A simple computation shows that
$$P^2 = P, \tag{5}$$
i.e., $P$ and $Q$ are projection operators. Moreover, we have
$$PAD = 0, \qquad D^*PA = 0, \qquad D^*S = I, \qquad QS = S, \qquad PS = 0. \tag{6}$$
In addition, let us prove that
$$N(Q) = N(D^*), \qquad R(Q) = R(AD). \tag{7}$$
Clearly, we have $N(D^*) \subseteq N(Q)$, and in order to prove $N(Q) \subseteq N(D^*)$ we
consider the equation
$$Qg = AD(D^*AD)^{-1}D^*g = 0,$$
which yields $D^*g = 0$. According to assumptions (vii) and (x) of Section
66.1 the inverse operators
$$D^{-1}\colon R(D) \to U$$
exist and thus we find that $N(Q) = N(D^*)$.
Furthermore, we have the obvious relation $R(Q) \subseteq R(AD)$, and the
equation $h = ADg$ implies that $g = (D^*AD)^{-1}D^*h$, and hence $h = Qh$.
This shows that $R(AD) = R(Q)$.
(II) Equation
$$ADu = g, \qquad u \in U$$
has a unique solution for every $g \in \Gamma^*$ with $Pg = 0$. In fact, since $g \in R(Q)$
and $R(Q) = R(AD)$, there exists a solution to this equation and from
$ADu = 0$ it follows that $u = 0$. Hence the solution is unique and equal
to $u = (D^*AD)^{-1}D^*g$.
(III) Main step. We show that the auxiliary equation
$$S\dot K \in \dot q + (PA + B)\,\partial F(q), \tag{8}$$
$$q(0) = q_0,$$
has a unique solution.
The operator
$$PA \in L(\Gamma, \Gamma^*)$$
is symmetric and positive (see Problem 66.2). Therefore the operator
$$PA + B \in L(\Gamma, \Gamma^*)$$
is symmetric and strongly positive, and hence the inverse operator
$$(PA + B)^{-1} \in L(\Gamma^*, \Gamma)$$
exists. This is the key to our proof. Recall that the operator $B$ describes
hardening effects of the material. Thus, the hardening operator $B$ is
responsible for the regularization of the operator $PA$.
On the H-space $\Gamma^*$, we introduce the equivalent scalar product
$$(x|y)_1 = (x, (PA + B)^{-1}y) \quad\text{for all } x, y \in \Gamma^*$$
and obtain that
$$(PA + B)\,\partial F(q) = \partial_1 F(q),$$
where $\partial_1 F$ denotes the subgradient with respect to $(\cdot|\cdot)_1$. This follows
from the definition of the subgradient in Section 47.3. Therefore, (8)
becomes the equivalent equation
$$S\dot K \in \dot q + \partial_1 F(q), \tag{9}$$
$$q(0) = q_0, \qquad q \in W_2^1(0, t_0; \Gamma^*).$$
According to Theorem 55.A, equation (9) has a unique solution. Note
that the subgradient
$$\partial_1 F\colon \Gamma^* \to 2^{\Gamma^*}$$
is a maximal monotone operator, according to Theorem 47.F.
(IV) Uniqueness for Problem 66.1. Let
$$(u, p, q) \in W_2^1(0, t_0; U \times \Gamma \times \Gamma^*)$$
be a solution of Problem 66.1, i.e.,
$$D^*(q + Bp) = K, \qquad q + (A + B)p = ADu, \qquad \dot p \in \partial F(q), \tag{10}$$
$$(u, p, q)(0) = (u_0, p_0, q_0).$$
After multiplication with $S$ and $P$ we obtain
$$Q(q + Bp) = SK, \qquad Pq + P(A + B)p = 0.$$
Addition yields
$$q + (PA + B)p = SK. \tag{11}$$
Differentiation with respect to time $t$ implies
$$\dot q + (PA + B)\dot p = S\dot K, \tag{12}$$
and $\dot p \in \partial F(q)$ yields (8). Together with (III) this shows that there exists
at most one solution of Problem 66.1.
(V) Existence for Problem 66.1. By reversing the argument in (IV), we show
that the solution $q$ of (8) induces a solution $(u, p, q)$ of Problem 66.1.
We begin by constructing $p$ as the unique solution of equation (11). An
application of the operator $P$ to (11) yields
$$P(q + (A + B)p) = 0.$$
Using (II), we may thus construct $u$ as the unique solution of the equation
$$ADu = q + (A + B)p. \tag{13}$$
In addition, an application of the dual operator $D^*$ to equation (11) yields
$$D^*(q + Bp) = K. \tag{14}$$
From (8) and (12) follows
$$\dot p \in \partial F(q), \tag{15}$$
and thus we obtain from (13) to (15) that $(u, p, q)$ is a solution of the
original equation (10), i.e., a solution of Problem 66.1. □
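
The identity $(PA + B)\,\partial F(q) = \partial_1 F(q)$ from step (III) can be unpacked in a few lines. The following is a sketch that writes $M = PA + B$ as shorthand (a label not used in the text) and assumes the usual definition of the subgradient, with $g \in \Gamma$ and the pairing $(\cdot, \cdot)$ between $\Gamma^*$ and $\Gamma$.

```latex
\begin{align*}
g \in \partial F(q)
  &\iff F(y) \ge F(q) + (y - q,\; g)
        \quad \text{for all } y \in \Gamma^*, \\
  &\iff F(y) \ge F(q) + (y - q,\; M^{-1} M g)
        \quad \text{for all } y \in \Gamma^*, \\
  &\iff F(y) \ge F(q) + (y - q \,|\, M g)_1
        \quad \text{for all } y \in \Gamma^*, \\
  &\iff M g \in \partial_1 F(q).
\end{align*}
```

The third line uses only the definition $(x|y)_1 = (x, M^{-1}y)$, so the change of scalar product absorbs the operator $M$ into the subgradient.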

66.4. Applications
A first application of Theorem 66.A to wires and bars has been given in Section
60.3. Here we want to show how the abstract model of Section 66.1 can
naturally be realized for three-dimensional, homogeneous, isotropic bodies.
We use the same notations as in Sections 61.5 and 62.2.
Let $G$ be the undeformed region in $\mathbb{R}^3$. We assume that $G$ is bounded with
$\partial G \in C^{0,1}$. The deformation of $G$ is given by
$$y = x + u(x)$$
with
$$u = 0 \quad\text{on } \partial_1 G,$$
where $\partial G = \partial_1 G \cup \partial_2 G$ with $\partial_1 G \ne \emptyset$ as in Section 62.2. Therefore, we choose
$u \in U$ with

The strain tensor is
$$\gamma = Du, \qquad Du = \tfrac12(u'(x) + u'(x)^*).$$
If we set
$$\Gamma = L_2(G, L_{\mathrm{sym}}(V_3)),$$
then $\gamma \in \Gamma$. Recall that $\Gamma$ is an H-space with the scalar product

$$(\sigma|\gamma) = \int_G [\sigma(x), \gamma(x)]\,dx.$$

We identify $\Gamma^*$ with $\Gamma$, i.e., we identify the elements $\sigma \in \Gamma$ with $\sigma \in \Gamma^*$ in the
sense of
$$(\sigma, \gamma) = (\sigma|\gamma) \quad\text{for all } \gamma \in \Gamma.$$
Hence we obtain the usual relation $\sigma(x) \in L_{\mathrm{sym}}(V_3)$ for the stress tensor.
The classical Korn inequality tells us that $(Du|Dv)$ is an equivalent scalar
product on the H-space $U$ (see Lemma 62.2). Thus we have
$$\|Du\| \ge d\,\|u\| \quad\text{for all } u \in U \text{ and fixed } d > 0.$$
This coincides with the generalized Korn inequality of Section 66.1.
Let us now consider the outer forces $K$ (volume forces) and $T$ (boundary
forces). Suppose that

Then

$$(K, u) = \int_G Ku\,dx + \int_{\partial_2 G} Tu\,dO \quad\text{for all } u \in U$$

defines a linear, continuous functional $K\colon U \to \mathbb{R}$, i.e.,
$$K \in U^*.$$
This is the reason why the dual space $U^*$ of Section 66.1 is called the space of
outer forces.
According to equation (61.39), the principle of virtual work corresponds to
$$(\sigma, \gamma) - (K, u) = 0 \quad\text{for all } u \in U, \tag{16}$$
where $\gamma = Du$. From Section 66.2 it follows that this is equivalent to the
equilibrium condition
$$D^*\sigma = K. \tag{16*}$$
In Section 61.5g we used integration by parts to see that (16), and hence (16*),
corresponds to the classical equilibrium condition
$$\operatorname{div}\sigma + K = 0 \quad\text{on } G,$$
$$\sigma n = T \quad\text{on } \partial_2 G.$$
To complete our model we will now be concerned with the realization of
the constitutive laws of Section 66.1.

EXAMPLE 66.5 (Plastic Behavior). Let $\varphi\colon L_{\mathrm{sym}}(V_3) \to \mathbb{R}$ be convex and
continuous. We choose $\sigma_0 > 0$ such that the set
$$C = \{\sigma \in \Gamma^* : \varphi(\sigma(x)) \le \sigma_0^2 \text{ on } G\}$$
is not empty. Given $\sigma \in C$ we find that $\sigma(x) \in L_{\mathrm{sym}}(V_3)$ and that $C$ is a closed,
convex subset of $\Gamma^*$ (see Problem 62.8).
Let $\chi$ denote the indicator function of $C$. From Example 66.2, it follows that
the constitutive law
$$\dot p(t) \in \partial\chi(q(t))$$
models plastic behavior. Thus, if we choose
$$\varphi(\sigma) = [\tilde\sigma, \tilde\sigma]$$
with $\tilde\sigma = \sigma - \tfrac13 \operatorname{tr}\sigma\, I$, then Example 61.5 shows that $C$ corresponds to the
plasticity condition of von Mises.
EXAMPLE 66.6 (Linear Elastic Behavior). According to Section 66.1, we need
a constitutive law of the form
$$\sigma = Ae.$$
The idea is to use a linear Hooke's law
$$\sigma(x) = \lambda \operatorname{tr} e(x)\, I + 2\kappa\, e(x).$$
Thus we construct the operator $A\colon \Gamma \to \Gamma^*$ through

$$(Ae, \gamma) = \int_G [\sigma(x), \gamma(x)]\,dx \quad\text{for all } e, \gamma \in \Gamma.$$

Dropping the argument $x$ we find
$$[\sigma, \gamma] = \lambda \operatorname{tr} e \operatorname{tr}\gamma + 2\kappa\,[e, \gamma].$$
This implies
$$(Ae, \gamma) = (A\gamma, e) \quad\text{for all } e, \gamma \in \Gamma,$$
and

$$(Ae, e) \ge 2\kappa \int_G [e, e]\,dx = 2\kappa\|e\|^2 \quad\text{for all } e \in \Gamma,$$

i.e., the operator $A$ is linear, symmetric, and strongly positive.
In the same way one can choose the operator $B$ for the constitutive law
$$r = Bp.$$
Our model is now complete.
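
The pointwise symmetry and positivity of Hooke's law can be checked numerically on random symmetric 3×3 matrices. The sketch below makes three assumptions that are not fixed by the text: the bracket $[\sigma, \gamma]$ is taken to be $\sum_{ij} \sigma_{ij}\gamma_{ij}$, and the material constants `lam` and `kappa` are hypothetical values with `lam >= 0` and `kappa > 0`.

```python
import random

random.seed(2)
lam, kappa = 1.3, 0.8     # hypothetical constants, assumed lam >= 0 and kappa > 0

def random_symmetric():
    m = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i, 3):
            m[i][j] = m[j][i] = random.uniform(-1.0, 1.0)
    return m

def trace(m):
    return m[0][0] + m[1][1] + m[2][2]

def bracket(s, g):
    """[s, g] = sum_ij s_ij g_ij, assumed as the scalar product on symmetric matrices."""
    return sum(s[i][j] * g[i][j] for i in range(3) for j in range(3))

def hooke(e):
    """sigma = lam * tr(e) * I + 2 * kappa * e."""
    t = trace(e)
    return [[lam * t * (1.0 if i == j else 0.0) + 2.0 * kappa * e[i][j]
             for j in range(3)] for i in range(3)]

symmetry_gap = 0.0            # largest |[Ae, g] - [Ag, e]| over the samples
positivity_gap = float("inf") # smallest value of [Ae, e] - 2*kappa*[e, e]
for _ in range(1000):
    e, g = random_symmetric(), random_symmetric()
    symmetry_gap = max(symmetry_gap,
                       abs(bracket(hooke(e), g) - bracket(hooke(g), e)))
    positivity_gap = min(positivity_gap,
                         bracket(hooke(e), e) - 2.0 * kappa * bracket(e, e))

print("max symmetry defect:", symmetry_gap)
print("min of [Ae, e] - 2*kappa*[e, e]:", positivity_gap)
```

The second quantity equals $\lambda(\operatorname{tr} e)^2$, which is why the lower bound with constant $2\kappa$ needs $\lambda \ge 0$ in this sketch.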

PROBLEMS

66.1. Proof of (5) and (6).
Hint: Elementary computation.
66.2. Operator PA. Show that the operator $PA$ in the proof of Theorem 66.A is
symmetric and positive.
Solution: The symmetry follows from
$$(QA)^* = A^*Q^* = AQ^* = ADS^* = (AD)(D^*AD)^{-1}D^*A^* = SD^*A = QA,$$
$$(PA)^* = A^* - (QA)^* = A - QA = PA.$$
The positivity follows from
$$(PA\gamma, \gamma) = (A\gamma, P^*\gamma) = (A(P^*\gamma + Q^*\gamma), P^*\gamma) = (AP^*\gamma, P^*\gamma) \ge 0 \quad\text{for all } \gamma \in \Gamma.$$
Note that we have $PAQ^* = P(QA)^* = PQA = 0$, since $PQ = 0$.
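
Relations (5) and (6), together with the symmetry and positivity of $PA$, can be checked numerically with small matrices. The sketch below assumes $U = \mathbb{R}^2$ and $\Gamma = \Gamma^* = \mathbb{R}^3$ with Euclidean pairings, so that $D^*$ is the transpose of $D$. The matrices `D` and `A` are hypothetical choices: `D` is injective and `A` is symmetric and strongly positive.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def subtract(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def inverse_2x2(M):
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

def max_abs(X):
    return max(abs(x) for row in X for x in row)

I2 = [[1.0, 0.0], [0.0, 1.0]]
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical data: D maps R^2 into R^3 (full column rank), A is symmetric positive definite.
D = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
A = [[2.0, 0.5, 0.0], [0.5, 3.0, 0.2], [0.0, 0.2, 1.5]]
Dt = transpose(D)                      # plays the role of D*

AD = matmul(A, D)
S = matmul(AD, inverse_2x2(matmul(Dt, AD)))   # S = AD (D*AD)^{-1}
Q = matmul(S, Dt)                              # Q = S D*
P = subtract(I3, Q)                            # P = I - Q
PA = matmul(P, A)

residuals = {
    "P^2 - P": max_abs(subtract(matmul(P, P), P)),
    "PAD": max_abs(matmul(PA, D)),
    "D*PA": max_abs(matmul(Dt, PA)),
    "D*S - I": max_abs(subtract(matmul(Dt, S), I2)),
    "QS - S": max_abs(subtract(matmul(Q, S), S)),
    "PS": max_abs(matmul(P, S)),
    "PA - (PA)*": max_abs(subtract(PA, transpose(PA))),
}
for name, value in residuals.items():
    print(f"{name:12s} {value:.2e}")

# Quadratic form (PA g, g) on a few test vectors g; it should be nonnegative.
test_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, -2.0, 0.5]]
quadratic_forms = [sum(g[i] * PA[i][j] * g[j] for i in range(3) for j in range(3))
                   for g in test_vectors]
print("quadratic forms (PA g, g):", quadratic_forms)
```

In this toy setting $PA$ has a nontrivial kernel (it annihilates the range of $D$), which illustrates why $PA$ alone is only positive and why the hardening operator $B$ is needed to obtain a strongly positive operator $PA + B$.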

66.3.* Application to plastic torsion. In Chapter 86 the plastic torsion problem is
considered by using a statical model. Formulate the torsion problem in terms
of our abstract model of Section 66.1 and use Theorem 66.A to prove existence
and uniqueness.
Hint: See Hünlich (1979). This reference also contains numerical results.

References to the Literature

Internal state variables: Nguyen (1973), Gröger (1979, S, B) (general abstract model),
Hünlich (1979) (plastic torsion), Nečas and Hlaváček (1981, M).
Classical works in plasticity theory: See the References to the Literature for Chapter
60.
Textbooks on classical plasticity theory: Hill (1950, M), Prager and Hodge (1951,
M, H), Prager (1955, M), Sokolovskii (1955, M), Kačanov (1969, M).
Handbook of Physics: Flügge (1956), Vol. VIa, Part 3, Geiringer (1972, S).
Explicit examples in plasticity theory: Nguyen (1982, L).
Modern introduction to plasticity theory: Duvaut and Lions (1972, M), Gröger
(1979, S, B), Nečas and Hlaváček (1981, M), Temam (1983a, M).
Existence proofs: Duvaut and Lions (1972, M), Nguyen (1973), Halphen and Nguyen
(1975), Moreau (1976), Gröger (1979), Miersemann (1980) (regularity of generalized
solutions), Nečas and Hlaváček (1981, M), Temam (1983a, M), (1986).
General systems with hysteresis: Krasnoselskii and Pokrovskii (1983, M).
Classical plastic torsion: Prager and Hodge (1951, M), Geiringer (1972, S).
Modern plastic torsion: Duvaut and Lions (1972, M), Lanchon (1974), and Friedman
(1982, M) (introduction).
Ting (1969), (1972, S), Gajewski (1970a), Brezis (1972), Langenbach (1976, M),
Gerhardt (1976), Hünlich (1979), Glowinski (1980, L).
Approximation methods: Glowinski, Lions, and Trémolières (1976, M), Hünlich
(1979), Glowinski (1980, L), (1984, M), Korneev and Lange (1984, L), Miyoshi (1985,
M).
Plasticity, defects in crystals, and the methods of the modern gauge field theory:
Kleinert (1987, P).
APPLICATIONS IN
THERMODYNAMICS

Imperfection of matter sows the seed of death.
Thomas Mann
The greatest joy of a thinking man is to have explored the explorable and just
to admire the unexplorable.
Johann Wolfgang von Goethe

Many processes in nature have a thermodynamical character. Among them
we find cosmological, chemical, and biological processes. During the nineteenth
century a great intellectual achievement was marked by the fact that
it was possible to separate two fundamental concepts from an abundance of
very diversified experimental data, namely: energy and entropy. The concept
of entropy is closely related to the phenomenon of evolution. Through the
second law of thermodynamics, time gets a direction. It is possible to explicitly
distinguish between past and future, because actually all real processes in
nature are irreversible, i.e., unlike in classical mechanics, they cannot
occur in the opposite time direction. The development of modern sciences is
characterized in all subject areas (cosmology, biology, chemistry, physics) by
an increased recognition of the role played by evolution, which leads from
simple to complicated structures. One may think, for example, of the process
of life. However, the particular mechanisms, leading to evolution, are only
vaguely known, since a comprehensive mathematical-physical theory for the
production of entropy is not available.
In Chapter 67 we will be concerned with phenomenological thermodynamics
of the quasi-equilibrium and the equilibrium. A deeper understanding of
thermodynamics, however, is only possible in the context of statistical physics.
This will be discussed in Chapter 68.
The two fundamental physical quantities, which determine the specific


character of thermodynamics, are temperature T and entropy S. In phenomenological
thermodynamics, the existence of these quantities for thermodynamical
equilibrium states and certain quasi-equilibrium states is postulated.
Statistical physics shows that, in fact, T and S are of statistical nature
and meaningful for systems of particles. The systems which occur in macro-
scopical physics are systems with a great number of particles, whereby statisti-
cal fluctuations are small. This is the reason why the statistical character of
macrophysical phenomena is not that obvious. As in the case for chemical
processes, the number of particles may vary. In statistical physics, tempera-
ture T is a parameter on which the probability w of the particular state z
depends. The state z thereby is characterized through energy and number of
particles. In case the number of particles varies, w also depends on the chemical
potential $\mu$, i.e.,
$$w = w(T, \mu).$$
From a mathematical point of view, entropy S is nothing other than informa-
tion. The fundamental concept of information has been introduced in Section
43.11 of Part III. The basic idea is the following. Consider a system $\Sigma$ having
the possible states $z_1, \ldots, z_n$. Let $w_i$ be the probability for the realization of
state $z_i$. The corresponding entropy (information) is then defined by

$$S = -k \sum_{i=1}^{n} w_i \ln w_i.$$

In Section 43.11 we showed that this is a very natural definition. In fact, one
can prove that entropy is uniquely determined by a few very natural axioms
(cf. Problem 43.7e). It is quite remarkable that the concept of entropy was
discovered by physicists in the middle of the nineteenth century, while the
mathematical concept of information was introduced only approximately one
hundred years later. The constant k in the definition of S is the so-called
Boltzmann constant. This universal constant plays a fundamental role in
thermodynamics.
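As a small numerical illustration of the definition of S, the following Python sketch evaluates the entropy of a finite probability distribution. The function name `entropy`, the default value k = 1, and the sample distributions are illustrative assumptions and do not come from the text; terms with $w_i = 0$ are skipped, using the usual convention $0 \ln 0 = 0$.

```python
import math

BOLTZMANN_K = 1.380649e-23  # Boltzmann constant in J/K (SI value)

def entropy(probabilities, k=1.0):
    """Return S = -k * sum(w_i * ln w_i) for a finite distribution."""
    total = sum(probabilities)
    if any(w < 0 for w in probabilities) or abs(total - 1.0) > 1e-12:
        raise ValueError("probabilities must be nonnegative and sum to 1")
    # States with w_i = 0 are skipped (convention 0 * ln 0 = 0).
    return -k * sum(w * math.log(w) for w in probabilities if w > 0)

uniform = [0.25] * 4             # four equally probable states
certain = [1.0, 0.0, 0.0, 0.0]   # one state is realized with certainty

print(entropy(uniform))                 # equals ln 4 when k = 1
print(entropy(certain))                 # vanishes: no uncertainty
print(entropy(uniform, BOLTZMANN_K))    # the same value in J/K
```

For n equally probable states the sum reduces to $k \ln n$, which is the largest value S can take on n states.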
We emphasize, however, that not every arbitrary physical system with a
great number of particles has a temperature and an entropy. Roughly speak-
ing, the system must exhibit a regular probability behavior with respect to
energy and number of particles. A necessary condition for this is that in a
physically meaningful way one can speak of the concepts of mean energy and
average number of particles.
In the following two chapters we shall see that the Lagrange multiplier rule
of Part III plays an important role in phenomenological thermodynamics, as
well as in statistical physics.
The following quotation, taken from the "Scientific Autobiography" of Max
Planck (1858-1947), should give an impression of the historical development
of the fundamental notion of entropy and its relations to probability, statistical
physics, and quantum theory:
My original decision to devote myself to science was a direct result of the
discovery (which has never ceased to fill me with enthusiasm since my early

youth), of the comprehension of the far from obvious fact that the laws of human
reasoning coincide with the laws governing the sequences of impressions we
receive from the world about us; that, therefore, pure reasoning can enable man
to gain an insight into the mechanism of the latter. In this connection, it is of
paramount importance that the outside world is something independent from
man, something absolute, and the quest for the laws which apply to this absolute
appeared to me as the most sublime scientific pursuit in life.
These views were bolstered and furthered by the excellent instruction which
I received, through many years, in the Maximilian Gymnasium in Munich from
my mathematics teacher, Hermann Müller, a middle-aged man with a keen mind
and a great sense of humor, a past master of the art of making his pupils visualize
and understand the meaning of the laws of physics.
After my graduation from the gymnasium, I attended university, first in
Munich for three years, then in Berlin for another year. It was in Berlin that
my scientific horizon widened considerably under the guidance of Hermann
Helmholtz (1821-1894) and Gustav Kirchhoff (1824-1887), whose pupils had
every opportunity to follow their pioneering activities, known and watched all
over the world. I must confess that the lectures of these men netted me no
perceptible gain. It was obvious that Helmholtz never prepared his lectures
properly. We had the unmistakable impression that the class bored him at least
as much as it did us.
Kirchhoff was the very opposite. He would always deliver a carefully prepared
lecture. Not a word too few, not one too many. But it would sound like a
memorized text, dry and monotonous. We would admire him, but not what he
was saying.
Under such circumstances, my only way to quench my thirst for advanced
scientific knowledge was to do my own reading on subjects which interested me.
One day, I happened to come across the treatises of Rudolf Clausius (1822-1888),
whose lucid style and enlightening clarity of reasoning made an enormous
impression on me.
Clausius (1865) deduced his proof of the Second Law of Thermodynamics
from the hypothesis that

heat will not pass spontaneously from a colder to a hotter body.


But this hypothesis must be supplemented by a clarifying explanation. For it is
meant to express not only that heat will not pass from a colder into a warmer
body, but also that it is impossible to transmit, by any means, heat from a colder
into a hotter body without there remaining in nature some change to serve as
compensation.
In my endeavor to clarify this point as fully as possible, I discovered a way
to express this hypothesis in a form which I considered to be simpler and more
convenient, namely:

The process of heat conduction cannot be completely reversed by any means.


A process which in no manner can be completely reversed I called a "natural"
one. The term for it in universal use today, is: "irreversible."
I found the meaning of the Second Law of Thermodynamics in the principle
that:

In every natural process the sum of the entropies of the bodies involved increases.
I worked out these ideas in my doctoral dissertation at the University of Munich,
which I completed in 1879 (at the age of 21 years). The effect of my dissertation
on the physicists of those days was nil. Helmholtz probably did not even read

my paper at all. Kirchhoff expressly disapproved of its contents.... Clausius did
not answer my letters.
The universal acceptance of my thesis was ultimately brought about by
considerations of an altogether different type of argument, unrelated to the
arguments which I had adduced in support of it, namely, by the atomic theory,
as represented by Ludwig Boltzmann (1844-1906).
Boltzmann (1872) succeeded in establishing, for a given gas in a given state,
a function H, which has the property that its value constantly decreases with
time (the famous H-theorem of Boltzmann). It suffices, therefore, to identify the
negative value of this H with entropy, to arrive at the principle of the increase
of entropy. This discovery demonstrated, at the same time, irreversibility to be
a characteristic of the processes occurring in a gas....
My new universal radiation formula was submitted to the Berlin Physical
Society, at the meeting on October 19, 1900. On the very day when I formulated
this law by using purely formal arguments, I began to devote myself to the task
of investing it with a true physical meaning. This quest led me to study the
interrelation of entropy S and probability W. Since the entropy S is an additive
magnitude, but the probability W is a multiplicative one, I simply postulated
that
$$S = k \log W,$$
where k is a universal constant. 1
I found that k represents the so-called absolute gas constant. It is, under-
standably, often called Boltzmann's constant. However, this calls for the comment
that Boltzmann never introduced this constant.
Now, as for the magnitude W with respect to radiation in a cavity, I found
that in order to interpret it as a probability, it was necessary to introduce a
universal constant, which I called h. Since it had the dimension of action (energy
x time), I gave it the name, elementary quantum of action. Thus the nature of
entropy as a measure of probability, in the sense indicated by Boltzmann, was
established in the domain of radiation, too.
While the significance of the quantum of action for the interrelation between
entropy and probability was established, the part played by the new constant
in general physical processes still remained an open question. I therefore tried
immediately to weld the elementary quantum of action into the framework of
classical theory. But in the face of all such attempts continued for a number of
years, this constant showed itself to be obdurate. For it heralded the advent of
something entirely unprecedented (namely, quantum physics), and was destined
to remodel basically the physical outlook and thinking of man which, ever since
Leibniz (1646-1716) and Newton (1643-1727) laid the groundwork for infini-
tesimal calculus, were founded on the assumption that all causal interactions in
nature are continuous.
Max Planck (1945)
The Second Law of Thermodynamics is essentially different from the First
Law (conservation of energy), since it deals with a question in no way touched
upon by the First Law, namely, the direction in which a process takes place in
nature. Not every change, which is consistent with the principle of the conservation
of energy, also satisfies the additional conditions which the Second Law
imposes on the processes which actually take place in nature.

1 This is an excellent example which shows that fundamental relations in physics are often
obtained by extremely simple, but ingenious ideas. Note that W is not a probability in the
terminology of today, but a statistical weight. This will be explained in Problem 68.6.

The Second Law of Thermodynamics states that there exists in nature, for
each system of bodies, a quantity which by all changes of the system either
remains constant (in reversible processes) or increases in value (in irreversible
processes). This fundamental quantity is called, following Clausius (1865), the
entropy of the system.
Max Planck, Treatise on Thermodynamics, (1913)
CHAPTER 67

Phenomenological Thermodynamics
of Quasi-Equilibrium and
Equilibrium States

Temperature, energy, and entropy are state variables, i.e., they are independent
of the history of the body.
The energy of the universe is constant (first law).
The entropy of the universe tends towards a maximum (second law).
Rudolf Clausius (1865)
In the huge factory of natural processes, entropy plays the role of a president,
because it prescribes the way in which the entire course of business takes place.
The energy principle plays the role of the book-keeper by balancing debit and
credit.
Robert Emden (1938)
Only through the second law a certain direction is given to the course of the
world. This is not present in the mechanical world picture.
Arnold Sommerfeld (1954)

The main objectives of phenomenological thermodynamics are:


(i) Characterization of thermodynamical quasi-equilibrium states (Gibbs'
fundamental equation of Section 67.2).
(ii) Characterization of thermodynamical processes (laws of thermodynamics
of Section 67.4).
(iii) Computation of thermodynamical equilibrium states from quasi-equilibrium
states (extremal properties of thermodynamical potentials of
Section 67.6).
For didactical reasons we begin with Gibbs' fundamental equation rather
than with the three laws of thermodynamics. Later on we will discuss their
interrelation. Special emphasis is placed on general principles. Through this
the reader should be able to gain an overview of the entire phenomenological
thermodynamics with its plentiful applications. The mathematical substance of
phenomenological thermodynamics is contained in the solution of:


(a) equations for differential forms (Gibbs' fundamental equation); and
(b) extremal problems with side conditions (Lagrange's multiplier rule).
We consider the following applications:
(α) In connection with (a) we solve Gibbs' fundamental equation in Section
67.3 for a general thermodynamical system with two degrees of freedom.
We thereby obtain the most general mathematical description for gases
which are compatible with the laws of thermodynamics.
(β) In connection with (b) we consider:
Gibbs' phase rule (Section 67.7); Law of mass action (Section 67.8).
These are two famous general thermodynamical relations, which are frequently
used in chemistry.
(γ) In Problems 67.3 and 67.4 we discuss the impossibility of the perpetuum
mobile of the second kind as well as Carnot's cycle. During the nineteenth
century, these two phenomena formed the experimental basis for the
formulation of the second law.
The first and second laws of thermodynamics appear in the physical literature
in the form
$$dE = \delta Q + \delta W, \qquad T\,dS \ge \delta Q. \tag{1}$$

One speaks of "infinitesimally small changes" of inner energy E, heat Q, work
W, and entropy S. Here T denotes the absolute thermodynamical temperature
measured in K (kelvin). However, what is meant more precisely are the
processes
$$\dot{E} = \dot{Q} + \dot{W}, \qquad T\dot{S} \ge \dot{Q}. \tag{2}$$
The dot means derivative with respect to time. Here
E(t), T(t), S(t)
correspond to the values of E, T, S at time t. On the other hand,
Q(t) and W(t)
denote the heat and work (energy), respectively, which are added to the system
during the time interval $[t_0, t]$, for a fixed initial time $t_0$. It is important to keep
the different meanings of E(t), S(t), T(t) on the one hand, and of Q(t), W(t) on
the other hand, in mind. Note that Q and W are always used in connection
with time-dependent processes. It doesn't make any sense to speak of "heat
Q or work W at time t."

Since (1) often leads to misunderstandings, we will only use (2), but the
relation between both notations is obvious.
We emphasize that, for a mathematical treatment of phenomenological
thermodynamics, it is necessary to begin with time-dependent thermodynamical
processes and to formulate the two laws of thermodynamics in form
(2). Often, in the mathematically oriented literature, one finds equations
$$dE = \mathcal{Q} + \mathcal{W}, \qquad T\,dS = \mathcal{Q},$$
where $\mathcal{Q}$ and $\mathcal{W}$ are differential forms. As we shall show in Section 67.4d, this
approach is a special case of (2). But, actually, (2) is more general and allows
the study of changes from quasi-equilibrium states to equilibrium states. This
is one of the main objectives in thermodynamics and will be discussed in
Sections 67.6 to 67.8.
In order to treat thermodynamical problems effectively, one needs to change
parameters, depending on the particular problem. In the language of mani-
folds this means chart change. In fact, the theory is independent of the
particular parametrization. An invariant formulation is obtained by intro-
ducing a state space Z, which is a manifold, and by using differential forms
on Z. For a first glimpse at this chapter we recommend that the reader should
treat differentials in a naive fashion. Precise definitions of differentials and
differential forms on manifolds may be found in Section 73.23. The theory of
manifolds is the natural setting for phenomenological thermodynamics.

67.1. Thermodynamical States, Processes, and State Variables
For an understanding of thermodynamics it is important to give precise
definitions of the three concepts mentioned in the heading. We begin with a
verbal characterization. A precise mathematical formulation can be found in
Section 67.4. We distinguish between:
(α) physical states; and
(β) thermodynamical states (quasi-equilibrium states).
As an example we consider a gas G in a container which consists of N
molecules. In classical mechanics, the physical state of G at time t is characterized
by position vectors $q_i(t)$ and momentum vectors $p_i(t)$, $i = 1, \ldots, N$. If
one knows the forces, then according to Newton's mechanics, one can compute
the motion for all times. If, in addition, each molecule is regarded as an
oscillating and rotating system, then further position and momentum vectors
are needed to characterize the complete classical state. Since N has an order
of magnitude of $10^{23}$, a great number of parameters is needed in order to
describe the system. A mathematical treatment in this setting, which would
lead to practical results, is impossible. The same is true for a quantum
mechanical treatment. Schrödinger's wave function then depends on the N
position vectors $q_i$. However, experience shows that the macroscopical behavior
of many systems with a large number of particles can be described by
substantially fewer parameters. Such states, for which the temperature plays an
important role, are called thermodynamical quasi-equilibrium states.

EXAMPLE 67.1 (Homogeneous Gas). Gas states can completely be described
by volume V, pressure P, and temperature T,
$$T = T(V, P) \quad \text{(equation of state)}.$$
EXAMPLE 67.2 (Inhomogeneous Gas). In order to understand the propagation
of sound waves in the air, the situation of the previous example is not sufficient.
Here states have to be described by the pressure distribution $P = P(x)$, density
$\rho = \rho(x)$, and temperature distribution $T = T(x)$, together with the equation
of state
$$T = T(\rho, P).$$
Sound waves are then time periodic fluctuations of these quantities.
In an abstract setting, quasi-equilibrium states are described by the elements
z of a set Z, which will be called state space. In Example 67.1 the state space
Z consists of all points
$$z = (V, P, T(V, P)) \quad \text{in } \mathbb{R}^3$$
with $T, V \in \mathbb{R}_+$, i.e., Z is a surface in $\mathbb{R}^3$. A state function is a function
$$\Theta = \Theta(z) \quad \text{on } Z.$$
Important examples are temperature T, inner energy E, and entropy S. In
Example 67.1, the state functions can also be represented through
$$\Theta = \Theta(V, P).$$
Furthermore, the representation
$$\Theta = \Theta(V, T) \quad \text{or} \quad \Theta = \Theta(P, T)$$
is possible if the equation T = T( V, P) can be solved for P or V. One thereby
recognizes that states and state functions can be parametrized in very different
ways. This fact is widely used in thermodynamics.
State changes are given by
z = z(t),
i.e., at each time t, the system is in a quasi-equilibrium state. State changes
depend on the particular state space Z. In Example 67.1, we only allow state

changes of the form
$$V = V(t), \quad P = P(t) \quad \text{and} \quad T = T(V(t), P(t)).$$
Such state changes can only be obtained in idealized form, by very slowly
relaxing the gas or compressing it. In Example 67.2 a state change is described
by
$$P = P(x, t), \quad \rho = \rho(x, t), \quad T = T(x, t), \quad T = T(\rho, P). \tag{3}$$
It is possible that at times $t = 0$ and $t = t_1$, i.e., at the beginning and the end
of the process we have
$$P = \text{const}, \quad \rho = \text{const} \quad \text{and} \quad T = \text{const},$$
i.e., we are in the situation of Example 67.1, whereas during the process, we
are in the more general situation (3). Such a process cannot be described in
the state space of Example 67.1.
Strictly speaking, besides z = z(t) for a process, one would also have to
know the heat Q(t) and other forms of energy W(t), which during the time
interval $[t_0, t]$ are added to the system.
The concept of quasi-equilibrium states is introduced as an idealization,
since it greatly increases the number of possible applications. Most important
is the concept of thermodynamical equilibrium states. It means the following.
If a system is subject to time-independent outer conditions, then one expects
that, after a long period of time, the system passes into a state in which it will
remain and which is characterized by a minimal number of parameters. These
states are called thermodynamical equilibrium states. A necessary condition
for the thermodynamical equilibrium is that the system has a common tem-
perature. Quasi-equilibrium states may become equilibrium states. We men-
tioned already earlier that quasi-equilibrium states are described by a smaller
number of parameters than arbitrary physical states. If one passes from a
quasi-equilibrium state to an equilibrium state, then the number of state
parameters may further decrease. Example 67.1 shows a thermodynamical
equilibrium, while in Example 67.2 we only have a quasi-equilibrium. Here
temperature, pressure, and density compensations are still possible.

EXAMPLE 67.3 (Chemical Reactions). In a container of volume V, pressure P,
and temperature T, we have $N_i$ particles of the ith substance with $i = 1, \ldots, n$.
Quasi-equilibriums are described by
$$z = (T, V, \mu, P(T, V, \mu), N(T, V, \mu))$$
with $\mu = (\mu_1, \ldots, \mu_n)$ and $N = (N_1, \ldots, N_n)$. Thereby $\mu_i$ is the chemical potential
of the ith substance. If chemical reactions are possible, then the number of
particles $N_i$ may still change. The thermodynamical equilibrium then corresponds
to the chemical equilibrium, which occurs after a long period of time.
In Section 67.8 we will see how this chemical equilibrium can be computed

(law of mass action). Thereby we use a general method for the computation
of thermodynamical equilibriums, which is discussed in Section 67.7.
From the standpoint of statistical physics, thermodynamical equilibriums
correspond to time-independent probability distributions of the entire system.
In Example 67.1, this is the distribution of gas molecules onto possible
velocities. This is a time-independent Gaussian normal distribution (Maxwell's
velocity distribution of Section 68.3). Quasi-equilibriums occur after dividing
a system into subsystems, and assigning probability distributions to each
subsystem. In Example 67.2, the subsystems consist of small volume elements,
whereby for each such element the situation of Example 67.1 is assumed.
An important advantage of thermodynamical equilibriums and quasi-
equilibriums is the following. In these states the system completely forgets its
history. This, for example, makes it possible to compute the state of our
universe at a temperature of $10^{11}$ K, one hundredth of a second after the Big
Bang (see Section 68.6). Without this "forgetting" we would be in a hopeless
situation, since we do not know the initial conditions of the Big Bang.

67.2. Gibbs' Fundamental Equation


Gibbs' fundamental equation for quasi-equilibrium states is
$$dE = T\,dS - P\,dV + \sum_{i=1}^{r} \mu_i\,dN_i + \sum_{j=1}^{s} A_j\,da_j. \tag{4}$$

The quantities have the following meaning: E is inner energy, T is absolute
temperature, S is entropy, P is pressure, V is volume, $\mu_i$ is chemical potential,
and $N_i$ is number of particles of the ith substance, and $A_j$, $a_j$ are further
parameters. We write $N = (N_1, \ldots, N_r)$, $a = (a_1, \ldots, a_s)$. We are looking for
functions
$$E = E(y), \quad S = S(y), \quad P = P(y), \quad \mu = \mu(y), \quad A = A(y)$$
with $y = (T, V, N, a)$ which satisfy (4). A quasi-equilibrium state is characterized
by
$$z = (y, P(y), \mu(y), A(y)). \tag{5}$$
The collection of all these z forms the state space Z. Explicitly, (4) is a system
of first-order partial differential equations:
$$E_T = TS_T, \qquad E_V = TS_V - P, \qquad E_{N_i} = TS_{N_i} + \mu_i, \qquad E_{a_j} = TS_{a_j} + A_j. \tag{6}$$
In the following section we integrate (6) for an important special case. Every
solution of (4) describes possible quasi-equilibrium states of thermodynamical
systems. In Section 67.4c we will show how (4) can be derived from the laws
of thermodynamics.

In (5), the state space Z has been parametrized with y. Since $y \in \mathbb{R}^{2+r+s}$, the
number 2 + r + s is called the number of degrees of freedom of the system. If
$P$, $\mu$, and $A$ in (5) are $C^1$-functions on an open set, then Z is a $C^1$-manifold
which lies in $\mathbb{R}^{3+2r+2s}$. This manifold is important itself, and not the way in
which it is parametrized by choosing suitable parameters. In Section 67.5 we
will show how equation (4) can be solved in a general form. We will use
thermodynamical potentials.
The importance of the chemical potential $\mu$ will become clear from the
following section and Sections 67.7 and 67.8.

67.3. Applications to Gases and Liquids


We now discuss the important question: Which quantities of a gas must be
experimentally determined in order to compute its inner energy, its entropy,
and its chemical potential? By integrating the fundamental equation (4) we
want to show:
(A) Equation of state $P = p(T, \rho)$ and specific heat capacity $c = c(T, \rho)$ must
be determined experimentally. Thereby $\rho$ denotes the density of the gas.
In fact, it is our goal to give a complete description of the structure of all
thermodynamical systems which solely depend on density $\rho$ and temperature
T. Our observations are applicable as well to liquids. The main result is
contained in Theorem 67.A below.
Quasi-equilibrium states of the gas are described by
z = (T, V, N, P(T, V, N)),
where T is temperature, V is volume, P is pressure, and N is the number of
particles (Fig. 67.1). In fact, this describes thermodynamical equilibrium states
since, by experience, the number of independent parameters cannot be re-
duced. In statistical physics the use of the particle number N is appropriate.
In our phenomenological discussion we want to eliminate the molecular
structure of the gas, and replace N with the mass M of the gas. We then find
$M = m \cdot N$, where m is the molecular mass and
$$\rho = M/V.$$
[Figure 67.1]
The fundamental equation (4) takes the form
$$dE = T\,dS - P\,dV + \mu\,dM \tag{7}$$
with a modified $\mu$. We are looking for $P$, $\mu$, $E$, $S$ as $C^2$-functions of $(T, V, M)$
for $T > 0$, $V > 0$, $M > 0$.
Consider the special process
$$T = T(t), \quad V = V(t), \quad M = M_0 = \text{const},$$
i.e., temperature and volume are changed while the gas mass remains constant.
Let W(t) and Q(t) denote work and heat, respectively, which are added
to the gas during the time interval $[t_0, t]$. We then have
$$\dot{W}(t) = -P(t)\dot{V}(t)$$
with $P(t) = P(T(t), V(t), M_0)$ (see Problem 67.1). The first law of thermodynamics
yields, for the energy balance,
$$\dot{E}(t) = \dot{W}(t) + \dot{Q}(t) \tag{8}$$
with $E(t) = E(T(t), V(t), M_0)$ (see Section 67.4). Now we assume that also
volume V remains constant, i.e., $V(t) = V_0 = \text{const}$, and we set
$$T(t) = T_0 + t.$$
Then
$$C(T_0, V_0, M_0) := \dot{Q}(0)$$
is called the heat capacity (added heat per temperature change). This quantity
can be experimentally determined. From (8) with $\dot{W}(t) = 0$ it follows then that
$$E_T(T_0, V_0, M_0) = C(T_0, V_0, M_0). \tag{9}$$
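The energy balance (8) and the definition of the heat capacity can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it uses the ideal gas relations $PV = NkT$ and $E = 3NkT/2$ (which the text introduces only later, in Example 67.4 and the discussion following it), units with $Nk = 1$, and the hypothetical helper names `simpson`, `work`, and `heat`.

```python
import math

NK = 1.0   # product N*k in arbitrary units (assumption)

def pressure(T, V):
    return NK * T / V          # ideal gas law PV = NkT (assumption)

def energy(T, V):
    return 1.5 * NK * T        # one-atomic ideal gas, E = 3NkT/2 (assumption)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def work(T_of, V_of, V_dot, t):
    """W(t): integral over [0, t] of -P(s) * V'(s) ds."""
    return simpson(lambda s: -pressure(T_of(s), V_of(s)) * V_dot(s), 0.0, t)

def heat(T_of, V_of, V_dot, t):
    """Q(t) from the integrated energy balance E(t) - E(0) = W(t) + Q(t)."""
    dE = energy(T_of(t), V_of(t)) - energy(T_of(0.0), V_of(0.0))
    return dE - work(T_of, V_of, V_dot, t)

# Isothermal expansion: T = 300, the volume doubles on [0, 1].
iso = (lambda t: 300.0, lambda t: 1.0 + t, lambda t: 1.0)
W_iso = work(*iso, 1.0)
Q_iso = heat(*iso, 1.0)

# Constant volume with T(t) = T0 + t: Q(h)/h approximates the heat capacity.
isochoric = (lambda t: 300.0 + t, lambda t: 1.0, lambda t: 0.0)
h = 1e-3
C_estimate = heat(*isochoric, h) / h

print(W_iso, Q_iso, C_estimate)
```

In the isothermal case the energy does not change, so all the work done by the gas is supplied as heat; in the constant-volume case the difference quotient reproduces $E_T = 3Nk/2$, in agreement with (9).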
We now want to separate the dependence on M. To this end, we set
$$E(T, V, M) = e(T, \rho)M, \quad S(T, V, M) = s(T, \rho)M, \quad P(T, V, M) = p(T, \rho). \tag{10}$$
We keep in mind that E and S are extensive quantities, i.e., homogeneous of
degree one in V and M, while P is intensive, i.e., homogeneous of degree zero
in V and M. Moreover, we set
$$C(T, V, M) = c(T, \rho)M, \tag{11}$$
and call e, s, and c specific inner energy, specific entropy, and specific heat
capacity, respectively.
We now assume that we know a $C^2$-solution of the basic equation (7).
Because of
$$V = M/\rho \quad \text{and} \quad dV = dM/\rho - M\,d\rho/\rho^2$$
we obtain
$$(de - T\,ds - p\,d\rho/\rho^2)M + (e - Ts + p/\rho - \mu)\,dM = 0.$$
Since M is arbitrary it follows that
$$\mu = e - Ts + p/\rho \quad \text{and} \quad de = T\,ds + p\,d\rho/\rho^2.$$
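The intermediate computation behind "we obtain" can be written out as follows. This is a routine expansion, sketched here under the assumption that the mass form of the fundamental equation (7) reads $dE = T\,dS - P\,dV + \mu\,dM$ and that the ansatz (10) is inserted term by term:

```latex
\begin{align*}
dE &= M\,de + e\,dM, \qquad T\,dS = TM\,ds + Ts\,dM,\\
P\,dV &= p\left(\frac{dM}{\rho} - \frac{M\,d\rho}{\rho^{2}}\right),\\
0 &= dE - T\,dS + P\,dV - \mu\,dM\\
  &= \left(de - T\,ds - \frac{p\,d\rho}{\rho^{2}}\right)M
     + \left(e - Ts + \frac{p}{\rho} - \mu\right)dM .
\end{align*}
```

Since $e$, $s$, and $p$ depend only on $(T, \rho)$, the first bracket contains no $dM$, so both brackets must vanish separately.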

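Because of $c = e_T$ we find
$$ds = c\,dT/T + (e_\rho/T - p/(T\rho^2))\,d\rho,$$
hence
$$s_T = c/T \quad \text{and} \quad s_\rho = e_\rho/T - p/(T\rho^2).$$
The integrability condition $s_{T\rho} = s_{\rho T}$ yields
$$e_\rho = p/\rho^2 - p_T T/\rho^2, \qquad e_T = c,$$
and the integrability condition $e_{\rho T} = e_{T\rho}$ implies
$$c_\rho = -p_{TT}T/\rho^2. \tag{12}$$
We set $q = (T, \rho)$. Integration yields
$$e(q) = e(q_0) + \int_{q_0}^{q} c\,dT + \rho^{-2}(p - p_T T)\,d\rho,$$
$$s(q) = s(q_0) + \int_{q_0}^{q} T^{-1}c\,dT - \rho^{-2}p_T\,d\rho, \tag{13}$$
$$\mu(q) = e - Ts + p/\rho.$$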
Theorem 67.A (Main Theorem for Gases and Liquids). We are given the
$C^1$-function $c = c(T, \rho)$ and the $C^2$-function $P = p(T, \rho)$ for $T > 0$, $\rho > 0$ with
(12). Then for fixed values $e(q_0)$ and $s(q_0)$ the functions $e$, $s$, and $\mu$ of $q = (T, \rho)$
are uniquely determined through (13) with $q = (T, \rho)$. These integrals are path
independent.

PROOF. Apply the above observations in opposite directions. Because of (12),
the integrability conditions for the integrals in (13) are satisfied. □

For an important example, this theorem illustrates which kinds of information
can be obtained from Gibbs' fundamental equation.

EXAMPLE 67.4 (Ideal Gas). For gases which have a temperature T close to the
"room temperature" $T_0$ and small densities $\rho$ close to $\rho_0$, experiment shows
in first-order approximation the following two results:
$$p = r\rho T \quad \text{(equation of state)}, \tag{14}$$
$$c = \alpha r/2 \quad \text{(specific heat capacity)}. \tag{15}$$
Here r is the gas constant and $\alpha = 3, 5, 6$ (one-atomic, two-atomic, n-atomic
gas with $n \ge 3$). Thereby equation (12) is satisfied and from (13) we obtain for
$e = e(T, \rho)$, $s = s(T, \rho)$, $\mu = \mu(T, \rho)$ the relations
$$e = cT + \text{const},$$
$$s = c\ln(T\rho^{1-\gamma}) + \text{const}, \qquad \gamma = 1 + r/c, \tag{16}$$
$$\mu = e - Ts + p/\rho = rT\ln\rho + \mu_0(T),$$

where $\mu_0(T)$ is defined by the last line. Note that (14), (15) and hence also (16)
are only valid in a neighborhood of $(T_0, \rho_0)$, i.e., not for low temperatures.
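The passage from (13) to (16) and the path independence claimed in Theorem 67.A can be checked numerically for the ideal gas. The following Python sketch is an illustration only: the values r = 1 and α = 3, the two end points, and the helper names `simpson` and `line_integral` are assumptions made for the example. It integrates the two differential forms of (13) along two different L-shaped paths in the (T, ρ)-plane and compares the results with the closed forms (16).

```python
import math

R_GAS = 1.0                 # gas constant r, arbitrary units (assumption)
ALPHA = 3                   # one-atomic gas
C_HEAT = ALPHA * R_GAS / 2  # specific heat capacity (15)

def p(T, rho):
    return R_GAS * rho * T  # equation of state (14)

def p_T(T, rho):
    return R_GAS * rho      # partial derivative of p with respect to T

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Coefficients of dT and d(rho) in the integrals (13) for e and for s.
E_FORM = (lambda T, rho: C_HEAT,
          lambda T, rho: (p(T, rho) - p_T(T, rho) * T) / rho**2)
S_FORM = (lambda T, rho: C_HEAT / T,
          lambda T, rho: -p_T(T, rho) / rho**2)

def line_integral(form, q0, q1, T_first):
    """Integrate along an L-shaped path from q0 to q1 in the (T, rho)-plane."""
    f_T, f_rho = form
    (T0, rho0), (T1, rho1) = q0, q1
    if T_first:
        return (simpson(lambda T: f_T(T, rho0), T0, T1)
                + simpson(lambda rho: f_rho(T1, rho), rho0, rho1))
    return (simpson(lambda rho: f_rho(T0, rho), rho0, rho1)
            + simpson(lambda T: f_T(T, rho1), T0, T1))

q0, q1 = (300.0, 1.0), (350.0, 2.0)
GAMMA = 1 + R_GAS / C_HEAT

def s_closed(T, rho):
    return C_HEAT * math.log(T * rho ** (1 - GAMMA))  # (16), constant dropped

delta_e = [line_integral(E_FORM, q0, q1, flag) for flag in (True, False)]
delta_s = [line_integral(S_FORM, q0, q1, flag) for flag in (True, False)]

print(delta_e, C_HEAT * (q1[0] - q0[0]))
print(delta_s, s_closed(*q1) - s_closed(*q0))
```

For the ideal gas the coefficient of dρ in the integral for e vanishes identically, which is why e depends on T alone in (16).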

For a deeper understanding of (14) and (15) statistical physics is required.
If we neglect oscillations and interactions of the gas molecules, then the inner
energy of the N molecules is equal to the mean kinetic energy. The law of
equipartition of statistical physics (see Problem 68.3) states that the mean
kinetic energy per molecular degree of freedom is equal to kT/2. This implies
$E = \beta NkT/2$, (17)
where k = Boltzmann constant and $\beta$ = number of molecular degrees of
freedom. Moreover, it follows from statistical physics that
$PV = NkT$.
In Section 68.3 this will be proved for one-atomic gases. The pressure, however,
which is exerted onto the container wall is independent of the number of atoms
of the molecule. Hence this relation is generally true. With the density $\rho = M/V$ it follows that
$P = NkT\rho/M$.
A comparison with (14) yields
$r = Nk/M$.
This implies that $e = E/M = \beta rT/2$ and hence
$c = e_T = \beta r/2$.
We thus find that $\alpha = \beta$. This yields a direct physical interpretation of the
values of $\alpha$. Namely,
$\alpha$ = number of the molecular degrees of freedom.
A one-atomic molecule has $\alpha = 3$ translational degrees of freedom. For a
two-atomic or n-atomic molecule, $n \ge 3$, we find in addition two or three
rotational degrees of freedom (see Section 58.13), i.e., we then have $\alpha = 5$ or
$\alpha = 6$, respectively.
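As a small worked illustration, the exponent $\gamma = 1 + r/c$ from (16) can be tabulated for the three values of $\alpha$. Since $c = \alpha r/2$, one gets $\gamma = 1 + 2/\alpha$ independently of r. The function names in this Python sketch are chosen here for illustration only.

```python
def specific_heat_over_r(alpha):
    """c / r = alpha / 2, from equation (15)."""
    return alpha / 2.0


def heat_capacity_ratio(alpha):
    """gamma = 1 + r / c = 1 + 2 / alpha, from (15) and (16)."""
    return 1.0 + 1.0 / specific_heat_over_r(alpha)


if __name__ == "__main__":
    cases = [("one-atomic", 3), ("two-atomic", 5), ("n-atomic, n >= 3", 6)]
    for name, alpha in cases:
        print(name, specific_heat_over_r(alpha), heat_capacity_ratio(alpha))
```

The resulting values of $\gamma$ are 5/3, 7/5, and 4/3.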
In fact, an n-atomic molecule may have further degrees of freedom, for
example, those given by oscillations. The experimental values for $\alpha$ show that
at "room temperature" these additional degrees of freedom are of no significance.
Physicists call these degrees of freedom "frozen." However, at higher
temperatures they become important. At extremely low temperatures near
T = 0, on the other hand, we encounter completely new physical effects, which
can be understood only in the context of quantum theory. This will be
discussed in Chapter 94 of Part V.

67.4. The Three Laws of Thermodynamics

We now explain the physical basis for the fundamental equation (4).

67.4a. First Law or the Theorem about
Conservation of Energy

The key formula is
$\dot{E} = \dot{Q} + \dot{W}$. (18)
More precisely, we start with a $C^1$-manifold Z, which we shall call state space.
Concrete examples have already been given in Section 67.1. The points z in
Z are called quasi-equilibrium states of the system or simply states. A thermodynamical
process consists of a $C^1$-curve
z = z(t) in Z
and the two $C^1$-functions
Q = Q(t) and W = W(t)
on $[t_0, t_1]$. For simplicity, we set $t_0 = 0$. We interpret:
Q(t) = amount of heat which is added to the system during the time interval [0, t].
W(t) = work which is done on the system during the time interval [0, t], i.e.,
if W(t) > 0, then the system gains energy.
More precisely, W(t) represents all kinds of energy, except for heat energy.
Obviously, we must have that
Q(0) = W(0) = 0.
Note that Q(t) and W(t) can be positive as well as negative. If, for example,
Q(t) < 0, then this means negative supply of heat, i.e., during the time interval
[0, t] the system supplies the amount of heat |Q(t)| to the exterior environment.
From a physical point of view, a thermodynamical process describes the
fact that a system passes through quasi-equilibrium states, whereby heat and
energy are added to the system.
The functions on Z are called state functions. The physical importance of
these functions is that they only depend on the state of the system and not on
the process, i.e., in particular, they do not depend on the history of the system.
The first law of thermodynamics states:
(i) There exists a $C^1$-function
$E: Z \to \mathbb{R}$
which is called inner energy.
(ii) Only such processes occur, for which on $[0, t_1]$ equation (18) is satisfied.
Thereby we set E(t) = E(z(t)).
An important consequence of (i) is that the inner energy E is a state function.
If we set
$\Delta E = E(t_b) - E(t_a)$
and, analogously, for Q and W, then by integrating we immediately obtain
from (18) that
$\Delta E = \Delta Q + \Delta W$.
For a cycle with $z(t_a) = z(t_b)$ we have $\Delta E = 0$, hence
$\Delta Q + \Delta W = 0$.
The state space Z is called regular if and only if it admits a $C^1$-function
$T: Z \to \mathbb{R}$.
We call T the absolute thermodynamical temperature or simply temperature.
From a physical point of view, every state of the system has a certain temperature
in this case.
The assumption that a state function T exists is called the zero-th law of
thermodynamics. We emphasize, however, that a restriction to regular state
spaces is too stringent. For a description of temperature fields and a study of
temperature compensation in systems, one needs nonregular state spaces. This
is discussed in Example 67.6 in Section 67.6.
A thermodynamical process is called closed if and only if
$Q(t) \equiv 0$ and $W(t) \equiv 0$.
A system is called closed if and only if in the corresponding state space only
closed thermodynamical processes are allowed. From a physical point of view,
those systems are insulated against supply of outer heat and outer energy.
A thermodynamical process is called adiabatic if and only if
$Q(t) \equiv 0$,
i.e., if during the entire process no heat is added to the system.

67.4b. Second Law or Entropy Theorem

The key formula is
$T\dot{S} \ge \dot{Q}$. (19)

Precisely, the second law of thermodynamics postulates the following:
(i) There exists a $C^1$-function
$S: Z \to \mathbb{R}$
which is called entropy.
(ii) For closed thermodynamical processes, i.e., for processes z = z(t), $Q(t) \equiv 0$
and $W(t) \equiv 0$ on $[0, t_1]$, we have
$\dot{S}(t) \ge 0$ on $[0, t_1]$.
Thereby we set S(t) = S(z(t)).
(iii) For thermodynamical processes on a regular state space, equation (19) is
satisfied. Thereby we set T(t) = T(z(t)).
These assumptions greatly restrict the number of thermodynamical processes
which are possible in nature. An important consequence of (i) is that
entropy S is a state function.
A process
z = z(t), Q = Q(t), W = W(t)
is called reversible if and only if the process can occur with time reversed, i.e.,
also the process
z = z(-t), Q = Q(-t), W = W(-t)
is possible. From (19) follows that
$T(t)\dot{S}(t) \ge \dot{Q}(t)$ and $-T(-t)\dot{S}(-t) \ge -\dot{Q}(-t)$
for a reversible process, hence
$T\dot{S} = \dot{Q}$. (20)
This is a necessary condition for reversible processes. Hence a process, for
which (19) is satisfied on $[0, t_1]$ without having equality everywhere, is
irreversible. All real processes in nature are irreversible. Reversible thermodynamical
processes are idealizations. In order to describe the deviations of
a process from reversibility we define $S_e$ and $S_i$ through
$S_e(t) = S(0) + \int_0^t \dot{Q}(\tau)/T(\tau)\,d\tau$, (21)
and
$S(t) = S_e(t) + S_i(t)$. (22)
According to (21), the quantity $S_e$ describes the change in entropy which is
caused by outer supply of heat Q. The quantity $S_i$, on the other hand, contains
that portion of the entropy change which is caused by inner processes of the
system. From (19) and (21) we obtain
$\dot{S}_i(t) \ge 0$. (23)
In a closed system, for example, we have that Q(t) = 0 for all occurring
processes. This implies $\dot{S}_e(t) = 0$.
To completely understand irreversible processes, we need explicit expressions
for the entropy production $\dot{S}_i(t)$.
The great importance of equation (19) is the fact that it relates entropy and
heat. For reversible processes, the change in entropy can be determined from
(20) by measuring the amount of heat supply. It is important then that
"reversible" means intuitively that the process also can occur backwards in
time. But, in experiments, this can only approximately be realized by using
extremely slow processes.

67.4c. Gibbs' Fundamental Equation for Thermodynamical
Quasi-Equilibrium States

Our objective here is to consider a basic mathematical model for which both
the first and the second law are satisfied. In this model, which frequently is
used in thermodynamics, states are interpreted as thermodynamical quasi-
equilibrium states. One thereby obtains first-order partial differential equa-
tions for the inner energy E and the entropy S.
Our starting point is Gibbs' fundamental equation
$dE = T\,dS - P\,dV + \sum_{i=1}^{r} \mu_i\,dN_i + \sum_{j=1}^{s} A_j\,da_j$. (24)

A possible physical interpretation for these quantities has already been
given in Section 67.2. We set
y = (T, V, N, a),
where y varies on $Y = \mathbb{R}_+ \times X$ and X is an open set in $\mathbb{R}^{1+r+s}$. The state space
Z is the set of all points
$z = (y, P(y), \mu(y), A(y))$
with $y \in Y$, where $\mu = (\mu_1, \dots, \mu_r)$ and $A = (A_1, \dots, A_s)$. Moreover, let
$P, \mu_i, A_j: Y \to \mathbb{R}$
be $C^1$-functions. We assume that there exist two $C^1$-functions
$E, S: Y \to \mathbb{R}$
on Y with (24).
Obviously, E and S can also be considered as $C^1$-functions on Z. We choose
Z as above, since then all physical state parameters are explicitly contained.
In fact, Z and Y are diffeomorphic and hence can in this sense be identified.
We now want to show that in this model both the first and second law of
thermodynamics are satisfied. To this end we consider, analogously to (24),
the two differential forms
$\mathcal{W} = -P\,dV + \sum_i \mu_i\,dN_i + \sum_j A_j\,da_j$,
$\mathcal{Q} = T\,dS$. (25)
A thermodynamical process is now, by definition, given through
z = z(t)
and W = W(t), Q = Q(t) with
$\dot{W} = -P\dot{V} + \sum_i \mu_i \dot{N}_i + \sum_j A_j \dot{a}_j$,
$\dot{Q} = T\dot{S}$,
and W(0) = Q(0) = 0. Thereby one obtains
T(t), V(t), N(t) and a(t)
immediately from z(t). Moreover, we set P(t) = P(y(t)). Analogously, we obtain
$\mu(t)$, A(t) and S(t).
All these processes are reversible. Obviously, we have
$\dot{E} = \dot{Q} + \dot{W}$, $T\dot{S} = \dot{Q}$,
i.e., both the first and the second law are satisfied for all these processes.
In classical analysis, one works with differentials in a naive fashion. The
theory of manifolds, on the other hand, leads to a precise concept of the
differential and the differential form. This will be discussed in detail in Section
73.23. In this sense, $C^k$-differential forms on Z are identical with $C^k$-cotangent
vector fields on Z. In this precise sense, $\mathcal{W}$ and $\mathcal{Q}$ are $C^k$-differential forms on
Z with k = 1 and k = 0, respectively. Furthermore, we have
$dV(\dot{z}(t)) = \dot{V}(t)$
where V(t) = V(z(t)), etc., hence
$\dot{W}(t) = \mathcal{W}(\dot{z}(t))$, $\dot{Q}(t) = \mathcal{Q}(\dot{z}(t))$.

Physically, the states z are interpreted as thermodynamical quasi-equilibrium
states. This model is motivated by the physical idea that it is possible to pass
from one quasi-equilibrium state to another by using reversible processes.
From
$\dot{Q} = T\dot{S}$, Q(0) = 0,
one obtains the heat function Q = Q(t), which describes the amount of heat
which has to be added to the system during the time interval [0, t] to guarantee
a reversible process. The assumption that reversible processes exist is an
idealization.
This model also allows the study of more general thermodynamical processes,
which need not be reversible. Thermodynamical processes are then
given by
z = z(t), W = W(t) and Q = Q(t)
with
$\dot{E} = \dot{Q} + \dot{W}$, $T\dot{S} \ge \dot{Q}$.
Here we set E(t) = E(y(t)) and S(t) = S(y(t)). In this case $W(\cdot)$ and $Q(\cdot)$ are
not computed from the differential forms $\mathcal{W}$ and $\mathcal{Q}$ as above. For example,
one may choose $W(\cdot)$ with $\dot{W} = -P\dot{V}$; i.e., more precisely
$\dot{W}(t) = -P(y(t))\dot{V}(t)$.
Here the outer work which is added to the system results solely from the
volume change and corresponding pressure. This will be studied more precisely
in Problem 67.1.
Equation (24) by itself does not uniquely determine all state functions E, S,
P, $\mu$, and A, but the importance of (24) lies in the fact that this equation yields
conditions which every thermodynamical quasi-equilibrium system of this
type must satisfy. Mathematical analysis then is used to determine which state
functions have to be known in order to uniquely describe the system, i.e., one
obtains the physically important information as to which state functions have to
be experimentally determined. We followed this path in Section 67.3 in the
analysis of gases.

67.4d. Differential Forms


The model of Section 67.4c can be substantially generalized. For this, we
assume:

(H1) Let a real, (n + 1)-dimensional $C^1$-manifold Z be given, which we call
the state space. Thereby Z is $C^1$-diffeomorphic to $Y = \mathbb{R}_+ \times X$, where
X is a real, n-dimensional $C^1$-manifold. In this sense, we can identify Z
with Y. The elements of Y have the form
y = (T, x),
where T > 0 denotes the temperature and $x \in X$ the system parameters.
(H2) Assume that
$E, S: Y \to \mathbb{R}$
are $C^1$-functions and $\mathcal{W}$ is a $C^1$-differential form on Y such that on Y
the following holds:
$dE = \mathcal{Q} + \mathcal{W}$,
$\mathcal{Q} = T\,dS$. (26)
If $(\xi_1, \dots, \xi_n)$ are local coordinates of X, then
$\mathcal{W} = \sum_{i=1}^{n} W_i(T, x)\,d\xi_i$
holds, i.e., $\mathcal{W}$ does not depend on dT.

The thermodynamical processes and the thermodynamical interpretation
are then obtained as in Section 67.4c. In particular, the reversible
thermodynamical processes are obtained from z = z(t), W = W(t), and
Q = Q(t) with
$\dot{W}(t) = \mathcal{W}(\dot{z}(t))$, $\dot{Q}(t) = \mathcal{Q}(\dot{z}(t))$.
This model can immediately be extended to Banach manifolds.

67.4e. Third Law

The key formulas are
$\lim_{T \to 0} S(y) = 0$, $\lim_{T \to 0} S'(y) = 0$ (27)
with y = (T, x), where S' denotes the derivative with respect to y. More precisely,
the third law of thermodynamics postulates the following:
(i) The state space Z satisfies assumption (H1) of Section 67.4d. In particular,
Z is regular.
(ii) For the $C^1$-function entropy $S: Z \to \mathbb{R}$, relation (27) holds for all x.
From a physical point of view, this represents an assumption on the behavior
of the entropy in a neighborhood of the absolute temperature zero
T = 0. In Example 67.4, relation (27) is not satisfied for ideal gases. In fact,
formulas (14)-(16) are only valid for temperatures which are not too low.
Especially (14), i.e., the equation of state
$P = r\rho T$,
is not satisfied for ideal gases in a neighborhood of T = 0. In these regions of
extremely low temperatures, the methods of classical physics do not suffice.
Methods of quantum statistics are required. The third law is closely related
to the behavior of quantum systems in the ground state.

67.5. Change of Variables, Legendre Transformation,
and Thermodynamical Potentials

In thermodynamics, in order to obtain a convenient description for various
different processes, it is important to change the independent variables. In
Section 67.3, for example, in connection with the study of gases we chose T,
V, and N as the independent variables. The pressure was then obtained from
P = P(T, V, N).
For processes with P = const, however, it is more convenient to choose T, P,
N as independent variables. We then have
V = V(T, P, N).
All variable changes in thermodynamics are obtained easily from Legendre
transformations, which have been discussed in Section 58.16. There we also
gave conditions under which the Legendre transformations are possible. In
order not to obscure the simple idea of this method, we will not explicitly state
these conditions below. Our starting point is Gibbs' fundamental equation
$dE = T\,dS - P\,dV + \mu\,dN + A\,da$. (28)
Here we simply write
$\mu\,dN = \sum_{i=1}^{r} \mu_i\,dN_i$ and $A\,da = \sum_{j=1}^{s} A_j\,da_j$.
Furthermore, E, S, P, $\mu$, A are functions of (T, V, N, a).

Inner energy E. A look at (28) shows that it is more convenient to choose
as independent variables S, V, N, a. This yields
E = E(S, V, N, a)
and
$E_S = T$, $E_V = -P$, $E_N = \mu$, $E_a = A$. (29)
For a reversible process we have
$T\dot{S} = \dot{Q}$.
Processes with $Q(t) \equiv 0$ are called adiabatic. Such processes are heat isolated.
For reversible, adiabatic processes we therefore have
S = const.
The general solution of the fundamental equation (28) is obtained by
prescribing an arbitrary $C^1$-function E = E(S, V, N, a). The remaining quantities
then follow from (29). It is assumed thereby that it is possible to uniquely
pass to the variables (S, V, N, a).
Entropy S. From (28) follows
$dS = T^{-1}(dE + P\,dV - \mu\,dN - A\,da)$. (30)
The natural (canonical) independent variables for S are therefore (E, V, N, a).
From
S = S(E, V, N, a)
we obtain
$S_E = 1/T$, $S_V = P/T$, $S_N = -\mu/T$, $S_a = -A/T$.
Free energy F. From (28) and $T\,dS = d(TS) - S\,dT$ follows
$dF = -S\,dT - P\,dV + \mu\,dN + A\,da$ (31)
with F = E - ST. For F = F(T, V, N, a) we find
$F_T = -S$, $F_V = -P$, $F_N = \mu$, $F_a = A$.
In order to understand the meaning of the free energy, we consider a reversible
process. The two laws $\dot{E} = \dot{Q} + \dot{W}$ and $T\dot{S} = \dot{Q}$ imply
$\dot{F} = -S\dot{T} + \dot{W}$.

Table 67.1

Thermodynamical potential | Total differential | Canonical independent variables | Meaning of derivatives
--- | --- | --- | ---
Inner energy E | $dE = T\,dS - P\,dV + \mu\,dN + A\,da$ | E(S, V, N, a) | $E_S = T$, $E_V = -P$, $E_N = \mu$, $E_a = A$
Entropy S | $T\,dS = dE + P\,dV - \mu\,dN - A\,da$ | S(E, V, N, a) | $TS_E = 1$, $TS_V = P$, $TS_N = -\mu$, $TS_a = -A$
Free energy F = E - TS | $dF = -S\,dT - P\,dV + \mu\,dN + A\,da$ | F(T, V, N, a) | $F_T = -S$, $F_V = -P$, $F_N = \mu$, $F_a = A$
Free enthalpy G = F + PV | $dG = -S\,dT + V\,dP + \mu\,dN + A\,da$ | G(T, P, N, a) | $G_T = -S$, $G_P = V$, $G_N = \mu$, $G_a = A$
Enthalpy H = E + PV | $dH = T\,dS + V\,dP + \mu\,dN + A\,da$ | H(S, P, N, a) | $H_S = T$, $H_P = V$, $H_N = \mu$, $H_a = A$
Statistical potential $\Omega = F - \mu N$ | $d\Omega = -S\,dT - P\,dV - N\,d\mu + A\,da$ | $\Omega(T, V, \mu, a)$ | $\Omega_T = -S$, $\Omega_V = -P$, $\Omega_\mu = -N$, $\Omega_a = A$

For isothermal processes, i.e.,
T(t) = const,
we find that $\dot{F} = \dot{W}$ and hence
$\Delta F = \Delta W$.
This means, for reversible, isothermal processes, the change of the free energy
is equal to the work added to the system. The free energy will be applied in
the following chapter in connection with Planck's radiation law and the study
of the universe immediately after the Big Bang.
Table 67.1 gives a survey of further important thermodynamical potentials,
which can be obtained analogously. During the following discussion
we will frequently use statements from Table 67.1.
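The derivative relations of Table 67.1 can be tested numerically on a concrete fundamental relation. The Python sketch below is an illustration only. It assumes, for a one-atomic ideal gas with k = 1 and all additive constants dropped, the entropy $S = N[\ln(V/N) + \tfrac{3}{2}\ln(E/N)]$; this formula and every name in the code are assumptions made here, not taken from the text. Solving for E gives E(S, V, N), and central differences approximate the partial derivatives.

```python
import math


def inner_energy(S, V, N):
    """E(S, V, N) obtained by inverting the assumed entropy relation."""
    return N * (N / V) ** (2.0 / 3.0) * math.exp(2.0 * S / (3.0 * N))


def free_energy(T, V, N):
    """F(T, V, N) = E - T S, using E = 3NT/2 for this assumed model."""
    E = 1.5 * N * T
    S = N * (math.log(V / N) + 1.5 * math.log(E / N))
    return E - T * S


def partial(f, args, i, h=1e-6):
    """Central-difference approximation of the i-th partial derivative."""
    step = h * max(1.0, abs(args[i]))
    up, down = list(args), list(args)
    up[i] += step
    down[i] -= step
    return (f(*up) - f(*down)) / (2.0 * step)


S0, V0, N0 = 5.0, 2.0, 3.0
E0 = inner_energy(S0, V0, N0)

# Relations (29): E_S = T, E_V = -P, E_N = mu.
T0 = partial(inner_energy, (S0, V0, N0), 0)
P0 = -partial(inner_energy, (S0, V0, N0), 1)
MU0 = partial(inner_energy, (S0, V0, N0), 2)

# Free energy row of Table 67.1: F_T = -S, F_V = -P, F_N = mu.
F_T = partial(free_energy, (T0, V0, N0), 0)
F_V = partial(free_energy, (T0, V0, N0), 1)
F_N = partial(free_energy, (T0, V0, N0), 2)

if __name__ == "__main__":
    print("P V =", P0 * V0, " N T =", N0 * T0)
    print("F_T =", F_T, " -S =", -S0)
    print("F_V =", F_V, " -P =", -P0)
    print("F_N =", F_N, " mu =", MU0)
```

Within the finite-difference error, the computed pressure satisfies PV = NT (the equation of state with k = 1), and the derivatives of F reproduce -S, -P, and $\mu$, as the Legendre transformation F = E - TS requires.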

67.6. Extremal Principles for the Computation of
Thermodynamical Equilibrium States

We want to show how equilibrium states can be computed from quasi-equilibrium
states. The idea thereby is the following.
(i) The first and second law imply that, depending on the process, one can
choose suitable thermodynamical potentials which exhibit a favorable
growth behavior with respect to time (e.g., $\dot{S} \ge 0$ or $\dot{F} \le 0$).
(ii) We use this behavior in order to define equilibrium states and then to
compute them.

Principle 67.5 (Entropy). Assume the system is closed, i.e., no energy is added
to the system either in the form of heat or in the form of work. We then have
$\dot{S} \ge 0$
for all processes which occur in the system. For a reversible process we have
$\dot{S} = 0$, i.e., S = const.
PROOF. This follows from the second law. □

EXAMPLE 67.6 (Temperature and Pressure Compensation). We consider a
system $\Sigma$ which consists of n subsystems $\Sigma_j$. We assume that each $\Sigma_j$ is in a
quasi-equilibrium state, which is parametrized by
$y_j = (E_j, V_j, N_j)$.
According to Section 67.5, we obtain from
$S_j = S_j(y_j)$
that
$\partial S_j/\partial E_j = 1/T_j$, $\partial S_j/\partial V_j = P_j/T_j$, $\partial S_j/\partial N_j = -\mu_j/T_j$.
A process in $\Sigma$ is described by $y_j = y_j(t)$ and

for all j. The physicist imagines that the subsystems $\Sigma_j$ are separated by walls
and that those are slowly removed. The systems are assumed to be statistically
independent from each other, i.e., we neglect interactions (e.g., water and
steam). We then find entropy S and inner energy E of $\Sigma$ by adding the
corresponding quantities of $\Sigma_j$. Moreover, we assume that $\Sigma$ is a closed system.
It follows then that
E = const, V = const, N = const.
According to Principle 67.5, only processes with $\dot{S} \ge 0$ may occur in $\Sigma$. We
expect that these processes tend towards a thermodynamical equilibrium as
$t \to \infty$. Therefore, it makes sense to compute these equilibrium states by using
the following extremal principle:
$S = \max!$,
$E = \sum_j E_j = \mathrm{const}$, $V = \sum_j V_j = \mathrm{const}$, $N = \sum_j N_j = \mathrm{const}$. (32)
Assume that all functions $S_j$ are $C^2$. According to Section 43.10 of Part III we
solve this problem by using Lagrange's multiplier rule. For this, we choose
the Lagrange function
$L = S - \alpha E - \beta V - \gamma N$.
The necessary solvability condition for (32) is
$L'(y_0) = 0$,
i.e., all partial derivatives of L with respect to $E_j$, $V_j$, $N_j$ are equal to zero. This
gives
$1/T_j = \alpha$, $P_j/T_j = \beta$, $-\mu_j/T_j = \gamma$ (33)
for all j. Thus, in a thermodynamical equilibrium, all temperatures $T_j$, all
pressures $P_j$, and all chemical potentials $\mu_j$ must equal each other. The values
for the multipliers $\alpha$, $\beta$, $\gamma$ are obtained by inserting (33) into the side conditions
of equation (32). In order to verify that the solution $y_0 = (y_1^0, y_2^0, \dots)$, computed
this way, does actually yield a maximum, we have, according to Section 43.10,
to show that there exists a number c > 0 with
$S''(y_0)(\Delta y)^2 \le -c\,|\Delta y|^2$ (34)
and $\Delta y = y - y_0$ for all y which satisfy the side conditions in (32). Thereby
$S''(y_0)(\Delta y)^2$ is a quadratic form with the second-order partial derivatives of S
at the point $y_0$ as coefficient matrix. If (34) holds, then physicists speak of a
stable thermodynamical equilibrium. They symbolize this by
$\delta^2 S < 0$.
Observe the following fact. According to the general result of Section 43.10,
we obtain equation (34) where S is replaced with L. But since the side conditions
are linear, we find that L''(y) = S''(y).
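The extremal principle (32) can be illustrated by a small numerical experiment. The Python sketch below makes several assumptions that are not part of the text: there are two subsystems, each with the one-atomic ideal-gas entropy $S_j = N_j[\ln(V_j/N_j) + \tfrac{3}{2}\ln(E_j/N_j)]$ (k = 1, constants dropped), and the particle numbers $N_j$ are held fixed so that only energy and volume are exchanged. The total entropy is maximized by a ternary search, and the resulting temperatures and pressures are compared as in (33).

```python
import math


def entropy(E, V, N):
    """Assumed one-atomic ideal-gas entropy, k = 1, constants dropped."""
    return N * (math.log(V / N) + 1.5 * math.log(E / N))


def argmax_concave(f, lo, hi, iterations=200):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


N1, N2 = 1.0, 3.0               # fixed particle numbers (assumption)
E_TOTAL, V_TOTAL = 10.0, 8.0    # side conditions E = const, V = const


def total_entropy(E1, V1):
    return entropy(E1, V1, N1) + entropy(E_TOTAL - E1, V_TOTAL - V1, N2)


# For this model the total entropy is a sum of a function of E1 and a
# function of V1, so one sweep over each coordinate finds the maximum.
EPS = 1e-9
E1 = argmax_concave(lambda e: total_entropy(e, 0.5 * V_TOTAL), EPS, E_TOTAL - EPS)
V1 = argmax_concave(lambda v: total_entropy(E1, v), EPS, V_TOTAL - EPS)
E2, V2 = E_TOTAL - E1, V_TOTAL - V1

# From dS/dE = 1/T and dS/dV = P/T for the assumed entropy.
T1, T2 = 2.0 * E1 / (3.0 * N1), 2.0 * E2 / (3.0 * N2)
P1, P2 = N1 * T1 / V1, N2 * T2 / V2

if __name__ == "__main__":
    print("T1, T2 =", T1, T2)
    print("P1, P2 =", P1, P2)
```

At the computed maximum the two temperatures and the two pressures coincide up to the accuracy of the search, in agreement with the first two conditions in (33).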
EXAMPLE 67.7 (Chemical Reactions). The following observations will be used
later on. We consider, for example, the chemical reaction
$\alpha_1\,\mathrm{SO}_2 + \alpha_2\,\mathrm{O}_2 \rightleftharpoons \alpha_3\,\mathrm{SO}_3$ (35)
with $\alpha_1 = 2$, $\alpha_2 = 1$, $\alpha_3 = 2$. This means that two $\mathrm{SO}_2$-molecules are combined
with one $\mathrm{O}_2$-molecule to form two $\mathrm{SO}_3$-molecules, or the opposite reaction
occurs. Let $N_1(t)$, $N_2(t)$, $N_3(t)$ be the number of $\mathrm{SO}_2$, $\mathrm{O}_2$, $\mathrm{SO}_3$-molecules at
time t, respectively. We then have
$\dot{N}_1 = \alpha_1\beta$, $\dot{N}_2 = \alpha_2\beta$, $\dot{N}_3 = -\alpha_3\beta$,
where $\beta$ is called the reaction velocity. As a more convenient mathematical
notation for chemical reactions we will use in the future:
$\sum_{i=1}^{n} \gamma_i [N_i] = 0$. (36)
In case (35) we have n = 3 and $\gamma_1 = \alpha_1$, $\gamma_2 = \alpha_2$, $\gamma_3 = -\alpha_3$. For general
reactions of the form (36) we have
$\dot{N}_i = \gamma_i\beta$ for all i.
In the following chapter we will also consider reactions of elementary particles
of the form (36).

Principle 67.8 (Free Energy). Assume that no work is done on a given system and
that its temperature is constant. We then have
$\dot{F} \le 0$
for all processes which occur in the system. For reversible processes we obtain
$\dot{F} = 0$, i.e., F = const.
PROOF. The first and second laws imply that $\dot{E} = \dot{Q}$ and $T\dot{S} \ge \dot{Q}$. For F =
E - ST we obtain $\dot{F} = \dot{E} - \dot{S}T - S\dot{T} = \dot{E} - \dot{S}T \le 0$. □

EXAMPLE 67.9 (Photon Gas). Consider a gas which consists of N photons in
a container with a fixed temperature T and fixed volume V. The photons are
emitted from the walls into the container. Each N corresponds to a quasi-equilibrium
state. The free energy is
F = F(T, V, N).
According to Principle 67.8, it makes sense to assume that for fixed T and V
the free energy F has a minimum at the thermodynamical equilibrium with
respect to N. This implies
$\mu = F_N(T, V, N) = 0$.
In Section 68.4 Planck's famous radiation law will be derived from this
relation. In Section 68.5 we will show that Planck's law implies the Stefan-Boltzmann
law for the radiation of black bodies.
EXAMPLE 67.10 (Chemical Reactions). For fixed temperature T and fixed
volume V, we consider n substances in a container. We have
$F = F(T, V, N_1, \dots, N_n)$.
Assume that the chemical reaction (36) occurs inside the container. Thus we
have
$\dot{N}_i = \gamma_i\beta$.
Since $N_i$ depends on the time, we cannot use the assumption F = min! for the
definition of thermodynamical equilibrium, but instead have to use the process
$F(t) = F(T, V, N_1(t), \dots, N_n(t))$.
This is sometimes overlooked in the physical literature. According to Principle
67.8, we can characterize the equilibrium states through the condition
$\dot{F}(t) = 0$.
This yields $\sum_i (\partial F/\partial N_i)\dot{N}_i = 0$, and hence
$\sum_i \gamma_i \mu_i(T, V, N_1, \dots, N_n) = 0$. (37)

Principle 67.11 (Free Enthalpy). Let G = F + PV. Suppose pressure and temperature
of the system are constant, and assume that besides mechanical work
through volume change no further work is done on the system. It follows then that
$\dot{G} \le 0$
for all processes which occur in the system. For reversible processes we have
$\dot{G} = 0$, i.e., G = const.
PROOF. The first and second laws imply $\dot{E} = \dot{Q} - P\dot{V}$ and $T\dot{S} \ge \dot{Q}$. For G =
E - TS + PV we obtain
$\dot{G} = \dot{E} - \dot{T}S - T\dot{S} + \dot{P}V + P\dot{V} = \dot{E} - T\dot{S} + P\dot{V} = \dot{Q} - T\dot{S} \le 0$. □

Applications will be given in the following two sections.

Principle 67.12 (Enthalpy). Let H = E + PV. Suppose the pressure is constant,
the system is heat isolated, and assume that besides mechanical work through
volume changes no further work is done on the system. It follows then that
$\dot{H} = 0$
for all processes which occur in the system, i.e., H = const.

PROOF. The first law implies that $\dot{E} = -P\dot{V}$. For H = E + PV we thus obtain
$\dot{H} = \dot{E} + P\dot{V} = 0$. □

The principle of enthalpy conservation, for example, plays an important
role in connection with liquefaction of air.

67.7. Gibbs' Phase Rule


The phase rule states:
Suppose a system of K substances is given which can be in $\Phi$ phases (gaseous,
liquid, solid, etc.). The system has temperature T and pressure P. No chemical
reactions occur. If the system is in a thermodynamical equilibrium, then it has
$f = 2 + K - \Phi$
degrees of freedom, i.e., the possible values of T, P, and the phase concentrations
depend on f parameters.
Before proving this, we consider, as an example, a system which consists of
water and steam. Here we have K = 1 and $\Phi = 2$. The system has one degree
of freedom. We are free to choose T and obtain
P = P(T).
If water, ice, and steam are in a thermodynamical equilibrium, then we have
K = 1 and $\Phi = 3$. The system has zero degrees of freedom, i.e., T and P are
fixed. This gives the triple point
T = 0.01 °C, P = 0.006 kp/cm².
We now prove the phase rule. Let $N_{k\varphi}$ be the number of particles of the kth
substance in the $\varphi$th phase. According to Principle 67.11 we compute the
equilibrium by using the following extremal principle:
$G(T, P, N) = \min!$,
$N_k = \sum_\varphi N_{k\varphi} = \mathrm{const}$, k = 1, ..., K. (38)
Let N be the tuple of all $N_{k\varphi}$. We set
$L = G - \sum_k \alpha_k N_k$,
and apply Lagrange's multiplier rule of Section 43.10. This gives
$\partial G/\partial N_{k\varphi} = \alpha_k$,
hence $\mu_{k\varphi} = \alpha_k$, i.e.,
$\mu_{k1} = \mu_{k2} = \dots = \mu_{k\Phi}$, k = 1, ..., K. (39)
We now assume that for fixed $\varphi$ the quantity $\mu_{k\varphi}$ only depends on P, T, and
the concentrations in the $\varphi$th phase. These concentrations are
$c_{r\varphi} = N_{r\varphi} / \sum_k N_{k\varphi}$, r = 1, ..., K.
Since the sum of all these concentrations is equal to one, $\mu_{k\varphi}$ only depends on
2 + (K - 1) variables. Relation (39) therefore contains $K(\Phi - 1)$ equations for
$2 + \Phi(K - 1)$ variables. Thus in the regular case the solution of equation (39)
depends on
$f = 2 + \Phi(K - 1) - K(\Phi - 1) = 2 - \Phi + K$
parameters.
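The counting argument of the proof is easy to mirror in a few lines of Python. The sketch below is an illustration with names chosen here; it counts the variables and equations of (39) separately and returns their difference.

```python
def degrees_of_freedom(substances, phases):
    """Gibbs' phase rule via the counting argument for relation (39)."""
    # T and P, plus K - 1 independent concentrations in each phase.
    variables = 2 + phases * (substances - 1)
    # For each substance, Phi - 1 equalities between chemical potentials.
    equations = substances * (phases - 1)
    return variables - equations


if __name__ == "__main__":
    print("water and steam:", degrees_of_freedom(1, 2))
    print("water, ice, and steam:", degrees_of_freedom(1, 3))
```

For water and steam this gives one degree of freedom, and for water, ice, and steam it gives zero, as in the examples discussed before the proof.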

67.8. Applications to the Law of Mass Action

As in Example 67.7 we consider a chemical reaction
$\sum_{i=1}^{n} \gamma_i [N_i] = 0$ (40)
between n substances. Let $c_i = N_i / \sum_j N_j$ be the concentration of the ith substance.
Then
$\prod_{i=1}^{n} c_i^{\gamma_i} = K(T, P)$ (41)
is satisfied where K is a constant, depending on temperature T and pressure
P. Strictly speaking, (41) is valid only for ideal gases, but, approximately, this
relation can also be applied to real gases and liquids.
We will now derive (41). From (40) follows that
$\dot{N}_i = \gamma_i\beta$.
We fix T and P and consider the free enthalpy
$G = G(P, T, N_1, \dots, N_n)$.
Analogous to Example 67.10 and, according to Principle 67.11, we now use
the condition
$\dot{G} = 0$
to determine the equilibrium. This yields $\sum_i (\partial G/\partial N_i)\dot{N}_i = 0$, and hence
$\sum_i \gamma_i \mu_i = 0$. (42)
The chemical potential of an ideal gas satisfies
$\mu_i = kT \ln P_i + \mu_{0i}(T)$. (43)
This follows from Example 67.4 by simply changing the notation. In this
example we used the mass M = mN instead of N. Hence $\mu$ in (16) has to be
divided by m. Note also that we use $P = NkT\rho/M$ in (16). The quantity $P_i$ in
(43) denotes the pressure of the ith gas. We assume that the gases do not
interact with each other. From Example 67.4 it follows then that
$PV = NkT$ and $P_i V = N_i kT$.
This implies $P_i = Pc_i$, and hence (41) follows from $\exp \sum_i \gamma_i \mu_i = 1$ and (43).
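To see how (41) determines an equilibrium composition, one can solve it numerically for the reaction (35) with $\gamma = (2, 1, -2)$. The Python sketch below is an illustration under stated assumptions: the value K = 0.5 and the initial particle numbers are hypothetical, and the reaction is parametrized by a reaction extent $\xi$ with $N_i = N_i^0 + \gamma_i\xi$, which is the integrated form of $\dot{N}_i = \gamma_i\beta$. The quantity $\sum_i \gamma_i \ln c_i$ is nondecreasing in $\xi$ (by the Cauchy-Schwarz inequality), so bisection applies.

```python
import math

GAMMA = (2, 1, -2)  # 2 SO2 + O2 <-> 2 SO3 in the sign convention of (36)


def concentrations(numbers):
    """c_i = N_i / sum_j N_j."""
    total = sum(numbers)
    return [n / total for n in numbers]


def log_mass_action(numbers, gamma=GAMMA):
    """Logarithm of the left-hand side of (41)."""
    return sum(g * math.log(c) for g, c in zip(gamma, concentrations(numbers)))


def equilibrium(initial, K, gamma=GAMMA, iterations=200):
    """Bisection on the reaction extent xi so that prod c_i^gamma_i = K."""
    # Admissible extents keep every particle number positive.
    lo = max(-n / g for n, g in zip(initial, gamma) if g > 0)
    hi = min(-n / g for n, g in zip(initial, gamma) if g < 0)
    target = math.log(K)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        numbers = [n + g * mid for n, g in zip(initial, gamma)]
        if log_mass_action(numbers, gamma) < target:
            lo = mid
        else:
            hi = mid
    xi = 0.5 * (lo + hi)
    return [n + g * xi for n, g in zip(initial, gamma)]


if __name__ == "__main__":
    result = equilibrium((1.0, 1.0, 1.0), K=0.5)  # hypothetical data
    c = concentrations(result)
    print("particle numbers:", result)
    print("mass action product:", c[0] ** 2 * c[1] / c[2] ** 2)
```

The sum $N_1 + N_3$ stays constant along the reaction, which reflects the conservation of sulfur atoms.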

PROBLEMS

67.1. Power during volume increase. Assume a gas in a bounded region G of $\mathbb{R}^3$ is under
pressure P. We increase the volume V by transforming G via
y(t) = x + u(t)
into G(t). Let W(t) denote the work done during the time interval [0, t]. Compute
the power $\dot{W}(0)$.
Solution: An application of the transformation formula for volume integrals
and integration by parts yields
$V(t) = \int_{G(t)} dy = \int_G dx + t \int_{\partial G} n\,\dot{u}(0)\,dO + o(t)$, $t \to 0$,
with outer unit normal vector n. The force $-Pn\,\Delta O$ acts on the surface element
$\Delta O$. Hence the power is equal to
$\dot{W}(0) = -\int_{\partial G} P\,n\,\dot{u}(0)\,dO = -P\dot{V}(0)$.
This observation becomes completely elementary if one looks at the situation
of Figure 67.1. There one obtains
W(t) = P(V(0) - V(t))
for constant P. Note that pressure = force/surface and work = force · distance.
67.2. Adiabatic process for ideal gases. What is the relation between pressure P and
density $\rho$ of ideal gases for reversible, adiabatic processes?
Solution: Principle 67.5 implies that S = const. From (16) it follows that
$T\rho^{-r/c} = \mathrm{const}$,
and $P = r\rho T$ implies
$P\rho^{-1-r/c} = \mathrm{const}$.
67.3. Perpetuum mobile of the second kind. Show: According to the second law, it is
impossible to construct a periodically working machine which, during one cycle,
emits energy in the form of mechanical work whereby it only absorbs heat from
precisely one heat reservoir.
Solution: Let $t_1$ be the period. Since the entropy S only depends on the state,
we find that
$S(0) = S(t_1)$.
Integration of the second law $\dot{S} \ge \dot{Q}/T$ over $[0, t_1]$ yields
$\int_0^{t_1} \dot{Q}\,dt/T \le 0$.
Because of $\dot{Q} > 0$ and T > 0 this is a contradiction.
67.4. Carnot's cycle. We consider a periodically working machine with period $t_4$. Let
T(t) be the temperature at time t, and Q(t) the heat added to the machine during
the time interval [0, t]. We consider four intervals $0 < t_1 < t_2 < t_3 < t_4$ and set
$Q_j = Q(t_j) - Q(t_{j-1})$, where $t_0 = 0$.
(i) For $[0, t_1]$ let $Q_1 < 0$ and $T(t) \ge T_1$ (e.g., heat emission into the surrounding
region).
(ii) For $[t_1, t_2]$ and $[t_3, t_4]$ let $\dot{Q}(t) = 0$.
(iii) For $[t_2, t_3]$ let $Q_3 > 0$ and $T(t) \le T_3$ (heat absorption). Assume that $T_1 < T_3$.
Let A denote the work done by the machine during one run through $[0, t_4]$.
The ratio
$\eta = A/Q_3$
between work done and absorbed heat is called the efficiency. Compute $\eta$.
Solution: As in Problem 67.3 the important point is that $S(0) = S(t_4)$ for the
entropy S. From the second law $\dot{S} \ge \dot{Q}/T$ follows the inequality
$0 \ge Q_1/T_1 + Q_3/T_3$ (44)
by integration over $[0, t_4]$. The first law implies $\dot{E} = \dot{Q} + \dot{W}$. By integration it
follows from $E(0) = E(t_4)$ that
$0 = Q_1 + Q_3 + W(t_4)$,
and hence $A = Q_1 + Q_3$ and $\eta = 1 - |Q_1|/Q_3$. From (44) follows
$\eta \le 1 - T_1/T_3$. (45)
In the case of reversible processes and T = const in (i) and (iii), we have equality
in (44). It follows then that
$\eta = 1 - T_1/T_3$.
This is the ideal efficiency of a machine which has the form (i) to (iii).
The important fact about (45) is that, because of $\eta < 1$, heat cannot be completely
transformed into work. This was Carnot's fundamental technological
discovery of 1824 which, during the nineteenth century, led to the phenomenological
formulation of the second law by Clausius.
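The inequalities (44) and (45) can be tried out on concrete numbers. The Python sketch below is an illustration only; the function names and the sample heat amounts and temperatures are assumptions made here. It computes the efficiency $\eta = (Q_1 + Q_3)/Q_3$, the bound $1 - T_1/T_3$, and tests whether a given pair $(Q_1, Q_3)$ is compatible with (44).

```python
def efficiency(q1, q3):
    """eta = A / Q3 with A = Q1 + Q3, where Q1 < 0 and Q3 > 0."""
    if q1 >= 0 or q3 <= 0:
        raise ValueError("expected emitted heat q1 < 0 and absorbed heat q3 > 0")
    return (q1 + q3) / q3


def carnot_bound(t1, t3):
    """Right-hand side of (45): 1 - T1 / T3 for 0 < T1 < T3."""
    if not 0 < t1 < t3:
        raise ValueError("expected 0 < t1 < t3")
    return 1.0 - t1 / t3


def satisfies_second_law(q1, q3, t1, t3):
    """Inequality (44): 0 >= Q1 / T1 + Q3 / T3."""
    return q1 / t1 + q3 / t3 <= 0.0


if __name__ == "__main__":
    t1, t3, q3 = 300.0, 600.0, 100.0  # hypothetical sample values
    for q1 in (-70.0, -50.0, -40.0):
        print(q1, efficiency(q1, q3), carnot_bound(t1, t3),
              satisfies_second_law(q1, q3, t1, t3))
```

With these sample values, emitting 70 units of heat gives an admissible machine below the bound, emitting 50 units reaches the ideal efficiency 1/2, and emitting only 40 units would violate (44).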
67.5. Further concrete problems. Study the numerous examples in Sommerfeld (1954,
M), Vol. 5 and Landau and Lifšic (1962, M), Vol. 5.

References to the Literature

Classical work: Clausius (1865), Boltzmann (1871), (1872), Planck (1913, M).
Sommerfeld (1954, M), Vol. 5, Landau and Lifšic (1962, M), Vols. 5, 9, 10, Glansdorff
and Prigogine (1971, M).
Mathematical theory of entropy: Martin (1981, S).
Conceptual analysis of the laws of thermodynamics: Serrin (1979), (1983, S), (1986,
P), Owen (1984, M).
Thermomechanics: Ziegler (1983, M).
(See also the References to the Literature to Chapters 68 and 86.)
CHAPTER 68

Statistical Physics

The true logic in this world lies in probability theory.
James Clerk Maxwell (1831-1879)

In 1866, Ludwig Boltzmann began his scientific career with an attempt to give
a purely mechanical explanation of the second law of thermodynamics. He
gradually recognized the need to introduce statistical concepts in order to
understand irreversibility and the second law.
M. J. Klein (1973)

Don't trust any statistics that you didn't falsify yourself.
Folklore

During the study of the Big Bang in Section 58.15 we already made essential
use of Planck's radiation law. In order to find this law, Planck formulated his
famous hypothesis about the quantization of energy for the harmonic oscillator.
This was the hour of birth of quantum theory. Planck's radiation law
implies the Stefan-Boltzmann radiation law, which will be used in the
following chapter in the discussion of Carleman's radiation problem. In the
present chapter we want to show how these important physical laws can be
derived from general principles of statistical physics. The development of
statistical physics is mainly connected with the names of Maxwell (1831-1879),
Boltzmann (1844-1906), Gibbs (1839-1903), Planck (1858-1947), and
Einstein (1879-1955).
In classical statistical physics, it was assumed that, for example, the motion
of molecules in a gas is governed by the laws of classical mechanics. The
statistical point of view was only an auxiliary tool to mask complicated
mechanical systems. But as we already saw in Section 59.12, the concept of
probability has far deeper roots. In fact, in the microcosmos, the laws of
classical mechanics have to be replaced with the laws of quantum mechanics.
These laws have primarily a probability theoretical character.

We begin our discussion of statistical physics with the basic equations
which, in Section 43.12 of Part III, have been obtained from the principle of
maximal entropy (information) by using Lagrange's multiplier rule. This
implies the following special cases:
(i) Bose statistics and Fermi statistics which, other than classical statistics,
also describe correctly the behavior of thermodynamical systems at low
temperatures.
(ii) Classical statistics and quasi-classical statistics.
As applications we consider:
(a) Ideal gases (equation of state, Maxwell's velocity distribution).
(b) Photon gas, black-body radiation. Planck's radiation law, and the Stefan-
Boltzmann radiation law.
(c) Cosmos at the temperature of $10^{11}$ K at time $t = 10^{-2}$ s after the Big Bang.
(d) Basic equation of star models and the maximal Chandrasekhar mass of
white dwarf stars.
(e) Law of equipartition.
In Part V we will consider problems in modern quantum statistics and
irreversible thermodynamics. In fact, there are still many open problems in
this area.

68.1. Basic Equations of Statistical Physics


The following equations stand at the beginning of statistical physics:

$$w_r = \frac{e^{(\mu N_r - E_r)/kT}}{\sum_r e^{(\mu N_r - E_r)/kT}}, \tag{1}$$

$$E = \sum_r w_r E_r, \qquad N = \sum_r w_r N_r, \tag{2}$$

$$S = -k \sum_r w_r \ln w_r, \tag{3}$$

$$F = E - ST. \tag{4}$$

We consider a system which might be in the different states $Z_r$. To each $Z_r$
belongs the energy $E_r$ and the number of particles $N_r$. We are looking for the
probability $w_r$ that the system is in the state $Z_r$.

68.1a. Motivation
We determine $w_r$ from the assumption that under the side condition (2) the
entropy becomes maximal, i.e., we solve the problem

$$S = \max!,$$

$$\sum_r w_r E_r = E, \qquad \sum_r w_r N_r = N, \qquad \sum_r w_r = 1$$

for fixed $E$ and $N$. In Section 43.12 of Part III we used the Lagrange multiplier
rule to show that the solution $w_r$ of this problem is given by (1). The Lagrange
multipliers $T$ and $\mu$ were obtained from (1), (2) as functions of $E$ (mean energy)
and $N$ (mean particle number). More precisely, we find from the equations

$$E = \sum_r w_r(T, \mu) E_r, \qquad N = \sum_r w_r(T, \mu) N_r,$$

that

$$T = T(E, N), \qquad \mu = \mu(E, N).$$

By making a comparison with phenomenological thermodynamics, we will
show below that it is useful to identify $T$ with the absolute temperature and $\mu$
with the chemical potential.
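
The maximum property behind (1) can be checked numerically on a small system. The following Python sketch is only an illustration: the four states with their values $E_r$, $N_r$, the choice of units with $k = 1$, and the helper names `gibbs_weights` and `entropy` are assumptions made for this example and do not come from the text. It computes the weights (1) and then perturbs them in a direction that leaves the three side conditions unchanged; the entropy (3) of the perturbed distribution should be smaller.

```python
import math

def gibbs_weights(energies, numbers, mu, kT):
    """Probabilities w_r of equation (1) for states with energies E_r and particle numbers N_r."""
    exponents = [(mu * n - e) / kT for e, n in zip(energies, numbers)]
    shift = max(exponents)  # a common shift cancels in the ratio and avoids overflow
    boltz = [math.exp(x - shift) for x in exponents]
    z = sum(boltz)
    return [b / z for b in boltz]

def entropy(weights, k=1.0):
    """Entropy S = -k * sum_r w_r ln w_r of equation (3)."""
    return -k * sum(w * math.log(w) for w in weights if w > 0.0)

# Hypothetical four-state system (illustrative values, units with k = 1).
E_r = [0.0, 1.0, 2.0, 3.0]
N_r = [0, 1, 1, 2]
w = gibbs_weights(E_r, N_r, mu=-0.5, kT=1.0)

# The direction d satisfies sum d = 0, sum d*E_r = 0 and sum d*N_r = 0,
# so w + t*d has the same normalization, mean energy and mean particle number.
d = [1.0, -1.0, -1.0, 1.0]
t = 0.005
w_perturbed = [wi + t * di for wi, di in zip(w, d)]

print("S(Gibbs)     =", entropy(w))
print("S(perturbed) =", entropy(w_perturbed))
```

For four states and three side conditions the admissible perturbations form a one-dimensional family, which is why a single direction `d` suffices here.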

68.1b. Mean Value and Dispersion of Arbitrary Physical Quantities

Consider an arbitrary physical quantity $\mathcal{A}$ and assume that $\mathcal{A}$ has the value
$A_r$ in state $Z_r$ for all $r$. We then define the corresponding mean value as

$$\bar{A} = \sum_r w_r A_r \tag{5}$$

and the dispersion as

$$(\Delta A)^2 = \sum_r (A_r - \bar{A})^2 w_r. \tag{6}$$

Those two values are important in linking the theory to physical measurements.
Roughly speaking, the mean value $\bar{A}$ is the value expected to be
measured in physical experiments. Moreover, we would assume that the
difference $\mathcal{A} - \bar{A}$ between the actual measurement value $\mathcal{A}$ and the mean
value $\bar{A}$ is small if the dispersion $(\Delta A)^2$ is small. The precise mathematical
formulation is given in the Chebyshev inequality (C) below. For this we define
$p(|\mathcal{A} - \bar{A}| \le \varepsilon)$ = probability for measuring some value $\mathcal{A}$ with $|\mathcal{A} - \bar{A}| \le \varepsilon$.
Chebyshev's inequality tells us that

$$p(|\mathcal{A} - \bar{A}| \le \varepsilon) \ge 1 - \frac{(\Delta A)^2}{\varepsilon^2} \tag{C}$$

for all $\varepsilon > 0$. In particular, letting $\varepsilon = 4\Delta A$, we obtain that

$$p(|\mathcal{A} - \bar{A}| \le 4\Delta A) \ge \tfrac{15}{16}.$$

Thus, roughly speaking, in most cases, the measurement value $\mathcal{A}$ lies in the
interval $[\bar{A} - 4\Delta A, \bar{A} + 4\Delta A]$. If $\Delta A = 0$, we find
$$p(\mathcal{A} = \bar{A}) = 1.$$
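To prove (C), let

$$\chi_r = \begin{cases} 0 & \text{for } |A_r - \bar{A}| \le \varepsilon, \\ 1 & \text{for } |A_r - \bar{A}| > \varepsilon. \end{cases}$$

This yields

$$p(|\mathcal{A} - \bar{A}| > \varepsilon) = \sum_r \chi_r w_r.$$

From

$$(\Delta A)^2 = \sum_r (A_r - \bar{A})^2 w_r \ge \sum_r \varepsilon^2 \chi_r w_r$$

it follows that

$$p(|\mathcal{A} - \bar{A}| \le \varepsilon) = 1 - p(|\mathcal{A} - \bar{A}| > \varepsilon) \ge 1 - (\Delta A)^2/\varepsilon^2,$$

and this is (C).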
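
As a small numerical illustration of (5), (6), and (C), the following Python sketch evaluates both sides of Chebyshev's inequality for a discrete distribution. The values `A_r`, the weights `w_r`, and the function names are hypothetical and chosen only for this example.

```python
def mean_and_dispersion(values, weights):
    """Mean value (5) and dispersion (6) of a quantity with values A_r and probabilities w_r."""
    mean = sum(w * a for a, w in zip(values, weights))
    disp = sum(w * (a - mean) ** 2 for a, w in zip(values, weights))
    return mean, disp

def prob_within(values, weights, mean, eps):
    """Probability p(|A - mean| <= eps)."""
    return sum(w for a, w in zip(values, weights) if abs(a - mean) <= eps)

# Hypothetical values A_r and probabilities w_r (illustration only).
A_r = [0.0, 1.0, 2.0, 10.0]
w_r = [0.4, 0.3, 0.2, 0.1]

A_mean, disp = mean_and_dispersion(A_r, w_r)
checks = []
for eps in (1.0, 3.0, 4.0 * disp ** 0.5):
    lhs = prob_within(A_r, w_r, A_mean, eps)   # left-hand side of (C)
    rhs = 1.0 - disp / eps ** 2                # right-hand side of (C)
    checks.append((eps, lhs, rhs))
    print(f"eps = {eps:.3f}: p = {lhs:.3f} >= {rhs:.3f}")
```

For small $\varepsilon$ the right-hand side of (C) is negative and the inequality carries no information; it becomes useful once $\varepsilon$ is several times $\Delta A$.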
In Example 68.6 below we will show that for a system with a great number
$N$ of independent particles the dispersion $(\Delta E)^2$ of energy is extremely small.
We shall derive the crucial formula
$$\Delta E/E \sim 1/\sqrt{N},$$
where $E$ is the mean value of the energy. As usual we have $N \sim 10^{23}$ (the
number of molecules in a gas of reasonable size). Using the same argument,
analogous formulas for the dispersion $(\Delta A)^2$ of other physical quantities can
be obtained. This is the reason for the surprising fact that in macroscopic
physics the measurement values seem to be constant without any fluctuations.

68.1c. Computation of All Interesting Physical Quantities

The following proposition provides the key for computations in statistical
physics.

Proposition 68.1. All important thermodynamical quantities can be computed
from the function
$$\Omega(\mu, T) = -kT \ln \sum_r e^{(\mu N_r - E_r)/kT}.$$
We have
$$N = -\Omega_\mu. \tag{7}$$

PROOF. This follows easily from (1) to (3). □


If $\Omega$ depends also on the volume $V$, then we define the pressure
$$P = -\Omega_V. \tag{8}$$
The number $k = 1.380 \cdot 10^{-23}$ J/K is a universal constant (Boltzmann constant).
If we compare (7), (8) with the last line of Table 67.1 of Section 67.5, then it
makes sense to identify $T$ with the absolute temperature. Moreover, we may
identify: $S$ as entropy, $F$ as free energy, $E$ as inner energy, $\mu$ as chemical
potential, and $\Omega$ as statistical potential.
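
The way in which the function $\Omega$ of Proposition 68.1 generates the other quantities can be tested numerically. The sketch below is a hedged illustration in Python: the four-state system, the units with $k = 1$, and the function names are assumptions. Besides $N = -\Omega_\mu$ it also checks the relations $S = -\Omega_T$ and $F = \Omega + \mu N$; these follow from (1) to (4) by direct differentiation, and the check treats them as identities to be confirmed rather than as quoted formulas. Derivatives of $\Omega$ are approximated by central differences.

```python
import math

def omega(mu, T, energies, numbers, k=1.0):
    """Statistical potential Omega(mu, T) = -kT ln sum_r exp((mu N_r - E_r)/kT)."""
    z = sum(math.exp((mu * n - e) / (k * T)) for e, n in zip(energies, numbers))
    return -k * T * math.log(z)

def direct_values(mu, T, energies, numbers, k=1.0):
    """E, N, S, F computed directly from equations (1) to (4)."""
    boltz = [math.exp((mu * n - e) / (k * T)) for e, n in zip(energies, numbers)]
    z = sum(boltz)
    w = [b / z for b in boltz]
    E = sum(wi * e for wi, e in zip(w, energies))
    N = sum(wi * n for wi, n in zip(w, numbers))
    S = -k * sum(wi * math.log(wi) for wi in w)
    return E, N, S, E - S * T

# Hypothetical four-state system (illustrative values, units with k = 1).
E_r = [0.0, 1.0, 2.0, 3.0]
N_r = [0, 1, 1, 2]
mu, T, step = -0.5, 1.0, 1e-5

E, N, S, F = direct_values(mu, T, E_r, N_r)
# Central differences for the partial derivatives of Omega.
N_from_omega = -(omega(mu + step, T, E_r, N_r) - omega(mu - step, T, E_r, N_r)) / (2 * step)
S_from_omega = -(omega(mu, T + step, E_r, N_r) - omega(mu, T - step, E_r, N_r)) / (2 * step)
F_from_omega = omega(mu, T, E_r, N_r) + mu * N

print("N:", N, N_from_omega)
print("S:", S, S_from_omega)
print("F:", F, F_from_omega)
```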

68.1d. The Role of Entropy

The concept of entropy is more general than that of temperature. Actually
one can always compute the entropy $S$ from (3) if one knows arbitrary
probabilities $w_r$ for the states $Z_r$ with
$$\sum_r w_r = 1.$$
The probabilities $w_r$ in (1) are characterized by the assumption of maximal
entropy. Thereby one obtains the temperature $T$ as a Lagrange multiplier.
Physically, (1) corresponds to quasi-equilibrium states of thermodynamical
systems in the sense of Section 67.1 with a common temperature.

68.1e. The Role of the Partition Function

Remark 68.2 (Partition Function and the Counting of Different States). It is
very important that the so-called partition function

$$\sum_r e^{(\mu N_r - E_r)/kT}$$

in Proposition 68.1 is to be taken over all different states $Z_r$ which the system
may occupy. In particular, different states may have the same energy. The
following examples will show that there exist many different ways to count
"different" states. This is the reason for the fact that there exist different forms
of statistical physics. In fact, the classification of "different states" is a physical
problem. The main difference, for example, between classical statistical physics
and quantum statistics lies in the different classification of states.

(i) Prototype of classical statistical physics. If, for instance, we have two
particles $A$ and $B$, with possible energies $\alpha$, $\beta$, then the various states of
Table 68.1 are obtained. Thereby $A/\alpha$ means that $A$ has the energy $\alpha$, etc.
This is the classical way of counting states, where a strict distinction
between particle $A$ and particle $B$ is exercised. In Table 68.1 we obtain
four states $Z_r$, $r = 1, \ldots, 4$ with corresponding energy $E_r$ and particle
number $N_r$.

Table 68.1
$r$    $Z_r$                      $E_r$               $N_r$
1      $A/\alpha,\ B/\beta$       $\alpha + \beta$    2
2      $A/\beta,\ B/\alpha$       $\alpha + \beta$    2
3      $A/\alpha,\ B/\alpha$      $2\alpha$           2
4      $A/\beta,\ B/\beta$        $2\beta$            2

(ii) Prototype of Bose statistics. In quantum theory there exists the important
principle of indistinguishability for the particles, i.e., one cannot
distinguish between particles $A$ and $B$. This leads to Table 68.2.

Table 68.2
$r$    $Z_r$                                                   $E_r$               $N_r$
1      $\{A/\alpha,\ B/\beta;\ A/\beta,\ B/\alpha\}$           $\alpha + \beta$    2
2      $A/\alpha,\ B/\alpha$                                   $2\alpha$           2
3      $A/\beta,\ B/\beta$                                     $2\beta$            2

State $Z_1$, for example, can be described by saying that one particle has
energy $\alpha$ and one particle has energy $\beta$, but we cannot say which one has
which energy. In Table 68.2 we therefore only have three different states
$Z_r$, $r = 1, 2, 3$ in contrast to Table 68.1. Note, however, that the situation
of Table 68.2 only applies to particles with integer spin (e.g., photons).
(iii) Prototype of Fermi statistics. Particles in quantum theory with
half-integer spin (e.g., electrons) have the property that no two particles can
be in the same state. This yields Table 68.3, where only one state exists.

Table 68.3
$r$    $Z_r$                                                   $E_r$               $N_r$
1      $\{A/\alpha,\ B/\beta;\ A/\beta,\ B/\alpha\}$           $\alpha + \beta$    2

We therefore see that it is important to first clarify what we mean by "states,"
before starting with statistics. Different classifications of states may lead to
entirely different results in statistical physics.
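
The three ways of counting in Tables 68.1 to 68.3 can be reproduced by a short enumeration. The following Python sketch is an illustration under the assumption that the two one-particle energies are represented by the labels `"alpha"` and `"beta"`; the variable names are hypothetical.

```python
from itertools import combinations, combinations_with_replacement, product

levels = ("alpha", "beta")  # the two possible one-particle energies

# Classical counting: particles A and B are distinguishable, so the ordered
# pair (energy of A, energy of B) labels a state (Table 68.1).
classical_states = list(product(levels, repeat=2))

# Bose counting: only the multiset of occupied energies matters (Table 68.2).
bose_states = list(combinations_with_replacement(levels, 2))

# Fermi counting: in addition, no energy may be occupied twice (Table 68.3).
fermi_states = list(combinations(levels, 2))

print(len(classical_states), classical_states)  # 4 states
print(len(bose_states), bose_states)            # 3 states
print(len(fermi_states), fermi_states)          # 1 state
```

The same three library functions extend the count to more particles and more levels by changing `levels` and the second argument.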

Remark 68.3 (Pure Energy Statistics). If the number of particles is not subject
to statistical fluctuations, i.e., we have $N_r = N$ for all $r$, then
$$w_r = e^{-E_r/kT} \Big/ \sum_r e^{-E_r/kT}.$$
This formula follows as in Section 68.1a by omitting the side condition
$\sum_r w_r N_r = N$. This way one obtains all quantities (7), (8) by formally letting
$\mu = 0$ in $\Omega$. Only $N = -\Omega_\mu$ is no longer valid, but instead $N$ is fixed.

68.2. Bose and Fermi Statistics


EXAMPLE 68.4 (Bose Statistics). Here the state $Z_r$ with $r = 0, 1, \ldots$ is
characterized by the fact that it contains
$N_r = r$ particles with energy $E_r = r\varepsilon_n$
for fixed $\varepsilon_n > 0$ and fixed $n$. For $\mu < 0$ one obtains the geometrical series

$$\Omega_n = -kT \ln \sum_{r=0}^{\infty} e^{(r\mu - r\varepsilon_n)/kT} = kT \ln(1 - e^{(\mu - \varepsilon_n)/kT}). \tag{9}$$

From (7) one immediately obtains
$$N_n = -\partial\Omega_n/\partial\mu = 1/(e^{(\varepsilon_n - \mu)/kT} - 1), \qquad E_n = N_n \varepsilon_n$$
for the average number of particles $N_n$ and the mean energy $E_n$, respectively.
We now assume, in addition, that the particles may assume different energies
$\varepsilon_n$ and that for different $n$ the states are statistically independent. The
average number of particles is then defined by

$$N = \sum_n N_n.$$

Also we assume

This is obtained by letting

$$\Omega = \sum_n \Omega_n$$

and applying the formulas of Proposition 68.1 to $\Omega$. In particular, we have

$$E = \sum_n N_n \varepsilon_n.$$
n

EXAMPLE 68.5 (Fermi Statistics). We assume that in the previous example only
$r = 0, 1$ is possible, i.e., only one particle of energy $\varepsilon_n$ is possible for each state.
Then we obtain

$$\Omega_n = -kT \ln \sum_{r=0}^{1} e^{(r\mu - r\varepsilon_n)/kT} = -kT \ln(1 + e^{(\mu - \varepsilon_n)/kT}).$$

As in Example 68.4 we then obtain the following formulas with "+":

$$N_n = 1/(e^{(\varepsilon_n - \mu)/kT} \pm 1), \qquad E_n = N_n \varepsilon_n,$$
$$-\Omega = \pm kT \sum_n \ln(1 \pm e^{(\mu - \varepsilon_n)/kT}). \tag{10}$$

The sign "−" corresponds to the Bose statistics in Example 68.4.

In the case of elementary particles, the particles with half-integer spin are
governed by the Fermi statistics (e.g., electrons, positrons, protons, neutrons,
and neutrinos) and particles with integer spin are governed by the Bose
statistics (e.g., photons). This fact can be proved in the context of axiomatic
quantum field theory (see Streater and Wightman (1964, M)).
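
The closed forms in (9) and (10) can be compared with the defining sums. The Python sketch below is an illustration only: the numerical values of $\mu$, $\varepsilon_n$, and $kT$ and all function names are assumptions. It truncates the geometrical series of Example 68.4, compares it with the closed form (9), and evaluates the Bose and Fermi occupation numbers of (10).

```python
import math

def omega_bose_series(mu, eps, kT, terms=500):
    """Omega_n from the truncated geometrical series in (9); requires mu < eps."""
    partial = sum(math.exp(r * (mu - eps) / kT) for r in range(terms))
    return -kT * math.log(partial)

def omega_bose_closed(mu, eps, kT):
    """Closed form of (9): kT ln(1 - exp((mu - eps)/kT))."""
    return kT * math.log(1.0 - math.exp((mu - eps) / kT))

def omega_fermi(mu, eps, kT):
    """Omega_n of Example 68.5: -kT ln(1 + exp((mu - eps)/kT))."""
    return -kT * math.log(1.0 + math.exp((mu - eps) / kT))

def occupation(mu, eps, kT, sign):
    """Mean occupation 1/(exp((eps - mu)/kT) + sign); sign = -1 for Bose, +1 for Fermi."""
    return 1.0 / (math.exp((eps - mu) / kT) + sign)

# Illustrative values (assumed), in units with kT = 1.
mu, eps, kT = -0.2, 1.0, 1.0
series_value = omega_bose_series(mu, eps, kT)
closed_value = omega_bose_closed(mu, eps, kT)
n_bose = occupation(mu, eps, kT, -1)
n_fermi = occupation(mu, eps, kT, +1)

print("Omega_n (series, closed):", series_value, closed_value)
print("N_n (Bose, Fermi):", n_bose, n_fermi)
```

The Fermi occupation number never exceeds 1, while the Bose occupation number grows without bound as $\mu$ approaches $\varepsilon_n$ from below.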

68.3. Applications to Ideal Gases


By an ideal gas we mean a system which consists of particles that do not
interact. Hence, one obtains the statistics for this system by applying the Bose
and Fermi statistics to all possible energy states $\varepsilon_n$ of a particle. Let $q$ and $p$
denote the position and momentum vector of the particle, respectively. We
assume for the energy of the particle that
$$\varepsilon = \varepsilon(p, q) + Q_s, \qquad s = 1, 2, \ldots \tag{11}$$
holds. Thereby $\varepsilon(p, q)$ describes the kinetic energy of the particle and its
potential energy $U(q)$ in an outer field. In the context of classical mechanics
we have
$$\varepsilon(p, q) = p^2/2m + U(q) \tag{12}$$
with $m$ = mass of the particle. In the special theory of relativity we have
$$\varepsilon(p, q) = \sqrt{m_0^2 c^4 + c^2 p^2} + U(q) \tag{13}$$
with $m_0$ = rest mass of the particle and $c$ = velocity of light (see Section 75.11).
If the particle is a molecule, then $Q_s$ are its quantum mechanical energy states
which can be computed from the Schrödinger equation. Classically, $\varepsilon$ is the
total mechanical energy of the molecule (kinetic energy plus potential energy
in a given outer field and in the inner field of electrostatic attraction). The
quantities $Q_s$ then correspond to rotational and oscillation energies.

The statistical potential for (11) is

$$-\Omega = \pm kTg \sum_s \int \ln(1 \pm e^{(\mu - \varepsilon(p,q) - Q_s)/kT}) \, \frac{dp\,dq}{h^3} \tag{14}$$

with Planck's constant $h = 6.625 \cdot 10^{-34}$ J s. We obtain (14) from (10) by
formally replacing the summation $\sum_n$ over the states with an integration over
$dp\,dq$ and a summation over $s$ and the spin states of the particle. We assume
the particle has $g$ spin states, and hence the factor $g$ appears. The factor $1/h^3$
yields the correct quantum mechanical description. It corresponds to the
hypothesis that the phase space is quantized, i.e., if $G$ is a region in the
$(p, q)$-phase space, then

$$\int_G dp\,dq/h^3$$

is the number of corresponding states (see Section 59.21). In (14) we choose
"+" and "−" for the Fermi and Bose statistics, respectively. The computation
of $\Omega$ in (14) is difficult in the general case and only approximations are
possible. For large molecules it is already very difficult to numerically solve the
Schrödinger equation in order to determine $Q_s$. However, if one knows $\Omega$,
then one can use Proposition 68.1 to compute all interesting physical
quantities $E$, $F$, $S$, and pressure $P$ as functions of $\mu$, $T$, and $V$.
For the following computations we use the classical integral formulas which
are given in Problem 68.5.

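EXAMPLE 68.6 (Ideal One-Atomic Gas). We consider an ideal gas which
consists of $N$ atoms of mass $m$ in the volume $V$. The particle density is equal to
$\rho = N/V$. Moreover, we set

$$\alpha = e^{\mu/kT}.$$

In this case we have
$$\varepsilon = p^2/2m \quad\text{and}\quad g = 1, \quad Q_s = 0.$$
We use the Bose statistics. Because of $\int dq = V$ we obtain

$$\Omega(\mu, T, V) = kTVh^{-3} \int \ln(1 - e^{(\mu - \varepsilon(p))/kT})\,dp. \tag{15}$$

This implies

$$N = -\Omega_\mu = \int n(p)\,dp \tag{16}$$

with
$$n(p) = \frac{V}{h^3(e^{(\varepsilon(p) - \mu)/kT} - 1)} = \frac{\alpha V e^{-\varepsilon(p)/kT}}{h^3(1 - \alpha e^{-\varepsilon(p)/kT})} = \alpha V h^{-3} e^{-p^2/2mkT} + O(\alpha^2), \qquad \alpha \to 0, \tag{17}$$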

as well as

$$E = \int \varepsilon(p) n(p)\,dp \tag{18}$$

and
$$F = \Omega + \mu N = E - ST.$$
In this way one can compute all important quantities as functions of the
parameters $\mu$, $T$, $V$. Integration by parts in (16) yields the important relation
$$\Omega = -2E/3. \tag{19}$$
Since $\Omega$ depends linearly on $V$, we have $\Omega = -PV$, and hence

$$PV = 2E/3. \tag{20}$$
The number $n(p)$ describes the distribution of the particles onto the momentum
vectors and, because of
$$p = mv,$$
also onto the velocity vectors $v$. The number of particles with $p \in G$ is equal to

$$\int_G n(p)\,dp.$$

This can be seen by evaluating the partition function above only over momenta
with $p \in G$. The integration in (15) then is to be taken over $G$. The number
$n(p)$ depends on the chemical potential $\mu$. From (16) this can be determined
in the form
$$\mu = \mu(N, V, T).$$
We obtain

$$\rho = \alpha h^{-3}(2\pi mkT)^{3/2} + O(\alpha^2), \qquad \alpha \to 0, \tag{21}$$

where $\alpha = e^{\mu/kT}$.
We now want to perform the computation for fixed $T$ and small particle
densities $\rho$. As expected we will obtain the quantities for ideal one-atomic
gases of phenomenological thermodynamics of Example 67.4. Note, however,
that there we had $\rho = M/V$ with $M = mN$, and that the chemical potentials
of Example 67.4 and here are in the ratio of $m : 1$. We essentially use

$$\int_{\mathbb{R}^3} e^{-\beta p^2}\,dp = (\pi/\beta)^{3/2}$$

and the formulas which are obtained by differentiation with respect to $\beta$, for
example,

$$\int_{\mathbb{R}^3} p^2 e^{-\beta p^2}\,dp = \frac{3}{2\beta}(\pi/\beta)^{3/2}.$$

From (21) follows $\rho = O(\alpha)$, $\alpha \to 0$. The implicit function theorem (Theorem
4.B) implies that for small $\rho$ equation (21) can be solved for $\alpha$. This gives
$$\alpha = e^{\mu/kT} = \rho h^3 (2\pi kmT)^{-3/2} + O(\rho^2), \qquad \rho \to 0. \tag{22}$$
Using this, we find for the chemical potential
$$\mu = kT \ln[\rho h^3 (2\pi kmT)^{-3/2} + O(\rho^2)], \qquad \rho \to 0. \tag{23}$$
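
To get a feeling for the size of $\alpha$ in (22), one can insert numbers for an ordinary gas. The following Python sketch is a hedged illustration: the choice of helium, the mass `m_helium`, the temperature, and the pressure are assumed example data that do not appear in the text, and the particle density is taken from the ideal gas law $PV = NkT$. Only the leading terms of (22) and (23) are evaluated.

```python
import math

H = 6.625e-34   # Planck's constant in J s (value quoted in the text)
K = 1.380e-23   # Boltzmann constant in J/K (value quoted in the text)

def fugacity(density, mass, T):
    """Leading term of (22): alpha = rho h^3 (2 pi k m T)^(-3/2)."""
    return density * H**3 * (2.0 * math.pi * K * mass * T) ** -1.5

def chemical_potential(density, mass, T):
    """Leading term of (23): mu = kT ln(alpha)."""
    return K * T * math.log(fugacity(density, mass, T))

# Assumed example data: helium atoms at T = 300 K and a pressure of 101325 Pa.
m_helium = 6.646e-27          # mass of a helium atom in kg (assumed value)
T = 300.0                     # temperature in K
rho = 101325.0 / (K * T)      # particle density in 1/m^3 from PV = NkT

alpha = fugacity(rho, m_helium, T)
mu = chemical_potential(rho, m_helium, T)
print("alpha =", alpha)
print("mu    =", mu, "J")
```

Under these assumptions $\alpha$ is many orders of magnitude below 1, so the terms of order $O(\rho^2)$ are negligible and the chemical potential is negative.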
From (17) and (22) we obtain
$$n(p) = Nw(p) + O(\rho^2), \qquad \rho \to 0$$

with
$$w(p) = (2\pi mkT)^{-3/2} e^{-p^2/2mkT}.$$

This is a Gauss distribution with $\int w(p)\,dp = 1$. The number of particles with
momentum $p \in G$ is equal to

$$N \int_G w(p)\,dp + O(\rho^2), \qquad \rho \to 0.$$

This is the so-called Maxwell momentum distribution from which, with $p = mv$,
Maxwell's velocity distribution follows. From (18) we obtain for the energy

$$E = N \int \varepsilon(p) w(p)\,dp + O(\rho^2) = 3NkT/2 + O(\rho^2), \qquad \rho \to 0.$$

Neglecting the terms of order $O(\rho^2)$ we obtain from (20) the equation of state
$$PV = NkT.$$
Moreover, we find for the free energy
$$F = \Omega + \mu N = -2E/3 + \mu N = -NkT + NkT \ln \rho h^3 (2\pi kmT)^{-3/2} \tag{24}$$

and the entropy

$$S = T^{-1}(E - F) = Nk\left(\tfrac{5}{2} - \ln \rho h^3 (2\pi kmT)^{-3/2}\right).$$
Finally, we consider the important question of how large the energy
fluctuations are. For a single atom the mean energy is equal to

$$\bar{\varepsilon} = \int \varepsilon(p) w(p)\,dp = 3kT/2$$

and the dispersion is equal to

$$(\Delta\varepsilon)^2 = \int (\varepsilon(p) - \bar{\varepsilon})^2 w(p)\,dp = \int \varepsilon(p)^2 w(p)\,dp - \bar{\varepsilon}^2 = 3k^2T^2/2.$$

The total energy is equal to the sum of the energies of the single atoms. Since
the atoms are assumed to be independent, the dispersions are added in the
usual way, i.e.,
$$(\Delta E)^2 = N(\Delta\varepsilon)^2 = 2E^2/3N.$$
The relative energy fluctuation is therefore equal to
$$\Delta E/E = 1/\sqrt{1.5N}.$$
Hence, for the usual number of particles $N \sim 10^{20}$, one obtains the small
quantity $\Delta E/E \sim 10^{-10}$. Physically, such small energy fluctuations occur if the
total energy of the gas is not fixed, i.e., if the gas is in contact with a big heat
reservoir (e.g., the earth). This way an exchange of energy is possible.
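
The single-atom moments and the resulting relative fluctuation can be reproduced by a direct numerical integration. The Python sketch below is an illustration under stated assumptions: it uses the dimensionless variable $x = |p|/\sqrt{2mkT}$, in which the Maxwell distribution has the radial weight $(4/\sqrt{\pi})x^2e^{-x^2}$ and $\varepsilon = kT\,x^2$, and a simple midpoint rule; the function names and step counts are choices made here.

```python
import math

def moment(power, x_max=12.0, steps=24000):
    """Mean of x**power under the radial Maxwell weight (4/sqrt(pi)) x^2 exp(-x^2),
    computed with the midpoint rule on [0, x_max]."""
    dx = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += x**power * x * x * math.exp(-x * x)
    return 4.0 / math.sqrt(math.pi) * total * dx

# Energy of one atom in units of kT is x^2.
mean_energy = moment(2)                     # mean energy in units of kT
dispersion = moment(4) - mean_energy**2     # dispersion in units of (kT)^2

def relative_fluctuation(n_atoms):
    """Delta E / E for n_atoms independent atoms: sqrt(N * dispersion) / (N * mean)."""
    return math.sqrt(dispersion / n_atoms) / mean_energy

print("mean energy / kT    =", mean_energy)
print("dispersion / (kT)^2 =", dispersion)
print("Delta E / E for N = 1e20:", relative_fluctuation(1e20))
```

Both moments come out close to 3/2, in agreement with the formulas above, and the relative fluctuation for $N = 10^{20}$ is of the order $10^{-10}$.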

EXAMPLE 68.7 (Relativistic Particles). According to the special theory of
relativity, the energy of a free particle is equal to
$$\varepsilon(p) = \sqrt{m_0^2 c^4 + c^2 p^2}$$
with rest mass $m_0$ and velocity of light $c$. We assume that $\varepsilon(p) \gg m_0 c^2$,
i.e., the energy is substantially larger than the rest energy. Then we can set
approximately
$$\varepsilon(p) = c|p|.$$
This relation is exactly true for photons. From (14) we then obtain

$$-\Omega = \pm kTVgh^{-3} \int \ln(1 \pm e^{(\mu - \varepsilon(p))/kT})\,dp.$$

We integrate over $\mathbb{R}^3$, i.e., we assume that all states are maximally occupied.
For the number of particles we obtain

$$N = -\Omega_\mu = \int n(p)\,dp$$
with $n(p) = Vg/h^3(e^{(\varepsilon(p) - \mu)/kT} \pm 1)$. The energy is equal to

$$E = \int c|p|\,n(p)\,dp.$$

We now derive the fundamental relation

$$P = E/3V,$$
i.e., the pressure $P$ is equal to one-third of the energy density. We have

$$\Omega = \mp 4\pi gkTVh^{-3}c^{-3} \int_0^\infty \varepsilon^2 \ln(1 \pm e^{(\mu - \varepsilon)/kT})\,d\varepsilon.$$

Integration by parts yields

$$\Omega = -E/3.$$

The assertion now follows from $\Omega = -PV$.


For the special case $\mu = 0$, we obtain the number of particles

$$N = \int n(p)\,dp, \qquad n(p) = \frac{Vg}{h^3(e^{c|p|/kT} \pm 1)}$$

and the energy $E = \int c|p|\,n(p)\,dp$, i.e.,
$$E = \begin{cases} gaT^4V/2 & \text{Bose statistics,} \\ 7gaT^4V/16 & \text{Fermi statistics,} \end{cases} \tag{25}$$
with $a = 8\pi^5 k^4/15c^3h^3$. To compute (25), one can use the integral formula
(38). Applications of the fundamental formula (25) will be given in the
following three sections.
If in a gas the states are occupied only up to a maximal limiting momentum,
then all integrals $\int \ldots dp$ are only to be evaluated over a ball of radius $p_0$. This
will be the case in the study of white dwarf stars in Section 68.8. Integration
by parts then gives
$$PV = E/3 + o(1), \qquad p_0 \to +\infty.$$
Thus $PV = E/3$ is an ideal limiting case.
By using the energy-momentum tensor one can motivate that $PV < E/3$
must always be satisfied for macroscopic bodies (e.g., gas or liquids). This can
be found in Weinberg (1972, M), Chapter 2, (2.10.24).
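
The energy of a gas with $\varepsilon(p) = c|p|$ and $\mu = 0$ reduces, in spherical coordinates and with $x = c|p|/kT$, to $E = 4\pi gV(kT)^4 h^{-3}c^{-3}\int_0^\infty x^3/(e^x \pm 1)\,dx$. The Python sketch below evaluates the two dimensionless integrals numerically; it is a hedged illustration, with a plain midpoint rule and a function name chosen here. The closed values $\pi^4/15$ (Bose) and $7\pi^4/120$ (Fermi) are standard results, used here as reference values for comparison.

```python
import math

def thermal_integral(sign, x_max=80.0, steps=160000):
    """Midpoint-rule value of int_0^inf x^3 / (exp(x) + sign) dx.
    sign = -1 gives the Bose case, sign = +1 the Fermi case."""
    dx = x_max / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += x**3 / (math.exp(x) + sign)
    return total * dx

bose_integral = thermal_integral(-1)
fermi_integral = thermal_integral(+1)

print("Bose :", bose_integral, "reference", math.pi**4 / 15.0)
print("Fermi:", fermi_integral, "reference", 7.0 * math.pi**4 / 120.0)
print("Fermi/Bose ratio:", fermi_integral / bose_integral)
```

The ratio of the two integrals is 7/8, which is the factor that distinguishes the energy of a Fermi gas from that of a Bose gas with the same $g$, $T$, and $V$.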

68.4. Planck's Radiation Law

We consider a container with volume $V$ and assume that its walls constantly
emit and absorb photons. We are interested in the photon gas in the container
which, for a fixed temperature $T$, is in a thermodynamical equilibrium.
Photons have no rest mass, hence their energy is equal to
$$\varepsilon(p) = c|p|,$$
according to Example 68.7. Furthermore, we have the quantum formula
$$\varepsilon(p) = h\omega/2\pi$$
with angular frequency $\omega$ and wave length $\lambda = 2\pi c/\omega$, i.e.,
$$\varepsilon(p) = ch/\lambda.$$
The photon has the spin positions $\pm 1$, so that $g = 2$, and it satisfies the Bose
statistics. Thus we can apply Example 68.7. The free energy has the form
$$F = F(T, V, N).$$
For a fixed volume $V$ and a fixed temperature $T$, we are still free to choose
the number of particles $N$. We determine $N$ from the assumption that $F$ is
minimal, i.e., $F_N = 0$. This gives

$$\mu = F_N(T, V, N) = 0$$
(see Example 67.9). From Example 68.7, we find for the number of particles

$$N = \int n(p)\,dp, \qquad n(p) = \frac{2V}{(e^{c|p|/kT} - 1)h^3}$$

and for the energy

$$E = \int c|p|\,n(p)\,dp.$$

From Example 68.7, we obtain

$$\Omega = 2kTVh^{-3} \int \ln(1 - e^{-c|p|/kT})\,dp = -E/3.$$

In order to obtain the energy distribution with respect to the wave length $\lambda$,
we choose spherical coordinates $dp = |p|^2\,d|p|\,dO$. With $|p| = h/\lambda$ we obtain
from (25)

$$E = \int_0^\infty \frac{8\pi hcV\,d\lambda}{\lambda^5(e^{hc/kT\lambda} - 1)} = aT^4V. \tag{26}$$

This is Planck's famous radiation formula for a container which is filled with
radiation which is in a thermodynamical equilibrium. Equation (26)
determines, at the same time, how the total energy $E$ is distributed onto the different
wave lengths $\lambda$. We now compute the remaining thermodynamical quantities.
Because of $\mu = 0$ we have $F = \Omega$, hence
$$F = -E/3, \qquad S = (E - F)/T = 4aT^3V/3.$$
Moreover, we have $P = -\Omega_V = aT^4/3$, and hence
$$P = E/3V.$$
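
The equality of the two expressions in (26) can be checked numerically. The following Python sketch is an illustration under stated assumptions: it uses the rounded values of $h$ and $k$ quoted in this chapter, an assumed value for the velocity of light, a sample temperature of 5000 K, and a midpoint rule in the variable $u = \ln\lambda$; the function names and the cut-off wave lengths are choices made here.

```python
import math

H = 6.625e-34   # Planck's constant in J s (value quoted in the text)
K = 1.380e-23   # Boltzmann constant in J/K (value quoted in the text)
C = 2.998e8     # velocity of light in m/s (assumed value, not quoted in the text)

def radiation_constant():
    """a = 8 pi^5 k^4 / (15 c^3 h^3)."""
    return 8.0 * math.pi**5 * K**4 / (15.0 * C**3 * H**3)

def spectral_energy_density(lam, T):
    """Integrand of (26) per unit volume: 8 pi h c / (lam^5 (exp(hc/(kT lam)) - 1))."""
    x = H * C / (K * T * lam)
    if x > 700.0:          # exp(x) would overflow; the integrand is negligible there
        return 0.0
    return 8.0 * math.pi * H * C / (lam**5 * math.expm1(x))

def energy_density(T, lam_min=1e-8, lam_max=1e-1, steps=20000):
    """Integral of the spectral density over lam, midpoint rule in u = ln(lam)."""
    u0, u1 = math.log(lam_min), math.log(lam_max)
    du = (u1 - u0) / steps
    total = 0.0
    for i in range(steps):
        lam = math.exp(u0 + (i + 0.5) * du)
        total += spectral_energy_density(lam, T) * lam * du   # d(lam) = lam du
    return total

T = 5000.0
a = radiation_constant()
print("a         =", a, "J m^-3 K^-4")
print("integral  =", energy_density(T))
print("a * T^4   =", a * T**4)
```

The logarithmic variable is used because the integrand extends over many orders of magnitude in $\lambda$.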

68.5. Stefan-Boltzmann Radiation Law for Black Bodies

Consider a body with temperature $T$ and surface measure $A$ (e.g., a stove or
a star). The energy $E_A$ which is emitted from the surface, during the time
interval $[0, t]$, is equal to
$$E_A = \sigma T^4 A t, \tag{27}$$
where
$$\sigma = ac/4 = 2\pi^5 k^4/15c^2h^3.$$

This is the so-called Stefan-Boltzmann radiation law. More precisely, the
distribution of the emitted energy onto the wave lengths $\lambda$ is given by the
formula

$$E_A = 2\pi hc^2 tA \int_0^\infty \frac{d\lambda}{\lambda^5(e^{hc/kT\lambda} - 1)}, \tag{27*}$$

whereby an essential assumption is that the body is a so-called black radiator
(black body), i.e., it absorbs all incoming radiation. Stars, for example, can
approximately be regarded as black bodies. Let us motivate (27) and (27*).
(I) We begin by considering a container $C$, filled with radiation (photons)
in a thermodynamical equilibrium. Let $T$ denote the temperature of $C$.
We then look at a small boundary part of $C$ with surface measure $\Delta A$
(Fig. 68.1(a)). Let $E_{\Delta A}$ be the radiation energy which arrives at $\Delta A$ during
the time interval $[0, t]$. As in Figure 68.1(b) we consider an arbitrarily
small volume element $\Delta V$ of $C$ with energy $\rho\,\Delta V$. According to (26) the
energy density in $C$ is equal to

$$\rho = aT^4 = \int_0^\infty \frac{8\pi hc\,d\lambda}{\lambda^5(e^{hc/kT\lambda} - 1)}.$$

Let $r$ be the distance between $\Delta A$ and $\Delta V$. For symmetry reasons we
assume that after time $t = r/c$ the energy in $\Delta V$ has been distributed by
radiation onto the surface of a ball of radius $r$. Thus on $\Delta A$ we have the
energy
$$\frac{\Delta A \cos\vartheta}{4\pi r^2}\,\rho\,\Delta V$$

(see Fig. 68.1(b)). By choosing spherical coordinates we find

$$\Delta V = r^2 \sin\vartheta\,d\vartheta\,d\varphi\,dr.$$

Figure 68.1 (panels (a), (b), (c), (d))

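During the time interval $[0, t]$ all volume elements $\Delta V$ with $0 \le r \le ct$
radiate energy onto $\Delta A$, where $c$ is the velocity of light. Summing all
these energies we obtain

$$E_{\Delta A} = \int \frac{\Delta A \cos\vartheta}{4\pi r^2}\,\rho\,dV = \frac{\rho\,\Delta A}{4\pi} \int_0^{\pi/2} \cos\vartheta \sin\vartheta\,d\vartheta \int_0^{2\pi} d\varphi \int_0^{ct} dr = ct\rho\,\Delta A/4.$$
This yields (27) and (27*) with $A$ replaced by $\Delta A$.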
(II) We now consider the situation of Figure 68.1(c), i.e., we look at the
container $C$ of (I) with a slot of small surface measure $\Delta A$. According to
(I), this slot emits the energy $E_{\Delta A}$ during the time interval $[0, t]$. The
crucial hypothesis of physicists then is that this slot behaves like a black
radiator. In fact, the slot absorbs almost completely all incoming radiation
(see Fig. 68.1(d)).
(III) Next consider any arbitrary black radiator of surface measure $A$. We
regard its small surface elements $\Delta A$ as black radiators which emit the
same energy as the slot in (II). Summation over $\Delta A$ yields
$$E_A = \sum E_{\Delta A} = ct\rho A/4.$$
This gives (27) and (27*). Our motivation is complete.
In Problem 68.4 Wien's displacement law, which is a consequence of (27*),
will be discussed.
A deeper understanding of radiation processes, however, is only possible
by using quantum electrodynamics.
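
The factor 1/4 in $E_A = ct\rho A/4$ comes entirely from the angular integration in step (I), and together with $\rho = aT^4$ it gives an emitted energy per unit area and unit time of $(ac/4)T^4$. The Python sketch below checks the angular factor numerically and evaluates $ac/4$; it is a hedged illustration in which the value of $c$, the name `sigma`, and the midpoint rule are assumptions made here.

```python
import math

H = 6.625e-34   # Planck's constant in J s (value quoted in the text)
K = 1.380e-23   # Boltzmann constant in J/K (value quoted in the text)
C = 2.998e8     # velocity of light in m/s (assumed value, not quoted in the text)

def radiation_constant():
    """a = 8 pi^5 k^4 / (15 c^3 h^3), the constant in the energy density rho = a T^4."""
    return 8.0 * math.pi**5 * K**4 / (15.0 * C**3 * H**3)

def geometric_factor(steps=100000):
    """(1/(4 pi)) * int_0^{pi/2} cos(th) sin(th) dth * int_0^{2 pi} dphi, midpoint rule."""
    dth = (math.pi / 2.0) / steps
    integral = sum(
        math.cos((i + 0.5) * dth) * math.sin((i + 0.5) * dth) for i in range(steps)
    ) * dth
    return integral * 2.0 * math.pi / (4.0 * math.pi)

factor = geometric_factor()
sigma = radiation_constant() * C * factor   # emitted energy per area, time and T^4

print("geometric factor =", factor)
print("a c / 4          =", sigma, "W m^-2 K^-4")
```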

68.6. The Cosmos at a Temperature of $10^{11}$ K

In Section 58.15 we made essential use of the fact that at $10^{11}$ K the cosmos
has an energy density which is equal to 9/2 times the corresponding energy
density of photons. We want to show that this is a consequence of (25). To
this end we make the following two assumptions:
(i) At $10^{11}$ K the universe is in a thermodynamical equilibrium. The critical
temperatures in Table 58.3 show that at $10^{11}$ K essentially only photons,
electrons, positrons, and neutrinos exist.
(ii) The chemical potentials of these particles are equal to zero. This will be
motivated in Problem 68.1.
According to (25), the energy density of the universe at $T = 10^{11}$ K is then
equal to

$$E/V = aT^4\left(1 + \tfrac{7}{8} + \tfrac{7}{8} + 4 \cdot \tfrac{7}{16}\right) = \tfrac{9}{2}\,aT^4.$$

Note that the photons satisfy the Bose statistics, and that the electrons,
positrons, and neutrinos satisfy the Fermi statistics. Moreover, Table 58.3
implies that for photons $g = 2$ holds (two spin positions). For electrons and
positrons we also have $g = 2$, while for the four neutrino species we have $g = 1$.
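
The factor 9/2 is a simple sum over the particle species. The Python sketch below repeats this bookkeeping with exact fractions; it is an illustration that assumes an energy of $g/2$ times $aT^4V$ for a Bose gas and $7/8$ of that for a Fermi gas with the same $g$ (the 7/8 is the ratio of the standard Fermi and Bose integrals). The data structure and names are hypothetical.

```python
from fractions import Fraction

def energy_factor(g, statistics):
    """Energy density of one species in units of a*T^4:
    g/2 for Bose statistics and (7/8)*(g/2) for Fermi statistics."""
    base = Fraction(g, 2)
    return base if statistics == "bose" else base * Fraction(7, 8)

# (name, number of spin states g, statistics, number of species of this kind)
species = [
    ("photon", 2, "bose", 1),
    ("electron", 2, "fermi", 1),
    ("positron", 2, "fermi", 1),
    ("neutrino", 1, "fermi", 4),
]

total = sum(count * energy_factor(g, stat) for _, g, stat, count in species)
photon_only = energy_factor(2, "bose")

print("total energy density / (a T^4) =", total)
print("ratio to the photon gas        =", total / photon_only)
```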

68.7. Basic Equation for Star Models


For a radially symmetric star the basic equation is
$$P'(r) = -GM(r)\rho(r)/r^2, \qquad P = P(\rho, T). \tag{28}$$
Thereby $r$ is the distance from one point to the center of the star. Moreover,
we have $P$ as pressure, $\rho$ as density, $T$ as temperature, $G$ as the gravitational
constant, and $M(r)$ is the mass of a ball of radius $r$ around the center of a star,
i.e.,

$$M(r) = \int_0^r 4\pi\rho(r)r^2\,dr.$$

We motivate (28). If we consider a small volume element $\Delta V$ at a distance
$r$ from the center, then $\Delta V$ is affected by the gravitational force. According to
Problem 58.5, the effect is the same as if the entire mass $M(r)$ were located at
the center. Hence, this force is equal to
$$-(GM(r)\rho(r)\,\Delta V/r^2)e_r.$$
Thereby $e_r$ is a unit radial vector. More precisely, the force density is equal to
$$K = -(GM(r)\rho(r)/r^2)e_r.$$
Pressure and gravitational force must be in an equilibrium. We assume that
only pressure forces are effective, i.e., the stress tensor has the diagonal form
$\sigma^i_j = -P\delta^i_j$ (see Section 70.2). From the equilibrium condition
$$\operatorname{div}\sigma + K = 0$$
of Section 61.5 it follows that $(\operatorname{div}\sigma)e_r = -Ke_r$. This is (28).

68.8. Maximal Chandrasekhar Mass of White Dwarf Stars

White dwarfs have a very high density. The best-known white dwarf is the
Sirius companion. Based on the irregular motion of Sirius, Bessel (1784-1846)
predicted this companion, and in 1862 it was discovered by Clark. Only in
1915 was it found, by using spectroscopic methods, that both stars have the
same effective surface temperature of 10,000 K. Since the companion has a
$10^4$ times smaller luminosity than Sirius, and its mass is equal to the mass of
the sun, it has to be very small. Its radius is about one hundredth of the radius
of the sun and its density is approximately 1,000 kg/cm$^3$. At these densities,
the electron shell of the atoms has been crushed. The electrons move freely.
The pressure of this electron gas is in an equilibrium with the gravitational
force.
For the maximal mass of such a star one obtains the famous formula of
Chandrasekhar
$$M_{crit} = \mu^{-2}(3\pi)^{1/2}\alpha_G^{-3/2}m_N = \mu^{-2} \cdot 5.87\,M_{sun}. \tag{29}$$
Thereby we have: $G$ as gravitational constant, $m_N$ as average mass of a nucleon
(proton or neutron), and $M_{sun}$ as mass of the sun. Moreover,

$$\alpha_G = \frac{Gm_N^2}{\hbar c} = 6 \cdot 10^{-39}$$

with $\hbar = h/2\pi$ is the so-called fine structure constant of gravitation.
The key number $\mu$ in (29) is the number of nucleons which belongs to one
electron. For a helium star we have $\mu = 2$, hence $M_{crit} = 1.5M_{sun}$. For iron
we have $\mu = 56/26$, hence
$$M_{crit} = 1.26M_{sun}.$$
This value is important, since for elements in the periodic system which are
heavier than iron, the nuclear synthesis does not produce energy but consumes
energy. Therefore, iron represents an especially important limiting case. We
find that a white dwarf can have maximally about 1.2 solar masses, whereby
this value still depends a little on the particular model.
This maximal mass was determined by Chandrasekhar, in 1933, in his
Cambridge dissertation. In Section 76.16, we explain why this mass is so
important for the final state of a star. In 1983, Chandrasekhar and Fowler
received the Nobel prize for their contributions towards an understanding of
the structure of stars.
We now motivate (29).
Step 1. Equation of state.
A white dwarf has a very high density. We assume that, in principle, the
heavy nucleons are motionless, and hence have no kinetic energy. Therefore
they do not contribute to the pressure $P$. Thus $P$ is only caused by the freely
moving electrons. According to the Pauli principle, two electrons cannot be
in the same quantum mechanical state. Thus in one cell of the phase space
(space-momentum space) with volume h3 we can have only two electrons with
different spin positions described by the spin quantum number S". Because of
the huge matter density we assume that all possible states are occupied, i.e.,
in every cell there are exactly two electrons. Let V be the volume of the star
and $p$ the momentum vector of an electron. The number $N$ of electrons in the
phase volume $V \int dp$ is therefore equal to

$$N = 2Vh^{-3} \int dp = \int n(p)\,dp.$$

We have to integrate over a ball of radius $p_0$. Thus we have

$$N = \frac{8\pi V}{h^3} \int_0^{p_0} |p|^2\,d|p| = \frac{8\pi V p_0^3}{3h^3}.$$

For the limiting momentum we find

$$p_0 = (3h^3N/8\pi V)^{1/3}.$$
The density of the star is
$$\rho = (Nm_e + N\mu m_N)/V$$
with electron mass $m_e$, nucleon mass $m_N$, and number of nucleons $\mu$ per
electron. Because of $m_N/m_e = 1{,}836$ we set approximately $\rho = N\mu m_N/V$. This
gives

$$p_0 = (3h^3\rho/8\pi\mu m_N)^{1/3}.$$

Moreover, we assume that the electrons have a very high velocity $v$, which lies
in the neighborhood of the velocity $c$ of light. For the relativistic energy of the
electrons we find
$$\varepsilon = m_e c^2/\sqrt{1 - v^2/c^2} = \sqrt{m_e^2 c^4 + p^2 c^2}$$
(see Section 75.11). Therefore we may set approximately $\varepsilon = |p|c$. Hence the
energy of the electrons is given by

$$E = \int n(p)\,c|p|\,dp = \frac{8\pi cV}{h^3} \int_0^{p_0} |p|^3\,d|p| = \frac{8\pi cVp_0^4}{4h^3} = \frac{hc}{8\pi^3}\left(\frac{3\pi^2\rho}{m_N\mu}\right)^{4/3} V.$$

Motivated by Example 68.7, we assume the limiting relation $PV = E/3$
between pressure $P$ and energy $E$. This implies the important equation of state
$$P = C\rho^\gamma, \qquad C = \frac{hc}{24\pi^3}\left(\frac{3\pi^2}{m_N\mu}\right)^{4/3}, \qquad \gamma = 4/3. \tag{30}$$
Step 2: Application of the basic equation (28).
Differentiation of (28) yields

$$\frac{d}{dr}\left(\frac{r^2}{\rho(r)}\frac{dP(r)}{dr}\right) = -4\pi Gr^2\rho(r). \tag{31}$$

We set $\rho_0 = \rho(0)$ and introduce new coordinates $\xi$ and $\eta$ through

$$r = \xi\left(\frac{C\gamma}{4\pi G(\gamma - 1)}\right)^{1/2}\rho_0^{(\gamma - 2)/2}, \qquad \rho = \rho_0\,\eta^{1/(\gamma - 1)}.$$
Then we have
$$P = C\rho_0^\gamma\,\eta^{\gamma/(\gamma - 1)}.$$

From (30) and (31) we obtain the differential equation

$$\eta'' + \frac{2}{\xi}\eta' + \eta^{1/(\gamma - 1)} = 0 \tag{32}$$
with initial conditions
$$\eta(0) = 1, \qquad \eta'(0) = 0. \tag{33}$$
The first condition follows from $\rho(0) = \rho_0$. The second condition is $\rho'(0) = 0$.
In fact, $\eta'(0) = 0$ is a necessary consequence of the Taylor expansion for $\eta$ at
the point $\xi = 0$ and (32).
The solutions $\eta = \eta(\xi)$ of (32) with (33) are called Emden's functions. They
may be found in Chandrasekhar (1939, M). For $\gamma = 4/3$ and $\xi > 0$ this function
has a first zero at $\xi_0 = 6.90$. Moreover, we have $-\xi_0^2\eta'(\xi_0) = 2.02$. The
differential equation (32) becomes
$$\xi^{-2}(\xi^2\eta')' + \eta^{1/(\gamma - 1)} = 0.$$
It follows therefore that

$$\int_0^{\xi_0} \xi^2\eta(\xi)^{1/(\gamma - 1)}\,d\xi = -\xi_0^2\eta'(\xi_0) = 2.02 \tag{34}$$

holds. On the boundary of the star we must have the density $\rho = 0$, hence
$\xi = \xi_0$. For the radius $R = r(\xi_0)$ of the star we obtain
$$R = \xi_0\left(\frac{C\gamma}{4\pi G(\gamma - 1)}\right)^{1/2}\rho_0^{(\gamma - 2)/2}.$$
Because of (34) we find for the mass of the star

$$M = \int_0^R 4\pi r^2\rho\,dr = 4\pi\left(\frac{C\gamma}{4\pi G(\gamma - 1)}\right)^{3/2} \cdot 2.02.$$

This is the required formula (29).
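
The numbers $\xi_0 = 6.90$ and $2.02$, and the resulting mass, can be reproduced with a short numerical integration of (32), (33). The Python sketch below is a hedged illustration, not the method of the cited literature: it uses the classical Runge-Kutta scheme, starts slightly off the singular point $\xi = 0$ with the Taylor expansion $\eta = 1 - \xi^2/6 + n\xi^4/120$ for $n = 1/(\gamma - 1)$, and locates the zero by linear interpolation. The values of $c$, $G$, $m_N$, and the mass of the sun, as well as all function names, are assumptions introduced for this example.

```python
import math

def lane_emden_zero(n, h=1.0e-3, xi_max=50.0):
    """Integrate eta'' + (2/xi) eta' + eta**n = 0, eta(0) = 1, eta'(0) = 0
    with the classical Runge-Kutta method; return (xi0, -xi0**2 * eta'(xi0))."""
    def rhs(xi, eta, deta):
        return deta, -2.0 * deta / xi - max(eta, 0.0) ** n

    # Start slightly off the singular point xi = 0 with the Taylor expansion.
    xi = h
    eta = 1.0 - xi**2 / 6.0 + n * xi**4 / 120.0
    deta = -xi / 3.0 + n * xi**3 / 30.0
    while xi < xi_max:
        k1 = rhs(xi, eta, deta)
        k2 = rhs(xi + h / 2, eta + h / 2 * k1[0], deta + h / 2 * k1[1])
        k3 = rhs(xi + h / 2, eta + h / 2 * k2[0], deta + h / 2 * k2[1])
        k4 = rhs(xi + h, eta + h * k3[0], deta + h * k3[1])
        eta_new = eta + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        deta_new = deta + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if eta_new <= 0.0:
            frac = eta / (eta - eta_new)   # linear interpolation to the zero of eta
            xi0 = xi + frac * h
            deta0 = deta + frac * (deta_new - deta)
            return xi0, -xi0**2 * deta0
        xi, eta, deta = xi + h, eta_new, deta_new
    raise ValueError("no zero of eta found below xi_max")

def critical_mass(mu, mass_integral):
    """M = 4 pi (C gamma / (4 pi G (gamma - 1)))^(3/2) * mass_integral with C from (30)."""
    h_planck = 6.625e-34   # J s (value quoted in the text)
    c = 2.998e8            # m/s (assumed value)
    G = 6.674e-11          # m^3 kg^-1 s^-2 (assumed value)
    m_N = 1.673e-27        # kg, nucleon mass (assumed value)
    gamma = 4.0 / 3.0
    C = h_planck * c / (24.0 * math.pi**3) * (3.0 * math.pi**2 / (m_N * mu)) ** (4.0 / 3.0)
    return 4.0 * math.pi * (C * gamma / (4.0 * math.pi * G * (gamma - 1.0))) ** 1.5 * mass_integral

xi0, mass_integral = lane_emden_zero(3.0)    # gamma = 4/3 corresponds to n = 3
M_SUN = 1.989e30                             # kg (assumed value)
m_crit = critical_mass(2.0, mass_integral)   # mu = 2 nucleons per electron

print("xi_0                 =", xi0)
print("-xi_0^2 eta'(xi_0)   =", mass_integral)
print("M / M_sun for mu = 2 =", m_crit / M_SUN)
```

Note that for $\gamma = 4/3$ the central density $\rho_0$ drops out of the mass, which is why `critical_mass` needs no density argument.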


In our model we have used the maximally possible pressure for the electron
gas which would be present if all electrons had the maximal kinetic energy,
i.e., would, practically, move with the velocity of light. The actual pressure
therefore is smaller. It follows that for $M > M_{crit}$ the pressure in the interior of
the star cannot be in an equilibrium with the gravitational force. A gravitational
collapse occurs which results in a much denser neutron star or, at around
$M > 2M_{crit}$, in a black hole with extremely large density. This will be discussed
in Section 76.16.

Let us now briefly consider neutron stars. In this case all protons $p$ and all
electrons $e^-$ are changed into neutrons $n$ and neutrinos $\nu$. This is a
consequence of the reaction $p + e^- \to n + \nu$. For the computation of neutron
stars, one therefore has to deal with a neutron gas instead of an electron gas.
In this direction, compare Weinberg (1972, M), Chapter 11, §4.

PROBLEMS

68.1. Chemical potential of elementary particles in the cosmos at high temperatures.
Motivate assumption (ii) in Section 68.6 by using general symmetry arguments.
Solution: Use the free energy
$$F = F(T, V, N_1, \ldots, N_n).$$
Thereby $T$ is the temperature of the cosmos, $V$ is the volume of the cosmos, and
$N_i$ is the number of particles of the $i$th particle species. According to Table 67.1
of Section 67.5, we find
$$\mu_i = \partial F/\partial N_i$$
for the chemical potential of the $i$th particle species. Let, for example, $N_1$, $N_2$,
$N_3$ be the number of photons $\gamma$, of particles $p$ and of the corresponding
antiparticles $p^*$, respectively. According to Example 67.10, the annihilation reaction
$$p + p^* \to \gamma$$
yields $\mu_2 + \mu_3 = \mu_1$. Motivated by Section 68.4 we assume that
$$\mu_1 = 0.$$
Thus we have $\mu_2 = -\mu_3$. Furthermore, we may assume that $\mu_i(T, V, N_1, \ldots, N_n)$
changes its sign when particles and antiparticles are interchanged. As above,
all reactions between elementary particles correspond to relations between the
chemical potentials. However, we do not consider all these concrete reactions
in detail, but assume in summary that all $\mu_i$ can be uniquely determined from
the presently known conserved quantities for elementary particles:
baryon number $N_B$ (number of nucleons and hyperons minus number of the
corresponding antiparticles),
lepton number $N_L$ (number of electrons, neutrinos, and muons minus number
of the corresponding antiparticles), and
charge number $N_Q$.
The numbers $N_B$, $N_L$, and $N_Q$ change their sign when particles and
antiparticles are interchanged. Thus, according to our assumption above, all $\mu_i$
are odd functions of $N_B$, $N_L$, and $N_Q$. In the cosmos, however, we have
approximately
$$N_B = N_L = N_Q = 0.$$
Thus we have
$$\mu_i = 0 \quad\text{for all } i.$$
Further details may be found in Weinberg (1972, M), Chapter 15.

68.2. Quasi-classical statistics. We consider a gas of $N$ particles with space
coordinates $q = (q_1, \ldots, q_{3N})$ and momentum coordinates $p = (p_1, \ldots, p_{3N})$. Let the
energy be given by
$$E = E(q,p).$$
Determine the statistical potential $\Omega$, the entropy $S$, and the mean value $\bar A$ of
a quantity $A(q,p)$.
Solution: From Proposition 68.1 and Remark 68.3 we obtain
$$\Omega = -kT \ln \int e^{-E(q,p)/kT}\,\frac{dq\,dp}{h^{3N}N!}.$$
For the calculation of the partition function, observe that the $(q,p)$-phase space
is decomposed into cells of volume $h^{3N}$. Moreover, because of the
indistinguishability of the particles, one has to identify all points of the phase space which
are obtained from one another by a permutation of particles. This is the reason
why the factor $N!$ occurs. In classical statistics, the factor $h^{3N}N!$ does not occur
in $\Omega$. For the entropy we find
$$S = -\partial\Omega/\partial T.$$
The probability density $w(q,p)$ satisfies the normalization condition
$$\int w(q,p)\,dq\,dp = 1.$$
Similarly to $w_r = e^{(\Omega - E_r)/kT}$ in the discrete case of Proposition 68.1, we obtain
in the continuous case
$$w(q,p) = \frac{e^{-E(q,p)/kT}}{\int e^{-E(q,p)/kT}\,dq\,dp}$$
and
$$\bar A = \int A(q,p)\,w(q,p)\,dq\,dp. \tag{35}$$
The complete analogy with the discrete case is obtained if everywhere $dq\,dp$ is
replaced with $dq\,dp/h^{3N}N!$. This, however, does not change $\bar A$.
68.3. Classical law of equipartition. We choose
$$A(p) = (2m)^{-1}(p_1^2 + \cdots + p_{3N}^2),$$
i.e., $A(p)$ is equal to the kinetic energy of the gas. Moreover, let
$$E(q,p) = A(p) + U(q)$$
with potential energy $U$. Prove that
$$\bar A = 3N(kT/2),$$
i.e., to each of the $3N$ degrees of freedom corresponds the mean kinetic energy
$kT/2$.

Solution: Use (35) and the formula for the integration by parts
$$\int_{-\infty}^{+\infty} \alpha x^2\, e^{-\alpha x^2/\beta}\,dx = \frac{\beta}{2}\int_{-\infty}^{+\infty} e^{-\alpha x^2/\beta}\,dx \tag{36}$$
for $\alpha, \beta > 0$. Implicitly it is assumed that $U$ is given in such a form that $\bar A$ exists.
Generalization. If the gas is described by $q = (q_1, \ldots, q_f)$, $p = (p_1, \ldots, p_f)$ and
if
$$E(q,p) = A(p) + U(q),$$
where $A$ is a positive definite quadratic form, then one can use an orthogonal
transformation from $A$ into a sum of squares and (36) in order to obtain the
relation
$$\bar A = fkT/2.$$
In the same way one can prove the following. If $E = E(q,p)$ is a positive definite
quadratic form in the variables $(q,p)$, then the mean energy satisfies
$$\bar E = fkT.$$
Such a situation occurs for oscillating systems. In particular, we have $\bar E = kT$
for a harmonic oscillator with $f = 1$ and
$$E(q,p) = \frac{p^2}{2m} + \frac{\omega^2 m q^2}{2}.$$
According to Problem 58.11, a general oscillating system with $f$ degrees of
freedom and in normal coordinates has the energy
$$E(q,p) = \sum_{i=1}^{f}\left(\frac{p_i^2}{2} + \frac{\omega_i^2}{2}\,q_i^2\right),$$
where $p = \dot q$.
Thus we obtain the following important result:
The mean energy of an oscillating system with $f$ degrees of freedom is given by
$$\bar E = fkT.$$
Experience shows that quasi-classical statistics and hence the law of
equipartition is only valid for sufficiently high temperatures. For low temperatures
one has to apply the quantum statistics of Section 68.2.
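The key identity (36) says that the mean of $\alpha x^2$ under the weight $e^{-\alpha x^2/\beta}$ is $\beta/2$, whatever $\alpha$ is. A minimal numerical sketch of this, with illustrative values for the mass, the frequency, and $kT$ (all of them assumptions made only for this example) and a plain midpoint rule:

```python
import math


def mean_quadratic(alpha, beta, n=20000):
    """Mean of alpha*x**2 with respect to the weight exp(-alpha*x**2/beta),
    computed with the midpoint rule on a wide symmetric interval."""
    half_width = 12.0 * math.sqrt(beta / alpha)   # weight is negligible beyond this
    dx = 2.0 * half_width / n
    num = den = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * dx
        weight = math.exp(-alpha * x * x / beta)
        num += alpha * x * x * weight
        den += weight
    return num / den


m, omega, kT = 2.0, 3.0, 1.5                       # illustrative values
kinetic = mean_quadratic(1.0 / (2.0 * m), kT)      # mean of p^2/(2m)
potential = mean_quadratic(m * omega**2 / 2.0, kT) # mean of m*omega^2*q^2/2
print(kinetic, potential, kinetic + potential)
```

For the harmonic oscillator the Boltzmann weight factorizes in $q$ and $p$, so the two one-dimensional means add up to the mean energy $kT$ stated above.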
68.4. Wien's displacement law and an iteration scheme. Determine the wavelength
$\lambda_{\max}$ for which, according to the Stefan-Boltzmann law, a body (black radiator)
emits the most energy.
Solution: According to Section 68.5, one finds for the energy which is emitted
from a surface $A$ during the time interval $[0,t]$,
$$E = \frac{ctA}{4}\int_0^\infty \rho_\lambda\,d\lambda, \qquad \rho_\lambda = \frac{8\pi hc}{\lambda^5\,(e^{hc/kT\lambda} - 1)}.$$
For $\lambda_{\max}$ the value of $\rho_\lambda$ is maximal, i.e., $d\rho_\lambda/d\lambda = 0$. If we set $x = hc/kT\lambda$, then
we obtain
$$x = f(x), \qquad f(x) = 5(1 - e^{-x}).$$
The corresponding iteration scheme $x_{n+1} = f(x_n)$ with $x_0 = 5$ converges very
rapidly to $x = 4.965$. This implies
$$\lambda_{\max} T = hc/kx = 2.9\cdot 10^{-3}\ \text{m K}. \tag{37}$$


In Section 76.17 we will use this law in connection with the vaporization of
black holes.
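The iteration scheme of Problem 68.4 is easy to run. The sketch below assumes $f(x) = 5(1 - e^{-x})$, which is what $d\rho_\lambda/d\lambda = 0$ gives for Planck's spectral density in the variable $x = hc/kT\lambda$; the function name and the stopping tolerance are illustrative, and the constants are the SI values of $h$, $c$, and $k$.

```python
import math


def wien_fixed_point(x0=5.0, tol=1e-12, max_iter=100):
    """Iterate x_{n+1} = 5 (1 - exp(-x_n)) until successive values agree."""
    x = x0
    for _ in range(max_iter):
        x_new = 5.0 * (1.0 - math.exp(-x))
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


h = 6.62607015e-34   # Planck constant, J s
c = 2.99792458e8     # velocity of light, m/s
k = 1.380649e-23     # Boltzmann constant, J/K

x = wien_fixed_point()
lambda_max_T = h * c / (k * x)   # in m K
print(x, lambda_max_T)
```

The rapid convergence reflects the small derivative $f'(x) = 5e^{-x}$ near the fixed point, so Banach's fixed-point theorem applies with a small contraction constant.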

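68.5. Classical integrals. Prove the following formulas:
$$\int_0^\infty \frac{x^{a-1}}{e^x + 1}\,dx = \begin{cases} (1 - 2^{1-a})\,\Gamma(a)\zeta(a) & \text{if } a > 0,\ a \neq 1,\\[2pt] \ln 2 & \text{if } a = 1,\\[2pt] \dfrac{2^{2n-1} - 1}{2n}\,\pi^{2n}B_n & \text{if } a = 2n, \end{cases} \tag{38}$$
$$\int_0^\infty \frac{x^{a-1}}{e^x - 1}\,dx = \begin{cases} \Gamma(a)\zeta(a) & \text{if } a > 1,\\[2pt] \dfrac{(2\pi)^{2n}B_n}{4n} & \text{if } a = 2n, \end{cases}$$
where $n = 1, 2, \ldots$. Moreover, $\zeta$ is the Riemann $\zeta$-function, and $B_n$ are the
Bernoulli numbers with $B_1 = 1/6$, $B_2 = 1/30$, and $B_3 = 1/42$.
Hint: See Landau and Lifšic (1962), Vol. V, §58.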
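Two instances of these formulas can be checked numerically before attempting a proof. The sketch below uses a plain midpoint rule on a truncated interval (both are illustrative choices); it compares the case $a = 4$, $n = 2$ of the second formula, which gives $\pi^4/15$, and the case $a = 2$, $n = 1$ of (38), which gives $\pi^2/12$.

```python
import math


def midpoint(f, a, b, n=200_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# The integrands decay like exp(-x), so truncating at x = 50 is harmless.
bose = midpoint(lambda x: x**3 / math.expm1(x), 0.0, 50.0)
fermi = midpoint(lambda x: x / (math.exp(x) + 1.0), 0.0, 50.0)

bose_exact = (2 * math.pi) ** 4 * (1 / 30) / (4 * 2)   # (2 pi)^{2n} B_n / (4n), n = 2
fermi_exact = (2**1 - 1) / 2 * math.pi**2 * (1 / 6)    # (2^{2n-1} - 1)/(2n) pi^{2n} B_n, n = 1
print(bose, bose_exact, fermi, fermi_exact)
```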
68.6. Connection between the general entropy definition and Planck's classical
definition. Consider $n$ different particles $P_1, \ldots, P_n$ which may assume the energy
values $E_1, \ldots, E_R$. The key question in classical statistical physics is: "How, at
a given temperature $T$, are these particles distributed onto the energy values?"
68.6a. Modern approach via information theory. How can this problem be solved in
the context of our theory?
Solution: Consider one particle and assign the possible states $Z_r = (E_r, N_r)$
with $r = 1, \ldots, R$ to it. Thereby $E_r$ denotes the energy of the state, and $N_r = 1$
for all $r$, the number of particles. Let $w_r$ be the probability that state $Z_r$ is realized.
The corresponding entropy is
$$S = -k\sum_{r=1}^{R} w_r \ln w_r. \tag{39}$$
In order to determine $w_r$ we solve the problem
$$S = \max!, \qquad \sum_r w_r = 1, \quad \sum_r N_r w_r = N, \quad \sum_r E_r w_r = E, \tag{40}$$
with $N = 1$, according to Section 68.1a. The Lagrange multiplier rule yields the
solution (1), i.e.,
$$w_r = \frac{e^{(\mu N_r - E_r)/kT}}{\sum_r e^{(\mu N_r - E_r)/kT}}.$$
From $N = 1$ we obtain $\mu = 0$ for the chemical potential. If $n$ particles are given,
then we find a distribution of
$$n_r = w_r n = \frac{n\,e^{-E_r/kT}}{\sum_r e^{-E_r/kT}}, \tag{41}$$
i.e., exactly $n_r$ particles have the energy $E_r$ at temperature $T$. This is the famous
formula of classical statistical physics.

Table 68.4.
E_1      E_2      n_1  n_2  Statistical weight W(n_1,n_2)  Probability w(n_1,n_2)
P_1P_2   -        2    0    1                              1/4
-        P_1P_2   0    2    1                              1/4
P_1      P_2      1    1    2                              1/2
P_2      P_1      1    1    2                              1/2
68.6b. Classical derivation of (41). In contrast to the procedure above, we now start
from $n$ different particles $P_1, \ldots, P_n$ instead of just from one. Boltzmann's basic
idea then was to consider all possible distributions of $P_1, \ldots, P_n$ onto the
energies. Table 68.4 shows this for two particles $P_1$, $P_2$ and two possible energy
values $E_1$, $E_2$. Thereby $n_r$ is the number of particles with energy $E_r$. Let
$w(n_1, \ldots, n_R)$ denote the probability that exactly $n_r$ particles have the energy $E_r$
for all $r$. In order to compute this probability in a simple way, we observe the
generalized binomial formula
$$(a_1 + \cdots + a_R)^n = \sum_{n_1 + \cdots + n_R = n} W(n_1, \ldots, n_R)\,a_1^{n_1}\cdots a_R^{n_R}$$
with
$$W(n_1, \ldots, n_R) = \frac{n!}{n_1!\cdots n_R!}.$$
One easily sees that the number of favorable cases for our problem equals
$W(n_1, \ldots, n_R)$. By setting all $a_r$ equal to 1, we obtain
$$\sum W(n_1, \ldots, n_R) = R^n,$$
and hence
$$w(n_1, \ldots, n_R) = W(n_1, \ldots, n_R)/R^n.$$
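The entries of Table 68.4 can be reproduced by enumerating all occupation numbers. The following is a minimal sketch; the helper names `weight` and `occupations` are hypothetical and chosen only for this illustration.

```python
from math import factorial


def weight(ns):
    """Statistical weight W(n_1, ..., n_R) = n! / (n_1! ... n_R!)."""
    w = factorial(sum(ns))
    for n_r in ns:
        w //= factorial(n_r)   # every intermediate quotient is an integer
    return w


def occupations(n, R):
    """All tuples (n_1, ..., n_R) of nonnegative integers with sum n."""
    if R == 1:
        yield (n,)
        return
    for first in range(n, -1, -1):
        for rest in occupations(n - first, R - 1):
            yield (first,) + rest


n, R = 2, 2                                   # two particles, two energy values
table = {ns: weight(ns) for ns in occupations(n, R)}
total = sum(table.values())                   # equals R**n
probs = {ns: W / total for ns, W in table.items()}
print(table, total, probs)
```

For $n = R = 2$ this gives the weights 1, 2, 1 and the probabilities 1/4, 1/2, 1/4 of the table; larger $n$ and $R$ can be passed in the same way.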
The quantities $W(n_1, \ldots, n_R)$ are called statistical weights. Following the
classical example of Max Planck we set
$$S_P = k \ln W(n_1, \ldots, n_R)$$
and compute the required distribution from the problem:
$$S_P = \max!, \qquad \sum_{r=1}^{R} n_r = n, \quad nE = \sum_{r=1}^{R} n_r E_r. \tag{42}$$
To clearly realize the connection with our approach, we introduce the numbers
$$w_r = n_r/n$$
and use Stirling's approximation formula
$$n! \sim \left(\frac{n}{e}\right)^n$$
for large $n$. In case that also all $n_r$ are large, it follows that
$$S_P \approx -kn\sum_r w_r \ln w_r.$$
Using this approximation formula, we obtain problem (40) from (42), which has
been solved above; the constant factor $n$ does not affect the maximizer. Consequently,
in the context of this approximation, we find in expression (41) a solution of (42).
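How good the Stirling step is can be seen numerically. The sketch below compares $\ln W$ (that is, $S_P/k$) with $-n\sum_r w_r \ln w_r$ for an arbitrarily chosen occupation with three energy values; the function names and the sample occupation numbers are assumptions made only for this illustration.

```python
from math import lgamma, log


def log_weight(ns):
    """ln W(n_1, ..., n_R) = ln n! - sum_r ln n_r!, via the log-gamma function."""
    return lgamma(sum(ns) + 1) - sum(lgamma(m + 1) for m in ns)


def stirling_form(ns):
    """-n * sum_r w_r ln w_r with w_r = n_r / n."""
    n = sum(ns)
    return -n * sum((m / n) * log(m / n) for m in ns if m > 0)


def relative_error(ns):
    exact = log_weight(ns)
    return abs(exact - stirling_form(ns)) / exact


small = relative_error((500, 300, 200))
large = relative_error((50_000, 30_000, 20_000))
print(small, large)
```

The relative error shrinks as all $n_r$ grow, in line with the assumption that $n$ and all $n_r$ are large.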

References to the Literature

Classical works: Boltzmann (1873) (Boltzmann equation), (1909) (collected works),
Gibbs (1902, M), Planck (1913, M).
History of statistical physics: Sommerfeld (1954, M), Vol. 5, Cohen and Thirring
(1973, P).
Introduction: Kittel (1969, M).
Classical textbooks: Planck (1913), Sommerfeld (1954), Vol. 5, Landau and Lifšic
(1962), Vols. 5, 9, 10 (much material), Huang (1963).
Modern presentation: Ruelle (1978, S), Wehrl (1978, S), Martin (1981, S), Balian
(1982, L), Thirring (1983, M), Vol. 4, Kubo (1983, M), (1985, M).
Infinite-dimensional systems in classical statistical mechanics: Petrina (1983, S).
Thermodynamics for nonequilibriums and the Boltzmann equation: Sommerfeld
(1954, M), Vol. 5 (classical methods), Landau and Lifšic (1962), Vol. 10 (standard work),
Groot and Mazur (1962, M), Glansdorff and Prigogine (1971, M), Cohen and Thirring
(1973, P), Cercignani (1975, M), Greenberg (1986, M), Röpke (1987).
Methods of nonstandard analysis, stochastic processes in mathematical physics, and
the Boltzmann equation: Albeverio (1986, M).
Synergetics and self-organization in nonequilibrium systems: Glansdorff and
Prigogine (1971, M), Nicolis and Prigogine (1977, M), Haken (1977, M), (1983, M),
Ebeling and Feistel (1982, M), Ebeling and Klimontowitsch (1985, L).
Rational mechanics and gas kinetics: Truesdell and Muncaster (1980, M).
Modern quantum statistics: Thirring (1983, M), Vol. 4 (introduction), von Neumann
(1932, M) (classical work), Ruelle (1969, M), (1978, S), Bratteli and Robinson (1979, M)
(von Neumann algebras), Bogoljubov (1980, M), (1984, M), Alberti and Uhlmann
(1981, L), Fick and Sauermann (1982, M), Simon (1987, M).
Solvable models of quantum statistics: Dubin (1974, M), Baxter (1982, M), Sinai
(1982, M), Simon (1987, M).
Critical phenomena, self-similarity, and phase transition: Wilson (1982, S, B)
(recommended as an introduction), Ma (1982, M), Kubo, Toda, and Saitô (1983, M),
Fröhlich (1983, M).
Spin glass theory: Mézard (1987, P).
CHAPTER 69

Continuation with
Respect to a Parameter and a
Radiation Problem of Carleman

We solve this problem with Carleman (1921) by using a method which was
first employed by Ed. Le Roy and which, with a striking success, was used by
S. Bernstein around 1900. It consists in a stepwise continuation of the solution
in dependence of a suitable parameter .... It is of great importance to find a
priori bounds for the solution of a differential equation, which contains a
parameter, as well as to find these bounds for all partial derivatives up to a
certain degree.
Leon Lichtenstein (1931)

In Chapter 6 we studied the important method of continuation with respect
to a parameter. In the present chapter we discuss a nontrivial physical
application. We want to show how, by using existence theorems for linear problems
and a priori estimates, one can find existence results for nonlinear problems.
The proof technique used here may also be applied to many other problems.
In the first two sections we discuss the basic equations of heat conduction.

69.1. Conservation Laws

Many basic equations in physics have the form of conservation laws
$$\rho_t + \operatorname{div}\vec j = p. \tag{1}$$
We want to explain in which sense (1) represents a conservation law. We
interpret $\rho$ as the density of the chemical substance $S$, $\vec j$ as the mass current
density vector, and $p$ as the mass production per volume and time. If $G$ is a
bounded region in $\mathbb{R}^3$, then, by definition, we have
$$\int_G \rho(x,t)\,dx = \text{mass in } G \text{ at time } t.$$
Moreover, we set
$J(t)$ = mass of $S$ which during the time interval $[0,t]$ flows out of $G$,
$M(t)$ = mass of $S$ which during the time interval $[0,t]$ is produced in $G$ as a
consequence of chemical reactions.
By definition, we have
$$\dot J = \int_{\partial G} \vec j\,n\,dO, \qquad \dot M = \int_G p\,dx,$$
with outer unit normal vector $n$. Obviously, there exists the relation
$$\frac{\partial}{\partial t}\int_G \rho\,dx = \dot M - \dot J.$$
Integration by parts yields
$$\int_G (\rho_t + \operatorname{div}\vec j - p)\,dx = 0. \tag{2}$$
We assume that all functions and regions are sufficiently regular. By
contracting $G$ into a single point $x$ we obtain (1) from (2), according to the mean
value theorem of integral calculus.
For the special case $p = 0$, relation (1) describes mass conservation.
Moreover, if
$$p(x,t) > 0 \quad\text{or}\quad p(x,t) < 0,$$
then there exists a source (production of mass) or a sink (annihilation of mass)
at the point $x$ at time $t$. This may have its cause, for
example, in chemical reactions.
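The balance "change of mass = production minus outflow" survives discretization exactly if the flux is differenced in conservative form. The following one-dimensional sketch illustrates this; the grid, the sample density, flux, and production values, and the function name `advance` are all assumptions made for the example, not data from the text.

```python
def advance(rho, flux, prod, dx, dt):
    """One explicit step of the cell-averaged form of rho_t + div j = p in 1D.

    rho, prod: values in the N cells; flux: values of j at the N + 1 cell
    interfaces. Each cell gains the production and loses the flux difference."""
    return [r + dt * (p - (flux[i + 1] - flux[i]) / dx)
            for i, (r, p) in enumerate(zip(rho, prod))]


N, dx, dt = 50, 0.02, 1e-3
rho = [1.0 + 0.5 * i * dx for i in range(N)]          # sample density
flux = [0.3 * (i * dx) ** 2 for i in range(N + 1)]    # sample current at interfaces
prod = [0.1 * (i % 3) for i in range(N)]              # sample production

new_rho = advance(rho, flux, prod, dx, dt)
mass_change = (sum(new_rho) - sum(rho)) * dx          # change of the mass in G
produced = dt * sum(prod) * dx                        # discrete analogue of M
outflow = dt * (flux[-1] - flux[0])                   # discrete analogue of J
print(mass_change, produced - outflow)
```

The interior flux values cancel in pairs when the cells are summed, which is the discrete counterpart of the integration by parts leading to (2).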

69.2. Basic Equations of Heat Conduction


Let $T = T(x,t)$ be the temperature at the point $x \in \mathbb{R}^3$ at time $t$. We replace
the mass of the previous section with the amount of heat. Then $\rho$ is the heat
density and $p$ the heat production per volume and time.
If now $\mu$ is the mass density, we find
$$\rho_t = c\mu T_t,$$
because the temperature change $\Delta T$ in a small volume element $\Delta V$ causes the
change in heat
$$\Delta Q = c\mu\,\Delta V\,\Delta T$$
with specific heat $c = c(T)$. Note that $\Delta Q/\Delta V\,\Delta t \to \rho_t$. Thus we obtain from (1)
the following basic equations of heat conduction.
(i) Heat balance
$$c\mu T_t + \operatorname{div}\vec j = p. \tag{3}$$
(ii) Constitutive law
$$\vec j = \Psi(T, T_x). \tag{4}$$

EXAMPLE 69.1 (Fourier's Law). For an isotropic body one often uses
$$\vec j = -\kappa\,\operatorname{grad} T, \tag{5}$$
where the so-called heat conductivity number $\kappa$ may still depend on the
position $x$; note that $\kappa$ is positive. For a homogeneous body we have, in
addition, that $\kappa(x) = \text{const}$. Table 69.1 contains several values for $\kappa$. Note that
there 1 kcal = 4185 J.
By inserting (5) into (3) we obtain the basic equation
$$c\mu T_t - \operatorname{div}(\kappa\,\operatorname{grad} T) = p. \tag{6}$$
For $\kappa = \text{const}$ it is
$$c\mu T_t - \kappa\,\Delta T = p, \tag{7}$$
where $\Delta$ is the Laplace operator.

One still has to add boundary and initial conditions.
(i) The initial condition consists in prescribing $T(x,0)$, i.e., the temperature in
the region $G$ at time $t = 0$.
(ii) As boundary condition one can use the temperature $T$ or the normal
component $\vec j\,n$ of the heat flow density vector on $\partial G$.

Table 69.1
Material   Heat conductivity κ [kcal/(m s K)]   Heat conduction
Silver     0.1                                  good
Iron       0.017                                good
Lead       0.008                                good
H2O        1.43·10^-4                           average
Rubber     0.3·10^-4                            average
CO2        3·10^-6                              bad
N2         6·10^-6                              bad

According to (5) we have
$$\vec j\,n = -\kappa\,\partial T/\partial n.$$
Also, the boundary condition
$$\partial T/\partial n = f(T) \quad\text{on } \partial G \tag{8}$$
is possible, i.e., the heat flow on the boundary depends on the temperature.
In Section 37.7 we have seen that it is also meaningful to prescribe
inequalities for $\vec j\,n$. For example, $\vec j\,n \le 0$ means that no heat flows out in the direction
of the outer unit normal vector. Such a condition makes sense if the outer
temperature is greater than the temperature of the body.
In conclusion, we want to motivate Fourier's law (5). From experience we
assume that the heat flow depends only on the temperature difference. This
makes the ansatz $\vec j = \Psi(T_x)$ with $\Psi(0) = 0$ plausible. Taylor expansion yields
$$\vec j = \Psi'(0)\,T_x + \cdots.$$
Since no direction is preferred for an isotropic body, we must have that
$$\Psi'(0) = -\kappa I.$$
This is (5). The positive sign of $\kappa$ in (5) corresponds to the fact that heat always
flows in the direction of falling temperature.

69.3. Existence and Uniqueness for a Heat Conduction Problem

We consider a homogeneous body in a bounded region $\Omega$ of $\mathbb{R}^3$. We assume
that there is a heat source at the point $x_0 \in \Omega$, and moreover, that the body
emits heat. We investigate if in this situation a stationary temperature state
exists for the body.
Let $T(x)$ be the temperature at the point $x$. We obtain the following
boundary-value problem:
$$\begin{aligned} \Delta T &= 0 && \text{in } \Omega - \{x_0\},\\ T(x) &= u(x) + \frac{a}{|x - x_0|} && \text{in } \Omega - \{x_0\},\\ T &> 0 && \text{in } \bar\Omega - \{x_0\},\\ \frac{\partial T}{\partial n} &= -kT^4 && \text{on } \partial\Omega. \end{aligned} \tag{9}$$
By a classical solution we mean $u \in C^2(\Omega) \cap C^1(\bar\Omega)$. Our assumptions are:

(H) $\Omega$ is a bounded region in $\mathbb{R}^3$ with smooth boundary, i.e., more precisely,
let $\partial\Omega \in C^{2,\mu}$ with fixed $\mu \in\, ]0,1[$. Moreover, let $a$ and $k$ be positive
constants.

Theorem 69.A (Carleman (1921)). If (H) holds, then (9) has a unique classical
solution.

The proof will be given in the following section. At first, however, we will
motivate the problem physically.
Because of the homogeneity of the body we obtain from (5) for the heat flow
density vector $\vec j = -\kappa\,\operatorname{grad} T$. The singularity $a/|x - x_0|$ yields the portion
$$\vec j = \kappa a(x - x_0)/|x - x_0|^3.$$
Therefore during one unit of time, e.g., a second, the amount of heat
$$\int \vec j\,n\,dO = 4\pi\kappa a$$
flows through the surface of a ball of radius $R$ and center $x_0$. Thus the
singularity models the heat source.
The boundary condition in (9) means that during one unit of time the
amount of heat
$$\int_{\partial\Omega} \vec j\,n\,dO = -\int_{\partial\Omega} \kappa\,\frac{\partial T}{\partial n}\,dO = k\kappa\int_{\partial\Omega} T^4\,dO$$
is emitted. This expression corresponds to the Stefan-Boltzmann law of
Section 68.5.
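As a worked check of the value $4\pi\kappa a$ (a short sketch that uses only the singular part of the heat flow given above and the outer unit normal $n = (x - x_0)/R$ on the sphere $|x - x_0| = R$):
$$\vec j\,n = \frac{\kappa a\,(x - x_0)}{R^3}\cdot\frac{x - x_0}{R} = \frac{\kappa a}{R^2}, \qquad \int_{|x - x_0| = R} \vec j\,n\,dO = \frac{\kappa a}{R^2}\cdot 4\pi R^2 = 4\pi\kappa a.$$
The result does not depend on $R$, as one expects for a point source of constant strength.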

69.4. Proof of Theorem 69.A


We use the method of continuation with respect to a parameter. Thereby we
apply, without proof, a number of theorems of classical potential theory. The
analytical key result is contained in Lemma 69.4. In particular, we will
repeatedly make use of the maximum principle in the form of Problem 7.2.
In this section we assume (H) of the preceding section and choose
$$X = C^{\mu}(\partial\Omega) \quad\text{for fixed } \mu \in\, ]0,1[.$$
Then $X$ is a B-space with norm
$$\|f\| = \max_{x \in \partial\Omega} |f(x)| + H_\mu(f).$$
The Hölder constant $H_\mu(f)$ is the smallest number $c$ with
$$|f(x) - f(y)| \le c\,|x - y|^{\mu} \quad\text{for all } x, y \in \partial\Omega.$$

Lemma 69.2. For (9) there exists at most one classical solution.

The proof will be given in Problem 69.1.

We now discuss the existence proof. In order to eliminate the singularity in
(9) we consider, for fixed $x_0$, Green's function $x \mapsto G(x, x_0)$ for the Laplace
operator, i.e.,
$$\begin{aligned} \Delta G &= 0 && \text{in } \Omega - \{x_0\},\\ G(x, x_0) &= v(x) + a/|x - x_0| && \text{in } \Omega - \{x_0\},\\ G &= 0 && \text{on } \partial\Omega, \end{aligned} \tag{10}$$
with $v \in C^2(\Omega) \cap C^1(\bar\Omega)$. The maximum principle implies
$$G > 0 \quad\text{in } \Omega - \{x_0\}, \tag{11}$$
$$\partial G/\partial n < 0 \quad\text{on } \partial\Omega. \tag{12}$$
We now set
$$T = G + w$$
and consider, instead of (9), the problem
$$\begin{aligned} \Delta w &= 0 && \text{in } \Omega,\\ w &> 0 && \text{in } \bar\Omega,\\ -\partial w/\partial n &= \partial G/\partial n + (1 - t)w + tkw^4 && \text{on } \partial\Omega, \end{aligned} \tag{13}$$
with parameter $t \in [0,1]$. If, for $t = 1$, we can find a classical solution
$w \in C^2(\Omega) \cap C^1(\bar\Omega)$ of (13), then we obtain a classical solution of (9) from $T = G + w$.

Lemma 69.3. For fixed $t \in [0,1]$ there exists at most one classical solution of
(13).

This is proved in complete analogy to Lemma 69.2.

Lemma 69.4. Let $b, f \in X$ with $0 < \alpha \le b(x) \le \beta$ on $\partial\Omega$. Then the third
boundary-value problem
$$\begin{aligned} \Delta w &= 0 && \text{in } \Omega,\\ \partial w/\partial n + bw &= f && \text{on } \partial\Omega \end{aligned} \tag{14}$$
has exactly one classical solution $w$. If we set
$$T(b)f = w \quad\text{on } \partial\Omega,$$
then $T(b)\colon X \to X$ is a linear and continuous operator, i.e.,
$$\|T(b)f\|_X \le \delta\,\|f\|_X \quad\text{for all } f \in X. \tag{15}$$
Thereby it is important that the constant $\delta$ does not depend on $b$, but only on
the bounds $\alpha$ and $\beta$.

This classical result of potential theory will be essentially used in the
following.

Corollary 69.5. From $f > 0$ on $\partial\Omega$ follows $T(b)f > 0$ on $\partial\Omega$.

PROOF. Let $x$ be the point for which $w$ assumes its minimum on $\partial\Omega$. The
maximum principle implies that $\partial w(x)/\partial n \le 0$. From (14) it follows then that
$w(x) > 0$. □

Lemma 69.6 (A Priori Bounds). There exist positive numbers $\alpha$, $\beta$, $\gamma$, independent
of $t$, such that the classical solution of (13) satisfies:
$$0 < \alpha \le w(x) \le \beta \quad\text{on } \partial\Omega, \tag{16}$$
$$\|w\|_X \le \gamma. \tag{17}$$

PROOF.

(I) For fixed $t \in [0,1]$ let $w$ be a classical solution of (13). Moreover, let $x \in \partial\Omega$
be a point for which $w$ achieves its minimum on $\partial\Omega$. According to the
maximum principle, we have $\partial w(x)/\partial n \le 0$. The boundary condition in
(13) implies
$$(1 - t)w(x) + tkw(x)^4 \ge -\partial G(x)/\partial n \ge \delta > 0$$
on $\partial\Omega$. Thereby $\delta$ denotes the minimal value of $-\partial G/\partial n$ on $\partial\Omega$. For fixed
$t \in [0,1]$ the function
$$\varphi_t(w) = (1 - t)w + tkw^4$$
is strictly monotone increasing on $\mathbb{R}_+$. Therefore there exists a unique
$\alpha(t) > 0$ with $\varphi_t(\alpha(t)) = \delta$. This implies
$$w(x) \ge \alpha(t) > 0 \quad\text{for all } x \in \partial\Omega.$$
Since the function $\alpha(\cdot)$ is continuous, there exists a number $\alpha$ with
$$\alpha(t) \ge \alpha > 0 \quad\text{for all } t \in [0,1].$$
(II) By replacing the minimum with the maximum one obtains $\beta$ in an
analogous fashion.
(III) A well-known result in potential theory states that if
$$\Delta w = 0 \quad\text{on } \Omega$$
and $w \in C^2(\Omega) \cap C^1(\bar\Omega)$, then
$$|w(x) - w(y)| \le c\left(\max_{\partial\Omega} |\partial w/\partial n|\right)|x - y|^{\mu}$$
is satisfied for all $x, y \in \partial\Omega$. This estimate, which may be found in
Lichtenstein (1930), p. 327, is obtained in the following way. One can use a
potential of the simple layer with continuous density $\rho$ in order to
represent the solution of the second boundary-value problem. Thereby
$\rho$ is obtained as a solution of a Fredholm integral equation. Then one
uses the properties of the potential of the simple layer (see Smirnov
(1956), Vol. IV, Section 206 and Günter (1957), Chapter 2, §2).
Because of $\alpha \le w \le \beta$ on $\partial\Omega$ we obtain from the boundary condition
in (13) that the Hölder constant of $w$ is bounded. This implies (17). □

Using these classical tools, it is then easy to give the following existence
proof.

Lemma 69.7. For $t = 0$ equation (13) has a classical solution.

PROOF. This follows from Lemma 69.4, because (12) and Corollary 69.5 imply
that $w > 0$ on $\partial\Omega$, and hence also $w > 0$ on $\bar\Omega$, according to the maximum
principle. □

Our goal is to continue this solution for $t = 0$ up to the parameter value
$t = 1$. For fixed $t$ let $w_t$ be a classical solution of (13), where $t$ denotes an index
in $w_t$. For $t + \tau$ we are looking for a solution of (13) which has the form
$$w = w_t + v.$$
This yields
$$\begin{aligned} \Delta v &= 0 && \text{in } \Omega,\\ \partial v/\partial n + bv &= f && \text{on } \partial\Omega, \end{aligned} \tag{18}$$
with
$$b = 1 - t - \tau + 4k(t + \tau)w_t^3,$$
$$f = \tau(w_t - kw_t^4) - k(t + \tau)(6w_t^2v^2 + 4w_tv^3 + v^4).$$
The following properties of $f$ are important:
All terms of $f$ either contain the small parameter $\tau$, or are of
higher order than one in $v$. (19)

Lemma 69.8. There exists a number $\tau_0$ independent of $t$ such that (18) has a
classical solution $v$ for every $\tau$ with $0 < \tau \le \tau_0$ and $t + \tau \le 1$.

PROOF. According to Lemma 69.4 we write (18) as an operator equation
$$v = T(b)f(v), \qquad v \in X. \tag{20}$$
Because of the a priori estimate (16) and Lemma 69.4 we have $\|T(b)\| \le \text{const}$
uniformly for all $t$. Banach's fixed-point theorem (Theorem 1.A) applied to the
ball $\{v \in X\colon \|v\| \le r\}$ with sufficiently small $r$ and (19) yield a solution of (20)
(see Problem 69.2).
Because of the a priori estimate (17) we can choose $r$ and $\tau$ independently
of $t$. Because of (16) we can also assure that $w_t + v > 0$ on $\partial\Omega$, hence $w_t + v > 0$
on $\bar\Omega$, according to the maximum principle. □

Lemma 69.8 shows that if we start with the solution at t = 0, then after
finitely many steps we obtain a solution for t = 1.
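The stepwise continuation can be illustrated on a toy case. The sketch below rests on an assumption that is not in the text: $\Omega$ is taken to be a ball of radius $R$ with center $x_0$. Then $G = a/|x - x_0| - a/R$, so $\partial G/\partial n = -a/R^2$ on $\partial\Omega$, the solution $w$ of (13) is a constant, and (13) reduces to the scalar equation $(1 - t)w + tkw^4 = a/R^2$. In this scalar setting the operator $T(b)$ of Lemma 69.4 is simply division by $b$, and each step solves the fixed-point equation (20) by iteration. The function name and the parameter values are illustrative.

```python
def continue_in_t(a=2.0, k=1.0, R=1.0, steps=100, tol=1e-14):
    """Continuation from t = 0 to t = 1 for the scalar analogue
    (1 - t) w + t k w**4 = a / R**2 of problem (13) on a ball."""
    d = a / R**2            # equals -dG/dn on the sphere
    w = d                   # solution for t = 0
    tau = 1.0 / steps
    for i in range(steps):
        s = (i + 1) / steps                      # new parameter value t + tau
        b = 1.0 - s + 4.0 * k * s * w**3         # coefficient b in (18)
        v = 0.0
        for _ in range(200):                     # fixed-point iteration for (20)
            f = tau * (w - k * w**4) - k * s * (6 * w**2 * v**2 + 4 * w * v**3 + v**4)
            v_new = f / b                        # scalar version of T(b) f
            if abs(v_new - v) < tol:
                v = v_new
                break
            v = v_new
        w += v                                   # w_{t + tau} = w_t + v
    return w


w1 = continue_in_t()
print(w1, 2.0 ** 0.25)
```

For $t = 1$ the scalar equation reads $kw^4 = a/R^2$, so the continued value should agree with $(a/kR^2)^{1/4}$. The step size plays the role of $\tau_0$ in Lemma 69.8: it must be small enough that the inner iteration is a contraction.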

PROBLEMS

69.1. Proof of Lemma 69.2.
Solution: Suppose $T_1$ and $T_2$ are solutions of (9). We set $T = T_1 - T_2$.
Integration by parts yields
$$0 = -\int_\Omega T\,\Delta T\,dx = \int_\Omega (\operatorname{grad} T)^2\,dx - \int_{\partial\Omega} T\,\partial T/\partial n\,dO.$$
Because of the monotonicity of $T \mapsto T^4$ on $\mathbb{R}_+$ we obtain
$$-T\,\partial T/\partial n = k(T_1^4 - T_2^4)(T_1 - T_2) \ge 0. \tag{21}$$
This implies that $\operatorname{grad} T = 0$ on $\bar\Omega$, hence $T = \text{const}$. From (21) follows then that
$T = 0$.
69.2. Application of Banach's fixed-point theorem. Verify explicitly that Banach's
fixed-point theorem can, in fact, be applied to (20).
Solution: From (19) it follows that (20) has the precise structure of (H3) and
(H4) in Section 8.12 of Part I.
69.3. Generalizations. Prove Theorem 69.A for the more general boundary condition
$$-\partial T/\partial n = F(T) \quad\text{on } \partial\Omega,$$
where $F\colon \mathbb{R} \to \mathbb{R}$ is $C^1$ with $F(0) \ge 0$, $F'(T) > 0$ for all $T > 0$ and $F(T) \to +\infty$
as $T \to +\infty$.
Hint: Use arguments analogous to Section 69.4. See Lichtenstein (1931), p. 54.

References to the Literature

Classical works: Carleman (1921), Lichtenstein (1931, M).
Classical potential theory: Kellogg (1929, M) and Günter (1957, M) (standard
works), Smirnov (1956, M), Vol. IV (introduction), Lichtenstein (1921, S, B, H) (classical
survey article), (1929, M), (1930).
Approximation methods for heat conduction problems: Aziz and Na (1984, M), Shih
(1984, M), Cebeci (1984, M).
Boundary value problems in kinetic theory: Greenberg (1986, M).
APPLICATIONS IN
HYDRODYNAMICS

Although I envy a great generality with regard to the nature of fluids and the
forces that are being applied to their particles, I have no fear of the reproaches
often leveled with good reasons at those who have undertaken to generalize the
researches of others. Often a great generality causes more confusion than it does
illuminate, and sometimes it leads to such voluminous computations that it
becomes extremely difficult to derive any consequences, even in the simplest
cases. If generalizations have this disadvantage, then we should retreat from
them and restrict ourselves to the special cases.
The generality that I embrace, far from dazzling our lights, will reveal to us
rather the veritable laws of Nature in all their brilliance, and in them we shall
find even stronger reason to admire her beauty and her simplicity. It will be an
important lesson to learn that some principles, till now believed bound to some
special cases, are of greater breadth. Finally, these researches will demand
calculations scarcely any more troublesome, and it will be easy to apply them
to all special cases we might set up.
Leonhard Euler, General Principles of the State of Equilibrium of Fluides (1755)

CHAPTER 70

Basic Equations of Hydrodynamics

The tensor surface $n(\tau n) = -1$ of hydrostatic pressure is a sphere. This law was
probably discovered by Pascal (1624-1662).
The first two papers of Euler (1707-1783) about equilibrium and motion of
fluids appeared in 1755...
Bernoulli's equation represents the most important theorem in hydro-
dynamics. It was discovered in 1738 by Daniel Bernoulli (1700-1782), even
before Euler's equations were known, by using an argument which can be
regarded as an early version of the energy conservation law ....
Today we consider the equation of Navier (1822) and Stokes (1845) as basic
for the representation of all properties of fluids. In the technology of the nine-
teenth century, however, it was the conviction that there exists a deep gap
between mathematical-physical hydrodynamics and technological hydraulic.
This gap was closed by the profound experimental and theoretical work of the
British engineer and physicist Osborne Reynolds (1842-1912). He worked with
a colored fluid thread in a glass tube. For a small diameter d and a small velocity
V the thread moved on a straight line (laminar flow). For large d or large V,
irregular side motions of the thread occurred (turbulent flow). Only by applying
the aspects of a similarity law, was Reynolds able to obtain some systematics.
Arnold Sommerfeld (1944)

As we shall see, the basic equations of hydrodynamics for liquids and gases
are obtained by modifying the basic equations of elastodynamics.
The main difference is that, in hydrodynamics, it is not the motion of the
single volume element that is studied, but instead, the motion of the fluid
particles, which at various times are at a fixed point $y$. Instead of an individual
description, such as in elasticity theory, one uses an anonymous description
in hydrodynamics.
In order to obtain the equations of gas dynamics as a special case, we also
consider thermodynamical effects for the basic equations. In Section 70.5, the
basic equations follow from the so-called transport theorem.
In Chapters 71 and 72, as well as in Part V, we will consider a number of
interesting and important concrete problems in the dynamics of liquids and
gases.

70.1. Basic Equations


The motions of a liquid or a gas are described by a velocity field
$$v = v(y,t).$$
Thereby $v(y,t)$ is the velocity vector at the point $y$ at time $t$. The basic equations,
which will physically be motivated in Section 70.5, are as follows:
(i) Momentum balance (equations of motion)
$$(\rho v)_t + \operatorname{div} \rho v \circ v = K - \operatorname{grad} p + \operatorname{div}\hat\tau. \tag{1}$$
(ii) Mass balance (continuity equation)
$$\rho_t + \operatorname{div} \rho v = 0. \tag{2}$$
(iii) Entropy balance
$$(s\rho)_t + \operatorname{div} s\rho v = ([\hat\tau, Dv] - \operatorname{div} q)/T. \tag{3}$$
(iv) Thermodynamical equations
$$p = \Pi(\rho, T), \qquad s = \Sigma(\rho, T). \tag{4}$$
(v) Constitutive laws
$$\hat\tau = \Phi(Dv, \rho, T), \qquad q = \Psi(\operatorname{grad} T, \rho, T), \qquad [\hat\tau, Dv] \ge 0, \qquad q\,\operatorname{grad} T \le 0. \tag{5}$$
The quantities above, which all depend on position $y$ and time $t$, have the
following meaning:

$\rho$ density;
$p$ pressure;
$T$ temperature;
$\hat\tau$ stress tensor for the inner friction;
$q$ heat flow density vector;
$s$ specific entropy density.
The outer force which is exerted onto the flow region $H$ is
$$\int_H K\,dy$$
and the entropy of $H$ is
$$\int_H \rho s\,dy.$$

The stress tensor $\tau$ of Section 61.3 has the form
$$\tau = -pI + \hat\tau. \tag{6}$$
Thus, according to Section 61.3, the stress force
$$\int_{\partial H} \tau n\,dO = -\int_{\partial H} pn\,dO + \int_{\partial H} \hat\tau n\,dO \tag{7}$$
acts on $H$. Here $n$ is the outer unit normal vector to $\partial H$. If the inner friction
vanishes, i.e., $\hat\tau = 0$, then (7) corresponds to the well-known fact that, in a fluid,
the pressure force $-pn\,\Delta O$ is applied to the surface element $\Delta O$ in direction
of the inner normal vector $-n$ (Pascal's law).
Moreover, we set
$$Dv(y) = 2^{-1}(v'(y) + v'(y)^*),$$
i.e., $Dv$ is constructed analogously to the strain tensor $\gamma = Du$. The constitutive
law relates the inner friction $\hat\tau$ with $Dv$, i.e., with the first-order derivatives of
the velocities with respect to space. As in the elasticity theory of Section 61.4
the stress tensor $\tau$ is assumed to be symmetric, i.e., $\tau \in L_{\mathrm{sym}}(V_3)$. Hence it follows
that
$$\hat\tau \in L_{\mathrm{sym}}(V_3).$$
For $a, b \in V_3$ we define the dyadic product $\sigma = a \circ b$ as the uniquely
determined linear operator $a \circ b\colon V_3 \to V_3$ with
$$(a \circ b)x = a(bx) \quad\text{for all } x \in V_3.$$
In Cartesian coordinates we have $\sigma_{ij} = a_i b_j$, so that the equation of motion (1)
takes the form
$$(\rho v_i)_t + \partial_j(\rho v_i v_j) = K_i - \partial_i p + \partial_j \hat\tau_{ij}, \tag{8}$$
where one sums over $j$. According to Section 69.1, all basic equations (i) to (iii) have the form of
conservation laws. Thus the right-hand sides in (i) and (iii) describe the
production of momentum and entropy per volume and time.
Initial and boundary conditions have to be added to the basic equations. As
initial conditions, one prescribes at time $t = 0$ velocity $v(y,0)$, density $\rho(y,0)$,
and pressure $p(y,0)$ in the flow region $G$. If $G$ is not in motion, then the normal
component of the velocity vector must vanish on the boundary $\partial G$, i.e.,
$$vn = 0 \quad\text{on } \partial G.$$
For viscous fluids, the fluid sticks to the boundary. Thus we must have
$$v = 0 \quad\text{on } \partial G.$$
Further possible boundary conditions will be discussed in the following
chapter.
In order to be able to conveniently transform the basic equations, we define
in an invariant way
$$\Delta v = 2\operatorname{div} Dv - \operatorname{grad}\operatorname{div} v, \tag{9}$$
$$(v\,\operatorname{grad})v = \operatorname{div}(v \circ v) - v\operatorname{div} v. \tag{10}$$
In Cartesian coordinates this gives
$$\Delta v = (\Delta v_i)e_i, \qquad (v\,\operatorname{grad})v = (v_j\,\partial_j v_i)e_i.$$
The continuity equation allows us to simplify conservation laws, because from
(2) it follows that for an arbitrary real function $f$:
$$(\rho f)_t + \operatorname{div} f\rho v = \rho_t f + \rho f_t + f\operatorname{div}\rho v + \rho v\,\operatorname{grad} f = \rho f_t + \rho v\,\operatorname{grad} f. \tag{11}$$
In particular, we obtain for the equation of motion (1):
$$(\rho v)_t + \operatorname{div} \rho v \circ v = \rho v_t + \rho(v\,\operatorname{grad})v. \tag{12}$$
Moreover, we have
$$\operatorname{curl}\operatorname{curl} v = \operatorname{grad}\operatorname{div} v - \Delta v.$$
All expressions in these formulas are defined in an invariant way. Therefore
it suffices to verify these relations in Cartesian coordinates. This strategy will
be used throughout this chapter.
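As an illustration of this strategy, here is a short worked verification of (10) and (12) in Cartesian coordinates. It is a sketch under the assumption that the divergence of a tensor contracts over the second index, $(\operatorname{div}\sigma)_i = \partial_j\sigma_{ij}$, with summation over repeated indices. For (10):
$$(\operatorname{div}(v \circ v))_i = \partial_j(v_i v_j) = v_j\,\partial_j v_i + v_i\,\partial_j v_j = ((v\,\operatorname{grad})v)_i + v_i\operatorname{div} v.$$
For (12), the product rule and the continuity equation (2) give
$$(\rho v_i)_t + \partial_j(\rho v_i v_j) = \rho\,(v_i)_t + \rho v_j\,\partial_j v_i + v_i\bigl(\rho_t + \partial_j(\rho v_j)\bigr) = \rho\,(v_i)_t + \rho v_j\,\partial_j v_i.$$
The second computation is the special case $f = v_i$ of (11).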

70.2. Linear Constitutive Law for the Friction Tensor


Based on general symmetry arguments we want to formulate the constitutive
law for a fluid. In addition to the universal pressure, we observe forces of
friction in a fluid, which are caused by the fact that the particles of fluid move
with different velocities. We therefore assume in (6) that the stress tensor can
be written in the form
$$\tau = -pI + \hat{\tau},$$
whereby the friction tensor $\hat{\tau}$ depends on the space derivatives of $v$. More
precisely, we assume that
$$\hat{\tau} = \hat{\tau}(Dv)$$
in analogy to the constitutive law $\tau = \tau(Du)$ of elasticity theory. Note that $v = u_t$.
According to Proposition 61.11, the most general rotation invariant map
between the symmetric tensors $\hat{\tau}$ and $Dv$ has the form
$$\hat{\tau} = 2\eta Dv + (\eta' - 2\eta)(\operatorname{tr} Dv) I + \eta'' (Dv)^2 \tag{13}$$
with
$$\eta = \eta(\operatorname{tr} Dv, [Dv, Dv], \det Dv)$$
and analogous expressions for $\eta'$ and $\eta''$. In first-order approximation we want
to obtain a linear law in (13). If $\eta$ and $\eta'$ are expanded in a Taylor series, then,
in first-order approximation, one may assume that $\eta$ and $\eta'$ are constants.

Consequently, we set $\eta'' = 0$. Here $\eta$ is called the viscosity and
$\eta/\rho = \nu$ the kinematic viscosity. This gives
$$\operatorname{div} \hat{\tau} = \eta \Delta v + (\eta' - \eta) \operatorname{grad} \operatorname{div} v = -\eta \operatorname{curl} \operatorname{curl} v + \eta' \operatorname{grad} \operatorname{div} v. \tag{14}$$
According to (7), the stress force
$$\int_{\partial H} \tau n \, dO = -\int_{\partial H} p n \, dO + \int_H \operatorname{div} \hat{\tau} \, dy \tag{15}$$
is applied to a region $H$ of the fluid. As we shall see in Section 70.5, $\operatorname{curl} v$ and
$\operatorname{div} v$ are a measure for the change in time of the rotational motion of a fluid
particle, and the change in time of the volume, respectively. Thus, (14) shows
that $\eta$ and $\eta'$ describe the so-called friction of vorticity and friction of
compression, respectively. For incompressible fluids, i.e., $\operatorname{div} v = 0$, the quantity
$\eta'$ does not occur. But experience also shows that for compressible fluids $\eta'$ can
be neglected compared with $\eta$.

EXAMPLE 70.1 (Parallel Flow). In Cartesian coordinates we consider the
parallel flow
$$v = w(\xi_2) e_1$$
(Fig. 70.1), whereby the velocity varies in the direction of the $\xi_2$-axis. This implies
$\operatorname{tr} Dv = 0$ in (13). Let $H$ be a cuboid parallel to the coordinate axes as shown
in Fig. 70.1. By (13), this cuboid is subject to the stress force
$$\int_{\partial H} \tau n \, dO = -\int_{\partial H} p n \, dO + 2\eta \int_{\partial H} (Dv) n \, dO.$$
In particular, the force
$$\tau e_2 \, \Delta O = -p e_2 \, \Delta O + \eta w' e_1 \, \Delta O$$
is applied to the side of the cuboid which has the outer normal vector $e_2$. As
expected, the force of friction in Figure 70.1 is effective in direction $e_1$. The
larger the derivative $w'$ of the speed is, the larger the force of friction becomes.
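The force formula of Example 70.1 can be checked numerically. The following is a minimal sketch, assuming the linear law (13) with $\eta'' = 0$; the shear rate `w_prime` and the viscosities are hypothetical illustration values, and the helper names are not taken from the text.

```python
# Sketch of the linear friction law (13) with eta'' = 0, applied to a parallel flow.
# All numerical values below are hypothetical and only serve as an illustration.

def symmetric_gradient(grad_v):
    """Return Dv = (grad v + (grad v)^T) / 2 for grad_v[i][j] = d v_i / d xi_j."""
    return [[0.5 * (grad_v[i][j] + grad_v[j][i]) for j in range(3)]
            for i in range(3)]


def friction_tensor(grad_v, eta, eta_prime):
    """Return tau_hat = 2*eta*Dv + (eta' - 2*eta) * tr(Dv) * I."""
    dv = symmetric_gradient(grad_v)
    trace = dv[0][0] + dv[1][1] + dv[2][2]
    bulk = (eta_prime - 2.0 * eta) * trace
    return [[2.0 * eta * dv[i][j] + (bulk if i == j else 0.0)
             for j in range(3)] for i in range(3)]


def apply_to_normal(tensor, normal):
    """Return the vector tensor * normal (force per unit area on a face)."""
    return [sum(tensor[i][j] * normal[j] for j in range(3)) for i in range(3)]


# Parallel flow v = w(xi_2) e_1: the only nonzero derivative is d v_1 / d xi_2 = w'.
w_prime = 3.0      # hypothetical shear rate w'
eta = 0.01         # hypothetical viscosity
eta_prime = 0.0    # hypothetical second viscosity (irrelevant here, since tr Dv = 0)
grad_v = [[0.0, w_prime, 0.0],
          [0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0]]

tau_hat = friction_tensor(grad_v, eta, eta_prime)
# Friction force per unit area on the face with outer normal e_2.
friction_on_top_face = apply_to_normal(tau_hat, [0.0, 1.0, 0.0])
```

Under these assumptions `friction_on_top_face` has the single nonzero component $\eta w'$ in direction $e_1$, in agreement with the friction part of the force in the example.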

Figure 70.1

70.3. Applications to Viscous and Inviscid Fluids


EXAMPLE 70.2 (Viscous Fluids). We neglect temperature effects. The basic
equations for viscous fluids then follow from the basic equations of Section
70.1 and the constitutive law (13) by using (12) and (14). They are called
Navier-Stokes equations.
(i) Equation of motion
$$\rho v_t + \rho (v \operatorname{grad}) v - \eta \Delta v + (\eta - \eta') \operatorname{grad} \operatorname{div} v = K - \operatorname{grad} p. \tag{16}$$
(ii) Continuity equation
$$\rho_t + \operatorname{div} \rho v = 0. \tag{17}$$
(iii) Density-pressure relation
$$\rho = \rho(p). \tag{18}$$
From (14) it follows that (i) also has the form
$$\rho v_t + \rho (v \operatorname{grad}) v + \eta \operatorname{curl} \operatorname{curl} v - \eta' \operatorname{grad} \operatorname{div} v = K - \operatorname{grad} p. \tag{19}$$
EXAMPLE 70.3 (Special Cases). If $\rho = \text{const}$, then the fluid is called
incompressible. From (ii) it follows immediately that $\operatorname{div} v = 0$.
If $\operatorname{curl} v = 0$, then the flow is called irrotational. If $C$ is a closed curve, then
$$\Gamma = \int_C v \, dy$$
is called the circulation of $C$. If a given particle of fluid moves on a closed orbit
which corresponds to $C$, then we have $\Gamma \neq 0$, since $v \dot{y}$ is proportional to $v^2$
for a parametrization $y = y(t)$ of $C$. This is not possible in an irrotational flow,
because in this case, Stokes' theorem implies that
$$\Gamma = \iint \operatorname{curl} v \, df = 0.$$
The motion is called stationary if the density $\rho$ and the velocity $v$ do not
depend on time.

Proposition 70.4. For a solution of the basic equations (16)-(18) in a connected
flow region, the Bernoulli equation
$$\frac{1}{2} v^2 + U + \int_{p_0}^{p} \frac{dp}{\rho(p)} = \text{const} \tag{20}$$
is valid if the flow is stationary and irrotational, $\eta' = 0$ holds, and the outer
forces possess a potential, i.e., $K = -\rho \operatorname{grad} U$.

All functions are assumed to be sufficiently smooth. Equation (20)
corresponds to the energy conservation law.

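PROOF. Let $L$ denote the left-hand side in (20). Because of
$$(v \operatorname{grad}) v = \operatorname{grad} \frac{v^2}{2} - v \times \operatorname{curl} v \tag{21}$$
the equation of motion (19) assumes the form
$$v_t + \operatorname{grad} L = v \times \operatorname{curl} v - \frac{\eta}{\rho} \operatorname{curl} \operatorname{curl} v + \frac{\eta'}{\rho} \operatorname{grad} \operatorname{div} v.$$
From $v_t = 0$, $\operatorname{curl} v = 0$ and $\eta' = 0$ follows $\operatorname{grad} L = 0$, and hence $L = \text{const}$.
□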
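Identity (21) is the only vector-analytic step in this proof, so a quick numerical check of the index bookkeeping may be useful. The sketch below evaluates both sides of (21) at one point from a central-difference Jacobian; the test field `velocity` is a hypothetical choice. Both sides are built from the same Jacobian, so the residual vanishes up to rounding for any field: this tests the signs and indices of (21), not the accuracy of the differentiation.

```python
# Numerical check of identity (21): (v grad) v = grad(v^2 / 2) - v x curl v.
# The velocity field is a hypothetical smooth example, not taken from the text.

def velocity(y):
    """Hypothetical smooth test field v(y) in Cartesian coordinates."""
    y1, y2, y3 = y
    return [y2 * y3 + y1 ** 2, y1 * y3 - y2, y1 * y2 * y3]


def jacobian(field, y, step=1e-5):
    """Central-difference Jacobian jac[i][j] = d field_i / d y_j."""
    jac = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        plus, minus = list(y), list(y)
        plus[j] += step
        minus[j] -= step
        f_plus, f_minus = field(plus), field(minus)
        for i in range(3):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * step)
    return jac


def cross(a, b):
    """Vector product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def identity_21_residual(field, y):
    """Return (v grad) v - [grad(v^2 / 2) - v x curl v] at the point y."""
    v = field(y)
    jac = jacobian(field, y)
    convective = [sum(jac[i][j] * v[j] for j in range(3)) for i in range(3)]
    grad_half_v2 = [sum(jac[j][i] * v[j] for j in range(3)) for i in range(3)]
    curl = [jac[2][1] - jac[1][2], jac[0][2] - jac[2][0], jac[1][0] - jac[0][1]]
    v_cross_curl = cross(v, curl)
    return [convective[i] - grad_half_v2[i] + v_cross_curl[i] for i in range(3)]


residual = identity_21_residual(velocity, [0.3, -1.2, 0.7])
```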

EXAMPLE 70.5 (Inviscid Fluid). A fluid is called inviscid if and only if the inner
friction vanishes, i.e., $\eta = \eta' = 0$. In this case the basic equations (16)-(18) are
called Euler equations.
Inviscid fluids are also called ideal fluids.

70.4. Tube Flows, Similarity, and Turbulence


To understand a number of important phenomena of the Navier-Stokes
equations, we consider the simple case of a tube flow.

EXAMPLE 70.6 (Law of Hagen 1839 and Poiseuille 1840). As in Example 70.1
we consider a tube of length $l$, with radius $R$ and pressure difference
$$\Delta p = p_1 - p_2 > 0.$$
We set $r = (\xi_2^2 + \xi_3^2)^{1/2}$, i.e., $r$ is the distance from the tube axis. One easily
verifies that for $\eta' = 0$, a solution of the stationary equations of motion
(16)-(18) is given by:
$$v = \frac{\Delta p}{4 \eta l} (R^2 - r^2) e_1, \qquad p = p_1 - \xi_1 \Delta p / l, \qquad \rho = \text{const}. \tag{22}$$
This is the typical parabolic velocity profile, shown in Figure 70.2.

Figure 70.2

The mass $M$, which during the time $\Delta t$ flows through the tube, is equal to
$$M = \Delta t \int_0^R \rho |v| \, 2\pi r \, dr = \frac{\rho}{8 \eta l} \pi R^4 \, \Delta p \, \Delta t. \tag{23}$$

From this formula the viscosity $\eta$ can experimentally be determined. Also, the
expression kinematic viscosity for $\eta/\rho$ becomes clear. Table 70.1 contains
several values.

Table 70.1
Substance    Viscosity $\eta$ [g/(s cm)]    Kinematic viscosity $\eta/\rho$ [cm²/s]
Air          1.8 · 10⁻⁴                    0.15
Water        0.01                          0.01
Glycerine    15                            12

The number
$$\mathrm{Re} = \rho V R / \eta$$
is called Reynolds number, where $V$ is the mean velocity.
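As an illustration of (22), (23) and the Reynolds number, the following sketch integrates the parabolic profile numerically and compares the result with the closed form. The viscosity is the value for water from Table 70.1; the tube dimensions, the pressure difference and the density 1 g/cm³ are hypothetical values in CGS units, and the function names are not from the text.

```python
import math

# Sketch for the law of Hagen-Poiseuille; tube data are hypothetical CGS values.


def poiseuille_speed(r, delta_p, eta, length, radius):
    """Speed |v| of the parabolic profile (22) at distance r from the axis."""
    return delta_p / (4.0 * eta * length) * (radius ** 2 - r ** 2)


def mass_by_quadrature(delta_p, eta, rho, length, radius, delta_t, n=2000):
    """Midpoint rule for M = delta_t * int_0^R rho |v| 2 pi r dr."""
    h = radius / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * h
        speed = poiseuille_speed(r, delta_p, eta, length, radius)
        total += rho * speed * 2.0 * math.pi * r * h
    return delta_t * total


def mass_closed_form(delta_p, eta, rho, length, radius, delta_t):
    """Right-hand side of (23)."""
    return rho * math.pi * radius ** 4 * delta_p * delta_t / (8.0 * eta * length)


def reynolds_number(rho, mean_speed, radius, eta):
    """Re = rho * V * R / eta."""
    return rho * mean_speed * radius / eta


eta, rho = 0.01, 1.0                      # water: g/(s cm), g/cm^3 (density assumed)
radius, length = 0.1, 10.0                # hypothetical tube: cm
delta_p, delta_t = 100.0, 1.0             # hypothetical: dyn/cm^2, s

mass = mass_by_quadrature(delta_p, eta, rho, length, radius, delta_t)
mean_speed = mass / (rho * math.pi * radius ** 2 * delta_t)
reynolds = reynolds_number(rho, mean_speed, radius, eta)
```

The mean velocity is obtained here by dividing the volume flux by the cross section $\pi R^2$, which is one natural reading of "mean velocity" and is an assumption of this sketch.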
The following fact is now especially important. Expression (22) describes
a mathematically correct solution of the Navier-Stokes equations, which is
valid for all values of $\Delta p$, $l$, $\rho$, $\eta$, and $R$. In experiments, however, this solution
is observed only for small Reynolds numbers. One speaks of a laminar flow.
For large Reynolds numbers the solution (22) becomes unstable and in
experiments one observes turbulent behavior, which for $\mathrm{Re} \to +\infty$ becomes more
and more chaotic. For example,
$$M \sim \Delta p,$$
as in (23), is no longer valid in a turbulent flow, but only approximately
$$M \sim (\Delta p)^{1/2}.$$
The great achievement of Reynolds in 1885 was his discovery that the
change from laminar flow into turbulence does not depend on the single values
of $R$, $V$, $\eta$, and $\rho$, but only on the number Re. This is Reynolds' similarity law.
We have
$$\mathrm{Re} \begin{cases} < \mathrm{Re}_{\text{crit}} & \text{laminar flow}, \\ > \mathrm{Re}_{\text{crit}} & \text{turbulent flow}, \end{cases} \tag{24}$$
with $\mathrm{Re}_{\text{crit}} = 1{,}150$. But $\mathrm{Re}_{\text{crit}}$ also depends on the particular way the fluid
flows into the tube. For a very good smoothing of the muzzle we may assume
values of 20,000 for $\mathrm{Re}_{\text{crit}}$.
We want to show how, by general dimension arguments, one is led to the
Reynolds number Re. We assume that the sudden change of laminar flow into
turbulence depends on a dimensionless number, since the physical
phenomenon cannot depend on the system of units. Our ansatz is $\mathrm{Re} = f(\rho, \eta, R, V)$.
More precisely, let
$$\mathrm{Re} = \rho^{\alpha} \eta^{\beta} R^{\gamma} V^{\delta}.$$
This gives the dimension $\mathrm{m}^{-3\alpha - \beta + \gamma + \delta} \, \mathrm{kg}^{\alpha + \beta} \, \mathrm{s}^{-\beta - \delta}$. In order that this expression
becomes dimensionless, we must have $\beta = -\alpha$, $\gamma = \alpha$, $\delta = \alpha$, hence
$\mathrm{Re} = (\rho R V / \eta)^{\alpha}$. The value of $\alpha$, however, is unimportant.
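The exponent bookkeeping of this dimension argument can be reproduced mechanically. The sketch below assumes SI base units (m, kg, s) for $\rho$, $\eta$, $R$, $V$ and enumerates all small integer exponents that give a dimensionless product; the names `UNITS` and `dimension` are illustrative only.

```python
from itertools import product

# Exponents of (m, kg, s) for each quantity, assuming SI units:
# rho: kg m^-3, eta: kg m^-1 s^-1, R: m, V: m s^-1.
UNITS = {
    "rho": (-3, 1, 0),
    "eta": (-1, 1, -1),
    "R": (1, 0, 0),
    "V": (1, 0, -1),
}
NAMES = ("rho", "eta", "R", "V")


def dimension(alpha, beta, gamma, delta):
    """Exponents of (m, kg, s) in rho^alpha * eta^beta * R^gamma * V^delta."""
    exponents = (alpha, beta, gamma, delta)
    return tuple(
        sum(e * UNITS[name][k] for e, name in zip(exponents, NAMES))
        for k in range(3)
    )


# All exponent tuples with entries in -3..3 that yield a dimensionless number.
dimensionless = [e for e in product(range(-3, 4), repeat=4)
                 if dimension(*e) == (0, 0, 0)]
```

Every tuple found this way has the form $(\alpha, -\alpha, \alpha, \alpha)$, which is the statement $\mathrm{Re} = (\rho R V/\eta)^{\alpha}$ restricted to the searched range.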

70.5. Physical Motivation of the Basic Equations


We use the same notations as in elasticity theory of Section 61.3. As a
consequence of the deformation
$$y = x + u(x,t), \tag{25}$$
the point $x$ at time $t$ becomes the point $y$, and the region $H$ becomes $H' = H(t)$.
According to Figure 70.3, the point $x$ denotes the individual particles of fluid.
We may assume that $u(x, 0) = 0$ for all $x$. Thus the velocity of the individual
particle of fluid, which at time $t$ is at the point $y$, is equal to
$$v(y, t) = u_t(x, t).$$
From Section 61.2 it follows that, in first-order approximation, $\alpha = \operatorname{div} u$
describes the relative change in volume and $\omega = 2^{-1} \operatorname{curl} u$ the vector of the
infinitesimal rotation of the volume elements. This implies
$$\alpha_t = \operatorname{div} v \quad \text{and} \quad \omega_t = 2^{-1} \operatorname{curl} v.$$
The following formula is important for obtaining the basic equations:
$$\frac{d}{dt} \int_{H(t)} h \, dy = \int_{H(t)} (h_t + \operatorname{div} hv) \, dy. \tag{26}$$

Proposition 70.7 (Transport Theorem). Let $H$ denote a bounded region in $\mathbb{R}^3$
which after applying the $C^2$-diffeomorphism (25) becomes $H(t)$ for every
$t \in [0, t_0]$. More precisely, let (25) be a $C^2$-diffeomorphism from $H$ onto $H(t)$. Let
$h: \mathbb{R}^3 \times [0, t_0] \to \mathbb{R}$ be $C^1$. Then (26) holds for every $t \in [0, t_0]$.

The proof will be given in Problem 70.1. In the following we assume that
all functions are sufficiently smooth.
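Formula (26) can be tried out on a one-dimensional analogue. The sketch below assumes the hypothetical deformation $y = x + tx$ on $H = (0, 1)$, so that $H(t) = (0, 1 + t)$ and $v(y, t) = y/(1 + t)$, together with a hypothetical density $h(y, t) = y^2 + ty$; both sides of (26) are approximated by the midpoint rule and central differences.

```python
# One-dimensional illustration of the transport theorem (26).
# Deformation y = (1 + t) x on H = (0, 1) and h(y, t) = y^2 + t y are
# hypothetical choices made only for this sketch.


def integrate(func, a, b, n=4000):
    """Midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / n
    return sum(func(a + (k + 0.5) * h) for k in range(n)) * h


def h_density(y, t):
    """Hypothetical smooth integrand h(y, t)."""
    return y * y + t * y


def velocity(y, t):
    """Velocity v(y, t) = u_t(x, t) = x, expressed through y = (1 + t) x."""
    return y / (1.0 + t)


def moving_integral(t):
    """Integral of h(., t) over H(t) = (0, 1 + t)."""
    return integrate(lambda y: h_density(y, t), 0.0, 1.0 + t)


def left_side(t, dt=1e-5):
    """d/dt of the integral over the moving region, by central differences."""
    return (moving_integral(t + dt) - moving_integral(t - dt)) / (2.0 * dt)


def right_side(t, step=1e-5):
    """Integral of h_t + (h v)_y over H(t), derivatives by central differences."""
    def flux(z):
        return h_density(z, t) * velocity(z, t)

    def integrand(y):
        h_t = (h_density(y, t + step) - h_density(y, t - step)) / (2.0 * step)
        div_hv = (flux(y + step) - flux(y - step)) / (2.0 * step)
        return h_t + div_hv

    return integrate(integrand, 0.0, 1.0 + t)
```

For example, `left_side(0.5)` and `right_side(0.5)` agree up to discretization error; for this particular choice both sides can also be computed by hand.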

Figure 70.3

Mass balance. Conservation of mass means $(d/dt) \int_{H(t)} \rho \, dy = 0$. From (26)
follows
$$\int_{H(t)} (\rho_t + \operatorname{div} \rho v) \, dy = 0.$$
Contracting $H(t)$ into one point, we obtain $\rho_t + \operatorname{div} \rho v = 0$. This is the
continuity equation (2).
Momentum balance. Similarly as in Section 58.7, the time derivative of the
total momentum of $H(t)$ is equal to the effective force on $H(t)$. This implies
$$\frac{d}{dt} \int_{H(t)} \rho v \, dy = \int_{H(t)} K \, dy + \int_{\partial H(t)} \tau n \, dO. \tag{27}$$
From (26) and $\int \tau n \, dO = \int \operatorname{div} \tau \, dy$ follows then
$$\int_{H(t)} [(\rho v)_t + \operatorname{div} \rho v \circ v - K - \operatorname{div} \tau] \, dy = 0.$$
This immediately implies the equation of motion (1).


Kinetic energy balance. The kinetic energy of $H(t)$ is
$$E_{\text{kin}}(t) = \int_{H(t)} 2^{-1} \rho v^2 \, dy.$$
If one multiplies the equation of motion (1) with $v$, then one obtains the
following relation from the continuity equation (2):
$$(\rho v^2)_t + \operatorname{div}(\rho v^2) v = 2(Kv + v \operatorname{div} \tau).$$
Equation (26) yields
$$\dot{E}_{\text{kin}}(t) = \int_{H(t)} (vK + v \operatorname{div} \tau) \, dy.$$
Thermodynamical system. We consider the fluid in $H(t)$ as a
thermodynamical system with mass $M$, inner energy $E$, and entropy $S$. We assume
that these quantities can be represented by specific densities of the form
$$M = \int \rho \, dy, \qquad E = \int \rho e \, dy, \qquad S = \int \rho s \, dy.$$
The area of integration is $H(t)$. We postulate that for $e$, $p$, and $s$ the same
relations are valid as in Section 67.3. This means
$$de = T \, ds + p \, d\rho / \rho^2, \qquad p = p(\rho, T), \quad e = e(\rho, T), \quad s = s(\rho, T). \tag{28}$$
First law of thermodynamics. Let $W(t)$ be the work done at $H(t)$ during the
time interval $[0, t]$. In point mechanics, the time derivative of the work is equal
to the power of all forces. Analogously, we set
$$\dot{W}(t) = \int_{H(t)} Kv \, dy + \int_{\partial H(t)} (\tau n) v \, dO.$$
Moreover, let $Q(t)$ be the amount of heat which during the time interval $[0, t]$
flows into $H(t)$. If $q$ is the heat flow density vector, then it follows from Section
69.1 that
$$\dot{Q}(t) = -\int_{\partial H(t)} qn \, dO.$$
The total energy of $H(t)$ is $E + E_{\text{kin}}$. Thus the first law of thermodynamics
implies
$$\dot{E} + \dot{E}_{\text{kin}} = \dot{W} + \dot{Q}.$$
According to (26) this means
$$\int_{H(t)} ((\rho e)_t + \operatorname{div} \rho e v + v \operatorname{div} \tau) \, dy = \int_{\partial H(t)} ((\tau n) v - qn) \, dO.$$
Passing to components and integration by parts yields
$$\int_{H(t)} ((\rho e)_t + \operatorname{div} \rho e v - [\tau, Dv] + \operatorname{div} q) \, dy = 0,$$
hence
$$(\rho e)_t + \operatorname{div} \rho e v - [\tau, Dv] + \operatorname{div} q = 0. \tag{29}$$
We want to replace $e$ with $s$. From (28) follows
$$\dot{e} = T \dot{s} + p \dot{\rho} / \rho^2.$$
Because of $e = e(y(x, t), t)$ we obtain
$$\dot{e} = v \operatorname{grad} e + e_t.$$
This implies
$$\rho (e_t + v \operatorname{grad} e) = \rho T (s_t + v \operatorname{grad} s) + \frac{p}{\rho} (\rho_t + v \operatorname{grad} \rho). \tag{30}$$
The continuity equation together with (11) and (29) yields
$$(\rho s)_t + \operatorname{div} \rho s v = ([\hat{\tau}, Dv] - \operatorname{div} q) / T. \tag{31}$$
Note that $\tau = -pI + \hat{\tau}$ and $[-pI, Dv] = -p \operatorname{div} v$. Equation (31) is the same
as the basic equation (3).
Second law of thermodynamics. Motivated by the second law
$$\dot{S} \geq \dot{Q} / T,$$
and noticing the dependence of the temperature on $(y, t)$, we postulate the
inequality
$$\frac{d}{dt} \int_{H(t)} \rho s \, dy \geq -\int_{\partial H(t)} qn / T \, dO.$$
From (26), (31), and (61.4) this implies
$$\int_{H(t)} \left( T^{-1} ([\hat{\tau}, Dv] - \operatorname{div} q) + \operatorname{div} \frac{q}{T} \right) dy \geq 0$$
and hence $T^{-1} [\hat{\tau}, Dv] - T^{-2} q \operatorname{grad} T \geq 0$. This inequality is always satisfied
if
$$[\hat{\tau}, Dv] \geq 0 \quad \text{and} \quad q \operatorname{grad} T \leq 0.$$
This motivates the basic equations of Section 70.1.

70.6. Applications to Gas Dynamics

The basic equations of Section 70.1 are not only valid for liquids but also for
gases.

EXAMPLE 70.8 (Inviscid Ideal Gas without Heat Conduction). We neglect inner
friction and heat conduction, i.e., $\hat{\tau} = 0$ and $q = 0$.
The basic equations (1)-(5) then take the form:
(i) Equation of motion
$$\rho v_t + \rho (v \operatorname{grad}) v = K - \operatorname{grad} p.$$
(ii) Continuity equation
$$\rho_t + \operatorname{div} \rho v = 0.$$
(iii) Entropy balance
$$s_t + v \operatorname{grad} s = 0.$$
(iv) Thermodynamical equations for ideal gases
$$p = r \rho T, \qquad s = c \ln (T \rho^{-r/c}).$$
Relations (i)-(iii) follow from Section 70.1 and (iv) has been considered in
Example 67.4.

Problems in gas dynamics (e.g., shock waves and transonic flow) will be
considered in Part V. We will see that a generalized form of the basic equations
is needed in order to obtain the important Rankine-Hugoniot jump conditions
for shock waves.

PROBLEMS

70.1. Proof of Proposition 70.7. Let $y = x + u(x,t)$ and $D_i = \partial / \partial \xi_i$. Observe that
$v = y_t$. The transformation yields
$$\int_{H(t)} h \, dy = \int_H h J \, dx$$
with $J(x, t) = \det y_x(x, t)$. If $A_{ij}$ denotes the adjoint subdeterminant to $D_j \eta_i$ in $J$,
then the well-known expansion formula
$$\delta_{ik} J = A_{ij} D_j \eta_k$$
is valid. The sum is taken over equal indices from 1 to 3. According to the
product rule, the derivative of a determinant is obtained by differentiating the
$i$th line and then adding all these determinants, i.e.,
$$J(x, t)_t = (D_j \eta_i)_t A_{ij} = (D_j v_i) A_{ij} = \frac{\partial v_i}{\partial \eta_k} (D_j \eta_k) A_{ij} = \frac{\partial v_i}{\partial \eta_i} J = J \operatorname{div}_y v.$$
This implies
$$\frac{d}{dt} \int_{H(t)} h(y, t) \, dy = \frac{d}{dt} \int_H h(y(x, t), t) J(x, t) \, dx$$
$$= \int_H ((v \operatorname{grad}_y h + h_t) J + h J \operatorname{div}_y v) \, dx$$
$$= \int_H (h_t + \operatorname{div}_y hv) J \, dx$$
$$= \int_{H(t)} (h_t + \operatorname{div}_y hv) \, dy.$$

70.2. The basic equations in hydrostatics of Newton (1687).


70.2a. Formulate these equations for constant temperature $T$.
Solution: We use the basic equations of hydrodynamics (1) to (5). In the
special case of hydrostatics, we have $v = 0$, i.e., the velocity of the fluid particles
is equal to zero. Hence, the inner friction vanishes, i.e., $\hat{\tau} = 0$. From (1) to (5),
we obtain the following basic equations of hydrostatics:
(i) Equilibrium condition
$$K = \operatorname{grad} p. \tag{32}$$
(ii) Density-pressure relation
$$\rho = \rho(p, T). \tag{33}$$
Here, $p$ is pressure, and $\rho$ is density. The outer force, which is exerted
onto an arbitrary flow region $H$, is equal to
$$\int_H K \, dy.$$
In a simply connected region, equation (32) determines the pressure $p$ up to
a constant. Thus, we may prescribe the pressure $p_0$ at a fixed point $y_0$, i.e.,
$$p(y_0) = p_0.$$
70.2b. Give a physical interpretation of (32).
Solution: Integration by parts yields
$$\int_H K \, dy = \int_{\partial H} p n \, dO.$$
This means that, for each flow region $H$, the outer forces onto $H$ and the
pressure onto the boundary $\partial H$ are in an equilibrium.
70.2c. Formulate the basic equations of hydrostatics in case the outer force possesses
a potential $U$, i.e.,
$$K = -\rho \operatorname{grad} U.$$
Solution: Instead of $\rho = \rho(p, T)$ we write briefly $\rho = \rho(p)$. From (32) we
obtain
$$\operatorname{grad} \left( U + \int_{p_0}^{p} \frac{dp}{\rho(p)} \right) = 0.$$
In a simply connected region, this yields the following basic equations:
(i) Energy conservation
$$U + \int_{p_0}^{p} \frac{dp}{\rho(p)} = \text{const} = U(y_0). \tag{34}$$
(ii) Density-pressure relation
$$\rho = \rho(p). \tag{35}$$
Here, we are given the potential $U = U(y)$ and the pressure $p(y_0) = p_0$ at a fixed
point $y_0$. From (34) we obtain the pressure $p = p(y)$, and from (35) we obtain
the density $\rho = \rho(p(y))$ at $y$.
If the fluid is incompressible, then
$$\rho = \text{const}, \tag{36}$$
and equation (34) yields
$$\rho U + p = \text{const}. \tag{37}$$
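Relation (34) can be evaluated for concrete density-pressure laws. The sketch below assumes the gravitational potential $U = g \cdot \text{height}$ and treats two cases: an incompressible liquid, where (37) gives a linear pressure increase with depth, and an isothermal ideal gas with $\rho = p/(rT)$, where (34) integrates to an exponential pressure law. All numerical constants and function names are illustrative assumptions.

```python
import math

# Sketch for the hydrostatic relation (34): U + int_{p0}^{p} dp / rho(p) = U(y0).
# Constants below (SI units) are hypothetical illustration values.

g = 9.81


def potential(height):
    """Gravitational potential U = g * height (assumed form of U)."""
    return g * height


def pressure_integral(p_ref, p, density, n=2000):
    """Midpoint rule for the integral of dp / rho(p) from p_ref to p."""
    h = (p - p_ref) / n
    return sum(h / density(p_ref + (k + 0.5) * h) for k in range(n))


def energy_residual(height, height_ref, p_ref, p, density):
    """Left-hand side of (34) minus U(y0); zero for an equilibrium pressure."""
    return (potential(height) + pressure_integral(p_ref, p, density)
            - potential(height_ref))


# Case 1: incompressible liquid, rho = const, so (37) gives p = p0 + rho * g * depth.
rho_liquid = 1000.0


def pressure_incompressible(p_ref, depth):
    return p_ref + rho_liquid * g * depth


# Case 2: isothermal ideal gas, rho = p / (r T), so (34) gives
# p = p0 * exp(-g * height / (r T)).
gas_constant, temperature = 287.0, 288.0


def pressure_isothermal(p_ref, height):
    return p_ref * math.exp(-g * height / (gas_constant * temperature))
```

For instance, `energy_residual(-10.0, 0.0, p0, pressure_incompressible(p0, 10.0), lambda p: rho_liquid)` vanishes up to rounding for any reference pressure `p0`.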
Further important material can be found in the problem sections of Chapters
71, 72, and 86.

References to the Literature


Classical works: Newton (1687) (hydrostatics), Bernoulli (1738) (Bernoulli's law-
conservation of energy), Euler (1755) (fundamental paper-equation of motions),
Navier (1822) and Stokes (1845) (Navier-Stokes equations), Darcy (1856) (basic equa-
tions of filtration theory), Helmholtz (1858) (motion of vortices), Kelvin (1869) (con-
servation of circulation), Riemann (1860) (integration of the equations of gas dynamics
in special cases), Rankine (1870) and Hugoniot (1889) (jump conditions in gas dy-
namics), Reynolds (1885) (turbulence and similarity), Prandtl (1904) (boundary layer
equation), Lichtenstein (1929, M) (existence theorems for inviscid fluids via potential
theory). (See also "classical works" in the References to the Literature for Chapters 71
and 72.)
Article in the handbook of physics: Serrin (1959).
Classical monographs: Lamb (1924, H), Courant and Friedrichs (1948, H) (gas
dynamics), Kotschin (1954), Jacob (1959), Milne-Thomson (1960).
Standard works from the physical point of view: Sommerfeld (1954, M), Vol. 2,
Landau and Lifšic (1962, M), Vol. 6, Lighthill (1986, M).
Mathematical point of view: Hughes and Marsden (1976, L), and Chorin and
Marsden (1979, L)(recommended as introductions), Batchelor (1967, M), Meyer (1971,
M), Shinbrot (1973, M), Schreier (1982, M) (compressible fluids), Ockendon and Taylor
(1983, M)(inviscid fluids), Chipot (1984)(porous media), Pipkin (1986) (visco-elasticity).
Free boundary problems and variational inequalities: Friedman (1982, M).
Capillarity: Finn (1985, M).
Global analysis and hydrodynamics: Arnold (1966), Ebin, Fischer, and Marsden
(1972), Marsden (1972, L).
Gas dynamics and shock waves (mathematical point of view): Courant and Friedrichs
(1948, M), Bers (1958, M), Rozdestvenskii and Janenko (1978, M), Morawetz (1981,
L), Smoller (1983, M) (recommended as an introduction), Cole (1986, M) (transonic
aerodynamics).
Gas dynamics and shock waves (physical point of view): Landau and Lifšic (1962,
M), Vol. 6, Guderley (1957, M), Becker (1965, M), Sauer (1960, M), (1966, M), Oswatitsch
(1976, M) (standard work).
Recent results in hydrodynamics and plasma physics: Marsden (1984, P).
Open questions in the dynamics of liquids and gases: Smoller (1983a), Majda (1984,
M), Marsden (1984, P).
Numerical methods: Cf. the References to the Literature for Chapter 72.
(Cf. also the References to the Literature for Chapters 71, 72, and 86.)
CHAPTER 71

Bifurcation and Permanent Gravitational Waves

In one of his last papers, Lord Rayleigh, in 1917, computed approximate
solutions up to order 6. He also showed by means of a numerical example that
the relative error is not greater than $2.5 \cdot 10^{-6}$. We give here a rigorous
tence proof for permanent gravitational waves of infinite depth which is also
constructive.
Tullio Levi-Civita (1925)
I have tried to avoid long numerical computations, thereby following Rie-
mann's postulate that proofs should be given through ideas and not voluminous
computations.
David Hilbert, Report on Number Theory (1897)

In this chapter we study the existence of nontrivial water waves in a channel
of finite depth. As shown in Figure 71.1 we find that, in addition to the trivial
parallel flow, there occur nontrivial wave motions at certain critical velocities
$c$. Such waves were studied during the nineteenth century by British
hydrodynamicists such as Airy, Stokes, Kelvin, and Rayleigh. They solved the
linearized problems and calculated nonlinear approximations up to order 6.
No convergence proofs, however, were given. The linearized theory of Airy
(1845) shows that for
$$c^2 = \frac{g \lambda}{2\pi} \tanh \frac{2\pi h}{\lambda} \tag{1}$$
nontrivial waves occur with:
$c$ propagation velocity of the wave;
$\lambda$ wave length;
$h$ average channel depth;
$g$ gravitational acceleration.
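The dispersion relation (1) is easy to evaluate. The following sketch computes the critical velocity and compares it with the two familiar limits, $c^2 \approx g\lambda/2\pi$ for $h \gg \lambda$ and $c^2 \approx gh$ for $h \ll \lambda$, which follow from $\tanh x \to 1$ and $\tanh x \approx x$; the sample wave lengths and depths are hypothetical.

```python
import math


def critical_speed(wavelength, depth, g=9.81):
    """Critical velocity c from Airy's relation (1), SI units assumed."""
    return math.sqrt(g * wavelength / (2.0 * math.pi)
                     * math.tanh(2.0 * math.pi * depth / wavelength))


def deep_water_speed(wavelength, g=9.81):
    """Limit of (1) for depth much larger than the wave length."""
    return math.sqrt(g * wavelength / (2.0 * math.pi))


def shallow_water_speed(depth, g=9.81):
    """Limit of (1) for depth much smaller than the wave length."""
    return math.sqrt(g * depth)


# Hypothetical sample data: a short wave in a deep channel, a long wave in a shallow one.
c_deep = critical_speed(wavelength=10.0, depth=100.0)
c_shallow = critical_speed(wavelength=1000.0, depth=1.0)
```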

Figure 71.1

Relations between propagation velocity and wave length, of which formula
(1) is an example, are called dispersion relations, and one of the main goals in
the theory of permanent water waves is to obtain such dispersion relations
for various situations.
For about 20 years Levi-Civita (1873-1941) worked on a rigorous solution
for the complete nonlinear problem. In 1925, he found a very complicated
existence proof for channels of infinite depth. Analogously to Section 71.6, he
calculated the solutions as a power series with respect to a small parameter s.
The main difficulty, which he overcame by performing voluminous computa-
tions, was to show the convergence of the formal solution by using a majorant
method.
In our functional-analytic approach this difficulty is avoided, since existence
and analyticity of solutions follow from the main theorem of analytic
bifurcation theory (Theorem 8.A). In addition, we obtain uniqueness results which
Levi-Civita did not have. Our approach shows that, by using the abstract
methods of functional analysis, the proofs can be given through ideas, following
Hilbert's requirement cited at the beginning of this chapter.
Independently of Levi-Civita, the Russian mathematician Nekrasov (1883-
1957) also worked on this wave problem. In 1921, he gave the first existence
proof for channels of infinite depth by using a nonlinear integral equation and
a complicated majorant method. Using the method of Levi-Civita, Struik, in
1926, proved the existence of permanent gravitational waves in channels of
finite depth. This proof was even more complicated than the proof of Levi-
Civita for channels of infinite depth.
The functional-analytic approach of this section goes back to the author.
The monograph by Zeidler (1968) contains detailed historical remarks as well
as applications of this method to various classical wave problems. There, we
presented, for the first time, a number of new existence proofs. The key is
Theorem 8.A by the author. Further applications may be found in Zeidler
(1971), (1972a, b), (1973), (1977, S), and Beyer and Zeidler (1979).
From a physical point of view our main result in this chapter is as follows.
We are given:
$h$ average channel depth;
$\rho_0$ constant density of the liquid;
$\lambda$ wave length;
$p_0$ constant barometric pressure on the surface of the liquid.
We find that in a neighborhood of the critical velocity $c$, given by (1), there
occur waves which satisfy:
(i) Wave surface
$$y_0(x) = \frac{\lambda}{2\pi} s \cos \frac{2\pi x}{\lambda} + O(s^2), \quad s \to 0.$$
(ii) Average velocity $c$ at the channel bottom
$$c^2 = \frac{g \lambda}{2\pi} \tanh \frac{2\pi h}{\lambda} + \varepsilon_2 s^2 + O(s^4).$$
(iii) Channel bottom
$$y = -h.$$
(iv) Flux
$$\psi_0 = hc.$$
(v) Velocity vector $v = a e_1 + b e_2$ of the fluid particles at the wave surface
$$a = c + O(s), \quad b = -sc \sin \frac{2\pi x}{\lambda} + O(s^2), \quad s \to 0$$
and in the flow region
$$a = c + O(s), \quad b = O(s), \quad s \to 0.$$
(vi) Pressure in the flow region
$$p = p_0 + O(s), \quad s \to 0.$$
Thereby $s$ is a small real parameter and $\varepsilon_2 > 0$. What this means is that the
nontrivial waves appear supercritical, i.e., above the classical critical velocity
(1), which is obtained from (ii) with $s = 0$. According to the formal stability
theory of Section 8.7, such solutions are stable, see Figure 8.6. The trivial
parallel flow of Figure 71.1 corresponds to s = 0. Roughly speaking, the waves
above are the only ones that occur in a neighborhood of the critical velocity
(1). More precisely, this will be stated in Theorem 71.B of Section 71.4.
We use a Cartesian $(x, y)$-coordinate system in which the wave surface seems
to be at rest. In such a system the fluid particles move from left to right with
a mean velocity of $c$. That this is a very natural result can be seen as follows.
If we choose another coordinate system in which the wave surface travels from
right to left with velocity c (Fig. 71.1), then the fluid particles have a mean
velocity of zero, i.e., they just perform small oscillations.
Our present problem is a free boundary-value problem which exhibits the
typical difficulty that, a priori, the form of the wave surface (i) is not known.
In order to overcome this obstacle, we introduce a conformal mapping. The
main idea is this:
(a) The complex flow potential is used to obtain a conformal map between
a suitable portion of the flow region and the circular ring.
(b) Thereby the original free boundary-value problem reduces to a nonlinear
boundary-value problem for an analytic function on the circular ring.
(c) Using $C^{2,\alpha}$-Hölder spaces, we transform problem (b) into an operator
equation on a B-space.
(d) We then apply the main theorem of analytic bifurcation theory (Theorem
8.A) to this operator equation. The following relations are valid:
trivial solution => trivial parallel flow,
bifurcation point => critical velocity (1),
bifurcation branch => nontrivial permanent waves.
During the last few years, various other free boundary-value problems have
been studied using the theory of variational inequalities. A discussion of this
may be found in Friedman (1982, M). In Part V we will consider an elegant
variational approach to problems with permanent waves.

71.1. Physical Problem and Complex Velocity


Let us now consider a water wave of wave length $\lambda$ which propagates with a
constant velocity in a channel of finite depth. This problem is most easily
studied in a Cartesian $(x, y)$-coordinate system, in which the fluid surface seems
to be at rest, i.e., where the surface is described by a $\lambda$-periodic even function
$$y = y_0(x), \tag{2}$$
which satisfies the normalization condition
$$\int_{-\lambda/2}^{\lambda/2} y_0(x) \, dx = 0. \tag{3}$$
As the equation of the channel bottom we choose
$$y = -h_0 \tag{4}$$
(Fig. 71.2). Let $e_1$, $e_2$ denote the orthonormal basis vectors in the $(x, y)$-system.

Figure 71.2

Suppose the velocity vector has the form
$$v = a e_1 + b e_2 \tag{5}$$
with velocity components $a = a(x, y)$ and $b = b(x, y)$, i.e., we are interested in
a plane stationary velocity field. Moreover, we assume that the components
$a$ and $b$ have period $\lambda$ with respect to $x$, and satisfy the following symmetry
conditions
$$a(-x, y) = a(x, y), \qquad b(-x, y) = -b(x, y). \tag{6}$$
At the channel bottom we have
$$b(x, -h_0) = 0. \tag{7}$$
We define the average velocity at the channel bottom as
$$c = \frac{1}{\lambda} \int_{-\lambda/2}^{\lambda/2} a(x, -h_0) \, dx. \tag{8}$$
According to our assumption above, the surface of the wave in the $(x, y)$-system
seems to be at rest. However, our existence proof below shows that
the particles of fluid move with an average velocity of $c$ from left to right.
By definition, the flux is equal to
$$\psi_0 = \int_{-h_0}^{y_0(0)} a(0, y) \, dy. \tag{9}$$
The quantity
$$h = \frac{\psi_0}{c}$$
is called the average channel depth. Our existence proof below yields
$$h_0 = h + O(s^2), \quad s \to 0.$$
Since we consider an inviscid, incompressible, irrotational, stationary flow
with constant density $\rho_0$, formulas (70.16) and (70.19) yield the following Euler
equations for the flow region:
$$\operatorname{grad}(\tfrac{1}{2} \rho_0 v^2 + \rho_0 U + p) = 0, \tag{10}$$
$$\operatorname{div} v = 0, \tag{11}$$
$$\operatorname{curl} v = 0. \tag{12}$$
Thereby $p$ is the pressure and
$$K = -\rho_0 \operatorname{grad} U$$
is the density of the outer forces. The effective outer force in our wave problem
is the gravitational force, i.e., $K = -\rho_0 g e_2$ and hence
$$U = gy.$$

Theorem 71.A. Consider the basic equations (10)-(12) for a planar, irrotational
flow of an incompressible inviscid fluid in a region $G$. These equations are
equivalent to the following conditions:

(a) The so-called complex velocity function
$$V(z) = a(z) - ib(z)$$
with $z = x + iy$ is holomorphic in $G$, where $a$ and $b$ are the components of
the velocity vector, i.e., $v = a e_1 + b e_2$.
(b) There exists a constant $B_0$ such that the Bernoulli equation
$$\tfrac{1}{2} \rho_0 v^2 + \rho_0 U + p = B_0 \tag{13}$$
is valid in $G$.

PROOF. Equations (11) and (12) are equivalent to the Cauchy-Riemann
equations
$$a_x = -b_y, \qquad a_y = b_x.$$
Hence $a - ib$ is holomorphic in $G$. □

Recall that the Bernoulli equation (13) describes conservation of energy.
Theorem 71.A allows us to apply methods of complex function theory to
planar hydrodynamics.
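The equivalence between holomorphy of $V = a - ib$ and the equations $\operatorname{div} v = 0$, $\operatorname{curl} v = 0$ can be observed numerically. The sketch below uses a hypothetical holomorphic function $V(z) = 1 + 0.1 e^{-iz}$ and, as a counterexample, the non-holomorphic function $V(z) = \bar{z}$; derivatives are approximated by central differences, and all names are illustrative.

```python
import cmath


def complex_velocity(z):
    """Hypothetical holomorphic complex velocity V(z) = 1 + 0.1 exp(-i z)."""
    return 1.0 + 0.1 * cmath.exp(-1j * z)


def components(V, x, y):
    """Velocity components (a, b) recovered from V = a - i b."""
    value = V(complex(x, y))
    return value.real, -value.imag


def divergence_and_curl(V, x, y, step=1e-5):
    """Central-difference approximations of div v = a_x + b_y and curl v = b_x - a_y."""
    a_xp, b_xp = components(V, x + step, y)
    a_xm, b_xm = components(V, x - step, y)
    a_yp, b_yp = components(V, x, y + step)
    a_ym, b_ym = components(V, x, y - step)
    a_x = (a_xp - a_xm) / (2.0 * step)
    b_x = (b_xp - b_xm) / (2.0 * step)
    a_y = (a_yp - a_ym) / (2.0 * step)
    b_y = (b_yp - b_ym) / (2.0 * step)
    return a_x + b_y, b_x - a_y


div_holomorphic, curl_holomorphic = divergence_and_curl(complex_velocity, 0.4, -0.5)
# Counterexample: V(z) = conj(z) is not holomorphic; here a = x, b = y, so div v = 2.
div_conjugate, curl_conjugate = divergence_and_curl(lambda z: z.conjugate(), 0.4, -0.5)
```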
In the case of our wave problem, the following boundary conditions are
valid:
$$p = p_0 \quad \text{along } y = y_0(x), \tag{14}$$
where $p_0$ is the outer atmospheric pressure, and
$$y_0' a - b = 0 \quad \text{along } y = y_0(x), \tag{15}$$
$$b = 0 \quad \text{along } y = -h_0. \tag{16}$$
Conditions (15) and (16) state that the wave surface and the channel bottom
are streamlines, i.e., the velocity vector is tangential. This leads us to the
following problem:

(i) We are looking for a holomorphic function $V = a - ib$ in the flow region
$$G = \{(x, y) \in \mathbb{R}^2 : -h_0 < y < y_0(x), \ -\infty < x < \infty\},$$
whereby $y = y_0(x)$ is unknown.
(ii) The boundary conditions (14)-(16) are to be satisfied whereby the Bernoulli
constant $B_0$ is unknown.
If a solution is found, then the velocity is known. Furthermore, the pressure
$p$ in the flow region follows from equation (13) with $U = gy$.

71.2. Complex Flow Potential and Free Boundary-Value Problem

Let us simplify our problem, and introduce the holomorphic function
$$W(z) = \varphi + i\psi \tag{17}$$
via the integral
$$W(z) = \int_{i y_0(0)}^{z} V(w) \, dw.$$
Since the flow region is simply connected, this so-called complex flow potential
is uniquely determined in this region. From
$$W'(z) = \varphi_x - i \varphi_y, \tag{18}$$
we obtain
$$a = \varphi_x, \qquad b = \varphi_y, \tag{19}$$
with
$$\varphi(0, y_0(0)) = 0. \tag{20}$$
Moreover, the Cauchy-Riemann equations
$$\varphi_x = \psi_y, \qquad \varphi_y = -\psi_x \tag{21}$$
are valid in the flow region with
$$\psi(0, y_0(0)) = 0. \tag{22}$$
The components $\varphi$ and $\psi$ of the complex flow potential admit a simple
interpretation. From (19) we have
$$v = \operatorname{grad} \varphi, \tag{23}$$
i.e., $\varphi$ is the negative velocity potential. By definition, the curve $C$: $x = x(t)$,
$y = y(t)$ is a streamline if and only if the velocity vector is tangential along $C$,
i.e.,
$$-x'(t) \varphi_y(x(t), y(t)) + y'(t) \varphi_x(x(t), y(t)) = 0 \quad \text{along } C.$$
From (21) follows that
$$\frac{d}{dt} \psi(x(t), y(t)) = x'(t) \psi_x + y'(t) \psi_y = -x'(t) \varphi_y + y'(t) \varphi_x.$$
Hence $C$ is a streamline if and only if
$$\psi = \text{const} \quad \text{along } C. \tag{23*}$$
From Section 71.1, the wave surface $y = y_0(x)$ is a streamline, i.e.,
$$\psi = 0 \quad \text{along } y = y_0(x)$$
and the channel bottom is a streamline as well, i.e.,
$$\psi = -\psi_0 \quad \text{along } y = -h_0.$$
The values of the constants follow from (9), (22), and $a = \psi_y$.
From (6), together with (19), (21), we obtain
$$\varphi(-x, y) = -\varphi(x, y), \qquad \psi(-x, y) = \psi(x, y), \tag{24}$$
and the $\lambda$-periodicity of $a$ and $b$ with respect to $x$ implies the same for $\varphi_x$ and
$\psi_x$. This yields
$$\psi(x + \lambda, y) = \psi(x, y), \qquad \varphi(x + \lambda, y) = \varphi(x, y) + c\lambda. \tag{25}$$
Notice that $\psi$ is even with respect to $x$, and observe (8) with $a = \varphi_x$.
Furthermore, it follows from (24) and (25) that
$$\varphi\left( \pm \frac{\lambda}{2}, y \right) = \pm \frac{c\lambda}{2}. \tag{26}$$
This leads to the following problem.

Free Boundary-Value Problem 71.1. We are looking for a flow region
$$G = \{(x, y) : -h_0 < y < y_0(x), \ -\infty < x < \infty\}$$
and a function $\psi$ with
$$\Delta \psi = 0 \quad \text{on } G$$
which satisfies the boundary conditions
$$\psi = 0 \quad \text{for } y = y_0(x),$$
$$\psi = -\psi_0 \quad \text{for } y = -h_0,$$
and
$$\tfrac{1}{2} \rho_0 (\psi_x^2 + \psi_y^2) + \rho_0 g y_0 + p_0 = B_0 \tag{27}$$
for $y = y_0(x)$. Moreover, let $y_0$ and $\psi$ be even and $\lambda$-periodic with respect to
$x$. Finally, let $\psi_0 > 0$ and for $y_0$ assume the normalization condition (3).

Since the boundary $y = y_0(x)$ is unknown and hence has to be determined
together with $\psi$, one speaks of a free boundary-value problem. In particular,
we are searching for a function $y_0(x) \not\equiv 0$.
To further simplify our problem let us introduce the function
$$\omega = \theta + i\tau.$$
We make the ansatz
$$W'(z) = \eta c \, e^{-i\omega(z)},$$
where the constant $\eta$ has still to be specified. This means that
$$a = \eta c \, e^{\tau} \cos \theta, \qquad b = \eta c \, e^{\tau} \sin \theta,$$
and hence $\theta$ is the angle between the velocity vector $v$ and the $x$-axis and
$$|v| = \eta c \, e^{\tau}.$$
For this to make sense, we assume that no stationary points exist in the flow
region, i.e., $v \neq 0$ in $G$. Hence $W'(z) \neq 0$ in $G$. The function $\omega$ together with
$W$ is holomorphic in $G$. Since the maps $a$ and $b$ are $\lambda$-periodic in $x$, it follows
that $\theta$ and $\tau$ have the same property. The symmetry of $a$ and $b$ implies
$$\theta(-x, y) = -\theta(x, y), \qquad \tau(-x, y) = \tau(x, y).$$
Moreover, we find that at the channel bottom
$$\theta = 0 \quad \text{for } y = -h_0,$$
and from (27) we obtain
$$\tfrac{1}{2} \rho_0 \eta^2 c^2 e^{2\tau} + \rho_0 g y_0 + p_0 = B_0 \tag{28}$$
for $y = y_0(x)$. The relation between $\eta$ and $\tau$ is given in (31) below.

71.3. Transformed Boundary-Value Problem for the Circular Ring

In order to avoid the difficulty of a free boundary $y = y_0(x)$, we transform our
problem onto the circular ring.
As shown in Figure 71.3, only the portion $G_\lambda$ of the flow region is
considered, where $-\lambda/2 < x < \lambda/2$. We then use the two mappings
$$W = W(z)$$
and
$$\zeta = \exp \frac{2\pi W}{i c \lambda}.$$
This way, the region $G_\lambda$ is mapped conformally onto a circular ring, cut along
the negative real axis. We transform the function $\omega = \theta + i\tau$ into the $\zeta$-plane
and denote this transformed function again by $\omega = \theta + i\tau$. The transformed
function $\omega$ is then holomorphic on the open circular ring, cut along the
negative real axis.
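For the trivial parallel flow the two mappings can be written down explicitly, which gives a simple check of the geometry. The sketch below assumes $V = c$, $y_0 = 0$ and $h_0 = h$, so that $W(z) = cz$, and reads the second mapping as $\zeta = \exp(2\pi W/(ic\lambda))$; the numerical values and function names are hypothetical.

```python
import cmath
import math


def flow_potential_parallel(z, c):
    """Complex flow potential W(z) = c z of the trivial parallel flow (assumed case)."""
    return c * z


def to_annulus(z, c, wavelength):
    """Image zeta = exp(2 pi W / (i c lambda)) of the point z."""
    W = flow_potential_parallel(z, c)
    return cmath.exp(2.0 * math.pi * W / (1j * c * wavelength))


c, wavelength, depth = 1.5, 2.0, 0.8   # hypothetical sample values

surface_image = to_annulus(complex(0.3, 0.0), c, wavelength)      # point on y = 0
bottom_image = to_annulus(complex(0.3, -depth), c, wavelength)    # point on y = -h
inner_radius = math.exp(-2.0 * math.pi * depth / wavelength)
```

Under these assumptions the surface $y = 0$ goes to the unit circle, the bottom $y = -h$ to the circle of radius $e^{-2\pi h/\lambda}$, and the argument of $\zeta$ equals $-2\pi x/\lambda$, so the vertical lines $x = \pm\lambda/2$ land on the cut along the negative real axis.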
In the (-plane we introduce polar coordinates p and q through
( = peiu.

Figure 71.3. The portion $G_\lambda$ of the flow region between the channel bottom $y = -h_0$ and the free surface $y = y_0(x)$.

Because of $\theta(\rho, \pm\pi) = 0$ and
\[ \theta(\rho, -\sigma) = -\theta(\rho, \sigma), \qquad \tau(\rho, -\sigma) = \tau(\rho, \sigma), \]
we can use Schwarz's reflection principle to continue the function $w$ analytically
onto the open circular ring
\[ A_q = \{\zeta \in \mathbb{C} : q < |\zeta| < 1\}. \]
Furthermore, we set
\[ S_r \overset{\text{def}}{=} \{\zeta \in \mathbb{C} : |\zeta| = r\}. \]
Note that the boundary of $A_q$ consists of the two circles $S_1$ and $S_q$ (Fig. 71.4).

Figure 71.4

From
\[ dz = \frac{dW}{W'(z)} = \frac{i\lambda}{2\pi\eta}\, e^{iw(\zeta)}\, \frac{d\zeta}{\zeta} \]
we obtain the back transformation
\[ z = \frac{\lambda i}{2\pi\eta} \int_1^{\zeta} e^{iw(\zeta)}\, \frac{d\zeta}{\zeta} + i y_0(0). \tag{29} \]

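This integral is to be evaluated along a curve in the circular ring, cut along
the negative real axis. The free surface $y = y_0(x)$ corresponds to the unit circle
$\zeta = e^{i\sigma}$, and hence has a parameter representation
\[ x = -\frac{\lambda}{2\pi\eta} \int_0^{\sigma} e^{-\tau(\sigma)} \cos\theta(\sigma)\, d\sigma, \]
\[ y_0 = -\frac{\lambda}{2\pi\eta} \int_0^{\sigma} e^{-\tau(\sigma)} \sin\theta(\sigma)\, d\sigma + y_0(0). \]
Here we write $f(\sigma)$ instead of $f(1, \sigma)$. The boundary relation (28) then becomes
\[ \tfrac{1}{2}\rho_0 \eta^2 c^2 e^{2\tau} - \frac{\rho_0 g \lambda}{2\pi\eta} \int_0^{\sigma} e^{-\tau} \sin\theta\, d\sigma + \rho_0 g y_0(0) + p_0 = B_0 \]
on $S_1$. Differentiation with respect to $\sigma$ yields
\[ \tau_\sigma = \mu e^{-3\tau} \sin\theta \quad \text{on } S_1 \tag{30} \]
with
\[ \mu = \frac{g\lambda}{2\pi c^2 \eta^3}. \]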
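The differentiation step leading to (30) can be written out as follows. This is only a sketch: it assumes that $\eta$, $c$, $g$, $\lambda$, $\rho_0$, $p_0$, and $B_0$ are constants, so that all constant terms of the boundary relation drop out under $d/d\sigma$.

```latex
\begin{align*}
0 &= \frac{d}{d\sigma}\left[\tfrac{1}{2}\rho_0\eta^2c^2e^{2\tau}
      - \frac{\rho_0 g\lambda}{2\pi\eta}\int_0^{\sigma} e^{-\tau}\sin\theta\,d\sigma\right] \\
  &= \rho_0\eta^2c^2e^{2\tau}\,\tau_\sigma
      - \frac{\rho_0 g\lambda}{2\pi\eta}\,e^{-\tau}\sin\theta, \\
\tau_\sigma &= \frac{g\lambda}{2\pi c^2\eta^3}\,e^{-3\tau}\sin\theta
      = \mu\,e^{-3\tau}\sin\theta .
\end{align*}
```

Dividing the second line by $\rho_0\eta^2c^2e^{2\tau}$ gives the third line, which shows where the exponent $-3\tau$ and the factor $\eta^{-3}$ in $\mu$ come from.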

Under the back transformation the point $\zeta = q e^{\pm i\pi}$ becomes $z = \mp\lambda/2 - i h_0$.
Hence, it follows from (29) that
\[ \eta = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-\tau(q, \sigma)}\, d\sigma. \tag{31} \]
In summary, the free boundary-value problem of Section 71.2 has been
transformed into the following boundary-value problem for the circular ring.

Boundary-Value Problem 71.2. We are looking for functions
\[ \theta, \tau \in C^2(\bar{A}_q) \]
which satisfy the Cauchy-Riemann differential equations
\[ \rho\theta_\rho = \tau_\sigma, \qquad \rho\tau_\rho = -\theta_\sigma \quad \text{on } A_q, \tag{32} \]
with boundary conditions
\[ \tau_\sigma = \mu e^{-3\tau} \sin\theta \quad \text{on } S_1, \qquad \theta = 0 \quad \text{on } S_q. \tag{33} \]
Furthermore, we assume the symmetry conditions
\[ \theta(\rho, -\sigma) = -\theta(\rho, \sigma), \qquad \tau(\rho, -\sigma) = \tau(\rho, \sigma) \quad \text{on } A_q, \tag{34} \]
and the normalization condition
\[ \int_{-\pi}^{\pi} \tau(q, \sigma)\, d\sigma = 0. \tag{35} \]

The trivial solution $\theta = \tau = 0$ corresponds to a parallel flow. We are interested,
however, in parameters $\mu$ for which nontrivial solutions exist. This is a
typical bifurcation problem.

71.4. Existence and Uniqueness of the Bifurcation Branch
Our goal is to formulate the Boundary-Value Problem 71.2 as an operator
equation.
For this, we let $X$ be the set of all function pairs
\[ (\theta, \tau) \in C^2(\bar{A}_q) \times C^2(\bar{A}_q), \]
which satisfy the following additional properties:
(i) The Cauchy-Riemann differential equations (32) are valid on the open
circular ring $A_q$.
(ii) The boundary condition $\theta = 0$ is valid on $S_q$.
(iii) We assume the symmetry condition (34) and the normalization condition
(35).
Obviously, $X$ is a closed, linear subspace of the B-space $C^2(\bar{A}_q) \times C^2(\bar{A}_q)$.
Hence $X$ is a B-space.
The Boundary-Value Problem 71.2 can now be formulated in the following
form. We are looking for an element $(\theta, \tau) \in X$ which satisfies
\[ \tau_\sigma = \mu e^{-3\tau} \sin\theta \quad \text{on } S_1. \tag{36} \]
With regard to the final operator equation (38) below, let us now consider
the boundary condition
\[ \tau_\sigma = f \quad \text{on } S_1 \tag{37} \]
and formulate the solution of the corresponding boundary-value problem in
functional-analytical terms. For this, we let $Y$ be the B-space of all $C^2$-functions
\[ f\colon S_1 \to \mathbb{R} \]
which are odd with respect to $\sigma$, i.e., $f(-\sigma) = -f(\sigma)$.

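Lemma 71.3. For each $f \in Y$ there exists a unique $(\theta, \tau) \in X$ which satisfies (37).
If we set
\[ (\theta, \tau) = Rf, \]
then the operator $R\colon Y \to X$ is linear and compact.

This functional-analytical result contains the solution of the following
classical boundary-value problem:
\[
\begin{aligned}
&\rho\theta_\rho = \tau_\sigma, \quad \rho\tau_\rho = -\theta_\sigma && \text{on } A_q, \\
&\tau_\sigma = f && \text{on } S_1, \\
&\theta = 0 && \text{on } S_q, \\
&\theta(\rho, -\sigma) = -\theta(\rho, \sigma), \quad \tau(\rho, -\sigma) = \tau(\rho, \sigma) && \text{on } \bar{A}_q, \\
&\int_{-\pi}^{\pi} \tau(q, \sigma)\, d\sigma = 0.
\end{aligned}
\tag{37*}
\]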

PROOF. Let $f \in Y$ be given. Integration of equation (37) with respect to $\sigma$
yields the boundary values of $\tau$ on $S_1$, i.e.,
\[ \tau(1, \sigma) = \tau(1, -\pi) + \int_{-\pi}^{\sigma} f(\sigma)\, d\sigma. \]
Because of $f(-\sigma) = -f(\sigma)$ we obtain $\tau(1, \pi) = \tau(1, -\pi)$, i.e., for given $\tau(1, -\pi)$
the function $\tau$ is uniquely determined on $S_1$.
We then solve the boundary-value problem (37*), where the condition
"$\tau_\sigma = f$ on $S_1$" is replaced with
\[ \tau(1, \sigma) = a + \int_{-\pi}^{\sigma} f(\sigma)\, d\sigma \quad \text{on } S_1. \]
According to Problems 71.1 and 71.2, this classical boundary-value problem
has a unique solution. Notice that the normalization condition, i.e., the last
line in (37*), uniquely determines the constant $a$.
The Schauder estimates (46) of Problem 71.1 imply that the solution operator
\[ \tau(1, \sigma) \mapsto (\theta, \tau) \]
is a continuous linear operator
\[ \text{from } C^{2,\alpha}(S_1) \text{ to } C^{2,\alpha}(\bar{A}_q) \times C^{2,\alpha}(\bar{A}_q) \]
for all $0 < \alpha < 1$. Thus it is a compact linear operator
\[ \text{from } C^{3}(S_1) \text{ to } C^{2}(\bar{A}_q) \times C^{2}(\bar{A}_q), \]
since the embeddings $C^3(S_1) \subseteq C^{2,\alpha}(S_1)$ and $C^{2,\alpha}(\bar{A}_q) \subseteq C^2(\bar{A}_q)$ are compact.
Consequently, the solution operator
\[ f(\sigma) \mapsto (\theta, \tau) \]
is a compact linear operator
\[ \text{from } C^{2}(S_1) \text{ to } C^{2}(\bar{A}_q) \times C^{2}(\bar{A}_q), \]
i.e., the operator $R\colon Y \to X$ is compact. □
Using Lemma 71.3, the Boundary-Value Problem 71.2 can be reduced to
the following equivalent problem.

Operator Equation 71.4. We are looking for $(\theta, \tau) \in X$ and $\mu \in \mathbb{R}$ such that
\[ (\theta, \tau) = \mu R(e^{-3\tau} \sin\theta) \tag{38} \]
is satisfied.

On $\mathbb{R} \times X$, this equation has the trivial solution $(\mu, 0)$, which corresponds
to a parallel flow. We set
\[ \mu_m = m(1 + q^{2m})(1 - q^{2m})^{-1}. \]

Theorem 71.B. A point $(\mu, 0)$ is a bifurcation point of the operator equation (38)
in $\mathbb{R} \times X$ if and only if
\[ \mu = \mu_m, \qquad m = 1, 2, \ldots. \]
In a sufficiently small neighborhood of such a point in $\mathbb{R} \times X$, all bifurcation
solutions lie on a curve which can be analytically parametrized by a small
parameter $s$.

Corollary 71.5. On $S_1$ the bifurcation solution satisfies
\[ \theta = s \sin m\sigma + O(s^2), \qquad s \to 0, \]
\[ \mu = \mu_m + O(s^2). \]
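The critical values $\mu_m$ are easy to tabulate. The following Python sketch does so and, under two assumptions that are not part of the theorem itself, turns $\mu_1$ into a wave speed: it assumes the relation $q = e^{-2\pi h/\lambda}$ between the ring and the channel (our reading of the formula in Section 71.6), and it sets $\eta = 1$, which is the value (31) gives for $\tau = 0$. The function names are illustrative only.

```python
import math


def mu(m: int, q: float) -> float:
    """Critical parameter mu_m = m (1 + q^(2m)) / (1 - q^(2m)), 0 < q < 1."""
    if m < 1 or not 0.0 < q < 1.0:
        raise ValueError("need m >= 1 and 0 < q < 1")
    q2m = q ** (2 * m)
    return m * (1.0 + q2m) / (1.0 - q2m)


def ring_parameter(depth: float, wavelength: float) -> float:
    """Assumed relation q = exp(-2 pi h / lambda) between ring and channel."""
    return math.exp(-2.0 * math.pi * depth / wavelength)


def linear_speed_squared(depth: float, wavelength: float, g: float = 9.81) -> float:
    """c^2 from mu = g lambda / (2 pi c^2 eta^3), assuming eta = 1, mu = mu_1."""
    q = ring_parameter(depth, wavelength)
    return g * wavelength / (2.0 * math.pi * mu(1, q))


if __name__ == "__main__":
    h, lam = 2.0, 10.0
    q = ring_parameter(h, lam)
    print([round(mu(m, q), 4) for m in (1, 2, 3)])
    print(linear_speed_squared(h, lam))
```

With these assumptions $\mu_1 = \coth(2\pi h/\lambda)$, so the computed speed agrees with the classical relation $c^2 = (g\lambda/2\pi)\tanh(2\pi h/\lambda)$ for sinusoidal waves.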

In Section 71.6 we shall describe a method which allows us to explicitly
compute the solution up to any given order of $s$. For small $|s|$, the series for
$\mu$, $\theta$, and $\tau$ then converge in the space $\mathbb{R} \times X$. In particular, $\theta$ and $\tau$ converge
uniformly on the closed circular ring $\bar{A}_q$, and the bifurcation branches have
the form depicted in Figure 71.5.

Figure 71.5

71.5. Proof of Theorem 71.B
We apply Theorem 8.A. The key step in the proof is the verification of the
bifurcation condition of Theorem 8.A. The main tools we use are Fourier
series.
Let $f\colon S_1 \to \mathbb{R}$ be a continuous function. For $k = 1, 2, \ldots$, we denote the
Fourier coefficients of $f$ by
\[ a_k(f) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(\sigma) \cos k\sigma\, d\sigma, \]
\[ b_k(f) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(\sigma) \sin k\sigma\, d\sigma. \]

Lemma 71.6. If $f\colon S_1 \to \mathbb{R}$ is $C^n$, $n \ge 2$, then
\[ |a_k| + |b_k| \le \text{const} \cdot k^{-n}, \qquad k = 1, 2, \ldots. \]

The simple proof will be given in Problem 71.3.

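Lemma 71.7. For every $(\theta, \tau) \in X$ there exists the following Fourier expansion
on $\bar{A}_q$:
\[ \theta(\rho, \sigma) = \sum_{k=1}^{\infty} b_k\, \frac{\rho^k - q^{2k}\rho^{-k}}{1 - q^{2k}} \sin k\sigma, \]
\[ \tau(\rho, \sigma) = -\sum_{k=1}^{\infty} b_k\, \frac{\rho^k + q^{2k}\rho^{-k}}{1 - q^{2k}} \cos k\sigma, \tag{39} \]
where $b_k = b_k(\theta(1, \sigma))$.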

PROOF. Since the function $w = \theta + i\tau$ is holomorphic on the circular ring $A_q$,
we make the ansatz
\[ w = \sum_k \alpha_k \zeta^k + \beta_k \zeta^{-k}. \]
By using the classical root convergence criterion for series, we immediately
observe that the series (39) on $A_q$ are arbitrarily often differentiable and satisfy
the Cauchy-Riemann differential equations (32). On $S_1$ and $S_q$ the convergence
of (39) follows from Lemma 71.6. Because of

\[ \int_{-\pi}^{\pi} \tau(q, \sigma)\, d\sigma = 0, \]
we obtain from Cauchy's integral theorem that
\[ \int w\, \frac{d\zeta}{\zeta} = 0 \]
for all circles $|\zeta| = r$ with $q \le r \le 1$. Hence no constant term appears in the
expansion of $\tau$. □

Lemma 71.8. Let $m = 1, 2, \ldots$ be fixed. If $f\colon S_1 \to \mathbb{R}$ is $C^2$ and odd with respect
to $\sigma$, then there exists an element $(\theta, \tau) \in X$ with
\[ \tau_\sigma = \mu_m \theta + f \quad \text{on } S_1, \tag{40} \]
if and only if the solvability condition
\[ b_m(f) = 0 \tag{40*} \]
is satisfied. We find that
\[ b_k(\theta(1, \sigma)) = \frac{b_k(f)}{\mu_k - \mu_m} \quad \text{for all } k \ne m, \tag{41} \]
and $b_m(\theta(1, \sigma))$ is arbitrary. Moreover, $\theta$ and $\tau$ on $\bar{A}_q$ are obtained from (39).

PROOF. Formulas (39) and (40) imply that
\[ \mu_k b_k = \mu_m b_k + b_k(f). \qquad \square \]
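The coefficient comparison behind this one-line proof can be spelled out as follows. The sketch assumes that the expansions (39) may be differentiated term by term on $S_1$ and that the odd function $f$ is represented by its sine series.

```latex
\begin{align*}
\theta(1,\sigma) &= \sum_{k=1}^{\infty} b_k \sin k\sigma, \\
\tau_\sigma(1,\sigma) &= \sum_{k=1}^{\infty} k\,\frac{1+q^{2k}}{1-q^{2k}}\, b_k \sin k\sigma
   = \sum_{k=1}^{\infty} \mu_k b_k \sin k\sigma, \\
f(\sigma) &= \sum_{k=1}^{\infty} b_k(f) \sin k\sigma, \\
\tau_\sigma = \mu_m\theta + f
   \;&\Longleftrightarrow\; \mu_k b_k = \mu_m b_k + b_k(f) \quad\text{for all } k .
\end{align*}
```

For $k = m$ the relation reduces to $b_m(f) = 0$, which is (40*), and $b_m$ remains free. For $k \ne m$ one can divide by $\mu_k - \mu_m$, which is nonzero because $k \mapsto k\coth(k\alpha)$ with $\alpha = -\ln q > 0$ is strictly increasing; this gives (41).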
We are now ready to prove Theorem 71.B. Condition (40*) will play a key
role. Set $w = (\theta, \tau)$ and write the fundamental operator equation (38) in the
form
\[ F(\mu, w) = 0, \qquad w \in X. \tag{42} \]
Because of the linearization
\[ e^{-3\tau} \sin\theta = \theta + \cdots, \]
the partial F-derivative $F_w(\mu, 0)\colon X \to X$ is equal to
\[ F_w(\mu, 0) w = w - \mu R\theta \quad \text{for all } w \in X. \]
The operator $F_w(\mu, 0)$ is therefore a compact perturbation of the identity and
hence a Fredholm operator of index zero.
The eigenvalue equation
\[ F_w(\mu, 0) w = 0, \qquad \mu \in \mathbb{R}, \quad w \in X \tag{42*} \]
corresponds to (40) with $f = 0$ and $\mu$ in place of $\mu_m$. According to Lemma
71.8, the eigensolutions of (42*) correspond to
\[ \mu = \mu_m, \qquad b_m = \text{arbitrary}, \qquad b_k = 0 \ \text{for all } k \ne m, \]
and according to (39) we have
\[ \theta_m = \frac{\rho^m - q^{2m}\rho^{-m}}{1 - q^{2m}} \sin m\sigma, \]
\[ \tau_m = \frac{\rho^m + q^{2m}\rho^{-m}}{q^{2m} - 1} \cos m\sigma. \]
Thus the linearized eigenvalue problem (42*) has the eigensolutions
\[ \mu = \mu_m, \qquad m = 1, 2, \ldots. \]
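As a numerical sanity check, the following Python sketch evaluates the eigenfunctions $\theta_m$, $\tau_m$ and tests, with central finite differences, that they satisfy the Cauchy-Riemann equations (32), the linearized boundary condition $\tau_\sigma = \mu_m\theta$ on $S_1$, and $\theta = 0$ on $S_q$. The step size and the sample point are arbitrary choices made for illustration.

```python
import math


def theta_m(rho, sigma, m, q):
    """theta_m = (rho^m - q^(2m) rho^(-m)) / (1 - q^(2m)) * sin(m sigma)."""
    return (rho**m - q**(2 * m) * rho**(-m)) / (1 - q**(2 * m)) * math.sin(m * sigma)


def tau_m(rho, sigma, m, q):
    """tau_m = (rho^m + q^(2m) rho^(-m)) / (q^(2m) - 1) * cos(m sigma)."""
    return (rho**m + q**(2 * m) * rho**(-m)) / (q**(2 * m) - 1) * math.cos(m * sigma)


def d_rho(f, rho, sigma, m, q, h=1e-5):
    """Central difference approximation of the rho-derivative."""
    return (f(rho + h, sigma, m, q) - f(rho - h, sigma, m, q)) / (2 * h)


def d_sigma(f, rho, sigma, m, q, h=1e-5):
    """Central difference approximation of the sigma-derivative."""
    return (f(rho, sigma + h, m, q) - f(rho, sigma - h, m, q)) / (2 * h)


def residuals(rho, sigma, m, q):
    """Residuals of (32), of tau_sigma = mu_m theta on S_1, and of theta on S_q."""
    mu_m = m * (1 + q**(2 * m)) / (1 - q**(2 * m))
    cr1 = rho * d_rho(theta_m, rho, sigma, m, q) - d_sigma(tau_m, rho, sigma, m, q)
    cr2 = rho * d_rho(tau_m, rho, sigma, m, q) + d_sigma(theta_m, rho, sigma, m, q)
    top = d_sigma(tau_m, 1.0, sigma, m, q) - mu_m * theta_m(1.0, sigma, m, q)
    bottom = theta_m(q, sigma, m, q)
    return cr1, cr2, top, bottom


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, residuals(0.8, 0.7, m, 0.5))
```

All four residuals should vanish up to discretization and rounding error.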
We now verify the decisive bifurcation condition of Theorem 8.A. We set
$\mu = \mu_m + \varepsilon$, and have to show that

$w \in X$ (42**)
has no solution. This equation is equivalent to $(\theta, \tau) \in X$ and
\[ \tau_\sigma = \mu_m \theta + \sin m\sigma \quad \text{on } S_1. \]
Lemma 71.8 implies that this latter equation has no solution.
Theorem 71.B and Corollary 71.5 thus follow directly from Theorem 8.A. □

71.6. Explicit Construction of the Solution
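We want to show that the method described in Corollary 8.25 can be applied
to our present case, i.e., we want to explicitly compute the solution. We make
the following ansatz
\[ \theta = s\theta_m + s^2\theta_{m,2} + s^3\theta_{m,3} + \cdots, \]
\[ \tau = s\tau_m + s^2\tau_{m,2} + s^3\tau_{m,3} + \cdots, \tag{43} \]
\[ \mu = \mu_m + s\varepsilon_1 + s^2\varepsilon_2 + \cdots, \]
where $s$ is a small real parameter. According to Theorem 8.A we have
\[ b_m(\theta) = s \quad \text{on } S_1. \]
This yields the first key relation
\[ b_m(\theta_{m,k}) = 0 \quad \text{for all } k. \tag{43*} \]
Note that $b_m(\theta_m) = 1$.
Our goal is to successively compute
\[ \theta_{m,k}, \ \tau_{m,k}, \ \text{and } \varepsilon_{k-1} \quad \text{for } k = 2, 3, \ldots, \]
by using the solvability condition (40*).
(i) The functions $\theta_m$ and $\tau_m$ are already known. Suppose we know $\theta_{m,r}$, $\tau_{m,r}$,
and $\varepsilon_{r-1}$ for all $r$ with $2 \le r \le k - 1$.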
(ii) We insert (43) into the Boundary-Value Problem 71.2, and comparing the
terms with $s^k$ we find that
\[ \frac{d\tau_{m,k}}{d\sigma} = \mu_m \theta_{m,k} + \varepsilon_{k-1}\theta_m + f_k \quad \text{on } S_1, \]
where $f_k$ depends only on already computed quantities. The solvability
condition (40*) of Lemma 71.8 then yields the second key relation
\[ b_m(\varepsilon_{k-1}\theta_m + f_k) = 0 \quad \text{on } S_1, \]
which shows that
\[ \varepsilon_{k-1} = -b_m(f_k). \]
We then compute $\theta_{m,k}$ and $\tau_{m,k}$ from Lemma 71.8, and observe that these
functions are uniquely determined by condition (43*).
Since products of sine and cosine functions can always be transformed into
appropriate sums, it follows that $f_k$ is a finite Fourier series. Hence it follows
from Lemma 71.8 that $\theta_{m,k}$ and $\tau_{m,k}$ can explicitly be computed. The functions
correspond to finite sums of type (39).
Using the back transformation (29) from the circular ring, cut along the
negative real axis, to the flow plane, we obtain the basic formulas (i)-(vi) from
the introduction of this chapter.
Recall that the operator equation depends on $A_q$, i.e., the solution depends
on the parameter
\[ q = e^{-2\pi h/\lambda}. \]
We therefore can prescribe the wave length $\lambda$ and the average channel depth $h$.

Remark 71.9. Note that $h_0 = h + O(s^2)$, $s \to 0$. Hence we obtain (i) and (iii)
from the introduction of this chapter by applying a small translation in the
direction of the $y$-axis.
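To illustrate the back transformation, the following Python sketch evaluates the parameter representation of the free surface for the first-order solution only, i.e., it assumes $\theta = s\theta_1$, $\tau = s\tau_1$ with $m = 1$ and drops all higher-order terms of (43). The integrals are approximated by the trapezoidal rule; the function name and the number of nodes are illustrative choices.

```python
import math


def first_order_surface(s, q, wavelength, n=2000):
    """Free-surface points for the first-order solution with m = 1.

    Returns lists xs, ys for sigma running from 0 to pi (half a wavelength);
    ys is measured relative to y_0(0).
    """
    kappa_top = (1 + q * q) / (1 - q * q)   # tau(1, sigma) = -s * kappa_top * cos(sigma)
    kappa_bottom = 2 * q / (1 - q * q)      # tau(q, sigma) = -s * kappa_bottom * cos(sigma)

    # eta from (31): mean value of exp(-tau(q, sigma)) over one period.
    eta = sum(
        math.exp(s * kappa_bottom * math.cos(-math.pi + 2 * math.pi * j / n))
        for j in range(n)
    ) / n

    def integrands(sigma):
        weight = math.exp(s * kappa_top * math.cos(sigma))  # exp(-tau(1, sigma))
        theta = s * math.sin(sigma)
        return weight * math.cos(theta), weight * math.sin(theta)

    scale = -wavelength / (2 * math.pi * eta)
    step = math.pi / n
    xs, ys = [0.0], [0.0]
    fx_prev, fy_prev = integrands(0.0)
    for j in range(1, n + 1):
        fx, fy = integrands(j * step)
        # Cumulative trapezoidal rule for the two integrals from 0 to sigma.
        xs.append(xs[-1] + scale * 0.5 * (fx_prev + fx) * step)
        ys.append(ys[-1] + scale * 0.5 * (fy_prev + fy) * step)
        fx_prev, fy_prev = fx, fy
    return xs, ys


if __name__ == "__main__":
    xs, ys = first_order_surface(s=0.05, q=0.5, wavelength=10.0)
    print(xs[-1], ys[-1])
```

For small $s$ the end point $\sigma = \pi$ lies close to $x = -\lambda/2$, and the height difference between $\sigma = 0$ and $\sigma = \pi$ is close to $\lambda s/\pi$, i.e., twice the amplitude $\lambda s/2\pi$ of a sinusoidal wave.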

PROBLEMS

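71.1.* First mixed boundary-value problem for analytic functions on the circular ring.
We use the notations of Section 71.3 and consider the boundary-value problem
for the Cauchy-Riemann differential equations
\[ \rho\theta_\rho = \tau_\sigma, \qquad \rho\tau_\rho = -\theta_\sigma \quad \text{on } A_q, \]
\[ \tau = f \quad \text{on } S_1, \tag{44} \]
\[ \theta = g \quad \text{on } S_q, \]
where $0 < q < 1$ (see Fig. 71.4). For fixed $n = 0, 1, \ldots$ and $0 < \alpha < 1$ we assume
that the functions
\[ f\colon S_1 \to \mathbb{R} \ \text{ and } \ g\colon S_q \to \mathbb{R} \ \text{ are } C^{n,\alpha}. \]
Prove that (44) has a unique classical solution $(\theta, \tau)$. Moreover, this solution
satisfies
\[ \|\tau(q, \sigma)\|_{n,\alpha} + \|\theta(1, \sigma)\|_{n,\alpha} \le c\,(\|f\|_{n,\alpha} + \|g\|_{n,\alpha}). \tag{45} \]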
Hint: Use Fourier series to construct integral representations and apply
the classical theorem of Fatou-Privalov about singular integrals. See Zeidler
(1968, M), p. 63.
If $n = 2$, then similarly as in Section 6.3, one obtains the Schauder estimates
\[ \|\tau\| + \|\theta\| \le c_1 (\|f\|_{2,\alpha} + \|g\|_{2,\alpha}), \tag{46} \]
where $\|\cdot\|$ denotes the norm in $C^{2,\alpha}(\bar{A}_q)$.
71.2. Symmetry properties of the solutions. Prove the following result for the
equations (44): If $f$ is even and $g$ is odd with respect to $\sigma$, then the solution $\tau$ is even
and $\theta$ is odd with respect to $\sigma$ on $\bar{A}_q$.
Solution: Set $\tau_1(\rho, \sigma) = \tau(\rho, -\sigma)$ and $\theta_1(\rho, \sigma) = -\theta(\rho, -\sigma)$, and observe
that in addition to $\theta$, $\tau$, also $\theta_1$, $\tau_1$ solves (44). Uniqueness of solutions of (44)
then implies that $\tau = \tau_1$ and $\theta = \theta_1$.
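71.3. Proof of Lemma 71.6.
Solution: Let $f \in C^1(S_1)$. Integration by parts yields
\[ \pi a_k = \int_{-\pi}^{\pi} f(\sigma) \cos k\sigma\, d\sigma = -\frac{1}{k} \int_{-\pi}^{\pi} f'(\sigma) \sin k\sigma\, d\sigma, \]
which implies
\[ |a_k| \le \text{const} \cdot k^{-1}. \]
A repeated application of integration by parts then yields
\[ |a_k| + |b_k| \le \text{const} \cdot k^{-n} \]
for $f \in C^n(S_1)$ and $n \ge 2$.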


71.4. Explicit computation of the solution. Use the method of Section 71.6 to compute
the solution (i)-(vi) from the introduction of this chapter up to order 2.
Hint: See Zeidler (1968, M), p. 104.
71.5. Channels with infinite depth. Show that by making this assumption one obtains
a problem analogous to the Boundary-Value Problem 71.2 on the unit circle
and solve it.
Hint: See Zeidler (1968, M), p. 89.
71.5.* Sinusoidal waves, cnoidal, and solitary waves. During the nineteenth century
the following types of permanent water waves were discovered in channels of
finite depth $h$ by using first-order approximations. Rigorous existence proofs
appeared in this century. Let $s$ and $a$ denote small parameters.
(a) Sinusoidal waves (Airy, 1845)
\[ y_0(x) = \frac{\lambda}{2\pi}\, s \cdot \sin\frac{2\pi x}{\lambda}, \qquad c^2 = \frac{g\lambda}{2\pi} \tanh\frac{2\pi h}{\lambda}. \]
(b) Cnoidal waves (Korteweg and de Vries, 1895)

$0 < k < 1$,

$c^2 = gh$,

where "cn" denotes the Jacobian elliptic function "amplitude cosine".
(c) Solitary waves (Boussinesq, 1871, Rayleigh, 1876)
\[ y_0(x) = 3a^2 h \cdot \operatorname{sech}^2\!\left(\frac{3ax}{2h}\right) \quad \text{(see Fig. 71.6)}, \]
\[ c^2 = gh, \qquad \lambda = \infty. \]
(d) Sinusoidal waves under the influence of surface tension $\beta$ (Kelvin, 1871)
\[ y_0(x) = \frac{\lambda}{2\pi}\, s \cdot \sin\frac{2\pi x}{\lambda}, \qquad c^2 = \left(\frac{g\lambda}{2\pi} + \frac{2\pi\beta}{\lambda\rho_0}\right) \tanh\frac{2\pi h}{\lambda}. \]
(e) Cnoidal waves under the influence of surface tension $\beta$ (Korteweg and de
Vries, 1895)
\[ y_0(x) = 3a^2 h \left(1 - \frac{3\beta}{g\rho_0 h^2}\right) k^2 \cdot \operatorname{cn}^2\!\left(\frac{3ax}{2h}, k\right), \]
\[ c^2 = gh, \qquad 0 < k < 1. \]
A detailed discussion may be found in Zeidler (1968, M), (1971), where also
rigorous existence proofs are given which clarify the following relation between
sinusoidal waves, cnoidal waves, and solitary waves.
(i) Using conformal mappings we obtain a nonlinear boundary-value problem
for the circular ring
\[ A_q = \{\zeta \in \mathbb{C} : q < |\zeta| < 1\}. \]
This boundary-value problem can be reduced to an operator equation
\[ F(x, \mu, q) = 0 \tag{47} \]
for functions on the unit circle $\{\zeta \in \mathbb{C} : |\zeta| = 1\}$, where $\mu$ is an eigenvalue
parameter.
(ii) We fix $q$ and arrive, as in Section 71.5, at a bifurcation problem. The
first-order approximation of (47) corresponds to a linear equation which
has the solution (a). The main theorem of analytic bifurcation theory
(Theorem 8.A) then yields a bifurcation branch, parametrized by the small
parameter $s$. This leads to sinusoidal waves of finite depth $h$.
(iii) Passing to the limit $q \to 0$, i.e., $h \to \infty$, we obtain sinusoidal waves of
infinite depth, for which
\[ c^2 = \frac{g\lambda}{2\pi} \]
is valid.
(iv) Now comes the important point. Consider the limit $q \to 1$, i.e., $h/\lambda \to 0$. A
nontrivial asymptotic expansion of (47) for small $(1 - q)$ shows that the
first-order approximation of (47) with $\mu = (1 - q)^{-1}$ leads to a nonlinear
equation which has the solution (b) or (e). The small parameter $a$ in (b)
and (e) is related to $(1 - q)$.
The existence of cnoidal waves therefore corresponds to a singular
bifurcation phenomenon.
(v) The limit $k \to 1$ with $K(k) \to +\infty$ yields solitary waves.
In summary, we find that cnoidal waves correspond to
\[ h/\lambda \to 0, \]
where the channel depth $h$ is small compared to the wave length $\lambda$ (shallow
water theory). Letting
\[ k \to 1, \]
we arrive at solitary waves.
A direct existence proof for solitary waves may be found in Friedrichs and
Hyers (1954) and Amick and Toland (1981).

Figure 71.6    Figure 71.7
71.6.* Sinusoidal waves and solitary waves of maximal height. Around 1880, Stokes
conjectured that such waves have a sharp crest with an angle of 120° (Fig. 71.7).
This conjecture was proved by Amick, Fraenkel, and Toland (1982), who
used methods of global bifurcation theory.
71.7. Coupled anharmonic oscillators, the Korteweg-de Vries equation, and solitons.
We study here the most elementary nonlinear oscillating system, and show
how it leads to the Korteweg-de Vries equation (KdV equation). This explains
why this equation appears in many branches of physics. Originally, Korteweg
and de Vries derived their equation in 1895 in order to approximately describe
long water waves in channels with small depth (shallow water theory). They
found cnoidal waves and solitary waves. We emphasize, however, that the
KdV equation is only a nonlinear first-order approximation for the many
nonlinear models in physics.
We first consider a nonlinear spring which has the position coordinate
\[ \xi(t) = h + \eta(t) \]
at time $t$, where $\eta(0) = 0$, and $K(\eta)$ is the spring force (Fig. 71.8). A Taylor
expansion yields
\[ K(\eta) = -k(\eta + a\eta^2) + o(\eta^2), \qquad \eta \to 0, \tag{48} \]
and the equation of motion is
\[ m\ddot{\xi} = -k(\eta + a\eta^2). \tag{49} \]
If $a = 0$, then the general solution of (49) has the form
\[ \eta = C \sin\omega t, \qquad \omega = \sqrt{k/m}. \]

Figure 71.8 (spring with rest state $h$)    Figure 71.9 (chain of mass points $\xi_{j-1}$, $\xi_j$, $\xi_{j+1}$)
Next we consider an infinite number of points with identical mass $m$ and
position coordinates
\[ \xi_j(t) = jh + \eta_j(t), \qquad j = 0, \pm 1, \pm 2, \ldots \]
at time $t$, where $\eta_j(0) = 0$ for all $j$ (Fig. 71.9). Suppose these points are connected
to their closest neighbors via nonlinear springs under the force law (48). We
then obtain the equation of motion
\[ m\ddot{\xi}_j = K_{j-1} - K_{j+1}, \qquad j = 0, \pm 1, \pm 2, \ldots \tag{50} \]
with forces
\[ K_{j-1} = -k(\eta_j - \eta_{j-1}) - ka(\eta_j - \eta_{j-1})^2, \]
\[ K_{j+1} = -k(\eta_{j+1} - \eta_j) - ka(\eta_{j+1} - \eta_j)^2. \]
In 1955, Fermi, Pasta, and Ulam used numerical experiments in the study of
equation (50). To their great surprise they did not observe a trend towards
equipartition of energy among the possible degrees of freedom at a given time.
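A small numerical experiment in this spirit can be sketched in Python. The sketch is not the original Fermi-Pasta-Ulam computation: it assumes a finite chain of $n$ masses with both ends held fixed, unit mass and spring constant, a velocity-Verlet time stepper, and a sinusoidal initial velocity with $\eta_j(0) = 0$. All names and parameter values are illustrative.

```python
import math


def spring_force(stretch, k=1.0, a=0.25):
    """Force law (48) truncated after the quadratic term: K = -k (d + a d^2)."""
    return -k * (stretch + a * stretch * stretch)


def accelerations(eta, k, a, mass):
    """Accelerations of the interior masses; both end points are held fixed."""
    padded = [0.0] + list(eta) + [0.0]
    acc = []
    for j in range(1, len(padded) - 1):
        left = spring_force(padded[j] - padded[j - 1], k, a)    # K_{j-1}
        right = spring_force(padded[j + 1] - padded[j], k, a)   # K_{j+1}
        acc.append((left - right) / mass)
    return acc


def total_energy(eta, vel, k, a, mass):
    """Kinetic energy plus the spring potential k (d^2/2 + a d^3/3) per spring."""
    padded = [0.0] + list(eta) + [0.0]
    kinetic = 0.5 * mass * sum(v * v for v in vel)
    potential = 0.0
    for j in range(1, len(padded)):
        d = padded[j] - padded[j - 1]
        potential += k * (0.5 * d * d + a * d ** 3 / 3.0)
    return kinetic + potential


def simulate(n=32, steps=2000, dt=0.05, k=1.0, a=0.25, mass=1.0, speed=0.01):
    """Velocity-Verlet integration; returns (eta, vel, E_start, E_end)."""
    eta = [0.0] * n
    vel = [speed * math.sin(math.pi * j / (n + 1)) for j in range(1, n + 1)]
    acc = accelerations(eta, k, a, mass)
    e_start = total_energy(eta, vel, k, a, mass)
    for _ in range(steps):
        vel = [v + 0.5 * dt * ac for v, ac in zip(vel, acc)]
        eta = [e + dt * v for e, v in zip(eta, vel)]
        acc = accelerations(eta, k, a, mass)
        vel = [v + 0.5 * dt * ac for v, ac in zip(vel, acc)]
    return eta, vel, e_start, total_energy(eta, vel, k, a, mass)


if __name__ == "__main__":
    _, _, e0, e1 = simulate()
    print(e0, e1)
```

The total energy serves as a check on the integrator: it should stay nearly constant over the run. Studying how the energy is distributed among the normal modes would require an additional mode decomposition, which is omitted here.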
Let us now consider a continuous version of (50), where
\[ \xi(t) = \xi + \eta(\xi, t) \]
is the position of the point $\xi \in \mathbb{R}$ at time $t$ with $\xi(0) = \xi$. We set
\[ \eta(jh, t) = \eta_j(t), \]
and apply a Taylor expansion to (50). It follows that

$h \to 0$,

where $\omega^2 = k/m$. We are looking for traveling waves and make the ansatz
\[ \eta(\xi, t) = U(\xi - h\omega t, t). \]
Setting $u = U_\xi$, we find that

u, + wh(2ahuu~ + :~um) = 0.
Figure 71.10 (panels (a) and (b))

Finally, we obtain the classical Korteweg-de Vries equation (KdV equation)
\[ u_t + 6uu_\xi + u_{\xi\xi\xi} = 0, \tag{51} \]
by rescaling $t$, $\xi$, and $u$. In 1895, Korteweg and de Vries discovered that
equation (51) has a family of solutions of the form of solitary waves
\[ u = 2k^2 \operatorname{sech}^2 k(\xi - 4k^2 t - \xi_0) \]
with velocity $c = 4k^2$ and amplitude $2k^2$ (Fig. 71.6), where $k$ and $\xi_0$ are
constants. Such solitary waves are also referred to as solitons.
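That the solitary wave solves (51) can be checked numerically. The following Python sketch approximates $u_t$, $u_\xi$, and $u_{\xi\xi\xi}$ by central finite differences and evaluates the residual of the KdV equation; the step size is an illustrative choice, and the residual is only expected to vanish up to discretization error.

```python
import math


def soliton(xi, t, k=1.0, xi0=0.0):
    """Solitary wave u = 2 k^2 sech^2( k (xi - 4 k^2 t - xi0) )."""
    return 2.0 * k * k / math.cosh(k * (xi - 4.0 * k * k * t - xi0)) ** 2


def kdv_residual(xi, t, k=1.0, xi0=0.0, h=1e-3):
    """Finite-difference residual of u_t + 6 u u_xi + u_xixixi at (xi, t)."""
    def u(x, s):
        return soliton(x, s, k, xi0)

    u_t = (u(xi, t + h) - u(xi, t - h)) / (2.0 * h)
    u_x = (u(xi + h, t) - u(xi - h, t)) / (2.0 * h)
    u_xxx = (
        u(xi + 2.0 * h, t) - 2.0 * u(xi + h, t)
        + 2.0 * u(xi - h, t) - u(xi - 2.0 * h, t)
    ) / (2.0 * h ** 3)
    return u_t + 6.0 * u(xi, t) * u_x + u_xxx


if __name__ == "__main__":
    for x in (-3.0, 0.0, 1.2, 4.0):
        print(x, kdv_residual(x, 0.3))
```

The individual terms are of order one near the crest, so a residual that is several orders of magnitude smaller indicates that they cancel as the equation requires.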
Let us summarize some important modern results on the KdV equation.
(i) Stability of solitons. In Figure 71.10 two solitons interact. Kruskal and
Zabusky, in 1963, made the remarkable discovery that this interaction
does not affect the shape and velocity of the solitons. This stability
behavior of solitons was of great interest to physicists, because solitons
behave like elementary particles.
(ii) Inverse scattering theory and the solution of the initial-value problem for
the KdV equation. In 1967, Gardner, Greene, Kruskal, and Miura discovered
a new and important method in mathematical physics to explicitly
solve nonlinear evolution equations. We explain here the basic
idea.
Suppose $u$ is a solution of the nonlinear Korteweg-de Vries equation
(51). Then the scattering data of the linear Schrödinger equation
(52)

are known. Conversely, if the initial values $u_0(\xi)$ of equation (51) are given,
then the scattering data of (52) are known for all times $t$. Using inverse
scattering theory, one determines the potential $u = u(\xi, t)$ of (52) from the
known scattering data. This potential $u$ is the solution of the initial-value
problem for (51) (see Problem 30.7).
(iii) Lax pairs. The basic connection between the Korteweg-de Vries equation
(51) and the Schrödinger equation (52) was discovered by Lax (1968) in
terms of functional analysis.
Introducing the two operators


$D = \partial/\partial\xi$,

equations (51) and (52) can be written as
\[ u_t = LA - AL \tag{51*} \]
and
\[ L(t)\psi = \lambda(t)\psi, \tag{52*} \]
respectively, where $A^* = -A$ for a suitable specification of the domain
of definition $D(A) \subseteq L_2(-\infty, \infty)$. Hence, the operator
\[ U(t) = e^{B(t)} \quad \text{with} \quad B(t) = \int_0^t A(s)\, ds \]
is unitary for all $t$.
Let $u$ be a solution of (51), i.e., $u$ satisfies (51*). Then the relation
\[ \frac{d}{dt}\, U(t)L(t)U(t)^{-1} = e^{B}(AL - LA + L')e^{-B} = 0 \]
is valid because $L' = u_t$ and (51*). This leads to the key observation that
\[ U(t)L(t)U(t)^{-1} = L(0) \quad \text{for all } t. \]
Consequently, the eigenvalues in (52*) do not depend on $t$. This implies
the following two facts.
(a) The eigenvalues in (52*) depend only on the initial values $u(x, 0)$. The
inverse scattering method in (ii) above is closely related to this observation.
(b) Since the $\lambda$'s are time-independent, they are conserved quantities for
the solutions of (51). Thus the KdV equation has an infinite number
of conserved quantities.
We call $(L, A)$ a Lax pair.
(iv) The KdV equation as a completely integrable infinite-dimensional Hamiltonian
system. This was discussed in Problem 40.9. Physicists like to
reduce their problems to Hamiltonian systems in order to get a maximal
insight. In this regard the KdV equation has favorable properties.
For details, one may consult Novikov et al. (1980, M). Moreover, we
recommend Lax (1968), Lamb (1980, M), Bullough and Caudrey (1980,
P) (applications to physics), Ablowitz and Segur (1981, M), Calogero and
Degasperis (1982, M), Rajaraman (1982, M) (solitons, instantons, and
elementary particles), Faddeev and Takhtadjan (1986, M) (infinite-dimensional
Hamiltonian systems, solitons, the Riemann-Hilbert problem,
and inverse scattering theory), Knörrer (1986, S) (solitons and algebraic
geometry). A collection of important papers on solitons and
instantons can be found in Rebbi (1984).
71.8. The dead water problem, cavities, and free boundaries.
71.8a. Physical problem. We consider the situation of Figure 71.11. Suppose that a
fixed obstacle (arc) $AB$ is immersed in a planar stationary irrotational flow of
an inviscid, incompressible fluid without outer forces. Behind the obstacle
there occurs a finite or infinite cavity. If the obstacle $AB$ is regarded as a "ship,"
then, behind the obstacle, there occurs a so-called wake or dead water region.
The problem is to compute the flow outside the cavity and the boundary of
the cavity which corresponds to the unknown "free streamline $\Gamma$."
Formulate the corresponding mathematical problem.
Solution: As in Section 71.2 we use the complex flow potential
\[ W = \varphi + i\psi, \]
where the velocity field is given by
\[ v = \operatorname{grad}\varphi. \]
Then the problem is the following:
\[ \Delta\psi = 0 \quad \text{in the flow region}, \tag{53} \]
\[ \psi = \text{const} \quad \text{along } AB \text{ and } \Gamma, \tag{54} \]
\[ |\operatorname{grad}\psi| = \text{const} \quad \text{along } \Gamma. \tag{55} \]

Figure 71.11 (panels (a) and (b): obstacle and free streamline $\Gamma$)

Figure 71.12 (deformation of the obstacle)

Equation (54) tells us that the obstacle and $\Gamma$ correspond to a streamline.
Moreover, equation (55) is a consequence of Bernoulli's law
\[ \tfrac{1}{2}\rho v^2 + p = \text{const} \tag{56} \]
in the flow region and $p = \text{const}$ along $\Gamma$. Observe that $\varphi_x = \psi_y$, $\varphi_y = -\psi_x$.
If we have a solution $\psi$ of (53)-(55), then equation (56) yields the pressure in
the flow region.
71.8b.* Leray-Schauder mapping degree and existence theorems. Explicit solutions for
special obstacles were obtained by Helmholtz and Kirchhoff around 1870 (see
Jacob (1959, M)). The first general topological existence proof via mapping
degree was given in a famous paper by Leray (1935). We recommend for study
Serrin (1952). The main idea of this paper is as follows.
(i) Using a conformal mapping, the original problem is reduced to a nonlinear
integro-differential equation which is due to Villat (1911).
(ii) This equation is solved by applying the mapping degree. In order to
compute the mapping degree, which is equal to one, the given obstacle $AB$
is continuously deformed into a straight segment (Fig. 71.12).
71.8c.* Variational approach, the decisive trick of indicator functions, and existence
theorems. The method of conformal mappings can only be applied to planar
problems. The advantage of the modern variational approach to free boundary
problems is that one can also treat three-dimensional problems. Study Friedman
(1982, M), Chapter 3, in this connection, where axially symmetric jets
and cavities are considered (Fig. 71.13). The basic trick is contained in the
following result:
Let $G$ be a region in $\mathbb{R}^N$ with a sufficiently smooth boundary. Let $\partial_1 G$ be a
nonempty open subset of $\partial G$. We are given a function $\psi_0$ with
\[ \psi_0 \ge 0 \quad \text{on } G, \]
and consider the variational problem
\[ \int_G |\operatorname{grad}\psi|^2 + \chi_{\{\psi > 0\}}\, dx = \min!, \tag{57} \]
\[ \psi = \psi_0 \ \text{on } \partial_1 G, \qquad \psi \in L_1(G), \qquad \operatorname{grad}\psi \in L_2(G). \]

Figure 71.13 (jet)

Here we set
\[ \chi_{\{\psi > 0\}}(x) = \begin{cases} 0 & \text{if } \psi(x) > 0, \\ +\infty & \text{if } \psi(x) \le 0, \end{cases} \]
i.e., $\chi_{\{\psi > 0\}}$ is the indicator function of the set $\{\psi > 0\} = \{x \in G : \psi(x) > 0\}$.
Show: If $\psi$ is a local minimum of (57), then
\[ \Delta\psi = 0 \quad \text{on } \{\psi > 0\}, \]
\[ \lim_{\varepsilon \to +0} \int_{\partial\{\psi > \varepsilon\}} (|\operatorname{grad}\psi|^2 - 1)\, hn\, dO = 0 \]
for all $h \in C_0^\infty(G)^N$, where $n$ denotes the outer unit normal vector.
Hence, if $\psi$ and the free boundary $\partial\{\psi > 0\}$ are sufficiently smooth, then
we obtain the key condition
\[ |\operatorname{grad}\psi| = 1 \quad \text{on } \partial\{\psi > 0\}. \tag{58} \]
The important point is that the appearance of the indicator function in the
variational problem (57) guarantees the free boundary condition (58) at least
in a generalized sense.
71.9. Flow around a body.
71.9a. Planar flow and the Kutta-Jukovski formula. We consider a simply connected
region $G$ in $\mathbb{R}^2$ with the boundary curve $\partial G$: $z = z(t)$ oriented as in Figure
71.14. We regard $G$ as a rigid body (e.g., a ship). Moreover, we consider a
planar flow in the exterior of the body $\mathbb{R}^2 - G$ which satisfies the following
properties.
(i) The flow is irrotational, incompressible, and inviscid with density $\rho_0$ and
velocity vector $v = ae_1 + be_2$.
(ii) The flow approaches the velocity $-c_0 e_1$ at infinity.
(iii) The circulation of the flow around the body is $\Gamma$, i.e.,
\[ \int_{\partial G} a\, dx + b\, dy = \Gamma. \tag{59} \]
(iv) There are no outer forces.
Let $F$ be the force acting on the body. Prove:
\[ F = \rho_0 c_0 \Gamma e_2. \tag{60} \]

Figure 71.14

This formula contains the so-called d' Alembert paradox: the body experi-
ences no drag, i.e., no force in the direction of - e1 • Think, for example, of a
ship G, which moves in a fluid resting at infinity. Let c0 e1 be the velocity of
the ship. We expect that there is a friction force acting on the ship in the
direction of -e 1 • This force is caused by the viscosity. If we change the
system of reference, then we obtain the situation of Figure 71.14. Hence, the
d'Alembert paradox appears because we assume that the fluid is inviscid.
Solution: We identify vectors 1Xe 1 + {Je 2 with points IX+ i{J in the complex
plane. Let z = x + iy. From Theorem 7l.A we have the following problem.
(a) We are looking for a complex velocity function V = a - ib which is
holomorphic in R2 - Gand V(oo) = -c0 •
(b) We have the boundary condition

Im V(z(t))z'(t) = 0 along iJG. (61)


This condition means that the velocity vector is tangential along the surface
ofthe rigid body.
If we know V, then the pressure p can be computed from
!PoiVI 2 + p = B0 in R2 - G, (62)
where 8 0 is a constant. Since V(oo) = -c0 , the function V has the Laurent
expansion
Ct Cz
V(z) = - c0 + - + - 2 + · · ·
z z
at infinity. Condition (61) implies

r = LG Vdz = f Vdz = 2nicl (63)

and hence
Ct = r/27ti.

The force exerted on· the body is equal to

F =- f pnds,
JaG
where n is the outer unit normal vector to iJG. This yields

F =i f
JaG
pz'(t)dt =i
JaG
f (Bo- Po2 (a 2 + b2))dz.
Because of JaGdz = 0 and since (61) implies bdx = ady, we obtain
F = ipo f V2dz.
2 JaG
Since
0 1
V2 = ---+c 2c c
0 +0-,
2 ( 1)
z 2 z
476 71. Bifurcation and Permanent Gravitational Waves

we have

and this is (60).
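Formula (60) can be checked numerically for a concrete body. The following Python sketch assumes a circular cylinder of radius $R$ centered at the origin, for which $V(z) = -c_0(1 - R^2/z^2) + \Gamma/(2\pi i z)$ satisfies (61) on $|z| = R$, has circulation $\Gamma$, and tends to $-c_0$ at infinity. It evaluates $F = i\int_{\partial G} p\, dz$ with $p$ from (62) by the trapezoidal rule on the counterclockwise oriented circle. The function name and default values are illustrative.

```python
import cmath
import math


def force_on_cylinder(c0, gamma, rho0=1.0, radius=1.0, b0=0.0, n=2000):
    """Approximate F = i * integral of p dz over the circle |z| = radius.

    Returns the force as a complex number F_1 + i F_2.
    """
    total = 0j
    dt = 2.0 * math.pi / n
    for j in range(n):
        z = radius * cmath.exp(1j * j * dt)
        dz = 1j * z * dt                              # z'(t) dt for z = R e^{it}
        v = -c0 * (1.0 - radius ** 2 / z ** 2) + gamma / (2j * math.pi * z)
        p = b0 - 0.5 * rho0 * abs(v) ** 2             # pressure from (62)
        total += p * dz
    return 1j * total


if __name__ == "__main__":
    force = force_on_cylinder(c0=2.0, gamma=3.0, rho0=1.5)
    print(force)  # compare with 1j * rho0 * c0 * gamma
```

The real part of the result (the drag) should vanish and the imaginary part (the lift) should agree with $\rho_0 c_0 \Gamma$, independently of the radius and of the constant $B_0$.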


71.9b. Boundary-value problem. Reduce the problem (i)-(iv) above to a boundary-value
problem for the Laplace equation.
Solution: Let $0 \in G$ and set
\[ V = -c_0 + \frac{\Gamma}{2\pi i z} + V_1. \]
Equation (63) implies that $\int_{\partial G} V_1\, dz = 0$. Hence there exists a function $W_1 =
\varphi_1 + i\psi_1$ with
\[ W_1' = V_1 \quad \text{in } \mathbb{R}^2 - G, \]
where $W_1$ is holomorphic in $\mathbb{R}^2 - G$. From (61) we obtain
\[ \operatorname{Im}\left( W_1'(z(t))z'(t) + \frac{\Gamma}{2\pi i}\, \frac{z'(t)}{z(t)} - c_0 z'(t) \right) = 0. \]
This yields
\[ \Delta\psi_1 = 0 \quad \text{in } \mathbb{R}^2 - G, \]
\[ \psi_1 - \frac{\Gamma}{2\pi} \ln|z(t)| - c_0 y(t) = \text{const} \quad \text{on } \partial G. \]
Moreover, we have $W_1' = O(1/z^2)$ at $z = \infty$.

71.9c. Planar subsonic flow around a body and quasi-conformal mappings. Study Bers
(1954).
71.9d. Three-dimensional subsonic flow around a body and Leray-Schauder theory.
Study Finn and Gilbarg (1957).
71.9e. Flow around a gas bubble. Study this free boundary-value problem in Zeidler
(1968, M).
71.9f. Optimal design of aircraft. Study Morawetz (1982, S).
71.10. Equilibrium forms of rotating fluids. In a Cartesian $(\xi', \eta', \zeta)$-inertial system we
consider a homogeneous incompressible fluid of constant temperature which
rotates counterclockwise around the $\zeta$-axis with constant angular velocity $\omega$.
Suppose the fluid is in a hydrostatic equilibrium with respect to a Cartesian
$(\xi, \eta, \zeta)$-system $\Sigma$ which rotates together with the fluid. Moreover, assume that
the fluid occupies the simply connected region $\Omega$ in $\Sigma$ (Fig. 71.15). Let $p_0$ be
the constant outer pressure on $\partial\Omega$.
Determine the equation for the unknown boundary $\partial\Omega$.
Solution: The density of the outer forces in $\Sigma$ is
\[ K = -\rho\,(\boldsymbol{\omega} \times (\boldsymbol{\omega} \times x)) + \int_\Omega \frac{\rho G (y - x)\, dy}{|y - x|^3} \]
which corresponds to the centrifugal force (Section 58.6) and the gravitational
force. Here $\rho$ is the constant density and $\boldsymbol{\omega} = \omega e_3$. Observe that the volume
elements $\Delta V$ of the fluid are resting in $\Sigma$, and $K \Delta V$ is the corresponding force
onto $\Delta V$. Moreover, we have $K = -\rho \operatorname{grad} U$ with the potential
\[ U(x) = -\frac{\omega^2}{2}(\xi^2 + \eta^2) - \int_\Omega \frac{G\, dy}{|y - x|}. \]
From (70.37) we obtain $\rho U + p = \text{const}$ on $\bar{\Omega}$. Hence we arrive at the basic
equation
\[ \rho U(x) + p_0 = \text{const} \quad \text{on } \partial\Omega. \tag{64} \]
This is the desired equation for the unknown boundary $\partial\Omega$.

Figure 71.15

Historical Remarks

The classical problem of equilibrium forms of rotating fluids (see Problem
71.10) attracted the interest of physicists and mathematicians, since it models
rotating stars. At the beginning of the twentieth century, this problem played
an important role in the development of bifurcation theory and the theory of
nonlinear integral equations. The first explicit solutions for equation (64) were
discovered by Maclaurin (1698-1746) in the form of ellipsoids. In 1834, Jacobi
(1804-1851) found that, for critical angular velocities, there bifurcate families
of new ellipsoids from the ellipsoids of Maclaurin. The following general
problem arose.
(P) Suppose we are given a one-parameter family of equilibrium forms. Find
the critical values of angular velocity for which new equilibrium forms
bifurcate from the known solutions.
In 1885, Poincare (1854-1912) studied this problem in detail. The first
rigorous existence proof, however, was given by Ljapunov (1906). He thereby
created the bifurcation theory for nonlinear integral equations. Many results
can be found in the classical monograph by Lichtenstein (1933), who himself
made important contributions to this problem.
A variational approach to rotating fluids may be found in Friedman
(1982, M).

References to the Literature

Classical works: Nekrasov (1921), Levi-Civita (1925), Lavrentjev (1946), and Fried-
richs and Hyers (1954) (solitary waves).
Monograph about permanent waves: Zeidler (1968, B, H).
Permanent waves: Zeidler (1977, S), (1971), (1972a, b), (1973), Beyer and Zeidler
(1979), Beale (1979), Turner (1981), Amick and Toland (1981), Amick, Fraenkel, and
Toland (1982) (proof of the Stokes conjecture).
The blowing-up lemma and applications of bifurcation theory to capillary-gravity
waves: Zeidler (1968, M), Jones and Toland (1986).
Homoclinic bifurcation of dynamical systems and permanent waves: Kirchgässner
(1988).
Survey about wave problems in physics: Whitham (1974, M, B) (standard work),
Stoker (1957, M) (water waves), Lighthill (1978, M) (1986), LeBlond (1978, M) (waves
in the ocean), Friedlander (1981) (geophysics), Debnath (1985, P) (nonlinear waves),
Brekhovskikh (1985, M), Ghil (1987, M) (atmospheric dynamics, dynamo theory, and
climate dynamics), Washington (1987, M) (climate modelling).
Dead water problems and cavities: Leray (1935) (classical work), Serrin (1952),
Birkhoff and Zarantonello (1957, M), Hilbig (1964), (1982), Socolescu (1977).
Variational approach to jets and cavities: Friedman (1987, M) (recommended as an
introduction), Alt and Caffarelli (1981).
Equilibrium forms of rotating fluids: Lichtenstein (1933, M) (classical monograph),
Chandrasekhar (1969, M), Lebovitz (1977, S), Friedman (1982, M).
Rotating stars: Tassoul (1978, M).
Solitons: Novikov (1980, M) and Ablowitz and Segur (1981, M, H) (recommended
as an introduction), Lax (1968), Bullough and Caudrey (1980, P) (applications in
physics), Lamb (1980, M), Calogero and Degasperis (1982, M), Drazin (1983, M),
Rajaraman (1983, M), (solitons, instantons, and elementary particles), Davydov (1984,
M) (solitons in molecular systems), Faddeev and Takhtadjan (1986, M)
(infinite-dimensional Hamiltonian systems, solitons, the Riemann-Hilbert problem,
and inverse scattering theory), Knörrer (1986, S) (solitons and algebraic geometry).
Collection of classical articles on solitons: Rebbi (1984).
The initial-value problem for the Korteweg-de Vries equation: Bona and Smith
(1975), Kato (1983).
The initial-value problem for water waves: Shinbrot (1976).
CHAPTER 72

Viscous Fluids and the Navier-Stokes Equations

For nonlinear equations, such as the Navier-Stokes equations, it is known that
a regular solution for the nonstationary problem need not exist for all times
$t \ge 0$. At some finite time the solution may go to infinity or lose its regularity.
Even if the solution exists for all $t \ge 0$, it need not converge towards the solution
of the stationary problem as $t \to +\infty$, when the boundary conditions and the
forces converge towards a stationary situation.
Olga Aleksandrovna Ladyzenskaja (1970)
For many years now, the Navier-Stokes equations have attracted the attention
of engineers and mathematicians. The reason lies in the great number of interest-
ing and difficult problems which are connected with them and which lead to
important applications. Many of these problems are still unsolved. Beginning
with the work of Jean Leray, during the 1930s, a number of deeper results were
obtained for individual solutions of these equations. But from a physical point
of view, the study of only individual solutions is not always justified. For large
Reynolds numbers, i.e., roughly speaking, for large velocities or small viscosities,
the flow becomes turbulent. Therefore, it is reasonable to look for a statistical
description, analogously to the kinetic gas theory.
Mark Iosifovic Visik and Andrei Vladimirovic Fursikov (1980)
The distinguishing feature of a turbulent flow is that its velocity field appears to
be random and varies unpredictably. The flow does, however, satisfy the Navier-
Stokes differential equations, which are not random. This contrast is the source
of much of what is interesting in turbulence theory....
If the Reynolds number Re is small, then the Navier-Stokes equations have
a unique solution which can be observed in nature. When Re is large, even when
a solution can be obtained, it is not observed in nature. The reason lies in the
fact that the solutions are unstable; very small perturbations, too small to be
measured by the experimenter, can be amplified and induce large changes in the
flow. Note that this fact makes uniqueness theorems rather meaningless for a
person trying to describe physical reality, since the possible uniqueness rests on
an assumed uniqueness of the data, which cannot be assured in any meaningful
sense....

The main problem of turbulence theory is to isolate statistical properties of
solutions of the Navier-Stokes equations which are independent of the precise
statistical properties of random data, if this is possible, and then use the knowl-
edge thus acquired to construct reasonable predictive procedures for specific
problems. One should keep in mind that a practical person is usually interested
only in mean properties of a small number of functionals of the flow (e.g., lift
and drag in the case of a flow past a wing), and these could conceivably be
obtained even when the details of the flow are unknown ....
Alexandre Joel Chorin (1975)
The recent improvement of our understanding of the nature of turbulence has
three different roots. The first is the injection of new mathematical ideas from
the theory of dynamical systems (strange attractors). The second is the avail-
ability of powerful computers which permit, among other things, experimental
mathematics on dynamical systems and numerical simulation of hydrodynamic
equations. The third is the improvement of experimental techniques, in partic-
ular, Doppler measurements of velocities by use of a laser beam, and then
numerical Fourier analysis of the time series obtained....
Extensive computer studies of low-dimensional dynamical systems have
shown that sensitive dependence on initial data is quite common, but mostly
appears in systems for which we have no good mathematical theory.
David Ruelle (1983)

72.1. Basic Ideas


In this chapter we apply, step-by-step, the following functional-analytical
results:
(i) Leray-Schauder principle (Theorem 6.A).
(ii) Main theorem about pseudomonotone operators (Theorem 27.A).
(iii) Main theorem about first-order evolution equations (Theorem 23.A).
(iv) Implicit function theorem (Theorem 4.B).
(v) Main theorem of analytic bifurcation theory (Theorem 8.A).
According to Section 70.3, the Navier-Stokes equations for an incompressible
fluid are

$$\begin{aligned}
\rho v_t + \rho (v\,\mathrm{grad})v - \eta\,\Delta v &= K - \mathrm{grad}\,p && \text{on } G,\\
\mathrm{div}\,v &= 0 && \text{on } G,\\
v &= 0 && \text{on } \partial G,
\end{aligned} \tag{1}$$

where:
$v(x, t)$  velocity vector at point $x$ and time $t$;
$p(x, t)$  pressure;
$\rho$  constant density of the fluid;
$\int_G K\,dx$  total outer force acting on the fluid;
$\eta$  viscosity (a positive constant).
We choose a physical system of units such that $\rho = 1$. The fluid is located in
the fixed spatial region $G$. More precisely, we assume:

$G$ is a bounded region in $\mathbb{R}^3$ with sufficiently regular boundary, i.e., $\partial G \in C^{0,1}$. (2)

Let $n$ be the outer unit normal vector to $\partial G$. The boundary condition "$v = 0$
on $\partial G$" means that the viscous fluid sticks to the boundary.
We consider equation (1) for time $t$ in a suitable interval and add the initial
condition

$$v(x, 0) = v_0(x) \quad \text{on } G.$$

We are looking for velocity $v$ and pressure $p$.
In the stationary case, $v$ and $p$ do not depend on $t$. We then have $v_t = 0$ in (1).
As we will show below, the solution of equation (1) can be greatly simplified
by eliminating the incompressibility condition div v = 0 through an orthog-
onal projection in an H-space. In this way pressure p can be eliminated as
well, i.e., we obtain an operator equation for the velocity v.

72.1a. Stationary Case

The functional-analytical treatment of equation (1) leads to the operator
equation

$$\eta v - Qv = f, \qquad v \in V \tag{3}$$

in the H-space $V$, where $f$ depends only on the outer force $K$. The operator $Q$
is strongly continuous and hence compact. Moreover, $Q$ is homogeneous of
degree two. The following relation

$$(Qv \mid v) = 0 \quad \text{for all } v \in V \tag{4}$$

is essential and follows from the incompressibility condition $\mathrm{div}\,v = 0$. It
implies the central coerciveness relation

$$\lim_{\|v\| \to \infty} \frac{(\eta v - Qv \mid v)}{\|v\|} = \lim_{\|v\| \to \infty} \eta\|v\| = +\infty. \tag{5}$$

According to Section 27.2 the operator $\eta I - Q$ is pseudomonotone because
it is a strongly continuous perturbation of the operator $\eta I$. Thus the main
theorem about pseudomonotone operators (Theorem 27.A) guarantees the
existence of a solution $v$ for every $f \in V$. If this solution $v$ is unique, then the
Galerkin sequence converges towards $v$. If it is not unique, then, at least, a
subsequence of the Galerkin sequence converges towards a solution of the
original problem.
Physically, this means that, for all outer forces $K$, the stationary problem
has a solution. We shall prove uniqueness for the case where $\|f\|/\eta^2$ is small,
i.e., for small

$$\frac{\rho^2}{\eta^4} \int_G K^2\,dx,$$

that is, the velocity $v$ is uniquely determined, and, up to a constant, the pressure
$p$ is also uniquely determined. Clearly, more cannot be expected for $p$, since
with $p$ also $p + \text{const}$ is a solution of (1).
An alternative existence proof for (3) can be obtained by using the Leray-
Schauder principle (Theorem 6.A). Let us write (3) in the form

$$v = \frac{t}{\eta}(f + Qv), \qquad v \in V, \quad 0 \le t \le 1. \tag{6}$$

If $v$ is a solution of (6), then we obtain from (4) that

$$(v \mid v) \le |(f \mid v)|/\eta \le \|f\|\,\|v\|/\eta,$$

which yields the crucial a priori estimate

$$\|v\| \le \|f\|/\eta.$$

Hence, Theorem 6.A implies that equation (3) has a solution for every $f \in V$.
We thereby use the fact that $Q$ is compact.
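The step from (4) and (6) to the a priori estimate can be spelled out as follows. This is only a sketch of the computation indicated in the text; it uses nothing beyond (4), (6), and the Schwarz inequality in $V$.

```latex
% Take the scalar product of (6) with v and use (Qv|v) = 0 from (4):
(v \mid v) = \frac{t}{\eta}\bigl((f \mid v) + (Qv \mid v)\bigr)
           = \frac{t}{\eta}(f \mid v), \qquad 0 \le t \le 1.
% Since 0 <= t <= 1, the Schwarz inequality gives
\|v\|^2 \le \frac{1}{\eta}\,|(f \mid v)| \le \frac{1}{\eta}\,\|f\|\,\|v\|.
% Dividing by \|v\| (the case v = 0 is trivial) yields a bound independent of t:
\|v\| \le \frac{\|f\|}{\eta}.
```

The bound does not depend on the homotopy parameter $t$, which is the kind of uniform a priori estimate that the Leray-Schauder principle calls for.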
In the usual formulation of the generalized problem to equation (1), it
follows from $\mathrm{div}\,v = 0$ that, after an integration by parts, pressure $p$ drops out,
so that $p$ does not appear in the operator equation (3). On the other hand,
pressure $p$ is obtained from (3) by applying an important orthogonal decomposition
of the H-space $L_2(G, V_3)$, see below. This decomposition corresponds
to the classical result of Stokes (1849), that every smooth vector field on $G$ can
be written as

$$v = \mathrm{curl}\,A + \mathrm{grad}\,U \quad \text{on } G, \qquad n\,\mathrm{curl}\,A = 0 \quad \text{on } \partial G. \tag{7}$$

In classical physics, the vector field $A$ is called the vector potential and the
function $U$ is called the potential. Decomposition (7) greatly simplifies the
integration of the Maxwell equations in electrodynamics. If we set $v_1 = \mathrm{curl}\,A$
and $v_2 = \mathrm{grad}\,U$, then we obtain

$$v = v_1 + v_2 \quad \text{on } G, \qquad \mathrm{div}\,v_1 = 0, \quad \mathrm{curl}\,v_2 = 0, \tag{8}$$

i.e., the vector field $v$ can be decomposed into a source-free and an irrotational
vector field. If

$$n v_1 = 0 \quad \text{on } \partial G,$$

then the decomposition (8) is unique.
72.1b. Nonstationary Problem

The simple idea for our existence proof is to use the general trick for regular
semilinear equations, which was introduced in Section 8.12. For this we write
the original equation (1) in the form

$$v'(t) + Av(t) + Bv(t) = f(t) \quad \text{for } t \ge 0, \qquad v(0) = v_0, \tag{9}$$

with the corresponding linearized problem

$$v'(t) + Av(t) = f(t) \quad \text{for } t \ge 0, \qquad v(0) = v_0. \tag{10}$$

Let the unique solution of (10) be denoted by $v = S(f, v_0)$. Formula (9) then
becomes

$$v = S(f - Bv, v_0). \tag{11}$$

Equation (11) can be solved for small $\|f\|$ and $\|v_0\|$ by using the implicit
function theorem or the Banach fixed-point theorem.
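A minimal sketch of the Banach fixed-point argument for (11), under assumptions that are not spelled out at this point in the text: $S$ is linear, $Bv = B(v, v)$ with a bounded bilinear map $B$, and $s$, $b$, $d$ are hypothetical constants introduced only for illustration. All norms are those of the spaces on which $S$ and $B$ act.

```latex
% Assumptions (hypothetical constants s, b, d):
%   \|S(g,0)\| \le s\|g\|, \quad \|B(u,v)\| \le b\|u\|\|v\|, \quad d = \|S(f,v_0)\|.
% Fixed-point map for (11), using linearity of S:
T(v) = S(f - B(v,v), v_0) = S(f, v_0) - S(B(v,v), 0).
% T maps the ball \|v\| \le r into itself for r = 2d, provided 4sbd \le 1:
\|T(v)\| \le d + s\,b\,r^2 = d + 4sbd^2 \le 2d = r.
% T is contractive on this ball, since B(u,u) - B(v,v) = B(u-v,u) + B(v,u-v):
\|T(u) - T(v)\| \le s\,b\,(\|u\| + \|v\|)\,\|u - v\| \le 4sbd\,\|u - v\|.
% Hence for 4sbd < 1 there is exactly one fixed point in the ball.
```

Since $d$ is small when $\|f\|$ and $\|v_0\|$ are small, the condition $4sbd < 1$ is a smallness condition on the data.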
Physically, this means that for small outer forces $|K|$ and small initial
velocities $|v_0|$ there exist unique solutions for all times. We do not expect to
have turbulence in this case. The monographs, cited in the References to the
Literature to this chapter, contain a number of stronger existence results. Here,
however, we restrict ourselves to situations which require only a minimal
amount of mathematical tools. For example, for arbitrary forces K and initial
velocities v0 , one can prove existence of weak generalized solutions of (1) by
assuming less regularity, but without obtaining uniqueness. On the other
hand, one can prove uniqueness by assuming more regularity, but with-
out obtaining existence. This strange situation might be a consequence of
turbulence phenomena.

72.1c. Approximation Methods


To engineers it is important to have effective numerical methods, which may
be found in Temam (1977, M), Girault and Raviart (1981, L), (1986, M),
Thomasset (1981, M), Glowinski (1984, L), and Murman (1985, P) (super-
computers). They consist, for example, of Galerkin methods with finite
elements as basis functions and difference methods. A typical difficulty which
arises is that finite elements are needed which satisfy
$$\mathrm{div}\,v = 0 \quad \text{on } G.$$
One trick to avoid this problem is to consider $\mathrm{div}\,v = 0$ as a side condition
and to apply optimization methods for problems with side conditions. In this
direction, algorithms of Uzawa and Arrow-Hurwicz are very practicable (see
Problems 50.2 and 50.3 and Temam (1977, M), Girault and Raviart (1981, L)).
Another trick to avoid div v = 0, is to assume that the density has a weak
compressibility, i.e., a pressure dependence of the form p = p0 + ep. This gives
Pr + edivv = 0
and one can study the limiting process e → 0. This is contained in Temam
(1977, M), where one can also find the method of fractional steps. The idea
thereby is to discretize the time derivative and to use, at each time step, only
a certain portion of the spatial part of equation (1).

72.1d. Bifurcation

As two important examples for bifurcation phenomena we consider:
(a) the Taylor problem (viscous fluid between two rotating cylinders); and
(b) the Benard problem (heated viscous fluid between two horizontal plates).
To both problems we apply the main theorem of analytic bifurcation theory
(Theorem 8.A). We obtain existence and uniqueness results for the bifurcation
branch and also derive effective methods for its construction. The important
point thereby is that the so-called bifurcation condition be satisfied. In order
to guarantee simple zero solutions of the linearized problem, the following
trick is used. We consider the corresponding operator equations in spaces of
functions with suitable symmetry properties.

72.1e. Hydrodynamic Similarity

In order to write the basic equation (1) in dimensionless form, we set

$$v = V\bar v, \quad x = R\bar x, \quad t = T\bar t, \quad p = P\bar p, \quad K = k\bar K, \tag{12}$$

with

$$P = \rho V^2, \qquad k = \rho V^2/R,$$

where the overbarred quantities $\bar v$, $\bar x$, etc. are dimensionless. The quantities $V$,
$R$, etc., on the other hand, have respective dimensions and numerical values,
which depend on the specific problem. For example, $V$ may be chosen as the
average velocity of the fluid, and $R$ as the diameter of $G$. Moreover, we set

$$G = R \cdot \Omega,$$

i.e., the flow region $G$ is obtained from the region $\Omega$ by multiplying all
coordinates with $R$. If we set $u = \bar v$, then the original Navier-Stokes equations
(1) assume the following dimensionless form

$$\begin{aligned}
\alpha u_t + (u\,\mathrm{grad})u - \frac{1}{\mathrm{Re}}\,\Delta u &= \bar K - \mathrm{grad}\,\bar p && \text{on } \Omega,\\
\mathrm{div}\,u &= 0 && \text{on } \Omega,\\
u &= 0 && \text{on } \partial\Omega,
\end{aligned} \tag{13}$$

where

$$\mathrm{Re} = \frac{\rho V R}{\eta}$$

is the crucial Reynolds number and $\alpha = R/VT$. The space derivatives in (13)
are taken with respect to $\bar x$. Thus we find the following similarity principle:
If one knows one solution of the dimensionless equation (13), then one obtains
a family of solutions of the original Navier-Stokes equations (1) from (12) which
are similar in the sense of these transformation formulas.
This shows that only the dimensionless numbers Re and $\alpha$ are of significance
to (1) and not the specific values of density $\rho$, viscosity $\eta$, etc.
This principle is being used by engineers in order to simulate the flight of
aircraft in wind tunnels.
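The passage from (1) to the dimensionless form (13) can be checked by direct substitution. The following sketch assumes the scalings $P = \rho V^2$ and $k = \rho V^2/R$, which are the choices that make the pressure and force terms appear without extra factors; bars denote the dimensionless quantities of (12).

```latex
% Substitute (12) into the momentum equation of (1); all derivatives below
% are taken with respect to the dimensionless variables \bar x, \bar t:
\frac{\rho V}{T}\,\bar v_{\bar t}
  + \frac{\rho V^2}{R}\,(\bar v\,\mathrm{grad})\bar v
  - \frac{\eta V}{R^2}\,\Delta\bar v
  = k\,\bar K - \frac{P}{R}\,\mathrm{grad}\,\bar p .
% Divide by \rho V^2 / R:
\frac{R}{VT}\,\bar v_{\bar t}
  + (\bar v\,\mathrm{grad})\bar v
  - \frac{\eta}{\rho V R}\,\Delta\bar v
  = \frac{kR}{\rho V^2}\,\bar K - \frac{P}{\rho V^2}\,\mathrm{grad}\,\bar p .
% With P = \rho V^2 and k = \rho V^2 / R both factors on the right equal 1, and
\alpha = \frac{R}{VT}, \qquad \frac{1}{\mathrm{Re}} = \frac{\eta}{\rho V R},
% which is the first equation of (13) for u = \bar v.
```

The conditions $\mathrm{div}\,v = 0$ and $v = 0$ are homogeneous, so they carry over unchanged to $u$ on $\Omega$ and $\partial\Omega$.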

72.2. Notations
We will use the same notations as in Section 62.2 and set

$$X = \mathring{W}_2^1(G, V_3), \qquad Y = L_2(G, V_3),$$

equipped with the scalar products

$$(v \mid w)_Y = \int_G vw\,dx,$$

$$(v \mid w)_X = \int_G \bigl(vw + [v', w'^*]\bigr)\,dx.$$

In Cartesian coordinates this gives

$$[v'(x), w'(x)^*] = D_j v_i(x)\,D_j w_i(x).$$

On the real H-space $X$ we introduce the equivalent energetic scalar product

$$(v \mid w) = \int_G [v'(x), w'(x)^*]\,dx.$$

In order to give a functional-analytical formulation of the side condition
$\mathrm{div}\,v = 0$ on $G$ and $v = 0$ on $\partial G$, we let

$$C_0^\infty(G, \mathrm{div})$$

be the set of all vector fields $v\colon G \to V_3$, whose components belong to $C_0^\infty(G)$,
and which satisfy $\mathrm{div}\,v = 0$ on $G$. Moreover, we define

$H$ = closure of $C_0^\infty(G, \mathrm{div})$ in $Y$,
$V$ = closure of $C_0^\infty(G, \mathrm{div})$ in $X$.

We denote the norm on $V$ by

$$\|v\| = (v \mid v)^{1/2}.$$

As for any H-space, there exists the orthogonal decomposition

$$Y = H \oplus H^\perp, \tag{14}$$

where $H$ is a closed subspace of the H-space $Y$.
Lemma 72.1. Under assumption (2) we have that

$$H^\perp = \{v \in Y\colon v = \mathrm{grad}\,p \text{ for some } p \in W_2^1(G)\},$$

and, up to a constant, the function $p$ is uniquely determined by $v \in H^\perp$.

The proof will be given in Problem 72.1. The intuitive meaning was already
discussed in connection with the classical formula (7).
Throughout the following we assume that components of vectors corre-
spond to a fixed Cartesian coordinate system.

72.3. Generalized Stationary Problem


We multiply the stationary Navier-Stokes equations (1) with $w \in C_0^\infty(G, \mathrm{div})$
and use integration by parts to obtain

$$\eta(v \mid w) - b(v, v, w) = c(w) \quad \text{for all } w \in C_0^\infty(G, \mathrm{div}), \tag{15}$$

where

$$b(u, v, w) = \int_G u\,w'v\,dx,$$

$$c(w) = \int_G Kw\,dx,$$

or in component notation

$$b(u, v, w) = \int_G u_i v_j D_j w_i\,dx. \tag{16}$$

Notice that because of $\mathrm{div}\,v = \mathrm{div}\,w = 0$, the integration by parts in the
derivation of equation (15) yields

$$\int_G w\,\mathrm{grad}\,p\,dx = -\int_G p\,\mathrm{div}\,w\,dx = 0$$

and

$$\int_G w(v\,\mathrm{grad})v\,dx = \int_G w_i(v_j D_j)v_i\,dx = -\int_G v_i v_j D_j w_i\,dx = -b(v, v, w).$$

Definition 72.2. Suppose the region $G$ satisfies assumption (2). The generalized
problem to the stationary Navier-Stokes equations (1) is this.
Given the density of the outer forces $K \in L_2(G, V_3)$, we are looking for the
velocity $v \in V$ of the fluid so that equation (15) is satisfied.

72.3a. Computation of the Pressure

The derivation of (15) shows that every classical solution of (1) is a generalized
solution. Conversely, if $v$ is a sufficiently smooth generalized solution of (1),
then we set

$$g = \eta\,\Delta v - (v\,\mathrm{grad})v + K.$$

The components of $g$ are denoted by $g_i$. From (15) and integration by parts
it follows that

$$(g \mid w)_Y = 0 \quad \text{for all } w \in C_0^\infty(G, \mathrm{div}).$$

This implies $g \in H^\perp$. Lemma 72.1 then shows that

$$g = \mathrm{grad}\,p, \qquad p \in W_2^1(G),$$

and the pressure $p$ is uniquely determined up to an additive constant. Therefore
$v$, $p$ is a solution of the original stationary equation (1).

72.3b. The Pressure in the Weak Sense

This argument can be refined in the case that the generalized solution $v$ is not
sufficiently smooth, and instead only $v \in V$, i.e.,

$$v_i \in \mathring{W}_2^1(G), \qquad i = 1, 2, 3,$$

is valid, such as in Definition 72.2. We thereby use the calculus of distributions,
which is summarized in A2 (62), and consider the components $v_i$ of $v$ as
distributions, i.e.,

$$v_i \in \mathcal{D}'(G), \qquad i = 1, 2, 3.$$

Since distributions are arbitrarily often differentiable, it follows that

$$\Delta v_i \in \mathcal{D}'(G), \qquad i = 1, 2, 3,$$

$$v_j D_j v_i \in \mathcal{D}'(G), \qquad i = 1, 2, 3.$$

From $K_i \in L_2(G)$, we find $K_i \in \mathcal{D}'(G)$, and consequently,

$$g_i \in \mathcal{D}'(G), \qquad i = 1, 2, 3.$$

Explicitly, we have

$$g_i(\varphi) = \int_G \bigl(\eta v_i\,\Delta\varphi - (v_j D_j v_i)\varphi + K_i\varphi\bigr)\,dx$$

for all $\varphi \in C_0^\infty(G)$. We define

$$g(w) = \sum_{i=1}^{3} g_i(w_i),$$

and integration by parts yields

$$g(w) = -\eta(v \mid w) + b(v, v, w) + c(w)$$

for all $w$ with $w_i \in C_0^\infty(G)$. From the generalized problem (15) it then follows that

$$g(w) = 0 \quad \text{for all } w \in C_0^\infty(G, \mathrm{div}).$$

Now to the point. The proof of Lemma 72.4 below implies

$$|g_i(\varphi)| \le \text{const}\,\|\varphi\|_{\mathring{W}_2^1(G)} \quad \text{for all } \varphi \in C_0^\infty(G).$$

This means that $g_i \in W_2^{-1}(G)$. Therefore, Problem 72.2 shows that there exists
a function $p \in L_2(G)$ with

$$g = \mathrm{grad}\,p \quad \text{on } G$$

in the sense of distributions. Up to a constant, $p$ is uniquely determined.
Hence $v$, $p$ is a solution of the original stationary equation (1) in the sense
of distributions, and Definition 72.2 therefore makes sense.

72.3c. Operator Properties


Lemma 72.3. The functional $c$ belongs to $V^*$.

PROOF. We have

$$|c(w)| = |(K \mid w)_Y| \le \|K\|_Y\,\|w\|_Y \le \text{const}\,\|K\|_Y\,\|w\|. \qquad \Box$$
In order to study $b(u, v, w)$, we make essential use of the following Sobolev
embedding theorem:

The embedding $W_2^1(G) \subseteq L_4(G)$ is compact, (17)

see A2 (45). We choose a fixed Cartesian coordinate system and denote the
norm on $L_4(G, V_3)$ by

$$|v| = \sum_{i=1}^{3} \|v_i\|_4.$$

From (17) it follows that

the embedding $V \subseteq L_4(G)$ is compact. (17*)

In particular, this implies

$$|v| \le \text{const}\,\|v\| \quad \text{for all } v \in V.$$

Recall that $\|v\|$ denotes the norm on $V$ and $\|f\|_r$ is the norm of $f \in L_r(G)$.
Lemma 72.4. For all $u, v, w \in V$ we have

$$|b(u, v, w)| \le \text{const}\,|u|\,|v|\,\|w\| \le \text{const}\,\|u\|\,\|v\|\,\|w\|, \tag{18}$$

$$b(v, v, v) = 0. \tag{19}$$

PROOF. The Hölder inequality for three factors (65.16) and (16) imply

$$|b(u, v, w)| \le \|u_i\|_4\,\|v_j\|_4\,\|D_j w_i\|_2 \le \text{const}\,|u|\,|v|\,\|w\|.$$

This is (18).
Integration by parts and $\mathrm{div}\,v = 0$ yield

$$2b(v, v, v) = \int_G 2v_i v_j D_j v_i\,dx = \int_G v\,\mathrm{grad}\,v^2\,dx = -\int_G v^2\,\mathrm{div}\,v\,dx = 0$$

for all $v \in C_0^\infty(G, \mathrm{div})$. Since this set is dense in $V$, we obtain (19) from (18). □

72.3d. Equivalent Operator Equation


Our goal is to show that the generalized stationary problem (15) is equivalent
to the operator equation

$$\eta v - Qv = f, \qquad v \in V. \tag{20}$$

To this end, we set

$$(f \mid w) = c(w), \qquad (B(u, v) \mid w) = b(u, v, w) \quad \text{for all } w \in V \tag{21}$$

and fixed $u, v \in V$. In order to justify (21), we use the theorem of Riesz of Section
18.11. In fact, from Lemma 72.3, we have that $c\colon V \to \mathbb{R}$ is a linear continuous
functional and hence there exists an element $f \in V$ with (21). In the same way,
Lemma 72.4 guarantees the existence of a $B(u, v) \in V$ with (21).
Moreover, we set

$$Qv = B(v, v) \quad \text{for all } v \in V.$$

Obviously, equation (15) is equivalent to (20).

Lemma 72.5. The operator $B\colon V \times V \to V$ is bilinear, symmetric, bounded, and
strongly continuous.

PROOF. The operator $B$ is bounded. This follows from (18) and hence

$$\|B(u, v)\| \le \text{const}\,|u|\,|v| \le \text{const}\,\|u\|\,\|v\|$$

for all $u, v \in V$.
We show that $B$ is strongly continuous. Let

$$u_n \rightharpoonup u \quad \text{and} \quad v_n \rightharpoonup v \quad \text{in } V \text{ as } n \to \infty.$$

Then the sequences $(u_n)$ and $(v_n)$ are bounded in $V$. Because of the compact
embedding $V \subseteq L_4(G)$ we have that $|u_n - u| \to 0$ and $|v_n - v| \to 0$ as $n \to \infty$.
Hence

$$\|B(u_n, v_n) - B(u, v)\| = \|B(u_n - u, v_n) + B(u, v_n - v)\| \le \text{const}\,\bigl(|u_n - u|\,|v_n| + |u|\,|v_n - v|\bigr) \to 0 \quad \text{as } n \to \infty.$$

The remaining claims are obvious. □

72.4. Existence and Uniqueness Theorem for Stationary Flows

Theorem 72.A. Assume condition (2) for a fixed flow region $G$. The generalized
stationary problem (15) then has a solution for every $K \in L_2(G, V_3)$.
This solution is unique if

$$\eta^{-4} \int_G K^2\,dx$$

is sufficiently small.

PROOF.
(I) Existence. From Lemma 72.5 it follows that the operator $Q\colon V \to V$ is
strongly continuous, and (19) implies that

$$(Qv \mid v) = b(v, v, v) = 0 \quad \text{for all } v \in V.$$

Thus the existence result follows from Section 72.1.
(II) Uniqueness. From (6) it follows that

$$\|v\| \le \|f\|/\eta$$

for every solution $v \in V$ of equation (20). For two solutions $u$ and $v$ we
therefore have

$$\eta\|u - v\| = \|Qu - Qv\| = \|B(u - v, u) + B(v, u - v)\| \le \text{const}\,\|u - v\|(\|u\| + \|v\|) \le \text{const}\,\|u - v\|\,\|f\|/\eta,$$

which shows that $\|u - v\| = 0$ if $\|f\|/\eta^2$ is sufficiently small. This gives the
uniqueness of the solution.
From Lemma 72.3 it follows that

$$\|f\|^2 \le \text{const} \int_G K^2\,dx.$$

In case we do not choose $\rho = 1$, such as in Section 72.1, we have to replace
$\eta$, $K$ with $\eta/\rho$, $K/\rho$. □
If we introduce dimensionless quantities, as in (13), then we obtain unique
solutions of (13) if the dimensionless number

$$(\mathrm{Re})^4 \int_\Omega \bar K^2\,dx$$

is sufficiently small for fixed $\Omega$.
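To make the smallness condition concrete, the uniqueness estimate can be written with an explicit constant. The sketch below assumes a bound $\|B(u, v)\| \le \beta\|u\|\,\|v\|$ on $V$, as provided by Lemma 72.5; the symbol $\beta$ is hypothetical and introduced here only for illustration.

```latex
% Two solutions u, v of (20) satisfy \eta(u - v) = Qu - Qv, and
% Qu - Qv = B(u-v, u) + B(v, u-v) by bilinearity of B. Hence
\eta\,\|u - v\| \le \beta\,(\|u\| + \|v\|)\,\|u - v\|
               \le \frac{2\beta\,\|f\|}{\eta}\,\|u - v\|,
% where the a priori bound \|u\|, \|v\| \le \|f\|/\eta was used. Therefore
\Bigl(1 - \frac{2\beta\,\|f\|}{\eta^2}\Bigr)\,\|u - v\| \le 0,
% which forces u = v as soon as
\frac{\|f\|}{\eta^2} < \frac{1}{2\beta}.
```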

72.5. Generalized Nonstationary Problem


In order to understand this section we need some tools which were presented
in Chapter 23 in connection with linear evolution equations. In particular, we
need the spaces

$$L = L_2(0, T; V) \quad \text{and} \quad W = W_2^1(0, T; V, H).$$

Recall that $v \in W$ means that

$$v \in L_2(0, T; V) \quad \text{and} \quad v' \in L_2(0, T; V^*).$$

The spaces $V$ and $H$ have been introduced in Section 72.2.
Moreover,

"$V \subseteq H \subseteq V^*$"

is an evolution triple in the sense of Section 23.4. Let $[0, T]$ be a fixed, but
otherwise arbitrary time interval with $0 < T < \infty$.
In order to obtain the generalized problem, we multiply equation (1) with
$w \in C_0^\infty(G, \mathrm{div})$. After integrating by parts we then obtain

$$\frac{d}{dt}(v \mid w)_H + \eta(v \mid w) - b(v, v, w) = (K \mid w)_Y \quad \text{for all } w \in V, \qquad v(0) = v_0. \tag{22}$$

In contrast to the stationary case, $v$ and $K$ depend here on time $t$.

Definition 72.6. Assume condition (2) for the region $G$. The generalized problem
to the nonstationary Navier-Stokes equations (1) is the following.
Given the initial velocity $v_0 \in H$ and the density of outer forces
$K \in L_2(0, T; Y)$ with $Y = L_2(G, V_3)$, we are looking for

$$v \in W$$

such that equation (22) is valid on the time interval $]0, T[$.
The time derivative on $]0, T[$ in (22) is understood in the generalized sense.

As in Section 72.3, we see that this is a meaningful generalization of the
original problem (1).
By using (21) we can write (22) in the form

$$\frac{d}{dt}(v \mid w)_H + \eta(v \mid w) - (B(v, v) \mid w) = (f \mid w) \quad \text{for all } w \in V, \qquad v(0) = v_0, \tag{23}$$

where $v$ and $K$ depend on $t$. Recall that $(\cdot \mid \cdot)$ denotes the scalar product on $V$.

72.5a. The Linearized Problem

The linearized problem to (23) is

$$\frac{d}{dt}(v \mid w)_H + \eta(v \mid w) = (g \mid w) \quad \text{for all } w \in V, \qquad v(0) = v_0. \tag{24}$$

According to the main theorem about linear evolution equations of first order
(Theorem 23.A), this equation has a unique solution $v \in W$ for every $v_0 \in H$
and $g \in L$. We set

$$v = S(g, v_0).$$

Lemma 72.7. The solution operator $S\colon L \times H \to W$ is linear and continuous.

PROOF. From Theorem 23.A follows

$$\|S(g, v_0)\|_W \le c\,(\|g\|_L + \|v_0\|_H). \tag{25}$$
□

72.5b. Operator Properties


Letting $g = f + B(v, v)$, as a comparison of (23) with (24) shows, we can write (23) equivalently as

$$v = S(f + B(v, v), v_0), \qquad v \in W. \tag{26}$$

Observe that $v$ depends on $t$. We need to show that $f$ and $B$ are correctly
defined, i.e., that $f \in L$ and $B(v, v) \in L$ for all $v \in W$.

Lemma 72.8. The function $f$ belongs to $L$.

PROOF. By assumption we have $K \in L_2(0, T; Y)$, i.e., $K(t) \in Y$ for all $t \in\, ]0, T[$.
From

$$(f(t) \mid w) = (K(t) \mid w)_Y \quad \text{for all } w \in V$$

it follows that $\|f(t)\|_V \le \text{const}\,\|K(t)\|_Y$, according to Lemma 72.3, and hence

$$\|f\|_L^2 = \int_0^T \|f(t)\|_V^2\,dt \le \text{const} \int_0^T \|K(t)\|_Y^2\,dt. \qquad \Box$$

Lemma 72.9. The bilinear, bounded operator $B\colon V \times V \to V$ can naturally be
extended to a bilinear, bounded operator $B\colon W \times W \to L$.

PROOF. All constants will be denoted by $c$. Set

$$Z = \text{closure of } C_0^\infty(G, \mathrm{div}) \text{ in } L_4(G, V_3).$$

Since the embedding $W_2^1(G) \subseteq L_4(G)$ is continuous, it follows that the embedding
$V \subseteq Z$ is continuous as well. Moreover, $V$ is dense in $Z$. Therefore

"$V \subseteq Z \subseteq V^*$"

is an evolution triple. From Proposition 23.23 it then follows that the
embedding

$$W \subseteq C([0, T], Z)$$

is continuous. Note that the definition of $W = W_2^1(0, T; V, H)$ depends only
on $V$ and $V^*$, and not on the concrete form of $H$, i.e.,
$W_2^1(0, T; V, H) = W_2^1(0, T; V, Z)$. We have the important estimate

$$\max_{0 \le t \le T} |w(t)| \le c\,\|w\|_W \quad \text{for all } w \in W. \tag{27}$$

Recall that $|\cdot|$ and $\|\cdot\|$ denote the norm on $Z \subseteq L_4(G, V_3)$ and $V$, respectively.
Let

$$z(t) = B(u(t), v(t)).$$

Then for all $u, v \in W$ we have

$$\|z\|_L^2 = \int_0^T \|B(u(t), v(t))\|^2\,dt \le \int_0^T c\,|u(t)|^2\,|v(t)|^2\,dt \le c\,\Bigl(\max_{0 \le t \le T} |u(t)|\Bigr)^2 \Bigl(\max_{0 \le t \le T} |v(t)|\Bigr)^2 \le c\,\|u\|_W^2\,\|v\|_W^2. \qquad \Box$$

72.6. Existence and Uniqueness Theorem for Nonstationary Flows

Theorem 72.B. Assume condition (2) for a fixed flow region $G$. Suppose that we
are given the initial velocity $v_0 \in H$ and the density of the outer forces
$K \in L_2(0, T; Y)$.
The generalized nonstationary problem (22) then has at most one solution. If
the norms of $v_0$ and $K$ are sufficiently small, i.e., if

$$\|v_0\|_H^2 + \int_0^T \int_G K(x, t)^2\,dx\,dt \le r^2 \tag{28}$$

for fixed small $r > 0$, then there exists a unique solution.

PROOF.

(I) Existence. In case that the norms $\|v_0\|_H$ and $\|f\|_L$ are sufficiently small,
we apply the implicit function theorem (Theorem 4.B) to equation (26).
Observe that, because of Lemma 72.9, the right-hand side of (26) is
analytic.
(II) Uniqueness. Assume that $u, v \in W$ are solutions of the generalized problem
(26), and set $w = u - v$. Then, according to (27), there exists a number
$R > 0$ with

$$|u(t)|, |v(t)| \le R \quad \text{for all } t \in [0, T].$$

From (26) follows

$$w = S(B(w, u) + B(v, w), 0).$$

This yields

$$|w(T)|^2 \le c\,\|w\|_W^2 \le c\,\|S\|^2 \int_0^T (|w|\,|u| + |w|\,|v|)^2\,dt \le 4cR^2\,\|S\|^2 \int_0^T |w(t)|^2\,dt.$$

The function $w$ is a generalized solution on each time interval $[0, s]$ with
$s \le T$, which belongs to the space $W_2^1(0, s; V, H)$. Therefore, in the last
estimate, we may replace $T$ with $s$. Note that, according to Theorem
23.A, the constant $c$ in (25) does not depend on the subinterval $[0, s]$ of
$[0, T]$. Consequently, the norm $\|S\|$ of the solution operator is uniformly
bounded with respect to all subintervals $[0, s]$ of $[0, T]$. Gronwall's
lemma of Section 3.5 then implies that

$$w(s) = 0 \quad \text{for all } s \in [0, T]. \qquad \Box$$
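The final Gronwall step can be made explicit. The following is a sketch, writing $\varphi(s) = |w(s)|^2$ and $C = 4cR^2\|S\|^2$ (both abbreviations are introduced here only for illustration); $\varphi$ is continuous because $W \subseteq C([0, T], Z)$.

```latex
% The estimate from the proof, applied on [0,s] instead of [0,T], reads
0 \le \varphi(s) \le C \int_0^s \varphi(t)\,dt \qquad \text{for all } s \in [0,T].
% Set \Phi(s) = \int_0^s \varphi(t)\,dt, so that \Phi(0) = 0 and \Phi' = \varphi \le C\Phi. Then
\frac{d}{ds}\bigl(e^{-Cs}\,\Phi(s)\bigr) = e^{-Cs}\bigl(\Phi'(s) - C\,\Phi(s)\bigr) \le 0,
% hence e^{-Cs}\Phi(s) \le \Phi(0) = 0. Since also \Phi \ge 0, we get \Phi \equiv 0 and
\varphi(s) \le C\,\Phi(s) = 0 \qquad \text{for all } s \in [0,T].
```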

72.7. Taylor Problem and Bifurcation


In this and the following section we need results about B-spaces of Hölder-continuous
functions which have been discussed in Section 6.2. We use arguments
which can also be applied to bifurcation problems for more general
elliptic systems of partial differential equations.

72.7a. The Physical Problem


As in Figure 72.1, we consider a viscous fluid between two concentric cylinders,
whereby the outer cylinder is at rest and the inner cylinder rotates
counterclockwise around the $z$-axis with angular velocity $\omega$. Let the cylinder
radii be $r_1$ and $r_2$ with $r_1 < r_2$. Important is the Reynolds number Re. We set

$$\lambda = \mathrm{Re} = \rho\omega r_1^2/\eta.$$

In experiments one observes a critical number $\lambda_0$ with the following properties.
(i) For $\lambda < \lambda_0$, i.e., for small angular velocities $\omega$, there exists an axisymmetric
flow which does not depend on the $z$-coordinate. This is the so-called
Couette flow.
(ii) For $\lambda = \lambda_0$, so-called Taylor vortices occur, which are periodic in $z$ (Fig.
72.2).
(iii) If the angular velocity $\omega$ gets larger and larger, i.e., for increasing $\lambda$, one
obtains more and more complicated flow pictures until at a certain
$\lambda = \lambda_{\text{crit}}$ turbulence occurs.
[Figures 72.1 to 72.5]

Our goal is to treat this Taylor problem as a bifurcation problem in a
neighborhood of the first bifurcation point $\lambda_0$ (Fig. 72.3). Up until today,
however, it has not been possible to follow the solution branch as $\lambda \to +\infty$.
One expects successive secondary bifurcations as shown in Figure 72.4.

72.7b. The Mathematical Problem


As in formula (12), we introduce dimensionless quantities $u$, $x$, and $p$,
and describe $x$ with respect to the cylinder coordinates $r$, $\varphi$, $z$ and the
corresponding orthonormal basis vectors $e_r$, $e_\varphi$, $e_z$ (Fig. 72.5). Moreover, we let

$$R = r_2/r_1.$$

By using the dimensionless stationary Navier-Stokes equations (13), our
problem can then be stated as follows.
We are looking for a velocity field $u$ and a pressure function $p$ which satisfy

$$-\frac{1}{\lambda}\Delta u + (u\,\mathrm{grad})u = -\mathrm{grad}\,p, \qquad \mathrm{div}\,u = 0 \tag{29}$$

for all arguments $(r, z) \in [1, R] \times \mathbb{R}$ and the boundary conditions

$$u = e_\varphi \quad \text{for } r = 1, \qquad u = 0 \quad \text{for } r = R. \tag{30}$$

Here the rotating inner cylinder corresponds to $r = 1$ and the outer cylinder at
rest to $r = R$. More precisely, we are looking for axisymmetric solutions, which
have the period $P$ with respect to $z$.

72.7c. Bifurcation from the Couette Flow

By writing (29) in cylindrical coordinates, we immediately observe that the
so-called Couette flow

$$u_0(r) = (1 - R^2)^{-1}(r - R^2 r^{-1})e_\varphi,$$
p0 (r) = u~ ln r + const
is a solution of (29), (30). In order to find solutions which bifurcate from this
solution, i.e., which correspond to the Taylor vortices, we set

$$u = u_0 + w, \qquad p = p_0 + q$$

with

$$w = Ue_r + Ve_\varphi + We_z.$$

Transformation of (29) and (30) onto cylindrical coordinates yields

$$\begin{aligned}
-\lambda^{-1}A_1 U + UU_r + WU_z - r^{-1}V^2 - 2r^{-1}VV_0 + q_r &= 0,\\
-\lambda^{-1}A_1 V + UV_r + WV_z + r^{-1}UV + r^{-1}(V_0 + rV_0')U &= 0,\\
-\lambda^{-1}(A_1 W + r^{-2}W) + UW_r + WW_z + q_z &= 0,
\end{aligned} \tag{31}$$

with $V_0 = |u_0|$ and also the elliptic differential operator

$$A_1 u = u_{zz} + u_{rr} + r^{-1}u_r - r^{-2}u.$$

From $\mathrm{div}\,u = 0$ we obtain

$$(rU)_r + (rW)_z = 0, \tag{32}$$

and the boundary conditions are

$$U = V = W = 0 \quad \text{for } r = 1 \text{ and } r = R. \tag{33}$$
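That the Couette flow solves the dimensionless stationary problem can be verified directly. The sketch below writes $u_0 = V_0(r)e_\varphi$ with $V_0(r) = ar + br^{-1}$, where the abbreviations $a = (1 - R^2)^{-1}$ and $b = -R^2(1 - R^2)^{-1}$ are introduced here for convenience, and it assumes, as in Section 72.7a, that the inner cylinder $r = 1$ rotates with dimensionless speed 1 while the outer cylinder $r = R$ is at rest.

```latex
% Boundary values:
V_0(1) = a + b = \frac{1 - R^2}{1 - R^2} = 1, \qquad
V_0(R) = aR + \frac{b}{R} = \frac{R - R}{1 - R^2} = 0.
% Azimuthal component: for a flow V_0(r) e_\varphi the convective term has no
% e_\varphi-component, so the equation reduces to A_1 V_0 = 0. Indeed,
V_0'' + r^{-1}V_0' - r^{-2}V_0
  = \frac{2b}{r^3} + \Bigl(\frac{a}{r} - \frac{b}{r^3}\Bigr)
    - \Bigl(\frac{a}{r} + \frac{b}{r^3}\Bigr) = 0.
% Radial component: the centripetal term is balanced by the pressure gradient,
\frac{dp_0}{dr} = \frac{V_0(r)^2}{r},
% which determines p_0 up to an additive constant.
% The divergence vanishes because V_0(r) e_\varphi does not depend on \varphi.
```

The relation $A_1 V_0 = 0$ is also the reason why no term of order $\lambda^{-1}$ involving $V_0$ appears in the perturbation equations for $U$, $V$, $W$.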
498 72. Viscous Fluids and the Navier-Stokes Equations

72.7d. The Stream Function

In order to simplify equation (31) we use the so-called stream function t/1, which
can be uniquely determined from the equation
(rt/l)z = rU, (rt/1), = -rW
with (r,z)e[1,R] x Rand the normalization condition t/1(1,0) = 0. Because
of(32), the integrability conditions are satisfied (see Problem 65.3). In addition,
we assume the symmetry condition
t/J(r, - z) = - t/J(r, z), V(r, -z) = V(r,z). (34)
Our next goal is to reduce this problem to a corresponding one for the
functions $\psi$ and $V$. In order to eliminate the pressure $q$ in (31), we differentiate
the first and third equations in (31) with respect to $z$ and $r$, respectively, and
subtract the results. If we add the second equation of (31), we obtain
\[
\Delta_1^2\psi + \lambda\bigl(aV_z + N_1(\psi,V)\bigr) = 0, \qquad
\Delta_1 V + \lambda\bigl(b\psi_z + N_2(\psi,V)\bigr) = 0, \tag{35}
\]
with symmetry conditions (34) and boundary conditions
\[ \psi = \psi_r = V = 0 \quad \text{for } r = 1 \text{ and } r = R. \tag{36} \]
Here we have
\[ a(r) = \frac{2(R^2 - r^2)}{(R^2 - 1)\,r^2}, \qquad b = \frac{2}{R^2 - 1}, \]
and
\[ rN_1 = \bigl((r\psi)_r\,\Delta_1\psi\bigr)_z - r\bigl(\psi_z\,\Delta_1\psi\bigr)_r + 2VV_z, \]
\[ rN_2 = (r\psi)_r V_z - (rV)_r\psi_z. \]



The concrete form of the nonlinear terms $N_1$ and $N_2$ is not important for the
existence proof. We will only need the fact that they are bilinear with respect
to $\psi$ and $V$, and that $N_1$ and $N_2$ contain derivatives up to order 3 and 1,
respectively. Notice, furthermore, that $a$ and $b$ are positive on $]1,R[$.
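
The coefficients can be cross-checked against the Couette profile. The sketch below assumes $V_0(r) = (r - R^2 r^{-1})/(1 - R^2)$, as stated for the Couette flow, and reads the linear terms of (31) as giving $a = 2V_0/r$ and $b = -(V_0' + V_0/r)$; that reading and the value $R = 2$ used in the usage note are assumptions made for illustration.

```python
def V0(r, R):
    """Couette profile V0(r) = (r - R^2 / r) / (1 - R^2)."""
    return (r - R * R / r) / (1.0 - R * R)


def dV0(r, R):
    """Exact first derivative of the Couette profile."""
    return (1.0 + R * R / (r * r)) / (1.0 - R * R)


def d2V0(r, R):
    """Exact second derivative of the Couette profile."""
    return -2.0 * R * R / (r ** 3 * (1.0 - R * R))


def a(r, R):
    """Coefficient a(r) = 2 (R^2 - r^2) / ((R^2 - 1) r^2)."""
    return 2.0 * (R * R - r * r) / ((R * R - 1.0) * r * r)


def b(R):
    """Coefficient b = 2 / (R^2 - 1)."""
    return 2.0 / (R * R - 1.0)


def azimuthal_residual(r, R):
    """V0'' + V0'/r - V0/r^2, the z-independent part of Delta_1 applied to V0."""
    return d2V0(r, R) + dV0(r, R) / r - V0(r, R) / (r * r)
```

With `R = 2.0`, `V0(1.0, R)` equals 1 and `V0(R, R)` equals 0, so the profile rotates with the inner cylinder and is at rest on the outer one; `a(r, R)` and `b(R)` are positive for `1 < r < R`.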
We are looking for functions
\[ \psi = \psi(r,z) \quad \text{and} \quad V = V(r,z) \]
with $(r,z) \in [1,R] \times \mathbb{R}$, which have period $P$ with respect to $z$. If we know a
solution of (35), then we can derive the pressure $q$ from equation (31), because (35)
contains the integrability conditions for $q$.
Our problem is therefore reduced to the question of finding nontrivial
solutions for the boundary-value problem (34)-(36).

72.7e. The Operator Equation

For a functional-analytic formulation of this problem we set
\[ x = (\psi, V), \]
and write the differential equation (35) in the form
\[ Ax + \lambda(Bx + Nx) = 0, \qquad x \in X \tag{37} \]
with the linear operators
\[ Ax = (\Delta_1^2\psi,\ \Delta_1 V), \qquad Bx = (aV_z,\ b\psi_z), \]
and the nonlinear operator
\[ Nx = (N_1(x), N_2(x)). \]
In order to obtain the operators
\[ A, B, N\colon X \to Y, \]
we define the real B-spaces $X$ and $Y$ in the following way. Set
\[ Q = [1,R] \times [0,P], \]
and let $C^{k,\alpha}_{per}(Q)$ denote the B-space of all $C^{k,\alpha}(Q)$-functions which can be
extended to $C^{k,\alpha}$-functions on $[1,R] \times \mathbb{R}$ and have the period $P$ with respect
to the $z$-coordinate. Obviously, $C^{k,\alpha}_{per}(Q)$ is a closed linear subspace of the
B-space $C^{k,\alpha}(Q)$ (see Section 6.2).
We now define $X$ as the B-space of all function pairs
\[ (\psi, V) \in C^{4,\alpha}_{per}(Q) \times C^{2,\alpha}_{per}(Q) \]
with $0 < \alpha < 1$, where in addition $\psi$ and $V$ satisfy the symmetry conditions
(34) on $Q$ as well as the boundary conditions (36).
Moreover, we define $Y$ as the B-space of all function pairs
\[ (f, g) \in C^{\alpha}_{per}(Q) \times C^{\alpha}_{per}(Q), \]
which satisfy the symmetry conditions
\[ f(r,-z) = -f(r,z), \qquad g(r,-z) = g(r,z) \quad \text{on } Q. \]
Note that the spaces $X$ and $Y$ depend on the period $P$.

72.7f. Main Result

Theorem 72.C (Taylor Problem). One can choose the period $P > 0$ in such a
way that there exists a positive number $\lambda_0$ which has the following properties.
(i) For $0 < \lambda < \lambda_0$, the operator equation (37) has only the trivial solution
$x = 0$.
(ii) $(\lambda_0, 0)$ is a bifurcation point of (37). More precisely, there exists a unique
bifurcation branch in a neighborhood of $(\lambda_0, 0)$ in $\mathbb{R} \times X$. This branch
depends analytically on the small real parameter $s$, i.e.,

(38)

In the following section we will show that Theorem 72.C is a direct consequence
of the main theorem of analytic bifurcation theory (Theorem 8.A).
Corollary 8.25 contains an effective method for a successive computation of
the coefficients in (38). Letting $C = A + \lambda_0 B$, we repeatedly have to solve the
equation
\[ Cx = y, \qquad x \in X \tag{39} \]
in order to determine $(x_{j+1}, \varepsilon_j)$ for $j = 1, 2, \ldots$. Moreover, we will compute
$(x_1, \lambda_0)$ with $Cx_1 = 0$. Equation (39) means that we have to solve a linear
elliptic system. By using a Fourier expansion this can be reduced to a
boundary-value problem for fourth-order ordinary differential equations, see
(47) below.
Numerical computations of Kirchgassner and Sorger (1969) have shown
that
and
The bifurcation branch has therefore the structure shown in Figure 72.3. The
result coincides with the physical experiments described in Section 72.7a. The
trivial solution $x = 0$ corresponds to the Couette flow, and for Reynolds
numbers $\lambda > \lambda_0$ in a neighborhood of $\lambda_0$, the bifurcation solutions yield the
Taylor vortices. Further numerical studies in connection with the Taylor
problem may be found in Meyer and Keller (1980) and Frank and Meyer
(1981).

72.8. Proof of Theorem 72.C


In order to apply Theorem 8.A, we need the following basic results.
(i) The linearized operator $A + \lambda B$ is Fredholm of index zero.
(ii) The linearized equation $(A + \lambda B)x = 0$ has a simple solution.
(iii) The generic bifurcation condition is satisfied.
In connection with (ii) and (iii), we use the spectral theory of linear integral
equations with positive kernels. This is related to Chapter 7. Furthermore, we
apply Fourier series, and we force condition (ii) by using a suitable period P
(trick of changing the periodicity). The analytical key to the following proof
is contained in Lemmas 72.12 and 72.14.

Lemma 72.10. For all $(\psi, V), (\varphi, W) \in X$ we have
\[
-\int_Q (\Delta_1 V)\,W\,r\,dr\,dz = \int_Q \bigl(V_r W_r + V_z W_z + r^{-2}VW\bigr)\,r\,dr\,dz
= -\int_Q (\Delta_1 W)\,V\,r\,dr\,dz. \tag{40}
\]

PROOF. Use integration by parts. □

Lemma 72.11. The operators $A, B, N\colon X \to Y$ have the following properties: $A$
is linear and continuous, $B$ is linear and compact, and $N$ is analytic with
$Nx = O(\|x\|^2)$ as $x \to 0$.

PROOF. According to Section 6.2, this follows from the compact embedding
\[ C^{k+1,\alpha}(Q) \subseteq C^{k,\alpha}(Q) \]
and estimates of the form
\[ \|fg\|_{\alpha} \le \mathrm{const}\,\|f\|_{\alpha}\,\|g\|_{\alpha}, \]
where $\|\cdot\|_{\alpha}$ denotes the $C^{\alpha}$-norm. Notice that $B$, unlike $A$, contains only
low-order derivatives. □

Lemma 72.12. The equation
\[ Ax = y, \qquad x \in X \tag{41} \]
has a unique solution for every $y \in Y$.

PROOF. This statement follows from standard results about linear elliptic
differential equations. Explicitly, equation (41) means that
\[ \Delta_1^2\psi = f, \qquad \Delta_1 V = g, \qquad (\psi, V) \in X \tag{42} \]
with $(f, g) \in Y$.
(I) Uniqueness. From $Ax = 0$ and (40) we obtain at once that $x = 0$.
(II) Existence. If $f$ and $g$ are expanded in Fourier series as in (47) below, and
if $f, g \in C^{\infty}_{per}(Q)$, then it is easy to obtain a solution $(\psi, V)$ in the form of a
Fourier series. This classical solution is also a generalized solution, i.e.,
\[
\int_Q \psi\,(\Delta_1^2\varphi)\,r\,dr\,dz = \int_Q f\varphi\,r\,dr\,dz, \qquad
\int_Q V\,(\Delta_1\varphi)\,r\,dr\,dz = \int_Q g\varphi\,r\,dr\,dz
\]
for all $\varphi \in C_0^{\infty}(Q)$. From (40) and the Hölder inequality we derive the
estimates
\[ \|\psi\| \le \mathrm{const}\,\|f\|, \qquad \|V\| \le \mathrm{const}\,\|g\| \]
in $L_2(Q)$-norms.
For $(f, g) \in Y$, i.e., for $f, g \in C^{\alpha}_{per}(Q)$, we then obtain generalized solutions
$(\psi, V)$ by applying these estimates and a simple approximation argument.
According to the regularity results in Agmon, Douglis, and Nirenberg
(1959), these generalized solutions $(\psi, V)$ have classical derivatives up to
the natural order prescribed in (42), i.e., $\psi \in C^{4,\alpha}(Q)$ and $V \in C^{2,\alpha}(Q)$. This
implies that $(\psi, V) \in X$. □

Lemma 72.13. For every $\lambda \in \mathbb{R}$, the operator $A + \lambda B\colon X \to Y$ is Fredholm of
index zero.

PROOF. Lemma 72.12 implies that $A$ is Fredholm of index zero. According to
Section 8.4, the compact perturbation $A + \lambda B$ then has the same property. □

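Lemma 72.14. There exist a period $P$ and a number $\lambda_0 > 0$ such that the
equation
\[ Ax + \lambda Bx = 0, \qquad x \in X \tag{43} \]
has only the trivial solution $x = 0$ for every $\lambda \in\, ]0, \lambda_0[$, and precisely one linearly
independent nontrivial solution $x_1 = (\psi_1, V_1)$ for $\lambda = \lambda_0$ with
\[ \psi_1 = \alpha(r)\sin\frac{2\pi z}{P}, \qquad V_1 = \beta(r)\cos\frac{2\pi z}{P}, \]
where $\alpha > 0$ and $\beta > 0$ on $]1,R[$. The equation
\[ Ax + \lambda_0 Bx = Bx_1, \qquad x \in X \tag{44} \]
has no solution.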

This key lemma will be proved at the end of this section.

PROOF OF THEOREM 72.C. We set $\lambda = \lambda_0 + \varepsilon$. The operator equation (37) then
takes the form
\[ F(\varepsilon, x) = 0, \qquad (\varepsilon, x) \in \mathbb{R} \times X \]
with
\[ F(0,0) = 0, \qquad F_x(0,0) = A + \lambda_0 B, \qquad F_{\varepsilon x}(0,0) = B. \]
We set $C = F_x(0,0)$ and choose an element $x_1^* \in Y^*$ with $x_1^* \ne 0$ and $C^* x_1^* = 0$.
Such an $x_1^*$ exists according to Lemmas 72.13 and 72.14 (see Section 8.4). This
implies that
\[ \langle x_1^*, Bx_1 \rangle \ne 0, \tag{45} \]
since otherwise, equation (44) would have a solution.
Observe that (45) is precisely the generic bifurcation condition of Theorem
8.A. Thus, Theorem 72.C follows from Theorem 8.A. □

PROOF OF LEMMA 72.14. We make essential use of the positivity of certain
Green's functions and the spectral theory of linear integral operators with
positive kernels.
(I) Equivalent homogeneous integral equation. The operator equation (43)
is equivalent to the differential equation
\[ \Delta_1^2\psi + \lambda aV_z = 0, \qquad \Delta_1 V + \lambda b\psi_z = 0 \tag{46} \]
with $(\psi, V) \in X$. According to the regularity theory for linear elliptic
differential equations, every such solution is infinitely differentiable
on $[1,R] \times \mathbb{R}$. Recall that the definition of the space $X$ in the
previous section includes periodicity conditions with respect to $z$, the symmetry
conditions (34), and the boundary conditions (36). As a consequence
of these conditions, the solutions of (46) have Fourier expansions
of the form
\[
\psi(r,z) = \sum_{n=1}^{\infty} \alpha_n(r)\sin\frac{2\pi nz}{P}, \qquad
V(r,z) = \beta_0(r) + \sum_{n=1}^{\infty} \beta_n(r)\cos\frac{2\pi nz}{P}. \tag{47}
\]
From (46) it follows that
\[
\begin{gathered}
L^2\alpha_n = \lambda ak\beta_n \quad \text{and} \quad L\beta_n = \lambda bk\alpha_n \quad \text{on } ]1,R[,\\
\alpha_n(r) = \alpha_n'(r) = \beta_n(r) = 0 \quad \text{for } r = 1 \text{ and } r = R
\end{gathered}
\tag{48}
\]
for fixed $n \in \mathbb{N}$ with $k = 2\pi n/P$ and
\[ L = -\frac{d^2}{dr^2} - \frac{1}{r}\frac{d}{dr} + \frac{1}{r^2} + k^2. \]
Moreover, we have (with $k = 0$ in $L$)
\[ L\beta_0 = 0 \ \text{ on } ]1,R[ \quad \text{and} \quad \beta_0(1) = \beta_0(R) = 0. \]
This implies $\beta_0 = 0$, since integration by parts yields
\[ \int_1^R \beta_0\,(L\beta_0)\,r\,dr \ge \int_1^R \beta_0^2\,r^{-1}\,dr. \]
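
The passage from (46) to (48) can be written out for a single Fourier mode. The following is a sketch of that computation; the explicit form of the radial operator $L$ is the one forced by (46) and (47), and the index $n$ is suppressed.

```latex
\begin{align*}
% single mode with wave number k = 2\pi n / P
\psi &= \alpha(r)\sin kz, \qquad V = \beta(r)\cos kz,\\
% action of \Delta_1 u = u_{zz} + u_{rr} + r^{-1}u_r - r^{-2}u
\Delta_1\bigl(\alpha\sin kz\bigr)
  &= \bigl(\alpha'' + r^{-1}\alpha' - r^{-2}\alpha - k^2\alpha\bigr)\sin kz
   = -(L\alpha)\sin kz,\\
L &:= -\frac{d^2}{dr^2} - \frac{1}{r}\frac{d}{dr} + \frac{1}{r^2} + k^2,\\
% insertion into (46)
\Delta_1^2\psi + \lambda aV_z
  &= \bigl(L^2\alpha - \lambda ak\beta\bigr)\sin kz = 0,\\
\Delta_1 V + \lambda b\psi_z
  &= \bigl(-L\beta + \lambda bk\alpha\bigr)\cos kz = 0,\\
% case n = 0, i.e. k = 0, using \beta_0(1) = \beta_0(R) = 0
\int_1^R \beta_0\,(L\beta_0)\,r\,dr
  &= \int_1^R \bigl(r\,\beta_0'^{\,2} + r^{-1}\beta_0^2\bigr)\,dr
   \ \ge\ \int_1^R \beta_0^2\,r^{-1}\,dr.
\end{align*}
```

The last line uses $(r\beta_0')' = r\beta_0'' + \beta_0'$ and one integration by parts; since $L\beta_0 = 0$, the left-hand side vanishes and hence $\beta_0 = 0$.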

Let $G$ and $H$ be the respective Green's functions for $L$ and $L^2$ with
boundary conditions as in (48). Then equation (48) is equivalent to the
system of integral equations
\[
\alpha_n(r) = \lambda k \int_1^R H(r,s)\,a(s)\,\beta_n(s)\,ds, \qquad
\beta_n(s) = \lambda k \int_1^R G(s,t)\,b(t)\,\alpha_n(t)\,dt. \tag{49}
\]
This implies
\[ \alpha_n(r) = \mu \int_1^R K(r,t)\,\alpha_n(t)\,dt \tag{50} \]
with $\mu = \lambda^2$ and
\[ K(r,t) = k^2 \int_1^R H(r,s)\,G(s,t)\,a(s)\,b(t)\,ds. \]
Problem 72.4 yields the crucial fact that
$H$ and $G$ are positive on $]1,R[\, \times\, ]1,R[$.
The same property is valid for $K$. Problem 72.5 implies that the eigenvalue
problem (50) has a simple characteristic number $\mu_0(n) > 0$ of
smallest absolute value with an eigenfunction
\[ \alpha_{n,0}(r) > 0 \quad \text{on } ]1,R[. \]
If we set $\lambda_0(n) = \sqrt{\mu_0(n)}$ and
\[ \beta_{n,0}(s) = \lambda_0(n)\,k \int_1^R G(s,t)\,b(t)\,\alpha_{n,0}(t)\,dt, \]
then $\lambda_0(n)$, $\alpha_{n,0}$, $\beta_{n,0}$ is a solution of (49). Hence we have derived the following
result:
For fixed $n \in \mathbb{N}$, the number $\lambda_0(n) > 0$ is the eigenvalue of smallest
absolute value of (48). It is simple, and the corresponding eigenfunctions
$\alpha_{n,0}$ and $\beta_{n,0}$ are positive on $]1,R[$.
(II) The inhomogeneous integral equation
\[ \alpha(r) = \mu_0(n) \int_1^R K(r,t)\,\alpha(t)\,dt + y(r) \tag{51} \]
with either $y(r) > 0$ or $y(r) < 0$ on $]1,R[$ has no solution.
To see this, let $\alpha^*$ be the solution of the adjoint integral equation to
(50) with $\mu = \mu_0(n)$. Problem 72.5 shows that $\alpha^*(r) > 0$ on $]1,R[$. According
to the classical theory, (51) has a solution if and only if
\[ \int_1^R \alpha^*(r)\,y(r)\,dr = 0. \]
This condition, however, cannot be satisfied because of the positivity of
the integrand.
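
The argument in (II) has a finite-dimensional analogue that can be checked exactly. The sketch below replaces the integral operator by a hypothetical positive 2x2 matrix `K` (an assumption chosen only for illustration) and tests solvability of $x = \mu_0 Kx + y$ by comparing ranks in exact rational arithmetic.

```python
from fractions import Fraction

# Hypothetical positive "kernel": eigenvalues 3 and 1, so mu0 = 1/3.
K = [[Fraction(2), Fraction(1)],
     [Fraction(1), Fraction(2)]]
MU0 = Fraction(1, 3)

# M = I - mu0 * K is singular; x = mu0*K*x + y is the system M x = y.
M = [[(1 if i == j else 0) - MU0 * K[i][j] for j in range(2)] for i in range(2)]

# Positive solution of the adjoint homogeneous equation M^T alpha_star = 0.
ALPHA_STAR = [Fraction(1), Fraction(1)]


def rank(rows):
    """Rank of a small matrix via Gauss-Jordan elimination in exact arithmetic."""
    rows = [list(row) for row in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [x - factor * p for x, p in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def is_solvable(y):
    """M x = y is solvable iff appending y as a column does not raise the rank."""
    augmented = [row + [Fraction(yi)] for row, yi in zip(M, y)]
    return rank(M) == rank(augmented)


def pairing(y):
    """Discrete analogue of the integral of alpha_star * y."""
    return sum(s * Fraction(t) for s, t in zip(ALPHA_STAR, y))
```

For a strictly positive or strictly negative right-hand side the pairing with `ALPHA_STAR` cannot vanish, so `is_solvable` returns `False`; a right-hand side such as `[1, -1]` with zero pairing is solvable.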
(III) Trick of changing the periodicity. It is possible that
\[ \lambda_0(n) = \lambda_0(m) \quad \text{for some } n \ne m. \]
Thus $\lambda_0(n)$ need not be a simple eigenvalue of the original
equation (43).
However, this can be guaranteed by modifying the space $X$. Recall
that $X$ depends on the period $P$. Our idea is to change $P$ in such a way
that certain eigenfunctions do not lie in $X$.
From Problem 72.6 it follows that
\[ \lambda_0(n) \ge \frac{8\pi^2 n^2}{P^2}\Bigl(\max_{1 \le r \le R}\bigl(a(r) + b(r)\bigr)\Bigr)^{-1}, \tag{52} \]
and this implies that $\lambda_0(n) \to +\infty$ as $n \to \infty$. Let $m$ be the largest natural
number with
\[ \lambda_0(m) \le \min_{n \ge 1} \lambda_0(n), \]
and set
\[ \lambda_0 = \lambda_0(m), \qquad \alpha = \alpha_{m,0}, \qquad \beta = \beta_{m,0}. \]
This is the required simple solution of Lemma 72.14 if, in the construction
of $X$, we replace the period $P$ with $P/m$.
Note that the function $\sin(2\pi nz/P)$ for $n = 1, 2, \ldots, m - 1$ does not
have period $P/m$. Thus the corresponding eigenfunctions are not in the
modified space $X$.
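
The selection of $m$ and the effect of shrinking the period can be illustrated with made-up numbers. In the sketch below the values in `lam0_example` are purely hypothetical (they are not computed from (48)); the helper names are likewise illustrative.

```python
def largest_minimizer(lam0):
    """Largest index n at which lambda_0(n) attains its minimum."""
    smallest = min(lam0.values())
    return max(n for n, value in lam0.items() if value == smallest)


def admissible_modes(m, n_max):
    """Modes sin(2*pi*n*z/P) that have period P/m: exactly the multiples of m."""
    return [n for n in range(1, n_max + 1) if n % m == 0]


# Hypothetical values of lambda_0(n); the minimum is attained twice (n = 2, 3).
lam0_example = {1: 5.0, 2: 3.0, 3: 3.0, 4: 4.5, 5: 6.0, 6: 8.0}

m = largest_minimizer(lam0_example)
modes = admissible_modes(m, max(lam0_example))
# Number of surviving modes that share the minimal eigenvalue.
multiplicity = sum(1 for n in modes if lam0_example[n] == lam0_example[m])
```

In this example the minimum is not simple in the original space, but after passing to the period $P/m$ only multiples of $m$ survive, and since $m$ is the largest minimizer, the minimal value is attained by a single surviving mode.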
(IV) Proof of (44). If $x$ is a solution of equation (44), then we expand this
solution in a Fourier series of the form (47). As in (I) we obtain an integral
equation for $\alpha_1$ of the form (51), which, according to (II), has no solution.
This proves Lemma 72.14. □

The proof of Theorem 72.C is now complete.

72.9. Benard Problem and Bifurcation


The Taylor vortices were experimentally discovered by Taylor in 1923. But
already in 1901, Benard had found another bifurcation phenomenon for
viscous fluids. In order to explain this phenomenon, we consider here a viscous
fluid between two plates, as shown in Figure 72.6, where the temperature $T_0$
of the lower plate and the temperature $T_1$ of the upper plate satisfy the
condition
\[ T_0 > T_1. \]
If $T_0 - T_1$ is sufficiently small, then the fluid is at rest. If the temperature
difference is increased, then, at a critical value, so-called Benard cells appear
in the fluid. These cells have a hexagonal structure.
Figure 72.6 (fluid region G between the two plates, heated from below)

Figure 72.7

In experiments a pan with silicone oil is heated with hot water from below.
The fluid flow is made visible through small, evenly distributed aluminum
pieces. After reaching the critical temperature difference, hexagonal cells
appear in the pan, which are shown from above in Figure 72.7.
Benard cells correspond to a bifurcation phenomenon. Physically, they
arise by combining the gravitational force and the heat convection flow.
During the past years, physicists, chemists, and biologists have shown a great
deal of interest in these Benard cells, because one observes the formation of
a complicated structure. This is a process which frequently occurs in the
evolution of life.

72.9a. Physical Problem


We begin with the so-called Boussinesq approximation for the Navier-Stokes
equations with temperature effects. This is
\[
\begin{aligned}
\rho_0 v_t + \rho_0 (v\,\mathrm{grad})v + \mathrm{grad}\,p &= K + \eta\,\Delta v,\\
T_t + v\,\mathrm{grad}\,T &= \delta\,\Delta T,\\
\mathrm{div}\,v &= 0,
\end{aligned}
\tag{53}
\]
with the density relation
\[ \rho = \rho_0 - \alpha\rho_0 (T - T_0). \tag{54} \]
The physical motivation for these equations will be given in Section 72.10.
The quantities there have the following meaning:
$\rho_0$  constant average density;
$\rho$  density;
$v$  velocity vector;
$p$  pressure;
$T$  temperature;
$T_0$  constant average temperature;
$\eta$  viscosity;
$\nu = \eta/\rho_0$  kinematic viscosity;
$\kappa$  heat conductivity;
$c$  specific heat capacity;
$\delta = \kappa/\rho_0 c$  heat diffusion coefficient;
$\int_\Omega K\,dV$  outer force which is applied to the region $\Omega$.

The positive constant $\alpha$ in (54) must be determined experimentally. We now
consider the situation of the Benard problem. There $K$ corresponds to the
gravitational force, i.e.,
\[ K = -\rho g e_3, \tag{55} \]
where $g$ denotes the gravitational acceleration. We consider a stationary
problem, i.e., the time derivatives $v_t$ and $T_t$ in (53) are identically zero. We
choose a Cartesian $(x, y, z)$-system. The equations for the lower and upper
plate in Figure 72.6 are given by $z = 0$ and $z = h$, respectively. Hence we
require
\[ T = T_0 \ \text{ for } z = 0 \quad \text{and} \quad T = T_1 \ \text{ for } z = h. \]

Proposition 72.15. The stationary problem to (53), (54) has the solution
\[
v^* = 0, \qquad T^* = T_0 - \beta z, \qquad p^* = p_0 - g\rho_0\bigl(z + \tfrac12\alpha\beta z^2\bigr), \qquad
\rho^* = \rho_0 (1 + \alpha\beta z), \tag{56}
\]
where $\beta = (T_0 - T_1)/h$.

PROOF. Explicit computation. □
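
The explicit computation can be mirrored numerically. The sketch below checks two facts for the rest state (56): the density relation (54), and the hydrostatic balance $p^*_z = -\rho^* g$ that remains of the momentum equation in (53) when $v = 0$ and $K = -\rho g e_3$. All parameter values in `PARAMS` are illustrative assumptions, not data from the text.

```python
G_ACC = 9.81  # gravitational acceleration in m/s^2

# Illustrative parameter values (assumed): temperatures, layer height,
# average density, expansion coefficient alpha, reference pressure.
PARAMS = dict(T0=30.0, T1=20.0, h=0.01, rho0=1000.0, alpha=2.0e-4, p0=1.0e5)


def rest_state(z, T0, T1, h, rho0, alpha, p0):
    """Temperature, density and pressure of the rest state (56) at height z."""
    beta = (T0 - T1) / h
    temperature = T0 - beta * z
    density = rho0 * (1.0 + alpha * beta * z)
    pressure = p0 - G_ACC * rho0 * (z + 0.5 * alpha * beta * z * z)
    return temperature, density, pressure


def hydrostatic_residual(z, dz=1e-4):
    """dp*/dz + rho* g, approximated with a central difference."""
    _, density, _ = rest_state(z, **PARAMS)
    _, _, p_plus = rest_state(z + dz, **PARAMS)
    _, _, p_minus = rest_state(z - dz, **PARAMS)
    return (p_plus - p_minus) / (2.0 * dz) + density * G_ACC


def density_relation_residual(z):
    """rho* - (rho0 - alpha*rho0*(T* - T0)), which should vanish by (54)."""
    temperature, density, _ = rest_state(z, **PARAMS)
    expected = PARAMS["rho0"] - PARAMS["alpha"] * PARAMS["rho0"] * (temperature - PARAMS["T0"])
    return density - expected
```

Both residuals vanish up to rounding at any height between the plates, and the temperature takes the values $T_0$ and $T_1$ at $z = 0$ and $z = h$.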

The trivial solution (56) corresponds to a fluid at rest, where temperature
and pressure depend on the position. In order to give a mathematical description
of the Benard effect, we search for solutions which are perturbations of
(56), i.e., we choose the following ansatz
\[ v = v^* + \tilde v, \qquad T = T^* + \tilde T, \qquad p = p^* + \tilde p, \qquad \rho = \rho^* + \tilde\rho. \]
Moreover, we pass to dimensionless quantities through

\[ v_i = \frac{2\nu}{h}\,w_i, \qquad i = 1, 2, 3, \]

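The stationary problem (53), (54) then yields the following key equations
\[
\begin{aligned}
\Delta w_i + w_4\delta_{i3} - D_i w_5 &= N_i(w), \qquad i = 1, 2, 3,\\
\Delta w_4 + Rw_3 &= N_4(w),
\end{aligned}
\tag{57}
\]
with divergence condition
\[ \sum_{j=1}^{3} D_j w_j = 0, \tag{58} \]
where
\[ N_i(w) = \sum_{j=1}^{3} w_j D_j w_i, \quad i = 1, 2, 3, \qquad N_4(w) = P \sum_{j=1}^{3} w_j D_j w_4. \]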

The two dimensionless numbers
\[ R = \frac{g\alpha\beta}{\nu\delta}\Bigl(\frac{h}{2}\Bigr)^4, \qquad P = \frac{\nu}{\delta} \]
are called the Rayleigh number $R$ and the Prandtl number $P$. In order for Benard
cells to occur, the value of $R$ is essential, while $P$, in the following, is a fixed
but arbitrary value. In experiments Benard cells appear at the critical value
$R = 1{,}700 \pm 50$. Mathematical analysis shows that $R_{crit} = 1{,}708$ (see Problem
72.8).

72.9b. Mathematical Problem


Equations (57), (58) have the trivial solution $w_i = 0$ for all $i$. We are looking
for nontrivial solutions. In order to present the key idea in the proof of the
existence of bifurcation solutions as clearly as possible, we begin with a simple
situation, i.e., we consider solutions which are periodic in $\xi_1$ and $\xi_2$. This leads
to rectangular Benard cells, which are not observed in the experiment. Thus
our proof is somewhat academic. But the case of hexagonal cells can be treated
in a similar way (see Problem 72.7). Furthermore, we emphasize the
functional-analytic aspects of the existence proof. The same methods, which are used in
the following, can also be applied to bifurcation problems for general elliptic
systems. We consider the strip region

and the periodicity cuboid
\[ Q = \{x \in \mathbb{R}^3 : 0 \le \xi_1 \le p_1,\ 0 \le \xi_2 \le p_2,\ -1 \le \xi_3 \le 1\}, \]
where $p_1 > 0$ and $p_2 > 0$.

Problem 72.16. We are looking for five real functions $w_i$ and a real number
$R > 0$ which satisfy the following conditions:
(i) Differential equation. Equations (57) and (58) are valid on $G$.
(ii) Boundary condition.
\[ w_i = 0 \ \text{ on } \partial G, \qquad i = 1, 2, 3, 4. \tag{59} \]
(iii) Periodicity condition. For all $i$ and all $x \in G$ we have
\[ w_i(\xi_1 + p_1, \xi_2 + p_2, \xi_3) = w_i(\xi_1, \xi_2, \xi_3). \tag{60} \]
(iv) Normalization condition for the pressure $w_5$.

(61)

(v) Symmetry condition.

(62)

Here $A_j$ and $S_j$ mean antisymmetry and symmetry with respect to the variable
$\xi_j$, respectively.

72.9c. The Operator Equation

Our goal is to transform this boundary-value problem into an operator
equation
\[ B_R w = N(w), \qquad w \in X, \quad R > 0. \tag{63} \]
Thereby (63) corresponds to the system (57). The spaces $X$ and $Y$ are constructed
in such a way that the operators $B_R, N\colon X \to Y$ are continuous.
By definition, the B-space $X$ consists of all function tuples $w = (w_1, \ldots, w_5)$
which satisfy the following properties:
(a) $w_i \in C^{2,\alpha}(Q)$ for $i = 1, 2, 3, 4$ with fixed $\alpha \in\, ]0,1[$.
(b) $w_5 \in C^{1,\alpha}(Q)$.
(c) Relations (58)-(62) are valid.
The norm on $X$ is naturally given by
\[ \|w\| = \sum_{i=1}^{4} \|w_i\|_{2,\alpha} + \|w_5\|_{1,\alpha}. \]

Moreover, we introduce on $X$ the scalar product
\[ (w|u) = \int_Q \Bigl( R_0 \sum_{i=1}^{3} w_i u_i + w_4 u_4 \Bigr)\,dx. \]
Furthermore, let $Y$ be the B-space of all tuples $f = (f_1, \ldots, f_4)$ with $f_i \in C^{\alpha}(Q)$
such that the periodicity condition (60) is satisfied for all $i$. The norm on $Y$ is
given by
\[ \|f\| = \sum_{i=1}^{4} \|f_i\|_{\alpha}. \]

72.9d. The Linearized Equation

We consider the inhomogeneous linear equation
\[ B_{R_0} w = f, \qquad w \in X, \tag{64} \]
which corresponds to the linearization of (63). The following condition is
important.
(H) For fixed $R_0 > 0$, equation (64) with $f = 0$ has precisely one linearly
independent eigensolution $\bar w \in X$. We normalize this solution through
$(\bar w|\bar w) = 1$.

Lemma 72.17. Assume (H). Let $f \in Y$ be given. Equation (64) then has a solution
if and only if
\[ (\bar w|f) = 0. \tag{65} \]

The necessity of the solvability condition (65) follows immediately from (57)
after multiplication with $\bar w_i$ and integration by parts over $Q$.

A proof of the sufficiency of (65), based on deep results about linear elliptic
systems, may be found in Fife (1970). There, Fourier transformations and the
Riesz-Schauder theory are used. It is important that the so-called complementing
condition of Agmon, Douglis, and Nirenberg (1959) can be verified.
It guarantees that the necessary a priori estimates of type $C^{k,\alpha}$ are satisfied.

72.9e. Existence and Uniqueness of the Bifurcation Branch


Theorem 72.D. Assume (H). Then there exist positive numbers $s_0$, $r_0$, $\varepsilon_0$
such that for given real $s$ with $0 < |s| \le s_0$ there is a unique solution $w \in X$, $R > 0$
of the given problem (63) with
\[ (\bar w|w) = s, \]

This solution is analytic with respect to $s$, i.e.,
\[ R = R_0 + \sum_{k=1}^{\infty} \varepsilon_k s^k. \tag{66} \]

The series for $w$ converges absolutely in the B-space $X$.

Remark 72.18. The expansion coefficients in (66) can be determined from the
procedure in Corollary 8.25. At each step, one has to solve a linear elliptic
system of the form (64).

Corollary 72.19. The periods $p_1$ and $p_2$ can be chosen in such a way that the
smallest positive eigenvalue $R = R_0$ of the equation $B_R w = 0$, $w \in X$, satisfies
condition (H), and hence Theorem 72.D can be applied to this $R_0$.

PROOF OF THEOREM 72.D. We follow Zeidler (1972) and apply the main
theorem of analytic bifurcation theory (Theorem 8.A).
For this, we set $R = R_0 + \varepsilon$, and the original operator equation (63) becomes
\[ B_{R_0} w = \varepsilon Lw + N(w), \qquad w \in X \tag{67} \]
with
\[ (Lw)_i = -w_3\delta_{i4}, \qquad i = 1, 2, 3, 4. \]
We write (67) in the form
\[ F(\varepsilon, w) = 0, \qquad w \in X, \]
which implies that
\[ F_w(0,0) = B_{R_0}. \]
From Lemma 72.17 it follows that the operator $F_w(0,0)\colon X \to Y$ is Fredholm
of index zero. We construct a linear continuous functional $w^* \in X^*$ by setting
\[ \langle w^*, w \rangle = (\bar w|w) \quad \text{for all } w \in X. \]

Lemma 72.17 implies
\[ \langle w^*, F_w(0,0)w \rangle = 0 \quad \text{for all } w \in X, \]
i.e., $F_w(0,0)^* w^* = 0$.
The key bifurcation condition of Theorem 8.A is
\[ \langle w^*, F_{\varepsilon w}(0,0)\bar w \rangle \ne 0, \tag{68} \]
and explicitly, this means $(\bar w|L\bar w) \ne 0$, i.e.,
\[ -\int_Q \bar w_3 \bar w_4\,dx \ne 0. \]

Let us prove this. The equation
\[ B_{R_0}\bar w = 0 \]
corresponds to the homogeneous problem (57), (58) with $R = R_0$, and
integration by parts yields the key relation
\[
\sum_{i=1}^{4} \int_Q (\mathrm{grad}\,\bar w_i)^2\,dx = -\sum_{i=1}^{4} \int_Q \bar w_i\,\Delta\bar w_i\,dx
= (1 + R_0)\int_Q \bar w_3 \bar w_4\,dx.
\]
Now suppose that
\[ \int_Q \bar w_3 \bar w_4\,dx = 0. \]
We obtain $\mathrm{grad}\,\bar w_i = 0$ for $i = 1, \ldots, 4$, and together with the boundary
condition $\bar w_i = 0$ on $\partial G$ it follows that
\[ \bar w_i = 0 \ \text{ on } G \quad \text{for } i = 1, \ldots, 4. \]
The homogeneous equations (57) for $\bar w_i$ and the normalization condition (61)
then imply
\[ \bar w_5 = 0 \ \text{ on } G, \]
and hence $\bar w = 0$. This contradicts $(\bar w|\bar w) = 1$. Consequently, the generic
bifurcation condition (68) is satisfied.
Theorem 72.D therefore follows from Theorem 8.A. □

The proof of Corollary 72.19 may be found in Rabinowitz (1968). It is


analogous to the proof of Lemma 72.14.

72.10. Physical Motivation of the Boussinesq Approximation

As in Section 70.5, the Navier-Stokes equations for a viscous fluid, including
thermodynamical effects, have the following form. We assume that no outer
heat sources are present. In this connection, observe (70.12) to (70.16).
(i) Momentum balance
\[ \rho v_t + \rho (v\,\mathrm{grad})v - \eta\,\Delta v + (\eta - \eta')\,\mathrm{grad}\,\mathrm{div}\,v = K - \mathrm{grad}\,p. \tag{69} \]
(ii) Mass balance
\[ \rho_t + \mathrm{div}\,\rho v = 0. \tag{70} \]
(iii) Balance of inner energy
\[ \rho e_t + \rho v\,\mathrm{grad}\,e = [\tau - pI, Dv] - \mathrm{div}\,q \tag{71} \]
with the tensor of inner friction
\[ \tau = 2\eta\,Dv + (\eta' - 2\eta)(\mathrm{tr}\,Dv)I. \]
(iv) Thermodynamical equation
\[ \rho = \rho(T), \qquad e = e(T). \tag{72} \]
(v) Fourier law for the heat flow
\[ q = -\kappa\,\mathrm{grad}\,T. \tag{73} \]
Typical for the Boussinesq approximation is the fact that the temperature
difference
\[ T - T_0 \]
is regarded as small. Hence, in the Taylor expansion for the density $\rho$ and
specific inner energy density $e$, only the linear terms are considered, i.e.,
\[ \rho = \rho_0 - \rho_0\alpha (T - T_0), \qquad e = e_0 + c\,(T - T_0). \]
Relation (67.11) shows that $c = e_T$ is the specific heat capacity.
In (69)-(71), we only consider the zeroth approximation of the density, i.e.,
we set $\rho = \rho_0 = \mathrm{const}$. This implies
\[ \mathrm{div}\,v = 0 \tag{74} \]
and
\[ \rho_0 v_t + \rho_0 (v\,\mathrm{grad})v - K + \mathrm{grad}\,p - \eta\,\Delta v = 0. \tag{75} \]
The temperature dependence of the density is considered only for the gravitational
force, i.e., we set
(76)
Moreover, in the balance of inner energy (71), we neglect the inner friction,
i.e., we set $\tau = 0$. Thus we obtain
\[ T_t + v\,\mathrm{grad}\,T = \delta\,\Delta T \tag{77} \]
with $\delta = \kappa/\rho_0 c$.
Equations (74)-(77) correspond to the basic equations (53).

72.11. The Kolmogorov 5/3-Law for Energy Dissipation in Turbulent Flows

We shall use a very rough argument in order to obtain information about
turbulent flows. Our notations are:
$\eta$  viscosity;
$\rho$  density;
$\varepsilon$  total rate of energy dissipation;
$\lambda$  diameter of eddies;
$k$  wave number of eddies ($k = 2\pi/\lambda$).

It is a typical property of turbulent flows that there exist eddies of different
diameters $\lambda$, where
\[ \lambda_{\min} \le \lambda \le \lambda_{\max}. \]
One may think, for example, of clouds in the air or of nebulas in astronomy.
One finds that the large eddies tend to break down into smaller eddies. In this
way energy flows from large eddies to smaller eddies. Moreover, physicists
assume that the energy of the smallest eddies with $\lambda = \lambda_{\min}$ is transformed into
heat by friction (energy dissipation). Viscosity is of significance only for small
eddies. We define
\[ \varepsilon = \frac{\text{loss of kinetic energy by dissipation}}{\text{mass} \cdot \text{time}}. \]
This is a very important physical quantity. Note that $\varepsilon$ can be measured in
experiments; it is equal to the produced heat. We are interested here in the
distribution of $\varepsilon$ with respect to $\lambda$. To this end, we make the ansatz
\[ \varepsilon = \int_{\lambda_{\min}}^{\lambda_{\max}} \mu(\lambda)\,d\lambda, \tag{78} \]
i.e.,
\[ \int_{\lambda_{\min}}^{\lambda_0} \mu(\lambda)\,d\lambda \]
is the loss of kinetic energy per mass and time, caused by all eddies of diameter
$\lambda$ with $\lambda_{\min} \le \lambda \le \lambda_0$.

Kolmogorov Law 72.20. We have
\[ \mu(\lambda) = C(\lambda/\lambda_{\min})\,\frac{\eta}{\rho}\,\varepsilon^{2/3}\lambda^{-7/3}, \tag{79} \]
where $C$ is dimensionless.

If $\lambda$ is near $\lambda_{\min}$, then we can replace $C$ with a constant. Frequently, one uses
the so-called wave number of eddies $k = 2\pi/\lambda$ and makes the ansatz
\[ \varepsilon = \frac{\eta}{\rho} \int_{k_{\min}}^{k_{\max}} E(k)\,k^2\,dk. \]
Then, for large $k$ near $k_{\max}$, one obtains
\[ E(k) = \mathrm{const} \cdot \varepsilon^{2/3} k^{-5/3}. \tag{80} \]

This is called the Kolmogorov 5/3-law.


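Let us motivate (79). It is natural to assume that $\mu(\lambda)$ depends only on $\eta$, $\rho$,
$\varepsilon$, and $\lambda$. We therefore make the ansatz
\[ \mu = C\,\eta^a \rho^b \varepsilon^c \lambda^d. \]
The dimensions are
\[ [\eta] = \mathrm{kg/(m\,s)}, \quad [\rho] = \mathrm{kg/m^3}, \quad [\varepsilon] = \mathrm{m^2/s^3}, \quad [\lambda] = \mathrm{m}, \quad [\mu] = [\varepsilon/\lambda] = \mathrm{m/s^3}, \]
and comparison of dimensions yields
\[ b = -a, \qquad c = 1 - a/3, \qquad d = -1 - 4a/3. \]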
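
The comparison of dimensions can be reproduced in exact arithmetic. The sketch below encodes each quantity by its exponents of (kg, m, s), solves the three balance equations for $b$, $c$, $d$ in terms of $a$, and verifies the result; the dictionary keys and function names are illustrative choices.

```python
from fractions import Fraction

# Exponents of (kg, m, s) for each quantity in the ansatz.
DIM = {
    "eta": (1, -1, -1),   # kg / (m s)
    "rho": (1, -3, 0),    # kg / m^3
    "eps": (0, 2, -3),    # m^2 / s^3
    "lam": (0, 1, 0),     # m
}
MU_DIM = (0, 1, -3)       # [mu] = [eps / lam] = m / s^3


def exponents(a):
    """Solve the kg, s and m balances for b, c, d given the exponent a of eta."""
    a = Fraction(a)
    b = -a                      # kg:  a + b = 0
    c = (3 - a) / 3             # s:  -a - 3c = -3
    d = 1 + a + 3 * b - 2 * c   # m:  -a - 3b + 2c + d = 1
    return b, c, d


def dimension_of_product(a, b, c, d):
    """Dimension vector of eta^a * rho^b * eps^c * lam^d."""
    powers = {"eta": a, "rho": b, "eps": c, "lam": d}
    return tuple(sum(DIM[q][i] * powers[q] for q in DIM) for i in range(3))
```

For `a = 1` this gives `b = -1`, `c = 2/3` and `d = -7/3`, i.e. the combination $(\eta/\rho)\,\varepsilon^{2/3}\lambda^{-7/3}$, which is also the exponent of $\varepsilon$ appearing in (80).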
Finally, let us motivate that $a = 1$. The energy dissipation in the region $G$ is
equal to the time derivative of the kinetic energy in the region $G$, i.e.,
\[ \frac{d}{dt} \int_G \tfrac12 \rho v^2\,dx = \int_G \rho v v_t\,dx \]
if the density $\rho$ is constant. The Navier-Stokes equations yield $\rho v_t = \eta\,\Delta v + \cdots$.
Hence it is natural to postulate that the energy dissipation depends on $\eta^a$ with
$a = 1$.
Formula (80) is obtained the same way.
Finally, we want to apply a typical argument used by physicists in order to
get some information about the magnitude of $\lambda_{\min}$. We make the ansatz
\[ \lambda_{\min} = \mathrm{const} \cdot \eta^a \rho^b \varepsilon^c. \]
Comparison of dimensions yields $a = -b = \tfrac34$, $c = -\tfrac14$. Physical experience
shows that dimensionless constants are not too small and not too large. Hence,
we assume that
\[ \lambda_{\min} \approx (\eta/\rho)^{3/4}\,\varepsilon^{-1/4}. \]

72.12. Velocity in Turbulent Flows


In a turbulent flow, the velocity field $v = v(x,t)$ is assumed to be a stochastic
process. What can be measured in experiments are the mean velocity vector
\[ \bar v(x,t) \]
at the point $x$ and time $t$, and the dispersions
\[ (\Delta v_i)^2 = \overline{u_i(x,t)^2}, \qquad i = 1, 2, 3 \]
for the velocity components $v_i$. Here we set $u_i = v_i - \bar v_i$. The bar denotes
expectation values. The mean density of kinetic energy is equal to
\[ \tfrac12 \rho\,\overline{v(x,t)^2}. \]
In order to describe the correlation between the velocity vectors at different
points, one uses the correlation coefficients
\[ C_{ij}(x, x+h; t) = \frac{\overline{u_i(x,t)\,u_j(x+h,t)}}{\Delta v_i(x,t)\,\Delta v_j(x+h,t)}, \qquad i, j = 1, 2, 3. \]

We have $0 \le |C_{ij}| \le 1$. The larger $|C_{ij}|$ is, the larger the correlation between
$v_i(x,t)$ and $v_j(x+h,t)$ becomes. If $v_i(x,t)$ and $v_j(x+h,t)$ are independent of
each other, then $C_{ij} = 0$.
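
The correlation coefficient can be estimated from measured samples. The sketch below replaces the expectation values by sample means over repeated measurements, which is an assumption about how the averages are obtained, and uses synthetic Gaussian data generated with a fixed seed as a stand-in for measurements.

```python
import math
import random


def correlation_coefficient(v_a, v_b):
    """Sample estimate of mean(u_a * u_b) / (dispersion_a * dispersion_b)."""
    n = len(v_a)
    mean_a = sum(v_a) / n
    mean_b = sum(v_b) / n
    u_a = [x - mean_a for x in v_a]          # fluctuations u = v - mean(v)
    u_b = [x - mean_b for x in v_b]
    covariance = sum(x * y for x, y in zip(u_a, u_b)) / n
    disp_a = math.sqrt(sum(x * x for x in u_a) / n)
    disp_b = math.sqrt(sum(y * y for y in u_b) / n)
    return covariance / (disp_a * disp_b)


# Synthetic velocity samples (hypothetical data): two independent signals
# and a third one that shares half of its fluctuation with the first.
rng = random.Random(0)
v_here = [1.0 + rng.gauss(0.0, 0.2) for _ in range(20000)]
v_far = [1.0 + rng.gauss(0.0, 0.2) for _ in range(20000)]
v_near = [0.5 * (a + b) for a, b in zip(v_here, v_far)]
```

Here `correlation_coefficient(v_here, v_here)` is 1 up to rounding, the value for the independent pair `v_here`, `v_far` is close to 0, and the value for `v_here`, `v_near` lies in between.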
As an important example, let us consider a one-dimensional flow, i.e.,
V = Vl@el and X= eel+" ..

Kolmogorov Law 72.21. If $\lambda$ is not too large and not too close to a critical
distance $\lambda_{\min}$, then
(81)

Here $\varepsilon$ is the total rate of energy dissipation as used in the previous section.

In order to motivate (81) let $\lambda_{\min}$ be the smallest eddy size. Physicists assume
that the viscosity $\eta$ does not play any role if the eddy size satisfies $\lambda \gg \lambda_{\min}$.
Hence we make the ansatz $\mathrm{const} \cdot \rho^a \varepsilon^b \lambda^c$. Comparison of dimensions yields
(81).

PROBLEMS

72.1.* Functional analytic decomposition of vector fields. Prove Lemma 72.1.


Hint: A short and elegant proof can be given by using the deep result of
de Rham of Problem 72.2. See Temam (1977, M), p. 15 and p. 19. A direct
proof, which uses the solution of the second boundary-value problem for the
Laplace equation, and further material can be found in Girault and Raviart
(1981, L), p. 31.

72.2.* Pressure as a mathematical distribution. Show:
(i) Let $G$ be an open set in $\mathbb{R}^3$ and $g = (g_1, g_2, g_3)$, where the $g_i$ are
distributions, i.e., elements of $\mathcal{D}'(G)$ (see $A_2$(64)). Then
\[ g = \mathrm{grad}\,p \tag{82} \]
has a solution $p \in \mathcal{D}'(G)$ if and only if
\[ \sum_{i=1}^{3} g_i(v_i) = 0 \tag{83} \]
holds for all $v_i \in C_0^{\infty}(G)$, $i = 1, 2, 3$ with $\mathrm{div}\,v = 0$.
(ii) If the set $G$ satisfies the regularity assumption (2) and if $g_i \in W_2^{-1}(G)$ for
all $i$, then equation (82) has a solution $p \in L_2(G)$ if and only if (83) is
satisfied. Moreover, up to an additive constant, $p$ is uniquely determined.
As usual, $W_2^{-1}(G)$ denotes the dual space to the Sobolev space $\mathring{W}_2^1(G)$.
Hint: See Temam (1977, M), p. 14. Statement (i), which, in an analogous
form, holds in $\mathbb{R}^n$, is a special case of a general theorem of de Rham about
differential forms with distributions as coefficients (see de Rham (1960, M),
p. 114). An elementary proof of (ii) can be found in Temam (1977, M), p. 19.
Let $g_i \in L_2(G)$ for all $i$. Then the necessity of condition (83) follows by using
integration by parts. In fact, equation (82) implies
\[ \int_G gv\,dx = \int_G (\mathrm{grad}\,p)\,v\,dx = -\int_G p\,\mathrm{div}\,v\,dx = 0 \]
for all $v_i \in C_0^{\infty}(G)$, $i = 1, 2, 3$ with $\mathrm{div}\,v = 0$.

72.3.* Very weak solutions for the nonstationary problem. As in Chapter 30, use the
Galerkin method to show that the generalized nonstationary problem for
the Navier-Stokes equations of Section 72.5 has a solution
\[ v \in L_2(0,T;V) \]
for every
\[ v_0 \in H, \]

with $Y = L_2(G, V_3)$. In contrast to Theorem 72.B, the uniqueness for these
very weak solutions has not yet been verified.
Hint: See Temam (1977, M), p. 282. There one also finds uniqueness results
for weakly regular solutions. However, the existence in this case remains
open.
72.4.* Positivity of Green's functions. We set
\[ Lu = -Au'' + Bu' + Cu \]
and consider the two boundary-value problems
\[ Lu = f \ \text{ on } [a,b], \qquad u(a) = u(b) = 0, \]
and
\[ L^2u = f \ \text{ on } [a,b], \qquad u(a) = u'(a) = u(b) = u'(b) = 0, \]
where $-\infty < a < b < \infty$. For the coefficients we assume
\[ A, B, C \in C^4[a,b] \quad \text{and} \quad A, C > 0 \ \text{ on } [a,b]. \]
Prove: Green's functions for $L$ and $L^2$ are positive on $]a,b[\, \times\, ]a,b[$.
Hint: Use analogous arguments as in Problem 7.2g. See Kirchgassner
(1961), p. 18. More general results may be found in Karlin (1967, M), p. 534,
in connection with the theory of oscillating kernels of Krein and Gantmacher.
72.5.* Integral equations with positive kernels. Let $K\colon [a,b] \times [a,b] \to \mathbb{R}$ be
continuous and positive on $]a,b[\, \times\, ]a,b[$ for $-\infty < a < b < \infty$.
Show: The integral equation
\[ u(r) = \mu \int_a^b K(r,s)\,u(s)\,ds \]
has a simple characteristic number $\mu_0 > 0$ of smallest absolute value. The
corresponding eigenfunction is positive on $]a,b[$.
Hint: This is a sharpening of Example 7.30. See Jentzsch (1912), p. 248.
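
The statement can be observed numerically on a concrete kernel. The sketch below uses the Green's function of $-u''$ on $[0,1]$ with zero boundary values, $K(r,s) = \min(r,s)\,(1 - \max(r,s))$, as an example kernel (an assumption made for illustration; its smallest characteristic number is known to be $\pi^2$), discretizes the integral with the midpoint rule, and applies power iteration.

```python
import math


def kernel(r, s):
    """Green's function of -u'' with u(0) = u(1) = 0; positive on ]0,1[ x ]0,1[."""
    return min(r, s) * (1.0 - max(r, s))


def smallest_characteristic_number(n=50, iterations=200):
    """Power iteration for the midpoint-rule discretization of u = mu * K u.

    Returns (mu0, nodes, u) where u is normalized to have maximum 1.
    """
    h = 1.0 / n
    nodes = [(i + 0.5) * h for i in range(n)]
    u = [1.0] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        v = [h * sum(kernel(r, s) * us for s, us in zip(nodes, u)) for r in nodes]
        eigenvalue = max(v)              # max(u) = 1, so max(K u) tends to the top eigenvalue
        u = [x / eigenvalue for x in v]
    return 1.0 / eigenvalue, nodes, u
```

The returned characteristic number approximates $\pi^2$, and all components of the computed eigenvector are positive, in line with the positivity of the eigenfunction $\sin(\pi r)$ on $]0,1[$.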
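72.6. Proof of (52).
Solution: We set
\[ \alpha = \alpha_{n,0}, \qquad \beta = \beta_{n,0}. \]
From
\[ L^2\alpha = \lambda ka\beta \quad \text{and} \quad L\beta = \lambda kb\alpha \]
it follows, after integration by parts over $[1,R]$, that
\[
\Gamma := \lambda k \int_1^R (a + b)\,\alpha\beta\,r\,dr
= \int_1^R (\alpha L^2\alpha + \beta L\beta)\,r\,dr
\ge \int_1^R (k^2\beta^2 + k^4\alpha^2)\,r\,dr =: \Gamma_1.
\]
Because of the positivity of $a$, $b$, $\alpha$, $\beta$, we have $\Gamma > 0$. From $2\alpha\beta \le k\alpha^2 + \beta^2/k$
it follows that
\[ \Gamma \le \frac{\lambda\Gamma_1}{2k^2} \max_{1 \le r \le R} \bigl(a(r) + b(r)\bigr). \]

Since $\Gamma_1 \le \Gamma$ we thus obtain (52).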


72.7.* Benard problem with hexagonal cells. Give an existence proof for this case.
Hint: See Judovic (1967). The proof is analogous to the proof of Section
72.9, but the treatment of the linearized problem is now more complicated.
The following question is important: Why does nature choose hexagonal
cells? The answer must have something to do with stability analysis. Up to
now there exists no general stability analysis for the complete nonlinear
problem. Partial results can be found in Sattinger (1977), where group theory
and bifurcation theory are combined. In this direction, see also Sattinger
(1979, L), (1980, S), Knightly and Sather (1985).
72.8.* Weak stability analysis for the Benard problem. Starting with the
time-dependent equations (53), find solutions of the linearized problem in the form
of two-dimensional waves.
Show: There exists a critical Rayleigh number $R_{crit} = 1{,}708$ with the
following property. For $R < R_{crit}$ these waves are stable, while for $R > R_{crit}$
they become unstable.
Hint: See Chandrasekhar (1961, M) and Ebeling (1976, M). This result
is in good correspondence with experiment. Benard cells occur for
$R = 1{,}700 \pm 50$.
Many concrete stability results can be found in Joseph (1976, M).
72.9. Boundary layers and singular perturbation problems.
72.9a. A mathematical model. We consider the differential equation
$$\varepsilon y'' + y' = b \tag{84}$$
with the small parameter $\varepsilon > 0$. The general solution is
$$y = bx + C + D e^{-x/\varepsilon}. \tag{85}$$
If $\varepsilon = 0$, then the general solution of (84) is
$$y = bx + C. \tag{86}$$

Figure 72.8 (boundary layer)


The first key result is that the solutions (85) and (86) differ essentially only
in a thin boundary layer near the boundary point $x = 0$. For example, if we
add the boundary conditions
$$y(0) = 0, \qquad y(1) = 1 \tag{87}$$
to equation (84), then we obtain the unique solution
$$y_\varepsilon = bx + (1 - b)\,\frac{1 - e^{-x/\varepsilon}}{1 - e^{-1/\varepsilon}} \tag{88}$$
(Fig. 72.8). Problem (84), (87) is called a singular perturbation problem, since
the term "$\varepsilon y''$" with highest derivative disappears as $\varepsilon \to 0$, i.e., the type of
the problem changes drastically. One may regard "$\varepsilon y''$" as a small viscosity
term. In fact, singular perturbation problems occur very frequently in the
natural sciences, and the existence of boundary layers is typical for such
problems.
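
The boundary-layer behavior can be observed numerically. The following Python sketch evaluates the solution of the boundary-value problem for $\varepsilon y'' + y' = b$, $y(0) = 0$, $y(1) = 1$, and compares it with the solution $bx + (1 - b)$ of the reduced equation $y' = b$ that keeps the condition at $x = 1$. The function names, the value b = 0.5, and the sample points are assumptions chosen only for illustration.

```python
import math


def y_eps(x, eps, b):
    """Solution of eps*y'' + y' = b with y(0) = 0 and y(1) = 1."""
    # expm1(t) = exp(t) - 1, so the quotient below equals
    # (1 - exp(-x/eps)) / (1 - exp(-1/eps)) without cancellation errors.
    return b * x + (1.0 - b) * math.expm1(-x / eps) / math.expm1(-1.0 / eps)


def y_outer(x, b):
    """Solution of the reduced equation y' = b (eps = 0) with y(1) = 1."""
    return b * x + (1.0 - b)


if __name__ == "__main__":
    b = 0.5  # hypothetical right-hand side
    for eps in (0.1, 0.01, 0.001):
        # Distance between the two solutions at x = 0, eps, 10*eps and 0.5.
        gaps = [abs(y_eps(x, eps, b) - y_outer(x, b))
                for x in (0.0, eps, 10.0 * eps, 0.5)]
        print(eps, ["%.2e" % g for g in gaps])
```

At $x = 0$ the two functions always differ by $1 - b$, while at a fixed interior point the difference decays like $e^{-x/\varepsilon}$; the region where they differ noticeably shrinks proportionally to $\varepsilon$.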
72.9b. Rescaling. If we set
$$\xi = x/\varepsilon,$$
then we obtain
$$y_\varepsilon = (1 - b)(1 - e^{-\xi}) + O(\varepsilon), \qquad \varepsilon \to 0.$$

This yields the second key result: The behavior of the solution in the
boundary layer ($0 \le \xi \le 1$) can be described best by a suitable rescaling.
In the following we describe a general method to obtain the correct
rescaling by just using the original equation. We make the ansatz
$$\xi = x/\varepsilon^k.$$
From (84) follows
$$\varepsilon^{1-2k}\,\frac{d^2 y}{d\xi^2} + \varepsilon^{-k}\,\frac{dy}{d\xi} = b.$$

We now postulate that $\varepsilon^{1-2k} = \varepsilon^{-k}$. Hence $k = 1$ and
$$\frac{d^2 y}{d\xi^2} + \frac{dy}{d\xi} = \varepsilon b.$$

This is a regular perturbation problem.


Figure 72.9 (labels: y-axis, outer flow)

72.9c. The Prandtl boundary-layer equation (1904). We want to apply the preceding
argument to the Navier-Stokes equations and consider a planar incompressible
viscous flow in the upper half-plane $G = \{(x, y) \in \mathbb{R}^2 : y > 0\}$ with
velocity vector $(a, b)$ (Fig. 72.9).
Suppose that x, y, a, b are dimensionless. According to (13) the dimensionless
Navier-Stokes equations are
$$\begin{aligned} a_t + a a_x + b a_y &= -p_x + \varepsilon \Delta a,\\ b_t + a b_x + b b_y &= -p_y + \varepsilon \Delta b,\\ a_x + b_y &= 0 \end{aligned} \tag{89}$$
on G with the boundary conditions
$$a = b = 0 \quad \text{if } y = 0. \tag{90}$$
Here we set
$$\varepsilon = 1/Re.$$
We assume that the Reynolds number Re is large, i.e., $\varepsilon$ is small. If we set
$\varepsilon = 0$ in (89), then we obtain the Euler equations for inviscid fluids. But in
the case of an inviscid fluid, we only have the boundary condition
$$b = 0 \quad \text{if } y = 0. \tag{91}$$
This contradicts (90). In 1904, Prandtl proposed the following model for
large Reynolds numbers.
(i) Outside a thin boundary layer of thickness $\varepsilon^{1/2} = (Re)^{-1/2}$ one uses
equation (89) with $\varepsilon = 0$.
(ii) Inside the boundary layer one uses the so-called Prandtl boundary layer
equations
$$\begin{aligned} a_t + a a_x + b a_y &= -p_x + \varepsilon a_{yy},\\ p_y &= 0,\\ a_x + b_y &= 0, \end{aligned} \tag{92}$$
with the boundary condition
$$a = b = 0 \quad \text{if } y = 0.$$
Motivate this model.

Solution: The idea is that the essential effects occur in the direction of the
y-axis. Therefore we use the following rescaling
$$X = x, \qquad Y = y/\varepsilon^k, \qquad A = a, \qquad B = b/\varepsilon^m.$$
Formula (89) implies
$$\begin{aligned} A_t + A A_X + \varepsilon^{m-k} B A_Y &= -p_X + \varepsilon A_{XX} + \varepsilon^{1-2k} A_{YY},\\ \varepsilon^{m+k} B_t + \varepsilon^{m+k} A B_X + \varepsilon^{2m} B B_Y &= -p_Y + \varepsilon^{m+k+1} B_{XX} + \varepsilon^{m-k+1} B_{YY},\\ A_X + \varepsilon^{m-k} B_Y &= 0. \end{aligned} \tag{93}$$
We assume that the Y-derivatives of A are essential. From the first line of
(93), a natural choice is $\varepsilon^{m-k} = \varepsilon^{1-2k} = 1$. Hence
$$m = k = \tfrac{1}{2}.$$
Letting $\varepsilon = 0$ in (93) and going back to the original variables we obtain (92).
Observe that it is very important to use dimensionless quantities and the
rescaling argument. Otherwise one will not find a clear motivation for the
Prandtl boundary-layer equation.
72.9d.* Existence and uniqueness theorems for the Prandtl boundary layer equations.
Study Oleinik (1968). Information about the qualitative behavior of the
solutions may be found in Nickel (1958), (1963), where differential inequali-
ties play an important role. See also Walter (1964, M).
72.10.* Generic finiteness of the solution set. We consider the stationary
Navier-Stokes equations
$$\begin{aligned} \rho (v\,\mathrm{grad}) v - \eta \Delta v &= K - \mathrm{grad}\, p && \text{in } G,\\ \mathrm{div}\, v &= 0 && \text{in } G,\\ v &= 0 && \text{on } \partial G. \end{aligned}$$
Study Foias and Temam (1977). There it is shown that, for every fixed $\rho$ and
$\eta$, the solution set is finite for "almost all" forces K. The proof is based on
the Smale theorem (Theorem 4.K).
72.11. Partial regularity of the solutions to the nonstationary problem.
72.11a. Hausdorff measure. Let A be a subset of $\mathbb{R}^n$. By definition, the m-dimensional
Hausdorff measure
$$H^m(A) = \inf t$$
of A is the infimum of the set of all t such that $0 \le t \le \infty$ and for every $\varepsilon > 0$
there exists a countable covering C of the set A with
$$\sum_{S \in C} \left( \frac{\operatorname{diam}(S)}{2} \right)^m \le t$$
and $\operatorname{diam}(S) < \varepsilon$ for all $S \in C$.
If m is an integer and $V_m$ denotes the volume of the m-dimensional unit
ball, then $V_m H^m(A)$ agrees with the surface area of smooth m-dimensional
surfaces A. See Federer (1969, M).
72.11b.** Navier-Stokes equations. Let v = v(x, t) be a solution of the nonstationary
Navier-Stokes equations. A point (x, t) is called singular if and only if
v is not essentially bounded in any neighborhood of (x, t) in the sense of the
space $L_\infty$. Study Caffarelli, Kohn, and Nirenberg (1982). There it is shown
that for "suitable weak" solutions v of the nonstationary Navier-Stokes
equations on an open set in space-time, the associated set of singular points
has one-dimensional Hausdorff measure zero.
This shows that singular points are rare. For example, they cannot form
a regular curve.
72.12.* Statistical solutions of the Navier-Stokes equations. Let S be the set of all
solutions v = v(x, t) of the nonstationary Navier-Stokes equations. Let $S_0$
be the set of all possible initial values $v_0 = v_0(x)$, where $v_0(x) = v(x, 0)$.
Suppose we are given a probability measure $p_0$ on $S_0$, i.e.,
$$p_0(A) = \text{probability of } v_0 \in A.$$
We are looking for a probability measure p on S, i.e.,
$$p(B) = \text{probability of } v \in B,$$
where p and $p_0$ are compatible, i.e.,
$$p(v : v_0 \in A) = p_0(A).$$
We call p a statistical solution of the Navier-Stokes equations. The existence
of statistical solutions and their properties are studied in detail in Visik and
Fursikov (1980, M). The main idea is to use a Galerkin method and a
theorem of Prohorov on weak compactness of sets in measure spaces, where
the measures are defined over metric spaces.
72.13. Turbulence, Feigenbaum bifurcation, chaos, and universality theory. See
Problem 17.13.
72.14. Turbulence and strange attractors.
72.14a. The Henon attractor. Let a = 1.4 and b = 0.3. Compute
$$x_{n+1} = y_n + 1 - a x_n^2, \qquad y_{n+1} = b x_n \tag{94}$$
for $n = 0, 1, \ldots, 10^5$ on a computer. For a bad choice of the initial values
$x_0$, $y_0$, the sequence will tend to infinity. For a good choice of $x_0$, $y_0$, the
sequence will rapidly approach the Henon attractor (Figure 72.10 shows
this schematically). One will observe that the sequence behaves in a strange
way.

Figure 72.10 (schematic picture of the Henon attractor in the (x, y)-plane)

Figure 72.11

Roughly speaking, an attractor A for a dynamical system is a set which
attracts the trajectories in a neighborhood of A (Fig. 72.11 ). Such an attrac-
tor is called strange if the trajectories depend sensitively on the initial data.
If one chooses a = 1.3 and b = 0.3, then the strange attractor disappears
and an attractor of period 7 appears.
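
The computation asked for in this problem can be sketched in a few lines of Python. The function name `henon_orbit`, the escape threshold, and the particular initial values are assumptions made for illustration; only the map itself and the parameters a = 1.4, b = 0.3 are taken from the text.

```python
def henon_orbit(x0, y0, a=1.4, b=0.3, n_steps=10**5, escape=1.0e6):
    """Iterate x -> y + 1 - a*x^2, y -> b*x; stop once |x| exceeds `escape`."""
    x, y = x0, y0
    orbit = [(x, y)]
    for _ in range(n_steps):
        # Tuple assignment uses the old x when computing the new y.
        x, y = y + 1.0 - a * x * x, b * x
        orbit.append((x, y))
        if abs(x) > escape:
            break
    return orbit


def max_separation(orbit_1, orbit_2):
    """Largest distance in the x-coordinate between two orbits."""
    return max(abs(p[0] - q[0]) for p, q in zip(orbit_1, orbit_2))


if __name__ == "__main__":
    good = henon_orbit(0.0, 0.0)          # assumed "good" initial values
    bad = henon_orbit(3.0, 3.0)           # assumed "bad" initial values
    nearby = henon_orbit(1.0e-10, 0.0)    # tiny perturbation of the good start
    print("good start stays bounded:", max(abs(x) for x, _ in good) < 2.0)
    print("bad start escapes after", len(bad) - 1, "steps")
    print("separation of nearby orbits:", max_separation(good[:200], nearby[:200]))
```

Plotting the points of `good` (after discarding a short transient) should reproduce the schematic picture of the attractor, and the growth of the separation between the two nearby orbits illustrates the sensitive dependence on the initial data.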
72.14b. The Lorenz attractor and meteorology. Compute the trajectory of the initial-
value problem
$$\dot{x} = a(y - x), \qquad \dot{y} = bx - y - xz, \qquad \dot{z} = xy - cz, \tag{95}$$
$$x(0) = y(0) = z(0) = 0$$
with a = 10, b = 28, c = 8/3. In a Cartesian (x, y, z)-system, the trajectory
makes one loop to the right, then a few loops to the left, then to the right,
and so on in an irregular fashion. In fact, the trajectory approaches a strange
attractor which lies in a neighborhood of the origin. See Guckenheimer and
Holmes (1983), p. 92.
This system was considered by Lorenz (1963) as a simple Galerkin system
for the heat convection in a two-dimensional horizontal fluid layer heated
from below. The components of (95) correspond to the Fourier coefficients
of velocity and temperature. Lorenz was interested in this system as a model
of the heat convection in the atmosphere of the earth. Thus we expect to
have turbulence for critical data and the Lorenz attractor reflects this. Since
the trajectories depend sensitively on the initial data, this provides some
theoretical excuse for the unreliability of weather forecasting.
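
A trajectory of the Lorenz system with a = 10, b = 28, c = 8/3 can be computed with a classical fourth-order Runge-Kutta scheme, as in the following Python sketch. The step size, the number of steps, and the function names are assumptions. Note that the origin is an equilibrium of the system, so a numerical trajectory started exactly at the origin stays there; the sketch therefore starts, as an assumption, at the nearby point (0, 1, 0).

```python
def lorenz(state, a=10.0, b=28.0, c=8.0 / 3.0):
    """Right-hand side of the Lorenz system."""
    x, y, z = state
    return (a * (y - x), b * x - y - x * z, x * y - c * z)


def rk4_step(f, state, h):
    """One classical Runge-Kutta step of size h for state' = f(state)."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(
        s + h / 6.0 * (p + 2.0 * q + 2.0 * r + w)
        for s, p, q, r, w in zip(state, k1, k2, k3, k4)
    )


def trajectory(state, h=0.01, n_steps=5000):
    """List of states obtained by repeated Runge-Kutta steps."""
    points = [state]
    for _ in range(n_steps):
        state = rk4_step(lorenz, state, h)
        points.append(state)
    return points


def sign_changes(values):
    """Number of sign changes in a sequence of numbers."""
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0.0)


if __name__ == "__main__":
    points = trajectory((0.0, 1.0, 0.0))  # assumed initial value near the origin
    print("changes between left and right loops:",
          sign_changes([p[0] for p in points]))
```

Each sign change of the x-component corresponds to a passage from loops on one side to loops on the other side; plotting `points` in the (x, z)-plane shows the familiar two-lobed picture.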
72.14c. Continuous frequency spectrum, strange attractors, and turbulence. Let f(t)
be a time-dependent physical quantity (e.g., the velocity component at a
fixed point in a fluid). Physicists analyze such quantities by using the Fourier
integral

$$f(t) = \int_{-\infty}^{\infty} g(\omega)\, e^{-i\omega t}\, d\omega. \tag{96}$$

This is the superposition of special periodic motions
$$g(\omega)\, e^{-i\omega t},$$
where $\omega$ is the angular frequency and $|g(\omega)|$ is the amplitude. If the frequency
diagram has only a finite number of sharp peaks such as in Figure 72.12(a),
then this corresponds to a special quasi-periodic behavior, i.e.,
$$f(t) = \sum_{k=1}^{N} g_k\, e^{-i\omega_k t}. \tag{97}$$

Figure 72.12 (frequency diagrams: (a) sharp peaks at $\omega_1$, $\omega_2$; (b) continuous spectrum)

By definition, a general quasi-periodic behavior is described by
$$f(t) = \sum_{n_1, \ldots, n_N} g_{n_1 \ldots n_N}\, e^{-i(n_1\omega_1 + \cdots + n_N\omega_N)t},$$
where the sum is taken over all integers $n_1, \ldots, n_N$. This corresponds to
sharp peaks at $n_1\omega_1 + \cdots + n_N\omega_N$ in the frequency diagram. If $g(\omega) \ne 0$ for
a continuum of angular frequencies $\omega$, then one speaks of a continuous
spectrum (Fig. 72.12(b)).
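
The connection between a finite sum of periodic motions and sharp peaks in the frequency diagram can be checked on a small discrete example. The Python sketch below samples a signal with two angular frequencies on a uniform grid over one period $2\pi$ and recovers the amplitudes with a plain discrete Fourier sum. The integer frequencies 3 and 7 and the amplitudes 1 and 0.5 are assumptions, chosen as integers so that the peaks fall exactly on the discrete frequency grid; for incommensurable frequencies a finite sample would show slightly smeared peaks.

```python
import cmath
import math

N_SAMPLES = 64
OMEGAS = (3, 7)          # assumed angular frequencies
AMPLITUDES = (1.0, 0.5)  # assumed coefficients g_k


def signal(t):
    """f(t) = sum_k g_k * exp(-i * omega_k * t)."""
    return sum(g * cmath.exp(-1j * w * t) for g, w in zip(AMPLITUDES, OMEGAS))


def fourier_coefficients(samples):
    """c_m = (1/N) * sum_j f(t_j) * exp(+i*m*t_j) with t_j = 2*pi*j/N."""
    n = len(samples)
    return [
        sum(s * cmath.exp(2j * math.pi * m * j / n) for j, s in enumerate(samples)) / n
        for m in range(n)
    ]


samples = [signal(2.0 * math.pi * j / N_SAMPLES) for j in range(N_SAMPLES)]
spectrum = [abs(c) for c in fourier_coefficients(samples)]
# Indices at which the amplitude is not numerically zero.
peaks = [m for m, amplitude in enumerate(spectrum) if amplitude > 1e-9]

if __name__ == "__main__":
    print("peaks at angular frequencies:", peaks)
    print("amplitudes:", [round(spectrum[m], 6) for m in peaks])
```

All coefficients except those at the two chosen frequencies vanish up to rounding errors, which is the discrete analogue of a frequency diagram with finitely many sharp peaks.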
The classical turbulence theory, due to Landau and Hopf, was based on
the following idea. There exists a sequence of critical Reynolds numbers
$$Re_1 < Re_2 < Re_3 < \cdots.$$
The velocity vector has the quasi-periodic structure
$$v(x, t) = \sum_{n_1, \ldots, n_N} v_{n_1 \ldots n_N}(x)\, e^{-i(n_1\omega_1 + \cdots + n_N\omega_N)t},$$
where the number of angular frequencies $\omega_1, \ldots, \omega_N$ increases at the critical
Reynolds numbers. Those critical Reynolds numbers $Re_k$ converge to a limit
$Re_{turb}$, at which turbulence occurs. Turbulence, in this picture, corresponds
to quasi-periodic motions with a very large number of basic frequencies,
which simulates a continuous frequency spectrum. Schematically this
corresponds to Figure 72.13(a).

Figure 72.13 (panels (a) and (b))

In 1971, Ruelle and Takens proposed a completely different approach to
turbulence. They assumed that strange attractors are responsible in many
cases for the occurrence of turbulence. Mathematically, it is possible that a
quasi-periodic motion breaks down and a strange attractor occurs. Strange
attractors correspond to a continuous frequency spectrum. Ruelle and
Takens therefore predicted a situation as shown in Figure 72.13(b), where
a continuous spectrum appears very rapidly. Perhaps this can be verified
by using delicate physical experiments (see Swinney and Gollub (1978)).

References to the Literature

Classical work on the existence theory for inviscid fluids: Lichtenstein (1929, M).
Recent existence proofs: Majda (1984, L), Kato and Lai (1984), DiPerna and
Majda (1987).
Basic papers on the existence theory for viscous fluids: Leray (1934a), Hopf (1951).
Classical works on the existence theory for viscous fluids: Odquist (1930), Leray
(1933), (1934), (1934a), Hopf (1950) (Burgers equation and modeling of turbulence),
(1951) (initial-value problem for the Navier-Stokes equations), (1952) (statistical
hydrodynamics), Ladyzenskaja (1959), (1970, M), Finn (1959), (1961), (1965, S), Serrin
(1963), Fujita and Kato (1964), Judovic (1966) and Velte (1966) (Taylor problem),
Judovic (1967) and Rabinowitz (1968) (Benard problem), Ladyzenskaja and Solonnikov
(1977) and Solonnikov (1984) (unbounded regions).
Introduction to viscous flow from the physical point of view: Prandtl (1949, M),
Landau and Lifšic (1962, M), Vol. VI (standard work).
Introduction to the Navier-Stokes equations from the mathematical point of view:
Temam (1977, M).
Monographs: Ladyzenskaja (1970), Temam (1977), Visik and Fursikov (1980) (sta-
tistical solutions), Girault and Raviart (1981), (1986), Telionis (1981), von Wahl (1985).
Numerical methods: Temam (1977, M), Chorin (1973), (1977), (1978), (1982), Girault
and Raviart (1981, L), (1986, M), Thomasset (1981, M), Glowinski (1981, L), (1983, S),
(1984, M), Fortin and Glowinski (1983), Holt (1984, M), Peyret (1985, M), Sod (1985),
Vols. 1, 2, Chavent (1986) (finite elements and reservoir simulation).
Numerical methods on supercomputers: Murman (1985, P).
Numerical weather prediction: Haltiner and Williams (1980, M).
Recent trends: Temam (1983, S), Berkeley (1983, P), (1986, P), Ruelle (1983, S),
Constantin, Foias, and Temam (1985) (turbulence and the dimension of attractors),
Ladyzenskaja (1986).
Boundary layers in mathematics and singular perturbation theory: Ljusternik and
Visik (1957) (classical work), Trenogin (1970, S), Kevorkian and Cole (1981, M),
Goering (1983, S) (see also the References to the Literature for Chapter 79).
Rheology: Reiner (1958, S) (handbook article), Fredrickson (1964, M), Wilkinson
(1960, M) and Showalter (1978, M) (non-Newtonian fluids), Duvaut and Lions (1972,
M) and Naumann (1982) (existence theorems).
Stability for concrete problems in fluid dynamics: Chandrasekhar (1961, M), Joseph
(1976, M).
Stability and bifurcation, Taylor problem, and Benard problem: Kirchgassner (1975,
S) and Sattinger (1980, S) (general survey), Serrin (1959), (1959a), Judovic (1966),
(1966a), (1967), Velte (1966), Rabinowitz (1968), Kirchgassner and Sorger (1969),
Joseph and Sattinger (1972), Zeidler (1972), Kirchgassner and Kielhofer (1973), Kirch-
gassner (1975a), Sattinger (1977), (1979, L), (1980, S), Knightly and Sather (1985).
Free boundary-value problems for the Navier-Stokes equations: Pukhnacov (1972),
Socolescu (1980), Solonnikov (1983).
Genericity and structure of solutions to the Navier-Stokes equations: Foias and
Temam (1977).
Partial regularity of solutions to the nonstationary Navier-Stokes equations:
Scheffer (1980), Caffarelli, Kohn, and Nirenberg (1982).
Unbounded regions, infinite channels, and tubes: Ladyzenskaja and Solonnikov
(1977), (1980), Amick (1978), Solonnikov (1984).
Existence, regularity, and decay of solutions: Heywood (1980, S).
Asymptotic behavior of the kinetic energy of viscous fluids in external regions: Galdi
and Maremonti (1986).
Existence theory for the Euler equations for incompressible and compressible in-
viscid fluids: Majda (1984, L) and Kato and Lai (1984) (especially recommended), Kato
(1967), (1972), Temam (1979), Schochet (1986), DiPerna and Majda (1987).
Applications of the methods of global analysis: Arnold (1966), Ebin and Marsden
(1970), Marsden (1974, L).
Boundary layer equation: Prandtl (1904) (classical work), Garabedian (1960, M),
Schlichting (1960, M), Oleinik (1968) (existence proofs), Nickel (1958), (1963), Walter
(1964, M) (differential inequalities), Pukhnacov (1975, M).
Asymptotic methods in fluid dynamics: Zeytounian (1987, M).
Statistical solutions in hydrodynamics: Foias (1973), Visik and Fursikov (1980, M).
Survey on turbulence: Lin and Reid (1963) (handbook article), Frost and Moulden
(1977, M) (handbook), Bernard and Ratiu (1977, P), Berkeley (1983, P).
Monographs on turbulence: Chorin (1975, L) (introductory), Batchelor (1982, M),
Dwoyer (1985, M).
Kolmogorov flow: Obuhov (1983, S).
Turbulence and universality theory: Feigenbaum (1980, S), Berkeley (1983, P), Vul,
Sinai, and Chanin (1984, S).
Turbulence and self-organization: Ebeling and Klimontowitsch (1985, L).
Turbulence, chaos, and strange attractors: Ruelle (1980, S), (1983, S) (introductory),
Ruelle and Takens (1971), Bothe (1982) (topological structure of attractors), Sparrow
(1982, M) (Lorenz equation), Guckenheimer and Holmes (1983, M) (recommended as
an introduction to strange attractors), Berge, Pomeau, and Vidal (1984, M).
Visual representations: Abraham (1983, M), Peitgen and Richter (1985, M).
Estimates for the dimension of attractors: Ladyzenskaja (1982), Babin and Visik
(1983, S), Constantin, Foias, and Temam (1985, S), Constantin and Foias (1985)
(Kaplan-Yorke formulas).
MANIFOLDS AND
THEIR APPLICATIONS

Some four years ago, I observed that a certain number of most significant
theorems and constructions of modern mathematics have undergone the fol-
lowing evolution, and that one might even talk of a principle. Viewed historically,
at first one knew certain natural objects (e.g., spaces); then certain "abstract
objects" were discovered, or one was forced to introduce them. Finally, with
considerable effort and brilliance of mind, it was proved that these objects were
"simply" subspaces of the well-known spaces, and that in some (favorable) cases
they were indeed isomorphic with "classical" objects. The more natural the
spaces were, the more difficult it was to prove the corresponding embedding
theorems.
I realized that the evolution principle of modern mathematics which I had
observed was an exact illustration of the famous parable of the cave from the
seventh book of Plato's "Politeia." The following correspondences were found:
(a) Shadows on the cave's walls are classical mathematical objects: e.g., planes
in the Euclidean space, algebraic projective varieties, etc.
(b) Ideas are "abstract objects," e.g., Riemann spaces, Hodge manifolds.
(c) The dramatic "descent" into the cave is the corresponding embedding
theorem.
And who is the prisoner at first fettered, then released, and dragged (by force)
into the sunlight, and finally descending again into the cave? It is mathematics
itself as a whole, for it is different researchers belonging to different generations
of mathematicians who have accomplished the ascent, the creation of awakening
of a great mathematical idea, e.g., the Riemann surface, and an embedding or
uniformization theorem which is often, several decades later, proved by quite
different mathematicians.
The well-known bon mot that "European philosophy" is only a footnote to
Plato is perhaps true, but I should venture the much truer one: modern mathe-
matics is only a footnote to Riemann.
Krysztof Maurin (1982)

CHAPTER 73

Banach Manifolds

The categories of differentiable manifolds and vector bundles provide a useful
context for the mathematics needed in mechanics, especially the new topological
and qualitative results.
Ralph Abraham and Jerrold Marsden (1978)
Too often in the physical sciences, the space of states is postulated to be a linear
space when the basic problem is essentially nonlinear; this confuses the mathe-
matical development.
Steve Smale (1980)
The proof that ... is left as a masochistic exercise for the reader. Rest assured
that we will never have to do this sort of abstract nonsense.
Michael Spivak (1979)

Typical examples of manifolds are sufficiently smooth curves and surfaces in
$\mathbb{R}^n$ which have a tangent space (tangent line, tangent plane) at each point.
Manifolds will always be manifolds without boundary. One may think, for
example, of the surface of a ball. Manifolds with boundary, such as the ball
itself, will be considered in Section 73.19.
The concept of manifolds is one of the most important concepts in mathe-
matical physics, perhaps the most important one. The reason for this is that
a description of scientific phenomena uses the process of measurements, i.e.,
phenomena are locally described by parameters. In astronomy, for instance,
we use space and time coordinates. The choice of these coordinates is some-
what arbitrary. A priori, it is not clear, for example, how we have to set our
clocks. We may adjust them to atomic oscillations or to the daily rotations
of the earth. Different observers will generally use different systems of
reference. Thus it is important to have a way of comparing the results of
measurements. Locally, a manifold looks like $\mathbb{R}^n$ or, more generally, like a B-space.

Figure 73.1 (panels (a), (b), (c); E = earth, S = space ship)

For a local description, we allow different B-spaces (coordinate or parameter
spaces). It is then important to have a transformation rule for these different
coordinates. In the simplest case, one may use for the plane, for example,
Cartesian or oblique coordinates. From a mathematical or physical point of
view, those properties of a manifold are most important which are independent
of the choice of the local coordinates. Moreover, it should be noted that
besides local properties, manifolds also have global properties. Consider, for
example, a space flight in $\mathbb{R}^1$. A space ship, which starts at some point E in
$\mathbb{R}^1$ and flies on a straight line, can never, like the Flying Dutchman in Wagner's
opera, reach its goal. It will travel through space $\mathbb{R}^1$ forever (Fig. 73.1(a)).
Now, let us use the one-point compactification of $\mathbb{R}^1$, i.e., we add an element
$\infty$ to $\mathbb{R}^1$ such that the stereographic projection maps $\mathbb{R}^1 \cup \{\infty\}$
homeomorphically onto the unit circle $S^1$ (Fig. 73.1(b)). Our space ship may now return
to its starting point. If, however, we use the two-point compactification of $\mathbb{R}^1$,
then our space ship may reach the end of the world $+\infty$ (Fig. 73.1(c)). Note
that the sets $\mathbb{R}^1$, $\mathbb{R}^1 \cup \{\infty\}$, and $\mathbb{R}^1 \cup \{+\infty\} \cup \{-\infty\}$ have the same local
structure in points $x \in \mathbb{R}^1$. But, as we have seen above, the differences in the
global structure may have far-reaching physical consequences. In fact, today
we do not really understand the global structure of our universe, i.e., the global
structure of the corresponding four-dimensional space-time manifold. We
will discuss two cosmological models in Chapter 76.
The last seven chapters of this volume and the beginning chapters of the
following Part V deal with various aspects of the theory of manifolds and its
plentiful applications. Several important features will be discussed more care-
fully at the beginning of each of these chapters. In this chapter, we will give a
number of basic definitions with emphasis on a detailed motivation. The
following three chapters contain applications to the theory of surfaces and the
special and general theory of relativity. We then give applications to fixed-
point theory, the theory of stability for dynamical systems, electrodynamics,
mechanics (symplectic geometry), as well as to quantum theory in connection
with the theory of Lie groups and Lie algebras. In many textbooks, the
definition of tangent bundles is presented, after the more general concept of
vector bundles has been introduced. Tangent bundles are then special cases
of vector bundles. From experience, I know that this procedure, because of
its abstractness, causes difficulties for the student. For didactical reasons, we
therefore choose the following presentation. We begin with a definition of
tangent and cotangent bundles, and explain the simple connection with
abstract bundles in Section 73.7. We show that the concept of tangent bundles
arises naturally, when introducing higher-order F-derivatives and vector fields
on Banach manifolds. In Section 73.21, we demonstrate the usefulness of
tangent bundles in the proof of the fundamental Whitney embedding theorem.
Then, after the reader has become somewhat familiar with the concept of
bundles, we introduce vector bundles in Section 73.22. We choose a geometric
definition which can be generalized to fiber bundles in Part V. Also in Part
V, we try to explain the great importance of vector and fiber bundles for global
analysis and mathematical physics.

73.1. Local Normal Forms for Nonlinear Double Splitting Maps

Consider a map
$$f\colon U(x_0) \subseteq X \to Y,$$
which satisfies the following conditions:
(H1) X and Y are B-spaces over $\mathbb{K}$ and f is $C^k$, $k \ge 1$.
(H2) f is double splitting at $x_0$, i.e., the null space $N(f'(x_0))$ splits X and the
range $R(f'(x_0))$ splits Y (cf. $A_1$(22i)).
We let $N = N(f'(x_0))$ and $R = R(f'(x_0))$. For a Fredholm operator $f'(x_0)$,
condition (H2) is always satisfied. In this case, N and $R^\perp$ are finite
dimensional. It follows from (H2) that X and Y have the topological direct sum
decompositions
$$X = N \oplus N^\perp \quad\text{and}\quad Y = R \oplus R^\perp.$$
Let $U_N(0)$ and $U_R(0)$ denote neighborhoods of 0 in N and R.
In the following we will use the decomposition $Y = R \oplus R^\perp$ to construct
the local normal form
$$f(\varphi(n, r)) = f(x_0) + r + g(n, r) \tag{1}$$
for all $n \in U_N(0)$, $r \in U_R(0)$ such that the following conditions are satisfied:
$\varphi(0, 0) = x_0$ and
$$r \in R, \quad g(n, r) \in R^\perp, \quad g(0, 0) = 0, \quad g'(0, 0) = 0. \tag{2}$$
Let $P\colon N \times R \to R$, $P(n, r) = r$, denote the natural projection. It follows then
from (1) that, after a suitable coordinate change $x = \varphi(n, r)$, f behaves locally,
in first-order approximation, like $f(x_0) + P(n, r)$, i.e., locally, in first-order
approximation, double splitting maps are translations of projections.
If f is a submersion at $x_0$, i.e., $f'(x_0)$ is surjective, then the normal form
becomes particularly simple, since in this case $R^\perp = \{0\}$ and (1) is satisfied
with g = 0. Here, f actually equals $f(x_0) + P(n, r)$ in a neighborhood of $x_0$.
In Section 78.9, normal form (1) will be a main tool in proving the Sard-Smale
theorem.
Another normal form that we consider is
$$f(\psi(x)) = f(x_0) + f'(x_0)(x - x_0) + a(x) \quad \text{on } V(x_0) \tag{3}$$
with $a(x_0) = 0$, $a'(x_0) = 0$ and
$$a(x) \in R^\perp \quad \text{on } V(x_0).$$
This is a variant of Taylor's theorem, and if f is a submersion at $x_0$, then
a = 0 because of $R^\perp = \{0\}$. Let $\psi(x_0) = x_0$.

Proposition 73.1 (Local Normal Forms). Let (H1), (H2) be satisfied. It follows
that:
(i) There exists a neighborhood $W(x_0)$ of $x_0$ in X and a $C^k$-diffeomorphism
$\varphi\colon U_N(0) \times U_R(0) \to W(x_0)$ such that normal form (1), (2) is satisfied.
(ii) There exist neighborhoods $V(x_0)$ and $W(x_0)$ of $x_0$ in X and a
$C^k$-diffeomorphism $\psi\colon V(x_0) \to W(x_0)$ such that normal form (3) is satisfied.

PROOF. (i) Without loss of generality, let $x_0 = 0$ and $f(x_0) = 0$. The proof idea
is to apply the inverse mapping theorem to
$$F(x) = (x_1, f_1(x))$$
and to let $\varphi = F^{-1}$.
(I) The splittings
$$X = N \oplus N^\perp \quad\text{and}\quad Y = R \oplus R^\perp$$
yield the decompositions
$$x = x_1 + x_2 \quad\text{and}\quad f(x) = f_1(x) + f_2(x)$$
with $x_1 \in N$, $x_2 \in N^\perp$, $f_1(x) \in R$, and $f_2(x) \in R^\perp$.
Since $f(0) = 0$ and $f'(0)h = f_1'(0)h + f_2'(0)h$ with $f'(0)h \in R$ for all $h \in X$,
we obtain that
$$f_1'(0) = f'(0) \quad\text{and}\quad f_2'(0) = 0.$$
(II) The map $F\colon U(0) \subseteq X \to N \times R$, as defined above, is $C^k$ with $F(0) = 0$
and
$$F'(0)h = (h_1, f_1'(0)h) = (h_1, f'(0)h).$$
Since $f'(0)\colon N^\perp \to R$ is bijective, it follows that $F'(0)\colon X \to N \times R$ is
bijective, and the inverse mapping theorem (Theorem 4.F) implies that F is a
local $C^k$-diffeomorphism at $x_0 = 0$.
(III) Letting $\varphi = F^{-1}$, we get $\varphi(n, r) = x$ for $n = x_1$ and $r = f_1(x)$. Thus
$$f(\varphi(n, r)) = f_1(x) + f_2(x) = r + f_2(\varphi(n, r)).$$
This is (1) with $g(n, r) = f_2(\varphi(n, r))$.
Finally, we obtain $g(0, 0) = 0$ from $f_2(0) = 0$ and $\varphi(0, 0) = 0$, and
$$g'(0, 0) = f_2'(0)\varphi'(0, 0) = 0$$
from $f_2'(0) = 0$.
(ii) For $X = N \oplus N^\perp$ let $Q\colon X \to N$ be the natural projection and
$$A\colon N^\perp \to R$$
the restriction of $f'(x_0)$ to $N^\perp$. The operator Q is linear and continuous,
and A is a linear homeomorphism. For each $x \in X$ there exists a unique
decomposition
$$x - x_0 = n + n^\perp \quad \text{with } n \in N,\ n^\perp \in N^\perp;$$
thus, there is a unique pair $(n, r) \in N \times R$ with $x = x_0 + n + A^{-1}r$, and
$$n = Q(x - x_0), \qquad r = A(x - x_0 - n) = f'(x_0)(x - x_0).$$
The map $x \mapsto (n, r)$, constructed this way, is a $C^\infty$-diffeomorphism from X onto
$N \times R$. Letting $\psi(x) = \varphi(n, r)$, we obtain
$$f(\psi(x)) = f(\varphi(n, r)) = f(x_0) + r + g(n, r) = f(x_0) + f'(x_0)(x - x_0) + a(x)$$
with $a(x) = g(n, r)$, and this is (3). □
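
The construction in part (i) of the proof can be followed on a small finite-dimensional example. The map f below is an assumption chosen only for illustration; it is not taken from the text.

```latex
% Illustrative example (assumed): X = Y = \mathbb{R}^2, x_0 = 0.
\[
  f(x_1, x_2) = (x_2 + x_1^2,\; x_2^2), \qquad
  f'(0) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.
\]
% Null space and range of f'(0), with their complements:
\[
  N = \operatorname{span}\{e_1\}, \quad N^\perp = \operatorname{span}\{e_2\}, \quad
  R = \operatorname{span}\{e_1\}, \quad R^\perp = \operatorname{span}\{e_2\}.
\]
% The components of f in R and R^\perp, and the map F of the proof:
\[
  f_1(x) = x_2 + x_1^2, \qquad f_2(x) = x_2^2, \qquad
  F(x) = (x_1,\; x_2 + x_1^2).
\]
% Inverting F gives the coordinate change \varphi = F^{-1}:
\[
  \varphi(n, r) = (n,\; r - n^2), \qquad
  f(\varphi(n, r)) = \bigl(r,\; (r - n^2)^2\bigr).
\]
% Hence the normal form holds with
\[
  g(n, r) = (r - n^2)^2 \in R^\perp, \qquad g(0, 0) = 0, \qquad g'(0, 0) = 0.
\]
```

In the new coordinates the R-component of f is exactly the projection $(n, r) \mapsto r$, and the remainder g vanishes to first order at the origin, in agreement with (1) and (2).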

73.2. Banach Manifolds


It is important for the reader of this chapter to get a good geometric
understanding of the concept of manifolds. One may think of the surface of the earth
as a standard example of a nontrivial manifold. Because of its curvature, it
cannot be described with just one chart, but instead a geographical atlas is
needed. In such an atlas, the same city T may appear in different charts with
different local coordinates. Thus we need a rule which allows us to pass from
one local coordinate system to another. During the following formal defini-
tions of charts, atlas, ... , one may keep the example of the surface of the earth
in mind. Recall that, according to A 1 (8), all topological spaces are assumed
to be separated.

Definition 73.2. Let M be a topological space. A chart $(U, \varphi)$ in M is a pair
where the set U is open in M and $\varphi\colon U \to U_\varphi$ is a homeomorphism onto an
open subset $U_\varphi$ of a B-space $X_\varphi$. We call $\varphi$ a chart map.

Figure 73.2

The B-space $X_\varphi$ is called chart space and $U_\varphi$ is called chart image. For $x \in U$,
we call
$$x_\varphi = \varphi(x)$$
the representative of x in the chart $(U, \varphi)$ or the local coordinate of x in the
local coordinate system $\varphi$ (Fig. 73.2). The point $x \in M$ may have different local
coordinates $x_\varphi = \varphi(x)$ and $x_\psi = \psi(x)$ for two different charts $(U, \varphi)$ and $(V, \psi)$.
The transformation rules between them are
$$x_\varphi = (\varphi \circ \psi^{-1})(x_\psi) \quad\text{and}\quad x_\psi = (\psi \circ \varphi^{-1})(x_\varphi). \tag{4}$$

Definition 73.3. Let M be a topological space. Two charts $(U, \varphi)$ and $(V, \psi)$ are
called $C^k$-compatible if and only if $U \cap V = \emptyset$ or $\varphi \circ \psi^{-1}$ and $\psi \circ \varphi^{-1}$ are $C^k$,
$k \ge 0$.
If both maps are analytic, the charts are called analytic compatible.

Figure 73.2 shows the two bijective maps
$$\varphi \circ \psi^{-1}\colon \psi(U \cap V) \to \varphi(U \cap V)$$
and
$$\psi \circ \varphi^{-1}\colon \varphi(U \cap V) \to \psi(U \cap V).$$
Let $k \ge 1$. If $\varphi \circ \psi^{-1}$ is $C^k$, then the inverse mapping theorem (Theorem 4.F)
makes it into a $C^k$-diffeomorphism, i.e., $\psi \circ \varphi^{-1}$ is $C^k$ as well. If the chart spaces
are complex, then the $C^k$-maps $\varphi \circ \psi^{-1}$ and $\psi \circ \varphi^{-1}$ are automatically analytic.

Definition 73.4. Let M be a topological space. A $C^k$-atlas for M, $0 \le k \le \infty$,
is a collection of charts $(U_\alpha, \varphi_\alpha)$ ($\alpha$ ranging in some indexing set), which satisfies
the following conditions:
(i) The $U_\alpha$ cover M.
(ii) Any two charts are $C^k$-compatible.
(iii) All chart spaces $X_\alpha$ are B-spaces over $\mathbb{K}$.

Figure 73.3

M is said to be a $C^k$-Banach manifold if and only if there exists a $C^k$-atlas for
M. Keeping the geographical atlas of the earth in mind, it might be useful to
add new charts to the atlas. We call a chart in M, which is $C^k$-compatible with
all atlas charts, an admissible chart. In particular, all atlas charts are
admissible. The collection of all admissible charts forms a new atlas, which is called
the maximal atlas for M.
Usually, we will have different chart spaces. If, however, all chart spaces are
equal to a fixed B-space X, then M is called a $C^k$-Banach manifold modeled
on X. A Banach manifold is called real or complex if all chart spaces are real
or complex. Often we will simply use manifold instead of Banach manifold.
If all chart spaces have the same dimension d, $0 \le d \le \infty$, then d = dim M is
called the dimension of the manifold M. In this case, M is said to have a
dimension. If all atlas charts are analytic compatible, then M is called an
analytic manifold. Finally, $C^0$-manifolds are also called topological manifolds.

EXAMPLE 73.5. Obviously, each open set M in a B-space X is a $C^\infty$-manifold
which is also analytic. A chart $(U, \varphi)$ is obtained with U = M and $\varphi$ = id, i.e.,
the identity on U. In particular, each open set in $\mathbb{R}^n$ and $\mathbb{C}^n$ is an n-dimensional
real or complex analytic manifold, respectively.
EXAMPLE 73.6. The boundary S of a disk in $\mathbb{R}^2$ is a one-dimensional, real
$C^\infty$-manifold, which is also analytic. One can choose two charts $(U, \varphi)$ and
$(V, \psi)$ as in Figure 73.3, i.e., U and V are two open curves which cover S.
Similarly, the boundary of a ball in $\mathbb{R}^3$ (e.g., the surface of the earth) is a
two-dimensional real $C^\infty$-manifold which is also analytic. A detailed proof
follows in Section 73.11.
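
One concrete choice of two charts for the unit circle can be checked numerically. The Python sketch below uses, as an assumption, the stereographic projections from the north pole (0, 1) and the south pole (0, -1); the text itself only requires two open arcs covering S, so this is one possible atlas, not necessarily the one of Figure 73.3. The function names are hypothetical.

```python
def phi(point):
    """Chart on U = S minus the north pole: projection from (0, 1)."""
    x, y = point
    return x / (1.0 - y)


def phi_inverse(u):
    """Inverse chart map from the real line onto U."""
    return (2.0 * u / (u * u + 1.0), (u * u - 1.0) / (u * u + 1.0))


def psi(point):
    """Chart on V = S minus the south pole: projection from (0, -1)."""
    x, y = point
    return x / (1.0 + y)


def transition(u):
    """The change of coordinates psi o phi^{-1}, defined for u != 0."""
    return psi(phi_inverse(u))


if __name__ == "__main__":
    for u in (0.5, -2.0, 3.0):
        x, y = phi_inverse(u)
        # The point lies on the circle, and the transition map equals 1/u.
        print(u, x * x + y * y, transition(u), 1.0 / u)
```

On the overlap of the two charts the change of coordinates is $u \mapsto 1/u$ for $u \ne 0$, which is analytic together with its inverse, so the two charts are analytic compatible in the sense of Definition 73.3.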

73.3. Strategy of the Theory of Manifolds


So far, it is a little disturbing that we describe manifolds with different atlases.
Thinking of the surface of the earth, for example, we have the feeling that its
structure does not depend on the form of the specific atlas. We therefore make
the following definition.

Definition 73.7. Two $C^k$-atlases for M are called equivalent if and only if all
charts are $C^k$-compatible, $k \ge 0$. Two $C^k$-Banach manifolds M and N are said
to have the same differentiable structure if and only if the following two
conditions are satisfied:

(i) The topological spaces M and N are equal.
(ii) The atlases are equivalent.

In Example 73.5, for instance, one can get an equivalent atlas for M by
choosing all pairs $(U, \varphi)$ with U open in M and $\varphi$ equal to the identity on U.
Obviously, two $C^k$-Banach manifolds have the same differentiable structure
if and only if (i) is satisfied and all admissible charts in M coincide with those
in N, i.e., the corresponding maximal atlases are equal. In the 1950s, Milnor
made the surprising observation that the seven-dimensional unit sphere $S^7$,
with topology induced by $\mathbb{R}^8$, has more than one differentiable structure, i.e.,
there exist $C^k$-atlases, $k \ge 1$, which are not equivalent.
Using the methods of modern gauge field theory, Donaldson (1983) proved
the deep result that only certain four-dimensional topological manifolds are
smoothable, i.e., they are also differentiable manifolds. Gompf (1985) showed
that the four-dimensional Euclidean space $\mathbb{R}^4$ is exotic, i.e., the set $\mathbb{R}^4$ possesses
infinitely many different differentiable structures.
Banach manifolds look locally like B-spaces, but not necessarily globally;
the important properties of manifolds are global. We propose the following
strategy for a study of manifolds:
(a) One works locally in charts, i.e., in local coordinate systems where the
calculus of B-spaces is available;
(b) One applies only such concepts which are invariant, i.e., independent of
the particular choice of atlas charts and which remain unchanged in
equivalent atlases.
This means that we only allow concepts which are independent of the choice
of admissible charts.
We explain this procedure in the following two sections which deal with
$C^k$-maps and tangent vectors.
It is an important task in many applications (tangent bundles, jet spaces,
etc.) to transform sets M, which are not originally equipped with a topology,
into manifolds. The method here is to use abstract charts $(U, \varphi)$ where
$\varphi: U \to U_\varphi$
is not a homeomorphism, but a bijection onto an open set of a B-space. It
follows from Problem 73.2 that the presence of an abstract atlas implies, in a
natural way, the existence of a topology for M. With this atlas, M becomes a
manifold.
We say that a manifold $M$ has a countable basis if and only if there exists a
countable collection $\{U_n\}$ of open sets such that each open set in $M$ is the
union of some $U_n$'s. If $M$ has an equivalent atlas which contains countably
many charts and all chart spaces are separable (e.g., finite dimensional), then
$M$ has a countable basis. Every B-space is regular since each of its points has
a neighborhood basis of closed sets. Therefore, every Banach manifold is
regular. The metrization theorem of Section 13.10 implies that a Banach
manifold with a countable basis is metrizable, i.e., there exists a metric which
induces the given topology. This metric need not be unique.

73.4. Diffeomorphisms
Recall that a map $g: U(x) \subseteq X \to Y$ is said to be $C^r$ at a point $x$ if and only if
it is $C^r$ in a neighborhood of $x$.

Definition 73.8. Let $M$ and $N$ be $C^k$-Banach manifolds with chart spaces over
$\mathbb{K}$, $k \ge 1$. Then $f: M \to N$ is called $C^r$, $r \le k$, if and only if $f$ is $C^r$ at each point
$x \in M$ in fixed admissible charts.
Moreover, $f$ is called a $C^r$-diffeomorphism if and only if $f$ is bijective and $f$,
$f^{-1}$ are both $C^r$.

We discuss this definition. Let $(U, \varphi)$ and $(V, \psi)$ be charts in $M$ and $N$,
respectively, with $x \in U$ and $f(x) \in V$. Figure 73.4 shows the map
$\bar f = \psi \circ f \circ \varphi^{-1}$,
which is well defined in a sufficiently small neighborhood of $\varphi(x)$. This map
is assumed to be $C^r$ in the usual sense and will be called a representative of $f$.
The chain rule implies that the representatives remain $C^r$ after passing to other
admissible charts. Thus Definition 73.8 is invariant in the sense of Section
73.3, i.e., independent of the choice of the representatives.
The role that is played by topological spaces and homeomorphisms in
topology is played by manifolds and diffeomorphisms in the study of manifolds.
For $x \in M$, the map $f: M \to N$ is called a local $C^r$-diffeomorphism
at the point $x$ if and only if $f$ maps an open neighborhood $U(x)$ of $x$
$C^r$-diffeomorphically onto an open neighborhood $V(f(x))$ of $f(x)$.

Figure 73.4

Proposition 73.9 (Structure of One-Dimensional Manifolds). A connected,
one-dimensional, real $C^\infty$-manifold with a countable basis is $C^\infty$-diffeomorphic
either to $\mathbb{R}$ (open curve) or to the unit circle $S^1$ (closed curve).

The proof of this important structure theorem may be found in the appendix
of Milnor (1965, M). The Whitney embedding theorem implies that any such
manifold can be embedded in $\mathbb{R}^3$.

73.5. Tangent Space


For surfaces in $\mathbb{R}^3$, one has an intuitive understanding of tangent vectors and
tangent planes (tangent spaces) (Fig. 73.5). The importance of tangent spaces
is the following. For $x \in M$, the space $TM_x$ is a linear space which gives a
first-order approximation of the manifold $M$ in a neighborhood of $x$. This
allows us to define a derivative
$f'(x): TM_x \to TN_{f(x)}$
at the point $x$ for a map $f: M \to N$ by linearization. Thus the concept of
F-derivative carries over to Banach manifolds. For general manifolds it is not
possible to define tangent vectors in some surrounding space. However, we
can use the concept of equivalent curves. A collection of equivalent curves
through a point $x$ is defined in such a way that, if a surrounding space $\Sigma$
existed, all curves of the collection would have the same tangent vector in $\Sigma$
at $x$ (Fig. 73.6).

Definition 73.10. Let $M$ be a $C^k$-manifold, $k \ge 1$, and $x \in M$. Two $C^1$-curves
in $M$, which pass through the point $x$, are called equivalent at the point $x$ if
and only if the representatives have the same tangent vector at $x$ in some fixed
admissible chart.
A tangent vector $v$ to $M$ at $x$ consists of all $C^1$-curves which are equivalent
at $x$ to a fixed $C^1$-curve.

Figure 73.5    Figure 73.6

Figure 73.7

We will now discuss these definitions in some detail (see Fig. 73.7). A
$C^1$-curve
$x = x(t)$
in $M$ through $x$ is a $C^1$-map $x(\cdot): U(t_0) \subseteq \mathbb{R} \to M$ such that $x(t_0) = x$ at some
fixed $t_0$. The representative of this curve in the chart $(U, \varphi)$ is
$x_\varphi = x_\varphi(t)$ with $x_\varphi(t) = \varphi(x(t))$.
In the corresponding chart space, this curve has a "concrete" tangent vector
at $x_\varphi$,
$v_\varphi = x_\varphi'(t_0)$.
The "abstract" tangent vector of Definition 73.10 to the curve $x = x(t)$ at the
point $x$ is denoted by $v$ or more suggestively by $x'(t_0)$. Then $v_\varphi$ is called the
representative or local coordinate of $v$.
Let $(V, \psi)$ be another chart with $x \in U \cap V$. Then
$x_\psi = F(x_\varphi)$ (5)
with $F = \psi \circ \varphi^{-1}$ and $x_\psi(t) = F(x_\varphi(t))$. Differentiation with respect to $t$ at the
point $t_0$ gives
$v_\psi = F'(x_\varphi) v_\varphi$, (6)
which is the important transformation rule for the local coordinates of the
tangent vectors. The same argument shows that the equivalence of Definition
73.10 is invariant, i.e., independent of the choice of the admissible chart. Also,
it is easily seen to be an equivalence relation.
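
The transformation rule (6), $v_\psi = F'(x_\varphi) v_\varphi$, can be checked numerically on a simple example. The sketch below rests on assumptions chosen only for illustration: $M$ is the open right half-plane in $\mathbb{R}^2$, the chart $\varphi$ is the identity (Cartesian coordinates), the chart $\psi$ gives polar coordinates, and the curve and all function names are hypothetical. Tangent vectors of the representatives are approximated by central differences.

```python
import math

# Assumed setting: M = right half-plane {(a, b): a > 0}, an open set in R^2.
# Chart phi = identity (Cartesian), chart psi = polar coordinates (r, theta).

def F(p):
    """Transition map F = psi o phi^{-1}: Cartesian -> polar."""
    a, b = p
    return (math.hypot(a, b), math.atan2(b, a))

def F_prime(p):
    """Jacobian matrix of F at p, computed by hand."""
    a, b = p
    r2 = a * a + b * b
    r = math.sqrt(r2)
    return ((a / r, b / r), (-b / r2, a / r2))

def curve(t):
    """Representative x_phi(t) of a C^1-curve; it stays in M near t = 0."""
    return (1.0 + t, 0.5 + t * t + 2.0 * t)

def derivative(g, t, h=1e-6):
    """Central difference approximation of g'(t), componentwise."""
    plus, minus = g(t + h), g(t - h)
    return tuple((u - w) / (2 * h) for u, w in zip(plus, minus))

t0 = 0.0
x_phi = curve(t0)
v_phi = derivative(curve, t0)                    # local coordinate in chart phi
v_psi = derivative(lambda t: F(curve(t)), t0)    # local coordinate in chart psi

J = F_prime(x_phi)
predicted = tuple(J[i][0] * v_phi[0] + J[i][1] * v_phi[1] for i in range(2))
error = max(abs(u - w) for u, w in zip(v_psi, predicted))
```

Up to the finite-difference error, `v_psi` and `predicted` agree, which is the content of (6) for this particular pair of charts.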

Definition 73.11. Let $M$ be a $C^k$-Banach manifold. The tangent space $TM_x$ to
$M$ at the point $x$ is the set of all tangent vectors at $x$. Here, $k \ge 1$.

Proposition 73.12. The tangent space $TM_x$ is a topological vector space which
is linear homeomorphic to each chart space $X_\varphi$ at the point $x$.
In particular, we have $\dim TM_x = \dim X_\varphi$.

PROOF. We choose a chart $(U, \varphi)$ in $M$ with $x \in U$. Then there exists a one-to-
one correspondence between the elements $v \in TM_x$ and the representatives
$v_\varphi \in X_\varphi$. We define the linear operations in such a way that
$v + w$ and $\alpha v$
correspond to
$v_\varphi + w_\varphi$ and $\alpha v_\varphi$,
i.e., we use representatives. Because of (6), this definition is independent of the
choice of the chart. This makes $TM_x$ into a linear space.
Moreover, the map $F$ of (5) is a $C^k$-diffeomorphism. Hence with Theorem
4.F, the map $F'(x_\varphi): X_\varphi \to X_\psi$ is a linear homeomorphism between the chart
spaces. Thus all chart spaces are linear homeomorphic at $x$. Now, there exists
a linear bijective map $b: TM_x \to X_\varphi$ and by using its inverse $b^{-1}$ we can carry
the topology of $X_\varphi$ onto $TM_x$. Also, $X_\psi$ induces the same topology on $TM_x$,
so that $TM_x$ becomes a topological vector space (see $A_1$(22o)). □

Since the norms $\|v_\psi\|$ and $\|v_\varphi\|$ in (6) might be different, i.e., they might
depend on the choice of the charts, it is not always possible to make $TM_x$ into
a B-space in an invariant fashion. If $M$ is open in a B-space $X$, then $TM_x$ can
be identified with $X$ for all $x \in M$.

73.6. Tangent Map


Let $f: U(x) \subseteq X \to Y$ be $C^1$, and $X$ and $Y$ be B-spaces. In Chapter 4 we gave
the definition for the F-derivative $f'(x): X \to Y$ by using the process of
linearization. Now we will see how this important concept carries over to Banach
manifolds.

Proposition 73.13. Let $f: M \to N$ be $C^1$, and $M$ and $N$ $C^1$-Banach manifolds
with chart spaces over $\mathbb{K}$. Then there exists a linear, continuous map $f'(x)$:
$TM_x \to TN_{f(x)}$ at each point $x \in M$, given by (7) below.

PROOF. Let $v \in TM_x$ and choose a $C^1$-curve
$x = x(t)$
passing through a fixed $x \in M$ such that this curve belongs to the tangent
vector $v$ at $x$. The map $f$ takes this curve into the curve
$y = y(t)$
on $N$ such that $y(t) = f(x(t))$. Let $w$ denote the tangent vector to $y = y(t)$ at
$f(x)$, and define $f'(x)$ through
$w = f'(x)v$. (7)
Passing to charts, we obtain that $f$ takes equivalent curves at $x$ into equivalent
curves at $f(x)$. Thus (7) is well defined, i.e., independent of the curve $x(t)$, which
belongs to $v$.
Linearity and continuity of $f'(x)$ follow as in (8) below by passing to
representatives. □

Definition 73.14. The map $f'(x): TM_x \to TN_{f(x)}$ is called the tangent map of
$f: M \to N$ at the point $x$. Another way to write $f'(x)$ is $T_x f$.

In Part III we also used $Tf(x)$ for $T_x f$, but in this Part IV this might cause
confusion with $Tf(x, v)$ of Section 73.7. Sometimes $Df(x)$ or $df(x)$ is also used
for $f'(x)$ in the literature.
Let $\bar f$ be the representative of $f$ in the admissible charts $(U, \varphi)$ and $(V, \psi)$
with $x \in U$ and $f(x) \in V$, as pictured in Figure 73.4. The local coordinates $v_\varphi$
and $w_\psi$ are related as in (7) through
$w_\psi = \bar f'(x_\varphi) v_\varphi$. (8)
This follows immediately from the definition of the local coordinates. Therefore,
$\bar f'(x_\varphi): X_\varphi \to X_\psi$
is the representative of
$f'(x): TM_x \to TN_{f(x)}$,
i.e., in local coordinates $f'(x)$ is the usual F-derivative. The chain rule of
Section 4.3 implies the following general result.

Proposition 73.15 (Chain Rule). Let $M$, $N$, and $P$ be $C^1$-Banach manifolds with
chart spaces over $\mathbb{K}$, and let $f: M \to N$ and $g: N \to P$ be $C^1$. Then we have
$(g \circ f)'(x) = g'(f(x)) \circ f'(x)$ (9)
for all $x \in M$.

Formula (9) can be rewritten as
$T_x(g \circ f) = T_{f(x)} g \circ T_x f$, (10)
i.e., linearization and composition commute.

73.7. Higher-Order Derivatives and the Tangent Bundle

Let $f: U \subseteq X \to Y$ be $C^2$, $U$ open in $X$, and $X$, $Y$ B-spaces over $\mathbb{K}$. Then
$f'(x): X \to Y$ exists for all $x \in U$, i.e., $f'(x) \in L(X, Y)$. This induces a map
$f': U \to L(X, Y)$.
Thus the second-order derivative can be defined through $f''(x) = g'(x)$, where
$g = f'$.
This procedure carries over to manifolds after making the following
modifications. Let
$f: M \to N$
be $C^2$, where $M$ and $N$ are open sets in the B-spaces $X$ and $Y$. Let $TM =
M \times X$ and $TN = N \times Y$ and
$Tf(x, v) = (f(x), f'(x)v)$ for all $x \in M$, $v \in X$.
This defines a map
$Tf: TM \to TN$.
If we write $T^2$ instead of $TT$, we obtain
$T^2 f: T^2 M \to T^2 N$
as a substitute for the second-order derivative. For arbitrary manifolds $M$ this
leads to the concept of the tangent bundle $TM$. The proof of the Whitney
embedding theorem in Section 73.21 is a good illustration for the usefulness
of $TM$.

Definition 73.16. Let $M$ be a $C^k$-Banach manifold, $k \ge 1$. The tangent bundle
$TM$ is the set of all pairs
$(x, v)$ with $x \in M$, $v \in TM_x$.
The map $\pi: TM \to M$, where $\pi(x, v) = x$, is called the natural projection.

Intuitively, $TM$ is obtained by attaching the tangent space $TM_x$ at each
point $x \in M$, neglecting the fact that different tangent spaces may intersect
(Fig. 73.8). If $M$ is a surface in $\mathbb{R}^3$, then $TM$ will be a four-dimensional
manifold, since each component of a pair $(x, v) \in TM$ is determined by two
coordinates. Indeed, we need two coordinates to describe the point $x$ on $M$,
and, moreover, we need two coordinates to describe the tangent vector $v$ in
the tangent plane at $x$.

Proposition 73.17. Let $M$ be a $C^k$-manifold, $k \ge 1$. The tangent bundle $TM$ is
a $C^{k-1}$-Banach manifold. If $M$ has dimension $d$, then $TM$ has dimension $2d$.

Figure 73.8    Figure 73.9


PROOF. The simple proof idea is as follows. Choose $(x_\varphi, v_\varphi)$ as the local
coordinates for $(x, v)$ and then transform them according to (5) and (6).
More precisely, we assign a chart $(TU, \varphi_T)$ in $TM$ to each chart $(U, \varphi)$ in
$M$, with
$TU = \{(x, v): x \in U, v \in TM_x\}$
and
$\varphi_T(x, v) = (x_\varphi, v_\varphi)$.
It follows from Problem 73.2 that, in a natural way, $TM$ can be given a
separated topology. This makes $TM$ with the atlas above into a Banach
manifold. □

Since $(x_\varphi, v_\varphi)$ is an element of $U_\varphi \times X_\varphi$, the space $TM$ looks locally like the
product $U_\varphi \times X_\varphi$ with chart space $X_\varphi \times X_\varphi$. Globally, however, $TM$ need not
have a product structure.

Definition 73.18. Let $f: M \to N$ be $C^k$, and $M$ and $N$ $C^k$-Banach manifolds
with chart spaces over $\mathbb{K}$, $k \ge 1$. The tangent map $Tf: TM \to TN$ is defined
through
$Tf(x, v) = (f(x), f'(x)v)$ for all $(x, v) \in TM$.

Passing to local coordinates, we find that $Tf$ is $C^{k-1}$ since the local
representative of $Tf$ is
$(x_\varphi, v_\varphi) \mapsto (\bar f(x_\varphi), \bar f'(x_\varphi) v_\varphi)$.

Through iteration we obtain a $C^{k-2}$-map $T^2 f: T^2 M \to T^2 N$ for $k \ge 2$, etc.,
where $T^2 f = T(Tf)$ and $T^2 M = T(TM)$. Thus, using the tangent bundle, we
obtain a substitute for higher-order derivatives of $f$. In Problem 73.6 we prove
the following very convenient chain rule for higher-order derivatives
$T^r(g \circ f) = T^r g \circ T^r f$.
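
The first-order case $T(g \circ f) = Tg \circ Tf$ of this chain rule can be illustrated in the simplest setting, where $M = N = P = \mathbb{R}$ so that each tangent bundle is a product $\mathbb{R} \times \mathbb{R}$. The following is a minimal sketch with hypothetical sample maps $f = \sin$ and $g(y) = y^2$; the derivative of $g \circ f$ is written out separately so that the comparison is not circular.

```python
import math

# Illustrative maps on M = N = P = R (open sets in a B-space, so TM = M x R).
def f(x): return math.sin(x)
def df(x): return math.cos(x)
def g(y): return y * y
def dg(y): return 2.0 * y

def tangent_map(func, dfunc):
    """Return T func: (x, v) -> (func(x), func'(x) v)."""
    return lambda x, v: (func(x), dfunc(x) * v)

Tf = tangent_map(f, df)
Tg = tangent_map(g, dg)

# g o f and its derivative, written out independently of Tf and Tg.
def gf(x): return math.sin(x) ** 2
def dgf(x): return math.sin(2.0 * x)

T_gf = tangent_map(gf, dgf)

x, v = 0.7, -1.3
left = T_gf(x, v)          # T(g o f)(x, v)
right = Tg(*Tf(x, v))      # (Tg o Tf)(x, v)
gap = max(abs(a - b) for a, b in zip(left, right))
```

Here the base point component expresses composition of maps and the vector component expresses the classical chain rule, so both are carried by the single identity on $TM$.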
As another application of tangent bundles, we now look at vector fields on
manifolds. One may think of a vector field $v$ on $M$ as having a tangent vector
$v_x$ attached at each point $x \in M$ (Fig. 73.9).

Definition 73.19. A $C^k$-vector field on a Banach manifold $M$ is a $C^k$-section of
the tangent bundle $TM$, i.e., a $C^k$-map $v: M \to TM$ with
$\pi(v(x)) = x$ for all $x \in M$.
More precisely, we have $v(x) = (x, v_x)$ for $x \in M$ and $v_x \in TM_x$. Such a map
is $C^k$ if and only if for each point $x \in M$ there exists an admissible chart $(U, \varphi)$
in $M$ such that the map $x_\varphi \mapsto v_\varphi(x)$ from $U_\varphi$ into $X_\varphi$ is $C^k$. Here $v_\varphi(x)$ denotes
the representative of $v_x$ for $\varphi$.

Definition 73.19 makes sense if $TM$ is a $C^k$-manifold, i.e., $M$ is a
$C^{k+1}$-manifold, $k \ge 0$.
The concept of sections can be defined for general bundles.

Definition 73.20. A bundle is a triple $(\pi, B, M)$ where $B$ and $M$ are topological
spaces and $\pi: B \to M$ is a surjective, continuous map.
$B$ is called bundle space, $M$ is called basis space and $\pi$ is called natural
projection. The set $F_x = \pi^{-1}(x)$ is called a fiber over $x \in M$. A section is a
continuous map
$s: M \to B$ with $\pi(s(x)) = x$ for all $x \in M$,
i.e., $s(x) \in F_x$.

In many examples of bundles, the fibers are either linear spaces (e.g., vector
bundles) or groups (e.g., principal fiber bundles).

EXAMPLE 73.21. A standard example of a bundle $(\pi, B, M)$ is obtained by
considering the product
$B = M \times Y$,
where $M$ and $Y$ are topological spaces and
$\pi: M \times Y \to M$
is the natural projection, $\pi(x, y) = x$. Here, the fiber $\pi^{-1}(x)$ equals $\{x\} \times Y$ and
is homeomorphic to $Y$. Figure 73.10 shows a section.
A bundle morphism $(f, g)$ between two bundles $(\pi_j, B_j, M_j)$, $j = 1, 2$, is a pair
of continuous maps such that the following diagram commutes:
$\begin{array}{ccc} B_1 & \xrightarrow{\;f\;} & B_2 \\ \pi_1 \downarrow & & \downarrow \pi_2 \\ M_1 & \xrightarrow{\;g\;} & M_2 \end{array}$
i.e., $\pi_2 \circ f = g \circ \pi_1$.
Thus, $f$ and $g$ are compatible with the fiber structure, i.e., $f$ maps the fiber $F_x$
into the fiber $F_{g(x)}$. If $f$ and $g$ are homeomorphisms, then $(f, g)$ is called a bundle
isomorphism.

Figure 73.10


If, in the above definitions, topological spaces, continuous maps, and
homeomorphisms are replaced by $C^k$-Banach manifolds, $C^k$-maps, and
$C^k$-diffeomorphisms, then we will speak of $C^k$-bundles, $C^k$-bundle morphisms, and
$C^k$-bundle isomorphisms, respectively.
The tangent bundle $TM$ of a $C^k$-Banach manifold, $k \ge 1$, corresponds to
the $C^{k-1}$-bundle $(\pi, TM, M)$. The fiber over $x$ is the tangent space $TM_x$.
Similarly, the cotangent bundle $TM^*$, which will be defined below, corresponds
to the $C^{k-1}$-bundle $(\pi, TM^*, M)$. The fiber over $x$ is $TM_x^*$, i.e., is equal
to the dual space of the tangent space $TM_x$.

73.8. Cotangent Bundle


The dual space $X^*$ of all linear, continuous functionals on a B-space $X$ plays
an important role in mathematics. A generalization to manifolds uses the
concept of cotangent bundles.

Definition 73.22. Let $M$ be a $C^k$-Banach manifold, $k \ge 1$. The cotangent bundle
$TM^*$ consists of all pairs $(x, w^*)$ with $x \in M$ and $w^* \in TM_x^*$.

Each tangent space $TM_x$ is a topological vector space. Thus, the dual space
$TM_x^*$ is defined as the set of all linear, continuous functionals on $TM_x$.
Intuitively, the dual space $TM_x^*$ is attached at each point $x \in M$. Let $\pi: TM^* \to
M$ be the natural projection.
An element $w^* \in TM_x^*$ is called a cotangent vector to $M$ at $x$, or simply a
covector. We call $TM_x^*$ the cotangent space of $M$ at $x$.

Proposition 73.23. The cotangent bundle $TM^*$ is a $C^{k-1}$-Banach manifold. If
$M$ has dimension $d$, then $TM^*$ has dimension $2d$.

PROOF. Let $(U, \varphi)$ be a chart in $M$. The local coordinates for $x \in M$ and $v \in TM_x$
are $x_\varphi \in U_\varphi$ and $v_\varphi \in X_\varphi$. For $w^* \in TM_x^*$ we define a functional $w_\varphi^* \in X_\varphi^*$ through
$w_\varphi^*(v_\varphi) = w^*(v)$ for all $v \in TM_x$. (11)
The map $w^* \mapsto w_\varphi^*$ between $TM_x^*$ and $X_\varphi^*$ is a linear bijection. The pair $(x_\varphi, w_\varphi^*)$
is considered as a local coordinate for $(x, w^*) \in TM^*$, and $w_\varphi^*$ is called a
representative of $w^*$.
More precisely, we use the concept of abstract atlases of Problem 73.2,
which means that we assign an abstract chart $(TU^*, \varphi_{T^*})$ in $TM^*$ to each chart
$(U, \varphi)$ in $M$ with
$TU^* = \{(x, w^*): x \in U, w^* \in TM_x^*\}$
and
$\varphi_{T^*}(x, w^*) = (x_\varphi, w_\varphi^*)$.
With this abstract atlas, $TM^*$ becomes a $C^{k-1}$-Banach manifold. This can be
seen by passing to another chart $(V, \psi)$ in $M$. Then
$x_\psi = F(x_\varphi)$ (12)
with $F = \psi \circ \varphi^{-1}$ and
$v_\psi = F'(x_\varphi) v_\varphi$.
Formula (11) implies that $w_\psi^*(v_\psi) = w_\varphi^*(v_\varphi)$, so that
$\langle w_\psi^*, F'(x_\varphi) v_\varphi \rangle = \langle w_\varphi^*, v_\varphi \rangle$ for all $v_\varphi \in X_\varphi$.
From this we obtain the following transformation rule for the representatives
$w_\varphi^* = F'(x_\varphi)^* w_\psi^*$. (13)
□

Since $(x_\varphi, w_\varphi^*) \in U_\varphi \times X_\varphi^*$, the cotangent bundle $TM^*$ looks locally like the
product $U_\varphi \times X_\varphi^*$ with chart space $X_\varphi \times X_\varphi^*$. Globally, however, $TM^*$ need
not have a product structure.

Definition 73.24. A $C^k$-covector field on a Banach manifold $M$ is a $C^k$-section
of the cotangent bundle, i.e., a $C^k$-map $w^*: M \to TM^*$ with
$\pi(w^*(x)) = x$ for all $x \in M$.

More precisely, we have $w^*(x) = (x, w_x^*)$ with $x \in M$ and $w_x^* \in TM_x^*$. Such a
map is $C^k$ if and only if for each $x \in M$ there exists an admissible chart $(U, \varphi)$
in $M$ such that the map
$x_\varphi \mapsto w_\varphi^*(x)$
from $U_\varphi$ into $X_\varphi^*$ is $C^k$. Here $w_\varphi^*(x)$ denotes the representative of $w_x^*$ for $\varphi$.
Definition 73.24 makes sense if $TM^*$ is a $C^k$-manifold, i.e., $M$ is a
$C^{k+1}$-manifold, $k \ge 0$.

73.9. Global Solutions of Differential Equations on Manifolds and Flows

Consider the differential equation
$x'(t) = v(x(t), t)$, $x(0) = x_0$ (14)
on a manifold $M$, which satisfies the following conditions:
(H1) $M$ is a $C^{r+1}$-Banach manifold, $r \ge 1$.
(H2) $v$ is a time-dependent vector field on $M$, i.e., there exists a $C^r$-map
$V: M \times \mathbb{R} \to TM$ with $V(x, t) = (x, v(x, t))$.

Figure 73.11

Condition (H2) is satisfied if for each $x \in M$ there exists an admissible chart
$(U, \varphi)$ in $M$ with $x \in U$ such that the representative
$(x_\varphi, t) \mapsto v_\varphi(x_\varphi, t)$
is a $C^r$-map from $U_\varphi \times \mathbb{R}$ into $X_\varphi$.
Moreover, condition (H2) assigns a tangent vector $v(x(t), t)$ to each point
$x(t) \in M$ at time $t$. Thus a solution of (14) is a curve $x = x(t)$ in $M$ such that
the tangent vector $x'(t)$ at the point $x(t)$ is equal to the given vector $v(x(t), t)$
of the vector field. A simple model for this differential equation is as follows.
Consider a moving liquid on $M$ and let $x = x(t)$ be the trajectory of a particle
of fluid. Suppose at time $t = 0$ it is at the point $x_0$. At time $t$ it will be at the
point $x(t)$ with velocity $x'(t) = v(x(t), t)$. One may think, for example, of an
ocean current (Fig. 73.11). Writing $F_t x_0 = x(t)$ and $f(x_0, t) = F_t x_0$, we obtain
the concept of flows for time-independent velocity fields $v = v(x)$.

Definition 73.25. Let $M$ be a $C^r$-Banach manifold, $r \ge 0$. A flow on $M$ is a map
$(t, x) \mapsto f(t, x)$
from $\mathbb{R} \times M$ into $M$ such that the following conditions are satisfied for
$F_t x = f(t, x)$:
(i) $F_0 = I$ (identity on $M$).
(ii) $F_{t+s} = F_t F_s$ for all $t, s \in \mathbb{R}$.
If $f$ is $C^r$, then $f$ is called a $C^r$-flow.
If $f$ is only defined for $t \ge 0$ and (ii) holds for all $t, s \ge 0$, then $f$ is called a
semiflow.
If $f$ is defined only locally, i.e., only for $(t, x)$ in a neighborhood of some
point $(0, x_0)$ and (i), (ii) hold locally, then $f$ is called a local flow. The notation
for flows will be either $f$ or $\{F_t\}$.

A $C^r$-flow $\{F_t\}$ has an inverse $F_t^{-1} = F_{-t}$ and therefore $F_t: M \to M$ is a
$C^r$-diffeomorphism for all $t \in \mathbb{R}$. For this reason, a flow is also called a
one-parameter group of diffeomorphisms or simply a one-parameter group.
Similarly for semigroups.

Theorem 73.A (Existence Theorem for Differential Equations on Manifolds).
Assume (H1) and (H2). For each initial value $x_0 \in M$ there exists a maximal open
interval $J$ in $\mathbb{R}$ with $0 \in J$, on which (14) has a solution $x = x(t)$. This solution is
unique.
We set $f(x_0, t) = x(t)$. Then $f$ is $C^r$ on its domain of definition and assumes
values in $M$. For the autonomous equation
$x'(t) = v(x(t))$,
$f$ is a local $C^r$-flow.

Corollary 73.26 (Flows on Compact Manifolds). Assume (H1) and (H2), and
let $M$ be a compact manifold. Then we have $J = \mathbb{R}$ in the above theorem, i.e.,
for each initial value $x_0 \in M$ there exists a unique solution of (14) which is defined
for all $t \in \mathbb{R}$. If (14) is an autonomous equation, then $f$ is a $C^r$-flow.

PROOF OF THEOREM 73.A. Working in charts, we obtain a differential equation
for the representatives analogous to (14), where the right-hand side is of class
$C^r$. Theorem 4.0 implies that, locally, there exists a unique solution. Now
consider all open intervals $J_\alpha$ in $\mathbb{R}$ with $0 \in J_\alpha$ such that (14) has a solution in
$J_\alpha$. This solution is unique with maximal interval of existence equal to $J =
\bigcup_\alpha J_\alpha$.
Theorem 4.0 implies that $f$ is $C^r$, and the proof below shows that $f$ is a
local flow. □

PROOF OF COROLLARY 73.26. Let $M$ be compact. Suppose that $J \ne \mathbb{R}$, i.e., $J$
may be equal to $J = \,]t_1, t_2[$ with $t_2 < \infty$. It follows then from the compactness
of $M$ that $x(t) \to x$ as $t \to t_2 - 0$. Thus, for the initial-value problem in $x$, we
can continue the solution beyond $t_2$, which is a contradiction.
We next show that $f$ is a flow. Let $x = x(t)$ be a solution of
$x'(t) = v(x(t))$, $x(0) = x_0$.
Let $y(t) = x(t + s)$ for some fixed $s$. Then $y = y(t)$ is a solution of
$y'(t) = v(y(t))$, $y(0) = x(s)$.
Therefore, we have $y(t) = F_{t+s} x_0$ and
$y(t) = F_t x(s) = F_t F_s x_0$,
and hence $F_{t+s} = F_t F_s$. □

An important fact about the concept of flows and semiflows is that it allows
us to describe scientific processes which satisfy the principle of causality. Let
$\{F_t\}$ be a semiflow. The points $x \in M$ represent the states of a system. Suppose
the system is in $x_0$ at time $t = 0$ and in $F_t x_0$ at time $t$. Formula
$F_{t+s} = F_t F_s$ (15)
shows that the final state of the process depends only on its initial state and
the time period in between. Each intermediate state $F_s x_0$ can be viewed as a
new initial state, which leads to the final state $F_t(F_s x_0)$ after time $t$. From (15),
this is the same as the final state $F_{t+s} x_0$, corresponding to the initial state $x_0$,
after time $t + s$. Thus, formula (15) represents a principle of causality.
Semiflows are only defined for $t \ge 0$, i.e., only for future times. They describe
irreversible processes such as diffusion and heat conduction, while reversible
processes like wave motions are described by flows.
We will now indicate why it is often easier to work with flows and semiflows,
rather than with the differential equation itself.

EXAMPLE 73.27. A standard example of a flow is the linear flow generated by
the differential equation
$x'(t) = Ax(t)$, $x(0) = x_0$, (16)
where $A: X \to X$ is a linear, continuous operator on a B-space $X$. The solution
has the form $x(t) = e^{tA} x_0$, and
$F_t = e^{tA}$
describes the flow on $X$. The multiplication rule for the exponential function
$e^{(t+s)A} = e^{tA} e^{sA}$ is equivalent to the group property (15).
There are, however, many time-dependent scientific processes (16) where $A$
is not a bounded operator. A typical case is:
(H) $A: D(A) \subseteq X \to X$ is a linear, self-adjoint operator on an H-space $X$.
The proofs of the following results may be found in Chapter 19 of Part II.
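
For a bounded operator, the group property of the linear flow in Example 73.27 can be checked directly in finite dimensions. The sketch below makes several assumptions that are not in the text: $X = \mathbb{R}^2$, the matrix $A$ is the generator of rotations, and $e^{tA}$ is approximated by a truncated power series, which is adequate only for small matrices of moderate norm. All function names are hypothetical.

```python
import math

def mat_mul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, t, terms=40):
    """Truncated power series for e^{tA}; a sketch for small bounded matrices."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes (tA)^k / k!
        term = mat_mul(term, [[t * a / k for a in row] for row in A])
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

A = [[0.0, -1.0], [1.0, 0.0]]   # generator of rotations, chosen for illustration

def flow(t):
    """F_t = e^{tA}."""
    return mat_exp(A, t)

t, s = 0.8, -0.3
lhs = flow(t + s)
rhs = mat_mul(flow(t), flow(s))
group_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
rotation_gap = max(abs(flow(t)[0][0] - math.cos(t)),
                   abs(flow(t)[1][0] - math.sin(t)))
```

Negative times are allowed here, which reflects that a bounded generator yields a flow and not merely a semiflow.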

EXAMPLE 73.28 (Parabolic Case). Assume (H) with $(Ax \mid x) \le 0$ for all $x \in D(A)$.
Then
$F_t = e^{tA}$
defines a semiflow on $X$, and $x(t) = F_t x_0$ is the unique solution of (16) which
passes through $x_0 \in D(A)$ and is defined for all $t \ge 0$.
Thus we have the important situation that $F_t x_0$ is defined for all $x_0 \in X$ and
$t \ge 0$, whereas the corresponding differential equation might not have a
solution for each $x_0 \in X$, if $D(A) \ne X$. Furthermore, $F_t: X \to X$ is linear and
continuous for each $t \ge 0$.

As a typical application, we consider the parabolic boundary initial-value
problem
$x_t = \Delta x$,
$x = x_0$ for $t = 0$ (initial condition), (17)
$x = 0$ on $\partial G$ for $t \ge 0$ (boundary condition),
for the sought function $x: \mathbb{R}_+ \times G \to \mathbb{R}$. Here $G$ is a bounded region in $\mathbb{R}^n$. We
may think of $x(t, p)$ as a temperature at the point $p \in G$ at time $t$.
Let $X = L_2(G)$. The boundary condition suggests choosing $C_0^\infty(G)$ as the
domain for the Laplacian $\Delta x$. The Laplacian is a symmetric operator which,
according to Section 19.11, can be extended to a self-adjoint operator $A$:
$D(A) \subseteq X \to X$ with $(Ax \mid x) \le 0$ for all $x \in D(A)$. Formula (16) is a
functional-analytic version of (17) and
$x(t) = e^{tA} x_0$
is a generalized solution of (17) through $x_0 \in X$.
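
The semiflow of the heat problem (17) becomes explicit in a simple special case. The following sketch assumes, purely for illustration, that $G = (0, \pi)$ is an interval in $\mathbb{R}^1$ and that the initial state is a finite sine sum $x_0(p) = \sum_n b_n \sin(np)$. Since $\sin(np)$ vanishes on $\partial G$ and is an eigenfunction of the Laplacian with eigenvalue $-n^2$, the operator $e^{tA}$ acts on the coefficients by $b_n \mapsto e^{-n^2 t} b_n$. The function names and the sample coefficients are hypothetical.

```python
import math

# Assumption: G = (0, pi) in R^1, and the state x is stored through finitely many
# sine coefficients b_n of x(p) = sum_n b_n sin(n p), n = 1..N.

def semiflow(t, coeffs):
    """F_t acts diagonally: b_n -> exp(-n^2 t) b_n; only t >= 0 is admitted."""
    if t < 0:
        raise ValueError("the semiflow is only defined for t >= 0")
    return [b * math.exp(-(n * n) * t) for n, b in enumerate(coeffs, start=1)]

def l2_norm(coeffs):
    # For the sine basis on (0, pi): ||x||^2 = (pi / 2) * sum of b_n^2.
    return math.sqrt(math.pi / 2 * sum(b * b for b in coeffs))

x0 = [1.0, -0.5, 0.25, 2.0]
t, s = 0.3, 0.2
direct = semiflow(t + s, x0)
composed = semiflow(t, semiflow(s, x0))
semigroup_gap = max(abs(a - b) for a, b in zip(direct, composed))
norms = [l2_norm(semiflow(tau, x0)) for tau in (0.0, 0.1, 0.5, 1.0)]
```

For a finite sum one could formally insert negative $t$, but the factors $e^{n^2 |t|}$ grow without bound in $n$, so the backward map does not extend to all of $L_2(G)$; this is why the sketch rejects $t < 0$. The decreasing values in `norms` illustrate the irreversible smoothing of heat conduction.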

EXAMPLE 73.29 (Schrödinger Case). Consider the differential equation
$x'(t) = -iAx(t)$, $x(0) = x_0$. (18)
Here $A$ is the energy operator of the quantum system. Assume (H) and let $X$
be a complex H-space. Then $F_t = e^{-itA}$ defines a flow on $X$. For $x_0 \in D(A)$,
$x(t) = e^{-itA} x_0$
is the unique solution of (18) defined for all $t \in \mathbb{R}$.

The Schrödinger equation of quantum mechanics has precisely the form
(18) if we set $\hbar = 1$.

73.10. Linearization Principle for Maps


Consider the following question:
(L) Which map behaves, locally, after a suitable coordinate change, like its
linearization?
In this section we show that for etale mappings, submersions, immersions, and
subimmersions, (L) has a positive answer. The following section contains the
corresponding global results. We make the following general assumption.
(H) Let $f: M \to N$ be $C^k$, $k \ge 1$, where $M$ and $N$ are $C^k$-Banach manifolds
with chart spaces over $\mathbb{K}$.
The splitting of linear subspaces, which has been used already in Part I, will
play an important role. The linearization of $f$ at the point $x$ is the map
$f'(x): TM_x \to TN_{f(x)}$.

Let $Y$ be a linear subspace of $TM_x$. The tangent space $TM_x$ is a topological
vector space. Thus, it follows from $A_1$(22l) that the splitting of $Y$ is well defined.
The tangent space $TM_x$ and the corresponding chart spaces $X_\varphi$ are linear
homeomorphic. The invariance of the splitting under linear homeomorphisms
implies that $Y$ splits $TM_x$ if and only if there exists a chart $(U, \varphi)$ in $M$ with
$x \in U$ such that the representative $Y_\varphi$ of $Y$ splits the chart space $X_\varphi$. Thus, $Y$
splits $TM_x$ if one of the following three conditions is satisfied:
(i) $\dim M < \infty$.
(ii) $\dim Y < \infty$ or $\operatorname{codim} Y < \infty$ and $Y$ is closed.
(iii) $Y$ is closed, and some chart space $X_\varphi$ at the point $x$ is an H-space.
Note $A_1$(22l). Analogous results hold for $TN_y$.

Definition 73.30. Assume (H).
(a) $f$ is called an etale mapping at $x$ if and only if $f'(x): TM_x \to TN_{f(x)}$ is
bijective.
(b) $f$ is called a submersion at $x$ if and only if $f'(x)$ is onto and the null space
$N(f'(x))$ splits $TM_x$.
(c) $f$ is called an immersion at $x$ if and only if $f'(x)$ is injective, and the range
$R(f'(x))$ splits $TN_{f(x)}$.
(d) $f$ is called a subimmersion at $x_0$ if and only if $\operatorname{rank} f'(x)$ is constant in a
neighborhood of $x_0$.
Since $\operatorname{rank} f'(x) = \dim R(f'(x))$, this provides a natural classification of
maps between manifolds according to the behavior of the linearizations.
Properties (a)-(d) are satisfied if they are satisfied for a representative of $f$ in
local charts. If $M$ and $N$ are finite-dimensional, then the splitting is automatic.
The following Theorems 73.B-73.F contain important applications of the
definitions above, which have already been studied in the context of B-spaces
in Chapter 4. For the convenience of the reader, we begin with a number of
definitions which often occur in connection with manifolds.
We begin with a global version of Definition 73.30. Let (H) be satisfied. A
map f: M-+ N which is a submersion at every point xeM is simply called a
submersion, and analogous definitions are given, for etale mappings, immer-
sions, and subimmersions. A map f: M-+ N is called closed if it takes closed
sets into closed sets, and it is called proper if the preimage of compact sets is
compact.

EXAMPLE 73.31. Let $f: U(x_0) \subseteq \mathbb{K}^n \to \mathbb{K}^m$ be $C^1$, $\mathbb{K} = \mathbb{R}, \mathbb{C}$. Then we have
$f'(x_0) = (\partial f_i(x_0)/\partial \xi_j)$ with $i = 1, \ldots, m$ and $j = 1, \ldots, n$.
(i) If $n \le m$ and $\operatorname{rank} f'(x_0) = n$, then $f$ is an immersion at $x_0$.
(ii) If $n \ge m$ and $\operatorname{rank} f'(x_0) = m$, then $f$ is a submersion at $x_0$.
(iii) If $\operatorname{rank} f'(x)$ is constant in a neighborhood of $x_0$, then $f$ is a subimmersion
at $x_0$.
The matrix $f'(x_0)$ in (i) and (ii) has a nonvanishing subdeterminant of maximal
size. Continuity implies that it is nonvanishing in a neighborhood of $x_0$, and
thus (iii) follows from (i) as well as from (ii). Therefore, we have in
finite-dimensional spaces: every submersion and immersion at $x_0$ is also a
subimmersion at $x_0$.
The map $f: \mathbb{R}^2 \to \mathbb{R}^1$ with $f(x) = \xi_1^2 - \xi_2^2$ is not a subimmersion at $x = (0, 0)$,
and hence is not an immersion and submersion since $\operatorname{rank} f'(x) = 0$ (or $= 1$)
for $x = 0$ (or $x \ne 0$). Note that $f'(x) = (2\xi_1, -2\xi_2)$.
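
The rank jump in this last map is easy to observe numerically. The following minimal sketch evaluates $f'(x) = (2\xi_1, -2\xi_2)$ at the origin and at a few nearby points; the helper names and the sample points are hypothetical, and the rank of a $1 \times n$ matrix is simply 1 if some entry is nonzero and 0 otherwise.

```python
def f_prime(x):
    """Derivative of f(xi_1, xi_2) = xi_1^2 - xi_2^2 as a 1 x 2 matrix."""
    xi1, xi2 = x
    return (2.0 * xi1, -2.0 * xi2)

def rank_of_row(row, tol=1e-12):
    """Rank of a 1 x n matrix: 1 if some entry is nonzero, else 0."""
    return 1 if any(abs(entry) > tol for entry in row) else 0

points = [(0.0, 0.0), (1e-3, 0.0), (0.0, -1e-3), (0.5, 0.5)]
ranks = [rank_of_row(f_prime(p)) for p in points]
```

The rank is 0 at the origin and 1 at every sampled point arbitrarily close to it, so it is not constant on any neighborhood of the origin.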

Definition 73.32. Assume (H) for $f: M \to N$.
(a) A point $x \in M$ is called a regular point of $f$ if and only if $f$ is a submersion
at $x$. Otherwise $x$ is called a singular point.
(b) A point $y \in N$ is called a regular value of $f$ if and only if the set $f^{-1}(y)$ is
empty or consists only of regular points. Otherwise $y$ is called a singular
value, i.e., $f^{-1}(y)$ contains at least one singular point.
Instead of singular (or regular) point, one sometimes says critical or
degenerate (or noncritical or nondegenerate) point.

Definition 73.33. Let (H) be satisfied. A map $f: M \to N$ is called a Fredholm
operator at $x$ if and only if the linearization $f'(x): TM_x \to TN_{f(x)}$ is a Fredholm
operator, i.e., we have $\dim N(f'(x)) < \infty$ and $\operatorname{codim} R(f'(x)) < \infty$. We define
the index as
$\operatorname{ind} f'(x) = \dim N(f'(x)) - \operatorname{codim} R(f'(x))$.
Moreover, if $f: M \to N$ is a Fredholm operator at all $x \in M$, then $f$ is simply
called a Fredholm operator. If $\operatorname{ind} f'(x)$ is constant on $M$, then this number
is called the index of $f$. We write $\operatorname{ind} f$.

A map $f: M \to N$ is a Fredholm operator at $x$ if and only if the representatives
of $f$ in local charts are Fredholm operators at the corresponding points.
The invariance of the index of Fredholm operators on B-spaces under
perturbations, which has been discussed in Section 8.4, implies that $\operatorname{ind} f'(x)$ is locally
constant, i.e., $\operatorname{ind} f'(x)$ is constant on $M$, if $M$ is connected.
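
In finite dimensions the stability of the index is elementary: for a linear map $A: \mathbb{K}^n \to \mathbb{K}^m$ the rank-nullity formula gives $\operatorname{ind} A = (n - \operatorname{rank} A) - (m - \operatorname{rank} A) = n - m$, whatever the rank is. The sketch below illustrates this with three hypothetical $2 \times 3$ matrices of ranks 2, 1, and 0; the rank is computed by exact Gaussian elimination over the rationals, and the helper names are assumptions made for this illustration.

```python
from fractions import Fraction

def rank(matrix):
    """Rank via Gaussian elimination over the rationals (exact arithmetic)."""
    rows = [[Fraction(a) for a in row] for row in matrix]
    r = 0
    n_cols = len(rows[0]) if rows else 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def index(matrix):
    """ind A = dim N(A) - codim R(A) for A: K^n -> K^m given as an m x n matrix."""
    m, n = len(matrix), len(matrix[0])
    rk = rank(matrix)
    return (n - rk) - (m - rk)

examples = [
    [[1, 0, 0], [0, 1, 0]],      # rank 2
    [[1, 2, 3], [2, 4, 6]],      # rank 1
    [[0, 0, 0], [0, 0, 0]],      # rank 0
]
ranks = [rank(A) for A in examples]
indices = [index(A) for A in examples]
```

The kernel and cokernel dimensions change from matrix to matrix, but their difference does not; the infinite-dimensional statement in Section 8.4 is the analogue of this cancellation.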

Theorem 73.B (Diffeomorphisms). Assume (H) for the $C^k$-map $f: M \to N$,
$k \ge 1$. Then $f$ is a local $C^k$-diffeomorphism at $x$ if and only if $f$ is an etale
mapping at $x$.
$f$ is a $C^k$-diffeomorphism if and only if $f$ is a bijective etale mapping.

PROOF. Working in local charts, the theorem follows from the inverse mapping
theorem (Theorem 4.F) applied to the representatives. □

Theorem 73.B is the linearization principle for diffeomorphisms. We now
look at the local behavior of $f$ in the case that $f'(x)$ is not bijective, i.e.,
we consider subimmersions, submersions, and immersions at $x$. The
corresponding global results will be discussed in the following section.

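Proposition 73.34 (Local Subimmersion Theorem). Let $X$ and $Y$ be finite-dimensional B-spaces over $\mathbb{K}$, and let $f\colon U(x_0) \subseteq X \to Y$ be a $C^k$-subimmersion at $x_0$, $k \ge 1$. Then there exist local $C^k$-diffeomorphisms $\varphi$ and $\psi$ with $\varphi(x_0) = 0$ and $\psi(0) = f(x_0)$ such that the following diagram commutes:
$$\begin{array}{ccc}
U(x_0) \subseteq X & \xrightarrow{\ f\ } & Y \\
{\scriptstyle\varphi}\downarrow & & \uparrow{\scriptstyle\psi} \\
U(0) \subseteq X & \xrightarrow{\ f'(x_0)\ } & Y
\end{array}$$
that is, $f = \psi \circ f'(x_0) \circ \varphi$ near $x_0$. If $f$ is analytic, then $\varphi$ and $\psi$ are also analytic.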

Corollary 73.35. The same conclusions hold if, in the above diagram, $f'(x_0)$ is replaced by $g$ with
$$g(\xi_1, \ldots, \xi_n) = (\xi_1, \ldots, \xi_r, 0, \ldots, 0).$$
Here, we identify $X$ and $Y$ with $\mathbb{K}^n$ and $\mathbb{K}^m$ and note that $r = \operatorname{rank} f'(x_0)$.

PROOF. In Problem 4.4b we proved this proposition for $g$ (rank theorem). Furthermore, we know from linear algebra that there exist linear isomorphisms $A$ and $B$ with $g = Af'(x_0)B$ (normal form for matrices of rank $r$). This proves the proposition in the case of $f'(x_0)$. □

Thus, after introducing new coordinates, $f$ looks, locally at $x_0$, like its derivative $f'(x_0)$. In the case of submersions and immersions, this important linearization result can be generalized to infinite-dimensional B-spaces. For simplicity, let $f(x_0) = 0$ and $x_0 = 0$. This can always be obtained by using translations.

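Proposition 73.36 (Local Submersion Theorem). Let $X$ and $Y$ be B-spaces over $\mathbb{K}$ and $f\colon U(x_0) \subseteq X \to Y$ a $C^k$-submersion at $x_0$, $k \ge 1$, with $f(x_0) = 0$. Then there exists a local $C^k$-diffeomorphism $\varphi$ with $\varphi(x_0) = 0$ and $\varphi'(x_0) = I$ such that the following diagram commutes:
$$\begin{array}{ccc}
U(x_0) \subseteq X & \xrightarrow{\ f\ } & Y \\
{\scriptstyle\varphi}\downarrow & \nearrow{\scriptstyle f'(x_0)} & \\
U(0) \subseteq X & &
\end{array}$$
that is, $f = f'(x_0) \circ \varphi$ near $x_0$. If $f$ is analytic, then $\varphi$ is also analytic.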

PROOF. This is a consequence of results of Section 73.1, but we give here a direct proof. Let $N = N(f'(x_0))$. Since $f$ is a submersion at $x_0$, $N$ splits the B-space $X$. Thus, there exists a projection operator $P\colon X \to N$. Let $P^\perp = I - P$ and $N^\perp = P^\perp(X)$. Then we obtain that $X = N \oplus N^\perp$ and also that $f'(x_0)\colon N^\perp \to Y$ is bijective. Let its inverse be denoted by $A\colon Y \to N^\perp$ and define
$$\varphi(x) = Px + Af(x). \tag{19}$$
Then we have $\varphi'(x_0) = P + Af'(x_0) = P + P^\perp = I$. The inverse mapping theorem (Theorem 4.F) implies that $\varphi$ is a local $C^k$-diffeomorphism at $x_0$. Applying $f'(x_0)$ to both sides of (19) gives $f'(x_0)\varphi(x) = f(x)$, since $f'(x_0)P = 0$ and $f'(x_0)A = I$. □

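Moreover, we obtain that the following diagram commutes:
$$\begin{array}{ccc}
U(x_0) \subseteq X & \xrightarrow{\ f\ } & Y \\
{\scriptstyle\varphi}\downarrow & & \uparrow{\scriptstyle f'(x_0)} \\
U(0) \subseteq X & \xrightarrow{\ P^\perp\ } & N^\perp
\end{array}$$
This, and the diagram of Proposition 73.36, show that, after a suitable coordinate change, $f$ looks, locally at $x_0$, like $f'(x_0)$ or like the projection operator $P^\perp$.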
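The construction (19) can be tried out on a simple example. The following sketch assumes the hypothetical map $f(x, y) = x + xy + y^2$ on $\mathbb{R}^2$ with $x_0 = 0$, so that $f'(0) = (1, 0)$, $N$ is the $y$-axis, and $N^\perp$ is the $x$-axis; all function names are illustrative. It checks numerically that $f'(0)\varphi = f$ and that $\varphi'(0) = I$.

```python
def f(x, y):
    return x + x * y + y * y           # f(0, 0) = 0 and f'(0, 0) = (1, 0)

def df0(h):
    return h[0]                        # the linear map f'(0): R^2 -> R

def P(v):
    return (0.0, v[1])                 # projection onto N = N(f'(0)), the y-axis

def A(t):
    return (t, 0.0)                    # inverse of f'(0) restricted to the x-axis

def phi(x, y):
    # Formula (19): phi(x) = P x + A f(x)
    p, a = P((x, y)), A(f(x, y))
    return (p[0] + a[0], p[1] + a[1])

# f'(0) phi(x) = f(x) on a small grid around the origin
points = [(i / 10.0, j / 10.0) for i in range(-3, 4) for j in range(-3, 4)]
max_error = max(abs(df0(phi(x, y)) - f(x, y)) for x, y in points)

# phi'(0) by central differences; it should be the identity matrix
eps = 1e-6
col_x = [(a - b) / (2 * eps) for a, b in zip(phi(eps, 0.0), phi(-eps, 0.0))]
col_y = [(a - b) / (2 * eps) for a, b in zip(phi(0.0, eps), phi(0.0, -eps))]
print(max_error, col_x, col_y)
```

In the new coordinate $\varphi(x, y) = (f(x, y), y)$, the map $f$ is simply the projection onto the first component.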

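Proposition 73.37 (Local Immersion Theorem). Let $X$ and $Y$ be B-spaces over $\mathbb{K}$, and $f\colon U(0) \subseteq X \to Y$ a $C^k$-immersion, $k \ge 1$, at $x_0 = 0$. Then there exists a local $C^k$-diffeomorphism $\varphi$ with $\varphi(0) = f(0)$ and $\varphi'(0) = I$ such that the following diagram commutes:
$$\begin{array}{ccc}
U(0) \subseteq X & \xrightarrow{\ f\ } & Y \\
 & {\scriptstyle f'(0)}\searrow & \uparrow{\scriptstyle\varphi} \\
 & & Y
\end{array}$$
that is, $f = \varphi \circ f'(0)$ near $0$. If $f$ is analytic, then $\varphi$ is also analytic.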

PROOF. Without loss of generality, let $f(0) = 0$. Since $f$ is an immersion at $x_0 = 0$, it follows that $R(f'(0))$ splits the B-space $Y$. Let $P\colon Y \to R(f'(0))$ be a projection operator and define
$$\varphi(y) = f(f'(0)^{-1}Py) + (I - P)y. \tag{20}$$
It follows from $Pf'(0)x = f'(0)x$ that $\varphi(f'(0)x) = f(x)$, i.e., the above diagram commutes. Moreover, we have
$$\varphi'(0)h = f'(0)f'(0)^{-1}Ph + (I - P)h = h,$$
and hence $\varphi'(0) = I$. The inverse mapping theorem (Theorem 4.F) implies that $\varphi$ is a local $C^k$-diffeomorphism at $x_0 = 0$. □
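Formula (20) can be illustrated in the same spirit. The sketch below assumes the hypothetical curve $f(t) = (\sin t, t^2)$ in $\mathbb{R}^2$, for which $f'(0)t = (t, 0)$ and $R(f'(0))$ is the first coordinate axis; the function names are chosen for illustration only. It checks that $\varphi(f'(0)t) = f(t)$.

```python
import math

def f(t):
    return (math.sin(t), t * t)        # f'(0) = (1, 0), so f is an immersion at 0

def df0(t):
    return (t, 0.0)                    # the linear map f'(0): R -> R^2

def P(y):
    return (y[0], 0.0)                 # projection of R^2 onto R(f'(0))

def df0_inverse(y):
    return y[0]                        # inverse of f'(0) on its range

def phi(y):
    # Formula (20): phi(y) = f(f'(0)^{-1} P y) + (I - P) y
    py = P(y)
    a = f(df0_inverse(py))
    return (a[0] + (y[0] - py[0]), a[1] + (y[1] - py[1]))

samples = [k / 10.0 for k in range(-5, 6)]
max_error = max(
    max(abs(u - v) for u, v in zip(phi(df0(t)), f(t))) for t in samples
)
print(max_error)
```

Here $\varphi(y_1, y_2) = (\sin y_1, y_1^2 + y_2)$, which maps the straight line $y_2 = 0$ onto the image of $f$.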

73.11. Two Principles for Constructing Manifolds


We are looking for general and easy to apply criteria to decide whether or
not a given set is a Banach manifold. We will discuss the following two
principles.

(P1) Implicit construction by solving an equation $f(x) = y$ for some fixed $y$ (Theorems 73.C and 73.D).
(P2) Explicit construction by using an embedding (Theorem 73.E).

Let us explain the basic ideas. Principles (P1) and (P2) correspond to (i) and (ii) below, respectively. In $\mathbb{R}^n$, $n \ge 2$, there are two ways to construct a curve $C$.

(i) The curve $C$ is defined as the set of all points $x \in \mathbb{R}^n$ which satisfy the $(n - 1)$ equations
$$f_i(x) = 0, \quad i = 1, \ldots, n - 1. \tag{21}$$
(ii) The curve $C$ is defined by using a parameter equation $x = x(t)$.

In both cases we will not always obtain manifolds. In (ii), for example, the
curve C may intersect itself. In (i), the rank of f'(x) plays an important role.
More precisely, we obtain from Section 4.18:

If $f\colon \mathbb{R}^n \to \mathbb{R}^{n-1}$ is $C^k$, $k \ge 1$, and $0$ is a regular value of $f$, i.e., the rank of the Jacobian matrix $f'(x) = (\partial f_i(x)/\partial \xi_j)$ is equal to $n - 1$ for all $x$ which satisfy (21), then the solution set of (21) is a one-dimensional $C^k$-manifold.

In many cases (i) is more convenient than (ii). For example, a circle is defined through $\xi^2 + \eta^2 = 1$. Similarly for a sphere. In this case, parameter equations are more elaborate.
We now generalize these two principles. Submersions (regular values) play
an important role in (i) and immersions in (ii). As in Theorem 73.B above, we
use the process of linearization, i.e., the behavior of the nonlinear map is
described by corresponding properties of the derivative.
As we saw in Section 4.16, formula (21) can be used to construct manifolds even if the rank of $f'(x)$ is not equal to $n - 1$, i.e., is not maximal. More precisely, we have:
If $f\colon \mathbb{R}^n \to \mathbb{R}^{n-1}$ is $C^k$, $k \ge 1$, and $f$ is a subimmersion on $f^{-1}(0)$ of rank $r$, then the solution set of (21) is an $(n - r)$-dimensional $C^k$-manifold.
Subimmersion on $f^{-1}(0)$ means that in the neighborhood of each point $x \in \mathbb{R}^n$ which satisfies (21), the rank of $f'(x)$ is equal to $r$. A generalization of this yields the subimmersion theorem below, which is valid only in the finite-dimensional case.
We need the concept of a submanifold. As a motivation, consider a curve $C$ on the surface of the earth $M$, with no endpoints (open or closed curve without self-intersections). One may think of a circle of longitude or a circle of latitude or a river without source and mouth. We call $C$ a submanifold of $M$ if for each $x \in C$ there exists a chart in $M$ for which $C$ looks like a straight line (Fig. 73.12). Naturally, we do not require that such a chart is contained in our atlas. We only assume that such a chart exists and that it is compatible with our atlas, i.e., is an admissible chart in the sense of Section 73.2. Since $M$ is a $C^\infty$-manifold, and a chart change uses $C^\infty$-maps, a one-dimensional submanifold $C$ must be a $C^\infty$-curve. According to the general definition below, points and open subsets of the surface of the earth are zero-dimensional and two-dimensional submanifolds. Also, our definition implies that submanifolds of Banach manifolds look, locally, in suitable admissible charts, like closed linear subspaces of the chart spaces. In order to obtain more convenient methods for constructing submanifolds, we make the additional assumption that these subspaces split the corresponding chart spaces. This condition is automatically satisfied for finite-dimensional manifolds.

Figure 73.12

Definition 73.38. Let $M$ be a $C^k$-Banach manifold, $k \ge 0$. A subset $S$ of $M$ is called a submanifold of $M$ if and only if for each point $x \in S$ there exists an admissible chart $(U, \varphi)$ in $M$ with $x \in U$ such that the following hold:
(i) The chart space $X_\varphi$ contains a linear, closed subspace $Y_\varphi$ which splits $X_\varphi$.
(ii) The chart image $\varphi(U \cap S)$ is an open set in $Y_\varphi$.

Proposition 73.39. Every submanifold $S$ of a $C^k$-Banach manifold is itself a $C^k$-Banach manifold.

PROOF. Choose as atlas charts of $S$ the restrictions of the special charts $(U, \varphi)$ of Definition 73.38. That is, the charts of $S$ are $(U \cap S, \varphi|_{U \cap S})$ with chart spaces $Y_\varphi$. Here, $\varphi|_{U \cap S}$ denotes the restriction of $\varphi$ onto $U \cap S$. □

For the following three global theorems about submersions (regular values),
immersions, and subimmersions we make the assumption:
(H) The map $f\colon M \to N$ is $C^k$, $k \ge 1$, with $M$ and $N$ $C^k$-Banach manifolds with chart spaces over $\mathbb{K}$.

Theorem 73.C (Preimage Theorem). Assume (H) for the map $f\colon M \to N$. If $y$ is a regular value of $f$, then $S = f^{-1}(y)$ is a submanifold of $M$, i.e., in particular, a $C^k$-Banach manifold.
The tangent space $TS_x$ is equal to $N(f'(x))$ for all $x \in S$. If $M$ and $N$ have a dimension, then $\operatorname{codim} S = \dim N$.

In Section 78.8 we prove the important Sard-Smale theorem. It states that, for Fredholm operators $f$, most points $y \in N$ are regular values of $f$.

Corollary 73.40. If $f\colon M \to N$ is a $C^k$-submersion, $k \ge 1$, such that (H) is satisfied, then each image point $y \in f(M)$ is a regular value of $f$.

The codimension $\operatorname{codim} S$ of a submanifold $S$ of $M$ is the codimension of $TS_x$ in $TM_x$, if this number is independent of $x \in S$. Recall that $M$ and $N$ have a dimension if $\dim TM_x$ and $\dim TN_y$ are independent of $x \in M$ and $y \in N$. We then have $\dim M = \dim TM_x$ and $\dim N = \dim TN_y$. Therefore, we get
$$\dim S + \operatorname{codim} S = \dim M.$$
Theorem 73.C then implies
$$\dim S + \dim N = \dim M. \tag{22}$$
Thus, Theorem 73.C generalizes the following well-known result of linear algebra. If $M = \mathbb{R}^m$, $N = \mathbb{R}^n$, $y = 0$, and if $f$ is linear and surjective, then $S$ is the solution set of the system of linear equations $f(x) = 0$. This is a linear subspace of $M$ with $\dim S = m - n$. Similarly, Theorems 73.D and 73.E below are important generalizations of known results about systems of linear equations.
Corollary 73.40 immediately follows from Definition 73.32.
PROOF OF THEOREM 73.C. It suffices to study the local problem. Thus we look at $S$ in a neighborhood of some given point $x$.
(I) Let $M$ and $N$ be B-spaces. The local submersion theorem (Proposition 73.36) implies that there exists an admissible chart $(U, \varphi)$ in $M$ with $\varphi(x) = 0$ and $\varphi'(x) = I$ such that
$$f(\varphi^{-1}(h)) = f'(x)h + y$$
for all $h$ in a neighborhood of $0$. Thus the solution set for the equation
$$f(z) = y$$
in a neighborhood of $z = x$ corresponds to the solution set of the equation
$$f'(x)h = 0$$
in a neighborhood of $h = 0$. Thus in this chart, $S$ looks, locally, like $N(f'(x))$. Since $f$ is a submersion at $x$, the null space $N(f'(x))$ splits the B-space $TM_x = M$.
We now compute the tangent space $TS_x$. Let $x = x(t)$ be a $C^1$-curve in $S$ with $x(0) = x$. It follows from $f(x(t)) = y$ that $f'(x)x'(0) = 0$, and hence that $TS_x \subseteq N(f'(x))$. Conversely, let $f'(x)v = 0$. If we choose $x(t) = \varphi^{-1}(tv)$, then this curve lies in $S$ and $x'(0) = v$. Therefore, we have $TS_x = N(f'(x))$.
For a linear operator $L$ we always have $\operatorname{codim} N(L) = \dim R(L)$. Since $f'(x)\colon TM_x \to TN_{f(x)}$ is surjective, it follows that $\operatorname{codim} TS_x = \dim TN_{f(x)}$. For the special case at hand we have $TM_x = M$ and $TN_{f(x)} = N$.
(II) Let $M$ and $N$ be Banach manifolds. Then, locally, they look like B-spaces. Thus we obtain Theorem 73.C by applying (I) to the charts of $M$ and $N$ and to the corresponding representatives of $f$. □

EXAMPLE 73.41. Let $f\colon X \to \mathbb{R}$ be $C^k$, $k \ge 1$, $X$ a B-space and $f'(x) \ne 0$ for all solutions $x$ of the equation
$$f(x) = 0.$$
The solution set $S$ of this equation is a submanifold of $X$ and thus a $C^k$-Banach manifold. For the tangent space we have $TS_x = N(f'(x))$ and $\operatorname{codim} TS_x = 1$ for all $x \in S$.
In the special case that $X$ is an H-space, and if we choose $f(x) = (x|x) - r^2$, then $S$ is equal to the sphere of radius $r > 0$. Consequently, $S$ is a $C^\infty$-Banach manifold with $\operatorname{codim} S = 1$. For the tangent space we have
$$TS_x = N(f'(x)) = \{h \in X\colon (x|h) = 0\}.$$

PROOF. Since $f'(x) \ne 0$, it follows that $f'(x)\colon X \to \mathbb{R}$ is surjective. Moreover, we have $\operatorname{codim} N(f'(x)) = 1$, because if $h_0$ satisfies $f'(x)h_0 = 1$, then every $h \in X$ has a unique representation $h = h_1 + (f'(x)h)h_0$ with $f'(x)h_1 = 0$. Theorem 73.C then yields the assertion. □
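The decomposition used in this proof can be checked numerically for the sphere. The sketch below assumes $X = \mathbb{R}^3$ with the Euclidean scalar product and a sample point on the sphere of radius 3; the names `inner`, `df`, and `split` are hypothetical. It uses $f'(x)h = 2(x|h)$ and the choice $h_0 = x/(2(x|x))$, which satisfies $f'(x)h_0 = 1$.

```python
def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def df(x, h):
    return 2.0 * inner(x, h)           # f'(x)h for f(x) = (x|x) - r^2

def split(x, h):
    """Write h = h1 + (f'(x)h) h0 with f'(x)h1 = 0 and f'(x)h0 = 1."""
    h0 = [xi / (2.0 * inner(x, x)) for xi in x]
    c = df(x, h)
    h1 = [hi - c * h0i for hi, h0i in zip(h, h0)]
    return h1, c, h0

x = [1.0, 2.0, 2.0]                    # a point on the sphere of radius 3
h = [0.5, -1.0, 4.0]                   # an arbitrary direction
h1, c, h0 = split(x, h)
print(inner(x, h1), df(x, h0))         # tangent part is orthogonal to x; f'(x)h0 = 1
```

The component $h_1$ lies in $TS_x = \{h\colon (x|h) = 0\}$, and the remaining part is a multiple of the single direction $h_0$, which reflects $\operatorname{codim} TS_x = 1$.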

Theorem 73.D (Subimmersions). Let $f\colon M \to N$ be $C^k$, $k \ge 1$, and assume (H). Suppose that all chart spaces of $M$ and $N$ are finite-dimensional. If $f$ is a subimmersion at every point of $S = f^{-1}(y)$ for some fixed $y \in N$, then $S$ is a submanifold of $M$, i.e., in particular, a $C^k$-Banach manifold.
The tangent space $TS_x$ is equal to $N(f'(x))$. If $\dim M = m$, $\dim N = n$, and $\operatorname{rank} f'(x) = r$ for all $x \in S$, then $\dim S = m - r$.

PROOF. One applies the local subimmersion theorem (Proposition 73.34) and follows the proof of Theorem 73.C. □

In view of Theorem 73.E below, we consider the following important


question.
(Q) Which continuous maps have a continuous inverse?

COUNTEREXAMPLE 73.42. Consider the continuous, injective map $f\colon \mathbb{R} \to \mathbb{R}^2$ of Figure 73.13. The behavior of the image $f(\mathbb{R})$ at the point $f(0)$ plays an important role. We have $f(x) \to f(0)$ as $x \to 0$ and as $x \to +\infty$. Thus $f^{-1}$ is not continuous at the point $f(0)$.
If $f$ is sufficiently smooth, then the above map $f\colon \mathbb{R} \to \mathbb{R}^2$ is also an example of an injective $C^k$-immersion, where $f(\mathbb{R})$ is not a submanifold of $\mathbb{R}^2$ because there exists no chart in $\mathbb{R}^2$ for which $f(\mathbb{R})$ looks like a straight line in a neighborhood of $f(0)$. Intuitively, $f(\mathbb{R})$ has no tangent line at $f(0)$.

Figure 73.13

Definition 73.43. Let $M$ and $N$ be topological spaces. Then $f\colon M \to N$ is called a homeomorphic embedding (short: e-homeomorphism) if and only if $f$ is a homeomorphism onto its image $f(M)$, i.e., $f$ is injective and continuous and $f^{-1}$ is continuous as well.
If $M$ and $N$ are $C^k$-Banach manifolds, $k \ge 1$, then $f\colon M \to N$ is called an embedding if and only if $f$ is an e-homeomorphism and an immersion.

Proposition 73.44. If $f\colon M \to N$ is injective and continuous, and $M$ and $N$ are
topological spaces, then f is an e-homeomorphism if one of the following three
conditions is satisfied:
(i) f is closed.
(ii) f is proper, and N satisfies the first axiom of countability, i.e., every point
in N has a countable neighborhood basis.
(iii) M is compact.
Here, (ii) and (iii) are special cases of (i).

Corollary 73.45. If $f\colon M \to N$ is an injective $C^k$-immersion, and $M$ and $N$ are $C^k$-Banach manifolds, $k \ge 1$, then $f$ is an embedding if $M$ is compact or $f$ is closed or proper.

PROOF OF PROPOSITION 73.44. (i) If $f$ is closed, then $f(M)$ is closed and $f^{-1}$ is continuous. This follows because the closedness of $K$ in $M$ implies the closedness of $f(K)$ (see $A_1$(11)).
(ii) An argument analogous to the one used in the proof of Proposition 4.44 shows that $f$ is closed.
(iii) If $K$ is a closed subset of $M$, then $K$ and $f(K)$ are compact, i.e., $f$ is closed. □

Since every Banach manifold looks locally like a B-space, it follows that
every point has a countable neighborhood basis. Thus, Corollary 73.45 is an
immediate consequence of Proposition 73.44.

Theorem 73.E (Embeddings). Let $f\colon M \to N$ be a $C^k$-embedding, $k \ge 1$, and let (H) be satisfied. Then $S = f(M)$ is a submanifold of $N$, i.e., in particular, a $C^k$-Banach manifold.
The tangent space $TS_x$ is equal to $R(f'(x))$. If $M$ has a dimension, then $\dim S = \dim M$.

The following proof is analogous to the proof of Theorem 73.C. Here, we use the local immersion theorem instead of the local submersion theorem. It follows from (I) below that the difficulties of Counterexample 73.42 at $f(0)$ of Figure 73.13 will not occur.

PROOF.
(I) The map $f$ is an e-homeomorphism. Thus it follows that if $U$ is an open neighborhood of $x$ in $M$, then $f(U)$ is an open neighborhood of $f(x)$ in $f(M)$. Hence there exists an open set $V$ in $N$ with $f(U) = V \cap f(M)$ (see the definition of the subspace topology in $A_1$(9)).
(II) Let $M$ and $N$ be B-spaces and $x \in M$. The local immersion theorem (Proposition 73.37) shows that, in suitable coordinates, $f$ looks like $f'(x)$ in a neighborhood of $x$. Thus there exists an open neighborhood $U$ of $x$ such that $f(U)$ looks like $R(f'(x))$ in an admissible chart for $N$. Hence, $f(M)$ is a submanifold of $N$.
Looking at curves through $f(x)$, one finds that $TS_x = R(f'(x))$. This follows similarly as in the proof of Theorem 73.C. Since $f'(x)\colon TM_x \to R(f'(x))$ is bijective, we obtain that $\dim TM_x = \dim R(f'(x))$.
(III) If $M$ and $N$ are Banach manifolds, then (II) can be applied to the charts. □

73.12. Construction of Diffeomorphisms and the Generalized Morse Lemma

The following important example illustrates the method of continuation of a parameter. It allows us to construct diffeomorphisms by solving ordinary differential equations on B-spaces, which yield normal forms. Consider the Taylor expansion
$$f(x_0 + h) = f(x_0) + f'(x_0)h + f''(x_0)h^2/2 + \cdots.$$
Let $f'(x_0) = 0$. Then $f$ is not a submersion at $x_0$ and Proposition 73.36 cannot be used to construct the local normal form $f(\varphi(y)) = f(x_0) + f'(x_0)y$ in a neighborhood of zero. Thus the following question arises. Is there a coordinate transformation $x = \varphi(y)$ with $x_0 = \varphi(0)$ for maps $f\colon U(x_0) \subseteq X \to \mathbb{R}$ with $f'(x_0) = 0$ such that
$$f(\varphi(y)) = f(x_0) + f''(x_0)y^2/2 \tag{23}$$
is satisfied for all $y$ in a neighborhood of zero? In this case $f$ will behave like the quadratic part of the Taylor expansion in a neighborhood of $x_0$.

Let $X$ be a B-space over $\mathbb{K}$, and let $Q\colon X \times X \to \mathbb{K}$ be a bounded, symmetric bilinear form. We define the operator $A$ through
$$(Ax)y = Q(x, y).$$
It follows that $|(Ax)y| \le q\|x\|\,\|y\|$, where $q$ is a bound for $Q$, and hence $\|Ax\| \le q\|x\|$. Thus, $Ax \in X^*$, and the linear operator $A\colon X \to X^*$ is continuous.

Definition 73.46. The bilinear form $Q$ is called nondegenerate (or weakly nondegenerate) if and only if $A\colon X \to X^*$ is bijective (or injective).

For example, let $X = \mathbb{R}^n$, and let
$$Q(x, y) = \sum_{i,j=1}^{n} a_{ij}\xi_i\eta_j$$
with a real, symmetric matrix $(a_{ij})$, where $x = (\xi_1, \ldots, \xi_n)$ and $y = (\eta_1, \ldots, \eta_n)$. Then $Q\colon X \times X \to \mathbb{R}$ is a bounded, symmetric bilinear form. If we identify $X$ with $X^*$, then $A\colon X \to X$ corresponds to the matrix $(a_{ij})$, i.e., $y = Ax$ corresponds to
$$\eta_i = \sum_{j=1}^{n} a_{ij}\xi_j.$$
Moreover, $Q$ is nondegenerate if and only if $A^{-1}$ exists, i.e., $\det(a_{ij}) \ne 0$. If $\dim X < \infty$, then the notions of nondegenerate and weakly nondegenerate coincide.
Let $X$ be a B-space. It follows from Definition 73.32 that a map $f\colon U(x_0) \subseteq X \to \mathbb{R}$ has a singular point at $x_0$ if and only if $f'(x_0) = 0$. Such a point is called nondegenerate if and only if the bilinear form $(h, k) \mapsto f''(x_0)hk$ is nondegenerate.

Theorem 73.F (Generalized Morse Lemma of Palais (1969)). Let $X$ be a real B-space, and let $f\colon U(x_0) \subseteq X \to \mathbb{R}$ be a $C^{k+2}$-map, $k \ge 1$, which has a nondegenerate singular point at $x_0$.
Then there exists a local $C^k$-diffeomorphism $\varphi\colon U(0) \subseteq X \to X$ with $\varphi(0) = x_0$ such that (23) holds in a neighborhood of zero.

PROOF. Without loss of generality, let $x_0 = 0$ and $f(0) = 0$. The proof idea is as follows. Let $g(x) = f''(0)x^2/2$ and use the homotopy
$$H(x, t) = tf(x) + (1 - t)g(x).$$
Then solve the linear equation
$$H_x(x, t)h + H_t(x, t) = 0 \tag{24}$$
with respect to $h$. This allows us to solve the ordinary differential equation
$$\varphi_t = h(\varphi, t), \qquad \varphi(0, x) = x. \tag{25}$$
Letting $z = (\varphi(t, x), t)$, we obtain
$$\frac{\partial}{\partial t}H(z) = H_x(z)h(z) + H_t(z) = 0,$$
and hence
$$H(\varphi(1, x), 1) = H(\varphi(0, x), 0) = H(x, 0)$$
and therefore
$$f(\varphi(1, x)) = g(x).$$
Hence, the map $x \mapsto \varphi(1, x)$ is our desired local diffeomorphism at the origin. We now justify our formal considerations.
(I) Solution of (24). It follows from Taylor's theorem (Theorem 4.A) with integral remainder that
$$f(x) = \int_0^1 (1 - \tau)f''(\tau x)x^2\,d\tau$$
and
$$f'(x) = \int_0^1 f''(\tau x)x\,d\tau.$$
Moreover, we have
$$H_x(x, t) = tf'(x) + (1 - t)g'(x) = B(x, t)x$$
with
$$B(x, t) = f''(0) + t\int_0^1 (f''(\tau x) - f''(0))\,d\tau,$$
and
$$H_t(x, t) = f(x) - g(x) = C(x)x^2$$
with
$$C(x) = \int_0^1 [(1 - \tau)f''(\tau x) - 2^{-1}f''(0)]\,d\tau$$
and $C(0) = 0$. It follows then from $f'(x) \in X^*$ and $f''(x) \in L(X, X^*)$ that
$$B(x, t),\ C(x) \in L(X, X^*).$$
Since the bilinear form $(h, k) \mapsto f''(0)hk$ is nondegenerate, we obtain from the open mapping theorem $A_1$(36) that $f''(0)^{-1} \in L(X^*, X)$. Thus $B(x, t)^{-1}$ exists for all $t \in [0, 1]$ and all $x$ in a fixed neighborhood of zero. This is a consequence of the continuity of the formation of inverse mappings (Problem 1.7). Letting
$$h(x, t) = -B(x, t)^{-1}C(x)x,$$
we obtain that $Bxh = Bhx = -Cx^2$, i.e., $h$ is a solution of (24).

(II) Since $C(0) = 0$, we have $h(0, t) = 0$ and $h_x(0, t) = 0$ for all $t$.
(III) Solution of (25). It follows from Theorem 3.A that for each $x$, equation (25) has a unique solution in $[0, 1]$, which lies in a sufficiently small neighborhood of the origin in $X$. Note that, because of (II), the Lipschitz constant of $h$ can be chosen arbitrarily small. From Theorem 4.D (Dependence on Data) it follows that $\varphi$ is $C^k$, since $h$ is $C^k$.
If we pose the initial-value problem at $t = 1$ backwards with respect to $t$, then we obtain from uniqueness that $x \mapsto \varphi(1, x)$ is a local diffeomorphism at $x = 0$. □
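The homotopy construction of this proof can be followed numerically in one dimension. The sketch below assumes the hypothetical function $f(x) = x^2 + x^3$ with $x_0 = 0$, so that $g(x) = x^2$, $H_x = 2x + 3tx^2$, $H_t = x^3$, and equation (24) gives $h(x, t) = -x^2/(2 + 3tx)$. The differential equation (25) is integrated with the classical Runge-Kutta method, which is a choice made here for illustration and not part of the proof.

```python
def f(x):
    return x**2 + x**3                 # f(0) = 0, f'(0) = 0, f''(0) = 2

def g(x):
    return x**2                        # quadratic part f''(0) x^2 / 2

def h(x, t):
    # Solution of (24) for H = t f + (1 - t) g:
    # H_x = 2x + 3 t x^2 and H_t = x^3, hence h = -x^2 / (2 + 3 t x).
    return -x**2 / (2.0 + 3.0 * t * x)

def phi(t_end, x, steps=1000):
    """Integrate (25): phi_t = h(phi, t), phi(0, x) = x, by classical RK4."""
    dt = t_end / steps
    p, t = x, 0.0
    for _ in range(steps):
        k1 = h(p, t)
        k2 = h(p + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = h(p + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = h(p + dt * k3, t + dt)
        p += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return p

for x in (-0.3, -0.1, 0.0, 0.2, 0.3):
    print(x, f(phi(1.0, x)) - g(x))    # differences should be close to zero
```

Up to the discretization error, the printed differences vanish, i.e., $f(\varphi(1, x)) = g(x)$ near the origin, as asserted by (23).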

A general criterion, which allows the transformation of functionals to not necessarily quadratic normal forms, will be proved in Problem 73.8a. From this, one obtains a nice version of Theorem 73.F for variational integrals.

73.13. Transversality
Consider the following two questions.
(i) When is the intersection of manifolds a manifold?
(ii) When is the preimage of a manifold a manifold?
Interestingly, the geometric concept of transversality provides answers to
both of these questions. Transversality is certainly one of the most important
concepts of modern mathematics. It is frequently used to express that a
qualitative phenomenon is natural or nondegenerate. As a first illustration,
consider Figure 73.14. In Figure 73.14(a) the two curves in $\mathbb{R}^2$ intersect transversally but in Figure 73.14(b) they intersect nontransversally. There they only touch each other. It is convenient to speak of transversal intersection at some point $y$ if both curves do not intersect there. To get an impression of how transversality can be used to answer question (i), note that in Figure 73.14(a) the intersection of the two curves $M$ and $N$ forms a manifold (a point or the empty set). In the case of nontransversal intersection, however, the intersection $M \cap N$ need not be a manifold. In Figure 73.15, $M \cap N$ is not a manifold because of the two endpoints $P$ and $Q$.
Let us make the following three assumptions:
(H1) $M$, $N$, and $Y$ are $C^k$-Banach manifolds with chart spaces over $\mathbb{K}$, $k \ge 1$, and $N$ is a submanifold of $Y$.
(H2) Besides (H1), $M$ is also a submanifold of $Y$.
(H3) Besides (H1), the map $f\colon M \to Y$ is $C^k$.

Definition 73.47. If $L$ is a linear space, then two linear subspaces $A$ and $B$ of $L$ are called transversal at $0$ if and only if
$$A + B = L, \tag{26}$$
i.e., $A$ and $B$ span $L$.
Figure 73.14. (a) Transversal at $y$. (b) Not transversal at $y$.

Figure 73.15

This is the key idea for the general definition of transversality below.
Note that this definition depends on $L$. In $L = \mathbb{R}^3$, for example, two straight lines can never intersect transversally. Besides transversality between linear subspaces, we also consider transversality between linear maps and linear subspaces.
The linear map $f\colon L \to L$ is called transversal to $B$ at $0$ if and only if the image $f(L)$ and $B$ are transversal at $0$, i.e.,
$$f(L) + B = L. \tag{27}$$
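In finite dimensions, condition (26) is a rank condition. The following minimal sketch assumes $L = \mathbb{R}^n$ and that each subspace is given by a list of spanning vectors; the helper names `rank` and `transversal` are hypothetical. It confirms that two lines in $\mathbb{R}^3$ are never transversal, whereas a plane and a line not contained in it are.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of row vectors, computed by exact Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in vectors]
    r = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def transversal(A, B, dim):
    """A + B = L in L = R^dim, where A and B are lists of spanning vectors."""
    return rank(A + B) == dim

x_axis = [[1, 0, 0]]
y_axis = [[0, 1, 0]]
z_axis = [[0, 0, 1]]
xy_plane = [[1, 0, 0], [0, 1, 0]]
xz_plane = [[1, 0, 0], [0, 0, 1]]

print(transversal(x_axis, y_axis, 3))      # two lines in R^3
print(transversal(xy_plane, z_axis, 3))    # a plane and a line not contained in it
print(transversal(xy_plane, xz_plane, 3))  # two different planes
```

The same test applied to $f(L)$ and $B$, with $f(L)$ spanned by the columns of a matrix for $f$, checks condition (27).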
These definitions can be generalized to B-spaces and Banach manifolds by using linearization, i.e., by considering
$$TM_y + TN_y = TY_y \tag{28}$$
and
$$R(f'(x)) + TN_y = TY_y. \tag{29}$$

Figure 73.16. Transversal and nontransversal intersections in $\mathbb{R}^3$ (columns: Space, Transversal, Not transversal).

For technical reasons, we also need the following conditions about splitting:
The intersection $TM_y \cap TN_y$ splits $TM_y$. (28a)
The preimage $f'(x)^{-1}(TN_y)$ splits $TM_x$ for all $x$ with $f(x) = y$. (29a)
Note that the linearization of $f\colon M \to Y$ at the point $x$ is equal to $f'(x)\colon TM_x \to TY_y$ with $y = f(x)$. Moreover, we have $N \subseteq Y$. For finite-dimensional manifolds, (28a) and (29a) are automatically satisfied.

Definition 73.48. If (H2) holds, then $M$ and $N$ are called transversal in $Y$ if and only if (28) and (28a) are satisfied for all points $y$ in the intersection $M \cap N$. We write $M \pitchfork N \bmod Y$.
If (H3) holds, then $f$ is called transversal to $N$ in $Y$ if and only if (29) and (29a) are satisfied for all points $y$ in the intersection $f(M) \cap N$. We write $f \pitchfork N \bmod Y$.

Thus for $M \cap N = \emptyset$ and $f(M) \cap N = \emptyset$, we always have transversality. In Figure 73.14(a) the two curves are transversal in $\mathbb{R}^2$, because they do not intersect at all or the tangent lines at the points of intersection span $\mathbb{R}^2$. If two curves touch each other as in Figure 73.14(b), then they are not transversal. Figure 73.16 shows several examples of transversal intersections in $\mathbb{R}^3$.

EXAMPLE 73.49. If (H3) holds for the map $f\colon M \to Y$, then $y \in Y$ is a regular value of $f$ if and only if $f$ is transversal to $N = \{y\}$ in $Y$. Compare Definition 73.32 and note that $TN_y = \{0\}$.

In Part I we saw that regular values play an important role in fixed-point theory. The example at hand shows the connection with transversality.

Theorem 73.G (Transversality). If (H3) holds and the map $f\colon M \to Y$ is transversal to $N$ in $Y$, then $f^{-1}(N)$ is a submanifold of $M$.
If $M$, $N$, and $Y$ have dimensions, then $f^{-1}(N)$ has the same codimension in $M$ as $N$ in $Y$.

Corollary 73.50. If (H2) holds and the submanifolds $M$ and $N$ are transversal in $Y$, then the intersection $M \cap N$ is a submanifold of $Y$.

As shown in Example 73.49, this theorem is an immediate generalization of the preimage theorem (Theorem 73.C).

PROOF OF THEOREM 73.G.
(I) We first consider the special case where $M$ and $Y$ are B-spaces, and $N$ is a linear subspace of $Y$ which splits $Y$. Let $P\colon Y \to N^\perp$ be a projection operator onto $N^\perp$ with corresponding topological direct sum decomposition $Y = N \oplus N^\perp$. The proof idea is as follows. Consider the map
$$g\colon M \xrightarrow{\ f\ } N \oplus N^\perp \xrightarrow{\ P\ } N^\perp$$
instead of $f$, i.e., $g = Pf$. Then $f^{-1}(N) = g^{-1}(0)$. Using the preimage theorem (Theorem 73.C), it suffices to show that $0$ is a regular value of $g$. Note that $P$ is linear and continuous.
In fact, $g'(x) = Pf'(x)$. Let $g(x) = 0$, i.e., $f(x) \in N$. Transversality of $f$ and $N$ in $Y$ means that
$$R(f'(x)) + N = Y$$
and $f'(x)^{-1}(N)$ splits $M$. The first condition implies that $PR(f'(x)) = N^\perp$, i.e., $g'(x)$ is surjective. The second condition shows that the null space $N(g'(x))$ splits $M$.
(II) Theorem 73.C implies that $\operatorname{codim} g^{-1}(0) = \dim N^\perp$. Thus we have
$$\operatorname{codim} f^{-1}(N) = \dim N^\perp = \operatorname{codim} N.$$
(III) If $M$, $N$, and $Y$ are manifolds, then Theorem 73.G follows immediately from (I) by using charts. □

PROOF OF COROLLARY 73.50. Let $f\colon M \to Y$ be the trivial embedding for $M \subseteq Y$. Then $M \pitchfork N \bmod Y$ is equivalent to $f \pitchfork N \bmod Y$. Because of $f^{-1}(N) = M \cap N$, Corollary 73.50 is a special case of Theorem 73.G. □

73.14. Taylor Expansions and Jets


Let $f\colon U(x_0) \subseteq X \to Y$ be $C^k$, $k \ge 1$, and $X$ and $Y$ B-spaces over $\mathbb{K}$. Define
$$j_xf(h) = f(x) + \sum_{r=1}^{k} f^{(r)}(x)h^r/r!,$$
i.e., $j_xf$ is equal to the Taylor expansion of $f$ at the point $x$ up to the $k$th derivative. Moreover, we define the $k$-jet coordinate of $f$ at the point $x$ as
$$J^kf(x) = (x, f(x), f'(x), \ldots, f^{(k)}(x)),$$
and let
$$P^kf(x) = (f'(x), \ldots, f^{(k)}(x)).$$
We now give the global generalizations of $J^kf$ to manifolds.
In preparation we study the change of jet coordinates after coordinate transformations. Let $L^r_{\mathrm{sym}}(X, Y)$ be the B-space of all $r$-linear, symmetric, and bounded maps $A\colon X \times \cdots \times X \to Y$. Also let
$$P^k(X, Y) = \prod_{r=1}^{k} L^r_{\mathrm{sym}}(X, Y).$$
Then $P^kf(x) \in P^k(X, Y)$ and
$$J^kf(x) \in X \times Y \times P^k(X, Y).$$
Define
$$g = \psi \circ f \circ \varphi,$$
where $\varphi$ and $\psi$ are $C^k$. More precisely, we choose $\varphi\colon U(z) \subseteq X \to X$ with $\varphi(z) = x$ and $\psi\colon U(f(x)) \subseteq Y \to Y$. Then, we obtain the key relation
$$P^kg(z) = BP^kf(x), \tag{30}$$
where $B\colon P^k(X, Y) \to P^k(X, Y)$ is linear and continuous. This follows from the chain rule. If $\varphi$ and $\psi$ are local diffeomorphisms, then, conversely, $P^kf(x)$ can be expressed in terms of $P^kg(z)$. Thus $B$ is bijective and from the open mapping theorem $A_1$(36) it follows that $B$ is a linear homeomorphism. We now assume:
(H) $M$ and $N$ are $C^\infty$-Banach manifolds with chart spaces over $\mathbb{K}$.
Two $C^\infty$-maps $f, g\colon U(x) \subseteq M \to N$ are called $k$-equivalent at a point $x \in M$ if and only if
$$J^k\bar{f}(\bar{x}) = J^k\bar{g}(\bar{x}) \tag{31}$$
holds for the representatives $\bar{f}$, $\bar{g}$, and $\bar{x}$ of $f$, $g$, and $x$ in fixed admissible charts. This means that the $k$-jet coordinates of the representatives coincide at the point $\bar{x}$. From (30) it follows that this definition is invariant, i.e., independent of the choice of the charts.

Definition 73.51 (Jets). If (H) holds, then $J^kf(x)$ denotes the set of all $C^\infty$-maps $g\colon U(x) \subseteq M \to N$ which are $k$-equivalent to the $C^\infty$-map $f\colon U(x) \subseteq M \to N$ at the point $x \in M$. Moreover, let $J^k(M, N)$ be the set of all possible $J^kf(x)$ where $f$ and $x$ vary.
We call $J^kf(x)$ a $k$-jet.

Proposition 73.52. The set $J^k(M, N)$ is a $C^\infty$-Banach manifold.

PROOF. Let $(U, \varphi)$ and $(V, \psi)$ be charts in $M$ and $N$ with $x \in U$ and $f(x) \in V$ and chart spaces $X_\varphi$ and $Y_\psi$. Let $J^kf(x)$ have the local coordinates $J^k\bar{f}(\bar{x})$, where $\bar{f}$ and $\bar{x}$ are the representatives of $f$ and $x$. For a chart change, we obtain the transformation rule for
$$J^k\bar{f}(\bar{x}) = (\bar{x}, \bar{f}(\bar{x}), \ldots, \bar{f}^{(k)}(\bar{x})),$$
analogously to (30). This follows from an application of the chain rule to the representatives.
In order to obtain a manifold structure, we use local coordinates and the method of abstract atlases of Problem 73.2. We choose $(W_{UV}, \chi)$ as abstract charts in $J^k(M, N)$. Here, $W_{UV}$ is the set of all $J^kf(x)$ with $x \in U$ and $f(x) \in V$. Moreover, we have
$$\chi(J^kf(x)) = J^k\bar{f}(\bar{x}).$$
Since
$$J^k\bar{f}(\bar{x}) \in U_\varphi \times V_\psi \times P^k(X_\varphi, Y_\psi),$$
it follows that the set on the right-hand side is the chart image. Note that for each element $h$ in this set, there exists a $C^\infty$-map $f\colon U(x) \subseteq M \to N$ with $J^k\bar{f}(\bar{x}) = h$. To see this, choose as the representative of $f$ a polynomial. The chart space is equal to $X_\varphi \times Y_\psi \times P^k(X_\varphi, Y_\psi)$.
The collection of all charts $(W_{UV}, \chi)$, where $U$ and $V$ vary, then forms an abstract $C^\infty$-atlas. This induces a topology on $J^k(M, N)$ and gives us a manifold structure (Problem 73.2). □

In this way we obtain a map
$$J^kf\colon M \to J^k(M, N)$$
for every $C^\infty$-map $f\colon M \to N$, which is called the $k$-jet extension of $f$. This map is also of class $C^\infty$.

EXAMPLE 73.53 ($J^k(M, N)$). For every $C^\infty$-map $f\colon \mathbb{R} \to \mathbb{R}$ we have
$$J^kf(x) = (x, f(x), f'(x), \ldots, f^{(k)}(x)) \in \mathbb{R}^{2+k}. \tag{32}$$
Actually, $J^kf(x)$ in the above definition is a collection of maps. But since all of them are characterized by the same $k$-jet coordinate at $x$, we may identify $J^kf(x)$ with the $k$-jet coordinate as in (32). Thus we obtain
$$J^k(\mathbb{R}, \mathbb{R}) = \mathbb{R}^{2+k}.$$
Note that for each $h \in \mathbb{R}^{2+k}$ there exists a $C^\infty$-map $f\colon U(x) \subseteq \mathbb{R} \to \mathbb{R}$ with $J^kf(x) = h$.
If $M$ and $N$ are open in $\mathbb{R}$, then we obtain, analogously, for the $C^\infty$-map $f\colon M \to N$ that
$$J^kf(x) = (x, f(x), f'(x), \ldots, f^{(k)}(x)) \in M \times N \times \mathbb{R}^k. \tag{33}$$
Thus we have
$$J^k(M, N) = M \times N \times \mathbb{R}^k,$$
and (33) describes the map $J^kf\colon M \to J^k(M, N)$.
For $C^\infty$-maps $f\colon \mathbb{R}^n \to \mathbb{R}$ we have
$$J^kf(x) = (x, f(x), D^\alpha f(x))_{1 \le |\alpha| \le k}, \tag{34}$$
i.e., $J^kf(x)$ contains all of the partial derivatives up to order $k$. Derivatives which differ only in the order of differentiation are accounted for only once. For example, only the first of the following two derivatives is considered: $\partial^2f(x)/\partial\xi_1\partial\xi_2$ and $\partial^2f(x)/\partial\xi_2\partial\xi_1$. If $M$ and $N$ are open in $\mathbb{R}^n$ and $\mathbb{R}$, respectively, then
$$J^k(M, N) = M \times N \times \mathbb{R}^r.$$
Here, $2 + r$ is the number of components of (34).
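For polynomials, the jet coordinate (32) is easy to compute. The sketch below assumes that a polynomial $f\colon \mathbb{R} \to \mathbb{R}$ is stored as a coefficient list, with entry $i$ the coefficient of $x^i$; the function names are hypothetical. It returns the tuple $(x, f(x), f'(x), \ldots, f^{(k)}(x))$ with $2 + k$ entries.

```python
def derivative(coeffs):
    """Coefficient list of the derivative; coeffs[i] is the coefficient of x**i."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's scheme."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def jet(coeffs, x, k):
    """The k-jet coordinate (x, f(x), f'(x), ..., f^(k)(x)) as in (32)."""
    coords = [x]
    current = list(coeffs)
    for _ in range(k + 1):
        coords.append(evaluate(current, x))
        current = derivative(current)
    return tuple(coords)

cubic = [0, -3, 0, 1]                  # f(x) = x^3 - 3x
print(jet(cubic, 1, 2))                # (x, f(x), f'(x), f''(x)) at x = 1
```

For $f(x) = x^3 - 3x$ and $k = 2$, the jet coordinate at $x = 1$ is $(1, -2, 0, 6)$, a point of $\mathbb{R}^{2+k} = \mathbb{R}^4$.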


EXAMPLE 73.54 (Transversality and Nondegeneracy). The transversality of $k$-jets can effectively be used to describe the nondegeneracy of higher-order derivatives. For example, let $f\colon \mathbb{R} \to \mathbb{R}$ be $C^\infty$, and $g = J^kf$, $k \ge 1$. It follows then from (32) that
$$g'(x)h = (h, f'(x)h, \ldots, f^{(k+1)}(x)h)$$
for all $h \in \mathbb{R}$. Let
$$S = \{(\xi_1, \ldots, \xi_{k+2}) \in \mathbb{R}^{k+2}\colon \xi_{k+2} = 0\}.$$
This implies the following statement:
The implication "$f^{(k)}(x) = 0$ implies $f^{(k+1)}(x) \ne 0$" holds for all $x$ if and only if
$$J^kf \pitchfork S \bmod J^k(\mathbb{R}, \mathbb{R}). \tag{35}$$
For $k = 1$, the transversality condition
$$J^1f \pitchfork S \bmod J^1(\mathbb{R}, \mathbb{R})$$
means precisely that $f$ has only nondegenerate singular points.

PROOF. The condition $J^kf(x) \in S$ is equivalent to $f^{(k)}(x) = 0$. Moreover, $f^{(k+1)}(x) \ne 0$ is equivalent to
$$g'(x)(\mathbb{R}) + S = \mathbb{R}^{k+2}. \qquad \square$$
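For $k = 1$ and polynomial $f$, the criterion of this example reduces to a pointwise check. The sketch below assumes polynomials given as coefficient lists and uses the hypothetical helper `transversal_at`; it is only an illustration of the case $k = 1$, where $J^1f(x) \in S$ means $f'(x) = 0$ and $g'(x)(\mathbb{R}) + S = \mathbb{R}^3$ means $f''(x) \ne 0$.

```python
def derivative(coeffs):
    """Coefficient list of the derivative; coeffs[i] is the coefficient of x**i."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's scheme."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def transversal_at(coeffs, x):
    """Condition (35) for k = 1, checked at the single point x."""
    first = evaluate(derivative(coeffs), x)
    second = evaluate(derivative(derivative(coeffs)), x)
    if first != 0:
        return True                    # J^1 f(x) does not lie in S
    return second != 0                 # g'(x)(R) + S = R^3 iff f''(x) != 0

cubic = [0, -3, 0, 1]                  # f(x) = x^3 - 3x, singular points at x = 1, -1
pure_cube = [0, 0, 0, 1]               # f(x) = x^3, singular point at x = 0

print(transversal_at(cubic, 1), transversal_at(cubic, -1))
print(transversal_at(pure_cube, 0))
```

The singular points $\pm 1$ of $x^3 - 3x$ are nondegenerate, whereas $x^3$ has a degenerate singular point at the origin, so that (35) fails there.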

Condition (35), which appears somewhat complicated, is chosen in view of the important transversality theorem of Thom (Proposition 73.57).
The definition of jets allows an elegant introduction of the so-called Whitney topology, which is the main topology in modern differential topology and especially catastrophe theory.

Definition 73.55 ($C^\infty$-Whitney Topology). Let M and N be $C^\infty$-Banach
manifolds with chart spaces over $\mathbb{K}$, and let $C^\infty(M, N)$ denote the set of all
$C^\infty$-maps $f: M \to N$.
Let k = 0, 1, ..., be a nonnegative integer and U an arbitrary open set in
$J^k(M, N)$. Define
$$O_{k,U} = \{f \in C^\infty(M, N): J^k f(M) \subseteq U\}.$$
A set S in $C^\infty(M, N)$ is called $\infty$-open if and only if it is the union of sets of
the form $O_{k,U}$. The $\infty$-open sets define a topology on $C^\infty(M, N)$ which is called
the $C^\infty$-Whitney topology.

This definition is equivalent to the definition of $A_1$(67) for a more special
situation. Roughly speaking, two maps $f: M \to N$ and $g: M \to N$ are
considered close in the $C^\infty$-Whitney topology if all local Taylor expansions for
the representatives are sufficiently close. This means that all jet coordinates
are sufficiently close. The following example shows this more precisely.

EXAMPLE 73.56. Let $f_n, f \in C^\infty(\mathbb{R}^m, \mathbb{R})$ for all n. Then the statement that $f_n \to f$
as $n \to \infty$ in the $C^\infty$-Whitney topology is equivalent to the following two
conditions:
(i) For each k = 0, 1, ... there exists a compact set K in $\mathbb{R}^m$ such that
$D^\alpha f_n \rightrightarrows D^\alpha f$ on K for all $\alpha$ with $0 \leq |\alpha| \leq k$ (uniform convergence of the
derivatives up to order k).
(ii) $f_n = f$ outside of K except for finitely many n.

The proof, which is left as an exercise for the reader, can be found in
Golubitsky and Guillemin (1977, M), Chapter 2, §3. We also suggest showing
that Definition 73.55 actually gives a topology for $C^\infty(M, N)$ (see Problem
73.3). As a typical application of the Whitney topology, we mention the
important transversality theorem of Thom. Let $C^\infty(M, N)$ be equipped with
the $C^\infty$-Whitney topology.

Proposition 73.57. Let M and N be finite-dimensional real $C^\infty$-manifolds with
countable bases. Also, let S be a submanifold of $J^k(M, N)$ for some fixed k = 0,
1, .... Then, the set
$$\{f \in C^\infty(M, N): J^k f \pitchfork S \bmod J^k(M, N)\}$$
is residual and dense in $C^\infty(M, N)$. Moreover, this set is open in $C^\infty(M, N)$ if S
is closed.
Let Y be a closed submanifold of N. Then, the set
$$\{f \in C^\infty(M, N): f \pitchfork Y \bmod N\}$$
is open and dense in $C^\infty(M, N)$.

A proof may be found in Golubitsky and Guillemin (1977, M), Theorem
4.9. Recall that a set is called residual precisely if it is the intersection of
countably many open and dense sets. Since $C^\infty(M, N)$ with the $C^\infty$-Whitney
topology is a Baire space, all residual sets are dense (see $A_1$(64)). In (35), we
saw that transversality conditions of the form $J^k f \pitchfork S \bmod J^k(M, N)$ include
nondegeneracy conditions for higher-order derivatives. Thus, Proposition
73.57 is a precise formulation of the following general principle:
(G) Nondegeneracy is generic, i.e., holds in most cases.
73.15. Equivalence of Maps


This section prepares the reader for the following one. We begin
with the transformation rule
$$\psi[f(\varphi(u))] = g(u), \tag{36}$$
which takes the map f into g. If f and g are real functions, for example, then
(36) follows from
$$f(x) = y$$
and a coordinate change in x and y, i.e., $x = \varphi(u)$ and $v = \psi(y)$. This gives
$$g(u) = v.$$
Formula (36) defines an equivalence between maps which allows a
transformation of f into a normal form g.

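Definition 73.58. Two $C^k$-maps $f: X \to Y$ and $g: U \to V$ between $C^k$-Banach
manifolds are called $C^k$-equivalent if and only if there exist $C^k$-diffeomorphisms
$\varphi$ and $\psi$ such that the following diagram commutes:

$$\begin{array}{ccc} U & \xrightarrow{\;g\;} & V \\ \varphi \downarrow & & \uparrow \psi \\ X & \xrightarrow{\;f\;} & Y \end{array} \tag{37}$$

i.e., $g = \psi \circ f \circ \varphi$. If, in addition, $\psi = \mathrm{id}$, Y = V (or $\varphi = \mathrm{id}$, X = U) we speak
of right equivalence (or left equivalence).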

This definition has the following local form. A map f is $C^k$-equivalent
at a point $x_0$ to a map g at a point $u_0$ if and only if $\varphi$ and $\psi$ in (37)
are local $C^k$-diffeomorphisms at $u_0$ and $f(x_0)$, with $\varphi(u_0) = x_0$, respectively,
and $\psi(f(x_0)) = g(u_0)$. In this case f and g need only be defined in a
neighborhood of $x_0$ and $u_0$.
Similarly, local right and left equivalence are defined. For example, the
$C^k$-right equivalence of f at $x_0$ to g at $u_0$ is expressed through
$$f(\varphi(u)) = g(u) \tag{38}$$
for all u in a neighborhood of $u_0$. More precisely, we have that
$$f: U(x_0) \subseteq X \to Y \quad \text{and} \quad g: U(u_0) \subseteq U \to Y$$
are $C^k$, and $\varphi: V(u_0) \subseteq U \to X$ is a local $C^k$-diffeomorphism at $u_0$ with
$\varphi(u_0) = x_0$.
In this language we can formulate the results of Section 73.10 quite elegantly.

EXAMPLE 73.59 (Linearization of Maps). Let $f: U(x) \subseteq X \to Y$ be $C^k$, $k \geq 1$,


and X and Y B-spaces over $\mathbb{K}$. Let $g = j_x^1 f$, i.e.,
$$g(u) = f(x) + f'(x)u.$$
If f is a submersion (or immersion) at x, then f is $C^k$-right equivalent at x to
g at 0 (or $C^k$-left equivalent).
If $X = \mathbb{K}^n$ and $Y = \mathbb{K}^m$, and if f is a subimmersion at x, then f is
$C^k$-equivalent at x to g at 0. Moreover, if rank f'(x) = r, then f is $C^k$-equivalent
at x to $h: X \to Y$ at 0 with

Thus the following question arises. Under which conditions is f equivalent
to $j_x^k f$, $k \geq 2$, if some of the above assumptions are not satisfied
(multilinearization)? As yet, no general theory is available. The important special case of
real functions is discussed in the following section.
A general multilinearization principle for systems of real equations will be
considered in Problem 73.9 (the blowing-up lemma). In fact, there are two
basic tools in modern bifurcation theory, namely,
(i) the implicit function theorem, and
(ii) the blowing-up lemma (the generalized implicit function theorem).
Experience shows that (i) and (ii) are sufficient to handle many cases
occurring in applications of bifurcation theory to concrete problems in the
natural sciences.

73.16. Multilinearization of Maps, Normal Forms, and Catastrophe Theory
At the tenth bifurcation
my true love gave to me:
Ten dual parabolics
Nine parabolic umbilics
Eight hyperbolic umbilics
Seven elliptic umbilics
Six dual butterflys
Five butterflies;
Four swallowtails
Three dual cusps
Two standard cusps
and a fold catastrophe.
Folklore
Universal unfoldings generalize analytic continuation.
René Thom

Let us begin with the following two questions, which are of major impor-
tance for a qualitative understanding of many scientific phenomena.
(Q1) Principal part problem. When does the Taylor expansion up to some
order k
$$f(x) + f'(x)u + \cdots + f^{(k)}(x)u^k/k!$$
provide enough information to understand the local behavior of a
function f at x?
(Q2) Normal form problem. What are the local normal forms for parameter
families of functions?

In this section we generalize the results of Section 37.28. We discuss a
number of deeper results without proof. In particular, we generalize the
linearization procedure of Section 73.15 and the Morse lemma of Section
73.12. The central ideas are
transversality,
structural stability,
genericity.
Also, we need the concept of k-determination of functions and universal
unfoldings. Methodically, it is quite interesting that purely algebraic
operations between polynomials give information about normal forms which are
obtained from diffeomorphisms (multilinearization). This is a substantial
generalization of classical results about solving analytic equations by using
the Weierstrass preparation theorem and Newton diagrams. This has been
discussed in Problem 8.3. For proofs, we recommend Golubitsky (1978, S) and
Poston and Stewart (1978, M). Many interesting applications are contained
in Poston and Stewart (1978, M) and Gilmore (1981, M). A number of deeper
results about normal forms of singularities may be found in Arnold (1985).
The so-called catastrophe theory, which was initiated by René Thom in the
1960s, tries to answer questions (Q1) and (Q2) above. The classification of
mathematical objects, by using normal forms, is one of the main objectives in
mathematics. From a scientific point of view, this aims at getting a survey of
all possible qualitative structures. For example, the classification of Lie
algebras, which exists in part, gives a good picture of all possible symmetries in
our world. Symmetry is one of the main tools of physicists in classifying
elementary particles and their processes. Unfortunately, at present, many
classification problems seem to be too difficult. This, for example, applies to
the classification of topological spaces and homeomorphisms as well as to
manifolds and diffeomorphisms.
In the following we emphasize methods which actually allow a
computation of normal forms. This might help scientists and engineers to get
acquainted with mathematical tools which allow a qualitative description of
phenomena.
(C) Convention. In this section we only consider real $C^\infty$-functions
$$f, g: U(0) \subseteq \mathbb{R}^n \to \mathbb{R} \quad \text{with } f(0) = g(0) = 0,$$
and parameter families, i.e., real $C^\infty$-functions
$$F, G: U(0) \subseteq \mathbb{R}^n \times \mathbb{R}^d \to \mathbb{R}$$
of the form F = F(x, p) and G = G(y, q) with F(0, 0) = G(0, 0) = 0.
We think of x and y as independent variables (state variables) and of p and
q as parameters (outer control parameters). The dimension d of the parameter
space may depend on F and G. Let n be fixed. Local $C^\infty$-maps are always maps
which are defined in a neighborhood of zero and map zero into zero. If such
a map is a local $C^\infty$-diffeomorphism at 0, then it is simply called a local
$C^\infty$-diffeomorphism. Furthermore, we write $j^k$ for $j_x^k$ with x = 0, i.e.,
$$j^k f(u) = f'(0)u + f''(0)u^2/2! + \cdots + f^{(k)}(0)u^k/k!.$$
The main transformation rule is
$$g(\varphi(u)) = j^k f(u) \tag{39}$$
in a neighborhood of zero.

Definition 73.60 (k-Determination). The function $f: U(0) \subseteq \mathbb{R}^n \to \mathbb{R}$ is called
k-determined in $\mathbb{R}^n$ if and only if for each function $g: U(0) \subseteq \mathbb{R}^n \to \mathbb{R}$ with
$$j^k g = j^k f$$
there exists a local $C^\infty$-diffeomorphism $\varphi$ in $\mathbb{R}^n$ which satisfies (39), i.e., all g
are right equivalent to $j^k f$ at 0.
f is called strongly k-determined if, in addition, $\varphi'(0) = I$.
This is a satisfactory definition for the principal part problem. Namely, if f
is k-determined, then all functions which differ from f only in terms of order
higher than k behave qualitatively like $j^k f$ in a neighborhood of zero. Roughly,
this means: the Taylor expansion of f up to order k completely determines f
and its perturbations with terms of order higher than k.
In science and engineering, mathematical expressions are often simplified
by neglecting higher-order terms. In order not to lose important qualitative
information, one has to make sure that the kth order approximation is also
k-determined. Consider, for example,
$$f(\xi, \eta) = a\xi^2 + 2b\xi\eta + c\eta^2 + O_3.$$
By $O_3$ we denote terms of at least order 3. According to the following
example, f is 2-determined if $ac - b^2 \neq 0$. In this case, all the terms $O_3$ can be
neglected. If, however, $ac - b^2 = 0$, we also need to consider higher-order
terms, and this in such a way that f is k-determined with $k \geq 3$.
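
As a minimal sketch of this check, the following function tests the condition $ac - b^2 \neq 0$ for the quadratic part of f. The function name and the tolerance are assumptions made for illustration; a return value of False only means that the quadratic part alone does not decide the matter and higher-order terms must be examined.

```python
def quadratic_part_suffices(a, b, c, tol=1e-12):
    """Test a*c - b^2 != 0 for f = a*xi^2 + 2*b*xi*eta + c*eta^2 + O_3.

    True means the stated sufficient condition for 2-determination holds,
    so the terms O_3 may be neglected. False means higher-order terms
    have to be taken into account.
    """
    return abs(a * c - b * b) > tol


print(quadratic_part_suffices(1.0, 0.0, 1.0))   # xi^2 + eta^2
print(quadratic_part_suffices(1.0, 0.0, -1.0))  # xi^2 - eta^2
print(quadratic_part_suffices(1.0, 1.0, 1.0))   # (xi + eta)^2, degenerate
```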

EXAMPLE 73.61. Let $f: U(0) \subseteq \mathbb{R}^n \to \mathbb{R}$.
(a) If $f'(0) \neq 0$, i.e., not every first-order partial derivative of f vanishes at 0,
then f is 1-determined in $\mathbb{R}^n$.
(b) If f'(0) = 0, and the matrix f''(0) of the second-order partial derivatives
of f at 0 is invertible, then f is 2-determined in $\mathbb{R}^n$.
(c) If n = 1, and if $f'(0) = \cdots = f^{(k-1)}(0) = 0$ and $f^{(k)}(0) \neq 0$, $k \geq 2$, then f is
k-determined in $\mathbb{R}^1$.
(d) $\xi^2\eta$ is not k-determined in $\mathbb{R}^2$ for arbitrary k.
In (a), the point 0 is a regular point of f. In (b), the point 0 is a nondegenerate
singular point of f.

PROOF. (a), (b) This is another formulation of Example 73.59 and the Morse
lemma of Section 73.12.
(c) We have $f(x) = x^k(a + O(x))$ with $a \neq 0$. Let $g(x) = x^k(a + O(x))$. We
solve
$$g(\varphi(u)) = au^k,$$
using the following ansatz: $\varphi(u) = u\psi(u)$. Then $\psi$ is obtained from the inverse
function theorem.
(d) The solution sets of $\xi^2\eta = 0$ and $\xi^2\eta + \eta^{2r+1} = 0$ with $r \geq 2$ have a
completely different structure (two straight lines and one straight line). $\Box$
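
The ansatz in part (c) can be tried out numerically. The sketch below assumes the concrete function $g(x) = x^k(a + bx)$, which is chosen here only for illustration, and computes $\psi(u)$ by Newton's method from the equation $\psi^k(a + bu\psi) = a$, which is what $g(u\psi) = au^k$ reduces to after dividing by $u^k$. The function names and the iteration count are hypothetical choices.

```python
def g(x, a, b, k):
    """Illustrative function g(x) = x^k (a + b x) with a != 0."""
    return x**k * (a + b * x)


def solve_psi(u, a, b, k, iterations=50):
    """Solve psi^k (a + b u psi) = a for psi by Newton's method.

    The iteration starts at psi = 1, which is the exact solution for u = 0.
    """
    psi = 1.0
    for _ in range(iterations):
        h = psi**k * (a + b * u * psi) - a
        dh = k * psi**(k - 1) * (a + b * u * psi) + psi**k * b * u
        psi -= h / dh
    return psi


def phi(u, a, b, k):
    """Coordinate change phi(u) = u psi(u) with g(phi(u)) = a u^k."""
    return u * solve_psi(u, a, b, k)


a, b, k = 2.0, 1.0, 3
for u in (-0.1, 0.0, 0.1):
    residual = g(phi(u, a, b, k), a, b, k) - a * u**k
    print(u, phi(u, a, b, k), residual)
```

Since $\psi(0) = 1$, the map $\varphi$ has derivative 1 at u = 0 and is therefore a local diffeomorphism there, in line with the ansatz.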

The case of k-determination, $k \geq 3$, will be discussed below.
In science, perturbations are often described by parameters, i.e., one
considers parameter families in the sense of (C) above. In what follows the
important transformation rule is
$$G(y, q) = F(x(y, q), p(q)) + c(q), \tag{40}$$
i.e., the parameters p and q are separated from the state variables x and y.

Definition 73.62 (Unfoldings). Consider parameter families F and G in the
sense of (C) above.
(a) F induces G if and only if (40) holds, where x, p, and c are local $C^\infty$-maps
and $y \mapsto x(y, q)$ is a local $C^\infty$-diffeomorphism for all q in a neighborhood
of 0.
If, in addition, p is a local $C^\infty$-diffeomorphism, then F and G are called
equivalent.
(b) F is called an unfolding or deformation of f if and only if F(x, 0) = f(x) in
a neighborhood of 0.
(c) An unfolding F of f is called versal if and only if F induces every unfolding
G of f.
(d) Every versal unfolding with a minimal number of parameters is called
universal.

Roughly speaking, universal unfoldings provide a survey of all
perturbations of f with finitely many parameters.
Note that the equivalence of parameter families F and G is more special
than the equivalence of maps of Section 73.15. If F and G are equivalent, then
the parameter spaces of F and G have the same dimension. In the following,
we present a number of criteria for the numerical computation of normal
forms. Some examples are given in Problem 73.4. For
$$P^k_{\mathrm{hom}} \subseteq P^k_0 \subseteq P^k,$$
$k \geq 1$, we use the following notations:
$P^k$                  set of all polynomials of degree $\leq k$ with respect to the real variables
                       $\xi_1, \ldots, \xi_n$;
$P^k_0$                polynomials in $P^k$ which vanish at the origin;
$P^k_{\mathrm{hom}}$   homogeneous polynomials of degree k.
We also write $P^k(\mathbb{R}^n)$, etc. For the criteria below, the following expressions are
important:
(I) $j^{k+1}[Q_1 j^k(D_1 f + D_1 Q) + \cdots + Q_n j^k(D_n f + D_n Q)]$,
(II) $j^{k+1}[Q_1 j^{k-1}(D_1 f) + \cdots + Q_n j^{k-1}(D_n f)]$,
(III) $j^k[Q_1 j^k(D_1 f) + \cdots + Q_n j^k(D_n f)]$,
where $D_i = \partial/\partial\xi_i$. Also, recall condition (C) above.

(C1) (Criterion for k-determination). The map $f: U(0) \subseteq \mathbb{R}^n \to \mathbb{R}$ is
k-determined in $\mathbb{R}^n$ if and only if the following holds: If $Q \in P^{k+1}_{\mathrm{hom}}$, then every
$P \in P^{k+1}_{\mathrm{hom}}$ can be expressed through (I) with $Q_i \in P^r$, $r \geq 1$.
If f is k-determined in $\mathbb{R}^n$, then every $P \in P^{k+1}_{\mathrm{hom}}$ can be expressed through
(II) with $Q_i \in P^r$, $r \geq 1$.
(C2) (Criterion for strong k-determination). The map f is strongly
k-determined in $\mathbb{R}^n$ if and only if every $P \in P^{k+1}_{\mathrm{hom}}$ can be expressed through
(II) with $Q_i \in P^r$, $r \geq 2$.
(C3) (Construction of universal unfoldings). Let f be k-determined in $\mathbb{R}^n$ with
f'(0) = 0. If one chooses a cobasis $\{v_1, \ldots, v_r\}$, then
$$F(x, p) = f(x) + p_1 v_1(x) + \cdots + p_r v_r(x)$$
is a universal unfolding of f.

We explain the concept of cobasis for a k-determined function f in $\mathbb{R}^n$. Let
ideal(f') be the set of all polynomials in $P^k_0$ which can be expressed through
(III) with $Q_i \in P^k$. Here, $P^k_0$ is considered as a linear space. Then, dim $P^k_0$ is equal
to the maximal number of linearly independent polynomials in $P^k_0$. The
codimension of f is defined as
$$\operatorname{codim} f = \dim P^k_0 - \dim(\operatorname{ideal}(f')).$$
Let r = codim f. Then $\{v_1, \ldots, v_r\}$ is a cobasis if and only if
$$P^k_0 = \operatorname{ideal}(f') \oplus \operatorname{span}\{v_1, \ldots, v_r\}.$$
EXAMPLE 73.63. The function $f: \mathbb{R} \to \mathbb{R}$ with $f(x) = x^k$ is k-determined in $\mathbb{R}$.
Let $k \geq 3$. Then
$$P^k_0(\mathbb{R}) = \operatorname{span}\{x, \ldots, x^k\},$$
$$\operatorname{ideal}(f') = \operatorname{span}\{x^{k-1}, x^k\}.$$
Consequently, $\{x, \ldots, x^{k-2}\}$ is a cobasis, i.e., codim f = k - 2. From (C3) we
obtain a universal unfolding F of f through
$$F(x, p) = x^k + p_1 x + \cdots + p_{k-2} x^{k-2}.$$
For k = 2, we have F = f.
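
The computation in this example can be sketched for polynomials in one variable. The following code assumes that f is a polynomial with f(0) = 0 and f'(0) = 0, given by its coefficient list, and that k is taken to be the degree of f. If $x^m$ is the lowest power occurring in f', then the products of f' with $1, x, x^2, \ldots$, truncated at degree k, span $\{x^m, \ldots, x^k\}$, so the monomials $x, \ldots, x^{m-1}$ form a cobasis. The function names are hypothetical.

```python
def cobasis_exponents(f_coeffs):
    """Exponents e such that the monomials x**e form a cobasis.

    f_coeffs[i] multiplies x**i. Assumes f(0) = 0 and f'(0) = 0.
    """
    k = len(f_coeffs) - 1
    if k < 2 or f_coeffs[0] != 0 or f_coeffs[1] != 0:
        raise ValueError("need f(0) = 0, f'(0) = 0 and degree at least 2")
    nonzero = [i for i in range(2, k + 1) if f_coeffs[i] != 0]
    if not nonzero:
        raise ValueError("f must not vanish identically")
    m = nonzero[0] - 1  # lowest power of x occurring in f'
    return list(range(1, m))


def unfolding(f_coeffs, params, x):
    """Evaluate F(x, p) = f(x) + p_1 x + p_2 x^2 + ... as in (C3)."""
    exponents = cobasis_exponents(f_coeffs)
    if len(params) != len(exponents):
        raise ValueError("one parameter per cobasis element is required")
    value = sum(c * x**i for i, c in enumerate(f_coeffs))
    return value + sum(p * x**e for p, e in zip(params, exponents))


print(cobasis_exponents([0, 0, 0, 0, 1]))    # f(x) = x^4
print(cobasis_exponents([0, 0, 1]))          # f(x) = x^2
print(unfolding([0, 0, 0, 1], [2.0], 3.0))   # x^3 + 2x at x = 3
```

The length of the returned list is codim f, so for $f(x) = x^k$ the sketch reproduces codim f = k - 2.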
The following result is the main theorem in catastrophe theory.
(C4) (Local normal forms for parameter families). A parameter family with at
most five parameters and n state variables is "in general" versal. If this is
the case, then it is induced by one of the universal unfoldings of Table 73.1.
These universal unfoldings are "structurally stable."
A precise definition of the expressions "in general" and "structurally stable"
will be given below in (C6) and (C7). The normal forms of Table 73.1 with
codim f $\geq$ 1 are called catastrophes. For codim f $\leq$ 4, the catastrophes are
called the seven elementary catastrophes. In concrete applications, one uses
Table 73.1 in the following way.
(i) Given a parameter family G = G(y, q), let q be equal to zero and consider
g(y) = G(y, 0). If one can show, by using (C1), that g is k-determined with
codim g $\leq$ 5, then G is induced by one of the normal forms F of Table 73.1
with codim f = codim g, and f of Table 73.1 is right equivalent to g.
(ii) If, conversely, g is given and k-determined with codim g $\leq$ 5, then one can
construct from (C3) a universal unfolding G of g which is equivalent to a
universal unfolding F of Table 73.1 with codim f = codim g.
In Table 73.1, we have used, for reasons of simplicity, a, b, c, d, e instead of
$p_1, \ldots, p_5$. Letting a, b, c, d, e in F be equal to zero, one obtains f. For
codim f $\geq$ 1, all these f have a degenerate singular point at x = 0, and codim f
measures the degree of degeneracy. The intuitive meaning of codim f is as
follows: If $f: U(0) \subseteq \mathbb{R}^n \to \mathbb{R}$ is k-determined with codim f = r, then all
sufficiently small perturbations of f have no more than r + 1 singular points in
a small neighborhood of zero.
Table 73.1 gives a survey of the main structure of parameter families in a
neighborhood of zero. The importance of this for science will become clear in
the following section.
(C5) (Transversality criterion for versal unfoldings). Let f be k-determined
and let F = F(x, p) be an unfolding of f with $x = (\xi_1, \ldots, \xi_n)$ and
$p = (p_1, \ldots, p_r)$. Let
$$v_i(x) = j^k(\partial F(x, 0)/\partial p_i).$$
Then F is versal if and only if span$\{v_1, \ldots, v_r\}$ and ideal(f') are transversal
in $P^k_0(\mathbb{R}^n)$, i.e., they span $P^k_0(\mathbb{R}^n)$.
Table 73.1
f is obtained from the normal form F by letting a = b = c = d = e = 0.

Name                           Normal form F (universal unfolding of f)                                           codim f
Regular point                  $\xi_1$                                                                            0
Nondegenerate singular point   $\xi_1^2 + \cdots + \xi_r^2 - \xi_{r+1}^2 - \cdots - \xi_n^2$, $0 \leq r \leq n$   0
Fold                           $\xi^3 + a\xi + A$                                                                 1
Cusp                           $\pm(\xi^4 + a\xi^2 + b\xi) + A$                                                   2
Swallow tail                   $\xi^5 + a\xi^3 + b\xi^2 + c\xi + A$                                               3
Elliptic umbilic               $\xi^2\eta - \eta^3 + a\xi^2 + b\eta + c\xi + B$                                   3
Hyperbolic umbilic             $\xi^2\eta + \eta^3 + a\xi^2 + b\eta + c\xi + B$                                   3
Parabolic umbilic              $\pm(\xi^2\eta + \eta^4 + a\eta^2 + b\xi^2 + c\eta + d\xi) + B$                    4
Butterfly                      $\pm(\xi^6 + a\xi^4 + b\xi^3 + c\xi^2 + d\xi) + A$                                 4
Wigwam                         $\xi^7 + a\xi^5 + b\xi^4 + c\xi^3 + d\xi^2 + e\xi + A$                             5
Second elliptic umbilic        $\xi^2\eta - \eta^5 + a\eta^3 + b\eta^2 + c\xi^2 + d\eta + e\xi + B$               5
Second hyperbolic umbilic      $\xi^2\eta + \eta^5 + a\eta^3 + b\eta^2 + c\xi^2 + d\eta + e\xi + B$               5
Symbolic umbilic               $\pm(\xi^3 + \eta^4 + a\xi\eta^2 + b\eta^2 + c\xi\eta + d\eta + e\xi) + B$         5

Notations:
$\xi = \xi_1$, $\eta = \xi_2$;
$A = \xi_2^2 + \cdots + \xi_r^2 - \xi_{r+1}^2 - \cdots - \xi_n^2$, $1 \leq r \leq n$;
$B = \xi_3^2 + \cdots + \xi_r^2 - \xi_{r+1}^2 - \cdots - \xi_n^2$, $2 \leq r \leq n$;
a, b, c, d, e are real parameters.
Definition 73.64 (Structural Stability). Roughly speaking, a parameter family
F is called structurally stable if every F + P is equivalent to F for a sufficiently
small perturbation P. More precisely,
$$F: U(0) \subseteq \mathbb{R}^n \times \mathbb{R}^d \to \mathbb{R}$$
is called locally structurally stable if and only if there exists a neighborhood
U of 0 in $\mathbb{R}^n \times \mathbb{R}^d$ such that all parameter families
$$F + P: U(0) \subseteq \mathbb{R}^n \times \mathbb{R}^d \to \mathbb{R}$$
are equivalent to F for F + P in a neighborhood of F in $C^\infty(U, \mathbb{R})$. Here,
$C^\infty(U, \mathbb{R})$ is endowed with the $C^\infty$-Whitney topology.

(C6) (Structural stability). Every locally structurally stable parameter family
is versal. All universal unfoldings F of Table 73.1 are locally structurally
stable.
We now make the expression "in general" of (C4) more precise.

(C7) (Genericity). In $C^\infty(\mathbb{R}^n \times \mathbb{R}^d, \mathbb{R})$, $d \leq 5$, there exists an open and dense set
in the $C^\infty$-Whitney topology such that each function H = H(x, p) in this
set has the following property. If $(x_0, p_0)$ is fixed and
$$G(x, p) = H(x_0 - x, p_0 - p) - H(x_0, p_0),$$
then G is locally structurally stable and a versal unfolding of g(x) = G(x, 0).
Moreover, G is induced by a universal unfolding F of Table 73.1.
Unfortunately, for $d \geq 6$, we no longer get such nice results. Structural
stability is an important basic concept for describing scientific phenomena.
We are convinced that the essential features in nature are structurally
stable. Roughly, Proposition (C7) states that most parameter families H in
$C^\infty(\mathbb{R}^n \times \mathbb{R}^d, \mathbb{R})$ behave reasonably.

73.17. Applications to Natural Sciences


In Section 37.28 of Part III we already gave some simple applications to the
van der Waals gas (phase transitions) and bifurcations in connection with the
bending of beams. Now, let us make some principal remarks. We consider:
(i) Useless models.
(ii) Incomplete models.
(iii) Structurally stable perturbations of models.
We begin with (i). Suppose a scientist or engineer models a phenomenon with
a function $\xi^2\eta$ by making a number of simplifying assumptions. This model
is not very useful since, from Example 73.61, there exists no k such that $\xi^2\eta$
is k-determined. Arbitrarily small perturbations, i.e., perturbations of
arbitrarily high order, may change the qualitative picture of $\xi^2\eta$ completely.
For (ii), we consider $\xi^3 + \xi\eta^3$. It follows from Problem 73.4 that this
function is 4-determined, but not 3-determined. This means that this model
is incomplete, because 4-determination implies stability against perturbations
of order $\geq 5$, while perturbations of order 4 may influence the qualitative
picture. Thus, the criteria (C1) and (C2) are quite useful in applications and
we suggest doing Problem 73.4 in order to get some practice with this.
We now discuss (iii). Each phenomenon in nature or engineering is subject
to outer perturbations which often are described by outer parameters
$p = (p_1, \ldots, p_r)$, i.e., we consider equations of the form y = F(x, p). Two important
questions arise.
(a) How does a measurement curve change during the repetition of
measurements, which are subject to stable perturbations?
(b) At least how many parameters are needed to describe these stable
perturbations caused by outer influences?
The answer to (a) is: F must be a versal unfolding of f with f(x) = F(x, 0).
The answer to (b) is: F must be a universal unfolding of f.

EXAMPLE 73.65. Consider $f: U(0) \subseteq \mathbb{R} \to \mathbb{R}$ with
$$f(x) = ax^k + O(x^{k+1}) \quad \text{and } a \neq 0,\ k \geq 3.$$
From Examples 73.61 and 73.63 it follows that f is k-determined with
codim f = k - 2 and f has the universal unfolding
$$F(x, p) = f(x) + p_1 x + p_2 x^2 + \cdots + p_{k-2} x^{k-2}.$$
Each parameter family G = G(y, q) with G(y, 0) = f(y) is induced by F, i.e., we
have
$$G(y, q) = F(x(y, q), p(q)) + c(q)$$
with the properties of Definition 73.62. If k = 2, then F(x) = f(x), i.e., the
universal unfolding F contains no parameters. Let k = 3. Then
$$F(x, p) = f(x) + px.$$
Figure 73.17 shows the perturbations F of $f(x) = x^3$ for a fixed p. According
to Table 73.1, $F(x, p) = x^3 + px$ is the first elementary catastrophe (fold).
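
The three cases p < 0, p = 0, and p > 0 can be made explicit with a short sketch. It computes the singular points of $x \mapsto x^3 + px$, i.e., the real solutions of $3x^2 + p = 0$; the function name is a hypothetical choice.

```python
import math


def fold_critical_points(p):
    """Real solutions x of d/dx (x^3 + p x) = 3 x^2 + p = 0."""
    if p > 0:
        return []            # no singular point
    if p == 0:
        return [0.0]         # one degenerate singular point
    r = math.sqrt(-p / 3.0)
    return [-r, r]           # two nondegenerate singular points


for p in (-3.0, 0.0, 1.0):
    print(p, fold_critical_points(p))
```

For every p there are at most two singular points, which agrees with the bound r + 1 for codim f = r = 1.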
In many applications F = F(x, p) is thought of as energy or thermodynamic
potential, or as a generating function of a dynamical system
$$x'(t) = F_x(x(t), p). \tag{41}$$
Another field of applications is bifurcation theory. For this, we recommend
Golubitsky and Schaeffer (1979), (1984, M). In connection with energy
questions, we consider the following example. Let F = F(x, p) be the potential
energy of a mechanical system, which is described by the state variable $x \in \mathbb{R}^n$
Figure 73.17 (panels: p < 0, p = 0, p > 0)

and outer parameters $p \in \mathbb{R}^d$. The equilibrium states (x, p) of this system are
obtained from F(x, p) = min!, i.e., from
$$F_x(x, p) = 0. \tag{42}$$
Any reasonable energy function should be structurally stable, i.e., according
to (C6), F should be versal. In case we are interested in the minimal number
of outer parameters which influence a structurally stable system, it is useful
to assume that F is universal. To be more concrete, let n = 1 and
$$F(x, 0) = ax^k + O(x^{k+1})$$
with $x \in \mathbb{R}$, $k \geq 2$ and a > 0. From Example 73.63 it follows that the minimal
number of parameters is equal to k - 2. Assume, for instance, that k = 4. Then
$x \mapsto F(x, 0)$ has a degenerate minimum at x = 0 and a universal unfolding
$$F(x, p_1, p_2) = ax^4 + O(x^5) - p_1 x^2 - p_2 x.$$
This corresponds to the second elementary catastrophe (cusp) of Table 73.1.
The equilibrium states $(x, p_1, p_2)$, i.e., the solutions of (42), follow from
$$4ax^3 + O(x^4) - 2p_1 x - p_2 = 0.$$
Figure 73.18 shows the diagrams for $p_2 = 0$ and $p_2 \neq 0$. Here, $p_2 = 0$
corresponds to the broken line. Since in reality one always deals with perturbations,
a diagram with $p_2 = 0$ cannot be expected in experiments, but only the
perturbed diagrams with $p_2 \neq 0$. This is actually the case.
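
A rough numerical illustration of the equilibrium equation is sketched below. It assumes that the terms $O(x^4)$ are dropped, so that only the cubic $4ax^3 - 2p_1 x - p_2 = 0$ is considered, and it counts the distinct real solutions with the discriminant of the depressed cubic $x^3 + Px + Q = 0$. The function name is hypothetical.

```python
def count_equilibria(a, p1, p2):
    """Number of distinct real solutions of 4 a x^3 - 2 p1 x - p2 = 0, a > 0.

    Dividing by 4a gives x^3 + P x + Q = 0 with discriminant
    -4 P^3 - 27 Q^2: positive means three real roots, negative means one.
    """
    P = -p1 / (2.0 * a)
    Q = -p2 / (4.0 * a)
    disc = -4.0 * P**3 - 27.0 * Q**2
    if disc > 0:
        return 3
    if disc < 0:
        return 1
    # disc == 0: a triple root if P = Q = 0, otherwise a double and a simple root
    return 1 if P == 0 and Q == 0 else 2


print(count_equilibria(1.0, 2.0, 0.0))    # unperturbed case p2 = 0, p1 > 0
print(count_equilibria(1.0, 2.0, 0.1))    # small perturbation p2 != 0
print(count_equilibria(1.0, -2.0, 0.0))   # p1 < 0
```

For $p_2 = 0$ and $p_1 > 0$ the three solutions are x = 0 and $x = \pm\sqrt{p_1/(2a)}$; for small $p_2 \neq 0$ there are still three solutions, but none of them stays exactly at x = 0.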

Figure 73.18
73.18. Orientation
In a finite-dimensional B-space X, an orientation is determined by specifying
a basis $\{b_1, \ldots, b_n\}$. A linear map
$$L: X \to X$$
is called orientation preserving if and only if det L > 0. Note that the definition
of det L is invariant, i.e., independent of the basis in X (see Section 4.16). Two
bases $\{b_1, \ldots, b_n\}$ and $\{c_1, \ldots, c_n\}$ in X are called orientation equivalent if and
only if there exists a linear, orientation preserving map $L: X \to X$ with
$$L b_j = c_j \quad \text{for all } j.$$
The corresponding equivalence classes of bases are called orientations of X.
In X, there exist exactly two orientations with representatives $\{b_1, b_2, \ldots, b_n\}$
and $\{-b_1, b_2, \ldots, b_n\}$. Figure 73.19 shows the representatives for both
orientations of $\mathbb{R}^2$. A $C^1$-map $f: U \subseteq X \to X$, where U is open in X, is called
orientation preserving if and only if every derivative has this property, i.e.,
$\det f'(x) > 0$ for all $x \in U$.
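
For $X = \mathbb{R}^n$ the orientation equivalence of two bases can be tested with determinants. The sketch below assumes that each basis is given as a list of coordinate vectors. The linear map L that sends one basis to the other has determinant det(C)/det(B), where B and C are the matrices formed from the two bases, so the bases are orientation equivalent exactly when both determinants have the same sign. The function names are hypothetical.

```python
def det(matrix):
    """Determinant by Laplace expansion along the first row (small n only)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * matrix[0][j] * det(minor)
    return total


def same_orientation(basis_b, basis_c):
    """True if the two bases of R^n define the same orientation.

    Returns False as well if one of the inputs is not a basis (determinant 0).
    """
    return det(basis_b) * det(basis_c) > 0


standard = [[1, 0], [0, 1]]
print(same_orientation(standard, [[-1, 0], [0, 1]]))   # b_1 replaced by -b_1
print(same_orientation(standard, [[0, 1], [-1, 0]]))   # rotation by 90 degrees
```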
We now extend this concept to manifolds. Let us consider the surface of the
earth M. Intuitively, an orientation of M is determined by specifying two small
curved coordinate axes {a, b} at each point of M. This should be done globally,
without causing contradictions. To this end we require that, for each fixed
chart, the chart images of {a, b} have the same orientation, and that this
orientation is preserved under chart changes (see Fig. 73.20). Depending on
the choice of the geographical atlas, this might not be possible; however, we
can always find an equivalent atlas with this property. Proceeding like this,
we obtain exactly two orientations for the surface of the earth (Fig. 73.21). On
the other hand, there exists no orientation for the well-known Moebius band.
This is a surface in $\mathbb{R}^3$ which is obtained by joining two opposite sides of a
rectangle as in Figure 73.22, i.e., A and A' are identified. It is told that Moebius
(1790-1868) discovered this surface while watching his wife sewing a garter.
Now, if one moves a coordinate cross along BOB' in Figure 73.22, the
orientation at B' is different from the orientation at B. But since both points
on the surface correspond to the same point, the Moebius band cannot be
given an orientation. Let us make these heuristic arguments more precise.
Assume:
(H) Let M be a real, n-dimensional $C^k$-manifold, $k \geq 1$.

Figure 73.19

Figure 73.20 (chart change $\psi \circ \varphi^{-1}$)

Figure 73.21

Figure 73.22 (labels: A, C', B, 0, B', C, A')

Two admissible charts $(U, \varphi)$ and $(V, \psi)$ in M are called orientation
compatible if and only if $U \cap V$ is empty or the map
$$\psi \circ \varphi^{-1}: \varphi(U \cap V) \to \psi(U \cap V),$$
describing the chart change, is orientation preserving (Fig. 73.20). An
equivalent atlas of M is called oriented if and only if all of its charts are orientation
compatible.

Definition 73.66. A manifold M is called orientable if and only if M has an
equivalent atlas which is oriented.

In order to formalize the concept of orientation, we say that two oriented,
equivalent atlases for M have the same orientation if and only if all charts in
the two atlases are orientation compatible. The corresponding equivalence
classes of oriented atlases are called orientations of M. If M is connected and
orientable, then M admits exactly two orientations. We say that M is oriented
if we specify an oriented, equivalent atlas for M. All admissible charts in M
which are orientation compatible with this atlas are called admissible oriented
charts.

73.19. Manifolds with Boundary


According to our previous definition of manifolds, the open disk in $\mathbb{R}^2$ is a
two-dimensional manifold, while the closed disk M is not a manifold at all, since
a neighborhood of a boundary point x of M is not homeomorphic to an open
set in $\mathbb{R}^2$. Such a neighborhood, however, is homeomorphic to a relatively
open set of the half space $H\mathbb{R}^2$ (Fig. 73.23). This leads to the following
definition of manifolds with boundary. Let
$$H\mathbb{R}^n = \{(\xi_1, \ldots, \xi_n) \in \mathbb{R}^n: \xi_1 \geq 0\}.$$
A set A in $H\mathbb{R}^n$ is called relatively open if and only if there exists an open set
B in $\mathbb{R}^n$ such that $A = B \cap H\mathbb{R}^n$.
Recall from Definition 4.22 that a map $f: A \to \mathbb{R}^m$ is $C^k$, $k \geq 1$, if and only
if f is $C^k$ at every inner point of A, and for every boundary point of A there
exists an open neighborhood in $\mathbb{R}^n$ such that f can be extended to a $C^k$-map
on this neighborhood.
The definition of a real, n-dimensional $C^k$-manifold M with boundary is then
analogous to the definition of manifolds of Section 73.2. The only difference
is that the chart images $U_\varphi$ are no longer open sets in $\mathbb{R}^n$, but relatively open
sets in $H\mathbb{R}^n$. A point $x \in M$ is called a boundary point or inner point in M if
and only if there exists a chart $(U, \varphi)$ in M such that $\varphi(x)$ is a boundary point
or inner point of the chart half space $H\mathbb{R}^n$. In the case of disks and balls, this
corresponds to our intuition. If the boundary is empty, then we obtain a
manifold in the previous sense.

EXAMPLE 73.67. Let M be a subset of $\mathbb{R}^n$. If each point in M has a neighborhood
which is $C^k$-diffeomorphic to a relatively open subset of $H\mathbb{R}^n$, then M is a real,
n-dimensional manifold with boundary.
In particular, the finite interval [a, b[ is a real one-dimensional $C^\infty$-manifold
with boundary point a. The closed ball in $\mathbb{R}^n$ is a real, n-dimensional
$C^\infty$-manifold with boundary.

Figure 73.23
Proposition 73.68. Every connected, real, one-dimensional $C^\infty$-manifold with
nonempty boundary and countable basis is $C^\infty$-diffeomorphic to the unit interval
[0, 1] or [0, 1[.

For a proof see Milnor (1965, M), Appendix.

The orientation of manifolds with boundary is defined analogously to the
orientation of manifolds without boundary in Section 73.18.
Let $e_i$ denote the ith unit vector in $\mathbb{R}^n$, i.e., $e_1 = (1, 0, \ldots, 0)$, etc. Let $N = H\mathbb{R}^n$.
We say that N and $\partial N$ are coherently oriented if and only if the orientations
of N and $\partial N$ are given either by
and
or by
and

respectively (Fig. 73.24). This definition coincides with the following more
general situation.

Proposition 73.69 (Coherent Orientation). Let M be a real, oriented,
n-dimensional $C^k$-manifold with boundary $\partial M$, $k \geq 1$. Then the oriented atlas for
M induces in a natural way an oriented atlas for $\partial M$.

This way an orientation is induced on $\partial M$. The corresponding orientations
of M and $\partial M$ are called coherent.

PROOF. The basic idea is contained in Figure 73.24. Let us consider a fixed
boundary point $x \in \partial M$. In two different charts, a neighborhood of x in M is
described by the local coordinates
$$(\xi_1, \ldots, \xi_n), \quad \xi_1 \geq 0,$$
and
$$(\eta_1, \ldots, \eta_n), \quad \eta_1 \geq 0.$$

Figure 73.24 (labels: M, $\partial M$)

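Correspondingly, a neighborhood of x in $\partial M$ is described by the local
coordinates
$$(0, \xi_2, \ldots, \xi_n) \quad \text{or} \quad (0, \eta_2, \ldots, \eta_n).$$
Let
$$\eta_j = \eta_j(\xi_1, \ldots, \xi_n), \quad j = 1, \ldots, n,$$
be the coordinate change on M. Since M is oriented, we have
$$\frac{\partial(\eta_1, \ldots, \eta_n)}{\partial(\xi_1, \ldots, \xi_n)} > 0.$$
The corresponding coordinate change on $\partial M$ is described by
$$0 = \eta_1(0, \xi_2, \ldots, \xi_n),$$
$$\eta_j = \eta_j(0, \xi_2, \ldots, \xi_n), \quad j = 2, \ldots, n.$$
On $\partial M$, the first of these equations gives $\partial\eta_1/\partial\xi_j = 0$ for $j = 2, \ldots, n$, and
$\partial\eta_1/\partial\xi_1 > 0$ since $\eta_1 \geq 0$ for $\xi_1 \geq 0$ and the determinant above does not vanish.
Thus, on $\partial M$, we obtain
$$\frac{\partial(\eta_2, \ldots, \eta_n)}{\partial(\xi_2, \ldots, \xi_n)} > 0,$$
i.e., $\partial M$ is oriented. $\Box$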

Figure 73.25 shows the two coherent orientations of surfaces in $\mathbb{R}^3$, if these
surfaces are connected submanifolds of $\mathbb{R}^3$ with boundary. The coherent
orientation will play an important role in the formulation of the generalized
integral theorem of Stokes of Section 74.24.

Figure 73.25 (label: $\partial M$)

For a convenient formulation of this theorem we need the concept of a


submanifold M with boundary. One may think of Mas an open set in ~ 3 or
a closed ball, and of S as a sufficiently regular surface or curve with or without
boundary which lies in M.
Let $M$ be a real, $n$-dimensional $C^k$-manifold with boundary, $k \ge 0$.
Moreover, let $S$ be a subset of $M$. Then $S$ is called an $m$-dimensional
$C^k$-submanifold of $M$ with boundary if and only if the following holds: For
each point $x \in M$ there exists an admissible chart in $M$ such that $M$ looks
locally like a relatively open set in $H\mathbb{R}^n$ and $S$ looks locally like
a relatively open set in $H\mathbb{R}^m$. Here

$$H\mathbb{R}^n = \{(\xi_1, \dots, \xi_n) \in \mathbb{R}^n : \xi_1 \ge 0\},$$
$$H\mathbb{R}^m = \{(\xi_1, \dots, \xi_n) \in H\mathbb{R}^n : \xi_i = 0 \text{ for all } i > m\}.$$

More precisely, for each $x \in M$ there exists an admissible chart
$(U, \varphi)$ in $M$ with $x \in U$ such that $\varphi(U)$ and
$\varphi(U \cap S)$ are the corresponding relatively open sets in
$H\mathbb{R}^n$ and $H\mathbb{R}^m$.
From these admissible charts in $M$, we obtain charts in $S$. Thus, $S$ becomes
a real, $m$-dimensional $C^k$-manifold with boundary.
Analogously, one can define Banach manifolds with boundary. The only
difference from the Banach manifolds of Section 73.2 is that here the chart
images are no longer open sets in B-spaces $X$, but relatively open sets in half
spaces $HX$ of B-spaces, i.e.,

$$HX = \{x \in X : f(x) \ge 0\},$$

where $f: X \to \mathbb{R}$ is linear and continuous.
Manifolds always mean manifolds in the sense of Section 73.2, i.e., manifolds
without boundary. For manifolds with boundary we will always add "with
boundary."

73.20. Sard's Theorem


Theorem 73.H (Sard (1942)). Let $M$ and $N$ be real, finite-dimensional
$C^\infty$-manifolds with countable basis. If $f: M \to N$ is $C^k$ with
$$k > \max(0, \dim M - \dim N),$$
then the set of singular values of $f$ has measure zero in $N$. The set of
regular values of $f$ is residual and dense in $N$.

This important classical theorem states that almost all values of $f$ are
regular. A proof may be found in Abraham and Robbin (1967, M), §15. The
proof for $C^\infty$-functions $f$ is much easier and is contained in Lang
(1972, M), p. 173. The important generalization to Banach manifolds is proved in
Section 78.8 (Theorem of Sard-Smale). In Chapter 78 we show how to use Sard's
theorem for an elegant and constructive approach to fixed-point theory
and mapping degree. This also leads to effective numerical programs for
computers.
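As a small numerical illustration (not part of the original text), consider the
map $f(x) = x^3 - 3x$ from $\mathbb{R}$ to $\mathbb{R}$, chosen here only as an
example. Its singular values are the images of the critical points, i.e., of the
zeros of $f'$. The sketch below lists them; there are only finitely many, so
they form a set of measure zero, in agreement with Sard's theorem. All function
names are assumptions made for this sketch.

```python
# Illustrative sketch: singular values of the example map f(x) = x**3 - 3*x.
# The map f and all helper names are assumptions made for this example.

def f(x: float) -> float:
    return x**3 - 3.0 * x

def df(x: float) -> float:
    # Derivative f'(x) = 3x^2 - 3.
    return 3.0 * x**2 - 3.0

# Critical points solve f'(x) = 0, i.e. x = -1 and x = +1.
critical_points = [-1.0, 1.0]

# Singular values are the images of the critical points.
singular_values = sorted(f(x) for x in critical_points)

def is_regular_value(y: float) -> bool:
    # Every y outside the finite set of singular values is a regular value.
    return y not in singular_values
```

Every real number other than the two singular values is a regular value of this
particular $f$, which is the one-dimensional picture behind "almost all values
are regular."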

73.21. Whitney's Embedding Theorem


Theorem 73.I (Whitney (1936)). Let $M$ be a real, $n$-dimensional
$C^\infty$-manifold with a countable basis. Then there exists a
$C^\infty$-embedding $f: M \to \mathbb{R}^{2n+1}$.

This fundamental structure theorem states that $M$ can be realized as a
submanifold $f(M)$ in $\mathbb{R}^{2n+1}$. A proof may be found in Golubitsky and
Guillemin (1973, M), Chapter 2, §5. We will give here a simple proof for a
special case which illustrates the geometric meaning of this theorem, and shows
how the space $\mathbb{R}^{2n+1}$ comes up. Here, the tangent bundle $TM$ plays a
central role. The proof should also convince the reader of the usefulness of the
abstract construction $TM$. Our additional assumption is:
(S) $M$ is a compact submanifold of $\mathbb{R}^m$, $m \ge n$.
For $m \le 2n + 1$, Theorem 73.I is trivially true. Thus, only the case
$m > 2n + 1$ is of interest. In order to get a good intuitive understanding, we
translate the tangent space $TM_x$ to the origin of $\mathbb{R}^m$. Then $TM_x$
is a linear subspace of $\mathbb{R}^m$. For the tangent bundle we have
$$TM = \{(x, v) \in M \times \mathbb{R}^m : v \in TM_x\}.$$
Then $TM$ is a real, $2n$-dimensional $C^\infty$-manifold in $\mathbb{R}^{2m}$.
Roughly speaking, the space $\mathbb{R}^{2n+1}$ appears because
$\dim TM < 2n + 1$. Let us make this precise.
The key in the proof is Sard's theorem of the previous section.
PROOF IN CASE OF (S).
(I) Geometrical argument. Below, we prove the following: If
    $g: M \to \mathbb{R}^k$ is an injective immersion with $k > 2n + 1$, then
    there exists a nonvanishing vector $a \in \mathbb{R}^k$ such that
    $$\pi \circ g : M \to a^\perp$$
    is an injective immersion. Here $a^\perp$ denotes the orthogonal complement
    of $a$ in $\mathbb{R}^k$ and $\pi: \mathbb{R}^k \to a^\perp$ is the
    corresponding orthogonal projection.
(II) Let $m > 2n + 1$. Since $a^\perp$ is isomorphic to $\mathbb{R}^{k-1}$, we
    are able to construct an injective $C^\infty$-immersion
    $f: M \to \mathbb{R}^{2n+1}$ by successively applying (I) to the trivial
    embedding $g: M \to \mathbb{R}^m$.
    Because of the compactness of $M$, $f$ is proper, i.e., by Corollary
    73.45, an embedding. This proves the assertion.
(III) Proof of (I). We define the following two maps
    $$G: TM \to \mathbb{R}^k, \quad G(x, v) \overset{\text{def}}{=} (T_x g)(v),$$
    and
    $$H: M \times M \times \mathbb{R} \to \mathbb{R}^k, \quad H(x, y, t) \overset{\text{def}}{=} t(g(x) - g(y)).$$
    The preimage spaces of $G$ and $H$ have dimensions $2n$ and $2n + 1$ and the
    image spaces have dimension $k > 2n + 1$. This is essential. According
    to Sard's theorem of Section 73.20, the set of singular values of $G$ and
    $H$ has measure zero in $\mathbb{R}^k$. Since the dimensions of the preimage
    spaces are smaller than $k$, each image point is a singular value.
    Therefore, there exists an $a \in \mathbb{R}^k$ which is not an image point
    of $G$ or $H$. Since $G(x, 0) = 0$ and $H(x, y, 0) = 0$, we have $a \ne 0$.
    The map $\pi \circ g: M \to a^\perp$ is injective. This is true because
    $\pi(g(x)) = \pi(g(y))$ implies that $g(x) - g(y) = ta$ for some
    $t \in \mathbb{R}$. If $x \ne y$, then $t \ne 0$ since $g$ is injective.
    Thus $H(x, y, 1/t) = a$, which contradicts the choice of $a$.
    Moreover, $\pi \circ g$ is an immersion. Suppose there exists a $v \ne 0$
    with $T_x(\pi \circ g)(v) = 0$. The chain rule implies that
    $(\pi \circ T_x g)(v) = 0$, i.e., $(T_x g)(v) = ta$ for some
    $t \in \mathbb{R}$. Since $g$ is an immersion, it follows that $t \ne 0$.
    Consequently, $G(x, v/t) = a$, which contradicts the choice of $a$. □
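The key algebraic fact used in step (III) is that two points have the same image
under the orthogonal projection onto $a^\perp$ exactly when their difference is
a multiple of $a$. The following sketch, with hypothetical helper names and
sample vectors in $\mathbb{R}^4$ chosen only for illustration, checks this for
the projection $\pi(y) = y - (\langle y, a\rangle / \langle a, a\rangle)\, a$.

```python
# Illustrative sketch: orthogonal projection of R^k onto the complement of a.
# All names and sample vectors are hypothetical choices for this example.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def project(y, a):
    # pi(y) = y - (<y, a> / <a, a>) a, which lies in the orthogonal complement of a.
    c = dot(y, a) / dot(a, a)
    return [yi - c * ai for yi, ai in zip(y, a)]

a = [1.0, 2.0, 2.0, 0.0]
y1 = [0.5, -1.0, 3.0, 2.0]
t = 4.0
y2 = [yi + t * ai for yi, ai in zip(y1, a)]  # y2 - y1 = t * a
y3 = [0.5, -1.0, 3.0, 5.0]                   # y3 - y1 is not a multiple of a

p1, p2, p3 = project(y1, a), project(y2, a), project(y3, a)
```

Here `p1` and `p2` coincide because `y2 - y1` is parallel to `a`, while `p3`
differs from `p1`. In the proof, $a$ is chosen outside the image of $H$, so no
difference $g(x) - g(y)$ with $x \ne y$ is parallel to $a$.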

73.22. Vector Bundles


In Definition 73.20 the abstract notion of bundles was introduced. Vector
bundles may be characterized in the following two ways.
(i) Vector bundles are bundles where the fibers are linear spaces, and which
behave locally like products of open sets with B-spaces.
(ii) Vector bundles are obtained if a B-space $F_x$ is attached at each point
    $x$ of a topological space or manifold $M$. Here, we neglect that different
    spaces $F_x$ may intersect.
In (ii), we may also think of a vector bundle as a family $\{F_x\}$ of vector
spaces which continuously depends on a parameter $x$, where $x$ runs through
$M$. Here $M$ is called a basis space and $F_x$ is called a fiber. The bundle
space $B$ is defined as
$$B = \{(x, t) : x \in M,\ t \in F_x\}.$$
Also, we let $\pi(x, t) = x$. This induces a triple $(\pi, B, M)$.
The reader should convince himself that the following definition is very
natural, as is the case with every important definition in mathematics.
Let X and Y be B-spaces. As usual, L(X, Y) denotes the B-space of linear,
continuous maps from X into Y.

Definition 73.70. A vector bundle $(\pi, B, M)$ is a triple which satisfies the
following conditions:
(V1) Bundle property. The map $\pi: B \to M$ is continuous and surjective where
    $B$ and $M$ are topological spaces.
(V2) Linear fibers. For each $x \in M$ the fiber $F_x = \pi^{-1}(x)$ is a linear
    space over $\mathbb{K}$.
(V3) Fiber preserving local trivializations. There exists a covering $\{U_i\}$
    of the basis space $M$ with open sets, and for every $U_i$ there exist
    homeomorphisms
    $$\tau_i : \pi^{-1}(U_i) \to U_i \times Y_i,$$
    where $Y_i$ is a B-space over $\mathbb{K}$. These trivializations $\tau_i$
    are fiber preserving, i.e., we always have
    $$\tau_i(F_x) = \{x\} \times Y_i,$$
    and the $\tau_i$ define linear maps from $F_x$ onto $Y_i$.
    The B-space $Y_i$ is called a typical fiber.
(V4) Change of trivializations. Let $x \in U_i \cap U_j$. The following diagram
    $$\{x\} \times Y_i \xleftarrow{\ \tau_i\ } F_x \xrightarrow{\ \tau_j\ } \{x\} \times Y_j, \qquad \{x\} \times Y_i \xrightarrow{\ t_{ij}(x)\ } \{x\} \times Y_j,$$
    commutes and defines the so-called transition functions
    $$t_{ij}(x) : Y_i \to Y_j$$
    between the typical fibers where we require that
    $$t_{ij} : U_i \cap U_j \to L(Y_i, Y_j)$$
    is continuous.

Definition 73.71. Let $k \ge 0$. If one replaces, in the above definition,
topological spaces, continuous maps, and homeomorphisms with $C^k$-Banach
manifolds, $C^k$-maps, and $C^k$-diffeomorphisms, then we will speak of
$C^k$-vector bundles.
A section of a vector bundle $(\pi, B, M)$ is a continuous map $s: M \to B$ with
$$\pi(s(x)) = x \quad \text{for all } x \in M.$$
This is equivalent to saying that $s(x) \in F_x$ for all $x \in M$.
If we are given a $C^k$-vector bundle and if, in addition, $s$ is $C^k$, then we
speak of a $C^k$-section.

Remark 73.72 (Meaning of (V4)). Condition (V3) implies that $\tau_i$ defines a
homeomorphism from $F_x$ onto $\{x\} \times Y_i$. Therefore
$t_{ij}(x): Y_i \to Y_j$ is always a linear homeomorphism.
In the finite-dimensional case, i.e., if all fibers are finite dimensional, we
identify $Y_i$ with $\mathbb{K}^{n_i}$. Then, all transition functions
$t_{ij}(x)$ are invertible matrices. In this case, the definition of vector
bundles can be simplified because (V4) is then an obvious consequence of
(V1)-(V3).

Instead of $\tau_i$ one often calls $(U_i, \tau_i)$ a trivialization or bundle
chart.
The definition of vector bundles can easily be generalized. In Part V, for
example, we consider the concept of fiber bundles, which plays an important
role in modern physics. Roughly, in this case, the typical fibers $Y_i$ are
equal to a topological space or a manifold $Y$, and the transition functions
$t_{ij}(x)$ are elements of a transformation group on $Y$.

EXAMPLE 73.73 (Standard Example of a Vector Bundle). The simplest example
of a vector bundle $(\pi, B, M)$ is the product $B = M \times Y$, where $M$ is a
topological space and $Y$ a B-space. We let $\pi(x, y) = x$.
As a covering $\{U_i\}$, we choose the trivial covering of $M$ by $M$. The
corresponding trivialization is then the identity
$\tau: M \times Y \to M \times Y$.
A section $s: M \to M \times Y$ has the form $s(x) = (x, f(x))$, where
$f: M \to Y$ is continuous.
If $M$ is a $C^k$-manifold, $k \ge 0$, then we obtain in this way a
$C^k$-vector bundle, and $s$ is a $C^k$-section if $f: M \to Y$ is $C^k$.
General vector bundles are obtained from Definition 73.70 by gluing such
products $U_i \times Y_i$ together. For this gluing together, one uses the
transition functions $t_{ij}(x)$.
EXAMPLE 73.74 (Cylinder as a Typical Geometrical Example of a Vector
Bundle). Let $B$ be the surface of a cylinder and let $M$ be the cylinder
equator (Fig. 73.26(a)). Let $\pi: B \to M$ denote the orthogonal projection
onto $M$. Then $(\pi, B, M)$ is a $C^\infty$-vector bundle. The fibers $F_x$
generate the surface of the cylinder.
A $C^k$-section is a $C^k$-curve $s: M \to B$ with $s(x) \in F_x$ for all
$x \in M$ (Fig. 73.26(b)).
EXAMPLE 73.75 (Normal Bundle of a Curve). Let $M$ be the boundary of a disk
(Fig. 73.27(a)). If we draw the normal $N_x$ at every point $x \in M$, and
neglect that different normals may intersect, then we obtain a vector bundle
with fibers $F_x = N_x$ which has the same structure as the surface of the
cylinder of Example 73.74.
More generally, if $M$ is a sufficiently smooth curve in $\mathbb{R}^2$, then we
obtain in a similar way the normal bundle $(\pi, B, M)$ of $M$
(Fig. 73.27(b)). Precisely, we proceed like this. Let $N_x$ be the oriented
normal at $x$. The bundle space $B$ consists of all pairs $(x, P)$ with
$x \in M$ and $P \in N_x$. The trivialization
$$\tau: B \to M \times \mathbb{R}$$
is defined through $\tau(x, P) = (x, p)$, where $p$ is the distance between $P$
and $x$ with the correct sign.
EXAMPLE 73.76 (Tangent Bundle as a Vector Bundle). Let $M$ be a $C^k$-Banach
manifold, $k \ge 1$. Then $(\pi, TM, M)$ is a $C^{k-1}$-vector bundle.

PROOF. The bundle space $B = TM$ consists of all pairs $(x, v)$ with $x \in M$
and $v \in TM_x$. The map $\pi: TM \to M$ is defined through $\pi(x, v) = x$.
The fiber $F_x = \pi^{-1}(x)$ is the tangent space $TM_x$ at $x$.
[Figure 73.26: (a) the cylinder surface $B$ with equator $M$; (b) a section $s$]

[Figure 73.27: (a), (b) normal bundles with fibers $F_x = N_x$]

Let $\{(U_i, \varphi_i)\}$ be a collection of charts in $M$. We let
$\varphi = \varphi_i$ for a fixed $i$. The local trivialization
$$\tau_i : \pi^{-1}(U_i) \to U_i \times X_\varphi$$
is defined through
$$\tau_i(x, v) = (x, v_\varphi).$$
Here $v_\varphi$ is the coordinate of the tangent vector $v \in TM_x$ with
respect to $\varphi$. Thus the typical fiber $X_\varphi$ is the chart space. To
see that this actually yields a vector bundle, we have to look at the change of
the local trivializations. Let $(U_j, \varphi_j)$ be another chart with
$x \in U_i \cap U_j$ and let $\psi = \varphi_j$. From (6), we obtain
$$v_\psi = t_{ij}(x) v_\varphi$$
with
$$t_{ij}(x) = F'(\varphi(x)) \quad \text{and} \quad F = \psi \circ \varphi^{-1}.$$
The map $F$, which describes the chart change, is a $C^k$-map between the chart
spaces $X_\varphi$ and $X_\psi$. Thus $F'$ is a $C^{k-1}$-map from $X_\varphi$
into $L(X_\varphi, X_\psi)$. The smoothness of maps in $M$ is defined with
respect to the corresponding chart spaces. Therefore $x \mapsto t_{ij}(x)$ is a
$C^{k-1}$-map from $U_i \cap U_j$ into $L(X_\varphi, X_\psi)$. □
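To make the transition function $t_{ij}(x) = F'(\varphi(x))$ concrete, the
following sketch takes $M = \mathbb{R}$ with two charts that are assumptions of
this example only: $\varphi(x) = x$ and $\psi(x) = x^3 + x$. Then
$F(u) = u^3 + u$ and $t(x) = 3x^2 + 1$. The code compares the chart velocities
of a sample curve, computed by central differences, with the relation
$v_\psi = t(x) v_\varphi$.

```python
# Illustrative sketch: transition function of the tangent bundle of M = R.
# The charts phi, psi and the sample curve are assumptions for this example.

def phi(x: float) -> float:
    return x

def psi(x: float) -> float:
    return x**3 + x

def transition(x: float) -> float:
    # t(x) = F'(phi(x)) with F = psi o phi^{-1}, i.e. F(u) = u^3 + u.
    return 3.0 * x**2 + 1.0

def curve(t: float) -> float:
    # A C^1-curve in M with curve(0) = 0.7 and velocity 2 at t = 0.
    return 0.7 + 2.0 * t + t**2

def chart_velocity(chart, h: float = 1e-6) -> float:
    # Central-difference approximation of d/dt chart(curve(t)) at t = 0.
    return (chart(curve(h)) - chart(curve(-h))) / (2.0 * h)

x0 = curve(0.0)
v_phi = chart_velocity(phi)  # coordinate of the tangent vector in the phi-chart
v_psi = chart_velocity(psi)  # coordinate of the same vector in the psi-chart
```

Up to the finite-difference error, `v_psi` agrees with
`transition(x0) * v_phi`, which is the one-dimensional instance of the
transformation rule used in the proof.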

EXAMPLE 73.77 (Cotangent Bundle as Vector Bundle). In a similar way as in
the previous example, one can prove that $(\pi, TM^*, M)$ is a
$C^{k-1}$-vector bundle, if $M$ is a $C^k$-Banach manifold, $k \ge 1$.

Often, in mathematics, one has the following situation:


(i) One is given certain objects (e.g., groups, topological spaces,
    $C^k$-manifolds, etc.).
(ii) For these objects there exist maps which are called morphisms (e.g., group
    homomorphisms, continuous maps, $C^k$-maps, etc.).
(iii) Bijective maps $b$ between these objects are called isomorphisms if $b$
    and $b^{-1}$ are morphisms (e.g., group isomorphisms, homeomorphisms,
    $C^k$-diffeomorphisms, etc.).

In this way, one immediately recognizes common properties between different
mathematical structures. Morphisms are always defined in such a way
that the corresponding isomorphisms preserve in some intuitive sense the
structure of these objects. Here isomorphic objects are considered to be
essentially equal. Classes of objects, together with their morphisms, form
categories (category of groups, category of topological spaces, category of
$C^k$-manifolds, etc.). In category theory, one then studies general properties
of these categories, i.e., general properties of mathematical structures. In
order to obtain a category of vector bundles, we need to define the
corresponding morphisms for our objects "vector bundles." This will be done in
the following definition.

Definition 73.78. Let $V_j = (\pi_j, B_j, M_j)$, $j = 1, 2$, be two vector
bundles. A morphism from $V_1$ to $V_2$ is a pair $(f, g)$ of maps which
satisfies the following two properties.
(M1) Fiber preserving property. The diagram
    $$\begin{array}{ccc} B_1 & \xrightarrow{\ f\ } & B_2 \\ \downarrow \pi_1 & & \downarrow \pi_2 \\ M_1 & \xrightarrow{\ g\ } & M_2 \end{array}$$
    commutes where $f$ and $g$ are continuous, i.e.,
    $$f(F_x) \subseteq F_{g(x)} \quad \text{for all } x \in M_1.$$
    This way, we obtain a linear map from $F_x$ into $F_{g(x)}$.
(M2) Mapping of typical fibers. For each $x \in M_1$, there exist
    trivializations for $V_j$:
    $$\tau_j : \pi_j^{-1}(U_j) \to U_j \times Y_j, \quad j = 1, 2,$$
    with $x \in U_1$, $g(x) \in U_2$ and $g(U_1) \subseteq U_2$, such that the
    map
    $$\bar f(x) : Y_1 \to Y_2,$$
    which is naturally defined through the following commuting diagram
    $$\begin{array}{ccc} \{x\} \times Y_1 & \xrightarrow{\ \bar f(x)\ } & \{g(x)\} \times Y_2 \\ \uparrow \tau_1 & & \uparrow \tau_2 \\ F_x & \xrightarrow{\ f\ } & F_{g(x)} \end{array}$$
    has the property that $\bar f: U_1 \to L(Y_1, Y_2)$ is continuous.

If $V_1$ and $V_2$ are $C^k$-vector bundles, $k \ge 0$, then we obtain
$C^k$-morphisms if "continuous" is everywhere replaced by "$C^k$-map."
Two vector bundles $V_1$ and $V_2$ are called isomorphic if and only if there
exists a morphism $(f, g)$ from $V_1$ to $V_2$ such that $f$ and $g$ in
condition (M1) are bijective, and $(f^{-1}, g^{-1})$ is a morphism from $V_2$ to
$V_1$. Analogously, $C^k$-isomorphisms are defined.

Remark 73.79 (Meaning of (M2)). One may think of $\bar f$ as the representative
of $f$ with respect to the typical fibers. Since $\tau_1$ and $\tau_2$ are
homeomorphisms, condition (M1) implies that the map
$\bar f(x): Y_1 \to Y_2$ is always linear and continuous, i.e.,
$\bar f(x) \in L(Y_1, Y_2)$.
In the finite-dimensional case, i.e., if all fibers of $V_1$ and $V_2$ are
finite dimensional, one can simplify this definition because (M2) is then a
consequence of (M1), so that (M2) is not needed.

In mathematical physics, the concept of vector bundles arises naturally if
at each point $x$ of a manifold $M$ a linear space $F_x$ is attached (tangent
spaces for the description of vector fields, tensor spaces for the description
of tensor fields, spinor spaces in relativistic quantum theory, etc.). The
concept of vector bundles allows a global description of these linear objects on
$M$.
Conversely, one can use the algebraic structure of the collection of all vector
bundles on $M$ to describe topological properties of $M$. This is done in
K-theory. It is important that one can form a linear algebra for
vector bundles by reducing the algebraic operations between vector bundles
to the corresponding operations between typical fibers. K-theory, a generalized
cohomology theory, plays a central role in the formulation and proof
of the Atiyah-Singer index theorem, which describes a fundamental connection
between the topology of $M$ and the index of differential and integral
operators on $M$. This will be discussed in Part V.
Vector bundles also play a fundamental role in gauge theories. Model
equations for elementary particles are the Yang-Mills equations. These are
intimately related to the connection in principal fiber bundles. The solutions
of these complicated nonlinear partial differential equations can be given
explicitly. This uses the Penrose transformation, which reduces the problem to a
study of analytic vector bundles on the three-dimensional complex projective
space $\mathbb{CP}^3$. In this connection, it is important to master the theory
of these bundles in algebraic geometry. For this, we recommend Atiyah (1979, L).
73.23. Differentials and Derivations on Finite-Dimensional Manifolds
In this last section of the chapter on Banach manifolds we will give several
definitions and results that are used later on in the differential calculus for
finite-dimensional manifolds. In order to help the reader with the study of the
literature, we try to incorporate various different notations used by different
authors. Our general assumption is:
(H1) $M$ and $N$ are $C^1$-Banach manifolds with chart spaces over $\mathbb{K}$.
Let $f: U(x) \subseteq M \to N$ be $C^1$ on an open neighborhood of $x$. As in
Section 73.6,
$$f'(x) : TM_x \to TN_{f(x)}$$
denotes the tangent map of $f$ at $x$.

Definition 73.80. Similarly as in Section 4.2, we define the differential of
$f$ at the point $x$ in the direction of $v$ through
$$df(x; v) = f'(x)v.$$
This definition is valid for all tangent vectors $v$ at $x$, i.e., for all
$v \in TM_x$.
Another notation for $f'(x)$ is $df(x)$ or $df_x$, and $df(x)$ or $df_x$ is
called the differential of $f$ at $x$.

Thus "tangent map $f'(x)$" and "differential $df_x$" are synonymous concepts.
We have
$$df_x[v] = df(x)[v] = df(x; v) = f'(x)v$$
for all $v \in TM_x$.
For the special case $N = \mathbb{K}$ we have that
$df_x: TM_x \to \mathbb{K}$ is a linear, continuous functional on $TM_x$, i.e.,
$df_x \in TM_x^*$. Consequently, $df_x$ is a cotangent vector to $M$ at $x$.
Moreover,
$$\langle df_x, v \rangle = \langle df(x), v \rangle = f'(x)v \qquad (43)$$
for all $v \in TM_x$. Instead of (43), one finds in the literature also

$$v(f) \overset{\text{def}}{=} f'(x)v. \qquad (44)$$

As in classical analysis, $f'(x)v$ is also called a directional derivative of
$f$ at the point $x$ in the direction of $v$.
We now want to show that, with the definition of $df_x$, the differential
calculus of $\mathbb{R}^n$ carries over to finite-dimensional manifolds.
Formally, the same rules apply. For this, we specialize assumption (H1).
(H2) Let $M$ be an $n$-dimensional $C^1$-manifold with chart spaces
$\mathbb{K}^n$.

Fix a point $x \in M$ and let $(U, \varphi)$ be a chart in $M$ with $x \in U$.
The elements of the chart space $\mathbb{K}^n$ have the form
$u = (u^1, \dots, u^n)$, i.e.,
$$u = u^i e_i,$$
where $e_i$ is the unit vector in the direction of the $u^i$-coordinate. Here,
and in the following, we use Einstein's summation convention, i.e., we sum from
1 to $n$ over equal upper and lower indices.

Definition 73.81. If (H2) holds, then we let $b_i$ denote the tangent vector to
$M$ at $x$ which corresponds to the vector $e_i$.

For a $C^1$-curve $x = x(t)$ in $M$, we previously agreed to denote the tangent
vector at $x(t)$ with $x'(t)$ or $\dot x(t)$. The representative of $x = x(t)$
and $\dot x(t)$ in the $u$-chart is $u = u(t)$ and
$$\dot u(t) = \dot u^i(t) e_i,$$
respectively. If we choose the $u^i$-coordinate line through the point
$u = \varphi(x)$, then we have $t = u^i$ and $\dot u = e_i$. Consequently, $b_i$
corresponds to the $u^i$-coordinate line through $u = \varphi(x)$. In accordance
with our convention for $x'(t)$, we write $b_i = \partial x(u)/\partial u^i$
with $u = \varphi(x)$ or in short notation
$$b_i = \frac{\partial x}{\partial u^i}, \quad i = 1, \dots, n. \qquad (45)$$

If $M$ lies in $\mathbb{K}^{n+1}$, then (45) actually represents a geometric
vector in $\mathbb{K}^{n+1}$ which corresponds to $b_i$. Figure 73.28 shows this
for the case $n = 2$. In general, however, $b_i$ cannot be represented by a
vector in some surrounding space; one has to stick with the corresponding curves
in $M$.
Since $\{e_1, \dots, e_n\}$ is a basis for the chart space, it follows that
$\{b_1, \dots, b_n\}$ is a basis for the tangent space $TM_x$. Consequently,
every tangent vector $v \in TM_x$ can be written as
$$v = v^i b_i \qquad (46)$$
with $v^i \in \mathbb{K}$ for all $i$. The representative of $v$ in the chart
space to $(U, \varphi)$ is
$$\bar v = v^i e_i.$$

[Figure 73.28]

In order to obtain a basis $\{b^1, \dots, b^n\}$ for the dual space $TM_x^*$, we
define linear, continuous functionals $b^j$ on $TM_x$ through
$$b^j[v^i b_i] = v^j, \quad j = 1, \dots, n.$$
Thus $b^j \in TM_x^*$ and
$$b^j[b_i] = \delta^j_i, \quad i, j = 1, \dots, n.$$
We therefore obtain for each cotangent vector $w \in TM_x^*$
$$w[v] = w[v^i b_i] = w_i b^i[v]$$
with $w_i = w[b_i]$. This implies that
$$w = w_i b^i. \qquad (47)$$
Hence $\{b^1, \dots, b^n\}$ is in fact a basis for $TM_x^*$. The representative
$\bar w$ of $w \in TM_x^*$ with respect to the chart $(U, \varphi)$ is obtained
from $\bar w[\bar v] = w[v]$, i.e.,
$$\bar w = w_i e^i.$$
Here, $e^j$ is defined through $e^j[v^i e_i] = v^j$, i.e., $e^j$ is a linear,
continuous functional on the chart space $\mathbb{K}^n$.

Definition 73.82. If (H2) holds, then $\{b_1, \dots, b_n\}$ is called a natural
basis for the tangent space $TM_x$ with respect to the chart $(U, \varphi)$.
Moreover, $\{b^1, \dots, b^n\}$ is called a natural basis for the cotangent
space $TM_x^*$ with respect to $(U, \varphi)$.
The numbers $v^i$ and $w_i$ in (46) and (47) are called natural coordinates of
$v \in TM_x$ and $w \in TM_x^*$.
Recall that $b_i = \partial x/\partial u^i$. Below we will show that
$b^i = du^i$. In the literature, sometimes also $b_i = \partial/\partial u^i$ is
used.

Proposition 73.83 ($C^k$-Fields). Assume (H2), and let $M$ be a $C^k$-manifold,
$k \ge 1$. Then a vector field
$$x \mapsto (x, v(x))$$
[or covector field $x \mapsto (x, w(x))$] on $M$ is $C^k$ if and only if for
each $x \in M$ there exists a chart $(U, \varphi)$ with $x \in U$ such that the
natural coordinate functions
$$u \mapsto v^i(u), \quad i = 1, \dots, n$$
[or $u \mapsto w_i(u)$] are $C^k$-functions on the chart image $\varphi(U)$,
i.e., all $v^i$ and $w_i$ are $C^k$-functions on $M$.

PROOF. This follows by using the representations of the representatives
$\bar v$ and $\bar w$ in the chart space. □

We now study how the $b_i$ and $b^i$ are transformed under chart changes, i.e.,
we pass from the $u^i$-coordinates to the new $u^{i'}$-coordinates and denote
the new basis vectors by $b_{i'}$ and $b^{i'}$.
We let
$$A^{i'}_i = \frac{\partial u^{i'}}{\partial u^i}(u) \quad \text{and} \quad A^i_{i'} = \frac{\partial u^i}{\partial u^{i'}}(u'),$$
where $u$ and $u'$ are the corresponding chart images of $x \in M$.

Proposition 73.84. From assumption (H2) we obtain under chart changes that
$$b_{i'} = A^i_{i'} b_i, \qquad (48)$$
$$b^{i'} = A^{i'}_i b^i. \qquad (49)$$
The natural coordinates $v^i$ and $w_i$ are transformed in the same way as $b^i$
and $b_i$.

In Part V, these key relations will enable us to pass from invariant tensor
calculus on manifolds to coordinate representations.
PROOF. Ad (48). Consider the $u^{i'}$-coordinate line through $u'$. Under chart
changes, this curve becomes
$$u^i = u^i(u^{1'}, \dots, u^{n'}), \quad i = 1, \dots, n,$$
where $u^{i'}$ varies, and all other $u^{j'}$ are fixed. The tangent vector to
this curve at $u'$ is
$$\frac{\partial u^i}{\partial u^{i'}}(u') e_i.$$
But this is the representative of $b_{i'}$ in the $u$-chart. Hence, formula (48)
follows. Reversing the roles of $u^i$ and $u^{i'}$, formula (48) implies that
$$b_i = A^{i'}_i b_{i'}.$$
From $v = v^i b_i = v^{i'} b_{i'}$ we then obtain that
$$v^{i'} = A^{i'}_i v^i.$$
Ad (49). This relation follows immediately from $b^{i'}(v) = v^{i'}$ and
$b^i(v) = v^i$. Finally, it follows from $w = w_i b^i = w_{i'} b^{i'}$ and (49)
that
$$w_{i'} = A^i_{i'} w_i. \qquad □$$
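The transformation rules of Proposition 73.84 can be checked numerically. The
sketch below is an illustration under assumptions not in the text: it uses
$M = \mathbb{R}^2$ minus the origin, with Cartesian coordinates $(x, y)$ as the
old chart and polar coordinates $(r, \theta)$ as the new chart. It transforms
components by $v^{i'} = A^{i'}_i v^i$ and $w_{i'} = A^i_{i'} w_i$ and verifies
that the pairing $w_i v^i$ does not depend on the chart.

```python
# Illustrative sketch: component transformation under a chart change.
# Charts (Cartesian -> polar) and sample components are assumptions.
import math

x, y = 3.0, 4.0
r, theta = math.hypot(x, y), math.atan2(y, x)

# A_fwd[ip][i] = d u^{i'} / d u^i  (rows: r, theta; columns: x, y)
A_fwd = [[x / r, y / r],
         [-y / r**2, x / r**2]]
# A_inv[i][ip] = d u^i / d u^{i'}  (rows: x, y; columns: r, theta)
A_inv = [[math.cos(theta), -r * math.sin(theta)],
         [math.sin(theta), r * math.cos(theta)]]

v = [1.0, -2.0]  # natural coordinates v^i of a tangent vector
w = [0.5, 3.0]   # natural coordinates w_i of a cotangent vector

# v^{i'} = A^{i'}_i v^i  and  w_{i'} = A^i_{i'} w_i
v_new = [sum(A_fwd[ip][i] * v[i] for i in range(2)) for ip in range(2)]
w_new = [sum(A_inv[i][ip] * w[i] for i in range(2)) for ip in range(2)]

pairing_old = sum(w[i] * v[i] for i in range(2))
pairing_new = sum(w_new[ip] * v_new[ip] for ip in range(2))
```

The two pairings agree up to rounding because the matrices $A^{i'}_i$ and
$A^i_{i'}$ are inverse to each other.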

Proposition 73.85. If (H2) holds, then
$$b^i = du^i_x, \quad i = 1, \dots, n.$$

PROOF. Consider the map $f: U \subseteq M \to \mathbb{K}$, defined through
$f(x) = u^i$, i.e., $f$ assigns to each point $x \in M$ the coordinate $u^i$ in
the chart space. It follows that $df_x = du^i_x$. According to Section 73.6, one
computes $f'(x)v$ by passing to representatives, i.e.,
$$f'(x)v = \bar f'(u)\bar v \quad \text{with } u = \varphi(x).$$
Thus we obtain
$$df_x[b_j] = f'(x)b_j = \bar f'(u)e_j. \qquad (50)$$
But this is the directional derivative of $\bar f$ at $u$ in the direction of
$e_j$, i.e.,
$$df_x[b_j] = \frac{\partial u^i}{\partial u^j} = \delta^i_j$$
for all $j$. This means that $df_x = b^i$. □

Instead of $du^i_x$, one often uses the shorter notation $du^i$, i.e.,
$$b^i = du^i, \quad i = 1, \dots, n. \qquad (51)$$
We will see below that this formula is the key for the differential calculus on
$n$-dimensional manifolds. It is worth emphasizing that $du^i$ is a well-defined
object, not just a formal symbol. It provides a rigorous justification for the
old formal differential calculus which goes back to Leibniz (1646-1716), and
supports our general experience that every successful formal calculus admits a
rigorous justification. During this century, for instance, it was possible to
develop a mathematical framework for the Heaviside calculus and Dirac's
delta function, which were earlier introduced in a formal way by physicists.
This was done by using the Laplace transformation and the theory of
distributions. The differential $du^i$ is a linear, continuous functional on the
tangent space $TM_x$. For an arbitrary tangent vector $v = v^j b_j$ we have
$$du^i[v] = v^i, \quad i = 1, \dots, n. \qquad (52)$$
The transformation rules (48) and (49) now assume the following form
$$\frac{\partial x}{\partial u^{i'}} = \frac{\partial u^i}{\partial u^{i'}} \frac{\partial x}{\partial u^i}, \qquad (53)$$
$$du^{i'} = \frac{\partial u^{i'}}{\partial u^i} du^i, \qquad (54)$$
which correspond to the classical formulas.

Corollary 73.86. If (H2) holds, then one obtains precisely all cotangent vectors
$w \in TM_x^*$ through
$$w = w_i du^i \qquad (55)$$
with arbitrary $w_i \in \mathbb{K}$.

PROOF. This follows from $w = w_i b^i$ and (51). □


Because of formula (55), cotangent vectors are also called forms or
differential forms. We will see in Part V that formula (55) allows a very elegant
formulation of the canonical formalism of classical mechanics for the general
situation that the state space is a manifold.
In order to derive other useful consequences from the key formula (55) we
make the following assumption.
(H3) Let $f, g: U(x) \subseteq M \to \mathbb{K}$ be two $C^1$-functions defined
on an open neighborhood of $x$, and assume (H2) for $M$.

Definition 73.87. Assume (H3). Let
$$\frac{\partial f}{\partial u^i}(x) \overset{\text{def}}{=} df_x[b_i], \quad i = 1, \dots, n.$$
By (44), this is the directional derivative of $f$ at $x$ in the direction of
$b_i$.

Corollary 73.88. If $\bar f$ denotes the chart representative of $f$, i.e.,
$\bar f(u) = f(x)$ with $u = \varphi(x)$, then
$$\frac{\partial f}{\partial u^i}(x) = \frac{\partial \bar f}{\partial u^i}(u), \quad i = 1, \dots, n.$$

PROOF. This follows from (50). □

Proposition 73.89 (Product Formula). Let
$f, g: U(x) \subseteq M \to \mathbb{K}$ be two $C^1$-functions defined on an
open neighborhood of $x$. Furthermore, let $M$ be a $C^1$-Banach manifold with
chart spaces over $\mathbb{K}$. Then
$$d(fg)_x = (df_x)g(x) + f(x)\,dg_x. \qquad (56)$$
In short notation, one writes
$$d(fg) = (df)g + f\,dg.$$

PROOF. We have
$$(fg)'(x)v = (f'(x)v)g(x) + f(x)g'(x)v.$$
This follows from an application of the product rule of Section 4.3 to the chart
representatives. □

In conclusion, we consider the concept of derivation.

Definition 73.90. Let $M$ be a $C^\infty$-Banach manifold with chart spaces over
$\mathbb{K}$. We fix a point $x \in M$. Let $C_x^\infty$ denote the collection
of all $C^\infty$-functions $f: U(x) \subseteq M \to \mathbb{K}$, defined on an
open neighborhood of $x$, where $U(x)$ may depend on $f$. A derivation at $x$ is
a map $D: C_x^\infty \to \mathbb{K}$ with
$$D(\alpha f + \beta g) = \alpha D(f) + \beta D(g),$$
$$D(fg) = D(f)g(x) + f(x)D(g)$$
for all $\alpha, \beta \in \mathbb{K}$ and $f, g \in C_x^\infty$.
If we fix a tangent vector $v \in TM_x$, then the directional derivative at $x$
in the direction of $v$,
$$D(f) = df_x[v],$$
is obviously a derivation at $x$. This follows from the product rule (56). The
following proposition shows that all derivations on finite-dimensional
$C^\infty$-manifolds are obtained in this way. Recall the notation
$$df_x[v] = v(f).$$
Thus, we can also write $D(f) = v(f)$.

Proposition 73.91. Let $M$ be an $n$-dimensional real or complex
$C^\infty$-manifold. Then each tangent vector $v \in TM_x$, i.e.,
$v = v^i b_i$, generates a derivation at $x$ through
$$v(f) = v^i \frac{\partial f}{\partial u^i}(x). \qquad (57)$$
Every derivation at $x$ is generated like this. In particular, we have
$$b_i(f) = \frac{\partial f}{\partial u^i}(x). \qquad (58)$$

PROOF. Let $v \in TM_x$. Formulas (57) and (58) follow from
$$v(f) = df_x[v^i b_i] = v^i df_x[b_i] = v^i \frac{\partial f}{\partial u^i}(x).$$
Conversely, let $D$ be a derivation at $x$. For $f = g = 1$ we obtain
$D(1) = 2D(1)$ and thus $D(1) = 0$. In the same way we obtain $D(f) = 0$ for
every constant function $f$. Let $u_0 = \varphi(x)$ and let $\bar f$ be the
chart representative of $f$. Letting $\bar D(\bar f) = D(f)$ we obtain a
derivation for the chart space $\mathbb{K}^n$. Taylor's theorem of Section 4.6,
with integral remainder, gives the following representation for the
$C^\infty$-function $\bar f: U(u_0) \subseteq \mathbb{K}^n \to \mathbb{K}$,
$$\bar f(u) = \bar f(u_0) + (u^i - u_0^i) g_i(u),$$
in a neighborhood of $u_0$, where the $g_i$ are $C^\infty$ with
$g_i(u_0) = \partial \bar f(u_0)/\partial u^i$ for all $i$.
Because of $D(\text{const}) = 0$ we obtain
$$\bar D(\bar f) = \bar D(u^i - u_0^i) g_i(u_0).$$
Letting $w^i = \bar D(u^i - u_0^i)$, it follows that
$$D(f) = w^i \frac{\partial f}{\partial u^i}(x).$$
Recall Corollary 73.88 in this connection. □
Proposition 73.91 yields a one-to-one correspondence between tangent
vectors and derivations on finite-dimensional $C^\infty$-manifolds. In the
literature, the notion of derivations is often used to define tangent vectors.
Starting from (57), one writes in suggestive form
$$v = v^i \frac{\partial}{\partial u^i}.$$
A comparison with $v = v^i b_i$ yields a correspondence between $b_i$ and
$\partial/\partial u^i$ which formally coincides with (58).
For infinite-dimensional Banach manifolds, Proposition 73.91 is not true.
In the general case, the geometric definition of Section 73.5 is most appropriate
for the development of a corresponding theory.
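The derivation property of the directional derivative can be checked in a chart.
The sketch below works directly in the chart space $\mathbb{R}^2$ and
approximates $D(f) = v^i\, \partial f/\partial u^i(u_0)$ by a central
difference. The point $u_0$, the vector $v$, and the test functions are
assumptions chosen only for this illustration.

```python
# Illustrative sketch: a directional derivative acts as a derivation.
# The point u0, the direction v, and the functions f, g are assumptions.
import math

u0 = (1.0, 2.0)
v = (0.5, -1.5)

def directional_derivative(func, h: float = 1e-5) -> float:
    # Central-difference approximation of D(func) = v^i d(func)/du^i at u0.
    plus = func(u0[0] + h * v[0], u0[1] + h * v[1])
    minus = func(u0[0] - h * v[0], u0[1] - h * v[1])
    return (plus - minus) / (2.0 * h)

def f(a: float, b: float) -> float:
    return a * a * b

def g(a: float, b: float) -> float:
    return math.sin(a) + b

def fg(a: float, b: float) -> float:
    return f(a, b) * g(a, b)

# Leibniz rule: D(fg) = D(f) g(u0) + f(u0) D(g)
lhs = directional_derivative(fg)
rhs = directional_derivative(f) * g(*u0) + f(*u0) * directional_derivative(g)
```

Both sides agree up to the discretization error of the difference quotient, as
the product rule (56) requires.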

PROBLEMS

73.1.* Proof of Proposition 73.9.


Hint: See Milnor (1965, M).

73.2. Abstract atlases. In Section 73.2 we gave a definition of a manifold $M$,
using the fact that $M$ is a topological space. Another possible way to
construct Banach manifolds is the following. Let $M$ be an arbitrary set. An
abstract $C^k$-atlas for $M$ is a set of pairs $(U_\alpha, \varphi_\alpha)$,
which will be called abstract charts and satisfy the following properties:
(i) The sets $U_\alpha$ cover $M$.
(ii) Every $\varphi_\alpha$ maps $U_\alpha$ bijectively onto an open set
    $\varphi_\alpha(U_\alpha)$ in a B-space $X_\alpha$ over $\mathbb{K}$.
(iii) If $U_\alpha \cap U_\beta$ is not empty, then
    $$\varphi_\beta \circ \varphi_\alpha^{-1} : \varphi_\alpha(U_\alpha \cap U_\beta) \to \varphi_\beta(U_\alpha \cap U_\beta)$$
    is a $C^k$-map defined on the open set
    $\varphi_\alpha(U_\alpha \cap U_\beta)$.
Prove: If an abstract $C^k$-atlas belongs to $M$, $k \ge 0$, then a topology may
be introduced on $M$, such that $M$ becomes a $C^k$-Banach manifold in the sense
of Section 73.2, whenever this topology is separated.
Note that in Section 73.2, as throughout the book, topological spaces are
always assumed to be separated (see A 1 (7)). But even if the topology on $M$ is
not separated one still can develop the theory. We then speak of nonseparated
Banach manifolds.
Solution: A set $U$ in $M$ is called open if and only if for each $x \in U$
there exists a chart $(U_\alpha, \varphi_\alpha)$ with
$x \in U_\alpha \cap U$ such that $\varphi_\alpha(U_\alpha \cap U)$ is open.
73.3. Whitney topology. Prove that Definition 73.55 actually gives a topology on
$C^\infty(M, N)$.
Hint: See Golubitsky and Guillemin (1973, M), Chapter 2, §3. There, one also
finds a proof of Example 73.56.
73.4. k-determination. Using (C1) and (C2) of Section 73.16, prove the following
statements in $\mathbb{R}^2$, and by using (C3) construct the corresponding
universal unfoldings.
(i) $a\xi + b\eta$ is strongly 1-determined for $a^2 + b^2 \ne 0$ and not
    1-determined for $a^2 + b^2 = 0$.
(ii) $a\xi^2 + 2b\xi\eta + c\eta^2$ is strongly 2-determined for
    $ac - b^2 \ne 0$ and not 2-determined for $ac - b^2 = 0$.
(iii) $a\xi^2\eta + b\eta^3$ is strongly 3-determined for $ab \ne 0$ and not
    3-determined for $ab = 0$.
(iv) $\xi^2\eta$ is not $k$-determined.
(v) $3\xi^2 + 2\xi^3 - \xi\eta^2$ is not $k$-determined for $k = 1, 2, 3$, but
    strongly 4-determined.
(vi) $\xi^3 + \xi\eta^3$ is not 3-determined and not strongly 4-determined.
(vii) $\xi^5 + \eta^5$ is not 5-determined, but strongly 6-determined.
Hint: See Poston and Stewart (1978, M).
73.5. Normal forms for functions of two real variables. In Table 73.2 we give
normal forms for $k$-forms which are obtained from linear, bijective
transformations of the independent variables.

Table 73.2. Normal forms
k   k-form                                          Strongly k-determined                   Not k-determined
1   $a\xi + b\eta$                                  $\eta$                                  $0$
2   $a\xi^2 + b\xi\eta + c\eta^2$                   $\pm(\xi^2 + \eta^2)$, $\xi^2 - \eta^2$   $\pm\eta^2$, $0$
3   $a\xi^3 + b\xi^2\eta + c\xi\eta^2 + d\eta^3$    $\xi^2\eta \pm \eta^3$                  $\xi^2\eta$, $\xi^3$, $0$

Hint: See Poston and Stewart (1978, M), Chapter 2. If $f$ is a cubic form with
$f \not\equiv 0$, then one can determine the type of the normal form by solving
the equation $f(\xi, \eta) = 0$ and studying the qualitative behavior of the
solution set (three lines, one simple line, one line and a double line, one
triple line). For quadratic forms the sign of $f$ plays a role.
73.6. General chain rule. Let $f: M \to N$ and $g: N \to P$ be $C^r$,
$r \ge 1$, where $M$, $N$, and $P$ are $C^r$-Banach manifolds. Prove:
$$T^r(g \circ f) = T^r g \circ T^r f. \qquad (59)$$
Solution: It is sufficient to use local coordinates. The chain rule implies that
$$Tf(x, v) = (f(x), f'(x)v),$$
$$T(g \circ f)(x, v) = (g(f(x)), g'(f(x))f'(x)v),$$
$$Tg(f(x), f'(x)v) = (g(f(x)), g'(f(x))f'(x)v).$$
This gives (59) for $r = 1$. Moreover, we have
$$T^2(g \circ f) = T(Tg \circ Tf) = T^2 g \circ T^2 f.$$
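For $r = 1$ the identity $T(g \circ f) = Tg \circ Tf$ can be tested in local
coordinates. The sketch below uses one-dimensional sample maps
$f(x) = x^2 + 1$ and $g(y) = \sin y$, which are assumptions of this example,
and compares the composition of the two tangent maps with a tangent map of
$g \circ f$ whose derivative is computed independently by a central difference.

```python
# Illustrative sketch: chain rule for tangent maps in local coordinates.
# The maps f, g and the sample point (x, v) are assumptions for this example.
import math

def f(x: float) -> float:
    return x * x + 1.0

def df(x: float) -> float:
    return 2.0 * x

def g(y: float) -> float:
    return math.sin(y)

def dg(y: float) -> float:
    return math.cos(y)

def T_f(x: float, v: float):
    # Tf(x, v) = (f(x), f'(x) v)
    return (f(x), df(x) * v)

def T_g(y: float, w: float):
    # Tg(y, w) = (g(y), g'(y) w)
    return (g(y), dg(y) * w)

def T_gf_numeric(x: float, v: float, h: float = 1e-6):
    # T(g o f)(x, v), with (g o f)'(x) approximated by a central difference.
    deriv = (g(f(x + h)) - g(f(x - h))) / (2.0 * h)
    return (g(f(x)), deriv * v)

x, v = 0.8, 1.5
composed = T_g(*T_f(x, v))   # (Tg o Tf)(x, v)
direct = T_gf_numeric(x, v)  # T(g o f)(x, v)
```

The base points agree exactly and the fiber components agree up to the
finite-difference error.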
73.7. Whitney's embedding theorem. Study the proofs in Guillemin and Pollack
(1974, M), Hirsch (1976, M), and Golubitsky and Guillemin (1973, M) in this
order.
73.8. Generalized Morse lemma. Our goal is to generalize Theorem 73.F of Section
73.12 which is important for applications in partial differential equations. The
key to this and further applications is the result of Problem 73.8a.
73.8a. General criterion for right equivalence of Golubitsky and Marsden (1983). We
study the following important question: When is f + g $C^m$-right equivalent to
f at 0? That is, letting h = f + g, we try to find a map $\psi$ with
$h(\psi(x)) = f(x)$
in a neighborhood of 0. Here, $\psi$ is a local $C^m$-diffeomorphism at 0 with $\psi(0) = 0$.
The following equations are important:
$g(x) = f'(x)A(x)$ on $U(0)$, (60)
$g'(x) = f'(x)B(x)$ on $U(0)$, (61)
$A(0) = 0$, $B(0) = 0$. (62)
Our assumptions are:
(i) X is a real B-space. The maps $f, g: U(0) \subseteq X \to \mathbb{R}$ are $C^m$, $m \geq 1$, and
$f(0) = g(0) = 0$.
(ii) There exist $C^m$-maps $A: U(0) \to X$ and $B: U(0) \to L(X, X)$ which satisfy
(60)-(62).
Prove similarly as in Section 73.12: The map f + g is $C^m$-right equivalent at
0 to f at 0.
Solution:
(I) Let
$H(x, t) = f(x) + t g(x)$
and
$C(x, t) = -(I + tB(x))^{-1}A(x)$.
Moreover, let V be a sufficiently small neighborhood of 0 in X. For each
fixed $x \in V$ we solve the ordinary differential equation
$\varphi_t(x, t) = C(\varphi(x, t), t)$, $\varphi(x, 0) = x$. (63)
It is important that this solution is defined for all $t \in [0, 1]$. To see this note
that C is defined on $V \times [0, 1]$ for sufficiently small V, since B(0) = 0.
Furthermore, (63) has for x = 0 the solution $\varphi(0, t) = 0$, since C(0, t) = 0.
From the dependence of solutions on the initial data (Theorem 4.0) it
follows that V can be chosen so small that (63) has a solution on [0, 1] for
all $x \in V$.
(II) We let $\psi(x) = \varphi(x, 1)$. Then $\psi$ is a local $C^m$-diffeomorphism at 0 with
$\psi(0) = 0$. This follows from uniqueness of solutions of (63) and from
Theorem 4.0. Note that the map C is $C^m$.
(III) Let P = (x, t). After a short computation it follows from (63) and (60), (61)
that
$\frac{\partial}{\partial t} H(\varphi(P), t) = f'(\varphi(P))\varphi_t(P) + g(\varphi(P)) + t g'(\varphi(P))\varphi_t(P) = 0$.
Therefore $H(\varphi(x, t), t) = \text{const}$, and hence
$H(\varphi(x, 1), 1) = H(\varphi(x, 0), 0)$.
This yields
$f(\psi(x)) + g(\psi(x)) = f(x)$.
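The short computation in step (III) can be written out as follows. This is a sketch that uses only (60), (61), and the differential equation (63); the abbreviation $\varphi = \varphi(x, t)$ is introduced here for brevity.

```latex
\begin{align*}
\frac{\partial}{\partial t} H(\varphi, t)
  &= \bigl[f'(\varphi) + t\,g'(\varphi)\bigr]\varphi_t + g(\varphi)
     && \text{(chain rule, } H(x,t) = f(x) + t g(x)\text{)} \\
  &= f'(\varphi)\bigl(I + tB(\varphi)\bigr)\varphi_t + f'(\varphi)A(\varphi)
     && \text{(by (61) and (60))} \\
  &= -f'(\varphi)A(\varphi) + f'(\varphi)A(\varphi) = 0
     && \text{(by (63): } (I + tB(\varphi))\varphi_t = -A(\varphi)\text{)}.
\end{align*}
```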

73.8b. Morse-Tromba lemma. Let X be a real B-space with a scalar product $(\cdot|\cdot)$, i.e.,
the map $(x, y) \mapsto (x|y)$ from $X \times X$ into $\mathbb{R}$ is bilinear, symmetric, strictly positive,
and continuous. Also we assume:
(i) The map $h: U(0) \subseteq X \to \mathbb{R}$ is $C^k$, $k \geq 3$, and $h(0) = 0$, $h'(0) = 0$.
(ii) There exists a linear, continuous, and bijective map $T: X \to X$ with
$h''(0)xy = (Tx|y)$ for all $x, y \in X$.
(iii) There exists a $C^{k-1}$-map $H: U(0) \subseteq X \to X$ with
$h'(x)y = (H(x)|y)$
for all $x \in U(0)$ and $y \in X$. We let
$f(x) = 2^{-1}h''(0)x^2 = 2^{-1}(Tx|x)$.
Prove with Problem 73.8a that h is $C^{k-2}$-right equivalent to f.


Discussion. The lemma states that h can be transformed into a normal form
f. The assumptions here are weaker than in Theorem 73.F and can often be
verified in applications to variational problems. This also is in contrast to
Theorem 73.F. Compare Tromba (1976), (1983) and Marsden, Buchner, and
Schecter (1983) for a more thorough discussion.
If we have a continuous embedding $X \subseteq Y$, and if Y is an H-space, then one
can choose the scalar product on Y as the scalar product on X. In variational
problems this situation occurs, for example, if $X = W_2^m(G)$, $m \geq 1$, and
$Y = L_2(G)$. Assumptions (ii) and (iii) mean that the first- and second-order differential
of h can be expressed in terms of the scalar product.
Solution:
(I) We prove (61). Define g through
$h = f + g$.
Then g is $C^k$ with $g(0) = 0$, $g'(0) = 0$, and $g''(0) = 0$. Moreover, we have
$g'(x)y = (G(x)|y)$
with G = H - T. Note that
$f'(x)z = h''(0)xz = (Tx|z)$, (64)
where G is $C^{k-1}$ with $G(0) = 0$ and $G'(0) = 0$. Consequently,
$g'(x)y = \left(\int_0^1 G'(tx)x\,dt \,\Big|\, y\right)$.
From $g''(x)zy = (G'(x)z|y)$ and the symmetry of $g''(x)$, it follows that
$(G'(x)z|y) = (G'(x)y|z)$, and thus
$g'(x)y = \left(\int_0^1 G'(tx)y\,dt \,\Big|\, x\right)$.
Because of $(Tx|z) = (x|Tz)$, (ii) implies that
$g'(x)y = \left(Tx \,\Big|\, T^{-1}\int_0^1 G'(tx)y\,dt\right)$.
Thus
$g'(x)y = f'(x)B(x)y$ (65)
with
$B(x)y = T^{-1}\int_0^1 G'(tx)y\,dt$.
This is formula (61).
(II) Relation (60) follows immediately from (65) and (64), since
$g(x) = \int_0^1 g'(tx)x\,dt = f'(x)A(x)$
with
$A(x) = \int_0^1 tB(tx)x\,dt$.
Now apply Problem 73.8a with m = k - 2.


73.8c. Splitting lemma and infinite-dimensional catastrophe theory. In Problem 73.8b
we assumed that $T: X \to X$ is bijective, i.e., $h''(0)$ is nondegenerate. In the
degenerate case, i.e., if $T: X \to X$ is a Fredholm operator of index zero, one may
find a normal form in Golubitsky and Marsden (1983). Using this result,
catastrophe theory in $\mathbb{R}^n$ can be extended to B-spaces. In particular, there exists
a map g for h such that the critical points of h in a neighborhood of 0 correspond
to those of g in the finite-dimensional null space N(T).
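73.9.* The multilinearization principle and the fundamental blowing-up lemma. In order
to study the structure of the solution set of the equation
$f(x) = 0$ (66)
in a neighborhood of x = 0 in $\mathbb{R}^N$, we consider the simpler multilinearized
equation
$g(x) = 0$, (66*)
where $g = (g_1, \dots, g_M)$, and $g_i$ corresponds to certain Taylor polynomials of $f_i$,
i.e., more precisely,
$g_i(x) = \sum_{\langle\alpha|\beta\rangle = \gamma_i} \frac{1}{\alpha!}\, D^\alpha f_i(0)\, x^\alpha$
for i = 1, ..., M. Here, $\alpha = (\alpha_1, \dots, \alpha_N)$ denotes a tuple of non-negative integers,
and
$D^\alpha = D_1^{\alpha_1} \cdots D_N^{\alpha_N}$,
with $x = (\xi_1, \dots, \xi_N)$ and $D_i = \partial/\partial\xi_i$.
Moreover, we set
$\langle\alpha|\beta\rangle = \alpha_1\beta_1 + \dots + \alpha_N\beta_N$, $x^\alpha = \xi_1^{\alpha_1} \cdots \xi_N^{\alpha_N}$, $\alpha! = \alpha_1! \cdots \alpha_N!$.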

We assume:
(H1) The map $f: \mathbb{R}^N \to \mathbb{R}^M$ is $C^\infty$, where N and M are fixed positive integers
with $N \geq M + 1$.
We are given fixed tuples $\beta = (\beta_1, \dots, \beta_N)$ and $\gamma = (\gamma_1, \dots, \gamma_N)$ of positive
integers.
(H2) Vanishing derivatives of f. We have f(0) = 0 and
$D^\alpha f_i(0) = 0$, i = 1, ..., M
for all $\alpha$ with $\langle\alpha|\beta\rangle \leq \gamma_i - 1$.
(H3) Nondegeneracy. If x is a nonzero solution of the multilinearized equation
(66*), then x is a regular point of g, i.e., rank $g'(x) = M$.
Prove: If (H1)-(H3) hold, then, in a sufficiently small neighborhood of the
point x = 0, the solution set of the original equation (66) and the solution set
of the multilinearized equation (66*) have the same structure. More precisely,
we have the following result:
(a) There exists a homeomorphism
$h: f^{-1}(0) \cap U \to g^{-1}(0) \cap V$
with h(0) = 0, where U and V are fixed, sufficiently small open neighborhoods
of the point x = 0 in $\mathbb{R}^N$.
(b) The restricted map
$h: (f^{-1}(0) \cap U) - \{0\} \to (g^{-1}(0) \cap V) - \{0\}$
is $C^\infty$, and the local solution set $(f^{-1}(0) \cap U) - \{0\}$ of the original equation
(66) is a $C^\infty$-submanifold of $\mathbb{R}^N$.
Hint: This result generalizes the blowing-up lemma of Problem 8.22. Use a
similar argument as in the proof of Problem 8.22. Cf. Buchner, Marsden, and
Schecter (1983a) and Jones and Toland (1986). The latter paper contains an
interesting application to the bifurcation of capillary-gravity water waves.
In fact, there are two basic results in modern bifurcation theory, namely, the
implicit function theorem and the blowing-up lemma, which can be viewed as
a generalization of the implicit function theorem and the Morse lemma as well.
73.10. A simple example. We set N = 2, M = 1, $x = (\xi, \eta)$, and
$f(x) = \xi^3 - \xi^2\eta + \text{terms of order} \geq 4$.
In this case, it is quite natural to set
$g(x) = \xi^3 - \xi^2\eta$.
Letting $\beta = (1, 1)$ and $\gamma = 3$, we may apply the blowing-up lemma of Problem
73.9 to the equation f(x) = 0. Note that
$\langle\alpha|\beta\rangle = \gamma$ (resp. $\leq \gamma - 1$)
means that $\alpha_1 + \alpha_2 = 3$ (resp. $\leq 2$).
Moreover, note that
$g'(x) = (3\xi^2 - 2\xi\eta, -\xi^2)$,
and hence rank $g'(x) = 1$ if $g(x) = 0$ and $x \neq 0$.
The blowing-up lemma of Problem 73.9 tells us that, in a sufficiently small
neighborhood of x = 0, the solution set of the equation
$f(x) = 0$
looks like the solution set of the multilinearized equation
$\xi^3 - \xi^2\eta = 0$.

References to the Literature

Introduction: Guillemin and Pollack (1974, M) and Marsden, Abraham, and Ratiu
(1983, M).
Standard reference: Lang (1972, M).
Transversality and dynamical systems: Abraham and Robbin (1967, M).
Finite-dimensional manifolds: Guillemin and Pollack (1974, M) (introduction),
Warner (1971, M), Golubitsky and Guillemin (1973, M), Hirsch (1976, M), and
Westenholz (1981, M).
Applications to mathematical physics: Westenholz (1981, M) (introduction), Marsden
(1974, L), (1980, L); Abraham and Marsden (1978, M), and Choquet-Bruhat (1982, M).
Catastrophe theory: Golubitsky (1978, S), Poston and Stewart (1978, M), Gilmore
(1981, M), and Arnold (1985) (see also the References to the Literature to Section 37.28).
Catastrophe theory and bifurcation theory: Golubitsky and Schaeffer (1979), (1984,
M) and Chow and Hale (1982, M).
Generalized Morse lemma: Golubitsky and Marsden (1983) (recommended as an
introduction); Tromba (1976), (1983), and Buchner, Marsden, and Schecter (1983).
Blowing-up lemma: Buchner, Marsden, and Schecter (1983a), Jones and Toland
(1986).
Infinite-dimensional catastrophe theory: Golubitsky and Marsden (1983).
Vector bundles: Lang (1972, M), Osborn (1982, M), Vols. 1-3, and Marsden,
Abraham, and Ratiu (1983, M).
History of manifolds: Scholz (1980).
Differential forms: Cf. the References to the Literature for Chapter 82.
CHAPTER 74

Classical Surface Theory, the Theorema Egregium of Gauss, and
Differential Geometry on Manifolds

The curvature K of a surface depends only on the coefficients $g_{ij}$ of the first
fundamental form and their first- and second-order derivatives. Therefore K is
an intrinsic property of the surface.
Theorema egregium of Gauss (1827)
His spirit lifted the deepest secrets of numbers, space, and nature; he measured
the orbits of the planets, the form and the forces of the earth; in his mind he
carried the mathematical science of a coming century.
Under the picture of Carl Friedrich Gauss (1777-1855)
in the German Museum of Munich
Classical differential geometry books are filled with monstrosities of long equa-
tions with many upper and lower indices. The modern revolt against the classical
point of view has been so complete in certain quarters that some mathematicians
will give a three-page proof that avoids coordinates in preference to a three-line
proof that uses them.
Michael Spivak (1970)
Tensor calculus is an application of the chain rule.
Mathematical folklore

In this and the following two chapters we consider three central applications
of the theory of manifolds:
(i) Classical surface theory of Gauss.
(ii) Riemannian and affine connected manifolds.
(iii) Einstein's general theory of relativity (1916).
One should note that (ii) is a logical development of (i), and (iii) is based
on (ii). In fact, (i)-(iii) represent extraordinary achievements of mankind. We
would like to stress this line of thought in mathematics and physics. Furthermore,
we want to emphasize the relation between differential geometry on Banach
manifolds and its intuitive classical roots.


This chapter contains results which form the hard core of differential
geometry. It is organized as follows:
(a) Tensor calculus in $\mathbb{R}^n$ and covariant differentiation (Sections 74.1-74.8).
(b) Applications to classical surface theory (Sections 74.9-74.16).
(c) Generalization to manifolds (Sections 74.17-74.21).
(d) Further development of the calculus in $\mathbb{R}^n$ and on manifolds (alternating
differential forms, Lie derivatives; Sections 74.22-74.25).
The surface theory of Gauss was strongly influenced by Gauss' practical
work as a surveyor. Under great physical pains he worked from 1821 to 1825
as a land surveyor in the kingdom of Hannover in the northern part of
Germany. It almost led to his physical exhaustion. In 1822, he submitted his
prize memoir "General solution of the problem of mapping parts of a given
surface onto another given surface in such a way that image and preimage
become similar in their smallest parts," to the Royal Society of Sciences in
Copenhagen for which he received the official prize. What was the importance
of his work?
The mapping of surfaces onto one another, which satisfy certain given
properties, is a basic problem in cartography; in particular, the reproduction
of parts of the surface of the earth in plane geographical charts. It is impossible,
for example, to map parts of the surface of the earth onto the plane and
preserve the length. This will be an easy consequence of the theorema egregium
of Section 74.14. Thus one has to look for other mappings. Of great practical
use are the conformal maps, i.e., angle preserving maps. Angle preservation
of geographical charts is important in navigation, i.e., in determining routes
of ships on charts. As we will see in Section 74.14, conformal maps are also
similar in the small. Special cases of conformal maps from the surface of the
earth onto the plane are stereographic projections (Fig. 74.1), which were
already known to the Greeks, and the projection of Mercator (1512-1594) is
still being used in the cartography of today. Gauss succeeded in finding a
procedure to determine all conformal maps in the small for analytic surfaces.
The study of conformal maps in the large began with the dissertation of
Bernhard Riemann (1826-1866), which was written in 1851. It contains a
development of complex function theory and the famous Riemann
mapping theorem. When writing his prize memoir, Gauss had apparently
already worked on a more general surface theory, because he added
the following Latin saying to his title page: "Ab his via sternitur ad maiora"
(From here the path to something more important is prepared).

Figure 74.1

Figure 74.2
The development of the general surface theory, however, was difficult,
though the basic ideas were already known to Gauss since 1816. On February
19, 1826 he wrote to Olbers: "I hardly know any period of my life, where I
earned so little real gain for truly exhausting work, as during this winter. I
found many, many beautiful things, but my work on other things has been
unsuccessful for months." Finally, on October 8, 1827, Gauss presented the
general surface theory. The title of his paper was "Disquisitiones generales
circa superficies curvas" (General investigations of curved surfaces). The most
important result of this masterpiece in the mathematical literature is the
theorema egregium, the beautiful theorem. Gauss begins with the following
definition of curvature at a surface point. He considers a piece of the surface
F with surface measure m(F) and draws the unit normal vectors of F at the
origin. This gives the spherical picture S(F) of F on the unit sphere (Fig. 74.2).
The absolute value of the Gaussian curvature $|K(P)|$ at the point $P \in F$ satisfies
by definition
$|K(P)| = \lim m(S(F))/m(F)$, (1)
where the limit is taken as the piece F shrinks to the point P.

If F is the surface of a ball of radius r, then obviously $|K(P)| = 1/r^2$ for all
points P of F. The above definition of $|K(P)|$ depends on the embedding of F
in $\mathbb{R}^3$. The fundamental result of the Theorema egregium of Gauss is that K(P)
can be computed independently of the embedding in $\mathbb{R}^3$, i.e., the curvature of
a surface is independent of the surrounding space. It may be determined solely
by measurements on the surface itself. We shall try to explain why this is not
only an important mathematical statement, but also has a fundamental impact
on our physical view of the world.
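Gauss' definition (1) can be tried out numerically. The sketch below is an illustration under assumptions that are not taken from the text: the surface is the graph $z = (ax^2 + by^2)/2$, whose Gaussian curvature at the origin is known to be ab, the piece F lies over a small square centered at the origin, and both areas are approximated by a midpoint rule with central differences. All function names are hypothetical.

```python
import math

def point(x, y, a, b):
    """Point of the graph z = (a x^2 + b y^2) / 2 over (x, y)."""
    return (x, y, 0.5 * (a * x * x + b * y * y))

def unit_normal(x, y, a, b):
    """Unit normal of the graph; this is the spherical image of the point."""
    n = (-a * x, -b * y, 1.0)
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)

def area_density(field, x, y, a, b, h=1e-5):
    """Norm of (d field/dx) x (d field/dy), derivatives by central differences."""
    dx = [(p - q) / (2 * h) for p, q in zip(field(x + h, y, a, b), field(x - h, y, a, b))]
    dy = [(p - q) / (2 * h) for p, q in zip(field(x, y + h, a, b), field(x, y - h, a, b))]
    cross = (dx[1] * dy[2] - dx[2] * dy[1],
             dx[2] * dy[0] - dx[0] * dy[2],
             dx[0] * dy[1] - dx[1] * dy[0])
    return math.sqrt(sum(c * c for c in cross))

def area_ratio(a, b, s, cells=20):
    """m(S(F)) / m(F) for the piece F of the graph over the square [-s, s]^2."""
    width = 2.0 * s / cells
    area_F = area_S = 0.0
    for i in range(cells):
        for j in range(cells):
            x = -s + (i + 0.5) * width
            y = -s + (j + 0.5) * width
            area_F += area_density(point, x, y, a, b) * width * width
            area_S += area_density(unit_normal, x, y, a, b) * width * width
    return area_S / area_F

# The ratio should approach |K(0)| = |a b| as the square shrinks.
for s in (0.1, 0.01):
    print(s, area_ratio(2.0, 3.0, s))
```

For a = 2, b = 3 the printed ratios should move toward 6 as s decreases. Since only areas are compared, the sketch recovers the absolute value of the curvature, in agreement with (1).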
In his famous habilitation talk of 1854 "On the hypotheses which form the
foundations of geometry" Riemann extended the Gaussian surface theory to
n-dimensional manifolds for which a differential of the arclength ds and hence
a metric is defined (Riemannian manifolds). Here the Gaussian curvature K(P)
is replaced with the Riemannian curvature tensor which is defined without
reference to the surrounding space. The ingenious idea of Einstein, in his
general theory of relativity (1916), was that the Riemannian curvature tensor
of our four-dimensional space-time universe $E^4$ is determined by its masses,
and the force of gravity arises from the fact that the orbits of the planets
correspond to geodesics in $E^4$. In this way, Einstein gave a geometrical
explanation for the gravitational force, i.e., gravity was reduced to curvature.
Until the end of his life in 1955, Einstein tried, unsuccessfully, to find a unified
theory for all physical interactions using the concept of geometrization. Today
we have some hope that this program might be realized in the context of gauge
field theories. The idea is that the connection in principal fiber bundles induces
a curvature which causes the four fundamental interactions: strong, weak,
electromagnetic, and gravitational. Such a unified theory would include the
microcosmos (elementary particles) as well as the macrocosmos (cosmology).
If one looks at the history of the concept of manifolds, then during the last
one hundred years since Riemann's habilitation talk, five important mathe-
matical innovations have emerged, which are also relevant from a physical
point of view. We shall call a property an intrinsic property of a manifold if
it can be defined independently of the surrounding space and independently
of the particular choice of charts:

(i) The existence of tangent vectors is an intrinsic property.
(ii) Curvature is an intrinsic property.
(iii) The possibility of parallel transport of objects is an intrinsic property.
This allows a generalization of the concept of straight lines (geodesics).
(iv) Given a connection (Christoffel symbols), one can define parallel trans-
port and curvature. Moreover, one obtains an invariant differential cal-
culus on manifolds (covariant differentiation).
(v) For every metric (Riemannian manifold) there exists a connection. The
notion of connection, however, can be explained independently of the
existence of a metric (affine connected spaces, principal fiber bundles,
vector bundles).

Point (i) has already been discussed in the previous chapter. In a more
general form we shall discuss (ii)-(v) in Part V for Banach manifolds. In order
to prepare the reader for the general theory, we give the standard examples
for (ii)-(v) in this chapter, and applications to the special and general theory
of relativity in the following chapters. From (i)-(v) it is clear that the
concept of a connection is of central importance in differential geometry.
In Part V we shall develop a tensor calculus and a differential geometry for
Banach manifolds which generalizes the classical vector analysis. This elegant
calculus is introduced in an invariant way, i.e., coordinate free. This is very
natural, since in infinite-dimensional B-spaces there are no coordinates to
work with. For a better understanding, however, it might be useful to be aware
of the classical tensor calculus with coordinates which will be introduced in
this chapter and applied to the general theory of relativity in Chapter 76. This
classical calculus has the disadvantage that many indices appear. Its advantage,
on the other hand, is that because of the index principle of Section
74.5 it works on its own. This is one of the basic requirements of Leibniz
(1646-1716) on a good calculus. This way, one gets many important hints as
to which theorems to expect in the general calculus on Banach manifolds.
Also, the definitions of the general calculus get an intuitive interpretation. It
is really useful to master both calculi, in order to be able to apply the most
advantageous in a concrete situation. One should not underestimate the power
of the classical calculus.
In this chapter we will use the following strategy.
(a) We develop the tensor calculus in $\mathbb{R}^n$ (covariant differentiation, alternating
differential forms, Lie derivatives).
(b) This calculus carries over to finite-dimensional manifolds without further
thought (see Section 74.17).
(c) The heart of classical tensor calculus is the index principle of Section 74.5.
It allows us to write an equation in such a way that we know its form in
$\mathbb{R}^n$ for any arbitrary coordinate system. At the same time we obtain
equations on manifolds which are chart independent, i.e., which have a
geometrical meaning. Such notation is called covariant.
In mathematical physics, (c) is of particular importance because the physi-
cist is interested in transforming his equations into other systems of reference.
In surface theory we proceed as follows:
(i) The behavior of surfaces is studied in a particular coordinate system. This
yields, for example, in Section 74.11 a very simple and intuitive analytic
definition of the Gaussian curvature.
(ii) Using the index principle, we write these equations in covariant form for
arbitrary coordinate systems, i.e., geometrically invariant.
(iii) The basic equations of surface theory are the so-called fundamental
equations, i.e., the equations for the change in the natural trihedral of the
surface.
(iv) The integrability conditions, i.e., the necessary solvability conditions for
(iii) yield the curvature tensor and the Theorema egregium.
(v) The solution of (iii) yields the main theorem of surface theory: Locally, a
surface is uniquely determined by the first and second Gaussian fundamental
form, except for motions in $\mathbb{R}^3$.
(vi) The generalization of surface theory and covariant differentiation to $\mathbb{R}^n$
leads naturally to Riemannian and affine connected manifolds.
It might be helpful for the reader to be aware of the classical origin of the
curvature tensor as well as to get some intuitive understanding. The curvature
tensor is the key to the general theory of relativity of Chapter 76.
The main analytic tool, which we use in this chapter, is the theorem of
Frobenius (1849-1917) of Chapter 4. We will use it in the following way:
($\alpha$) Main theorem of surface theory.
($\beta$) Main theorem for Riemannian manifolds: Such a manifold is locally flat
if and only if the Riemannian curvature tensor is identically zero.

We note that ($\alpha$) and ($\beta$) correspond to the following general strategy in
differential geometry:
(S1) One finds differential equations for structures.
(S2) The necessary solvability conditions follow from the integrability conditions,
i.e., from differentiating and equating the mixed derivatives. For
example, it immediately follows from $u_x = a$ and $u_y = b$ that $a_y = b_x$,
because $u_{xy} = u_{yx}$.
(S3) One shows that the integrability conditions are also sufficient. Here, the
theorem of Frobenius or the theorem of Poincare of Section 74.23 play
an important role.
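Steps (S2) and (S3) can be illustrated for the simplest system $u_x = a$, $u_y = b$ in the plane. The following sketch rests on assumptions that are not part of the text: the functions a and b are hypothetical examples, the mixed derivatives are approximated by central differences, and u is recovered by a midpoint-rule integration along an axis-parallel path starting at the origin.

```python
def mixed_defect(a, b, x, y, h=1e-5):
    """a_y - b_x at (x, y); this must vanish if u_x = a, u_y = b is solvable."""
    a_y = (a(x, y + h) - a(x, y - h)) / (2 * h)
    b_x = (b(x + h, y) - b(x - h, y)) / (2 * h)
    return a_y - b_x

def integrate_path(a, b, x, y, steps=2000):
    """u(x, y) - u(0, 0) along the path (0,0) -> (x,0) -> (x,y), midpoint rule."""
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        total += a(t * x, 0.0) * x / steps   # horizontal part, uses u_x = a
        total += b(x, t * y) * y / steps     # vertical part, uses u_y = b
    return total

# Integrable pair: a = 2xy, b = x^2 belongs to u = x^2 y.
def a_good(x, y):
    return 2 * x * y

def b_good(x, y):
    return x * x

# Non-integrable pair: a = -y, b = x violates a_y = b_x.
def a_bad(x, y):
    return -y

def b_bad(x, y):
    return x

defect_good = mixed_defect(a_good, b_good, 1.5, 2.0)
u_value = integrate_path(a_good, b_good, 1.5, 2.0)
defect_bad = mixed_defect(a_bad, b_bad, 1.5, 2.0)
print(defect_good, u_value, defect_bad)
```

For the first pair the defect vanishes up to rounding and the path integral reproduces $u(1.5, 2) = x^2 y$. For the second pair the defect is nonzero, so no function u with these partial derivatives exists.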
In order to shed some light onto the historical background of the general
theory of relativity, we consider in Section 74.21 models for non-Euclidean
geometries. In the problems to this chapter we explain important connections
between topology and analysis (Theorem of Gauss-Bonnet-Chern, theorem
of de Rham, duality theorem of Poincare, index theorems of Poincare-Hopf
and Morse, theorem of Adams on vector fields). In Part V we will study the
connection between these subjects and the Atiyah-Singer index theorem. In
this chapter we only assume some knowledge of elementary differential and
integral calculus and the theorem of Frobenius of Chapter 4. Also, we need
the concepts of manifolds and tangent spaces of the previous chapter.
For readers who want to get acquainted with the life of Gauss, we recommend
the Gauss biographies of Worbs (1955), Wussing (1974), and Bühler
(1981). Furthermore, we suggest taking a look at the collected works of Gauss
(1863). One will be fascinated by both the depth of thought and the clarity
and simplicity of the language. We conclude these introductory remarks with
several citations from and about Gauss, which may help bring about a better
understanding of the scientist and human being Gauss.
Science should be the friend of applications, not its slave.
Gauss
I thank you, highly honored Sir, in the name of mankind, for presenting us with
a picture of the highest intellectual power and force together with an inspiring
and never ending warmth of feelings.
Alexander von Humboldt to Gauss (1853)
It is not the knowledge but the learning, not the possessing but the earning, not
the being there but the getting there, which gives us the greatest pleasure.
Gauss to Bolyai
It is quite extraordinary how much the young mathematicians here in Berlin
and, as I hear, in all parts of Germany adore Gauss. For them, he is the
incorporation of mathematical perfection.
Niels Henrik Abel (1825)
By explanations the scientist means nothing else than a reduction to very few
and simple basic rules, which cannot be reduced any further, but which allow a
complete deduction of the phenomena.
Gauss in Electromagnetism and Magnetometer

My theories are important to me, but infinitely much more-the truth.


Gauss
Pauca sed matura (few, but ripe).
Inscription of Gauss' seal
"Princeps mathematicorum" (Prince of the mathematicians) Gauss is called
on a commemorative coin, which the king of Hannover ordered after his death
in his honor.
Erich Worbs (1955)
Since three days, the almost, for this world too heavenly angel, is my bride. I am
unboundedly happy.
The twenty-seven-year-old Gauss in a letter to Bolyai (1804)
You were kind enough to invite me for a visit after my wife was well again. She
is well now. Yesterday evening, at 8 o'clock I closed her angelic eyes in which I
have found heaven for the last five years. Heaven give me the strength to bear
this blow. Grant me a few weeks, dear Olbers to gather new strength in the arms
of your friendship-strength for a life which is only valuable because it belongs
to my three small children.
The thirty-two-year-old Gauss in a letter to Olbers (1809)
Sartorius von Waltershausen reports that Gauss once said there were questions
of infinitely higher value than the mathematical ones, namely, those about our
relation to god, our determination, and our future. Only, he concluded, their
solutions lie far beyond our comprehension, and completely outside the field of
science.
Erich Worbs (1955)

74.1. Basic Ideas of Tensor Calculus


The main goal of tensor calculus is to write equations in such a way that they
can be identified in arbitrary coordinate systems. This is very important in
mathematical physics. For example, the Poisson equation $\Delta f = h$ in a fixed
Cartesian coordinate system of $\mathbb{R}^3$ with coordinates $(u^1, u^2, u^3)$ has the form
$\sum_{i=1}^{3} D_i^2 f = h$, (2)
where $D_i = \partial/\partial u^i$. This form is preserved if one passes to another Cartesian
coordinate system. However, it is not preserved if one uses arbitrary curvilinear
coordinates such as the spherical coordinates. In an arbitrary coordinate
system, (2) takes the form
$g^{ij}\nabla_i\nabla_j f = h$. (3)
The sum is taken over equal upper and lower indices from 1 to 3. The
quantities $g^{ij}$ and the symbol $\nabla_i$ for covariant differentiation are defined in
such a way that, for Cartesian coordinates, equation (3) coincides with equation
(2). Tensor calculus has the great advantage that one immediately sees
whether or not an equation holds, independently of the particular choice of
the coordinate system. According to the index principle of Section 74.5, one
only needs to check if the index picture is right, i.e., if the free indices are equal
in all terms; an index is called free if it is not summed. In the special case (3),
for instance, the index picture is right, because the number of the free indices
is equal to 0 for both sides of the equation.
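The coordinate independence of the Laplace operator can be checked numerically in the plane. The sketch below uses a standard fact that is not stated in the text above: in polar coordinates $(r, \varphi)$ the invariant form of the Laplacian reduces to $f_{rr} + f_r/r + f_{\varphi\varphi}/r^2$. The test function and all names are hypothetical, and derivatives are approximated by central differences.

```python
import math

def f_cart(x, y):
    # Hypothetical test function; its Laplacian is 6 x y.
    return x ** 3 * y + math.exp(x) * math.sin(y)

def f_polar(r, phi):
    # The same scalar field, expressed in polar coordinates.
    return f_cart(r * math.cos(phi), r * math.sin(phi))

def laplace_cartesian(f, x, y, h=1e-4):
    """Sum of the second partial derivatives in Cartesian coordinates."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h) - 4 * f(x, y)) / h ** 2

def laplace_polar(F, r, phi, h=1e-4):
    """F_rr + F_r / r + F_phiphi / r^2, the Laplacian in polar coordinates."""
    F_r = (F(r + h, phi) - F(r - h, phi)) / (2 * h)
    F_rr = (F(r + h, phi) - 2 * F(r, phi) + F(r - h, phi)) / h ** 2
    F_pp = (F(r, phi + h) - 2 * F(r, phi) + F(r, phi - h)) / h ** 2
    return F_rr + F_r / r + F_pp / r ** 2

r, phi = 1.3, 0.6
x, y = r * math.cos(phi), r * math.sin(phi)
lap_cart = laplace_cartesian(f_cart, x, y)
lap_polar = laplace_polar(f_polar, r, phi)
print(lap_cart, lap_polar, 6 * x * y)
```

All three printed numbers should agree up to discretization error, although the two difference formulas look quite different. This is the behavior that the covariant notation makes visible at the level of the index picture.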

Einstein's Summation Convention 74.1. In this chapter, we always sum from 1
to n over equal upper and lower indices, unless the contrary is explicitly stated.
For example, we have
$a_i b^i = \sum_{i=1}^{n} a_i b^i$.

Smoothness Convention 74.2. In this chapter, all functions and manifolds are
of class $C^\infty$, unless the contrary is explicitly stated.
This last convention is only made for convenience. Actually, many statements
are true under much weaker smoothness assumptions. Often $C^k$-functions
suffice with k = 1, 2.
In order to present the key ideas of tensor calculus as clearly as possible,
we begin with the most important relations. Later on, we will give a more
detailed exposition. Tensor calculus is based on the following two well-known
transformation rules for partial derivatives and differentials
$D_{i'} f = A^i_{i'} D_i f$, (4)
$du^{i'} = A^{i'}_i\, du^i$, (5)
where $D_i = \partial/\partial u^i$ and $A^{i'}_i = \partial u^{i'}/\partial u^i$. Here $u^i$ and $u^{i'}$ denote different coordinate
systems (see (10)). Also, we essentially invoke the identity
$A^{i'}_i A^i_{j'} = \delta^{i'}_{j'}$, (6)
where $\delta^i_j$ is the Kronecker symbol, i.e., $\delta^i_j = 1$ for i = j and $\delta^i_j = 0$ for $i \neq j$. In
fact, the chain rule implies
$\frac{\partial u^{i'}}{\partial u^i}\frac{\partial u^i}{\partial u^{j'}} = \frac{\partial u^{i'}}{\partial u^{j'}} = \delta^{i'}_{j'} = \begin{cases} 1 & \text{for } i' = j', \\ 0 & \text{for } i' \neq j'. \end{cases}$
formulas:
VJ=DJ,
V1ti = D1ti + r1~t•, (7)

V1ti = D1ti - r jt,,


1

where r1~ are the so-called Christoffel symbols. They also play an important
role in the general theory of manifolds in connection with the definition of
74.2. Covariant and Contravariant Tensors 617

parallel transport and the geodesics. Of central importance are:


(i) The defining equation (23) for the Christoffel symbols.
(ii) The differential equation (30) for the parallel transport.
(iii) The differential equation (35) for straight lines in arbitrary coordinate
systems in IR", which immediately yields the differential equation for
geodesics on manifolds.
In Sections 74.2-74.8 we develop the calculus for R". This then immediately
yields the more general calculus for real n-dimensional manifolds.

74.2. Covariant and Contravariant Tensors


Let us make the following conventions for Sections 74.2-74.8.
(A) Let G be an open, nonempty subset of $\mathbb{R}^n$. Let x denote the points in G
as well as the corresponding radius vectors (Fig. 74.3). A coordinate system
for G is a $C^\infty$-diffeomorphism u = u(x), which maps G onto an open set in $\mathbb{R}^n$.
Its inverse is denoted by x = x(u). This way, one assigns the coordinates
$u = (u^1, \dots, u^n)$ to a point x. The curve x = x(u) in G, which is obtained for
variable $u^i$ and fixed $u^j$ for all $j \neq i$, is called the $u^i$-coordinate line. Since x = x(u)
is a diffeomorphism, the inverse mapping theorem of Section 4.13 implies that
the vectors
$b_i = \frac{\partial x(u)}{\partial u^i}$ (8)
are linearly independent. We call $\{b_i\}$ the natural basis at the point x(u). Note
that this basis depends on the choice of the point x as well as the coordinate
system. Obviously, $b_i$ is a tangent vector for the $u^i$-coordinate line at the point
x (Fig. 74.3). If we choose a Cartesian coordinate system in $\mathbb{R}^n$, i.e., $x = u^i e_i$,
and the vectors $e_1, \dots, e_n$ form a positively oriented coordinate system, then
$b_i = e_i$. In the physical literature one often uses in place of $b_i$ the unit vectors
$b_i/|b_i|$. This, however, destroys much of the elegance of tensor calculus.

EXAMPLE 74.3 (Polar Coordinates). We choose a fixed, positively oriented
orthonormal system $\{e_1, e_2\}$ in $\mathbb{R}^2$ and define $x = \xi^1 e_1 + \xi^2 e_2$. Through
$\xi^1 = r\cos\varphi$, $\xi^2 = r\sin\varphi$, (9)
we introduce polar coordinates in the usual way (Fig. 74.4). Here we have
$u^1 = r$, $u^2 = \varphi$. The $u^i$-coordinate lines are rays for i = 1 and circles for i = 2.
Moreover, $b_1 = \partial x/\partial r$ and $b_2 = \partial x/\partial\varphi$, hence
$b_1 = \cos\varphi\, e_1 + \sin\varphi\, e_2$, $b_2 = -r\sin\varphi\, e_1 + r\cos\varphi\, e_2$.
In order to precisely obtain situation (A), we let r and $\varphi$ vary in the corresponding
regions $0 < r < \infty$ and $-\pi < \varphi < \pi$.
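The natural basis for polar coordinates can also be obtained directly from definition (8) by differentiating the position vector $x = r\cos\varphi\, e_1 + r\sin\varphi\, e_2$ numerically. The following sketch uses central differences; the function names are hypothetical.

```python
import math

def position(r, phi):
    """Cartesian components of the point with polar coordinates (r, phi)."""
    return (r * math.cos(phi), r * math.sin(phi))

def natural_basis(r, phi, h=1e-6):
    """b_1 = dx/dr and b_2 = dx/dphi, approximated by central differences."""
    b1 = tuple((p - q) / (2 * h)
               for p, q in zip(position(r + h, phi), position(r - h, phi)))
    b2 = tuple((p - q) / (2 * h)
               for p, q in zip(position(r, phi + h), position(r, phi - h)))
    return b1, b2

r, phi = 2.0, math.pi / 6
b1, b2 = natural_basis(r, phi)
length_b2 = math.hypot(b2[0], b2[1])
dot = b1[0] * b2[0] + b1[1] * b2[1]
print(b1, b2, length_b2, dot)
```

The vector $b_1$ points along the ray and has length 1, while $b_2$ is tangent to the circle and has length r. The two vectors are orthogonal, but $b_2$ is not a unit vector, which is the difference to the normalized basis often used in the physical literature.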

Figure 74.3

Figure 74.4

Again we consider situation (A). If v = v(x) is another coordinate system in
G, then the corresponding coordinates are denoted for convenience by
$v = (u^{1'}, \dots, u^{n'})$.
From u = u(x) and v = v(x) we obtain the map u = u(v) and its inverse v = v(u)
by eliminating x. The corresponding partial derivatives
$A^{i'}_i(x) = \frac{\partial u^{i'}}{\partial u^i}(u(x))$,
$A^i_{i'}(x) = \frac{\partial u^i}{\partial u^{i'}}(v(x))$ (10)
are called the transformation coefficients. They are essential in the
definition of tensors. For simplicity in notation we will skip the argument x,
i.e., we simply write $A^i_{i'}$, $t_i$ instead of $A^i_{i'}(x)$, $t_i(x)$, etc. The tensorial transformation
rules (11)-(13), below, therefore actually depend on x. The following
definition is fundamental.

Definition 74.4 (Tensors). Assume (A), and let all indices run from 1 to n.
A scalar field f on G is a function $f: G \to \mathbb{R}$, which assigns a real number
to each point $x \in G$, independently of the coordinate system in G.
A covariant tensor field $t_i$ on G is an assignment of a tuple $\{t_i\}$ of real
numbers to each point $x \in G$, depending on the coordinate system in G. Passing
to another coordinate system, these numbers are transformed like the derivatives
$D_i f$ in (4), i.e.,
$t_{i'} = A^i_{i'} t_i$. (11)
Analogously a contravariant tensor field $t^i$ is defined. Instead of (11), we
require that the $t^i$ are transformed like the differentials $du^i$ in (5), i.e.,
$t^{i'} = A^{i'}_i t^i$. (12)
An r-fold covariant and s-fold contravariant tensor field
$t^{j_1 \dots j_s}_{i_1 \dots i_r}$
on G is an assignment of a tuple $\{t^{j_1 \dots j_s}_{i_1 \dots i_r}\}$ of real numbers to each point $x \in G$,
depending on the coordinate system in G. Passing to another coordinate
system, these numbers are transformed like the corresponding product
$t_{i_1} \cdots t_{i_r}\, t^{j_1} \cdots t^{j_s}$
according to (11) and (12). The number r + s is called the degree of the tensor
field. For example, $t^j_i$ is transformed like the product $t_i t^j$,
hence
$t^{j'}_{i'} = A^i_{i'} A^{j'}_j t^j_i$. (13)
A one-fold or two-fold covariant tensor field is also called a simple or twice
covariant tensor field, etc.
Two tensor fields are called of the same type if and only if they have the
same number of corresponding upper and lower indices. For instance, $t^i_{jk}$ and
$s^i_{jk}$ are of third degree and of the same type, while $t^i_{jk}$ and $s^{ij}_k$ are of third degree
but not of the same type.

According to our Smoothness Convention 74.2, each $t^{\cdots}_{\cdots}$ with fixed indices
denotes a $C^\infty$-function on $G$. For simplicity, we will often speak of tensors instead
of tensor fields.

Remark. Our definition of tensors naturally arises if one studies physical
processes of measurement. These depend on the choice of a coordinate system.
Here we need rules which transform one coordinate system or system of
reference into another. It is, for example, an important task in physics to
decide how physical observables-such as space, time, velocity, electric
and magnetic field strengths-change when passing to another system of
reference. Einstein's special theory of relativity, for example, shows that the
transformation rules of classical mechanics for space, time, and velocity are
no longer valid for high velocities, i.e., velocities which are close to the velocity
of light. Scalar fields will play an important role, since at some fixed point they
have the same numerical value for each coordinate system. Such values are
also called invariants.

EXAMPLE 74.5. Let $f\colon G \to \mathbb{R}$ be a function. Then, by (4), the partial derivatives
$t_i = D_i f$ form the standard example of a covariant tensor field.

EXAMPLE 74.6. If $w = w(x)$ is a vector field on $G$, which is decomposed at the
point $x$ with respect to the natural basis $\{b_i\}$, then
$$w(x) = w^i b_i,$$
and the components $w^i$ provide the standard example of a contravariant
tensor field in $G$. In fact, it follows from (8) and the chain rule that
$$b_{i'} = \frac{\partial x}{\partial u^{i'}} = \frac{\partial x}{\partial u^i}\frac{\partial u^i}{\partial u^{i'}},$$
thus
$$b_{i'} = A^i_{i'} b_i. \tag{14}$$
By (6), this is equivalent to $b_i = A^{i'}_i b_{i'}$, which yields
$$w = w^i b_i = w^i A^{i'}_i b_{i'}.$$
Moreover, we have the representation $w = w^{i'} b_{i'}$ for the new components.
Since the $b_{i'}$ are linearly independent, we obtain $w^{i'} = A^{i'}_i w^i$.
EXAMPLE 74.7 (Unit Tensor). Let $i, j = 1, \ldots, n$. We assign to each point $x \in G$
and every $u^i$-coordinate system the numbers
$$\delta^i_j = \begin{cases} 1 & \text{for } i = j, \\ 0 & \text{for } i \neq j. \end{cases}$$
Then $\delta^i_j$ is a simple covariant and simple contravariant tensor. This remarkable
fact immediately follows from (13) and (6).
As shown in the following example, there exists no covariant tensor field
$g_{ij}$ which is equal to the Kronecker symbol in each coordinate system. Thus
in tensor calculus we will not use $\delta_{ij}$ for the Kronecker symbol.
The tensor $\delta^i_j$ is called the unit tensor because it is the unit element for the
tensor multiplication of Section 74.3.
EXAMPLE 74.8 (Metric Tensor). Set $b_i b_j = (b_i \mid b_j)$. The quantities
$$g_{ij} = b_i b_j \tag{15}$$
form a twice covariant tensor field, which is called the metric tensor field. In
fact, it follows immediately from (14) that $g_{i'j'} = A^i_{i'} A^j_{j'} g_{ij}$.
We have $g_{ij} = g_{ji}$. The linear independence of the $b_i$ implies that $\det(g_{ij}) \neq 0$.
In Cartesian coordinates we have $b_i = e_i$, i.e., $g_{ij}$ is equal to the Kronecker
symbol. This property, however, is generally lost in arbitrary coordinate
systems.
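
The loss of the Kronecker property can be checked numerically for the polar coordinates of Example 74.3. The following is only a sketch: the helper names (`position`, `natural_basis`, `metric`), the finite-difference step, and the sample point are assumptions made for this illustration and are not part of the text.

```python
import math

def position(r, phi):
    """Cartesian components (xi^1, xi^2) of x for polar coordinates (r, phi), cf. (9)."""
    return (r * math.cos(phi), r * math.sin(phi))

def natural_basis(r, phi, h=1e-6):
    """Approximate b_1 = dx/dr and b_2 = dx/dphi by central differences."""
    xp, xm = position(r + h, phi), position(r - h, phi)
    b1 = tuple((p - m) / (2 * h) for p, m in zip(xp, xm))
    xp, xm = position(r, phi + h), position(r, phi - h)
    b2 = tuple((p - m) / (2 * h) for p, m in zip(xp, xm))
    return b1, b2

def metric(r, phi):
    """g_ij = b_i . b_j as in (15), returned as a 2x2 nested list."""
    b = natural_basis(r, phi)
    return [[sum(x * y for x, y in zip(b[i], b[j])) for j in range(2)]
            for i in range(2)]

# Sample point (an arbitrary choice): r = 2, phi = 0.7.
g = metric(2.0, 0.7)
# Up to discretization error, g is diag(1, r^2), which is not the Kronecker symbol.
```

In this sketch the entry $g_{22}$ comes out close to $r^2$, so the metric tensor of polar coordinates differs from the Kronecker symbol whenever $r \neq 1$.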
EXAMPLE 74.9 (Inverse Metric Tensor). Let $g^{ij}$ denote the elements of the
inverse matrix to $(g_{ij})$, i.e.,
$$g^{ij} g_{jk} = \delta^i_k \tag{16}$$
in an arbitrary coordinate system. Then $g^{ij}$ is a twice contravariant tensor field
and, moreover,
$$g^{ij} = g^{ji}. \tag{17}$$

PROOF. If we write $g_{i'j'} = A^i_{i'} A^j_{j'} g_{ij}$ as a matrix equation
$$G' = A^T G A,$$
we obtain $G'^{-1} = A^{-1} G^{-1} (A^{-1})^T$. From (6) it follows then that $g^{i'j'} =
A^{i'}_i A^{j'}_j g^{ij}$. □

EXAMPLE 74.10 (Natural Basis $b_i$ and Natural Dual Basis $b^i$). In Definition
74.4 we gave a definition of tensors as number tuples which satisfy a certain
transformation behavior. It should be noted that various other objects exhibit
this same transformation behavior. For instance, we know already that the
vectors $b_i$ of the natural basis at some point $x$ behave like the components of
a simple covariant tensor field under coordinate changes, i.e.,
$$b_{i'} = A^i_{i'} b_i. \tag{18}$$
Let $V_x = \operatorname{span}\{b_1, \ldots, b_n\}$. For $i = 1, \ldots, n$ we define linear functionals $b^i \in V_x^*$
through $b^i(b_j) = \delta^i_j$. Then:
$$b^{i'} = A^{i'}_i b^i, \tag{19}$$
i.e., the linear functionals $b^i$ are transformed like the components of a simple
contravariant tensor field. In fact, we have $b^i(v^j b_j) = v^i$ and
$$b^{i'}(v^j b_j) = b^{i'}(v^{j'} b_{j'}) = v^{i'} = A^{i'}_i v^i = A^{i'}_i b^i(v^j b_j).$$

74.3. Algebraic Tensor Operations


The following propositions are immediate consequences of the linearity of the
tensor definition and identity (6).

Proposition 74.11. The sum of tensor fields of equal type is a tensor field of the
same type.
The product of tensor fields is again a tensor field where the type is given by
the indices.
If a tensor field vanishes at some point with respect to a fixed coordinate
system, then the same is true for every coordinate system.

PROOF. For example, it immediately follows from
$$t^{i'} = A^{i'}_i t^i \quad \text{and} \quad s^{i'} = A^{i'}_i s^i$$
that
$$t^{i'} + s^{i'} = A^{i'}_i (t^i + s^i),$$
$$t^{i'} s^{j'} = A^{i'}_i A^{j'}_j (t^i s^j),$$
i.e., $t^i + s^i$ or $t^i s^j$ is a simple contravariant or twice contravariant tensor field.
Moreover, from $t^i(x) = 0$, it always follows that $t^{i'}(x) = 0$. □

An index is called free if it is not summed.

Proposition 74.12 (Contraction). If one sums over one upper and lower index
of a tensor field, then one obtains a tensor field whose type is given by the free
indices.
This operation is called a contraction.

PROOF. This is an immediate consequence of (6). If, for instance,
$$t^{i'}_{j'} = A^{i'}_i A^j_{j'} t^i_j,$$
then it follows at once from (6) that
$$t^{i'}_{i'} = \delta^j_i t^i_j = t^i_i,$$
i.e., $t^i_i = t^1_1 + \cdots + t^n_n$ is a scalar field. □

The contraction $t^i_{ikm}$ with respect to $i$, for example, makes the tensor field
$t^i_{jkm}$ into a twice covariant tensor field. The free indices are here $k$ and $m$.
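
The invariance of the contraction $t^i_i$ can be illustrated with a small numerical sketch. The matrices below are hypothetical sample data chosen for this illustration: `A[i'][i]` plays the role of $A^{i'}_i$, and `B[j][j']` that of $A^j_{j'}$, so that `B` is the inverse of `A` as required by (6).

```python
def transform_mixed(t, A, B):
    """t'[i'][j'] = sum over i, j of A[i'][i] * B[j][j'] * t[i][j].

    This is the transformation rule for a tensor t^i_j (upper index first).
    """
    n = len(t)
    return [[sum(A[ip][i] * B[j][jp] * t[i][j]
                 for i in range(n) for j in range(n))
             for jp in range(n)] for ip in range(n)]

def contract(t):
    """Contraction t^i_i, i.e., the sum over the upper and the lower index."""
    return sum(t[i][i] for i in range(len(t)))

# Hypothetical sample data: A is invertible and B is its inverse.
A = [[2.0, 1.0], [1.0, 1.0]]
B = [[1.0, -1.0], [-1.0, 2.0]]
t = [[1.0, 2.0], [3.0, 4.0]]
t_new = transform_mixed(t, A, B)
# contract(t_new) agrees with contract(t): the contraction is a scalar.
```

The components of `t_new` differ from those of `t`, but both contractions agree, which is exactly the statement that $t^i_i$ is a scalar field.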

Proposition 74.13. Permutations of upper (or lower) indices of a tensor field do
not change the type of the tensor field.

PROOF. For example, it follows immediately from $t_{i'j'} = A^i_{i'} A^j_{j'} t_{ij}$ and $s_{ij} = t_{ji}$
that
$$s_{i'j'} = A^i_{i'} A^j_{j'} s_{ij}.$$
□

A tensor field $t_{i_1 \cdots i_r}$ is called symmetric (or antisymmetric) if and only if $t_{\cdots}$
remains unchanged under permutations of the indices (or is multiplied by
the sign of the permutation). Let, for example, $t_{ij}$ be a tensor field. Then one
obtains from Propositions 74.11 and 74.13 a symmetric and an antisymmetric
tensor field through $s_{ij} = t_{ij} + t_{ji}$ and $a_{ij} = t_{ij} - t_{ji}$, respectively. Analogously,
$t_{i_1 \cdots i_r}$ can be symmetrized and antisymmetrized. Here we use the
notations
$$\operatorname{Sym} t_{i_1 \cdots i_r} = \frac{1}{r!}\sum_\pi t_{\pi(i_1) \cdots \pi(i_r)}$$
and
$$\operatorname{Alt} t_{i_1 \cdots i_r} = \frac{1}{r!}\sum_\pi (\operatorname{sgn}\pi)\, t_{\pi(i_1) \cdots \pi(i_r)}.$$
One sums over all permutations $\pi$ of the indices. For instance, $\operatorname{Alt} t_{ij} =
(t_{ij} - t_{ji})/2$ and $\operatorname{Sym} t_{ij} = (t_{ij} + t_{ji})/2$.
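
The two averaging operations can be sketched in a few lines. The representation of a tensor as a dictionary keyed by index tuples and the sample values are assumptions made only for this illustration; the normalization $1/r!$ is the one that reproduces the two-index formulas above.

```python
from itertools import permutations
from math import factorial

def sign(perm):
    """Sign of a permutation of 0..r-1, computed from its number of inversions."""
    inversions = sum(1 for a in range(len(perm))
                     for b in range(a + 1, len(perm)) if perm[a] > perm[b])
    return -1 if inversions % 2 else 1

def sym(t, idx):
    """Sym t_{idx}: average of t over all permutations of the index tuple."""
    r = len(idx)
    return sum(t[tuple(idx[p] for p in perm)]
               for perm in permutations(range(r))) / factorial(r)

def alt(t, idx):
    """Alt t_{idx}: signed average of t over all permutations of the index tuple."""
    r = len(idx)
    return sum(sign(perm) * t[tuple(idx[p] for p in perm)]
               for perm in permutations(range(r))) / factorial(r)

# Hypothetical components t_ij for n = 2 (indices 0 and 1).
t = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 5.0, (1, 1): 3.0}
s01 = sym(t, (0, 1))   # (t_01 + t_10) / 2
a01 = alt(t, (0, 1))   # (t_01 - t_10) / 2
```

For two indices the functions reduce to $(t_{ij} \pm t_{ji})/2$; for equal indices `alt` returns zero.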

Definition 74.14. The index picture for an equation is called right if and only
if the free indices are the same for all additive terms. Permutations are allowed.

Principle of the Index Picture 74.15. All tensor operations above automatically
yield the right form, if one only uses terms where the index picture is right.

EXAMPLE 74.16. If $t_{ij}$, $s_{ij}$ and $w^k$ are tensor fields, then
$$c_{ij} = t_{ij} + s_{ij},$$
$$c_i = t_{ij} w^j,$$
are also tensor fields. On the other hand, the index picture in $t_{ij} + w^k$ is not
right since the additive terms $t_{ij}$ and $w^k$ contain different free indices. In fact,
$t_{ij} + w^k$ is not a tensor field.

74.4. Covariant Differentiation


If $t_i$ is a tensor field, then the partial derivatives $D_j t_i$ do not form a tensor field.
In order to make them into a tensor field we need to add additional terms.

Definition 74.17. We define covariant differentiation of a scalar field $f$, a
simple covariant tensor field $t_j$, and a simple contravariant tensor field $t^j$
through
$$\nabla_i f = D_i f, \tag{20}$$
$$\nabla_i t_j = D_i t_j - \Gamma^s_{ij} t_s, \tag{21}$$
$$\nabla_i t^j = D_i t^j + \Gamma^j_{is} t^s. \tag{22}$$
The $\Gamma^k_{ij}$ are the so-called Christoffel symbols which are defined through
$$\Gamma^k_{ij} = g^{km}\Gamma_{m,ij}, \qquad \Gamma_{m,ij} = \tfrac{1}{2}(D_i g_{mj} + D_j g_{mi} - D_m g_{ij}). \tag{23}$$
The intuitive meaning of these symbols is explained in Problem 74.4.
Because of $g_{ij} = g_{ji}$ we have
$$\Gamma^k_{ij} = \Gamma^k_{ji} \quad \text{and} \quad \Gamma_{m,ij} = \Gamma_{m,ji} \quad \text{for all } i, j.$$
The definition of $\nabla_i$ is very simple. This is clear if one looks at the summation
index $s$ and the index picture in (21) and (22). The Christoffel symbols were
introduced in 1869 by Elwin Bruno Christoffel (1829-1900).
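
Formula (23) can be evaluated numerically. The sketch below does this for the metric $g_{11} = 1$, $g_{22} = r^2$, $g_{12} = g_{21} = 0$ of polar coordinates $u = (r, \varphi)$; the function names and the finite-difference step are assumptions of this illustration, and indices 0 and 1 stand for $r$ and $\varphi$.

```python
def metric_polar(u):
    """Metric g_ij of polar coordinates u = (r, phi): diag(1, r^2)."""
    r = u[0]
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix, giving the components g^{ij}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def d_metric(metric, u, k, h=1e-5):
    """Central-difference approximation of D_k g_ij at the point u."""
    up, um = list(u), list(u)
    up[k] += h
    um[k] -= h
    gp, gm = metric(up), metric(um)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def christoffel(metric, u):
    """Gamma[k][i][j] = g^{km} * (1/2)(D_i g_mj + D_j g_mi - D_m g_ij), cf. (23)."""
    n = 2
    g_inv = inverse_2x2(metric(u))
    dg = [d_metric(metric, u, k) for k in range(n)]  # dg[k][i][j] = D_k g_ij
    lower = [[[0.5 * (dg[i][m][j] + dg[j][m][i] - dg[m][i][j])
               for j in range(n)] for i in range(n)] for m in range(n)]
    return [[[sum(g_inv[k][m] * lower[m][i][j] for m in range(n))
              for j in range(n)] for i in range(n)] for k in range(n)]

G = christoffel(metric_polar, [2.0, 0.3])
# The only nonzero symbols are approximately
# G[0][1][1] = -r and G[1][0][1] = G[1][1][0] = 1/r, here with r = 2.
```

The symmetry $\Gamma^k_{ij} = \Gamma^k_{ji}$ is visible in the output, and all symbols vanish if `metric_polar` is replaced by a constant (Cartesian) metric.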

Proposition 74.18. The covariant differentiation (20)-(22) defines tensor fields
where the type is given by the index picture.

PROOF. From the transformation rules
$$g_{i'j'} = A^i_{i'} A^j_{j'} g_{ij}, \qquad g^{i'j'} = A^{i'}_i A^{j'}_j g^{ij}$$
we obtain
$$\Gamma^{k'}_{i'j'} = A^i_{i'} A^j_{j'} A^{k'}_k \Gamma^k_{ij} + (D_{i'} A^s_{j'}) A^{k'}_s \tag{24}$$
by differentiation. This is the central formula. It shows that the $\Gamma^k_{ij}$ are not
transformed tensorially but, instead, an additional term appears. From (24)
and the transformation formulas for $t_i$ and $t^i$, a straightforward computation
yields the tensorial transformation formulas for (21) and (22). We have,
for example,
$$\nabla_{i'} t_{j'} = A^i_{i'} A^j_{j'} \nabla_i t_j. \tag{25}$$
□

A geometric motivation for the covariant differentiation follows in Section
74.6.
The definition of covariant differentiation for general tensor fields uses the
following rule.
(i) Besides the given tensor field $t^{j_1 \cdots j_s}_{i_1 \cdots i_r}$, one considers the corresponding
product $t_{i_1} \cdots t_{i_r} t^{j_1} \cdots t^{j_s}$.
(ii) One computes the covariant derivative of the product by formally applying
the product rule and (21), (22).
(iii) The covariant derivative of the given tensor field in (i) is defined analogously
to (ii).
We obtain, for example,
$$\nabla_i(t_j t^k) = (\nabla_i t_j) t^k + t_j \nabla_i t^k = D_i(t_j t^k) - \Gamma^s_{ij} t_s t^k + \Gamma^k_{is} t_j t^s.$$
Thus we define
$$\nabla_i t^k_j = D_i t^k_j - \Gamma^s_{ij} t^k_s + \Gamma^k_{is} t^s_j.$$
Analogously, one obtains
$$\nabla_k t_{ij} = D_k t_{ij} - \Gamma^s_{ki} t_{sj} - \Gamma^s_{kj} t_{is}.$$
Besides the partial derivative, we obtain a term with $\Gamma$ for each index of $t^{\cdots}_{\cdots}$
which is formed analogously to (21) and (22). For some practice we recommend
Problem 74.2.
This definition guarantees that Proposition 74.18, as well as the sum and
product rule, hold for the covariant differentiation of general tensor fields.

Definition 74.19 (Absolute Differential). If $t$ is a real parameter, then the
absolute $t$-derivative along the curve $u^i = u^i(t)$ is defined through
$$\frac{D}{dt}\bigl(t^{j_1 \cdots j_s}_{i_1 \cdots i_r}\bigr) = \frac{du^k}{dt}\nabla_k t^{j_1 \cdots j_s}_{i_1 \cdots i_r}.$$
Analogously, the absolute differential is defined through
$$D t^{j_1 \cdots j_s}_{i_1 \cdots i_r} = du^k\, \nabla_k t^{j_1 \cdots j_s}_{i_1 \cdots i_r}.$$
These two expressions are transformed in the same way as $t^{j_1 \cdots j_s}_{i_1 \cdots i_r}$ because,
by (5), $du^k/dt$ and $du^k$ are transformed like a simple contravariant tensor
field.

74.5. Index Principle of Mathematical Physics


In order that equations hold in arbitrary coordinate systems, one only has to
write them as equations for tensor fields. If, for instance, $f$ and $h$ are scalar
fields, i.e., functions, then
$$g^{ij}\nabla_i\nabla_j f = h \tag{26}$$
is an equation for tensor fields because the left- and right-hand sides consist
of scalar fields. In Cartesian coordinates one has the following special situation:
(i) The quantities $g_{ij}$ and $g^{ij}$ are equal to the Kronecker symbol, and for
$g = \det(g_{ij})$ we have $g = 1$. The Christoffel symbols are identically zero.
(ii) The covariant derivative $\nabla_i$ becomes the partial derivative $D_i$.
(iii) $D/dt$ becomes $d/dt$.
(iv) The vectors $b_i$ of the natural basis form a positively oriented orthonormal
system, i.e., $b_i = e_i$.
Thus (26) becomes in Cartesian coordinates
$$\sum_{i=1}^n D_i^2 f = h. \tag{27}$$
This is the Poisson equation. Hence, (26) is a version of (27) in arbitrary
coordinate systems. This represents a general method in mathematical physics.
We will formulate a suggestive version.

Index Principle 74.20. Let an equation (E) be given in Cartesian coordinates.
One may think, for example, of (27). In order to obtain (E) in a form which
holds in arbitrary coordinate systems, write (E) as an equation of tensor fields,
i.e., replace the partial derivative $D_i$ with $\nabla_i$ and ensure that the index picture
in the sense of Definition 74.14 is right.
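
As a plausibility check of the index principle, one can evaluate the tensor form $g^{ij}\nabla_i\nabla_j f$ of the Laplace operator in polar coordinates and compare it with the Cartesian expression $\sum_i D_i^2 f$. The sketch below assumes the polar metric $g_{11} = 1$, $g_{22} = r^2$ and the Christoffel symbols $\Gamma^r_{\varphi\varphi} = -r$, $\Gamma^\varphi_{r\varphi} = 1/r$ that follow from (23); the test function and the step size are arbitrary choices made for this illustration.

```python
import math

def f_polar(r, phi):
    """Test function f = (xi^1)^2 * xi^2, written in polar coordinates."""
    x, y = r * math.cos(phi), r * math.sin(phi)
    return x * x * y

def laplace_polar(f, r, phi, h=1e-4):
    """g^{ij} nabla_i nabla_j f for the polar metric diag(1, r^2).

    With nabla_i nabla_j f = D_i D_j f - Gamma^s_ij D_s f one obtains
    nabla_r nabla_r f = f_rr and nabla_phi nabla_phi f = f_phiphi + r * f_r.
    """
    f_r = (f(r + h, phi) - f(r - h, phi)) / (2 * h)
    f_rr = (f(r + h, phi) - 2 * f(r, phi) + f(r - h, phi)) / (h * h)
    f_pp = (f(r, phi + h) - 2 * f(r, phi) + f(r, phi - h)) / (h * h)
    return f_rr + (f_pp + r * f_r) / (r * r)

# In Cartesian coordinates the Laplacian of (xi^1)^2 * xi^2 equals 2 * xi^2.
r0, phi0 = 1.5, 0.4
tensor_value = laplace_polar(f_polar, r0, phi0)
cartesian_value = 2.0 * r0 * math.sin(phi0)
```

Both values agree up to discretization error, as expected for a scalar field: the same number is obtained in either coordinate system.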

An inverse index principle is formulated in Problem 74.3i.

EXAMPLE 74.21. Instead of

one writes

EXAMPLE 74.22. Instead of
$$\operatorname{div} w = D_i w^i$$
with $w = w^i e_i$, one writes $w = w^i b_i$ and
$$\operatorname{div} w = \nabla_i w^i. \tag{28}$$

EXAMPLE 74.23. The equation
$$w = \operatorname{grad} f$$
becomes $w^i = D_i f$ in Cartesian coordinates. The correct tensor form is $w^i =
g^{ij}\nabla_j f$. Thus for a scalar field $f$ we obtain
$$\operatorname{grad} f = (g^{ij}\nabla_j f) b_i. \tag{29}$$
Formulas (28) and (29) also follow immediately from the Index Principle
74.20, if $b_i$ is treated formally as a covariant tensor field. The form of curl $w$ in
$\mathbb{R}^3$ is given in Example 74.30 using pseudotensors.
In Problem 74.3 we consider equations in elasticity theory as well as the
Navier-Stokes equations, and we will give a number of other useful applications
of the index principle.

74.6. Parallel Transport and Motivation for Covariant Differentiation

In this section, we consider the differential equation
$$\dot v^k + \Gamma^k_{ij}\dot u^i v^j = 0 \tag{30}$$
for the parallel transport of a contravariant tensor field $v^k$ along the curve
$u^i = u^i(t)$. The dot means differentiation with respect to the real curve parameter
$t$, i.e., $\dot v^k = \dot u^i D_i v^k$. Equation (30), which is of fundamental importance for
all of differential geometry, can also be written as
$$\dot u^i \nabla_i v^k = 0 \tag{31}$$
or especially short and elegant as
$$\frac{Dv^k}{dt} = 0. \tag{32}$$

Theorem 74.A (Parallel Transport). Let situation (A) of Section 74.2 be given,
and let $C$ be a curve in the open set $G$, denoted by $x = x(t)$ or in coordinates by
$u^i = u^i(t)$, $i = 1, \ldots, n$.
The vector field $v = v^k b_k$ along the curve $C$ is constant if and only if equation
(30) holds along $C$.

PROOF. Being constant in Cartesian coordinates is equivalent to $\dot v^k = 0$ along
$C$. Moreover, (32) is a correct tensor equation which, in Cartesian coordinates,
becomes $dv^k/dt = 0$ and hence $\dot v^k = 0$. □
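
Theorem 74.A can be illustrated numerically in the polar coordinates of Example 74.3. The following sketch integrates (30) with the classical Runge-Kutta method and then converts the transported components back to Cartesian components, which should stay constant. The curve, the initial vector, and the step count are hypothetical choices; the Christoffel symbols $\Gamma^r_{\varphi\varphi} = -r$ and $\Gamma^\varphi_{r\varphi} = \Gamma^\varphi_{\varphi r} = 1/r$ are those obtained from (23) for the polar metric.

```python
import math

def gamma_polar(r):
    """Christoffel symbols of polar coordinates; index 0 = r, index 1 = phi."""
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r
    G[1][0][1] = G[1][1][0] = 1.0 / r
    return G

def curve(t):
    """Hypothetical test curve: u(t) = (r, phi) together with du/dt."""
    return (2.0 + 0.5 * t, t), (0.5, 1.0)

def rhs(t, v):
    """Right-hand side of (30): dv^k/dt = -Gamma^k_ij (du^i/dt) v^j."""
    (r, _), du = curve(t)
    G = gamma_polar(r)
    return [-sum(G[k][i][j] * du[i] * v[j] for i in range(2) for j in range(2))
            for k in range(2)]

def transport(v0, t_end, steps=2000):
    """Integrate (30) from t = 0 to t_end with the classical Runge-Kutta method."""
    h = t_end / steps
    t, v = 0.0, list(v0)
    for _ in range(steps):
        k1 = rhs(t, v)
        k2 = rhs(t + h / 2, [v[a] + h / 2 * k1[a] for a in range(2)])
        k3 = rhs(t + h / 2, [v[a] + h / 2 * k2[a] for a in range(2)])
        k4 = rhs(t + h, [v[a] + h * k3[a] for a in range(2)])
        v = [v[a] + h / 6 * (k1[a] + 2 * k2[a] + 2 * k3[a] + k4[a]) for a in range(2)]
        t += h
    return v

def cartesian(t, v):
    """Cartesian components of v = v^1 b_1 + v^2 b_2 at the curve point u(t)."""
    (r, phi), _ = curve(t)
    return (v[0] * math.cos(phi) - v[1] * r * math.sin(phi),
            v[0] * math.sin(phi) + v[1] * r * math.cos(phi))
```

For instance, `cartesian(0.0, v0)` and `cartesian(1.5, transport(v0, 1.5))` agree up to the integration error for any initial components `v0`, although the polar components themselves change along the curve.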

In Problem 74.4, we give a direct but somewhat elaborate proof of this
theorem, without using tensor calculus. This may help illustrate the geometric
meaning of the Christoffel symbols. According to (31), the importance of
covariant differentiation is that it allows a simple description of parallel
transport. Analogously to (32), we define the parallel transport of arbitrary
tensor fields along $C$ through
$$\frac{D t^{j_1 \cdots j_s}_{i_1 \cdots i_r}}{dt} = 0. \tag{33}$$

EXAMPLE 74.24. The differential equation for a straight line $u^k = u^k(t)$, $k = 1,
\ldots, n$ becomes, in arbitrary coordinate systems,
$$\frac{D}{dt}\left(\frac{du^k}{dt}\right) = 0. \tag{34}$$
This is equivalent to
$$\ddot u^k + \Gamma^k_{ij}\dot u^i \dot u^j = 0. \tag{35}$$

PROOF. Equation (34) is a correct tensor equation because it has the same
transformation behavior as (32). In Cartesian coordinates, all the $\Gamma^k_{ij}$ are equal
to zero, i.e., (34) and (35) become $\ddot u^k = 0$. □
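
Equation (35) can be tested in polar coordinates: its solutions should be straight lines when converted back to Cartesian coordinates. The sketch below assumes the polar Christoffel symbols $\Gamma^r_{\varphi\varphi} = -r$ and $\Gamma^\varphi_{r\varphi} = \Gamma^\varphi_{\varphi r} = 1/r$; the initial data and the step count are hypothetical choices for this illustration.

```python
import math

def geodesic_rhs(y):
    """(35) in polar coordinates, with y = (r, phi, dr/dt, dphi/dt).

    The two second-order equations read
    r'' = r * (phi')^2 and phi'' = -2 * r' * phi' / r.
    """
    r, phi, dr, dphi = y
    return [dr, dphi, r * dphi * dphi, -2.0 * dr * dphi / r]

def rk4(f, y, t_end, steps=1000):
    """Classical Runge-Kutta integration of y' = f(y) from 0 to t_end."""
    h = t_end / steps
    for _ in range(steps):
        k1 = f(y)
        k2 = f([a + h / 2 * b for a, b in zip(y, k1)])
        k3 = f([a + h / 2 * b for a, b in zip(y, k2)])
        k4 = f([a + h * b for a, b in zip(y, k3)])
        y = [a + h / 6 * (p + 2 * q + 2 * s + w)
             for a, p, q, s, w in zip(y, k1, k2, k3, k4)]
    return y

# Start at the Cartesian point (1, 0) with Cartesian velocity (0, 1),
# i.e., r = 1, phi = 0, dr/dt = 0, dphi/dt = 1.
r, phi, _, _ = rk4(geodesic_rhs, [1.0, 0.0, 0.0, 1.0], 1.0)
point = (r * math.cos(phi), r * math.sin(phi))
# The straight line through (1, 0) with velocity (0, 1) reaches (1, 1) at t = 1.
```

The computed point lies on the expected straight line up to the integration error, although neither $r(t)$ nor $\varphi(t)$ is a linear function of $t$.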

The great importance of formulas (30) and (35) is that they allow us to define
parallel transport and generalized straight lines (geodesics) in the same way
for Riemannian manifolds and more general manifolds with affine connection.
This will be discussed in Sections 74.18 and 74.19.

74.7. Pseudotensors and a Duality Principle

We assume again situation (A) of Section 74.2. Let $D = \det(A^i_{i'})$ denote the
Jacobian determinant. For tensor fields $t_i$, $t_{ij}$, etc., we have the transformation
rules
$$t_{i'} = \alpha A^i_{i'} t_i, \qquad t_{i'j'} = \alpha A^i_{i'} A^j_{j'} t_{ij} \tag{36}$$
with $\alpha = 1$. If (36) holds with $\alpha = \operatorname{sgn} D$, then we speak of pseudotensor fields.
Here $D$ is evaluated at the corresponding point.

Definition 74.25. A pseudotensor field $t^{\cdots}_{\cdots}$ is transformed like the corresponding
tensor field with the exception that the transformation rule is multiplied
by $\alpha = \operatorname{sgn} D$ as in (36).

If $\operatorname{sgn} D = -1$, then the natural basis $\{b_1, \ldots, b_n\}$ has a different orientation
than $\{b_{1'}, \ldots, b_{n'}\}$. Therefore pseudotensors often occur in connection with
orientations.

EXAMPLE 74.26 (Standard Example of a Pseudoscalar). If, as usual, we
specify a positive orientation of $\mathbb{R}^n$, and let $s = \pm 1$ denote the orientation
of $\{b_1, \ldots, b_n\}$, then $s$ is a pseudoscalar field, since under coordinate changes
we have $s' = (\operatorname{sgn} D)s$.

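EXAMPLE 74.27 (Standard Example of a Pseudotensor Field). Let $g = \det(g_{ij})$
and let $\varepsilon_{i_1 \cdots i_n}$ denote the sign of the permutation $\binom{1 \ \cdots\ n}{i_1 \cdots i_n}$. In particular, for two
equal indices, $\varepsilon_{\cdots}$ is equal to zero. For $n = 2$, for example, we obtain $\varepsilon_{12} =
-\varepsilon_{21} = 1$ and $\varepsilon_{11} = \varepsilon_{22} = 0$. Then
$$E_{i_1 \cdots i_n} = |g|^{1/2}\varepsilon_{i_1 \cdots i_n},$$
$$E^{i_1 \cdots i_n} = |g|^{-1/2}\varepsilon_{i_1 \cdots i_n}$$
are pseudotensor fields where the type is given by the index picture of $E$.

PROOF. From $g_{i'j'} = A^i_{i'} A^j_{j'} g_{ij}$ follows $g' = D^2 g$ according to the multiplication
rule for determinants. Therefore
$$|g'|^{1/2} = D \operatorname{sgn} D\, |g|^{1/2}.$$
Furthermore, it follows from the definition of determinants that
$$\varepsilon_{i'_1 \cdots i'_n} D = A^{i_1}_{i'_1} \cdots A^{i_n}_{i'_n}\varepsilon_{i_1 \cdots i_n}.$$
This implies
$$E_{i'_1 \cdots i'_n} = \operatorname{sgn} D\, A^{i_1}_{i'_1} \cdots A^{i_n}_{i'_n} E_{i_1 \cdots i_n}.$$
Analogous arguments apply to $E^{i_1 \cdots i_n}$. □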

Proposition 74.28. The sum of pseudotensor fields of the same type is again a
pseudotensor field of this type.
The product of a pseudotensor field with a tensor field (or pseudotensor field)
is again a pseudotensor field (or tensor field).

PROOF. As in Section 74.3 this follows immediately from the definition. D



Thus the Index Principle 74.20 remains valid for pseudotensors. One only
has to make sure that only pseudotensors (or only tensors) appear in each
additive term.

EXAMPLE 74.29 (Duality Principle). The pseudotensors of Example 74.27 can
be used to transform tensors into dual pseudotensors. Let, for example, $n = 3$.
Then we obtain the pseudotensor field
(37)
from the tensor field $t_{jk}$.
If we let $s = 1$ (or $s = -1$) if $\{b_1, b_2, b_3\}$ is a right-hand (or left-hand) system,
then $s$ is a pseudoscalar and
(38)
is a contravariant tensor field. More generally, tensor fields $t_{\cdots}$ and $t^{\cdots}$ generate
dual pseudotensor fields through

(39)

with type given by the index picture.


EXAMPLE 74.30 (curl $w$ in $\mathbb{R}^3$). For Cartesian coordinates in $\mathbb{R}^3$, the equation
$$\operatorname{curl} w = v$$
is equivalent to $v = v^i e_i$ and $v^i = \varepsilon_{ijk} D_j w_k$, where the sum is taken over $j$ and
$k$. According to the index principle and (38), the correct tensor expression is
$$v^i = s E^{ijk}\nabla_j g_{kr} w^r.$$
We therefore obtain in arbitrary coordinates
$$\operatorname{curl} w = s(E^{ijk}\nabla_j w_k) b_i \tag{40}$$
with $w = w^k b_k$ and $w_k = g_{kr} w^r$.
Because of the orientation factor $s$, curl $w$ is called an axial vector in the
physics literature. From Section 74.5 we have for an arbitrary coordinate
system in $\mathbb{R}^3$ that
$$\operatorname{grad} f = (\nabla^i f) b_i, \tag{41}$$
$$\operatorname{div} w = \nabla_i w^i \tag{42}$$
with $\nabla^i = g^{ij}\nabla_j$. Since no factor $s$ appears in (41), grad $f$ is called a polar vector
in the physics literature.

EXAMPLE 74.31 (Vector Product). Analogously to (40), one obtains for the
vector product in arbitrary coordinates
(43)
Therefore $v \times w$ is called an axial vector. The scalar product is discussed in
Problem 74.3a.
The elegance with which one obtains formulas (40)-(43), without any computations,
already illustrates the usefulness of tensor calculus in classical
vector analysis.
EXAMPLE 74.32 (Curl of a Vector Field). In $\mathbb{R}^n$ the tensor field
$$t_{ij} = \nabla_i w_j - \nabla_j w_i$$
is called the Curl of $w_i$. Since the Christoffel symbols cancel out, we also have
$$t_{ij} = D_i w_j - D_j w_i$$
for an arbitrary coordinate system.
The fact that the Christoffel symbols cancel under alternation and that partial
derivatives $D_i$ can be used instead of $\nabla_i$ is the key to the fundamental calculus
of alternating differential forms of Section 74.24.
The dual pseudotensor to $t_{ij}$ is given by
$$p^{i_1 \cdots i_{n-2}} = \tfrac{1}{2} E^{i_1 \cdots i_{n-2} i j} t_{ij}.$$
In $\mathbb{R}^3$, i.e., in the special case $n = 3$, we obtain
$$\operatorname{curl} w = (s p^k) b_k = \bigl(\tfrac{1}{2} s E^{kij} t_{ij}\bigr) b_k$$
from (40), since $E^{kij} t_{ij} = 2E^{kij}\nabla_i w_j$. This formula shows that in $\mathbb{R}^3$ there exists a close relation between
the tensor field "Curl" and the vector field "curl." In the general case of $\mathbb{R}^n$,
$n > 3$, however, only the tensor field "Curl," i.e., $t_{ij}$, is available.

74.8. Tensor Densities


If $g_{ij}$ is a tensor field on $\mathbb{R}^n$ and $h_{ij}$ is a pseudotensor field, then
$$g_{i'j'} = A^i_{i'} A^j_{j'} g_{ij}, \qquad h_{i'j'} = (\operatorname{sgn} D) A^i_{i'} A^j_{j'} h_{ij}$$
with $A^i_{i'} = \partial u^i/\partial u^{i'}$ and $D = \det(A^i_{i'})$. Letting $g = \det(g_{ij})$ and $h = \det(h_{ij})$, we
obtain from the multiplication rule for determinants that
$$g' = D^2 g, \qquad h' = (\operatorname{sgn} D)^n D^2 h.$$
We call $t^{\cdots}_{\cdots}$ a tensor density (or pseudotensor density) of weight $w$ if and only
if $t^{\cdots}_{\cdots}$ is transformed like a tensor field, however, with
an additional multiplication factor
$$|D|^w \quad (\text{or } \operatorname{sgn} D\, |D|^w).$$
For a tensor density $t^i_j$ of weight $w$, for example, we have the formula
$$t^{i'}_{j'} = |D|^w A^{i'}_i A^j_{j'} t^i_j.$$
It follows from the above that $g$ is a scalar density of weight 2, and $h$ is a scalar
density (or pseudodensity) of weight 2 for $n$ even (or $n$ odd). Because of
$$\sqrt{|g'|} = |D|\sqrt{|g|},$$
$\sqrt{|g|}$ is a scalar density of weight 1. This density plays the central role
in Riemannian geometry in connection with computations of volumes.
If $a$ is a scalar density of weight 1, then the transformation rule for integrals
implies that
$$J = \int a\, du^1 \cdots du^n = \int a |D|\, du^{1'} \cdots du^{n'} = \int a'\, du^{1'} \cdots du^{n'} = J',$$
i.e., $J$ is a scalar. Analogously to Section 74.3, one can prove the following:
(i) The product of two tensor densities (or pseudotensor densities) is a tensor
density where the weights are added.
(ii) The product of a tensor density with a pseudotensor density is a pseudotensor
density where again the weights are added.
(iii) Contractions of densities preserve the weights.

74.9. The Two Fundamental Forms of Gauss of Classical Surface Theory

In the following sections we consider the classical local surface theory of Gauss
as an elegant application of tensor calculus. It also provides the standard
example for a differential geometry on general manifolds. If one specializes
the results of Section 74.19 about $n$-dimensional Riemannian manifolds to
$n = 2$, one also obtains a global surface theory.
Our starting point is as follows.
(B) Let $x$ be the radius vector in $\mathbb{R}^3$ and $G$ a fixed region in $\mathbb{R}^2$, i.e., in the
$(u^1, u^2)$-plane.
A surface $S$ in $\mathbb{R}^3$ is given by the equation
$$x = x(u), \qquad u = (u^1, u^2) \tag{44}$$
on $G$ (Fig. 74.5), where $x\colon G \to S \subset \mathbb{R}^3$ is a bijective $C^\infty$-map, which is also an
immersion. In Cartesian coordinates for $\mathbb{R}^3$ we have $x = \xi^i e_i$, and (44)
becomes
$$\xi^i = \xi^i(u^1, u^2), \qquad i = 1, 2, 3.$$
Then $x(\cdot)$ is a $C^\infty$-immersion if and only if all functions $\xi^i(\cdot)$ are $C^\infty$ and
$$\operatorname{rank}\left(\frac{\partial \xi^i}{\partial u^j}\right) = 2 \tag{45}$$
holds on $G$ for $i = 1, 2, 3$ and $j = 1, 2$. The natural basis $\{b_1, b_2\}$ of the tangent
plane at the surface point $x(u)$ is defined as
$$b_i = \frac{\partial x}{\partial u^i}, \tag{46}$$
i.e., $b_i$ is a tangent vector to the $u^i$-coordinate line (Fig. 74.6). Because of (45),
$b_1$ and $b_2$ are linearly independent. The unit normal vector $N$ at $x(u)$ is defined
through
$$N = \frac{b_1 \times b_2}{|b_1 \times b_2|}. \tag{47}$$
A study of the trihedral $b_1, b_2, N$ is essential for classical surface theory.
An admissible coordinate system for the surface is any $C^\infty$-diffeomorphism
$v = v(u)$ which maps $G$ onto a region of $\mathbb{R}^2$. We write $v = (u^{1'}, u^{2'})$. In the
following let $u^1, u^2$ and $u^{1'}, u^{2'}$ denote arbitrary admissible coordinate systems.

Figure 74.5

Figure 74.6

Principle of Coordinate Independence 74.33. We consider only such properties
of the surface which are independent of the choice of the admissible coordinate
system.

As we shall see, this can effectively be done with the aid of tensor calculus
on $G$. Using the vector calculus for $\mathbb{R}^3$, we also make sure that the properties
under consideration are invariant under motions of the surface in $\mathbb{R}^3$. In the
following, a coordinate system will always be an admissible coordinate system.

Convention 74.34. Up to Section 74.16 all indices run from 1 to 2.



The following definition is basic.

Definition 74.35 (Fundamental Forms). If $x = x(t)$ is a curve on the surface,
which will also be denoted as $u^i = u^i(t)$, $i = 1, 2$, then, for an arbitrary
coordinate system, the following two expressions represent the first and
second fundamental form:
$$\dot x^2 = g_{ij}\dot u^i \dot u^j \tag{48}$$
and
$$-\dot x \dot N = h_{ij}\dot u^i \dot u^j. \tag{49}$$
The dot means derivative with respect to $t$. In short notation one also writes
$$ds^2 = g_{ij}\, du^i du^j, \tag{48a}$$
$$-dx\, dN = h_{ij}\, du^i du^j. \tag{49a}$$
We let $b_{ij} = D_j b_i$ with $D_j = \partial/\partial u^j$, and hence $b_{ij} = D_j D_i x$. Moreover, we let
$N_i = D_i N$.

Proposition 74.36. For an arbitrary coordinate system we have
$$g_{ij} = b_i b_j, \tag{50}$$
$$h_{ij} = N b_{ij} = -b_i N_j. \tag{51}$$
Moreover, $g_{ij}$ or $h_{ij}$ is a symmetric tensor field or symmetric pseudotensor field
on $G$, respectively, where the type is given by the index picture.

Corollary 74.37. Let $g = \det(g_{ij})$ and $h = \det(h_{ij})$. Then $g_{11}, g_{22}, g > 0$ and
$$g' = D^2 g, \qquad h' = D^2 h \tag{52}$$
with $D = \det(\partial u^i/\partial u^{i'})$. Consequently, $h/g$ is a scalar field on $G$.

PROOF. It follows from $\dot x = b_i \dot u^i$ that
$$\dot x^2 = b_i b_j \dot u^i \dot u^j = g_{ij}\dot u^i \dot u^j.$$
Since $\dot u^i$ and $\dot u^j$ are arbitrary, we obtain (50).
From $\dot N = N_j \dot u^j$ it follows immediately that
$$-\dot x \dot N = -b_i N_j \dot u^i \dot u^j,$$
hence $h_{ij} = -b_i N_j$. Differentiating $N b_i = 0$ gives $N_j b_i + N b_{ij} = 0$, and hence
(51).
As usual let $A^i_{i'} = \partial u^i/\partial u^{i'}$, hence $D = \det(A^i_{i'})$. Differentiation gives
$$\dot u^{i'} = A^{i'}_i \dot u^i, \qquad b_{i'} = A^i_{i'} b_i. \tag{53}$$
Therefore $g_{ij} = b_i b_j$ is a twice covariant tensor field. Since $b_1, b_2, N$ is always
a right-hand system, $-\dot x \dot N$ changes its sign if we pass to a $u^{i'}$-coordinate system
with change in the orientation, i.e., $\operatorname{sgn} D = -1$. Thus $-\dot x \dot N$ is a pseudoscalar,
i.e.,
$$h_{ij}\dot u^i \dot u^j = \operatorname{sgn} D\, h_{i'j'}\dot u^{i'} \dot u^{j'}.$$
Using the first equation in (53) and noting that the $\dot u^i$, $\dot u^j$ can be chosen
arbitrarily, we obtain
$$h_{ij} = \operatorname{sgn} D\, A^{i'}_i A^{j'}_j h_{i'j'},$$
i.e., $h_{ij}$ is a twice covariant pseudotensor field. □

PROOF OF COROLLARY 74.37. We have $g_{11} = b_1^2 > 0$ and $g_{22} = b_2^2 > 0$. The vector
equation $|b_1 \times b_2|^2 = \det(b_i b_j)$ implies
that $g > 0$. Equation (52) has been proved in Section 74.8. Therefore we
have $h'/g' = h/g$. □

As we shall see, Corollary 74.37 plays an important role in the investigation
of the Gaussian curvature.

Definition 74.38. Using the quantities $g_{ij}$ we define the Christoffel symbols $\Gamma^k_{ij}$
and the covariant differentiation $\nabla_i$ as in Section 74.4.

It is important that an application of $\nabla_i$ to tensor fields yields again a
tensor field. This can be proved as in Proposition 74.18. There we only needed
the fact that the $g_{ij}$ are transformed like a twice covariant tensor field. Hence,
the entire tensor calculus is available. We will make essential use of this in the
following.

74.10. Metric Properties of Surfaces


We want to show that a knowledge of the first fundamental form, i.e., a
knowledge of the metric tensor field $g_{ij}$, suffices to determine the metric
properties of the surface.
Let $H$ be a subregion of $G$ in the $(u^1, u^2)$-plane. We then define the surface
measure $m$ of the corresponding surface area through
$$m = \int_H \sqrt{g}\, du^1 du^2. \tag{54}$$
If $x = x(t)$, $t_1 \le t \le t_2$ is a curve on the surface, then we define the arclength
$s$ of this curve between the points $x(t_1)$ and $x(t)$ through $s(t) = \int_{t_1}^{t} |\dot x(t)|\, dt$, i.e.,
$$s(t) = \int_{t_1}^{t} \sqrt{g_{ij}\dot u^i \dot u^j}\, dt. \tag{55}$$
More precisely, we have to write $g_{ij}(u(t))\dot u^i(t)\dot u^j(t)$. Differentiation with respect
to $t$ gives
$$\left(\frac{ds}{dt}\right)^2 = g_{ij}\dot u^i \dot u^j.$$
This motivates the notation
$$ds^2 = g_{ij}\, du^i du^j$$
for the first fundamental form.

If $x = y(\tau)$ is another curve on the surface with coordinate representation
$u^i = v^i(\tau)$ and if $x = x(\tau)$ and $x = y(\tau)$ intersect for fixed $\tau$, then the angle $\varphi$
between the two curves satisfies
$$\dot x \dot y = |\dot x||\dot y| \cos\varphi.$$
Note that $\dot x$ and $\dot y$ are the tangent vectors at the point of intersection. Because
of $\dot x = b_i \dot u^i$ and $\dot y = b_j \dot v^j$ we have
$$\cos\varphi = \frac{g_{ij}\dot u^i \dot v^j}{\sqrt{g_{ij}\dot u^i \dot u^j}\,\sqrt{g_{ij}\dot v^i \dot v^j}}. \tag{56}$$
More precisely, we must write $g_{ij}(u(\tau))$, $\dot u^i(\tau)$, $\dot v^j(\tau)$.

Proposition 74.39. Expressions (54)-(56) do not depend on the particular choice
of the coordinate system.

The advantage of formulas (54)-(56) is that they remain valid for Riemannian
manifolds. This will be shown in Section 74.19. The following proof
is given in view of this generalization.

PROOF. The transformation formula for integrals yields the invariance of (54),
since
$$m = \iint \sqrt{g}\, du^1 du^2 = \iint \sqrt{g}\,|D|\, du^{1'} du^{2'} = \iint \sqrt{g'}\, du^{1'} du^{2'} = m'.$$
According to (53), $\dot u^i$ is transformed like a contravariant tensor field. Consequently,
$g_{ij}\dot u^i \dot u^j$ is a scalar, and hence $s$ and $\cos\varphi$ in (55) and (56) are scalars.
□

A motivation for (56) has already been given. A motivation for (55) is
provided by the approximation formula $\Delta s = |\Delta x|$, i.e.,
$$\Delta s = \left|\frac{dx}{dt}\right| \Delta t,$$
together with summation and passing to limits.
We now want to motivate (54). To do this, we consider, as in Figure 74.7,
a small rectangle $R$ which is parallel to the coordinate axes in the $(u^1, u^2)$-plane
and has the area $\Delta u^1 \Delta u^2$. The Taylor expansion of $x = x(u)$ shows that in
first-order approximation $R$ becomes a parallelogram which is spanned by
$(\partial x/\partial u^1)\Delta u^1$ and $(\partial x/\partial u^2)\Delta u^2$, i.e., by the vectors $b_1 \Delta u^1$ and $b_2 \Delta u^2$ in the
tangent plane. The area of this parallelogram is
$$\Delta m = |b_1 \times b_2|\, \Delta u^1 \Delta u^2.$$
We have $|b_1 \times b_2|^2 = \det(b_i b_j)$. Consequently,
$$\Delta m = \sqrt{g}\, \Delta u^1 \Delta u^2.$$

Figure 74.7

Summation and passing to limits gives (54).
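
The summation described above can be carried out numerically. The sketch below approximates (54) by a midpoint sum over small coordinate rectangles, with $\sqrt{g}$ obtained from finite-difference tangent vectors. The sphere parametrization, the grid size, and the helper names are assumptions made for this illustration only.

```python
import math

def sphere(u1, u2, R=1.0):
    """Sample surface: sphere of radius R with u1 = polar angle, u2 = azimuth."""
    return (R * math.sin(u1) * math.cos(u2),
            R * math.sin(u1) * math.sin(u2),
            R * math.cos(u1))

def tangent_vectors(x, u1, u2, h=1e-5):
    """Central-difference approximations of b_1 = dx/du^1 and b_2 = dx/du^2."""
    t1 = tuple((p - m) / (2 * h) for p, m in zip(x(u1 + h, u2), x(u1 - h, u2)))
    t2 = tuple((p - m) / (2 * h) for p, m in zip(x(u1, u2 + h), x(u1, u2 - h)))
    return t1, t2

def sqrt_g(x, u1, u2):
    """Square root of g = det(g_ij) with g_ij = b_i . b_j."""
    t1, t2 = tangent_vectors(x, u1, u2)
    g11 = sum(a * a for a in t1)
    g22 = sum(a * a for a in t2)
    g12 = sum(a * b for a, b in zip(t1, t2))
    return math.sqrt(max(g11 * g22 - g12 * g12, 0.0))

def surface_measure(x, lo1, hi1, lo2, hi2, n=100):
    """Midpoint-rule approximation of (54) over the rectangle [lo1,hi1] x [lo2,hi2]."""
    h1, h2 = (hi1 - lo1) / n, (hi2 - lo2) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += sqrt_g(x, lo1 + (i + 0.5) * h1, lo2 + (j + 0.5) * h2)
    return total * h1 * h2

area = surface_measure(sphere, 0.0, math.pi, 0.0, 2.0 * math.pi)
# For the unit sphere the result should be close to 4 * pi.
```

Since only $g_{ij}$ enters, the same routine works for any other parametrized surface passed in place of `sphere`.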

74.11. Curvature Properties of Surfaces


We will use the first and second fundamental form to define the curvature of
surfaces. Our starting point is the following simple situation.

EXAMPLE 74.40. As in Figure 74.8 we consider a surface
$$\zeta = \zeta(\xi, \eta)$$
which passes through the origin where it has the $(\xi, \eta)$-plane as the tangent
plane, i.e.,
$$\zeta(0, 0) = \zeta_\xi(0, 0) = \zeta_\eta(0, 0) = 0.$$
The Taylor expansion in a neighborhood of zero is therefore
$$\zeta = a\xi^2 + 2b\xi\eta + c\eta^2 + O_3.$$

Figure 74.8

Using a rotation around the $\zeta$-axis one can always ensure that
$$\zeta = 2^{-1}(\alpha\xi^2 + \beta\eta^2) + O_3. \tag{57}$$
In mathematics, properties of curvatures are always given by the quadratic
terms of the Taylor expansion. We define the Gaussian curvature $K$ and the
mean curvature $H$ of the surface at the origin very simply through
$$K = \alpha\beta, \qquad H = (\alpha + \beta)/2. \tag{58}$$
Moreover, we call $R = 1/\alpha$ and $r = 1/\beta$ the principal curvature radii, and the
$\xi$-direction and $\eta$-direction the principal curvature directions. For $\alpha = \beta$ every
direction is by definition a principal curvature direction.
Using the tensor calculus for $K$ and $H$, we are now looking for expressions
which are valid for all coordinate systems. To this end we let $e_1, e_2, e_3$ be the
coordinate unit vectors in Figure 74.8.
Then the surface equation takes the form
$$x = \xi e_1 + \eta e_2 + \zeta(\xi, \eta) e_3.$$
Consequently, we have at the origin
$$b_1 = e_1, \qquad b_2 = e_2$$
and $N = e_1 \times e_2 = e_3$. Therefore $g_{ij}(0, 0) = e_i e_j$ and
$$h_{ij}(0, 0) = N(D_i D_j x) = D_i D_j \zeta(0, 0).$$
This gives at the origin
$$g_{11} = g_{22} = 1, \qquad g_{12} = g_{21} = 0, \qquad h_{11} = \alpha, \qquad h_{22} = \beta, \qquad h_{12} = h_{21} = 0. \tag{59}$$
For the inverse matrix $(g^{ij})$ of $(g_{ij})$ we obtain at the origin $g^{11} = g^{22} = 1$ and
$g^{12} = g^{21} = 0$. This gives
$$K = h/g, \qquad H = \tfrac{1}{2} g^{ij} h_{ij}. \tag{60}$$

Definition 74.41. The Gaussian curvature $K$ and the mean curvature $H$ at
some surface point are defined through (60).

This definition is independent of the choice of the coordinate system. More


precisely, H changes its sign under changes in the orientation, because Propo-
sition 74.36 implies that h11 is a pseudotensor, and hence His a p,seudoscalar.
and Corollary 74.37 shows that K is a scalar.
Next we want to show that this definition of the Gaussian curvature allows
a pleasant geometric interpretation of the sign of K. For this we first show
how to compute ex and fJ in an invariant way.
At a fixed surface point P one can always, as in Figure 74.9, introduce a
local (e.,.,, ()-coordinate system which gives the situation of Example 74.40,
where P corresponds to the origin. The (e, 17)-plane is equal to the tangent
638 74. Classical Surface Theory, Theorema Egregium of Gauss

Figure 74.9

plane at P, and the $\zeta$-axis points in the direction of the normal vector N. Then
(57) holds locally. Knowing K and H, one obtains $\alpha$ and $\beta$ as the zeros of the
quadratic equation $(\lambda - \alpha)(\lambda - \beta) = 0$, i.e., with (58) as the zeros of
    \lambda^2 - 2H\lambda + K = 0.    (61)
From (60) it follows that H changes its sign under changes in the orientation.
Thus the same holds true for $\alpha$ and $\beta$, i.e., $\alpha$ and $\beta$ are pseudoscalars.
Furthermore, $\alpha$ and $\beta$ are the zeros of the equation
    \det(\lambda g_{ij} - h_{ij}) = 0.    (62)
Without any computation this can be seen as follows. If $\lambda$ is a pseudoscalar,
then $t_{ij} = \lambda g_{ij} - h_{ij}$ is a pseudotensor. According to Section 74.8, the
left-hand side of (62) is multiplied by $D^2$ under coordinate changes. Thus the
solutions of (62) are independent of the coordinate system. In the special case
(59), however, formula (62) has the solutions $\lambda = \alpha, \beta$.
From (57), one now immediately obtains a geometric interpretation for the
sign of the Gaussian curvature.
(i) If K > 0 at the point P, then $\alpha\beta > 0$. Thus, in a neighborhood of P, the
surface lies on one side of the tangent plane at P.
(ii) If K < 0 at the point P, then $\alpha\beta < 0$. Thus, in a neighborhood of P, the
surface lies on both sides of the tangent plane at P.
(iii) If K = 0 and H = 0 at the point P, then $\alpha = \beta = 0$. Thus, in a neighborhood
of P, the surface behaves like a plane, except for terms of at least
order 3.
(iv) If K = 0 and $H \ne 0$ at the point P, then $\alpha = 0$ and $\beta \ne 0$, or $\beta = 0$ and
$\alpha \ne 0$. Thus, in a neighborhood of P, the surface behaves like a cylinder,
except for terms of at least order 3.
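
The case distinction (i) to (iv) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `principal_curvatures` and `classify_point` and the tolerance `tol` are not part of the text. The two principal curvatures are computed as the zeros of equation (61).

```python
import math

def principal_curvatures(K, H):
    """Zeros of lambda^2 - 2*H*lambda + K = 0, i.e. of equation (61)."""
    # H^2 - K equals ((alpha - beta) / 2)^2, so it is nonnegative for real surfaces.
    disc = max(H * H - K, 0.0)
    root = math.sqrt(disc)
    return H + root, H - root

def classify_point(K, H, tol=1e-12):
    """Local shape of the surface near P according to the cases (i)-(iv)."""
    if K > tol:
        return "one side of the tangent plane"
    if K < -tol:
        return "both sides of the tangent plane"
    if abs(H) <= tol:
        return "plane-like up to order 3"
    return "cylinder-like up to order 3"

# Sphere of radius 2: K = 1/4 and H = 1/2.
print(principal_curvatures(0.25, 0.5), classify_point(0.25, 0.5))
```

In floating-point arithmetic the cases K = 0 and H = 0 can only be tested up to a tolerance, which is why `tol` appears in the sketch.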

EXAMPLE 74.42. In the case when the surface is a plane, no quadratic terms
appear in the Taylor expansion, i.e., $\alpha = \beta = 0$ and K = H = 0. For a sphere
of radius R we have
    \zeta = R - \sqrt{R^2 - \xi^2 - \eta^2} = \frac{1}{2R}(\xi^2 + \eta^2) + \ldots,
where the dots denote terms of order at least 3, and hence $\alpha = \beta = H = 1/R$
and $K = 1/R^2$.
If we choose the surface equation for the sphere as $\zeta = \sqrt{R^2 - \xi^2 - \eta^2} - R$,
then we obtain $\alpha = \beta = H = -1/R$. This shows that, in contrast to K, the signs
of $\alpha$, $\beta$, and H have no absolute geometrical meaning.
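
The sphere example can be checked numerically. The following Python sketch assumes a surface given as a graph over the tangent plane coordinates, with the normal chosen as N = (b_1 x b_2)/|b_1 x b_2|, and evaluates K = h/g and H = (1/2) g^{ij} h_{ij} by central differences. The function name `graph_curvatures` and the step size are illustrative choices, not notation from the text.

```python
import math

def graph_curvatures(zeta, xi, eta, step=1e-3):
    """Return (K, H) for the surface x = (xi, eta, zeta(xi, eta))."""
    d = step
    z0 = zeta(xi, eta)
    # Central differences for the first and second partial derivatives of zeta.
    z1 = (zeta(xi + d, eta) - zeta(xi - d, eta)) / (2 * d)
    z2 = (zeta(xi, eta + d) - zeta(xi, eta - d)) / (2 * d)
    z11 = (zeta(xi + d, eta) - 2 * z0 + zeta(xi - d, eta)) / d**2
    z22 = (zeta(xi, eta + d) - 2 * z0 + zeta(xi, eta - d)) / d**2
    z12 = (zeta(xi + d, eta + d) - zeta(xi + d, eta - d)
           - zeta(xi - d, eta + d) + zeta(xi - d, eta - d)) / (4 * d**2)
    # First fundamental form g_ij = b_i b_j with b_1 = (1, 0, z1), b_2 = (0, 1, z2).
    g11, g12, g22 = 1 + z1 * z1, z1 * z2, 1 + z2 * z2
    # Second fundamental form h_ij = N (D_i D_j x) with N = (-z1, -z2, 1) / w.
    w = math.sqrt(1 + z1 * z1 + z2 * z2)
    h11, h12, h22 = z11 / w, z12 / w, z22 / w
    g = g11 * g22 - g12 * g12
    h = h11 * h22 - h12 * h12
    K = h / g
    # g^{ij} h_{ij}, using the inverse matrix of (g_ij).
    H = 0.5 * (g22 * h11 - 2 * g12 * h12 + g11 * h22) / g
    return K, H

R = 2.0
lower_cap = lambda xi, eta: R - math.sqrt(R * R - xi * xi - eta * eta)
upper_cap = lambda xi, eta: math.sqrt(R * R - xi * xi - eta * eta) - R
print(graph_curvatures(lower_cap, 0.0, 0.0))  # close to (1/R^2, 1/R)
print(graph_curvatures(upper_cap, 0.0, 0.0))  # close to (1/R^2, -1/R)
```

Both parametrizations give the same K, while H changes sign, in line with the remark that only the sign of K has an absolute geometrical meaning.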

So far we have used the following convenient strategy.


(S) We choose a special coordinate system in which the problem at hand is
most easily treated. Then, by using the index principle of tensor calculus,
we obtain results in a form which are valid for arbitrary coordinate
systems.
Let us consider another example.

EXAMPLE 74.43 (Lines of Curvature). A line of curvature is a curve
    u^j = u^j(\tau),    \tau_1 \le \tau \le \tau_2,    j = 1, 2,
on the surface where at each point the tangent direction is a principal
curvature direction. In Example 74.40, we find at the origin $\dot\xi = 0$ or $\dot\eta = 0$ for the
lines of curvature, or more precisely
    \dot\xi \dot\eta (\alpha - \beta) = 0,
because for $\alpha = \beta$ every direction is a principal curvature direction. Following
the index principle, we write this as
    E^{ij} g_{ir} h_{js} \dot u^r \dot u^s = 0,    (63)
where $E^{ij} = |g|^{-1/2} \varepsilon^{ij}$. In fact, (59) and (63) imply that $\dot u^1 \dot u^2 (\alpha - \beta) = 0$.
Equation (63) holds for arbitrary coordinate systems. Thus for $\tau_1 \le \tau \le \tau_2$,
equation (63) is the differential equation for lines of curvature. More precisely, we
would have to write $g_{ir}(u(\tau))$ and $h_{js}(u(\tau))$. Finally, (63) is equivalent to
    \det \begin{pmatrix} (\dot u^2)^2 & -\dot u^1 \dot u^2 & (\dot u^1)^2 \\ g_{11} & g_{12} & g_{22} \\ h_{11} & h_{12} & h_{22} \end{pmatrix} = 0.

74.12. Fundamental Equations and the Main Theorem of Classical Surface Theory

We take a look at the central problem of surface theory. From a local point
of view, which data suffice to uniquely determine a surface? We begin with
the so-called fundamental equations of Gauss and Weingarten:
    D_i b_j = h_{ij} N + \Gamma^k_{ij} b_k    (Gauss, 1827),    (64a)
    D_i N = -h_i^j b_j    (Weingarten, 1861).    (64b)
These equations describe the change in the local trihedral $b_1, b_2, N$ of the
surface. We have $D_i = \partial/\partial u^i$ and
    h_i^j = g^{js} h_{si}.
The Christoffel symbols $\Gamma^k_{ij}$, which depend on the $g_{ij}$ and their first-order
derivatives, have already been defined in (23). Recall $\Gamma^k_{ij} = \Gamma^k_{ji}$. The fundamental
equations are the most important equations in surface theory.
By integrability conditions for (64) we mean all the equations
for $g_{ij}$ and $h_{ij}$ which follow from (64) by using
    D_i D_j b_k = D_j D_i b_k    and    D_i D_j N = D_j D_i N.    (65)
There are exactly three such equations, which have the explicit form
(70), (71) below.

Proposition 74.44. If assumption (B) of Section 74.9 holds, then (64) is true for
all surface points and coordinate systems.

PROOF. Ad (64b). Differentiation of $N^2 = 1$ yields $2N D_i N = 0$. Hence
    D_i N = c_i^j b_j,
where the numbers $c_i^j$ remain to be determined. It follows from (51) that
$h_{si} = -b_s D_i N$, and hence
    h_{si} = -c_i^j b_s b_j = -c_i^j g_{sj}.
From $g^{ms} g_{sj} = \delta^m_j$ follows $c_i^m = -g^{ms} h_{si}$. This is (64b).
Ad (64a). We determine the coefficients $\alpha$ and $\beta$ in the decomposition
    D_i b_j = \alpha^k_{ij} b_k + \beta_{ij} N.    (66)
From (51), multiplication with N gives $\beta_{ij} = N D_i b_j = h_{ij}$.
Furthermore, $g_{sj} = b_s b_j$ implies that
    D_i g_{sj} = b_j (D_i b_s) + b_s D_i b_j.
Interchanging the indices, summation, and $D_i b_j = D_j b_i$ give
    b_s D_i b_j = \frac{1}{2} (D_i g_{sj} + D_j g_{si} - D_s g_{ij}).
Thus from (23), $b_s D_i b_j = \Gamma_{s,ij}$. Multiplication of (66) with $b_s$ gives
$\Gamma_{s,ij} = \alpha^k_{ij} g_{ks}$, and thus $\alpha^k_{ij} = g^{ks} \Gamma_{s,ij}$.
From (23), this means $\alpha^k_{ij} = \Gamma^k_{ij}$. □

Theorem 74.B (Main Theorem of Surface Theory of Bonnet 1867). Choose the
first and second fundamental form in such a way that the integrability conditions
(65) are satisfied. Then, locally, except for motions, there exists exactly one
surface which has these two fundamental forms.

Remark. More precisely, this theorem states the following. We are looking for
a surface x = x(u) in the sense of (B) of Section 74.9, where u varies in a
neighborhood $U(u_0)$ of $u_0$. To this end we choose real $C^\infty$-functions $g_{ij}$ and
$h_{ij}$, i, j = 1, 2, on $U(u_0)$ which satisfy

and the integrability conditions (65). Moreover, we choose a point $x(u_0)$ and
basis vectors $b_i^0 = D_i x(u_0)$ at $x(u_0)$ with
    b_i^0 b_j^0 = g_{ij}(u_0)    for all i, j.
Also, we define
    N^0 = (b_1^0 \times b_2^0)/|b_1^0 \times b_2^0|.
Then there exists precisely one $C^\infty$-surface x = x(u) on a sufficiently small
$u_0$-neighborhood whose fundamental forms coincide with $g_{ij}$ and $h_{ij}$.
As our proof shows, it suffices for the $g_{ij}$ to be $C^{k+1}$ and for the $h_{ij}$ to be
$C^k$ with $k \ge 1$. Then x = x(u) is of class $C^{k+2}$.
This immediately implies Theorem 74.B, because $b_1^0$, $b_2^0$, $N^0$ and $x(u_0)$ are,
except for motions in $\mathbb{R}^3$, uniquely determined through $g_{ij}(u_0)$.

In the following proof we make essential use of the theorem of Frobenius
of Section 4.12. From this, the initial-value problem
    D_i w_j = f_{ij}(u, w),    w(u_0) = w_0    (67)
with $w = (w_1, \ldots, w_m)$ and i = 1, 2, j = 1, ..., m, has a unique $C^{r+1}$-solution,
$r \ge 1$, in a sufficiently small $u_0$-neighborhood, if the $f_{ij}$ are real $C^r$-functions
which satisfy the integrability conditions that follow from $D_k D_i w_j = D_i D_k w_j$.
PROOF.

(I) Construction of $b_i$ and N. Consider the initial-value problem
    b_i(u_0) = b_i^0,    N(u_0) = N^0
for the fundamental equations (64). Passing to Cartesian coordinates,
this is a problem of type (67) and has a unique local solution, because
the integrability conditions are satisfied by (65).
(II) Construction of x = x(u). Consider the initial-value problem
    D_i x = b_i    with the prescribed initial point x(u_0),
which again is of type (67). The integrability conditions $D_i b_j = D_j b_i$ are
satisfied since the right-hand side of (64a) is symmetric. Thus there exists
a unique local solution x = x(u).
(III) Uniqueness. Since each surface x = x(u) satisfies the fundamental
equations (64), uniqueness follows from (I) and (II).
(IV) Existence. We have to show that the given functions $g_{ij}$ and $h_{ij}$
correspond to the first and second fundamental form of x = x(u).
Let $b_i$, N be the solutions of (64). We let
    \alpha = N^2,    \beta_i = N b_i,    \gamma_{ij} = b_i b_j.
From (64) together with the product rule for $D_k$, and from the definition
of $\nabla_k$ in (21), we obtain the system
    \nabla_k \alpha = -2 h_k^i \beta_i,
    \nabla_k \beta_j = -h_k^s \gamma_{sj} + h_{jk} \alpha,    (68)
    \nabla_k \gamma_{ij} = h_{ik} \beta_j + h_{jk} \beta_i,
and $\alpha(u_0) = 1$, $\beta_i(u_0) = 0$, $\gamma_{ij}(u_0) = g_{ij}(u_0)$.
From the lemma of Ricci it follows that
    \nabla_k g_{ij} = 0
(see Problem 74.2d). Thus
    \alpha = 1,    \beta_i = 0,    \gamma_{ij} = g_{ij}
is a solution of (68). This system is of type (67). Hence the solution is
locally unique. Therefore
    N^2 = 1,    N b_i = 0,    b_i b_j = g_{ij},
i.e., $g_{ij}$ corresponds to the first fundamental form, and we have
$N = (b_1 \times b_2)/|b_1 \times b_2|$. From (64b) follows
    b_k D_j N = -h_j^s b_s b_k = -g^{sm} h_{mj} g_{sk} = -h_{kj},
i.e., $h_{kj}$ corresponds to the second fundamental form. □

74.13. Curvature Tensor and the Theorema Egregium

We define the curvature tensor $R^i_{jkm}$ for a surface through
    R^i_{jkm} = D_k \Gamma^i_{mj} - D_m \Gamma^i_{kj} + \Gamma^i_{ks} \Gamma^s_{mj} - \Gamma^i_{ms} \Gamma^s_{kj},
    R_{ijkm} = g_{is} R^s_{jkm}.    (69)
In Section 74.18 we demonstrate how the $R^i_{jkm}$ arise in a natural way in
connection with the commutability of the covariant differentiation. Here we
show how the curvature tensor of surface theory follows automatically from
the integrability condition of the fundamental equations. Note that sometimes
in the literature $R^i_{jkm}$ is used with the opposite sign. Our definition is chosen
so that the general equation (82) below appears in a symmetric form.

Proposition 74.45. The integrability conditions for the fundamental equations
are
    \nabla_k h_{ij} = \nabla_i h_{kj},    (70)
    R_{ijkm} = h_{ik} h_{jm} - h_{im} h_{jk}.    (71)

A proof is given below. Equation (70) is called the equation of
Mainardi-Codazzi and (71) is called the theorema egregium of Gauss. The tensor
property of $R_{ijkm}$ is a consequence of (71), because the right-hand side is a
tensor. From $h_{ij} = h_{ji}$ and (71) follows
    R_{ijkm} = R_{kmij},
    R_{ijkm} = -R_{jikm} = -R_{ijmk}.    (72)
Thus $R_{1212}$ is the only essential component and
    R_{1212} = h_{11} h_{22} - (h_{12})^2 = h = Kg.
Thus (70) and (71) only contain three essential equations. Moreover, we obtain
the following fundamental theorem.

Theorem 74.C (Theorema Egregium of Gauss (1827)). The curvature K can be
expressed through the metric tensor $g_{ij}$ and its first- and second-order
derivatives. More precisely, it is $K = R_{1212}/g$.

This theorem shows that K is an intrinsic property of the surface, because
$g_{ij}$ is determined solely by measurements of length on the surface. For
example, the curvature of the surface of the earth may be determined just by
land surveying on the earth. We do not need the surrounding space $\mathbb{R}^3$. The
theorema egregium was the starting point for Riemann's work (1854) on a
theory of curved n-dimensional manifolds, which he developed without using
a surrounding space. Riemann's point of view, in turn, was the key to Einstein's
general theory of relativity (1916). Gauss discovered his theorem after tedious
computations without using the curvature tensor. With the aid of covariant
differentiation, our proof now becomes very short.
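
The intrinsic character of K can be illustrated numerically. The following Python sketch computes K = R_1212/g from a metric alone, using the standard expressions Γ^k_ij = (1/2) g^{km}(D_i g_mj + D_j g_mi - D_m g_ij) and R^i_jkm = D_k Γ^i_mj - D_m Γ^i_kj + Γ^i_ks Γ^s_mj - Γ^i_ms Γ^s_kj, with all derivatives replaced by central differences. The function name `gaussian_curvature`, the step size, and the two test metrics are assumptions made for the illustration.

```python
import math

def gaussian_curvature(metric, u, step=1e-3):
    """Return K = R_1212 / g at the point u, using only the metric g_ij(u)."""

    def shifted(p, k, d):
        q = list(p)
        q[k] += d
        return q

    def christoffel(p):
        # gam[k][i][j] = Gamma^k_ij
        g = metric(p)
        det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
        ginv = [[g[1][1] / det, -g[0][1] / det],
                [-g[1][0] / det, g[0][0] / det]]
        dg = []  # dg[k][i][j] = D_k g_ij by central differences
        for k in range(2):
            gp, gm = metric(shifted(p, k, step)), metric(shifted(p, k, -step))
            dg.append([[(gp[i][j] - gm[i][j]) / (2 * step) for j in range(2)]
                       for i in range(2)])
        return [[[0.5 * sum(ginv[k][m] * (dg[i][m][j] + dg[j][m][i] - dg[m][i][j])
                            for m in range(2))
                  for j in range(2)] for i in range(2)] for k in range(2)]

    gam = christoffel(u)
    d_gam = []  # d_gam[k][i][m][j] = D_k Gamma^i_mj by central differences
    for k in range(2):
        gp, gm = christoffel(shifted(u, k, step)), christoffel(shifted(u, k, -step))
        d_gam.append([[[(gp[i][m][j] - gm[i][m][j]) / (2 * step) for j in range(2)]
                       for m in range(2)] for i in range(2)])

    def riemann(i, j, k, m):
        # R^i_jkm with zero-based indices
        value = d_gam[k][i][m][j] - d_gam[m][i][k][j]
        for s in range(2):
            value += gam[i][k][s] * gam[s][m][j] - gam[i][m][s] * gam[s][k][j]
        return value

    g = metric(u)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    r_1212 = sum(g[0][s] * riemann(s, 1, 0, 1) for s in range(2))
    return r_1212 / det

# Sphere of radius 2 in the coordinates (theta, phi), and the plane in polar coordinates.
sphere = lambda p: [[4.0, 0.0], [0.0, 4.0 * math.sin(p[0]) ** 2]]
polar_plane = lambda p: [[1.0, 0.0], [0.0, p[0] ** 2]]
print(gaussian_curvature(sphere, [1.0, 0.3]))       # close to 1/4
print(gaussian_curvature(polar_plane, [1.5, 0.2]))  # close to 0
```

The polar metric is not constant, yet its curvature vanishes, whereas no choice of coordinates on the sphere can remove the value 1/R^2.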
PROOF OF PROPOSITION 74.45. Letting
    \nabla_i b_j = D_i b_j - \Gamma^s_{ij} b_s
similarly as in Section 74.4, we can write the fundamental equations (64) in
the elegant form
    \nabla_i b_j = h_{ij} N,    D_i N = -h_i^j b_j.    (73)
(I) The equation $D_k D_i b_j = D_i D_k b_j$ is equivalent to
    \nabla_k \nabla_i b_j - \nabla_i \nabla_k b_j = R^m_{jik} b_m.    (74)
Here we define $\nabla_k \nabla_i b_j$ in the same way as $\nabla_k t_{ij}$ of Section 74.4. Note that
$\nabla_k \nabla_i b_j = D_k D_i b_j + \cdots$. From (73) we obtain
    \nabla_k \nabla_i b_j - \nabla_i \nabla_k b_j = (\nabla_k h_{ij} - \nabla_i h_{kj}) N + (h_{jk} h_i^m - h_{ij} h_k^m) b_m.    (75)
Comparison of (74) with (75) gives (70) and (71).
(II) A short computation shows that
    D_k D_i N = D_i D_k N
is equivalent to (70). This follows from (73) and the lemma of Ricci
$\nabla_k g_{ij} = 0$ (Problem 74.2d). □

74.14. Surface Maps


The following considerations are important for applications in cartography.
Let
    x = x(u)
be a surface S which satisfies assumption (B) of Section 74.9. Consider a
bijective map
    \bar x = \bar x(x)
from S onto another surface $\bar S$. Then $\bar S$ can be written in the form
    \bar x = \bar x(u)
and again we assume (B) for $\bar S$. The map is called length preserving (or area
preserving) if and only if the length of curves (or the surface area of subsets)
is preserved. The preservation of angles means that the angle of intersection
between curves is preserved. The metric tensors of S and $\bar S$ are denoted by $g_{ij}$
and $\bar g_{ij}$. Recall that $g = \det(g_{ij})$.

Proposition 74.46 (Characterization of Maps). The map is length preserving (or
area preserving) if and only if
    g_{ij}(u) = \bar g_{ij}(u)
(or $g(u) = \bar g(u)$) for all u and i, j = 1, 2.
The map is angle preserving if and only if for each u there exists a number
c(u) > 0 with
    g_{ij}(u) = c(u) \bar g_{ij}(u)
for all i, j = 1, 2.

PROOF. This follows easily from (54) to (56). The preservation of area, for
example, means that
    \int_H \sqrt{g}\, du = \int_H \sqrt{\bar g}\, du
for all subregions H. This is equivalent to $\sqrt{g} = \sqrt{\bar g}$.
The necessity of this condition for the preservation of angles follows from
(56). One uses coordinates for which $g_{ij}(u)$ is equal to the Kronecker symbol
for fixed u, and then looks at special curves. □

Proposition 74.46 immediately yields the following results.

(i) Every length preserving map is also area preserving and angle preserving.
(ii) Every length preserving map preserves the Gaussian curvature K at every
point. This follows from the theorema egregium (Theorem 74.C).
(iii) There is no length preserving map which maps parts of the sphere onto
the plane. This follows, because K is different for the sphere and the plane.
(iv) For angle preserving maps, i.e., conformal maps, we have
    ds^2 = c(u)\, d\bar s^2
at each point u. Thus in first-order approximation, the lengths are
multiplied by a fixed number in a small neighborhood of u. Therefore
conformal maps are also called similar in the small. A geographical chart
which is made from a conformal map therefore gives, in first-order
approximation, a precise picture of a sufficiently small section of the earth
in a fixed scale ratio.
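
A familiar conformal chart is the Mercator projection, which is not mentioned in the text and is used here only as an assumed illustration. On the unit sphere with latitude and longitude as coordinates, the metric is diag(1, cos^2(lat)), and the chart sends (lat, lon) to the plane point (lon, y(lat)) with y = ln tan(pi/4 + lat/2). The sketch below checks numerically that the ratio of squared lengths on the sphere and in the chart is the same along meridians and along parallels, which for these diagonal metrics is exactly the condition g_ij = c(u) \bar g_ij.

```python
import math

def mercator_y(lat):
    """Vertical chart coordinate of the Mercator projection."""
    return math.log(math.tan(math.pi / 4 + lat / 2))

def conformal_factors(lat, step=1e-6):
    """Ratios (sphere length)^2 / (chart length)^2 along meridian and parallel."""
    dy = (mercator_y(lat + step) - mercator_y(lat - step)) / (2 * step)
    c_meridian = 1.0 / dy**2          # sphere: dlat^2, chart: (dy/dlat)^2 dlat^2
    c_parallel = math.cos(lat) ** 2   # sphere: cos^2(lat) dlon^2, chart: dlon^2
    return c_meridian, c_parallel

print(conformal_factors(0.7))  # both entries are close to cos(0.7)**2
```

The common factor cos^2(lat) depends on the point, so the scale ratio of such a chart is only fixed locally, as stated in (iv).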

74.15. Parallel Transport on Surfaces According to Levi-Civita

Let a curve C of the form $u^i = u^i(\tau)$, i = 1, 2, with curve parameter $\tau$ be given
on a surface. Moreover, let $v = v^k b_k$ be a vector field along C. Analogously to
(30), we say that v is parallel along C if and only if $Dv^k/d\tau = 0$, i.e.,
    \dot v^k + \Gamma^k_{ij} \dot u^i v^j = 0    (76)
along C. The dot means differentiation with respect to $\tau$. This parallel
transport allows the following very simple interpretation.

Proposition 74.47. A vector field v is parallel along C if and only if $\dot v$ is always
perpendicular to the tangent plane.

Corollary 74.48. Let v be parallel along the curve C. Consider v at a fixed point
on the curve with parameter $\tau_0$ and move $v(\tau_0)$ parallel along C in $\mathbb{R}^3$. This
vector field is denoted by $w = \alpha^i b_i + \beta N$. We obtain
    \dot v^i(\tau_0) = \dot\alpha^i(\tau_0).

The last statement means that
    \Delta v^i = \Delta\alpha^i + o(\Delta\tau)    as \Delta\tau \to 0.
One also says briefly: The parallel transport of v along the curve C is obtained
through an infinitesimal parallel transport in $\mathbb{R}^3$ and a projection onto the
tangent plane.
PROOF. From $\dot v = \dot v^i b_i + v^i \dot b_i$ and $\dot b_i = D_j b_i\, \dot u^j$ and the fundamental equations
(64) it follows that
    \dot v = (\dot v^k + \Gamma^k_{ij} \dot u^i v^j) b_k + (\ldots) N.
Hence, the proposition follows from (76). □

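PROOF OF COROLLARY 74.48. From $\dot w = 0$ follows
    0 = \dot\alpha^i b_i + \alpha^i D_j b_i\, \dot u^j + \dot\beta N + \beta \dot N.
Consider this equation for $\tau_0$. It follows that $\beta(\tau_0) = 0$. The fundamental
equations (64) yield
    0 = (\dot\alpha^k + \Gamma^k_{ij} \dot u^i \alpha^j) b_k + (\ldots) N    at \tau_0,
hence
    \dot\alpha^k + \Gamma^k_{ij} \dot u^i \alpha^j = 0.
From $\alpha^i(\tau_0) = v^i(\tau_0)$ and (76) it follows that $\dot\alpha^k(\tau_0) = \dot v^k(\tau_0)$. □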

Using parallel transport one can give a simple interpretation for covariant
differentiation on a surface. For this, consider a fixed point on the curve C
with parameter $\tau_0$. Similarly as in Section 74.4, we define
    \frac{Dt^k}{d\tau} = \dot u^i \nabla_i t^k = \dot t^k + \Gamma^k_{ij} \dot u^i t^j.
Let $t = t^i b_i$ and let $v = v^i b_i$ denote the vector field which is obtained from $t(\tau_0)$
by a parallel transport. Then $v(\tau_0) = t(\tau_0)$ and (76) implies
    \frac{Dt^k(\tau_0)}{d\tau} = \dot u^i \nabla_i t^k = \dot t^k(\tau_0) - \dot v^k(\tau_0).
This then gives an intuitive interpretation of $D/d\tau$ and $\nabla_i$ on surfaces.
Since (76) depends on the curve $u^i(\tau)$, the parallel transport on surfaces
usually depends on the path. More precisely, we obtain for a simply connected
surface: The parallel transport is path independent if and only if the curvature
tensor is identically zero. From the theorema egregium of Section 74.13 this
is equivalent to K = 0 (see Problem 74.7).
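
The path dependence can be made visible by transporting a vector once around a circle of latitude on the unit sphere. The sketch below is an illustration under assumptions: it uses the coordinates (theta, phi) with the standard nonzero Christoffel symbols Γ^theta_{phi phi} = -sin(theta) cos(theta) and Γ^phi_{theta phi} = cos(theta)/sin(theta), and integrates the transport equation \dot v^k + Γ^k_ij \dot u^i v^j = 0 with a classical Runge-Kutta scheme. The function name and step count are arbitrary.

```python
import math

def transport_along_latitude(theta0, v0, steps=2000):
    """Parallel transport of v = (v^theta, v^phi) along theta = theta0, phi = tau,
    for 0 <= tau <= 2*pi, on the unit sphere."""
    s, c = math.sin(theta0), math.cos(theta0)

    def rhs(v):
        # With (theta-dot, phi-dot) = (0, 1) the transport equation reduces to
        # v^theta-dot = sin*cos * v^phi,  v^phi-dot = -(cos/sin) * v^theta.
        return (s * c * v[1], -(c / s) * v[0])

    h = 2 * math.pi / steps
    v = (float(v0[0]), float(v0[1]))
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs((v[0] + 0.5 * h * k1[0], v[1] + 0.5 * h * k1[1]))
        k3 = rhs((v[0] + 0.5 * h * k2[0], v[1] + 0.5 * h * k2[1]))
        k4 = rhs((v[0] + h * k3[0], v[1] + h * k3[1]))
        v = (v[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
             v[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)
    return v

# At theta0 = pi/3 the vector returns rotated by the angle 2*pi*cos(theta0) = pi.
print(transport_along_latitude(math.pi / 3, (1.0, 0.0)))
# Along the equator, which is a geodesic, the vector returns unchanged.
print(transport_along_latitude(math.pi / 2, (1.0, 0.0)))
```

The closed curve brings the vector back to its starting point with a different direction unless the enclosed curvature effect vanishes, which matches the statement that path independence requires K = 0.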

74.16. Geodesics on Surfaces and a Variational Principle

Section 74.6 provides the motivation for the following definition. A curve C
on a surface with parameter representation $u^i = u^i(s)$ is called a geodesic if
and only if $D\dot u^k/ds = 0$, i.e.,
    \ddot u^k + \Gamma^k_{ij} \dot u^i \dot u^j = 0,    k = 1, 2    (77)
along C. Here s denotes the arclength and the dot means derivative with
respect to s. From Section 74.6, this definition generalizes the concept of
straight lines in a plane. According to Section 74.15, the geometrical meaning
of (77) is the following. The tangent vector field $t = \dot u^i b_i$ along C is obtained
through a parallel transport.
Now we want to show that (77) is closely related to the variational problem
of finding the shortest connecting line between two points on the surface:
    \int_{P_1}^{P_2} ds = min!    (78)
Because of $\dot s = \sqrt{g_{ij} \dot u^i \dot u^j}$ we rewrite (78) in the form
    \int_{\tau_1}^{\tau_2} \sqrt{g_{ij} \dot u^i \dot u^j}\, d\tau = min!,    u(\tau_1) = u_1,    u(\tau_2) = u_2.    (79)
One is looking for the shortest line connecting two points $P_1$ and $P_2$ on the
surface with corresponding coordinates $u_1$ and $u_2$ (Fig. 74.10).

Proposition 74.49. If arclength is chosen as parameter, then every solution of
the variational problem (79) satisfies (77).

PROOF. Let $L = \sqrt{G}$ with $G = g_{ij} \dot u^i \dot u^j$. From Section 58.18 it follows that the
Euler differential equations for (79) are equal to
    \frac{d}{d\tau} \frac{\partial L}{\partial \dot u^k} - \frac{\partial L}{\partial u^k} = 0.
This means
    \frac{1}{2G\sqrt{G}} \left[ -\frac{dG}{d\tau} g_{kj} \dot u^j + 2G \frac{d}{d\tau}(g_{kj} \dot u^j) - G \frac{\partial}{\partial u^k}(g_{ij} \dot u^i \dot u^j) \right] = 0.
Letting $\tau = s$ gives G = 1 along the extremal. Using (23), this immediately
implies (77). □

Figure 74.10
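
The geodesic equation can be integrated numerically. The sketch below treats the unit sphere in the coordinates (theta, phi), where the equations \ddot u^k + Γ^k_ij \dot u^i \dot u^j = 0 read theta'' = sin(theta) cos(theta) phi'^2 and phi'' = -2 (cos(theta)/sin(theta)) theta' phi'. The solver, the function names, and the initial data are assumptions chosen for illustration. The result is compared with the great circle x_0 cos(s) + v_0 sin(s), which is the known geodesic through x_0 with unit tangent v_0.

```python
import math

def geodesic_on_unit_sphere(theta, phi, dtheta, dphi, length, steps=1000):
    """Integrate the geodesic equations up to arclength `length` (Runge-Kutta).
    Returns (theta, phi, dtheta, dphi) at the end point."""

    def rhs(y):
        th, _, dth, dph = y
        return (dth, dph,
                math.sin(th) * math.cos(th) * dph * dph,
                -2.0 * (math.cos(th) / math.sin(th)) * dth * dph)

    def step_from(y, k, a):
        return tuple(yi + a * ki for yi, ki in zip(y, k))

    h = length / steps
    y = (theta, phi, dtheta, dphi)
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs(step_from(y, k1, h / 2))
        k3 = rhs(step_from(y, k2, h / 2))
        k4 = rhs(step_from(y, k3, h))
        y = tuple(yi + h * (a + 2 * b + 2 * c + d) / 6
                  for yi, a, b, c, d in zip(y, k1, k2, k3, k4))
    return y

def embed(theta, phi):
    """Point of the unit sphere in R^3 for the coordinates (theta, phi)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

# Start on the equator at x0 = (1, 0, 0) with unit tangent v0 = (0, 0.8, 0.6).
th, ph, dth, dph = geodesic_on_unit_sphere(math.pi / 2, 0.0, -0.6, 0.8, 1.0)
print(embed(th, ph))
print((math.cos(1.0), 0.8 * math.sin(1.0), 0.6 * math.sin(1.0)))
```

The quantity G = g_ij \dot u^i \dot u^j stays equal to 1 along the computed curve, in agreement with the choice of arclength as parameter.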

74.17. Tensor Calculus on Manifolds


We want to show that without further thought, the tensor calculus of $\mathbb{R}^n$ carries
over to manifolds. Consider the following situation:
(C) M is a real, n-dimensional $C^\infty$-manifold.
The indices in the following four sections run from 1 to n.

Definition 74.50. A tensor field
    t^{i_1 \ldots i_r}_{j_1 \ldots j_s}
on M is an assignment of a tuple $\{t^{i_1 \ldots i_r}_{j_1 \ldots j_s}\}$ of real numbers to each point x of
M in every chart that belongs to x, which is transformed under chart changes
like a corresponding tensor field in $\mathbb{R}^n$.
Consequently, $t^{i_1 \ldots i_r}_{j_1 \ldots j_s}$ depends on the point x and on the charts.

Analogously, pseudotensor fields are defined. According to our Smoothness
Convention 74.2, we assume that $t^{\ldots}_{\ldots}$ with fixed indices is of class $C^\infty$ with
respect to the corresponding chart coordinates.
We illustrate this definition with a simple example. Consider a fixed atlas
for M. Let u = u(x) and v = v(x) denote two chart maps which assign
coordinates $u = (u^1, \ldots, u^n)$ and $v = (u^{1'}, \ldots, u^{n'})$ to the point x. As before, we let
    A^{i'}_i = \frac{\partial u^{i'}}{\partial u^i}(u(x))    and    A^i_{i'} = \frac{\partial u^i}{\partial u^{i'}}(v(x)).
Now, for $t^i$, we assume that
    t^{i'} = A^{i'}_i t^i
under chart changes from the u-chart to the v-chart.
In a natural way, each tensor field can be extended to all admissible charts.
This follows from the chain rule. Consider, for example, $t^i$. Let w = w(x) be
an admissible chart with $w = (u^{1''}, \ldots, u^{n''})$. We define
    t^{i''} = A^{i''}_i t^i.
The chain rule implies
    t^{i''} = A^{i''}_{i'} A^{i'}_i t^i = A^{i''}_{i'} t^{i'}.
Therefore it does not matter if we use u = u(x) or v = v(x) for the extension.
Analogously, one shows that the tensorial transformation rule applies to
changes in the admissible charts. We always extend tensor fields to the
maximal atlas, i.e., to all admissible charts. Two tensor fields on M are called
equal if and only if their extensions to the maximal atlas are identical.
Obviously, the algebraic tensor calculus of Section 74.3 applies to manifolds.
Let $b_i$ denote the tangent vector at x with respect to the chart u = u(x),
which corresponds in the u-chart to the unit vector $e_i$ in $u^i$-direction. Then
every tangent vector $t \in TM_x$ can be written in the form
    t = t^i b_i.
We call $\{b_1, \ldots, b_n\}$ the natural basis of the tangent space $TM_x$ and $t^i$ the
natural coordinates of t in the u-chart. Moreover, we define linear, continuous
functionals on $TM_x$ through
    b^i(t) = t^i.
From Section 73.23 it follows that
    t^* = t_i b^i
for every $t^* \in TM_x^*$. We call $\{b^1, \ldots, b^n\}$ the natural basis of $TM_x^*$ and $t_i$ the
natural coordinates of $t^*$ in the u-chart. From Proposition 73.84, $b_i$ and $t_i$ are
transformed like a simple covariant tensor under chart changes. Furthermore,
$b^i$ and $t^i$ are transformed like a simple contravariant tensor. Finally, Section
73.23 implies that $b_i = \partial x/\partial u^i$ and $b^i = du^i$.

74.18. Affine Connected Manifolds


We now want to extend covariant differentiation to manifolds. To this end
we need the Christoffel symbols.

Definition 74.51. Assume (C) of Section 74.17. The manifold M is called affine
connected if and only if there exists a tuple $\{\Gamma^k_{ij}\}$ of real numbers for each point
$x \in M$ in every chart that belongs to x, which is transformed under chart
changes like:
    \Gamma^{k'}_{i'j'} = A^{k'}_k A^i_{i'} A^j_{j'} \Gamma^k_{ij} + (D_{i'} A^k_{j'}) A^{k'}_k.    (80)
Consequently, $\Gamma^k_{ij}$ depends on the point x and on the charts. Here, $D_{i'} = \partial/\partial u^{i'}$.

According to our Smoothness Convention 74.2, we assume that all the $\Gamma^k_{ij}$
are $C^\infty$-functions in chart coordinates.
As in $\mathbb{R}^n$ and surface theory we now use these $\Gamma^k_{ij}$ to introduce the following
concepts:
(i) Covariant differentiation $\nabla_i$.
(ii) Absolute differentiation $D/d\tau$.
(iii) Parallel transport.
(iv) Affine geodesics as generalized straight lines.
(v) Curvature tensor $R^i_{jkm}$.
(vi) Torsion tensor $T^k_{ij} = \Gamma^k_{ij} - \Gamma^k_{ji}$.
We define $\nabla_i$ as in Section 74.4. For example, we set
    \nabla_j t^i = D_j t^i + \Gamma^i_{js} t^s.
Direct computation shows that formula (80) implies the following: If
    t^{\ldots}_{\ldots}
is a tensor field on M, then
    \nabla_i t^{\ldots}_{\ldots}
is a tensor field on M where the type is given by the index picture.

Let C be a curve $x = x(\tau)$ on M, which in charts has the local representation
$u^i = u^i(\tau)$. We define the absolute derivative along C as
    \frac{Dt^{\ldots}_{\ldots}}{d\tau} = \dot u^i \nabla_i t^{\ldots}_{\ldots}.
A tensor field $t^{\ldots}_{\ldots}$ is called parallel along C if and only if
    Dt^{\ldots}_{\ldots}/d\tau = 0    along C.
The curve C is called an affine geodesic if and only if $D\dot u^k/d\tau = 0$ along C with
respect to a suitable parameter $\tau$, i.e.,
    \ddot u^k + \Gamma^k_{ij} \dot u^i \dot u^j = 0.    (81)

The curvature tensor $R^i_{jkm}$ is defined as in (69), using the $\Gamma^k_{ij}$. A direct
computation shows
    \nabla_k \nabla_m v^i - \nabla_m \nabla_k v^i = R^i_{jkm} v^j - T^s_{km} \nabla_s v^i.    (82)
Thus, analytically, one obtains the $R^i_{jkm}$ and $T^k_{ij}$ in a natural way if one studies
the commutability of the covariant differentiation. In $\mathbb{R}^n$ we have $\Gamma^k_{ij} = 0$, and
hence $R^i_{jkm} = 0$ and $T^k_{ij} = 0$. From (80), $T^k_{ij}$ is a tensor. Since the left-hand side
of (82) is a tensor, the same is true for $R^i_{jkm} v^j$. Furthermore, since $v^j$ may be
chosen arbitrarily, the inverse index principle implies that $R^i_{jkm}$ is a tensor as
well (see Problem 74.3i).
The geometric meaning of $R^i_{jkm}$ and $T^k_{ij}$ is the subject of Theorem 74.D of
Section 74.20 and Problem 74.7. A consequence is the following two results
for simply connected manifolds M:
(i) The parallel transport in M is path independent if and only if $R^i_{jkm} = 0$ on
M.
(ii) M is locally flat if and only if $R^i_{jkm} = 0$ and $T^k_{ij} = 0$ on M.
In the next section, we will introduce so-called Riemannian manifolds.
Each Riemannian manifold is affine connected.

74.19. Riemannian Manifolds


For affine connected manifolds, the concept of curve length is not available.
For this we need Riemannian manifolds. The definition of Riemannian
manifolds is based on the following formula
    s_\pm(\tau) = \int_{\tau_0}^{\tau} \sqrt{\pm g_{ij} \dot u^i \dot u^j}\, d\tau    (83)
for the arclength. Thus we need to know the $g_{ij}$. From (83) it follows that
    \dot s_\pm^2 = \pm g_{ij} \dot u^i \dot u^j.
In short notation, we write
    ds_\pm^2 = \pm g_{ij}\, du^i du^j.

We set $g = \det(g_{ij})$ and $s = s_+$.

Definition 74.52. Assume (C) of Section 74.17. The manifold M is called a
Riemannian manifold if and only if there exists a symmetric tensor field $g_{ij}$ on
M with $g \ne 0$. If g > 0 on M, then we speak of a proper Riemannian manifold.

Riemannian manifolds which are not proper are called pseudo-Riemannian
manifolds. They occur in the general theory of relativity. One should note that
many authors use the term Riemannian manifold only for what in our terminology
is a proper Riemannian manifold. From Convention 74.2 it follows that in chart
coordinates $g_{ij}$ is of class $C^\infty$. We call $g_{ij}$ the metric tensor field.
Using the $g_{ij}$ we obtain as in $\mathbb{R}^n$ the following concepts:
(i) Curve length.
(ii) Angle between curves.
(iii) Volume.
(iv) The Christoffel symbols $\Gamma^k_{ij}$.
If C is a curve in M, which has the local parameter representation $u^i = u^i(\tau)$
in charts, we define the arclength through (83). This integral has the following
meaning. We decompose the curve into parts so that each part lies in a single
chart. As in (83) we integrate over these parts and then sum the results. Since
$g_{ij} \dot u^i \dot u^j$ is a scalar, the result does not depend on the choice of the charts.
Because of the additivity of the integral, the result also does not depend on
the particular choice of the decomposition. In (83), we choose the sign "+"
(or "-") if $g_{ij} \dot u^i \dot u^j \ge 0$ (or $\le 0$) along C.
If H is a region in M with a compact closure, then the volume of H is defined
through
    m(H) = \int_H \sqrt{|g|}\, du^1 \cdots du^n.    (84)
This integral is to be understood in the sense of (83), i.e., it follows from a
sufficiently small decomposition of H. Such decompositions can conveniently
be described by using a partition of unity $\sum_k \varphi_k = 1$ in H (see A_1(12h)). We
then let
    m(H) = \sum_k \int_H \varphi_k \sqrt{|g|}\, du^1 \cdots du^n.
From compactness of H, we may assume that each support of $\varphi_k$ lies in one
chart and that the sum $\sum_k$ is finite. Each subintegral is then evaluated in the
corresponding chart.
As in Section 74.8, we obtain that the volume m(H) does not depend on the
choice of the chart coordinates $(u^1, \ldots, u^n)$.
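
As a small numerical illustration of (83) and (84), the following Python sketch approximates a curve length and an area on the unit sphere by the midpoint rule. The metric diag(1, sin^2(theta)) in the coordinates (theta, phi), the function names, and the grid sizes are assumptions of the example. The single (theta, phi)-chart misses only a set of measure zero, so no partition of unity is needed here.

```python
import math

def sphere_metric(u):
    """Metric g_ij of the unit sphere in the coordinates u = (theta, phi)."""
    return [[1.0, 0.0], [0.0, math.sin(u[0]) ** 2]]

def curve_length(metric, curve, t0, t1, n=2000, eps=1e-6):
    """Midpoint rule for the arclength integral with the sign '+'."""
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        up, um = curve(t + eps), curve(t - eps)
        du = [(a - b) / (2 * eps) for a, b in zip(up, um)]  # u-dot
        g = metric(curve(t))
        q = sum(g[i][j] * du[i] * du[j] for i in range(2) for j in range(2))
        total += math.sqrt(q) * h
    return total

def area(metric, a0, a1, b0, b1, n=200):
    """Midpoint rule for the volume integral over [a0, a1] x [b0, b1]."""
    ha, hb = (a1 - a0) / n, (b1 - b0) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            g = metric((a0 + (i + 0.5) * ha, b0 + (j + 0.5) * hb))
            det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
            total += math.sqrt(abs(det)) * ha * hb
    return total

latitude = lambda t: (math.pi / 3, t)
print(curve_length(sphere_metric, latitude, 0.0, 2 * math.pi))  # about 2*pi*sin(pi/3)
print(area(sphere_metric, 0.0, math.pi, 0.0, 2 * math.pi))      # about 4*pi
```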
This definition can be extended to arbitrary regions in M if M has a
countable basis. In this case, M is paracompact as a locally compact space
with a countable basis. Consequently, there exist partitions of unity in M. The
sum $\sum_k$ defining m(H) might be an infinite series, so that, in addition, one has
to assume convergence. Because of the additivity of the integral in charts, this
definition is independent of the particular decomposition. Analogously, we
define
    J = \int_H f \sqrt{|g|}\, du^1 \cdots du^n
for a continuous $f: H \to \mathbb{R}$ as
    J = \sum_k \int_H \varphi_k f \sqrt{|g|}\, du^1 \cdots du^n,
where we assume convergence of the corresponding series for |f| if H is not
compact. The general construction in A_2(75) shows that m(H) defines a
measure m on M. The corresponding Lebesgue integral $\int_H f\, dm$ coincides with J
for a continuous f.
The angle $\varphi$ between two curves is defined through (56).
Using the $g_{ij}$, the Christoffel symbols $\Gamma^k_{ij}$ can be defined as in (23). A
straightforward computation shows that the transformation rule for the $g_{ij}$
implies the transformation rule (80) for the $\Gamma^k_{ij}$, which therefore determine an
affine connection. Hence all the definitions and concepts of the previous
section are available for Riemannian manifolds.
In the case of a Riemannian manifold, it follows from the symmetry of $g_{ij}$
and from the definition of $\Gamma^k_{ij}$ in (23), i.e.,
    \Gamma^k_{ij} = \frac{1}{2} g^{km} (D_i g_{mj} + D_j g_{mi} - D_m g_{ij})
with $D_i = \partial/\partial u^i$, that
    \Gamma^k_{ij} = \Gamma^k_{ji}    for all i, j, k,
and hence
    T^k_{ij} = 0    for all i, j, k,
i.e., the torsion tensor of a Riemannian manifold vanishes.
Let C be a $C^\infty$-curve in M with $g_{ij} \dot u^i \dot u^j > 0$ (or < 0) along C. Then C is
called a geodesic if and only if (81) holds with respect to the special parameter
$\tau = s_+$ (or $\tau = s_-$). This definition is more special than the definition of affine
geodesics of Section 74.18. The notation $s_\pm$ has been introduced in (83). In
the following it is convenient to use $s_\alpha$ with $\alpha = \pm 1$ instead of $s_\pm$.

Proposition 74.53. Consider the variational problem

i = 1, ... , n
with given starting point $u_1$ and endpoint $u_2$ as well as some fixed $\alpha = \pm 1$.
Then every solution curve C with $\dot s_\alpha > 0$ along C is a geodesic.

This follows analogously as in the proof of Proposition 74.49. Note that $\Gamma^k_{ij}$
remains unchanged when passing from $g_{ij}$ to $-g_{ij}$. An important application
of Proposition 74.53 is discussed in Section 76.2 in connection with the general
theory of relativity.

74.20. Main Theorem About Riemannian Manifolds and the Geometric Meaning of the Curvature Tensor

We assume:
(R) Let M be a real, n-dimensional Riemannian $C^\infty$-manifold with metric
$C^\infty$-tensor field $g_{ij}$.
The Morse-index m for the metric at the point $x_0 \in M$ is the Morse-index of
the quadratic form
    ds^2 = g_{ij}\, du^i du^j,
i.e., the number of negative eigenvalues of the matrix $(g_{ij}(x_0))$. According to
the classical law of inertia of Sylvester for quadratic forms, m is invariant under
coordinate transformations. Using a linear coordinate transformation, one
can assume that
    (N)    ds^2 = \sum_{i=1}^{n} \varepsilon_i (du^{i'})^2
at the point $x_0$, where
    \varepsilon_i = -1 for i = 1, \ldots, m,    \varepsilon_i = 1 for i > m.
We call $(\varepsilon_1, \ldots, \varepsilon_n)$ the signature type of the metric at $x_0$. For a proper
Riemannian manifold we have $\varepsilon_i = 1$ for all i. For the general theory of
relativity of Chapter 76, the signature type is (-1, -1, -1, 1), where $u^1$, $u^2$,
$u^3$ are the space coordinates and $u^4$ is the time coordinate.
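
The Morse-index of a given symmetric matrix can be found without computing eigenvalues. The following Python sketch relies on Sylvester's law of inertia: symmetric elimination without row exchanges is a congruence transformation, so the number of negative pivots equals the number of negative eigenvalues. The function name `morse_index` is an assumption, and the sketch deliberately stops when a pivot vanishes instead of handling that case.

```python
def morse_index(matrix, tol=1e-12):
    """Number of negative eigenvalues of a symmetric matrix, counted as the
    number of negative pivots in Gaussian elimination without row exchanges.
    Assumes that no pivot vanishes."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    negatives = 0
    for k in range(n):
        pivot = a[k][k]
        if abs(pivot) <= tol:
            raise ValueError("zero pivot: this simple sketch does not pivot")
        if pivot < 0:
            negatives += 1
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return negatives

# Signature type (-1, -1, -1, 1) of the general theory of relativity.
minkowski_type = [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]
print(morse_index(minkowski_type))
```

A production routine would use a pivoted symmetric factorization to cope with vanishing pivots; the version above only illustrates the invariance statement.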
M is called locally flat at $x_0$ if and only if there exists a coordinate system
such that (N) holds in a neighborhood of $x_0$.

Theorem 74.D (Riemann (1861)). If (R) holds, then M is locally flat at $x_0$ if
and only if the curvature tensor $R^i_{jkm}$ vanishes in a neighborhood of $x_0$.

This theorem shows that the $R^i_{jkm}$ are a measure for the deviation from a locally
flat metric. Similarly as in the proof of the main theorem of surface theory of
Section 74.12, this proof is an application of the theorem of Frobenius, i.e., we
solve systems of explicit first-order partial differential equations by checking
the integrability conditions. Also, as in Section 74.12, we use the lemma of
Ricci. The key to the proof is (I).
PROOF. Since $R^i_{jkm}$ contains the first- and second-order derivatives of $g_{ij}$, we
immediately obtain from $g_{i'j'} = const$ that $R^{i'}_{j'k'm'} = 0$. The tensor property
implies that $R^i_{jkm} = 0$ is a necessary condition.
To prove the sufficiency of $R^i_{jkm} = 0$, we choose a fixed $u^i$-system. Our goal
is to construct a $u^{i'}$-system in which $g^{i'j'}$ is constant in a neighborhood of $x_0$.
Then also $g_{i'j'}$ is locally constant, and by using a linear transformation, we
obtain that (N) holds locally.
(I) The system
    \nabla_k h_j = 0
has a unique solution in a neighborhood of $x_0$ for any given $h_j(x_0)$
because this system is equivalent to
    D_k h_j = \Gamma^s_{kj} h_s,
and the integrability conditions $D_m D_k h_j = D_k D_m h_j$ are satisfied because of
$R^i_{jkm} = 0$.
(II) The function $\varphi = g^{ij} h_i h_j$ is locally constant because the product rule for
$\nabla_k$ implies that
    D_k \varphi = \nabla_k \varphi = 0.
Note that from the lemma of Ricci,
    \nabla_k g^{ij} = 0
(see Problem 74.2e). Furthermore, note that $\nabla_k h_j = 0$, according to (I).
(III) For $h_j$ in (I), there always exists locally a function u' with
    D_j u' = h_j,
since because of $\Gamma^s_{kj} = \Gamma^s_{jk}$ the integrability conditions $D_k D_j u' = D_j D_k u'$
are satisfied.
(IV) Now we construct the new $u^{i'}$-system through $h^{i'}_j$ with
    \nabla_k h^{i'}_j = 0    and    D_j u^{i'} = h^{i'}_j,
where $\det(h^{i'}_j(x_0)) \ne 0$. Then $g^{i'j'} = g^{ij} D_i u^{i'} D_j u^{j'}$ which, similarly to (II),
is locally constant. □
This proof also provides us with a simple analytic approach for the
definition of the curvature tensor using the integrability condition in (I). This, in
principle, is the same way as was used by Riemann to obtain an analytic
expression for the curvature tensor. In his prize memoir for the Academy of
Paris, Riemann (1861) studied the problem of locally transforming the heat
conduction equation
    D_i(g^{ij} D_j f) = f_t
by a coordinate transformation into the normal form
    \sum_{i=1}^{n} \varepsilon_i D_i^2 f = f_t,
which corresponds to a homogeneous body. One easily verifies that the
coefficients $g^{ij}$ are transformed like tensors under coordinate changes. Thus
the local vanishing of the $R^i_{jkm}$ is necessary and sufficient for the existence of
such a local normal form.
Riemann's work of 1861 may be regarded as the hour of birth for the
Riemannian curvature tensor. In fact, this tensor was already implicitly
contained in Riemann's habilitation talk (1854) on the hypotheses of geometry.
There he generalized the Gaussian surface curvature. This is discussed more
thoroughly in Spivak (1979, M), Vol. 2, Chapter 4. This volume also contains
seven variants of the proof of Theorem 74.D. It also gives an introduction to
the various ways of developing a calculus for differential geometry. They all
use the integrability condition, but often in a very implicit form. Bernhard
Riemann died in 1866 at the age of 40. His collected works only fill one volume.
But his ideas, revealing deep connections between analysis, topology, and
geometry, profoundly influenced the mathematics and physics of our century.
He was, for example, the first mathematician who discovered global analytic
results in connection with his study of complex analytic functions.

74.21. Applications to Non-Euclidean Geometry


One of the most interesting developments in mathematics begins with the
parallel axiom of Euclid around 325 B.C., and leads up to Einstein's general
656 74. Classical Surface Theory, Theorema Egregium of Gauss

theory of relativity and the gauge field theories of our days. Generally, one is
concerned with the question: What is geometry, and what is the role played
by geometry in understanding the structure of our world?
In his "Elements," in which he gave an axiomatic definition of geometry,
Euclid postulated for arbitrary points P and straight lines g in the plane:

(P) If P is not on g, then there exists exactly one straight line through P which
does not intersect g.
Actually, Euclid used another formulation, but for our purposes, version
(P) is most convenient. It is contained in the standard book on the axiomatic
foundations of geometry of Hilbert (1903), which was published in 1977 in its
twelfth edition. Historically, the following question was important: Is (P) a
consequence of the other axioms of Euclid or is (P) independent? Today we
know that (P) is independent and that there exist elliptic and hyperbolic
non-Euclidean geometries where (P) is replaced with (Pemp) and (Phyp):

(Pemp) If P is not on g, then there exists no straight line through P which does
not intersect g.
(Phyp) If P is not on g, then there are infinitely many straight lines through
P which do not intersect g.
In this section we want to show that there exists a natural connection
between elliptic, Euclidean, and hyperbolic geometries, which is obtained by
using spherical trigonometry and by passing to imaginary spherical radii,
i.e., a change from constant Gaussian curvature, K > 0 (elliptic), to K = 0
(Euclidean), and K < 0 (hyperbolic).
During the first half of the nineteenth century, Gauss (1817), and also Janos
Bolyai and Lobacevskii around 1830, independently, came to the conclusion
that there exist hyperbolic geometries. Gauss, however, did not publish his
results because he feared the verdict of small-minded philosophers and
mathematicians of his time. Non-Euclidean geometries only became generally
accepted after Beltrami (1868), Klein (1871), and Poincaré (1882) constructed
simple models for these geometries.

EXAMPLE 74.54 (Elliptic Non-Euclidean Geometry). We choose Memp(R) to
be equal to the upper half of the surface of a ball of radius R, including the
equator where antipodal points of the equator are identified. The "straight
lines" on Memp are the great circles, i.e., the curves of intersection between
Memp and planes passing through the center of the ball. As a "plane" we choose
Memp. Identification of the antipodal points of the equator implies that two
straight lines on Memp intersect at exactly one point and that only one straight
line passes through two different points. Thus (Pemp) holds. The geometry on
Memp is the usual spherical geometry. In particular, all straight lines have
length $\pi R$ and the area of the plane is $2\pi R^2$. The Gaussian curvature of the
sphere is $K = 1/R^2$.
Interestingly, in the 2000-year-long history of the parallel axiom, no one
came up with the idea of using this simple model for a proof of the indepen-
dence of (P). The reason for this was probably that it was implicitly assumed
that straight lines are of infinite length. This is true for the hyperbolic geometry.
EXAMPLE 74.55 (Hyperbolic Non-Euclidean Geometry According to Poincaré
(1882)). Let
$$M_{\mathrm{hyp}} = \{(\xi, \eta) \in \mathbb{R}^2 : \eta > 0\}$$
and choose as metric for the upper half plane
$$ds^2 = (d\xi^2 + d\eta^2)/\eta^2. \tag{85}$$
Then Mhyp is a two-dimensional, proper Riemannian manifold. The hyperbolic
geometry is the corresponding Riemannian geometry. We write
$$ds^2 = g_{ij}\,du^i\,du^j$$
with $\xi = u^1$, $\eta = u^2$, and $g_{11} = g_{22} = 1/\eta^2$, $g_{12} = g_{21} = 0$. Measurements of
length, area, and angle on Mhyp are performed according to formulas (54)-(56).
From (56) it follows, in particular, that the measurement of angles on Mhyp
coincides with the Euclidean measurement of angles. This is one of the
advantages of the Poincaré model. The area of the plane Mhyp is infinite since
$$\int \sqrt{g}\,du^1\,du^2 = \int_{-\infty}^{\infty}\int_0^{\infty} d\xi\,d\eta/\eta^2 = \infty.$$
As the Christoffel symbols we obtain
$$\Gamma^1_{12} = \Gamma^1_{21} = -1/\eta$$
and
$$\Gamma^2_{11} = -\Gamma^2_{22} = 1/\eta.$$
All the other $\Gamma^k_{ij}$ are identically zero. Thus $K = R_{1212}/g = -1$, i.e., Mhyp has
constant negative Gaussian curvature. The "straight lines" in this geometry
are the geodesics. They are most easily computed from the variational problem
$\int ds = \min!$, i.e., from
$$\int_{\eta_1}^{\eta_2} \frac{\sqrt{\xi'^2 + 1}}{\eta}\,d\eta = \min!, \qquad \xi(\eta_j) = \xi_j, \quad j = 1, 2. \tag{86}$$
The Euler equation is here $dL_{\xi'}/d\eta = 0$, i.e.,
$$L_{\xi'} = \frac{\xi'}{\eta\sqrt{\xi'^2 + 1}} = \text{const},$$
hence
$$(\xi - C)^2 + \eta^2 = D^2,$$
where C and D are constants. Therefore, all "straight lines" on Mhyp are
Euclidean half circles with center on the $\xi$-axis (if the constant vanishes, one
obtains the vertical half lines $\xi = \text{const}$ as limiting case). From Figure 74.11 we obtain (Phyp),
because if one chooses a "straight line" g in the open, upper half-plane Mhyp
and a point P on Mhyp which is not on g, then there exist infinitely many
"straight lines" which do not intersect g. The boundary points of the upper
half plane, i.e., the points on the $\xi$-axis, play the role of infinitely distant points
on Mhyp. The "straight lines" on Mhyp are of infinite length because, along such a line,
$$\int_{\eta_1}^{\eta_2} ds \ge \int_{\eta_1}^{\eta_2} \frac{d\eta}{\eta} = \ln\frac{\eta_2}{\eta_1} \to \infty \quad\text{as } \eta_1 \to 0.$$

Figure 74.11

The geometry on Mhyp admits a simple physical interpretation. The variational
problem (86) corresponds to Fermat's principle of Example 37.4 with
index of refraction equal to $n(\xi, \eta) = c/\eta$, i.e., we think of the upper half plane
as filled with a medium with index of refraction equal to n. Then (86) states
that light tries to get from one point to another in the shortest possible time.
The geodesics are the light rays (half circles) as shown in Figure 74.11. The
geodesic distance $\int ds$ between two points is the time t which is needed by the
light to get from one point to another.
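
The shortest-path property of the half circles can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes the metric (85), approximates curve lengths by a polygonal midpoint rule, and compares the half circle through two sample points with the Euclidean segment joining them. The function names, the sample points P and Q, and the use of the well-known closed-form distance $\operatorname{arcosh}\bigl(1 + |P - Q|^2/(2\eta_P\eta_Q)\bigr)$ for the upper half plane are choices made for this illustration.

```python
import math

def hyperbolic_length(curve, n=20000):
    """Length of t -> (xi, eta), t in [0, 1], in the metric
    ds^2 = (d xi^2 + d eta^2) / eta^2, via a polygonal midpoint rule."""
    total = 0.0
    xi0, eta0 = curve(0.0)
    for k in range(1, n + 1):
        xi1, eta1 = curve(k / n)
        eta_mid = 0.5 * (eta0 + eta1)
        total += math.hypot(xi1 - xi0, eta1 - eta0) / eta_mid
        xi0, eta0 = xi1, eta1
    return total

def closed_form_distance(p, q):
    """Known closed form of the geodesic distance in the upper half plane."""
    (x1, y1), (x2, y2) = p, q
    return math.acosh(1.0 + ((x2 - x1) ** 2 + (y2 - y1) ** 2) / (2.0 * y1 * y2))

# Sample points (an assumption of this sketch).
P, Q = (-1.0, 1.0), (1.0, 1.0)

def half_circle(t):
    # Euclidean circle centered at (0, 0) on the xi-axis, passing through P and Q.
    radius = math.sqrt(2.0)
    angle = 0.75 * math.pi - t * 0.5 * math.pi
    return (radius * math.cos(angle), radius * math.sin(angle))

def segment(t):
    # Euclidean straight segment from P to Q (not a geodesic of Mhyp).
    return (P[0] + t * (Q[0] - P[0]), P[1] + t * (Q[1] - P[1]))

arc_length = hyperbolic_length(half_circle)
segment_length = hyperbolic_length(segment)
print(arc_length, segment_length, closed_form_distance(P, Q))
```

In this sketch the half circle comes out shorter than the Euclidean segment, and its length agrees with the closed form up to discretization error.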
EXAMPLE 74.56 (Natural Connection Between Elliptic, Euclidean, and Hyperbolic
Geometry). As in Example 74.54, we introduce geodesic polar coordinates
$\varphi$, $\rho$ on Memp(R). Here $\varphi$ is the geographic longitude and $\rho$ is the distance
between the corresponding point on the sphere and the North pole. Then
$$ds^2 = d\rho^2 + R^2\sin^2(\rho/R)\,d\varphi^2. \tag{87}$$
Note that $\rho = R\theta$ where $\theta$ is the geographic latitude measured from the North
pole.
On Mhyp we choose $\varphi = \xi$ as geodesic polar coordinates and
$$\rho = \text{distance from } (0, 0) \text{ in the metric for Mhyp}.$$
After some computations this yields
$$ds^2 = d\rho^2 + \sinh^2\rho\,d\varphi^2. \tag{88}$$
In usual polar coordinates for the Euclidean plane we get
$$ds^2 = d\rho^2 + \rho^2\,d\varphi^2. \tag{89}$$
Now note the remarkable fact that, for $R = i$, equation (88) follows from (87)
and (89) follows from (87) as $R \to \infty$. This is a special case of the following
interesting result:
(L) If one has the formulas for the spherical geometry for a sphere of radius
R, then for $R = i$ one gets the corresponding formulas for Mhyp and for
$R \to \infty$ the formulas for the Euclidean geometry.

Figure 74.12

Therefore, Mhyp is often called a sphere with imaginary radius $R = i$. From
$$K = 1/R^2$$
on Memp(R), for example, one obtains $K = 0$ as $R \to \infty$ and $K = -1$ on Mhyp.
This coincides with Example 74.55. As another example we consider a triangle
on Memp(R) with angles $\alpha$, $\beta$, $\gamma$ and lengths a, b, c of the opposite sides (compare
Fig. 74.12). Then
$$\alpha + \beta + \gamma = \pi + S/R^2,$$
where S is the area of the triangle. This implies
$$\alpha + \beta + \gamma = \pi$$
for the Euclidean geometry and
$$\alpha + \beta + \gamma = \pi - S$$
on Mhyp. In what follows, we list a number of basic formulas which, respectively,
hold for the elliptic geometry on Memp(R) with R = 1, the Euclidean
geometry, and the hyperbolic geometry on Mhyp. All these results are obtained
from (L). Note that the formulas for Memp(R) follow from the formulas for
Memp(1) below, by dividing the quantities L, $\rho$, a, b, c (or S) by R (or $R^2$).
Circumference ($\rho$ = radius of the circle):
$$L = 2\pi\sin\rho \quad\text{(elliptic)},$$
$$L = 2\pi\rho \quad\text{(Euclidean)},$$
$$L = 2\pi\sinh\rho \quad\text{(hyperbolic)}.$$

Area of the circle:
$$S = 2\pi(1 - \cos\rho),$$
$$S = \pi\rho^2,$$
$$S = 2\pi(\cosh\rho - 1).$$

Pythagorean theorem:
$$\cos a = \cos b\cos c,$$
$$a^2 = b^2 + c^2,$$
$$\cosh a = \cosh b\cosh c.$$

Law of sines:
$$\sin\alpha : \sin\beta : \sin\gamma = \sin a : \sin b : \sin c,$$
$$\sin\alpha : \sin\beta : \sin\gamma = a : b : c,$$
$$\sin\alpha : \sin\beta : \sin\gamma = \sinh a : \sinh b : \sinh c.$$

Law of cosines:
$$\cos\alpha = \frac{\cos a - \cos b\cos c}{\sin b\sin c},$$
$$\cos\alpha = \frac{b^2 + c^2 - a^2}{2bc},$$
$$\cos\alpha = \frac{\cosh b\cosh c - \cosh a}{\sinh b\sinh c}.$$

Angular sum in a triangle:
$$\alpha + \beta + \gamma = \pi + S > \pi \quad\text{(elliptic)},$$
$$\alpha + \beta + \gamma = \pi \quad\text{(Euclidean)},$$
$$\alpha + \beta + \gamma = \pi - S \quad\text{(hyperbolic)}.$$
The last three formulas for the angular sum in a triangle are special cases
of the important theorem of Gauss-Bonnet (cf. Problem 74.11). It might be
the most important theorem in global differential geometry. Intuitively, it
states that living beings in two dimensions are able to determine the geometry
they live in solely on the basis of measurements at the circle or the triangle.
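
Principle (L) can be tried out directly with complex arithmetic. The following sketch is an illustration under stated assumptions, not part of the original text: it takes the spherical formulas for the circumference and the area of a circle on a sphere of radius R, namely $L = 2\pi R\sin(\rho/R)$ and $S = 2\pi R^2(1 - \cos(\rho/R))$ (obtained from the R = 1 formulas by the rescaling described above), and evaluates them for $R = i$ and for a large real R. The function names and the sample values are hypothetical.

```python
import cmath
import math

def circumference(rho, R):
    """Circumference of a circle of radius rho on a sphere of radius R.
    R may be complex, so that R = 1j models the 'imaginary radius'."""
    return 2 * math.pi * R * cmath.sin(rho / R)

def disk_area(rho, R):
    """Area of a circle of radius rho on a sphere of radius R (R may be complex)."""
    return 2 * math.pi * R ** 2 * (1 - cmath.cos(rho / R))

rho = 0.7                      # sample radius (assumption of this sketch)

# R = i: the spherical formulas turn into the hyperbolic ones.
L_hyp = circumference(rho, 1j)
S_hyp = disk_area(rho, 1j)

# Large real R: the spherical formulas approach the Euclidean ones.
L_flat = circumference(rho, 1.0e3)
S_flat = disk_area(rho, 1.0e3)

print(L_hyp, 2 * math.pi * math.sinh(rho))
print(S_hyp, 2 * math.pi * (math.cosh(rho) - 1))
print(L_flat, 2 * math.pi * rho)
print(S_flat, math.pi * rho ** 2)
```

The same substitution can be applied to the Pythagorean theorem and to the laws of sines and cosines, since $\cos(x/i) = \cosh x$ and $i\sin(x/i) = \sinh x$.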
EXAMPLE 74.57 (Local Realization of the Hyperbolic Geometry on a Pseudosphere).
The pseudosphere Spseu is obtained through rotation of the curve
$\zeta = \zeta(\xi)$ about the $\zeta$-axis in a Cartesian $(\xi, \eta, \zeta)$-coordinate system. Introducing
polar coordinates $\varphi$, r we obtain
$$\xi = r\cos\varphi, \qquad \eta = r\sin\varphi, \qquad \zeta = \pm\int_r^1 r^{-1}\sqrt{1 - r^2}\,dr$$
(Fig. 74.13). Naturally, the curves $\varphi = \text{const}$ and $r = \text{const}$ are called meridians
and circles of latitude. The element of arc is $ds^2 = r^{-2}\,dr^2 + r^2\,d\varphi^2$.
Letting $\xi = \varphi$, $\eta = 1/r$, we obtain
$$ds^2 = (d\xi^2 + d\eta^2)/\eta^2.$$

Figure 74.13

This is the metric (85) for Mhyp. Thus, the geometry of Mhyp with $\eta > 1$ can be
realized on the northern half of Spseu (Fig. 74.11).
Hilbert (1901), however, showed that the complete Riemannian manifold
Mhyp cannot be realized as a surface in $\mathbb{R}^3$. Note that Spseu is not a manifold,
since at the equator points r = 1 there exist no tangent planes. Nevertheless,
the Whitney embedding theorem (1936) of Section 73.21 implies that Mhyp can
be embedded as a manifold in $\mathbb{R}^{2n+1}$, n = 2. One should note, though, that the
embedded manifold need not have the same metric as Mhyp. However, the
embedding theorem of Nash (1956) shows that every real, n-dimensional,
proper Riemannian $C^k$-manifold M with countable basis, $3 \le k \le \infty$, can
isometrically be embedded in $\mathbb{R}^m$ for m sufficiently large. This embedding is of
class $C^k$. It suffices to choose $m = 2^{-1}n(n + 1)(3n + 11)$. If M is compact, one
can choose $m = 2^{-1}n(n + 5) + 3$. For Mhyp we have n = 2. In proving this,
Nash solved a complicated system of nonlinear partial differential equations
by using the hard implicit function theorem (see Problem 5.9). Embedding
theorems for Riemannian manifolds are discussed in detail in Gromov and
Rohlin (1970, S, B) and Gromov (1986, M).
An extremely simple and very elegant proof of the Nash embedding theorem
can be found in Günther (1989).

EXAMPLE 74.58 (Model of Beltrami). In 1868, Beltrami published the first
concrete model for a hyperbolic geometry. On the open unit disk
$$M = \{(u, v) \in \mathbb{R}^2 : u^2 + v^2 < 1\}$$
he introduced the metric
$$ds^2 = \frac{R^2}{(1 - u^2 - v^2)^2}\,\bigl[(1 - v^2)\,du^2 + 2uv\,du\,dv + (1 - u^2)\,dv^2\bigr].$$
For the Gaussian curvature, one finds that $K = -1/R^2$. The "straight lines,"
i.e., the geodesics in M, are segments of straight lines. From Figure 74.14 we
obtain (Phyp).
This model played a famous role in the development of abstract manifolds,
because it was an important example of an abstract manifold which could not
be realized as a surface in $\mathbb{R}^3$. However, a local realization on the pseudosphere
with equator radius R is possible. All this led to the realization of abstract
manifolds as useful mathematical tools.

Figure 74.14    Figure 74.15    Figure 74.16

EXAMPLE 74.59 (Non-Euclidean Models of Felix Klein). In 1871, Felix Klein
showed that elliptic and hyperbolic models may be constructed in the context
of projective geometry. A model for the hyperbolic, non-Euclidean geometry
is obtained by choosing the open unit disk as the plane M. The "straight lines"
are segments of straight lines. From Figure 74.14 we obtain (Phyp). As distance
between two points x and y, one chooses
$$d(x, y) = \left|\,\ln\frac{|a - x|\,|b - y|}{|a - y|\,|b - x|}\,\right|.$$
This is the logarithm of the double ratio of the four points a, x, y, b where a
and b lie on the boundary of the unit disk (Fig. 74.15). All points on the
boundary of the unit disk are the infinitely distant points in this geometry.
The model for the elliptic geometry of Felix Klein is equivalent to Memp of
Example 74.54. More precisely, we choose the tangent plane M at the north
pole of Memp and project Memp from the center of the ball onto M (Fig. 74.16).
We add infinitely distant points to M which correspond to the equator points
of Memp, where antipodal points are identified. This way, we obtain $M_\infty$ and
by projecting this geometry we map Memp onto $M_\infty$.
These models are carefully discussed in the standard book of Klein (1928,
M,H).
We conclude this section with a general remark. In his famous work
"Critique of Pure Reason," Immanuel Kant in 1781 states that Euclidean
geometry is "thought necessary," because it is immanent in the thoughts of
every person. After Gauss in 1817 recognized that there exist non-Euclidean
geometries, it was clear that physical laws determine the structure of space
which therefore would have to be tested experimentally. This program then
was realized by Einstein, 1916, as we shall see in Chapter 76. Interestingly,
Euclidean geometry was revived in a modified form at the beginning of this
century, since the geometry of infinite-dimensional H-spaces is Euclidean.
Around 1925 this geometry then became the basis of quantum theory. In Part
II we saw already that boundary-value problems for elliptic partial differential
equations fit into this geometric concept. It is quite remarkable that the
famous Dirichlet problem of the nineteenth century can be solved very easily
by using the following fact: In an H-space one can construct the
perpendicular to a closed hyperplane H from each point $P \notin H$ (Fig. 74.17).

Figure 74.17

This implies the theorem of Riesz and in turn the existence theorem for
quadratic variational problems (Section 18.11d). Some special applications of
this existence theorem are the existence theorems for the basic problems in
linear elasticity theory of Chapter 61. In Part V, we shall see that classical
mechanics and statistical physics in phase space can best be understood in the
context of symplectic geometry. Also, as has been mentioned already several
times, the goal of the gauge field theories for elementary particles is the
reduction of the fundamental physical interactions to the curvature of fiber
bundles. All this illustrates the current trend towards a geometrization of
physics.
The most general geometric properties are topological properties. Those
are properties which remain unchanged under homeomorphisms and hence
satisfy a strong form of structural stability. Deep mathematical results are
those which show a connection between analytic and topological properties.
In Problem 74.11 we consider, as an important example in this connection,
the global theorem of Gauss-Bonnet, which states that the Gaussian total
curvature of a closed surface is a topological property.

74.22. Strategy for a Further Development of the Differential and Integral Calculus on Manifolds

In the last sections of this chapter we consider several important supplements
to tensor calculus, which will be discussed more thoroughly in Part V in the
context of Banach manifolds:
(i) Alternating differentiation of alternating tensors (applications to Cartan's
calculus of differential forms).
(ii) Lie derivative as a generalized directional derivative (applications to Lie
algebras of vector fields and Lie groups).
The importance of (i) and (ii) is that one does not need the Christoffel
symbols for the definition, i.e., these two derivative concepts exist for arbitrary
manifolds. We use the following strategy:

(S) Using covariant differentiation, we look for expressions which do not
explicitly contain the $\Gamma^k_{ij}$, such as, e.g., (90) below. This yields tensor fields
which can be formed on general manifolds without $\Gamma^k_{ij}$, i.e., without affine
connection.
Moreover, by integrating differential forms, we obtain a general method
to formulate integral theorems on oriented, finite-dimensional manifolds. This
includes classical volume, surface, and curve integrals.

74.23. Alternating Differentiation of Alternating Tensors

First we consider $\mathbb{R}^n$, i.e., we assume (A) of Section 74.2. In
addition we require:
(T) The tensor field $t_{i_1 \ldots i_r}$ is antisymmetric.

Proposition 74.60. If (T) holds, then
$$\operatorname{Alt}\nabla_j t_{i_1 \ldots i_r} = \operatorname{Alt} D_j t_{i_1 \ldots i_r}. \tag{90}$$

PROOF. The terms with $\Gamma^k_{ij}$ cancel out. For example,
$$\nabla_i t_j - \nabla_j t_i = D_i t_j - D_j t_i - \Gamma^s_{ij} t_s + \Gamma^s_{ji} t_s = D_i t_j - D_j t_i. \qquad\Box$$

Definition 74.61. If (T) holds, then we define alternating differentiation through
$$d_j t_{i_1 \ldots i_r} = \operatorname{Alt} D_j t_{i_1 \ldots i_r}. \tag{91}$$

It is important that $d_j t_{i_1 \ldots i_r}$ is again an antisymmetric tensor field, because
the left-hand side of (90) is a tensor, hence also the right-hand side. As in
Section 74.17, we immediately are able to extend the operation $d_j$ to arbitrary
manifolds by using charts.

74.24. Applications to the Calculus of Alternating Differential Forms

This very elegant and supple calculus, which was created by Élie Cartan
(1869-1961), will be discussed more thoroughly in Part V, where we consider
this calculus for Banach manifolds. Here we want to show how easily the
invariance properties of this calculus for finite-dimensional manifolds are
deduced from (91) and the substitution rule for integrals.

74.24a. Basic Ideas

The following four observations form the basis of Cartan's calculus.

(C1) Substitution rule. Let S be a bounded region in $\mathbb{R}^2$. Consider the integral
$J = \int_S f\,du\,dv$. We also write
$$J = \int_S f\,du \wedge dv.$$
For the product symbol "$\wedge$" we use the formal rule
$$a \wedge b = -b \wedge a$$
and hence $a \wedge a = 0$. Under changes of variables $u = u(t, s)$, $v = v(t, s)$,
we obtain
$$du \wedge dv = (u_t\,dt + u_s\,ds) \wedge (v_t\,dt + v_s\,ds) = (u_t v_s - u_s v_t)\,dt \wedge ds.$$
This gives
$$J = \int f\,\frac{\partial(u, v)}{\partial(t, s)}\,dt\,ds,$$
which is the correct substitution rule if the functional determinant is
positive. Under changes in the orientation, J changes its sign.
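(C2) Theorem of Gauss. In order to obtain the well-known formula for the
integration by parts
$$\int_{\partial S} f\,du + g\,dv = \int_S (g_u - f_v)\,du\,dv$$
in a more elegant form
$$\int_{\partial S} \omega = \int_S d\omega, \tag{92}$$
we set
$$\omega = f\,du + g\,dv,$$
and compute $d\omega$ according to
$$d\omega = df \wedge du + dg \wedge dv.$$
This gives
$$d\omega = (f_u\,du + f_v\,dv) \wedge du + (g_u\,du + g_v\,dv) \wedge dv = (g_u - f_v)\,du \wedge dv.$$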
(C3) Theorem of Stokes. We choose S as a surface in $\mathbb{R}^3$ and let
$$\omega = a\,du + b\,dv + c\,dw.$$
This gives
$$d\omega = da \wedge du + db \wedge dv + dc \wedge dw = \alpha\,dv \wedge dw + \beta\,dw \wedge du + \gamma\,du \wedge dv$$
with
$$\alpha = c_v - b_w$$
and analogous formulas for $\beta$ and $\gamma$ using cyclic permutations. Setting
$$\mathbf{w} = a e_1 + b e_2 + c e_3,$$
we obtain
$$\operatorname{curl}\mathbf{w} = \alpha e_1 + \beta e_2 + \gamma e_3.$$
Moreover, we find
$$\int_S d\omega = \int \left[\alpha\,\frac{\partial(v, w)}{\partial(t, s)} + \beta\,\frac{\partial(w, u)}{\partial(t, s)} + \gamma\,\frac{\partial(u, v)}{\partial(t, s)}\right] dt\,ds = \int_S N \operatorname{curl}\mathbf{w}\,dm,$$
where N is the unit normal vector for the surface and m is the surface
measure. Note that from Section 74.10
$$N = (x_t \times x_s)/|x_t \times x_s|$$
with $x = u e_1 + v e_2 + w e_3$ and $dm = |x_t \times x_s|\,dt\,ds$. Thus (92) corresponds
to the theorem of Stokes
$$\int_{\partial S} \mathbf{w}\,dx = \int_S N \operatorname{curl}\mathbf{w}\,dm.$$
If S is a bounded region in $\mathbb{R}^3$ and
$$\omega = a\,dv \wedge dw + b\,dw \wedge du + c\,du \wedge dv,$$
then
$$d\omega = da \wedge dv \wedge dw + db \wedge dw \wedge du + dc \wedge du \wedge dv = (a_u + b_v + c_w)\,du \wedge dv \wedge dw$$
and
$$d\,d\omega = d(a_u + b_v + c_w) \wedge du \wedge dv \wedge dw = 0$$
since $du \wedge du = 0$, etc. Then (92) is equivalent to the theorem of Gauss
$$\int_{\partial S} N\mathbf{w}\,dm = \int_S \operatorname{div}\mathbf{w}\,dx,$$
where N denotes the outer unit normal vector of the boundary $\partial S$.
For all these formulas we assume that S and $\partial S$ are coherently oriented
(see Section 73.18).
(C4) Integrability conditions. Let G be a region in $\mathbb{R}^3$. Let
$$\omega = a\,du + b\,dv + c\,dw$$
and
$$\Omega = A\,dv \wedge dw + B\,dw \wedge du + C\,du \wedge dv.$$
Moreover, let $\mathbf{w} = a e_1 + b e_2 + c e_3$ and $\boldsymbol{\Omega} = A e_1 + B e_2 + C e_3$. Given
an $\Omega$, we look for an $\omega$ such that
$$d\omega = \Omega \quad\text{on } G.$$
This equation is equivalent to
$$\operatorname{curl}\mathbf{w} = \boldsymbol{\Omega} \quad\text{on } G.$$
Because of $d\,d\omega = 0$ we immediately find the necessary solvability condition
$$d\Omega = 0, \quad\text{i.e.,}\quad \operatorname{div}\boldsymbol{\Omega} = 0 \quad\text{on } G.$$
If G is a convex or, more generally, a smoothly contractible region, then
this condition is also sufficient for the existence of a solution $\omega$. This
follows from the theorem of Poincaré of Section 74.24c below.
This way many formulas from vector analysis can quickly be derived in this
calculus. We shall also see how this calculus is generalized to classical vector
analysis on manifolds. It is a very important tool in modern mathematical
physics.
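
The formal rule $a \wedge b = -b \wedge a$ is mechanical enough to be carried out by a few lines of code. The following is a minimal sketch under stated assumptions, not the author's construction: a form with constant coefficients is stored as a dictionary that maps an increasing tuple of basis indices (0 for dt, 1 for ds) to its coefficient, and the helper names `sort_with_sign` and `wedge` as well as the sample numbers are hypothetical. It reproduces the computation of $du \wedge dv$ from (C1) at a single point.

```python
def sort_with_sign(indices):
    """Sort a tuple of basis indices by adjacent swaps.
    Returns (sign, sorted_tuple); sign is 0 if an index repeats (a ^ a = 0)."""
    idx = list(indices)
    if len(set(idx)) < len(idx):
        return 0, ()
    sign = 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign          # each swap uses a ^ b = -b ^ a
    return sign, tuple(idx)

def wedge(alpha, beta):
    """Wedge product of two forms given as {index_tuple: coefficient}."""
    result = {}
    for ia, ca in alpha.items():
        for ib, cb in beta.items():
            sign, key = sort_with_sign(ia + ib)
            if sign:
                result[key] = result.get(key, 0) + sign * ca * cb
    return {k: c for k, c in result.items() if c != 0}

# Sample partial derivatives at one point (assumed values for illustration).
u_t, u_s, v_t, v_s = 2, 3, 5, 7
du = {(0,): u_t, (1,): u_s}           # du = u_t dt + u_s ds
dv = {(0,): v_t, (1,): v_s}           # dv = v_t dt + v_s ds

du_dv = wedge(du, dv)                 # coefficient of dt ^ ds
print(du_dv, u_t * v_s - u_s * v_t)
```

The single remaining coefficient is the functional determinant $u_t v_s - u_s v_t$, which is the content of the substitution rule in (C1).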

74.24b. Calculus in $\mathbb{R}^n$

Assume (A) of Section 74.2 and consider expressions of the form
$$\omega = t_{i_1 \ldots i_r}\,du^{i_1} \wedge \cdots \wedge du^{i_r}, \tag{93}$$
where the $t_{i_1 \ldots i_r}$ are antisymmetric in all indices. Thus instead of $\omega =
a\,du^1 \wedge du^2$, for example, we write
$$\omega = t_{ij}\,du^i \wedge du^j = t_{12}\,du^1 \wedge du^2 + t_{21}\,du^2 \wedge du^1$$
with $t_{ij} = -t_{ji}$ and $t_{12} = a/2$. We formally use the antisymmetric product
symbol $\wedge$ as above, i.e., $a \wedge b \wedge c = -b \wedge a \wedge c$, etc. For didactical reasons,
this formal point of view is very convenient. A rigorous approach will be
considered in Part V by defining $a \wedge b$, etc., via antisymmetric multilinear
forms. However, this rigorous approach in Part V, which is also valid on
infinite-dimensional Banach manifolds, will not change the rules of the calculus
considered below.
We call $\omega$ an r-differential form or simply an r-form. Functions are called
0-forms. Making the change in variables
$$u^i = u^i(u^{1'}, \ldots, u^{n'}),$$
we obtain
$$\omega = t_{i_1 \ldots i_r}\,\frac{\partial u^{i_1}}{\partial u^{i_1'}} \cdots \frac{\partial u^{i_r}}{\partial u^{i_r'}}\,du^{i_1'} \wedge \cdots \wedge du^{i_r'} = t_{i_1' \ldots i_r'}\,du^{i_1'} \wedge \cdots \wedge du^{i_r'}$$
with
$$t_{i_1' \ldots i_r'} = \frac{\partial u^{i_1}}{\partial u^{i_1'}} \cdots \frac{\partial u^{i_r}}{\partial u^{i_r'}}\,t_{i_1 \ldots i_r}.$$
This yields the important result that the coefficients $t_{i_1 \ldots i_r}$ form an
antisymmetric tensor. The derivative $d\omega$ is defined through
$$d\omega = d_j t_{i_1 \ldots i_r}\,du^j \wedge du^{i_1} \wedge \cdots \wedge du^{i_r}.$$
Because of the tensor property of $d_j t_{\ldots}$ of Section 74.23, this definition does
not depend on the choice of the coordinate system. From (91) and the fact that
$a \wedge b = -b \wedge a$ we obtain
$$d\omega = D_j t_{i_1 \ldots i_r}\,du^j \wedge du^{i_1} \wedge \cdots \wedge du^{i_r}.$$
This yields the key formula
$$d\omega = dt_{i_1 \ldots i_r} \wedge du^{i_1} \wedge \cdots \wedge du^{i_r}. \tag{D}$$
This formula is easily remembered and coincides with the result for $d\omega$ of
Section 74.24a. From $D_i D_j = D_j D_i$ we immediately obtain the integrability
condition
$$d\,d\omega = 0.$$

74.24c. Calculus on Manifolds

In this section we assume:

(H1) M is a real, n-dimensional $C^\infty$-manifold.
(H2) S is an r-dimensional, compact, oriented submanifold of M with boundary
$\partial S$ where S and $\partial S$ are coherently oriented if $\partial S \neq \emptyset$.
For r = 1, 2 one may think in (H2) of curves and surfaces S in $M = \mathbb{R}^3$.
Because of (H1) we can define $\omega$ on M as in (93) if $t_{i_1 \ldots i_r}$ is an antisymmetric
$C^\infty$-tensor field on M in the sense of Section 74.17. Moreover, we may define
the derivative $d\omega$ as above. Because of the tensor property of $t_{\ldots}$ and $d_j t_{\ldots}$ these
two definitions are correct, i.e., chart independent.
The definition of integrals
$$I = \int_S \omega$$
for r-forms with r = dim S follows automatically from this calculus. Consider,
for example, a three-dimensional manifold M with local coordinates u, v, w and
$$\omega = a\,du \wedge dv + b\,dv \wedge dw,$$
where $u = u(t, s)$, $v = v(t, s)$, $w = w(t, s)$ is a local chart representation of S.
Using transformations we obtain
$$\omega = \left(a\,\frac{\partial(u, v)}{\partial(t, s)} + b\,\frac{\partial(v, w)}{\partial(t, s)}\right) dt \wedge ds,$$
as in Section 74.24a. In computing I, we replace $dt \wedge ds$ with $dt\,ds$. Thereby,
we obtain the subintegral of I over the local chart. As in Section 74.17 we find
I by taking the sum of the local subintegrals using a partition of unity.
Again, this definition is correct, i.e., chart independent. This follows from
the fact that this calculus respects the substitution rule. For example, we have
$$I_1 = \int du \wedge dv = \int \frac{\partial(u, v)}{\partial(t, s)}\,dt\,ds,$$
$$I_2 = \int du \wedge dv = \int \frac{\partial(u, v)}{\partial(\tau, \sigma)}\,d\tau\,d\sigma.$$
Because of the multiplication rule for functional determinants
$$\frac{\partial(u, v)}{\partial(t, s)}\,\frac{\partial(t, s)}{\partial(\tau, \sigma)} = \frac{\partial(u, v)}{\partial(\tau, \sigma)},$$
we obtain $I_1 = I_2$ if $\partial(t, s)/\partial(\tau, \sigma) > 0$. But since S is oriented we always have
positive functional determinants for the chart changes.
Assuming (H1), (H2), we now state three fundamental results which we will
prove in Part V. A region G in M is called smooth contractible if and only if
it can smoothly be contracted into a point $x_0 \in G$, i.e., more precisely, there
exists a $C^\infty$-map $H: G \times [0, 1] \to G$ with $H(x, 0) = x$ and $H(x, 1) = x_0$ for all
$x \in G$ (Fig. 74.18). For example, every convex region in $\mathbb{R}^n$ has this property.

(I) Integrability condition: It is $d\,d\omega = 0$.

(P) Theorem of Poincaré. Let G be a smooth contractible region in M. Then
equation
$$d\omega = \Omega$$
has a solution $\omega$ in G if and only if $d\Omega = 0$ in G.

Figure 74.18

(S) General theorem of Stokes: It is
$$\int_{\partial S} \omega = \int_S d\omega.$$
This last theorem is one of the most important theorems in analysis. For
the special case that S is an open interval in $\mathbb{R}$, theorem (S) reads:
$$\int_a^b f'(u)\,du = f(b) - f(a).$$
This is the fundamental theorem of differential and integral calculus. In fact,
(S) follows from this quite easily by passing to cuboids in the chart spaces and
using a partition of unity.
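
For a concrete instance of (S) in the plane, the two sides of $\int_{\partial S}\omega = \int_S d\omega$ with $\omega = f\,du + g\,dv$ and $d\omega = (g_u - f_v)\,du \wedge dv$ can be compared numerically. The following is a rough sketch under stated assumptions: S is taken to be the unit square with counterclockwise boundary, the coefficient functions f and g are arbitrary sample choices, integrals are approximated by the midpoint rule, and the partial derivatives by central differences. None of these choices comes from the text.

```python
def boundary_integral(f, g, n=2000):
    """Integral of f du + g dv over the counterclockwise boundary of [0, 1]^2."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        m = (k + 0.5) * h
        total += f(m, 0.0) * h        # bottom edge, u increasing
        total += g(1.0, m) * h        # right edge, v increasing
        total -= f(m, 1.0) * h        # top edge, u decreasing
        total -= g(0.0, m) * h        # left edge, v decreasing
    return total

def interior_integral(f, g, n=200, eps=1e-5):
    """Integral of (g_u - f_v) du dv over [0, 1]^2 (midpoint rule)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, v = (i + 0.5) * h, (j + 0.5) * h
            g_u = (g(u + eps, v) - g(u - eps, v)) / (2 * eps)
            f_v = (f(u, v + eps) - f(u, v - eps)) / (2 * eps)
            total += (g_u - f_v) * h * h
    return total

# Sample coefficients (assumptions of this sketch).
def f(u, v):
    return u * v * v

def g(u, v):
    return u * u + v

lhs = boundary_integral(f, g)
rhs = interior_integral(f, g)
print(lhs, rhs)
```

For these sample coefficients one has $g_u - f_v = 2u - 2uv$, so both sides equal 1/2 when computed by hand, which gives an independent check on the numbers printed above.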

74.24d. Pull-Back and Mapping Degree

If $f: M \to N$ is a map and $\omega$ a form on N, then the pull-back
$$f^*\omega \quad\text{on } M$$
is naturally obtained by mapping the N-coordinates in $\omega$ via f onto the
M-coordinates. For example, it immediately follows from
$$\omega = a\,du \wedge dv$$
and $(u, v) = f(t, s)$ that
$$f^*\omega = a\,(u_t\,dt + u_s\,ds) \wedge (v_t\,dt + v_s\,ds) = a\,(u_t v_s - u_s v_t)\,dt \wedge ds.$$

Let M and N be real, n-dimensional, compact, connected, and oriented
$C^\infty$-manifolds. One may think of closed curves in $\mathbb{R}^3$. Moreover, let $\omega$ be an n-form
on N with $\int_N \omega \neq 0$. Then one can assign a number deg f to each $C^\infty$-map
$f: M \to N$ such that
$$\int_M f^*\omega = \deg f \int_N \omega.$$
It should be noted that this number does not depend on $\omega$. This will be
discussed more thoroughly in Part V. We call deg f the mapping degree of f.
In gauge field theories, physicists call deg f a topological charge, since deg f is
an integer and hence invariant under deformations, i.e., homotopies.
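
In the simplest case M = N = unit circle and $\omega = d\varphi$ (the angle form, with $\int_N \omega = 2\pi$), the defining relation says that deg f is the total change of angle of f along the circle divided by $2\pi$. The following sketch illustrates this under stated assumptions: maps are represented as functions of a complex number of modulus one, the integral of $f^*\omega$ is approximated by summing small angle increments, and the function name `degree` and the sample maps are hypothetical.

```python
import cmath
import math

def degree(f, n=4000):
    """Approximate deg f for a smooth map f of the unit circle into itself:
    (integral of f* d(phi)) / (integral of d(phi)) = total angle change / (2 pi)."""
    total = 0.0
    prev = f(1 + 0j)
    for k in range(1, n + 1):
        cur = f(cmath.exp(2j * math.pi * k / n))
        total += cmath.phase(cur / prev)   # small angle increment in (-pi, pi]
        prev = cur
    return total / (2 * math.pi)

deg_cube = degree(lambda z: z ** 3)                 # winds three times
deg_conj = degree(lambda z: z.conjugate())          # reverses the orientation
# A deformation of z -> z^2 that stays on the unit circle (sample choice).
deg_deformed = degree(lambda z: z ** 2 * cmath.exp(0.3j * z.real))

print(deg_cube, deg_conj, deg_deformed)
```

The third map is homotopic to $z \mapsto z^2$, so the computed value illustrates the invariance of deg f under deformations.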

74.24e. Inner Product

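If
$$\omega = t_{i_1 \ldots i_r}\,du^{i_1} \wedge \cdots \wedge du^{i_r}$$
is an r-form on the manifold M and $v = v^i b_i$ is a vector field on M, then we
define the inner product through
$$i_v\omega = \frac{r!}{(r - 1)!}\,v^{i_1} t_{i_1 i_2 \ldots i_r}\,du^{i_2} \wedge \cdots \wedge du^{i_r}.$$

Because of the tensor property of $v^i$, $t_{i_1 \ldots i_r}$ this definition is coordinate
independent.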

74.24f. De Rham Cohomology

An r-form $\omega$ with
$$d\omega = 0$$
is called an r-cocycle. If
$$\omega = d\alpha,$$
then $\omega$ is called an r-coboundary of $\alpha$. The choice of these names is motivated
by the duality between homology and cohomology, which will be discussed
in Part V. Two r-forms $\omega_1$ and $\omega_2$ are called cohomologous if and only if
$$\omega_1 - \omega_2 = d\alpha,$$
i.e., if they differ in an r-coboundary. We write
$$\omega_1 \equiv \omega_2 \bmod \alpha.$$
All forms which are cohomologous to $\omega_1$ form the cohomology class of $\omega_1$.
All r-forms on a manifold M can be added and multiplied with real numbers.
If one identifies cohomologous forms under these operations, one obtains a
real linear space $H^r(M)$, r = 0, 1, .... It is called the rth de Rham cohomology
group. It is already nontrivial to show that for the n-sphere $S^n$:
$$H^r(S^n) = \begin{cases} \mathbb{R} & \text{for } r = 0, n, \\ 0 & \text{otherwise,} \end{cases}$$
holds. In the language of linear algebra, $H^r(M)$ is the factor space of the
r-cocycles with respect to the r-coboundaries. The elements of $H^r(M)$ are the
cohomology classes. This very formal definition provides us with an excellent
tool to describe deep connections between analysis and topology. According
to the theorem of de Rham, which will be discussed in Problem 74.11e, $H^r(M)$
is a topological invariant of M. Cohomology theory will be discussed in
greater detail in Part V. The theorem of Poincaré may now be stated in the
following way.
(P) If G is a smooth contractible region in M, then $H^r(G) = 0$ for all $r \ge 1$.
Actually, $H^r(G) = 0$ means that all r-cocycles $\Omega$ are also r-coboundaries, i.e.,
from $d\Omega = 0$ follows the existence of an $\omega$ with $\Omega = d\omega$. This is a special case
of the important phenomenon that global analytic existence theorems may be
expressed in terms of a vanishing of cohomology groups.

74.24g. Duality of Forms and the Decomposition Theorem of Hodge

Let M be a real, n-dimensional, oriented Riemannian $C^\infty$-manifold. Then
the metric tensor $g_{ij}$ and hence $g = \det(g_{ij})$ is defined. Therefore also the
pseudotensor
$$E_{i_1 \ldots i_n} = \sqrt{|g|}\,\varepsilon_{i_1 \ldots i_n}$$
of Section 74.7 is defined on M. Given an r-form
$$\omega = \frac{1}{r!}\,t_{i_1 \ldots i_r}\,du^{i_1} \wedge \cdots \wedge du^{i_r},$$
we define a dual (n - r)-form $*\omega$ through
$$*\omega = \frac{1}{(n - r)!}\,{*t}_{i_{r+1} \ldots i_n}\,du^{i_{r+1}} \wedge \cdots \wedge du^{i_n}$$
with

Here $t^{\ldots}$ is obtained from $t_{\ldots}$ by lifting the indices with the aid of $g^{ij}$, i.e.,
$$t^{i_1 \ldots i_r} = g^{i_1 j_1} \cdots g^{i_r j_r}\,t_{j_1 \ldots j_r}.$$
Since ${*t}_{\ldots}$ is a pseudotensor, $*\omega$ is transformed like a pseudoscalar.
Since M is oriented, a chart change uses positive functional determinants.
Consequently, $*\omega$ behaves like a scalar. Thus the definition of $*\omega$ is chart
independent.
Moreover, we define
~ro = ( -ly(n-r)+r(sgng)•d•ro

and $\delta f = 0$ for functions f. Having this we define the Laplace operator for
forms as
$$\Delta\omega = -(d\,\delta\omega + \delta\,d\omega).$$
For functions, this is the classical Laplace operator $\Delta f = g^{ij}\nabla_i\nabla_j f$. Note that
in the literature one often uses the definition $\Delta = d\delta + \delta d$, which does not
coincide with the classical definition.
Now we assume in addition that M is a compact, proper Riemannian
manifold. For r-forms $\omega$ and $\alpha$ we define a scalar product
$$(\omega, \alpha) = \int_M \omega \wedge *\alpha.$$
Using a coordinate representation one easily verifies the following properties:
(i) $(\omega, \omega) \ge 0$, and $(\omega, \omega) = 0 \Leftrightarrow \omega = 0$;
(ii) $(d\omega, \beta) = (\omega, \delta\beta)$, i.e., $\delta$ is the adjoint operator to d;
(iii) $(\Delta\omega, \beta) = (\omega, \Delta\beta)$ and $(-\Delta\omega, \omega) \ge 0$.
Furthermore, one has the following fundamental theorems of Hodge.
(T1) Decomposition theorem. Every r-form $\omega$ admits a decomposition
$$\omega = d\alpha + \delta\beta + \gamma$$
with $\Delta\gamma = 0$. Here $d\alpha$, $\delta\beta$, and $\gamma$ are uniquely determined through $\omega$ and
pairwise orthogonal with respect to $(\cdot, \cdot)$.
(T2) Existence theorem. Equation $\Delta\omega = \Omega$ on M has a solution $\omega$ if and only
if $(\Omega, \gamma) = 0$ for all $\gamma$ with $\Delta\gamma = 0$ on M.
Here M is a real, compact, n-dimensional, oriented, proper Riemannian
$C^\infty$-manifold and all forms have $C^\infty$-coefficients. A proof may be found in
Warner (1974, M), p. 223. There one also finds a functional-analytic theory
for elliptic operators on M, which is similar to the Sobolev space theory of
Chapter 22. In Part V we show how d and $\delta$ can be used to obtain elegant
versions for Maxwell's equations of electrodynamics and the equations of
gauge field theories.

74.25. Lie Derivative

First we consider $\mathbb{R}^n$, i.e., we assume situation (A) of Section 74.2. Let $a^i$ be a
tensor field. We want to generalize the concept of directional derivative $a^s D_s f$
for a function f to tensor fields $t^{\cdots}_{\cdots}$. One possibility is $a^s\nabla_s t^{\cdots}_{\cdots}$. This definition,
however, depends on the Christoffel symbols. Another important possibility
is the Lie derivative $L_a t^{\cdots}_{\cdots}$.

Definition 74.62. Let $L_a f = a^s D_s f$ and
$$L_a t^i = a^s D_s t^i - t^s D_s a^i. \tag{94}$$
Another notation is $[a, t]$.
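
Formula (94) is easy to evaluate numerically for concrete vector fields. The following is a minimal sketch under stated assumptions: vector fields on $\mathbb{R}^n$ are plain Python functions returning a list of components, the partial derivatives $D_s$ are replaced by central differences, and the function name `lie_derivative` as well as the sample fields and the sample point are hypothetical choices for this illustration.

```python
def lie_derivative(a, t, u, h=1e-6):
    """Components (L_a t)^i = a^s D_s t^i - t^s D_s a^i at the point u,
    with D_s approximated by central differences of step h."""
    n = len(u)

    def partial(field, s):
        # Approximate D_s of all components of the field at u.
        up, dn = list(u), list(u)
        up[s] += h
        dn[s] -= h
        fp, fm = field(up), field(dn)
        return [(fp[i] - fm[i]) / (2 * h) for i in range(n)]

    a_u, t_u = a(u), t(u)
    result = [0.0] * n
    for s in range(n):
        dt_s, da_s = partial(t, s), partial(a, s)
        for i in range(n):
            result[i] += a_u[s] * dt_s[i] - t_u[s] * da_s[i]
    return result

# Sample fields on the plane (assumptions of this sketch).
def rotation(u):
    return [-u[1], u[0]]

def radial(u):
    return [u[0], u[1]]

def shear(u):
    return [u[1], 0.0]

point = [0.4, -1.3]
bracket_rot_radial = lie_derivative(rotation, radial, point)
bracket_rot_shear = lie_derivative(rotation, shear, point)
bracket_shear_rot = lie_derivative(shear, rotation, point)
print(bracket_rot_radial, bracket_rot_shear, bracket_shear_rot)
```

The rotation field and the radial field commute, so the first result vanishes up to rounding, and exchanging the two arguments changes the sign, in line with the bracket notation $[a, t]$.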

The definition of $L_a$ for arbitrary tensor fields is then a consequence of the
product rule, which we assume to be valid. From
$L_a(t^i t_i) = (L_a t^i) t_i + t^i L_a t_i$
and $L_a(t^i t_i) = a^s D_s(t^i t_i)$ for all $t^i$ we obtain the definition
$L_a t_i = a^s D_s t_i + t_s D_i a^s$. (95)
The symmetry of definitions (94) and (95) becomes clear if one looks at the
summation index and the index picture. Moreover, the product rule implies
that
$L_a(t^i t_j) = a^s D_s(t^i t_j) - t_j t^s D_s a^i + t^i t_s D_j a^s$,
and hence we define
$L_a t^i_j = a^s D_s t^i_j - t^s_j D_s a^i + t^i_s D_j a^s$,
and similarly
$L_a t_{ij} = a^s D_s t_{ij} + t_{sj} D_i a^s + t_{is} D_j a^s$.
Analogously, one proceeds for arbitrary tensor fields. The general rule is very
simple:
One writes $L_a t^{\cdots}_{\cdots} = a^s D_s t^{\cdots}_{\cdots}$, adds a term of the form (94) or (95) for each
index of $t^{\cdots}_{\cdots}$, and makes sure that the index picture is right.
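
The rule can be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates (94) and (95) in $\mathbb{R}^2$ with central differences and verifies the product rule $L_a(t^i w_i) = (L_a t^i) w_i + t^i L_a w_i$. The fields `a`, `t`, `w` and the step size `H` are hypothetical choices made only for illustration; the covector field is called `w` to keep it apart from the vector field `t`.

```python
# Numerical sketch of formulas (94) and (95) in R^2.
# All fields below are hypothetical examples chosen for illustration.

H = 1e-5  # step size for central differences (assumption)

def a(u):
    """Vector field a^i (a rotation field)."""
    x, y = u
    return [-y, x]

def t(u):
    """Vector field t^i."""
    x, y = u
    return [x * y, y * y]

def w(u):
    """Covector field w_i."""
    x, y = u
    return [x + y, x * x]

def partial(f, u, s):
    """Central-difference approximation of D_s f at u; f returns a list."""
    up, um = list(u), list(u)
    up[s] += H
    um[s] -= H
    return [(p - m) / (2 * H) for p, m in zip(f(up), f(um))]

def lie_vector(a, t, u):
    """Formula (94): L_a t^i = a^s D_s t^i - t^s D_s a^i."""
    n = len(u)
    Dt = [partial(t, u, s) for s in range(n)]  # Dt[s][i] = D_s t^i
    Da = [partial(a, u, s) for s in range(n)]  # Da[s][i] = D_s a^i
    av, tv = a(u), t(u)
    return [sum(av[s] * Dt[s][i] - tv[s] * Da[s][i] for s in range(n))
            for i in range(n)]

def lie_covector(a, w, u):
    """Formula (95): L_a w_i = a^s D_s w_i + w_s D_i a^s."""
    n = len(u)
    Dw = [partial(w, u, s) for s in range(n)]  # Dw[s][i] = D_s w_i
    Da = [partial(a, u, s) for s in range(n)]  # Da[i][s] = D_i a^s
    av, wv = a(u), w(u)
    return [sum(av[s] * Dw[s][i] + wv[s] * Da[i][s] for s in range(n))
            for i in range(n)]

def lie_scalar(a, f, u):
    """L_a f = a^s D_s f for a scalar field f."""
    av = a(u)
    wrapped = lambda p: [f(p)]
    return sum(av[s] * partial(wrapped, u, s)[0] for s in range(len(u)))

def contraction(p):
    """Scalar field t^i w_i."""
    return sum(ti * wi for ti, wi in zip(t(p), w(p)))

point = [0.7, -0.3]
Lt = lie_vector(a, t, point)
Lw = lie_covector(a, w, point)
lhs = lie_scalar(a, contraction, point)
rhs = sum(Lt[i] * w(point)[i] + t(point)[i] * Lw[i] for i in range(2))
```

For these particular fields a hand computation of (94) gives $L_a t = (x^2, xy)$, which the sketch reproduces up to rounding; `lhs` and `rhs` agree to the same accuracy.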

Proposition 74.63. The Lie derivative of a tensor field is again a tensor field of
the same type.

PROOF. If $f$ is a function, i.e., a scalar field, then it follows from tensor calculus
that $L_a f = a^s D_s f$ is again a scalar field. The tensor property of $L_a t^i$ and $L_a t_i$
is an immediate consequence of
$L_a t^i = a^s \nabla_s t^i - t^s \nabla_s a^i$ and $L_a t_i = a^s \nabla_s t_i + t_s \nabla_i a^s$.
The tensor property of the other Lie derivatives then follows from our
construction by applying the product rule. □

EXAMPLE 74.64 (Motivation for the Lie Derivative). If one tries to generalize
the directional derivative to tensor fields $t^i$, one observes that $a^s D_s t^i$ is not a
tensor field. On the other hand, $a^s \nabla_s t^i$ and $a^s \nabla_s t^i - t^s \nabla_s a^i$ are tensor fields.
The first expression contains $\Gamma^k_{ij}$ and can therefore not be used for a general
theory on manifolds. The second expression does not contain any $\Gamma^k_{ij}$ and is
equal to $L_a t^i$.
We shall give another motivation which also admits a physical interpretation.
Consider a point $u = (u^1, \ldots, u^n)$ together with a neighboring point
$v^i = u^i + \varepsilon a^i(u)$.
At the point $v$ we perform the coordinate transformation

$v^{i'} = v^i - \varepsilon a^i(u)$,

i.e., $v^{i'} = u^i$. The transformation of $t^i(v)$ yields

$t^{i'}(v) = \frac{\partial v^{i'}}{\partial v^j}\, t^j(v)$.

Explicitly, we have
$t^{i'}(u + \varepsilon a) - t^i(u) = \varepsilon L_a t^i(u) + o(\varepsilon)$, $\varepsilon \to 0$. (96)

PROOF. This follows from

$\frac{\partial v^{i'}}{\partial v^j} = \delta^i_j - \varepsilon\, \frac{\partial a^i(u)}{\partial u^k}\, \frac{\partial u^k}{\partial v^j}$.

A formula analogous to (96) is obtained if one replaces $t^i$ with an arbitrary
tensor field $t^{\cdots}_{\cdots}$. Equation (96) provides the interpretation of $L_a$. Physically, it
means the following. An observer travels from a point $u$, where he measures
the field $t^i(u)$, to a neighboring point $v$. There he measures the field $t^i(v)$, which
he transforms back into his old coordinates at the point $u$. This gives $t^{i'}(u + \varepsilon a)$.
The difference
$t^{i'}(u + \varepsilon a) - t^i(u)$

is equal to $\varepsilon L_a t^i(u)$ except for terms $o(\varepsilon)$, i.e.,

$L_a t^i(u) = \lim_{\varepsilon \to 0} \frac{t^{i'}(u + \varepsilon a) - t^i(u)}{\varepsilon}$. □

This interpretation motivates the following definition.

Definition 74.65. A tensor field $t^{\cdots}_{\cdots}$ is constant in the sense of Lie along the
vector field $a = a^i b_i$ if and only if the Lie derivative with respect
to $a$ is identically zero, i.e., $L_a t^{\cdots}_{\cdots} \equiv 0$.
If one constructs a flow by using $a$ and the differential equation
$\dot u^i = a^i(u(\tau))$,
whose tangent vectors are equal to $a$, then one says that the tensor field $t^{\cdots}_{\cdots}$ is
invariant with respect to this flow if and only if

$L_a t^{\cdots}_{\cdots} \equiv 0$.
Here, $a$ denotes the tangent vector field of the flow. One may think of a water
flow. Then $a$ is the velocity field and $t^{\cdots}_{\cdots}$ is another physical field, e.g., in the
case of $t^i$ a force field or magnetic field.

The Lie derivative therefore plays an important role in the study of
symmetries of tensor fields.
Suppose M is a finite-dimensional manifold. Then, similarly as in Section
74.17, one can define the Lie derivative for tensor fields $t^{\cdots}_{\cdots}$ on M as above by
working in charts. In Definition 74.65, $\{b_i\}$ is the natural basis of the tangent
spaces to M.
In Part V we shall give a definition of Lie derivatives for Banach manifolds
using the concept of flows, which is equivalent to the above definition.

74.26. Applications to Lie Algebras of Vector Fields and Lie Groups

We want to show that Lie derivatives lead naturally to Lie algebras.

Definition 74.66. A Lie algebra X over $\mathbb{K}$ is a linear space over $\mathbb{K}$, where, in
addition, a product $[v, w]$ is defined with
(i) $[v, w] = -[w, v]$;
(ii) $[v, [w, z]] + [w, [z, v]] + [z, [v, w]] = 0$.
More precisely, $(v, w) \mapsto [v, w]$ is a bilinear map from $X \times X$ into $X$ with (i),
(ii) for all $v, w, z \in X$. The dimension of the Lie algebra X is equal to the
dimension of the linear space X.
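
As a quick illustration that is not taken from the text, $\mathbb{R}^3$ with the vector product $[v, w] = v \times w$ is a well-known three-dimensional real Lie algebra. The sketch below checks conditions (i) and (ii) for one hypothetical choice of integer vectors, so that the arithmetic is exact.

```python
# Sketch: R^3 with the cross product as bracket satisfies (i) and (ii).
# The sample vectors are arbitrary illustrative choices.

def bracket(v, w):
    """Cross product v x w, used as the Lie bracket [v, w]."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def add(*vectors):
    """Componentwise sum of several vectors."""
    return tuple(sum(components) for components in zip(*vectors))

def neg(v):
    """Componentwise negative of a vector."""
    return tuple(-c for c in v)

v, w, z = (1, 2, 3), (-4, 0, 5), (2, -1, 7)

# (i) antisymmetry: [v, w] = -[w, v]
antisymmetric = bracket(v, w) == neg(bracket(w, v))

# (ii) Jacobi identity: the cyclic sum vanishes
jacobi_sum = add(bracket(v, bracket(w, z)),
                 bracket(w, bracket(z, v)),
                 bracket(z, bracket(v, w)))
```

A single sample does not prove the identities, of course; it only makes the two axioms concrete.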

EXAMPLE 74.67. The standard example of a real Lie algebra is $LGL(n, \mathbb{R})$, i.e.,
the set of all real $(n \times n)$-matrices $v$ with the usual addition and bracket
product
$[v, w] = vw - wv$.
The connection with the group $GL(n, \mathbb{R})$ of all real, regular $(n \times n)$-matrices
is the following. For each $v \in LGL(n, \mathbb{R})$ and $t \in \mathbb{R}$ we have
$e^{tv} = I + tv + o(t)$, $t \to 0$,
and from $e^{tv}$ one obtains all elements of $GL(n, \mathbb{R})$ in a neighborhood of $I$.
Furthermore,
$e^{tv} e^{tw} e^{-tv} e^{-tw} = I + t^2 [v, w] + o(t^2)$, $t \to 0$.
Therefore, the Lie algebra $LGL(n, \mathbb{R})$ is obtained from the Lie group $GL(n, \mathbb{R})$
by linearization. The commutator of the group thereby becomes the product
$[v, w]$ of the Lie algebra. This was the ingenious idea of Sophus Lie
(1842-1899), when he reduced the study of Lie groups to Lie algebras. This
will be discussed in greater detail in Part V, in connection with applications
to the theory of elementary particles (theory of quarks). Lie algebras are a
basic tool in modern physics to explain quantum effects with symmetries.
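
The expansion of the group commutator can be observed numerically. The following sketch, which is an illustration under stated assumptions rather than anything from the original, builds the matrix exponential from its Taylor series (30 terms, an arbitrary choice) and measures how far $e^{tv} e^{tw} e^{-tv} e^{-tw}$ is from $I + t^2 [v, w]$, relative to $t^2$. The matrices `v` and `w` are hypothetical examples.

```python
# Sketch: e^{tv} e^{tw} e^{-tv} e^{-tw} = I + t^2 [v, w] + o(t^2) for 2x2 matrices.
# Pure-Python matrix helpers; the truncation at 30 Taylor terms is an assumption.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(A, c):
    return [[c * x for x in row] for row in A]

def expm(A, terms=30):
    """Matrix exponential via the truncated Taylor series sum_k A^k / k!."""
    result = identity(len(A))
    term = identity(len(A))
    for k in range(1, terms):
        term = scale(matmul(term, A), 1.0 / k)
        result = matadd(result, term)
    return result

def bracket(v, w):
    """Lie bracket [v, w] = vw - wv."""
    return matadd(matmul(v, w), scale(matmul(w, v), -1.0))

def group_commutator(v, w, t):
    """The product e^{tv} e^{tw} e^{-tv} e^{-tw}."""
    factors = [expm(scale(v, t)), expm(scale(w, t)),
               expm(scale(v, -t)), expm(scale(w, -t))]
    product = identity(len(v))
    for factor in factors:
        product = matmul(product, factor)
    return product

def remainder(v, w, t):
    """Largest entry of (commutator - I - t^2 [v, w]) divided by t^2."""
    approx = matadd(identity(len(v)), scale(bracket(v, w), t * t))
    diff = matadd(group_commutator(v, w, t), scale(approx, -1.0))
    return max(abs(x) for row in diff for x in row) / (t * t)

v = [[0.0, 1.0], [0.0, 0.0]]
w = [[0.0, 0.0], [1.0, 0.0]]
r_coarse = remainder(v, w, 1e-2)
r_fine = remainder(v, w, 1e-3)
```

The scaled remainder shrinks as `t` decreases, which is what the $o(t^2)$ term expresses.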

EXAMPLE 74.68 (Lie Algebra of Vector Fields). Let M be a real, n-dimensional
$C^\infty$-manifold. If $v = v^i b_i$ and $a = a^i b_i$ are vector fields on M, then we define
the Lie derivative of $v$ in the direction of $a$ through
$L_a v = (L_a v^i) b_i$.
We also write $[a, v] = L_a v$. Thus
$[a, v] = (a^s D_s v^i - v^s D_s a^i) b_i$.
This makes the collection of all $C^\infty$-vector fields on M into a real Lie algebra.
The Lie derivative of a covector field $v = v_i b^i$ is defined analogously through
$L_a v = (L_a v_i) b^i$,
where $\{b^i\}$ is the natural basis in the cotangent spaces of M (see Section 74.17).
Analogously, we define the Lie derivative of the r-form
$\omega = t_{i_1 \ldots i_r}\, du^{i_1} \wedge \cdots \wedge du^{i_r}$
through
$L_a \omega = (L_a t_{i_1 \ldots i_r})\, du^{i_1} \wedge \cdots \wedge du^{i_r}$.
We have
$L_a \omega = i_a\, d\omega + d(i_a \omega)$.
This follows easily from a coordinate representation. As an exercise we
recommend a proof of this.

Definition 74.69. A group G is a set on which a multiplication is defined, i.e.,
to each pair $(g, h)$ of elements in G a unique element $gh$ in G is assigned, such
that the following three conditions are satisfied:
(i) Associativity: $(fg)h = f(gh)$ for all $f, g, h \in G$.
(ii) Existence of a unit element: There exists an element $e$ in G with $ge = eg = g$
for all $g \in G$.
(iii) Existence of an inverse element: For each $g \in G$ there exists an element
$h \in G$ with $gh = e$.
For $h$ in (iii) one writes $g^{-1}$. One can easily show that $e$ and $g^{-1}$ are uniquely
determined. Besides $g g^{-1} = e$ one also has $g^{-1} g = e$.
Roughly speaking, a Lie group G is a group which can locally be para-
metrized, and the group operations depend smoothly on the parameters. This
way, one can study G by using the methods of the theory of manifolds.

Definition 74.70. A Lie group G is a real, n-dimensional $C^\infty$-manifold whose
elements form a group and for which the map
$(h, g) \mapsto h^{-1} g$
is a $C^\infty$-map from $G \times G$ into $G$.

If we choose $g = e$ and $h = f^{-1}$, then also the map $h \mapsto h^{-1}$ from G into G
and the map $(f, g) \mapsto fg$ from $G \times G$ into $G$ are of class $C^\infty$, i.e., taking the
inverse and multiplication are of class $C^\infty$.

EXAMPLE 74.71. The standard example of a Lie group is $GL(n, \mathbb{R})$ of Example
74.67. All real $(n \times n)$-matrices $g$ with $\|g - I\| < \varepsilon$ belong to $GL(n, \mathbb{R})$ for
sufficiently small $\varepsilon > 0$, because the process of taking the inverse is continuous
(see Problem 1.7). Letting $v = \log g$, we obtain $g = e^v$ (see $A_1$(60b)). Thus
$GL(n, \mathbb{R})$ can be parametrized by $v$ in a neighborhood $U(I)$ of the unit element
$I$. For an arbitrary group element $g_0$ we choose $g_0 U(I)$ for a parametrization.
Thereby, $GL(n, \mathbb{R})$ naturally becomes an $n^2$-dimensional Lie group. Note that
the set of $(n \times n)$-parameter matrices $v$ can be identified with a zero neighborhood
of $\mathbb{R}^{n^2}$.

EXAMPLE 74.72 (Lie Algebra of a Lie Group). Let G be a Lie group. Through
$T_g x = gx$
we obtain for each fixed $g \in G$ a $C^\infty$-diffeomorphism $T_g\colon G \to G$. If $y = T_g x$,
then the derivative
$T_g'(x)\colon TG_x \to TG_y$
maps the corresponding tangent spaces onto each other. Let $v$ be a vector field
on G. Then $v$ is called left invariant if and only if
$T_g' v = v$ for all $g \in G$,
i.e., $T_g'(x)\, v(x) = v(gx)$ for all $x, g \in G$.
From the definition of $[a, v]$ in Example 74.68, the following important
relation
$T_g'[a, v] = [T_g' a, T_g' v]$
is obtained by linearization. Thus, for left invariant vector fields $a$, $v$ we have
that $T_g'[a, v] = [a, v]$, i.e., $[a, v]$ is also left invariant.
Therefore all left invariant $C^\infty$-vector fields on a Lie group G form a real
Lie algebra with respect to
$[a, v] = L_a v$,
which is called the Lie algebra LG of G.

In Part V, we will compute a number of Lie algebras in view of physical
applications. In particular, we will give a simple proof of the fact that the
algebra $LGL(n, \mathbb{R})$ of Example 74.67 is isomorphic to the Lie algebra of
$GL(n, \mathbb{R})$. The strategy for the theory of Lie groups consists in reducing the
study of Lie groups to the much simpler object of Lie algebras. An important
and deep structure theorem is the following: Every finite-dimensional, real Lie
algebra is isomorphic to the Lie algebra of a Lie group (see Dieudonné (1975,
M), Vol. 5, 21.23).

PROBLEMS

74.1. Tensors. How is the tensor field $t^i_{jk}$ transformed?
Solution: From Definition 74.4 it follows that
$t^{i'}_{j'k'} = A^{i'}_i A^j_{j'} A^k_{k'}\, t^i_{jk}$.
How is $t^i_{jk}$ transformed in case it is a tensor density (or pseudotensor
density) of weight $w$?
Solution:
$t^{i'}_{j'k'} = \alpha\, A^{i'}_i A^j_{j'} A^k_{k'}\, t^i_{jk}$ (97)
with $\alpha = D^w$ (or $\alpha = (\operatorname{sgn} D) D^w$). For $w = 0$ we obtain tensor fields (or
pseudotensor fields).
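74.2. Covariant differentiation.
74.2a. Compute $\nabla_k t^i_{jm}$.
Solution:
$\nabla_k t^i_{jm} = D_k t^i_{jm} + \Gamma^i_{ks} t^s_{jm} - \Gamma^s_{kj} t^i_{sm} - \Gamma^s_{km} t^i_{js}$.
Observe the summation index and the index picture.
74.2b. Prove that covariant differentiation and contraction commute. For $\nabla_k t^j_{jm}$, for
instance, it does not matter whether $\nabla_k$ is applied before or after summing
over $j$.
Hint: Use the definition of $\nabla_k$ and observe the different signs in (21) and (22).
74.2c. Prove that $\nabla_k \delta^i_j = 0$.
Solution:
$\nabla_k \delta^i_j = D_k \delta^i_j + \Gamma^i_{ks} \delta^s_j - \Gamma^s_{kj} \delta^i_s = 0$.
74.2d. Lemma of Ricci. Prove that $\nabla_k g_{ij} = 0$.
Solution: This follows from
$\nabla_k g_{ij} = D_k g_{ij} - \Gamma^s_{ki} g_{sj} - \Gamma^s_{kj} g_{is}$
and (23).
74.2e. Prove that $\nabla_k g^{ij} = 0$.
Solution: From $\delta^i_j = g^{is} g_{sj}$ and the product rule, it follows that
$0 = \nabla_k (g^{is} g_{sj}) = g_{sj} \nabla_k g^{is} + g^{is} \nabla_k g_{sj} = g_{sj} \nabla_k g^{is}$.
Multiplication with $g^{jl}$ gives $0 = \delta^l_s \nabla_k g^{is} = \nabla_k g^{il}$.
74.2f. Verify (24) and (25).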
74.3. Index principle. The following set of problems shows how naturally the index
principle of Section 74.5 occurs.
74.3a. Scalar product. What is the value of the scalar product $vw$ for vectors $v$, $w$
in an arbitrary coordinate system of $\mathbb{R}^n$?
Solution: Let $v = v^i b_i$ and $w = w^j b_j$. It follows that
$vw = v^i w^j\, b_i b_j = g_{ij} v^i w^j$.

74.3b. Lifting and lowering of indices. This uses the $g_{ij}$ and $g^{ij}$. For example, one
obtains

Prove that $w_i = g_{ij} w^j$ implies $w^i = g^{ij} w_j$.
Solution: From $g^{ki} g_{ij} = \delta^k_j$ it follows that $g^{ki} w_i = w^k$.
74.3c. Cartesian coordinate system. Why is the position of upper and lower indices
not important for Cartesian coordinates?
Solution: Instead of $v^i$ we can write $v_i$ with $v_i = g_{ij} v^j$. Here
$v_i = v^i$, because in Cartesian coordinates $g_{ij}$ is equal to the Kronecker
symbol.
74.3d. Invariants for quadratic forms. Let $a_{ij} v^i v^j$ be a quadratic form in $\mathbb{R}^n$ with
$a_{ij} = a_{ji}$. Prove:
(i) The number $\operatorname{trace} a = \sum_i a_{ii}$ has the same value for all Cartesian
coordinate systems. What is the value of trace $a$ for an arbitrary coordinate
system?
(ii) For all $\lambda \in \mathbb{C}$, $\det(\lambda \delta_{ij} - a_{ij})$ has the same value for every Cartesian
coordinate system. Therefore the solutions of the secular equation
$\det(\lambda \delta_{ij} - a_{ij}) = 0$,
i.e., the eigenvalues of $(a_{ij})$, are equal for all Cartesian coordinate
systems. What is the form of this equation for an arbitrary coordinate
system?
(iii) $\operatorname{sgn} \det(a_{ij})$ is an invariant, i.e., a scalar.
(iv) $\kappa_+$, $\kappa_-$, and $\kappa_0$, i.e., the numbers of eigenvalues of $(a_{ij})$ which are positive,
negative, and zero, are invariant. Recall that $\kappa_-$ is the Morse index of $(a_{ij})$.
Solution: Ad (i). Letting $a^i_j = g^{ik} a_{kj}$ we obtain that $\operatorname{trace} a = a^i_i$ in Cartesian
coordinates. For an arbitrary coordinate system this is a scalar.
Ad (ii). We let $c_{ij} = \lambda g_{ij} - a_{ij}$. From Section 74.7, $c = \det(c_{ij})$ is a scalar
of weight 2, i.e., $c' = D^2 c$. For transformations between Cartesian coordinates
we have $D = \det(A^i_{i'}) = 1$. Hence, for any such transformation, $c$ is a
scalar. Furthermore, in Cartesian coordinates we have $g_{ij} = \delta_{ij}$. The general
form of the secular equation is $\det c = 0$, i.e.,
$\det(\lambda g_{ij} - a_{ij}) = 0$.
Ad (iii). For $a = \det(a_{ij})$ we have $a' = D^2 a$, and hence $\operatorname{sgn} a' = \operatorname{sgn} a$.
Ad (iv). Using a rotation of the coordinate system one can always
write $Q = a_{ij} v^i v^j$ as a sum of squares, i.e.,

$Q = \sum_i \lambda_i v^{i'} v^{i'}$.

Thereby the $\lambda_i$ are the eigenvalues of $(a_{ij})$. From the law of inertia of Sylvester
it follows that the number of squares with positive or negative sign is
independent of the coordinate system.
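74.3e. What is the form of $v\, \operatorname{grad} f$ for an arbitrary coordinate system of $\mathbb{R}^n$?
Solution: Letting $v = v^i b_i$ one obtains that $v\, \operatorname{grad} f = v^i \nabla_i f$.
74.3f. The Navier-Stokes equations. What is the form of
$\rho v_t + \rho (v\, \operatorname{grad}) v - \eta \Delta v = K - \operatorname{grad} p$,
$\operatorname{div} v = 0$
in component notation for an arbitrary coordinate system of $\mathbb{R}^3$?
Solution: We let $v = v^i b_i$, $K = K^i b_i$. Then
$\rho v^k_t + \rho v^i \nabla_i v^k - \eta g^{ij} \nabla_i \nabla_j v^k = K^k - \nabla^k p$,
$\nabla_i v^i = 0$,
with $\nabla^k = g^{km} \nabla_m$.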
74.3g. Basic equation of elastodynamics. What is
pu, = divt +F
in component notation for an arbitrary coordinate system of IR 3 ?
Solution: In Cartesian coordinates we have divt = LJDJtJe1, and hence
div t = Vitjb1 and
pu:, = VitJ + F1,
where u = u1b1, F = F1b1, and the equation u = tF corresponds to u1 =
tjFi.
74.3h. Construction of tensors. Let G be a region in $\mathbb{R}^n$ and assume situation (A) of
Section 74.2. In a fixed u-coordinate system of G we assign real numbers $t^i$
to each point $x \in G$. For an arbitrary u'-system we then define
$t^{i'} \overset{\text{def}}{=} A^{i'}_i t^i$.
Prove that through this a tensor field is obtained. Analogously, arbitrary
tensor fields, pseudotensor fields, tensor densities, and pseudotensor
densities are defined.
Solution: By definition we have
$t^{i''} = A^{i''}_i t^i$.
Now we must show that
$t^{i''} = A^{i''}_{i'} t^{i'}$.
In fact, $A^{i''}_{i'} t^{i'} = A^{i''}_{i'} A^{i'}_i t^i = A^{i''}_i t^i = t^{i''}$.


74.3i. Inverse index principle.
Prove: If for all tensor fields $a^i$,
$a^i \alpha_{ij}$
is transformed like a tensor field $t_j$, then also $\alpha_{ij}$ is a tensor field.
More generally: If for all tensor fields $a^{\cdots}_{\cdots}$, the quantity $a^{\cdots}_{\cdots} \alpha^{\cdots}_{\cdots}$ is
transformed like a tensor field, where the type is given by the index picture, then
$\alpha^{\cdots}_{\cdots}$ is also a tensor field where the type is given by the index picture.
This principle helps avoid tedious computations in proving the tensor
property of $\alpha^{\cdots}_{\cdots}$. In (82), for example, the tensor property of the curvature
tensor came up automatically.
Solution: From $a^{i'} \alpha_{i'j'} = A^j_{j'} (a^i \alpha_{ij})$ it follows that
$A^{i'}_i a^i \alpha_{i'j'} = A^j_{j'} a^i \alpha_{ij}$.
Because of $A^{j'}_k A^j_{j'} = \delta^j_k$ this implies
$a^i (A^{i'}_i A^{j'}_j \alpha_{i'j'}) = a^i \alpha_{ij}$.
Since $a^i$ is an arbitrary tensor field, and Problem 74.3h admits, for any fixed
coordinate system, the construction of tensor fields with arbitrary numerical
values, it follows that
$A^{i'}_i A^{j'}_j \alpha_{i'j'} = \alpha_{ij}$,
i.e., $\alpha_{ij}$ is a tensor field.


74.4. Parallel transport and intuitive meaning of the Christoffel symbols. Give a
direct proof of Theorem 74.A without using tensor calculus.
Solution: Let $x$ be the radius vector in $\mathbb{R}^n$. We define vectors $b_{ij} \overset{\text{def}}{=} D_i b_j =
D_i D_j x$ and numbers $B^k_{ij}$ using the vector decomposition
$b_{ij} = B^k_{ij} b_k$. (98)
(I) Parallel transport. Let C be the curve $x = x(\tau)$, i.e., $u^i = u^i(\tau)$. If the
vector field $v = v^i b_i$ is constant along C, then differentiation with
respect to $\tau$ gives:
$0 = \dot v^i b_i + v^i \dot b_i = \dot v^k b_k + v^i b_{ij} \dot u^j$.
From (98) we obtain $\dot v^k + B^k_{ij} v^i \dot u^j = 0$. This is equal to (30) if we can
show that
$B^k_{ij} = \Gamma^k_{ij}$. (99)
(II) Proof of (99). From
$b_k b_r = g_{kr}$ (100)
and (98) it follows that $b_{ij} b_r = B^k_{ij} g_{kr}$. Multiplication with $g^{rs}$ gives
$B^s_{ij} = g^{rs} b_{ij} b_r$. (101)
Differentiation of (100), i.e., of $b_r b_j = g_{rj}$, gives
$b_{ir} b_j + b_r b_{ij} = D_i g_{rj}$. (102)
Interchanging the indices and summing yields
$b_{ij} b_r = \tfrac{1}{2}(D_i g_{rj} + D_j g_{ir} - D_r g_{ij}) = \Gamma_{r,ij}$.
By (101), this gives $B^s_{ij} = g^{rs} \Gamma_{r,ij} = \Gamma^s_{ij}$.
(III) Interpretation of the Christoffel symbols $\Gamma^k_{ij}$. It follows from (98) that
$D_i b_j = \Gamma^k_{ij} b_k$,
i.e., $\Gamma^k_{ij}$ describes the change in the natural basis $\{b_j\}$.
74.5. Mean curvature H, Gaussian curvature K, and minimal surfaces. For the
surface area of minimal surfaces,

$\int \sqrt{g}\, du^1\, du^2 = \min!$ (103)

holds for any given boundary curve. Prove that a smooth solution of this
variational problem satisfies $H = 0$ and $K \le 0$ for all surface points.
Solution: To obtain the Euler equation for (103) is a purely local problem.
Hence we consider the special local coordinate system of Example 74.40 with
local surface representation
$\zeta = \tfrac{1}{2}(\alpha \xi^2 + \beta \eta^2) + O_3$.
The Euler equation to

$\iint \sqrt{1 + \zeta_\xi^2 + \zeta_\eta^2}\; d\xi\, d\eta = \min!$

is

$\frac{\partial}{\partial \xi} \frac{\zeta_\xi}{\sqrt{1 + \zeta_\xi^2 + \zeta_\eta^2}} + \frac{\partial}{\partial \eta} \frac{\zeta_\eta}{\sqrt{1 + \zeta_\xi^2 + \zeta_\eta^2}} = 0$.

Because of $\zeta_\xi / \sqrt{1 + \zeta_\xi^2 + \zeta_\eta^2} = \alpha \xi + O_2$ (and analogously for $\zeta_\eta$), this implies
$\alpha + \beta = 0$ at the point $\xi = \eta = 0$, hence $H = 0$ and $K = \alpha \beta \le 0$.
74.6. Geometric meaning of the Gaussian curvature K. We want to show that the
analytic definition of K of Section 74.11 is the same as the original geometric
definition of Gauss.
Let P be a surface point and $\Sigma$ a small piece of the surface with $P \in \Sigma$ and
surface area $m(\Sigma) = \int_\Sigma \sqrt{g}\, du^1\, du^2$. The spherical image $\tilde\Sigma$ of $\Sigma$ is the piece of
the unit sphere obtained by drawing the unit normal vectors N of $\Sigma$ at the origin.
Prove:
$|K(P)| = \lim_{\operatorname{diam} \Sigma \to 0} m(\tilde\Sigma)/m(\Sigma)$.

Solution: The mean value theorem of integral calculus implies that $|K| =
\sqrt{\tilde g}/\sqrt{g}$ at P. We have $\sqrt{g} = |b_1 \times b_2|$ and, analogously, $\sqrt{\tilde g} = |N_1 \times N_2|$
with $b_i = D_i x$ and $N_i = D_i N$. For computational reasons we choose the
special coordinate system of Example 74.40. Then $P = (0, 0)$ and

$N = (x_\xi \times x_\eta)/|x_\xi \times x_\eta| = (e_3 - \zeta_\xi e_1 - \zeta_\eta e_2)/\sqrt{1 + \zeta_\xi^2 + \zeta_\eta^2}$
$\quad = e_3 - \alpha \xi e_1 - \beta \eta e_2 + O_2$.
At (0, 0) we have
$\sqrt{g} = |x_\xi \times x_\eta| = 1$ and $\sqrt{\tilde g} = |N_\xi \times N_\eta| = |\alpha \beta|$.
This means that $\sqrt{\tilde g}/\sqrt{g} = |\alpha \beta| = |K|$.


74.7.* Geometric meaning of the curvature tensor $R^i_{jkm}$ and the torsion tensor $T^k_{ij}$. Let
G be a simply connected region in an affine connected, real, n-dimensional
$C^\infty$-manifold M. Prove:
(i) The parallel transport in G is path independent if and only if $R^i_{jkm} = 0$ in
G.
(ii) M is locally flat if and only if $R^i_{jkm} = 0$ and $T^k_{ij} = 0$ in M.
M is called locally flat if and only if for each point $x_0$ there exists a
neighborhood $U(x_0)$ such that $\Gamma^k_{ij} = 0$ in $U(x_0)$ for an admissible chart, i.e.,
for suitable coordinates. For Riemannian manifolds this coincides with the
definition of Section 74.20, because $D_k g_{ij}$ can be expressed as a linear
combination of the $\Gamma^k_{ij}$.
Hint: See Raschewski (1959, M), §106. For (i) use the curves $u^i = u^i(t, p)$
which depend on an additional parameter $p$, and differentiate the differential
equation
$\dot v^k + \Gamma^k_{ij} \dot u^i v^j = 0$
for the parallel transport of $v^k$ with respect to $p$. Then $\partial v^k/\partial p = 0$ means that
the parallel transport is locally path independent.
74.8. Non-Euclidean geometry. Give an explicit proof of the trigonometric
formulas for $M_{\mathrm{hyp}}$ of Example 74.56.
Hint: See Baule (1956, M), Vol. 7, §20. Use the geodesics on $M_{\mathrm{hyp}}$ (circular
arcs) and the fact that goniometry on $M_{\mathrm{hyp}}$ is the same as Euclidean
goniometry.
74.9. Geodesics and classical dynamics. In order to obtain fundamental properties
for mechanical systems, (i) and (ii) below are of great importance. Consider,
for instance, a mechanical system $q = q(t)$ with coordinates $q = (q^1, \ldots, q^n)$.
Let kinetic and potential energy be equal to
$T(q, \dot q) = \tfrac{1}{2} g_{ij}(q)\, \dot q^i \dot q^j$
and $U(q)$, respectively, with $g_{ij} = g_{ji}$. The Lagrangian is $L = T - U$. As
usual, generalized momentum and force are defined as the partial derivatives
$p_i = L_{\dot q^i}$ and $F_i = -U_{q^i}$. Moreover, we let $F^i = g^{ij} F_j$. Prove:
(i) The Lagrangian equations of motion $(d/dt) L_{\dot q^i} - L_{q^i} = 0$ can be written
in the elegant form
$\frac{D \dot q^k}{dt} = F^k$, $k = 1, \ldots, n$, (104)
i.e.,
$\ddot q^k + \Gamma^k_{ij} \dot q^i \dot q^j = F^k$.
Here $\Gamma^k_{ij}$ follows from the metric tensor $g_{ij}$. Formula (104) explicitly
shows the tensorial character of the Lagrangian equations. It follows
from (104) that in the case of vanishing forces, i.e., $F^k = 0$, the orbits are
affine geodesics. The following construction of Jacobi yields geodesics for
arbitrary forces.
(ii) We introduce a new metric
$ds^2 = \bar g_{ij}\, dq^i dq^j$,
with $\bar g_{ij} = 2(E - U) g_{ij}$ for fixed $E \in \mathbb{R}$ and assume that $E - U > 0$. The
geodesics for this metric are exactly the orbits with total energy $E = U + T$.
According to Section 74.19, these geodesics are obtained from $D \dot q^k/ds = 0$,
i.e.,
$\ddot q^k + \bar\Gamma^k_{ij} \dot q^i \dot q^j = 0$,
where $\bar\Gamma^k_{ij}$ belongs to $\bar g_{ij}$.
The dot means derivative with respect to $s$. The corresponding variational
principle

$\int ds = \text{stationary!}$

is called the principle of stationary action of Jacobi. Because of the energy
theorem $E = U + T$, we have that $E - U > 0$ is equivalent to $T > 0$.
Hint: See Laugwitz (1960, M), 14.2.
The results remain unchanged if the state space M is a manifold, i.e., $q \in M$.
Important applications of (i), (ii) in mechanics and especially in astronomy
may be found in Abraham and Marsden (1978, M).
74.10. Geodesic curvature of curves. The curvature of a curve $x = x(s)$ in $\mathbb{R}^3$,
parametrized with arclength $s$, is by definition equal to $\kappa = |\ddot x|$. If this curve lies
on a surface, then we define the geodesic curvature as

$\kappa_g = \ddot x\,(N \times \dot x)$.

We let $t = N \times \dot x$. It follows that $t$ is a unit vector which, in the tangent plane
to the surface, is perpendicular to the tangent $\dot x$ of the curve, and is obtained
from $\dot x$ by using a rotation of 90° in the positive direction with respect to $b_1$,
$b_2$. Prove that

$\kappa_g t = b_k \frac{D \dot u^k}{ds} = b_k (\ddot u^k + \Gamma^k_{ij} \dot u^i \dot u^j)$.

The last equation shows that $\kappa_g$ can be obtained solely through measurements
on the surface. Under changes in the orientation, $t$ changes its sign.
Thus $\kappa_g$ is a pseudoscalar. From Section 74.16 the curve $x = x(s)$, i.e.,
$u^i = u^i(s)$, is a geodesic if and only if

$\kappa_g = 0$.
Thus $\kappa_g$ is a measure for the deviation of the curve from a geodesic.
Hint: See Laugwitz (1960), 11.3.5.
74.11. The theorem of Gauss-Bonnet-Chern and connections between topology and
analysis. We shall try to explain, step by step, probably the most important
theorem of global differential geometry-the theorem of Gauss-Bonnet-
Chern. It relates topology, geometry, and analysis. The deepest-known
generalization of this theorem, which also shows many other fundamental
connections between topology and analysis, is the Atiyah-Singer index
theorem. This will be discussed in Part V. In the following we assume that
all manifolds, curves, functions, and vector fields are of class $C^\infty$.
For introductory reading, we recommend Kreyszig (1957, M) and Guillemin
and Pollack (1974, M). We also recommend Sulanke and Wintgen (1972,
M) and Spivak (1979, M, B), Vol. 5. Recall that a quantity of a topological
space is called a topological invariant if and only if it is preserved under
homeomorphisms. Of particular interest are topological invariants in the
form of numbers, such as the Euler characteristic. This will be discussed
below.

[Figure 74.19]

74.11a. Theorem of Gauss (1827) on the angular sum in a triangle. Besides the
theorema egregium, the following important theorem is also contained in the
"Disquisitiones generales circa superficies curvas" of Gauss. Let T be a
triangle on a surface of $\mathbb{R}^3$ with angles $\alpha$, $\beta$, $\gamma$ and sides which are geodesics
(Fig. 74.19). Then

$\int_T K\, dm = \alpha + \beta + \gamma - \pi$, (105a)

where $dm = \sqrt{g}\, du^1\, du^2$ is the surface measure. On the unit sphere we have
$K = 1$, and hence $\alpha + \beta + \gamma - \pi$ is equal to the surface area of the triangle. A
triangle is always meant to have a simply connected region as its interior.
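
A simple instance of (105a) on the unit sphere can be checked by hand or by machine. The sketch below is an illustration added here, not part of the original: it takes the geodesic triangle with vertices $e_1$, $e_2$, $e_3$ (one octant of the sphere), computes its three angles from the tangent directions of the great-circle sides, and compares the angular excess with the area $4\pi/8$ of the octant. The helper names are hypothetical.

```python
import math

# Sketch: angular excess of a geodesic triangle on the unit sphere (K = 1).
# By (105a) the excess alpha + beta + gamma - pi equals the triangle's area.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def tangent(a, b):
    """Tangent direction at a of the great-circle arc from a to b."""
    c = dot(a, b)
    return [bi - c * ai for ai, bi in zip(a, b)]

def angle_at(a, b, c):
    """Interior angle at vertex a of the geodesic triangle (a, b, c)."""
    tb, tc = tangent(a, b), tangent(a, c)
    cosine = dot(tb, tc) / math.sqrt(dot(tb, tb) * dot(tc, tc))
    return math.acos(max(-1.0, min(1.0, cosine)))

def angular_excess(a, b, c):
    """alpha + beta + gamma - pi for the geodesic triangle (a, b, c)."""
    return angle_at(a, b, c) + angle_at(b, c, a) + angle_at(c, a, b) - math.pi

e1, e2, e3 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]
excess = angular_excess(e1, e2, e3)
octant_area = 4.0 * math.pi / 8.0  # one eighth of the unit sphere's area
```

All three angles are right angles here, so the excess is $3\pi/2 - \pi = \pi/2$, which matches the area of the octant.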
74.11b.* Theorem of Bonnet (1848). If the triangle sides of the previous problem are
arbitrary curves, then

$\int_T K\, dm + \int_{\partial T} \kappa_g\, ds = \alpha + \beta + \gamma - \pi$. (105b)

Here, we move along the boundary $\partial T$ in the positive direction with respect
to $b_1$, $b_2$ (Fig. 74.19). If $\partial T$ is a smooth curve without corners, we let
$\alpha = \beta = \gamma = \pi$.
The geodesic curvature $\kappa_g$ has been defined in Problem 74.10. For geodesics
we have $\kappa_g = 0$. Hence (105a) is a special case of (105b). Prove (105b).
Hint: Equation (105b) follows directly from an application of the integral
theorem of Gauss in the parameter plane, using a suitable orthogonal
coordinate system (geodesic polar coordinates; see Kreyszig (1957, M),
p. 206). An elegant proof, which uses the calculus of differential forms of
Section 74.24, may be found in Blaschke (1950), §44, §46.
74.11c. Global theorem of Gauss-Bonnet. A closed, oriented surface M in $\mathbb{R}^3$, i.e., a
real, two-dimensional, compact, and oriented manifold, is always homeomorphic
to a sphere with $p$ handles (Fig. 74.20). The number $p$, which is a
topological invariant, is called the genus of M. For a sphere (or a torus) we
have $p = 0$ (or $p = 1$). The number $\chi(M) = 2(1 - p)$ is called the Euler
characteristic of M and is a fundamental topological invariant.

[Figure 74.20: surfaces of genus p = 0, p = 1, and p = 2]

Triangulation of M gives
$\chi(M) = n_0 - n_1 + n_2$, (106)
where $n_0$, $n_1$, $n_2$ are the numbers of corners, edges, and surfaces. Equation
(106) contains Euler's polyhedron theorem, which states that $n_0 - n_1 + n_2$ is
independent of the triangulation. Prove that

$\int_M K\, dm = m(S^k)\, \chi(M)/2$, (107)

where $k = 2$. Thereby $m(S^k)$ is the surface measure of the k-dimensional unit
sphere $S^k$. In (107) this gives $m(S^2)\chi(M)/2 = 4\pi(1 - p)$. In the special case of
the sphere (or the torus) with $p = 0$ (or $p = 1$) we have

$\int_M K\, dm = 4\pi$ (or $= 0$),
i.e., the total curvature of the torus is equal to zero.
This theorem, which is called the global theorem of Gauss-Bonnet, was
formulated explicitly for the first time by Dyck (1885), who was a student of
Felix Klein.
Solution: Let $p = 0$ and hence $\chi(M) = 2$. Then the topological type of M
is the sphere $S^2$. We decompose M into four triangles which are naturally
given on $S^2$ by the equator and one meridian. Summation of (105b) yields

$\int_M K\, dm = 4 \cdot 2\pi - 4\pi = 4\pi$.
Analogously, one argues in the case of a torus, $p = 1$, and for $p > 1$. See
Blaschke (1950, M), §47.
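
The counting behind the Euler characteristic is easy to reproduce. The following sketch is an added illustration: it computes $n_0 - n_1 + n_2$ from a list of triangles and converts the result into the genus via $\chi = 2(1 - p)$ and into the total curvature $2\pi\chi$ of the global theorem. The two triangulations used, the boundary of a tetrahedron and the 7-vertex triangulation of the torus with triangles $\{i, i+1, i+3\}$ and $\{i, i+2, i+3\}$ modulo 7, are assumptions supplied for the example and do not come from the text.

```python
import math

# Sketch: Euler characteristic n0 - n1 + n2 of a triangulated closed surface.

def euler_characteristic(triangles):
    """Count vertices, edges, and triangles and return n0 - n1 + n2."""
    vertices = {v for tri in triangles for v in tri}
    edges = {frozenset(pair)
             for a, b, c in triangles
             for pair in ((a, b), (b, c), (a, c))}
    return len(vertices) - len(edges) + len(triangles)

def genus(chi):
    """Genus p of a closed oriented surface from chi = 2(1 - p)."""
    return (2 - chi) // 2

def total_curvature(chi):
    """Total Gaussian curvature 2*pi*chi predicted by Gauss-Bonnet for k = 2."""
    return 2.0 * math.pi * chi

# Boundary of a tetrahedron: a triangulation of the sphere (assumed example).
sphere = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

# 7-vertex triangulation of the torus (assumed example).
torus = ([(i, (i + 1) % 7, (i + 3) % 7) for i in range(7)] +
         [(i, (i + 2) % 7, (i + 3) % 7) for i in range(7)])

chi_sphere = euler_characteristic(sphere)
chi_torus = euler_characteristic(torus)
```

The tetrahedron has 4 corners, 6 edges, and 4 triangles; the torus triangulation has 7 corners, 21 edges, and 14 triangles.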
74.11d. Meaning of the global theorem of Gauss-Bonnet. Equation (107) is of
fundamental importance. On the left-hand side we have an analytic expression containing
K, which is not a topological invariant, while the expression on the right-hand
side is a topological invariant. In fact, (107) represents a fundamental
connection between analysis and topology. Topological invariants are
the most robust geometrical invariants. They remain unchanged under all
homeomorphisms, i.e., roughly speaking, under all rubber-sheet deformations.
Analytic expressions like (107), which are topological invariants, might
therefore be of particular importance for a study of the structure of our
world. Actually, many physicists and mathematicians are now convinced
that topological invariants play an important role in the theory of elemen-
tary particles. The problem, therefore, is to find these topological invariants
in the form of analytic expressions and symmetries.
74.11e. The theorem of de Rham and the duality theorem of Poincaré. Because of the
great importance of equation (107), many efforts have been made to generalize
it as far as possible. Below we give two central results. Let us first remark
that important topological invariants may be assigned to each topological
space M, namely its singular homology groups $H_q(M, \mathbb{R})$ with real coefficients,
where $q = 0, 1, \ldots$. Actually, $H_q(M, \mathbb{R})$ is a real, linear space. The numbers
$b_q = \dim H_q(M, \mathbb{R})$
are called Betti numbers. If M is arcwise connected, then $H_0(M, \mathbb{R}) = \mathbb{R}$, and
hence $b_0 = 1$. For the sphere $S^n$, one has

$H_q(S^n, \mathbb{R}) = \mathbb{R}$ for $q = 0, n$, and $H_q(S^n, \mathbb{R}) = 0$ otherwise,

i.e., $b_0 = b_n = 1$ and $b_q = 0$ otherwise. The simple construction of $H_q(M, \mathbb{R})$
as well as the intuitive meaning of $H_q(S^n, \mathbb{R})$ and $b_q$ will be explained in Part
V. The Euler characteristic of M is defined similarly to (106) through

$\chi(M) = \sum_{q=0}^{\infty} (-1)^q \dim H_q(M, \mathbb{R})$,

provided the sum and all summands are finite. This is true, for example, for
compact, finite-dimensional, real manifolds.
For arbitrary finite-dimensional, real manifolds with countable basis one
has the following fundamental theorem of de Rham
$H^q(M) \cong H_q(M, \mathbb{R})^*$, $q = 0, 1, \ldots$, (108)
i.e., the de Rham cohomology groups $H^q(M)$ of Section 74.24 are isomorphic
to the dual spaces of $H_q(M, \mathbb{R})$ (see Warner (1971, M)).
Let M be a compact, n-dimensional, real manifold. Then $\dim H_q(M, \mathbb{R})$ is
finite for all $q$. From (108) we immediately obtain that $H^q(M) \cong H_q(M, \mathbb{R})$ and
$b_q = \dim H^q(M)$ as well as

$\chi(M) = \sum_{q=0}^{\infty} (-1)^q \dim H^q(M)$.

Furthermore, the duality theorem of Poincaré states that

$H^q(M) \cong H^{n-q}(M)$, $q = 0, 1, \ldots, n$,
hence $H_q(M, \mathbb{R}) \cong H_{n-q}(M, \mathbb{R})$ and $b_q = b_{n-q}$. The geometrical meaning of
this duality will be explained in Part V. The special case of the sphere yields

$H^q(S^n) = \mathbb{R}$ for $q = 0, n$, and $H^q(S^n) = 0$ otherwise.
Moreover, we have $\chi(S^n) = 1 + (-1)^n$.
The importance of (108) is that $H^q(M)$ is defined purely analytically with
differential forms, while the right-hand side of (108) is a topological invariant.
Hence (108) represents a fundamental connection between analysis and
topology, which essentially was already discovered by Poincaré at the end
of the nineteenth century.
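
The formula $\chi(S^n) = 1 + (-1)^n$ and the symmetry $b_q = b_{n-q}$ follow from the Betti numbers by simple arithmetic. The sketch below, added for illustration with hypothetical function names, forms the alternating sum for the spheres $S^n$ with $n \ge 1$ and, as a second sample, for the torus, whose Betti numbers $(1, 2, 1)$ are assumed here as a known fact rather than taken from the text.

```python
# Sketch: Euler characteristic as the alternating sum of Betti numbers.

def betti_sphere(n):
    """Betti numbers b_0, ..., b_n of the sphere S^n for n >= 1."""
    return [1 if q in (0, n) else 0 for q in range(n + 1)]

def euler_characteristic(betti):
    """chi = sum over q of (-1)^q b_q."""
    return sum((-1) ** q * b for q, b in enumerate(betti))

def is_poincare_symmetric(betti):
    """Check b_q = b_{n-q} for all q."""
    return betti == betti[::-1]

sphere_values = {n: euler_characteristic(betti_sphere(n)) for n in range(1, 7)}
torus_betti = [1, 2, 1]  # assumed Betti numbers of the two-dimensional torus
chi_torus = euler_characteristic(torus_betti)
```

Even-dimensional spheres give $\chi = 2$ and odd-dimensional spheres give $\chi = 0$; the torus value agrees with $\chi = 2(1 - p)$ for $p = 1$.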
74.11f.* Generalization of the global theorem of Gauss-Bonnet to $\mathbb{R}^{2n+1}$. Let M be a
real, 2n-dimensional, compact, and oriented manifold in $\mathbb{R}^{2n+1}$. By definition,
the Gauss map
$g\colon M \to S^{2n}$
assigns to each point $x$ the outer unit normal vector $g(x)$. We define the
Gaussian curvature through
$K(x) = \det g'(x)$.

From Problem 74.6 it follows that for $n = 1$ this definition coincides with
the definition of Section 74.11. For $n \ge 1$, we obtain (107) with
$k = 2n$.
Study the differential topological proof contained in Guillemin and Pollack
(1974, M), p. 196. This very transparent proof makes essential use of the
mapping degree on manifolds and the index theorem of Poincaré-Hopf (see
Problem 74.11j). If $k$ is odd, then (107) is wrong, because in this case $\chi(M) = 0$,
whereas $\int_M K\, dm \ne 0$ holds for a sphere.
74.11g.* The theorem of Chern (1944). Formula (107) has the disadvantage that M
has to be embedded in $\mathbb{R}^m$ and K is not an intrinsic property of M. Chern
succeeded in finding a differential form $\gamma$ for real, 2n-dimensional, compact
oriented Riemannian manifolds such that

$\int_M \gamma = \chi(M)$. (109)

Study this proof in Chern (1944), (1959, L) and in Sulanke and Wintgen
(1972, M). The differential form $\gamma$ is called the Euler class of M. It can
explicitly be given through

$\gamma = \frac{(-1)^n}{(4\pi)^n\, n!}\, \operatorname{sgn}\binom{1 \cdots 2n}{i_1 \cdots i_{2n}}\, \Omega^{i_1}_{i_2} \wedge \cdots \wedge \Omega^{i_{2n-1}}_{i_{2n}}$,

where Oj are the so-called curvature forms, i.e.,

Oj = !Rj1'"du 1 " du'".


The differential form y is a 2n-cocycle, i.e., dy = 0. Hence ylies in one of the
de Rham cohomology classes of H 28 (M). From the theorem of Stokes it
690 74. Classical Surface Theory, Theorema Egregium of Gauss

follows that one can replace y in (109) with an arbitrary form of this class.
Thus the integral in (109) depends only on the cohomology class.
74.11h. The idea of characteristic classes. Equation (109) shows that topological
invariants may be constructed from differential forms which depend on the
curvature. The theory of characteristic classes provides a systematic ap-
proach for such a construction. The important point is that vector bundles
are considered as principal fiber bundles. This will be explained in Part V.
Roughly speaking, characteristic classes measure the twisting of such bun-
dles over M. In this context we recommend Spivak (1979, M), Vol. 5. The
characteristic classes of Chern, Pontrjagin, Stiefel-Whitney, and Todd pro-
vide us with some deep insights into the properties of manifolds and their
bundles. They also play a central role in the formulation of the Atiyah-Singer
index theorem; compare Shanahan (1978, L). Characteristic classes are a
wonderful tool to describe fundamental connections between analysis and
topology; the standard types of which have already been found during the
nineteenth century by Gauss, Riemann, and Poincare.
By reading Gilkey (1984, M), Shanahan (1978, L), and Choquet-Bruhat
(1981, M) simultaneously one soon discovers the interrelation between the
following topics: pseudodifferential operators and the Atiyah-Singer index
theorem for elliptic operators and elliptic complexes, de Rham cohomology,
decomposition theorem of Hodge for differential forms, theorem of
Gauss-Bonnet-Chern, index theorem of Poincaré-Hopf, Dolbeault's cohomology,
and the theorem of Riemann-Roch-Hirzebruch. In Warner (1974, M) one
finds a discussion of the connection between de Rham cohomology and the
cohomology of sheaves. The usefulness of cohomology of sheaves for the solution
of fundamental problems in complex function theory, i.e., for the construction
of analytic functions from their zeros or poles (Cousin's problems), may
be seen from Hörmander (1967, M).
74.11i. Connections in principal fiber bundles and gauge theories. This problem will
be treated in Part V. Here we just describe the basic idea, as it is closely
related to the subject at hand. The standard type of a principal fiber bundle
is the frame bundle, which was studied by Élie Cartan (1869-1951) in
connection with his fundamental investigations in differential geometry. His
idea was to study the geometry on manifolds by using moving frames
(method of moving frames). Thereby he employed his calculus of differential
forms. A frame in ℝ^n consists of n linearly independent vectors which are
attached at one point (see Figure 74.21 with n = 2). On a manifold M there
exists a frame of n linearly independent tangent vectors e_1, ..., e_n in the tangent
space of a fixed point x (Fig. 74.22). The frame bundle F(M) of M consists
of all possible tuples
(x, e_1, ..., e_n).
If we represent e_j in the natural basis {b_k}, i.e.,
e_j = c_j^k b_k,
then (c_j^k) is a regular matrix. As coordinates of (x, e_1, ..., e_n) we choose the
coordinates
(x^i, c_j^k)
with i, j, k = 1, ..., n.
Figure 74.21          Figure 74.22

This way, F(M) becomes a manifold and, as we shall see in Part V, even a
principal fiber bundle. It is then important that for each principal fiber
bundle, one can introduce a connection by choosing an appropriate 1-
differential form ω, which admits the definition of parallel transport and
covariant differentiation D. The 2-differential form Ω, which describes the
curvature, is given by
Ω = Dω. (110)

This equation is the basic equation of modern gauge field theories. Here ω
represents a potential and Ω a field. Roughly, (110) establishes the relation

field = derivative of the potential.


As we shall see in Part V, a standard example of (110) is provided by classical
electrodynamics. In this case ω corresponds to the four-potential and Ω to
the electric and magnetic field.
74.11j. Index theorem of Poincaré-Hopf. Let M be a real, finite-dimensional,
compact, and oriented manifold. Moreover, let v be a vector field on M with
finitely many zeros x_j. Then for the index sum we obtain

$$\sum_j \operatorname{ind}(v, x_j) = \chi(M).$$ (111)

Here

$$\operatorname{ind}(v, x_j) = \deg(\bar{v}, \bar{x}_j),$$

where v̄ is the representative of v in a local chart and v̄(x̄_j) = 0. The zero
index deg(v̄, x̄_j) has been defined in Section 12.3.
A proof may be found in Guillemin and Pollack ( 1974, M), p. 134. Therein,
the index theorem is used to prove the global theorem of Gauss-Bonnet.
Sulanke and Wintgen (1972, M), p. 236, on the other hand, first prove the
global theorem of Gauss-Bonnet-Chern and then deduce (111) in a simple
fashion. Hence, the theorem of Gauss-Bonnet and (111) are equivalent.
For the sphere S^2 in ℝ^3 we have χ = 2. Thus (111) implies that there exists
no nonvanishing vector field on S^2. This special case of (111) has been proved
in Example 13.4 by using the mapping degree (hedgehog theorem). Note that
vector fields on manifolds are, by definition, tangent vector fields.
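
The index of an isolated zero in (111) can be approximated numerically in a local chart as a winding number. The following Python sketch is only an illustration for the planar case; the function name zero_index and the sample fields are assumptions made for this example and are not taken from the text.

```python
import math


def zero_index(v, center=(0.0, 0.0), radius=1e-3, samples=2000):
    """Index of an isolated zero of a planar vector field v at `center`.

    The index is computed as the winding number of v along a small
    circle around the zero (assumption: no other zero inside the circle).
    """
    cx, cy = center
    total = 0.0
    prev = None
    for k in range(samples + 1):
        t = 2.0 * math.pi * k / samples
        vx, vy = v(cx + radius * math.cos(t), cy + radius * math.sin(t))
        angle = math.atan2(vy, vx)
        if prev is not None:
            d = angle - prev
            # undo the jump of atan2 across its branch cut
            while d > math.pi:
                d -= 2.0 * math.pi
            while d <= -math.pi:
                d += 2.0 * math.pi
            total += d
        prev = angle
    return round(total / (2.0 * math.pi))


# Hypothetical sample fields, each with an isolated zero at the origin.
def source(x, y):
    return (x, y)


def saddle(x, y):
    return (x, -y)


def rotation(x, y):
    return (-y, x)


def dipole(x, y):
    return (x * x - y * y, 2.0 * x * y)


if __name__ == "__main__":
    for field in (source, saddle, rotation, dipole):
        print(field.__name__, zero_index(field))
```

For instance, the rotation of S^2 about a fixed axis generates a tangent vector field with two zeros at the poles, each of which looks like the field `rotation` in a suitable chart. Both zeros have index 1, so the index sum equals 2, in agreement with χ(S^2) = 2.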
74.11k. Index theorem of Morse. We choose M as in Problem 74.11j with dim M = n
and assume that the function f: M → ℝ has only finitely many critical points x_j,
which are all nondegenerate. Prove

$$\sum_{i=0}^{n} (-1)^i M_i = \chi(M),$$ (112)

where M_i is the number of points x_j with Morse index i.


Solution: Consider the vector field v on M with v(x) = f'(x). Let f'(x_j) = 0
and let i be the Morse index of x_j. In a local chart, the matrix f''(x_j) has exactly i
negative eigenvalues. Thus
deg(f', x_j) = (-1)^i.
Hence the theorem follows from (111).
In many cases, this index theorem can be used to compute the Euler
characteristic χ(M) quite easily. Let, for example, M be equal to the sphere
S^n in ℝ^{n+1}. The function φ(x) = ξ_{n+1} has a maximum at the North pole with
Morse index n and a minimum at the South pole with Morse index 0.
Consequently, χ = 1 + (-1)^n.
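
The alternating sum in (112) is easy to evaluate once the numbers of critical points are known. The following Python sketch is a minimal illustration; the function names are assumptions made for this example, and the torus data (one minimum, two saddles, one maximum for the height function of an upright torus) is a standard example that does not come from the text.

```python
def euler_characteristic(morse_counts):
    """Alternating sum of (112); morse_counts[i] is the number of
    critical points with Morse index i."""
    return sum((-1) ** i * m for i, m in enumerate(morse_counts))


def sphere_counts(n):
    """Critical points of the height function on S^n (n >= 1):
    one critical point of index 0 and one of index n."""
    counts = [0] * (n + 1)
    counts[0] += 1
    counts[n] += 1
    return counts


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, euler_characteristic(sphere_counts(n)))
    # Hypothetical input: height function on an upright torus.
    print("torus", euler_characteristic([1, 2, 1]))
```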
74.11l.** Theorem of Adams for vector fields on spheres. Let n ≥ 2, and let the maximal number
of linearly independent tangent vector fields on the (n - 1)-dimensional
sphere S^{n-1} be equal to ρ(n - 1). Then ρ(n - 1) is obtained in the following
way. Let a, b, c, d ≥ 0 denote integers with 0 ≤ c ≤ 3. We write n as the
product of a power of 2 and an odd number
n = (2a(n) + 1) 2^{b(n)}.
Dividing b by 4 gives the representation
b(n) = c(n) + 4d(n).
Finally, we set
ρ(n - 1) = 2^{c(n)} + 8d(n) - 1.
For the special case of the circle S^1 and the sphere S^2 in ℝ^3 we obtain
ρ(1) = 1 and ρ(2) = 0. An m-dimensional manifold is called parallelizable if
and only if there exist m linearly independent vector fields on it. From the
above result, exactly the spheres S^1, S^3, and S^7 are parallelizable.
Let k be even. Since each tangent vector field on S^k has a zero (Example
13.4), there is no linearly independent vector field on S^k, i.e., ρ(k) = 0. In fact,
in this case we have b(k + 1) = 0, and hence c(k + 1) = d(k + 1) = 0. Thus,
ρ(k) = 0.
The proof of this fundamental topological theorem may be found in
Adams (1962) and Schwartz (1968, L), p. 159. The proof makes essential use
of K-theory, which will be discussed in Part V.
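
The number of vector fields given by the theorem of Adams above is easily computed. The following Python sketch follows the three steps of the statement (split off the power of 2, divide the exponent by 4, combine); the function name adams_number is an assumption made for this illustration.

```python
def adams_number(n):
    """Number of linearly independent tangent vector fields on the
    sphere S^(n-1), following the formula in the theorem of Adams."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # n = (odd number) * 2**b
    b = 0
    m = n
    while m % 2 == 0:
        m //= 2
        b += 1
    # b = c + 4d with 0 <= c <= 3
    d, c = divmod(b, 4)
    return 2 ** c + 8 * d - 1


if __name__ == "__main__":
    for n in (2, 3, 4, 8, 16):
        print(f"S^{n - 1}: {adams_number(n)}")
    # Spheres S^(n-1) on which the number of fields equals the dimension.
    print([n - 1 for n in range(2, 1025) if adams_number(n) == n - 1])
```

The last line searches the spheres of dimension 1 to 1023 for parallelizable ones and finds only the dimensions 1, 3, and 7.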

References to the Literature

Classical works: Euclid (325 B.C.) ("Elements"), Gauss (1827) (surface theory),
Riemann (1854) (Riemannian manifolds), Beltrami (1868) (construction of a two-
dimensional Riemannian manifold with negative curvature, with non-Euclidean geom-
etry), Klein (1871) (models for non-Euclidean geometries in the context of projective
geometry), Klein (1872) (Erlanger program: group-theoretical classification of the
geometries), Hilbert (1903, M) (axiomatic foundation for the general geometry), Ricci
and Levi-Civita (1901) (covariant differentiation), Einstein (1916) (applications of the
calculus of covariant differentiation to the general theory of relativity), Levi-Civita
(1917) (parallel transport).
History of non-Euclidean geometry and the concept of manifolds: Klein (1926, M),
(1928, M), Scholz (1980, M).
Collected works which contain important contributions in the development of
geometry: Gauss (1863), Riemann (1892), Klein (1921), Poincaré (1928), Hilbert (1932),
Lie (1934), E. Cartan (1952), Einstein (1960).
Gauss biographies: Worbs (1955, M), Wussing (1974, M), Buhler (1981, M).
Classical surface theory: Kreyszig (1957, M, B, R), Laugwitz (1960, M).
Surface deformation in the large: Efimov (1957, M).
Introduction to classical tensor calculus: Schouten (1954, M) (standard work),
Raschewski (1959, M), Zeidler (1979, S) (connection between vector analysis, tensor
analysis, differential forms, and differential geometry).
Classical differential geometry: Blaschke (1923, M), (1950, M, H) (applications of
differential forms).
Modern differential geometry: Spivak (1979, M, H, B), Vols. 1-5 (recommended as
a comprehensive introduction), Helgason (1962, M), Kobayashi and Nomizu (1963,
M), Sternberg (1964, M), Sulanke and Wintgen (1972, M), Greub, Halperin, and
Vanstone (1972, M), Choquet-Bruhat (1982, M).
Riemannian geometry: Klingenberg (1983, M), Besse (1987, M).
Isometric embedding of Riemannian manifolds in ℝ^n: Nash (1956) (fundamental
work), Schwartz (1969, L), Gromov and Rohlin (1970, S, B), Gromov (1986, M).
Non-Euclidean geometry: Klein (1928, M).
Theorem of Gauss-Bonnet-Chern. Classical works: Gauss (1827), Bonnet (1848),
Dyck (1885), Chern (1944). Introduction: Kreyszig (1957, M), Guillemin and Pollack
(1974, M), Sulanke and Wintgen (1972, M). Connection with the theory of characteris-
tic classes: Spivak (1979, M), Vol. 5 (recommended as an introduction), Chern (1959,
L), Schwartz (1968, L), Greub, Halperin, and Vanstone (1972, M), Vol. 2 (general
theory).
Differential forms: Maurin (1976, M), Vol. 2 and Westenholz (1981, M) (introduc-
tion), Kahler (1934, M) (applications to systems of partial differential equations),
Hodge (1952, M), Cartan (1955, M), de Rham (1960, M), Greub, Halperin, and Van-
stone (1972, M), Vols. 1-3, Marsden (1974, L), Abraham and Marsden (1978, M)
(applications to mechanics).
Lie groups: Choquet-Bruhat (1982) (introduction), Chevalley (1946, M), Mont-
gomery and Zippin (1955, M), Pontrjagin (1966, M), Warner (1974, M), Dieudonné
(1975, M), Vols. 4 and 5.
Modern differential geometry and its applications to mathematical physics: Dubro-
vin, Novikov, and Fomenko (1979, M), Westenholz (1981, M), Choquet-Bruhat (1982,
M), Curtis (1986, M).
CHAPTER 75

Special Theory of Relativity

He who understands geometry may understand anything in this world.


Galileo Galilei (1564-1642)
The ways of people to the laws of nature are no less admirable than the laws
themselves.
Johannes Kepler (1571-1630)
It is known that Maxwell's electrodynamics, when applied to moving bodies,
leads to asymmetries which do not appear to be inherent in the phenomena.
Take, e.g., the electrodynamic interaction between a magnet and a conductor.
The observable phenomenon here depends only on the relative motion of the
conductor and the magnet, whereas the customary view draws a sharp distinc-
tion between the two cases in which either the one or the other of these bodies
is in motion.
Albert Einstein (1905)
(Beginning of his paper on the special theory of relativity)
If the energy of a body changes by ΔE, then its mass changes by
Δm = ΔE/c^2.
Here c is the velocity of light.
Albert Einstein (1905a)
If Einstein's theory of relativity proves correct, which I expect, then he will be
celebrated as the Copernicus of the twentieth century.
Max Planck (1909)
Henceforth space by itself and time by itself are doomed to fade away into mere
shadows and only a kind of union of the two will preserve an independent reality.
Hermann Minkowski (1909)
At the moment I am only concerned with the gravitational problem and I hope
to overcome all the difficulties with the help of a local friend and mathematician
(Marcel Grossmann). But it is true that, never in my life, I have worked so hard,
and I am filled with a great respect for mathematics. In its more subtle parts, I
have regarded it, in my simplicity, as pure luxury.
Albert Einstein in a letter of October 1912
We set
$R_{ij} = \kappa\,(T_{ij} - 2^{-1} g_{ij} T).$
This completes the general theory of relativity as a logical structure. The pos-
tulate of relativity in its most general form, which makes the space-time coor-
dinates meaningless parameters, leads necessarily to a certain form of gravita-
tional theory which explains the motion of the Perihelion.
Anyone who really has grasped the general theory of relativity will be captured
by its beauty. It is a triumph of the general differential calculus, which was
created by Gauss, Riemann, Christoffel, Ricci, and Levi-Civita.
Albert Einstein (1915)
The development of the general theory of relativity appears to me to be the
greatest achievement of scientific thought over the laws of nature, an admirable
unification of philosophical depth, physical intuition, and mathematical skills.
Max Born (1957)

In this and the following chapters we shall discuss the basic ideas of the general
theory of relativity, explain its connection with the theory of manifolds, and
give applications in the form of three interesting problems:
(i) Motion of the Perihelion of Mercury.
(ii) Big Bang and the expansion of the universe.
(iii) Black holes.
The present Chapter 75 on the special theory of relativity and Chapter 76 on
the general theory of relativity form a unity. We therefore give problems and
references to the literature at the end of Chapter 76. In connection with (i) we
want to mention that this phenomenon is discussed in the physical literature
only in first-order approximation, and the method used is usually not clearly
motivated. The first step of an iteration scheme is used. The difficulty, however,
is that one has to solve an equation which has several solutions. This difficulty
is avoided by formally choosing a solution which is physically meaningful. In
Section 76.9 we present a consistent method which also uniquely determines
all higher-order approximations and we shall prove the convergence of this
method. We employ the same bifurcation methods which have been used in
Section 8.12 in studying nonlinear oscillations. This method may also be
applied to many other problems in celestial mechanics.
Einstein's theory of relativity has been developed in two fundamental
papers, which appeared during the years 1905 (special theory of relativity) and
1916 (general theory of relativity). The special theory of relativity begins with
the principle of relativity:
(R) All physical processes have the same form for all inertial systems.

This principle will be discussed more precisely in Section 75.2. Because the
velocity of light is constant, a change of space and time between inertial
systems is given by Lorentz transformations. In addition, all physical laws
have to be given in such a way that they assume the same mathematical form
for all inertial systems. This can be achieved by formulating these laws as
geometrical laws for Minkowski's uncurved four-dimensional space-time
manifold M4.
The general theory of relativity represents an extension of Newton's theory
of gravity to arbitrary systems of reference. Newton's gravitational force is
replaced with the curvature of Einstein's four-dimensional space-time manifold
E4, which is caused by the mass distribution. Conversely, the curvature
affects the mass motion, which occurs on geodesics in E4. The following
ingenious geometrization concept forms the basis for the general theory of
relativity. It represents the deepest known connection between physics and
mathematics:
(G) Physical interactions can be reduced to geometrical properties.
In the introduction to the previous chapter we already mentioned that at
present many efforts are being made to use this geometrization principle to
find a unified theory for all interactions. This will be discussed in Part V. The
experimental results, which led Einstein to (R) and (G), were the fact that the
velocity of light is constant (Michelson experiment) and the equivalence of
gravitational and inertial mass. Later on this will be further explained.
For the reader who wants to get acquainted with the modern development
in the theory of relativity, we recommend Carelli (1979), Hawking and Israel
(1979), Held (1980), and Schmutzer (1983). These are four voluminous con-
ference reports, which appeared on the occasion of the centennial of Albert
Einstein's (1879-1955) birthday. Furthermore, we recommend the collection
of survey articles by DeWitt and Stora (1984) and the monographs by Einstein
(1953) and (1965) on the theory of relativity and Einstein's "Weltbild," as well
as the collected works of Einstein (1960). A standard work on the theory of
relativity is Misner, Thorne, and Wheeler (1973). Finally, we recommend the
conference report Ruffini (1987).
The following citations should illustrate the long historic development
which led to the geometrization of physics in the context of the general theory
of relativity.

Geometry is the knowledge of what eternally exists.


Plato (427-347 B.C.)
Every process in nature will occur in the shortest possible way.
Leonardo da Vinci (1452-1519)
All human knowledge begins with intuition, thence passes to concepts, and ends
with ideas.
Immanuel Kant in his "Critique of Pure Reason," 1781.
(This citation was chosen by Hilbert (1903) as the motto of his
Foundations of Geometry.)
In humbleness, we have to admit that if "number" is a product of our imagination,
"space" has a reality outside of our imagination, to which a priori we cannot
assign its laws.
Carl Friedrich Gauss (1777-1855) in a letter to Bessel
Riemann, 1854, presented three topics for his inaugural lecture. Gauss, in recol-
lection of his own struggle with Euclid's parallel axiom, chose-in breaking with
tradition-the third one: "On the hypotheses, which form the basis of geom-
etry." In his lecture, Riemann presented the fundamentals of a geometry for
the n-dimensional curved metric space (Riemannian geometry). This must have
made an extremely deep impression on Gauss, who at that time was already
very weak. Later, on his way home, he spoke with unusual excitement to
Wilhelm Weber about the depth of the presentation.
Erich Worbs in his Gauss biography (1955)
Every geometry is a theory about invariants of a transformation group.
Felix Klein (1872) (Erlanger program)
In physics there exists no concept, which a priori is necessary or justified. A
concept only becomes justified through its clear and unique correspondence with
events or physical experience. Newton's concepts of absolute simultaneity, ab-
solute velocity, and absolute acceleration were abandoned in the theory of rela-
tivity, because a unique connection with the world of experience appeared
to be impossible. The same applies to the concepts of the plane, straight line,
etc., upon which Euclidean geometry is based. Every physical concept must be
given a definition such that in a concrete situation, the validity or nonvalidity
of this concept can principally be determined.
Albert Einstein (1920)
Formerly it was believed that if all things vanish from this world, space and time
would remain, but according to the theory of relativity, space and time vanish
together with all things.
Albert Einstein (1921)
"Every little boy in the streets of our mathematics-Gottingen knows more about
four-dimensional geometry than Einstein," wrote David Hilbert with excusable
exaggeration. "But in spite of this," Hilbert added, "Einstein has completed the
work, not the mathematicians ...."
When the world was amazed with Einstein's theory of relativity, Minkowski
said: "To me this came as a great surprise, because as a student, Einstein had
been a lazy duck. Never has he been interested in mathematics."
Timothy Ferris (1977)
Only the genius Riemann, lonesome and unrecognized in the middle of the
previous century, found the way towards a new conception of space, whereby
space loses its stiffness, and gains the ability to participate in physical events.
Einstein (1953)

The following citations might help the reader to understand Einstein as
a human being.

I saw Einstein for the first time in Berlin in 1921, when I was wandering through
the streets, trying everything to enrol at the university where Planck, Laue, and
Einstein taught. I felt miserable, since I didn't know anyone. I was as lonesome
as one could possibly be in a great and hostile city. For weeks I waited for the
chance to meet some influential people, only to find out, how little they cared,
whether or not I would become a student at Berlin University. In my desperation
I called Einstein, and to my greatest surprise I was invited to his house.
Kindness is a difficult thing to handle amidst all this coldness and hostility.
Einstein welcomed me with a smile and offered me a cigarette, spoke to me like
to one of his own, and took everything with a childlike confidence. This short
discussion was an important event in my life. Instead of thinking about his
ingenuity and his achievements in the area of physics, I thought then, as well as
later, about his great kindness, his loud laughter, the shining of his eyes,
and-the awkwardness with which, on a table covered with all sorts of papers,
he looked for a piece of paper-about the mixture of great kindness and great
remoteness.
Leopold Infeld (1969)
Not the person counts, but the work for the community.
Einstein (1929)
Few people are able to express opinions which differ from the common preju-
dices; most people are not even able to form them.
Einstein (1955)
Why do people always babble about my theory of relativity? I have done other
useful things, maybe even better ones. But the audience does not take any notice
of this.
Einstein (1955)

[In this connection we mention the paper (1905b) on the photoeffect in which
Planck's quantum hypothesis has been used to predict the existence and
properties of photons. Also, one may think of his paper (1905c) on the
quantitative theory of Brownian motion, which later on was extended by
Norbert Wiener to the theory of stochastic processes. In 1915, Einstein and
de Haas observed experimentally an effect for ferromagnets, which ten years
later found its explanation through the discovery of the electron spin.]
Einstein is completely right that empiricism without bold ideas leads to nothing.
A master is able to find the right mixture between both.
Max Born (1957)
One thing I have learned in a long life: that all our science, measured against
reality, is primitive and childlike-and yet it is the most precious we have.
Einstein (1955)
History will tell that the best citizens in every country, the best defenders of
honor, were always those, who, by risking their positions, their names, or even
their lives, spoke out against the errors and stupidities of their fellow-men.
Romain Rolland (1866-1944)
All students in Germany, all students in the entire world, should be brought
here to see how horrible the war really is.
Einstein in 1922, when visiting the battlefields of Verdun (France)
The political apathy of people during times of peace is a sign of their later
willingness to be massacred. Because today they are not willing to support
disarmament, they will be forced tomorrow to lose their blood.
Einstein (1928)
Dictatorship brings the muzzle and with it comes the lethargy. Science can only
flourish in an atmosphere of free speech.
Einstein (1929)
The ideas and methods of the past did not prevent the wars; the ideas of the
future must make them impossible.
Einstein to the New York Times (1946)
Society is in a crisis which, in its full consequences, has not yet been recognized
by those having the power to decide between good and bad. The released atomic
force has changed everything except our way of thinking, and unprepared we slip
into another catastrophe.
Einstein (1955)
It is the high determination of people to serve rather than to rule or to be
supreme over others in any other form.
Einstein (1955)

75.1. Notations
In this chapter y = ξe_1 + ηe_2 + ζe_3 denotes a position vector in a Cartesian
coordinate system with point coordinates (ξ, η, ζ). Moreover, we let t denote
time. Let u = (u^1, u^2, u^3, u^4) with

u^1 = ξ, u^2 = η, u^3 = ζ, u^4 = ct,

where c is the velocity of light, i.e., c = 299,793 km/s. In the general theory of
relativity, u = (u^1, u^2, u^3, u^4) are arbitrary coordinates, where u^1, u^2, u^3 are
space coordinates and u^4 is a timelike coordinate. This will be discussed more
precisely in Section 76.1. Equal upper and lower Latin (or Greek) indices are
always summed from 1 to 4 (or 1 to 3). A dot as in ξ̇ means derivative with
respect to time, i.e., dξ/dt. On the other hand, the prime in ξ' does not stand
for a derivative, but instead refers to the system Σ'.

75.2. Inertial Systems and the Postulates of the Special Theory of Relativity
At the beginning of his paper (1905) "On the electrodynamics of moving
bodies," which represents the foundation of the special theory of relativity,
Einstein formulated the following two postulates in a somewhat modified
form. The concept of inertial system will be explained below:
(R) Einstein's principle of relativity. All inertial systems are physically equiva-
lent, i.e., physical processes are the same in all inertial systems when initial
and boundary conditions are the same.
Figure 75.1

(C) Constant velocity of light. In every inertial system, light travels with the
same constant velocity c in every direction.
We shall see in Section 75.5 and the following sections how these apparently
very simple postulates lead to a fundamental revision of the classical concepts
of time and physics. Also, we postulate:
(T) Translation principle. There exists an inertial system. If Σ is an inertial
system, then also each Cartesian coordinate system Σ', which is obtained
from Σ by a constant translatory motion, is an inertial system.
Recall that we mean by a translatory motion that Σ' is not rotated compared
with Σ. By a constant translatory motion of Σ' we mean a motion of
Σ' with respect to Σ with constant velocity vector v (Fig. 75.1).
In order to understand these postulates, we need a definition of inertial
systems. A formal, mathematical definition will be given in Section 75.8. Here
we shall give a heuristic description to illustrate the physical meaning.
(I) A Cartesian coordinate system Σ is an inertial system precisely if there
exists a system time t for it such that each mass point, which is far enough
away from other masses and shielded against fields (e.g., light pressure),
remains at rest or moves rectilinearly with constant velocity.

75.3. Space and Time Measurements in Inertial Systems
A mathematical axiom system uses terms which are not explained any further.
For example, Hilbert (1903), in his famous "Foundations of Geometry," used
the concepts "points, straight lines, and planes" in this way. A fundamental
difference in the way mathematicians and physicists work is that a physicist
does not have this luxury. He needs to know how physical concepts are
connected to his experiments and how he can measure quantities such as
space, time and momentum. A physicist is in the situation of a man who has
to fight his way through a dark labyrinth with many dead ends. Permanently,
hypotheses have to be made, which can only approximately be verified by
experiments, and a great number of experiments have to be compared in order
to find contradictions among those hypotheses. The theory of relativity and
quantum theory both showed the importance of the process of measurement
for physical theories.
Let us examine how a physicist can use (C), (T), and (I), to find an inertial
system which then allows him to introduce meaningful processes of measure-
ments for space and time. In connection with measurements of time, we always
think of atomic clocks when speaking of clocks. Thereby we have Cesium-133
atoms in mind. The transition from one energy level to another in these atoms
generates microwaves with 9,192,631,770 oscillations per second. The error of
these clocks is around 10^{-9} seconds a day. Modern experimentalists use
hydrogen masers instead.
Consider a physicist P in a space ship S, which is located at a far distance
from stars and which flies without rocket propulsion. We expect that S is an
inertial system. In order to verify this experimentally, P uses an atomic clock
and drops an object without accelerating it. If P observes rest or a rectilinear
motion with constant velocity, he concludes that S is an inertial system. More
precisely, S may be chosen as the origin of an inertial system Σ. We introduce
the following measurements of space and time in Σ. Note that we are dealing
with cosmic distances.
(i) Time differences. All observers Q in Σ agree via radio to measure time
differences with atomic clocks that are built the same way.
(ii) System time. In order to synchronize all clocks, P at his local time t sends
a light signal (radar signal or laser) to Q which is reflected there and
returns to P after time Δt. Then P informs the observer Q via radio that
the light signal has been at Q at the system time t + Δt/2.
(iii) Measurement of distances. Because of (C), P measures the distance be-
tween Q and himself as c · Δt/2. Thereby Euclidean geometry and a
rectilinear motion of light are assumed.
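
The rules (ii) and (iii) can be summarized in a few lines of Python. This is only a sketch of the bookkeeping done by the physicist P; the function name radar_measurement and the sample times are assumptions made for this illustration, and the value of c is the one quoted in Section 75.1.

```python
C_KM_PER_S = 299_793.0  # velocity of light as quoted in Section 75.1


def radar_measurement(t_emit, t_return, c=C_KM_PER_S):
    """Evaluate a radar echo sent by P at local time t_emit (seconds)
    and received back by P at local time t_return (seconds).

    Returns the system time at which the signal was at Q, rule (ii),
    and the distance between P and Q in kilometers, rule (iii).
    """
    dt = t_return - t_emit
    system_time_at_q = t_emit + dt / 2.0
    distance = c * dt / 2.0
    return system_time_at_q, distance


if __name__ == "__main__":
    # Hypothetical data: the echo returns two seconds after emission.
    time_at_q, distance = radar_measurement(10.0, 12.0)
    print(f"signal was at Q at system time {time_at_q} s")
    print(f"distance between P and Q: {distance} km")
```

Note that both rules use only the clock of P; the clock of Q is then set according to the value that P transmits.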
Using a known inertial system Σ and the translation principle (T), one can
now determine whether or not other systems Σ' in the universe are also inertial
systems. Astronomical experience shows that the system Σ_sun is a good ap-
proximation of an inertial system. The origin of Σ_sun is the center of gravity of
our solar system, which lies within the sun. The axes of Σ_sun point towards
fixed stars which can be chosen arbitrarily. We only require a Cartesian coor-
dinate system.
Recall that, by definition, the axes of a Cartesian coordinate system Σ are
always positively oriented, i.e., are as in Figure 75.1. Our physicist P in his
space ship knows this positive orientation even without his light system. He
only needs the first three fingers of his right hand. But how can he com-
municate this positive orientation to distant creatures which might have no
right hand? The answer to this is an experiment which was performed by Mrs.
Wu and her co-workers in 1957, and which showed the violation of parity
(asymmetry of space reflections) for weak interactions. The effect is that a
β-decay of ^{60}_{27}Co nuclei results in the emission of 30% more electrons anti-
parallel to the spin direction than parallel (Fig. 75.2).

Figure 75.2

75.4. Connection with Newtonian Mechanics


In classical mechanics, Newton assumed the existence of an absolute space
and an absolute time. This allows the following very precise definition of an
inertial system.
In the absolute space Σ_a we have the equation of motion

$$m_a \frac{d^2 y_a}{dt_a^2} = K_a,$$ (1)

where Σ_a stands for a fixed Cartesian coordinate system which is at rest in the
absolute space, and where clocks show the absolute world time t_a. A particle
is called force-free if and only if K_a = 0. A Cartesian coordinate system Σ is an
inertial system precisely if it is at rest in Σ_a or is obtained from Σ_a by a
constant translatory motion. It
follows from classical mechanics that the change from Σ_a to Σ is given by

$$y = y_a - t w + b, \qquad t = t_a, \qquad m = m_a$$ (2)

with fixed velocity vector w and fixed vector b. This is shown in Section 58.6.
At time t_a = 0, the origin y_a = 0 of Σ_a has the coordinate y = b in Σ. From (1)
and (2) we obtain as the equation of motion in Σ

$$m \frac{d^2 y}{dt^2} = K_a,$$ (3)

which has the same form as (1). In Section 58.6 we also saw that systems,
which are not inertial systems, i.e., which move accelerated in Σ_a, satisfy
the equation

$$m \frac{d^2 y}{dt^2} = K_a + A,$$ (4)

where A is the force induced by the acceleration. From K_a = 0 and (3) we
immediately obtain that y(t) = y(0) + ẏ(0)t. This implies:
(I_class) A Cartesian system Σ with world time t as system time is an inertial
system if and only if every force-free particle remains at rest or moves
rectilinearly with constant velocity.
Note that force-free refers to Σ_a and not to the forces K = K_a + A, which
are observed in the system of reference Σ. As we shall see shortly, (I_class) cannot
be used in the theory of relativity, since there Σ_a cannot be used to give a
precise definition of the concept force-free. This is the reason we used formula-
tion (I) in Section 75.2.
If Σ and Σ' are two inertial systems, then (2) yields the Galilei transformation

$$y' = y - t v + y_0, \qquad t' = t, \qquad m' = m$$ (5)

with v = w' - w and y_0 = b' - b. Furthermore, in Σ' we obtain the equation
of motion

$$m' \frac{d^2 y'}{dt'^2} = K_a.$$ (6)

Comparison of (3) and (6) gives the classical principle of relativity:
(R_class) The mechanical processes are the same for all inertial systems when the
initial conditions are the same.
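
The invariance of the equation of motion under the Galilei transformation (5) can also be checked numerically: the term tv is linear in t, so it does not contribute to the second derivative. The following Python sketch compares the accelerations of one and the same motion in two inertial systems. The sample path, the velocity V, the shift Y0, and the function names are assumptions made for this illustration only.

```python
import math


def galilei(y, t, v, y0):
    """Galilei transformation (5): y' = y - t v + y0, t' = t."""
    y_prime = tuple(yi - t * vi + y0i for yi, vi, y0i in zip(y, v, y0))
    return y_prime, t


def acceleration(path, t, h=1e-3):
    """Approximate second derivative of a path by central differences."""
    a, b, c = path(t - h), path(t), path(t + h)
    return tuple((ai - 2.0 * bi + ci) / h ** 2 for ai, bi, ci in zip(a, b, c))


def path(t):
    """Hypothetical motion of a mass point, described in the first system."""
    return (math.cos(t), math.sin(t), 0.5 * t * t)


V = (3.0, -1.0, 2.0)   # hypothetical relative velocity v
Y0 = (1.0, 2.0, 3.0)   # hypothetical shift y_0


def path_prime(t):
    """The same motion, described in the second system."""
    return galilei(path(t), t, V, Y0)[0]


if __name__ == "__main__":
    print(acceleration(path, 0.7))
    print(acceleration(path_prime, 0.7))
```

Up to rounding errors, both lines of output agree. Since m' = m, the product of mass and acceleration, and hence the force, is the same in both systems.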
If Σ' is obtained from the system Σ by a translation which may be ac-
celerated, then we find that y'(t) = y(t) - a(t) (Fig. 75.3). Differentiation with
respect to t gives the following addition theorem for velocities
ẏ' = ẏ - ȧ. (7)
In the general case, Section 58.6 implies that the transformation rule for
motions y = y(t) and y' = y'(t) in Σ and Σ' is given by
ẏ' = ẏ - ȧ - (ω × y'). (8)
Here ȧ and ω are the translational and the rotational velocities of Σ' with
respect to Σ. All quantities in (8) depend on t.

[Figure 75.3]

[Figure 75.4: two bodies with velocities c and V, panels (a) and (b)]

[Figure 75.5: orbit of the earth]

The physicists were confronted with the following important question.


(Q) How can the absolute space $\Sigma_a$ be determined experimentally?
The following answers were proposed at the end of the nineteenth century.
(i) From the classical principle of relativity (R_class) it is clear that the absolute
space cannot be determined by mechanical experiments. One may,
however, try to use light.
(ii) Consider, as in Figure 75.4, two bodies (e.g., two cars) with velocities $c$
and $V$. The classical addition theorem implies the relative velocities $c + V$
and $c - V$ in Figure 75.4(a) and (b), respectively.
(iii) Assume that (ii) can be applied to light with velocity $c$. It follows then
that light cannot travel with the same velocity in each direction for every
inertial system. There is at most one such inertial system.
(iv) Now assume that there actually exists an inertial system $\Sigma_e$ for which the
velocity of light is constant. We set $\Sigma_a = \Sigma_e$, i.e., we identify the absolute
space with $\Sigma_e$.
(v) Since the earth moves in the inertial system $\Sigma_{sun}$ of Section 75.2 on an
elliptic orbit, i.e., accelerated, no system $\Sigma_{labor}$ which is firmly connected
to the earth can be an inertial system. In particular, we have $\Sigma_{labor} \ne \Sigma_a$.
If we consider the orbit of the earth as in Figure 75.5, then transformation
formula (8) immediately implies that the velocity of light in $\Sigma_{labor}$
cannot be constant for the whole year.
(vi) During the years from 1881 to 1887, Michelson performed very precise
interference experiments. To his great surprise, he observed that the velocity
of light in $\Sigma_{labor}$ remained constant. This Michelson experiment was
the experimental starting point for the special theory of relativity.
Let us describe this experiment. As in Figure 75.6 we consider a light source
$S$ which emits monochromatic light onto a semitransparent plate $P$. After
reflection, we obtain interference bands in $I$. Rotating the whole device through
different angles $\alpha$, one expects changes in the interference pictures, because the
axes of the system travel for different $\alpha$ with different velocities relative to the
absolute space. In spite of a very high accuracy in his measurements, Michelson
did not observe any such changes.

[Figure 75.6: light source S]

In order to resolve this contradiction we make, with Einstein (1905), the
following observations.
(a) Since (iv) leads to a contradiction we assume that light travels in every
inertial system with a constant velocity. This is requirement (C) of Section
75.2.
(b) Furthermore, we take the more radical but logically very satisfactory point
of view that no particular inertial system can be distinguished by means
of physical experiments. This leads to the principle of relativity (R) of
Section 75.2. In this way, one eliminates the special role of mechanics
which is expressed in (Rclass). The idea behind (R) is the unity of all physical
phenomena. Since, according to (R), absolute space cannot be determined
through experiments, we consider the concept of absolute space as phys-
ically meaningless. Therefore, the concepts of absolute rest and absolute
velocity also lose their meaning.
In the following section we study the mathematical implications of (C) and
(R). This leads to a revision of classical mechanics. In particular, transformation
formula (5), and hence the addition theorem for velocities, no longer
holds. We will see in Part V that, unlike classical mechanics,
Maxwell's theory of electrodynamics need not be changed if space and time,
electric and magnetic field strengths, and charge densities and currents are
transformed relativistically between inertial systems. Finding these
transformation rules was one of the main goals of Einstein's classical paper
(1905).
The considerations above show a phenomenon which can be observed more
generally in the history of physics. New theories are developed when ex-
perimental results can no longer be explained by means of the old theories.
Under certain conditions, however, the old theory is contained as an
approximation in the new theory. As we shall see, classical mechanics is a special
case of relativistic mechanics if the velocities are small compared with the
velocity of light, i.e., belong to our everyday experience.

75.5. Special Lorentz Transformation


Consider two Cartesian coordinate systems $\Sigma$ and $\Sigma'$ with corresponding
space coordinates $y = (\xi, \eta, \zeta)$ and $y' = (\xi', \eta', \zeta')$. Assume also that $\Sigma$ and $\Sigma'$
are two inertial systems with corresponding system times $t$ and $t'$. Furthermore,
$\Sigma'$ is obtained from system $\Sigma$ by a constant translatory motion with
velocity $v$. Using a fixed rotation of $\Sigma$ and $\Sigma'$ and a translation of the
coordinates $y'$ and $t'$, one can always reduce to the following simpler situation:

(S) At time $t = 0$, the two inertial systems $\Sigma$ and $\Sigma'$ coincide, and we have
$t' = 0$. Moreover, $v = Ve_1$, i.e., the translation is performed for $V > 0$
along the $\xi$-axis, and for $V < 0$ in the opposite direction (Fig. 75.7).

The coincidence of $\Sigma$ and $\Sigma'$ means that both origins are equal at time
$t = t' = 0$, and the corresponding coordinate axes have the same direction.

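Postulate 75.1. The change from $\Sigma$ to $\Sigma'$ is given by the special Lorentz
transformation

$$ \xi' = \frac{\xi - Vt}{\sqrt{1 - V^2/c^2}}, \qquad t' = \frac{t - V\xi/c^2}{\sqrt{1 - V^2/c^2}}, \qquad \eta' = \eta, \qquad \zeta' = \zeta, \tag{9} $$

where $c$ is the velocity of light and $|V| < c$.

The inverse transformation of (9) is

$$ \xi = \frac{\xi' + Vt'}{\sqrt{1 - V^2/c^2}}, \qquad t = \frac{t' + V\xi'/c^2}{\sqrt{1 - V^2/c^2}}, \qquad \eta = \eta', \qquad \zeta = \zeta'. $$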

[Figure 75.7]

If the translational velocity $V$ is very small compared to the velocity of light
$c$, then we obtain from (9) in first-order approximation the classical Galilei
transformation
$$ \xi' = \xi - Vt, \qquad t' = t. $$
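
The reduction of (9) to the Galilei transformation for small $V$ can be checked numerically. The following is a minimal sketch; the function names and the sample values (a point 1000 m away, observed after 2 s, with V = 30 m/s) are illustrative assumptions and do not come from the text.

```python
import math

C = 299_792_458.0  # velocity of light in m/s

def lorentz(xi, t, V, c=C):
    """Special Lorentz transformation (9), restricted to the (xi, t) pair."""
    alpha = 1.0 / math.sqrt(1.0 - V * V / (c * c))
    return alpha * (xi - V * t), alpha * (t - V * xi / (c * c))

def lorentz_inverse(xi_p, t_p, V, c=C):
    """Inverse transformation: the same map with V replaced by -V."""
    return lorentz(xi_p, t_p, -V, c)

def galilei(xi, t, V):
    """Classical Galilei transformation xi' = xi - V t, t' = t."""
    return xi - V * t, t

# Hypothetical everyday values: V is tiny compared with c.
xi, t, V = 1000.0, 2.0, 30.0
xi_l, t_l = lorentz(xi, t, V)
xi_g, t_g = galilei(xi, t, V)
print("difference in xi':", abs(xi_l - xi_g))
print("difference in t' :", abs(t_l - t_g))

# Round trip through the inverse transformation.
xi_back, t_back = lorentz_inverse(xi_l, t_l, V)
```

Both differences are far below any everyday measurement accuracy, which is the sense in which the Galilei transformation is a first-order approximation.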

As a motivation, we will show that the following two natural conditions
immediately lead to (9).
(i) The change from $\Sigma$ to $\Sigma'$ is given by a linear regular transformation, i.e.,
$$ \xi' = \alpha\xi + \beta t, \qquad t' = \gamma\xi + \delta t, \qquad \eta' = \eta, \qquad \zeta' = \zeta. \tag{10} $$
(ii) We assume Einstein's two postulates (C) and (R) of Section 75.2.
From assumption (C) that the velocity of light is constant, it follows that
$\xi = ct$ becomes $\xi' = ct'$. This yields the following central condition:
$$ c\alpha + \beta = c^2\gamma + c\delta. $$
For $\xi' = 0$ and $t'$ arbitrary, we require $\xi = Vt$ for all $t$, i.e.,
$$ \alpha V + \beta = 0. $$
The inverse transformation of (10) is
$$ \xi = \mu(\delta\xi' - \beta t'), \qquad t = \mu(\alpha t' - \gamma\xi'), \qquad \mu^{-1} = \alpha\delta - \beta\gamma. \tag{11} $$
For $\xi = 0$ and $t$ arbitrary, we require $\xi' = -Vt'$ for all $t'$, i.e.,
$$ \delta V + \beta = 0. $$
This implies
$$ \beta = -\alpha V, \qquad \delta = \alpha, \qquad \gamma = -\alpha V/c^2. $$
Assume that $V \ge 0$. For $t = 0$ and $\xi > 0$ we require $\xi' > 0$, and hence $\alpha > 0$.
The free parameter $\alpha$ is then determined from the principle of relativity (R).
This means that $\Sigma$ and $\Sigma'$ are equivalent. We therefore require in (11) that the
coefficient of $\xi'$ in the first equation is equal to the corresponding coefficient
$\alpha$ in (10), i.e., $\mu\delta = \alpha$. This gives $\mu = 1$, and hence $\alpha = 1/\sqrt{1 - V^2/c^2}$. At the
same time, we obtain $V < c$.
For $V \le 0$ one uses an analogous argument. This motivates (9).
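
The chain of conditions in this motivation can be replayed numerically. The sketch below assumes units with c = 1 and the sample value V = 0.6; the function name is hypothetical. It builds the coefficients of (10) from the three conditions and then confirms that the determinant factor of (11) equals 1.

```python
import math

def linear_coefficients(V, c):
    """Coefficients alpha, beta, gamma, delta of (10) fixed by the conditions in the text."""
    alpha = 1.0 / math.sqrt(1.0 - V * V / (c * c))
    beta = -alpha * V                                   # from alpha*V + beta = 0
    delta = alpha                                       # from delta*V + beta = 0
    gamma = (c * alpha + beta - c * delta) / (c * c)    # from the light condition
    return alpha, beta, gamma, delta

c, V = 1.0, 0.6   # illustrative units with c = 1
alpha, beta, gamma, delta = linear_coefficients(V, c)
mu = 1.0 / (alpha * delta - beta * gamma)   # factor mu of the inverse (11)
print("alpha, beta, gamma, delta:", alpha, beta, gamma, delta)
print("mu:", mu)
```

The value of mu agrees with 1 up to rounding, and gamma agrees with $-\alpha V/c^2$.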
The following invariance relation
$$ c^2t'^2 - \xi'^2 = c^2t^2 - \xi^2 \qquad \text{for all } t, \xi \in \mathbb{R} \tag{12} $$
is the key for the geometrical interpretation of the special theory of relativity
of Section 75.8. There we will use (12) to define, with Minkowski (1909), a
Riemannian metric for the four-dimensional space-time manifold.

Proposition 75.2. Formula (12) holds for all the special Lorentz transformations
(9). Conversely, every linear transformation (10) which satisfies (12) is a special
Lorentz transformation (9), except for reflections of the variables $\xi'$ and $t'$.
PROOF.

(I) It follows from (10) and (12) that
$$ c^2t^2 - \xi^2 = c^2t^2(\delta^2 - \beta^2/c^2) - \xi^2(\alpha^2 - c^2\gamma^2) + 2\xi t(c^2\gamma\delta - \alpha\beta), $$
hence
$$ \delta^2 - \beta^2/c^2 = 1, \qquad \alpha^2 - c^2\gamma^2 = 1, \tag{13} $$
$$ c^2\gamma\delta - \alpha\beta = 0. \tag{14} $$
We want to prove (9). From (13) we have $\delta^2, \alpha^2 \ge 1$. Using a reflection of
the variables in (10), we can always assure that $\alpha, \delta \ge 1$. Let us assume
that $\beta \ge 0$. From (14) it follows then that $\gamma \ge 0$. Therefore, there are
numbers $0 \le \rho, \sigma < \infty$ with
$$ \alpha = \cosh\rho, \qquad \delta = \cosh\sigma. $$
From (13) it follows that
$$ c\gamma = \sinh\rho, \qquad \beta = c\sinh\sigma. $$
Equation (14) implies that $\tanh\rho = \tanh\sigma$, hence $\rho = \sigma$. Now we choose a
number $V \ge 0$ with $V^2/c^2 = 1 - 1/\alpha^2$. This implies (9). Analogous
arguments are used for $\beta \le 0$.
(II) Conversely, (12) is an immediate consequence of (9). □
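
The hyperbolic parametrization used in the proof lends itself to a quick numerical check of (12). This is a sketch under stated assumptions: the names are hypothetical, the values c = 3, rho = 0.7 are arbitrary, and the signs of beta and gamma are chosen negative so that the resulting map is (9) with $V \ge 0$.

```python
import math

def boost_from_rapidity(rho, c):
    """Coefficients of (10) built from a single parameter rho (with rho = sigma)."""
    alpha = delta = math.cosh(rho)
    beta = -c * math.sinh(rho)     # sign chosen so that V >= 0 in (9)
    gamma = -math.sinh(rho) / c
    return alpha, beta, gamma, delta

def interval(xi, t, c):
    """The quadratic expression c^2 t^2 - xi^2 appearing in (12)."""
    return c * c * t * t - xi * xi

c, rho = 3.0, 0.7                  # arbitrary illustrative values
alpha, beta, gamma, delta = boost_from_rapidity(rho, c)

xi, t = 2.0, 5.0
xi_p = alpha * xi + beta * t
t_p = gamma * xi + delta * t

V = c * math.tanh(rho)             # velocity belonging to rho
print("interval before:", interval(xi, t, c))
print("interval after :", interval(xi_p, t_p, c))
```

Choosing V through tanh(rho) reproduces $\alpha = 1/\sqrt{1 - V^2/c^2}$, so the map is a special Lorentz transformation.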

75.6. Length Contraction, Time Dilatation, and Addition Theorem for Velocities

We consider in this section two inertial systems $\Sigma$ and $\Sigma'$ for which situation
(S) of the previous section applies. Let $P$ and $P'$ denote observers in $\Sigma$ and $\Sigma'$,
respectively.

EXAMPLE 75.3 (Length Contraction). Consider a rod of length $l_0$ which is at
rest on the $\xi$-axis in $\Sigma$. Then $P'$ in $\Sigma'$ measures the length
$$ l' = l_0\sqrt{1 - V^2/c^2}. \tag{15} $$
To see this, let $\xi_1$ and $\xi_2 = \xi_1 + l_0$ be the coordinates of the two endpoints of
the rod in $\Sigma$. From (9) we obtain in system $\Sigma'$ as equations of motion for these
endpoints
$$ \xi'_j = \alpha(\xi_j - Vt), \qquad t'_j = \alpha(t - V\xi_j/c^2) $$
with variable $t$. Here $j = 1, 2$ and $\alpha = 1/\sqrt{1 - V^2/c^2}$. Solving for $\xi_j$ and $t$ gives
$$ \xi_j = \alpha(\xi'_j + Vt'_j), \qquad t = \alpha(t'_j + V\xi'_j/c^2). \tag{16} $$
It is important that $P'$ measures the length $l' = \xi'_2 - \xi'_1$ not at the same
time $t$, but at the same time $t'$. Letting $t'_1 = t'_2$ in (16) we obtain by
subtraction $l_0 = \xi_2 - \xi_1 = \alpha l'$. This is (15).
Now consider a cuboid of volume $Q_0$ in the inertial system $\Sigma$ which is
parallel to the axes and at rest. Because of (15) and $\eta' = \eta$, $\zeta' = \zeta$, one observes
the volume
$$ Q' = Q_0\sqrt{1 - V^2/c^2} \tag{17} $$
in the inertial system $\Sigma'$. One easily sees that this formula is also true for an
arbitrary cuboid in $\Sigma$ which is at rest. By integration, (17) can then be extended
to arbitrary volumes.
Formulas (15) and (17) are not in contradiction to the principle of relativity,
because $\Sigma$ and $\Sigma'$ are not equivalent: the rod is at rest in $\Sigma$, while it moves in
$\Sigma'$. In $\Sigma'$, one observes the following velocities for the endpoints of the rod:
$$ \frac{d\xi'_j}{dt'_j} = \frac{d\xi'_j}{dt}\,\frac{dt}{dt'_j} = -V, \qquad j = 1, 2. $$
This implies: If a body moves in an inertial system rectilinearly with constant
velocity $V$, then all lengths in this inertial system in the direction of motion are
shortened by a factor $\sqrt{1 - V^2/c^2}$, compared to lengths in the system at rest.
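
The measurement at equal $t'$ described in Example 75.3 can be mimicked in a few lines. This is a sketch only: the function name, the units with c = 1, and the choice V = 0.8c are assumptions made for illustration.

```python
import math

def measured_length(l0, V, c):
    """Length found by P' for a rod of rest length l0 that is at rest in Sigma."""
    alpha = 1.0 / math.sqrt(1.0 - V * V / (c * c))
    xi1, xi2 = 0.0, l0          # endpoint coordinates in Sigma
    t_prime = 0.0               # P' reads off both endpoints at the same time t'
    # From (16): xi_j = alpha * (xi'_j + V * t'), solved for xi'_j.
    xi1_p = xi1 / alpha - V * t_prime
    xi2_p = xi2 / alpha - V * t_prime
    return xi2_p - xi1_p

c = 1.0
l_prime = measured_length(l0=1.0, V=0.8 * c, c=c)
print("measured length:", l_prime)
```

The result agrees with $l_0\sqrt{1 - V^2/c^2}$ from (15) up to rounding.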
EXAMPLE 75.4 (Time Dilatation). In an inertial system $\Sigma$ two events take place
at the same place $x = (\xi, \eta, \zeta)$ and at different times $t_1$ and $t_2 = t_1 + \Delta t$. If $\Delta t'$
is the time which passes between the two events for an observer in the inertial
system $\Sigma'$, then
$$ \Delta t' = \frac{\Delta t}{\sqrt{1 - V^2/c^2}}, \tag{18} $$
i.e., for $P'$ the time between the two events appears dilated compared to an
observer $P$ in $\Sigma$.
To show this we note that for $P'$ the two events have coordinates $(\xi'_j, t'_j)$,
$j = 1, 2$, with
$$ \xi'_j = \alpha(\xi - Vt_j), \qquad t'_j = \alpha(t_j - V\xi/c^2). $$
Subtraction yields (18).
This time dilatation does not contradict the principle of relativity, since for
$P$ the two events take place at the same place, while under the assumption
$\Delta t \ne 0$ and $V \ne 0$ this is not the case for $P'$. Therefore, the two inertial systems
$\Sigma$ and $\Sigma'$ are not equivalent.
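
The subtraction argument of Example 75.4 in code form, as a minimal sketch. The function name and the sample values (c = 1, V = 0.6c, both events at $\xi = 0$) are illustrative assumptions.

```python
import math

def dilated_interval(dt, V, c):
    """Time between two events in Sigma' when both occur at the same place in Sigma."""
    alpha = 1.0 / math.sqrt(1.0 - V * V / (c * c))
    xi = 0.0                    # common place of both events in Sigma
    t1, t2 = 0.0, dt
    t1_p = alpha * (t1 - V * xi / (c * c))
    t2_p = alpha * (t2 - V * xi / (c * c))
    return t2_p - t1_p          # subtraction, as in the text

c = 1.0
dt_prime = dilated_interval(dt=1.0, V=0.6 * c, c=c)
print("dilated interval:", dt_prime)
```

For these values the interval in $\Sigma'$ is longer than in $\Sigma$ by the factor $1/\sqrt{1 - V^2/c^2}$ of (18).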
EXAMPLE 75.5 (Relativistic Addition Theorem for Velocities). Suppose the
motion of a point object is given by the equation
$$ y = y(t) \quad \text{in the inertial system } \Sigma. $$
This motion corresponds to
$$ y' = y'(t') \quad \text{in the inertial system } \Sigma'. $$
Thus one observes the velocities
$$ w = dy(t)/dt \quad \text{and} \quad w' = dy'(t')/dt' $$
in $\Sigma$ and $\Sigma'$, respectively. The velocity components satisfy
$$ w'_1 = (w_1 - V)/\gamma, \qquad w'_j = w_j\sqrt{1 - V^2/c^2}\,/\gamma, \quad j = 2, 3, \tag{19} $$
with $\gamma = 1 - w_1V/c^2$.

PROOF. From $\eta'(t') = \eta(t)$, $\zeta'(t') = \zeta(t)$, and
$$ \xi' = \alpha(\xi(t) - Vt), \qquad t' = \alpha(t - V\xi(t)/c^2) $$
it follows by differentiation that
$$ \frac{d\xi'}{dt'} = \frac{d\xi'}{dt}\,\frac{dt}{dt'} = \frac{\dot\xi - V}{1 - V\dot\xi/c^2}, \qquad
\frac{d\eta'}{dt'} = \frac{d\eta'}{dt}\,\frac{dt}{dt'} = \frac{\dot\eta}{\alpha(1 - V\dot\xi/c^2)}. $$
□

Equations (19) imply that
$$ c^2 - w'^2 = \frac{(c^2 - w^2)(1 - V^2/c^2)}{(1 - w_1V/c^2)^2}. $$
Hence, because of $|V| < c$ we have the following results:
(i) From $|w| < c$ follows $|w'| < c$, i.e., a velocity below that of light remains
below that of light.
(ii) From $|w| = c$ follows $|w'| = c$, i.e., the velocity of light remains the velocity of
light.
(iii) For velocities above that of light with $|w_1| > c$ in an inertial system $\Sigma$ there always
exists a $V > 0$, i.e., an inertial system $\Sigma'$, for which transformation (19)
becomes singular, i.e., $|w'|$ becomes infinitely large. □
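
The addition theorem and the identity for $c^2 - w'^2$ can be tested on sample velocities. The following sketch assumes units with c = 1; the function names and the two sample velocity vectors are hypothetical.

```python
import math

def transform_velocity(w, V, c):
    """Velocity components in Sigma' computed from those in Sigma, following (19)."""
    w1, w2, w3 = w
    root = math.sqrt(1.0 - V * V / (c * c))
    g = 1.0 - w1 * V / (c * c)          # the denominator 1 - w_1 V / c^2
    return ((w1 - V) / g, w2 * root / g, w3 * root / g)

def speed_sq(w):
    """Squared Euclidean length of a velocity vector."""
    return sum(x * x for x in w)

c, V = 1.0, 0.5
w_slow = (0.9, 0.1, 0.0)     # below the velocity of light
w_light = (0.0, 1.0, 0.0)    # a light ray perpendicular to the xi-axis

wp_slow = transform_velocity(w_slow, V, c)
wp_light = transform_velocity(w_light, V, c)

# Both sides of the identity that follows from (19).
lhs = c * c - speed_sq(wp_slow)
rhs = (c * c - speed_sq(w_slow)) * (1 - V * V / (c * c)) / (1 - w_slow[0] * V / (c * c)) ** 2
print("identity gap:", abs(lhs - rhs))
print("speed of transformed light ray:", math.sqrt(speed_sq(wp_light)))
```

The slow particle stays below c in $\Sigma'$, and the light ray keeps speed c although its direction changes.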

In Section 75.9 we will see that it is meaningful to require that physical
effects cannot travel faster than light.

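75.7. Lorentz Group and Poincaré Group

For all $V \in \mathbb{R}$ with $|V| < c$ we define the matrices
$$ L(V) = \begin{pmatrix} \alpha & 0 & 0 & -\alpha V/c \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\alpha V/c & 0 & 0 & \alpha \end{pmatrix}, \qquad
R = \begin{pmatrix} \varepsilon_1 & 0 & 0 & 0 \\ 0 & \varepsilon_2 & 0 & 0 \\ 0 & 0 & \varepsilon_3 & 0 \\ 0 & 0 & 0 & \varepsilon_4 \end{pmatrix}, $$
$$ T = \begin{pmatrix} t_{11} & t_{12} & t_{13} & 0 \\ t_{21} & t_{22} & t_{23} & 0 \\ t_{31} & t_{32} & t_{33} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad
u = \begin{pmatrix} \xi \\ \eta \\ \zeta \\ ct \end{pmatrix}, $$
with $\alpha = 1/\sqrt{1 - V^2/c^2}$ and $\varepsilon_j = \pm 1$. Moreover, let $T$ be an orthogonal
matrix with $\det T = 1$. Assume the same for $T_1$ and $T_2$. The special Lorentz
transformation (9) can now be written in the short form
$$ u' = L(V)u. $$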

Definition 75.6. Precisely all maps of the form
$$ u' = Au + a \tag{20} $$
with $A = RT_1L(V)T_2$ are called Poincaré transformations. For $a = 0$ and
$\varepsilon_4 = 1$ we speak of a Lorentz transformation. These Lorentz transformations
are called proper if and only if $R$ is equal to the unit matrix.

A proper Lorentz transformation consists of a spatial rotation $T_2$, a special
Lorentz transformation $L(V)$, and a spatial rotation $T_1$. If, in addition, spatial
reflections are allowed, we obtain Lorentz transformations. If, moreover,
translations of the space and time coordinates and reflections of the time
coordinate, i.e., $\varepsilon_4 = -1$, are allowed, we obtain Poincaré transformations.
Because of $\det L(V) = \det T = 1$ we have: A Lorentz transformation is proper
if and only if $\det A = 1$.
A Poincaré transformation with $a = 0$ is a Lorentz transformation precisely
if $\partial t'/\partial t > 0$. Hence for a Lorentz transformation, the direction of time is
preserved.

EXAMPLE 75.7. Consider two Cartesian coordinate systems $\Sigma$ and $\Sigma'$ with
corresponding system times $t$ and $t'$. Both systems are inertial systems, and
$\Sigma'$ is obtained from $\Sigma$ by a constant translatory motion with velocity vector
$v$. For $t = 0$, $\Sigma$ and $\Sigma'$ have the same origin. By using a rotation of $\Sigma$ and $\Sigma'$,
the situation may be reduced to the simpler situation (S) of Section 75.5,
for which the $\xi$-axis and the $\xi'$-axis have the same direction as the vector $v$.
This implies
$$ u' = T_1L(V)T_2u $$
with $V = |v|$. For a suitable choice of $\Sigma$ and $\Sigma'$, every proper Lorentz
transformation can be obtained like this. Without assuming that both systems $\Sigma$ and
$\Sigma'$ have the same origin at time $t = 0$, we obtain the more general equation
$$ u' = Au + a $$
with fixed $a$ and $A = T_1L(V)T_2$.
Every matrix $T_2$ can be written as the product of rotations $D_j$ about the $j$th
space axis, $j = 1, 2, 3$. Because of $L(V)D_1 = D_1L(V)$ we may always assume
that $T_2$ is only the product of rotations $D_2$ and $D_3$. Thus the Lorentz
transformations $T_1L(V)T_2$ depend on $3 + 1 + 2 = 6$ parameters, and hence the
Poincaré transformations depend on $6 + 4 = 10$ parameters.

Proposition 75.8. The collection of all Lorentz transformations (or Poincaré
transformations) forms a group, which is called the Lorentz group $L$ (or Poincaré
group $P$). The proper Lorentz transformations form a subgroup of $L$.

PROOF. An elementary calculation shows that
$$ L(V_1)L(V_2) = L(V_3) \qquad \text{with} \qquad V_3 = \frac{V_1 + V_2}{1 + V_1V_2/c^2} $$
for $|V_j| < c$ and $j = 1, 2, 3$. Because of $L(0) = I$ and $L(V)L(-V) = I$, all
matrices $L(V)$ form a group. Since all matrices $T$ form a group as well, it
follows that all Lorentz transformations form a group.
From $u' = Au + a$ and $u'' = A'u' + a'$ it follows that
$$ u'' = A'Au + A'a + a'. $$
Therefore all Poincaré transformations form a group as well. □
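
The "elementary calculation" in the proof can be spot-checked with plain 4x4 matrices. This is a sketch with hypothetical helper names; it assumes units with c = 1 and the sample velocities 0.5 and 0.7, and it uses the coordinate order $(\xi, \eta, \zeta, ct)$ of the text.

```python
import math

def L(V, c):
    """Matrix of the special Lorentz transformation in coordinates (xi, eta, zeta, ct)."""
    a = 1.0 / math.sqrt(1.0 - V * V / (c * c))
    b = -a * V / c
    return [[a, 0.0, 0.0, b],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [b, 0.0, 0.0, a]]

def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def max_diff(A, B):
    """Largest absolute entrywise difference of two 4x4 matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(4) for j in range(4))

c, V1, V2 = 1.0, 0.5, 0.7
V3 = (V1 + V2) / (1.0 + V1 * V2 / (c * c))   # composed velocity from the proof

composition_error = max_diff(matmul(L(V1, c), L(V2, c)), L(V3, c))
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
inverse_error = max_diff(matmul(L(V1, c), L(-V1, c)), identity)
print("V3:", V3)
print("composition error:", composition_error)
print("inverse error:", inverse_error)
```

Note that V3 stays below c even though V1 + V2 exceeds c, so the set of matrices $L(V)$ with $|V| < c$ is closed under multiplication.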

It can easily be shown that $L$ is a six-dimensional Lie group and $P$ a
ten-dimensional Lie group. In relativistic quantum field theories the
invariance under the Poincaré group makes it possible to assign quantities
like energy, momentum, rest mass, spin (angular momentum), and parity to
elementary particles. Today, one expects that there exists PCT-invariance for
all interactions between elementary particles. This means: If an elementary
particle process $\pi$ is possible in nature, then the same is true for $\pi_{PCT}$, where
$\pi_{PCT}$ is obtained from $\pi$ by a spatial reflection P, a change to antiparticles C,
and a time reflection T (reversing all velocities). Under certain assumptions
a precise mathematical proof can be given for the PCT-invariance (see Streater
and Wightman (1964, M)).

Proposition 75.9. Let
$$ Q(u) = c^2t^2 - \xi^2 - \eta^2 - \zeta^2 $$
with $u = (\xi, \eta, \zeta, ct)$. Then $Q$ is invariant under every Poincaré transformation,
i.e., $Q(u_1 - u_2) = Q(u'_1 - u'_2)$.

PROOF. This follows immediately from Proposition 75.2 and the fact that $Q$ is
invariant under spatial rotations and reflections of the coordinates. □

The converse of Proposition 75.9 is discussed in Problem 76.1.


75.8. Space-Time Manifold of Minkowski


In order to understand the geometrical meaning of the special theory of
relativity and, later in Section 76.1, the difference between the special and the
general theory of relativity, we construct with Minkowski (1909) a
four-dimensional space-time manifold $M^4$ which will also be called Minkowski
space. We use the notations of Section 75.1.
To every proper Lorentz transformation $A$ and every $a \in \mathbb{R}^4$ we assign a
copy of $\mathbb{R}^4$ which will be denoted by $\mathbb{R}^4(A, a)$. By definition, the change
between points $u' \in \mathbb{R}^4(A', a')$ and $u \in \mathbb{R}^4(I, 0)$ is given by
$$ u' = A'u + a'. \tag{21*} $$
Thus, for the change between the corresponding elements $u' \in \mathbb{R}^4(A', a')$ and
$u'' \in \mathbb{R}^4(A'', a'')$, we have the formula
$$ u'' = Au' + \bar a \tag{21} $$
with the proper Lorentz transformation
$$ A = A''A'^{-1} \qquad \text{and} \qquad \bar a = a'' - A''A'^{-1}a'. $$
We consider the $\mathbb{R}^4(A, a)$ as chart spaces and define the chart change through (21).
According to the general construction of Problem 73.2, we thereby obtain a
$C^\infty$-manifold $M^4$, whose points $x = (u)$ consist of tuples with $u \in \mathbb{R}^4(A, a)$.
Through (21) the elements $u$ of the tuple are naturally connected to each other.
The physical interpretation is as follows. We think of Example 75.7. The
chart spaces $\mathbb{R}^4(A, a)$ correspond to all possible inertial systems. Therefore,
precisely the $\mathbb{R}^4(A, a)$ are called inertial charts. A point
$$ x = (u) \quad \text{in } M^4 $$
is called an event with coordinates $u = (u^1, u^2, u^3, u^4)$ in the inertial system for
$\mathbb{R}^4(A, a)$. Here $u^1, u^2, u^3$ are spatial Cartesian coordinates and $u^4 = ct$, where
$t$ denotes the time and $c$ the velocity of light. The transformation between the
coordinates $u'$ and $u''$ of the event $x$ in the inertial systems for $\mathbb{R}^4(A', a')$ and
$\mathbb{R}^4(A'', a'')$, respectively, is given by (21).
In order to introduce a metric for M4 , we set
$$ Q(u_1 - u_2) = g_{ij}(u_1^i - u_2^i)(u_1^j - u_2^j) \tag{22} $$
with
$$ g_{44} = 1, \qquad g_{11} = g_{22} = g_{33} = -1, \qquad g_{ij} = 0 \quad \text{for } i \ne j \tag{23} $$
in $\mathbb{R}^4(I, 0)$. In $\mathbb{R}^4(A', a')$ we define $g'_{ij}$ by transforming $g_{ij}$ as a tensor with
respect to the corresponding coordinate transformation (21*), $u' = A'u + a'$,
i.e.,
$$ g'_{ij} = \frac{\partial u^k}{\partial u'^i}\,\frac{\partial u^m}{\partial u'^j}\,g_{km}. $$

According to the general construction of Problem 74.3h, we thereby obtain a
symmetric tensor field $g_{ij}$ on $M^4$. Since $\det A' = 1$ for all $A'$, it follows that
$\det A = 1$ in (21). Thus $M^4$ is an oriented Riemannian $C^\infty$-manifold and we
can apply the tensor analysis and Riemannian geometry of Section 74.19.
From (23) it follows that $R^i_{jkm} = 0$, i.e., the curvature tensor is identically zero
on $M^4$. The following proposition shows that the geometry on $M^4$ is
particularly simple. In mathematical terms it contains the physical fact that all inertial
systems are equivalent.

Proposition 75.10. For every inertial chart $\mathbb{R}^4(A, a)$ the metric tensor $g_{ij}$ has the
simple form (23).
PROOF. If (21*) holds, then the right-hand side of (22) is transformed as
$g'_{ij}(u'^i_1 - u'^i_2)(u'^j_1 - u'^j_2)$. Proposition 75.9 then yields the desired result. □

Remark 75.11. In every inertial system, i.e., in every inertial chart, we have the
following relations
$$ ds^2 = c^2dt^2 - d\xi^2 - d\eta^2 - d\zeta^2 $$
and $\Gamma^k_{ij} = 0$. This implies that $\nabla_i = D_i$, i.e., the covariant derivative coincides
with the classical derivative. Note, however, that $g_{ij}$ need not have the form
(23) for every possible admissible chart. For example, such charts may
correspond to curved space coordinates. Also, there exist coordinates for which
the distinction between space and time coordinates is lost. One may think, for
instance, of $v^4 = u^1 + u^4$, i.e., $v^4 = \xi + ct$ and $v^i = u^i$ for $i = 1, 2, 3$.

Strategy 75.12. The goal of relativistic physics is to formulate all physical laws
in such a way that they have the same form for every inertial system. This is
Einstein's principle (R) of relativity of Section 75.2. Mathematically, this
program can be realized by using only geometrical properties of $M^4$, i.e.,
properties which are independent of the choice of inertial charts. For example,
tensor equations on $M^4$ satisfy this condition.

In the following sections we will use this geometrical method to formulate
a relativistic mechanics. In Part V it will be used for a formulation of
electrodynamics. A special role is played by the scalars on $M^4$. These are physical
quantities which have the same value for all inertial systems. Examples are
charge, rest mass, and entropy. As we shall see in Section 75.11, mass and
energy are not scalars. This is one of the fundamental results of the theory of
relativity.

75.9. Causality and Maximal Signal Velocity


Consider two events $x_1 \in M^4$ and $x_2 \in M^4$ with corresponding coordinates $u_1$
and $u_2$ in a fixed inertial chart, i.e., in a fixed inertial system $\Sigma$. Recall that
$$ u = (u^1, u^2, u^3, u^4) = (\xi, \eta, \zeta, ct). $$

[Figure 75.8: light cone, panels (a) and (b)]

We define the square of the distance between the two events as
$$ d = g_{ij}(u_1^i - u_2^i)(u_1^j - u_2^j)
     = c^2(t_1 - t_2)^2 - (\xi_1 - \xi_2)^2 - (\eta_1 - \eta_2)^2 - (\zeta_1 - \zeta_2)^2. \tag{24} $$
Because of Proposition 75.9 this definition is independent of the inertial chart.
The quotient $w$ = distance/time, i.e.,
$$ w = \sqrt{(\xi_1 - \xi_2)^2 + (\eta_1 - \eta_2)^2 + (\zeta_1 - \zeta_2)^2}\,/\,|t_1 - t_2| $$
is called the signal velocity with respect to $x_1$ and $x_2$ in the inertial system $\Sigma$.
For $t_1 = t_2$ we set $w = \infty$. If at point $(\xi_1, \eta_1, \zeta_1)$ and time $t_1$ one transmits a
signal which travels rectilinearly with constant velocity $w$, then it reaches point
$(\xi_2, \eta_2, \zeta_2)$ at time $t_2$. We say briefly that the signal connects the two events
$x_1$ and $x_2$ with each other.

Definition 75.13. An event $x_2$ is called timelike (or spacelike, lightlike) with
respect to $x_1$ if and only if $d > 0$ (or $d < 0$, $d = 0$). For a fixed $x_1$ we call the
set of all $x_2 \in M^4$ with $d = 0$ the light cone with respect to $x_1$.

In Figure 75.8(a), which corresponds to $\eta = \zeta = 0$, the light cone is formed
by the two straight lines which pass through the point $x_1$ and have slope $\pm c$.
If we take the straight line through the point $x_1$ and parallel to the $t$-axis as
the axis of the light cone, then exactly the inner and outer points of the light
cone are timelike and spacelike with respect to $x_1$, respectively. Precisely the points $x_2$ on
the light cone are lightlike with respect to $x_1$. The line segments in Figure
75.8(b) correspond to signals which travel rectilinearly and which connect the
two events $x_1$ and $x_2$ (or $x_3$ and $x_4$). The constant signal velocity $w$ is equal
to the absolute value of the slope of the line segments.

Proposition 75.14. An event $x_2$ is timelike (or spacelike, lightlike) with respect
to $x_1$ if and only if the corresponding signal velocity $w$ satisfies $w < c$ (or $w > c$,
$w = c$) in every inertial system.

PROOF. This follows immediately from the definition of $d$ and $w$. □
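
Definition 75.13 and Proposition 75.14 translate directly into a small classification routine. The sketch below is illustrative only: the function names are hypothetical, events are passed as tuples $(\xi, \eta, \zeta, t)$ with $t$ rather than $ct$ in the last slot, and units with c = 1 are assumed.

```python
import math

def squared_distance(u1, u2, c):
    """The quantity d of (24) for events given as (xi, eta, zeta, t)."""
    (x1, y1, z1, t1), (x2, y2, z2, t2) = u1, u2
    return (c * c * (t1 - t2) ** 2
            - (x1 - x2) ** 2 - (y1 - y2) ** 2 - (z1 - z2) ** 2)

def signal_velocity(u1, u2):
    """Spatial distance divided by time difference; infinite for equal times."""
    (x1, y1, z1, t1), (x2, y2, z2, t2) = u1, u2
    dist = math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)
    return math.inf if t1 == t2 else dist / abs(t1 - t2)

def classify(u1, u2, c):
    """Classify u2 relative to u1 according to the sign of d."""
    d = squared_distance(u1, u2, c)
    if d > 0:
        return "timelike"
    if d < 0:
        return "spacelike"
    return "lightlike"

c = 1.0
origin = (0.0, 0.0, 0.0, 0.0)
for event in [(1.0, 0.0, 0.0, 2.0), (2.0, 0.0, 0.0, 1.0), (3.0, 4.0, 0.0, 5.0)]:
    print(event, classify(origin, event, c), signal_velocity(origin, event))
```

With floating-point input the test d = 0 is only reliable for exactly representable values such as those above; real data would need a tolerance.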

Proposition 75.15. An event $x_2$ is timelike with respect to $x_1$ if and only if there
exists an inertial system, in which $x_1$ and $x_2$ take place at the same point and at
different times or the same time.
The event $x_2$ is spacelike with respect to $x_1$ if and only if there exists an
inertial system, in which $x_1$ and $x_2$ take place at different points and at the same
time.

PROOF. Let $x_1 \ne x_2$. We write $x_j \sim (\xi_j, \eta_j, \zeta_j, ct_j)$ in $\Sigma$ if and only if $x_j$ has these
coordinates in $\Sigma$. Using a translation and a spatial rotation we can always
find an inertial system $\Sigma$ with $x_2 \sim (0, 0, 0, 0)$ and $x_1 \sim (\xi_1, 0, 0, ct_1)$ in $\Sigma$. Thus
we have $d = c^2t_1^2 - \xi_1^2$. Using a special Lorentz transformation,
$$ \xi'_j = \alpha(\xi_j - Vt_j), \qquad t'_j = \alpha(t_j - V\xi_j/c^2), \qquad j = 1, 2, $$
we pass from $\Sigma$ to $\Sigma'$. Here we have $\alpha = 1/\sqrt{1 - V^2/c^2}$. For $d > 0$ we can find
a real number $V$ with $|V| < c$ such that $\xi'_1 = 0$. In system $\Sigma'$ we therefore have
$x_1 \sim (0, 0, 0, ct'_1)$ and $x_2 \sim (0, 0, 0, 0)$ with $t'_1 \ne 0$.
For $d < 0$ one can always find a real number $V$ with $|V| < c$ such that $t'_1 = 0$.
In $\Sigma'$ this implies $x_1 \sim (\xi'_1, 0, 0, 0)$ and $x_2 \sim (0, 0, 0, 0)$ with $\xi'_1 \ne 0$. □
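
The velocity $V$ whose existence the proof asserts can be written down explicitly: $V = \xi_1/t_1$ for $d > 0$ and $V = c^2t_1/\xi_1$ for $d < 0$. These formulas are not stated in the text; they follow from setting $\xi'_1 = 0$ or $t'_1 = 0$ in the transformation. A minimal sketch with hypothetical names, assuming c = 1 and $x_2$ at the origin:

```python
import math

def boost(xi, t, V, c):
    """Special Lorentz transformation of the pair (xi, t)."""
    alpha = 1.0 / math.sqrt(1.0 - V * V / (c * c))
    return alpha * (xi - V * t), alpha * (t - V * xi / (c * c))

def separating_velocity(xi1, t1, c):
    """Velocity V with |V| < c that puts x1 at the same place (d > 0)
    or at the same time (d < 0) as x2 = origin."""
    d = c * c * t1 * t1 - xi1 * xi1
    if d > 0:
        return xi1 / t1               # makes xi'_1 = 0
    if d < 0:
        return c * c * t1 / xi1       # makes t'_1 = 0
    raise ValueError("lightlike separation: no such inertial system")

c = 1.0
V_time = separating_velocity(1.0, 2.0, c)     # timelike pair
xi_p, t_p = boost(1.0, 2.0, V_time, c)

V_space = separating_velocity(2.0, 1.0, c)    # spacelike pair
xi_q, t_q = boost(2.0, 1.0, V_space, c)

print("timelike case :", V_time, xi_p, t_p)
print("spacelike case:", V_space, xi_q, t_q)
```

In both cases $|V| < c$ holds precisely because of the sign of $d$.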

Now we formulate the following two general laws of nature as postulates:

(P) Principle of causality. If an event $x_2$ is spacelike with respect to $x_1$, then
there exists no causal connection between $x_1$ and $x_2$.
(S) Maximal signal velocity. In an inertial system, physical signals can travel
at most with the velocity of light.
We shall motivate (P) and (S). From Proposition 75.15, (P) is equivalent to
the fact that there exists no causal connection between two events in an inertial
system if they take place at different points and at the same time, i.e., physical
signals cannot travel with infinite velocity. Suppose (S) does not hold. From
Proposition 75.14 it follows then that there exist two events $x_1$ and $x_2$, where
$x_2$ is spacelike with respect to $x_1$ and the event $x_2$ is caused by $x_1$ via a signal
with a velocity greater than that of light. This contradicts (P).

EXAMPLE 75.16 (The Catastrophe in the Center of the Milky Way). We want
to show that Newton's theory of gravity is not compatible with (S). For this
reason, Einstein replaced it with his theory of gravity, the general theory of
relativity. Our Milky Way consists of approximately $10^{11}$ stars. The sun is
located at the boundary of the Milky Way and at a distance of approximately
30,000 light years from its center $C$. It rotates about $C$ with a velocity of
268 km/s, i.e., ten times as fast as the earth rotates about the sun. Suppose
there occurs a huge explosion at $C$ which drastically changes the mass $m_C$
of $C$. According to Newton's theory of gravity we obtain
$$ \ddot y = \frac{Gm_C(y_C - y)}{|y_C - y|^3} $$
for the motion $y = y(t)$ of the sun. Thus the change of $m_C$ causes an
immediate change in the orbit of the sun. But, according to (S) this will be noticed
only 30,000 years later. This is a contradiction.

At the end of this section, we try to find an expression for $d$ which holds in
arbitrary admissible charts of $M^4$. For this, we choose a line segment
$$ u(p) = u_1 + p(u_2 - u_1), \qquad 0 \le p \le 1, $$
which connects $x_1$ and $x_2$, i.e., $u_1$ and $u_2$ in a fixed inertial chart (Fig. 75.8(b)).
This gives $d = \int ds^2$, i.e.,
$$ d = \int_0^1 g_{ij}(u(p))\,\dot u^i\dot u^j\,dp. $$
The dot means derivative with respect to $p$. The integrand can be transformed
to arbitrary admissible charts. Since the integrand is a scalar, $d$ remains
unchanged.

75.10. Proper Time


Every curve $x = x(p)$ in $M^4$ will be called a world line. This is a set of events
parametrized by a real parameter $p$. For an inertial chart, $x = x(p)$
corresponds to the coordinate representation $u = u(p)$, i.e.,
$$ y = y(p), \qquad t = t(p) \tag{25} $$
with $y = (\xi, \eta, \zeta)$. The motion $y = y(t)$ of a body or signal in an inertial system
is given by (25) with $t = p$. The arclength $s$ of the world line between the points
$x(p_1)$ and $x(p)$ is defined in the usual way through $s = \int ds$, i.e.,
$$ s = \int_{p_1}^{p} \sqrt{g_{ij}(u(p))\,\dot u^i(p)\,\dot u^j(p)}\;dp. $$
Here we assume that
$$ g_{ij}\dot u^i\dot u^j \ge 0 $$
along the world line. As the following observation shows, this condition is
always satisfied if the world line corresponds to a motion of a mass point in
an inertial system with a velocity below that of light ($g_{ij}\dot u^i\dot u^j > 0$) or a motion
of a light ray ($g_{ij}\dot u^i\dot u^j = 0$).
In an inertial chart we have
$$ s = \int_{p_1}^{p} \sqrt{c^2\dot t^2 - \dot y^2}\;dp. $$
The dot means derivative with respect to $p$. We now consider the motion
$y = y(t)$ of a body in an inertial system with a velocity below that of light, i.e.,
$|\dot y| < c$. We choose $p = t$ and set
$$ \tau = s/c. $$
Here, $\tau$ is called the proper time. It follows that
$$ \tau = \int_{t_1}^{t} \sqrt{1 - \dot y^2/c^2}\;dt. \tag{26} $$
In Section 75.11 we shall prove the following:


In an inertial system a particle can travel only with a velocity
smaller than that of light. (27)

Postulate 75.17. If a clock in an inertial system $\Sigma$ moves along an orbit $y = y(t)$,
then $\tau$ is the time which has passed on the clock between the instants $t_1$ and
$t$ of the system time of $\Sigma$.

Therefore, the moving clock in $\Sigma$ is always slow with respect to the system
time of $\Sigma$. In order to motivate this postulate, we set $t_1 = 0$ and consider the
following special motion $\xi = Vt$, $\eta = \zeta = 0$ of the clock in $\Sigma$ with constant
velocity $V$. Moreover, we choose a second inertial system $\Sigma'$ in which the clock
is at rest at the origin (Fig. 75.9). The special Lorentz transformation gives
$$ t' = \frac{t - V\xi/c^2}{\sqrt{1 - V^2/c^2}} = \sqrt{1 - V^2/c^2}\;t = \tau. $$
Hence, in this case, the proper time $\tau$ of the clock coincides with the system
time $t'$ in the rest system $\Sigma'$ of the clock. For arbitrarily moving clocks, one
considers small time intervals and momentary rest systems. This implies
$$ \Delta\tau = \sqrt{1 - \dot y(t)^2/c^2}\;\Delta t. $$
Summation and passing to limits gives (26).
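
The "summation and passing to limits" step can be imitated by a Riemann sum. The sketch below uses the midpoint rule, which is a choice made here and not prescribed by the text; the function name, the units with c = 1, and the two sample speed profiles are illustrative assumptions.

```python
import math

def proper_time(speed, t1, t2, c, steps=10_000):
    """Midpoint-rule approximation of (26); speed(t) returns |dy/dt| at time t."""
    h = (t2 - t1) / steps
    total = 0.0
    for k in range(steps):
        v = speed(t1 + (k + 0.5) * h)
        total += math.sqrt(1.0 - v * v / (c * c)) * h   # one small interval
    return total

c = 1.0
# Clock moving with constant speed 0.6c for 10 time units.
tau_uniform = proper_time(lambda t: 0.6, 0.0, 10.0, c)
# Clock whose speed grows linearly from 0 to 0.8c over the same interval.
tau_ramp = proper_time(lambda t: 0.08 * t, 0.0, 10.0, c)

print("uniform motion:", tau_uniform)
print("accelerated motion:", tau_ramp)
```

In both cases the proper time is smaller than the elapsed system time of 10 units, in line with the remark that the moving clock is slow.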

[Figure 75.9: clock]

EXAMPLE 75.18 (The Twin Paradox). Suppose at time $t = 0$ and at the origin
$P_0$ of an inertial system $\Sigma$ the twins $T_1$ and $T_2$ are born. Shortly thereafter, $T_2$
is brought to a spaceship and begins a journey through the universe while $T_1$
remains at $P_0$. After several years, $T_2$ returns to $T_1$. Both are surprised to find
that $T_2$ is much younger than $T_1$. This fact can be easily explained if one
assumes that the biological clock of $T_j$ shows the proper time $\tau_j$. The motion
of $T_j$ is $y = y_j(t)$ with $y_1(t) \equiv 0$. With $t_1 = 0$, it follows immediately from (26)
that
$$ \tau_j = \int_0^t \sqrt{1 - \dot y_j^2/c^2}\;dt, \qquad j = 1, 2. $$
The faster $T_2$ travels, the smaller $\tau_2$ will be compared with $\tau_1$.
In the limiting case of the velocity of light $|\dot y_2| = c$ one has $\tau_2 = 0$, i.e.,
$T_2$ remains young forever. Because of (27), however, this limiting case is
impossible.
The twin paradox has caused many misunderstandings and controversies.
The usual mistake is that the effect is not calculated according to formula
(26); instead one argues verbally. Furthermore, one often finds the false
statement that the twin paradox can only be understood in the context of the
general theory of relativity.
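
In the spirit of calculating with (26) instead of arguing verbally, here is a small numerical version of the twin example. It is a sketch under idealizing assumptions that are not in the text: $T_2$ travels out and back at constant speed 0.8c with an instantaneous turnaround, and units are years and light years with c = 1. The function name is hypothetical.

```python
import math

def elapsed_proper_time(legs, c):
    """Evaluate (26) for piecewise constant speed; legs is a list of (speed, duration)."""
    return sum(math.sqrt(1.0 - (v / c) ** 2) * duration for v, duration in legs)

c = 1.0   # units: years and light years

# T1 stays at P0 for 10 years of system time.
tau_1 = elapsed_proper_time([(0.0, 10.0)], c)
# T2 flies out for 5 years and back for 5 years at 0.8c (turnaround idealized).
tau_2 = elapsed_proper_time([(0.8, 5.0), (0.8, 5.0)], c)

print("age of T1:", tau_1)
print("age of T2:", tau_2)
```

Under these assumptions $T_2$ has aged about 6 years when $T_1$ has aged 10.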

75.11. The Free Particle and the Mass-Energy Equivalence

That sometimes, in his speculations, he went too far, such as, for example, in his
hypothesis of the light-quantum, should not be held too much against him.
Max Planck, in 1913, while recommending Einstein for membership of
the Prussian Academy. (In 1921, Einstein won the Nobel prize for his results
about light-quanta of 1905.)

Already the seemingly trivial case of a free particle leads to substantial
revisions of Newton's mechanics, and also leads to Einstein's fundamental
light quantum hypothesis. We present the following observations in a form
which can be extended to the general theory of relativity in the following
chapter.
In classical mechanics a free particle is described by the variational principle
$$ \int_{t_1}^{t_2} L\,dt = \text{stationary!}, \qquad y(t_1) = y_1, \quad y(t_2) = y_2, \tag{28} $$
with $L = m_0\dot y^2/2 + \text{const}$, where $t_1$, $t_2$, $y_1$, and $y_2$ are given. The
corresponding Euler equation
$$ \frac{d}{dt}L_{\dot y} - L_y = 0 \tag{29} $$
yields the equation of motion $m_0\ddot y = 0$. The momentum is
$$ p = L_{\dot y} $$
and hence $p = m_0\dot y$. The Hamiltonian function is
$$ H = p\dot y - L, $$
i.e., $H = p^2/2m_0$. Here $H$ is equal to the energy $E$ of the particle. According
to Section 58.23, the Hamilton-Jacobi equation is obtained from
$$ E = H(p) $$
by replacing $E$ with $-\partial S/\partial t$ and $p$ with $\partial S/\partial y = \operatorname{grad} S$. This gives
$$ S_t + S_y^2/2m_0 = 0. $$
This Hamiltonian formalism has the advantage that it applies to every
Lagrangian function L. Before using this fact we illustrate the difference
between covariance and form invariance. Consider the vector equation
$$ a \cdot b = 0 \tag{30} $$

in IR 3 . For an arbitrary coordinate system we obtain in component notation


giia;bi = 0. (31)
For every Cartesian coordinate system, gii is equal to the Kronecker symbol,
i.e., one has
3
L a;b; = 0.
i=l
(32)

Here (30) is a geometrical equation which is formulated independently of a


coordinate system, and (31) is a covariant equation. Contrary to this, (32) is
called form-invariant with respect to a Cartesian coordinate system. Note also
that, because of the changing coefficients g ii, the form of (31) depends on the
choice of the particular curved coordinate system, while (32) is independent
of the choice of the Cartesian coordinate system.
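The difference between (31) and (32) can be checked numerically. The following Python sketch is an illustration only and is not part of the original text: it assumes an oblique basis $e_1 = (1,0,0)$, $e_2 = (\cos a, \sin a, 0)$, $e_3 = (0,0,1)$, and all helper names are hypothetical. For two orthogonal vectors, the covariant expression (31) vanishes in the oblique components, whereas the Cartesian form (32) applied to the same components does not.

```python
import math

def metric_oblique(angle):
    """g_ij = e_i . e_j for the assumed oblique basis (angle between e_1 and e_2)."""
    c = math.cos(angle)
    return [[1.0, c, 0.0], [c, 1.0, 0.0], [0.0, 0.0, 1.0]]

def components_oblique(v, angle):
    """Components a^i of the Cartesian vector v with respect to the oblique basis."""
    x, y, z = v
    a2 = y / math.sin(angle)
    a1 = x - a2 * math.cos(angle)
    return [a1, a2, z]

def covariant_product(g, a, b):
    """Left-hand side of (31): g_ij a^i b^j."""
    return sum(g[i][j] * a[i] * b[j] for i in range(3) for j in range(3))

def cartesian_product(a, b):
    """Left-hand side of (32): sum of a^i b^i."""
    return sum(a[i] * b[i] for i in range(3))

ANGLE = math.pi / 3
a_cart, b_cart = (1.0, 2.0, 0.0), (2.0, -1.0, 3.0)   # orthogonal in R^3
a_obl = components_oblique(a_cart, ANGLE)
b_obl = components_oblique(b_cart, ANGLE)
g = metric_oblique(ANGLE)

print(covariant_product(g, a_obl, b_obl))   # close to zero: (31) holds in this chart
print(cartesian_product(a_obl, b_obl))      # not zero: the form (32) needs a Cartesian chart
```

The same vectors give zero in (32) only when their Cartesian components are used, which is exactly the sense in which (32) is form-invariant but not covariant.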
In order to find a relativistic equation for the free particle, we require the following.
(i) The variational principle has a geometrical meaning for the space-time manifold $M^4$. Thereby we guarantee the covariance of the Euler equations.
(ii) In view of the principle of relativity (R) of Section 75.2, we assume that the equations of motion are form-invariant under changes in the inertial systems.
(iii) For velocities which are small compared with the velocity of light $c$, we assume that $L$ coincides in first-order approximation with the classical expression (correspondence principle).
Assumption (ii) will follow from (i), because Proposition 75.10 implies that $g_{ij}$ has the same numerical values for every inertial system. In order to obtain (i) we begin with

$$-m_0c\int ds = \text{stationary!}. \tag{33}$$

From Section 75.10 this gives the relativistic Lagrangian function

$$L = -m_0c^2\sqrt{1 - \dot y^2/c^2}.$$

Because of

$$L = -m_0c^2 + \frac{m_0\dot y^2}{2} + \cdots, \qquad \frac{\dot y^2}{c^2} \to 0,$$

(iii) is satisfied. Now we immediately obtain the relativistic momentum

$$p = L_{\dot y} = m\dot y \tag{34}$$

with relativistic mass

$$m = m_0/\sqrt{1 - \dot y^2/c^2}. \tag{35}$$

The Hamiltonian function is

$$H = p\dot y - L,$$

i.e.,

$$H = mc^2.$$

For the energy $E = H$ of the particle we therefore obtain

$$E = mc^2, \tag{36}$$

$$E^2 = m_0^2c^4 + c^2p^2. \tag{37}$$

This implies for the Hamilton-Jacobi equation

$$S_t^2 = m_0^2c^4 + c^2S_y^2. \tag{38}$$

A comparison of (34) with the classical momentum definition shows that the term relativistic mass for $m$ is meaningful. For $\dot y = 0$ we have $m = m_0$, and we call $m_0$ the rest mass of the particle. Equation (36) is of fundamental importance. Note that (36) becomes

$$E = m_0c^2$$

for a particle at rest. This is Einstein's famous formula, stating the equivalence between mass and energy. The energy production of all stars is based upon (36). For example, during the synthesis of helium from hydrogen in the sun, mass is transformed into energy. Formula (36) is a triumph for the mental ability of man; in a frightening way it also allows the self-destruction of mankind by atomic bombs.
Using (35), the equation of motion (29) now becomes

$$\frac{d}{dt}(m\dot y) = 0. \tag{39}$$

According to (35) this is only meaningful for $|\dot y| < c$. Moreover, the inertial mass $m$ of the free particle increases whenever the speed $|\dot y|$ increases. For $|\dot y| \to c$, $m$ and $E$ become infinitely large. This means that a free particle with rest mass $m_0 > 0$ can never reach the velocity of light.
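The growth of the relativistic mass and the energy relation (37) can be tabulated with a few lines of Python. This is a minimal sketch, not taken from the text; the function names are hypothetical and the numerical value of $c$ is the SI value.

```python
import math

C = 299_792_458.0  # velocity of light in m/s

def relativistic_mass(m0, speed):
    """Relativistic mass (35); only defined for |speed| < C."""
    if abs(speed) >= C:
        raise ValueError("formula (35) requires |speed| < c")
    return m0 / math.sqrt(1.0 - (speed / C) ** 2)

def momentum(m0, speed):
    """Relativistic momentum (34)."""
    return relativistic_mass(m0, speed) * speed

def energy(m0, speed):
    """Energy (36)."""
    return relativistic_mass(m0, speed) * C ** 2

m0 = 1.0  # rest mass in kg
for beta in (0.0, 0.6, 0.99, 0.9999):
    v = beta * C
    E, p = energy(m0, v), momentum(m0, v)
    # relative defect of the energy equation (37); it should be at rounding level
    defect = abs(E ** 2 - (m0 ** 2 * C ** 4 + C ** 2 * p ** 2)) / E ** 2
    print(beta, relativistic_mass(m0, v), defect)
```

The printed masses increase without bound as `beta` approaches 1, while the defect of (37) stays at the level of floating-point rounding.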
This is different for particles with $m_0 = 0$. For such particles, our theory is not applicable. The only formulas which are also meaningful for $m_0 = 0$ are the energy equation (37) and the Hamilton-Jacobi equation (38). In his paper (1905b), Einstein made the hypothesis that light consists of quanta of energy

$$E = h\nu = hc/\lambda. \tag{40a}$$

Here $\nu$ is the frequency of light, $\lambda$ the wavelength, and $h = 6.6\cdot 10^{-34}\ \mathrm{Ws^2}$ Planck's quantum of action. Since a photon propagates with the velocity of light we conclude from (35) that its rest mass $m_0$ is equal to zero. From (37) we obtain for its momentum $|p| = E/c$, i.e.,

$$|p| = h\nu/c = h/\lambda. \tag{40b}$$
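As a small numerical illustration of (40a) and (40b), the following Python sketch evaluates energy and momentum of a photon. The wavelength of 500 nm is an arbitrary example value, the function names are hypothetical, and $h$ is used in the rounded form $6.626\cdot 10^{-34}$ J s (1 Ws² = 1 J s).

```python
H = 6.626e-34        # Planck's quantum of action in J s (rounded)
C = 299_792_458.0    # velocity of light in m/s

def photon_energy(wavelength):
    """Energy of a light quantum according to (40a)."""
    return H * C / wavelength

def photon_momentum(wavelength):
    """Absolute value of the momentum according to (40b)."""
    return H / wavelength

wavelength = 500e-9  # example: green light, in m
E = photon_energy(wavelength)
p = photon_momentum(wavelength)
print(E, p, E - C * p)   # E and c|p| agree, as (37) requires for m0 = 0
```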
The two formulas (40a) and (40b) have been reaffirmed by many experiments
and now form the basis of quantum electrodynamics, i.e., for the quantum
field theory for electrons, positrons, and photons. In Section 68.4 we already
saw how from these formulas, together with the Bose statistics, Planck's
famous radiation law follows. In fact, Einstein (1905b) introduced his light
quantum hypothesis in order to give a different derivation of Planck's radiation
law. In his time, the light quantum hypothesis was a very bold and radical
hypothesis. Before Einstein, light was always regarded as an electromagnetic
wave in the context of Maxwell's theory. In fact, the wave picture explains
numerous phenomena. In Chapter 59, we already discussed this dualism
between wave and particle. At the present time, the quantum concept is the
dominating idea in physics.
Formula (40a) can be motivated in the following way: From experience we know that the energy of light depends on the frequency $\nu$. If we assume the proportionality $E = A\nu$, then $A$ must have the dimension of an action, i.e., $A$ = energy × time. Letting $A = \alpha h$, the Bose statistics of Section 68.4 yields Planck's radiation law for $\alpha = 1$.
The equations for the free particle, formulated so far, do not explicitly exhibit covariance and form invariance. We now give a formulation for the equations of motion for which covariance and form invariance can be explicitly recognized. The motion of a particle corresponds to a world line $x = x(\sigma)$ with arclength

$$s = \int_{\sigma_0}^{\sigma_1}\sqrt{g_{ij}\dot u^i\dot u^j}\,d\sigma.$$

We assume that $g_{ij}\dot u^i\dot u^j > 0$ along the world line. According to Section 75.10, this means that the particle moves slower than light in every inertial system. As parameter $\sigma$ we choose the proper time $\tau = s/c$. From (33), the world line for this motion corresponds to a geodesic. Proposition 74.53 then implies for the equations of motion that

$$\frac{D}{d\tau}\left(\frac{du^i}{d\tau}\right) = 0,$$

i.e.,

$$\frac{Dp^i}{d\tau} = 0, \qquad i = 1, 2, 3, 4, \tag{41}$$

with the so-called four momentum

$$p^i = m_0\frac{du^i}{d\tau} \tag{42}$$

and the four velocity $du^i/d\tau$. Equation (41) holds in every admissible chart of $M^4$. In an inertial chart, we have $\nabla_i = D_i$ and therefore $D/d\tau = d/d\tau$. Because of

$$\tau = \int_{t_0}^{t}\sqrt{1 - \dot y^2/c^2}\,dt$$

we explicitly have for an inertial system

$$p^\alpha = m\frac{du^\alpha}{dt}, \qquad p^4 = mc = E/c, \qquad \alpha = 1, 2, 3.$$

From (42) it follows that $p^i$ is transformed like a tensor under chart changes; in particular, this is true for the Lorentz transformations, which describe the change between inertial systems. Therefore, in relativistic mechanics, energy $E$ and momentum vector $p = p^\alpha e_\alpha$ form a unity.
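That energy and momentum form a single object can be made concrete by boosting the four momentum. The sketch below is an illustration under stated assumptions: it uses units with $c = 1$ by default, the standard Lorentz boost along the first axis (which this section does not write out), and hypothetical function names. It checks that $g_{ij}p^ip^j = m_0^2c^2$ keeps its value although energy and momentum individually change.

```python
import math

def four_momentum(m0, w, c=1.0):
    """Components (p^1, p^2, p^3, p^4) with p^alpha = m w^alpha and p^4 = m c."""
    m = m0 / math.sqrt(1.0 - sum(x * x for x in w) / c ** 2)
    return (m * w[0], m * w[1], m * w[2], m * c)

def boost(p, beta):
    """Standard Lorentz boost along the first axis with velocity beta * c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    p1, p2, p3, p4 = p
    return (gamma * (p1 - beta * p4), p2, p3, gamma * (p4 - beta * p1))

def minkowski_square(p):
    """g_ij p^i p^j for the inertial metric diag(-1, -1, -1, 1)."""
    return p[3] ** 2 - p[0] ** 2 - p[1] ** 2 - p[2] ** 2

p = four_momentum(2.0, (0.3, 0.4, 0.0))
p_boosted = boost(p, 0.5)
print(minkowski_square(p), minkowski_square(p_boosted))  # both close to m0^2 c^2

# Boosting with the particle's own velocity leads to its rest system.
p_rest = boost(four_momentum(2.0, (0.6, 0.0, 0.0)), 0.6)
print(p_rest)  # spatial part close to zero, p^4 close to m0 c
```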
The covariant form of the Hamilton-Jacobi equation (38) is

$$g^{ij}D_iS\,D_jS = m_0^2c^2. \tag{43}$$

Note that $(g^{ij}) = (g_{ij})^{-1}$. Therefore (43) coincides with (38) for every inertial system.
In the general theory of relativity, (41) becomes the equation of motion for a particle in a gravitational field. Only the metric $g_{ij}$ changes.

75.12. Energy Momentum Tensor and Relativistic Conservation Laws for Fields
The following purely mathematical ideas are of central importance for all
relativistic field theories, e.g., fluids, the cosmos, electromagnetic fields, quan-
tum fields, and elementary particles. We generalize here the statements about
conserved quantities of Section 69.1 to relativistic fields. Our goal is to develop
a formalism whereby quantities like charge, energy, momentum, angular mo-
mentum, and stress forces can be assigned to general fields. Important appli-
cations to relativistic ideal fluids will be given in the following section and to
electromagnetic fields and quantum fields in Part V. There, we will also use
the Noether theorem to derive the tensor fields $T^i$ (currents) and $T^{ij}$ (energy
momentum tensor) from relativistically invariant variational principles.
We make the following assumptions:
(H1) $T^i$ is a $C^1$-tensor field on $M^4$ with

$$\nabla_iT^i = 0. \tag{44}$$

Consider an arbitrary, fixed inertial system, i.e., an inertial chart of $M^4$. Then we have $\nabla_i = D_i$ and $u = (y, ct)$. Let $G$ be a bounded region in $\mathbb{R}^3$ and let

$$Q(G, t) = c^{-1}\int_G T^4(y, ct)\,dy.$$

As in Section 69.1, integration of (44) over $G$ and an application of the integral theorem of Gauss gives

$$\frac{d}{dt}Q(G, t) = -\int_{\partial G}T^\alpha n_\alpha\,dO, \tag{45}$$

where $n = \sum_{\alpha=1}^{3}n_\alpha e_\alpha$ is the outer unit normal vector of $\partial G$. We say that $T^i$ has a compact spacelike support if and only if there exists an inertial system and a bounded region $G_1 \subseteq \mathbb{R}^3$ therein such that

$$T^i(u) = 0 \quad\text{for all } u = (y, ct) \text{ with } y \notin G_1 \text{ and for all } i.$$

In this case we let

$$Q(t) = Q(\mathbb{R}^3, t).$$

From (45) it follows that

$$\frac{dQ(t)}{dt} = 0 \quad\text{for all } t \in \mathbb{R}, \tag{46}$$

i.e., $Q$ is a conserved quantity. Since for every fixed $t$, Lorentz transformations take compact spatial sets into compact spatial sets, it follows from (45) that (46) is true for every inertial system.

Proposition 75.19. If (H1) holds and if $T^i$ has a compact spacelike support, then $Q$ is a conserved quantity for every inertial system and $Q$ is a scalar on $M^4$.

The proof, which follows easily from Stokes' integral theorem for differential
forms, will be given in Problem 76.3. In Part V, Q will represent an electrical
charge or a chargelike quantum number in high-energy physics, for instance,
the baryon number. We now assume:
(H2) $T^{ij}$ is a symmetric $C^1$-tensor field on $M^4$ with

$$\nabla_iT^{ij} = 0, \qquad j = 1, 2, 3, 4.$$

We define

$$p^i(G, t) = c^{-1}\int_G T^{i4}(y, ct)\,dy.$$

As above we obtain the conservation law

$$\frac{d}{dt}p^i(G, t) = -\int_{\partial G}T^{i\alpha}n_\alpha\,dO. \tag{47}$$

Proposition 75.20. If (H2) holds and if $T^{ij}$ has a compact spacelike support, then all the $p^i$ are conserved quantities for every inertial system and $p^i$ is a tensor on $M^4$.

The proof will be given in Problem 76.3. In Part V, we will show how the energy-momentum tensor $T^{ij}$ can be derived from a variational principle. In all field theories one requires that $T^{44}$ can be physically explained as an energy density. Then $p^i$ corresponds to the four momentum vector of Section 75.11, i.e., we interpret

$$p = p^\alpha e_\alpha$$

as the momentum vector and

$$E = cp^4$$

as the energy of the field. As an important example we consider in Section 75.13 a relativistic ideal fluid. We now compare (47) with classical mechanics. Letting $\sigma^{ij} = -T^{ij}$ for $i, j = 1, 2, 3$, we can write (47) for $i = 1, 2, 3$ as

$$\frac{dp}{dt} = \int_{\partial G}\sigma n\,dO.$$

The left-hand side represents a force (time derivative of the momentum vector). Therefore $\sigma$ corresponds to the classical stress tensor of Section 61.3. This allows the following interpretation of $T^{ij}$ for an inertial system:

$$\begin{aligned}
-T^{\alpha\beta} &= \text{component of the stress tensor for } \alpha, \beta = 1, 2, 3,\\
c^{-1}T^{\alpha 4} &= \text{component of the momentum density for } \alpha = 1, 2, 3,\\
T^{44} &= \text{energy density}.
\end{aligned} \tag{48}$$

Moreover, we have $T^{ij} = T^{ji}$. In addition, a comparison of (47) with Section 69.1 gives

$$\begin{aligned}
\mathbf{p}_\alpha &= T^{\alpha\beta}e_\beta \quad\text{current density vector for the $\alpha$th momentum component},\\
\mathbf{p}_4 &= cT^{4\beta}e_\beta \quad\text{energy current density vector}.
\end{aligned}$$

The index $\beta$ is summed from 1 to 3.
We now study the angular momentum of the field. For this we set

$$S^{ijk} = u^iT^{jk} - u^jT^{ik}$$

for every inertial chart. Then $S^{ijk}$ is a tensor on $M^4$, and

$$\nabla_kS^{ijk} = 0 \tag{49}$$

holds for every inertial chart. This follows immediately from (H2) with $\nabla_i = D_i$ and $T^{ij} = T^{ji}$. As in Section 74.17, $S^{ijk}$ can be extended to every admissible chart of $M^4$. Then (49) holds there, too. As above we define

$$M^{ij}(G, t) = c^{-1}\int_G S^{ij4}(y, ct)\,dy.$$

We have $M^{ij}(G, t) = -M^{ji}(G, t)$. By definition the angular momentum vector $J(G, t)$ of the field for an inertial system is

$$J(G, t) = M^{23}e_1 + M^{31}e_2 + M^{12}e_3,$$

where, precisely, we would have to write $M^{23}(G, t)$, etc. This definition is correct, because if we define the momentum density vector as

$$P_d = c^{-1}T^{\alpha 4}e_\alpha$$

and the angular momentum density vector as

$$J_d = y \times P_d,$$

then we obtain for the momentum vector of the field in the region $G$

$$p(G, t) = \int_G P_d(y, t)\,dy$$

and for the angular momentum vector

$$J(G, t) = \int_G J_d(y, t)\,dy.$$

For the energy of the field in the region $G$ it follows that

$$E(G, t) = \int_G T^{44}(y, t)\,dy.$$

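From (49) we obtain the important relation

$$\frac{d}{dt}M^{ij}(G, t) = -\int_{\partial G}S^{ij\alpha}n_\alpha\,dO.$$

Similarly to Section 69.1, we therefore call the vectors

$$S^{23\alpha}e_\alpha, \qquad S^{31\alpha}e_\alpha, \qquad S^{12\alpha}e_\alpha$$

current density vectors of the corresponding angular momentum components $J_1$, $J_2$, $J_3$. We let $M^{ij} = M^{ij}(\mathbb{R}^3, t)$.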

Corollary 75.21. Under the assumptions of Proposition 75.20, all the $M^{ij}$ are conserved quantities for every inertial system and $M^{ij}$ is a tensor on $M^4$.
The proof will be given in Problem 76.3.

75.13. Applications to Relativistic Ideal Fluids


It follows from Section 70.3 that the equation of motion and the equation of continuity for a classical ideal fluid in an inertial system are given by

$$D_4(c\rho w^\alpha) + D_\beta(\rho w^\alpha w^\beta + \delta^{\alpha\beta}P) = 0, \qquad \alpha = 1, 2, 3, \tag{50}$$

$$D_4(\rho c^2) + D_\beta(c\rho w^\beta) = 0, \tag{51}$$

with $D_4 = \partial/\partial(ct)$. According to our convention in Section 75.1, $\beta$ is summed from 1 to 3. Here $w = w^\alpha e_\alpha$ is the velocity vector, $\rho$ the density, and $P$ the pressure. The momentum density vector is $P_d = \rho w$. By letting

$$T_{\text{class}} = \begin{pmatrix} T^{\alpha\beta}_{\text{class}} & c\rho w\\ c\rho w & \rho c^2 \end{pmatrix}$$

with

$$T^{\alpha\beta}_{\text{class}} = \rho w^\alpha w^\beta + \delta^{\alpha\beta}P, \qquad \alpha, \beta = 1, 2, 3, \tag{52}$$

equations (50) and (51) can be written in the short form

$$D_iT^{ij}_{\text{class}} = 0. \tag{53}$$

Since equation (53) describes conservation of mass and momentum, we call $T^{ij}_{\text{class}}$ the classical mass-momentum tensor. Nothing, however, has been said about the tensor property. It can be shown that $T^{ij}_{\text{class}}$ is a tensor under classical Galilean transformations. We are now looking for a relativistic generalization.

Postulate 75.11. Let a tensor field $v^i$ be given on $M^4$ such that $v^iv_i = c^2$. The basic equations of a relativistic ideal fluid are

$$\nabla_iT^{ij} = 0 \tag{54}$$

with energy-momentum tensor

$$T^{ij} = c^{-2}(\varepsilon + P)v^iv^j - Pg^{ij}. \tag{55}$$

Here $\varepsilon$ and $P$ are functions on $M^4$ (scalar fields). We have to add the constitutive law

$$\varepsilon = \varepsilon(P, \ldots),$$

where the dots stand for further variables like mass density, entropy density, etc.
The motion of a fluid particle $x = x(\tau)$ on $M^4$ with proper time $\tau$ follows from

$$\frac{du^i}{d\tau} = v^i \tag{56}$$

for an arbitrary admissible chart. In an inertial system $\Sigma$, we get

$$v^\alpha = \frac{w^\alpha}{\sqrt{1 - w^2/c^2}}, \qquad v^4 = \frac{c}{\sqrt{1 - w^2/c^2}},$$

where $w$ denotes the classical velocity vector of the particle.
We now want to motivate this postulate, whereby the following two points are important:
(i) $T^{ij}$ is a tensor which, similarly to the corresponding classical expression, is quadratic in the velocities.
(ii) The component $T^{44}$ is equal to the energy density.
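First, we consider a fluid which is at rest in an inertial system $\Sigma$. The motion $y = y(t)$ of an arbitrary fluid particle is then given by $\dot y(t) = 0$, i.e.,

$$\dot u^\alpha = 0 \quad\text{for } \alpha = 1, 2, 3,$$

and $t$ is equal to the proper time $\tau$. From (56) it follows that

$$v^\alpha = 0 \quad\text{for } \alpha = 1, 2, 3 \qquad\text{and}\qquad v^4 = c.$$

According to (48) it follows that:

$$\begin{aligned}
-T^{\alpha\beta} &= \text{component of the stress tensor for } \alpha, \beta = 1, 2, 3,\\
T^{44} &= \text{energy density}.
\end{aligned}$$

The key trick is then to consider $T^{ij}$ in the rest system $\Sigma$. Section 70.1 implies that

$$(T^{ij}) = \begin{pmatrix} P & 0 & 0 & 0\\ 0 & P & 0 & 0\\ 0 & 0 & P & 0\\ 0 & 0 & 0 & \varepsilon \end{pmatrix}$$

in $\Sigma$, which can be rewritten as

$$T^{ij} = c^{-2}(\varepsilon + P)v^iv^j - Pg^{ij}, \qquad i, j = 1, 2, 3, 4. \tag{57}$$

Notice that

$$(g^{ij}) = (g_{ij}) = \begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}$$

in $\Sigma$, and that the right-hand side in (57) is a tensor. Thus we have found an expression for the $T^{ij}$ which is valid in every system of reference. Our discussion shows that the scalars $P$ and $\varepsilon$ have the following physical meaning:
$P$ = pressure of the fluid in the rest system,
$\varepsilon$ = energy density of the fluid in the rest system.
This motivates (55).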
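The passage from the rest system to a moving fluid can be reproduced numerically. The following Python sketch evaluates (55) in an inertial chart; it is an illustration only, with hypothetical function names, units with $c = 1$ by default, and arbitrary example values for $\varepsilon$ and $P$. Note that the list index 3 corresponds to the tensor index 4.

```python
import math

# g^{ij} in an inertial chart, signature (-1, -1, -1, 1)
G_INV = [[-1.0, 0.0, 0.0, 0.0],
         [0.0, -1.0, 0.0, 0.0],
         [0.0, 0.0, -1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]

def four_velocity(w, c=1.0):
    """Four velocity v^i belonging to the classical velocity vector w."""
    gamma = 1.0 / math.sqrt(1.0 - sum(x * x for x in w) / c ** 2)
    return [gamma * w[0], gamma * w[1], gamma * w[2], gamma * c]

def energy_momentum_tensor(eps, P, w, c=1.0):
    """T^{ij} of the ideal fluid according to (55)."""
    v = four_velocity(w, c)
    return [[(eps + P) * v[i] * v[j] / c ** 2 - P * G_INV[i][j]
             for j in range(4)] for i in range(4)]

T_rest = energy_momentum_tensor(eps=5.0, P=2.0, w=(0.0, 0.0, 0.0))
T_moving = energy_momentum_tensor(eps=5.0, P=2.0, w=(0.6, 0.0, 0.0))
print(T_rest)            # diagonal matrix with entries P, P, P, eps
print(T_moving[3][3])    # energy density seen from the moving system
```

For a fluid moving with velocity $w$, the component $T^{44}$ obtained this way equals $(\varepsilon + Pw^2/c^2)/(1 - w^2/c^2)$, which reduces to $\varepsilon$ for $w = 0$.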
We have made use of the following general strategy which clearly illustrates
the advantage of tensor calculus in physics:
(a) First, we consider a special coordinate system in which the physical problem assumes a simple form. This leads to certain equations.
(b) Second, we write these equations as tensor equations, which are then valid in every system of reference.
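Let us now consider an arbitrary inertial system $\Sigma$. Then $\nabla_i = D_i$. By (55), we have $T^{44} = (\varepsilon + P)/(1 - w^2/c^2) - P$, and hence the relativistic equation (54) with $j = 4$ is

$$\frac{\partial}{\partial t}\left(\frac{\varepsilon + Pw^2/c^2}{1 - w^2/c^2}\right) + \operatorname{div}\left(\frac{(\varepsilon + P)w}{1 - w^2/c^2}\right) = 0. \tag{58}$$

This describes energy conservation. In fact, it follows from (58) that the total energy

$$E = \int_{\mathbb{R}^3}T^{44}\,dy = \int_{\mathbb{R}^3}\frac{\varepsilon(y, t) + P(y, t)\,w(y, t)^2/c^2}{1 - w(y, t)^2/c^2}\,dy$$

is a conserved quantity if the motion of the fluid is restricted to a bounded spatial region.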
Generally, the law of mass conservation (51) is not valid in relativistic
physics.
In the following chapter we will use relativistic fluids as a basis for cosmo-
logical models of the universe and for models of stars.
CHAPTER 76

General Theory of Relativity

The general laws of nature are to be expressed in equations which are valid for
all coordinate systems.
Albert Einstein (1916)
The fact that elementary particle physicists, astrophysicists, and cosmologists
have become interested in the same questions is one of the most significant
developments in physics within the last ten years.
Alan H. Guth and Paul J. Steinhardt
(Scientific American, July 1984)

76.1. Basic Equations of the General Theory of Relativity
We use the notations of Section 75.1. The basic equations of the general theory of relativity which determine the metric tensor $g_{ij}$ of Einstein's four-dimensional space-time manifold $E^4$ are

$$R^{ij} - \tfrac{1}{2}g^{ij}R = \kappa T^{ij} \tag{1}$$

with the universal constant

$$\kappa = 8\pi G/c^4 = 2.07\cdot 10^{-43}\ \mathrm{N}^{-1}. \tag{2}$$

Here $G$ is the gravitational constant.
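The numerical value in (2) is easy to reproduce. The short Python sketch below uses rounded SI values for $G$ and $c$, which are assumptions not quoted in the text; the function name is hypothetical.

```python
import math

G = 6.674e-11        # gravitational constant in m^3 kg^-1 s^-2 (rounded)
C = 299_792_458.0    # velocity of light in m/s

def einstein_constant(G=G, c=C):
    """kappa = 8 pi G / c^4 as in (2); the unit is 1/N."""
    return 8.0 * math.pi * G / c ** 4

kappa = einstein_constant()
print(f"kappa = {kappa:.2e} 1/N")
```

The extremely small value of $\kappa$ reflects how weakly the energy-momentum tensor curves the space-time manifold.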
The equations of motion for a mass particle are

$$\frac{D}{ds}\left(\frac{du^i}{ds}\right) = 0. \tag{3}$$

The equations of motion for light rays (photons) are

$$\frac{D}{d\sigma}\left(\frac{du^i}{d\sigma}\right) = 0, \tag{4}$$

$$\frac{ds}{d\sigma} = 0. \tag{5}$$

Equations (3) and (4) can be written more explicitly as

$$\ddot u^i + \Gamma^i_{jk}\dot u^j\dot u^k = 0. \tag{6}$$

The dot means derivative with respect to arclength $s$ or parameter $\sigma$. In Theorem 76.A of Section 76.2 we state a unified variational principle for (3) and (4), (5).
These equations are motivated as follows. The masses which appear in the energy-momentum tensor $T^{ij}$ affect the metric $g_{ij}$ of $E^4$. In contrast to classical mechanics, there exists no force of gravity; instead, the gravitational effect is caused by the metric. Mass particles move on geodesics, and light rays correspond to affine geodesic lines (4) with side condition (5), i.e., to zero lines. In general, $T^{ij}$ also depends on $g_{ij}$.
We now explain these equations. Our starting point is a four-dimensional real Riemannian $C^\infty$-manifold $E^4$ with metric

$$ds^2 = g_{ij}\,du^i\,du^j.$$

As before, let $(g^{ij})$ denote the inverse matrix to $(g_{ij})$. Let $(-1, -1, -1, 1)$ be the signature type of the metric tensor $g_{ij}$ in the sense of Section 74.20. Thus, according to the law of inertia of Sylvester, we assume that at each point of $E^4$:

$$\begin{vmatrix} g_{22} & g_{23} & g_{24}\\ g_{32} & g_{33} & g_{34}\\ g_{42} & g_{43} & g_{44} \end{vmatrix} > 0, \qquad g < 0$$

with $g = \det(g_{ij})$. In physical models, the $u^1$, $u^2$, $u^3$ (and $u^4$) represent space (and time) coordinates. As in Chapter 74 we use the Christoffel symbols for $g_{ij}$,

$$\Gamma^k_{ij} = \tfrac{1}{2}g^{ks}(D_ig_{js} + D_jg_{is} - D_sg_{ij}) \tag{7}$$

with $D_i = \partial/\partial u^i$. To describe the curvature of $E^4$ we also use the curvature
tensor
(8)

Moreover, we define the Ricci tensor R1... and the scalar curvature R by
R-
- gimRJm•
(9)
R" = g'ig-RJ•·
The energy-momentum tensor $T^{ij}$ depends on the concrete physical situation. For an ideal fluid, for example, we obtain from Section 75.13:

$$T^{ij} = c^{-2}(P + \varepsilon)v^iv^j - Pg^{ij}. \tag{10}$$

Here $v^i$ is the four-dimensional velocity field, $P$ the pressure, and $\varepsilon$ the rest energy density of the fluid.
Lowering the indices by using the $g_{ij}$, we obtain the following equation which is equivalent to the basic equation (1):

$$R_{ij} - \tfrac{1}{2}g_{ij}R = \kappa T_{ij}, \tag{11}$$

where $T_{ij} = g_{ir}g_{js}T^{rs}$. Because of $g^{ij}g_{ij} = 4$, an application of $g^{ij}$ yields $R - 2R = \kappa T$ with $T = g^{ij}T_{ij}$. Thus (1) is equivalent to

$$R_{ij} = \kappa(T_{ij} - \tfrac{1}{2}g_{ij}T). \tag{12}$$

In the special case of absence of matter, electromagnetic radiation, and other external fields, we obtain Einstein's equations for the metric tensor of the vacuum

$$R_{ij} = 0. \tag{13}$$
The main difference between the special and the general theory of relativity is the following. In the former case we use Minkowski's four-dimensional space-time manifold $M^4$ which has been introduced in Section 75.8. There exist coordinates (inertial charts) which correspond to inertial systems such that the metric tensor assumes the following special form

$$g_{11} = g_{22} = g_{33} = -1, \qquad g_{44} = 1, \qquad g_{ij} = 0 \quad\text{for } i \neq j.$$

Hence $M^4$ has zero curvature, i.e.,

$$\Gamma^k_{ij} = 0, \qquad R^i_{jkm} = R^{ij} = R_{ij} = R = 0.$$

The four-dimensional space-time manifold $E^4$ of the general theory of relativity, on the other hand, has a nonzero curvature whenever matter is present. This curvature is responsible for the gravitational effect between the masses.

76.2. Motivation of the Basic Equations and the Variational Principle for the Motion of Light and Matter
We motivate (3). The motion of a mass particle corresponds to the world line $x = x(\sigma)$ with arclength

$$s = \int_{\sigma_0}^{\sigma_1}\sqrt{g_{ij}(u(\sigma))\,\dot u^i(\sigma)\,\dot u^j(\sigma)}\,d\sigma.$$

Here we assume $g_{ij}\dot u^i\dot u^j > 0$ along the world line. As proper time we define

$$\tau = s/c.$$

By definition, this is the time shown by an atomic clock which moves together with the particle. Similarly as in Section 75.11, the variational principle to determine the orbit is given by $-m_0c\int ds = \text{stationary!}$, i.e.,

$$-m_0c\int_{\sigma_0}^{\sigma_1}\sqrt{g_{ij}\dot u^i\dot u^j}\,d\sigma = \text{stationary!}, \tag{14}$$

with $\sigma_0$, $\sigma_1$, $u_0 = u(\sigma_0)$, and $u_1 = u(\sigma_1)$ fixed. From Proposition 74.53 it follows that the Euler equations are equal to (3) if $s$ is introduced as a curve parameter.
We motivate (4), (5). In the special theory of relativity we have

$$ds^2 = c^2dt^2 - (dy)^2$$

for an inertial system. A photon moves rectilinearly with velocity $c$, i.e., $y(t) = vt + y_0$ with $v^2 = c^2$. This can also be written as

$$u^\alpha = \sigma v^\alpha + y_0^\alpha, \qquad u^4 = c\sigma.$$

It follows that

$$\frac{d^2u^i}{d\sigma^2} = 0 \qquad\text{and}\qquad \left(\frac{ds}{d\sigma}\right)^2 = c^2 - v^2 = 0,$$

which implies (4), (5). Hence in the general case, this system is a natural generalization of the situation in the special theory of relativity.

Theorem 76.A (General Variational Principle for Mass Particles and Photons). The Euler equations for the variational problem $\int\dot s^2\,d\sigma = \text{stationary!}$, i.e.,

$$\int_{\sigma_0}^{\sigma_1}L\,d\sigma = \text{stationary!}, \tag{15a}$$

with $L = g_{ij}\dot u^i\dot u^j$ are the equations of motion

$$\ddot u^i + \Gamma^i_{jk}\dot u^j\dot u^k = 0. \tag{15b}$$

PROOF. This follows from $(d/d\sigma)L_{\dot u^i} - L_{u^i} = 0$ after a short computation. □

For photons we have, in addition, that $\dot s = 0$. For mass particles we must have $\dot s > 0$. In this case, we choose $\sigma$ equal to the proper time $\tau$. We obtain $\dot s^2 = c^2$ because of $ds = c\,d\tau$. This yields the following additional conditions

$$L = 0 \quad\text{(light)}, \qquad L = c^2 \quad\text{(matter particle)}.$$

In view of Section 75.9 we assume that physical signals can only travel on world lines $u = u(\sigma)$ with $\dot s \geq 0$.
The variational problem (15a) also provides a convenient method of computing the Christoffel symbols $\Gamma^i_{jk}$ from (15b).
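As an illustration of this remark, consider the two-dimensional sphere with $L = r^2(\dot\vartheta^2 + \sin^2\vartheta\,\dot\varphi^2)$, which is an example chosen here and not taken from the text. The Euler equations of (15a) read $\ddot\vartheta - \sin\vartheta\cos\vartheta\,\dot\varphi^2 = 0$ and $\ddot\varphi + 2\cot\vartheta\,\dot\vartheta\dot\varphi = 0$, so that a comparison with (15b) gives $\Gamma^\vartheta_{\varphi\varphi} = -\sin\vartheta\cos\vartheta$ and $\Gamma^\varphi_{\vartheta\varphi} = \cot\vartheta$. The Python sketch below cross-checks these values by evaluating definition (7) with central differences; the function names are hypothetical, and the inverse metric is formed under the assumption of a diagonal metric.

```python
import math

def metric(u, r=1.0):
    """Metric g_ij of the sphere of radius r in coordinates u = (theta, phi)."""
    theta, _phi = u
    return [[r * r, 0.0], [0.0, (r * math.sin(theta)) ** 2]]

def metric_derivative(u, k, h=1e-5):
    """Central difference approximation of D_k g_ij."""
    up, um = list(u), list(u)
    up[k] += h
    um[k] -= h
    gp, gm = metric(up), metric(um)
    return [[(gp[i][j] - gm[i][j]) / (2.0 * h) for j in range(2)] for i in range(2)]

def christoffel(u):
    """gamma[k][i][j] approximates Gamma^k_ij from (7); assumes a diagonal metric."""
    g = metric(u)
    g_inv = [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]
    dg = [metric_derivative(u, k) for k in range(2)]   # dg[k][i][j] = D_k g_ij
    n = range(2)
    return [[[0.5 * sum(g_inv[k][s] * (dg[i][j][s] + dg[j][i][s] - dg[s][i][j])
                        for s in n)
              for j in n] for i in n] for k in n]

theta = 1.0
gamma = christoffel((theta, 0.3))
print(gamma[0][1][1], -math.sin(theta) * math.cos(theta))  # Gamma^theta_phi,phi
print(gamma[1][0][1], 1.0 / math.tan(theta))               # Gamma^phi_theta,phi
```

Reading the symbols off the Euler equations avoids computing all derivatives in (7) by hand; the numerical evaluation merely confirms the result.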
We motivate Einstein's basic equation (1). Thereby we use the heuristic principle of the greatest possible simplicity of a theory. In order to derive the field equations (1) for the vacuum we assume the variational principle

$$\int_H L_1\,dm = \text{stationary!},$$

where $H$ is a region in $E^4$ and $dm = \sqrt{|g|}\,du$ is the volume element in $E^4$. In order to obtain covariant equations, the integral has to be invariant, i.e., $L_1$ has to be a scalar. Since the curvature of $E^4$ should play an important role and $R$ is the "simplest" scalar that can be formed from $R^i_{jkm}$, we assume $L_1 = R$. This way we obtain the fundamental variational problem of Hilbert

$$J = \int_H L\,du = \text{stationary!} \tag{V}$$

with

$$L = R\sqrt{|g|}.$$

Since $L$ depends on $g_{ij}$ and the first- and second-order derivatives, we assume furthermore that all $g_{ij}$ and $D_kg_{ij}$ remain fixed on the boundary $\partial H$.

In Problem 76.7 we prove

$$\delta J = -\int_H(R^{ij} - \tfrac{1}{2}g^{ij}R)\,\delta g_{ij}\,dm.$$

It follows then from $\delta J = 0$ that

$$R^{ij} - \tfrac{1}{2}g^{ij}R = 0. \tag{16*}$$

This is the basic equation (1) with $T^{ij} = 0$. In the case of matter fields we assume that the effect is described by a matter tensor $S^{ij}$, and we replace equation (16*) with

$$R^{ij} - \tfrac{1}{2}g^{ij}R = S^{ij}. \tag{16}$$

In the case of an ideal fluid, $T^{ij}$ in (10) is the simplest possible candidate. For dimensional reasons we need to assume that

$$S^{ij} = \kappa T^{ij}.$$
In Section 76.5, the constant $\kappa$ will be exactly determined by comparison with the expansion of the universe, which has been obtained in Section 58.15 in the context of Newton's theory. The same value for $\kappa$ also follows from approximation (19) below. Problem 76.5 implies that we have the identity

$$\nabla_i(R^{ij} - \tfrac{1}{2}g^{ij}R) = 0.$$

Hence for every solution of (1) we have

$$\nabla_iT^{ij} = 0. \tag{17}$$

In the case of the ideal fluid (10), these are precisely the relativistic equations of motion for the fluid which have been discussed in Section 75.13.
In the general case we make the following assumptions on the $T^{ij}$:
(i) $T^{ij} = T^{ji}$ and $T^{44}$ has the dimension of an energy density.
(ii) Equation (17) is physically meaningful for solutions of (1).
In Part V we will show how the $T^{ij}$ for electromagnetic fields and other fields can be derived from a variational principle.
We now want to compare the field equations (1) with Newton's theory. We begin with the metric

$$ds^2 = (1 + 2U/c^2)\,c^2dt^2 - (1 - 2U/c^2)(dy)^2, \tag{18}$$

where $t$ denotes the time and Cartesian coordinates are used as space coordinates. If $\rho$ is the mass density, then $T^{44} = \rho c^2$ is the energy density. A straightforward computation from basic equation (1) yields for $i = j = 4$

$$\Delta U = 4\pi G\rho, \tag{19}$$

if $1/c$ is regarded as small and higher-order terms are neglected (see Problem 76.8). Equation (19), however, is the classical equation for Newton's gravitational potential

$$U(y) = -\int_H\frac{G\rho(y')}{|y - y'|}\,dy'.$$
For further motivation we might mention that the variational principle (14) for the case (18) becomes

$$\int_{t_0}^{t_1}L\,dt = \text{stationary!}, \qquad y(t_i) = y_i,$$

with the Lagrangian

$$L = -m_0c^2\sqrt{(1 + 2U/c^2) - (1 - 2U/c^2)(\dot y^2/c^2)} = -m_0c^2 + \frac{m_0\dot y^2}{2} - m_0U + \cdots.$$

The dots stand for terms of higher order in $1/c$. If these terms are neglected, the corresponding Euler equation $(d/dt)L_{\dot y} - L_y = 0$ coincides with the classical equation of motion

$$m_0\ddot y = -m_0U_y.$$

We note that the particular form of the factor of $(dy)^2$ in (18) cannot be determined from this variational argument, but instead may be determined from (19). Formula (18) is called a quasi-classical approximation of the general theory of relativity. This is well supported by experiments (e.g., the red shift of light).
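The quality of the quasi-classical approximation can be examined numerically by comparing the Lagrangian belonging to the metric (18) with its Newtonian limit $-m_0c^2 + m_0\dot y^2/2 - m_0U$. The Python sketch below is an illustration with arbitrary units and example values; the function names are hypothetical. It shows that the difference shrinks like $1/c^2$ when $c$ is increased.

```python
import math

def lagrangian_metric(m0, c, U, speed):
    """Lagrangian obtained from (14) for the metric (18)."""
    return -m0 * c ** 2 * math.sqrt((1.0 + 2.0 * U / c ** 2)
                                    - (1.0 - 2.0 * U / c ** 2) * speed ** 2 / c ** 2)

def lagrangian_newton(m0, c, U, speed):
    """Leading terms of the expansion in 1/c: rest energy, kinetic and potential part."""
    return -m0 * c ** 2 + 0.5 * m0 * speed ** 2 - m0 * U

def newton_gap(c, m0=1.0, U=-1.0, speed=2.0):
    """Absolute difference between both Lagrangians for the given c."""
    return abs(lagrangian_metric(m0, c, U, speed) - lagrangian_newton(m0, c, U, speed))

for c in (10.0, 100.0, 1000.0):
    print(c, newton_gap(c))   # the gap decreases roughly by a factor 100 per step
```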

The experimental basis for Einstein, in his general theory of relativity, was the equivalence between gravitational mass $M$ and inertial mass $m$. Let us explain this. According to Newton, the motion of a mass point in the gravitational field of the sun, for example, is given by the equation

$$m\ddot y = -GMM_{\text{sun}}\,\frac{y}{|y|^3}, \tag{20}$$

where the sun is located at the origin. For $m = M$ the motion $y = y(t)$ is only affected by $M_{\text{sun}}$, i.e., analogously to the electric field, we may speak of a gravitational field. The equality $M = m$, implicitly assumed by Newton, was confirmed experimentally by Eötvös, 1909, with an accuracy of $10^{-9}$ (today $10^{-11}$). The field character of the gravitational force enables us to write the equations of motion (3) in a form where the mass of the particles does not appear.
An important role in the final formulation of the general theory of relativity,
which Einstein worked on for about 10 years, was played by the principle of
equivalence between gravitational and accelerational fields. This principle
states that, locally, the gravitational effect can be eliminated by passing to an
accelerated system. As an example, we consider an elevator $\Sigma$ which moves
downward with exactly the acceleration of gravity. A physicist in $\Sigma$, who drops
a stone, will observe that the stone remains at rest in $\Sigma$. In mathematical terms,
this local equivalence principle means the following. At each fixed point $x \in E^4$
one can introduce locally a coordinate system with

$$g_{ij} = \epsilon_i\delta_{ij}, \qquad \Gamma^k_{ij} = 0 \qquad\text{at } x,$$

where $\epsilon_4 = 1$ and $\epsilon_\alpha = -1$ for $\alpha = 1, 2, 3$. This will be proved in Problem 76.4. At the point $x$, the
equation of motion (6) then takes the form $\ddot u^i = 0$. Globally, however, the
gravitational field can in general not be turned off by passing to an accelerated
system. This follows already from Newton's mechanics, since the gravitational
field ofthe sun and an accelerational field at infinity satisfy different boundary
conditions.
The local equivalence principle played an important role for Einstein, since
it led him to the idea of using the tensor calculus of Riemann, Ricci, and
Levi-Civita. This point has caused many controversies because of the formal
character of the requirement for the covariance of the equations. In fact, the
hard core of general relativity is Einstein's idea that gravitational effects are
caused by the geometry of the manifold $E^4$. Then a formulation of the theory
which only uses invariant concepts of the manifold $E^4$ automatically leads to
covariant equations by passing to charts, i.e., systems of reference.

76.3. Friedman Solution for the Closed Cosmological Model
In Section 74.21 we considered two-dimensional Riemannian manifolds with constant positive and negative curvature together with the corresponding elliptic and hyperbolic non-Euclidean geometries. In the following we will show that the corresponding three-dimensional spaces can be used as models for our cosmos.

[Figure 76.1. Panel (a): the cosmos; panel (b): the radius $r$ as a function of $t$, with the labels Big Bang, $t_c$ (contraction), $t_{\text{end}}$, and $r_{\max}$.]
In the closed cosmological model, the cosmos consists of the surface $S_r^3$ of a ball of radius $r$ in $\mathbb{R}^4$, i.e.,

$$S_r^3 = \{\xi \in \mathbb{R}^4\colon \xi_1^2 + \xi_2^2 + \xi_3^2 + \xi_4^2 = r^2\}. \tag{21}$$

The radius $r$ depends on time $t$, and $r = r(ct)$ can be determined from the basic equations

$$R_{ij} - \tfrac{1}{2}g_{ij}R = \kappa T_{ij} \tag{22}$$

of the general theory of relativity. Figure 76.1(a) shows the corresponding one-dimensional case. Therein we live on the boundary of a disk with radius $r$ where $r$ depends on $t$. More precisely, $r$ will have the form of Figure 76.1(b). In the two-dimensional analog we live on the surface of a ball. The open cosmological model, on the other hand, which will be discussed in the following section, corresponds to Figure 76.2 as a two-dimensional analog. Therein we live on the surface of a cylinder, but the metric is not induced by the usual metric of $\mathbb{R}^3$, but instead is given by (39) below. For both models the cosmos is unbounded, i.e., it has no boundary. The volume of the cosmos, on the other hand, is finite for the closed model but infinite for the open model.

[Figure 76.2. Panels (a) and (b), with the labels Big Bang and cosmos.]

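In $S_r^3$ we introduce the usual spherical coordinates:

$$\begin{aligned}
\xi_4 &= r\cos\psi, & \xi_3 &= r\sin\psi\cos\vartheta,\\
\xi_2 &= r\sin\psi\sin\vartheta\cos\varphi, & \xi_1 &= r\sin\psi\sin\vartheta\sin\varphi,
\end{aligned} \tag{23}$$

where $0 \le \varphi < 2\pi$ and $0 \le \vartheta, \psi \le \pi$. From $dl^2 = \sum_{i=1}^{4}d\xi_i^2$ we obtain

$$dl^2 = r^2[d\psi^2 + \sin^2\psi(\sin^2\vartheta\,d\varphi^2 + d\vartheta^2)]. \tag{24}$$

The volume of the cosmos is $V = \int\sqrt{|g|}\,du$ with $u = (\varphi, \vartheta, \psi)$ and hence

$$V = \int_0^\pi\int_0^\pi\int_0^{2\pi}r^3\sin^2\psi\sin\vartheta\,d\varphi\,d\vartheta\,d\psi = 2\pi^2r^3. \tag{25}$$

The greatest possible distance on $S_r^3$ is the distance between the North and the South pole, i.e.,

$$l = \int_0^\pi r\,d\psi = r\pi.$$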
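The volume formula (25) can be confirmed by a direct numerical integration of the volume element $r^3\sin^2\psi\sin\vartheta$. The following Python sketch uses a simple midpoint rule; it is an illustration only, and the function names and the grid size are hypothetical choices.

```python
import math

def volume_closed_cosmos(r, n=200):
    """Midpoint-rule approximation of (25); the phi-integral contributes the factor 2*pi."""
    h = math.pi / n
    total = 0.0
    for a in range(n):
        psi = (a + 0.5) * h
        for b in range(n):
            theta = (b + 0.5) * h
            total += r ** 3 * math.sin(psi) ** 2 * math.sin(theta)
    return 2.0 * math.pi * total * h * h

def greatest_distance(r):
    """Distance between North and South pole on the sphere of radius r."""
    return math.pi * r

r = 2.0
print(volume_closed_cosmos(r), 2.0 * math.pi ** 2 * r ** 3)  # numerical value versus (25)
print(greatest_distance(r))
```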


Note that analogously to the two-dimensional sphere $S_r^2$ in $\mathbb{R}^3$, we need at least two charts to describe the manifold $S_r^3$. The spherical coordinates are chart coordinates only if one removes North and South pole. We might consider everything in chart coordinates, but for convenience we shall use spherical coordinates. This causes no difficulties. For example, (25) can be obtained by passing to limits. Note that the choice for the position of the North pole is arbitrary. The equivalence of all points on $S_r^3$ corresponds to the following, today generally accepted cosmological principle:
(C) All points in the cosmos are equivalent.

As metric for the four-dimensional space-time manifold $E^4$ we choose

$$ds^2 = c^2dt^2 - dl^2 \tag{26}$$

with $t > 0$. An atomic clock $A$ which is located at a fixed point in the cosmos corresponds to the world line $\varphi = \varphi_0$, $\vartheta = \vartheta_0$, $\psi = \psi_0$, $t = \text{arbitrary} > 0$. According to Section 76.2 the time parameter $t$ corresponds to the proper time

$$\tau = c^{-1}\int ds = \int dt = t.$$

Therefore we may think of $t$ as the world time. Time $t = 0$ corresponds to the Big Bang.
More precisely, we let $\mathbb{R}_> = \{t \in \mathbb{R}\colon t > 0\}$ and choose the four-dimensional space-time manifold $E^4$ in the form

$$E^4 = S_1^3 \times \mathbb{R}_>.$$

A point on $E^4$ is described then by a point on $S_1^3$, i.e., by the spherical coordinates

$$(\varphi, \vartheta, \psi)$$

and by the world time $t$. We shall see below that $r = r(ct)$. Thus at time $t$ a point $(\varphi, \vartheta, \psi, t)$ on $E^4$ corresponds to a point $(\varphi, \vartheta, \psi)$ in the cosmos $S^3_{r(ct)}$ at time $t$. In fact, we will see that the metric, obtained by solving Einstein's field equations (1), may have singularities for certain $t$-values. In the case of the matter cosmos, for example, we obtain that $r(ct_{\text{end}}) = 0$, and hence $\det g_{ij}(t_{\text{end}}) = 0$. If we want to rigorously work with Riemannian manifolds, we would have to exclude these singularities by modifying $E^4$. In the case of the matter cosmos, one can choose:

$$E^4 = S_1^3 \times \,]0, t_{\text{end}}[.$$
As energy-momentum tensor for the cosmos we choose the corresponding
tensor
$$T^{ij} = -P g^{ij} + c^{-2}(P + \varepsilon(P))\,v^i v^j \tag{27}$$
for an ideal fluid. This procedure is motivated as follows:
(i) For a study of the cosmos in the large, local properties are not important.
(ii) Astronomical data are not in conflict with the assumption that in the
mean the mass is equally distributed in the cosmos.
(iii) There are no forces of friction between the masses in the cosmos.
One should not be irritated by the notion of a "fluid." As we saw in Section
75.13, formula (27) is the simplest ansatz for the motion of masses which
satisfies (iii).
We neglect the individual motion of the galaxies, and assume that at every
point in the cosmos, the mass is at rest with respect to our fixed system of
reference. Thus we have
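$$v^i = \frac{du^i}{d\tau} = \begin{cases} 0 & \text{for } i = 1, 2, 3,\\ c & \text{for } i = 4. \end{cases}$$
The following two differential equations of Friedman
$$3(\dot r^2 + \delta) = K\varepsilon r^2, \tag{28}$$
$$\dot\varepsilon = -3(\varepsilon + P)\dot r/r, \tag{29}$$
are important for determining $r = r(ct)$ and the state equation
$$\varepsilon = \varepsilon(P) \tag{30}$$
between pressure $P$ and rest energy density $\varepsilon$ of the cosmos. The dot means
derivative with respect to $ct$.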

Theorem 76.B (Friedman (1922)). Under the above assumptions, Einstein's
equations (22) for the metric (26) are equivalent to (28) and (29) with $\delta = 1$ if
we assume that $\dot r \neq 0$ and $\varepsilon + P \neq 0$.

PROOF: In the following, Greek indices run from 1 to 3. The nonvanishing
components of $T_{ij} = g_{ir}g_{js}T^{rs}$ are
$$T_{\alpha\beta} = -P g_{\alpha\beta}, \qquad T_{44} = \varepsilon.$$
We use a straightforward computation from (9) to obtain the Ricci tensor,
and from (26) we find, for the nonvanishing components,
$$R_{\alpha\beta} = -[r\ddot r + 2(\dot r^2 + \delta)]\,g_{\alpha\beta}/r^2, \qquad R_{44} = -3\ddot r/r.$$
We have
$$R = g^{ij}R_{ij} = g^{\alpha\beta}R_{\alpha\beta} + R_{44}$$
with $g^{\alpha\beta}g_{\alpha\beta} = 3$, and hence
$$R = -[6r\ddot r + 6(\dot r^2 + \delta)]/r^2. \tag{31}$$
Thus the field equations (22) at once yield the following system
$$\frac{2\ddot r}{r} + \frac{\dot r^2 + \delta}{r^2} = -KP, \tag{32}$$
$$3(\dot r^2 + \delta)/r^2 = K\varepsilon. \tag{33}$$
Equation (33) is (28), and differentiation of (33) with respect to $ct$ together
with (32) yields (29). Conversely, formulas (32), (33) follow from (28), (29)
after differentiating (28). □

Corollary 76.1. The curvature scalar $R$ of $S_r^3$ is equal to $6/r^2$.

PROOF: This follows most easily from (31) with $r$ = const. Because of $ds^2 = -dl^2$
for $t$ = const, a sign change occurs in (31). □

Let us consider two important special cases.

EXAMPLE 76.2 (Closed Radiation Cosmos). According to Planck's radiation
law of Section 68.4, we find the relation $\varepsilon = 3P$ for a space only filled with
radiation. Formula (29) implies that
$$\varepsilon r^4 = \text{const} = C \tag{34}$$
and from Planck's radiation law it follows that $\varepsilon = \text{const}\cdot T^4$. Consequently, the
temperature $T$ satisfies
$$Tr = \text{const}. \tag{35}$$
This relation has already essentially been used in Section 58.15. Letting e= r 2 ,
the solution of (28) becomes:
(36)
EXAMPLE 76.3 (Closed Matter Cosmos). If matter is dominant, then we may
approximately set $P = 0$ and $\varepsilon = \rho c^2$ where $\rho$ is the mass density. It follows
from (29) that
$$\varepsilon r^3 = \text{const}.$$
Integration implies the conservation of the total mass $M$. More precisely, we
have $\rho = M/V$, and hence
$$\varepsilon r^3 = Mc^2/2\pi^2. \tag{37}$$
If we introduce the new variable $\eta = \int d(ct)/r(ct)$, the solution of (28) becomes:
$$r = \frac{KMc^2}{12\pi^2}(1 - \cos\eta), \qquad t = \frac{KMc}{12\pi^2}(\eta - \sin\eta). \tag{38}$$
In both examples we have $r = 0$ at $t = 0$ (Big Bang). Moreover, $r$ in (38) is
time-dependent as shown in Figure 76.1(b), i.e., at a time $t_{\text{contraction}}$ the expansion
of the cosmos changes into a contraction, and at time $t_{\text{end}}$ the cosmos has
collapsed into a single point. We find $\eta_{\text{end}} = 2\pi$, and thus
$$t_{\text{end}} = KMc/6\pi$$
and $t_{\text{contraction}} = t_{\text{end}}/2$.
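
The parametric solution (38) can be checked numerically against the Friedman equation (28) with $\delta = 1$. The following is a minimal sketch, not part of the original text. It assumes $\varepsilon r^3 = Mc^2/2\pi^2$, so that (28) reads $\dot r^2 + 1 = 2A/r$ with the abbreviation $A = KMc^2/12\pi^2$ (the name `A` and both function names are introduced here only for illustration), and it approximates $\dot r = dr/d(ct)$ by central differences.

```python
import math

def closed_matter(eta, A=1.0):
    """Parametric solution (38), written with A = K*M*c**2/(12*pi**2).

    Returns the pair (ct, r) for the parameter value eta.
    """
    return A * (eta - math.sin(eta)), A * (1.0 - math.cos(eta))

def friedman_residual(eta, A=1.0, h=1e-5):
    """Residual r_dot**2 + 1 - 2*A/r of equation (28) with delta = 1.

    Assumes eps * r**3 = M*c**2/(2*pi**2), i.e. K*eps*r**2/3 = 2*A/r.
    The derivative r_dot = dr/d(ct) is a central difference in eta.
    """
    ct_lo, r_lo = closed_matter(eta - h, A)
    ct_hi, r_hi = closed_matter(eta + h, A)
    r_dot = (r_hi - r_lo) / (ct_hi - ct_lo)
    _, r = closed_matter(eta, A)
    return r_dot ** 2 + 1.0 - 2.0 * A / r

if __name__ == "__main__":
    for eta in (0.5, 1.0, 2.0, 3.0, 5.0):
        print(eta, friedman_residual(eta))
    ct_max, r_max = closed_matter(math.pi)        # largest radius
    ct_end, r_end = closed_matter(2.0 * math.pi)  # collapse
    print(r_max, ct_max / ct_end, r_end)
```

The residuals should vanish up to discretization error, the radius is largest at $\eta = \pi$, and the collapse at $\eta = 2\pi$ takes twice as long as the expansion phase, in agreement with $t_{\text{contraction}} = t_{\text{end}}/2$.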

76.4. Friedman Solution for the Open Cosmological Model

In the open model the cosmos is equal to the set $P_r^3 = S^2 \times \mathbb{R}_>$ with metric
(39)
A point of $P_r^3$ is given by the coordinates $(\varphi, \vartheta, \psi)$ with $0 \le \varphi < 2\pi$, $0 \le \vartheta \le \pi$
(spherical coordinates on $S^2$) and $0 < \psi < \infty$.
As four-dimensional space-time manifold $E^4$ we now choose
$$E^4 = P_r^3 \times \mathbb{R}_>.$$
If we know $r = r(ct)$, then a point $(\varphi, \vartheta, \psi, t)$ in $E^4$ corresponds to a point
$(\varphi, \vartheta, \psi)$ in the cosmos $P^3_{r(ct)}$ at time $t$. We can then make analogous observations
as in Section 76.3. But it is much easier to reduce this case to the case
of the closed cosmological model by replacing $r$ of Section 76.3 with $ir$ and $\psi$
with $i\psi$. For the volume of the cosmos we now find

Theorem 76.B holds with $\delta = -1$. From (28), (29) with $\delta = -1$ we obtain the
following special cases.

EXAMPLE 76.4 (Open Radiation Cosmos). For $\varepsilon = 3P$ we have $\varepsilon r^4 = \text{const} = C$.
Furthermore, $Tr$ = const and
$$r^2 = c^2t^2 + 2ct\sqrt{KC/3}.$$
EXAMPLE 76.5 (Open Matter Cosmos). For $P = 0$ and $\varepsilon = \rho c^2$ we have $\varepsilon r^3 = \text{const} = C$ and
$$r = \frac{KC}{6}(\cosh\eta - 1), \qquad t = \frac{KC}{6c}(\sinh\eta - \eta).$$

Here $r$ behaves as shown in Figure 76.2(b). For the spatial curvature scalar $R$
of the cosmos $P_r^3$ we obtain $-6/r^2$.
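
The two open solutions can be checked in the same numerical way against (28) with $\delta = -1$. This is only a sketch under stated assumptions: for the radiation cosmos it takes $r^2 = (ct)^2 + 2a\,ct$ with $a = \sqrt{KC/3}$, so that (28) reads $\dot r^2 - 1 = a^2/r^2$, and for the matter cosmos it takes $r = B(\cosh\eta - 1)$, $ct = B(\sinh\eta - \eta)$ with $B = KC/6$, so that (28) reads $\dot r^2 - 1 = 2B/r$. The abbreviations `a`, `B` and all function names are introduced here for illustration.

```python
import math

def open_matter(eta, B=1.0):
    """(ct, r) for the open matter cosmos, with B = K*C/6."""
    return B * (math.sinh(eta) - eta), B * (math.cosh(eta) - 1.0)

def open_matter_residual(eta, B=1.0, h=1e-5):
    """Residual r_dot**2 - 1 - 2*B/r of (28) with delta = -1, eps*r**3 = C."""
    x_lo, r_lo = open_matter(eta - h, B)
    x_hi, r_hi = open_matter(eta + h, B)
    r_dot = (r_hi - r_lo) / (x_hi - x_lo)   # dr/d(ct) by central differences
    _, r = open_matter(eta, B)
    return r_dot ** 2 - 1.0 - 2.0 * B / r

def open_radiation_r(x, a=1.0):
    """r as a function of x = ct for the open radiation cosmos, a = sqrt(K*C/3)."""
    return math.sqrt(x * x + 2.0 * a * x)

def open_radiation_residual(x, a=1.0, h=1e-6):
    """Residual r_dot**2 - 1 - a**2/r**2 of (28) with delta = -1, eps*r**4 = C."""
    r_dot = (open_radiation_r(x + h, a) - open_radiation_r(x - h, a)) / (2.0 * h)
    r = open_radiation_r(x, a)
    return r_dot ** 2 - 1.0 - a * a / (r * r)

if __name__ == "__main__":
    for eta in (0.5, 1.0, 3.0):
        print("matter", eta, open_matter_residual(eta))
    for x in (0.1, 1.0, 10.0):
        print("radiation", x, open_radiation_residual(x))
```

In contrast to the closed case, $r$ grows without bound in both solutions, so no collapse occurs.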

76.5. Big Bang, Red Shift, and Expansion of the Universe

In connection with Section 58.15 we consider the following questions:
(i) How can the red shift of galaxies (Hubble effect) be explained in the
context of cosmological models in the general theory of relativity?
(ii) How can we decide whether the open or the closed model is the right
model for our cosmos?
(iii) How can we determine the age of our universe?
We start with (i) and consider a galaxy as in Figure 76.3 which emits light
signals at times $t_G$ and $t_G + \Delta t_G$. Here on earth, we receive these light signals
at times $t_E$ and $t_E + \Delta t_E$. We begin with the closed cosmological model and,
without loss of generality, assume that the galaxy is at the North pole of $S_r^3$.
The motion of the photons is described by the affine geodesics equation (4).
By computing the $\Gamma^i_{jk}$ one shows that
$$\varphi = \text{const}, \qquad \vartheta = \text{const}, \qquad \psi = \psi(t)$$
are solutions of (4). This also can be motivated by symmetry arguments without
any computations. The additional condition $ds^2 = c^2\,dt^2 - r^2\,d\psi^2 = 0$
implies that
$$\psi(t) = \int_{t_G}^{t} \frac{c\,dt}{r(ct)}.$$

[Figure 76.3: light ray from the galaxy to the earth.]

The coordinate of the galaxy is $\psi_G = 0$. Let $\psi_E$ be the coordinate of the earth.
It follows then that
$$\psi_E = \int_{t_G}^{t_E} \frac{c\,dt}{r(ct)} = \int_{t_G + \Delta t_G}^{t_E + \Delta t_E} \frac{c\,dt}{r(ct)}.$$

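This implies in first-order approximation
$$\frac{\Delta t_E}{r(ct_E)} = \frac{\Delta t_G}{r(ct_G)}.$$
By definition, the frequency $\nu$ of light is equal to the number of oscillations per
unit of time, i.e.,
$$\frac{\nu_G}{\nu_E} = \frac{\Delta t_E}{\Delta t_G} = \frac{r(ct_E)}{r(ct_G)}.$$
Let $t_E - t_G$ be small. Using a Taylor expansion, it follows that
$$r(ct_G) = r(ct_E) + c\dot r(ct_E)(t_G - t_E) + o(|t_G - t_E|).$$
We define the Hubble constant
$$H(t) = c\dot r(ct)/r(ct).$$
Hence
$$\frac{\nu_G}{\nu_E} = 1 + H(t_E)(t_E - t_G) + o(t_E - t_G).$$
Because of $\nu\lambda = c$, we obtain for the wave length $\lambda$ of the light:
$$\frac{\lambda_E - \lambda_G}{\lambda_G} = H(t_E)(t_E - t_G) + o(t_E - t_G). \tag{40}$$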

Here, $\lambda_G$ denotes the wave length of the light emitted from the galaxy, and $\lambda_E$
denotes the wave length of the light received at the earth. According to (40),
we obtain $\lambda_E > \lambda_G$ for $H(t_E) > 0$, i.e., we observe a red shift at the earth. This
is Hubble's law:

For $H(t_E) > 0$, i.e., for an expanding universe and for small travel times of
light $t_E - t_G$, the relative red shift is in first-order approximation proportional
to the travel time.

In a contracting universe, we have $\dot r(ct_E) < 0$, i.e., $H(t_E) < 0$. This implies
$\lambda_E < \lambda_G$. In this case, we observe a blue shift at the earth.
If the red shift (40) is interpreted in terms of a classical Doppler effect, then
we obtain from Problem 58.6 that V = H R, where R is the distance and V the
escape velocity of the galaxy. For the open cosmological model one uses
analogous arguments.
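
To see how good the first-order statement (40) is, one can compare it with the ratio $r(ct_E)/r(ct_G)$ of the radii in a concrete model. The sketch below is an illustration only. It uses the closed matter cosmos (38) in units with $c = 1$ and $KMc^2/12\pi^2 = 1$, takes $H = c\dot r/r$ with $\dot r = \sin\eta/(1 - \cos\eta)$, and the function names are hypothetical.

```python
import math

def closed_matter(eta):
    """(ct, r) from (38) in units with c = 1 and K*M*c**2/(12*pi**2) = 1."""
    return eta - math.sin(eta), 1.0 - math.cos(eta)

def hubble(eta):
    """H = c*r_dot/r with r_dot = sin(eta)/(1 - cos(eta)) and c = 1."""
    return math.sin(eta) / (1.0 - math.cos(eta)) ** 2

def red_shift(eta_g, eta_e):
    """Relative shift (lambda_E - lambda_G)/lambda_G and its estimate from (40).

    eta_g and eta_e are the parameter values at emission and reception.
    """
    t_g, r_g = closed_matter(eta_g)
    t_e, r_e = closed_matter(eta_e)
    from_radii = r_e / r_g - 1.0
    first_order = hubble(eta_e) * (t_e - t_g)
    return from_radii, first_order

if __name__ == "__main__":
    print(red_shift(0.99, 1.0))    # expanding phase: red shift
    print(red_shift(0.999, 1.0))   # shorter travel time: better agreement
    print(red_shift(4.99, 5.0))    # contracting phase (eta > pi): blue shift
```

The two numbers agree better the shorter the travel time of the light is, and the sign changes in the contracting phase $\eta > \pi$.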
We now consider (ii). For the matter cosmos with $P = 0$, $\varepsilon = \rho c^2$, which is
the case of our cosmos right now, we obtain from (28):
$$3\delta c^2 = r^2(K\rho c^4 - 3H^2). \tag{41}$$
This gives:
$K\rho c^4 > 3H^2$: closed cosmological model ($\delta = 1$),
$K\rho c^4 < 3H^2$: open cosmological model ($\delta = -1$).
Thus the critical density is
$$\rho_{\text{crit}} = 3H^2/Kc^4.$$
A comparison with (58.72) shows that
$$K = 8\pi G/c^4.$$
This determines the universal constant $K$ which appears in Einstein's field
equation (1). Interestingly, the same value for $K$ follows from approximation
(19). In Section 58.15 we already mentioned that the physical data presently
known admit no definite conclusion as to which model is the correct model
of our cosmos.
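
For orientation, the critical density can be evaluated numerically. The sketch below is not from the original text. It assumes $\rho_{\text{crit}} = 3H^2/(8\pi G)$, which is what the two inequalities following (41) give together with $K = 8\pi G/c^4$, and it uses $H = 1/(20\cdot 10^9 \text{ years})$, the rough value quoted in Section 76.6. The constants are rounded and the proton mass stands in for the nucleon mass.

```python
import math

G = 6.674e-11                  # gravitational constant in m^3 kg^-1 s^-2
YEAR = 365.25 * 24 * 3600.0    # Julian year in seconds
M_NUCLEON = 1.6726e-27         # proton mass in kg, used as the nucleon mass

def critical_density(hubble_time_years):
    """rho_crit = 3 H^2 / (8 pi G) in kg/m^3 for H = 1/hubble_time."""
    H = 1.0 / (hubble_time_years * YEAR)
    return 3.0 * H ** 2 / (8.0 * math.pi * G)

def relative_density(nucleons_per_m3, hubble_time_years):
    """rho_rel = rho / rho_crit for a given number of nucleons per cubic metre."""
    rho = nucleons_per_m3 * M_NUCLEON
    return rho / critical_density(hubble_time_years)

if __name__ == "__main__":
    print(critical_density(20e9))        # kg/m^3
    print(relative_density(1.0, 20e9))   # one nucleon per cubic metre
```

With these inputs the critical density corresponds to a few nucleons per cubic metre, so a density of one nucleon per cubic metre lies below it.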
Finally, we answer (iii). From (32) and (33), it follows that
$$6\ddot r = -K(\varepsilon + 3P)r. \tag{42}$$
This equation is independent of the state equation $\varepsilon = \varepsilon(P)$. Since, at present,
we observe a red shift, we have $H > 0$, and hence $\dot r(ct) > 0$. From (42) the curve
$r = r(ct)$ is concave. Figure 76.4 shows that $r(ct)/ct \ge \dot r(ct)$. Therefore,
$$t \le r(ct)/c\dot r(ct) = 1/H(t).$$
This estimate has also been obtained in Section 58.15 by using another derivation.
It implies a maximal age of our universe of $20\cdot 10^9$ years.

[Figure 76.4: graph of $r$ as a function of $ct$.]

76.6. The Future of Our Cosmos

76.6a. The Future in the Closed Cosmological Model


In Section 58.15 we already saw that our cosmos is a matter cosmos. Consider
the closed matter cosmos of Example 76.3.

EXAMPLE 76.6. The model of the closed matter cosmos is uniquely determined
by the mass density $\rho$ and the Hubble constant $H > 0$ at the present time. We
define
$$r_{\text{crit}} = c/H, \qquad \rho_{\text{rel}} = \rho/\rho_{\text{crit}}, \qquad t_{\text{crit}} = \pi/H.$$
This implies

The present radius of curvature of the cosmos is
$$r = r_{\text{crit}}(\rho_{\text{rel}} - 1)^{-1/2}.$$
From Section 76.3 it follows then that, at the present time, the maximal
distance in the cosmos is equal to $\pi r$. The mass of the cosmos is $M = 2\pi^2 r^3\rho$,
hence
$$M = M_{\text{crit}}\,\rho_{\text{rel}}(\rho_{\text{rel}} - 1)^{-3/2}.$$
The age of the cosmos $t$ is
$$t = \frac{\sin\eta\,(\eta - \sin\eta)}{H(1 - \cos\eta)^2}, \qquad \cos(\eta/2) = \sqrt{\rho_{\text{crit}}/\rho}.$$
Because of $0 < \eta < \pi$ we obtain
$$0 < t \le 0.7/H.$$
At time $t_{\text{contraction}}$ the cosmos begins to collapse towards a single point. This
process is completed at time $t_{\text{end}}$. We have
$$t_{\text{end}} = t_{\text{crit}}\,\rho_{\text{rel}}(\rho_{\text{rel}} - 1)^{-3/2}, \qquad t_{\text{contraction}} = t_{\text{end}}/2.$$

PROOF: The formula for $r$ follows immediately from (41). Formulas (38) imply
that
$$ct = \frac{KMc^2}{12\pi^2}(\eta - \sin\eta). \tag{43}$$
Because of $M = 2\pi^2 r^3\rho$ we obtain $r = Kr^3\rho c^2(1 - \cos\eta)/6$, and hence
$$r = \sqrt{\frac{6}{\rho Kc^2(1 - \cos\eta)}}. \tag{44}$$
From $H = c\dot r(ct)/r(ct)$ together with (43) and the chain rule it follows that
$$rH = c(\sin\eta)/(1 - \cos\eta). \tag{45}$$
Equations (44) and (45) imply that $\cos(\eta/2) = \sqrt{\rho_{\text{crit}}/\rho}$, and from (43) and (45)
we obtain the formula for $t$. □

The presently known data roughly yield $H = 1/(20\cdot 10^9 \text{ years})$. This allows us
to estimate the age $t$ of the cosmos as
$$t \le 13\cdot 10^9 \text{ years}$$
as well as

and
$$r_{\text{crit}} = 20\cdot 10^9 \text{ light years}, \qquad M_{\text{crit}} = 5\cdot 10^{53} \text{ kg}, \qquad t_{\text{crit}} = 60\cdot 10^9 \text{ years}.$$

The present mass density is estimated as 1 nucleon/m$^3$, i.e., $\rho_{\text{rel}} = 0.3$. This
value, which favors the open cosmological model, however, is very uncertain.
Also, $\rho_{\text{rel}}$ may increase in the presence of dark masses between the galaxies
and the expected neutrino mass of 20 eV/$c^2$. The proton has an approximate
mass of $10^9$ eV/$c^2$.
If we assume $\rho_{\text{rel}} = 1.3$, for instance, then we obtain $\eta = 1$ and a reasonable
age of the universe of $t = 13\cdot 10^9$ years. The radius of curvature $r$, the mass
$M$ of the cosmos, and the end time $t_{\text{end}}$ of the cosmos are
$$r = 34\cdot 10^9 \text{ light years}, \qquad M = 3\cdot 10^{54} \text{ kg}, \qquad t_{\text{end}} = 400\cdot 10^9 \text{ years}.$$
In this case, the cosmos begins to collapse at $200\cdot 10^9$ years after the Big Bang.
At first, only astronomers will notice a sudden blue shift in the spectrum of
the galaxies. Slowly the sky will get brighter, and all living creatures will go
blind and start to sweat until the inferno breaks loose. The cosmos becomes
a gas ball which, after getting hotter and hotter, collapses into one point.
According to Figure 76.1(b) there exists the theoretical possibility that in the
end, i.e., after $400\cdot 10^9$ years, there occurs another Big Bang. Astronomers on
this earth, however, will have no chance to observe the beginning of the
contracting phase, because already after $8\cdot 10^9$ years, the radius of our sun
will have increased by a factor of 100 and its luminosity by a factor of 2,000, and
thus all life on our planet will have been destroyed.
The model above describes a somewhat simplified situation, since in the
beginning the cosmos was a radiation cosmos and only later on became a
matter cosmos. This has been discussed in Section 58.15.
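
The formulas of Example 76.6 are easy to evaluate. The following sketch is an illustration, not the author's computation. It assumes $K = 8\pi G/c^4$, $\rho_{\text{crit}} = 3H^2/(8\pi G)$ and $H = 1/(20\cdot 10^9 \text{ years})$, uses rounded constants, and the function name and dictionary keys are hypothetical. Since the figures quoted in the text are rounded, the sketch can be expected to reproduce them only roughly.

```python
import math

C = 2.99792458e8               # velocity of light in m/s
G = 6.674e-11                  # gravitational constant in m^3 kg^-1 s^-2
YEAR = 365.25 * 24 * 3600.0    # Julian year in seconds

def closed_matter_cosmos(rho_rel, hubble_time_years=20e9):
    """Evaluate the formulas of Example 76.6 for rho_rel = rho/rho_crit > 1."""
    if rho_rel <= 1.0:
        raise ValueError("the closed model requires rho_rel > 1")
    H = 1.0 / (hubble_time_years * YEAR)               # Hubble constant in 1/s
    eta = 2.0 * math.acos(rho_rel ** -0.5)             # cos(eta/2) = sqrt(rho_crit/rho)
    age = math.sin(eta) * (eta - math.sin(eta)) / (H * (1.0 - math.cos(eta)) ** 2)
    r = (C / H) * (rho_rel - 1.0) ** -0.5              # radius of curvature in m
    rho = rho_rel * 3.0 * H ** 2 / (8.0 * math.pi * G) # mass density in kg/m^3
    mass = 2.0 * math.pi ** 2 * r ** 3 * rho           # M = 2 pi^2 r^3 rho
    t_end = (math.pi / H) * rho_rel * (rho_rel - 1.0) ** -1.5
    return {
        "eta": eta,
        "age_years": age / YEAR,
        "radius_light_years": r / (C * YEAR),
        "mass_kg": mass,
        "t_end_years": t_end / YEAR,
    }

if __name__ == "__main__":
    for key, value in closed_matter_cosmos(1.3).items():
        print(key, value)
```

As a consistency check, the radius obtained from (41) should agree with $r = KMc^2(1 - \cos\eta)/12\pi^2$ from (38) when the mass computed above is inserted.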

76.6b. The Future in the Open Cosmological Model

If the present ideas about the theory of elementary particles are correct, then
we will have a gloomy future in the open cosmological model.
Similarly as in Example 76.6, we find as the radius of curvature
$$r = r_{\text{crit}}(1 - \rho_{\text{rel}})^{-1/2}$$
and as the age of the universe
$$t = \frac{\sinh\eta\,(\sinh\eta - \eta)}{H(\cosh\eta - 1)^2}, \qquad \cosh(\eta/2) = \sqrt{\rho_{\text{crit}}/\rho}.$$
Because of $0 < \eta < \infty$ we have
$$2/(3H) \le t \le 1/H.$$
This means $13\cdot 10^9$ years $\le t \le 20\cdot 10^9$ years. For $\rho_{\text{rel}} = 0.3$ we obtain $t =
14\cdot 10^9$ years for the age of the universe and $r = 24\cdot 10^9$ light years for the
curvature radius of the universe.
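
The corresponding evaluation for the open model can be sketched in the same way. The code below is an illustration under the same assumptions as before ($H = 1/(20\cdot 10^9 \text{ years})$, hypothetical function name and keys). It uses that $r_{\text{crit}} = c/H$ expressed in light years equals $1/H$ expressed in years.

```python
import math

def open_matter_cosmos(rho_rel, hubble_time_years=20e9):
    """Age and radius of curvature in the open matter cosmos, 0 < rho_rel < 1."""
    if not 0.0 < rho_rel < 1.0:
        raise ValueError("the open model requires 0 < rho_rel < 1")
    eta = 2.0 * math.acosh(rho_rel ** -0.5)    # cosh(eta/2) = sqrt(rho_crit/rho)
    h_times_t = (math.sinh(eta) * (math.sinh(eta) - eta)
                 / (math.cosh(eta) - 1.0) ** 2)
    return {
        "eta": eta,
        "age_years": h_times_t * hubble_time_years,
        "radius_light_years": hubble_time_years * (1.0 - rho_rel) ** -0.5,
    }

if __name__ == "__main__":
    for rho_rel in (0.01, 0.3, 0.9, 0.99):
        print(rho_rel, open_matter_cosmos(rho_rel))
```

For every admissible $\rho_{\text{rel}}$ the product $Ht$ stays between $2/3$ and $1$; it approaches $2/3$ as $\rho_{\text{rel}} \to 1$ and $1$ as $\rho_{\text{rel}} \to 0$.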
In contrast to the closed model there occurs no collapse in the open model.
In connection with a unified theory for elementary particles (SU(5)-theory
or modifications), a decay of protons is predicted. If this is true, then after
$10^{40}$ years, nucleons will no longer exist, but only black holes may exist. In
Section 76.17 we explain why quantum effects may cause the vaporization of
these black holes. After approximately $10^{150}$ years all black holes will have
vaporized. In the dark and empty cosmos only a few low energy photons and
neutrinos remain whose energy becomes weaker and weaker. This is a ghostly
situation.
Although one might be sceptical with regard to some details of these models,
most physicists are convinced that one of these two models is qualitatively
the correct model for our universe. The numerical values differ in the literature,
because the data for H and p are so uncertain.

76.7. The Very Early Cosmos


In this section we briefly mention a number of bold hypothetical ideas in
modern physics which concern the very early state of our cosmos. We consider:
(a) Grand unification theory (GUT) and phase transitions.
(b) Inflationary universe.
(c) Quantum cosmology.
(d) Supersymmetry and supergravity.
(e) Superstring theory and the unification of all fundamental forces.
For further reading one may consult the References to the Literature at the
end of this chapter.

76.7a. Grand Unification Theory and Phase Transitions

Phase transitions belong to the most interesting physical phenomena. The
simplest phase transition, which occurs in our every-day life, is the transition

steam $\Rightarrow$ water $\Rightarrow$ ice

under decreasing temperatures. It is known that such phase transitions can
take place with delay. For a certain amount of time the system is then in a
metastable equilibrium state. For example, water may exist below the freezing
point 0° Celsius. In this case, however, only a small impulse is needed to
transform the supercooled water into ice.
We now turn to the cosmos. In addition to the gravitational force, we
presently observe three fundamental interactions:

(i) Strong interaction.


(ii) Weak interaction.
(iii) Electromagnetic interaction.

In the context of the SU(5)-standard model, (i)-(iii) can be given a unified
description. This theory is called "grand unification theory," or briefly GUT.
It is a quantum field theory which has the character of a gauge field theory.

Table 76.1

| Mean energy per degree of freedom of a particle, $E = kT$ | Temperature $T$ of the cosmos | Time after the Big Bang | Interactions |
|---|---|---|---|
| $> 10^{19}$ GeV | $> 10^{32}$ K | $< 10^{-44}$ s | Quantum cosmology (strong coupling between gravitational and quantum effects; supersymmetry and supergravity) |
| $10^{19}$ GeV (Planck energy) | $10^{32}$ K (Planck temperature) | $10^{-44}$ s (Planck time) | Only gravitation and a unified interaction $\Omega$ for all elementary particles exist |
| $10^{15}$ GeV | $10^{28}$ K | $10^{-35}$ s | First phase transition ($\Omega$ splits into strong interaction $\Omega_s$ and electroweak interaction $\Omega_{ew}$); there exist gravitational, strong, and electroweak interaction |
| $10^{3}$ GeV | $10^{16}$ K | $10^{-12}$ s | Second phase transition ($\Omega_{ew}$ splits into weak and electromagnetic interaction); there exist gravitational, strong, weak, and electromagnetic interaction |
| 10 MeV | $10^{11}$ K | $10^{-2}$ s | See Section 58.15e |

All gauge field theories have the following mathematical mechanism in common:
From the requirement of local gauge symmetry for the theory it follows
that physical fields exist which cause the interactions. This will be discussed
in Chapter 96 of Part V. A quantization of these fields yields the particles
which are responsible for those interactions. Table 76.1 shows some of the
typical phenomena. In the temperature region $10^{32}$ K $> T > 10^{28}$ K of the
cosmos, only one common interaction exists besides gravitation. The cooling-off
process of the cosmos then leads to the interactions (i)-(iii).
The mean energy per degree of freedom of a particle in the hot cosmos of
temperature $T$ is equal to $E = kT$. This follows from the equipartition law of
statistical physics (see Problem 68.3). The energy density in the early cosmos
can be calculated as

Here $N_B$ is the number of boson species where a particle with $s$ spin positions
is counted $s$-times. Analogously, $N_F$ corresponds to the fermions. If, for
example, only photons exist, then we have $N_B = 2$ and $N_F = 0$. The formula
above for $\varepsilon$ is a direct consequence of formula (68.25) for ideal Boson and
Fermi gases with chemical potential $\mu = 0$. In Problem 68.1 we used symmetry
arguments to motivate the fact that $\mu = 0$ is valid for the early cosmos.
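
The relation $E = kT$ links the first two columns of Table 76.1. As a quick plausibility check one can convert the listed temperatures into energies. This is a sketch only: the numerical value of the Boltzmann constant is inserted here as an assumption (it is not quoted in this section), and the function names are hypothetical.

```python
import math

K_BOLTZMANN_GEV_PER_K = 8.617333262e-14   # Boltzmann constant in GeV/K

def mean_energy_gev(temperature_kelvin):
    """Mean energy E = k*T per degree of freedom, in GeV."""
    return K_BOLTZMANN_GEV_PER_K * temperature_kelvin

def order_of_magnitude(x):
    """Nearest integer power of ten of a positive number."""
    return round(math.log10(x))

if __name__ == "__main__":
    for T in (1e32, 1e28, 1e16, 1e11):
        E = mean_energy_gev(T)
        print(T, "K ->", E, "GeV, order of magnitude 10 **", order_of_magnitude(E))
```

Up to a factor of order one, the resulting powers of ten agree with the energies listed in the table.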

76.7b. Inflationary Universe


In studying the Friedman model in Theorem 76.B we saw that the expansion
of the universe $r = r(ct)$ depends mainly on the equation of state
$$\varepsilon = \varepsilon(P)$$
between pressure $P$ and energy density $\varepsilon$. In the hypothetical model of the
inflationary universe one assumes that the first phase transition in Table 76.1,
which occurred $10^{-35}$ s after the Big Bang, took place with delay, i.e., during
approximately the time interval $\Delta = [10^{-35}\,\text{s}, 10^{-20}\,\text{s}]$ the cosmos was, as a
consequence of quantum effects, in a metastable state. If one calculates this
state from quantum field theoretical models, then the Friedman model implies
that during the time interval $\Delta$ an enormous expansion of the universe in the
order of magnitude of $10^{50}$ took place. Thereafter the development continued
in a regular fashion. If this model is correct, then the radius of curvature of
our present universe is much, much bigger than what has been calculated in
Section 76.6. This inflationary model resolves a number of problems.
($\alpha$) Critical density. Because of the huge radius of curvature $r$, the relative
mass density $\rho_{\text{rel}}$ lies with great accuracy in the neighborhood of 1 according
to Section 76.6. Consequently, we have that
$$|\rho - \rho_{\text{crit}}| = \text{very small}.$$
This way one can explain why the present mass density of the cosmos $\rho$
lies in a neighborhood of $\rho_{\text{crit}}$.
($\beta$) Isotropy of the 3 K-radiation. The 3 K-radiation, which was discovered
in 1964, is with great precision isotropic. This, however, is difficult to
understand, because parts of this radiation come from regions of the
cosmos which are so far apart that no causal connection can exist between
them. Note that physical effects can propagate with at most the velocity
of light. In the inflationary model, this difficulty does not exist, since then
the radiation which is presently observed comes from a small region of
the early cosmos.
($\gamma$) Magnetic monopoles. For superconductors of type II one obtains vortex
filaments in case that the gauge symmetry is broken. The first phase
transition $10^{-35}$ s after the Big Bang corresponds to a breaking of the
SU(5)-gauge symmetry. Therefore physicists expect that at this time
magnetic monopoles appeared with a mass at least $10^{16}$ times the proton
mass, i.e., $10^{16}$ GeV/$c^2$. Without using the inflationary model, one
obtains a monopole density for the cosmos, which is too large and implies
that only some $10^4$ years after the Big Bang the cosmos collapses. The
inflationary model yields a significant decrease of the monopole density.
It should be noted, however, that the inflationary model has also several
weak points, so that at the present time it can only be considered as a
hypothesis. At any rate, it is interesting that the high-energy processes in the
early universe might have had a significant influence on our present universe.

76.7c. Quantum Cosmology


We now discuss the first line of Table 76.1. Today the following universal
constants are known:
$G$ gravitational constant (theory of gravity, general theory of relativity);
$k$ Boltzmann constant (thermodynamics and statistical physics);
$c$ velocity of light (Maxwell's theory of electromagnetism, theory of relativity);
$\hbar$ Planck's action quantum $h$ divided by $2\pi$ (quantum theory);
$e_p$ elementary charge of the proton.

Numerical values may be found in Table 2 of the Appendix. All these constants
are connected with fundamental physical theories. It is now very remarkable
that any other physical dimension such as length, time, mass, temperature,
and charge can be derived from them. The fact that these universal constants
are so fundamental suggests basing, in a unique way, a natural system of
units on them. Table 76.2 contains these natural units. In order to obtain the
elementary length $l_P$ (Planck length), for example, we make the ansatz
$$l_P = G^{\alpha}k^{\beta}c^{\gamma}\hbar^{\delta}e_p^{\varepsilon}.$$

Table 76.2

| Elementary units | |
|---|---|
| Planck length | $l_P = \sqrt{\hbar G/c^3} = 1.6\cdot 10^{-35}$ m |
| Planck time | $t_P = l_P/c = 5.4\cdot 10^{-44}$ s |
| Planck energy | $E_P = \hbar/t_P = 1.22\cdot 10^{19}$ GeV |
| Planck mass | $m_P = E_P/c^2 = 1.22\cdot 10^{19}$ GeV/$c^2$ $= 1.3\cdot 10^{19}$ proton masses |
| Planck temperature | $T_P = E_P/k = 1.4\cdot 10^{32}$ K |
| Elementary charge (charge of the proton) | $e_p = 1.6\cdot 10^{-19}$ As |

Comparison of the dimensions (see Table 2 of the Appendix) implies in a unique
way that $\beta = \varepsilon = 0$, $\alpha = \delta = \tfrac12$ and $\gamma = -\tfrac32$. Analogously, one obtains the
remaining quantities of Table 76.2. Many physicists, using their experience
with quantum theory, expect that beyond the elementary units of Table 76.2,
completely new physical effects may occur. For example, there exists the
hypothesis that lengths which are less than $l_P$ cannot be measured. Since the
velocity of light $c$ is the largest possible velocity with which physical effects
can propagate, it follows that also times which are less than $t_P = l_P/c$ cannot
be measured. The particles in Table 76.1 have the mean energy $E_P = 10^{19}$ GeV
per degree of freedom for the cosmic temperature $T_P = 10^{32}$ K. One hypothesis
is that, for temperatures greater than $T_P$, a new kind of physics is required
whereby quantum theory and the general theory of relativity are related in
a significant way. This is quantum cosmology. Characteristic for quantum
theory is the fact that physical states can only be realized with certain probabilities.
Therefore one expects that the space-time metric (gravitational field)
and the matter fields in quantum cosmology can only be realized with appropriate
probabilities. This leads to the strange situation that the cosmos as
a whole has a stochastic character and may change at random (see Hawking
(1984, S)).
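
The entries of Table 76.2 and the exponents obtained from the comparison of dimensions can be reproduced with a few lines of code. The sketch below is illustrative only. It assumes that the action quantum entering the Planck units is the reduced constant $\hbar = h/2\pi$ (the quoted numerical values are consistent with this choice), it uses constants in SI units that are inserted here and not taken from the Appendix, and the names are hypothetical. Since $\beta = \varepsilon = 0$, the constants $k$ and $e_p$ do not enter the Planck length.

```python
import math
from fractions import Fraction

G = 6.67430e-11            # gravitational constant in m^3 kg^-1 s^-2
HBAR = 1.054571817e-34     # reduced action quantum in J s
C = 2.99792458e8           # velocity of light in m/s
K_B = 1.380649e-23         # Boltzmann constant in J/K
GEV = 1.602176634e-10      # one GeV in J
M_PROTON = 1.67262192e-27  # proton mass in kg

# Dimension exponents (metre, kilogram, second) of G, hbar and c.
DIM = {"G": (3, -1, -2), "hbar": (2, 1, -1), "c": (1, 0, -1)}

def dimension(alpha, delta, gamma):
    """(m, kg, s) exponents of the product G**alpha * hbar**delta * c**gamma."""
    return tuple(
        alpha * DIM["G"][i] + delta * DIM["hbar"][i] + gamma * DIM["c"][i]
        for i in range(3)
    )

def planck_units():
    """Planck length, time, energy, mass and temperature as in Table 76.2."""
    l_p = math.sqrt(HBAR * G / C ** 3)
    t_p = l_p / C
    e_p = HBAR / t_p
    return {
        "length_m": l_p,
        "time_s": t_p,
        "energy_GeV": e_p / GEV,
        "mass_proton_masses": e_p / C ** 2 / M_PROTON,
        "temperature_K": e_p / K_B,
    }

if __name__ == "__main__":
    half = Fraction(1, 2)
    print(dimension(half, half, Fraction(-3, 2)))   # exponents of a pure length
    for name, value in planck_units().items():
        print(name, value)
```

The exponents $\alpha = \delta = \tfrac12$, $\gamma = -\tfrac32$ give the dimension of a length, and the numerical values should agree with the table up to rounding.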

76.7d. Supersymmetry and Supergravity

The principle of supersymmetry states that between bosons (elementary particles
with integer spin) and fermions (elementary particles with half-integer
spin), there exists a complete symmetry, i.e., each boson corresponds to a
fermion and vice versa. It is then extremely important that the requirement
of local supersymmetry for the theory leads to the existence of a tensor field
$g_{ij}$ which may be regarded as a metric of a space-time manifold. This field
corresponds to gravitation. In a suggestive way this reads:
local supersymmetry $\Rightarrow$ gravitation.
In the context of quantum field theory, all interactions are described by
particles (quantization of fields). For example, it is expected that the gravitational
field corresponds to the graviton. This is a boson of spin two. According
to the supersymmetry, this boson must correspond to a fermion which is called
a gravitino. Because of the existence of two gravitational particles, one speaks
of supergravity.
In the present cosmos no supersymmetry is observed. But the hypothesis
exists that supersymmetry has been realized in the very early cosmos. Through
the cooling-off process of the cosmos and phase transitions, this symmetry
has been broken.
It is very interesting that a mathematical description of supersymmetry
theories requires a new differential calculus for graded anticommutative algebras.
One obtains supermanifolds, super-Lie groups, super-Lie algebras, etc.
(see Leites (1980, S), Wess and Bagger (1983, M), Regge (1984, S), Pressley
(1986, M), and West (1986, M)).
An elegant "supersymmetric" proof of the famous Atiyah-Singer index
theorem can be found in Simon (1986, M).

76.7e. Superstring Theory and the Unification of All Fundamental Interactions in the Universe

The basic idea of string theory is the following:

Replace particles with strings.

We first want to explain the simple core of this idea in terms of the theory of
special relativity. The crucial generalization to higher-dimensional curved
spaces will be considered later on.
We use the same notation as in Section 75.1, i.e., we set
$$u = (u^1, u^2, u^3, u^4) = (\xi, \eta, \zeta, ct),$$
where $\xi, \eta, \zeta$ denote Cartesian coordinates, $t$ denotes time, and $c$ denotes the
velocity of light. Moreover, we set
$$-g_{11} = -g_{22} = -g_{33} = g_{44} = 1$$
and $g_{ij} = 0$ if $i \neq j$.
(i) Particles and world lines. In the theory of special relativity, the motion
of a free particle is described by an equation of the form
$$u = u(t),$$
where $u^4(t) = ct$. This corresponds to a curve in the four-dimensional space-time
manifold, which is called a world line. According to Section 75.11, the
fundamental variational principle for the motion of the free particle $u = u(t)$
reads as follows:
$$\int_{t_1}^{t_2} L(u_t(t))\,dt = \text{stationary!}, \qquad u(t) = \text{fixed for } t = t_1, t_2, \tag{V}$$
where the Lagrangian $L$ is given by
$$L(u_t) = -m_0 c\,(g_{ij}u_t^i u_t^j)^{1/2}$$
with $u_t = du/dt$. Here, $m_0$ is the so-called rest mass of the free particle.
(ii) Strings and world sheets. In contrast to a free particle, the motion of a
string is described by an equation of the form
$$u = u(t, \sigma), \qquad t_1 \le t \le t_2, \quad \sigma_1 \le \sigma \le \sigma_2,$$
where $u^4(t, \sigma) = ct$. This corresponds to a two-dimensional surface in the
four-dimensional space-time manifold, which is called a world sheet. For fixed
time $t_0$, the shape of the string in the usual three-dimensional space is given by
$$x = x(t_0, \sigma),$$
where $x = (u^1, u^2, u^3)$. Instead of (V), we now describe the motion of the string
$u = u(t, \sigma)$ by the following variational principle:
$$\int_\Omega L(u_t(t, \sigma), u_\sigma(t, \sigma))\,dt\,d\sigma = \text{stationary!}, \qquad u = \text{fixed on } \partial\Omega, \tag{V*}$$
where $\Omega$ is a bounded region in $\mathbb{R}^2$. For example, we can choose $\Omega = \,]t_1, t_2[\, \times \,]\sigma_1, \sigma_2[$.
The Lagrangian $L$ is given by

with $u_t = \partial u/\partial t$ and $u_\sigma = \partial u/\partial\sigma$. Here, $T_0$ is the so-called rest tension of the
string, and $|\cdot|$ denotes the absolute value of the determinant.
Note that the variational problem (V*) is formulated in an invariant way.
In fact, if we change the coordinates $u$ into $u'$, then $L$ remains invariant. If we
change the parameters $(t, \sigma)$ into $(t', \sigma')$, then
$$L(u_t, u_\sigma) = \left|\frac{\partial(t', \sigma')}{\partial(t, \sigma)}\right| L(u_{t'}, u_{\sigma'}),$$
and hence the integral $\int_\Omega L\,dt\,d\sigma$ remains invariant in (V*).
The Euler equations for the variational problems (V) and (V*) are the
equations of motion for the free particle and the string, respectively.
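
The particle principle (V) can be illustrated numerically. The following sketch is not part of the original text. It works in one space dimension with units $c = 1$, $m_0 = 1$ and the metric of special relativity, so that $L = -(1 - v^2)^{1/2}$ with $v = dx/dt$. It compares the straight world line $x = v_0 t$ with the perturbed curves $x = v_0 t + a\sin(\pi t)$, which have the same end points. The perturbation family, the midpoint quadrature and the function name are choices made here for illustration.

```python
import math

def action(amplitude, v0=0.5, steps=2000):
    """Action of (V) for x(t) = v0*t + amplitude*sin(pi*t) on 0 <= t <= 1.

    Units with c = 1 and m0 = 1, one space dimension. The end points
    x(0) = 0 and x(1) = v0 are the same for every amplitude.
    """
    dt = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt                                  # midpoint rule
        v = v0 + amplitude * math.pi * math.cos(math.pi * t)
        total += -math.sqrt(1.0 - v * v) * dt               # L = -sqrt(1 - v^2)
    return total

if __name__ == "__main__":
    s0 = action(0.0)
    for a in (1e-3, 2e-3, 1e-2, 1e-1):
        print(a, action(a) - s0)
```

The differences are positive and scale like $a^2$ for small $a$, i.e., the first variation vanishes at the straight world line, which is the solution of the Euler equation for the free particle.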
We now want to discuss some possible generalizations. First, in the context
of the theory of general relativity, we have to replace the special metric tensor
$g_{ij}$ with a general metric tensor of signature $(-1, -1, -1, 1)$.
In the more interesting context of higher-dimensional physics, we set
$$u = (u^1, \ldots, u^d),$$
where $u^1, u^2, u^3, u^4$ correspond to the usual space-time coordinates, and in
(V*), we choose $g_{ij}$ as the metric tensor of a curved $d$-dimensional Riemannian
manifold. Physicists regard $(u^5, \ldots, u^d)$ as additional degrees of freedom of
space and time. However, these degrees of freedom are invisible to us, because
they play only a role below the Planck length $l_P = 1.6\cdot 10^{-35}$ m.
In Part V we will discuss in detail the following basic strategy of modern
physics:
(a) For each Lagrangian L, it is possible to construct a quantum field theory
by using the Feynman integral.
(b) Global symmetries of L yield conservation laws (e.g., conservation of
energy and charge).
(c) Local symmetries (i.e., gauge symmetries) yield additional particles which
are responsible for the interactions (e.g., the photon is responsible for the
interaction between electrons and positrons in quantum electrodynamics).

Roughly speaking, this strategy, applied to modifications of the variational
principle (V*), leads to superstring theory. The following quotations refer to
a fascinating recent development in theoretical physics.
String theory is right now the hot topic of theoretical physics. String theory is
a new view of what the fundamental constituents of nature are. According to
this picture, the fundamental constituents of nature are not, in fact, particles or
even fields, but are instead little strings, little elementary rubber bands that go
zipping around, each in its own state of vibration. In these theories what we call
a particle is just a string in a particular state of vibration, and what we call a
reaction among particles, is just the collision of two or more strings, each in
its own state of vibration, forming a single joined string which then later
breaks up, forming several independent strings, each again in its own mode of
vibration.
It seems like a strange notion for physicists to have come to after all these
years of talking about particles and fields, and it would take too long to explain
why we think this is not an unreasonable picture of nature, but perhaps I can
summarize it in one sentence:
String theories incorporate gravitation.
In fact, not only do they incorporate it, you cannot have a string theory without
gravitation.
The graviton, the quantum of gravitational radiation, the particle which is
transmitted when a gravitational force is exerted between two masses, is just the
lowest mode of vibration of a fundamental closed string (closed meaning that it
is a loop). Not only do they incorporate and necessitate gravitation, but these
string theories for the first time allow a description of gravitation on a micro-
scopic quantum level which is free of mathematical inconsistencies.
All other descriptions of gravity broke down mathematically, gave nonsensi-
cal results when carried to very small distances or very high energies. String
theory is our first chance at a reasonable theory of gravity which extends from

the very large down to the very small and as such, it is natural that we are all
agog over it.
String theory itself has focused the attention of physicists on branches of
mathematics that most of us weren't fortunate enough to have learned when we
were students. You can easily see that a string (just think of a little bit of cord)
traveling through space, sweeps out a two-dimensional surface. A very con-
venient (and, in fact, perhaps even more fundamental than talking about strings)
description of string theory is to say that it is the theory of these two-dimensional
surfaces.
The theory of two-dimensional surfaces is remarkably beautiful. There are
ways of classifying all possible two-dimensional surfaces according to their
topology, the number of handles on them and the number of boundaries, which
simply don't exist in any higher dimension. The theory of two-dimensional
surfaces is a branch of mathematics that, when you get into it, is one of the
loveliest things you can learn. It was developed in the nineteenth century, again,
I believe starting with Riemann, and further developed by mathematicians
working in the late nineteenth century motivated by problems in complex
analysis, and then continuing in the twentieth century. There are mathematicians
who have spent their whole lives working on this theory of two-dimensional
surfaces, who have never heard of string theory (or at least not until very
recently). Yet when the physicists started to figure out how to solve the dynamical
problems of strings, and they realized what they had to do was to perform sums
over all possible two-dimensional surfaces in order to add up all the ways that
reactions could occur, they found the mathematics just ready for their use,
developed over the past 100 years.
String theory involves another branch of mathematics which goes back to
group theory. The equations which govern these surfaces have a very large group
of symmetries, known as the conformal group. One description of these sym-
metries is in terms of an algebraic structure (the Lie algebra) representing all the
possible group transformations, which is actually infinite-dimensional. Mathe-
maticians have been doing a lot of work developing the theory of these infinite-
dimensional algebraic structures which underlie symmetry groups, again with-
out any clear motivation in terms of physics, and certainly without knowing
anything about string theory. Yet when the physicists started to work on it, there
it was.
Speaking quite personally, I have found it exhilarating at my stage of life to
have to go back to school and learn all this wonderful mathematics. Some of us
physicists have enjoyed our conversations with mathematicians, in which we
beg them to explain things to us in terms we can understand. At the same time
the mathematicians are pleased and somewhat bemused that we are paying
attention to them after all these years. The mathematics department of the
University of Texas at Austin now allows physicists to use one of their lounges,
which would have been unlikely in previous years.
Unfortunately, I must admit that there is no experimental evidence yet for
string theory, and so, if theoretical physicists are spending more time talking to
the mathematicians, they are spending less time talking to the experimentalists,
which is not good.
Steve Weinberg (1986)

It appears likely that superstring theories unite gravity and quantum mechanics
in a consistent manner. This is achieved by a modification of general relativity
at short distances so that Einstein's theory emerges as a long-distance approxi-
mation. Furthermore, the quantum consistency of superstring theories provides
very stringent restrictions on the possible unifying Yang-Mills gauge groups.
As a result, gravity is unified with the other forces and particles in an almost
unique manner. The only possible unifying groups are
$SO(32)$ and $E_8 \times E_8$.
Here, $SO(32)$ is a large orthogonal group while $E_8$ is the largest exceptional Lie
group. The dimensionality of space-time is also required to take a special (or
"critical") value
$d = 10$
in order to obtain a consistent superstring quantum theory. Clearly, in order
to have any chance of describing the observed physics of an (approximately)
four-dimensional world, six dimensions must turn out to be curled
up (or "compactified") to a very small size (below the Planck space-time unit
$1.6 \cdot 10^{-35}$ m).
M. Green (1986)
Perhaps the most important news for this conference, so far as astro-particle
physics is concerned, is the news of the emergence of a String Theory of
Everything (TOE), a theory which will embrace cosmology, all forces of nature,
including gravitation and all matter. A field theory of closed strings, of the size
of Planck loops ($10^{-35}$ m), which naturally arises from excitations of closed
strings is possibly finite to all loop orders in this formalism (i.e., the typical
singularities of quantum field theory are renormalizable). If this statement is
borne out by future work, we shall have the first quantum theory of gravity:
something which has eluded us all so far. For cosmologists, this will mean that
we shall have, at last, a credible radiative extension of Einstein's equations,
admittedly of use only when the Universe was very tiny in size, but of great
conceptual significance nonetheless.
I cannot forbear from repeating a remark due to Chris Isham at Imperial
College. Chris said, when he started research, he went to quantum gravity; his
hope was to discover the origin of Planck's quantum of action $h$ within the
context of general coordinate transformations: Planck as a part of Einstein. With
quantized strings one is succeeding, but in the opposite direction: Einstein's
theory appears to be emerging from a small part of quantum theory!
Abdus Salam (1986)

76.8. Schwarzschild Solution


I have read your paper with the greatest interest. I did not expect that the exact
solution of the problem can be formulated so easily. The analytic treatment of
the problem seems to be brilliant.
Einstein in a letter to Schwarzschild on January 9, 1916.
(The astronomer Karl Schwarzschild wrote this paper on his death-bed).

In the following sections we try to explain some of the interesting physical
consequences of the so-called Schwarzschild solution:
$$ds^2 = c^2(1 - r_s/r)\,dt^2 - r^2(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2) - (1 - r_s/r)^{-1}\,dr^2. \tag{46}$$
Here
$$r_s = 2GM/c^2$$
is the so-called Schwarzschild radius. For the sun (or the earth) we have $r_s = 3$ km
(or 1 cm). As four-dimensional space-time manifold we choose
$$E_4 = \{(y,t) \in \mathbb{R}^4 : |y| > r_s\}.$$
Here $r$, $\vartheta$, $\varphi$ are spherical coordinates in $\mathbb{R}^3$.
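
As a rough numerical check of the two values just quoted, the following sketch evaluates $r_s = 2GM/c^2$ for the sun and the earth. The constants and masses are rounded standard values that do not appear in the text and are assumptions of this illustration.

```python
# Physical constants in SI units (rounded standard values, assumed here).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C_LIGHT = 2.998e8    # speed of light, m/s

M_SUN = 1.989e30     # mass of the sun in kg (assumed standard value)
M_EARTH = 5.972e24   # mass of the earth in kg (assumed standard value)


def schwarzschild_radius(mass_kg):
    """Return r_s = 2 G M / c^2 in metres."""
    return 2.0 * G * mass_kg / C_LIGHT**2


r_s_sun = schwarzschild_radius(M_SUN)      # roughly 3 km
r_s_earth = schwarzschild_radius(M_EARTH)  # roughly 1 cm

print(f"sun:   r_s = {r_s_sun / 1e3:.2f} km")
print(f"earth: r_s = {r_s_earth * 1e2:.2f} cm")
```

Both results agree with the rounded figures 3 km and 1 cm used throughout this chapter.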

Theorem 76.C (Schwarzschild (1916)). The metric tensor on $E_4$, which belongs
to (46), is a solution of Einstein's equation for the vacuum.

The proof is a straightforward calculation, which is the subject of Problem
76.9, and also the more general and important theorem of Birkhoff (1923) is
proved there. This theorem states that for $0 < r < r_s$ and $r_s < r$, (46) is the only
spherically symmetric solution of Einstein's equation for the vacuum $R_{ij} = 0$.
In Section 76.13 we shall see that for $0 < r_s < r$, the metric (46) corresponds
to a black hole at rest.
In order to understand the physical meaning of (46) we present a simple
approximate computation which implies (46). We combine the special
theory of relativity with Newton's law of gravity and use the local equivalence
principle. Let $\Sigma$ be a Cartesian coordinate system, with the sun of mass $M$ at
the origin. Let $\Sigma'$ be a box with rest mass $m_0$, which comes from infinity and
falls radially towards the sun (Fig. 76.5). As a consequence of the free fall, an
observer in $\Sigma'$ will not observe any gravitation. He therefore treats $\Sigma'$ as an
inertial system and chooses the metric
$$ds^2 = c^2\,dt'^2 - d\xi'^2 - d\eta'^2 - d\zeta'^2.$$
$\Sigma'$ has velocity $v$ at time $t'$, and $V = |v|$ can be computed from the energy
conservation law
$$E_{kin} + E_{pot} = const = C.$$
Newton's theory gives $E_{pot} = -GmM/r$. Moreover, we have $E = mc^2$ with rest
energy $m_0c^2$, and thus $E_{kin} = (m - m_0)c^2$. Also, we have $C = 0$, since in the
beginning the box is at rest at infinity. This gives
$$(m - m_0)c^2 - GmM/r = 0.$$
Because of $m = m_0/\sqrt{1 - V^2/c^2}$ we obtain
$$\sqrt{1 - V^2/c^2} = 1 - r_s/2r, \qquad 1 - V^2/c^2 = 1 - r_s/r + \cdots.$$
[Figure 76.5: the box $\Sigma'$ moving with velocity $v$, $|v| = V$, at distance $r$ from the sun.]

For the transformation from $\Sigma'$ to $\Sigma$ we use the formulas of the special theory
of relativity. This yields
$$d\xi' = dr/\sqrt{1 - V^2/c^2} \quad \text{(length contraction)},$$
$$dt' = dt\,\sqrt{1 - V^2/c^2} \quad \text{(time dilatation)},$$
$$d\eta' = r\,d\vartheta, \qquad d\zeta' = r\sin\vartheta\,d\varphi \quad \text{(invariance of transversal lengths)}.$$
This implies (46).
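
The last step can be spelled out term by term. The following display is a sketch that inserts the three transformation formulas into the metric of the freely falling observer and uses the approximation $1 - V^2/c^2 \approx 1 - r_s/r$ obtained above; it is an illustration of the heuristic argument, not a rigorous derivation.

```latex
\begin{aligned}
c^2\,dt'^2 &= c^2\,(1 - V^2/c^2)\,dt^2 \approx c^2\,(1 - r_s/r)\,dt^2,\\
d\xi'^2 &= \frac{dr^2}{1 - V^2/c^2} \approx (1 - r_s/r)^{-1}\,dr^2,\\
d\eta'^2 + d\zeta'^2 &= r^2\,(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2).
\end{aligned}
```

Subtracting the last two lines from the first reproduces the three terms of (46).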
Therefore we can interpret (46) as the metric which is induced by a primary
body of mass $M$. Furthermore, $r$, $\varphi$, $\vartheta$ are spherical coordinates in $\Sigma$. The
proper time $\tau$, which is shown by a clock at rest, is
$$\tau = \tau_0 + \int_{t_0}^{t} (1 - r_s/r)^{1/2}\,dt.$$
The equation for a radial light ray is $ds^2 = 0$. This yields $\rho = c(\tau - \tau_0)$ with
$$\rho = \int_{r_0}^{r} (1 - r_s/r)^{-1/2}\,dr$$
and $r_0 > r_s$. As in Section 75.3, one may perform measurements of lengths by
using the travel time of light rays. Here one obtains $\rho$ instead of $r$. Only $\rho$ and
$\tau$ have a physical meaning. If $r_s/r$ is very small, then $\tau$ and $\rho$ can be replaced
by $t$ and $r$. This is true for reasonable physical experiments, because on the
surface of the sun or the surface of the earth we have approximately $r_s/r =
5 \cdot 10^{-7}$, if the primary body is the sun or the earth.

76.9. Applications to the Motion of the Perihelion of Mercury

The last month was one of the most exciting and exhausting periods in my life,
but also one of the most successful. ... I recognized that my previous field
equations of gravity were completely wrong. The Christoffel symbols have to be
regarded as the natural expression for the components of the gravitational
field .... The beautiful thing I experienced, was that not only Newton's theory
could be obtained in first-order approximation, but also the motion of the
Perihelion of Mercury in second-order approximation. For the deflection of light at
the sun one finds a value twice as large as before.
Einstein in a letter to Sommerfeld on November 28, 1915

During the nineteenth century many very precise computations of the orbits
in our planetary system were performed using the methods of perturbation
theory. Leverrier (1811-1877), who also predicted the orbit of Neptune, found
a rotation of the Perihelion of Mercury of 43" per century, which could not be
explained by Newton's theory of gravity. The solution is a consequence of the
general theory of relativity, as will be shown in the following. In Section 58.9,
we saw that classical mechanics yields planetary orbits
$$r^{-1} = q(1 + \varepsilon\cos\varphi) \tag{47}$$
with $q = 1/a(1 - \varepsilon^2)$, where $a$ is the semimajor axis and $\varepsilon$ the eccentricity, i.e.,
$\varepsilon a$ is the distance of the sun from the center of the ellipse. The sun is located
at a focus. The surface velocity $S$ of each planet is constant, i.e.,
$$2^{-1} r^2 \dot\varphi = const = S. \tag{48}$$
Moreover, we have $q = GM/4S^2$ with the gravitational constant $G$ and mass
of the sun equal to $M$. From (47) and (48) one obtains $\varphi = \varphi(t)$, i.e., the
planetary motion in time.
The equations of motion for the general theory of relativity are
$$\ddot u^k + \Gamma^k_{ij}\,\dot u^i \dot u^j = 0. \tag{49}$$
In this section, the dot means derivative with respect to the proper time $\tau$ of
the orbital motion. The Christoffel symbols $\Gamma^k_{ij}$ belong to the Schwarzschild
metric (46). We are looking for orbits $r = r(\varphi)$, which for $\varphi = 0$ coincide with
the classical Kepler ellipse, i.e.,
$$r(0)^{-1} = q(1 + \varepsilon). \tag{50}$$
In the literature one finds approximate solutions for (49). Sometimes dubious
arguments are used and the accuracy of the approximate solution is not clear.
We prove here that the solution can be written as a convergent series of the
small parameter $\eta = r_s q$. Also, we present a simple procedure which admits
expansions up to any given order. We use the same methods as have been
used in Chapter 8 for nonlinear oscillations. As we shall see in step (VI) of the
following proof, the solution cannot be obtained by a direct iteration, but
instead one uses a procedure which is related to the branching equation of
Ljapunov.
As we shall show in (III) of the following proof, it is possible to describe
explicitly the motion of planets $r^{-1} = E(\varphi)$ in the general theory of relativity
by an elliptic function E. However, our method of proof via a convergent
iteration method is also applicable to many other problems in celestial
mechanics, where an explicit solution is not available. We assume:
(H) Let the mass of the sun $M$ be given, as well as a number $\varepsilon$ with $0 < \varepsilon < 1$
and a number $S \neq 0$. With this we construct $q = GM/4S^2$. Because of
$r_s = 2GM/c^2$ we also have $q = r_s c^2/8S^2$. Furthermore, we let $\eta = r_s q$ and
$a = 1/q(1 - \varepsilon^2)$.

Theorem 76.D. If (H) holds, then there exists an $\eta_0 > 0$ such that for every $\eta$
with $0 < |\eta| \le \eta_0$ there exists a unique orbit $r = r(\varphi)$ for (49), (50). This is a
periodic orbit in $\varphi$ and has the form

(51)
Figure 76.6

with $q = r_s^{-1}\eta$ and
$$\Delta\varphi = 3\pi\eta = 3\pi r_s/a(1 - \varepsilon^2).$$
Here, $O(\eta^2)$ denotes a convergent power series in $\eta$. The motion in time $\varphi = \varphi(t)$
can be uniquely determined from
and $\varphi(0) = 0$.
Remark. For the sun we have $r_s = 3$ km. In the case of Mercury (and Pluto)
we find $\varepsilon^2 = 0.04$ (and $\varepsilon^2 = 0.06$). For all other planets $\varepsilon^2$ is significantly
smaller. For the earth we find $\varepsilon^2 = 0.003$. Hence it follows for all planets that
$q \approx 1/a$ and $\eta \approx r_s/a$. In the case of Mercury, which is the closest planet to the
sun, we obtain $a = 60 \cdot 10^6$ km, i.e., $\eta = 5 \cdot 10^{-8}$ assumes its largest value. Thus
the terms which differ from the classical Kepler ellipses are very small. Orbit
(51) has the period
$$2\pi/\alpha = 2\pi + \Delta\varphi + O(\eta^2).$$
Hence we find a rotation of the Perihelion of $\Delta\varphi$ per revolution (Fig. 76.6).
Table 76.3 contains several values. Perturbations, caused by other planets,
have already been taken into account for the observed values below. The proof
will show that the orbital motion is given by
$$d\tau/dt = 1 + C$$
where $C$ is very small. Thus, except for a very small error, the proper time
$\tau$ for all planets is equal to $t$, i.e., all planets have approximately the same
proper time.

Table 76.3. Motion of the Perihelion in angular seconds per century

               Mercury         Venus        Earth
Theory         43.03           8.6          3.8
Observation    43.11 ± 0.45    8.4 ± 4.8    5.0 ± 1.2
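
The theoretical row of Table 76.3 can be reproduced approximately from the formula $\Delta\varphi = 3\pi r_s/a(1 - \varepsilon^2)$ of Theorem 76.D. The sketch below multiplies the shift per revolution by the number of revolutions per century. The orbital elements and the value $r_s = 2.953$ km are rounded standard values that are not taken from the text; they are assumptions of this illustration.

```python
import math

R_S_SUN_KM = 2.953                        # Schwarzschild radius of the sun in km (assumed)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi
DAYS_PER_CENTURY = 36525.0

# name: (semimajor axis a in km, eccentricity, orbital period in days); assumed values
PLANETS = {
    "Mercury": (57.91e6, 0.2056, 87.969),
    "Venus": (108.21e6, 0.0068, 224.701),
    "Earth": (149.60e6, 0.0167, 365.256),
}


def perihelion_shift_per_century(a_km, ecc, period_days):
    """Return the first-order perihelion shift in angular seconds per century."""
    eta = R_S_SUN_KM / (a_km * (1.0 - ecc**2))     # eta = r_s * q
    shift_per_revolution = 3.0 * math.pi * eta     # Delta phi in radians
    revolutions = DAYS_PER_CENTURY / period_days
    return shift_per_revolution * revolutions * ARCSEC_PER_RAD


shifts = {name: perihelion_shift_per_century(*elements)
          for name, elements in PLANETS.items()}
for name, value in shifts.items():
    print(f"{name:8s} {value:6.2f} arcsec per century")
```

With these inputs the results lie close to the theoretical values 43.03, 8.6 and 3.8 listed in the table.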

We present a proof, which for $C = 0$, in addition, yields the deflection of
light at the sun (see Section 76.10).
PROOF. The main idea is to reduce the problem to a second-order differential
equation. This is done by differentiating. The linearized problem (57), however,
can only be solved under suitable side conditions on the solution and the
right-hand side. These side conditions yield an operator equation (59), from
which the orbit and its unknown period can be determined. This operator
equation can be solved by using the implicit function theorem. Equation (57)
below need not have a unique solution. Carelessness with this leads to some
inconsistent approximation procedures in the literature.
Our proof also will show that every solution of the original problem also
must be a solution of the operator equation (59). Thus uniqueness of the
solution of (59) implies the uniqueness of the solution of the original equation.
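(I) Variational problem. The starting point is the equations of motion (49).
As (15) shows, those are the Euler equations for the variational problem
$$\int g_{ij}\,\dot u^i \dot u^j\,d\tau = \text{stationary!}.$$
Thus we begin with
$$\int_{\tau_0}^{\tau_1} L\,d\tau = \text{stationary}, \qquad u(\tau_0) = u_0, \quad u(\tau_1) = u_1, \tag{52}$$
where $L = g_{ij}\,\dot u^i \dot u^j$. From (46) it follows that
$$L = (1 - r_s/r)c^2\dot t^2 - r^2(\dot\vartheta^2 + \dot\varphi^2\sin^2\vartheta) - (1 - r_s/r)^{-1}\dot r^2.$$
Because of $ds = c\,d\tau$ we have
$$L = Cc^2 \tag{53}$$
with $C = 1$. We assume that $\vartheta(0) = \pi/2$ and $\dot\vartheta(0) = 0$. This can always
be achieved by rotating the coordinate system and by passing to
$\tau - \tau_0$. The Euler equation
$$\frac{d}{d\tau}L_{\dot\vartheta} - L_\vartheta = 0$$
is linear in $\cos\vartheta$, $\dot\vartheta$, and $\ddot\vartheta$. Hence $\vartheta(\tau) = const = \pi/2$ is a solution. The
Euler equations
$$\frac{d}{d\tau}L_{\dot\varphi} - L_\varphi = 0 \quad\text{and}\quad \frac{d}{d\tau}L_{\dot t} - L_t = 0$$
yield
$$2^{-1} r^2 \dot\varphi = const = S, \qquad (1 - r_s/r)\,\dot t = const = T. \tag{54}$$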
Inserting the last equation into (53) gives an analog to the classical
energy theorem.
(II) First approximation. We let $u = 1/r$, $r = r(\varphi)$, and $u' = du/d\varphi$. From
(53) and (54) we obtain
$$4S^2u'^2 + 4S^2u^2 - c^2T^2 + Cc^2(1 - r_s u) = 4S^2 r_s u^3. \tag{55}$$
We choose
$$T = \sqrt{C(1 - r_s q) + 4(1 + \varepsilon^2)S^2q^2/c^2}.$$
By using $u = \eta v/r_s$ we pass to dimensionless quantities. From (55) it
follows that
$$v(0) = 1 + \varepsilon. \tag{56}$$
For $\eta = 0$ we obtain the first-order approximation
$$v = C + \varepsilon\cos\varphi.$$
Because of $C = 1$ these are precisely the Kepler ellipses.
For the planets we find
and
Hence $2Sq$ is approximately equal to the mean orbit velocity, which
is very small compared to the velocity of light $c$. For the earth we find
$4S^2q^2/c^2 = 10^{-8}$. Thus $T$ is almost equal to one. Hence (54) shows that
the proper time $\tau$ is almost equal to $t$.
(III) Elliptic functions. From (56) it follows by integration that $\varphi = \varphi(v)$.
The inverse function $v = v(\varphi)$ gives $r = r(\varphi)$. Since the polynomial in
(56) has three real zeros for small $\eta \neq 0$, it follows that $v = v(\varphi)$ is an
elliptic function with one real and one purely imaginary period.
(IV) Decomposition. Therefore the solution $v = v(\varphi)$ of (56) is periodic. Let
the period be denoted by $2\pi/\alpha$. It is important then for the following
approximation method that $v$ can uniquely be decomposed as
$$v = C + \varepsilon\cos\alpha\varphi + H - H_0\cos\alpha\varphi$$
with
$$\int_0^{2\pi/\alpha} H\cos\alpha\varphi\,d\varphi = 0 \quad\text{and}\quad H_0 = H(0).$$
This uses Fourier series and the fact that $v(0) = 1 + \varepsilon$.
(V) Study of the central linear differential equation. We set
$$a_k(f) = \pi^{-1}\int_0^{2\pi} f(\varphi)\cos k\varphi\,d\varphi.$$
This is the Fourier coefficient of $f$ by $\cos k\varphi$ for $k \ge 1$. Now we consider
the key equation
$$h'' + h = f. \tag{57}$$
Let $X$ be the B-space of all even, $2\pi$-periodic $C^4$-functions $f: \mathbb{R} \to \mathbb{R}$
with
$$a_1(f) = 0.$$
Then $|a_k(f)| \le const/k^4$. Thus we can differentiate the Fourier series of
$f$ twice with respect to $\varphi$. Comparison of the coefficients of the Fourier
series shows that (57) has a unique solution $h \in X$ for each $f \in X$. We set
$$h = Af.$$
Then the solution operator $A: X \to X$ of (57) is linear and continuous
(see Section 8.13).
It is important here that through the construction of $X$, i.e., through
the side conditions
$$a_1(h) = 0 \quad\text{and}\quad a_1(f) = 0$$
we force (57) to have a unique solution. Without the side condition
$a_1(h) = 0$, equation (57) will not have a unique solution, since $h = \cos\varphi$
is a solution of the homogeneous equation (57). Neglecting this fact
leads to inconsistent approximation methods in the literature.
Moreover, one should note that the condition $a_1(f) = 0$ is a necessary
solvability condition for (57), because from $h \in X$ it follows that
$$a_1(h'' + h) = 0.$$
Observe the Fourier expansion of $h$ and also that $a_1(h) = 0$. For the
function
$$f = a_2(f)\cos 2\varphi + a_3(f)\cos 3\varphi + \cdots$$
one obtains as a representation of the solution $h = Af$ of (57):
$$h = a_2(h)\cos 2\varphi + a_3(h)\cos 3\varphi + \cdots$$
with
$$a_k(h) = \frac{a_k(f)}{1 - k^2}, \qquad k = 2, 3, \ldots. \tag{57a}$$
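
The comparison of coefficients behind step (V) can be written out as follows. This is a sketch using only the cosine expansion of $h$ and twofold termwise differentiation, which is justified by the decay of the coefficients stated above.

```latex
h(\varphi) = \sum_{k \ge 2} a_k(h)\cos k\varphi
\quad\Longrightarrow\quad
h''(\varphi) + h(\varphi) = \sum_{k \ge 2} (1 - k^2)\,a_k(h)\cos k\varphi ,
\qquad\text{hence}\qquad
(1 - k^2)\,a_k(h) = a_k(f), \quad k = 2, 3, \ldots
```

For $k = 1$ the factor $1 - k^2$ vanishes, which is why $a_1(f) = 0$ is necessary for solvability and why $a_1(h)$ has to be fixed by a separate side condition.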
(VI) Equivalent operator equation. Differentiation of (56) gives the key
equation
$$v'' + v = C + 3\eta v^2/2. \tag{57b}$$
In order to determine the unknown period $2\pi/\alpha$, we set $w(\varphi) = v(\varphi/\alpha)$.
This gives
$$\alpha^2 w'' + w = C + 3\eta w^2/2.$$
We choose
$$\alpha^{-2} = 1 + 2\beta,$$
and analogously to (IV) we set
$$w = C + \varepsilon\cos\varphi + h - h_0\cos\varphi$$
with $a_1(h) = 0$ and $h_0 = h(0)$. This yields
$$h'' + h = f, \qquad h \in X$$
with
$$f = g - 2\beta\varepsilon\cos\varphi.$$
The trick is to compute the unknown number $\beta$ from
$$a_1(f) = 0.$$
According to (V) this condition is a necessary solvability condition.
Hence we obtain the equivalent system
$$h'' + h = g - a_1(g)\cos\varphi, \quad h \in X, \qquad \beta = a_1(g)/2\varepsilon \tag{58}$$
with
$$g = \tfrac{3}{2}\eta(1 + 2\beta)(C + \varepsilon\cos\varphi + h - h_0\cos\varphi)^2 - 2\beta(h - h_0\cos\varphi)$$
and the small parameter $\eta$. From (V) it follows that for $(h, \beta) \in X \times \mathbb{R}$,
equation (58) is an operator equation of the form
$$h = A(g - a_1(g)\cos\varphi), \qquad \beta = a_1(g)/2\varepsilon. \tag{59}$$
(VII) Existence and uniqueness proof. According to the implicit function
theorem of Section 4.7, equation (59) has a unique solution $(h, \beta) \in
X \times \mathbb{R}$ for every $\eta$: $|\eta| \le \eta_0$ which depends analytically on $\eta$.
(VIII) Computation of the solution. Since the solution is analytic, its
coefficients of expansion can be determined by using an ansatz and
comparison of coefficients in (58). For each step, one has to solve an
equation of the form (57) where $f$ is a finite Fourier series. Then also $h$
is a finite Fourier series, which can be determined by (57a).
In order to compute the first-order approximation, we note that only
that part of $g$ will be needed which is obtained by letting $\beta = 0$ and
$h = 0$. This gives
$$g = \tfrac{3}{2}\eta C^2 + 3\eta\varepsilon C\cos\varphi + \tfrac{3}{4}\eta\varepsilon^2(1 + \cos 2\varphi).$$
It follows from (58) that
$$\beta = 3\eta C/2 + O(\eta^2)$$
and hence $\alpha = 1 - 3\eta/2 + O(\eta^2)$ and
$$w = C + \frac{3\eta\varepsilon^2}{4} + \frac{3\eta C^2}{2} - \frac{\eta\varepsilon^2}{4}\cos 2\varphi + \left(\varepsilon - \frac{3}{2}\eta C^2 - \frac{\eta\varepsilon^2}{2}\right)\cos\varphi. \tag{60}$$

Equation (58) can also be solved by using an iteration scheme. On the
right-hand side one replaces $h$, $\beta$ with $h_n$, $\beta_n$ and on the left-hand side with
$h_{n+1}$, $\beta_{n+1}$, where $h_0 = \beta_0 = 0$. According to Section 4.7 this iteration scheme
is convergent and differs significantly from the inconsistent scheme
$$v''_{n+1} + v_{n+1} = C + 3\eta v_n^2/2,$$
which can often be found in the literature in connection with (57b).
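
The iteration scheme for (58) can be tried out numerically. The following sketch represents $h$ by its values on a uniform grid, computes the cosine coefficients $a_k$ by the rectangle rule, and solves the key equation (57) mode by mode via $a_k(h) = a_k(f)/(1 - k^2)$. The grid size, the number of retained modes, the number of steps and all function names are choices made for this illustration only; they are not part of the text.

```python
import math


def cos_coeff(samples, k):
    """a_k(f) = (1/pi) * integral over [0, 2*pi] of f(phi) cos(k phi), rectangle rule."""
    n = len(samples)
    return 2.0 / n * sum(f * math.cos(2.0 * math.pi * k * j / n)
                         for j, f in enumerate(samples))


def solve_key_equation(rhs, modes):
    """Solve h'' + h = rhs for even rhs with a_1(rhs) = 0, imposing a_1(h) = 0."""
    n = len(rhs)
    constant = cos_coeff(rhs, 0) / 2.0   # constant mode: h'' + h = const gives h = const
    coeffs = [(k, cos_coeff(rhs, k) / (1.0 - k * k)) for k in range(2, modes + 1)]
    return [constant + sum(a * math.cos(2.0 * math.pi * k * j / n) for k, a in coeffs)
            for j in range(n)]


def iterate_orbit(eta, eps, big_c=1.0, n=128, modes=16, steps=20):
    """Iterate system (58) from h = 0, beta = 0; return (beta, grid values of w)."""
    cos_phi = [math.cos(2.0 * math.pi * j / n) for j in range(n)]
    h = [0.0] * n
    beta = 0.0
    for _ in range(steps):
        corr = [h[j] - h[0] * cos_phi[j] for j in range(n)]      # h - h(0) cos(phi)
        w = [big_c + eps * cos_phi[j] + corr[j] for j in range(n)]
        g = [1.5 * eta * (1.0 + 2.0 * beta) * w[j] ** 2 - 2.0 * beta * corr[j]
             for j in range(n)]
        a1 = cos_coeff(g, 1)
        beta = a1 / (2.0 * eps)                                  # second line of (58)
        h = solve_key_equation([g[j] - a1 * cos_phi[j] for j in range(n)], modes)
    w = [big_c + eps * cos_phi[j] + h[j] - h[0] * cos_phi[j] for j in range(n)]
    return beta, w


eta, eps = 1e-4, 0.2
beta, w = iterate_orbit(eta, eps)
shift = 2.0 * math.pi * (math.sqrt(1.0 + 2.0 * beta) - 1.0)     # 2*pi/alpha - 2*pi
print("beta  =", beta, " first order:", 1.5 * eta)
print("shift =", shift, " first order:", 3.0 * math.pi * eta)
```

For small $\eta$ the computed $\beta$ and the resulting shift of the period stay close to the first-order values $3\eta C/2$ and $3\pi\eta$, and the side condition $a_1(h) = 0$ is enforced in every step simply by never assigning a $\cos\varphi$ mode to $h$.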

76.10. Deflection of Light in the Gravitational Field of the Sun

We want to show that in a neighborhood of the boundary of the sun a light
ray behaves as pictured in Figure 76.7 with
$$\delta = r_s/d + o(r_s/d). \tag{61}$$
Hence, a light ray is deflected by $2\delta$. For the boundary of the sun we have
$d = 0.686 \cdot 10^6$ km. Thus we obtain $2\delta = 8.7 \cdot 10^{-6}$, i.e., $2\delta = 1.75''$. This result
admits an experimental test by means of the photographic registration of stars
during a total eclipse of the sun. The observed values, however, vary between
1.4" and 2.7". For quasars (distant radio sources) the coincidence with the
theory reaches 10%.
Now we prove (61). The motion of a photon in the Schwarzschild metric is
given by equation (49), where the dot no longer means derivative with respect
to $\tau$, but instead with respect to $u$. Also, we assume that $ds = 0$. Thus once again
we can use the variational problem (52), where, however, (53) has to be
replaced with $L = 0$. Hence, by letting $C = 0$ and $\varepsilon = 1$, the arguments in the
proof of Theorem 76.D apply. From (60) we find the solution

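$$r^{-1} = r_s^{-1}\eta\left\{\left(1 - \frac{\eta}{2}\right)\cos\alpha\varphi + \frac{3\eta}{4} - \frac{\eta}{4}\cos 2\alpha\varphi\right\} + O(\eta^3), \qquad \alpha = 1 + O(\eta^2). \tag{62}$$

[Figure 76.7: light ray passing the sun at distance $d$.]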
In order to find the asymptotes, we look for the angles $\gamma_\pm$ with $r \to \infty$ for
$\varphi \to \gamma_\pm$. For $\varphi = \pi/2 + \delta$ we have
$$\cos\varphi = -\sin\delta = -\delta + O(\delta^3)$$
and
$$\cos 2\varphi = -\cos 2\delta = -1 + O(\delta^2).$$
By letting the right-hand side of (62) be equal to zero we obtain
$$\delta - \eta + O_2(\delta, \eta) = 0.$$
This means $\gamma_\pm = \pm(\pi/2 + \eta + O(\eta^2))$.
The parameter $\eta$ can be determined from $r = d$ and $\varphi = 0$. This implies
$$\eta = r_s/d.$$
For the boundary of the sun we have $d = 0.7 \cdot 10^6$ km and $\eta = 4 \cdot 10^{-6}$. This
shows that in fact the parameter $\eta$ is small.
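
The numerical value of the total deflection follows directly from (61). The sketch below evaluates $2\delta = 2r_s/d$ and converts it to angular seconds. The inputs $r_s = 2.953$ km and $d = 0.696 \cdot 10^6$ km are rounded standard values assumed for this illustration; they differ slightly from the rounded figures quoted in the text.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def total_deflection_arcsec(r_s_km, d_km):
    """Return the first-order total deflection 2 * delta = 2 * r_s / d in angular seconds."""
    delta = r_s_km / d_km          # first-order value from (61)
    return 2.0 * delta * ARCSEC_PER_RAD


# Assumed inputs: Schwarzschild radius and radius of the sun in km.
deflection = total_deflection_arcsec(2.953, 0.696e6)
print(f"2 delta = {deflection:.2f} arcsec")
```

The result is close to the value 1.75" stated above.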

76.11. Red Shift in the Gravitational Field


We consider again the Schwarzschild metric
$$ds^2 = c^2(1 - r_s/r)\,dt^2 - r^2(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2) - (1 - r_s/r)^{-1}\,dr^2.$$
For the radial motion of a photon we find the following equation
$$t = t_0 + c^{-1}\int_{r_0}^{r_1} (1 - r_s/r)^{-1}\,dr$$
from $ds = 0$. Consider two points $P_0$ and $P_1$ which lie on a radial ray through
the origin with $r$-coordinates $r_0$ and $r_1$ with $r_j > r_s$ for $j = 0, 1$. Two signals
which are emitted at $P_0$ with a time interval $\Delta t$ reach $P_1$ with the same time
interval $\Delta t$. We now use atomic clocks in $P_j$. They show the proper time $\tau$.
According to Section 76.8 we have
$$\Delta\tau_j = (1 - r_s/r_j)^{1/2}\,\Delta t.$$
The frequency of light $\nu_j$ which is observed in $P_j$ satisfies $\nu_1/\nu_0 = \Delta\tau_0/\Delta\tau_1$. For
the wave length $\lambda_j$ we then obtain the relative change
$$\frac{\Delta\lambda}{\lambda} = \frac{\lambda_1 - \lambda_0}{\lambda_0} = \left(\frac{1 - r_s/r_1}{1 - r_s/r_0}\right)^{1/2} - 1 \approx \frac{r_s}{2}\left(\frac{1}{r_0} - \frac{1}{r_1}\right)$$
because of $\lambda = c/\nu$. The last value is true for small $r_s/r_j$. For $r_1 > r_0 > r_s$ we
have $\lambda_1 > \lambda_0$, i.e., in $P_1$ we observe a red shift with respect to $P_0$.

EXAMPLE 76.7 (Experiment On the Earth). Consider a $\gamma$-source on the surface
of the earth and a receiver (iron absorber) in a tower of height 22.5 m. Because
of
$$r_s = 8.84 \cdot 10^{-3}\ \text{m}, \quad r_0 = 6.37 \cdot 10^6\ \text{m} \quad\text{and}\quad r_1 = r_0 + 22.5\ \text{m}$$
we obtain
$$\Delta\lambda/\lambda = 2.5 \cdot 10^{-15}.$$

In 1960, this experiment was performed by Pound and Rebka at
Harvard University with a brilliant confirmation of the theoretical result.
These very small changes in the wave lengths can be measured by using the
Mössbauer effect (recoilless absorption of $\gamma$-quanta).
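
The number in Example 76.7 can be checked with the first-order expression $(r_s/2)(1/r_0 - 1/r_1)$ for the relative change of the wave length. The sketch below uses the three values given in the example. The difference of the reciprocals is rewritten as $(r_1 - r_0)/(r_0 r_1)$, a choice made here to avoid loss of significant digits in floating-point arithmetic.

```python
def relative_wavelength_shift(r_s, r0, r1):
    """First-order red shift (r_s / 2) * (1/r0 - 1/r1), written as one fraction."""
    return 0.5 * r_s * (r1 - r0) / (r0 * r1)


R_S_EARTH = 8.84e-3      # Schwarzschild radius of the earth in m (from the example)
R_SURFACE = 6.37e6       # radius of the earth in m (from the example)
TOWER_HEIGHT = 22.5      # height of the tower in m (from the example)

shift = relative_wavelength_shift(R_S_EARTH, R_SURFACE, R_SURFACE + TOWER_HEIGHT)
print(f"relative change of the wave length = {shift:.2e}")
```

The result rounds to the value $2.5 \cdot 10^{-15}$ given in the example.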

76.12. Virtual Singularities, Continuation of Space-Time Manifolds, and the Kruskal Solution

Until now we only considered the Schwarzschild solution
$$ds^2 = c^2(1 - r_s/r)\,dt^2 - r^2(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2) - (1 - r_s/r)^{-1}\,dr^2 \tag{63}$$
for $r > r_s$. Here $r$ is a space and $t$ a time variable. In fact, the metric (63) is also
a solution of Einstein's equations for the vacuum for $0 < r < r_s$ (see Problem
76.9). Since now the coefficients by $dt^2$ and $dr^2$ change their sign in (63), it
follows that according to our definition of Section 76.1, $r$ becomes a time and
$t$ a space variable.
(Q1) Does physics end at $r = r_s$?
We want to show that this is not the case and begin our discussion with
a simple example which illustrates the key phenomenon. Let $M = \mathbb{R}$ and
$M_\pm = \mathbb{R}_\pm$ with metric
$$ds^2 = d\xi^2 \quad\text{on } M.$$
Now on $M_\pm$ we introduce new coordinates through $\xi = \pm 2\sqrt{t}$, $0 < t < \infty$.
This gives
$$ds^2 = dt^2/t,$$
i.e., for $t = 0$ we obtain a singularity. This shows that an unfortunate choice of
chart coordinates may create apparent singularities. As an indication that,
in fact, we have a virtual singularity at $t = 0$, we compute the arclength
$$s = \int_0^t dt/\sqrt{t} = 2\sqrt{t} \quad\text{for } t > 0.$$
This is finite. In the following section we will see that a space ship in free fall
in the metric (63) only needs a finite proper time to reach the boundary $r = r_s$.
This leads to the following additional question:


(Q2) Is it possible to extend the two disjoint space-time manifolds $S_>$ and
$S_<$, which correspond to (63) for $r > r_s$ and $0 < r < r_s$, to a space-time
manifold containing both?
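As a matter of fact we now construct such an extension, which at the same
time is maximal. To this end, we introduce the Kruskal transformation by
letting
$$v_\pm = t \pm c^{-1}\left(r + r_s\ln|r/r_s - 1|\right)$$
and
$$z = \tfrac{1}{2}\left(e^{cv_+/2r_s} + \eta\,e^{-cv_-/2r_s}\right), \qquad w = \tfrac{1}{2}\left(e^{cv_+/2r_s} - \eta\,e^{-cv_-/2r_s}\right),$$
with $\eta = 1$ for $r > r_s$ and $\eta = -1$ for $0 < r < r_s$. This implies
$$w^2 - z^2 = (1 - r/r_s)\,e^{r/r_s}, \qquad \frac{w}{z} = \begin{cases} \tanh(ct/2r_s) & \text{for } |w/z| < 1,\\ \coth(ct/2r_s) & \text{for } |w/z| > 1. \end{cases} \tag{64}$$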
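
The two identities in (64) can be checked numerically. The sketch below assumes the definition $v_\pm = t \pm c^{-1}(r + r_s\ln|r/r_s - 1|)$, which is the choice consistent with (64); this definition, the function name and the default units $r_s = c = 1$ are assumptions of the illustration.

```python
import math


def kruskal(r, t, r_s=1.0, c=1.0):
    """Map Schwarzschild coordinates (r, t) to Kruskal coordinates (z, w).

    Assumes v_pm = t +/- (r + r_s * ln|r/r_s - 1|) / c, chosen to match (64).
    """
    if r <= 0.0 or r == r_s:
        raise ValueError("need 0 < r < r_s or r > r_s")
    eta = 1.0 if r > r_s else -1.0
    r_star = r + r_s * math.log(abs(r / r_s - 1.0))
    v_plus = t + r_star / c
    v_minus = t - r_star / c
    a = math.exp(c * v_plus / (2.0 * r_s))
    b = eta * math.exp(-c * v_minus / (2.0 * r_s))
    return 0.5 * (a + b), 0.5 * (a - b)


for r, t in [(2.0, 1.0), (3.0, -0.7), (0.5, 1.0)]:
    z, w = kruskal(r, t)
    lhs = w * w - z * z
    rhs = (1.0 - r) * math.exp(r)          # (1 - r/r_s) e^{r/r_s} with r_s = 1
    print(f"r = {r}, t = {t}: w^2 - z^2 = {lhs:.6f}, expected {rhs:.6f}, w/z = {w / z:.6f}")
```

Outside the radius $r_s$ the ratio $w/z$ equals $\tanh(ct/2r_s)$, inside it equals $\coth(ct/2r_s)$, and in both regions $w^2 - z^2$ depends on $r$ alone.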
This way we obtain from (63) the Kruskal metric
$$ds^2 = \frac{4r_s^3}{r}\,e^{-r/r_s}(dw^2 - dz^2) - r^2(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2) \tag{65}$$
with $r = r(w, z)$. In order to understand this metric, we consider the hyperbola
$w^2 - z^2 = 1$ which corresponds to $r = 0$, and the two diagonals $w = \pm z$
which correspond to $t = \pm\infty$. As shown in Figure 76.8, the hyperbola and
the two diagonals form the boundaries of open sets, which will be denoted as
follows:
(i) $K_W$ universe with a black hole;
$K_{out}$ outer universe in $K_W$;
$K_B$ black hole in $K_W$;
(ii) $K_W^*$ universe with a white hole;
$K_{out}^*$ outer universe in $K_W^*$;
$K_B^*$ white hole in $K_W^*$.
(iii) $K$ Kruskal universe (universe with a black-white dipole hole);
$I$ unreal universe.
For example, $I$ is equal to the closed set, shaded in Figure 76.8, which
is bounded by the two branches of the hyperbola. Moreover, we have $K =
\mathbb{R}^2 - I$.
The physical meaning of these suggestive notations will become clear during
the following sections. The variables $(r, t)$ vary in the form $r > 0$ and $-\infty <
t < \infty$. The variables $(z, w)$ vary in $K$. The coordinate lines $r = const$ in $K$ are
hyperbolas, while the coordinate lines $t = const$ correspond to straight lines
through the origin. In Figure 76.9(a) we indicate the direction of increasing $r$-
and $t$-values. We now construct the set
$$E_4(K) = S^2 \times K.$$
Analogously, we let $E_4(K_{out}) = S^2 \times K_{out}$, etc. The points of $E_4(K)$ correspond
to the coordinates $(\varphi, \vartheta, z, w)$, where $\varphi$, $\vartheta$ are spherical coordinates in
$S^2$ and $(z, w)$ lies in $K$. Here, $S^2$ denotes the boundary of the unit ball in $\mathbb{R}^3$.

[Figure 76.8: (a) the $(z, w)$-plane with the hyperbola $r = 0$ and the diagonals $r = r_s$, $t = +\infty$ and $r = r_s$, $t = -\infty$; (b) the regions $K_B$, $K_W$, $K_B^*$ and $K$.]

Theorem 76.E (Kruskal (1960)). With the Kruskal metric (65), the set $E_4(K)$
becomes a Riemannian $C^\infty$-manifold of signature type $(-1, -1, -1, 1)$. On
$E_4(K)$ the metric tensor satisfies Einstein's field equations for the vacuum.

[Figure 76.9: (a) coordinate lines $t = const$ and $r = const$ in the $(z, w)$-plane with the directions of increasing $r$- and $t$-values; (b) orientation of radial light rays in the $(z, w)$-plane.]

PROOF. The components $g_{ij}$ of the metric tensor in (65) are $C^\infty$-functions on
$E_4(K)$ with $\det g_{ij} \neq 0$. The virtual singularities for $\vartheta = 0$ and $\vartheta = \pi$ only
appear in spherical coordinates. They disappear in the usual way if one uses
chart coordinates for $S^2$. For $0 < r < r_s$ and $r > r_s$, equation (65) follows from
the Schwarzschild solution by using a coordinate transformation. Thus, in
this region, (65) satisfies Einstein's equation
$$R_{ij} - \tfrac{1}{2}g_{ij}R = 0.$$
By continuity, $g_{ij}$ is also a solution for $r = r_s$, and hence a solution on $K_W$.
Using a reflection, one obtains $K_W^*$ from $K_W$, where $g_{ij}$ is equal in corresponding
points. Therefore, $g_{ij}$ satisfies $R_{ij} - \tfrac{1}{2}g_{ij}R = 0$ on $K$. □

Hence (65) is the desired extension of (63), where $K_{out}$ and $K_B$ in Figure
76.8(b) correspond to the Schwarzschild solutions for $r > r_s$ and $0 < r < r_s$.
More precisely, we have $S_> = E_4(K_{out})$ and $S_< = E_4(K_B)$.
All radial light rays in $E_4(K)$, i.e., all light rays with $\varphi = const$ and $\vartheta = const$
satisfy $ds = 0$, so that
$$w = \pm z + const, \qquad \varphi = const, \quad \vartheta = const \tag{66}$$
holds. Hence all radial light rays in Fig. 76.8(a) are straight lines with slope
$\pm 1$. One easily shows that (66) is a solution of the equations of motion, which
are the Euler equations of the variational problem (15).
The main reason for introducing the coordinates w and z is the simple form
which the equation assumes for radial light rays. In the following we will see
how this simplifies the treatment of the qualitative behavior of black holes.

Definition 76.8. Radial light rays are oriented in such a way that the future
corresponds to increasing values of w (Fig. 76.9(b)).

According to Figure 76.9(a) and Figure 76.8, this corresponds to the following
increasing values:
$$t \text{ in } K_{out} \quad\text{and}\quad -r \text{ in } K_B,$$
$$t^* \text{ in } K_{out}^* \quad\text{and}\quad -r^* \text{ in } K_B^*,$$
where $t^* = -t$ and $r^* = -r$. This corresponds to the fact that the space coordinate
for the outer universe $K_{out}$ becomes a time coordinate inside the black
hole $K_B$.
One can prove that the Kruskal metric cannot be extended beyond $K$, since
on the boundary $\partial K$ which is formed by the hyperbolas $r = 0$, there exists a
proper singularity whose physical meaning we now explain.

76.13. Black Holes and the Sinking of a Space Ship

After a short while, I was obsessed with the whirlpool. In spite of the sacrifice
it would mean, I felt a strong desire to explore its depths; and that, what hurt
the most, was the fact that I would never be able to tell my old comrades about
the secrets that I would see.
Edgar Allan Poe: Down into the Maelstrom.
Many astrophysicists agree that Cyg X-1 is a black hole.
Walter Sullivan (1979)

Intuitively, one means by a black hole a region in the cosmos which has such
a strong gravitational force that neither light nor matter can leave it. Like a
moloch, a black hole swallows all surrounding matter, which thereby is heated
and emits X-rays as a cry of death. Therefore one expects that the X-ray source
Cyg X-1 in the constellation of the swan is a black hole. These days, one could
read in some newspapers that astronomers at Caltech in Pasadena discovered
heated matter which tumbles into a black hole at the center of our universe.
They studied photographs made with a large radio telescope in the desert of
Socorro (New Mexico). This black hole apparently has a mass of 200 to $2 \cdot 10^6$
times the mass of the sun. One also expects that very bright quasars, which are
located at huge distances, contain black holes. This will be discussed in Section
76.16. There we also briefly explain how black holes may occur at the end of
the development of a star. For this reason one expects numerous black holes
throughout the cosmos.

[Figure 76.10: black hole of radius $r_s$.]

We now use the Schwarzschild metric and the Kruskal transformation to
describe a mathematical model of a black hole, which is located at the origin
of a Cartesian coordinate system $\Sigma$ and which has radius $r_s$
(Fig. 76.10). We choose spherical coordinates $r, \varphi, \vartheta$ and use
the Schwarzschild metric (63). More precisely, we extend this metric to the
Kruskal metric on $E_4(K_w)$. In order to illustrate this situation, we choose
a fixed plane with $\varphi = \mathrm{const}$ and $\vartheta = \mathrm{const}$.
In the $(z, w)$-coordinate system, the space-time manifold corresponds to the
nonshaded open set in Figure 76.11(a). It consists of the following open sets:

$K_{out}$: exterior universe in $\Sigma$ with $r > r_s$, $-\infty < t < \infty$,
$K_B$: black hole in $\Sigma$ with $0 < r < r_s$, $-\infty < t < \infty$,

and the diagonal $w = z$, $z \ge 0$, which corresponds to the boundary
$\partial K_B$ of the black hole ($r = r_s$ and "$t = +\infty$").

[Figure 76.11: $(z, w)$-diagrams, panels (a), (b), (c); the diagonals are
labeled $r = r_s$, $t = +\infty$ and $r = r_s$, $t = -\infty$]

In order to determine the mass $M$ of a black hole, we assume the same
relation as in Section 76.8, i.e.,
$$M = r_s c^2/2G.$$
For $r_s = 3$ km (or $r_s = 1$ cm) this is the mass of the sun (or the mass of
the earth), and implies an extraordinarily high mass density.
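
The relation between mass and Schwarzschild radius stated above can be checked numerically. The following is a minimal sketch; the numerical values of the constants and of the masses of the sun and the earth are rounded SI values supplied here as assumptions, and the function names are purely illustrative.

```python
# Rounded SI constants (assumed values, not taken from the text).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # velocity of light, m/s
M_SUN = 1.989e30     # mass of the sun, kg
M_EARTH = 5.972e24   # mass of the earth, kg


def schwarzschild_radius(mass_kg):
    """Return r_s = 2 G M / c^2 in meters (the relation M = r_s c^2 / 2G solved for r_s)."""
    return 2.0 * G * mass_kg / C**2


def mass_from_radius(r_s_m):
    """Return M = r_s c^2 / (2 G) in kilograms."""
    return r_s_m * C**2 / (2.0 * G)


if __name__ == "__main__":
    print(f"sun:   r_s = {schwarzschild_radius(M_SUN) / 1e3:.2f} km")
    print(f"earth: r_s = {schwarzschild_radius(M_EARTH) * 1e2:.2f} cm")
```

With these inputs the radius for the sun comes out close to 3 km and the radius for the earth close to 1 cm, in agreement with the orders of magnitude quoted in the text.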

EXAMPLE 76.9 (Light Trap). According to Section 76.12, the radial light rays
are the straight lines with slope ±1 in the (z, w)-diagram. Important for a
qualitative understanding of the following considerations is the fact that
according to the previous section, increasing w-values for the light rays corre-
spond to increasing time. In all figures below, the arrows of the light rays
correspond to increasing time.
Exactly two radial light rays $L_+$ and $L_-$ pass through a point $P$ in the
black hole $K_B$ (Fig. 76.11(b)). The ray $L_+$, however, remains in $K_B$,
while $L_-$ corresponds to a ray which falls into $K_B$. Both rays reach the
hyperbola in Figure 76.11(b), i.e., reach the singularity $r = 0$. We therefore
obtain the important result that no radial light ray can leave the black hole
$K_B$.
If a pair of photons is located on the boundary of the black hole, then, as
in Figure 76.11 (c), one of the two photons may disappear into the black hole
while the other is radiated into the universe. In Section 76.17 we will use this
effect in order to derive the formula for the vaporization of black holes.

EXAMPLE 76.10 (The Sinking of a Space Ship). From the earth, $r = r_E$, which
is at rest in the system $\Sigma$, a space ship A takes off in a radial
direction towards the black hole (Fig. 76.12). In order to save energy, the
crew shuts off the engines, so that A falls freely. Then the following holds:
(i) The space ship A needs only a finite proper time in order to reach the
boundary $r = r_s$ of the black hole.
(ii) Observers on the earth find that, for the proper time on the earth, the
space ship A never reaches the boundary $r = r_s$, i.e., it takes an infinite
amount of time.
(iii) If space ship A has reached the boundary $r = r_s$, then it is in a
hopeless situation. After proper time
$$\tau \le \pi r_s/c$$
space ship A will have crashed at the singularity $r = 0$, even after the most
forceful braking with its rockets. If $M$ is the mass of the black hole, then
$r_s = 2GM/c^2$, and this bound reads $\tau \le 2\pi GM/c^3$.

[Figure 76.12: black hole]

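We prove (i) and (ii). From the variational principle of Theorem 76.A of
Section 76.2 and the Schwarzschild metric (63) we obtain the equations of
motion
$$(1 - r_s/r)\,\dot t = \mathrm{const} = T,$$
$$(1 - r_s/r)\,c^2\dot t^2 - (1 - r_s/r)^{-1}\,\dot r^2 = c^2$$
(see (53), (54)). Here the dot denotes the derivative with respect to the
proper time $\tau$. We choose $\dot t(0) > 1$. This implies
$T > 1 - r_s/r_E$ and
$$c^2T^2 \ge \dot r^2 = c^2T^2 - c^2(1 - r_s/r) \ge \mathrm{const} > 0$$
for $r_s \le r \le r_E$. The proper time needed by space ship A to travel from
$r_E$ to $r_s$ is
$$\Delta\tau = \int_{r_E}^{r_s} dr/\dot r < \infty.$$
The $t$-time needed is
$$\Delta t = \int_{r_E}^{r_s+\varepsilon} \dot t\,dr/\dot r
  \ge \mathrm{const}\int_{r_E}^{r_s+\varepsilon} \dot t\,dr \to \infty
  \quad\text{as } \varepsilon \to 0$$
because of $\dot t = T/(1 - r_s/r)$. According to (63) with $r = r_E$,
$\vartheta = \varphi = \mathrm{const}$, the proper time which is measured on
the earth is proportional to $\Delta t$.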
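
The contrast between (i) and (ii) can be made visible numerically. The sketch below integrates the two equations of motion for a radial fall in units with $r_s = c = 1$; the starting radius `R_E = 10` and the constant `T = 1` are hypothetical choices made only for illustration (they satisfy $T^2 > 1 - r_s/r_E$), and the midpoint rule is just one simple way to evaluate the integrals.

```python
import math

R_S = 1.0   # Schwarzschild radius (units with r_s = 1)
C = 1.0     # velocity of light (units with c = 1)
R_E = 10.0  # starting radius r_E, hypothetical value
T = 1.0     # constant of motion, hypothetical value with T^2 > 1 - r_s/r_E


def r_dot_abs(r):
    """|dr/dtau| from r_dot^2 = c^2 T^2 - c^2 (1 - r_s/r)."""
    return math.sqrt(C**2 * T**2 - C**2 * (1.0 - R_S / r))


def t_dot(r):
    """dt/dtau = T / (1 - r_s/r), valid for r > r_s."""
    return T / (1.0 - R_S / r)


def midpoint(f, a, b, n=20000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def proper_time():
    """Proper time for the fall from r_E to r_s."""
    return midpoint(lambda r: 1.0 / r_dot_abs(r), R_S, R_E)


def coordinate_time(eps):
    """t-time for the fall from r_E to r_s + eps.

    The substitution r = r_s + exp(u) resolves the pole of t_dot at r = r_s.
    """
    def integrand(u):
        x = math.exp(u)
        r = R_S + x
        return t_dot(r) / r_dot_abs(r) * x
    return midpoint(integrand, math.log(eps), math.log(R_E - R_S))


if __name__ == "__main__":
    print("proper time:", proper_time())
    for eps in (1e-1, 1e-3, 1e-6):
        print("eps =", eps, " t-time =", coordinate_time(eps))
```

In this sketch the proper time stays fixed while the $t$-time grows roughly like $\ln(1/\varepsilon)$ as $\varepsilon \to 0$.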
Now we prove (iii). The transformation
$$cv = ct + r + r_s\ln|1 - r/r_s|$$
implies for the Schwarzschild metric
$$ds^2 = (1 - r_s/r)\,c^2\,dv^2 - 2c\,dr\,dv
  - r^2(\sin^2\vartheta\,d\varphi^2 + d\vartheta^2).$$
This is a nonsingular metric for $r > 0$ (Eddington metric). For an arbitrary
motion $r = r(\tau)$ and $v = v(\tau)$ we have $\dot s^2 = c^2$, and hence
$$(1 - r_s/r)\,c^2\dot v^2 - 2c\,\dot r\dot v
  - r^2((\sin^2\vartheta)\dot\varphi^2 + \dot\vartheta^2) = c^2.$$
For $r \le r_s$ we therefore cannot have $\dot r = 0$. It follows that
$\dot r < 0$ for $r \le r_s$, because for $r = r_s$ we have $\dot r < 0$ since
space ship A travels radially towards the boundary. It follows from the
Schwarzschild metric that
$$c^2(1 - r_s/r)\,\dot t^2 - (1 - r_s/r)^{-1}\,\dot r^2
  - r^2((\sin^2\vartheta)\dot\varphi^2 + \dot\vartheta^2) = c^2.$$
A look at the signs shows that $-(1 - r_s/r)^{-1}\dot r^2 \ge c^2$ for
$0 < r < r_s$. From $\dot r < 0$ it follows that
$$\tau_{\max} = -\int_0^{r_s} dr/\dot r
  \le c^{-1}\int_0^{r_s} [(r_s/r) - 1]^{-1/2}\,dr = \pi r_s/(2c) \le \pi r_s/c.$$
This is (iii). Note that $\varphi = \vartheta = \mathrm{const}$.

[Figure 76.13: $(z, w)$-diagram of the white hole]
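
The integral which bounds the remaining proper time in the proof of (iii) can be evaluated numerically. The sketch below works with the normalized variable $x = r/r_s$, so that the integral becomes $\int_0^1 [(1/x) - 1]^{-1/2}\,dx$; the substitution $x = 1 - u^2$ and the midpoint rule are choices made here for illustration. The computed value is close to $\pi/2$, so this computation gives the bound $\pi r_s/(2c)$, which in particular lies below $\pi r_s/c$.

```python
import math


def normalized_fall_integral(n=200000):
    """Midpoint-rule value of I = int_0^1 [(1/x) - 1]^(-1/2) dx with x = r / r_s.

    The substitution x = 1 - u^2 turns I into int_0^1 2 sqrt(1 - u^2) du,
    which has a bounded integrand.
    """
    h = 1.0 / n
    return h * sum(2.0 * math.sqrt(1.0 - ((k + 0.5) * h) ** 2) for k in range(n))


def max_proper_time(r_s, c):
    """Upper bound (r_s / c) * I for the proper time between r = r_s and r = 0."""
    return r_s / c * normalized_fall_integral()


if __name__ == "__main__":
    print("I      =", normalized_fall_integral())
    print("pi / 2 =", math.pi / 2.0)
```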

Other important properties of black holes will be considered in Section 76.16.

76.14. White Holes


Consider the same Cartesian coordinate system $\Sigma$ as in Section 76.13. We
replace, however, time $t$ with $t^* = -t$. Then the black hole $K_B$ in
Figure 76.11 becomes a white hole $K_B^*$ with the same mass. Because of time
reversal, the light rays which fall into the black hole now correspond to
light rays which emerge from the white hole. The space-time manifold is now
$E_4(K_w^*)$. In the $(z, w)$-diagram this corresponds to the nonshaded open
set of Figure 76.13. This figure shows that light rays which emerge from the
white hole $K_B^*$ may leave it into the exterior universe $K_{out}^*$.

76.15. Black-White Dipole Holes and Dual Creatures Without Radio Contact to Us

My suspicion is that the universe is not only queerer than we suppose, but
queerer than we can suppose.
J. Haldane

[Figure 76.14: the Kruskal universe in the $(z, w)$-diagram, panels (a) and (b)]

We now discuss the whole Kruskal universe. In the $(z, w)$-diagram this
corresponds to the nonshaded open set of Figure 76.14(a). This universe
consists of the two exterior universes $K_{out}$ and $K_{out}^*$, a black hole
$K_B$, and a white hole $K_B^*$. As in Figure 76.14(b), radial light rays
correspond to straight lines with slope $\pm 1$, and the future direction is
shown by the arrow. Therefore radial light rays cannot stay in the white hole,
but instead are emitted into both $K_{out}^*$ and $K_{out}$. All light rays in
the black hole $K_B$ remain captured. Suppose that we live in $K_{out}$; then
we cannot send radial light rays to $K_{out}^*$, nor can we receive them from
there.
We now investigate arbitrary physical signals. For the propagation of
physical signals we have $\dot s^2 \ge 0$. It follows from the Kruskal metric
(65) that $\dot w^2 \ge \dot z^2$ in $E_4(K)$. This means $|dw/dz| \ge 1$.
Consequently, the motion of matter and light in the Kruskal universe $K$
(Fig. 76.14) is described by curves of the form
$$w = w(z) \quad\text{with}\quad |w'(z)| \ge 1.$$
For this reason $K_{out}$ and $K_{out}^*$ cannot be causally influenced by
each other. Neither can we visit the dual creatures in $K_{out}^*$, nor can we
send radio signals to them.
During the last years, singularities of solutions of Einstein's equations have
been intensely investigated. Roughly, the result was that singularities are not
the exception but are instead quite common. This might lead to some surprises
in astrophysics. In this direction we recommend Hawking and Ellis (1973, M),
Tipler, Clarke and Ellis (1980, S), and Seifert (1983, S).

76.16. Death of a Star


Thinking creatures on a distant star, who are able to identify parts of our
television program, will be surprised that the occupants of planet Earth are
mainly interested in the quality of detergents and similarly important things, or
have been. Even after millions of years we may create a ridiculous impression.
Harald Fritzsch (1983)

Among the most interesting physical problems we encounter the study of star
models. The stimulus of this problem is that a comprehensive knowledge of
many physical disciplines is needed: Thermodynamics and statistical physics,
the general theory of relativity, elementary particle physics, plasma physics,
etc. One of the difficulties is that in part, matter is exposed to extreme
conditions. In Section 68.7 the classical basic equation for star models has
been motivated and used to calculate the critical Chandrasekhar mass for
white dwarf stars. In Problem 76.13 we give the relativistic basic equation
for star models, which contains the classical equation as a limiting case. In
Problem 76.14 we consider the gravitational collapse. In the context of star
models we recommend the monographs of Chandrasekhar (1939), Zeldovic
and Novikov (1971), and Weinberg (1972).
A very good survey about the developments of stars is gained from com-
plicated computer simulations. We only make some general remarks.
(i) Hertzsprung-Russell diagram. For astronomers, the two most important
quantities for a star are its absolute temperature $T$ and its luminosity $L$.
The luminosity is the energy which the star emits per second. For the sun one
sets $L = 1$. This corresponds to $4 \cdot 10^{26}$ W. In Figure 76.15 the
so-called Hertzsprung-Russell diagram for $L$ and $T$ is depicted in a very
schematic way. Most stars belong to the so-called main sequence.
(ii) History of the sun. This history is shown in Figure 76.15. The sun
originated about five milliard years ago (one milliard $= 10^9$). At this time
our region of the universe was dark and bitter cold. There was only a huge
cloud of interstellar dust with as many atoms in 2,500 km$^3$ as are today in
1 cm$^3$ of air. One day this cloud exploded. There are two hypotheses for
this. The reason may have been that a spiral arm of our Milky Way had been
moving through this cloud or it may have been the shock wave of a supernova.
Together with the sun the planets appeared. The comets are probably the
remains of the material which was left over during the creation of the
planets.

[Figure 76.15: Hertzsprung-Russell diagram (schematic): luminosity $L$ against
temperature $T$ (ticks at 3,000 K, 6,000 K, 10,000 K), with the main sequence,
the protosun, and the white dwarfs (near $L = 10^{-2}$) marked]

The sun produces its energy through nuclear fusion, i.e., by the burning of
hydrogen into helium. During the next five milliard years the sun will not
change very much. At the end of this epoch its luminosity $L$ and its diameter
$d$ will have doubled, and already this leads to serious climatic difficulties
on the earth. After eight milliard years $L$ (and $d$) will have increased by a
factor 2,000 (and 100). The sun becomes a red giant. At this time life on the
earth is no longer possible. The reason for the growth of the sun is the
following. If large parts of the hydrogen are burned, the central region
becomes unstable. The equilibrium between radiation and gravitation is
disturbed. The boundary of the sun expands. The central region contracts and
increases its temperature from $15 \cdot 10^6$ K to $10^8$ K. Thereby the
Salpeter process can take place where, among other things, helium is burned
into beryllium under the production of energy. After the nuclear fusion has
come to an end, the sun contracts, due to a gravitational collapse. This way a
white dwarf occurs with a very high density of approximately 1,000 kg/cm$^3$.
After cooling off, the white dwarf slowly becomes a black dwarf. This process
is finished after approximately $10^5$ milliard years. One might question,
however, if the universe will become that old.
(iii) The critical Chandrasekhar mass $M_{crit} = 1.2\,M_{sun}$. All stars with
a mass $M < M_{crit}$ take a similar development as the sun.
If $M > M_{crit}$, then a huge supernova explosion may occur, whereby great
parts of the mass of the star are thrown into the universe. If the remaining
mass is smaller than twice the mass of the sun, then a neutron star occurs
which consists of very densely packed neutrons. On average such a neutron
star has a diameter of only 30 km; its density, however, is $10^9$ times
larger than that of a white dwarf.
If, after a supernova explosion, the mass is bigger than twice the mass of
the sun, then a black hole occurs due to the gravitational collapse. One
estimates that $10^{11}$ suns and $10^6$ black holes exist in our Milky Way.
Furthermore, one expects that there exist at least $10^{11}$ galaxies.
(iv) Pulsars. These were discovered in 1967 in Cambridge (England) by the
graduate student Jocelyn Bell. They are very rapidly oscillating radio sources
with a period between 0.03 s and 4 s. One expects that they are very rapidly
rotating neutron stars with a strong magnetic field which is not parallel to
the rotational axis and which therefore rotates as well. The best-known pulsar
is located in the Crab Nebula. There, a supernova explosion occurred in
A.D. 1054 which was observed by Chinese astronomers. The X-ray emission of
Cyg X-1 cannot be explained by a pulsar, and seems to come from a very small
source region. One expects that Cyg X-1 has a black hole as a binary star
companion, analogously to Sirius, which has a white dwarf as a companion.
(v) Quasars and the red limit of the cosmos. Among the most interesting
astronomical objects are the quasars. The first quasar was detected in 1960
by Allan Sandage at Caltech (Mount Wilson Observatories), using the modern
methods of radio astronomy. These quasars have a very strange spectrum, which
at the beginning could not be explained at all. At first one thought that
quasars were related to stars, and therefore called them quasi-stellar objects
(quasars). Today we know that this was in error. Only in 1963 was it discovered
at Caltech that the spectrum of quasars could be explained by enormous red
shifts. Today approximately 2,000 quasars are known. Some of them are
$10^3$ times brighter than our galaxy, and the radiation comes from a region
$10^5$ times smaller than the diameter of our galaxy. Enormous energies are
emitted. The nucleus of NGC 4151, for example, is not bigger than the interior
of our solar system. However, its X-ray emission is of the order of magnitude
of the emission of one milliard suns over the entire electromagnetic spectrum.
Today, one expects that quasars are young galaxies from the early history
of our cosmos where huge explosions take place. It is possible that the centers
of quasars contain black holes which absorb huge amounts of matter. But the
physical mechanism of quasar explosions is not yet well understood. Also
remarkable are the enormous red shifts. The light of the most distant quasars
takes approximately 15 milliard years to reach us. This is about the
age of the universe. If, according to the Doppler effect of Problem 58.6, one
computes the classical escape velocities from the red shift, one finds that the
escape velocities for quasars are close to the velocity of light. If the age of
the universe is $t$ years, then we can observe only objects from which light
has traveled for at most $t$ years. It seems that for quasars this limit has
almost been reached. Because of the red shift involved, this limit is also
called the "red limit." Timothy Ferris has written a very interesting book
about this. In connection with the exciting history of modern astronomy and
cosmology, we recommend this book as well as the book of Sullivan (1979).
All numerical data about our cosmos should be treated with some caution,
because for very distant objects, astronomers can only measure the red shift
in the spectrum. All statements about the corresponding distances in light
years have a great error margin; this is because the Hubble constant is only
approximately known and also because we derived Hubble's law (40) only under
the assumption that the travel time of light, i.e., the distance, was
sufficiently small.
(vi) Black holes. In the context of the general theory of relativity, black
holes are generally described by the Kerr-Newman solution:
$$ds^2 = [1 - B^{-1}(r_s r - Q^2)]\,c^2\,dt^2
  - [r^2 + a^2 + B^{-1}(r_s r - Q^2)\,a^2\sin^2\vartheta]\,\sin^2\vartheta\,d\varphi^2$$
$$\qquad - B\,d\vartheta^2 - BD^{-1}\,dr^2
  + B^{-1}[(r_s r - Q^2)\,2a\sin^2\vartheta]\,c\,dt\,d\varphi,$$
where $a^2 + Q^2 \le r_s^2/4$ and
$$D = r^2 - r_s r + a^2 + Q^2,$$
$$B = r^2 + a^2\cos^2\vartheta.$$
This is a solution of Einstein's equation for the vacuum, which becomes the
Schwarzschild solution for $a = Q = 0$. This solution can be interpreted as a
charged, rotating black hole with mass $M = r_s c^2/2G$, charge $Q$, and
angular momentum $J = Ma$. The black hole rotates around the axis
$\vartheta = 0$.
One expects that black holes, which represent the final state of a star, are
of precisely this form, i.e., from the many parameters which describe a star,
only the three parameters mass $M$, charge $Q$, and angular momentum $J$
remain observable for the outer world. It is also expected that black holes are
very stable objects.
As literature on this subject we recommend the fundamental papers of
Hawking (1972), (1973), (1975) and the standard work Chandrasekhar (1983).

76.17. Vaporization of Black Holes


We want to show that a black hole at rest cannot exist forever, but instead
has a lifetime of
(67)
where $M_0$ is the mass of the hole at the time of its creation measured
in kilograms and $M_{crit} = 10^{10}$ kg. By $M = r_s c^2/2G$, this critical
mass corresponds to a hole with Schwarzschild radius
$r_s = 1.5 \cdot 10^{-15}$ cm. Today only such miniholes may vaporize which
appeared $10^{10}$ years ago, i.e., at the time of the Big Bang. Furthermore,
a black hole of mass $M$ has the temperature
$$T = 2 \cdot 10^{14}\,(M_{crit}/M)\ \mathrm{K}.$$
In Example 76.9 we already saw that photons may not escape from a black
hole. Now we want to look at the very interesting quantum effect which causes
the "boundary" of the black hole to radiate into the universe whereby the
hole vaporizes, i.e., the hole decreases its mass and increases its temperature
(Hawking effect).
The vacuum of quantum field theory, i.e., the ground state of quantum fields,
is nonempty. It contains virtual pairs of particles which can normally not be
observed directly. Instead one observes indirect effects, e.g., on the spectrum
of the hydrogen atom (Lamb shift). Through the process of fluctuations a
virtual pair of photons $\{P_+, P_-\}$ may appear in the real world. The energy
is given by $E_\pm = \pm h\nu$, where $\nu$ is the frequency. For states with
finite lifetime we have the energy-time uncertainty relation
$$\Delta E\,\Delta t \sim h/2\pi.$$
Hence, the pair of photons has the time $\Delta t \sim 1/2\pi\nu$ to move
apart, before annihilation occurs. If, however, $\{P_+, P_-\}$ appears on the
boundary of a black hole, then according to Example 76.9, the photon $P_-$ with
negative energy may vanish into the black hole, whereas $P_+$ is radiated into
the universe. Since $P_-$ cannot leave the black hole, the annihilation
between $P_-$ and $P_+$ is impossible. Therefore, the photon $P_+$ remains in
our real world, whereas $P_-$ remains in the black hole. In fact, with $P_-$,
the black hole gains the negative energy $-E$ of $P_-$, i.e., the black hole
loses the energy $E$, and hence it loses the mass $m = E/c^2$, according to
Einstein's fundamental energy-mass relation. Note that the total energy of
this process is equal to zero, because the sum of the energies of the two
photons is equal to zero. Thus, conservation of energy is not violated.
We now assume the radiation is black, i.e., it satisfies Planck's radiation
law. According to the Stefan-Boltzmann law of Section 68.5, the energy
emitted from the hole with surface area $A$ during the time $t$ is equal to
$$E_A = \sigma A T^4 t \qquad (68)$$
with $\sigma = 2\pi^5k^4/15c^2h^3$. As a simple model computation, we consider
uncharged black holes which are at rest. The surface area is then equal to
$A = 4\pi r_s^2$ and the mass of the hole is given by
$$M = c^2 r_s/2G.$$
We measure $M$ in kilograms. In order to assign a temperature $T$ to the black
hole we note that, according to Wien's displacement law (68.37),
$$\lambda_{\max} T = hc/5k,$$
where $\lambda_{\max}$ is the wave length for which the maximum energy is
emitted. We now assume that $\lambda_{\max}$ has the same order of magnitude as
the radius $r_s$ of the black hole. Therefore, we define $T = hc/5kr_s$, hence
$$T = hc^3/10kGM.$$
This means that $T \approx (10^{24}/M)$ K if the mass is measured in
kilograms. From (68) and $A = 4\pi r_s^2$ we obtain
$$E_A = (hc^6/5M^2G^2)\,t.$$
Hence, approximately $10^{38}/M^2$ watts are emitted. Letting
$$E_A = Mc^2,$$
we find as time during which the entire mass of the hole is emitted:
$$t = 5M^3G^2/hc^4.$$
This yields assertion (67).
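
The orders of magnitude in this model computation can be reproduced with a few lines of code. The sketch below evaluates the three formulas $T = hc^3/10kGM$, $E_A/t = hc^6/5M^2G^2$, and $t = 5M^3G^2/hc^4$ in SI units; the numerical constants are rounded CODATA values supplied here as assumptions, and all function names are illustrative. It also checks that inserting $T = hc/5kr_s$ and $A = 4\pi r_s^2$ into the Stefan-Boltzmann law gives a numerical prefactor close to $1/5$.

```python
import math

# Rounded SI constants (assumed values, not taken from the text).
H = 6.62607e-34    # Planck constant, J s
C = 2.99792458e8   # velocity of light, m/s
K = 1.380649e-23   # Boltzmann constant, J/K
G = 6.6743e-11     # gravitational constant, m^3 kg^-1 s^-2


def temperature(mass_kg):
    """Model temperature T = h c^3 / (10 k G M) in kelvin."""
    return H * C**3 / (10.0 * K * G * mass_kg)


def power(mass_kg):
    """Model emission rate E_A / t = h c^6 / (5 M^2 G^2) in watts."""
    return H * C**6 / (5.0 * mass_kg**2 * G**2)


def lifetime(mass_kg):
    """Model emission time t = 5 M^3 G^2 / (h c^4) in seconds."""
    return 5.0 * mass_kg**3 * G**2 / (H * C**4)


def stefan_boltzmann_prefactor():
    """Number p with sigma * A * T^4 = p * h c^6 / (G^2 M^2).

    Obtained from sigma = 2 pi^5 k^4 / (15 c^2 h^3), A = 4 pi r_s^2,
    T = h c / (5 k r_s), and r_s = 2 G M / c^2.
    """
    return 8.0 * math.pi**6 / (15.0 * 625.0 * 4.0)


if __name__ == "__main__":
    print("T * M           =", temperature(1.0), "K kg")
    print("(E_A / t) * M^2 =", power(1.0), "W kg^2")
    print("prefactor       =", stefan_boltzmann_prefactor())
```

The products $T \cdot M$ and $(E_A/t) \cdot M^2$ come out near $2 \cdot 10^{24}$ and $2 \cdot 10^{37}$, which matches the rounded orders of magnitude used in the text.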

PROBLEMS

76.1. Characterization of the Poincaré group. Let
$Q(u) = c^2t^2 - \xi^2 - \eta^2 - \zeta^2$ with $u = (\xi, \eta, \zeta, ct)$.
Prove: A linear transformation $u' = \Lambda u + a$ is precisely a Poincaré
transformation if $Q$ remains invariant, i.e.,
$$Q(u_1' - u_2') = Q(u_1 - u_2) \quad\text{for all } u_1, u_2. \qquad (69)$$
Hint: Elementary matrix computations. See Neumark (1964, M), §7.
Many presentations of the theory of relativity begin with the invariance
assumption (69). It is often said that (69) follows immediately from the fact
that the velocity of light is constant, i.e., from (C) of Section 75.2.
Actually, only
$$Q(u_1 - u_2) = 0 \iff Q(u_1' - u_2') = 0$$
follows immediately from (C). A complete derivation of (69) from (C) which,
however, is quite long, may be found in Fock (1960, M), §8.
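
The invariance of $Q$ for differences of events can be illustrated numerically. The following sketch applies a special Lorentz transformation along the $\xi$-axis, followed by a translation, to two events $u = (\xi, \eta, \zeta, ct)$ and compares $Q$ before and after; the events, the velocity ratio `beta` $= V/c$, and the translation are arbitrary values chosen for illustration. This checks one example only and does not replace the proof asked for in the problem.

```python
import math


def Q(u):
    """Quadratic form Q(u) = (ct)^2 - xi^2 - eta^2 - zeta^2 for u = (xi, eta, zeta, ct)."""
    xi, eta, zeta, ct = u
    return ct**2 - xi**2 - eta**2 - zeta**2


def poincare(u, beta, shift):
    """Boost along the xi-axis with velocity ratio beta = V/c, then translate by shift."""
    xi, eta, zeta, ct = u
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    xi_new = gamma * (xi - beta * ct)
    ct_new = gamma * (ct - beta * xi)
    return (xi_new + shift[0], eta + shift[1], zeta + shift[2], ct_new + shift[3])


def difference(u1, u2):
    """Componentwise difference u1 - u2."""
    return tuple(a - b for a, b in zip(u1, u2))


if __name__ == "__main__":
    u1 = (1.0, 2.0, 3.0, 4.0)
    u2 = (0.5, -1.0, 2.0, 1.0)
    shift = (7.0, -3.0, 0.5, 2.0)
    before = Q(difference(u1, u2))
    after = Q(difference(poincare(u1, 0.6, shift), poincare(u2, 0.6, shift)))
    print(before, after)
```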
76.2.* Characterization of inertial systems. Let $u' = D(u)$ be a
$C^\infty$-diffeomorphism from $\mathbb{R}^4$ onto $\mathbb{R}^4$ with
$$c^2\,dt^2 - d\xi^2 - d\eta^2 - d\zeta^2
  = c^2\,dt'^2 - d\xi'^2 - d\eta'^2 - d\zeta'^2. \qquad (70)$$
Prove that $D$ is affine, i.e., $Du = Au + a$. From Problem 76.1, $D$ is then a
Poincaré transformation.
Relation (70) means that we insert $u' = D(u)$ and consider the system of
partial differential equations on $\mathbb{R}^4$ which is obtained by
comparison of the coefficients of the differentials.
Hint: See Fock (1960, M), §8.
In order to understand this result we consider the metric
$ds^2 = g_{ij}\,du^i\,du^j$ on $M_4$ of Section 75.8. Let $C$ be an admissible
chart in $M_4$ with $\mathbb{R}^4$ as chart space. We call $C$ constant if and
only if $g_{ij}$ on $C$ has the constant form (75.23). From the above it
follows: Except for space and time reflections, precisely the inertial charts
$\mathbb{R}^4(A, a)$ in $M_4$ are constant. If, therefore, one considers
$g_{ij}$ in an $\mathbb{R}^4$-chart, then, except for uninteresting space and
time reflections, one can determine whether or not one is in an inertial
chart, i.e., physically speaking, in an inertial system.
76.3. Conservation laws and differential forms.
76.3a. Proof of Proposition 75.19.
Solution: Already in Section 75.12 we showed that $Q(t)$ is a conserved
quantity. We now show that $Q$ is a scalar on $M_4$, i.e., $Q$ remains
invariant under proper Lorentz transformations (change to other inertial
systems). For simplicity, we consider the special Lorentz transformation
$$\xi' = \alpha(\xi - Vt)$$
with $\alpha^{-1} = \sqrt{1 - V^2/c^2}$. From this follows the general case,
since every proper Lorentz transformation can be obtained from such a special
transformation and rotations. The key for this is the differential form on
$M_4$:
$$\omega = \frac{1}{3!}\,T^i E_{iabc}\,du^a \wedge du^b \wedge du^c.$$
The pseudotensor $E_{iabc}$ has been introduced in Example 74.27. For arbitrary
coordinate transformations, $\omega$ behaves like a pseudoscalar. Since a chart
change on $M_4$, by definition, uses a proper Lorentz transformation which
has a positive Jacobian, $\omega$ behaves like a scalar under chart changes,
i.e., $\omega$ is actually a differential form on $M_4$.
It follows that
$$cQ(\gamma) = \int_{\mathbb{R}^3} T^4(y, c\gamma)\,dy = \int_{t=\gamma}\omega.$$

[Figure 76.16: the region $H$ between the hyperplanes $t = \gamma$ and
$t' = \gamma'$]

In an inertial chart it follows from $D_iT^i = 0$ that $d\omega = 0$. The
theorem of Stokes of Section 74.24 implies
$$\int_{\partial H}\omega = \int_H d\omega = 0. \qquad (71)$$
We assume that the region $H$ is bounded by the two hyperplanes $t = \gamma$
and $t' = \gamma'$ as well as by other parts $\partial_1 H$ of the boundary as
shown in Figure 76.16. Since $T^i$ has a compact spacelike support, we can
choose $\gamma$ and $\gamma'$ in such a way that $T^i = 0$ on $\partial_1 H$.
Because of the coherent orientation of $\partial H$ and $H$ it follows from
(71) that
$$\int_{t=\gamma}\omega - \int_{t'=\gamma'}\omega = 0,$$
and thus $Q'(\gamma') = Q(\gamma)$. Since $Q$ and $Q'$ are conserved
quantities, it follows that $Q' = Q$, i.e., $Q$ is a scalar.
Under space reflections (or time reflections), $T^4$ remains unchanged as the
fourth component of a simple contravariant tensor (or changes its sign). It
follows then immediately from the integral representation of $Q(t)$ above that
the charge $Q$ remains unchanged under space reflections, while it changes its
sign under time reflections.
Passing from particles to antiparticles means a change in charge. This is
why physicists say that antiparticles are nothing other than particles with
time reversal. This can be formulated precisely in the context of quantum field
theory.
76.3b. Proof of Proposition 75.20.
Solution: We now choose

This implies

We have to show that the conserved quantities $p^i$ are transformed like simple
contravariant tensors under proper Lorentz transformations. We use the
same arguments as in Problem 76.3a and note that $\omega^i$ is transformed like
a simple contravariant tensor under proper Lorentz transformations.
76.3c. Proof of Corollary 75.21.
Solution: We now choose
$$\omega^{ij} = \frac{1}{3!}\,S^{ijk}E_{kabc}\,du^a \wedge du^b \wedge du^c.$$


76.4. Locally geodesic coordinates. Let $x \in E_4$ be a fixed point. Prove
that in every neighborhood of $x$ one can introduce new coordinates such that
at $x$
(72)
$$\Gamma^k_{ij} = 0, \qquad (73)$$
holds with $\varepsilon_4 = 1$ and $\varepsilon_i = -1$ for $i \ne 4$.
Solution: Let $\Gamma_{i,jk} = g_{is}\Gamma^s_{jk}$. Formula (7) implies
$$D_k g_{ij} = \Gamma_{i,jk} + \Gamma_{j,ik}. \qquad (74)$$
As in Section 74.20 it follows that $g_{i'j'} = \varepsilon_i\delta_{i'j'}$ at
the point $x$ after a linear coordinate transformation. For simplicity in
notation let $u^{i'}(x) = 0$. We choose new coordinates
(75)
Transformation rule (74.24) implies that $\Gamma^k_{ij} = 0$. This immediately
yields (72) and (73). Note that
$\partial u^i(x)/\partial u^{i'} = \delta^i_{i'}$.
76.5. Tensor identities. Prove the following identities by direct computation,
which can be greatly simplified by using locally geodesic coordinates. For
mnemotechnical reasons we use commas:
$$R_{ij,km} = -R_{ji,km} = -R_{ij,mk}, \qquad (76)$$
$$R_{ij,km} = R_{km,ij}, \qquad (77)$$
$$R_{ij,km} + R_{ik,mj} + R_{im,jk} = 0, \qquad (78)$$
$$\nabla_s R_{ij,km} + \nabla_k R_{ij,ms} + \nabla_m R_{ij,sk} = 0, \qquad (79)$$
$$\nabla_j(R^{ij} - \tfrac{1}{2}g^{ij}R) = 0. \qquad (80)$$
For an $n$-dimensional Riemannian manifold it follows that $R_{ij,km}$ has
$a(n) = n^2(n^2 - 1)/12$ independent components. For example, $a(2) = 1$,
$a(3) = 6$, and $a(4) = 20$. Formula (79) is known as the Bianchi identity.
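
The count of independent components is easy to tabulate. The sketch below evaluates $a(n) = n^2(n^2 - 1)/12$ for small $n$; the function name is illustrative.

```python
def independent_curvature_components(n):
    """Return a(n) = n^2 (n^2 - 1) / 12 for an n-dimensional Riemannian manifold."""
    if n < 1:
        raise ValueError("the dimension n must be a positive integer")
    # n^2 (n^2 - 1) is always divisible by 12, so integer division is exact.
    return n * n * (n * n - 1) // 12


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, independent_curvature_components(n))
```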
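76.6. Divergence. Prove
$$\nabla_i t^i = \gamma^{-1}D_i(\gamma t^i) \qquad (81)$$
on $E_4$ with $\gamma = \sqrt{|g|}$.
Solution: This follows from
$$\nabla_i t^i = D_i t^i + \Gamma^i_{is}t^s
  = D_i t^i + \tfrac{1}{2}(g^{ij}D_s g_{ij})\,t^s$$
and the differentiation rule for determinants $D_s g = g\,g^{ij}D_s g_{ij}$.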
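76.7. The variational principle of Hilbert (1915). As in Section 76.2, we let
$$J = \int_H R\gamma\,du$$
with $\gamma = \sqrt{|g|}$. Prove
$$\delta J = \int_H (\tfrac{1}{2}g^{km}R - R^{km})\,\gamma\,\delta g_{km}\,du,
  \qquad (82)$$
where
$$\delta g_{ij} = 0 \quad\text{and}\quad \delta\Gamma^k_{ij} = 0
  \quad\text{on } \partial H.$$
As usual, we have
$$\delta J = \varphi'(0)$$
with $\varphi(\varepsilon) = J(g_{ij} + \varepsilon h_{ij})$ and
$\delta g_{ij} = h_{ij}$. Analogously, $\delta R_{ij}$, $\delta\Gamma^k_{ij}$,
etc. are defined.
Solution: We have
$$\delta(R\gamma) = \delta(g^{ij}R_{ij}\gamma)
  = g^{ij}\gamma\,\delta R_{ij} + R_{ij}\,\delta(g^{ij}\gamma).$$
(I) It is important that $\delta R_{ij}$ does not contribute to $\delta J$. To
see this, let
$$w^s = g^{ij}\,\delta\Gamma^s_{ij} - g^{sj}\,\delta\Gamma^i_{ij}.$$
From the definition of $R_{ij}$ in (9) it follows that
$$g^{ij}\,\delta R_{ij} = \gamma^{-1}D_s(\gamma w^s). \qquad (83)$$
Initially, this formula only holds for a locally geodesic coordinate system,
because there we have $D_s\gamma = 0$ (Problem 76.4). The right-hand side of
(83), however, is a tensor, and thus (83) holds for every coordinate system.
In order to recognize the tensor property, we observe that
$\delta\Gamma^k_{ij}$ is a tensor, since the second-order derivatives cancel
out in the transformation rule (74.24) when passing from $\Gamma^k_{ij}$ to
$\delta\Gamma^k_{ij}$. Hence $w^s$ is a tensor, and from (81) it follows that
$g^{ij}\,\delta R_{ij} = \nabla_s w^s$.
Integration by parts yields
$$\int_H \gamma g^{ij}\,\delta R_{ij}\,du = \int_H D_s(\gamma w^s)\,du = 0.$$
Note that $\delta\Gamma^k_{ij} = 0$ on $\partial H$, and hence $w^s = 0$ on
$\partial H$.
(II) Differentiation of $g = \det(g_{ij})$ gives
$$\delta\gamma = -\tfrac{1}{2}\gamma^{-1}\,\delta g
  = \tfrac{1}{2}\gamma g^{ij}\,\delta g_{ij}.$$
From $g^{ij}g_{jk} = \delta^i_k$ it follows that
$$\delta g^{ij}\,g_{jk} + g^{ij}\,\delta g_{jk} = 0.$$
This implies
$$\delta(g^{ij}\gamma) = g^{ij}\,\delta\gamma + \gamma\,\delta g^{ij}
  = \gamma(\tfrac{1}{2}g^{ij}g^{km} - g^{ik}g^{jm})\,\delta g_{km}.$$
The desired formula (82) then follows immediately from
$$\delta J = \int_H R_{ij}\,\delta(g^{ij}\gamma)\,du.$$
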

76.8. Connection with classical Newtonian mechanics. Prove that the Poisson
equation (19) follows from metric (18) and Einstein's equation
$R_{ij} - \tfrac{1}{2}g_{ij}R = \kappa T_{ij}$ with $T^{44} = \rho c^2$ if
higher-order terms in $1/c$ are neglected.
Hint: Lengthy elementary computation. See Fock (1960, M), §55.
76.9. The theorem of Birkhoff (1923). We want to show that a radially
symmetric gravitational field, which satisfies Einstein's equations
$R_{ij} = 0$ for the vacuum, coincides locally with the Schwarzschild solution
if some natural regularity conditions are satisfied.
76.9a. Special case. Consider the metric $g_{ij}$ for $\eta = \pm 1$, i.e.,
$$ds^2 = \eta e^{a(r,t)}c^2\,dt^2 - \eta e^{b(r,t)}\,dr^2
  - r^2(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2), \qquad (84)$$
where $a$ and $b$ are $C^2$-functions.
Prove: If $R_{ij} = 0$ holds for a region $G$ with $r > 0$ for all points in
$G$, then (84) coincides with the Schwarzschild solution in $G$.
Solution: Let $a' = a_r$ and $\dot a = a_t$. The nonvanishing components of
$R_{ij}$ are
$$R_{14} = -\dot b/r,$$
$$4R_{11} = 2a'' + a'^2 - a'b' - (4b'/r)
  - e^{b-a}(2\ddot b + \dot b^2 - \dot a\dot b),$$
$$4R_{44} = -e^{a-b}(2a'' + a'^2 - a'b' + 4a'/r)
  + 2\ddot b + \dot b^2 - \dot a\dot b,$$
$$-R_{22} = 1 - \eta e^{-b}(1 + r(a' - b')/2).$$
From $R_{14} = 0$ it follows that $b$ does not depend on $t$. From
$R_{22} = 0$ it follows that $a'$ does not depend on $t$, i.e.,
$$a = \alpha(r) + \beta(t).$$
After the transformation $e^{\beta/2}\,dt = dt'$ we can assume in (84) that $a$
is independent of $t$ and $a(r_0) = -b(r_0)$ for a fixed $r_0$. From
$R_{11} = 0$ and $R_{44} = 0$ it follows that $(a' + b')/r = 0$, and hence
$a = -b$. The substitution $f = \eta e^{-b}$ and $R_{22} = 0$ with $a' = -b'$
yields
$$rf' + f = 1$$
with the general solution
$$f = \eta e^{-b} = \eta e^{a} = 1 - r_s/r.$$
Here $r_s > 0$ is the constant of integration. For $\eta = -1$ (or
$\eta = 1$) we have $0 < r < r_s$ (or $r > r_s$). For these $a$ and $b$ we have
in fact that $R_{11} = R_{44} = 0$.
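
As a quick plausibility check of the last step in the solution of Problem 76.9a, the sketch below verifies numerically that $f(r) = 1 - r_s/r$ satisfies $rf' + f = 1$. The central difference quotient and the sample points are choices made here for illustration; one sample point lies in the exterior region $r > r_s$ and one in the interior region $0 < r < r_s$.

```python
def f(r, r_s):
    """Candidate solution f(r) = 1 - r_s / r."""
    return 1.0 - r_s / r


def residual(r, r_s, h=1e-6):
    """Return r f'(r) + f(r) - 1, with f' approximated by a central difference."""
    f_prime = (f(r + h, r_s) - f(r - h, r_s)) / (2.0 * h)
    return r * f_prime + f(r, r_s) - 1.0


if __name__ == "__main__":
    for r in (0.5, 3.0):
        print("r =", r, " residual =", residual(r, 1.0))
```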
76.9b. General case. Prove that
ds 2 = AdT 2 - B2 (d8 2 + sin 2 8dqJ 2 )
(85)
- CdR 2 + 2DdRdT
can be locally transformed into (84). Thereby we assume that locally A, B, C,
D are C2 -functions in T and R with

B>O, AC-D 2 <0, A i' 0. (86)


Solution: Let r = B(T, R) and solve locally with respect to R. After change
in notation, we can assume that (85) and (86) hold with R =rand B 2 = r 2•

We Jet T = r) with

f
q~(t,

q~dJ!- A- 1 Ddr + 1/J(t),

where we choose 1/J in such a way that locally q11 -:1: 0 is satisfied. Then
T = q~(t, r) can be solved locally with respect to t. Because of
Aq~, + D= 0 and
we obtain
ds 2 = IX dt 2 - r2 (d8 2 + sin 2 8 dq~ 2 ) - y dr 2
with IX = Aq~12 and y = Aq~? - C. Therefore, we have locally that IX -:1: 0 and
y = (D 2 - AC)/A, and hence IXY < 0.
76.10.** The initial-value problem for Einstein's equations and causality.
76.10a. Causal structure. Let E4 be a four-dimensional space-time manifold of
Section 76.1. A curve x = x(u) in E4 is called timelike (or spacelike, lightlike)
at a point x( u0 ) if and only if the following holds in local coordinates:
(or< 0, = 0).
Timelike curves are oriented in the direction of increasing proper time t, i.e.,
increasing arclength s (future orientation).
We say the point x 1 causally effects x 2 if and only if x 1 and x 2 lie on a
timelike curve where x 2 follows x 1 .
By a spacelike 3-surface we mean a three-dimensional submanifold S of
E4 which is locally given by an equation

q~(u) = 0
with a C"'-function q1 and g 1iD;q~D1 q~ < 0 on S. If we choose M4 with
ds 2 = c2 dt 2 - de - dl'f 2 - de,
then equation t = 0 describes a spacelike 3-surface S, where Sis the set of all
events which take place at time t = 0. The surface S can then be identified
with IR 3 •
76.10b. Equivalent metrics. Two metrics, i.e., two tensor fields giJ and 9;·1· on a
C"'-manifold M are called equivalent if and only if there exists a C"'-
diffeomorphismf: M-+ M which takesg;1 into Y;·r· That is, by setting locally
u' = f(u), we obtain
with
76.1 Oc. Initial-value problem and Banach's fixed-point theorem. Intuitively, one means
by an initial-value problem that the gravitational field, i.e., g;1 is uniquely
determined for all future times if it is known at an initial timet = 0. Moreover,
because of the principle of causality one expects that gii(x 2 ) only depends
on all events x 1 with t = 0, which causally effect x 2 in the sense of Problem
76.10a.
Under suitable assumptions one can prove such existence and uniqueness
theorems for Einstein's field equations. In place of the 3-surface t = 0 one
788 76. General Theory of Relativity

then has a more general spacelike 3-surface. The main difficulty is that besides
gii every other equivalent metric is also a solution of Einstein's equations.
In order to avoid this nonuniqueness one assigns four side conditions for the
components ofthe metric tensor g1J (harmonic coordinates). These additional
differential equations are called gauge conditions. The diffeomorphisms f of
Problem 76.l0b correspond to the so-called gauge transformations.
Study Hawking and Ellis (1973, M), Chapter 7 (hyperbolic second-order
differential equations) and Fischer and Marsden (1972) (quasi-linear sym-
metric hyperbolic first-order system). In this last paper, the system of differ-
ential equations corresponds to an operator equation of the form
F(u)u =h. (87)
Such equations are solved by studying the solutions v = Tu for the linear
equation
F(u)v =h (88)
for some fixed u. Then (87) corresponds to the fixed-point equation
u= Tu,
which in the present case can be solved by Banach's fixed-point theorem. The
main difficulty is the solution of (88). Under the assumption that u satisfies
certain smoothness conditions one has to show that the solution v is suffi-
ciently smooth. Otherwise, one cannot find a set in a Sobolev space which
under T is mapped into itself. In fact, system (88) is a symmetric, linear,
hyperbolic system. Since u appears in the coefficients one can obtain the
desired strong regularity results for v by using the theory of semigroups.
A simpler approach to quasi-linear symmetric hyperbolic systems will be
considered in Chapter 83 of Part V.
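The iteration behind (87) and (88) can be illustrated in finite dimensions. The following Python sketch is only a toy model and is not related to Einstein's equations themselves: the matrix function `F`, the right-hand side `h`, the starting point, and all helper names are hypothetical choices, made so that the map T is a contraction near the origin. Each step solves the linear equation F(u)v = h for fixed u and then replaces u by v = Tu.

```python
def F(u):
    """Hypothetical 2x2 coefficient matrix depending on the current iterate u."""
    return ((2.0 + 0.1 * u[1] ** 2, 0.5),
            (0.5, 2.0 + 0.1 * u[0] ** 2))

def solve2(a, h):
    """Solve the linear 2x2 system a v = h by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((h[0] * a[1][1] - a[0][1] * h[1]) / det,
            (a[0][0] * h[1] - a[1][0] * h[0]) / det)

def T(u, h):
    """v = Tu is the solution of the linear equation F(u) v = h, cf. (88)."""
    return solve2(F(u), h)

def fixed_point(h, u=(0.0, 0.0), tol=1e-14, max_iter=100):
    """Successive approximation u_{k+1} = T u_k for the equation u = Tu."""
    for _ in range(max_iter):
        v = T(u, h)
        if max(abs(v[0] - u[0]), abs(v[1] - u[1])) < tol:
            return v
        u = v
    return u

def residual(u, h):
    """Maximum norm of F(u)u - h, i.e., the defect in equation (87)."""
    a = F(u)
    return max(abs(a[0][0] * u[0] + a[0][1] * u[1] - h[0]),
               abs(a[1][0] * u[0] + a[1][1] * u[1] - h[1]))
```

For example, `fixed_point((1.0, 1.0))` returns an approximate solution u of F(u)u = h, and `residual` measures how well (87) is satisfied. In the infinite-dimensional situation of the text, the hard part is not this iteration but the regularity of the solution v of (88).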
Furthermore, for the problems discussed here, we recommend the survey
articles of Choquet-Bruhat and Yorke (1980, S) and Marsden (1983). Existence
proofs for global solutions of the initial-value problem, which correspond
to small initial data, can be found in Christodoulou and Klainerman
(1988, M).
76.11.** Perturbation series and bifurcation of solutions of Einstein's equations for the
vacuum. An important problem is the following: Under which conditions is
it possible to obtain from a known gravitational field $g^0_{ij}$ a new gravitational
field by using the series expansion
$g_{ij} = g^0_{ij} + \varepsilon h_{ij} + \cdots, \qquad \varepsilon \to 0.$ (89)
If we write Einstein's equations for the vacuum in the form
$E(g) = 0,$ (90)
then we have $E(g_0) = 0$, and for $h_{ij}$ the linearized equation
$E'(g_0)h = 0$ (91)
must be satisfied. Let M denote the set of all solutions of (90). The answer to
our question, which was found only recently, is roughly as follows.
(i) If $g_0$ has no symmetries, then M is a manifold in a neighborhood of $g_0$,
i.e., for each h with (91) there exists an expansion (89) which is a
solution of (90) for small $|\varepsilon|$. This means that precisely all h with (91) are
tangent vectors to M at the point $g_0$.
(ii) If $g_0$ has symmetries, then $g_0$ is not a regular point of M. The series (89)
are then solutions of (90) if and only if h is a solution of (91) and certain
relations, which only depend on $E''(g_0)h^2$ and the symmetries, are satisfied.
(The so-called Taub conserved quantities must vanish.)
Symmetries are described by the Lie group of the isometries of $g_0$. Study
Marsden (1980, S) and Arms, Marsden, and Moncrief (1982). All explicitly
known solutions of (90) have symmetries.
76.12.** The twistor program of Penrose (1977). In the following, a plane or straight
line is always understood as a two-dimensional or one-dimensional linear
subspace.
76.12a. Twistors. By a twistor we mean a point $T = (T_0, T_1, T_2, T_3)$ of $\mathbb{C}^4$. We set

$\langle T, S\rangle = T_0\bar S_0 + T_1\bar S_1 - T_2\bar S_2 - T_3\bar S_3.$

The twistor space $\mathbb{T}$ is the space $(\mathbb{C}^4, \langle\cdot,\cdot\rangle)$, i.e., $\mathbb{C}^4$ equipped with $\langle\cdot,\cdot\rangle$. A
twistor T is called positive (or negative, isotropic) if and only if $\langle T, T\rangle > 0$
(or < 0, = 0). The geometry of these twistors is called twistor geometry.
Let $T \neq 0$. The set $T_P = \{\rho T : \rho \in \mathbb{C}, \rho \neq 0\}$ is called the projective twistor
of T. The space of all projective twistors is the three-dimensional, projective,
complex space $\mathbb{CP}^3$ and is denoted by $\mathbb{PT}$.
76.12b. Penrose transformation. Consider the space $\mathbb{R}^4$ with points $(\xi, \eta, \zeta, ct)$, where
$\xi, \eta, \zeta$ are Cartesian space coordinates and t is the time coordinate for an
inertial system. The corresponding set of the complex space-time points
$p = (\xi, \eta, \zeta, ct)$ is denoted by $\mathbb{C}^4_{\text{space-time}}$. Our goal is to realize all p as planes
in $\mathbb{T}$. This will be done in two steps.
(i) Spinor realization. We assign to each p a complex (2 x 2)-matrix X which
is defined through

$X = \begin{pmatrix} ct + \xi & \eta + i\zeta \\ \eta - i\zeta & ct - \xi \end{pmatrix}.$

Conversely, to every X there corresponds a unique p.


(ii) To each X we assign the plane

(92)

in the twistor space $\mathbb{T}$. To different X correspond different planes. The
planes in $\mathbb{T}$ which cannot be expressed in the form (92) are called ideal
space-time points.
Since the set of all planes in $\mathbb{T}$ forms a compact manifold with respect to
Plücker's coordinates (Grassmannian manifold), we obtain a compactification
of $\mathbb{C}^4_{\text{space-time}}$ from this transformation. One may also think of (92) as a
straight line in the projective twistor space $\mathbb{PT}$. This interpretation is often
preferred, because in algebraic geometry one usually works with homogeneous
coordinates, i.e., with projective structures.
The basic idea of the twistor program is to transform relativistic fields and
their differential equations into objects of algebraic geometry by using the
Penrose transformation (vector bundles, etc.). Thereby the differential equa-
tions vanish. One may think of this as a further development of the Fourier
transformation. Thereby differential equations are transformed into alge-
braic equations. The twistor program has been very successful in solving the
complicated, nonlinear Yang-Mills equations of gauge field theory. Here a
reduction to a study of vector bundles was possible (see Atiyah (1979, L)).
We recommend Wells (1979, S) and Penrose and Rindler (1984, M), as well
as the literature listed at the end of this chapter under the headline "twistor
program."
76.13. Basic equation for relativistic star models. We now generalize the basic
equation for the classical star models of Section 68.7. For this we use the
metric
$ds^2 = A(r)c^2\,dt^2 - B(r)\,dr^2 - r^2(\sin^2\theta\,d\varphi^2 + d\theta^2)$
in the interior of the radially symmetric star and solve Einstein's equations
therein by choosing $T_{ij}$ as the energy-momentum tensor of an ideal fluid
with pressure P = P(r), mass density $\rho = \rho(r)$, and energy density $\varepsilon = \rho c^2$.
Moreover, let

$M(r) = \int_0^r 4\pi r^2 \rho(r)\,dr$

be the mass of the ball of radius r. Prove that Einstein's equations imply the
following equation:

$-r^2 P' = GM\rho \left(1 + \frac{P}{c^2\rho}\right)\left(1 + \frac{4\pi r^3 P}{c^2 M}\right)\left(1 - \frac{2GM}{c^2 r}\right)^{-1}.$

From this basic equation for star models one obtains the classical basic
equation $-r^2 P' = GM\rho$ as $c \to \infty$.
Hint: See Weinberg (1972, M), Chapter 11, §1.
76.14. Gravitational collapse. We want to explain the physically interesting fact that
a star with no pressure left inside can collapse into a point during a finite
amount of time $t_{\text{collapse}}$.
76.14a. Classical theory. Derive this fact classically.
Hint: As in Figure 76.17 consider a point P with mass m on the surface
of a ball of radius R. Let the mass of this ball be equal to M. During the
gravitational collapse the motion of P is given by r = r(t). For t = 0 we have
$\dot r(0) = 0, \qquad r(0) = R.$

The point P is only affected by the gravitational force of the ball, i.e.,
$m\ddot r = -GMm/r^2.$
Note the important fact that according to Problem 58.5 the ball affects P in
Figure 76.17

the same way as if its entire mass were located at its center. Multiplication
by $\dot r$ and integration yield the energy theorem
$2^{-1}\dot r^2 - MG/r = \text{const} = -MG/R.$
Thus we obtain as collapse time

$t_{\text{collapse}} = \int dt = \int_0^R dr/|\dot r| = \pi (R^3/8MG)^{1/2}.$
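The collapse time can be checked numerically. The following Python sketch evaluates the integral $\int_0^R dr/|\dot r|$ with $|\dot r| = (2GM/R)^{1/2}((R - r)/r)^{1/2}$ taken from the energy theorem. To remove the singularity of the integrand at r = R we substitute $r = R\sin^2\varphi$; this substitution, the midpoint rule, the function names, and the numerical values of G, M, R in the usage note are our own choices for illustration.

```python
import math

def collapse_time(G, M, R, steps=2000):
    """Midpoint rule for the integral of dr/|r'| over [0, R], with r = R sin^2(phi)."""
    # Energy theorem: |r'| = sqrt(2GM/R) * sqrt((R - r)/r) = sqrt(2GM/R) * cot(phi).
    # With dr = 2R sin(phi) cos(phi) dphi we get
    # dr/|r'| = 2R sin^2(phi) / sqrt(2GM/R) dphi on 0 <= phi <= pi/2.
    d_phi = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        phi = (k + 0.5) * d_phi
        total += 2.0 * R * math.sin(phi) ** 2 * d_phi
    return total / math.sqrt(2.0 * G * M / R)

def collapse_time_formula(G, M, R):
    """Closed formula pi * (R^3 / (8 M G))^(1/2) from the text."""
    return math.pi * math.sqrt(R ** 3 / (8.0 * G * M))
```

With approximate solar values, say `G = 6.674e-11`, `M = 1.989e30`, `R = 6.96e8` in SI units, both functions agree up to rounding errors.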

76.14b.* Relativistic theory. Prove that the same formula for $t_{\text{collapse}}$ can be obtained
if, as in Section 76.3, one chooses the metric of the closed cosmological model
for the interior of the star with energy-momentum tensor of an ideal fluid
with vanishing pressure.
Hint: See Weinberg (1972, M), Chapter 11, §9.
76.15.** The famous positive-energy theorem in general relativity. Roughly speaking,
this theorem says the following. For a nontrivial isolated physical system in
general relativity, the total energy, including contributions from matter and
gravitation, is positive.
The precise result can be found in Schoen and Yau (1979). We also
recommend the survey article Choquet-Bruhat (1984).
76.16. An unsolvable problem. Compute the Riemannian invariant density

K(m) =H sgn (IXl ... IX 2"') RfJifl2Rfl•fl• ... RfJ2m-lfl2m


f3t ... fJlm 111112 113114 «2m-1112m

for m = 5, where we sum over two equal indices from 1 to 2m. This expression
is closely related to the theorem of Gauss-Bonnet-Chern in (74.109) for
2m-dimensional manifolds.
There are 3.6 million terms which would take 5 years to compute by hand.
A computer needs 6 hours. Computer methods for numerical and algebraic
solutions of Einstein's equations may be found in d'Inverno (1983).

References to the Literature


Classical works: Einstein (1905), (1905a) (special theory of relativity), Minkowski
(1909) (geometrical formulation of the special theory of relativity, Minkowski space),
Einstein (1915), (1916) (general theory of relativity).
Classical monograph: Einstein (1953).
Einstein's "Weltbild": Einstein (1965, M), Melcher (1979, M, B).
Collected works: Einstein (1960).


Einstein biographies: Frank (1949, M), Infeld (1969, M), Dukas and Hoffmann (1972,
M), (1979, M).
Reminiscences of Einstein: Born and Infeld (1967, M).
Modern development of the general theory of relativity: Held (1980, P), Vols. 1, 2
(especially recommended), Carelli (1979, P), Hawking and Israel (1979, P), Schmutzer
(1983, P), DeWitt and Stora (1984, P), Ruffini (1987, P), Vols. 1, 2.
Introduction: Landau and Lifšic (1962, M), Vol. 2, Stephani (1977, M), Sexl and
Urbantke (1983, M) (especially recommended), Straumann (1984, M).
Standard works about the general theory of relativity and cosmology: Weinberg
(1972, M, B), Misner, Thorne, and Wheeler (1973, M), Hawking and Ellis (1973, M).
Monographs: Weyl (1923, M) (classical monograph), Sommerfeld (1954, M), Vol. 3,
Fock (1960, M), Schmutzer (1968, M), Sachs and Wu (1977, M), Rindler (1977, M),
Dixon (1978, M), Triebel (1981, S), Wald (1984, M).
Lorentzian geometry: Beem and Ehrlich (1981, M), O'Neill (1983, M).
Experimental tests for the theory of relativity: Reasenberg and Shapiro (1983, S).
Computer methods for the general theory of relativity: d'Inverno (1983).
Initial-value problem for Einstein's equations: Choquet-Bruhat and Yorke (1980, S)
(recommended as an introduction), Dionne (1962), Fischer and Marsden (1972) (quasi-
linear symmetric hyperbolic system), Hawking and Ellis (1973, M), Hughes, Kato,
and Marsden (1977), Friedrich (1981), Marsden (1983, S, B).
Global solutions for the initial-value problem: Christodoulou and Klainerman
(1988, M).
Regular and singular points in the space of solutions for Einstein's equations and
the Einstein-Yang-Mills equations: Marsden (1980, S), Fischer, Marsden, and Moncrief
(1980), Arms, Marsden, and Moncrief (1982).
Positive-energy theorems: Schoen and Yau (1979), Choquet-Bruhat (1984, S).
Einstein manifolds: Besse (1987, M).
Group theory and general relativity: Carmeli (1977, M).
Black holes: Sexl (1981, M), Sexl and Urbantke (1983, M) (introduction), Chandrasekhar
(1983, M) (standard work), Hawking (1972), (1973), (1975) (basic laws),
Hawking and Israel (1979, P), Kaufmann (1979, M) (elementary introduction), Lotze
(1980, S), Miller and Sciama (1980) (gravitational collapse), Seifert (1983, S), Novikov
and Frolov (1986, M).
History of black holes: Sullivan (1979, M).
Singularities of gravitational fields: Hawking and Ellis (1973, M) (standard work),
Penrose (1972, L), (1979, S), Hawking and Israel (1979, P), Tipler, Clarke, and Ellis
(1980, S), Seifert (1983, S), O'Neill (1983, M).
Representation of all known solutions of Einstein's field equations: Kramer (1980,
M).
Gravitational waves: Weinberg (1972, M) (mathematical theory), Weber (1980, S)
(experiments), Ruffini (1987, P).
Quantization of the gravitational theory: DeWitt (1984, L) and Hawking (1984, S)
(introduction), Misner, Thorne, and Wheeler (1973, M), Grib (1980, M), Marlow (1980,
M), Birell and Davies (1982, M) (introduction), Narlikar and Padmanabhan (1982)
(quantum cosmology via Feynman integrals), Novikov and Frolov (1986, M).
Textbooks on quantum field theory and modern particle physics: Lee (1981), Frampton
(1987).
Elementary particle physics and cosmology-toward a theory of the universe:
Weinberg (1983, P), (1986a, P), Kounas (1984, P), Audouze (1985, M), Haber (1987, P),
Hincliffe (1987, P).
Magnetic monopoles: Craigie (1986, L) (standard work), Lee (1981, M), Rajaraman
(1982, M).
Supersymmetry, superspace, supergravity, and unified theory of all interactions:
Mohapatra (1986, M) and West (1986, M) (recommended as an introduction), Hawking
and Roček (1981, P), Wess and Bagger (1983, L), Gates (1983, L), Nieuwenhuizen (1984,
S), Salam and Sezgin (1986, P), Haber (1987, P).
Supersymmetric Yang-Mills equations: Manin (1985).
Superstring theory and unified theory of all interactions: Green, Schwarz, and
Witten (1986, M) (standard work), Green and Gross (1986, P), Witten (1986, S) and
Green (1987, S) (fundamental survey articles), Haber (1987, P).
Topological tools for superstring theory: Witten (1986), Green, Schwarz, and Witten
(1987, M), Segre (1987, S).
Loop groups: Pressley (1986, M).
Super-Lie groups, super-Lie algebras, supermanifolds: Leites (1980, S) (fundamental
survey article), Gates (1983, L), Wess and Bagger (1983, L), Regge (1984, S), West (1986,
M).
A new supersymmetric proof of the Atiyah-Singer index theorem: Simon (1986, M).
Penrose twistor program: Penrose (1977), Atiyah (1979, L) (applications to gauge
field theory), Wells (1979, S), Penrose and Ward (1980, S), Manin (1982) (Yang-Mills-
Dirac equations as Cauchy-Riemann equations in twistor space) (1984, M), Penrose
and Rindler (1984, M), Vols. 1, 2.
Collection of problems for the general theory of relativity with solutions: Lightman
(1975, M).
Elementary star models: Sexl (1981, M).
Relativistic astrophysics, star models and cosmology: Liang and Sachs (1980, S),
Sexl and Urbantke (1983, M) (introduction), Weinberg (1972, M) (standard work),
Chandrasekhar (1939, M) (classical exposition about star models), Zeldovic and
Novikov (1971, M), (1971a) (astrophysics), Peebles (1980, M), Longair (1974, P) (cosmology),
Hoyle and Wickramasinghe (1979, M) (life cloud), Ambarzumjan (1980, M)
(cosmogony).
Script for the development of the universe after the Big Bang: Weinberg (1972, M),
(1977, M), Singh (1983) (unified theory of elementary particles, magnetic monopoles),
Setti (1984, P), Haber (1987, P), Hincliffe (1987, P).
Inflationary universe: Guth (1981), Linde (1984, S), Hawking (1984, S), Borner (1985,
S), Abbott and Pi (1985) (collection of important papers).
Historical survey about various cosmological models: Treder (1975, M).
History of modern astronomy and cosmology: Ferris (1977, M), Sullivan (1979, M).
History of astronomy: Moore (1971, M), Herrmann (1978, M).
Search for extraterrestrial intelligent creatures: Sagan (1968, M), (1973, M), (1980a),
Breuer (1978, M).
Encyclopedia of astronomy and space (1976).
Popular exposition of astronomy and cosmology: Ferris (1977, M), Sullivan (1979,
M), Silk (1980, M), Sagan (1980, M), (1980a, M), (1982, M), Harrison (1981, M), Unsöld
(1981, M), Fritzsch (1983, M), Trefil (1984, M), Field (1986, M), Kippenhahn (1987, M).
CHAPTER 77

Simplicial Methods,
Fixed-Point Theory,
and Mathematical Economics

The highest praise for a mathematician is that his theorem, his proof, his theory
is considered beautiful. Every mathematician polishes his proofs until they assume
the most elegant form. After its first appearance, an important theorem
will be proved in many different ways, and the most elegant proof, which usually
is also the most simple one, will then be used in monographs and textbooks.
Krysztof Maurin (1981)
The purpose of this note is to give a short proof of Brouwer's fixed-point
theorem. We first prove a lemma which contains the combinatorial key argument
of a new and elegant proof of E. Sperner for the invariance of the dimension
number. From this we deduce a combinatorial-topological theorem from
which the above-mentioned fixed-point theorem, as well as Sperner's proof,
follows.
B. Knaster, C. Kuratowski, and S. Mazurkiewicz (1929)
It has been typical for the study of nonlinear phenomena in analysis that they
must be analyzed with one technique to obtain quantitative information (numerical
values of the solution), and by another to obtain qualitative information
(existence, uniqueness, multiplicity, stability, bifurcation). In a sense, this paper
is devoted to a unified approach, i.e., for a number of topological methods which
have been used in the past only to obtain qualitative insight, we will show how
to exploit them and at the same time also provide numerical knowledge up to
implementable and tested procedures.
Heinz-Otto Peitgen and Michael Priifer (1979)
As our brief remarks below, concerning the history of fixed-point theory as it
pertains to constructive methods, may indicate, there seems to have been a
blockage concerning the development of algorithms for Brouwer's fixed-point
theorem (1912). We suspect that we were not alone in our reaction, upon learning
of the Scarf (1967) simplicial algorithm or the Kellogg, Li, and Yorke (1976)
homotopy algorithm-approximately, "Yes! ... But of course!." The origin of
the blockage is perhaps due to an interface between analysis or topology and
computing, which will in time vanish, as the example of Brouwer's fixed-point
theorem allows us to hope.
Eugene Allgower and Kurt Georg (1980)

In the following two chapters, we present two ways of introducing fixed-point
theory which have been developed intensively during recent years and provide
an effective method for solving nonlinear equations on computers:
(i) Simplicial methods (Chapter 77).
(ii) Homotopy methods (Chapter 78).
In (i) we use triangulations and suitable labelings. The starting point is the
lemma of Sperner of Section 77.1. In (ii), curves, i.e., one-dimensional manifolds,
are followed. Here differential topological methods, especially Sard's
theorem, are employed. The disadvantage of (i) compared with (ii), in view of computer
usage, is that with an increasing number of variables in the nonlinear equations,
the necessary storage space increases very rapidly.
The content of this chapter is pictured schematically in Figure 77.1. In
Chapter 2, we began the discussion of fixed-point theory with the negative
retract principle. This way we obtained a quick proof of Brouwer's fixed-
point theorem. In Chapter 9 we then used Brouwer's fixed-point theorem and

[Figure 77.1. Schematic diagram relating the following results: simplicial lemma of Sperner (1928) (elementary combinatorics); cubic Sperner lemma of Kuhn (1960); lemma of Knaster, Kuratowski, and Mazurkiewicz (1929); negative retract principle; Brouwer's fixed-point theorem (1912); inequality of Fan (1972); intermediate-value theorem of Bolzano-Poincaré-Miranda (1941); Nash equilibria (1951) (n-person games); fixed-point theorem of Kakutani (1941); J. von Neumann's minimax theorem (1928) (2-person games); fixed-point theorem of Fan-Glicksberg (1952); duality theory (Chapter 49); main theorem of mathematical economics about Walras equilibria of Gale (1955), Nikaido (1956), and Debreu (1959); theorem of Hartman-Stampacchia (1966) about variational inequalities; economical models; quasi-variational inequalities.]
the partition of unity to prove existence theorems for variational inequalities.
From this we derived fixed-point theorems for multivalued maps and gave
applications to the minimax theorem. The negative retract principle is very
intuitive, but, in fact, represents a deep topological result. Thus the presenta-
tion of Brouwer's fixed-point theorem in Chapter 2 is certainly of great
geometrical clarity and simplicity, but actually does not represent an elemen-
tary approach. The goal of this chapter is to present an entirely elementary
introduction to fixed-point theory, which can also be used as a first lecture
about fixed-point theory. If one restricts attention to $\mathbb{R}^n$, then students only
need to be familiar with elementary geometrical facts and the concept of
continuous functions. We begin with an elementary combinatorical result-
the so-called lemma of Sperner. This then implies the central lemma of
Knaster-Kuratowski-Mazurkiewicz, which in turn implies Brouwer's fixed-point
theorem. Furthermore, a simple generalization of the classical lemma of
Knaster-Kuratowski-Mazurkiewicz in Section 77.4 yields the inequality of
Fan. As can be seen from Figure 77.1 this inequality is of central importance
for fixed-point theory, game theory, and mathematical economics. As several
applications we consider:
Existence of Nash equilibrium points (main theorem of n-person games);
Saddle points and minimax theorem (main theorem of 2-person zero sum
games);
Theorem of Hartman-Stampacchia about variational inequalities.
In Section 77.8 we derive the fixed-point theorem of Kakutani for multivalued
maps on $\mathbb{R}^n$ from Brouwer's fixed-point theorem by using a simple approximation
argument. The finite intersection property then immediately yields a
generalization to locally convex spaces-the fixed-point theorem of Fan-
Glicksberg. Figure 77.2 shows important consequences of this fixed-point
theorem, which already have been proved in Section 9.3. Moreover, in Section
77.10 this fixed-point theorem is used to obtain the main theorem of mathe-
matical economics, which can also be viewed as a general existence result for
quasi-variational inequalities. Figure 77.3 shows important applications of
Brouwer's fixed-point theorem to the theory of monotone operators of Part II.
At the same time, Figure 77.1 shows that very many different results are
equivalent to Brouwer's fixed-point theorem. This will be proved in Section

[Figure 77.2. Schematic diagram: consequences of the fixed-point theorem of Fan-Glicksberg (1952) on multivalued maps, namely the fixed-point theorem of Tihonov (1935), the fixed-point theorem of Bohnenblust-Karlin (1950), and the fixed-point theorems of Schauder (1927/30).]

[Figure 77.3. Schematic diagram: Brouwer's fixed-point theorem (1912); Schauder's fixed-point theorem (1927/30); existence principle for systems of equations (Proposition 2.8); existence principle for monotone systems of inequalities of Debrunner-Flor (1964) (Proposition 2.17); main theorem on monotone operators of Browder-Minty 1963 (Theorem 26.A); main theorem on maximal monotone operators of Minty (1962) and Browder (1968) (Theorem 32.A).]

77.13. It follows that Brouwer's fixed-point theorem can be viewed as one of
the possible formulations of a general mathematical existence principle, which
we shall call Brouwer's principle. Another such fundamental existence principle
is the principle of Hahn-Banach of Chapter 39. The years shown
in Figures 77.1-77.3 illustrate that the various applications of Brouwer's
principle were recognized only over a long period of time. The many equivalent
formulations of Brouwer's principle are the reason for the many different,
but actually equivalent, approaches to fixed-point theory, game theory, and
mathematical economics.
In Section 77.12, we show that Brouwer's fixed-point theorem is equivalent
to an obvious generalization of the classical intermediate-value theorem of
Bolzano to $\mathbb{R}^n$ (intermediate-value theorem of Bolzano-Poincaré-Miranda).
This shows that Brouwer's principle is nothing other than a generalization of
the intermediate-value theorem of Bolzano (1817).
The connection between simplicial methods and the mapping degree, as
well as numerical methods, will be discussed in the problem section.
Recall that the notions of topological vector space and locally convex space
have been introduced in $A_1(22)$ and $A_1(40)$, respectively. For example, the
Euclidean space $\mathbb{R}^n$ and each B-space are locally convex spaces and hence
topological vector spaces.

77.1. Lemma of Sperner


As clearly as possible we try to present the simple and ingenious proof idea
for Brouwer's fixed-point theorem, which goes back to Knaster, Kuratowski,
and Mazurkiewicz. In order to avoid clumsy notation, we restrict ourselves
during the next three sections to $\mathbb{R}^2$. The reader then can easily extend everything
to $\mathbb{R}^n$.
Let M be a closed triangle in $\mathbb{R}^2$ with vertices $P_0, P_1, P_2$. The r-dimensional
Figure 77.4 Figure 77.5

sides of M are the vertices $P_0, P_1, P_2$ for r = 0, the sides $P_0P_1, P_1P_2, P_2P_0$ for
r = 1, and the triangle itself for r = 2.
By definition the base of a point in M is the side of lowest dimension on
which the point lies. In Figure 77.4 the base of P and Q is the triangle and the
side $P_0P_1$, respectively.

Lemma 77.1 (Sperner (1928)). Let the triangle M be triangulated by subtriangles.
Instead of $P_0, P_1, P_2$ we simply denote the vertices of M by 0, 1, 2,
and assign to each knot point of the triangulation a number which belongs to
some vertex of its base in M. Then there exists a Sperner simplex, i.e., a
subtriangle having vertices with numbers 0, 1, 2 (Fig. 77.5).

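PROOF: A side of a subtriangle T is called distinguished if and only if it carries
the numbers 0, 1. We have exactly the following two possibilities:
(i) T has precisely one distinguished side (Sperner simplex).
(ii) T has precisely two or no distinguished sides (no Sperner simplex).
Now count the distinguished sides of all subtriangles T. In this count each
distinguished side in the interior of M occurs twice, and each distinguished side
on the boundary of M occurs once. The distinguished sides on the boundary all lie
on the side 01 of M, and their number is odd. Hence the total count is odd, and
by (i), (ii) the number of Sperner simplices is odd. Therefore there exists a
Sperner simplex. □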
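The parity statement in this proof can be tested by a small experiment. The following Python sketch is only an illustration under our own assumptions: we use the regular triangulation of M into n² congruent subtriangles, describe the knot points by integer pairs (i, j), and choose the admissible numbers at random. The function name and the grid description are hypothetical and not part of the text.

```python
import random

def sperner_count(n, seed=0):
    """Count the subtriangles with numbers 0, 1, 2 in a randomly labeled grid triangulation."""
    rng = random.Random(seed)
    # Knot point (i, j) with i, j >= 0 and i + j <= n has the barycentric
    # coordinates (n - i - j, i, j) / n with respect to the vertices 0, 1, 2.
    label = {}
    for i in range(n + 1):
        for j in range(n + 1 - i):
            bary = (n - i - j, i, j)
            # Admissible numbers: vertices of the base, i.e., nonzero coordinates.
            allowed = [v for v in range(3) if bary[v] > 0]
            label[(i, j)] = rng.choice(allowed)
    count = 0
    for i in range(n):
        for j in range(n - i):
            # "Upward" subtriangle with corner (i, j).
            triangles = [((i, j), (i + 1, j), (i, j + 1))]
            # "Downward" subtriangle, present only if it fits into M.
            if i + j <= n - 2:
                triangles.append(((i + 1, j), (i, j + 1), (i + 1, j + 1)))
            for tri in triangles:
                if {label[p] for p in tri} == {0, 1, 2}:
                    count += 1
    return count
```

According to the proof above, `sperner_count(n, seed)` is odd for every n ≥ 1 and every random choice of admissible numbers; in particular, it is never zero.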

77.2. Lemma of Knaster, Kuratowski, and Mazurkiewicz

Lemma 77.2 (Knaster, Kuratowski, and Mazurkiewicz (1929)). Suppose a
closed triangle M of $\mathbb{R}^2$ with vertices $P_0, P_1, P_2$ is covered with three closed sets
$C_0, C_1, C_2$ with
$\operatorname{co}\{P_i, P_j, P_k\} \subseteq C_i \cup C_j \cup C_k$ (1)
for all possible index combinations, where index repetitions are allowed. Then
the intersection $C_0 \cap C_1 \cap C_2$ is not empty.

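PROOF: For k = 1, 2, ... consider a sequence of triangulations of M, where the
maximal diameters of the subtriangles tend to zero as $k \to \infty$. For every
triangulation we label the knot points as follows. If the knot point P has the
base $\operatorname{co}\{P_i, \ldots\}$, then (1) yields an index i among the vertices of this base
with $P \in C_i$, and P receives the number i. By Lemma 77.1 there exists a Sperner
simplex with vertices $P_0^{(k)}, P_1^{(k)}, P_2^{(k)}$ which carry the numbers 0, 1, 2.
The construction of the labeling implies that
$P_i^{(k)} \in C_i, \qquad i = 0, 1, 2.$ (2)

We now choose a convergent subsequence with $P_i^{(k')} \to Q_i$ as $k' \to \infty$. Since the
diameters of the subtriangles tend to zero, we obtain $Q_0 = Q_1 = Q_2$. Since the sets
$C_i$ are closed, (2) implies the claim $Q_0 \in C_i$ for all i. □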

77.3. Elementary Proof of Brouwer's Fixed-Point Theorem

By an n-dimensional, closed simplex we mean the convex hull of n + 1 points
$P_0, \ldots, P_n$ in $\mathbb{R}^n$ which do not all lie in a hyperplane.

Theorem 77.A (Brouwer (1912)). Every continuous map $f: M \to M$ of an n-dimensional,
closed simplex M into itself has a fixed point.

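PROOF: We apply Lemma 77.2. Let n = 2, i.e., M is a triangle in $\mathbb{R}^2$. For n > 2
one can use analogous arguments. Every point P in M has the representation
$P = \lambda_0(P)P_0 + \lambda_1(P)P_1 + \lambda_2(P)P_2$ (3)

with barycentric coordinates $\lambda_i$. Here we have
$0 \le \lambda_0, \lambda_1, \lambda_2 \le 1 \quad\text{and}\quad \lambda_0 + \lambda_1 + \lambda_2 = 1.$ (4)
We set
$C_i = \{P \in M : \lambda_i(f(P)) \le \lambda_i(P)\}, \qquad i = 0, 1, 2.$

These sets satisfy the assumptions of Lemma 77.2, since the continuity of f
and $\lambda_i(\cdot)$ implies that $C_i$ is closed, and (1) follows because of $f(M) \subseteq M$ and
(4). This is easily obtained by checking all possible cases. From $\lambda_0(P_0) = 1$, for
example, it follows that $P_0 \in C_0$.
According to Lemma 77.2 there exists a point $P \in M$ with $P \in C_0 \cap C_1 \cap C_2$.
This is the required fixed point, because from
$\lambda_i(f(P)) \le \lambda_i(P), \qquad i = 0, 1, 2$ (5)
and (4) we obtain equality in (5), and thus f(P) = P. □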
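This proof can be turned into a simple, although inefficient, numerical procedure. The following Python sketch is an illustration under our own assumptions and not one of the refined simplicial algorithms mentioned in the introduction: we describe M by barycentric coordinates, use the regular triangulation with mesh size 1/n, give each knot point P a number i with $\lambda_i(P) > 0$ for which $\lambda_i(f(P)) - \lambda_i(P)$ is minimal (so that P lies in $C_i$), and return the barycenter of the first Sperner simplex found. The map `f` and all names are hypothetical.

```python
def f(lam):
    """Hypothetical continuous map of the triangle into itself (barycentric coordinates)."""
    a, b, c = lam
    g = (b * b + 0.1, c * c + 0.2, a * a + 0.3)
    s = g[0] + g[1] + g[2]
    return (g[0] / s, g[1] / s, g[2] / s)

def label(lam):
    """Number i with lam[i] > 0 and f(lam)[i] <= lam[i], up to rounding errors."""
    image = f(lam)
    support = [i for i in range(3) if lam[i] > 0]
    # The minimum of image[i] - lam[i] over the support is <= 0, since both
    # coordinate vectors sum to 1; thus the chosen i satisfies P in C_i.
    return min(support, key=lambda i: image[i] - lam[i])

def approximate_fixed_point(n):
    """Barycenter of a Sperner simplex of the regular triangulation with mesh 1/n."""
    def bary(i, j):
        return ((n - i - j) / n, i / n, j / n)

    labels = {}

    def lab(node):
        if node not in labels:
            labels[node] = label(bary(*node))
        return labels[node]

    for i in range(n):
        for j in range(n - i):
            triangles = [((i, j), (i + 1, j), (i, j + 1))]
            if i + j <= n - 2:
                triangles.append(((i + 1, j), (i, j + 1), (i + 1, j + 1)))
            for tri in triangles:
                if {lab(p) for p in tri} == {0, 1, 2}:
                    pts = [bary(*p) for p in tri]
                    return tuple(sum(p[k] for p in pts) / 3 for k in range(3))
    raise RuntimeError("no Sperner simplex found")
```

For increasing n the returned point P satisfies f(P) ≈ P with increasing accuracy. The storage for the numbers of the knot points grows rapidly with n and with the dimension, which is the disadvantage of simplicial methods noted in the introduction to this chapter.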

Figure 77.6

Corollary 77.3. The assertion of Theorem 77.A remains valid if M is a compact,
convex, nonempty set in $\mathbb{R}^n$.

PROOF. Either M is a point or M lies on an m-dimensional plane and has an
interior point Q there. In this last case, M is homeomorphic to a closed simplex
as shown in Figure 77.6. The assertion then follows from Theorem 77.A. □

77.4. Generalized Lemma of Knaster, Kuratowski, and Mazurkiewicz

Proposition 77.4. We have
$\bigcap_{x \in X} F(x) \neq \emptyset$ (6)

if the following assumptions are satisfied:
(i) X is a nonempty set in a topological vector space E.
(ii) F(x) is a closed, nonempty set in E for all $x \in X$.
(iii) $F(x_0)$ is compact for a fixed $x_0 \in X$.
(iv) For every finite subset $\{x_1, \ldots, x_n\}$ of X we have
$\operatorname{co}\{x_1, \ldots, x_n\} \subseteq \bigcup_{i=1}^{n} F(x_i).$

PROOF:
(I) If X is finite, then the claim follows similarly as in the proof of Lemma 77.2.
(II) Let X be infinite. Suppose (6) is not true. Because of (iii) and the finite
intersection property $A_1(12g)$ there exists a tuple $\{x_1, \ldots, x_m\}$ with
$\bigcap_{i=1}^{m} F(x_i) = \emptyset.$
This contradicts (I). □

77.5. Inequality of Fan


Theorem 77.B (Fan (1972)). We have
$\min_{y \in X} \sup_{x \in X} f(x, y) \le \sup_{x \in X} f(x, x)$ (7)
if the following three assumptions are satisfied:
(a) The function $f: X \times X \to \mathbb{R}$ is given where X is a compact, convex, and
nonempty set in a topological vector space.
(b) f is quasi-concave in the first argument, i.e., $x \mapsto f(x, y)$ is quasi-concave on
X for every fixed $y \in X$.
(c) f is lower semicontinuous in the second argument, i.e., $y \mapsto f(x, y)$ is lower
semicontinuous on X for every fixed $x \in X$.

The concepts of lower semicontinuity and quasi-convexity have been defined
in Section 9.5. Moreover, g is upper semicontinuous and quasi-concave
if and only if -g is lower semicontinuous and quasi-convex, respectively.
Often the following special case of (7) is used. If $f(x, x) \le 0$ for all $x \in X$, then
there exists a $y \in X$ with
$f(x, y) \le 0$ for all $x \in X$, (8)

i.e., one obtains a solution y of this generalized variational inequality.

PROOF. We apply Proposition 77.4. For this we set $m = \sup_{x \in X} f(x, x)$ and
define
$F(x) = \{y \in X : f(x, y) \le m\}.$

Because of (c), the set F(x) is closed and because of (a) it is compact. Moreover,
we have
$\operatorname{co}\{x_1, \ldots, x_n\} \subseteq \bigcup_{i=1}^{n} F(x_i).$

Otherwise, there exists a convex linear combination
$c = \sum_{i=1}^{n} \alpha_i x_i \quad\text{with}\quad c \notin \bigcup_{i=1}^{n} F(x_i).$

That is, $f(x_i, c) > m$ for all i. From (b) it follows that $f(c, c) > m$, which
contradicts the definition of m.
From Proposition 77.4 there exists a $y \in \bigcap_{x \in X} F(x)$. This is (7). □
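The inequality (7) and its special case (8) can be observed in a one-dimensional example. The following Python sketch is only a discretized illustration under our own assumptions: X = [0, 1] is replaced by a uniform grid, and the function `f` is a hypothetical choice which is linear (hence quasi-concave) in the first and continuous in the second argument with f(x, x) = 0.

```python
def f(x, y):
    """Hypothetical example: linear (hence quasi-concave) in x, continuous in y."""
    return (x - y) * (1.0 - 2.0 * y)

def fan_check(n=100):
    """Evaluate both sides of (7) on the grid {0, 1/n, ..., 1}."""
    grid = [k / n for k in range(n + 1)]
    # Left-hand side of (7): minimum over y of the supremum over x.
    sup_x = {y: max(f(x, y) for x in grid) for y in grid}
    y_star = min(grid, key=lambda y: sup_x[y])
    lhs = sup_x[y_star]
    # Right-hand side of (7): supremum over x of f(x, x).
    rhs = max(f(x, x) for x in grid)
    return y_star, lhs, rhs
```

Since f is linear in x, the supremum over x is attained at an end point of [0, 1], so that the grid does not falsify the left-hand side. The minimizing point `y_star` is a solution of the generalized variational inequality (8).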

In the following two sections we consider applications of this inequality.


77.6. Main Theorem for n-Person Games of Nash and the Minimax Theorem

Before reading this section, we recommend a look at Section 9.7 on game
theory. There we restricted ourselves to only two players. Now we consider
n players $P_1, \ldots, P_n$. Suppose each player $P_i$ has a strategy set $K_i$ available.
We assume:
(H) $K_i$ is a compact, convex, nonempty set in a topological vector space $E_i$
for all i = 1, ..., n.
The real number $f_i(p_1, \ldots, p_n)$ represents the loss of $P_i$ if each player $P_j$
chooses the strategy $p_j \in K_j$.

EXAMPLE 77.5. Let $E_i = \mathbb{R}^{n_i}$ and
$K_i = \{p_i \in E_i : 0 \le p_{i,k} \le 1, \; k = 1, \ldots, n_i\}.$
Similarly as in Section 9.7, the $p_{i,k}$ can be viewed as the probability with which
player $P_i$ makes his kth decision.
The main problem of game theory is to find useful definitions of equilibrium
points and to prove their existence. Roughly speaking, equilibrium points
correspond to a behavior most favorable to all players.

Definition 77.6. The point $(q_1, \ldots, q_n)$ with $q_j \in K_j$ for all j is called a Nash
equilibrium point of the game if and only if for all i = 1, ..., n:
$f_i(q_1, \ldots, q_n) = \min_{p_i \in K_i} f_i(q_1, \ldots, q_{i-1}, p_i, q_{i+1}, \ldots, q_n).$

This definition has a simple interpretation. In a Nash equilibrium point no
player has a reason to change his strategy if the other players keep their
strategy (loss minimization).

Theorem 77.C (Nash (1951)). A Nash equilibrium point exists if in addition to
(H) the following two assumptions are satisfied:
(i) All maps $f_i: K_1 \times \cdots \times K_n \to \mathbb{R}$ are continuous.
(ii) All maps $p_i \mapsto f_i(p)$ are convex if we fix arbitrary $p_j$, $j \neq i$, in $p = (p_1, \ldots, p_n)$.

PROOF: We let $X = K_1 \times \cdots \times K_n$ and
$f(p, q) = \sum_{i=1}^{n} \bigl( f_i(q) - f_i(q_1, \ldots, q_{i-1}, p_i, q_{i+1}, \ldots, q_n) \bigr)$
for all $p, q \in X$. Then f is concave in the first argument and continuous in the
second argument. We have f(p, p) = 0. According to the inequality of Fan (8)
there exists a $q \in X$ with
$f(p, q) \le 0$ for all $p \in X$.
If, for any fixedj, we choose a p with P; = q; for all i ::1: j, then we obtain
Jj(q) - Jj(p) ~ 0 for allj. 0

In Section 9.6 we obtaiped the minimax theorem of John von Neumann by


using a fixed-point theorem for multivalued maps. Now we will see that this
theorem is a special case of Theorem 77.C.

EXAMPLE 77.7. We choose n = 2 in Theorem 77.C and suppose that f 1 = - f 2 •


Moreover, we let f = f 1 • The Nash equilibrium point q = (q 1 , q2 ) then implies
that

i.e., f has a saddle point q. .Corollary 9.16 implies that


f(q 1 ,q2 ) = min max f(P~oP 2 ) = max min f(p 1 ,p2 ).

This is a variant of the minimax theorem (Theorem 9.0). As we already saw
in Section 9.7, this theorem describes the zero sum games for two persons.
Because of $f_1 = -f_2$ the loss of player $P_1$ is equal to the gain of player $P_2$.
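
The saddle point property of Example 77.7 can be checked numerically in the simplest zero sum game. The following sketch is only an illustration under assumptions that are not part of the text: the loss matrix `A` ("matching pennies"), the grid of mixed strategies, and the tolerance are chosen here for convenience.

```python
# Loss matrix of player P1 (equal to the gain of P2); illustrative choice.
A = [[1.0, -1.0],
     [-1.0, 1.0]]

def f(a, b):
    """Expected loss of P1 for the mixed strategies p1 = (a, 1-a), p2 = (b, 1-b)."""
    p1, p2 = (a, 1.0 - a), (b, 1.0 - b)
    return sum(p1[i] * A[i][k] * p2[k] for i in range(2) for k in range(2))

grid = [i / 100 for i in range(101)]

# min over p1 of max over p2, and max over p2 of min over p1, on the grid
min_max = min(max(f(a, b) for b in grid) for a in grid)
max_min = max(min(f(a, b) for a in grid) for b in grid)

# Candidate Nash equilibrium: both players mix uniformly.
q1 = q2 = 0.5
saddle_ok = all(
    f(q1, b) <= f(q1, q2) + 1e-12 and f(q1, q2) <= f(a, q2) + 1e-12
    for a in grid for b in grid
)
```

On this grid both iterated extrema agree with $f(q_1, q_2)$ up to rounding, which is the statement of the minimax theorem for this particular game; a grid check of this kind is of course no proof.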

In Chapter 49 the existence of saddle points was our starting point for the
discussion of duality theory.

77.7. Applications to the Theorem of Hartman-Stampacchia for
Variational Inequalities

Consider the variational inequality
$\langle Ay, x - y \rangle \ge 0$ for all $x \in K$. (9)

Proposition 77.8 (Hartman-Stampacchia (1966)). If $K$ is a compact, convex,
nonempty set in the B-space $X$ and $A: K \to X^*$ is continuous, then (9) has a
solution $y \in K$.

PROOF. Apply the inequality of Fan (8) to $f(x, y) = -\langle Ay, x - y \rangle$, $x, y \in K$. □
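
In one dimension the variational inequality (9) can be inspected by brute force. The sketch below assumes $K = [0, 1]$ and the maps $A(y) = y - c$ for three sample values of $c$; the helper `solve_vi`, the grid size, and the tolerance are hypothetical choices made only for this illustration.

```python
def solve_vi(A, n=100, tol=1e-12):
    """Return a grid point y in K = [0, 1] with A(y) * (x - y) >= 0 for all grid points x."""
    grid = [i / n for i in range(n + 1)]
    for y in grid:
        if all(A(y) * (x - y) >= -tol for x in grid):
            return y
    return None  # no grid point satisfies the inequality

# A(y) = y - c for c outside and inside K
solutions = [solve_vi(lambda y, c=c: y - c) for c in (-1.0, 0.3, 2.0)]
```

For these affine maps the solution found is the point of $K$ nearest to $c$: an endpoint of $K$ if $c$ lies outside, and the zero of $A$ if $c$ lies inside.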

Corollary 77.9. Instead of a B-space $X$ we can also choose a locally convex
space $X$, when $X^*$ is equipped with the strong* topology.

77.8. Fixed-Point Theorem of Kakutani


In Chapter 9 we obtained the fixed-point theorems of Kakutani and Fan-
Glicksberg from Corollary 77.9. Here we want to prove these fixed-point
theorems in a direct and intuitive fashion by using Brouwer's fixed-point
theorem.

Proposition 77.10 (Kakutani (1941)). The multivalued map $T: M \to 2^M$ has a
fixed point $x^*$, i.e., $x^* \in T(x^*)$, if the following three conditions are satisfied:
(i) $M$ is a compact, convex, nonempty set in $\mathbb{R}^n$.
(ii) The set $T(x)$ is convex and nonempty for every $x \in M$.
(iii) The graph $G(T) = \{(x, y) : x \in M,\ y \in T(x)\}$ is closed in $M \times M$.

PROOF. The proof idea is to approximate $T$ by single-valued maps to which
Brouwer's fixed-point theorem can be applied, and then to pass to the limit.
(I) Let $n = 2$, and let $M$ be a triangle in $\mathbb{R}^2$. For $k = 1, 2, \dots$ we choose a sequence
of triangulations of $M$ for which the maximal diameter tends to zero as
$k \to \infty$. For each triangulation we construct a map $T_k: M \to M$ by assigning
to each knot point $x$ a point $T_k(x)$ in $T(x)$, and by then extending $T_k$
linearly to the subtriangles. According to Brouwer's fixed-point theorem
(Theorem 77.A) each $T_k$ has a fixed point $x_k$. Suppose it lies in the triangle
with vertices $P_0^{(k)}, P_1^{(k)}, P_2^{(k)}$. From the definition of $T_k$ it follows that
$T_k(P_i^{(k)}) \in T(P_i^{(k)})$, $i = 0, 1, 2$. (10)
Then there exist convergent subsequences, not separately denoted, with
$P_i^{(k)} \to Q_i$, $T_k(P_i^{(k)}) \to S_i$ as $k \to \infty$.
Since the triangle diameters tend to zero as $k \to \infty$, we obtain $Q_0 = Q_1 = Q_2$,
i.e., $x_k \to Q_0$ as $k \to \infty$. Moreover, we have
$x_k \in \mathrm{co}\{P_0^{(k)}, P_1^{(k)}, P_2^{(k)}\}$.
By using the representation of the convex hull through convex linear
combinations, we immediately obtain from $x_k = T_k x_k$ that
$x_k \in \mathrm{co}\{T_k(P_0^{(k)}), T_k(P_1^{(k)}), T_k(P_2^{(k)})\}$.
By choosing convergent subsequences of the barycentric coordinates
as $k \to \infty$, we obtain the relation
$Q_0 \in \mathrm{co}\{S_0, S_1, S_2\}$.
Furthermore, it follows from (10) and (iii) that $(Q_0, S_i) \in G(T)$, i.e.,
$S_i \in T(Q_0)$ for all $i$.
Because of the convexity of $T(Q_0)$ we have $Q_0 \in T(Q_0)$, i.e., $Q_0$ is the
required fixed point.
(II) For a simplex in $\mathbb{R}^n$ one uses analogous arguments.
(III) In the general case, one can choose a simplex $S$ as in Corollary 77.3,
which is mapped homeomorphically onto $M$ by
$h: S \to M$.
Then one constructs approximations $T_k: S \to M$ of $T \circ h: S \to 2^M$ and
applies Brouwer's fixed-point theorem to
$h^{-1} \circ T_k: S \to S$. □
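
The approximation idea of the proof can be observed in one dimension, where triangulations are partitions of an interval. The sketch below assumes $M = [0, 1]$ and the particular map $T(x) = \{1\}$ for $x < 1/2$, $T(1/2) = [0, 1]$, $T(x) = \{0\}$ for $x > 1/2$, which has a closed graph and convex values; the function names and this example map are chosen here for illustration and do not come from the text.

```python
def select(x):
    """A single-valued selection of T (discontinuous, without fixed point)."""
    return 1.0 if x <= 0.5 else 0.0

def fixed_point_of_interpolant(k):
    """Fixed point of the piecewise-linear interpolant T_k of `select` on a grid of mesh 1/k."""
    nodes = [i / k for i in range(k + 1)]
    vals = [select(x) for x in nodes]
    for i in range(k):
        x0, x1 = nodes[i], nodes[i + 1]
        g0, g1 = vals[i] - x0, vals[i + 1] - x1   # T_k(x) - x at both end points
        if g0 == 0.0:
            return x0
        if g0 > 0.0 >= g1:
            # T_k - id is affine on [x0, x1]; solve for its zero
            return x0 + (x1 - x0) * g0 / (g0 - g1)
    return nodes[-1]

approx = {k: fixed_point_of_interpolant(k) for k in (10, 100, 1000)}
```

The selection itself has no fixed point, but each interpolant $T_k$ does, and these fixed points stay within one mesh width of $x^* = 1/2$, the fixed point of $T$.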

77.9. Fixed-Point Theorem of Fan-Glicksberg

Theorem 77.D (Fan (1952), Glicksberg (1952)). The multivalued map
$T: M \to 2^M$ has a fixed point if the following three conditions are satisfied:
(i) $M$ is a compact, convex, nonempty set in the locally convex space $X$.
(ii) $T(x)$ is convex, closed, and nonempty for every $x \in M$.
(iii) $T$ is upper semicontinuous.

We shall prove this generalization of the fixed-point theorem of Kakutani
by using this fixed-point theorem as well as the following lemma, which will
be proved in Problem 77.3.

Lemma 77.11. If $C$ is a closed set in $X$, then also the following two sets
$Q = \{x \in M : x \in T(x) + C\}$,
$P = \{(x, y) : x \in M,\ y \in T(x) + C\}$
are closed.

PROOF OF THEOREM 77.D.
(I) A convex set $U$ in $X$ is called absolutely convex if $x \in U$, $|\alpha| \le 1$ always
implies that $\alpha x \in U$. Let $\mathcal{B}$ be a neighborhood basis of zero in $X$ which
consists of open, absolutely convex sets. For every $U \in \mathcal{B}$ we define the set
$S_U = \{x \in M : x \in T(x) + \bar{U}\}$.
According to Lemma 77.11 each of these sets is closed. Moreover, we show
below in (II) that $S_U$ is nonempty for every $U \in \mathcal{B}$. From the choice of $\mathcal{B}$
it follows that the intersection of finitely many $S_U$ is nonempty. Since $M$
is compact, the finite intersection property $A_1$(12g) implies that there
exists an $x$ with $x \in S_U$ for all $U \in \mathcal{B}$. Since $T(x)$ is closed, this means
that $x \in T(x)$, and hence $x$ is the required fixed point.
(II) We show that $S_U \ne \emptyset$. Since $M$ is compact, there exist finitely many
points $x_1, \dots, x_k \in M$ such that the open sets $x_1 + U, \dots, x_k + U$ form a
covering of $M$. We set $K = \mathrm{co}\{x_1, \dots, x_k\}$ and define
$T_U(x) = (T(x) + \bar{U}) \cap K$.
Because of $T(x) \subseteq M$ and $U = -U$ it follows that $T_U(x)$ is nonempty,
convex, and closed. The set $K$ lies in a finite-dimensional subspace of $X$
which can be identified with $\mathbb{R}^n$. According to Lemma 77.11 the map
$T_U: K \to 2^K$ has a closed graph. The fixed-point theorem of Kakutani of
Section 77.8 implies the existence of a point $x$ with $x \in T_U(x)$, i.e., $S_U \ne \emptyset$. □

77.10. Applications to the Main Theorem of Mathematical Economics
About Walras Equilibria and Quasi-Variational Inequalities

In order to explain the basic economical idea we consider a simple situation.

Standard Model 77.12 (Reasonable Price System). We define
$P = \{p \in \mathbb{R}^n : 0 \le p_i \le 1 \text{ for all } i\}$ and let $Q$ denote a compact, convex,
nonempty set in $\mathbb{R}^n$.
Suppose there are $n$ goods $G_1, \dots, G_n$. Let $p_i$ be the price of $G_i$ with
$0 \le p_i \le 1$. We assign a set $S(p) \subseteq Q$ to each price vector $p \in P$. Thereby
$s \in S(p)$ with $s = (s_1, \dots, s_n)$
means that the difference between supply and demand for $G_i$ is equal to $s_i$.
The number
$(p|s) = \sum_{i=1}^{n} p_i s_i$
is therefore equal to the difference between the value of the goods which are
supplied to the market and the value of the goods which are demanded by the
market. Thus
$(\bar{p}|\bar{s}) = 0$, $\bar{s} \in S(\bar{p})$
is the mathematical formulation of the situation "supply equals demand." This
ideal situation cannot always be realized. We therefore will be satisfied with
the weaker result (12) below. First of all, it is reasonable to assume that
$(p|s) \ge 0$ for all $p \in P$, $s \in S(p)$. (11)
This is the so-called law of Walras. Roughly, it means that we only consider
economical situations with a supply excess.
We call $(\bar{p}, \bar{s})$ with $\bar{s} \in S(\bar{p})$ a Walras equilibrium if and only if the following
holds:
$0 \le (\bar{p}|\bar{s}) \le (p|\bar{s})$ for all $p \in P$. (12)
Roughly, this means that the value difference between supply and demand
becomes minimal. The vector $\bar{p}$ is called equilibrium price system.

The fundamental problem of mathematical economics is to find conditions
for the supply excess map $S$ which guarantee the existence of a Walras
equilibrium.
For the special case that
$X = Y = \mathbb{R}^n$, $P$ = unit cube
and $f(p, s) = (p|s)$ the following general assumptions (H1)-(H4) precisely
correspond to our model. If we write (12) in the form
$(p - \bar{p}\,|\,\bar{s}) \ge 0$ for all $p \in P$, $\bar{s} \in S(\bar{p})$, (13)
then this is a so-called quasi-variational inequality. Therefore Theorem 77.E
below represents a general existence theorem for quasi-variational inequalities.
We assume that:
(H1) $X$ and $Y$ are locally convex spaces with compact, convex, nonempty
subsets $P \subseteq X$ and $Q \subseteq Y$. Let a continuous function $f: P \times Q \to \mathbb{R}$, an
upper semicontinuous map $S: P \to 2^Q$, and a real number $c$ be given.
(H2) The map $f(\cdot, s)$ is quasi-convex on $P$ for every fixed $s \in Q$.
(H3) $S(p)$ is convex, closed, and nonempty for every fixed $p \in P$.
(H4) $f(p, s) \ge c$ for all $p \in P$, $s \in S(p)$ (Walras law).

Theorem 77.E (Main Theorem of Mathematical Economics of Gale (1955),
Nikaido (1956), and Debreu (1959)). If (H1)-(H4) hold, then there exists a
Walras equilibrium, i.e., there exists an element $\bar{p} \in P$ and an element $\bar{s} \in S(\bar{p})$
such that
$c \le f(\bar{p}, \bar{s}) \le f(p, \bar{s})$ for all $p \in P$.

PROOF. We use the fixed-point theorem of Fan-Glicksberg (Theorem 77.D).
For this we define a map $R: Q \to 2^P$ through
$R(s) = \{p \in P : f(p, s) \le f(q, s) \text{ for all } q \in P\}$.
The set $R(s)$ is nonempty, because the continuous function $f(\cdot, s)$ assumes its
minimum on the compact set $P$. Moreover, $R(s)$ is convex by (H2) and closed.
From the continuity of $f$ it follows that the graph
$G(R) = \{(s, p) : s \in Q,\ p \in R(s)\}$
is a closed subset of the compact set $Q \times P$. From Problem 77.4 it follows that
$R$ is upper semicontinuous.
We now let $M = P \times Q$ and define the map $T: M \to 2^M$ through
$T(p, s) = R(s) \times S(p)$.
Theorem 77.D implies the existence of a fixed point $(\bar{p}, \bar{s}) \in T(\bar{p}, \bar{s})$. This means
that $\bar{s} \in S(\bar{p})$ and
$f(\bar{p}, \bar{s}) \le f(p, \bar{s})$ for all $p \in P$.
The lower bound $c \le f(\bar{p}, \bar{s})$ follows from (H4). □

Leon Walras, through his basic work (1874), is regarded as the creator of
mathematical economics. However, only during the 1950s did his ideas
appear in a precise mathematical form following the work of Arrow and
Debreu (1954), Gale (1955), and Nikaido (1956). Thereby John von Neumann's
minimax theorem of game theory (1928) and the fixed-point theorem of
Kakutani (1941) played an important role.

77.11. Negative Retract Principle


Proposition 77.13. Let $B$ be a closed ball in $\mathbb{R}^n$, $n \ge 1$. Then the boundary $\partial B$ is
not a retract of $B$.

PROOF. We may assume that $B$ is centered at the origin. If $r: B \to \partial B$ is a
retraction, then $-r: B \to B$ is fixed-point free, which
contradicts Brouwer's fixed-point theorem. □

77.12. Intermediate-Value Theorem of Bolzano-Poincaré-Miranda

Consider the equation
$f(x) = 0$, $x \in C$. (14)
If $C = [a, b]$ is a compact interval, and the function $f: C \to \mathbb{R}$ is continuous
with
$f(a) \le 0$ and $f(b) \ge 0$, (15)
then equation (14) has a solution (Fig. 77.7(a)). We want to extend this
well-known intermediate-value theorem of Bolzano to $\mathbb{R}^n$. To this end we
consider the compact cuboid
$C = \{x \in \mathbb{R}^n : a_i \le \xi_i \le b_i \text{ for all } i\}$,
where $x = (\xi_1, \dots, \xi_n)$. Furthermore, let $\partial C_i^-$ and $\partial C_i^+$ denote the sides of $C$
with $\xi_i = a_i$ and $\xi_i = b_i$, respectively (Fig. 77.7(b)). In place of (15), we have
the very natural condition
$\pm f_i(x) \ge 0$ on $\partial C_i^{\pm}$ for $i = 1, \dots, n$. (16)

[Figure 77.7, panels (a) and (b)]

The following proposition is used in modern computer algorithms to determine
cuboids $C$ for which (14) has a solution. Thereby one gets a rough
idea about the position of the zero. This then can be improved by special
algorithms.

Proposition 77.14 (Miranda (1941)). Let $C$ be a compact cuboid in $\mathbb{R}^n$ and
$f: C \to \mathbb{R}^n$ a continuous function which satisfies (16). Then $f$ has a zero in $C$.

PROOF. We use Brouwer's fixed-point theorem. Let $C$ be the standard cube,
i.e., $a_i = -1$ and $b_i = 1$ for all $i$. The general case can always be reduced by
an affine transformation to this special situation.
(I) First, we assume that (16) holds in the stronger form
$\pm f_i(x) > 0$ on $\partial C_i^{\pm}$ for $i = 1, \dots, n$. (16a)
Our simple trick is to let
$F_i(x) = \xi_i - \varepsilon_i f_i(x)$.
Because of (16a) there exist numbers $\varepsilon_i > 0$ with
$-1 \le F_i(x) \le 1$ on $C$ for all $i$.
Thus we obtain $F(C) \subseteq C$. Brouwer's fixed-point theorem implies that $F$
has a fixed point in $C$, i.e., $f$ has a zero in $C$.
(II) If (16) holds, but not (16a), then we pass from $f$ to
$g(x) = f(x) + \varepsilon x$, $\varepsilon > 0$.
We then can apply (I) to $g$. It follows that the equation
$f(x_\varepsilon) + \varepsilon x_\varepsilon = 0$, $x_\varepsilon \in C$ (17)
has a solution for each $\varepsilon > 0$. Because of the compactness of $C$ there exists
a convergent subsequence $x_\varepsilon \to x$ as $\varepsilon \to 0$. From (17) it follows that
$f(x) = 0$. □
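
The sign condition (16a) and the map $F$ from step (I) of the proof can be tried out on a concrete planar example. The sketch below is only an illustration: the map `f`, the value `eps = 0.5`, the number of boundary samples, and the use of plain iteration are assumptions made here. Brouwer's theorem itself provides no iteration; the iteration converges in this example only because this particular $F$ happens to be contractive near the zero.

```python
import math

def f(x, y):
    """Illustrative map on the standard cube C = [-1, 1]^2."""
    return (x + 0.5 * y * y, y - 0.5 * math.cos(x))

def miranda_ok(samples=201):
    """Check (16a) on sampled boundary points of C (a sampling check, not a proof)."""
    ts = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    return all(
        f(-1.0, t)[0] < 0.0 < f(1.0, t)[0]      # f_1 on the sides xi_1 = -1 and xi_1 = +1
        and f(t, -1.0)[1] < 0.0 < f(t, 1.0)[1]  # f_2 on the sides xi_2 = -1 and xi_2 = +1
        for t in ts
    )

def F(x, y, eps=0.5):
    """F_i = xi_i - eps * f_i, which maps C into C for this f and eps."""
    f1, f2 = f(x, y)
    return (x - eps * f1, y - eps * f2)

x, y = 0.0, 0.0
for _ in range(200):
    x, y = F(x, y)
```

After the loop, `(x, y)` is a fixed point of `F` up to rounding and hence an approximate zero of `f` inside the cube.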

It is very interesting that this intermediate-value theorem (Proposition
77.14) is equivalent to Brouwer's fixed-point theorem. This clearly illustrates
the analytical nature of Brouwer's fixed-point theorem.
The proof above shows that Proposition 77.14 is a consequence of Brouwer's
fixed-point theorem. We prove the converse. Let $C$ be the standard cube
and $F: C \to C$ continuous, i.e.,
$-1 \le F_i(x) \le 1$ on $C$ for all $i$.
We set $f(x) = x - F(x)$. With this (16) holds. From Proposition 77.14 it
follows that $f$ has a zero in $C$, i.e., $F$ has a fixed point in $C$.
The general version of Brouwer's fixed-point theorem follows from this
special case as in the proof of Corollary 77.3.

77.13. Equivalent Statements to Brouwer's Fixed-Point Theorem

The following statements are equivalent:
(i) Lemma of Knaster-Kuratowski-Mazurkiewicz.
(ii) Brouwer's fixed-point theorem.
(iii) Negative retract principle.
(iv) Inequality of Fan.
(v) Theorem of Hartman-Stampacchia.
(vi) Fixed-point theorem of Kakutani.
(vii) Fixed-point theorem of Fan-Glicksberg.
(viii) Main theorem of mathematical economics (Theorem 77.E).
(ix) Intermediate-value theorem of Bolzano-Poincaré-Miranda.
(x) Cubic Sperner lemma of Kuhn.
The equivalences are pictured in Figure 77.1. In this chapter we proved that
(i) ⇒ (ii) ⇒ (iii), (ii) ⇒ (iv) ⇒ (v), (ii) ⇒ (vi) ⇒ (vii) ⇒ (viii),
and (ii) ⇔ (ix).
In Section 2.3 we showed that (iii) ⇒ (ii). Thus it remains to show that
(v) ⇒ (ii) ⇒ (i) and (viii) ⇒ (ii). This will be done in Problems 77.5 to 77.7.
The equivalence between the cubic Sperner lemma and Brouwer's fixed-point
theorem will be discussed in Problem 77.10.

PROBLEMS

77.1. Straightforward generalizations to $\mathbb{R}^n$. Generalize Lemmas 77.1 and 77.2
to $\mathbb{R}^n$.
Hint: See Knaster, Kuratowski, and Mazurkiewicz (1929).
77.2. Inequality of Fan and an economical model. Consider Problem 9.1 and prove
the existence of a reasonable price system. Do not use the fixed-point theorem
of Kakutani, but instead give a shorter proof by using the inequality of Fan.
Solution: Let
$X = \{p \in \mathbb{R}^n : p \ge 0,\ \sum_{i=1}^{n} p_i = 1\}$
and
$F_j(p) = \sum_{i=1}^{n} D_{ij}(p)$, $f(q, p) = (q|F(p)) - (q|a)$.
Assumption (9.32) states that $f(p, p) = 0$ for all $p \in X$. From the inequality of
Fan (8) it follows that there exists an element $p^* \in X$ with $f(q, p^*) \le 0$ for all
$q \in X$. Thus we obtain that $F(p^*) \le a$. This is assertion (9.33).
77.3. Proof of Lemma 77.11.
Solution: Let $P = \{(x, y) : x \in M,\ y \in T(x) + C\}$. We show that the
complement $(M \times X) - P$ is open. Let $(x_0, y_0) \notin P$, i.e., $x_0 \in M$ and
$y_0 \notin T(x_0) + C$. Then there exists a neighborhood $U$ of zero with
$(y_0 + U) \cap (T(x_0) + C + U) = \emptyset$.
Since $T$ is upper semicontinuous, it follows from Proposition 9.5 that there
exists a neighborhood $V(x_0)$ of $x_0$ in $M$ with
$x \in V(x_0) \Rightarrow T(x) \subseteq T(x_0) + U$,
and hence
$x \in V(x_0) \Rightarrow (y_0 + U) \cap (T(x) + C) = \emptyset$.
Therefore a neighborhood of $(x_0, y_0)$ in $M \times X$ does not belong to $P$.
Analogous arguments can be used for $Q$.
77.4. Criterion for upper semicontinuity. The spaces $X$ and $Y$ are locally convex with
compact subsets $M \subseteq X$ and $N \subseteq Y$. Suppose the images $T(x)$ of the map
$T: M \to 2^N$
are closed for each $x \in M$.
Prove: $T$ is upper semicontinuous if and only if the graph of $T$ is closed.
Solution: The graph $G(T)$ is equal to the set $\{(x, y) : x \in M,\ y \in T(x)\}$.
(I) Let $T$ be upper semicontinuous. We show that $Z = (M \times N) - G(T)$ is open.
Then $G(T)$ is closed. Let $(x, y) \in Z$. Because of the compactness of $T(x)$ and
since $y \notin T(x)$ there exist open sets $V$ and $W$ with
$y \in V$, $T(x) \subseteq W$, $V \cap W = \emptyset$.
Since $T$ is upper semicontinuous, it follows from Proposition 9.5 that there
exists an open set $U$ with $x \in U$ and $T(U \cap M) \subseteq W$ and thus $U \times V \subseteq Z$.
This means that $(x, y) \in \operatorname{int} Z$.
(II) Let $G(T)$ be closed. If $T$ is not upper semicontinuous, then Proposition
9.5 implies that there exists a point $x$ and an open set $W$ with $T(x) \subseteq W$
such that for all neighborhoods $U(x)$ the image $T(U(x) \cap M)$ does not
completely lie in $W$. Hence, for every $U(x)$ there exists a point
$(x_U, y_U) \in G(T)$ with $y_U \notin W$. Because of the compactness of $G(T)$ it follows
from $A_1$(17f) that there exists a convergent Moore-Smith subsequence
$(x_{U'}, y_{U'}) \to (x, y)$.
Thus we find that $(x, y) \in G(T)$, and hence $y \in T(x)$. Because of $y_{U'} \notin W$
we obtain $y \notin W$. This gives the required contradiction $y \notin T(x)$.
77.5. From the theorem of Hartman-Stampacchia follows Brouwer's fixed-point
theorem.
Solution: Let $T: K \subseteq \mathbb{R}^n \to K$ be continuous on the closed ball $K$. From
Proposition 77.8 with $A = I - T$ there exists an element $y \in K$ with
$(y - Ty\,|\,x - y) \ge 0$ for all $x \in K$.
For $x = Ty$ we obtain $y - Ty = 0$.
77.6. From Brouwer's fixed-point theorem follows the lemma of
Knaster-Kuratowski-Mazurkiewicz.
Solution: We have to prove the statement of Proposition 77.4 for finite $X$
with $\dim E < \infty$. Let, for example, $X = \{x_1, x_2\}$ and
$\mathrm{co}\{x_1, x_2\} \subseteq F(x_1) \cup F(x_2)$ and $x_i \in F(x_i)$ for $i = 1, 2$.
We have to show that $F(x_1) \cap F(x_2) \ne \emptyset$.
Suppose $F(x_1) \cap F(x_2) = \emptyset$. We let $K = \mathrm{co}\{x_1, x_2\}$, $d_i(x) = \mathrm{dist}(x, F(x_i))$ and
$f(x) = (d_1(x) x_1 + d_2(x) x_2)/(d_1(x) + d_2(x))$
for all $x \in K$. The divisor is not equal to zero. Brouwer's fixed-point theorem
implies that there exists an element $x \in K$ with
$f(x) = x$.
Because of $K \subseteq F(x_1) \cup F(x_2)$ we may assume that $x \in F(x_1)$, and hence
$d_1(x) = 0$. From $x = f(x)$ it follows that $x = x_2$ and hence $x \in F(x_2)$. This is a
contradiction to $F(x_1) \cap F(x_2) = \emptyset$.
77.7. From Theorem 77.E follows Brouwer's fixed-point theorem. Let $S: P \to P$ be a
continuous map from the closed ball $P$ into itself. For $r > 0$ we define
$M_r = \{p \in P : |p - Sp| \le r\}$.
We prove that $M_r \ne \emptyset$. Otherwise, we have $|p - Sp| \ge r$ for all $p \in P$. If we
choose
$f(p, s) = |p - s|$,
then Theorem 77.E implies that there exists an element $\bar{p} \in P$ with
$r \le f(p, S\bar{p})$ for all $p \in P$.
Especially for $p = S\bar{p}$ we find $f(p, S\bar{p}) = 0$, which contradicts $r > 0$.
The intersection of finitely many sets $M_r$ is always nonempty. From the finite
intersection property $A_1$(12g) there exists an element $\bar{p} \in \bigcap_{r > 0} M_r$. This is the
required fixed point of $S$.
77.8. Simplicial algorithm to determine fixed points. We restrict ourselves to a
description of the basic idea which is closely related to the proof of Brouwer's
fixed-point theorem of Section 77.3.
77.8a. Homotopy. Let the map $f: \mathbb{R}^n \to \mathbb{R}^n$ be compact, i.e., $f$ is continuous and $f(\mathbb{R}^n)$
is relatively compact. We consider the homotopy
$H(x, t) = (1 - t)x_0 + tf(x)$.
By definition, the fixed-point set $\mathrm{Fix}(H)$ is equal to the set of all points
$(x, t) \in \mathbb{R}^n \times [0, 1]$ with
$H(x, t) = x$.
We look for a fixed point $x_1$ of $f$, i.e., $(x_1, 1) \in \mathrm{Fix}(H)$. Let $x = (\xi_1, \dots, \xi_n)$.
Show: $\mathrm{Fix}(H)$ contains a component $C$ which connects the point $(x_0, 0)$ with
a point $(x_1, 1) \in \mathrm{Fix}(H)$ (Figure 77.8). Note that $C$ need not be a curve.
Solution: Choose an open ball $G$ which contains the image $H(\mathbb{R}^n \times [0, 1])$
and apply Theorem 14.C of Part I (Global Leray-Schauder principle).
77.8b. Goal. As in Figure 77.8 we want to construct a sequence of simplices which
approximate $C$ and lead us to a fixed point $x_1$ of $f$. The essential steps are:
(i) Integer labeling.
(ii) Door-in/door-out principle.
(iii) Creation of new simplices by using quasi-reflections.
77.8c. Labeling. We assign an integer $i = 0, 1, \dots, n$ to a point $(x, t) \in \mathbb{R}^n \times [0, 1]$,
where $i$ is the greatest number with the property
for $j = 1, \dots, i$.
An $n$-side of an $(n + 1)$-simplex in $\mathbb{R}^{n+1}$ is called regular or completely labeled
if and only if its vertices carry the numbers $0, \dots, n$. An $(n + 1)$-simplex is called
regular if and only if it contains exactly two regular sides.
Prove: Every $(n + 1)$-simplex in $\mathbb{R}^n \times [0, 1]$ is either regular or contains no
regular side.
Hint: Elementary arguments. See Allgower and Georg (1980), p. 34.
77.8d. Quasi-reflection. Let $\sigma(x_0, \dots, x_{n+1})$ denote the closed $(n + 1)$-simplex with
vertices $x_0, \dots, x_{n+1}$. By a quasi-reflection with respect to $x_k$ we mean the
creation of a new simplex where $x_k$ is replaced with

Here, $k_-$ and $k_+$ are the left and right neighbors of $k$ in Figure 77.9. For example,
Figure 77.10 shows the quasi-reflection of $\sigma(x_0, x_1, x_2)$ with respect to $x_0$.
77.8e.* Algorithms. Let the map $f$ satisfy the assumptions of Problem 77.8a. Then $f$
has a fixed point.
Formulate the following steps as an algorithm and show that for any given
$\varepsilon > 0$ after finitely many steps one finds a point $(x, 1)$ with
$|f(x) - x| < \varepsilon$.
Moreover, any subsequence generated by this algorithm converges to a fixed
point of $f$.
(i) One constructs a regular $(n + 1)$-start simplex $\sigma$ such that $(x_0, 0)$ lies on a
regular side $S_0$ of $\sigma$. One enters the simplex $\sigma$ through $S_0$ and leaves it
through the second regular side $S_1$ (Door-in/door-out principle).
(ii) One performs a quasi-reflection of $\sigma$ with respect to the vertex which does
not lie on the exit side $S_1$. Thereby one obtains the simplex $\sigma_1$ which one
enters through $S_1$ and leaves through the second regular side, etc. (See
Figure 77.11 and Figure 77.8.)
Hint: See Allgower and Georg (1980), p. 37. There, and in Peitgen and Prüfer
(1979), one may find further material together with numerical examples.

[Figures 77.8, 77.9, 77.10, and 77.11]


77.9.* Sperner simplices and mapping degree. Let $f: P \subseteq \mathbb{R}^n \to \mathbb{R}^n$ be a continuous map
from the $n$-dimensional homogeneous polyhedron $P$ with $f(x) \ne 0$ on $\partial P$. Prove
the following formula about the mapping degree:
$\deg(f, \operatorname{int} P) = \sum_{\sigma} \mathrm{orientation}(\sigma)$. (18)
The sum is taken over all Sperner simplices $\sigma$ of $P$ with respect to $f$. We explain
this. The polyhedron $P$ is the union of finitely many closed $n$-simplices, whereby
two of them always have exactly one $(n - 1)$-side in common or are disjoint
(Fig. 77.12). Therefore $P$ possesses a natural triangulation. Let
$\sigma = \mathrm{co}\{x_0, \dots, x_n\}$ be a simplex of $P$. In $\mathbb{R}^n$ we specify a standard simplex
$\sigma_0 = \mathrm{co}\{e_0, \dots, e_n\}$ with $e_0 = 0$ and $e_i = (0, \dots, 1, \dots, 0)$, where 1 occurs at the $i$th
position. Using a linear transformation $L$ we map $\sigma$ bijectively onto a translation
of $\sigma_0$ with $Lx_i = e_i + \mathrm{const}$ for all $i$ and define
$\mathrm{orientation}(\sigma) = \operatorname{sgn} \det L$.
In Figure 77.13 we have $\mathrm{orientation}(\sigma_{\pm}) = \pm 1$.
We assign a number $i = 0, \dots, n$ to every point $x \in P$. This is the smallest
number $i$ with
$f_{i+1}(x) \le 0$.
Thereby let $i = n$ if $f_j(x) > 0$ for all $j$. Then $\sigma$ is called a Sperner simplex with
respect to $f$ if and only if the vertices carry all numbers 0 to $n$. By
$\mathrm{orientation}(\sigma)$ we mean the orientation of $\sigma$ which occurs if one runs through
the vertices from 0 to $n$.

[Figures 77.12 and 77.13]

For the triangulation we require that to every $(n - 1)$-simplex $\tau$ on $\partial P$ there
exists a number $i(\tau)$ such that all points of $\tau$ do not have the number $i(\tau)$. This
condition is always satisfied for a sufficiently fine triangulation.
Hint: See Prüfer and Siegberg (1979). In Kliesch (1984), (1988), one can find
a general approach which includes many different formulas of type (18). The
coincidence with the classical mapping degree is shown there by using the
axiomatic approach of Chapter 12.
The importance of (18) and similar formulas is that $\deg(f, \operatorname{int} P) \ne 0$ implies
the existence of a zero of $f$ in $P$, and a computation of $\deg(f, \operatorname{int} P)$ only requires
finitely many values of $f$.
77.10.* Cubic Sperner lemma of Kuhn (1960).
77.10a. Notations. For the points $x, y \in \mathbb{R}^n$ we write $x = (\xi_1, \dots, \xi_n)$ and $y = (\eta_1, \dots, \eta_n)$.
As usual, $x \le y$ is equivalent to $\xi_i \le \eta_i$ for all $i$, and $x < y$ means $x \le y$ and
$x \ne y$. Moreover, let $e \in \mathbb{R}^n$ denote the point $(1, \dots, 1)$. Let $\mathbb{Z}$ be the set of all
integers. For $p = 1, 2, \dots$ we define the discrete cube
$C_p = \{x \in \mathbb{Z}^n : 0 \le \xi_i \le p \text{ for all } i\}$.
77.10b. Lemma. Let $y = f(x)$ be a map $f: C_p \to C_1$ with the property
$\xi_i = 0 \Rightarrow \eta_i = 0$,
$\xi_i = p \Rightarrow \eta_i = 1$.
Show: There exist points $x_0, x_1, \dots, x_m \in C_p$ with
$x_0 < x_1 < \cdots < x_m < x_0 + e$,
and
$e \le \sum_{r=0}^{m} f(x_r) \le me$.
Hint: See Kuhn (1960).
77.10c. Equivalence.
Show: The cubic Sperner lemma of Problem 77.10b is equivalent to Brouwer's
fixed-point theorem.
Hint: See Kuhn (1960) and Riedrich (1976, L), p. 59.
The importance of this is that Brouwer's fixed-point theorem is shown to
be equivalent to a purely combinatorial statement.

References to the Literature

Classical works: Sperner (1928), Knaster, Kuratowski, and Mazurkiewicz (1929),
Scarf (1967), Fan (1972).
Survey about statements equivalent to Brouwer's existence principle: Gwinner
(1981).
Applications to variational inequalities: Mosco (1976, S).
Inequality of Fan, multivalued maps, and mathematical economics: Aubin (1979,
M), (1979a, M).
Survey about the numerical application of simplicial methods: Todd (1976, S),
Allgower and Georg (1980, S, B, H), Peitgen and Prüfer (1979, S) (bifurcation), Schilling
(1986, M).
Simplicial methods and mapping degree: Prüfer and Siegberg (1979), Kliesch (1983),
(1984), (1988).
Mathematical economics: Karlin (1959, M, H), Nikaido (1968, M), (1970, M), Aubin
(1979a, M), Ramanathan (1983, L).
Classical works on mathematical economics: Walras (1874, M), von Neumann (1928)
(foundation of game theory), Arrow and Debreu (1954), Gale (1955), Nikaido (1956),
Debreu (1959, M).
Topological methods in Walrasian economics: Dierker (1974, M).
Methods of global analysis in mathematical economics: Smale (1982, S).
Equilibrium in incomplete markets: Duffie and Shafer (1985), Husseini, Lasry, and
Magill (1987) (generalization of the Brouwer fixed-point theorem to Grassmannians of
k-dimensional subspaces of $\mathbb{R}^n$).
Handbook of mathematical economics: Arrow and Intriligator (1982).
(See also the References to the Literature to Chapter 9.)
CHAPTER 78

Homotopy Methods and One-Dimensional Manifolds

The idea of considering the measure of the set of critical values of one function
or of several functions is due to Marston Morse (1939).
Arthur Sard (1942)
The purpose of this note is to introduce a nonlinear version of Fredholm
operators, and to prove that in this context Sard's theorem (1942) holds if zero
measure is replaced by first category. Strictly speaking, our result is a general-
ization of a theorem of Brown (1935), an earlier special case of Sard's theorem.
Steve Smale (1965)
We illustrate that most existence theorems using degree theory are in principle
relatively constructive.
Shui-Nee Chow, John Mallet-Paret, and James A. Yorke (1978)
The term continuation method derives from a familiar class of numerical meth-
ods dating back at least to Lahaye (1935), and also known as embedding
methods. It is important to emphasize a distinction between classical embedding
methods and the present continuation methods. The classical methods require
that the homotopy parameter shall vary monotonically and the effort to follow
a homotopy curve is abandoned when a critical point of the homotopy parame-
ter, i.e., a turning point, is encountered. In contrast, the present continuation
methods have faith and proceed beyond such critical points.
Eugene Allgower and Kurt Georg (1980)

In this chapter, Sard's theorem (Proposition 4.55 of Part I) plays a central
role. Before reading this chapter, one should look again at this theorem as
well as Definition 4.52 about regular values. For didactical reasons, we use a
parametrized version of Sard's theorem already in Section 78.2, and present
the proof afterwards in Section 78.7. The definition of the fixed-point index
and the mapping degree of Section 78.6, however, only requires Sard's theorem
and not the parametrized version. Sard's theorem is one of the most important
theorems in modern mathematics. It gives a precise formulation of the following
philosophy: Most situations in nature are generic, i.e., not degenerate.

As another central tool in this chapter we use the structure theorem about
one-dimensional manifolds (Proposition 73.9). Roughly, this theorem states
that one-dimensional manifolds behave like reasonable curves.
The results of Section 78.2 about regular solution curves admit a development
of fixed-point theory which is characterized by great geometrical clarity
and intuition. Surprisingly, this approach has only been developed in
recent years.

78.1. Basic Idea


The simple basic idea of homotopy methods is the following. In order to solve
the equation
$F(x) = 0$, $x \in \mathbb{R}^n$, (1)
for a map $F: \mathbb{R}^n \to \mathbb{R}^n$, we consider the more general equation
$H(x, t) = 0$, $t \in [0, 1]$, $x \in \mathbb{R}^n$ (2)
with $H(x, 1) = F(x)$. Suppose (2) has a solution $x_0$ for $t = 0$ which is easy to
determine. We look for a curve
$x = x(s)$, $t = t(s)$ (3)
with arclength $s$ such that for $s = 0$ this curve begins at $x = x_0$, $t = 0$ and
passes through a point $x = x_1$, $t = 1$ (Fig. 78.1). The classical continuation
methods are designed to find solutions of (2) of the form $x = x(t)$. Such
procedures, however, break down at a turning point $T$ of Figure 78.1, and one
can only follow curves such as in Figure 78.2. With (3), however, one can follow
any curve whatever.
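
Following a curve of the form (3) through turning points can be sketched with a simple predictor-corrector scheme for $n = 1$. Everything in this sketch is an assumption made for illustration and not taken from the text: the function `g`, the homotopy $H(x, t) = t - g(x)$ (so that $F(x) = H(x, 1)$ has the zero $x_1 = 2$), the step size `h`, and the Euler predictor with a Newton-type corrector along the gradient of $H$.

```python
import math

def g(x):
    return x**3 - 3.0 * x**2 + 2.5 * x

def H(x, t):
    return t - g(x)                      # H(x, 0) = 0 at x = 0; H(x, 1) = F(x)

def grad_H(x, t):
    return (-(3.0 * x**2 - 6.0 * x + 2.5), 1.0)   # (H_x, H_t)

def tangent(x, t, prev):
    """Unit tangent of the solution curve, oriented like the previous tangent."""
    hx, ht = grad_H(x, t)
    norm = math.hypot(hx, ht)
    tx, tt = ht / norm, -hx / norm       # orthogonal to grad H
    if tx * prev[0] + tt * prev[1] < 0.0:
        tx, tt = -tx, -tt
    return tx, tt

def follow(x=0.0, t=0.0, h=0.01, max_steps=10000):
    tau = tangent(x, t, (0.0, 1.0))      # start in the direction of increasing t
    turning_points = 0
    for _ in range(max_steps):
        x, t = x + h * tau[0], t + h * tau[1]      # Euler predictor along the tangent
        for _ in range(5):                          # corrector: back to H = 0
            hx, ht = grad_H(x, t)
            r = H(x, t) / (hx * hx + ht * ht)
            x, t = x - r * hx, t - r * ht
        new_tau = tangent(x, t, tau)
        if new_tau[1] * tau[1] < 0.0:               # t'(s) changed sign
            turning_points += 1
        tau = new_tau
        if t >= 1.0:
            break
    for _ in range(20):                             # Newton polish on the slice t = 1
        x -= H(x, 1.0) / grad_H(x, 1.0)[0]
    return x, turning_points

x_end, n_turn = follow()
```

Along this curve $t$ first increases, then decreases, then increases again, so a method that insists on the form $x = x(t)$ would stop at the first turning point, whereas the arclength parametrization passes through both.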

78.2. Regular Solution Curves

The important question is: When does the solution set of the equation
$H(x, t) = 0$, $(x, t) \in \mathbb{R}^{n+1}$ (4)
consist of reasonable curves which can be followed by numerical algorithms?
Figure 78.3 shows that this need not always be the case.

[Figures 78.1, 78.2, and 78.3]

Our assumption is:
(H) The map $H: \mathbb{R}^{n+1} \to \mathbb{R}^n$ is $C^\infty$.

Definition 78.1. Assume (H). By a regular solution curve $C$ of (4) we mean a set
of solution points which is $C^\infty$-diffeomorphic to the straight line $\mathbb{R}^1$ or the
unit circle $S^1$, and each point $(x, t) \in C$ is a regular point of $H$. The solution set
of (4) is called regular if and only if all its components are regular solution
curves.
By an index of the solution point $(x, t)$ of (4) we mean $\operatorname{sgn} \det H_x(x, t)$.

Proposition 78.1 (Regularity Criterion). If (H) holds and zero is a regular value
of $H$, then the solution set of (4) is regular.

PROOF. From the preimage theorem (Theorem 4.J) it follows that the solution
set is a one-dimensional manifold. The structure theorem for connected,
one-dimensional manifolds (Proposition 73.9) yields the assertion. □

EXAMPLE 78.3 (Perturbation of $H$). If (H) holds, then for every $\delta > 0$ there exists
a $p \in \mathbb{R}^n$ with $|p| < \delta$ such that the solution set of the equation
$H(x, t) - p = 0$, $(x, t) \in \mathbb{R}^{n+1}$
is regular.

PROOF. From Sard's theorem we can choose $p$ as a regular value of $H$, i.e.,
zero is a regular value of $H - p$. □

EXAMPLE 78.4 (Parametrization). Consider $H = H(x, t, p)$ with parameter
$p \in \mathbb{R}^r$. Let $H: \mathbb{R}^{n+1} \times \mathbb{R}^r \to \mathbb{R}^n$ be $C^\infty$. If zero is a regular value of $H$, then equation
$H(x, t, p) = 0$, $(x, t) \in \mathbb{R}^{n+1}$
has a regular solution set for almost every parameter $p \in \mathbb{R}^r$.
As usual, "for almost all $p$" means "for all $p$ except for a set of measure zero
in $\mathbb{R}^r$."

PROOF. According to the parametrized version of Sard's theorem of Section
78.7, zero is a regular value of $H(\cdot, p)$ for almost all $p$. □

A common homotopy is
$H(x, t, p) = (1 - t)(x - p) + tF(x)$, $p \in \mathbb{R}^n$.
For $t = 0$ we find that $x = p$ is the unique solution of
$H(x, t, p) = 0$, $(x, t) \in \mathbb{R}^{n+1}$. (5)

EXAMPLE 78.5. If $F: \mathbb{R}^n \to \mathbb{R}^n$ is $C^\infty$ with zero as a regular value, then equation
(5) has a regular solution set for almost all starting values $p \in \mathbb{R}^n$.

PROOF. Zero is a regular value of $H$, because the linearization satisfies
$H'(x, t, p) = (H_x(x, t, p), H_t(x, t, p), H_p(x, t, p)) = ((1 - t)I + tF'(x), \dots, (t - 1)I)$.
It follows that $R(H'(x, t, p)) = \mathbb{R}^n$ at every point $(x, t, p)$ with $H(x, t, p) = 0$.
For $t = 1$, where $F(x) = 0$, and for $t \ne 1$ this is
guaranteed by the first term $F'(x)$ and the last term $(t - 1)I$, respectively.
The assertion now follows from Example 78.4. □
Example 78.5 is of great practical use. If one picks an arbitrary starting
value $x_0 = p$, then with probability one there passes a regular solution curve
through it. According to Sard's theorem, almost every arbitrarily small constant
perturbation of $F$ makes zero a regular value of the perturbed $F$. On the computer
$F$ is not exactly known. Thus on computers one always expects regular
solution curves. Experience shows that this is the case as a rule. In summary
we remark: Sard's theorem and its parametrized version guarantee that
equation (4) can be treated in a naive way, i.e., the generic situation of regular
solution curves may be assumed.
The great advantage of regular solution curves is that they cannot intersect
themselves and cannot accumulate at one point. More precisely, we have the
following result, which will frequently be used.

Proposition 78.6 (Door-In/Door-Out Principle). Let $Z$ be a bounded region in $\mathbb{R}^{n+1}$ and let the solution set of (4) be regular.
If $C$ is a regular solution curve of (4) which enters $Z$ at the boundary point $(x, t) \in \partial Z$, then $C$ cannot remain in the region $Z$.
If $C$ passes through a point $(x, t_0)$ whose index $\operatorname{sgn} \det H_x(x, t_0)$ is not equal to zero, then $C$ has the form $x = x(t)$ in a neighborhood of $(x, t_0)$.

The last statement is often used to assure that C enters Z (see Figure 78.9
of Section 78.5).

PROOF. If $C$ remains in $Z$ forever, then compactness of $\overline{Z}$ implies that $C$ has an accumulation point $(x, t) \in \overline{Z}$. Continuity of $H$ implies that $H(x, t) = 0$, i.e., $(x, t)$ belongs to the solution set. This is in contradiction to the regularity of this set.
The second statement about the index follows from the implicit function theorem. □

78.3. Turning Point Principle and Bifurcation Principle

For simplicity in notation, we let $y = (x, t)$. From
$$H(y(s)) = 0 \tag{6}$$
we obtain, by differentiation with respect to $s$, the equation
$$H_x(y(s))x'(s) + H_t(y(s))t'(s) = 0. \tag{7}$$
Instead of (6), we use
$$H(y(s)) = 0, \qquad (y'(s) \mid y'(s)) = 1. \tag{8}$$
The last equation shows that $s$ is the arclength. We set
$$A(s) := \begin{pmatrix} H_y(y(s)) \\ y'(s)^{\top} \end{pmatrix}.$$
From (7) and (8) we obtain the matrix equation
$$A(s) \begin{pmatrix} x'(s) & I \\ t'(s) & 0 \end{pmatrix} = \begin{pmatrix} 0 & H_x(y(s)) \\ 1 & x'(s)^{\top} \end{pmatrix}.$$
By passing to determinants we obtain the important relation
$$t'(s) \det A(s) = \det H_x(y(s)). \tag{9}$$
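Relation (9) can be checked numerically on a simple curve. The following is only a minimal sketch under stated assumptions: the example $H(x, t) = x^2 + t^2 - 1$ with $n = 1$, the arclength parametrization $y(s) = (\cos s, \sin s)$, and the function names are illustrative choices and do not come from the text.

```python
import numpy as np

def H_jac(y):
    """Jacobian H_y = (H_x, H_t) of the illustrative map H(x, t) = x^2 + t^2 - 1."""
    x, t = y
    return np.array([[2.0 * x, 2.0 * t]])

def check_relation_9(s):
    """Return both sides of t'(s) det A(s) = det H_x(y(s)) on the unit circle."""
    y = np.array([np.cos(s), np.sin(s)])      # curve point y(s) = (x(s), t(s))
    dy = np.array([-np.sin(s), np.cos(s)])    # y'(s), of unit length
    A = np.vstack([H_jac(y), dy])             # A(s): H_y on top, y'(s) as last row
    lhs = dy[1] * np.linalg.det(A)            # t'(s) det A(s)
    rhs = H_jac(y)[0, 0]                      # det H_x reduces to H_x for n = 1
    return lhs, rhs

for s in (0.0, 0.7, 2.5, 4.0):
    lhs, rhs = check_relation_9(s)
    print(f"s = {s:3.1f}: lhs = {lhs:+.6f}, rhs = {rhs:+.6f}")
```

In this example $\det A(s) = 2$ for all $s$, so both sides equal $2\cos s$.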
In the following let $H: \mathbb{R}^{n+1} \to \mathbb{R}^n$ be $C^1$. Moreover, let $s \mapsto y(s)$ be a solution curve $C$ which satisfies (8), and let $S$ denote the solution set of the equation
$$H(y) = 0, \qquad y \in \mathbb{R}^{n+1}.$$
(L) Local uniqueness. If $\det A(s_0) \neq 0$, then $S$ has no bifurcation point at $y(s_0)$ (Fig. 78.4).
We prove this. We may assume, possibly after relabeling the variables, that $t'(s_0) \neq 0$. From (9) it follows that $\det H_x(y(s_0)) \neq 0$. The implicit function theorem (Theorem 4.B) then implies (L).
(C) Constancy principle. If $C$ is a regular solution curve, then $\det A(s) \neq 0$ along $C$, i.e., the sign of this determinant remains constant as a result of continuity.
Figure 78.4    Figure 78.5    Figure 78.6

This follows from $\operatorname{rank} H'(y(s)) = n$ and $H'(y(s))y'(s) = 0$ with $y'(s) \neq 0$, i.e., $\operatorname{rank} A(s) = n + 1$. Note that $H' = H_y$.
(T) Turning point principle. If $\det A(s) \neq 0$ along $C$, and if the index jumps at $s_0$, i.e., $\det H_x(y(s))$ changes its sign at $s_0$, then $y(s_0)$ is a turning point of $C$ (Fig. 78.5). This follows from (9), since then $t'(s)$ changes its sign at $s_0$.
(B) Bifurcation principle. Let $t'(s) \neq 0$ on $[s_1, s_2]$. If the index jumps, i.e., $\det H_x(y(s))$ has different signs for $s_1$ and $s_2$, then for the solution set $S$ there exists a bifurcation point on $C$ which lies between the curve points $y(s_1)$ and $y(s_2)$ (Fig. 78.6).
In particular, $y(s_0)$ is a bifurcation point if $\det A(s)$ changes its sign at $s_0$ and $t'(s_0) \neq 0$.
The first statement follows from the index jump principle (Proposition 15.1). For the second statement note that according to (9), the index $\det H_x(y(s))$ changes its sign at $s_0$.
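The turning point principle (T) suggests a simple numerical test: sample the index $\det H_x(y(s))$ along a computed curve and look for sign changes. The sketch below is illustrative only; it assumes the unit circle $H(x, t) = x^2 + t^2 - 1$ with $y(s) = (\cos s, \sin s)$, where $\det A(s) = 2$ is constant, and the helper name `sign_changes` is hypothetical.

```python
import numpy as np

def sign_changes(values):
    """Indices i for which values[i] and values[i + 1] have opposite signs."""
    signs = np.sign(values)
    return [i for i in range(len(signs) - 1) if signs[i] * signs[i + 1] < 0]

# Sample points along the illustrative curve y(s) = (cos s, sin s).
s = np.linspace(0.1, 3.0, 30)
index = 2.0 * np.cos(s)     # det H_x(y(s)) = 2 x(s)
t_prime = np.cos(s)         # t'(s)

# By (9) with det A(s) = 2 > 0, the index and t'(s) change sign together.
jumps = sign_changes(index)
for i in jumps:
    print(f"index jump, hence turning point, for s between {s[i]:.2f} and {s[i + 1]:.2f}")
```

The detected interval brackets $s = \pi/2$, where the circle reaches its largest $t$-value and turns back.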

78.4. Curve Following Algorithm


From a numerical point of view it is important to have algorithms available which follow the solution curve $s \mapsto y(s)$ of equation (8) very closely. Let us describe here the basic idea for such an algorithm (Fig. 78.7).
We start at a curve point $y(s)$ for fixed $s$ and want to compute a new curve point $y(s_1)$.

Figure 78.7 Figure 78.8



(i) Predictor step. We determine a solution $h$ of
$$H_y(y(s))h = 0, \qquad (h \mid h) = 1.$$
We choose the sign of $h$ such that $\det A(s) > 0$ if we replace $y'(s)$ in $A(s)$ with $h$. Then the point
$$y_1 = y(s) + th$$
lies on the tangent line to the curve for fixed $t > 0$.
(ii) Corrector step. We choose $y_1$ as a starting point for a Newton-like method:
$$H_y(y_k)(y_{k+1} - y_k) = -H(y_k), \qquad (h \mid y_{k+1} - y_k) = 0, \qquad k = 1, 2, \ldots.$$
Under favorable conditions the sequence $(y_k)$ computed this way converges to a curve point. In dangerous situations like the one in Figure 78.8, one may have to choose a smaller parameter $t$. This is discussed more thoroughly in Allgower and Georg (1980, S), p. 42. In order to avoid the computation of the inverse matrix "$H_y(y_k)^{-1}$," one can use a quasi-Newton procedure, where "$H_y(y_k)^{-1}$" is approximated by matrices which only depend on the values of $H$. An effective algorithm may be found in Georg (1981).
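The predictor and corrector steps above can be sketched as follows. This is a minimal illustration and not the algorithm of Georg (1981): it uses a fixed step length, a plain Newton corrector with an exact Jacobian, and no step-size control. The names `tangent` and `follow_curve`, the parameter `step` (the parameter $t$ of the predictor step), and the circle example are assumptions made for this example.

```python
import numpy as np

def tangent(J):
    """Unit vector h with J h = 0 for an n x (n+1) matrix J of rank n,
    oriented such that det A > 0 for A = (J; h)."""
    _, _, vt = np.linalg.svd(J)
    h = vt[-1]                                  # spans the null space of J
    if np.linalg.det(np.vstack([J, h])) < 0:
        h = -h
    return h

def follow_curve(H, J, y0, step=0.05, n_steps=100, tol=1e-10, max_newton=20):
    """Follow the solution curve of H(y) = 0 starting at the curve point y0."""
    y = np.asarray(y0, dtype=float)
    path = [y.copy()]
    for _ in range(n_steps):
        h = tangent(J(y))
        z = y + step * h                        # predictor step on the tangent line
        for _ in range(max_newton):
            # Corrector step: J(z) d = -H(z) together with (h | d) = 0.
            M = np.vstack([J(z), h])
            rhs = np.concatenate([-H(z), [0.0]])
            d = np.linalg.solve(M, rhs)
            z = z + d
            if np.linalg.norm(d) < tol:
                break
        y = z
        path.append(y.copy())
    return np.array(path)

# Illustrative example: the unit circle H(x, t) = x^2 + t^2 - 1.
H = lambda y: np.array([y[0] ** 2 + y[1] ** 2 - 1.0])
J = lambda y: np.array([[2.0 * y[0], 2.0 * y[1]]])
path = follow_curve(H, J, [1.0, 0.0])
print(path[:3])
```

Because the orientation is fixed through $\det A > 0$, the sketch passes through the turning points of the circle without reversing direction. In difficult situations such as the one in Figure 78.8 a fixed step length may fail, and a smaller step would be needed.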

78.5. Constructive Leray-Schauder Principle

With a typical example we shall try to explain the applicability of the regular solution curves of Section 78.2 to constructive fixed-point theory. We assume:
(H1) The map $F: \mathbb{R}^n \to \mathbb{R}^n$ is $C^\infty$ with zero as a regular value. We set
$$H(x, t) = (1 - t)(x - x_0) + tF(x).$$
(H2) There exists a bounded region $G$ in $\mathbb{R}^n$ and a point $x_0 \in G$ such that the equation $H(x, t) = 0$ has no solution in $\partial G \times [0, 1]$.

Proposition 78.7. If (H1), (H2) hold, then the equation $F(x) = 0$, $x \in G$ has a solution $x_1$.
If $F: \overline{G} \to \mathbb{R}^n$ is only continuous and (H2) is satisfied, then $F(x) = 0$, $x \in G$ has a solution as well.

PROOF.
(I) Generic case. Assume (H1), (H2). We set $Z = G \times \,]0, 1[$ and denote the covering surfaces and the lateral surface of the cylinder by $Z_k = G \times \{k\}$ and $M = \partial G \times [0, 1]$. Here we have $k = 0, 1$ (Fig. 78.9(a)). If $x_0$ is only perturbed slightly so that (H2) remains valid, then we may assume from Example 78.5 that a regular solution curve $C$ of the equation
$$(1 - t)(x - x_0) + tF(x) = 0, \qquad (x, t) \in \mathbb{R}^{n+1} \tag{10}$$
passes through $(x_0, 0)$.
According to the door-in/door-out principle (Proposition 78.6), $C$ enters the region $Z$ through $Z_0$. Since $(x_0, 0)$ is the only solution of (10) for $t = 0$ and (H2) is satisfied, $C$ cannot leave the region $Z$ through either $Z_0$ or $M$. Hence $C$ exits through $Z_1$ (Fig. 78.9(b)).

Figure 78.9 (panels (a) and (b))

(II) Approximation argument for the weaker assumption. If $F: \overline{G} \to \mathbb{R}^n$ is continuous, then we uniformly approximate $F$ on $\overline{G}$ with $C^\infty$-functions $F_k: \mathbb{R}^n \to \mathbb{R}^n$ according to the Weierstrass approximation theorem, i.e., the components of $F_k$ are polynomials and $F_k \rightrightarrows F$ on $\overline{G}$ as $k \to \infty$. By passing from $F_k$ to $F_k - y_k$ we can assume that zero is a regular value of $F_k - y_k$ and $y_k \to 0$ as $k \to \infty$. Then $F_k - y_k$ satisfies assumptions (H1), (H2). We obtain a solution $x_k$ of
$$F_k(x_k) - y_k = 0, \qquad x_k \in G.$$
Since the set $\overline{G}$ is compact, there exists a convergent subsequence $x_{k'} \to x$. Thus we obtain $F(x) = 0$, $x \in \overline{G}$ as $k \to \infty$. But from (H2) it follows that $x \in G$. □
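The constructive content of Proposition 78.7 can be illustrated with the homotopy $H(x, t) = (1 - t)(x - x_0) + tF(x)$. The sketch below deliberately uses a simpler technique than the curve following algorithm of Section 78.4: it steps the parameter $t$ from 0 to 1 and corrects $x$ by Newton's method, which presumes that the solution curve has the form $x = x(t)$, i.e., no turning points occur. The map `F`, its Jacobian `dF`, and the function name `continuation` are hypothetical choices for this example; for this particular `F` one has $\det H_x > 0$ everywhere, so the presumption holds.

```python
import numpy as np

def F(x):
    """Illustrative smooth map on R^2 (an assumption, not from the text)."""
    return np.array([x[0] + 0.5 * np.cos(x[1]) - 1.0,
                     x[1] + 0.5 * np.sin(x[0]) - 0.5])

def dF(x):
    """Jacobian F'(x) of the illustrative map."""
    return np.array([[1.0, -0.5 * np.sin(x[1])],
                     [0.5 * np.cos(x[0]), 1.0]])

def continuation(F, dF, x0, n_steps=20, tol=1e-12, max_newton=30):
    """Track the zero of H(x, t) = (1 - t)(x - x0) + t F(x) from t = 0 to t = 1."""
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()                                # H(x0, 0) = 0
    identity = np.eye(len(x0))
    for t in np.linspace(0.0, 1.0, n_steps + 1)[1:]:
        for _ in range(max_newton):
            H_val = (1.0 - t) * (x - x0) + t * F(x)
            H_x = (1.0 - t) * identity + t * dF(x)
            dx = np.linalg.solve(H_x, -H_val)    # Newton step in x for fixed t
            x = x + dx
            if np.linalg.norm(dx) < tol:
                break
    return x                                     # approximate zero of F

x1 = continuation(F, dF, [0.0, 0.0])
print("approximate zero:", x1, "residual:", np.linalg.norm(F(x1)))
```

When turning points occur, stepping in $t$ breaks down and one would follow the curve by arclength instead.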

78.6. Constructive Approach for the Fixed-Point Index and the Mapping Degree

In Chapter 12 we gave an elementary introduction to the fixed-point index
and the mapping degree. The only tools we used were the integral theorem of
Gauss and the substitution rule for multiple integrals. Here we present an
approach which is of great geometrical intuition, but uses some advanced
methods. The simple geometric idea is contained in Figure 78.10.
If one uses this approach in lectures, one might choose the following
presentation.
(a) One introduces the following concepts: manifold in $\mathbb{R}^n$, regular value (Definition 4.52), preimage theorem (Theorem 4.J), Sard's theorem (Proposition 4.55), and the structure theorem about one-dimensional manifolds (Proposition 73.9).
One might present the last two results at first without proof, in order not to exhaust the audience with long proofs.
(b) One discusses Sections 78.1 and 78.2.
(c) One formulates the axioms for the fixed-point index as in Sections 12.2 and 12.3 and adds the following observations.

Figure 78.10 (panels (a) to (d))

As in Chapter 12 let $V_0(G, \mathbb{R}^n)$ denote the set of all functions $f: \overline{G} \to \mathbb{R}^n$ which satisfy the following properties:

(i) The set $G$ is open and bounded in $\mathbb{R}^n$.
(ii) $f$ is continuous on $\overline{G}$ and is $C^1$ on $G$.
(iii) $f$ has at most finitely many fixed points, all of which are regular and do not lie on the boundary $\partial G$.

Let $V(G, \mathbb{R}^n)$ denote the set of all continuous functions $f: \overline{G} \to \mathbb{R}^n$ which have no fixed points on $\partial G$. We set $C^\infty(\mathbb{R}^n) = C^\infty(\mathbb{R}^n, \mathbb{R}^n)$.

Definition 78.8 (Fixed-Point Index $i(f, G)$). For maps $f \in V_0(G, \mathbb{R}^n) \cap C^\infty(\mathbb{R}^n)$ we let
$$i(f, G) = \sum_{j=1}^{m} \operatorname{sgn} \det F'(x_j),$$
where $F(x) = x - f(x)$, and $x_1, \ldots, x_m$ are precisely all fixed points of $f$ on $G$. If $f$ has no fixed points on $G$, then let $i(f, G) = 0$.
For $f \in V(G, \mathbb{R}^n)$ and $G \neq \emptyset$ we choose a map
$$\tilde{f} \in V_0(G, \mathbb{R}^n) \cap C^\infty(\mathbb{R}^n),$$
which has the approximation property
$$\sup_{x \in \partial G} \|\tilde{f}(x) - f(x)\| \le \tfrac{1}{4} \inf_{x \in \partial G} \|f(x) - x\|, \tag{11}$$
and let
$$i(f, G) = i(\tilde{f}, G). \tag{12}$$
For $G = \emptyset$ let $i(f, G) = 0$.
The mapping degree is defined as
$$\deg(F, G, y) = i(f + y, G).$$
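For the generic case, the defining sum of the fixed-point index is easy to evaluate once the fixed points are known. The following minimal sketch assumes that the list of fixed points in $G$ is supplied by the caller; the function name `fixed_point_index` and the one-dimensional example $f(x) = 2x - x^3$, i.e., $F(x) = x - f(x) = x^3 - x$, are illustrative assumptions and do not come from the text.

```python
import numpy as np

def fixed_point_index(dF, fixed_points):
    """Sum of sgn det F'(x_j) over the given regular fixed points x_j."""
    total = 0
    for x in fixed_points:
        jac = np.atleast_2d(dF(np.asarray(x, dtype=float)))
        det = np.linalg.det(jac)
        if det == 0.0:
            raise ValueError("fixed point is not regular")
        total += int(np.sign(det))
    return total                  # an empty list gives index 0

def dF(x):
    """F'(x) for the illustrative map F(x) = x^3 - x."""
    return 3.0 * x ** 2 - 1.0

# Fixed points of f are the zeros of F: -1, 0, 1.
print(fixed_point_index(dF, [-1.0, 0.0, 1.0]))   # G = ]-2, 2[   -> 1
print(fixed_point_index(dF, [0.0]))              # G = ]-1/2, 1/2[ -> -1
```

On $G = \,]-2, 2[$ the signs are $+1, -1, +1$, so the index is 1, whereas the small interval around the origin contains only the fixed point with negative sign.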

In this definition two points have to be justified: There always exists an $\tilde{f}$ with (11), and $i(\tilde{f}, G)$ in (12) is independent of the choice of $\tilde{f}$.

Proposition 78.9 (Existence of the Fixed-Point Index). For every map $f \in V(G, \mathbb{R}^n)$ and all $V(G, \mathbb{R}^n)$ with arbitrary $n \in \mathbb{N}$ there exists precisely one fixed-point index which satisfies axioms (A1) to (A4) of Section 12.3. It is given by Definition 78.8.

PROOF. The key formula is (14) below. The simple geometric proof idea can be obtained from Figure 78.10.
(I) Generic case. Let $f_0, f_1 \in V_0(G, \mathbb{R}^n) \cap C^\infty(\mathbb{R}^n)$ with $G \neq \emptyset$ and
$$\sup_{x \in \partial G} \|F_0(x) - F_1(x)\| < \inf_{x \in \partial G} \|F_0(x)\|, \tag{13}$$
where $F_j(x) = x - f_j(x)$. We want to show that
$$i(f_0, G) = i(f_1, G). \tag{14}$$
For this we let $Z = G \times \,]0, 1[$ and $Z_j = G \times \{j\}$ and also $M = \partial G \times [0, 1]$ (Fig. 78.10(a)). The main trick is to construct the homotopy
$$H(x, t) = (1 - t)F_0(x) + tF_1(x) - y$$
with $y \in \mathbb{R}^n$. Because of $F_0(x) \neq 0$ on $\partial G$ and (13) we have
$$H(x, t) \neq 0 \quad \text{on } M \tag{15}$$
for small $\|y\|$. According to Sard's theorem, we can choose $y$ such that zero is a regular value of $H$. From Proposition 78.2 it follows that the solution set of the equation
$$H(x, t) = 0, \qquad (x, t) \in \mathbb{R}^{n+1} \tag{16}$$
is regular.
(I-1) In order to simplify the notation, let us first assume that $y = 0$. Let $C$ be a regular solution curve of (16) which intersects $Z_k$ for $k = 0$ or $k = 1$. According to the door-in/door-out principle (Proposition 78.6) the curve $C$ enters the region $Z$. Because of (15), the set $M$ contains no points of $C$. Thus we have precisely two possibilities for $C$.
(i) $C$ does not reach the other side $Z_m$ with $m \neq k$ and leaves $Z$ through $Z_k$ (Fig. 78.10(c)).
(ii) $C$ reaches the other side $Z_m$ (Fig. 78.10(d)).

We want to count the points of intersection, whereby we attach a sign to each of them. For this we define the intersection number
$$j(x, k) = \operatorname{sgn} \det H_x(x, k)$$
of the point of intersection $(x, k)$ of the curve $C$ with $Z_k$. Let $s$ be the arclength of $C$. We choose an orientation of $C$ such that $\det A(s) > 0$ along $C$. This is possible according to the constancy principle (C) of Section 78.3. From (9) it follows that
$$\operatorname{sgn} t'(s) = \operatorname{sgn} \det H_x(x(s), t(s)).$$
Thus $j$ has the sign as shown in Figure 78.10(b). The definition of the fixed-point index implies that
$$i(f_k, G) = \sum_{x} j(x, k). \tag{17}$$
One sums over all solution points in the set $Z_k$, i.e., over all solutions $x$ of $F_k(x) = 0$, $x \in G$. Through formula (17) the fixed-point index is given a very intuitive geometrical interpretation.
Now (14) follows immediately from Figures 78.10(c) and 78.10(d):

(a) Suppose the curve $C$ is of type (i). Then there exists an even number of intersection points on $Z_k$ and the intersection numbers have pairwise different sign, i.e., they do not contribute to (17). Hence $i(f_0, G) = i(f_1, G) = 0$ (Fig. 78.10(c)).
(b) Suppose the curve $C$ is of type (ii). Then, after disregarding an even number of intersection points with pairwise different sign, there exists the same number of intersection points on $Z_0$ and $Z_1$ with equal sign. Hence $i(f_0, G) = i(f_1, G)$ (Fig. 78.10(d)).

(I-2) Now consider the case $y \neq 0$. As above, we obtain that
$$i(f_0 - y, G) = i(f_1 - y, G).$$
Note that if $f_k \in V_0(G, \mathbb{R}^n)$, then also $f_k - y \in V_0(G, \mathbb{R}^n)$ for sufficiently small $\|y\|$.
In fact, if $F_k(x) = 0$, $x \in G$ with $F_k(x) = x - f_k(x)$, then
$$\det F_k'(x) \neq 0.$$
Thus $F_k$ is a local $C^\infty$-diffeomorphism at $x$, and hence $F_k - y$ and $F_k$ have the same finite number of zeros on $G$ with equal sign of the Jacobian for sufficiently small $\|y\|$.
This argument also shows that
$$i(f_k - y, G) = i(f_k, G), \qquad k = 0, 1,$$
i.e., (14) holds.

(II) Existence of approximations. Let $f \in V(G, \mathbb{R}^n)$. On $\overline{G}$ the function $f$ can then be approximated arbitrarily closely by an element $g \in V(G, \mathbb{R}^n)$ whose components are polynomials. Sard's theorem implies that we can choose an element $y \in \mathbb{R}^n$ with arbitrarily small $\|y\|$ such that zero is a regular value of $\tilde{f} = g - y$. From the inverse function theorem (Theorem 4.F) it follows that $\tilde{f} \in V_0(G, \mathbb{R}^n)$.
(III) Fixed-point index for $f \in V(G, \mathbb{R}^n)$. If we choose $\tilde{f} = f_0, f_1$ in $V_0(G, \mathbb{R}^n)$ with (11), then we obtain (13). Moreover, it follows from (14) that
$$i(f_0, G) = i(f_1, G).$$
This justifies Definition 78.8. Note that $i(f, G) = i(\tilde{f}, G)$.
(IV) As in Section 12.6 one shows that axioms (A1) to (A4) are satisfied and also that the fixed-point index is unique. □

Now we can extend this fixed-point index to B-spaces in analogy to Section 12.7. This provides all the tools needed to develop the theory of the fixed-point index and the mapping degree of Chapters 13-17 of Part I.

78.7. Parametrized Version of Sard's Theorem


Proposition 78.10. Let the map
$$H: M \times P \subseteq \mathbb{R}^m \times \mathbb{R}^r \to \mathbb{R}^n$$
be $C^k$, $k > \max(0, m - n)$, where $M$ and $P$ are open sets in $\mathbb{R}^m$ and $\mathbb{R}^r$, respectively.
If $y$ is a regular value of $H$, then $y$ is also a regular value of $H(\cdot, p)$ for almost all parameter values $p \in P$.

The significance of this theorem has already been discussed in Section 4.19. An important generalization to Banach manifolds may be found in Section 78.10.
PROOF. Let $H = H(x, p)$ with $x \in M$, $p \in P$. From the preimage theorem (Theorem 4.J) it follows that $H^{-1}(0)$ is a $C^k$-manifold in $\mathbb{R}^m \times \mathbb{R}^r$. Let $H^{-1}(0) \neq \emptyset$. Then we have
$$\dim H^{-1}(0) = m + r - n.$$
Let
$$\pi(x, p) = p$$
denote the natural projection. Also, let $TH^{-1}(0)$ denote the tangent space to $H^{-1}(0)$ at the fixed point $(x, p)$. According to Sard's theorem almost all $p \in P$ are regular values of $\pi$. Note that
$$\dim H^{-1}(0) - \dim P \le m - n$$
and $k > \max(0, m - n)$. Let $p$ be such a regular value with $\pi^{-1}(p) \neq \emptyset$. Then the linearization
$$T\pi(x, p): TH^{-1}(0) \to \mathbb{R}^r$$
is surjective. This is equivalent to saying that for each $q \in \mathbb{R}^r$ there exists an $x(q) \in \mathbb{R}^m$ with
$$H'(x, p)(x(q), q) = 0. \tag{18}$$
Note that $TH^{-1}(0)$ is given by an equation of the form (18). Equation (18) means
$$H_x(x, p)x(q) + H_p(x, p)q = 0 \quad \text{for all } q \in \mathbb{R}^r. \tag{19}$$
Since zero is a regular value of $H$, we obtain that $R(H'(x, p)) = \mathbb{R}^n$. For the Jacobian matrix we get
$$H'(x, p) = (H_x(x, p), H_p(x, p)).$$
The linear dependence relation (19) shows that
$$\operatorname{rank} H'(x, p) = \operatorname{rank} H_x(x, p)$$
and hence $R(H_x(x, p)) = \mathbb{R}^n$. Thus, zero is a regular value of $x \mapsto H(x, p)$. □
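A one-dimensional example may make Proposition 78.10 concrete. The sketch below assumes the illustrative map $H(x, p) = x^2 - p$ with $m = n = r = 1$, which is not taken from the text. Zero is a regular value of $H$, since $H'(x, p) = (2x, -1)$ always has rank one, and the only exceptional parameter is $p = 0$, a set of measure zero. The function name `is_regular_parameter` is hypothetical.

```python
import numpy as np

def is_regular_parameter(p, tol=1e-12):
    """Check whether zero is a regular value of x -> H(x, p) = x**2 - p."""
    if p < 0:
        return True                      # no solutions, so nothing to check
    roots = [np.sqrt(p), -np.sqrt(p)]    # all solutions of H(x, p) = 0
    # H_x(x, p) = 2x has to be surjective, i.e., nonzero, at every solution.
    return all(abs(2.0 * x) > tol for x in roots)

for p in (-1.0, 0.0, 0.5, 1.0):
    print(p, is_regular_parameter(p))
```

For $p = 0$ the two branches $x = \pm\sqrt{p}$ of the parabola $p = x^2$ meet at a point with vertical tangent in the $p$-direction, which is exactly where the projection $\pi(x, p) = p$ fails to be regular.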

78.8. Theorem of Sard-Smale


In this section we extend Sard's theorem to Banach manifolds. This provides
us with a basic tool to generalize differential topological theorems on finite-
dimensional manifolds to Banach manifolds. An important example in this
direction will be considered in Section 78.10.
Recall that a set is called meager or of the first Baire category if and only
if it is the union of at most countably many nowhere dense sets.

Definition 78.11. A set in a topological space is called residual (or massive) if and only if it can be represented as a countable intersection of open, dense subsets.
In particular, every open and dense subset is residual.

Theorem 78.A (Smale (1965)). Let $M$ and $N$ be $C^\infty$-Banach manifolds with chart spaces over $\mathbb{K}$, where $M$ has a countable basis. If
$$f: M \to N$$
is a $C^k$-Fredholm map with
$$k > \max(\operatorname{ind} f'(x), 0) \quad \text{for all } x \in M,$$
then the set of singular values of $f$ is meager, and the set of regular values is residual.

A topological space is a Baire space if and only if all its residual sets are dense. According to $A_1(66)$, every B-space and every Banach manifold is a Baire space. Since subsets of meager sets are meager themselves and $N$ behaves locally like a B-space, the topological results of $A_1(65)$ and $A_1(66)$ imply moreover: The set of regular values of $f$ is dense in $N$ and not meager, i.e., of the second Baire category.
For the following statement, we do not need that M has a countable basis.
Instead, we assume that f is proper.

Corollary 78.12. Let $M$ and $N$ be $C^\infty$-Banach manifolds with chart spaces over $\mathbb{K}$. If $f: M \to N$ is a proper $C^k$-Fredholm map with $k > \max(\operatorname{ind} f'(x), 0)$ for all $x \in M$, then the set of regular values of $f$ is open and dense in $N$.

Note that, for connected $M$, the index $\operatorname{ind} f'(x)$ is constant on $M$. The simplest examples of $C^\infty$-Banach manifolds with chart spaces over $\mathbb{K}$ are the open sets $M$ in a B-space $X$ over $\mathbb{K}$. If $X$ is separable, then $M$ has a countable basis.

78.9. Proof of Theorem 78.A


By using the local normal form of Section 73.1 we transfer Sard's theorem to infinite dimensions. Since Banach manifolds behave locally like B-spaces, we begin by studying the local properties. The global properties can then be easily deduced. Our assumption is:
(H) $X$ and $Y$ are B-spaces over $\mathbb{K}$ and $f: V(x_0) \subseteq X \to Y$ is a $C^k$-Fredholm map with $k > \max(\operatorname{ind} f'(x_0), 0)$. Here, $V(x_0)$ is a neighborhood of $x_0$.

Lemma 78.13. If (H) holds, then there exists an open neighborhood $W(x_0)$ in $X$ such that the regular values of the restriction $f|_{W(x_0)}$ are dense in $Y$.

PROOF. We make essential use of the normal form (73.1). Let $N = N(f'(x_0))$ and $R = R(f'(x_0))$. We choose topological direct sums
$$X = N \oplus N^{\perp} \quad \text{and} \quad Y = R \oplus R^{\perp}.$$
From Proposition 73.1 there exists a $C^k$-diffeomorphism
$$\varphi: U(0) \subseteq N \times R \to W(x_0),$$
such that the relation
$$h(u, v) = f(x_0) + v + g(u, v) \quad \text{on } U(0) \tag{20}$$
holds for $h(u, v) := f(\varphi(u, v))$, with $u \in N$, $v \in R$, and $g(u, v) \in R^{\perp}$ on $U(0)$. The dimensions of $N$ and $R^{\perp}$ are finite, because $f'(x_0)$ is a Fredholm operator. Since regular values are invariant under diffeomorphisms it suffices to show that the regular values of $h$ are dense in $Y$.
Let $y \in Y$. We decompose
$$y = f(x_0) + y_1 + y_2 \quad \text{with } y_1 \in R,\ y_2 \in R^{\perp}$$
and let $\psi(u) = g(u, y_1)$. Then $\psi: V(0) \subseteq N \to R^{\perp}$ is $C^k$. Letting
$$m = \dim N - \dim R^{\perp},$$
we obtain $m = \operatorname{ind} f'(x_0)$ and $k > \max(m, 0)$ from (H).
(I) According to Sard's theorem, the regular values of $\psi$ are dense in $R^{\perp}$.
(II) We show: If $y_2$ is a regular value of $\psi$, then $y$ is a regular value of $h$. Indeed, from $h(u, v) = y$ it follows that $v = y_1$ and $\psi(u) = y_2$. Moreover, we have
$$h'(u, v)(\bar{u}, \bar{v}) = \bar{v} + \psi'(u)\bar{u} + g_v(u, y_1)(0, \bar{v}).$$
Therefore, the surjectivity of $\psi'(u): N \to R^{\perp}$ implies the surjectivity of $h'(u, v): N \times R \to Y$.
From (I) and (II) it follows that the regular values of $h$ are dense in $Y$. □

Lemma 78.14. If (H) holds, then f is locally closed, i.e., f maps closed sets in a
sufficiently small neighborhood of x 0 onto closed sets.

PRooF. It suffices to show that h is closed. Let


as n-.. oo,
where un eN and vn e R for all n. We decompose

From (20) it follows that $v_n \to w_2$ as $n \to \infty$. Since $\dim N < \infty$ we have that $u_n \to u$, possibly after passing to a subsequence. This implies that $h(u, v) = w$. □

Lemma 78.15. If (H) holds and $x_0$ is a regular point of $f$, then there exists a neighborhood of $x_0$ which contains only regular points of $f$.

PROOF. This follows from normal form (20) with $R = Y$ and $g = 0$. Note that
$$h'(u, v)(\bar{u}, \bar{v}) = \bar{v} \quad \text{for all } \bar{v} \in Y$$
and all $(u, v) \in U(0)$, i.e., $h'(u, v)$ is surjective for these points $(u, v)$. □

Corollary 78.16. If (H) holds, then there exists an open neighborhood $U(x_0)$ in $X$ such that the set of singular values of the restriction $f|_{U(x_0)}$ is closed in $Y$ and the set of regular values is open and dense in $Y$. Thus the set of singular values is nowhere dense in $Y$.

This follows immediately from the previous lemmata. Note that according to Lemma 78.15 the set of regular points is open and hence the complement of the set of regular points is closed.
PROOF OF THEOREM 78.A. For each point $x \in M$ we choose an open neighborhood $U(x)$ such that Corollary 78.16 holds in charts. Since $M$ has a countable basis, it follows that $M$ is Lindelöf, i.e., at most countably many $U(x)$ cover $M$. One observes then that $y$ is a singular (regular) value of $f$ if and only if $y$ is a singular (regular) value for one of the (each of the) restrictions $f|_{U(x)}$. □

PROOF OF COROLLARY 78.12. The map
$$f: M \to N$$
is proper. Therefore $f^{-1}(y)$ is compact and $f$ is closed (proof as in Proposition 4.44).
(I) The set $\operatorname{Reg}(f)$ of regular values of $f$ is open in $N$, because Lemma 78.15 implies that the set $S$ of singular points of $f$ is closed in $M$, and hence $f(S)$ is closed and $\operatorname{Reg}(f) = N - f(S)$ is open.
(II) Let $U$ be an open neighborhood of $f^{-1}(y)$. Then there exists a neighborhood $V(y)$ with $f^{-1}(V(y)) \subseteq U$. Otherwise there exists a convergent sequence $f(x_n) \to y$ as $n \to \infty$ with $x_n \notin U$ for all $n$. Since $f$ is proper, we may assume that, possibly after passing to an M-S-subsequence, $x_n \to x$. This yields the desired contradiction $x \notin U$ and $f(x) = y$.
(III) The set $\operatorname{Reg}(f)$ is dense in $N$. In order to prove this, we choose the neighborhood $V(y)$ so small that it lies in a chart of $y$. It suffices to show that $\operatorname{Reg}(f)$ is dense in $V(y)$.
Let $C$ and $C_U$ be the sets of singular values of $f: M \to N$ and $f|_U$ in $V(y)$, respectively. We choose $U(x)$ as in the proof of Theorem 78.A, that is, $f(U(x)) \subseteq V(y)$ for all $x \in f^{-1}(y)$. Let $U$ be the union of finitely many $U(x)$, which already cover the compact set $f^{-1}(y)$. It follows then, as in the proof of Theorem 78.A, that $C_U$ is meager in $V(y)$. From (II) it follows that $C = C_U$, possibly after decreasing $V(y)$.
Therefore $C$ is meager in $V(y)$, and hence $V(y) - C$ is dense in $V(y)$ (see $A_1(65)$), i.e., $\operatorname{Reg}(f)$ is dense in $V(y)$. □

78.10. Parametrized Version of the Theorem of Sard-Smale

Consider the operator equation
$$H(x, p) = z, \qquad x \in G, \tag{21}$$
which depends on the parameter $p \in P$. Fix a point $z \in Z$. Our goal is to find conditions under which the solutions have a natural and favorable behavior for "most" parameter values $p$. As an application of the results of this section we will give conditions for which (21) has only finitely many solutions for "most" $p$. This will be done in the following section. It need not be emphasized that results of this type are of great mathematical and scientific interest.
Our assumptions are:
(H1) $G$, $P$, and $Z$ are nonempty, metrizable $C^\infty$-Banach manifolds with chart spaces over $\mathbb{K}$.
This condition is satisfied, for example, if $G$, $P$, and $Z$ are open and nonempty sets in B-spaces over $\mathbb{K}$.
(H2) The $C^k$-map $H: G \times P \to Z$ with $k \ge 1$ has $z$ as a regular value.
(H3) For each parameter $p \in P$, the map $H(\cdot, p): G \to Z$ is a Fredholm map, where
$$\operatorname{ind} H_x(x, p) < k$$
for every solution $(x, p) \in G \times P$ of (21).
If $G$ and $Z$ are open sets in B-spaces, then in the usual way $H_x(x, p)$ denotes the partial F-derivative. In the general case, $H_x(x, p)$ is the tangent map of $H(\cdot, p): G \to Z$ at the point $x$.
(H4) Weak properness. The convergence $p_n \to p$ on $P$ as $n \to \infty$ and
$$H(x_n, p_n) = z \quad \text{for all } n$$
implies the existence of a convergent subsequence $x_{n'} \to x$ as $n \to \infty$ with $x \in G$.
Let $p$ be fixed. Recall that (H3) implies that a solution of (21) is regular if and only if the linearization $H_x(x, p): TG_x \to TZ_z$ is surjective. For the special case that $Z$ is a B-space and $G$ is an open set in the B-space $X$, we have $TG_x = X$ and $TZ_z = Z$.

Theorem 78.B (Parametrized Version of the Theorem of Sard-Smale). If (H1)-(H4) hold, then there exists an open, dense subset $P_0$ of $P$ such that $z$ is a regular value of $H(\cdot, p)$ for each parameter $p \in P_0$.

Corollary 78.17. Fix an element $p \in P_0$. If there exists a number $n \ge 0$ with
$$\operatorname{ind} H_x(x, p) = n$$
for all solutions $x$ of (21), then the solution set of (21) is an $n$-dimensional $C^k$-Banach manifold or the solution set is empty.

We prove Theorem 78.B in Section 78.12. Corollary 78.17 follows immediately from Theorem 78.B and the preimage theorem (Theorem 73.C).

78.11. Main Theorem About Generic Finiteness of the Solution Set

Theorem 78.C (Main theorem). If (H1)-(H4) of the previous section hold with $k = 1$, and if
$$\operatorname{ind} H_x(x, p) = 0 \tag{22}$$
for all solutions $(x, p) \in G \times P$ of equation (21), then there exists an open dense subset $P_0$ of $P$ such that (21) has at most finitely many solutions for each fixed parameter $p \in P_0$.
In addition, all these solutions are regular.

PROOF. We use Theorem 78.B. Let $S$ be the set of solutions of
$$H(x, p) = z, \qquad x \in G$$
for fixed $p \in P_0$. From (H4) it follows that $S$ is compact. Theorem 78.B implies the surjectivity of $H_x(x, p): TG_x \to TZ_z$ for all $x \in S$. Because of (22) this map is even bijective. The inverse mapping theorem (Theorem 73.B) implies that $S$ consists of isolated points and from compactness it follows that it consists of at most finitely many points. □

An important version of Theorem 78.B, for which the compactness condition (H4) is not needed, will be considered in Problem 78.3.

78.12. Proof of Theorem 78.B


We make essential use of a simple result about linear operators. The starting point is the linear equation
$$Ax + Bp = z_0, \qquad x \in X,\ p \in Y \tag{23}$$
for a given $z_0 \in Z$. We set
$$D = \{(x, p) \in X \times Y : Ax + Bp = 0\}$$
and define the projection operator $Q: D \to Y$ as
$$Q(x, p) = p.$$
(L1) $X$, $Y$, and $Z$ are B-spaces over $\mathbb{K}$.
(L2) The operators $A: X \to Z$ and $B: Y \to Z$ are linear and continuous, where $A$ is a Fredholm operator.
(L3) Equation (23) has a solution for every $z_0 \in Z$.

Lemma 78.18. Under the assumptions (L1)-(L3) we have:

(i) The operator $Q: D \to Y$ is Fredholm with $\operatorname{ind} Q = \operatorname{ind} A$.
(ii) $Q$ is surjective if and only if $A$ is surjective.

PROOF. We have
$$Q(x, p) = 0 \iff p = 0,\ Ax = 0.$$
This implies $\dim N(Q) = \dim N(A)$. We are done if we can show that
$$\operatorname{codim} R(Q) = \operatorname{codim} R(A).$$
From the definition of $Q$ it follows that
$$R(Q) = B^{-1}(R(A)).$$
We choose linear subspaces $Y_0$ and $Z_0$ of $Y$ and $Z$ which induce the direct (algebraic) sum decompositions
$$Y = N(B) \oplus Y_0, \qquad Z = R(A) \oplus Z_0$$
(see $A_1(22k)$). Let the operator $B_0: Y_0 \to R(B)$ be the restriction of $B$ onto $Y_0$. Then $B_0$ is bijective. Thus we have
$$B^{-1}(R(A)) = N(B) \oplus B_0^{-1}(R(A)).$$
Because of (L3) we have $Z_0 \subseteq R(B)$, and hence
$$Y_0 = B_0^{-1}(R(B)) = B_0^{-1}(R(A)) \oplus B_0^{-1}(Z_0).$$
This gives
$$Y = N(B) \oplus B_0^{-1}(R(A)) \oplus B_0^{-1}(Z_0) = R(Q) \oplus B_0^{-1}(Z_0).$$
Therefore
$$\operatorname{codim} R(A) = \dim Z_0 = \dim B_0^{-1}(Z_0) = \operatorname{codim} R(Q).$$
□
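In finite dimensions every linear map is Fredholm, so the two dimension identities used in this proof can be checked with matrices. The sketch below is only an illustration under that assumption: $A$ and $B$ are matrices with the same number of rows, $D$ is computed as the null space of the block matrix $(A \; B)$, and the helper names `null_space` and `fredholm_data` are hypothetical.

```python
import numpy as np

def null_space(M, tol=1e-10):
    """Orthonormal basis (as columns) of the null space of M, via the SVD."""
    _, s, vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

def fredholm_data(A, B, tol=1e-10):
    """Dimensions of N(Q), N(A) and codimensions of R(Q), R(A) for Q(x, p) = p on D."""
    z, a = A.shape
    b = B.shape[1]
    AB = np.hstack([A, B])
    if np.linalg.matrix_rank(AB, tol=tol) != z:
        raise ValueError("(L3) fails: Ax + Bp = z0 is not always solvable")
    D = null_space(AB, tol)                  # columns span D = {(x, p): Ax + Bp = 0}
    Q = D[a:, :]                             # p-components of the basis vectors of D
    rank_Q = np.linalg.matrix_rank(Q, tol=tol) if D.shape[1] > 0 else 0
    rank_A = np.linalg.matrix_rank(A, tol=tol)
    return {
        "dim_N_Q": D.shape[1] - rank_Q, "codim_R_Q": b - rank_Q,
        "dim_N_A": a - rank_A, "codim_R_A": z - rank_A,
    }

# A is not surjective here, and B supplies the missing direction.
A = np.array([[1.0, 0.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
print(fredholm_data(A, B))
```

In this example both kernels have dimension one and both ranges have codimension one, so neither $A$ nor $Q$ is surjective, in agreement with part (ii) of the lemma.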

Now we prove Theorem 78.B. We proceed as in the finite-dimensional case of Section 78.7. Instead of Sard's theorem we apply the theorem of Sard-Smale to the projection operator $\pi$.
For simplicity in notation, we assume that $G$ and $P$ are open, nonempty sets in the B-spaces $X$ and $Y$, respectively, and that $Z$ is a B-space. The proof of the general case is analogous, since Banach manifolds look locally like open sets in B-spaces.

Step 1: Solution manifold $M = H^{-1}(z)$.

Let $M$ denote the set of all $(x, p) \in G \times P$ with
$$H(x, p) = z.$$
Because of (H2) and the preimage theorem (Theorem 73.C) it follows that $M$ is a $C^k$-manifold. The tangent space $TM_u$ at the point $u = (x_0, p_0)$ consists precisely of all points $(x, p) \in X \times Y$ with $H'(u)(x, p) = 0$, i.e.,
$$H_x(u)x + H_p(u)p = 0. \tag{24}$$
From (H2) it follows that the corresponding inhomogeneous equation $H'(u)(x, p) = z_0$, i.e.,
$$H_x(u)x + H_p(u)p = z_0, \qquad (x, p) \in X \times Y$$
has a solution for each $z_0 \in Z$.

Step 2: Nonlinear projection operator $\pi: M \to P$.

We define
$$\pi(x, p) = p. \tag{25}$$
Moreover, for fixed $u$ we let $D = TM_u$ and define the linear projection operator $Q: D \to Y$ through
$$Q(x, p) = p. \tag{26}$$
Then we have
$$Q = \pi'(u). \tag{27}$$
This follows because for each tangent vector $(x, p) \in D$ there exists a curve
$$t \mapsto (x(t), p(t)) \quad \text{on } M$$
with $x(0) = x_0$, $p(0) = p_0$ and $x'(0) = x$, $p'(0) = p$. If we insert this curve into equation (25), then we obtain (27) by differentiation.

Step 3: Application of the theorem of Sard-Smale to $\pi$.

From (H4) it follows that the operator
$$\pi: M \to P$$
is proper. To see this let $P_1$ be a compact set in $P$. If $\{(x_n, p_n)\}$ is a sequence in $\pi^{-1}(P_1)$, then
$$H(x_n, p_n) = z.$$
Since $P_1$ is compact there exists a convergent subsequence $p_{n'} \to p$ with $p \in P_1$. From (H4) follows the existence of a convergent subsequence $x_{n''} \to x$ with $x \in G$. Therefore $H(x, p) = z$, i.e., $(x, p) \in M$. Hence $\pi^{-1}(P_1)$ is compact. Note that because of the metrizability of $G$ and $P$, compactness can be characterized through sequences (see $A_1(21c)$).
Moreover, it follows from Lemma 78.18 and (24) that the operator $Q: D \to Y$ is Fredholm with $\operatorname{ind} Q = \operatorname{ind} H_x(u)$. Thus we obtain from (27) that $\pi: M \to P$ is a $C^k$-Fredholm operator with
$$\operatorname{ind} \pi'(u) = \operatorname{ind} H_x(u).$$
According to Corollary 78.12 there exists an open, dense subset $P_0$ of $P$ such that each $p_0 \in P_0$ is a regular value of $\pi$.
Let $p_0 \in P_0$. Then
$$\pi'(u): TM_u \to Y$$
is surjective for every $u = (x_0, p_0)$ in $M$, i.e., $Q: D \to Y$ is surjective. Lemma 78.18 shows that also
$$H_x(u): X \to Z$$
is surjective. Therefore $z$ is a regular value of $H(\cdot, p_0)$. This proves Theorem 78.B.

PROBLEMS

78.1. Numerical construction of bifurcation solutions using the perturbation trick. Consider the situation of Section 78.3. Let $H: \mathbb{R}^{n+1} \to \mathbb{R}^n$ be a $C^\infty$-map and let $s \mapsto y(s)$ be a solution curve $C$ of the equation
$$H(y(s)) = 0. \tag{28}$$
Assume that $\det A(s)$ changes its sign at $s_0$ and that $t'(s_0) \neq 0$. Then $y(s_0)$ is a bifurcation point of (28) (Fig. 78.11(a)). Besides (28) we study the perturbed problem
$$H^*(y^*(s), p) = 0 \tag{28*}$$
with $H^*(y, p) := H(y) + pf(y)$, where $p \in \mathbb{R}^n$ is fixed and $f: \mathbb{R}^{n+1} \to \mathbb{R}$ is a $C^\infty$-map with
$$f(y) > 0$$
in a small open neighborhood $V$ of the bifurcation point and $f = 0$ outside of $V$. The following is important:
$$H \text{ and } H^* \text{ coincide outside of } V. \tag{29}$$
Prove that if the bifurcation situation is sufficiently regular, then one can find a regular solution curve $C^*$ of (28*) which runs into the other bifurcation branch of (28) outside of $V$ (Fig. 78.11(b)).
By using the curve following algorithm of Section 78.4, one can effectively compute bifurcation branches. Numerical results and a well-written algorithm may be found in Georg (1981).

Figure 78.11 (panels (a) and (b))

Solution: Zero is a regular value of $(y, p) \mapsto H^*(y, p)$ on $V \times \mathbb{R}^n$. This follows from
$$H^{*\prime}(y, p) = (H^*_y(y, p), f(y)I),$$
and hence $R(H^{*\prime}(y, p)) = \mathbb{R}^n$. From Example 78.4 it follows that for $\varepsilon > 0$ there exists a $p \in \mathbb{R}^n$ with $|p| < \varepsilon$ such that (28*) has a regular solution set. We choose a regular solution curve $C^*$ of (28*) which intersects $C$ in some small neighborhood of the bifurcation point. Then $\det A(s)$ changes its sign along $C$. But according to (C) of Section 78.3, the sign of $\det A^*(s)$ along $C^*$ remains constant. From (29) it follows that $A$ and $A^*$ coincide outside of $V$. Thus $C^*$ cannot run into $C$, but has to run outside of $V$ into the other bifurcation branch.
78.2. Construction of a second solution with the p-trick. Let F: jij• __. iij• be a C..,·map
which has the following properties:
(i) F has the zero x 0 .
(ii) Zero is a regular value of F.
(iii) There exists an open neighborhood U(x 0 ) and points v, pe R• which satisfy
(vjp) > 0 and (viF(x)) > 0 on oU(x0 ).
Use H(x, t, p) = (1 - t)p + tF(x) to give a constructive proof that F has another
zero x 1 • It is important here that the zero indices of x 0 and x 1 are different, i.e.,
sgndetF'(x 0 )sgndetF'(x 1 ) = -1.
Solution: Let $Z = U(x_0) \times\, ]0,1[$. We set $P = (x,t,p)$. Because of
$H'(P) = (H_x(P), H_t(P), H_p(P)) = (tF'(x), \ldots, (1-t)I)$
and (ii) it follows that zero is a regular value of H. Example 78.4 shows that,
possibly after a small change of p, we may assume that a regular solution curve
C of the equation
$(1-t)p + tF(x) = 0$, $(x,t) \in \mathbb{R}^{n+1}$
passes through $x_0$. Because of (i), the curve C enters Z at the point $(x_0, 1)$. From
(iii) it follows that $p \ne 0$. Therefore, C cannot leave the region Z on the side
$Z_0 = U(x_0) \times \{0\}$. Also from (iii), it follows that $(v|H) > 0$ on
$M = \partial U(x_0) \times [0,1]$, so that C has no common points with M. Therefore C has
to leave the region Z at a point $(x_1, 1)$ (Fig. 78.12). The statement about the
indices follows from the different signs of the intersection numbers (see the
proof in Section 78.6).

Figure 78.12
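The construction of Problem 78.2 can be tried out on a one-dimensional toy problem. The following sketch is only an illustration under assumptions that are not part of the text: the map $F(x) = x^2 - 1$, the choices $p = v = 1$ and $U(x_0) = ]-2, 2[$ (so that (iii) holds), and a simple Euler predictor with a Gauss-Newton corrector in place of the curve-following algorithm of Section 78.4. All function names are hypothetical.

```python
import math

P = 1.0  # auxiliary point p of the homotopy (illustrative choice)

def F(x):
    return x * x - 1.0  # toy map with the zeros x0 = 1 and x1 = -1

def dF(x):
    return 2.0 * x

def H(x, t):
    return (1.0 - t) * P + t * F(x)

def grad_H(x, t):
    # (H_x, H_t) = (t F'(x), F(x) - p)
    return t * dF(x), F(x) - P

def follow_curve(x0, step=0.01, max_steps=100_000):
    """Follow the curve H(x, t) = 0 from (x0, 1) until it returns to t = 1."""
    x, t = x0, 1.0
    for _ in range(max_steps):
        hx, ht = grad_H(x, t)
        norm = math.hypot(hx, ht)
        # Predictor: Euler step along the oriented tangent (H_t, -H_x).
        x, t = x + step * ht / norm, t - step * hx / norm
        # Corrector: a few Gauss-Newton steps back onto the curve H = 0.
        for _ in range(5):
            hx, ht = grad_H(x, t)
            r = H(x, t) / (hx * hx + ht * ht)
            x, t = x - r * hx, t - r * ht
        if t >= 1.0:
            break
    # Polish the end point with Newton's method for F(x) = 0.
    for _ in range(20):
        x -= F(x) / dF(x)
    return x

x1 = follow_curve(1.0)
index_product = math.copysign(1.0, dF(1.0)) * math.copysign(1.0, dF(x1))
```

In this example the curve starts at $(x_0, 1) = (1, 1)$, turns around at $t = 1/2$, and comes back to the level $t = 1$ near the second zero; the product of the signs of $F'$ at the two zeros is negative, as the index statement of the problem predicts.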

78.3. Another form of the parametrized version of the theorem of Sard-Smale. As in
Section 78.10, we consider the equation
$H(x,p) = z$, $x \in G$ (30)
for fixed $z \in Z$ and assume:
(H1) G, P, and Z are nonempty $C^\infty$-Banach manifolds with chart spaces over $\mathbb{K}$,
where G and P have a countable basis.
This condition is satisfied, for example, if G and P are nonempty, open sets in
separable B-spaces over $\mathbb{K}$, and Z is a B-space over $\mathbb{K}$.
(H2) The $C^k$-map $H: G \times P \to Z$, $k \ge 1$, has z as a regular value.
(H3) For each parameter $p \in P$, the map $H(\cdot,p): G \to Z$ is Fredholm with
$\operatorname{ind} H_x(x,p) < k$ for every solution $(x,p) \in G \times P$ of (30).
Prove: There exists a residual subset $P_0$ of P such that z is a regular value of
$H(\cdot,p)$ for all $p \in P_0$.
Solution: Use similar arguments as in Section 78.12. But instead of Corollary
78.12, use Theorem 78.A of Section 78.8.
Further variants and generalizations may be found in Abraham and Robbin
(1967, M), p. 48 (transversal density theorem).

References to the Literature

Classical works: Sard (1942), Smale (1965).
Parametrized version of Sard's theorem and fixed-point theory: Chow, Mallet-Paret,
and Yorke (1978).
Theorem of Sard-Smale and Fredholm maps: Abraham and Robbin (1967, M),
Tromba (1976), (1978), Borisovic (1977, S, B, H).
Numerical methods: Garcia, Zangwill (1983, M) (introduction), Allgower and Georg
(1980, S, H, B), (1980a), (1988, M), Georg (1981), (1981a), Eaves (1982, P), Rheinboldt
(1986, M).
(See also the References to the Literature to Chapter 6.)
CHAPTER 79

Dynamical Stability and Bifurcation in B-Spaces

In case the differential equation can be integrated, the problem of stability
presents no difficulty. It is important, however, to find methods which solve the
stability problem independently of the integration.
Alexander Mihailovic Ljapunov (1892)

On passing through $\mu = 0$ let us now assume that none of the characteristic
exponents vanishes, but a conjugate pair crosses the imaginary axis. This situation
commonly occurs in nonconservative mechanical systems, for example, in
hydrodynamics. The following theorem asserts that, with this hypothesis, there
is always a periodic solution in the neighborhood of the equilibrium point. In
the literature, I have not come across this bifurcation problem. However, I
scarcely think that there is anything essentially new in the above theorem. The
methods have been developed by Poincaré perhaps 50 years ago, and belong
today to the classical conceptual structure of the theory of periodic solutions.¹
Eberhard Hopf (1942)

Without the presence of stable phenomena, the world would pass into a state
of complete chaos and its apparent structures would dissolve. There is no
question that the discovery of interrelations in natural processes and its scientific
description requires the existence of stable phenomena.
Herbert Beckert (1977)

Because of their great importance for science and numerical analysis, stability
questions have already been discussed in a number of chapters of this volume
and the three previous ones. In the present chapter we examine the following
two important principles:
(L) Linearization principle. The nonlinear differential equation has locally the
same stability properties as the linearized differential equation.
(B) Bifurcation principle. Loss of stability of an equilibrium point leads to
bifurcation.
Counterexamples show that (L) and (B) are not generally true. But the
criteria for (L) and (B) which will be discussed below are widely applicable. In
connection with (B) we discuss two important cases:
Simple curve bifurcation (Section 79.8).
Hopf bifurcation (Section 79.9).
As in Chapter 8 the central ingredients are transversality conditions (generic
bifurcation conditions). In the following two Chapters 80 and 81 of Part V we
continue the stability discussions and examine integral manifolds, i.e., manifolds
which consist of trajectories (stable and unstable manifolds, center
manifolds) as well as the method of Ljapunov functions.
In this chapter we shall use some results in spectral theory which can be
found in A_1(56) to A_1(60).
The investigation of bifurcation problems for dynamical systems will be
continued in Chapter 80 by using the theory of stable and unstable manifolds.

¹ Hopf was not aware of the papers of the Russian mathematicians Andronov and Bautin from
the years 1930 to 1941. Instead of Poincaré-Andronov-Hopf bifurcation, we simply speak of
Hopf bifurcation.

79.1. Asymptotic Stability and Instability of Equilibrium Points

We examine the differential equation
$x' = F(x,t)$. (1)
We are looking for a function $x = x(t)$ defined for all times $t \ge t_0$, where $x(t)$
lies in a B-space X. Equation (1) is called autonomous if and only if F does not
depend on time t.
The point $x_0 \in X$ is called an equilibrium point of (1) if and only if
$F(x_0,t) = 0$ for all times $t \ge t_0$.
Then $x(t) = x_0$ is a solution of (1) for all $t \ge t_0$ and is called a stationary
solution, because the system modeled by (1) remains in the same position $x_0$
for all times $t \ge t_0$. Equilibrium points are also called stationary or singular
points. The following definition goes back to the fundamental paper of
Ljapunov (1892).

Definition 79.1. Let $x_0$ be an equilibrium point of (1) for $t \ge t_0$. The point $x_0$
is called stable if and only if small perturbations of the initial condition
$x(t_0) = x_0$ lead to solutions which remain in the neighborhood of $x_0$ for all
times $t \ge t_0$. More precisely, for each $\varepsilon > 0$ there exists a $\delta(\varepsilon) > 0$ such that
the initial-value problem $x(t_0) = a$ for (1) has a unique solution $x = x(t)$ for
each $a \in X$ with
$\|a - x_0\| < \delta(\varepsilon)$,
which exists for all times $t \ge t_0$. Moreover,
$\|x(t) - x_0\| < \varepsilon$ for all $t \ge t_0$.
If, in addition, there exists a $\delta_0 > 0$ such that
$\|x(t_0) - x_0\| < \delta_0$
implies
$\lim_{t \to +\infty} x(t) = x_0$,
then $x_0$ is called asymptotically stable.
The point $x_0$ is called unstable if and only if $x_0$ is not stable.

Note that the concept of stability implies the existence and uniqueness of
the solution. Now, we assume more generally that $y = y_0(t)$ is a solution of
the differential equation
$y' = G(y,t)$ (2)
for $t \ge t_0$. We set $y = y_0 + x$ and obtain
$x' = F(x,t)$ (3)
with $F(x,t) = G(y_0(t) + x, t) - y_0'(t)$. This implies
$F(0,t) = 0$ for all $t \ge t_0$,
i.e., $x_0 = 0$ is an equilibrium point of (3).

Definition 79.2. The solution $y = y_0(t)$ of (2) is called stable, asymptotically
stable, unstable if and only if the equilibrium point $x_0 = 0$ of (3) has the
corresponding property.

We begin with the autonomous differential equation
$x' = F(x)$ (4)
for $t \ge t_0$. Let $\sigma(F'(x_0))$ denote the spectrum of $F'(x_0)$. If the B-space X is real,
the spectrum corresponds to the complexification of $F'(x_0)$ in $X_{\mathbb{C}}$ (see A_1(23h)).

Proposition 79.3. Let $F: U(x_0) \subseteq X \to X$ be $C^k$ in a B-space X over $\mathbb{K} = \mathbb{R}, \mathbb{C}$
with $F(x_0) = 0$. Then:
(a) If $\operatorname{Re}\lambda < 0$ for all $\lambda \in \sigma(F'(x_0))$ and if k = 1, then the equilibrium point $x_0$
of (4) is asymptotically stable.
(b) If $\operatorname{Re}\lambda > 0$ for some $\lambda \in \sigma(F'(x_0))$ and if k = 2, then $x_0$ is unstable.
This proposition contains the linearization principle of stability theory, i.e.,
the stability properties of the differential equation at $x_0$ depend only on the
spectrum of the linearization $F'(x_0)$. Letting
$F(x) = F'(x_0)(x - x_0) + r(x)$,
from Taylor's theorem of Section 4.6 it follows that:
$r(x) = o(\|x - x_0\|)$ as $x \to x_0$ for k = 1,
$r(x) = O(\|x - x_0\|^2)$ as $x \to x_0$ for k = 2.
Therefore Proposition 79.3 is a special case of the following Theorem 79.A.
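For $X = \mathbb{R}^2$ the criterion of Proposition 79.3 reduces to computing the eigenvalues of a Jacobian matrix. The following minimal sketch applies it to a damped pendulum, $x'' + d\,x' + \sin x = 0$, written as a first-order system; the example, the damping value, and all function names are assumptions chosen for illustration and do not come from the text.

```python
import cmath
import math

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of the matrix ((a, b), (c, d)) via trace and determinant."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def classify(jacobian):
    """Apply Proposition 79.3 to the spectrum of the linearization F'(x0)."""
    (a, b), (c, d) = jacobian
    lams = eigenvalues_2x2(a, b, c, d)
    if all(lam.real < 0 for lam in lams):
        return "asymptotically stable"
    if any(lam.real > 0 for lam in lams):
        return "unstable"
    return "undecided by linearization"

def pendulum_jacobian(x_eq, damping=0.5):
    # F(x, v) = (v, -sin x - damping * v), linearized at the equilibrium (x_eq, 0)
    return ((0.0, 1.0), (-math.cos(x_eq), -damping))

down = classify(pendulum_jacobian(0.0))      # pendulum hanging down
up = classify(pendulum_jacobian(math.pi))    # pendulum standing upright
```

The third return value of `classify` marks the case in which some eigenvalue lies on the imaginary axis and none lies to the right of it; there the proposition makes no statement.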
Instead of (4) we consider the time-dependent equation
$x' = A(x - x_0) + f(x,t)$ (5)
for all $t \ge t_0$ with time-independent linear principal part A.
(H1) The operator $A: X \to X$ defined on a B-space X over $\mathbb{K}$ is linear and
continuous.
(H2) The map $f: U(x_0) \times [t_0, \infty[\, \to X$ with $U(x_0) \subseteq X$ is $C^1$ with $f(x_0,t) = 0$
for all $t \ge t_0$.
(H3) There exist numbers $r, \gamma > 0$ such that one of the following two smallness
conditions is satisfied:
$\lim_{x \to x_0} \|f(x,t)\|/\|x - x_0\| = 0$ uniformly for all $t \ge t_0$ (6)
or, stronger,
$\|f(x,t)\| \le \gamma\|x - x_0\|^q$, $q > 1$ (7)
for all x, t with $\|x - x_0\| \le r$ and $t \ge t_0$.

Theorem 79.A (Ljapunov's Main Theorem of Stability Theory in B-Spaces).
Assume (H1) and (H2).
(a) If $\operatorname{Re}\lambda < 0$ for all $\lambda \in \sigma(A)$ and if (6) holds, then the equilibrium point $x_0$ of
equation (5) is asymptotically stable.
(b) If $\operatorname{Re}\lambda > 0$ for some $\lambda \in \sigma(A)$ and if (7) holds, then $x_0$ is unstable.

Ljapunov (1892) proved this result for $X = \mathbb{R}^n$.
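Part (a) of Theorem 79.A can be observed numerically in the simplest case $X = \mathbb{R}$. The sketch below integrates the scalar equation $x' = -x + x^2 \sin t$, which is an assumed example with $A = -1$, $x_0 = 0$, and $f(x,t) = x^2 \sin t$ satisfying (6), by the classical Runge-Kutta method. It then checks that the solution starting at $a = 0.4$ decays at least like $e^{-t/2}|a|$, which is the type of estimate obtained in the proof below.

```python
import math

def f_rhs(x, t):
    # x' = A x + f(x, t) with A = -1 and f(x, t) = x^2 sin t (illustrative choice)
    return -x + x * x * math.sin(t)

def rk4_path(a, t_end=10.0, n=2000):
    """Classical Runge-Kutta integration; returns the list of pairs (t, x(t))."""
    h = t_end / n
    x, t = a, 0.0
    path = [(t, x)]
    for _ in range(n):
        k1 = f_rhs(x, t)
        k2 = f_rhs(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = f_rhs(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = f_rhs(x + h * k3, t + h)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
        path.append((t, x))
    return path

a = 0.4
path = rk4_path(a)
# Exponential decay bound |x(t)| <= e^{-t/2} |a| along the whole computed path.
bound_holds = all(abs(x) <= abs(a) * math.exp(-0.5 * t) + 1e-12 for t, x in path)
```

The bound holds here because $|f(x,t)| \le |x|/2$ as long as $|x| \le 1/2$, so the nonlinear term can consume at most half of the linear decay rate.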

79.2. Proof of Theorem 79.A

Without loss of generality let $x_0 = 0$ and $t_0 = 0$. Let $X \ne \{0\}$.
PROOF OF THEOREM 79.A(a).
(I) Estimation of $e^{tA}$. Let $\mathbb{K} = \mathbb{C}$. Since the spectrum $\sigma(A)$ is compact, there
exists a constant c > 0 with $\operatorname{Re}\lambda < -c$ for all $\lambda \in \sigma(A)$. We show
$\|e^{tA}\| \le Ce^{-ct}$ for all $t \ge 0$ (8)
for fixed C > 0. In fact, from A_1(60b) it follows that
$e^{tA} = (2\pi i)^{-1} \oint_{\partial U} e^{tz}(zI - A)^{-1}\,dz$
if we choose an open disk U of radius R, which contains $\sigma(A)$ and where
$\operatorname{Re}\lambda < -c$ for all $\lambda \in U$. We obtain
$\|e^{tA}\| \le Re^{-ct} \sup_{z \in \partial U} \|(zI - A)^{-1}\|$ for all $t \ge 0$.
In the case of a real B-space X, i.e., for $\mathbb{K} = \mathbb{R}$, we use the complexification
of A in $X_{\mathbb{C}}$. Then (8) holds in $X_{\mathbb{C}}$ and hence in X.
(II) Representation formula. For continuous g one immediately obtains by
differentiation that
$x(t) = e^{tA}a + \int_0^t e^{(t-s)A} g(s)\,ds$ (9)
is a solution of the initial-value problem
$x'(t) = Ax(t) + g(t)$, $x(0) = a$. (10)
(III) Gronwall's lemma and an a priori estimate. Let x = x(t) be a solution of
(5) on [0, T]. From Theorem 3.A it follows that this solution is unique
and satisfies (10) with
$g(t) = f(x(t), t)$. (11)
Because of (6) there exists a closed ball B around the origin in X with
$\|f(x,t)\| \le 2^{-1}cC^{-1}\|x\|$ for all $x \in B$, $t \ge 0$.
Let $x(t) \in B$ for all $t \in [0,T]$. It follows from (8) and (9) that
$\|x(t)\| \le Ce^{-ct}\|a\| + 2^{-1}c \int_0^t e^{-c(t-s)}\|x(s)\|\,ds$.
Gronwall's lemma of Section 3.5 with $f(t) = e^{ct}\|x(t)\|$ yields
$\|x(t)\| \le Ce^{-ct/2}\|a\|$. (12)
Thus we have found the following a priori estimate:
There exists a neighborhood of zero $V \subset B$ which is independent of T
and has the following property. A solution, defined for all $t \in [0,T]$,
which starts at time t = 0 in V and remains in B, does not leave $2^{-1}B$.
(IV) Global existence of solutions. Let $x(0) \in V$. To start out, Theorem 3.A
implies that the initial-value problem (10) with (11) can be solved locally
on [0, T] for small T > 0. If one solves the initial-value problem again
for t = T, then the a priori estimate shows that the solution can be
continued for all $t \ge 0$. Thereby it remains in B. The asymptotic stability
follows then immediately from (12). □
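The representation formula (9) of step (II) is easy to test in the scalar case. The following sketch is an illustration with assumed data ($A = -1$, $g(s) = \sin s$, initial value $a = 2$): it evaluates the right-hand side of (9) by the composite Simpson rule and compares it with the closed-form solution of (10), which for these data is $x(t) = (a + \tfrac12)e^{-t} + \tfrac12(\sin t - \cos t)$.

```python
import math

def g(s):
    return math.sin(s)  # forcing term (illustrative choice)

def representation_formula(a_coef, a, t, n=2000):
    """Right-hand side of (9) for scalar A = a_coef; composite Simpson rule, n even."""
    h = t / n
    total = 0.0
    for i in range(n + 1):
        s = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * math.exp(a_coef * (t - s)) * g(s)
    return math.exp(a_coef * t) * a + total * h / 3

def exact(a, t):
    # Closed-form solution of x' = -x + sin t, x(0) = a.
    return (a + 0.5) * math.exp(-t) + 0.5 * (math.sin(t) - math.cos(t))

t = 3.0
err = abs(representation_formula(-1.0, 2.0, t) - exact(2.0, t))
```

The difference `err` is of the size of the quadrature error, which supports the claim that (9) solves (10).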

PROOF OF THEOREM 79.A(b). From uniqueness it follows that according to (9)
every solution of
$x' = Ax + f(x,t)$, (13)
with $x(0) = \delta b$ can be written as
$x(t) = y(t) + z(t)$, (14)
with
$y(t) = e^{tA}\delta b$, $z(t) = \int_0^t e^{(t-s)A} f(x(s), s)\,ds$.

We now make essential use of the following two results in spectral theory:

(I) If $\lambda_0 > 0$ is the largest real part of points in the spectrum $\sigma(A)$, then from
(8) there exists a constant C with
$\|e^{tA}\| \le Ce^{3\lambda_0 t/2}$ for all $t \ge 0$.
(II) For every T > 0 there exists a vector $b \in X$ with $0 < \|b\| \le 1$ such that
$4^{-1}e^{\lambda_0 T} \le \|e^{TA}b\|$, (15)
$\|e^{tA}b\| \le 2e^{\lambda_0 t}$ for all $t \in [0,T]$. (16)
This will be proved in Problem 79.2. Since the numerical value of the
constants $\gamma$, C, $\lambda_0$ is unimportant for the proof, we set $\gamma = C = \lambda_0 = 1$.
Also let q = 2 in (7). For q > 1 one proceeds analogously.
(III) Suppose the equilibrium point $x_0 = 0$ is stable. For $\varepsilon = 10^{-3}$ there exists
a $\delta > 0$ such that for every initial value x(0) with $\|x(0)\| \le \delta$ there exists
a unique solution x = x(t) of (13) for all $t \ge 0$ and
$\|x(t)\| < 10^{-3}$ for all $t \ge 0$. (17)
We will, however, construct a b with $\|b\| = 1$ such that the solution which
satisfies $x(0) = \delta b$ violates condition (17) at some time t = T.
(IV) Construction of the contradiction. Let R = 2.1. We choose T > 0 such
that
$2\delta R^2 e^T = R - 2$. (18)
Moreover, for T > 0 we choose b as in (II) and pose the initial-value
problem for (13) with $x(0) = \delta b$. Because of the continuity of the solution
there exists a $\tau$ with $0 < \tau < T$ and
$\|x(t)\| \le \delta Re^t$ for all $t \in [0,\tau]$.
We want to show that $\tau$ can be increased up to T. With q = 2 it follows
from (14) to (16) and (7) that for all $t \in [0,\tau]$:
$\|y(t)\| \le \|e^{tA}\delta b\| \le 2\delta e^t$,
and
$\|z(t)\| \le \int_0^t \|e^{(t-s)A}\|\,\|x(s)\|^2\,ds \le \int_0^t e^{3(t-s)/2}\delta^2R^2e^{2s}\,ds \le 2\delta^2R^2e^{2t}$
and hence
$\|x(t)\| \le \|y(t)\| + \|z(t)\| \le \delta(2 + 2\delta R^2e^t)e^t < \delta Re^t$.
Therefore we can always increase $\tau$ for $\tau < T$. Using the continuity of the
solution we may also choose $\tau = T$. From (15) and (18) it follows that
$\|x(T)\| \ge \|y(T)\| - \|z(T)\| \ge 4^{-1}\delta e^T - 2\delta^2R^2e^{2T}$
$= 2^{-3}(R-2)(9-4R)/R^2 > 10^{-3}$.

This contradicts (17). □
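The arithmetic behind the final contradiction can be checked directly. The sketch below assumes that T is fixed by the relation $2\delta R^2e^T = R - 2$, which is the choice that makes the strict inequality $\delta(2 + 2\delta R^2e^t) < \delta R$ hold for all $t < T$; this relation is an inference from the estimates in step (IV), and the variable names are hypothetical.

```python
R = 2.1
eps = 1e-3

# Value of delta * e^T under the assumed relation 2 * delta * R^2 * e^T = R - 2.
delta_exp_T = (R - 2) / (2 * R ** 2)

# Lower bound for ||x(T)||: (1/4) * delta * e^T - 2 * delta^2 * R^2 * e^(2T).
lower_bound = 0.25 * delta_exp_T - 2 * delta_exp_T ** 2 * R ** 2

# The same quantity in closed form: (R - 2)(9 - 4R) / (8 R^2).
closed_form = (R - 2) * (9 - 4 * R) / (8 * R ** 2)

violates_17 = lower_bound > eps
```

With R = 2.1 the lower bound is roughly $1.7 \cdot 10^{-3}$, which exceeds $\varepsilon = 10^{-3}$, so condition (17) fails at time T.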

79.3. Multipliers and the Fixed-Point Trick for Dynamical Systems

In order to give a unified description of the stability properties for equilibrium
points, periodic solutions, and fixed points, we use the concept of multipliers.
Those are suitably defined complex numbers.

Definition 79.4. A multiplier $\mu \in \mathbb{C}$ is called asymptotically stable, critical,
unstable if and only if $|\mu| < 1$, $= 1$, $> 1$, respectively.
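Definition 79.4 translates directly into a small classification routine. The sketch below is an illustration only; the tolerance `tol` is an assumption needed because floating-point numbers rarely have modulus exactly one. The sample multipliers are of the form $\mu = e^{\lambda}$, for which $|\mu| = e^{\operatorname{Re}\lambda}$, so the three classes correspond to $\operatorname{Re}\lambda < 0$, $= 0$, $> 0$.

```python
import cmath

def classify_multiplier(mu, tol=1e-12):
    """Classify a complex multiplier according to Definition 79.4."""
    modulus = abs(mu)
    if modulus < 1 - tol:
        return "asymptotically stable"
    if modulus > 1 + tol:
        return "unstable"
    return "critical"

# Multipliers mu = exp(lambda) for three sample values of lambda.
labels = [classify_multiplier(cmath.exp(lam)) for lam in (-1 + 2j, 3j, 0.5 - 1j)]
```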

Critical multipliers usually create the most difficulties. If $x_0$ is a fixed point
of the equation
$x = Sx$, (19)
then, by definition, its multipliers are precisely the points in the spectrum
$\sigma(S'(x_0))$ of the linearization. Theorem 4.C then takes the following form.

Proposition 79.5. Let $S: U(x_0) \subseteq X \to X$ be $C^1$ in a B-space X over $\mathbb{K}$. If
the fixed point $x_0$ of S has only asymptotically stable multipliers, then it is
attracting.

The behavior in the hyperbolic case, i.e., when $x_0$ is allowed to have unstable
multipliers but no critical ones, will be studied in Chapter 80. There we will
also examine critical multipliers (Center theorem).
If $x_0$ is an equilibrium point of the autonomous differential equation
$x' = F(x)$, (20)
then, by definition, its multipliers are precisely the points in the spectrum of
$\exp F'(x_0)$. Therefore $\mu$ is a multiplier of $x_0$ if and only if there exists a
$\lambda \in \sigma(F'(x_0))$ with $\mu = \exp\lambda$. Proposition 79.3 may then be formulated as
follows.

Proposition 79.6. Let $F: U(x_0) \subseteq X \to X$ be $C^2$ in a B-space X over $\mathbb{K}$. If the
equilibrium point $x_0$ of (20) has only asymptotically stable multipliers (or at least
one unstable multiplier), then $x_0$ is asymptotically stable (or unstable).

Critical multipliers in the context of dynamical systems will be discussed in
Chapter 80 (Center theorem).
Now we explain an important fixed-point trick, through which equation
(20) can be reduced to (19). Let x = x(t) be the solution of (20) with x(0) = a.
We let
$\Phi_t(a) = x(t)$. (21)
This defines the flow $\{\Phi_t\}$ at least locally for all a, t with $\|a - x_0\| < r$ and
$-T \le t \le T$. For fixed t we let
$S = \Phi_t$.
Then the equilibrium points of (20) are fixed points of the so-called shift
operator S. For the linear differential equation
$x' = Ax$ (22)
with $A \in L(X,X)$ the situation becomes particularly simple. Here we have
$\Phi_t = \exp tA$ for all $t \in \mathbb{R}$. Letting $S = \exp A$ we obtain the following result.

Proposition 79.7. Let $A \in L(X,X)$. The multipliers of the equilibrium point
$x_0 = 0$ of (22) are precisely the multipliers of the fixed point $x_0 = 0$ for the shift
operator $S = \exp A$.
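For a matrix A the statement of Proposition 79.7 can be checked by computing $S = \exp A$ from its power series. The following sketch uses an assumed upper triangular $2 \times 2$ example, so that the eigenvalues of both A and S can be read off the diagonal; the truncation after 40 terms and the helper names are illustrative choices.

```python
import math

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_exp(A, terms=40):
    """Truncated power series exp(A) = sum_n A^n / n! for a 2 x 2 matrix."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

A = [[-1.0, 1.0], [0.0, -2.0]]     # eigenvalues -1 and -2
S = mat_exp(A)                      # shift operator of x' = Ax for t = 1
multipliers = (S[0][0], S[1][1])    # S is upper triangular like A
```

Both multipliers, $e^{-1}$ and $e^{-2}$, have modulus less than one, in agreement with the negative real parts of the eigenvalues of A.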

We now justify the important formula
$\Phi_t'(x_0) = \exp(tF'(x_0))$ for all $t \in [-T,T]$, (23)
i.e., the linearization of the flow at $x_0$ is equal in a natural way to the flow of
the linearized differential equation at the point $x_0$.

Theorem 79.B (Structure of Flows for Autonomous Differential Equations).
Let $F: U(x_0) \subseteq X \to X$ be $C^k$, $k \ge 1$, in the open neighborhood $U(x_0)$ of a
B-space X. Assume that the local flow $\{\Phi_t\}$ which corresponds to
$x' = F(x)$
is defined for all a, t with $\|a - x_0\| < r$ and $-T \le t \le T$. Then:
(a) The map $(a,t) \mapsto \Phi_t(a)$ is $C^k$.
(b) If $x_0$ is an equilibrium point, i.e., $F(x_0) = 0$, then (23) is satisfied.
(c) If one can choose T = 1, then the multipliers of the equilibrium point $x_0$ are
equal to the multipliers of the fixed point $x_0$ of the shift operator $S = \Phi_1$.
According to (23) we have
$\Phi_T'(x_0) = \exp(TF'(x_0))$
for arbitrary T > 0. This shows the relation between the multipliers $\mu$ of the
equilibrium point $x_0$ and the multipliers $\zeta$ of the fixed point $x_0$ of $\Phi_T$. It follows
that in the sense of Definition 79.4, $\mu$ and $\zeta$ always have the same stability
properties, i.e., simultaneously $|\cdot| < 1$, $= 1$, $> 1$ is true for both.
PROOF. Ad (a). This follows immediately from Theorem 4.D.
Ad (b). Let x = x(t,a) be the solution of (20) with x(0,a) = a, that is, $\Phi_t(a) =
x(t,a)$. From Theorem 4.D we can differentiate with respect to a. For $a = x_0$
we obtain
$x_a'(t,x_0)h = F'(x(t,x_0))x_a(t,x_0)h$ for all $h \in X$.
Because of $x(t,x_0) = x_0$ we have that $t \mapsto x_a(t,x_0)h$ is a solution of
$x' = F'(x_0)x$ with $x(0) = h$,
and hence $x_a(t,x_0)h = (\exp tF'(x_0))h$. This is (23).
Ad (c). This follows from (23) with t = 1. □
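The statement that the linearization of the flow at an equilibrium point equals the flow of the linearized equation can be observed numerically. The sketch below assumes the scalar example $F(x) = -x + x^2$ with the equilibrium point $x_0 = 0$ and $F'(0) = -1$; it approximates the flow by the classical Runge-Kutta method and its derivative with respect to the initial value by a central difference. The step sizes are illustrative choices.

```python
import math

def F(x):
    return -x + x * x   # F(0) = 0 and F'(0) = -1 (illustrative choice)

def flow(a, t, n=1000):
    """Approximate the flow value Phi_t(a) by the classical Runge-Kutta method."""
    h = t / n
    x = a
    for _ in range(n):
        k1 = F(x)
        k2 = F(x + 0.5 * h * k1)
        k3 = F(x + 0.5 * h * k2)
        k4 = F(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

t = 1.5
eps = 1e-6
# Central difference approximation of the derivative of a -> Phi_t(a) at a = 0.
derivative = (flow(eps, t) - flow(-eps, t)) / (2 * eps)
expected = math.exp(t * (-1.0))   # exp(t F'(x0))
```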

79.4. Floquet Transformation Trick

In the following section we shall reduce the stability question for periodic
solutions to Theorem 79.A. We will need the so-called Floquet transformation
trick, and this section is a preparation for this. The important point is that
the Floquet transformation yields differential equations where the principal
part is time independent and hence Theorem 79.A can be applied.
We consider the linear differential equation
$z' = B(t)z$, $z(0) = z_0$. (24)
(H1) For every $t \in \mathbb{R}$ the map $B(t): X \to X$, defined on a B-space X over $\mathbb{K}$, is
linear and continuous. Moreover, $t \mapsto B(t)$ is a continuous map from $\mathbb{R}$
into L(X,X) with period p > 0.
For example, if $X = \mathbb{R}^n$ then B(t) is an (n × n)-matrix, where the elements
are continuous, p-periodic functions of t.
From Corollary 3.8 it follows that (24) with (H1) has exactly one solution
z = z(t) which exists for all $t \in \mathbb{R}$. Let
$S(t)z_0 = z(t)$.
From (24) we obtain the following differential equation for the shift operator
S:
$S'(t) = B(t)S(t)$, $S(0) = I$. (25)
Note that because of the integral representation
$z(t) = z_0 + \int_0^t B(s)z(s)\,ds$
and (19*) of Chapter 3, the derivative z'(t) exists as a uniform limit with respect
to all $z_0$ in a ball. Thus S'(t) exists in L(X,X).
In particular, $S: \mathbb{R} \to L(X,X)$ is continuous. Since the initial-value problem for
(24) has a unique global solution for any arbitrary initial time, it follows that
$S(t): X \to X$ is bijective and continuous. From the open mapping theorem
A_1(36) it follows that S(t) and $S(t)^{-1}$ belong to L(X,X). Besides z = z(t) also
$t \mapsto z(t+p)$ is a solution of the differential equation in (24), with initial
value z(p), that is, $z(t+p) = S(t)z(p)$. This implies the critical
property
$S(t+p) = S(t)S(p)$ for all $t \in \mathbb{R}$. (26)
We say that a compact set M in $\mathbb{C}$ does not surround the origin if there exists
a half ray which originates at the origin and does not intersect M (Fig. 79.1(a)).
This definition can be given in a more general form by replacing half rays with
"reasonable" curves as shown in Figure 79.1(b). What we actually need is the
fact that the set M has an open neighborhood on which the function
$z \mapsto \ln z$ is holomorphic.

Figure 79.1 (panels (a) and (b))

(H2) The spectrum $\sigma(S(p))$ does not surround the origin. For example, this
condition is always satisfied if dim X < ∞, i.e., for differential equations
in $\mathbb{R}^n$.

Proposition 79.8. If (H1), (H2) hold, then the operator
$A = p^{-1}\ln S(p)$
is well defined and $A \in L(X,X)$. Furthermore, there exists a p-periodic, continuous
operator function $P: \mathbb{R} \to L(X,X)$ with
$S(t) = P(t)e^{tA}$ for all $t \in \mathbb{R}$. (27)
For every $t \in \mathbb{R}$ we find that $P(t)^{-1} \in L(X,X)$.

PROOF. That A is well defined follows from (H2) and the operator calculus in
A_1(60b). We have $S(p) = e^{pA}$. Let
$P(t) = S(t)e^{-tA}$.
From (26) it follows that
$P(t+p) = S(t+p)e^{-(t+p)A} = S(t)S(p)e^{-pA}e^{-tA} = P(t)$.
Moreover, $P(t)^{-1} = e^{tA}S(t)^{-1}$. □

Definition 79.9. The points in the spectrum $\sigma(S(p))$ are called Floquet multipliers
$\zeta$. The transformation
$z(t) = P(t)u(t)$ (28)
is called Floquet transformation. Precisely the eigenvalues in $\sigma(S(p))$ are called
Floquet eigenmultipliers.

An important approximation method to determine the Floquet multipliers
goes as follows. One transforms (25) into the integral equation
$S(p) = I + \int_0^p B(s)S(s)\,ds$ (29)
which, according to Section 1.9, can be solved by successive approximation.
This yields S(p). A concrete example has already been discussed in Problem
4.8. In connection with the Hopf bifurcation of Section 79.10, it will be
convenient to use the following criterion, which is based on the periodic
differential equation
$w'(t) - B(t)w(t) = -\lambda w(t)$, $w(0) = w(p)$. (30)
This is the original differential equation w' = Bw in (24) in which the additional
eigenvalue term $-\lambda w$ appears.

EXAMPLE 79.10. Assume (H1), (H2). Then $\lambda \in \mathbb{C}$ is an eigenvalue of (30) if and
only if $\exp(p\lambda)$ is a Floquet eigenmultiplier.

PROOF. Let $S(p)a = e^{\lambda p}a$ with $a \ne 0$. Letting
$z(t) = S(t)a$ and $w(t) = e^{-\lambda t}z(t)$
we obtain w(0) = w(p) = a and
$w' = -\lambda w + e^{-\lambda t}z' = -\lambda w + Bw$
because of z' = Bz. This argument can be reversed. □
Now we present the key trick. When studying the stability of the equilibrium
point $x_0 = 0$ of (24), we cannot apply Theorem 79.A right away, since B
depends on t. However, if we use the Floquet transformation (28), then (24)
passes to the new differential equation
$u' = Au$ (31)
where A does not depend on t. This follows from $P(t) = S(t)e^{-tA}$, hence
$P'(t) = S'(t)e^{-tA} - S(t)Ae^{-tA} = B(t)P(t) - P(t)A$
and $z' = P'u + Pu' = Bz$, which gives $P(u' - Au) = 0$ and thus (31). Note that
according to the operator calculus in A_1(60b) we may interchange functions of A.

Proposition 79.11 (Stability Criterion). Assume (H1), (H2). If all Floquet
multipliers of equation (24) are asymptotically stable (or at least one is unstable),
then the equilibrium point $x_0 = 0$ of (24) is asymptotically stable (or unstable).

PROOF. Because of $S(p) = e^{pA}$ we obtain the relation $\zeta = \mu^p$ between the
Floquet multipliers $\zeta$ and the multipliers $\mu$ of (31). From $|\zeta| = |\mu|^p$ it follows
that both have the same stability properties of Definition 79.4. Furthermore,
these stability properties are not changed under Floquet transformation, since
$P: [0,p] \to L(X,X)$ is continuous and thus Problem 1.7 implies that
$\sup_{t \in \mathbb{R}} \|P(t)\| < \infty$ and $\sup_{t \in \mathbb{R}} \|P(t)^{-1}\| < \infty$.
An application of Theorem 79.A to the equilibrium point u = 0 of (31) yields
the assertion. □

This transformation trick goes back to Floquet (1883).
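In the scalar case the Floquet decomposition $S(t) = P(t)e^{tA}$ can be computed and checked by hand. The sketch below assumes the example $z' = b(t)z$ with $b(t) = -1 + \cos t$ and period $p = 2\pi$, for which $S(t) = e^{-t + \sin t}$, $A = -1$, and $P(t) = e^{\sin t}$. It obtains S numerically by the classical Runge-Kutta method and then forms A and P as in Proposition 79.8; the function names are hypothetical.

```python
import math

PERIOD = 2 * math.pi

def b(t):
    return -1.0 + math.cos(t)   # continuous and PERIOD-periodic (illustrative choice)

def shift(t, n=4000):
    """Approximate S(t) for z' = b(t) z, S(0) = 1, by the classical Runge-Kutta method."""
    h = t / n
    s, tau = 1.0, 0.0
    for _ in range(n):
        k1 = b(tau) * s
        k2 = b(tau + 0.5 * h) * (s + 0.5 * h * k1)
        k3 = b(tau + 0.5 * h) * (s + 0.5 * h * k2)
        k4 = b(tau + h) * (s + h * k3)
        s += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        tau += h
    return s

floquet_multiplier = shift(PERIOD)           # S(p)
A = math.log(floquet_multiplier) / PERIOD    # A = p^{-1} ln S(p)

def P(t):
    return shift(t) * math.exp(-t * A)       # P(t) = S(t) e^{-tA}
```

The computed multiplier is close to $e^{-2\pi}$, hence asymptotically stable, and `P` repeats its values after one period, as (27) requires.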

79.5. Asymptotic Stability and Instability of Periodic Solutions

Consider the nonlinear differential equation
$x' = F(x,t)$. (32)
(H1) The map $F: X \times \mathbb{R} \to X$ is $C^2$, where X is a B-space over $\mathbb{K}$. For each
$x \in X$ we assume that $t \mapsto F(x,t)$ has period p > 0.
(H2) Let $x = x_0(t)$ be a p-periodic solution of (32).
(H3) Let $B(t) = F_x(x_0(t),t)$ and let S(p) denote the shift operator for z' = Bz.
The spectrum of S(p) does not surround the origin.

Definition 79.12. The Floquet multipliers of the periodic solution $x = x_0(t)$ are
precisely the points of the spectrum $\sigma(S(p))$.

Theorem 79.C. Assume (H1)-(H3). If all Floquet multipliers of $x = x_0(t)$ are
asymptotically stable (or at least one Floquet multiplier is unstable), then
$x = x_0(t)$ is asymptotically stable (or unstable).

PROOF. We let $x = x_0(t) + z$ and obtain from (32) the equation
$z' = B(t)z + g(z,t)$
with
$g(z,t) = F(x_0(t) + z, t) - F(x_0(t),t) - F_x(x_0(t),t)z$.
As in (31) the Floquet transformation z(t) = P(t)u(t) yields the new differential
equation
$u' = Au + P(t)^{-1}g(P(t)u, t)$ (33)
with time-independent linear principal part A. As in the proof of Proposition
79.11 one then applies Theorem 79.A to (33). □

79.6. Orbital Stability

Unfortunately, Theorem 79.C cannot be used to study the asymptotic stability
of periodic solutions of autonomous differential equations
$x'(t) = F(x(t))$. (34)
This follows because one is a Floquet eigenmultiplier if $x = x_0(t)$ is a nonconstant,
p-periodic solution. Namely, besides $x = x_0(t)$ also $x = x_0(t + \tau)$ is
a solution of (34), i.e.,
$x_0'(t+\tau) = F(x_0(t+\tau))$.
Differentiation with respect to $\tau$ at $\tau = 0$ gives
$y'(t) = F_x(x_0(t))y(t)$ (E)
with $y(t) = x_0'(t)$. Hence
$y(p) = S(p)y(0)$.
The fact that $y(p) = y(0) \ne 0$ implies that y(0) is an eigenvector of S(p) for the
eigenvalue one. Note that from y(0) = 0 and (E) it would immediately follow
that $x_0(t) = \text{const}$. This shows that the concept of Ljapunov stability is not
practical for periodic solutions of autonomous differential equations. However,
one can employ the concept of orbital stability. Recall that an orbit of
$x = x_0(t)$ is the set $C = \{x_0(t): t \in \mathbb{R}\}$, i.e., the set of all solution points.
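That one is always a Floquet eigenmultiplier of a nonconstant periodic solution can be seen in a planar example. The sketch below assumes the system $x' = x - y - x(x^2+y^2)$, $y' = x + y - y(x^2+y^2)$, which has the $2\pi$-periodic solution $x_0(t) = (\cos t, \sin t)$; in polar coordinates it reads $r' = r - r^3$, $\theta' = 1$. The code integrates the variational equation $M' = F_x(x_0(t))M$, $M(0) = I$, over one period by the classical Runge-Kutta method, so that `M` approximates $S(p)$.

```python
import math

def jacobian(t):
    """F_x evaluated along the periodic solution x0(t) = (cos t, sin t)."""
    x, y = math.cos(t), math.sin(t)
    return ((1 - 3 * x * x - y * y, -1 - 2 * x * y),
            (1 - 2 * x * y, 1 - x * x - 3 * y * y))

def rhs(t, M):
    J = jacobian(t)
    return [[sum(J[i][k] * M[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add_scaled(M, K, c):
    return [[M[i][j] + c * K[i][j] for j in range(2)] for i in range(2)]

def monodromy(n=4000):
    """Approximate S(p), p = 2*pi, for the variational equation M' = J(t) M."""
    h = 2 * math.pi / n
    M = [[1.0, 0.0], [0.0, 1.0]]
    t = 0.0
    for _ in range(n):
        k1 = rhs(t, M)
        k2 = rhs(t + 0.5 * h, add_scaled(M, k1, 0.5 * h))
        k3 = rhs(t + 0.5 * h, add_scaled(M, k2, 0.5 * h))
        k4 = rhs(t + h, add_scaled(M, k3, h))
        M = [[M[i][j] + h * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j]) / 6
              for j in range(2)] for i in range(2)]
        t += h
    return M

M = monodromy()
# y(0) = x0'(0) = (0, 1); the second column of M is S(p) y(0).
image_of_tangent = (M[0][1], M[1][1])
```

The tangent vector $y(0) = (0, 1)$ is reproduced after one period, so one is an eigenvalue of $S(p)$; the second multiplier is $e^{-4\pi}$, which reflects the radial contraction toward the orbit.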

Definition 79.13. The periodic solution $x = x_0(t)$ of equation (34) is called
orbitally asymptotically stable if and only if there exists an open neighborhood
U of the orbit C of $x = x_0(t)$ such that every solution x = x(t) of (34) with
$x(t_0) \in U$ for fixed $t_0 \ge 0$ satisfies:
$\lim_{t \to +\infty} \operatorname{dist}(x(t), C) = 0$.
The periodic solution $x = x_0(t)$ of equation (34) is called orbitally stable if
and only if, for each neighborhood V of the orbit C of $x = x_0(t)$, there exists
a neighborhood U of C such that every solution x = x(t) of (34), with $x(t_0) \in U$
for fixed $t_0 \ge 0$, remains in V for all times $t \ge t_0$.
For a planet, this means that a sufficiently small perturbation at some fixed
time leaves the new orbit in a neighborhood of the old orbit, but the course
in time may change.
The periodic solution $x = x_0(t)$ is called orbitally unstable if and only if it is
not orbitally stable.

Theorem 79.D. Let $F: X \to X$ be $C^1$ on a B-space X over $\mathbb{K}$. Let
$x = x_0(t)$
be a nonconstant, p-periodic solution of (34), and assume that the set of all
Floquet multipliers of $x = x_0(t)$ does not surround the origin. Then:
(a) If one is an algebraically simple Floquet eigenmultiplier of $x = x_0(t)$, and
if all the remaining Floquet multipliers of $x = x_0(t)$ are asymptotically
stable, then $x = x_0(t)$ is orbitally asymptotically stable.
(b) If $F: X \to X$ is $C^2$, and if there exists at least one unstable Floquet multiplier
of $x = x_0(t)$, then $x = x_0(t)$ is orbitally unstable.

The proof follows very simply from the Floquet transformation and the
existence of a stable manifold. Since, for didactical reasons, we discuss such
manifolds only in Chapter 80, we postpone the proof until then.
Assertion (b) of Theorem 79.D follows immediately from Theorem 79.C.

79.7. Perturbation of Simple Eigenvalues

The results of this section will mainly be used to examine the stability of
bifurcation branches in the following sections. But actually, the results
are of general interest. One of the main tools used by physicists for the
solution of concrete problems is perturbation calculus. It has been applied
with great success to celestial mechanics and quantum theory. In this direction
we recommend the five volumes of Hagihara (1976) on celestial mechanics,
Reed and Simon (1972, M), Vol. 4 on quantum mechanics, Bogoljubov and
Sirkov (1973, M), (1980, M), and Itzykson and Zuber (1980) on quantum field
theory, and Kevorkian and Cole (1981, M) on boundary layers, for example,
in the hydrodynamics of viscous fluids. The classical standard work on perturbation
theory is Kato (1966, M). Modern methods which yield asymptotic
expansions in connection with critical effects, e.g., in the presence of caustics
in geometrical optics, may be found in Maslov (1972, M) and in the profound
exposition of Leray (1978, M). As an introduction to these ideas we recommend
Eckman and Seneor (1976). Further literature may be found in the References
to the Literature of this chapter.
The basic idea of perturbation calculus is the following. Knowing the exact
solution of a particular problem one wants to compute the solution of the perturbed
problem approximately by using expansions in small parameters. In quantum
electrodynamics, for example, the interaction between electrons, positrons, and
photons can be described by expansions with respect to Sommerfeld's fine
structure constant $\varepsilon = 1/137$.
Experience shows that perturbed eigenvalue problems may have a very
complicated structure. However, in the case of algebraically simple eigenvalues
and certain generalizations the situation becomes transparent. Consider, for
example, the eigenvalue problem
$Ah = \lambda Bh$, $h \in X$, $\lambda \in \mathbb{K}$. (35)
It is our goal to study the perturbation of the eigenvalue $\lambda_0$ of equation
$A_0x_1 = \lambda_0Bx_1$, $x_1 \ne 0$. (36)
(H) Let X and Y be B-spaces over $\mathbb{K}$ and $A_0, B \in L(X,Y)$.

Definition 79.14. Let $C = A_0 - \lambda_0B$. Then $\lambda_0$ is called a B-simple eigenvalue
of $A_0$ if and only if (36) holds for some $x_1$ and C is a Fredholm operator of
index zero with dim N(C) = 1 and the transversality condition
$Bx_1 \notin R(C)$. (37)

From Section 8.4 it follows that (37) is equivalent to the fact that there exists
a $y_1^* \in Y^*$ with $C^*y_1^* = 0$ and
$\langle y_1^*, Bx_1 \rangle = 1$. (38)
In the special case that X = Y, B = I, the identity, Proposition 8.18 shows
that for compact $A_0$ and $\lambda_0 \ne 0$ the algebraically simple and I-simple eigenvalues
coincide.

Proposition 79.15. Assume (H) and let $\lambda_0$ be a B-simple eigenvalue of $A_0$.
Then there exists a neighborhood $U(\lambda_0)$ in $\mathbb{K}$ and a number r > 0 such that
equation (35) has a unique eigenvalue
$\lambda(A)$ in $U(\lambda_0)$
for every operator $A \in L(X,Y)$ with $\|A - A_0\| < r$.
This eigenvalue is simple and the map $A \mapsto \lambda(A)$ is analytic with $\lambda(A_0) = \lambda_0$.
Moreover, we have
$\lambda'(A_0)H = \langle y_1^*, Hx_1 \rangle$ for all $H \in L(X,Y)$. (39)

Corollary 79.16. Let $U(\mu_0)$ be a neighborhood in $\mathbb{R}$ and let
$\mu \mapsto A(\mu)$
be a $C^k$-map from $U(\mu_0)$ into L(X,Y) with $A(\mu_0) = A_0$ and $k \ge 1$. If we write
$\lambda(\mu)$ for $\lambda(A(\mu))$, then $\mu \mapsto \lambda(\mu)$ is a $C^k$-map with
$\lambda'(\mu_0) = \langle y_1^*, A'(\mu_0)x_1 \rangle$. (40)
PRooF.
(I) Eigenvalue problem. We use the decomposition
X = N(C) EB N(C)J.
79.7. Perturbation of Simple Eigenvalues 855

and let w = (A.,z), W = IK x N(C)J.. as well as H(A,A.,z) = A(x1 + z)-


A.B(x 1 + z), (A., z) e W, where A e L(X, Y).
In order to solve (35), we consider
H(A,A.,z) = 0, (A.,z)e W. (41)
The linearization is
Hw(A 0 ,A.0 ,0)w = A0 z- A.0 Bz- A.Bx 1 = Cz- A.Bx 1 •
The operator Hw(A 0 , A.0 , 0): W-.. Y is bijective, because equation
Cz- A.Bx 1 = y, (A.,z)e W
has a unique solution for every y e Y. In fact, from
(yf,Cz) = (C*yf,z) = 0 for all zeN(C)J..
and (38) it follows that A.= -(yf,y). Moreover, equation
Cz = y + A.Bx 1 , zeN(C)J..
has a unique solution, since ( yf, y + A.Bx 1 ) = 0.
The implicit function theorem (Theorem 4.B) implies that (41) can be
uniquely solved for (A., z) in a neighborhood of (A 0 , A.0 , 0). This gives the
eigenvalues A.(A).
(II) Simplicity of A.(A). It remains to show that (35) has no eigensolutions
heN(C)J... We set x 1 = 0 in Hand consider the equation
QH(A,A.,z) = 0, zeN(C).l, (42)
where Q: Y-.. R(C) is a fixed projection operator onto R(C). The linear-
ization is
QH,.(A 0 ,A.0 ,0)z = A0 z- A.0 Bz = Cz.
Therefore the operator
QH.. (A 0 ,A.0 ,0): N(C)J..-.. R(C)
is bijective. The implicit function theorem implies that equation (42) can
be uniquely solved for z in a neighborhood of (A 0 , A. 0 , 0). This solution
is z = 0.
Following this preparation we consider the equation
Az- A.Bz = 0, zeN(C)J...
It follows that z is also a solution of (42), and hence it follows from the
previous arguments that z = 0.
(III) Proof of (39). We have $z(A_0) = 0$. Differentiation of
$A(x_1 + z(A)) - \lambda(A)B(x_1 + z(A)) = 0$
at the point $A_0$ with respect to $A$ gives
$Hx_1 - (\lambda'(A_0)H)Bx_1 + C(z'(A_0)H) = 0$
for all $H \in L(X, Y)$. Applying $y_1^*$ we obtain (39). □
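
The derivative formula (40) can be checked numerically in finite dimensions. The following Python sketch does this for a 2x2 matrix family with $B = I$; the family `A(mu)`, the vectors `x1`, `y1`, and all function names are illustrative assumptions chosen for this check, not objects from the text.

```python
import math

def A(mu):
    # Hypothetical 2x2 family (an assumption for illustration).
    # A(0) = [[0, 1], [0, -1]] has the simple eigenvalue 0.
    return [[mu, 1.0], [mu, -1.0]]

def eigenvalue_near_zero(mu):
    # Root of the characteristic polynomial that equals 0 at mu = 0.
    (a, b), (c, d) = A(mu)
    tr, det = a + d, a * d - b * c
    return 0.5 * (tr + math.sqrt(tr * tr - 4.0 * det))

x1 = (1.0, 0.0)                 # A(0) x1 = 0
y1 = (1.0, 1.0)                 # y1^T A(0) = 0, normalized so that (y1, x1) = 1
dA = [[1.0, 0.0], [1.0, 0.0]]   # A'(0)

# Right-hand side of (40): (y1, A'(0) x1)
dA_x1 = [dA[0][0] * x1[0] + dA[0][1] * x1[1],
         dA[1][0] * x1[0] + dA[1][1] * x1[1]]
formula = y1[0] * dA_x1[0] + y1[1] * dA_x1[1]

# Left-hand side of (40): central finite difference of the eigenvalue
h = 1e-6
finite_difference = (eigenvalue_near_zero(h) - eigenvalue_near_zero(-h)) / (2.0 * h)

print(formula, finite_difference)
```

Both numbers should agree up to the finite-difference error, which illustrates that only the eigenvectors of $A_0$ and its adjoint enter the first-order perturbation of a simple eigenvalue.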
Important well-known results about the perturbation of nonsimple eigenvalues can be found in Problem 79.8.

79.8. Loss of Stability and the Main Theorem About Simple Curve Bifurcation
We now come to the main objective of this chapter. An important scientific
phenomenon is that a state of a system loses its stability and thereby passes
into a qualitatively different state. Mathematically, this is bifurcation through
loss of stability. Consider, for example, the autonomous differential equation
$z' = G(\mu, z)$ (43)
in a real B-space $X$ depending on a real parameter $\mu$, which describes an outer
perturbation of the system. Assume that the system has a family $\{z(\mu)\}$ of
equilibrium states, i.e.,
$G(\mu, z(\mu)) = 0$ for all $\mu$.
According to Theorem 79.A the stability of $z(\mu)$ is determined by the spectrum
of $G_z(\mu, z(\mu))$. We consider two important possibilities which cause the loss of
stability.
(i) Simple curve bifurcation. At $\mu = \mu_0$ a real eigenvalue $\lambda(\mu)$ of the linearization
$G_z(\mu, z(\mu))$ passes over the imaginary axis with positive velocity. In
Theorem 79.E below, we precisely state when, at the point $z(\mu_0)$, a new
curve bifurcates from the curve $z = z(\mu)$, i.e., the system possibly passes
into new equilibrium states (Fig. 79.2).
(ii) Hopf bifurcation. At $\mu = \mu_0$ a pair of complex conjugate eigenvalues
of $G_z(\mu, z(\mu))$ passes over the imaginary axis with positive velocity. In
Theorem 79.F below, we precisely state when, in the neighborhood of
$z(\mu_0)$, nonconstant, periodic solutions of (43) bifurcate from the curve
$z = z(\mu)$, i.e., the system begins to oscillate.

[Figure 79.2. Bifurcation from the trivial branch $z = z(\mu)$: supercritical, subcritical, transcritical.]

In Section 79.11 we will show that the famous center theorem of Ljapunov
is a special case of the main theorem about Hopf bifurcation (Theorem 79.F).
Whether or not the system passes into the bifurcation solution mainly
depends on its stability. We assume that the known solution $z = z(\mu)$ is stable
for $\mu < \mu_0$ and unstable for $\mu > \mu_0$. Roughly, we obtain the following picture
for (i) and (ii). This corresponds to what one would naturally expect.
(S) Stability principle. The bifurcation solution is stable for $\mu > \mu_0$ and
unstable for $\mu < \mu_0$.
In particular, if bifurcation solutions only appear for $\mu > \mu_0$ (supercritical
bifurcation) or $\mu < \mu_0$ (subcritical bifurcation), then they are stable or
unstable, respectively. This has schematically been pictured in Figure 79.2. The
dotted lines represent the unstable solutions. In studying (i) and (ii) we will
make essential use of the bunch theorem of Section 8.11. Our proofs will also
give effective procedures for the construction of the solutions. Later on the
stability concepts will be discussed in greater detail.
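
The stability principle (S) can be seen in the simplest scalar models. The following Python sketch uses the pitchfork equations $z' = \mu z - z^3$ (supercritical) and $z' = \mu z + z^3$ (subcritical); these model equations and the function names are assumptions chosen for illustration, not examples from the text. In one dimension the linearization $G_z$ is its own eigenvalue, so its sign decides stability.

```python
import math

def G_z_super(mu, z):
    # Linearization of G(mu, z) = mu*z - z**3 (supercritical model).
    return mu - 3.0 * z * z

def G_z_sub(mu, z):
    # Linearization of G(mu, z) = mu*z + z**3 (subcritical model).
    return mu + 3.0 * z * z

mu = 0.25

# Trivial branch z = 0: the eigenvalue is mu and crosses zero at mu0 = 0.
eig_trivial_before = G_z_super(-mu, 0.0)
eig_trivial_after = G_z_super(mu, 0.0)

# Supercritical branch z = sqrt(mu), present only for mu > 0.
eig_super = G_z_super(mu, math.sqrt(mu))

# Subcritical branch z = sqrt(-mu), present only for mu < 0.
eig_sub = G_z_sub(-mu, math.sqrt(mu))

print(eig_trivial_before, eig_trivial_after, eig_super, eig_sub)
```

In this sketch the supercritical branch has a negative eigenvalue (stable) and the subcritical branch a positive one (unstable), in agreement with (S).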
First we examine simple curve bifurcations for the stationary equation
$G(\mu, z) = 0$, $(\mu, z) \in \mathbb{R} \times X$. (44)
Our assumptions are as follows:
(H1) $X$ and $Y$ are real B-spaces with the continuous embedding $X \subseteq Y$ and
corresponding embedding operator $J$.
(H2) (Trivial solution). The map $G: U(\mu_0, z(\mu_0)) \subseteq \mathbb{R} \times X \to Y$ is $C^k$, $k \ge 2$.
There exists a $C^k$-map $z: U(\mu_0) \subseteq \mathbb{R} \to X$ with
$G(\mu, z(\mu)) = 0$ for all $\mu$.
We call $z = z(\mu)$ the trivial solution branch.
(H3) (Loss of stability). There exists a $C^1$-map $\lambda: V(\mu_0) \subseteq \mathbb{R} \to \mathbb{R}$ such that
$\lambda(\mu)$ is an eigenvalue of $G_z(\mu, z(\mu))$ for every $\mu$ with $\lambda(\mu_0) = 0$ and
$\lambda'(\mu_0) > 0$. (45)
Moreover, $\lambda(\mu_0)$ is a $J$-simple eigenvalue of $G_z(\mu_0, z(\mu_0))$.
In (H3) note the following convention. We say that $\lambda$ is an eigenvalue of
$A \in L(X, Y)$ if the equation $Ax = \lambda Jx$ has a solution $x \ne 0$. Condition (45)
elegantly indicates the change in stability. It states that at $\mu = \mu_0$ the real
eigenvalue $\lambda(\mu)$ crosses the imaginary axis with nonvanishing velocity from
the left to the right. As the proof will show, this will yield the important generic
bifurcation condition.
From Section 79.7 it follows that (H3) is equivalent to the following condition
(H3*), which is often easier to verify. Let $L = G_z(\mu_0, z(\mu_0))$.
(H3*) The operator $L: X \to Y$ is Fredholm of index zero with $\dim N(L) = 1$.
There exists an element $x_1 \in N(L)$ and an element $y_1^* \in N(L^*)$ with
$(y_1^*, Jx_1) = 1$ and the generic bifurcation condition
$(y_1^*, G_{z\mu}(\mu_0, z(\mu_0))x_1 + G_{zz}(\mu_0, z(\mu_0))z'(\mu_0)x_1) > 0$.
According to (40), the expression on the left-hand side is then equal to $\lambda'(\mu_0)$.

The weak stability concept used in the following will be defined precisely
during the proof below. Moreover, the real numbers $s$ vary in a sufficiently
small neighborhood of zero.

Theorem 79.E (Main Theorem About Simple Curve Bifurcation of Crandall
and Rabinowitz (1973)). Assume (H1)-(H3). Then $(\mu_0, z(\mu_0))$ is a bifurcation
point of the equation (44).
In a neighborhood of the point $(\mu_0, z(\mu_0))$ in $\mathbb{R} \times X$, all bifurcation solutions
lie on a $C^{k-1}$-curve
$s \mapsto (\mu(s), z_0(s))$
which passes at $s = 0$ through the bifurcation point $(\mu_0, z(\mu_0))$.
The trivial solution $z = z(\mu)$ is weakly stable for $\mu < \mu_0$ and weakly unstable
for $\mu > \mu_0$.
If $\mu'(s) \ne 0$ for all $s \ne 0$, then the bifurcation solution is weakly stable for
$\mu(s) > \mu_0$ and weakly unstable for $\mu(s) < \mu_0$.
If $G$ and $z$ in (H1), (H2) are analytic, then $s \mapsto (\mu(s), z_0(s))$ is also analytic.

PROOF. We use the bunch theorem (Theorem 8.B) of Section 8.11.
(I) Bifurcation branch. In order to simplify the notation let
$\mu = \mu_0 + \varepsilon$, $F(\varepsilon, x) = G(\mu_0 + \varepsilon, z(\mu_0 + \varepsilon) + x)$. (46)
The original equation (44) then becomes
$F(\varepsilon, x) = 0$, $(\varepsilon, x) \in \mathbb{R} \times X$. (47)
Moreover, we have $F(\varepsilon, 0) \equiv 0$ and $L = F_x(0, 0)$. The important generic
bifurcation condition
$(y_1^*, F_{x\varepsilon}(0, 0)x_1) > 0$
of Theorem 8.B follows immediately from (H3*).
Theorem 8.B therefore yields the existence of a $C^{k-1}$-bifurcation
branch $s \mapsto (\varepsilon(s), x(s))$ of (47) through the point $(0, 0)$ with
$x(s) = s(x_1 + w(s))$, $w(s) \in N(L)^\perp$
and $w(0) = 0$. The corresponding solution of (44) is
$\mu(s) = \mu_0 + \varepsilon(s)$, $z_0(s) = z(\mu(s)) + x(s)$.
(II) Weak stability of the trivial solution. We have
$F_x(\varepsilon, x) = G_z(\mu_0 + \varepsilon, z(\mu) + x)$.
Since $\lambda = 0$ is a $J$-simple eigenvalue of $F_x(0, 0)$, it follows from Section
79.7 that there exists a neighborhood $U(0)$ in $\mathbb{C}$ such that the eigenvalue
problem
$F_x(\varepsilon, x)h = \lambda h$, $\lambda \in \mathbb{C}$, $h \in X_{\mathbb{C}}$ (48)
has exactly one eigenvalue $\lambda$ in $U(0)$ for every $(\varepsilon, x)$ in a neighborhood
of zero. The solution $(\varepsilon, x)$ of (47) is called weakly stable (or weakly
unstable) if and only if $\mathrm{Re}\,\lambda < 0$ (or $> 0$). Because of (45) we have
$\lambda(\mu) \lessgtr 0$ for $\mu \lessgtr \mu_0$.
This yields the stability result for the trivial solution.
(III) Weak stability of the bifurcation solution. We now consider
$F_x(\varepsilon(s), x(s))h = \lambda(s)h$.
Let $\varepsilon'(s) \ne 0$ for all $s \ne 0$ in a neighborhood of zero. In Problem 79.4 we
prove
$\lim_{s \to 0} \lambda(s)/\varepsilon'(s)s = -(y_1^*, F_{x\varepsilon}(0, 0)x_1) < 0$. (49)
If, for example, $\varepsilon'(s) > 0$ for all small $s > 0$, then $\varepsilon(s) > 0$ and $\lambda(s) < 0$,
i.e., the bifurcation solution is weakly stable for small $s > 0$. Analogously,
one treats the other cases. □

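Remark 79.17 (Construction of the Bifurcation Solution). We use (46). Furthermore,
we let $N = N(F_x(0, 0))$ and choose a decomposition $X = N \oplus N^\perp$.
The equation
$F_x(0, 0)w = g$, $w \in N^\perp$
has a unique solution for every $g \in Y$ with $(y_1^*, g) = 0$, which will be denoted
by $w = Sg$. Moreover, the equation
$F_x(0, 0)w + \varepsilon F_{x\varepsilon}(0, 0)x_1 = f$, $(w, \varepsilon) \in N^\perp \times \mathbb{R}$ (50)
also has a unique solution for every $f \in Y$, namely
$\varepsilon = (y_1^*, f)/(y_1^*, F_{x\varepsilon}(0, 0)x_1)$, $w = S(f - \varepsilon F_{x\varepsilon}(0, 0)x_1)$. (51)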
The bifurcation solution has the form
$x(s) = sx_1 + sw(s)$
with $w(s) \in N^\perp$ and $w(0) = 0$. From $F(\varepsilon(s), x(s)) = 0$ follows (50) with
$-f = s^{-1}F(\varepsilon(s), sx_1 + sw(s)) - F_x(0, 0)w(s) - \varepsilon(s)F_{x\varepsilon}(0, 0)x_1$.
We insert this $f$ into (51) and fix a number $s \ne 0$ in a sufficiently small
neighborhood of zero. As starting value we choose $\varepsilon_0(s) = 0$, $w_0(s) = 0$. The
bunch theorem of Section 8.11 implies that the corresponding iteration scheme
for (51) converges to the bifurcation solution.
If $F$ is analytic, then $s \mapsto (\varepsilon(s), x(s))$ is also analytic and one can use a power series
ansatz and comparison of coefficients in (51). Note that $f = O(s)$, $s \to 0$.
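
The iteration scheme of Remark 79.17 can be tried out on a scalar toy problem. The following Python sketch assumes $X = Y = \mathbb{R}$ and the hypothetical model $F(\varepsilon, x) = \varepsilon x - x^2 + \varepsilon^2 x$, for which $F_x(0,0) = 0$, $x_1 = y_1^* = 1$, $F_{x\varepsilon}(0,0) = 1$, and $N^\perp = \{0\}$, so that $w(s) = 0$ and only $\varepsilon(s)$ has to be iterated. The model and all names are assumptions made for illustration.

```python
def F(eps, x):
    # Hypothetical scalar model with F(eps, 0) = 0, F_x(0, 0) = 0, F_{x eps}(0, 0) = 1.
    return eps * x - x * x + eps * eps * x

F_x_eps = 1.0   # mixed derivative F_{x eps}(0, 0) of the model above
x1 = 1.0        # spans N = N(F_x(0, 0)) = R, hence N-perp = {0} and w(s) = 0

def bifurcation_parameter(s, steps=60):
    eps = 0.0                                   # starting value eps_0(s) = 0
    for _ in range(steps):
        # -f = F(eps, s*x1)/s - F_x(0,0)*w - eps*F_{x eps}(0,0)*x1, with w = 0
        f = -(F(eps, s * x1) / s - eps * F_x_eps * x1)
        # formula (51) with y1* = 1: eps = (y1*, f) / (y1*, F_{x eps}(0,0) x1)
        eps = f / (F_x_eps * x1)
    return eps

s = 0.1
eps_s = bifurcation_parameter(s)
residual = F(eps_s, s * x1)
print(eps_s, residual)
```

For this model the update reduces to $\varepsilon_{n+1} = s - \varepsilon_n^2$, a contraction for small $s$, and the limit solves $F(\varepsilon, s) = 0$ on the nontrivial branch $x(s) = s$.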
We now use the following condition (H4) in order to prove a stronger
stability result.
(H4) Let $X = Y$. Again we set $L = G_z(\mu_0, z(\mu_0))$. Moreover, let $M = \sigma(L) \setminus \{0\}$.
We assume that $M$ lies in the left half-plane with a positive distance
to the imaginary axis. Moreover, we assume that there exists a neighborhood
$U$ of zero in $\mathbb{C}$ such that
$\sigma(G_z(\mu, z)) \cap U$
consists entirely of eigenvalues. More precisely, this should be true for
all points $(\mu, z)$ in a sufficiently small neighborhood of the point
$(\mu_0, z(\mu_0))$ in $\mathbb{R} \times X$.

Corollary 79.18 (Stability of Solutions in the Sense of Ljapunov). If (H1)-(H4)
hold, then we can replace weakly stable (or weakly unstable) with asymptotically
stable (or unstable) in Theorem 79.E.

PROOF. A well-known theorem in perturbation theory states that for small
perturbations of $L$ in the operator norm, the perturbation of $M$ remains in
the left open half-plane (see Problem 79.8). From (II) and (III) in the proof of
Theorem 79.E we know the behavior of the perturbation of $\lambda = 0$. Theorem
79.A then yields the assertion. □

79.9. Loss of Stability and the Main Theorem About Hopf Bifurcation
We consider again the differential equation
$z' = G(\mu, z)$. (52)

Our assumptions are now as follows:
(H1) $X$ and $Y$ are real B-spaces with the continuous embedding $X \subseteq Y$ and
corresponding embedding operator $J$.
(H2) (Trivial solution). The map $G: U(\mu_0, z(\mu_0)) \subseteq \mathbb{R} \times X \to Y$ is $C^k$, $k \ge 2$.
There exists a $C^k$-map $z: U(\mu_0) \subseteq \mathbb{R} \to X$ with $G(\mu, z(\mu)) = 0$ for all $\mu$.
(H3) (Loss of stability). There exists a $C^1$-map $\lambda: V(\mu_0) \subseteq \mathbb{R} \to \mathbb{C}$ such that
$\lambda(\mu)$ is an eigenvalue of $G_z(\mu, z(\mu))$ for all $\mu$ with $\lambda(\mu_0) = i\omega_0$, $\omega_0 > 0$ and
$\mathrm{Re}\,\lambda'(\mu_0) > 0$. (53)
Moreover, $\lambda(\mu_0)$ is a $J$-simple eigenvalue of $G_z(\mu_0, z(\mu_0))$ with respect to
the complexification.
For $X = Y = \mathbb{R}^n$, $J$-simplicity means that $\lambda(\mu_0)$ is algebraically simple. The
number $p_0 = 2\pi/\omega_0$ will arise as the limit period of the periodic solutions
which are bifurcating from $z(\mu_0)$.
Besides $\lambda(\mu)$ the complex conjugate number $\bar\lambda(\mu)$ is also an eigenvalue of
$G_z(\mu, z(\mu))$. It follows from (53) that, for $\mu = \mu_0$, the pair $(\lambda(\mu), \bar\lambda(\mu))$ of complex
conjugate eigenvalues crosses the imaginary axis with nonvanishing velocity
from the left to the right. As the proof shows, this loss of stability yields the
important generic bifurcation condition.
(H4) (Nonresonance condition). None of the numbers $ik\omega_0$ with $k = 0$ and
$k = 2, 3, \ldots$ is an eigenvalue of $G_z(\mu_0, z(\mu_0))$.
Conditions (H3) and (H4) guarantee that the linearized equation
$z' = G_z(\mu_0, z(\mu_0))z$ has periodic solutions of the form
$z = (\cos k\omega_0 t)c + (\sin k\omega_0 t)d$, $c, d \in X$
with $k = 1$, but not with $k = 0$ or $k = 2, 3, \ldots$.
Let $C^m_{2\pi}(\mathbb{R}, X)$ denote the set of all $2\pi$-periodic, $m$-times continuously
differentiable functions with values in $X$. Let
$X_{2\pi} = C^1_{2\pi}(\mathbb{R}, X)$ and $Y_{2\pi} = C_{2\pi}(\mathbb{R}, Y)$.
(H5) (Technical condition). The operator $d/dt - \omega_0^{-1}G_z(\mu_0, z(\mu_0))$ from $X_{2\pi}$
into $Y_{2\pi}$ is Fredholm of index zero with two-dimensional null space.
For $X = Y = \mathbb{R}^n$ this condition follows automatically from (H1)-(H4) (see
Problem 79.5). Let $A = G_z(\mu_0, z(\mu_0)) - i\omega_0 J$ with $\omega_0 > 0$. From Section 79.7
it follows that (H3) is equivalent to the following condition.
(H3*) The operator $A: X_{\mathbb{C}} \to Y_{\mathbb{C}}$ is Fredholm of index zero with $\dim N(A) = 1$.
There exists an element $a \in N(A)$ and an element $a^* \in N(A^*)$ with
$(a^*, Ja) = 1$ and
$\mathrm{Re}(a^*, G_{z\mu}(\mu_0, z(\mu_0))a + G_{zz}(\mu_0, z(\mu_0))z'(\mu_0)a) > 0$.
According to (40) the expression on the left-hand side is then equal to
$\mathrm{Re}\,\lambda'(\mu_0)$.
If $z = z_0(t)$ is a $p$-periodic solution of the original equation (52) with $p > 0$,
then we use the time scaling $\tau = 2\pi t/p$ to renormalize it to period $2\pi$. We let
$x(\tau) = z_0(t)$
and call the tuple
$(\mu, p, x)$ in $\mathbb{R}^2 \times X_{2\pi}$
a solution tuple of (52). By a phase shift we mean the transformation from
$\tau \mapsto x(\tau)$ to $\tau \mapsto x(\tau + \alpha)$.
A solution tuple $(\mu, p, x)$ is called nontrivial if and only if $x$ does not coincide
with the equilibrium point $z(\mu)$.
Our goal is to find a family
$\mu = \mu(s)$, $p = p(s)$, $x = x_s$ (54)
of solution tuples for all real $s$ in a neighborhood $U(0)$ of zero with
$(\mu(s), p(s), x_s) \to (\mu_0, p_0, z(\mu_0))$ as $s \to 0$ (55)
in $\mathbb{R}^2 \times X_{2\pi}$ as well as
$\mu(s) = \mu(-s)$ and $p(s) = p(-s)$ for all $s \in U(0)$, (56)
where $p_0 = 2\pi/\omega_0$. If we write the element $a \in X_{\mathbb{C}}$ of (H3*) in the form
$a = a_1 + ia_2$ with $a_1, a_2 \in X$, then we will obtain
$x_s(\tau) = z(\mu_0) + s[(\cos\tau)a_1 + (\sin\tau)a_2] + o(s)$, $s \to 0$. (57)
Hence the tuple in (54) is nontrivial for $s \ne 0$, and we have nontrivial periodic
solutions.
The following theorem is called the main theorem of Hopf bifurcation. The
stability concepts mentioned will be given a precise form during the proof.

Theorem 79.F (Hopf (1942), Crandall and Rabinowitz (1975)). Assume (H1)-(H5).
(a) Existence of periodic solutions. There exists a $C^{k-1}$-curve (54), whose points
consist of solution tuples of equation (52). Furthermore, (55) through (57)
hold.
(b) Uniqueness. There exists a neighborhood $U(\mu_0, p_0, z(\mu_0))$ in $\mathbb{R}^2 \times X_{2\pi}$ such
that all nontrivial solution tuples of (52) in this neighborhood are given by
(54) with $s \ne 0$ and by phase shifts of $x_s$.
(c) Stability. The trivial solution $z(\mu)$ is weakly stable for $\mu < \mu_0$ and weakly
unstable for $\mu > \mu_0$. If $\mu'(s) \ne 0$ for all $s \ne 0$ in a neighborhood of zero,
then (54) is weakly stable for $\mu(s) > \mu_0$ and weakly unstable for $\mu(s) < \mu_0$.
(d) Analyticity. If $G$ and $z$ in (H1), (H2) are analytic, then all functions in (54)
depend analytically on $s$.
The proof will yield an effective iteration scheme for the construction of (54).
In the analytic case one can use a power series ansatz and comparison of coefficients. If
$\mu'(s) \ne 0$
for all $s \ne 0$, it follows from (56) that nontrivial periodic solutions are possible
only for $\mu > \mu_0$ or $\mu < \mu_0$, i.e., weakly stable supercritical bifurcation or
weakly unstable subcritical bifurcation are the only bifurcations that occur.
The natural scientist is interested in the former case.
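
A planar model makes the statements of Theorem 79.F concrete. The following Python sketch integrates the system $x' = \mu x - y - (x^2+y^2)x$, $y' = x + \mu y - (x^2+y^2)y$ with a classical Runge-Kutta scheme; this normal form, the step size, and all names are assumptions chosen for illustration, not taken from the text. Its linearization at the origin has the eigenvalues $\mu \pm i$, so $\mu_0 = 0$, $\omega_0 = 1$, $\mathrm{Re}\,\lambda'(0) = 1 > 0$, and for $\mu > 0$ the circle of radius $\sqrt{\mu}$ is a periodic orbit of period $2\pi$.

```python
import math

def rhs(mu, x, y):
    # Planar Hopf normal form (an illustrative assumption).
    r2 = x * x + y * y
    return mu * x - y - r2 * x, x + mu * y - r2 * y

def final_radius(mu, x=0.1, y=0.0, dt=0.01, steps=40000):
    # Classical fourth-order Runge-Kutta integration up to time steps*dt.
    for _ in range(steps):
        k1x, k1y = rhs(mu, x, y)
        k2x, k2y = rhs(mu, x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = rhs(mu, x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = rhs(mu, x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        y += dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return math.hypot(x, y)

radius_after = final_radius(0.04)    # mu > mu0: orbit approaches the cycle r = sqrt(mu)
radius_before = final_radius(-0.04)  # mu < mu0: orbit decays to the equilibrium
print(radius_after, radius_before)
```

In this model the branch can be parametrized by the amplitude $s$ with $\mu(s) = s^2$ and $p(s) = 2\pi$, which is consistent with the symmetry (56) and with a weakly stable supercritical bifurcation.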
Theorem 79.F can be applied to the case $X = \mathbb{R}^n$ (systems of ordinary
differential equations), to parabolic partial differential equations, and to the
time-dependent Navier-Stokes equations. In the last two cases one needs to
modify the assumptions slightly. We again assume (H1) to (H5), but do not
choose $X_{2\pi}$ and $Y_{2\pi}$ as above, but instead choose function spaces whose
elements are Hölder continuously differentiable in an appropriate sense with
respect to the space and time variable, and have period $2\pi$ with respect to
time. Our proof will immediately apply to this situation. The appropriate
spaces may be found in Joseph and Sattinger (1972) together with the a priori
estimates that imply (H5).
In the following proof, the concept of weak stability and weak instability
is based on the behavior of the essential Floquet multipliers (see Lemma 79.19
below). From the physical point of view, it is important to know the orbital
stability or orbital instability of the bifurcating periodic solutions. However,
similarly as in the proof of Corollary 79.18, it is not difficult to prove the orbital
stability or orbital instability via Theorem 79.D by making additional natural
assumptions about the spectrum of $G_z(\mu_0, z(\mu_0))$. An important result in this
direction will be proved in Problem 79.9.
Further interesting results about Hopf bifurcation can be found in Problems
79.10 and 79.11, and in Chapter 80.

79.10. Proof of Theorem 79.F
We apply the bunch theorem (Theorem 8.B) of Section 8.11 to the operator
equation (61) below. The important generic bifurcation condition
$\det\big((x_i^*, F_{x\varepsilon_j}(0, 0)x_1)\big) \ne 0$
of Theorem 8.B will follow from (58) below, and (58) is a consequence of the
loss of stability $\mathrm{Re}\,\lambda'(\mu_0) > 0$.
Since the bunch theorem follows from the implicit function theorem, the
same is true for the Hopf bifurcation (Theorem 79.F).

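Step 1: Preparations.
By passing, if necessary, to $f(\varepsilon, z) = G(\mu_0 + \varepsilon, z(\mu_0 + \varepsilon) + z)$ we may assume
right away that $\mu_0 = 0$ and $z(\mu) \equiv 0$. Also, for simplicity, we write $x$ instead
of $Jx$. Important is the generic bifurcation condition
$\det\big((x_i^* \mid L_{j-1}x_1)\big)_{i,j=1,2} \ne 0$, (58)
which will follow from $\mathrm{Re}\,\lambda'(\mu_0) \ne 0$, where we set
$L_0 = \omega_0^{-1}G_z(0, 0)$, $L_1 = \omega_0^{-1}G_{z\mu}(0, 0)$,
and $(f \mid g) = \int_0^{2\pi} (f(\tau), g(\tau))\,d\tau$. We have that $L_0, L_1 \in L(X, Y)$. Moreover, we
define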
$x = e^{i\tau}a$, $x_1 = \mathrm{Re}\,x$, $x_2 = \mathrm{Im}\,x$,
$x^* = e^{-i\tau}a^*$, $x_1^* = \mathrm{Re}\,x^*$, $x_2^* = \mathrm{Im}\,x^*$.
From (H3*) follows
$L_0 a = ia$, $L_0^* a^* = ia^*$, $(a^*, a) = \pi^{-1}$, (59)
where $a$ has been renormalized. Finally, we let
$(T_\alpha f)(\tau) = f(\tau + \alpha)$.
Because of $a = a_1 + ia_2$ we have $x_1 = (\cos\tau)a_1 - (\sin\tau)a_2$, etc. The following
statements are easily verified.

(I) Equations (59) imply that $(x_i^* \mid x_j) = \delta_{ij}$ for $i, j = 1, 2$ and
$(x_1^* \mid L_0 x_1) = 0$,
(II) We prove (58). It is
$\pi(x_1^* \mid L_1 x_1) = (a_1^*, L_1 a_1) - (a_2^*, L_1 a_2) = \mathrm{Re}(a^*, L_1 a)$.
Thus (H3*) implies that
$(x_1^* \mid L_1 x_1) = \omega_0^{-1}\,\mathrm{Re}\,\lambda'(\mu_0) > 0$.
From (I) follows (58).
(III) $(T_\pi x_1)(\tau) = -x_1(\tau)$.
(IV) $T_\alpha$ is a linear map in the spaces $\mathrm{span}\{x_1, x_2\}$ and $\mathrm{span}\{x_1^*, x_2^*\}$.
(V) For each point $y \in \mathrm{span}\{x_1, x_2\}$ there exist real numbers $\beta$, $r$ with
$T_\beta y = rx_1$,
because we have always $y = \mathrm{Re}(bx)$ for some $b \in \mathbb{C}$, hence $y = \mathrm{Re}(e^{i\tau - i\gamma}|b|a)$.
(VI) Let $x \in X_{2\pi}$. With $\dot x$ we denote the derivative with respect to the time
variable $\tau$. It follows from (59) that $x_1$, $x_2$ and $x_1^*$, $x_2^*$ are $2\pi$-periodic
solutions of the differential equations
$\dot x - L_0 x = 0$ and $\dot x^* + L_0^* x^* = 0$,
respectively. This way, we obtain all $2\pi$-periodic solutions of the first
differential equation in $X_{2\pi}$ as $\mathrm{span}\{x_1, x_2\}$. For $X = Y = \mathbb{R}^n$ this follows
from (H4) and Fourier expansions. In the general case it follows from
(H5).
Step 2: Eigenvalue trick.
In order to eliminate the unknown period $p$, we let
$x(\tau) = z(t)$ with $\tau = 2\pi t/p$.
From $z' = G(\mu, z)$ we obtain
$\dot x = \frac{p}{2\pi}G(\mu, x)$.
By introducing a small parameter $\rho$ through $p = 2\pi\omega_0^{-1}(1 + \rho)$ we obtain
$\omega_0^{-1}(1 + \rho)G(\mu, x) - \dot x = 0$. (60)
Setting $\varepsilon = (\rho, \mu)$ we can write (60) simply as
$F(\varepsilon, x) = 0$, (61)
with $F: U(0, 0) \subseteq \mathbb{R}^2 \times X_{2\pi} \to Y_{2\pi}$.
Using this time scaling we can restrict ourselves to $2\pi$-periodic solutions,
where (61) now depends on two parameters $\rho$ and $\mu$.
Step 3: Existence proof.
For (61) we check the assumptions of the bunch theorem (Theorem 8.B) of
Section 8.11.
From (60) it follows that
$F_x(0, 0)x = L_0 x - \dot x$.
According to (H5) this is a Fredholm operator of index zero. From (VI) the
null space $N(F_x(0, 0))$ is spanned by $x_1$ and $x_2$. If we identify $x_i^*$ with the linear,
continuous functional $x \mapsto (x_i^* \mid x)$ on $X_{2\pi}$, then integration by parts gives
$(x_i^*, F_x(0, 0)x) = (x_i^* \mid L_0 x - \dot x) = (\dot x_i^* + L_0^* x_i^* \mid x) = 0$ for all $x \in X_{2\pi}$,
hence $F_x(0, 0)^* x_i^* = 0$. Because of
$\mathrm{ind}\,F_x(0, 0) = 0$ and $\dim N(F_x(0, 0)) = 2$
we obtain from Section 8.4 that $\dim N(F_x(0, 0)^*) = 2$. Therefore $x_1^*$ and $x_2^*$
span the null space $N(F_x(0, 0)^*)$.
Since the generic bifurcation condition
$\det\big((x_i^*, F_{x\varepsilon_j}(0, 0)x_1)\big) \ne 0$
is identical with (58), Theorem 8.B implies the existence of a $C^{k-1}$-bifurcation
branch $s \mapsto (\varepsilon(s), x_s)$ of (61) through the point $(0, 0)$ with
$x_s = sx_1 + sw_s$, $w_s \in N(F_x(0, 0))^\perp$
and $w_s = 0$ for $s = 0$. Time rescaling gives the existence result of Theorem 79.F.
Step 4: Uniqueness.
Let $N = N(F_x(0, 0))$. The operator
$Px = (x_1^*, x)x_1 + (x_2^*, x)x_2$
is a projection operator from $X_{2\pi}$ onto $N$. From Theorem 8.B it follows that
the bifurcation branch is uniquely determined through
$Px_s = sx_1$.
Let $N^\perp = (I - P)X_{2\pi}$. From (IV) it follows that $T_\alpha$ leaves $N$ as well as $N^\perp$
invariant. This implies $T_\alpha P = PT_\alpha$.
Let $(\varepsilon, x)$ be a solution of (61) in a neighborhood of zero which has the form
of a ball. Then $(\varepsilon, T_\alpha x)$ is also a solution which, because of $\|T_\alpha\| = 1$, remains
in the neighborhood of zero. We have $T_\alpha Px \in N$. From (V) we can choose $\alpha$
such that $T_\alpha Px = rx_1$. This implies $PT_\alpha x = rx_1$. Hence we must have
$T_\alpha x = x_s$ and $\varepsilon = \varepsilon(s)$ with $s = r$,
i.e., $x$ differs from $x_s$ only by a phase shift.
Besides $(\varepsilon(s), x_s)$, also $(\varepsilon(s), T_\pi x_s)$ is a solution of (61). From (III) it follows
that $PT_\pi x_s = sT_\pi x_1 = -sx_1$. Hence we must have $T_\pi x_s = x_{-s}$ and $\varepsilon(s) = \varepsilon(-s)$.
Step 5: Stability.
By definition the weak stability of the trivial solution has to be determined
by the behavior of $\lambda(\mu)$. Because of (H3) we have
$\mathrm{Re}\,\lambda(\mu) \lessgtr 0$ for $\mu \lessgtr \mu_0$.
This implies the stability result of Theorem 79.F for the trivial solution.
In order to study the stability of the bifurcation solution, we consider the
eigenvalue problem
$F_x(\varepsilon(s), x_s)h = \kappa h$, (62)
and note that $\varepsilon = (\rho, \mu)$ and $p = 2\pi\omega_0^{-1}(1 + \rho)$.

Lemma 79.19. Let $\mu'(s) \ne 0$ for all $s \ne 0$ in a neighborhood of zero. Then there
exists a real $C^1$-function $s \mapsto \kappa(s)$ in a neighborhood of zero, where all $\kappa(s)$ are
eigenvalues of (62) with $\kappa(0) = 0$ and
$\lim_{s \to 0} \kappa(s)/\mu'(s)s = -\omega_0^{-1}\,\mathrm{Re}\,\lambda'(\mu_0) < 0$. (63)
The proof will be given in Problem 79.6. Example 79.10 shows that
$m(s) = e^{2\pi\kappa(s)}$ is a Floquet multiplier of the $2\pi$-periodic solution $x_s$ of
$\dot x - \frac{p(s)}{2\pi}G(\mu(s), x) = 0$.
Moreover, for $s = 0$ equation (62) has the double eigenvalue $\kappa = 0$ with
eigenfunctions $x_1$ and $x_2$. (The indices 1 and 2 here are not values of the parameter $s$ in $x_s$.) Differentiation
of $F(\varepsilon(s), x_s) = 0$ with respect to $\tau$ gives
$F_x(\varepsilon(s), x_s)\dot x_s = 0$.
Thus $\kappa = 0$ is an eigenvalue of (62) for $s \ne 0$. According to Lemma 79.19 we
therefore may think of $(\kappa(s), 0)$ as a perturbation of the double eigenvalue $(0, 0)$.
The corresponding Floquet multipliers are $(m(s), 1)$.
Motivated by Theorem 79.D about orbital stability, we define the weak
stability of $x_s$ according to the behavior of $m(s)$. The solution $x_s$ is called weakly
stable (or weakly unstable) if
$|m(s)| < 1$ (or $> 1$).
This corresponds to $\kappa(s) < 0$ (or $\kappa(s) > 0$).
From (63) we obtain that $\kappa(s)$ has the opposite sign of $\mu'(s)s$. This implies
the stability result of Theorem 79.F.
The considerations in the proof to Problem 79.9 justify the designation
"weak" stability and "weak" instability in Theorem 79.F.
Step 6: Construction of the bifurcation solution.
The iteration scheme of Theorem 8.B to determine the solutions of (61) takes
here the following specific form.
For every $y \in Y_{2\pi}$ the linear differential equation
$\dot x - L_0 x = y + \rho L_0 x_1 + \mu L_1 x_1$ (64)
has a unique solution $(x, \rho, \mu) \in X_{2\pi} \times \mathbb{R}^2$ with
$(x_1^* \mid x) = (x_2^* \mid x) = 0$.
This solution is obtained by first solving the system of linear equations
$\rho(x_1^* \mid L_0 x_1) + \mu(x_1^* \mid L_1 x_1) = -(x_1^* \mid y)$,
$\rho(x_2^* \mid L_0 x_1) + \mu(x_2^* \mid L_1 x_1) = -(x_2^* \mid y)$,
and then finding $x$ from (64). The solution of (64) will be denoted by $(x, \mu, \rho) = Sy$.
We now set $x_s = sx_1 + sw_s$ and consider the equation
$(w, \mu, \rho) = Sy$ (65)
with
$\omega_0 y = s^{-1}(1 + \rho)G(\mu, sx_1 + sw) - G_z(0, 0)(x_1 + w) - \rho G_z(0, 0)x_1 - \mu G_{z\mu}(0, 0)x_1$. (66)
The solution $(w_s, \mu(s), \rho(s))$ can then be determined from (65) and (66) by
successive approximations with starting value $w = 0$, $\mu = \rho = 0$. In the analytic
case one can also use a power series ansatz and comparison of coefficients.

79.11. Applications to Ljapunov Bifurcation
We examine conditions under which there exist nonconstant, periodic solutions
to the autonomous differential equation
$z' = H(z)$ (67)
in the neighborhood of the equilibrium point $z = 0$ (see Figure 79.3 for $\mathbb{R}^2$).
In contrast to the Hopf bifurcation, no parameter occurs. We will, however,
reduce this problem in a very simple fashion to the Hopf bifurcation for the
equation
$z' = H(z) + \mu E'(z)$ (68)
at the point $\mu = 0$. The trick is to choose the perturbation $E'$ in such a way
that every small periodic solution of (68) is also a solution of (67).

[Figure 79.3]

We assume:
(H1) The map $H: U(0) \subseteq \mathbb{R}^n \to \mathbb{R}^n$ is $C^2$ with $H(0) = 0$.
(H2) (Conserved quantity). The $C^3$-function $E: \mathbb{R}^n \to \mathbb{R}$ is a conserved quantity
for (67), i.e., we have $E(z(t)) = \mathrm{const}$ along every solution of (67).
Moreover, $E'(0) = 0$ and the matrix $E''(0)$ of the second-order partial
derivatives of $E$ at the point $z = 0$ is nonsingular.
Often, one can choose the energy as $E$.
(H3) (Nonresonance condition). $H'(0)$ has the algebraically simple eigenvalue
$\omega_0 i$ with $\omega_0 > 0$ and no $k\omega_0 i$ with $k = 0$ or $k = 2, 3, \ldots$ is an eigenvalue
of $H'(0)$. We set $p_0 = 2\pi/\omega_0$.

Theorem 79.G (Center Theorem of Ljapunov (1892)). If (H1)-(H3) hold, then
(67) has a family $\{z_s\}$ of nonconstant, periodic solutions of period $p(s)$, where
$\max_{t \in \mathbb{R}} |z_s(t)| \to 0$, $p(s) \to p_0$
as $s \to 0$.

The name center theorem is used since the configuration in Figure 79.3 is
called a center.

PROOF.
(I) Preparations. Recall the proof of the following two well-known results
about conserved quantities:
$E'(y)H(y) = 0$ for all $y \in \mathbb{R}^n$, (69)
$A := E''(0)H'(0) + H'(0)^* E''(0) = 0$. (70)
Ad (69). For every $y \in \mathbb{R}^n$ there exists a solution $z$ of (67) with $z(0) = y$.
Because of
$z'(t) = H(z(t)) = H'(0)z(t) + o(|z(t)|)$, $z(t) \to 0$
we obtain from Taylor's theorem that
$z(t) = y + tH'(0)y + o(t)$, $t \to 0$. (71)
Differentiation of $E(z(t)) = \mathrm{const}$ gives $E'(z(t))z'(t) = 0$. For $t = 0$ we
obtain (69).
Ad (70). From
$E(z) = E(0) + 2^{-1}(z \mid E''(0)z) + o(|z|^2)$, $z \to 0$
and (71) it follows that
$E(z(t)) = E(0) + 2^{-1}(y \mid E''(0)y) + 2^{-1}t(y \mid Ay) + o(t)$, $t \to 0$.
This expression is constant in time, and hence $(y \mid Ay) = 0$ for all $y \in \mathbb{R}^n$.
Because of $A = A^*$ we thus have $A = 0$.
(II) Now we present the key trick: Every periodic solution $z(t) \not\equiv 0$ of (68),
in a sufficiently small neighborhood of zero, is also a solution of (67). To
see this, we compute
$E(z(t))' = E'(z(t))z'(t) = E'(z(t))H(z(t)) + \mu|E'(z(t))|^2$
$= \mu|E'(z(t))|^2 = \mu\{|E''(0)z(t)|^2 + o(|z(t)|^2)\}$.
Let $\mu > 0$ (or $< 0$). Because of $z(t) \not\equiv 0$ we have $z(t) \ne 0$ for all $t$. Hence
we obtain $E''(0)z(t) \ne 0$ and therefore
$E(z(t))' > 0$ for all $t$ (or $< 0$).
But this is impossible for a periodic solution. Thus we must have $\mu = 0$.
(III) Hopf bifurcation for (68) at $\mu = 0$. We want to apply Theorem 79.F to
(68). In doing so we only need to check condition (H3) of Section 79.9
about loss of stability.
Because of (H3) and (70) we can use a linear coordinate transformation
to get at first $H'(0)$ and then $E''(0)$ into the following form:
$H'(0) = \begin{pmatrix} 0 & \omega_0 & 0 \\ -\omega_0 & 0 & 0 \\ 0 & 0 & W \end{pmatrix}$, $E''(0) = \begin{pmatrix} a & 0 & 0 \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix}$.
Because of $\det E''(0) \ne 0$ we have $a \ne 0$. Thus the matrix
$H'(0) + \mu E''(0)$
has the eigenvalue $\lambda(\mu) = \mu a + \omega_0 i$, and hence $\mathrm{Re}\,\lambda'(0) \ne 0$.
The bifurcation solutions of (68), which follow from Theorem 79.F,
are the required solutions of (67). □
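
The identities (69) and (70) can be verified directly on a concrete conservative system. The following Python sketch uses the hypothetical planar example $H(z) = (z_2, -z_1 - z_1^3)$ with energy $E(z) = (z_1^2 + z_2^2)/2 + z_1^4/4$; this example and all names are assumptions chosen for illustration. Here $H'(0)$ has the simple eigenvalues $\pm i$, so $\omega_0 = 1$, and $E''(0)$ is the identity matrix.

```python
def H(z1, z2):
    # Right-hand side of z' = H(z) for the illustrative system.
    return (z2, -z1 - z1 ** 3)

def grad_E(z1, z2):
    # Gradient E'(z) of E(z) = (z1^2 + z2^2)/2 + z1^4/4.
    return (z1 + z1 ** 3, z2)

# Identity (69): E'(y) H(y) = 0 at a few sample points.
samples = [(0.3, -0.2), (1.0, 0.5), (-0.7, 0.1)]
invariance = [grad_E(*p)[0] * H(*p)[0] + grad_E(*p)[1] * H(*p)[1] for p in samples]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(P):
    return [[P[j][i] for j in range(2)] for i in range(2)]

# Identity (70): A = E''(0) H'(0) + H'(0)^T E''(0) = 0.
E2 = [[1.0, 0.0], [0.0, 1.0]]      # E''(0)
H1 = [[0.0, 1.0], [-1.0, 0.0]]     # H'(0)
left = matmul(E2, H1)
right = matmul(transpose(H1), E2)
A = [[left[i][j] + right[i][j] for j in range(2)] for i in range(2)]

print(invariance, A)
```

For this example the origin is a center, as Theorem 79.G predicts: every small orbit is a closed level curve of $E$.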

PROBLEMS

79.1. Quasi-eigenvectors. Let $Y$ be a complex B-space and let $A \in L(Y, Y)$. Moreover,
let $\lambda$ be a boundary point of the spectrum $\sigma(A)$.
Prove:
(i) For every $\varepsilon > 0$ there exists a unit vector $y \in Y$ with $\|(A - \lambda I)y\| < \varepsilon$.
(ii) For every $T > 0$ and for every $\eta \in\, ]0, 1[$ there exists a unit vector $y \in Y$ with
$e^{t\,\mathrm{Re}\,\lambda}(1 - \eta) \le \|e^{tA}y\| \le e^{t\,\mathrm{Re}\,\lambda}(1 + \eta)$ (72)
for all $t \in [0, T]$.
Solution: Ad (i). Let $\rho(A)$ be the resolvent set and let $R_\mu = (A - \mu I)^{-1}$ be the
resolvent for $\mu \in \rho(A)$. We have
$(A - \lambda I)R_\mu = (\mu - \lambda)R_\mu + I$. (73)
For $\lambda \in \sigma(A)$ and $\mu \in \rho(A)$ we have $\|R_\mu\| \ge |\mu - \lambda|^{-1}$. Otherwise we obtain
$\|(A - \lambda I)R_\mu - I\| < 1$, i.e., because of the Neumann series (see A1(57d)),
$(A - \lambda I)R_\mu$ is invertible and hence also $A - \lambda I$, which contradicts $\lambda \in \sigma(A)$.
For the boundary point $\lambda$ of $\sigma(A)$ we find a $\mu \in \rho(A)$ with $|\mu - \lambda| < \varepsilon/2$, and
hence
$\|R_\mu\| > 2/\varepsilon$.
We choose a unit vector $x$ with $\|R_\mu x\| > 2/\varepsilon$ and let $y = R_\mu(x/\|R_\mu x\|)$. From
(73) it follows that
$\|(A - \lambda I)y\| \le |\mu - \lambda| + \|R_\mu x\|^{-1} < \varepsilon$.
Ad (ii). Choose $y$ as in (i). Power series expansion implies that
$(e^{At} - e^{\lambda t}I)y = e^{\lambda t}(e^{(A - \lambda I)t} - I)y = e^{\lambda t}\int_0^t e^{(A - \lambda I)s}(A - \lambda I)y\,ds$
and thus
$\|e^{At}y - e^{\lambda t}y\| \le e^{t\,\mathrm{Re}\,\lambda}\int_0^T \|e^{(A - \lambda I)s}\|\,\varepsilon\,ds$.
Because of $\|e^{\lambda t}y\| = e^{t\,\mathrm{Re}\,\lambda}$ we obtain (72) for sufficiently small $\varepsilon > 0$.
79.2. Proof of (15) and (16).
Solution: For a complex B-space $X$ this immediately follows from (72). Now
assume $X$ is real. We choose $Y$ equal to the complexification $X_{\mathbb{C}}$ (see A1(23h)).
Let $y = y_1 + iy_2$. From (72) it follows with $\eta = 1/2$ that
$\tfrac{1}{2}e^{T\,\mathrm{Re}\,\lambda} \le \|e^{TA}y\| \le \|e^{TA}y_1\| + \|e^{TA}y_2\|$.
For $b = y_1$ or $b = y_2$ we therefore have
$\tfrac{1}{4}e^{T\,\mathrm{Re}\,\lambda} \le \|e^{TA}b\|$.
This is (15). Let, for example, $b = y_1$. From (72) it follows that
$\|e^{tA}b\| = \|\mathrm{Re}(e^{tA}y)\| \le \|e^{tA}y\| \le 2e^{t\,\mathrm{Re}\,\lambda}$
for all $t \in [0, T]$. This is (16).
79.3. A class of Fredholm operators. Let $X$ and $Y$ be B-spaces over $\mathbb{K}$. Let $M$ denote
the set of all Fredholm operators $A \in L(X, Y)$ of index zero with $\dim N(A) = 1$.
Prove that $M$ is an analytic submanifold of $L(X, Y)$.
Hint: Use Proposition 79.15. See Crandall and Rabinowitz (1973).
79.4. Proof of (49).
Solution:
(I) We shall use the fact about real functions that
$|r(t)| \le o(1)a(t)$, $t \to 0$
and $a(t) \ne 0$ for $t \ne 0$ always implies that
$r(t)/a(t) \to 0$ as $t \to 0$.
Thus we have $r(t) = o(1)a(t)$.
(II) Furthermore, it always follows from
$\|a(s)\| \le o(1)\|a(s)\| + \|b\|$, $s \to 0$
that $\|a(s)\|/2 \le \|b\|$ for small $|s|$.
(III) Let $N^\perp = N(F_x(0, 0))^\perp$. It is important that the linear equation
$F_x(0, 0)z - \varepsilon F_{x\varepsilon}(0, 0)x_1 = y$, $(z, \varepsilon) \in N^\perp \times \mathbb{R}$ (74)
has a unique solution $(z, \varepsilon)$ for every $y \in Y$ (see Remark 79.17). The open
mapping theorem A1(36) therefore implies that there exists a constant $c$
with
$\|z\| + |\varepsilon| \le c\|y\|$. (75)
(IV) We write $F(s)$ for $F(\varepsilon(s), x(s))$. Let $\varepsilon'(s) \ne 0$ for $s \ne 0$. According to Section
79.7, the equation
$F_x(s)(x_1 + z(s)) = \lambda(s)J(x_1 + z(s))$
has a $C^1$-solution $s \mapsto (\lambda(s), z(s))$ with $\lambda(s) \in \mathbb{R}$, $z(s) \in N^\perp$, and $\lambda(0) = 0$,
$z(0) = 0$. From $F(s) = 0$ follows
$F_\varepsilon(s)\varepsilon'(s) + F_x(s)x'(s) = 0$.
Subtraction gives
$F_x(s)u(s) = \lambda(s)J(x_1 + z(s)) + F_\varepsilon(s)\varepsilon'(s)$ (76)
with $u(s) := x_1 + z(s) - x'(s)$. Because of $x'(s) - x_1 \in N^\perp$ we obtain
$u(s) \in N^\perp$. Note that
$F_x(s)u(s) = F_x(0, 0)u(s) + o(1)u(s)$.
From $F(\varepsilon, 0) \equiv 0$ follows $F_\varepsilon(\varepsilon, 0) \equiv 0$. Therefore, Taylor's theorem of
Section 4.6 implies
$F_\varepsilon(s)\varepsilon'(s) = (F_\varepsilon(\varepsilon(s), x(s)) - F_\varepsilon(\varepsilon(s), 0))\varepsilon'(s)$
$= (F_{\varepsilon x}(0, 0) + o(1))x(s)\varepsilon'(s)$
$= F_{\varepsilon x}(0, 0)sx_1\varepsilon'(s) + o(1)s\varepsilon'(s)$.
Note that $x(s) = s(x_1 + o(1))$. Thus it follows from (76) that
$F_x(0, 0)u(s) - F_{x\varepsilon}(0, 0)\varepsilon'(s)sx_1 = \lambda(s)(Jx_1 + o(1)) + o(1)u(s) + o(1)\varepsilon'(s)s$. (77)
Equation (75) and (II) imply
$\|u(s)\| + |\varepsilon'(s)s| \le \mathrm{const}\,|\lambda(s)|$. (78)
Hence we have $\lambda(s) \ne 0$ for $s \ne 0$. An application of $y_1^*$ to (77) gives
$-\varepsilon'(s)s(y_1^*, F_{x\varepsilon}(0, 0)x_1) = \lambda(s)(1 + o(1))$.
Here observe that $(y_1^*, F_x(0, 0)x) = 0$ for all $x \in X$ and note (78) and (I).
For $s \to 0$ we obtain (49).
79.5. Special Fredholm operator. Prove that (H5) of Section 79.9 is always satisfied
for $X = Y = \mathbb{R}^n$.
Solution: We let $Dx = \dot x$. Then $D: X_{2\pi} \to Y_{2\pi}$ is a Fredholm operator of index
zero, because from $\dot x = 0$ follows $x = \mathrm{const}$, and $\dot x = f$ is solvable in $X_{2\pi}$ if and only if
$\int_0^{2\pi} f(t)\,dt = 0$.
Thus we have $\dim N(D) = \mathrm{codim}\,R(D) = n$. According to Section 8.4 the
compact perturbation $D - \omega_0^{-1}G_z(\mu_0, z(\mu_0))$ of $D$ is also a Fredholm operator
of index zero.
79.6. Proof of Lemma 79.19.
Solution: We proceed analogously as in Problem 79.4. We set
$N = N(F_x(0, 0))$. From Theorem 8.B the bifurcation solution $x_s \in X_{2\pi}$ has the form
$x_s = sx_1 + sw_s$
with $w_s \in N^\perp$ and $w_s = 0$ for $s = 0$. We let $v_s = sw_s$. With $x_s'$ or $\dot x_s$ we denote
the derivatives with respect to the parameter $s$ or time $\tau$. Moreover let
$F(s) = F(\varepsilon(s), x_s)$, $F_\rho(s) = F_\rho(\varepsilon(s), x_s)$, etc.
Note that $\varepsilon = (\rho, \mu)$.
(I) The equation
$F_x(0, 0)u - \kappa x_1 - \eta x_2 = y$, $(u, \kappa, \eta) \in N^\perp \times \mathbb{R}^2$
has a unique solution for every $y \in Y_{2\pi}$. This follows as in Section 79.7
from
$(x_i^* \mid x_j) = \delta_{ij}$ and $(x_i^* \mid F_x(0, 0)x) = 0$ for all $x \in X_{2\pi}$.
Hence, from the open mapping theorem A1(36) there exists a constant $c$
with
$\|u\| + |\kappa| + |\eta| \le c\|y\|$.
This estimate will now essentially be used.
(II) Because of (I), the implicit function theorem implies that the equation
$F_x(s)(x_1 + u_s) = \kappa(s)(x_1 + u_s) + \eta(s)(x_2 - \dot w_s)$ (79)
has a $C^1$-solution curve $s \mapsto (u_s, \kappa(s), \eta(s))$ in $N^\perp \times \mathbb{R}^2$ which passes
through $(0, 0, 0)$ at $s = 0$.
(III) Differentiation of $F(s) = 0$ with respect to $\tau$ gives
$F_x(s)\dot x_s = 0$.
Moreover, it is $\dot x_s = -sx_2 + s\dot w_s$. Below we will show that $\kappa(s) \ne 0$ for
$s \ne 0$. From (II) we obtain
$F_x(s)(x_1 + u_s + \kappa(s)^{-1}\eta(s)(x_2 - \dot w_s)) = \kappa(s)(x_1 + u_s + \kappa(s)^{-1}\eta(s)(x_2 - \dot w_s))$,
i.e., $\kappa(s)$ is an eigenvalue of $F_x(s)$.
(IV) Differentiation of $F(s) = 0$ with respect to $s$ gives
$\rho'(s)F_\rho(s) + \mu'(s)F_\mu(s) + F_x(s)(x_1 + v_s') = 0$.
Subtraction of (79) yields
$\rho'(s)F_\rho(s) + \mu'(s)F_\mu(s) + F_x(s)(v_s' - u_s) + \kappa(s)(x_1 + u_s) + \eta(s)(x_2 - \dot w_s) = 0$.
From (60) and $G(\mu, 0) \equiv 0$ it follows that
$F_\mu(s) = s\omega_0^{-1}G_{\mu z}(0, 0)x_1 + o(s) = sL_1 x_1 + o(s)$,
$F_x(s) = F_x(0, 0) + o(1)$,
$F_\rho(s) = \omega_0^{-1}G(\mu(s), x_s) = (1 + \rho(s))^{-1}\dot x_s$
as $s \to 0$.
Letting $\beta(s) = s\rho'(s)/(1 + \rho(s))$, we obtain
$F_x(0, 0)(v_s' - u_s) + \kappa(s)x_1 + (\eta(s) - \beta(s))x_2$
$= o(1)(v_s' - u_s) + o(1)\kappa(s) + o(1)(\eta(s) - \beta(s)) + o(1)s\mu'(s) - s\mu'(s)L_1 x_1$. (80)
From (I) follows
$\|v_s' - u_s\| + |\kappa(s)| + |\eta(s) - \beta(s)| \le O(1)|s\mu'(s)|$. (81)
Note that $(x_1^* \mid L_1 x_1) = \omega_0^{-1}\,\mathrm{Re}\,\lambda'(\mu_0)$. An application of $x_1^*$ to (80)
together with (I) and (81) gives
$|\kappa(s) + s\mu'(s)\omega_0^{-1}\,\mathrm{Re}\,\lambda'(\mu_0)| \le o(1)|s\mu'(s)|$.
If $\mu'(s) \ne 0$ for $s \ne 0$, we may divide by $s\mu'(s)$ to obtain (63) for $s \to 0$.


79.7.** Global Hopf bifurcation. As in Section 79.9, we consider the autonomous
differential equation
\[
z' = G(\mu, z), \tag{82}
\]
where $G: \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$ is $C^1$. We assume:
(i) $G(\mu, 0) = 0$ for all $\mu$ in a neighborhood of $\mu_0$.
(ii) $G_z(\mu_0, 0)$ has $i\omega_0$ as an algebraically simple eigenvalue with $\omega_0 > 0$, and
we assume conditions (H3) and (H4) of Section 79.9 about loss of stability
and nonresonance. We set $p_0 = 2\pi/\omega_0$.
(iii) For any parameter value $\mu \in \mathbb{R}$ there passes a unique solution curve of (82)
through each point $z_0 \in \mathbb{R}^n$, which exists for all times $t$.
We define a set $P$ in $\mathbb{R}^{n+2}$. The point $(p, \mu, z_0)$ belongs to $P$ precisely if
there exists a nonconstant, $p$-periodic solution of (82) which passes through
$z_0$. Let $S$ denote the component of $P \cup \{(p_0, \mu_0, 0)\}$ which contains the point
$(p_0, \mu_0, 0)$. Prove that precisely one of the following two cases must occur:
(a) $S$ is unbounded.
(b) $S$ is bounded and contains a point $(p, \mu_1, z_1)$ different from $(p_0, \mu_0, 0)$, where
$z_1$ is an equilibrium point of (82).
Interpretation. The points of $S$ in a neighborhood of $(p_0, \mu_0, 0)$ correspond
to $p$-periodic solutions with
\[
\max_{t \in \mathbb{R}} |z(t)| + |p - p_0| + |\mu - \mu_0| \to 0.
\]
This follows from Theorem 79.F (local Hopf bifurcation). In case (a), $p$-periodic
solutions $z$ of (82) lie on $S$ with
\[
\max_{t \in \mathbb{R}} |z(t)| + |p| + |\mu| \to \infty.
\]
In case (b) there exists an equilibrium point $z_1$ for which a Hopf bifurcation
occurs at $\mu = \mu_1$, which is different from the Hopf bifurcation at $z = 0$, $\mu = \mu_0$.
Pendulum as an example. In order to get an intuitive understanding of cases
(a) and (b), we consider a pendulum under the influence of an outer force
parameter $\mu$ which varies. Roughly, the following holds:
At $\mu = \mu_0$ the pendulum passes from a state of rest into periodic oscillations.
In case (b) the pendulum returns to a state of rest.
If the pendulum does not return to a state of rest, then (a) implies that at
least one of the following three situations occurs:
($\alpha$) For all $\mu$ there exist periodic oscillations.
($\beta$) The period of the oscillations increases.
($\gamma$) The amplitudes of the oscillations increase.
This behavior corresponds to what one intuitively expects.
Hint: This important theorem is due to Alexander and Yorke (1978). A
relatively simple proof may be found in Ize (1976), p. 93. It uses the same idea
as the proof of Theorem 15.C of Part I. Instead of the
fixed-point index, a more sophisticated homotopy argument about essential
maps is used. Moreover, we recommend Chow and Mallet-Paret (1978) (Fuller
index) and Nussbaum (1978) (retarded functional differential equations).
79.8. Two fundamental results about the perturbation of spectra. Let $L(X, X)$ denote
the set of all continuous linear operators $A: X \to X$ on the B-space $X$ over
$\mathbb{K} = \mathbb{R}, \mathbb{C}$, where $X \neq \{0\}$. Let $\sigma(A)$ denote the spectrum of $A$. Recall that, by
definition, $\sigma(A)$ is equal to the spectrum of the complexification $A_{\mathbb{C}}$ if $\mathbb{K} = \mathbb{R}$.
Moreover, in this case, $\lambda$ is called an eigenvalue of $A$ if and only if it is an
eigenvalue of $A_{\mathbb{C}}$ (cf. A1(23h), A1(56)).
79.8a. The upper continuity of the spectrum. Show that the map
\[
\sigma: L(X, X) \to 2^{\mathbb{C}} \tag{83}
\]
is upper semicontinuous, i.e., for each neighborhood $U$ of $\sigma(A)$ in $\mathbb{C}$, there
exists a neighborhood $V(A)$ of $A$ in $L(X, X)$ such that
\[
\sigma(B) \subset U \quad \text{for all } B \in V(A).
\]
In addition, for given $A \in L(X, X)$ and for each $\varepsilon > 0$, there exists a number
$\delta(A, \varepsilon) > 0$ such that
\[
\operatorname{dist}(\sigma(B), \sigma(A)) < \varepsilon
\]
for all $B \in L(X, X)$ with $\|B - A\| < \delta$.
Hint: Use the resolvent $(A - \lambda I)^{-1}$ and the continuity of inverse formation
from Problem 1.7 of Part I. Cf. Kato (1966, M), Chapter 4, 3.1.
Note that the map (83) is not necessarily lower semicontinuous, i.e., if an
open set $U$ is contained in $\sigma(A)$, then $U$ is not necessarily contained in $\sigma(B)$,
where $B$ is a perturbation of $A$ in $L(X, X)$. Indeed, there exist an infinite-dimensional
B-space $X$ and operators $A$ and $C$ such that
\[
\sigma(A + \gamma C) =
\begin{cases}
\text{closed unit disk } D & \text{if } \gamma = 0,\\
\partial D & \text{if } \gamma \neq 0
\end{cases}
\]
(cf. Kato (1966, M), Example 3.8 of Chapter 4). The following result shows that
the situation is much better in finite-dimensional B-spaces $X$ (e.g., $X = \mathbb{R}^n$ or
$X = \mathbb{C}^n$).
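The upper semicontinuity in Problem 79.8a can be observed numerically in finite dimensions. The following is a minimal sketch, assuming NumPy; the helper `spectral_excess` and both test matrices are hypothetical illustrations. It measures how far the eigenvalues of a perturbed matrix $B$ lie from $\sigma(A)$. For a symmetric $A$ the excess is bounded by $\|B - A\|$ (Bauer-Fike estimate for normal matrices), whereas for a nilpotent Jordan block of size $n$ a perturbation of size $\delta$ can move the spectrum by $\delta^{1/n}$. In both cases the excess tends to zero with the perturbation, but the admissible $\delta(A, \varepsilon)$ depends strongly on $A$.

```python
import numpy as np

def spectral_excess(A: np.ndarray, B: np.ndarray) -> float:
    """Largest distance from an eigenvalue of B to the spectrum of A."""
    eig_a = np.linalg.eigvals(A)
    eig_b = np.linalg.eigvals(B)
    return float(max(np.min(np.abs(eig_a - mu)) for mu in eig_b))

# Case 1: symmetric A, perturbation E with spectral norm 1e-3.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A_sym = (M + M.T) / 2.0
E = rng.standard_normal((5, 5))
E *= 1e-3 / np.linalg.norm(E, 2)
excess_sym = spectral_excess(A_sym, A_sym + E)

# Case 2: nilpotent Jordan block J (spectrum {0}) with one corner entry delta.
# The perturbed eigenvalues solve lambda**n = delta, so they have modulus delta**(1/n).
n, delta = 4, 1e-8
J = np.diag(np.ones(n - 1), k=1)
corner = np.zeros((n, n))
corner[n - 1, 0] = delta
excess_jordan = spectral_excess(J, J + corner)

print(excess_sym, excess_jordan)
```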
79.8b. Main theorem of perturbation theory in finite-dimensional B-spaces. Let
$A \in L(X, X)$ be given, where $X$ is a finite-dimensional B-space over $\mathbb{K} = \mathbb{R}, \mathbb{C}$. Let
$\lambda_1, \ldots, \lambda_r$ be the points in the spectrum $\sigma(A)$ of $A$ with algebraic multiplicity
$m_1, \ldots, m_r$, respectively, i.e., $\lambda_1, \ldots, \lambda_r$ are the distinct eigenvalues of $A$,
where $\lambda_j$ is a solution of the characteristic equation
\[
\det(A_{\mathbb{C}} - \lambda I) = 0
\]
with multiplicity $m_j$ for each $j$. Moreover, let $U_1, \ldots, U_r$ be open subsets of $\mathbb{C}$
with
\[
\lambda_j \in U_j, \qquad j = 1, \ldots, r,
\]
and $U_j \cap U_k = \emptyset$ for all $j \neq k$. We set
\[
U = \bigcup_{j=1}^{r} U_j.
\]
Finally, let $V(A)$ be a connected neighborhood of $A$ in $L(X, X)$ such that
\[
\sigma(B) \subset U \quad \text{for all } B \in V(A).
\]
According to Problem 79.8a, such a neighborhood always exists.
Show that
\[
\sigma(B) \cap U_j \neq \emptyset \quad \text{for all } j,
\]
and show that the sum of the algebraic multiplicities of the eigenvalues of $B$
contained in $U_j$ is equal to $m_j$.
Solution: We use the properties of the mapping degree summarized in
Section 13.6. We may assume that $\mathbb{K} = \mathbb{C}$. Otherwise, we replace $X$ and $A$ with
the complexifications $X_{\mathbb{C}}$ and $A_{\mathbb{C}}$, respectively. We set
\[
f_B(\lambda) = \det(B - \lambda I)
\]
and
\[
n_j = \deg(f_B, U_j).
\]
Since $f_B(\lambda) \neq 0$ on $\partial U_j$, the mapping degree is well-defined. According to
Proposition 14.2, the number $n_j$ is equal to the sum of the multiplicities of the
zeros of $f_B$ in $U_j$. Since the function $B \mapsto f_B$ is continuous from $L(X, X)$ into
$C(\overline{U}_j)$, it follows from the homotopy invariance of the mapping degree that the
function
\[
B \mapsto \deg(f_B, U_j)
\]
is continuous on the connected set $V(A)$ and integer-valued. This implies
\[
\deg(f_B, U_j) = \text{constant} \quad \text{for all } B \in V(A).
\]
Letting $B = A$, we obtain $n_j = m_j$. Since $\deg(f_B, U_j) \neq 0$, we also obtain that
$f_B$ has at least one zero in $U_j$, i.e., $\sigma(B) \cap U_j \neq \emptyset$.
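For a disk $U_j$, the degree $\deg(f_B, U_j)$ is the winding number of $f_B$ along $\partial U_j$, which can be evaluated by the argument principle. The following is a minimal sketch, assuming NumPy; the function `count_eigenvalues_in_disk`, the matrix `A`, and the perturbation are hypothetical. It uses the identity $f_B'(\lambda)/f_B(\lambda) = -\operatorname{tr}(B - \lambda I)^{-1}$ and the trapezoidal rule on the circle. The eigenvalue $1$ of $A$ has algebraic multiplicity two (a Jordan block), and the count inside the disk around $1$ remains two after a small perturbation.

```python
import numpy as np

def count_eigenvalues_in_disk(B, center, radius, num_nodes=400):
    """Approximate (1/(2*pi*i)) * contour integral of trace((lam*I - B)^(-1)) d(lam).

    By the argument principle this is the number of zeros of
    lam -> det(B - lam*I) inside the disk, counted with multiplicity.
    """
    identity = np.eye(B.shape[0])
    theta = 2.0 * np.pi * np.arange(num_nodes) / num_nodes
    total = 0.0 + 0.0j
    for t in theta:
        offset = radius * np.exp(1j * t)          # lam - center on the circle
        lam = center + offset
        total += np.trace(np.linalg.inv(lam * identity - B)) * offset
    return total / num_nodes                      # trapezoidal rule, d(lam) = i*offset*d(theta)

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, -2.0]])
rng = np.random.default_rng(1)
E = rng.standard_normal((3, 3))
B = A + 1e-3 * E / np.linalg.norm(E, 2)           # perturbation of spectral norm 1e-3

count_near_1 = count_eigenvalues_in_disk(B, 1.0, 0.5)
count_near_minus_2 = count_eigenvalues_in_disk(B, -2.0, 0.5)
print(count_near_1.real, count_near_minus_2.real)
```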
79.9. Hopf bifurcation in $\mathbb{R}^N$ and the orbital stability of the bifurcating periodic
solutions. We consider the differential equation
\[
z' = G(\mu, z) \tag{84}
\]
depending on the real parameter $\mu$. We are looking for nonconstant periodic
solutions $z = z(t)$.
Our assumptions are as follows:
(H1) Trivial solution (equilibrium state). The map
\[
G: U(\mu_0, 0) \subseteq \mathbb{R} \times \mathbb{R}^N \to \mathbb{R}^N
\]
is $C^2$ with $G(\mu, 0) = 0$ for all $\mu$ in a neighborhood of $\mu_0$.
(H2) Loss of stability (transversality condition). There exists an $\omega_0 > 0$ such
that $i\omega_0$ is an algebraically simple eigenvalue of $G_z(\mu_0, 0)$.
According to Proposition 79.15, there exists a unique $C^1$-function $\mu \mapsto \lambda(\mu)$
on a neighborhood $V(\mu_0)$ of $\mu_0$ such that $\lambda(\mu_0) = i\omega_0$, and $\lambda(\mu)$ is an
algebraically simple eigenvalue of $G_z(\mu, 0)$ for all $\mu \in V(\mu_0)$. We assume that the
so-called transversality condition
\[
\operatorname{Re}\lambda'(\mu_0) > 0 \tag{85}
\]
is satisfied.
(H3) Nonresonance condition. None of the numbers $\pm ik\omega_0$ with $k = 0$ and
$k = 2, 3, \ldots$ is an eigenvalue of $G_z(\mu_0, 0)$.
(H3*) Strong nonresonance condition. All eigenvalues $\lambda$ of $G_z(\mu_0, 0)$ different
from $\pm i\omega_0$ satisfy the stability condition $\operatorname{Re}\lambda < 0$.
Conditions (H2) and (H3*) mean that the equilibrium point $z = 0$ of (84) is
asymptotically stable for $\mu < \mu_0$ and unstable for $\mu > \mu_0$ (for $\mu$ near $\mu_0$). This follows from
Proposition 79.3 and Problem 79.8.
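Conditions (H2) and (85) can be checked numerically for a concrete vector field. The following is a minimal sketch, assuming NumPy and the hypothetical planar system $x' = y$, $y' = -x + \mu y - x^2 y$ with $\mu_0 = 0$ (not taken from the text). Its linearization at $z = 0$ has the eigenvalues $\lambda(\mu) = (\mu \pm i\sqrt{4 - \mu^2})/2$, so that $\omega_0 = 1$ and $\operatorname{Re}\lambda'(0) = 1/2$. The code tracks the branch $\lambda(\mu)$ and approximates $\operatorname{Re}\lambda'(\mu_0)$ by a central difference.

```python
import numpy as np

def linearization(mu: float) -> np.ndarray:
    """G_z(mu, 0) for the hypothetical system x' = y, y' = -x + mu*y - x**2 * y."""
    return np.array([[0.0, 1.0],
                     [-1.0, mu]])

def critical_eigenvalue(mu: float) -> complex:
    """Eigenvalue of G_z(mu, 0) with positive imaginary part, i.e. the branch lambda(mu)."""
    eigenvalues = np.linalg.eigvals(linearization(mu))
    return complex(eigenvalues[np.argmax(eigenvalues.imag)])

mu0, h = 0.0, 1e-5
lam0 = critical_eigenvalue(mu0)                       # equals i*omega_0 with omega_0 = 1

# Central difference for Re lambda'(mu0); the transversality condition (85) asks for > 0.
re_lam_prime = (critical_eigenvalue(mu0 + h).real
                - critical_eigenvalue(mu0 - h).real) / (2.0 * h)

print(lam0, re_lam_prime)
```

Since $N = 2$ in this example, conditions (H3) and (H3*) hold trivially: $\pm i$ are the only eigenvalues of $G_z(\mu_0, 0)$.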
Show: If the assumptions (H1), (H2), and (H3) are satisfied, then the following
are true.
(a) Existence. Let $W(0)$ be a sufficiently small neighborhood of $s = 0$ in $\mathbb{R}$.
There exist two even $C^1$-functions
\[
\mu, p: W(0) \to \mathbb{R} \tag{86}
\]
and a family $\{z_s\}$ of nonconstant $p(s)$-periodic functions
\[
z = z_s(t)
\]
for each $s \neq 0$ in $W(0)$ such that $(\mu(s), z_s)$ is a solution of the original
problem (84) for each $s \in W(0)$. Here, we have $z_s(t) \equiv 0$ for $s = 0$ and
\[
\begin{aligned}
\mu(s) &= \mu_0 + O(s), & s &\to 0,\\
p(s) &= \frac{2\pi}{\omega_0} + O(s), & s &\to 0,\\
z_s(t) &= s\left(a\cos\frac{2\pi}{\omega_0}t + b\sin\frac{2\pi}{\omega_0}t\right) + o(s), & s &\to 0,
\end{aligned}
\]
with
\[
\lim_{s \to 0}\,\max_{t \in \mathbb{R}} |z_s(t)| = 0.
\]
The real numbers $a$ and $b$ with $|a| + |b| \neq 0$ are independent of $s$.
If $G$ is analytic in (H2), then $\mu(\cdot)$ and $p(\cdot)$ are also analytic.
(b) Uniqueness. There exists a neighborhood $U(0)$ in $\mathbb{R}^N$ of the equilibrium
point $z = 0$, and there exist neighborhoods $U(\mu_0)$ and $U(p(0))$ in $\mathbb{R}$ such
that the following holds. If $C$ is the orbit of a $p$-periodic solution $(\mu, z)$ of
(84) with
\[
C \subset U(0), \qquad \mu \in U(\mu_0), \qquad \text{and} \qquad p \in U(p(0)),
\]
then $(\mu, z)$ corresponds to one of the solutions in (a) up to time translation.
(c) Stability. If, in addition, condition (H3*) is satisfied, and if the bifurcation
is supercritical, i.e.,
\[
s\mu'(s) > 0 \quad \text{for all } s \neq 0 \text{ in } W(0), \tag{87a}
\]
then the periodic solution $t \mapsto z_s(t)$ is asymptotically orbitally stable for all
$s \neq 0$ in $W(0)$.
(d) Instability. If, in addition, condition (H3*) is satisfied, and if the bifurcation
is subcritical, i.e.,
\[
s\mu'(s) < 0 \quad \text{for all } s \neq 0 \text{ in } W(0), \tag{87b}
\]
then $t \mapsto z_s(t)$ is orbitally unstable for all $s \neq 0$ in $W(0)$.
Solution: The existence and uniqueness of the solution $(\mu(s), z_s)$ follows from
the main theorem on Hopf bifurcation (Theorem 79.F). We now investigate
the stability of $t \mapsto z_s(t)$ by using Theorem 79.D.
(I) First, let $s = 0$. We set
\[
L = e^{p(0)G_z(\mu_0, 0)}.
\]
According to assumption (H3*), the operator $G_z(\mu_0, 0): \mathbb{R}^N \to \mathbb{R}^N$ has
the eigenvalues
\[
\lambda_j \quad \text{with} \quad \operatorname{Re}\lambda_j < 0 \quad \text{for all } j
\]
and $\lambda_\pm = \pm i\omega_0$, where $\lambda_+$ and $\lambda_-$ are algebraically simple. Thus, the
operator $L: \mathbb{R}^N \to \mathbb{R}^N$ has the eigenvalues
\[
m_j = e^{\lambda_j p(0)} \quad \text{with} \quad |m_j| < 1 \quad \text{for all } j
\]
and the eigenvalue
\[
m = e^{\lambda_\pm p(0)} = e^{\pm 2\pi i} = 1,
\]
where the algebraic multiplicity of $m$ equals two.
(II) We now let $s \neq 0$. The Floquet multipliers of $t \mapsto z_s(t)$ are the eigenvalues
of the shift operator $S(p(s))$ which corresponds to the differential
equation
\[
z'(t) = G_z(\mu(s), z_s(t))z(t).
\]
According to (25), we obtain the integral equation
\[
S(t) = I + \int_0^t G_z(\mu(s), z_s(\tau))S(\tau)\,d\tau.
\]
Since $\max_{t \in \mathbb{R}} |z_s(t)| \to 0$ as $s \to 0$, according to Theorem 79.F, we obtain
from Proposition 1.2 the crucial property
\[
\lim_{s \to 0} S(p(s)) = L.
\]
Therefore, we can use the methods of perturbation theory in order to
study the structure of the eigenvalues of $S(p(s))$, which are the Floquet
multipliers of $z_s$. Let $|s|$ be sufficiently small.
(II-1) From Problem 79.8a it follows that the perturbations of $m_1, \ldots, m_r$
remain within the open unit disk and outside a small neighborhood of
the point $m = 1$.
(II-2) From Problem 79.8b it follows that the eigenvalue $m = 1$ of $L$ splits
into the eigenvalues $m_+(s)$ and $m_-(s)$ counted according to their algebraic
multiplicity, i.e., the total algebraic multiplicity equals two. Moreover,
$m_+(s)$ and $m_-(s)$ lie in a small neighborhood of the point $m = 1$.
According to Example 79.10 and Lemma 79.19, there are two
Floquet multipliers of $z_s$ with
\[
M_\pm(s) = e^{2\pi\kappa_\pm(s)}
\]
and $\kappa_+(s) = \kappa(s)$, $\kappa_-(s) = 0$, where $\kappa(s) \to 0$ as $s \to 0$. From (II-1) and
(II-2) we obtain
\[
m_\pm(s) = M_\pm(s).
\]
(III) It follows from Lemma 79.19 that
\[
\operatorname{sgn}\kappa(s) = -\operatorname{sgn} s\mu'(s).
\]
(III-1) For (87a), we obtain $\kappa(s) < 0$. Thus, apart from the trivial multiplier
$m_-(s) = 1$, all the eigenvalues of $S(p(s))$ (i.e., the Floquet multipliers of $z_s$)
are contained in the open unit disk.
According to Theorem 79.D, the periodic solution $t \mapsto z_s(t)$ is asymptotically
orbitally stable.
(III-2) For (87b), we obtain $\kappa(s) > 0$. Therefore, the Floquet multiplier $m_+(s)$
satisfies $m_+(s) > 1$. According to Theorem 79.D, the solution $z_s$ is
orbitally unstable.
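The argument in (II) and (III) can be followed numerically on an explicit example. The following is a minimal sketch, assuming NumPy and the hypothetical planar normal form $x' = \mu x - y - x(x^2 + y^2)$, $y' = x + \mu y - y(x^2 + y^2)$ with $\mu_0 = 0$ and $\omega_0 = 1$ (not taken from the text). For $\mu > 0$ it has the $2\pi$-periodic solution $z(t) = \sqrt{\mu}\,(\cos t, \sin t)$, so the bifurcation is supercritical. The code integrates the linearized equation $S' = G_z(\mu, z(t))S$, $S(0) = I$, over one period with the classical Runge-Kutta method and returns the eigenvalues of $S(2\pi)$. By Liouville's formula, these Floquet multipliers are $1$ and $e^{-4\pi\mu}$.

```python
import numpy as np

def jacobian_on_orbit(mu: float, t: float) -> np.ndarray:
    """G_z(mu, z(t)) along the periodic orbit z(t) = sqrt(mu) * (cos t, sin t)."""
    x = np.sqrt(mu) * np.cos(t)
    y = np.sqrt(mu) * np.sin(t)
    return np.array([
        [mu - 3.0 * x**2 - y**2, -1.0 - 2.0 * x * y],
        [1.0 - 2.0 * x * y, mu - x**2 - 3.0 * y**2],
    ])

def floquet_multipliers(mu: float, steps: int = 4000) -> np.ndarray:
    """Eigenvalues of S(2*pi), where S' = G_z(mu, z(t)) S and S(0) = I (classical RK4)."""
    h = 2.0 * np.pi / steps
    S = np.eye(2)
    for k in range(steps):
        t = k * h
        k1 = jacobian_on_orbit(mu, t) @ S
        k2 = jacobian_on_orbit(mu, t + h / 2) @ (S + h / 2 * k1)
        k3 = jacobian_on_orbit(mu, t + h / 2) @ (S + h / 2 * k2)
        k4 = jacobian_on_orbit(mu, t + h) @ (S + h * k3)
        S = S + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return np.sort(np.linalg.eigvals(S).real)

mu = 0.1
multipliers = floquet_multipliers(mu)   # sorted: nontrivial multiplier, then the trivial one
print(multipliers, np.exp(-4.0 * np.pi * mu))
```

In this example the orbit of amplitude $s$ occurs at $\mu(s) = s^2$, so $s\mu'(s) = 2s^2 > 0$, and the nontrivial multiplier lies inside the unit disk, in agreement with (III-1).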
79.10.* Degenerate Hopf bifurcation.
79.10a. Violation of the transversality condition (H2) in Problem 79.9. Study Kielhöfer
(1982).
79.10b. Violation of the nonresonance condition (H3) in Problem 79.9. Suppose that
condition (H3) is not valid. A careful study of this so-called resonance case can
be found in Recke (1987), (1988). In this connection it is very important that
the four-dimensional system of branching equations can be substantially simplified
by using the invariance of the problem under the action of the
rotation group in $\mathbb{R}^2$.
Roughly speaking, one obtains the following result in the generic resonance
case. If two pairs of complex conjugate eigenvalues of $G_z(\mu, 0)$ cross the imaginary
axis at the parameter value $\mu = \mu_0$, then the equilibrium state $z = 0$
bifurcates into four families of periodic oscillations whose stability properties
are subtle. In order to describe these stability properties transparently, it is
necessary to consider multi-dimensional parameters $\mu$ in $\mathbb{R}^d$, where the dimension
$d$ of the parameter space is sufficiently large.
79.11. Abstract parabolic equations and Hopf bifurcation. In order to obtain a variant
of Theorem 79.F, we consider the differential equation
\[
z' + Lz + g(\mu, z) = 0, \tag{88}
\]
where $\mu$ is a real parameter. We are looking for nonconstant periodic solutions
$z = z(t)$ of (88).
In Section 19.26 we studied initial-value problems for abstract parabolic
equations of the form (88) together with applications to parabolic differential
equations. In this connection, sectorial operators $L$, fractional powers
$(L + rI)^\alpha$, and abstract Sobolev spaces $X_\alpha$ played a fundamental role. The
following results allow applications to Hopf bifurcation for broad classes of
parabolic differential equations.
Recall that the assumption (H1a) below is satisfied if, for example, the
operator $L: D(L) \subseteq X \to X$ is linear, self-adjoint, and bounded below on the
real H-space $X$, i.e., there is a real number $c$ such that
\[
(Lz \mid z) \ge c\,\|z\|^2 \quad \text{for all } z \in D(L).
\]
We make the following assumptions.
(H1a) Sectorial operator $L$. Let $X$ be a real B-space. The operator
\[
L: D(L) \subseteq X \to X
\]
is linear, densely defined, graph closed, and the complexification $L_{\mathbb{C}}$ is
sectorial on the complexification $X_{\mathbb{C}}$.
(H1b) Compact resolvent. The operator
\[
(L_{\mathbb{C}} - \lambda I)^{-1}: X_{\mathbb{C}} \to X_{\mathbb{C}}
\]
is compact for each $\lambda \in \mathbb{C}$ in the resolvent set of $L_{\mathbb{C}}$.
It follows from (H1a) that the operator $-L$ generates an analytic semigroup.
Moreover, it follows that if $r > -\operatorname{Re}\lambda$ for all $\lambda \in \mathbb{C}$ in the spectrum of $L$, then
the fractional powers $(L + rI)^\alpha$ are well defined for each $\alpha \ge 0$. In this connection,
the B-spaces $X_\alpha$ (abstract Sobolev spaces) with norms $\|\cdot\|_\alpha$ are defined by
\[
X_\alpha = D((L + rI)^\alpha)
\]
and $\|z\|_\alpha = \|(L + rI)^\alpha z\|_X$ for all $z \in X_\alpha$. We have $X_\alpha \subseteq X$ for all $\alpha \ge 0$.
(H2) Nonlinearity. There is an $\alpha$ with $0 \le \alpha < 1$ and a neighborhood $U$ of
$(\mu_0, 0)$ in $\mathbb{R} \times X_\alpha$ such that the function
\[
g: U \to X
\]
is $C^m$ with $m \ge 2$. Henceforth the value of $\alpha$ is fixed. Furthermore, suppose
that
\[
g(\mu, 0) = 0 \quad \text{and} \quad g_z(0, 0) = 0
\]
for all $\mu$ in a neighborhood of $\mu_0$.
(H3) Loss of stability (transversality condition). There is an $\omega_0 > 0$ such that
$\lambda = i\omega_0$ is an algebraically simple eigenvalue of $L_{\mathbb{C}}$, i.e.,
\[
\dim N(L_{\mathbb{C}} - i\omega_0 I) = \operatorname{codim} R(L_{\mathbb{C}} - i\omega_0 I) = 1
\]
and $z \in N(L_{\mathbb{C}} - i\omega_0 I)$ with $z \neq 0$ implies $z \notin R(L_{\mathbb{C}} - i\omega_0 I)$.
This assumption tells us that $\lambda = i\omega_0$ is an $I$-simple eigenvalue of $L_{\mathbb{C}}$
regarded as a map from $(X_1)_{\mathbb{C}}$ into $X_{\mathbb{C}}$. Hence it follows from Section 79.7
that there exists a unique $C^1$-function $\mu \mapsto \lambda(\mu)$ on a neighborhood $U(\mu_0)$ of
$\mu_0$ such that $\lambda(\mu_0) = i\omega_0$ and $\lambda(\mu)$ is an eigenvalue of $L + g_z(\mu, 0)$ for each
$\mu \in U(\mu_0)$. We assume that the transversality condition
\[
\operatorname{Re}\lambda'(\mu_0) \neq 0 \tag{89}
\]
is satisfied.
(H4) Nonresonance condition. None of the numbers $\pm ik\omega_0$ with $k = 0$ and
$k = 2, 3, \ldots$ is in the spectrum of $L$.
(H4*) Strong nonresonance condition. All points $\lambda$ in the spectrum of $L$ different
from $\pm i\omega_0$ satisfy $\operatorname{Re}\lambda > 0$, and $\operatorname{Re}\lambda'(\mu_0) < 0$.
By a solution of the original problem (88), we understand a continuous
function
\[
z: [0, \infty[ \,\to X_\alpha
\]
such that $z(t) \in D(L)$ for all $t > 0$ and the derivative
\[
z': \,]0, \infty[ \,\to X
\]
is continuous and satisfies equation (88).
Show: If assumptions (H1) to (H4) are satisfied, then the following are true.
(a) Existence. Let $W(0)$ be a sufficiently small neighborhood of $s = 0$ in $\mathbb{R}$.
There exist two even $C^{m-1}$-functions
\[
\mu, p: W(0) \to \mathbb{R}
\]
and a family $\{z_s\}$ of nonconstant $p(s)$-periodic functions
\[
z = z_s(t)
\]
for each $s \neq 0$ in $W(0)$ such that $(\mu(s), z_s)$ is a solution of the original
problem (88) for each $s \in W(0)$. Here, we have $z_s(t) \equiv 0$ for $s = 0$ and
\[
\begin{aligned}
\mu(s) &= \mu_0 + O(s), & s &\to 0,\\
p(s) &= \frac{2\pi}{\omega_0} + O(s), & s &\to 0,
\end{aligned}
\]
\[
\lim_{s \to 0}\,\max_{t \ge 0} \|z_s(t)\|_\alpha = 0.
\]
If $g$ is analytic in (H2), then $\mu(\cdot)$ and $p(\cdot)$ are also analytic.
(b) Uniqueness. There exists a neighborhood $U(0)$ in $X_\alpha$ of the equilibrium
point $z = 0$, and there exist neighborhoods $U(\mu_0)$ and $U(p(0))$ in $\mathbb{R}$ such
that the following holds. If $C$ is the orbit of a $p$-periodic solution $(\mu, z)$ of
(88) with
\[
C \subset U(0), \qquad \mu \in U(\mu_0), \qquad \text{and} \qquad p \in U(p(0)),
\]
then $(\mu, z)$ corresponds to one of the solutions in (a) up to time translation.
(c) Stability. If, in addition, condition (H4*) is satisfied, and if the bifurcation
is supercritical, i.e.,
\[
s\mu'(s) > 0 \quad \text{for all } s \neq 0 \text{ in } W(0),
\]
then the periodic solution $t \mapsto z_s(t)$ is asymptotically orbitally stable for all
$s \neq 0$ in $W(0)$.
(d) Instability. If, in addition, condition (H4*) is satisfied, and if the bifurcation
is subcritical, i.e.,
\[
s\mu'(s) < 0 \quad \text{for all } s \neq 0 \text{ in } W(0),
\]
then $t \mapsto z_s(t)$ is orbitally unstable for all $s \neq 0$ in $W(0)$.
Hint: Use arguments analogous to those in the proofs of Theorem 79.F, Problem
79.9, and Theorem 19.1 on abstract parabolic equations. Replace the differential
equation (88) with the equivalent integral equation
\[
z(t) = S(t)z(0) - \int_0^t S(t - s)g(\mu, z(s))\,ds,
\]
where $\{S(t)\}$ denotes the analytic semigroup generated by $-L$. Cf. Crandall
and Rabinowitz (1977).
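The equivalence between (88) and the integral equation in the hint can be tested in the simplest finite-dimensional situation. The following is a minimal sketch, assuming NumPy and a scalar stand-in: $L$ is replaced by the number `L_COEFF`, so that $S(t) = e^{-tL}$, and the nonlinearity `g` is a hypothetical choice with the parameter $\mu$ held fixed. The code solves $z' + Lz + g(z) = 0$ with the classical Runge-Kutta method and then checks that the computed solution satisfies the variation-of-constants formula up to quadrature error.

```python
import numpy as np

L_COEFF = 2.0                        # scalar stand-in for the operator L

def g(z):
    return z**2                      # hypothetical nonlinearity, parameter mu held fixed

def semigroup(t):
    return np.exp(-L_COEFF * t)      # S(t) = exp(-t L), generated by -L

def rhs(z):
    return -L_COEFF * z - g(z)       # z' = -L z - g(z), i.e. equation (88)

def solve_rk4(z0, t_end, steps):
    """Classical RK4 for z' = rhs(z); returns the time grid and the trajectory."""
    ts = np.linspace(0.0, t_end, steps + 1)
    zs = np.empty(steps + 1)
    zs[0] = z0
    h = t_end / steps
    for k in range(steps):
        z = zs[k]
        k1 = rhs(z)
        k2 = rhs(z + h / 2 * k1)
        k3 = rhs(z + h / 2 * k2)
        k4 = rhs(z + h * k3)
        zs[k + 1] = z + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return ts, zs

ts, zs = solve_rk4(z0=0.5, t_end=1.0, steps=2000)
t = ts[-1]

# Trapezoidal rule for the integral of S(t - s) g(z(s)) over 0 <= s <= t.
integrand = semigroup(t - ts) * g(zs)
h = ts[1] - ts[0]
integral = h * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

duhamel_value = semigroup(t) * zs[0] - integral
residual = abs(zs[-1] - duhamel_value)
print(residual)
```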

References to the Literature


Classical works: Floquet (1883), Ljapunov (1892), Poincaré (1892), Hopf (1942).
Introductory exposition in $\mathbb{R}^n$: Amann (1983, M).
Applications to mechanics: Abraham and Marsden (1978, M).
Stability theory in B-spaces: Daleckii and Krein (1970, M, B, H), Henry (1981, L).
Hopf bifurcation: Joseph and Sattinger (1972), Crandall and Rabinowitz (1975),
(1977), Marsden and McCracken (1976, L), Sattinger (1979, L), Iooss (1979, M),
Hassard (1981, M) (theory and applications), Chow and Hale (1982, M), Kielhöfer
(1979), (1980), (1982), (1983), Arnold (1983, M), (1987, S), Vol. 5.
Hopf bifurcation at resonance: Recke (1987), (1988).
Global Hopf bifurcation: Alexander and Yorke (1978), Ize (1976), Chow and Mallet-
Paret (1978), Nussbaum (1978), Fiedler (1986).
Stability and bifurcation: Iooss and Joseph (1980, M) (elementary introduction),
Crandall and Rabinowitz (1973), Sattinger (1979, L), (1980, S) (see also the References
to the Literature to Chapters 3, 80, and 81).

General References to the Literature on Perturbation Theory

Standard works: Kato (1966, M), Reed and Simon (1972, M), Vol. 4.
Introduction: Nayfeh (1973, M), Kevorkian and Cole (1981, M), Gitterman and
Halpern (1981, M) (applications to physics), Kato (1982, M).
Matrices and operators: Baumgärtel (1985, M).
Singular perturbations for partial differential equations and control problems: Lions
(1973, L).
Boundary layers for partial differential equations: Ljusternik and Visik (1957) (basic
paper), Trenogin (1970, S).
Method of averaging: Bogoljubov and Mitropolskii (1965, M), Daleckii and Krein
(1970, M), Hale (1980, M).
Celestial mechanics: Stumpf (1973, M), Vols. 1-3, Sternberg (1969, M), Hagihara
(1976, M), Vols. 1-5.
Oscillating systems: Bogoljubov and Mitropolskii (1965, M), Kirchgraber and Stiefel
(1978, M), Nayfeh and Mook (1979, M).
Quantum mechanics: Reed and Simon (1972, M), Vols. 1-4.
Quantum field theory: Bogoljubov and Sirkov (1973, M), (1980, M), Itzykson and
Zuber (1980, M), Lee (1981, M), Frampton (1987, M).
Reaction and diffusion: Fife (1978, S), (1979, L), Smoller (1983, M).
Maslov index and asymptotic expansions beyond the caustic: Eckmann and Sénéor
(1976) (introduction), Maslov (1972, M), Leray (1978, M).
Homogenization: Bensoussan, Lions, and Papanicolaou (1978, M).
Regularization: Lions (1969, M).
Bifurcation: Chow and Hale (1982, M).
Geometric perturbation theory in physics: Omohundro (1986, M).
Appendix

Table 1 contains a survey of units from the international system of units. In
addition, the following abbreviations are used:
T (tera, $10^{12}$),
G (giga, $10^{9}$),
M (mega, $10^{6}$),
k (kilo, $10^{3}$),
f (femto, $10^{-15}$).
For example, 1 GeV (gigaelectron volt) is equal to $10^9$ eV. The unit eV
(electron volt) does not belong to the international system of units, but is
frequently used in elementary particle physics. Because of Einstein's formula
\[
E = mc^2,
\]
which relates the energy $E$ of a free particle, its mass $m$, and the velocity of
light $c$, masses in elementary particle physics are measured in eV/$c^2$. The rest
mass of the electron, for instance, is
\[
m_e = 0.511\ \text{MeV}/c^2.
\]
The rest mass of the proton is
\[
m_p = 1836.1\,m_e \approx 1\ \text{GeV}/c^2.
\]
These two numbers form the basic scale for masses of elementary particles.
Table 2 contains the values of important universal constants.
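The conversion between kilograms and MeV/$c^2$ follows directly from $E = mc^2$ and the definition of the electron volt. The following is a minimal sketch using the rounded values from Table 2; the constant and function names are illustrative. It reproduces the electron mass of about 0.511 MeV/$c^2$ and a proton mass of roughly 1 GeV/$c^2$.

```python
C_LIGHT = 2.99793e8        # velocity of light in m/s (Table 2)
E_CHARGE = 1.602e-19       # magnitude of the electron charge in A s, so 1 eV = 1.602e-19 J
M_ELECTRON = 9.108e-31     # rest mass of the electron in kg (Table 2)
M_PROTON = 1.672e-27       # rest mass of the proton in kg (Table 2)

def rest_energy_mev(mass_kg: float) -> float:
    """Rest energy m*c^2 in MeV; numerically equal to the mass in MeV/c^2."""
    energy_joule = mass_kg * C_LIGHT**2
    return energy_joule / E_CHARGE / 1e6

electron_mev = rest_energy_mev(M_ELECTRON)
proton_mev = rest_energy_mev(M_PROTON)
mass_ratio = M_PROTON / M_ELECTRON

print(electron_mev, proton_mev, mass_ratio)
```

With these rounded inputs the mass ratio comes out slightly below the quoted value 1836.1; the difference reflects the rounding of the tabulated masses.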


Table 1

Quantity                   Symbol  Unit           Relation

Basic units
length                     m       meter
time                       s       second
mass                       kg      kilogram
temperature                K       kelvin
current strength           A       ampere
amount of substance        mol     mole           1 mol = 6.026 · 10^23 particles
luminous intensity         cd      candela

Derived units
force                      N       newton         N = kg m/s^2
energy, work               J       joule          J = N m = W s
                           eV      electron volt  1 eV = 1.6 · 10^-19 J
velocity                   m/s
acceleration               m/s^2
density                    kg/m^3
pressure                   Pa      pascal         Pa = N/m^2
power                      W       watt           W = J/s = V A
action                     J s
voltage                    V       volt           V = W/A
charge                     C       coulomb        C = A s
electric field strength    V/m
magnetic flux              Wb      weber          Wb = V s
magnetic field strength    T       tesla          T = Wb/m^2
electric resistance        Ω       ohm            Ω = V/A
inductance                 H       henry          H = Wb/A
capacitance                F       farad          F = C/V

Table 2

Universal constants
velocity of light in the vacuum    c = 2.99793 · 10^8 m/s = 1/√(ε_0 μ_0)
Planck's action quantum            h = 6.625 · 10^-34 J s  (ħ = h/2π)
Boltzmann constant                 k = 1.380 · 10^-23 J/K
gravitational constant             G = 6.674 · 10^-11 N m^2/kg^2
dielectric constant                ε_0 = 8.854 · 10^-12 A s/V m
permeability constant              μ_0 = 4π · 10^-7 V s/A m
charge of the electron             e = -1.602 · 10^-19 A s
rest mass of the electron          m_e = 9.108 · 10^-31 kg = 0.511 MeV/c^2
rest mass of the proton            m_p = 1.672 · 10^-27 kg
rest mass of the neutron           m_n = 1.675 · 10^-27 kg
References

Abbott, L. and Pi, S. [eds.] (1985): Inflationary Cosmology. World Scientific, Singapore.
Ablowitz, M. and Segur, H. (1981): Solitons and the Inverse Scattering Transform.
SIAM, Philadelphia.
Abraham, R. and Robbin, J. (1967): Transversal Mappings and Flows. Benjamin,
Reading, MA.
Abraham, R. and Marsden, J. (1978): Foundations of Mechanics. Benjamin, Reading,
MA. (Cf. also Marsden, Abraham, and Ratiu (1983).)
Abraham, R. (1983): Dynamics-the Geometry of Behavior. Part I: Periodic Behavior.
Part II: Chaotic Behavior. Part III: Global Behavior. Part IV: Bifurcation Behavior.
Birkhäuser, Basel.
Adams, J. (1962): Vector fields on spheres. Ann. of Math. 75, 603-632.
Agmon, S., Douglis, A., and Nirenberg, L. (1959): Estimates near the boundary for
solutions of elliptic partial differential equations, I, II. Comm. Pure Appl. Math.
12 (1959), 623-727; 17 (1964), 35-92.
Ahrens, W. (1904): Scherz und Ernst in der Mathematik. Teubner, Leipzig.
Albers, D. and Alexanderson, A. [eds.] (1985): Mathematical People. Birkhauser,
Boston.
Alberti, P. and Uhlmann, A. (1981): Dissipative Motion in State Space. Teubner,
Leipzig.
Albeverio, S., Fenstad, J., and Hoegh-Krohn, R. (1986): Nonstandard Methods in
Stochastic Analysis and Mathematical Physics. Academic, New York.
Aleksandrov, P. [ed.] (1971): Die Hilbertschen Probleme. Geest & Portig, Leipzig.
Alexander, J. and Yorke, J. (1978): Global bifurcation of periodic orbits. Amer. J. Math.
100, 263-292.
Allgower, E. and Georg, K. (1980): Simplicial and continuation methods for approximating
fixed points. SIAM Rev. 22, 28-85.
Allgower, E. and Georg, K. (1980a): Homotopy methods for approximating several
solutions to nonlinear systems of equations. In: Forster, W. [ed.] (1980), 253-270.
Allgower, E. and Georg, K. (1990): Numerical Continuation Methods. Springer-Verlag,
New York.

Alt, H. (1977): A free boundary-value problem associated with the flow of ground water.
Arch. Rat. Mech. Anal. 64, 111-126.
Alt, H. (1979): Strömungen durch inhomogene poröse Medien mit freiem Rand. J. Reine
Angew. Math. 305, 89-115.
Alt, H. and Caffarelli, L. (1981): Existence and regularity for a minimum problem with
free boundary. J. Reine Angew. Math. 325, 105-144.
Alt, H., Caffarelli, L., and Friedman, A. (1983): Axially symmetric jet flows. Arch. Rat.
Mech. Anal. 81, 97-149.
Amann, H. (1983): Gewöhnliche Differentialgleichungen. De Gruyter, Berlin.
Ambarzumjan, V. [ed.] (1980): Probleme der modernen Kosmogonie. Akademie-Verlag,
Berlin.
American Institute of Physics Handbook (1972): McGraw-Hill, New York.
Amick, C. (1978): Properties of steady Navier-Stokes solutions for certain unbounded
channels and pipes. Nonlinear Anal. 6, 689-720.
Amick, C. and Toland, J. (1981): On solitary water waves of finite amplitude. Arch. Rat.
Mech. Anal. 76, 9-95.
Amick, C., Fraenkel, L., and Toland, J. (1982): On the Stokes conjecture for the wave
of extreme form. Acta Math. 148, 193-214.
Antman, S. (1972): The theory of rods. In: Flügge, S. [ed.] (1956), Vol. VIa/2, 641-703.
Antman, S. (1977): Nonlinear elastic structures. In: Rabinowitz, P. [ed.] (1977), 73-125.
Antman, S. and Rosenfeld, G. (1978): Global behavior of buckled states of nonlinearly
elastic rods. SIAM Rev. 20, 513-566.
Antman, S. (1980): Buckled states of nonlinearly elastic plates. Arch. Rat. Mech. Anal.
67, 11-149.
Antman, S. and Nachman, A. (1980): Large buckled states of rotating rods. Nonlinear
Anal. 4, 303-327.
Antman, S. and Kenney, C. (1981): Large buckled states of nonlinearly elastic rods under
torsion, thrust, and gravity. Arch. Rat. Mech. Anal. 76, 289-300.
Antman, S. (1983): The influence of elasticity on analysis: Modern developments. Bull.
Amer. Math. Soc. (N.S.) 9, 267-291.
Antman, S. (1984): Geometrical and analytical questions in nonlinear elasticity. In:
Chern, S. [ed.] (1984), 1-30.
Antman, S. and Ke-Gang Shih (1986): Qualitative properties of large buckled states of
spherical shells. Arch. Rat. Mech. Anal. 93, 357-384.
Antman, S. (1995): Nonlinear Problems of Elasticity. Springer-Verlag, New York.
Arms, J., Marsden, J., and Moncrief, V. (1982): The structure of the space of solutions
of Einstein's equations, II. Several Killing fields and the Einstein-Yang-Mills
equations. Ann. Physics 144, 81-106.
Arnold, V. (1963): Small divisors and problems of stability of motion in classical and
celestial mechanics. Uspekhi Mat. Nauk 18 (6), 91-196 (Russian).
Arnold, V. (1966): Sur la géométrie différentielle des groupes de Lie de dimension infinie
et ses applications à l'hydrodynamique des fluides parfaits. Ann. Inst. Fourier
Grenoble 16 (1), 319-361.
Arnold, V. (1978): Mathematical Methods of Classical Mechanics. Springer-Verlag,
Berlin.
Arnold, V. (1981): Singularity Theory: Selected Papers. University Press, Cambridge,
England.
Arnold, V. (1983): Geometrical Methods in the Theory of Ordinary Differential Equa-
tions. Springer-Verlag, New York.
Arnold, V., Gusein-Zade, S., and Varcenko, A. (1985): Singularities of Differentiable
Maps, Vols. 1, 2. Birkhäuser, Boston.
Arnold, V. (1986): Catastrophe Theory. Springer-Verlag, New York.
Arnold, V. et al. [eds.] (1987): Dynamical Systems, Vols. 1-8. Springer-Verlag, New
York. (Classical mechanics, celestial mechanics, Kolmogorov-Arnold-Moser
theory, bifurcation, catastrophe theory).
Arrow, K. and Debreu, G. (1954): Existence of an equilibrium for a competitive economy.
Econometrica 22, 265-290.
Arrow, K. and Intrilligator, A. (1982): Handbook of Mathematical Economics. North-
Holland, New York.
Atiyah, M. (1979): Geometry of Yang-Mills Fields (Lezioni Fermiane). Acc. Naz. Lincei,
Scuola Normale Superiore, Pisa.
Atiyah, M. (1984): An interview. Math. Intelligencer 6, 9-19.
Aubin, J. (1977): Applied Abstract Analysis. Wiley, New York.
Aubin, J. (1979): Applied Functional Analysis. Wiley, New York.
Aubin, J. (1979a): Mathematical Methods and Economic Theory. North-Holland, New
York.
Aubin, J. and Ekeland, I. (1983): Applied Nonlinear Analysis. Wiley, New York.
Aubin, T. (1982): Nonlinear Analysis on Manifolds: Monge-Ampere Equations.
Springer-Verlag, New York.
Audouze, J. and Tran Thanh Van, J. (1985): Fundamental Interactions and Cosmology.
Editions Frontieres, Paris.
Aziz, A. and Na, T. (1984): Perturbation Methods in Heat Transfer. Springer-Verlag,
New York.

Babin, A. and Visik, M. (1983): Attractors of evolution equations and estimates of their
dimensions. Uspekhi Mat. Nauk 38 (4), 133-187 (Russian).
Babin, A. and Visik, M. (1983a): On the dimension of attractors of the Navier-Stokes
system and other evolution equations. Dokl. Akad. Nauk SSSR 271 (6), 238-243.
Babin, A. and Visik, M. (1986): Unstable invariant sets of semigroups of nonlinear
operators and their perturbations. Uspekhi Mat. Nauk 41 (4), 3-34.
Babuška, I., Rektorys, K., and Vyčichlo, F. (1960): Mathematische Elastizitätstheorie
der ebenen Probleme. Akademie-Verlag, Berlin.
Baiocchi, C. (1972): Su un problema a frontiera libera connesso a questioni di idraulica.
Ann. Mat. Pura Appl. 92, 107-127.
Baiocchi, C. and Capelo, A. (1978): Disequazioni variazionali e quasivariazionali, Vols.
1, 2. Pitagora, Bologna.
Balian, R. (1982): Du microscopique au macroscopique. Cours de physique statistique
de l'École Polytechnique, Paris.
Ball, J. (1977): Convexity conditions and existence theorems in nonlinear elasticity. Arch.
Rat. Mech. Anal. 63, 337-403.
Ball, J., Currie, J., and Olver, P. (1981): Null Lagrangians, weak continuity, and variational
problems of arbitrary order. J. Funct. Anal. 41, 135-174.
Ball, J. (1983): Energy minimizing configurations in nonlinear elasticity. In: Warsaw
(1983), 1309-1314.
Ball, J. [ed.] (1983a): Systems of Nonlinear Partial Differential Equations. Reidel,
Boston.
Bardos, C. (1982): Introduction aux problèmes hyperboliques non linéaires. Lecture Notes
in Mathematics, Vol. 1047, 1-76. Springer-Verlag, Berlin.
Bartel, N. [ed.] (1987): Supernovae as Distance Indicators. Lecture Notes in Physics,
Vol. 224. Springer-Verlag, New York.
Batchelor, G. (1967): Introduction to Fluid Dynamics. University Press, Cambridge,
England.
Batchelor, G. (1982): The Theory of Homogeneous Turbulence. University Press,
Cambridge, England.
Bauer, F., Garabedian, P., and Korn, D. (1972): A Theory of Supercritical Wing
Sections with Computer Programs and Examples. Lecture Notes in Economics
and Mathematical Systems, Vol. 66, Springer-Verlag, New York.
Baule, B. (1956): Die Mathematik des Naturforschers und Ingenieurs, Vols. 1-7. Hirzel,
Leipzig.
Baumgärtel, H. (1985): Analytic Perturbation Theory for Matrices and Operators.
Birkhäuser, Boston.
Baxter, R. (1982): Exactly Solved Models in Statistical Mechanics. Academic Press, New
York.
Beale, J. (1979): The existence of cnoidal water waves with surface tension. J. Differential
Equations 31, 230-264.
Becher, P. and Bohme, M. (1981): Eichtheorien der starken und elektroschwachen
Wechselwirkung. Teubner, Stuttgart.
Becker, E. (1965): Gasdynamik. Teubner, Stuttgart.
Beckert, H. (1972): Zur Steuerung der Stabilität in elastischen Körpern. Z. Angew. Math.
Mech. 52, 617-622.
Beckert, H. (1975): Zur ersten Randwertaufgabe in der nichtlinearen Elastizitätstheorie.
Z. Angew. Math. Mech. 55, 47-58.
Beckert, H. (1977): Bemerkungen zur Theorie der Stabilität. Ber. Sächs. Akad. Wiss.
Leipzig, Math.-nat. Kl. 113, 2. Akademie-Verlag, Berlin.
Beckert, H. (1982): The initial-value problem for the general dynamical equations in
nonlinear elasticity. Z. Angew. Math. Mech. 62, 357-369.
Beckert, H. (1984): Nichtlineare Elastizitätstheorie. Ber. Sächs. Akad. Wiss. Leipzig,
Math.-nat. Kl. 117, 2. Akademie-Verlag, Berlin.
Beckert, H. (1985): Axiomatik-Mathematik und Erfahrung. Ber. Sächs. Akad. Wiss.
Leipzig, Math.-nat. Kl. 118, 2. Akademie-Verlag, Berlin.
Beckert, H. (1986): The bending of plates and their stability region. Z. Angew. Math.
Mech. 66, 413-419.
Beem, J. and Ehrlich, P. (1981): Global Lorentzian Geometry. Marcel Dekker, New
York.
Beju, I. (1972): The place boundary-value problem in hyperelastostatics, I, II. Bull. Math.
Soc. Sci. Math. R. S. Roumanie (N.S.) 16, 131-149; 283-314.
Beltrami, E. (1868): Teoria fondamentale degli spazi di curvatura constante. Ann. Mat.
Pura Appl., Ser. 2 (2), 232-255.
Benard, M. (1901): Les tourbillons cellulaires dans une nappe liquide transportant de la
chaleur par convection en régime permanent. Ann. Chim. Ser. 7 (23), 62-144.
Benkert, F. (1989): On the number of stable local minima of some functionals. Z. Anal.
Anwendungen 8, 89-96.
Bensoussan, A., Lions, J., and Papanicolaou, G. (1978): Asymptotic Methods in Periodic
Structures. North-Holland, Amsterdam.
Berge, P., Pomeau, Y., and Vidal, C. (1984): L'ordre dans le chaos. Hermann, Paris.
(English edition, 1986).
Berger, M. and Fife, P. (1967): On von Karman's equations, I, II. Comm. Pure Appl.
Math. 20 (1967), 687-719; 21 (1968), 227-241.
Berger, M. (1977): Nonlinearity and Functional Analysis. Academic Press, New York.
Bergmann, L. and Schaefer, C. (1979): Lehrbuch der Experimentalphysik, Vols. 1-4. De
Gruyter, Berlin.
Berkeley (1983): Proceedings of the AMS Summer Institute on Nonlinear Functional
Analysis and its Applications. Cf. Browder, F. [ed.] (1986).
Berkeley (1986): Proceedings of the International Conference of Mathematicians (to
appear).
Bernadou, M. and Boisserie, J. (1982): The Finite Element Method in Thin Shell Theory.
Birkhäuser, Boston.
Bernard, P. and Ratiu, T. [eds.] (1977): Turbulence Seminar. Lecture Notes in Mathe-
matics, Vol. 615. Springer-Verlag, Berlin.
Bernoulli, J. and Euler, L. (1691/1744): Abhandlungen über das Gleichgewicht und die
Schwingungen der ebenen elastischen Kurven. Ostwalds Klassiker, Vol. 175. Leipzig,
1910.
Bernoulli, J. (1705): Véritable hypothèse de la résistance des solides. (See The Collected
Works of J. Bernoulli, Vol. 2. Geneva, 1744; and Bernoulli, J. and Euler, L.
(1691/1744).)
Bernoulli, D. (1738): Hydrodynamica. Argentorati,
Bers, L. (1954): Existence and uniqueness of a subsonic flow past a given profile. Comm.
Pure Appl. Math. 7 (1954), 441-504.
Bers, L. (1958): Mathematical Aspects of Subsonic and Supersonic Gas Dynamics. New
York.
Besse, A. (1987): Einstein Manifolds. Springer-Verlag, New York.
Beyer, K. (1979): Zur ersten Randwertaufgabe in der nichtlinearen Elastizitiitstheorie.
Math. Nachr. 89, 43-50.
Beyer, K. and Zeidler, E. (1979): Existenz- und Eindeutigkeitsbeweis fur Gezeitenwellen
mit allgemeinen Wirbelverteilungen. Math. Nachr. 88, 227-254.
Birrell, N. and Davies, P. (1982): Quantum Fields in Curved Space. University Press,
Cambridge, England.
Birkhoff, G. (1923): Relativity and Modern Physics. University Press, Harvard, MA.
Birkhoff, G. (1950): Hydrodynamics: A Study in Logic, Fact, and Similitude. University
Press, Princeton, NJ.
Birkhoff, G. and Zarantonello, E. (1957): Jets, Wakes, and Cavities. Academic Press,
New York.
Bitsadze, A. (1964): Equations of Mixed Type. Macmillan, New York.
Blaschke, W. (1923): Vorlesungen über Differentialgeometrie und geometrische Grundlagen
von Einsteins Relativitätstheorie, Vols. 1-3. Springer-Verlag, Berlin.
Blaschke, W. (1946): See Blaschke, W. (1957).
Blaschke, W. (1950): Einführung in die Differentialgeometrie. Springer-Verlag, Berlin.
Blaschke, W. (1957): Reden und Reisen eines Geometers. Verl. d. Wiss., Berlin.
Boffi, V. and Neunzert, H. [eds.] (1984): Applications of Mathematics in Technology.
Teubner, Stuttgart.
Bogoljubov, N. and Mitropolskii, J. (1965): Asymptotische Methoden in der Theorie
nichtlinearer Schwingungen. Akademie-Verlag, Berlin.
Bogoljubov, N. and Sirkov, D. (1973): Introduction to Quantum Field Theory. Nauka,
Moscow (Russian). (English edition: New York, 1979.)
Bogoljubov, N. and Sirkov, D. (1980): Quantum Fields. Nauka, Moscow (Russian).
Bogoljubov, N. and Bogoljubov, N. Jr. (1980): Introduction to Quantum Statistical
Mechanics. World Scientific, Singapore.
Bogoljubov, N. [ed.] (1983): Mathematicians and Physicists. Nauka, Moscow
(Russian).
Bogoljubov, N. and Bogoljubov, N. Jr. (1984): Quantum Statistical Mechanics. Nauka,
Moscow (Russian).
Bohm, J. and Reichhardt, H. [eds.] (1986): Gauss-Riemann-Minkowski (Selected
Papers). Teubner, Leipzig.
Boltzmann, L. (1871): Analytischer Beweis des zweiten Hauptsatzes der mechanischen
Wärmetheorie aus dem Satz über das Gleichgewicht der lebendigen Kraft. Wiener
Berichte 63, 712-733.
Boltzmann, L. (1872): Weitere Studien über das Wärmegleichgewicht zwischen Gasmolekülen.
Wiener Berichte 66, 275-361.
Boltzmann, L. (1873): Cf. Cohen, E. and Thirring, W. [eds.] (1973).
Boltzmann, L. (1909): Wissenschaftliche Abhandlungen (Collected Works). Edited by
F. Hasenöhrl, Vols. 1-3.
Bolzano, B. (1817): Rein analytischer Beweis des Lehrsatzes, daß zwischen je zwei
Werten, die ein entgegengesetztes Resultat gewähren, wenigstens eine reelle Wurzel
der Gleichung liegt. Prague. (Reprinted in: Ostwalds Klassiker, Vol. 153. Leipzig,
1905.)
Bona, J. and Smith, R. (1975): The initial-value problem for the Korteweg-de Vries
equation. Philos. Trans. Roy. Soc. London A278, 555-601.
Bonnet, O. (1848): Mémoire sur la théorie générale des surfaces. J. École Polytechnique
19, 1-146.
Borisovic, J. (1977): Nonlinear Fredholm maps and the Leray-Schauder theory. Uspekhi
Mat. Nauk 32, (4), 3-54.
Born, M. (1926): Quantenmechanik der Stoßvorgänge. Z. Phys. 37, 803-827.
Born, M. (1957): Physik im Wandel meiner Zeit. Vieweg, Braunschweig. (English edition:
Physics in My Generation. Springer-Verlag, Berlin, 1969.)
Born, M. and Infeld, L. (1967): Erinnerungen an Einstein. Union-Verlag, Berlin.
Börner, G. and Straumann, N. (1985): Das Modell des inflationären Universums.
Physikalische Blätter 41, 146-151.
Bothe, H. (1982): The ambient structure of expanding attractors. Math. Nachr. 107,
327-348.
Bourbaki, N. (1960): Elements d'histoire des mathematiques. Hermann, Paris.
Bratteli, O. and Robinson, D. (1979): Operator Algebras and Quantum Statistical
Mechanics, Vols. 1, 2. Springer-Verlag, New York.
Brekhovskikh, L. (1985): Mechanics of Continua and Wave Dynamics. Springer-Verlag,
New York.
Bresch, C. (1978): Zwischenstufe Leben-Evolution ohne Ziel? Piper, München.
Breuer, R. (1981): Kontakt mit den Sternen. Ullstein,
Brezis, H. (1972): Multiplicateur de Lagrange en torsion elastoplastique. Arch. Rat.
Mech. Anal. 49, 32-40.
Brezis, H. and Stampacchia, G. (1976): The hodograph method in fluid dynamics in the
light of variational inequalities. Arch. Rat. Mech. Anal. 61, 1-18.
Brezzi, F. (1978): Finite element approximations of the von Karman equations. RAIRO
Anal. Numer. 12, 303-312.
Brockhaus ABC der Chemie (1965): Lexikon der Chemie, Vols. 1, 2. Brockhaus-Verlag,
Leipzig.
Brockhaus ABC der Physik (1971): Lexikon der Physik, Vols. 1, 2. Brockhaus-Verlag,
Leipzig.
Bronstein, I. and Semendjaev, K. (1979): Taschenbuch der Mathematik, Vols. 1, 2.
Teubner, Leipzig. (English edition: Handbook of Mathematics, Van Nostrand,
New York, 1985.)
Brouwer, L. (1912): Über Abbildungen von Mannigfaltigkeiten. Math. Ann. 71, 97-115.
Brouwer, L. (1975): Collected Works. North-Holland, Amsterdam.
Browder, F. (1954): Strongly elliptic systems of differential equations. In: Bers, L. [ed.]
(1954): Contributions to the Theory of Partial Differential Equations, pp. 14-52.
University Press, Princeton, NJ.
Browder, F. (1976): Mathematical Developments Arising from Hilbert's Problems.
American Mathematical Society, New York.
Browder, F. (1976a): Problems of present-day mathematics. In: Browder, F. [ed.] (1976),
35-79.
Browder, F. [ed.] (1984): The Mathematical Heritage of Henri Poincaré. Proceedings
of the Symposia in Pure Mathematics, Vol. 39, Parts I, II. American Mathematical
Society, Providence, RI.
Browder, F. [ed.] (1986): Nonlinear Functional Analysis and its Applications, Vols. 1,
2. American Mathematical Society, Providence, RI.
Brown, A. (1935): Functional dependence. Trans. Amer. Math. Soc. 38, 379-394.
Buchner, M., Marsden, J., and Schecter, S. (1983): Examples for the infinite-dimensional
Morse lemma. SIAM J. Math. Anal. 14, 1054-1055.
Buchner, M., Marsden, J., and Schecter, S. (1983a): Applications of the blowing-up
construction and algebraic geometry to bifurcation problems. J. Differential Equa-
tions 48, 404-422.
Budó, A. (1956): Theoretische Mechanik. Verl. d. Wiss., Berlin.
Büdeler, W. (1981): Faszinierendes Weltall. Dt. Verlagsanstalt, Stuttgart.
Bühler, W. (1981): Gauss: A Biographical Study. Springer-Verlag, New York.
Bullough, R. and Caudrey, P. [eds.] (1980): Solitons. Springer-Verlag, New York.
Burger, E. (1959): Einführung in die Theorie der Spiele. De Gruyter, Berlin.
Busemann, A. and Föppl, O. (1928): Physikalische Grundlagen der Elastomechanik. In:
Geiger, H. and Scheel, K. [eds.] (1926), Vol. 6, 1-46.

Caffarelli, L., Kohn, R., and Nirenberg, L. (1982): Partial regularity of suitable weak
solutions of the Navier-Stokes equations. Comm. Pure Appl. Math. 35, 771-831.
Calogero, F. and Degasperis, A. (1982): Spectral Transform and Solitons. North-
Holland, Amsterdam.
Carelli, A. et al. [ed.] (1979): Astrofisica e cosmologia gravitazione quanti e relatività.
Giunti Barbera, Firenze, Italia.
Carleman, T. (1921): Über eine nichtlineare Randwertaufgabe bei der Gleichung Δu = 0.
Math. Z. 9, 35-43.
Carmeli, M. (1977): Group Theory and General Relativity. McGraw-Hill, New York.
Cartan, E. (1952): Oeuvres complètes, Vols. 1-4. Gauthier-Villars, Paris.
Cartan, E. (1955): Calcul différentiel: Les systèmes différentiels extérieurs et leurs
applications géométriques. Hermann, Paris.
Cauchy, A. (1827): De la pression ou tension dans un corps solide. Exercices de
mathématique, Vol. 2, 42-56 = Cauchy, A. (1882), II. Série, Vol. 7, 60-78.
Cauchy, A. (1828): Sur les équations qui expriment les conditions d'équilibre ou les lois
de mouvement intérieur d'un corps solide. Exercices de mathématique, Vol. 3,
160-187 = Cauchy, A. (1882), II. Série, Vol. 8, 253-277.
Cauchy, A. (1882/1970): Oeuvres complètes (Collected Works). I Série: Vols. 1-12; II
Série: Vols. 1-15. Gauthier-Villars, Paris.
Cebeci, T. and Bradshaw, P. (1984): Physical and Computational Aspects to Convective
Heat Transfer. Springer-Verlag, New York.
Cercignani, C. (1975): Theory and Applications of the Boltzmann Equation. Scottish
Academic Press, Edinburgh.
Chandrasekhar, S. (1939): An Introduction to the Study of Stellar Structure. Dover, New
York.
Chandrasekhar, S. (1961): Hydrodynamic and Hydromagnetic Stability. University
Press, Oxford, England.
Chandrasekhar, S. (1969): Ellipsoidal Figures of Equilibrium. Yale University Press,
New Haven, CT.
Chandrasekhar, S. (1983): The Mathematical Theory of Black Holes. Clarendon Press,
Oxford, England.
Chang, K. and Howes, F. (1984): Nonlinear Singular Perturbation Phenomena. Springer-
Verlag, New York.
Chavent, G. and Jaffre, J. (1986): Mathematical Methods and Finite Elements for
Reservoir Simulation. North-Holland, Amsterdam.
Chern, S. (1945): A simple intrinsic proof of the Gauss-Bonnet formula for closed
Riemannian manifolds. Ann. of Math. 45, 747-752.
Chern, S. (1959): Differentiable Manifolds. Lecture Notes. University Press, Chicago,
IL.
Chern, S. [ed.] (1984): Seminar on Nonlinear Partial Differential Equations. Springer-
Verlag, New York.
Chernoff, P. and Marsden, J. (1974): Properties of Infinite-Dimensional Hamiltonian
Systems. Lecture Notes in Mathematics, Vol. 425. Springer-Verlag, Berlin.
Chevalley, C. (1946): Theory of Lie Groups. University Press, Princeton, NJ.
Chillingworth, D., Marsden, J., and Wan, Y. (1983): Symmetry and bifurcation in
three-dimensional elasticity, I, II. Arch. Rat. Mech. Anal. 80, 295-331; 83, 363-395.
Chipot, M. (1984): Variational Inequalities and Flow in Porous Media. Springer-Verlag,
New York.
Choquet-Bruhat, Y. and York, J. (1980): The Cauchy problem. In: Held, A. [ed.] (1980),
Vol. 1, 99-136.
Choquet-Bruhat, Y. and Christodoulou, D. (1981): Existence of global solutions of the
Yang-Mills, Higgs, and spinor field equations in 3 + 1 dimensions. Ann. Sci. Ecole
Norm. Sup. IV. Ser., 14, 481-500.
Choquet-Bruhat, Y., DeWitt-Morette, C., and Dillard-Bleick, M. (1982): Analysis,
Manifolds, and Physics. North-Holland, Amsterdam.
Choquet-Bruhat, Y. (1984): Positive-energy theorems. In: DeWitt, B., and Stora, R.
[eds.] (1984), 739-784.
Chorin, A. (1973): Numerical study of slightly viscous flow. J. Fluid Mech. 57, 785-796.
Chorin, A. (1975): Lectures on Turbulence Theory. Publish or Perish, Boston.
Chorin, A. et al. (1977): Product formulas and numerical algorithms. Comm. Pure Appl.
Math. 31, 205-256.
Chorin, A. (1978): Vortex sheet approximation of boundary layers. J. Comput. Phys. 27,
428-442.
Chorin, A. and Marsden, J. (1979): A Mathematical Introduction to Fluid Dynamics.
Springer-Verlag, New York.
Chorin, A. (1982): The evolution of a turbulent vortex. Comm. Math. Phys. 83, 517-535.
Chow, S., Hale, J., and Mallet-Paret, J. (1975): Applications of generic bifurcation, I,
II. Arch. Rat. Mech. Anal. 59, 159-188; 62, 209-235.
Chow, S. and Mallet-Paret, J. (1978): The Fuller index and global Hopf bifurcation.
J. Differential Equations 29, 66-85.
Chow, S., Mallet-Paret, J., and Yorke, J. (1978): Finding zeros of maps: homotopy
methods that are constructive with probability one. Math. Comput. 32, 887-899.
Chow, S. and Hale, J. (1982): Methods of Bifurcation Theory. Springer-Verlag, New
York.
Christodoulou, D. and Klainerman, S. (1988): Cf. additional references.
Ciarlet, P. (1977): Numerical Analysis of the Finite Element Method. North-Holland,
Amsterdam.
Ciarlet, P. and Rabier, P. (1980): Les equations de von Karman. Lecture Notes in
Mathematics, Vol. 826. Springer-Verlag, Berlin.
Ciarlet, P. (1983): Lectures on Three-Dimensional Elasticity. Springer-Verlag, New
York.
Ciarlet, P. and Necas, J. (1984): Unilateral problems in nonlinear three-dimensional
elasticity (or how to carry a bottle of slivovitz). Arch. Rat. Mech. Anal. 87, 319-
338.
Claro, F. [ed.] (1985): Nonlinear Phenomena in Physics. Springer-Verlag, New York.
Clausius, R. (1865): Über verschiedene für die Anwendung bequeme Formen der
Hauptgleichungen der mechanischen Wärmetheorie. Pogg. Ann. 125, 353-400.
Cohen, E. and Thirring, W. [eds.] (1973): The Boltzmann Equation. Springer-Verlag,
New York.
Cole, J. and Cook, L. (1986): Transonic Aerodynamics. North-Holland, Amsterdam.
Concus, P. and Finn, R. [eds.] (1987): Variational Methods for Free Interfaces. Springer-
Verlag, New York.
Constantin, P. and Foias, C. (1985): Global Ljapunov exponents, Kaplan-Yorke formulas,
and the dimension of the attractors for two-dimensional Navier-Stokes equations.
Comm. Pure Appl. Math. 38, 1-27.
Constantin, P., Foias, C., and Temam, R. (1985): Attractors representing turbulent
flows. Memoirs Amer. Math. Soc., Vol. 53. American Mathematical Society,
Providence, RI.
Cornwell, J. (1984): Group Theory in Physics, Vols. 1, 2. Academic Press, New York.
Courant, R. and Robbins, H. (1941): What is Mathematics? University Press, Oxford,
England.
Courant, R. and Friedrichs, K. (1948): Supersonic Flow and Shock Waves. Interscience,
New York.
Courant, R. and Hilbert, D. (1959): Methods of Mathematical Physics, Vols. 1, 2.
Interscience, New York.
Craigie, N. [ed.] (1986): Theory and Detection of Magnetic Monopoles in Gauge
Theories. World Scientific, Singapore.
Crandall, M. and Rabinowitz, P. (1973): Bifurcation, perturbation of simple eigenvalues,
and linearized stability. Arch. Rat. Mech. Anal. 52, 161-180.
Crandall, M. and Rabinowitz, P. (1975): The Hopf bifurcation theorem. MRC Technical
Report, No. 1604. University of Madison, Madison, WI.
Crandall, M. and Rabinowitz, P. (1977): The Hopf bifurcation theorem. Arch. Rat.
Mech. Anal. 67, 53-72.
Crandall, M. and Lions, P. (1983): Viscosity solutions of Hamilton-Jacobi equations.
Trans. Amer. Math. Soc. 277, 1-42.
Crank, J. (1984): Free and Moving Boundary Problems. Clarendon Press, Oxford,
England.
Cronin, V. (1981): The View from Planet Earth-Man Looks at the Cosmos. Collins,
Glasgow. (German edition: Säulen des Himmels-die Weltbilder des Abendlandes.
Claassen, Düsseldorf.)
Curtis, W. and Miller, F. (1986): Differentiable Manifolds and Theoretical Physics.
Academic Press, New York.
Cycon, R., Froese, R., Kirsch, W., and Simon, B. (1986): Schrödinger Operators.
Springer-Verlag, New York.

Dacorogna, B. (1982): Weak Continuity and Weak Lower Semicontinuity of Nonlinear
Functionals. Lecture Notes in Mathematics, Vol. 922. Springer-Verlag, Berlin.
Daleckii, J. and Krein, M. (1970): Stability of Solutions of Differential Equations in
Banach Spaces. Nauka, Moscow (Russian). (English edition: American Mathematical
Society, Providence, RI.)
Darcy, H. (1856): Les fontaines publiques de la ville de Dijon. Dalmont, Paris.
Davis, P. and Hersh, R. (1981): The Mathematical Experience. Birkhäuser, Boston.
Davis, P. and Chinn, W. (1985): 3.1416 and All That. Birkhäuser, Boston.
Davydov, A. (1984): Solitons in Molecular Systems. Naukova Dumka, Kiev (Russian).
Debnath, L. [ed.] (1985): Advances in Nonlinear Waves. Pitman, London.
Debreu, G. (1959): Theory of Value. Wiley, New York.
De Rham, G. (1955): Variétés différentiables. Formes, courants, formes harmoniques.
Hermann, Paris.
DeWitt, B. and Stora, R. [eds.] (1984): Relativite, groupes et topologies, II. North-
Holland, Amsterdam.
DeWitt, B. (1984): The space-time approach to quantum field theory. In: DeWitt,
B. and Stora, R. [eds.] (1984), 381-738.
DeWitt, B. (1984a): Supermanifolds. University Press, Cambridge, England.
Dias, J. (1975): Variational inequalities for nonlinear maximal monotone operators in a
Hilbert space. Amer. J. Math. 97, 905-914.
Dias, J. and Hernandez, J. (1975): A Sturm-Liouville theorem for some odd multivalued
maps. Proc. Amer. Math. Soc. 53, 72-74.
Dickerson, R., Gray, H., and Haight, G. (1974): Chemical Principles. Benjamin, New
York. (German edition: de Gruyter, Berlin, 1978.)
Dickey, R. (1976): Bifurcation Problems in Nonlinear Elasticity. Pitman, London.
Dictionary of Mathematics (1961): Cf. Mathematisches Wörterbuch (1961).
Dierker, E. (1974): Topological Methods in Walrasian Economics. Lecture Notes in
Economics, Vol. 92, Springer-Verlag, New York.
Dieudonné, J. (1975): Grundzüge der modernen Analysis, Vols. 1-9. Verl. d. Wiss., Berlin.
(English edition: Foundations of Modern Analysis. Academic Press, New York
1960ff. French edition: Gauthier-Villars, Paris, 1968ff.)
Dieudonné, J. (1981): History of Functional Analysis. North-Holland, Amsterdam.
Dieudonné, J. (1982): The work of Bourbaki during the last thirty years. Notices Amer.
Math. Soc. 29, 618-623.
Dionne, P. (1962): Sur les problèmes de Cauchy bien posés. J. Analyse Math. 10, 1-90.
DiPerna, R. (1977): Decay of solutions of hyperbolic systems of conservation laws with
a convex extension. Arch. Rat. Mech. Anal. 64, 1-46.
DiPerna, R. (1979): Uniqueness of solutions of hyperbolic conservation laws. Indiana
Univ. Math. J. 28, 137-187.
DiPerna, R. (1983): Convergence of the viscosity method for isentropic gas dynamics.
Comm. Math. Phys. 91, 1-30.
DiPerna, R. and Majda, A. (1987): Concentrations in regularizations for two-dimensional
incompressible flow. Comm. Pure Appl. Math. 40, 301-346.
Dixon, W. (1978): Special Relativity. University Press, Cambridge, England.
Djubek, J., Kodnar, R., and Skaloud, M. (1983): Limit State of the Plate Elements of
Steel Structures. Birkhäuser, Boston.
Do, C. (1975): Problèmes de valeurs propres pour une inéquation variationnelle sur un cône.
C. R. Acad. Sci. Paris, Sér. A-B, 280, 45-48.
Do, C. (1976): The buckling of a thin elastic plate subjected to unilateral conditions. In:
Germain, P. and Nayroles, B. [eds.] (1976), 307-316.
Do, C. (1977): Bifurcation theory for elastic plates subjected to unilateral conditions.
J. Math. Anal. Appl. 60, 435-448.
Donaldson, S. (1983): An application of gauge theory to the topology of
four-dimensional manifolds. J. Diff. Geometry 18, 279-315.
Drazin, P. (1983): Solitons. University Press, Cambridge, England.
Dubin, D. (1974): Solvable Models in Algebraic Quantum Statistics. Clarendon Press,
Oxford, England.
Dubrovin, B., Novikov, S., and Fomenko, A. (1979): Modern Geometry: Methods and
Applications. Nauka, Moscow (Russian). (English edition: Springer-Verlag, New
York, 1984.)
Duffie, D. and Shafer, W. (1985): Equilibrium in incomplete markets. I. A basic model
of generic existence. J. Math. Economics 14, 285-300.
Dukas, H. and Hoffmann, B. (1972): Albert Einstein-Creator and Rebel. New York.
Dukas, H. and Hoffmann, B. (1979): Albert Einstein-the Human Side. University
Press, Princeton, NJ.
Duvaut, G. and Lions, J. (1972): Les inequations en mecanique et en physique. Dunod,
Paris.
Dwoyer, D. et al. (1985): Theoretical Approaches to Turbulence. Springer-Verlag, New
York.
Dyck, W. (1885): Beiträge zur Analysis situs. Berichte Sächs. Ges. Wiss. Leipzig 37,
314-325.
Dyson, J. (1979): Disturbing the Universe. Harper & Row, New York. (German edition:
Innenansichten-Erinnerungen an die Zukunft. Birkhäuser, Basel, 1981.)

Eaves, C. et al. [eds.] (1982): Homotopy Methods and Global Convergence. Plenum,
New York.
Ebeling, W. (1976): Strukturbildung bei irreversiblen Prozessen. Teubner, Leipzig.
Ebeling, W. and Feistel, R. (1982): Physik der Selbstorganisation und Evolution.
Akademie-Verlag, Berlin.
Ebeling, W. and Klimontowitsch, A. (1985): Self-Organization and Turbulence. Teubner,
Leipzig.
Ebin, D. and Marsden, J. (1970): Groups of diffeomorphisms and the motion of an
incompressible fluid. Ann. of Math. 92, 102-163.
Ebin, D., Fischer, A., and Marsden, J. (1972): Diffeomorphism groups, hydrodynamics
and relativity. In: Vanstone, J. [ed.], Proceedings of the 13th Biennial Seminar of
the Canadian Mathematical Congress, pp. 135-279.
Ebin, D. (1979): The initial-value problem for subsonic fluid motion. Comm. Pure Appl.
Math. 32, 1-19.
Ebin, D. and Saxton, R. (1986): The initial-value problem for elasto-dynamics of
incompressible bodies. Arch. Rat. Mech. Anal. 94, 15-38.
Eckhaus, W. (1973): Matched Asymptotic Expansions and Singular Perturbations.
North-Holland, Amsterdam.
Eckmann, J. and Seneor, R. (1976): The Maslov-WKB-method for the (an)harmonic
oscillator. Arch. Rat. Mech. Anal. 61, 153-173.
Efimov, N. (1957): Flächenverbiegung im Großen. Akademie-Verlag, Berlin.
Einstein, A. (1905): Zur Elektrodynamik bewegter Körper. Ann. Physik 17, 891-921.
Einstein, A. (1905a): Ist die Trägheit eines Körpers von seinem Energieinhalt abhängig?
Ann. Physik 18, 639-641.
Einstein, A. (1905b): Über einen die Erzeugung und Verwandlung des Lichts betreffenden
Gesichtspunkte. Ann. Physik 17, 132-148.
Einstein, A. (1905c): Die von der molekular-kinetischen Theorie der Wärme geforderte
Behandlung von in ruhenden Flüssigkeiten suspendierten Teilchen. Ann. Physik 17,
549-560.
Einstein, A. (1915): Zur allgemeinen Relativitätstheorie. Die Feldgleichungen der
Gravitation. Sitzungsber. Preuss. Akad. Wiss. Berlin vom 11.11.1915 und 2.12.1915.
Einstein, A. (1916): Die Grundlagen der allgemeinen Relativitätstheorie. Ann. Physik 49,
769-822.
Einstein, A. (1920), (1921): Cf. Frank, P. (1949).
Einstein, A. (1953): The Meaning of Relativity. University Press, Princeton, NJ.
Einstein, A. (1955): Cf. Einstein, A. (1965) and Melcher, H. (1979).
Einstein, A. (1956): Grundzüge der Relativitätstheorie. Vieweg, Braunschweig.
Einstein, A. (1960): Collected Writings of Albert Einstein. Readex Microprint, New
York. Cf. also the additional references.
Einstein, A. (1965): Mein Weltbild. Frankfurt/Main.
Eisenreich, G. and Sube, R. (1973): Wörterbuch Physik (Dictionary of Physics in
English, French, German and Russian). Verlag-Technik, Berlin.
Eisenreich, G. and Sube, R. (1982): Wörterbuch Mathematik (Dictionary of Mathematics
in English, French, German and Russian). Verlag-Technik, Berlin.
Ekeland, I. and Temam, R. (1974): Analyse convexe et problèmes variationnels. Dunod,
Paris. (English edition: North-Holland, Amsterdam, 1976.)
Elliott, C. and Ockendon, J. (1982): Weak and Variational Methods for Moving Bound-
aries. Pitman, London.
Ellis, R. (1985): Entropy, Large Deviations, and Statistical Physics. Springer-Verlag,
New York.
Emden, R. (1938): Why we have heating? Nature 141, 908.
Encyclopedia of Astronomy and Space (1976): Edited by I. Ridpath. Macmillan,
London.
Encyclopedia of Mathematics and Its Applications (1976): Edited by G. Rota, Vols. 1ff.
Addison-Wesley, Reading, MA.
Encyclopedia of Mathematical Sciences (1987): Vols. 1ff. Springer-Verlag, New York.
Encyclopedic Dictionary of Mathematics (1977): Vols. 1, 2. MIT Press, Cambridge, MA.
Encyclopedic Dictionary of Physics (1977): Vols. 1-9. Supplementary Volumes 1-4.
Edited by J. Thewlis. MIT Press, Cambridge, MA.
Enzyklopädie der mathematischen Wissenschaften (1904): Vols. 1ff. Teubner, Leipzig.
Euklid (325 B.C.): Elemente. Ostwalds Klassiker, Vols. 235, 236. Akad. Verlagsges.,
Leipzig.
Euler, L. (1744): Cf. Bernoulli, J. and Euler, L. (1691/1744).
Euler, L. (1755): Principes généraux du mouvement des fluides. Hist. de l'Acad. de
Berlin.
Euler, L. (1911): Opera omnia (Collected Works), Vols. 1ff. Leipzig-Zürich since 1911,
Lausanne since 1942.
Evans, L. (1986): Quasiconvexity and partial regularity in the calculus of variations.
Arch. Rat. Mech. Anal. 95, 227-252.

Faddeev, L. and Takhtadjan (1986): The Hamiltonian Approach in the Theory of
Solitons. Nauka, Moscow (Russian). (English edition: Springer-Verlag, New York,
1987.)
Fan, Ky (1952): Fixed-point and minimax theorems in locally convex spaces. Proc. Nat.
Acad. Sci. U.S.A. 74, 4749-4751.
Fan, Ky (1972): A minimax inequality and applications. In: Shisha, O. [ed.] (1972),
Inequalities, pp. 103-114. Academic Press, New York.
Federer, H. (1969): Geometric Measure Theory. Springer-Verlag, Berlin.
Feigenbaum, M. (1980): Universal behaviour in nonlinear systems. Los Alamos Science
1, 4-27.
Feistauer, M. (1984): On irrotational flow through cascades of profiles in a layer of
variable thickness. Applikace Mathematiky 29, 423-458.
Feistauer, M., Mandel, J., and Necas, J. (1984): Entropy regularization of the transonic
potential flow problem. Comm. Mat. Univ. Carolinae 25 (3), 431-443.
Feistauer, M. and Necas, J. (1985): On the solvability of transonic potential flow
problems. Z. Anal. Anwendungen 4, 305-329.
Fermi, E., Pasta, J., and Ulam, S. (1955): Studies of nonlinear problems. Los Alamos
Report, LA, 1940, 1955. Reproduced in: Newell, A. [ed.] (1974), Nonlinear Wave
Motion. American Mathematical Society, Providence, RI.
Ferrara, S. (1983): Aspects of supergravity theories. In: Schmutzer, E. [ed.] (1983),
207-224.
Ferraris, M. and Kijowski, J. (1982): Unified theory of electromagnetic and gravitational
interactions. Gen. Relativity Gravitation 14, 37-47.
Ferraris, M. and Kijowski, J. (1982a): On the equivalence of the relativistic theories of
gravitation. Gen. Relativity Gravitation 14, 165-180.
Ferris, T. (1977): The Red Limit. New York. (German edition: Birkhäuser, Basel, 1982.)
Feynman, R., Leighton, R., and Sands, M. (1963): The Feynman Lectures on Physics.
Addison-Wesley, Reading, MA. (German-English edition: Oldenbourg, München,
1974.)
Fichera, G. (1964): Problemi elastostatici con vincoli unilaterali: il problema di
Signorini con ambigue condizioni al contorno. Mem. Accad. Naz. Lincei 8, 91-140.
Fichera, G. (1972): Existence theorems in elasticity. In: Flügge, S. [ed.] (1956), Vol.
VIa/2, 347-390.
Fichera, G. (1972a): Boundary value problems of elasticity with unilateral constraints.
In: Flügge, S. [ed.] (1956), Vol. VIa/2, 391-424.
Fick, E. and Sauermann, G. (1982): Quantenstatistik dynamischer Prozesse, Vols. 1, 2.
Geest & Portig, Leipzig.
Fiedler, B. (1986): Global Hopf bifurcation of two-parameter flows. Arch. Rat. Mech.
Anal. 94, 59-81.
Field, G. and Chaisson, E. (1986): Das unsichtbare Universum. Birkhäuser, Boston.
(English edition: The Invisible Universe. Boston, 1985).
Fife, P. (1970): The Benard problem for general fluid dynamical equations and remarks
on the Boussinesq approximation. Indiana Univ. Math. J. 20, 303-326.
Fife, P. (1978): Asymptotic states for equations of reaction and diffusion. Bull. Amer.
Math. Soc. 84, 693-726.
Fife, P. (1979): Mathematical Aspects of Reacting and Diffusing Processes. Lecture
Notes in Biomathematics, Vol. 28. Springer-Verlag, Berlin.
Finn, R. and Gilbarg, D. (1957): Three-dimensional subsonic flow, and asymptotic
estimates for elliptic partial differential equations. Acta Math. 98, 265-296.
Finn, R. (1959): On steady state solutions of the Navier-Stokes partial differential
equations. Arch. Rat. Mech. Anal. 3, 381-396.
Finn, R. (1961): On the steady-state solutions of the Navier-Stokes equations. Acta
Math. 105, 197-244.
Finn, R. (1965): Stationary solutions of Navier-Stokes equations. Proc. Sympos. Appl.
Math. 17, 121-153.
Finn, R. (1985): Equilibrium Capillary Surfaces. Springer-Verlag, New York.
Fischer, A. and Marsden, J. (1972): The Einstein evolution equations as a first-order
symmetric hyperbolic system. Comm. Math. Phys. 28, 1-38.
Fischer, A. and Marsden, J. (1972a): The Einstein equations of evolution-a geometric
approach. J. Math. Phys. 13, 546-568.
Fischer, A., Marsden, J., and Moncrief, V. (1982): The structure of the space of solutions
of Einstein equations, II. Several Killing fields and the Einstein-Yang-Mills equations.
Ann. Physics 144, 81-106.
Floquet, G. (1883): Sur les équations différentielles linéaires à coefficients périodiques.
Ann. Sci. École Norm. Sup., Sér. 2, 12, 47-89.
Flügge, S. [ed.] (1956): Handbuch der Physik (Handbook of Physics), Vol. 1ff.
Springer-Verlag, Berlin.
Fock, V. (1960): Theorie von Raum, Zeit und Gravitation. Akademie-Verlag, Berlin.
Foias, C. (1973): Statistical study of Navier-Stokes equations, I, II. Rend. Sem. Mat.
Univ. Padova 48, 219-348; 49, 9-123.
Foias, C. and Temam, R. (1977): Structure of the set of stationary solutions of the
Navier-Stokes equations. Comm. Pure Appl. Math. 30, 149-164.
Foias, C. and Temam, R. (1982): Homogeneous statistical solutions of Navier-Stokes
equations. Indiana Univ. Math. J. 29, 913-957.
Forster, W. [ed.] (1980): Numerical Solutions of Highly Nonlinear Problems. North-
Holland, New York.
Fortin, M. and Glowinski, R. [eds.] (1983): Augmented Lagrangian's Methods: Applica-
tions to the Numerical Solutions of Boundary Value Problems. North-Holland, New
York.
Frampton, P. (1987): Gauge Field Theories. Benjamin, Reading, MA.
Frank, G. and Meyer-Spasche, R. (1981): Computations of transitions in Taylor vortex
flows. Z. Angew. Math. Phys. 32, 710-720.
Frank, P. (1949): Einstein-sein Leben und seine Zeit. München.
Frank, P. and Mises, R.v. (1962): Die Differential- und Integralgleichungen der Mechanik
und Physik, Vols. 1, 2. Vieweg, Braunschweig.
Franke, H. (1969): Lexikon der Physik. Stuttgart.
Fredrickson, A. (1964): Principles and Applications of Rheology. Prentice-Hall,
Englewood Cliffs, NJ.
Freed, D. and Uhlenbeck, K. (1984): Instantons and Four-Manifolds. Springer-Verlag,
New York.
Fridman, A. and Polyachenko, V. (1984): Physics of Gravitating Systems, Vols. 1, 2.
Springer-Verlag, New York.
Friedlander, S. (1980): An Introduction to the Mathematical Theory of Geophysical
Fluid Dynamics. North-Holland, Amsterdam.
Friedman, A. (1982): Variational Principles and Free Boundary Value Problems. Wiley,
New York.
Friedrich, H. (1981): The analytic characteristic initial-value problem for Einstein's
vacuum field equations as an initial-value problem for a first-order quasilinear
symmetric hyperbolic system. Proc. Roy. Soc. London A378, 401-421.
Friedrichs, K. (1927): Rand- und Eigenwertprobleme aus der Theorie der elastischen
Platten. Math. Ann. 98, 206-247.
Friedrichs, K. (1929): Ein Verfahren das Minimum eines Integrals als das Maximum eines
anderen Ausdrucks darzustellen. Göttinger Nachrichten.
Friedrichs, K. and Stoker, J. (1942): Buckling of the circular plate beyond the critical
thrust. J. Appl. Mech. 9, A7-A14.
Friedrichs, K. and Hyers, D. (1954): The existence of solitary waves. Comm. Pure Appl.
Math. 7, 517-550.
Friedrichs, K. (1954): Symmetric hyperbolic linear differential equations. Comm. Pure
Appl. Math. 7, 345-392.
Fritzsch, H. (1982): Quarks: Urstoff unserer Welt. Piper, München.
Fritzsch, H. (1983): Vom Urknall zum Verfall. Piper, München.
Fröhlich, J. (1983): Scaling and Self-Similarity in Physics. Renormalization in Statistical
Mechanics and Dynamics. Birkhäuser, Boston.
Frolov, V. (1976): Black holes and quantum processes. Uspekhi Fiz. Nauk 18, 473-503
(Russian).
Frost, W. and Moulden, T. (1977): Handbook of Turbulence. Plenum, New York.
Fučík, S. and Kufner, A. [eds.] (1979): Nonlinear Analysis. Function Spaces and Applications.
Teubner, Leipzig.
Fujita, H. and Kato, T. (1964): On the Navier-Stokes initial value problem. Arch. Rat.
Mech. Anal. 16, 269-315.

Gajewski, H. (1970): Iterations-, Projektions- und Projektions-Iterationsverfahren zur
Berechnung visco-plastischer Strömungen. Z. Angew. Math. Mech. 50, 485-490.
Gajewski, H. (1970a): Über einige Fehlerabschätzungen bei Gleichungen mit monotonen
Potentialoperatoren in Banachräumen. Monatsberichte Akad. d. Wiss. Berlin,
571-579.
Galdi, G. and Maremonti, P. (1986): Monotonic decreasing and asymptotic behavior of
the kinetic energy for weak solutions of the Navier-Stokes equations in external
domains. Arch. Rat. Mech. Anal. 94, 253-266.
Gale, D. (1955): The law of supply and demand. Math. Scand. 3, 155-169.
Galilei, G. (1638): Discorsi e dimostrazioni matematiche. Firenze, Italia.
Galilei, G. (1890/1909): Le opere di Galileo. Edited by A. Favaro, Vols. 1-20. Firenze,
Italia.
Gantmacher, F. and Krein, M. (1960): Oszillationsmatrizen, Oszillationskerne und kleine
Schwingungen mechanischer Systeme. Akademie-Verlag, Berlin.
Garabedian, P. (1960): Boundary Layer Theory. McGraw-Hill, New York.
Garabedian, P. and Korn, D. (1971): Analysis of transonic airfoils. Comm. Pure Appl.
Math. 24, 841-851.
Garabedian, P. (1972): See Bauer, F., Garabedian, P., and Korn, D. (1972).
Garcia, C. and Zangwill, W. (1983): Pathways to Solutions, Fixed Points and Equilibria.
Prentice-Hall, Englewood Cliffs, NJ.
Gardiner, W. (1985): Handbook of Stochastic Methods for Physics, Chemistry and the
Natural Sciences. Springer-Verlag, New York.
Gates, S. et al. (1983): Superspace. Benjamin, Reading, MA.
Gauss, C. (1827): Disquisitiones generales circa superficies curvas. In: Gauss, C. (1863),
Vol. 5, 217-258, 341-347. (German translation: Allgemeine Flächentheorie.
Ostwalds Klassiker, Vol. 5, Leipzig, 1889.)
Gauss, C. (1863): Werke (Collected Works), Vols. 1-12. Göttingen, 1863-1929.
Gautreau, R. and Savin, W. (1978): Theory and Problems in Modern Physics. McGraw-
Hill, New York.
Geckeler, J. (1928): Elastostatik. In: Geiger, H. and Scheel, K. [eds.] (1926), Vol. 6,
141-308.
Geiger, H. and Scheel, K. [eds.] (1926): Handbuch der Physik, Vols. 1-24. Springer-Verlag,
Berlin, 1926ff.
Geiringer, H. (1972): Ideal plasticity. In: Flügge, S. [ed.] (1956), Vol. VIa/3, 403-533.
Georg, K. (1981): On tracing an implicitly defined curve by quasi-Newton steps and
calculating bifurcation by local perturbations. SIAM J. Sci. Statist. Comput. 2,
35-50.
Georg, K. (1981a): Numerical integration of the Davidenko equation. In: Peitgen, H. and
Walther, H. [eds.] (1981), 128-161.
Georgi, H. (1984): Weak Interactions and Modern Particle Theory. Benjamin, Reading,
MA.
Gerhardt, C. (1976): On the existence and uniqueness of a warpening function in the
elasto-plastic torsion of a cylindrical bar with multiply connected cross section. In:
Germain, P. and Nayroles, B. [eds.] (1976), 328-342.
Germain, P. and Nayroles, B. [eds.] (1976): Applications of Functional Analysis to
Problems in Mechanics. Lecture Notes in Mathematics, Vol. 503. Springer-Verlag,
Berlin.
Gerthsen, C. (1982): Physik (14th edn). Springer-Verlag, New York.
Ghil, M. and Childress, S. (1987): Topics in Geophysical Fluid Dynamics: Atmospheric
Dynamics, Dynamo Theory, and Climate Dynamics. University Press, Oxford,
England.
Giacaglia, G. (1972): Perturbation Methods in Nonlinear Systems. Springer-Verlag,
New York.
Giaquinta, M. and Hildebrandt, S. (1988): Calculus of Variations, Vols. 1-3 (to appear).
Gibbs, J. (1902): Elementary Principles in Statistical Mechanics. Yale University Press,
New Haven, CT.
Gilbarg, D. (1960): Jets and cavities. In: Flügge, S. [ed.] (1958), Vol. IX, Fluid Dynamics,
III, 311-445.
Gilkey, P. (1974): The Index Theorem and the Heat Equation. Publish or Perish, Boston.
Gilkey, P. (1984): Invariance Theory, the Heat Equation and the Atiyah-Singer Index
Theorem. Publish or Perish, Boston.
Gilmore, R. (1981): Catastrophe Theory for Scientists. Wiley, New York.
Girault, V. and Raviart, P. (1981): Finite Element Approximation of the Navier-Stokes
Equations. Lecture Notes in Mathematics, Vol. 749. Springer-Verlag, Berlin.
Girault, V. and Raviart, P. (1986): Finite Element Methods for Navier-Stokes Equa-
tions. Springer-Verlag, New York.
Gittel, H. (1987): Studies on transonic problems by nonlinear variational inequalities. Z.
Anal. Anwendungen 6, 449-458.
Gitterman, M. and Halpern, V. (1981): Qualitative Analysis of Physical Problems.
Academic Press, New York.
Glansdorff, P. and Prigogine, I. (1971): Thermodynamic Theory of Structure, Stability
and Fluctuations. Wiley, London.
Glicksberg, I. (1952): A further generalization of the Kakutani fixed-point theorem with
application to Nash equilibrium points. Proc. Amer. Math. Soc. 3, 170-174.
Glimm, J. (1965): Solutions in the large for nonlinear systems of conservation laws.
Comm. Pure Appl. Math. 18, 697-715.
Glowinski, R., Lions, J., and Trémolières, R. (1976): Analyse numérique des inéquations
variationnelles, Vols. 1, 2. Gauthier-Villars, Paris. (English edition: North-Holland,
Amsterdam, 1981.)
Glowinski, R. (1980): Lectures on Numerical Methods for Nonlinear Variational Prob-
lems. Springer-Verlag, New York.
Glowinski, R. (1983): Numerical solution of nonlinear boundary value problems by
variational methods. Applications. In: Proceedings of the International Congress of
Mathematicians in Warsaw (1983), Vol. 2, 1455-1508.
Glowinski, R. (1984): Numerical Methods for Nonlinear Variational Problems. Springer-
Verlag, New York.
Goering, H. et al. (1983): Singularly Perturbed Differential Equations. Akademie-
Verlag, Berlin.
Goldberg, S. (1984): Understanding Relativity. Birkhäuser, Boston.
Golubitsky, M. and Guillemin, V. (1973): Stable Mappings and Their Singularities.
Springer-Verlag, New York.
Golubitsky, M. (1978): An introduction to catastrophe theory and its applications. SIAM
Rev. 20, 352-387.
Golubitsky, M. and Schaeffer, D. (1979): A theory for imperfect bifurcation via singularity
theory. Comm. Pure Appl. Math. 32, 21-98.
Golubitsky, M. and Marsden, J. (1983): The Morse lemma in infinite dimensions via
singularity theory. SIAM J. Math. Anal. 14, 1037-1044.
Golubitsky, M. and Schaeffer, D. (1984): Singularities and Bifurcation Theory. Springer-
Verlag, New York.
Gompf, R. (1985): An infinite set of exotic ℝ⁴'s. J. Diff. Geometry 21, 317-328.
Green, G. (1839): On the laws of reflection and refraction of light at the common surface
of two non-crystallised media. Trans. Cambridge Philos. Soc. 7, 1-24.
Green, M. and Gross, D. [eds.] (1986): Unified String Theories. World Scientific,
Singapore.
Green, M. (1986): Superstrings and the unification of forces and particles. In: Ruffini,
R. [ed.] (1986), pp. 203-226.
Green, M., Schwarz, J., and Witten, E. (1987): Superstrings, Vols. 1, 2. University Press,
Cambridge, England.
Greenberg, W. et al. (1986): Boundary Value Problems in Abstract Kinetic Theory.
Birkhäuser, Boston.
Greiner, W. and Müller, B. (1984): Theoretische Physik, Vols. 1-10. Harri Deutsch,
Frankfurt/Main, 1984ff. (Cf. also the additional references.)
Greiner, W. and Müller, B. (1986): Die elektroschwache Wechselwirkung. Cf. Greiner,
W. and Müller, B. (1984), Vol. 8.
Greub, W., Halperin, S., and Vanstone, R. (1972): Connections, Curvature, and Cohomo-
logy, Vols. 1-3. Academic Press, New York.
Grib, A. et al. (1980): Quantum Effects in Strong External Fields. Atomizdat, Moscow
(Russian).
Griffiths, P. (1983): Exterior Differential Systems and the Calculus of Variations.
Birkhäuser, Boston.
Grincenko, V. and Ulitko, A. (1985): Three-dimensional Problems in Elasticity and
Plasticity, Vols. 1-6. Naukova Dumka (Russian).
Gröger, K. (1979): Initial-value problems for elastoplastic and elasto-viscoplastic systems.
In: Fučík, S. and Kufner, A. [eds.] (1979), 95-127.
Gromov, M. and Rohlin, V. (1970): Embeddings and immersions in Riemannian geome-
try. Uspekhi Mat. Nauk 25 (5), 3-62 (Russian).
Gromov, M. (1986): Partial Differential Relations. Springer-Verlag, New York.
Groot, S. de and Mazur, P. (1962): Nonequilibrium Thermodynamics. North-Holland,
Amsterdam.
Guckenheimer, J. and Holmes, P. (1983): Nonlinear Oscillations, Dynamical Systems,
and Bifurcations. Springer-Verlag, New York.
Guderley, G. (1957): Theory of Transonic Flow. Pergamon Press, New York.
Guillemin, V. and Pollack, A. (1974): Differential Topology. Prentice-Hall, Englewood
Cliffs, NJ.
Günter, N. (1957): Potentialtheorie. Teubner, Leipzig.
Günther, M. (1987): Ein einfacher Beweis für das nichtlineare Molodensky-Problem.
Math. Nachr. 130, 251-265.
Günther, M. (1989): Zum Einbettungssatz von J. Nash. Math. Nachr. 144, 165-187.
Gurtin, M. (1972): The linear theory of elasticity. In: Flügge, S. [ed.] (1956), Vol. VIa/2,
1-296.
Gurtin, M. (1981): Topics in Finite Elasticity. SIAM, Philadelphia.
Guth, A. (1981): The inflationary universe. Phys. Rev. D23, 347.
Gwinner, J. (1981): On fixed points and variational inequalities: A circular tour. Non-
linear Anal. 5, 565-583.

Haar, A. and Kármán, T.v. (1909): Theorie der Spannungszustände in plastischen und
sandartigen Medien. Göttinger Nachrichten 1909, 204-218.
Haber, H. [ed.] (1987): From the Planck Scale to the Weak Scale: Toward a Theory of
the Universe. World Scientific, Singapore.
Hagihara, Y. (1976): Celestial Mechanics, Vols. 1-5. MIT Press, Cambridge, MA.
Hajko, V. and Schilling, H. (1975): Physik in Beispielen, Vols. 1-6. Fachbuchverlag,
Leipzig.
Haken, H. (1977): Synergetics, an Introduction: Nonequilibrium, Phase Transitions and
Self-Organisation in Physics, Chemistry and Biology. Springer-Verlag, New York.
Haken, H. (1983): Advanced Synergetics. Springer-Verlag, New York.
Hale, J. (1980): Ordinary Differential Equations. Wiley, New York.
Hale, J. (1981): Topics in Dynamic Bifurcation Theory. American Mathematical Society,
Providence, RI.
Halmos, P. (1983): Selecta. Expository Writing. Springer-Verlag, New York.
Halmos, P. (1985): I Want To Be A Mathematician. Springer-Verlag, New York.
Halphen, B. and Nguyen, Q. (1975): Sur les matériaux standards généralisés. Méc., Paris
14, 39-63.
Haltiner, G. and Williams, R. (1980): Numerical Weather Prediction and Dynamic
Meteorology. Wiley, New York.
Handbook of Applicable Mathematics (1980): Edited by W. Ledermann, Vols. 1-6.
Wiley, Chichester, 1980ff.
Handbook of Chemistry and Physics (1980): The Chemical Rubber, Cleveland, OH.
Handbook of Mathematics (1961): Cf. Mathematisches Wörterbuch (1961).
Hänsel, H. and Neumann, W. (1972): Physik: Eine Darstellung der Grundlagen, Vols.
1-7. Verl. d. Wiss., Berlin.
Harrison, E. (1981): Cosmology. University Press, Cambridge, England.
Hartman, P. and Stampacchia, G. (1966): On some nonlinear functional differential
equations. Acta Math. 115, 271-310.
Hartmann, F. (1985): The Mathematical Foundations of Structural Mechanics. Springer-
Verlag, New York.
Haslinger, J. (1983): Approximations of the Signorini problem with friction obeying the
Coulomb law. Math. Methods Appl. Sci. 5, 422-437.
Hassard, B., Kazarinoff, N., and Wang, Y. (1981): Theory and Applications of Hopf
Bifurcation. University Press, Cambridge, England.
Hawking, S. (1972): Black holes in general relativity. Comm. Math. Phys. 25, 152-166.
Hawking, S., Bardeen, J., and Carter, B. (1973): The four laws of black hole mechanics.
Comm. Math. Phys. 31, 161-170.
Hawking, S. and Ellis, G. (1973): The Large Scale Structure of Space-Time. University
Press, Cambridge, England.
Hawking, S. (1975): Particle creation by black holes. Comm. Math. Phys. 43, 199-220.
Hawking, S. and Israel, W. [eds.] (1979): General Relativity: An Einstein Centenary
Survey. University Press, Cambridge, England.
Hawking, S. and Rocek, M. [eds.] (1981): Superspace and Supergravity. University
Press, Cambridge, England.
Hawking, S. (1984): Quantum cosmology. In: DeWitt, B. and Stora, R. [eds.] (1984),
330-380.
Heisenberg, W. (1925): Quantenmechanische Umdeutung kinematischer und mecha-
nischer Beziehungen. Z. Phys. 33, 879-884.
Heisenberg, W. (1927): Anschaulicher Inhalt der quantentheoretischen Kinematik und
Mechanik. Z. Phys. 43, 172-199.
Heisenberg, W. (1937): Die physikalischen Prinzipien der Quantenmechanik. Hirzel,
Leipzig.
Heisenberg, W. (1968): Nonlinear problems in physics. In: Zabusky, N. [ed.] (1968),
1-18.
Heisenberg, W. (1977): Schritte über Grenzen. Piper, München.
Heisenberg, W. (1977a): Tradition in der Wissenschaft. Piper, München.
Heisenberg, W. (1978): Physik und Philosophie. Hirzel, Stuttgart.
Heisenberg, W. (1980): Wandlungen in den Grundlagen der Wissenschaft. Hirzel, Stutt-
gart.
Heisenberg, W. (1981): Der Teil und das Ganze. Piper, München.
Heisenberg, W. (1984): Gesammelte Werke/Collected Works. Springer-Verlag, New
York.
Held, A. [ed.] (1980): General Relativity and Gravitation, Vols. 1, 2. Plenum, New York.
Helgason, S. (1962): Differential Geometry and Symmetric Spaces. Academic Press, New
York.
Helmholtz, H. (1858): Über Integrale der hydrodynamischen Gleichungen, welche den
Wirbelbewegungen entsprechen. Crelle Journal.
Henbest, N. and Marten, M. (1984): Die neue Astronomie. Birkhäuser, Basel.
Hencky, H. (1924): Zur Theorie plastischer Deformationen. Z. Angew. Math. Mech. 4,
323-334.
Henry, D. (1981): Geometric Theory of Semilinear Parabolic Equations. Lecture Notes
in Mathematics, Vol. 840. Springer-Verlag, Berlin.
Herrmann, D. (1978): Entdecker des Himmels. Urania, Leipzig.
Heywood, J. (1980): The Navier-Stokes equations: on the existence, regularity and decay
of solutions. Indiana Univ. Math. J. 19, 639-681.
Hilbert, D. (1897): Die Theorie der algebraischen Zahlkörper. In: Hilbert, D. (1932), Vol.
1, 63-527.
Hilbert, D. (1901): Über die Flächen von konstanter Gaußscher Krümmung. Trans. Amer.
Math. Soc. 2, 87-99.
Hilbert, D. (1903): Über die Grundlagen der Geometrie. Teubner, Leipzig. (12th edn,
Teubner, Stuttgart, 1977.)
Hilbert, D. (1915): Die Grundlagen der Physik. Nachr. Akad. Wiss. Göttingen, Math.-
phys. Kl. 1915, 395-407; 1917, 53-76.
Hilbert, D. (1930): Naturerkenntnis und Logik. Naturwissenschaften, 959-963.
Hilbert, D. (1932): Gesammelte Abhandlungen (Collected Works), Vols. 1-3. Springer-
Verlag, Berlin.
Hilbig, H. (1964): Existenzsätze für einige Totwasserprobleme der Hydrodynamik. Ber.
Sächs. Akademie Wiss. Leipzig, Math.-Naturwiss. Klasse 105.
Hilbig, H. (1987): Weitere Existenzsätze für Totwasserprobleme der Hydrodynamik. Ber.
Sächs. Akademie Wiss. Leipzig, Math.-Naturwiss. Klasse 119/5.
Hildebrandt, S. (1985): Harmonic mappings of Riemannian manifolds. In: Giusti, E. [ed.]
(1985), Harmonic Mappings and Minimal Immersions. Lecture Notes in Mathematics,
Vol. 1161, pp. 1-117, Springer-Verlag, Berlin.
Hildebrandt, S. and Tromba, T. (1985): Mathematics and Optimal Form. Scientific
American Books, New York. (German edition: Panoptimum, Spektrum der Wissenschaft,
1987.)
Hildebrandt, S. and Giaquinta, M. (1988): Cf. Giaquinta, M. (1988).
Hill, R. (1950): The Mathematical Theory of Plasticity. Clarendon Press, Oxford,
England.
Hillel, D. (1980): Fundamentals of Soil Physics. Academic Press, New York.
Hilton, P. (1974): Cf. Otte, M. [ed.] (1974).
Hilton, P. and Young, G. [eds.] (1981): New Directions in Applied Mathematics.
Springer-Verlag, New York.
Hinchliffe, I. [ed.] (1987): Cosmology and Particle Physics. World Scientific, Singapore.
Hirsch, M. (1976): Differential Topology. Springer-Verlag, New York.
Hirsch, M., Pugh, C., and Shub, M. (1977): Invariant Manifolds. Lecture Notes in
Mathematics, Vol. 583. Springer-Verlag, Berlin.
Hirzebruch, F. (1974): Cf. Otte, M. [ed.] (1974).
Hlaváček, I., Haslinger, J., Nečas, J., and Lovíšek, J. (1986): Solution of Variational
Inequalities in Mechanics. Mir, Moscow (Russian).
Hodge, W. (1952): The Theory and Applications of Harmonic Integrals. University
Press, Cambridge, England.
Hoffmann, K. (1981): Fixpunktprinzipien und freie Randwertaufgaben. In: Lecture Notes
in Mathematics, Vol. 878, 169-181. Springer-Verlag, Berlin.
Hofmann, R. (1986): Eine Klasse von Extremalproblemen für die Norm selbstadjungierter
vollstetiger Operatoren im Hilbertraum mit Anwendung auf die Platte und die
schwingende Saite. Dissertation B, Karl-Marx-Universität, Leipzig.
Hofstadter, D. (1979): Gödel, Escher, Bach: An Eternal Golden Braid. Basic Books, New
York.
Holmes, P. and Marsden, J. (1981): Chaotic oscillations of a forced beam. Arch. Rat.
Mech. Anal. 76, 135-166.
Holt, M. (1984): Numerical Methods in Fluid Dynamics. Springer-Verlag, New York.
Hooke, R. (1678): Lectures de Potentia Restitutiva or of Spring Explaining the Power of
Springing Bodies, London = Gunther, R. (1931), Early Science in Oxford 8,
331-356.
Hopf, E. (1942): Abzweigung einer periodischen Lösung von einer stationären Lösung
eines Differentialgleichungssystems. Ber. Sächs. Akad. Wiss. Leipzig, Math.-phys.
Kl. 94, 1-22.
Hopf, E. (1948): A mathematical example displaying the features of turbulence. Comm.
Pure Appl. Math. 1, 303-322.
Hopf, E. (1950): The partial differential equation u_t + u u_x = μ u_xx. Comm. Pure Appl.
Math. 3, 201-230.
Hopf, E. (1951): Über die Anfangswertaufgabe für die hydrodynamischen
Grundgleichungen. Math. Nachr. 4, 213-231.
Hopf, E. (1952): Statistical hydrodynamics and functional calculus. J. Rat. Mech. Anal.
1, 87-123.
Hörmander, L. (1967): An Introduction to Complex Analysis in Several Variables. Van
Nostrand, Princeton, NJ.
Hörmander, L. (1976): The boundary-value problems of physical geodesy. Arch. Rat.
Mech. Anal. 62, 1-52.
Hornung, U. (1983): A unilateral boundary value problem for unsteady water flow in
porous media. Meth. Verf. Math. Phys. 25, 59-94.
Hoyle, F. and Wickramasinghe, N. (1979): Life Cloud. Sphere Books, New York.
Huang, K. (1963): Statistical Mechanics. Wiley, New York.
Huang, K. (1982): Quarks, Leptons, and Gauge Fields. World Scientific, Singapore.
Hughes, T. and Marsden, J. (1976): A Short Course in Hydrodynamics. Publish or Perish,
Boston.
Hughes, T., Kato, T., and Marsden, J. (1977): Well-posed quasi-linear hyperbolic systems
with applications to nonlinear elastodynamics and general relativity. Arch. Rat.
Mech. Anal. 63, 272-294.
Hugoniot, H. (1889): Sur la propagation du mouvement dans les corps et spécialement
dans les gaz parfaits. J. École Polytechnique 58, 1-125.
Hünlich, R. (1979): On simultaneous torsion and tension of a circular cylindrical bar
consisting of an elastoplastic material with linear hardening. Z. Angew. Math.
Mech. 59, 509-516.
Hurt, N. (1983): Geometric Quantization in Action. Reidel, Boston.
Husseini, S., Lasry, J., and Magill, M. (1990): Existence of equilibrium with incomplete
markets. J. Math. Economics 19, 39-67.

Infeld, L. (1969): Leben mit Einstein: Kontur einer Erinnerung. Wien.
XXIIth International Conference on High Energy Physics in Leipzig (1984): Proceedings,
Vols. 1, 2. Akad. d. Wiss., Berlin.
d'Inverno, R. (1983): Computer methods in general relativity. In: Schmutzer, E. [ed.]
(1983), 93-113.
Iooss, G. (1979): Bifurcation of Maps and Applications. North-Holland, Amsterdam.
Iooss, G. and Joseph, D. (1980): Elementary Stability and Bifurcation Theory. Springer-
Verlag, New York.
Itzykson, C. and Zuber, J. (1980): Quantum Field Theory. McGraw-Hill, New York.
Ivanov, V. (1983): Gerade und Ungerade: die Asymmetrie des Gehirns und der Zeichen-
struktur. Hirzel, Stuttgart.
Ize, J. (1976): Bifurcation Theory for Fredholm Operators. American Mathematical
Society, Providence, RI.

Jacob, C. (1959): Introduction mathématique à la mécanique des fluides. Gauthier-Villars,
Paris.
Jaffe, A. (1984): Ordering the universe: the role of mathematics. Notices Amer. Math.
Soc. 236, 589-608.
Jäger, W., Moser, J., and Remmert, R. [eds.] (1984): Perspectives in Mathematics.
Birkhäuser, Boston.
Jantzsch, E. (1982): Die Selbstorganisation des Universums. Deutscher Taschenbuchverlag,
München.
Jentsch, L. (1977): Zur Existenz von regulären Lösungen der Elastostatik stückweise
homogener Körper. Akademie-Verlag, Berlin.
Jentzsch, R. (1912): Über Integralgleichungen mit positivem Kern. J. Reine Angew.
Math. 141, 235-249.
John, F. (1965): Estimates for the derivative of the stresses in a thin shell and interior
shell equations. Comm. Pure Appl. Math. 18, 235-267.
John, F. (1971): Refined interior equations for thin elastic shells. Comm. Pure Appl.
Math. 24, 583-615.
John, F. (1972): Uniqueness of nonlinear elastic equilibrium for prescribed boundary
displacements and sufficiently small strains. Comm. Pure Appl. Math. 25, 627-634.
John, F. (1977): Finite amplitude waves in homogeneous isotropic elastic solids. Comm.
Pure Appl. Math. 30, 421-446.
John, F. (1982): Partial Differential Equations. Springer-Verlag, New York.
John, F. (1983): Lower bounds for the life span of solutions of nonlinear wave equations
in three-dimensional space. Comm. Pure Appl. Math. 36, 1-35.
John, F. (1985): Collected Papers. Edited by J. Moser, Vols. 1, 2. Birkhäuser, Boston.
John, F. (1985a): Formation of singularities in elastic waves. In: John, F. (1985), Vol. 1,
624-640.
John, F. (1987): Existence for large times of strict solutions of nonlinear wave equations
in three space dimensions for small initial data. Comm. Pure Appl. Math. 40,
79-110.
Jones, M. and Toland, J. (1986): Symmetry and the bifurcation of capillary-gravity
waves. Arch. Rat. Mech. Anal. 96, 79-110.
Joseph, D. (1965): On the solvability of the Boussinesq equation. Arch. Rat. Mech. Anal.
20, 59-71.
Joseph, D. and Sattinger, D. (1972): Bifurcating time periodic solutions and their stability.
Arch. Rat. Mech. Anal. 45, 79-109.
Joseph, D. (1976): Stability of Fluid Motions, Vols. 1, 2. Springer-Verlag, New York.
Jost, R. (1984): Mathematics and physics since 1800: discord and sympathy. In: DeWitt,
B. and Stora, R. [eds.] (1984), 4-50.
Judovic, V. (1966): Secondary flows and fluid instability between rotating cylinders.
Prikl. Mat. Mekh. 30, 688-698 (Russian).
Judovic, V. (1966a): On the origin of convection. Prikl. Mat. Mekh. 30, 1193-1199
(Russian).
Judovic, V. (1967): Free convection and bifurcation. Prikl. Mat. Mekh. 31, 101-111
(Russian).

Kačanov, L. (1969): Foundations of Plasticity Theory. Nauka, Moscow (Russian).
Kähler, E. (1934): Einführung in die Theorie der Systeme von Differentialgleichungen.
Teubner, Leipzig.
Kähler, E. (1941): Über die Beziehungen der Mathematik zu Astronomie und Physik.
Jahresber. Deutsche Math.-Verein. 51, 52-63.
Kähler, E. (1979): Monadologie. Privatdruck, Hamburg.
Kakutani, S. (1941): A generalization of Brouwer's fixed-point theorem. Duke Math. J.
8, 457-459.
Karlin, S. (1959): Mathematical Methods and Theory in Games, Programming and
Economics. Addison-Wesley, Reading, MA.
Karlin, S. (1967): Total Positivity and Applications. University Press, Stanford, CA.
Kármán, T.v. (1910): Festigkeitsprobleme im Maschinenbau. Enzyklopädie der
mathematischen Wissenschaften, Vol. IV/4, 601-694.
Kato, T. (1966): Perturbation Theory for Linear Operators. Springer-Verlag, New York.
Kato, T. (1967): On classical solutions of two-dimensional stationary Euler equations.
Arch. Rat. Mech. Anal. 25, 188-200.
Kato, T. (1972): Nonstationary flow of viscous and ideal fluids in ℝ³. J. Funct. Anal. 9,
296-305.
Kato, T. (1975): The Cauchy problem for quasilinear symmetric hyperbolic systems. Arch.
Rat. Mech. Anal. 58, 181-205.
Kato, T. (1975a): Quasilinear equations of evolution with applications to partial differential
equations. Lecture Notes in Mathematics, Vol. 448, 25-70. Springer-Verlag,
New York.
Kato, T. (1982): A Short Introduction to Perturbation Theory for Linear Operators.
Springer-Verlag, New York.
Kato, T. (1983): On the Cauchy problem for the generalized Korteweg-de Vries equation.
In: Guillemin, V. [ed.] (1983), Studies in Applied Mathematics, pp. 93-128, Ac-
ademic Press, New York.
Kato, T. and Lai, C. (1984): Nonlinear evolution equations and the Euler flow. J. Funct.
Anal. 56, 15-28.
Kaufmann, W. (1979): Black Holes and Warped Space-Time. Freeman, San Francisco,
CA.
Keller, H. and Meyer-Spasche, R. (1980): Computation of the axialsymmetric flow
between rotating cylinders. J. Comput. Phys. 35, 100-109.
Kellogg, R., Li, T., and Yorke, J. (1976): A constructive proof of the Brouwer fixed-point
theorem and computational results. SIAM J. Numer. Anal. 13, 473-483.
Kellogg, O. (1929): Foundations of Potential Theory. Springer-Verlag, Berlin.
Kelly, A. (1967): The stable, center-stable, center, center-unstable and unstable manifolds.
In: Abraham, R. and Robbin, J. (1967), 136-154.
Kelvin, Lord: See Thomson, W., Sir.
Kepler, J. (1609): Astronomia nova. (German translation by M. Caspar, München,
1939.)
Kepler, J. (1618): Harmonice mundi. (German translation by M. Caspar, München,
1939.)
Kepler, J. (1939): Gesammelte Werke (Collected Works). Edited by W. v. Dyck and
M. Caspar, München.
Kevorkian, J. and Cole, J. (1981): Perturbation Methods in Applied Mathematics.
Springer-Verlag, New York.
Kielhöfer, H. (1979): Hopf bifurcation at multiple eigenvalues. Arch. Rat. Mech. Anal.
69, 53-84.
Kielhöfer, H. (1980): Degenerate bifurcation at simple eigenvalues and stability of
bifurcating solutions. J. Funct. Anal. 38, 416-441.
Kielhöfer, H. (1982): Floquet exponents of bifurcating periodic orbits. Nonlinear Anal.
6, 571-584.
Kielhöfer, H. and Lauterbach, R. (1983): On the principle of reduced stability. J. Funct.
Anal. 53, 99-111.
Kijowski, J. (1978): On a new variational principle in general relativity and the energy
of the gravitational field. Gen. Relativity Gravitation 9, 857-877.
Kijowski, J. (1985): On positivity of energy of the gravitational field. In: Ruffini, R. [ed.],
Proceedings of the Fourth Marcel Grossmann Meeting, Rome, 1985. North-Holland,
Amsterdam.
Kinderlehrer, D. and Stampacchia, G. (1980): An Introduction to Variational Inequal-
ities and Their Applications. Academic Press, New York.
Kinderlehrer, D. (1981): Remarks about Signorini's problem in linear elasticity. Ann.
Scuola Norm. Sup. Pisa Cl. Sci. (4), 8, 605-645.
Kippenhahn, R. (1980): Geburt, Leben und Tod der Sterne. Piper, München.
Kippenhahn, R. (1987): Light from the Depth of Time. Springer-Verlag, New York.
Kirchgässner, K. (1961): Die Instabilität der Strömung zwischen zwei rotierenden
Zylindern gegenüber Taylor-Wirbeln für beliebige Spaltbreiten. Z. Angew. Math.
Phys. 12, 14-30.
Kirchgässner, K. and Sorger, P. (1969): Branching analysis for the Taylor problem.
Quart. J. Mech. Appl. Math. 32, 183-209.
Kirchgässner, K. and Kielhöfer, H. (1973): Stability and bifurcation in fluid dynamics.
Rocky Mountain J. Math. 3, 275-318.
Kirchgässner, K. (1975): Bifurcation and nonlinear hydrodynamic stability. SIAM Rev.
17, 652-683.
Kirchgässner, K. (1975a): Instability phenomena in fluid mechanics. In: SYNSPADE
1975, Edited by J. Hubbard. Academic Press, New York.
Kirchgässner, K. (1981): Periodic and nonperiodic solutions of reversible systems. In: De
Mottoni, P. and Salvadori, L. [eds.], Nonlinear Differential Equations, Invariance,
Stability, and Bifurcation. Academic Press, New York, pp. 221-242.
Kirchgässner, K. (1988): Nonlinearly resonant surface waves and homoclinic bifurcation
(to appear).
Kirchgraber, V. and Stiefel, E. (1978): Methoden der analytischen Störungsrechnung und
ihre Anwendungen. Teubner, Stuttgart.
Kittel, C. et al. (1965): Berkeley Physics: A Course in Physics, Vols. 1-5. McGraw-Hill,
New York. (German edition: Vieweg, Braunschweig.)
Kittel, C. (1969): Thermal Physics. Wiley, New York. (German edition: Geest & Portig,
Leipzig, 1973.)
Klein, F. (1871): Über die sogenannte nicht-euklidische Geometrie. Math. Ann. 4,
573-625. (Further papers on this topic may be found in Klein, F. (1921), Vol. 1,
241-410.)
Klein, F. (1872): Erlangener Programm. In: Klein, F. (1921), Vol. 1, 460-497.
Klein, F. (1921): Gesammelte mathematische Abhandlungen (Collected Papers), Vols.
1-3. Springer-Verlag, Berlin.
Klein, F. (1926): Vorlesungen über die Entwicklung der Mathematik im 19. Jahrhundert,
Vols. 1, 2. Springer-Verlag, Berlin.
Klein, F. (1928): Vorlesungen über nicht-euklidische Geometrie. Springer-Verlag, Berlin.
Klein, M. (1973): The development of Boltzmann's statistical ideas. In: Cohen, E. and
Thirring, W. [eds.] (1973), 53-106.
Kleinert, W. [ed.] (1987): Gauge Theory of Stresses and Defects. World Scientific,
Singapore.
Kliesch, W. (1983): Zur numerischen Bestimmung des Abbildungsgrades im ℝⁿ und zu
seiner Anwendung bei der Lösung nichtlinearer Gleichungssysteme. Dissertation.
Universität, Leipzig.
Kliesch, W. (1984): Zur numerischen Bestimmung des Abbildungsgrades im ℝⁿ. Z. Anal.
Anwendungen 3, 337-365; 489-502.
Kliesch, W. (1989): A unified numerical approach to the topological degree in ℝⁿ. Math.
Nachr. 142, 181-213.
Kline, M. (1972): Mathematical Thought from Ancient to Modern Times. University
Press, Oxford, England.
Klingenberg, W. (1978): A Course in Differential Geometry. Springer-Verlag, New York.
Klingenberg, W. (1983): Riemannian Geometry. De Gruyter, Berlin.
Knaster, B., Kuratowski, C., and Mazurkiewicz, S. (1929): Ein Beweis des Fixpunktsatzes
für n-dimensionale Simplexe. Fund. Math. 14, 132-137.
Knightly, G. and Sather, D. (1974): Nonlinear buckled states of rectangular plates. Arch.
Rat. Mech. Anal. 54, 356-372.
Knightly, G. and Sather, D. (1980): Existence and stability of axial symmetric buckled
states of spherical shells. Arch. Rat. Mech. Anal. 63, 305-319.
Knightly, G. and Sather, D. (1985): A selection principle for Benard-type convection.
Arch. Rat. Mech. Anal. 88, 163-193.
Knops, R. [ed.] (1976): Symposium on Nonlinear Analysis and Mechanics, Vols. 1-4.
Pitman, New York, 1976-79.
Knörrer, H. (1986): Integrable Hamiltonsche Systeme und Algebraische Geometrie.
Jahresber. Deutsche Math.-Ver. 88, 82-103.
Kobayashi, S. and Nomizu, K. (1963): Foundations of Differential Geometry, Vols. 1, 2.
Interscience, New York.
Koiter, W. (1960): General theorems for elastic-plastic solids. In: Progress in Solid
Mechanics. North-Holland, Amsterdam, pp. 165-221.
Koiter, W. and Simmonds, J. (1972): Foundations of shell theory. In: Proc. 13th Int.
Congr. Appl. Mech. Moscow, 1972, pp. 150-176. Springer-Verlag, New York.
Koiter, W. (1980): The intrinsic equations of shell theory with some applications. In:
Nemat-Nasser, S. [ed.], Mechanics Today, Vol. 5. Pergamon Press, London.
Kompanejec, A. (1961): Theoretical Physics. Mir, Moscow (Russian).
Korn, A. (1907): Sur les équations d'élasticité. Ann. École Norm. 24, 9-75.
Korneev, V. and Lange, U. (1984): Approximate Solution of Plastic Flow Theory
Problems. Teubner, Leipzig.
Korteweg, D. and de Vries, G. (1895): On the change of form of long waves and of a
new type of long stationary waves. Philos. Mag. 39, 422.
Kotschin, N. et al. (1954): Theoretische Hydrodynamik. Akademie-Verlag, Berlin.
Kounas, C. et al. [eds.] (1984): Grand Unification With and Without Supersymmetry
and Cosmological Implications. World Scientific, Singapore.
Kowalewski, G. (1939): Große Mathematiker. Berlin.
Kramer, D. et al. (1980): Exact Solutions of Einstein's Field Equations. Verl. d. Wiss.,
Berlin.
Krasnoselskii, M. and Pokrovskii, A. (1983): Systems with Hysteresis. Nauka, Moscow.
(English edition in preparation.)
Krasnoselskii, M. et al. (1985): Positive Linear Systems: The Method of Positive
Operators. Nauka, Moscow (Russian).
Krein, M. (1964): Lectures on Stability. Kiev (Russian). (Cf. Daleckii, J. and Krein,
M. (1970).)
Kreyszig, E. (1957): Differentialgeometrie. Geest & Portig, Leipzig.
Kruskal, M. (1960): Maximal extension of the Schwarzschild metric. Phys. Rev. 119,
1743.
Kruzkov, S. (1970): Quasilinear equations of first order with several independent vari-
ables. Mat. Sbornik 81, 228-255.
Kubicek, M. and Marek, M. (1983): Computational Methods in Bifurcation Theory and
Dissipative Structures. Springer-Verlag, New York.
Kubo, R., Toda, M., and Saito, N. (1983): Statistical Physics, I. Equilibrium Statistical
Mechanics. Springer-Verlag, New York.
Kubo, R., Toda, M., and Hashitsume, N. (1985): Statistical Physics, II. Nonequilibrium
Statistical Mechanics. Springer-Verlag, New York.
Kučera, M., Nečas, J., and Souček, J. (1978): The eigenvalue problem for variational
inequalities and a new version of the Ljusternik-Schnirelman theory. In: Cesari,
L. [ed.] (1978), Nonlinear Analysis, 125-143, Academic Press, New York.
Kučera, M. (1982): A new method for obtaining eigenvalues of variational inequalities.
Czechoslovak Math. J. 32, 197-207.
Kučera, M. (1982a): Bifurcation points of variational inequalities. Czechoslovak Math.
J. 32, 208-226.
Kuhn, H. (1960): Some combinatorial lemmas in topology. IBM J. Res. Develop. 4,
518-524.
Kupradze, V. (1976): Three-Dimensional Problems in Mathematical Elasticity and
Thermoelasticity. Nauka, Moscow (Russian). (English edition: North-Holland,
Amsterdam, 1979.)
Ky Fan: Cf. Fan, Ky.

Ladyzenskaja, O. (1959): Solution "in the large" of the nonstationary boundary-value
problem for the Navier-Stokes system with two space variables. Comm. Pure Appl.
Math. 12, 427-433.
Ladyzenskaja, O. (1970): Mathematical Problems in the Dynamics of Viscous Incom-
pressible Fluids. Nauka, Moscow (Russian). (English edition: Gordon and Breach,
New York, 1969.)
Ladyzenskaja, O. and Solonnikov, V. (1977): On the solvability of boundary-value
problems and boundary initial-value problems for the Navier-Stokes equations with
non-compact boundaries. Vestnik Leningr. Univ., 1977, 39-47 (Russian).
Ladyzenskaja, O. (1979): On formulation and solvability of boundary-value problems for
viscous incompressible fluids in domains with non-compact boundaries. In: Equadiff
IV. Lecture Notes in Mathematics, Vol. 703, 233-240. Springer-Verlag, Berlin.
Ladyzenskaja, O. and Solonnikov, V. (1980): On solutions for the stationary Navier-
Stokes equations with unbounded Dirichlet integral. Zap. Naucn. Sem. Leningrad.
Otdel. Mat. Inst. Steklov. 96, 117-160 (Russian).
Ladyzenskaja, O. (1982): On the finite dimension of bounded invariant sets for the
Navier-Stokes equations and other dissipative systems. Zap. Naucn. Sem. Lenin-
grad. Otdel. Mat. Inst. Steklov. 115, 137-155 (Russian).
Ladyzenskaja, O. (1986): On some directions of the research in mathematical physics at
the Steklov-Institute in Leningrad. In: Vladimirov, V. (1986), 217-245 (Russian).
Lagrange, L. (1788): La Mecanique Analytique. Paris.
Lagrange, L. (1867/1892): Oeuvres (Collected Works). Gauthier-Villars, Paris.
Lahaye, E. (1935): Sur la representation des racines systemes d'equations transcendante.
Deuxieme Congres National des Sciences, Vol. 1, 141-146.
Lamb, G. (1980): Elements of Soliton Theory. Wiley, New York.
Lamb, H. (1924): Hydrodynamics. University Press, Cambridge, England. (German
edition: Teubner, Leipzig, 1931.)
Lanchon, H. (1974): Torsion elastoplastique d'une barre cylindrique de section simple-
ment ou multiplement connexe. J. Mech. 13, 267-320.
Landau, L. and Lifsic, E. (1962): Lehrbuch der Theoretischen Physik, Vols. 1-10.
Akademie-Verlag, Berlin. (English edition: Pergamon Press, Oxford, 1962ff.)
Landau, L., Achieser, A., and Lifsic, E. (1970): Mechanik und Molekularphysik.
Akademie-Verlag, Berlin.
Lang, S. (1972): Differential Manifolds. Addison-Wesley, Reading, MA.
Langenbach, A. (1976): Monotone Potentialoperatoren. Verl. d. Wiss., Berlin.
Latal, H. and Mitter, H. [eds.] (1987): Concepts and Trends in Particle Physics.
Springer-Verlag, New York.
Laue, M. v. (1950): History of Physics. Academic Press, New York. (German edition:
Ullstein, Frankfurt/Main.)
Laugwitz, D. (1960): Differentialgeometrie. Teubner, Stuttgart. (English edition: Aca-
demic Press, New York, 1965.)
Lavrentjev, M. (1946): On the theory of long waves. Sbornik Inst. Mat. Akad. Nauk
Ukrainsk. RSR 8, 13-69 (Ukrainian). (English edition: Amer. Math. Soc. Transl.,
Vol. 102, 1954.)
Lax, P. (1957): Hyperbolic systems of conservation laws. Comm. Pure Appl. Math. 10,
537-566.
Lax, P. and Wendroff, B. (1960): Systems of conservation laws. Comm. Pure Appl. Math.
13, 217-237.
Lax, P. (1968): Integrals of nonlinear equations of evolution and solitary waves. Comm.
Pure Appl. Math. 21, 467-490.
Lax, P. (1973): Hyperbolic Systems of Conservation Laws and the Mathematical Theory
of Shock Waves. SIAM, Philadelphia.
Lax, P. (1983): Problems solved and unsolved concerning linear and nonlinear partial
differential equations. In: Warsaw (1983), 119-138.
LeBlond, P. and Mysak, L. (1978): Waves in the Ocean, Vols. 1, 2. Elsevier.
Lebovitz, N. (1977): Bifurcation and stability problems in astrophysics. In: Rabinowitz,
P. [ed.] (1977), 259-284.
Lee, T. (1981): Particle Physics and Introduction to Field Theory. Harwood, New York.
Leites, D. (1980): Introduction to the theory of supermanifolds. Uspekhi Mat. Nauk 35
(1), 3-57 (Russian).
Leray, J. (1933): Etude de diverses equations integrales non lineaires et de quelques
problemes que pose l'hydrodynamique. J. Math. Pures Appl. 12, 1-82.
Leray, J. (1934): Essai sur les mouvements plans d'un liquide visqueux que limitent des
parois. J. Math. Pures Appl. 13, 331-418.
Leray, J. (1934a): Sur le mouvement d'un liquide visqueux emplissant l'espace. Acta Math.
63, 193-248.
Leray, J. (1935): Les problemes de representation conforme d'Helmholtz. Comment.
Math. Helv. 8, 149-180.
Leray, J. (1978): Analyse Lagrangienne et mecanique quantique. Strasbourg, France.
(English edition: MIT Press, 1981).
Les Houches (1951ff): Summer Schools on Theoretical Physics. North-Holland, Amster-
dam.
Les Houches (1979): Physical Cosmology. North-Holland, Amsterdam.
Les Houches (1981): Chaotic Behavior of Deterministic Systems. North-Holland,
Amsterdam.
Les Houches (1981a): Gauge Theories in High Energy Physics. North-Holland, Amster-
dam.
Les Houches (1982): Recent Developments in Field Theory and Statistical Mechanics.
North-Holland, Amsterdam.
Les Houches (1983): Relativity, Groups and Topology, II. North-Holland, Amsterdam.
Les Houches (1983a): Birth and Infancy of Stars. North-Holland, Amsterdam.
Levi-Civita, T. (1917): Nozione di parallelismo in una varieta. Rend. Circ. Mat. Palermo
42, 173-205.
Levi-Civita, T. (1925): Determination rigoureuse des ondes permanentes d'ampleur finie.
Math. Ann. 93, 264-314.
Levy, M. (1871): Memoire sur les equations des corps solides ductiles au-dela de la limite
elastique. J. Math. Pures Appl. 16, 369-372.
Liang, E. and Sachs, R. (1980): Cosmology. In: Held, A. [ed.] (1980), Vol. 2, 329-357.
Lichtenstein, L. (1921): Neuere Entwicklungen der Potentialtheorie. In: Enzyklopädie
der mathematischen Wissenschaften, Vol. II/3.1, 181-217.
Lichtenstein, L. (1923): Astronomie und Mathematik in ihrer Wechselwirkung. Hirzel,
Leipzig.
Lichtenstein, L. (1924): Über die erste Randwertaufgabe der Elastizitätstheorie. Math.
Z. 20, 21-28.
Lichtenstein, L. (1929): Grundlagen der Hydromechanik. Springer-Verlag, Berlin.
Lichtenstein, L. (1930): Über einige Hilfssätze der Potentialtheorie IV. Ber. Sächs. Akad.
Wiss. Leipzig 82, 265-344.
Lichtenstein, L. (1931): Vorlesungen über einige Klassen nichtlinearer Integralglei-
chungen und Integro-Differentialgleichungen nebst Anwendungen. Springer-Verlag,
Berlin.
Lichtenstein, L. (1933): Gleichgewichtsfiguren rotierender Flüssigkeiten. Springer-
Verlag, Berlin.
Lie, S. (1934): Gesammelte Abhandlungen (Collected Papers), Vols. 1-7. Teubner, Leip-
zig, 1934ff.
Lighthill, J. (1978): Waves in Fluids. University Press, Cambridge, England.
Lighthill, J. (1986): An Informal Introduction to Theoretical Fluid Dynamics. University
Press, Oxford, England.
Lightman, A. et al. (1975): Problem Book in Relativity and Gravitation. University Press,
Princeton, NJ.
Lin, C. and Reid, W. (1963): Turbulent flow, theoretical aspects. In: Flügge, S. [ed.]
(1956), Vol. VIII/2, 438-523.
Linde, A. (1984): Elementary particles and cosmology. In: International Conference on
High Energy Physics in Leipzig (1984), Vol. 2, 125-148.
Lions, J. (1969): Quelques methodes de resolution des problemes aux limites non lineaires.
Dunod, Paris.
Lions, J. (1973): Perturbations singulieres dans les problemes aux limites et en controle
optimal. Lecture Notes in Mathematics, Vol. 323. Springer-Verlag, Berlin.
Lions, P. (1982): Generalized Solutions of Hamilton-Jacobi Equations. Pitman, Lon-
don.
Lions, P. (1983): Hamilton-Jacobi-Bellman equations and the optimal control of sto-
chastic systems. In: Warsaw (1983), Vol. 2, 1403-1477.
Lions, P. (1984): The concentration-compactness principle in the calculus of variations.
Ann. Inst. H. Poincare. Anal. Non Lineaire 1, 109-145.
List, S. (1978): Generic bifurcation with applications to the von Karman equations.
J. Differential Equations 30, 89-118.
Ljapunov, A. (1882): The general problem on the stability of motion. In: Ljapunov, A.
(1954), Vol. 2, 7-263 (Russian).
Ljapunov, A. (1906): Sur les figures d'equilibre peu differentes d'une masse liquide
homogene douee d'un mouvement de rotation. Memoires 1906. St. Petersburg, 1-225.
(See Ljapunov, A. (1954), Vol. 4.)
Ljapunov, A. (1954): Collected Works, Vols. 1-5. Nauka, Moscow (Russian).
Ljusternik, L. and Visik, M. (1957): Regular degeneration and boundary layers for linear
differential equations with small parameters. Uspekhi Mat. Nauk 12 (5), 3-122
(Russian).
Lodge, A., McLeod, J., and Nohel, J. (1978): A nonlinear singularly perturbed Volterra
integrodifferential equation occurring in polymer rheology. Proc. Roy. Soc. Edin-
burgh A80, 99-137.
Longair, M. [ed.] (1974): Confrontation of Cosmological Theories with Observational
Data. Reidel, Boston.
Lopes, J. (1981): Gauge Field Theory. Pergamon Press, Oxford, England.
Lorenz, E. (1963): Deterministic non-periodic flow. J. Atmospheric Sci. 20, 130-141.
Lotze, K. (1980): Der Lebensweg der schwarzen Löcher. Die Sterne 56, 82-92, 149-159.
Love, A. (1906): A Treatise on the Mathematical Theory of Elasticity. University Press,
Cambridge, England.
Ludwig, G. (1978): Einführung in die Grundlagen der theoretischen Physik, Vols. 1-4.
Vieweg, Braunschweig.
Lurje, A. (1980): Nonlinear Elasticity. Nauka, Moscow (Russian).
Lüscher, E. (1980): Pipers Buch der modernen Physik. Piper, München.

Ma, S. (1982): Modern Theory of Critical Phenomena. Benjamin, London.
Macke, W. (1962): Lehrbuch der theoretischen Physik, Vols. 1-6. Geest & Portig,
Leipzig.
Majda, A. (1982): Smooth solutions for the equations of compressible and incompressible
flow. In: da Veiga, B. [ed.], Fluid Dynamics. Lecture Notes in Mathematics, Vol.
1047, 77-126. Springer-Verlag, Berlin.
Majda, A. (1983): Systems of Conservation Laws in Several Space Variables. In: Warsaw
(1983), 1217-1224.
Majda, A. (1984): Compressible Fluid Flow and Systems of Conservation Laws in Several
Space Variables. Springer-Verlag, New York.
Mangoldt, H. v. and Knopp, K. (1957): Einführung in die höhere Mathematik, Vols.
1-3. Hirzel, Leipzig.
Manin, J. (1981): Mathematics and Physics. Birkhäuser, Boston.
Manin, Y. and Khenkin, G. (1982): Yang-Mills-Dirac equations as Cauchy-Riemann
equations in twistor space. Soviet J. Nuclear Phys. 35, 941-950.
Manin, J. (1984): Gauge Field Theory and Complex Manifolds. Nauka, Moscow
(Russian).
Manin, Y. (1985): New exact solutions and cohomology analysis of ordinary and super-
symmetric Yang-Mills equations. Proc. Steklov Inst. Math. 165, 107-127.
Marlow, A. (1980): Quantum Theory and Gravitation. Academic Press, New York.
Marsden, J. (1968): Generalized Hamiltonian mechanics. Arch. Rat. Mech. Anal. 28,
323-361.
Marsden, J. (1968a): Hamiltonian one-parameter groups. Arch. Rat. Mech. Anal. 28,
362-396.
Marsden, J. (1974): Applications of Global Analysis in Mathematical Physics. Publish
or Perish, Boston.
Marsden, J. and McCracken, M. (1976): The Hopf Bifurcation and Its Applications.
Springer-Verlag, Berlin.
Marsden, J. and Tromba, A. (1976): Vector Calculus. Freeman, San Francisco, CA.
Marsden, J. (1980): Lectures on Geometric Methods in Mathematical Physics. SIAM,
Philadelphia.
Marsden, J. (1983): The initial-value problem and the dynamics of gravitational fields.
In: Schmutzer, E. [ed.] (1983), 115-126.
Marsden, J. and Hughes, T. (1983): Mathematical Foundations of Elasticity. Prentice-
Hall, Englewood Cliffs, NJ.
Marsden, J., Abraham, R., and Ratiu, T. (1983): Manifolds, Tensor Analysis, and Appli-
cations. Addison-Wesley, Reading, MA.
Marsden, J. [ed.] (1984): Fluids and Plasmas: Geometry and Dynamics. Contemporary
Mathematics, Vol. 28, American Mathematical Society, Providence, RI.
Martin, N. (1981): Mathematical Theory of Entropy. Addison-Wesley, London.
Maslov, V. (1972): Theorie des perturbations et methodes asymptotiques. Dunod, Paris.
Mathematisches Wörterbuch (1961): Edited by J. Naas and H. Schmid. Akademie-
Verlag, Berlin.
Massey, B. (1971): Units, Dimensional Analysis, and Physical Similarity. Van Nostrand,
London.
Mathematics: The Unifying Thread in Science (1986): Notices Amer. Math. Soc. 33,
716-733.
Matsumura, A. and Nishida, T. (1979): The initial-value problem for the equations of
motion of compressible viscous and heat conductive fluids. Proc. Japan Acad. A55,
337-342.
Maul, J. (1976): Eine einheitliche Methode zur Lösung der ebenen Aufgaben der linearen
Elastostatik. Akademie-Verlag, Berlin.
Maurin, K. (1967): Methods of Hilbert Spaces. PWN, Warsaw.
Maurin, K. (1976): Analysis, Vols. 1, 2. Reidel, Boston.
Maurin, K. (1981): Mathematik als Sprache und Kunst. In: Maurin, K. et al. [eds.],
Offene Systeme, Vol. 2, 118-241. Stuttgart.
Maurin, K. (1982): Plato's cave parable and the development of modern physics. Rend.
Sem. Mat. Univ. Politec. Torino 40, 1-31.
Mehra, J. and Rechenberg, H. (1982): The Historical Development of Quantum Theory,
Vols. 1-4. Springer-Verlag, New York.
Melcher, H. (1979): Albert Einstein wider Vorurteile und Denkgewohnheiten. Akademie-
Verlag, Berlin.
Menzel, D. (1955): Fundamental Formulas of Physics. Prentice-Hall, New York.
Meyer, R. (1971): Introduction to Mathematical Fluid Dynamics. Wiley, New York.
Meyers, N. (1963): An L^p-estimate for the gradient of solutions of second-order elliptic
divergence equations. Ann. Scuola Norm. Sup. Pisa 17 (3), 189-206.
Mezard, M. and Virasoro, M. [eds.] (1987): Spin Glass Theory and Beyond. World
Scientific, Singapore.
Miersemann, E. (1975): Verzweigungsprobleme für Variationsungleichungen. Math.
Nachr. 65, 187-209.
Miersemann, E. (1978): Über höhere Verzweigungspunkte nichtlinearer Variationsunglei-
chungen. Math. Nachr. 85, 195-213.
Miersemann, E. (1979): Über positive Lösungen von Eigenwertgleichungen mit Anwen-
dungen auf ein Beulproblem für die Platte. Z. Angew. Math. Mech. 59, 189-194.
Miersemann, E. (1980): Zur Regularität der quasistatischen elastoviscoplastischen Ver-
schiebungen und Spannungen. Math. Nachr. 96, 293-299.
Miersemann, E. (1981): Eigenvalue problems for variational inequalities. Contemp.
Math. 4, 25-43.
Miersemann, E. (1981a): Eigenwertaufgaben für Variationsungleichungen. Math. Nachr.
100, 221-228.
Miersemann, E. (1981b): Zur Lösungsverzweigung bei Variationsungleichungen mit einer
Anwendung auf den Knickstab mit begrenzter Durchbiegung. Math. Nachr. 102,
7-15.
Miersemann, E. (1982): Stabilitätsprobleme für Eigenwertaufgaben bei Beschränkungen
für die Variationen mit einer Anwendung auf die Platte. Math. Nachr. 106, 211-221.
Miller, J. and Sciama, D. (1980): Gravitational collapse to the black hole states. In: Held,
A. [ed.] (1980), Vol. 2, 359-392.
Milne-Thomson, L. (1960): Theoretical Hydrodynamics. Macmillan, London.
Milnor, J. (1965): Topology From the Differentiable Viewpoint. University Press, Char-
lottesville, VA.
Milnor, J. (1983): Hyperbolic geometry: the first 150 years. In: Browder, F. [ed.] (1983),
25-40.
Minkowski, H. (1909): Raum und Zeit. Teubner, Leipzig. (English edition: Space and
time. Calcutta Math. Soc. Bull. 1, 135-141.)
Miranda, C. (1941): Un'osservazione su un teorema di Brouwer. Boll. Un. Mat. Ital.
Seconda Serie 3, 5-7.
Mises, R. v. (1913): Mechanik der festen Körper im plastisch-deformablen Zustand.
Nachr. Akad. Wiss. Göttingen, Math.-phys. Kl. 1913, 582-592.
Mises, R. v. (1962): Cf. Frank, P. and Mises, R. v. (1962).
Misner, C., Thorne, K., and Wheeler, J. (1973): Gravitation. Freeman, San Francisco,
CA.
Miyoshi, T. (1985): Foundations of the Numerical Analysis of Plasticity. North-Holland,
Amsterdam.
Mohapatra, R. (1986): Unification and Supersymmetry. Springer-Verlag, New York.
Monod, J. (1970): Le hasard et la necessite. Paris. (German edition: Zufall und Notwen-
digkeit, München, 1971.)
Monastyrsky, M. (1987): Riemann, Topology, and Physics. Birkhäuser, Boston.
Montgomery, D. and Zippin, L. (1955): Transformation Groups. Interscience, New
York.
Moore, F. (1977): The Story of Astronomy. MacDonald, London.
Morawetz, C. (1981): Lectures on Nonlinear Waves and Shocks. Springer-Verlag, New
York.
Morawetz, C. (1982): The mathematical approach to the sonic barrier. Bull. Amer. Math.
Soc. (N.S.) 6, 127-145.
Moreau, J. (1968): La notion de sur-potentiel et les liaisons unilaterales en elastostatique.
C. R. Acad. Sci. Paris Sér. A267, 954-957.
Moreau, J. (1976): Applications of convex analysis to the treatment of elastoplastic
systems. In: Germain, P. and Nayroles, B. [eds.] (1976), 56-89.
Moritz, R. (1914): Memorabilia Mathematica. Macmillan, New York.
Morrey, C. (1952): Quasi-convexity and the lower semicontinuity of multiple integrals.
Pacific J. Math. 2, 25-53.
Morrey, C. (1966): Multiple Integrals in the Calculus of Variations. Springer-Verlag,
New York.
Morrisson, P. (1984): Zehn hoch. Dimensionen zwischen Quarks und Galaxien. Spektrum
der Wissenschaft, Heidelberg.
Morse, M. (1939): The behaviour of a function on its critical set. Ann. Math. 40, 62-70.
Morse, P. and Feshbach, H. (1953): Methods of Theoretical Physics, Vols. 1, 2.
McGraw-Hill, New York.
Mosco, U. (1976): Implicit variational problems and quasivariational inequalities. Lec-
ture Notes in Mathematics, Vol. 543, 83-156. Springer-Verlag, Berlin.
Mouritsen, O. (1984): Computer Studies of Phase Transitions and Critical Phenomena.
Springer-Verlag, New York.
Murat, F. (1978): Compacite par compensation. Ann. Scuola Norm. Sup. Pisa Sci.
Fis. Math. 5, 489-507.
Murat, F. (1981): L'injection du cone positif de H^{-1} dans W^{-1,q} est compacte pour tout
1 < q < 2. J. Math. Pures Appl. 60, 309-322.
Murat, F. (1987): A survey on compensated compactness. In: Cesari, L. [ed.] (1987),
Contributions to Modern Calculus of Variations. Pitman, London, pp. 145-183.
Murman, E. [ed.] (1985): Progress and Supercomputing in Computational Fluid Dy-
namics. Birkhäuser, Boston.
Muskelisvili, N. (1954): Fundamental Problems in Mathematical Elasticity. Nauka,
Moscow (Russian). (English edition: Noordhoff, Leyden, 1975.)

Naghdi, P. (1972): Theory of plates and shells. In: Flügge, S. [ed.] (1956), Vol. VIa/2,
425-640.
Narlikar, J. and Padmanabhan, T. (1982): Quantum cosmology via path integrals. Phys.
Rep. 110, 151-200.
Nash, J. (1951): Non-cooperative games. Ann. Math. 54, 286-295.
Nash, J. (1956): The embedding problem for Riemannian manifolds. Ann. of Math. 63,
20-63.
Naumann, J. (1982): Zur Existenz und Regularität der Lösungen der Variationsunglei-
chungen der Theorie visko-plastischer und starr-idealplastischer Flüssigkeiten. Dis-
sertation B, Universität Leipzig.
Naumann, J. (1984): Parabolische Variationsungleichungen. Teubner, Leipzig.
Navier, C. (1822): Memoire sur les lois du mouvement des fluides. Mem. Acad. Sciences.
Nayfeh, A. (1973): Perturbation Methods. Wiley, New York.
Nayfeh, A. and Mook, D. (1979): Nonlinear Oscillations. Wiley, New York.
Nečas, J. et al. (1980): On the solution of the variational inequality to the Signorini
problem with small friction. Boll. Un. Mat. Ital. (5) 17B, 796-811.
Nečas, J. and Hlaváček, I. (1981): Mathematical Theory of Elastic and Elasto-plastic
Bodies. Elsevier, New York.
Nečas, J. (1983): Introduction to the Theory of Nonlinear Elliptic Equations. Teubner,
Leipzig.
Nečas, J. and Hlaváček, I. (1986): Cf. Hlaváček, I. (1986).
Nekrasov, A. (1921): On stationary waves, I. II. Ivanova-Voznes. Bull. Inst. Polytechnic
3, 52-65, 6, 155-171 (Russian).
Neumann, J. v. (1928): Zur Theorie der Gesellschaftsspiele. Math. Ann. 100, 295-320.
Neumann, J. v. (1932): Mathematische Grundlagen der Quantenmechanik. Springer-
Verlag, Berlin. (English edition: Mathematical Foundations of Quantum Mechan-
ics. University Press, Princeton, NJ, 1955).
Neumann, J. v. and Morgenstern, O. (1944): Theory of Games and Economic Behaviour.
University Press, Princeton, NJ.
Neumann, J. v. (1947): The Mathematician. In: Neumann, J. v. (1961), Vol. 1, 1-9.
Neumann, J. v. and Richtmyer, R. (1950): A method for the numerical calculation of
hydrodynamic shocks. J. Appl. Phys. 21, 232-237.
Neumann, J. v. (1961): Collected Works. Pergamon Press, New York.
Neumark, M. (1963): Lineare Darstellungen der Lorentz-Gruppe. Verl. d. Wiss., Berlin.
Newton, I. (1687): Philosophiae Naturalis Principia Mathematica. London.
Newton, I. (1797): Opera. Vols. 1-5. Edited by S. Horseley. London.
Newton, I. (1967): The Mathematical Papers of Isaac Newton. Edited by D. Whiteside,
since 1967.
Nguyen, Q. (1973): Materiaux elastoplastiques ecrouissable. Arch. Mech. Stos. 15,
695-702.
Nguyen, Q. (1982): Problemes de plasticite et de rupture. Publ. math. d'Orsay, No. 82.08.
Universite de Paris-Sud.
Ni, L. (1982): A combinatorial approach to the mapping degree. J. Math. Anal. Appl. 89,
386-399.
Nickel, K. (1958): Einige Eigenschaften von Lösungen der Prandtlschen Grenzschicht-
gleichung. Arch. Rat. Mech. Anal. 1, 1-31.
Nickel, K. (1963): Die Prandtlschen Grenzschichtdifferentialgleichungen als asymptoti-
scher Grenzfall der Navier-Stokesschen Differentialgleichungen und der Eulerschen
Differentialgleichungen. Arch. Rat. Mech. Anal. 13, 1-14.
Nickel, K. (1984): Minimal drag for wings with prescribed lift, roll moment and yaw
moment, or how to fight adverse yaw. In: Boffi, V. and Neunzert, H. [eds.] (1984),
7-50.
Nicolis, G. and Prigogine, I. (1977): Self-organization in non-equilibrium systems. From
Dissipative Structures to Order Through Fluctuations. Wiley, New York.
Nieuwenhuizen, P. van (1984): An introduction to simple supergravity and the Klein-
Kaluza program. In: DeWitt, B. and Stora, R. [eds.] (1984), 824-912.
Nikaido, H. (1956): On the classical multilateral exchange problem. Metroeconomica 8,
135-145.
Nikaido, H. (1968): Convex Structures and Economic Theory. Academic Press, New
York.
Nikaido, H. (1970): Introduction to Sets and Mappings in Modern Economics. North-
Holland, New York.
Niordson, F. (1985): Shell Theory. North-Holland, Amsterdam.
Nirenberg, L. (1955): Remarks on strongly elliptic systems. Comm. Pure Appl. Math.
8, 649-675.
Nobel Prizes (1954ff): Nobel Lectures. Edited by the Nobel Foundation, Stockholm.
Nonlinear Phenomena (1986): Solitons and Coherent Structures. Proceedings of a con-
ference held at Santa Barbara, CA, 1985. Phys. D, 18.
Novikov, S. et al. (1980): Theory of Solitons. The Inverse Scattering Method. Nauka,
Moscow (Russian). (English edition: Plenum, New York, 1984.)
Novikov, I. and Frolov, V. (1986): Physics of Black Holes. Nauka, Moscow (Russian).
Nussbaum, R. (1978): Differential Delay Equations With Two Time Lags. American
Mathematical Society, Providence, RI.

Oberdorfer, E. (1969): Das internationale Maßsystem. Springer-Verlag, Berlin.
Obuhov, A. (1983): Kolmogorov flow and its realization in experiments. Uspekhi Mat.
Nauk 38 (4), 100-111 (Russian).
Ockendon, H. and Taylor, A. (1983): Inviscid Fluid Flows. Springer-Verlag, New York.
Oden, J. and Reddy, T. (1976): Variational Methods in Theoretical Mechanics. Springer-
Verlag, New York.
Oden, J. (1979): Existence theorems for a class of problems in nonlinear elasticity.
J. Math. Anal. Appl. 69, 51-83.
Oden, J. (1980): Computational Methods in Nonlinear Mechanics. North-Holland, New
York.
Odquist, F. (1930): Über die Randwertaufgaben der Hydrodynamik zäher Flüssigkeiten.
Math. Z. 32, 329-375.
Ogden, R. (1972): Large deformation isotropic elasticity: on the correlation of theory
and experiment for compressible rubberlike solids. Proc. Roy. Soc. London, A328,
567-583.
Oleinik, O. (1957): Discontinuous solutions of nonlinear equations. Uspekhi Mat. Nauk
12 (3), 3-73 (Russian).
Oleinik, O., Kalasnikov, A., and Yui-Lin, C. (1958): The Cauchy problem and boundary
problems for equations of the type of nonstationary filtration. Izv. Akad. Nauk
SSSR Ser. Mat. 22, 667-704 (Russian).
Oleinik, O. (1959): On the construction of generalized solutions of the Cauchy problem
for quasilinear equations via artificial viscosity. Uspekhi Mat. Nauk 14 (2), 159-
164 (Russian).
Oleinik, O. (1968): Mathematical problems in the theory of boundary layers. Uspekhi
Mat. Nauk 23 (3), 4-65 (Russian).
Olfe, D. and Zakkay, V. (1964): Supersonic Flow. Chemical Processes and Radiative
Transfer. Oxford.
Olszak, W. [ed.] (1980): Thin Shell Theory: New Trends and Applications. Springer-
Verlag, Wien.
Omohundro, S. (1986): Geometric Perturbation Theory in Physics. World Scientific,
Singapore.
O'Neill, B. (1983): Semi-Riemannian Geometry. Academic Press, New York.
Orear, J. (1966): Fundamental Physics. Wiley, New York. (German edition: Hanser-
Verlag, München, 1971.)
Orlik, P. [ed.]. Singularities. Proc. Sympos. Pure Math., Vol. 40, Parts I, II. American
Mathematical Society, Providence, RI.
Osborn, H. (1982): Vector Bundles, Vols. 1-3. Academic Press, New York.
Oswatitsch, K. (1976): Grundlagen der Gasdynamik. Springer-Verlag, Berlin.
Otte, M. [ed.] (1974): Mathematiker über Mathematik. Springer-Verlag, Berlin.
Owen, D. (1984): A First Course in the Mathematical Foundations of Thermodynamics.
Springer-Verlag, New York.

Palais, R. (1969): The Morse lemma on Banach spaces. Bull. Amer. Math. Soc. 75, 968-
971.
Panagiotopoulos, P. (1985): Inequality Problems in Mechanics and Applications. Birk-
häuser, Boston.
Pascali, D. (1986): On critical points of nondifferentiable functions. Libertas Math. 6,
95-100.
Pauli, W. (1958): Die allgemeinen Prinzipien der Wellenmechanik. In: Flügge, S. [ed.]
(1956), Vol. V/1, 1-168.
Pauli, W. (1973): Lectures in Physics, Vols. 1-6. MIT Press, Cambridge, MA.
Pazy, A. (1983): Semigroups of Linear Operators and Applications to Partial Differential
Equations. Springer-Verlag, New York.
Peebles, P. (1980): The Large Scale Structure of the Universe. University Press, Prince-
ton, NJ.
Peitgen, H. and Prüfer, M. (1979): The Leray-Schauder continuation method is a
constructive element in the numerical study of nonlinear eigenvalue and bifurcation
problems. In: Peitgen, H. and Walther, H. [eds.] (1979), 326-409.
Peitgen, H. and Walther, H. [eds.] (1979): Functional Differential Equations and Ap-
proximation of Fixed Points. Lecture Notes in Mathematics, Vol. 730. Springer-
Verlag, Berlin.
Peitgen, H. and Walther, H. [eds.] (1981): Numerical Solution of Nonlinear Equations.
Lecture Notes in Mathematics, Vol. 878. Springer-Verlag, New York.
Peitgen, H. (1984): Harmonie in Chaos und Kosmos. Bremen.
Peitgen, H. and Richter, P. (1985): The Beauty of Fractals. Springer-Verlag, New York.
Penrose, R. (1972): Techniques of Differential Topology in Relativity. SIAM, Philadel-
phia.
Penrose, R. (1977): The twistor program. Rep. Math. Phys. 12, 65-76.
Penrose, R. (1979): Singularities and time-asymmetry. In: Hawking, S. and Israel,
W. [eds.] (1979).
Penrose, R. and Ward, R. (1980): Twistors for flat and curved space-time. In: Held,
A. [ed.] (1980), Vol. 2, 283-328.
Penrose, R. and Rindler, W. (1984): Spinors and Space-Time, Vols. 1, 2. University
Press, Cambridge, England.
Perron, O. (1929): Über Stabilität und asymptotisches Verhalten der Integrale von
Differentialgleichungssystemen. Math. Z. 29, 129-160.
Perron, O. (1930): Die Stabilitätsfrage bei Differentialgleichungen. Math. Z. 32, 703-728.
Petrina, D. and Gerasimenko, V. (1983): A mathematical approach to the evolution of
infinite systems in classical statistical mechanics. Uspekhi Mat. Nauk 38 (5), 3-58
(Russian).
Peyret, R. and Taylor, T. (1985): Computational Methods for Fluid Flow. Springer-
Verlag, New York.
Pflüger, A. (1965): Stabilitätsprobleme der Elastostatik. Springer-Verlag, Berlin.
Pflüger, A. (1967): Elementare Schalenstatik. Springer-Verlag, Berlin.
Phil, H. (1987): Zur Lösung des linearisierten Knickstabproblems mit beschränkter
Ausbiegung. Z. Anal. Anwendungen (to appear).
Phil, H. (1987a): On the optimal control of a hydroelectric power plant. Systems Control
Lett. 8, 281-288.
Pipkin, A. (1986): Lectures on Viscoelastic Theory. Springer-Verlag, New York.
Planck, M. (1900): Zur Theorie des Gesetzes der Energieverteilung im Normalspektrum.
Verh. Dt. Physik. Ges. Berlin 2, 237-248.
Planck, M. (1909): Gutachten zur Berufung Einsteins an die Universität Prag. (Cf. Frank,
P. (1949).)
Planck, M. (1913): Vorlesungen über Thermodynamik. De Gruyter, Berlin. (English
edition: Treatise on Thermodynamics. Dover, New York, 1945.)
Planck, M. (1945): Wissenschaftliche Autobiographie. Barth, Leipzig. (English edition:
Scientific Autobiography, Philosophical Library, 1949.)
Planck, M. (1967): Der Kausalbegriff in der Physik. Barth, Leipzig.
Pliss, V. (1977): Solution Sets of Periodic Differential Equations. Nauka, Moscow
(Russian).
Poincare, H. (1885): Les figures equilibrium. Acta Math. 7, 259-302.
Poincare, H. (1892): Les methodes nouvelles de la mecanique celeste, Vols. 1-3.
Gauthier-Villars, Paris.
Poincare, H. (1928): Oeuvres (Collected Works), Vols. 1-10. Gauthier-Villars, Paris.
Pontrjagin, L. (1966): Topological Groups. Gordon and Breach, New York.
Poston, T. and Stewart, I. (1978): Catastrophe Theory and Its Applications. Pitman,
London.
Prager, W. and Hodge, P. (1951): Theory of Perfectly Plastic Solids. Wiley, New York.
Prager, W. (1955): Probleme der Plastizitätstheorie. Birkhäuser, Basel. (English edition:
Problems in Plasticity. Addison-Wesley, London, 1959.)
Prager, W. (1961): Einführung in die Kontinuumsmechanik. Birkhäuser, Basel.
Prandtl, L. (1904): Über Flüssigkeitsbewegung bei sehr kleiner Reibung. In: Verh. d. III.
Internat. Mathematikerkongresses, Heidelberg.
Prandtl, L. (1924): Spannungsverteilung in plastischen Körpern. In: Proceedings of the
First Int. Congr. Appl. Mech., Delft, pp. 43-54.
Prandtl, L. (1949): Strömungslehre. Vieweg, Braunschweig.
Press, H. et al. (1986): Numerical Recipes: The Art of Scientific Computing. University
Press, Cambridge, England.
Pressley, A. and Segal, G. (1986): Loop Groups. Clarendon Press, Oxford, England.
Prigogine, I. (1979): Vom Sein zum Werden. Piper, München. (English edition: From
Being to Becoming.)
Prigogine, I. and Stengers, I. (1981): Dialog mit der Natur. Piper, München.
Primas, H. (1983): Chemistry, Quantum Mechanics, and Reductionism. Perspectives in
Theoretical Chemistry. Springer-Verlag, New York.
Prüfer, M. and Siegberg, H. (1979): On computational aspects of topological degree in
R^n. In: Peitgen, H. and Walther, H. [eds.] (1979), 410-433.
Pukhnacov, J. and Popov, J. (1985): Mathematik ohne Formeln. Urania-Verlag, Leipzig.
Pukhnacov, V. (1972): The plane steady problem with free boundary for the Navier-
Stokes equations. Prikl. Mech. Techn. Fiz. 5, 126-134 (Russian).
Pukhnacov, V. (1975): Nonclassical Problems in the Theory of a Boundary Layer.
University Press, Novosibirsk (Russian).

Quigg, C. (1983): Gauge Theories of the Strong, Weak and Electromagnetic Interactions.
Benjamin, London.

Rabier, P. (1985): Lectures on Topics in One-Parameter Bifurcation Problems.
Springer-Verlag, New York.
Rabier, P. (1986): A general study of nonlinear problems with three solutions in Hilbert
spaces. Arch. Rat. Mech. Anal. 95, 123-154.
Rabinowitz, P. (1968): Existence and nonuniqueness of rectangular solutions of the
Bénard problem. Arch. Rat. Mech. Anal. 29, 32-57.
Rabinowitz, P. [ed.] (1977): Applications of Bifurcation Theory. Academic Press, New
York.
Ramanathan, R. (1983): Introduction to the Theory of Economic Growth. Lecture Notes
in Economics, Vol. 205. Springer-Verlag, New York.
Rajaraman, R. (1982): Solitons and Instantons. North-Holland, Amsterdam.
Ramm, E. [ed.] (1982): Buckling of Shells. Springer-Verlag, New York.
Ranft, G. and Ranft, J. (1976): Elementarteilchen, Vols. 1, 2. Teubner, Leipzig.
Rankine, W. (1870): On the thermodynamic theory of waves of finite longitudinal
disturbance. Trans. Roy. Soc. London 160, 277-288.
Raschewskii, P. (1959): Riemannsche Geometrie und Tensoranalysis. Verl. d. Wiss.,
Berlin.
Rayleigh, J. (1910): Aerial plane waves of finite amplitude. Proc. Roy. Soc. London 84,
247-284.
Reasenberg, R. and Shapiro, I. (1983): Terrestrial and planetary relativity experiments.
In: Schmutzer, E. [ed.] (1983), 149-164.
Rebbi, C. and Soliani, G. (1984): Solitons and Particles. World Scientific, Singapore.
Recke, L. (1978): Anwendung der Verzweigungstheorie auf geometrisch nichtlineare
Schalengleichungen. Dissertation, Humboldt-Universität, Berlin.
Recke, L. (1987): Zur Überlagerung zweier Hopf-Bifurkationen. Dissertation B,
Humboldt-Universität Berlin (Seminarbericht Sektion Mathematik Nr. 79).
Reed, M. and Simon, B. (1972): Methods of Modern Mathematical Physics, Vols. 1-4.
Academic Press, New York.
Reeken, M. (1977): The equation of motion of a chain. Math. Z. 155, 219-237.
Reeken, M. (1979): Classical solutions of the chain equations. Math. Z. 165, 143-169,
166, 67-82.
Regge, T. (1984): The group manifold approach to unified gravity. In: DeWitt, B. and
Stora, R. [eds.] (1984), 933-1006.
Reichardt, H. (1985): Gauß und die Anfänge der nicht-euklidischen Geometrie. With
original papers by J. Bolyai, N. Lobačevskii, and F. Klein. Teubner-Verlag,
Leipzig.
Reif, F. (1965): Fundamentals of Statistical and Thermal Physics. McGraw-Hill, New
York. (German edition: De Gruyter, Berlin, 1976.)
Reiner, M. (1958): Rheology. In: Flügge, S. [ed.] (1956), Vol. VI, 434-550.
Reiss, E. (1977): Imperfect bifurcation. In: Rabinowitz, P. [ed.] (1977), 37-72.
Renardy, M. (1982): Bifurcation from rotating waves. Arch. Rat. Mech. Anal. 79, 49-84.
Renardy, M. (1983): A class of quasilinear parabolic equations with infinite delay and
application to a problem of viscoelasticity. J. Differential Equations 48, 280-
292.

Reuss, A. (1930): Berücksichtigung der elastischen Formänderung in der Plastizitäts-
theorie. Z. Angew. Math. Mech. 10, 266-271.
Reynolds, O. (1885): On the flow of gases. Proc. Manch. Lit. Phil. Sci.
Rheinboldt, W. (1986): Numerical Analysis of Parametrized Nonlinear Equations. Wiley,
New York.
Ricci, G. and Levi-Civita, T. (1901): Methodes de calcul differentiel absolu et leurs
applications. Math. Ann. 54, 125-201.
Richtmyer, R. and Morton, K. (1961): Difference Methods for Initial-Value Problems.
Interscience, New York.
Richtmyer, R. (1978): Principles of Advanced Mathematical Physics, Vols. 1, 2. Springer-
Verlag, Berlin.
Riedl, R. (1976): Die Strategie der Genesis. Piper, München.
Riedrich, T. (1976): Vorlesungen über nichtlineare Operatorgleichungen. Teubner, Leip-
zig.
Riemann, B. (1854): Über die Hypothesen, welche der Geometrie zugrunde liegen. Habili-
tationsvortrag. Abh. Akad. Wiss. Göttingen 13. (English translation: In: Spivak,
M. (1979), Vol. 2.)
Riemann, B. (1860): Über die Fortpflanzung ebener Luftwellen von endlicher Schwin-
gungsweite. Abh. Ges. Wiss. Göttingen. Math.-Naturwiss. Kl. 8, p. 43.
Riemann, B. (1861): Mathematical remarks answering a question asked by the famous
Paris Academy. In: Riemann, B. (1892), 391-404.
Riemann, B. (1892): Gesammelte mathematische Werke (Collected Mathematical Works).
Teubner, Leipzig.
Riesz, F. and Nagy, B. (1978): Functional Analysis. Ungar, New York.
Rindler, W. (1977): Essential Relativity. Special, General, Cosmological. Springer-
Verlag, New York.
Ritz, W. (1908): Über eine neue Methode zur Lösung gewisser Randwertaufgaben.
Göttinger Nachr., Math.-Naturwiss. Kl. 1908, 236-248.
Rivlin, R. and Ericksen, J. (1955): Stress-deformation relations for isotropic materials.
J. Rat. Mech. Anal. 4, 681-702.
Röpke, G. (1987): Statistische Mechanik für das Nichtgleichgewicht. Verl. d. Wiss.,
Berlin.
Ross, G. (1984): Grand Unified Theories. Benjamin, Reading, MA.
Roy, P. and Singh, V. [eds.] (1984): Supersymmetry and Supergravity. Lecture Notes
in Physics, Vol. 208.
Rozdestvenskii, B. and Janenko, N. (1978): Systems of Quasilinear Equations. Nauka,
Moscow (Russian).
Ruelle, D. (1969): Statistical Mechanics. Rigorous Results. Benjamin, New York.
Ruelle, D. and Takens, F. (1971): On the nature of turbulence. Commun. Math. Phys.
20, 167-192; 23, 343-344.
Ruelle, D. (1978): The Mathematical Structure of Classical Equilibrium Statistical
Mechanics. Addison-Wesley, Reading, MA.
Ruelle, D. (1980): Strange attractors. Math. Intelligencer 2, 126-137.
Ruelle, D. (1981): Differentiable dynamical systems and the problem of turbulence. Bull.
Amer. Math. Soc. (N.S.) 5, 29-42.
Ruelle, D. (1983): Turbulent dynamical systems. In: Warsaw (1983), 271-283.
Ruffini, R. [ed.] (1987): Proceedings of the Fourth Marcel Grossmann Meeting on
General Relativity, Vols. 1, 2. North-Holland, Amsterdam.

Russian Encyclopedia of Mathematics (1977): Edited by I. Vinogradov. Vol. III.
Sovetskaja Encyclopedia, Moscow (Russian).

Sabinina, E. (1961): On the Cauchy problem for the equation of nonstationary gas
filtration in several space variables. Dokl. Akad. Nauk SSSR 136, 1034-1037
(Russian).
Sachs, R. and Wu, H. (1977): General Relativity for Mathematicians. Springer-Verlag,
New York.
Sagan, C. and Shklovsky, I. (1968): Intelligent Life in the Universe. Dell, New York.
Sagan, C. and Agel, J. (1973): The Cosmic Connection: An Extraterrestrial Perspective.
New York. (German edition: München, 1978.)
Sagan, C. (1980): Cosmos. New York. (German edition: München, 1982.)
Sagan, C. (1980a): Signale der Erde. Droemer, München. (English edition: Murmurs of
Earth, New York, 1978.)
Sagan, C. (1982): Aufbruch in den Kosmos. Heyne, München. (English edition: Broca's
Brain, New York, 1979.)
Saint-Venant, de M. (1871): Sur les équations du mouvement intérieur des solides ductiles.
J. Math. Pures Appl. 16, 373-382.
Salam, A. and Sezgin, E. [eds.] (1986): Supergravity Theories, Anomalies, and Com-
pactification, Vols. 1, 2. World Scientific, Singapore.
Sard, A. (1942): The measure of the critical points of differentiable maps. Bull. Amer.
Math. Soc. 48, 883-890.
Sather, D. (1976): Branching and stability for nonlinear shells. In: Germain, P. and
Nayroles, B. [eds.] (1976), 462-473.
Sattinger, D. (1977): Selection mechanisms for pattern formation. Arch. Rat. Mech. Anal.
66, 31-42.
Sattinger, D. (1979): Group-Theoretic Methods in Bifurcation Theory. Lecture Notes in
Mathematics, Vol. 762. Springer-Verlag, Berlin.
Sattinger, D. (1980): Bifurcation and symmetry breaking in applied mathematics. Bull.
Amer. Math. Soc. (N.S.) 3, 779-819.
Sattinger, D. (1980a): Les symétries des équations et leurs applications dans la mécanique
et la physique. Publ. Math. d'Orsay No. 80.08. Université de Paris-Sud.
Sauer, R. (1960): Gasdynamik. Springer-Verlag, New York.
Sauer, R. (1966): Nichtstationäre Probleme der Gasdynamik. Springer-Verlag, Berlin.
Scarf, H. (1967): The approximation of fixed points of continuous mappings. SIAM
J. Appl. Math. 15, 1328-1343.
Schauder, J. (1978): Oeuvres (Collected Works). PWN, Warsaw.
Scheffer, V. (1980): The Navier-Stokes equations on a bounded domain. Commun. Math.
Phys. 73, 1-42.
Scheidegger, A. (1963): Hydrodynamics in porous media. In: Flügge, S. [ed.] (1956), Vol.
VIII/2, 625-662.
Scheidt, J. v. and Purkert, W. (1983): Random Eigenvalue Problems. North-Holland,
Amsterdam.
Schilling, K. (1986): Simpliziale Algorithmen zur Berechnung von Fixpunkten mengen-
wertiger Operatoren. Wissenschaftlicher Verlag, Trier.
Schlichting, S. (1960): Boundary Layer Theory. McGraw-Hill, New York.
Schmutzer, E. (1968): Relativistische Physik. Teubner, Leipzig.
Schmutzer, E. [ed.] (1983): Proceedings of the 9th International Conference on General
Relativity and Gravitation. Verl. d. Wiss., Berlin.

Schochet, S. (1986): The incompressible Euler equations in a bounded domain. Comm.
Math. Phys. 104, 49-75.
Schoen, R. and Yau, S.: Cf. Yau.
Scholz, E. (1980): Geschichte des Mannigfaltigkeitsbegriffes von Riemann bis Poincaré.
Birkhäuser, Basel.
Schouten, J. (1954): Ricci Calculus. Springer-Verlag, Berlin.
Schreier, S. (1982): Compressible Flow. Wiley, New York.
Schrödinger, E. (1926): Quantisierung als Eigenwertproblem. Ann. Physik 9, 361-376.
Schrödinger, E. (1927): Abhandlungen zur Wellenmechanik. Barth, Leipzig.
Schumann, R. (1987): Eine neue Methode zur Gewinnung starker Regularitätsaussagen
für das Signorini-Problem in der linearen Elastizitätstheorie. Dissertation B,
Universität Leipzig.
Schumann, R. (1988): Regularity for Signorini's problem in linear elastostatics. Manu-
scripta Math. 63, 255-291.
Schuster, H. (1984): Deterministic Chaos. Physik-Verlag, Weinheim, Federal Republic
of Germany.
Schwartz, J. (1968): Differential Geometry and Topology. Gordon and Breach, New
York.
Schwartz, J. (1969): Nonlinear Functional Analysis. Gordon and Breach, New York.
Schwarz, J. [ed.] (1985): Superstrings, Vols. 1, 2. World Scientific, Singapore.
Schwarzschild, K. (1916): Über das Gravitationsfeld eines Massenpunktes nach der
Einsteinschen Theorie. Sitzungsber. Preuss. Akad. Wiss. Berlin 1916, 189-196.
Sedov, L. (1959): Similarity and Dimensional Methods in Mechanics. Cleaver-Hume
Press, London.
Segre, G. (1987): Superstrings and four-dimensional physics. In: Latal, H. and Mitter,
H. [eds.] (1987), 101-150.
Seifert, H. (1983): Black holes, singularities, and topology. In: Schmutzer, E. [ed.] (1983),
133-150.
Serrin, J. (1952): Existence theorems for some hydrodynamical free boundary-value prob-
lems. J. Rat. Mech. Anal. 1, 1-48.
Serrin, J. (1953): On plane and axially symmetric free boundary problems. J. Rat. Mech.
Anal. 2, 563-575.
Serrin, J. (1959): Mathematical principles of classical fluid mechanics. In: Flügge, S.
[ed.] (1956), Vol. VIII/1, 125-246.
Serrin, J. (1959a): On the stability of viscous fluid motions. Arch. Rat. Mech. Anal. 3,
1-13.
Serrin, J. (1963): The initial-value problem for the Navier-Stokes equations. In: Langer,
R. [ed.] (1963), Nonlinear Problems. University of Wisconsin Press, pp. 69-98.
Serrin, J. (1979): Conceptual analysis of the second law of thermodynamics. Arch. Rat.
Mech. Anal. 70, 355-371.
Serrin, J. (1983): The structure and laws of thermodynamics. In: Warsaw (1983), 1717-
1728.
Serrin, J. [ed.] (1986): New Perspectives in Thermodynamics. Springer-Verlag, New
York.
Setti, L. and Van Hove, L. [eds.] (1984): Large-Scale Structure of the Universe and
Fundamental Physics. First ESO-CERN Symposium.
Sexl, R. and Sexl, M. (1981): Weisse Zwerge und schwarze Löcher. Vieweg, Braunschweig.
Sexl, R. (1982): Was die Welt zusammenhält: Physik auf der Suche nach dem Bauplan
der Natur. Dt. Verlagsanstalt, Stuttgart.

Sexl, R. and Urbantke, H. (1983): Gravitation und Kosmologie. Wissenschaftsverlag,
Mannheim.
Shanahan, P. (1978): The Atiyah-Singer Index Theorem. Lecture Notes in Mathe-
matics, Vol. 638. Springer-Verlag, Berlin.
Shih, T. (1984): Numerical Heat Transfer. Springer-Verlag, New York.
Shinbrot, M. (1973): Lectures on Fluid Mechanics. Gordon and Breach, New York.
Shinbrot, M. (1976): The initial-value problem for surface waves under gravity, I, II.
Indiana Univ. Math. J. 25, 281-300, 1049-1071.
Shinbrot, M. (1979): The initial-value problem for surface waves under gravity, III. J.
Math. Anal. Appl. 67, 340-391.
Showalter, W. (1978): Mechanics of Non-Newtonian Fluids. Pergamon Press, Oxford,
England.
Siegel, C. and Moser, J. (1971): Lectures on Celestial Mechanics. Springer-Verlag,
Berlin.
Signorini, A. (1959): Questioni di elasticita nonlinearizzata e semilinearizzata. Rend.
Mat. 18, 1-45.
Silk, J. (1980): The Big Bang. Freeman, San Francisco, CA.
Simon, B. (1984): Fifteen problems in mathematical physics. In: Jager, W., Moser, J., and
Remmert, R. [eds.] (1984), 423-450.
Simon, B. (1986): Cf. Cycon, R. (1986).
Simon, B. (1993): Cf. additional references.
Sinai, Ya. (1982): Theory of Phase Transitions. Rigorous Results. Pergamon Press,
Oxford, England.
Singh, V. (1983): Grand unification and the Big Bang cosmology. Progr. Phys. 31,
569-590.
Smale, S. (1965): An infinite-dimensional version of Sard's theorem. Amer. J. Math. 87,
861-866.
Smale, S. (1972): See Smale, S. (1980), pp. 95-105.
Smale, S. (1980): The Mathematics of Time: Essays on Dynamical Systems, Economic
Processes, and Related Topics. Springer-Verlag, New York.
Smale, S. (1981): The fundamental theorem of algebra and complexity theory. Bull. Amer.
Math. Soc. (N.S.) 4, 1-36.
Smale, S. (1982): Global analysis and economics. In: Arrow, K. and Intrilligator, A. [eds.]
(1982), Vol. 1, pp. 331-378.
Smirnow, W. (1956): Lehrgang der höheren Mathematik, Vols. 1-5. Verl. d. Wiss., Berlin.
(English edition: A Course in Higher Mathematics. Addison-Wesley, Reading, MA,
1964.)
Smoller, J. (1983): Shock Waves and Reaction-Diffusion Equations. Springer-Verlag,
New York.
Smoller, J. [ed.] (1983a): Nonlinear partial differential equations. Contemp. Math. 17.
Socolescu, D. (1977): Existenz- und Eindeutigkeitsbeweis für das Problem der Zusam-
menwirkung von Strahlen. Indiana Univ. Math. J. 26, 707-730.
Socolescu, D. (1980): Existenz- und Eindeutigkeitsbeweis für ein freies Randwertproblem
für die stationären Navier-Stokesschen Bewegungsgleichungen. Arch. Rat. Mech.
Anal. 73, 191-242.
Sod, G. (1978): A survey of several finite difference methods for systems of nonlinear
hyperbolic conservation laws. J. Comput. Phys. 29, 1-31.
Sod, G. (1985): Numerical Methods in Fluid Dynamics, Vols. 1, 2. University Press,
Cambridge, England.

Sokolovskii, V. (1955): Plastizitätstheorie. Verl. d. Wiss., Berlin.


Solonnikov, V. (1983): Solvability of three-dimensional free boundary problems for the
Navier-Stokes equations. Banach Center Publ. 10, 361-403.
Solonnikov, V. (1984): On the solvability of boundary and initial-boundary value prob-
lems of the Navier-Stokes system in a domain with noncompact boundaries. Pacific
J. Math. 93, 443-458.
Solomon, L. (1968): Elasticite lineaires. Masson, Paris.
Sommerfeld, A. (1944): See Sommerfeld, A. (1970).
Sommerfeld, A. (1954): Vorlesungen über theoretische Physik, Vols. 1-6. Geest & Portig,
Leipzig.
Sommerfeld, A. (1970): Mechanik der deformierbaren Medien. Geest & Portig, Leipzig.
(First edition, 1944.)
Sparrow, C. (1982): The Lorenz Equation: Bifurcation, Chaos, and Strange Attractors.
Springer-Verlag, Berlin.
Sperner, E. (1928): Neuer Beweis für die Invarianz der Dimensionszahl und des Gebietes.
Abh. Math. Sem. Univ. Hamburg 6, 265-272.
Spivak, M. (1979): A Comprehensive Introduction to Differential Geometry, Vols. 1-5.
Publish or Perish, Boston.
Spohn, W. (1969): Can mathematics be saved? Notices Amer. Math. Soc. 16, 890-894.
Stephani, H. (1977): Allgemeine Relativitätstheorie. Verl. d. Wiss., Berlin. (English
edition: General Relativity, Cambridge, 1981.)
Sternberg, S. (1964): Lectures on Differential Geometry. Prentice-Hall, Englewood
Cliffs, NJ.
Sternberg, S. (1969): Celestial Mechanics, Vols. 1, 2. Benjamin, Reading, MA.
Stoker, J. (1957): Water Waves. Interscience, New York.
Stokes, G. (1845): On the theories of the internal friction of fluids in motion. Cam. Trans.
Stoppeli, F. (1954): Un teorema di esistenza e di unicita relativo alle equazioni dell'
elastostatica isoterma per deformazioni finite. Ricerche Mat. 3, 247-267.
Straumann, N. (1984): General Relativity and Relativistic Astrophysics. Springer-
Verlag, New York.
Streater, R. and Wightman, A. (1964): PCT, Spin, Statistics, and All That. Benjamin,
New York.
Strehlow, R. (1979): Fundamentals of Combustion. Krieger, New York.
Struik, D. (1926): Détermination rigoureuse des ondes irrotationelles permanentes dans un
canal à profondeur finie. Math. Ann. 95, 595-634.
Struik, D. (1948): A Concise History of Mathematics. Dover, New York.
Stuart, C. (1976): Steadily rotating chains. In: Germain, P. and Nayroles, B. [eds.]
(1976), 490-499.
Stumpff, (1973): Himmelsmechanik, Vols. 1-3. Akademie-Verlag, Berlin.
Sulanke, R. and Wintgen, P. (1972): Differentialgeometrie und Faserbündel. Verl. d.
Wiss., Berlin.
Sullivan, W. (1979): Black Holes-the Edge of Space-the End of Time. Anchor Press,
New York. (German edition: Breidenstein, Frankfurt/Main, 1980.)
Swinney, H. and Gollub, J. (1978): The transition to turbulence. Physics Today 31 (8),
41-49.
Szabo, I. (1987): Geschichte der mechanischen Prinzipien und ihrer wichtigsten Anwen-
dungen. Birkhäuser, Basel.
Szillard, R. (1974): Theory and Analysis of Plates. Prentice-Hall, Englewood Cliffs, NJ.
Szücs, E. (1980): Similitude and Modelling. Akademiai Kiado, Budapest.

Ta-Pei Cheng and Ling-Fong Li (1984): Gauge Theory of Elementary Particle Physics.
Clarendon Press, Oxford, England.
Tartar, L. (1979): Compensated compactness and partial differential equations. In:
Knops, R. [ed.], Nonlinear Analysis and Mechanics, Vol. IV. Pitman, London,
pp. 136-212.
Tartar, L. (1983): The compensated compactness method applied to systems of conserva-
tion laws. In: Ball, J. [ed.] (1983a).
Tassoul, J. (1978): Theory of Rotating Stars. University Press, Princeton, NJ.
Taube, M. (1985): Evolution of Matter and Energy on a Cosmic and Planetary Scale.
Springer-Verlag, New York.
Taubes, C. (1986): Physical and mathematical applications of gauge theories. Notices
Amer. Math. Soc. 33, 707-715.
Taylor, G. (1923): Stability of a viscous liquid contained between two rotating cylinders.
Phil. Trans. Roy. Soc. London A223, 289-343.
Telionis, D. (1981): Unsteady Viscous Flow. Springer-Verlag, New York.
Temam, R. (1975): On the Euler equations of incompressible perfect fluids. J. Funct.
Anal. 20, 32-43.
Temam, R. (1977): Navier-Stokes Equations: Theory and Numerical Analysis. North-
Holland, New York.
Temam, R. (1983): Navier-Stokes Equation and Nonlinear Functional Analysis.
CBMS-NSF Regional Conference Series in Applied Mathematics. SIAM, Philadelphia.
Temam, R. (1983a): Problèmes mathématiques en plasticité. Gauthier-Villars, Paris.
(English edition, Paris, 1985.)
Temam, R. (1986): A generalized Norton-Hoff model and the Prandtl-Reuss law of
plasticity. Arch. Rat. Mech. Anal. 95, 137-183.
Thirring, W. (1983): A Course in Mathematical Physics, Vols. 1-4. Springer-Verlag,
Wien.
Thom, R. (1972): Stabilité structurelle et morphogénèse. (English edition: Benjamin,
New York, 1975.)
Thomasset, F. (1981): Implementation of Finite Element Methods for Navier-Stokes
Equations. Springer-Verlag, New York.
Thomson, W., Sir, (1869): On Vortex Motion. Edin. Trans. XXV.
Ting, T. (1969): Elastic-plastic torsion. Arch. Rat. Mech. Anal. 34, 228-244.
Ting, T. (1972): Topics in Mathematical Theory of Plasticity. In: Flügge, S. [ed.] (1956),
Vol. Vla/3, 535-623.
Tipler, F., Clarke, C., and Ellis, G. (1980): Singularities and horizons. In: Held, A. [ed.]
(1980), Vol. 2, 97-206.
Todd, M. (1976): The Computation of Fixed Points and Applications. Lecture Notes in
Economics, Vol. 124. Springer-Verlag, New York.
Treder, H. (1971): Gravitationstheorie und Äquivalenzprinzip. Akademie-Verlag, Berlin.
Treder, H. (1972): Die Relativität der Trägheit. Akademie-Verlag, Berlin.
Treder, H. (1974): Über die Prinzipien der Dynamik von Einstein, Hertz, Mach und
Poincaré. Akademie-Verlag, Berlin.
Treder, H. (1975): Elementare Kosmologie. Akademie-Verlag, Berlin.
Treder, H. (1983): Große Physiker und ihre Probleme. Akademie-Verlag, Berlin.
Trefftz, E. (1927): Ein Gegenstück zum Ritzschen Verfahren. Zweiter Kongress für
Technische Mechanik, Zürich.
Trefftz, E. (1928): Mathematische Elastizitätstheorie. In: Geiger, H. and Scheel, K. [eds.]
(1926), Vol. 6, 47-140.

Trefil, J. (1983): The Unexpected Vista: A Physicist's View of Nature. Charles Scribner's
Sons, New York.
Trefil, J. (1984): The Moment of Creation: Big Bang Physics. Charles Scribner's Sons,
New York. (German edition: Birkhäuser, Basel, 1985.)
Trenogin, V. (1970): The asymptotic method of Ljusternik-Visik. Uspekhi Mat. Nauk
25 (4), 123-156 (Russian).
Tresca, H. (1864): C. R. Acad. Sci. Paris 59, 754.
Triebel, H. (1972): Höhere Analysis. Verl. d. Wiss., Berlin.
Triebel, H. (1981): Analysis und mathematische Physik. Teubner, Leipzig. (English
edition, Leipzig, 1985.)
Tromba, A. (1976): Fredholm vector fields and transversality. J. Funct. Anal. 23, 362-
368.
Tromba, A. (1976a): Almost Riemannian structures on Banach manifolds, the Morse
lemma, and the Darboux theorem. Canad. J. Math. 28, 640-652.
Tromba, A. (1978): The Euler characteristic of vector fields on Banach manifolds and a
globalization of Leray-Schauder degree. Adv. Math. 28, 148-173.
Tromba, A. (1983): A sufficient condition for a critical point of a functional to be a
minimum and its application to Plateau's problem. Math. Ann. 263, 303-312.
Truesdell, C. and Noll, W. (1965): The non-linear field theories of mechanics. In: Flügge,
S. [ed.] (1956), Vol. III/3.
Truesdell, C. (1968): Essays in the History of Mechanics. Springer-Verlag, New York.
Truesdell, C. (1977): A First Course in Rational Mechanics, Vols. 1, 2. Academic Press,
New York.
Truesdell, C. and Muncaster, R. (1980): Fundamentals of Maxwell's Kinetic Theory of
a Simple Monatomic Gas. Academic Press, New York.
Truesdell, C. (1983): The influence of elasticity on analysis, the classical heritage. Bull.
Amer. Math. Soc. (N.S.) 9, 293-310.
Turner, R. (1981): Internal waves in fluids with rapidly varying density. Ann. Scuola
Norm. Sup. Pisa, Cl. Sci. Ser. 4, 8, 513-573.
Tymoczko, T. [ed.] (1985): New Directions in the Philosophy of Mathematics. Birk-
häuser, Boston.

Unsöld, A. and Baschek, B. (1981): Der neue Kosmos. Springer-Verlag, Berlin.


Uralceva, N. (1973): On the solvability of the capillary problem. Vestnik Leningrad.
Univ. Ser. Math. (1973) 19, 54-64; (1975) 1, 143-149.

Van der Meer, C. (1985): The Hamiltonian-Hopf Bifurcation. Lecture Notes in Mathe-
matics, Vol. 1160. Springer-Verlag, Berlin.
Van der Waerden, B. (1980): Group Theory and Quantum Mechanics. Springer-Verlag,
New York.
Van Nostrand's Scientific Encyclopedia (1976): Vols. 1-5. Van Nostrand, New York.
(German edition: Enzyklopädie Naturwissenschaft und Technik, Verlag moderne
Industrie, München.)
Velte, W. (1966): Stabilität und Verzweigung stationärer Lösungen der Navier-Stokes-
schen Gleichungen beim Taylorproblem. Arch. Rat. Mech. Anal. 22, 1-14.
Vinogradov, A. and Kuperschmidt, B. (1977): Structure of Hamiltonian Mechanics.
Uspekhi Math. Nauk 32 (4), 175-236 (Russian).

Visconti, A. (1987): Introductory Differential Geometry for Physicists. World Scientific,
Singapore.
Visik, M. and Fursikov, A. (1980): Mathematical Problems in Statistical Hydrodynamics.
Nauka, Moscow (Russian). (German edition: Teubner, Leipzig, 1986.)
Vladimirov, V. (1976): Einführung in die physikalische Theorie der Plastizität und
Festigkeit. Verlag für Grundstoffindustrie, Leipzig.
Vladimirov, V. [ed.] (1986): Collection of Survey Articles on the Occasion of the Fifth
Anniversary of the Steklov Institute. Trudy Mat. Inst. Steklova 175.
Vlasov, V. (1964): General Theory of Shells and Its Applications in Engineering. NASA,
Washington, D.C.
Vogel, H. (1977): Probleme aus der Physik. Springer-Verlag, Berlin.
Volmir, A. (1972): Nonlinear Dynamics of Plates and Shells. Nauka, Moscow (Russian).
Vorovic, I. (1955): On the existence of solutions in nonlinear shell theory. Izv. Akad.
Nauk SSSR Ser. Mat. 19, 173-186 (Russian).
Vul, E., Sinai, Ja., and Chanin, K. (1984): Universality of Feigenbaum and thermo-
dynamical formalism. Uspekhi Mat. Nauk 39 (3), 3-37 (Russian).

Wahl, W. v. (1985): The Equations of Navier-Stokes and Abstract Parabolic Equations.
Vieweg, Braunschweig.
Wald, R. (1984): General Relativity. University Press, Chicago, IL.
Walras, L. (1874): Elements d'economie politique pure. Corbaz, Lausanne.
Walter, W. (1964): Differential- und Integralungleichungen. Springer-Verlag, New York.
Wang, C. (1979): Mathematical Principles of Mechanics and Electromagnetism, Vols.
1, 2. Plenum, New York.
Warner, F. (1971): Foundations of Differentiable Manifolds and Lie Groups. Scott-
Foresman, Dallas, TX.
Warsaw (1983): Proceedings of the International Conference of Mathematicians in
Warsaw, Vols. 1, 2. PWN, Warsaw and North-Holland, Amsterdam.
Washington, W. and Parkinson, C. (1987): An Introduction to Three-Dimensional
Climate Modelling. University Press, Oxford, England.
Washizu, K. (1968): Variational Methods in Elasticity and Plasticity. Pergamon Press,
Oxford, England.
Weber, J. (1980): A search for gravitational radiation. In: Held, A. [ed.] (1980), Vol. 1,
435-468.
Wehrl, A. (1978): General properties of entropy. Rev. Mod. Phys. 50, 221-260.
Weierstrass, K. (1857): Antrittsrede in der Berliner Akademie (Inaugural speech). See
Weierstrass, K. (1894/1927), Vol. 1, pp. 293-296.
Weierstrass, K. (1894/1927): Mathematische Werke (Mathematical Works), Vols. 1-7.
Berlin.
Weinberg, S. (1972): Gravitation and Cosmology. Wiley, New York.
Weinberg, S. (1977): The First Three Minutes: A Modern View of the Origin of the
Universe. Basic Books, New York. (German edition: Piper, München, 1977.)
Weinberg, S. [ed.] (1983): Interaction Between Elementary Particle Physics and Cos-
mology. Wiley, New York.
Weinberg, S. (1984): Teile des Unteilbaren. Spektrum der Wissenschaft, Heidelberg.
(English edition: The Discovery of Subatomic Particles. Scientific American Books,
New York, 1983.)

Weinberg, S. (1986): Cf. Mathematics: The Unifying Thread (1986).


Weinberg, S. [ed.] (1986a): Physics in Higher Dimensions. Wiley, New York.
Weinstein, A. (1977): Lectures on Symplectic Manifolds. American Mathematical
Society, Providence, RI.
Weizsäcker, C. v. (1973): Die philosophische Interpretation der modernen Physik.
Deutsche Akademie d. Naturforscher Leopoldina, Halle.
Weizsäcker, C. v. (1976): Die Tragweite der Wissenschaft. Hirzel, Stuttgart.
Weizsäcker, C. v. (1976a): Zum Weltbild der Physik. Hirzel, Stuttgart.
Weizsäcker, C. v. (1979): Die Einheit der Natur. Hanser-Verlag, München.
Weizsäcker, C. v. (1979a): Die Geschichte der Natur. Vandenhoeck & Ruprecht, Göt-
tingen.
Weller, W. and Winkler, H. (1974): Grundkurs klassische Physik, Vols. 1, 2. Teubner,
Leipzig.
Wells, R. (1979): Complex manifolds and mathematical physics. Bull. Amer. Math. Soc.
(N.S.) 1, 296-336.
Wells, R. (1980): Differential Analysis on Complex Manifolds. Springer-Verlag, New
York.
Wess, J. and Bagger, J. (1983): Supersymmetry and Supergravity. University Press,
Princeton, NJ.
West, P. (1986): Introduction to Supersymmetry and Supergravity. World Scientific,
Singapore.
Westenholz, C. v. (1981): Differential Forms in Mathematical Physics. North-Holland,
Amsterdam.
Weyl, H. (1923): Raum, Zeit, Materie. Springer-Verlag, Berlin. (First edition, 1918.)
Weyl, H. (1952): Symmetry. University Press, Princeton, NJ. (German edition: Birk-
häuser, Stuttgart, 1955.)
Weyl, H. (1966): Philosophie der Mathematik und der Naturwissenschaft. Leibniz-
Verlag, München. (English edition: Philosophy of Mathematics and Natural Sci-
ence. University Press, Princeton, NJ., 1949.)
Weyl, H. (1968): Gesammelte Werke (Collected Works), Vols. 1-4. Springer-Verlag, New
York.
Whitham, G. (1974): Linear and Nonlinear Waves. Wiley, New York.
Whitney, H. (1936): Differentiable manifolds. Ann. of Math. 37, 645-680.
Whitney, H. (1944): The self-intersection of a smooth n-manifold in 2n-space. Ann. of
Math. 45, 220-246.
Whitney, H. (1944a): The singularities of a smooth n-manifold in (2n-1)-space. Ann.
of Math. 45, 247-293.
Wilkinson, W. (1960): Non-Newtonian Fluids. Fluid Mechanics, Mixing and Heat
Transfer. Pergamon Press, New York.
Wille, F. (1982): Humor in der Mathematik. Vandenhoeck & Ruprecht, Göttingen.
Williams, F. (1964): Combustion Theory. Addison-Wesley, Reading, MA.
Wilson, K. (1982): The renormalization group and critical phenomena. In: Nobel Prizes
(1954ff), Vol. 1982, 57-87.
Wintner, A. (1947): The Analytical Foundations of Celestial Mechanics. University
Press, Princeton, NJ.
Witten, E. (1986): Topological tools in ten dimensions, and unification in ten dimensions.
In: Green, M. and Gross, D. [eds.] (1986), 400-458.
Worbs, E. (1955): Carl Friedrich Gauss. Koehler & Amelang, Leipzig.

Wussing, H. (1974): Carl Friedrich Gauss. Teubner, Leipzig.


Wussing, H. (1979): Vorlesungen zur Geschichte der Mathematik. Verl. d. Wiss., Berlin.
Wussing, H. [ed.] (1983): Geschichte der Naturwissenschaften. Edition Leipzig.

Yau, S. and Schoen, R. (1979): On the proof of the positive mass conjecture in general
relativity. Comm. Math. Phys. 65 (1979), 45-76; 79 (1981), 231-260.
Yau, S. and Schoen, R. (1983): The existence of a black hole due to condensation of
matter. Comm. Math. Phys. 90, 575-579.
Yosida, K. (1965): Functional Analysis. Springer-Verlag, New York.
Young, L. (1981): Mathematicians and Their Times. North-Holland, Amsterdam.

Zabusky, N. [ed.] (1968): Topics in Nonlinear Physics. Springer-Verlag, New York.


Zee, A. [ed.] (1982): Unity of Forces in the Universe, Vols. 1, 2. World Scientific,
Singapore.
Zeeman, F. (1976): Euler Buckling. Lecture Notes in Mathematics, Vol. 525, 373-395.
Springer-Verlag, Berlin.
Zeidler, E. (1968): Beiträge zur Theorie und Praxis freier Randwertaufgaben. Habili-
tationsschrift, Universität Leipzig. Published as a monograph, Akademie-Verlag,
Berlin, 1971.
Zeidler, E. (1971): Existenzbeweis für cnoidal waves unter Berücksichtigung der Ober-
flächenspannung. Arch. Rat. Mech. Anal. 41, 81-107.
Zeidler, E. (1972): Zur Bifurkationstheorie und zur Stabilitätstheorie der Navier-
Stokesschen Gleichungen. Math. Nachr. 52, 167-205.
Zeidler, E. (1972a): Existenz einer Gasblase in einer Parallel- und Zirkulationsströmung
unter Berücksichtigung der Schwerkraft. Beiträge Anal. 3, 67-95.
Zeidler, E. (1972b): Existenzbeweis für asymptotische Wirbelwellen. Beiträge Anal. 3,
109-134.
Zeidler, E. (1973): Existenzbeweis für permanente Kapillar-Schwerewellen mit allge-
meinen Wirbelverteilungen. Arch. Rat. Mech. Anal. 50, 34-72.
Zeidler, E. (1976): Lokale und globale Bifurkationsresultate für Variationsungleichungen.
Math. Nachr. 71, 37-63.
Zeidler, E. (1977): Bifurcation theory and permanent waves. In: Rabinowitz, P. [ed.]
(1977), 203-224.
Zeidler, E. (1979): Vektoranalysis, Differentialgeometrie, Tensoranalysis. In: Bronstein,
I. and Semendjaev, K. [eds.] (1979), Vol. 1, 605-658; Vol. 2, 70-87. (English
edition: Bronshtein, I. and Semendjaev, K. [eds.]: Handbook of Mathematics,
pp. 550-597; 808-825. Van Nostrand, New York, 1985.)
Zeldovic, J. and Novikov, I. (1971): Theory of Gravitation and Evolution of Stars.
Nauka, Moscow (Russian).
Zeldovic, J. and Novikov, I. (1971a): Relativistic Astrophysics. University Press, Chi-
cago, IL.
Zeytounian, R. (1987): Les modèles asymptotiques de la mécanique des fluides, Vols. 1,
2. Springer-Verlag, New York.
Ziegler, H. (1983): Introduction to Thermomechanics. North-Holland, Amsterdam.
Additional References

Cf. also the "Additional References" to the revised edition of Volume 1.


Adams, D. and Hedberg, L. (1996): Function Spaces and Potential Theory.
Springer-Verlag, Berlin, Heidelberg (General Reference).
Aebischer, B. et al. (1994): Symplectic Geometry: An Introduction. Birkhäuser, Basel
(Chapter 58).
Aldroubi, A. and Unser, M. (1996): Wavelets in Medicine and Biology. CRC Press,
New York (General Reference).
Alexander, D. (1994): A History of Complex Dynamics: From Schröder to Fatou and
Julia. Vieweg, Braunschweig (Chapter 79).
Alinhac, S. and Gérard, P. (1991): Opérateurs pseudo-différentiels et théorème de Nash-
Moser. Intereditions, Paris (General Reference).
Allgower, E. and Georg, K. (1990): Numerical Continuation Methods. Springer-Verlag,
New York (Chapter 78).
Allgower, E., Böhmer, K., and Golubitsky, M. (eds.) (1992): Bifurcation and Symmetry:
Cross Influence between Mathematics and Applications. Birkhäuser, Basel
(Chapter 79).
Amann, H. (1995): Linear and Quasilinear Parabolic Problems. Vol. 1: Abstract Linear
Theory. Vols. 2, 3 (to appear). Birkhäuser, Basel (General Reference).
Ambrosetti, A. (1993): A Primer of Nonlinear Analysis. Cambridge University Press,
Cambridge, UK (General Reference).
Ambrosetti, A. and Chang, K. (eds.)(1993): Variational Methods in Nonlinear Analysis.
Gordon & Breach, Newark, NJ (General Reference).
Ambrosetti, A. and Coti-Zelati, V. (1993): Periodic Solutions of Singular Lagrangian
Systems. Birkhäuser, Basel (Chapter 58).
Antes, H. and Panagiotopoulos, P. (1992): The Boundary Integral Approach to Static
and Dynamic Contact Problems: Equality and Inequality Methods. Birkhäuser,
Basel (Chapter 63).
Antman, S. (1995): Nonlinear Elasticity. Springer-Verlag, New York (Chapter 61).
Arbib, M. (ed.) (1995): The Handbook of Brain Theory and Neural Networks. MIT
Press, Cambridge, MA (General Reference).
Arnold, V. (ed.) (1988/94): Dynamical Systems. Vols. 1-8. Encyclopedia of Mathematical
Sciences. Springer-Verlag, New York (Chapter 79).
Aubin, J. (1991): Viability Theory. Birkhäuser, Basel (Chapter 77).
Aubin, J. (1993): Optima and Equilibria. Springer-Verlag, New York (Chapter 77).
Auerbach, A. (1994): Interacting Electrons and Quantum Magnetism. Springer-Verlag,
New York (General Reference).

Baez, J. (1994): Knots and Quantum Gravity. Oxford University Press, Oxford (Chapter
76).
Baggett, L. (1992): Functional Analysis: A Primer. Marcel Dekker, New York (General
Reference).
Bakelman, I. (1994): Convex Analysis and Nonlinear Geometric Elliptic Equations.
Springer-Verlag, Berlin, Heidelberg (General Reference).
Banks, R. (1994): Growth and Diffusion Phenomena. Springer-Verlag, Berlin, Heidelberg
(Chapter 69).
Bar'yakhtar, V., Chetkin, M., Ivanov, B., and Gadetskii, S. (1994): Dynamics of Topo-
logical Magnetic Solitons: Experiment and Theory. Springer-Verlag, Berlin,
Heidelberg (Chapter 71).
Bartsch, T. (1993): Topological Methods for Variational Problems with Symmetries.
Springer-Verlag, Berlin, Heidelberg (General Reference).
Beals, M. (1989): Propagation and Interaction of Singularities in Nonlinear Hyperbolic
Problems. Birkhäuser, Basel (General Reference).
Beaulieu, L. (1997): Nicolas Bourbaki: History and Legend. Springer-Verlag, Berlin,
Heidelberg (General Reference) (to appear).
Beem, J., Ehrlich, P., and Easley, K. (1996): Global Lorentzian Geometry. Marcel
Dekker, New York (Chapter 76).
Bellissard, J. (1996): Applications of C*-Techniques to Modern Quantum Physics.
Springer-Verlag, Berlin, Heidelberg (Chapter 59).
Benatti, F. (1993): Deterministic Chaos in Infinite Quantum Systems. Springer-Verlag,
Berlin, Heidelberg (Chapters 59 and 68).
Bensoussan, A., Da Prato, G., Delfour, M., and Mitter, S. (1993): Representation and
Control of Infinite-Dimensional Systems. Vols. 1, 2. Birkhäuser, Basel (General
Reference).
Berezin, F. and Shubin, M. (1991): The Schrödinger Equation. Kluwer, Dordrecht
(Chapter 59).
Berezin, F. (1987): Introduction to Superanalysis. Reidel, Dordrecht (Chapter 76).
Bertil, G. and Kreiss, H. (1995): Time Dependent Problems and Difference Methods.
Wiley, New York (General Reference).
Bertin, J., Glowinski, R., and Periaux, J. (eds.) (1989): Hypersonics. Vol. 1: Defining
the Hypersonic Environment. Vol. 2: Computation and Measurement of Hypersonic
Flows. Birkhäuser, Basel (Chapter 70).
Bertin, J., Periaux, J., and Dallmann, J. (1992): Advances in Hypersonics. Vols. 1-3.
Birkhäuser, Basel (Chapter 70).
Bethuel, F., Brezis, H., and Helein, F. (1994): Ginzburg-Landau Vortices. Birkhäuser,
Basel (Chapter 67).
Binder, K. and Heermann, D. (1993): Monte-Carlo Simulation in Statistical Physics.
Springer-Verlag, Berlin, Heidelberg (Chapter 68).
Binney, J. and Tremaine, S. (1988): Galactic Dynamics. Princeton University Press,
Princeton, NJ (Chapter 76).
Bishop, C. (1996): Neural Networks for Pattern Recognition. Oxford University Press,
Oxford, UK (General Reference).
Bloom, F. (1993): Mathematical Problems of Classical Nonlinear Electromagnetic The-
ory. Longman, Harlow, UK (Chapter 76).
Bobylev, N., Burman, Yu., and Korovin, S. (1994): Approximation Procedures in
Nonlinear Oscillation Theory. De Gruyter, Berlin (Chapter 79).
Boccaletti, D. and Pucacco, G. (1996): Theory of Orbits. Vols. 1, 2. Springer-Verlag,
Berlin, Heidelberg (Chapter 79).
Bogoljubov, N. et al. (1990): General Principles of Quantum Field Theory. Kluwer,
Dordrecht (General Reference).
Border, K. (1985): Fixed Point Theorems with Applications to Economics and Game
Theory. Cambridge University Press, Cambridge, UK (Chapter 78).
Bott, R. (1993): Collected Works. Vol. 1: Topology and Lie Groups. Vol. 2: Differential
Operators. Vol. 3: Foliations. Vol. 4: Mathematics Related to Physics. Birkhäuser,
Basel (General Reference).
Bott, R. and Tu, L. (1994): Differential Forms in Algebraic Topology. Springer-Verlag,
New York (Chapter 74).
Bourbaki, N. (1994): Elements of the History of Mathematics. Springer-Verlag, Berlin,
Heidelberg (General Reference).
Bourguignon, J. (1996): Variational Calculus. Springer-Verlag, Berlin, Heidelberg (Gen-
eral Reference).
Brand, H. (1995): Spatial Structures in Systems Far From Equilibrium. Springer-Verlag,
Berlin, Heidelberg (Chapter 67).
Bredon, G. (1993): Topology and Geometry. Springer-Verlag, New York (General
Reference).
Brody, T. (1993): The Philosophy Behind Physics. Springer-Verlag, Berlin, Heidelberg
(General Reference).
Brokate, M. and Sprekels, J. (1996): Hysteresis Phenomena in Phase Transitions.
Springer-Verlag, Berlin, Heidelberg (Chapter 67).
Browder, F. (ed.) (1992): Nonlinear and Global Analysis. Reprints from the Bulletin of
the American Mathematical Society. Providence, RI (General Reference).
Brown, D. and Smith, K. (1991): Frontiers of Mathematical Psychology. Springer-
Verlag, New York (General Reference).
Brown, L. (ed.) (1993): Renormalization: From Lorentz to Landau and Beyond. Springer-
Verlag, New York (Chapter 59).
Brown, R. (1993): A Topological Introduction to Nonlinear Analysis. Birkhäuser, Basel
(Chapter 77).
Brown, R. and Davis, S. (1994): Free Boundaries in Viscous Flows. Springer-Verlag,
New York (Chapter 71).
Brumberg, V. (1995): Analytical Techniques of Celestial Mechanics. Springer-Verlag,
Berlin, Heidelberg (Chapter 58).
Bruno, A. (1994): The Restricted 3-Body Problem: Plane Periodic Orbits. De Gruyter,
Berlin (Chapter 58).
Buchheim, G. and Sonnemann, R. (eds.) (1989): Lebensbilder von
Ingenieurwissenschaftlern: Eine Sammlung von Biographien aus zwei Jahrhunderten.
Birkhäuser, Basel (General Reference).
Buechler, S. (1996): Essential Stability Theory. Springer-Verlag, Berlin, Heidelberg
(Chapter 79).
Buttazzo, G. and Visintin, A. (eds.) (1994): Motion by Mean Curvature and Related
Topics. De Gruyter, Berlin (cf. also Damlamian, Spruck, and Visintin (eds.) (1995))
(Chapter 74).
Caffarelli, L. and Cabre, X. (1995): Fully Nonlinear Elliptic Equations. American
Mathematical Society, Providence, RI (General Reference).
Carmichael, H. (1997): Quantum Statistical Methods in Quantum Optics. Springer-
Verlag, Berlin, Heidelberg (Chapter 59) (to appear).
Carmo, M. (1993): Riemannian Geometry. Birkhäuser, Boston, MA (Chapter 74).
Carmo, M. (1994): Differential Forms. Springer-Verlag, Berlin, Heidelberg (Chapter 74).
Casacuberta, C. and Castellet, M. (1992): Mathematical Research Today and Tomorrow:
Viewpoints of Seven Fields Medalists. Springer-Verlag, Berlin, Heidelberg (Gen-
eral Reference).
Cercignani, C., Illner, R., and Pulvirenti, M. (1994): The Theory of Dilute Gases.
Springer-Verlag, Berlin, Heidelberg (Chapter 68).
Chang, K. (1997): Critical Point Theory and its Applications. Springer-Verlag, Berlin,
Heidelberg (General Reference) (to appear).
Choquet-Bruhat, Y., DeWitt-Morette, C., and Dillard-Bleick, M. (1988): Analysis,
Manifolds, and Physics. Vol. 2. North-Holland, Amsterdam (Chapter 73).
Chorin, A. (1994): Vorticity and Turbulence. Springer-Verlag, New York (Chapter 72).
Chossat, P. and Iooss, G. (1994): The Couette-Taylor Flow. Springer-Verlag, New
York (Chapter 72).
Christodoulou, D. and Klainerman, S. (1993): The Global Nonlinear Stability of the
Minkowski Space. Princeton University Press, Princeton, NJ (Chapter 76).
Chung, K. and Zhao, Z. (1994): From Brownian Motion to Schrödinger's Equation.
Springer-Verlag, Berlin, Heidelberg (Chapter 59).
Ciarlet, P. (1990): Plates and Junctions in Elastic Multi-Structures. Springer-Verlag,
New York (Chapter 65).
Clarke, C. (1994): The Analysis of Space-Time Singularities. Cambridge University
Press, Cambridge, UK (Chapter 76).
Clement, P. and Lumer, G. (eds.) (1994): Evolution Equations, Control Theory, and
Biomathematics. Marcel Dekker, New York (General Reference).
Colombeau, J. (1992): Multiplication of Distributions. Lecture Notes in Mathematics
Vol. 1532. Springer-Verlag, Berlin, Heidelberg (General Reference).
Colombini, F. et al. (1989): Partial Differential Equations and the Calculus of Variations:
Essays in Honor of Ennio de Giorgi. Vols. 1, 2. Birkhäuser, Basel (General
Reference).
Colton, D. and Kress, R. (1992): Inverse Acoustic and Electromagnetic Scattering.
Springer-Verlag, Berlin, Heidelberg (Chapter 59).
Companion Encyclopedia of the History and Philosophy of the Mathematical Sciences
(1994): Edited by I. Grattan-Guinness. Routledge, London (General Reference).
Conlon, L. (1992): Differentiable Manifolds: A First Course. Birkhäuser, Basel (Chapter
73).
Connes, A. (1994): Noncommutative Geometry. Academic Press, New York (General
Reference).
Coughran, W., Cole, J., Lloyd, P., and White, J. (1994): Semiconductors. Springer-
Verlag, Berlin, Heidelberg (Chapter 67).
Crandall, M., Benilan, P., and Pazy, A. (1997): Nonlinear Evolution Governed by
Accretive Operators. Springer-Verlag, Berlin, Heidelberg (General Reference) (to
appear).
Crandall, M., Ishii, H., and Lions, P. (1992): A User's Guide to Viscosity Solutions of
Second Order Partial Differential Equations. Bull. AMS 27, 1-67.
Cross, M. and Hohenberg, P. (1993): Pattern Formation Outside of Equilibrium. Rev.
Mod. Physics 65, 851-1112 (General Reference).
Czichos, H. (ed.) (1989): Hütte. Die Grundlagen der Ingenieurwissenschaften. 29. völlig
neubearbeitete Auflage. Springer-Verlag, Berlin, Heidelberg (General Reference).

Damlamian, A., Spruck, J., and Visintin, A. (1995): Motion by Mean Curvature and
Related Topics. De Gruyter, Berlin (Chapter 74).
Dalen, D. van (ed.) (1996): L.E.J. Brouwer Biography (General Reference).
Dal Maso, G. (1993): An Introduction to Γ-Convergence. Birkhäuser, Basel (General
Reference).
Dancer, E. (1994): Weakly Nonlinear Dirichlet Problems on Long or Thin Domains.
American Mathematical Society, Providence, RI (General Reference).
Das, A. (1993): The Special Theory of Relativity. Springer-Verlag, New York (Chapter
75).
Dautray, R. and Lions, J. L. (1990/93): Mathematical Analysis and Numerical Methods
for Science and Technology. Vol. 1: Physical Origins and Classical Methods. Vol.
2: Functional and Variational Methods. Vol. 3: Spectral Theory and Applications.
Vol. 4: Integral Equations and Numerical Methods. Vol. 5: Evolution Problems I.
Vol. 6: Evolution Problems II - The Navier-Stokes Equations and Transport
Equations and Numerical Methods. Springer-Verlag, Berlin, Heidelberg (General
Reference).
Davies, P. (ed.) (1989): The New Physics. Cambridge University Press, Cambridge, UK
(General Reference).
Davis, P. (1993): The Nature and Power of Mathematics. Princeton University Press,
Princeton, NJ (General Reference).
Day, W. (1993): Entropy and Partial Differential Equations. Longman, Harlow, UK
(General Reference).
De Gennes, P. and Prost, J. (1995): The Physics of Liquid Crystals. Clarendon Press,
Oxford, UK (Chapter 70).
Deimling, K. (1992): Multivalued Differential Equations. De Gruyter, Berlin (Chapter
79).
Demazure, M. (1996): Geometry: Catastrophes and Bifurcations. Springer-Verlag, Berlin,
Heidelberg (Chapter 79).
Deuflhard, P. and Bornemann, F. (1994): Numerische Mathematik II: Integration
gewöhnlicher Differentialgleichungen. De Gruyter, Berlin (English edition in
preparation) (General Reference).
Deuflhard, P. and Hohmann, A. (1993): Numerische Mathematik I: Eine algorithmisch
orientierte Einführung. De Gruyter, Berlin (English edition: Numerical Analysis:
A First Course in Scientific Computation, De Gruyter, Berlin, 1994) (General
Reference).
Deuring, P. (1994): The Stokes Problem in an Infinite Cone. Akademie-Verlag, Berlin
(Chapter 72).
DeWitt, B. (1992): Supermanifolds. Cambridge University Press, Cambridge, UK
(Chapter 76).
DiBenedetto, E. (1993): Degenerate Parabolic Equations. Springer-Verlag, New York
(General Reference).
Diekmann, O., Lunel, S., van Gils, A., and Walther, H. (1995): Delay Equations:
Functional Analysis, Complex Analysis, and Nonlinear Analysis. Springer-Verlag,
Berlin, Heidelberg.
Dieudonne, J. (1992): Mathematics-the Music of Reason. Springer-Verlag, Berlin,
Heidelberg (General Reference).
Dittrich, W. and Reutter, M. (1994): Classical and Quantum Dynamics from Classical
Paths to Path Integrals. Springer-Verlag, Berlin, Heidelberg (Chapter 59).
Dobrushin, R. and Kusuoka, S. (1993): Statistical Mechanics and Fractals. Springer-
Verlag, Berlin, Heidelberg (Chapter 68).
Donoghue, J., Golowich, E., and Holstein, B. (1992): The Dynamics of the Standard
Model. Cambridge University Press, Cambridge, UK (Chapters 59 and 76).
Duhem, P. (1991): The Aim and Structure of Physical Theory. Princeton University
Press, Princeton, NJ (General Reference).
Duistermaat, H. and Kolk, J. (1996): Lie Groups. Springer-Verlag, Berlin, Heidelberg
(General Reference).
Dunham, W. (1991): Journey Through Genius: The Great Theorems of Mathematics.
Penguin Books, New York (General Reference).
Duren, P. and Zdravkovska, S. (1994): Golden Years of Moscow Mathematics. Oxford
University Press, Oxford (General Reference).

Earman, J., Janssen, H., and Norton, J. (1993): The Attraction of Gravitation: New
Studies in the History of General Relativity. Birkhäuser, Basel (Chapter 76).
Economou, E. (1990): Green's Function in Quantum Physics. Springer-Verlag, New
York (Chapter 59).
Edwards, H. (1993): Advanced Calculus: A Differential Forms Approach. Birkhäuser,
Basel (Chapter 74).
Efendiev, M. (1997): Degree Theory for Nonlinear Pseudodifferential Operators and its
Applications in Mathematical Physics (Chapter 78) (to appear).
Egorov, Yu. and Shubin, M. (1991): Partial Differential Equations. Vols. 1-4. Encyclo-
pedia of Mathematical Sciences. Springer-Verlag, New York (General Reference).
Ehlers, J. and Rindler, W. (1996): Grundzüge der Relativitätstheorie. Springer-Verlag,
Berlin, Heidelberg (Chapter 76).
Eigen, M. and Schuster, P. (1996): The Hypercycle: A Principle of Natural Self-
Organization. Springer-Verlag, Berlin, Heidelberg (General Reference).
Eigen, M. and Winkler, R. (1993): Laws of the Game: How the Principles of Nature
Govern Chance. Princeton University Press, Princeton, NJ (General Reference).
Einstein, A. (1992): Mileva Marić: The Love Letters. Edited by J. Renn and
R. Schulmann. Princeton University Press, Princeton, NJ (General Reference).
Einstein, A. (1993): Collected Papers. Vol. 1: The Early Years: 1879-1902. Vol. 2: The
Swiss Years: Writings, 1900-1909. Vol. 3: The Swiss Years: Writings, 1909-1911.
Edited by M. Klein, A. Kox, J. Renn, and R. Schulmann. Princeton University
Press, Princeton, NJ (Chapter 76).
Encrenaz, T. and Bibring, J. (1994): The Solar System. Springer-Verlag, Berlin, Heidel-
berg (Chapter 58).
Encyclopedia of Applied Physics. Edited by G. Trigg. Vol. 1. VCH Publishers, New
York (General Reference).
Encyclopedia of Cosmology (1993): Edited by N. Hetherington. Garland Publishing,
New York (Chapter 76).
Encyclopedia of Mathematics (1988): Vols. 1-10. Kluwer, Dordrecht (General Refer-
ence).
Encyclopedia of the Mathematical Sciences (1988): Vols. 1ff. Springer-Verlag, New York
(General Reference).
Encyclopedia of Science and Technology (1992): Vols. 1-20. McGraw-Hill, New York
(General Reference).
Encyclopedic Dictionary of Mathematics (1993): Edited by Kiyosi Ito. The MIT Press,
Cambridge, MA (General Reference).
Enos, J. (ed.) (1993): Dynamics and Control of Mechanical Systems: The Falling Cat
and Related Problems. American Mathematical Society, Providence, RI (Chapter
58).
Esposito, G. (1993): Quantum Gravity, Quantum Cosmology, and Lorentzian Geometries.
Springer-Verlag, New York (Chapter 76).
Evans, L. (1990): Weak Convergence Methods for Nonlinear Partial Differential
Equations. American Mathematical Society, Providence, RI (General
Reference).
Evans, L. and Gariepy, R. (1996): Measure Theory and Fine Properties of Functions.
CRC Press, Boca Raton, FL (General Reference).

Farkas, M. (1994): Periodic Motions. Springer-Verlag, Berlin, Heidelberg (Chapter 79).


Ferreyra, G., Goldstein, G., and Neubrander, F. (eds.) (1994): Evolution Equations.
Marcel Dekker, New York (General Reference).
Finkelstein, D. (1995): Quantum Relativity: A Synthesis of the Ideas of Einstein and
Heisenberg. Springer-Verlag, Berlin, Heidelberg (Chapter 76).
Fitzpatrick, P. and Furi, M. (eds.) (1993): Topological Methods for Ordinary Differential
Equations. Lecture Notes in Mathematics Vol. 1537. Springer-Verlag, Berlin,
Heidelberg (Chapter 79).
Fletcher, C. (1991): Computational Techniques for Fluid Dynamics. Vols. 1, 2. Springer-
Verlag, Berlin, Heidelberg (Chapter 70).
Fletcher, C. and Srinivas, K. (1992): Computational Techniques for Fluid Dynamics: A
Solutions Manual. Springer-Verlag, Berlin, Heidelberg (Chapter 70).
Fomenko, A. (1994): Visual Geometry and Topology. Springer-Verlag, Berlin, Heidel-
berg (General Reference).
Fonseca, I. and Gangbo, W. (1995): Degree Theory in Analysis and Applications. Clarendon
Press, Oxford (Chapter 78).
Friedman, A. (1988/96): Mathematics in Industrial Problems. Vols. 1-7. Springer-
Verlag, New York (General Reference).
Friedman, A. and Spruck, J. (eds.) (1992): Variational and Free Boundary Problems.
Springer-Verlag, New York (General Reference).
Fröhlich, J. and Kerler, T. (1993): Quantum Groups, Quantum Categories, and Quantum
Field Theory. Lecture Notes in Mathematics. Vol. 1542. Springer-Verlag, Berlin,
Heidelberg (Chapters 59 and 76).
Fulde, P. (1995): Electron Correlations in Molecules and Solids. 3rd edition. Springer-
Verlag, New York (General Reference).

Galdi, G. (1994): An Introduction to the Mathematical Theory of the Navier-Stokes
Equations. Vols. 1-4. Springer-Verlag, Berlin, Heidelberg (Vols. 3 and 4 to appear)
(Chapter 72).
Gallot, S., Hulin, D., and Lafontaine, J. (1987): Riemannian Geometry. Springer-Verlag,
Berlin, Heidelberg (Chapter 74).
Gamkrelidze, R. (ed.) (1991): Geometry I-IV. Encyclopedia of the Mathematical Sci-
ences. Springer-Verlag, New York (General Reference).
Gardiner, C. (1993): Handbook of Stochastic Methods for Physics, Chemistry and the
Natural Sciences. Springer-Verlag, New York (General Reference).
Gell-Mann, M. (1994): The Quark and the Jaguar: Adventures in the Simple and the
Complex. Freeman, San Francisco, CA. (German edition: Das Quark und der
Jaguar: eine neue Theorie erklärt die Welt, Piper, München, 1994) (General
Reference).
Giaquinta, M. (1993): Introduction to Regularity Theory for Nonlinear Elliptic Systems.
Birkhäuser, Basel (General Reference).
Giaquinta, M. and Hildebrandt, S. (1996): Calculus of Variations. Vols. 1, 2. Springer-
Verlag, New York (General Reference).
Girvin, S. and Prange, R. (1990): The Quantum Hall Effect. Springer-Verlag, New York
(Chapter 59).
Göckeler, M. and Schücker, T.: Differential Geometry, Gauge Theories, and Gravity.
Cambridge University Press, Cambridge, UK (Chapter 74).
Godlewski, E. and Raviart, P. (1996): Numerical Approximation of Hyperbolic Systems
of Conservation Laws. Springer-Verlag, New York (General Reference).
Goldberg, L. et al. (eds.) (1993): Topological Methods in Modern Mathematics: A
Symposium in Honor of John Milnor's 60th Birthday. Publish or Perish, Houston,
TX (General Reference).
Golub, G. and Ortega, J. (1993): Scientific Computing: An Introduction with Parallel
Computing. Academic Press, New York (General Reference).
Gorshkov, V. (1994): Physical and Biological Bases of Life Stability: Man, Biota,
Environment. Springer-Verlag, Berlin, Heidelberg (Chapter 79).
Greiner, W. (1994): Classical Physics. Vols. 1ff. Springer-Verlag, New York (General
Reference).
Greiner, W. (1993/94): Theoretical Physics. Vols. 1-6. Cf. the following titles.
Greiner, W. (1994): Quantum Mechanics: An Introduction. Springer-Verlag, Berlin,
Heidelberg (General Reference).
Greiner, W. (1993): Relativistic Quantum Mechanics. Springer-Verlag, Berlin, Heidel-
berg (General Reference).
Greiner, W. (1993): Gauge Theory of Weak Interactions. Springer-Verlag, Berlin,
Heidelberg (General Reference).
Greiner, W. and Müller, B. (1994): Quantum Mechanics: Symmetries. Springer-Verlag,
Berlin, Heidelberg (General Reference).
Greiner, W. and Reinhardt, J. (1994): Quantum Electrodynamics. Springer-Verlag,
Berlin, Heidelberg (General Reference).
Greiner, W. and Reinhardt, J. (1996): Field Quantization. Springer-Verlag, New York
(General Reference).
Greiner, W. and Schäfer, A. (1994): Quantum Chromodynamics. Springer-Verlag, Berlin,
Heidelberg (General Reference).
Grisvard, P. (1992): Singularities in Boundary Problems. Masson, Paris (General Refer-
ence).
Grosche, C. and Steiner, F. (1996): A Table of Feynman Path Integrals. Springer-Verlag,
Berlin, Heidelberg (Chapter 59).
Grosche, G., Ziegler, D., Ziegler, V., and Zeidler, E. (eds.) (1995): Teubner-Taschenbuch
der Mathematik II. Teubner-Verlag, Stuttgart/Leipzig (General Reference).
Grosse, H. (1996): Models in Statistical Physics and Quantum Field Theory. Springer-
Verlag, Berlin, Heidelberg (Chapter 68).
Gruber, P. and Wills, J. (1993): Handbook of Convex Geometry. Vols. 1, 2. North-
Holland, Amsterdam (General Reference).
Gurtin, M. (1993): Thermomechanics of Evolving Phase. Clarendon Press, Oxford
(Chapter 67).
Haag, R. (1993): Local Quantum Physics: Fields, Particles, Algebras. Springer-Verlag,
Berlin, Heidelberg (General Reference).
Haken, H. (1996): Principles of Brain Functioning. Springer-Verlag, New York (Chap-
ter 79).
Haken, H. and Wolf, H. (1992): Molecular Physics and Elements of Quantum Chemistry.
Springer-Verlag, New York (General Reference).
Hale, J. and Koçak, H. (1991): Dynamics and Bifurcations. Springer-Verlag, Berlin,
Heidelberg (cf. also Koçak (1989)) (Chapter 79).
Hatfield, B. (1992): Quantum Field Theory of Point Particles and Strings. Addison-
Wesley, Redwood City, CA (Chapter 76).
Havin, V. and Jöricke, B. (1994): The Uncertainty Principle in Harmonic Analysis.
Springer-Verlag, Berlin, Heidelberg (Chapter 59).
Heidmann, J. (1994): Bioastronomie: Über irdisches Leben und außerirdische Intelligenz.
Springer-Verlag, Berlin, Heidelberg (Chapter 76).
Heidrich, D., Kliesch, W., and Quapp, W. (1991): Properties of Chemically Interesting
Potential Energy Surfaces. Springer-Verlag, New York (Chapter 78).
Heisenberg, W. (1989): Encounters with Einstein and Other Essays on People, Places,
and Particles. Princeton University Press, Princeton, NJ (General Reference).
Henneaux, M. and Teitelboim, C. (1993): Quantization of Gauge Systems. Princeton
University Press, Princeton, NJ (Chapter 59).
Hermann, C. and Sapoval, B. (1994): Physics of Semiconductors. Springer-Verlag, New
York (Chapter 67).
Hilbert, D. (1991): Natur und mathematisches Erkennen. Lectures given in 1919-1920.
Edited by D. Rowe. Birkhäuser, Basel (General Reference).
Hiriart-Urruty, J. and Lemaréchal, C. (1993): Convex Analysis and Minimization Algo-
rithms. Vols. 1, 2. Springer-Verlag, Berlin, Heidelberg (General Reference).
Hirsch, M., Marsden, J., and Shub, M. (eds.) (1993): From Topology to Computation.
Proceedings of the Smalefest. Springer-Verlag, New York (General Reference).
Hislop, P. and Sigal, I. (1996): Introduction to Spectral Theory: With Applications to
Schrödinger Operators. Springer-Verlag, New York (Chapter 59).
Hofer, H. and Zehnder, E. (1994): Symplectic Invariants and Hamiltonian Dynamics.
Birkhäuser, Basel (Chapter 79).
Holmes, M. (1995): Introduction to Perturbation Methods. Springer-Verlag, New York
(General Reference).
Honerkamp, J. and Römer, H. (1993): Theoretical Physics: A Classical Approach.
Springer-Verlag, New York (General Reference).
Hoppensteadt, F. (1993): Analysis and Simulation of Chaotic Systems. Springer-Verlag,
New York (Chapter 79).
Hoppensteadt, F. and Peskin, C. (1994): Mathematics in Medicine and in the Life
Sciences. Springer-Verlag, New York (General Reference).
Hubbard, J. and West, B. (1995): Differential Equations: A Dynamical Systems
Approach. Vol. 2. Springer-Verlag, New York (Chapter 79).

Iagolnitzer, D. (1993): Scattering in Quantum Field Theory. Princeton University Press,
Princeton, NJ (Chapter 59).
Ibach, H. and Lüth, H. (1993): Solid-State Physics: An Introduction to Theory and
Experiment. Springer-Verlag, New York (General Reference).
Ibragimov, N. (1993): CRC Handbook of Lie Group Analysis of Differential Equations.
CRC Press, Boca Raton, FL (General Reference).
Isham, C. (1989): Modern Differential Geometry for Physicists. World Scientific,
Singapore (Chapter 74).
Ize, J., Massabo, I., and Vignoli, A. (1993): Degree Theory for Equivariant Maps: the
General S^1-Action. American Mathematical Society, Providence, RI (General
Reference).

Jacobs, K. (1992): Invitation to Mathematics. Princeton University Press, Princeton,
NJ (General Reference).
Jaffe, A. and Quinn, F. (1993): Theoretical Mathematics: Towards a Cultural Synthesis
of Mathematics and Theoretical Physics. Bull. AMS 29, 1-13. (Cf. also the contro-
versial discussion of this article in Bull. AMS 30 (1994), 161-211.) (General
Reference).
Jikov, V., Kozlov, S., and Oleinik, O. (1994): Homogenization of Differential Operators.
Springer-Verlag, Berlin, Heidelberg (General Reference).
Joseph, D. (1990): Fluid Dynamics of Viscoelastic Liquids. Springer-Verlag, New York
(Chapter 70).
Joseph, D. and Renardy, Y. (1993): Fundamentals of Two-Fluid Dynamics. Vols. 1, 2.
Springer-Verlag, New York (Chapter 70).
Jost, J. (1994): Differentialgeometrie und Minimalflächen. Springer-Verlag, Berlin,
Heidelberg (Chapter 74).
Jost, J. (1995): Riemannian Geometry and Geometric Analysis. Springer-Verlag, Berlin,
Heidelberg (Chapter 74) (to appear).
Jost, J. (1996a): Compact Riemann Surfaces. Springer-Verlag, New York (Chapter 74).
Jost, J. (1997): Postmodern Analysis. Springer-Verlag, Berlin, Heidelberg (General
Reference) (to appear).

Kac, M., Rota, G., and Schwartz, J. (1992): Discrete Thoughts: Essays on Mathematics,
Science, and Philosophy. Birkhäuser, Basel (General Reference).
Kaiser, G. (1994): A Friendly Guide to Wavelets. Birkhäuser, Basel (General Reference).
Kaku, M. (1988): Introduction to Superstring Theory. Springer-Verlag, New York
(Chapter 76).
Kaku, M. (1991): Strings, Conformal Fields, and Topology. Springer-Verlag, New York
(Chapter 76).
Kaku, M. (1993): Quantum Field Theory. Oxford University Press, Oxford (Chapters
59 and 76).
Kaku, M. (1994): Hyperspace: A Scientific Odyssey Through Parallel Universes, Time
Warps, and the 10th Dimension. Oxford University Press, Oxford (Chapter 76).
Kaku, M. and Trainer, J. (1987): Beyond Einstein: The Cosmic Quest for the Theory of
the Universe. Bantam, New York (Chapter 76).
Karttunen et al. (1993): Fundamental Astronomy. Springer-Verlag, Berlin, Heidelberg
(Chapter 76).
Kassel, C. (1994): Quantum Groups. Springer-Verlag, New York (General Reference).
Katok, A. and Hasselblatt, B. (1995): Introduction to the Modern Theory of Dynamical
Systems. Cambridge University Press, Cambridge, UK (General Reference).
Kavian, O. (1993): Introduction à la théorie des points critiques et applications aux
problèmes elliptiques. Springer-Verlag, New York (General Reference).
Kevorkian, J. and Cole, J. (1996): Multiple Scale and Singular Perturbation Methods.
Springer-Verlag, New York (General Reference).
Kinderlehrer, D. et al. (eds.): Microstructure and Phase Transition. Springer-Verlag,
New York (Chapter 61).
Kippenhahn, R. (1993): 100 Billion Suns. Princeton University Press, Princeton, NJ
(Chapter 76).
Kippenhahn, R. and Weigert, A. (1994): Stellar Structure and Evolution. Springer-
Verlag, Berlin, Heidelberg (Chapter 76).
Kircher, R. and Bergner, W. (1991): Three-Dimensional Simulation of Semiconductor
Devices. Birkhäuser, Basel (Chapter 67).
Kirsch, A. (1996): An Introduction to the Mathematical Theory of Inverse Problems.
Springer-Verlag, New York (General Reference).
Knauf, A. and Sinai, Ya. (1997): Classical Nonintegrability Quantum Chaos.
Birkhäuser, Basel (to appear).
Koçak, H. (1989): Differential and Difference Equations Through Computer Experiments.
With Diskettes. Springer-Verlag, New York (cf. Hale and Koçak (1991)) (Chapter
79).
Kohonen, T. (1995): Self-Organizing Maps. Springer-Verlag, New York (Chapter
78).
Kolb, E. and Turner, M. (1990): The Early Universe. Addison-Wesley, Redwood City,
CA (Chapter 76).
Kozlov, V. and Fedorov, U. (1996): Memoirs on Integrable Systems. Springer-Verlag,
Berlin, Heidelberg (Chapter 58).
Kuchment, P. (1993): Floquet Theory for Partial Differential Equations. Birkhäuser,
Basel (Chapter 79).
Kuksin, S. (1993): Nearly Integrable Infinite-Dimensional Hamiltonian Systems. Lecture
Notes Mathematics Vol. 1556. Springer-Verlag, Berlin, Heidelberg (Chapter 58).
Kuperschmidt, B. (1992): The Variational Principles of Dynamics. World Scientific,
Singapore (General Reference).
Kuzmin, A. (1992): Non-Classical Equations of Mixed Type and Their Applications in
Gas Dynamics. Birkhäuser, Basel (Chapter 70).
Kuznetsov, Y. (1995): Elements of Applied Bifurcation Theory. Springer-Verlag, New York.

Lakshmikantham, V. (ed.) (1994): First World Congress of Nonlinear Analysts, Vols.
1-4. De Gruyter, Berlin (General Reference).
Lang, K. (1996): Astrophysical Formulae. Springer-Verlag, Berlin, Heidelberg (Chapter
76).
Lazutkin, V. (1993): KAM-Theory and Semiclassical Approximations to Eigenfunctions.
Springer-Verlag, Berlin, Heidelberg (Chapter 58).
Leung, A. (1989): Systems of Nonlinear Partial Differential Equations: Applications to
Biology and Engineering. Kluwer, Dordrecht (General Reference).
LeVeque, R. (1990): Numerical Methods for Conservation Laws. Birkhäuser, Basel
(Chapter 70).
Li, M. and Vitányi, P. (1993): An Introduction to Kolmogorov Complexity and its
Applications. Springer-Verlag, New York (General Reference).
Li, Ta-tsien (1994): Global Classical Solutions for Quasilinear Hyperbolic Systems.
Wiley, New York (General Reference).
Lichtenberg, A. and Lieberman, M. (1992): Regular and Chaotic Dynamics. Springer-
Verlag, New York (Chapter 76).
Lions, P. (1996): Mathematical Topics in Fluid Dynamics. Vol. 1: Incompressible Models.
Vol. 2: Compressible Models (to appear). Oxford University Press, Oxford, UK
(Chapter 72).
Louis, A. (1995): Inverse and Ill-Posed Problems. Springer-Verlag, New York (General
Reference).
Lüst, D. and Theisen, S. (1989): Lectures on String Theory. Springer-Verlag, Berlin,
Heidelberg (Chapter 76).
Lunardi, A. (1995): Analytic Semigroups and Optimal Regularity in Parabolic Problems.
Birkhäuser, Basel (General Reference).
Lusztig, G. (1993): Introduction to Quantum Groups. Birkhäuser, Boston, MA (Chapter
76).

Mackey, G. (1992): The Scope and History of Commutative and Noncommutative
Harmonic Analysis. American Mathematical Society, Providence, RI (General
Reference).
Mackey, M. (1993): Time's Arrow: The Origin of Thermodynamic Behavior. Springer-
Verlag, New York (Chapter 67).
Mainzer, K. (1994): Thinking in Complexity: The Complex Dynamics of Matter, Mind,
and Mankind. Springer-Verlag, Berlin, Heidelberg (Chapter 79).
Malek, J., Necas, M., Rokyta, and Ruzicka, M. (1996): Weak and Measure-valued
Solutions to Evolutionary Partial Differential Equations. Chapman, London (Gen-
eral Reference).
Mandl, F. and Shaw, G. (1989): Quantum Field Theory. Wiley, New York (Chapter 59).
Mangel, M. and Segel, L. (eds.) (1996): Classical Papers in Mathematical Biology.
Springer-Verlag, Berlin, Heidelberg.
Marathe, K. and Martucci, G. (1992): The Mathematical Foundations of Gauge Theory.
North-Holland, Amsterdam (Chapter 59).
Marchioro, C. and Pulvirenti, M. (1994): Mathematical Theory of Inviscid Fluids.
Springer-Verlag, New York (Chapter 70).
Markowich, P. (1986): The Stationary Semiconductor Device Equations. Springer-
Verlag, Berlin, Heidelberg (Chapter 67).
Markowich, P. (1990): Semiconductor Equations. Springer-Verlag, Berlin, Heidelberg
(Chapter 67).
Marsden, J. (1992): Lectures in Mechanics. Cambridge University Press, Cambridge,
UK (Chapters 58 and 79).
Marsden, J. and Ratiu, T. (1994): Introduction to Mechanics, and Symmetry: A Basic
Exposition of Classical Mechanical Systems. Springer-Verlag, New York (Chapter
58).
Marshak, R. (1993): Conceptual Foundations of Modern Particle Physics. World Scien-
tific, Singapore (Chapter 59).
Maslova, N. (1993): Nonlinear Evolution Equations: Kinetic Approach. World Scientific,
Singapore (General Reference).
Mason, L. and Hughstone, L. (1990): Further Advances in Twistor Theory. Vols. 1, 2.
Longman, Essex, UK (Chapter 75).
Matveev, V. et al. (1994): Algebro-Geometrical Approach to Nonlinear Evolution Equa-
tions. Springer-Verlag, Berlin, Heidelberg (General Reference).
Mawhin, J. and Willem, M. (1989): Critical Point Theory and Hamiltonian Systems.
Springer-Verlag, New York (General Reference).
McDuff, D. and Salamon, D. (1994): Introduction to Symplectic Topology. Oxford
University Press, Oxford (General Reference).
Mehmeti, A. (1994): Nonlinear Waves in Networks. Akademie-Verlag, Berlin (Chapter 71).
Meirmanov, A. (1992): The Stefan Problem. De Gruyter, Berlin (Chapter 67).
Melo, W. de and van Strien, S. (1993): One-Dimensional Dynamics. Springer-Verlag,
New York (Chapter 79).
Meyer, K. and Hall, G. (1992): Introduction to Hamiltonian Dynamical Systems and the
N-Body Problem. Springer-Verlag, New York (Chapter 58).
Meyer, S. (1995): Computersimulation von Gittermodellen. Springer-Verlag, Berlin,
Heidelberg (General Reference).
Meyer, Y. (1990): Ondelettes et opérateurs, Vols. 1-3. Hermann, Paris. (English edition:
Wavelets and Operators, Cambridge University Press, Cambridge, UK.) (General
Reference).
Mielke, A. (1991): Hamiltonian and Lagrangian Flows on Center Manifolds with Appli-
cations to Elliptic Variational Problems. Lecture Notes Mathematics, Vol. 1489.
Springer-Verlag, Berlin, Heidelberg (Chapter 79).
Mikhailov, A. (1994): Foundations of Synergetics. Vols. 1, 2. Springer-Verlag, Berlin,
Heidelberg (Chapter 67).
Milburn, G. and Walls, D. (1994): Quantum Optics. Springer-Verlag, Berlin, Heidelberg
(Chapter 59).
Mitsuru Ikawa (ed.) (1993): Spectral and Scattering Theory. Marcel Dekker, New York
(General Reference).
Monastirsky, M. (1993): Topology of Gauge Fields and Condensed Matter. Plenum
Press, New York (General Reference).
Monteiro, M. (1993): Differential Inclusions in Nonsmooth Mechanical Problems: Shocks
and Dry Friction. Birkhäuser, Basel (Chapter 63).
Moreau, J. (ed.) (1988): Nonsmooth Mechanics and Applications. Springer-Verlag, Wien
(Chapter 60).
Müller, I. (1994): Grundzüge der Thermodynamik. Springer-Verlag, Berlin, Heidelberg
(Chapter 67).
Müller, I. and Ruggeri, T. (1993): Extended Thermodynamics. Springer-Verlag, New
York (Chapter 67).
Müller, W. et al. (1993): Geometrie und Physik. De Gruyter, Berlin (General Reference).
Murdock, J. (1991): Perturbations. Wiley, New York (General Reference).
Murray, J. (1989): Mathematical Biology. Springer-Verlag, Berlin, Heidelberg (Chapter 67).

Naber, G. (1992): The Geometry of Minkowski Spacetime. Springer-Verlag, New York
(Chapter 75).
Nagasawa, M. (1993): Schrödinger Equations and Diffusion Theory. Birkhäuser, Basel
(Chapter 59).
Nakahara, M. (1990): Geometry, Topology, and Physics. Hilger, Bristol (Chapter 74).
Narasimhan, M. (1993): Principles of Continuum Mechanics. Wiley, New York (Chap-
ters 61 and 70).
Nettel, S. (1992): Wave Physics. Springer-Verlag, New York (General Reference).
Newton, R. (1988): Scattering Theory of Waves and Particles. Springer-Verlag, Berlin,
Heidelberg (Chapter 59).
Nishikawa, K. and Wakatani, M. (1993): Plasma Physics: Basic Theory with Fusion
Applications. Springer-Verlag, Berlin, Heidelberg (Chapter 68).
Nusse, H. and Yorke, J. (1994): Dynamics: Numerical Explorations. Springer-Verlag,
New York (Chapter 79).

Oberguggenberger, M. (1992): Multiplication of Distributions and Applications to Partial
Differential Equations. Longman, Harlow, UK (General Reference).
Oberguggenberger, M. and Rosinger, E. (1994): Solution of Continuous Nonlinear
Partial Differential Equations Through Order Completion. North-Holland, New
York (General Reference).
Ohya, M. and Petz, D. (1993): Quantum Entropy and its Use. Springer-Verlag, Berlin,
Heidelberg (Chapter 68).
Ott, E. (1993): Chaos in Dynamical Systems. Cambridge University Press, Cambridge,
UK (Chapter 79).

Pais, A. (1993): Niels Bohr's Times in Physics, Philosophy, and Polity. Oxford University
Press, Oxford (Chapter 59).
Pauli, W. (1990): Die allgemeinen Prinzipien der Wellenmechanik. Neu herausgegeben
und mit historischen Anmerkungen versehen von Norbert Straumann. (Revised
edition by Norbert Straumann). Springer-Verlag, Berlin, Heidelberg (Chapter 59).
Pauli, W. (1992): Scientific Correspondence with Bohr, Einstein, Heisenberg, and others.
Vols. 1-3. Edited by K. von Meyenn. Springer-Verlag, Berlin, Heidelberg (Chap-
ter 59).
Pauli, W. (1994): Writings on Physics and Philosophy. Springer-Verlag, Berlin, Heidel-
berg (General Reference).
Peebles, P. (1991): Quantum Mechanics. Princeton University Press, Princeton, NJ
(Chapter 59).
Peebles, P. (1993): Principles of Physical Cosmology. Princeton University Press,
Princeton, NJ (Chapter 76).
Penrose, R. (1992): The Emperor's New Mind Concerning Computers, Minds, and the
Laws of Physics. Oxford University Press, Oxford (General Reference).
Penrose, R. (1994): Shadows of the Mind: The Search for the Missing Science of
Consciousness. Oxford University Press, Oxford (General Reference).
Peters, N. (1995): Laminar and Turbulent Combustion. Springer-Verlag, New York
(Chapter 70) (to appear).
Peyrard, M. (ed.) (1995): Nonlinear Excitations in Biomolecules. Springer-Verlag, New
York (General Reference).
Pier, J. (ed.) (1995): Development of Mathematics 1900-1950. Birkhäuser, Basel (General
Reference).
Plakida, N. (1994): High-Temperature Superconductivity: Experiment and Theory.
Springer-Verlag, New York (Chapter 59).
Plessis, A. du and Wall, C. (1994): The Geometry of Topological Stability. Clarendon
Press, Oxford (General Reference).
Polyakov, A. (1987): Gauge Fields and Strings. Academic Publishers, Harwood, NJ
(Chapter 76).
Princeton Problems in Physics with Solutions. Edited by N. Newbury et al., Princeton
University Press, Princeton, NJ (General Reference).
Proceedings of the International Congress of Mathematicians in Zurich. Birkhäuser,
Basel (to appear).
Proto, A. and Plastino, A. (1995): Maximum Entropy Principle. Springer-Verlag, New
York (Chapter 68) (to appear).
Prüss, J. (1993): Evolutionary Integral Equations and Applications. Birkhäuser, Basel
(General Reference).

Quartapelle, L. (1993): Numerical Solutions of the Incompressible Navier-Stokes Equations.
Birkhäuser, Basel (Chapter 72).
Quarteroni, A. and Valli, A. (1994): Numerical Approximation of Partial Differential
Equations. Springer-Verlag, Berlin, Heidelberg (General Reference).

Ratcliffe, J. (1994): Foundations of Hyperbolic Manifolds. Springer-Verlag, New York
(Chapter 74).
Ratner, V. et al. (1996): Molecular Evolution. Springer-Verlag, New York (General
Reference).
Rauch, J. (1991): Partial Differential Equations. Springer-Verlag, New York (General
Reference).
Raychaudhuri, A., Banerji, S., and Banerjee, A. (1992): General Relativity, Astrophysics,
and Cosmology. Springer-Verlag, New York (Chapter 76).
Renardy, M. and Rogers, R. (1993): Introduction to Partial Differential Equations.
Springer-Verlag, New York (General Reference).
Riemann, B. (1990): Gesammelte Mathematische Werke, Wissenschaftlicher Nachlaß
und Nachträge (Collected Papers edited by R. Narasimhan). Teubner-Verlag,
Leipzig and Springer-Verlag, Berlin, Heidelberg (Chapter 74).
Rivasseau, V. (1991): From Perturbative to Constructive Renormalization. Princeton
University Press, Princeton, NJ (Chapter 59).
Roepstorff, G. (1994): Path Integral Approach to Quantum Physics. Springer-Verlag,
Berlin, Heidelberg (Chapter 59).
Rolnick, W. (1994): Fundamental Particles and Their Interactions. Addison-Wesley,
Reading, MA (Chapter 59).
Roubicek, T. (1997): Relaxation in Optimization Theory and Variational Calculus. De
Gruyter, Berlin (to appear) (General Reference).
Ruder, H. et al. (1994): Atoms in Strong Magnetic Fields: Quantum Mechanical Treat-
ment and Applications in Astrophysics and Quantum Chaos. Springer-Verlag, New
York (Chapter 59).
Rudolph, E. (1994): Philosophy, Mathematics, and Modern Physics: A Dialogue.
Springer-Verlag, Berlin, Heidelberg (General Reference).
Ruelle, D. (1989): Chaotic Evolution and Strange Attractors. Cambridge University
Press, Cambridge, UK (Chapter 59).
Ruelle, D. (1993): Chance and Chaos. Princeton University Press, Princeton, NJ (Chap-
ter 79).
Rychlik, M. and Yakobson, M. (1997): One-dimensional Dynamical Systems. Springer-
Verlag, New York (General Reference).

Sanz, J., Martinez-Gonzalez, and Cayon, L. (eds.) (1994): Present and Future of the
Cosmic Microwave Background. Springer-Verlag, New York (Chapter 76).
Sattinger, D. and Weaver, O. (1993): Lie Groups, Lie Algebras, and Their Representations.
Springer-Verlag, New York (General Reference).
Schiepek, G. and Tschacher, W. (1997): Synergetik in den Humanwissenschaften.
Springer-Verlag, Berlin, Heidelberg (General Reference) (to appear).
Schmutzer, E. (1989): Grundlagen der theoretischen Physik. Vols. 1, 2. Deutscher Verlag
der Wissenschaften, Berlin (General Reference).
Schneider, M. (1992): Himmelsmechanik. Vols. 1, 2. Bibliogr. Institut, Mannheim (Chap-
ter 58).
Schneider, P., Ehlers, J., and Falco, E. (1992): Gravitational Lenses. Springer-Verlag,
New York (Chapter 76).
Schulze, B. (1994): Pseudo-Differential Boundary Value Problems, Conical Singularities,
and Asymptotics. Akademie-Verlag, Berlin (General Reference).
Schuster, R. (1996): Grundkurs Biomathematik: Mathematische Modelle in Biologie,
Biochemie, Medizin und Pharmazie mit Computerlösungen in Mathematica.
Teubner-Verlag, Stuttgart/Leipzig (General Reference).
Schwarz, A. (1993): Quantum Field Theory and Topology. Springer-Verlag, Berlin,
Heidelberg (Chapter 59).
Schwarz, A. (1994): Topology for Physicists. Springer-Verlag, New York (General
Reference).
Scott, G. and Davidson, K. (1993): Wrinkles in Time. Morrow, New York (Chapter 76).
Sell, G., Foias, C., and Temam, R. (1993): Turbulence in Fluid Flows: A Dynamical
Systems Approach. Springer-Verlag, New York (Chapter 72).
Seydel, R. (1994): Practical Bifurcation and Stability Analysis: From Equilibrium to
Chaos. Springer-Verlag, Berlin, Heidelberg (Chapter 79).
Shore, S. (1992): An Introduction to Astrophysical Hydrodynamics. Academic Press, San
Diego, CA (Chapters 70 and 76).
Simon, B. (1993): The Statistical Mechanics of Lattice Gases. Princeton University
Press, Princeton, NJ (Chapter 68).
Sirovich, L. (ed.) (1991): New Perspectives in Turbulence. Springer-Verlag, New York
(Chapter 72).
Sirovich, L. (ed.) (1994): Trends and Perspectives in Applied Mathematics. Springer-
Verlag, New York (General Reference).
Smoller, J. (1994): Shock Waves and Reaction-Diffusion Equations. 2nd enlarged edition.
Springer-Verlag, New York (General Reference).
Smirnov, V. (1991): Renormalization and Asymptotic Expansions. Birkhäuser, Basel
(General Reference).
Spohn, H. (1991): Large Scale Dynamics of Interacting Particles. Springer-Verlag,
Berlin, Heidelberg (Chapter 68).
Stein, E. (1993): Harmonic Analysis: Real-Variable Methods, Orthogonality, and Oscilla-
tory Integrals. Princeton University Press, Princeton, NJ (General Reference).
Stephani, H. (1989): Differential Equations: Their Solution Using Symmetries. Edited
by M. MacCallum. Cambridge University Press, Cambridge, UK (Chapter
58).
Sterman, G. (1993): An Introduction to Quantum Field Theory. Cambridge University
Press, Cambridge, UK (Chapter 59).
Stratonovich, R. (1992): Nonlinear Equilibrium Thermodynamics. Vols. 1, 2. Springer-
Verlag, New York (Chapter 67).
Straub, D. (1989): Thermofluiddynamics of Optimized Rocket Propulsions. Birkhäuser,
Basel (Chapter 70).
Straughan, B. (1992): The Energy Method, Stability and Nonlinear Convection.
Springer-Verlag, New York (Chapter 69).
Strauss, W. (1989): Nonlinear Wave Equations. American Mathematical Society, Provi-
dence, RI (General Reference).
Struwe, M. (1988): Semilinear wave equations. Bull. AMS 26, 53-85 (General Reference).
Struwe, M. (1990): Variational Methods: Applications to Nonlinear Partial Differential
Equations and Hamiltonian Systems. Springer-Verlag, Berlin, Heidelberg (General
Reference).
Sun, N. (1994): Mathematical Modelling of Groundwater Pollution. Springer-Verlag,
Berlin, Heidelberg (Chapter 70).
Taylor, M. (1993): Pseudodifferential Operators and Nonlinear Partial Differential
Equations. Birkhäuser, Boston (General Reference).
Taylor, M. (1996): Partial Differential Equations, Vols. 1-3. Springer-Verlag, New
York (recommended as a standard text on the modern theory of linear and
nonlinear partial differential equations).
Temam, R. (1988): Infinite-Dimensional Dynamical Systems in Mechanics and Physics.
North-Holland, Amsterdam (General Reference).
Thaller, B. (1992): The Dirac Equation. Springer-Verlag, Berlin, Heidelberg (Chapter 59).
Thomas, J. (1995): Numerical Partial Differential Equations. Vols. 1, 2. Springer-Verlag,
New York (General Reference).
Thorne, K. (1994): Black Holes and Time Warps: Einstein's Outrageous Legacy.
(German edition: Gekrümmter Raum und verbogene Zeit: Einstein's Vermächtnis,
Droemer und Knaur, München, 1994.) (Chapter 76).
Tipler, P. (1991): Physics for Scientists and Engineers. World Publishers, New York
(General Reference).
Toda, M. (1989): Nonlinear Waves and Solitons. Kluwer, Dordrecht (Chapter 71).
Tretter, C. (1993): On λ-Nonlinear Eigenvalue Problems. Akademie-Verlag, Berlin
(General Reference).
Triebel, H. (1992): Theory of Function Spaces. Vol. 2. Birkhäuser, Basel (General
Reference).
Troianello, G. (1987): Elliptic Differential Equations and Obstacle Problems. Plenum
Press, New York (Chapter 63).
Tromba, T. (1993): Teichmüller Theory in Riemannian Geometry. Birkhäuser, Basel
(Chapter 74).
Truhlar, G. (ed.) (1988): Mathematical Frontiers in Computational Chemical Physics.
Springer-Verlag, New York (Chapter 67).

Urakawa, H. (1993): Calculus of Variations and Harmonic Maps. American Mathematical
Society, Providence, RI (Chapter 74).

Van de Velde, E. (1994): Concurrent Scientific Computing. Springer-Verlag, New York
(General Reference).
Vanhorn, W. (1994): The Stokes Equation. Akademie-Verlag, Berlin (Chapter 72).
Vilenkin, N. and Klimyk, A. (1992): Representation of Lie Groups and Special Functions.
Vols. 1-4. Kluwer, Dordrecht (General Reference).
Vishik, M. and Fursikov, A. (1988): Mathematical Problems of Statistical Hydro-
mechanics. Kluwer, Dordrecht (Chapter 72?).
Visintin, A. (1994): Differentiable Models of Hysteresis. Springer-Verlag, Berlin, Heidel-
berg (Chapter 60).
Volkenstein, M. (1994): Physical Approaches to Biological Evolution. Springer-Verlag,
Berlin, Heidelberg (Chapter 79).

Wald, R. (1984): General Relativity. The University of Chicago Press, Chicago, IL
(Chapter 76).
Waldschmidt, M. et al. (eds.) (1993): From Number Theory to Physics. Springer-Verlag,
Berlin, Heidelberg (General Reference).
Weil, A. (1991): Apprenticeship of a Mathematician. Birkhäuser, Basel (General
Reference).
Weinberg, S. (1989): The Cosmological Constant Problem. Rev. Mod. Phys. 61, 1-24
(Chapter 76).
Weinberg, S. (1992): Dreams of a Final Theory. Pantheon Books, New York (Chapter
76).
Weinberg, S. (1995): The Quantum Theory of Fields, Vols. 1, 2. Cambridge University
Press, Cambridge, UK (Chapter 59).
Wendland, W. (1997): Integral Equation Methods for Boundary Value Problems.
Springer-Verlag, Berlin, Heidelberg (General Reference).
Wess, J. and Bagger, J. (1991): Supersymmetry and Supergravity. Second edition revised
and expanded. Princeton University Press, Princeton, NJ (Chapter 76).
Wiedemann, H. (1993): Particle Accelerator Physics. Vol. 1: Basic Principles and Linear
Beam Dynamics. Vol. 2: Nonlinear and High Order Beam Dynamics. Springer-
Verlag, Berlin, Heidelberg (Chapter 59).
Wiggins, S. (1994): Normally Hyperbolic Invariant Manifolds in Dynamical Systems.
Springer-Verlag, New York (Chapter 79).
Wiggins, S. (1994a): Global Dynamics, Phase Space Transport, Orbits Homoclinic to
Resonances, and Applications. American Mathematical Society, Providence, RI
(Chapter 79).
Wigner, E. (1993): Collected Works. Edited by A. Wightman. Vols. 1ff. Springer-Verlag,
Berlin, Heidelberg (General Reference).
Willmore, T. (1993): Riemannian Geometry. Clarendon Press, Oxford (Chapter 74).
Winfree, A. (1990): The Geometry of Biological Time. Springer-Verlag, New York
(Chapter 67).
Wu, J. (1996): Theory and Applications of Partial Functional Equations. Springer-
Verlag, New York (General Reference).

Yndurain, F. (1993): The Theory of Quark and Gluon Interactions. Springer-Verlag,
Berlin, Heidelberg (Chapter 59).

Zabczyk, J. (1992): Optimal Control Theory. Birkhäuser, Basel (General Reference).
Zeidler, E. (ed.) (1995): Teubner-Taschenbuch der Mathematik I. Teubner-Verlag,
Stuttgart-Leipzig (General Reference).
Zeidler, E. (1995a): Teubner-Taschenbuch der Mathematik II, Chapters 10-19. Cf.
Grosche, Ziegler, and Zeidler (eds.) (1995) (General Reference).
Zeidler, E. (1995): Applied Functional Analysis: Applications to Mathematical Physics.
Springer-Verlag, New York (General Reference).
Zeidler, E. (1995): Applied Functional Analysis: Main Principles and Their Applications.
Springer-Verlag, New York.
Zeldovich, Y. (1993): Selected Works. Vol. 1: Chemical Physics and Hydrodynamics. Vol.
2: Particles, Nuclei, and the Universe. Edited by G. Barenblatt and R. Sunyaev.
Princeton University Press, Princeton, NJ (Chapter 67).
Ziemer, W. (1989): Weakly Differentiable Functions. Springer-Verlag, New York
(General Reference).
Zinn-Justin, J. (1996): Quantum Field Theory and Critical Phenomena. Clarendon Press,
Oxford, UK (General Reference).
Zwillinger, D. (1992): Handbook of Differential Equations. Academic Press, New York
(General Reference).
List of Symbols

We use the following abbreviations:


B-space Banach space
H-space Hilbert space
M-S sequence Moore-Smith sequence
F-derivative Fréchet derivative
G-derivative Gâteaux derivative
$A_i(10)$ means (10) in the Appendix to Part i.

General Notation
$\mathcal{A} \Rightarrow \mathcal{B}$   $\mathcal{A}$ implies $\mathcal{B}$
iff   if and only if
$\mathcal{A} \Leftrightarrow \mathcal{B}$   $\mathcal{A}$ iff $\mathcal{B}$
$f(x) \stackrel{\mathrm{def}}{=} 2x$   $f(x) = 2x$ by definition
$x \in S$   x is an element of the set S
$x \notin S$   x is not an element of S
$\{x : \ldots\}$   set of all x with the property ...
$S \subseteq T$   the set S is contained in the set T
$S \subset T$   S is properly contained in T
$S \subset\subset T$   S is a strongly proper subset of T, i.e., the
   closure of S is contained in T
$\cap, \cup, -$   intersection, union, difference
$\emptyset$   empty set
$2^S$   set of all subsets of S, the power set of S
$X \times Y$   product set, $X \times Y = \{(x, y) : x \in X, y \in Y\}$

I, id identity mapping
$f : S \subseteq X \to Y$   mapping from the set S into the set Y with $S \subseteq X$
D(f)   domain of f, D(f) = S
R(f)   range of f, $R(f) = \{f(x) : x \in S\}$
G(f)   graph of f, $G(f) = \{(x, f(x)) : x \in S\}$
N(f)   null space of f, $N(f) = \{x : f(x) = 0\}$
Fix(f)   set of fixed points of f, $\mathrm{Fix}(f) = \{x : f(x) = x\}$
dom(f), im(f)   identical with D(f), R(f), respectively
ker(f)   identical with N(f)
f surjective   mapping onto Y, i.e., f(S) = Y
f injective   one-to-one mapping
f bijective   one-to-one mapping onto Y, i.e., f is surjective
   and injective
f(A)   image of the set A, $f(A) = \{f(x) : x \in A\}$
$f^{-1}(B)$   preimage of the set B, $f^{-1}(B) = \{x : f(x) \in B\}$
$f|_A$   restriction of the map f to the set A
$f \circ g$   f applied to g, $(f \circ g)(x) = f(g(x))$
$f : S \to 2^Y$   multivalued mapping, f(x) is a subset of Y
R(f)   range of the multivalued mapping f, $R(f) = \bigcup_{x \in S} f(x)$
G(f)   graph of the multivalued mapping f, $G(f) = \{(x, y) : x \in S, y \in f(x)\}$
dom(f)   effective domain of the multivalued map f,
   $\mathrm{dom}(f) = \{x : f(x) \neq \emptyset\}$

$\mathbb{N}$   set of the natural numbers 1, 2, ...
$\mathbb{R}, \mathbb{C}, \mathbb{Q}, \mathbb{Z}$   set of the real, complex, rational, integer numbers
$\mathbb{K}$   $\mathbb{R}$ or $\mathbb{C}$
$\mathbb{R}_+$   nonnegative real numbers $\xi \geq 0$
$\mathbb{R}_>$   positive real numbers $\xi > 0$
$\mathbb{R}_-$   nonpositive real numbers $\xi \leq 0$
$\mathbb{R}^N$   set of all real N-tuples $x = (\xi_1, \ldots, \xi_N)$
$\mathbb{R}^N_+$   set of all $x \in \mathbb{R}^N$ with $\xi_i \geq 0$ for all i
$\mathbb{R}^N_>$   set of all $x \in \mathbb{R}^N$ with $\xi_i > 0$ for all i
HRN e
set of all X ERN with 1 ~ 0 (special half-space)
Re z, Im z   real part of the complex number z, imaginary
   part of z
$D_i$   partial derivative with respect to the ith coordinate
$|x|$   Euclidean norm of $x \in \mathbb{R}^N$, $|x| = \left(\sum_{i=1}^N \xi_i^2\right)^{1/2}$
$(x|y)$   Euclidean scalar product in $\mathbb{R}^N$, $(x|y) = \sum_{i=1}^N \xi_i \eta_i$,
   where $x = (\xi_1, \ldots, \xi_N)$ and $y = (\eta_1, \ldots, \eta_N)$
sgn a   signum of the real number a
$\delta_{ij}$   Kronecker symbol, $\delta_{ij} = 1$ if $i = j$, and $\delta_{ij} = 0$
   if $i \neq j$
$g(x) = o(f(x))$, $x \to a$   $g(x)/f(x) \to 0$ as $x \to a$
$g(x) = O(f(x))$, $x \to a$   $|g(x)| \leq \text{constant}\,|f(x)|$ for all x in a
   neighborhood of the point a
[a, b], ]a, b[, [a, b[   closed, open, half-open real interval
$B^N$   closed unit ball in $\mathbb{R}^N$, $B^N = \{x \in \mathbb{R}^N : |x| \leq 1\}$
$S^N$   N-dimensional unit sphere, $S^N = \{x \in \mathbb{R}^{N+1} : |x| = 1\}$
meas G   Lebesgue measure of the set G (see Part II)
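
As a small illustration of the multivalued-mapping entries in this list (range, graph, and effective domain), the following Python sketch works on finite sets. The dictionary representation and the function names `mv_range`, `mv_graph`, and `mv_effective_domain` are assumptions made for this example only; they do not come from the text.

```python
# Hypothetical finite model: a multivalued map f: S -> 2^Y is stored as a
# dict that sends each point x of S to the subset f(x) of Y.

def mv_range(f):
    """R(f): the union of all sets f(x) with x in S."""
    result = set()
    for values in f.values():
        result |= set(values)
    return result

def mv_graph(f):
    """G(f): all pairs (x, y) with x in S and y in f(x)."""
    return {(x, y) for x, values in f.items() for y in values}

def mv_effective_domain(f):
    """dom(f): all points x whose image f(x) is nonempty."""
    return {x for x, values in f.items() if values}

# Example data (illustrative only): f(2) is the empty set.
f = {1: {"a", "b"}, 2: set(), 3: {"b"}}
print(mv_range(f))
print(mv_graph(f))
print(mv_effective_domain(f))
```

Here the point 2 belongs to S but not to the effective domain, because its image is empty.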

Special Notation
The page number I123 refers to page 123 of Part I, etc., whereas the page
number 123 refers to page 123 of the present volume.
Page
$X^*$   dual space to X   I774
$A^*$   dual operator to the operator A   I775
$F'$   F-derivative of the operator F   I135
$F_x$   partial F-derivative of F with respect to x   I140
$d^nF(x; h_1, \ldots, h_n)$   nth F-differential of F at x in the directions $h_1, \ldots, h_n$   I143
$d^nF(x; h)$   identical with $d^nF(x; h, \ldots, h)$
$\delta^nF(x; h_1, \ldots, h_n)$   nth variation of F at x in the directions $h_1, \ldots, h_n$   I134
$\delta^nF(x; h)$   identical with $\delta^nF(x; h, \ldots, h)$
$Nx^m$   identical with $N(x, \ldots, x)$ where N is m-linear   I361
$\partial S$   boundary of the set S   I751
$\bar{S}$   closure of S   I751
int S   interior of S   I751
U(x)   neighborhood of the point x, i.e., there exists
   an open set O such that $x \in O$ and $O \subseteq U(x)$   I751
supp f   support of the function f, supp f is the closure
   of the set $\{x : f(x) \neq 0\}$   I756
$\underline{\lim}, \overline{\lim}$   lower, upper limit   I761
diam S, d(S)   diameter of the set S   I762
d(x, S)   distance of the point x from the set S   I762
d(S, T)   distance between the sets S and T   I762
dist(x, S), dist(x, T)   identical with d(x, S), d(x, T), respectively
S + T   sum of the sets S and T   I764
$\lambda S$   product of the set S by the number $\lambda$   I764
span S   linear hull of the set S   I764
co S   convex hull of S   I764
$\overline{\mathrm{co}}\, S$   closed convex hull of S   I764
dim L   dimension of the linear subspace L;   I765
   dimension of the manifold L   535
codim L   codimension of the linear subspace L;   I765
   codimension of the submanifold L   557
X/Y   factor space (quotient space)   I765
$X \oplus Y$   direct sum or direct topological sum (the text
   always refers precisely to the momentary
   meaning)   I765, I766
$X^\perp$   complement of X with respect to a direct sum   I766
$X \times Y$   product space of two B-spaces (The norm on
   $X \times Y$ is given by $\|(x, y)\| = \|x\| + \|y\|$.)   I770
$\prod_\alpha X_\alpha$   general product space   I747, I755
$X_C$   complexification of the real B-space X   I770
$A_C$   complexification of the linear operator A   I770
ind A   index of the linear operator A   I767
rank A   rank of the linear operator A, rank A = dim R(A)
det A   determinant of the linear operator A   I179
$\rho(A)$   resolvent set of the linear operator A   I795
$\sigma(A)$   spectrum of the linear operator A   I795
r(A)   spectral radius of the linear operator A   I795
In real B-spaces, $\rho(A)$, $\sigma(A)$, and r(A) always refer to the complexification of
A, i.e., $\rho(A) = \rho(A_C)$, etc.
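$(x|y)$   inner (scalar) product in an H-space   I785
(x, y)   ordered pair, an element of the product set
   $X \times Y$, where $x \in X$ and $y \in Y$
$\langle x|y \rangle$   inner (scalar) product in $\mathbb{R}^N$ or $\mathbb{C}^N$   I771
$\langle f, x \rangle$   value of the linear functional f at the point x,
   $\langle f, x \rangle = f(x)$
$\|x\|$   norm of x in a B-space   I768
$|x|$   Euclidean norm of x in $\mathbb{R}^N$ or $\mathbb{C}^N$, $|x| = \langle x|x \rangle^{1/2}$
$x_n \to x$   convergence (in norm)   I769
$x_n \rightharpoonup x$   weak convergence   I775
$x_n \stackrel{*}{\rightharpoonup} x$   weak* convergence   I775
$f_n \rightrightarrows f$   uniform convergence of functions   I801
$x \mapsto f(x)$   another notation for the mapping f
$f(\cdot)$   another notation for the mapping f
$(x_{n'})$   subsequence of the sequence $(x_n)$
$\alpha = (\alpha_1, \ldots, \alpha_N)$   multi-index; the $\alpha_i$ are nonnegative integers
$|\alpha|$   order of $\alpha$, $|\alpha| = \alpha_1 + \cdots + \alpha_N$
$D^\alpha = D_1^{\alpha_1} \cdots D_N^{\alpha_N}$   derivative in multi-index notation, $D_i = \partial/\partial\xi_i$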
$\partial/\partial n$   derivative in the direction of the exterior
   normal
$\Delta = D_1^2 + \cdots + D_N^2$   Laplace operator in $\mathbb{R}^N$
$V_3$   real, three-dimensional, linear space   10
$xy$   scalar product of the vectors x and y in $V_3$   10
$x \times y$   vector product of the vectors x and y in $V_3$   10
$b_i$   vector of a basis $\{b_i\}$   11, 596, 617
$b^i$   vector of a dual basis $\{b^i\}$   13, 597, 620
$e_i$   vector of an orthonormal basis $\{e_i\}$ (basis
   vector of a Cartesian coordinate system)   11, 617
In the case of a finite-dimensional manifold, we write $b_i = \partial x/\partial u^i$ or briefly
$b_i = \partial/\partial u^i$. Moreover, we have $b^i = du^i$, where $u^1, \ldots, u^n$ denote local
coordinates. The precise definition of $b_i$ and $b^i$ on manifolds can be found in Section
73.23.
$\dot{x}(t)$   time derivative of the motion x = x(t) at t   11
$U'(x)$   F-derivative of the potential U at the point x,
   $U'(x) = \mathrm{grad}\, U(x)$   13
$E_{\mathrm{pot}}$   potential energy   33
$E_{\mathrm{kin}}$   kinetic energy   32
$\partial F(x)$   subdifferential of the functional F at the
   point x   III385
$\boldsymbol{\omega}$   angular velocity vector   11
$\omega$   angular velocity, $\omega = |\boldsymbol{\omega}|$;   11
   angular frequency, $\omega = 2\pi\nu$, where $\nu$ denotes
   the frequency   100
c   velocity of light   885
h   Planck's quantum of action   78
$\hbar$   $\hbar = h/2\pi$
G   gravitational constant   36
k   Boltzmann constant   400
Re   Reynolds number   440
$T_{ij}$   energy-momentum tensor   725
$\varepsilon$   strain tensor   170
$\tau$   stress tensor   177
$\sigma$   reduced stress tensor (first Piola-Kirchhoff
   tensor)   184
$L(V_3)$   space of all linear operators $\sigma: V_3 \to V_3$   166
   space of all linear symmetric operators $\sigma: V_3 \to V_3$
   on the H-space $V_3$   166
tr $\sigma$   trace of $\sigma \in L(V_3)$   166
det $\sigma$   determinant of $\sigma \in L(V_3)$   166
$[\sigma, \gamma]$   special product for $\sigma, \gamma \in L(V_3)$   166
$(\sigma|\gamma)$   scalar product on $L(V_3)$   166
div $\sigma(x)$   divergence of the tensor field $\sigma: V_3 \to L(V_3)$,
   i.e., $\sigma(x) \in L(V_3)$   167
$\sigma \circ \gamma$   dyadic product for $\sigma, \gamma \in L(V_3)$   167
adj $\sigma$   special operation for $\sigma \in L(V_3)$, adj $\sigma = (\det \sigma)\sigma^{-1}$   167
J(x)   Jacobian for y = y(x), $J(x) = \det y'(x)$
$w_{,i}$   identical with $D_i w$, where $D_i = \partial/\partial\xi_i$ (special
   notation in the theory of elasticity)   325
$D^i$   identical with $D_i = \partial/\partial\xi_i$ in a Cartesian coordinate
   system (This notation is used in connection with the
   Einstein summation convention. For example, $\xi_i D^i$
   means $\sum_i \xi_i D_i$.)

Special Notation for the Theory of Manifolds


$x_\varphi$   representative of the point x on the manifold
   M with respect to the chart $\varphi$, i.e., $x_\varphi = \varphi(x)$   534
$v_\varphi$   representative of the tangent vector v on the
   manifold M with respect to the chart $\varphi$, i.e.,
   $v_\varphi = \dot{x}_\varphi(t)$   539
$v^*_\varphi$   representative of the cotangent vector $v^*$ with
   respect to the chart $\varphi$   545
$TM_x$   tangent space of the manifold M at the point x   539
$TM$   tangent bundle of the manifold M   542
$T^2M$   identical with T(TM)   542
$TM_x^*$   cotangent space of the manifold M at x (dual
   space to $TM_x$)   545
$TM^*$   cotangent bundle of M   545
$f'(x)$   tangent map $f'(x): TM_x \to TN_{f(x)}$ at the point
   x of the manifold M with respect to the map
   $f: M \to N$   540
In the literature, one also uses the following notations for $f'(x)$: $T_xf$, $Tf(x)$,
$df(x)$, $Df(x)$, $df$, $df_x$.
$Tf$   tangent map $Tf: TM \to TN$ of the map $f: M \to N$   542
$T^2f$   identical with T(Tf), i.e., $T^2f$ maps $T^2M$ into
   $T^2N$   542
$df(x; h)$   differential of the map f at the point x in the
   direction of the tangent vector h, i.e., $df(x; h) = f'(x)h$
In the literature, one also uses the following notations for $df(x; h)$: $df_x(h)$ or
$h(f)$. Note that $df(x; h)$ is identical with the directional derivative of f at the
point x in the direction of h.
If $f: X \to Y$ is a $C^1$-map between the B-spaces X and Y over $\mathbb{K} = \mathbb{R}, \mathbb{C}$, then
we have $TX_x = X$ and $TY_y = Y$ for all $x \in X$ and $y \in Y$, respectively. Moreover,
the tangent map $f'(x)$ at x is identical with the F-derivative $f'(x): X \to Y$.
Finally, we have $df(x; h) = f'(x)h = \delta f(x; h)$ for all $x, h \in X$.
$du^i$   identical with $f'(x)$ for $f(x) = u^i(x)$, where $u^1, \ldots, u^n$
   are local coordinates of x, i.e., $du^i(h) = du^i_x(h) = f'(x)h$   598
$j^kf(h)$   Taylor expansion of f at the point x up to kth
   order, i.e., $j^kf(h) = f(x) + \sum_{j=1}^k f^{(j)}(x)h^j/j!$
$J^kf(x)$   k-jet of $f: M \to N$ at the point x   567
In the special case of a function $f: \mathbb{R} \to \mathbb{R}$, we have $J^kf(x) = (x, f(x), f'(x), \ldots,
f^{(k)}(x))$, where $f^{(k)}(x)$ is called the kth jet coordinate of f at the point x.
$J^k(M, N)$   k-jet manifold corresponding to smooth maps
   $f: M \to N$   567
$M \pitchfork N \bmod Y$   transversal intersection of the submanifolds M
   and N of Y   565
$f \pitchfork N \bmod Y$   the map $f: M \to Y$ is transversal to the
   submanifold N of Y   565
   Minkowski, Einstein manifold   696, 731
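
The Taylor-expansion and k-jet entries above can be made concrete for a real polynomial. The sketch below is only an illustration under stated assumptions: the coefficient-list representation and the helper names `derivative`, `evaluate`, `jet`, and `taylor` are hypothetical and not taken from the text.

```python
from math import factorial

# Assumption for this sketch: a polynomial p(x) = c0 + c1*x + c2*x**2 + ...
# is stored as the coefficient list [c0, c1, c2, ...].

def derivative(coeffs):
    """Coefficient list of the derivative p'."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Value p(x); the empty list represents the zero polynomial."""
    return sum(c * x**i for i, c in enumerate(coeffs))

def jet(coeffs, x, k):
    """Tuple (x, f(x), f'(x), ..., f^(k)(x)) for f: R -> R."""
    values = [x]
    current = list(coeffs)
    for _ in range(k + 1):
        values.append(evaluate(current, x))
        current = derivative(current)
    return tuple(values)

def taylor(coeffs, x, k, h):
    """Taylor expansion f(x) + sum_{j=1}^k f^(j)(x) h^j / j!."""
    derivs = jet(coeffs, x, k)[1:]  # f(x), f'(x), ..., f^(k)(x)
    return sum(d * h**j / factorial(j) for j, d in enumerate(derivs))

cube = [0, 0, 0, 1]            # f(x) = x**3
print(jet(cube, 2, 2))         # (2, 8, 12, 12)
print(taylor(cube, 2, 3, 1))   # 27.0, which equals f(3) since deg f = 3
```

For a polynomial of degree at most k, the expansion of order k at x reproduces f(x + h) exactly, which is what the last line checks.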

Special Notation for the Tensor Calculus


$A^{i'}_i$   transformation coefficient, $A^{i'}_i = \partial u^{i'}/\partial u^i$   618
$t^i$   simply contravariant tensor   619
$t_i$   simply covariant tensor   619
$t^{i_1 \ldots i_r}_{j_1 \ldots j_s}$   r-fold contravariant and s-fold covariant
   tensor   619
$t^i t_i$   identical with $\sum_{i=1}^n t^i t_i$, according to Einstein's
   summation convention   616
$\delta^i_j$   unit tensor   620
Alt $t_{ij}$   identical with $\frac{1}{2}(t_{ij} - t_{ji})$   622
Sym $t_{ij}$   identical with $\frac{1}{2}(t_{ij} + t_{ji})$   622
$g_{ij}$   metric tensor   620, 651
g   $\det(g_{ij})$   651
$\Gamma^k_{ij}$   Christoffel symbol   623
$R^i_{jkm}$   Riemann's curvature tensor   642
$R_{jm}$   Ricci tensor, $R_{jm} = R^k_{jkm}$   731
R   scalar curvature, $R = g^{jm}R_{jm}$   731
K   Gauss curvature of a surface, $K = R_{1212}/g$   637, 643
$\Omega^i_j$   curvature form, $\Omega^i_j = \frac{1}{2} R^i_{jkm}\, du^k \wedge du^m$   689
$\nabla_j t^i$   covariant derivative of the tensor field $t^i$   623
$Dt^i/da$ absolute derivative of the tensor field $t^i$ along the curve $u^j = u^j(a)$, $Dt^i/da = \dot{u}^j \nabla_j t^i$ 625
$Dt^i$ absolute differential of $t^i$ 625
grad $f$ gradient of a function $f$ 626
div $w$ divergence of a vector field in $\mathbb{R}^n$ 626
curl $w$ rotation of a vector field in $\mathbb{R}^3$ 629
Curl $w_i$ rotation of a vector field in $\mathbb{R}^n$ 630
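$\varepsilon_{i_1 \dots i_n}$ Levi-Civita symbol, the sign of the permutation $\begin{pmatrix} 1 & \dots & n \\ i_1 & \dots & i_n \end{pmatrix}$ 628
$E_{i_1 \dots i_n}$ standard n-fold covariant pseudotensor, $E_{i_1 \dots i_n} = |g|^{1/2} \varepsilon_{i_1 \dots i_n}$ 628
$E^{i_1 \dots i_n}$ standard n-fold contravariant pseudotensor, $E^{i_1 \dots i_n} = |g|^{-1/2} \varepsilon_{i_1 \dots i_n}$ 628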

$d_i t_{i_1 \dots i_r}$ alternating differentiation of the antisymmetric tensor field $t_{i_1 \dots i_r}$, $d_i t_{i_1 \dots i_r} = \operatorname{Alt} D_i t_{i_1 \dots i_r}$ 664
$\omega$ alternating differential form, $\omega = t_{i_1 \dots i_r}\, du^{i_1} \wedge \dots \wedge du^{i_r}$ 667
$d\omega$ derivative of the alternating differential form $\omega$, $d\omega = dt_{i_1 \dots i_r} \wedge du^{i_1} \wedge \dots \wedge du^{i_r}$ 668
$*\omega$ dual differential form to $\omega$ 672
$i_v \omega$ inner product of the alternating differential form $\omega$ with the vector field $v$ 671
dual derivative of the alternating differential form $\omega$ 672
Lie derivative of the tensor field $t^i$ along $a^i$ 673
Schwarzschild radius 757

Function Spaces
The reader should also consult the List of Symbols to Parts I and II. In the
following, $G$ denotes a nonempty bounded open set in $\mathbb{R}^N$. Let $-\infty < a < b < \infty$.
Moreover, let $k = 0, 1, \dots$, $1 \le p < \infty$, and $0 < \alpha \le 1$.

$\partial G = \partial_1 G \cup \partial_2 G$ special decomposition of the boundary $\partial G$ of $G$ 242
$\partial G \in C^{k,\alpha}$ boundary property of the set $G$ (If $k \ge 1$, then the boundary is smooth.) 1232
$\partial G \in C^{0,1}$ the set $G$ has a piecewise smooth boundary 1232
L(X, Y) space of linear continuous operators from X
into Y 1135
C(X, Y) space of continuous operators from X into Y 1148
$C^k(X, Y)$ space of k-times continuously F-differentiable
operators from X into Y 1148
$C^0(X, Y)$ identical with $C(X, Y)$
$H_\alpha(u)$ Hölder constant of $u$, i.e., $H_\alpha(u)$ is the smallest constant $L$ with
$|u(x) - u(y)| \le L|x - y|^\alpha$
for all $x, y \in G$
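$\|u\|_{C[a,b]}$ identical with $\max_{a \le x \le b} |u(x)|$
$\|u\|_{C^k[a,b]}$ identical with $\sum_{j=0}^{k} \max_{a \le x \le b} |u^{(j)}(x)|$
$\|u\|_{C^{k,\alpha}[a,b]}$ identical with $\|u\|_{C^k[a,b]} + H_\alpha(u^{(k)})$, where the Hölder constant $H_\alpha$ refers to $G = [a,b]$
$\|u\|_{C(\bar G)}$ identical with $\max_{x \in \bar G} |u(x)|$
$\|u\|_{C^k(\bar G)}$ identical with $\sum_{|\beta| \le k} \max_{x \in \bar G} |D^\beta u(x)|$
$\|u\|_{C^{k,\alpha}(\bar G)}$ identical with $\|u\|_{C^k(\bar G)} + \sum_{|\beta| = k} H_\alpha(D^\beta u)$
$\|u\|_p$ identical with $\left(\int_G |u(x)|^p\,dx\right)^{1/p}$
$\|u\|_{p,\partial G}$ identical with $\left(\int_{\partial G} |u(x)|^p\,dO\right)^{1/p}$
$\|u\|_\infty$ identical with $\operatorname{ess\,sup}_{x \in G} |u(x)|$ (see Section 18.6 of Part II)
$\|u\|_{\infty,\partial G}$ identical with $\operatorname{ess\,sup}_{x \in \partial G} |u(x)|$
$\|u\|_{k,p}$ identical with $\left(\int_G \sum_{|\beta| \le k} |D^\beta u(x)|^p\,dx\right)^{1/p}$
$\|u\|_{k,p,0}$ identical with $\left(\int_G \sum_{|\beta| = k} |D^\beta u(x)|^p\,dx\right)^{1/p}$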
$C[a,b]$ real B-space of all continuous functions $u: [a,b] \to \mathbb{R}$ with the norm $\|u\|_{C[a,b]}$
$C^k[a,b]$ real B-space of all k-times continuously differentiable functions $u: [a,b] \to \mathbb{R}$ with the norm $\|u\|_{C^k[a,b]}$
$C(\bar G)$ real B-space of all continuous functions $u: \bar G \to \mathbb{R}$ with the norm $\|u\|_{C(\bar G)}$
$C^k(\bar G)$ real B-space of all k-times continuously differentiable real functions on $\bar G$ with the norm
$\|u\|_{C^k(\bar G)}$. (More precisely, $C^k(\bar G)$ consists of all
continuous functions $u: \bar G \to \mathbb{R}$ which are k-times
continuously differentiable on $G$ and all
of whose partial derivatives up to kth order
can be continuously extended to the closure $\bar G$
of $G$.)

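$C^{k,\alpha}(\bar G)$ real B-space of all functions $u \in C^k(\bar G)$ with $\|u\|_{C^{k,\alpha}(\bar G)} < \infty$
$C^{k,\alpha}(\partial G)$ space of $C^{k,\alpha}$-functions $u: \partial G \to \mathbb{R}$ 1232
$C^\infty(G)$ space of infinitely differentiable functions $u: G \to \mathbb{R}$
$C_0^\infty(G)$ space of all functions $u \in C^\infty(G)$ with compact
support supp $u$, i.e., $u$ vanishes outside a compact
subset of $G$ (see Section 18.1 of Part II)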
$L_p(G)$ Lebesgue space of all measurable functions
$u: G \to \mathbb{R}$ with $\|u\|_p < \infty$ (see Section 18.6 of
Part II)
$L_p(\partial G)$ Lebesgue space of all measurable functions
$u: \partial G \to \mathbb{R}$ with $\|u\|_{p,\partial G} < \infty$ (The measurability
of $u$ refers to the surface measure on $\partial G$.)
$W_p^k(G)$ Sobolev space with the norm $\|u\|_{k,p}$ (see Section 21.2 of
Part II)
$\mathring{W}_p^k(G)$ closure of $C_0^\infty(G)$ in $W_p^k(G)$ (see Section 21.2 of
Part II)
The norms $\|\cdot\|_{k,p}$ and $\|\cdot\|_{k,p,0}$ are equivalent on the Sobolev space $\mathring{W}_p^k(G)$ (see
Section 21.2 of Part II).
$X^n$ product space of $X$, i.e., $X^n$ consists of all $u = (u_1, \dots, u_n)$ with $u_i \in X$ for all $i$
If $X$ is a B-space over $\mathbb{K} = \mathbb{R}, \mathbb{C}$, then $X^n$ is also a B-space over $\mathbb{K}$ with the norm

$\|u\| = \sum_{i=1}^{n} \|u_i\|_X$.

In this way we obtain the real B-spaces $L_p(G)^n$, $W_p^k(G)^n$, etc. and the spaces
$C^\infty(G)^n$, $C_0^\infty(G)^n$, etc.
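The product-space norm can be illustrated in the simplest finite-dimensional setting. The sketch below assumes X = R^m with the Euclidean norm as a stand-in for a general B-space; the names `norm_x` and `product_norm` are hypothetical.

```python
import math

def norm_x(v):
    # Norm on X; here X = R^m with the Euclidean norm (an illustrative choice).
    return math.sqrt(sum(c * c for c in v))

def product_norm(u):
    # Norm on the product space: the sum of the norms of the components u_i.
    return sum(norm_x(ui) for ui in u)

u = ((3.0, 4.0), (0.0, 0.0), (5.0, 12.0))
v = ((1.0, 0.0), (0.0, 2.0), (0.0, 0.0))
w = tuple(tuple(a + b for a, b in zip(ui, vi)) for ui, vi in zip(u, v))

total = product_norm(u)
```

Here `w` is the componentwise sum of `u` and `v`, so the triangle inequality for the product norm can be checked directly on this example.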
$L_2(G; V_3)$ Lebesgue space of functions $u: G \to V_3$ with
values in the real three-dimensional linear
space $V_3$ 243
$L_2(\partial G; V_3)$ Lebesgue space of functions $u: \partial G \to V_3$ 243
$L_2(G; L_{\mathrm{sym}}(V_3))$ Lebesgue space of functions $\sigma: G \to L_{\mathrm{sym}}(V_3)$ 243
$W_2^1(G; V_3)$ Sobolev space of functions $u: G \to V_3$ 243
$\mathring{W}_2^1(G; V_3)$ Sobolev space of functions $u: G \to V_3$ with $u = 0$ on $\partial G$ 242
$W_2^1(G, \partial_1 G; V_3)$ Sobolev space of functions $u: G \to V_3$ with $u = 0$ on $\partial_1 G$, where $\partial_1 G \subseteq \partial G$ 242
List of Theorems

The collection of all our experiences consists of what we know and what we have
forgotten.
Marie von Ebner-Eschenbach (1830-1916)

Theorem 58.A (Balance and conservation laws) . . . . . . . . . . . . . . . . . . 34


Theorem 58.B (The two-body problem and the three Kepler laws) . . 40
Theorem 58.C (Stability principle of minimal potential energy) . . . . . 51
Theorem 58.D (Existence and uniqueness theorem for the motion of
the rigid body). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Theorem 58.E (Existence of Lagrange multipliers in Lagrangian me-
chanics). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Theorem 58.F (Canonical transformations and the solution of the
canonical equations via the Hamilton-Jacobi equa-
tion).......................................... 84
Theorem 58.G (Lagrange brackets and the solution of the Hamilton-
Jacobi equations via the canonical equations). . . . . . . 85
Theorem 59.A (The uncertainty relation in quantum mechanics). . . . 123
Theorem 60.A (Existence and uniqueness theorem for the elasto-
plastic wire with linear hardening law)... . . . . . . . . . . 152
Theorem 61.A (The fundamental variational principle in nonlinear
elasticity) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
Theorem 61.B (The Rivlin-Ericksen theorem on the most general
constitutive law for homogeneous isotropic bodies) . . 207
Theorem 61.C (The stored-energy function of a homogeneous elastic
body)......................................... 208
Theorem 61.D (Main theorem of linear elastostatics-generalized
solutions) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211


Theorem 61.E (Main theorem of linear elastodynamics-generalized
solutions) ..................................... 212
Theorem 61.F (Local existence and uniqueness in nonlinear elasticity) 219
Theorem 61.G (Classical solutions in linear elastostatics) .......... 221
Theorem 61.H (Convergence of a general approximation method in
nonlinear elasticity) ............................. 228
Theorem 62.A (The principle of minimal potential energy for con-
vex material in nonlinear elasticity-existence and
uniqueness of solutions) ......................... 244
Theorem 62.B (The principle of maximal dual energy and duality in
nonlinear elasticity) ............................. 247
Theorem 62.C (Existence and uniqueness in linear quasi-statical
plasticity) ........................................ 262
Theorem 62.D (Existence and uniqueness in linear statical plasticity) 263
Theorem 62.E (Existence theorem for polyconvex material) ........ 275
Theorem 62.F (Korn's inequality).............................. 279
Theorem 62.G (Friedrichs' duality) ............................. 287
Theorem 63.A (Existence and uniqueness for the Signorini problem
in elasticity) ................................... 298
Theorem 64.A (Main theorem of bifurcation theory for variational
inequalities) ................................... 308
Theorem 64.B (Supported beams, variational inequalities, and bifur-
cation)........................................ 313
Theorem 65.A (Existence theorem for the von Karman plate equation) 332
Theorem 65.B (Bifurcation theorem for the von Karman plate equation).......................................... 333
Theorem 65.C (Supported plates, variational inequalities, and bifur-
cation)........................................ 341
Theorem 66.A (Existence and uniqueness theorem for a general model
in plasticity including internal state variables and
hardening effects)............................... 355
Theorem 67.A (Solution of the Gibbs equation for gases and liquids) 377
Theorem 69.A (Existence and uniqueness theorem for Carleman's ra-
diation problem) ............................... 426
Theorem 71.A (Complex function theory and hydrodynamics) ...... 453
Theorem 71.B (Permanent gravity waves and bifurcation) ......... 461
Theorem 72.A (Existence and uniqueness theorem for the stationary
Navier-Stokes equations) ....................... 490
Theorem 72.B (Existence and uniqueness theorem for the instationary
Navier-Stokes equations) ....................... 494
Theorem 72.C (The Taylor problem for viscous flow and bifurcation) 499
Theorem 72.D (The Benard problem for viscous flow and bifurcation) 510
Theorem 73.A (Existence and uniqueness theorem for ordinary
differential equations on Banach manifolds) ............. 547
Theorem 73.B (Linearization principle for diffeomorphisms) ....... 552

Theorem 73.C (Construction of manifolds via submersions-the
preimage theorem) ................................ 556
Theorem 73.D (Construction of manifolds via subimmersions) ...... 558
Theorem 73.E (Construction of manifolds via embeddings) ........ 559
Theorem 73.F (Construction of diffeomorphisms via ordinary
differential equations and the generalized Morse lemma) .. 561
Theorem 73.G (Construction of manifolds via transversality) ....... 565
Theorem 73.H (Sard's theorem) ................................ 587
Theorem 73.I (Whitney's embedding theorem) .................. 588
Theorem 74.A (Parallel transport of vector fields and covariant differ-
entiation) ..................................... 627
Theorem 74.B (Main theorem of surface theory of Bonnet)......... 640
Theorem 74.C (Theorema egregium of Gauss) ................... 643
Theorem 74.D (Riemann's curvature tensor and Riemann's theorem
on the local flatness of Riemannian manifolds) ...... 654
Theorem 76.A (The fundamental variational principle for the motion
of light and matter in general relativity) ............ 733
Theorem 76.B (Friedman's model of the universe) ................ 739
Theorem 76.C (The Schwarzschild solution in general relativity) .... 757
Theorem 76.D (The motion of the perihelion of planets in general
relativity) ..................................... 759
Theorem 76.E (The Kruskal solution in general relativity and black
holes) ......................................... 769
Theorem 77.A (The fixed-point theorem of Brouwer) .............. 799
Theorem 77.B (The inequality of Fan) .......................... 801
Theorem 77.C (The main theorem about n-person games of Nash) .. 802
Theorem 77.D (The fixed-point theorem of Fan and Glicksberg) .... 805
Theorem 77.E (The main theorem of mathematical economics of
Gale, Nikaido, and Debreu) ...................... 807
Theorem 78.A (The Sard-Smale theorem for Banach manifolds) .... 829
Theorem 78.B (Parametrized version of the Sard-Smale theorem) .. 833
Theorem 78.C (The main theorem about the generic finiteness of the
solution set of operator equations) ................ 834
Theorem 79.A (Ljapunov's main theorem of stability theory in B-
spaces)........................................ 843
Theorem 79.B (Structure of flows for autonomous differential equa-
tions in B-spaces) ............................... 847
Theorem 79.C (Asymptotic stability and instability of periodic solu-
tions in B-spaces) ............................... 851
Theorem 79.D (Orbital stability) ............................... 853
Theorem 79.E (Loss of stability and the main theorem about the
bifurcation of equilibrium points) ................. 858
Theorem 79.F (Loss of stability and the main theorem about Hopf
bifurcation) .................................... 862
Theorem 79.G (Center theorem of Ljapunov) .................... 868
List of the Most Important Definitions

International system of units . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 883


Dimension of important physical quantities . . . . . . . . . . . . . . . . . . . . . . . 884
Universal constants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 884
Planck's quantum of action h (ħ = h/2π). . . . . . . . . . . . . . . . . . . . 102, 884
velocity of light c . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 700, 884
Boltzmann constant k................................... 400,884
gravitational constant G . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36, 884
Hubble constant. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Critical density of the universe. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Critical temperature of the universe for the existence of specific ele-
mentary particles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Reynolds number Re and turbulence. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
Elementary units . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 750
Planck length, Planck time, Planck temperature, Planck energy,
Planck mass, elementary charge (charge of the proton)....... . 751

Energy
total................................................ ...... 34
kinetic.............................................. ...... 32
potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
inner . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378, 387
free................................................. ...... 387
elastic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190, 198
dual elastic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
Energy
of a free particle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 721
of a photon. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102


Energy-momentum tensor. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 725


current density vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
conservation law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34, 422, 724
Entropy................................................. 387, 397
Thermodynamical potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385, 387
inner energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
free energy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
enthalpy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
free enthalpy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
statistical potential of Gibbs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
Temperature of a radiation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
The fundamental partition function in statistical physics . . . . . . . . . . . . 400
temperature. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398
chemical potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 398

Velocity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
virtual velocity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Acceleration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Momentum. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28, 32
momentum of a photon. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
generalized momentum in Hamiltonian mechanics . . . . . . . . . . . . . . . 73
Angular momentum. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
spin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Mass
in classical mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26, 28
in special relativity . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 721
in general relativity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 731, 736
rest mass..... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 721
Center of mass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Force
conservative force. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
centrifugal force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Coriolis force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
constraining force. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
the four fundamental forces in the universe. . . . . . . . . . . . . . . . . . . . . . 135
Torque...................................................... 32
Potential of a force. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
gauge invariance in classical mechanics. . . . . . . . . . . . . . . . . . . . . . . . . 33
gauge invariance in modern physics (see Part V)
Potential of a velocity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
Work (force times displacement) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Power (work divided by time). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
principle of virtual power . . . . . . . . . . . . . . . . . . . . . . . . .. .. . . . 16, 19, 49
Action (energy times time). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
action along a motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

principle of stationary action . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70


Lagrange function (Lagrangian). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21, 71
Hamilton function (Hamiltonian)... . . . . . . . . . . . . . . . . . . . . . . . . . . . 21, 73
Lagrange multipliers and constraining forces. . . . . . . . . . . . . . . . . . . . 46, 67
Legendre transformation
in mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
in thermodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385

Wave
wave length λ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
wave vector.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
frequency ν . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
angular frequency ω......................................... 100
phase displacement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
phase velocity.... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
group velocity.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
dispersion relation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
polarization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
plane waves............................................... . 100
spherical waves. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Damped oscillations
mean life-time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
frequency-time uncertainty relation. . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
Probability of presence for particles in
quantum mechanics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Heisenberg's uncertainty relation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
position-momentum uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
energy-time uncertainty for quasi-stable quantum states . . . . . . . . . . 130
Measurements in quantum mechanics
expectation value....... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
dispersion and mean error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Decay probability for particles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
Reaction probability for particles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
Cross section for particle reactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

Equilibrium point (state)


of a dynamical system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 841
of a mechanical system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16, 50
of a thermodynamical system (see also KMS states in Part V).... . . 373
in game theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 802
in mathematical economics (Walras equilibrium).... . . . . . . . . . . . . . 806
Stability of an equilibrium point in the sense of Ljapunov
asymptotically stable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 842
stable............................................... ...... 841
unstable.... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 842

Orbital stability of periodic processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 852


Statical stability of mechanical systems......................... 16, 50
Strongly stable states in elastostatics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
Stability of thermodynamical equilibria via thermodynamical potentials 389
Multipliers for equilibrium states
asymptotically stable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 846
critical . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 846
unstable. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 846
Floquet multipliers for periodic processes. . . . . . . . . . . . . . . . . . . . . . . . . 851
Bifurcation point (see page 428 of Part I)

Deformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
displacement. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
Strain tensor 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
Stress tensor t . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
Reduced stress tensor t1 (first Piola-Kirchhoff tensor)............... 184
Principal strain...... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
Piola transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . 175
Piola identity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Constitutive law. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
dual constitutive law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
Yield condition in plasticity . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . 155, 200
Stored energy function (density of the elastic potential energy of a body) 190
Dual energy of an elastic body. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
Friedrichs' duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
Trefftz' duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288

Flow of fluids
inviscid (ideal). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
viscous. . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
incompressible . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
stationary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
irrotational . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
Circulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
Viscosity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437
Tensor of inner friction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
Pressure (force divided by surface). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 435

Inertial system
in classical mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
in the theory of relativity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 700
Proper time... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 718
Lorentz transformation...... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 706, 711
Poincare group. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 712
Minkowski space-time manifold $M^4$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713

Einstein space-time manifold $E^4$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 730


Friedman metric
of the closed universe..... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 739
of the open universe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 741
Schwarzschild metric of the sun... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756
Kruskal metric and black holes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 768

Banach manifold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535


tangent vector. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 538
tangent space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539
tangent bundle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542
cotangent vector (covector or also differential form) . . . . . . . . . . . . . . 545
cotangent space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
cotangent bundle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
Chart.............................................. ......... 533
chart map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
chart space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
local coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
admissible chart. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
Atlas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
equivalent atlas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 536
maximal atlas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
Differentiable structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 536
$C^k$-manifold. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
topological manifold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
analytical manifold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
Submanifold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
Manifold with boundary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 584
Dimension of a manifold. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
codimension of a submanifold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557
Orientation of a manifold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 582
Fredholm mapping (see also Section 8.4 of Part I). . . . . . . . . . . . . . . . . . 552
double splitting map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
the linear subspace L splits the space X (see page 766 of Part I)
projection operator (see page 766 of Part I)
direct topological sum (see page 766 of Part I)

Mappings between manifolds


tangent map at a point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541
tangent map on the tangent bundle. . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
$C^k$-map ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537
diffeomorphism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537
regular and singular (critical) value . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552
regular and singular (critical) point . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552
etale map..... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551

submersion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551
immersion....... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551
subimmersion............ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551
embedding................................................. 559
proper map. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551
closed map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551
k-jet $J^k f(x)$ at the point x. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 567
k-jet manifold $J^k(M, N)$...................................... 567
Vector field on a manifold...................................... 543
Flow on a manifold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
Bundles
abstract bundle......... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 544
vector bundle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589
fiber bundle and principal fiber bundle (see Part V)

Singularities and catastrophe theory


equivalence of maps......................................... 571
unfolding... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575
versal................................................... 575
universal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575
k-determination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
transversality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 565
structural stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579
genericity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 579
Whitney topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 569

Derivative f'(x): TM_x → TN_{f(x)} of a map f: M → N at the point x.... 541
Differential df(x; h) = f'(x)h............ . . . . . . . . . . . . . . . . . . . . . . . 595
Directional derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 595
Derivation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 600
Subgradient ∂F of a functional (see page 385 of Part III)
Covariant derivative of a tensor field. . . . . . . . . . . . . . . . . . . . . . . . . . . . 623
Absolute derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
Absolute differential. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
Parallel transport of a tensor field . . . . . . . . . . . . . . . . . . . . . . . . . . . 626, 650
Alternating differentiation of antisymmetric tensor fields . . . . . . . . . . . . 664
Differentiation of alternating differential forms . . . . . . . . . . . . . . . . . . . . 668
Lie derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673
Lie group. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 677
Lie algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 676

Tensor...................................................... 13
covariant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 618
contravariant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 619
symmetric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 622
antisymmetric (skew-symmetric or alternating) . . . . . . . . . . . . . . . . . . 622


tensor in V_3. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
Tensor density . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 630
Pseudotensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628
Pseudotensor density. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 630
Unit tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620
Metric tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 651
signature of the metric tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654
Curvature tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 642
Gaussian curvature........................................... 637
Affine connected manifold. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 649
Riemannian manifold
general. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 651
proper.................................................... 651
Local flatness of Riemannian manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . 654
Geodesic .................................... . . . . . . . . . . . . . . . . 653
affine geodesic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 650

Sperner simplex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 798


Regular solution curve......................................... 819
Homotopy method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 818
Mapping degree (see also Chapter 12 of Part I) . . . . . . . . . . . . . . . . 670, 826
Fixed-point index (see also Chapter 12 of Part I)................... 825
Residual set (massive set). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 570
List of Basic Equations in
Mathematical Physics

Classical mechanics
Newton's fundamental equation in inertial systems. . . . . . . . . . . . . . . 26
Newton's fundamental equation in arbitrary systems of reference. . . 30
Gauss' principle of least constraint and the general basic equations
of point mechanics with side conditions . . . . . . . . . . . . . . . . . . . . 45
Lagrange's equation and the fundamental variational principle of
stationary action . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70, 71
Hamilton's canonical equation................................ 72
Hamilton-Jacobi equation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Poisson's equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
the principle of virtual power (virtual work)............. 17, 18, 19,49
the statical stability principle for mechanical equilibrium states
and the principle of minimal potential energy . . . . . . . . . . 16, 50, 51
the general dynamical stability principle of Ljapunov. . . . . . . . . . 20, 843
symplectic manifolds and classical mechanics. . . . . . . . . . . . . . . . . Part V
algebraic approach to classical mechanics via operator algebras . . Part V
Nonlinear elasticity
Cauchy's fundamental equation in nonlinear elasticity. . . . . . . . . . . . 176
hyperelasticity and the principle of minimal potential energy....... 190
the principle of dual elastic energy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
linear material and linear Hooke's law . . . . . . . . . . . . . . . . . . . . . . . . . 201
convex material and nonlinear Hooke's law. . . . . . . . . . . . . . . . . . . . . 235
polyconvex material. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
Plasticity
statical plasticity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
quasi-statical plasticity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
quasi-dynamical plasticity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350


Dynamical plasticity and rheology


ideal plastic von Mises liquid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
viscoplastic Bingham liquid. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
viscous Williamson liquid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
non-Newtonian liquids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
Hydrodynamics
the fundamental equations for smooth flow . . . . . . . . . . . . . . . . . . . . . 434
the Newton equation in hydrostatics. . . . . . . . . . . . . . . . . . . . . . . . . . . 445
the Euler equation for inviscid flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
the Navier-Stokes equation for viscous flow . . . . . . . . . . . . . . . . . . . . 438
the Prandtl boundary layer equation and singular perturbation theory 520
the Bernoulli equation and conservation of energy . . . . . . . . . . . . . . . 438
complex function theory and inviscid flow in the plane. . . . . . . . . . . . 453
basic equation in filtration theory . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
basic equation for non-Newtonian fluids..................... Part V
Gas dynamics and shock waves
the fundamental equations for nonsmooth flow of gases and liquids
and the Rankine-Hugoniot jump conditions . . . . . . . . . . . . . Part V
the fundamental equations for isentropic flow and the Helmholtz
vorticity equation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
Phenomenological thermodynamics
the three laws of thermodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
Gibbs' fundamental equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
the stability of thermodynamical equilibria via thermodynamical po-
tentials. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
heat conduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424
Statistical physics
the fundamental model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397
Bose statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
Fermi statistics.............................. . . . . . . . . . . . . . 403
classical Boltzmann statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420
quasi-classical statistics in phase space....................... 417
symplectic manifolds and the classical Gibbs statistics in phase
space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
the fundamental Hilbert space approach to quantum statistics and
von Neumann's density matrix. . . . . . . . . . . . . . . . . . . . . . . . . Part V
the modern algebraic approach to quantum statistics via operator
algebras (equilibrium states correspond to KMS functionals) Part V
operator algebras, diagonalization of the Hamilton operator, and
quasi-particles in superfluidity . . . . . . . . . . . . . . . . . . . . . . . . . Part V
Irreversible thermodynamics
the fundamental equation for the entropy production (Onsager's
relations)........................................... Part V
the Boltzmann equation......... . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
plasma physics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
physical kinetics and superfluidity . . . . . . . . . . . . . . . . . . . . . . . . . . Part V


phase transitions....... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
Chemistry
the fundamental equation of chemical kinetics . . . . . . . . . . . . . . . . Part V
the fundamental equation of quantum chemistry . . . . . . . . . . . . . . Part V
Theory of special relativity
Einstein's principle of relativity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 699
principle of constant velocity of light. . . . . . . . . . . . . . . . . . . . . . . . . . . 700
motion of free particles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 719
motion of photons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 722
motion of relativistic inviscid fluids and cosmology. . . . . . . . . . . 726, 739
relativistic electromagnetism. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
energy-momentum tensor and relativistic field theories. . . . . . . . . Part V
Dirac's equation for the relativistic electron . . . . . . . . . . . . . . . . . . Part V
Theory of general relativity
Einstein's fundamental equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 730
Hilbert's variational principle for the Einstein equation . . . . . . . . . . . 734
the fundamental equation for the motion of matter and light in the
universe. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 730
Electromagnetism
Maxwell's fundamental equations . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
quantum electrodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
electromagnetism and gauge field theory. . . . . . . . . . . . . . . . . . . . . Part V
Quantum mechanics
Schrödinger's fundamental equation of nonrelativistic quantum
mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Dirac's equation for the relativistic electron . . . . . . . . . . . . . . . . . . Part V
relativistic field theories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
the fundamental Hilbert space model and the dualism between
waves and particles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
algebraic approach to quantum mechanics via operator algebras Part V
Quantum field theory
approach via canonical quantization. . . . . . . . . . . . . . . . . . . . . . . . Part V
approach via Feynman integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
axiomatic approach via operator algebras . . . . . . . . . . . . . . . . . . . Part V
Free quantum fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
Quantum electrodynamics, scattering processes, and the S-matrix . . Part V
Gauge field theory and the curvature of principal fiber bundles. . . . Part V
Elementary particle physics and gauge field theory
the Weinberg-Salam model for the electroweak interaction and
the group SU(2) × U(1)............................... Part V
the quark model for the strong interaction and the groups SU(3)
(color group) and SU(n) (flavor group). . . . . . . . . . . . . . . . . . . Part V
the grand unification of electromagnetic, weak, and strong interac-
tion and the group SU(5).............................. Part V
supersymmetry and graded Lie algebras . . . . . . . . . . . . . . . . . . . . . Part V


superstring theory and the unification of all fundamental interac-
tions in nature including gravitation . . . . . . . . . . . . . . . . . . . . Part V

Important Principles
Conservation laws
conserved quantities in classical mechanics. . . . . . . . . . . . . . . . . . . . . . 32
current density vectors and conservation laws . . . . . . . . . . . . . . . . . . . 422
energy-momentum tensor and conservation laws in relativistic field
theories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 723
Noether theorem and general conservation laws via global sym-
metries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
conservation laws and the Rankine-Hugoniot jump conditions in
gas dynamics (shock waves)............................ Part V
conservation laws and the fundamental equations of irreversible
thermodynamics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
Propagation of discontinuities along characteristics (wave propaga-
tion) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
propagation of light in electromagnetism . . . . . . . . . . . . . . . . . . . . Part V
transversal and longitudinal elastic waves. . . . . . . . . . . . . . . . . . . . Part V
propagation of sound....... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
shock waves in gas dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
Symmetries in nature correspond to groups . . . . . . . . . . . . . . . . . . . . Part V
Stability
dynamical stability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20, 841
statical stability and the principle of minimal potential energy 16, 50, 51
statical stability in elasticity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193, 304
orbital stability of periodic processes .......... . . . . . . . . . . . . . . . . . 852
stability of thermodynamical equilibria. . . . . . . . . . . . . . . . . . . . . . . . . 389
Loss of stability leads to bifurcation
equilibrium states bifurcate into new equilibrium states . . . . . . . . . . . 856
equilibrium states bifurcate into periodic processes (Hopf bifurcation) 860
Similarity and the structure of physical laws
basic idea. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
turbulence and the Reynolds number . . . . . . . . . . . . . . . . . . . . . . . . . . 440
turbulence and the Kolmogorov law........................... 513
Singular limits and singular perturbation theory
boundary layers in hydrodynamics ............ . . . . . . . . . . . . . . . . 518
incompressible flow as a singular limit of compressible flow for
c_s → ∞ (c_s = speed of sound). . . . . . . . . . . . . . . . . . . . . . . . . Part V
inviscid flow as a singular limit of viscous flow for η → 0 (η = viscosity) 520
geometrical optics as a singular limit of electromagnetic waves
for λ → 0 (λ = wavelength of light) . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
classical mechanics as a singular limit of the theory of relativity
for c → ∞ (c = velocity of light) . . . . . . . . . . . . . . . . . . . . . . . 706, 735
classical mechanics as a singular limit of quantum mechanics for
h → 0 (h = Planck's quantum of action). . . . . . . . . . . . . . . . . . . . . 129
simultaneous biological oscillations with completely different peri-
ods (a basic problem in mathematical biology)............ Part V
The fundamental algebraic approach to modern physics via operator
algebras
observables correspond to operators . . . . . . . . . . . . . . . . . . . . . . . . Part V
states correspond to positive functionals . . . . . . . . . . . . . . . . . . . . . Part V
the dynamics of physical systems corresponds to one-parameter
automorphism groups of the algebra . . . . . . . . . . . . . . . . . . . . Part V
thermodynamical states correspond to special functionals called
KMS states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
Applications of the algebraic approach to physics
classical mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
classical statistical physics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
quantum statistics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
quantum field theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
Heisenberg's uncertainty principle
position-momentum uncertainty.............................. 124
energy-time uncertainty for quasi-stable quantum states . . . . . . . . . . 130
general formulation (see also Part V). . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Spin and statistics
Fermions (i.e., elementary particles with half-integer spin) satisfy
the Fermi statistics...................................... 126
Bosons (i.e., elementary particles with integer spin) satisfy the Bose
statistics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
Principle of indistinguishability of identical elementary particles . . 401, 417
Pauli principle: in a system of Fermions, two particles can never be in
the same quantum state. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
Fundamental principles in modern physics
the field equations are the Euler equations to the variational
principle of stationary action . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
global symmetries yield conservation laws via the Noether theo-
rem................................................ Part V
local symmetries yield the fundamental interactions in nature via
gauge field theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
interactions correspond to the curvature of appropriate manifolds Part V
elementary particles correspond to irreducible representations of
appropriate Lie groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Part V
Probability plays a fundamental role in nature. . . . . . . . . . 113, 366, Part V
Index

The reader should also consult the detailed Contents of this volume and the index
material (List of Theorems, etc.). Moreover, the reader may also consult the Indices
of Parts I through III. If several page numbers belong to the same catchword,
then the primary reference is italicized. The page number I345 refers to page 345 of
Part I, etc.

absolute
  derivative 625
  differential 625
  space 29, 31, 702, 704ff
  temperature (see temperature)
  time 31, 702
acceleration 11
accessory quadratic variational problem 218
action 81, 884
  propagation of 81
addition theorem for velocities in relativity 708ff
adiabatic process 380, 394
admissible chart 535
admissible coordinate system 632
affine connected manifolds 649
Airy's stress function 328, 337, 345
algebraically simple eigenvalue I374, 853ff
alternating differential forms 664ff (see also Part V)
  basic ideas of the calculus for 665
  differentiation of 665, 668
  integration of 665, 669
amplitude 101
analytic map I362
angular
  frequency 28, 99
  momentum 32
  momentum in quantum mechanics 115
  velocity 12
angular momentum tensor 725ff
antisymmetric (skew-symmetric) 622
approximation methods in
  elasticity 224ff, 252
  general relativity 735
  hydrodynamics 483
  quantum mechanics 129
approximation models in elasticity 163, 240ff
area preserving maps 644
asymptotically stable 842
atlas 534
  abstract 602

atlas (continued)
  equivalent 536
  maximal 535
  oriented 583
autonomous differential equations 548, 841

Baire category I802
Baire space I801
balance of
  angular momentum 34
  energy 34
  momentum 34
Banach manifold (see manifold)
Banach space (see B-space)
baryon number 416
basic equations in mathematical physics, list of 953
basis space of a bundle 544
beam equation 315ff
Bénard problem 505ff, 518
bending of beams and rods 311ff
Bernoulli constant 453
Bernoulli equation 433, 438, 453
Betti number 688
bifurcation
  and loss of stability 221ff, 227, 841, 856ff, 860ff
  for beams and rods 311ff, 320
  for dynamical systems 856ff (see also Part V)
  for plates 323, 333
  for variational inequalities 303ff
  in elasticity 221, 227, 303ff, 311ff, 332, 346
  in hydrodynamics 448, 495, 505
bifurcation point
  of an operator equation I358
  of a variational inequality 304
bifurcation principle 821, 837, 841
big bang 57ff, 742ff
bilinear form 1141
  compact (see Part II)
  nondegenerate 561
  strictly positive (see Part II)
  strongly positive (see Part II)
  weakly nondegenerate 561
black-body radiation 410ff
black hole 768, 771ff, 778ff
black-white dipole hole 768, 775ff
blowing-up lemma 606
Bohr's atomic model 137
Boltzmann constant 59, 366, 400, 884
Boltzmann statistics 420
Bose statistics 402, 408
  basic ideas of 401
boson 126, 151
boundary layers 518ff
boundary of stability 222
Boussinesq approximation 506, 512
Brouwer's fixed-point theorem 151
  elementary proof of 799
  equivalent statements to the 795, 810
Brouwer's principle 797
B-space (Banach space) I769ff
buckling force 304, 314
bundle 544
  abstract 544
  cotangent 545
  fiber (see Part V)
  isomorphism 544
  morphism 544
  normal 591
  space 544
  tangent 542
  vector 590

canonical equation 21, 72, 83ff
canonical transformation 83
Carnot's cycle 394
catastrophe theory 572ff
  basic ideas of 572
causal structure of manifolds 787
causality 714ff
cavities 471
center
  of mass 33
  theorem, general (see Part V)
  theorem of Ljapunov 868
center, stable, and unstable manifolds (see Part V)
centrifugal force 30
chain rule for tangent maps 543
Chandrasekhar mass, critical 412, 718
characteristic classes 690
chart 533
  abstract 536
  admissible 535
  admissible oriented 584
  compatible 534
  image 534
  map 535
  space 534
Chebyshev inequality 115, 398
chemical potential 374, 398
  as a Lagrange multiplier 398
  of an ideal gas 377, 406
  of a photon gas 409
  of elementary particles 416
chemical reactions 373, 389ff
Christoffel symbols 623
  effective calculation of the 734
  intuitive meaning of the 682
circulation 438
closed map 551
cnoidal waves 466ff
cobasis 576
coboundary 671
cocycle 671
codimension I765
  of a map 576
  of a submanifold 557
coherently oriented 585
cohomology 671
  class 671
  group 671
compact operator 153
compact set I756, I769
compact spacelike support 724
compensated compactness 264ff
complex flow potential 454
complex velocity 453
complexification I770
component of a set I757
cone I276
configuration space 22, 48
conformal maps 645
conjugate functional 65ff
connected set I757
connection in a principal fiber bundle 690 (see also Part V)
conservation laws 34, 422ff, 724
  basic ideas of 422
  differential forms and 782
  in relativity 724 (see also Part V)
conservation of
  angular momentum 35
  energy 34
  mass 423
  momentum 35
conservative force 33
conserved quantities in mechanics 32
constitutive laws
  basic ideas of 147ff
  convex material 237, 244ff
  dual 237, 246
  elastic energy of the cuboid and 198, 230
  elasto-viscoplastic material 151ff, 348ff, 354
  for homogeneous bodies 205ff
  for homogeneous isotropic bodies 206
  for the friction tensor 436ff
  Fourier's law 424
  hardening and 148, 151ff, 348ff, 353
  Hencky material 202, 256
  Hooke's law 148, 201, 233ff, 255
  ideal plastic material 149
  ideal plastic von Mises liquid 157 (see also Part V)
  in heat conduction 424
  in hydrodynamics 434ff
  in hyperelasticity 190
  in linear elasticity 201, 208, 255
  in nonlinear elasticity 176, 185
  in plasticity 148ff, 154ff, 200ff, 257ff, 348ff
  inverse (dual) 237, 246
  linear material 201, 208, 255, 259
  Mooney-Rivlin material 209
  Navier-Stokes liquid 438
  non-Newtonian liquid 157 (see also Part V)
  Ogden's material 209
  polyconvex material 209, 241, 273
  rubberlike material 208, 277
  Saint Venant-Kirchhoff material 208
  theory of invariants and 202ff
  viscoplastic Bingham liquid 157 (see also Part V)
  viscoplastic material 149ff, 348ff
constitutive laws (continued)
  viscous material 150
  viscous Williamson liquid 157 (see also Part V)
constraining force 46ff
  of a pendulum 93
continuation method
  in heat conduction 422ff
  in nonlinear elasticity 224ff
  in numerical mathematics 818ff
continuity equation 434
continuous operator I755, I770
contraction of tensors 622
contravariant tensor 619
convex
  functional III245, III380
  set III245
Coriolis force 30
cosmos
  early 63, 747ff
  future of the 745ff
cotangent bundle 545
cotangent vector 545, 595
Couette flow 497
coulomb (unit of charge) 884
Coulomb's law 38
countable basis I754, 536
covariance 720
covariant differentiation 623
  motivation of the 626
  on surfaces 646
covariant tensor 618
covector field on a manifold 546
critical mass density of the universe 60, 744
critical temperature for elementary particles 62
cross section for elementary particle processes 106, 137, 139
current density vector 422ff, 725
curvature scalar 731
curvature tensor 642ff, 650, 653, 655, 683, 731, 734, 784
  analytic meaning of the 650
  definition of the 642
  geometric meaning of the 654, 683
  important properties of the 643, 784
  in general relativity 730ff, 734
  theorema egregium of Gauss and the 642
  Riemann's theorem on local flatness and the 654
curve following algorithm 812ff, 822, 837

de Broglie matter wave 112
de Rham cohomology 671
dead water problem 471
decay of particles 105
decay probability 106
deflection of light in general relativity 765
deformation 177
  of a cuboid 198
degree of a tensor 619
derivation 600
dielectricity constant 38, 884
diffeomorphism 537
  construction of a 560
  local 538
differentiable structure 536
differential 595
  absolute 625
  calculus on manifolds 648, 663ff
differential forms 599
  and conservation laws 782
  in mechanics (see Part V)
  in thermodynamics 383ff
differential topology 688ff
  and numerical mathematics 817ff
  and the fixed-point index 825
  and the mapping degree 825
differentiation
  absolute 625, 650
  alternating 664
  covariant 623, 650
  Lie 673
  of differential forms 665ff
  on manifolds 540ff, 663ff
dimension analysis
  and boundary layers 521
  and turbulence 440, 515
  basic ideas of 89
dimension of a manifold 535
direct sum I766
directional derivative 595, 600, 673
dispersion
  in quantum mechanics 114
  in statistical physics 398, 407
  in turbulence 515
  relation 100, 102, 449
divergence of a tensor 166
door-in/door-out principle 813, 820
Doppler effect 91
  and the red shift of galaxies 57
double splitting maps 531
dual elastic energy 245
dualism between waves and particles 107ff
duality
  in elasticity 235, 237, 245ff, 289
  in plasticity 259, 262
  in the calculus of variations 284
  of Friedrichs 284ff
  of Trefftz 288 (see also Part III)
dyadic product 167
dynamical stability 841ff
  basic ideas of 20
dynamical systems (see ordinary differential equations)

e-homeomorphism 559
eddies in turbulence 513
Einstein's
  principle of relativity 32, 695
  space-time manifold 730ff
  summation convention 11, 616
Einstein's equations 730
  and bifurcation 788
  initial-value problem for 787
  light quantum hypothesis 102
elastic energy of the cuboid 198
elastic potential energy 190
elasticity
  basic ideas in 159ff, 234ff
  linear 239, 255
  module 148
  nonlinear 158, 176
  plane 345
  typical difficulties in 179
elastodynamics, linear 212
elastostatics 180
elasto-viscoplastic wire 151
electron volt 883ff
elementary particles 130ff
  critical temperature for 62
  cross section for 106, 137, 139
  lifetime of 130
elliptic geometry 656ff
embedding 559
embedding theorem of
  Nash 661
  Whitney 588
energy 19, 34
  and the first law of thermodynamics 379
  dissipation 513ff
  dual elastic 245
  elastic potential 190
  fluctuations 399, 406
  free 387
  in mechanics 34
  inner 374, 379, 386ff
  kinetic 32
  of a free particle 721
  of a photon 102, 721
  of the present universe 61
  potential 33
  total 34
energy-momentum tensor 725, 727, 732, 734
energy-time uncertainty 104, 130
enthalpy 387, 391
entropy 363ff, 380, 397
  and equilibrium states 388
  and Gibbs' fundamental equation 374
  and information 364, 419
  and statistical weights 366, 420
  and the second law of thermodynamics 380
  as a thermodynamical potential 387
  balance 434
  basic ideas of 363ff
  in classical statistics 417
  in hydrodynamics 434
  in quasi-classical statistics 417
  in statistical physics 397
  of radiation 59, 409
equation of state 372, 377, 414
equilibrium forms of rotating fluids 476ff
equilibrium point trick 846


asymptotically stable 842 flow
in economics 806 around a body 474
of a differential equation 841 in tubes 439
of dynamical systems 841 irrotational 438
ofNash 802 laminar 440
ofWalras 806 on a manifold 547JJ, 847
stable 841 parallel 437
unstable 842 planar 453
equilibrium states stationary 438
computation of 387 turbulent 440
in mechanics 50 fluid
in thermodynamics 373, 387ff compressible 437
necessary condition for 35 ideal (see idviscid)
equivalence of maps 571 incompressible 438
equivalent system of reference 29 inviscid 439
etale mapping 551 viscous 438
Euler flux 450
characteristic 686, 688ff, 692 force 26
class 689 and Newton's basic equation 26
equation in hydrodynamics 439 centrifugal 30
events conservative 33
lightlike 715 constraining 46
spacelike 715 Coriolis 30
timelike 715 Coulomb 38
expansion of the universe 58ff, 742ff four fundamental forces in nature
expectation value (see mean value) 135
extremal principles in thermodynamics gravitational 35
387 inertial 30
unit of 884
form invariance 720
Fan's inequality 801 four momentum 723
Feigenbaum bifurcation 522 four velocity 723
Fermi statistics 402, 408 Fourier's law 424
basic ideas of 401 Fredholm operator I365, 552
fermion 126,751 free
fiber 544 boundary-value problem 450
field quantum 131ff energy 387, 390
fixed-point index 824 (see also Part I) enthalpy 387, 391
and differential topology 825 fall 26
fixed-point theorem of particle 719
Brouwer 799 frequency 28,99
Fan-Glicksberg 805 frequency-time uncertainty relation
Kakutani 804 104
fixed-point theorems 795ff friction 436ff
Floquet Friedman model
eigenmultiplier 850 closed 736, 145
multiplier 850ff open 741, 141
transformation 850 function spaces, notations for 940

fundamental force in a mine 91


equation of Gibbs 374, 382 law 35
equations of surface theory 639 potential 36
forms of Gauss 633 potential of a body 90
interactions in nature 134ff potential of a spherical shell 91
future of our cosmos 745ff group velocity 101, 110

Galileian principle of relativity 31 half-life period 104


game theory 802 Hamilton-Jacobi equation 82ff
gas constant 377 for a free particle 723
gas dynamics 444 (see also Part V) Hamiltonian equation (see canonical
gauge invariance 33 equation)
gauge theory 612, 690 (see also Part V) hardening of material 151ff, 348ff
Gauss harmonic oscillator 27, 127
basic equations in mechanics of 48 in quantum mechanics 79, 122, 126
map 689 Hausdorff dimension (see Part V)
principle of least constraint of 48 Hausdorff measure 521
Gaussian curvature 611,637, 643, 656, heat 370, 379
659,682ff,686,689 capacity 378
generic finiteness of the solution set capacity, specific 376ff, 424
521, 834 conduction 423ff
genericity 570, 573, 579 conductivity number 424
genus 686 Heisenberg's uncertainty relation 123ff
geodesic coordinates, local 784 Hencky material, nonlinear 202, 256
geodesic curvature 685 Henon attractor 522
geodesics 646ff, 653 Hertzsprung-Russel diagram 777
affine 650 Hilbert's variational problem in general
in classical mechanics 684 relativity 734, 784
in general relativity 731 Hölder inequality (see Part II)
geometrization of mechanics 21 Hölder spaces I230
geometry holonomic constraints 48
elliptic 656ff, 659 homotopy methods 817ff
Euclidean 656, 659 basic ideas of 818
hyperbolic 656ff, 659 Hooke's law, linear 148, 199, 201
non-Euclidean 656ff, 659 Hooke's law, nonlinear 233
Riemannian 634, 651, 730ff Hopf bifurcation 840, 862ff, 873, 876
Gibbs' fundamental equation 374, 382 and abstract parabolic equations
Gibbs' phase rule 391 879
gradient method 255 degenerate 878
grand unification theory 135, 748 global 873
(see also Part V) H-space (Hilbert space) I784ff (see also
gravitational Part II)
acceleration 26, 37 Hubble constant 57, 743
collapse of stars 790 Hubble's law 57, 743
constant 36ff, 884 hydrodynamics 433ff
force 35 hydrogen atom 118ff, 121, 137
force, first-order approximation hydrostatics 445
27, 36 hyperbolic geometry 656ff

hyperelasticity 190ff irrotational flow 438


general strategy of 196 isotropic tensor functions 203ff
hysteresis 148ff, 348ff

jet 567
ideal coordinates 566
fluid (see inviscid fluid) joule 883
gas 377, 394,403ff
plastic material 149
immersion 551 k-determined map 574, 576, 602
incompressible fluid 437 k-equivalent maps 567
index Kepler's laws 22, 39, 41
of a Fredholm operator 552 Kerr-Newman solution 779
picture of tensors 623 kinetic energy 19, 32
principle for tensors 616, 623, 625, Kolmogorov's laws in turbulence 513ff
629, 679ff Korn's inequality 248, 279ff
principle of mathematical physics Korteweg-de Vries equation 468ff
625,681 (see also Part II)
principle, inverse 681 Kruskal solution 767
inequality K-theory 594 (see also Part V)
of Chebyshev 115, 398 Kutta-Jukovski formula 474
of Fan 801
of Garding 214 (see also Part II)
of Hölder (see Part II) Lagrange
of Korn 248, 279ff brackets 85
of Poincare-Friedrichs (see Part II) equation 21, 70
quasi-variational 807 function 21, 71
variational 296ff, 303ff manifold 87
inertia tensor 53 Lagrange multiplier
inertial and chemical potential 398
charts 713 and temperature 398
force 30 in mechanics 67ff
system 28, 30, 699ff, 702ff, 782 rule in variational inequalities 306
infinitesimal motion 12 Lame constants 199
infinitesimal rotation 12 dual 256
infinitesimally small rigid motion 292, Laplace operator on manifolds 673
344 law of
inflationary universe 749 equipartition 62, 378, 417ff
information III294, III307 Hagen-Poisseuille 439
inner friction 436ff Kepler 22, 39, 41
integrability conditions 343, 640, 643, Kolmogorov 51311'
654,667,669 mass action 392
interactions in nature, four fundamental thermodynamics, first 366, 369ff,
134ff 379ff
international system of units 883 thermodynamics, second 363, 365ff,
invariants 619 369ff, 380ff
inviscid fluid 439 thermodynamics, third 385
inviscid (ideal) relativistic fluid 726ff thermodynamics, zero-th 380
irreversible 381 Walras 806

Lax pair 470 Ljapunov


Lebesgue spaces (see Part II) bifurcation 867
left invariant vector field 678 center theorem 868
Legendre transformation stability 20, 841
and conjugate functionals 65ff local coordinates (see representatives)
basic ideas of 66 locally convex space 797
in elasticity 238, 286ff locally flat 654, 684
in mechanics 74 longitudinal waves 101
in the calculus of variations 284ff Lorentz group 712
in thermodynamics 385 Lorentz transformation
Legendre-Hadamard condition 218 general 711
lemma of proper 711
Knaster-Kuratowski-Mazurkiewicz special 706
798 Lorenz attractor 523
Morse 560, 603 loss of stability and bifurcation 221ff,
Morse-Tromba 605 227, 856ff, 860ff
Ricci 642, 654, 679 lower semicontinuous I456
Sperner 797 multivalued mapping I450
Sperner, cubic 815 lowering of indices 680
length contraction in relativity 708
length preserving maps 644
lepton number 416 magnetic monopoles 750
Leray-Schauder principle, constructive manifolds 535
823 affine connected 649
lever principle 14 basic strategy of the theory of 535
Lie algebra 676ff center, stable, and unstable (see
Lie derivative 673ff Part V)
motivation for the 674 in general relativity 730
Lie group 677 in special relativity 713
lifetime of metrizable 537
black holes 780 one-dimensional 538, 585, 817ff
elementary particles 130 orientation of 582
lifting of indices 680 principles for constructing 555ff,
light 107 563ff
cone 715 Riemannian 651
dualism between wave and particle with boundary 584
107ff with countable basis 536
quantum hypothesis 102 mapping degree 824ff (see also Part I)
trap 773 and differential topology 824ff
lightlike events 715 and Sperner simplices 814
linearization principle for for maps on manifolds 670
differential equations 842 maps (see operator properties)
flows 842 area preserving 644
maps 550ff conformal 645
list of length preserving 644
important principles I866, 956 orientation preserving 582
symbols 933 mass 26,28,33, 37
the most important definitions 946 balance 434
theorems 943 Chandrasekhar 412

mass (continued) regular 551


in general relativity 731, 736 strict III 193
in special relativity 721 strictly stable 193
of black holes 773, 779 strongly stable 218
of neutron stars 778 Minkowski space-time manifold 696
of white dwarf stars, maximal 412, modern mathematical physics 752,
778 952,955
mass-energy equivalence 719ff momentum 28, 32
mathematical economics 806 angular 32
mathematics and physics 1ff balance 34, 434
matrix mechanics in quantum theory generalized 73
78ff of a photon 102, 722
maximal monotone operator (see Part II) monotone operator (see Part II)
maximum (see minimum) Mooney-Rivlin material 209
Maxwell's velocity distribution 406 Morse index 653
meager set I802 Morse lemma I110, I343
mean curvature 637 generalized 560, 603
and minimal surfaces 682 Morse-Tromba lemma 605
mean lifetime 104, 135, 137 multilinearization of maps 572
mean value multiplier 846, 851
in quantum mechanics 114 asymptotically stable 846
in statistical physics 398 critically 846
of velocity in turbulent flows 515 Floquet 850
mechanics 9ff of a fixed point 846
basic ideas of 14 of an equilibrium point 846
Gaussian 45 of a periodic solution 850
Hamiltonian 72 unstable 846
Lagrangian 70
Newtonian 25
Poissonian 77 Nash equilibrium point 802
quantum 112ff natural
mesons 131 basis 597, 617, 632, 649
metric 620, 651 boundary condition 192, 299
Eddington 774 coordinates 597
Friedman 738ff projection 542, 544, 545
Kerr-Newman 779 system of units 751
Kruskal 768ff Navier-Stokes equations 438, 479ff
Newton 735 basic ideas of the 480ff
Schwarzschild 756ff n-body problem 38, 42
space I761 negative retract principle 808, 810
tensor 620,634,651,731 neutron stars 415, 778
Michelson experiment 703 newton (unit of force) 884
minimal surfaces 682 Newton's basic equation 25ff, 32
minimax theorem 803 in general systems of reference 28ff
minimum nondegenerate singular point 561
bound III276 non-Euclidean geometry 655ff
free III193 nonresonance condition 861, 876, 880
local III 193 normal bundle 591

normal coordinates 96 PCT-invariance in nature 712


normal forms pendulum 92, 318, 874
and catastrophe theory 573ff, 578 Penrose transformation via twistors
for immersions 554, 559 789
for subimmersions 552, 558 Perihelion of Mercury, motion of the
for submersions 553, 556 758ff
of oscillating systems 96 period of oscillation 99
nowhere dense set I751 periodic system of the elements 127
nuclear forces 132 permanent waves 448ff
perpetuum mobile of the second kind
394
Ogden's material 209 perturbation
operator properties, important I889 of orbits 759ff
(see also the Indices to Parts II of simple eigenvalues 853ff
and III) of the spectrum 874ff
orbit 852 theory 853,874
orbitally theory, singular 519
asymptotically stable 852 phase 100
stable 852, 877, 881 space 73
unstable 853 transition (see Part V)
ordinary differential equations transition in the cosmos 748
bifurcation theory for 856 photon 108ff, 722
center, stable, and unstable manifolds equation of motion of a 731
for (see Part V) gas 59,408
on manifolds and flows 548 Piola identity 167, 175
stability theory for 841 Piola-Kirchhoff tensor
orientation 582ff first 165, 186, 189
coherent 585 second 186
of a manifold 585 Piola transformation 175
of a manifold with boundary 585 Planck's quantum of action 78, 102,
preserving map 582 108, 126, 366, 884
oscillating systems, normal form of 96 Planck's radiation law 58, 366, 408ff
and the big bang 58
plane wave 100
parallel axiom 655ff plastic torsion (see Part V)
parallel transport of tensors 626ff, plasticity
645ff basic ideas of 147ff, 154ff
basic ideas of the 627 condition of von Mises (yield
geometric interpretation of the 645 condition) 156, 201
in the sense of Levi-Civita 626ff, dynamical (plastic liquids) 156 (see
645ff, 650, 682 also Part V)
in the sense of Lie 675 historical remarks on 155ff
partial regularity of differential equations quasi-dynamical 154, 348ff
521 quasi-statical 154, 257ff
partition function in statistical physics statical 154, 262ff
400 plates 322ff
pascal (unit of pressure) 884 basic ideas for 322
Pauli principle 125ff, 413 with obstacles 339

Poincare (see also theorem of Poincare) of maximal signal velocity 716


group 712, 781 of minimal free energy 390
model 657 of minimal free enthalpy 391
transformation 711 of minimal potential energy 19, 51,
Poisson 244
brackets 17 of multilinearization for maps 560,
mechanics 77 572ff, 604ff
number 148 of relativity 695
polarization 101 of relativity, classical 31, 703
polyconvex material of Ball 209, 241, of stationary action 69ff
273ff stationary potential energy 190
positive-energy theorem in general turning point 821
relativity 791 principle of virtual power 16fT, 50
potential 33 basic ideas of the 17
energy 17, 33 in elastostatics 188
of a force 33 in hyperelasticity 192
of a velocity 454 principle of virtual work 18, 345
operator III229 and the principle of virtual power 18
power 32, 884 principles, list of important 956
Prandtl number 508 probability of presence for particles
Prandtl's boundary layer equation 520 114
pressure 394, 435, 884 production of
in the weak sense 487 entropy 435
pricing system 806 mass 423
principal axes of momentum 435
inertia 53 projection-iteration method 254
strain 169 projection operator I766
stress 178 propagation of action 81
principal moments of inertia 53 propagation velocity 100
principle proper map I173ff, 551
bifurcation 821, 837, 841 proper time
door-in/door-out 813, 820 in general relativity 732
for constructing diffeomorphisms in special relativity 718
560 pseudomonotone operators
for constructing manifolds 555ff, definition of (see Part II)
563ff in elasticity 322ff
negative retract 808, 810 in hydrodynamics 481ff
of Brouwer 797 pseudosphere 660
of causality 716 pseudotensor 628
of constant enthalpy 391 density 630
of constant velocity of light 700 dual 629
of least constraint 45ff pull-back 670
of linearization for dynamical system pulsars 778
842
of linearization for maps 550
of maximal dual energy 245, 259ff, quantization 98ff
262 general approach to (see Part V)
of maximal entropy 388 of classical mechanics in the sense of
maximal plastic work 345 Heisenberg 78

of classical mechanics in the sense of representatives (local coordinates)


Schrödinger 112 of a cotangent vector 545
of the phase space 126 of a map 537
quantum of a point of a manifold 534
cosmology 750 of a point of the tangent bundle 543
of action 78, 102, 108, 126, 366, of a tangent vector 539
884 residual set 570, 829
quantum statistics (see also Part V) retract 150
basic ideas of 401 reversible process 381
Bose statistics 402, 408 Reynolds number 440, 479, 485, 495,
Fermi statistics 402, 408 520
quasars 778 Ricci tensor 731
quasi-classical approximations Riemannian manifold 651
in general relativity 735 proper 651
in quantum mechanics 129 pseudo- 651
quasi-concave I456 rigid body 52ff, 54
quasi-convex I456 Ritz method 252
quasi-eigenvector 869 rod equation 315ff
quasi-equilibrium states 372ff, 374, rubberlike material of Ogden 208
379ff Rutherford's scattering formula 139
quasi-statical plastic material 257
quasi-variational inequality 807
Saint Venant-Kirchhoff material 208
scalar curvature 731
radiation problem of Carleman 422ff scales
Rankine-Hugoniot jump condition equilibrium of a pair of 14ff
in gas dynamics 444 (see also motion of a pair of 19ff
Part V) stability of a pair of 14ff
Rayleigh number 508 scattering of particles 106, 138ff
reaction probability 107 Schauder estimates 460
reaction velocity 389 Schrödinger equation 112ff
red limit of the cosmos 778 Schwarzschild solution 756
red shift section
and the expansion of the universe of a bundle 544
57, 742 of a vector bundle 590
in gravitational fields 766 sectorial operator (see Part II)
in the spectrum of galaxies 57, 742 semiflow 547
region I758 semigroup (see Part II)
regular sets, fundamental properties of I893
minimum 51 shell theory 325
point 552 shift operator 847ff
solution curve 819 similarity in mathematical physics
space I596, 537 439, 484
state space 380 simple
value 552 contravariant tensor 619
relatively compact set I756 covariant tensor 619
relativity principle simplicial algorithms 812
of Galilei 31, 703 simplicial methods 794ff
of Einstein 32, 695 basic ideas of 798

singular stable equilibrium


homology groups 688 (see also dynamically 20, 842
Part V) statically 16, 50
perturbation problems 519 star, death of a 777ff
point 552 star models
point, nondegenerate 561 classical 412
value 552 relativistic 790
sinking of a space ship 773 state 371
sinusoidal waves 466ff equation of 372, 377, 414
skew-symmetric (see antisymmetric) equilibrium 373, 387ff
slow deformation processes 349 quasi-equilibrium 372ff, 387ff
small oscillations 96 strictly stable 193
Sobolev spaces (see Part II) strongly stable 218, 244
solitary waves 466ff stationary flow 438
solitons 468ff statistical physics (see also Part V)
spacelike events 715 basic ideas of 396ff
specific Boltzmann statistics in 420
densities 442 Bose statistics in 402, 408
entropy 376 classical 417, 420
heat capacity 376 classification of states and 401
inner energy 376 Fermi statistics in 403, 408
spectrum I795 ideal gas in 403ff
does not surround the origin 849 photon gas in 408ff
Sperner simplex 798, 814 pure energy statistics in 402
spherical waves 103 quasi-classical 417
spin 125 statistical potential 387, 400
and statistics 125 statistical solution of the Navier-Stokes
splitting of spaces I766, 531, 550 equations 522
of maps 531 Stefan-Boltzmann law 409ff, 781
stability stored energy function 165, 190
basic ideas of 16, 20 general structure of the 207
dynamical (in the sense of Ljapunov) dual 246, 256, 294
20,842 strain tensor 165, 170
in elasticity 193, 217 geometrical meaning of the 171ff
of dynamical systems 841 linearized 170
of equilibrium points 841 strange attractor 522 (see also Part V)
of mechanical systems 16, 50 stream line 454
of orbits 852 stress force 165, 177, 181
of periodic solutions 851 stress tensor 165, 177, 181ff
of solutions of ordinary differential basic formulas for the 189
equations 841 for the inner friction 434, 436ff
of the planetary system 42 in hyperelasticity 190
of thermodynamical states 389 strict local minimum III193
orbital 852 strictly stable 193
statical 16, 50 strings 753ff
stability principle in mechanics 16, strongly
50ff continuous operator I498
basic idea of the 16 elliptic systems 213

k-determined map 574 contravariant 619


stable solution 193, 218, 244 covariant 618ff
structural stability 579 curvature (see curvature tensor)
subgradient III385 degree of a 619
subimmersion 551 density 630
submanifold 556 inertia (see inertia tensor)
with boundary 556 in three-dimensional space 13, 166
submersion 551 metric (see metric tensor)
sun, history of the 777 parallel transport of a (see parallel
supergravity 751 transport)
super-Lie groups 752 strain (see strain tensor)
supermanifolds 752 stress (see stress tensor)
superstring theory 752 symmetric 622
supersymmetry 751 torsion (see torsion tensor)
surface maps 644 trace of a 166
surface theory type of a 679
curvature properties in 636ff unit (see unit tensor)
metric properties in 631ff tensor calculus 615ff
symplectic geometry and mechanics 22 basic ideas of the 615
(see also Part V) on manifolds 648ff
tensor field
constant in the sense of Levi-Civita
tangent bundle 542 627
tangent map 543 constant in the sense of Lie 675
at a point 541 curl of a 630
tangent vector 538 divergence of a 166, 626, 629
abstract 539 theorem of (see also lemma of)
concrete 538 Adams 692
local coordinates of a 539 Birkhoff 786
transformation rule for a 539 Bolzano-Poincare-Miranda 808
Taylor problem 495ff Bonnet (main theorem of surface
Taylor vortices 495 theory) 640
temperature Brouwer 799
as a Lagrange multiplier 398 Chern 689
basic ideas for 363ff Crandall and Rabinowitz 858, 862,
compensation 388 879
Gibbs' fundamental equation and de Rham 688
374 Euler (polyeder theorem) 687
in statistical physics 398 Fan and Glicksberg 805
of radiation 58, 408 Gale, Nikaido, and Debreu 807
regular states and 380 Gauss (angular sum theorem) 686
zero-th law of thermodynamics and Gauss (integral theorem) 665
380 Gauss (theorema egregium) 643
tensor 618ff Gauss-Bonnet-Chern 685ff
antisymmetric (skew-symmetric) Hartman-Stampacchia 803
622 Hodge 672
construction of a 681 Hopf 862, 876
contraction of a 622 Kakutani 804

theorem of (continued) trace of a tensor 166


Ljapunov (center theorem) 868 transition functions of a vector bundle
Ljapunov (stability theorem) 843 590
Morse 691 transport theorem 441
Murat 267 transversal
Nash 661 linear subspaces 564
Poincare (differential forms) 669 maps 565
Poincare (duality theorem) 688 submanifolds 565
Poincare-Hopf 691 waves 101
Riemann 654 transversality 563ff
Rivlin-Ericksen 207 basic ideas of 563
Sard 587 in catastrophe theory 577
Sard (parametrized) 823 theorem, general 565
Sard-Smale 829ff theorem of Thom 570
Sard-Smale (parametrized) 832 Trefftz method 253
Stokes (integral theorem) 660, 670 turbulence 439ff, 479ff, 495
Thom (transversality theorem) 570 and Kolmogorov's laws 513ff
Whitney 588 and stochastic velocities 515
theorema egregium of Gauss 643 and strange attractors 522ff
theory of relativity basic ideas of 439
general 730ff classical Landau-Hopf theory of
special 694ff 524
thermodynamical turning point principle 821
equilibrium states 373, 387ff twin paradox 718
potential 385, 387 twistors 789
quasi-equilibrium states 372ff, 374, two-body problem 39
379ff type of a tensor 679
states 371ff, 379ff
thermodynamical process 371ff, 379
adiabatic 380 uncertainty
closed 380 of energy and time 130
irreversible 365, 381 of frequency and time 104
isothermal 387 of momentum and position 124
reversible 365, 381 of the angular-momentum
thermodynamics 363ff components 124
basic ideas of 363, 369 principle of Heisenberg 123
deviation from reversibility in 381 unfolding
three-body problem 42ff universal 575
time dilation in relativity 709 versal 575
timelike events 715 unit tensor 620
topological direct sum I766 units
topological vector space 797 elementary 751
topology and analysis 685ff international system of 883ff
torque 32 universal constants 750, 884
torsion tensor 650 universe at a temperature of 10^11 K 63
geometric meaning of the 683 unstable 842
total energy 34 upper semicontinuous I456
total momentum 32 multivalued map I450

vaporization of black holes 780 Walras equilibrium 806


variational inequalities Walras law 806
bifurcation problems for 303ff water waves 448ff, 466ff
in elasticity 296ff, 303ff watt (unit of power) 884
in mathematical economics 806 wave
vector cnoidal 466
axial 629 length 100
polar 629 longitudinal 101
vector bundle 589 matter 112
isomorphism 594 number 100
morphism 593 permanent 448ff
vector fields on manifolds 543 plane 100
velocity sinusoidal 466
complex 453 solitary 466ff
of light 884 spherical 103
potential 454 transversal 101
vector 11 vector 100
violation of parity in weak interaction weakly stable 860, 862, 866
701 white dwarf stars 412ff, 778
virtual white hole 768, 775
displacement 18 Whitney topology 569
motion 49 Wien's displacement law 418, 718
power 16,50 WKB method 129
singularities in general relativity 767 work
velocity 49 in mechanics 16,32
work 20,50 in thermodynamics 370, 379, 393
viscoplastic material 150 world line 717
viscous fluid 438 world sheet 753
viscous material 150
volt 884
von Karman's plate equations 326ff, yield condition in plasticity 200
334 Yukawa potential 132
