Variational Principles in Classical Mechanics 2e
SECOND EDITION
Douglas Cline
University of Rochester
24 November 2018
©2018, 2017 by Douglas Cline
Contributors
Author: Douglas Cline
Illustrator: Meghan Sarkis
Variational Principles in Classical Mechanics, 2nd edition by Douglas Cline is licensed under a Creative
Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0), except
where otherwise noted.
• Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes
were made. You must do so in any reasonable manner, but not in any way that suggests the licensor
endorses you or your use.
• NonCommercial — You may not use the material for commercial purposes.
• ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions
under the same license as the original.
• No additional restrictions — You may not apply legal terms or technological measures that legally
restrict others from doing anything the license permits.
The licensor cannot revoke these freedoms as long as you follow the license terms.
Version 2.0
Contents
Preface
Prologue
3 Linear oscillators
3.1 Introduction
3.2 Linear restoring forces
3.3 Linearity and superposition
3.4 Geometrical representations of dynamical motion
3.4.1 Configuration space $(q_i, q_j, t)$
3.4.2 State space, $(q_i, \dot{q}_i, t)$
3.4.3 Phase space, $(q_i, p_i, t)$
3.4.4 Plane pendulum
3.5 Linearly-damped free linear oscillator
3.5.1 General solution
3.5.2 Energy dissipation
3.6 Sinusoidally-driven, linearly-damped, linear oscillator
3.6.1 Transient response of a driven oscillator
3.6.2 Steady state response of a driven oscillator
3.6.3 Complete solution of the driven oscillator
3.6.4 Resonance
3.6.5 Energy absorption
3.7 Wave equation
3.8 Travelling and standing wave solutions of the wave equation
3.9 Waveform analysis
3.9.1 Harmonic decomposition
3.9.2 The free linearly-damped linear oscillator
3.9.3 Damped linear oscillator subject to an arbitrary periodic force
3.10 Signal processing
3.11 Wave propagation
3.11.1 Phase, group, and signal velocities of wave packets
3.11.2 Fourier transform of wave packets
3.11.3 Wave-packet Uncertainty Principle
3.12 Summary
Workshop exercises
Problems
19 Epilogue
Appendices
Bibliography
Index
Examples
13.5 Example: Rotation about the center of mass of a solid cube
13.6 Example: Rotation about the corner of the cube
13.7 Example: Euler angle transformation
13.8 Example: Rotation of a dumbbell
13.9 Example: Precession rate for torque-free rotating symmetric rigid rotor
13.10 Example: Tennis racquet dynamics
13.11 Example: Rotation of asymmetrically-deformed nuclei
13.12 Example: The Spinning “Jack”
13.13 Example: The Tippe Top
13.14 Example: Tipping stability of a rolling wheel
13.15 Example: Pivoting
13.16 Example: Rolling
13.17 Example: Forces on the bearings of a rotating circular disk
14.1 Example: The Grand Piano
14.2 Example: Two coupled linear oscillators
14.3 Example: Two equal masses series-coupled by two equal springs
14.4 Example: Two parallel-coupled plane pendula
14.5 Example: The series-coupled double plane pendula
14.6 Example: Three plane pendula; mean-field linear coupling
14.7 Example: Three plane pendula; nearest-neighbor coupling
14.8 Example: System of three bodies coupled by six springs
14.9 Example: Linear triatomic molecular CO$_2$
14.10 Example: Benzene ring
14.11 Example: Two linearly-damped coupled linear oscillators
14.12 Example: Collective motion in nuclei
15.1 Example: Check that a transformation is canonical
15.2 Example: Angular momentum
15.3 Example: Lorentz force in electromagnetism
15.4 Example: Wave motion
15.5 Example: Two-dimensional, anisotropic, linear oscillator
15.6 Example: The eccentricity vector
15.7 Example: The identity canonical transformation
15.8 Example: The point canonical transformation
15.9 Example: The exchange canonical transformation
15.10 Example: Infinitesimal point canonical transformation
15.11 Example: 1-D harmonic oscillator via a canonical transformation
15.12 Example: Free particle
15.13 Example: Point particle in a uniform gravitational field
15.14 Example: One-dimensional harmonic oscillator
15.15 Example: The central force problem
15.16 Example: Linearly-damped, one-dimensional, harmonic oscillator
15.17 Example: Adiabatic invariance for the simple pendulum
15.18 Example: Harmonic oscillator perturbation
15.19 Example: Lindblad resonance in planetary and galactic motion
16.1 Example: Acoustic waves in a gas
17.1 Example: Muon lifetime
17.2 Example: Relativistic Doppler Effect
17.3 Example: Twin paradox
17.4 Example: Rocket propulsion
17.5 Example: Lagrangian for a relativistic free particle
17.6 Example: Relativistic particle in an external electromagnetic field
17.7 Example: The Bohr-Sommerfeld hydrogen atom
A.1 Example: Eigenvalues and eigenvectors of a real symmetric matrix
A.2 Example: Degenerate eigenvalues of real symmetric matrix
D.1 Example: Rotation matrix
Preface
The goal of this book is to introduce the reader to the intellectual beauty, and philosophical implications,
of the fact that nature obeys variational principles plus Hamilton’s Action Principle which underlie the
Lagrangian and Hamiltonian analytical formulations of classical mechanics. These variational methods,
which were developed for classical mechanics during the 18th − 19th centuries, have become the preeminent
formalisms for classical dynamics, as well as for many other branches of modern science and engineering.
The ambitious goal of this book is to lead the reader from the intuitive Newtonian vectorial formulation, to
introduction of the more abstract variational principles that underlie Hamilton’s Principle and the related
Lagrangian and Hamiltonian analytical formulations. This culminates in discussion of the contributions of
variational principles to classical mechanics and the development of relativistic and quantum mechanics.
The broad scope of this book attempts to unify the undergraduate physics curriculum by bridging the
chasm that divides the Newtonian vector-differential formulation, and the integral variational formulation of
classical mechanics, as well as the corresponding philosophical approaches adopted in classical and quantum
mechanics. This book introduces the powerful variational techniques in mathematics, and their application to
physics. Application of the concepts of the variational approach to classical mechanics is ideal for illustrating
the power and beauty of applying variational principles.
The development of this textbook was influenced by three textbooks: The Variational Principles of
Mechanics by Cornelius Lanczos (1949) [La49], Classical Mechanics (1950) by Herbert Goldstein[Go50],
and Classical Dynamics of Particles and Systems (1965) by Jerry B. Marion[Ma65]. Marion’s excellent
textbook was unusual in partially bridging the chasm between the outstanding graduate texts by Goldstein
and Lanczos, and a bevy of introductory texts based on Newtonian mechanics that were available at that
time. The present textbook was developed to provide a more modern presentation of the techniques and
philosophical implications of the variational approaches to classical mechanics, with a breadth and depth
close to that provided by Goldstein and Lanczos, but in a format that better matches the needs of the
undergraduate student. An additional goal is to bridge the gap between classical and modern physics in the
undergraduate curriculum. The underlying philosophical approach adopted by this book was espoused by
Galileo Galilei: “You cannot teach a man anything; you can only help him find it within himself.”
This book was written in support of the physics junior/senior undergraduate course P235W entitled
“Variational Principles in Classical Mechanics” that the author taught at the University of Rochester between
1993 and 2015. Initially the lecture notes were distributed to students to allow pre-lecture study, facilitate
accurate transmission of the complicated formulae, and minimize note taking during lectures. These lecture
notes evolved into the present textbook. The target audience of this course typically comprised ≈ 70% ju-
nior/senior undergraduates, ≈ 25% sophomores, ≤ 5% graduate students, and the occasional well-prepared
freshman. The target audience was physics and astrophysics majors, but the course attracted a significant
fraction of majors from other disciplines such as mathematics, chemistry, optics, engineering, music, and the
humanities. As a consequence, the book includes appreciable introductory level physics, plus mathematical
review material, to accommodate the diverse range of prior preparation of the students. This textbook
includes material that extends beyond what reasonably can be covered during a one-term course. This sup-
plemental material is presented to show the importance and broad applicability of variational concepts to
classical mechanics. The book includes 164 worked examples to illustrate the concepts presented. Advanced
group-theoretic concepts are minimized to better accommodate the mathematical skills of the typical under-
graduate physics major. To conform with modern literature in this field, this book follows the widely-adopted
nomenclature used in “Classical Mechanics” by Goldstein[Go50], with recent additions by Johns[Jo05].
The second edition of this book has revised the presentation and includes recent developments in the
field. The book is broken into four major sections, the first of which presents a brief historical introduction
(chapter 1), followed by a review of the Newtonian formulation of mechanics plus gravitation (chapter
2), linear oscillators and wave motion (chapter 3), and an introduction to non-linear dynamics and chaos
(chapter 4). The second section introduces the variational principles of analytical mechanics that underlie
this book. It includes an introduction to the calculus of variations (chapter 5), the Lagrangian formulation of
mechanics with applications to holonomic and non-holonomic systems (chapter 6), and a discussion of symmetries,
invariance, plus Noether’s theorem (chapter 7). This book presents an introduction to the Hamiltonian, the
Hamiltonian formulation of mechanics, the Routhian reduction technique, and a discussion of the subtleties
involved in applying variational principles to variable-mass problems (chapter 8). The second edition of
this book presents a unified introduction to Hamilton’s Principle, introduces a new approach for applying
Hamilton’s Principle to systems subject to initial boundary conditions, and discusses how best to exploit the
hierarchy of related formulations based on action, Lagrangian/Hamiltonian, and equations of motion, when
solving problems subject to symmetries (chapter 9). A consolidated introduction to the application of the
variational approach to nonconservative systems is presented (chapter 10). The third section of the book
applies Lagrangian and Hamiltonian formulations of classical dynamics to central force problems (chapter 11),
motion in non-inertial frames (chapter 12), rigid-body rotation (chapter 13), and coupled linear oscillators
(chapter 14). The fourth section of the book introduces advanced applications of Hamilton’s Action Principle,
Lagrangian mechanics and Hamiltonian mechanics. These include Poisson brackets, Liouville’s theorem,
canonical transformations, Hamilton-Jacobi theory, the action-angle technique (chapter 15), and classical
mechanics in the continua (chapter 16). This is followed by a brief review of the revolution in classical
mechanics introduced by Einstein’s theory of relativistic mechanics. The extended theory of Lagrangian and
Hamiltonian mechanics is used to apply variational techniques to the Special Theory of Relativity, followed
by a discussion of the use of variational principles in the development of the General Theory of Relativity
(chapter 17). The book finishes with a brief review of the role of variational principles in bridging the gap
between classical mechanics and quantum mechanics (chapter 18). These advanced topics extend beyond
the typical syllabus for an undergraduate classical mechanics course. They are included to stimulate student
interest in physics by giving them a glimpse of the physics at the summit that they have already struggled
to climb. This glimpse illustrates the breadth of classical mechanics, and the pivotal role that variational
principles have played in the development of classical, relativistic, quantal, and statistical mechanics.
The front cover picture of this book shows a sailplane soaring high above the Italian Alps. This picture
epitomizes the unlimited horizon of opportunities provided when the full dynamic range of variational principles
is applied to classical mechanics. The adjacent pictures of the galaxy, and the skier, represent the wide
dynamic range of applicable topics that span from the origin of the universe, to everyday life. These cover
pictures reflect the beauty and unity of the foundation provided by variational principles to the development
of classical mechanics.
Information regarding the associated P235 undergraduate course at the University of Rochester is available
on the web site at http://www.pas.rochester.edu/~cline/P235/index.shtml. Information about the
author is available at the Cline home web site: http://www.pas.rochester.edu/~cline/index.html.
The author thanks Meghan Sarkis who prepared many of the illustrations, Joe Easterly who designed the
book cover plus the webpage, and Moriana Garcia who organized publication. Andrew Sifain developed the
diagnostic workshop questions. The author appreciates the permission, granted by Professor Struckmeier, to
quote his published article on the extended Hamilton-Lagrangian formalism. The author acknowledges the
feedback and suggestions made by many students who have taken this course, as well as helpful suggestions
by his colleagues: Andrew Abrams, Adam Hayes, Connie Jones, Andrew Melchionna, David Munson, Alice
Quillen, Richard Sarkis, James Schneeloch, Steven Torrisi, Dan Watson, and Frank Wolfs. These lecture
notes were typed in LaTeX using Scientific WorkPlace (MacKichan Software, Inc.), while Adobe Illustrator,
Photoshop, Origin, Mathematica, and MuPAD were used to prepare the illustrations.
Douglas Cline,
University of Rochester, 2018
Prologue
Two dramatically different philosophical approaches to science were developed in the field of classical mechanics
during the 17th - 18th centuries. This time period coincided with the Age of Enlightenment in Europe
during which remarkable intellectual and philosophical developments occurred. This was a time when both
philosophical and causal arguments were equally acceptable in science, in contrast with current convention
where there appears to be tacit agreement to discourage use of philosophical arguments in science.
Newtonian mechanics: Momentum and force are vectors that underlie the Newtonian formulation of
classical mechanics. Newton’s monumental treatise, entitled “Philosophiae Naturalis Principia Mathemat-
ica”, published in 1687, established his three universal laws of motion, the universal theory of gravitation,
the derivation of Kepler’s three laws of planetary motion, and the development of calculus. Newton’s three
universal laws of motion provide the most intuitive approach to classical mechanics in that they are based on
vector quantities like momentum, and the rate of change of momentum, which are related to force. Newton’s
equation of motion
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \qquad \text{(Newton’s equation of motion)}$$
is a vector differential relation between the instantaneous forces and rate of change of momentum, or equiva-
lent instantaneous acceleration, all of which are vector quantities. Momentum and force are easy to visualize,
and both cause and effect are embedded in Newtonian mechanics. Thus, if all of the forces, including the
constraint forces, acting on the system are known, then the motion is solvable for two-body systems. The
mathematics for handling Newton’s “vectorial mechanics” approach to classical mechanics is well established.
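As a minimal sketch of how known forces determine the motion through Newton’s equation, the following Python fragment steps the momentum and position forward in time for a one-dimensional mass on a spring. The spring force, the parameter values, the step size, and the function name integrate_newton are all illustrative assumptions and do not come from the text.

```python
import math


def integrate_newton(force, m, x0, p0, dt, n_steps):
    """Step F = dp/dt forward in time with the semi-implicit Euler method.

    force : function returning the force at position x (assumed known)
    m     : mass; x0, p0 : initial position and momentum
    """
    x, p = x0, p0
    for _ in range(n_steps):
        p += force(x) * dt   # dp = F dt  (Newton's equation of motion)
        x += (p / m) * dt    # dx = (p / m) dt
    return x, p


# Illustrative system (an assumption): linear restoring force F = -k x.
m, k = 1.0, 4.0
omega = math.sqrt(k / m)
t_final = 1.0
n_steps = 100_000
dt = t_final / n_steps

x_num, p_num = integrate_newton(lambda x: -k * x, m, x0=1.0, p0=0.0,
                                dt=dt, n_steps=n_steps)

# Known analytic solution for these initial conditions, used as a check.
x_exact = math.cos(omega * t_final)
p_exact = -m * omega * math.sin(omega * t_final)

print("position error:", abs(x_num - x_exact))
print("momentum error:", abs(p_num - p_exact))
```

The force enters only through the instantaneous update of the momentum, which is the cause-and-effect structure of the vectorial approach described above.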
Analytical mechanics: Variational principles apply to many aspects of our daily life. Typical examples
include selecting the optimum compromise in quality and cost when shopping, selecting the fastest route
to travel from home to work, or selecting the optimum compromise to satisfy the disparate desires of the
individuals comprising a family. Variational principles underlie the analytical formulation of mechanics. It
is astonishing that the laws of nature are consistent with variational principles involving the principle of
least action. Minimizing the action integral led to the development of the mathematical field of variational
calculus, plus the analytical variational approaches to classical mechanics, by Euler, Lagrange, Hamilton,
and Jacobi.
Leibniz, who was a contemporary of Newton, introduced methods based on a quantity called “vis viva”,
which is Latin for “living force” and equals twice the kinetic energy. Leibniz believed in the philosophy
that God created a perfect world where nature would be thrifty in all its manifestations. In 1707, Leibniz
proposed that the optimum path is based on minimizing the time integral of the vis viva, which is equiva-
lent to the action integral of Lagrangian/Hamiltonian mechanics. In 1744 Euler derived the Leibniz result
using variational concepts while Maupertuis restated the Leibniz result based on teleological arguments.
The development of Lagrangian mechanics culminated in the 1788 publication of Lagrange’s monumental
treatise entitled “Mécanique Analytique”. Lagrange used d’Alembert’s Principle to derive Lagrangian me-
chanics providing a powerful analytical approach to determine the magnitude and direction of the optimum
trajectories, plus the associated forces.
The culmination of the development of analytical mechanics occurred in 1834 when Hamilton proposed
his Principle of Least Action, as well as developing Hamiltonian mechanics which is the premier variational
approach in science. Hamilton defined the action to be the time integral of the Lagrangian.
Hamilton’s Action Principle (1834) minimizes the action integral defined by
$$S = \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt \qquad \text{(Hamilton’s Principle)}$$
In the simplest form, the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ equals the difference between the kinetic energy $T$ and the
potential energy $U$. Hamilton’s Least Action Principle underlies Lagrangian mechanics. This Lagrangian is
a function of $n$ generalized coordinates $q_i$ plus their corresponding velocities $\dot{q}_i$. Hamilton also developed
the premier variational approach, called Hamiltonian mechanics, that is based on the Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$,
which is a function of the $n$ fundamental position $q_i$ plus the conjugate momentum $p_i$ variables. In 1843
Jacobi provided the mathematical framework required to fully exploit the power of Hamiltonian mechanics.
Note that the Lagrangian, Hamiltonian, and the action integral, all are scalar quantities which simplifies
derivation of the equations of motion compared with the vector calculus used by Newtonian mechanics.
Figure 2 presents a philosophical road map illustrating the hierarchy of philosophical approaches based on
Hamilton’s Action Principle that are available for deriving the equations of motion of a system. The primary
Stage 1 uses Hamilton’s Action functional, $S = \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt$, to derive the Lagrangian and Hamiltonian
functionals, which provide the most fundamental and sophisticated level of understanding. Stage 1 involves
specifying all the active degrees of freedom, as well as the interactions involved. Stage 2 uses the Lagrangian
or Hamiltonian functionals, derived at Stage 1, in order to derive the equations of motion for the system of
interest. Stage 3 then uses these derived equations of motion to solve for the motion of the system subject to
a given set of initial boundary conditions. Note that Lagrange first derived Lagrangian mechanics based on
d’Alembert’s Principle, while Newton’s Laws of Motion specify the equations of motion used in Newtonian
mechanics.
Figure 2: Philosophical road map of the hierarchy of stages involved in analytical mechanics. Hamilton’s
Action Principle is the foundation of analytical mechanics. Stage 1 uses Hamilton’s Principle to derive the
Lagrangian and Hamiltonian. Stage 2 uses either the Lagrangian or Hamiltonian to derive the equations
of motion for the system. Stage 3 uses these equations of motion to solve for the actual motion using
the assumed initial conditions. The Lagrangian approach can be derived directly based on d’Alembert’s
Principle. Newtonian mechanics can be derived directly based on Newton’s Laws of Motion. The advantages
and power of Hamilton’s Action Principle are unavailable if the Laws of Motion are derived using either
d’Alembert’s Principle or Newton’s Laws of Motion.
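As a minimal worked illustration of the three stages described above, consider a one-dimensional mass on a linear spring. This system, and the symbols $m$, $k$, $\omega$, $q_0$, and $\dot{q}_0$, are assumptions chosen for illustration and are not taken from Figure 2.

```latex
\begin{align*}
\text{Stage 1:}\quad & S = \int_{t_i}^{t_f} L \, dt, \qquad
    L(q, \dot{q}) = T - U = \tfrac{1}{2} m \dot{q}^2 - \tfrac{1}{2} k q^2 \\
\text{Stage 2:}\quad & \frac{d}{dt}\frac{\partial L}{\partial \dot{q}}
    - \frac{\partial L}{\partial q} = 0
    \;\Longrightarrow\; m \ddot{q} + k q = 0 \\
\text{Stage 3:}\quad & q(t) = q_0 \cos \omega t + \frac{\dot{q}_0}{\omega} \sin \omega t,
    \qquad \omega = \sqrt{k/m}
\end{align*}
```

Here Stage 2 applies the Euler-Lagrange equation that follows from requiring the action to be stationary, and Stage 3 imposes the initial position $q_0$ and initial velocity $\dot{q}_0$ at $t = 0$.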
The analytical approach to classical mechanics appeared contradictory to Newton’s intuitive vector-
ial treatment of force and momentum. There is a dramatic difference in philosophy between the vector-
differential equations of motion derived by Newtonian mechanics, which relate the instantaneous force to
the corresponding instantaneous acceleration, and analytical mechanics, where minimizing the scalar action
integral involves integrals over space and time between specified initial and final states. Analytical mechanics
uses variational principles to determine the optimum trajectory, from a continuum of tentative possibilities,
by requiring that the optimum trajectory minimizes the action integral between specified initial and final
conditions.
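The selection of the optimum trajectory from a continuum of tentative possibilities can be sketched numerically. The Python fragment below, a rough illustration rather than a general method, evaluates a discretized action for a particle in uniform gravity along the true parabolic path and along neighbouring trial paths that share the same end points. The choice of system, the sinusoidal form of the trial deviation, the grid size, and the names action and trial_path are all assumptions made for this example.

```python
import math


def action(path, dt, m=1.0, g=9.81):
    """Discretized action: sum of (T - U) * dt over each time segment."""
    s = 0.0
    for y0, y1 in zip(path, path[1:]):
        v = (y1 - y0) / dt                     # segment velocity
        kinetic = 0.5 * m * v * v              # T = m v^2 / 2
        potential = m * g * 0.5 * (y0 + y1)    # U = m g y at the segment midpoint
        s += (kinetic - potential) * dt
    return s


def trial_path(eps, T=1.0, n=1000, g=9.81):
    """True path y = g t (T - t) / 2 plus a deviation eps * sin(pi t / T).

    The deviation vanishes at t = 0 and t = T, so every trial path
    satisfies the same specified initial and final positions.
    """
    dt = T / n
    ts = [i * dt for i in range(n + 1)]
    ys = [0.5 * g * t * (T - t) + eps * math.sin(math.pi * t / T) for t in ts]
    return ys, dt


actions = {}
for eps in (-0.2, -0.1, 0.0, 0.1, 0.2):
    ys, dt = trial_path(eps)
    actions[eps] = action(ys, dt)
    print(f"eps = {eps:+.1f}   S = {actions[eps]:.6f}")
```

For this system the action is smallest for the undeviated path (eps = 0) and grows quadratically with the size of the deviation, which is the sense in which the true trajectory minimizes the action between fixed end points.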
Initially there was considerable prejudice and philosophical opposition to use of the variational principles
approach which is based on the assumption that nature follows the principles of economy. The variational
approach is not intuitive, and thus it was considered to be speculative and “metaphysical”, but it was
tolerated as an efficient tool for exploiting classical mechanics. This opposition to the variational principles
underlying analytical mechanics delayed full appreciation of the variational approach until the start of the
20th century. As a consequence, the intuitive Newtonian formulation reigned supreme in classical mechanics
for over two centuries, even though the remarkable problem-solving capabilities of analytical mechanics were
recognized and exploited following the development of analytical mechanics by Lagrange.
The full significance and superiority of the analytical variational formulations of classical mechanics
became well recognized and accepted following the development of the Special Theory of Relativity in 1905.
The Theory of Relativity requires that the laws of nature be invariant to the reference frame. This is not
satisfied by the Newtonian formulation of mechanics which assumes one absolute frame of reference and a
separation of space and time. In contrast, the Lagrangian and Hamiltonian formulations of the principle of
least action remain valid in the Theory of Relativity, if the Lagrangian is written in a relativistically-invariant
form in space-time. The complete invariance of the variational approach to coordinate frames is precisely
the formalism necessary for handling relativistic mechanics.
Hamiltonian mechanics, which is expressed in terms of the conjugate variables $(\mathbf{q}, \mathbf{p})$, relates classical
mechanics directly to the underlying physics of quantum mechanics and quantum field theory. As a conse-
quence, the philosophical opposition to exploiting variational principles no longer exists, and Hamiltonian
mechanics has become the preeminent formulation of modern physics. The reader is free to draw their own
conclusions regarding the philosophical question “is the principle of economy a fundamental law of classical
mechanics, or is it a fortuitous consequence of the fundamental laws of nature?”
From the late seventeenth century, until the dawn of modern physics at the start of the twentieth cen-
tury, classical mechanics remained a primary driving force in the development of physics. Classical mechanics
embraces an unusually broad range of topics spanning motion of macroscopic astronomical bodies to mi-
croscopic particles in nuclear and particle physics, at velocities ranging from zero to near the velocity of
light, from one-body to statistical many-body systems, as well as having extensions to quantum mechanics.
Introduction of the Special Theory of Relativity in 1905, and the General Theory of Relativity in 1916,
necessitated modifications to classical mechanics for relativistic velocities, and can be considered to be an
extended theory of classical mechanics. Since the 1920s, quantal physics has superseded classical mechanics
in the microscopic domain. Although quantum physics has played the leading role in the development of
physics during much of the past century, classical mechanics still is a vibrant field of physics that recently
has led to exciting developments associated with non-linear systems and chaos theory. This has spawned
new branches of physics and mathematics as well as changing our notion of causality.
Goals: The primary goal of this book is to introduce the reader to the powerful variational-principles
approaches that play such a pivotal role in classical mechanics and many other branches of modern science
and engineering. This book emphasizes the intellectual beauty of these remarkable developments, as well as
stressing the philosophical implications that have had a tremendous impact on modern science. A secondary
goal is to apply variational principles to solve advanced applications in classical mechanics in order to
introduce many sophisticated and powerful mathematical techniques that underlie much of modern physics.
This book starts with a review of Newtonian mechanics plus the solutions of the corresponding equations
of motion. This is followed by an introduction to Lagrangian mechanics, based on d’Alembert’s Principle,
in order to develop familiarity in applying variational principles to classical mechanics. This leads to intro-
duction of the more fundamental Hamilton’s Action Principle, plus Hamiltonian mechanics, to illustrate the
power provided by exploiting the full hierarchy of stages available for applying variational principles to clas-
sical mechanics. Finally the book illustrates how variational principles in classical mechanics were exploited
during the development of both relativistic mechanics and quantum physics. The connections and applications
of classical mechanics to modern physics are emphasized throughout the book in an effort to span the
chasm that divides the Newtonian vector-differential formulation, and the integral variational formulation, of
classical mechanics. This chasm is especially applicable to quantum mechanics which is based completely on
variational principles. Note that variational principles, developed in the field of classical mechanics, now are
used in a diverse and wide range of fields outside of physics, including economics, meteorology, engineering,
and computing.
This study of classical mechanics involves climbing a vast mountain of knowledge, and the pathway to the
top leads to elegant and beautiful theories that underlie much of modern physics. This book exploits varia-
tional principles applied to four major topics in classical mechanics to illustrate the power and importance of
variational principles in physics. Being so close to the summit provides the opportunity to take a few extra
steps beyond the normal introductory classical mechanics syllabus to glimpse the exciting physics found at
the summit. This new physics includes topics such as quantum, relativistic, and statistical mechanics.
Chapter 1
1.1 Introduction
This chapter reviews the historical evolution of classical mechanics since considerable insight can be gained
from study of the history of science. There are two dramatically different approaches used in classical
mechanics. The first is the vectorial approach of Newton which is based on vector quantities like momentum,
force, and acceleration. The second is the analytical approach of Lagrange, Euler, Hamilton, and Jacobi,
that is based on the concept of least action and variational calculus. The more intuitive Newtonian picture
reigned supreme in classical mechanics until the start of the twentieth century. Variational principles, which
were developed during the nineteenth century, never aroused much enthusiasm in scientific circles due to
philosophical objections to the underlying concepts; this approach was merely tolerated as an efficient tool
for exploiting classical mechanics. A dramatic advance in the philosophy of science occurred at the start of
the 20th century leading to widespread acceptance of the superiority of using variational principles.
2 CHAPTER 1. A BRIEF HISTORY OF CLASSICAL MECHANICS
of least time. Ptolemy (83 - 161 A.D.) wrote several scientific treatises that greatly influenced subsequent
philosophers. Unfortunately he adopted the incorrect geocentric solar system in contrast to the heliocentric
model of Aristarchus and others.
literature, philosophy, and art. Scientific development during the 17th century included the pivotal advances
made by Newton and Leibniz at the beginning of the revolutionary Age of Enlightenment, culminating in the
development of variational calculus and analytical mechanics by Euler and Lagrange. The scientific advances
of this age include publication of two monumental books Philosophiae Naturalis Principia Mathematica by
Newton in 1687 and Mécanique analytique by Lagrange in 1788. These are the definitive two books upon
which classical mechanics is built.
René Descartes (1596-1650) attempted to formulate the laws of motion in 1644. He talked about
conservation of motion (momentum) in a straight line but did not recognize the vector character of momen-
tum. Pierre de Fermat (1601-1665) and René Descartes were two leading mathematicians in the first
half of the 17th century. Independently they discovered the principles of analytic geometry and developed
some initial concepts of calculus. Fermat and Blaise Pascal (1623-1662) were the founders of the theory
of probability.
Isaac Newton (1642-1727) made pioneering contributions to physics and mathematics as well as
being a theologian. At 18 he was admitted to Trinity College Cambridge where he read the writings of
modern philosophers like Descartes, and astronomers like Copernicus, Galileo, and Kepler. By 1665 he had
discovered the generalized binomial theorem, and began developing infinitesimal calculus. Due to a plague,
the university closed for two years in 1665 during which Newton worked at home developing the theory
of calculus that built upon the earlier work of Barrow and Descartes. He was elected Lucasian Professor
of Mathematics in 1669 at the age of 26. From 1670 Newton focussed on optics leading to his Hypothesis
of Light published in 1675 and his book Opticks in 1704. Newton described light as being made up of a
flow of extremely subtle corpuscles that also had associated wavelike properties to explain diffraction and
optical interference that he studied. Newton returned to mechanics in 1677 by studying planetary motion
and gravitation that applied the calculus he had developed. In 1687 he published his monumental treatise
entitled Philosophiae Naturalis Principia Mathematica which established his three universal laws of motion,
the universal theory of gravitation, derivation of Kepler’s three laws of planetary motion, and was his first
publication of the development of calculus which he called “the science of fluxions”. Newton’s laws of motion
are based on the concepts of force and momentum, that is, force equals the rate of change of momentum.
Newton’s postulate of an invisible force able to act over vast distances led him to be criticized for introducing
“occult agencies” into science. In a remarkable achievement, Newton completely solved the laws of mechanics.
His theory of classical mechanics and of gravitation reigned supreme until the development of the Theory
of Relativity in 1905. The followers of Newton envisioned the Newtonian laws to be absolute and universal.
This dogmatic reverence of Newtonian mechanics prevented physicists from an unprejudiced appreciation of
the analytic variational approach to mechanics developed during the 17th through 19th centuries. Newton
was the first scientist to be knighted and was appointed president of the Royal Society.
Gottfried Leibniz (1646-1716) was a brilliant German philosopher, a contemporary of Newton, who
worked on both calculus and mechanics. Leibniz started development of calculus in 1675, ten years after
Newton, but Leibniz published his work in 1684, which was three years before Newton’s Principia. Leibniz
made significant contributions to integral calculus and developed the notation currently used in calculus.
He introduced the name calculus based on the Latin word for the small stone used for counting. Newton
and Leibniz were involved in a protracted argument over who originated calculus. It appears that Leibniz
saw drafts of Newton’s work on calculus during a visit to England. Throughout their argument Newton
was the ghost writer of most of the articles in support of himself and he had them published under the noms
de plume of his friends. Leibniz made the tactical error of appealing to the Royal Society to intercede on
his behalf. Newton, as president of the Royal Society, appointed his friends to an “impartial” committee to
investigate this issue, then he wrote the committee’s report that accused Leibniz of plagiarism of Newton’s
work on calculus, after which he had it published by the Royal Society. Still unsatisfied he then wrote an
anonymous review of the report in the Royal Society’s own periodical. This bitter dispute lasted until the
death of Leibniz. When Leibniz died his work was largely discredited. The fact that he falsely claimed to be
a nobleman and added the prefix “von” to his name, coupled with Newton’s vitriolic attacks, did not help
his credibility. Newton is reported to have declared that he took great satisfaction in “breaking Leibniz’s
heart.” Studies during the 20th century have largely revived the reputation of Leibniz and he is recognized
to have made major contributions to the development of calculus.
Figure 1.1: Chronological roadmap of the parallel development of the Newtonian and Variational-principles
approaches to classical mechanics.
1.5. VARIATIONAL METHODS IN PHYSICS 5
where the inertial reaction force ṗ is subtracted from the corresponding force F. This extension of the
principle of virtual work applies equally to both statics and dynamics leading to a single variational principle.
Joseph Louis Lagrange (1736-1813) was an Italian mathematician and a student of Leonhard Euler.
In 1788 Lagrange published his monumental treatise on analytical mechanics entitled Mécanique Analytique
1 Teleology is any philosophical account that holds that final causes exist in nature, meaning that – analogous to purposes
which introduces his Lagrangian mechanics analytical technique which is based on d’Alembert’s Principle of
Virtual Work. Lagrangian mechanics is a remarkably powerful technique that is equivalent to minimizing
the action integral defined as
$$S = \int_{t_1}^{t_2} L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt$$
The Lagrangian $L$ frequently is defined to be the difference between the kinetic energy $T$ and potential
energy $U$. His theory only required the analytical form of these scalar quantities. In the preface of his
book he refers modestly to his extraordinary achievements with the statement “The reader will find no
figures in the work. The methods which I set forth do not require either constructions or geometrical or
mechanical reasonings: but only algebraic operations, subject to a regular and uniform rule of procedure.”
Lagrange also introduced the concept of undetermined multipliers to handle auxiliary conditions which
plays a vital part in theoretical mechanics. William Hamilton, an outstanding figure in the analytical
formulation of classical mechanics, called Lagrange the “Shakespeare of mathematics,” on account of the
extraordinary beauty, elegance, and depth of the Lagrangian methods. Lagrange also pioneered numerous
significant contributions to mathematics. For example, Euler, Lagrange, and d’Alembert developed much of
the mathematics of partial differential equations. Lagrange survived the French Revolution, and, in spite of
being a foreigner, Napoleon named Lagrange to the Legion of Honour and made him a Count of the Empire
in 1808. Lagrange was honoured by being buried in the Pantheon.
Carl Friedrich Gauss (1777-1855) was a German child prodigy who made many significant contri-
butions to mathematics, astronomy and physics. He did not work directly on the variational approach, but
Gauss’s law, the divergence theorem, and the Gaussian statistical distribution are important examples of
concepts that he developed and which feature prominently in classical mechanics as well as other branches
of physics, and mathematics.
Siméon Poisson (1781-1840) was a brilliant mathematician who was a student of Lagrange. He
developed the Poisson statistical distribution as well as the Poisson equation that features prominently in
electromagnetic and other field theories. His major contribution to classical mechanics is development, in
1809, of the Poisson bracket formalism which featured prominently in development of Hamiltonian mechanics
and quantum mechanics.
The zenith in development of the variational approach to classical mechanics occurred during the 19th
century primarily due to the work of Hamilton and Jacobi.
William Hamilton (1805-1865) was a brilliant Irish physicist, astronomer and mathematician who was
appointed professor of astronomy at Dublin when he was barely 22 years old. He developed the Hamiltonian
mechanics formalism of classical mechanics which now plays a pivotal role in modern classical and quantum
mechanics. He opened an entirely new world beyond the developments of Lagrange. Whereas the Lagrange
equations of motion are complicated second-order differential equations, Hamilton succeeded in transforming
them into a set of first-order differential equations with twice as many variables that consider momenta and
their conjugate positions as independent variables. The differential equations of Hamilton are linear, have
separated derivatives, and represent the simplest and most desirable form possible for differential equations to
be used in a variational approach. Hence the name “canonical variables” given by Jacobi. Hamilton exploited
the d’Alembert principle to give the first exact formulation of the principle of least action which underlies the
variational principles used in analytical mechanics. The form derived by Euler and Lagrange employed the
principle in a way that applies only for conservative (scleronomic) cases. A significant discovery of Hamilton
is his realization that classical mechanics and geometrical optics can be handled from one unified viewpoint.
In both cases he uses a “characteristic” function that has the property that, by mere differentiation, the
path of the body, or light ray, can be determined by the same partial differential equations. This solution is
equivalent to the solution of the equations of motion.
Carl Gustav Jacob Jacobi (1804-1851), a Prussian mathematician and contemporary of Hamilton,
made significant developments in Hamiltonian mechanics. He immediately recognized the extraordinary im-
portance of the Hamiltonian formulation of mechanics. Jacobi developed canonical transformation theory
and showed that the function, used by Hamilton, is only one special case of functions that generate suit-
able canonical transformations. He proved that any complete solution of the partial differential equation,
without the specific boundary conditions applied by Hamilton, is sufficient for the complete integration of
the equations of motion. This greatly extends the usefulness of Hamilton’s partial differential equations.
In 1843 Jacobi developed both the Poisson brackets, and the Hamilton-Jacobi, formulations of Hamiltonian
mechanics. The latter gives a single, first-order partial differential equation for the action function in terms
of the generalized coordinates which greatly simplifies solution of the equations of motion. He also de-
rived a principle of least action for time-independent cases that had been studied by Euler and Lagrange.
Jacobi developed a superior approach to the variational integral that, by eliminating time from the integral,
determined the path without saying anything about how the motion occurs in time.
James Clerk Maxwell (1831-1879) was a Scottish theoretical physicist and mathematician. His most
prominent achievement was formulating a classical electromagnetic theory that united previously unrelated
observations, plus equations of electricity, magnetism and optics, into one consistent theory. Maxwell’s
equations demonstrated that electricity, magnetism and light are all manifestations of the same phenomenon,
namely the electromagnetic field. Consequently, all other classic laws and equations of electromagnetism
were simplified cases of Maxwell’s equations. Maxwell’s achievements concerning electromagnetism have
been called the “second great unification in physics”. Maxwell demonstrated that electric and magnetic
fields travel through space in the form of waves, and at a constant speed of light. In 1864 Maxwell wrote “A
Dynamical Theory of the Electromagnetic Field” which proposed that light was in fact undulations in the
same medium that is the cause of electric and magnetic phenomena. His work in producing a unified model
of electromagnetism is one of the greatest advances in physics. Maxwell, in collaboration with Ludwig
Boltzmann (1844-1906), also helped develop the Maxwell-Boltzmann distribution, which is a statistical
means of describing aspects of the kinetic theory of gases. These two discoveries helped usher in the era of
modern physics, laying the foundation for such fields as special relativity and quantum mechanics. Boltzmann
founded the field of statistical mechanics and was an early staunch advocate of the existence of atoms and
molecules.
Henri Poincaré (1854-1912) was a French theoretical physicist and mathematician. He was the first to
present the Lorentz transformations in their modern symmetric form and discovered the remaining relativistic
velocity transformations. Although there is similarity to Einstein’s Special Theory of Relativity, Poincaré and
Lorentz still believed in the concept of the ether and did not fully comprehend the revolutionary philosophical
change implied by Einstein. Poincaré worked on the solution of the three-body problem in planetary motion
and was the first to discover a chaotic deterministic system which laid the foundations of modern chaos
theory. This discovery undermined the long-held deterministic view that if the positions and velocities of all the particles are
known at one time, then it is possible to predict the future for all time.
The last two decades of the 19th century saw the culmination of classical physics and several important
discoveries that led to a revolution in science that toppled classical physics from its throne. The end of the
19th century was a time during which tremendous technological progress occurred; flight, the automobile,
and turbine-powered ships were developed, Niagara Falls was harnessed for power, etc. During this period,
Heinrich Hertz (1857-1894) produced electromagnetic waves confirming their derivation using Maxwell’s
equations. Simultaneously he discovered the photoelectric effect which was crucial evidence in support of
quantum physics. Technical developments, such as photography, the induction spark coil, and the vacuum
pump played a significant role in scientific discoveries made during the 1890’s. At the end of the 19th century,
scientists thought that the basic laws were understood and worried that future physics would be in the fifth
decimal place; some scientists worried that little was left for them to discover. However, there remained a
few, presumed minor, unexplained discrepancies plus new discoveries that led to the revolution in science
that occurred at the beginning of the 20th century.
formalisms of mechanics, plus the principle of least action, remain intact using a relativistically invariant
Lagrangian. The independence of the variational approach to reference frames is precisely the formalism
necessary for relativistic mechanics. The basic field equations also must remain invariant to the choice of
coordinate frame for the General Theory of Relativity, which also can be derived in terms of a relativistic
action principle. Thus the development of the Theory of Relativity unambiguously demonstrated the
superiority of the variational formulation of classical mechanics over the vectorial Newtonian formulation,
and thus the considerable effort made by Euler, Lagrange, Hamilton, Jacobi, and others in developing the
analytical variational formalism of classical mechanics finally came to fruition at the start of the 20th century.
Newton’s two crowning achievements, the Laws of Motion and the Laws of Gravitation, that had reigned
supreme since published in the Principia in 1687, were toppled from the throne by Einstein.
Emmy Noether (1882-1935) has been described as “the greatest ever woman mathematician”. In
1915 she proposed a theorem that a conservation law is associated with any differentiable symmetry of a
physical system. Noether’s theorem evolves naturally from Lagrangian and Hamiltonian mechanics and
she applied it to the four-dimensional world of general relativity. Noether’s theorem has had an important
impact in guiding the development of modern physics.
Other profound developments that had revolutionary impacts on classical mechanics were quantum
physics and quantum field theory. The 1913 model of atomic structure by Niels Bohr (1885-1962) and
the subsequent enhancements by Arnold Sommerfeld (1868-1951), were based completely on classical
Hamiltonian mechanics. The proposal of wave-particle duality by Louis de Broglie (1892-1987), made
in his 1924 thesis, was the catalyst leading to the development of quantum mechanics. In 1925 Werner
Heisenberg (1901-1976), and Max Born (1882-1970) developed a matrix representation of quantum
mechanics using non-commuting conjugate position and momenta variables.
Paul Dirac (1902-1984) showed in his Ph.D. thesis that Heisenberg’s matrix representation of quantum
physics is based on the Poisson Bracket generalization of Hamiltonian mechanics, which, in contrast to
Hamilton’s canonical equations, allows for non-commuting conjugate variables. In 1926 Erwin Schrödinger
(1887-1961) independently introduced the operational viewpoint and reinterpreted the partial differential
equation of Hamilton-Jacobi as a wave equation. His starting point was the optical-mechanical analogy of
Hamilton that is a built-in feature of the Hamilton-Jacobi theory. Schrödinger then showed that the wave
mechanics he developed, and the Heisenberg matrix mechanics, are equivalent representations of quantum
mechanics. In 1928 Dirac developed his relativistic equation of motion for the electron and pioneered the
field of quantum electrodynamics. Dirac also introduced the Lagrangian and the principle of least action to
quantum mechanics, and these ideas were developed into the path-integral formulation of quantum mechanics
and the theory of electrodynamics by Richard Feynman (1918-1988).
The concepts of wave-particle duality, and quantization of observables, both are beyond the classical
notions of infinite subdivisions in classical physics. In spite of the radical departure of quantum mechanics
from earlier classical concepts, the basic feature of the differential equations of quantal physics is their self-
adjoint character which means that they are derivable from a variational principle. Thus both the Theory of
Relativity, and quantum physics are consistent with the variational principle of mechanics, and inconsistent
with Newtonian mechanics. As a consequence Newtonian mechanics has been dislodged from the throne
it occupied since 1687, and the intellectually beautiful and powerful variational principles of analytical
mechanics have been validated.
The 2015 observation of gravitational waves is a remarkable recent confirmation of Einstein’s General
Theory of Relativity and the validity of the underlying variational principles in physics. Another advance in
physics is the understanding of the evolution of chaos in non-linear systems that has been achieved during the
past four decades. This advance is due to the availability of computers which has reopened this interesting
branch of classical mechanics, that was pioneered by Henri Poincaré about a century ago. Although classical
mechanics is the oldest and most mature branch of physics, there still remain new research opportunities in
this field of physics.
The focus of this book is to introduce the general principles of the mathematical variational principle
approach, and its applications to classical mechanics. It will be shown that the variational principles, that
were developed in classical mechanics, now play a crucial role in modern physics and mathematics, plus
many other fields of science and technology.
References:
Excellent sources of information regarding the history of major players in the field of classical mechanics
can be found on Wikipedia and the book “The Variational Principles of Mechanics” by Lanczos.[La49]
Chapter 2
2.1 Introduction
It is assumed that the reader has been introduced to Newtonian mechanics applied to one or two point objects.
This chapter reviews Newtonian mechanics for motion of many-body systems as well as for macroscopic
sized bodies. Newton’s Law of Gravitation also is reviewed. The purpose of this review is to ensure that the
reader has a solid foundation of elementary Newtonian mechanics upon which to build the powerful analytic
Lagrangian and Hamiltonian approaches to classical dynamics.
Newtonian mechanics is based on application of Newton’s Laws of motion which assume that the concepts
of distance, time, and mass, are absolute, that is, motion is in an inertial frame. The Newtonian idea of
the complete separation of space and time, and the concept of the absoluteness of time, are violated by the
Theory of Relativity as discussed in chapter 17. However, for most practical applications, relativistic effects
are negligible and Newtonian mechanics is an adequate description at low velocities. Therefore chapters
2-16 will assume velocities for which Newton’s laws of motion are applicable.
10 CHAPTER 2. REVIEW OF NEWTONIAN MECHANICS
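If the forces acting on two bodies are their mutual action and reaction, then equation 2.4 simplifies to
$$\mathbf{F}_{12} + \mathbf{F}_{21} = \frac{d\mathbf{p}_1}{dt} + \frac{d\mathbf{p}_2}{dt} = \frac{d}{dt}(\mathbf{p}_1 + \mathbf{p}_2) = 0 \tag{2.5}$$
This implies that the total linear momentum ($\mathbf{P} = \mathbf{p}_1 + \mathbf{p}_2$) is a constant of motion.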
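The conservation of total linear momentum under an action-reaction pair can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes two bodies on a line coupled by a hypothetical linear spring, and the masses, spring constant, and time step are illustrative values only.

```python
# Two bodies interacting only through an action-reaction pair.
# The linear spring and all numerical values are illustrative assumptions.

def simulate(m1=1.0, m2=3.0, k=2.0, dt=1e-3, steps=5000):
    x1, x2 = 0.0, 1.5        # initial positions
    v1, v2 = 0.4, -0.1       # initial velocities
    rest = 1.0               # natural length of the spring
    momenta = []
    for _ in range(steps):
        f12 = k * ((x2 - x1) - rest)   # force on body 1 due to body 2
        f21 = -f12                     # Newton's third law: F21 = -F12
        v1 += f12 / m1 * dt            # update velocities from the forces
        v2 += f21 / m2 * dt
        x1 += v1 * dt                  # then update positions
        x2 += v2 * dt
        momenta.append(m1 * v1 + m2 * v2)   # total momentum P = p1 + p2
    return momenta

momenta = simulate()
p_initial = 1.0 * 0.4 + 3.0 * (-0.1)
max_drift = max(abs(p - p_initial) for p in momenta)
print(max_drift)   # expected to be at the level of floating-point rounding
```

The individual momenta oscillate as the spring stretches and compresses, but the impulses on the two bodies cancel at every step, so only rounding error accumulates in the total.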
Combining equations 2.1 and 2.2 leads to a second-order differential equation
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = m\frac{d^2\mathbf{r}}{dt^2} = m\ddot{\mathbf{r}} \tag{2.6}$$
Note that the force on a body $\mathbf{F}$, and the resultant acceleration $\mathbf{a} = \ddot{\mathbf{r}}$ are colinear. Appendix 2 gives
explicit expressions for the acceleration $\mathbf{a}$ in cartesian and curvilinear coordinate systems. The definition of
force depends on the definition of the mass $m$. Newton’s laws of motion are obeyed to a high precision for
velocities much less than the velocity of light. For example, recent experiments have shown they are obeyed
with an error in the acceleration of $\Delta a \leq 5 \times 10^{-14}$ m/s$^2$.
If the work done on the particle is positive, then the final kinetic energy $T_2 > T_1$. Especially noteworthy is that
the kinetic energy $T$ is a scalar quantity which makes it simple to use. This first-order spatial integral is the
foundation of the analytic formulation of mechanics that underlies Lagrangian and Hamiltonian mechanics.
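The average location of the system corresponds to the location of the center of mass since $\frac{1}{M}\sum_i m_i \mathbf{r}'_i = 0$,
that is
$$\frac{1}{M}\sum_i m_i \mathbf{r}_i = \mathbf{R} + \frac{1}{M}\sum_i m_i \mathbf{r}'_i = \mathbf{R} \tag{2.23}$$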
The vector R which describes the location of the center of mass, depends on the origin and coordinate
system chosen. For a continuous mass distribution the location vector of the center of mass is given by
$$\mathbf{R} = \frac{1}{M}\sum_i m_i \mathbf{r}_i = \frac{1}{M}\int \mathbf{r}\, dm \tag{2.24}$$
The center of mass can be evaluated by calculating the individual components along three orthogonal axes.
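As an illustration of evaluating the center of mass component by component, here is a short sketch for a set of point masses. The function name `center_of_mass` and the sample masses and positions are assumptions made for this example. It also checks that the mass-weighted positions relative to the center of mass sum to zero.

```python
def center_of_mass(masses, positions):
    """Return total mass M and center-of-mass vector R for point masses."""
    M = sum(masses)
    # Evaluate each Cartesian component of R = (1/M) * sum_i m_i r_i separately.
    R = tuple(
        sum(m * r[k] for m, r in zip(masses, positions)) / M
        for k in range(3)
    )
    return M, R

# Illustrative values (not from the text).
masses = [2.0, 1.0, 3.0]
positions = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 2.0, 4.0)]

M, R = center_of_mass(masses, positions)

# Positions relative to the center of mass, r'_i = r_i - R.
relative = [tuple(r[k] - R[k] for k in range(3)) for r in positions]

# The mass-weighted sum of the relative positions should vanish.
residual = tuple(
    sum(m * rp[k] for m, rp in zip(masses, relative)) for k in range(3)
)
print(M, R, residual)
```

For these values the center of mass lies at (0.5, 1.0, 2.0), and the residual vanishes in each component.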
The center-of-mass frame of reference is defined as the frame for which the center of mass is stationary.
This frame of reference is especially valuable for elucidating the underlying physics which involves only the
relative motion of the many bodies. That is, the trivial translational motion of the center of mass frame,
which has no influence on the relative motion of the bodies, is factored out and can be ignored. For example,
a tennis ball (0.06 kg) approaching the earth ($6 \times 10^{24}$ kg) with velocity $v$ could be treated in three frames,
(a) assume the earth is stationary, (b) assume the tennis ball is stationary, or (c) the center-of-mass frame.
The latter frame ignores the center of mass motion which has no influence on the relative motion of the
tennis ball and the earth. The center of linear momentum and center of mass coordinate frames are identical
in Newtonian mechanics but not in relativistic mechanics as described in chapter 17.4.3.
It is convenient to describe a many-body system by a position vector $\mathbf{r}'_i$ with respect to the center of mass.
$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.26}$$
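That is,
$$\mathbf{P} = \sum_i \mathbf{p}_i = \frac{d}{dt}\sum_i m_i \mathbf{r}_i = \frac{d}{dt} M\mathbf{R} + \frac{d}{dt}\sum_i m_i \mathbf{r}'_i = \frac{d}{dt} M\mathbf{R} + 0 = M\dot{\mathbf{R}} \tag{2.27}$$
since $\sum_i m_i \mathbf{r}'_i = 0$ as given by the definition of the center of mass. That is,
$$\mathbf{P} = M\dot{\mathbf{R}} \tag{2.28}$$
Thus the total linear momentum for a system is the same as the momentum of a single particle of mass
$M = \sum_i m_i$ located at the center of mass of the system.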
The origin of the external force is from outside of the system while the internal force is due to the mutual
interaction between the particles in the system. Newton’s Law tells us that
$$\dot{\mathbf{p}}_i = \mathbf{F}_i = \mathbf{F}_i^E + \sum_{j \neq i} \mathbf{f}_{ij} \tag{2.30}$$
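Substituting Newton’s third law $\mathbf{f}_{ij} = -\mathbf{f}_{ji}$ into equation 2.32 implies that
$$\sum_i \sum_{j \neq i} \mathbf{f}_{ij} = \sum_j \sum_{i \neq j} \mathbf{f}_{ji} = -\sum_i \sum_{j \neq i} \mathbf{f}_{ij} = 0 \tag{2.33}$$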
2.8. TOTAL LINEAR MOMENTUM OF A MANY-BODY SYSTEM 15
which is satisfied only for the case where the summations equal zero. That is, for every internal force, there
is an equal and opposite reaction force that cancels that internal force.
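The cancellation of the internal forces can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the internal forces are attractive inverse-square forces between four point masses; the helper `pair_force` and all numerical values are hypothetical.

```python
def pair_force(ri, rj, mi, mj, G=1.0):
    """Force on particle i due to particle j (illustrative inverse-square attraction)."""
    d = [rj[k] - ri[k] for k in range(3)]      # vector from i to j
    dist = sum(c * c for c in d) ** 0.5
    return [G * mi * mj * c / dist**3 for c in d]

# Illustrative masses and positions (not from the text).
masses = [1.0, 2.0, 3.0, 4.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 3.0)]
n = len(masses)

# Double sum over all internal forces f_ij with j != i.
net_internal = [0.0, 0.0, 0.0]
for i in range(n):
    for j in range(n):
        if j != i:
            f_ij = pair_force(positions[i], positions[j], masses[i], masses[j])
            for k in range(3):
                net_internal[k] += f_ij[k]

# One action-reaction pair, for comparison.
f_01 = pair_force(positions[0], positions[1], masses[0], masses[1])
f_10 = pair_force(positions[1], positions[0], masses[1], masses[0])
print(net_internal, f_01, f_10)
```

Each pair contributes equal and opposite terms to the double sum, so the net internal force vanishes up to rounding regardless of the particular force law, provided it obeys Newton’s third law.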
Therefore the first-order integral for linear momentum can be written in differential and integral forms
as
$$\dot{\mathbf{P}} = \sum_i \mathbf{F}_i^E \qquad \int_1^2 \sum_i \mathbf{F}_i^E \, dt = \mathbf{P}_2 - \mathbf{P}_1 \tag{2.34}$$
The reaction of a body to an external force is equivalent to a single particle of mass $M$ located at the center
of mass assuming that the internal forces cancel due to Newton’s third law.
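Note that the total linear momentum $\mathbf{P}$ is conserved if the net external force $\mathbf{F}^E$ is zero, that is
$$\mathbf{F}^E = \frac{d\mathbf{P}}{dt} = 0 \tag{2.35}$$
Therefore the total linear momentum $\mathbf{P}$ of the center of mass is a constant. Moreover, if the component of the force along any
direction $\hat{\mathbf{e}}$ is zero, that is,
$$\mathbf{F}^E \cdot \hat{\mathbf{e}} = \frac{d(\mathbf{P} \cdot \hat{\mathbf{e}})}{dt} = 0 \tag{2.36}$$
then $\mathbf{P} \cdot \hat{\mathbf{e}}$ is a constant. This fact is used frequently to solve problems involving motion in a constant force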
field. For example, in the earth’s gravitational field, the momentum of an object moving in vacuum in the
vertical direction is time dependent because of the gravitational force, whereas the horizontal component of
momentum is constant if no forces act in the horizontal direction.
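The projectile example can be sketched numerically as follows. This is a minimal illustration under stated assumptions: a 0.06 kg ball in vacuum, a uniform gravitational acceleration of 9.81 m/s², and arbitrary initial velocity components. The function name `projectile_momentum` is hypothetical.

```python
def projectile_momentum(m=0.06, vx=10.0, vy=5.0, g=9.81, dt=1e-3, steps=1000):
    """Track the momentum components of a projectile in a uniform field."""
    px, py = m * vx, m * vy
    history = [(px, py)]
    for _ in range(steps):
        Fx, Fy = 0.0, -m * g      # no horizontal force; gravity acts vertically
        px += Fx * dt             # dp/dt = F, component by component
        py += Fy * dt
        history.append((px, py))
    return history

history = projectile_momentum()
px0, py0 = history[0]
px1, py1 = history[-1]
print(px0, px1)   # horizontal component is unchanged
print(py0, py1)   # vertical component decreases by m*g*t
```

The horizontal component stays at its initial value throughout, while the vertical component changes by the impulse of the gravitational force over the elapsed time.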
$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.37}$$
The total angular momentum separates into two terms, the angular momentum about the center of mass,
plus the angular momentum of the center of mass about the origin of the axis system. This factoring of the
angular momentum only applies for the center of mass. This is called Samuel König’s first theorem.
Consider that the resultant force acting on particle $i$ in this $n$-particle system can be separated into an
external force $\mathbf{F}_i^E$ plus internal forces $\mathbf{f}_{ij}$ between the particles of the system
$$\mathbf{F}_i = \mathbf{F}_i^E + \sum_{j \neq i} \mathbf{f}_{ij} \tag{2.43}$$
The origin of the external force is from outside of the system while the internal force is due to the interaction
with the other $n - 1$ particles in the system. Newton’s Law tells us that
$$\dot{\mathbf{p}}_i = \mathbf{F}_i = \mathbf{F}_i^E + \sum_{j \neq i} \mathbf{f}_{ij} \tag{2.44}$$
2.9. ANGULAR MOMENTUM OF A MANY-BODY SYSTEM 17
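Note that $(\mathbf{r}_i - \mathbf{r}_j)$ is the vector $\mathbf{r}_{ij}$ connecting $j$ to $i$. For central forces the force vector $\mathbf{f}_{ij} = f_{ij}\hat{\mathbf{r}}_{ij}$,
thus
$$\sum_i \sum_{j<i} (\mathbf{r}_i - \mathbf{r}_j) \times \mathbf{f}_{ij} = \sum_i \sum_{j<i} \mathbf{r}_{ij} \times f_{ij}\hat{\mathbf{r}}_{ij} = 0 \tag{2.47}$$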
That is, for central internal forces the total internal torque on a system of particles is zero, and the rate of
change of total angular momentum for central internal forces becomes
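$$\dot{\mathbf{L}} = \sum_i \mathbf{r}_i \times \mathbf{F}_i^E = \sum_i \mathbf{N}_i^E = \mathbf{N}^E \tag{2.48}$$
where $\mathbf{N}^E$ is the net external torque acting on the system. Equation 2.48 leads to the differential and integral
forms of the first integral relating the total angular momentum to total external torque.
$$\dot{\mathbf{L}} = \mathbf{N}^E \qquad \int_1^2 \mathbf{N}^E \, dt = \mathbf{L}_2 - \mathbf{L}_1 \tag{2.49}$$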
Angular momentum conservation occurs in many problems involving zero external torques $\mathbf{N}^E = 0$, plus
two-body central forces $\mathbf{F} = f(r)\hat{\mathbf{r}}$ since the torque on the particle about the center of the force is zero.
Examples are the central gravitational force for stellar or planetary systems in astrophysics, and the central
electrostatic force manifest for motion of electrons in the atom. In addition, the component of angular
momentum about any axis $\mathbf{L} \cdot \hat{\mathbf{e}}$ is conserved if the net external torque about that axis $\mathbf{N}^E \cdot \hat{\mathbf{e}} = 0$.
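Conservation of angular momentum for a central force can be illustrated with a short orbit integration. The sketch below assumes an attractive inverse-square force in a plane with unit force constant, and uses a simple kick-then-drift update. These choices, and the initial conditions, are illustrative and not taken from the text.

```python
def orbit_angular_momentum(steps=20000, dt=1e-3, m=1.0, k=1.0):
    """Integrate planar motion under F = -k r_hat / r^2 and record L_z."""
    x, y = 1.0, 0.0          # initial position
    vx, vy = 0.0, 0.8        # initial velocity (bound orbit for these values)
    L_values = []
    for _ in range(steps):
        r3 = (x * x + y * y) ** 1.5
        ax = -k * x / (m * r3)      # acceleration is always parallel to r,
        ay = -k * y / (m * r3)      # so it exerts no torque about the origin
        vx += ax * dt               # kick: update velocity
        vy += ay * dt
        x += vx * dt                # drift: update position
        y += vy * dt
        L_values.append(m * (x * vy - y * vx))   # z-component of r x p
    return L_values

L_values = orbit_angular_momentum()
L_initial = 1.0 * (1.0 * 0.8 - 0.0 * 0.0)
max_deviation = max(abs(L - L_initial) for L in L_values)
print(max_deviation)   # expected to be at the level of floating-point rounding
```

With this update scheme both the kick (a velocity change parallel to the position vector) and the drift (a position change parallel to the velocity) leave the cross product unchanged, so the angular momentum is preserved to rounding error even though the energy is only approximately conserved.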
$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.51}$$
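The location of the center of mass is uniquely defined as being at the location where $\int \mathbf{r}'\, dm = 0$. The
velocity of particle $i$ can be expressed in terms of the velocity of the center of mass $\dot{\mathbf{R}}$ plus the velocity
of the particle with respect to the center of mass $\dot{\mathbf{r}}'_i$. That is,
$$\mathbf{v}_i = \dot{\mathbf{R}} + \dot{\mathbf{r}}'_i$$
so that the total kinetic energy is
$$T = \sum_i \frac{1}{2} m_i \mathbf{v}_i \cdot \mathbf{v}_i = \sum_i \frac{1}{2} m_i \dot{\mathbf{r}}'_i \cdot \dot{\mathbf{r}}'_i + \sum_i m_i \dot{\mathbf{r}}'_i \cdot \dot{\mathbf{R}} + \sum_i \frac{1}{2} m_i \dot{\mathbf{R}} \cdot \dot{\mathbf{R}}$$
For the special case of the center of mass, the middle term is zero since, by definition of the center of mass,
$\sum_i m_i \dot{\mathbf{r}}'_i = 0$. Therefore, writing $V = |\dot{\mathbf{R}}|$ and $v'_i = |\dot{\mathbf{r}}'_i|$,
$$T = \sum_i \frac{1}{2} m_i v_i'^2 + \frac{1}{2} M V^2 \tag{2.54}$$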
Thus the total kinetic energy of the system is equal to the sum of the kinetic energy of a mass $M$ moving
with the center of mass velocity plus the kinetic energy of motion of the individual particles relative to the
center of mass. This is called Samuel König's second theorem.
Note that for a fixed center-of-mass energy, the total kinetic energy $T$ has a minimum value of $\sum_i \frac{1}{2} m_i v_i'^2$
when the velocity of the center of mass $V = 0$. For a given internal excitation energy, the minimum energy
required to accelerate colliding bodies occurs when the colliding bodies have identical, but opposite, linear
momenta. That is, when the center-of-mass velocity $V = 0$.
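The split of the kinetic energy in equation 2.54 can be checked numerically. The following is a minimal sketch, assuming a handful of point masses with randomly chosen masses and velocities; the function name `kinetic_energy_split` and the test values are illustrative choices, not part of the text.

```python
import random


def kinetic_energy_split(masses, velocities):
    """Return (T_total, T_cm, T_rel) for point masses with 3D velocities."""
    total_mass = sum(masses)
    # Center-of-mass velocity V = sum(m_i v_i) / M
    V = [
        sum(m * v[k] for m, v in zip(masses, velocities)) / total_mass
        for k in range(3)
    ]
    # Total kinetic energy in the chosen frame
    T_total = sum(
        0.5 * m * sum(c * c for c in v) for m, v in zip(masses, velocities)
    )
    # Kinetic energy of the total mass moving with the center of mass
    T_cm = 0.5 * total_mass * sum(c * c for c in V)
    # Kinetic energy of motion relative to the center of mass, v'_i = v_i - V
    T_rel = sum(
        0.5 * m * sum((v[k] - V[k]) ** 2 for k in range(3))
        for m, v in zip(masses, velocities)
    )
    return T_total, T_cm, T_rel


random.seed(0)  # illustrative, reproducible test values
masses = [random.uniform(0.5, 2.0) for _ in range(5)]
velocities = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(5)]
T_total, T_cm, T_rel = kinetic_energy_split(masses, velocities)
print(T_total, T_cm + T_rel)  # the two numbers should agree to rounding error
```

Because the cross term vanishes identically in the center-of-mass frame, the agreement holds for any choice of masses and velocities, not just these sample values.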
Applying Stokes theorem for a path-independent force leads to the alternate statement that the curl is zero.
See appendix 33
$$\nabla \times \mathbf{F} = 0 \tag{2.56}$$
Note that the vector product of two del operators $\nabla$ acting on a scalar field $U$ equals
$$\nabla \times \nabla U = 0 \tag{2.57}$$
Thus it is possible to express a path-independent force field as the gradient of a scalar field, $U$, that is
$$\mathbf{F} = -\nabla U \tag{2.58}$$
2.10. WORK AND KINETIC ENERGY FOR A MANY-BODY SYSTEM 19
$$E = T + U \tag{2.60}$$
Note that the potential energy is defined only to within an additive constant since the force $\mathbf{F} = -\nabla U$
depends only on differences in potential energy. Similarly, the kinetic energy is not absolute since any inertial
frame of reference can be used to describe the motion and the velocity of a particle depends on the relative
velocities of inertial frames. Thus the total mechanical energy $E = T + U$ is not absolute.
If a single particle is subject to several path-independent forces, such as gravity, linear restoring forces,
etc., then a potential energy $U_i$ can be ascribed to each of the $n$ forces where for each force $\mathbf{F}_i = -\nabla U_i$. In
contrast to the forces, which add vectorially, these scalar potential energies are additive, $U = \sum_i^n U_i$. Thus
the total mechanical energy for $n$ potential energies equals
$$E = T + U(\mathbf{r}) = T + \sum_i^n U_i(\mathbf{r}) \tag{2.61}$$
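The time derivative of the total mechanical energy is given using equations 2.63 and 2.64 in equation 2.62
$$\frac{dE}{dt} = \frac{dT}{dt} + \frac{dU}{dt} = \mathbf{F}\cdot\frac{d\mathbf{r}}{dt} + (\nabla U)\cdot\frac{d\mathbf{r}}{dt} + \frac{\partial U}{\partial t} = \left[\mathbf{F} + (\nabla U)\right]\cdot\frac{d\mathbf{r}}{dt} + \frac{\partial U}{\partial t} \tag{2.65}$$
Note that if the field is path independent, that is $\nabla \times \mathbf{F} = 0$, then the force and potential are related by
$$\mathbf{F} = -\nabla U \tag{2.66}$$
Therefore, for path independent forces, the first term in the time derivative of the total energy in equation
2.65 is zero. That is,
$$\frac{dE}{dt} = \frac{\partial U}{\partial t} \tag{2.67}$$
In addition, when the potential energy $U$ is not an explicit function of time, then $\frac{\partial U}{\partial t} = 0$ and thus the total
energy is conserved. That is, for the combination of (a) path independence plus (b) time independence, then
the total energy of a conservative field is conserved.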
20 CHAPTER 2. REVIEW OF NEWTONIAN MECHANICS
Note that there are cases where the concept of potential still is useful even when it is time dependent,
provided that path independence applies, i.e. $\mathbf{F} = -\nabla U$ at any instant. Examples are a Coulomb field problem
where charges are slowly changing due to leakage etc., or a peripheral collision between two charged
bodies such as nuclei.
The origin of the external force is from outside of the system while the internal force is due to the interaction
with the other $n-1$ particles in the system. Newton's Law tells us that
$$\dot{\mathbf{p}}_i = \mathbf{F}_i = \mathbf{F}_i^{E} + \sum_{j \neq i} \mathbf{f}_{ij} \tag{2.70}$$
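The work done on the system by a force moving from configuration $1 \rightarrow 2$ is given by
$$W_{1\rightarrow 2} = \sum_i \int_1^2 \mathbf{F}_i^{E}\cdot d\mathbf{r}_i + \sum_i \sum_{j \neq i} \int_1^2 \mathbf{f}_{ij}\cdot d\mathbf{r}_i \tag{2.71}$$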
Equating the two equivalent equations for $W_{1\rightarrow 2}$, that is 2.68 and 2.75, gives that
$$W_{1\rightarrow 2} = T_2 - T_1 = U^{ext}(1) - U^{ext}(2) + U^{int}(1) - U^{int}(2) \tag{2.78}$$
This shows that, for conservative forces, the total energy is conserved and is given by
The three first-order integrals for linear momentum, angular momentum, and energy provide powerful
approaches for solving the motion of Newtonian systems because of the applicability of the conservation laws
for linear and angular momentum, plus energy conservation for conservative forces. In addition,
the important concept of center-of-mass motion naturally separates out for these three first-order integrals.
Although these conservation laws were derived assuming Newton's Laws of motion, they are more generally
applicable and surpass the range of validity of Newton's Laws of motion.
For example, in 1930 Pauli and Fermi postulated the existence of the neutrino in order to account for the
apparent non-conservation of energy and momentum in $\beta$-decay, because they did not wish to relinquish the
concepts of energy and momentum conservation. The neutrino was first detected in 1956, confirming the
correctness of this hypothesis.
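$$\frac{dG}{dt} = \sum_i \mathbf{p}_i\cdot\dot{\mathbf{r}}_i + \sum_i \dot{\mathbf{p}}_i\cdot\mathbf{r}_i \tag{2.81}$$
However,
$$\sum_i \mathbf{p}_i\cdot\dot{\mathbf{r}}_i = \sum_i m_i\,\dot{\mathbf{r}}_i\cdot\dot{\mathbf{r}}_i = \sum_i m_i v_i^2 = 2T \tag{2.82}$$
Thus
$$\frac{dG}{dt} = 2T + \sum_i \mathbf{F}_i\cdot\mathbf{r}_i \tag{2.84}$$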
where the $\langle\,\rangle$ brackets refer to the time average. Note that if the motion is periodic and the chosen time $\tau$
equals a multiple of the period, then $\frac{G(\tau)-G(0)}{\tau} = 0$. Even if the motion is not periodic, if the constraints and
velocities of all the particles remain finite, then there is an upper bound to $G$. This implies that choosing
$\tau \rightarrow \infty$ means that $\frac{G(\tau)-G(0)}{\tau} \rightarrow 0$. In both cases the left-hand side of the equation tends to zero giving the
Virial theorem
$$\langle T \rangle = -\frac{1}{2}\left\langle \sum_i \mathbf{F}_i\cdot\mathbf{r}_i \right\rangle \tag{2.86}$$
2
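The right-hand side of this equation is called the Virial of the system. For a single particle subject to a
conservative central force $\mathbf{F} = -\nabla U$ the Virial theorem equals
$$\langle T \rangle = \frac{1}{2}\langle \nabla U\cdot\mathbf{r} \rangle = \frac{1}{2}\left\langle r\frac{\partial U}{\partial r} \right\rangle \tag{2.87}$$
If the potential is of the form $U = kr^{n+1}$, that is, $F = -k(n+1)r^{n}$, then $r\frac{\partial U}{\partial r} = (n+1)U$. Thus for a single
particle in a central potential $U = kr^{n+1}$ the Virial theorem reduces to
$$\langle T \rangle = \frac{n+1}{2}\langle U \rangle \tag{2.88}$$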
The following two special cases are of considerable importance in physics.
Hooke's Law: Note that for a linear restoring force $n = 1$ then
$$\langle T \rangle = +\langle U \rangle \qquad (n = 1)$$
You may be familiar with this fact for simple harmonic motion where the average kinetic and potential
energies are the same and both equal half of the total energy.
2.11. VIRIAL THEOREM 23
Inverse-square law: The other interesting case is for the inverse square law $n = -2$ where
$$\langle T \rangle = -\frac{1}{2}\langle U \rangle \qquad (n = -2)$$
The Virial theorem is useful for solving problems in that knowing the exponent $n$ of the field makes it
possible to write down directly the average total energy in the field. For example, for $n = -2$
$$\langle E \rangle = \langle T \rangle + \langle U \rangle = -\frac{1}{2}\langle U \rangle + \langle U \rangle = \frac{1}{2}\langle U \rangle \tag{2.89}$$
This occurs for the Bohr model of the hydrogen atom where the kinetic energy of the bound electron is half
of the magnitude of the potential energy. The same result occurs for planetary motion in the solar system.
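The Hooke's-law case of the Virial theorem can be illustrated by time-averaging over one period of simple harmonic motion. This is a minimal sketch, assuming the motion $x(t) = A\sin\omega t$ with illustrative values of mass, spring constant, and amplitude; the function name `time_averages_sho` is hypothetical.

```python
import math


def time_averages_sho(m=1.0, k=4.0, amplitude=0.3, steps=10000):
    """Time-average T and U over one period of x(t) = A sin(omega t)."""
    omega = math.sqrt(k / m)
    period = 2.0 * math.pi / omega
    dt = period / steps
    t_sum = 0.0
    u_sum = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt  # midpoint of each time slice
        x = amplitude * math.sin(omega * t)
        v = amplitude * omega * math.cos(omega * t)
        t_sum += 0.5 * m * v * v   # kinetic energy sample
        u_sum += 0.5 * k * x * x   # potential energy sample
    return t_sum / steps, u_sum / steps


T_avg, U_avg = time_averages_sho()
print(T_avg, U_avg)  # both should be close to k A^2 / 4
```

For these sample values each average equals one half of the total energy $\frac{1}{2}kA^2$, in line with $\langle T \rangle = \langle U \rangle$ for $n = 1$.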
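$$\langle U \rangle \approx -\frac{GM^2}{R} \qquad (\alpha)$$
where $R$ is the radius of a cluster and $M = Nm$ is the total mass of a cluster of $N$ galaxies, each of mass $m$. The average kinetic energy per galaxy is $\frac{1}{2}m\langle v \rangle^2$ where $\langle v \rangle^2$ is the average
square of the galaxy velocities with respect to the center of mass of the cluster. Thus the total kinetic energy
of the cluster is
$$\langle T \rangle \approx \frac{Nm\langle v \rangle^2}{2} = \frac{M\langle v \rangle^2}{2} \qquad (\beta)$$
The Virial theorem tells us that a central force having a radial dependence of the form $F \propto r^{n}$ gives $\langle T \rangle = \frac{n+1}{2}\langle U \rangle$. For the inverse-square gravitational force then
$$\langle T \rangle = -\frac{1}{2}\langle U \rangle \qquad (\gamma)$$
Thus equations $\alpha$, $\beta$ and $\gamma$ give an estimate of the total mass of the cluster to be
$$M \approx \frac{R\langle v \rangle^2}{G}$$
This estimate is larger than the value estimated from the luminosity of the cluster, implying that a large amount
of “dark matter” must exist in galaxies. The nature of this dark matter remains an open question in physics.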
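The virial mass estimate $M \approx R\langle v \rangle^2 / G$ is simple enough to evaluate directly. The sketch below assumes SI units; the cluster radius and velocity used are round illustrative numbers chosen only to exercise the formula, not measured values, and the function name is hypothetical.

```python
G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2


def virial_cluster_mass(radius_m, v_rms_m_per_s):
    """Crude virial estimate M ~ R <v>^2 / G of a cluster's total mass."""
    return radius_m * v_rms_m_per_s ** 2 / G


# Illustrative inputs only (assumed, not data): R ~ 3e22 m, <v> ~ 1e6 m/s
R_example = 3.0e22
v_example = 1.0e6
M_example = virial_cluster_mass(R_example, v_example)

# Consistency check of <T> = -<U>/2 using the crude estimates in the text
T_example = 0.5 * M_example * v_example ** 2
U_example = -G * M_example ** 2 / R_example
print(M_example, T_example, -0.5 * U_example)
```

The last two printed numbers agree because the mass was chosen to satisfy the virial relation for these crude energy estimates.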
$$\mathbf{F} = \mathbf{F}_g + \mathbf{N} + \mathbf{f}_f = m\mathbf{a} \tag{2.91}$$
$$N = mg\cos\theta \tag{2.93}$$
Similarly, taking components along the inclined plane in the $x$ direction
$$mg\sin\theta - f_f = m\frac{d^2x}{dt^2} \tag{2.94}$$
Using the concept of coefficient of friction $\mu$
$$f_f = \mu N \tag{2.95}$$
Thus the equation of motion can be written as
$$mg(\sin\theta - \mu\cos\theta) = m\frac{d^2x}{dt^2} \tag{2.96}$$
The block accelerates if $\sin\theta > \mu\cos\theta$, that is, $\tan\theta > \mu$. The
acceleration is constant if $\mu$ and $g$ are constant, that is
$$\frac{d^2x}{dt^2} = g(\sin\theta - \mu\cos\theta) \tag{2.97}$$
Figure 2.3: Block on an inclined plane
Remember that if the block is stationary, the friction coefficient balances such that $(\sin\theta - \mu\cos\theta) = 0$,
that is, $\tan\theta = \mu$. However, there is a maximum static friction coefficient $\mu_S$ beyond which the block starts
sliding. The kinetic coefficient of friction $\mu_K$ is applicable for sliding friction and usually $\mu_K < \mu_S$.
Another example of constant force and acceleration is motion of objects free falling in a uniform gravitational
field when air drag is neglected. Then one obtains the simple relations such as $v = u + at$, etc.
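The condition for sliding and the resulting constant acceleration in equation 2.97 can be sketched as a small function. This is a simplified illustration that uses a single friction coefficient for both the sliding threshold and the sliding motion, which ignores the distinction between static and kinetic friction made above; the function name is hypothetical.

```python
import math


def incline_acceleration(theta_deg, mu, g=9.81):
    """Acceleration down the plane, g (sin(theta) - mu cos(theta)).

    Returns 0.0 when tan(theta) <= mu, that is, when friction can hold
    the block stationary (single-coefficient simplification).
    """
    theta = math.radians(theta_deg)
    if math.tan(theta) <= mu:
        return 0.0
    return g * (math.sin(theta) - mu * math.cos(theta))


print(incline_acceleration(10.0, 0.5))  # block stays at rest
print(incline_acceleration(30.0, 0.0))  # frictionless: g sin(30 deg)
print(incline_acceleration(45.0, 0.5))  # sliding with friction
```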
2.12. APPLICATIONS OF NEWTON’S EQUATIONS OF MOTION 25
$$\ddot{x} + \omega_0^2 x = 0 \tag{2.100}$$
which is the equation of the harmonic oscillator. Examples are small oscillations of a mass on a spring,
vibrations of a stretched piano string, etc.
The solution of this second order equation is
$$x(t) = A\sin(\omega_0 t - \delta) \tag{2.101}$$
This is the well known sinusoidal behavior of the displacement for the simple harmonic oscillator. The
angular frequency $\omega_0$ is
$$\omega_0 = \sqrt{\frac{k}{m}} \tag{2.102}$$
Note that this linear system has no dissipative forces, thus the total energy is a constant of motion as
discussed previously. That is, it is a conservative system with a total energy $E$ given by
$$\frac{1}{2}m\dot{x}^2 + \frac{1}{2}kx^2 = E \tag{2.103}$$
The first term is the kinetic energy and the second term is the potential energy. The Virial theorem gives
that for the linear restoring force the average kinetic energy equals the average potential energy.
Consider a conservative force in one dimension. Since it was shown that the total energy $E = T + U$ is
conserved for a conservative field, then
$$E = T + U = \frac{1}{2}mv^2 + U(x) \tag{2.105}$$
Therefore:
$$v = \frac{dx}{dt} = \pm\sqrt{\frac{2}{m}\left[E - U(x)\right]} \tag{2.106}$$
Integration of this gives
$$t - t_0 = \int_{x_0}^{x} \frac{\pm\, dx}{\sqrt{\frac{2}{m}\left[E - U(x)\right]}} \tag{2.107}$$
where $x = x_0$ when $t = t_0$. Knowing $U(x)$ it is possible to solve this equation as a function of time.
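Equation 2.107 can be evaluated by numerical quadrature when no closed form is available. The following is a minimal sketch using a midpoint rule, checked against the harmonic potential $U = \frac{1}{2}kx^2$ where the exact answer is known. The function name `travel_time` and the parameter values are illustrative assumptions, and the integration range is kept away from the turning point where the integrand diverges.

```python
import math


def travel_time(U, E, m, x0, x1, n=20000):
    """Midpoint-rule estimate of t - t0 = integral of dx / sqrt((2/m)(E - U(x)))."""
    h = (x1 - x0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * h
        total += h / math.sqrt(2.0 / m * (E - U(x)))
    return total


# Harmonic oscillator test case (illustrative values)
m, k, A = 1.0, 4.0, 1.0
E = 0.5 * k * A * A                      # total energy for amplitude A
t_num = travel_time(lambda x: 0.5 * k * x * x, E, m, 0.0, 0.5 * A)
t_exact = math.asin(0.5) / math.sqrt(k / m)  # from x = A sin(omega t)
print(t_num, t_exact)
```

The same function can be reused with any other one-dimensional potential, provided $E > U(x)$ throughout the integration range.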
the exponential term in the potential function can be expanded to give
$$U(x) \approx U_0\left[1 - \left(1 - \frac{(x - x_0)}{\delta}\right)\right]^2 - U_0 \approx \frac{U_0}{\delta^2}(x - x_0)^2 - U_0$$
This gives a restoring force
$$F(x) = -\frac{dU(x)}{dx} = -2\frac{U_0}{\delta^2}(x - x_0)$$
That is, for small amplitudes the restoring force is linear.
[Figure: Potential energy function $U(x)/U_0$ versus $x/\delta$ for the diatomic molecule.]
where for spherical objects of diameter $D$, $c_1 \approx 1.55\times10^{-4}D$ and $c_2 \approx 0.22D^2$ in MKS units. Fortunately, the
equation of motion usually can be integrated when the retarding force has a simple power law dependence.
As an example, consider free fall in the Earth's gravitational field.
$$-mg - c_1 v = m\frac{dv}{dt}$$
Separate the variables and integrate
$$t = \int_{v_0}^{v} \frac{m\,dv}{-mg - c_1 v} = -\frac{m}{c_1}\ln\left(\frac{mg + c_1 v}{mg + c_1 v_0}\right)$$
That is
$$v = -\frac{mg}{c_1} + \left(\frac{mg}{c_1} + v_0\right)e^{-\frac{c_1}{m}t}$$
Note that for $t \gg \frac{m}{c_1}$ the velocity approaches a terminal velocity of $v_\infty = -\frac{mg}{c_1}$. The characteristic time
constant is $\tau = \frac{m}{c_1} = \frac{|v_\infty|}{g}$. Note that if $v_0 = 0$ then
$$v = v_\infty\left(1 - e^{-\frac{t}{\tau}}\right)$$
For the case of small raindrops with $D = 0.5$ mm the terminal speed is $|v_\infty| = 8$ m/s (18 m.p.h.) and the time constant is $\tau = 0.8$ sec.
Note that in the absence of air drag, these rain drops falling from 2000 m would attain a velocity of over
400 m.p.h. It is fortunate that the drag reduces the speed of rain drops to non-damaging values. Note that
the above relation would predict high velocities for hail. Fortunately, the drag increases quadratically at the
higher velocities attained by large rain drops or hail, and this limits the terminal velocity to moderate values.
In the United States these velocities still are sufficient to do considerable crop damage in the mid-west.
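The raindrop numbers quoted above can be reproduced from $c_1 \approx 1.55\times10^{-4}D$. The sketch below assumes a spherical water drop of density 1000 kg/m³ (an assumption needed to obtain the mass, since the text does not state it) and works with the magnitude of the terminal velocity; the function names are hypothetical.

```python
import math

C1_PER_METER = 1.55e-4  # c1 = 1.55e-4 * D in MKS units, as quoted in the text


def linear_drag_fall(diameter_m, density=1000.0, g=9.81):
    """Return (terminal speed, time constant) for a sphere in the linear regime.

    The mass assumes a uniform sphere of the given density (water by default).
    """
    mass = density * math.pi * diameter_m ** 3 / 6.0
    c1 = C1_PER_METER * diameter_m
    tau = mass / c1      # characteristic time m / c1
    v_terminal = g * tau  # magnitude of m g / c1
    return v_terminal, tau


def speed_from_rest(t, v_terminal, tau):
    """Speed at time t when released from rest: v_inf (1 - exp(-t / tau))."""
    return v_terminal * (1.0 - math.exp(-t / tau))


v_inf, tau = linear_drag_fall(0.5e-3)  # 0.5 mm raindrop
print(v_inf, tau)
print(speed_from_rest(tau, v_inf, tau))  # about 63% of terminal speed
```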
Quadratic regime $c_2 v \gg c_1$
For larger objects at higher velocities, i.e. high Reynolds number, the drag depends on the square of the
velocity making it necessary to differentiate between objects rising and falling. The equation of motion is
$$-mg \pm c_2 v^2 = m\frac{dv}{dt}$$
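where the positive sign is for falling objects and negative sign for rising objects. Integrating the equation of
motion for falling gives
$$t = \int_{v_0}^{v} \frac{m\,dv}{-mg + c_2 v^2} = \tau\left(\tanh^{-1}\frac{v_0}{v_\infty} - \tanh^{-1}\frac{v}{v_\infty}\right)$$
where $\tau = \sqrt{\frac{m}{c_2 g}}$ and $v_\infty = \sqrt{\frac{mg}{c_2}}$. That is, $\tau = \frac{v_\infty}{g}$. For the case of a falling object with $v_0 = 0$, solving for
the speed gives
$$|v| = v_\infty\tanh\frac{t}{\tau}$$
As an example, a 0.6 kg basket ball with $D = 0.25$ m will have $v_\infty = 20$ m/s (43 m.p.h.) and $\tau = 2.1$ s.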
Consider President George H.W. Bush skydiving. Assume his mass is 70 kg and assume an equivalent
spherical shape of the former President to have a diameter of $D = 1$ m. This gives that $v_\infty = 56$ m/s
(about 120 m.p.h.) and $\tau = 5.6$ s. When Bush senior opens his 8 m diameter parachute his terminal velocity is
estimated to decrease to 7 m/s (about 15 m.p.h.) which is close to the value for a typical (8 m) diameter emergency
parachute which has a measured terminal velocity of 11 m.p.h. in spite of air leakage through the central vent
needed to stabilize the parachute motion.
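The terminal velocities quoted for the quadratic regime follow from $v_\infty = \sqrt{mg/c_2}$ with $c_2 \approx 0.22D^2$. The following minimal sketch reproduces them to the stated precision; the function name is hypothetical and $g = 9.81$ m/s² is an assumed value.

```python
import math


def quadratic_drag_fall(mass_kg, diameter_m, g=9.81):
    """Return (terminal speed, time constant) for quadratic drag, c2 = 0.22 D^2."""
    c2 = 0.22 * diameter_m ** 2
    v_terminal = math.sqrt(mass_kg * g / c2)
    tau = v_terminal / g
    return v_terminal, tau


v_ball, tau_ball = quadratic_drag_fall(0.6, 0.25)   # basketball
v_diver, tau_diver = quadratic_drag_fall(70.0, 1.0)  # skydiver as a 1 m sphere
v_chute, tau_chute = quadratic_drag_fall(70.0, 8.0)  # same mass, 8 m parachute
print(v_ball, tau_ball)
print(v_diver, tau_diver)
print(v_chute, tau_chute)
```

These are order-of-magnitude estimates, since treating a person or a parachute as an equivalent sphere is itself a crude approximation.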
Therefore the rocket is given an equal and opposite increase in momentum
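In the time interval $dt$ the net change in the linear momentum of the rocket plus fuel system is given by
That is
$$m_0 - m = \alpha t \tag{2.122}$$
Thus
$$v = -gt + u\ln\left(\frac{m_0}{m}\right) \tag{2.123}$$
where $u$ is the exhaust velocity of the fuel relative to the rocket and $\alpha$ is the constant rate at which fuel is burned.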
Note that once the propellant is exhausted the rocket will continue to fly upwards as it decelerates in
the gravitational field. You can easily calculate the maximum height. Note that this formula assumes that
the acceleration due to gravity is constant whereas for large heights above the Earth it is necessary to use
the true gravitational force $-\frac{GMm}{r^2}$ where $r$ is the distance from the center of the earth. In real situations
it is necessary to include air drag which requires a computer to numerically solve the equations of motion.
The highest rocket velocity is attained by maximizing the exhaust velocity and the ratio of initial to final
mass. Because the terminal velocity is limited by the mass ratio, engineers construct multistage rockets that
jettison the spent fuel containers and rockets. The variational-principle approach applied to variable mass
problems is discussed in chapter 8.7.
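Equation 2.123 can be evaluated directly for a rocket launched from rest. The sketch below assumes a constant burn rate $\alpha$, constant $g$, and no air drag, exactly as in the formula; the function name and the numerical values are illustrative assumptions.

```python
import math


def rocket_velocity(t, m0, alpha, u, g=9.81):
    """Velocity v = -g t + u ln(m0 / m) with m = m0 - alpha t (launch from rest)."""
    m = m0 - alpha * t
    if m <= 0.0:
        raise ValueError("burn time exceeds the available mass")
    return -g * t + u * math.log(m0 / m)


# Illustrative values: 100 kg initial mass, 5 kg/s burn rate, 2000 m/s exhaust
v_with_gravity = rocket_velocity(10.0, 100.0, 5.0, 2000.0)
v_no_gravity = rocket_velocity(10.0, 100.0, 5.0, 2000.0, g=0.0)
print(v_with_gravity, v_no_gravity)
```

With gravity switched off the result reduces to $u\ln(m_0/m)$, which shows explicitly that the attainable velocity is set by the exhaust velocity and the mass ratio.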
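The vector triple product can be simplified using the vector identity equation 24 giving
$$\mathbf{L} = \sum_i m_i\left[r_i^2\,\boldsymbol{\omega} - (\mathbf{r}_i\cdot\boldsymbol{\omega})\,\mathbf{r}_i\right] \tag{2.126}$$
The simplest case for rigid-body rotation is when the body has a symmetry axis with the angular velocity $\boldsymbol{\omega}$
parallel to this body-fixed symmetry axis. For this case then $\mathbf{r}_i$ can be taken perpendicular to $\boldsymbol{\omega}$ for which
the second term in equation 2.126 vanishes, i.e. $(\mathbf{r}_i\cdot\boldsymbol{\omega}) = 0$, thus
$$\mathbf{L} = \sum_i \left(m_i r_i^2\right)\boldsymbol{\omega} \qquad (\mathbf{r}_i \text{ perpendicular to } \boldsymbol{\omega})$$
The moment of inertia about the rotation axis is defined as
$$I \equiv \sum_i m_i r_i^2 \tag{2.127}$$
where $r_i$ is the perpendicular distance from the axis of rotation to the body $i$. For a continuous body the
moment of inertia can be generalized to an integral over the mass density $\rho$ of the body
$$I = \int \rho r^2\,dV \tag{2.128}$$
where $r$ is perpendicular to the rotation axis. The definition of the moment of inertia allows rewriting the
angular momentum about a symmetry axis $\mathbf{L}$ in the form
$$\mathbf{L} = I\boldsymbol{\omega} \tag{2.129}$$
where the moment of inertia $I$ is taken about the symmetry axis and assuming that the angular velocity
of rotation vector $\boldsymbol{\omega}$ is parallel to the symmetry axis.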
$$\boldsymbol{\omega} = \omega\hat{\mathbf{z}} \tag{2.130}$$
$$\mathbf{r}_i = (x_i, y_i, z_i) \tag{2.131}$$
which is written as a column vector for clarity. Inserting $\mathbf{v}_i$ in the cross-product $\mathbf{r}_i \times \mathbf{v}_i$ gives the components
of the angular momentum to be
$$\mathbf{L} = \sum_i m_i\,\mathbf{r}_i \times \mathbf{v}_i = \omega\sum_i m_i\begin{pmatrix} -x_i z_i \\ -y_i z_i \\ x_i^2 + y_i^2 \end{pmatrix}$$
where (2.134) gives the elementary formula for the moment of inertia $I_z = \sum_i m_i (x_i^2 + y_i^2)$ about the $z$ axis given earlier
in (2.129).
The surprising result is that $L_x$ and $L_y$ are non-zero, implying that the total angular momentum vector $\mathbf{L}$ is
in general not parallel with $\boldsymbol{\omega}$. This can be understood by considering the single body shown in figure 2.6.
When the body is in the $y$-$z$ plane then $x = 0$ and $L_x = 0$. Thus the angular momentum vector $\mathbf{L}$ has a
component along the $-y$ direction as shown which is not parallel with $\boldsymbol{\omega}$ and, since the vectors $\boldsymbol{\omega}$, $\mathbf{L}$, $\mathbf{r}_i$ are
coplanar, then $\mathbf{L}$ must sweep around the rotation axis $\boldsymbol{\omega}$ to remain coplanar with the body as it rotates
about the $z$ axis. Instantaneously the velocity of the body $\mathbf{v}_i$ is into the plane of the paper and, since
$\mathbf{L} = m\,\mathbf{r} \times \mathbf{v}$, then $\mathbf{L}$ is at an angle $(90^\circ - \theta)$ to the $z$ axis. This implies that a torque must be applied
to rotate the angular momentum vector. This explains why your automobile shakes if the rotation axis and
symmetry axis are not parallel for one wheel.
Figure 2.6: A rigid rotating body comprising a single mass $m$ attached by a massless rod at a fixed
angle $\theta$ shown at the instant when $m$ happens to lie in the $y$-$z$ plane. As the body rotates about
the $z$-axis the mass has a velocity and momentum into the page (the negative $x$ direction).
Therefore the angular momentum $\mathbf{L} = \mathbf{r} \times \mathbf{p}$ is in the direction shown which is not parallel to the
angular velocity $\boldsymbol{\omega}$.
The first two moments in (2.133) are called products of inertia of the body designated by the pair of
axes involved. Therefore, to avoid confusion, it is necessary to define the diagonal moment, which is called
the moment of inertia, by two subscripts as $I_{zz}$. Thus in general, a body can have three moments of inertia
about the three axes plus three products of inertia. This group of moments comprises the inertia tensor
which will be discussed further in chapter 13. If a body has an axis of symmetry along the $z$ axis then the
summations will give $L_x = L_y = 0$ while $L_z$ will be unchanged. That is, for rotation about a symmetry
axis the angular momentum and rotation axes are parallel. Any axis along which the angular momentum
and angular velocity coincide is called a principal axis of the body.
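The statement that $\mathbf{L}$ need not be parallel to $\boldsymbol{\omega}$ can be checked for a single point mass rotating about the $z$ axis. This is a minimal sketch using $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$ and $\mathbf{L} = m\,\mathbf{r} \times \mathbf{v}$; the helper names and the sample position are illustrative assumptions.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def angular_momentum(m, r, omega):
    """L = m r x v for a point mass in rigid rotation, with v = omega x r."""
    v = cross(omega, r)
    return tuple(m * c for c in cross(r, v))


# Mass of 2 kg at (0, 1, 1), lying in the y-z plane, rotating about z at 3 rad/s
L = angular_momentum(2.0, (0.0, 1.0, 1.0), (0.0, 0.0, 3.0))
print(L)  # L has a component along -y as well as along z
```

The non-zero $y$ component is the product-of-inertia term $-\omega m y z$; it vanishes only if the mass lies in the plane $z = 0$ or on the rotation axis.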
0 =
0 = +
0
= ( + 2 )
That is
= =
0 0 + 2
Note that this is true independent of the details of the acceleration of the initially stationary child.
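$$N = f_f R = \mu M g R$$
Since the moment of inertia about the center of a uniform sphere is $I = \frac{2}{5}MR^2$ then the angular acceleration
of the ball is
$$\dot{\omega} = \frac{N}{I} = \frac{\mu M g R}{\frac{2}{5}MR^2} = \frac{5}{2}\frac{\mu g}{R} \qquad (\alpha)$$
Moreover the frictional force causes a deceleration $a$ of the linear velocity of the center of mass of
$$a = -\frac{f_f}{M} = -\mu g \qquad (\beta)$$
Integrating $\alpha$ from time zero to $t$ gives
$$\omega = \int_0^t \dot{\omega}\,dt = \frac{5}{2}\frac{\mu g}{R}t$$
The linear velocity of the center of mass at time $t$ is given by integration of equation $\beta$
$$v = \int_0^t a\,dt = v_0 - \mu g t$$
The billiard ball stops sliding and only rolls when $v = \omega R$, that is, when
$$\frac{5}{2}\mu g t = v_0 - \mu g t$$
That is, when
$$t = \frac{2}{7}\frac{v_0}{\mu g}$$
Thus the ball slips for a distance
$$s = \int_0^t v\,dt = v_0 t - \frac{\mu g t^2}{2} = \frac{12}{49}\frac{v_0^2}{\mu g}$$
Note that if the ball is pushed at a distance $h$ above the center of mass, besides the linear velocity there
is an initial angular velocity of
$$\omega = \frac{M v_0 h}{\frac{2}{5}MR^2} = \frac{5}{2}\frac{v_0 h}{R^2}$$
For the special case $h = \frac{2}{5}R$ the ball immediately assumes a pure non-slipping roll. For $h < \frac{2}{5}R$ one has
$\omega < \frac{v_0}{R}$ while $h > \frac{2}{5}R$ corresponds to $\omega > \frac{v_0}{R}$. In the latter case the frictional force points forward.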
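The slip time and slip distance of the billiard ball can be verified numerically. The sketch below assumes a uniform sphere with $I = \frac{2}{5}MR^2$ and a constant sliding friction coefficient; the function name and sample values are illustrative.

```python
def slip_to_roll(v0, mu, g=9.81):
    """Return (t, s, v, omega_R) at the instant sliding stops.

    t is the slip time 2 v0 / (7 mu g), s the slip distance, v the
    center-of-mass speed, and omega_R the rim speed omega * R.
    """
    t = 2.0 * v0 / (7.0 * mu * g)
    s = v0 * t - 0.5 * mu * g * t * t
    v = v0 - mu * g * t
    omega_R = 2.5 * mu * g * t  # from omega = (5/2)(mu g / R) t
    return t, s, v, omega_R


t_slip, s_slip, v_roll, rim_speed = slip_to_roll(v0=2.0, mu=0.2)
print(t_slip, s_slip)
print(v_roll, rim_speed)  # equal once the ball rolls without slipping
```

At the transition the center-of-mass speed is $\frac{5}{7}v_0$, independent of the friction coefficient; friction only sets how long the transition takes.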
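Since $\mathbf{F}(t) = \frac{d\mathbf{p}}{dt}$ then equation 2.135 gives that
$$\mathbf{P} = \int_0^t \frac{d\mathbf{p}}{dt'}\,dt' = \int_0^t d\mathbf{p} = \mathbf{p}(t) - \mathbf{p}_0 = \Delta\mathbf{p} \tag{2.136}$$
Thus the impulse $\mathbf{P}$ is an unambiguous quantity that equals the change in linear momentum of the object
that has been struck, which is independent of the details of the time dependence of the impulsive force.
Computation of the spatial motion still requires knowledge of $\mathbf{F}(t)$ since equation 2.136 can be written as
$$\mathbf{v}(t) = \frac{1}{m}\int_0^t \mathbf{F}(t')\,dt' + \mathbf{v}_0 \tag{2.137}$$
Integration gives
$$\mathbf{r}(t) - \mathbf{r}_0 = \mathbf{v}_0 t + \int_0^t\left[\frac{1}{m}\int_0^{t''} \mathbf{F}(t')\,dt'\right]dt'' \tag{2.138}$$
In general this is complicated. However, for the case of a constant force $\mathbf{F}(t) = \mathbf{F}_0$ this simplifies to the
constant acceleration equation
$$\mathbf{r}(t) - \mathbf{r}_0 = \mathbf{v}_0 t + \frac{1}{2}\frac{\mathbf{F}_0}{m}t^2 \tag{2.139}$$
where the constant acceleration $\mathbf{a} = \frac{\mathbf{F}_0}{m}$.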
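$$\mathbf{P} = \Delta\mathbf{p} = M\Delta\mathbf{v}$$
$$\mathbf{T} = \mathbf{s} \times \mathbf{P} = \Delta\mathbf{L} = I\Delta\boldsymbol{\omega}$$
$$\Delta\mathbf{v} = \frac{\mathbf{P}}{M}$$
$$\Delta\boldsymbol{\omega} = \frac{\mathbf{s} \times \mathbf{P}}{I}$$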
Assume that the bat was stationary prior to the strike, then after the strike the net translational velocity
of a point along the body-fixed symmetry axis of the bat at a distance $y$ from the center of mass, is given
by
$$\mathbf{v}(y) = \Delta\mathbf{v} + \Delta\boldsymbol{\omega} \times \mathbf{y} = \frac{\mathbf{P}}{M} + \frac{1}{I}\left((\mathbf{s} \times \mathbf{P}) \times \mathbf{y}\right) = \frac{\mathbf{P}}{M} + \frac{1}{I}\left[(\mathbf{s}\cdot\mathbf{y})\,\mathbf{P} - (\mathbf{s}\cdot\mathbf{P})\,\mathbf{y}\right]$$
It is assumed that $\mathbf{s}$ and $\mathbf{P}$ are perpendicular and thus $(\mathbf{s}\cdot\mathbf{P}) = 0$ which simplifies the above equation to
$$\mathbf{v}(y) = \Delta\mathbf{v} + \Delta\boldsymbol{\omega} \times \mathbf{y} = \frac{\mathbf{P}}{M}\left(1 + \frac{M(\mathbf{s}\cdot\mathbf{y})}{I}\right)$$
Note that the translational velocity of the location along the bat symmetry axis at a distance $y$ from the
center of mass, is zero if the bracket equals zero, that is, if
$$\mathbf{s}\cdot\mathbf{y} = -\frac{I}{M} = -k^2$$
where $k$ is called the radius of gyration of the body about the center of mass. Note that when the scalar
product $\mathbf{s}\cdot\mathbf{y} = -\frac{I}{M} = -k^2$ then there will be no translational motion at the point $\mathbf{y}$. This point on the
axis lies on the opposite side of the center of mass from the strike point $\mathbf{s}$, and is called the center of
percussion corresponding to the impulse at the point $\mathbf{s}$. The center of percussion often is referred to as the
“sweet spot” for an object corresponding to the impulse at the point $\mathbf{s}$. For a baseball bat the batter holds
the bat at the center of percussion so that they do not feel an impulse in their hands when the ball is struck
at the point $\mathbf{s}$. This principle is used extensively to design bats for all sports involving striking a ball with
a bat, such as cricket, squash, tennis, etc., as well as weapons such as swords and axes used to decapitate
opponents.
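The center-of-percussion condition $\mathbf{s}\cdot\mathbf{y} = -k^2$ can be illustrated in one dimension along the symmetry axis. The sketch below uses a uniform thin rod as a stand-in for a bat, which is an assumption made only because its radius of gyration, $k^2 = L^2/12$, is known in closed form; the function names are hypothetical.

```python
def center_of_percussion(k_gyration, s):
    """Position y (measured from the center of mass) where s * y = -k^2."""
    return -k_gyration ** 2 / s


def velocity_after_strike(P, M, k_gyration, s, y):
    """Speed of the point at y just after impulse P at s: (P/M)(1 + s y / k^2)."""
    return (P / M) * (1.0 + s * y / k_gyration ** 2)


# Uniform thin rod of length 1 m (assumed stand-in for a bat), struck at one end
length = 1.0
k = (length ** 2 / 12.0) ** 0.5   # radius of gyration about the center of mass
s = 0.5 * length                  # strike point at the end of the rod
y_cp = center_of_percussion(k, s)
print(y_cp)                                        # -1/6 m, other side of the CM
print(velocity_after_strike(1.0, 2.0, k, s, y_cp))  # essentially zero
```

For the rod the two conjugate points are the end and the point two-thirds of the length away from it, the familiar result for a rod pivoted at one end.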
Thus
1 2
= − cos
40 0
3
Integrate from 2 2 gives that the total momentum imparted to 2 is
Z 3
1 2 2 1 2
= − cos =
40 0
2
20 0
where $\mathbf{g}$ is the gravitational field which is a position-dependent force per unit gravitational mass pointing
towards the center of the Earth. The gravitational mass is measured when an object is weighed.
Newton's Law of Gravitation leads to the relation for the gravitational field $\mathbf{g}(\mathbf{r})$ at the location $\mathbf{r}$ due
to a gravitational mass distribution at the location $\mathbf{r}'$ as given by the integral over the gravitational mass
density $\rho$
$$\mathbf{g}(\mathbf{r}) = -G\int_V \frac{\rho(\mathbf{r}')\,\widehat{(\mathbf{r} - \mathbf{r}')}}{(\mathbf{r} - \mathbf{r}')^2}\,dV' \tag{2.149}$$
The acceleration of matter in a gravitational field relates the gravitational and inertial masses
$$\mathbf{F} = m_G\,\mathbf{g} = m_I\,\mathbf{a} \tag{2.150}$$
Thus
$$\mathbf{a} = \frac{m_G}{m_I}\,\mathbf{g} \tag{2.151}$$
That is, the acceleration of a body depends on the gravitational strength and the ratio of the gravitational
and inertial masses. It has been shown experimentally that all matter is subject to the same acceleration
in vacuum at a given location in a gravitational field. That is, $\frac{m_G}{m_I}$ is a constant common to all materials.
Galileo first showed this when he dropped objects from the Tower of Pisa. Modern experiments have shown
that this is true to 5 parts in $10^{13}$.
The exact equivalence of gravitational mass and inertial mass is called the weak principle of equivalence
which underlies the General Theory of Relativity as discussed in chapter 17. It is convenient to use
the same unit for the gravitational and inertial masses and thus they both can be written in terms of the
common mass symbol $m$.
$$m_G = m_I = m \tag{2.152}$$
Therefore the subscripts $G$ and $I$ can be omitted in equations 2.150 and 2.152. Also the local acceleration
due to gravity $\mathbf{a}$ can be written as
$$\mathbf{a} = \mathbf{g} \tag{2.153}$$
The gravitational field $\mathbf{g} \equiv \frac{\mathbf{F}}{m}$ has units of N/kg in the MKS system while the acceleration $\mathbf{a}$ has units m/s$^2$.
since the scalar product of the unit vectors $\hat{\mathbf{r}}\cdot\hat{\mathbf{r}} = 1$. Note that the second two terms also cancel since
$\hat{\mathbf{r}}\cdot\hat{\boldsymbol{\theta}} = \hat{\mathbf{r}}\cdot\hat{\boldsymbol{\phi}} = 0$ since the unit vectors are mutually orthogonal. Thus the line integral depends only on
the starting and ending radii and is independent of the angular coordinates or the detailed path taken between
$(r_a, \theta_a, \phi_a)$ and $(r_b, \theta_b, \phi_b)$.
Consider the Principle of Superposition for a gravitational field produced by a set of $n$ point masses. The
line integral then can be written as:
$$\Delta U^{net}_{a\rightarrow b} = -\int_{r_a}^{r_b} \mathbf{F}_{net}\cdot d\mathbf{l} = -\sum_{i=1}^{n}\int_{r_a}^{r_b} \mathbf{F}_i\cdot d\mathbf{l} = \sum_{i=1}^{n} \Delta U^{i}_{a\rightarrow b} \tag{2.157}$$
Thus the net potential energy difference is the sum of the contributions from each point mass producing the
gravitational force field. Since each component is conservative, then the total potential energy difference also
must be conservative. For a conservative force, this line integral is independent of the path taken; it depends
only on the starting and ending positions, $\mathbf{r}_a$ and $\mathbf{r}_b$. That is, the potential energy is a local function
dependent only on position. The usefulness of gravitational potential energy is that, since the gravitational
force is a conservative force, it is possible to solve many problems in classical mechanics using the fact
that the sum of the kinetic energy and potential energy is a constant. Note that the gravitational field is
conservative, since the potential energy difference $\Delta U_{a\rightarrow b}$ is independent of the path taken. It is conservative
because the force is radial and time independent; it is not due to the $\frac{1}{r^2}$ dependence of the field.
Note that the probe mass $m_0$ factors out from the integral. It is convenient to define a new quantity called
gravitational potential $\phi$ where
$$\Delta\phi_{a\rightarrow b} = \frac{\Delta U_{a\rightarrow b}}{m_0} = -\int_{r_a}^{r_b} \mathbf{g}\cdot d\mathbf{l} \tag{2.159}$$
That is, gravitational potential difference is the work that must be done, per unit mass, to move from $a$ to
$b$ with no change in kinetic energy. Be careful not to confuse the gravitational potential energy difference
$\Delta U_{a\rightarrow b}$ and gravitational potential difference $\Delta\phi_{a\rightarrow b}$, that is, $\Delta U$ has units of energy, joules, while $\Delta\phi$ has
units of joules/kg.
The gravitational potential is a property of the gravitational force field; it is given as minus the line
integral of the gravitational field from $a$ to $b$. The change in gravitational potential energy for moving a
mass $m_0$ from $a$ to $b$ is given in terms of gravitational potential by:
$$\Delta U_{a\rightarrow b} = m_0\,\Delta\phi_{a\rightarrow b} \tag{2.160}$$
Thus gravitational potential is a simple additive scalar field because the Principle of Superposition applies.
The gravitational potential difference between two points differing by $h$ in height is $gh$. Clearly, the greater $g$ or $h$,
the greater the energy released by the gravitational field when dropping a body through the height $h$. The
unit of gravitational potential is the joule/kg.
2.14. NEWTON’S LAW OF GRAVITATION 41
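$$d\phi = -\mathbf{g}\cdot d\mathbf{l} \tag{2.164}$$
Using cartesian coordinates both $\mathbf{g}$ and $d\mathbf{l}$ can be written as
$$\mathbf{g} = \hat{\mathbf{i}}\,g_x + \hat{\mathbf{j}}\,g_y + \hat{\mathbf{k}}\,g_z \qquad d\mathbf{l} = \hat{\mathbf{i}}\,dx + \hat{\mathbf{j}}\,dy + \hat{\mathbf{k}}\,dz \tag{2.165}$$
Taking the scalar product gives:
$$d\phi = -\mathbf{g}\cdot d\mathbf{l} = -g_x\,dx - g_y\,dy - g_z\,dz \tag{2.166}$$
Differential calculus expresses the change in potential $d\phi$ in terms of partial derivatives by:
$$d\phi = \frac{\partial\phi}{\partial x}dx + \frac{\partial\phi}{\partial y}dy + \frac{\partial\phi}{\partial z}dz \tag{2.167}$$
By association, 2.166 and 2.167 imply that
$$g_x = -\frac{\partial\phi}{\partial x} \qquad g_y = -\frac{\partial\phi}{\partial y} \qquad g_z = -\frac{\partial\phi}{\partial z} \tag{2.168}$$
Thus on each axis, the gravitational field can be written as minus the gradient of the gravitational potential.
In three dimensions, the gravitational field is minus the total gradient of potential and the gradient of the
scalar function $\phi$ can be written as:
$$\mathbf{g} = -\nabla\phi \tag{2.169}$$
In cartesian coordinates this equals
$$\mathbf{g} = -\left[\hat{\mathbf{i}}\frac{\partial\phi}{\partial x} + \hat{\mathbf{j}}\frac{\partial\phi}{\partial y} + \hat{\mathbf{k}}\frac{\partial\phi}{\partial z}\right] \tag{2.170}$$
Thus the gravitational field is just minus the gradient of the gravitational potential, which always is perpendicular
to the equipotentials. Skiers are familiar with the concept of gravitational equipotentials and the fact that
the line of steepest descent, and thus maximum acceleration, is perpendicular to gravitational equipotentials
of constant height. The advantage of using potential theory for inverse-square law forces is that scalar
potentials replace the more complicated vector forces, which greatly simplifies calculation. Potential theory
plays a crucial role for handling both gravitational and electrostatic forces.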
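The relation $\mathbf{g} = -\nabla\phi$ can be checked numerically for the potential of a point mass, $\phi = -GM/r$, by taking central finite differences. The sketch below is illustrative only; the function names, the step size, and the use of approximate Earth values for the mass and radius are assumptions.

```python
import math

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2


def potential(M, x, y, z):
    """Gravitational potential phi = -G M / r of a point mass at the origin."""
    return -G * M / math.sqrt(x * x + y * y + z * z)


def field_from_potential(M, x, y, z, h=1.0):
    """g = -grad(phi), estimated with central differences of step h (meters)."""
    gx = -(potential(M, x + h, y, z) - potential(M, x - h, y, z)) / (2.0 * h)
    gy = -(potential(M, x, y + h, z) - potential(M, x, y - h, z)) / (2.0 * h)
    gz = -(potential(M, x, y, z + h) - potential(M, x, y, z - h)) / (2.0 * h)
    return gx, gy, gz


M_earth = 5.972e24   # kg, approximate
R_earth = 6.371e6    # m, approximate
gx, gy, gz = field_from_potential(M_earth, R_earth, 0.0, 0.0)
print(gx, gy, gz)  # gx is about -9.8 m/s^2, pointing toward the origin
```

Only the radial component is non-zero, which is the numerical counterpart of the field being perpendicular to the spherical equipotentials.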
the path taken between two points $a$ and $b$. Consider two possible paths
between $a$ and $b$ as shown in figure 2.9. The line integral from $a$ to $b$ via
route 1 is equal and opposite to the line integral back from $b$ to $a$ via
route 2 if the gravitational field is conservative as shown earlier.
Figure 2.9: Circulation of the gravitational field.
A better way of expressing this is that the line integral of the gravitational
field is zero around any closed path. Thus the line integral between
$a$ and $b$, via path 1, and returning back to $a$, via path 2, are equal and
opposite. That is, the net line integral for a closed loop is zero
$$\oint \mathbf{g}\cdot d\mathbf{l} = 0 \tag{2.171}$$
which is a measure of the circulation of the gravitational field. The fact that the circulation equals zero
corresponds to the statement that the gravitational field is radial for a point mass.
Stokes Theorem, discussed in appendix 3, states that
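$$\oint_C \mathbf{F}\cdot d\mathbf{l} = \int_S (\nabla \times \mathbf{F})\cdot d\mathbf{S} \tag{2.172}$$
$$\nabla \times \mathbf{g} = 0 \tag{2.174}$$
$$\nabla \times \nabla\phi = 0 \tag{2.175}$$
$$\mathbf{g} = -\nabla\phi \tag{2.176}$$
Thus $\phi$ is consistent with the above definition of gravitational potential in that the scalar product
$$\Delta\phi_{a\rightarrow b} = -\int_a^b \mathbf{g}\cdot d\mathbf{l} = \int_a^b (\nabla\phi)\cdot d\mathbf{l} = \int_a^b \sum_i \frac{\partial\phi}{\partial x_i}dx_i = \int_a^b d\phi \tag{2.177}$$
An identical relation between the electric field and electric potential applies for the inverse-square law
electrostatic field.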
Reference potentials:
Note that only differences in potential energy, $U$, and gravitational potential, $\phi$, are meaningful; the absolute
values depend on some arbitrarily chosen reference. However, often it is useful to measure gravitational
potential with respect to a particular arbitrarily chosen reference point, such as sea level. Aircraft
pilots are required to set their altimeters to read with respect to sea level rather than their departure
airport. This ensures that aircraft leaving from, say, both Rochester (559') and Denver (5000') have
their altimeters set to a common reference to ensure that they do not collide. The gravitational field is minus the
gradient of the gravitational potential, which depends only on differences in potential, and thus is independent of
any constant reference.
Consider a closed surface where the direction of the surface vector $d\mathbf{S}$ is defined as outwards. The net
flux out of this closed surface is given by
$$\Phi = -Gm\oint \frac{\hat{\mathbf{r}}\cdot d\mathbf{S}}{r^2} = -Gm\oint d\Omega = -4\pi G m \tag{2.183}$$
This is independent of where the point mass lies within the closed surface or on the shape of the closed
surface. Note that the solid angle subtended is zero if the point mass lies outside the closed surface. Thus
the flux is as given by equation 2.183 if the mass is enclosed by the closed surface, while it is zero if the mass
is outside of the closed surface.
Since the flux for a point mass is independent of the location of the mass within the volume enclosed by
the closed surface, and using the principle of superposition for the gravitational field, then for $n$ enclosed
point masses the net flux is
$$\Phi \equiv \int_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\sum_i^n m_i \tag{2.184}$$
This can be extended to continuous mass distributions, with local mass density $\rho$, giving that the net flux
$$\Phi \equiv \int_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_V \rho\,dV \tag{2.185}$$
or
$$\int_V \left[\nabla\cdot\mathbf{g} + 4\pi G\rho\right]dV = 0 \tag{2.187}$$
This is true independent of the shape of the surface, thus the divergence of the gravitational field
$$\nabla\cdot\mathbf{g} = -4\pi G\rho \tag{2.188}$$
This is a statement that the gravitational field of a point mass has a $\frac{1}{r^2}$ dependence.
Using the fact that the gravitational field is conservative, this can be expressed as the gradient of the
gravitational potential $\phi$
$$\mathbf{g} = -\nabla\phi \tag{2.189}$$
and Gauss's law then becomes
$$\nabla\cdot\nabla\phi = 4\pi G\rho \tag{2.190}$$
which also can be written as Poisson's equation
$$\nabla^2\phi = 4\pi G\rho \tag{2.191}$$
Knowing the mass distribution $\rho$ allows determination of the potential by solving Poisson's equation.
A special case that often is encountered is when the mass distribution is zero in a given region. Then the
potential for this region can be determined by solving Laplace's equation with known boundary conditions.
$$\nabla^2\phi = 0 \tag{2.192}$$
For example, Laplace's equation applies in the free space between the masses. It is used extensively in
electrostatics to compute the electric potential between charged conductors which themselves are equipotentials.
An elegant way to express Newton's Law of Gravitation is in terms of the flux and circulation of the
gravitational field. That is,
Flux:
$$\Phi \equiv \int_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_V \rho\,dV \tag{2.194}$$
Circulation:
$$\oint \mathbf{g}\cdot d\mathbf{l} = 0 \tag{2.195}$$
The flux and circulation are better expressed in terms of the vector differential concepts of divergence
and curl.
Divergence:
$$\nabla\cdot\mathbf{g} = -4\pi G\rho \tag{2.196}$$
Curl:
$$\nabla \times \mathbf{g} = 0 \tag{2.197}$$
Remember that the flux and divergence of the gravitational field are statements that the field between
point masses has a $\frac{1}{r^2}$ dependence. The circulation and curl are statements that the field between point
masses is radial.
Because the gravitational field is conservative it is possible to use the concept of the scalar potential
field $\phi$. This concept is especially useful for solving some problems since the gravitational potential can be
evaluated using the scalar integral
$$\Delta\phi_{\infty\rightarrow r} = -G\int_V \frac{\rho(\mathbf{r}')\,dV'}{|\mathbf{r} - \mathbf{r}'|} \tag{2.198}$$
An alternate approach is to solve Poisson's equation if the boundary values and mass distributions are known
where Poisson's equation is:
$$\nabla^2\phi = 4\pi G\rho \tag{2.199}$$
These alternate expressions of Newton's law of gravitation can be exploited to solve problems. The
method of solution is identical to that used in electrostatics.
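As an illustration of how Gauss's law exploits symmetry, the sketch below evaluates the radial field of a uniform solid sphere. Applying equation 2.194 to a concentric spherical surface of radius $r$ gives $4\pi r^2 g = -4\pi G M_{enc}$, so only the enclosed mass matters. The uniform density, the function name, and the approximate Earth values are assumptions made for the example.

```python
G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2


def g_uniform_sphere(r, M, R):
    """Radial field g(r) of a uniform sphere of mass M and radius R.

    Negative values point toward the center. Uses Gauss's law with the
    enclosed mass M (r/R)^3 inside the sphere and M outside it.
    """
    if r == 0.0:
        return 0.0  # by symmetry the field vanishes at the center
    enclosed = M * (r / R) ** 3 if r < R else M
    return -G * enclosed / r ** 2


M_earth = 5.972e24   # kg, approximate
R_earth = 6.371e6    # m, approximate
g_surface = g_uniform_sphere(R_earth, M_earth, R_earth)
g_inside = g_uniform_sphere(0.5 * R_earth, M_earth, R_earth)
g_outside = g_uniform_sphere(2.0 * R_earth, M_earth, R_earth)
print(g_surface, g_inside, g_outside)
```

Inside the sphere the field grows linearly with $r$, while outside it falls off as $\frac{1}{r^2}$, exactly as for a point mass at the center.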
and then
$$\mathbf{g} = -\nabla\phi$$
c) The obvious spherical symmetry can be used in conjunction
with Gauss's law to easily solve this problem.
$$\int_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_V \rho\,dV$$
2.15 Summary
Newton’s Laws of Motion:
A cursory review of Newtonian mechanics has been presented. The concept of inertial frames of reference
was introduced since Newton’s laws of motion apply only to inertial frames of reference.
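Newton's Law of motion
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{2.6}$$
leads to second-order equations of motion which can be difficult to handle for many-body systems.
Solution of Newton's second-order equations of motion can be simplified using the three first-order
integrals coupled with corresponding conservation laws. The first-order time integral for linear momentum
is
$$\int_1^2 \mathbf{F}\,dt = \int_1^2 \frac{d\mathbf{p}}{dt}\,dt = (\mathbf{p}_2 - \mathbf{p}_1) \tag{2.10}$$
The first-order time integral for angular momentum is
$$\frac{d\mathbf{L}}{dt} = \mathbf{r} \times \frac{d\mathbf{p}}{dt} = \mathbf{N} \qquad \int_1^2 \mathbf{N}\,dt = \int_1^2 \frac{d\mathbf{L}}{dt}\,dt = (\mathbf{L}_2 - \mathbf{L}_1) \tag{2.16}$$
The first-order spatial integral is related to kinetic energy and the concept of work. That is
$$\mathbf{F} = \frac{dT}{d\mathbf{r}} \qquad \int_1^2 \mathbf{F}\cdot d\mathbf{r} = (T_2 - T_1) \tag{2.21}$$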
The conditions that lead to conservation of linear and angular momentum and total mechanical energy
were discussed for many-body systems. The important class of conservative forces was shown to apply if
the position-dependent forces do not depend on time or velocity, and if the work done by a force, $\int_1^2 \mathbf{F}\cdot d\mathbf{r}$,
is independent of the path taken between the initial and final locations. The total mechanical energy is a
constant of motion when the forces are conservative.
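As a numerical illustration of the first-order spatial integral, the following minimal Python sketch accumulates the work $\int \mathbf{F}\cdot d\mathbf{r}$ along a one-dimensional trajectory and compares it with the change in kinetic energy. The linear spring force, the mass, the time step, and the function name are illustrative assumptions and are not taken from the text.

```python
import math

def simulate_work_and_kinetic_energy(m=2.0, k=8.0, x0=1.0, v0=0.0,
                                     dt=1e-4, steps=5000):
    """Integrate m*x'' = -k*x with velocity Verlet (an assumed test force).

    Returns (work, delta_T): the accumulated work sum(F*dx) and the
    change in kinetic energy between the first and last step.
    """
    def force(x):
        return -k * x

    x, v = x0, v0
    work = 0.0
    t_initial = 0.5 * m * v * v
    for _ in range(steps):
        f_old = force(x)
        x_new = x + v * dt + 0.5 * (f_old / m) * dt * dt
        f_new = force(x_new)
        # Trapezoidal estimate of F dx over this step
        work += 0.5 * (f_old + f_new) * (x_new - x)
        v += 0.5 * (f_old + f_new) / m * dt
        x = x_new
    t_final = 0.5 * m * v * v
    return work, t_final - t_initial

work, delta_T = simulate_work_and_kinetic_energy()
print(work, delta_T)
```

The two printed numbers should agree to within the small integration error of the scheme, which is the content of the work-energy relation for this assumed force.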
It was shown that the motion of the center of mass of a many-body system or finite-sized body separates naturally
for all three first-order integrals. The center of mass is that point about which
$$\sum_i m_i\mathbf{r}'_i = \int \mathbf{r}'\rho\,dV = 0 \qquad \text{(Centre of mass definition)}$$
where $\mathbf{r}'$ is the vector defining the location of mass with respect to the center of mass. The concept of
center of mass greatly simplifies the description of the motion of finite-sized bodies and many-body systems
by separating out the important internal interactions, and corresponding underlying physics, from the trivial
overall translational motion of a many-body system.
The Virial theorem states that the time-averaged properties are related by
$$\langle T\rangle = -\frac{1}{2}\left\langle \sum_i \mathbf{F}_i\cdot\mathbf{r}_i\right\rangle \qquad (2.86)$$
It was shown that the Virial theorem is useful for relating the time-averaged kinetic and potential energies,
especially for cases involving either linear or inverse-square forces.
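The Virial theorem can be checked numerically for the simplest linear force. The sketch below, in Python, averages the kinetic energy and $\mathbf{F}\cdot\mathbf{r}$ over one period of an undamped one-dimensional harmonic oscillator; the parameter values and the function name are assumptions chosen only for illustration.

```python
import math

def virial_averages(m=1.0, k=4.0, amplitude=0.5, samples=100000):
    """Time-average T and F*x over one period of x = A cos(omega t)."""
    omega = math.sqrt(k / m)
    period = 2 * math.pi / omega
    sum_T = 0.0
    sum_Fx = 0.0
    for i in range(samples):
        t = period * i / samples
        x = amplitude * math.cos(omega * t)
        v = -amplitude * omega * math.sin(omega * t)
        sum_T += 0.5 * m * v * v      # kinetic energy
        sum_Fx += (-k * x) * x        # F . r for the linear restoring force
    return sum_T / samples, sum_Fx / samples

mean_T, mean_Fx = virial_averages()
print(mean_T, -0.5 * mean_Fx)
```

For a linear restoring force the right-hand side equals the time-averaged potential energy, so the two printed values illustrate that the average kinetic and potential energies are equal in this case.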
Typical examples were presented of application of Newton’s equations of motion to solving systems
involving constant, linear, position-dependent, velocity-dependent, and time-dependent forces, to constrained
and unconstrained systems, as well as systems with variable mass. Rigid-body rotation about a body-fixed
rotation axis also was discussed.
It is important to be cognizant of the following limitations that apply to Newton’s laws of motion:
1) Newtonian mechanics assumes that all observables are measured to unlimited precision, that is,
$\mathbf{p}$ and $\mathbf{r}$ are known exactly. Quantum physics introduces limits to measurement due to wave-particle duality.
2) The Newtonian view is that time and position are absolute concepts. The Theory of Relativity shows
that this is not true. Fortunately for most problems $v \ll c$ and thus Newtonian mechanics is an excellent
approximation.
3) Another limitation, to be discussed later, is that it is impractical to solve the equations of motion for
many interacting bodies such as all the molecules in a gas. Then it is necessary to resort to using statistical
averages; this approach is called statistical mechanics.
Newton’s work constitutes a theory of motion in the universe that introduces the concept of causality.
Causality means that there is a one-to-one correspondence between cause and effect. Each force causes a known
effect that can be calculated. Thus the causal universe is pictured by philosophers to be a giant machine
whose parts move like clockwork in a predictable and predetermined way according to the laws of nature. This
is a deterministic view of nature. There are philosophical problems in that such a deterministic viewpoint
appears to be contrary to free will. That is, taken to the extreme it implies that you were predestined to
read this book because it is a natural consequence of this mechanical universe!
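Quantity | Gravitation | Electrostatics
Force field | $\mathbf{g} \equiv \mathbf{F}_G/m$ | $\mathbf{E} \equiv \mathbf{F}_E/q$
Density | Mass density $\rho(\mathbf{r}')$ | Charge density $\rho(\mathbf{r}')$
Conservative central field | $\mathbf{g}(\mathbf{r}) = -G\int_V \frac{\rho(\mathbf{r}')\,\widehat{(\mathbf{r}-\mathbf{r}')}}{(\mathbf{r}-\mathbf{r}')^2}\,dv'$ | $\mathbf{E}(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\int_V \frac{\rho(\mathbf{r}')\,\widehat{(\mathbf{r}-\mathbf{r}')}}{(\mathbf{r}-\mathbf{r}')^2}\,dv'$
Flux | $\Phi \equiv \oint_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_V \rho\,dv$ | $\Phi \equiv \oint_S \mathbf{E}\cdot d\mathbf{S} = \frac{1}{\varepsilon_0}\int_V \rho\,dv$
Circulation | $\oint \mathbf{g}\cdot d\mathbf{l} = 0$ | $\oint \mathbf{E}\cdot d\mathbf{l} = 0$
Divergence | $\nabla\cdot\mathbf{g} = -4\pi G\rho$ | $\nabla\cdot\mathbf{E} = \frac{1}{\varepsilon_0}\rho$
Curl | $\nabla\times\mathbf{g} = 0$ | $\nabla\times\mathbf{E} = 0$
Potential | $\Delta\phi_{\infty\rightarrow r} = -G\int_V \frac{\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,dv'$ | $\Delta\phi_{\infty\rightarrow r} = \frac{1}{4\pi\varepsilon_0}\int_V \frac{\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,dv'$
Poisson’s equation | $\nabla^2\phi = 4\pi G\rho$ | $\nabla^2\phi = -\frac{1}{\varepsilon_0}\rho$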
Both the gravitational and electrostatic central fields are conservative, making it possible to use the
concept of the scalar potential field $\phi$. This concept is especially useful for solving some problems since the
potential can be evaluated using a scalar integral. An alternate approach is to solve Poisson’s equation if the
boundary values and mass distributions are known. The methods of solution of Newton’s law of gravitation
are identical to those used in electrostatics and are readily accessible in the literature.
Workshop exercises
1. Spend a few minutes looking over the following problems, paying particular attention to the problems that
you think you might have trouble with. All of the problems are taken from an introductory physics course on
mechanics, so this should seem like review material. After you have had some time to look over the problems,
you will take turns stepping up to the board to solve one. When it is your turn, you may pick ANY of the
problems that have not already been solved. Depending on the number of students in the recitation, you may
be asked to solve more than one problem. Good luck!
(a) Justin fires a 12-gram bullet into a block of wood. The bullet travels at 190 m/s, penetrates the 2.0-kg
block of wood, and emerges going 150 m/s. If the block is stationary on a frictionless surface when hit,
how fast does it move after the bullet emerges?
(b) A mass $m$ at the end of a spring vibrates with a frequency of 0.88 Hz; when an additional 1.25 kg mass
is added to $m$, the frequency is 0.48 Hz. What is the value of $m$?
(c) Dan has a new chandelier in his living room. The chandelier is 27-kg and it hangs from the ceiling on a
vertical 4.0-m-long wire. What horizontal force would Dan need to use to displace its position 0.10 m to
one side? What will be the tension in the wire?
(d) Dianne has a new spring with a spring constant of 900 N/m that she bought at Springs-R-Us. She places
it vertically on a table and compresses it by 0.150 m. What upward speed can it give to a 0.300-kg ball
when released?
(e) A tiger leaps horizontally from a 6.5-m-high rock with a speed of 4.0 m/s. How far from the base of the
rock will she land?
(f) How much work must SuperRyan do to stop a 1300-kg car traveling at 100 km/hr?
(g) Jason catches a baseball 3.1 s after throwing it vertically upward. With what speed did he throw it and
what height did it reach?
(h) Laura is practicing her figure skating and during her finale she can increase her rotation rate from an
initial rate of 1.0 rev every 2.0 s to a final rate of 3.0 rev/s. If her initial moment of inertia was 4.6 kg·m²,
what is her final moment of inertia?
(i) On an icy day in Rochester (imagine that!), you worry about parking your car in your driveway, which
has an incline of 12°. Your neighbor Emily’s driveway has an incline of 9°, and Brian’s driveway across
the street has one of 6°. The coefficient of static friction between tire rubber and ice is 0.15. Which
driveway(s) will be safe to park a car?
2. Two particles are projected from the same point with velocities $v_1$ and $v_2$, at elevations $\alpha_1$ and $\alpha_2$, respectively
($\alpha_1 > \alpha_2$). Show that if they are to collide in mid-air the interval between the firings must be
$$\frac{2v_1v_2\sin(\alpha_1-\alpha_2)}{g\,(v_1\cos\alpha_1 + v_2\cos\alpha_2)}$$
(If you don’t have time to solve this problem completely, then at least give an outline of how you would go
about solving the problem.)
3. Read each of the following statements and, without consulting anyone else, mark them true or false. If you are
unsure of any of them, make a guess. Once everyone has answered each of the statements individually, break
into small groups and compare your answers. Try to come to an agreement as a group. The Teaching Assistant
will then make sure everyone has the correct answer. Good luck!
(a) The conservation of linear momentum is a consequence of translational symmetry, or the homogeneity of
space.
(b) For an isolated system with no external forces acting on it, the angular momentum will remain constant
in both magnitude and direction.
(c) A reference frame is called an inertial frame if Newton’s laws are valid in that frame.
(d) Newtonian mechanics and the laws of electromagnetism are invariant under Galilean transformations.
(e) The law of conservation of angular momentum is a consequence of rotational symmetry, or the isotropy
of space.
(f) The center of mass of a system of particles moves like a single particle of mass $M$ (total mass of the
system) acted on by a single force that is equal to the sum of all the external forces acting on the
system.
(g) If Newton’s laws are valid in one reference frame, then they are also valid in any reference frame accelerated
with respect to the first system.
(h) The law of conservation of energy is a consequence of inversion symmetry, or the invertibility of space.
4. The teeter toy comprises two identical weights which hang on drooping arms attached to a peg as shown.
The arrangement is unexpectedly stable and can be spun and rocked with little danger of toppling over.
[Figure: teeter toy with a peg of length L and two drooping arms of length l, each carrying a weight of mass m.]
(a) Find an expression for the potential energy of the teeter toy as a function of $\theta$ when the teeter toy is
cocked at an angle $\theta$ about the pivot point. For simplicity, consider only rocking motion in the vertical
plane.
(b) Determine the equilibrium value(s) of $\theta$.
(c) Determine whether the equilibrium is stable, unstable, or neutral for the value(s) of $\theta$ found in part (b).
(d) How could you determine the answers to parts (b) and (c) from a graph of the potential energy versus $\theta$?
(e) Expand the expression for the potential energy about $\theta = 0$ and determine the frequency of small
oscillations.
5. For each of the situations described below, determine which of the four functional forms of the force is most
appropriate. Consider motion only along one dimension.
Go around the room and take turns answering a question. When it is your turn, pick a functional form and
explain why you chose the one you did. If you are unsure, make a guess or ask a question to get help from the
rest of the workshop. There may be more than one answer depending on your interpretation of the situation,
so be sure to explore all of the possibilities.
(a) A mass resting on a frictionless table is attached to a spring, which in turn is attached to a wall. The
mass is pulled to the side and executes simple harmonic motion in the horizontal direction.
(b) A freely-falling body subject to a constant gravitational field with no air resistance.
(c) An electron, initially at rest (treat it classically!), encounters an incoming electromagnetic wave of electric
field intensity given by = 0 sin( + ).
(d) A large mass is affected by the gravitational field of another mass a distance away.
(e) A freely-falling body subject to a constant gravitational field with air resistance.
(f) A charged point particle is affected by the presence of another charged point particle a distance away.
6. A particle of mass $m$ is constrained to move on the frictionless inner surface of a cone of half-angle $\alpha$,
as shown in the figure.
(a) Find the restrictions on the initial conditions such that the particle moves in a circular orbit about the
vertical axis.
(b) Determine whether this kind of orbit is stable.
(a) Draw gravitational field lines and equipotential lines for the rod. What can you say about the equipotential
surfaces of the rod?
(b) Calculate the gravitational potential at a point that is a distance from one end of the rod and in a
direction perpendicular to the rod.
(c) Calculate the gravitational field at by direct integration.
(d) Could you have used Gauss’s law to find the gravitational field at ? Why or why not?
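9. Consider a fluid with density $\rho$ and velocity $\mathbf{v}$ in some volume $V$. The mass current $\mathbf{J} = \rho\mathbf{v}$ determines the
amount of mass exiting the surface per unit time by the integral $\oint_S \mathbf{J}\cdot d\mathbf{A}$.
(a) Using the divergence theorem, prove the continuity equation, $\nabla\cdot\mathbf{J} + \frac{\partial\rho}{\partial t} = 0$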
10. A rocket of initial mass burns fuel at constant rate (kilograms per second), producing a constant force .
The total mass of available fuel is . Assume the rocket starts from rest and moves in a fixed direction with
no external forces acting on it.
Problems
1. Consider a solid hemisphere of radius . Compute the coordinates of the center of mass relative to the center
of the spherical surface used to define the hemisphere.
2. A 2000 kg Ford was travelling south on Mt. Hope Avenue when it collided with your 1000 kg sports car travelling
west on Elmwood Avenue. The two badly-damaged cars became entangled in the collision and left a skid mark
that is 20 meters long in a direction 14° to the west of the original direction of travel of the Excursion. The
wealthy Excursion driver hires a high-powered lawyer who accuses you of speeding through the intersection.
Use your P235 knowledge, plus the police officer’s report of the recoil direction, the skid length, and knowledge
that the coefficient of sliding friction between the tires and road is $\mu = 0.6$, to deduce the original velocities of
both cars. Was either of the cars exceeding the 30 mph speed limit?
3. A particle of mass $m$ moving in one dimension has potential energy $U(x) = U_0[2(x/a)^2 - (x/a)^4]$ where $U_0$ and
$a$ are positive constants.
a) Find the force $F(x)$ that acts on the particle.
b) Sketch $U(x)$. Find the positions of stable and unstable equilibrium.
c) What is the angular frequency $\omega$ of oscillations about the point of stable equilibrium?
d) What is the minimum speed the particle must have at the origin to escape to infinity?
e) At $t = 0$ the particle is at the origin and its velocity is positive and equal to the escape velocity. Find $x(t)$
and sketch the result.
4. a) Consider a single-stage rocket travelling in a straight line subject to an external force $F^{ext}$ acting along the
same line where $v_{ex}$ is the exhaust velocity of the ejected fuel relative to the rocket. Show that the equation of
motion is
$$m\dot{v} = -\dot{m}v_{ex} + F^{ext}$$
b) Specialize to the case of a rocket taking off vertically from rest in a uniform gravitational field $g$. Assume
that the rocket ejects mass at a constant rate of $\dot{m} = -k$ where $k$ is a positive constant. Solve the equation of
motion to derive the dependence of velocity on time.
c) The first couple of minutes of the launch of the Space Shuttle can be described roughly by: initial mass
$= 2\times10^6$ kg, mass after 2 minutes $= 1\times10^6$ kg, exhaust speed $v_{ex} = 3000$ m/s, and initial velocity zero.
Estimate the velocity of the Space Shuttle after two minutes of flight.
d) Describe what would happen to a rocket where ̇
5. A time independent field $\mathbf{F}$ is conservative if $\nabla\times\mathbf{F} = 0$. Use this fact to test if the following fields are
conservative, and derive the corresponding potential $U$.
a) = + + = + = +
b) = −− = ln = − +
6. Consider a solid cylinder of mass and radius sliding without rolling down the smooth inclined face of a
wedge of mass that is free to slide without friction on a horizontal plane floor. Use the coordinates shown
in the figure.
a) How far has the wedge moved by the time the cylinder has descended from rest a vertical distance ?
b) Now suppose that the cylinder is free to roll down the wedge without slipping. How far does the wedge
move in this case if the cylinder rolls down a vertical distance ?
c) In which case does the cylinder reach the bottom faster? How does this depend on the radius of the cylinder?
[Figure: cylinder on the inclined face of the wedge, with coordinate axes x and y.]
7. If the gravitational field vector is independent of the radial distance within a sphere, find the function describing
the mass density $\rho(r)$ of the sphere.
Chapter 3
Linear oscillators
3.1 Introduction
Oscillations are a ubiquitous feature in nature. Examples are periodic motion of planets, the rise and fall
of the tides, water waves, pendulum in a clock, musical instruments, sound waves, electromagnetic waves,
and wave-particle duality in quantal physics. Oscillatory systems all have the same basic mathematical form
although the names of the variables and parameters are different. The classical linear theory of oscillations
will be assumed in this chapter since: (1) The linear approximation is well obeyed when the amplitudes of
oscillation are small, that is, the restoring force obeys Hooke’s Law. (2) The Principle of Superposition
applies. (3) The linear theory allows most problems to be solved explicitly in closed form. This is in contrast
to non-linear systems where the motion can be complicated and even chaotic, as discussed in chapter 4.
$$\mathbf{F} = -\nabla U \qquad (3.1)$$
Thus these linear combinations also satisfy the general linear equation
L() = () (3.14)
Applicability of the Principle of Superposition to a system provides a tremendous advantage for handling
and solving the equations of motion of oscillatory systems.
Figure 3.2: Configuration plots of $(x, y)$ where $x = \cos(4t)$ and $y = \cos(5t - \delta)$ at four different phase values
$\delta$. The curves are called Lissajous figures.
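Curves of this kind are straightforward to generate numerically. The following Python sketch tabulates the points $(\cos 4t, \cos(5t-\delta))$ over one full cycle; the four phase values used here are assumptions, since the values used in the figure are not stated in the caption, and the function name is hypothetical.

```python
import math

def lissajous_points(delta, n_points=1000, freq_x=4, freq_y=5):
    """Return (x, y) samples of x = cos(freq_x t), y = cos(freq_y t - delta)
    for t running over one full cycle, 0 <= t <= 2*pi."""
    points = []
    for i in range(n_points + 1):
        t = 2 * math.pi * i / n_points
        points.append((math.cos(freq_x * t), math.cos(freq_y * t - delta)))
    return points

# Assumed phase values, chosen only for illustration
phases = (0.0, math.pi / 4, math.pi / 2, math.pi)
curves = {delta: lissajous_points(delta) for delta in phases}
for delta, pts in curves.items():
    print(round(delta, 3), pts[0], pts[-1])
```

Because the two frequencies are integers, each curve closes on itself after $t$ advances by $2\pi$, which is why the first and last points coincide.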
2 2
+ ¡ 2 ¢ = 1 (3.20)
2
the name "state space" in common with reference [Ta05]. Lanczos [La49] uses the term "state space" to refer to the extended
phase space (q p) discussed in chapter 17
$$\mathbf{F}_D(\mathbf{v}) = -f(v)\,\hat{\mathbf{v}} \qquad (3.24)$$
where the velocity dependent function $f(v)$ can be complicated. Fortunately there is a very large class of
problems in electricity and magnetism, classical mechanics, molecular, atomic, and nuclear physics, where
the damping force depends linearly on velocity which greatly simplifies solution of the equations of motion.
This chapter discusses the special case of linear damping.
Consider the free simple harmonic oscillator, that is, assuming no oscillatory forcing function, with a
linear damping term $\mathbf{F}_D(\mathbf{v}) = -b\mathbf{v}$ where the parameter $b$ is the damping factor. Then the equation of
motion is
$$-kx - b\dot{x} = m\ddot{x} \qquad (3.25)$$
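The general solution to the linearly-damped free oscillator is obtained by inserting the complex trial
solution $z = z_0e^{i\omega t}$. Then
$$(i\omega)^2 z_0e^{i\omega t} + i\omega\Gamma z_0e^{i\omega t} + \omega_0^2 z_0e^{i\omega t} = 0 \qquad (3.29)$$
$$\omega^2 - i\omega\Gamma - \omega_0^2 = 0 \qquad (3.30)$$
The solution is
$$\omega_\pm = i\frac{\Gamma}{2} \pm \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \qquad (3.31)$$
The two solutions $\omega_\pm$ are complex conjugates and thus the solutions of the damped free oscillator are
$$z = z_1 e^{i\left(i\frac{\Gamma}{2} + \sqrt{\omega_0^2 - (\Gamma/2)^2}\right)t} + z_2 e^{i\left(i\frac{\Gamma}{2} - \sqrt{\omega_0^2 - (\Gamma/2)^2}\right)t} \qquad (3.32)$$
where
$$\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \qquad (3.34)$$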
Underdamped motion $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 > 0$
When $\omega_1^2 > 0$ then the square root is real so the solution can be written taking the real part of $z$ which
gives that equation 3.33 equals
$$x(t) = Ae^{-\frac{\Gamma}{2}t}\cos(\omega_1 t - \beta) \qquad (3.35)$$
where $A$ and $\beta$ are adjustable constants fit to the initial conditions. Therefore the velocity is given by
$$\dot{x}(t) = -Ae^{-\frac{\Gamma}{2}t}\left[\omega_1\sin(\omega_1 t - \beta) + \frac{\Gamma}{2}\cos(\omega_1 t - \beta)\right] \qquad (3.36)$$
This is the damped sinusoidal oscillation illustrated in figure 3.5. The solution has the following
characteristics:
a) The oscillation amplitude decreases exponentially with a time constant $\tau_D = \frac{2}{\Gamma}$.
b) There is a small reduction in the frequency of the oscillation due to the damping leading to $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$.
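The underdamped solution can be verified numerically. The Python sketch below assumes the displacement has the form $x(t) = Ae^{-\Gamma t/2}\cos(\omega_1 t - \beta)$ with the velocity of equation 3.36, and checks by finite differences that $\ddot{x} + \Gamma\dot{x} + \omega_0^2 x$ vanishes. The function names and parameter values are illustrative assumptions.

```python
import math

def underdamped_state(t, amplitude, beta, gamma, omega0):
    """Return (x, v) for the underdamped free oscillator (gamma/2 < omega0)."""
    omega1 = math.sqrt(omega0**2 - (gamma / 2)**2)
    envelope = amplitude * math.exp(-gamma * t / 2)
    phase = omega1 * t - beta
    x = envelope * math.cos(phase)
    v = -envelope * (omega1 * math.sin(phase) + (gamma / 2) * math.cos(phase))
    return x, v

def equation_of_motion_residual(t, amplitude, beta, gamma, omega0, h=1e-4):
    """Evaluate x'' + gamma*x' + omega0^2*x, with x'' obtained from a
    central difference of the analytic velocity."""
    x, v = underdamped_state(t, amplitude, beta, gamma, omega0)
    _, v_plus = underdamped_state(t + h, amplitude, beta, gamma, omega0)
    _, v_minus = underdamped_state(t - h, amplitude, beta, gamma, omega0)
    accel = (v_plus - v_minus) / (2 * h)
    return accel + gamma * v + omega0**2 * x

print(equation_of_motion_residual(0.7, 1.0, 0.3, 0.4, 2.0))
```

The printed residual should be zero to within the finite-difference error, which supports the stated amplitude decay constant $2/\Gamma$ and the shifted frequency $\omega_1$.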
Figure 3.5: The amplitude-time dependence and state-space diagrams for the free linearly-damped harmonic
oscillator. The upper row shows the underdamped system for the case with damping Γ = 50 . The lower
row shows the overdamped ($\frac{\Gamma}{2} > \omega_0$) [solid line] and critically damped ($\frac{\Gamma}{2} = \omega_0$) [dashed line] in both cases
assuming that initially the system is at rest.
Figure 3.6: Real and imaginary solutions $\omega_\pm$ of the damped harmonic oscillator. A phase transition occurs
at $\Gamma = 2\omega_0$. For $\Gamma < 2\omega_0$ (dashed) the two solutions are complex conjugates and imaginary. For $\Gamma > 2\omega_0$
(solid), there are two real solutions $\omega_+$ and $\omega_-$ with widely different decay constants where $\omega_+$ dominates
the decay at long times.
Overdamped case $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 < 0$
In this case the square root of $\omega_1^2$ is imaginary and can be expressed as $\omega_1' = \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}$. Therefore the
solution is obtained more naturally by using a real trial solution $z = z_0e^{-\omega t}$ in equation 3.33 which leads to
two roots
$$\omega_\pm = -\left[-\frac{\Gamma}{2} \pm \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}\right]$$
Thus the exponentially damped decay has two time constants $1/\omega_+$ and $1/\omega_-$
$$x(t) = \left[A_1e^{-\omega_+ t} + A_2e^{-\omega_- t}\right] \qquad (3.37)$$
The time constant $\frac{1}{\omega_-} < \frac{1}{\omega_+}$ thus the second term $A_2e^{-\omega_- t}$ in the bracket decays in a shorter time than the
first term $A_1e^{-\omega_+ t}$. As illustrated in figure 3.6 the decay rate, which is imaginary when underdamped, i.e.
$\frac{\Gamma}{2} < \omega_0$, bifurcates into two real values $\omega_\pm$ for overdamped, i.e. $\frac{\Gamma}{2} > \omega_0$. At large times the dominant term
when overdamped is for $\omega_+$ which has the smallest decay rate, that is, the longest decay constant $\tau_+ = \frac{1}{\omega_+}$.
There is no oscillatory motion for the overdamped case; it slowly moves monotonically to zero as shown in
figure 3.5. The amplitude decays away with a time constant that is longer than $\frac{2}{\Gamma}$.
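The two overdamped decay rates are easy to tabulate. The short Python sketch below evaluates $\omega_\pm = \frac{\Gamma}{2} \mp \sqrt{(\Gamma/2)^2 - \omega_0^2}$, which is the pair of roots quoted above written without the outer bracket; the function name and the sample values $\Gamma = 5$, $\omega_0 = 2$ are assumptions for illustration.

```python
import math

def overdamped_rates(gamma, omega0):
    """Return (omega_plus, omega_minus), the slow and fast decay rates
    of the overdamped free oscillator."""
    if gamma / 2 <= omega0:
        raise ValueError("overdamped motion requires gamma/2 > omega0")
    root = math.sqrt((gamma / 2)**2 - omega0**2)
    omega_plus = gamma / 2 - root    # smallest rate, dominates at long times
    omega_minus = gamma / 2 + root   # largest rate, dies away first
    return omega_plus, omega_minus

omega_plus, omega_minus = overdamped_rates(gamma=5.0, omega0=2.0)
print("decay rates:", omega_plus, omega_minus)
print("time constants:", 1 / omega_plus, 1 / omega_minus, "versus 2/Gamma =", 2 / 5.0)
```

For these sample values the slow time constant $1/\omega_+$ exceeds $2/\Gamma$ while the fast one is shorter, so the long-time behaviour is governed by the $\omega_+$ term.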
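Critically damped $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 = 0$
This is the limiting case where $\frac{\Gamma}{2} = \omega_0$. For this case the solution is of the form
$$x(t) = (A + Bt)\,e^{-\frac{\Gamma}{2}t}$$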
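$\frac{1}{2}$, gives the time-averaged total energy as
$$\langle E\rangle = e^{-\Gamma t}\left(\frac{1}{4}mA^2\omega_1^2 + \frac{1}{4}mA^2\left(\frac{\Gamma}{2}\right)^2 + \frac{1}{4}mA^2\omega_0^2\right) \qquad (3.43)$$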
$$\ddot{x} + \Gamma\dot{x} + \omega_0^2x = \frac{F(t)}{m} \qquad (3.48)$$
where $F(t)$ is the driving force. For mathematical simplicity the driving force is chosen to be a sinusoidal
harmonic force. The solution of this second-order differential equation comprises two components, the
complementary solution (transient response), and the particular solution (steady-state response).
which is identical to the solution of the free linearly-damped harmonic oscillator. As discussed in section 3.5
the solution of the linearly-damped free oscillator is given by the real part of the complex variable $z$ where
$$z = e^{-\frac{\Gamma}{2}t}\left[z_1e^{i\omega_1 t} + z_2e^{-i\omega_1 t}\right] \qquad (3.50)$$
and
$$\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \qquad (3.51)$$
Underdamped motion $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 > 0$: When $\omega_1^2 > 0$ then the square root is real so the transient
solution can be written taking the real part of $z$ which gives
$$x_T(t) = \frac{F_0}{m}e^{-\frac{\Gamma}{2}t}\cos(\omega_1 t) \qquad (3.52)$$
The solution has the following characteristics:
a) The amplitude of the transient solution decreases exponentially with a time constant $\tau_D = \frac{2}{\Gamma}$ while
the energy decreases with a time constant of $\frac{1}{\Gamma}$.
b) There is a small downward frequency shift in that $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$.
Overdamped case $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 < 0$: In this case the square root is imaginary, which can be expressed
as $\omega_1' \equiv \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}$ which is real and the solution is just an exponentially damped one
$$x_T(t) = \frac{F_0}{m}e^{-\frac{\Gamma}{2}t}\left[e^{\omega_1' t} + e^{-\omega_1' t}\right] \qquad (3.53)$$
There is no oscillatory motion for the overdamped case; it slowly moves monotonically to zero. The total
energy decays away with two time constants greater than $\frac{1}{\Gamma}$.
Critically damped $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 = 0$: For this case, as mentioned for the damped free oscillator, the
solution is of the form
$$x_T(t) = (A + Bt)\,e^{-\frac{\Gamma}{2}t} \qquad (3.54)$$
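Thus the particular solution is the real part of the complex variable $z$ which is a solution of
$$\ddot{z} + \Gamma\dot{z} + \omega_0^2z = \frac{F_0}{m}e^{i\omega t} \qquad (3.56)$$
A trial solution is
$$z = z_0e^{i\omega t} \qquad (3.57)$$
This leads to the relation
$$-\omega^2z_0 + i\omega\Gamma z_0 + \omega_0^2z_0 = \frac{F_0}{m} \qquad (3.58)$$
Multiplying the numerator and denominator by the factor $\left(\omega_0^2 - \omega^2\right) - i\Gamma\omega$ gives
$$z_0 = \frac{F_0/m}{\left(\omega_0^2-\omega^2\right) + i\Gamma\omega} = \frac{F_0/m}{\left(\omega_0^2-\omega^2\right)^2 + (\Gamma\omega)^2}\left[\left(\omega_0^2-\omega^2\right) - i\Gamma\omega\right] \qquad (3.59)$$
The steady state solution $x_S(t)$ thus is given by the real part of $z$, that is
$$x_S(t) = \frac{F_0/m}{\left(\omega_0^2-\omega^2\right)^2 + (\Gamma\omega)^2}\left[\left(\omega_0^2-\omega^2\right)\cos\omega t + \Gamma\omega\sin\omega t\right] \qquad (3.60)$$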
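$$\cos\delta = \frac{\omega_0^2-\omega^2}{\sqrt{\left(\omega_0^2-\omega^2\right)^2 + (\Gamma\omega)^2}} \qquad (3.62)$$
and
$$\sin\delta = \frac{\Gamma\omega}{\sqrt{\left(\omega_0^2-\omega^2\right)^2 + (\Gamma\omega)^2}} \qquad (3.63)$$
The phase $\delta$ represents the phase difference between the driving force and the resultant motion. For a fixed $\omega_0$ the
phase $\delta = 0$ when $\omega = 0$ and increases to $\delta = \frac{\pi}{2}$ when $\omega = \omega_0$. For $\omega > \omega_0$ the phase $\delta \rightarrow \pi$ as $\omega \rightarrow \infty$.
Figure 3.7: Phase between driving force and resultant motion.
The steady state solution can be re-expressed in terms of the phase shift $\delta$ as
$$x_S(t) = \frac{F_0/m}{\sqrt{\left(\omega_0^2-\omega^2\right)^2 + (\Gamma\omega)^2}}\left[\cos\delta\cos\omega t + \sin\delta\sin\omega t\right] = \frac{F_0/m}{\sqrt{\left(\omega_0^2-\omega^2\right)^2 + (\Gamma\omega)^2}}\cos(\omega t - \delta) \qquad (3.64)$$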
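The amplitude and phase of the steady-state response can be evaluated directly from the expressions for $\cos\delta$ and $\sin\delta$. The Python sketch below is one way to do this, using `math.atan2` so that $\delta$ lands in the range $0 \le \delta \le \pi$; the function name and the sample parameters are assumptions for illustration.

```python
import math

def steady_state_amplitude_phase(omega, omega0, gamma, f0_over_m=1.0):
    """Return (amplitude, delta) of x_S(t) = amplitude * cos(omega*t - delta)."""
    elastic = omega0**2 - omega**2   # proportional to cos(delta)
    absorptive = gamma * omega       # proportional to sin(delta)
    amplitude = f0_over_m / math.hypot(elastic, absorptive)
    delta = math.atan2(absorptive, elastic)
    return amplitude, delta

omega0, gamma = 2.0, 0.5
for omega in (0.0, 1.0, 2.0, 4.0, 200.0):
    amp, delta = steady_state_amplitude_phase(omega, omega0, gamma)
    print(omega, amp, delta)
```

The printed phase rises from zero at $\omega = 0$, passes through $\pi/2$ at $\omega = \omega_0$, and approaches $\pi$ for driving frequencies far above $\omega_0$.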
Figure 3.8: Amplitude versus time, and state space plots of the transient solution (dashed) and total solution
(solid) for two cases. The upper row shows the case where the driving frequency $\omega = \frac{\omega_1}{5}$ while the lower row
shows the same for the case where the driving frequency $\omega = 5\omega_1$.
For the underdamped case, the transient solution is the complementary solution
$$x_T(t) = \frac{F_0}{m}e^{-\frac{\Gamma}{2}t}\cos(\omega_1 t - \beta) \qquad (3.66)$$
where $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$. The steady-state solution is given by the particular solution
$$x_S(t) = \frac{F_0/m}{\sqrt{\left(\omega_0^2-\omega^2\right)^2 + (\Gamma\omega)^2}}\cos(\omega t - \delta) \qquad (3.67)$$
Note that the frequency of the transient solution is $\omega_1$ which in general differs from the driving frequency
$\omega$. The phase shift $\beta - \delta$ for the transient component is set by the initial conditions. The transient response
leads to a more complicated motion immediately after the driving function is switched on. Figure 3.8
illustrates the amplitude time dependence and state space diagram for the transient component, and the
total response, when the driving frequency is either $\omega = \frac{\omega_1}{5}$ or $\omega = 5\omega_1$. Note that the modulation of the
steady-state response by the transient response is unimportant once the transient response has damped out
leading to a constant elliptical state space trajectory. For cases where the initial conditions are $x = \dot{x} = 0$
then the transient solution has a relative phase difference $\beta - \delta = \pi$ radians at $t = 0$ and relative amplitudes
such that the transient and steady-state solutions cancel at $t = 0$.
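The decay of the transient onto the steady-state solution can be seen by integrating the equation of motion directly. The Python sketch below uses a standard fourth-order Runge-Kutta step (a numerical method chosen here for illustration, not one used in the text) starting from $x = \dot{x} = 0$, and compares the late-time displacement with $x_S(t)$. All parameter values and function names are assumptions.

```python
import math

def steady_state(t, omega, omega0, gamma, f0_over_m):
    """Steady-state response x_S(t) = amplitude * cos(omega*t - delta)."""
    elastic = omega0**2 - omega**2
    absorptive = gamma * omega
    amplitude = f0_over_m / math.hypot(elastic, absorptive)
    delta = math.atan2(absorptive, elastic)
    return amplitude * math.cos(omega * t - delta)

def integrate_driven_oscillator(omega, omega0, gamma, f0_over_m, t_end, dt):
    """RK4 integration of x'' + gamma*x' + omega0^2*x = f0_over_m*cos(omega*t)
    with x(0) = x'(0) = 0. Returns the final (t, x)."""
    def accel(t, x, v):
        return f0_over_m * math.cos(omega * t) - gamma * v - omega0**2 * x

    t, x, v = 0.0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = v, accel(t, x, v)
        k2x = v + 0.5 * dt * k1v
        k2v = accel(t + 0.5 * dt, x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x = v + 0.5 * dt * k2v
        k3v = accel(t + 0.5 * dt, x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x = v + dt * k3v
        k4v = accel(t + dt, x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        t += dt
    return t, x

params = dict(omega=1.5, omega0=2.0, gamma=0.5, f0_over_m=1.0)
t_final, x_numeric = integrate_driven_oscillator(t_end=60.0, dt=0.005, **params)
print(x_numeric, steady_state(t_final, **params))
```

With these assumed values the transient envelope $e^{-\Gamma t/2}$ has fallen by a factor of about $e^{-15}$ at the end of the run, so the two printed numbers should agree closely.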
The characteristic sounds of different types of musical instruments depend very much on the admixture
of transient solutions plus the number and mixture of oscillatory active modes. Percussive instruments, such
as the piano, have a large transient component. The mixture of transient and steady-state solutions for
forced oscillations occurs frequently in studies of networks in electrical circuit analysis.
3.6.4 Resonance
The discussion so far has covered the role of the transient and steady-state solutions of the driven damped
harmonic oscillator which occurs frequently in science and engineering. Another important aspect is resonance
that occurs when the driving frequency $\omega$ approaches the natural frequency $\omega_1$ of the damped system.
Consider the case where the time is sufficient for the transient solution to have decayed to zero.
Figure 3.9 shows the amplitude and phase for the steady-state response as the driving frequency $\omega$ is
changed through a resonance. The steady-state solution of the
driven oscillator follows the driving force when $\omega \ll \omega_0$ in
that the phase difference is zero and the amplitude is just $\frac{F_0}{k}$.
The response of the system peaks at resonance, while
for $\omega \gg \omega_0$ the harmonic system is unable to follow the
more rapidly oscillating driving force and thus the phase of
the induced oscillation is out of phase with the driving force
and the amplitude of the oscillation tends to zero.
Note that the resonance frequency for a driven damped
oscillator differs from that for the undriven damped oscillator,
and differs from that for the undamped oscillator. The
natural frequency for an undamped harmonic oscillator
is given by
$$\omega_0^2 = \frac{k}{m} \qquad (3.68)$$
The transient solution is the same as damped free oscillations
of a damped oscillator and has a frequency of
the system $\omega_1$ given by
$$\omega_1^2 = \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 \qquad (3.69)$$
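The absorptive term steadily absorbs energy while the elastic term oscillates as energy is alternately absorbed
or emitted. The time average over one cycle is given by
$$\langle P\rangle = F_0\left[-A_{el}\,\omega\langle\cos\omega t\sin\omega t\rangle + A_{abs}\,\omega\left\langle(\cos\omega t)^2\right\rangle\right] \qquad (3.80)$$
where $\langle\cos\omega t\sin\omega t\rangle$ and $\langle\cos^2\omega t\rangle$ are the time averages over one cycle. The time average over one complete
cycle for the first term in the bracket is zero, while $\langle\cos^2\omega t\rangle = \frac{1}{2}$, so that
$$\langle P\rangle = \frac{1}{2}F_0\omega A_{abs} = \frac{F_0^2}{2m}\frac{\Gamma\omega^2}{\left(\omega_0^2-\omega^2\right)^2 + (\Gamma\omega)^2} \qquad (3.83)$$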
This shape of the power curve is a classic Lorentzian shape. Note that the maximum of the average kinetic
energy occurs at $\omega_{KE} = \omega_0$ which is different from the peak of the amplitude which occurs at $\omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2$.
The potential energy is proportional to the amplitude squared, i.e. $x^2$, and therefore peaks at the same angular
frequency as the amplitude, that is, $\omega_{PE}^2 = \omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2$. The kinetic and potential energies resonate
at different angular frequencies as a result of the fact that the driven damped oscillator is not conservative
because energy is continually exchanged between the oscillator and the driving force system in addition to
the energy dissipation due to the damping.
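The different resonance frequencies can be located numerically by scanning the driving frequency. The Python sketch below, with $F_0 = m = 1$ and assumed values $\omega_0 = 1$, $\Gamma = 0.4$, finds where the steady-state amplitude and the mean kinetic energy $\frac{1}{4}m\omega^2A^2$ each peak. The function names are hypothetical and a simple grid search is used instead of an analytic maximization.

```python
import math

def amplitude(omega, omega0, gamma):
    """Steady-state amplitude with F0/m set to 1."""
    return 1.0 / math.hypot(omega0**2 - omega**2, gamma * omega)

def mean_kinetic_energy(omega, omega0, gamma):
    """Cycle-averaged kinetic energy (m = 1): (1/4) * omega^2 * amplitude^2."""
    return 0.25 * (omega * amplitude(omega, omega0, gamma))**2

def argmax_on_grid(func, lo, hi, n=100001):
    """Return the grid point in [lo, hi] where func is largest."""
    best_w, best_val = lo, func(lo)
    for i in range(1, n):
        w = lo + (hi - lo) * i / (n - 1)
        val = func(w)
        if val > best_val:
            best_w, best_val = w, val
    return best_w

omega0, gamma = 1.0, 0.4
w_amp = argmax_on_grid(lambda w: amplitude(w, omega0, gamma), 0.5, 1.5)
w_ke = argmax_on_grid(lambda w: mean_kinetic_energy(w, omega0, gamma), 0.5, 1.5)
print("amplitude peak:", w_amp, "expected", math.sqrt(omega0**2 - gamma**2 / 2))
print("kinetic energy peak:", w_ke, "expected", omega0)
print("free damped frequency omega_1:", math.sqrt(omega0**2 - gamma**2 / 4))
```

The scan shows the amplitude, and hence the potential energy, peaking below $\omega_1$, while the kinetic energy peaks at $\omega_0$.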
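When $\omega \sim \omega_0 \gg \Gamma$, then the power equation simplifies since
$$\left(\omega_0^2 - \omega^2\right) = (\omega_0+\omega)(\omega_0-\omega) \approx 2\omega_0(\omega_0-\omega) \qquad (3.84)$$
Therefore
$$\langle P\rangle \simeq \frac{F_0^2}{8m}\frac{\Gamma}{(\omega_0-\omega)^2 + \left(\frac{\Gamma}{2}\right)^2} \qquad (3.85)$$
This is called the Lorentzian or Breit-Wigner shape. The half power points are at a frequency difference
from resonance of $\pm\Delta\omega$ where
$$\Delta\omega = |\omega_0 - \omega| = \frac{\Gamma}{2} \qquad (3.86)$$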
Thus the full width at half maximum of the Lorentzian curve equals $\Gamma$. Note that the Lorentzian has a
narrower peak but much wider tail relative to a Gaussian shape. At the peak of the absorbed power, the
absorptive amplitude can be written as
$$A_{abs}(\omega = \omega_0) = \frac{F_0}{m\omega_0^2}Q \qquad (3.87)$$
That is, the peak amplitude increases with increase in $Q$. This explains the classic comedy scene where the
soprano shatters the crystal glass because the highest quality crystal glass has a high $Q$ which leads to a
large amplitude oscillation when she sings on resonance.
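The width of the power resonance can be checked numerically. The Python sketch below evaluates the mean absorbed power, in the form $\frac{F_0^2}{2m}\frac{\Gamma\omega^2}{(\omega_0^2-\omega^2)^2+(\Gamma\omega)^2}$, and locates the two half-power points by bisection. The values $\omega_0 = 10$, $\Gamma = 0.5$ and the helper names are assumptions for illustration.

```python
import math

def mean_power(omega, omega0, gamma, f0=1.0, m=1.0):
    """Cycle-averaged power absorbed by the driven, linearly-damped oscillator."""
    return (f0**2 / (2 * m)) * gamma * omega**2 / (
        (omega0**2 - omega**2)**2 + (gamma * omega)**2)

def bisect(func, lo, hi, tol=1e-12):
    """Locate a sign change of func between lo and hi by bisection."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

omega0, gamma = 10.0, 0.5
half_max = 0.5 * mean_power(omega0, omega0, gamma)

def excess(w):
    return mean_power(w, omega0, gamma) - half_max

lower = bisect(excess, omega0 - 5 * gamma, omega0)
upper = bisect(excess, omega0, omega0 + 5 * gamma)
fwhm = upper - lower
print("FWHM =", fwhm, "Gamma =", gamma)
```

For this form of the power curve the separation of the half-power points comes out equal to $\Gamma$, in line with the full width at half maximum quoted for the Lorentzian.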
The mean lifetime $\tau$ of the free linearly-damped harmonic oscillator, that is, the time for the energy of
free oscillations to decay to $1/e$, was shown to be related to the damping coefficient $\Gamma$ by
$$\tau = \frac{1}{\Gamma} \qquad (3.88)$$
Therefore we have the classical uncertainty principle for the linearly-damped harmonic oscillator
that the measured full-width at half maximum of the energy resonance curve for forced oscillation and the
mean life for decay of the energy of a free linearly-damped oscillator are related by
$$\tau\Gamma = 1 \qquad (3.89)$$
This relation is correct only for a linearly-damped harmonic system. Comparable relations between the
lifetime and damping width exist for different forms of damping.
One can demonstrate the above line width and decay time relationship using an acoustically driven
electric guitar string. Similarly, the width of the emitted electromagnetic radiation is related to the lifetime
of the atomic or nuclear state that decays electromagnetically. This classical uncertainty principle is exactly the same
as the one encountered in quantum physics due to wave-particle duality. In nuclear physics it is difficult to
measure the lifetime of states when $\tau < 10^{-13}$ s. For shorter lifetimes the value of $\Gamma$ can be determined from
the shape of the resonance curve which can be measured directly when the damping is large.
= 00 where is the phase difference between the voltage and the current. For this circuit the impedance
is given by
$$Z = R + i\left(\omega L - \frac{1}{\omega C}\right)$$
Because of the phases involved in this circuit, at resonance the maximum voltage across the resistor
occurs at a frequency of $\omega_R = \omega_0$, across the capacitor the maximum voltage occurs at a frequency $\omega_C^2 = \omega_0^2 - \frac{R^2}{2L^2}$,
and across the inductor the maximum voltage occurs at a frequency $\omega_L^2 = \frac{\omega_0^2}{1 - \frac{R^2C}{2L}}$, where $\omega_0^2 = \frac{1}{LC}$
is the resonance angular frequency when $R = 0$. Thus these resonance frequencies differ when $R > 0$.
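The three resonance frequencies of the series circuit can be checked by scanning the driving frequency. The Python sketch below assumes a series RLC circuit driven by a unit-amplitude voltage, with impedance $Z = R + i(\omega L - 1/\omega C)$, and takes $R = L = C = 1$ purely for illustration. The function names are hypothetical.

```python
import math

def rlc_voltage_amplitudes(omega, R, L, C, v0=1.0):
    """Return the voltage amplitudes (V_R, V_C, V_L) across each element
    of a series RLC circuit driven at angular frequency omega."""
    reactance = omega * L - 1.0 / (omega * C)
    current = v0 / math.hypot(R, reactance)   # |I| = V0 / |Z|
    return current * R, current / (omega * C), current * omega * L

def peak_frequency(element, R, L, C, lo=0.2, hi=3.0, n=28001):
    """Grid search for the frequency maximizing the chosen element's voltage
    (element 0 = resistor, 1 = capacitor, 2 = inductor)."""
    best_w, best_val = lo, rlc_voltage_amplitudes(lo, R, L, C)[element]
    for i in range(1, n):
        w = lo + (hi - lo) * i / (n - 1)
        val = rlc_voltage_amplitudes(w, R, L, C)[element]
        if val > best_val:
            best_w, best_val = w, val
    return best_w

R, L, C = 1.0, 1.0, 1.0
omega0_sq = 1.0 / (L * C)
print("resistor :", peak_frequency(0, R, L, C), math.sqrt(omega0_sq))
print("capacitor:", peak_frequency(1, R, L, C), math.sqrt(omega0_sq - R**2 / (2 * L**2)))
print("inductor :", peak_frequency(2, R, L, C), math.sqrt(omega0_sq / (1 - R**2 * C / (2 * L))))
```

Each line prints the scanned peak next to the closed-form value, showing that the capacitor voltage peaks below $\omega_0$ and the inductor voltage peaks above it when $R > 0$.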
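$$\frac{\partial\Psi}{\partial t} = \mp v\frac{\partial\Psi}{\partial x} \qquad (3.92)$$
The sign in this equation depends on the sign of the wave velocity making it not a generally useful formula.
Consider the second derivatives, writing $p = x \mp vt$ for the argument of the wave function,
$$\frac{\partial^2\Psi}{\partial x^2} = \frac{d^2\Psi}{dp^2}\left(\frac{\partial p}{\partial x}\right)^2 = \frac{d^2\Psi}{dp^2} \qquad (3.93)$$
and
$$\frac{\partial^2\Psi}{\partial t^2} = \frac{d^2\Psi}{dp^2}\left(\frac{\partial p}{\partial t}\right)^2 = +v^2\frac{d^2\Psi}{dp^2} \qquad (3.94)$$
Factoring out $\frac{d^2\Psi}{dp^2}$ gives
$$\frac{\partial^2\Psi}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2\Psi}{\partial t^2} \qquad (3.95)$$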
This wave equation in one dimension for a linear system is independent of the sign of the velocity. There
are an infinite number of possible shapes of waves, both travelling and standing, in one dimension; all of these
must satisfy this one-dimensional wave equation. The converse is that any function that satisfies this
one-dimensional wave equation must be a wave in this one dimension.
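The statement that any waveform travelling in either direction satisfies the one-dimensional wave equation can be tested numerically. The Python sketch below evaluates $\frac{\partial^2\Psi}{\partial x^2} - \frac{1}{v^2}\frac{\partial^2\Psi}{\partial t^2}$ by central differences for an assumed Gaussian pulse; the pulse shape, the velocity, and the function names are illustrative choices, not taken from the text.

```python
import math

def travelling_pulse(x, t, v, direction=+1):
    """Gaussian pulse Psi(x -/+ v t); direction=+1 moves toward +x."""
    p = x - direction * v * t
    return math.exp(-p * p)

def wave_equation_residual(psi, x, t, v, h=1e-3):
    """Central-difference estimate of d2Psi/dx2 - (1/v^2) d2Psi/dt2."""
    d2x = (psi(x + h, t) - 2 * psi(x, t) + psi(x - h, t)) / h**2
    d2t = (psi(x, t + h) - 2 * psi(x, t) + psi(x, t - h)) / h**2
    return d2x - d2t / v**2

v = 3.0

def right(x, t):
    return travelling_pulse(x, t, v, +1)

def left(x, t):
    return travelling_pulse(x, t, v, -1)

def superposed(x, t):
    # Sum of oppositely travelling pulses; also a solution by superposition
    return right(x, t) + left(x, t)

for name, psi in (("right", right), ("left", left), ("sum", superposed)):
    print(name, wave_equation_residual(psi, 0.3, 0.1, v))
```

All three residuals should be zero to within the finite-difference error, independent of the sign of the velocity.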
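The Wave Equation in three dimensions is
$$\nabla^2\Psi \equiv \frac{\partial^2\Psi}{\partial x^2} + \frac{\partial^2\Psi}{\partial y^2} + \frac{\partial^2\Psi}{\partial z^2} = \frac{1}{v^2}\frac{\partial^2\Psi}{\partial t^2} \qquad (3.96)$$
There are an unlimited number of possible solutions $\Psi$ to this wave equation, any one of which corresponds
to a wave motion with velocity $v$.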
The Wave Equation is applicable to all manifestations of wave motion, both transverse and longitudinal,
for linear systems. That is, it applies to waves on a string, water waves, seismic waves, sound waves,
electromagnetic waves, matter waves, etc. If it can be shown that a wave equation can be derived for any
system, discrete or continuous, then this is equivalent to proving the existence of waves of any waveform,
frequency, or wavelength travelling with the phase velocity given by the wave equation.[Cra65]
2) The spatial dependence of the waveform at a given instant = 0 which can be expressed using a
Fourier decomposition of the spatial dependence as a function of wavenumber = 0
∞
X ∞
X
Ψ( 0 ) = (0 −1 0 ) = (0 ) 0 (3.100)
=−∞ =−∞
The above is applicable to both discrete and continuous linear oscillator systems, e.g. waves on a string.
In summary, stationary normal modes of a system are obtained by a superposition of travelling waves
travelling in opposite directions, or equivalently, travelling waves can result from a superposition of stationary
normal modes.
70 CHAPTER 3. LINEAR OSCILLATORS
Note that since the average over $2\pi$ of $\cos^2\theta = \frac{1}{2}$ then the average over the $\cos^2(\omega_1 t - \delta)$ term gives the
intensity $I(t) = \frac{A^2}{2} e^{-\Gamma t}$ which has a mean lifetime for the decay of $\tau = \frac{1}{\Gamma}$. The $|G(\omega)|^2$ distribution has the
classic Lorentzian shape, shown in figure 3.12, which has a full width at half-maximum, FWHM, equal to $\Gamma$.
Note that $G(\omega)$ is complex and thus one also can determine the phase shift $\delta$ which is given by the ratio of
the imaginary to real parts of equation 3.105, i.e. $\tan\delta = \dfrac{\Gamma/2}{(\omega - \omega_1)}$.
The mean lifetime $\tau$ of the exponential decay of the intensity can be determined either by measuring $\tau$
from the time dependence, or measuring the FWHM $\Gamma = \frac{1}{\tau}$ of the Fourier transform $|G(\omega)|^2$. In nuclear
and atomic physics excited levels decay by photon emission with the wave form of the free linearly-damped,
linear oscillator. Typically the mean lifetime $\tau$ can be measured when $\tau \gtrsim 10^{-12}$ s whereas for
shorter lifetimes the radiation width $\Gamma$ becomes sufficiently large to be measured. Thus the two experimental
approaches are complementary.
For each harmonic term the response of a linearly-damped linear oscillator to the forcing function
$F(t) = F_0(\omega)\cos(\omega t)$ is given by equations (3.65-3.67) to be
This is shown schematically in figure 3.13. The Fourier transformation connects the three quantities in the
time domain with the corresponding three in the frequency domain. For example, the impulse response of
the low-pass filter has a fall time of $\tau$ which is related by a Fourier transform to the width of the transfer
function. Thus the time and frequency domain approaches are closely related and give the same result for
the output signal for the low-pass filter to the applied square-wave input signal. The result is that the
higher-frequency components are attenuated, leading to slow rise and fall times in the time domain.
Analog signal processing and Fourier analysis were the primary tools to analyze and process all forms of
periodic motion during the 20th century. For example, musical instruments, mechanical systems, and electronic
circuits all employed resonant systems to enhance the desired frequencies and suppress the undesirable
frequencies, and the signals could be observed using analog oscilloscopes. The remarkable development of
computing has enabled use of digital signal processing, leading to a revolution in signal processing that has
had a profound impact on both science and engineering. The digital oscilloscope, which can sample at
frequencies above $10^9$ Hz, has replaced the analog oscilloscope because it allows sophisticated analysis of each
individual signal that was not possible using analog signal processing. For example, the analog approach in
nuclear physics used tiny analog electric signals, produced by many individual radiation detectors, that were
transmitted hundreds of meters via carefully shielded and expensive coaxial cables to the data room where
the signals were amplified and processed using analog filters to maximize the signal-to-noise ratio in order to
separate the signal from the background noise. Stray electromagnetic radiation picked up via the cables
significantly degraded the signals. The performance and limitations of the analog electronics severely restricted
the pulse processing capabilities. Digital signal processing has rapidly replaced analog signal processing.
3.11. WAVE PROPAGATION 73
Figure 3.13: Response of an electrical circuit to an input square wave. The upper row shows the time
and the exponential-form frequency representations of the square-wave input signal. The middle row gives
the impulse response, and corresponding transfer function for the circuit. The bottom row shows the
corresponding output properties in both the time and frequency domains
Analog to digital detector circuits are built directly into the electronics for each individual detector so that
only digital information needs to be transmitted from each detector to the analysis computers. Computer
processing provides unlimited and flexible processing capabilities for the digital signals greatly enhancing
the response and sensitivity of our detector systems. Digital CD and DVD disks are a common application of
digital signal processing.
The argument of the exponential is called the phase $\phi$ of the wave where
$$\phi \equiv kx - \omega t \qquad (3.114)$$
If we move along the $x$ axis at a velocity $v$ such that the phase is constant then we perceive a stationary
pattern in this moving frame. The velocity of this wave is called the phase velocity. To ensure constant
phase requires that $\phi$ is constant, or assuming real $k$ and $\omega$
In the event that $N \to \infty$ and the frequencies are continuously distributed, then the summation is replaced
by an integral
$$\Psi(x,t) = \int_{-\infty}^{\infty} A(k)\, e^{i(kx \pm \omega t)}\, dk \qquad (3.121)$$
where the factor $A(k)$ represents the distribution amplitudes of the component waves, that is the spectral
decomposition of the wave. This is the usual Fourier decomposition of the spatial distribution of the wave.
Consider an extension of the linear superposition of two waves to a well defined wave packet where the
amplitude is nonzero only for a small range of wavenumbers $k_0 \pm \Delta k$
$$\Psi(x,t) = \int_{k_0-\Delta k}^{k_0+\Delta k} A(k)\, e^{i(kx - \omega t)}\, dk \qquad (3.122)$$
This functional shape is called a wave packet which only has meaning if $\Delta k \ll k_0$. The angular frequency
can be expressed by making a Taylor expansion around $k_0$
$$\omega(k) = \omega(k_0) + \left(\frac{d\omega}{dk}\right)_{k_0}(k - k_0) + \dots \qquad (3.123)$$
The summation of terms in the exponent given by 3.124 leads to the amplitude 3.122 having the form of a
product where the integral becomes
$$\Psi(x,t) = e^{i(k_0 x - \omega_0 t)} \int_{k_0-\Delta k}^{k_0+\Delta k} A(k)\, e^{i(k - k_0)\left[x - \left(\frac{d\omega}{dk}\right)_{k_0} t\right]}\, dk \qquad (3.125)$$
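The factorization in equation 3.125 implies that the envelope of the wave packet moves at $\left(\frac{d\omega}{dk}\right)_{k_0}$ while the carrier moves at $\omega_0/k_0$. The sketch below illustrates this numerically under stated assumptions: the deep-water dispersion relation $\omega = \sqrt{gk}$, the flat amplitude $A(k) = 1$, and the values of $k_0$ and $\Delta k$ are illustrative choices, not taken from the text. The integral of equation 3.122 is approximated by a simple Riemann sum.

```python
import cmath
import math

G = 9.81  # assumed dispersion relation: deep-water waves, omega = sqrt(g k)

def omega(k):
    return math.sqrt(G * k)

def packet_amplitude(x, t, k0=2.0, dk=0.2, n=201):
    """|Psi(x, t)| for A(k) = 1 on [k0 - dk, k0 + dk] (Riemann sum of eq. 3.122)."""
    step = 2.0 * dk / (n - 1)
    total = 0j
    for j in range(n):
        k = k0 - dk + j * step
        total += cmath.exp(1j * (k * x - omega(k) * t))
    return abs(total * step)

def envelope_peak(t, n_x=601, dx=0.1):
    """Position of the maximum of |Psi| on a grid from x = 0 to x = 60."""
    xs = [i * dx for i in range(n_x)]
    return max(xs, key=lambda x: packet_amplitude(x, t))

k0 = 2.0
v_phase = omega(k0) / k0               # carrier velocity omega_0 / k_0
v_group = 0.5 * math.sqrt(G / k0)      # d(omega)/dk evaluated at k0
t = 20.0
x_peak = envelope_peak(t)
print(x_peak, v_group * t, v_phase * t)
```

For this dispersion relation the envelope peak should sit near $v_{group}\,t$, which is half the distance travelled by a point of constant carrier phase.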
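$$\nabla^2 \mathbf{E} - \varepsilon\mu \frac{\partial^2 \mathbf{E}}{\partial t^2} - \sigma\mu \frac{\partial \mathbf{E}}{\partial t} = 0$$
$$\nabla^2 \mathbf{H} - \varepsilon\mu \frac{\partial^2 \mathbf{H}}{\partial t^2} - \sigma\mu \frac{\partial \mathbf{H}}{\partial t} = 0$$
The third term in both of these wave equations is a damping term that leads to a damped solution of an
electromagnetic wave in a good conductor.
These damped wave equations can be solved by considering an incident wave
$$\mathbf{E} = E_0\, \hat{\mathbf{x}}\, e^{i(\omega t - kz)}$$
Substitution into the wave equation for $\mathbf{E}$ gives
$$-k^2 + \varepsilon\mu\omega^2 - i\sigma\mu\omega = 0$$
That is
$$k^2 = \varepsilon\mu\omega^2 \left[1 - \frac{i\sigma}{\varepsilon\omega}\right]$$
In general $k$ is complex, that is, it has real and imaginary parts $k = k_R - ik_I$ that lead to a solution of the form
$$\mathbf{E} = E_0\, \hat{\mathbf{x}}\, e^{-k_I z}\, e^{i(\omega t - k_R z)}$$
The first exponential term is an exponential damping term while the second exponential term is the oscillating
term.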
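Consider that the plasma involves the motion of a bound damped electron, of charge $q$ and mass $m$, bound
in a one-dimensional atom or lattice subject to an oscillatory electric field of frequency $\omega$. Assume that the
electromagnetic wave is travelling in the $\hat{z}$ direction with the transverse electric field in the $\hat{x}$ direction. The
equation of motion of an electron can be written as
$$\ddot{\mathbf{x}} + \Gamma\dot{\mathbf{x}} + \omega_0^2 \mathbf{x} = \frac{q}{m}\, \hat{\mathbf{x}} E_0\, e^{i(\omega t - kz)}$$
where $\Gamma$ is the damping factor. The instantaneous displacement of the oscillating charge equals
$$\mathbf{x} = \frac{q}{m}\, \frac{1}{(\omega_0^2 - \omega^2) + i\Gamma\omega}\, \hat{\mathbf{x}} E_0\, e^{i(\omega t - kz)}$$
and the velocity is
$$\dot{\mathbf{x}} = \frac{q}{m}\, \frac{i\omega}{(\omega_0^2 - \omega^2) + i\Gamma\omega}\, \hat{\mathbf{x}} E_0\, e^{i(\omega t - kz)}$$
Thus the instantaneous current density is given by
$$\mathbf{j} = Nq\dot{\mathbf{x}} = \frac{Nq^2}{m}\, \frac{i\omega}{(\omega_0^2 - \omega^2) + i\Gamma\omega}\, \hat{\mathbf{x}} E_0\, e^{i(\omega t - kz)}$$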
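Since $\mathbf{j} = \sigma\mathbf{E}$, the electrical conductivity is
$$\sigma = \frac{Nq^2}{m}\, \frac{i\omega}{(\omega_0^2 - \omega^2) + i\Gamma\omega}$$
Let us consider only unbound charges in the plasma, that is let $\omega_0 = 0$. Then the conductivity is given by
$$\sigma = \frac{Nq^2}{m}\, \frac{i\omega}{i\Gamma\omega - \omega^2}$$
For a low density ionized plasma $\omega \gg \Gamma$ thus the conductivity is given approximately by
$$\sigma \approx -i\, \frac{Nq^2}{m\omega}$$
Since $\sigma$ is pure imaginary, then $\mathbf{j}$ and $\mathbf{E}$ have a phase difference of $\frac{\pi}{2}$ which implies that the average of
the Joule heating over a complete period is $\langle \mathbf{j} \cdot \mathbf{E} \rangle = 0$. Thus there is no energy loss due to Joule heating,
implying that the electromagnetic energy is conserved.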
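Substitution of $\sigma$ into the relation for $k^2$ gives
$$k^2 = \varepsilon\mu\omega^2 \left[1 - \frac{i\sigma}{\varepsilon\omega}\right] = \varepsilon\mu\omega^2 \left[1 - \frac{Nq^2}{\varepsilon m \omega^2}\right]$$
Define the Plasma oscillation frequency $\omega_P$ to be
$$\omega_P \equiv \sqrt{\frac{Nq^2}{\varepsilon m}}$$
then $k^2$ can be written as
$$k^2 = \varepsilon\mu\omega^2 \left[1 - \left(\frac{\omega_P}{\omega}\right)^2\right]$$
For a low density plasma the dielectric constant $K_E \simeq 1$ and the relative permeability $K_B \simeq 1$ and thus
$\varepsilon = K_E\varepsilon_0 \simeq \varepsilon_0$ and $\mu = K_B\mu_0 \simeq \mu_0$. The velocity of light in vacuum is $c = \frac{1}{\sqrt{\varepsilon_0\mu_0}}$. Thus for low density
this equation can be written as
$$\omega^2 = \omega_P^2 + c^2k^2$$
Differentiation of this dispersion relation with respect to $k$ gives $2\omega\frac{d\omega}{dk} = 2c^2k$. That is, $v_{phase}v_{group} = c^2$ and the phase
velocity is
$$v_{phase} = \sqrt{c^2 + \frac{\omega_P^2}{k^2}}$$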
There are three cases to consider.
1) $\omega > \omega_P$: For this case $\left(\frac{\omega_P}{\omega}\right)^2 < 1$, so $\left[1 - \left(\frac{\omega_P}{\omega}\right)^2\right]$ is positive and thus $k$ is a pure real number. Therefore the
electromagnetic wave is transmitted with a phase velocity that exceeds $c$ while the group velocity is less than
$c$.
2) $\omega < \omega_P$: For this case $\left(\frac{\omega_P}{\omega}\right)^2 > 1$, so $\left[1 - \left(\frac{\omega_P}{\omega}\right)^2\right]$ is negative and thus $k$ is a pure imaginary number. Therefore the
electromagnetic wave is not transmitted in the ionosphere and is attenuated rapidly as $e^{-\left(\frac{\omega_P}{c}\right)z}$. However,
since there are no Joule heating losses, the electromagnetic wave must be completely reflected. Thus the
Plasma oscillation frequency serves as a cut-off frequency. For this example the signal and group velocities
are identical.
For the ionosphere $N = 10^{11}$ electrons/m$^3$, which corresponds to a Plasma oscillation frequency of
$\nu = \omega_P/2\pi = 3$ MHz. Thus electromagnetic waves in the AM waveband ($< 1.6$ MHz) are totally reflected by
the ionosphere and bounce repeatedly around the Earth, whereas for VHF frequencies above 3 MHz, the waves
are transmitted and refracted passing through the atmosphere. Thus light is transmitted by the ionosphere.
By contrast, for a good conductor like silver, the Plasma oscillation frequency is around $10^{16}$ Hz which is
in the far ultraviolet part of the spectrum. Thus, all lower frequencies, such as light, are totally reflected
by such a good conductor, whereas X-rays have frequencies above the Plasma oscillation frequency and are
transmitted.
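The ionosphere numbers quoted above can be reproduced from the definition of the Plasma oscillation frequency and the dispersion relation $\omega^2 = \omega_P^2 + c^2k^2$. The following is a minimal sketch, assuming free electrons with $\varepsilon \simeq \varepsilon_0$; the function names and the 100 MHz test frequency are illustrative assumptions, and the physical constants are standard SI values.

```python
import math

E_CHARGE = 1.602176634e-19   # electron charge, C
E_MASS = 9.1093837015e-31    # electron mass, kg
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
C = 2.99792458e8             # speed of light in vacuum, m/s

def plasma_frequency_hz(n_electrons_per_m3):
    """nu_P = omega_P / (2 pi), with omega_P = sqrt(N q^2 / (eps0 m))."""
    omega_p = math.sqrt(n_electrons_per_m3 * E_CHARGE**2 / (EPS0 * E_MASS))
    return omega_p / (2.0 * math.pi)

def phase_and_group_velocity(f_hz, f_plasma_hz):
    """Velocities from omega^2 = omega_P^2 + c^2 k^2; None below the cut-off."""
    if f_hz <= f_plasma_hz:
        return None  # k is imaginary: the wave is reflected, not transmitted
    root = math.sqrt(1.0 - (f_plasma_hz / f_hz) ** 2)
    return C / root, C * root

f_p = plasma_frequency_hz(1.0e11)                 # ionosphere electron density
am_band = phase_and_group_velocity(1.0e6, f_p)    # AM radio: below cut-off
vhf = phase_and_group_velocity(100.0e6, f_p)      # assumed VHF test frequency
print(f_p, am_band, vhf)
```

The computed cut-off is close to 3 MHz, and above the cut-off the product of the phase and group velocities equals $c^2$.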
That is, the transform of a rectangular wavepacket gives a cosine wave modulated by an unnormalized
function which is a nice example of a simple wave packet. That is, on the right hand side we have
2
a wavepacket ∆ = ± ∆ wide. Note that the product of the two measures of the widths ∆ · ∆ = ±
Example 2 considers a rectangular pulse of unity amplitude between $-\frac{\tau}{2} \leq t \leq \frac{\tau}{2}$ which resulted in a
Fourier transform $G(\omega) = \tau\left(\dfrac{\sin\frac{\omega\tau}{2}}{\frac{\omega\tau}{2}}\right)$. That is, for a pulse of width $\Delta t = \pm\frac{\tau}{2}$ the frequency envelope has
the first zero at $\Delta\omega = \pm\frac{2\pi}{\tau}$. Note that this is the complementary system to the one considered here which has
$\Delta t \cdot \Delta\omega = \pm\pi$, illustrating the symmetry of the Fourier transform and its inverse.
trajectory. This implies that, for a particle of given momentum, the wavefunction is spread out spatially.
Planck’s constant $\hbar = 1.054 \times 10^{-34}$ J·s $= 6.582 \times 10^{-16}$ eV·s is extremely small compared with energies and
times encountered in normal life, and thus the effects due to the Uncertainty Principle are not important for
macroscopic dimensions.
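Confinement of a particle, of mass $m$, within $\pm\sigma(x)$ of a fixed location implies that there is a corresponding
uncertainty in the momentum
$$\sigma(p_x) \geq \frac{\hbar}{2\sigma(x)} \qquad (3.130)$$
Now the variance in momentum $\mathbf{p}$ is given by the difference between the average of the square $\langle \mathbf{p} \cdot \mathbf{p} \rangle$ and the
square of the average $\langle \mathbf{p} \rangle^2$. That is
$$\sigma(\mathbf{p})^2 = \langle \mathbf{p} \cdot \mathbf{p} \rangle - \langle \mathbf{p} \rangle^2 \qquad (3.131)$$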
In 1959 Pound and Rebka used this to test Einstein’s general theory of relativity by measurement of the
gravitational red shift between the attic and basement of the 22.5 m high physics building at Harvard. The
magnitude of the predicted relativistic red shift is $\frac{\Delta E}{E} = 2.5 \times 10^{-15}$ which is what was observed with a
fractional precision of about 1%.
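The size of the predicted red shift can be checked with a short calculation. This sketch assumes the standard weak-field expression $\Delta E/E = gh/c^2$, which is not written out in the text above, together with the 22.5 m height quoted there and a nominal value of $g$.

```python
G_ACCEL = 9.81            # assumed gravitational field strength, m/s^2
HEIGHT = 22.5             # height of the building quoted in the text, m
C = 2.99792458e8          # speed of light in vacuum, m/s

# Weak-field gravitational red shift (assumed formula): dE/E = g h / c^2
fractional_shift = G_ACCEL * HEIGHT / C**2
print(fractional_shift)   # about 2.5e-15
```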
~
∆ = ≤ 03 (Due to transverse velocity uncertainty)
2∆
Combining both of these requirements gives
2∆2
~≤ = 54 10−2 ·
This is 32 orders of magnitude larger than $\hbar$ so quantal effects are negligible. However, if $\hbar$ exceeded the
above value, then the pitcher would have difficulty throwing a reliable strike.
3.12 Summary
Linear systems have the feature that the solutions obey the Principle of Superposition, that is, the am-
plitudes add linearly for the superposition of different oscillatory modes. Applicability of the Principle of
Superposition to a system provides a tremendous advantage for handling and solving the equations of motion
of oscillatory systems.
Geometric representations of the motion of dynamical systems provide sensitive probes of periodic motion.
Configuration space $(\mathbf{q}, \mathbf{q}, t)$, state space $(\mathbf{q}, \dot{\mathbf{q}}, t)$ and phase space $(\mathbf{q}, \mathbf{p}, t)$ are powerful geometric
representations that are used extensively for recognizing periodic motion, where $\mathbf{q}$, $\dot{\mathbf{q}}$, and $\mathbf{p}$ are vectors in
$n$-dimensional space.
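Linearly-damped free linear oscillator The free linearly-damped linear oscillator is characterized by
the equation
$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0 \qquad (3.26)$$
The solutions of the linearly-damped free linear oscillator are of the form
$$x = e^{-\left(\frac{\Gamma}{2}\right)t}\left[z_1 e^{i\omega_1 t} + z_2 e^{-i\omega_1 t}\right] \qquad \omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \qquad (3.33)$$
The solutions of the linearly-damped free linear oscillator have the following characteristic frequencies
corresponding to the three levels of linear damping
$x(t) = A e^{-\left(\frac{\Gamma}{2}\right)t}\cos(\omega_1 t - \beta)$     underdamped     $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} > 0$
$x(t) = \left[A_1 e^{-\omega_+ t} + A_2 e^{-\omega_- t}\right]$     overdamped     $\omega_\pm = -\left[-\frac{\Gamma}{2} \pm \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}\,\right]$
$x(t) = (A + Bt)\, e^{-\left(\frac{\Gamma}{2}\right)t}$     critically damped     $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} = 0$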
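The three damping regimes summarized above can be distinguished by the sign of $\omega_0^2 - (\Gamma/2)^2$. The following is a minimal sketch with hypothetical function names; it also checks by finite differences that the underdamped form satisfies equation 3.26. The exact comparison with zero for critical damping is only meaningful for inputs that are exactly representable, which is an assumption of this illustration.

```python
import math

def classify_damping(omega0, gamma):
    """Return the damping regime and its characteristic frequency or decay rates."""
    disc = omega0**2 - (gamma / 2.0) ** 2
    if disc > 0.0:
        return "underdamped", math.sqrt(disc)            # omega_1
    if disc < 0.0:
        root = math.sqrt(-disc)
        # omega_+ and omega_- of the summary table (both are decay rates)
        return "overdamped", (gamma / 2.0 - root, gamma / 2.0 + root)
    return "critically damped", 0.0

def underdamped_x(t, omega0, gamma, amp=1.0, beta=0.0):
    """x(t) = A exp(-Gamma t / 2) cos(omega_1 t - beta)."""
    omega1 = math.sqrt(omega0**2 - (gamma / 2.0) ** 2)
    return amp * math.exp(-gamma * t / 2.0) * math.cos(omega1 * t - beta)

def ode_residual(t, omega0, gamma, h=1e-4):
    """Finite-difference estimate of x'' + Gamma x' + omega0^2 x."""
    x0 = underdamped_x(t, omega0, gamma)
    xp = underdamped_x(t + h, omega0, gamma)
    xm = underdamped_x(t - h, omega0, gamma)
    return (xp - 2.0 * x0 + xm) / h**2 + gamma * (xp - xm) / (2.0 * h) + omega0**2 * x0

print(classify_damping(1.0, 0.4))
print(classify_damping(1.0, 2.0))
print(classify_damping(1.0, 3.0))
print(ode_residual(0.7, 1.0, 0.4))
```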
The energy dissipation for the linearly-damped free linear oscillator time averaged over one period is
given by
$$\langle E \rangle = E_0 e^{-\Gamma t} \qquad (3.44)$$
The quality factor $Q$ characterizing the damping of the free oscillator is defined to be
$$Q = \frac{E}{\Delta E} = \frac{\omega_1}{\Gamma} \qquad (3.47)$$
where $\Delta E$ is the energy dissipated per radian.
Resonance A detailed discussion of resonance and energy absorption for the driven linearly-damped linear
oscillator was given. For resonance of the linearly-damped linear oscillator the maximum amplitudes occur
at the following resonant frequencies
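Resonant system     Resonant frequency
undamped free linear oscillator     $\omega_0 = \sqrt{\frac{k}{m}}$
linearly-damped free linear oscillator     $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$
driven linearly-damped linear oscillator     $\omega_R = \sqrt{\omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2}$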
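The resonant frequency of the driven oscillator in the table above can be confirmed by scanning the steady-state amplitude. This sketch assumes the standard steady-state amplitude $\frac{F_0/m}{\sqrt{(\omega_0^2 - \omega^2)^2 + (\Gamma\omega)^2}}$; the parameter values and names are illustrative only.

```python
import math

def amplitude(omega, omega0, gamma, f0_over_m=1.0):
    """Steady-state amplitude of the driven linearly-damped linear oscillator."""
    return f0_over_m / math.sqrt((omega0**2 - omega**2) ** 2 + (gamma * omega) ** 2)

omega0, gamma = 1.0, 0.4                                 # assumed illustrative values
omega_1 = math.sqrt(omega0**2 - (gamma / 2.0) ** 2)      # damped free oscillator
omega_r = math.sqrt(omega0**2 - 2.0 * (gamma / 2.0) ** 2)  # driven resonance

# Brute-force scan of the driving frequency for the maximum amplitude
grid = [0.5 + 1.0e-4 * i for i in range(10001)]
omega_peak = max(grid, key=lambda w: amplitude(w, omega0, gamma))
print(omega_peak, omega_r, omega_1, omega0)
```

The scan should locate the maximum at $\omega_R$, which lies below both $\omega_1$ and $\omega_0$.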
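The energy absorption for the steady-state solution for resonance is given by
$$x(t) = A_{el}\cos\omega t + A_{abs}\sin\omega t \qquad (3.73)$$
where the elastic amplitude
$$A_{el} = \frac{\frac{F_0}{m}}{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}\left(\omega_0^2 - \omega^2\right) \qquad (3.74)$$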
Wave propagation The wave equation was introduced and both travelling and standing wave solutions
of the wave equation were discussed. Harmonic wave-form analysis, and the complementary time-sampled
wave form analysis techniques, were introduced in this chapter and in appendix . The relative merits of
Fourier analysis and the digital Green’s function waveform analysis were illustrated for signal processing.
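The concepts of phase velocity, group velocity, and signal velocity were introduced. The phase velocity
is given by
$$v_{phase} = \frac{\omega}{k} \qquad (3.117)$$
and group velocity
$$v_{group} = \left(\frac{d\omega}{dk}\right)_{k_0} = v_{phase} + k\frac{\partial v_{phase}}{\partial k} \qquad (3.128)$$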
If the group velocity is frequency dependent then the information content of a wave packet travels at the
signal velocity which can differ from the group velocity.
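The Wave-packet Uncertainty Principle implies that making a precise measurement of the frequency of a
sinusoidal wave requires that the wave packet be infinitely long. The standard deviation $\sigma(t) = \sqrt{\langle t^2 \rangle - \langle t \rangle^2}$
characterizing the width of the amplitude of the wavepacket spectral distribution in the angular frequency
domain, $\sigma_A(\omega)$, and the corresponding width in time $\sigma_A(t)$ are related by:
$$\sigma_A(t) \cdot \sigma_A(\omega) \geq 1$$
The standard deviations for the spectral distribution and width of the intensity of the wave packet are
related by:
$$\sigma_I(t) \cdot \sigma_I(\omega) \geq \frac{1}{2} \qquad (3.134)$$
$$\sigma_I(x) \cdot \sigma_I(k_x) \geq \frac{1}{2} \qquad \sigma_I(y) \cdot \sigma_I(k_y) \geq \frac{1}{2} \qquad \sigma_I(z) \cdot \sigma_I(k_z) \geq \frac{1}{2}$$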
This applies to all forms of wave motion, including sound waves, water waves, electromagnetic waves, or
matter waves.
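The intensity relation 3.134 can be illustrated numerically for a Gaussian wave packet, which is the case where the product of the widths takes its minimum value of one half. The sketch below is an illustration under stated assumptions: the Gaussian amplitude, the grid ranges, and the helper name `std` are choices made here, and the Fourier transform is evaluated by direct summation.

```python
import math

def std(values, weights):
    """Standard deviation sqrt(<v^2> - <v>^2) of a weighted distribution."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    mean_sq = sum(v * v * w for v, w in zip(values, weights)) / total
    return math.sqrt(mean_sq - mean * mean)

sigma_t = 1.5                     # assumed width of the intensity in time
dt = 0.05
ts = [-15.0 + dt * i for i in range(601)]
# Amplitude chosen so that the intensity |f(t)|^2 has standard deviation sigma_t
amp_t = [math.exp(-t * t / (4.0 * sigma_t**2)) for t in ts]

# Fourier transform by direct summation (f is even, so only the cosine term survives)
ws = [-4.0 + 0.02 * j for j in range(401)]
amp_w = [sum(a * math.cos(w * t) for a, t in zip(amp_t, ts)) * dt for w in ws]

width_t = std(ts, [a * a for a in amp_t])     # width of intensity in time
width_w = std(ws, [g * g for g in amp_w])     # width of intensity in frequency
product = width_t * width_w
print(width_t, width_w, product)
```

Any non-Gaussian pulse shape substituted for `amp_t` should give a product larger than one half.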
Workshop exercises
1. Given below are a list of statements followed by a list of reasons related to harmonic motion. For each of the
statements, determine the reason(s) that make that statement true. You may do this in small groups or as one
large group—the teaching assistant will decide what works best for your workshop.
Statements:
• We can neglect the higher order terms in the Taylor expansion of $F(x)$.
• The restoring force is a linear force.
• $F_0$ must vanish.
• $\left(\frac{dF}{dx}\right)_0$ is negative and $k$ is positive.
• We can write $F(x)$ as a Taylor series expansion.
Reasons:
2. Second-order ordinary differential equations are an important part of the physics of the harmonic oscillator.
(a) What do each of the following terms mean with respect to differential equations?
i. Ordinary
ii. Second-order
iii. Homogeneous
iv. Linear
(b) Give a mini-lesson on how to solve second-order differential equations by working through the following
examples. Don’t just provide a solution; explain the steps leading up to the solution.
i. 00 +5 0 +6 = 0
ii. 00 + 0 + = 0
iii. 00 +4 0 +4 = 0
iv. 00 −3 02
v. 00 −3 0 −4 = 2 sin
3. Harmonic oscillations occur for many different types of systems and it is important to recognize when the
equations for harmonic motion apply. Three different systems are described below. Each system can be
approximately described using the equations for harmonic motion. Break up into three groups—one group per
system. For your group’s system, answer the following questions:
(a) What approximations are necessary for this system to exhibit harmonic oscillations?
(b) What is the differential equation that governs the motion of this system? Use Newton’s second law to
arrive at this equation.
(c) What is the solution to the differential equation that you found in part (b)?
(d) What is the natural frequency of oscillations?
• A mass is tied to a massless spring having a spring constant . The system oscillates in one dimension
along a horizontal frictionless surface.
• A particle of mass is attached to a weightless, extensionless rod to form a pendulum. The length of
the rod is and the system oscillates in a single plane.
• A tube is bent into the shape of a U and is partially filled with a liquid of density . The cross-sectional
area of the tube is and the length of the tube filled with liquid is . The liquid is initially displaced so
that it is higher on one side of the tube than the other.
Once each group has answered all of the questions, share the results with the entire class.
4. Consider a mass attached to a spring of spring constant . The spring is mounted horizontally so that the
mass oscillates horizontally on a frictionless surface. The spring is attached to the wall on the right and the
mass is initially moved to the right of its equilibrium position (compressing the spring) by a distance and
released. Working individually, determine how (if at all) the period of the motion would be affected by each of
the changes below. Once you have answered each part on your own, compare your answers with a classmate.
5. When you were first introduced to simple harmonic motion, you used the formula ̈ = − to find the
position of the oscillating mass as a function of time. This assumes that the origin is defined to be the
equilibrium point. What happens if this is not the case? What would the equation of motion look like? How
would the position of the oscillating mass as a function of time change?
6. For each of the situations described below, give a rough sketch of the state space diagram ($\dot{x}$ versus $x$) that
represents the motion of each object. All of the motion takes place along the $x$-axis.
7. Consider a simple harmonic oscillator consisting of a mass attached to a spring of spring constant . For
this oscillator () = sin( 0 − ).
8. Consider a damped, driven oscillator consisting of a mass attached to a spring of spring constant .
(a) Determine the direction and the magnitude of the gravitational field for all regions of space.
(b) If the gravitational potential is zero at the origin, what is the difference between the gravitational potential
at = and = ?
11. A mass is constrained to move along one dimension. Two identical springs are attached to the mass, one on
each side, and each spring is in turn attached to a wall. Both springs have the same spring constant .
12. Discuss the motion of a continuous string when ½ plucked at one third of the ¾
length of the string. That is, the
3
0 ≤ ≤ 3
initial condition is ̇( 0) = 0, and ( 0) = 3
2 ( − ) 3 ≤≤
13. When a particular driving force is applied to a stretched string it is observed that the string vibration is purely
of the $n^{th}$ harmonic. Find the driving force.
14. Consider the two-mass system pivoted at its vertex where 6= . It undergoes oscillations of the angle
with respect to the vertical in the plane of the triangle.
l l
M m
l
15. A cube of side and mass is immersed in water with density past the point of equilibrium and then
released. Assume there is no damping due to the water.
√ √
() = + 1 cos ( ) + 2 sin ( )
where 1 and 2 are constants. If (0) = −, determine ().
Problems
1. An unusual pendulum is made by fixing a string to a horizontal cylinder of radius wrapping the string
several times around the cylinder, and then tying a mass to the loose end. In equilibrium the mass hangs a
distance 0 vertically below the edge of the cylinder. Find the potential energy if the pendulum has swung to
an angle from the vertical. Show that for small angles, it can be written in the Hooke’s Law form = 12 2 .
Comment of the value of
3. A simple pendulum consists of a mass $m$ suspended from a fixed point by a weightless, extensionless rod of
length $l$.
a) Obtain the equation of motion, and in the approximation $\sin\theta \approx \theta$ show that the natural frequency is
$\omega_0 = \sqrt{\frac{g}{l}}$, where $g$ is the gravitational field strength.
b) Discuss the motion in the event that the motion takes place in a viscous medium with retarding force
$2m\sqrt{gl}\,\dot{\theta}$.
4. Derive the expression for the State Space paths of the plane pendulum if the total energy is 2. Note
that this is just the case of a particle moving in a periodic potential () = (1−cos) Sketch the State
Space diagram for both 2 and 2
5. Consider the motion of a driven linearly-damped harmonic oscillator after the transient solution has died out,
and suppose that it is being driven close to resonance, = .
1 2 2
a) Show that the oscillator’s total energy is = 2 .
b) Show that the energy ∆ dissipated during one cycle by the damping force Γ̇ is Γ2
6. Two masses m1 and m2 slide freely on a horizontal frictionless rail and are connected by a spring whose force
constant is k. Find the frequency of oscillatory motion for this system.
7. A particle of mass moves under the influence of a resistive force proportional to velocity and a potential ,
that is .
( ̇) = −̇ −
where 0 and () = (2 − 2 )2
a) Find the points of stable and unstable equilibrium.
b) Find the solution of the equations of motion for small oscillations around the stable equilibrium points
c) Show that as → ∞ the particle approaches one of the stable equilibrium points for most choices of initial
conditions. What are the exceptions? (Hint: You can prove this without finding the solutions explicitly.)
Chapter 4
4.1 Introduction
In nature only a subset of systems have equations of motion that are linear. Contrary to the impression
given by the analytic solutions presented in undergraduate physics courses, most dynamical systems in
nature exhibit non-linear behavior that leads to complicated motion. Non-linear equations of motion
usually do not have analytic solutions, superposition does not apply, and they predict phenomena such as
attractors, discontinuous period bifurcation, extreme sensitivity to initial conditions, rolling motion, and
chaos. During the past four decades, exciting discoveries have been made in classical mechanics that are
associated with the recognition that nonlinear systems can exhibit chaos. Chaotic phenomena have been
observed in most fields of science and engineering, such as weather patterns, fluid flow, motion of planets in
the solar system, epidemics, changing populations of animals, birds and insects, and the motion of electrons
in atoms. The complicated dynamical behavior predicted by non-linear differential equations is not limited
to classical mechanics, rather it is a manifestation of the mathematical properties of the solutions of the
differential equations involved, and thus is generally applicable to solutions of first or second-order non-
linear differential equations. It is important to understand that the systems discussed in this chapter follow
a fully deterministic evolution predicted by the laws of classical mechanics, the evolution for which is based
on the prior history. This behavior is completely different from a random walk where each step is based on a
random process. The complicated motion of deterministic non-linear systems stems in part from sensitivity
to the initial conditions.
The French mathematician Poincaré is credited with being the first to recognize the existence of chaos
during his investigation of the gravitational three-body problem in celestial mechanics. At the end of the
nineteenth century Poincaré noticed that such systems exhibit high sensitivity to initial conditions
characteristic of chaotic motion, and the existence of nonlinearity which is required to produce chaos. Poincaré’s work
received little notice, in part because it was overshadowed by the parallel development of the Theory of Relativity
and quantum mechanics at the start of the 20th century. In addition, solving nonlinear equations of motion
is difficult, which discouraged work on nonlinear mechanics and chaotic motion. The field blossomed during
the 1960's when computers became sufficiently powerful to solve the nonlinear equations required to calculate
the long-time histories necessary to document the evolution of chaotic behavior. Laplace, and many other
scientists, believed in the deterministic view of nature which assumes that if the position and velocities of
all particles are known, then one can unambiguously predict the future motion using Newtonian mechanics.
Researchers in many fields of science now realize that this “clockwork universe” is invalid. That is, knowing
the laws of nature can be insufficient to predict the evolution of nonlinear systems in that the time evolu-
tion can be extremely sensitive to the initial conditions even though they follow a completely deterministic
development. There are two major classifications of nonlinear systems that lead to chaos in nature. The
first classification encompasses nondissipative Hamiltonian systems such as Poincaré’s three-body celestial
mechanics system. The other main classification involves driven, damped, non-linear oscillatory systems.
Nonlinearity and chaos is a broad and active field and thus this chapter will focus only on a few examples
that illustrate the general features of non-linear systems. Weak non-linearity is used to illustrate bifurcation
and asymptotic attractor solutions for which the system evolves independent of the initial conditions. The
common sinusoidally-driven linearly-damped plane pendulum illustrates several features characteristic of the
90 CHAPTER 4. NONLINEAR SYSTEMS AND CHAOS
evolution of a non-linear system from order to chaos. The impact of non-linearity on wavepacket propagation
velocities and the existence of soliton solutions is discussed. The example of the three-body problem is
discussed in chapter 11. The transition from laminar flow to turbulent flow is illustrated by fluid mechanics
discussed in chapter 16.8. Analytic solutions of nonlinear systems usually are not available and thus one
must resort to computer simulations. As a consequence the present discussion focusses on the main features
of the solutions for these systems and ignores how the equations of motion are solved.
Insert this first-order solution into equation 4.4, then the cubic term in the expansion gives a term $\cos^3\omega t =
\frac{1}{4}(\cos 3\omega t + 3\cos\omega t)$. Thus the perturbation expansion to third order involves a solution of the form
This perturbation solution shows that the non-linear term has distorted the signal by addition of the third
harmonic of the driving frequency with an amplitude that depends sensitively on . This illustrates that the
superposition principle is not obeyed for this non-linear system, but, if the non-linearity is weak, perturbation
theory can be used to derive the solution of a non-linear equation of motion.
Figure 4.1 illustrates that for a potential $U(x) = 2x^2 + x^4$ the effects of the $x^4$ non-linear term are greatest at the
maximum amplitude $x$, which makes the total energy contours in state-space more rectangular than the
elliptical shape for the harmonic oscillator as shown in figure 3.3. The solution is of the form given in
equation 4.6.
4.2. WEAK NONLINEARITY 91
Figure 4.1: The left side shows the potential energy for a symmetric potential $U(x) = 2x^2 + x^4$. The right
side shows the contours of constant total energy on a state-space diagram.
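$$x_1 = x_0 + \lambda y_1$$
Substituting this into the equation of motion, and neglecting terms of higher order than $\lambda$, gives
$$\ddot{y}_1 + \omega_0^2 y_1 = x_0^2 = \frac{A^2}{2}\left[1 - \cos(2\omega_0 t)\right]$$
To solve this try a particular integral
$$y_1 = B + C\cos(2\omega_0 t)$$
and substitution into the equation of motion gives
$$-3\omega_0^2 C\cos(2\omega_0 t) + \omega_0^2 B = \frac{A^2}{2} - \frac{A^2}{2}\cos(2\omega_0 t)$$
Comparison of the coefficients gives
$$B = \frac{A^2}{2\omega_0^2}$$
$$C = \frac{A^2}{6\omega_0^2}$$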
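$$D_2 = -\frac{2A^2}{3\omega_0^2}$$
and
$$x_1 = (A + \lambda D_1)\sin(\omega_0 t) + \frac{\lambda A^2}{\omega_0^2}\left[\frac{1}{2} - \frac{2}{3}\cos(\omega_0 t) + \frac{1}{6}\cos(2\omega_0 t)\right]$$
The constant $(A + \lambda D_1)$ is given by the initial amplitude and velocity.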
This system is nonlinear in that the output amplitude is not proportional to the input amplitude. Secondly,
a large amplitude second harmonic component is introduced in the output waveform; that is, for a non-linear
system the gain and frequency decomposition of the output differ from the input. Note that the frequency
composition is amplitude dependent. This particular example of a nonlinear system does not exhibit chaos.
The Laboratory for Laser Energetics uses nonlinear crystals to double the frequency of laser light.
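The first-order perturbation solution above can be compared with a direct numerical integration. The sketch below assumes the equation of motion $\ddot{x} + \omega_0^2 x = \lambda x^2$, which is the form implied by the first-order equation for $y_1$, with $x = 0$ at $t = 0$ and $D_1 = 0$. The parameter values, the Runge-Kutta integrator, and all names are illustrative assumptions.

```python
import math

OMEGA0, LAM, A = 1.0, 0.05, 1.0   # assumed illustrative parameters

def accel(x):
    """Assumed equation of motion: x'' = -omega0^2 x + lambda x^2."""
    return -OMEGA0**2 * x + LAM * x * x

def integrate(t_end, dt=1.0e-3):
    """Fourth-order Runge-Kutta integration starting from x = 0, v = A omega0."""
    x, v, t = 0.0, A * OMEGA0, 0.0
    out = [(t, x)]
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = v, accel(x)
        k2x, k2v = v + 0.5 * dt * k1v, accel(x + 0.5 * dt * k1x)
        k3x, k3v = v + 0.5 * dt * k2v, accel(x + 0.5 * dt * k2x)
        k4x, k4v = v + dt * k3v, accel(x + dt * k3x)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += dt
        out.append((t, x))
    return out

def first_order(t):
    """First-order perturbation solution x_1(t) with D_1 = 0."""
    bracket = 0.5 - (2.0 / 3.0) * math.cos(OMEGA0 * t) + math.cos(2.0 * OMEGA0 * t) / 6.0
    return A * math.sin(OMEGA0 * t) + LAM * A**2 / OMEGA0**2 * bracket

trajectory = integrate(2.0 * math.pi / OMEGA0)   # one unperturbed period
err_zero = max(abs(x - A * math.sin(OMEGA0 * t)) for t, x in trajectory)
err_first = max(abs(x - first_order(t)) for t, x in trajectory)
print(err_zero, err_first)
```

Over one period the first-order solution should track the numerical result noticeably better than the unperturbed sinusoid; the remaining difference is of second order in $\lambda$.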
is at the minimum, which is the origin of the state-space diagram as shown in figure 4.1.
The more complicated one-dimensional potential well
$$U(x) = 8 - 4x^2 + 0.5x^4$$
shown in figure 4.2 has two minima that are symmetric about $x = 0$ with a saddle of height 8.
The kinetic plus potential energies of a particle with mass $m = 2$ released in this potential will be
assumed to be given by
$$E(x, \dot{x}) = \dot{x}^2 + U(x) \qquad (4.9)$$
The state-space plot in figure 4.2 shows contours of constant energy with the minima at $(x, \dot{x}) = (\pm 2, 0)$.
At slightly higher total energy the contours are closed loops around either of the two minima at $x = \pm 2$.
At total energies above the saddle energy of 8 the contours are peanut-shaped and are symmetric about
the origin. Assuming that the motion is weakly damped, then a particle released with total energy $E_{total}$
which is higher than $E_{saddle}$ will follow a peanut-shaped spiral trajectory centered at $(x, \dot{x}) = (0, 0)$ in the
4.4. LIMIT CYCLES 93
Figure 4.2: The left side shows the potential energy for a bimodal symmetric potential $U(x) = 8 - 4x^2 +
0.5x^4$. The right-hand figure shows contours of the sum of kinetic and potential energies on a state-space
diagram. For total energies above the saddle point the particle follows peanut-shaped trajectories in
state-space centered around $(x, \dot{x}) = (0, 0)$. For total energies below the saddle point the particle will have closed
trajectories about either of the two symmetric minima located at $(x, \dot{x}) = (\pm 2, 0)$. Thus the system solution
bifurcates when the total energy is below the saddle point.
state-space diagram for $E_{total} > E_{saddle}$. For $E_{total} < E_{saddle}$ there are two separate solutions for the two
minima centered at $x = \pm 2$ and $\dot{x} = 0$. This is an example of bifurcation where the one solution for
$E_{total} > E_{saddle}$ bifurcates into either of the two solutions for $E_{total} < E_{saddle}$.
For an initial total energy $E_{total} > E_{saddle}$ damping will result in spiral trajectories of the particle that
will be trapped in one of the two minima. For $E_{total} > E_{saddle}$ the particle trajectories are centered, giving
the impression that they will terminate at $(x, \dot{x}) = (0, 0)$ when the kinetic energy is dissipated. However, for
$E_{total} < E_{saddle}$ the particle will be trapped in one of the two minima and the trajectory will terminate
at the bottom of that potential energy minimum occurring at $(x, \dot{x}) = (\pm 2, 0)$. These two possible terminal
points of the trajectory are called point attractors. This example appears to have a single attractor for
$E_{total} > E_{saddle}$ which bifurcates leading to two attractors at $(x, \dot{x}) = (\pm 2, 0)$ for $E_{total} < E_{saddle}$. The
determination as to which minimum traps a given particle depends on exactly where the particle starts in
state space and the damping etc. That is, for this case, where there is symmetry about the $x$-axis, if the
particle has an initial total energy $E_{total} > E_{saddle}$ then the initial conditions within $\pi$ radians of state space
will lead to trajectories that are trapped in the left minimum, and the other $\pi$ radians of state space will be
trapped in the right minimum. Trajectories starting near the split between these two halves of the starting
state space will be sensitive to the exact starting phase. This is an example of sensitivity to initial conditions.
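The trapping of a weakly damped particle in one of the two point attractors can be illustrated by integrating the motion in the potential $U(x) = 8 - 4x^2 + 0.5x^4$ with $m = 2$. This is a minimal sketch: the linear damping force $-b\dot{x}$, its coefficient, the initial conditions, and the Runge-Kutta integrator are assumptions made for illustration and are not specified in the text.

```python
def force(x):
    """-dU/dx for U(x) = 8 - 4 x^2 + 0.5 x^4."""
    return 8.0 * x - 2.0 * x**3

def settle(x, v, damping=0.5, mass=2.0, dt=0.01, t_end=200.0):
    """Integrate m x'' = -dU/dx - b x' with RK4 and return the final (x, v)."""
    def deriv(x, v):
        return v, (force(x) - damping * v) / mass
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = deriv(x, v)
        k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x, v

# Start above the saddle: E = v^2 + U = 16 + 8 = 24, so either well is reachable
final_high = settle(0.0, 4.0)
# Start below the saddle in the right-hand well: E = 0 + U(1) = 4.5
final_low = settle(1.0, 0.0)
print(final_high, final_low)
```

A start below the saddle energy always ends in the well it began in, whereas for starts above the saddle the final well depends on the initial conditions and on the damping.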
occur frequently in physics. The state-space paths do not cross for such two-dimensional autonomous systems,
where an autonomous system is not explicitly dependent on time.
The Poincaré-Bendixson theorem states that paths in state space, and in phase space, can be of three
possible types:
(1) closed paths, like the elliptical paths for the undamped harmonic oscillator,
(2) paths that terminate at an equilibrium point as $t \to \infty$, like the point attractor for a damped
harmonic oscillator,
(3) paths that tend to a limit cycle as $t \to \infty$.
The limit cycle is unusual in that the periodic motion tends asymptotically to the limit-cycle attractor
independent of whether the initial values are inside or outside the limit cycle. The balance of dissipative forces
and driving forces often leads to limit-cycle attractors, especially in biological applications. Identification of
limit-cycle attractors, as well as the trajectories of the motion towards these limit-cycle attractors, is more
complicated than for point attractors.
94 CHAPTER 4. NONLINEAR SYSTEMS AND CHAOS
Figure 4.3: The Poincaré-Bendixson theorem allows the following three scenarios for two-dimensional
autonomous systems. (1) Closed paths, as illustrated by the undamped harmonic oscillator. (2) Paths that
terminate at an equilibrium point as $t \to \infty$, as illustrated by the damped harmonic oscillator.
(3) Paths that tend to a limit cycle as $t \to \infty$, as illustrated by the van der Pol oscillator.
$$y \equiv \frac{dx}{dt} \tag{4.12}$$
$$\frac{dy}{dt} = -x - \mu\left(x^2 - 1\right)y \tag{4.13}$$
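It is advantageous to transform the $(\dot{x}, x)$ state space to polar coordinates by setting
$$x = r\cos\theta \tag{4.14}$$
$$y = r\sin\theta$$
and using the fact that $r^2 = x^2 + y^2$. Therefore
$$r\frac{dr}{dt} = x\frac{dx}{dt} + y\frac{dy}{dt} \tag{4.15}$$
Similarly for the angle coordinate
$$\frac{dx}{dt} = \frac{dr}{dt}\cos\theta - r\frac{d\theta}{dt}\sin\theta \tag{4.16}$$
$$\frac{dy}{dt} = \frac{dr}{dt}\sin\theta + r\frac{d\theta}{dt}\cos\theta \tag{4.17}$$
Multiplying equation 4.16 by $y$ and equation 4.17 by $x$ and subtracting gives
$$r^2\frac{d\theta}{dt} = x\frac{dy}{dt} - y\frac{dx}{dt} \tag{4.18}$$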
4.4. LIMIT CYCLES 95
Figure 4.4: Solutions of the van der Pol system for $\mu = 0.2$ (top row) and $\mu = 5$ (bottom row),
assuming that $\omega_0^2 = 1$. The left column shows the time dependence $x(t)$. The right column shows the
corresponding $(x, \dot{x})$ state-space plots. Upper: Weak nonlinearity, $\mu = 0.2$; at large times the
solution tends to one limit cycle for initial values inside or outside the limit cycle attractor. The
amplitude $x(t)$ for two initial conditions approaches an approximately harmonic oscillation. Lower: Strong
nonlinearity, $\mu = 5$; solutions approach a common limit cycle attractor for initial values inside or
outside the limit cycle attractor while the amplitude $x(t)$ approaches a common approximate square-wave
oscillation.
Equations 4.15 and 4.18 allow the van der Pol equations of motion to be written in polar coordinates
$$\frac{dr}{dt} = -\mu\left(r^2\cos^2\theta - 1\right)r\sin^2\theta \tag{4.19}$$
$$\frac{d\theta}{dt} = -1 - \mu\left(r^2\cos^2\theta - 1\right)\sin\theta\cos\theta \tag{4.20}$$
The non-linear terms on the right-hand side of equations 4.19 and 4.20 have a complicated form.
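The transformation to polar coordinates can be checked numerically at arbitrary points of state space. The
sketch below is illustrative only; it assumes the van der Pol system in the first-order form of equations
4.12 and 4.13 with $\omega_0^2 = 1$, and the function names are hypothetical. It compares the polar rates of
equations 4.19 and 4.20 with the rates obtained by inserting the Cartesian equations into equations 4.15 and
4.18.

```python
import math


def cartesian_rates(x, y, mu):
    """Right-hand sides of equations 4.12 and 4.13 (omega_0**2 = 1 assumed)."""
    return y, -x - mu * (x**2 - 1.0) * y


def polar_rates(r, theta, mu):
    """Right-hand sides of equations 4.19 and 4.20."""
    bracket = r**2 * math.cos(theta) ** 2 - 1.0
    r_dot = -mu * bracket * r * math.sin(theta) ** 2
    theta_dot = -1.0 - mu * bracket * math.sin(theta) * math.cos(theta)
    return r_dot, theta_dot


def polar_from_cartesian(r, theta, mu):
    """Apply equations 4.15 and 4.18 to the Cartesian rates."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    x_dot, y_dot = cartesian_rates(x, y, mu)
    r_dot = (x * x_dot + y * y_dot) / r
    theta_dot = (x * y_dot - y * x_dot) / r**2
    return r_dot, theta_dot


# Compare the two routes at a few sample points.
for r, theta, mu in [(0.5, 0.3, 0.2), (2.0, 1.1, 0.2), (3.0, 2.5, 5.0)]:
    print(polar_rates(r, theta, mu), polar_from_cartesian(r, theta, mu))
```

The two tuples printed on each line should agree to rounding error, which is a useful check on the signs in
equations 4.19 and 4.20.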
Weak non-linearity: $\mu \ll 1$
In the limit that $\mu \to 0$, equations 4.19 and 4.20 correspond to a circular state-space trajectory
similar to the harmonic oscillator. That is, the solution is of the form
$$x(t) = \rho\sin(t - t_0) \tag{4.21}$$
where $\rho$ and $t_0$ are arbitrary parameters. For weak non-linearity, $\mu \ll 1$, the angular equation
4.20 has a rotational frequency that is close to unity since the $\sin\theta\cos\theta$ term averages to
zero over each period, in addition to the small value of $\mu$. For $\mu \ll 1$ and $r < 1$ the bracket
$\left(r^2\cos^2\theta - 1\right)$ in the radial equation 4.19 is negative, so $dr/dt$ is positive and the
radius increases monotonically. For large $r$ the bracket is predominantly positive, so $dr/dt$ is
predominantly negative, resulting in a spiral decrease in the radius. Thus, for very weak non-linearity,
this radial behavior results in the amplitude spiralling to a well defined limit-cycle attractor value of
$\rho = 2$ as illustrated by the state-space plots in figure 4.4 for cases where the initial condition is
inside or external to the circular attractor. The final amplitudes for different initial conditions also
approach the same asymptotic behavior.
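The approach to the limit cycle from inside and from outside can be reproduced with a short numerical
integration. This is a sketch under stated assumptions: the van der Pol equation is taken in the form
$\ddot{x} + \mu(x^2 - 1)\dot{x} + x = 0$ with $\omega_0^2 = 1$, a fixed-step fourth-order Runge-Kutta scheme
is used, and the function name, step size and integration time are illustrative choices rather than those
used to produce figure 4.4.

```python
def van_der_pol_amplitude(x0, y0, mu=0.2, dt=0.01, t_total=200.0, t_tail=20.0):
    """Integrate x'' + mu*(x**2 - 1)*x' + x = 0 and return max |x| over the final t_tail."""

    def rates(x, y):
        # First-order form: dx/dt = y, dy/dt = -x - mu*(x**2 - 1)*y
        return y, -x - mu * (x * x - 1.0) * y

    x, y = x0, y0
    steps = int(t_total / dt)
    tail_start = int((t_total - t_tail) / dt)
    amplitude = 0.0
    for n in range(steps):
        k1x, k1y = rates(x, y)
        k2x, k2y = rates(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = rates(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = rates(x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        y += dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        if n >= tail_start:
            amplitude = max(amplitude, abs(x))   # track the late-time amplitude
    return amplitude


inside = van_der_pol_amplitude(0.1, 0.0)    # starts inside the limit cycle
outside = van_der_pol_amplitude(4.0, 0.0)   # starts outside the limit cycle
print(inside, outside)
```

Both starting points should give a late-time amplitude close to 2, in line with the limit-cycle value
quoted above for weak non-linearity.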
Dominant non-linearity: $\mu \gg 1$
For the case where the non-linearity is dominant, that is $\mu \gg 1$, then as shown in figure 4.4, the
system approaches a well defined attractor, but in this case it has a significantly skewed shape in
state-space, while the amplitude approximates a square wave. The solution remains close to $x = +2$ until
$y = \dot{x} \approx +7$ and then it relaxes quickly to $x = -2$ with $y = \dot{x} \approx 0$. This is
followed by the mirror image. This behavior is called a relaxed vibration in that a tension builds up
slowly then dissipates by a sudden relaxation process. The seesaw is an extreme example of a relaxation
oscillator where the seesaw angle switches spontaneously from one solution to the other when the difference
in their moment arms changes sign.
The study of feedback in electronic circuits was the stimulus for study of this equation by van der
Pol. However, Lord Rayleigh first identified such relaxation oscillator behavior in 1880 during studies of
vibrations of a stringed instrument excited by a bow, or the squeaking of a brake drum. In his discussion of
non-linear effects in acoustics, he derived the equation
2
̈ − ( − 2 )̇ − 20 = 0 (4.27)
02 0
The rhythm of a heartbeat driven by a pacemaker is an important application where the self-stabilization of
the attractor is a desirable characteristic to stabilize an irregular heartbeat; the medical term is arrhythmia.
The mechanism that leads to synchronization of the many pacemaker cells in the heart and human body due
to the influence of an implanted pacemaker is discussed in chapter 14.12. Another biological application of
limit cycles is the time variation of animal populations.
In summary the non-linear damping of the van der Pol oscillator leads to a self-stabilized, single limit-
cycle attractor that is insensitive to the initial conditions. The van der Pol oscillator has many important
applications such as bowed musical instruments, electrical circuits, and human anatomy as mentioned above.
The van der Pol oscillator illustrates the complicated manifestations of the motion that can be exhibited by
non-linear systems.
4.5. HARMONICALLY-DRIVEN, LINEARLY-DAMPED, PLANE PENDULUM 97
1 A similar approach is used in the book "Chaotic Dynamics" by Baker and Gollub [Bak96].
Figure 4.5: Motion of the driven damped pendulum for drive strengths of $\gamma = 0.2$, $\gamma = 0.9$,
$\gamma = 1.05$ and $\gamma = 1.078$. The left side shows the time dependence of the deflection angle
$\theta$ with the time axis expressed in dimensionless units $\tilde{t}$. The right side shows the
corresponding state-space plots. These plots assume $\tilde{\omega} = \frac{\omega}{\omega_0} = \frac{2}{3}$,
$Q = 2$, and the motion starts with $\theta = \omega = 0$.
Figure 4.6: The driven damped pendulum assuming that $\tilde{\omega} = \frac{2}{3}$, $Q = 2$, with initial
conditions $\theta(0) = -\frac{\pi}{2}$, $\omega(0) = 0$. The system exhibits period-two motion for a drive
strength of $\gamma = 1.078$ as shown by the state space diagram for cycles 10 to 20. For $\gamma = 1.081$
the system exhibits period-four motion shown for cycles 10 to 30.
where the admixture coefficient is much smaller than unity. This successive approximation method can be
repeated to add additional terms proportional to $\cos n(\omega t - \delta)$ where $n$ is an integer with
$n \geq 3$. Thus the nonlinearity introduces progressively weaker $n$-fold harmonics to the solution. This
successive approximation approach is viable only when the admixture coefficient is much smaller than unity.
Note that these harmonics are integer multiples of $\omega$, thus the steady-state response is identical for
each full period even though the state space contours deviate from an elliptical shape.
Figure 4.7: Rolling motion for the driven damped plane pendulum for $\gamma = 1.4$. (a) The time dependence
of angle $\theta(t)$ increases by $2\pi$ per drive period whereas (b) the angular velocity $\omega(t)$
exhibits periodicity. (c) The state space plot for rolling motion is shown with the origin shifted by $2\pi$
per revolution to keep the plot within the bounds $-\pi < \theta < +\pi$.
for rolling motion corresponds to a chain of loops with a spacing of $2\pi$ between each loop. The state
space diagram for rolling motion is more compactly presented if the origin is shifted by $2\pi$ per
revolution to keep the plot within bounds as illustrated in figure 4.7.
Figure 4.8: Left: State-space orbits for the driven damped pendulum with $\gamma = 1.105$. Note that the
orbits do not repeat for cycles 25 to 200. Right: Time-state-space diagram for $\gamma = 1.168$. The plot
shows 16 trajectories starting with different initial values in the range $-0.15 < \theta < 0.15$.
Figure 4.9: State-space plots for the harmonically-driven, linearly-damped, pendulum for driving amplitudes
of $\gamma = 0.5$ and $\gamma = 1.2$. These calculations were performed using the Runge-Kutta method by
E. Shah (Private communication).
$$\lambda = \lim_{t\to\infty}\,\lim_{\delta Z_0\to 0}\,\frac{1}{t}\ln\frac{|\delta Z(t)|}{|\delta Z_0|} \tag{4.40}$$
Systems for which the Lyapunov exponent $\lambda < 0$ (negative) converge exponentially to the same
attractor solution at long times since $|\delta Z(t)| \to 0$ for $t \to \infty$. By contrast, systems for
which $\lambda > 0$ (positive) diverge to completely different long-time solutions, that is,
$|\delta Z(t)| \to \infty$ for $t \to \infty$. Even for infinitesimally small differences in the initial
conditions, systems having a positive Lyapunov exponent diverge to different attractors, whereas when the
Lyapunov exponent $\lambda < 0$ they correspond to stable solutions.
4.6. DIFFERENTIATION BETWEEN ORDERED AND CHAOTIC MOTION 103
Figure 4.10: Lyapunov plots of the separation $\Delta$ versus time for two initial starting points
differing by $\Delta_0 = 0.001$. The parameters are $Q = 2$ and $F(t) = \gamma\sin(\frac{2}{3}t)$ and
$\Delta t = 0.04$. The Lyapunov exponent for $\gamma = 0.5$, which is drawn as a dashed line, is convergent
with $\lambda = -0.251$. For $\gamma = 1.2$ the exponent is divergent as indicated by the dashed line which
has a slope of $\lambda = 0.1538$. These calculations were performed using the Runge-Kutta method by
E. Shah (Private communication).
Figure 4.10 illustrates Lyapunov plots for the harmonically-driven, linearly-damped, plane pendulum,
with the same conditions discussed in chapter 4.5. Note that for the small driving amplitude $\gamma = 0.5$
the Lyapunov plot converges to ordered motion with an exponent $\lambda = -0.251$, whereas for
$\gamma = 1.2$ the plot diverges, characteristic of chaotic motion, with an exponent $\lambda = 0.1538$. The
Lyapunov exponent usually fluctuates widely at the local oscillator frequency, and thus the time average of
the Lyapunov exponent must be taken over many periods of the oscillation to identify the general trend with
time. Some systems near an order-to-chaos transition can exhibit positive Lyapunov exponents for short
times, characteristic of chaos, and then converge to negative $\lambda$ at longer time implying ordered
motion. The Lyapunov exponents are used extensively to monitor the stability of the solutions for
non-linear systems. For example the Lyapunov exponent is used to identify whether fluid flow is laminar or
turbulent as discussed in chapter 16.8.
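The way a Lyapunov exponent is read off from a plot of separation versus time, and what it implies for
predictability, can be sketched in a few lines. The example below is illustrative only: it assumes that the
separation grows on average as $|\delta(t)| \approx |\delta_0| e^{\lambda t}$, it uses synthetic data rather
than a pendulum simulation, and the function names and the numerical values are hypothetical.

```python
import math


def lyapunov_from_separation(times, separations):
    """Least-squares slope of ln|separation| versus time, an estimate of lambda."""
    logs = [math.log(abs(s)) for s in separations]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    num = sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


def prediction_horizon(lyapunov, initial_error, tolerance):
    """Time for an error growing as exp(lyapunov * t) to reach the tolerance."""
    return math.log(tolerance / initial_error) / lyapunov


# Synthetic separation data with an ASSUMED exponent of 0.15 and a small ripple
# that mimics the fluctuation at the oscillator frequency.
times = [0.5 * i for i in range(80)]
separations = [1e-3 * math.exp(0.15 * t) * (1.0 + 0.3 * math.sin(2.0 * t)) for t in times]

estimate = lyapunov_from_separation(times, separations)
print("estimated exponent:", estimate)
print("horizon for a 1000-fold error growth:", prediction_horizon(estimate, 1e-3, 1.0))
```

Fitting over many oscillation periods averages out the ripple, which is the reason the time average over
many periods is needed. Because the horizon depends only logarithmically on the initial error, improving the
initial measurement extends the horizon only modestly when $\lambda > 0$.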
A dynamical system in $n$-dimensional phase space will have a set of $n$ Lyapunov exponents
$\{\lambda_1, \lambda_2, \dots, \lambda_n\}$ associated with a set of attractors, the importance of which
depends on the initial conditions. Typically one Lyapunov exponent dominates at one specific location in
phase space, and thus it is usual to use the maximal Lyapunov exponent to identify chaos. The Lyapunov
exponent is a very sensitive measure of the onset of chaos and provides an important test of the chaotic
nature for the complicated motion exhibited by non-linear systems.
Figure 4.12: Three Poincaré section plots for the harmonically-driven, linearly-damped, pendulum for various
initial conditions with $\gamma = 1.2$, $\tilde{\omega} = \frac{2}{3}$ and ∆ = 100. These calculations used
the Runge-Kutta method and were performed for 6000 by E. Shah (Private communication).
when the restoring force is non-linear. The system exhibits bifurcation where it can evolve to multiple
attractors that depend sensitively on the initial conditions. The system exhibits both oscillatory, and rolling,
solutions depending on the amplitude of the motion. The system exhibits domains of simple ordered motion
separated by domains of very complicated ordered motion as well as chaotic regions. The transitions between
these dramatically different modes of motion are extremely sensitive to the amplitude and phase of the
driver. Eventually the motion becomes completely chaotic. The Lyapunov exponent, bifurcation diagram,
and Poincaré section plots, are sensitive measures of the order of the motion. These three sensitive measures
of order and chaos are used extensively in many fields in classical mechanics. Considerable computing
capabilities are required to elucidate the complicated motion involved in non-linear systems. Examples
include laminar and turbulent flow in fluid dynamics and weather forecasting of hurricanes, where the
motion can span a wide dynamic range in dimensions from $10^{-5}$ to $10^{4}$.
where $\omega$ is used as the independent variable since it is invariant to phase transitions of the
system. Note that the factor for the first derivative term is the reciprocal of the group velocity
$$\left(\frac{dk}{d\omega}\right)_{\omega=\omega_0} \equiv \frac{1}{v_{group}} \tag{4.42}$$
$$\frac{\partial\phi}{\partial t} + \frac{\partial^3\phi}{\partial x^3} + 6\phi\frac{\partial\phi}{\partial x} = 0 \tag{4.49}$$
A solution of this equation has the characteristics of a solitary wave with fixed shape. It is given by
substituting the form $\phi(x, t) = f(x - ct)$ into the Korteweg-de Vries equation which gives, writing
$\xi \equiv x - ct$,
$$-c\frac{df}{d\xi} + \frac{d^3 f}{d\xi^3} + 6f\frac{df}{d\xi} = 0 \tag{4.50}$$
Integrating with respect to $\xi$ gives
$$3f^2 + \frac{d^2 f}{d\xi^2} - cf = A \tag{4.51}$$
where $A$ is a constant of integration. This non-linear equation has a solution
$$\phi(x, t) = \frac{1}{2}c\,\mathrm{sech}^2\left[\frac{\sqrt{c}}{2}(x - ct - a)\right] \tag{4.52}$$
where $a$ is a constant. Equation 4.52 is the equation of a solitary wave moving in the $+x$ direction at a
velocity $c$.
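That a sech-squared pulse solves the Korteweg-de Vries equation can be checked numerically without any
algebra. The sketch below assumes the equation in the form
$\partial\phi/\partial t + \partial^3\phi/\partial x^3 + 6\phi\,\partial\phi/\partial x = 0$ and the pulse
$\phi = \frac{1}{2}c\,\mathrm{sech}^2[\frac{\sqrt{c}}{2}(x - ct - a)]$; the function names, the value of $c$
and the finite-difference step are illustrative assumptions.

```python
import math


def soliton(x, t, c=2.0, a=0.0):
    """Solitary wave phi(x, t) = (c/2) sech^2[(sqrt(c)/2)(x - c*t - a)]."""
    arg = 0.5 * math.sqrt(c) * (x - c * t - a)
    return 0.5 * c / math.cosh(arg) ** 2


def kdv_residual(x, t, h=1e-3, c=2.0, a=0.0):
    """Central-difference estimate of phi_t + phi_xxx + 6*phi*phi_x at (x, t)."""

    def phi(xx, tt):
        return soliton(xx, tt, c, a)

    phi_t = (phi(x, t + h) - phi(x, t - h)) / (2.0 * h)
    phi_x = (phi(x + h, t) - phi(x - h, t)) / (2.0 * h)
    phi_xxx = (phi(x + 2.0 * h, t) - 2.0 * phi(x + h, t)
               + 2.0 * phi(x - h, t) - phi(x - 2.0 * h, t)) / (2.0 * h**3)
    return phi_t + phi_xxx + 6.0 * phi(x, t) * phi_x


# The individual terms are of order unity, so a small residual indicates cancellation.
for x, t in [(0.7, 0.3), (-1.0, 0.0), (2.5, 1.0)]:
    print(x, t, kdv_residual(x, t))
```

The residual should be limited only by the finite-difference error, whereas the separate terms in the
equation are of order unity near the peak of the pulse.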
Soliton behavior is observed in phenomena such as tsunamis, tidal bores that occur for some rivers,
signals in optical fibres, plasmas, atmospheric waves, vortex filaments, superconductivity, and gravitational
fields having cylindrical symmetry. Much work has been done on solitons for fibre optics applications. The
soliton’s inherent stability makes long-distance transmission possible without the use of repeaters, and could
potentially double the transmission capacity.
Before the discovery of solitons, mathematicians were under the impression that nonlinear partial differ-
ential equations could not be solved exactly. However, solitons led to the recognition that there are non-linear
systems that can be solved analytically. This discovery has prompted much investigation into these so-called
“integrable systems.” Such systems are rare, as most non-linear differential equations admit chaotic behavior
with no explicit solutions. Integrable systems nevertheless lead to very interesting mathematics ranging from
differential geometry and complex analysis to quantum field theory and fluid dynamics.
Many of the fundamental equations in physics (Maxwell’s, Schrödinger’s) are linear equations. However,
physicists have begun to recognize many areas of physics in which nonlinearity can result in qualitatively
new phenomena that cannot be constructed via perturbation theory starting from linearized equations.
These include phenomena in magnetohydrodynamics, meteorology, oceanography, condensed matter physics,
nonlinear optics, and elementary particle physics. For example, the European space mission Cluster detected
a soliton-like electrical disturbance that travelled through the ionized gas surrounding the Earth starting
about 50,000 kilometers from Earth and travelling towards the planet at about 8 km/s. It is thought that
this soliton was generated by turbulence in the magnetosphere.
Efforts to understand the nonlinearity of solitons have led to much research in many areas of physics. In
the context of solitons, their particle-like behavior (in that they are localized and preserved under collisions)
leads to a number of experimental and theoretical applications. The technique known as bosonization allows
viewing particles, such as electrons and positrons, as solitons in appropriate field equations. There are
numerous macroscopic phenomena, such as internal waves on the ocean, spontaneous transparency, and the
behavior of light in fiber optic cable, that are now understood in terms of solitons. These phenomena are
being applied to modern technology.
4.8 Summary
The study of the dynamics of non-linear systems remains a vibrant and rapidly evolving field in classical
mechanics as well as many other branches of science. This chapter has discussed examples of non-linear
systems in classical mechanics. It was shown that the superposition principle is broken even for weak
nonlinearity. It was shown that increased nonlinearity leads to bifurcation, point attractors, limit-cycle
attractors, and sensitivity to initial conditions.
Limit-cycle attractors: The Poincaré-Bendixson theorem for limit-cycle attractors states that paths,
both in state space and phase space, can be of three possible types:
(1) closed paths, like the elliptical paths for the undamped harmonic oscillator,
(2) paths that terminate at an equilibrium point as $t \to \infty$, like the point attractor for a damped
harmonic oscillator,
(3) paths that tend to a limit cycle as $t \to \infty$.
The limit cycle is unusual in that the periodic motion tends asymptotically to the limit-cycle attractor
independent of whether the initial values are inside or outside the limit cycle. The balance of dissipative
forces and driving forces often leads to limit-cycle attractors, especially in biological applications.
Identification of limit-cycle attractors, as well as the trajectories of the motion towards these
limit-cycle attractors, is more complicated than for point attractors.
The van der Pol oscillator is a common example of a limit-cycle system that has an equation of motion
of the form
$$\frac{d^2 x}{dt^2} + \mu\left(x^2 - 1\right)\frac{dx}{dt} + \omega_0^2 x = 0 \tag{4.11}$$
The van der Pol oscillator has a limit-cycle attractor that includes non-linear damping and exhibits
periodic solutions that asymptotically approach one attractor solution independent of the initial conditions.
There are many examples in nature that exhibit similar behavior.
Harmonically-driven, linearly-damped, plane pendulum: The non-linearity of the well-known
driven linearly-damped plane pendulum was used as an example of the behavior of non-linear systems in
nature. It was shown that non-linearity leads to discontinuous period bifurcation, extreme sensitivity to
initial conditions, rolling motion and chaos.
Differentiation between ordered and chaotic motion: Lyapunov exponents, bifurcation diagrams,
and Poincaré sections were used to identify the transition from order to chaos. Chapter 16.8 discusses
the non-linear Navier-Stokes equations of viscous-fluid flow which leads to complicated transitions between
laminar and turbulent flow. Fluid flow exhibits remarkable complexity that nicely illustrates the dominant
role that non-linearity can have on the solutions of practical non-linear systems in classical mechanics.
Wave propagation for non-linear systems: Non-linear equations can lead to unexpected behavior
for wave packet propagation such as fast or slow light as well as soliton solutions. Moreover, it is notable
that some non-linear systems can lead to analytic solutions.
The complicated phenomena exhibited by the above non-linear systems are not restricted to classical
mechanics, rather they are a manifestation of the mathematical behavior of the solutions of the differential
equations involved. That is, this behavior is a general manifestation of the behavior of solutions for second-
order differential equations. Exploration of this complex motion has only become feasible with the advent
of powerful computer facilities during the past three decades. The breadth of phenomena exhibited by
these examples is manifest in myriads of other nonlinear systems, ranging from many-body motion, weather
patterns, growth of biological species, epidemics, motion of electrons in atoms, etc. Other examples of non-
linear equations of motion not discussed here, are the three-body problem, which is mentioned in chapter
11, and turbulence in fluid flow which is discussed in chapter 16.
It is stressed that the behavior discussed in this chapter is very different from the random walk problem
which is a stochastic process where each step is purely random and not deterministic. This chapter has
assumed that the motion is fully deterministic and rigorously follows the laws of classical mechanics. Even
though the motion is fully deterministic, and follows the laws of classical mechanics, the motion is extremely
sensitive to the initial conditions and the non-linearities can lead to chaos. Computer modelling is the only
viable approach for predicting the behavior of such non-linear systems. The complexity of solving non-linear
equations is the reason that this book will continue to consider only linear systems. Fortunately, in nature,
non-linear systems can be approximately linear when the small-amplitude assumption is applicable.
Workshop exercises
1. Consider the chaotic motion of the driven damped pendulum whose equation of motion is given by
for which the Lyapunov exponent is $\lambda = 1$ with time measured in units of the drive period.
(a) Assume that you need to predict $\theta(t)$ with accuracy of $10^{-2}$, and that the initial value
$\theta(0)$ is known to within $10^{-6}$. What is the maximum time horizon $t_{max}$ for which you can
predict $\theta(t)$ to within the required accuracy?
(b) Suppose that you manage to improve the accuracy of the initial value to $10^{-9}$ (that is, a
thousand-fold improvement). What is the time horizon now for achieving the accuracy of $10^{-2}$?
(c) By what factor has $t_{max}$ improved with the 1000-fold improvement in initial measurement?
(d) What does this imply regarding long-term predictions of chaotic motion?
2. A non-linear oscillator satisfies the equation $\ddot{x} + \dot{x}^3 + x = 0$. Find the polar equations
for the motion in the state-space diagram. Show that any trajectory that starts within the circle $r < 1$
encircles the origin infinitely many times in the clockwise direction. Show further that these trajectories
in state space terminate at the origin.
3. Consider the system of a mass suspended between two identical springs as shown.
If each spring is stretched a distance $d$ to attach the mass at the equilibrium position the mass is
subject to two equal and oppositely directed forces of magnitude $kd$. Ignore gravity. Show that the
potential in which the mass moves is approximately
$$U(x) = \left\{\frac{kd}{l}\right\}x^2 + \left\{\frac{k(l - d)}{4l^3}\right\}x^4$$
Construct a state-space diagram for this potential.
Problems
1. A non-linear oscillator satisfies the equation
2. A mass $m$ moves in one direction and is subject to a constant force $+F_0$ when $x < 0$ and to a
constant force $-F_0$ when $x > 0$. Describe the motion by constructing a state space diagram. Calculate the
period of the motion in terms of $m$, $F_0$ and the amplitude $A$. Disregard damping.
− ||
() = (
−( + ) + ||
Chapter 5
Calculus of variations
5.1 Introduction
The prior chapters have focussed on the intuitive Newtonian approach to classical mechanics, which is based
on vector quantities like force, momentum, and acceleration. Newtonian mechanics leads to second-order
differential equations of motion. The calculus of variations underlies a powerful alternative approach to
classical mechanics that is based on identifying the path that minimizes an integral quantity. This integral
variational approach was first championed by Gottfried Wilhelm Leibniz, contemporaneously with Newton’s
development of the differential approach to classical mechanics.
During the 18th century, Bernoulli, who was a student of Leibniz, developed the field of variational
calculus which underlies the integral variational approach to mechanics. He solved the brachistochrone
problem which involves finding the path for which the transit time between two points is the shortest. The
integral variational approach also underlies Fermat’s principle in optics, which can be used to derive that
the angle of reflection equals the angle of incidence, as well as derive Snell’s law. Other applications of the
calculus of variations include solving the catenary problem, finding the maximum and minimum distances
between two points on a surface, polygon shapes having the maximum ratio of enclosed area to perimeter,
or maximizing profit in economics. Bernoulli developed the principle of virtual work used to describe
equilibrium in static systems, and d’Alembert extended the principle of virtual work to dynamical systems.
Euler, the preeminent Swiss mathematician of the 18th century and a student of Bernoulli, developed the
calculus of variations with full mathematical rigor. The culmination of the development of the Lagrangian
variational approach to classical mechanics was achieved by Lagrange (1736-1813), who was a student of Euler.
The Euler-Lagrangian approach to classical mechanics stems from a deep philosophical belief that the
laws of nature are based on the principle of economy. That is, the physical universe follows paths through
space and time that are based on extrema principles. The standard Lagrangian $L$ is defined as the
difference between the kinetic and potential energy, that is
$$L = T - U \tag{5.1}$$
Chapters 6 through 9 will show that the laws of classical mechanics can be expressed in terms of Hamilton’s
variational principle which states that the motion of the system between the initial time $t_1$ and final
time $t_2$ follows a path that minimizes the scalar action integral $S$ defined as the time integral of the
Lagrangian.
$$S = \int_{t_1}^{t_2} L\,dt \tag{5.2}$$
The calculus of variations provides the mathematics required to determine the path that minimizes the
action integral. This variational approach is both elegant and beautiful, and has withstood the rigors of
experimental confirmation. In fact, not only is it an exceedingly powerful alternative approach to the intuitive
Newtonian approach in classical mechanics, but Hamilton’s variational principle now is recognized to be more
fundamental than Newton’s Laws of Motion. The Lagrangian and Hamiltonian variational approaches to
mechanics are the only approaches that can handle the Theory of Relativity, statistical mechanics, and the
dichotomy of philosophical approaches to quantum physics.
Here $x$ is the independent variable, $y(x)$ the dependent variable, plus its first derivative
$y' \equiv \frac{dy}{dx}$. The quantity $f[y(x), y'(x); x]$ has some given dependence on $y$, $y'$ and $x$.
The calculus of variations involves varying the function $y(x)$ until a stationary value of $F$ is found,
which is presumed to be an extremum. This means that if a function $y = y(x)$ gives a minimum value for the
scalar functional $F$, then any neighboring function, no matter how close to $y(x)$, must increase $F$. For
all paths, the integral $F$ is taken between two fixed points, $(x_1, y_1)$ and $(x_2, y_2)$. Possible paths
between the initial and final points are illustrated in figure 5.1. Relative to any neighboring path, the
functional $F$ must have a stationary value which is presumed to be the correct extremum path.
Define a neighboring function using a parametric representation $y(\epsilon, x)$ such that for
$\epsilon = 0$, $y = y(0, x) = y(x)$ is the function that yields the extremum for $F$. Assume that an
infinitesimally small fraction $\epsilon$ of the neighboring function $\eta(x)$ is added to the extremum
path $y(x)$. That is, assume
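The condition that the integral has a stationary (extremum) value is that $F$ be independent of $\epsilon$
to first order along the path. That is, the extremum value occurs for $\epsilon = 0$ where
$$\left(\frac{dF}{d\epsilon}\right)_{\epsilon=0} = 0 \tag{5.6}$$
for all functions $\eta(x)$. This is illustrated on the right side of figure 5.1.
Applying condition (5.6) to equation (5.5), and since $x$ is independent of $\epsilon$, then
$$\frac{\partial F}{\partial\epsilon} = \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y}\frac{\partial y}{\partial\epsilon} + \frac{\partial f}{\partial y'}\frac{\partial y'}{\partial\epsilon}\right)dx = 0 \tag{5.7}$$
Since the limits of integration are fixed, the differential operation affects only the integrand. From
equations (5.4),
$$\frac{\partial y}{\partial\epsilon} = \eta(x) \tag{5.8}$$
and
$$\frac{\partial y'}{\partial\epsilon} = \frac{d\eta}{dx} \tag{5.9}$$
Consider the second term in the integrand
$$\int_{x_1}^{x_2}\frac{\partial f}{\partial y'}\frac{\partial y'}{\partial\epsilon}\,dx = \int_{x_1}^{x_2}\frac{\partial f}{\partial y'}\frac{d\eta}{dx}\,dx \tag{5.10}$$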
5.2. EULER’S DIFFERENTIAL EQUATION 113
Figure 5.1: The left shows the extremum $y(x)$ and neighboring paths $y(\epsilon, x) = y(x) + \epsilon\eta(x)$
between $(x_1, y_1)$ and $(x_2, y_2)$ that minimizes the function $F = \int_{x_1}^{x_2} f[y(x), y'(x); x]\,dx$.
The right shows the dependence of $F$ as a function of the admixture coefficient $\epsilon$ for a maximum
(upper) or a minimum (lower) at $\epsilon = 0$.
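Integrate by parts
$$\int u\,dv = uv - \int v\,du \tag{5.11}$$
gives
$$\int_{x_1}^{x_2}\frac{\partial f}{\partial y'}\frac{d\eta}{dx}\,dx = \left[\frac{\partial f}{\partial y'}\eta(x)\right]_{x_1}^{x_2} - \int_{x_1}^{x_2}\eta(x)\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right)dx \tag{5.12}$$
Note that the first term on the right-hand side is zero since by definition
$\frac{\partial y}{\partial\epsilon} = \eta(x) = 0$ at $x_1$ and $x_2$. Thus
$$\frac{\partial F}{\partial\epsilon} = \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y}\frac{\partial y}{\partial\epsilon} + \frac{\partial f}{\partial y'}\frac{\partial y'}{\partial\epsilon}\right)dx = \int_{x_1}^{x_2}\left(\frac{\partial f}{\partial y}\eta(x) - \eta(x)\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right)\right)dx$$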
This integral now appears to be independent of $\epsilon$. However, the functions $y$ and $y'$ occurring in
the derivatives are functions of $\epsilon$. Since $\left(\frac{\partial F}{\partial\epsilon}\right)_{\epsilon=0}$
must vanish for a stationary value, and because $\eta(x)$ is an arbitrary function subject to the conditions
stated, then the above integrand must be zero. This derivation that the integrand must be zero leads to
Euler’s differential equation
$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = 0 \tag{5.15}$$
where $y$ and $y'$ are the original functions, independent of $\epsilon$. The basis of the calculus of
variations is that the function $y(x)$ that satisfies Euler’s equation is a stationary function. Note that
the stationary value could be either a maximum or a minimum value. When Euler’s equation is applied to
mechanical systems using the Lagrangian as the functional, then Euler’s differential equation is called the
Euler-Lagrange equation.
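The stationarity condition can be made concrete by evaluating a functional along a family of varied paths.
The sketch below is illustrative only: it assumes the path-length functional with integrand
$\sqrt{1 + y'^2}$, fixed end points $(0, 0)$ and $(1, 2)$, and a hypothetical neighboring function
$\eta(x) = x(1 - x)^2$ that vanishes at both end points; all names are chosen for this example.

```python
import math


def path_length(y, x1=0.0, x2=1.0, n=2000):
    """Approximate the integral of sqrt(1 + y'^2) dx by summing short chords."""
    dx = (x2 - x1) / n
    total = 0.0
    for i in range(n):
        xa = x1 + i * dx
        xb = xa + dx
        total += math.hypot(dx, y(xb) - y(xa))
    return total


def eta(x):
    """ASSUMED neighboring function; it vanishes at x = 0 and x = 1."""
    return x * (1.0 - x) ** 2


def varied_length(y0, eps):
    """F(eps) for the varied path y0(x) + eps * eta(x)."""
    return path_length(lambda x: y0(x) + eps * eta(x))


def slope_at_zero(y0, h=1e-4):
    """Central-difference estimate of dF/d(eps) at eps = 0."""
    return (varied_length(y0, h) - varied_length(y0, -h)) / (2.0 * h)


def straight(x):
    return 2.0 * x          # straight line from (0, 0) to (1, 2)


def parabola(x):
    return 2.0 * x * x      # a different path between the same end points


print("straight line:", slope_at_zero(straight))
print("parabola:     ", slope_at_zero(parabola))
```

For the straight line the slope with respect to the admixture coefficient should vanish to numerical
precision, while for the parabola it should not, which indicates that the parabola is not a stationary path
of this functional.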
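Figure: Shortest distance between two points $(x_1, y_1)$ and $(x_2, y_2)$ in a plane.
The function $f$ is
$$f = \sqrt{1 + (y')^2}$$
Therefore
$$\frac{\partial f}{\partial y} = 0$$
and
$$\frac{\partial f}{\partial y'} = \frac{y'}{\sqrt{1 + (y')^2}}$$
Inserting these into Euler’s equation 5.15 gives
$$0 - \frac{d}{dx}\left(\frac{y'}{\sqrt{1 + (y')^2}}\right) = 0$$
that is
$$\frac{y'}{\sqrt{1 + (y')^2}} = \text{constant} = C$$
This is valid if
$$y' = \frac{C}{\sqrt{1 - C^2}} = a$$
Therefore
$$y = ax + b$$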
which is the equation of a straight line in the plane. Thus the shortest path between two points in a plane is
a straight line between these points, as is intuitively obvious. This stationary value obviously is a minimum.
This trivial example of the use of Euler’s equation to determine an extremum value has given the obvious
answer. It has been presented here because it provides a proof that a straight line is the shortest distance in
a plane and illustrates the power of the calculus of variations to determine extremum paths.
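Consider that the particle of mass $m$ starts at the origin $x_1 = 0$, $y_1 = 0$ with zero velocity. Since
the problem conserves energy and assuming that initially $E = T + U = 0$ then
$$\frac{1}{2}mv^2 - mgx = 0$$
That is
$$v = \sqrt{2gx}$$
The transit time is given by
$$t = \int_{1}^{2}\frac{ds}{v} = \int_{1}^{2}\frac{\sqrt{dx^2 + dy^2}}{\sqrt{2gx}} = \int_{x_1}^{x_2}\sqrt{\frac{(1 + y'^2)}{2gx}}\,dx$$
where $y' \equiv \frac{dy}{dx}$. Note that, in this example, the independent variable has been chosen to be
$x$ and the dependent variable is $y(x)$.
The function $f$ of the integral is
$$f = \frac{1}{\sqrt{2g}}\sqrt{\frac{(1 + y'^2)}{x}}$$
Factor out the constant $\sqrt{2g}$ term, which does not affect the final equation, and note that
$$\frac{\partial f}{\partial y} = 0$$
$$\frac{\partial f}{\partial y'} = \frac{y'}{\sqrt{x\left(1 + (y')^2\right)}}$$
Euler’s equation 5.15 therefore requires that $\frac{d}{dx}\left(\frac{\partial f}{\partial y'}\right) = 0$, or
$$\frac{y'}{\sqrt{x\left(1 + (y')^2\right)}} = \text{constant} = \frac{1}{\sqrt{2a}}$$
That is
$$\frac{y'^2}{x\left(1 + (y')^2\right)} = \frac{1}{2a}$$
Solving for $y'$ gives $y' = \sqrt{x/(2a - x)}$. Substituting $x = a(1 - \cos\theta)$, so that
$dx = a\sin\theta\,d\theta$, and integrating gives $y = \int a(1 - \cos\theta)\,d\theta$, or
$$y = a(\theta - \sin\theta) + \text{constant}$$
Figure: Cycloid between $(x_1, y_1)$ and $(x_2, y_2)$, showing a point $P(x, y)$ on the curve and the
lengths $a$ and $2a$.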
The parametric equations for a cycloid passing through the origin are
$$y = a(\theta - \sin\theta)$$
$$x = a(1 - \cos\theta)$$
which is the form of the solution found. That is, the shortest time between two points is obtained by
constraining the motion of the mass to follow a cycloid shape. Thus the mass first accelerates rapidly by
falling down steeply and then follows the curve and coasts upward at the end. The elapsed time is obtained
by inserting the above parametric relations for $x$ and $y$ in terms of $\theta$ into the transit time
integral giving $t = \sqrt{\frac{a}{g}}\,\theta$ where $a$ and $\theta$ are fixed by the end point
coordinates. Thus the time to fall from starting with zero velocity at the cusp to the minimum of the
cycloid is $\pi\sqrt{\frac{a}{g}}$. If $x_2 = x_1 = 0$ then $y_2 = 2\pi a$, which defines the shape of the
cycloid, and the minimum time is $2\pi\sqrt{\frac{a}{g}} = \sqrt{\frac{2\pi y_2}{g}}$. If the mass starts
with a non-zero initial velocity, then the starting point is not at the cusp of the cycloid, but down a
distance $d$ such that the kinetic energy equals the potential energy difference from the cusp.
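The cycloid transit time can be compared numerically with that of a straight chute between the same end
points. The sketch below is illustrative: it assumes the parametrization $x = a(1 - \cos\theta)$,
$y = a(\theta - \sin\theta)$ with $x$ measured vertically downwards (consistent with $v = \sqrt{2gx}$), a
start from rest, and $g = 9.81$ m/s$^2$; the function names are hypothetical.

```python
import math

G = 9.81  # assumed gravitational acceleration in m/s^2


def cycloid_time(a, theta_end, n=20000):
    """Midpoint-rule integral of ds/v along the cycloid from theta = 0 to theta_end."""
    dtheta = theta_end / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * dtheta
        dx = a * math.sin(th)             # dx/dtheta
        dy = a * (1.0 - math.cos(th))     # dy/dtheta
        x = a * (1.0 - math.cos(th))      # vertical drop below the starting point
        total += math.hypot(dx, dy) / math.sqrt(2.0 * G * x) * dtheta
    return total


def straight_line_time(x2, y2):
    """Time to slide from rest along a straight frictionless chute from (0, 0) to (x2, y2)."""
    length = math.hypot(x2, y2)
    # Constant acceleration G*x2/length along the chute gives t = length*sqrt(2/(G*x2)).
    return length * math.sqrt(2.0 / (G * x2))


a = 1.0
# End point at the minimum of the cycloid: theta = pi, x2 = 2a, y2 = pi*a.
t_cycloid = cycloid_time(a, math.pi)
t_formula = math.pi * math.sqrt(a / G)
t_line = straight_line_time(2.0 * a, math.pi * a)
print(t_cycloid, t_formula, t_line)
```

The numerical integral should reproduce $\pi\sqrt{a/g}$, and the straight chute between the same two points
should take longer.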
A modern application of the Brachistochrone problem is determination of the optimum shape of the low-
friction emergency chute that passengers slide down to evacuate a burning aircraft. Bernoulli solved the
problem of rapid evacuation of an aircraft two centuries before the first flight of a powered aircraft.
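With the integrand $f = e^{-kz}\sqrt{1 + z'^2}$, the partial derivative with respect to $z$ is
$$\frac{\partial f}{\partial z} = -k e^{-kz}\sqrt{1 + z'^2}$$
Therefore Euler’s equation equals
$$\frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'} = -k e^{-kz}\sqrt{1 + z'^2} - \frac{z'' e^{-kz}}{\sqrt{1 + z'^2}} + \frac{k z'^2 e^{-kz}}{\sqrt{1 + z'^2}} + \frac{z'' z'^2 e^{-kz}}{\left(1 + z'^2\right)^{3/2}} = 0$$
This simplifies to $z'' + k\left(1 + z'^2\right) = 0$, that is, $z' = \tan(c_1 - kx)$.
Integration gives
$$z(x) = \int_{-a}^{x} z'\,dx = \int_{-a}^{x} \tan(c_1 - kx)\,dx = \frac{\ln(\cos(c_1 - kx)) - \ln(\cos(c_1 + ka))}{k} + c_2 = \frac{1}{k}\ln\left(\frac{\cos(c_1 - kx)}{\cos(c_1 + ka)}\right) + c_2$$
Using the initial condition that $z(-a) = 0$ gives $c_2 = 0$. Similarly the final condition $z(a) = 0$ implies that
$c_1 = 0$. Thus Euler’s equation has determined that the optimal trajectory that minimizes the cost integral
is
$$z(x) = \frac{1}{k}\ln\left(\frac{\cos(kx)}{\cos(ka)}\right)$$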
This example is typical of problems encountered in economics.
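A quick numerical check of this trajectory is sketched below. It assumes illustrative values $a = 1$ and $k = 0.5$ (so that $ka < \pi/2$) and uses hypothetical function names. It confirms that $z(x)$ vanishes at both ends and that the finite-difference estimate of $z'' + k(1 + z'^2)$, the simplified form of Euler’s equation for this integrand, is close to zero along the path.

```python
import math

def height(x, a, k):
    """Extremal z(x) = (1/k) ln(cos(kx) / cos(ka)); requires k * a < pi / 2."""
    return math.log(math.cos(k * x) / math.cos(k * a)) / k

def euler_residual(x, a, k, h=1e-4):
    """Finite-difference estimate of z'' + k (1 + z'^2), which should vanish."""
    z_minus, z_0, z_plus = (height(x + s, a, k) for s in (-h, 0.0, h))
    slope = (z_plus - z_minus) / (2.0 * h)
    curvature = (z_plus - 2.0 * z_0 + z_minus) / (h * h)
    return curvature + k * (1.0 + slope * slope)

a, k = 1.0, 0.5  # illustrative values, not taken from the text
residuals = [euler_residual(x, a, k) for x in (-0.8, -0.3, 0.0, 0.4, 0.9)]
print(height(-a, a, k), height(a, a, k), max(abs(r) for r in residuals))
```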
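Independent variable $y$
Assuming that $y$ is the independent variable, then the surface area can be written as
$$S = 2\pi\int_{y_1}^{y_2} x\sqrt{1 + \left(\frac{dx}{dy}\right)^2}\,dy = 2\pi\int_{y_1}^{y_2} x\sqrt{1 + x'^2}\,dy$$
where $x' \equiv \frac{dx}{dy}$. The function of the surface integral is $f = x\sqrt{1 + x'^2}$. The derivatives are
$$\frac{\partial f}{\partial x} = \sqrt{1 + x'^2}$$
and
$$\frac{\partial f}{\partial x'} = \frac{x x'}{\sqrt{1 + x'^2}}$$
Therefore Euler’s equation gives
$$\frac{d}{dy}\left(\frac{x x'}{\sqrt{1 + x'^2}}\right) - \sqrt{1 + x'^2} = 0$$
Independent variable $x$
Consider the case where the independent variable is chosen to be $x$, then the surface integral can be written
as
$$S = 2\pi\int_{x_1}^{x_2} x\sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx = 2\pi\int_{x_1}^{x_2} x\sqrt{1 + y'^2}\,dx$$
where $y' \equiv \frac{dy}{dx}$. Thus the function of the surface integral is $f = x\sqrt{1 + y'^2}$. The derivatives are
$$\frac{\partial f}{\partial y} = 0$$
and
$$\frac{\partial f}{\partial y'} = \frac{x y'}{\sqrt{1 + y'^2}}$$
Therefore Euler’s equation gives
$$0 + \frac{d}{dx}\left(\frac{x y'}{\sqrt{1 + y'^2}}\right) = 0$$
That is
$$\frac{x y'}{\sqrt{1 + y'^2}} = a$$
where $a$ is a constant. This can be rewritten as
$$y'^2\left(x^2 - a^2\right) = a^2$$
or
$$y' = \frac{dy}{dx} = \frac{a}{\sqrt{x^2 - a^2}}$$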
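where $i = 1, 2, 3, \ldots, N$.
By analogy with the one dimensional problem, define neighboring functions $\eta_i(x)$ for each variable,
$y_i(\epsilon, x) = y_i(0, x) + \epsilon\,\eta_i(x)$. Then
$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2}\sum_{i}^{N}\left(\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_i'}\right)\eta_i(x)\,dx$$
If the variables $y_i(x)$ are independent, then the $\eta_i(x)$ are independent. Since the $\eta_i(x)$ are independent,
then requiring the above expression to vanish at $\epsilon = 0$ implies that each term in the bracket must vanish independently.
That is, Euler’s differential equation becomes a set of $N$ equations for the $N$ independent variables
$$\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_i'} = 0 \quad (5.19)$$
where $i = 1, 2, 3, \ldots, N$. Thus, each of the $N$ equations can be solved independently when the $N$ variables are
independent. Note that Euler’s equation involves partial derivatives for the dependent variables $y_i$, $y_i'$ and
the total derivative for the independent variable $x$.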
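This is a problem that has two dependent variables $y(x)$ and $z(x)$ with $x$ chosen as the independent
variable. The integral can be broken into two parts $x_1 \to 0$ and $0 \to -x_2$
$$t = \frac{1}{c}\left[\int_{x_1}^{0} n_1\sqrt{1 + (y')^2 + (z')^2}\,dx + \int_{0}^{-x_2} n_2\sqrt{1 + (y')^2 + (z')^2}\,dx\right]$$
The functionals are functions of $y'$ and $z'$ but not $y$ or $z$. Thus Euler’s equation for $z$ simplifies to
$$0 + \frac{d}{dx}\left(\frac{1}{c}\left(\frac{n_1 z'}{\sqrt{1 + y'^2 + z'^2}} + \frac{n_2 z'}{\sqrt{1 + y'^2 + z'^2}}\right)\right) = 0$$
This implies that $z' = 0$, therefore $z$ is a constant. Since the initial and final values were chosen to be
$z_1 = z_2 = 0$, therefore at the interface $z = 0$. Similarly Euler’s equations for $y$ are
$$0 + \frac{d}{dx}\left(\frac{1}{c}\left(\frac{n_1 y'}{\sqrt{1 + y'^2 + z'^2}} + \frac{n_2 y'}{\sqrt{1 + y'^2 + z'^2}}\right)\right) = 0$$
But $y' = \tan\theta_1$ for $n_1$ and $y' = -\tan\theta_2$ for $n_2$ and it was shown that $z' = 0$. Thus
$$0 + \frac{d}{dx}\left(\frac{1}{c}\left(\frac{n_1\tan\theta_1}{\sqrt{1 + (\tan\theta_1)^2}} - \frac{n_2\tan\theta_2}{\sqrt{1 + (\tan\theta_2)^2}}\right)\right) = \frac{d}{dx}\left(\frac{1}{c}\left(n_1\sin\theta_1 - n_2\sin\theta_2\right)\right) = 0$$
Therefore $\frac{1}{c}(n_1\sin\theta_1 - n_2\sin\theta_2) = \text{constant}$, which must be zero since when $n_1 = n_2$ then $\theta_1 = \theta_2$. Thus
Fermat’s principle leads to Snell’s Law.
$$n_1\sin\theta_1 = n_2\sin\theta_2$$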
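The geometry of this problem is simple enough to directly minimize the path rather than using Euler’s
equations for the two parameters as performed above. The lengths of the paths $s_1$ and $s_2$ are
$$s_1 = \sqrt{y^2 + x_1^2 + z^2}$$
$$s_2 = \sqrt{(y_2 - y)^2 + x_2^2 + z^2}$$
This problem involves two dependent variables, $y(x)$ and $z(x)$. To find the minima, set the partial derivatives
$\frac{\partial t}{\partial y} = 0$ and $\frac{\partial t}{\partial z} = 0$, where $t = (n_1 s_1 + n_2 s_2)/c$ is the transit time. That is,
$$\frac{\partial t}{\partial z} = \frac{1}{c}\left(\frac{n_1 z}{\sqrt{y^2 + x_1^2 + z^2}} + \frac{n_2 z}{\sqrt{(y_2 - y)^2 + x_2^2 + z^2}}\right) = 0$$
This is zero only if $z = 0$, that is, the point at which the ray crosses the interface lies in the plane $z = 0$
that contains the initial and final points. Similarly
$$\frac{\partial t}{\partial y} = \frac{1}{c}\left(\frac{n_1 y}{\sqrt{y^2 + x_1^2 + z^2}} - \frac{n_2 (y_2 - y)}{\sqrt{(y_2 - y)^2 + x_2^2 + z^2}}\right) = \frac{1}{c}\left(n_1\sin\theta_1 - n_2\sin\theta_2\right) = 0$$
That is,
$$n_1\sin\theta_1 = n_2\sin\theta_2$$
Fermat’s principle has shown that the refracted light is given by Snell’s Law, and is in a plane normal to the
surface. The laws of reflection also are given since then $n_1 = n_2 = n$ and the angle of reflection equals the
angle of incidence.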
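The direct minimization can also be carried out numerically. The sketch below is illustrative only: the end-point coordinates and refractive indices are assumed values, the function names are hypothetical, and $z = 0$ is imposed from the outset as derived above. A golden-section search locates the crossing point $y$ that minimizes the transit time, after which the two sides of Snell’s Law are compared.

```python
import math

def travel_time(y, x1, x2, y2, n1, n2, c=1.0):
    """Transit time for a ray crossing the interface at height y (with z = 0)."""
    s1 = math.hypot(x1, y)          # path length in medium 1
    s2 = math.hypot(x2, y2 - y)     # path length in medium 2
    return (n1 * s1 + n2 * s2) / c

def minimise(func, lo, hi, tol=1e-9):
    """Golden-section search for the minimum of a unimodal function."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    left, right = lo, hi
    p = right - ratio * (right - left)
    q = left + ratio * (right - left)
    while right - left > tol:
        if func(p) < func(q):
            right, q = q, p
            p = right - ratio * (right - left)
        else:
            left, p = p, q
            q = left + ratio * (right - left)
    return 0.5 * (left + right)

x1, x2, y2 = 1.0, 1.0, 1.5   # illustrative geometry
n1, n2 = 1.0, 1.5            # illustrative refractive indices
y_min = minimise(lambda y: travel_time(y, x1, x2, y2, n1, n2), 0.0, y2)
sin_1 = y_min / math.hypot(x1, y_min)
sin_2 = (y2 - y_min) / math.hypot(x2, y2 - y_min)
print(n1 * sin_1, n2 * sin_2)
```

The two printed values should agree to within the accuracy of the search.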
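Note that the variables $x_1, x_2, x_3$ are independent, and thus Euler’s equation for several independent variables
can be used. To minimize the functional, the function
$$f = \left(\frac{\partial\phi}{\partial x_1}\right)^2 + \left(\frac{\partial\phi}{\partial x_2}\right)^2 + \left(\frac{\partial\phi}{\partial x_3}\right)^2$$
must satisfy the Euler equation
$$\frac{\partial f}{\partial\phi} - \sum_{i=1}^{3}\frac{\partial}{\partial x_i}\left(\frac{\partial f}{\partial\phi_i'}\right) = 0$$
where $\phi_i' = \frac{\partial\phi}{\partial x_i}$. Substituting $f$ into Euler’s equation gives
$$\sum_{i=1}^{3}\frac{\partial}{\partial x_i}\left(\frac{\partial\phi}{\partial x_i}\right) = 0$$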
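The total derivative of the function $f[y(x), y'(x); x]$ with respect to $x$ is
$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}y' + \frac{\partial f}{\partial y'}y'' \quad (5.20)$$
But
$$\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'}\right) = y''\frac{\partial f}{\partial y'} + y'\frac{d}{dx}\frac{\partial f}{\partial y'} \quad (5.21)$$
Combining these two equations gives
$$\frac{d}{dx}\left(y'\frac{\partial f}{\partial y'}\right) = \frac{df}{dx} - \frac{\partial f}{\partial x} - y'\frac{\partial f}{\partial y} + y'\frac{d}{dx}\frac{\partial f}{\partial y'} \quad (5.22)$$
The last two terms can be rewritten as
$$y'\left(\frac{d}{dx}\frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y}\right) \quad (5.23)$$
which vanishes when the Euler equation is satisfied. Therefore the above equation simplifies to
$$\frac{\partial f}{\partial x} - \frac{d}{dx}\left(f - y'\frac{\partial f}{\partial y'}\right) = 0 \quad (5.24)$$
This integral form of Euler’s equation is especially useful when $\frac{\partial f}{\partial x} = 0$, that is, when $f$ does not depend
explicitly on the independent variable $x$. Then the first integral of equation 5.24 is a constant, i.e.
$$f - y'\frac{\partial f}{\partial y'} = \text{constant} \quad (5.25)$$
This is Euler’s integral variational equation. Note that the shortest distance between two points, the
minimum surface of rotation, and the brachistochrone, described earlier, all are examples where $\frac{\partial f}{\partial x} = 0$ and thus
the integral form of Euler’s equation is useful for solving these cases.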
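As an illustration of equation 5.25, consider the surface-of-rotation integrand $f = x\sqrt{1 + x'^2}$ with $y$ as the independent variable, which has no explicit $y$ dependence. The sketch below assumes the catenary profile $x = a\cosh((y - b)/a)$, the standard solution of $dy/dx = a/\sqrt{x^2 - a^2}$, with illustrative constants $a$ and $b$, and evaluates $f - x'\,\partial f/\partial x'$ at several points. The function name is hypothetical.

```python
import math

def first_integral(x, x_prime):
    """Evaluate f - x' (df/dx') for f = x sqrt(1 + x'^2)."""
    f = x * math.sqrt(1.0 + x_prime ** 2)
    df_dxprime = x * x_prime / math.sqrt(1.0 + x_prime ** 2)
    return f - x_prime * df_dxprime

a, b = 2.0, 0.5  # illustrative catenary constants
values = []
for y in (-1.0, 0.0, 0.5, 1.5, 3.0):
    u = (y - b) / a
    x = a * math.cosh(u)       # assumed catenary profile x(y)
    x_prime = math.sinh(u)     # dx/dy along that profile
    values.append(first_integral(x, x_prime))
print(values)
```

Every entry should equal the constant $a$, independent of $y$.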
the assumption made in chapter 5.5 that the variables are independent.
For example, for a disk rolling down an inclined plane without slipping, there are three coordinates: $y$
[perpendicular to the wedge], $x$ [along the surface of the wedge], and the rotation angle $\theta$, shown in figure 5.2.
The constraint forces, $F_f$ and $N$, lead to the correlation of the variables such that $y = R$, while $x = R\theta$.
Basically there is only one independent variable, which can be either $x$ or $\theta$. The use of only one independent
variable essentially sweeps the constraint forces under the rug, which is fine if you only need to know the
equation of motion. If you need to determine the forces of constraint then it is necessary to include all
coordinates explicitly in the equations of motion as discussed below.
Figure 5.2: A disk rolling down an inclined plane.
$$g_k(y_i; x) = 0 \quad (5.26)$$
where $k = 1, 2, 3, \ldots, m$. There can be $m$ such equations of constraint where $0 \le m \le n$. An example of such a
geometric constraint is when the motion is confined to the surface of a sphere of radius $R$ in coordinate space,
which can be written in the form $g = x^2 + y^2 + z^2 - R^2 = 0$. Such algebraic constraint equations are called
holonomic, which allows use of generalized coordinates as well as Lagrange multipliers to handle both the
constraint forces and the correlation of the coordinates.
where $i = 1, 2, 3, \ldots, n$ and $k = 1, 2, 3, \ldots, m$. If equation (5.27) represents the total differential of a function then
it can be integrated to give a holonomic relation of the form of equation 5.26. However, if equation 5.27 is
not the total differential, then it is non-holonomic and can be integrated only after having solved the full
problem.
An example of differential constraint equations is for a wheel rolling on a plane without slipping, which is
non-holonomic and more complicated than might be expected. The wheel moving on a plane has five degrees
of freedom since the height $z$ is fixed. That is, the motion of the center of mass requires two coordinates
$(x, y)$ plus there are three angles $(\phi, \theta, \psi)$ where $\phi$ is the rotation angle for the wheel, $\theta$ is the pivot angle of
the axis, and $\psi$ is the tilt angle of the wheel. If the wheel slides then all five degrees of freedom are active.
If the axis of rotation of the wheel is horizontal, that is, the tilt angle $\psi = 0$ is constant, then this kinematic
system leads to three differential constraint equations. The wheel can roll with angular velocity $\dot{\phi}$, as well as
pivot, which corresponds to a change in $\theta$. Combining these leads to two differential equations of constraint.
These constraints are insufficient to provide finite relations between all the coordinates. That is, the
constraints cannot be reduced by integration to the form of equation 5.26 because there is no functional relation
between $\phi$ and the other three variables, $x, y, \theta$. Many rolling trajectories are possible between any two points
of contact on the plane that are related to different pivot angles. That is, the point of contact of the disk
could pivot plus roll in a circle returning to the same point where $x, y, \theta$ are unchanged whereas the value
of $\phi$ depends on the circumference of the circle. As a consequence the rolling constraint is non-holonomic
except for the case where the disk rolls in a straight line and remains vertical.
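The path dependence of the rotation angle can be illustrated numerically. The sketch below is an assumption-laden example rather than the text’s derivation: it takes the rolling constraints in the commonly quoted form $dx = a\sin\theta\,d\phi$, $dy = -a\cos\theta\,d\phi$ for a vertical wheel of radius $a$, and uses a hypothetical helper that rolls the wheel once around circles of two different radii. In both cases $x$, $y$ and $\theta$ (modulo $2\pi$) return to their starting values, while the accumulated $\phi$ differs.

```python
import math

def roll_around_circle(wheel_radius, circle_radius, steps=20000):
    """Integrate the assumed rolling constraints for one lap of a circular track."""
    x = y = phi = 0.0
    d_theta = 2.0 * math.pi / steps                    # pivot angle step
    d_phi = (circle_radius / wheel_radius) * d_theta   # rolling without slipping
    for i in range(steps):
        theta = (i + 0.5) * d_theta                    # midpoint pivot angle
        x += wheel_radius * math.sin(theta) * d_phi
        y -= wheel_radius * math.cos(theta) * d_phi
        phi += d_phi
    return x, y, phi

small = roll_around_circle(wheel_radius=0.5, circle_radius=1.0)
large = roll_around_circle(wheel_radius=0.5, circle_radius=3.0)
print(small, large)
```

Since the same final $(x, y, \theta)$ is reached with different values of $\phi$, no finite relation $\phi = \phi(x, y, \theta)$ can exist.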
where $l$ is a fixed length. This integral constraint is geometric and holonomic. Another example is finding
the minimum surface area of a closed surface subject to the enclosed volume being the constraint.
Such a system is called holonomic since there is a direct relation between the coupled variables. An example
of such a holonomic geometric constraint is if the motion is confined to the surface of a sphere of radius $R$
which can be written in the form
$$g = x^2 + y^2 + z^2 - R^2 = 0 \quad (5.32)$$
Non-holonomic constraints: There are many classifications of non-holonomic constraints that exist
if equation (5.31) is not satisfied. The algebraic approach is difficult to handle when the constraint is an
inequality, such as the requirement that the location is restricted to lie inside a spherical shell of radius $R$
which can be expressed as
$$g = x^2 + y^2 + z^2 - R^2 \le 0 \quad (5.33)$$
This non-holonomic constrained system has a one-sided constraint. Systems usually are non-holonomic if
the constraint is kinematic as discussed above.
Partial-holonomic constraints: Partial-holonomic constraints are holonomic for a restricted range
of the constraint surface in coordinate space, and this range can be case specific. This can occur if the
constraint force is one-sided and perpendicular to the path. An example is the pendulum with the mass
attached to the fulcrum by a flexible string that provides tension but not compression. Then the pendulum
length is constant only if the tension in the string is positive. Thus the pendulum will be holonomic if
the gravitational plus centrifugal forces are such that the tension in the string is positive, but the system
becomes non-holonomic if the tension is negative as can happen when the pendulum rotates to an upright
angle where the centrifugal force outwards is insufficient to compensate for the vertical downward component
of the gravitational force. There are many other examples where the motion of an object is holonomic when
the object is pressed against the constraint surface, such as the surface of the Earth, but is unconstrained if
the object leaves the surface.
Time dependence
A constraint is called scleronomic if the constraint is not explicitly time dependent. This ignores the time
dependence contained within the solution of the equations of motion. Fortunately a major fraction of
systems are scleronomic. The constraint is called rheonomic if the constraint is explicitly time dependent.
An example of a rheonomic system is where the size or shape of the surface of constraint is explicitly time
dependent such as a deflating pneumatic tire.
Energy conservation
The solution depends on whether the constraint is conservative or dissipative, that is, if friction or drag are
acting. The system will be conservative if there are no drag forces, and the constraint forces are perpendicular
to the trajectory of the path such as the motion of a charged particle in a magnetic field. Forces of constraint
can result from sliding of two solid surfaces, rolling of solid objects, fluid flow in a liquid or gas, or result from
electromagnetic forces. Energy dissipation can result from friction, drag in a fluid or gas, or finite resistance
of electric conductors leading to dissipation of induced electric currents in a conductor, e.g. eddy currents.
A rolling constraint is unusual in that friction between the rolling bodies is necessary to maintain rolling.
A disk on a frictionless inclined plane will conserve its angular momentum since there is no torque acting
if the rolling contact is frictionless, that is, the disk will just slide. If the friction is sufficient to stop sliding,
then the bodies will roll and not slide. A perfect rolling body does not dissipate energy since no work is
done at the instantaneous point of contact where both bodies are in zero relative motion and the force is
perpendicular to the motion. In real life, a rolling wheel can involve a very small energy dissipation due to
deformation at the point of contact coupled with non-elastic properties of the material used to make the
wheel and the plane surface. For example, a pneumatic tire can heat up and expand due to flexing of the
tire.
Since equations 5.36 and 5.38 both equal zero, the $m$ equations 5.38 can be multiplied by arbitrary
undetermined factors $\lambda_k$ and added to equations 5.36 to give equation 5.39.
Note that this is not trivial in that, although the sum of the constraint equations for each $y_i$ is zero, the
individual terms of the sum are not zero.
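Inserting equations 5.36 plus 5.38 into 5.39 and collecting all $n$ terms gives
$$\sum_{i}^{n}\left(\frac{\partial f}{\partial y_i} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial y_i}\right)\delta y_i = 0 \quad (5.40)$$
Note that all the $\delta y_i$ are free independent variations and thus the terms in the brackets, which are the
coefficients of each $\delta y_i$, individually must equal zero. For each of the $n$ values of $i$, the corresponding bracket
implies
$$\frac{\partial f}{\partial y_i} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial y_i} = 0 \quad (5.41)$$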
Equation 5.42 is equivalent to a variational problem for finding the stationary value of $F'$
$$\delta(F') = \delta\left(f + \sum_{k}^{m}\lambda_k g_k\right) = 0 \quad (5.43)$$
where $F'$ is defined to be
$$F' \equiv \left(f + \sum_{k=1}^{m}\lambda_k g_k\right) \quad (5.44)$$
The solution to equation 5.43 can be found using Euler’s differential equation 5.19 of variational calculus.
At the extremum $\delta(F') = 0$ corresponds to following contours of constant $F'$ which are in the surface that is
perpendicular to the gradients of the terms in $F'$. The Lagrange multiplier constants are required because,
although these gradients are parallel at the extremum, the magnitudes of the gradients are not equal.
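The beauty of the Lagrange multipliers approach is that the auxiliary conditions do not have to be
handled explicitly, since they are handled automatically as $m$ additional free variables during solution of
Euler’s equations for a variational problem with $n + m$ unknowns fit to $n + m$ equations. That is, the $n$
variables $y_i$ are determined by the variational procedure using the $n$ variational equations
$$\frac{d}{dx}\left(\frac{\partial F'}{\partial y_i'}\right) - \left(\frac{\partial F'}{\partial y_i}\right) = \frac{d}{dx}\left(\frac{\partial f}{\partial y_i'}\right) - \left(\frac{\partial f}{\partial y_i}\right) - \sum_{k}^{m}\lambda_k\frac{\partial g_k}{\partial y_i} = 0 \quad (5.45)$$
simultaneously with the $m$ variables $\lambda_k$ which are determined by the $m$ variational equations
$$\frac{d}{dx}\left(\frac{\partial F'}{\partial \lambda_k'}\right) - \left(\frac{\partial F'}{\partial \lambda_k}\right) = 0 \quad (5.46)$$
Equation 5.45 usually is expressed as
$$\left(\frac{\partial f}{\partial y_i}\right) - \frac{d}{dx}\left(\frac{\partial f}{\partial y_i'}\right) + \sum_{k}^{m}\lambda_k\frac{\partial g_k}{\partial y_i} = 0 \quad (5.47)$$
The elegance of Lagrange multipliers is that a single variational approach allows simultaneous determination
of all $n + m$ unknowns. Chapter 6.2 shows that the forces of constraint are given directly by the
$\lambda_k\frac{\partial g_k}{\partial y_i}$ terms.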
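$$g(y, z; x) = 0$$
The resulting equation now contains only a single arbitrary function $\eta_1(x)$ that is not restricted by the
constraint. Thus the bracket in the integrand must equal zero for the extremum. That is
$$\left(\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'}\right)\left(\frac{\partial g}{\partial y}\right)^{-1} = \left(\frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'}\right)\left(\frac{\partial g}{\partial z}\right)^{-1} \equiv -\lambda(x)$$
Now the left-hand side of this equation is only a function of $f$ and $g$ with respect to $y$ and $y'$ while the
right-hand side is a function of $f$ and $g$ with respect to $z$ and $z'$. Because both sides are functions of $x$ then
each side can be set equal to a function $-\lambda(x)$. Thus the above equations can be written as
$$\frac{d}{dx}\frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y} = \lambda(x)\frac{\partial g}{\partial y} \qquad \frac{d}{dx}\frac{\partial f}{\partial z'} - \frac{\partial f}{\partial z} = \lambda(x)\frac{\partial g}{\partial z}$$
The complete solution of the three unknown functions $y(x)$, $z(x)$ and $\lambda(x)$ is obtained by solving the two
equations above plus the equation of constraint $g(y, z; x) = 0$. The Lagrange multiplier $\lambda(x)$ is related to the
force of constraint. This example of two variables coupled by one holonomic constraint conforms with the general
relation for many variables and constraints given by equation 5.47.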
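is an extremum such that the fixed length $l$ of the perimeter satisfies the integral constraint
$$G(y) = \int_{x_1}^{x_2} g(y, y'; x)\,dx = l \quad (5.49)$$
That is, it is an extremum for both $y(x)$ and the Lagrange multiplier $\lambda$. This effectively involves finding the
extremum path for the function $K(y, x, \lambda) = F(y, x) + \lambda G(y, x)$ where both $y(x)$ and $\lambda$ are the minimized
variables. Therefore the curve $y(x)$ must satisfy the differential equation
$$\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} + \lambda\left[\frac{\partial g}{\partial y} - \frac{d}{dx}\frac{\partial g}{\partial y'}\right] = 0 \quad (5.51)$$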
$$F = f + \lambda g = (y + \lambda)\sqrt{1 + y'^2}$$
[Figure: The catenary.]
Note that this case is one where $\frac{\partial F}{\partial x} = 0$ and $\lambda$ is a constant; also
defining $z = y + \lambda$ then $z' = y'$. Therefore the Euler’s equations can be written in the integral form
$$F - z'\frac{\partial F}{\partial z'} = c = \text{constant}$$
Inserting the relation $F = z\sqrt{1 + z'^2}$ gives
$$z\sqrt{1 + z'^2} - z'\frac{z z'}{\sqrt{1 + z'^2}} = c$$
where $c$ is an arbitrary constant. This simplifies to
$$z'^2 = \left(\frac{z}{c}\right)^2 - 1$$
The integral of this is
$$z = c\cosh\left(\frac{x + b}{c}\right)$$
where $b$ and $c$ are arbitrary constants fixed by the locations of the two fixed ends of the rope.
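and $\frac{\partial g}{\partial y'} = \frac{y'}{\sqrt{1 + y'^2}}$. Inserting these into the Euler-Lagrange equation (5.51) gives
$$1 - \lambda\frac{d}{dx}\left[\frac{y'}{\sqrt{1 + y'^2}}\right] = 0$$
That is
$$\frac{d}{dx}\left[\frac{y'}{\sqrt{1 + y'^2}}\right] = \frac{1}{\lambda}$$
Integrating with respect to $x$ gives
$$\frac{\lambda y'}{\sqrt{1 + y'^2}} = x - b$$
where $b$ is a constant of integration. This can be rearranged to give
$$y' = \frac{\pm(x - b)}{\sqrt{\lambda^2 - (x - b)^2}}$$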
5.10 Geodesic
The geodesic is defined as the shortest path between two fixed points for motion that is constrained to lie
on a surface. Variational calculus provides a powerful approach for determining the equations of motion
constrained to follow a geodesic.
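The use of variational calculus is illustrated by considering the geodesic constrained to follow the surface
of a sphere of radius $R$. As discussed in appendix 23, the element of path length on the surface of the
sphere is given in spherical coordinates as $ds = R\sqrt{d\theta^2 + (\sin\theta\,d\phi)^2}$. Therefore the distance $s$ between two
points 1 and 2 is
$$s = R\int_{1}^{2}\left[\sqrt{\left(\frac{d\theta}{d\phi}\right)^2 + \sin^2\theta}\right]d\phi \quad (5.52)$$
where $\theta' = \frac{d\theta}{d\phi}$. With the integrand $f = \sqrt{\theta'^2 + \sin^2\theta}$, this is a case where
$\frac{\partial f}{\partial \phi} = 0$ and thus the integral form of Euler’s equation can be used
leading to the result that
$$\sqrt{\theta'^2 + \sin^2\theta} - \theta'\frac{\partial}{\partial\theta'}\sqrt{\theta'^2 + \sin^2\theta} = \text{constant} = a \quad (5.54)$$
This gives that
$$\sin^2\theta = a\sqrt{\theta'^2 + \sin^2\theta} \quad (5.55)$$
This can be rewritten as
$$\frac{d\phi}{d\theta} = \frac{1}{\theta'} = \frac{a\csc^2\theta}{\sqrt{1 - a^2\csc^2\theta}} \quad (5.56)$$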
The terms in the brackets are just expressions for the rectangular coordinates $x, y, z$. That is,
$$Ay - Bx = z \quad (5.62)$$
This is the equation of a plane passing through the center of the sphere. Thus the geodesic on a sphere
is the path where a plane through the center intersects the sphere as well as the initial and final locations.
This geodesic is called a great circle. Euler’s equation gives both the maximum and minimum extremum
path lengths for motion on this great circle.
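The great-circle result can be checked with the path-length element $ds = R\sqrt{\theta'^2 + \sin^2\theta}\,d\phi$ used above. The sketch below is illustrative: it assumes a unit sphere and two points at the same polar angle $\theta_0 = \pi/4$ separated by $\Delta\phi = \pi/2$, and uses hypothetical function names. It compares the length of the path along the circle of constant $\theta$ with the length along the great circle $\cot\theta = A\cos\phi$, which lies in a plane through the center of the sphere.

```python
import math

def path_length(theta_of_phi, phi_start, phi_end, radius=1.0, steps=5000):
    """Integrate ds = R sqrt(theta'^2 + sin^2(theta)) dphi by the midpoint rule."""
    d_phi = (phi_end - phi_start) / steps
    h = 1e-6  # step for the central-difference derivative
    total = 0.0
    for i in range(steps):
        phi = phi_start + (i + 0.5) * d_phi
        theta = theta_of_phi(phi)
        theta_prime = (theta_of_phi(phi + h) - theta_of_phi(phi - h)) / (2.0 * h)
        total += math.sqrt(theta_prime ** 2 + math.sin(theta) ** 2) * d_phi
    return radius * total

theta_0 = math.pi / 4.0      # illustrative polar angle of both end points
delta_phi = math.pi / 2.0    # illustrative azimuthal separation
slope = (1.0 / math.tan(theta_0)) / math.cos(0.5 * delta_phi)

def along_parallel(phi):
    return theta_0

def along_great_circle(phi):
    # cot(theta) = slope * cos(phi) is the plane z = slope * x through the center
    return math.atan2(1.0, slope * math.cos(phi))

parallel = path_length(along_parallel, -0.5 * delta_phi, 0.5 * delta_phi)
great = path_length(along_great_circle, -0.5 * delta_phi, 0.5 * delta_phi)
exact = math.acos(math.cos(theta_0) ** 2 + math.sin(theta_0) ** 2 * math.cos(delta_phi))
print(parallel, great, exact)
```

The great-circle length should match the closed-form central angle and be shorter than the path along the parallel.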
Chapter 17 discusses the geodesic in the four-dimensional space-time coordinates that underlie the General
Theory of Relativity. As a consequence, the use of the calculus of variations to determine the equations of
motion for geodesics plays a pivotal role in the General Theory of Relativity.
5.12 Summary
Euler’s differential equation: The calculus of variations has been introduced and Euler’s differential
equation was derived. The calculus of variations reduces to varying the functions $y_i(x)$, where $i = 1, 2, 3, \ldots, N$,
such that the integral
$$F = \int_{x_1}^{x_2} f[y_i(x), y_i'(x); x]\,dx \quad (5.16)$$
is an extremum, that is, it is a maximum or minimum. Here $x$ is the independent variable, and $y_i(x)$ are
the dependent variables plus their first derivatives $y_i' \equiv \frac{dy_i}{dx}$. The quantity $f[y(x), y'(x); x]$ has some given
dependence on $y$, $y'$ and $x$. The calculus of variations involves varying the functions $y_i(x)$ until a stationary
value of $F$ is found which is presumed to be an extremum. It was shown that if the $y_i(x)$ are independent,
then the extremum value of $F$ leads to $N$ independent Euler equations
$$\frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_i'} = 0 \quad (5.19)$$
where $i = 1, 2, 3, \ldots, N$. This can be used to determine the functional form $y_i(x)$ that ensures that the integral
$F = \int_{x_1}^{x_2} f[y(x), y'(x); x]\,dx$ is a stationary value, that is, presumably a maximum or minimum value.
Note that Euler’s equation involves partial derivatives for the dependent variables $y_i$, $y_i'$ and the total
derivative for the independent variable $x$.
Euler’s integral equation: It was shown that if the function $f[y_i(x), y_i'(x); x]$ does not depend explicitly on
the independent variable, then Euler’s differential equation can be written in an integral form. This integral
form of Euler’s equation is especially useful when $\frac{\partial f}{\partial x} = 0$, that is, when $f$ does not depend explicitly on $x$;
then the first integral of the Euler equation is a constant
$$f - y'\frac{\partial f}{\partial y'} = \text{constant} \quad (5.25)$$
Constrained variational systems: Most applications involve constraints on the motion. The equations
of constraint can be classified according to whether the constraints are holonomic or non-holonomic, the time
dependence of the constraints, and whether the constraint forces are conservative.
Generalized coordinates in variational calculus: Independent generalized coordinates can be chosen
that are perpendicular to the rigid constraint forces and therefore the constraint does not contribute to the
functional being minimized. That is, the constraints are embedded into the generalized coordinates and thus
the constraints can be ignored when deriving the variational solution.
Minimal set of generalized coordinates: If the $m$ constraints are holonomic then the $m$ holonomic
equations of constraint can be used to transform the $n$ coupled generalized coordinates to $s = n - m$
independent generalized variables $q_i$, $q_i'$. The generalized coordinate method then uses Euler’s equations to
determine these $s = n - m$ independent generalized coordinates.
$$\frac{\partial f}{\partial q_i} - \frac{d}{dx}\frac{\partial f}{\partial q_i'} = 0 \quad (5.35)$$
Lagrange multipliers for holonomic constraints: The Lagrange multipliers approach for $n$ variables,
plus $m$ holonomic equations of constraint, determines all $n + m$ unknowns for the system. The holonomic
forces of constraint acting on the $n$ variables are related to the Lagrange multiplier terms $\lambda_k(x)\frac{\partial g_k}{\partial y_i}$,
where the holonomic equations of constraint are
$$g_k(y_i; x) = 0 \quad (5.38)$$
The advantage of using the Lagrange multiplier approach is that the variational procedure simultaneously
determines both the equations of motion for the $n$ variables plus the $m$ constraint forces acting on the
system.
Workshop exercises
1. Find the extremal of the functional
$$J(x) = \int_{1}^{2}\frac{\dot{x}^2}{t^3}\,dt$$
that satisfies $x(1) = 3$ and $x(2) = 18$. Show that this extremal provides the global minimum of $J$.
(a) A particle is constrained to move on the surface of a sphere. What are the equations of constraint for this
system?
(b) A disk of mass $m$ and radius $R$ rolls without slipping on the outside surface of a half-cylinder of radius
$5R$. What are the equations of constraint for this system?
(c) What are holonomic constraints? Which of the equations of constraint that you found above are holo-
nomic?
(d) Equations of constraint that do not explicitly contain time are said to be scleronomic. Moving constraints
are rheonomic. Are the equations of constraint that you found above scleronomic or rheonomic?
3. For each of the following systems, describe the generalized coordinates that would work best. There may be
more than one answer for each system.
(a) An inclined plane of mass $M$ is sliding on a smooth horizontal surface, while a particle of mass $m$ is
sliding on the smooth inclined surface.
(b) A disk rolls without slipping across a horizontal plane. The plane of the disk remains vertical, but it is
free to rotate about a vertical axis.
(c) A double pendulum consisting of two simple pendula, with one pendulum suspended from the bob of the
other. The two pendula have equal lengths and have bobs of equal mass. Both pendula are confined to
move in the same plane.
(d) A particle of mass $m$ is constrained to move on a circle of radius $R$. The circle rotates in space about
one point on the circle, which is fixed. The rotation takes place in the plane of the circle, with constant
angular speed $\omega$, in the absence of a gravitational force.
(e) A particle of mass $m$ is attracted toward a given point by a force of magnitude $k/r^2$, where $k$ is a constant.
4. Looking back at the systems in problem 3, which ones could have equations of constraint? How would you
classify the equations of constraint (holonomic, scleronomic, rheonomic, etc.)?
Problems
1. Find the extremal of the functional
$$J(x) = \int_{0}^{\pi}\left(2x\sin t - \dot{x}^2\right)dt$$
that satisfies $x(0) = x(\pi) = 0$. Show that this extremal provides the global maximum of $J$.
2. Find and describe the path $y = y(x)$ for which the integral $\int_{x_1}^{x_2}\sqrt{x}\sqrt{1 + (y')^2}\,dx$ is stationary.
3. Find the dimensions of the parallelepiped of maximum volume circumscribed by a sphere of radius $R$.
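4. Consider a single loop of the cycloid having a fixed value of $a$ as shown in the figure. A cart is released from
rest at any point $P_0$ anywhere on the track between $O$ and the lowest point $P$, that is, $P_0$ has a parameter
$0 < \theta_0 < \pi$.
[Figure: one loop of the cycloid, with the origin $O$, the $x$ axis, the release point $P_0$ and the lowest point $P$.]
(a) Show that the time $T$ for the cart to slide from $P_0$ to $P$ is given by the integral
$$T(P_0 \to P) = \sqrt{\frac{a}{g}}\int_{\theta_0}^{\pi}\sqrt{\frac{1 - \cos\theta}{\cos\theta_0 - \cos\theta}}\,d\theta$$
(b) Prove that this time $T$ is equal to $\pi\sqrt{a/g}$, which is independent of the position $P_0$.
(c) Explain qualitatively how this surprising result can possibly be true.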
5. Consider a medium for which the refractive index $n = \frac{a}{r^2}$ where $a$ is a constant and $r$ is the distance from
the origin. Use Fermat’s Principle to find the path of a ray of light travelling in a plane containing the origin.
Hint, use two-dimensional polar coordinates with $\phi = \phi(r)$. Show that the resulting path is a circle through
the origin.
6. Find the shortest path between the $(x, y, z)$ points $(0, -1, 0)$ and $(0, 1, 0)$ on the conical surface
$$z = 1 - \sqrt{x^2 + y^2}$$
What is the length of this path? Note that this is the shortest mountain path around a volcano.
7. Show that the geodesic on the surface of a right circular cylinder is a segment of a helix.
Chapter 6
Lagrangian dynamics
6.1 Introduction
Newtonian mechanics is based on vector observables such as momentum and force, and Newton’s equations
of motion can be derived if the forces are known. Newtonian mechanics becomes difficult to apply for many-
body systems that involve constraint forces. The alternative algebraic Lagrangian mechanics approach is
based on the concept of scalar energies which circumvent many of the difficulties in handling constraint forces
and many-body systems.
The Lagrangian approach to classical dynamics is based on the calculus of variations introduced in chapter
5. It was shown that the calculus of variations determines the function $y_i(x)$ such that the scalar functional
$$F = \int_{x_1}^{x_2}\sum_{i}^{N} f[y_i(x), y_i'(x); x]\,dx \quad (6.1)$$
is an extremum, that is, a maximum or minimum. Here $x$ is the independent variable, $y_i(x)$ are the $N$
dependent variables, and their derivatives $y_i' \equiv \frac{dy_i}{dx}$, where $i = 1, 2, 3, \ldots, N$. The function $f[y_i(x), y_i'(x); x]$ has
an assumed dependence on $y_i$, $y_i'$ and $x$. The calculus of variations determines the functional dependence
of the dependent variables $y_i(x)$ on the independent variable $x$ that is needed to ensure that $F$ is an
extremum. For $N$ independent variables, $F$ has a stationary point, which is presumed to be an extremum,
that is determined by solution of Euler’s differential equations
$$\frac{d}{dx}\frac{\partial f}{\partial y_i'} - \frac{\partial f}{\partial y_i} = 0 \quad (6.2)$$
If the coordinates $y_i(x)$ are independent, then the Euler equations, (6.2), for each coordinate $i$ are
independent. However, for constrained motion, the constraints lead to auxiliary conditions that correlate the
coordinates. As shown in chapter 5, a transformation to independent generalized coordinates can be made
such that the correlations induced by the constraint forces are embedded into the choice of the independent
generalized coordinates. The use of generalized coordinates in Lagrangian mechanics simplifies derivation of
the equations of motion for constrained systems. For example, for a system of $n$ coordinates that involves
$m$ holonomic constraints, there are $s = n - m$ independent generalized coordinates. For such holonomic
constrained motion, it will be shown that the Euler equations can be solved using any of the following
three alternative ways.
1) The minimal set of generalized coordinates approach involves finding a set of $s = n - m$
independent generalized coordinates $q_i$ that satisfy the assumptions underlying (6.2). These generalized coordinates
can be determined if the $m$ equations of constraint are holonomic, that is, related by algebraic equations of
constraint
$$g_k(y_i; x) = 0 \quad (6.3)$$
where $k = 1, 2, 3, \ldots, m$. These equations uniquely determine the relationship between the $n$ correlated
coordinates. This method has the advantage that it reduces the system of $n$ coordinates, subject to $m$ constraints,
to $s = n - m$ independent generalized coordinates which reduces the dimension of the problem to be solved.
However, it does not explicitly determine the forces of constraint which are effectively swept under the rug.
2) The Lagrange multipliers approach takes account of the correlation between the $n$ coordinates and
$m$ holonomic constraints by introducing the Lagrange multipliers $\lambda_k(x)$. These $n$ generalized coordinates $y_i$
are correlated by the $m$ holonomic constraints.
$$\frac{d}{dx}\frac{\partial f}{\partial y_i'} - \frac{\partial f}{\partial y_i} = \sum_{k}^{m}\lambda_k(x)\frac{\partial g_k}{\partial y_i} \quad (6.4)$$
where $i = 1, 2, 3, \ldots, n$. The Lagrange multiplier approach has the advantage that Euler’s calculus of variations
automatically uses the $n$ Lagrange equations, plus the $m$ equations of constraint, to explicitly determine both
the $n$ coordinates and the $m$ forces of constraint, which are related to the Lagrange multipliers $\lambda_k$ as given
in equation (6.4). Chapter 6.2 shows that the $\sum_{k}^{m}\lambda_k(x)\frac{\partial g_k}{\partial y_i}$ terms are directly related to the holonomic
forces of constraint.
3) The generalized force approach incorporates the forces of constraint explicitly as will be shown in
chapter 6.5.4. Incorporating the constraint forces explicitly allows use of holonomic, non-holonomic, and
non-conservative constraint forces.
Understanding the Lagrange formulation of classical mechanics is facilitated by use of a simple
non-rigorous plausibility approach that is based on Newton’s laws of motion. This introductory plausibility
approach will be followed by two more rigorous derivations of the Lagrangian formulation developed using either
d’Alembert’s Principle or Hamilton’s Principle. These better elucidate the physics underlying the Lagrange
and Hamiltonian analytic representations of classical mechanics. In 1788 Lagrange derived his equations of
motion using the differential d’Alembert Principle, which extends to dynamical systems the Bernoulli Principle
of infinitesimal virtual displacements and virtual work. The other approach, developed in 1834, uses the
integral Hamilton’s Principle to derive the Lagrange equations. Hamilton’s Principle is discussed in more
detail in chapter 9. Euler’s variational calculus underlies d’Alembert’s Principle and Hamilton’s Principle
since both are based on the philosophical belief that the laws of nature prefer economy of motion.
Chapters 6.2 − 6.5 show that both d’Alembert’s Principle and Hamilton’s Principle lead to the Euler-Lagrange
equations. This will be followed by a series of examples that illustrate the use of Lagrangian mechanics in
classical mechanics.
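$$T = \frac{1}{2}mv^2 = \frac{\mathbf{p}\cdot\mathbf{p}}{2m} = \frac{p_x^2}{2m} + \frac{p_y^2}{2m} + \frac{p_z^2}{2m}$$
It can be seen that
$$\frac{\partial T}{\partial\dot{x}_i} = p_i \quad (6.6)$$
and
$$\frac{d}{dt}\frac{\partial T}{\partial\dot{x}_i} = \frac{dp_i}{dt} = F_i \quad (6.7)$$
Consider that the force, acting on a mass $m$, is arbitrarily separated into two components, one part that
is conservative, and thus can be written as the gradient of a scalar potential $U$, plus the excluded part of
the force, $F^{EX}$. The excluded part of the force $F^{EX}$ could include non-conservative frictional forces as well
as forces of constraint which may be conservative or non-conservative. This separation allows the force to
be written as
$$\mathbf{F} = -\nabla U + \mathbf{F}^{EX} \quad (6.8)$$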
Equation (69) can be extended by transforming the cartesian coordinate to the generalized coordinates
Define the standard Lagrangian to be the difference between the kinetic energy and the potential energy,
which can be written in terms of the generalized coordinates as
= + = (6.11)
̇ ̇ ̇ ̇
Using the above equations allows Newton’s equation of motion (69) to be expressed as
− = (6.12)
̇
That is the Lagrange multiplier terms can be used to account for holonomic constraint forces
. Thus
equation 612 can be written as
X
− = () + (6.15)
̇
where the Lagrange multiplier term accounts for holonomic constraint forces, and
includes all the
remaining forces that are not accounted for by the scalar potential , or the Lagrange multiplier terms
.
For holonomic, conservative forces it is possible to absorb all the forces into the potential plus the
Lagrange multiplier term, that is
= 0 Moreover, the use of a minimal set of generalized coordinates
allows the holonomic constraint forces to be ignored by explicitly reducing the number of coordinates from
dependent coordinates to = − independent generalized coordinates. That is, the correlations due
to the constraint forces are embedded into the generalized coordinates. Then equation 615 reduces to the
basic Euler differential equations.
− =0 (6.16)
̇
Note that equation 616 is identical to Euler’s equation 534, if the independent variable is replaced
R
by time . Thus Newton’s equation of motion are equivalent to minimizing the action integral = 12 ,
that is Z 2
= ( ̇ ; ) = 0 (6.17)
1
which is Hamilton’s Principle. Hamilton’s Principle underlies many aspects of physics and as discussed in
chapter 9, and is used as the starting point for developing classical mechanics. Hamilton’s Principle was
postulated 46 years after Lagrange introduced Lagrangian mechanics.
The above plausibility argument, which is based on Newtonian mechanics, illustrates the close connection
between the vectorial Newtonian mechanics and the algebraic Lagrangian mechanics approaches to classical
mechanics.
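The connection between Newton’s equations and a stationary action can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes a single particle in uniform gravity with $L = \frac{1}{2}m\dot{y}^2 - mgy$, discretizes the action with a midpoint rule, and compares the Newtonian path with trial paths that share the same end points. The function names, the sine-shaped perturbation, and the numerical values are illustrative assumptions.

```python
import math

def discrete_action(path, dt, m=1.0, g=9.81):
    """Midpoint-rule estimate of S = integral of (T - U) dt for vertical motion."""
    action = 0.0
    for y0, y1 in zip(path, path[1:]):
        velocity = (y1 - y0) / dt
        kinetic = 0.5 * m * velocity ** 2
        potential = m * g * 0.5 * (y0 + y1)
        action += (kinetic - potential) * dt
    return action

def trial_path(eps, n=200, t_end=1.0, v0=4.905, g=9.81):
    """Newtonian trajectory plus eps*sin(pi*t/t_end); the end points stay fixed."""
    dt = t_end / n
    path = []
    for k in range(n + 1):
        t = k * dt
        newtonian = v0 * t - 0.5 * g * t * t
        path.append(newtonian + eps * math.sin(math.pi * t / t_end))
    return path, dt

def action_for(eps):
    """Discrete action of the trial path with perturbation amplitude eps."""
    path, dt = trial_path(eps)
    return discrete_action(path, dt)

if __name__ == "__main__":
    for eps in (-0.2, -0.1, 0.0, 0.1, 0.2):
        print(f"eps = {eps:+.1f}   S = {action_for(eps):.6f}")
```

For this Lagrangian the action grows quadratically with the perturbation amplitude, so the unperturbed Newtonian path gives the smallest value among the trial paths.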
138 CHAPTER 6. LAGRANGIAN DYNAMICS
\[ \sum_i^N \mathbf{F}_i^A \cdot \delta\mathbf{r}_i + \sum_i^N \mathbf{f}_i^C \cdot \delta\mathbf{r}_i = 0 \tag{6.19} \]
The second term in equation 6.19 can be ignored if the virtual work due to the constraint forces is zero. This is rigorously true for rigid bodies and is valid for any forces of constraint where the constraint forces are perpendicular to the constraint surface and the virtual displacement is tangent to this surface. Thus if the constraint forces do no work, then (6.19) reduces to
\[ \sum_i^N \mathbf{F}_i^A \cdot \delta\mathbf{r}_i = 0 \tag{6.20} \]
This relation is Bernoulli’s Principle of Static Virtual Work and is used to solve problems in statics.
Bernoulli introduced dynamics by using Newton’s Law to relate force and momentum.
\[ \mathbf{F}_i = \dot{\mathbf{p}}_i \tag{6.21} \]
For the special case where the forces of constraint are zero, equation 6.24 reduces to d’Alembert’s Principle
\[ \sum_i^N (\mathbf{F}_i^A - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = 0 \tag{6.25} \]
d’Alembert’s Principle, by a stroke of genius, transforms the principle of virtual work from the realm of statics to dynamics. Application of virtual work to statics primarily leads to algebraic equations between the forces, whereas d’Alembert’s principle applied to dynamics leads to differential equations.
6.3. LAGRANGE EQUATIONS FROM D’ALEMBERT’S PRINCIPLE 139
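The arbitrary virtual displacement $\delta\mathbf{r}_i$ can be related to the virtual displacement of the generalized coordinate $\delta q_j$ by
\[ \delta\mathbf{r}_i = \sum_j^n \frac{\partial \mathbf{r}_i}{\partial q_j}\,\delta q_j \tag{6.28} \]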
Note that by definition, a virtual displacement considers only displacements of the coordinates, and no time
variation is involved.
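The above transformations can be used to express d’Alembert’s dynamical principle of virtual work in generalized coordinates. Thus the first term in d’Alembert’s Dynamical Principle (6.25) becomes
\[ \sum_i^N \mathbf{F}_i^A \cdot \delta\mathbf{r}_i = \sum_{i,j}^{N,n} \mathbf{F}_i^A \cdot \frac{\partial \mathbf{r}_i}{\partial q_j}\,\delta q_j = \sum_j^n Q_j\,\delta q_j \tag{6.29} \]
Note that just as the generalized coordinates $q_j$ need not have the dimensions of length, so the $Q_j$ do not necessarily have the dimensions of force, but the product $Q_j\,\delta q_j$ must have the dimensions of work. For example, $Q_j$ could be a torque and $\delta q_j$ could be the corresponding infinitesimal rotation angle.
The second term in d’Alembert’s Principle (6.25) can be transformed using equation 6.28
\[ \sum_i^N \dot{\mathbf{p}}_i \cdot \delta\mathbf{r}_i = \sum_i^N m_i \ddot{\mathbf{r}}_i \cdot \delta\mathbf{r}_i = \sum_j^n \left( \sum_i^N m_i \ddot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \right) \delta q_j \tag{6.31} \]
The second right-hand term in (6.32) can be rewritten by interchanging the order of the differentiation with respect to $t$ and $q_j$
\[ \frac{d}{dt}\left( \frac{\partial \mathbf{r}_i}{\partial q_j} \right) = \frac{\partial \mathbf{v}_i}{\partial q_j} \tag{6.35} \]
Substituting (6.34) and (6.35) into (6.32) gives
\[ \sum_i^N \dot{\mathbf{p}}_i \cdot \delta\mathbf{r}_i = \sum_j^n \left( \sum_i^N m_i \ddot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \right) \delta q_j = \sum_j^n \sum_i^N \left\{ \frac{d}{dt}\left( m_i \mathbf{v}_i \cdot \frac{\partial \mathbf{v}_i}{\partial \dot{q}_j} \right) - m_i \mathbf{v}_i \cdot \frac{\partial \mathbf{v}_i}{\partial q_j} \right\} \delta q_j \tag{6.36} \]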
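Inserting (6.29) and (6.36) into d’Alembert’s Principle (6.25) leads to the relation
\[ \sum_i^N (\mathbf{F}_i^A - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = -\sum_j^n \left\{ \frac{d}{dt}\left( \frac{\partial}{\partial \dot{q}_j} \left( \sum_i^N \frac{1}{2} m_i v_i^2 \right) \right) - \frac{\partial}{\partial q_j} \left( \sum_i^N \frac{1}{2} m_i v_i^2 \right) - Q_j \right\} \delta q_j = 0 \tag{6.37} \]
The $\sum_i^N \frac{1}{2} m_i v_i^2$ term can be identified with the system kinetic energy $T$. Thus d’Alembert’s Principle reduces to the relation
\[ \sum_j^n \left[ \left\{ \frac{d}{dt}\left( \frac{\partial T}{\partial \dot{q}_j} \right) - \frac{\partial T}{\partial q_j} \right\} - Q_j \right] \delta q_j = 0 \tag{6.38} \]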
For cartesian coordinates $T$ is a function only of velocities $(\dot{x}, \dot{y}, \dot{z})$ and thus the term $\partial T/\partial q_j = 0$. However, as discussed in appendix 22, for curvilinear coordinates $\partial T/\partial q_j \neq 0$ due to the curvature of the coordinates, as is illustrated for polar coordinates where $\mathbf{v} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}}$.
If all the $n$ generalized coordinates are independent, then equation 6.38 implies that the term in the square brackets is zero for each individual value of $j$. This leads to the basic Euler-Lagrange equations of motion for each of the independent generalized coordinates
\[ \left\{ \frac{d}{dt}\left( \frac{\partial T}{\partial \dot{q}_j} \right) - \frac{\partial T}{\partial q_j} \right\} = Q_j \tag{6.39} \]
where $n \geq j \geq 1$. That is, this leads to $n$ Euler-Lagrange equations of motion for the generalized forces $Q_j$. As discussed in chapter 5.8, when $m$ holonomic constraint forces apply, it is possible to reduce the system to $s = n - m$ independent generalized coordinates for which equation 6.25 applies.
In 1687 Leibniz proposed minimizing the time integral of his “vis viva”, which equals $2T$. That is,
\[ \delta \int_{t_1}^{t_2} T\,dt = 0 \tag{6.40} \]
The variational equation 6.39 accomplishes the minimization of equation 6.40. It is remarkable that Leibniz anticipated the basic variational concept prior to the birth of the developers of Lagrangian mechanics, i.e., d’Alembert, Euler, Lagrange, and Hamilton.
6.3.3 Lagrangian
The handling of both conservative and non-conservative generalized forces $Q_j$ is best achieved by assuming that the generalized force $Q_j = \sum_i^N \mathbf{F}_i^A \cdot \frac{\partial \mathbf{r}_i}{\partial q_j}$ can be partitioned into a conservative velocity-independent term, that can be expressed in terms of the gradient of a scalar potential, $-\nabla U_j$, plus an excluded generalized force $Q_j^{EX}$ which contains the non-conservative, velocity-dependent, and all the constraint forces not explicitly included in the potential $U_j$. That is,
\[ Q_j = -\nabla U_j + Q_j^{EX} \tag{6.41} \]
Inserting (6.41) into (6.38), and assuming that the potential $U$ is velocity independent, allows (6.38) to be rewritten as
\[ \sum_j^n \left[ \left\{ \frac{d}{dt}\left( \frac{\partial (T - U)}{\partial \dot{q}_j} \right) - \frac{\partial (T - U)}{\partial q_j} \right\} - Q_j^{EX} \right] \delta q_j = 0 \tag{6.42} \]
6.4. LAGRANGE EQUATIONS FROM HAMILTON’S ACTION PRINCIPLE 141
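\[ L \equiv T - U \tag{6.43} \]
\[ \sum_j^n \left[ \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} - Q_j^{EX} \right] \delta q_j = 0 \tag{6.44} \]
Note that equation (6.44) contains the basic Euler-Lagrange equation (6.38) as a special case when $U = 0$. In addition, note that if all the generalized coordinates are independent, then the square bracket terms are zero for each value of $j$, which leads to the general Euler-Lagrange equations of motion
\[ \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = Q_j^{EX} \tag{6.45} \]
where $n \geq j \geq 1$.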
Chapter 6.5.3 will show that the holonomic constraint forces can be factored out of the generalized force term $Q_j^{EX}$, which simplifies derivation of the equations of motion using Lagrangian mechanics. The general Euler-Lagrange equations of motion are used extensively in classical mechanics because conservative forces play a ubiquitous role in classical mechanics.
has a minimum value for the correct path of motion. Hamilton’s Action Principle can be written in terms of a virtual infinitesimal displacement $\delta$ as
\[ \delta S = \delta \int_{t_1}^{t_2} L\,dt = 0 \tag{6.47} \]
Variational calculus therefore implies that a system of $s$ independent generalized coordinates must satisfy the basic Lagrange-Euler equations
\[ \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = 0 \tag{6.48} \]
Note that for $Q_j^{EX} = 0$ this is the same as equation 6.45 which was derived using d’Alembert’s Principle.
This discussion has shown that Euler’s variational differential equation underlies both the differential variational d’Alembert Principle, and the more fundamental integral Hamilton’s Action Principle. As discussed in chapter 9.2, Hamilton’s Principle of Stationary Action adds a fundamental new dimension to classical mechanics which leads to derivation of both Lagrangian and Hamiltonian mechanics. That is, both Hamilton’s Action Principle, and d’Alembert’s Principle, can be used to derive Lagrangian mechanics leading to the most general Lagrange equations that are applicable to both holonomic and non-holonomic constraints, as well as conservative and non-conservative systems. In addition, Chapter 6.2 presented a plausibility argument showing that Lagrangian mechanics can be justified based on Newtonian mechanics. Hamilton’s Action Principle, and d’Alembert’s Principle, can be expressed in terms of generalized coordinates, which is much broader in scope than the equations of motion implied using Newtonian mechanics.
For the case of $s = n - m$ unknowns, any virtual displacement $\delta q_j$ is independent of the others, therefore the only way for (6.44) to hold is for the term in brackets to vanish for each value of $j$, that is
\[ \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = Q_j^{EX} \tag{6.50} \]
where $j = 1, 2, 3, \ldots, s$. These are the Lagrange equations for the minimal set of $s$ independent generalized coordinates.
If all the generalized forces are conservative plus velocity independent, and are included in the potential $U$, and $Q_j^{EX} = 0$, then (6.50) simplifies to
\[ \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = 0 \tag{6.51} \]
This is Euler’s differential equation, derived earlier using the calculus of variations. Thus d’Alembert’s Principle leads to a solution that minimizes the action integral, $\delta \int_{t_1}^{t_2} L\,dt = 0$, as stated by Hamilton’s Principle.
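where $k = 1, 2, 3, \ldots, m$. Kinematic constraints can be expressed in terms of the infinitesimal displacements of the form
\[ \sum_{j=1}^{n} \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)\,dq_j + \frac{\partial g_k}{\partial t}\,dt = 0 \tag{6.53} \]
where the coefficients $\partial g_k/\partial q_j$ and $\partial g_k/\partial t$ are functions of the generalized coordinates, described by the vector $\mathbf{q}$, that are derived from the equations of constraint. As discussed in chapter 5.7, if (6.53) represents the total differential of a function, then it can be integrated to give a holonomic relation of the form of equation (6.52). However, if (6.53) is not the total differential, then it can be integrated only after having solved the full problem. If $\partial g_k/\partial t = 0$ then the constraint is scleronomic.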
The discussion of Lagrange multipliers in chapter 5.9.1 showed that, for virtual displacements $\delta q_j$, the correlation of the generalized coordinates, due to the constraint forces, can be taken into account by multiplying (6.53) by unknown Lagrange multipliers $\lambda_k$ and summing over all $m$ constraints. Generalized forces can be partitioned into a Lagrange multiplier term plus a remainder force. That is
\[ Q_j^{EX} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC} \tag{6.54} \]
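where $Q_j^{EXC}$ is the remaining part of the generalized force after subtracting both the part of the force absorbed in the potential energy $U$, which is buried in the Lagrangian $L$, as well as the holonomic constraint forces which are included in the Lagrange multiplier terms $\sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$. The Lagrange multipliers $\lambda_k$ can be chosen arbitrarily in (6.56). Utilizing the free choice of the $m$ Lagrange multipliers $\lambda_k$ allows them to be determined in such a way that the coefficients of the first $m$ infinitesimals $\delta q_j$, i.e. the square brackets, vanish. Therefore the expression in the square bracket must vanish for each value of $1 \leq j \leq m$. Thus it follows that
\[ \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} - \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) - Q_j^{EXC} = 0 \tag{6.57} \]
when $j = 1, 2, \ldots, m$. Thus (6.56) reduces to a sum over the remaining coordinates between $m + 1 \leq j \leq n$
\[ \sum_{j=m+1}^{n} \left[ \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} - \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) - Q_j^{EXC} \right] \delta q_j = 0 \tag{6.58} \]
In equation (6.58) the $s = n - m$ infinitesimals $\delta q_j$ can be chosen freely since the $s = n - m$ degrees of freedom are independent. Therefore the expression in the square bracket must vanish for each value of $m + 1 \leq j \leq n$. Thus it follows that
\[ \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} - \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) - Q_j^{EXC} = 0 \tag{6.59} \]
where $j = m + 1, m + 2, \ldots, n$. Combining equations (6.57) and (6.59) then gives the important general relation that for $1 \leq j \leq n$
\[ \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC} \tag{6.60} \]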
To summarize, the Lagrange multiplier approach (6.60) automatically solves the $n$ equations plus the $m$ holonomic equations of constraint, which determines the $n + m$ unknowns, that is, the $n$ coordinates plus the $m$ forces of constraint. The beauty of the Lagrange multipliers is that all $n$ variables, plus the $m$ constraint forces, are found simultaneously by using the calculus of variations to determine the extremum for the expanded Lagrangian $L'(\mathbf{q}, \dot{\mathbf{q}}, \boldsymbol{\lambda})$.
is the sum of the components in the $q_j$ direction for all external forces that have not been taken into account by the scalar potential or the Lagrange multipliers. Thus the non-conservative generalized force $Q_j^{EXC}$ contains non-holonomic constraint forces, including dissipative forces such as drag or friction, that are not included in $U$ or used in the Lagrange multiplier terms to account for the holonomic constraint forces.
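The concept of generalized forces is illustrated by the case of spherical coordinate systems. The attached table gives the displacement elements $\delta q_j$ (taken from table 4) and the generalized force for the three coordinates. Note that $Q_r$ has the dimensions of force and $Q_r\,\delta r$ has the units of energy. By contrast, equation 6.30 gives that $Q_\theta = F_\theta r$ and $Q_\phi = F_\phi r \sin\theta$, which have the dimensions of torque. However, $Q_\theta\,\delta\theta$ and $Q_\phi\,\delta\phi$ both have the dimensions of energy as is required in equation 6.30. This illustrates that the units used for generalized forces depend on the units of the corresponding generalized coordinate.

Unit vector | $q_j$ | Displacement element | $Q_j$ | $Q_j\,\delta q_j$
$\hat{\mathbf{r}}$ | $r$ | $\hat{\mathbf{r}}\,\delta r$ | $\mathbf{F} \cdot \hat{\mathbf{r}}$ | $F_r\,\delta r$
$\hat{\boldsymbol{\theta}}$ | $\theta$ | $\hat{\boldsymbol{\theta}}\,r\,\delta\theta$ | $\mathbf{F} \cdot \hat{\boldsymbol{\theta}}\,r$ | $F_\theta\,r\,\delta\theta$
$\hat{\boldsymbol{\phi}}$ | $\phi$ | $\hat{\boldsymbol{\phi}}\,r \sin\theta\,\delta\phi$ | $\mathbf{F} \cdot \hat{\boldsymbol{\phi}}\,r \sin\theta$ | $F_\phi\,r \sin\theta\,\delta\phi$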
for $n$ variables, with $m$ equations of constraint. The generalized forces $Q_j^{EXC}$ are not included in the conservative potential energy $U$ or in the Lagrange multipliers approach for holonomic equations of constraint.
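The spherical-coordinate generalized forces discussed above follow directly from the definition $Q_j = \mathbf{F} \cdot \partial\mathbf{r}/\partial q_j$. The following is a minimal numerical sketch of that definition, assuming a single particle, an arbitrary illustrative force vector, and central finite differences for the derivatives. The function names are hypothetical and are not from the text.

```python
import math

def position(q):
    """Cartesian position for spherical coordinates q = (r, theta, phi)."""
    r, theta, phi = q
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def generalized_forces(force, q, h=1e-6):
    """Q_j = F . dr/dq_j, with dr/dq_j estimated by central differences."""
    result = []
    for j in range(3):
        plus, minus = list(q), list(q)
        plus[j] += h
        minus[j] -= h
        r_plus, r_minus = position(plus), position(minus)
        tangent = [(a - b) / (2.0 * h) for a, b in zip(r_plus, r_minus)]
        result.append(sum(f * t for f, t in zip(force, tangent)))
    return result

def expected_from_unit_vectors(force, q):
    """Closed forms: Q_r = F.r_hat, Q_theta = r F.theta_hat, Q_phi = r sin(theta) F.phi_hat."""
    r, theta, phi = q
    r_hat = (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta))
    theta_hat = (math.cos(theta) * math.cos(phi), math.cos(theta) * math.sin(phi), -math.sin(theta))
    phi_hat = (-math.sin(phi), math.cos(phi), 0.0)
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return [dot(force, r_hat),
            r * dot(force, theta_hat),
            r * math.sin(theta) * dot(force, phi_hat)]

if __name__ == "__main__":
    q = (2.0, 0.7, 1.1)        # illustrative r, theta, phi
    F = (0.3, -1.2, 0.8)       # illustrative cartesian force
    print(generalized_forces(F, q))
    print(expected_from_unit_vectors(F, q))
```

The radial component has the dimensions of force, while the two angular components carry an extra factor of length and so behave as torques.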
The following is a logical procedure for applying the Euler-Lagrange equations to classical mechanics.
In summary, Lagrangian mechanics is based on energies, which are scalars, in contrast to Newtonian mechanics, which is based on vector forces and momentum. As a consequence, Lagrangian mechanics allows use of any set of independent generalized coordinates, which do not have to be orthogonal, and they can have very different units for different variables. The generalized coordinates can incorporate the correlations introduced by constraint forces.
The active forces are split into the following three categories:
1. Velocity-independent conservative forces are taken into account using scalar potentials $U$.
2. Holonomic constraint forces can be determined using Lagrange multipliers.
3. Non-holonomic constraints require use of generalized forces $Q_j^{EXC}$.
Use of the concept of scalar potentials is a trivial and powerful way to incorporate conservative forces in Lagrangian mechanics. The Lagrange multipliers approach requires using the Euler-Lagrange equations for $n + m$ coordinates but determines both holonomic constraint forces and equations of motion simultaneously. Non-holonomic constraints and dissipative forces can be incorporated into Lagrangian mechanics via use of generalized forces, which broadens the scope of Lagrangian mechanics.
Note that the equations of motion resulting from the Lagrange-Euler algebraic approach are the same equations of motion as obtained using Newtonian mechanics. However, the Lagrangian is a scalar, which facilitates rotation into the most convenient frame of reference. This can greatly simplify determination of the equations of motion when constraint forces apply. As discussed in chapter 17, the Lagrangian and the Hamiltonian variational approaches to mechanics are the only viable way to handle relativistic, statistical, and quantum mechanics.
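\[ \frac{\partial L}{\partial \dot{x}} = m\dot{x} \qquad \frac{\partial L}{\partial \dot{y}} = m\dot{y} \qquad \frac{\partial L}{\partial \dot{z}} = m\dot{z} \]
\[ \frac{\partial L}{\partial x} = \frac{\partial L}{\partial y} = \frac{\partial L}{\partial z} = 0 \]
Inserting these in the Lagrange equation gives
\[ \Lambda_x L = \frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = \frac{d}{dt} m\dot{x} - 0 = 0 \]
Thus
\[ p_x = m\dot{x} = \text{constant} \qquad p_y = m\dot{y} = \text{constant} \qquad p_z = m\dot{z} = \text{constant} \]
That is, this shows that the linear momentum is conserved if $U$ is a constant, that is, no forces apply. Note that momentum conservation has been derived without any direct reference to forces.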
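\[ L = \frac{1}{2} m \left( \dot{x}^2 + \dot{y}^2 \right) - mgy \]
Using the Lagrange equation for the $x$ coordinate gives
\[ \Lambda_x L = \frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = m\ddot{x} - 0 = 0 \]
Thus the horizontal momentum $m\dot{x}$ is conserved and $\ddot{x} = 0$. The $y$ coordinate gives
\[ \Lambda_y L = m\ddot{y} + mg = 0 \]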
The importance of selecting the most convenient generalized coordinates is nicely illustrated by trying to solve this problem using polar coordinates $r, \theta$, where $r$ is the radial distance and $\theta$ the elevation angle from the $x$ axis as shown in the adjacent figure. Then
\[ T = \frac{1}{2} m \dot{r}^2 + \frac{1}{2} m \left( r\dot{\theta} \right)^2 \]
\[ U = mgr \sin\theta \]
Thus
\[ L = \frac{1}{2} m \dot{r}^2 + \frac{1}{2} m \left( r\dot{\theta} \right)^2 - mgr \sin\theta \]
$\Lambda_r L = 0$ for the $r$ coordinate
\[ r\dot{\theta}^2 - g \sin\theta - \ddot{r} = 0 \]
$\Lambda_\theta L = 0$ for the $\theta$ coordinate
\[ -gr \cos\theta - 2r\dot{r}\dot{\theta} - r^2\ddot{\theta} = 0 \]
These equations written in polar coordinates are more complicated than the result expressed in cartesian coordinates. This is because the potential energy depends directly on the $y$ coordinate, whereas it is a function of both $r$ and $\theta$. This illustrates the freedom for using different generalized coordinates, plus the importance of choosing a sensible set of generalized coordinates.
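The equivalence of the two coordinate choices can be checked numerically. The following is a minimal sketch, assuming the polar equations $\ddot{r} = r\dot{\theta}^2 - g\sin\theta$ and $\ddot{\theta} = -(g\cos\theta + 2\dot{r}\dot{\theta})/r$ quoted above, a fixed-step fourth-order Runge-Kutta integrator, and illustrative launch conditions that keep $r > 0$. The function names and numbers are assumptions, not part of the original example.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (illustrative value)

def polar_derivatives(state):
    """Time derivatives of (r, theta, r_dot, theta_dot) from the polar Lagrange equations."""
    r, theta, r_dot, theta_dot = state
    r_ddot = r * theta_dot ** 2 - G * math.sin(theta)
    theta_ddot = -(G * math.cos(theta) + 2.0 * r_dot * theta_dot) / r
    return (r_dot, theta_dot, r_ddot, theta_ddot)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))
    k1 = polar_derivatives(state)
    k2 = polar_derivatives(shifted(state, k1, dt / 2.0))
    k3 = polar_derivatives(shifted(state, k2, dt / 2.0))
    k4 = polar_derivatives(shifted(state, k3, dt))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def fly_polar(x0, y0, vx, vy, t_end, steps=500):
    """Integrate the polar equations and return the final cartesian position."""
    r = math.hypot(x0, y0)
    theta = math.atan2(y0, x0)
    r_dot = (x0 * vx + y0 * vy) / r
    theta_dot = (x0 * vy - y0 * vx) / r ** 2
    state = (r, theta, r_dot, theta_dot)
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state[0] * math.cos(state[1]), state[0] * math.sin(state[1])

def fly_cartesian(x0, y0, vx, vy, t):
    """Closed-form solution of the cartesian equations: x_ddot = 0, y_ddot = -g."""
    return x0 + vx * t, y0 + vy * t - 0.5 * G * t * t

if __name__ == "__main__":
    print(fly_polar(1.0, 0.0, 3.0, 4.0, 0.5))
    print(fly_cartesian(1.0, 0.0, 3.0, 4.0, 0.5))
```

The cartesian form needs no integrator at all, which is the practical content of the remark about choosing a sensible set of coordinates.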
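\[ U = mg(l - y) \sin\alpha \]
\[ g_1 = y - R\theta = 0 \]
\[ g_2 = x - R = 0 \]
A holonomic constraint can be used to reduce the system to a single generalized coordinate $y$ plus generalized velocity $\dot{y}$. Expressed in terms of this single generalized coordinate, the Lagrangian becomes
\[ L = \frac{1}{2} \left( m + \frac{I}{R^2} \right) \dot{y}^2 - mg(l - y) \sin\alpha \]
\[ g_1 = y - R\theta = 0 \]
\[ g_2 = x - R = 0 \]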
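which gives
\[ I\ddot{\theta} = -\lambda_1 R \]
The constraint can be written as
\[ \ddot{y} = R\ddot{\theta} \]
Let $I = \frac{1}{2} m R^2$ and solve for $\lambda_1$, which gives
\[ \lambda_1 = -\frac{mg \sin\alpha}{1 + \frac{mR^2}{I}} = -\frac{mg}{3} \sin\alpha \]
The frictional force is given by
\[ F_f = \lambda_1 \frac{\partial g_1}{\partial y} = \lambda_1 = -\frac{mg}{3} \sin\alpha \]
Also
\[ \ddot{y} = g \sin\alpha + \frac{\lambda_1}{m} = \frac{2}{3} g \sin\alpha \]
and the torque is
\[ -\lambda_1 R = -F_f R = I\ddot{\theta} \]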
The four methods for handling the equations of constraint all are equivalent and result in the same
equations of motion. The scalar Lagrangian mechanics is able to calculate the vector forces acting in a direct
and simple way. The Newton’s law approach is more intuitive for this simple case and the ease and power
of the Lagrangian approach is not apparent for this simple system.
The following series of examples will gradually increase in complexity, and will illustrate the power,
elegance, plus superiority of the Lagrangian approach compared with the Newtonian approach.
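The Lagrange multiplier result for the rolling disk can be evaluated for any moment of inertia. The following is a minimal sketch, assuming the relations $\lambda_1 = -mg\sin\alpha/(1 + mR^2/I)$, $\ddot{y} = g\sin\alpha + \lambda_1/m$ and $I\ddot{\theta} = -\lambda_1 R$ from the example above. The function name and the numerical values are illustrative assumptions.

```python
import math

def rolling_on_incline(mass, radius, inertia, alpha, g=9.81):
    """Return (lambda_1, y_ddot, theta_ddot) for a body rolling without slipping.

    lambda_1 is the Lagrange multiplier of the rolling constraint g_1 = y - R*theta,
    which equals the friction force along the plane.
    """
    lam = -mass * g * math.sin(alpha) / (1.0 + mass * radius ** 2 / inertia)
    y_ddot = g * math.sin(alpha) + lam / mass
    theta_ddot = -lam * radius / inertia
    return lam, y_ddot, theta_ddot

if __name__ == "__main__":
    m, R, alpha = 2.0, 0.3, math.radians(30.0)   # illustrative values
    disk_inertia = 0.5 * m * R ** 2              # uniform disk
    lam, y_ddot, theta_ddot = rolling_on_incline(m, R, disk_inertia, alpha)
    print("friction force :", lam)
    print("acceleration   :", y_ddot)
    print("R * theta_ddot :", R * theta_ddot)
```

For the uniform disk the acceleration comes out as two thirds of $g\sin\alpha$, and $R\ddot{\theta}$ reproduces $\ddot{y}$, which confirms that the rolling constraint is satisfied.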
6.8. APPLICATIONS TO SYSTEMS INVOLVING HOLONOMIC CONSTRAINTS 151
\[ x_1 + x_2 - l = 0 \]
\[ x_2 = l - x_1 \]
\[ y_1 = x_1 \sin\theta_1 \]
\[ y_2 = (l - x_1) \sin\theta_2 \]
The conservative gravitational force is absorbed into the potential energy given by
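\[ x = l \cos\theta \qquad y = l \sin\theta \]
Thus
\[ \dot{x} = -l (\sin\theta)\,\dot{\theta} \qquad \dot{y} = l (\cos\theta)\,\dot{\theta} \]
[Figure: Two frictionless masses that are connected by a bar and are constrained to slide in vertical and horizontal channels.]
This constraint, that is absorbed into the generalized coordinate, is holonomic, scleronomic, and conservative.
The kinetic energy is given by
\[ T = \frac{1}{2} m \left( l^2 (\sin\theta)^2 \dot{\theta}^2 + l^2 (\cos\theta)^2 \dot{\theta}^2 \right) = \frac{1}{2} m l^2 \dot{\theta}^2 \]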
The gravitational potential energy is given by
= −0 sin
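\[ v_x = \dot{x} + \dot{\xi} \cos\alpha \]
\[ v_y = -\dot{\xi} \sin\alpha \]
\[ (m + M)\ddot{x} + m\ddot{\xi} \cos\alpha = 0 \]
\[ \ddot{x} \cos\alpha + \frac{7}{5}\ddot{\xi} - g \sin\alpha = 0 \]
Eliminating $\ddot{x}$ gives
\[ \left( \frac{7}{5} - \frac{m \cos^2\alpha}{m + M} \right) \ddot{\xi} = g \sin\alpha \]
Integrating this equation, assuming the initial conditions $\xi = \dot{\xi} = 0$ at $t = 0$, results in
\[ \xi = \frac{5 (m + M) g \sin\alpha}{2 \left[ 7 (m + M) - 5 m \cos^2\alpha \right]}\, t^2 \]
Thus
\[ x = -\frac{m \cos\alpha}{m + M}\,\xi = -\frac{5 m g \sin(2\alpha)}{4 \left[ 7 (m + M) - 5 m \cos^2\alpha \right]}\, t^2 \]
[Figure: Solid sphere rolling without slipping on an inclined plane on a frictionless horizontal floor.]
Note that these equations predict conservation of linear momentum for the block plus sphere.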
2 ̇ = constant =
\[ \ddot{\theta} + \frac{g}{b} \sin\theta - \frac{p_\phi^2 \cos\theta}{m^2 b^4 \sin^3\theta} = 0 \]
There are many possible solutions depending on the initial conditions. The pendulum can just oscillate in the $\theta$ direction, or rotate in the $\phi$ direction, or some combination of these. Note that if $p_\phi$ is zero, then the equation reduces to the simple harmonic pendulum, while the other extreme is when $\ddot{\theta} = 0$ for which the motion is that of a conical pendulum that rotates at a constant angle $\theta_0$ to the vertical axis.
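The two limiting cases of the spherical pendulum can be checked directly from the equation of motion. The following is a minimal sketch, assuming the form $\ddot{\theta} = -(g/b)\sin\theta + p_\phi^2\cos\theta/(m^2 b^4 \sin^3\theta)$ with $b$ the pendulum length and $\theta$ measured from the downward vertical, and assuming the standard conical-pendulum rate $\dot{\phi}^2 = g/(b\cos\theta_0)$. The symbol names and function names are assumptions made for illustration.

```python
import math

def theta_ddot(theta, p_phi, m=1.0, b=1.0, g=9.81):
    """Polar-angle acceleration of a spherical pendulum with angular momentum p_phi."""
    gravity_term = -(g / b) * math.sin(theta)
    centrifugal_term = p_phi ** 2 * math.cos(theta) / (m ** 2 * b ** 4 * math.sin(theta) ** 3)
    return gravity_term + centrifugal_term

def conical_p_phi(theta0, m=1.0, b=1.0, g=9.81):
    """Angular momentum that keeps the pendulum at the constant angle theta0."""
    phi_dot = math.sqrt(g / (b * math.cos(theta0)))
    return m * b ** 2 * math.sin(theta0) ** 2 * phi_dot

if __name__ == "__main__":
    theta0 = 0.5  # radians, illustrative
    print("conical case, theta_ddot:", theta_ddot(theta0, conical_p_phi(theta0)))
    print("planar case,  theta_ddot:", theta_ddot(theta0, 0.0))
```

With the conical value of $p_\phi$ the acceleration vanishes to rounding error, while $p_\phi = 0$ recovers the planar pendulum equation.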
= cos
= sin
The angular momentum $p_\theta = m r^2 \dot{\theta}$, thus the equation of motion can be written as
The last term in the right-hand side is the Coriolis force caused by the time variation of the pendulum length.
For the radial distance $r$ the Lagrange equation $\Lambda_r L = 0$ gives
\[ m\ddot{r} = m r \dot{\theta}^2 + mg \cos\theta - k (r - r_0) \]
This equation just equals the tension in the spring, i.e. $F = m\ddot{r}$. The first term on the right-hand side represents the centrifugal radial acceleration, the second term is the radial component of the gravitational force, and the third term represents Hooke’s Law for the spring. For small amplitudes of $\theta$ the motion appears as a superposition of harmonic oscillations in the $r, \theta$ plane.
In this example the orthogonal coordinate approach used gave the tension in the spring, thus it is unnecessary to repeat this using the Lagrange multiplier approach.
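\[ \frac{d}{dt}\frac{\partial L}{\partial \dot{\rho}} - \frac{\partial L}{\partial \rho} = \lambda_1 2\rho \tag{a} \]
\[ m \left( \ddot{\rho} - \rho\dot{\phi}^2 \right) = \lambda_1 2\rho \]
For $\Lambda_\phi L = \lambda_1 \frac{\partial g}{\partial \phi}$
\[ \frac{d}{dt}\left( m\rho^2\dot{\phi} \right) = \dot{p}_\phi = 0 \tag{b} \]
Thus the angular momentum is conserved, that is, it is a constant of motion.
For $\Lambda_z L = \lambda_1 \frac{\partial g}{\partial z}$
\[ m\ddot{z} = -mg - \lambda_1 a \tag{c} \]
and the time differential of the constraint equation is
\[ 2\rho\dot{\rho} - a\dot{z} = 0 \tag{d} \]
The above four equations of motion can be used to determine $\rho$, $\phi$, $z$, and $\lambda_1$.
The radius of the circle at the intersection of the plane $z = h$ with the paraboloid $\rho^2 = az$ is given by $\rho_0 = \sqrt{ah}$. For a constant height $z = h$, then $\ddot{z} = 0$ and equation (c) reduces to
\[ \lambda_1 = -\frac{mg}{a} \]
Therefore the constraint force is given by
\[ F_c = \lambda_1 \frac{\partial g}{\partial \rho} = -\frac{mg}{a}\, 2\rho \]
Assuming that $\ddot{\rho} = 0$ then equation (a) for $\dot{\phi} = \omega$ and $\dot{\rho} = 0$ gives
\[ m \left( 0 - \rho_0\omega^2 \right) = \lambda_1 2\rho_0 = -\frac{mg}{a}\, 2\rho_0 = F_c \]
That is, the constraint force equals
\[ F_c = -m\rho_0\omega^2 \]
which is the usual centripetal force. These relations also give that the initial angular velocity required for such a stable trajectory with height $h$ is
\[ \dot{\phi} = \omega = \sqrt{\frac{2g}{a}} \]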
1 ³ 2
´ 1 ³ 2
´
= 1 ̇2 + 2 ̇ + 2 ̇2 + 2 ̇
2 2
³ ´ 1 µ ¶
1 ·2
2 2
= 1 ̇2 + ( − ) ̇ + 2 ̇2 + 2 m2
2 2
[Figure: Mass $m_2$ hanging from a rope that is connected to $m_1$, which slides on a frictionless plane.]
The potential energy in terms of the generalized coordinates, relative to the horizontal plane, is
= 0 − 2 cos
6.15 Example: Two connected masses constrained to slide along a moving rod
Consider two identical masses $m$ constrained to move along the axis of a thin straight rod, of mass $M$ and length $l$, which is free to both translate and rotate. Two identical springs link the two masses to the central point of the rod. Consider only motions of the system for which the extended lengths of the two springs are equal and opposite, such that the two masses always are equal distances from the center of the rod, keeping the center of mass at the center of the rod. Find the equations of motion for this system.
[Figure: Two identical masses $m$ constrained to slide on a moving rod of mass $M$. The masses are attached to the center of the rod by identical springs each having a spring constant $k$.]
Use a fixed cartesian coordinate system $(x, y, z)$ and a moving frame with the origin $O$ at the center of the rod with its cartesian coordinates $(x_1, y_1, z_1)$ being parallel to the fixed coordinate frame as shown in the figure. Let $(r, \theta, \phi)$ be the spherical coordinates of a point referring to the center of the moving $(x_1, y_1, z_1)$ frame as shown in the figure. Then the two masses have spherical coordinates $(r, \theta, \phi)$ and $(-r, \theta, \phi)$ in the moving-rod fixed frame. The frictionless constraints are holonomic.
The kinetic energy of the system is equal to the kinetic energy for all the mass concentrated at the center of mass plus the kinetic energy about the center of mass. Since $O$ is the center of mass, the kinetic energy can be separated into three terms
\[ T = T_{CM} + T_{rot}^{m} + T_{rot}^{M} \]
Note that since the kinetic energy is a scalar quantity it is rotationally invariant and thus can be evaluated in any rotated frame. Thus the kinetic energy of the center of mass is
\[ T_{CM} = \frac{1}{2} (M + 2m) (\dot{x}^2 + \dot{y}^2 + \dot{z}^2) \]
The rotational kinetic energy of the two masses in the center of mass frame is
\[ T_{rot}^{m} = m (\dot{r}^2 + r^2\dot{\theta}^2 + r^2\dot{\phi}^2 \sin^2\theta) \]
The rotational kinetic energy of the rod is a scalar and thus can be evaluated in any rotated frame of reference fixed with respect to the principal axis system of the rod. With the angular velocity of the rod about $O$ resolved along its principal axes, the rotational kinetic energy of the rod is
\[ T_{rot}^{M} = \frac{1}{2} (I_1\omega_1^2 + I_2\omega_2^2 + I_3\omega_3^2) = \frac{1}{24} M l^2 (\dot{\theta}^2 + \dot{\phi}^2 \sin^2\theta) \]
The only potential energy is due to the two extended springs, which are assumed to have the same length $r$, where $r_0$ is the unstretched length.
\[ U = 2 \cdot \frac{1}{2} k (r - r_0)^2 = k (r - r_0)^2 \]
Thus the Lagrangian is
\[ L = \frac{1}{2} (M + 2m) (\dot{x}^2 + \dot{y}^2 + \dot{z}^2) + m (\dot{r}^2 + r^2\dot{\theta}^2 + r^2\dot{\phi}^2 \sin^2\theta) + \frac{1}{24} M l^2 (\dot{\theta}^2 + \dot{\phi}^2 \sin^2\theta) - k (r - r_0)^2 \]
Using Lagrange’s equations $\Lambda_{q_j} L = 0$ for the generalized coordinates gives.
6.9. APPLICATIONS INVOLVING NON-HOLONOMIC CONSTRAINTS 161
\[ g(r, \theta) = r - R = 0 \]
For the restricted domain where this system is holonomic, it can be solved using generalized coordinates, generalized forces, Lagrange multipliers, or Newtonian mechanics as illustrated below.
Minimal generalized coordinates:
The minimal number of generalized coordinates reduces the system to one coordinate $\theta$, which does not determine the constraint force that is needed to know if the constraint applies. Thus this approach is not useful for solving this partially-holonomic system.
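Generalized forces:
The radial constraint has a corresponding generalized force $Q_r$. The Lagrange equation $\Lambda_r L = Q_r$ gives
\[ m\ddot{r} + mg \cos\theta - m r \dot{\theta}^2 = Q_r \tag{a} \]
The Lagrange equation $\Lambda_\theta L = Q_\theta = 0$ since there is no tangential force for this frictionless system. Therefore
\[ m r^2 \ddot{\theta} + 2 m r \dot{r}\dot{\theta} - mgr \sin\theta = 0 \tag{b} \]
When constrained to follow the surface of the spherical shell, the system is holonomic, i.e. $r = R$ and $\dot{r} = \ddot{r} = 0$. Thus the above two equations reduce to
\[ mg \cos\theta - m R \dot{\theta}^2 = Q_r \tag{c} \]
\[ m R^2 \ddot{\theta} - mgR \sin\theta = 0 \]
That is
\[ \ddot{\theta} = \frac{g}{R} \sin\theta \]
Integrate to get $\dot{\theta}$ using the fact that
\[ \ddot{\theta} = \frac{d\dot{\theta}}{dt} = \dot{\theta}\,\frac{d\dot{\theta}}{d\theta} \]
then
\[ \int \ddot{\theta}\,d\theta = \int \dot{\theta}\,d\dot{\theta} = \frac{g}{R} \int \sin\theta\,d\theta \]
Therefore
\[ \dot{\theta}^2 = \frac{2g}{R} (1 - \cos\theta) \tag{d} \]
assuming that $\dot{\theta} = 0$ at $\theta = 0$. Substituting equation (d) into equation (c) gives the constraint force, which is normal to the surface, to be
\[ Q_r = N = mg (3 \cos\theta - 2) \]
Note that $Q_r = N = 0$ when $\cos\theta = \frac{2}{3}$, that is $\theta = 48.2^\circ$.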
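Lagrange multipliers:
For the holonomic regime, which obeys the constraint $g(r, \theta) = r - R = 0$, the Lagrange equation for $r$ is $\Lambda_r L = \lambda \frac{\partial g}{\partial r}$. Since $\frac{\partial g}{\partial r} = 1$ then
\[ m\ddot{r} + mg \cos\theta - m r \dot{\theta}^2 = \lambda \tag{a} \]
The Lagrange equation for $\theta$ gives $\Lambda_\theta L = \lambda \frac{\partial g}{\partial \theta} = 0$ since $\frac{\partial g}{\partial \theta} = 0$. Thus
As above, when constrained to follow the surface of the spherical shell, the system is holonomic, $r = R$ and $\dot{r} = \ddot{r} = 0$. Thus the above two equations reduce to
\[ mg \cos\theta - m R \dot{\theta}^2 = \lambda \tag{c} \]
\[ m R^2 \ddot{\theta} - mgR \sin\theta = 0 \tag{d} \]
That is, the answers are identical to those obtained using generalized forces, namely
\[ \dot{\theta}^2 = \frac{2g}{R} (1 - \cos\theta) \tag{d} \]
assuming that $\dot{\theta} = 0$ at $\theta = 0$.
The force of constraint applied by the surface is
\[ N = \lambda \frac{\partial g}{\partial r} = \lambda = mg (3 \cos\theta - 2) \]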
Energy conservation:
This problem can be solved using energy conservation
\[ \frac{1}{2} m v^2 = mgR [1 - \cos\theta] \]
Thus the centripetal acceleration
\[ \frac{v^2}{R} = 2g [1 - \cos\theta] \]
The normal force to the surface will cancel when the centripetal acceleration equals the radial component of the gravitational acceleration, that is, when
\[ \frac{v^2}{R} = 2g [1 - \cos\theta] = g \cos\theta \]
This occurs when $\cos\theta = \frac{2}{3}$. This is an unusual case where the Newtonian approach is the simplest.
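The departure angle can also be found by integrating the equation of motion and watching the normal force. The following is a minimal sketch, assuming $\ddot{\theta} = (g/R)\sin\theta$ and $N/m = g\cos\theta - R\dot{\theta}^2$ from the example above. Because $\theta = 0$ is an equilibrium, the integration is started at a small assumed angle with the speed fixed by the energy relation. The function name, step size, and starting angle are illustrative choices.

```python
import math

def departure_angle_numeric(R=1.0, g=9.81, theta0=1e-3, dt=1e-4):
    """Integrate theta_ddot = (g/R) sin(theta) with RK4 until the normal force vanishes."""
    def accel(theta):
        return (g / R) * math.sin(theta)

    def normal_over_m(theta, omega):
        return g * math.cos(theta) - R * omega ** 2

    theta = theta0
    # Speed at theta0 consistent with starting from rest at the top of the sphere.
    omega = math.sqrt(2.0 * g / R * (1.0 - math.cos(theta0)))
    while True:
        n_old = normal_over_m(theta, omega)
        k1t, k1w = omega, accel(theta)
        k2t, k2w = omega + 0.5 * dt * k1w, accel(theta + 0.5 * dt * k1t)
        k3t, k3w = omega + 0.5 * dt * k2w, accel(theta + 0.5 * dt * k2t)
        k4t, k4w = omega + dt * k3w, accel(theta + dt * k3t)
        new_theta = theta + dt / 6.0 * (k1t + 2.0 * k2t + 2.0 * k3t + k4t)
        new_omega = omega + dt / 6.0 * (k1w + 2.0 * k2w + 2.0 * k3w + k4w)
        n_new = normal_over_m(new_theta, new_omega)
        if n_new <= 0.0:
            # Linear interpolation for the angle at which N crosses zero.
            frac = n_old / (n_old - n_new)
            return theta + frac * (new_theta - theta)
        theta, omega = new_theta, new_omega

if __name__ == "__main__":
    print("numerical :", math.degrees(departure_angle_numeric()))
    print("analytic  :", math.degrees(math.acos(2.0 / 3.0)))
```

The numerical crossing should agree with $\cos\theta = 2/3$ to well within the step size, independent of the radius $R$.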
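\[ g_1 = r - R - a = 0 \]
\[ g_2 = a (\phi - \theta) - R\theta = 0 \]
The kinetic energy is $T = \frac{1}{2} m \left( \dot{r}^2 + r^2\dot{\theta}^2 \right) + \frac{1}{2} I \dot{\phi}^2$ and the potential energy is $U = mgr \cos\theta$. Thus the Lagrangian is
\[ L = \frac{1}{2} m \left( \dot{r}^2 + r^2\dot{\theta}^2 \right) + \frac{1}{2} I \dot{\phi}^2 - mgr \cos\theta \]
Consider the solution using Lagrange multipliers for the holonomic regime where both constraints are satisfied and lead to the following differential constraint relations
\[ \frac{\partial g_1}{\partial r} = 1 \qquad \frac{\partial g_1}{\partial \theta} = 0 \qquad \frac{\partial g_1}{\partial \phi} = 0 \]
\[ \frac{\partial g_2}{\partial r} = 0 \qquad \frac{\partial g_2}{\partial \phi} = a \qquad \frac{\partial g_2}{\partial \theta} = -(R + a) \]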
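The Lagrange operator equation $\Lambda_r L$ gives
\[ \frac{d}{dt}\frac{\partial L}{\partial \dot{r}} - \frac{\partial L}{\partial r} = \lambda_1 \frac{\partial g_1}{\partial r} + \lambda_2 \frac{\partial g_2}{\partial r} \]
that is
\[ m\ddot{r} + mg \cos\theta - m r \dot{\theta}^2 = \lambda_1 \tag{a} \]
$\Lambda_\theta L$ gives
\[ m r^2 \ddot{\theta} + 2 m r \dot{r}\dot{\theta} - mgr \sin\theta = -\lambda_2 (R + a) \tag{b} \]
$\Lambda_\phi L$ gives
\[ I\ddot{\phi} = \lambda_2 a \tag{c} \]
Since the center of the sphere rolling on the spherical shell must have
\[ r = R + a \]
then
\[ \dot{r} = \ddot{r} = 0 \]
\[ \ddot{\phi} = \frac{R + a}{a}\,\ddot{\theta} \]
Substituting this into (c) gives
\[ \frac{I (R + a)}{a^2}\,\ddot{\theta} = \lambda_2 \]
Inserting this into equation (b) gives
\[ \lambda_2 = \frac{mg \sin\theta}{1 + \frac{m a^2}{I}} \]
The moment of inertia about the axis of a solid sphere is $I = \frac{2}{5} m a^2$. Then
\[ \lambda_2 = \frac{2 mg \sin\theta}{7} \]
But also
\[ \ddot{\theta} = \dot{\theta}\,\frac{d\dot{\theta}}{d\theta} = \frac{a^2 \lambda_2}{I (R + a)} = \frac{5 \lambda_2}{2 m (R + a)} = \frac{5 g \sin\theta}{7 (R + a)} \]
Integrating gives
\[ \int \dot{\theta}\,d\dot{\theta} = \frac{5 g}{7 (R + a)} \int \sin\theta\,d\theta \]
That is
\[ \dot{\theta}^2 = \frac{10 g}{7 (R + a)} (1 - \cos\theta) \]
assuming that $\dot{\theta} = 0$ at $\theta = 0$. Inserting this into equation (a) gives
\[ -\frac{10}{7} mg [1 - \cos\theta] + mg \cos\theta = \lambda_1 \]
That is
\[ \lambda_1 = \frac{mg}{7} [17 \cos\theta - 10] \]
Note that this equals zero when
\[ \cos\theta = \frac{10}{17} \]
For larger angles $\lambda_1$ is negative, implying that the solid sphere will fly off the surface of the spherical shell. The sphere will leave the surface of the cylinder when $\cos\theta = \frac{10}{17}$, that is, $\theta = 53.97^\circ$. This is a significantly larger angle than obtained for the similar problem where the mass is sliding on a frictionless cylinder because the energy stored in rotation implies that the linear velocity of the mass is lower at a given angle for the case of a rolling sphere.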
The above discussion has omitted an important fact that, if $\mu < \infty$, the frictional force becomes insufficient to maintain the rolling constraint before $\theta = 53.97^\circ$, that is, the required frictional force will exceed the sliding limit $\mu N$. To determine when the rolling constraint fails it is necessary to determine the frictional torque
\[ F_f\,a = -\lambda_2 a \]
Thus
\[ F_f = -\lambda_2 \]
It is in the negative direction because of the direction chosen for $\phi$. The required coefficient of friction is given by the ratio of the frictional force to the normal force, that is
\[ \mu = \frac{\lambda_2}{\lambda_1} = \frac{2 \sin\theta}{[17 \cos\theta - 10]} \]
For $\mu = 1$ the sphere starts to slip when $\theta = 47.54^\circ$. Note that the sphere starts slipping before it flies off the cylinder since a normal force is required to support a frictional force, and the difference depends on the coefficient of friction. The no-slipping constraint is not satisfied once the sphere starts slipping and the frictional force should equal $\mu\lambda_1$. Thus for the angles beyond $47.54^\circ$ the problem needs to be solved with the rolling constraint changed to a sliding non-conservative frictional force. This is best handled by including the frictional force and normal forces as generalized forces. Fortunately this will be a small correction. The friction will slightly change the exact angle at which the normal force becomes zero and the system transitions to free motion of the sphere in a gravitational field.
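The two critical angles quoted above can be reproduced from the multipliers. The following is a minimal sketch, assuming $\lambda_1 \propto 17\cos\theta - 10$ and $\lambda_2 \propto 2\sin\theta$ as derived in this example, and using simple bisection to find where the required friction ratio reaches a chosen coefficient $\mu$. The function names are hypothetical.

```python
import math

def friction_ratio(theta):
    """Required coefficient of friction lambda_2 / lambda_1 at angle theta (radians)."""
    return 2.0 * math.sin(theta) / (17.0 * math.cos(theta) - 10.0)

def liftoff_angle():
    """Angle at which the normal force lambda_1 vanishes."""
    return math.acos(10.0 / 17.0)

def slip_angle(mu, tol=1e-12):
    """Angle at which the required friction ratio first reaches mu, found by bisection.

    The ratio rises monotonically from zero at the top toward infinity at lift-off,
    so a single crossing exists on the interval [0, liftoff_angle()).
    """
    low, high = 0.0, liftoff_angle()
    while high - low > tol:
        mid = 0.5 * (low + high)
        if friction_ratio(mid) < mu:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

if __name__ == "__main__":
    print("lift-off angle (deg):", math.degrees(liftoff_angle()))
    print("slip angle, mu = 1  :", math.degrees(slip_angle(1.0)))
```

Smaller coefficients of friction move the slip angle toward the top of the shell, while very large ones push it toward the lift-off angle.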
Similarly Λ = = gives
̈ =
These can be solved by substituting the relation = . The sphere flies off the spherical shell
when ≤ 0 leading to free motion discussed in example 62. The problem of a solid uniform sphere rolling
inside a hollow sphere can be solved the same way.
6.19 Example: Small body held by friction on the periphery of a rolling wheel
Assume that a small body of mass $m$ is balanced on a rolling wheel of mass $M$ and radius $R$ as shown in the figure. The wheel rolls in a vertical plane without slipping on a horizontal surface. This example illustrates that it is possible to use simultaneously a mixture of holonomic constraints, partially-holonomic constraints, and generalized forces.3
[Figure: Small body of mass $m$ held by friction on the periphery of a rolling wheel of mass $M$ and radius $R$.]
Assume that at $t = 0$ the wheel touches the floor at $x = y = 0$ with the mass perched at the top of the wheel at $\theta = 0$. Let the frictional force acting on the mass be $F$ and the reaction force of the periphery of the wheel on the mass be $N$. Let $\dot{\phi}$ be the angular velocity of the wheel, and $\dot{x}$ the horizontal velocity of the center of the wheel. The polar coordinates $r, \theta$ of the mass $m$ are taken with $r$ measured from the center of the wheel and with $\theta$ measured with respect to the vertical. Thus the cartesian coordinates of the small mass are $(x + r \sin\theta, R + r \cos\theta)$ with respect to the origin at $x = y = 0$.
The kinetic energy is given by
\[ T = \frac{1}{2} M \dot{x}^2 + \frac{1}{2} I \dot{\phi}^2 + \frac{1}{2} m \left[ \left( \dot{x} + r\dot{\theta} \cos\theta + \dot{r} \sin\theta \right)^2 + \left( \dot{r} \cos\theta - r\dot{\theta} \sin\theta \right)^2 \right] \]
The gravitational force can be absorbed into the scalar potential term of the Lagrangian and includes only the potential energy of the mass $m$ since the potential energy of the rolling wheel is constant.
\[ U = MgR + mg (R + r \cos\theta) \]
\[ L = \frac{1}{2} (M + m) \dot{x}^2 + \frac{1}{2} I \dot{\phi}^2 + \frac{1}{2} m \left[ r^2\dot{\theta}^2 + 2\dot{x} r \dot{\theta} \cos\theta + 2\dot{x}\dot{r} \sin\theta + \dot{r}^2 \right] - mg (R + r \cos\theta) \]
\[ g_1 = x - R\phi = 0, \qquad \dot{x} - R\dot{\phi} = 0 \]
2) The mass is touching the periphery of the wheel, that is, the normal force $N > 0$. This is a one-sided restricted holonomic constraint.
\[ g_2 = r - R = 0 \]
3) The mass does not slip on the wheel if the frictional force $F < \mu N$. When this restricted holonomic constraint is satisfied, then
\[ g_3 = \dot{\theta} - \dot{\phi} = 0 \]
The rolling constraint is holonomic, and can be accounted for using one Lagrange multiplier $\lambda_1$ plus the differential constraint equations
3 This problem is solved in detail in example 319 of "Classical Mechanics and Relativity" by Muller-Kirsten [06]
1
= 1
1
= 0
1
=
1
= 0
The other two constraints are non-holonomic, and thus these constraint forces are expressed in terms of two generalized forces $Q_r$ and $Q_\theta$ that are related to the tangential force $F$ and radial reaction force $N$. For simplicity, assume that the wheel is a thin-walled cylinder with a moment of inertia of
\[ I = M R^2 \]
= − cos + sin
= (− cos + sin ) (− cos ) − ( sin + cos ) sin = −
=
This last equation can be derived by Newtonian mechanics from consideration of the forces acting.
The above equations of motion can be used to calculate the motion for the following conditions.
a) Mass not slipping:
This occurs if = ≤ which also implies that 0 That is a situation where the system is
holonomic with = ̇ = ̇ ̇ = ̇ which can be solved using the generalized coordinate approach with
only one independent coordinate which can be taken to be .
b) Mass slipping:
Here the no-slip constraint is violated and thus one has to explicitly include the generalized forces
and assume that sliding friction is given by =
c) Reaction force is negative:
Here the mass is not subject to any constraints and it is in free fall.
The above example illustrates the flexibility provided by Lagrangian mechanics that allows simultane-
ous use of Lagrange multipliers, generalized forces, and scalar potential to handle combinations of several
holonomic and nonholonomic constraints for a complicated problem.
168 CHAPTER 6. LAGRANGIAN DYNAMICS
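$$\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}) \tag{6.61}$$
It is interesting to use Maxwell’s equations and Lagrangian mechanics to show that the Lorentz force can be represented by a conservative potential in Lagrangian mechanics.
Maxwell’s equations can be written as
$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0} \tag{6.62}$$
$$\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0$$
$$\nabla\cdot\mathbf{B} = 0$$
$$\nabla\times\mathbf{B} - \mu_0\epsilon_0\frac{\partial\mathbf{E}}{\partial t} = \mu_0\mathbf{J}$$
Since $\nabla\cdot\mathbf{B} = 0$ then it follows from the Appendix that $\mathbf{B}$ can be represented by the curl of a vector potential, $\mathbf{A}$, that is
$$\mathbf{B} = \nabla\times\mathbf{A} \tag{6.63}$$
Substituting this into $\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0$ gives that
$$\nabla\times\mathbf{E} + \frac{\partial(\nabla\times\mathbf{A})}{\partial t} = 0 \tag{6.64}$$
$$\nabla\times\left(\mathbf{E} + \frac{\partial\mathbf{A}}{\partial t}\right) = 0$$
Since this curl is zero it can be represented by the gradient of a scalar potential
$$\mathbf{E} + \frac{\partial\mathbf{A}}{\partial t} = -\nabla\Phi \tag{6.65}$$
The following shows that this relation corresponds to taking the gradient of a potential $U$ for the charge $q$ where the potential is given by the relation
$$U = q(\Phi - \mathbf{A}\cdot\mathbf{v}) \tag{6.66}$$
where $\Phi$ is the scalar electrostatic potential. This scalar potential can be employed in the Lagrange equations using the Lagrangian
$$L = \frac{1}{2}m\,\mathbf{v}\cdot\mathbf{v} - q(\Phi - \mathbf{A}\cdot\mathbf{v}) \tag{6.67}$$
The Lorentz force can be derived from this Lagrangian by considering the Lagrange equation for the cartesian coordinate $x$
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{x}} - \frac{\partial L}{\partial x} = 0 \tag{6.68}$$
Using the above Lagrangian (6.67) gives
$$m\ddot{x} + q\left[\frac{dA_x}{dt} + \frac{\partial\Phi}{\partial x} - \frac{\partial\mathbf{A}}{\partial x}\cdot\mathbf{v}\right] = 0 \tag{6.69}$$
But
$$\frac{dA_x}{dt} = \frac{\partial A_x}{\partial t} + \frac{\partial A_x}{\partial x}\dot{x} + \frac{\partial A_x}{\partial y}\dot{y} + \frac{\partial A_x}{\partial z}\dot{z} \tag{6.70}$$
and
$$\frac{\partial\mathbf{A}}{\partial x}\cdot\mathbf{v} = \frac{\partial A_x}{\partial x}\dot{x} + \frac{\partial A_y}{\partial x}\dot{y} + \frac{\partial A_z}{\partial x}\dot{z} \tag{6.71}$$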
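The step from equations 6.70 and 6.71 to the Lorentz force can be sketched as follows. This is a reconstruction that assumes only the definitions already given, $\mathbf{B} = \nabla\times\mathbf{A}$ and $\mathbf{E} = -\nabla\Phi - \partial\mathbf{A}/\partial t$; it is not necessarily the form of the intermediate equation in the original text.

```latex
\begin{align*}
\frac{dA_x}{dt} - \frac{\partial \mathbf{A}}{\partial x}\cdot\mathbf{v}
  &= \frac{\partial A_x}{\partial t}
   + \dot{y}\left(\frac{\partial A_x}{\partial y} - \frac{\partial A_y}{\partial x}\right)
   + \dot{z}\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right) \\
  &= \frac{\partial A_x}{\partial t} - \left[\mathbf{v}\times(\nabla\times\mathbf{A})\right]_x \\[4pt]
m\ddot{x}
  &= q\left[-\frac{\partial \Phi}{\partial x} - \frac{\partial A_x}{\partial t}\right]
   + q\left[\mathbf{v}\times\mathbf{B}\right]_x
   = q\left[E_x + (\mathbf{v}\times\mathbf{B})_x\right]
\end{align*}
```

The $y$ and $z$ components follow in the same way, which gives the vector form of the Lorentz force.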
6.11. TIME-DEPENDENT FORCES 169
$$\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}) \tag{6.73}$$
This has demonstrated that the electromagnetic scalar potential
$$U = q(\Phi - \mathbf{A}\cdot\mathbf{v}) \tag{6.74}$$
satisfies Maxwell’s equations, gives the Lorentz force, and can be absorbed into the Lagrangian. Note that the velocity-dependent Lorentz force is conservative since $\mathbf{E}$ is conservative, and because $(\mathbf{v}\times\mathbf{B})\cdot\mathbf{v} = 0$ the magnetic force does no work since it is perpendicular to the trajectory. The velocity-dependent conservative Lorentz force is an important and ubiquitous force that features prominently in many branches of science. It will be discussed further for the case of relativistic motion in example 17.6.
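$$y = A\cos\omega t$$
The kinetic energy is
$$T = \frac{1}{2}m\left[\left(l\dot{\theta}\cos\theta\right)^2 + \left(\dot{y} + l\dot{\theta}\sin\theta\right)^2\right] = \frac{1}{2}m\left[l^2\dot{\theta}^2 + 2l\dot{y}\dot{\theta}\sin\theta + \dot{y}^2\right]$$
and the potential energy is
$$U = mg\left[l(1-\cos\theta) + y\right]$$
Thus the Lagrangian is
$$L = \frac{1}{2}m\left[l^2\dot{\theta}^2 + 2l\dot{y}\dot{\theta}\sin\theta + \dot{y}^2\right] - mg\left[l(1-\cos\theta) + y\right]$$
The Euler-Lagrange equations lead to equations of motion for $\theta$ and $y$. Assume the small-angle approximation where $\theta\to 0$; then these two equations reduce to
$$\ddot{\theta} + \left(\frac{g}{l} + \frac{\ddot{y}}{l}\right)\theta = 0$$
$$m\ddot{y} + mg = F_y$$
Substituting $\ddot{y} = -A\omega^2\cos\omega t$ into these equations gives
$$\ddot{\theta} + \left(\frac{g}{l} - \frac{A\omega^2}{l}\cos\omega t\right)\theta = 0$$
$$m\left(g - A\omega^2\cos\omega t\right) = F_y$$
These correspond to stable harmonic oscillations about $\theta\approx 0$ if the bracket term is positive, and to unstable motion if the bracket is negative. Thus, for small amplitude oscillation about $\theta\approx 0$ the motion of the system can be unstable whenever the bracket is negative, that is, when the acceleration $A\omega^2\cos\omega t > g$, and resonance behavior can occur coupling the pendulum period and the forcing frequency $\omega$.
This discussion also applies to the inverted pendulum with a surprising result. It is well known that the pendulum is unstable near $\theta = \pi$. However, if the support is oscillating, then for $\theta\approx\pi$ the equations of motion become
$$\ddot{\theta} - \left(\frac{g}{l} - \frac{A\omega^2}{l}\cos\omega t\right)\theta = 0$$
$$m\left(g - A\omega^2\cos\omega t\right) = F_y$$
The inverted pendulum has stable oscillations about $\theta\approx\pi$ if the bracket is negative, that is, if $A\omega^2\cos\omega t > g$. This illustrates that nonautonomous dynamical systems can involve either stable or unstable motion.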
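The stabilizing effect of the oscillating support can be explored numerically. The following is a minimal sketch, not part of the original text, that integrates the linearized inverted-pendulum equation above with a fourth-order Runge-Kutta step, taking $\theta$ as the small deviation from the inverted position. The function name `max_deflection` and all parameter values (length 1 m, drive amplitude 0.1 m, drive frequency 100 rad/s) are illustrative assumptions.

```python
import math


def max_deflection(drive_amp, drive_omega, length=1.0, g=9.81,
                   theta0=0.01, t_end=3.0, dt=1.0e-4):
    """Integrate theta'' = (g/l - (A/l) w^2 cos(w t)) * theta; return max |theta|.

    theta is the small deviation from the inverted position, so the
    equation is only meaningful while |theta| stays small.
    """
    def accel(t, theta):
        bracket = (g - drive_amp * drive_omega**2 * math.cos(drive_omega * t)) / length
        return bracket * theta

    t, theta, rate = 0.0, theta0, 0.0
    largest = abs(theta)
    for _ in range(int(round(t_end / dt))):
        # Classical fourth-order Runge-Kutta step for the pair (theta, rate).
        k1t, k1r = rate, accel(t, theta)
        k2t, k2r = rate + 0.5 * dt * k1r, accel(t + 0.5 * dt, theta + 0.5 * dt * k1t)
        k3t, k3r = rate + 0.5 * dt * k2r, accel(t + 0.5 * dt, theta + 0.5 * dt * k2t)
        k4t, k4r = rate + dt * k3r, accel(t + dt, theta + dt * k3t)
        theta += dt * (k1t + 2.0 * k2t + 2.0 * k3t + k4t) / 6.0
        rate += dt * (k1r + 2.0 * k2r + 2.0 * k3r + k4r) / 6.0
        t += dt
        largest = max(largest, abs(theta))
    return largest


if __name__ == "__main__":
    # Fast, small-amplitude drive: the deviation stays small.
    print("driven:  ", max_deflection(drive_amp=0.1, drive_omega=100.0))
    # No drive: the deviation grows exponentially (the pendulum falls over).
    print("undriven:", max_deflection(drive_amp=0.0, drive_omega=100.0))
```

With these assumed values the driven case remains bounded while the undriven case diverges, which is consistent with the statement that a sufficiently strong oscillation of the support can stabilize the inverted position.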
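where the impulsive force is introduced using the generalized force $Q_j^{EXC}$. Knowing the initial conditions at time $t$, the conditions at the time $t+\tau$ are given by integration of equation 6.75 over the duration $\tau$ of the impulse which gives
$$\int_t^{t+\tau}\frac{d}{dt'}\left(\frac{\partial L}{\partial\dot{q}_j}\right)dt' - \int_t^{t+\tau}\frac{\partial L}{\partial q_j}\,dt' = \int_t^{t+\tau}Q_j^{EXC}\,dt' \tag{6.76}$$
This integration determines the conditions at time $t+\tau$ which then are used as the initial conditions for the motion when the impulsive force is zero.
The second approach is to realize that equation 6.76 can be rewritten in the form
$$\lim_{\tau\to 0}\int_t^{t+\tau}\frac{d}{dt'}\left(\frac{\partial L}{\partial\dot{q}_j}\right)dt' = \lim_{\tau\to 0}\left.\frac{\partial L}{\partial\dot{q}_j}\right|_t^{t+\tau} = \Delta p_j = \lim_{\tau\to 0}\int_t^{t+\tau}\left(\left(\frac{\partial L}{\partial q_j}\right) + Q_j^{EXC}\right)dt' \tag{6.77}$$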
Note that in the limit that $\tau\to 0$ the integral of the generalized momentum $p_j = \frac{\partial L}{\partial\dot{q}_j}$ simplifies to give the change in generalized momentum $\Delta p_j$. In addition, assuming that the non-impulsive forces $\left(\frac{\partial L}{\partial q_j}\right)$ are
6.12. IMPULSIVE FORCES 171
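finite and independent of the instantaneous impulsive force during the infinitesimal duration $\tau$, then the contribution of the non-impulsive forces $\int_t^{t+\tau}\left(\frac{\partial L}{\partial q_j}\right)dt'$ during the impulse can be neglected relative to the large impulsive force term $\lim_{\tau\to 0}\int_t^{t+\tau}Q_j^{EXC}\,dt'$. Thus it can be assumed that
$$\Delta p_j = \lim_{\tau\to 0}\int_t^{t+\tau}Q_j^{EXC}\,dt' = \tilde{Q}_j \tag{6.78}$$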
where $\tilde{Q}_j$ is the generalized impulse associated with coordinate $q_j$, $j = 1, 2, 3, \ldots, n$. This generalized impulse can be derived from the time integral of the impulsive forces $\mathbf{P}_i$ given by equation 2.135 using the time integral of equation 6.77, that is
$$\Delta p_j = \tilde{Q}_j = \lim_{\tau\to 0}\int_t^{t+\tau}Q_j^{EXC}\,dt' \equiv \lim_{\tau\to 0}\int_t^{t+\tau}\sum_i\mathbf{P}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}\,dt' = \sum_i\tilde{\mathbf{P}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j} \tag{6.79}$$
Note that the generalized impulse $\tilde{Q}_j$ can be a translational impulse $\tilde{\mathbf{P}}_j$ with corresponding translational variable $q_j$ or an angular impulsive torque $\tilde{\boldsymbol{\tau}}_j$ with corresponding angular variable $\phi_j$.
Impulsive force problems usually are solved in two stages. Either equation 6.76 or equation 6.79 is used to determine the conditions of the system immediately following the impulse. If $\tau\to 0$ then the impulse changes the generalized velocities $\dot{q}_j$ but not the generalized coordinates $q_j$. The subsequent motion then is determined using the Lagrangian equations of motion with the impulsive generalized force being zero, and assuming that the initial condition corresponds to the result of the impulse calculation.
$$T = \frac{1}{2}(m_1 + m_2)L_1^2\dot{\phi}_1^2 + m_2L_1L_2\dot{\phi}_1\dot{\phi}_2 + \frac{1}{2}m_2L_2^2\dot{\phi}_2^2$$
The total potential energy is
Use equation 6.79 to transform to the generalized coordinates $\phi_1$ and $\phi_2$ with the corresponding generalized impulsive torques
$$\tilde{Q}_1 = \tilde{P}L_1$$
$$\tilde{Q}_2 = \tilde{P}(D - L_1)$$
with $D$ denoting the distance of the point of impact from the support. Since the system starts at rest where $\phi_1 = \phi_2 = 0$, then using equation 6.77 gives the change in angular momentum immediately following the impulse to be
$$m_1L_1^2\dot{\phi}_1 + m_2L_1\left(L_1\dot{\phi}_1 + L_2\dot{\phi}_2\right) = \tilde{P}L_1$$
$$m_2L_2\left(L_1\dot{\phi}_1 + L_2\dot{\phi}_2\right) = \tilde{P}(D - L_1)$$
These two equations determine $\dot{\phi}_1$ and $\dot{\phi}_2$ immediately after the impulse; these can be used with $\phi_1 = \phi_2 = 0$ as initial conditions for solving the subsequent force-free motion when the generalized impulsive force is zero. As described in example 14.5 the subsequent motion of this series coupled pendulum will be a superposition of the two normal modes with amplitudes determined by the result of the impulse calculation.
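The two impulse equations above form a linear system for the angular velocities just after the blow. The following minimal sketch solves it by Cramer's rule. The function name `velocities_after_impulse`, the symbol `D` for the distance of the impact point from the support, and the numerical values are assumptions made only for illustration.

```python
def velocities_after_impulse(m1, m2, L1, L2, D, P):
    """Return the angular velocities (w1, w2) immediately after an impulse P.

    Solves, by Cramer's rule,
        (m1 + m2) L1^2 w1 + m2 L1 L2 w2 = P L1
        m2 L1 L2 w1       + m2 L2^2  w2 = P (D - L1)
    which are the two impulse equations written with the terms collected.
    """
    a11 = (m1 + m2) * L1**2
    a12 = m2 * L1 * L2
    a21 = m2 * L1 * L2
    a22 = m2 * L2**2
    b1 = P * L1
    b2 = P * (D - L1)
    det = a11 * a22 - a12 * a21  # equals m1*m2*L1^2*L2^2, nonzero for positive masses
    w1 = (b1 * a22 - a12 * b2) / det
    w2 = (a11 * b2 - a21 * b1) / det
    return w1, w2


if __name__ == "__main__":
    # Hypothetical values: the blow lands on the lower bob, so D = L1 + L2.
    w1, w2 = velocities_after_impulse(m1=1.0, m2=2.0, L1=1.0, L2=0.5, D=1.5, P=3.0)
    print(w1, w2)
```

When the blow lands exactly on the lower bob the upper pendulum is initially left at rest and the lower bob takes up the whole impulse, $m_2L_2\dot{\phi}_2 = \tilde{P}$, which is a useful sanity check on the signs.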
6.14 Summary
Newtonian plausibility argument for Lagrangian mechanics:
A justification for introducing the calculus of variations to classical mechanics becomes apparent when the concept of the Lagrangian $L \equiv T - U$ is used in the functional and time $t$ is the independent variable. It was shown that Newton’s equation of motion can be rewritten as
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = Q_j^{EXC} \tag{6.12}$$
where $Q_j^{EXC}$ are the excluded forces of constraint plus any other conservative or non-conservative forces not included in the potential $U$. This corresponds to the Euler-Lagrange equation for determining the extremum of the time integral of the Lagrangian. Equation 6.12 can be written as
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC} \tag{6.15}$$
where the Lagrange multiplier term accounts for holonomic constraint forces, and $Q_j^{EXC}$ includes all additional forces not accounted for by the scalar potential $U$ or the Lagrange multiplier terms. The constraint forces can be included explicitly as generalized forces in the excluded term $Q_j^{EXC}$ of equation 6.15.
d’Alembert’s Principle
It was shown that d’Alembert’s Principle
$$\sum_i^N\left(\mathbf{F}_i^A - \dot{\mathbf{p}}_i\right)\cdot\delta\mathbf{r}_i = 0 \tag{6.25}$$
cleverly transforms the principle of virtual work from the realm of statics to dynamics. Application of virtual work to statics primarily leads to algebraic equations between the forces, whereas d’Alembert’s principle applied to dynamics leads to differential equations of motion.
Lagrange equations from d’Alembert’s Principle
After transforming to generalized coordinates, d’Alembert’s Principle leads to
$$\sum_j^n\left[\left\{\frac{d}{dt}\left(\frac{\partial T}{\partial\dot{q}_j}\right) - \frac{\partial T}{\partial q_j}\right\} - Q_j\right]\delta q_j = 0 \tag{6.38}$$
If all the $n$ generalized coordinates are independent, then equation 6.38 implies that the term in the square brackets is zero for each individual value of $j$. That is, this implies the basic Euler-Lagrange equations of motion.
The handling of both conservative and non-conservative generalized forces $Q_j$ is best achieved by assuming that the generalized force $Q_j = \sum_i^n\mathbf{F}_i^A\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}$ can be partitioned into a conservative velocity-independent term, that can be expressed in terms of the gradient of a scalar potential, $-\nabla U$, plus an excluded generalized force $Q_j^{EXC}$ which contains the non-conservative, velocity-dependent, and all the constraint forces not explicitly included in the potential $U$. That is,
$$Q_j = -\nabla U + Q_j^{EXC} \tag{6.41}$$
Inserting (6.41) into (6.38), and assuming that the potential $U$ is velocity independent, allows (6.38) to be rewritten as
$$\sum_j^n\left[\left\{\frac{d}{dt}\left(\frac{\partial(T-U)}{\partial\dot{q}_j}\right) - \frac{\partial(T-U)}{\partial q_j}\right\} - Q_j^{EXC}\right]\delta q_j = 0 \tag{6.42}$$
Note that equation (6.44) contains the basic Euler-Lagrange equation (6.38) for the special case when $U = 0$. In addition, note that if all the generalized coordinates are independent, then the square bracket terms are zero for each value of $j$, which leads to the general Euler-Lagrange equations of motion
$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = Q_j^{EXC} \tag{6.45}$$
where $n \geq j \geq 1$.
Newtonian mechanics has trouble handling constraint forces because they lead to coupling of the degrees
of freedom. Lagrangian mechanics is more powerful since it provides the following three ways to handle such
correlated motion.
1) Minimal set of generalized coordinates
If the $n$ coordinates $q_j$ are independent, then the square bracket equals zero for each value of $j$ in equation 6.44, which corresponds to Euler’s equation for each of the independent coordinates. If the $n$ generalized coordinates are coupled by $m$ constraints, then the coordinates can be transformed to a minimal set of $s = n - m$ independent coordinates which then can be solved by applying equation 6.45 to the minimal set of independent coordinates.
2) Lagrange multipliers approach
The Lagrangian method concentrates solely on active forces, completely ignoring all other internal forces. In Lagrangian mechanics the generalized forces, corresponding to each generalized coordinate, can be partitioned three ways
$$Q_j = -\nabla U + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}$$
where the velocity-independent conservative forces can be absorbed into a scalar potential $U$, the holonomic constraint forces can be handled using the Lagrange multiplier term $\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)$, and the remaining part of the active forces can be absorbed into the generalized force $Q_j^{EXC}$. The scalar potential energy $U$ is handled by absorbing it into the standard Lagrangian $L = T - U$. If the constraint forces are holonomic then these forces are easily and elegantly handled by use of Lagrange multipliers. All remaining forces, including dissipative forces, can be handled by including them explicitly in the generalized force $Q_j^{EXC}$.
Combining the above two equations gives
$$\sum_j^n\left[\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} - Q_j^{EXC} - \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right]\delta q_j = 0 \tag{6.56}$$
Use of the Lagrange multipliers to handle the $m$ constraint forces ensures that all $n$ infinitesimals $\delta q_j$ are independent, implying that the expression in the square bracket must be zero for each of the $n$ values of $j$. This leads to $n$ Lagrange equations plus $m$ constraint relations
$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) \tag{6.60}$$
where $j = 1, 2, 3, \ldots, n$.
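As a minimal worked illustration of the Lagrange multiplier form of the equations, consider a plane pendulum treated with two coordinates $(r,\theta)$ and one holonomic constraint. This example is an assumption introduced here for illustration: $\theta$ is measured from the downward vertical, $l$ is the string length, $g$ is the gravitational acceleration, and the constraint function is written $g_1$ to avoid confusion with $g$.

```latex
\begin{align*}
L &= \tfrac{1}{2}m\left(\dot{r}^2 + r^2\dot{\theta}^2\right) + mgr\cos\theta,
\qquad g_1(r) = r - l = 0 \\[4pt]
r:&\quad m\ddot{r} - mr\dot{\theta}^2 - mg\cos\theta = \lambda\frac{\partial g_1}{\partial r} = \lambda \\
\theta:&\quad \frac{d}{dt}\left(mr^2\dot{\theta}\right) + mgr\sin\theta = \lambda\frac{\partial g_1}{\partial \theta} = 0 \\[4pt]
r = l:&\quad \ddot{\theta} + \frac{g}{l}\sin\theta = 0,
\qquad \lambda = -\left(ml\dot{\theta}^2 + mg\cos\theta\right)
\end{align*}
```

The multiplier $\lambda$ is the radial constraint force; its magnitude is the string tension, which the minimal-coordinate approach would not have provided.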
3) Generalized forces approach
The two right-hand terms in (6.60) can be understood to be those forces acting on the system that are not absorbed into the scalar potential $U$ component of the Lagrangian $L$. The Lagrange multiplier terms $\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)$ account for the holonomic forces of constraint that are not included in the conservative potential or in the generalized forces $Q_j^{EXC}$. The generalized force
$$Q_j^{EXC} = \sum_i^n\mathbf{F}_i^{EXC}\cdot\frac{\partial\mathbf{r}_i}{\partial q_j} \tag{6.17}$$
is the sum of the components in the $q_j$ direction for all external forces that have not been taken into account by the scalar potential or the Lagrange multipliers. Thus the non-conservative generalized force $Q_j^{EXC}$ contains non-holonomic constraint forces, including dissipative forces such as drag or friction, that are not included in $U$ or used in the Lagrange multiplier terms to account for the holonomic constraint forces.
6.14. SUMMARY 175
$$U = q\Phi - q\mathbf{v}\cdot\mathbf{A} \tag{6.74}$$
leads to the Lorentz force where $\Phi$ is the scalar electric potential and $\mathbf{A}$ the vector potential.
Time-dependent forces:
It was shown that time-dependent forces can lead to complicated motion having both stable regions and
unstable regions of motion that can exhibit chaos.
Impulsive forces:
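A generalized impulse $\tilde{Q}_j$ can be derived for an instantaneous impulsive force from the time integral of the impulsive forces $\mathbf{P}_i$ given by equation 2.135 using the time integral of equation 6.78, that is
$$\Delta p_j = \tilde{Q}_j = \lim_{\tau\to 0}\int_t^{t+\tau}Q_j^{EXC}\,dt' \equiv \lim_{\tau\to 0}\int_t^{t+\tau}\sum_i\mathbf{P}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}\,dt' = \sum_i\tilde{\mathbf{P}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j} \tag{6.79}$$
Note that the generalized impulse $\tilde{Q}_j$ can be a translational impulse $\tilde{\mathbf{P}}_j$ with corresponding translational variable $q_j$ or an angular impulsive torque $\tilde{\boldsymbol{\tau}}_j$ with corresponding angular variable $\phi_j$.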
Comparison of Newtonian and Lagrangian mechanics:
In contrast to Newtonian mechanics, which is based on knowing all the vector forces acting on a system,
Lagrangian mechanics can derive the equations of motion using generalized coordinates without requiring
knowledge of the constraint forces acting on the system. Lagrangian mechanics provides a remarkably
powerful, and incredibly consistent approach to solving for the equations of motion in classical mechanics,
and is especially powerful for handling systems that are subject to holonomic constraints.
Workshop exercises
1. A disk of mass and radius rolls without slipping down a plane inclined from the horizontal by an angle
. The disk has a short weightless axle of negligible radius. From this axis is suspended a simple pendulum of
length and whose bob has a mass . Assume that the motion of the pendulum takes place in the plane
of the disk.
3. Consider a particle of mass moving in a plane and subject to an inverse square attractive force.
4. Consider a Lagrangian function of the form $L(q_i, \dot{q}_i, \ddot{q}_i, t)$. Here the Lagrangian contains a time derivative of the generalized coordinates that is higher than the first. When working with such Lagrangians, the term “generalized mechanics” is used.
(a) Consider a system with one degree of freedom. By applying the methods of the calculus of variations, and assuming that Hamilton’s principle holds with respect to variations which keep both $q$ and $\dot{q}$ fixed at the end points, show that the corresponding Lagrange equation is
$$\frac{d^2}{dt^2}\left(\frac{\partial L}{\partial\ddot{q}}\right) - \frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}}\right) + \frac{\partial L}{\partial q} = 0$$
Such equations of motion have interesting applications in chaos theory.
(b) Apply this result to the Lagrangian
$$L = -\frac{m}{2}q\ddot{q} - \frac{k}{2}q^2$$
Do you recognize the equations of motion?
5. A bead of mass slides under gravity along a smooth wire bent in the shape of a parabola 2 = in the
vertical ( ) plane.
(c) Set up Lagrange’s equations of motion for both and with the constraint adjoined and a Lagrangian
multiplier introduced.
(d) Show that the same equation of motion for results from either of the methods used in part (b) or part
(c).
(e) Express in terms of and ̇.
(f) What are the and components of the force of constraint in terms of and ̇?
7. Consider the double pendulum comprising masses $m_1$ and $m_2$ connected by inextensible strings as shown in the figure. Assume that the motion of the pendulum takes place in a vertical plane.
(a) Are there any equations of constraint? If so, what are they?
(b) Find Lagrange’s equations for this system.
[Figure: double pendulum suspended from point O, with string lengths L1 and L2, masses m1 and m2, and weights m1g and m2g.]
8 Consider the system shown in the figure which consists of a mass suspended via a constrained massless link
of length where the point is acted upon by a spring of spring constant . The spring is unstretched when
the massless link is horizontal. Assume that the holonomic constraints at and are frictionless.
a Derive the equations of motion for the system using the method of Lagrange multipliers.
[Figure: mass suspended by a massless link of length L, with spring force kx; labels x0, x, y and mg.]
Problems
1. A sphere of radius is constrained to roll without slipping on the lower half of the inner surface of a hollow
cylinder of radius Determine the Lagrangian function, the equation of constraint, and the Lagrange equations
of motion. Find the frequency of small oscillations.
2. A particle moves in a plane under the influence of a force $F = -Ar^{\alpha-1}$ directed toward the origin; $A$ and $\alpha$ ($\alpha > 0$) are constants. Choose generalized coordinates with the potential energy zero at the origin.
a) Find the Lagrangian equations of motion.
b) Is the angular momentum about the origin conserved?
c) Is the total energy conserved?
3. Two blocks, each of mass are connected by an extensionless, uniform string of length . One block is placed
on a frictionless horizontal surface, and the other block hangs over the side, the string passing over a frictionless
pulley. Describe the motion of the system:
a) when the mass of the string is negligible
b) when the string has mass .
4. Two masses $m_1$ and $m_2$ ($m_1 \neq m_2$) are connected by a rigid rod of length $d$ and of negligible mass. An extensionless string of length $l_1$ is attached to $m_1$ and connected to a fixed point of the support $P$. Similarly a string of length $l_2$ ($l_1 \neq l_2$) connects $m_2$ and $P$. Obtain the equation of motion describing the motion in the plane of $m_1$, $m_2$ and $P$, and find the frequency of small oscillation around the equilibrium position.
5. A thin uniform rigid rod of length 2 and mass is suspended by a massless string of length . Initially the
system is hanging vertically downwards in the gravitational field . Use as generalized coordinates the angles
given in the diagram.
a) Derive the Lagrangian for the system.
b) Use the Lagrangian to derive the equations of motion.
c) A horizontal impulsive force $F_x$ in the $x$ direction strikes the bottom end of the rod for an infinitesimal time $\tau$. Derive the initial conditions for the system immediately after the impulse has occurred.
d) Draw a diagram showing the geometry of the pendulum shortly after the impulse when the displacement
angles are significant.
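[Figure: rod of length 2L suspended from point O by a string, with x and y axes, weight Mg, and horizontal force Fx applied at the bottom end of the rod.]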
Chapter 7
7.1 Introduction
The chapter 6 discussion of Lagrangian dynamics illustrates the power of Lagrangian mechanics for deriving the equations of motion. In contrast to Newtonian mechanics, which is expressed in terms of force vectors acting on a system, the Lagrangian method, based on d’Alembert’s Principle or Hamilton’s Principle, is expressed in terms of the scalar kinetic and potential energies of the system. The Lagrangian approach is a sophisticated alternative to Newton’s laws of motion that provides a simpler derivation of the equations of motion and allows constraint forces to be ignored. In addition, the use of Lagrange multipliers or generalized forces allows the Lagrangian approach to determine the constraint forces when these forces are of interest. The equations of motion, derived either from Newton’s Laws or Lagrangian dynamics, can be non-trivial to solve mathematically. It is necessary to integrate second-order differential equations, which for $n$ degrees of freedom imply $2n$ constants of integration.
Chapter 7 will explore the remarkable connection between symmetry and invariance of a system under
transformation, and the related conservation laws that imply the existence of constants of motion. Even
when the equations of motion cannot be solved easily, it is possible to derive important physical principles
regarding the first-order integrals of motion of the system directly from the Lagrange equation, as well as for
elucidating the underlying symmetries plus invariance. This property is contained in Noether’s theorem
which states that conservation laws are associated with differentiable symmetries of a physical system.
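$$\frac{\partial L}{\partial\dot{x}_i} = \frac{\partial T}{\partial\dot{x}_i} - \frac{\partial U}{\partial\dot{x}_i} = \frac{\partial T}{\partial\dot{x}_i} \tag{7.1}$$
$$= \frac{\partial}{\partial\dot{x}_i}\sum_{k=1}^{N}\frac{1}{2}m_k\left(\dot{x}_k^2 + \dot{y}_k^2 + \dot{z}_k^2\right) = m_i\dot{x}_i = p_{i,x}$$
Thus
$$p_{i,x} = \frac{\partial L}{\partial\dot{x}_i} \tag{7.2}$$
which is the $x$ component of the linear momentum for the $i^{th}$ particle.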
180 CHAPTER 7. SYMMETRIES, INVARIANCE AND THE HAMILTONIAN
This result suggests an obvious extension of the concept of momentum to generalized coordinates. The generalized momentum associated with the coordinate $q_j$ is defined to be
$$p_j \equiv \frac{\partial L}{\partial\dot{q}_j} \tag{7.3}$$
Note that $p_j$ also is called the conjugate momentum or canonical momentum to $q_j$, where $q_j, p_j$ are conjugate, or canonical, variables. Remember that the linear momentum $\mathbf{p}$ is the first-order time integral given by equation 2.10. If $q_j$ is not a spatial coordinate, then $p_j$ is the generalized momentum, not the kinematic linear momentum. For example, if $q_j$ is an angle, then $p_j$ will be angular momentum. That is, the generalized momentum may differ from the usual linear or angular momentum since the definition (7.3) is more general than the usual $p = m\dot{x}$ definition of linear momentum in classical mechanics. This is illustrated by the case of a moving charged particle in an electromagnetic field. Chapter 6 showed that electromagnetic forces on a charge $q$ can be described in terms of a scalar potential $U$ where
$$U = q(\Phi - \mathbf{A}\cdot\mathbf{v}) \tag{7.4}$$
The generalized momentum conjugate to the coordinate $x$ for charge $q$ and mass $m$ is given by the above Lagrangian
$$p_x = \frac{\partial L}{\partial\dot{x}} = m\dot{x} + qA_x \tag{7.6}$$
Note that this includes both the mechanical linear momentum plus the correct electromagnetic momentum. The fact that the electromagnetic field carries momentum should not be a surprise since electromagnetic waves also carry energy, as is illustrated by the transmission of radiant energy from the sun.
Φ
N() = −
The initial angular momentum in the electromagnetic field can be derived using equation 7.6 plus Stokes’ theorem (Appendix 3). Equation 2.142 gives that the final angular momentum equals the angular impulse
Z I I I Z
L
= ̇ = = = B · dS =Φ
I Z
where Φ = = B · dS is the initial total magnetic flux through the solenoid. Thus the total initial
angular momentum is given by
L
= 0 + L
= Φ
Since the final electromagnetic field is zero the final total angular momentum is given by
L
= L
+ 0 = Φ
Note that the total angular momentum is conserved. That is, initially all the angular momentum is stored in
the electromagnetic field, whereas the final angular momentum is all mechanical. This explains the paradox
that the mechanical angular momentum is not conserved, only the total angular momentum of the system is
conserved, that is, the sum of the mechanical and electromagnetic angular momenta.
The new set of generalized coordinates satisfies Lagrange’s equations of motion with the new Lagrangian
The Lagrangian is a scalar, with units of energy, which does not change if the coordinate representation is changed. Thus $L(q', \dot{q}', t)$ can be derived from $L(q, \dot{q}, t)$ by substituting the inverse relation $q_i = q_i(q'_1, q'_2, \ldots, q'_n; t)$ into $L(q, \dot{q}, t)$. That is, the value of the Lagrangian is independent of which coordinate representation is used. Although the general form of Lagrange’s equations of motion is preserved in any point transformation, the explicit equations of motion for the new variables usually look different from those with the old variables. A typical example is the transformation from cartesian to spherical coordinates. For a given system, there can be particular transformations for which the explicit equations of motion are the same for both the old and new variables. Transformations for which the equations of motion are invariant are called invariant transformations. It will be shown that if the Lagrangian does not explicitly contain a particular coordinate of displacement $q_i$, then the corresponding conjugate momentum $p_i$ is conserved. This relation is called Noether’s theorem which states “For each symmetry of the Lagrangian, there is a conserved quantity”.
Noether’s Theorem will be used to consider invariant transformations for two dependent variables, $x(t)$ and $\theta(t)$, plus their conjugate momenta $p_x$ and $p_\theta$. For a closed system, these provide up to six possible conservation laws for the three axes. Then we will discuss the independent variable $t$ and its relation to the Generalized Energy Theorem, which provides another possible conservation law. For simplicity, these discussions will assume that the systems are holonomic and conservative.
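The Lagrange equations using generalized coordinates for holonomic systems were given by equation 6.60 to be
$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC} \tag{7.9}$$
or equivalently as
$$\dot{p}_j = \frac{\partial L}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] \tag{7.11}$$
Note that if the Lagrangian does not contain $q_j$ explicitly, that is, the Lagrangian is invariant to a linear translation, or equivalently, is spatially homogeneous, and if the Lagrange multiplier constraint force and generalized force terms are zero, then
$$\frac{\partial L}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] = 0 \tag{7.12}$$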
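As a minimal illustration of how a coordinate that is absent from the Lagrangian yields a conserved conjugate momentum, assume a particle of mass $m$ moving in a plane under a central potential $U(r)$, with no constraint forces or excluded forces. This system is chosen here only for illustration; the cyclic coordinate in this case is the angle $\theta$ rather than a linear displacement, but the argument is the same.

```latex
\begin{align*}
L &= \tfrac{1}{2}m\left(\dot{r}^2 + r^2\dot{\theta}^2\right) - U(r) \\
\frac{\partial L}{\partial \theta} &= 0
\quad\Longrightarrow\quad
\dot{p}_\theta = \frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}} = 0 \\
p_\theta &= \frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta} = \text{constant}
\end{align*}
```

Here the conserved generalized momentum $p_\theta$ is the angular momentum about the origin.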
$$\delta\mathbf{r} = \delta\boldsymbol{\theta}\times\mathbf{r}$$
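Similarly, the products of the generalized velocities $\dot{q}_j$ with the corresponding derivatives of $T_1$ and $T_0$ give
$$\sum_j\dot{q}_j\frac{\partial T_2}{\partial\dot{q}_j} = 2T_2 \tag{7.27}$$
$$\sum_j\dot{q}_j\frac{\partial T_1(\mathbf{q},\dot{\mathbf{q}},t)}{\partial\dot{q}_j} = T_1(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.28}$$
$$\sum_j\dot{q}_j\frac{\partial T_0(\mathbf{q},t)}{\partial\dot{q}_j} = 0 \tag{7.29}$$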
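Equation 7.25 gives that $T = T_2$ when the transformed system is scleronomic, that is, when the transformation to generalized coordinates has no explicit time dependence, and then the kinetic energy is a quadratic function of the generalized velocities $\dot{q}_j$. Using the definition of the generalized momentum, equation 7.3, assuming $T = T_2$, and that the potential $U$ is velocity independent, gives that
$$p_j \equiv \frac{\partial L}{\partial\dot{q}_j} = \frac{\partial T}{\partial\dot{q}_j} - \frac{\partial U}{\partial\dot{q}_j} = \frac{\partial T_2}{\partial\dot{q}_j} \tag{7.30}$$
Then equation 7.27 reduces to the useful relation that
$$T_2 = \frac{1}{2}\sum_j\dot{q}_jp_j = \frac{1}{2}\dot{\mathbf{q}}\cdot\mathbf{p} \tag{7.31}$$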
The Lagrange equations for a conservative force are given by equation 6.60 to be
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \frac{\partial L}{\partial q_j} = Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) \tag{7.33}$$
The holonomic constraints can be accounted for using the Lagrange multiplier terms while the generalized force $Q_j^{EXC}$ includes non-holonomic forces or other forces not included in the potential energy term of the Lagrangian, or holonomic forces not accounted for by the Lagrange multiplier terms.
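Substituting equation 7.33 into equation 7.32 gives
$$\frac{dL}{dt} = \sum_j\dot{q}_j\frac{d}{dt}\frac{\partial L}{\partial\dot{q}_j} - \sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] + \sum_j\frac{\partial L}{\partial\dot{q}_j}\ddot{q}_j + \frac{\partial L}{\partial t}$$
$$= \sum_j\frac{d}{dt}\left(\dot{q}_j\frac{\partial L}{\partial\dot{q}_j}\right) - \sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] + \frac{\partial L}{\partial t} \tag{7.34}$$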
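Equation 7.34 can be rearranged by collecting the total time derivatives on one side. The following sketch shows that single step; it uses only the symbols of equation 7.34 and is offered as a reconstruction of the rearrangement rather than as the original text's wording.

```latex
\begin{equation*}
\frac{d}{dt}\left[\sum_j \left(\dot{q}_j \frac{\partial L}{\partial \dot{q}_j}\right) - L\right]
= \sum_j \dot{q}_j \left[ Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t) \right]
- \frac{\partial L}{\partial t}
\end{equation*}
```

The quantity in the square bracket on the left is the generalized energy, so its rate of change is governed by the power delivered by the excluded and constraint forces together with any explicit time dependence of the Lagrangian.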
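Jacobi’s generalized momentum, equation 7.3, can be used to express the generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ in terms of the canonical coordinates $\dot{q}_j$ and $p_j$, plus time $t$. Define the Hamiltonian function to equal the generalized energy expressed in terms of the conjugate variables $(q_j, p_j)$, that is,
$$H(\mathbf{q},\mathbf{p},t) \equiv h(\mathbf{q},\dot{\mathbf{q}},t) \equiv \sum_j\left(\dot{q}_j\frac{\partial L}{\partial\dot{q}_j}\right) - L(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j\left(\dot{q}_jp_j\right) - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.37}$$
This Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ underlies Hamiltonian mechanics which plays a profoundly important role in most branches of physics as illustrated in chapters 8, 15 and 18.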
1 Most textbooks call the function $h(\mathbf{q},\dot{\mathbf{q}},t)$ Jacobi’s energy integral. This book adopts the more descriptive name Generalized
$$\frac{dH}{dt} = -\frac{\partial L}{\partial t} \tag{7.39}$$
Thus the Hamiltonian is time independent if both $\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] = 0$ and the Lagrangian is time-independent. For an isolated closed system having no external forces acting, the Lagrangian is time independent because the velocities are constant, and there is no external potential energy. That is, the Lagrangian is time-independent, and
$$\frac{d}{dt}\left[\sum_j\left(\dot{q}_j\frac{\partial L}{\partial\dot{q}_j}\right) - L\right] = \frac{dH}{dt} = -\frac{\partial L}{\partial t} = 0 \tag{7.40}$$
As a consequence, the Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ and generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ both are constants of motion if the Lagrangian is not an explicit function of time, and if the external non-potential forces are zero. This is an example of Noether’s theorem, where the symmetry of time independence leads to conservation of the conjugate variable, which is the Hamiltonian or Generalized energy.
If the potential energy $U$ does not depend explicitly on velocities $\dot{q}_j$ or time, then
$$p_j = \frac{\partial L}{\partial\dot{q}_j} = \frac{\partial(T-U)}{\partial\dot{q}_j} = \frac{\partial T}{\partial\dot{q}_j} \tag{7.42}$$
Using equations 7.27, 7.28 and 7.29 gives that the total generalized Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ equals
But the sum of the kinetic and potential energies equals the total energy. Thus equation 7.44 can be rewritten in the form
$$H(\mathbf{q},\mathbf{p},t) = (T + U) - (T_1 + 2T_0) = E - (T_1 + 2T_0) \tag{7.45}$$
Note that Jacobi’s generalized energy and the Hamiltonian do not equal the total energy $E$. However, in the special case where the transformation is scleronomic, then $T_1 = T_0 = 0$, and if the potential energy $U$ does not depend explicitly on $\dot{q}_j$, then the generalized energy (Hamiltonian) equals the total energy, that is, $H = E$. Recognition of the relation between the Hamiltonian and the total energy facilitates determining the equations of motion.
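The step leading to equation 7.45 can be sketched explicitly. This reconstruction assumes the decomposition $T = T_2 + T_1 + T_0$ used in equations 7.27 to 7.29 and a potential $U$ that is velocity independent.

```latex
\begin{align*}
H &= \sum_j \dot{q}_j \frac{\partial T}{\partial \dot{q}_j} - (T - U) \\
  &= \left(2T_2 + T_1\right) - \left(T_2 + T_1 + T_0\right) + U \\
  &= T_2 - T_0 + U \\
  &= \left(T + U\right) - \left(T_1 + 2T_0\right)
\end{align*}
```

The last line follows by adding and subtracting $T_1 + 2T_0$, which makes clear that $H = E$ only when $T_1 = T_0 = 0$.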
=− (7.47)
Also, when $\sum_j\dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] = 0$ and if the Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion. That is, $H$ is conserved if, and only if, the Lagrangian, and consequently the Hamiltonian, are not explicit functions of time, and if the external forces are zero.
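Hamiltonian | Scleronomic transformation, velocity-independent potential | Otherwise
$\frac{dH}{dt} = -\frac{\partial L}{\partial t} = 0$ | $H$ conserved, $H = E$ | $H$ conserved, $H \neq E$
$\frac{dH}{dt} = -\frac{\partial L}{\partial t} \neq 0$ | $H$ not conserved, $H = E$ | $H$ not conserved, $H \neq E$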
Note the following general facts regarding the Lagrangian and the Hamiltonian.
(1) the Lagrangian is indefinite with respect to addition of a constant to the scalar potential,
(2) the Lagrangian is indefinite with respect to addition of a constant velocity,
(3) there is no unique choice of generalized coordinates.
(4) the Hamiltonian is a scalar function that is derived from the Lagrangian scalar function.
(5) the generalized momentum is derived from the Lagrangian.
These facts, plus the ability to recognize the conditions under which $H$ is conserved, and when $H = E$, can greatly facilitate solving problems as shown by the following two examples.
7.10. HAMILTONIAN INVARIANCE 189
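The Hamiltonian in the fixed frame is conserved and equals the total energy, that is $H = E = T + U$.
Rotating frame of reference
The above inertial fixed-frame Lagrangian can be written in terms of the primed (non-inertial rotating frame) coordinates as
$$L = T - U = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2\right) - U(r) = \frac{m}{2}\left(\dot{r}'^2 + r'^2\left(\dot{\theta}' + \omega\right)^2\right) - U(r')$$
where $\omega$ is the angular velocity of the rotating frame. The generalized momenta derived from this Lagrangian are
$$p'_\theta = \frac{\partial L}{\partial\dot{\theta}'} = mr'^2\left(\dot{\theta}' + \omega\right)$$
$$p'_r = \frac{\partial L}{\partial\dot{r}'} = m\dot{r}'$$
The Hamiltonian expressed in terms of the non-inertial rotating frame coordinates is
$$H'(r', \theta', p'_r, p'_\theta) = p'_r\dot{r}' + p'_\theta\dot{\theta}' - L = \frac{p'^2_r}{2m} + \frac{p'^2_\theta}{2mr'^2} - \omega p'_\theta + U(r')$$
Note that $H'(r', \theta', p'_r, p'_\theta)$ is time independent and therefore is conserved, but $H'(r', \theta', p'_r, p'_\theta) \neq E$ because the generalized coordinates are time dependent. In addition, $p'_\theta$ is conserved since
$$\dot{p}'_\theta = \frac{\partial L}{\partial\theta'} = -\frac{\partial H'}{\partial\theta'} = 0$$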
Note that the Lagrangian and Hamiltonian are not explicit functions of time, therefore they are conserved.
Also the potential is velocity independent and there is no coordinate transformation, thus the Hamiltonian
equals the total energy which is a constant of motion.
2
= − cos =
22
$$g(\theta,\phi) = R\theta - \rho(\phi + \theta) = 0$$
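Using the Lagrangian, plus the one equation of constraint, requires one Lagrange multiplier. Then the Lagrange equations of motion for $\theta$ and $\phi$ are
$$\left[\frac{\partial L}{\partial\theta} - \frac{d}{dt}\frac{\partial L}{\partial\dot{\theta}}\right] + \lambda\frac{\partial g}{\partial\theta} = 0$$
$$\left[\frac{\partial L}{\partial\phi} - \frac{d}{dt}\frac{\partial L}{\partial\dot{\phi}}\right] + \lambda\frac{\partial g}{\partial\phi} = 0$$
Substituting the Lagrangian and the equation of constraint gives two equations of motion
$$-mg(R-\rho)\sin\theta - m(R-\rho)^2\ddot{\theta} + \lambda(R-\rho) = 0$$
$$-\frac{1}{2}m\rho^2\ddot{\phi} - \lambda\rho = 0$$
The lower equation of motion gives that
$$\lambda = -\frac{1}{2}m\rho\ddot{\phi}$$
Substituting the equation of constraint, $\rho\phi = (R-\rho)\theta$, into this gives
$$\lambda = -\frac{1}{2}m(R-\rho)\ddot{\theta}$$
Substituting this into the first equation of motion gives $-mg(R-\rho)\sin\theta - \frac{3}{2}m(R-\rho)^2\ddot{\theta} = 0$, so that the equation of motion for $\theta$ is
$$\ddot{\theta} = -\frac{2g}{3(R-\rho)}\sin\theta$$
that is
$$\lambda = \frac{mg}{3}\sin\theta$$
The torque acting on the small cylinder due to the frictional force $F$ is
$$F\rho = \frac{1}{2}m\rho^2\ddot{\phi} = -\lambda\rho$$
Thus the frictional force is
$$F = -\lambda = -\frac{mg}{3}\sin\theta$$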
Noether’s theorem can be used to ascertain if the angular momentum $p_\theta$ is a constant of motion. The derivative of the Lagrangian
$$\frac{\partial L}{\partial\theta} = -mg(R-\rho)\sin\theta$$
is nonzero, and thus the Lagrange equation for $\theta$ tells us that $\dot{p}_\theta \neq 0$. Therefore $p_\theta$ is not a constant of motion.
The Lagrangian is not an explicit function of $\phi$, which would suggest that $p_\phi$ is a constant of motion. But this is incorrect because the constraint equation $\rho\phi = (R-\rho)\theta$ couples $\theta$ and $\phi$, that is, they are not independent variables, and thus $p_\theta$ and $p_\phi$ are coupled by the constraint equation. As a result $p_\phi$ is not a constant of motion because it is directly coupled to $p_\theta$, which is not a constant of motion. Thus neither $p_\theta$ nor $p_\phi$ are constants of motion. This illustrates that one must account carefully for equations of constraint, and the concomitant constraint forces, when applying Noether’s theorem which tacitly assumes independent variables.
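The Hamiltonian can be derived using the generalized momenta
$$p_\theta = \frac{\partial L}{\partial\dot{\theta}} = m(R-\rho)^2\dot{\theta}$$
$$p_\phi = \frac{\partial L}{\partial\dot{\phi}} = \frac{1}{2}m\rho^2\dot{\phi}$$
Then the Hamiltonian is given by
$$H = p_\theta\dot{\theta} + p_\phi\dot{\phi} - L = \frac{p_\theta^2}{2m(R-\rho)^2} + \frac{p_\phi^2}{m\rho^2} + mg\left[R - (R-\rho)\cos\theta\right]$$
Note that the transformation to generalized coordinates is time independent and the potential is not velocity dependent, thus the Hamiltonian also equals the total energy. Also the Hamiltonian is conserved since $\frac{dH}{dt} = 0$.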
7.11. HAMILTONIAN FOR CYCLIC COORDINATES 193
The importance of the relations between invariance and symmetry cannot be overemphasized. It extends beyond classical mechanics to quantum physics and field theory. For a three-dimensional closed system, there are three possible constants for linear momentum, three for angular momentum, and one for energy. It is especially interesting that these, and only these, seven integrals have the property that they are additive for the particles comprising a system, and this occurs independent of whether there is an interaction among the particles. That is, this behavior is obeyed by the whole ensemble of particles for finite systems. Because of their profound importance to physics, these relations between symmetry and invariance are used extensively.
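It is more convenient to write the $n$ generalized coordinates $q_j$ plus their generalized momenta $p_j$ as vectors, e.g. $\mathbf{q} \equiv (q_1, q_2, \ldots, q_n)$, $\mathbf{p} \equiv (p_1, p_2, \ldots, p_n)$. The generalized momenta conjugate to the coordinate $q_j$, defined by 7.3, then can be written in the form
$$p_j = \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial\dot{q}_j} \tag{7.50}$$
Substituting this definition of the generalized momentum into the Hamiltonian defined in (7.37), and expressing it in terms of the coordinate $\mathbf{q}$ and its conjugate generalized momenta $\mathbf{p}$, leads to
$$H(\mathbf{q},\mathbf{p},t) = \sum_jp_j\dot{q}_j - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.51}$$
$$= \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.52}$$
Note that the scalar product $\mathbf{p}\cdot\dot{\mathbf{q}} = \sum_jp_j\dot{q}_j$ equals $2T$ for systems that are scleronomic and when the potential is velocity independent.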
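The construction of the Hamiltonian from the Lagrangian can be checked numerically for a simple case. The sketch below assumes a one-dimensional harmonic oscillator with mass `m` and spring constant `k`, a system chosen here only for illustration. The conjugate momentum is obtained from a central finite difference of the Lagrangian, and the function names are hypothetical.

```python
def lagrangian(q, qdot, m=1.0, k=4.0):
    """L = T - U for a one-dimensional harmonic oscillator (assumed example)."""
    return 0.5 * m * qdot**2 - 0.5 * k * q**2


def conjugate_momentum(q, qdot, m=1.0, k=4.0, eps=1.0e-6):
    """p = dL/d(qdot), estimated with a central finite difference."""
    upper = lagrangian(q, qdot + eps, m, k)
    lower = lagrangian(q, qdot - eps, m, k)
    return (upper - lower) / (2.0 * eps)


def hamiltonian(q, qdot, m=1.0, k=4.0):
    """H = p * qdot - L, evaluated at the state (q, qdot)."""
    p = conjugate_momentum(q, qdot, m, k)
    return p * qdot - lagrangian(q, qdot, m, k)


if __name__ == "__main__":
    q, qdot, m, k = 0.3, 1.2, 1.0, 4.0
    total_energy = 0.5 * m * qdot**2 + 0.5 * k * q**2
    print(hamiltonian(q, qdot, m, k), total_energy)
```

For this scleronomic system with a velocity-independent potential the two printed values agree to within rounding, which illustrates that $\mathbf{p}\cdot\dot{\mathbf{q}} = 2T$ and hence $H = T + U$.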
The crucial feature of the Hamiltonian is that it is expressed as $H(\mathbf{q},\mathbf{p},t)$, that is, it is a function of the $n$ generalized coordinates $\mathbf{q}$ and their conjugate momenta $\mathbf{p}$, which are taken to be independent, in addition to the independent variable, $t$. This is in contrast to the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$ which is a function of the $n$ generalized coordinates $q_j$, the corresponding velocities $\dot{q}_j$, and time $t$. The velocities $\dot{\mathbf{q}}$ are the time derivatives of the coordinates $\mathbf{q}$ and thus these are related. In physics, the fundamental conjugate coordinates are $(\mathbf{q},\mathbf{p})$ which are the coordinates underlying the Hamiltonian. This is in contrast to $(\mathbf{q},\dot{\mathbf{q}})$ which are the coordinates that underlie the Lagrangian. Thus the Hamiltonian is more fundamental than the Lagrangian, which is a reason why Hamiltonian mechanics, rather than Lagrangian mechanics, was used as the foundation for development of quantum and statistical mechanics.
Hamiltonian mechanics will be derived two other ways. Chapter 8 uses the Legendre transformation
between the conjugate variables $(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $(\mathbf{q}, \mathbf{p}, t)$ where the generalized coordinate $\mathbf{q}$ and its conjugate
generalized momentum, $\mathbf{p}$, are independent. This shows that Hamiltonian mechanics is based on the same
variational principles as those used to derive Lagrangian mechanics. Chapter 9 derives Hamiltonian mechanics
directly from Hamilton’s Principle of Least Action. Chapter 8 will introduce the algebraic Hamiltonian
mechanics, that is based on the Hamiltonian. The powerful capabilities provided by Hamiltonian mechanics
will be described in chapter 15.
7.14 Summary
This chapter has explored the importance of symmetries and invariance in Lagrangian mechanics and has
introduced the Hamiltonian. The following summarizes the important conclusions derived in this chapter.
Noether’s theorem:
Noether’s theorem explores the remarkable connection between symmetry, plus the invariance of a system
under transformation, and related conservation laws which imply the existence of important physical
principles, and constants of motion. Transformations where the equations of motion are invariant are called
invariant transformations. Variables that are invariant to a transformation are called cyclic variables. It
was shown that if the Lagrangian does not explicitly contain a particular coordinate of displacement, $q_j$, then
the corresponding conjugate momentum, $p_j$, is conserved. This is Noether’s theorem which states “For each
symmetry of the Lagrangian, there is a conserved quantity”. In particular it was shown that translational
invariance in a given direction leads to the conservation of linear momentum in that direction, and rotational
invariance about an axis leads to conservation of angular momentum about that axis. These are the first-order
spatial and angular integrals of the equations of motion. Noether’s theorem also relates the properties
of the Hamiltonian to time invariance of the Lagrangian, namely:
(1) $H$ is conserved if, and only if, the Lagrangian, and consequently the Hamiltonian, are not explicit
functions of time.
(2) The Hamiltonian gives the total energy if the constraints and coordinate transformations are time
independent and the potential energy is velocity independent. This is equivalent to stating that $H = E$ if the
constraints, or generalized coordinates, for the system are time independent.
Noether’s theorem is of importance since it underlies the relation between symmetries, and invariance in
all of physics; that is, its applicability extends beyond classical mechanics.
Generalized momentum:
The generalized momentum associated with the coordinate $q_j$ is defined to be
$$\frac{\partial L}{\partial \dot{q}_j} \equiv p_j \tag{7.3}$$
where $p_j$ is also called the conjugate momentum (or canonical momentum) to $q_j$ where $q_j$, $p_j$ are
conjugate, or canonical, variables. Remember that the linear momentum $\mathbf{p}$ is the first-order time integral
given by equation 2.10. Note that if $q_j$ is not a spatial coordinate, then $p_j$ is not linear momentum, but is
the conjugate momentum. For example, if $q_j$ is an angle, then $p_j$ will be angular momentum.
Kinetic energy in generalized coordinates:
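It was shown that the kinetic energy can be expressed in terms of generalized coordinates by
$$T(\mathbf{q}, \dot{\mathbf{q}}, t) = \sum_{\alpha} \sum_{i,j,k} \frac{1}{2} m_\alpha \frac{\partial x_{\alpha,i}}{\partial q_j} \frac{\partial x_{\alpha,i}}{\partial q_k} \dot{q}_j \dot{q}_k + \sum_{\alpha} \sum_{i,j} m_\alpha \frac{\partial x_{\alpha,i}}{\partial q_j} \frac{\partial x_{\alpha,i}}{\partial t} \dot{q}_j + \sum_{\alpha} \sum_{i} \frac{1}{2} m_\alpha \left( \frac{\partial x_{\alpha,i}}{\partial t} \right)^2 \tag{7.19}$$
$$= T_2(\mathbf{q}, \dot{\mathbf{q}}, t) + T_1(\mathbf{q}, \dot{\mathbf{q}}, t) + T_0(\mathbf{q}, t) \tag{7.53}$$
For scleronomic systems with a potential that is velocity independent, the kinetic energy can be
expressed as
$$T = T_2 = \frac{1}{2} \sum_j \dot{q}_j p_j = \frac{1}{2} \dot{\mathbf{q}} \cdot \mathbf{p} \tag{7.31}$$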
Generalized energy
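Jacobi’s Generalized Energy $h(\mathbf{q}, \dot{\mathbf{q}}, t)$ was defined as
$$h(\mathbf{q}, \dot{\mathbf{q}}, t) \equiv \sum_j \left( \dot{q}_j \frac{\partial L}{\partial \dot{q}_j} \right) - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{7.36}$$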
Hamiltonian function
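The Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$ was defined in terms of the generalized energy $h(\mathbf{q}, \dot{\mathbf{q}}, t)$ and by introducing
the generalized momentum. That is
$$H(\mathbf{q}, \mathbf{p}, t) \equiv h(\mathbf{q}, \dot{\mathbf{q}}, t) = \sum_j p_j \dot{q}_j - L(\mathbf{q}, \dot{\mathbf{q}}, t) = \mathbf{p} \cdot \dot{\mathbf{q}} - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{7.37}$$
Note that if all the generalized non-potential forces are zero, then the bracket in equation 7.38 is zero, and
if the Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion.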
Generalized energy and total energy:
The generalized energy, and corresponding Hamiltonian, equal the total energy if:
1) The kinetic energy has a homogeneous quadratic dependence on the generalized velocities and the
transformation to generalized coordinates is independent of time, = 0
2) The potential energy is not velocity dependent, thus the terms $\frac{\partial U}{\partial \dot{q}_j} = 0$.
Chapter 8 will introduce Hamiltonian mechanics that is built on the Hamiltonian, and chapter 15 will
explore applications of Hamiltonian mechanics.
196 CHAPTER 7. SYMMETRIES, INVARIANCE AND THE HAMILTONIAN
Workshop exercises
1. Consider a particle of mass moving in a plane and subject to an inverse square attractive force.
2. Consider a Lagrangian function of the form $L(q_i, \dot{q}_i, \ddot{q}_i, t)$. Here the Lagrangian contains a time derivative
of the generalized coordinates that is higher than the first. When working with such Lagrangians, the term
“generalized mechanics” is used.
(a) Consider a system with one degree of freedom. By applying the methods of the calculus of variations,
and assuming that Hamilton’s principle holds with respect to variations which keep both $q$ and $\dot{q}$ fixed at
the end points, show that the corresponding Lagrange equation is
$$\frac{d^2}{dt^2} \left( \frac{\partial L}{\partial \ddot{q}} \right) - \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}} \right) + \frac{\partial L}{\partial q} = 0$$
Such equations of motion have interesting applications in chaos theory.
(b) Apply this result to the Lagrangian
$$L = -\frac{m}{2} q \ddot{q} - \frac{k}{2} q^2$$
Do you recognize the equations of motion?
3. A uniform solid cylinder of radius and mass rests on a horizontal plane and an identical cylinder rests
on it touching along the top of the first cylinder with the axes of both cylinders parallel. The upper cylinder
is given an infinitesimal displacement so that both cylinders roll without slipping in the directions shown by
the arrows.
[Figure: the two cylinders at t = 0 and at t > 0, with x and y axes.]
4. Consider a diatomic molecule which has a symmetry axis along the line through the center of the two atoms
comprising the molecule. Consider that this molecule is rotating about an axis perpendicular to the symmetry
axis and that there are no external forces acting on the molecule. Use Noether’s Theorem to answer the
following questions:
a) Is the total angular momentum conserved?
b) Is the projection of the total angular momentum along a space-fixed axis conserved?
c) Is the projection of the angular momentum along the symmetry axis of the rotating molecule conserved?
d) Is the projection of the angular momentum perpendicular to the rotating symmetry axis conserved?
5. A bead of mass slides under gravity along a smooth wire bent in the shape of a parabola 2 = in the
vertical ( ) plane.
Problems
1. Let the horizontal plane be the − plane. A bead of mass is constrained to slide with speed along a
curve described by the function = (). What force does the curve apply to the bead? (Ignore gravity)
2. Consider the Atwood's machine shown. The masses are $4m$, $5m$, and $3m$. Let $x$ and $y$ be the heights of the
right two masses relative to their initial positions.
a) Solve this problem using the Euler-Lagrange equations
b) Use Noether’s theorem to find the conserved momentum.
[Figure: Atwood's machine with masses 4m, 5m, and 3m; x and y label the heights of the two right-hand masses.]
3. A cube of side 2 and center of mass , is placed on a fixed horizontal cylinder of radius and center as
shown in the figure. Originally the cube is placed such that is centered above but it can roll from side to
side without slipping. (a) Assuming that use the Lagrangian approach to find the frequency for small
oscillations about the top of the cylinder. For simplicity make the small angle approximation for before using
the Lagrange-Euler equations. (b) What will be the motion if ? Note that the moment of inertia of the
cube about the center of mass is 23 2 .
[Figure: cube balanced on the fixed cylinder, with labels h, b, and O.]
4. Two equal masses of mass are glued to a massless hoop of radius that is free to rotate about its center in a
vertical plane. The angle between the masses is 2 , as shown. Find the frequency of oscillations.
5. Three sticks, each of length 2 and mass , with the center of mass at the center of each stick, are
hinged at their ends as shown. The bottom end of the lower stick is hinged at the ground. They are held so
that the lower two sticks are vertical, and the upper one is tilted at a small angle with respect to the vertical.
They are then released. At the instant of release what are the three equations of motion derived from the
Lagrangian derived assuming that is small? Use these to determine the initial angular accelerations of the
three sticks.
Chapter 8
Hamiltonian mechanics
8.1 Introduction
The three major formulations of classical mechanics are
1. Newtonian mechanics which is the most intuitive vector formulation used in classical mechanics.
2. Lagrangian mechanics is a powerful algebraic formulation of classical mechanics derived using either
d’Alembert’s Principle, or Hamilton’s Principle. The latter states “A dynamical system follows a path
that minimizes the time integral of the difference between the kinetic and potential energies”.
3. Hamiltonian mechanics has a beautiful superstructure that, like Lagrangian mechanics, is built
upon variational calculus, Hamilton’s principle, and Lagrangian mechanics.
Hamiltonian mechanics is introduced at this juncture since it is closely interwoven with Lagrangian mechanics.
Hamiltonian mechanics plays a fundamental role in modern physics, but the discussion of the important
role it plays in modern physics will be deferred until chapters 15 and 18 where applications to modern physics
are addressed.
The following important concepts were introduced in chapter 7:
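The generalized momentum was defined to be given by
$$p_j \equiv \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \dot{q}_j} \tag{8.1}$$
Note that, as discussed in chapter 7.2, if the potential is velocity dependent, such as for the Lorentz force, then
the generalized momentum includes terms in addition to the usual mechanical momentum.
Jacobi’s generalized energy function $h(\mathbf{q}, \dot{\mathbf{q}}, t)$ was introduced where
$$h(\mathbf{q}, \dot{\mathbf{q}}, t) = \sum_j \left( \dot{q}_j \frac{\partial L}{\partial \dot{q}_j} \right) - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.2}$$
The Hamiltonian function was defined to be given by expressing the generalized energy function,
equation 8.2, in terms of the generalized momentum. That is, the Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$ is expressed as
$$H(\mathbf{q}, \mathbf{p}, t) = \sum_j p_j \dot{q}_j - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.3}$$
The symbols $\mathbf{q}$, $\mathbf{p}$ designate vectors of $n$ generalized coordinates, $\mathbf{q} \equiv (q_1, q_2, \ldots, q_n)$, $\mathbf{p} \equiv (p_1, p_2, \ldots, p_n)$.
Equation 8.3 can be written compactly in a symmetric form using the scalar product $\mathbf{p} \cdot \dot{\mathbf{q}} = \sum_j p_j \dot{q}_j$.
$$H(\mathbf{q}, \mathbf{p}, t) + L(\mathbf{q}, \dot{\mathbf{q}}, t) = \mathbf{p} \cdot \dot{\mathbf{q}} \tag{8.4}$$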
A crucial feature of Hamiltonian mechanics is that the Hamiltonian is expressed as $H(\mathbf{q}, \mathbf{p}, t)$, that
is, it is a function of the $n$ generalized coordinates and their conjugate momenta, which are taken to be
independent, plus the independent variable, time. This contrasts with the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ which is a
function of the $n$ generalized coordinates $q_j$, and the corresponding velocities $\dot{q}_j$, that is the time derivatives
of the coordinates $q_j$, plus the independent variable, time.
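$$\mathbf{v} = \nabla_{\mathbf{u}} F(\mathbf{u}, \mathbf{w}) \tag{8.5}$$
and where $\mathbf{w}$ designates passive variables. The function $\nabla_{\mathbf{u}} F(\mathbf{u}, \mathbf{w})$ is the first-order derivative (gradient)
of $F(\mathbf{u}, \mathbf{w})$ with respect to the components of the vector $\mathbf{u}$. The Legendre transform states that the inverse
formula can always be written as a first-order derivative
$$\mathbf{u} = \nabla_{\mathbf{v}} G(\mathbf{v}, \mathbf{w}) \tag{8.6}$$
The relationship between the functions $F(\mathbf{u}, \mathbf{w})$ and $G(\mathbf{v}, \mathbf{w})$ is symmetrical and each is said to be the
Legendre transform of the other.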
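The symmetry of the Legendre transform can be illustrated with a single active variable and no passive variables. The following is a minimal sketch under those assumptions, using an illustrative quadratic function $F(u) = \frac{1}{2} a u^2$ with $a = 3$; the helper names are hypothetical. It forms $G(v) = uv - F(u)$ and checks that the derivative of $G$ recovers the original variable $u$.

```python
def F(u, a=3.0):
    """Illustrative function of the active variable u (an assumed example)."""
    return 0.5 * a * u * u

def derivative(f, x, eps=1e-6):
    """Central finite-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def legendre_transform(f, invert):
    """Return G(v) = u*v - f(u), where u = invert(v) solves v = f'(u)."""
    def G(v):
        u = invert(v)
        return u * v - f(u)
    return G

# For F = a u^2 / 2 the relation v = dF/du = a u inverts to u = v / a
G = legendre_transform(F, invert=lambda v: v / 3.0)

u0 = 1.7
v0 = derivative(F, u0)       # v = dF/du
u_back = derivative(G, v0)   # u = dG/dv recovers u0
```

The sum `F(u0) + G(v0)` equals the product `u0 * v0` to within the finite-difference error, which is the same structure as the relation between the Lagrangian and the Hamiltonian.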
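The general Legendre transform can be used to relate the Lagrangian and Hamiltonian by identifying the
active variables $\mathbf{v}$ with $\mathbf{p}$ and $\mathbf{u}$ with $\dot{\mathbf{q}}$, the passive variable $\mathbf{w}$ with $\mathbf{q}, t$, and the corresponding functions
$F(\mathbf{u}, \mathbf{w}) = L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $G(\mathbf{v}, \mathbf{w}) = H(\mathbf{q}, \mathbf{p}, t)$. Thus the generalized momentum (8.1) corresponds to
$$\mathbf{p} = \nabla_{\dot{\mathbf{q}}} L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.9}$$
where $(\mathbf{q}, t)$ are the passive variables. Then the Legendre transform states that the transformed variable $\dot{\mathbf{q}}$
is given by the relation
$$\dot{\mathbf{q}} = \nabla_{\mathbf{p}} H(\mathbf{q}, \mathbf{p}, t) \tag{8.10}$$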
Since the functions $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $H(\mathbf{q}, \mathbf{p}, t)$ are the Legendre transforms of each other, they satisfy the
relation
$$H(\mathbf{q}, \mathbf{p}, t) + L(\mathbf{q}, \dot{\mathbf{q}}, t) = \mathbf{p} \cdot \dot{\mathbf{q}} \tag{8.11}$$
The function $H(\mathbf{q}, \mathbf{p}, t)$, which is the Legendre transform of the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$, is called the Hamiltonian
function and equation (8.11) is identical to our original definition of the Hamiltonian given by
equation (8.3). The variables $\mathbf{q}$ and $t$ are passive variables thus equation (8.8) gives that
Written in component form equation 8.12 gives the partial derivative relations
Note that equations 8.13 and 8.14 are strictly a result of the Legendre transformation. To complete the
transformation from Lagrangian to Hamiltonian mechanics it is necessary to invoke the calculus of variations
via the Lagrange-Euler equations. The symmetry of the Legendre transform is illustrated by equation 8.11.
Equation 7.31 gives that the scalar product $\mathbf{p} \cdot \dot{\mathbf{q}} = 2T_2$. For scleronomic systems, with velocity independent
potentials $U$, the standard Lagrangian $L = T - U$ and $H = 2T - T + U = T + U$. Thus, for this simple
case, equation 8.11 reduces to an identity $H + L = 2T$.
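$$\frac{d}{dt} \frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.16}$$
This gives the corresponding Hamilton equation for the time derivative of $p_j$ to be
$$\frac{d}{dt} \frac{\partial L}{\partial \dot{q}_j} = \dot{p}_j = \frac{\partial L}{\partial q_j} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.17}$$
Substituting equation 8.13 into equation 8.17 leads to the second Hamilton equation of motion
$$\dot{p}_j = -\frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial q_j} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.18}$$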
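One can explore further the implications of Hamiltonian mechanics by taking the time differential of (8.3)
giving
$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \dot{q}_j \dot{p}_j + p_j \ddot{q}_j - \frac{\partial L}{\partial q_j} \dot{q}_j - \frac{\partial L}{\partial \dot{q}_j} \ddot{q}_j \right) - \frac{\partial L}{\partial t} \tag{8.19}$$
Inserting the conjugate momenta $p_j \equiv \frac{\partial L}{\partial \dot{q}_j}$ and equation 8.17 into equation 8.19 results in
$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \dot{q}_j \dot{p}_j + p_j \ddot{q}_j - \dot{q}_j \left[ \dot{p}_j - \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} - Q_j^{EXC} \right] - p_j \ddot{q}_j \right) - \frac{\partial L}{\partial t} \tag{8.20}$$
The second and fourth terms cancel as well as the $\dot{q}_j \dot{p}_j$ terms, leaving
$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \dot{q}_j \right) - \frac{\partial L}{\partial t} \tag{8.21}$$
Using equations 8.15 and 8.18 to substitute for $\dot{q}_j$ and $\dot{p}_j$ in equation 8.22 gives
$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \dot{q}_j \right) + \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial t} \tag{8.23}$$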
Note that equation 8.23 must equal the generalized energy theorem, i.e. equation 8.21. Therefore,
$$\frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} \tag{8.24}$$
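In summary, Hamilton’s equations of motion are given by
$$\dot{q}_j = \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial p_j} \tag{8.25}$$
$$\dot{p}_j = -\frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial q_j} + \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \tag{8.26}$$
$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \dot{q}_j \right) - \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial t} \tag{8.27}$$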
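The symmetry of Hamilton’s equations of motion is illustrated when the Lagrange multiplier and generalized
forces are zero. Then
$$\dot{q}_j = \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial p_j} \tag{8.28}$$
$$\dot{p}_j = -\frac{\partial H(\mathbf{p}, \mathbf{q}, t)}{\partial q_j} \tag{8.29}$$
$$\frac{dH(\mathbf{p}, \mathbf{q}, t)}{dt} = \frac{\partial H(\mathbf{p}, \mathbf{q}, t)}{\partial t} = -\frac{\partial L(\dot{\mathbf{q}}, \mathbf{q}, t)}{\partial t} \tag{8.30}$$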
This simplified form illustrates the symmetry of Hamilton’s equations of motion. Many books present
the Hamiltonian only for this special simplified case where it is holonomic, conservative, and generalized
coordinates are used.
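The general form of Hamilton's equations, with a nonzero generalized force, can be checked numerically. The following is a minimal sketch that assumes a one-dimensional oscillator with Hamiltonian $H = p^2/2m + kq^2/2$, no constraints, and an illustrative non-potential force $Q = -b\dot{q}$; these choices and all parameter values are assumptions made for the example. It integrates $\dot{q} = \partial H/\partial p$ and $\dot{p} = -\partial H/\partial q + Q$ and compares the change in $H$ with the accumulated integral of $Q\dot{q}$, as expected from the energy relation when the Lagrangian has no explicit time dependence.

```python
def simulate(m=1.0, k=4.0, b=0.3, q=1.0, p=0.0, dt=1e-4, steps=50000):
    """Integrate qdot = dH/dp, pdot = -dH/dq + Q for H = p^2/2m + k q^2/2,
    with an assumed non-potential force Q = -b*qdot (linear damping)."""
    def H(q, p):
        return p * p / (2 * m) + 0.5 * k * q * q

    E0 = H(q, p)
    work = 0.0                     # running integral of Q * qdot dt
    for _ in range(steps):
        qdot = p / m               # dH/dp
        Q = -b * qdot              # generalized non-potential force
        p += dt * (-k * q + Q)     # -dH/dq + Q
        q += dt * (p / m)          # position update uses the new momentum
        work += dt * Q * qdot
    return E0, H(q, p), work

E0, E1, work = simulate()
```

With $Q = 0$ the same loop reduces to the symmetric equations and $H$ stays constant to within the integration error.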
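$$x = \rho \cos \phi \tag{8.31}$$
$$y = \rho \sin \phi$$
$$z = z$$
Using appendix table 3 the Lagrangian can be written in cylindrical coordinates as
$$L = T - U = \frac{m}{2} \left( \dot{\rho}^2 + \rho^2 \dot{\phi}^2 + \dot{z}^2 \right) - U(\rho, \phi, z) \tag{8.32}$$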
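The conjugate momenta are
$$p_\rho = \frac{\partial L}{\partial \dot{\rho}} = m \dot{\rho} \tag{8.33}$$
$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m \rho^2 \dot{\phi} \tag{8.34}$$
$$p_z = \frac{\partial L}{\partial \dot{z}} = m \dot{z} \tag{8.35}$$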
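Assume a conservative force, then $H$ is conserved. Since the transformation from cartesian to non-rotating
generalized cylindrical coordinates is time independent, then $H = E$. Then using (8.32) to (8.35) gives
the Hamiltonian in cylindrical coordinates to be
$$H(\mathbf{q}, \mathbf{p}, t) = \sum_j p_j \dot{q}_j - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.36}$$
$$= \left( p_\rho \dot{\rho} + p_\phi \dot{\phi} + p_z \dot{z} \right) - \frac{m}{2} \left( \dot{\rho}^2 + \rho^2 \dot{\phi}^2 + \dot{z}^2 \right) + U(\rho, \phi, z)$$
$$= \frac{1}{2m} \left( p_\rho^2 + \frac{p_\phi^2}{\rho^2} + p_z^2 \right) + U(\rho, \phi, z) \tag{8.37}$$
Hamilton’s canonical equations of motion in cylindrical coordinates are then
$$\dot{p}_\rho = -\frac{\partial H}{\partial \rho} = \frac{p_\phi^2}{m \rho^3} - \frac{\partial U}{\partial \rho} \tag{8.38}$$
$$\dot{p}_\phi = -\frac{\partial H}{\partial \phi} = -\frac{\partial U}{\partial \phi} \tag{8.39}$$
$$\dot{p}_z = -\frac{\partial H}{\partial z} = -\frac{\partial U}{\partial z} \tag{8.40}$$
$$\dot{\rho} = \frac{\partial H}{\partial p_\rho} = \frac{p_\rho}{m} \tag{8.41}$$
$$\dot{\phi} = \frac{\partial H}{\partial p_\phi} = \frac{p_\phi}{m \rho^2} \tag{8.42}$$
$$\dot{z} = \frac{\partial H}{\partial p_z} = \frac{p_z}{m} \tag{8.43}$$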
Note that if $\phi$ is cyclic, that is $\frac{\partial H}{\partial \phi} = 0$, then the angular momentum about the $z$ axis, $p_\phi$, is a constant
of motion. Similarly, if $z$ is cyclic, then $p_z$ is a constant of motion.
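The canonical equations in cylindrical coordinates can be verified by differentiating the Hamiltonian numerically. The following is a minimal sketch that assumes an illustrative potential $U = \frac{1}{2} k (\rho^2 + z^2)$, which is independent of the angle so that the angle is cyclic; the potential, the parameter values, and the helper names are assumptions made for the example.

```python
def H_cyl(rho, phi, z, p_rho, p_phi, p_z, m=2.0, k=3.0):
    """Cylindrical-coordinate Hamiltonian with an assumed potential
    U = k (rho^2 + z^2) / 2 that does not depend on phi."""
    kinetic = (p_rho**2 + p_phi**2 / rho**2 + p_z**2) / (2 * m)
    return kinetic + 0.5 * k * (rho**2 + z**2)

def partial(f, args, index, eps=1e-6):
    """Central finite-difference partial derivative of f with respect to args[index]."""
    up = list(args); up[index] += eps
    dn = list(args); dn[index] -= eps
    return (f(*up) - f(*dn)) / (2 * eps)

state = (1.5, 0.4, -0.2, 0.7, 1.1, 0.3)   # rho, phi, z, p_rho, p_phi, p_z
m, k = 2.0, 3.0
rho, p_phi = state[0], state[4]

p_rho_dot = -partial(H_cyl, state, 0)   # includes the centrifugal term p_phi^2 / (m rho^3)
p_phi_dot = -partial(H_cyl, state, 1)   # vanishes because phi is cyclic here
phi_dot = partial(H_cyl, state, 4)      # equals p_phi / (m rho^2)
```

The numerical value of `p_rho_dot` agrees with $p_\phi^2/(m\rho^3) - k\rho$ for this potential, and `p_phi_dot` is zero, so $p_\phi$ is a constant of motion.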
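The Lagrangian is
$$L = T - U = \frac{m}{2} \left( \dot{r}^2 + r^2 \dot{\theta}^2 + r^2 \sin^2 \theta \, \dot{\phi}^2 \right) - U(r, \theta, \phi) \tag{8.45}$$
The conjugate momenta are
$$p_r = \frac{\partial L}{\partial \dot{r}} = m \dot{r} \tag{8.46}$$
$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m r^2 \dot{\theta} \tag{8.47}$$
$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m r^2 \sin^2 \theta \, \dot{\phi} \tag{8.48}$$
Assuming a conservative force then $H$ is conserved. Since the transformation from cartesian to generalized
spherical coordinates is time independent, then $H = E$. Thus using (8.46) to (8.48) the Hamiltonian is given
in spherical coordinates by
$$H(\mathbf{q}, \mathbf{p}, t) = \sum_j p_j \dot{q}_j - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.49}$$
$$= \left( p_r \dot{r} + p_\theta \dot{\theta} + p_\phi \dot{\phi} \right) - \frac{m}{2} \left( \dot{r}^2 + r^2 \dot{\theta}^2 + r^2 \sin^2 \theta \, \dot{\phi}^2 \right) + U(r, \theta, \phi) \tag{8.50}$$
$$= \frac{1}{2m} \left( p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2 \sin^2 \theta} \right) + U(r, \theta, \phi) \tag{8.51}$$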
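Note that the Lagrangian is not explicitly time dependent, thus the Hamiltonian is a constant of motion.
Hamilton’s equations give that
$$\dot{x} = \frac{\partial H}{\partial p_x} = \frac{p_x}{m} \qquad -\dot{p}_x = \frac{\partial H}{\partial x} = 0$$
$$\dot{y} = \frac{\partial H}{\partial p_y} = \frac{p_y}{m} \qquad -\dot{p}_y = \frac{\partial H}{\partial y} = 0$$
$$\dot{z} = \frac{\partial H}{\partial p_z} = \frac{p_z}{m} \qquad -\dot{p}_z = \frac{\partial H}{\partial z} = mg$$
Combining these gives that $\ddot{x} = 0$, $\ddot{y} = 0$, $\ddot{z} = -g$. Note that the linear momenta $p_x$ and $p_y$ are constants
of motion whereas the rate of change of $p_z$ is given by the gravitational force $mg$. Note also that $H = T + U$
for this conservative system.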
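The Hamiltonian is
$$H = \sum_i p_i \dot{q}_i - L = p_x \dot{x} - L$$
$$= \frac{p_x^2}{m} - \frac{1}{2} \frac{p_x^2}{m} + \frac{1}{2} k x^2 = \frac{1}{2} \frac{p_x^2}{m} + \frac{1}{2} k x^2$$
Note that the Lagrangian is not explicitly time dependent, thus the Hamiltonian will be a constant of motion.
Hamilton’s equations give that
$$\dot{x} = \frac{\partial H}{\partial p_x} = \frac{p_x}{m}$$
or
$$p_x = m \dot{x}$$
In addition
$$-\dot{p}_x = \frac{\partial H}{\partial x} = -\frac{\partial L}{\partial x} = kx$$
Combining these gives that
$$\ddot{x} + \frac{k}{m} x = 0$$
which is the equation of motion for the harmonic oscillator.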
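The two first-order Hamilton equations for the oscillator can be integrated directly and compared with the known solution $x(t) = x_0 \cos \omega t$, with $\omega = \sqrt{k/m}$, for a start from rest. The following is a minimal sketch that uses a leapfrog scheme; the integrator choice and the parameter values are assumptions made for illustration and are not taken from the text.

```python
import math

def integrate_oscillator(m=1.0, k=9.0, x0=0.5, t_end=2.0, dt=1e-4):
    """Leapfrog integration of xdot = p/m, pdot = -k x, starting from rest."""
    x, p = x0, 0.0
    for _ in range(round(t_end / dt)):
        p -= 0.5 * dt * k * x     # half step in momentum
        x += dt * p / m           # full step in position
        p -= 0.5 * dt * k * x     # second half step in momentum
    return x, p

x_num, p_num = integrate_oscillator()
omega = math.sqrt(9.0 / 1.0)
x_exact = 0.5 * math.cos(omega * 2.0)          # analytic position at t_end
p_exact = -1.0 * omega * 0.5 * math.sin(omega * 2.0)  # analytic momentum m*xdot
```

The leapfrog scheme is chosen because it respects the phase-space structure of Hamilton's equations, so the energy error stays bounded instead of drifting.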
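$$L = \frac{1}{2} m l^2 \dot{\theta}^2 + mgl \cos \theta$$
The momentum conjugate to $\theta$ is
$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m l^2 \dot{\theta}$$
which is the angular momentum about the pivot point.
The Hamiltonian is
$$H = \sum_i p_i \dot{q}_i - L = p_\theta \dot{\theta} - L = \frac{1}{2} m l^2 \dot{\theta}^2 - mgl \cos \theta = \frac{p_\theta^2}{2 m l^2} - mgl \cos \theta$$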
horizontal axis when $|E| > mgl$, that is, the pendulum swings around a circle continuously, i.e. it rotates
continuously in one direction about the horizontal axis. The phase change occurs at $E = mgl$ and is
designated by the separatrix trajectory.
The plot of $p_\theta$ versus $\theta$ for the plane pendulum is better presented on a cylindrical phase space
representation since $\theta$ is a cyclic variable that cycles around the cylinder, whereas $p_\theta$ oscillates equally about
zero having both positive and negative values. When wrapped around a cylinder then the unstable and stable
equilibrium points will be at diametrically opposite locations on the surface of the cylinder at $p_\theta = 0$. For
small oscillations about equilibrium, also called librations, the correlation between $p_\theta$ and $\theta$ is given by the
clockwise closed ellipses wrapped on the cylindrical surface, whereas for energies $|E| > mgl$ the positive $p_\theta$
corresponds to counterclockwise rotations while the negative $p_\theta$ corresponds to clockwise rotations.
[Figure: Phase-space diagrams for the plane pendulum. The separatrix (bold line) separates the oscillatory
solutions from the rolling solutions. The upper (a) shows one complete cycle while the lower (b) shows two
complete cycles.]
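The distinction between libration and rotation can be made concrete by comparing the energy of a phase-space point with the separatrix energy. The following is a minimal sketch that assumes the pendulum Hamiltonian $H = p_\theta^2/(2ml^2) - mgl\cos\theta$ with $\theta$ measured from the downward vertical; the function names and the default parameter values are hypothetical.

```python
import math

def pendulum_energy(theta, p_theta, m=1.0, l=1.0, g=9.81):
    """H = p_theta^2 / (2 m l^2) - m g l cos(theta)."""
    return p_theta**2 / (2 * m * l**2) - m * g * l * math.cos(theta)

def classify(theta, p_theta, m=1.0, l=1.0, g=9.81):
    """Compare the energy with m g l, the energy on the separatrix."""
    E = pendulum_energy(theta, p_theta, m, l, g)
    if E < m * g * l:
        return "libration"
    if E > m * g * l:
        return "rotation"
    return "separatrix"

def separatrix_momentum(theta, m=1.0, l=1.0, g=9.81):
    """Positive branch of p_theta on the separatrix, where E = m g l."""
    return math.sqrt(2 * m**2 * g * l**3 * (1 + math.cos(theta)))
```

For example, `classify(0.1, 0.0)` returns `"libration"`, while a point at the bottom of the swing with `p_theta` larger than `separatrix_momentum(0.0)` is classified as `"rotation"`.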
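$$x^2 + y^2 = R^2$$
$$\mathbf{F} = -k \mathbf{r}$$
the potential is the same as for the harmonic oscillator,
that is
$$U = \frac{1}{2} k r^2 = \frac{1}{2} k (\rho^2 + z^2)$$
This is independent of $\phi$ and thus $\phi$ is cyclic.
In cylindrical coordinates the velocity is
$$v^2 = \dot{\rho}^2 + \rho^2 \dot{\phi}^2 + \dot{z}^2$$
On the surface of the cylinder
$$\rho = R$$
$$\dot{\rho} = 0$$
[Figure: Mass attracted to origin by force proportional to distance from origin with the motion constrained
to the surface of a cylinder.]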
= ̇ + ̇ + ̇ −
1 ³ 2 2
´ 1
= ̇ + 2 ̇ + ̇ 2 − + 2 ̇
2 2
µ ¶2
2 1 1 2 2
= + + + −
2 22 2 2
" µ ¶2 #
1 2 1 2
= + + + −
2 2
Note that the Hamiltonian is not an explicit function of time, therefore it is a constant of motion which
equals the total energy. " #
µ ¶2
1 2 1 2
= + + + − =
2 2
Since ̇ = −
and if is not an explicit function of then ̇ = 0 that is, is a constant of motion.
Thus and are constants of motion.
Consider the initial conditions = ̇ = ̇ = ̇ = 0. Then
1 1
= = 2 ̇ − 2 = − 2
̇ 2 2
= 0
" µ ¶2 #
1 2 1 2 ln( )
= + + + + 0 = 0
2 2 ln( )
Note that at = then is given by the last equation since the Hamiltonian equals a constant 0 . That
is, assuming that then
1
2 = 20 − ( )2
2
Define a critical magnetic field by r
2 20
≡
then
¡ 2¢ ¡ ¢ 1
= = 2 − 2 ( )2
2
Note that if then is real at = . However, if then is imaginary at =
implying that there must be a maximum orbit radius 0 for the electron where 0 . That is, the electron
trajectories are confined spatially to coaxial cylindrical orbits concentric with the magnetron electromagnetic
fields. These closed electron trajectories excite the microwave cavities located in the nearby outer cylindrical
wall of the anode.
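$$H(q_1, \ldots, q_n; p_1, \ldots, p_n; t) = \sum_{i=1}^{n} p_i \dot{q}_i - L = \sum_{cyclic} p_i \dot{q}_i + \sum_{noncyclic} p_i \dot{q}_i - L \tag{8.58}$$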
Routh’s clever idea was to define a new function, called the Routhian, that includes only one of the two
partitions of the kinetic energy terms. This makes the Routhian a Hamiltonian for the coordinates for which
the kinetic energy terms are included, while the Routhian acts like a negative Lagrangian for the coordinates
where the kinetic energy term is omitted. This book defines two Routhians.
$$R_{cyclic}(q_1, \ldots, q_n; \dot{q}_1, \ldots, \dot{q}_s; p_{s+1}, \ldots, p_n; t) \equiv \sum_{cyclic} p_i \dot{q}_i - L \tag{8.59}$$
$$R_{noncyclic}(q_1, \ldots, q_n; p_1, \ldots, p_s; \dot{q}_{s+1}, \ldots, \dot{q}_n; t) \equiv \sum_{noncyclic} p_i \dot{q}_i - L \tag{8.60}$$
The first Routhian, called $R_{cyclic}$, includes the kinetic energy terms only for the cyclic variables, and behaves
like a Hamiltonian for the cyclic variables, and behaves like a Lagrangian for the non-cyclic variables. The
second Routhian, called $R_{noncyclic}$, includes the kinetic energy terms for only the non-cyclic variables, and
behaves like a Hamiltonian for the non-cyclic variables, and behaves like a negative Lagrangian for the cyclic
variables. These two Routhians complement each other in that they make the Routhian either a Hamiltonian
for the cyclic variables, or the converse where the Routhian is a Hamiltonian for the non-cyclic variables.
The Routhians use $(q_i, \dot{q}_i)$ to denote those coordinates for which the Routhian behaves like a Lagrangian, and
$(q_i, p_i)$ for those coordinates where the Routhian behaves like a Hamiltonian. For uniformity, it is assumed
that the degrees of freedom between $1 \le i \le s$ are non-cyclic, while those between $s+1 \le i \le n$ are ignorable
cyclic coordinates.
The Routhian is a hybrid of Lagrangian and Hamiltonian mechanics. Some textbooks minimize discussion
of the Routhian on the grounds that this hybrid approach is not fundamental. However, the Routhian is
used extensively in engineering in order to derive the equations of motion for rotating systems. In addition
it is used when dealing with rotating nuclei in nuclear physics, rotating molecules in molecular physics, and
rotating galaxies in astrophysics. The Routhian reduction technique provides a powerful way to calculate
the intrinsic properties for a rotating system in the rotating frame of reference. The Routhian approach is
included in this textbook because it plays an important role in practical applications of rotating systems, plus
it nicely illustrates the relative advantages of the Lagrangian and Hamiltonian formulations in mechanics.
8.6. ROUTHIAN REDUCTION 211
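The first two terms on the right can be combined to give the Hamiltonian $H_{cyclic}$ for only the cyclic
variables, $i = s+1, s+2, \ldots, n$, that is
$$R_{cyclic}(q_1, \ldots, q_n; \dot{q}_1, \ldots, \dot{q}_s; p_{s+1}, \ldots, p_n; t) = H_{cyclic} - L_{noncyclic} \tag{8.63}$$
The Routhian $R_{cyclic}(q_1, \ldots, q_n; \dot{q}_1, \ldots, \dot{q}_s; p_{s+1}, \ldots, p_n; t)$ also can be written in an alternate form
$$R_{cyclic}(q_1, \ldots, q_n; \dot{q}_1, \ldots, \dot{q}_s; p_{s+1}, \ldots, p_n; t) \equiv \sum_{cyclic} p_i \dot{q}_i - L = \sum_{i=1}^{n} p_i \dot{q}_i - L - \sum_{noncyclic} p_i \dot{q}_i \tag{8.64}$$
$$= H - \sum_{noncyclic} p_i \dot{q}_i \tag{8.65}$$
which is expressed as the complete Hamiltonian minus the kinetic energy term for the noncyclic coordinates.
The Routhian $R_{cyclic}$ behaves like a Hamiltonian for the $m$ cyclic coordinates and behaves like a negative
Lagrangian for all the $s = n - m$ noncyclic coordinates $i = 1, 2, \ldots, s$. Thus the equations of motion
for the $s$ non-cyclic variables are given using Lagrange’s equations of motion, while the Routhian behaves
like a Hamiltonian for the $m$ ignorable cyclic variables $i = s+1, \ldots, n$.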
Ignoring both the Lagrange multiplier and generalized forces, then the partitioned equations of motion
for the non-cyclic and cyclic generalized coordinates are given in Table 8.1.
Table 8.1: Equations of motion for the Routhian $R_{cyclic}$
               Lagrange equations              Hamilton equations
Coordinates    Noncyclic: $1 \le i \le s$      Cyclic: $(s+1) \le i \le n$
Thus there are $m$ cyclic (ignorable) coordinates $(q, p)_{s+1}, \ldots, (q, p)_n$ which obey Hamilton’s equations of
motion, while the first $s = n - m$ non-cyclic (non-ignorable) coordinates $(q, \dot{q})_1, \ldots, (q, \dot{q})_s$ for $i = 1, 2, \ldots, s$
obey Lagrange equations. The solution for the cyclic variables is trivial since they are constants of motion
and thus the Routhian $R_{cyclic}$ has reduced the number of equations of motion that must be solved from $n$ to
the $s = n - m$ non-cyclic variables. This Routhian provides an especially useful way to reduce the number
of equations of motion for rotating systems.
Note that there are several definitions used to define the Routhian, for example some books define this
Routhian as being the negative of the definition used here so that it corresponds to a positive Lagrangian.
However, this sign usually cancels when deriving the equations of motion, thus the sign convention is unim-
portant if a consistent sign convention is used.
This Routhian behaves like a Hamiltonian for the non-cyclic variables which are expressed in terms of $q$
and $p$ appropriate for a Hamiltonian. This Routhian writes the cyclic coordinates in terms of $q$ and $\dot{q}$
appropriate for a Lagrangian, which are treated assuming the Routhian $R_{noncyclic}$ is a negative Lagrangian for
these cyclic variables as summarized in table 8.2.
This non-cyclic Routhian $R_{noncyclic}$ is especially useful since it equals the Hamiltonian for the non-cyclic
variables, that is, the kinetic energy for motion of the cyclic variables has been removed. Note that since the
cyclic variables are constants of motion, then $R_{noncyclic}$ is a constant of motion if $H$ is a constant of motion.
However, $R_{noncyclic}$ does not equal the total energy since the coordinate transformation is time dependent,
that is, $R_{noncyclic}$ corresponds to the energy of the non-cyclic parts of the motion. For example, when used
to describe rotational motion, $R_{noncyclic}$ corresponds to the energy in the non-inertial rotating body-fixed
frame of reference. This is especially useful in treating rotating systems such as rotating galaxies, rotating
machinery, molecules, or rotating strongly-deformed nuclei as discussed in chapter 12.9.
The Lagrangian and Hamiltonian are the fundamental algebraic approaches to classical mechanics. The
Routhian reduction method is a valuable hybrid technique that exploits a trick to reduce the number of
variables that have to be solved for complicated problems encountered in science and engineering. The
Routhian $R_{noncyclic}$ provides the most useful approach for solving the equations of motion for rotating
molecules, deformed nuclei, or astrophysical objects in that it gives the Hamiltonian in the non-inertial
body-fixed rotating frame of reference ignoring the rotational energy of the frame. By contrast, the cyclic
Routhian $R_{cyclic}$ is especially useful to exploit Lagrangian mechanics for solving problems in rigid-body
rotation such as the Tippe Top described in example 13.13.
Note that the Lagrangian, Hamiltonian, plus both the $R_{cyclic}$ and $R_{noncyclic}$ Routhians, all are scalars
under rotation, that is, they are rotationally invariant. However, they may be expressed in terms of the
coordinates in either the stationary or a rotating frame. The major difference is that the Routhian includes
only subsets of the kinetic energy term $\sum_j p_j \dot{q}_j$. The relative merits of using Lagrangian, Hamiltonian, and
both the $R_{cyclic}$ and $R_{noncyclic}$ Routhian reduction methods, are illustrated by the following examples.
Take the time derivative of equation () and use () to substitute for $\dot{p}_\theta$ gives that
$$\ddot{\theta} - \frac{p_\phi^2 \cos \theta}{m^2 l^4 \sin^3 \theta} + \frac{g}{l} \sin \theta = 0$$
Note that equation (b) shows that $\phi$ is a cyclic coordinate. Thus $p_\phi$ is a constant of motion,
that is the angular momentum about the vertical axis is conserved. Note that although $p_\phi$ is a constant of
motion, $\dot{\phi} = \frac{p_\phi}{m l^2 \sin^2 \theta}$ is a function of $\theta$ and thus in general it is not conserved. There are various solutions
depending on the initial conditions. If $p_\phi = 0$ then the pendulum is just the simple pendulum discussed
previously that can oscillate, or rotate in the $\theta$ direction. The opposite extreme is where $p_\theta = 0$ where the
pendulum rotates in the $\phi$ direction with constant $\theta$. In general the motion is a complicated coupling of the
$\theta$ and $\phi$ motions.
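$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m l^2 \sin^2 \theta \, \dot{\phi}$$
The Routhian $R_{cyclic}$ behaves like a Hamiltonian for $\phi$ and like a Lagrangian $L' = -R_{cyclic}$
for $\theta$. Use of Hamilton’s canonical equations for $\phi$ give
$$\dot{\phi} = \frac{\partial R_{cyclic}}{\partial p_\phi} = \frac{p_\phi}{m l^2 \sin^2 \theta}$$
$$-\dot{p}_\phi = \frac{\partial R_{cyclic}}{\partial \phi} = 0$$
These two equations show that $p_\phi$ is a constant of motion given by
$$p_\phi = m l^2 \sin^2 \theta \, \dot{\phi} = \text{constant}$$
Note that the Hamiltonian only includes the kinetic energy for the $\phi$ motion which is a constant of motion,
but this energy does not equal the total energy. This solution is what is predicted by Noether’s theorem due
to the symmetry of the Lagrangian about the vertical axis.
Since $R_{cyclic}$ behaves like a Lagrangian for $\theta$ then the Lagrange equation for $\theta$ is
$$\Lambda_\theta R_{cyclic} = \frac{d}{dt} \frac{\partial R_{cyclic}}{\partial \dot{\theta}} - \frac{\partial R_{cyclic}}{\partial \theta} = 0$$
where the negative sign of the Lagrangian in $R_{cyclic}$ cancels. This leads to
$$m l^2 \ddot{\theta} = \frac{p_\phi^2 \cos \theta}{m l^2 \sin^3 \theta} - mgl \sin \theta$$
that is
$$\ddot{\theta} - \frac{p_\phi^2 \cos \theta}{m^2 l^4 \sin^3 \theta} + \frac{g}{l} \sin \theta = 0$$
This result is identical to the one obtained using Lagrangian mechanics in example 6.10 and Hamiltonian
mechanics given in example 8.6. The Routhian $R_{cyclic}$ simplified the problem to one degree of freedom by
absorbing into the Hamiltonian the ignorable cyclic coordinate $\phi$ and its conserved conjugate momentum $p_\phi$.
Note that the central term in this equation is the centrifugal term which is due to rotation about the vertical
axis. This term is zero for plane pendulum motion when $p_\phi = 0$.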
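The reduced one-dimensional equation of motion can be explored numerically. The following is a minimal sketch that assumes the spherical-pendulum equation $\ddot{\theta} = \frac{p_\phi^2 \cos\theta}{m^2 l^4 \sin^3\theta} - \frac{g}{l}\sin\theta$ with $\theta$ measured from the downward vertical; the function names and the numerical values are hypothetical. It finds the value of $p_\phi$ for which $\theta$ stays constant (steady conical motion) and checks that the corresponding rotation rate satisfies the familiar conical-pendulum relation $\dot{\phi}^2 = g/(l\cos\theta)$.

```python
import math

def theta_ddot(theta, p_phi, m=1.0, l=1.0, g=9.81):
    """Reduced equation of motion for theta with p_phi held constant."""
    centrifugal = p_phi**2 * math.cos(theta) / (m**2 * l**4 * math.sin(theta)**3)
    return centrifugal - (g / l) * math.sin(theta)

def conical_p_phi(theta, m=1.0, l=1.0, g=9.81):
    """p_phi for which theta_ddot = 0, valid for 0 < theta < pi/2."""
    return m * math.sqrt(g * l**3 / math.cos(theta)) * math.sin(theta)**2

theta0 = 0.6
p_phi = conical_p_phi(theta0)
# phi_dot = p_phi / (m l^2 sin^2 theta), here with m = l = 1
phi_dot = p_phi / (1.0 * 1.0**2 * math.sin(theta0)**2)
```

Setting `p_phi = 0` in `theta_ddot` removes the centrifugal term and leaves the plane-pendulum equation.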
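$$R_{noncyclic} = \frac{p_\theta^2}{2 m l^2} + \frac{p_\phi^2}{2 m l^2 \sin^2 \theta} - mgl \cos \theta - p_\phi \dot{\phi}$$
$$= \frac{p_\theta^2}{2 m l^2} - \frac{1}{2} m l^2 \sin^2 \theta \, \dot{\phi}^2 - mgl \cos \theta$$
This behaves like a negative Lagrangian for $\phi$ and a Hamiltonian for $\theta$. The conjugate momenta are
$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = -\frac{\partial R_{noncyclic}}{\partial \dot{\phi}} = m l^2 \sin^2 \theta \, \dot{\phi}$$
$$\dot{p}_\phi = \frac{\partial L}{\partial \phi} = -\frac{\partial R_{noncyclic}}{\partial \phi} = 0$$
that is, $p_\phi$ is a constant of motion.
Hamilton’s equations of motion give
$$\dot{\theta} = \frac{\partial R_{noncyclic}}{\partial p_\theta} = \frac{p_\theta}{m l^2}$$
$$-\dot{p}_\theta = \frac{\partial R_{noncyclic}}{\partial \theta} = -\frac{p_\phi^2 \cos \theta}{m l^2 \sin^3 \theta} + mgl \sin \theta$$
The first of these equations gives that
$$\frac{d}{dt} \dot{\theta} = \ddot{\theta} = \frac{\dot{p}_\theta}{m l^2}$$
Inserting this into the second equation gives
$$\ddot{\theta} - \frac{p_\phi^2 \cos \theta}{m^2 l^4 \sin^3 \theta} + \frac{g}{l} \sin \theta = 0$$
which is identical to the equation of motion derived using $R_{cyclic}$. The Hamiltonian in the rotating frame
is a constant of motion given by $R_{noncyclic}$ but it does not include the total energy.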
Note that these examples show that both forms of the Routhian, as well as the complete Lagrangian
formalism, shown in example 6.10, and complete Hamiltonian formalism, shown in example 8.6, all give the
same equations of motion. This illustrates that the Lagrangian, Hamiltonian, and Routhian mechanics all
give the same equations of motion and this applies both in the static inertial frame as well as a rotating frame
since the Lagrangian, Hamiltonian and Routhian all are scalars under rotation, that is, they are rotationally
invariant.
8.9 Example: Single particle moving in a vertical plane under the influence of
an inverse-square central force
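The Lagrangian for a single particle of mass $m$ moving in a vertical plane and subject to an inverse-square
central force, is specified by two generalized coordinates, $r$ and $\theta$
$$L = \frac{m}{2} (\dot{r}^2 + r^2 \dot{\theta}^2) + \frac{k}{r}$$
The ignorable coordinate is $\theta$ since it is cyclic. Let the constant conjugate momentum be denoted by
$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m r^2 \dot{\theta}$. Then the corresponding cyclic Routhian is
$$R_{cyclic}(r, \dot{r}, p_\theta) = p_\theta \dot{\theta} - L = \frac{p_\theta^2}{2 m r^2} - \frac{1}{2} m \dot{r}^2 - \frac{k}{r}$$
This Routhian is the equivalent one-dimensional potential $U_{eff}(r)$ minus the kinetic energy of radial motion.
Applying Hamilton’s equation to the cyclic coordinate $\theta$ gives
$$\dot{p}_\theta = 0 \qquad \dot{\theta} = \frac{p_\theta}{m r^2}$$
implying a solution
$$p_\theta = m r^2 \dot{\theta} = l$$
Since $R_{cyclic}$ behaves like a negative Lagrangian for $r$, the Lagrange equation for $r$ is
$$\Lambda_r R_{cyclic} = \frac{d}{dt} \frac{\partial R_{cyclic}}{\partial \dot{r}} - \frac{\partial R_{cyclic}}{\partial r} = 0$$
where the negative sign of $R_{cyclic}$ cancels. This leads to the radial solution
$$m \ddot{r} - \frac{l^2}{m r^3} + \frac{k}{r^2} = 0$$
where $l = p_\theta$ which is a constant of motion in the centrifugal term. Thus the problem has been reduced to a
one-dimensional problem in radius that is in a rotating frame of reference.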
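The reduced radial equation can be used directly once the conserved angular momentum is fixed. The following is a minimal sketch that assumes the radial equation $m\ddot{r} - \ell^2/(m r^3) + k/r^2 = 0$ for an attractive inverse-square force, where `ell` stands for the constant angular momentum and `k` for the force constant; the function names and values are hypothetical. It locates the circular-orbit radius, where the centrifugal and attractive terms balance, and checks the sign of the radial acceleration on either side of it.

```python
def radial_acceleration(r, ell, m=1.0, k=1.0):
    """r_ddot from m r_ddot - ell^2 / (m r^3) + k / r^2 = 0."""
    return ell**2 / (m**2 * r**3) - k / (m * r**2)

def circular_radius(ell, m=1.0, k=1.0):
    """Radius at which the centrifugal and attractive terms balance."""
    return ell**2 / (m * k)

ell = 1.3
r0 = circular_radius(ell)
inside = radial_acceleration(0.5 * r0, ell)    # centrifugal term dominates
outside = radial_acceleration(2.0 * r0, ell)   # attraction dominates
```

The radial acceleration is outward inside `r0` and inward outside it, so the circular orbit is a stable equilibrium of the one-dimensional problem.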
The first term is the usual mass times acceleration, while the second term arises from the rate of change of
mass times the velocity. The equation of motion for rocket motion is easily derived using either Lagrangian
or Hamiltonian mechanics by relating the rocket thrust to the generalized force
8.7. VARIABLE-MASS SYSTEMS 217
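$$\frac{\mu}{4} (L - y) \dot{y}^2 - \frac{1}{4} \mu g (L^2 + 2Ly - y^2) = -\frac{1}{4} \mu g L^2 \tag{8.73}$$
Solving for $\dot{y}^2$ gives
$$\dot{y}^2 = g \frac{(2Ly - y^2)}{L - y} \tag{8.74}$$
The acceleration of the falling arm, $\ddot{y}$, is given by taking the time derivative of equation 8.74
$$\ddot{y} = g + \frac{g \left( 2Ly - y^2 \right)}{2 (L - y)^2} \tag{8.75}$$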
2 ( − )
The rate of change in linear momentum for the moving right side of the chain, ̇ , is given by
(2 − 2 )
̇ = ̈ + ̇ ̇ = + (8.76)
2 ( − )
For this energy-conserving chain, the tension in the chain $T_0$ at the fixed end of the chain is given by
$$T_0 = \frac{\mu g}{2} (L + y) + \frac{1}{4} \mu \dot{y}^2 \tag{8.77}$$
Equations 8.74 and 8.76 imply that the tension diverges to infinity when $y \to L$. Calkin and March
measured the $y$ dependence of the chain tension at the support for the folded chain and observed the predicted
$y$ dependence. The maximum tension was $\simeq 25Mg$ which is consistent with that predicted using equation 8.77
after taking into account the finite size and mass of individual links in the chain. This result is very different
from that obtained using the erroneous assumption that the right arm falls with the free-fall acceleration $g$,
which implies a maximum tension $T_0 = 2Mg$. Thus the free-fall assumption disagrees with the experimental
results, in addition to violating energy conservation and the tenets of Lagrangian and Hamiltonian mechanics.
That is, the experimental result demonstrates unambiguously that the energy conservation predictions apply
in contradiction with the erroneous free-fall assumption.
The unusual feature of variable mass problems, such as the folded chain problem, is that the rate of change of momentum in equation 8.76 includes two contributions: the acceleration term $m_R\ddot{y}$ plus the variable mass term $\dot{m}_R\dot{y}$ that accounts for the transfer of matter at the intersection of the moving and stationary partitions of the chain. At the transition point of the chain, moving links are transferred from the moving section and are added to the stationary subsection. Since this moving section is falling downwards, and the stationary section is stationary, the transferred momentum is in a downward direction corresponding to an increased effective downward force. Thus the measured acceleration of the moving arm actually is faster than $g$. A related phenomenon is the loud cracking sound heard when cracking a whip.
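The behaviour of the energy-conserving folded chain can be tabulated directly from the expressions quoted above. The sketch below is illustrative only: it assumes the forms $\dot{y}^2 = g(2Ly - y^2)/(L - y)$, $\ddot{y} = g + g(2Ly - y^2)/(2(L-y)^2)$, and $T_0 = \frac{\mu g}{2}(L+y) + \frac{1}{4}\mu\dot{y}^2$, where $\mu$ is the mass per unit length, $L$ the chain length, and $y$ the distance fallen by the free end. The numerical values of $g$, $L$, and $\mu$ are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

def speed_squared(y, L, g=G):
    """Squared speed of the free end after falling a distance y (equation 8.74)."""
    return g * (2.0 * L * y - y * y) / (L - y)

def acceleration(y, L, g=G):
    """Acceleration of the falling arm (equation 8.75)."""
    return g + g * (2.0 * L * y - y * y) / (2.0 * (L - y) ** 2)

def support_tension(y, L, mu, g=G):
    """Tension at the fixed support (equation 8.77)."""
    return 0.5 * mu * g * (L + y) + 0.25 * mu * speed_squared(y, L, g)

def energy_residual(y, L, mu, g=G):
    """Left side minus right side of the energy relation (equation 8.73)."""
    kinetic = 0.25 * mu * (L - y) * speed_squared(y, L, g)
    potential = -0.25 * mu * g * (L * L + 2.0 * L * y - y * y)
    return kinetic + potential + 0.25 * mu * g * L * L

chain_length = 1.0   # metres (hypothetical)
mass_density = 1.0   # kg/m (hypothetical), so the chain weight is M*g = mu*L*g
weight = mass_density * chain_length * G
for fraction in (0.0, 0.5, 0.9, 0.99):
    y = fraction * chain_length
    print(fraction,
          acceleration(y, chain_length) / G,
          support_tension(y, chain_length, mass_density) / weight)
```

Under these assumptions the acceleration ratio exceeds one for every $y > 0$ and the tension ratio grows without bound as $y \to L$, in line with the discussion above.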
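$$L(y, \dot{y}) = \frac{\mu}{2} y \dot{y}^2 + \mu g \frac{y^2}{2} \tag{8.78}$$
$$p_y = \frac{\partial L}{\partial \dot{y}} = \mu y \dot{y} \tag{8.79}$$
$$H = \frac{p_y^2}{2 \mu y} - \mu g \frac{y^2}{2} = E \tag{8.80}$$
The Lagrangian and Hamiltonian are not explicitly time dependent, and the Hamiltonian equals the initial total energy, $E_0$. Thus energy conservation can be used to give that
$$E = \frac{1}{2}\mu y (\dot{y}^2 - g y) = E_0 \tag{8.81}$$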
Lagrange’s equation of motion gives
1
̇ = ̈ + ̇ ̇ = + ̇ 2 = − 0 (8.82)
2
The important difference between the folded chain and falling chain is that the moving component of the falling chain is gaining mass with time rather than losing mass. Also the tension in the chain $T_0$ reduces the acceleration of the falling chain, making it less than the free-fall value $g$. This is in contrast to the folded chain system where the acceleration exceeds $g$.
The above discussion shows that Lagrangian and Hamiltonian mechanics can be applied to variable-mass systems if both the donor and receptor degrees of freedom are included to ensure that the total mass is conserved.
8.8 Summary
Hamilton’s equations of motion
Inserting the generalized momentum into Jacobi’s generalized energy relation was used to define the Hamiltonian function to be
$$H(\mathbf{q}, \mathbf{p}, t) = \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.3}$$
The Legendre transform of the Lagrange-Euler equations led to Hamilton’s equations of motion.
$$\dot{q}_j = \frac{\partial H}{\partial p_j} \tag{8.25}$$
$$\dot{p}_j = -\frac{\partial H}{\partial q_j} + \left[\sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC}\right] \tag{8.26}$$
where
$$\frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} \tag{8.24}$$
The $q_j, p_j$ are treated as independent canonical variables. Lagrange was the first to derive the canonical equations but he did not recognize them as a basic set of equations of motion. Hamilton derived the canonical equations of motion from his fundamental variational principle and made them the basis for a far-reaching theory of dynamics. Hamilton’s equations give $2n$ first-order differential equations for $q_j, p_j$ for each of the $n$ degrees of freedom. Lagrange’s equations give $n$ second-order differential equations for the variables $q_j, \dot{q}_j$.
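As a concrete illustration of working with pairs of first-order equations, the sketch below integrates $\dot{q} = \partial H/\partial p$ and $\dot{p} = -\partial H/\partial q$ for a one-dimensional harmonic oscillator with no constraint or excluded forces. The choice of system, the unit mass and spring constant, the finite-difference evaluation of the partial derivatives, and the Runge-Kutta integrator are all assumptions made for the example rather than anything prescribed by the text.

```python
import math

def hamiltonian(q, p, m=1.0, k=1.0):
    """H = p^2/(2m) + k q^2/2 for a harmonic oscillator (assumed example system)."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

def hamilton_rhs(q, p, h=1e-6):
    """Return (dq/dt, dp/dt) = (dH/dp, -dH/dq) using central differences."""
    dH_dp = (hamiltonian(q, p + h) - hamiltonian(q, p - h)) / (2.0 * h)
    dH_dq = (hamiltonian(q + h, p) - hamiltonian(q - h, p)) / (2.0 * h)
    return dH_dp, -dH_dq

def rk4_step(q, p, dt):
    """Advance the two first-order equations by one fourth-order Runge-Kutta step."""
    k1q, k1p = hamilton_rhs(q, p)
    k2q, k2p = hamilton_rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = hamilton_rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = hamilton_rhs(q + dt * k3q, p + dt * k3p)
    q_new = q + dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
    p_new = p + dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q_new, p_new

def evolve(q, p, t_end, dt=1e-3):
    """Integrate Hamilton's equations from t = 0 to approximately t_end."""
    for _ in range(int(round(t_end / dt))):
        q, p = rk4_step(q, p, dt)
    return q, p

period = 2.0 * math.pi   # period for m = k = 1
q_final, p_final = evolve(1.0, 0.0, period)
print(q_final, p_final, hamiltonian(q_final, p_final))
```

After one period the phase-space point should return close to its starting value, and the Hamiltonian should remain close to its initial value of 0.5.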
Routhian reduction technique
The Routhian reduction technique is a hybrid of Lagrangian and Hamiltonian mechanics that exploits the advantages of both approaches for solving problems involving cyclic variables. It is especially useful for solving motion in rotating systems in science and engineering. Two Routhians are used frequently for solving the equations of motion of rotating systems. Assuming that the variables between $1 \le i \le s$ are non-cyclic, while the $m$ variables between $s+1 \le i \le n$ are ignorable cyclic coordinates, then the two Routhians are:
$$R_{cyclic}(q_1,\ldots,q_n; \dot{q}_1,\ldots,\dot{q}_s; p_{s+1},\ldots,p_n; t) = \sum_{cyclic}^{m} p_i \dot{q}_i - L = H - \sum_{noncyclic}^{s} p_i \dot{q}_i \tag{8.65}$$
$$R_{noncyclic}(q_1,\ldots,q_n; p_1,\ldots,p_s; \dot{q}_{s+1},\ldots,\dot{q}_n; t) = \sum_{noncyclic}^{s} p_i \dot{q}_i - L = H - \sum_{cyclic}^{m} p_i \dot{q}_i \tag{8.68}$$
The Routhian $R_{cyclic}$ is a negative Lagrangian for the non-cyclic variables between $1 \le i \le s$, where $s = n - m$, and is a Hamiltonian for the $m$ cyclic variables between $s+1 \le i \le n$. Since the cyclic variables are constants of the Hamiltonian, their solution is trivial, and the number of variables included in the Lagrangian is reduced from $n$ to $s = n - m$. The Routhian $R_{cyclic}$ is useful for solving some problems in classical mechanics. The Routhian $R_{noncyclic}$ is a Hamiltonian for the non-cyclic variables between $1 \le i \le s$, and is a negative Lagrangian for the $m$ cyclic variables between $s+1 \le i \le n$. Since the cyclic variables are constants of motion, the Routhian $R_{noncyclic}$ also is a constant of motion but it does not equal the total energy since the coordinate transformation is time dependent. The Routhian $R_{noncyclic}$ is especially valuable for solving rotating many-body systems such as galaxies, molecules, or nuclei, since the Routhian $R_{noncyclic}$ is the Hamiltonian in the rotating body-fixed coordinate frame.
Variable mass systems:
Two examples of heavy flexible chains falling in a uniform gravitational field were used to illustrate
how variable mass systems can be handled using Lagrangian and Hamiltonian mechanics. The falling-mass
system is conservative assuming that both the donor plus the receptor body systems are included.
Comparison of Lagrangian and Hamiltonian mechanics
Lagrangian and Hamiltonian dynamics are two powerful and related variational algebraic formulations of mechanics that are based on Hamilton’s action principle. They can be applied to any conservative degrees of freedom as discussed in chapters 6, 8, and 15. Lagrangian and Hamiltonian mechanics both concentrate solely on active forces and can ignore internal forces. They can handle many-body systems and allow convenient generalized coordinates of choice. This ability is impractical or impossible using Newtonian mechanics. Thus it is natural to compare the relative advantages of these two algebraic formalisms in order to decide which should be used for a specific problem.
For a system with $n$ generalized coordinates, plus $m$ constraint forces that are not required to be known, the Lagrangian approach, using a minimal set of generalized coordinates, reduces to only $s = n - m$ second-order differential equations and unknowns compared to the Newtonian approach where there are $n + m$ unknowns. Alternatively, use of Lagrange multipliers allows determination of the constraint forces resulting in $n + m$ second-order equations and unknowns. The Lagrangian potential function is limited to conservative forces; Lagrange multipliers can be used to handle holonomic forces of constraint, while generalized forces can be used to handle non-conservative and non-holonomic forces. The advantage of the Lagrange equations of motion is that they can deal with any type of force, conservative or non-conservative, and they directly determine $q$, $\dot{q}$ rather than $q$, $p$ which then requires relating $p$ to $\dot{q}$.
For a system with $n$ generalized coordinates, the Hamiltonian approach determines $2n$ first-order differential equations which are easier to solve than second-order equations. However, the $2n$ solutions must be combined to determine the equations of motion. The Hamiltonian approach is superior to the Lagrange approach in its ability to obtain an analytical solution of the integrals of the motion. Hamiltonian dynamics also has a means of determining the unknown variables for which the solution assumes a soluble form. Important applications of Hamiltonian mechanics are to quantum mechanics and statistical mechanics, where quantum analogs of $q_i$ and $p_i$ can be used to relate to the fundamental variables of Hamiltonian mechanics. This does not apply for the variables $q_i$ and $\dot{q}_i$ of Lagrangian mechanics. The Hamiltonian approach is especially powerful when the system has $m$ cyclic variables, since then the $m$ conjugate momenta $p_i$ are constants. Thus the $m$ conjugate variables $(q_i, p_i)$ can be factored out of the Hamiltonian, which reduces the number of conjugate variables required to $n - m$. This is not possible using the Lagrangian approach since, even though the $m$ coordinates $q_i$ can be factored out, the velocities $\dot{q}_i$ still must be included, thus the $n$ conjugate variables must be included. The Lagrange approach is advantageous for obtaining a numerical solution of systems in classical mechanics. However, Hamiltonian mechanics expresses the variables in terms of the fundamental canonical variables $(\mathbf{q}, \mathbf{p})$ which provides a more fundamental insight into the underlying physics.$^{2}$
2 Recommended reading: "Classical Mechanics" H. Goldstein, Addison-Wesley, Reading (1950). The present chapter
closely follows the notation used by Goldstein to facilitate cross-referencing and reading the many other textbooks that have
adopted this notation.
Workshop exercises
1. A block of mass $m$ rests on an inclined plane making an angle $\theta$ with the horizontal. The inclined plane (a triangular block of mass $M$) is free to slide horizontally without friction. The block of mass $m$ is also free to slide on the larger block of mass $M$ without friction.
2. Discuss among yourselves the following four conditions that can exist for the Hamiltonian and give several examples of systems exhibiting each of the four conditions.
(a) The Hamiltonian is conserved and equals the total mechanical energy.
(b) The Hamiltonian is conserved but does not equal the total mechanical energy.
(c) The Hamiltonian is not conserved but does equal the total mechanical energy.
(d) The Hamiltonian is not conserved and does not equal the total mechanical energy.
3. Compare the Lagrangian formalism and the Hamiltonian formalism by creating a two-column chart. Label one side “Lagrangian” and the other side “Hamiltonian” and discuss the similarities and differences. Here are some ideas to get you started:
4. It can be shown that if $L(q, \dot{q}, t)$ is the Lagrangian of a particle moving in one dimension, then $L'$ is an equivalent Lagrangian, where
$$L'(q, \dot{q}, t) = L(q, \dot{q}, t) + \frac{df}{dt}$$
and $f(q, t)$ is an arbitrary function. This problem explores the consequences of this on the Hamiltonian formalism.
(a) Relate the new canonical momentum $p'$, for $L'$, to the old canonical momentum $p$, for $L$.
(b) Express the new Hamiltonian $H'(q', p', t)$ for $L'$ in terms of the old Hamiltonian $H(q, p, t)$ and $f$.
(c) Explicitly show that the new Hamilton’s equations for $H'$ are equivalent to the old Hamilton’s equations for $H$.
5. A massless hoop of radius $R$ is rotating about an axis perpendicular to its central axis at constant angular velocity $\omega$. A mass $m$ can freely slide around the hoop.
6. Consider a pendulum of length $L$ attached to the end of a rod of length $l$. The rod is rotating at constant angular velocity $\omega$ in the plane. Assume the pendulum is always taut.
Problems
1) A particle of mass in a gravitational field slides on the inside of a smooth parabola of revolution whose axis is
vertical. Using the distance from the axis and the azimuthal angle as generalized coordinates, find the following.
a) The Lagrangian of the system.
b) The generalized momenta and the corresponding Hamiltonian
c) The equation of motion for the coordinate as a function of time.
d) If
= 0 show that the particle can execute small oscillations about the lowest point of the paraboloid and
find the frequency of these oscillations.
2) Consider a particle of mass which is constrained to move on the surface of a sphere of radius . There are no
external forces of any kind acting on the particle.
a) What is the number of generalized coordinates necessary to describe the problem?
b) Choose a set of generalized coordinates and write the Lagrangian of the system.
c) What is the Hamiltonian of the system? Is it conserved?
d) Prove that the motion of the particle is along a great circle of the sphere.
3. A block of mass is attached to a wedge of mass by a spring with spring constant . The inclined frictionless
surface of the wedge makes an angle to the horizontal. The wedge is free to slide on a horizontal frictionless surface
as shown in the figure.
a) Given that the relaxed length of the spring is , find the values 0 when both book and wedge are stationary.
b) Find the Lagrangian for the system as a function of the coordinate of the wedge and the length of spring .
Write down the equations of motion.
c) What is the natural frequency of vibration?
4. A fly-ball governor comprises two masses connected by 4 hinged arms of length to a vertical shaft and to a
mass which can slide up or down the shaft without friction in a uniform vertical gravitational field as shown in
the figure. The assembly is constrained to rotate around the axis of the vertical shaft with same angular velocity as
that of the vertical shaft. Neglect the mass of the arms, air friction, and assume that the mass has a negligible
moment of inertia. Assume that the whole system is constrained to rotate with a constant angular velocity 0 .
a) Choose suitable coordinates and use the Lagrangian to derive equations of motion of the system around the
equilibrium position.
b) Determine the height of the mass above its lowest position as a function of 0 .
c) Find the frequency of small oscillations about this steady motion.
d) Derive a Routhian that provides the Hamiltonian in the rotating system.
e) Is the total energy of the fly-ball governor in the rotating frame of reference constant in time?
f) Suppose that the shaft and assembly are not constrained to rotate at a constant angular velocity 0 , that is,
it is allowed to rotate freely at angular velocity ̇. What is the difference in the overall motion?
5. A rigid, straight, frictionless, massless rod rotates about the $z$ axis at an angular velocity $\dot{\theta}$. A mass $m$ slides along the frictionless rod and is attached to the rod by a massless spring of spring constant $k$.
a) Derive the Lagrangian and the Hamiltonian
b) Derive the equations of motion in the stationary frame using Hamiltonian mechanics.
c) What are the constants of motion?
d) If the rotation is constrained to have a constant angular velocity $\dot{\theta} = \omega$ then is the non-cyclic Routhian $R_{noncyclic} = H - p_\theta \dot{\theta}$ a constant of motion, and does it equal the total energy?
e) Use the non-cyclic Routhian $R_{noncyclic}$ to derive the radial equation of motion in the rotating frame of reference for the cranked system with $\dot{\theta} = \omega$.
6. A thin uniform rod of length 2 and mass is suspended from a massless string of length tied to a nail. Initially
the rod hangs vertically. A weak horizontal force is applied to the rod’s free end.
a) Write the Lagrangian for this system.
b) For very short times such that all angles are small, determine the angles that string and the rod make with
the vertical. Start from rest at = 0
c) Draw a diagram to illustrate the initial motion of the rod.
7. A uniform ladder of mass $M$ and length $2L$ is leaning against a frictionless vertical wall with its feet on a frictionless horizontal floor. Initially the stationary ladder is released at an angle $\theta_0 = 60^\circ$ to the floor. Assume that gravitation field $g = 9.81$ m/s$^2$ acts vertically downward and that the moment of inertia of the ladder about its midpoint is $I = \frac{1}{3}ML^2$.
a) Derive the Lagrangian
b) Derive the Hamiltonian
c) Explain if the Hamiltonian is conserved and/or if it equals the total energy
d) Use the Lagrangian to derive the equations of motion
e) Derive the angle at which the ladder loses contact with the vertical wall?
8. The classical mechanics exam induces Jacob to try his hand at bungee jumping. Assume Jacob’s mass
is suspended in a gravitational field by the bungee of unstretched length and spring constant . Besides the
longitudinal oscillations due to the bungee jump, Jacob also swings with plane pendulum motion in a vertical plane.
Use polar coordinates , neglect air drag, and assume that the bungee always is under tension.
a; Derive the Lagrangian
b; Determine Lagrange’s equation of motion for angular motion and identify by name the forces contributing to
the angular motion.
c; Determine Lagrange’s equation of motion for radial oscillation and identify by name the forces contributing to
the tension in the spring.
d; Derive the generalized momenta
e; Determine the Hamiltonian and give all of Hamilton’s equations of motion.
Chapter 9
9.1 Introduction
Hamilton’s principle of stationary action was introduced in two papers published by Hamilton in 1834 and 1835. As mentioned in the Prologue, Hamilton’s Action Principle is the foundation of the hierarchy of three philosophical stages that are used in applying analytical mechanics. The first stage is to use Hamilton’s Action Principle to derive either the Hamiltonian or the Lagrangian for the system. The second stage is to use either Lagrangian mechanics, or Hamiltonian mechanics, to derive the equations of motion for the system. The third stage is to solve these equations of motion for the assumed initial conditions. Lagrange had pioneered Lagrangian mechanics in 1788 based on d’Alembert’s Principle. Hamilton’s Action Principle now underlies theoretical physics, and many other disciplines in mathematics and economics. In 1834 Hamilton was seeking a theory of optics when he developed both his Action Principle and the field of Hamiltonian mechanics.
Hamilton’s Action Principle is based on defining the action functional$^{1}$ $S$ for $n$ generalized coordinates which are expressed by the vector $\mathbf{q}$ and their corresponding velocity vector $\dot{\mathbf{q}}$.
$$S = \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt \tag{9.1}$$
The scalar action $S$ is a functional of the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$, integrated between an initial time $t_i$ and final time $t_f$. In principle, higher order time derivatives of the generalized coordinates could be included, but most systems in classical mechanics are described adequately by including only the generalized coordinates, plus their velocities. The definition of the action functional allows for more general Lagrangians than the simple Standard Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t) = T(\dot{\mathbf{q}}, t) - U(\mathbf{q}, t)$ that has been used throughout chapters 5 − 8. Hamilton stated that the actual trajectory of a mechanical system is that given by requiring that the action functional is stationary with respect to change of the variables. The action functional is stationary when the variational principle can be written in terms of a virtual infinitesimal displacement, $\delta$, to be
$$\delta S = \delta \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt = 0 \tag{9.2}$$
Typically the stationary point corresponds to a minimum of the action functional. Applying variational calculus to the action functional leads to the same Lagrange equations of motion for systems as the equations derived using d’Alembert’s Principle, if the additional generalized force terms, $\sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}$, are omitted in the corresponding equations of motion.
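The stationarity of the action can be illustrated with a discretized path. The sketch below is a simple numerical illustration under stated assumptions: a unit mass in a uniform gravitational field with $L = \frac{1}{2}m\dot{y}^2 - mgy$, fixed end points $y(0) = y(t_f) = 0$, and a hypothetical family of trial paths formed by adding $\epsilon \sin(\pi t/t_f)$ to the true parabolic trajectory. The action is approximated by a midpoint sum.

```python
import math

def discrete_action(path, dt, m=1.0, g=9.81):
    """Midpoint-rule approximation of S = integral of (T - U) dt with U = m*g*y."""
    total = 0.0
    for y0, y1 in zip(path, path[1:]):
        velocity = (y1 - y0) / dt
        y_mid = 0.5 * (y0 + y1)
        total += (0.5 * m * velocity * velocity - m * g * y_mid) * dt
    return total

def trial_path(eps, n_steps=200, t_total=1.0, g=9.81):
    """True parabola plus a perturbation eps*sin(pi*t/t_total) vanishing at both ends."""
    dt = t_total / n_steps
    v0 = 0.5 * g * t_total   # launch speed that returns to y = 0 at t_total
    path = []
    for i in range(n_steps + 1):
        t = i * dt
        y_true = v0 * t - 0.5 * g * t * t
        path.append(y_true + eps * math.sin(math.pi * t / t_total))
    return path, dt

def action_for(eps):
    """Action of the trial path labelled by the perturbation amplitude eps."""
    path, dt = trial_path(eps)
    return discrete_action(path, dt)

for eps in (-0.1, -0.01, 0.0, 0.01, 0.1):
    print(eps, action_for(eps))
```

For this particular Lagrangian the unperturbed path should give the smallest action in the family, and the first-order change of the action with $\epsilon$ should vanish at $\epsilon = 0$.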
These are used to derive the equations of motion, which then are solved for an assumed set of ini-
tial conditions. Prior to Hamilton’s Action Principle, Lagrange developed Lagrangian mechanics based on
d’Alembert’s Principle while the Newtonian equations of motion are defined in terms of Newton’s Laws of
Motion.
1 The term "action functional" was named "Hamilton’s Principal Function" in older texts. The name usually is abbreviated
Note that equation 9.7 includes contributions from the entire path of the integral as well as the variations at the ends of the curve and the $\Delta t$ terms. Equation 9.7 leads to the following two pioneering principles of least action in variational mechanics that were developed by Hamilton.
For independent generalized coordinates $q_j$, the integrand in brackets vanishes leading to the Euler-Lagrange equations. Conversely, if the Euler-Lagrange equations in 9.8 are satisfied, then $\delta S = 0$, that is, the path is stationary. This leads to the statement that the path in configuration space between two configurations $\mathbf{q}(t_i)$ and $\mathbf{q}(t_f)$ that the system occupies at times $t_i$ and $t_f$ respectively, is that for which the action is stationary. This is a statement of Hamilton’s Principle.
9.2. HAMILTON’S PRINCIPLE OF STATIONARY ACTION
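where $\delta q_j$ and $\dot{q}_j$ are evaluated at $t_i$ and $t_f$. Then equation 9.7 reduces to
$$\delta S = \left[\sum_j \frac{\partial L}{\partial \dot{q}_j}\delta q_j + L\Delta t\right]_{t_i}^{t_f} = \left[\sum_j \frac{\partial L}{\partial \dot{q}_j}\Delta q_j + \left(-\sum_j \frac{\partial L}{\partial \dot{q}_j}\dot{q}_j + L\right)\Delta t\right]_{t_i}^{t_f} \tag{9.10}$$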
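The integrand, $I = [\mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t)]$, in this modified Hamilton’s principle can be used in the Euler-Lagrange equations for $j = 1, 2, 3, \ldots$ to give
$$\frac{d}{dt}\left(\frac{\partial I}{\partial \dot{q}_j}\right) - \frac{\partial I}{\partial q_j} = \dot{p}_j + \frac{\partial H}{\partial q_j} = 0 \tag{9.15}$$
Similarly, the other Euler-Lagrange equations give
$$\frac{d}{dt}\left(\frac{\partial I}{\partial \dot{p}_j}\right) - \frac{\partial I}{\partial p_j} = -\dot{q}_j + \frac{\partial H}{\partial p_j} = 0 \tag{9.16}$$
Thus Hamilton’s principle of least action leads to Hamilton’s equations of motion, that is, equations 9.15 and 9.16.
The total time derivative of the action $S$, which is a function of the coordinates and time, is
$$\frac{dS}{dt} = \frac{\partial S}{\partial t} + \sum_j \frac{\partial S}{\partial q_j}\dot{q}_j = \frac{\partial S}{\partial t} + \mathbf{p}\cdot\dot{\mathbf{q}} \tag{9.17}$$
Combining equations 9.17 and 9.18 gives the Hamilton-Jacobi equation which is discussed in chapter 15.4.
$$\frac{\partial S}{\partial t} + H(\mathbf{q}, \mathbf{p}, t) = 0 \tag{9.19}$$
In summary, Hamilton’s principle of least action leads directly to Hamilton’s equations of motion (9.15, 9.16) plus the Hamilton-Jacobi equation (9.19). Note that the above discussion has derived both Hamilton’s Action Principle (9.8) and Hamilton’s equations of motion (9.15, 9.16) directly from Hamilton’s variational concept of stationary action, $S$, without explicitly invoking the Lagrangian.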
The abbreviated action can be simplified assuming use of the standard Lagrangian $L = T - U$ with a velocity-independent potential $U$; then equation 8.4 gives
$$S_0 \equiv \int \sum_j p_j \dot{q}_j\, dt = \int (L + H)\, dt = \int 2T\, dt = \int \mathbf{p}\cdot d\mathbf{q} \tag{9.22}$$
Abbreviated action provides for use of a simplified form of the principle of least action that is based on the kinetic energy, and not potential energy. For conservative systems it determines the path of the motion, but not the time dependence of the motion. Consider virtual motions where the path satisfies energy conservation, and where the end points are held fixed, that is $\delta q_j = 0$, but allow for a variation $\delta t$ in the final time. Then using the Hamilton-Jacobi equation, 9.19,
$$\delta S = -H\delta t = -E\delta t \tag{9.23}$$
However, equation 9.21 gives that
$$\delta S = \delta S_0 - E\delta t \tag{9.24}$$
Therefore
$$\delta S_0 = 0 \tag{9.25}$$
That is, the abbreviated action has a minimum with respect to all paths that satisfy the conservation of energy which can be written as
$$\delta S_0 = \delta \int 2T\, dt = 0 \tag{9.26}$$
Equation 9.26 is called Maupertuis’ least-action principle, which he proposed in 1744 based on Fermat’s Principle in optics. Credit for the formulation of least action commonly is given to Maupertuis; however, the Maupertuis principle is similar to the use of least action applied to the “vis viva”, as was proposed by Leibniz four decades earlier. Maupertuis used teleological arguments, rather than scientific rigor, because of his limited mathematical capabilities. In 1744 Euler provided a scientifically rigorous argument, presented above, that underlies the Maupertuis principle. Euler derived the correct variational relation for the abbreviated action to be
$$\delta S_0 = \delta \int \sum_j p_j\, dq_j = 0 \tag{9.27}$$
Hamilton’s use of the principle of least action to derive both Lagrangian and Hamiltonian mechanics is a remarkable accomplishment. It underlies both Lagrangian and Hamiltonian mechanics and confirmed the conjecture of Maupertuis.
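The above relation assumes that the doubled variables $(\mathbf{q}_1, \dot{\mathbf{q}}_1)$ and $(\mathbf{q}_2, \dot{\mathbf{q}}_2)$ are decoupled from each other. More generally one can assume that the two sets of variables are coupled by some arbitrary function $K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)$. Then the action can be written as
$$S(\mathbf{q}_1, \mathbf{q}_2) = \int_{t_i}^{t_f} \left[L(\mathbf{q}_1, \dot{\mathbf{q}}_1, t) - L(\mathbf{q}_2, \dot{\mathbf{q}}_2, t) + K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)\right] dt \tag{9.29}$$
The effective Lagrangian for this doubled system then can be defined as
$$\Lambda(\mathbf{q}_1, \mathbf{q}_2, \dot{\mathbf{q}}_1, \dot{\mathbf{q}}_2, t) \equiv \left[L(\mathbf{q}_1, \dot{\mathbf{q}}_1, t) - L(\mathbf{q}_2, \dot{\mathbf{q}}_2, t) + K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)\right] \tag{9.30}$$
The coupling term $K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)$ for the doubled system of degrees of freedom must satisfy the following two properties.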
(a) If it can be expressed as the difference of two scalar potentials, $\Delta U(\mathbf{q}_1, \mathbf{q}_2) = U(\mathbf{q}_1) - U(\mathbf{q}_2)$, then it can be absorbed into the potential term for each of the doubled variables in the Lagrangian. This implies that $K = 0$ and there is no reason to double the number of degrees of freedom because the system is conservative. Thus $K$ describes generalized forces that are not derivable from potential energy, that is, not conservative.
(b) A second property of the coupling term $K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)$ is that it must be antisymmetric under interchange of the arbitrary labels $1 \leftrightarrow 2$. That is,
$$K(\mathbf{q}_2, \dot{\mathbf{q}}_2, \mathbf{q}_1, \dot{\mathbf{q}}_1, t) = -K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)$$
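where $\mathbf{q}_{1,2}(t, 0)$ are the coordinates for which the action is stationary, $\epsilon \ll 1$, and where $\boldsymbol{\eta}_{1,2}(t)$ are arbitrary functions of time denoting virtual displacements of the paths. The doubled system has two independent paths connecting the two initial boundary conditions at $t_i$, and it requires that these paths intersect at $t_f$. The variational system for the two intersecting paths requires specifying four conditions, two per path. Two of the four conditions are determined by requiring that at $t_i$ the initial boundary conditions satisfy $\boldsymbol{\eta}_{1,2}(t_i) = 0$. The remaining two conditions are derived by requiring that the variation of the action $S(\mathbf{q}_1, \mathbf{q}_2)$ satisfies
$$\left[\frac{dS}{d\epsilon}\right]_{\epsilon=0} = 0 = \int_{t_i}^{t_f} dt \left\{\boldsymbol{\eta}_1\cdot\left[\frac{\partial \Lambda}{\partial \mathbf{q}_1} - \frac{d\boldsymbol{\pi}_1}{dt}\right]_{\epsilon=0} - \boldsymbol{\eta}_2\cdot\left[\frac{\partial \Lambda}{\partial \mathbf{q}_2} - \frac{d\boldsymbol{\pi}_2}{dt}\right]_{\epsilon=0}\right\} + \left[\boldsymbol{\eta}_1\cdot\boldsymbol{\pi}_1 - \boldsymbol{\eta}_2\cdot\boldsymbol{\pi}_2\right]_{t=t_f} \tag{9.34}$$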
The canonical momenta 12 conjugate to the doubled coordinates q12 are defined using the nonconser-
vative Lagrangian Λ to be
where the superscript designates the solution based on the initial conditions. Note that the conjugate
momentum 1 = (q 1 q̇1 )
̇1 ()
while the (q1q̇̇1()
q2 q̇2 )
term is part of the total momentum due to the
1
nonconservative interaction. Similarly the momentum for the second path is
The last term in equation 9.34, that is, the term $\left[\boldsymbol{\eta}_1\cdot\boldsymbol{\pi}_1 - \boldsymbol{\eta}_2\cdot\boldsymbol{\pi}_2\right]_{t=t_f}$, results from integration by parts, which will vanish if
$$\boldsymbol{\eta}_1(t_f)\cdot\boldsymbol{\pi}_1(t_f) = \boldsymbol{\eta}_2(t_f)\cdot\boldsymbol{\pi}_2(t_f) \tag{9.37}$$
The equality condition at the intersection of the two paths at $t_f$ requires that
Therefore equations 9.38 and 9.39 constitute the equality condition that must be satisfied when the two paths intersect at $t_f$. The equality condition ensures that the boundary term for integration by parts in equation 9.34 will vanish for arbitrary variations provided that the two unspecified paths agree at the final time $t_f$. Similarly the conjugate momenta $\boldsymbol{\pi}_1(t_f)$, $\boldsymbol{\pi}_2(t_f)$ must agree, but otherwise are unspecified. As a consequence, the equality condition ensures that the variational principle is consistent with the final state at $t_f$ not being specified. That is, the equations of motion are only specified by the initial boundary conditions of the time asymmetric action for the doubled system.
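More physics insight is provided by using a more convenient parametrization of the coordinates in terms of their average and difference. That is, let
$$\mathbf{q}_+ \equiv \frac{\mathbf{q}_1 + \mathbf{q}_2}{2} \qquad \mathbf{q}_- \equiv \mathbf{q}_1 - \mathbf{q}_2 \tag{9.40}$$
Then the physical limit is
$$\mathbf{q}_+ \to \mathbf{q} \qquad \mathbf{q}_- \to 0 \tag{9.41}$$
That is, the average history is the relevant physical history, while the difference coordinate simply vanishes. For these coordinates, the nonconservative Lagrangian is $\Lambda(\mathbf{q}_+, \mathbf{q}_-, \dot{\mathbf{q}}_+, \dot{\mathbf{q}}_-, t)$ and the equality conditions reduce to
$$\mathbf{q}_-(t_f) = 0 \tag{9.42}$$
$$\boldsymbol{\pi}_-(t_f) = 0 \tag{9.43}$$
which implies that the physically relevant average (+) quantities are not specified at the final time $t_f$ in order to have a well-defined variational principle.
The canonical momenta are given by
$$\boldsymbol{\pi}_+ = \frac{\boldsymbol{\pi}_1 + \boldsymbol{\pi}_2}{2} = \frac{\partial \Lambda}{\partial \dot{\mathbf{q}}_-} \tag{9.44}$$
$$\boldsymbol{\pi}_- = \boldsymbol{\pi}_1 - \boldsymbol{\pi}_2 = \frac{\partial \Lambda}{\partial \dot{\mathbf{q}}_+} \tag{9.45}$$
The equations of motion can be written as
$$\frac{d}{dt}\frac{\partial \Lambda}{\partial \dot{\mathbf{q}}_\pm} = \frac{\partial \Lambda}{\partial \mathbf{q}_\pm} \tag{9.46}$$
Equation 9.46 is identically zero for the + subscript, while, in the physical limit (PL), the negative subscript gives that
$$\left[\frac{d}{dt}\frac{\partial \Lambda}{\partial \dot{\mathbf{q}}_-} - \frac{\partial \Lambda}{\partial \mathbf{q}_-}\right]_{PL} = 0 \tag{9.47}$$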
Substituting for the Lagrangian Λ gives that
∙ ¸
− =
−
≡ (q1 q̇1 ) (9.48)
̇− − − ̇−
Tsang, and Stein[Gal13, Gal14] for further discussion plus examples of applying this formalism to nonconservative systems in
classical mechanics, electromagnetic radiation, RLC circuits, fluid dynamics, and field theory.
9.3 Lagrangian
9.3.1 Standard Lagrangian
Lagrangian mechanics, as introduced in chapter 6, was based on the concepts of kinetic energy and potential energy. d’Alembert’s principle of virtual work was used to derive Lagrangian mechanics in chapter 6 and this led to the definition of the standard Lagrangian. That is, the standard Lagrangian was defined in chapter 6.2 to be the difference between the kinetic and potential energies.
Hamilton extended Lagrangian mechanics by defining Hamilton’s Principle, equation 9.2, which states that a dynamical system follows a path for which the action functional, that is, the time integral of the Lagrangian, is stationary. Chapter 6 showed that using the standard Lagrangian for defining the action functional leads to the Euler-Lagrange variational equations
$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) \tag{9.51}$$
The Lagrange multiplier terms handle the holonomic constraint forces and $Q_j^{EXC}$ handles the remaining excluded generalized forces. Chapters 6 − 8 showed that the use of the standard Lagrangian, with the Euler-Lagrange equations (9.51), provides a remarkably powerful and flexible way to derive second-order equations of motion for dynamical systems in classical mechanics.
Note that the Euler-Lagrange equations, expressed solely in terms of the standard Lagrangian (9.51), that is, excluding the $Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$ terms, are valid only under the following conditions:
1. The forces acting on the system, apart from any forces of constraint, must be derivable from scalar potentials.
2. The equations of constraint must be relations that connect the coordinates of the particles and may be functions of time, that is, the constraints are holonomic.
The $Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$ terms extend the range of validity of using the standard Lagrangian in the
1. The Lagrangian is indefinite with respect to addition of a constant to the scalar potential which cancels
out when the derivatives in the Euler-Lagrange differential equations are applied.
2. The Lagrangian is indefinite with respect to addition of a constant kinetic energy.
3. The Lagrangian is indefinite with respect to addition of a total time derivative of the form $L_2 \to L_1 + \frac{d}{dt}[\Lambda(q_i, t)]$ for any differentiable function $\Lambda(q_i, t)$ of the generalized coordinates plus time, that has continuous second derivatives.
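This last statement can be proved by considering a transformation between two related standard Lagrangians of the form
$$L_2(\mathbf{q}, \dot{\mathbf{q}}, t) = L_1(\mathbf{q}, \dot{\mathbf{q}}, t) + \frac{d\Lambda(\mathbf{q}, t)}{dt} = L_1(\mathbf{q}, \dot{\mathbf{q}}, t) + \left(\sum_j \frac{\partial \Lambda(\mathbf{q}, t)}{\partial q_j}\dot{q}_j + \frac{\partial \Lambda(\mathbf{q}, t)}{\partial t}\right) \tag{9.52}$$
This leads to a standard Lagrangian $L_2$ that has the same equations of motion as $L_1$, as is shown by substituting equation 9.52 into the Euler-Lagrange equations. That is,
$$\frac{d}{dt}\left(\frac{\partial L_2}{\partial \dot{q}_j}\right) - \frac{\partial L_2}{\partial q_j} = \frac{d}{dt}\left(\frac{\partial L_1}{\partial \dot{q}_j}\right) - \frac{\partial L_1}{\partial q_j} + \frac{d}{dt}\frac{\partial \Lambda(\mathbf{q}, t)}{\partial q_j} - \frac{\partial}{\partial q_j}\frac{d\Lambda(\mathbf{q}, t)}{dt} = \frac{d}{dt}\left(\frac{\partial L_1}{\partial \dot{q}_j}\right) - \frac{\partial L_1}{\partial q_j} \tag{9.53}$$
Thus even though the related Lagrangians $L_1$ and $L_2$ are different, they are completely equivalent in that they generate identical equations of motion.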
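One way to see why the added total time derivative cannot affect the equations of motion is that it changes the action only by a term fixed by the end points, so every path between the same end points is shifted by the same amount. The sketch below checks this numerically. It is an illustration only: the harmonic-oscillator Lagrangian, the gauge function $\Lambda(q, t) = q^2 t$, and the two trial paths are hypothetical choices made for the example.

```python
def lagrangian_1(q, v, t, m=1.0, k=1.0):
    """Standard Lagrangian L1 = T - U for a harmonic oscillator (assumed example)."""
    return 0.5 * m * v * v - 0.5 * k * q * q

def gauge_function(q, t):
    """Hypothetical gauge function Lambda(q, t) = q**2 * t."""
    return q * q * t

def lagrangian_2(q, v, t):
    """L2 = L1 + dLambda/dt, where dLambda/dt = (2*q*t)*v + q**2 for this Lambda."""
    return lagrangian_1(q, v, t) + 2.0 * q * t * v + q * q

def straight_path(t):
    """Trial path q(t) = t joining q(0) = 0 to q(1) = 1."""
    return t

def curved_path(t):
    """Different trial path q(t) = t**2 with the same end points."""
    return t * t

def action(lagrangian, path, t_start=0.0, t_end=1.0, n_steps=2000):
    """Midpoint-rule approximation of the action along the given path."""
    dt = (t_end - t_start) / n_steps
    total = 0.0
    for i in range(n_steps):
        t0 = t_start + i * dt
        q0, q1 = path(t0), path(t0 + dt)
        total += lagrangian(0.5 * (q0 + q1), (q1 - q0) / dt, t0 + 0.5 * dt) * dt
    return total

shift_straight = action(lagrangian_2, straight_path) - action(lagrangian_1, straight_path)
shift_curved = action(lagrangian_2, curved_path) - action(lagrangian_1, curved_path)
end_point_term = gauge_function(1.0, 1.0) - gauge_function(0.0, 0.0)
print(shift_straight, shift_curved, end_point_term)
```

Both shifts should agree with $\Lambda(q_f, t_f) - \Lambda(q_i, t_i)$ to within the discretization error, so the path that makes the action stationary is the same for $L_1$ and $L_2$.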
There is an unlimited range of equivalent standard Lagrangians that all lead to the same equations of
motion and satisfy the requirements of the Lagrangian. That is, there is no unique choice among the wide
range of equivalent standard Lagrangians expressed in terms of generalized coordinates. This discussion is
an example of gauge invariance in physics.
Modern theories in physics describe reality in terms of potential fields. Gauge invariance, which also is
called gauge symmetry, is a property of field theory for which different underlying fields lead to identical
observable quantities. Well-known examples are the static electric potential field and the gravitational
potential field where any arbitrary constant can be added to these scalar potentials with zero impact on the
observed static electric field or the observed gravitational field. Gauge theories constrain the laws of physics
in that the impact of gauge transformations must cancel out when expressed in terms of the observables.
Gauge symmetry plays a crucial role in both classical and quantal manifestations of field theory, e.g. it is
the basis of the Standard Model of electroweak and strong interactions.
Equivalent Lagrangians are a clear manifestation of gauge invariance as illustrated by equations 9.52 and 9.53, which show that adding any total time derivative of a scalar function $\Lambda(\mathbf{q}, t)$ to the Lagrangian has no observable consequences on the equations of motion. That is, although addition of the total time derivative of the scalar function $\Lambda(\mathbf{q}, t)$ changes the value of the Lagrangian, it does not change the equations of motion for the observables derived using equivalent standard Lagrangians.
For Lagrangian formulations of classical mechanics, the gauge invariance is readily apparent by direct
inspection of the Lagrangian.
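$$\mathbf{B} = \nabla \times \mathbf{A}$$
$$\mathbf{E} = -\nabla\Phi - \frac{\partial \mathbf{A}}{\partial t}$$
The equations of motion for a charge $q$ in an electromagnetic field can be obtained by using the Lagrangian
$$L = \frac{1}{2} m \mathbf{v}\cdot\mathbf{v} - q\left(\Phi - \mathbf{A}\cdot\mathbf{v}\right)$$
Consider the transformations $(\mathbf{A},\Phi) \rightarrow (\mathbf{A}',\Phi')$ in the transformed Lagrangian $L'$ where
$$\mathbf{A}' = \mathbf{A} + \nabla\Lambda(\mathbf{r},t)$$
$$\Phi' = \Phi - \frac{\partial \Lambda(\mathbf{r},t)}{\partial t}$$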
234 CHAPTER 9. HAMILTON’S ACTION PRINCIPLE
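Substituting the transformed potentials into the Lagrangian gives
$$L' = \frac{1}{2} m \mathbf{v}\cdot\mathbf{v} - q\left(\Phi' - \mathbf{A}'\cdot\mathbf{v}\right) = L + q\left(\frac{\partial \Lambda(\mathbf{r},t)}{\partial t} + \nabla\Lambda(\mathbf{r},t)\cdot\mathbf{v}\right) = L + q\frac{d\Lambda(\mathbf{r},t)}{dt}$$
Note that the additive term $q\frac{d\Lambda(\mathbf{r},t)}{dt}$ is an exact time differential. Thus the Lagrangian $L'$ is gauge invariant, implying that identical equations of motion are obtained using either of these equivalent Lagrangians.
The force fields $\mathbf{E}$ and $\mathbf{B}$ can be used to show that the above transformation is gauge-invariant. That is,
$$\mathbf{E}' = -\nabla\Phi' - \frac{\partial \mathbf{A}'}{\partial t} = -\nabla\Phi - \frac{\partial \mathbf{A}}{\partial t} = \mathbf{E}$$
$$\mathbf{B}' = \nabla\times\mathbf{A}' = \nabla\times\mathbf{A} = \mathbf{B}$$
That is, the additive terms due to the scalar field $\Lambda(\mathbf{r},t)$ cancel, since $\nabla\times\nabla\Lambda = 0$ and the two $\nabla\frac{\partial\Lambda}{\partial t}$ terms in $\mathbf{E}'$ have opposite signs. Thus the electromagnetic force fields following a gauge-invariant transformation are shown to be identical, in agreement with what is inferred directly by inspection of the Lagrangian.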
3. Equations of motion stage: The “equations-of-motion stage” uses the derived equations of motion to
solve for the motion of the system subject to a given set of initial boundary conditions. Nonconservative
forces, such as dissipative forces, that were not included at the primary and secondary stages, may be
added at the equations of motion stage.
Lagrange omitted the action stage when he used d’Alembert’s Principle to derive Lagrangian mechanics.
The Newtonian mechanics approach omits both the primary “action” stage, as well as the secondary “Hamil-
tonian/Lagrangian” stage, since Newton’s Laws of Motion directly specify the “equations-of-motion stage”.
Thus these did not allow exploiting the considerable advantages provided by use of action, the Lagrangian,
and the Hamiltonian. Newtonian mechanics requires that all the active forces be included when deriving the
equations of motion, which involves dealing with vector quantities. In Newtonian mechanics, symmetries
must be incorporated directly at the equations of motion stage, which is more difficult than when done at
the primary “action” stage, or the secondary “Lagrangian/Hamiltonian” stage. The “action” and “Hamil-
tonian/Lagrangian” stages allow for use of the powerful arsenal of mathematical techniques that have been
developed for applying variational principles.
There are considerable advantages to deriving the equations of motion based on Hamilton’s Principle, rather than deriving them using Newtonian mechanics. It is significantly easier to use variational principles to handle the scalar functionals, action, Lagrangian, and Hamiltonian, rather than starting at the equations-of-motion stage. For example, utilizing all three stages of algebraic mechanics facilitates accommodating
extra degrees of freedom, symmetries, and interactions. The symmetries identified by Noether’s theorem are
more easily recognized during the primary “action” and secondary “Hamiltonian/Lagrangian” stages rather
than at the subsequent “equations of motion” stage. Approximations made at the “action” stage are easier
to implement than at the “equations-of-motion” stage. Constrained motion is much more easily handled at
the primary “action”, or secondary “Hamilton/Lagrangian” stages, than at the equations-of-motion stage.
An important advantage of using Hamilton’s Action Principle is that there is a close relationship between action in classical and quantal mechanics, as discussed in chapters 15 and 18. The algebraic principles that underlie analytical mechanics naturally encompass applications to many branches of modern physics, such as relativistic mechanics, fluid motion, and field theory.
In summary, the use of the single fundamental invariant quantity, action, as described above, provides a
powerful and elegant framework, that was developed first for classical mechanics, but now is exploited in a
wide range of science, engineering, and economics. An important feature of using the algebraic approach to
classical mechanics is the tremendous arsenal of powerful mathematical techniques that have been developed
for use of variational calculus applied to Lagrangian and Hamiltonian mechanics. Some of these variational
techniques were presented in chapters 6, 7, 8, and 9, while others will be introduced in chapter 15.
9.5 Summary
Hamilton’s 1834 publication, introducing both Hamilton’s Principle of Stationary Action and Hamiltonian mechanics, marked the crowning achievement in the development of variational principles in classical mechanics. A fundamental advantage of Hamiltonian mechanics is that it uses the conjugate coordinates $\mathbf{q}, \mathbf{p}$, plus time $t$, which is a considerable advantage in most branches of physics and engineering. Compared
to Lagrangian mechanics, Hamiltonian mechanics has a significantly broader arsenal of powerful techniques
that can be exploited to obtain an analytical solution of the integrals of the motion for complicated sys-
tems, as described in chapter 15. In addition, Hamiltonian dynamics provides a means of determining the
unknown variables for which the solution assumes a soluble form, and is ideal for study of the fundamen-
tal underlying physics in applications to fields such as quantum or statistical physics. As a consequence,
Hamiltonian mechanics has become the preeminent variational approach used in modern physics.
This chapter has introduced and discussed Hamilton’s Principle of Stationary Action, which underlies
the elegant and remarkably powerful Lagrangian and Hamiltonian representations of algebraic mechanics.
The basic concepts employed in algebraic mechanics are summarized below.
Hamilton’s Action Principle: As discussed in chapter 9.2, Hamiltonian mechanics is built upon Hamilton’s action functional
$$S(\mathbf{q},\mathbf{p},t) = \int_{t_i}^{t_f} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt \tag{9.1}$$
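Generalized momentum $p_j$: In chapter 7.2, the generalized (canonical) momentum was defined in terms of the Lagrangian to be
$$p_j \equiv \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial \dot{q}_j} \tag{7.3}$$
Chapter 9.2.2 defined the generalized momentum in terms of the action functional to be
$$p_j = \frac{\partial S(\mathbf{q},\mathbf{p},t)}{\partial q_j} \tag{9.12}$$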
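Generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$: Jacobi’s Generalized Energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ was defined in equation 7.37 as
$$h(\mathbf{q},\dot{\mathbf{q}},t) \equiv \sum_j \left(\dot{q}_j \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial \dot{q}_j}\right) - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.37}$$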
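Hamiltonian function $H(\mathbf{q},\mathbf{p},t)$: The Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ was defined in terms of the generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ plus the generalized momentum. That is
$$H(\mathbf{q},\mathbf{p},t) \equiv h(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j p_j \dot{q}_j - L(\mathbf{q},\dot{\mathbf{q}},t) = \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.37}$$
where $\mathbf{p}, \mathbf{q}$ correspond to $n$-dimensional vectors, e.g. $\mathbf{q} \equiv (q_1, q_2, \ldots, q_n)$, and the scalar product $\mathbf{p}\cdot\dot{\mathbf{q}} = \sum_i p_i \dot{q}_i$. Chapter 8.2 used a Legendre transformation to derive this relation between the Hamiltonian and Lagrangian functions. Note that whereas the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$ is expressed in terms of the coordinates $\mathbf{q}$ plus conjugate velocities $\dot{\mathbf{q}}$, the Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ is expressed in terms of the coordinates $\mathbf{q}$ plus their conjugate momenta $\mathbf{p}$. For scleronomic systems, using the standard Lagrangian in equations 7.44 and 7.29 shows that the Hamiltonian simplifies to be equal to the total mechanical energy, that is, $H = T + U$.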
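Generalized energy theorem: The equations of motion lead to the generalized energy theorem which states that the time dependence of the Hamiltonian is related to the time dependence of the Lagrangian.
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] - \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{7.38}$$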
Note that if all the generalized non-potential forces and Lagrange multiplier terms are zero, and if the
Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion.
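Lagrange equations of motion: Equation 6.60 gives that the Lagrange equations of motion are
$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC} \tag{6.60}$$
where $j = 1, 2, 3, \ldots, n$.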
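Hamilton’s equations of motion: Chapter 8.3 showed that a Legendre transform, plus the Lagrange-Euler equations, (9.64, 9.65) lead to Hamilton’s equations of motion. Hamilton derived these equations of motion directly from the action functional, as shown in chapter 9.2.
$$\dot{q}_j = \frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial p_j} \tag{8.25}$$
$$\dot{p}_j = -\frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial q_j} + \left[\sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC}\right] \tag{8.26}$$
$$\frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial t} = -\frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{8.24}$$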
Note the symmetry of Hamilton’s two canonical equations. The canonical variables $p_j, q_j$ are treated as independent canonical variables. Lagrange was the first to derive the canonical equations but he did not recognize them as a basic set of equations of motion. Hamilton derived the canonical equations of motion from his fundamental variational principle and made them the basis for a far-reaching theory of dynamics. Hamilton’s equations give $2n$ first-order differential equations for $p_j, q_j$, that is, two for each of the $n$ degrees of freedom. Lagrange’s equations give $n$ second-order differential equations for the variables $q_j, \dot{q}_j$.
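As a minimal sketch of how the pair of first-order canonical equations is used in practice, the following integrates them for a one-dimensional harmonic oscillator with no constraint or non-potential forces. The Hamiltonian $H = p^2/2m + kq^2/2$, the parameter values, and the leapfrog stepping scheme are all assumptions made for illustration, not something taken from the text.

```python
def hamiltonian(q, p, m=1.0, k=1.0):
    """H = p^2 / (2m) + k q^2 / 2 for a hypothetical harmonic oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

def leapfrog_step(q, p, dt, m=1.0, k=1.0):
    """One leapfrog step of q_dot = dH/dp = p/m and p_dot = -dH/dq = -k q."""
    p_half = p - 0.5 * dt * k * q        # half kick using p_dot = -dH/dq
    q_new = q + dt * p_half / m          # drift using q_dot = dH/dp
    p_new = p_half - 0.5 * dt * k * q_new
    return q_new, p_new

q, p = 1.0, 0.0
e0 = hamiltonian(q, p)
max_drift = 0.0
for _ in range(10000):
    q, p = leapfrog_step(q, p, 0.01)
    max_drift = max(max_drift, abs(hamiltonian(q, p) - e0))
print(max_drift)
```

Since this Hamiltonian has no explicit time dependence, it should be a constant of motion, and the small bounded drift reported by the sketch reflects only the finite step size.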
Hamilton-Jacobi equation: Hamilton used Hamilton’s Principle plus equation 9.19 to derive the Hamilton-Jacobi equation.
$$\frac{\partial S}{\partial t} + H(\mathbf{q},\mathbf{p},t) = 0 \tag{9.19}$$
The solution of Hamilton’s equations is trivial if the Hamiltonian is a constant of motion, or when a set of generalized coordinates can be identified for which all the coordinates are constant, or are cyclic (also called ignorable coordinates). Jacobi developed the mathematical framework of canonical transformation required to exploit the Hamilton-Jacobi equation.
Hamilton’s Principle applied using initial boundary conditions: The definition of Hamilton’s Principle assumes integration between the initial time $t_i$ and final time $t_f$. A recent development has extended applications of Hamilton’s Principle to apply to systems that are defined in terms of only the initial boundary conditions. This method doubles the number of degrees of freedom and uses a coupling Lagrangian $K(\mathbf{q}_2,\dot{\mathbf{q}}_2,\mathbf{q}_1,\dot{\mathbf{q}}_1,t)$ between the corresponding $\mathbf{q}_1$ and $\mathbf{q}_2$ doubled degrees of freedom
∙ ¸
−
=
−
≡ (q1 q̇1 ) (950)
̇− − − ̇−
This was used in equation 9.3 to derive the action in terms of the fundamental Lagrangian defined by equation 9.52. The assumption that the action is the fundamental property inverts this procedure and now equation 9.3 is used to derive the Lagrangian. That is, the assumption that Hamilton’s Principle is the foundation of algebraic mechanics defines the Lagrangian in terms of the fundamental action.
Non-standard Lagrangians: The flexibility and power of Lagrangian mechanics can be extended to a
broader range of dynamical systems by employing an extended definition of the Lagrangian that assumes that
the action is the fundamental property, and then the Lagrangian is defined in terms of Hamilton’s variational
action principle using equation 9.2. It was illustrated that the inverse variational calculus formalism can
be used to identify non-standard Lagrangians that generate the required equations of motion. These non-
standard Lagrangians can be very different from the standard Lagrangian and do not separate into kinetic
and potential energy components. These alternative Lagrangians can be used to handle dissipative systems
which are beyond the range of validity when using standard Lagrangians. That is, it was shown that several
very different Lagrangians and Hamiltonians can be equivalent for generating useful equations of motion
of a system. Currently the use of non-standard Lagrangians is a narrow, but active, frontier of classical
mechanics with important applications to relativistic mechanics.
Gauge invariance of the standard Lagrangian: It was shown that there is a continuum of equivalent
standard Lagrangians that lead to the same set of equations of motion for a system. This feature is related
to gauge invariance in mechanics. The following transformations change the standard Lagrangian, but leave
the equations of motion unchanged.
1. The Lagrangian is indefinite with respect to addition of a constant to the scalar potential which cancels
out when the derivatives in the Euler-Lagrange differential equations are applied.
2. Similarly the Lagrangian is indefinite with respect to addition of a constant kinetic energy.
3. The Lagrangian is indefinite with respect to addition of a total time derivative of the form $L \rightarrow L + \frac{d}{dt}\left[\Lambda(\mathbf{q},t)\right]$ for any differentiable function $\Lambda(\mathbf{q},t)$ of the generalized coordinates, plus time, that has continuous second derivatives.
Application of Hamilton’s Action Principle to mechanics: The derivation of the equations of mo-
tion for any system can be separated into a hierarchical set of three stages in both sophistication and
understanding. Variational principles are employed during the primary “action” stage and secondary “Hamil-
ton/Lagrangian” stage to derive the required equations of motion, which then are solved during the third
“equations-of-motion stage”. Hamilton’s Action Principle is built on the action, a scalar functional that is the basis for deriving the Lagrangian and Hamiltonian functions. The primary “action stage” uses Hamilton’s Action functional, $S = \int_{t_i}^{t_f} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt$, to derive the Lagrangian and Hamiltonian functionals that are based on Hamilton’s action functional and provide the most fundamental and sophisticated level of understanding. The second
“Hamiltonian/Lagrangian stage” involves using the Lagrangian and Hamiltonian functionals to derive the
equations of motion. The third “equations-of-motion stage” uses the derived equations of motion to solve
for the motion subject to a given set of initial boundary conditions. The Newtonian mechanics approach
bypasses the primary “action” stage, as well as the secondary “Hamiltonian/Lagrangian” stage. That is,
Newtonian mechanics starts at the third “equations-of-motion” stage, which does not allow exploiting the
considerable advantages provided by use of action, the Lagrangian, and the Hamiltonian. Newtonian me-
chanics requires that all the active forces be included when deriving the equations of motion, which involves
dealing with vector quantities. This is in contrast to the action, Lagrangian, and Hamiltonian which are
scalar functionals. Both the primary “action” stage, and the secondary “Lagrangian/Hamiltonian” stage,
exploit the powerful arsenal of mathematical techniques that have been developed for exploiting variational
principles.
Chapter 10
Nonconservative systems
10.1 Introduction
Hamilton’s action principle, Lagrangian mechanics, and Hamiltonian mechanics, all exploit the concept of
action which is a single, invariant, quantity. These algebraic formulations of mechanics all are based on
energy, which is a scalar quantity, and thus these formulations are easier to handle than the vector concept
of force employed in Newtonian mechanics. Algebraic formulations provide a powerful and elegant approach
to understand and develop the equations of motion of systems in nature. Chapters 6 to 9 applied variational
principles to Hamilton’s action principle which led to the Lagrangian, and Hamiltonian formulations that
simplify determination of the equations of motion for systems in classical mechanics.
A conservative force has the property that the total work done moving between two points is independent of the path taken. That is, a conservative force is time symmetric and can be expressed in terms of the gradient of a scalar potential $U$. Hamilton’s action principle implicitly assumes that the system is conservative
for those degrees of freedom that are built into the definition of the action, and the related Lagrangian, and
Hamiltonian. The focus of this chapter is to discuss the origins of nonconservative motion and how it can
be handled in algebraic mechanics.
The system exhibits the common “beats” behavior where the coupled harmonic oscillators have an angular frequency that is the average oscillator frequency $\left(\frac{\omega_1 + \omega_2}{2}\right)$ and the oscillation intensities are modulated at the difference frequency $\left(\frac{\omega_1 - \omega_2}{2}\right)$. Although the total energy is conserved for this conservative system, this shared energy flows back and forth between the two coupled harmonic oscillators at the difference frequency. If the equations of motion for oscillator 1 ignore the coupling to the
motion of oscillator 2, that is, assume a constant average value $x_2 = \langle x_2 \rangle$ is used, then the intensity $|x_1|^2$ and energy of the first oscillator still is modulated by the $\left|\sin\left(\frac{\omega_1 - \omega_2}{2}\right)t\right|^2$ term. Thus the total energy for this
truncated coupled-oscillator system is no longer conserved due to neglect of the energy flowing into and out
of oscillator 1 due to its coupling to oscillator 2. That is, the solution for the truncated system of oscillator
1 is not conservative since it is exchanging energy with the coupled, but ignored, second oscillator. This
elementary example illustrates that ignoring active degrees of freedom can transform a conservative system
into a nonconservative system, for which the equations of motion derived using the truncated Lagrangian are incorrect.
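The energy bookkeeping in this example can be illustrated with a short simulation. The sketch below assumes two identical oscillators of mass `m` and spring constant `kappa`, joined by a weak coupling spring `kappa_c`; these names, the parameter values, and the velocity-Verlet integrator are hypothetical choices for illustration only. It tracks the total energy of the full system and the energy that would be assigned to oscillator 1 if the coupling were ignored.

```python
def simulate(kappa=1.0, kappa_c=0.1, m=1.0, dt=0.001, t_end=40.0):
    """Integrate two coupled oscillators and record energies at every step."""
    x1, x2, v1, v2 = 1.0, 0.0, 0.0, 0.0   # all energy starts in oscillator 1

    def accel(x1, x2):
        a1 = (-kappa * x1 - kappa_c * (x1 - x2)) / m
        a2 = (-kappa * x2 - kappa_c * (x2 - x1)) / m
        return a1, a2

    total, single = [], []
    a1, a2 = accel(x1, x2)
    for _ in range(int(round(t_end / dt))):
        # velocity-Verlet: half kick, drift, recompute force, half kick
        v1 += 0.5 * dt * a1
        v2 += 0.5 * dt * a2
        x1 += dt * v1
        x2 += dt * v2
        a1, a2 = accel(x1, x2)
        v1 += 0.5 * dt * a1
        v2 += 0.5 * dt * a2
        e1 = 0.5 * m * v1 * v1 + 0.5 * kappa * x1 * x1
        e2 = 0.5 * m * v2 * v2 + 0.5 * kappa * x2 * x2
        coupling = 0.5 * kappa_c * (x1 - x2) ** 2
        total.append(e1 + e2 + coupling)
        single.append(e1)                 # truncated, oscillator-1-only energy
    return total, single

total_energy, oscillator1_energy = simulate()
print(max(total_energy) - min(total_energy))
print(min(oscillator1_energy), max(oscillator1_energy))
```

The total energy stays constant to within the integration error, while the energy attributed to oscillator 1 alone swings between nearly all and nearly none of the initial energy, which is why the truncated one-oscillator description appears nonconservative.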
The above example illustrates the importance of including all active degrees of freedom when deriving the
equations of motion, in order to ensure that the total system is conservative. Unfortunately, nonconservative
systems due to viscous or frictional dissipation typically result from weak thermal interactions with an
enormous number of nearby atoms, which makes inclusion of all of these degrees of freedom impractical.
Even though the detailed behavior of such dissipative degrees of freedom may not be of direct interest, all
the active degrees of freedom must be included when applying Lagrangian or Hamiltonian mechanics.
1. Expand the number of degrees of freedom used to include all active degrees of freedom for the system, so that the expanded system is conservative. This is the preferred approach when it is viable. Hamilton’s action principle based on initial conditions, introduced in chapter 9.2.4, doubles the number of degrees of freedom, which can be used to account for the dissipative forces, providing one approach to solve nonconservative systems. However, this approach typically is impractical for handling dissipative processes because of the large number of degrees of freedom that are involved in thermal dissipation.
2. Nonconservative forces can be introduced directly at the equations of motion stage as generalized forces $Q_j^{EXC}$. This approach is used extensively. For the case of linear velocity dependence, Rayleigh’s dissipation function provides an elegant and powerful way to express the generalized forces in terms of scalar potential energies.
3. New degrees of freedom or effective forces can be postulated that are then incorporated into the
Lagrangian or the Hamiltonian in order to mimic the effects of the nonconservative forces.
Examples that exploit the above three ways to introduce nonconservative dissipative forces in algebraic
formulations are given below.
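Multiplying equation 10.5 by $\dot{q}_i$, taking the time integral, and summing over $i$, gives the following energy equation
$$\sum_{i=1}^{n}\sum_{j=1}^{n}\int_0^t m_{ij}\ddot{q}_j\dot{q}_i\,dt + \sum_{i=1}^{n}\sum_{j=1}^{n}\int_0^t b_{ij}\dot{q}_j\dot{q}_i\,dt + \sum_{i=1}^{n}\sum_{j=1}^{n}\int_0^t c_{ij}q_j\dot{q}_i\,dt = \sum_{i=1}^{n}\int_0^t Q_i(t)\dot{q}_i\,dt \tag{10.6}$$
The right-hand term is the total energy supplied to the system by the external generalized forces $Q_i(t)$ at the time $t$. The first time-integral term on the left-hand side is the total kinetic energy, while the third time-integral term equals the potential energy. The integrand of the second integral term on the left is defined to equal $2\mathcal{R}(\dot{\mathbf{q}})$ where Rayleigh’s dissipation function $\mathcal{R}(\dot{\mathbf{q}})$ is defined as
$$\mathcal{R}(\dot{\mathbf{q}}) \equiv \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} b_{ij}\dot{q}_i\dot{q}_j \tag{10.7}$$
and the summations are over all $n$ particles of the system. This definition allows for complicated cross-coupling effects between the $n$ particles.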
The particle-particle coupling effects usually can be neglected, allowing use of the simpler definition that includes only the diagonal terms. Then the diagonal form of the Rayleigh dissipation function simplifies to
$$\mathcal{R}(\dot{\mathbf{q}}) \equiv \frac{1}{2}\sum_{i=1}^{n} b_i\dot{q}_i^2 \tag{10.8}$$
Therefore the frictional force in the $q_i$ direction depends linearly on velocity $\dot{q}_i$, that is
$$F^f_{q_i} = -\frac{\partial \mathcal{R}(\dot{\mathbf{q}})}{\partial \dot{q}_i} = -b_i\dot{q}_i \tag{10.9}$$
In general, the dissipative force is the velocity gradient of the Rayleigh dissipation function,
The physical significance of the Rayleigh dissipation function is illustrated by calculating the work done
by one particle against friction, which is
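Using equations 6.28 and 6.29 the $j$ component of the generalized frictional force $Q_j^f$ is given by
$$Q_j^f = \sum_{i=1}^{n}\mathbf{F}_i^f\cdot\frac{\partial \mathbf{r}_i}{\partial q_j} = \sum_{i=1}^{n}\mathbf{F}_i^f\cdot\frac{\partial \dot{\mathbf{r}}_i}{\partial \dot{q}_j} = -\sum_{i=1}^{n}\nabla_{\mathbf{v}_i}\mathcal{R}(\dot{\mathbf{q}})\cdot\frac{\partial \dot{\mathbf{r}}_i}{\partial \dot{q}_j} = -\frac{\partial \mathcal{R}(\dot{\mathbf{q}})}{\partial \dot{q}_j} \tag{10.15}$$
Equation 10.15 provides an elegant expression for the generalized dissipative force $Q_j^f$ in terms of Rayleigh’s scalar dissipation potential $\mathcal{R}$.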
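A small numerical sketch can make the relation between the dissipation function and the frictional force concrete. It assumes the diagonal form of the Rayleigh function with hypothetical damping coefficients `b` and velocities `qdot`, and it evaluates the velocity gradient by central differences rather than analytically. The force components should reproduce $-b_i\dot{q}_i$, and the rate at which the friction does work should equal $-2\mathcal{R}$.

```python
def rayleigh(qdot, b):
    """Diagonal Rayleigh dissipation function R = (1/2) sum_i b_i qdot_i^2."""
    return 0.5 * sum(bi * vi * vi for bi, vi in zip(b, qdot))

def friction_force(qdot, b, eps=1e-6):
    """Generalized frictional force -dR/d(qdot_i) via central differences."""
    forces = []
    for i in range(len(qdot)):
        up = list(qdot)
        down = list(qdot)
        up[i] += eps
        down[i] -= eps
        forces.append(-(rayleigh(up, b) - rayleigh(down, b)) / (2.0 * eps))
    return forces

b = [0.5, 2.0, 1.5]          # hypothetical damping coefficients
qdot = [1.0, -0.3, 2.0]      # hypothetical generalized velocities
forces = friction_force(qdot, b)
power = sum(f * v for f, v in zip(forces, qdot))
print(forces, power, -2.0 * rayleigh(qdot, b))
```

The design choice of differentiating numerically is only to show that the force follows from the scalar function alone; in an analytic calculation the derivative would be taken by hand.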
where $Q_j^{EXC}$ corresponds to the generalized forces remaining after removal of the generalized linear, velocity-dependent, frictional force $Q_j^f$. The holonomic forces of constraint are absorbed into the Lagrange multiplier term.
The Rayleigh dissipation function $\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})$ provides an elegant and convenient way to account for dissipative forces in both Lagrangian and Hamiltonian mechanics.
10.4. RAYLEIGH’S DISSIPATION FUNCTION 243
gives
These two coupled equations can be decoupled and simplified by making a transformation to normal coordinates, $\eta_1, \eta_2$ where
$$\eta_1 = x_1 - x_2 \qquad \eta_2 = x_1 + x_2$$
Thus
$$x_1 = \frac{1}{2}(\eta_1 + \eta_2) \qquad x_2 = \frac{1}{2}(\eta_2 - \eta_1)$$
Inserting these into the equations of motion gives
Adding and subtracting these two equations gives the following two decoupled equations
( + 20 ) 0
̈ 1 + ̇1 + 1 = cos ()
0
̈2 + ̇ 2 + 2 = cos ()
q p
(+20 ) 0
Define Γ = 1 = 2 = = . Then the two independent equations of motion become
This solution is a superposition of two independent, linearly-damped, driven normal modes $\eta_1$ and $\eta_2$ that have different natural frequencies $\omega_1$ and $\omega_2$. For weak damping these two driven normal modes each undergo damped oscillatory motion with the $\eta_1$ and $\eta_2$ normal modes exhibiting resonances at $\omega_1' = \sqrt{\omega_1^2 - 2\left(\frac{\Gamma}{2}\right)^2}$ and $\omega_2' = \sqrt{\omega_2^2 - 2\left(\frac{\Gamma}{2}\right)^2}$.
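The resonance condition quoted above can be checked against the steady-state amplitude of a single driven, linearly-damped mode. The sketch below assumes the standard amplitude $A/\sqrt{(\omega_n^2-\omega^2)^2 + (\Gamma\omega)^2}$ for a mode of natural frequency `omega_n`, and the numerical values are hypothetical. It scans the drive frequency on a grid and compares the location of the peak with $\sqrt{\omega_n^2 - 2(\Gamma/2)^2}$.

```python
import math

def amplitude(omega, omega_n, gamma, a=1.0):
    """Steady-state amplitude of a driven, linearly-damped oscillator."""
    return a / math.sqrt((omega_n ** 2 - omega ** 2) ** 2 + (gamma * omega) ** 2)

def resonance_frequency(omega_n, gamma):
    """Drive frequency of maximum amplitude: sqrt(omega_n^2 - 2 (gamma/2)^2)."""
    return math.sqrt(omega_n ** 2 - 2.0 * (gamma / 2.0) ** 2)

omega_n, gamma = 2.0, 0.4                      # hypothetical mode parameters
grid = [0.5 + 1e-4 * k for k in range(30001)]  # drive frequencies 0.5 ... 3.5
scanned_peak = max(grid, key=lambda w: amplitude(w, omega_n, gamma))
predicted_peak = resonance_frequency(omega_n, gamma)
print(scanned_peak, predicted_peak)
```

The same function can be called once per normal mode, with `omega_n` set to the natural frequency of that mode, since the two modes are independent.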
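Thus the total magnetic energy $W_{mag}$, which is analogous to kinetic energy $T$, is given by summing over all $n$ circuits to be
$$T = W_{mag} = \frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n} M_{jk}\dot{q}_j\dot{q}_k$$
Similarly the electrical energy $W_{elect}$ stored in the mutual capacitance $C_{jk}$ between the $n$ circuits, which is analogous to potential energy $U$, is given by
$$U = W_{elect} = \frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n} \frac{q_j q_k}{C_{jk}}$$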
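Assuming that Ohm’s Law is obeyed, that is, the dissipation force depends linearly on velocity, then the Rayleigh dissipation function can be written in the form
$$\mathcal{R} \equiv \frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n} R_{jk}\dot{q}_j\dot{q}_k$$
where $R_{jk}$ is the resistance matrix. Thus, for a symmetric resistance matrix, the dissipation force, expressed in volts, is given by
$$F_j = -\frac{\partial \mathcal{R}}{\partial \dot{q}_j} = -\sum_{k=1}^{n} R_{jk}\dot{q}_k$$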
Inserting the above expressions into equation 10.18, plus making the assumption that an additional generalized electrical force $Q_j = \xi_j(t)$ volts is acting on circuit $j$, then the Euler-Lagrange equations give the following equations of motion.
$$\sum_{k=1}^{n}\left[M_{jk}\ddot{q}_k + R_{jk}\dot{q}_k + \frac{q_k}{C_{jk}}\right] = \xi_j(t)$$
This is a generalized version of Kirchhoff’s loop rule which can be seen by considering the case where the diagonal term $j = k$ is the only non-zero term. Then
$$\left[M_{jj}\ddot{q}_j + R_{jj}\dot{q}_j + \frac{q_j}{C_{jj}}\right] = \xi_j(t)$$
This sum of the voltages is identical to the usual expression for Kirchhoff’s loop rule. This example illustrates the power of variational methods when applied to fields beyond classical mechanics.
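The single-loop case can be explored numerically. The sketch below assumes an undriven series circuit with hypothetical inductance `L_IND`, resistance `R_RES`, and capacitance `CAP`, and integrates the loop equation with a fourth-order Runge-Kutta step. It compares the loss of stored magnetic plus electrical energy with the time integral of $R\dot{q}^2$, which is twice the Rayleigh dissipation function for this one-circuit case.

```python
L_IND, R_RES, CAP = 1.0, 0.5, 1.0   # hypothetical circuit values

def derivs(q, i):
    """Loop rule L di/dt + R i + q/C = 0 written as two first-order equations."""
    return i, -(R_RES * i + q / CAP) / L_IND

def rk4_step(q, i, dt):
    k1q, k1i = derivs(q, i)
    k2q, k2i = derivs(q + 0.5 * dt * k1q, i + 0.5 * dt * k1i)
    k3q, k3i = derivs(q + 0.5 * dt * k2q, i + 0.5 * dt * k2i)
    k4q, k4i = derivs(q + dt * k3q, i + dt * k3i)
    q += dt / 6.0 * (k1q + 2.0 * k2q + 2.0 * k3q + k4q)
    i += dt / 6.0 * (k1i + 2.0 * k2i + 2.0 * k3i + k4i)
    return q, i

def stored_energy(q, i):
    """Magnetic energy (kinetic analog) plus capacitor energy (potential analog)."""
    return 0.5 * L_IND * i * i + q * q / (2.0 * CAP)

q, i, dt = 1.0, 0.0, 0.001
w_start = stored_energy(q, i)
dissipated = 0.0
for _ in range(5000):
    power_old = R_RES * i * i
    q, i = rk4_step(q, i, dt)
    dissipated += 0.5 * dt * (power_old + R_RES * i * i)   # trapezoid rule
energy_lost = w_start - stored_energy(q, i)
print(energy_lost, dissipated)
```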
10.5. DISSIPATIVE LAGRANGIANS 245
Note that this Hamiltonian is time independent, and thus is conserved for this complete dual-variable system.
Using Hamilton’s equations of motion gives the same two uncoupled equations of motion as obtained using
the Lagrangian, i.e. () and ().
2: Time-dependent Lagrangian:
The complementary subsystem of the above dual-component Lagrangian, that is added to the primary
dissipative subsystem, is the adjoint to the equations for the primary subsystem of interest. In some cases, a
set of the solutions of the complementary equations can be expressed in terms of the solutions of the primary
subsystem allowing the equations of motion to be expressed solely in terms of the variables of the primary
subsystem. Inspection of the solutions of the damped harmonic oscillator, presented in chapter 3.5, implies that $x$ and $y$ must be related by the function
$$y = x e^{\Gamma t}$$
Therefore Bateman proposed a time-dependent, non-standard Lagrangian $L_2$ of the form
$$L_2 = \frac{m}{2} e^{\Gamma t}\left[\dot{x}^2 - \omega_0^2 x^2\right]$$
This Lagrangian corresponds to a harmonic oscillator for which the mass $m = m_0 e^{\Gamma t}$ is accreting exponentially with time in order to mimic the exponential energy dissipation. Use of this Lagrangian in the Euler-Lagrange equations gives the equation of motion
$$m e^{\Gamma t}\left[\ddot{x} + \Gamma\dot{x} + \omega_0^2 x\right] = 0$$
If the factor outside of the bracket is non-zero, then the equation in the bracket must be zero. The expression in the bracket is the required equation of motion for the linearly-damped linear oscillator. This Lagrangian generates a generalized momentum of
$$p = m e^{\Gamma t}\dot{x}$$
and the Hamiltonian is
$$H_2 = p\dot{x} - L_2 = \frac{p^2}{2m} e^{-\Gamma t} + \frac{m}{2}\omega_0^2 e^{\Gamma t} x^2$$
The Hamiltonian is time dependent as expected. This leads to Hamilton’s equations of motion
$$\dot{x} = \frac{\partial H_2}{\partial p} = \frac{p}{m} e^{-\Gamma t}$$
$$-\dot{p} = \frac{\partial H_2}{\partial x} = m\omega_0^2 e^{\Gamma t} x$$
Taking the total time derivative of the first of these equations, and using the second to substitute for $\dot{p}$, gives
$$m e^{\Gamma t}\left[\ddot{x} + \Gamma\dot{x} + \omega_0^2 x\right] = 0$$
If the term $m e^{\Gamma t}$ is non-zero, then the term in brackets is zero. The term in the bracket is the usual equation of motion for the linearly-damped harmonic oscillator.
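As a check that a time-dependent Hamiltonian of this kind does reproduce damped motion, the sketch below integrates Hamilton's equations $\dot{x} = (p/m)e^{-\Gamma t}$ and $\dot{p} = -m\omega_0^2 e^{\Gamma t}x$ with a Runge-Kutta step and compares $x(t)$ with the familiar underdamped solution. The parameter values, the initial conditions $x(0)=1$, $\dot{x}(0)=0$, and the integrator are illustrative assumptions.

```python
import math

M, OMEGA0, GAMMA = 1.0, 2.0, 0.3   # hypothetical parameter values

def derivs(t, x, p):
    """Hamilton's equations for the time-dependent (Bateman-type) Hamiltonian."""
    x_dot = p * math.exp(-GAMMA * t) / M
    p_dot = -M * OMEGA0 ** 2 * math.exp(GAMMA * t) * x
    return x_dot, p_dot

def rk4_step(t, x, p, dt):
    k1x, k1p = derivs(t, x, p)
    k2x, k2p = derivs(t + 0.5 * dt, x + 0.5 * dt * k1x, p + 0.5 * dt * k1p)
    k3x, k3p = derivs(t + 0.5 * dt, x + 0.5 * dt * k2x, p + 0.5 * dt * k2p)
    k4x, k4p = derivs(t + dt, x + dt * k3x, p + dt * k3p)
    x += dt / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
    p += dt / 6.0 * (k1p + 2.0 * k2p + 2.0 * k3p + k4p)
    return x, p

def damped_solution(t):
    """Underdamped solution of x'' + GAMMA x' + OMEGA0^2 x = 0, x(0)=1, x'(0)=0."""
    omega = math.sqrt(OMEGA0 ** 2 - (GAMMA / 2.0) ** 2)
    return math.exp(-GAMMA * t / 2.0) * (
        math.cos(omega * t) + GAMMA / (2.0 * omega) * math.sin(omega * t))

x, p, dt = 1.0, 0.0, 0.001
max_error = 0.0
for step in range(10000):
    x, p = rk4_step(step * dt, x, p, dt)
    max_error = max(max_error, abs(x - damped_solution((step + 1) * dt)))
print(max_error)
```

Note that the canonical momentum $p$ grows with time even though the velocity decays, because of the $e^{\Gamma t}$ factor relating the two.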
3: Complex Lagrangian:
Dekker proposed use of complex dynamical variables for solving the linearly-damped harmonic oscillator.
It exploits the fact that, in principle, each second order differential equation can be expressed in terms of
a set of first-order differential equations. This feature is the essential difference between Lagrangian and
Hamiltonian mechanics. Let be complex and assume it can be expressed in the form of a real variable as
µ ¶
Γ
= ̇ − + ()
2
Substituting this complex variable into the relation
∙ ¸
Γ
̇ + + =0 ()
2
leads to the second-order equation for the real variable of
̈ + Γ̇ + 20 = 0 ()
This is the desired equation of motion for the linearly-damped harmonic oscillator. This result also can be
shown by taking the time derivative of equation () and taking only the real part, i.e.
µ ¶
Γ Γ
̈ + ̇ + ̇ = ̈ + − ̇ + Γ̇ = ̈ + Γ̇ + 20 = 0 ()
2 2
This feature is exploited using the following Lagrangian
∙ ¸
∗ Γ ∗
= ( ̇ − ̇ ∗ ) − − ()
2 2
¡ ¢2
where 2 ≡ 20 − Γ2 . The Lagrangian is real for a conservative system and complex for a
dissipative system. Using the Lagrange-Euler equation for variation of ∗ , that is, Λ∗ = 0, gives
equation () which leads to the required equation of motion ()
The canonical conjugate momenta are given by
= ̃ = ()
̇ ̇ ∗
The above Lagrangian plus canonically conjugate momenta lead to the complementary Hamiltonians
µ ¶
∗ Γ
( ̃ ) = + (̃∗ ∗ − ) ()
2
µ ¶
∗ Γ
̃ ( ̃ ) = − (̃∗ ∗ − ) ()
2
These Hamiltonians give Hamilton equations of motion that lead to the correct equations of motion for
and ∗
The above examples have shown that three very different, non-standard, Lagrangians, plus their corre-
sponding Hamiltonians, all lead to the correct equation of motion for the linearly-damped harmonic oscilla-
tor. This illustrates the power of using non-standard Lagrangians to describe dissipative motion in classical
mechanics. However, postulating non-standard Lagrangians to produce the required equations of motion
appears to be of questionable usefulness. A fundamental approach is needed to build a firm foundation upon
which non-standard Lagrangian mechanics can be based. Non-standard Lagrangian mechanics remains an
active, albeit narrow, frontier of classical mechanics.
10.6 Summary
Dissipative drag forces are non-conservative and usually are velocity dependent. Chapter 4 showed that the
motion of non-linear dissipative dynamical systems can be highly sensitive to the initial conditions and can
lead to chaotic motion.
Algebraic mechanics for nonconservative systems: Since Lagrangian and Hamiltonian formulations are invalid for the nonconservative degrees of freedom, the following three approaches are used to include nonconservative degrees of freedom directly in the Lagrangian and Hamiltonian formulations of mechanics.
1. Expand the number of degrees of freedom used to include all active degrees of freedom for the system, so that the expanded system is conservative. This is the preferred approach when it is viable. Unfortunately this approach typically is impractical for handling dissipative processes because of the large number of degrees of freedom that are involved in thermal dissipation.
2. Nonconservative forces can be introduced directly at the equations of motion stage as generalized forces $Q_j^{EXC}$. This approach is used extensively. For the case of linear velocity dependence, Rayleigh’s dissipation function provides an elegant and powerful way to express the generalized forces in terms of scalar potential energies.
3. New degrees of freedom or effective forces can be postulated that are then incorporated into the
Lagrangian or the Hamiltonian in order to mimic the effects of the nonconservative forces.
Rayleigh’s dissipation function: Generalized dissipative forces that have a linear velocity dependence can be easily handled in Lagrangian or Hamiltonian mechanics by introducing the powerful Rayleigh’s dissipation function $\mathcal{R}(\dot{\mathbf{q}})$ where
$$\mathcal{R}(\dot{\mathbf{q}}) \equiv \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} b_{ij}\dot{q}_i\dot{q}_j \tag{10.7}$$
This approach is used extensively in physics. This approach has been generalized by defining a linear velocity dependent Rayleigh dissipation function
$$\mathbf{F}^f = -\frac{\partial \mathcal{R}(\mathbf{q},\dot{\mathbf{q}})}{\partial \dot{\mathbf{q}}} \tag{10.16}$$
where the generalized Rayleigh dissipation function R(q q̇) satisfies the general Lagrange mechanics relation
− =0 (1017)
̇
This generalized Rayleigh’s dissipation function eliminates the prior restriction to linear dissipation processes,
which greatly expands the range of validity for using Rayleigh’s dissipation function.
Rayleigh dissipation in Lagrange equations of motion: Linear dissipative forces can be directly, and elegantly, included in Lagrangian mechanics by using Rayleigh’s dissipation function as a generalized force $Q_j^f$. Inserting the Rayleigh dissipation function 10.15 in the generalized Lagrange equations of motion 6.60 gives
$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] - \frac{\partial \mathcal{R}(\mathbf{q},\dot{\mathbf{q}})}{\partial \dot{q}_j} \tag{10.18}$$
where $Q_j^{EXC}$ corresponds to the generalized forces remaining after removal of the generalized linear, velocity-dependent, frictional force $Q_j^f$. The holonomic forces of constraint are absorbed into the Lagrange multiplier term.
The Rayleigh dissipation function $\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})$ provides an elegant and convenient way to account for dissipative forces in both Lagrangian and Hamiltonian mechanics.
Dissipative Lagrangians or Hamiltonians New degrees of freedom or effective forces can be postulated
that are then incorporated into the Lagrangian or the Hamiltonian in order to mimic the effects of the
nonconservative forces. This approach has been used for special cases.
Chapter 11
11.1 Introduction
Conservative two-body central forces are important in physics because of the pivotal role that the Coulomb
and the gravitational forces play in nature. The Coulomb force plays a role in electrodynamics, molecular,
atomic, and nuclear physics, while the gravitational force plays an analogous role in celestial mechanics.
Therefore this chapter focusses on the physics of systems involving conservative two-body central forces
because of the importance and ubiquity of these conservative two-body central forces in nature.
A conservative two-body central force has the following three important attributes.
1. Conservative: A conservative force depends only on the particle position, that is, the force is not time dependent. Moreover the work done by the force moving a body between any two points 1 and 2 is path independent. Conservative fields are discussed in chapter 2.10.
2. Two-body: A two-body force between two bodies depends only on the relative locations of the two interacting bodies and is not influenced by the proximity of additional bodies. For two-body forces acting between $n$ bodies, the force on body 1 is the vector superposition of the two-body forces due to the interactions with each of the other $n - 1$ bodies. This differs from three-body forces where the force between any two bodies is influenced by the proximity of a third body.
3. Central: A central force field depends on the distance $r_{12}$ from the origin of the force at point 1 to the body location at point 2, and the force is directed along the line joining them, that is, $\hat{\mathbf{r}}_{12}$.
A conservative, two-body, central force combines the above three attributes and can be expressed as,
The force field $\mathbf{F}_{21}$ has a magnitude $f(r_{12})$ that depends only on the magnitude of the relative separation vector $\mathbf{r}_{12} = \mathbf{r}_2 - \mathbf{r}_1$ between the origin of the force at point 1 and point 2 where the force acts, and the force is directed along the line joining them, that is, $\hat{\mathbf{r}}_{12}$.
Chapter 2.10 showed that if a two-body central force is conservative, then it can be written as the gradient of a scalar potential energy $U(r)$ which is a function of the distance $r$ from the center of the force field.
250 CHAPTER 11. CONSERVATIVE TWO-BODY CENTRAL FORCES
That is, the two vectors $\mathbf{r}_1$, $\mathbf{r}_2$ are written in terms of the position vector for the center of mass $\mathbf{R}$ and the
position vector $\mathbf{r}$ for relative motion in the center of mass.
Assuming that the two-body central force is conservative and represented by $U(r)$, then the Lagrangian
of the two-body system can be written as
$$L = \frac{1}{2} m_1 |\dot{\mathbf{r}}_1|^2 + \frac{1}{2} m_2 |\dot{\mathbf{r}}_2|^2 - U(r) \tag{11.11}$$
11.2. EQUIVALENT ONE-BODY REPRESENTATION FOR TWO-BODY MOTION 251
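Differentiating equations 11.10 with respect to time, and inserting them into the Lagrangian, gives
$$L = \frac{1}{2} M |\dot{\mathbf{R}}|^2 + \frac{1}{2} \mu |\dot{\mathbf{r}}|^2 - U(r) \tag{11.12}$$
where the total mass $M$ is defined as
$$M = m_1 + m_2 \tag{11.13}$$
and the reduced mass $\mu$ is defined by
$$\mu \equiv \frac{m_1 m_2}{m_1 + m_2} \tag{11.14}$$
or equivalently
$$\frac{1}{\mu} = \frac{1}{m_1} + \frac{1}{m_2} \tag{11.15}$$
The total Lagrangian can be separated into two independent parts
$$L = \frac{1}{2} M |\dot{\mathbf{R}}|^2 + L_{cm} \tag{11.16}$$
where
$$L_{cm} = \frac{1}{2} \mu |\dot{\mathbf{r}}|^2 - U(r) \tag{11.17}$$
Assuming that no external forces are acting, then $\partial L / \partial \mathbf{R} = 0$ and the three Lagrange equations for the
three components of $\mathbf{R}$ can be written as
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{R}}} = \frac{d\mathbf{P}}{dt} = 0 \tag{11.18}$$
That is, for a pure central force, the center-of-mass momentum $\mathbf{P}$ is a constant of motion where
$$\mathbf{P} = \frac{\partial L}{\partial \dot{\mathbf{R}}} = M \dot{\mathbf{R}} \tag{11.19}$$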
It is convenient to work in the center-of-mass frame using the effective Lagrangian $L_{cm}$. In the center-of-mass
frame of reference, the translational kinetic energy $\frac{1}{2} M |\dot{\mathbf{R}}|^2$ associated with center-of-mass motion is
ignored, and only the energy in the center-of-mass is considered. This center-of-mass energy is the energy
involved in the interaction between the colliding bodies. Thus, in the center-of-mass, the problem has been
reduced to an equivalent one-body problem of a mass $\mu$ moving about a fixed force center with a path given
by $\mathbf{r}$ which is the separation vector between the two bodies, as shown in figure 11.2. In reality, both masses
revolve around their center of mass, also called the barycenter, in the center-of-mass frame as shown in
figure 11.2. Knowing $\mathbf{r}$ allows the trajectory of each mass about the center of mass, $\mathbf{r}'_1$ and $\mathbf{r}'_2$, to be
calculated. Of course the true path in the laboratory frame of reference must take into account the
translational motion of the center of mass, in addition to the motion of the equivalent one-body
representation relative to the barycenter. Be careful to remember the difference between the actual
trajectories of each body, and the effective trajectory assumed when using the reduced mass which only
determines the relative separation $\mathbf{r}$ of the two bodies. This reduction to an equivalent one-body problem
greatly simplifies the solution of the motion, but it misrepresents the actual trajectories and the spatial
locations of each mass in space. The equivalent one-body representation will be used extensively throughout
this chapter.
Figure 11.2: Orbits of a two-body system with mass ratio of 2 rotating about the center-of-mass, O. The
dashed ellipse is the equivalent one-body orbit with the center of force at the focus O.
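The step from the relative separation back to the individual trajectories can be illustrated with a short numerical sketch. It assumes the usual center-of-mass relations $\mathbf{r}_1 = \mathbf{R} - \frac{m_2}{M}\mathbf{r}$ and $\mathbf{r}_2 = \mathbf{R} + \frac{m_1}{M}\mathbf{r}$ with $\mathbf{r} = \mathbf{r}_2 - \mathbf{r}_1$; the function names are hypothetical and chosen only for this illustration.

```python
def reduced_mass(m1, m2):
    """Reduced mass mu = m1*m2/(m1 + m2), as in equation 11.14."""
    return m1 * m2 / (m1 + m2)


def positions_from_separation(m1, m2, r, R=(0.0, 0.0, 0.0)):
    """Return (r1, r2) from the separation r = r2 - r1 and the center of mass R.

    Assumed relations: r1 = R - (m2/M) r and r2 = R + (m1/M) r.
    """
    M = m1 + m2
    r1 = tuple(Rc - (m2 / M) * rc for Rc, rc in zip(R, r))
    r2 = tuple(Rc + (m1 / M) * rc for Rc, rc in zip(R, r))
    return r1, r2


# Mass ratio of 2, with the center of mass at the origin.
mu = reduced_mass(2.0, 1.0)
r1, r2 = positions_from_separation(2.0, 1.0, (3.0, 0.0, 0.0))
```

With these inputs the heavier body sits one unit from the barycenter and the lighter body two units away on the opposite side, so that $m_1 \mathbf{r}_1 + m_2 \mathbf{r}_2 = 0$ while $\mathbf{r}_2 - \mathbf{r}_1$ reproduces the separation vector.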
The center-of-mass Lagrangian leads to the following two general properties regarding the angular mo-
mentum vector L.
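1) The motion lies entirely in a plane perpendicular to the fixed direction of the total angular momentum
vector. This is because
$$\mathbf{L} \cdot \mathbf{r} = (\mathbf{r} \times \mathbf{p}) \cdot \mathbf{r} = 0 \tag{11.21}$$
that is, the radius vector is in the plane perpendicular to the total angular momentum vector. Thus, it is
possible to express the Lagrangian in polar coordinates, $(r, \psi)$, rather than spherical coordinates. In polar
coordinates the center-of-mass Lagrangian becomes
$$L_{cm} = \frac{1}{2} \mu \left( \dot{r}^2 + r^2 \dot{\psi}^2 \right) - U(r) \tag{11.22}$$
2) If the potential is spherically symmetric, then the polar angle $\psi$ is cyclic and therefore Noether's
theorem gives that the angular momentum $\mathbf{p}_\psi \equiv \mathbf{L} = \mathbf{r} \times \mathbf{p}$ is a constant of motion. That is, since
$\partial L_{cm} / \partial \psi = 0$, the Lagrange equations give
$$\dot{\mathbf{p}}_\psi = \frac{d}{dt}\frac{\partial L_{cm}}{\partial \dot{\boldsymbol{\psi}}} = 0 \tag{11.23}$$
where the vectors $\dot{\mathbf{p}}_\psi$ and $\dot{\boldsymbol{\psi}}$ imply that equation 11.23 refers to three independent equations corresponding
to the three components of these vectors. Thus the angular momentum $\mathbf{p}_\psi$ conjugate to $\boldsymbol{\psi}$ is a constant of
motion. The generalized momentum $\mathbf{p}_\psi$ is a first integral of the motion which equals
$$\mathbf{p}_\psi = \frac{\partial L_{cm}}{\partial \dot{\boldsymbol{\psi}}} = \mu r^2 \dot{\boldsymbol{\psi}} = l\, \hat{\mathbf{p}}_\psi \tag{11.24}$$
where the magnitude of the angular momentum $l$, and the direction $\hat{\mathbf{p}}_\psi$, both are constants of motion.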
A simple geometric interpretation of equation 11.24 is illustrated in figure 11.3. The radius vector sweeps
out an area $d\mathbf{A}$ in time $dt$ where
$$d\mathbf{A} = \frac{1}{2} \mathbf{r} \times \mathbf{v}\, dt \tag{11.25}$$
and the vector $d\mathbf{A}$ is perpendicular to the $x-y$ plane. The rate of change of area is
$$\frac{d\mathbf{A}}{dt} = \frac{1}{2} \mathbf{r} \times \mathbf{v} \tag{11.26}$$
But the angular momentum is
$$\mathbf{L} = \mathbf{r} \times \mathbf{p} = \mu\, \mathbf{r} \times \mathbf{v} = 2\mu \frac{d\mathbf{A}}{dt} \tag{11.27}$$
Thus the conservation of angular momentum implies that the areal velocity $\frac{d\mathbf{A}}{dt}$ also is a constant of motion.
This fact is called Kepler's second law of planetary motion which he deduced in 1609 based on Tycho Brahe's
55 years of observational records of the motion of Mars. Kepler's second law implies that a planet moves
fastest when closest to the sun and slowest when farthest from the sun. Note that Kepler's second law is a
statement of the conservation of angular momentum which is independent of the radial form of the central
potential.
Figure 11.3: Area swept out by the radius vector in the time dt.
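The statement that the areal velocity is conserved for any radial form of the central force can be checked numerically. The sketch below is an illustration only: it integrates planar motion with a velocity-Verlet step (an integrator chosen here because its kicks are along $\mathbf{r}$, so it preserves $\mathbf{r} \times \mathbf{v}$ up to rounding) for two assumed force laws, an inverse-square force and a linear restoring force. All names and initial conditions are hypothetical.

```python
import math


def areal_velocity(x, y, vx, vy):
    """z-component of dA/dt = (1/2) r x v for motion in the x-y plane."""
    return 0.5 * (x * vy - y * vx)


def verlet_step(state, radial_force, mu, dt):
    """One velocity-Verlet step for a central force F = radial_force(r) * r_hat."""
    x, y, vx, vy = state

    def accel(px, py):
        r = math.hypot(px, py)
        f = radial_force(r)
        return f * px / (r * mu), f * py / (r * mu)

    ax, ay = accel(x, y)
    vx += 0.5 * dt * ax          # half kick
    vy += 0.5 * dt * ay
    x += dt * vx                 # drift
    y += dt * vy
    ax, ay = accel(x, y)
    vx += 0.5 * dt * ax          # second half kick
    vy += 0.5 * dt * ay
    return (x, y, vx, vy)


def areal_velocity_drift(radial_force, mu=1.0, dt=1e-3, steps=5000):
    """Return the initial areal velocity and its largest deviation along the path."""
    state = (1.0, 0.0, 0.0, 0.8)
    start = areal_velocity(*state)
    worst = 0.0
    for _ in range(steps):
        state = verlet_step(state, radial_force, mu, dt)
        worst = max(worst, abs(areal_velocity(*state) - start))
    return start, worst


start_kepler, drift_kepler = areal_velocity_drift(lambda r: -1.0 / r**2)
start_spring, drift_spring = areal_velocity_drift(lambda r: -r)
```

Both force laws leave the areal velocity unchanged to within rounding error, even though the orbits themselves are quite different.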
11.4. EQUATIONS OF MOTION 253
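$$\mu \ddot{r} = -\frac{\partial U}{\partial r} + \frac{l^2}{\mu r^3} \tag{11.30}$$
Similarly, for the angular coordinate $\psi$, the operator equation $\Lambda_\psi L = 0$ leads to equation 11.24. That is,
the angular equation of motion for the magnitude of $\mathbf{p}_\psi$ is
$$p_\psi = \frac{\partial L}{\partial \dot{\psi}} = \mu r^2 \dot{\psi} = l \tag{11.31}$$
Lagrange's equations have given two equations of motion, one dependent on radius $r$ and the other on
the polar angle $\psi$. Note that the radial acceleration is just a statement of Newton's Laws of motion for the
radial force in the center-of-mass system of
$$F_r = -\frac{\partial U}{\partial r} + \frac{l^2}{\mu r^3} \tag{11.32}$$
This can be written in terms of an effective potential
$$U_{eff}(r) \equiv U(r) + \frac{l^2}{2\mu r^2} \tag{11.33}$$
which leads to an equation of motion
$$F_r = \mu \ddot{r} = -\frac{\partial U_{eff}(r)}{\partial r} \tag{11.34}$$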
Since $\frac{l^2}{\mu r^3} = \mu r \dot{\psi}^2$, the second term in equation (11.33) is the centrifugal potential, whose negative
gradient is the centrifugal force $\mu r \dot{\psi}^2$.
(Figure caption fragment: "... the combined effective bound potential.")
It is remarkable that the six-dimensional equations of motion, for two bodies interacting via a two-body
central force, have been reduced to trivial center-of-mass translational motion, plus a one-dimensional one-
body problem given by (11.34) in terms of the relative separation $r$ and an effective potential $U_{eff}(r)$.
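As a concrete illustration of the effective potential, the following sketch evaluates $U_{eff}(r)$ for an inverse-square force, taking $U(r) = k/r$ with $k < 0$ for attraction (an assumption about the sign convention), and locates the circular-orbit radius where $\partial U_{eff}/\partial r = 0$. The closed forms $r_0 = -l^2/(\mu k)$ and $U_{min} = -\mu k^2/(2 l^2)$ follow from equation 11.33 under that assumption; the function names are hypothetical.

```python
def u_eff(r, k, l, mu):
    """U_eff(r) = k/r + l^2/(2 mu r^2), assuming U(r) = k/r with k < 0 attractive."""
    return k / r + l**2 / (2.0 * mu * r**2)


def circular_orbit(k, l, mu):
    """Radius and energy at the minimum of U_eff for an attractive force."""
    r0 = -l**2 / (mu * k)
    return r0, u_eff(r0, k, l, mu)


k, l, mu = -1.0, 1.0, 1.0
r0, u_min = circular_orbit(k, l, mu)

# Central-difference estimate of dU_eff/dr at r0; it should vanish at the minimum.
h = 1e-5
slope = (u_eff(r0 + h, k, l, mu) - u_eff(r0 - h, k, l, mu)) / (2.0 * h)
```

For these values the minimum sits at $r_0 = 1$ with $U_{min} = -0.5$, and the numerical slope vanishes to within the finite-difference error.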
11.6 Hamiltonian
Since the center-of-mass Lagrangian is not an explicit function of time, then
$$\frac{dH_{cm}}{dt} = -\frac{\partial L_{cm}}{\partial t} = 0 \tag{11.40}$$
Thus the center-of-mass Hamiltonian $H_{cm}$ is a constant of motion. However, since the transformation to
center of mass can be time dependent, then $H_{cm} \neq E$, that is, it does not include the total energy because
the kinetic energy of the center-of-mass motion has been omitted from $H_{cm}$. Also, since no transformation
is involved, then
$$H_{cm} = T_{cm} + U = E_{cm} \tag{11.41}$$
That is, the center-of-mass Hamiltonian $H_{cm}$ equals the center-of-mass total energy. The center-of-mass
Hamiltonian then can be written using the effective potential (11.33) in the form
$$H_{cm} = \frac{p_r^2}{2\mu} + \frac{p_\psi^2}{2\mu r^2} + U(r) = \frac{p_r^2}{2\mu} + \frac{l^2}{2\mu r^2} + U(r) = \frac{p_r^2}{2\mu} + U_{eff}(r) = E_{cm} \tag{11.42}$$
It is convenient to express the center-of-mass Hamiltonian $H_{cm}$ in terms of the energy equation for the
orbit in a central field using the transformed variable $u = \frac{1}{r}$. Substituting equations 11.33 and 11.37 into
the Hamiltonian equation 11.42 gives the energy equation of the orbit
$$\frac{l^2}{2\mu}\left[\left(\frac{du}{d\psi}\right)^2 + u^2\right] + U\left(u^{-1}\right) = E_{cm} \tag{11.43}$$
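Energy conservation allows the Hamiltonian to be used to solve problems directly. That is, since
$$H_{cm} = \frac{\mu \dot{r}^2}{2} + \frac{l^2}{2\mu r^2} + U(r) = E_{cm} \tag{11.44}$$
then
$$\dot{r} = \frac{dr}{dt} = \pm\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)} \tag{11.45}$$
The time dependence can be obtained by integration
$$t = \int \frac{\pm\, dr}{\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} + \text{constant} \tag{11.46}$$
An inversion of this gives the solution in the standard form $r = r(t)$. However, it is more interesting to find
the relation between $r$ and $\psi$. From relation 11.46 for $t$ then
$$dt = \frac{\pm\, dr}{\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \tag{11.47}$$
Therefore, using $d\psi = \frac{l}{\mu r^2}\, dt$ from equation 11.31,
$$\psi = \int \frac{\pm\, l\, dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} + \text{constant} \tag{11.49}$$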
which can be used to calculate the angular coordinate. This gives the relation between the radial and angular
coordinates which specifies the trajectory.
Although equations (11.45) and (11.49) formally give the solution, the actual solution can be derived
analytically only for certain specific forms of the force law and these solutions differ for attractive versus
repulsive interactions.
2) $E_{cm} = 0$: It can be shown that the orbit for this case is parabolic.
3) $0 > E_{cm} > U_{min}$: For this case the equivalent orbit has both a maximum and minimum radial distance
at which $\dot{r} = 0$. At the turning points the radial kinetic energy term is zero so $E_{cm} = U(r) + \frac{l^2}{2\mu r^2}$. For the
attractive inverse square law force the path is an ellipse with the focus at the center of attraction (Figure
11.5), which is Kepler's First Law. During the time that the radius ranges from $r_{min}$ to $r_{max}$ and back the
radius vector turns through an angle $\Delta\psi$ which is given by
$$\Delta\psi = 2\int_{r_{min}}^{r_{max}} \frac{\pm\, l\, dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \tag{11.50}$$
The general path prescribes a rosette shape which is a closed curve only if $\Delta\psi$ is a rational fraction of
$2\pi$.
4) $E_{cm} = U_{min}$: In this case $r$ is a constant implying that the path is circular since
$$\dot{r} = \frac{dr}{dt} = \pm\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)} = 0 \tag{11.51}$$
5) $E_{cm} < U_{min}$: For this case the square root is imaginary and there is no real solution.
In general the orbit is not closed, and such open orbits do not repeat. Bertrand’s Theorem states that
the inverse-square central force, and the linear harmonic oscillator, are the only radial dependences of the
central force that lead to stable closed orbits.
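The closure condition on $\Delta\psi$ can be explored numerically. The sketch below evaluates the angle swept between successive passages through $r_{min}$ (the integral of equation 11.50 with the positive root) after the substitutions $u = 1/r$ and $u = c + a\cos\phi$, which remove the integrable singularities at the turning points. The quadrature scheme, the function name, and the parameter values are all choices made for this illustration, and the turning points are supplied by the caller.

```python
import math


def apsidal_angle(U, E, l, mu, r_min, r_max, steps=2000):
    """Angle swept while r goes r_min -> r_max -> r_min for potential U(r).

    With u = 1/r the integrand becomes du / sqrt(2 mu (E - U)/l^2 - u^2);
    the substitution u = c + a cos(phi) cancels the end-point singularities.
    """
    u_lo, u_hi = 1.0 / r_max, 1.0 / r_min
    c, a = 0.5 * (u_hi + u_lo), 0.5 * (u_hi - u_lo)
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * h                     # midpoint rule in phi
        u = c + a * math.cos(phi)
        f = 2.0 * mu * (E - U(1.0 / u)) / l**2 - u**2
        total += a * math.sin(phi) / math.sqrt(f) * h
    return 2.0 * total


mu, l = 1.0, 1.0

# Attractive inverse-square force, U = k/r with k < 0 (assumed sign convention).
k, E_k = -1.0, -0.3
eps = math.sqrt(1.0 + 2.0 * E_k * l**2 / (mu * k**2))
kepler = apsidal_angle(lambda r: k / r, E_k, l, mu,
                       l**2 / (mu * abs(k) * (1.0 + eps)),
                       l**2 / (mu * abs(k) * (1.0 - eps)))

# Isotropic harmonic oscillator, U = r^2/2; turning points solve r^4 - 2 E r^2 + 1 = 0.
E_h = 2.0
harmonic = apsidal_angle(lambda r: 0.5 * r**2, E_h, l, mu,
                         math.sqrt(E_h - math.sqrt(E_h**2 - 1.0)),
                         math.sqrt(E_h + math.sqrt(E_h**2 - 1.0)))
```

The inverse-square case returns $2\pi$ and the harmonic case returns $\pi$, both rational fractions of $2\pi$, consistent with the two closed-orbit force laws singled out by Bertrand's theorem.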
$$\frac{d^2 u(\psi)}{d\psi^2} + u(\psi) = 0$$
A solution of this is
$$u(\psi) = \frac{1}{r_0}\cos(\psi - \delta)$$
where $r_0$ and $\delta$ are arbitrary constants. This can be rewritten as
$$r(\psi) = \frac{r_0}{\cos(\psi - \delta)}$$
This is the equation of a straight line in polar coordinates as illustrated in the adjacent figure. This shows
that a free body moves in a straight line if no forces are acting on the body.
[Figure: Trajectory of a free body]
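That this polar form is a straight line can be confirmed by converting points on the curve to cartesian coordinates. A minimal sketch, writing the two arbitrary constants as `r0` and `delta` (names assumed here): every point should have the same projection, equal to `r0`, on the unit vector $(\cos\delta, \sin\delta)$, which is the defining property of a line at perpendicular distance `r0` from the origin.

```python
import math


def free_body_point(psi, r0, delta):
    """Cartesian point on the polar curve r(psi) = r0 / cos(psi - delta)."""
    r = r0 / math.cos(psi - delta)
    return r * math.cos(psi), r * math.sin(psi)


r0, delta = 2.0, 0.3
projections = []
for psi in (-0.9, -0.2, 0.3, 0.8, 1.4):
    x, y = free_body_point(psi, r0, delta)
    # Projection on the unit normal of the line.
    projections.append(x * math.cos(delta) + y * math.sin(delta))
```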
11.8. INVERSE-SQUARE, TWO-BODY, CENTRAL FORCE 257
Equation 11.58 is the polar equation of a conic section. Equation 11.58 also can be derived with the
origin at a focus by inserting the inverse square law potential $U = k/r = ku$ into equation 11.49 which gives
$$\psi = \int \frac{\pm\, du}{\sqrt{\frac{2\mu E_{cm}}{l^2} - \frac{2\mu k}{l^2}u - u^2}} + \text{constant} \tag{11.60}$$
The value of $\psi_0$ merely determines the orientation of the major axis of the equivalent orbit. Without loss of
generality, it is possible to assume that the angle $\psi$ is measured with respect to the major axis of the orbit,
that is $\psi_0 = 0$. Then the equation can be written as
$$u = \frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \epsilon\cos(\psi)\right] = -\frac{\mu k}{l^2}\left[1 + \sqrt{1 + \frac{2E_{cm}l^2}{\mu k^2}}\cos(\psi)\right] \tag{11.63}$$
This is the equation of a conic section where $\epsilon$ is the eccentricity of the conic section. The conic section is a
hyperbola if $\epsilon > 1$, parabola if $\epsilon = 1$, ellipse if $\epsilon < 1$, and a circle if $\epsilon = 0$. All the equivalent one-body orbits
for an attractive force have the origin of the force at a focus of the conic section. The orbits depend on
whether the force is attractive or repulsive, on the conserved angular momentum $l$ and on the center-of-mass
energy $E_{cm}$.
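The classification by eccentricity can be sketched in a few lines. The example below takes the eccentricity to be the square root appearing in equation 11.63, $\epsilon = \sqrt{1 + 2E_{cm}l^2/(\mu k^2)}$, and evaluates the orbit radius for an attractive force ($k < 0$). The function names and the tolerance used to recognise the circular and parabolic limits are assumptions made for this illustration.

```python
import math


def eccentricity(E_cm, l, mu, k):
    """epsilon = sqrt(1 + 2 E_cm l^2 / (mu k^2))."""
    return math.sqrt(1.0 + 2.0 * E_cm * l**2 / (mu * k**2))


def conic_type(eps, tol=1e-12):
    """Name of the conic section for a given eccentricity."""
    if eps < tol:
        return "circle"
    if eps < 1.0 - tol:
        return "ellipse"
    if eps <= 1.0 + tol:
        return "parabola"
    return "hyperbola"


def orbit_radius(psi, eps, l, mu, k):
    """r(psi) from 1/r = -(mu k / l^2) [1 + eps cos(psi)], valid for k < 0."""
    return -l**2 / (mu * k * (1.0 + eps * math.cos(psi)))


l, mu, k = 1.0, 1.0, -1.0
kinds = [conic_type(eccentricity(E, l, mu, k)) for E in (-0.5, -0.3, 0.0, 0.5)]
```

Here the most bound energy $E_{cm} = -\mu k^2/(2l^2) = -0.5$ gives a circle, intermediate negative energies give ellipses, zero energy gives a parabola, and positive energy gives a hyperbola.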
foci of the elliptical orbit. The terms periapsis and pericenter both are used to designate the closest distance of approach, while
apoapsis and apocenter are used to designate the farthest distance of approach. Attaching the terms "peri-" and "apo-" to the
general term "-apsis" is preferred over having different names for each object in the solar system. For example, frequently used
terms are "-helion" for orbits around the sun, "-gee" for orbits around the earth, and "-cynthion" for orbits around the moon.
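The maximum distance, $r = r_{max}$, which is called the apoapsis, occurs when $\psi = 180°$
$$r_{max} = -\frac{l^2}{\mu k\left[1 - \epsilon\right]} \tag{11.65}$$
Remember that since $k < 0$ for bound orbits, the negative signs in equations 11.64 and 11.65 lead to $r > 0$.
The most bound orbit is a circle having $\epsilon = 0$ which implies that $E_{cm} = -\frac{\mu k^2}{2l^2}$.
The shape of the elliptical orbit also can be described with respect to the center of the elliptical equivalent
orbit by deriving the lengths of the semi-major axis $a$ and the semi-minor axis $b$ shown in figure 11.5
$$a = \frac{1}{2}\left(r_{min} + r_{max}\right) = \frac{1}{2}\left(\frac{l^2}{\mu |k|\left[1 + \epsilon\right]} + \frac{l^2}{\mu |k|\left[1 - \epsilon\right]}\right) = \frac{l^2}{\mu |k|\left[1 - \epsilon^2\right]} \tag{11.66}$$
$$b = a\sqrt{1 - \epsilon^2} = \frac{l^2}{\mu |k|\sqrt{1 - \epsilon^2}} \tag{11.67}$$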
Remember that the predicted bound elliptical orbit corresponds to the equivalent one-body representation
for the two-body motion as illustrated in figure 11.2. This can be transformed to the individual spatial
trajectories of each of the two bodies in an inertial frame.
The eccentricity of the major planets ranges from $\epsilon = 0.2056$ for Mercury, to $\epsilon = 0.0068$ for Venus. The
Earth has an eccentricity of $\epsilon = 0.0167$ with $r_{min} = 91 \cdot 10^6$ miles and $r_{max} = 95 \cdot 10^6$ miles. On the other
hand, $\epsilon = 0.967$ for Halley's comet, that is, the radius vector ranges from about 0.6 to 35 times the radius of the
orbit of the Earth, with a semi-major axis of about 18 times that radius.
The orbit energy can be derived by substituting the eccentricity, given by equation 11.62, into the semi-major
axis length $a$, given by equation 11.66, which leads to the center-of-mass energy of
$$E_{cm} = -\frac{|k|}{2a} \tag{11.73}$$
However, the Hamiltonian, given by equation 11.42, implies that $E_{cm}$ is
$$E_{cm} = \frac{1}{2}\mu v^2 + \left(-\frac{|k|}{r}\right) = -\frac{|k|}{2a} \tag{11.74}$$
For the simple case of a circular orbit, $a = r$, then the velocity $v$ equals
$$v = \sqrt{\frac{|k|}{\mu r}} \tag{11.75}$$
For a circular orbit, the drag on a satellite lowers the total energy resulting in a decrease in the radius
of the orbit and a concomitant increase in velocity. That is, when the orbit radius is decreased, part of the
decrease in potential energy accounts for the work done against the drag, and the remaining part goes towards
increase of the kinetic energy. Also note that, as predicted by the Virial Theorem, the kinetic energy for a
circular orbit always is half the magnitude of the potential energy for the inverse square law force.
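$$\dot{\mathbf{p}} = f(r)\hat{\mathbf{r}} \tag{11.79}$$
Note that the angular momentum $\mathbf{L} = \mathbf{r} \times \mathbf{p}$ is conserved for a central force, that is $\dot{\mathbf{L}} = 0$. Therefore the time
derivative of the product $\mathbf{p} \times \mathbf{L}$ reduces to
$$\frac{d}{dt}(\mathbf{p} \times \mathbf{L}) = \dot{\mathbf{p}} \times \mathbf{L} = f(r)\hat{\mathbf{r}} \times (\mathbf{r} \times \mu\dot{\mathbf{r}}) = f(r)\frac{\mu}{r}\left[\mathbf{r}(\mathbf{r} \cdot \dot{\mathbf{r}}) - r^2\dot{\mathbf{r}}\right] \tag{11.80}$$
This can be simplified using the fact that
$$\mathbf{r} \cdot \dot{\mathbf{r}} = \frac{1}{2}\frac{d}{dt}(\mathbf{r} \cdot \mathbf{r}) = r\dot{r} \tag{11.81}$$
thus
$$f(r)\frac{\mu}{r}\left[\mathbf{r}(\mathbf{r} \cdot \dot{\mathbf{r}}) - r^2\dot{\mathbf{r}}\right] = -f(r)\mu r^2\left[\frac{\dot{\mathbf{r}}}{r} - \frac{\mathbf{r}\dot{r}}{r^2}\right] = -f(r)\mu r^2\frac{d}{dt}\left(\frac{\mathbf{r}}{r}\right) \tag{11.82}$$
This allows equation 11.80 to be reduced to
$$\frac{d}{dt}(\mathbf{p} \times \mathbf{L}) = -f(r)\mu r^2\frac{d}{dt}\left(\frac{\mathbf{r}}{r}\right) \tag{11.83}$$
Assume the special case of the inverse-square law, equation 11.52, then the central force equation 11.83
reduces to
$$\frac{d}{dt}(\mathbf{p} \times \mathbf{L}) = -\mu k\frac{d}{dt}(\hat{\mathbf{r}}) \tag{11.84}$$
or
$$\frac{d}{dt}\left[(\mathbf{p} \times \mathbf{L}) + (\mu k\hat{\mathbf{r}})\right] = 0 \tag{11.85}$$
Define the eccentricity vector $\mathbf{A}$ as
$$\mathbf{A} \equiv (\mathbf{p} \times \mathbf{L}) + (\mu k\hat{\mathbf{r}}) \tag{11.86}$$
then equation 11.85 corresponds to
$$\frac{d\mathbf{A}}{dt} = 0 \tag{11.87}$$
This is a statement that the eccentricity vector is a constant of motion for an inverse-square, central
force.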
The definition of the eccentricity vector $\mathbf{A}$ and angular momentum vector $\mathbf{L}$ implies a zero scalar product,
$$\mathbf{A} \cdot \mathbf{L} = 0 \tag{11.88}$$
Thus the eccentricity vector $\mathbf{A}$ and angular momentum $\mathbf{L}$ are mutually perpendicular, that is, $\mathbf{A}$ is in the
plane of the orbit while $\mathbf{L}$ is perpendicular to the plane of the orbit. The eccentricity vector $\mathbf{A}$ always points
along the major axis of the ellipse from the focus to the periapsis as illustrated on the left side in figure 11.7.
2 The symmetry underlying the eccentricity vector is less intuitive than the energy or angular momentum invariants, leading
to it being discovered independently several times during the past three centuries. Jakob Hermann was the first to identify
this invariant for the special case of the inverse-square central force. Bernoulli generalized his proof in 1710. Laplace derived
the invariant at the end of the 18th century using analytical mechanics. Hamilton derived the connection between the invariant
and the orbit eccentricity. Gibbs derived the invariant using vector analysis. Runge published Gibbs' derivation in his
textbook which was referenced by Lenz in a 1924 paper on the quantal model of the hydrogen atom. Goldstein named this
invariant the "Laplace-Runge-Lenz vector", while others have named it the "Runge-Lenz vector" or the "Lenz vector". This
book uses Hamilton's more intuitive name of "eccentricity vector".
Figure 11.7: The elliptical trajectory and eccentricity vector $\mathbf{A}$ for two bodies interacting via the inverse-
square, central force for eccentricity $\epsilon = 0.75$. The left plot shows the elliptical spatial trajectory where
the semi-major axis is assumed to be on the $x$-axis and the angular momentum $\mathbf{L} = l\hat{\mathbf{z}}$ is out of the page.
The force centre is at one focus of the ellipse. The vector coupling relation $\mathbf{A} \equiv (\mathbf{p} \times \mathbf{L}) + (\mu k\hat{\mathbf{r}})$ is illustrated
at four points on the spatial trajectory. The right plot is a hodograph of the linear momentum $\mathbf{p}$ for this
trajectory. The periapsis is denoted by the number 1 and the apoapsis is marked as 3 on both plots. Note
that the eccentricity vector $\mathbf{A}$ is a constant that points parallel to the major axis towards the periapsis.
As a consequence, the two orthogonal vectors $\mathbf{A}$ and $\mathbf{L}$ completely define the plane of the orbit, plus the
orientation of the major axis of the Kepler orbit, in this plane. The three vectors $\mathbf{A}$, $\mathbf{p} \times \mathbf{L}$, and $(\mu k\hat{\mathbf{r}})$ obey
the triangle rule as illustrated in the left side of figure 11.7.
Hamilton noted the direct connection between the eccentricity vector $\mathbf{A}$ and the eccentricity $\epsilon$ of the
conic section orbit. This can be shown by considering the scalar product
$$\mathbf{r} \cdot (\mathbf{p} \times \mathbf{L}) = (\mathbf{r} \times \mathbf{p}) \cdot \mathbf{L} = \mathbf{L} \cdot \mathbf{L} = l^2 \tag{11.90}$$
Note that equations 11.63 and 11.91 are identical if $\psi_0 = 0$. This implies that the eccentricity $\epsilon$ and $A$
are related by
$$\epsilon = -\frac{A}{\mu k} \tag{11.92}$$
where $k$ is defined to be negative for an attractive force. The relation between the eccentricity and total
center-of-mass energy can be used to rewrite equation 11.62 in the form
$$A^2 = \mu^2 k^2 + 2\mu E_{cm} l^2 \tag{11.93}$$
The combination of the eccentricity vector A and the angular momentum vector L completely specifies
the orbit for an inverse square-law central force. The trajectory is in the plane perpendicular to the angu-
lar momentum vector L, while the eccentricity, plus the orientation of the orbit, both are defined by the
eccentricity vector A. The eccentricity vector and angular momentum vector each have three independent
coordinates, that is, these two vector invariants provide six constraints, while the scalar invariant energy
adds one additional constraint. The exact location of the particle moving along the trajectory is not defined
and thus there are only five independent coordinates governed by the above seven constraints. Thus the
eccentricity vector, angular momentum, and center-of-mass energy are related by the two equations 11.88
and 11.93.
Noether's theorem states that each conservation law is a manifestation of an underlying symmetry.
Identification of the underlying symmetry responsible for the conservation of the eccentricity vector $\mathbf{A}$ is
elucidated using equation 11.86 to give
$$(\mu k\hat{\mathbf{r}}) = \mathbf{A} - (\mathbf{p} \times \mathbf{L}) \tag{11.94}$$
Take the scalar product
$$(\mu k\hat{\mathbf{r}}) \cdot (\mu k\hat{\mathbf{r}}) = (\mu k)^2 = p^2 l^2 + A^2 - 2\mathbf{A} \cdot (\mathbf{p} \times \mathbf{L}) \tag{11.95}$$
Choose the angular momentum to be along the $z$-axis, that is, $\mathbf{L} = l\hat{\mathbf{z}}$, and, since $\mathbf{p}$ and $\mathbf{A}$ are perpendicular
to $\mathbf{L}$, then $\mathbf{p}$ and $\mathbf{A}$ are in the $\hat{\mathbf{x}} - \hat{\mathbf{y}}$ plane. Assume that the semimajor axis of the elliptical orbit is along
the $x$-axis, then the locus of the momentum vector on a momentum hodograph has the equation
$$p_x^2 + \left(p_y - \frac{A}{l}\right)^2 = \left(\frac{\mu k}{l}\right)^2 \tag{11.96}$$
Equation 11.96 implies that the locus of the momentum vector is a circle of radius $\left|\frac{\mu k}{l}\right|$ with the center
displaced from the origin at coordinates $\left(0, \frac{A}{l}\right)$ as shown by the momentum hodograph on the right side of
figure 11.7. The angle $\beta$ and eccentricity $\epsilon$ are related by,
$$\cos\beta = -\frac{A/l}{\mu k/l} = -\frac{A}{\mu k} = \epsilon \tag{11.97}$$
The hodograph for a circular orbit is centered at the origin since $\epsilon = -\frac{A}{\mu k} = 0$, and thus the magnitude $|\mathbf{p}|$ is a
constant around the whole trajectory.
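Both the constancy of the eccentricity vector and the circular momentum hodograph can be checked numerically on a Kepler orbit. The sketch below assumes the orbit $1/r = -(\mu k/l^2)[1 + \epsilon\cos\psi]$ with $k < 0$ and $\mathbf{L} = l\hat{\mathbf{z}}$, builds the momentum from $\dot{\psi} = l/(\mu r^2)$, and evaluates $\mathbf{A} = \mathbf{p} \times \mathbf{L} + \mu k\hat{\mathbf{r}}$ at several polar angles. The function names and parameter values are illustrative assumptions.

```python
import math


def kepler_state(psi, eps, l, mu, k):
    """Position and momentum on the orbit 1/r = -(mu k/l^2)(1 + eps cos psi)."""
    P = -l**2 / (mu * k)                      # semi-latus rectum, positive for k < 0
    r = P / (1.0 + eps * math.cos(psi))
    psi_dot = l / (mu * r**2)                 # from l = mu r^2 psi_dot
    r_dot = l * eps * math.sin(psi) / (mu * P)
    x, y = r * math.cos(psi), r * math.sin(psi)
    px = mu * (r_dot * math.cos(psi) - r * psi_dot * math.sin(psi))
    py = mu * (r_dot * math.sin(psi) + r * psi_dot * math.cos(psi))
    return (x, y), (px, py)


def eccentricity_vector(pos, mom, l, mu, k):
    """In-plane components of A = p x L + mu k r_hat for L = l z_hat."""
    (x, y), (px, py) = pos, mom
    r = math.hypot(x, y)
    return (py * l + mu * k * x / r, -px * l + mu * k * y / r)


eps, l, mu, k = 0.75, 1.0, 1.0, -1.0
vectors, hodograph_residuals = [], []
for psi in (0.0, 1.0, 2.5, 4.0):
    pos, mom = kepler_state(psi, eps, l, mu, k)
    A = eccentricity_vector(pos, mom, l, mu, k)
    vectors.append(A)
    # Residual of p_x^2 + (p_y - A/l)^2 = (mu k / l)^2 with A = -mu k eps.
    A_mag = -mu * k * eps
    hodograph_residuals.append(mom[0]**2 + (mom[1] - A_mag / l)**2 - (mu * k / l)**2)
```

At every sampled angle the vector equals $(-\mu k\epsilon, 0)$, pointing along the major axis towards the periapsis, and the momentum lies on the displaced circle.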
The inverse-square, central, two-body, force is unusual in that it leads to stable closed bound orbits
because the radial and angular frequencies are degenerate, i.e. $\omega_r = \omega_\psi$. In momentum space, the locus of
the linear momentum vector $\mathbf{p}$ is a perfect circle which is the underlying symmetry responsible for both the
fact that the orbits are closed, and the invariance of the eccentricity vector. Mathematically this symmetry
for the Kepler problem corresponds to the body moving freely on the boundary of a four-dimensional sphere
in space and momentum. The invariance of the eccentricity vector is a manifestation of the special property
of the inverse-square, central force under certain rotations in this four-dimensional space; this O(4) symmetry
is an example of a hidden symmetry.
be expressed in polar coordinates. In addition, since the force is spherically symmetric, then the angular
momentum is conserved. The orbit solutions are conic sections as described in chapter 11.7. The shape of
the orbit for the harmonic two-body central force can be derived using either polar or cartesian coordinates
as illustrated below.
The right-hand side of equation 11.105 is a constant. The solution of 11.105 must be a sine or cosine function
of twice the polar angle, $2\psi$. That is
$$\left(u' - \frac{\mu E_{cm}}{l^2}\right) = \left[\left(\frac{\mu E_{cm}}{l^2}\right)^2 + \frac{\mu k}{l^2}\right]^{\frac{1}{2}}\cos 2(\psi - \psi_0) \tag{11.106}$$
That is,
$$u' = \frac{1}{r^2} = \frac{\mu E_{cm}}{l^2}\left(1 + \left(1 + \frac{k l^2}{\mu E_{cm}^2}\right)^{\frac{1}{2}}\cos 2(\psi - \psi_0)\right) \tag{11.107}$$
Equation 11.107 corresponds to a closed orbit centered at the origin of the elliptical orbit as illustrated in
figure 11.8. The eccentricity of this closed orbit is given by
$$\left(1 + \frac{k l^2}{\mu E_{cm}^2}\right)^{\frac{1}{2}} = \frac{\epsilon^2}{2 - \epsilon^2} \tag{11.108}$$
Equations 11.66 and 11.67 give that the eccentricity is related to the semi-major and semi-minor axes by
$$\epsilon^2 = 1 - \left(\frac{b}{a}\right)^2 \tag{11.109}$$
Note that for a repulsive force $k > 0$, then $\epsilon \geq 1$ leading to unbound hyperbolic or parabolic orbits centered
on the origin. An attractive force, $k < 0$, allows for bound elliptical, as well as unbound parabolic and
hyperbolic orbits.
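The claim that this orbit is an ellipse centered on the force center can be verified directly. The sketch below reads equation 11.107 as $1/r^2 = (\mu E_{cm}/l^2)\left(1 + s\cos 2(\psi - \psi_0)\right)$ with $s = \sqrt{1 + k l^2/(\mu E_{cm}^2)}$ and $k < 0$ for the attractive force $F = kr$, which is an assumption about the sign convention. It then checks that the points satisfy the cartesian equation of an ellipse centered at the origin and that the axes reproduce the eccentricity relations 11.108 and 11.109. Names and parameter values are illustrative.

```python
import math


def modulation(E_cm, l, mu, k):
    """s = sqrt(1 + k l^2 / (mu E_cm^2)), the amplitude factor in equation 11.107."""
    return math.sqrt(1.0 + k * l**2 / (mu * E_cm**2))


def inverse_r_squared(psi, E_cm, l, mu, k, psi0=0.0):
    """u' = 1/r^2 for the linear central force F = k r (k < 0 attractive)."""
    s = modulation(E_cm, l, mu, k)
    return (mu * E_cm / l**2) * (1.0 + s * math.cos(2.0 * (psi - psi0)))


def ellipse_axes(E_cm, l, mu, k):
    """Semi-major axis a and semi-minor axis b implied by the extremes of 1/r^2."""
    s = modulation(E_cm, l, mu, k)
    c = mu * E_cm / l**2
    a = 1.0 / math.sqrt(c * (1.0 - s))   # largest r, where cos 2(psi - psi0) = -1
    b = 1.0 / math.sqrt(c * (1.0 + s))   # smallest r, where cos 2(psi - psi0) = +1
    return a, b


E_cm, l, mu, k = 2.0, 1.0, 1.0, -1.0
a, b = ellipse_axes(E_cm, l, mu, k)
residuals = []
for psi in (0.0, 0.4, 1.1, 2.0, 3.5, 5.0):
    r = 1.0 / math.sqrt(inverse_r_squared(psi, E_cm, l, mu, k))
    x, y = r * math.cos(psi), r * math.sin(psi)
    # With psi0 = 0 the minor axis lies along x and the major axis along y.
    residuals.append((x / b)**2 + (y / a)**2 - 1.0)

eps_squared = 1.0 - (b / a)**2
```

The residuals vanish to rounding error, and `eps_squared / (2 - eps_squared)` equals the modulation factor, as equation 11.108 requires.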
11.9. ISOTROPIC, LINEAR, TWO-BODY, CENTRAL FORCE 265
Figure 11.8: The elliptical equivalent trajectory for two bodies interacting via the linear, central force for
eccentricity $\epsilon = 0.75$. The left plot (axes $x$, $y$) shows the elliptical spatial trajectory where the semi-major
axis is assumed to be on the $x$-axis and the angular momentum $\mathbf{L} = l\hat{\mathbf{z}}$ is out of the page. The force center
is at the center of the ellipse. The right plot (axes $p_x$, $p_y$) is a hodograph of the linear momentum $\mathbf{p}$ for this
trajectory.
Solutions for the independent coordinates, and their corresponding momenta, are
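$$r^2 = x^2 + y^2 = \left[A\cos(\omega t + \alpha)\right]^2 + \left[B\cos(\omega t + \beta)\right]^2 \tag{11.113}$$
$$= \frac{A^2 + B^2}{2} + \frac{\sqrt{A^4 + B^4 + 2A^2B^2\cos 2(\alpha - \beta)}}{2}\cos(2\omega t + \psi_0)$$
where
$$\cos\psi_0 = \frac{A^2\cos 2\alpha + B^2\cos 2\beta}{\sqrt{A^4 + B^4 + 2A^2B^2\cos 2(\alpha - \beta)}} \tag{11.114}$$
For a phase difference $\alpha - \beta = \pm\frac{\pi}{2}$ this equation describes an ellipse centered at the origin which agrees
with equation 11.107 that was derived using polar coordinates.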
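The reduction of $r^2$ to a constant plus a single cosine at twice the oscillator frequency can be confirmed numerically. The sketch below assumes the cartesian solutions $x = A\cos(\omega t + \alpha)$ and $y = B\cos(\omega t + \beta)$ and writes the combined form with the doubled phases $2\alpha$ and $2\beta$ that the double-angle identity produces; the phase $\psi_0$ is obtained with `atan2` so that its quadrant is unambiguous. The function names and the sample values are assumptions for illustration.

```python
import math


def r_squared_direct(t, A, B, omega, alpha, beta):
    """x^2 + y^2 for x = A cos(omega t + alpha), y = B cos(omega t + beta)."""
    return (A * math.cos(omega * t + alpha))**2 + (B * math.cos(omega * t + beta))**2


def r_squared_combined(t, A, B, omega, alpha, beta):
    """Same quantity written as a constant plus one cosine at frequency 2 omega."""
    amplitude = math.sqrt(A**4 + B**4 + 2.0 * A**2 * B**2 * math.cos(2.0 * (alpha - beta)))
    psi0 = math.atan2(A**2 * math.sin(2.0 * alpha) + B**2 * math.sin(2.0 * beta),
                      A**2 * math.cos(2.0 * alpha) + B**2 * math.cos(2.0 * beta))
    return 0.5 * (A**2 + B**2) + 0.5 * amplitude * math.cos(2.0 * omega * t + psi0)


A, B, omega, alpha, beta = 1.3, 0.7, 2.0, 0.4, -0.9
differences = [
    r_squared_direct(t, A, B, omega, alpha, beta)
    - r_squared_combined(t, A, B, omega, alpha, beta)
    for t in (0.0, 0.3, 1.1, 2.7, 5.2)
]
```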
The two normal modes of the isotropic harmonic oscillator are degenerate, therefore $x$, $y$ are equally good
normal modes with two corresponding total energies, $E_1$, $E_2$, while the corresponding angular momentum $l$
points in the $z$ direction.
$$E_1 = \frac{p_x^2}{2\mu} + \frac{1}{2}kx^2 \tag{11.115}$$
$$E_2 = \frac{p_y^2}{2\mu} + \frac{1}{2}ky^2 \tag{11.116}$$
$$l = (xp_y - yp_x) \tag{11.117}$$
Figure 11.8 shows the closed elliptical equivalent orbit plus the corresponding momentum hodograph for
the isotropic harmonic two-body central force. Figures 11.7 and 11.8 contrast the differences between the
elliptical orbits for the inverse-square force, and those for the harmonic two-body central force. Although
bound systems with the harmonic two-body force, and with the inverse-square force, both lead to
elliptical bound orbits, there are important differences. Both the radial motion and momentum are two
valued per cycle for the reflection-symmetric harmonic oscillator, whereas the radius and momentum have
only one maximum and one minimum per revolution for the inverse-square law. Although the inverse-square,
and the isotropic, harmonic, two-body central forces both lead to closed bound elliptical orbits for which the
angular momentum is conserved and the orbits are planar, there is another important difference between the
orbits for these two interactions. The orbit equation for the Kepler problem is expressed with respect to a
focus of the elliptical equivalent orbit, as illustrated in figure 11.7, whereas the orbit equation for the isotropic
harmonic oscillator orbit is expressed with respect to the center of the ellipse as illustrated in figure 11.8.
The diagonal matrix elements $A'_{11} = E_1$ and $A'_{22} = E_2$ are constants of motion. The off-diagonal
term is given by
$$A'^2_{12} \equiv \left(\frac{p_x p_y}{2\mu} + \frac{1}{2}kxy\right)^2 = \left(\frac{p_x^2}{2\mu} + \frac{1}{2}kx^2\right)\left(\frac{p_y^2}{2\mu} + \frac{1}{2}ky^2\right) - \frac{k}{4\mu}(xp_y - yp_x)^2 = E_1E_2 - \frac{kl^2}{4\mu} \tag{11.119}$$
The terms on the right-hand side of equation 11.119 all are constants of motion, therefore $A'^2_{12}$ also is a
constant of motion. Thus the 3 × 3 symmetry tensor $\mathbf{A}'$ can be reduced to a 2 × 2 symmetry tensor for which
all the matrix elements are constants of motion, and the trace of the symmetry tensor is equal to the total
energy.
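The algebraic identity behind the off-diagonal term can be spot-checked with arbitrary numbers. The sketch below assumes the symmetry tensor elements have the form $A'_{ij} = \frac{p_i p_j}{2\mu} + \frac{1}{2}k x_i x_j$, so that $A'_{11} = E_1$ and $A'_{22} = E_2$, and compares $A'^2_{12}$ with $E_1E_2 - kl^2/(4\mu)$. The function name and the sample phase-space point are hypothetical.

```python
def symmetry_tensor_terms(x, y, px, py, mu, k):
    """Return E1, E2, l and A'_12 for the planar isotropic harmonic oscillator."""
    E1 = px**2 / (2.0 * mu) + 0.5 * k * x**2
    E2 = py**2 / (2.0 * mu) + 0.5 * k * y**2
    l = x * py - y * px
    A12 = px * py / (2.0 * mu) + 0.5 * k * x * y
    return E1, E2, l, A12


mu, k = 2.0, 3.0
E1, E2, l, A12 = symmetry_tensor_terms(0.7, -0.4, 0.3, 1.1, mu, k)
lhs = A12**2
rhs = E1 * E2 - k * l**2 / (4.0 * mu)
```

The identity holds at any phase-space point, not only along a trajectory, which is why the constancy of $E_1$, $E_2$, and $l$ immediately makes $A'^2_{12}$ a constant of motion.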
In summary, the inverse-square, and harmonic oscillator two-body central interactions both lead to closed,
elliptical equivalent orbits, the plane of which is perpendicular to the conserved angular momentum vector.
However, for the inverse-square force, the origin of the equivalent orbit is at the focus of the ellipse and
$\omega_r = \omega_\psi$, whereas the origin is at the center of the ellipse and $\omega_r = 2\omega_\psi$ for the harmonic force. As a
consequence, the elliptical orbit is reflection symmetric for the harmonic force but not for the inverse square
force. The eccentricity vector and symmetry tensor both specify the major axes of these elliptical orbits,
the planes of which are perpendicular to the angular momentum vector. The eccentricity vector, and the
symmetry tensor, both are directly related to the eccentricity of the orbit and the total energy of the two-
body system. Noether's theorem states that the invariance of the eccentricity vector and symmetry tensor,
plus the corresponding closed orbits, are manifestations of underlying symmetries. The dynamical SU(3)
symmetry underlies the invariance of the symmetry tensor, whereas the dynamical O(4) symmetry underlies
the invariance of the eccentricity vector. These symmetries lead to stable closed elliptical bound orbits only
for these two specific two-body central forces, and not for other two-body central forces.
11.10. CLOSED-ORBIT STABILITY 267
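$$F(r_0) = -\frac{l^2}{\mu r_0^3} = -\mu r_0\dot{\psi}^2 \tag{11.123}$$
which can be written in terms of the central force for a stable orbit as
$$-\left[\left(\frac{\partial F}{\partial r}\right)_{r_0} + \frac{3F(r_0)}{r_0}\right] > 0 \tag{11.128}$$
To the extent that this linear restoring force dominates over higher-order terms, then a perturbation of the
stable orbit will undergo simple harmonic oscillations about the stable orbit with angular frequency
$$\omega = \sqrt{\frac{\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r = r_0}}{\mu}} \tag{11.133}$$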
The above discussion shows that a small amplitude radial oscillation $\xi$ about the stable orbit, with amplitude
$A$, will be of the form
$$\xi = A\sin(2\pi\nu t + \delta)$$
where $\nu = \omega/2\pi$ is the oscillation frequency. The orbit will be closed if the product of the oscillation
frequency $\nu$ and the orbit period $\tau$ is an integer value.
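The closure condition can be made concrete by comparing the radial oscillation frequency with the orbital angular frequency of the circular orbit. The sketch below uses the stability expression $-\left[(\partial F/\partial r)_{r_0} + 3F(r_0)/r_0\right] = \mu\omega_r^2$ together with $-F(r_0)/r_0 = \mu\dot{\psi}^2$ from equation 11.123, so the ratio $\omega_r/\omega_\psi$ is independent of $\mu$. The function name and the sample force laws are assumptions made for this illustration.

```python
import math


def frequency_ratio(force, dforce, r0):
    """omega_r / omega_psi for small radial oscillations about a circular orbit.

    force(r) is the radial force (negative for attraction) and dforce(r) its
    derivative with respect to r.
    """
    F, dF = force(r0), dforce(r0)
    stability = -(dF + 3.0 * F / r0)      # equals mu * omega_r^2, must be positive
    if stability <= 0.0:
        raise ValueError("circular orbit is not stable for this force law")
    orbital = -F / r0                     # equals mu * omega_psi^2
    return math.sqrt(stability / orbital)


r0 = 1.7
ratio_inverse_square = frequency_ratio(lambda r: -1.0 / r**2, lambda r: 2.0 / r**3, r0)
ratio_harmonic = frequency_ratio(lambda r: -r, lambda r: -1.0, r0)
ratio_cubic = frequency_ratio(lambda r: -r**3, lambda r: -3.0 * r**2, r0)
```

The inverse-square force gives a ratio of 1 and the linear force a ratio of 2, so both close after one revolution, whereas the cubic force gives $\sqrt{6}$, an irrational ratio, and the perturbed orbit does not close.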
The fact that planetary orbits in the gravitational field are observed to be closed is strong evidence
that the gravitational force field must obey the inverse square law. Actually there are small precessions of
planetary orbits due to perturbations of the gravitational field by bodies other than the sun, and due to
relativistic effects. Also the gravitational field near the earth departs slightly from the inverse square law
because the earth is not a perfect sphere, and the field does not have perfect spherical symmetry. The study
of the precession of satellites around the earth has been used to determine the oblate quadrupole and slight
octupole (pear shape) distortion of the shape of the earth.
The most famous test of the inverse square law for gravitation is the precession of the perihelion of
Mercury. If the attractive force experienced by Mercury is of the form
$$\mathbf{F}(r) = -\frac{k}{r^{2+\alpha}}\hat{\mathbf{r}}$$
where $|\alpha|$ is small, then it can be shown that, for approximately circular orbits, the perihelion will advance
by a small angle $\pi\alpha$ per orbit period. That is, the precession is zero if $\alpha = 0$, corresponding to an inverse
square law dependence which agrees with Bertrand's theorem. The position of the perihelion of Mercury has
been measured with great accuracy showing that, after correcting for all known perturbations, the perihelion
advances by 43(±5) seconds of arc per century, that is $5 \times 10^{-7}$ radians per revolution. This corresponds to
$\alpha = 1.6 \times 10^{-7}$ which is small but still significant. This precession remained a puzzle for many years until
1915 when Einstein predicted that one consequence of his general theory of relativity is that the planetary
orbit of Mercury should precess at 43 seconds of arc per century, which is in remarkable agreement with
observations.
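The unit conversion quoted above is easy to reproduce. The sketch below converts 43 seconds of arc per century into radians per revolution and then into the exponent correction, assuming an advance of $\pi\alpha$ per revolution. The orbital period of Mercury, about 87.97 days, is an assumed value that is not given in the text, and the names are hypothetical.

```python
import math

ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)


def advance_per_revolution(arcsec_per_century, period_days):
    """Perihelion advance in radians per orbital revolution."""
    revolutions_per_century = 100.0 * 365.25 / period_days
    return arcsec_per_century * ARCSEC_TO_RAD / revolutions_per_century


MERCURY_PERIOD_DAYS = 87.97     # assumed sidereal period, not quoted in the text
advance = advance_per_revolution(43.0, MERCURY_PERIOD_DAYS)
alpha = advance / math.pi       # assumes an advance of pi * alpha per revolution
```

The result is close to $5 \times 10^{-7}$ radians per revolution and $\alpha \approx 1.6 \times 10^{-7}$, matching the values quoted.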
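$$U_{eff} = \frac{1}{2}kr^2 + \frac{l^2}{2\mu r^2}$$
At the minimum
$$\left(\frac{\partial U_{eff}}{\partial r}\right)_{r = r_0} = kr_0 - \frac{l^2}{\mu r_0^3} = 0$$
Thus
$$r_0 = \left(\frac{l^2}{\mu k}\right)^{\frac{1}{4}}$$
and
$$\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r = r_0} = k + \frac{3l^2}{\mu r_0^4} = 4k > 0$$
which is a stable orbit. Small perturbations of such a stable circular orbit will have an angular frequency
$$\omega = \sqrt{\frac{\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r = r_0}}{\mu}} = 2\sqrt{\frac{k}{\mu}}$$
Note that this is twice the frequency for the planar harmonic oscillator with the same restoring coefficient.
This is because, owing to the central repulsion, the effective potential well for this rotating oscillator example
has about half the width of that for the corresponding planar harmonic oscillator. Note that the kinetic energy
for the rotational motion, which is $\frac{l^2}{2\mu r^2}$, equals the potential energy $\frac{1}{2}kr^2$ at the minimum as predicted by
the Virial Theorem for a linear two-body restoring force.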
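$$r = r_0 e^{\alpha\psi}$$
That this is true can be shown by inserting this orbit into the differential orbit equation.
Using a Binet transformation of the variable $r$ to $u$ gives
$$u = \frac{1}{r} = \frac{1}{r_0}e^{-\alpha\psi}$$
$$\frac{du}{d\psi} = -\frac{\alpha}{r_0}e^{-\alpha\psi}$$
$$\frac{d^2u}{d\psi^2} = \frac{\alpha^2}{r_0}e^{-\alpha\psi}$$
Inserting these into the differential orbit equation
$$\frac{d^2u}{d\psi^2} + u = -\frac{\mu}{l^2}\frac{1}{u^2}F\left(\frac{1}{u}\right)$$
gives
$$\frac{\alpha^2}{r_0}e^{-\alpha\psi} + \frac{1}{r_0}e^{-\alpha\psi} = -\frac{\mu}{l^2}r_0^2e^{2\alpha\psi}F\left(\frac{1}{u}\right)$$
That is
$$F\left(\frac{1}{u}\right) = -\frac{\left(\alpha^2 + 1\right)l^2}{\mu}r_0^{-3}e^{-3\alpha\psi} = -\frac{\left(\alpha^2 + 1\right)l^2}{\mu r^3}$$
which is a central attractive inverse cubic force.
The time dependence of the spiral orbit can be derived since the angular momentum gives
$$\dot{\psi} = \frac{l}{\mu r^2} = \frac{l}{\mu r_0^2e^{2\alpha\psi}}$$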
̇2 + + ( − ) =
22
The effective potential is
2
= + ( − )
22
which is shown in the adjacent figure. The stationary value occurs when
µ ¶
2
= − 3 + = 0
0 0
2 = 2 03
Note that 0 = 0 if = 0.
The stability of the solution is given by the second deriv-
ative µ 2 ¶
32 3
2
= 4 = 0
0 0 0
Therefore the stationary point is stable.
Note that the equation of motion for the minimum can be
expressed in terms of the restoring force on the two masses
µ 2 ¶
2̈ = − ( − 0 )
2 0
Even when all the bodies are interacting via two-body central forces, the problem usually is insoluble in
terms of known analytic integrals. Newton first posed the difficulty of the three-body Kepler problem which
has been studied extensively by mathematicians and physicists. No known general analytic integral solution
has been found. Each body for the $n$-body system has 6 degrees of freedom, that is, 3 for position and 3 for
momentum. The center-of-mass motion can be factored out, therefore the center-of-mass system for the
$n$-body system has $6n - 10$ degrees of freedom after subtraction of 3 degrees for location of the center of
mass, 3 for the linear momentum of the center of mass, 3 for rotation of the center of mass, and 1 for the
total energy of the system. Thus for $n = 2$ there are $12 - 10 = 2$ degrees of freedom for the two-body system
for which the Kepler approach takes to be $r$ and $\psi$. For $n = 3$ there are 8 degrees of freedom in the center
of mass system that have to be determined.
Figure 11.10: A contour plot of the effective potential for the Sun-Earth gravitational system in the rotating
frame where the Sun and Earth are stationary. The 5 Lagrange points $L_i$ are saddle points where the net
force is zero. (Figure created by NASA)
Numerical solutions to the three-body problem can be obtained using successive approximation or per-
turbation methods in computer calculations. The problem can be simplified by restricting the motion to
either of the following two approximations:
1) Planar approximation
This approximation assumes that the three masses move in the same plane, that is, the number of degrees
of freedom is reduced from 8 to 6 which simplifies the numerical solution.
The polar angle is measured with respect to the symmetry axis of the two-body system which is along
the line of distance of closest approach as shown in figure 116. The geometry and symmetry show that the
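scattering angle $\theta$ is related to the trajectory angle $\psi_\infty$ by
$$\theta = \pi - 2\psi_\infty \tag{11.144}$$
Since
$$l^2 = b^2 p^2 = b^2\, 2\mu E_{cm} \tag{11.146}$$
then the scattering angle can be written as
$$\psi_\infty = \frac{\pi - \theta}{2} = \int_{r_{min}}^{\infty} \frac{b\, dr}{r^2 \sqrt{\left(1 - \frac{U}{E_{cm}}\right) - \frac{b^2}{r^2}}} \tag{11.147}$$
Let $u = \frac{1}{r}$, then
$$\psi_\infty = \frac{\pi - \theta}{2} = \int_{0}^{u_{max}} \frac{b\, du}{\sqrt{\left(1 - \frac{U}{E_{cm}}\right) - b^2 u^2}} \tag{11.148}$$
where $u_{max} = 1/r_{min}$.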
This cross section assumes elastic scattering by a repulsive two-body inverse-square central force. For
scattering of nuclei in the Coulomb potential, the constant $k$ is given to be
$$k = \frac{Z_P Z_T e^2}{4\pi\varepsilon_0} \tag{11.160}$$
The cross section, scattering angle and $E_{cm}$ of equation 11.159 are evaluated in the center-of-mass
coordinate system, whereas usually two-body elastic scattering data involve scattering of the projectiles by a
stationary target as discussed in chapter 11.13.
11.12. TWO-BODY SCATTERING 277
Geiger and Marsden performed scattering of 7.7 MeV $\alpha$ particles from a thin gold foil and proved that
the differential scattering cross section obeyed the Rutherford formula back to angles corresponding to a
distance of closest approach of $10^{-14}$ m, which is much smaller than the $10^{-10}$ m size of the atom. This
validated the Rutherford model of the atom and immediately led to the Bohr model of the atom which
played such a crucial role in the development of quantum mechanics. Bohr showed that the agreement with
the Rutherford formula implies the Coulomb field obeys the inverse square law to small distances. This work
was performed at Manchester University, England between 1908 and 1913. It is fortunate that the classical
result is identical to the quantal cross section for scattering, otherwise the development of modern physics
could have been delayed for many years.
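The scale of the distance of closest approach can be estimated with a short calculation. The sketch below assumes a head-on collision, for which the kinetic energy is fully converted to Coulomb potential energy so that $r_{min} = k/E$, takes the α-particle energy to be 7.7 MeV, uses the approximate value $e^2/4\pi\varepsilon_0 \approx 1.44$ MeV fm, and ignores the small difference between laboratory and center-of-mass energy for a heavy gold target. The function names are hypothetical.

```python
E2_OVER_4PI_EPS0 = 1.44  # MeV fm, approximate value of e^2 / (4 pi epsilon_0)


def coulomb_constant(z_projectile: int, z_target: int) -> float:
    """Coulomb constant k = Z_P Z_T e^2 / (4 pi epsilon_0) in MeV fm."""
    return z_projectile * z_target * E2_OVER_4PI_EPS0


def head_on_closest_approach_fm(z_projectile: int, z_target: int, energy_mev: float) -> float:
    """Distance of closest approach for zero impact parameter, r_min = k / E."""
    return coulomb_constant(z_projectile, z_target) / energy_mev


# Alpha particle (Z = 2) on gold (Z = 79), energy assumed to be 7.7 MeV.
r_min_fm = head_on_closest_approach_fm(2, 79, 7.7)
print(f"r_min = {r_min_fm:.1f} fm = {r_min_fm * 1e-15:.1e} m")
```

Under these assumptions the result is a few times $10^{-14}$ m, the same order of magnitude as the value quoted above and far smaller than the atomic size.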
Scattering of very heavy ions, such as $^{208}$Pb, can electromagnetically excite target nuclei. For the Coulomb
force the impact parameter $b$ and the distance of closest approach, $r_{min}$, are directly related to the scattering
angle $\theta$ by equation 11.155. Thus observing the angle of the scattered projectile unambiguously determines
the hyperbolic trajectory and thus the electromagnetic impulse given to the colliding nuclei. This process,
called Coulomb excitation, uses the measured angular distribution of the scattered ions for inelastic excitation
of the nuclei to precisely and unambiguously determine the Coulomb excitation cross section as a function
of impact parameter. This unambiguously determines the shape of the nuclear charge distribution.
̇ = ̇ = =− = − cos ()
2
1
√
The initial energy gives that = 2 Hence the orbit equation is
√
1 2
= = sin ()
The above trajectory has a distance of closest approach, min , when min = 2 . Moreover, due to the
symmetry of the orbit, the scattering angle is given by
µ ¶
1
= − 2 0 = 1 −
Since 2 = 2 2 ̇∞
2
= 22 then
µ ¶− 12 µ ¶− 12
2
1− = 1+ 2 = 1+ 2
This gives that the impact parameter is related to scattering angle by
2
( − )
2 =
(2 − )
This impact parameter relation can be used in equation 11.141 to give the differential cross section
¯ ¯
¯¯ ¯¯ 2 ( − )
= =
Ω sin ¯ ¯ (2 − )2 2
These orbits are called Cotes spirals.
278 CHAPTER 11. CONSERVATIVE TWO-BODY CENTRAL FORCES
$$\mathbf{p}^{Initial}_{P} + \mathbf{p}^{Initial}_{T} = 0 \tag{11.161}$$
Using the center-of-momentum frame, coupled with the conservation of linear momentum, implies that the
vector sum of the final momenta of the $N$ reaction products also is zero. That is
$$\sum_{i=1}^{N} \mathbf{p}^{Final}_{i} = 0 \tag{11.162}$$
An additional constraint is that energy conservation relates the initial and final kinetic energies by
$$\frac{\left(p^{Initial}_{P}\right)^2}{2m_P} + \frac{\left(p^{Initial}_{T}\right)^2}{2m_T} + Q = \frac{\left(p^{Final}_{P}\right)^2}{2m_P} + \frac{\left(p^{Final}_{T}\right)^2}{2m_T} \tag{11.163}$$
where the $Q$ value is the energy contributed to the final total kinetic energy by the reaction between the
incoming projectile and target. For exothermic reactions, $Q > 0$, the summed kinetic energy of the reaction
products exceeds the sum of the incoming kinetic energies, while for endothermic reactions, $Q < 0$, the
summed kinetic energy of the reaction products is less than that of the incoming channel.
For two-body kinematics, the following are three advantages to working in the center-of-momentum frame
of reference.
1. The two incident colliding bodies are colinear as are the two final bodies.
2. The linear momenta for the two colliding bodies are identical in both the incident channel and the
outgoing channel.
3. The total energy in the center-of-momentum coordinate frame is the energy available to the reac-
tion during the collision. The trivial kinetic energy of the center-of-momentum frame relative to the
laboratory frame is handled separately.
The kinematics for two-body reactions is easily determined using the conservation of linear momentum
along and perpendicular to the beam direction plus the conservation of energy, equations 11.161 − 11.163.
Note that it is common practice to use the term “center-of-mass” rather than “center-of-momentum” in spite
of the fact that, for relativistic mechanics, only the center-of-momentum is a meaningful concept.
General features of the transformation between the center-of-momentum and laboratory frames of reference
are best illustrated by elastic or inelastic scattering of nuclei where the two reaction products in the final
channel are identical to the incident bodies. Inelastic excitation of an excited state energy of $\Delta E_{ex}$ in
either reaction product corresponds to $Q = -\Delta E_{ex}$ while elastic scattering corresponds to
$Q = -\Delta E_{ex} = 0$.
For inelastic scattering, the conservation of linear momenta for the outgoing channel in the
center-of-momentum simplifies to
$$\mathbf{p}^{Final}_{P} + \mathbf{p}^{Final}_{T} = 0 \tag{11.164}$$
that is, the linear momenta of the two reaction products are equal and opposite.
Assume that the center-of-momentum direction of the scattered projectile is at an angle $\vartheta_P = \vartheta$
relative to the direction of the incoming projectile and that the scattered target nucleus is scattered at a
center-of-momentum direction $\vartheta_T = \pi - \vartheta$. Elastic scattering corresponds to simple scattering
for which the magnitudes of the incoming and outgoing projectile momenta are equal, that is,
$\left|\mathbf{p}^{Final}_{P}\right| = \left|\mathbf{p}^{Initial}_{P}\right|$.
11.13. TWO-BODY KINEMATICS 279
Figure 11.15: Vector hodograph of the scattered projectile and target velocities for a projectile, with incident
velocity that is elastically scattered by a stationary target body. The circles show the magnitude of the
projectile and target body final velocities in the center of mass. The center-of-mass velocity vectors are
shown as dashed lines while the laboratory vectors are shown as solid lines. The left hodograph shows
normal kinematics where the projectile mass is less than the target mass. The right hodograph shows
inverse kinematics where the projectile mass is greater than the target mass. For elastic scattering = 0 .
Velocities
The transformation between the center-of-momentum and laboratory frames requires knowledge of the
particle velocities which can be derived from the linear momenta since the particle masses are known. Assume
that a projectile, mass $m_P$, with incident energy $E_P$ in the laboratory frame bombards a stationary target
with mass $m_T$. The incident projectile velocity $v_i$ is given by
$$v_i = \sqrt{\frac{2E_P}{m_P}} \tag{11.165}$$
The final velocities in the laboratory frame after the inelastic collision are $w'_P$ and $w'_T$.
In the center-of-momentum coordinate system, equation 11.10 implies that the initial center-of-momentum
velocities are
$$u_P = v_i \frac{m_T}{m_P + m_T}, \qquad u_T = v_i \frac{m_P}{m_P + m_T} \tag{11.166}$$
It is simple to derive that the final center-of-momentum velocities after the inelastic collision are given by
$$u'_P = \frac{m_T}{m_P + m_T}\sqrt{\frac{2E_P}{m_P}}\,\tilde{E}, \qquad u'_T = \frac{m_P}{m_P + m_T}\sqrt{\frac{2E_P}{m_P}}\,\tilde{E} \tag{11.167}$$
Angles
The angles of the scattered recoils are written as
and
where
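$$\tau = \frac{m_P}{m_T}\frac{1}{\sqrt{1 + \frac{Q}{E_P}\left(1 + \frac{m_P}{m_T}\right)}} = \frac{m_P}{m_T}\frac{1}{\sqrt{1 + \frac{Q}{E_P/m_P}\left(\frac{m_P + m_T}{m_P m_T}\right)}} \tag{11.170}$$
and $E_P/m_P$ is the energy per nucleon of the incident projectile.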
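Equation 11.169 can be rewritten as
$$\tan\theta_P = \frac{\sin\vartheta_P}{\cos\vartheta_P + \tau} \tag{11.171}$$
where $\theta_P$ is the laboratory scattering angle of the projectile and $\vartheta_P$ is its center-of-momentum
scattering angle.
Another useful relation from equation 11.169 gives the center-of-momentum scattering angle in terms of
the laboratory scattering angle.
$$\vartheta_P = \sin^{-1}\left(\tau\sin\theta_P\right) + \theta_P \tag{11.172}$$
This gives the difference in angle between the lab scattering angle and the center-of-momentum scattering
angle. Be careful with this relation since $\vartheta_P$ is two-valued for inverse kinematics corresponding to the
two possible signs for the solution.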
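The two-valued nature of the inverse relation can be made concrete with a short Python sketch. It assumes the relations $\tan\theta_P = \sin\vartheta_P/(\cos\vartheta_P + \tau)$ and $\sin(\vartheta_P - \theta_P) = \tau\sin\theta_P$, with all angles in radians; the function names and the choice to return a list of solutions are hypothetical conventions made for this illustration.

```python
import math


def lab_angle(theta_cm: float, tau: float) -> float:
    """Laboratory angle from the center-of-momentum angle.

    Uses tan(theta_lab) = sin(theta_cm) / (cos(theta_cm) + tau); atan2 keeps
    the result in the correct quadrant when the denominator is negative.
    """
    return math.atan2(math.sin(theta_cm), math.cos(theta_cm) + tau)


def cm_angles(theta_lab: float, tau: float) -> list:
    """All center-of-momentum angles that map onto a given laboratory angle.

    Solves sin(theta_cm - theta_lab) = tau * sin(theta_lab). For tau <= 1
    (normal kinematics) there is one solution; for tau > 1 (inverse
    kinematics) there are two, or none beyond the maximum laboratory angle.
    """
    x = tau * math.sin(theta_lab)
    if x > 1.0:
        return []  # theta_lab exceeds the kinematic limit
    first = theta_lab + math.asin(x)
    if tau <= 1.0:
        return [first]
    second = theta_lab + math.pi - math.asin(x)
    return [first, second]


# Inverse kinematics with tau = 2: two center-of-momentum angles give 20 degrees in the lab.
theta = math.radians(20.0)
for solution in cm_angles(theta, 2.0):
    print(math.degrees(solution), math.degrees(lab_angle(solution, 2.0)))
```

Both printed center-of-momentum angles map back onto the same 20° laboratory angle, which is the ambiguity the text warns about.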
The angle relations between the lab and center-of-momentum for the recoiling target nucleus are connected
by
$$\frac{\sin(\vartheta_T - \theta_T)}{\sin\theta_T} \equiv \tilde{\tau} \tag{11.173}$$
That is
$$\vartheta_T = \sin^{-1}\left(\tilde{\tau}\sin\theta_T\right) + \theta_T \tag{11.174}$$
where
$$\tilde{\tau} = \frac{1}{\sqrt{1 + \frac{Q}{E_P}\left(1 + \frac{m_P}{m_T}\right)}} = \frac{1}{\sqrt{1 + \frac{Q}{E_P/m_P}\left(\frac{m_P + m_T}{m_P m_T}\right)}} \tag{11.175}$$
Note that $\tilde{\tau}$ is the same under interchange of the two nuclei at the same incident energy/nucleon, and
that $\tilde{\tau}$ is always larger than or equal to unity since $Q$ is negative. For elastic scattering $\tilde{\tau} = 1$
which gives
$$\theta_T = \frac{1}{2}\left(\pi - \vartheta\right) \qquad \text{(Recoil lab angle for elastic scattering)}$$
$$\tan\theta_T = \frac{\sin\vartheta_T}{\cos\vartheta_T + \tilde{\tau}} \qquad \text{(Target lab to CM angle conversion)}$$

Figure 11.16: The kinematic correlation of the laboratory and center-of-mass scattering angles of the recoiling
projectile and target nuclei for scattering for 4.3 MeV/nucleon $^{104}$Pd on $^{208}$Pb (left) and for the inverse
4.3 MeV/nucleon $^{208}$Pb on $^{104}$Pd (right). The projectile scattering angles are shown by solid lines while the
recoiling target angles are shown by dashed lines. The blue curves correspond to elastic scattering, that is
$Q = 0$, while the red curves correspond to inelastic scattering with $Q = -5$ MeV.
Velocity vector hodographs provide useful insight into the behavior of the kinematic solutions. As shown
in figure 11.15, in the center-of-momentum frame the scattered projectile has a fixed final velocity $u'_P$, that
is, the velocity vector describes a circle as a function of $\vartheta$. The vector addition of this vector and the
velocity of the center-of-mass vector $-u_T$ gives the laboratory frame velocity $w'_P$. Note that for normal
kinematics, where $m_P < m_T$, then $|u_T| < |u'_P|$ leading to a monotonic one-to-one mapping of the
center-of-momentum angle $\vartheta_P$ and $\theta_P$. However, for inverse kinematics, where $m_P > m_T$, then
$|u_T| > |u'_P|$ leading to two-valued solutions at any fixed laboratory scattering angle $\theta_P$.
Billiard ball collisions are an especially simple example where the two masses are identical and the collision
is essentially elastic. Then essentially $\tau = \tilde{\tau} = 1$, $\theta_P = \frac{\vartheta}{2}$ and
$\theta_T = \frac{1}{2}\left(\pi - \vartheta\right)$, that is, the angle between the scattered billiard balls is $\frac{\pi}{2}$.
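The right-angle rule for equal masses can be verified numerically. The sketch below assumes elastic scattering of equal masses, so that $\tau = \tilde{\tau} = 1$, and uses the tangent relations $\tan\theta = \sin\vartheta/(\cos\vartheta + 1)$ for both balls with the target at center-of-momentum angle $\pi - \vartheta$. The function name is hypothetical.

```python
import math


def equal_mass_lab_angles(theta_cm: float):
    """Laboratory angles (projectile, target) for elastic equal-mass scattering.

    With tau = tau_tilde = 1 both angles follow from
    tan(theta_lab) = sin(theta_cm) / (cos(theta_cm) + 1).
    """
    projectile = math.atan2(math.sin(theta_cm), math.cos(theta_cm) + 1.0)
    target_cm = math.pi - theta_cm  # target recoils opposite to the projectile
    target = math.atan2(math.sin(target_cm), math.cos(target_cm) + 1.0)
    return projectile, target


for degrees in (10.0, 60.0, 90.0, 150.0):
    theta_p, theta_t = equal_mass_lab_angles(math.radians(degrees))
    print(degrees, math.degrees(theta_p), math.degrees(theta_t), math.degrees(theta_p + theta_t))
```

For every center-of-momentum angle the two laboratory angles add to 90°, with the projectile at half the center-of-momentum angle.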
Both normal and inverse kinematics are illustrated in figure 11.16 which shows the dependence of the
projectile and target scattering angles in the laboratory frame as a function of center-of-momentum scattering
angle for the Coulomb scattering of $^{104}$Pd by $^{208}$Pb, that is, for a mass ratio of 2 : 1. Both normal and
inverse kinematics are shown for the same bombarding energy of 4.3 MeV/nucleon for elastic scattering and
for inelastic scattering with a $Q$-value of $-5$ MeV.
Figure 11.17: Recoil energies, in MeV, versus laboratory scattering angle, shown on the left for scattering
of 447 MeV $^{104}$Pd by $^{208}$Pb with $Q = -5.0$ MeV, and shown on the right for scattering of 894 MeV $^{208}$Pb
on $^{104}$Pd with $Q = -5.0$ MeV.
Since $\sin(\vartheta_T - \theta_T) \leq 1$ then equation 11.173 implies that $\tilde{\tau}\sin\theta_T \leq 1$. Since
$\tilde{\tau}$ is always larger than or equal to unity there is a maximum scattering angle in the laboratory frame
for the recoiling target nucleus given by
$$\sin\theta^{max}_T = \frac{1}{\tilde{\tau}} \tag{11.176}$$
For elastic scattering $\theta^{max}_T = \sin^{-1}\left(\frac{1}{\tilde{\tau}}\right) = 90^\circ$ since $\tilde{\tau} = 1$ for both
894 MeV $^{208}$Pb bombarding $^{104}$Pd, and the inverse reaction using a 447 MeV $^{104}$Pd beam scattered by a
$^{208}$Pb target. A $Q$-value of $-5$ MeV gives $\tilde{\tau} = 1.002808$ which implies a maximum scattering angle
of $\theta^{max}_T = 85.71^\circ$ for both 894 MeV $^{208}$Pb bombarding $^{104}$Pd, and the inverse reaction of a 447 MeV
$^{104}$Pd beam scattered by a $^{208}$Pb target. As a consequence there are two solutions for $\vartheta_T$ for any
allowed value of $\theta_T$ as illustrated in figure 11.16.
Since $\sin(\vartheta_P - \theta_P) \leq 1$ then equation 11.150 implies that $\tau\sin\theta_P \leq 1$. For a
447 MeV $^{104}$Pd beam scattered by a $^{208}$Pb target $\frac{m_P}{m_T} = 0.50$, thus $\tau = 0.5$ for elastic scattering
which implies that there is no upper bound to $\theta_P$. This leads to a one-to-one correspondence between
$\vartheta_P$ and $\theta_P$ for normal kinematics.
In contrast, the projectile has a maximum scattering angle in the laboratory frame for inverse kinematics
since $\frac{m_P}{m_T} = 2.0$ leading to an upper bound to $\theta_P$ given by
$$\sin\theta^{max}_P = \frac{1}{\tau} \tag{11.177}$$
For elastic scattering $\tau = 2$ implying $\theta^{max}_P = 30^\circ$. In addition to having a maximum value for
$\theta_P$, when $\tau > 1$ also there are two solutions for $\vartheta_P$ for any allowed value of $\theta_P$. The example of
894 MeV $^{208}$Pb bombarding $^{178}$Hf leads to a maximum projectile scattering angle of
$\theta^{max}_P = 30.0^\circ$ for elastic scattering and $\theta^{max}_P = 29.907^\circ$ for $Q = -5$ MeV.
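The limiting laboratory angles follow directly from $\sin\theta^{max} = 1/\tau$ (or $1/\tilde{\tau}$ for the recoiling target). The Python sketch below assumes only that relation; the function name and the convention of returning `None` when there is no kinematic limit are hypothetical. The value $\tilde{\tau} = 1.002808$ is taken as quoted in the text rather than recomputed.

```python
import math


def max_lab_angle_deg(tau: float):
    """Maximum laboratory scattering angle in degrees from sin(theta_max) = 1/tau.

    Returns None when tau < 1, since then every laboratory angle is allowed.
    """
    if tau < 1.0:
        return None
    return math.degrees(math.asin(1.0 / tau))


print(max_lab_angle_deg(0.5))       # normal kinematics: no upper bound
print(max_lab_angle_deg(2.0))       # inverse kinematics, elastic scattering
print(max_lab_angle_deg(1.0))       # recoiling target, elastic scattering
print(max_lab_angle_deg(1.002808))  # recoiling target, tau-tilde as quoted above
```

These reproduce the 30° and 90° elastic limits and the 85.71° limit for the quoted inelastic value of $\tilde{\tau}$.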
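Kinetic energies
In the laboratory frame the kinetic energies of the scattered projectile and recoiling target nucleus are
given by
$$E^{Lab}_P = \left(\frac{m_T}{m_P + m_T}\right)^2\left(1 + \tau^2 + 2\tau\cos\vartheta_P\right)\tilde{E}^2 E_P \tag{11.180}$$
$$E^{Lab}_T = \frac{m_P m_T}{\left(m_P + m_T\right)^2}\left(1 + \tilde{\tau}^2 + 2\tilde{\tau}\cos\vartheta_T\right)\tilde{E}^2 E_P \tag{11.181}$$
where $\vartheta_P$ and $\vartheta_T$ are the center-of-mass scattering angles respectively for the scattered projectile
and target nuclei.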
For the chosen incident energies the normal and inverse reactions give the same center-of-momentum
energy of 298 MeV which is the energy available to the interaction between the colliding nuclei. However,
the kinetic energy of the center-of-momentum is 447 − 298 = 149 MeV for normal kinematics and
894 − 298 = 596 MeV for inverse kinematics. This trivial center-of-momentum kinetic energy does not
contribute to the reaction. Note that inverse kinematics focusses all the scattered nuclei into the forward
hemisphere which reduces the required solid angle for recoil-particle detection.
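The energy bookkeeping quoted above can be reproduced with a few lines of Python. This is a sketch assuming non-relativistic kinematics with a stationary target, for which the energy available in the center-of-momentum frame is $E_P\, m_T/(m_P + m_T)$, and it approximates the nuclear masses by the mass numbers 104 and 208. The function names are hypothetical.

```python
def cm_energy(e_lab_mev: float, m_projectile: float, m_target: float) -> float:
    """Energy available in the center-of-momentum frame for a stationary target."""
    return e_lab_mev * m_target / (m_projectile + m_target)


def cm_frame_kinetic_energy(e_lab_mev: float, m_projectile: float, m_target: float) -> float:
    """Kinetic energy carried by the motion of the center of momentum itself."""
    return e_lab_mev - cm_energy(e_lab_mev, m_projectile, m_target)


# (label, beam energy in MeV, projectile mass number, target mass number)
cases = (("normal, 104Pd on 208Pb", 447.0, 104, 208),
         ("inverse, 208Pb on 104Pd", 894.0, 208, 104))
for label, e_lab, m_p, m_t in cases:
    print(label, cm_energy(e_lab, m_p, m_t), cm_frame_kinetic_energy(e_lab, m_p, m_t))
```

Both cases give 298 MeV available to the reaction, with 149 MeV and 596 MeV respectively tied up in the motion of the center of momentum.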
Solid angles
The laboratory-frame solid angles for the scattered projectile and target are taken to be $d\omega_P$ and
$d\omega_T$ respectively, while the center-of-momentum solid angles are $d\Omega_P$ and $d\Omega_T$ respectively. The
Jacobian relating the solid angles is
$$\frac{d\omega_P}{d\Omega_P} = \left(\frac{\sin\theta_P}{\sin\vartheta_P}\right)^2\left|\cos(\vartheta_P - \theta_P)\right| \tag{11.182}$$
$$\frac{d\omega_T}{d\Omega_T} = \left(\frac{\sin\theta_T}{\sin\vartheta_T}\right)^2\left|\cos(\vartheta_T - \theta_T)\right| \tag{11.183}$$
These can be used to transform the calculated center-of-momentum differential cross sections to the
laboratory frame for comparison with measured values. Note that relative to the center-of-momentum frame,
the forward focussing increases the observed differential cross sections in the forward laboratory frame and
decreases them in the backward hemisphere.
11.14 Summary
This chapter has focussed on the classical mechanics of bodies interacting via conservative, two-body, central
interactions. The following are the main topics presented in this chapter.
Equivalent one-body representation for two bodies interacting via a central interaction The
equivalent one-body representation of the motion of two bodies interacting via a two-body central interaction
greatly simplifies solution of the equations of motion. The position vectors $\mathbf{r}_1$ and $\mathbf{r}_2$ are expressed in
terms of the center-of-mass vector $\mathbf{R}$ plus total mass $M = m_1 + m_2$ while the position vector $\mathbf{r}$ plus
associated reduced mass $\mu = \frac{m_1 m_2}{m_1 + m_2}$ describe the relative motion of the two bodies in the center of
mass. The total Lagrangian then separates into two independent parts
$$L = \frac{1}{2}M\left|\dot{\mathbf{R}}\right|^2 + L_{cm} \tag{11.16}$$
where the center-of-mass Lagrangian is
$$L_{cm} = \frac{1}{2}\mu\left|\dot{\mathbf{r}}\right|^2 - U(r) \tag{11.17}$$
Equations 11.10 and 11.11 can be used to derive the actual spatial trajectories of the two bodies expressed
in terms of $\mathbf{r}_1$ and $\mathbf{r}_2$ from the relative equations of motion, written in terms of $\mathbf{R}$ and $\mathbf{r}$ for the
equivalent one-body solution.
Angular momentum Noether’s theorem shows that the angular momentum is conserved if only a
spherically-symmetric two-body central force acts between the interacting two bodies. The plane of motion is
perpendicular to the angular momentum vector and thus the Lagrangian can be expressed in polar coordinates
as
$$L_{cm} = \frac{1}{2}\mu\left(\dot{r}^2 + r^2\dot{\psi}^2\right) - U(r) \tag{11.22}$$
Differential orbit equation of motion The Binet transformation $u = \frac{1}{r}$ allows the center-of-mass
Lagrangian for a central force $\mathbf{F} = F(r)\hat{\mathbf{r}}$ to be used to express the differential orbit equation for the
radial motion as
$$\frac{d^2u}{d\psi^2} + u = -\frac{\mu}{l^2}\frac{1}{u^2}F\left(\frac{1}{u}\right) \tag{11.39}$$
The Lagrangian and the Hamiltonian both were used to derive the equations of motion for two bodies
interacting via a two-body, conservative, central interaction. The general features of the conservation of
angular momentum and conservation of energy for a two-body, central potential were presented.
Inverse-square, two-body, central force The inverse-square, two-body, central force is of pivotal
importance in nature since it applies to both the gravitational force and the Coulomb force. The underlying
symmetries of the inverse-square, two-body, central interaction lead to conservation of angular momentum,
conservation of energy, Gauss’s law, and that the two-body orbits follow closed, degenerate, orbits that are
conic sections, for which the eccentricity vector is conserved. The radial dependence, relative to the force
center lying at one focus of the conic section, is given by
$$\frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \epsilon\cos(\psi - \psi_0)\right] \tag{11.58}$$
where the orbit eccentricity equals
$$\epsilon = \sqrt{1 + \frac{2E_{cm}l^2}{\mu k^2}} \tag{11.62}$$
These lead to Kepler’s three laws of motion for two bodies in a bound orbit due to the attractive gravitational
force for which $k = -Gm_1m_2$. The inverse-square law is special in that the eccentricity vector $\mathbf{A}$ is a third
invariant of the motion, where
$$\mathbf{A} \equiv (\mathbf{p}\times\mathbf{L}) + (\mu k\hat{\mathbf{r}}) \tag{11.86}$$
The eccentricity vector unambiguously defines the orientation and direction of the major axis of the elliptical
orbit. The invariance of the eccentricity vector, and the existence of stable closed orbits, are manifestations
of the dynamical O(4) symmetry.
Isotropic, harmonic, two-body, central force The isotropic, harmonic, two-body, central interaction
is of interest since, like the inverse-square law force, it leads to closed elliptical orbits described by
$$\frac{1}{r^2} = \frac{E\mu}{l^2}\left(1 + \left(1 + \frac{kl^2}{\mu E^2}\right)^{\frac{1}{2}}\cos 2(\psi - \psi_0)\right) \tag{11.107}$$
Orbit stability Bertrand’s theorem states that only the inverse square law and the linear radial
dependences of the central forces lead to stable closed bound orbits that do not precess. These are
manifestations of the dynamical symmetries that occur for these two specific radial forms of two-body forces.
The three-body problem The difficulties encountered in solving the equations of motion for three bodies
that are interacting via two-body central forces were discussed. The three-body motion can include chaotic
motion. It was shown that solution of the three-body problem is simplified if either the planar approximation,
or the restricted three-body approximation, is applicable.
Two-body scattering The total and differential two-body scattering cross sections were introduced. It
was shown that for the inverse-square law force there is a simple relation between the impact parameter $b$
and scattering angle $\theta$ given by
$$b = \frac{k}{2E_{cm}}\cot\frac{\theta}{2} \tag{11.155}$$
This led to the solution for the differential scattering cross-section for Rutherford scattering due to the
Coulomb interaction.
$$\frac{d\sigma}{d\Omega} = \frac{1}{4}\left(\frac{k}{2E_{cm}}\right)^2\frac{1}{\sin^4\frac{\theta}{2}} \tag{11.159}$$
This cross section assumes elastic scattering by a repulsive two-body inverse-square central force. For
scattering of nuclei in the Coulomb potential the constant $k$ is given to be
$$k = \frac{Z_P Z_T e^2}{4\pi\varepsilon_0} \tag{11.160}$$
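The Rutherford relations summarized above are easy to evaluate numerically. The Python sketch below assumes the forms $b = (k/2E_{cm})\cot(\theta/2)$ and $d\sigma/d\Omega = \frac{1}{4}(k/2E_{cm})^2/\sin^4(\theta/2)$, the approximate value $e^2/4\pi\varepsilon_0 \approx 1.44$ MeV fm, and an illustrative case of an α particle on gold at a center-of-mass energy taken to be 7.7 MeV. The function names and the choice of fm and fm² units are hypothetical.

```python
import math

E2_OVER_4PI_EPS0 = 1.44  # MeV fm, approximate value of e^2 / (4 pi epsilon_0)


def coulomb_k(z_projectile: int, z_target: int) -> float:
    """Coulomb constant k = Z_P Z_T e^2 / (4 pi epsilon_0) in MeV fm."""
    return z_projectile * z_target * E2_OVER_4PI_EPS0


def impact_parameter_fm(k: float, e_cm_mev: float, theta: float) -> float:
    """Impact parameter b = (k / 2E) cot(theta / 2), theta in radians."""
    return k / (2.0 * e_cm_mev) / math.tan(theta / 2.0)


def rutherford_cross_section_fm2(k: float, e_cm_mev: float, theta: float) -> float:
    """Differential cross section (1/4) (k / 2E)^2 / sin^4(theta / 2) in fm^2/sr."""
    return 0.25 * (k / (2.0 * e_cm_mev)) ** 2 / math.sin(theta / 2.0) ** 4


k_alpha_gold = coulomb_k(2, 79)
theta_90 = math.radians(90.0)
b_90 = impact_parameter_fm(k_alpha_gold, 7.7, theta_90)
sigma_90 = rutherford_cross_section_fm2(k_alpha_gold, 7.7, theta_90)
print(f"b = {b_90:.2f} fm, dsigma/dOmega = {sigma_90:.1f} fm^2/sr")
```

At 90° the cotangent is unity and $\sin^4(\theta/2) = 1/4$, so the impact parameter equals $k/2E_{cm}$ and the cross section equals $b^2$, which provides a quick sanity check on the two formulas.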
Two-body kinematics The transformation from the center-of-momentum frame to the laboratory frame of
reference was introduced. Such transformations are used extensively in many fields of physics for theoretical
modelling of scattering, and for analysis of experimental data.
Workshop exercises
1. Listed below are several statements concerning central force motion. For each statement, give the reason for
why the statement is true. If a statement is only true in certain situations, then explain when it holds and
when it doesn’t. The system referred to below consists of mass 1 located at 1 and mass 2 located at 2 .
• The potential energy of the system depends only on the difference 1 − 2 , not on 1 and 2 separately.
• The potential energy of the system depends only on the magnitude of 1 − 2 , not the direction.
• It is possible to choose an inertial reference frame in which the center of mass of the system is at rest.
• The total energy of the system is conserved.
• The total angular momentum of the system is conserved.
2 2
2. A particle of mass moves in a potential () = −0 −
.
(a) Given the constant , find an implicit equation for the radius of the circular orbit. A circular orbit at
= is possible if µ ¶¯
¯¯
=0
¯=
where is the effective potential.
(b) What is the largest value of for which a circular orbit exists? What is the value of the effective potential
at this critical orbit?
3. A particle of mass is observed to move in a spiral orbit given by the equation = , where is a constant.
Is it possible to have such an orbit in a central force field? If so, determine the form of the force function.
4. The
£ interaction energy¤ between two atoms of mass is given by the Lennard-Jones potential, () =
(0 )12 − 2(0 )6
(a) Determine the Lagrangian of the system where 1 and 2 are the positions of the first and second mass,
respectively.
(b) Rewrite the Lagrangian as a one-body problem in which the center-of-mass is stationary.
(c) Determine the equilibrium point and show that it is stable.
(d) Determine the frequency of small oscillations about the stable point.
5. Consider two bodies of mass in circular orbit of radius 0 2, attracted to each other by a force () , where
is the distance between the masses.
(a) Determine the Lagrangian of the system in the center-of-mass frame (Hint: a one-body problem subject
to a central force).
(c) Determine the equation of motion in in terms of the angular momentum and |F()|.
(d) Expand your result in (c) about an equilibrium radius 0 and show that the condition for stability
0 ( )
is, (00) + 30 0
6. Consider two charges of equal magnitude connected by a spring of spring constant 0 in circular orbit. Can
the charges oscillate about some equilibrium? If so, what condition must be satisfied?
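7. Consider a mass $m$ in orbit around a mass $M$, which is subject to a force $\mathbf{F} = -\frac{k}{r^2}\hat{\mathbf{r}}$, where $r$ is the distance
between the masses. Show that the eccentricity vector $\mathbf{A} = \mathbf{p}\times\mathbf{L} - \mu k\hat{\mathbf{r}}$ is conserved.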
Problems
1. Show that the areal velocity is constant for a particle moving under the influence of an attractive force given
by () = − . Calculate the time averages of the kinetic and potential energies and compare with the the
results of the virial theorem.
2. Assume that the Earth’s orbit is circular and that the Sun’s mass suddenly decreases by a factor of two. (a)
What orbit will the earth then have? (b) Will the Earth escape the solar system?
3. Discuss the motion of a particle in a central inverse-square-law force field for a superimposed force whose
magnitude is inversely proportional to the cube of the distance from the particle to force center; that is
() = − − (k, 0)
2 3
Show that the motion is described by a precessing ellipse. Consider the cases
2 2 2
a) , b) = c) where is the angular momentum and the reduced mass.
4. A communications satellite is in a circular orbit around the earth at a radius and velocity . A rocket
accidentally fires quite suddenly, giving the rocket an outward velocity in addition to its original tangential
velocity
a) Calculate the ratio of the new energy and angular momentum to the old.
b) Describe the subsequent motion of the satellite and plot () () the net effective potential, and ()
after the rocket fires.
5. Two identical point objects, each of mass are bound by a linear two-body force = − where is the
vector distance between the two point objects. The two point objects each slide on a horizontal frictionless
plane subject to a vertical gravitational field . The two-body system is free to translate, rotate and oscillate
on the surface of the frictionless plane.
a) Derive the Lagrangian for the complete system including translation and relative motion.
b) Use Noether’s theorem to identify all constants of motion.
c) Use the Lagrangian to derive the equations of motion for the system.
d) Derive the generalized momenta and the corresponding Hamiltonian.
e) Derive the period for small amplitude oscillations of the relative motion of the two masses.
6. A bound binary star system comprises two spherical stars of mass 1 and 2 bound by their mutual gravita-
tional attraction. Assume that the only force acting on the stars is their mutual gravitation attraction and let
be the instantaneous separation distance between the centers of the two stars where is much larger than
the sum of the radii of the stars.
a) Show that the two-body motion of the binary star system can be represented by an equivalent one-body system
and derive the Lagrangian for this system.
b) Show that the motion for the equivalent one-body system in the center of mass frame lies entirely in a plane
and derive the angle between the normal to the plane and the angular momentum vector.
c) Show whether is a constant of motion and whether it equals the total energy.
d) It is known that a solution to the equation of motion for the equivalent one-body orbit for this gravitational
force has the form
1
= − 2 [1 + cos ]
and that the angular momentum is a constant of motion = . Use these to prove that the attractive force leading
to this bound orbit is
F= r̂
2
where must be negative.
7. When performing the Rutherford experiment, Geiger and Marsden scattered 7.7 MeV $^4$He particles (alpha
particles) from $^{238}$U at a scattering angle in the laboratory frame of $\theta = 90^\circ$. Derive the following
observables as measured in the laboratory frame.
(a) The recoil scattering angle of the $^{238}$U in the laboratory frame.
(b) The scattering angles of the $^4$He and $^{238}$U in the center-of-mass frame
(c) The kinetic energies of the $^4$He and $^{238}$U in the laboratory frame
(d) The impact parameter
(e) The distance of closest approach $r_{min}$
Chapter 12
12.1 Introduction
Newton’s Laws of motion apply only to inertial frames of reference. Inertial frames of reference make it
possible to use either Newton’s laws of motion, or Lagrangian, or Hamiltonian mechanics, to develop the
necessary equations of motion. There are certain situations where it is much more convenient to treat the
motion in a non-inertial frame of reference. Examples are motion in frames of reference undergoing trans-
lational acceleration, rotating frames of reference, or frames undergoing both translational and rotational
motion. This chapter will analyze the behavior of dynamical systems in accelerated frames of reference,
especially rotating frames such as on the surface of the Earth. Newtonian mechanics, as well as the La-
grangian and Hamiltonian approaches, will be used to handle motion in non-inertial reference frames by
introducing extra inertial forces that correct for the fact that the motion is being treated with respect to a
non-inertial reference frame. These inertial forces are often called fictitious even though they appear real in
the non-inertial frame. The underlying reasons for each of the inertial forces will be discussed followed by a
presentation of important applications.
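$$\mathbf{a} = \mathbf{A} + \mathbf{a}' \tag{12.3}$$
where $\mathbf{a} = \frac{d^2\mathbf{r}}{dt^2}$, $\mathbf{a}' = \frac{d^2\mathbf{r}'}{dt^2}$ and $\mathbf{A} = \frac{d^2\mathbf{R}}{dt^2}$.
In the fixed frame, Newton’s laws give that
$$\mathbf{F} = m\mathbf{a} \tag{12.4}$$
The force in the fixed frame can be separated into two terms, one due to the acceleration $\mathbf{A}$ of the
accelerating frame of reference plus one due to the acceleration $\mathbf{a}'$ with respect to the accelerating frame,
that is, $\mathbf{F} = m\mathbf{A} + m\mathbf{a}'$.
The accelerating frame of reference can exploit Newton’s Laws of motion using an effective translational
force $\mathbf{F}' \equiv \mathbf{F} - m\mathbf{A} = m\mathbf{a}'$. The additional $-m\mathbf{A}$ term is called an inertial force; it can be altered by
choosing a different non-inertial frame of reference, that is, it is dependent on the frame of reference in which
the observer is situated.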
Consider that during a time $dt$ the position vector in the fixed primed reference frame moves by an
arbitrary infinitesimal distance $d\mathbf{r}'$. As illustrated in figure 12.2, this infinitesimal distance in the primed
non-rotating frame can be split into two parts:
a) $d\boldsymbol{\theta}\times\mathbf{r}'$ which is due to rotation of the rotating frame with respect to the translating primed frame.
b) $d\mathbf{r}''$ which is the motion with respect to the rotating (double-primed) frame.
That is, the motion has been arbitrarily divided into a part that is due to the rotation of the double-primed
frame, plus the vector displacement measured in this rotating (double-primed) frame. It is always possible
to make such a decomposition of the displacement as long as the vector sum can be written as
$$d\mathbf{r}' = d\mathbf{r}'' + d\boldsymbol{\theta}\times\mathbf{r}' \tag{12.8}$$

Figure 12.2: Infinitesimal displacement in the non-rotating primed frame and in the rotating double-primed
reference frame.
12.3. ROTATING REFERENCE FRAME 291
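Since $d\boldsymbol{\theta} = \boldsymbol{\omega}\,dt$ then the time differential of the displacement, equation 12.8, can be written as
$$\left(\frac{d\mathbf{r}'}{dt}\right)_{fixed} = \left(\frac{d\mathbf{r}''}{dt}\right)_{rotating} + \boldsymbol{\omega}\times\mathbf{r}' \tag{12.9}$$
The important conclusion is that a velocity measured in a non-rotating reference frame, $\left(\frac{d\mathbf{r}'}{dt}\right)_{fixed}$, can be
expressed as the sum of the velocity $\left(\frac{d\mathbf{r}''}{dt}\right)_{rotating}$ measured relative to a rotating frame, plus the term
$\boldsymbol{\omega}\times\mathbf{r}'$ which accounts for the rotation of the frame. The division of the $d\mathbf{r}'$ vector into two parts, a part
due to rotation of the frame plus a part with respect to the rotating frame, is valid for any vector as shown
below.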
The inertial-frame time derivative taken with components along the rotating coordinate basis $\hat{\mathbf{e}}_i$,
equation 12.11, is
$$\left(\frac{d\mathbf{G}}{dt}\right)_{fixed} = \sum_{i=1}^{3}\hat{\mathbf{e}}_i\frac{dG_i}{dt} + \sum_{i=1}^{3}G_i\left(\frac{d\hat{\mathbf{e}}_i}{dt}\right)_{fixed} \tag{12.14}$$
Substituting the unit vector $\hat{\mathbf{e}}_i$ for $\mathbf{r}'$ in equation 12.9, plus using equation 12.12, gives that
$$\left(\frac{d\hat{\mathbf{e}}_i}{dt}\right)_{fixed} = \boldsymbol{\omega}\times\hat{\mathbf{e}}_i \tag{12.15}$$
Substituting this into the second term of equation 12.14 gives
$$\left(\frac{d\mathbf{G}}{dt}\right)_{fixed} = \left(\frac{d\mathbf{G}}{dt}\right)_{rotating} + \boldsymbol{\omega}\times\mathbf{G} \tag{12.16}$$
This important identity relates the time derivatives of any vector expressed in both the inertial frame and
the rotating non-inertial frame bases. Note that the $\boldsymbol{\omega}\times\mathbf{G}$ term originates from the fact that the unit
basis vectors of the rotating reference frame are time dependent with respect to the non-rotating frame basis
vectors as given by equation (12.15). Equation (12.16) is used extensively for problems involving rotating
frames. For example, for the special case where $\mathbf{G} = \mathbf{r}'$, then equation (12.16) relates the velocity vectors in
the fixed and rotating frames as given in equation (12.9).
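The identity relating the fixed-frame and rotating-frame time derivatives can be checked numerically for a simple case. The sketch below, using only the Python standard library, assumes a vector that is constant in a frame rotating about the $z$ axis with constant angular velocity, so that its rotating-frame derivative is zero and the fixed-frame derivative should equal $\boldsymbol{\omega}\times\mathbf{G}$. The numerical values, the helper names, and the central-difference step are all arbitrary choices made for this illustration.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotate_about_z(v, angle):
    """Rotate the vector v by the given angle (radians) about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])


OMEGA = (0.0, 0.0, 0.7)       # angular velocity of the rotating frame, rad/s
G_ROTATING = (1.0, 2.0, 3.0)  # components of G, constant in the rotating frame


def g_fixed(t):
    """Components of G seen in the fixed frame at time t."""
    return rotate_about_z(G_ROTATING, OMEGA[2] * t)


def fixed_frame_derivative(t, h=1e-6):
    """Central-difference estimate of (dG/dt) in the fixed frame."""
    ahead, behind = g_fixed(t + h), g_fixed(t - h)
    return tuple((p - q) / (2.0 * h) for p, q in zip(ahead, behind))


t = 0.4
numeric = fixed_frame_derivative(t)
predicted = cross(OMEGA, g_fixed(t))  # omega x G, since (dG/dt)_rotating = 0 here
print(numeric)
print(predicted)
```

The two printed vectors agree to within the finite-difference error, which illustrates that the $\boldsymbol{\omega}\times\mathbf{G}$ term accounts entirely for the motion of the rotating basis vectors in this case.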
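Another example is the vector $\dot{\boldsymbol{\omega}}$
$$\dot{\boldsymbol{\omega}} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{fixed} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rotating} + \boldsymbol{\omega}\times\boldsymbol{\omega} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rotating} \tag{12.17}$$
That is, the angular acceleration $\dot{\boldsymbol{\omega}}$ has the same value in both the fixed and rotating frames of reference.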
292 CHAPTER 12. NON-INERTIAL REFERENCE FRAMES
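Now we wish to use the general transformation to a rotating frame basis which requires inclusion of the time
dependence of the unit vectors in the rotating frame, that is,
$$\left(\frac{d\mathbf{v}''}{dt}\right)_{fixed} = \left(\frac{d\mathbf{v}''}{dt}\right)_{rotating} + \boldsymbol{\omega}\times\mathbf{v}'' \tag{12.23}$$
$$\left(\frac{d\boldsymbol{\omega}}{dt}\right)_{fixed}\times\mathbf{r}' = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rotating}\times\mathbf{r}' \tag{12.24}$$
$$\boldsymbol{\omega}\times\left(\frac{d\mathbf{r}'}{dt}\right)_{fixed} = \boldsymbol{\omega}\times\mathbf{v}'' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}') \tag{12.25}$$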
12.6. LAGRANGIAN MECHANICS IN A NON-INERTIAL FRAME 293
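$$\mathbf{a} = \mathbf{A} + \mathbf{a}'' + 2\boldsymbol{\omega}\times\mathbf{v}'' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}') + \dot{\boldsymbol{\omega}}\times\mathbf{r}' \tag{12.26}$$
where the acceleration in the rotating frame is $\mathbf{a}'' = \left(\frac{d\mathbf{v}''}{dt}\right)_{rotating}$ while the velocity is
$\mathbf{v}'' = \left(\frac{d\mathbf{r}''}{dt}\right)_{rotating}$ and $\mathbf{A}$ is with respect to the fixed frame.
Newton’s laws of motion are obeyed in the inertial frame, that is
$$\mathbf{F} = m\mathbf{a}$$
In the double-primed frame, which may be both rotating and accelerating in translation, one can ascribe an
effective force $\mathbf{F}''_{eff}$ that obeys an effective Newton’s law for the acceleration $\mathbf{a}''$ in the rotating frame
$$\mathbf{F}''_{eff} = m\mathbf{a}'' = \mathbf{F} - m\left(\mathbf{A} + 2\boldsymbol{\omega}\times\mathbf{v}'' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}') + \dot{\boldsymbol{\omega}}\times\mathbf{r}'\right) \tag{12.28}$$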
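$$L = \frac{1}{2}m\left[\mathbf{V}\cdot\mathbf{V} + \mathbf{v}''\cdot\mathbf{v}'' + 2\mathbf{V}\cdot\mathbf{v}'' + 2\mathbf{V}\cdot(\boldsymbol{\omega}\times\mathbf{r}') + 2\mathbf{v}''\cdot(\boldsymbol{\omega}\times\mathbf{r}') + (\boldsymbol{\omega}\times\mathbf{r}')^2\right] - U(r) \tag{12.30}$$
This can be used to derive the canonical momentum in the rotating frame
$$\mathbf{p}'' = \frac{\partial L}{\partial\mathbf{v}''} = m\left[\mathbf{V} + \mathbf{v}'' + \boldsymbol{\omega}\times\mathbf{r}'\right] \tag{12.31}$$
The Lagrange equations can be used to derive the equations of motion in terms of the variables evaluated
in the rotating reference frame. The required Lagrange derivatives are
$$\frac{d}{dt}\frac{\partial L}{\partial\mathbf{v}''} = m\left[\mathbf{A} + \mathbf{a}'' + (\boldsymbol{\omega}\times\mathbf{v}'') + (\dot{\boldsymbol{\omega}}\times\mathbf{r}')\right] \tag{12.32}$$
and
$$\frac{\partial L}{\partial\mathbf{r}'} = -m\left[(\boldsymbol{\omega}\times\mathbf{V}) + (\boldsymbol{\omega}\times\mathbf{v}'') + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}')\right] - \nabla U \tag{12.33}$$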
where the scalar triple product, equation 21 has been used. Thus the Lagrange equations give for the
rotating frame basis that
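The external force is identified as $\mathbf{F} = -\nabla U$. Equation 12.16 can be used to transform between the
fixed and the rotating bases.
$$\mathbf{A}_{fixed} = \mathbf{A}_{rotating} + (\boldsymbol{\omega}\times\mathbf{V}) \tag{12.35}$$
This leads to an effective force in the non-inertial translating plus rotating frame that corresponds to an
effective Newtonian force of
$$\mathbf{F}''_{eff} = m\mathbf{a}'' = \mathbf{F} - m\left[\mathbf{A} + 2\boldsymbol{\omega}\times\mathbf{v}'' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}') + (\dot{\boldsymbol{\omega}}\times\mathbf{r}')\right] \tag{12.36}$$
where $\mathbf{A}$ is expressed in the fixed frame. The derivation of equation 12.36 using Lagrangian mechanics
confirms the identical formula 12.29 derived using Newtonian mechanics.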
The four correction terms for the non-inertial frame basis correspond to the following effective forces.
Translational acceleration: $\mathbf{F}_{acc} = -m\mathbf{A}_{fix}$ is the usual inertial force experienced in a linearly accelerating
frame of reference, where $\mathbf{A}_{fix}$ is with respect to the fixed frame.
Coriolis force: $\mathbf{F}_{cor} = -2m\boldsymbol{\omega}\times\mathbf{v}''_{rot}$. This is a new type of inertial force that is present only when a
particle is moving in the rotating frame. This force is proportional to the velocity in the rotating frame and
is independent of the position in the rotating frame.
Centrifugal force: $\mathbf{F}_{cf} = -m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov})$. This is due to the centripetal acceleration of the particle
owing to the rotation of the moving axis about the axis of rotation.
Transverse (azimuthal) force: $\mathbf{F}_{az} = -m\dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov}$. This is a straightforward term due to acceleration of
the particle due to the angular acceleration of the rotating axes.
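The four correction terms can be evaluated directly once the frame motion and the particle state are known. The following is a minimal sketch, not the author's code; the function names and all numerical values (a unit mass on a turntable that spins about the z axis while speeding up) are assumptions chosen only for illustration.

```python
# Minimal sketch: the four inertial forces of equation 12.36 for one particle.
# All names and numbers here are illustrative assumptions.

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def scale(c, u):
    """Multiply a 3-vector by a scalar."""
    return tuple(c * x for x in u)

def inertial_forces(m, A_fix, omega, omega_dot, r, v):
    """Return the translational, Coriolis, centrifugal and azimuthal forces.

    r and v are the position and velocity measured in the rotating frame.
    """
    return {
        "translational": scale(-m, A_fix),
        "coriolis": scale(-2.0 * m, cross(omega, v)),
        "centrifugal": scale(-m, cross(omega, cross(omega, r))),
        "azimuthal": scale(-m, cross(omega_dot, r)),
    }

# Turntable rotating about z at 2 rad/s and accelerating at 0.5 rad/s^2.
forces = inertial_forces(m=1.0,
                         A_fix=(0.0, 0.0, 0.0),
                         omega=(0.0, 0.0, 2.0),
                         omega_dot=(0.0, 0.0, 0.5),
                         r=(1.0, 0.0, 0.0),
                         v=(0.0, 1.0, 0.0))
for name, f in forces.items():
    print(name, f)
```

For these inputs the centrifugal force points radially outward along +x, while the azimuthal force points along -y, opposite to the direction in which the turntable is speeding up at that position.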
The above inertial forces are correction terms arising from trying to extend Newton’s laws of motion to
a non-inertial frame involving both translation and rotation. These correction forces are often referred to as
“fictitious” forces. However, these non-inertial forces are very real when located in the non-inertial frame.
Since the centrifugal and Coriolis terms are unusual they are discussed below.
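Note that the centrifugal force $\mathbf{F}_{cf}$ satisfies
$\boldsymbol{\omega}\cdot\mathbf{F}_{cf} = 0$ (12.38)
therefore the centrifugal force is perpendicular to the axis of rotation.
Using the vector identity, equation 24, allows the centrifugal force to be written as
$\mathbf{F}_{cf} = -m\left[(\boldsymbol{\omega}\cdot\mathbf{r}')\boldsymbol{\omega} - \omega^2\mathbf{r}'\right]$ (12.39)
For the case where the radius $\mathbf{r}'$ is perpendicular to $\boldsymbol{\omega}$ then $\boldsymbol{\omega}\cdot\mathbf{r}' = 0$ and thus for this special case
the centrifugal force reduces to $\mathbf{F}_{cf} = m\omega^2\mathbf{r}'$, which is directed radially outward.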
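The generalized momenta conjugate to $r$ and $\theta$ are
$p_r = \frac{\partial L}{\partial \dot{r}} = m\dot{r} - mat\cos\theta$
$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta} + matr\sin\theta$
These lead to the corresponding velocities of
$\dot{r} = \frac{p_r}{m} + at\cos\theta$
$\dot{\theta} = \frac{p_\theta}{mr^2} - \frac{at}{r}\sin\theta$
and thus the Hamiltonian is given by
$H = p_r\dot{r} + p_\theta\dot{\theta} - L$
$= \frac{p_r^2}{2m} + \frac{p_\theta^2}{2mr^2} - \frac{at}{r}p_\theta\sin\theta + atp_r\cos\theta + \frac{1}{2}k(r - r_0)^2 + \frac{1}{2}mgat^2 - mgr\cos\theta$
The Hamilton equations of motion give that
$\dot{r} = \frac{\partial H}{\partial p_r} = \frac{p_r}{m} + at\cos\theta$
$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mr^2} - \frac{at}{r}\sin\theta$
These radial and angular velocities are the same as obtained using Lagrangian mechanics.
The Hamilton equations for $\dot{p}_r$ and $\dot{p}_\theta$ are given by
$\dot{p}_r = -\frac{\partial H}{\partial r} = -\frac{at}{r^2}p_\theta\sin\theta - k(r - r_0) + mg\cos\theta + \frac{p_\theta^2}{mr^3}$
Similarly
$\dot{p}_\theta = -\frac{\partial H}{\partial \theta} = \frac{at}{r}p_\theta\cos\theta + atp_r\sin\theta - mgr\sin\theta$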
The transformation equations relating the generalized coordinates are time dependent so the Hamiltonian
$H$ does not equal the total energy $E$. In addition neither the Lagrangian nor the Hamiltonian is
conserved since both are explicitly time dependent. The fact that the Hamiltonian is not conserved is obvious since
the whole system is accelerating upwards leading to increasing kinetic and potential energies. Moreover, the
time derivative of the angular momentum $\dot{p}_\theta$ is non-zero so the angular momentum is not conserved.
Non-inertial fulcrum frame:
This system also can be addressed in the accelerating non-inertial fulcrum frame of reference which is
fixed to the fulcrum of the spring of the pendulum. In this non-inertial frame of reference, the acceleration
of the frame can be taken into account using an effective acceleration $a$ which is added to the gravitational
acceleration; that is, $g$ is replaced by an effective gravitational acceleration $(g + a)$. Then the Lagrangian in the fulcrum
frame simplifies to
$L_{fulcrum} = \frac{1}{2}m\dot{r}^2 + \frac{1}{2}mr^2\dot{\theta}^2 + m(g + a)(r\cos\theta) - \frac{1}{2}k(r - r_0)^2$
The Lagrange equations of motion in the fulcrum frame are given by
12.8. CORIOLIS FORCE 297
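$\Lambda_r L = 0$:
$m\ddot{r} - mr\dot{\theta}^2 - m(g + a)\cos\theta + k(r - r_0) = 0$
$\Lambda_\theta L = 0$:
$\ddot{\theta} + \frac{2}{r}\dot{r}\dot{\theta} + \frac{(g + a)}{r}\sin\theta = 0$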
These are identical to the Lagrange equations of motion derived in the inertial frame.
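The fulcrum-frame Lagrangian $L_{fulcrum}$ can be used to derive the momenta in the non-inertial fulcrum frame
$\tilde{p}_r = \frac{\partial L_{fulcrum}}{\partial \dot{r}} = m\dot{r}$
$\tilde{p}_\theta = \frac{\partial L_{fulcrum}}{\partial \dot{\theta}} = mr^2\dot{\theta}$
which comprise only a part of the momenta derived in the inertial frame. These partial fulcrum momenta
lead to a Hamiltonian for the fulcrum frame of
$H_{fulcrum} = \tilde{p}_r\dot{r} + \tilde{p}_\theta\dot{\theta} - L_{fulcrum} = \frac{\tilde{p}_r^2}{2m} + \frac{\tilde{p}_\theta^2}{2mr^2} + \frac{1}{2}k(r - r_0)^2 - m(g + a)r\cos\theta$
Both $L_{fulcrum}$ and $H_{fulcrum}$ are time independent and thus the fulcrum Hamiltonian $H_{fulcrum}$ is a constant
of motion in the fulcrum frame. However, $H_{fulcrum}$ does not equal the total energy, which is increasing with
time due to the acceleration of the fulcrum frame relative to the inertial frame. This example illustrates that
use of non-inertial frames can simplify solution of accelerating systems.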
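The claim that the fulcrum-frame Hamiltonian is a constant of motion can be checked numerically. The sketch below integrates the fulcrum-frame Lagrange equations with a hand-written fourth-order Runge-Kutta step and monitors the fulcrum Hamiltonian. The parameter values, initial conditions, and function names are assumptions made for illustration; they do not come from the text.

```python
import math

# Assumed illustrative parameters: mass, spring constant, rest length,
# gravitational acceleration, and upward acceleration of the fulcrum.
m, k, r0, g, a = 1.0, 50.0, 1.0, 9.81, 2.0

def derivs(state):
    """Fulcrum-frame equations of motion for state (r, theta, r_dot, theta_dot)."""
    r, th, rdot, thdot = state
    rddot = r * thdot**2 + (g + a) * math.cos(th) - (k / m) * (r - r0)
    thddot = -(2.0 / r) * rdot * thdot - ((g + a) / r) * math.sin(th)
    return (rdot, thdot, rddot, thddot)

def rk4_step(state, dt):
    """Advance the state by one classical Runge-Kutta step."""
    k1 = derivs(state)
    k2 = derivs(tuple(s + 0.5 * dt * d for s, d in zip(state, k1)))
    k3 = derivs(tuple(s + 0.5 * dt * d for s, d in zip(state, k2)))
    k4 = derivs(tuple(s + dt * d for s, d in zip(state, k3)))
    return tuple(s + dt * (d1 + 2 * d2 + 2 * d3 + d4) / 6.0
                 for s, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4))

def fulcrum_hamiltonian(state):
    """Fulcrum-frame Hamiltonian built from the partial momenta m*r_dot and m*r^2*theta_dot."""
    r, th, rdot, thdot = state
    p_r = m * rdot
    p_th = m * r**2 * thdot
    return (p_r**2 / (2 * m) + p_th**2 / (2 * m * r**2)
            + 0.5 * k * (r - r0)**2 - m * (g + a) * r * math.cos(th))

state = (1.2, 0.5, 0.0, 0.0)   # released from rest, stretched and displaced
H_start = fulcrum_hamiltonian(state)
for _ in range(5000):          # 5 s of motion with dt = 1 ms
    state = rk4_step(state, 1e-3)
H_end = fulcrum_hamiltonian(state)
drift = abs(H_end - H_start) / abs(H_start)
print("relative drift of the fulcrum Hamiltonian:", drift)
```

The drift that remains is integration error only; it shrinks as the time step is reduced, which is the expected behaviour for a conserved quantity.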
F = F0 − g
zero that is
a00 = 0 = F0 + (g − ω × (ω × r0 ))
z + 2 ρ
g = −b b
This is the equation of a paraboloid and corresponds to a parabolic gravitational equipotential energy surface.
Astrophysicists build large parabolic mirrors for telescopes by continuously spinning a large vat of glass while
it solidifies. This is much easier than grinding a large cylindrical block of glass into a parabolic shape.
2ω × ṙ00
ω̇ = − ()
”
that is, the rotational frequency decreases if the radius is increased. Note that, as shown in equation 1217
̇ = ̇ 00 . This nonzero value of ̇ obviously leads to an azimuthal force in addition to the Coriolis force.
Consider the rate of change of angular momentum for the rotating mass assuming that the angular
momentum comes purely from the rotation Then in the rotating frame
ṗ00 = (”2 ω) = 200 ̇00 ω + 002 ω̇
Substituting equation for ̇ in the second term gives
That is, the two terms cancel. Thus the angular momentum is conserved for this case where the velocity is
radial. Note that, since ” is assumed to be colinear with then it is the same in both the stationary and
rotating frames of reference and thus angular momentum is conserved in both frames. In addition, in the
fixed frame, the angular momentum is conserved if no external torques are acting as assumed above.
Note that the rotational energy is
$E_{rot} = \frac{1}{2}I\omega^2$
Also the angular momentum is conserved, that is
$\mathbf{p} = I\boldsymbol{\omega} = I\omega\hat{\boldsymbol{\omega}}$
Substituting $\omega = \frac{p}{I}$ in the rotational energy gives
$E_{rot} = \frac{1}{2}I\left(\frac{p}{I}\right)^2 = \frac{p^2}{2I}$
Therefore the rotational energy actually increases as the moment of inertia decreases when the ice skater
pulls her arms close to her body. This increase in rotational energy is provided by the work done as the
skater pulls her arms inward against the centrifugal force.
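A short numerical illustration of the skater argument follows. The angular momentum and the two moments of inertia are assumed values chosen for round numbers, not measured data.

```python
# Sketch: rotational energy at fixed angular momentum, E_rot = p^2 / (2 I).
# The values of p, I_arms_out and I_arms_in are illustrative assumptions.

def rotational_energy(p, inertia):
    """Rotational energy for angular momentum p and moment of inertia."""
    return p**2 / (2.0 * inertia)

p = 12.0            # conserved angular momentum, kg m^2 / s
I_arms_out = 4.0    # moment of inertia with arms extended, kg m^2
I_arms_in = 1.5     # moment of inertia with arms pulled in, kg m^2

omega_out = p / I_arms_out
omega_in = p / I_arms_in
E_out = rotational_energy(p, I_arms_out)
E_in = rotational_energy(p, I_arms_in)

print("angular velocity:", omega_out, "->", omega_in, "rad/s")
print("rotational energy:", E_out, "->", E_in, "J")
print("work done by the skater:", E_in - E_out, "J")
```

The energy difference is the work the skater supplies while pulling her arms inward.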
12.9. ROUTHIAN REDUCTION FOR ROTATING SYSTEMS 299
$R_{cyclic}(q_1, \ldots, q_n; \dot{q}_1, \ldots, \dot{q}_s; p_{s+1}, \ldots, p_n; t) = H_{cyclic} - L_{noncyclic} = \boldsymbol{\omega}\cdot\mathbf{J} - L$ (12.43)
This Routhian behaves like a Hamiltonian for the ignorable cyclic coordinates $\boldsymbol{\omega}, \mathbf{J}$. Simultaneously it behaves
like a negative Lagrangian for all the other coordinates.
The non-cyclic Routhian complements $R_{cyclic}$ in that it is defined as
$R_{noncyclic}(q_1, \ldots, q_n; p_1, \ldots, p_s; \dot{q}_{s+1}, \ldots, \dot{q}_n; t) = H_{noncyclic} - L_{cyclic} = H - \boldsymbol{\omega}\cdot\mathbf{J}$ (12.44)
This non-cyclic Routhian behaves like a Hamiltonian for all the non-cyclic variables and behaves like a
negative Lagrangian for the two cyclic variables $\boldsymbol{\omega}, \mathbf{J}$. Since the cyclic variables are constants of motion,
then $R_{noncyclic}$ is a constant of motion that equals the energy in the rotating frame if $H$ is a constant of
motion. However, $R_{noncyclic}$ does not equal the total energy since the coordinate transformation is time
dependent, that is, the Routhian corresponds to the energy of the non-cyclic parts of the motion.
For example, the Routhian for a system that is being cranked about the axis at some fixed
angular frequency ̇ = with corresponding total angular momentum p = J can be written as1
1 For clarity sections 12.1 to 12.8 of this chapter adopted a naming convention that uses unprimed coordinates with the
subscript fix for the inertial frame of reference, primed coordinates with the subscript mov for the translating coordinates, and
double-primed coordinates with the subscript rot for the translating plus rotating frame. For brevity the subsequent discussion
omits the redundant subscripts fix, mov, and rot since the single and double prime superscripts completely define the moving and
rotating frames of reference.
300 CHAPTER 12. NON-INERTIAL REFERENCE FRAMES
= − ω · J
where it is assumed that the deformed nucleus has the symmetry axis along the direction and rotates about
the axis. Since the Routhian is for a non-inertial rotating frame of reference it does not include the total
energy but, if the shape is constant in time, then and the corresponding body-fixed Hamiltonian
are conserved and the energy levels for the nucleons bound in the spheroidal potential well can be calculated
using a conventional quantum mechanical model.
For a prolate spheroidal deformed potential well, the nucleon orbits that have the angular momentum
nearly aligned to the symmetry axis correspond to nucleon trajectories that are restricted to the narrowest
part of the spheroid, whereas trajectories with the angular momentum vector close to perpendicular to the
symmetry axis have trajectories that probe the largest radii of the spheroid. The Heisenberg Uncertainty
Principle, mentioned in chapter 3.11.3, describes how orbits restricted to the smallest dimension will have
the highest linear momentum, and corresponding kinetic energy, and vice versa for the larger sized orbits.
Thus the binding energy of different nucleon trajectories in the spheroidal potential well depends on the angle
between the angular momentum vector and the symmetry axis of the spheroid as well as the deformation of
the spheroid. A quantal nuclear model Hamiltonian is solved for assumed spheroidal-shaped potential wells.
The corresponding orbits each have angular momenta j for which the projection of the angular momentum
along the symmetry axis Ω is conserved, but the projection of j in the laboratory frame is not conserved
since the potential well is not spherically symmetric. However, the total Hamiltonian is spherically symmetric
in the laboratory frame, which is satisfied by allowing the deformed spheroidal potential well to rotate freely in
the laboratory frame, and then 2 and Ω all are conserved quantities. The attractive residual nucleon-
nucleon pairing interaction results in pairs of nucleons being bound in time-reversed orbits ( × )0 , that
is, with resultant total spin zero, in this spheroidal nuclear potential. Excitation of an even-even nucleus
can break one pair and then the total projection of the angular momentum along the symmetry axis is
= |Ω1 ± Ω2 |, depending on whether the projections are parallel or antiparallel. More excitation energy
can break several pairs and the projections continue to be additive. The binding energies calculated in the
spheroidal potential well must be added to the rotational energy = J2 2 to get the total energy, where
J is the moment of inertia. Nuclear structure measurements are in good agreement with the predictions of
nuclear structure calculations that employ the Routhian approach.
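$\mathbf{a}' = \frac{\mathbf{F}}{m} + \mathbf{g} - \left(2\boldsymbol{\omega}\times\mathbf{v}' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times[\mathbf{r}' + \mathbf{R}]) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'\right)$
$= \frac{\mathbf{F}}{m} + \mathbf{g} - \left(2\boldsymbol{\omega}\times\mathbf{v}' + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}) + \dot{\boldsymbol{\omega}}\times\mathbf{r}'\right)$
where $\mathbf{r}$ is with respect to the center of the Earth. This is as expected directly from equation 12.36. Since
the angular frequency of the Earth is a constant then $\dot{\boldsymbol{\omega}}\times\mathbf{r}' = 0$. Thus the acceleration can be written as
$\mathbf{a}' = \frac{\mathbf{F}}{m} + \left[\mathbf{g} - \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r})\right] - 2\boldsymbol{\omega}\times\mathbf{v}'$ (12.52)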
The term in the square brackets combines the gravitational acceleration plus the centrifugal acceleration.
A measurement of the Earth’s gravitational acceleration actually measures the term in the square brackets
in equation 12.52, that is, an effective gravitational acceleration $\mathbf{g}_{eff}$ where
$\mathbf{g}_{eff} = \mathbf{g} - \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r})$ (12.53)
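This is quite small for the Earth since $\omega = 0.73 \times 10^{-4}$ rad/s and $R = 6371$ km, leading to a correction
term $\omega^2 R\cos\lambda = 0.03\cos\lambda$ m/s$^2$, where $\lambda$ is the latitude. Since the horizontal component of the effective gravitational acceleration is
$(g_{eff})_{h} = \omega^2 R\cos\lambda\sin\lambda$ (12.55)
and the vertical component is
$(g_{eff})_{v} = g - \omega^2 R\cos^2\lambda$ (12.56)
then the angle $\alpha$ between $\mathbf{g}_{eff}$ and $\mathbf{g}$ is given by
$\alpha \simeq \tan\alpha = \frac{(g_{eff})_{h}}{(g_{eff})_{v}} = \frac{\omega^2 R\cos\lambda\sin\lambda}{g - \omega^2 R\cos^2\lambda}$ (12.57)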
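Equation 12.57 is easy to evaluate as a function of latitude. The sketch below uses the rounded values of the Earth's rotation rate and radius quoted above, together with an assumed value of 9.81 m/s^2 for g; the function name is hypothetical.

```python
import math

OMEGA = 0.73e-4   # Earth's angular frequency in rad/s (rounded value from the text)
R = 6.371e6       # Earth's radius in m
g = 9.81          # assumed gravitational acceleration in m/s^2

def plumb_line_deflection(latitude_deg):
    """Angle in radians between g_eff and g at the given latitude (equation 12.57)."""
    lam = math.radians(latitude_deg)
    horizontal = OMEGA**2 * R * math.cos(lam) * math.sin(lam)
    vertical = g - OMEGA**2 * R * math.cos(lam)**2
    return math.atan2(horizontal, vertical)

for latitude in (0.0, 30.0, 45.0, 60.0, 90.0):
    angle = plumb_line_deflection(latitude)
    print(latitude, "deg:", math.degrees(angle), "deg")
```

The deflection vanishes at the equator and at the poles and is largest near 45 degrees latitude, where it is roughly a tenth of a degree.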
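$\mathbf{a}' = \mathbf{g} - 2\boldsymbol{\omega}\times\mathbf{v}'$ (12.58)
Neglect the centrifugal correction term since it is very small,
that is, let $\mathbf{g}_{eff} = \mathbf{g}$. Using the coordinate axes shown in
figure 12.7, with $x'$ pointing East, the surface-frame vectors have components
$\boldsymbol{\omega} = 0\,\hat{\mathbf{i}}' + \omega\cos\lambda\,\hat{\mathbf{j}}' + \omega\sin\lambda\,\hat{\mathbf{k}}'$ (12.59)
and
$\mathbf{g} = -g\,\hat{\mathbf{k}}'$ (12.60)
Thus the Coriolis term is
$2\boldsymbol{\omega}\times\mathbf{v}' = 2\begin{vmatrix} \hat{\mathbf{i}}' & \hat{\mathbf{j}}' & \hat{\mathbf{k}}' \\ 0 & \omega\cos\lambda & \omega\sin\lambda \\ \dot{x}' & \dot{y}' & \dot{z}' \end{vmatrix}$
$= 2\left[\left(\omega\dot{z}'\cos\lambda - \omega\dot{y}'\sin\lambda\right)\hat{\mathbf{i}}' + \left(\omega\dot{x}'\sin\lambda\right)\hat{\mathbf{j}}' - \left(\omega\dot{x}'\cos\lambda\right)\hat{\mathbf{k}}'\right]$
Figure 12.7: Rotating frame fixed on the surface of the Earth.
so that the equation of motion is
$\ddot{\mathbf{r}}' = -g\hat{\mathbf{k}}' - 2\omega\left[\hat{\mathbf{i}}'(\dot{z}'\cos\lambda - \dot{y}'\sin\lambda) + \hat{\mathbf{j}}'\dot{x}'\sin\lambda - \hat{\mathbf{k}}'\dot{x}'\cos\lambda\right]$ (12.61)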
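where $\dot{x}'_0, \dot{y}'_0, \dot{z}'_0$ are the initial velocities. Substituting the above velocity relations into the equation of motion
for $\ddot{x}'$ gives
$\ddot{x}' = 2\omega gt\cos\lambda - 2\omega(\dot{z}'_0\cos\lambda - \dot{y}'_0\sin\lambda) - 4\omega^2 x'$ (12.64)
The last term $4\omega^2 x'$ is small and can be neglected leading to a simple uncoupled second-order differential
equation in $x'$. Integrating this twice assuming that $x'_0 = y'_0 = z'_0 = 0$ plus the fact that $2\omega g\cos\lambda$ and
$2\omega(\dot{z}'_0\cos\lambda - \dot{y}'_0\sin\lambda)$ are constant, gives
$x' = \frac{1}{3}\omega gt^3\cos\lambda - \omega t^2(\dot{z}'_0\cos\lambda - \dot{y}'_0\sin\lambda) + \dot{x}'_0 t$ (12.65)
Similarly,
$y' = \left(\dot{y}'_0 t - \omega\dot{x}'_0 t^2\sin\lambda\right)$ (12.66)
$z' = -\frac{1}{2}gt^2 + \dot{z}'_0 t + \omega\dot{x}'_0 t^2\cos\lambda$ (12.67)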
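The approximate closed-form trajectory of equations 12.65 to 12.67 can be compared with a direct numerical integration of equation 12.61. The sketch below does this for a body dropped from rest; the latitude, fall time, value of g, and helper names are assumptions chosen for illustration, and the integrator is a plain fourth-order Runge-Kutta scheme rather than anything prescribed by the text.

```python
import math

OMEGA = 7.292e-5   # Earth's rotation rate in rad/s
G = 9.81           # assumed gravitational acceleration in m/s^2

def accel(v, lam):
    """Equation 12.61 with x east, y north, z up; lam is the latitude in radians."""
    vx, vy, vz = v
    return (-2.0 * OMEGA * (vz * math.cos(lam) - vy * math.sin(lam)),
            -2.0 * OMEGA * vx * math.sin(lam),
            -G + 2.0 * OMEGA * vx * math.cos(lam))

def add(u, w, c):
    """Return u + c*w for 3-vectors."""
    return tuple(ui + c * wi for ui, wi in zip(u, w))

def integrate(v0, lam, t_end, steps=4000):
    """Runge-Kutta integration starting from the origin with velocity v0."""
    dt = t_end / steps
    r, v = (0.0, 0.0, 0.0), tuple(v0)
    for _ in range(steps):
        k1v = accel(v, lam)
        v2 = add(v, k1v, 0.5 * dt)
        k2v = accel(v2, lam)
        v3 = add(v, k2v, 0.5 * dt)
        k3v = accel(v3, lam)
        v4 = add(v, k3v, dt)
        k4v = accel(v4, lam)
        r = tuple(ri + dt * (a + 2 * b + 2 * c + d) / 6.0
                  for ri, a, b, c, d in zip(r, v, v2, v3, v4))
        v = tuple(vi + dt * (a + 2 * b + 2 * c + d) / 6.0
                  for vi, a, b, c, d in zip(v, k1v, k2v, k3v, k4v))
    return r

def closed_form(v0, lam, t):
    """Equations 12.65, 12.66 and 12.67."""
    vx0, vy0, vz0 = v0
    x = (OMEGA * G * t**3 * math.cos(lam) / 3.0
         - OMEGA * t**2 * (vz0 * math.cos(lam) - vy0 * math.sin(lam)) + vx0 * t)
    y = vy0 * t - OMEGA * vx0 * t**2 * math.sin(lam)
    z = -0.5 * G * t**2 + vz0 * t + OMEGA * vx0 * t**2 * math.cos(lam)
    return (x, y, z)

lam = math.radians(45.0)
numeric = integrate((0.0, 0.0, 0.0), lam, 4.0)
formula = closed_form((0.0, 0.0, 0.0), lam, 4.0)
print("eastward deflection, numeric:", numeric[0], "m")
print("eastward deflection, formula:", formula[0], "m")
```

For a 4 s fall at 45 degrees latitude both approaches give an eastward deflection of roughly a centimetre. The small residual difference comes from the terms of order omega squared that were neglected in deriving the closed-form result.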
12.11. FREE MOTION ON THE EARTH 305
Note that the velocity equals zero when = 0 assuming that is finite. That is, the velocity reaches a
maximum at a radius
1 1
= (1 + ) (12.73)
4 sin
12.12. WEATHER SYSTEMS 307
Figure 12.9: Hurricane Katrina over the Gulf of Mexico on 28 August 2005. [Published by the NOAA]
which occurs at the wall of the eye of the circulating low-pressure system.
Low pressure regions are produced by heating of air causing it to rise and resulting in an inflow of air
to replace the rising air. Hurricanes form over warm water when the temperature exceeds 26°C and the
moisture levels are above average. They are created at latitudes between 10° and 15° where the sea is warmest,
but not closer to the equator where the Coriolis force drops to zero. About 90% of the heating of the air comes
from the latent heat of vaporization due to the rising warm moist air condensing into water droplets in the
cloud similar to what occurs in thunderstorms. For hurricanes in the northern hemisphere, the air circulates
anticlockwise inwards. Near the wall of the eye of the hurricane, the air rises rapidly to high altitudes at
which it then flows clockwise and outwards and subsequently back down in the outer reaches of the hurricane.
Both the wind velocity and pressure are low inside the eye, which can be cloud free. The strongest winds
are in the vortex surrounding the eye of the hurricane, while weak winds exist in the counter-rotating vortex of
sinking air that occurs far outside the hurricane.
Figure 12.9 shows the satellite picture of Hurricane Katrina, recorded on 28 August 2005. The eye of
the hurricane is readily apparent in this picture. The central pressure was 90200 N/m² (902 mb) compared
with the standard atmospheric pressure of 101300 N/m² (1013 mb). This 111 mb pressure difference produced
steady winds in Katrina of 280 km/h (175 mph) with gusts up to 344 km/h. The hurricane resulted in 1833 fatalities.
Tornadoes are another example of a vortex low-pressure system; they are the opposite extreme in both
size and duration compared with a hurricane. Tornadoes may last only ∼ 10 minutes and be quite small in
radius. Pressure drops of up to 100 mb have been recorded, but since tornadoes may only be a few hundred meters in
diameter, the pressure gradient can be much higher than for hurricanes, leading to localized winds thought to
approach 500 km/h. Unfortunately, the instrumentation and buildings hit by a tornado often are destroyed,
making study difficult. Note that the pressure gradient in small-diameter rope tornadoes is much
higher than in larger 1/4-mile-diameter tornadoes, which results in stronger and more destructive winds.
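Assume the small angle approximation for the pendulum deflection angle $\beta$, then $T_z = T\cos\beta \simeq T$ and
$T_z = mg$, thus $T \simeq mg$. Then, as shown in figure 12.10, the horizontal components of the restoring force
are
$T_x = -mg\frac{x}{l}$ (12.79)
$T_y = -mg\frac{y}{l}$ (12.80)
Since $\mathbf{g}$ is vertical, and neglecting terms involving $\dot{z}$, then evaluating the cross product in equation (12.78)
simplifies to
$\ddot{x} = -\frac{g}{l}x + 2\dot{y}\Omega\cos\theta$ (12.81)
$\ddot{y} = -\frac{g}{l}y - 2\dot{x}\Omega\cos\theta$ (12.82)
where $\theta$ is the colatitude, which is related to the latitude $\lambda$ by $\cos\theta = \sin\lambda$. Writing the vertical component of the Earth's angular velocity as
$\Omega_z = \Omega\cos\theta$ (12.85)
and the natural angular frequency of the pendulum as $\omega_0 = \sqrt{g/l}$, these equations become
$\ddot{x} - 2\Omega_z\dot{y} + \omega_0^2 x = 0$
$\ddot{y} + 2\Omega_z\dot{x} + \omega_0^2 y = 0$ (12.86)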
These are two coupled equations that can be solved by making a coordinate transformation.
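Define a new coordinate that is a complex number
$\eta = x + iy$ (12.87)
Multiplying the second of the coupled equations 12.86 by $i$ and adding it to the first equation gives
$\ddot{\eta} + 2i\Omega_z\dot{\eta} + \omega_0^2\eta = 0$ (12.88)
Note that the complex number $\eta$ contains the same information regarding the position in the $x-y$ plane
as equations 12.86. The plot of $\eta$ in the complex plane, the Argand diagram, is a bird's-eye view of the
position coordinates $(x, y)$ of the pendulum. This second-order homogeneous differential equation has two
independent solutions that can be derived by guessing a solution of the form
$\eta(t) = Ae^{-i\alpha t}$ (12.89)
Substituting this trial solution into equation 12.88 requires that
$\alpha^2 - 2\Omega_z\alpha - \omega_0^2 = 0$
That is
$\alpha = \Omega_z \pm \sqrt{\Omega_z^2 + \omega_0^2}$ (12.90)
Since the pendulum frequency is much larger than the rotation rate of the Earth, $\omega_0 \gg \Omega_z$, this simplifies to
$\alpha \simeq \Omega_z \pm \omega_0$ (12.91)
Thus the solution is of the form
$\eta(t) = e^{-i\Omega_z t}\left(A_+ e^{i\omega_0 t} + A_- e^{-i\omega_0 t}\right)$ (12.92)
This can be written as
$\eta(t) = Ae^{-i\Omega_z t}\cos(\omega_0 t + \delta)$ (12.93)
where the phase $\delta$ and amplitude $A$ depend on the initial conditions. Thus the plane of oscillation of the
pendulum is defined by the ratio of the $x$ and $y$ coordinates, that is the phase angle $\Omega_z t$. This phase angle
rotates with angular velocity $\Omega_z$ where
$\Omega_z = \Omega\cos\theta = \Omega\sin\lambda$ (12.94)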
At the north pole the earth rotates under the pendulum with angular velocity $\Omega$ and the axis of the
pendulum is fixed in an inertial frame of reference. At lower latitudes, the pendulum precesses at the lower
angular frequency $\Omega_z = \Omega\sin\lambda$ that goes to zero at the equator. For example, in Rochester, NY, $\lambda = 43°$
and therefore a Foucault pendulum precesses at $\Omega_z = 0.682\Omega$. That is, the pendulum precesses 245.5°/day.
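The Rochester figure can be reproduced in a few lines, along with a check that the roots of equation 12.90 are close to the approximation 12.91 for a realistic pendulum. This is a minimal sketch; the 10 m pendulum length, the value of g, and the function name are assumptions, and one full rotation of the Earth is taken as 360° per day, as in the text.

```python
import math

def foucault_precession_deg_per_day(latitude_deg):
    """Precession rate of the oscillation plane, Omega*sin(latitude), in degrees per day."""
    return 360.0 * math.sin(math.radians(latitude_deg))

rochester = foucault_precession_deg_per_day(43.0)
print("Rochester, NY:", rochester, "deg/day")

# Compare the exact roots of equation 12.90 with the approximation 12.91
# for an assumed pendulum of length 10 m at latitude 43 degrees.
Omega = 7.292e-5                       # Earth's rotation rate in rad/s
Omega_z = Omega * math.sin(math.radians(43.0))
omega_0 = math.sqrt(9.81 / 10.0)       # assumed g = 9.81 m/s^2, l = 10 m

alpha_exact = Omega_z + math.sqrt(Omega_z**2 + omega_0**2)
alpha_approx = Omega_z + omega_0
residual = alpha_exact**2 - 2.0 * Omega_z * alpha_exact - omega_0**2
print("exact root:", alpha_exact, " approximate root:", alpha_approx)
```

Because the pendulum frequency exceeds the precession frequency by roughly four orders of magnitude, the exact and approximate roots agree to many significant figures.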
12.14 Summary
This chapter has focussed on describing motion in non-inertial frames of reference. It has been shown that the
force and acceleration in non-inertial frames can be related using either Newtonian or Lagrangian mechanics
by introducing additional inertial forces in the non-inertial reference frame.
Rotating reference frame It was shown that the time derivatives of a general vector G in both an
inertial frame and a rotating reference frame are related by
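$\left(\frac{d\mathbf{G}}{dt}\right)_{fix} = \left(\frac{d\mathbf{G}}{dt}\right)_{rot} + \boldsymbol{\omega}\times\mathbf{G}$ (12.16)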
where the ω × G term originates from the fact that the unit vectors in the rotating reference frame are time
dependent with respect to the inertial frame.
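Reference frame undergoing both rotation and translation Both Newtonian and Lagrangian mechanics
were used to show that for the case of translational acceleration plus rotation, the effective force in
the non-inertial (double-primed) frame can be written as
$\mathbf{F}^{eff}_{rot} = m\mathbf{a}''_{rot} = \mathbf{F}_{fix} - m\left[\mathbf{A}_{fix} + 2\boldsymbol{\omega}\times\mathbf{v}''_{rot} + \boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov}) + (\dot{\boldsymbol{\omega}}\times\mathbf{r}'_{mov})\right]$ (12.36)
These inertial correction forces result from describing the system using a non-inertial frame. These inertial
forces are felt when in the rotating-translating frame of reference. Thus the notion of these inertial forces
can be very useful for solving problems in non-inertial frames. For the case of rotating frames, two important
inertial forces are the centrifugal force, $-m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}'_{mov})$, and the Coriolis force, $-2m\boldsymbol{\omega}\times\mathbf{v}''_{rot}$.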
Routhian reduction for rotating systems It was shown that for non-inertial systems, identical equa-
tions of motion are derived using Newtonian, Lagrangian, Hamiltonian, and Routhian mechanics.
Terrestrial manifestations of rotation Examples of motion in rotating frames presented in the chapter
included projectile motion with respect to the surface of the Earth, rotation alignment of nucleons in rotating
nuclei, and weather phenomena.
Workshop exercises
1. Consider a fixed reference frame and a rotating frame 0 . The origins of the two coordinate systems always
coincide. By carefully drawing a diagram, derive an expression relating the coordinates of a point in the two
systems. (This was covered in Chapter 2, but it is worth reviewing now.)
2. The effective force observed in a rotating coordinate system is given by equation 12.28.
(a) What is the significance of each term in this expression?
(b) Suppose you wanted to measure the gravitational force, both magnitude and direction, on a body of mass
at rest on the surface of the Earth. What terms in the effective force can be neglected?
(c) Suppose you wanted to calculate the deflection of a projectile fired horizontally along the Earth’s surface.
What terms in the effective force can be neglected?
(d) Suppose you wanted to calculate the effective force on a small block of mass placed on a frictionless
turntable rotating with a time-dependent angular velocity (). What terms in the effective force can be
neglected?
3. A plumb line is carried along in a moving train, with the mass of the plumb bob. Neglect any effects due to
the rotation of the Earth and work in the noninertial frame of reference of the train.
(a) Find the tension in the cord and the deflection from the local vertical if the train is moving with constant
acceleration 0 .
(b) Find the tension in the cord and the deflection from the local vertical if the train is rounding a curve of
radius with constant speed 0 .
4. A bead on a rotating rod is free to slide without friction. The rod has a length and rotates about its end
with angular velocity . The bead is initially released from rest (relative to the rod) at the midpoint of the
rod.
(a) Find the displacement of the bead along the wire as a function of time.
(b) Find the time when the bead leaves the end of the rod.
(c) Find the velocity (relative to the rod) of the bead when it leaves the end of the rod.
5. Here is a “thought experiment” for you to consider. Suppose you are in a small sailboat of mass at the
Earth’s equator. At the equator there is very little wind (this is known as the “equatorial doldrums”), so your
sailboat is, more or less, sitting still. You have a small anchor of mass on deck and a single mast of height
in the middle of the boat. How can you use the anchor to put the boat into motion? In which direction will
the boat move?
6. Does water really flow in the other direction when you flush a toilet in the southern hemisphere? What (if
anything) does the Coriolis force have to do with this?
7. We are presently at a latitude (with respect to the equator) and Earth is rotating with constant angular
velocity . Consider the following two scenarios: Scenario A: A particle is thrown upward with initial speed
0 . Scenario B: An identical particle is dropped (at rest) from the maximum height of the particle in Scenario
A. Circle all the true statements regarding the Coriolis deflection assuming that the particles have landed for
a) and b), .
Problems
1. If a projectile is fired due east from a point on the surface of the Earth at a northern latitude $\lambda$ with a velocity
of magnitude $V_0$ and at an inclination to the horizontal of $\alpha$, show that the lateral deflection when the projectile
strikes the Earth is
$d = \frac{4V_0^3}{g^2}\omega\sin\lambda\sin^2\alpha\cos\alpha$ (12.95)
where $\omega$ is the rotation frequency of the Earth.
2. Obtain an expression for the angular deviation of a particle projected from the North Pole in a path that lies
close to the surface of the earth. Is the deviation significant for a missile that makes a 4800-km flight in 10
minutes? What is the "miss distance" if the missile is aimed directly at the target? Is the miss distance
greater for a 19300-km flight at the same velocity?
3. An automobile drag racer drives a car with acceleration and instantaneous velocity . The tires of radius 0
are not slipping. Derive which point on the tire has the greatest acceleration relative to the ground. What is
this acceleration?
4. Shot towers were popular in the eighteenth and nineteenth centuries for dropping melted lead down tall towers
to form spheres for bullets. The lead solidified while falling and often landed in water to cool the lead bullets.
Many such shot towers were built in New York State. Assume a shot tower was constructed at latitude 42◦ ,
and that the lead fell a distance of 27 m. In what direction and by how far did the lead bullets land from the
direct vertical?
Chapter 13
Rigid-body rotation
13.1 Introduction
Rigid-body rotation features prominently in science, engineering, and sports. Prior chapters have focussed
primarily on motion of point particles. This chapter extends the discussion to motion of finite-sized rigid
bodies. A rigid body is a collection of particles where the relative separations remain rigidly fixed. In real
life, there is always some motion between individual atoms, but usually this microscopic motion can be
neglected when describing macroscopic properties. Note that the concept of perfect rigidity has limitations
in the theory of relativity since information cannot travel faster than the velocity of light, and thus signals
cannot be transmitted instantaneously between the ends of a rigid body which is implied if the body had
perfect rigidity.
The description of rigid-body rotation is most easily handled by specifying the properties of the body
in the rotating body-fixed coordinate frame whereas the observables are measured in the stationary iner-
tial laboratory coordinate frame. In the body-fixed coordinate frame, the primary observable for classical
mechanics is the inertia tensor of the rigid body which is well defined and independent of the rotational
motion. By contrast, in the stationary inertial frame the observables depend sensitively on the details of the
rotational motion. For example, when observed in the stationary fixed frame, rapid rotation of a long thin
cylindrical pencil about the longitudinal symmetry axis gives a time-averaged shape of the pencil that looks
like a thin cylinder, whereas the time-averaged shape is a flat disk for rotation about an axis perpendicular
to the symmetry axis of the pencil. In spite of this, the pencil always has the same unique inertia tensor
in the body-fixed frame. Thus the best solution for describing rotation of a rigid body is to use a rotation
matrix that transforms from the stationary fixed frame to the instantaneous body-fixed frame for which the
moment of inertia tensor can be evaluated. Moreover, the problem can be greatly simplified by transforming
to a body-fixed coordinate frame that is aligned with any symmetry axes of the body since then the inertia
tensor can be diagonal; this is called a principal axis system.
Rigid-body rotation can be broken into the following two classifications.
1) Rotation about a fixed axis:
A body can be constrained to rotate about an axis that has a fixed location and orientation relative to
the body. The hinged door is a typical example. Rotation about a fixed axis is straightforward since the
axis of rotation, plus the moment of inertia about this axis, are well defined and this case was discussed in
chapter 2.12.7.
2) Rotation about a point
A body can be constrained to rotate about a fixed point of the body but the orientation of this rotation
axis about this point is unconstrained. One example is rotation of an object flying freely in space which can
rotate about the center of mass with any orientation. Another example is a child’s spinning top which has
one point constrained to touch the ground but the orientation of the rotation axis is unconstrained.
The prior discussion in chapter 2.12.7 showed that rigid-body rotation is more complicated than assumed
in introductory treatments of rigid-body rotation. It is necessary to expand the concept of moment of inertia
to the concept of the inertia tensor, plus the fact that the angular momentum may not point along the
rotation axis. The most general case requires consideration of rotation about a body-fixed point where the
orientation of the axis of rotation is unconstrained. The concept of the inertia tensor of a rotating body is
crucial for describing rigid-body motion. It will be shown that working in the body-fixed coordinate frame of
a rotating body allows a description of the equations of motion in terms of the inertia tensor for a given point
of the body, and that it is possible to rotate the body-fixed coordinate system into a principal axis system
where the inertia tensor is diagonal. The angular momentum is parallel to the angular
velocity whenever the angular velocity is aligned with a principal axis. The use of a principal axis system greatly simplifies treatment
of rigid-body rotation and exploits the powerful and elegant matrix algebra mentioned in the appendix.
The following discussion of rigid-body rotation is broken into three topics, (1) the inertia tensor of the
rigid body, (2) the transformation between the rotating body-fixed coordinate system and the laboratory
frame, i.e., the Euler angles specifying the orientation of the body-fixed coordinate frame with respect to the
laboratory frame, and (3) Lagrange and Euler’s equations of motion for rigid-bodies. This is followed by a
discussion of practical applications.
There are two especially convenient choices for the fixed point $O$. If no point in the body is fixed with
respect to an inertial coordinate system, then it is best to choose $O$ as the center of mass. If one point of
the body is fixed with respect to a fixed inertial coordinate system, such as a point on the ground where a
child’s spinning top touches, then it is best to choose this stationary point as the body-fixed point $O$.
13.3. RIGID-BODY ROTATION ABOUT A BODY-FIXED POINT 315
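where $\delta_{ij}$ is the Kronecker delta
$\delta_{ij} = 1$ for $i = j$
$\delta_{ij} = 0$ for $i \neq j$
In most cases it is more useful to express the components of the inertia tensor in an integral form over
the mass distribution rather than a summation for discrete bodies. That is,
$I_{ij} = \int \rho(\mathbf{r}')\left(\delta_{ij}\left(\sum_k^3 x_k^2\right) - x_i x_j\right)dV$ (13.13)
The inertia tensor is easier to understand when written in cartesian coordinates $\mathbf{r}'_\alpha = (x_\alpha, y_\alpha, z_\alpha)$ rather
than in the form $\mathbf{r}'_\alpha = (x_{\alpha,1}, x_{\alpha,2}, x_{\alpha,3})$. Then, the diagonal moments of inertia of the inertia tensor are
$I_{xx} \equiv \sum_\alpha m_\alpha\left[x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - x_\alpha^2\right] = \sum_\alpha m_\alpha\left[y_\alpha^2 + z_\alpha^2\right]$ (13.14)
$I_{yy} \equiv \sum_\alpha m_\alpha\left[x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - y_\alpha^2\right] = \sum_\alpha m_\alpha\left[x_\alpha^2 + z_\alpha^2\right]$
$I_{zz} \equiv \sum_\alpha m_\alpha\left[x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - z_\alpha^2\right] = \sum_\alpha m_\alpha\left[x_\alpha^2 + y_\alpha^2\right]$
The above notation for the inertia tensor allows the angular momentum (13.12) to be written as
$L_i = \sum_j^3 I_{ij}\omega_j$ (13.17)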
Note that every fixed point in a body has a specific inertia tensor. The components of the inertia tensor
at a specified point depend on the orientation of the coordinate frame whose origin is located at the specified
fixed point. For example, the inertia tensor for a cube is very different when the fixed point is at the center
of mass compared with when the fixed point is at a corner of the cube.
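For a discrete set of point masses the inertia tensor and the resulting angular momentum follow directly from the sums above. The sketch below is an illustration only; the function names and the two-mass "dumbbell" configuration are assumptions. It also shows that the angular momentum need not be parallel to the angular velocity.

```python
# Sketch: inertia tensor of point masses, I_ij = sum m (delta_ij r^2 - x_i x_j),
# and angular momentum L_i = sum_j I_ij omega_j. Names and data are illustrative.

def inertia_tensor(masses, positions):
    """Return the 3x3 inertia tensor about the origin as nested lists."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r_squared = sum(x * x for x in r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                tensor[i][j] += m * (delta * r_squared - r[i] * r[j])
    return tensor

def angular_momentum(tensor, omega):
    """Matrix-vector product L = I . omega."""
    return [sum(tensor[i][j] * omega[j] for j in range(3)) for i in range(3)]

# Dumbbell: two unit masses on the diagonal of the x-y plane.
masses = [1.0, 1.0]
positions = [(1.0, 1.0, 0.0), (-1.0, -1.0, 0.0)]
I = inertia_tensor(masses, positions)

L_about_x = angular_momentum(I, (1.0, 0.0, 0.0))   # not a principal axis
L_about_z = angular_momentum(I, (0.0, 0.0, 1.0))   # a principal axis
print("inertia tensor:", I)
print("L for rotation about x:", L_about_x)
print("L for rotation about z:", L_about_z)
```

Rotation about the x axis gives an angular momentum with a y component, because the x axis is not a principal axis of the dumbbell, whereas rotation about the z axis gives an angular momentum parallel to the angular velocity.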
13.5. MATRIX AND TENSOR FORMULATIONS OF RIGID-BODY ROTATION 317
As discussed in appendix 2, equation (13.18) now can be written in tensor notation as an inner product
of the form
L = {I} · ω (13.21)
Note that the above notation uses boldface for the inertia tensor I, implying a rank-2 tensor representation,
while the angular velocity ω and the angular momentum L are written as column vectors. The inertia tensor
is a 9-component rank-2 tensor defined as the ratio of the angular momentum vector $\mathbf{L}$ and the angular
velocity $\boldsymbol{\omega}$.
$\{\mathbf{I}\} = \frac{\mathbf{L}}{\boldsymbol{\omega}}$ (13.22)
Note that, as described in the appendix, the inner product of a vector $\boldsymbol{\omega}$, which is a rank-1 tensor, and a
rank 2 tensor {I} leads to the vector L. This compact notation exploits the fact that the matrix and tensor
representation are completely equivalent, and are ideally suited to the description of rigid-body rotation.
where are real numbers, which are called the principal moments of inertia of the body, and are
usually written as . When the angular velocity vector ω points along any principal axis unit vector ̂, then
the angular momentum L is parallel to ω and the magnitude of the principal moment of inertia about this
principal axis is given by the relation
̂ = ̂ (13.24)
The principal axes are fixed relative to the shape of the rigid body and they are invariant to the orientation
of the body-fixed coordinate system used to evaluate the inertia tensor. The advantage of having the body-
fixed coordinate frame aligned with the principal axis coordinate frame is that then the inertia tensor is
diagonal, which greatly simplifies the matrix algebra. Even when the body-fixed coordinate system is not
aligned with the principal axis frame, if the angular velocity is specified to point along a principal axis then
the corresponding moment of inertia will be given by (13.24).
In principle it is possible to locate the principal axes by varying the orientation of the angular velocity
vector ω to find those orientations for which the angular momentum L and angular velocity ω are parallel
which characterizes the principal axes. However, the best approach is to diagonalize the inertia tensor.
These equations are solved for the ratios 11 : 21 : 31 which are the direction numbers of the principal axis
system corresponding to solution 1. This principal axis system is defined relative to the original coordinate
system. This procedure is repeated to find the orientation of the other two mutually perpendicular principal
axes.
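$$\mathbf{R} = \mathbf{a} + \mathbf{r} \tag{13.34}$$
where a is the vector connecting the origins of the coordinate systems O and Q illustrated in figure 13.2. The elements of the inertia tensor $J_{ij}$ with respect to axis system Q are given by equation 13.12 to be
$$J_{ij} \equiv \sum_{\alpha}^{N} m_\alpha \left[ \delta_{ij} \left( \sum_{k}^{3} X_{\alpha,k}^{2} \right) - X_{\alpha,i} X_{\alpha,j} \right] \tag{13.35}$$
Figure 13.2: Transformation between two parallel body-coordinate systems, O and Q.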
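The components along the three axes for each of the two coordinate systems are related by
$$X_i = a_i + x_i \tag{13.36}$$
Substituting these into the above inertia tensor relation gives
$$J_{ij} = \sum_{\alpha}^{N} m_\alpha \left[ \delta_{ij} \left( \sum_{k}^{3} x_{\alpha,k}^{2} \right) - x_{\alpha,i} x_{\alpha,j} \right] + \sum_{\alpha}^{N} m_\alpha \left[ \delta_{ij} \sum_{k}^{3} \left( 2 x_{\alpha,k} a_k + a_k^2 \right) - \left( a_i x_{\alpha,j} + a_j x_{\alpha,i} + a_i a_j \right) \right] \tag{13.37}$$
The first summation on the right-hand side corresponds to the elements $I_{ij}$ of the inertia tensor in the
center-of-mass frame. Thus the terms can be regrouped to give
$$J_{ij} \equiv I_{ij} + \sum_{\alpha}^{N} m_\alpha \left( \delta_{ij} \sum_{k}^{3} a_k^2 - a_i a_j \right) + \sum_{\alpha}^{N} m_\alpha \left[ 2 \delta_{ij} \sum_{k}^{3} x_{\alpha,k} a_k - a_i x_{\alpha,j} - a_j x_{\alpha,i} \right] \tag{13.38}$$
However, each term in the last bracket involves a sum of the form $\sum_\alpha m_\alpha x_{\alpha,k}$. Take the coordinate system O
to be with respect to the center of mass for which
$$\sum_{\alpha}^{N} m_\alpha \mathbf{r}'_\alpha = 0 \tag{13.39}$$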
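This also holds for each component, $\sum_\alpha m_\alpha x_{\alpha,k} = 0$, so all of the terms in the last bracket of the regrouped expression cancel. Since $\sum_\alpha m_\alpha = M$ and $\sum_k a_k^2 = a^2$, this leaves
$$J_{ij} = I_{ij} + M \left( a^2 \delta_{ij} - a_i a_j \right)$$
where $I_{ij}$ is the center-of-mass inertia tensor. This is the general form of Steiner’s parallel-axis theorem.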
As an example, the moment of inertia around the $X_1$ axis is given by
$$J_{11} \equiv I_{11} + M \left( \left( a_1^2 + a_2^2 + a_3^2 \right) \delta_{11} - a_1^2 \right) = I_{11} + M \left( a_2^2 + a_3^2 \right) \tag{13.43}$$
which corresponds to the elementary statement that the difference in the moments of inertia equals the
mass of the body multiplied by the square of the distance between the parallel axes, $x_1$ and $X_1$. Note that, for a given axis direction, the
minimum moment of inertia of a body is $I_{11}$, which is about the axis through the center of mass.
13.1 Example: Inertia tensor of a solid cube rotating about the center of mass.
The complicated expressions for the inertia tensor can be understood using the example of a uniform solid cube with side $b$,
density $\rho$, and mass $M = \rho b^3$ rotating about different axes. Assume that the origin of the coordinate system O is at the center
of mass with the axes perpendicular to the centers of the faces of the cube.
The components of the inertia tensor can be calculated using (13.13) written as an integral over the mass distribution rather
than a summation.
$$I_{ij} = \int \rho(\mathbf{r}') \left( \delta_{ij} \left( \sum_{k}^{3} x_k^2 \right) - x_i x_j \right) dV$$
Thus
$$I_{11} = \rho \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} \left( x_2^2 + x_3^2 \right) dx_3 \, dx_2 \, dx_1 = \frac{1}{6} \rho b^5 = \frac{1}{6} M b^2 = I_{22} = I_{33}$$
By symmetry the diagonal moments of inertia about each face
are identical. Similarly the products of inertia are given by
$$I_{12} = -\rho \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} \left( x_1 x_2 \right) dx_3 \, dx_2 \, dx_1 = 0$$
Figure: Inertia tensor of a uniform solid cube of side $b$ about the center of mass O and a corner of the cube Q. The vector a is the vector distance between O and Q.
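a) Direct calculation Let one corner of the cube be the origin of the coordinate system and assume
that the three adjacent sides of the cube lie along the coordinate axes. The components of the inertia tensor
can be calculated using (13.13). Thus
$$I_{11} = \rho \int_{0}^{b} \int_{0}^{b} \int_{0}^{b} \left( x_2^2 + x_3^2 \right) dx_3 \, dx_2 \, dx_1 = \frac{2}{3} \rho b^5 = \frac{2}{3} M b^2$$
$$I_{12} = -\rho \int_{0}^{b} \int_{0}^{b} \int_{0}^{b} \left( x_1 x_2 \right) dx_3 \, dx_2 \, dx_1 = -\frac{1}{4} \rho b^5 = -\frac{1}{4} M b^2$$
Thus, evaluating all the nine components gives
$$\mathbf{I}_{corner} = \frac{1}{12} M b^2 \begin{pmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{pmatrix}$$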
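b) Parallel-axis theorem This inertia tensor also can be calculated using the parallel-axis theorem to
relate the moment of inertia about the corner to that at the center of mass. As shown in the figure, the
vector a has components
$$a_1 = a_2 = a_3 = \frac{b}{2}$$
Applying the parallel-axis theorem gives
$$J_{11} = I_{11} + M \left( a^2 - a_1^2 \right) = I_{11} + M \left( a_2^2 + a_3^2 \right) = \frac{1}{6} M b^2 + \frac{1}{2} M b^2 = \frac{2}{3} M b^2$$
and similarly for $J_{22}$ and $J_{33}$. The off-diagonal terms are given by
$$J_{12} = I_{12} + M \left( -a_1 a_2 \right) = -\frac{1}{4} M b^2$$
Thus the inertia tensor, translated from the center of mass to the corner of the cube, is
$$\mathbf{I}_{corner} = \begin{pmatrix} \frac{2}{3} M b^2 & -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 \\ -\frac{1}{4} M b^2 & \frac{2}{3} M b^2 & -\frac{1}{4} M b^2 \\ -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 & \frac{2}{3} M b^2 \end{pmatrix} = \frac{1}{12} M b^2 \begin{pmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{pmatrix}$$
This inertia tensor about the corner of the cube is the same as that obtained by direct integration.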
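The agreement between the direct integration and the parallel-axis theorem can be checked with exact rational arithmetic. The following is a minimal sketch, assuming units in which the mass and side of the cube are both 1; the function name `parallel_axis` and the variable names are illustrative and are not taken from the text.

```python
from fractions import Fraction


def parallel_axis(inertia_cm, mass, a):
    """Return J_ij = I_ij + M (a^2 delta_ij - a_i a_j) for a displacement a."""
    a_squared = sum(component * component for component in a)
    shifted = []
    for i in range(3):
        row = []
        for j in range(3):
            delta = 1 if i == j else 0
            row.append(inertia_cm[i][j] + mass * (a_squared * delta - a[i] * a[j]))
        shifted.append(row)
    return shifted


# Uniform cube with M = 1 and b = 1 (illustrative units): I_cm = (1/6) M b^2 times the unit matrix.
M = Fraction(1)
b = Fraction(1)
I_cm = [[M * b**2 / 6 if i == j else Fraction(0) for j in range(3)] for i in range(3)]
a = [b / 2, b / 2, b / 2]  # vector from the center of mass O to the corner Q

J_corner = parallel_axis(I_cm, M, a)

# Tensor from direct integration about the corner: (M b^2 / 12) [[8, -3, -3], ...].
J_direct = [[M * b**2 * Fraction(8 if i == j else -3, 12) for j in range(3)]
            for i in range(3)]

print(J_corner == J_direct)
```

Using `Fraction` rather than floating-point numbers keeps the comparison exact, so the equality test is meaningful without a tolerance.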
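c) Principal moments of inertia The coordinate axis frame used for rotation about the corner of the
cube is not a principal axis frame. Therefore let us diagonalize the inertia tensor to find the principal
axis frame and the principal moments of inertia about a corner. To achieve this requires solving the secular
determinant
$$\begin{vmatrix} \left( \frac{2}{3} M b^2 - I \right) & -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 \\ -\frac{1}{4} M b^2 & \left( \frac{2}{3} M b^2 - I \right) & -\frac{1}{4} M b^2 \\ -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 & \left( \frac{2}{3} M b^2 - I \right) \end{vmatrix} = 0$$
The value of a determinant is not affected by adding or subtracting any row or column from any other
row or column. Subtracting row 1 from row 2 gives
$$\begin{vmatrix} \left( \frac{2}{3} M b^2 - I \right) & -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 \\ -\frac{11}{12} M b^2 + I & \left( \frac{11}{12} M b^2 - I \right) & 0 \\ -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 & \left( \frac{2}{3} M b^2 - I \right) \end{vmatrix} = 0$$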
where the second subscript 1 attached to $\omega_{i1}$ signifies that this solution corresponds to $I_{11}$. This gives
$$2 \omega_{11} - \omega_{21} - \omega_{31} = 0$$
$$-\omega_{11} + 2 \omega_{21} - \omega_{31} = 0$$
$$-\omega_{11} - \omega_{21} + 2 \omega_{31} = 0$$
Solving these three equations gives the unit vector for the first principal axis, for which $I_{11} = \frac{1}{6} M b^2$, to be
$\hat{\mathbf{e}}_1 = \frac{1}{\sqrt{3}} (1, 1, 1)^T$. This can be repeated to find the other two principal axes by substituting $I_{22} = \frac{11}{12} M b^2$. This
gives for the second principal moment $I_{22}$
$$\left( \{\mathbf{I}\} - I_{22} \{\mathbf{1}\} \right) \cdot \boldsymbol{\omega} = \frac{1}{12} M b^2 \begin{pmatrix} -3 & -3 & -3 \\ -3 & -3 & -3 \\ -3 & -3 & -3 \end{pmatrix} \begin{pmatrix} \omega_{12} \\ \omega_{22} \\ \omega_{32} \end{pmatrix} = 0$$
This results in three equations for the components of ω, but all three equations are the same, namely
$$\omega_{12} + \omega_{22} + \omega_{32} = 0$$
This does not uniquely determine the direction of ω. However, it does imply that $\boldsymbol{\omega}_2$ corresponding to the
second principal axis has the property that
$$\hat{\boldsymbol{\omega}}_2 \cdot \hat{\mathbf{e}}_1 = 0$$
that is, any direction of $\hat{\mathbf{e}}_2$ that is perpendicular to $\hat{\mathbf{e}}_1$ is acceptable. In other words, any two orthogonal unit
vectors $\hat{\mathbf{e}}_2$ and $\hat{\mathbf{e}}_3$ that are perpendicular to $\hat{\mathbf{e}}_1$ are acceptable. This ambiguity exists whenever two eigenvalues
are equal; the three principal axes are only uniquely defined if all three eigenvalues are different. The same
ambiguity exists when all three eigenvalues are identical, as occurs for the principal moments of inertia about
the center-of-mass of a uniform solid cube. This explains why the principal moment of inertia for the diagonal
of the cube, that passes through the center of mass, has the same moment as when the principal axes pass
through the center of the faces of the cube.
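The principal axes quoted above can be verified by checking directly that the angular momentum is parallel to the angular velocity along each candidate axis. The following is a hedged sketch in units of $M b^2$; the helper `mat_vec` and the two particular perpendicular directions are illustrative choices, since any pair of orthogonal directions perpendicular to the body diagonal would serve equally well.

```python
from fractions import Fraction


def mat_vec(matrix, vector):
    """Product of a 3x3 matrix (nested lists) with a 3-component vector."""
    return [sum(matrix[i][k] * vector[k] for k in range(3)) for i in range(3)]


# Inertia tensor about the corner of the cube, in units of M b^2.
I_corner = [[Fraction(8 if i == j else -3, 12) for j in range(3)] for i in range(3)]

# Candidate principal axes (unnormalized) paired with the expected principal moment.
candidates = {
    "body diagonal (1, 1, 1)": ([1, 1, 1], Fraction(1, 6)),
    "perpendicular (1, -1, 0)": ([1, -1, 0], Fraction(11, 12)),
    "perpendicular (1, 1, -2)": ([1, 1, -2], Fraction(11, 12)),
}

for label, (axis, moment) in candidates.items():
    # A principal axis satisfies I . axis = moment * axis.
    is_principal = mat_vec(I_corner, axis) == [moment * x for x in axis]
    print(label, moment, is_principal)
```

The two perpendicular directions share the same moment, which illustrates the degeneracy discussed in the text.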
$$I_x + I_y = \int \rho \left( x^2 + y^2 \right) dV + 2 \int \rho z^2 \, dV \geq \int \rho \left( x^2 + y^2 \right) dV = I_z \tag{13.44}$$
Note that for any body the three principal moments of inertia must satisfy the triangle rule that the sum of
any pair must exceed or equal the third. Moreover, if the body is a thin lamina with thickness $z = 0$, that
is, a thin plate in the $x - y$ plane, then
$$I_x + I_y = I_z \tag{13.45}$$
This perpendicular-axis theorem can be very useful for solving problems involving rotation of plane laminae.
The opposite of a plane lamina is a long thin cylindrical needle of mass $m$, length $l$, and radius $R$.
Along the symmetry axis the principal moment is $I_z = \frac{1}{2} m R^2 \to 0$ as $R \to 0$, while perpendicular to the
symmetry axis $I_x = I_y = \frac{1}{12} m l^2$. These satisfy the triangle rule.
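Both the triangle rule and the perpendicular-axis theorem are easy to test numerically for a collection of point masses. The sketch below is illustrative only: the mass values and positions are arbitrary assumptions, and the function `moments_about_axes` returns the moments about the coordinate axes, for which inequality 13.44 holds whether or not those axes happen to be principal axes.

```python
def moments_about_axes(masses):
    """Moments of inertia about the x, y, z axes for point masses given as (m, x, y, z)."""
    i_x = sum(m * (y * y + z * z) for m, x, y, z in masses)
    i_y = sum(m * (x * x + z * z) for m, x, y, z in masses)
    i_z = sum(m * (x * x + y * y) for m, x, y, z in masses)
    return i_x, i_y, i_z


# Arbitrary illustrative mass distributions: one confined to the z = 0 plane, one not.
lamina = [(1.0, 1.0, 0.0, 0.0), (2.0, -0.5, 1.5, 0.0), (0.5, 0.25, -2.0, 0.0)]
solid = [(1.0, 1.0, 0.0, 0.5), (2.0, -0.5, 1.5, -1.0), (0.5, 0.25, -2.0, 2.0)]

lam_x, lam_y, lam_z = moments_about_axes(lamina)
sol_x, sol_y, sol_z = moments_about_axes(solid)

# Lamina: I_x + I_y equals I_z (perpendicular-axis theorem).
print(lam_x + lam_y, lam_z)
# General body: the sum of any two moments is at least the third (triangle rule).
print(sol_x + sol_y >= sol_z, sol_y + sol_z >= sol_x, sol_z + sol_x >= sol_y)
```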
Spherical top: $I_1 = I_2 = I_3$
A spherical top is a body having three degenerate principal moments of inertia. Such a body has the same
symmetry as the inertia tensor about the center of a uniform sphere. For a sphere it is obvious from the
symmetry that any orientation of three mutually orthogonal axes about the center of the uniform sphere are
equally good principal axes. For a uniform cube the principal axes of the inertia tensor about the center of
mass were shown to be aligned such that they pass through the center of each face, and the three principal
moments are identical; that is, inertially it is equivalent to a spherical top. A less obvious consequence of the
spherical symmetry is that any orientation of three mutually perpendicular axes about the center of mass of
a uniform cube is an equally good principal axis system.
Symmetric top: $I_1 = I_2 \neq I_3$
The equivalent ellipsoid for a body with two degenerate principal moments of inertia is a spheroid which has
cylindrical symmetry with the cylindrical axis aligned along the third axis. A body with $I_3 < I_1 = I_2$ is a
prolate spheroid while a body with $I_3 > I_1 = I_2$ is an oblate spheroid. Examples with a prolate spheroidal
equivalent inertial shape are a rugby ball, pencil, or a baseball bat. Examples of an oblate spheroid are an
orange, or a frisbee. A uniform sphere, or a uniform cube, rotating about a point displaced from the center-of-mass
also behaves inertially like a symmetric top. The cylindrical symmetry of the equivalent spheroid
makes it obvious that any mutually perpendicular axes that are normal to the axis of cylindrical symmetry
are equally good principal axes even when the cross section in the 1−2 plane is square as opposed to circular.
A rotor is a diatomic-molecule shaped body which is a special case of a symmetric top where $I_1 = 0$
and $I_2 = I_3$. The rotation of a rotor is perpendicular to the symmetry axis since the rotational energy and
angular momentum about the symmetry axis are zero because the principal moment of inertia about the
symmetry axis is zero.
Asymmetric top: $I_1 \neq I_2 \neq I_3$
A body where all three principal moments of inertia are distinct, $I_1 \neq I_2 \neq I_3$, is called an asymmetric
top. Some molecules and nuclei have asymmetric, triaxially-deformed shapes.
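The left-hand sides of these equations are identical since the inertia tensor is symmetric, that is $I_{ik} = I_{ki}$.
Therefore subtracting these equations gives
$$\sum_{i}^{3} I_m \omega_{im} \omega_{in} - \sum_{k}^{3} I_n \omega_{km} \omega_{kn} = 0 \tag{13.51}$$
That is
$$\left( I_m - I_n \right) \sum_{k}^{3} \omega_{km} \omega_{kn} = 0 \tag{13.52}$$
or
$$\left( I_m - I_n \right) \boldsymbol{\omega}_m \cdot \boldsymbol{\omega}_n = 0 \tag{13.53}$$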
If $I_m \neq I_n$ then
$$\boldsymbol{\omega}_m \cdot \boldsymbol{\omega}_n = 0 \tag{13.54}$$
which implies that the $m$ and $n$ principal axes are perpendicular. However, if $I_m = I_n$ then equation
13.53 does not require that $\boldsymbol{\omega}_m \cdot \boldsymbol{\omega}_n = 0$, that is, these axes are not necessarily perpendicular, but, with
no loss of generality, these two axes can be chosen to be perpendicular with any orientation in the plane
perpendicular to the symmetry axis.
Summarizing the above discussion, the inertia tensor has the following properties.
1) Diagonalization may be accomplished by an appropriate rotation of the axes in the body.
2) The principal moments (eigenvalues) and principal axes (eigenvectors) are obtained as roots of the
secular determinant and are real.
3) The principal axes (eigenvectors) are real and orthogonal.
4) For a symmetric top with two identical principal moments of inertia, any orientation of two orthogonal
axes perpendicular to the symmetry axis are satisfactory eigenvectors.
5) For a spherical top with three identical principal moments of inertia, the principal axes system can
have any orientation with respect to the origin.
where ω is the angular velocity, {I} the inertia tensor, and L the corresponding angular momentum.
Two important consequences of equation 13.55 are that:
• The angular momentum L and angular velocity ω are not necessarily colinear.
• In general the principal axis system of the rotating rigid body is not aligned with either the angular
momentum or angular velocity vectors.
An exception to these statements occurs when the angular velocity ω is aligned along a principal axis,
for which the inertia tensor is diagonal, i.e. $I_{ij} = I_i \delta_{ij}$, and then both L and ω point along this principal
axis. In general the angular momentum L and angular velocity ω precess around each other. An important
special case is for torque-free systems where Noether’s theorem implies that the angular momentum vector
L is conserved both in magnitude and direction. In this case, the angular velocity ω and the principal axis
system both precess around the angular momentum vector L. That is, the body appears to tumble with
respect to the laboratory fixed frame. Understanding rigid-body rotation requires care not to confuse the
body-fixed principal axis coordinate frame, used to determine the inertia tensor, and the fixed laboratory
frame where the motion is observed.
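Consider that the body is rotated about a diagonal of the cube for which the center of mass will be on
the rotation axis. Then the angular velocity vector is written as $\boldsymbol{\omega} = \omega \frac{1}{\sqrt{3}} (1, 1, 1)^T$ where the components
$\omega_x = \omega_y = \omega_z = \frac{\omega}{\sqrt{3}}$, with the angular velocity magnitude $\sqrt{\omega_x^2 + \omega_y^2 + \omega_z^2} = \omega$.
$$\mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} = \frac{1}{6} M b^2 \frac{\omega}{\sqrt{3}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{6} M b^2 \frac{\omega}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{6} M b^2 \boldsymbol{\omega}$$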
Note that L and ω again are colinear showing it also is a principal axis. Moreover, the magnitude of L
is identical for orientations of the rotation axes passing through the center of mass when centered on
either one face, or the diagonal, of the cube implying that the principal moments of inertia about these axes
are identical. This illustrates the important property that, when the three principal moments of inertia are
identical, then any orientation of the coordinate system is an equally good principal axis system. That is,
this corresponds to the spherical top where all orientations are principal axes, not just along the obvious
symmetry axes.
This is a general expression for the kinetic energy that is valid for any choice of the origin from which the
body-fixed vectors r0 are measured. However, if the origin is chosen to be the center of mass, then, and only
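then, the middle term cancels. That is, since V and ω are independent of the specific particle, then
$$\sum_{\alpha}^{N} m_\alpha \mathbf{V} \cdot \boldsymbol{\omega} \times \mathbf{r}'_\alpha = \mathbf{V} \cdot \boldsymbol{\omega} \times \left( \sum_{\alpha}^{N} m_\alpha \mathbf{r}'_\alpha \right) \tag{13.59}$$
and $\sum_\alpha m_\alpha \mathbf{r}'_\alpha = 0$ in the body-fixed frame if the selected point in the body is the center of mass. Thus, when using
the center of mass frame, the middle term of equation 13.58 is zero. Therefore, for the center of mass frame,
the kinetic energy separates into two terms in the body-fixed frame
$$T = T_{trans} + T_{rot} \tag{13.61}$$
where
$$T_{trans} = \frac{1}{2} \sum_{\alpha}^{N} m_\alpha V^2 \tag{13.62}$$
$$T_{rot} = \frac{1}{2} \sum_{\alpha}^{N} m_\alpha \left( \boldsymbol{\omega} \times \mathbf{r}'_\alpha \right) \cdot \left( \boldsymbol{\omega} \times \mathbf{r}'_\alpha \right)$$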
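The rotational kinetic energy can be expressed in terms of components of ω and $\mathbf{r}'$ in the body-fixed
frame. Also the following formulae are greatly simplified if $\mathbf{r}' = (x, y, z)$ in the rotating body-fixed frame
is written in the form $\mathbf{r}' = (x_1, x_2, x_3)$ where the axes are defined by the numbers 1, 2, 3 rather than
$x, y, z$. In this notation the rotational kinetic energy is written as
$$T_{rot} = \frac{1}{2} \sum_{\alpha}^{N} m_\alpha \left[ \left( \sum_{i} \omega_i^2 \right) \left( \sum_{k} x_{\alpha,k}^2 \right) - \left( \sum_{i} \omega_i x_{\alpha,i} \right) \left( \sum_{j} \omega_j x_{\alpha,j} \right) \right] \tag{13.65}$$
Introduce the Kronecker delta through $\omega_i = \sum_j \omega_j \delta_{ij}$, where $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ if $i \neq j$.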
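Then the kinetic energy can be written more compactly
$$T_{rot} = \frac{1}{2} \sum_{\alpha}^{N} m_\alpha \left[ \left( \sum_{i} \omega_i^2 \right) \left( \sum_{k} x_{\alpha,k}^2 \right) - \left( \sum_{i} \omega_i x_{\alpha,i} \right) \left( \sum_{j} \omega_j x_{\alpha,j} \right) \right]$$
$$= \frac{1}{2} \sum_{\alpha}^{N} \sum_{i,j}^{3} m_\alpha \left[ \left( \omega_i \omega_j \delta_{ij} \right) \left( \sum_{k}^{3} x_{\alpha,k}^2 \right) - \left( \omega_i x_{\alpha,i} \right) \left( \omega_j x_{\alpha,j} \right) \right]$$
$$= \frac{1}{2} \sum_{i,j}^{3} \omega_i \omega_j \left[ \sum_{\alpha}^{N} m_\alpha \left[ \delta_{ij} \left( \sum_{k}^{3} x_{\alpha,k}^2 \right) - x_{\alpha,i} x_{\alpha,j} \right] \right] \tag{13.67}$$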
The term in the outer square brackets is the inertia tensor defined in equation 13.12 for a discrete body. The
inertia tensor components for a continuous body are given by equation 13.13.
Thus the rotational component of the kinetic energy can be written in terms of the inertia tensor as
$$T_{rot} = \frac{1}{2} \sum_{i,j}^{3} I_{ij} \omega_i \omega_j \tag{13.68}$$
Note that when the inertia tensor is diagonal, then the evaluation of the kinetic energy simplifies to
$$T_{rot} = \frac{1}{2} \sum_{i}^{3} I_{ii} \omega_i^2 \tag{13.69}$$
which is the familiar relation in terms of the scalar moment of inertia discussed in elementary mechanics.
Equation 13.68 also can be factored in terms of the angular momentum L.
$$T_{rot} = \frac{1}{2} \sum_{i,j}^{3} \omega_i I_{ij} \omega_j = \frac{1}{2} \sum_{i}^{3} \omega_i \sum_{j}^{3} I_{ij} \omega_j = \frac{1}{2} \sum_{i}^{3} \omega_i L_i \tag{13.70}$$
As mentioned earlier, tensor algebra is an elegant and compact way of expressing such matrix operations.
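Thus it is possible to express the rotational kinetic energy as
$$T_{rot} = \frac{1}{2} \begin{pmatrix} \omega_1 & \omega_2 & \omega_3 \end{pmatrix} \cdot \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} \cdot \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} \tag{13.71}$$
$$T_{rot} \equiv T = \frac{1}{2} \boldsymbol{\omega} \cdot \{\mathbf{I}\} \cdot \boldsymbol{\omega} \tag{13.72}$$
where the rotational energy T is a scalar. Using equation 13.55 the rotational component of the kinetic
energy also can be written as
$$T_{rot} \equiv T = \frac{1}{2} \boldsymbol{\omega} \cdot \mathbf{L} \tag{13.73}$$
which is the same as given by (13.70). It is interesting to realize that even though L = {I} · ω is the inner
product of a tensor and a vector, it is a vector as illustrated by the fact that the inner product $T_{rot} = \frac{1}{2} \boldsymbol{\omega} \cdot \mathbf{L} = \frac{1}{2} \boldsymbol{\omega} \cdot (\{\mathbf{I}\} \cdot \boldsymbol{\omega})$
is a scalar. Note that the translational kinetic energy must be added to the rotational
kinetic energy to get the total kinetic energy as given by equation 13.61.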
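The equivalence of the double-sum form and the ω · L form of the rotational kinetic energy can be illustrated numerically. The sketch below assumes the corner inertia tensor of the cube in units of $M b^2$ and an arbitrary angular velocity; the helper names are illustrative.

```python
def mat_vec(matrix, vector):
    """Product of a 3x3 matrix (nested lists) with a 3-component vector."""
    return [sum(matrix[i][k] * vector[k] for k in range(3)) for i in range(3)]


def dot(u, v):
    """Scalar product of two 3-component vectors."""
    return sum(a * b for a, b in zip(u, v))


# Corner inertia tensor of the uniform cube, in units of M b^2.
I_corner = [[(8.0 if i == j else -3.0) / 12.0 for j in range(3)] for i in range(3)]
omega = [0.3, -1.2, 2.0]  # arbitrary angular velocity components in the body-fixed frame

# Angular momentum L = {I} . omega, which in general is not parallel to omega.
L = mat_vec(I_corner, omega)

# T_rot as the double sum over I_ij omega_i omega_j.
T_double_sum = 0.5 * sum(I_corner[i][j] * omega[i] * omega[j]
                         for i in range(3) for j in range(3))
# T_rot as one half of omega . L.
T_from_L = 0.5 * dot(omega, L)

print(T_double_sum, T_from_L)
```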
1) Rotation about the space-fixed ẑ axis from the space x̂ axis to the line of nodes n̂ : The
first rotation $(\mathbf{x}, \mathbf{y}, \mathbf{z}) \cdot \lambda_\phi \to (\mathbf{n}, \mathbf{y}', \mathbf{z})$ is in a right-handed direction through an angle $\phi$ about the space-fixed
z axis. Since the rotation takes place in the x − y plane, the transformation matrix is
$$\{\lambda_\phi\} = \begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{13.75}$$
1 The space-fixed coordinate frame and the body-fixed coordinate frames are unambiguously defined, that is, the space-fixed
frame is stationary while the body-fixed frame is the principal-axis frame of the body. There are several possible intermediate
frames that can be used to define the Euler angles. The $z - x - z$ sequence of rotations, used here, is used in most physics
textbooks in classical mechanics. Unfortunately scientists and engineers use slightly different conventions for defining the Euler
angles. As discussed in Appendix A of "Classical Mechanics" by Goldstein, nuclear and particle physicists have adopted the
$z - y - z$ sequence of rotations while the US and UK aerodynamicists have adopted an $x - y - z$ sequence of rotations.
This leads to the intermediate coordinate system $(\mathbf{n}, \mathbf{y}', \mathbf{z})$ where the rotated x axis now is colinear with the
n axis of the intermediate frame, that is, the line of nodes.
The precession angular velocity $\dot{\phi}$ is the rate of change of angle of the line of nodes with respect to the space
$x$ axis about the space-fixed $z$ axis.
2) Rotation about the line of nodes n̂ from the space ẑ axis to the body-fixed 3̂ axis: The
second rotation
$$(\mathbf{n}, \mathbf{y}', \mathbf{z}) \cdot \lambda_\theta \to (\mathbf{n}, \mathbf{y}'', \mathbf{3}) \tag{13.77}$$
is in a right-handed direction through the angle $\theta$ about the n̂ axis (line of nodes) so that the “z” axis becomes
colinear with the body-fixed 3̂ axis. Because the rotation now is in the ẑ− 3̂ plane, the transformation matrix
is
$$\{\lambda_\theta\} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix} \tag{13.78}$$
The line of nodes, which is at the intersection of the space-fixed and body-fixed planes shown in figure 13.3,
points in the n̂ = ẑ × 3̂ direction. The new “z” axis now is the body-fixed 3̂ axis. The angular velocity $\dot{\theta}$ is
the rate of change of angle of the body-fixed 3̂-axis relative to the space-fixed ẑ-axis about the line of nodes.
3) Rotation about the body-fixed 3̂ axis from the line of nodes to the body-fixed 1̂ axis: The
third rotation
$$(\mathbf{n}, \mathbf{y}'', \mathbf{3}) \cdot \lambda_\psi \to (\hat{\mathbf{1}}, \hat{\mathbf{2}}, \hat{\mathbf{3}}) \tag{13.79}$$
is in a right-handed direction through the angle $\psi$ about the new body-fixed 3̂ axis. This third rotation
transforms the rotated intermediate $(\mathbf{n}, \mathbf{y}'', \mathbf{3})$ frame to the final body-fixed coordinate system $(\hat{\mathbf{1}}, \hat{\mathbf{2}}, \hat{\mathbf{3}})$. The
transformation matrix is
$$\{\lambda_\psi\} = \begin{pmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{13.80}$$
The spin angular velocity $\dot{\psi}$ is the rate of change of the angle of the body-fixed 1-axis with respect to the
line of nodes about the body-fixed 3 axis.
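The total rotation matrix {λ} is given by
$$\{\lambda\} = \{\lambda_\psi\} \cdot \{\lambda_\theta\} \cdot \{\lambda_\phi\} \tag{13.81}$$
Thus the complete rotation from the space-fixed (x y z) axis system to the body-fixed (1 2 3) axis system
is given by
$$(\mathbf{1}\ \mathbf{2}\ \mathbf{3}) = \{\lambda\} \cdot (\mathbf{x}\ \mathbf{y}\ \mathbf{z}) \tag{13.82}$$
where {λ} is given by the triple product equation (13.81) leading to the rotation matrix
$$\{\lambda\} = \begin{pmatrix} \cos\phi\cos\psi - \sin\phi\cos\theta\sin\psi & \sin\phi\cos\psi + \cos\phi\cos\theta\sin\psi & \sin\theta\sin\psi \\ -\cos\phi\sin\psi - \sin\phi\cos\theta\cos\psi & -\sin\phi\sin\psi + \cos\phi\cos\theta\cos\psi & \sin\theta\cos\psi \\ \sin\phi\sin\theta & -\cos\phi\sin\theta & \cos\theta \end{pmatrix} \tag{13.83}$$
The inverse transformation from the body-fixed axis system to the space-fixed axis system is given by
$$\{\lambda\}^{-1} = \{\lambda\}^{T} = \begin{pmatrix} \cos\phi\cos\psi - \sin\phi\cos\theta\sin\psi & -\cos\phi\sin\psi - \sin\phi\cos\theta\cos\psi & \sin\phi\sin\theta \\ \sin\phi\cos\psi + \cos\phi\cos\theta\sin\psi & -\sin\phi\sin\psi + \cos\phi\cos\theta\cos\psi & -\cos\phi\sin\theta \\ \sin\theta\sin\psi & \sin\theta\cos\psi & \cos\theta \end{pmatrix} \tag{13.85}$$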
Taking the product $\{\lambda\} \{\lambda\}^{-1} = \{\lambda\} \{\lambda\}^{T} = \{\mathbf{1}\}$, the unit matrix, shows that the rotation matrix is a proper orthogonal matrix.
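The triple product of the three elementary rotations and the orthogonality of the result can be confirmed numerically. The following sketch builds the rotation matrix from the matrices of equations 13.75, 13.78 and 13.80, and compares it with the closed form of equation 13.83; the function names and the test angles are illustrative assumptions.

```python
import math


def mat_mul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def rot_about_third(angle):
    """Matrix of the form used for lambda_phi and lambda_psi (13.75, 13.80)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_about_first(angle):
    """Matrix of the form used for lambda_theta (13.78)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def euler_matrix(phi, theta, psi):
    """Triple product lambda_psi . lambda_theta . lambda_phi."""
    return mat_mul(rot_about_third(psi), mat_mul(rot_about_first(theta), rot_about_third(phi)))


def closed_form(phi, theta, psi):
    """Closed-form rotation matrix of equation 13.83."""
    cf, sf = math.cos(phi), math.sin(phi)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(psi), math.sin(psi)
    return [[cf * cp - sf * ct * sp, sf * cp + cf * ct * sp, st * sp],
            [-cf * sp - sf * ct * cp, -sf * sp + cf * ct * cp, st * cp],
            [sf * st, -cf * st, ct]]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


phi, theta, psi = 0.7, 1.1, -0.4  # arbitrary test angles in radians
lam = euler_matrix(phi, theta, psi)
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

closed_form_error = max_abs_diff(lam, closed_form(phi, theta, psi))
orthogonality_error = max_abs_diff(mat_mul(lam, transpose(lam)), identity)
print(closed_form_error, orthogonality_error)
```

Both printed numbers should be at the level of floating-point rounding error.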
The use of three different coordinate systems, space-fixed, the intermediate line of nodes, and the body-fixed
frame can be confusing at first glance. Basically the angle $\phi$ specifies the rotation about the space-fixed
$z$ axis between the space-fixed $x$ axis and the line of nodes of the Euler angle intermediate frame. The angle
$\psi$ specifies the rotation about the body-fixed 3 axis between the line of nodes and the body-fixed 1 axis. Note
that although the space-fixed and body-fixed axes systems each are orthogonal, the Euler angle basis in
general is not orthogonal. For rigid-body rotation the rotation angle $\phi$ about the space-fixed $z$ axis is time
dependent, that is, the line of nodes is rotating with an angular velocity $\dot{\phi}$ with respect to the space-fixed
coordinate frame. Similarly the body-fixed coordinate frame is rotating about the body-fixed 3 axis with
angular velocity $\dot{\psi}$ relative to the line of nodes.
Note that the precession angular velocity $\dot{\phi}$ is the angular velocity that the body-fixed ê3 and ẑ × 3̂ axes
precess around the space-fixed ẑ axis. Table 13.1 gives the Euler angular velocities required to calculate
the components of the angular velocity ω for the body-fixed (1 2 3) axis system. Collecting the individual
components of ω gives the components of the angular velocity of the body, relative to the space-fixed axes,
in the body-fixed axis system (1 2 3)
$$\omega_1 = \dot{\phi} \sin\theta \sin\psi + \dot{\theta} \cos\psi \tag{13.86}$$
$$\omega_2 = \dot{\phi} \sin\theta \cos\psi - \dot{\theta} \sin\psi \tag{13.87}$$
$$\omega_3 = \dot{\phi} \cos\theta + \dot{\psi} \tag{13.88}$$
The angular velocity of the body about the body-fixed 3-axis, $\omega_3$, is the sum of the projection of the
precession angular velocity of the line-of-nodes $\dot{\phi}$ with respect to the space-fixed x-axis, plus the angular
velocity $\dot{\psi}$ of the body-fixed 3-axis with respect to the rotating line-of-nodes.
Similarly, the components of the body angular velocity ω for the space-fixed axis system $(x, y, z)$ can be
derived to be
$$\omega_x = \dot{\theta} \cos\phi + \dot{\psi} \sin\theta \sin\phi \tag{13.89}$$
$$\omega_y = \dot{\theta} \sin\phi - \dot{\psi} \sin\theta \cos\phi \tag{13.90}$$
$$\omega_z = \dot{\phi} + \dot{\psi} \cos\theta \tag{13.91}$$
Note that when $\theta = 0$ then the Euler angles are singular in that the space-fixed $z$ axis is parallel with
the body-fixed 3 axis and there is no way of distinguishing between precession $\dot{\phi}$ and spin $\dot{\psi}$, leading to
$\omega_z = \omega_3 = \dot{\phi} + \dot{\psi}$. When $\theta = \pi$ then the $z$ axis and 3 axis are antiparallel and $\omega_z = \dot{\phi} - \dot{\psi} = -\omega_3$. The other
special case is when $\cos\theta = 0$ for which the Euler angle system is orthogonal and the space-fixed $\omega_z = \dot{\phi}$,
that is, it equals the precession, while the body-fixed $\omega_3 = \dot{\psi}$, that is, it equals the spin. When the Euler
angle basis is not orthogonal then equations (13.86 to 13.88) and (13.89 to 13.91) are needed for expressing the
Euler equations of motion in either the body-fixed frame or the space-fixed frame respectively.
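Equations 13.86 to 13.88 for the components of the angular velocity in the body-fixed frame can be expressed
in terms of the Euler angle velocities in a matrix form as
$$\begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} = \begin{pmatrix} \sin\theta\sin\psi & \cos\psi & 0 \\ \sin\theta\cos\psi & -\sin\psi & 0 \\ \cos\theta & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} \dot{\phi} \\ \dot{\theta} \\ \dot{\psi} \end{pmatrix} \tag{13.92}$$
Again note that the transformation matrix is not orthogonal, which is to be expected since the Euler angular
velocities are about axes that do not form a rectangular system of coordinates. Similarly equations 13.89 to 13.91
for the angular velocity in the space-fixed frame can be expressed in terms of the Euler angle velocities in
matrix form as
$$\begin{pmatrix} \omega_x \\ \omega_y \\ \omega_z \end{pmatrix} = \begin{pmatrix} 0 & \cos\phi & \sin\theta\sin\phi \\ 0 & \sin\phi & -\sin\theta\cos\phi \\ 1 & 0 & \cos\theta \end{pmatrix} \cdot \begin{pmatrix} \dot{\phi} \\ \dot{\theta} \\ \dot{\psi} \end{pmatrix} \tag{13.93}$$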
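Using equations 13.86 to 13.88 for the body-fixed angular velocities gives the rotational kinetic energy in terms
of the Euler angular velocities and principal-frame moments of inertia to be
$$T_{rot} = \frac{1}{2} \left[ I_1 \left( \dot{\phi} \sin\theta \sin\psi + \dot{\theta} \cos\psi \right)^2 + I_2 \left( \dot{\phi} \sin\theta \cos\psi - \dot{\theta} \sin\psi \right)^2 + I_3 \left( \dot{\phi} \cos\theta + \dot{\psi} \right)^2 \right] \tag{13.95}$$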
Similarly, the scalar product can be calculated using the Euler angle velocities for the space-fixed frame
using equations 13.89 to 13.91.
$$\boldsymbol{\omega} \cdot \boldsymbol{\omega} = |\omega|^2 = \omega_x^2 + \omega_y^2 + \omega_z^2 = \dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2 + 2 \dot{\phi} \dot{\psi} \cos\theta$$
This shows the obvious result that the scalar product $\boldsymbol{\omega} \cdot \boldsymbol{\omega} = |\omega|^2$ is invariant to rotations of the coordinate
frame, that is, it is identical when evaluated in either the space-fixed, or body-fixed frames.
Note that for $\theta = 0$, the 3̂ and ẑ axes are parallel, and perpendicular to the n̂ axis, then
$$|\omega|^2 = \left( \dot{\phi} + \dot{\psi} \right)^2 + \dot{\theta}^2$$
For the case when $\theta = 180^\circ$, the 3̂ and ẑ axes are antiparallel, and perpendicular to the n̂ axis, then
$$|\omega|^2 = \left( \dot{\phi} - \dot{\psi} \right)^2 + \dot{\theta}^2$$
For the case when $\theta = 90^\circ$, the 3̂, ẑ, and n̂ axes are mutually perpendicular, that is, orthogonal, and then
$$|\omega|^2 = \dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2$$
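The rotational invariance of the scalar product ω · ω can be checked numerically by evaluating the components of ω in both frames from the same Euler angles and Euler angular velocities. This is a minimal sketch: the functions `omega_body` and `omega_space` are illustrative names that encode the matrix relations 13.92 and 13.93 (taking $\omega_y = \dot{\theta}\sin\phi - \dot{\psi}\sin\theta\cos\phi$), and the numerical values are arbitrary.

```python
import math


def omega_body(phi, theta, psi, dphi, dtheta, dpsi):
    """Components of the angular velocity along the body-fixed (1, 2, 3) axes."""
    w1 = dphi * math.sin(theta) * math.sin(psi) + dtheta * math.cos(psi)
    w2 = dphi * math.sin(theta) * math.cos(psi) - dtheta * math.sin(psi)
    w3 = dphi * math.cos(theta) + dpsi
    return w1, w2, w3


def omega_space(phi, theta, psi, dphi, dtheta, dpsi):
    """Components of the angular velocity along the space-fixed (x, y, z) axes."""
    wx = dtheta * math.cos(phi) + dpsi * math.sin(theta) * math.sin(phi)
    wy = dtheta * math.sin(phi) - dpsi * math.sin(theta) * math.cos(phi)
    wz = dphi + dpsi * math.cos(theta)
    return wx, wy, wz


def norm_squared(vector):
    return sum(component * component for component in vector)


# Arbitrary illustrative Euler angles (radians) and angular velocities (radians per second).
state = (0.4, 1.2, -0.9, 0.7, -0.3, 1.5)
phi, theta, psi, dphi, dtheta, dpsi = state

body_value = norm_squared(omega_body(*state))
space_value = norm_squared(omega_space(*state))
closed_form = dphi**2 + dtheta**2 + dpsi**2 + 2.0 * dphi * dpsi * math.cos(theta)

print(body_value, space_value, closed_form)
```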
The time-averaged shape of a rapidly-rotating body, as seen in the fixed inertial frame, is very different
from the actual shape of the body, and this difference depends on the rotational frequency. For example, a
pencil rotating rapidly about an axis perpendicular to the body-fixed symmetry axis has an average shape
that is a flat disk in the laboratory frame which bears little resemblance to a pencil. The actual shape of the
pencil could be determined by taking high-speed photographs which display the instantaneous body-fixed
shape of the object at given times. Unfortunately for fast rotation, such as rotation of a molecule or a
nucleus, it is not possible to take photographs with sufficient speed and spatial resolution to observe the
instantaneous shape of the rotating body. What is measured is the average shape of the body as seen in the
fixed laboratory frame. In principle the shape observed in the fixed inertial frame can be related to the shape
in the body-fixed frame, but this requires knowing the body-fixed shape which in general is not known. For
example, a deformed nucleus may be both vibrating and rotating about some triaxially deformed average
shape which is a function of the rotational frequency. This is not apparent from the shapes measured in the
fixed frame for each of the excited states.
The fact that scalar products are rotationally invariant, provides a powerful means of transforming prod-
ucts of observables in the body-fixed frame, to those in the laboratory frame. In 1971 Cline developed
a powerful model-independent method that utilizes rotationally-invariant products of the electromagnetic
quadrupole operator E2 to relate the electromagnetic E2 properties for the observed levels of a rotating
nucleus measured in the laboratory frame, to the electromagnetic E2 properties of the deformed rotating
nucleus measured in the body-fixed frame.[Cli71, Cli72, Cli86] The method uses the fact that scalar products
of the electromagnetic multipole operators are rotationally invariant. This allows transforming scalar prod-
ucts of a complete set of measured electromagnetic matrix elements, measured in the laboratory frame, into
the electromagnetic properties in the body-fixed frame of the rotating nucleus. These rotational invariants
provide a model-independent determination of the magnitude, triaxiality, and vibrational amplitudes of the
average shapes in the body-fixed frame for individual observed nuclear states that may be undergoing both
rotation and vibration. When the bombarding energy is below the Coulomb barrier, the scattering of a
projectile nucleus by a target nucleus is due purely to the electromagnetic interaction since the distance
of closest approach exceeds the range of the nuclear force. For such pure Coulomb collisions, the electromagnetic
excitation of collective nuclei populates many excited states, as illustrated in figure 14.13, with
cross sections that are a direct measure of the E2 matrix elements. These measured matrix elements are
precisely those required to evaluate, in the laboratory frame, the E2 rotational invariants from which it is
possible to deduce the intrinsic quadrupole shapes of the rotating-vibrating nuclear states in the body-fixed
frame[Cli86].
Note that this relation is expressed in the inertial space-fixed frame of reference, not the non-inertial body-fixed
frame. The subscript "space" is added to emphasize that this equation is written in the inertial space-fixed
frame of reference. However, as already discussed, it is much more convenient to transform from the space-fixed
inertial frame to the body-fixed frame for which the inertia tensor of the rigid body is known. Thus the
next stage is to express the rotational motion in terms of the body-fixed frame of reference. For simplicity,
translational motion will be ignored.
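The rate of change of angular momentum can be written in terms of the body-fixed value, using the
transformation from the space-fixed inertial frame (x̂ ŷ ẑ) to the rotating frame (ê1 ê2 ê3 ) as given in
chapter 10.3,
$$\mathbf{N} = \left( \frac{d\mathbf{L}}{dt} \right)_{space} = \left( \frac{d\mathbf{L}}{dt} \right)_{body} + \boldsymbol{\omega} \times \mathbf{L} \tag{13.99}$$
However, the body axis $\hat{\mathbf{e}}_i$ is chosen to be the principal axis such that
$$L_i = I_i \omega_i \tag{13.100}$$
where the principal moments of inertia are written as $I_i$. Thus the equation of motion can be written using
the body-fixed coordinate system as
$$\mathbf{N} = I_1 \dot{\omega}_1 \hat{\mathbf{e}}_1 + I_2 \dot{\omega}_2 \hat{\mathbf{e}}_2 + I_3 \dot{\omega}_3 \hat{\mathbf{e}}_3 + \begin{vmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ \omega_1 & \omega_2 & \omega_3 \\ I_1 \omega_1 & I_2 \omega_2 & I_3 \omega_3 \end{vmatrix} \tag{13.101}$$
$$= \left( I_1 \dot{\omega}_1 - (I_2 - I_3) \omega_2 \omega_3 \right) \hat{\mathbf{e}}_1 + \left( I_2 \dot{\omega}_2 - (I_3 - I_1) \omega_3 \omega_1 \right) \hat{\mathbf{e}}_2 + \left( I_3 \dot{\omega}_3 - (I_1 - I_2) \omega_1 \omega_2 \right) \hat{\mathbf{e}}_3 \tag{13.102}$$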
$$N_1 = I_1 \dot{\omega}_1 - (I_2 - I_3) \omega_2 \omega_3 \tag{13.103}$$
$$N_2 = I_2 \dot{\omega}_2 - (I_3 - I_1) \omega_3 \omega_1$$
$$N_3 = I_3 \dot{\omega}_3 - (I_1 - I_2) \omega_1 \omega_2$$
These are the Euler equations for a rigid body in a force field expressed in the body-fixed coordinate
frame. They are applicable for any applied external torque N.
The motion of a rigid body depends on the structure of the body only via the three principal moments
of inertia $I_1$, $I_2$, and $I_3$. Thus all bodies having the same principal moments of inertia will behave exactly the
same even though the bodies may have very different shapes. As discussed earlier, the simplest geometrical
shape of a body having three different principal moments is a homogeneous ellipsoid. Thus, the rigid-body
motion often is described in terms of the equivalent ellipsoid that has the same principal moments.
A deficiency of Euler’s equations is that the solutions yield the time variation of ω as seen from the body-fixed
reference frame axes, and not in the observer’s fixed inertial coordinate frame. Similarly the components
of the external torques in the Euler equations are given with respect to the body-fixed axis system which
implies that the orientation of the body is already known. Thus for non-zero external torques the problem
cannot be solved until the orientation is known in order to determine the components $N_i$. However,
these difficulties disappear when the external torques are zero, or if the motion of the body is known and it
is required to compute the applied torques necessary to produce such motion.
since the $\psi$ and ê3 axes are colinear. This can be rewritten as
$$I_3 \dot{\omega}_3 - (I_1 - I_2) \omega_1 \omega_2 = N_3 \tag{13.109}$$
Any axis could have been designated the ê3 axis, thus the above equation can be generalized to all three
axes to give
$$I_1 \dot{\omega}_1 - (I_2 - I_3) \omega_2 \omega_3 = N_1 \tag{13.110}$$
$$I_2 \dot{\omega}_2 - (I_3 - I_1) \omega_3 \omega_1 = N_2$$
$$I_3 \dot{\omega}_3 - (I_1 - I_2) \omega_1 \omega_2 = N_3$$
These are the Euler’s equations given previously in (13.103). Note that although the $\dot{\omega}_3$ equation is the equation
of motion for the $\psi$ coordinate, this is not true for the φ and θ rotations which are not along the body-fixed
1 and 2 axes as given in table 13.1.
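Because L is perpendicular to the shaft, and L rotates around ω as the shaft rotates, let ê2 be along L
$$\mathbf{L} = L_2 \hat{\mathbf{e}}_2$$
$$\omega_1 = 0$$
$$\omega_2 = \omega \sin\alpha$$
$$\omega_3 = \omega \cos\alpha$$
$$L_1 = I_1 \omega_1 = 0$$
$$L_2 = I_2 \omega_2 = (m_1 + m_2) b^2 \omega \sin\alpha$$
$$L_3 = I_3 \omega_3 = 0$$
Figure: Rotation of a dumbbell.
which is consistent with the angular momentum being along the ê2 axis. Here $\alpha$ denotes the angle between ω and the shaft (the ê3 axis), $b$ the distance of each mass from the origin, and $I_3 = 0$ for point masses lying on the shaft.
Using Euler’s equations, and assuming that the angular velocity is constant, i.e. $\dot{\omega} = 0$, then the components
of the torque required to satisfy this motion are
$$N_1 = -(m_1 + m_2) b^2 \omega^2 \sin\alpha \cos\alpha$$
$$N_2 = 0$$
$$N_3 = 0$$
That is, this motion can only occur in the presence of the above applied torque which is in the direction
−ê1, that is, mutually perpendicular to ê2 and ê3. This torque can be written as N = ω × L.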
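The statement that the required torque equals ω × L can be checked by evaluating Euler's equations with zero angular acceleration and comparing with the cross product. The sketch below is illustrative: the symbols for the dumbbell half-length and the angle between ω and the shaft, as well as all numerical values, are assumptions made for the example.

```python
import math


def cross(u, v):
    """Vector product u x v of two 3-component vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def euler_torque(moments, omega, omega_dot):
    """Torque components from Euler's equations in the principal-axis body frame."""
    i1, i2, i3 = moments
    w1, w2, w3 = omega
    d1, d2, d3 = omega_dot
    return (i1 * d1 - (i2 - i3) * w2 * w3,
            i2 * d2 - (i3 - i1) * w3 * w1,
            i3 * d3 - (i1 - i2) * w1 * w2)


# Illustrative dumbbell: two point masses at distance half_length from the origin along the 3 axis.
m1, m2, half_length = 1.0, 1.0, 0.5
speed = 4.0                      # magnitude of the angular velocity
angle = math.radians(30.0)       # assumed angle between omega and the shaft

moments = ((m1 + m2) * half_length**2, (m1 + m2) * half_length**2, 0.0)
omega = (0.0, speed * math.sin(angle), speed * math.cos(angle))
L = tuple(i * w for i, w in zip(moments, omega))

N_euler = euler_torque(moments, omega, (0.0, 0.0, 0.0))
N_cross = cross(omega, L)
print(N_euler, N_cross)
```

For these values the only non-zero component is along the negative 1 axis, in agreement with the direction stated in the text.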
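$$(I_2 - I_3) \omega_2 \omega_3 - I_1 \dot{\omega}_1 = 0 \tag{13.111}$$
$$(I_3 - I_1) \omega_3 \omega_1 - I_2 \dot{\omega}_2 = 0 \tag{13.112}$$
$$I_3 \dot{\omega}_3 = 0 \tag{13.113}$$
Since $I_1 = I_2$ for the symmetric top, the first two equations can be written as
$$\dot{\omega}_1 + \Omega \omega_2 = 0 \tag{13.114}$$
$$\dot{\omega}_2 - \Omega \omega_1 = 0 \tag{13.115}$$
where the precession angular velocity $\Omega = -\dot{\psi}$ with respect to the body-fixed frame is defined to be
$$\Omega \equiv \left( \frac{I_3 - I_1}{I_1} \right) \omega_3 \tag{13.116}$$
Combining the time derivatives of equations 13.114 and 13.115 leads to two uncoupled equations
$$\ddot{\omega}_1 + \Omega^2 \omega_1 = 0 \tag{13.117}$$
$$\ddot{\omega}_2 + \Omega^2 \omega_2 = 0 \tag{13.118}$$
These are the differential equations for a harmonic oscillator with solutions
$$\omega_1 = A \cos \Omega t \tag{13.119}$$
$$\omega_2 = A \sin \Omega t$$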
These equations describe a vector rotating in a circle of radius $A$ in the plane perpendicular to $\hat{\mathbf{e}}_3$, that
is, rotating in the $\hat{\mathbf{e}}_1 - \hat{\mathbf{e}}_2$ plane, with angular frequency $\Omega = -\dot{\psi}$. Note that
$$\omega_1^2 + \omega_2^2 = A^2 \tag{13.120}$$
which is a constant. In addition $\omega_3$ is constant, therefore the magnitude of the total angular velocity
$$|\boldsymbol{\omega}| = \sqrt{\omega_1^2 + \omega_2^2 + \omega_3^2} = \text{constant} \tag{13.121}$$
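The harmonic solution for the torque-free symmetric top can be compared with a direct numerical integration of Euler's equations. The following sketch uses a standard fourth-order Runge-Kutta step, which is a choice made here for illustration and is not a method taken from the text; the moments of inertia, amplitude, and step size are arbitrary assumptions.

```python
import math


def omega_dot(omega, i1, i3):
    """Torque-free Euler equations for a symmetric top with I1 = I2."""
    w1, w2, w3 = omega
    return ((i1 - i3) * w2 * w3 / i1, (i3 - i1) * w3 * w1 / i1, 0.0)


def rk4_step(omega, dt, i1, i3):
    """Advance omega by one time step dt with the classical Runge-Kutta scheme."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = omega_dot(omega, i1, i3)
    k2 = omega_dot(shifted(omega, k1, dt / 2), i1, i3)
    k3 = omega_dot(shifted(omega, k2, dt / 2), i1, i3)
    k4 = omega_dot(shifted(omega, k3, dt), i1, i3)
    return tuple(w + dt * (a + 2 * b + 2 * c + d) / 6
                 for w, a, b, c, d in zip(omega, k1, k2, k3, k4))


# Illustrative oblate top: I3 > I1, so the body-frame precession frequency is positive.
I1, I3 = 1.0, 2.0
A, W3 = 0.5, 3.0
OMEGA = (I3 - I1) / I1 * W3  # precession frequency of equation 13.116

dt, steps = 0.001, 2000
omega = (A, 0.0, W3)  # initial condition matching omega_1 = A cos(0), omega_2 = A sin(0)
for _ in range(steps):
    omega = rk4_step(omega, dt, I1, I3)

t = steps * dt
expected = (A * math.cos(OMEGA * t), A * math.sin(OMEGA * t), W3)
error = max(abs(a - b) for a, b in zip(omega, expected))
magnitude = math.sqrt(sum(w * w for w in omega))
print(error, magnitude)
```

The integrated components track the analytic cosine and sine solutions, and the magnitude of ω stays at its initial value, as stated in equation 13.121.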
The motion of the torque-free symmetric body is that the angular velocity ω precesses around the
symmetry axis $\hat{\mathbf{e}}_3$ of the body at an angle $\alpha$ with a constant precession frequency Ω with respect to the
body-fixed frame as shown in figure 13.4. Thus, to an observer on the body, ω traces out a cone around the
body-fixed symmetry axis. Note from (13.116) that the vectors $\Omega \hat{\mathbf{e}}_3$ and $\omega_3 \hat{\mathbf{e}}_3$ are parallel when Ω is positive,
that is, $I_3 > I_1$ (oblate shape), and antiparallel if $I_3 < I_1$ (prolate shape).
For the system considered, the orientation of the angular momentum vector L must be stationary in the
space-fixed inertial frame since the system is torque free, that is, L is a constant of motion. Also we have
that the projection of the angular momentum on the body-fixed symmetry axis is a constant of motion, that
is, it is a cyclic variable. Thus
$$L_3 = I_3 \omega_3 = \frac{I_1 I_3}{(I_3 - I_1)} \Omega \tag{13.122}$$
Understanding the relation between the angular momentum and angular velocity is facilitated by considering
another constant of motion for the torque-free symmetric rotor, namely the rotational kinetic energy.
$$T_{rot} = \frac{1}{2} \boldsymbol{\omega} \cdot \mathbf{L} = \text{constant} \tag{13.123}$$
Since L is a constant for torque-free motion, and also the magnitude of ω was shown to be constant, therefore
the angle between these two vectors must be a constant to ensure that also $T_{rot} = \frac{1}{2} \boldsymbol{\omega} \cdot \mathbf{L}$ = constant. That
is, ω precesses around L at a constant angle $(\theta - \alpha)$ such that the projection of ω onto L is constant. Note
that
$$\boldsymbol{\omega} \times \hat{\mathbf{e}}_3 = \omega_2 \hat{\mathbf{e}}_1 - \omega_1 \hat{\mathbf{e}}_2 \tag{13.124}$$
and, for a symmetric rotor,
$$\mathbf{L} \cdot \boldsymbol{\omega} \times \hat{\mathbf{e}}_3 = I_1 \omega_1 \omega_2 - I_2 \omega_1 \omega_2 = 0 \tag{13.125}$$
since $I_1 = I_2$ for the symmetric rotor. Because $\mathbf{L} \cdot \boldsymbol{\omega} \times \hat{\mathbf{e}}_3 = 0$ for a symmetric top then L, ω and ê3 are
coplanar.
Figure 13.5 shows the geometry of the motion for both oblate and prolate axially-deformed bodies. To an observer in the space-fixed inertial frame, the angular velocity $\boldsymbol{\omega}$ traces out a cone, called the space cone, that precesses with angular velocity $\Omega$ around the space-fixed $\mathbf{L}$ axis. For convenience, figure 13.5 assumes that $\mathbf{L}$ and the space-fixed inertial frame $\hat{z}$ axis are colinear. The angular velocity $\boldsymbol{\omega}$ also traces out the body cone as it precesses about the body-fixed $\hat{e}_3$ axis. Since $\mathbf{L}$, $\boldsymbol{\omega}$ and $\hat{e}_3$ are coplanar, the $\boldsymbol{\omega}$ vector lies along the line where the space and body cones touch as the body cone rolls around the space cone. That is, the space and body cones have one generatrix in common which coincides with $\boldsymbol{\omega}$. As shown in figure 13.5, for
a needle the body cone appears to roll without slipping on the outside of the space cone at the precessional
velocity of Ω = − By contrast, as shown in figure 135 for an oblate (disc-shaped) symmetric top the
space cone rolls inside the body cone and the precession Ω is faster than .
Since no external torques are acting for torque-free motion, the magnitude and direction of the total angular momentum are conserved. The description of the motion is simplified if $\mathbf{L}$ is taken to be along the space-fixed $\hat{z}$ axis; then the Euler angle $\theta$ is the angle between the body-fixed basis vector $\hat{e}_3$ and the space-fixed basis vector $\hat{z}$. If at some instant in the body frame, it is assumed that $\hat{e}_2$ is aligned in the plane of $\mathbf{L}$, $\boldsymbol{\omega}$ and $\hat{e}_3$, then
$$L_1 = 0 \qquad L_2 = L\sin\theta \qquad L_3 = L\cos\theta \tag{13.126}$$
If $\alpha$ is the angle between the angular velocity $\boldsymbol{\omega}$ and the body-fixed $\hat{e}_3$ axis, then at the same instant
$$\omega_1 = 0 \qquad \omega_2 = \omega\sin\alpha \qquad \omega_3 = \omega\cos\alpha \tag{13.127}$$
Figure 13.5: Torque-free rotation of symmetric tops; (a) circular flat disk, (b) circular rod. The space-fixed and body-fixed cones are shown by fine lines. The space-fixed axis system is designated by the unit vectors $(\hat{x}, \hat{y}, \hat{z})$ and the body-fixed principal axis system by unit vectors $(\hat{1}, \hat{2}, \hat{3})$.
The components of the angular momentum also can be derived from $\mathbf{L} = \mathbf{I}\cdot\boldsymbol{\omega}$ to give
$$L_1 = I_1\omega_1 = 0 \qquad L_2 = I_2\omega_2 = I_1\omega\sin\alpha \qquad L_3 = I_3\omega_3 = I_3\omega\cos\alpha \tag{13.128}$$
Equations 13.126 and 13.128 give two relations for the ratio $\frac{L_2}{L_3}$, that is,
$$\frac{L_2}{L_3} = \tan\theta = \frac{I_1}{I_3}\tan\alpha \tag{13.129}$$
For a prolate spheroid $I_1 > I_3$, therefore $\theta > \alpha$ while $\Omega$ and $\omega_3$ have opposite signs.
For an oblate spheroid $I_1 < I_3$, therefore $\theta < \alpha$ while $\Omega$ and $\omega_3$ have the same sign.
The sense of precession can be understood if the body cone rolls without slipping on the outside of the space cone with $\Omega$ in the opposite orientation to $\omega$ for the prolate case, while for the oblate case the space cone rolls inside the body cone with $\Omega$ and $\omega$ oriented in similar directions. Note from (13.129) that $\theta = 0$ if $\alpha = 0$, that is, $\mathbf{L}$, $\boldsymbol{\omega}$ and the 3 axis are aligned, corresponding to a principal axis. Similarly, $\theta = 90^\circ$ if $\alpha = 90^\circ$; then again $\mathbf{L}$ and $\boldsymbol{\omega}$ are aligned, corresponding to a principal axis.
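The body-frame results above can be checked numerically. The following is a minimal sketch, not part of the original text: it integrates the torque-free Euler equations with a hand-written fourth-order Runge-Kutta step and compares the result with the harmonic solution 13.119. The moments of inertia, the amplitude `A`, and the spin `w3` are illustrative values chosen here, and the function names are hypothetical.

```python
import math

def euler_rhs(w, I):
    """Torque-free Euler equations in the principal-axis body frame."""
    w1, w2, w3 = w
    I1, I2, I3 = I
    return (
        (I2 - I3) * w2 * w3 / I1,
        (I3 - I1) * w3 * w1 / I2,
        (I1 - I2) * w1 * w2 / I3,
    )

def rk4_step(w, I, dt):
    """Advance the body-frame angular velocity by one Runge-Kutta step."""
    def shifted(k, h):
        return tuple(wi + h * ki for wi, ki in zip(w, k))
    k1 = euler_rhs(w, I)
    k2 = euler_rhs(shifted(k1, dt / 2), I)
    k3 = euler_rhs(shifted(k2, dt / 2), I)
    k4 = euler_rhs(shifted(k3, dt), I)
    return tuple(
        wi + dt * (a + 2 * b + 2 * c + d) / 6
        for wi, a, b, c, d in zip(w, k1, k2, k3, k4)
    )

# Illustrative oblate symmetric top (I1 = I2 < I3); the numbers are assumptions.
I = (1.0, 1.0, 2.0)
A, w3 = 0.3, 5.0
Omega = (I[2] - I[0]) / I[0] * w3      # body-frame precession, equation 13.116

period = 2 * math.pi / abs(Omega)
steps = 2000
dt = period / steps
w = (A, 0.0, w3)                        # omega_1 = A cos(0), omega_2 = A sin(0)
radius_sq = []
for _ in range(steps):
    w = rk4_step(w, I, dt)
    radius_sq.append(w[0] ** 2 + w[1] ** 2)   # should stay equal to A**2 (13.120)

print("Omega =", Omega)
print("omega after one precession period:", w)
```

After one period $2\pi/\Omega$ the vector $(\omega_1, \omega_2)$ should return to its starting point while $\omega_1^2 + \omega_2^2$ and $\omega_3$ stay fixed, which is the content of equations 13.119 to 13.121.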
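Lagrangian mechanics has been used to calculate the motion with respect to the body-fixed principal axis system. However, the motion needs to be known relative to the space-fixed inertial frame where the motion is observed. This transformation can be done using the following relation
$$\left(\frac{d\hat{e}_3}{dt}\right)_{space} = \left(\frac{d\hat{e}_3}{dt}\right)_{body} + \boldsymbol{\omega}\times\hat{e}_3 = \boldsymbol{\omega}\times\hat{e}_3 \tag{13.130}$$
since the unit vector $\hat{e}_3$ is stationary in the body-fixed frame. The vector product of $\hat{e}_3$ with $\boldsymbol{\omega}\times\hat{e}_3$ gives
$$\hat{e}_3\times\left(\frac{d\hat{e}_3}{dt}\right)_{space} = \hat{e}_3\times\left(\boldsymbol{\omega}\times\hat{e}_3\right) = (\hat{e}_3\cdot\hat{e}_3)\,\boldsymbol{\omega} - (\hat{e}_3\cdot\boldsymbol{\omega})\,\hat{e}_3 = \boldsymbol{\omega} - \omega_3\hat{e}_3$$
therefore
$$\boldsymbol{\omega} = \hat{e}_3\times\left(\frac{d\hat{e}_3}{dt}\right)_{space} + \omega_3\hat{e}_3 \tag{13.131}$$
The angular momentum equals $\mathbf{L} = \{\mathbf{I}\}\cdot\boldsymbol{\omega}$. Since $\hat{e}_3\times\left(\frac{d\hat{e}_3}{dt}\right)_{space}$ is perpendicular to the $\hat{e}_3$ axis, then for the case with $I_1 = I_2$,
$$\mathbf{L} = I_1\,\hat{e}_3\times\left(\frac{d\hat{e}_3}{dt}\right)_{space} + I_3\omega_3\hat{e}_3 \tag{13.132}$$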
Thus the angular momentum for a torque-free symmetric rigid rotor comprises two components, one being the perpendicular component that precesses around $\hat{e}_3$, and the other being $L_3$.
In the space-fixed frame assume that the $\hat{z}$ axis is colinear with $\mathbf{L}$. Then taking the scalar product of $\hat{e}_3$ and $\mathbf{L}$, using equation 13.126, gives
$$L_3 = \hat{e}_3\cdot\mathbf{L} = I_1\,\hat{e}_3\cdot\hat{e}_3\times\left(\frac{d\hat{e}_3}{dt}\right)_{space} + I_3\omega_3\,\hat{e}_3\cdot\hat{e}_3 \tag{13.133}$$
The first term on the right is zero and thus equations 13.133 and 13.126 give
$$L_3 = I_3\omega_3 = L\cos\theta \tag{13.134}$$
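The time dependence of the rotation of the body-fixed symmetry axis with respect to the space-fixed axis system can be obtained by taking the vector product $\hat{e}_3\times\mathbf{L}$ using equation 13.132 and using equation 24 to expand the triple vector product,
$$\hat{e}_3\times\mathbf{L} = I_1\,\hat{e}_3\times\left(\hat{e}_3\times\left(\frac{d\hat{e}_3}{dt}\right)_{space}\right) + I_3\omega_3\,\hat{e}_3\times\hat{e}_3 \tag{13.135}$$
$$= I_1\left[\left(\hat{e}_3\cdot\left(\frac{d\hat{e}_3}{dt}\right)_{space}\right)\hat{e}_3 - (\hat{e}_3\cdot\hat{e}_3)\left(\frac{d\hat{e}_3}{dt}\right)_{space}\right] + 0$$
since $(\hat{e}_3\times\hat{e}_3) = 0$. Moreover $(\hat{e}_3\cdot\hat{e}_3) = 1$, and $\hat{e}_3\cdot\left(\frac{d\hat{e}_3}{dt}\right)_{space} = 0$ since they are perpendicular, then
$$\left(\frac{d\hat{e}_3}{dt}\right)_{space} = \frac{\mathbf{L}}{I_1}\times\hat{e}_3 \tag{13.136}$$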
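This equation shows that the body-fixed symmetry axis $\hat{e}_3$ precesses around $\mathbf{L}$, where $\mathbf{L}$ is a constant of motion for torque-free rotation. The true rotational angular velocity $\boldsymbol{\omega}$ in the space-fixed frame, given by equation 13.131, can be evaluated using equation 13.136. Remembering that it was assumed that $\mathbf{L}$ is in the $\hat{z}$ direction, that is, $\mathbf{L} = L\hat{z}$, then
$$\boldsymbol{\omega} = \hat{e}_3\times\left(\frac{d\hat{e}_3}{dt}\right)_{space} + \omega_3\hat{e}_3$$
$$= \frac{L}{I_1}\,\hat{e}_3\times(\hat{z}\times\hat{e}_3) + \frac{L\cos\theta}{I_3}\,\hat{e}_3$$
$$= \frac{L}{I_1}\hat{z} + L\left(\frac{I_1 - I_3}{I_1 I_3}\right)\cos\theta\,\hat{e}_3 \tag{13.137}$$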
That is, the symmetry axis of the axially-symmetric rigid rotor makes an angle $\theta$ to the angular momentum vector $L\hat{z}$ and precesses around $\hat{z}$ with a constant angular velocity $\frac{L}{I_1}$ while the axial spin of the rigid body has a constant value $\omega_3 = \frac{L\cos\theta}{I_3}$. Thus, in the precessing frame, the rigid body appears to rotate about its fixed symmetry axis with a constant angular velocity $\frac{L\cos\theta}{I_3} - \frac{L\cos\theta}{I_1} = L\cos\theta\left(\frac{I_1 - I_3}{I_1 I_3}\right)$. The precession of the symmetry axis looks like a wobble superimposed on the spinning motion about the body-fixed symmetry axis. The angular precession rate in the space-fixed frame can be deduced by using the fact that
which gives the precession rate about the space-fixed axis in terms of the angular velocity $\omega$. Note that the precession rate $\dot{\phi} > \omega$ if $\frac{I_3}{I_1} > 1$, that is, for oblate shapes, and $\dot{\phi} < \omega$ if $\frac{I_3}{I_1} < 1$, that is, for prolate shapes.
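As a hedged illustration of the oblate and prolate cases, the sketch below evaluates the space-frame precession rate $L/I_1$ from equation 13.137, with $L$ assembled from the body-frame components in equation 13.128. The function names and the numerical values are assumptions made for this example; they do not come from the text.

```python
import math

def precession_rate(omega, alpha, I1, I3):
    """Space-frame precession rate L / I1 of the symmetry axis (equation 13.137).

    L is built from the body-frame components of equation 13.128:
    L2 = I1 * omega * sin(alpha) and L3 = I3 * omega * cos(alpha).
    """
    L = math.hypot(I1 * omega * math.sin(alpha), I3 * omega * math.cos(alpha))
    return L / I1

def tilt_of_symmetry_axis(alpha, I1, I3):
    """Angle theta between L and the symmetry axis, from tan(theta) = (I1/I3) tan(alpha)."""
    return math.atan2(I1 * math.sin(alpha), I3 * math.cos(alpha))

omega, alpha = 1.0, 0.01   # illustrative: unit angular speed, small angle to the symmetry axis
shapes = [("sphere", 1.0, 1.0), ("thin disk", 1.0, 2.0), ("near-needle", 1.0, 1.0e-6)]
rates = {name: precession_rate(omega, alpha, I1, I3) for name, I1, I3 in shapes}
tilts = {name: tilt_of_symmetry_axis(alpha, I1, I3) for name, I1, I3 in shapes}

for name in rates:
    print(name, "precession/omega =", rates[name], "theta =", tilts[name])
```

For small $\alpha$ the ratio of precession rate to $\omega$ approaches $I_3/I_1$, so it exceeds one for oblate bodies and falls below one for prolate bodies, in line with the statement above.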
13.20. TORQUE-FREE ROTATION OF AN INERTIALLY-SYMMETRIC RIGID ROTOR 341
Since $p_\phi$ and $p_\psi$ are constants of motion, the precessional angular velocity $\dot{\phi}$ about the space-fixed $\hat{z}$ axis, and the spin angular velocity $\dot{\psi}$, which is the spin frequency about the body-fixed $\hat{3}$ axis, are constants that depend directly on $I_1$, $I_3$ and $\theta$.
There is one additional constant of motion available if no dissipative forces act on the system, that is, energy conservation which implies that the total energy
$$E = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 \tag{13.153}$$
will be a constant of motion. But the second term on the right-hand side also is a constant of motion since $p_\psi$ and $I_3$ both are constants, that is
$$\frac{1}{2}I_3\omega_3^2 = \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 = \frac{p_\psi^2}{2I_3} = \text{constant} \tag{13.154}$$
Thus energy conservation implies that the first term on the right-hand side also must be a constant given by
$$\frac{1}{2}I_1\left(\omega_1^2 + \omega_2^2\right) = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) = E - \frac{p_\psi^2}{2I_3} = \text{constant} \tag{13.155}$$
These results are identical to those given in equations 13.120 and 13.121 which were derived using Euler's equations. These results illustrate that the underlying physics of the torque-free rigid rotor is more easily extracted using Lagrangian mechanics rather than using the Euler-angle approach of Newtonian mechanics.
13.9 Example: Precession rate for torque-free rotating symmetric rigid rotor
Table 13.2 lists the precession and spin angular velocities, in the space-fixed frame, for torque-free rotation of three extreme symmetric-top geometries spinning with constant angular momentum when the motion is slightly perturbed such that $\boldsymbol{\omega}$ is at a small angle $\alpha$ to the symmetry axis. Note that this assumes the perpendicular axis theorem, equation 13.45, which states that for a thin lamina $I_1 + I_2 = I_3$ giving, for a thin circular disk, $I_1 = I_2$ and thus $I_3 = 2I_1$.

Table 13.2: Precession and spin rates for torque-free axial rotation of symmetric rigid rotors

Rigid-body symmetric shape    Principal moment ratio $I_3/I_1$    Precession rate $\dot{\phi}$    Spin rate $\dot{\psi}$
Symmetric needle              0                                   0                               $\omega$
Sphere                        1                                   $\omega$                        0
Thin circular disk            2                                   $2\omega$                       $-\omega$
The precession angular velocity in the space frame ranges between 0 and $2\omega$ depending on whether the body-fixed spin angular velocity is aligned or anti-aligned with the rotational frequency $\omega$. For an extreme prolate spheroid $\frac{I_3}{I_1} = 0$, the body-fixed spin angular velocity $\Omega = -\omega_3$ cancels the angular velocity $\omega$ of the rotating frame, resulting in a zero precession angular velocity of the body-fixed $\hat{e}_3$ axis around the space-fixed frame. The spin $\Omega = 0$ in the body-fixed frame for the rigid sphere $\frac{I_3}{I_1} = 1$ and thus the precession rate of the body-fixed $\hat{e}_3$ axis of the sphere around the space-fixed frame equals $\omega$. For oblate spheroids and thin disks, such as a frisbee, $\frac{I_3}{I_1} = 2$ making the body-fixed precession angular velocity $\Omega = +\omega$ which adds to the angular velocity $\omega$ and increases the precession rate up to $2\omega$ as seen in the space-fixed frame. This illustrates that the spin angular velocity can add constructively or destructively with the angular velocity $\omega$.²
² In his autobiography Surely You're Joking Mr Feynman, Feynman wrote "I was in the [Cornell] cafeteria and some guy, fooling around, throws a plate in the air. As the plate went up in the air I saw it wobble, and noticed that the red medallion of Cornell on the plate going around. It was pretty obvious to me that the medallion went around faster than the wobbling. I started to figure out the motion of the rotating plate. I discovered that when the angle is very slight, the medallion rotates twice as fast as the wobble rate. It came out of a very complicated equation!" The quoted ratio (2 : 1) is incorrect; it should be (1 : 2). Benjamin Chao in Physics Today of February 1989 speculated that Feynman's error in inverting the factor of two might be "in keeping with the spirit of the author and the book, another practical joke meant for those who do physics without experimenting". He pointed out that this story occurred on page 157 of a book of length 314 pages (1:2). Observe the
dependence of the ratio of wobble to rotation angular velocities on the tilt angle .
13.21. TORQUE-FREE ROTATION OF AN ASYMMETRIC RIGID ROTOR 343
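$$I_1\dot{\omega}_1 = (I_2 - I_3)\,\omega_2\omega_3 \tag{13.156}$$
$$I_2\dot{\omega}_2 = (I_3 - I_1)\,\omega_3\omega_1$$
$$I_3\dot{\omega}_3 = (I_1 - I_2)\,\omega_1\omega_2$$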
Since $L_i = I_i\omega_i$ for $i = 1, 2, 3$, then equation 13.156 gives
The bracket is equivalent to $\frac{d}{dt}\left(L_1^2 + L_2^2 + L_3^2\right) = 0$ which implies that the total rotational angular momentum is a constant of motion as expected for this torque-free system, even though the individual components $L_1, L_2, L_3$ may vary. That is
$$L_1^2 + L_2^2 + L_3^2 = L^2 \tag{13.159}$$
Figure 13.6: Rotation of an asymmetric rigid rotor. The dark lines correspond to contours of constant total rotational kinetic energy T, which has an ellipsoidal shape, projected onto the angular momentum L sphere in the body-fixed frame.
Note that equation 13.159 is the equation of a sphere of radius $L$.
Multiplying the first equation of 13.157 by $L_1$, the second by $L_2$, and the third by $L_3$, and summing gives
Thus, for a given value of $L$, when $T = T_{min} = \frac{L^2}{2I_3}$ the orientation of $\mathbf{L}$ in the body-fixed frame is either $(0, 0, +L)$ or $(0, 0, -L)$, that is, aligned with the $\hat{e}_3$ axis along which the principal moment of inertia is largest. For slightly higher kinetic energy the trajectory of $\mathbf{L}$ follows closed paths precessing around $\hat{e}_3$. When the kinetic energy $T = \frac{L^2}{2I_2}$ the angular momentum vector $\mathbf{L}$ follows either of the two thin-line trajectories, each of which is a separatrix. These do not have closed orbits around $\hat{e}_2$ and they separate the closed solutions around either $\hat{e}_3$ or $\hat{e}_1$. For higher kinetic energy the precessing angular momentum vector follows closed trajectories around $\hat{e}_1$ and becomes fully aligned with $\hat{e}_1$ at the upper-bound kinetic energy.
Note that for the special case when $I_3 > I_2 = I_1$ the asymmetric rigid rotor equals the symmetric rigid rotor for which Euler's equations were solved exactly in chapter 13.19. For the symmetric rigid rotor the $T$-ellipsoid becomes a spheroid aligned with the symmetry axis and thus the intersections with the $L$-sphere lead to circular paths around the $\hat{e}_3$ body-fixed principal axis, while the separatrix circles the equator corresponding to the $\hat{e}_3$ axis, separating clockwise and anticlockwise precession about $L_3$. This discussion shows that energy, plus angular momentum conservation, provide the general features of the solution for the torque-free symmetric top that are in agreement with those derived using Euler's equations of motion.
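The two conservation laws used in this discussion can be verified numerically. The sketch below is an illustration under assumed values, not the author's method: it writes the torque-free Euler equations 13.156 in terms of $L_i = I_i\omega_i$, which gives $d\mathbf{L}/dt = \mathbf{L}\times\boldsymbol{\omega}$ in the body frame, and integrates them with a simple Runge-Kutta step. The moments of inertia and the starting vector are arbitrary choices with $I_3 > I_2 > I_1$.

```python
def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dL_dt(L, I):
    """Body-frame rate of change of L for torque-free motion: dL/dt = L x omega."""
    omega = tuple(Li / Ii for Li, Ii in zip(L, I))
    return cross(L, omega)

def rk4_step(L, I, dt):
    """One fourth-order Runge-Kutta step for the body-frame angular momentum."""
    def shifted(k, h):
        return tuple(Li + h * ki for Li, ki in zip(L, k))
    k1 = dL_dt(L, I)
    k2 = dL_dt(shifted(k1, dt / 2), I)
    k3 = dL_dt(shifted(k2, dt / 2), I)
    k4 = dL_dt(shifted(k3, dt), I)
    return tuple(Li + dt * (a + 2 * b + 2 * c + d) / 6
                 for Li, a, b, c, d in zip(L, k1, k2, k3, k4))

def L_squared(L):
    return sum(Li * Li for Li in L)

def kinetic_energy(L, I):
    return sum(Li * Li / (2 * Ii) for Li, Ii in zip(L, I))

I = (1.0, 2.0, 3.0)      # assumed principal moments, I3 > I2 > I1
L = (0.1, 1.0, 0.1)      # start close to the intermediate axis
L2_start, T_start = L_squared(L), kinetic_energy(L, I)
for _ in range(20000):
    L = rk4_step(L, I, 1.0e-3)
L2_end, T_end = L_squared(L), kinetic_energy(L, I)

print("L^2:", L2_start, "->", L2_end)
print("T:  ", T_start, "->", T_end)
```

Both $L^2$ and $T$ should stay fixed to within the integration error, so the trajectory of $\mathbf{L}$ stays on the intersection of the $L$-sphere and the $T$-ellipsoid, and $T$ always lies between $L^2/2I_3$ and $L^2/2I_1$.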
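$$\boldsymbol{\omega} = \omega_1\hat{e}_1 \tag{13.163}$$
Consider that a small perturbation is applied causing the angular velocity vector to be
$$\boldsymbol{\omega} = \omega_1\hat{e}_1 + \lambda\hat{e}_2 + \mu\hat{e}_3 \tag{13.164}$$
Inserting this angular velocity into the torque-free Euler equations 13.156 gives
$$(I_2 - I_3)\,\lambda\mu - I_1\dot{\omega}_1 = 0$$
$$(I_3 - I_1)\,\mu\omega_1 - I_2\dot{\lambda} = 0$$
$$(I_1 - I_2)\,\lambda\omega_1 - I_3\dot{\mu} = 0$$
Assuming that the product $\lambda\mu$ in the first equation is negligible, then $\dot{\omega}_1 = 0$, that is, $\omega_1$ is constant. The other two equations can be solved to give
$$\dot{\lambda} = \left(\frac{(I_3 - I_1)}{I_2}\omega_1\right)\mu \tag{13.165}$$
$$\dot{\mu} = \left(\frac{(I_1 - I_2)}{I_3}\omega_1\right)\lambda \tag{13.166}$$
Take the time derivative of the first equation
$$\ddot{\lambda} = \left(\frac{(I_3 - I_1)}{I_2}\omega_1\right)\dot{\mu} \tag{13.167}$$
and substitute for $\dot{\mu}$, which gives
$$\ddot{\lambda} + \left(\frac{(I_1 - I_3)(I_1 - I_2)}{I_2 I_3}\omega_1^2\right)\lambda = 0 \tag{13.168}$$
The solution of this equation is
$$\lambda(t) = Ae^{i\Omega_{1\lambda}t} + Be^{-i\Omega_{1\lambda}t} \tag{13.169}$$
where
$$\Omega_{1\lambda} = \omega_1\sqrt{\frac{(I_1 - I_3)(I_1 - I_2)}{I_2 I_3}} \tag{13.170}$$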
13.22. STABILITY OF TORQUE-FREE ROTATION OF AN ASYMMETRIC BODY 345
Note that since it was assumed that $I_3 > I_2 > I_1$, $\Omega_{1\lambda}$ is real. The solution for $\lambda(t)$ therefore represents a stable oscillatory motion with precession frequency $\Omega_{1\lambda}$. The identical result is obtained for $\mu(t)$, that is, $\Omega_{1\mu} = \Omega_{1\lambda} = \Omega_1$. Thus the motion corresponds to a stable minimum about the $\hat{e}_1$ axis with oscillations about the $\lambda = \mu = 0$ minimum with angular frequency
$$\Omega_1 = \omega_1\sqrt{\frac{(I_1 - I_3)(I_1 - I_2)}{I_2 I_3}} \tag{13.171}$$
Permuting the indices gives the precession frequencies for perturbations applied to rotation about either the 2 or 3 axes
$$\Omega_2 = \omega_2\sqrt{\frac{(I_2 - I_1)(I_2 - I_3)}{I_1 I_3}} \tag{13.172}$$
$$\Omega_3 = \omega_3\sqrt{\frac{(I_3 - I_2)(I_3 - I_1)}{I_1 I_2}} \tag{13.173}$$
Since $I_3 > I_2 > I_1$, $\Omega_1$ and $\Omega_3$ are real while $\Omega_2$ is imaginary. Thus, whereas rotation about either the 3 or the 1 axis is stable, the imaginary solution about $\hat{e}_2$ corresponds to a perturbation increasing with time. Thus, only rotation about the axes with the largest or smallest moments of inertia is stable. Moreover for the symmetric rigid rotor, with $I_1 = I_2 \neq I_3$, stability exists only about the symmetry axis $\hat{e}_3$, independent of whether the body is prolate or oblate. This result was implied from the discussion of energy and angular momentum conservation in chapter 13.20. Friction was not included in the above discussion. In the presence of dissipative forces, such as friction or drag, only rotation about the principal axis corresponding to the maximum moment of inertia is stable.
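The stability criterion can be made concrete with a short calculation. The sketch below evaluates equations 13.171 to 13.173 using complex square roots, so that an imaginary frequency signals an exponentially growing perturbation. The function names and the sample moments of inertia are assumptions introduced for illustration.

```python
import cmath

def precession_frequencies(I, w):
    """Small-perturbation frequencies for rotation about each principal axis.

    I = (I1, I2, I3) are the principal moments and w = (w1, w2, w3) the spin
    rates about the corresponding axes (equations 13.171 to 13.173).
    """
    I1, I2, I3 = I
    w1, w2, w3 = w
    return (
        w1 * cmath.sqrt((I1 - I3) * (I1 - I2) / (I2 * I3)),
        w2 * cmath.sqrt((I2 - I1) * (I2 - I3) / (I1 * I3)),
        w3 * cmath.sqrt((I3 - I2) * (I3 - I1) / (I1 * I2)),
    )

def is_stable(frequency):
    """A real frequency means bounded oscillation about the axis."""
    return frequency.imag == 0.0

I = (1.0, 2.0, 3.0)                    # assumed values with I3 > I2 > I1
freqs = precession_frequencies(I, (1.0, 1.0, 1.0))
for axis, f in zip((1, 2, 3), freqs):
    print("axis", axis, "Omega =", f, "stable" if is_stable(f) else "unstable")
```

With $I_3 > I_2 > I_1$ only the frequency for the intermediate axis comes out imaginary, which reproduces the conclusion that rotation about the axes of largest and smallest moment of inertia is stable in the absence of dissipation.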
Stability of rigid-body rotation has broad applications to rotation of satellites, molecules and nuclei. The first U.S. satellite, Explorer 1, was launched in 1958 with the rotation axis aligned with the cylindrical axis, which had the minimum principal moment of inertia. After a few hours the satellite started tumbling with increasing amplitude due to a flexible antenna dissipating and transferring energy to the perpendicular axis which had the largest moment of inertia. Torque-free motion of a deformed rigid body is a ubiquitous phenomenon in many branches of science, engineering, and sports as illustrated by the following examples.
The imaginary precession frequency $\Omega_1$ about the 1 axis implies unstable rotation leading to tumbling whereas the minimum moment $I_{22}$ and maximum moment $I_{33}$ imply stable rotation about the 2 and 3 axes. This rotational behavior is easily demonstrated by throwing a tennis racquet and is called the tennis racquet theorem. The center of percussion, example 2.14, is another important inertial property of a tennis racquet.
and $P_{\lambda\mu}(\cos\theta)$ is an associated Legendre function of $\cos\theta$. Spherical harmonics are the angular portion of a set of solutions to Laplace's equation. Represented in a system of spherical coordinates, Laplace's spherical harmonics $Y_{\lambda\mu}(\theta, \phi)$ are a specific set of spherical harmonics that form an orthogonal system. Spherical harmonics are important in many theoretical and practical applications.
In the principal axis frame of the body, there are three non-zero quadrupole deformation parameters which can be written in terms of the deformation parameters $\beta, \gamma$ where $\alpha_{20} = \beta\cos\gamma$, $\alpha_{21} = \alpha_{2-1} = 0$ and $\alpha_{22} = \alpha_{2-2} = \frac{1}{\sqrt{2}}\beta\sin\gamma$. Using these in equations () gives the three semi-axis dimensions in the principal axis frame (primed frame),
$$\delta R_k = \sqrt{\frac{5}{4\pi}}R_0\beta\cos\left(\gamma - \frac{2\pi k}{3}\right)$$
Note that for $\gamma = 0$, $\delta R_1 = \delta R_2 = -\frac{1}{2}\sqrt{\frac{5}{4\pi}}R_0\beta$ while $\delta R_3 = +\sqrt{\frac{5}{4\pi}}R_0\beta$, that is, the body has prolate deformation with the symmetry axis along the 3 axis. The same prolate shape is obtained for $\gamma = \frac{2\pi}{3}$ and $\gamma = \frac{4\pi}{3}$ with the prolate symmetry axes along the 1 and 2 axes respectively. For $\gamma = \frac{\pi}{3}$, $\delta R_1 = \delta R_3 = +\frac{1}{2}\sqrt{\frac{5}{4\pi}}R_0\beta$ while $\delta R_2 = -\sqrt{\frac{5}{4\pi}}R_0\beta$, that is, the body has oblate deformation with the symmetry axis along the 2 axis. The same oblate shape is obtained for $\gamma = \pi$ and $\gamma = \frac{5\pi}{3}$ with the oblate symmetry axes along the 3 and 1 axes respectively. For other values of $\gamma$ the shape is ellipsoidal.
For the asymmetric deformed rigid body, the rotational Hamiltonian can be expressed in the form[Dav58]
3
X ||2
=
=1
4 2 sin2 ( 0 − 2
3 )
where the rotational angular momentum is $\mathbf{R}$. The principal moments of inertia are related by the triaxiality
parameter $\gamma_0$ which they assumed is identical to the shape parameter $\gamma$. For axial symmetry the moment of
inertia about the symmetry axis is taken to be zero for a quantal system since rotation of the potential well
about the symmetry axis corresponds to no change in the potential well, or corresponding rotation of the bound
nucleons. That is, the nucleus is not a rigid body, the nucleons only rotate to the extent that the ellipsoidal
potential well is cranked around such that the nucleons must follow the rotation of the potential well. In
addition, vibrational modes coexist about the average asymmetric deformation, plus octupole deformation
often coexists with the above quadrupole deformed modes.
13.23. SYMMETRIC RIGID ROTOR SUBJECT TO TORQUE ABOUT A FIXED POINT 347
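$$E = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 + Mgh\cos\theta \tag{13.183}$$
will be a constant of motion. But the middle term on the right-hand side also is a constant of motion
$$\frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 = \frac{p_\psi^2}{2I_3} = \frac{L_3^2}{2I_3} = \text{constant} \tag{13.184}$$
The effective potential $V(\theta)$ is shown in figure 13.8. It is clear that the motion of a symmetric top with effective energy $E'$ is confined to angles $\theta_1 < \theta < \theta_2$.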
Note that the above result also is obtained if the Routhian is used, rather than the Lagrangian, as mentioned in chapter 8.7, and defined by equation (8.65). That is, the Routhian can be written as
The Routhian $R(\theta, \dot{\theta}, p_\phi, p_\psi)$ acts like a Hamiltonian for the $(\phi, p_\phi)$ and $(\psi, p_\psi)$ variables which are constants of motion, and thus are ignorable variables. The Routhian acts as the negative Lagrangian for the remaining variable $\theta$ with rotational kinetic energy $\frac{1}{2}I_1\dot{\theta}^2$ and effective potential energy
$$V_{eff} = \frac{(p_\phi - p_\psi\cos\theta)^2}{2I_1\sin^2\theta} + \frac{p_\psi^2}{2I_3} + Mgh\cos\theta = V(\theta) + \frac{p_\psi^2}{2I_3}$$
The equation of motion describing the system in the rotating frame is given by one Lagrange equation
$$\frac{d}{dt}\left(\frac{\partial R}{\partial\dot{\theta}}\right) - \frac{\partial R}{\partial\theta} = 0$$
The negative sign of the Routhian cancels out when used in the Lagrange equation. Thus, in the rotating frame of reference, the system is reduced to a single degree of freedom, the nutation angle $\theta$, with effective energy $E'$ given by equations 13.186 to 13.188.
Figure 13.9: Nutational motion of the body-fixed symmetry axis projected onto the space-fixed unit sphere. The three cases are (a) $\dot{\phi}$ never vanishes, (b) $\dot{\phi} = 0$ at $\theta = \theta_2$, (c) $\dot{\phi}$ changes sign between $\theta_1$ and $\theta_2$.
The motion of the symmetric top is simplest at the minimum value of the effective potential curve, where $E' = V_{min}$, at which the nutation $\theta$ is restricted to a single value $\theta = \theta_0$. The motion is a steady precession at a fixed angle of inclination, that is, the "sleeping top". Solving for $\left(\frac{\partial V}{\partial\theta}\right)_{\theta=\theta_0} = 0$ gives that
$$p_\phi - p_\psi\cos\theta_0 = \frac{p_\psi\sin^2\theta_0}{2\cos\theta_0}\left[1 \pm \sqrt{1 - \frac{4MghI_1\cos\theta_0}{p_\psi^2}}\right] \tag{13.190}$$
If $\theta_0 < \frac{\pi}{2}$ then to ensure that the solution is real requires a minimum value of the angular momentum on the body-fixed axis of $p_\psi^2 \geq 4MghI_1\cos\theta_0$. If $\theta_0 > \frac{\pi}{2}$ then there is no minimum angular momentum projection on the body-fixed axis. There are two possible solutions to the quadratic relation corresponding to either a slow or fast precessional frequency. Usually the slow precession is observed.
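The slow and fast branches of equation 13.190 can be evaluated directly. The sketch below is illustrative only: it assumes the standard relation $p_\phi = I_1\dot{\phi}\sin^2\theta + p_\psi\cos\theta$, which follows from the kinetic-energy terms of equation 13.183, to convert $p_\phi - p_\psi\cos\theta_0$ into a precession rate $\dot{\phi}$. The function name and the numerical inputs are hypothetical, and the combined torque factor $Mgh$ is passed as a single number.

```python
import math

def steady_precession_rates(p_psi, I1, Mgh, theta0):
    """Return (slow, fast) steady precession rates, or None if no real solution exists.

    Uses equation 13.190 together with the assumed relation
    phi_dot = (p_phi - p_psi * cos(theta0)) / (I1 * sin(theta0)**2).
    Requires cos(theta0) != 0 and sin(theta0) != 0.
    """
    discriminant = 1.0 - 4.0 * Mgh * I1 * math.cos(theta0) / p_psi ** 2
    if discriminant < 0.0:
        return None                      # spin too small for steady precession
    root = math.sqrt(discriminant)
    prefactor = p_psi / (2.0 * I1 * math.cos(theta0))
    return prefactor * (1.0 - root), prefactor * (1.0 + root)

# Illustrative values in arbitrary consistent units.
p_psi, I1, Mgh, theta0 = 10.0, 1.0, 1.0, math.pi / 3
slow, fast = steady_precession_rates(p_psi, I1, Mgh, theta0)
print("slow precession rate:", slow)
print("fast precession rate:", fast)
print("too little spin:", steady_precession_rates(1.0, I1, Mgh, theta0))
```

For a rapidly spinning top the slow root is close to $Mgh/p_\psi$, the familiar gyroscope precession rate, while the fast root is close to $p_\psi/(I_1\cos\theta_0)$.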
For the general case, where $E' > V_{min}$, the nutation angle $\theta$ between the space-fixed and body-fixed 3 axes varies in the range $\theta_1 < \theta < \theta_2$. This axis exhibits a nodding variation which is called nutation. Figure 13.9 shows the projection of the body-fixed symmetry axis on the unit sphere in the space-fixed frame. Note that the observed nutation behavior depends on the relative sizes of $p_\phi$ and $p_\psi\cos\theta$. For certain values, the precession $\dot{\phi}$ changes sign between the two limiting values of $\theta$, producing a looping motion as shown in figure 13.9. Another condition is where the precession is zero for $\theta_2$, producing a cusp at $\theta_2$ as illustrated in figure 13.9. This behavior can be demonstrated using the gyroscope or the symmetric top.
6
10̇ 1 − 6 2 3 = sin sin (a)
6
10̇ 2 − 6 1 3 = sin cos (b)
4̇ 3 = 0 (c)
Equation () relates the spin about the 3 axis, the precession, and the angle to the vertical that is
The bracket must be positive to have stable sinusoidal oscillations. That is, the spin angular velocity
required for the jack to spin about a stable vertical axis is given by.
3Ω 3
+
2 2Ω
This example illustrates the conditions required for stable rotation of any axially-symmetric top.
of degrees of freedom from 5 to 2, namely which is the tilt angle, and 0 which is the orientation of the
tilt. This Routhian is a Lagrangian in two dimensions that was used to derive the equations of motion
via the Lagrange Euler equation
( )− =
̇
( )− = 0
̇0 0
where the 0 are generalized torques about the 2 angles that take into account the sliding frictional
forces. This sophisticated Routhian reduction approach provides an exhaustive and refined solution for the
Tippe Top and confirms that sliding friction plays a key role in the unusual behavior of the Tippe Top.
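$$\omega_1 = \dot{\theta} \tag{13.191}$$
$$\omega_2 = \dot{\phi}\sin\theta$$
$$\omega_3 = \dot{\phi}\cos\theta$$
The frame fixed in the rotating wheel must include the additional angular velocity of the disk $\dot{\psi}$ about the $\hat{e}_3$ axis, that is
$$\Omega_1 = \omega_1 = \dot{\theta} \tag{13.192}$$
$$\Omega_2 = \omega_2 = \dot{\phi}\sin\theta$$
$$\Omega_3 = \omega_3 + \dot{\psi} = \dot{\phi}\cos\theta + \dot{\psi}$$
where $\boldsymbol{\Omega}$ designates the angular velocity of the rotating disk, while $\boldsymbol{\omega}$ designates the rotation of the moving frame (1 2 3).
The principal moments of inertia of a thin circular disk are related by the perpendicular axis theorem (chapter 13.9)
$$I_1 + I_2 = I_3$$
Since $I_1 = I_2$ for a uniform disk, therefore $I_3 = 2I_1$.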
Equation 12.16 can be used to relate the vector forces $\mathbf{F}$ in the space-fixed frame to the rate of change of momenta in the moving frame (1 2 3)
This leads to the following relations for the three components in the moving frame
$$F_1 = \dot{p}_1 + \omega_2 p_3 - \omega_3 p_2 \tag{13.194}$$
$$F_2 - Mg\sin\theta = \dot{p}_2 + \omega_3 p_1 - \omega_1 p_3$$
$$F_3 - Mg\cos\theta = \dot{p}_3 + \omega_1 p_2 - \omega_2 p_1$$
Figure 13.10: Uniform disk rolling on a horizontal plane as viewed in the (a) fixed frame, and (b) rolling disk frame. The space-fixed axis system is (x y z), while the moving reference frame (1 2 3) is centered at the center of mass of the disk with the 1 2 axes in the plane of the disk. The disk is rotating with a uniform angular velocity $\dot{\psi}$ about the 3 axis and rolling in the direction that is at an angle $\phi$ relative to the $x$ axis.
Equations 13.199 are non-linear, and a closed-form solution is possible only for limited cases such as when $\theta = 90^\circ$.
Note that the above equations of motion also can be derived using Lagrangian mechanics knowing that
1 ¡ 2 ¢ 1 ¡ ¢ 1
= 1 + 22 + 32 + 1 Ω21 + Ω22 + 3 Ω23 − cos
2 2 2
The differential equations of constraint can be derived from equations 13197 to be
− cos = 0
− sin = 0
Generalized forces plus the Lagrange-Euler equations (6.45) can be used to derive the equations of motion and solve for the components of the constraint force $F_1$, $F_2$ and $F_3$.
addition to the gyroscopic effects. Excellent articles on this subject have been written by D.E.H. Jones Physics Today 23(4)
(1970) 34, and also by J. Lowell & H.D. McKell, American Journal of Physics 50 (1982) 1106.
The parallel-axis theorem relates the moment of inertia with respect to the pivot point and center of mass
The angular velocities of the center of mass, and about the center of mass, are identical since the pivot point
is fixed, that is
= =
Thus the angular momentum about the pivot point is given by the sum of the angular momenta
That is, the angular momentum is the sum of the angular momentum of the body about the center of mass,
plus the angular momentum of the center of mass about the pivot point. This is an example of Chasles
theorem.
The kinetic energy is given only by the rotational energy since the pivot point is stationary
1 1 1 1 1
= 2 = 2 2 + 2 = 2 + 2
2 2 2 2 2
That is, it equals the kinetic energy of rotation about the center of mass plus the instantaneous kinetic energy
for translation of the center of mass in agreement with Chasles theorem. Thus, for pivoting, the angular
momentum and kinetic energy are the same if evaluated using either center of mass coordinates or using the
pivot point as the reference point.
That is, the angular momentum only includes the angular momentum about the center of mass which is
smaller than the angular momentum for the same body pivoting about a point on the periphery of the cylinder.
The kinetic energy is given by
1 1 1 1
= 2 + 2 = 2 + 2
2 2 2 2
Thus the angular momentum is significantly smaller for rolling relative to pivoting of a given body, whereas
the kinetic energy is the same for both rolling or pivoting of a given body.
13.25. DYNAMIC BALANCING OF WHEELS 355
1 = 3 = 0
1
2 = − 2 sin cos 2
4
That is, the torque is in the ̂2 direction. Thus the forces on the bearings can be calculated since N = r × F,
thus
|2 | sin 2
| | = = 2 2
2 16
Estimate the size of these forces for the front wheel of your car travelling at 70 m.p.h. if the rotation axis is
displaced by 2◦ from the symmetry axis of the wheel.
Figure 13.11: Forward two-and-a-half somersaults with two twists demonstrates unequivocally that a diver
can initiate continuous twisting in midair. In the illustrated maneuver the diver does more than one full
somersault before he starts to twist. To maintain the twisting the diver does not have to move his legs.[Fro80]
13.27 Summary
This chapter has introduced the important topic of rigid-body rotation which has many applications in physics, engineering, sports, etc.
Inertia tensor The concept of the inertia tensor was introduced where the 9 components of the inertia tensor are given by
$$I_{ij} = \int\rho(\mathbf{r}')\left(\delta_{ij}\left(\sum_{k=1}^{3}x_k^2\right) - x_i x_j\right)dV \tag{13.14}$$
Steiner's parallel-axis theorem
$$J_{11} \equiv I_{11} + M\left(\left(a_1^2 + a_2^2 + a_3^2\right)\delta_{11} - a_1^2\right) = I_{11} + M\left(a_2^2 + a_3^2\right) \tag{13.43}$$
relates the inertia tensor about the center of mass to that about a parallel axis system not through the center of mass.
Diagonalization of the inertia tensor about any point was used to find the corresponding Principal axes
of the rigid body.
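Angular momentum The angular momentum $\mathbf{L}$ for rigid-body rotation is expressed in terms of the inertia tensor and angular frequency by
$$\mathbf{L} = \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix}\cdot\begin{pmatrix}\omega_1 \\ \omega_2 \\ \omega_3\end{pmatrix} = \{\mathbf{I}\}\cdot\boldsymbol{\omega} \tag{13.56}$$
Euler angles The Euler angles relate the space-fixed and body-fixed principal axes. The angular velocity $\boldsymbol{\omega}$ expressed in terms of the Euler angles has components for the angular velocity in the body-fixed axis system (1 2 3)
$$\omega_1 = \dot{\phi}_1 + \dot{\theta}_1 + \dot{\psi}_1 = \dot{\phi}\sin\theta\sin\psi + \dot{\theta}\cos\psi \tag{13.86}$$
$$\omega_2 = \dot{\phi}_2 + \dot{\theta}_2 + \dot{\psi}_2 = \dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi \tag{13.87}$$
$$\omega_3 = \dot{\phi}_3 + \dot{\theta}_3 + \dot{\psi}_3 = \dot{\phi}\cos\theta + \dot{\psi} \tag{13.88}$$
Similarly, the components of the angular velocity for the space-fixed axis system $(x, y, z)$ are
$$\omega_x = \dot{\theta}\cos\phi + \dot{\psi}\sin\theta\sin\phi \tag{13.89}$$
$$\omega_y = \dot{\theta}\sin\phi - \dot{\psi}\sin\theta\cos\phi \tag{13.90}$$
$$\omega_z = \dot{\phi} + \dot{\psi}\cos\theta \tag{13.91}$$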
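A quick consistency check on the two sets of components is that they must describe the same vector, so their magnitudes must agree for any Euler angles and rates. The sketch below codes equations 13.86 to 13.91 directly; the function names and the sample angles are assumptions made for this illustration.

```python
import math

def omega_body(phi, theta, psi, dphi, dtheta, dpsi):
    """Angular velocity components along the body-fixed axes (equations 13.86 to 13.88)."""
    return (
        dphi * math.sin(theta) * math.sin(psi) + dtheta * math.cos(psi),
        dphi * math.sin(theta) * math.cos(psi) - dtheta * math.sin(psi),
        dphi * math.cos(theta) + dpsi,
    )

def omega_space(phi, theta, psi, dphi, dtheta, dpsi):
    """Angular velocity components along the space-fixed axes (equations 13.89 to 13.91)."""
    return (
        dtheta * math.cos(phi) + dpsi * math.sin(theta) * math.sin(phi),
        dtheta * math.sin(phi) - dpsi * math.sin(theta) * math.cos(phi),
        dphi + dpsi * math.cos(theta),
    )

def norm(v):
    return math.sqrt(sum(x * x for x in v))

# Arbitrary illustrative angles (radians) and angular rates.
state = (0.4, 1.1, 2.3, 0.7, -0.2, 1.5)
body = omega_body(*state)
space = omega_space(*state)
print("|omega| from body components: ", norm(body))
print("|omega| from space components:", norm(space))
```

Both magnitudes reduce to $\sqrt{\dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2 + 2\dot{\phi}\dot{\psi}\cos\theta}$, which illustrates the rotational invariance of scalar quantities noted in the summary.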
Rotational invariants The powerful concept of the rotational invariance of scalar properties was intro-
duced. Important examples of rotational invariants are the Hamiltonian, Lagrangian, and Routhian.
Euler equations of motion for rigid-body motion The dynamics of rigid-body rotational motion was explored and the Euler equations of motion were derived using both Newtonian and Lagrangian mechanics.
$$N_1 = I_1\dot{\omega}_1 - (I_2 - I_3)\,\omega_2\omega_3 \tag{13.103}$$
$$N_2 = I_2\dot{\omega}_2 - (I_3 - I_1)\,\omega_3\omega_1$$
$$N_3 = I_3\dot{\omega}_3 - (I_1 - I_2)\,\omega_1\omega_2$$
Lagrange equations of motion for rigid-body motion The Euler equations of motion for rigid-body
motion, given in equation 13103 were derived using the Lagrange-Euler equations.
Torque-free motion of rigid bodies The Euler equations and Lagrangian mechanics were used to study
torque-free rotation of both symmetric and asymmetric bodies including discussion of the stability of torque-
free rotation.
Rotating symmetric body subject to a torque The complicated motion exhibited by a symmetric top,
that is spinning about one fixed point and subject to a torque, was introduced and solved using Lagrangian
mechanics.
The rolling wheel The non-holonomic motion of rolling wheels was introduced, as well as the importance of static and dynamic balancing of rotating machinery.
Rotation of deformable bodies The complicated non-holonomic motion involving rotation of deformable
bodies was introduced.
Workshop exercises
1. Three objects are described below. Break up into three groups, one group per object, and determine the inertia
tensor.
• A very thin sheet with a mass density = where is a positive constant. The sheet lies in the
plane and its sides are both of length .
• An inclined-plane shaped block of mass is oriented with one corner at the origin as shown.
• An equilateral triangle made up of three thin rods of length and uniform mass density .
(a) For the first object (the thin sheet), determine the principal moments of inertia.
(b) For the second object (the inclined plane), determine the principal axes.
(c) For the third object (the equilateral triangle), determine the products of inertia.
(a) Calculate the inertia tensor for a set of coordinates whose origin is at the center of mass of the shell.
(b) Now suppose that the shell is rolling without slipping toward a step of height , where . The shell
has a linear velocity . What is the angular momentum of the shell relative to the tip of the step?
(c) The shell now strikes the tip of the step inelastically (so that the point of contact sticks to the step,
but the shell can still rotate about the tip of the step). What is the angular momentum of the shell
immediately after contact?
(d) Finally, find the minimum velocity which enables the shell to surmount the step. Express your result in
terms of and .
5. The vectors ̂, ̂ , and ̂ constitute a set of orthogonal right-handed axes. The vectors ̂ + ̂ − 2̂ , −̂ + ̂ , and
̂ + ̂ + ̂ are also perpendicular to one another.
(a) Write out the set of direction cosines relating the new axes to the old.
(b) How are the Eulerian angles defined? Describe this transformation by a set of Eulerian angles.
6. A torsional pendulum consists of a vertical wire attached to a mass which can rotate about the vertical axis.
Consider three torsional pendula which consist of identical wires from which identical homogeneous solid cubes
are hung. One cube is hung from a corner, one from midway along an edge, and one from the middle of a face
as shown. What are the ratios of the periods of the three pendula?
7. A dumbbell comprises two equal point masses connected by a massless rigid rod of length 2 which is
constrained to rotate about an axle fixed to the center of the rod at an angle as shown in the figure. The
center of the rod is at the origin of the coordinates, the axle along the -axis, and the dumbbell lies in the
− plane at = 0. The angular velocity is a constant in time and is directed along the axis.
a) Calculate all elements of the inertia tensor. Be sure to specify the coordinate system used.
b) Using the calculated inertia tensor find the angular momentum of the dumbbell in the laboratory frame as
a function of time.
c) Using the equation $\mathbf{L} = \mathbf{r}\times\mathbf{p}$, calculate the angular momentum and show that it is equal to the answer of part (b).
d) Calculate the torque on the axle as a function of time.
e) Calculate the kinetic energy of the dumbbell.
8. A heavy symmetric top has a mass with the center of mass a distance from the fixed point about which
it spins and $I_1 = I_2 \neq I_3$. The top is precessing at a steady angular velocity Ω about the vertical space-fixed
axis. What is the minimum spin 0 about the body-fixed symmetry axis, that is, the 3 axis assuming that
the 3 axis is inclined at an angle = with respect to the vertical axis. Solve the problem at the instant
when the 3 1 axes all are in the same plane as shown in the figure.
13.27. SUMMARY 361
9. Consider an object with the center of mass at the origin and inertia tensor,
$$\mathbf{I} = \begin{pmatrix} 1/2 & -1/2 & 0 \\ -1/2 & 1/2 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
(a) Determine the principal moments of inertia and the principal axes. Guess the object.
(b) Determine the rotation matrix and compute † . Do the diagonal elements match with your results
from (a)? Note: columns of are eigenvectors of .
(c) Assume = (̂
+ ̂). Determine in the rotating coordinate system. Are and in the same
√
2
direction? What does this mean?
(d) Repeat (c) for = √
2
(̂ − ̂). What is different and why?
(e) For which case will there be a non-zero torque required?
(f) Determine the rotational kinetic energy for the case = √
2
(̂ − ̂)?
10. Consider a wheel (solid disk) of mass and radius . The wheel is subject to angular velocities = ̂
where ̂ is normal to the surface and = ̂ .
11. Determine the principal moments of inertia of an ellipsoid given by the equation,
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$
12. Determine the principal moments of inertia of a sphere of radius with a cavity of radius located from the
center of the sphere.
13. Three equal masses form the vertices of an equilateral triangle of side length $a$. The masses are located at
$\left(0, 0, \frac{a}{\sqrt{3}}\right)$, $\left(0, \frac{a}{2}, -\frac{a}{2\sqrt{3}}\right)$, and $\left(0, -\frac{a}{2}, -\frac{a}{2\sqrt{3}}\right)$, such that the center-of-mass is located at the origin.
Problems
1. Calculate the moments of inertia 1 2 3 for a homogeneous cone of mass whose height is and whose
base has a radius Choose the 3 -axis along the symmetry axis of the cone.
a) Choose the origin at the apex of the cone, and calculate the elements of the inertia tensor.
b) Make a transformation such that the center of mass of the cone is the origin and find the principal moments
of inertia.
2. Four masses, all of mass lie in the − plane at positions ( ) = ( 0) (− 0) (0 +2) (0 −2)
These are joined by massless rods to form a rigid body
(a) Find the inertial tensor, using the axes as a reference system. Exhibit the tensor as a matrix.
(b) Consider a direction given by the unit vector ̂ that lies equally between the positive axes; that is
it makes equal angles with these three directions. Find the moment of inertia for rotation about this ̂ axis.
(c) Given that at a certain time the angular velocity vector lies along the above direction ̂, find, for that
instant, the angle between the angular momentum vector and ̂
3. A homogeneous cube, each edge of which has a length $l$, initially is in a position of unstable equilibrium with
one edge of the cube in contact with a horizontal plane. The cube then is given a small displacement causing
it to tip over and fall. Show that the angular velocity of the cube when one face strikes the plane is given by
$$\omega^2 = A\,\frac{g}{l}\left(\sqrt{2} - 1\right)$$
where $A = \frac{3}{2}$ if the edge cannot slide on the plane, and where $A = \frac{12}{5}$ if sliding can occur without friction.
4. A symmetric body moves without the influence of forces or torques. Let 3 be the symmetry axis of the body
and be along 03 . The angle between and 3 is . Let and initially be in the 2 − 3 plane. What is
the angular velocity of the symmetry axis about in terms of 1 3 and ?
5. Consider a thin rectangular plate with dimensions by and mass Determine the torque necessary to
rotate the thin plate with angular velocity about a diagonal. Explain the physical behavior for the case when
= .
Chapter 14
14.1 Introduction
Chapter 3 discussed the behavior of a single linearly-damped linear oscillator subject to a harmonic force.
No account was taken of the influence of the single oscillator on the driver for the case of forced oscillations.
Many systems in nature comprise complicated free or forced oscillations of coupled-oscillator systems.
Examples of coupled oscillators include automobile suspension systems, electronic circuits, electromagnetic fields,
musical instruments, atoms bound in a crystal, neural circuits in the brain, and networks of pacemaker cells in
the heart. Energy can be transferred back and forth between coupled oscillators as the motion evolves.
It is possible to describe the motion of coupled linear oscillators in terms of a sum over independent normal
coordinates, i.e. normal modes, even though the motion may be very complicated. These normal modes
are constructed from the original coordinates in such a way that the normal modes are uncoupled.
Finding the normal modes of coupled oscillator systems is a ubiquitous problem encountered in all
branches of science and engineering. As discussed in chapter 3, oscillatory motion of non-linear systems
can be complicated. Fortunately, most oscillatory systems are approximately linear when the amplitude of
oscillation is small. This discussion assumes that the oscillation amplitudes are sufficiently small to ensure
linearity.
364 CHAPTER 14. COUPLED LINEAR OSCILLATORS
$$\eta_1 \equiv x_1 - x_2 \qquad \eta_2 \equiv x_1 + x_2 \tag{14.12}$$
that is
$$x_1 = \frac{1}{2}(\eta_2 + \eta_1) \qquad x_2 = \frac{1}{2}(\eta_2 - \eta_1) \tag{14.13}$$
Substituting these into the equations of motion (14.1) gives
$$m(\ddot{\eta}_1 + \ddot{\eta}_2) + (\kappa + 2\kappa')\eta_1 + \kappa\eta_2 = 0 \tag{14.14}$$
$$m(\ddot{\eta}_1 - \ddot{\eta}_2) + (\kappa + 2\kappa')\eta_1 - \kappa\eta_2 = 0$$
Adding and subtracting these two equations gives
$$m\ddot{\eta}_1 + (\kappa + 2\kappa')\eta_1 = 0 \tag{14.15}$$
$$m\ddot{\eta}_2 + \kappa\eta_2 = 0$$
Note that the two coordinates $\eta_1$ and $\eta_2$ are uncoupled and therefore are independent. The solutions of these equations are harmonic oscillations,
where $\eta_1$ corresponds to angular frequency $\omega_1$, and $\eta_2$ corresponds to $\omega_2$. The two coordinates $\eta_1$ and $\eta_2$ are
called the normal coordinates and the two solutions are the normal modes with corresponding
angular frequencies, $\omega_1$ and $\omega_2$.
Figure 14.3: Motion of two coupled harmonic oscillators in the $(x_1, x_2)$ spatial configuration space and in terms of the normal modes $(\eta_1, \eta_2)$. Initial conditions are $x_2 \neq 0$, $x_1 = \dot{x}_1 = \dot{x}_2 = 0$.
The $(\eta_1, \eta_2)$ axes of the two normal modes correspond to a rotation of 45° in configuration space, figure 14.3. The initial
conditions chosen correspond to $\eta_1 = -\eta_2$ and thus both modes are excited with equal intensity. Note that there are 5 lobes along
the $\eta_2$ axis versus 4 lobes along the $\eta_1$ axis, reflecting the ratio of the eigenfrequencies $\omega_1$ and $\omega_2$. Also note that the diamond
shape of the motion in the $(x_1, x_2)$ configuration space illustrates that the extrema amplitudes for $x_2$ are a maximum when $x_1$ is
zero, and vice versa. This is equivalent to the statement that the energies in the two modes are coupled with the energy for
the first oscillator being a maximum when the energy is a minimum for the second oscillator, and vice versa. By contrast, in
the $(\eta_1, \eta_2)$ configuration space, the motion is bounded by a rectangle parallel to the $(\eta_1, \eta_2)$ axes, reflecting the fact that the
extrema amplitudes, and corresponding energies, for the $\eta_1$ normal mode are constant and independent of the motion for the $\eta_2$
normal mode, and vice versa. The decoupling of the two normal modes is best illustrated by considering the case when only one
of these two normal modes is excited. For the initial conditions $x_1(0) = -x_2(0)$ and $\dot{x}_1(0) = -\dot{x}_2(0)$, then $\eta_2(t) = 0$. That is,
only the $\eta_1(t)$ normal mode is excited with frequency $\omega_1$, which corresponds to motion confined to the $\eta_1$ axis of figure 14.3.
Figure 14.4: Normal modes for two coupled oscillators, showing the antisymmetric mode (out of phase) and the symmetric mode (in phase).
As shown in figure 14.4, $\eta_1(t)$ is the antisymmetric mode in which the two masses oscillate out of phase
such as to keep the center of mass of the two masses stationary. For the initial conditions $x_1(0) = x_2(0)$
and $\dot{x}_1(0) = \dot{x}_2(0)$, then $\eta_1(t) = 0$, that is, only the $\eta_2(t)$ normal mode is excited. The $\eta_2(t)$ normal mode
is the symmetric mode where the two masses oscillate in phase with frequency $\omega_2$; it corresponds to motion
along the $\eta_2$ axis. For the symmetric mode, both masses move together, leading to a constant extension of
the coupling spring. As a result the frequency $\omega_2$ of the symmetric mode $\eta_2(t)$ is lower than the frequency
$\omega_1$ of the antisymmetric mode $\eta_1(t)$. That is, the antisymmetric mode is stiffer since all three springs provide
active restoring forces, compared to the symmetric mode where the coupling spring is uncompressed. In
general, for attractive forces the lowest frequency always occurs for the mode with the highest symmetry.
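The decoupling described above can be checked numerically. The following is a minimal sketch, not taken from the text, assuming two equal masses `m`, outer spring constants `kappa`, and a coupling spring constant `kappa_c` (all names and default values are illustrative). It integrates the coupled equations of motion with a velocity-Verlet step and records the largest excursions of the normal coordinates $x_1 - x_2$ and $x_1 + x_2$.

```python
def simulate_normal_coordinates(x1, x2, v1, v2, m=1.0, kappa=1.0,
                                kappa_c=0.5, dt=1e-3, steps=20000):
    """Integrate two coupled oscillators with velocity Verlet.

    Returns the largest |eta1| = |x1 - x2| and |eta2| = |x1 + x2| seen.
    The parameter values are illustrative assumptions.
    """
    def acc(xa, xb):
        # m * xa'' = -(kappa + kappa_c) * xa + kappa_c * xb
        return (-(kappa + kappa_c) * xa + kappa_c * xb) / m

    a1, a2 = acc(x1, x2), acc(x2, x1)
    max_eta1, max_eta2 = abs(x1 - x2), abs(x1 + x2)
    for _ in range(steps):
        x1 += v1 * dt + 0.5 * a1 * dt * dt
        x2 += v2 * dt + 0.5 * a2 * dt * dt
        new_a1, new_a2 = acc(x1, x2), acc(x2, x1)
        v1 += 0.5 * (a1 + new_a1) * dt
        v2 += 0.5 * (a2 + new_a2) * dt
        a1, a2 = new_a1, new_a2
        max_eta1 = max(max_eta1, abs(x1 - x2))
        max_eta2 = max(max_eta2, abs(x1 + x2))
    return max_eta1, max_eta2


# Antisymmetric start: only eta1 should be excited.
anti = simulate_normal_coordinates(1.0, -1.0, 0.0, 0.0)
# Symmetric start: only eta2 should be excited.
sym = simulate_normal_coordinates(1.0, 1.0, 0.0, 0.0)
print(anti, sym)
```

With an antisymmetric start the symmetric coordinate should stay at zero throughout, and vice versa, which is the statement that the two normal modes do not exchange energy.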
2 = + 1 + + 0 + 2 = 2 + 0 + 2
= ( + 0 + 2 ) − ( + 1 ) = 0 − 1
1 = 0 − (14.17)
0
2 = 2 − 2 −
q
0
The 1 mode, which has angular frequency 1 = +2 corresponds to an oscillations of the relative
separation , whilep the center-of-mass location is stationary. By contrast, the 2 mode, with angular
frequency 2 = corresponds to an oscillation of the center of mass with the relative separation
being a constant.
Figure 145 illustrates the decoupled center-of-mass
, and relative motions for both normal modes of
the coupled double-oscillator system. The difference in 2.0
while
$$\omega_2 = \sqrt{\frac{\kappa}{m}} \approx \omega_0(1 - \varepsilon) \tag{14.26}$$
That is, the two solutions are split, equally spaced, about the
single uncoupled oscillator value given by $\omega_0 = \sqrt{\frac{\kappa + \kappa'}{m}} \approx \sqrt{\frac{\kappa}{m}}(1 + \varepsilon)$. Note that the single uncoupled oscillator
frequency $\omega_0$ depends on the coupling strength $\kappa'$.
This splitting of the characteristic frequencies is a feature
exhibited by many systems of $n$ identical oscillators where
half of the frequencies are shifted upwards and half downward.
If $n$ is odd, then the central frequency is unshifted as
illustrated for the case of $n = 3$. An example of this behavior
is the Zeeman effect where the magnetic field couples the
atomic motion resulting in a hyperfine splitting of the energy
levels as illustrated.
Figure 14.6: Normal-mode frequencies for n=2 and n=3 weakly-coupled oscillators.
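The size of the weak-coupling splitting can be illustrated numerically. The sketch below assumes the small parameter is $\varepsilon = \kappa'/(2\kappa)$, which is consistent with the first-order expansions quoted above but is an assumption here, and compares the exact two-oscillator frequencies with the estimates $\omega_0(1 \pm \varepsilon)$. The function and variable names are illustrative.

```python
import math


def exact_frequencies(m, kappa, kappa_c):
    """Exact normal-mode frequencies of two identical coupled oscillators."""
    omega_anti = math.sqrt((kappa + 2.0 * kappa_c) / m)  # antisymmetric mode
    omega_sym = math.sqrt(kappa / m)                     # symmetric mode
    return omega_anti, omega_sym


def weak_coupling_estimate(m, kappa, kappa_c):
    """First-order estimate omega0 * (1 +/- eps), assuming eps = kappa_c / (2 kappa)."""
    eps = kappa_c / (2.0 * kappa)
    omega0 = math.sqrt((kappa + kappa_c) / m)  # one oscillator with the other held fixed
    return omega0 * (1.0 + eps), omega0 * (1.0 - eps)


exact = exact_frequencies(1.0, 1.0, 0.02)
approx = weak_coupling_estimate(1.0, 1.0, 0.02)
print(exact, approx)
```

The discrepancy between the two pairs is of second order in the small parameter, so it shrinks rapidly as the coupling spring is made weaker.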
There are myriad examples involving weakly-coupled oscillators in many aspects of the natural world.
The example of collective modes in nuclear physics, illustrated in example 14.13, is typical of applications to
physics, while there are many examples applied to musical instruments, acoustics, and engineering. Weakly-
coupled oscillators are a dominant theme throughout biology as illustrated by congregations of synchronously
flashing fireflies, crickets that chirp in unison, an audience clapping at the end of a performance, networks
of pacemaker cells in the heart, insulin-secreting cells in the pancreas, and neural networks in the brain and
spinal cord that control rhythmic behaviors such as breathing, walking, and eating. Synchronous motion of
a large number of weakly-coupled oscillators often leads to large collective motion of weakly-coupled systems
as discussed in chapter 14.12.
Figure: Schematic diagram of the action for a grand piano, including the strings, bridge and sounding board (labeled parts: hitchpin, damper, string, bridge, hammer, pin block, jack, soundboard, ribs, key). Note
that there are either two or three parallel strings per note that are hit by a single hammer.
The grand piano provides an excellent example of a weakly-coupled harmonic oscillator system that has
normal modes. There are either two or three parallel strings per note that are stretched tightly parallel to the
top of the horizontal sounding board. The strings press downwards on the bridge that is attached to the top of
the sounding board. The strings for each note are excited when struck vertically upwards by a single hammer.
In the bass section of the piano each note comprises two strings tuned to nearly the same frequency. The
coupling of the motion of the strings is via the bridge plus sounding board. Normally, the hammer strikes both
strings simultaneously exciting the vertical symmetric mode, not the vertical antisymmetric mode. The bridge
is connected to the sounding board which moves the largest amount for the symmetric mode where both strings
move the bridge in phase. This strong coupling produces a loud sound. The antisymmetric mode does not
move the sounding board much since the strings at the bridge move out of phase. Consequently, the symmetric
mode, which is strongly coupled to the sounding board, damps out more rapidly than the antisymmetric mode,
which is weakly coupled to the sounding board and thus has a longer time constant for decay since it radiates
less sound energy than the symmetric mode.
The una-corda pedal (soft pedal) for a grand piano moves the action sideways such that the hammer strikes
only one of the two strings, or two of the three strings, resulting in both the symmetric and antisymmetric
modes being excited equally. The una-corda pedal produces a characteristically different tone than when the
hammer simultaneously hits the coupled strings; that is, it produces a smaller transient component. The
symmetric mode rapidly damps due to energy propagation by the sounding board. Thus the longer lasting
antisymmetric mode becomes more prominent when both modes are equally excited using the una-corda pedal.
The symmetric and antisymmetric modes have slightly different frequencies and produce beats which also
contributes to the different timbre produced using the una-corda pedal. For the mid and upper frequency
range, the piano has three strings per note which have one symmetric mode and two separate antisymmetric
modes. To further complicate matters, the strings also can oscillate horizontally, which couples weakly to the
bridge plus sounding board. The strengths with which these different modes are excited depend on subtle differences
in the shape and roughness of the hammer head striking the strings. Primarily the hammer excites the two
vertical modes rather than the horizontal modes.
14.6. GENERAL ANALYTIC THEORY FOR COUPLED LINEAR OSCILLATORS 369
Expressing these in terms of generalized coordinates $x_{\alpha,i} = x_{\alpha,i}(q_j, t)$ where $j = 1, 2, \ldots, n$ then the generalized
velocities are given by
$$\dot{x}_{\alpha,i} = \sum_{j=1}^{n} \frac{\partial x_{\alpha,i}}{\partial q_j}\dot{q}_j + \frac{\partial x_{\alpha,i}}{\partial t} \tag{14.30}$$
As discussed in chapter 7.6 if the system is scleronomic then the partial time derivative
$$\frac{\partial x_{\alpha,i}}{\partial t} = 0 \tag{14.31}$$
Thus the kinetic energy, equation 14.29, of a scleronomic system can be written as a homogeneous quadratic
function of the generalized velocities
$$T = \frac{1}{2}\sum_{j,k} T_{jk}\dot{q}_j\dot{q}_k \tag{14.32}$$
Note that if the velocities $\dot{q}$ correspond to translational velocity, then the kinetic energy tensor T corresponds
to an effective mass tensor, whereas if the velocities correspond to angular rotational velocities, then the
kinetic energy tensor T corresponds to the inertia tensor.
It is possible to make an expansion of the $T_{jk}$ about the equilibrium values of the form
$$T_{jk}(q_1, q_2, \ldots, q_n) = T_{jk}(q_{i0}) + \sum_{l}\left(\frac{\partial T_{jk}}{\partial q_l}\right)_0 q_l + \cdots \tag{14.34}$$
Only the first term will be kept since the second and higher terms are of the same order as the higher-order
terms ignored in the Taylor expansion of the potential. Thus, at the equilibrium point, assume that
$\left(\frac{\partial T_{jk}}{\partial q_l}\right)_0 = 0$ where $l = 1, 2, 3, \ldots, n$.
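That is
$$U'(q_1, q_2, \ldots, q_n) = \frac{1}{2}\sum_{j,k} V_{jk}\, q_j q_k \tag{14.38}$$
where the components of the kinetic energy tensor T and potential energy tensor V are
$$T_{jk} \equiv \left(\sum_{\alpha} m_\alpha \sum_{i}^{3} \frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k}\right)_0 \tag{14.43}$$
$$V_{jk} \equiv \left(\frac{\partial^2 U'}{\partial q_j\,\partial q_k}\right)_0 \tag{14.44}$$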
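Note that $T_{jk}$ and $V_{jk}$ may have different units, but all the terms in the summations for both $T$ and $U'$ have
units of energy. The $T_{jk}$ and $V_{jk}$ values are evaluated at the equilibrium point, and thus both $T_{jk}$ and $V_{jk}$
are $n \times n$ arrays of values evaluated at the equilibrium location.
and
$$\frac{\partial L}{\partial \dot{q}_j} = \sum_{k} T_{jk}\dot{q}_k \tag{14.48}$$
Thus the Lagrange equations reduce to the following set of equations of motion,
$$\sum_{k}\left(V_{jk}\, q_k + T_{jk}\,\ddot{q}_k\right) = 0 \tag{14.49}$$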
For each $j$ where $1 \leq j \leq n$ there exists a set of $n$ second-order linear homogeneous differential equations
with constant coefficients. Since the system is oscillatory, it is natural to try a solution of the form
$$q_j(t) = a_j e^{i(\omega t - \delta)} \tag{14.50}$$
Assuming that the system is conservative, then this implies that $\omega$ is real, since an imaginary term for
$\omega$ would lead to an exponential damping term. The arbitrary constants are the real amplitude $a_j$ and the
phase $\delta$. Substitution of this trial solution for each $j$ leads to a set of equations
$$\sum_{k}\left(V_{jk} - \omega^2 T_{jk}\right)a_k = 0 \tag{14.51}$$
where the common factor $e^{i(\omega t - \delta)}$ has been removed. Equation 14.51 corresponds to a set of $n$ linear
homogeneous algebraic equations that the $a_k$ amplitudes must satisfy for each $j$. For a non-trivial solution
to exist, the determinant of the coefficients must vanish, that is
$$\begin{vmatrix}
V_{11} - \omega^2 T_{11} & V_{12} - \omega^2 T_{12} & V_{13} - \omega^2 T_{13} & \cdots \\
V_{12} - \omega^2 T_{12} & V_{22} - \omega^2 T_{22} & V_{23} - \omega^2 T_{23} & \cdots \\
V_{13} - \omega^2 T_{13} & V_{23} - \omega^2 T_{23} & V_{33} - \omega^2 T_{33} & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{vmatrix} = 0 \tag{14.52}$$
where the symmetry $V_{jk} = V_{kj}$, $T_{jk} = T_{kj}$ has been included. This is the standard eigenvalue problem for which
the above determinant gives the secular equation or the characteristic equation. It is an equation
of degree $n$ in $\omega^2$. The $n$ roots of this equation are $\omega_r^2$ where $\omega_r$ are the characteristic frequencies or
eigenfrequencies of the normal modes.
Substitution of $\omega_r^2$ into equation 14.52 determines the ratio $a_{1r} : a_{2r} : a_{3r} : \ldots : a_{nr}$ for this solution,
which defines the components of the $n$-dimensional eigenvector $\mathbf{a}_r$. That is, solution of the secular equations
has determined the eigenvalues and eigenvectors of the solutions of the coupled-channel system.
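For two generalized coordinates the secular equation is a quadratic in $\omega^2$ that can be solved in closed form. The following sketch, which is an illustration rather than the author's method, expands the $2 \times 2$ determinant of $\mathbf{V} - \omega^2\mathbf{T}$ for symmetric tensors and then reads the amplitude ratio $a_2/a_1$ off the first row of the linear equations. The function names and the sample tensor values are assumptions.

```python
import math


def secular_roots(T, V):
    """Roots omega^2 of det(V - omega^2 T) = 0 for symmetric 2x2 tensors."""
    a = T[0][0] * T[1][1] - T[0][1] ** 2
    b = -(V[0][0] * T[1][1] + V[1][1] * T[0][0] - 2.0 * V[0][1] * T[0][1])
    c = V[0][0] * V[1][1] - V[0][1] ** 2
    disc = math.sqrt(b * b - 4.0 * a * c)
    return sorted([(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)])


def amplitude_ratio(T, V, omega_sq):
    """Ratio a2/a1 from the first row: (V11 - w^2 T11) a1 + (V12 - w^2 T12) a2 = 0."""
    return -(V[0][0] - omega_sq * T[0][0]) / (V[0][1] - omega_sq * T[0][1])


# Illustrative values: unit masses, outer springs 1.0, coupling spring 0.5.
T = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.5, -0.5], [-0.5, 1.5]]
roots = secular_roots(T, V)
ratios = [amplitude_ratio(T, V, w2) for w2 in roots]
print(roots, ratios)
```

For these values the lower root has both amplitudes equal (the symmetric mode) and the upper root has them equal and opposite (the antisymmetric mode). The ratio is undefined when the off-diagonal coefficient vanishes, in which case the second row would have to be used instead.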
14.6.4 Superposition
The equations of motion $\sum_{k}\left(V_{jk}\, q_k + T_{jk}\,\ddot{q}_k\right) = 0$ are linear equations that satisfy superposition. Thus the
most general solution $q_j(t)$ can be a superposition of the eigenvectors $\mathbf{a}_r$, that is
$$q_j(t) = \sum_{r} a_{jr}\, e^{i(\omega_r t - \delta_r)} \tag{14.53}$$
Thus the most general solution of these linear equations involves a sum over the eigenvectors of the
system, which are cosine functions of the corresponding eigenfrequencies.
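Multiply equation 14.55 by $a_{js}$ and sum over $j$. Similarly multiply equation 14.56 by $a_{kr}$ and sum over $k$.
These summations lead to
$$\sum_{j,k} V_{jk}\, a_{js} a_{kr} = \omega_r^2 \sum_{j,k} T_{jk}\, a_{js} a_{kr} \tag{14.57}$$
$$\sum_{j,k} V_{jk}\, a_{js} a_{kr} = \omega_s^2 \sum_{j,k} T_{jk}\, a_{js} a_{kr} \tag{14.58}$$
Note that the left-hand sides of these two equations are identical. Thus taking the difference between these
equations gives
$$\left(\omega_r^2 - \omega_s^2\right)\sum_{j,k} T_{jk}\, a_{js} a_{kr} = 0 \tag{14.59}$$
Note that if $\left(\omega_r^2 - \omega_s^2\right) \neq 0$, that is, assuming that the eigenfrequencies are not degenerate, then to ensure
that equation 14.59 is zero requires that
$$\sum_{j,k} T_{jk}\, a_{js} a_{kr} = 0 \qquad r \neq s \tag{14.60}$$
This shows that the eigenfunctions are orthogonal. If the eigenfrequencies are degenerate, i.e. $\omega_r^2 = \omega_s^2$,
then, with no loss of generality, the eigenvectors $\mathbf{a}_r$ and $\mathbf{a}_s$ can be chosen to be orthogonal.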
The eigenfunction normalization can be chosen freely since only ratios of the eigenfunction components
$a_{jr}$ are determined when $\omega_r$ is used in equation 14.51. The kinetic energy, given by equation 14.32,
must be positive, or zero for the case of a static system. That is
$$T = \frac{1}{2}\sum_{j,k} T_{jk}\dot{q}_j\dot{q}_k \geq 0 \tag{14.61}$$
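Using the time derivative of equation 14.54 to determine $\dot{q}_j$ and inserting it into equation 14.61 gives that the kinetic
energy is
$$T = \frac{1}{2}\sum_{j,k} T_{jk}\dot{q}_j\dot{q}_k = \frac{1}{2}\sum_{r,s}\omega_r\omega_s \cos(\omega_r t - \delta_r)\cos(\omega_s t - \delta_s)\sum_{j,k} T_{jk}\, a_{jr} a_{ks} \tag{14.62}$$
Since this sum must be a positive number, and the magnitude of the amplitudes can be chosen freely, then
it is possible to normalize the eigenfunction amplitudes to unity. That is, choose that
$$\sum_{j,k} T_{jk}\, a_{jr} a_{kr} = 1 \tag{14.65}$$
The orthogonality equation, 14.60, and the normalization equation 14.65 can be combined into a single
orthonormalization equation
$$\sum_{j,k} T_{jk}\, a_{jr} a_{ks} = \delta_{rs} \tag{14.66}$$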
where eb are the unit vectors for the generalized coordinates.
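The orthonormalization condition weights the eigenvector components by the kinetic energy tensor. As a minimal sketch, assuming a diagonal kinetic energy tensor with equal masses and the symmetric and antisymmetric eigenvectors of two coupled oscillators (illustrative values only), the weighted inner product and the normalization can be written as:

```python
import math


def t_inner(T, a, b):
    """Kinetic-energy-weighted inner product: sum over j, k of T[j][k] * a[j] * b[k]."""
    n = len(a)
    return sum(T[j][k] * a[j] * b[k] for j in range(n) for k in range(n))


def t_normalize(T, a):
    """Scale eigenvector a so that its T-weighted inner product with itself is 1."""
    norm = math.sqrt(t_inner(T, a, a))
    return [component / norm for component in a]


# Illustrative case: two equal masses m = 2, unnormalized eigenvectors (1, -1) and (1, 1).
T = [[2.0, 0.0], [0.0, 2.0]]
a_anti = t_normalize(T, [1.0, -1.0])
a_sym = t_normalize(T, [1.0, 1.0])
print(t_inner(T, a_anti, a_anti), t_inner(T, a_sym, a_sym), t_inner(T, a_anti, a_sym))
```

The two normalized eigenvectors have unit weighted norm and zero weighted overlap, which is the content of the Kronecker delta in the orthonormalization equation.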
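$$U = \frac{1}{2}\kappa x_1^2 + \frac{1}{2}\kappa x_2^2 + \frac{1}{2}\kappa'(x_2 - x_1)^2 = \frac{1}{2}(\kappa + \kappa')x_1^2 + \frac{1}{2}(\kappa + \kappa')x_2^2 - \kappa' x_1 x_2$$
while the kinetic energy is given by
$$T = \frac{1}{2}m\dot{x}_1^2 + \frac{1}{2}m\dot{x}_2^2$$
2) The second stage is to evaluate the potential energy and kinetic energy tensors. The potential
energy tensor is nondiagonal since $V_{jk}$ gives
$$V_{11} \equiv \left(\frac{\partial^2 U}{\partial x_1\,\partial x_1}\right)_0 = \kappa + \kappa' = V_{22}$$
$$V_{12} = \left(\frac{\partial^2 U}{\partial x_1\,\partial x_2}\right)_0 = -\kappa' = V_{21}$$
Since $T_{11} = T_{22} = m$ and $T_{12} = T_{21} = 0$ then the kinetic energy tensor is
$$\mathbf{T} = \begin{Bmatrix} m & 0 \\ 0 & m \end{Bmatrix}$$
Note that for this case, the kinetic energy tensor equals the mass tensor, which is diagonal, whereas the
potential energy tensor equals the spring constant tensor, which is nondiagonal.
3) The third stage is to use the potential energy and kinetic energy tensors to evaluate the secular
determinant using equation 14.52
$$\begin{vmatrix} \kappa + \kappa' - m\omega^2 & -\kappa' \\ -\kappa' & \kappa + \kappa' - m\omega^2 \end{vmatrix} = 0$$
That is
$$\left(\kappa + \kappa' - m\omega^2\right) = \pm\kappa'$$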
14.7. TWO-BODY COUPLED OSCILLATOR SYSTEMS 375
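which simplifies to
$$a_{11} = -a_{21}$$
Similarly, for the other eigenfrequency $\omega_2$, that is, $j = 1$, $r = 2$
$$(\kappa + \kappa' - \kappa)\,a_{12} - \kappa' a_{22} = 0$$
which simplifies to
$$a_{12} = a_{22}$$
5) The final stage is to write the general coordinates in terms of the normal coordinates $\eta_r(t)$.
Thus
$$x_1 = a_{11}\eta_1 + a_{12}\eta_2 = a_{11}\eta_1 + a_{22}\eta_2$$
and
$$x_2 = a_{21}\eta_1 + a_{22}\eta_2 = -a_{11}\eta_1 + a_{22}\eta_2$$
Adding or subtracting gives that the normal modes are
$$\eta_1 = \frac{1}{2a_{11}}(x_1 - x_2)$$
$$\eta_2 = \frac{1}{2a_{22}}(x_2 + x_1)$$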
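then $T_{11} = T_{22} = m$ and $T_{12} = T_{21} = 0$. Thus the kinetic energy tensor is
$$\mathbf{T} = \begin{Bmatrix} m & 0 \\ 0 & m \end{Bmatrix}$$
Note that for this case the kinetic energy tensor is diagonal whereas the potential energy tensor is nondiagonal.
3) The third stage is to use the potential energy and kinetic energy tensors to evaluate the secular
determinant using equation 14.52
$$\begin{vmatrix} 2\kappa - m\omega^2 & -\kappa \\ -\kappa & \kappa - m\omega^2 \end{vmatrix} = 0$$
The expansion of this secular determinant yields
$$\left(2\kappa - m\omega^2\right)\left(\kappa - m\omega^2\right) - \kappa^2 = 0$$
That is
$$\omega^4 - 3\frac{\kappa}{m}\omega^2 + \frac{\kappa^2}{m^2} = 0$$
The solutions are
$$\omega_1 = \frac{\sqrt{5} + 1}{2}\sqrt{\frac{\kappa}{m}} \qquad \omega_2 = \frac{\sqrt{5} - 1}{2}\sqrt{\frac{\kappa}{m}}$$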
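The closed-form eigenfrequencies quoted above can be cross-checked by solving the secular equation as a quadratic in $\omega^2$. The sketch below is illustrative and assumes the $2 \times 2$ secular determinant has diagonal elements $2\kappa - m\omega^2$ and $\kappa - m\omega^2$ with off-diagonal elements $-\kappa$; the function names and numerical values are assumptions.

```python
import math


def eigenfrequencies(m, kappa):
    """Solve m^2 w^4 - 3 kappa m w^2 + kappa^2 = 0 for the two positive frequencies."""
    a, b, c = m * m, -3.0 * kappa * m, kappa * kappa
    disc = math.sqrt(b * b - 4.0 * a * c)
    omega_high = math.sqrt((-b + disc) / (2.0 * a))
    omega_low = math.sqrt((-b - disc) / (2.0 * a))
    return omega_high, omega_low


def secular_determinant(m, kappa, omega):
    """Value of (2 kappa - m w^2)(kappa - m w^2) - kappa^2."""
    return (2.0 * kappa - m * omega ** 2) * (kappa - m * omega ** 2) - kappa ** 2


m, kappa = 2.0, 8.0
omega_high, omega_low = eigenfrequencies(m, kappa)
scale = math.sqrt(kappa / m)
print(omega_high / scale, omega_low / scale)
```

The two ratios printed are $(\sqrt{5} + 1)/2$ and $(\sqrt{5} - 1)/2$, which follows from $\left((\sqrt{5} \pm 1)/2\right)^2 = (3 \pm \sqrt{5})/2$.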
4) The fourth step is to insert these eigenfrequencies into the secular equation 14.51
$$\sum_{k}\left(V_{jk} - \omega_r^2 T_{jk}\right)a_{kr} = 0$$
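Note that for this case the kinetic energy tensor is diagonal whereas the potential energy tensor is nondiagonal.
The third stage is to evaluate the secular determinant
$$\begin{vmatrix} mgl + \kappa l^2 - \omega^2 ml^2 & -\kappa l^2 \\ -\kappa l^2 & mgl + \kappa l^2 - \omega^2 ml^2 \end{vmatrix} = 0$$
Consider $j = 1$
$$\left(mgl + \kappa l^2 - \omega^2 ml^2\right)a_1 - \kappa l^2 a_2 = 0$$
Then for the first eigenfrequency, $\omega_1$, the subscripts are $j = 1$, $r = 1$
$$\left(mgl + \kappa l^2 - \frac{g}{l}ml^2\right)a_{11} - \kappa l^2 a_{21} = 0$$
which simplifies to
$$a_{11} = a_{21}$$
Similarly, for $j = 1$, $r = 2$
$$\left(mgl + \kappa l^2 - \left(\frac{g}{l} + \frac{2\kappa}{m}\right)ml^2\right)a_{12} - \kappa l^2 a_{22} = 0$$
which simplifies to
$$a_{12} = -a_{22}$$
The final stage is to write the general coordinates in terms of the normal coordinates
$$\theta_1 = a_{11}\eta_1 + a_{12}\eta_2 = a_{11}\eta_1 - a_{22}\eta_2$$
and
$$\theta_2 = a_{21}\eta_1 + a_{22}\eta_2 = a_{11}\eta_1 + a_{22}\eta_2$$
Adding or subtracting these equations gives that the normal modes are
$$\eta_1 = \frac{1}{2a_{11}}(\theta_1 + \theta_2) \qquad \eta_2 = \frac{1}{2a_{22}}(\theta_2 - \theta_1)$$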
As for the case of the double oscillator discussed in example 14.2, the symmetric normal mode corresponds
to an oscillation of the center-of-mass, with zero relative motion of the two pendula, which has the lower
frequency $\omega_1 = \sqrt{g/l}$. This frequency is the same as for one independent pendulum, as expected, since they
vibrate in unison and thus the only restoring force is gravity. The antisymmetric mode corresponds to
relative motion of the two pendula with stationary center-of-mass and has the frequency $\omega_2 = \sqrt{\frac{g}{l} + \frac{2\kappa}{m}}$
since the restoring force includes both the coupling spring and gravity.
This example introduces the role of degeneracy, which occurs in this system if the coupling of the pendula
is zero, that is, $\kappa = 0$, leading to both frequencies being equal, i.e. $\omega_1 = \omega_2 = \sqrt{g/l}$. When $\kappa = 0$, then both
{T} and {V} are diagonal and thus in the $(\theta_1, \theta_2)$ space the two pendula are independent normal modes.
However, the symmetric and antisymmetric normal modes, as derived above, are equally good normal modes.
In fact, since the modes are degenerate, any linear combination of the motions of the independent pendula is an
equally good normal mode, and thus one can use any set of orthogonal normal modes to describe the motion.
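The role of degeneracy can be made concrete by evaluating the residual of the eigenvalue equation for a trial amplitude vector. The sketch below assumes two identical plane pendula of mass `m` and length `l` coupled by a spring `kappa`, with kinetic energy tensor $ml^2$ times the identity and potential energy tensor elements $mgl + \kappa l^2$ on the diagonal and $-\kappa l^2$ off the diagonal; these tensor forms and all numerical values are assumptions made for illustration.

```python
def residual(a, omega_sq, m, l, g, kappa):
    """Components of (V - omega^2 T) a for two spring-coupled plane pendula."""
    diag = m * g * l + kappa * l * l - omega_sq * m * l * l
    off = -kappa * l * l
    return (diag * a[0] + off * a[1], off * a[0] + diag * a[1])


m, l, g = 1.5, 2.0, 9.81
arbitrary = (0.3, -1.7)

# No coupling: any amplitude vector satisfies the equations at omega^2 = g / l.
uncoupled = residual(arbitrary, g / l, m, l, g, kappa=0.0)

# With coupling: only the symmetric and antisymmetric vectors are eigenvectors.
kappa = 0.6
symmetric = residual((1.0, 1.0), g / l, m, l, g, kappa)
antisymmetric = residual((1.0, -1.0), g / l + 2.0 * kappa / m, m, l, g, kappa)
coupled_arbitrary = residual(arbitrary, g / l, m, l, g, kappa)
print(uncoupled, symmetric, antisymmetric, coupled_arbitrary)
```

With the spring switched off the residual vanishes for an arbitrary vector, which is the numerical signature of the degeneracy. Once the coupling is restored, only the two vectors found analytically give a zero residual.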
As shown in the adjacent figure, the normal modes for this system
are
$$\eta_1 = \frac{1}{2a_{11}}\left(\theta_1 + \frac{\theta_2}{\sqrt{2}}\right) \qquad \eta_2 = \frac{1}{2a_{22}}\left(\theta_1 - \frac{\theta_2}{\sqrt{2}}\right)$$
The second mass has a $\sqrt{2}$ larger amplitude that is in phase for solution 1 and out of phase for solution 2.
Figure: Normal modes for two series-coupled plane pendula.
b) Large amplitude chaotic regime
Stachowiak and Okada [Sta05] used computer simulations to numerically analyze the behavior of this
system with increase in the oscillation amplitudes. Poincaré sections, bifurcation diagrams, and Lyapunov
exponents all confirm that this system evolves from regular normal-mode oscillatory behavior in the linear
regime at low energy, to chaotic behavior at high excitation energies where non-linearity dominates. This
behavior is analogous to that of the driven, linearly-damped, harmonic pendulum described in chapter 3.5.
¡ 02 ¢ ¡ 2 ¢
= 1 + 02 02
2 + 3 = 1 + 22 + 23 − 21 2 − 21 3 − 22 3
2 2
The kinetic energy evaluated at the equilibrium location is
1 ³ ´2 1 ³ ´2 1 ³ ´2
= ̇1 + ̇2 + ̇3
2 2 2
The next stage is to evaluate the {T} and {V} tensors
⎧ ⎫ ⎧ ⎫
⎨ 1 0 0 ⎬ ⎨ 1 − − ⎬
T = 2 0 1 0 V = − 1 −
⎩ ⎭ ⎩ ⎭
0 0 1 − − 1
The third stage is to evaluate the secular determinant which can be written as
¯ ¯
¯ 1 − 2 − − ¯
¯ ¯
¯ − 2
1 − − ¯
¯ ¯=0
¯ ¯
¯ − − 1 − 2 ¯
while for = 3 = 2
−13 + 223 − 33 = 0
Solving these gives
13 = 23 = 33
Assuming that the eigenfunction is normalized to unity
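The general analytic approach requires the $\mathbf{T}$ and $\mathbf{V}$ energy tensors given by
$$\mathbf{T} = ml^2\begin{Bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Bmatrix} \qquad \mathbf{V} = \begin{Bmatrix} mgl + \kappa l^2 & -\kappa l^2 & 0 \\ -\kappa l^2 & mgl + 2\kappa l^2 & -\kappa l^2 \\ 0 & -\kappa l^2 & mgl + \kappa l^2 \end{Bmatrix}$$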
14.8. THREE-BODY COUPLED LINEAR OSCILLATOR SYSTEMS 383
Note that in contrast to the prior case of three fully-coupled pendula, for the nearest neighbor case the potential
energy tensor {V} is non-zero only on the diagonal and ±1 components parallel to the diagonal.
The third stage is to evaluate the secular determinant of the $\left(\mathbf{V} - \omega^2\mathbf{T}\right)$ matrix, that is
$$\begin{vmatrix}
mgl + \kappa l^2 - \omega^2 ml^2 & -\kappa l^2 & 0 \\
-\kappa l^2 & mgl + 2\kappa l^2 - \omega^2 ml^2 & -\kappa l^2 \\
0 & -\kappa l^2 & mgl + \kappa l^2 - \omega^2 ml^2
\end{vmatrix} = 0$$
which results in the three non-degenerate eigenfrequencies for the normal modes.
The normal modes are similar to the prior case of complete linear
coupling, as shown in the adjacent figure.
$\omega_1 = \sqrt{\frac{g}{l}}$. This lowest mode $\eta_1$ involves the three pendula oscillating
in phase such that the springs are not stretched or compressed; thus the
period of this coherent oscillation is the same as an independent pendulum
of mass $m$ and length $l$. That is
$$\boldsymbol{\eta}_1 = \frac{1}{\sqrt{3}}(\theta_1, \theta_2, \theta_3)$$
$\omega_2 = \sqrt{\frac{g}{l} + \frac{\kappa}{m}}$. This second mode $\eta_2$ has the central mass stationary with
the outer pendula oscillating with the same amplitude and out of phase.
That is
$$\boldsymbol{\eta}_2 = \frac{1}{\sqrt{2}}(\theta_1, 0, -\theta_3)$$
$\omega_3 = \sqrt{\frac{g}{l} + \frac{3\kappa}{m}}$. This third mode $\eta_3$ involves the outer pendula in phase
with the same amplitude while the central pendulum oscillates with angle
$\theta_2 = -2\theta_1$. That is
$$\boldsymbol{\eta}_3 = \frac{1}{\sqrt{6}}(\theta_1, -2\theta_2, \theta_3)$$
Figure: Normal modes of three plane pendula with nearest-neighbour coupling.
Similar to the prior case of three completely-coupled pendula, the coherent
normal mode $\boldsymbol{\eta}_1$ corresponds to an oscillation of the center-of-mass with
no relative motion, while $\boldsymbol{\eta}_2$ and $\boldsymbol{\eta}_3$ correspond to relative motion of
the pendula with stationary center of mass motion. In contrast to the
prior example of complete coupling, for nearest neighbor coupling the two
higher lying solutions are not degenerate. That is, the nearest neighbor
coupling solutions differ from when all masses are linearly coupled.
It is interesting to note that this example combines two coupling mechanisms
that can be used to predict the solutions for two extreme cases
by switching off one of these coupling mechanisms. Switching off the
coupling springs, by setting $\kappa = 0$, makes all three normal frequencies
degenerate with $\omega_1 = \omega_2 = \omega_3 = \sqrt{g/l}$. This corresponds to three independent
identical pendula each with frequency $\omega = \sqrt{g/l}$. The three
linear combinations $\eta_1$, $\eta_2$, $\eta_3$ also have this same frequency; in particular
$\eta_1$ corresponds to an in-phase oscillation of the three pendula. The three
uncoupled pendula are independent and any combination of the three modes is allowed since the three frequencies
are degenerate.
The other extreme is to let $g = 0$, that is, switch off the gravitational field, or let $l \to \infty$; then the only
coupling is due to the two springs. This results in $\omega_1 = 0$ because there is no restoring force acting on the
coherent motion of the three in-phase coupled oscillators; as a result, oscillatory motion cannot be sustained
since it corresponds to the center of mass oscillation with no external forces acting, which is spurious. That
is, this spurious solution corresponds to constant linear translation.
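The three nearest-neighbour modes can be verified directly against the secular matrix. The following sketch assumes the tensors $\mathbf{T} = ml^2\,\mathbf{1}$ and a tridiagonal $\mathbf{V}$ with diagonal elements $mgl + \kappa l^2$, $mgl + 2\kappa l^2$, $mgl + \kappa l^2$ and off-diagonal elements $-\kappa l^2$, together with the frequencies $g/l$, $g/l + \kappa/m$ and $g/l + 3\kappa/m$ for $\omega^2$. The numerical values and function names are illustrative assumptions.

```python
def secular_matrix(omega_sq, m, l, g, kappa):
    """Matrix V - omega^2 T for three pendula with nearest-neighbour springs."""
    outer = m * g * l + kappa * l * l - omega_sq * m * l * l
    centre = m * g * l + 2.0 * kappa * l * l - omega_sq * m * l * l
    off = -kappa * l * l
    return [[outer, off, 0.0], [off, centre, off], [0.0, off, outer]]


def apply(matrix, vector):
    """Matrix-vector product for a 3x3 matrix."""
    return [sum(matrix[i][k] * vector[k] for k in range(3)) for i in range(3)]


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


m, l, g, kappa = 1.0, 1.0, 9.81, 2.0
modes = [
    (g / l, [1.0, 1.0, 1.0]),                      # all three in phase
    (g / l + kappa / m, [1.0, 0.0, -1.0]),         # centre pendulum at rest
    (g / l + 3.0 * kappa / m, [1.0, -2.0, 1.0]),   # centre opposes the outer pair
]
checks = []
for omega_sq, shape in modes:
    M = secular_matrix(omega_sq, m, l, g, kappa)
    checks.append((det3(M), max(abs(c) for c in apply(M, shape))))
print(checks)
```

Each pair printed should be numerically zero: the determinant vanishes at each eigenfrequency and the corresponding mode shape is annihilated by the secular matrix.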
Note that for this case the kinetic energy tensor is diagonal whereas k
Longitudinal modes
The coordinate system used is illustrated in the adjacent figure.
The Lagrangian for this system is
$$L = \left(\frac{m}{2}\dot{x}_1^2 + \frac{M}{2}\dot{x}_2^2 + \frac{m}{2}\dot{x}_3^2\right) - \frac{\kappa}{2}\left[(x_2 - x_1)^2 + (x_3 - x_2)^2\right]$$
Evaluating the kinetic energy tensor gives
$$\mathbf{T} = \begin{Bmatrix} m & 0 & 0 \\ 0 & M & 0 \\ 0 & 0 & m \end{Bmatrix}$$
Note that the same answer is obtained using Newtonian mechanics. That is, the force equation gives
$$m\ddot{x}_1 - \kappa(x_2 - x_1) = 0$$
$$M\ddot{x}_2 + \kappa(x_2 - x_1) - \kappa(x_3 - x_2) = 0$$
$$m\ddot{x}_3 + \kappa(x_3 - x_2) = 0$$
This leads to the same secular determinant as given above with the matrix elements clustered along the
diagonal for nearest-neighbor problems.
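The signs in the Newtonian force equations can be checked against the potential energy, since the force on each mass is minus the derivative of the potential with respect to its displacement. The sketch below assumes the potential $\frac{\kappa}{2}\left[(x_2 - x_1)^2 + (x_3 - x_2)^2\right]$ for the longitudinal displacements of a linear triatomic molecule and compares analytic forces with a central finite difference. The displacement values and function names are illustrative.

```python
def potential(x, kappa):
    """Spring potential energy for longitudinal displacements x = [x1, x2, x3]."""
    x1, x2, x3 = x
    return 0.5 * kappa * ((x2 - x1) ** 2 + (x3 - x2) ** 2)


def forces_analytic(x, kappa):
    """Forces -dU/dx_i, i.e. the right-hand sides of mass * acceleration = force."""
    x1, x2, x3 = x
    return [kappa * (x2 - x1),
            -kappa * (x2 - x1) + kappa * (x3 - x2),
            -kappa * (x3 - x2)]


def forces_numeric(x, kappa, h=1e-6):
    """Central finite-difference estimate of -dU/dx_i."""
    out = []
    for i in range(3):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        out.append(-(potential(up, kappa) - potential(down, kappa)) / (2.0 * h))
    return out


x = [0.1, -0.2, 0.4]
kappa = 3.0
analytic = forces_analytic(x, kappa)
numeric = forces_numeric(x, kappa)
print(analytic, numeric, sum(analytic))
```

The three internal forces sum to zero, as they must for a molecule with no external force, which is why one of the longitudinal solutions corresponds to uniform translation of the center of mass.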
Transverse modes
The solutions are:
¡ ¢
4) 4 = 2 2+ This is the only non-spurious transverse mode 4 which corresponds to the two
outside masses vibrating in unison transverse to the symmetry axis while the central mass vibrates oppositely.
This mode radiates electric dipole radiation since the electric dipole is oscillating.
5) 5 = 0. This transverse solution 5 has all three nuclei vibrating in unison transverse to the symmetry
axis and corresponds to a spurious center of mass oscillation.
6) 6 = 0 This transverse solution 6 corresponds to a stationary central mass with the two outside
masses vibrating oppositely. This corresponds to a rotational oscillation of the molecule which is spurious
since there are no torques acting on the molecule for a central force. Rotational motion usually is taken into
account separately.
The normal modes for the bent triatomic molecule are similar except that the oscillator coupling strength
is reduced by the factor cos where is the bend angle.
14.9. MOLECULAR COUPLED OSCILLATOR SYSTEMS 387
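$$L = \frac{1}{2}\sum_{j=1}^{n+1}\left(m\dot{q}_j^2 - \frac{\tau}{d}(q_{j-1} - q_j)^2\right) \tag{14.83}$$
Figure 14.8: Transverse motion of a linear discrete lattice chain
Using this Lagrangian in the Lagrange Euler equations gives the following second-order equation of motion
for transverse oscillations
$$\ddot{q}_j = \omega_0^2\left(q_{j-1} - 2q_j + q_{j+1}\right) \tag{14.84}$$
where $j = 1, 2, \ldots, n$ and
$$\omega_0 \equiv \sqrt{\frac{\tau}{md}} \tag{14.85}$$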
The normal modes for the transverse modes comprise standing waves that satisfy the same boundary
conditions as for the longitudinal modes. The equations of motion for longitudinal motion, equation
14.77, or transverse motion, equation 14.84, are identical in form. The major difference is that $\omega_0$ for the
transverse normal modes, $\omega_0 \equiv \sqrt{\frac{\tau}{md}}$, differs from that for the longitudinal modes, which is $\omega_0 \equiv \sqrt{\frac{\kappa}{m}}$. Thus
the following discussion of the normal modes on a discrete lattice chain is identical in form for both transverse
and longitudinal waves.
This secular determinant corresponds to the special case of nearest neighbor interactions with the kinetic
energy tensor T being diagonal and the potential energy tensor V involving coupling only to adjacent
masses. The secular determinant is of order $n$ and thus determines exactly $n$ eigenfrequencies $\omega_r$ for each
polarization mode.
For large $n$ the solution of this problem is more efficiently obtained by using a recursion relation approach,
rather than solving the above secular determinant. The trick is to assume that the phase differences $\phi_r$
between the motion of adjacent masses all are identical for a given polarization. Then the amplitude for the
$j$th mass for the $r$th frequency mode $\omega_r$ is of the form
which reduces to
$$\omega_r^2 = 2\omega_0^2 - 2\omega_0^2\cos\phi_r = 4\omega_0^2\sin^2\frac{\phi_r}{2}$$
that is
$$\omega_r = 2\omega_0\sin\frac{\phi_r}{2} \tag{14.91}$$
where $r = 1, 2, 3, \ldots, n$.
Now it is necessary to determine the phase angle $\phi_r$, which can be done by applying the boundary
conditions for standing waves on the lattice chain. These boundary conditions for stationary modes require
that the ends of the lattice chain are nodes, that is $a_{0r} = a_{(n+1)r} = 0$. Using the fact that only the real
part of $a_{jr}$ has physical meaning leads to the amplitude for the $j$th mass for the $r$th mode to be
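Therefore
$$(n + 1)\phi_r = r\pi \tag{14.95}$$
where $r = 1, 2, 3, \ldots, n$. That is
$$\phi_r = \frac{r\pi}{n+1} = \frac{r\pi d}{(n+1)d} = \frac{r\pi d}{D} = k_r d \tag{14.96}$$
$$\omega_r = 2\omega_0\sin\frac{\phi_r}{2} = 2\omega_0\sin\frac{r\pi}{2(n+1)} = 2\omega_0\sin\frac{r\pi d}{2D} = 2\omega_0\sin\frac{k_r d}{2} \tag{14.97}$$
where the corresponding wavenumber $k_r$ is given by
$$k_r = \frac{2\pi}{\lambda_r} = \frac{r\pi}{(n+1)d} = \frac{r\pi}{D} \tag{14.98}$$
This implies that the normal modes are quantized with half-wavelengths $\frac{\lambda_r}{2} = \frac{D}{r}$.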
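The recursion-relation solution can be checked against the equation of motion for the chain. The sketch below assumes sinusoidal mode shapes $\sin\left(jr\pi/(n+1)\right)$ with fixed ends and mode frequencies $2\omega_0\sin\left(r\pi/(2(n+1))\right)$, and evaluates the residual of $-\omega_r^2 a_j = \omega_0^2(a_{j-1} - 2a_j + a_{j+1})$ for every interior mass. The function names and the choice $n = 5$ are illustrative.

```python
import math


def mode_frequency(r, n, omega0):
    """Angular frequency of mode r for a chain of n masses with fixed ends."""
    return 2.0 * omega0 * math.sin(r * math.pi / (2.0 * (n + 1)))


def mode_shape(r, n):
    """Amplitudes a_0 ... a_(n+1); the end points j = 0 and j = n + 1 are nodes."""
    return [math.sin(j * r * math.pi / (n + 1)) for j in range(n + 2)]


def max_residual(r, n, omega0):
    """Largest violation of -w_r^2 a_j = w0^2 (a_(j-1) - 2 a_j + a_(j+1))."""
    a = mode_shape(r, n)
    w = mode_frequency(r, n, omega0)
    return max(abs(-w * w * a[j] - omega0 ** 2 * (a[j - 1] - 2.0 * a[j] + a[j + 1]))
               for j in range(1, n + 1))


n, omega0 = 5, 1.0
residuals = [max_residual(r, n, omega0) for r in range(1, n + 1)]
frequencies = [mode_frequency(r, n, omega0) for r in range(1, n + 1)]
print(residuals, frequencies)
```

The residuals vanish to rounding error because $\sin((j-1)\phi) + \sin((j+1)\phi) = 2\sin(j\phi)\cos\phi$, and the frequencies increase monotonically with the mode number while staying below $2\omega_0$.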
14.10. DISCRETE LATTICE CHAIN 391
[Figure 14.9 panels: r=1, r=2, r=3, r=4, r=5, r=6; vertical axis from −1.0 to 1.0, horizontal axis from 0 to 1.0]
Figure 14.9: Plots of the maximal vibrational amplitudes for the $r$th frequency sinusoidal mode, versus
distance along the chain, for transverse normal modes of a vibrating discrete lattice with $n = 5$. Only $r =$
1, 2, 3, 4, 5 are distinct modes because $r = 6$ is a null mode. Note that the modes with $r =$ 7, 8, 9, 10, 11, 12,
shown dashed, duplicate the locations of the mass displacement given by the lower-order modes.
Combining equations 14.96 and 14.93 gives the maximum amplitudes for the eigenvectors to be
$$a_{jr} = a_r \sin\frac{jr\pi}{n+1} \qquad (14.99)$$
For $n$ independent linear oscillators there are only $n$ independent normal modes, that is, for $r = n + 1$ the
sine function in equation 14.99 must be zero. Beyond $r = n$ the equations do not describe physically new
situations. This is illustrated by figure 14.9 which shows the transverse modes of a lattice chain with $n = 5$.
There are only $n = 5$ independent normal modes of this system since $r = n + 1 = 6$ corresponds to a null
mode with all $q_j(t) = 0$. Also note that the solutions for $r > n + 1$, shown dashed, replicate the mass
locations of modes with $r < n + 1$, that is, the modes with $r > 6$ are replicas of the lower-order modes.
Note that $\omega_r$ has a maximum value $\omega_r \leq 2\omega_0$ since the sine function cannot exceed unity. This leads
to a maximum frequency $\omega_{max} = 2\omega_0$, called the cut-off frequency, which occurs when $\phi_r = \pi$. That is, the
cut-off coincides with the null mode at $r = n + 1$, for which equation 14.99 equals zero. The range of quantized normal
modes that can occur is intuitive. That is, the longest half-wavelength $\frac{\lambda_{max}}{2} = D = (n + 1)d$ equals the total
length of the discrete lattice chain. The shortest half-wavelength $\frac{\lambda_{min}}{2} = d$ is set by the lattice spacing.
Thus the discrete wavenumbers of the normal modes, for each polarization, range from $k_1$ to $nk_1$ where $n$ is
an integer.
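The null mode and the replication of the lower-order modes can be illustrated with a few lines of code. This is a sketch under the assumption that the mass displacements follow $\sin\frac{jr\pi}{n+1}$ with unit amplitude; the function name is hypothetical. It shows that $r = n + 1$ leaves every mass at rest, and that mode $r = 2(n+1) - r'$ reproduces the displacements of mode $r'$ at the mass locations up to an overall sign.

```python
import math

def displacements(r, n):
    """Displacements of the n masses for mode number r (unit amplitude)."""
    return [math.sin(j * r * math.pi / (n + 1)) for j in range(1, n + 1)]

n = 5
null_mode = displacements(n + 1, n)          # r = 6: every mass sits on a node
low = displacements(2, n)                    # r = 2
high = displacements(2 * (n + 1) - 2, n)     # r = 10, one of the dashed modes

# True when all masses are at rest for r = n + 1.
is_null = all(abs(a) < 1e-12 for a in null_mode)
# True when the r = 10 pattern equals the r = 2 pattern with reversed sign.
is_replica = all(abs(h + l) < 1e-12 for h, l in zip(high, low))
```

The sign reversal is only a phase change of the mode, so no physically new pattern of mass displacements appears beyond $r = n$.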
Assuming real $a_r$, the normal coordinate $\eta_r$ and corresponding frequency $\omega_r$ are
$$\eta_r = a_r e^{i\omega_r t} \qquad (14.100)$$
Equations 14.97 and 14.99 give the angular frequency and displacement. Note that superposition applies
since this system is linear. Therefore the most general solution for each polarization can be any superposition
of the form
$$q_j(t) = \sum_{r=1}^{n} a_r e^{i\omega_r t} \sin\left[\frac{jr\pi}{n+1}\right] \qquad (14.101)$$
392 CHAPTER 14. COUPLED LINEAR OSCILLATORS
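where the distance along the chain $x = jd$, that is, it is quantized in units of the cell spacing $d$, with $j$ being
an integer. The positive sign in the exponent corresponds to a wave travelling in the $-x$ direction while
the negative sign corresponds to a wave travelling in the $+x$ direction. The velocity of a fixed phase of the
travelling wave must satisfy that $\omega t \pm kx$ is a constant. This will occur if the phase velocity of the wave is
given by
$$v_{phase} = \frac{dx}{dt} = \frac{\omega}{k} \qquad (14.103)$$
The wave has a frequency $\nu = \frac{\omega}{2\pi}$ and wavelength $\lambda = \frac{2\pi}{k}$, thus the phase velocity $v_{phase} = \frac{\omega}{k} = \nu\lambda$.
Inserting the travelling wave 14.102 into the transverse equation of motion 14.84 for the discrete lattice
chain gives the dispersion relation $\omega = 2\omega_0\left|\sin\frac{kd}{2}\right|$, which has the same form as equation 14.97.
The lattice chain therefore has a phase velocity for the wave given by
$$v_{phase} = \frac{\omega}{k} = \omega_0 d\,\frac{\left|\sin\frac{kd}{2}\right|}{\frac{kd}{2}} \qquad (14.110)$$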
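The dispersion of the lattice chain can be made concrete with a short numerical sketch. It assumes the dispersion relation $\omega = 2\omega_0\left|\sin\frac{kd}{2}\right|$ implied by the phase velocity above, and it also evaluates the group velocity using the standard definition $v_{group} = \frac{d\omega}{dk}$, which is an assumption added here. The function names and the unit values of $d$ and $\omega_0$ are illustrative.

```python
import math

def omega(k, d=1.0, omega0=1.0):
    """Dispersion relation of the discrete lattice chain."""
    return 2.0 * omega0 * abs(math.sin(0.5 * k * d))

def phase_velocity(k, d=1.0, omega0=1.0):
    """v_phase = omega / k."""
    return omega(k, d, omega0) / k

def group_velocity(k, d=1.0, omega0=1.0, h=1e-6):
    """v_group = d(omega)/dk, estimated by a central difference."""
    return (omega(k + h, d, omega0) - omega(k - h, d, omega0)) / (2 * h)

d, omega0 = 1.0, 1.0
v_long = phase_velocity(1e-3, d, omega0)             # long wavelengths: close to omega0*d
v_edge = phase_velocity(math.pi / d, d, omega0)      # k*d = pi: equals 2*omega0*d/pi
vg_edge = group_velocity(math.pi / d - 1e-3, d, omega0)  # tends to zero near k*d = pi
```

In the long-wavelength limit the chain is non-dispersive with $v_{phase} \approx \omega_0 d$, whereas near $kd = \pi$ the phase velocity drops to $\frac{2}{\pi}\omega_0 d$ and the group velocity tends to zero.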
$$\omega = 2\omega_0 \cosh\frac{\Gamma d}{2} \qquad (14.116)$$
which increases with $\Gamma$. Thus, when $\omega > 2\omega_0$ then the amplitude of the wave is of the form
For this special case, it was shown in chapter 10 that the Lagrange equations can be written in terms of the
Rayleigh dissipation function $\mathcal{R}$ as
$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} + \frac{\partial \mathcal{R}}{\partial \dot{q}_j} = Q_j \qquad (14.120)$$
where $Q_j$ are generalized forces acting on the system that are not absorbed into the potential $U$. Using
equations 14.43, 14.44 and 14.120 allows the equations of motion for damped coupled linear oscillators to
be written in a matrix form as
{T} q̈ + {C} q̇ + {V} q = {Q} (14.121)
where the symmetric matrices {T}, {C} and {V} are positive definite for positive definite systems. Rayleigh
pointed out that in the special case where the damping matrix {C} is a linear combination of the {T} and
{V} matrices, then the matrix {C} is diagonalized by the same transformation to normal coordinates that
diagonalizes {T} and {V}, leading to a separation of the damped system into normal
modes. As discussed in chapter 4, many systems in nature are linear for small amplitude oscillations, allowing
use of the Rayleigh dissipation function which provides an analytic solution. However, in general, except for
when {C} is small, this separation into normal modes is not possible for damped systems and the solutions
must be obtained numerically.
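Rayleigh's observation can be illustrated for two identical coupled oscillators. The sketch below uses hypothetical values for the mass, spring constants and the coefficients of the linear combination {C} = α{T} + β{V}; none of these numbers come from the text. It transforms {C} with the symmetric and antisymmetric normal modes, and compares the result with a damper acting on one mass only, which is not a linear combination of {T} and {V}.

```python
import math

def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def to_modal(matrix, modes):
    """Congruence transform modes^T * matrix * modes."""
    return matmul(transpose(modes), matmul(matrix, modes))

m, k, k12 = 1.0, 1.0, 0.5        # hypothetical mass and spring constants
alpha, beta = 0.1, 0.05          # hypothetical Rayleigh coefficients
T = [[m, 0.0], [0.0, m]]
V = [[k + k12, -k12], [-k12, k + k12]]
s = 1.0 / math.sqrt(2.0)
A = [[s, s], [s, -s]]            # columns: symmetric and antisymmetric modes

C_rayleigh = [[alpha * T[i][j] + beta * V[i][j] for j in range(2)]
              for i in range(2)]
C_single = [[0.2, 0.0], [0.0, 0.0]]   # damper on mass 1 only

modal_rayleigh = to_modal(C_rayleigh, A)   # diagonal: the modes decouple
modal_single = to_modal(C_single, A)       # off-diagonal terms couple the modes
```

For the proportional damping matrix the off-diagonal elements vanish, so each normal mode behaves as an independent damped oscillator. For the single damper the off-diagonal elements are non-zero and the modes remain coupled.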
The following example illustrates approaches used to handle linearly-damped coupled-oscillator systems.
Figure 14.12: Kuramoto model of collective synchronization of coupled oscillators. The left and center
plots show the time and coupling strength dependence of the order parameter $r$. The right plot shows the
frequency dependence including coupling (solid line) and without coupling (dashed line).
where $i = 1, 2, \ldots, N$. Kuramoto recognized that mean-field coupling was the most tractable system to solve,
that is, a system where the coupling is applicable equally to all the oscillators. Moreover, he assumed an
equally-weighted, pure sinusoidal coupling for the coupling term $\Gamma_{ij}(\theta_j - \theta_i)$ between the coupled oscillators.
That is, he assumed
$$\Gamma_{ij}(\theta_j - \theta_i) = \frac{K}{N}\sin(\theta_j - \theta_i) \qquad (14.124)$$
where $K \geq 0$ is the coupling strength, and the factor $\frac{1}{N}$ ensures that the model is well behaved as $N \to \infty$.
Kuramoto assumed that the frequency distribution $g(\omega)$ was unimodal and symmetric about the mean
frequency Ω, that is $g(\Omega + \omega) = g(\Omega - \omega)$.
This problem can be simplified by exploiting the rotational symmetry and transforming to a frame of
reference that is rotating at an angular frequency Ω. That is, use the transformation $\theta_i \to \theta_i - \Omega t$ where
$\theta_i$ is then measured in the rotating frame. This makes $g(\omega)$ unimodal with a symmetric frequency distribution
about $\omega = 0$. The phase velocity in this rotating frame is
$$\dot{\theta}_i = \omega_i + \frac{K}{N}\sum_{j=1}^{N}\sin(\theta_j - \theta_i) \qquad (14.125)$$
Kuramoto observed that the phase-space distribution can be expressed in terms of the order parameters
$r, \psi$ in that equation 14.122 can be multiplied on both sides by $e^{-i\theta_i}$ to give
$$re^{i(\psi - \theta_i)} = \frac{1}{N}\sum_{j=1}^{N} e^{i(\theta_j - \theta_i)} \qquad (14.126)$$
angular frequency Ω and co-rotate with average phase $\psi(t)$, whereas those frequencies lying further from
the center continue to rotate independently at their natural frequencies and drift relative to the coherent
cluster frequency Ω. As a consequence this mixed state is only partially synchronized as illustrated on the
right side of figure 14.12. The synchronized fraction has a δ-function behavior for the frequency distribution
which grows in intensity with further increase in $K$. The unsynchronized component has nearly the original
frequency distribution $g(\omega)$ except that it is depleted in the region of the locked frequency due to strength
absorbed by the δ-function component.
Kuramoto’s toy model nicely illustrates the essential features of the evolution of collective synchronization
with coupling strength. It has been applied to the study of neuronal synchronization in the brain [Cum07]. The
model illustrates that the collective synchronization of coupled oscillators leads to a component that has a
single frequency for correlated motion which can be much narrower than the inherent frequency distribution
of the ensemble of coupled oscillators.
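The onset of synchronization can be explored with a small simulation. The following is a minimal sketch, not Kuramoto's analytic treatment: it assumes a Gaussian frequency distribution with unit width in the rotating frame, simple Euler integration, and $N = 200$ oscillators, all of which are illustrative choices. It uses the mean-field identity $\frac{K}{N}\sum_j \sin(\theta_j - \theta_i) = Kr\sin(\psi - \theta_i)$, which follows from taking the imaginary part of the order-parameter relation.

```python
import cmath
import math
import random

def order_parameter(phases):
    """Return (r, psi) defined by r*exp(i*psi) = (1/N) * sum_j exp(i*theta_j)."""
    z = sum(cmath.exp(1j * th) for th in phases) / len(phases)
    return abs(z), cmath.phase(z)

def simulate(K, n_osc=200, t_end=30.0, dt=0.05, seed=1):
    """Euler integration of the Kuramoto model in the rotating frame.

    Returns the order parameter r averaged over the final third of the run.
    """
    rng = random.Random(seed)
    omegas = [rng.gauss(0.0, 1.0) for _ in range(n_osc)]   # assumed Gaussian g(omega)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_osc)]
    steps = round(t_end / dt)
    samples = []
    for step in range(steps):
        r, psi = order_parameter(phases)
        # Mean-field form of the coupling: K * r * sin(psi - theta_i)
        phases = [th + dt * (w + K * r * math.sin(psi - th))
                  for th, w in zip(phases, omegas)]
        if step >= 2 * steps // 3:
            samples.append(r)
    return sum(samples) / len(samples)

r_uncoupled = simulate(K=0.0)   # incoherent: r stays small, of order 1/sqrt(N)
r_coupled = simulate(K=4.0)     # well above threshold: r approaches unity
```

Scanning $K$ between these two values would trace out the growth of the order parameter with coupling strength that is sketched in the center panel of figure 14.12.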
Figure 14.13: Collective rotational bands in the nucleus $^{238}$U excited by Coulomb excitation. [Sim98]
analogous to the correlated flow of individual water molecules in a tidal wave. The weaker octupole term in
the residual interaction leads to an octupole [pear-shaped] coupled oscillator coherent state lying slightly above
the quadrupole coherent state. In contrast to the rotational motion of strongly-deformed quadrupole-deformed
nuclei, the octupole deformation exhibits more vibrational-like properties than rotational motion of a charged
tidal wave. The observed large increase in moment of inertia at higher rotational frequencies, shown in the
insert, is due to the Coriolis force aligning the individual valence nucleons along the rotational axis. Thus,
although the nucleus $^{238}$U is the epitome of a complicated many-body quantal system, it is apparent that
basic classical mechanics of coupled oscillators, and rotation, underlie the physics phenomena exhibited by
synchronized collective motion in the nuclear many-body system.
The close correspondence between classical mechanics predictions and the excitation phenomena
observed for the $^{238}$U nucleus is surprising for a system that is the epitome of a many-body quantal fluid.
The following list identifies other manifestations of classical mechanics discussed in this book that were
exploited for study of such correlated motion of many-body nuclear systems.
1. Coincident detection of the excited nuclei recoiling in vacuum was used to identify the exact scattering
angles, plus recoil velocities, of the scattered nuclei. This specifies the hyperbolic Rutherford
trajectory for each scattered nucleus, the nuclear masses, and their recoil velocities. The deexcitation
γ-rays emitted in flight by each recoiling nucleus were detected in coincidence with the scattered
nuclei. Knowledge of the recoil velocities and scattering angles enabled correction for the Doppler shift
in energy of each detected coincident γ-ray to enhance the experimental energy resolution achieved by
the γ-ray detectors.
2. The transition energies and angular distribution of the deexcitation γ-rays determined the energies,
spins, and parities of the excited states in $^{235}$U.
3. The measured yields of the coincident deexcitation γ-rays determined the excitation cross section as a
function of the nuclear scattering angle.
4. A full quantal calculation for this system is beyond the capabilities of modern computers since the
experiment involves excitation of ∼ 100 excited levels, coupled by ∼ 1000 electromagnetic matrix
elements, and the scattering involves inclusion of thousands of partial waves due to the long range of the
Coulomb potential for the heavy mass of the scattered nuclei. Therefore a semi-classical approximation
is used for the quantal calculation of the electromagnetic excitation cross sections as a function of time
as the scattered nuclei traverse Rutherford’s hyperbolic Coulomb scattering trajectory for each scattered
nucleus.
5. The measured cross sections for the deexcitation γ-rays are compared with the predicted cross sections
to determine the ∼ 1000 electromagnetic matrix elements connecting the states in $^{235}$U.
6. The electromagnetic matrix elements have been measured in the laboratory frame of reference.
Much more insight into the collective motion in $^{235}$U is obtained by transforming the electromagnetic
matrix elements into the body-fixed frame of reference for this rotating deformed body. Rotational
invariants, described in chapter 13.16, are used to derive the electromagnetic properties in the rotating
body-fixed frame of reference which unambiguously determines the electromagnetic shape for each excited
nuclear state observed in $^{235}$U.
7. Hamiltonian mechanics, based on the Routhian, is used to make theoretical model calculations
of the nuclear structure of $^{235}$U in the rotating body-fixed frame for comparison with the experimental
data derived from this experiment.
This experiment illustrates that classical mechanics plays a key role in all aspects of the study of the
nuclear structure of the many-body nuclear quantal system.
14.13 Summary
This chapter has focussed on many-body coupled linear oscillator systems which are a ubiquitous feature in
nature. The main conclusions are summarized as follows.
Normal modes: It was shown that coupled linear oscillators exhibit normal modes and normal coordinates
that correspond to independent modes of oscillation with characteristic eigenfrequencies $\omega_r$.
General analytic theory for coupled linear oscillators: Lagrangian mechanics was used to derive the
general analytic procedure for solution of the many-body coupled oscillator problem which reduces to the
conventional eigenvalue problem. A summary of the procedure for solving coupled oscillator problems is as
follows:
1) Choose generalized coordinates $q_j$ and evaluate $T$ and $U$.
$$T = \frac{1}{2}\sum_{j,k} T_{jk}\dot{q}_j\dot{q}_k \qquad (14.41)$$
and
$$U = \frac{1}{2}\sum_{j,k} V_{jk} q_j q_k \qquad (14.42)$$
and
$$V_{jk} \equiv \left(\frac{\partial^2 U}{\partial q_j \partial q_k}\right)_0 \qquad (14.44)$$
4) From the initial conditions determine the complex scale factors where
5) Determine the normal coordinates $\eta_r$, where each $\eta_r$ is a normal mode. The normal coordinates can be
expressed as
$$\boldsymbol{\eta} = \{\mathbf{a}\}^{-1}\mathbf{q} \qquad (14.61)$$
Few-body coupled oscillator systems: The general analytic theory was used to determine the solutions
for parallel and series couplings of two and three linear oscillators. The phenomena observed include degenerate
and non-degenerate eigenvalues and spurious center-of-mass oscillatory modes. There are two broad
classifications for three or more coupled oscillators, that is, either complete coupling of all oscillators, or
coupling of the nearest-neighbor oscillators. It is observed that the eigenvalue corresponding to the most
coherent motion of the coupled oscillators corresponds to the most collective motion and its eigenvalue is displaced
the most in energy from the remaining eigenvalues. For some systems this coherent collective mode
corresponded to a center-of-mass motion with no internal excitation of the other modes, while the other
eigenvalues corresponded to modes with internal excitation of the oscillators such that the center of mass
is stationary. The above procedure has been applied to two classifications of coupling: complete coupling of
many oscillators, and nearest neighbor coupling. Both degenerate and spurious center-of-mass modes were
observed. Strong collective shape degrees of freedom in nuclei are examples of complete coupling due to the
weak residual interactions between nucleons in the nucleus. It was seen that, for many coupled oscillators,
one coherent state separates from the other states and this coherent state carries the bulk of the collective
strength.
Discrete lattice chain: Transverse and longitudinal modes of motion on the discrete lattice chain were discussed
because of the important role the chain plays in nature, such as in crystalline lattice structures. Both normal
modes and travelling waves were discussed including the phenomena of dispersion and cut-off frequencies.
Molecules and the crystalline lattice chains are examples where nearest neighbor coupling is manifest. It
was shown that, for the $n$-oscillator discrete lattice chain, there are only $n$ independent longitudinal modes
plus $2n$ modes for the two transverse polarizations, and that the angular frequency $\omega_r \leq 2\omega_0$, that is, a cut-off
frequency exists.
Damped coupled linear oscillators: It was shown that linearly-damped coupled oscillator systems can
be solved analytically using the concept of the Rayleigh dissipation function.
Collective synchronization of coupled oscillators: The Kuramoto schematic phase model was used
to illustrate how weak residual forces can cause collective synchronization of the motion of many coupled
oscillators. This is applicable to biological systems as well as mechanical systems.
Workshop exercises
1. Consider two masses (each of mass $m$) connected by a spring to each other and by springs to fixed positions.
Motion is only allowed along one dimension. (This is exactly the same system that is discussed in chapter
14.2 on coupled oscillations.) Let each of the two oscillator springs have a force constant $\kappa$ and let the force
constant of the coupling spring be $\kappa_{12}$. Let $x_1$ and $x_2$ be the coordinates as described in the textbook.
(a) Draw a picture of the two masses displaced by a small amount. Using the picture, try to make sense of
the equations of motion as given in the text:
(b) Each of the trial solutions is written in the form . Why are the trial solutions written this way?
Are there any other ways to write the trial solution?
(c) For a nontrivial solution to exist for the pair of simultaneous equations resulting from the substitution of
the trial solution, the determinant of the coefficients of 1 and 2 must vanish. Why must this be the
case? Is a similar statement true when considering three masses? What about masses?
(d) Suppose you had the actual two-mass system sitting in front of you. How could you create antisymmetric
motion? How could you create symmetric motion? Can you describe each of these motions using a set of
suitable initial conditions?
2. Two particles, each with mass $m$, move in one dimension in a region near a local minimum of the potential
energy where the potential energy is approximately given by
$$U = \frac{1}{2}k\left(7x_1^2 + 4x_2^2 + 4x_1x_2\right)$$
where $k$ is a constant.
5. A mechanical analog of the benzene molecule comprises a discrete lattice chain of 6 point masses connected
in a plane hexagonal ring by 6 identical springs each with spring constant and length .
a) List the wave numbers of the allowed undamped longitudinal standing waves.
b) Calculate the phase velocity and group velocity for longitudinal travelling waves on the ring.
c) Determine the time dependence of a longitudinal standing wave for a angular frequency = 2 , that
is, twice the cut-off frequency.
such that = 2 ,
(a) Determine the eigenfrequencies and normal coordinates.
(b) Choose a set of initial conditions such that the system oscillates at its highest eigenfrequency.
(c) Determine the solutions 1 () and 2 ().
Problems
1. Four identical masses are connected by four identical springs, spring constant and constrained to move
on a frictionless circle of radius as shown on the left in the figure.
a) How many normal modes of small oscillation are there?
b) What are the eigenfrequencies of the small oscillations?
c) Describe the motion of the four masses for each eigenfrequency.
2. Consider the two identical coupled oscillators given on the right in the figure assuming 1 = 2 = . Let both
oscillators be linearly damped with a damping constant . A force = 0 cos() is applied to mass 1 .
Write down the pair of coupled differential equations that describe the motion. Obtain a solution by expressing
the differential equations in terms of the normal coordinates. Show that the normal coordinates 1 and 2
exhibit resonance peaks at the characteristic frequencies 1 and 2 respectively.
3. As shown on the left below the mass $M$ moves horizontally along a frictionless rail. A pendulum is hung from
$M$ with a weightless rod of length $l$ with a mass $m$ at its end.
a) Prove that the eigenfrequencies are
$$\omega_1 = 0 \qquad \omega_2 = \sqrt{\frac{g}{Ml}(M + m)}$$
Chapter 15
15.1 Introduction
This study of classical mechanics has involved climbing a vast mountain of knowledge, while the pathway to
the top has led us to elegant and beautiful theories that underlie much of modern physics. Being so close to
the summit provides the opportunity to take a few extra steps in order to provide a glimpse of applications
to physics at the summit. These are described in chapters 15 to 18.
Hamilton’s development of Hamiltonian mechanics in 1834 is the crowning achievement for applying variational
principles to classical mechanics. A fundamental advantage of Hamiltonian mechanics is that it uses
the conjugate coordinates $\mathbf{q}, \mathbf{p}$ plus time $t$, which is a considerable advantage in most branches of physics
and engineering. Compared to Lagrangian mechanics, Hamiltonian mechanics has a significantly broader
arsenal of powerful techniques that can be exploited to obtain an analytical solution of the integrals of the
motion for complicated systems. In addition, Hamiltonian dynamics provides a means of determining the
unknown variables for which the solution assumes a soluble form, and is ideal for study of the fundamental
underlying physics in applications to fields such as quantum or statistical physics. As a consequence, Hamiltonian
mechanics has become the preeminent variational approach used in modern physics. This chapter
introduces the following four techniques in Hamiltonian mechanics: (1) the elegant Poisson bracket representation
of Hamiltonian mechanics, which played a pivotal role in the development of quantum theory; (2)
the powerful Hamilton-Jacobi theory coupled with Jacobi’s development of canonical transformation theory;
(3) action-angle variable theory; and (4) canonical perturbation theory.
Prior to further development of the theory of Hamiltonian mechanics, it is useful to summarize the major
formulae relevant to Hamiltonian mechanics that have been presented in chapters 7, 8 and 9.
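Action functional $S$:
As discussed in chapter 9.2, Hamiltonian mechanics is built upon Hamilton’s action functional
$$S(\mathbf{q},\mathbf{p},t) = \int_{t_1}^{t_2} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt \qquad (15.1)$$
Generalized momentum $p$:
In chapter 7.2, the generalized (canonical) momentum was defined in terms of the Lagrangian $L$ to be
$$p_j \equiv \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial \dot{q}_j} \qquad (15.3)$$
Chapter 9.2 defined the generalized momentum in terms of the action functional $S$ to be
$$p_j = \frac{\partial S(\mathbf{q},\mathbf{p},t)}{\partial q_j} \qquad (15.4)$$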
404 CHAPTER 15. ADVANCED HAMILTONIAN MECHANICS
Hamiltonian function:
The Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ was defined in terms of the generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ plus the generalized
momentum. That is
$$H(\mathbf{q},\mathbf{p},t) \equiv h(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j p_j\dot{q}_j - L(\mathbf{q},\dot{\mathbf{q}},t) = \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q},\dot{\mathbf{q}},t) \qquad (15.6)$$
where $\mathbf{p}, \mathbf{q}$ correspond to $n$-dimensional vectors, e.g. $\mathbf{q} \equiv (q_1, q_2, \ldots, q_n)$, and the scalar product $\mathbf{p}\cdot\dot{\mathbf{q}} = \sum_j p_j\dot{q}_j$.
Chapter 8.2 used a Legendre transformation to derive this relation between the Hamiltonian and Lagrangian
functions. Note that whereas the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$ is expressed in terms of the coordinates $\mathbf{q}$ plus
conjugate velocities $\dot{\mathbf{q}}$, the Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ is expressed in terms of the coordinates $\mathbf{q}$ plus their
conjugate momenta $\mathbf{p}$. For scleronomic systems, plus assuming the standard Lagrangian, equations
7.44 and 7.29 give that the Hamiltonian simplifies to equal the total mechanical energy, that is, $H = T + U$.
Generalized energy theorem:
The equations of motion lead to the generalized energy theorem which states that the time dependence
of the Hamiltonian is related to the time dependence of the Lagrangian.
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] - \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \qquad (15.7)$$
Note that if all the generalized non-potential forces and Lagrange multiplier terms are zero, and if the
Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion.
Hamilton’s equations of motion:
Chapter 8.3 showed that a Legendre transform plus the Lagrange-Euler equations led to Hamilton’s
equations of motion. Hamilton derived these equations of motion directly from the action functional, as
shown in chapter 9.2.
$$\dot{q}_j = \frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial p_j} \qquad (15.8)$$
$$\dot{p}_j = -\frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial q_j} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j} + Q_j^{EXC}\right] \qquad (15.9)$$
$$\frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial t} = -\frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \qquad (15.10)$$
Note the symmetry of Hamilton’s two canonical equations. The canonical variables $p_k, q_k$ are treated
as independent canonical variables. Lagrange was the first to derive the canonical equations but he did not
recognize them as a basic set of equations of motion. Hamilton derived the canonical equations of motion
from his fundamental variational principle and made them the basis for a far-reaching theory of dynamics.
Hamilton’s equations give $2n$ first-order differential equations for $p_k, q_k$, two for each of the $n$ degrees of freedom.
Lagrange’s equations give $n$ second-order differential equations for the variables $q_k, \dot{q}_k$.
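The first-order structure of Hamilton’s equations makes them convenient for step-by-step numerical integration. The sketch below is an illustration only: it assumes a one-dimensional harmonic oscillator with $H = \frac{p^2}{2m} + \frac{1}{2}kq^2$, unit mass and spring constant, no constraint or non-potential forces, and a leapfrog scheme chosen for its good energy behavior. None of these choices come from the text.

```python
import math

def hamiltonian(q, p, m=1.0, k=1.0):
    """H = p^2/(2m) + k q^2/2 for the assumed harmonic oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

def leapfrog(q, p, dt, n_steps, m=1.0, k=1.0):
    """Integrate q_dot = dH/dp = p/m and p_dot = -dH/dq = -k q."""
    for _ in range(n_steps):
        p -= 0.5 * dt * k * q      # half step in p
        q += dt * p / m            # full step in q
        p -= 0.5 * dt * k * q      # half step in p
    return q, p

q0, p0 = 1.0, 0.0
period = 2.0 * math.pi             # 2*pi*sqrt(m/k) with m = k = 1
n_steps = 10000
q1, p1 = leapfrog(q0, p0, period / n_steps, n_steps)
energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
```

After one period the phase-space point returns close to its starting value and the Hamiltonian is unchanged to high accuracy, as expected for a time-independent Lagrangian with no non-potential forces.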
Hamilton-Jacobi equation:
Hamilton used Hamilton’s Principle to derive the Hamilton-Jacobi equation.
$$\frac{\partial S}{\partial t} + H(\mathbf{q},\mathbf{p},t) = 0 \qquad (15.11)$$
The solution of Hamilton’s equations is trivial if the Hamiltonian is a constant of motion, or when a set
of generalized coordinates can be identified for which all the coordinates $q_k$ are constant, or are cyclic (also
called ignorable coordinates). Jacobi developed the mathematical framework of canonical transformations
required to exploit the Hamilton-Jacobi equation.
15.2. POISSON BRACKET REPRESENTATION OF HAMILTONIAN MECHANICS 405
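Note that the above definition of the Poisson bracket leads to the following identity, antisymmetry, linearity,
Leibniz rules, and Jacobi identity.
$$[F, F] = 0 \qquad (15.13)$$
$$[F, G] = -[G, F] \qquad (15.14)$$
$$[G, F + K] = [G, F] + [G, K] \qquad (15.15)$$
$$[G, FK] = [G, F]K + F[G, K] \qquad (15.16)$$
$$0 = [F, [G, K]] + [G, [K, F]] + [K, [F, G]] \qquad (15.17)$$
where $F$, $G$, and $K$ are functions of the canonical variables plus time. Jacobi’s identity (15.17) states that
the sum of the cyclic permutation of the double Poisson brackets of three functions is zero. Jacobi’s identity
plays a useful role in Hamiltonian mechanics as will be shown.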
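These algebraic properties can be spot-checked numerically. The sketch below assumes the usual definition of the Poisson bracket for one degree of freedom, $[F, G] = \frac{\partial F}{\partial q}\frac{\partial G}{\partial p} - \frac{\partial F}{\partial p}\frac{\partial G}{\partial q}$, and evaluates the derivatives by central differences. The three test functions and the phase-space point are hypothetical choices made only for illustration.

```python
def poisson_bracket(f, g, h=1e-4):
    """Return the function [f, g] = df/dq dg/dp - df/dp dg/dq (central differences)."""
    def bracket(q, p):
        df_dq = (f(q + h, p) - f(q - h, p)) / (2 * h)
        df_dp = (f(q, p + h) - f(q, p - h)) / (2 * h)
        dg_dq = (g(q + h, p) - g(q - h, p)) / (2 * h)
        dg_dp = (g(q, p + h) - g(q, p - h)) / (2 * h)
        return df_dq * dg_dp - df_dp * dg_dq
    return bracket

# Hypothetical test functions of one coordinate q and its conjugate momentum p.
def F(q, p):
    return q * q * p

def G(q, p):
    return q + p * p

def K(q, p):
    return q * p

q, p = 0.7, -0.4
# Antisymmetry: [F, G] + [G, F] should vanish.
antisymmetry = poisson_bracket(F, G)(q, p) + poisson_bracket(G, F)(q, p)
# Fundamental bracket of the conjugate pair: [q, p] should equal 1.
fundamental = poisson_bracket(lambda q, p: q, lambda q, p: p)(q, p)
# Jacobi identity: the cyclic sum of double brackets should vanish.
jacobi = (poisson_bracket(F, poisson_bracket(G, K))(q, p)
          + poisson_bracket(G, poisson_bracket(K, F))(q, p)
          + poisson_bracket(K, poisson_bracket(F, G))(q, p))
```

A numerical check of this kind does not prove the identities, but it is a quick way to catch sign errors when evaluating brackets by hand.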
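Note that the Poisson bracket is antisymmetric under interchange of $p$ and $q$. It is interesting that the only
non-zero fundamental Poisson bracket is for conjugate variables where $j = k$, that is $[q_j, p_k] = \delta_{jk}$.
Let $F = Q_k$ and replace $G$ by $F$, and use the fact that the fundamental Poisson brackets $[Q_k, Q_j] = 0$
and $[Q_k, P_j] = \delta_{kj}$, then equation 15.25 reduces to
$$[Q_k, F] = \sum_j\left(\frac{\partial F}{\partial Q_j}[Q_k, Q_j] + \frac{\partial F}{\partial P_j}[Q_k, P_j]\right) = \sum_j \frac{\partial F}{\partial P_j}\delta_{kj} \qquad (15.28)$$
That is
$$[F, Q_k] = -\frac{\partial F}{\partial P_k} \qquad (15.29)$$
Similarly
$$[P_k, F] = \sum_j\left(\frac{\partial F}{\partial Q_j}[P_k, Q_j] + \frac{\partial F}{\partial P_j}[P_k, P_j]\right) \qquad (15.30)$$
leading to
$$[F, P_k] = \frac{\partial F}{\partial Q_k} \qquad (15.31)$$
Substituting equations (15.29) and (15.31) into equation (15.27) gives
$$[F, G]_{qp} = \sum_k\left(\frac{\partial F}{\partial Q_k}\frac{\partial G}{\partial P_k} - \frac{\partial F}{\partial P_k}\frac{\partial G}{\partial Q_k}\right) = [F, G]_{QP} \qquad (15.32)$$
Thus the canonical variable subscripts $(q, p)$ and $(Q, P)$ can be ignored since the Poisson bracket is
invariant to any canonical transformation of canonical variables. The converse is that if the Poisson
bracket is independent of the transformation, then the transformation is canonical.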
Since it has been shown that this transformation is canonical, it is possible to go further and determine
the function that generates this transformation. Solving the transformation equations for $q$ and $P$ gives
$$q = \left(e^Q - 1\right)^2 \sec^2 p \qquad P = 2e^Q\left(e^Q - 1\right)\tan p$$
Since the transformation is canonical, there exists a generating function $F_3(p, Q)$ such that
$$q = -\frac{\partial F_3}{\partial p} \qquad P = -\frac{\partial F_3}{\partial Q}$$
Integrating either of these relations gives the same function
$$F_3(p, Q) = -\int q\,dp = -\int P\,dQ = -\left(e^Q - 1\right)^2\tan p$$
This example illustrates how to determine a useful generating function and prove that the transformation is
canonical.
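The canonical character of such a transformation, and the generating function, can be verified numerically. The sketch below assumes that the transformation of this example is $Q = \ln(1 + \sqrt{q}\cos p)$, $P = 2(1 + \sqrt{q}\cos p)\sqrt{q}\sin p$, which is the inverse of the relations $q = (e^Q - 1)^2\sec^2 p$ and $P = 2e^Q(e^Q - 1)\tan p$, and that $F_3(p, Q) = -(e^Q - 1)^2\tan p$. The test point and the helper names are arbitrary.

```python
import math

def Q(q, p):
    """Assumed new coordinate Q(q, p)."""
    return math.log(1.0 + math.sqrt(q) * math.cos(p))

def P(q, p):
    """Assumed new momentum P(q, p)."""
    return 2.0 * (1.0 + math.sqrt(q) * math.cos(p)) * math.sqrt(q) * math.sin(p)

def bracket_QP(q, p, h=1e-5):
    """[Q, P] evaluated with respect to (q, p) by central differences."""
    dQ_dq = (Q(q + h, p) - Q(q - h, p)) / (2 * h)
    dQ_dp = (Q(q, p + h) - Q(q, p - h)) / (2 * h)
    dP_dq = (P(q + h, p) - P(q - h, p)) / (2 * h)
    dP_dp = (P(q, p + h) - P(q, p - h)) / (2 * h)
    return dQ_dq * dP_dp - dQ_dp * dP_dq

def F3(p, Q_value):
    """Assumed generating function F3(p, Q) = -(exp(Q) - 1)^2 tan(p)."""
    return -(math.exp(Q_value) - 1.0) ** 2 * math.tan(p)

q0, p0 = 0.8, 0.3
bracket = bracket_QP(q0, p0)                 # should equal 1 if canonical

h = 1e-5
Q0 = Q(q0, p0)
q_from_F3 = -(F3(p0 + h, Q0) - F3(p0 - h, Q0)) / (2 * h)   # q = -dF3/dp
P_from_F3 = -(F3(p0, Q0 + h) - F3(p0, Q0 - h)) / (2 * h)   # P = -dF3/dQ
```

The fundamental bracket evaluates to unity and the two derivatives of $F_3$ reproduce $q$ and $P$, which is consistent with the invariance of the Poisson bracket under canonical transformations.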
These two Poisson Brackets for three functions can be used to derive the Poisson Bracket of four functions,
taken in pairs. This can be accomplished two ways using either equation 15.33 or 15.34.
These two alternate derivations give different relations for the same Poisson Bracket. Equating the alternative
equations 15.35 and 15.36 gives that
$$[F_1, G_1](F_2G_2 - G_2F_2) = (F_1G_1 - G_1F_1)[F_2, G_2]$$
This can be factored into separate relations, the left-hand side for body 1 and the right-hand side for body
2.
$$\frac{(F_1G_1 - G_1F_1)}{[F_1, G_1]} = \frac{(F_2G_2 - G_2F_2)}{[F_2, G_2]} = \lambda \qquad (15.37)$$
Since the left-hand ratio holds for $F_1, G_1$ independent of $F_2, G_2$, and vice versa, then they must equal
a constant $\lambda$ that does not depend on $F_1, G_1$, does not depend on $F_2, G_2$, and must commute with
$(F_1G_1 - G_1F_1)$. That is, $\lambda$ must be a constant number independent of these variables.
$$(F_1G_1 - G_1F_1) = \lambda[F_1, G_1] \equiv \lambda\sum_i\left(\frac{\partial F_1}{\partial q_i}\frac{\partial G_1}{\partial p_i} - \frac{\partial F_1}{\partial p_i}\frac{\partial G_1}{\partial q_i}\right) \qquad (15.38)$$
Equation 15.38 is an especially important result which states that, to within a multiplicative constant number
$\lambda$, there is a one-to-one correspondence between the Poisson Bracket and the commutator of two independent
functions. An important implication is that if two functions $F, G$ have a Poisson Bracket that is zero, then
the commutator of the two functions also must be zero, that is, $F$ and $G$ commute.
Consider the special case where the variables $F_1$ and $G_1$ correspond to the fundamental canonical variables
$(q_k, p_k)$. Then the commutators of the fundamental canonical variables are given by
$$q_jp_k - p_kq_j = \lambda[q_j, p_k] = \lambda\delta_{jk} \qquad (15.39)$$
$$q_jq_k - q_kq_j = \lambda[q_j, q_k] = 0 \qquad (15.40)$$
$$p_jp_k - p_kp_j = \lambda[p_j, p_k] = 0 \qquad (15.41)$$
In 1925, Paul Dirac, a 23-year old graduate student at Cambridge, recognized that the formal correspondence
between the Poisson bracket in classical mechanics, and the corresponding commutator, provides a logical
and consistent way to bridge the chasm between the Hamiltonian formulation of classical mechanics, and
quantum mechanics. He realized that making the assumption that the constant $\lambda \equiv i\hbar$ leads to Heisenberg’s
fundamental commutation relations in quantum mechanics, as is discussed in chapter 18.3.1. Assuming that
$\lambda \equiv i\hbar$ provides a logical and consistent way that builds quantization directly into classical mechanics, rather
than using ad-hoc, case-dependent hypotheses as was done in the older quantum theory of Bohr.
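Time dependence:
The total time differential of a function $G(\mathbf{q},\mathbf{p},t)$ is defined by
$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + \sum_i\left(\frac{\partial G}{\partial q_i}\dot{q}_i + \frac{\partial G}{\partial p_i}\dot{p}_i\right) \qquad (15.42)$$
Inserting Hamilton’s canonical equations, $\dot{q}_i = \frac{\partial H}{\partial p_i}$ and $\dot{p}_i = -\frac{\partial H}{\partial q_i}$, turns the sum into the Poisson bracket of $G$ with the Hamiltonian,
that is
$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + [G, H] \qquad (15.45)$$
This important equation states that the total time derivative of any function $G(\mathbf{q},\mathbf{p},t)$ can be expressed in
terms of the partial time derivative plus the Poisson bracket of $G(\mathbf{q},\mathbf{p},t)$ with the Hamiltonian.
Any observable $G(\mathbf{q},\mathbf{p},t)$ will be a constant of motion if $\frac{dG}{dt} = 0$, and thus equation (15.45) gives
$$\frac{\partial G}{\partial t} + [G, H] = 0 \qquad \text{(If $G$ is a constant of motion)}$$
That is, it is a constant of motion when
$$\frac{\partial G}{\partial t} = [H, G] \qquad (15.46)$$
Moreover, this can be extended further to the statement that if the constant of motion $G$ is not explicitly
time dependent then
$$[G, H] = 0 \qquad (15.47)$$
The Poisson bracket with the Hamiltonian is zero for a constant of motion $G$ that is not explicitly time
dependent. Often it is more useful to turn this statement around with the statement that if $[G, H] = 0$ and
$\frac{\partial G}{\partial t} = 0$ then $\frac{dG}{dt} = 0$, implying that $G$ is a constant of motion.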
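The test $[G, H] = 0$ for a constant of motion is easy to automate. The sketch below is an illustration under stated assumptions: a particle moving in a plane, with the angular momentum $x p_y - y p_x$ taken as the observable, an isotropic harmonic Hamiltonian, and a second Hamiltonian with an added hypothetical coupling term $\eta xy$ that breaks the rotational symmetry. The numerical values are arbitrary.

```python
def poisson_bracket(f, g, state, h=1e-5):
    """[f, g] at state = (x, y, px, py), summed over both degrees of freedom."""
    def partial(func, index):
        up = list(state)
        down = list(state)
        up[index] += h
        down[index] -= h
        return (func(*up) - func(*down)) / (2 * h)
    total = 0.0
    for qi, pi in ((0, 2), (1, 3)):      # index pairs for (x, px) and (y, py)
        total += partial(f, qi) * partial(g, pi) - partial(f, pi) * partial(g, qi)
    return total

def angular_momentum(x, y, px, py):
    return x * py - y * px

def h_central(x, y, px, py, m=1.0, k=1.0):
    """Isotropic oscillator: rotationally symmetric potential."""
    return (px * px + py * py) / (2 * m) + 0.5 * k * (x * x + y * y)

def h_coupled(x, y, px, py, m=1.0, k=1.0, eta=0.3):
    """Same oscillator plus a hypothetical coupling eta*x*y."""
    return h_central(x, y, px, py, m, k) + eta * x * y

state = (0.5, -0.2, 0.1, 0.4)
dL_dt_central = poisson_bracket(angular_momentum, h_central, state)
dL_dt_coupled = poisson_bracket(angular_momentum, h_coupled, state)
```

For the rotationally symmetric Hamiltonian the bracket vanishes, so the angular momentum is conserved. With the coupling term the bracket equals $\eta(y^2 - x^2)$, which is non-zero at a general point, so the angular momentum is no longer a constant of motion.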
Independence
Consider two observables $F(\mathbf{q},\mathbf{p},t)$ and $G(\mathbf{q},\mathbf{p},t)$. The independence of these two observables is determined
by the Poisson bracket
$$[F, G] = -[G, F] \qquad (15.48)$$
If this Poisson bracket is zero, that is, if the two observables $F(\mathbf{q},\mathbf{p},t)$ and $G(\mathbf{q},\mathbf{p},t)$ commute, then their
values are independent and can be measured independently. However, if the Poisson bracket $[F, G] \neq 0$, that
is $F(\mathbf{q},\mathbf{p},t)$ and $G(\mathbf{q},\mathbf{p},t)$ do not commute, then $F$ and $G$ are correlated since interchanging the order of
the Poisson bracket changes the sign, which implies that the measured value for $F$ depends on whether $G$ is
simultaneously measured.
A useful property of Poisson brackets is that if $F$ and $G$ both are constants of motion, then the double
Poisson bracket $[H, [F, G]] = 0$. This can be proved using Jacobi’s identity
$$[F, [G, H]] + [G, [H, F]] + [H, [F, G]] = 0 \qquad (15.49)$$
If $[G, H] = 0$ and $[H, F] = 0$ then $[H, [F, G]] = 0$, that is, the Poisson bracket $[F, G]$ commutes with $H$. Note
that if $F$ and $G$ do not depend explicitly on time, that is
$\frac{\partial F}{\partial t} = \frac{\partial G}{\partial t} = 0$, then combining equations (15.45)
and (15.49) leads to Poisson’s Theorem that relates the total time derivatives.
$$\frac{d}{dt}[F, G] = \left[\frac{dF}{dt}, G\right] + \left[F, \frac{dG}{dt}\right] \qquad (15.50)$$
[ ] = 0
Since $p_\phi$ is not an explicit function of time, $\frac{\partial p_\phi}{\partial t} = 0$, then $\frac{dp_\phi}{dt} = 0$, that is, the angular momentum about
the $z$ axis, $L_z = p_\phi$, is a constant of motion.
The total angular momentum $L^2$ commutes with the Hamiltonian, that is
$$\left[L^2, H\right] = \left[p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}, H\right] = 0$$
Since the total angular momentum $L^2 = p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}$ is not explicitly time dependent, then it also must be
a constant of motion. Note that Noether’s theorem gives that both the angular momenta $L^2$ and $p_\phi$ are
constants of motion. Also since the Poisson brackets are
$$[p_\phi, H] = 0$$
$$\left[L^2, H\right] = 0$$
then Jacobi’s identity, equation 15.17, can be used to imply that
$$\left[\left[L^2, p_\phi\right], H\right] = 0$$
That is, the Poisson bracket $[L^2, p_\phi]$ is a constant of motion. Note that if $L^2$ and $p_\phi$ commute, that is,
$[L^2, p_\phi] = 0$, then they can be measured simultaneously with unlimited accuracy, and this also satisfies that
$[L^2, p_\phi]$ commutes with $H$.
The $(x, y, z)$ components of the angular momentum are given by
$$L_x = \sum_{i=1}^{n}(\mathbf{r}_i \times \mathbf{p}_i)_x = \sum_{i=1}^{n}(y_i p_{z,i} - z_i p_{y,i})$$
$$L_y = \sum_{i=1}^{n}(\mathbf{r}_i \times \mathbf{p}_i)_y = \sum_{i=1}^{n}(z_i p_{x,i} - x_i p_{z,i})$$
$$L_z = \sum_{i=1}^{n}(\mathbf{r}_i \times \mathbf{p}_i)_z = \sum_{i=1}^{n}(x_i p_{y,i} - y_i p_{x,i})$$
$$\dot{q}_k = [q_k, H] = \frac{\partial H}{\partial p_k} \qquad (15.57)$$
$$\dot{p}_k = [p_k, H] = -\frac{\partial H}{\partial q_k} \qquad (15.58)$$
The above shows that the full structure of Hamilton’s equations of motion can be expressed directly in
terms of Poisson brackets.
The elegant formulation of Poisson brackets has the same form in all canonical coordinates as the Hamiltonian
formulation. However, the normal Hamilton canonical equations in classical mechanics assume implicitly
that one can specify the exact position and momentum of a particle simultaneously at any point in time,
which is applicable only to classical mechanics variables that are continuous functions of the coordinates,
and not to quantized systems. The important feature of the Poisson Bracket representation of Hamilton’s
equations is that it generalizes Hamilton’s equations into a form (15.57, 15.58) where the Poisson bracket is
equally consistent with both classical and quantum mechanics in that it allows for non-commuting canonical
variables and Heisenberg’s Uncertainty Principle. Thus the generalization of Hamilton’s equations, via use
of the Poisson brackets, provides one of the most powerful analytic tools applicable to both classical and
quantal dynamics. It played a pivotal role in derivation of quantum theory as described in chapter 18.
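$$H = (\mathbf{p}\cdot\dot{\mathbf{x}}) - L = \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} + q\Phi$$
The Hamilton equations of motion give
$$\dot{\mathbf{x}} = [\mathbf{x}, H] = \frac{(\mathbf{p} - q\mathbf{A})}{m}$$
and
$$\dot{\mathbf{p}} = [\mathbf{p}, H] = -q\nabla\Phi + \frac{q}{m}\left\{(\mathbf{p} - q\mathbf{A}) \times (\nabla \times \mathbf{A})\right\}$$
Define the magnetic field to be
$$\mathbf{B} \equiv \nabla \times \mathbf{A}$$
and the electric field to be
$$\mathbf{E} = -\nabla\Phi - \frac{\partial\mathbf{A}}{\partial t}$$
then the Lorentz force can be written as
$$\mathbf{F} = \dot{\mathbf{p}} = q(\mathbf{E} + \dot{\mathbf{x}} \times \mathbf{B})$$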
conservative system of many identical coupled linear oscillators. Then evaluating the following Poisson
brackets gives
[ ] = 0
[ ] = 0
[ ] = 0
[ ] = 0
[ ] ≠ 0
[ ] ≠ 0
Thus one cannot simultaneously measure the conjugate variables ( ) or ( ). This is the Uncertainty
Principle that is manifested by all forms of wave motion in classical and quantal mechanics as discussed in
chapter 3.11.3.
15.2. POISSON BRACKET REPRESENTATION OF HAMILTONIAN MECHANICS 413
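$$\alpha \equiv \frac{1}{\sqrt{2}}(x + y) \qquad \beta \equiv \frac{1}{\sqrt{2}}(x - y)$$
Expressing the kinetic and potential energies in terms of the new coordinates gives
$$T(\dot\alpha, \dot\beta) = \frac{1}{4}m\left[\left(\dot\alpha + \dot\beta\right)^2 + \left(\dot\alpha - \dot\beta\right)^2\right] = \frac{1}{2}m\left(\dot\alpha^2 + \dot\beta^2\right)$$
$$U = \frac{1}{4}k\left[(\alpha + \beta)^2 + (\alpha - \beta)^2\right] + \frac{1}{2}\eta\left(\alpha^2 - \beta^2\right) = \frac{1}{2}(k + \eta)\alpha^2 + \frac{1}{2}(k - \eta)\beta^2$$
Note that the coordinate transformation makes the Lagrangian separable, that is
$$L = \frac{1}{2}m\left(\dot\alpha^2 + \dot\beta^2\right) - \left[\frac{1}{2}(k + \eta)\alpha^2 + \frac{1}{2}(k - \eta)\beta^2\right] = L_\alpha + L_\beta$$
where
$$L_\alpha = \frac{1}{2}m\dot\alpha^2 - \frac{1}{2}(k + \eta)\alpha^2 \qquad L_\beta = \frac{1}{2}m\dot\beta^2 - \frac{1}{2}(k - \eta)\beta^2$$
This shows that the transformation has separated the system into two normal modes that are harmonic
oscillators with angular frequencies
$$\omega_1 = \sqrt{\frac{k + \eta}{m}} \qquad \omega_2 = \sqrt{\frac{k - \eta}{m}}$$
Note that the non-isotropic harmonic oscillator reduces to the isotropic linear oscillator when $\eta = 0$.
b) HAMILTONIAN: The canonical momenta are given by
$$p_\alpha = \frac{\partial L}{\partial \dot\alpha} = m\dot\alpha$$
$$p_\beta = \frac{\partial L}{\partial \dot\beta} = m\dot\beta$$
The definition of the Hamiltonian gives
$$H = p_\alpha\dot\alpha + p_\beta\dot\beta - L = \frac{1}{2m}\left(p_\alpha^2 + p_\beta^2\right) + \frac{1}{2}(k + \eta)\alpha^2 + \frac{1}{2}(k - \eta)\beta^2$$
Note that this can be separated as
$$H = H_\alpha + H_\beta$$
where
$$H_\alpha = \frac{p_\alpha^2}{2m} + \frac{1}{2}(k + \eta)\alpha^2 \qquad H_\beta = \frac{p_\beta^2}{2m} + \frac{1}{2}(k - \eta)\beta^2$$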
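Using the Poisson bracket expression for the time dependence, equation 15.45, and using the fact that
the Hamiltonian is not explicitly time dependent, that is, $\frac{\partial H_\alpha}{\partial t} = 0$, gives
$$\frac{dH_\alpha}{dt} = \frac{\partial H_\alpha}{\partial t} + [H_\alpha, H] = 0 + [H_\alpha, H_\alpha + H_\beta] = \frac{\partial H_\alpha}{\partial \alpha}\frac{\partial H}{\partial p_\alpha} - \frac{\partial H_\alpha}{\partial p_\alpha}\frac{\partial H}{\partial \alpha}$$
$$= \frac{1}{m}\left(k\alpha p_\alpha + \eta\alpha p_\alpha - k\alpha p_\alpha - \eta\alpha p_\alpha\right) = 0$$
Similarly $\frac{dH_\beta}{dt} = 0$. This implies that the Hamiltonians for both normal modes, $H_\alpha$ and $H_\beta$, are
time-independent constants of motion which are equal to the total energy for each mode.
c) ANGULAR MOMENTUM: The angular momentum for motion in the $\alpha\beta$ plane is perpendicular to
the plane with a magnitude of
$$L = (\alpha p_\beta - \beta p_\alpha)$$
The time dependence of the angular momentum is given by
$$\frac{dL}{dt} = \frac{\partial L}{\partial t} + [L, H] = 0 + \frac{\partial L}{\partial \alpha}\frac{\partial H}{\partial p_\alpha} - \frac{\partial L}{\partial p_\alpha}\frac{\partial H}{\partial \alpha} + \frac{\partial L}{\partial \beta}\frac{\partial H}{\partial p_\beta} - \frac{\partial L}{\partial p_\beta}\frac{\partial H}{\partial \beta}$$
$$= \frac{p_\alpha p_\beta}{m} + k\alpha\beta + \eta\alpha\beta - \frac{p_\alpha p_\beta}{m} - k\alpha\beta + \eta\alpha\beta = 2\eta\alpha\beta$$
Note that if $\eta = 0$ then the two eigenfrequencies are degenerate, $\omega_1 = \omega_2$, that is, the system reduces to
the isotropic harmonic oscillator in the $\alpha\beta$ plane that was discussed in chapter 11.9. In addition, $\frac{dL}{dt} = 0$
for $\eta = 0$, that is, the angular momentum in the $\alpha\beta$ plane is a constant of motion when $\eta = 0$.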
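d) SYMMETRY TENSOR: The symmetry tensor was defined in chapter 11.9.3 to be
$$A'_{ij} = \frac{p_i p_j}{2m} + \frac{1}{2}k x_i x_j$$
where $i$ and $j$ can correspond to either $x$ or $y$. The symmetry tensor defines the orientation of the major
axis of the elliptical orbit for the two-dimensional, isotropic, linear oscillator as described in chapter 11.9.3.
The isotropic oscillator has been shown to have two normal modes that are degenerate, therefore $\alpha$ and
$\beta$ are equally good normal modes. For $\eta = 0$ the Hamiltonian shows that the total energy is conserved,
as well as the energies for each of the two normal modes, which are
$$H_\alpha = \frac{p_\alpha^2}{2m} + \frac{1}{2}k\alpha^2 \qquad H_\beta = \frac{p_\beta^2}{2m} + \frac{1}{2}k\beta^2$$
Consider the matrix element
$$A'_{ij} = \frac{p_i p_j}{2m} + \frac{1}{2}k q_i q_j$$
where each $q_i$ can represent $\alpha$ or $\beta$. Then for each matrix element
$$\frac{dA'_{ij}}{dt} = \frac{\partial A'_{ij}}{\partial t} + [A'_{ij}, H] = 0 + \frac{\partial A'_{ij}}{\partial \alpha}\frac{\partial H}{\partial p_\alpha} - \frac{\partial A'_{ij}}{\partial p_\alpha}\frac{\partial H}{\partial \alpha} + \frac{\partial A'_{ij}}{\partial \beta}\frac{\partial H}{\partial p_\beta} - \frac{\partial A'_{ij}}{\partial p_\beta}\frac{\partial H}{\partial \beta} = 0$$
That is, each matrix element $A'_{ij}$ commutes with the Hamiltonian
$$\left[A'_{ij}, H\right] = 0$$
Thus the Poisson bracket representation of Hamiltonian mechanics has been used to prove that the
symmetry tensor $A'_{ij} = \frac{p_i p_j}{2m} + \frac{1}{2}k q_i q_j$ is a constant of motion for the isotropic harmonic oscillator. That is,
all the elements $A'_{11}$, $A'_{12}$ and $A'_{22}$ of the symmetric tensor $\mathbf{A}'$ commute with the Hamiltonian.
Note that the three constants of motion, $\mathbf{L}$, $\mathbf{A}'$ and $H$ for the isotropic, two-dimensional, linear oscillator
form a closed algebra under the Poisson bracket formalism.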
$$\mathbf{A} \equiv (\mathbf{p} \times \mathbf{L}) + (\mu k\,\hat{\mathbf{r}})$$
is a constant of motion that specifies the major axis of the elliptical orbit. The eccentricity vector for the
inverse-square-law force can be investigated using Poisson brackets as was done for the symmetry tensor
above. It can be shown that
$$[L_i, A_j] = \epsilon_{ijk} A_k$$
$$[A_i, A_j] = -2\mu\left(\frac{p^2}{2\mu} + \frac{k}{r}\right)\epsilon_{ijk} L_k \tag{a}$$
Note that the bracket on the right-hand side of equation (a) equals the Hamiltonian for the inverse
square-law attractive force, and thus the Poisson bracket equals
$$[A_i, A_j] = -2\mu\left(\frac{p^2}{2\mu} + \frac{k}{r}\right)\epsilon_{ijk} L_k = -2\mu H \epsilon_{ijk} L_k$$
For the Hamiltonian it can be shown that the Poisson bracket
$$[H, \mathbf{A}] = 0$$
That is, the eccentricity vector commutes with the Hamiltonian and thus it is a constant of motion. Previously
this result was obtained directly using the equations of motion as given in equation 11.87. Note that the three
constants of motion, $\mathbf{L}$, $\mathbf{A}$ and $H$ form a closed algebra under the Poisson bracket formalism similar to
the triad of constants of motion, $\mathbf{L}$, $\mathbf{A}'$ and $H$ that occur for the two-dimensional, isotropic linear oscillator
described above. Examples 15.5 and 15.6 illustrate that the Poisson bracket representation of Hamiltonian
mechanics is a powerful probe of the underlying physics, as well as confirming the results obtained directly
from the equations of motion as described in chapters 11.8.4 and 11.9.3.
Hence the net increase in $\rho$ in the infinitesimal rectangular element $dq_i\,dp_i$ due to flow in the horizontal
direction is
$$-\frac{\partial}{\partial q_i}(\rho\dot{q}_i)\,dq_i\,dp_i \tag{15.62}$$
Similarly, the net gain due to flow in the vertical direction is
$$-\frac{\partial}{\partial p_i}(\rho\dot{p}_i)\,dq_i\,dp_i \tag{15.63}$$
Thus the total increase in the element $dq_i\,dp_i$ per unit time is
$$-\left[\frac{\partial}{\partial q_i}(\rho\dot{q}_i) + \frac{\partial}{\partial p_i}(\rho\dot{p}_i)\right]dq_i\,dp_i \tag{15.64}$$
Assume that the total number of points must be conserved, then the total increase in the number of
points inside the element $dq_i\,dp_i$ must equal the net changes in $\rho$ on the infinitesimal surface element per
unit time. That is
$$\left(\frac{\partial \rho}{\partial t}\right)dq_i\,dp_i \tag{15.65}$$
Thus summing over all possible values of $i$ gives
$$\frac{\partial \rho}{\partial t} + \sum_i\left[\frac{\partial}{\partial q_i}(\rho\dot{q}_i) + \frac{\partial}{\partial p_i}(\rho\dot{p}_i)\right] = 0 \tag{15.66}$$
or
$$\frac{\partial \rho}{\partial t} + \sum_i\left[\frac{\partial \rho}{\partial q_i}\dot{q}_i + \frac{\partial \rho}{\partial p_i}\dot{p}_i\right] + \rho\sum_i\left[\frac{\partial \dot{q}_i}{\partial q_i} + \frac{\partial \dot{p}_i}{\partial p_i}\right] = 0 \tag{15.67}$$
Inserting Hamilton's canonical equations into both brackets and differentiating the last bracket results in
$$\frac{\partial \rho}{\partial t} + \sum_i\left[\frac{\partial \rho}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial \rho}{\partial p_i}\frac{\partial H}{\partial q_i}\right] + \rho\sum_i\left[\frac{\partial^2 H}{\partial q_i\partial p_i} - \frac{\partial^2 H}{\partial p_i\partial q_i}\right] = 0 \tag{15.68}$$
The two terms in the last bracket cancel and thus
$$\frac{\partial \rho}{\partial t} + \sum_i\left[\frac{\partial \rho}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial \rho}{\partial p_i}\frac{\partial H}{\partial q_i}\right] = \frac{\partial \rho}{\partial t} + [\rho, H] = 0 \tag{15.69}$$
However, this just equals $\frac{d\rho}{dt}$, therefore
$$\frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + [\rho, H] = 0 \tag{15.70}$$
This is called Liouville's theorem which states that the rate of change of density of representative
points vanishes, that is, the density of points is a constant in the Hamiltonian phase space along a specific
trajectory. Liouville's theorem means that the system acts like an incompressible fluid that moves such as to
occupy an equal volume in phase space at every instant, even though the shape of the phase-space volume
may change, that is, the phase-space density of the fluid remains constant. Equation (15.70) is another
illustration of the basic Poisson bracket relation (15.45) and the usefulness of Poisson brackets in physics.
Liouville's theorem is crucially important to statistical mechanics of ensembles where exact knowledge
of the system is lacking and only statistical averages are known. An example is the focussing of beams of
charged particles by beam handling systems. At a focus of the beam, the transverse width in $x$ is minimized,
while the width in $p_x$ is largest since the beam is converging to the focus, whereas a parallel beam has
maximum width $x$ and minimum spreading width $p_x$. However, the product $x\,p_x$ remains constant throughout
the focussing system. For a two dimensional beam, this applies equally for the $x$ and $y$ coordinates, etc.
The final beam quality for any beam transport system is ultimately limited by the emittance of
the source of the beam, that is, the initial area of the phase space distribution. Note that Liouville's theorem
only applies to Hamiltonian $q - p$ phase space, not to $q - \dot{q}$ Lagrangian state space. As a consequence,
Hamiltonian dynamics, rather than Lagrange dynamics, is used to discuss ensembles in statistical physics.
Note that Liouville's theorem is applicable only for conservative systems, that is, where Hamilton's
equations of motion apply. For dissipative systems the phase space volume shrinks with time rather than
being a constant of the motion.
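The contrast between conservative and dissipative flow can be illustrated numerically. The sketch below is only an illustration under stated assumptions: a unit-mass linear oscillator with assumed frequency `omega0`, an optional linear damping coefficient `gamma`, and the kinematic momentum $p = \dot{q}$ (not a canonical momentum in the damped case). The names `polygon_area` and `evolve` are hypothetical helpers. A small square of initial conditions is propagated with the exact flow and its area is compared before and after.

```python
import math

def polygon_area(points):
    """Shoelace formula for the area enclosed by an ordered list of (q, p) points."""
    total = 0.0
    n = len(points)
    for i in range(n):
        q1, p1 = points[i]
        q2, p2 = points[(i + 1) % n]
        total += q1 * p2 - q2 * p1
    return abs(total) / 2.0

def evolve(q0, p0, t, omega0=2.0, gamma=0.0):
    """Exact flow of q'' + gamma q' + omega0^2 q = 0 for unit mass, with p = q'."""
    a = gamma / 2.0
    w1 = math.sqrt(omega0 ** 2 - a ** 2)  # assumes underdamped motion
    c, s = math.cos(w1 * t), math.sin(w1 * t)
    decay = math.exp(-a * t)
    q = decay * (q0 * c + (p0 + a * q0) / w1 * s)
    p = decay * (p0 * c - (omega0 ** 2 * q0 + a * p0) / w1 * s)
    return q, p

square = [(1.0, 0.0), (1.2, 0.0), (1.2, 0.2), (1.0, 0.2)]
t = 0.9
area0 = polygon_area(square)
area_conservative = polygon_area([evolve(q, p, t) for q, p in square])
area_damped = polygon_area([evolve(q, p, t, gamma=0.5) for q, p in square])
```

For the undamped oscillator the area is unchanged, while with damping it shrinks by the factor $e^{-\Gamma t}$, where $\Gamma$ is the value passed as `gamma`.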
15.3. CANONICAL TRANSFORMATIONS IN HAMILTONIAN MECHANICS 417
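Similarly, applying Hamilton's Principle of least action to the new Lagrangian $\mathcal{L}(\mathbf{Q}, \dot{\mathbf{Q}}, t)$ gives
$$\delta\int_{t_1}^{t_2}\mathcal{L}(\mathbf{Q}, \dot{\mathbf{Q}}, t)\,dt = \delta\int_{t_1}^{t_2}\left[\mathbf{P}\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q}, \mathbf{P}, t)\right]dt = 0 \tag{15.74}$$
The discussion of gauge-invariant Lagrangians, chapter 9.3, showed that $L$ and $\mathcal{L}$ can be related by the total
time derivative of a generating function $F$ where
$$\frac{dF}{dt} = L - \mathcal{L} \tag{15.75}$$
The generating function $F$ can be any well-behaved function with continuous second derivatives of both the
old and new canonical variables $\mathbf{p}$, $\mathbf{q}$, $\mathbf{P}$, $\mathbf{Q}$ and $t$. Thus the integrands of (15.73) and (15.74) are related by
$$\lambda\left[\mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t)\right] = \mathbf{P}\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q}, \mathbf{P}, t) + \frac{dF}{dt} \tag{15.76}$$
where $\lambda$ is a possible scale transformation. A scale transformation, such as changing units, is trivial, and will
be assumed to be absorbed into the coordinates, making $\lambda = 1$. A transformation with $\lambda \neq 1$ is called an
extended canonical transformation.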
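The total time derivative of the generating function $F = F_1(\mathbf{q}, \mathbf{Q}, t)$ is given by
$$\frac{dF_1(\mathbf{q}, \mathbf{Q}, t)}{dt} = \frac{\partial F_1}{\partial \mathbf{q}}\cdot\dot{\mathbf{q}} + \frac{\partial F_1}{\partial \mathbf{Q}}\cdot\dot{\mathbf{Q}} + \frac{\partial F_1}{\partial t} \tag{15.77}$$
Insert equation (15.77) into equation (15.76), and assume that the trivial scale factor $\lambda = 1$, then
$$\left[\mathbf{p} - \frac{\partial F_1}{\partial \mathbf{q}}\right]\cdot\dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t) = \left[\mathbf{P} + \frac{\partial F_1}{\partial \mathbf{Q}}\right]\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q}, \mathbf{P}, t) + \frac{\partial F_1}{\partial t}$$
Assume that the generating function $F_1$ determines the canonical variables $\mathbf{p}$ and $\mathbf{P}$ to be
$$\mathbf{p} = \frac{\partial F_1(\mathbf{q}, \mathbf{Q}, t)}{\partial \mathbf{q}} \qquad \mathbf{P} = -\frac{\partial F_1(\mathbf{q}, \mathbf{Q}, t)}{\partial \mathbf{Q}} \tag{15.78}$$
then the terms in each square bracket cancel, leading to the required canonical transformation
$$\mathcal{H}(\mathbf{Q}, \mathbf{P}, t) = H(\mathbf{q}, \mathbf{p}, t) + \frac{\partial F_1(\mathbf{q}, \mathbf{Q}, t)}{\partial t} \tag{15.79}$$
The total time derivative of the generating function $F = F_2(\mathbf{q}, \mathbf{P}, t) - \mathbf{Q}\cdot\mathbf{P}$ is given by
$$\frac{dF}{dt} = \frac{\partial F_2}{\partial \mathbf{q}}\cdot\dot{\mathbf{q}} + \frac{\partial F_2}{\partial \mathbf{P}}\cdot\dot{\mathbf{P}} - \mathbf{P}\cdot\dot{\mathbf{Q}} - \dot{\mathbf{P}}\cdot\mathbf{Q} + \frac{\partial F_2}{\partial t} \tag{15.80}$$
Insert this into equation (15.76) and assume that the trivial scale factor $\lambda = 1$, then
$$\left(\mathbf{p} - \frac{\partial F_2}{\partial \mathbf{q}}\right)\cdot\dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t) = \mathbf{P}\cdot\dot{\mathbf{Q}} - \mathbf{P}\cdot\dot{\mathbf{Q}} + \left[\frac{\partial F_2}{\partial \mathbf{P}} - \mathbf{Q}\right]\cdot\dot{\mathbf{P}} - \mathcal{H}(\mathbf{Q}, \mathbf{P}, t) + \frac{\partial F_2}{\partial t}$$
Assume that the generating function $F_2$ determines the canonical variables $\mathbf{p}$ and $\mathbf{Q}$ to be
$$\mathbf{p} = \frac{\partial F_2(\mathbf{q}, \mathbf{P}, t)}{\partial \mathbf{q}} \qquad \mathbf{Q} = \frac{\partial F_2(\mathbf{q}, \mathbf{P}, t)}{\partial \mathbf{P}} \tag{15.81}$$
then the terms in brackets cancel, leading to the required transformation
$$\mathcal{H}(\mathbf{Q}, \mathbf{P}, t) = H(\mathbf{q}, \mathbf{p}, t) + \frac{\partial F_2(\mathbf{q}, \mathbf{P}, t)}{\partial t} \tag{15.82}$$
The total time derivative of the generating function $F = F_3(\mathbf{p}, \mathbf{Q}, t) + \mathbf{q}\cdot\mathbf{p}$ is given by
$$\frac{dF}{dt} = \frac{\partial F_3}{\partial \mathbf{p}}\cdot\dot{\mathbf{p}} + \frac{\partial F_3}{\partial \mathbf{Q}}\cdot\dot{\mathbf{Q}} + \dot{\mathbf{q}}\cdot\mathbf{p} + \mathbf{q}\cdot\dot{\mathbf{p}} + \frac{\partial F_3}{\partial t} \tag{15.83}$$
Insert this into equation (15.76) and assume that the trivial scale factor $\lambda = 1$, then
$$-\left[\mathbf{q} + \frac{\partial F_3}{\partial \mathbf{p}}\right]\cdot\dot{\mathbf{p}} - H(\mathbf{q}, \mathbf{p}, t) = \left[\mathbf{P} + \frac{\partial F_3}{\partial \mathbf{Q}}\right]\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q}, \mathbf{P}, t) + \frac{\partial F_3}{\partial t}$$
Assume that the generating function $F_3$ determines the canonical variables $\mathbf{q}$ and $\mathbf{P}$ to be
$$\mathbf{q} = -\frac{\partial F_3(\mathbf{p}, \mathbf{Q}, t)}{\partial \mathbf{p}} \qquad \mathbf{P} = -\frac{\partial F_3(\mathbf{p}, \mathbf{Q}, t)}{\partial \mathbf{Q}} \tag{15.84}$$
then the terms in brackets cancel, leading to the required transformation
$$\mathcal{H}(\mathbf{Q}, \mathbf{P}, t) = H(\mathbf{q}, \mathbf{p}, t) + \frac{\partial F_3(\mathbf{p}, \mathbf{Q}, t)}{\partial t} \tag{15.85}$$
The total time derivative of the generating function $F = F_4(\mathbf{p}, \mathbf{P}, t) + \mathbf{q}\cdot\mathbf{p} - \mathbf{Q}\cdot\mathbf{P}$ is given by
$$\frac{dF}{dt} = \frac{\partial F_4}{\partial \mathbf{p}}\cdot\dot{\mathbf{p}} + \frac{\partial F_4}{\partial \mathbf{P}}\cdot\dot{\mathbf{P}} + \dot{\mathbf{q}}\cdot\mathbf{p} + \mathbf{q}\cdot\dot{\mathbf{p}} - \dot{\mathbf{Q}}\cdot\mathbf{P} - \mathbf{Q}\cdot\dot{\mathbf{P}} + \frac{\partial F_4}{\partial t} \tag{15.86}$$
Insert this into equation (15.76) and assume that the trivial scale factor $\lambda = 1$, then
$$-\left[\mathbf{q} + \frac{\partial F_4}{\partial \mathbf{p}}\right]\cdot\dot{\mathbf{p}} - H(\mathbf{q}, \mathbf{p}, t) = \left[\frac{\partial F_4}{\partial \mathbf{P}} - \mathbf{Q}\right]\cdot\dot{\mathbf{P}} - \mathcal{H}(\mathbf{Q}, \mathbf{P}, t) + \frac{\partial F_4}{\partial t}$$
Assume that the generating function $F_4$ determines the canonical variables $\mathbf{q}$ and $\mathbf{Q}$ to be
$$\mathbf{q} = -\frac{\partial F_4(\mathbf{p}, \mathbf{P}, t)}{\partial \mathbf{p}} \qquad \mathbf{Q} = \frac{\partial F_4(\mathbf{p}, \mathbf{P}, t)}{\partial \mathbf{P}} \tag{15.87}$$
then the terms in brackets cancel, leading to the required transformation
$$\mathcal{H}(\mathbf{Q}, \mathbf{P}, t) = H(\mathbf{q}, \mathbf{p}, t) + \frac{\partial F_4(\mathbf{p}, \mathbf{P}, t)}{\partial t} \tag{15.88}$$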
Note that the last three generating functions require the inclusion of additional bilinear products of
$q_i$, $p_i$, $Q_i$, $P_i$ in order for the terms to cancel to give the required result. The addition of the bilinear terms
ensures that the resultant generating function $F$ is the same using any of the four generating functions
$F_1$, $F_2$, $F_3$, $F_4$. Frequently the $F_2(\mathbf{q}, \mathbf{P}, t)$ generating function is the most convenient. The four possible
generating functions of the first kind, given above, are related by Legendre transformations. A canonical
transformation does not have to conform to only one of the four generating functions for all the degrees
of freedom; it can be a mixture of different flavors for the different degrees of freedom. The properties of
the generating functions are summarized in table 15.1.
The partial derivatives of the generating functions $F_i$ determine the corresponding conjugate variables
not explicitly included in the generating function $F_i$. Note that, for the first trivial example $F_1 = q_i Q_i$, the
old momenta become the new coordinates, $Q_i = p_i$, and vice versa, $P_i = -q_i$. This illustrates that it is
better to name them “conjugate variables” rather than “momenta” and “coordinates”.
In summary, Jacobi developed a mathematical framework for finding the generating function $F$
required to make a canonical transformation to a new Hamiltonian $\mathcal{H}(\mathbf{Q}, \mathbf{P}, t)$ that has a known solution.
That is,
$$\mathcal{H}(\mathbf{Q}, \mathbf{P}, t) = H(\mathbf{q}, \mathbf{p}, t) + \frac{\partial F}{\partial t} \tag{15.89}$$
When $\mathcal{H}(\mathbf{Q}, \mathbf{P}, t)$ is a constant, then a solution has been obtained. The inverse transformation for this solution
$\mathbf{Q}(t), \mathbf{P}(t) \rightarrow \mathbf{q}(t), \mathbf{p}(t)$ now can be used to express the final solution in terms of the original variables of the
system.
Note the special case when $\mathcal{H}(\mathbf{Q}, \mathbf{P}, t) = 0$, then equation 15.89 has been reduced to the Hamilton-Jacobi
relation (15.11)
$$H(\mathbf{q}, \mathbf{p}, t) + \frac{\partial S}{\partial t} = 0 \tag{15.11}$$
In this case, the generating function $F$ determines the action functional $S$ required to solve the Hamilton-
Jacobi equation (15.110). Since equation (15.89) has transformed the Hamiltonian $H(\mathbf{q}, \mathbf{p}, t) \rightarrow \mathcal{H}(\mathbf{Q}, \mathbf{P}, t)$
for which $\mathcal{H}(\mathbf{Q}, \mathbf{P}, t) = 0$, then the solution $\mathbf{Q}(t), \mathbf{P}(t)$ for the Hamiltonian $\mathcal{H}(\mathbf{Q}, \mathbf{P}, t) = 0$ is obtained easily.
This approach underlies Hamilton-Jacobi theory presented in chapter 15.4.
2 (q P )
= = +
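Thus the infinitesimal changes in $q_i$ and $p_i$ are given by
$$\delta q_i(\mathbf{q}, \mathbf{p}) = Q_i - q_i = \epsilon\frac{\partial G(\mathbf{q}, \mathbf{P}, t)}{\partial P_i} = \epsilon\frac{\partial G(\mathbf{q}, \mathbf{p}, t)}{\partial p_i} + O(\epsilon^2)$$
$$\delta p_i(\mathbf{q}, \mathbf{p}) = P_i - p_i = -\epsilon\frac{\partial G(\mathbf{q}, \mathbf{P}, t)}{\partial q_i} = -\epsilon\frac{\partial G(\mathbf{q}, \mathbf{p}, t)}{\partial q_i} + O(\epsilon^2)$$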
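$$H = \frac{p^2}{2m} + \frac{kq^2}{2} = \frac{1}{2m}\left(p^2 + m^2\omega^2 q^2\right)$$
This form of the Hamiltonian is a sum of two squares, suggesting a canonical transformation for which $\mathcal{H}$
is cyclic in a new coordinate. A guess for a canonical transformation is of the form $p = m\omega q\cot Q$ which
is of the $F_1(q, Q)$ type where $F_1$ equals $F_1(q, Q) = \frac{m\omega q^2}{2}\cot Q$. Using (15.78) gives
$$p = \frac{\partial F_1(q, Q)}{\partial q} = m\omega q\cot Q$$
$$P = -\frac{\partial F_1(q, Q)}{\partial Q} = \frac{m\omega q^2}{2\sin^2 Q}$$
Solving the second relation for $q$ gives $q = \sqrt{\frac{2P}{m\omega}}\sin Q$, and inserting $q$ and $p$ into the Hamiltonian gives
$$\mathcal{H} = H = \omega P$$
Since
$$\dot{Q} = \frac{\partial \mathcal{H}}{\partial P} = \omega$$
then
$$Q = \omega t + \delta$$
Substituting into the expression for $q$, with total energy $E = \omega P$, gives the well known solution of the
one-dimensional harmonic oscillator
$$q = \sqrt{\frac{2E}{m\omega^2}}\sin(\omega t + \delta)$$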
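That the transformation used in this example is canonical can be checked numerically. The sketch below assumes the explicit inverse forms $Q = \tan^{-1}(m\omega q/p)$ and $P = (p^2 + m^2\omega^2 q^2)/2m\omega$, which follow from $p = m\omega q\cot Q$ and $P = m\omega q^2/2\sin^2 Q$; the values of `M` and `OMEGA` and the helper names are illustrative assumptions. It evaluates the fundamental Poisson bracket $[Q, P]_{(q,p)}$ by central differences.

```python
import math

M, OMEGA = 1.5, 2.0  # assumed mass and angular frequency

def new_coordinate(q, p):
    """Q from p = m*omega*q*cot(Q), i.e. tan(Q) = m*omega*q / p."""
    return math.atan2(M * OMEGA * q, p)

def new_momentum(q, p):
    """P = m*omega*q^2 / (2 sin^2 Q), rewritten in terms of (q, p)."""
    return (p * p + (M * OMEGA * q) ** 2) / (2 * M * OMEGA)

def poisson_bracket(f, g, q, p, h=1e-6):
    """Central-difference estimate of [f, g] = df/dq dg/dp - df/dp dg/dq."""
    df_dq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    df_dp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dg_dq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dg_dp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return df_dq * dg_dp - df_dp * dg_dq

q0, p0 = 0.4, 0.9
bracket = poisson_bracket(new_coordinate, new_momentum, q0, p0)  # should be close to 1
energy = p0 ** 2 / (2 * M) + 0.5 * M * OMEGA ** 2 * q0 ** 2
transformed_energy = OMEGA * new_momentum(q0, p0)                # omega * P
```

A bracket equal to unity confirms that $(Q, P)$ are canonical, and `transformed_energy` reproduces the original energy, consistent with the transformed Hamiltonian being $\omega P$.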
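The Hamilton-Jacobi equation, (15.94), can be written more compactly using tensors $\mathbf{q}$ and $\nabla S$ to designate
$(q_1, \ldots, q_n)$ and $\left(\frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n}\right)$ respectively. That is
$$H(\mathbf{q}, \nabla S, t) + \frac{\partial S}{\partial t} = 0 \tag{15.95}$$
Equation (15.95) is a first-order partial differential equation in $n + 1$ variables which are the old spatial
coordinates $q_i$ plus time $t$. The new momenta $P_i$ have not been specified except that they are constants
since $\mathcal{H} = 0$.
Assume the existence of a solution of (15.95) of the form $S(q_i, \alpha_i, t) = S(q_1, \ldots, q_n; \alpha_1, \ldots, \alpha_{n+1}; t)$ where
the generalized momenta $P_i = \alpha_i$, $i = 1, 2, \ldots, n$, plus $\alpha_{n+1}$ are the $n + 1$ independent constants of integration
in the transformed frame. One constant of integration is irrelevant to the solution since only partial derivatives of
$S(q_i, \alpha_i, t)$ with respect to $q_i$ and $t$ are involved. Thus, if $S$ is a solution of the first-order partial differential
equation, then so is $S + \alpha$ where $\alpha$ is a constant. Thus it can be assumed that one of the $n + 1$ constants of
integration is just an additive constant which can be ignored, leading effectively to a solution
$$S(q_1, \ldots, q_n; \alpha_1, \ldots, \alpha_n; t) \tag{15.96}$$
where none of the $n$ independent constants are solely additive. Such generating function solutions are called
complete solutions of the first-order partial differential equations since all constants of integration are known.
It is possible to assume that the $n$ generalized momenta $P_i$ are constants $\alpha_i$, where the $\alpha_i$ are the
constants of integration. This allows the generalized momentum to be written as
$$p_i = \frac{\partial S(\mathbf{q}, \boldsymbol{\alpha}, t)}{\partial q_i} \tag{15.97}$$
Similarly, Hamilton's equations of motion give the conjugate coordinate $\mathbf{Q} = \boldsymbol{\beta}$ where $\beta_i$ are constants. That
is
$$Q_i = \frac{\partial S(\mathbf{q}, \boldsymbol{\alpha}, t)}{\partial \alpha_i} = \beta_i \tag{15.98}$$
The above procedure has determined the complete set of $2n$ constants $(\mathbf{Q} = \boldsymbol{\beta}, \mathbf{P} = \boldsymbol{\alpha})$. It is possible to
invert the canonical transformation to express the above solution, which is expressed in terms of $Q_i = \beta_i$
and $P_i = \alpha_i$, back to the original coordinates, that is, $q_j = q_j(\boldsymbol{\alpha}, \boldsymbol{\beta}, t)$ and momenta $p_j = p_j(\boldsymbol{\alpha}, \boldsymbol{\beta}, t)$, which is
the required solution.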
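Note that this equals the abbreviated action described in chapter 9.2.3, that is $W(\mathbf{q}, \boldsymbol{\alpha}) = S_0(\mathbf{q}, \boldsymbol{\alpha})$.
Inserting the action $S(\mathbf{q}, \boldsymbol{\alpha}, t)$ into the Hamilton-Jacobi equation (15.12) gives
$$H\left(\mathbf{q}; \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial \mathbf{q}}\right) = E(\boldsymbol{\alpha}) \tag{15.104}$$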
This is called the time-independent Hamilton-Jacobi equation. Usually it is convenient to have
equal the total energy. However, sometimes it is more convenient to exclude the energy ( ) in the
set, in which case = (1 2 −1 ); the Routhian exploits this feature.
The equations of the canonical transformation expressed in terms of (q α) are
(q α) (α) (q α)
= + = (15.105)
These equations show that Hamilton’s characteristic function (q α) is itself the generating function of a
time-independent canonical transformation from the old variables ( ) to a set of new variables
(α)
= + = (15.106)
Table 152 summarizes the time-dependent and time-independent forms of the Hamilton-Jacobi equation.
Hamilton-Jacobi equation ( 1 ; 1
; )+
= 0 ( 1 ;
1 ) =
Transformation equations =
=
= = = = +
15.4. HAMILTON-JACOBI THEORY 425
(q α) (q α)
= = (15.109)
H H
̇ = =0 ̇ = =0 (15.110)
H = + = − =0 (15.111)
which has reduced the problem to a simple sum of one-dimensional first-order differential equations.
If the variable $q_i$ is cyclic, then the Hamiltonian is not a function of $q_i$ and the $i$th term in Hamilton's
characteristic function equals $W_i = \alpha_i q_i$, which separates out from the summation in equation 15.107. That
is, all cyclic variables can be factored out of $W(\mathbf{q}, \boldsymbol{\alpha})$, which greatly simplifies solution of the Hamilton-Jacobi
equation. As a consequence, the ability of the Hamilton-Jacobi method to make a canonical transformation to
separate the system into many cyclic or independent variables, which can be solved trivially, is a remarkably
powerful way for solving the equations of motion in Hamiltonian mechanics.
Since the Hamiltonian does not explicitly depend on the coordinates ( ) then the coordinates are cyclic
and separation of the variables, 15107, gives that the action
= α · r − ()
Since
S α
= r− Q̇ =
α
the equation of motion and the conjugate momentum are given by
α
r = Q̇ + p = ∇ = α
Thus the Hamilton-Jacobi relation has given both the equation of motion and the linear momentum p.
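Assuming that the variables can be separated, $W = X(x) + Y(y) + Z(z)$, leads to
$$p_x = \frac{\partial X(x)}{\partial x} = \alpha_x$$
$$p_y = \frac{\partial Y(y)}{\partial y} = \alpha_y$$
$$p_z = \frac{\partial Z(z)}{\partial z} = \sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}$$
Thus by integration the total $W$ equals
$$W = \int_{x_0}^{x}\alpha_x\,dx + \int_{y_0}^{y}\alpha_y\,dy + \int_{z_0}^{z}\left(\sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}\right)dz$$
$$x - x_0 = \left(\frac{\alpha_x}{m}\right)(t - t_0)$$
$$y - y_0 = \left(\frac{\alpha_y}{m}\right)(t - t_0)$$
$$z - z_0 = \left(\frac{\sqrt{2m(E - mgz_0) - \alpha_x^2 - \alpha_y^2}}{m}\right)(t - t_0) - \frac{1}{2}g(t - t_0)^2$$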
( ) = ( ) −
=
Inserting the generalized momentum into the Hamiltonian gives
Ã∙ ¸2 !
1
+ 2 2 2 =
2
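The left-hand side is independent of $\phi$ whereas the right-hand side is independent of $r$ and $\theta$. Both sides
must equal a constant which is set to equal $-\alpha_\phi^2$, that is
$$\frac{1}{2m}\left[\left(\frac{dR}{dr}\right)^2 + \frac{1}{r^2}\left(\frac{d\Theta}{d\theta}\right)^2\right] + U(r) + \frac{\alpha_\phi^2}{2mr^2\sin^2\theta} = E$$
$$\left(\frac{d\Phi}{d\phi}\right)^2 = \alpha_\phi^2$$
The equation in $r$ and $\theta$ can be rearranged in the form
$$2mr^2\left[\frac{1}{2m}\left(\frac{dR}{dr}\right)^2 + U(r) - E\right] = -\left[\left(\frac{d\Theta}{d\theta}\right)^2 + \frac{\alpha_\phi^2}{\sin^2\theta}\right]$$
The left-hand side is independent of $\theta$ and the right-hand side is independent of $r$ so both must equal a
constant which is set to be $-\alpha_\theta^2$
$$\frac{1}{2m}\left(\frac{dR}{dr}\right)^2 + U(r) + \frac{\alpha_\theta^2}{2mr^2} = E$$
$$\left(\frac{d\Theta}{d\theta}\right)^2 + \frac{\alpha_\phi^2}{\sin^2\theta} = \alpha_\theta^2$$
The variables now are completely separated and, by rearrangement plus integration, one obtains
$$R(r) = \sqrt{2m}\int\sqrt{E - U(r) - \frac{\alpha_\theta^2}{2mr^2}}\,dr$$
$$\Theta(\theta) = \int\sqrt{\alpha_\theta^2 - \frac{\alpha_\phi^2}{\sin^2\theta}}\,d\theta$$
$$\Phi(\phi) = \alpha_\phi\phi$$
$$W = \sqrt{2m}\int\sqrt{E - U(r) - \frac{\alpha_\theta^2}{2mr^2}}\,dr + \int\sqrt{\alpha_\theta^2 - \frac{\alpha_\phi^2}{\sin^2\theta}}\,d\theta + \alpha_\phi\phi$$
Hamilton's characteristic function $W$ is the generating function from coordinates $(r, \theta, \phi, p_r, p_\theta, p_\phi)$ to new
coordinates, which are cyclic, and new momenta that are constant and taken to be the separation constants
$E$, $\alpha_\theta$, $\alpha_\phi$.
$$p_r = \frac{\partial W}{\partial r} = \sqrt{2m}\sqrt{E - U(r) - \frac{\alpha_\theta^2}{2mr^2}}$$
$$p_\theta = \frac{\partial W}{\partial \theta} = \sqrt{\alpha_\theta^2 - \frac{\alpha_\phi^2}{\sin^2\theta}}$$
$$p_\phi = \frac{\partial W}{\partial \phi} = \alpha_\phi$$
These equations lead to the elliptical, parabolic, or hyperbolic orbits discussed in chapter 11.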
$$H_2(q, p, t) = p\dot{q} - L_2(q, \dot{q}, t) = \frac{p^2}{2m}e^{-\Gamma t} + \frac{1}{2}m\omega_0^2 q^2 e^{\Gamma t}$$
Note that both the Lagrangian and Hamiltonian are explicitly time dependent and thus they are not
conserved quantities. This is as expected for this dissipative system.
Hamilton-Jacobi theory:
The form of this non-autonomous Hamiltonian suggests use of the generating function for a canonical
transformation to an autonomous Hamiltonian, for which $\mathcal{H}$ is a constant of motion.
$$S(Q, P, t) = F_2(q, P, t) = e^{\frac{\Gamma t}{2}}qP = QP$$
That is, the transformed Hamiltonian $\mathcal{H}(Q, P)$ is not explicitly time dependent, and thus is conserved.
Expressed in the original canonical variables $(q, p)$, the transformed Hamiltonian $\mathcal{H}(Q, P)$
$$\mathcal{H}(Q, P) = \frac{p^2}{2m}e^{-\Gamma t} + \frac{\Gamma}{2}qp + \frac{m\omega_0^2}{2}q^2 e^{\Gamma t}$$
is a constant of motion which was not readily apparent when using the original Hamiltonian. This unexpected
result illustrates the usefulness of canonical transformations for solving dissipative systems. The Hamilton-
Jacobi theory now can be used to solve the equations of motion for the transformed variables ( ) plus the
transformed Hamiltonian H( ). The derivative of the generating function
= ()
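Use this derivative to substitute for $P$ in the Hamiltonian $\mathcal{H}(Q, P)$; the Hamilton-Jacobi method then gives
$$\frac{1}{2m}\left(\frac{\partial S}{\partial Q}\right)^2 + \frac{\Gamma}{2}Q\frac{\partial S}{\partial Q} + \frac{m\omega_0^2}{2}Q^2 + \frac{\partial S}{\partial t} = 0$$
This equation is separable as described in 15.107 and thus let
$$S(Q, \alpha, t) = W(Q, \alpha) - \alpha t$$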
The choice of the sign is irrelevant for this case and thus the positive sign is chosen. There are three possible
cases for the solution depending on whether the square-root term is real, zero, or imaginary.
Case 1: 1, that is, 2
2 r
0
1
h ¡ ¢2 i
Define = 1− 2 Then equation () can be integrated to give
Z p
2
= − − + ( − 2 2 ) ()
4
and Z
1
= = − + p
0 ( − 2 2 )
This integral gives µ ¶
−1
sin √ = 0 ( + ) ≡ +
where s s
µ ¶2 µ ¶2
Γ Γ
= 0 = 0 1− = 20 − ()
2 0 2
Transforming back to the original variable $q$ gives
$$q(t) = Ae^{-\frac{\Gamma t}{2}}\sin(\omega t + \delta)$$
where $A$ and $\delta$ are given by the initial conditions. This is identical to the solution for the underdamped
linearly-damped linear oscillator given previously in equation 3.35.
Case 2: that is, 2Γ0 = 1
2 = 1, r
h ¡ ¢2 i
In this case = 1− 2 = 0 and thus equation simplifies to
2 √
= − − +
4
and
= = − + √
0
Therefore the solution is
$$q(t) = e^{-\frac{\Gamma t}{2}}(F + Gt)$$
where $F$ and $G$ are constants given by the initial conditions. This is the solution for the critically-damped
linearly-damped, linear oscillator given previously in equation 3.38.
Case 3: Γ
2 1, that is, 2 0 1 rh i
¡ ¢2
Define a real constant where = 2 − 1 = , then
Z p
2
= − − + ( + 2 2 )
4
Then Z
1
= = − + p
0 ( + 2 2 )
This last integral gives µ ¶
−1
sinh √ = 0 ( + ) ≡ +
where sµ ¶2
= 0 = 0 −1
2 0
Then the original variable gives
$$q(t) = Ae^{-\frac{\Gamma t}{2}}\sinh(\omega t + \delta)$$
This is the classic solution of the overdamped linearly-damped, linear harmonic oscillator given previously in
equation 3.37. The canonical transformation from a non-autonomous to an autonomous system allowed use
of Hamiltonian mechanics to solve the damped oscillator problem.
Note that this example used Bateman's non-standard Lagrangian, and corresponding Hamiltonian, for
handling a dissipative linear oscillator system where the dissipation depends linearly on velocity. This
non-standard Lagrangian led to the correct equations of motion and solutions when applied using either the
time-dependent Lagrangian, or time-dependent Hamiltonian, and these solutions agree with those given in
chapter 3.5 which were derived using Newtonian mechanics.
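The claim that the transformed Hamiltonian is a constant of motion for the damped oscillator can be tested directly on the underdamped solution. The sketch below assumes the form $\mathcal{H} = \frac{p^2}{2m}e^{-\Gamma t} + \frac{\Gamma}{2}qp + \frac{m\omega_0^2}{2}q^2e^{\Gamma t}$ with canonical momentum $p = m\dot{q}e^{\Gamma t}$, and uses illustrative values for `M`, `OMEGA0`, `GAMMA`, the amplitude `AMP` and phase `DELTA`; none of these numbers come from the text.

```python
import math

M, OMEGA0, GAMMA = 1.0, 3.0, 0.4  # assumed mass, natural frequency, damping
AMP, DELTA = 0.8, 0.3             # assumed amplitude and phase
OMEGA = math.sqrt(OMEGA0 ** 2 - (GAMMA / 2) ** 2)  # underdamped frequency

def q_of_t(t):
    return AMP * math.exp(-GAMMA * t / 2) * math.sin(OMEGA * t + DELTA)

def qdot_of_t(t):
    phase = OMEGA * t + DELTA
    return AMP * math.exp(-GAMMA * t / 2) * (
        OMEGA * math.cos(phase) - GAMMA / 2 * math.sin(phase))

def transformed_hamiltonian(t):
    q = q_of_t(t)
    p = M * qdot_of_t(t) * math.exp(GAMMA * t)  # canonical momentum p = dL/d(q_dot)
    return (p * p * math.exp(-GAMMA * t) / (2 * M)
            + GAMMA * q * p / 2
            + M * OMEGA0 ** 2 * q * q * math.exp(GAMMA * t) / 2)

values = [transformed_hamiltonian(t) for t in (0.0, 0.5, 1.0, 2.0, 5.0)]
expected = 0.5 * M * AMP ** 2 * OMEGA ** 2
```

Every entry of `values` agrees with `expected`, that is $\frac{1}{2}mA^2\omega^2$, even though the mechanical energy of the oscillator itself decays with time.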
Let 1 = , 2 = 3 = 1 = 2 = 3 = .
The momentum components are given by
( )
= (15.113)
which corresponds to
p = ∇ = ∇ (15.114)
where for each cyclic variable the integral is taken over one complete period of oscillation. The cyclic variable
$I_i$ is called the action variable where
$$I_i \equiv \frac{1}{2\pi}J_i = \frac{1}{2\pi}\oint p_i\,dq_i \tag{15.117}$$
The canonical variable conjugate to the action variable $\mathbf{I}$ is the angle variable $\boldsymbol{\phi}$. Note that the name
“action variable” is used to differentiate $\mathbf{I}$ from the action functional $S = \int L\,dt$ which has the same units;
i.e. angular momentum.
The general principle underlying the use of action-angle variables is illustrated by considering one body,
of mass $m$, subject to a one-dimensional bound conservative potential energy $U(q)$. The Hamiltonian is
given by
$$H(q, p) = \frac{p^2}{2m} + U(q) \tag{15.118}$$
This bound system has a $(q, p)$ phase space contour for each energy $H = E$
$$p(q, E) = \pm\sqrt{2m(E - U(q))} \tag{15.119}$$
For an oscillatory system the two-valued momentum of equation 15.119 is non-trivial to handle. By contrast,
the area $J \equiv \oint p\,dq$ of the closed loop in phase space is a single-valued scalar quantity that depends on
$E$ and $U(q)$. Moreover, Liouville's theorem states that the area of the closed contour in phase space $J \equiv \oint p\,dq$
is invariant to canonical transformations. These facts suggest the use of a new pair of conjugate variables,
$(\phi, I)$, where $I(E)$ uniquely labels the trajectory, and corresponding area, of a closed loop in phase space
for each value of $E$, and the single-valued function $\phi$ is a corresponding angle that specifies the exact point
along the phase-space contour as illustrated in Fig 15.3.
For simplicity consider the linear harmonic oscillator where
$$U(q) = \frac{1}{2}m\omega^2 q^2 \tag{15.120}$$
Then the Hamiltonian, 15.118, equals
$$H(q, p) = \frac{p^2}{2m} + \frac{1}{2}m\omega^2 q^2 \tag{15.121}$$
Hamilton's equations of motion give that
$$\dot{p} = -\frac{\partial H}{\partial q} = -m\omega^2 q \tag{15.122}$$
$$\dot{q} = \frac{\partial H}{\partial p} = \frac{p}{m} \tag{15.123}$$
$$q = C\cos(\omega(t - t_0)) \tag{15.124}$$
$$p = -m\omega C\sin\omega(t - t_0) \tag{15.125}$$
$$[\phi, I]_{(q,p)} = 1$$
That is, the phase space has been mapped from ellipses, with area proportional to $E$ in the $(q, p)$ phase
space, to a cylindrical $(\phi, I)$ phase space where $I = \frac{E}{\omega}$ are constant values that are independent of the angle,
while $\phi$ increases linearly with time. Thus the variables $(q, p)$ are periodic with modulus $\Delta\phi = 2\pi$.
The period of the periodic oscillatory motion is given simply by $\Delta\phi = 2\pi = \omega\tau$ which is the well known
result for the harmonic oscillator. Note that the action-angle variable canonical transformation has determined
the frequency of the periodic motion without solving the detailed trajectory of the motion.
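The relation between the action variable and the frequency can be illustrated numerically for the harmonic oscillator. This is a sketch under stated assumptions: the values of `M` and `OMEGA` are illustrative, the contour is parametrized as $q = C\cos\theta$, $p = -m\omega C\sin\theta$, and `action_variable` is a hypothetical helper that evaluates $I = \frac{1}{2\pi}\oint p\,dq$ with a midpoint rule.

```python
import math

M, OMEGA = 0.5, 3.0  # assumed mass and angular frequency

def action_variable(energy, steps=2000):
    """I = (1/2 pi) * closed-loop integral of p dq on the contour H = energy."""
    amp = math.sqrt(2 * energy / (M * OMEGA ** 2))
    dtheta = 2 * math.pi / steps
    loop_integral = 0.0
    for k in range(steps):
        theta = (k + 0.5) * dtheta
        p = -M * OMEGA * amp * math.sin(theta)  # momentum on the contour
        dq = -amp * math.sin(theta) * dtheta    # dq for q = amp * cos(theta)
        loop_integral += p * dq
    return loop_integral / (2 * math.pi)

E = 1.7
I = action_variable(E)
dE = 1e-4
# Angular frequency from omega = dE/dI, estimated by a central difference
omega_est = 2 * dE / (action_variable(E + dE) - action_variable(E - dE))
```

The computed action equals $E/\omega$, and the slope $dE/dI$ returns the angular frequency without integrating the equations of motion.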
15.5. ACTION-ANGLE VARIABLES 435
The above example of the harmonic oscillator has shown that, for integrable periodic systems, it is
possible to identify a canonical transformation to $(\phi, I)$ such that the Hamiltonian is independent of the
angle $\phi$ which specifies the instantaneous location on the constant energy contour $I$. If the phase space
contour is a separatrix, then it divides phase space into invariant regions containing phase-space contours
with differing behavior. The action-angle variables are not useful for separatrix contours. For rolling motion,
the system rotates with continuously increasing, or decreasing angle, and there is no natural boundary for the
action angle variable since the phase space trajectory is continuous and not closed. However, the action-angle
approach still is valid if the motion involves periodic as well as rolling motion.
The example of the one-dimensional, one-body, harmonic oscillator can be expanded to the more general
case for many bodies in three dimensions. This is illustrated by considering multiple periodic systems for
which the Hamiltonian is conservative and where the equations of the canonical transformation are separable.
The generalized momenta then can be written as
$$p_i = \frac{\partial W_i(q_i; \alpha_1, \alpha_2, \ldots, \alpha_n)}{\partial q_i} \tag{15.136}$$
for which each $p_i$ is a function of $q_i$ and the $n$ integration constants $\alpha_j$.
The momentum $p_i(q_i; \alpha_1, \alpha_2, \ldots, \alpha_n)$ represents the trajectory of the system in the $(q_i, p_i)$ phase space that is
characterized by Hamilton's characteristic function $W(\mathbf{q}, \boldsymbol{\alpha})$. Combining equations 15.116, 15.136 gives
$$J_i \equiv \oint\frac{\partial W_i(q_i; \alpha_1, \alpha_2, \ldots, \alpha_n)}{\partial q_i}\,dq_i \tag{15.138}$$
Since $q_i$ is merely a variable of integration, each active action variable $J_i$ is a function of the $n$ constants
of integration in the Hamilton-Jacobi equation. Because of the independence of the separable-variable pairs
$(q_i, p_i)$, the $J_i$ form $n$ independent functions of the $\alpha_i$ and hence are suitable for use as a new set of constant
momenta. Thus the characteristic function $W$ can be written as
$$W(q_1, \ldots, q_n; J_1, \ldots, J_n) = \sum_j W_j(q_j; J_1, \ldots, J_n) \tag{15.139}$$
$$\phi_i = 2\pi\nu_i t + \beta_i \tag{15.142}$$
that is, they are linear functions of time. The constants $\nu_i$ can be identified with the frequencies of the
multiple periodic motions.
The action-angle variables appear to be no different than a particular set of transformed coordinates.
Their merit appears when the physical interpretation is assigned to $\nu_i$. Consider the change $\delta\phi_i$ as the
$q_j$ are changed infinitesimally
$$\delta\phi_i = \sum_j\frac{\partial \phi_i}{\partial q_j}dq_j = \sum_j\frac{\partial^2 W}{\partial q_j\partial I_i}dq_j \tag{15.143}$$
The derivative with respect to $q_j$ vanishes except for the $W_j$ component of $W$. Thus equation 15.143 reduces
to
$$\delta\phi_i = \frac{\partial}{\partial I_i}\sum_j p_j(q_j, I)\,dq_j \tag{15.144}$$
Therefore, the total change in $\phi_i$ as the system goes through one complete cycle is
$$\Delta\phi_i = \frac{\partial}{\partial I_i}\sum_j\oint p_j(q_j, I)\,dq_j = 2\pi \tag{15.145}$$
where $\frac{\partial}{\partial I_i}$ is outside the integral since the $I_i$ are constants for cyclic motion. Thus $\Delta\phi_i = 2\pi = \omega_i\tau_i$ where
$\tau_i$ is the period for one cycle of oscillation, where the angular frequency $\omega_i$ is given by
$$\nu_i = \frac{\omega_i}{2\pi} = \frac{1}{\tau_i} \tag{15.146}$$
Thus the frequency associated with the periodic motion is the reciprocal of the period $\tau_i$. The secret here is
that the derivative of $H$ with respect to the action variable $I_i$ given by equation (15.141) directly determines
the frequency of the periodic motion without the need to solve the complete equations of motion. Note that
multiple periodic motion can be represented by a Fourier expansion of the form
∞
X ∞
X ∞
X
= 1 2(1 1 +2 2 +3 3 ++ ) (15.147)
1 =−∞ 2 =−∞ =−∞
Although the action-angle approach to Hamilton-Jacobi theory does not produce complete equations of
motion, it does provide the frequency decomposition that often is the physics of interest. The reason that
the powerful action-angle variable approach has been introduced here is that it is used extensively in celestial
mechanics. The action-angle concept also played a key role in the development of quantum mechanics, in
that Sommerfeld recognized that Bohr's ad hoc assumption that angular momentum is quantized could be
expressed in terms of quantization of the action variable, as is mentioned in chapter 18.
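Then the average mean square amplitude and velocity over one period are
$$\left\langle\theta^2\right\rangle = \left\langle[\theta_0\cos(\omega t + \delta_0)]^2\right\rangle = \frac{\theta_0^2}{2}$$
$$\left\langle\dot{\theta}^2\right\rangle = \left\langle[-\omega\theta_0\sin(\omega t + \delta_0)]^2\right\rangle = \frac{\omega^2\theta_0^2}{2}$$
Since, for the simple pendulum, $\omega^2 = \frac{g}{l}$, then the tension in the string
$$T = mg\left(1 - \frac{\left\langle\theta^2\right\rangle}{2}\right) + ml\left\langle\dot{\theta}^2\right\rangle = mg\left(1 + \frac{\theta_0^2}{4}\right)$$
Assuming that $\theta_0$ is a small angle, and that the change in length $-\Delta l$ is very small during one period $\tau$,
then the work done is
$$\Delta W = -T\Delta l = -mg\Delta l - \frac{mg\theta_0^2}{4}\Delta l \tag{a}$$
while the change in internal oscillator energy is
$$\Delta(-mgl\cos\theta_0) = \Delta\left[-mgl\left(1 - \frac{\theta_0^2}{2}\right)\right] = -mg\Delta l + \frac{1}{2}mg\Delta(l\theta_0^2) = -mg\Delta l + \frac{1}{2}mg\theta_0^2\Delta l + mgl\theta_0\Delta\theta_0 \tag{b}$$
The work done must balance the increment in internal energy, therefore
$$mgl\theta_0\Delta\theta_0 + \frac{3mg\theta_0^2\Delta l}{4} = 0$$
or
$$mgl\theta_0^2\,\Delta\ln\left(\theta_0 l^{\frac{3}{4}}\right) = 0$$
Therefore it follows that
$$\left(\theta_0 l^{\frac{3}{4}}\right) = \text{constant} \tag{c}$$
or
$$\theta_0 \propto l^{-\frac{3}{4}}$$
Thus shortening the length of the pendulum string from $l$ to $\frac{l}{2}$ adiabatically corresponds to the amplitude
increasing by a factor $2^{3/4} \approx 1.68$.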
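Consider the action-angle integral for one closed period $\tau = \frac{2\pi}{\omega}$ for this problem
$$J = \oint p_\theta\,d\theta = \oint ml^2\dot{\theta}\cdot\dot{\theta}\,dt = ml^2\left\langle\dot{\theta}^2\right\rangle\frac{2\pi}{\omega} = \pi ml^2\omega\theta_0^2 = \pi mg^{\frac{1}{2}}\theta_0^2 l^{\frac{3}{2}} = \text{constant}$$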
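The adiabatic scaling can be checked with a few lines of arithmetic. The following sketch assumes the small-angle results above, $\theta_0 \propto l^{-3/4}$ and $J = \pi m g^{1/2}\theta_0^2 l^{3/2}$; the starting amplitude, the lengths, the unit mass and the function names are illustrative assumptions.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def amplitude_after(theta0, l_old, l_new):
    """Adiabatic scaling of the angular amplitude, theta0 proportional to l^(-3/4)."""
    return theta0 * (l_old / l_new) ** 0.75

def action(theta0, length, m=1.0):
    """Small-angle action J = pi * m * sqrt(g) * theta0^2 * l^(3/2)."""
    return math.pi * m * math.sqrt(G) * theta0 ** 2 * length ** 1.5

theta_a, l_a, l_b = 0.05, 1.0, 0.5   # halve the string length
theta_b = amplitude_after(theta_a, l_a, l_b)
ratio = theta_b / theta_a            # equals 2 ** 0.75
action_before = action(theta_a, l_a)
action_after = action(theta_b, l_b)
```

The amplitude ratio is $2^{3/4}$, and the action evaluated before and after the change of length is the same, consistent with $J$ being an adiabatic invariant.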
The constant can be identified with the new momentum Then the transformation equations become
= = = = = − =
That is
= +
which corresponds to motion with a uniform velocity in the system.
2
(b) Consider that the Hamiltonian is perturbed by addition of potential = 2 which corresponds to the
harmonic oscillator. Then
1 2
= 2 +
2 2
Consider the transformed Hamiltonian
1 2 2 2 1 2
H=+ = 2 + − = = ( + )
2 2 2 2 2
Hamilton’s equations of motion
H H
̇ = ̇ = −
give that
̇ = ( + )
̇ = − ( + )
These two equations can be solved to give
̈ + = 0
which is the equation of a harmonic oscillator showing that is harmonic of the form = 0 sin ( + )
where 0 are constants of motion. Thus
= −̇ − = −0 [cos( + ) + sin( + )]
The transformation equations then give
= = 0 sin ( + )
= + = −̇ = −0 cos( + )
Hence the solution for the perturbed system is harmonic, which is to be expected since the potential has a
quadratic dependence of position.
The symplectic matrix $\mathbf{J}$ is defined as being a $2n$ by $2n$ skew-symmetric, orthogonal matrix that is broken
into four $n \times n$ null or unit matrices according to the scheme
$$\mathbf{J} = \begin{pmatrix} [0] & +[1] \\ -[1] & [0] \end{pmatrix} \tag{15.152}$$
where $[0]$ is the $n$-dimensional null matrix, for which all elements are zero. Also $[1]$ is the $n$-dimensional unit
matrix, for which the diagonal matrix elements are unity and all off-diagonal matrix elements are zero. The
$\mathbf{J}$ matrix accounts for the opposite signs used in the equations for $\dot{\mathbf{q}}$ and $\dot{\mathbf{p}}$. The symplectic representation
allows Hamilton's equations of motion to be written in the compact form
$$\dot{\boldsymbol{\eta}} = \mathbf{J}\frac{\partial H}{\partial \boldsymbol{\eta}} \tag{15.153}$$
This textbook does not use the elegant symplectic representation since it ignores the important generalized
forces and Lagrange multiplier forces.
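The block structure of $\mathbf{J}$ and the compact form of Hamilton's equations can be made concrete with a short sketch. It assumes the ordering $\boldsymbol{\eta} = (q_1, \ldots, q_n, p_1, \ldots, p_n)$ and, for the worked case, a one-dimensional harmonic oscillator with illustrative mass and spring constant; the function names are hypothetical helpers written with plain Python lists.

```python
def symplectic_matrix(n):
    """Build the 2n x 2n matrix J = [[0, +1], [-1, 0]] in n x n blocks."""
    size = 2 * n
    J = [[0] * size for _ in range(size)]
    for i in range(n):
        J[i][n + i] = 1    # upper-right block: +unit matrix
        J[n + i][i] = -1   # lower-left block: -unit matrix
    return J

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def grad_h(eta, m=1.0, k=4.0):
    """Gradient of H = p^2/2m + k q^2/2 with respect to eta = (q, p)."""
    q, p = eta
    return [k * q, p / m]

J1 = symplectic_matrix(1)
eta_dot = matvec(J1, grad_h([0.3, 0.8]))  # (q_dot, p_dot) = (p/m, -k q)
J3 = symplectic_matrix(3)
```

For the assumed oscillator `eta_dot` gives $\dot{q} = p/m$ and $\dot{p} = -kq$, and `J3` can be used to confirm that the transpose of $\mathbf{J}$ equals $-\mathbf{J}$ and that $\mathbf{J}$ times its transpose is the unit matrix.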
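The Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and the action functional $S(\mathbf{q}, \mathbf{p}, t)$ are scalar functions under rotation,
but they determine the vector force fields and the corresponding equations of motion. Thus the use of
rotationally-invariant functions $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $S(\mathbf{q}, \mathbf{p}, t)$ provides a simple representation of the vector force
fields. This is analogous to the use of scalar potential fields $\phi(\mathbf{q}, t)$ to represent the electrostatic and
gravitational vector force fields. Like scalar potential fields, Lagrangian and Hamiltonian mechanics represent the
observables as derivatives of $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $S(\mathbf{q}, \mathbf{p}, t)$, and the absolute values of $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $S(\mathbf{q}, \mathbf{p}, t)$ are
undefined; only differences in $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $S(\mathbf{q}, \mathbf{p}, t)$ are observable. For example, the generalized momenta
are given by the derivatives $p_j \equiv \frac{\partial L}{\partial \dot{q}_j}$ and $p_j = \frac{\partial S}{\partial q_j}$. The physical significance of the least action $S(\mathbf{q}, \boldsymbol{\alpha}, t)$ is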
15.8. COMPARISON OF THE LAGRANGIAN AND HAMILTONIAN FORMULATIONS 441
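illustrated when the canonically transformed momenta $\mathbf{P} = \boldsymbol{\alpha}$ is a constant. Then the generalized momenta
and the Hamilton-Jacobi equation imply that the total time derivative of the action equals
$$\frac{dS}{dt} = \frac{\partial S}{\partial q_j}\dot{q}_j + \frac{\partial S}{\partial t} = \mathbf{p}\cdot\dot{\mathbf{q}} - H = L \tag{15.154}$$
The indefinite integral of this equation reproduces the definite integral (15.1) to within an arbitrary constant,
i.e.
$$S(\mathbf{q}, \mathbf{p}, t) = \int L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt + \text{constant} \tag{15.155}$$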
Lagrangian formulation:
Consider a system with $n$ independent generalized coordinates, plus $m$ constraint forces that are not required
to be known. The Lagrangian approach can reduce the system to a minimal system of $s = n - m$ independent
generalized coordinates leading to $s = n - m$ second-order differential equations. By comparison,
the Newtonian approach uses $n + m$ unknowns. Alternatively, the Lagrange multipliers approach allows
determination of the holonomic constraint forces resulting in $s = n + m$ second-order equations to determine
$s = n + m$ unknowns. The Lagrangian potential function is limited to conservative forces, but generalized
forces can be used to handle non-conservative and non-holonomic forces. The advantage of the Lagrange
equations of motion is that they can deal with any type of force, conservative or non-conservative, and
they directly determine $q$, $\dot{q}$ rather than $q$, $p$ which then requires relating $p$ to $\dot{q}$. The Lagrange approach is
superior to the Hamiltonian approach if a numerical solution is required for typical undergraduate problems
in classical mechanics. However, Hamiltonian mechanics has a clear advantage for addressing more profound
and philosophical questions in physics.
Hamiltonian formulation:
For a system with $n$ independent generalized coordinates, and $m$ constraint forces, the Hamiltonian approach
determines $2n$ first-order differential equations. In contrast to Lagrangian mechanics, where the Lagrangian
is a function of the coordinates and their velocities, the Hamiltonian uses the variables q and p, rather
than velocity. The Hamiltonian has twice as many independent variables as the Lagrangian which is a great
advantage, not a disadvantage, since it broadens the realm of possible transformations that can be used to
simplify the solutions. Hamiltonian mechanics uses the conjugate coordinates q, p corresponding to phase
space. This is an advantage in most branches of physics and engineering. Compared to Lagrangian mechanics,
Hamiltonian mechanics has a significantly broader arsenal of powerful techniques that can be exploited to
obtain an analytical solution of the integrals of the motion for complicated systems. These techniques
include the Poisson bracket formulation, canonical transformations, the Hamilton-Jacobi approach,
action-angle variables, and canonical perturbation theory. In addition, Hamiltonian dynamics also provides
a means of determining the unknown variables for which the solution assumes a soluble form, and it is
ideal for study of the fundamental underlying physics in applications to other fields such as quantum or
statistical physics. However, the Hamiltonian approach endemically assumes that the system is conservative
putting it at a disadvantage with respect to the Lagrangian approach. The appealing symmetry of the
Hamiltonian equations, plus their ability to utilize canonical transformations, makes it the formalism of
choice for examination of system dynamics. For example, Hamilton-Jacobi theory, action-angle variables
and canonical perturbation theory are used extensively to solve complicated multibody orbit perturbations
in celestial mechanics by finding a canonical transformation that transforms the perturbed Hamiltonian to
a solved unperturbed Hamiltonian.
The Hamiltonian formalism features prominently in quantum mechanics since there are well established
rules for transforming the classical coordinates and momenta into linear operators used in quantum
mechanics. The variables q, q̇ used in Lagrangian mechanics do not have simple analogs in quantum physics.
As a consequence, the Poisson bracket formulation and action-angle variables of Hamiltonian mechanics
played a key role in development of matrix mechanics by Heisenberg, Born, and Dirac, while the
Hamilton-Jacobi formulation played a key role in development of Schrödinger's wave mechanics. Similarly, Hamiltonian
mechanics is the preeminent variational approach used in statistical mechanics.
15.9 Summary
This chapter has gone beyond what is normally covered in an undergraduate course in classical mechanics,
in order to illustrate the power of the remarkable arsenal of methods available for solution of the equations of
motion using Hamiltonian mechanics. This has included the Poisson bracket representation of Hamiltonian
formulation of mechanics, canonical transformations, Hamilton-Jacobi theory, action-angle variables, and
canonical perturbation theory. The purpose was to illustrate the power of variational principles in Hamil-
tonian mechanics and how they relate to fields such as quantum mechanics. The following are the key points
made in this chapter.
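Poisson brackets: The elegant and powerful Poisson bracket formalism of Hamiltonian mechanics was
introduced. The Poisson bracket of any two continuous functions of generalized coordinates $F(q, p)$ and
$G(q, p)$ is defined to be
$$[F, G] \equiv \sum_i \left( \frac{\partial F}{\partial q_i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i} \right) \tag{15.13}$$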
[ ] = 0 (1522)
There is a one-to-one correspondence between the commutator and Poisson Bracket of two independent
functions,
$$(F_1 G_1 - G_1 F_1) = \lambda\,[F_1, G_1] \tag{15.38}$$
where $\lambda$ is an independent constant. In particular $F_1$ and $G_1$ commute if the Poisson Bracket $[F_1, G_1] = 0$.
Poisson Bracket representation of Hamiltonian mechanics: It has been shown that the Poisson
bracket formalism contains the Hamiltonian equations of motion and is invariant to canonical transforma-
tions. Also this formalism extends Hamilton’s canonical equations to non-commuting canonical variables.
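Hamilton's equations of motion can be expressed directly in terms of the Poisson brackets
$$\dot{q}_k = [q_k, H] = \frac{\partial H}{\partial p_k} \tag{15.57}$$
$$\dot{p}_k = [p_k, H] = -\frac{\partial H}{\partial q_k} \tag{15.58}$$
An important result is that the total time derivative of any operator $G$ is given by
$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + [G, H] \tag{15.45}$$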
Poisson brackets provide a powerful means of determining which observables are time independent and
whether different observables can be measured simultaneously with unlimited precision. It was shown that
the Poisson bracket is invariant to canonical transformations, which is a valuable feature for Hamiltonian
mechanics. Poisson brackets were used to prove Liouville’s theorem which plays an important role in the use
of Hamiltonian phase space in statistical mechanics. The Poisson bracket is equally applicable to continuous
solutions in classical mechanics as well as discrete solutions in quantized systems.
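The use of Poisson brackets to identify constants of motion can be sketched numerically. The example below is a hedged illustration, not the text's own method: it evaluates the bracket with central finite differences, and the two-dimensional isotropic oscillator, the values of `M` and `K`, and all function names are assumptions chosen for the demonstration.

```python
def poisson_bracket(F, G, q, p, h=1e-5):
    """Approximate [F, G] = sum_i (dF/dq_i dG/dp_i - dF/dp_i dG/dq_i).

    F and G are callables F(q, p) taking lists of coordinates and momenta.
    Partial derivatives use central differences with step h.
    """
    def partial(f, var, i):
        qp, qm, pp, pm = list(q), list(q), list(p), list(p)
        if var == "q":
            qp[i] += h
            qm[i] -= h
        else:
            pp[i] += h
            pm[i] -= h
        return (f(qp, pp) - f(qm, pm)) / (2 * h)

    return sum(
        partial(F, "q", i) * partial(G, "p", i)
        - partial(F, "p", i) * partial(G, "q", i)
        for i in range(len(q))
    )

# Illustrative (assumed) mass and spring constant.
M, K = 2.0, 3.0

def hamiltonian(q, p):
    """Two-dimensional isotropic harmonic oscillator."""
    return (p[0] ** 2 + p[1] ** 2) / (2 * M) + 0.5 * K * (q[0] ** 2 + q[1] ** 2)

def angular_momentum_z(q, p):
    return q[0] * p[1] - q[1] * p[0]

def x_coordinate(q, p):
    return q[0]

def px_momentum(q, p):
    return p[0]

# An arbitrary phase-space point at which the brackets are evaluated.
q0, p0 = [0.7, -0.4], [1.3, 0.9]

fundamental = poisson_bracket(x_coordinate, px_momentum, q0, p0)       # [x, p_x]
lz_dot = poisson_bracket(angular_momentum_z, hamiltonian, q0, p0)      # dL_z/dt
x_dot = poisson_bracket(x_coordinate, hamiltonian, q0, p0)             # dH/dp_x
px_dot = poisson_bracket(px_momentum, hamiltonian, q0, p0)             # -dH/dx
```

Because the potential is central, the bracket of the angular momentum with the Hamiltonian vanishes to within rounding error, which identifies it as a constant of motion without solving the equations of motion.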
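Canonical transformations: A transformation between a canonical set of variables $(\mathbf{q}, \mathbf{p})$ with
Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$ to another set of canonical variables $(\mathbf{Q}, \mathbf{P})$ with Hamiltonian $\mathcal{H}(\mathbf{Q}, \mathbf{P}, t)$ can be achieved
using a generating function $F$ such that
$$\mathcal{H}(\mathbf{Q}, \mathbf{P}, t) = H(\mathbf{q}, \mathbf{p}, t) + \frac{\partial F}{\partial t} \tag{15.89}$$
Possible generating functions are summarized in the following table.
If the canonical transformation makes $\mathcal{H}(\mathbf{Q}, \mathbf{P}, t) = 0$ then the conjugate variables $(\mathbf{Q}, \mathbf{P})$ are constants
of motion. Similarly if $\mathcal{H}(\mathbf{Q}, \mathbf{P}, t)$ is a cyclic function then the corresponding $P_i$ are constants of motion.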
Hamilton-Jacobi theory: Hamilton-Jacobi theory determines the generating function required to per-
form canonical transformations that leads to a powerful method for obtaining the equations of motion for
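a system. The Hamilton-Jacobi theory uses the action function $S \equiv F_2$ as a generating function, and the
canonical momentum is given by
$$p_i = \frac{\partial S}{\partial q_i} \tag{15.4}$$
This can be used to replace $p_i$ in the Hamiltonian $H$, leading to the Hamilton-Jacobi equation
$$H\left(\mathbf{q}; \frac{\partial S}{\partial \mathbf{q}}; t\right) + \frac{\partial S}{\partial t} = 0 \tag{15.94}$$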
Solutions of the Hamilton-Jacobi equation were obtained by separation of variables. The close optical-
mechanical analogy of the Hamilton-Jacobi theory is an important advantage of this formalism that led to
it playing a pivotal role in the development of wave mechanics by Schrödinger.
Action-angle variables: The action-angle variables exploit a canonical transformation from $(q, p) \to (\phi, I)$ where
$$I \equiv \frac{1}{2\pi} J = \frac{1}{2\pi} \oint p\,dq \tag{15.117}$$
For periodic motion the phase-space trajectory is closed with area given by $J$ and this area is conserved for
the above canonical transformation. For a conserved Hamiltonian the action variable $I$ is independent of
the angle variable $\phi$. The time dependence of the angle variable $\phi$ directly determines the frequency of the
periodic motion without recourse to calculation of the detailed trajectory of the periodic motion.
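As a hedged illustration of the action variable, the sketch below evaluates the closed phase-space integral numerically for a one-dimensional harmonic oscillator and then recovers the angular frequency from the rate of change of energy with action. The oscillator, the default parameter values, and the function names are assumptions made for the example; the closed orbit is parametrized as $q = A\cos\theta$, $p = -m\omega A\sin\theta$.

```python
import math

def action_variable(E, m=1.0, omega=2.0, n=2000):
    """Return I = (1/2 pi) * closed integral of p dq for a harmonic oscillator.

    The integral is evaluated with the midpoint rule over one full cycle
    of the phase-space ellipse with energy E.
    """
    A = math.sqrt(2.0 * E / (m * omega ** 2))   # amplitude for energy E
    dtheta = 2.0 * math.pi / n
    J = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta
        p = -m * omega * A * math.sin(theta)
        dq = -A * math.sin(theta) * dtheta      # dq = (dq/dtheta) dtheta
        J += p * dq
    return J / (2.0 * math.pi)

def angular_frequency(E, dE=1e-3, m=1.0, omega=2.0):
    """Estimate the frequency as dE/dI using a centered difference."""
    dI = action_variable(E + dE, m, omega) - action_variable(E - dE, m, omega)
    return 2.0 * dE / dI
```

For this system the numerical action should agree with the analytic value $I = E/\omega$, and `angular_frequency` should return the value of `omega` that was used to build the orbit, without any explicit integration of the trajectory in time.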
Canonical perturbation theory: Canonical perturbation theory is a valuable method of handling multi-
body interactions. The adiabatic invariance of the action-angle variables provides a powerful approach for
exploiting canonical perturbation theory.
Comparison of Lagrangian and Hamiltonian formulations: The remarkable power, and intellectual
beauty, provided by use of variational principles to exploit the underlying principles of natural economy in
nature, has had a long and rich history. It has led to profound developments in many branches of theoretical
physics. However, it is noted that although the above algebraic formulations of classical mechanics have been
used for over two centuries, the important limitations of these algebraic formulations to non-linear systems
remain a challenge that still is being addressed.
It has been shown that the Lagrangian and Hamiltonian formulations represent the vector force fields,
and the corresponding equations of motion, in terms of the Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ or the action
functional $S(\mathbf{q}, \mathbf{p}, t)$ which are scalars under rotation. The Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ is related to the
action functional $S(\mathbf{q}, \mathbf{p}, t)$ by
$$S(\mathbf{q}, \mathbf{p}, t) = \int_{t_1}^{t_2} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt \tag{15.1}$$
These functions are analogous to electric potential, in that the observables are derived by taking derivatives
of the Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ or the action functional $S(\mathbf{q}, \mathbf{p}, t)$. The Lagrangian formulation is more
convenient for deriving the equations of motion for simple mechanical systems. The Hamiltonian formulation
has a greater arsenal of techniques for solving complicated problems plus it uses the canonical variables $(\mathbf{q}, \mathbf{p})$
which are the variables of choice for applications to quantum mechanics and statistical mechanics.
Workshop exercises
1. Poisson brackets are a powerful means of elucidating when observables are constant of motion and whether
two observables can be simultaneously measured with unlimited precision. Consider a spherically symmetric
Hamiltonian
$$H = \frac{1}{2m}\left( p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2 \sin^2\theta} \right) + U(r)$$
for a mass $m$ where $U(r)$ is a central potential. Use the Poisson bracket plus the time dependence to determine
the following:
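(c) Show $\{L_i, L_j\} = \epsilon_{ijk} L_k$. The following identity may be useful: $\epsilon_{ijk}\epsilon_{ilm} = \delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl}$.
(d) Show $\{L_i, L^2\} = 0$.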
$$H = \frac{\mathbf{p}^2}{2m} + \frac{1}{2} m \left( \omega_1^2 x_1^2 + \omega_2^2 x_2^2 \right)$$
What condition is satisfied if $L^2$ is a conserved quantity?
Problems
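1. Consider the motion of a particle of mass $m$ in an isotropic harmonic oscillator potential $U = \frac{1}{2} k r^2$ and take
the orbital plane to be the $x$-$y$ plane. The Hamiltonian is then
$$H \equiv S_0 = \frac{1}{2m}\left(p_x^2 + p_y^2\right) + \frac{1}{2} k \left(x^2 + y^2\right)$$
Introduce the three quantities
$$S_1 = \frac{1}{2m}\left(p_x^2 - p_y^2\right) + \frac{1}{2} k \left(x^2 - y^2\right)$$
$$S_2 = \frac{1}{m} p_x p_y + k x y$$
$$S_3 = \omega \left(x p_y - y p_x\right)$$
with $\omega = \sqrt{k/m}$. Use Poisson brackets to solve the following:
a) Show that $[S_0, S_i] = 0$ for $i = 1, 2, 3$ proving that $(S_1, S_2, S_3)$ are constants of motion.
b) Show that
$$[S_1, S_2] = 2\omega S_3$$
$$[S_2, S_3] = 2\omega S_1$$
$$[S_3, S_1] = 2\omega S_2$$
so that $(2\omega)^{-1}(S_1, S_2, S_3)$ have the same Poisson bracket relations as the components of a 3-dimensional angular
momentum.
c) Show that
$$S_0^2 = S_1^2 + S_2^2 + S_3^2$$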
2. Assume that the transformation equations between the two sets of coordinates $(q, p)$ and $(Q, P)$ are
$$Q = \ln\left(1 + q^{1/2} \cos p\right)$$
$$P = 2\left(1 + q^{1/2} \cos p\right) q^{1/2} \sin p$$
a) Assuming that $q, p$ are canonical variables, i.e. $[q, p] = 1$, show directly from the above transformation
equations that $Q, P$ are canonical variables.
b) Show that the generating function that generates this transformation between the two sets of canonical variables
is
$$F_3 = -\left[e^{Q} - 1\right]^2 \tan p$$
3. Consider a bound two-body system comprising a mass $m$ in an orbit at a distance $r$ from a mass $M$. The
attractive central force binding the two-body system is
$$\mathbf{F} = \frac{k}{r^2}\,\hat{\mathbf{r}}$$
where $k$ is negative. Use Poisson brackets to prove that the eccentricity vector $\mathbf{A} = \mathbf{p} \times \mathbf{L} + \mu k \hat{\mathbf{r}}$ is a conserved
quantity.
4. (a) Consider the case of a single mass $m$ where the Hamiltonian $H = \frac{1}{2} p^2$. Use the generating function
$S(q, P, t)$ to solve the Hamilton-Jacobi equation with the canonical transformation $q = q(Q, P)$ and $p = p(Q, P)$
and determine the equations relating the $(q, p)$ variables to the transformed coordinate and momentum
$(Q, P)$.
(b) If there is a perturbing Hamiltonian $\Delta H = \frac{1}{2} q^2$, then $P$ will not be constant. Express the transformed
Hamiltonian $\mathcal{H}$ (using the transformation given above in terms of $Q$ and $P$). Solve for $Q(t)$ and $P(t)$ and
show that the perturbed solution $q[Q(t), P(t)]$, $p[Q(t), P(t)]$ is simple harmonic.
Chapter 16
16.1 Introduction
Lagrangian and Hamiltonian mechanics have been used to determine the equations of motion for discrete
systems having a finite number of discrete variables $q_i$ where $1 \le i \le n$. There are important classes of systems
where it is more convenient to treat the system as being continuous. For example, the interatomic spacing in
solids is a few $10^{-10}$ m which is negligible compared with the size of typical macroscopic, three-dimensional
solid objects. As a consequence, for wavelengths much greater than the atomic spacing in solids, it is use-
ful to treat macroscopic crystalline lattice systems as continuous three-dimensional uniform solids, rather
than as three-dimensional discrete lattice chains. Fluid and gas dynamics are other examples of continuous
mechanical systems. Another important class of continuous systems involves the theory of fields, such as
electromagnetic fields. Lagrangian and Hamiltonian mechanics of the continua extend classical mechanics
into the advanced topic of field theory. This chapter goes beyond the scope of a typical undergraduate
classical mechanics course in order to provide a brief glimpse of how Lagrangian and Hamiltonian mechanics
can underlie advanced and important aspects of the mechanics of the continua, including field theory.
$$L = \frac{1}{2} \sum_{j=1}^{n+1} \left( m \dot{q}_j^2 - \kappa \left(q_{j-1} - q_j\right)^2 \right) \tag{16.1}$$
where the $n$ masses $m$ are attached in series to $n+1$ identical springs of length $d$ and spring constant $\kappa$. Assume
that the spring has a uniform cross-section area $A$ and length $d$. Then each spring volume element $\Delta\tau = Ad$
has a mass $m$, that is, the volume mass density $\rho = m/\Delta\tau$ or $m = \rho\,\Delta\tau$. Chapter 16.5.3 will show that the
spring constant $\kappa = EA/d$ where $E$ is Young's modulus, $A$ is the cross sectional area of the chain element, and
$d$ is the length of the element. Then the spring constant can be written as $\kappa = E\,\Delta\tau/d^2$. Therefore equation
16.1 can be expressed as a sum over volume elements $\Delta\tau = Ad$
$$L = \frac{1}{2} \sum_{j=1}^{n+1} \left( \rho \dot{q}_j^2 - E \left( \frac{q_{j-1} - q_j}{d} \right)^2 \right) \Delta\tau \tag{16.2}$$
In the limit that $n \to \infty$ and the spacing $d = dx \to 0$ then the summation in equation 16.2 can be written
as a volume integral where $x = jd$ is the distance along the linear chain and the volume element $\Delta\tau \to 0$.
Then the Lagrangian can be written as the integral over the volume element $d\tau$ rather than a summation
448 CHAPTER 16. ANALYTICAL FORMULATIONS FOR CONTINUOUS SYSTEMS
The discrete-chain coordinate $q_j(t)$ is assumed to be a continuous function $q(x, t)$ for the uniform chain. Thus
the integral form of the Lagrangian can be expressed as
$$L = \frac{1}{2} \int \left( \rho \dot{q}^2 - E \left( \frac{\partial q(x, t)}{\partial x} \right)^2 \right) d\tau = \int \mathcal{L}\, d\tau \tag{16.4}$$
where the function $\mathcal{L}$ is called the Lagrangian density defined by
$$\mathcal{L} \equiv \frac{1}{2} \left( \rho \dot{q}^2 - E \left( \frac{\partial q(x, t)}{\partial x} \right)^2 \right) \tag{16.5}$$
The variable $x$ in the Lagrangian density is not a generalized coordinate; it only serves the role of a continuous
index played previously by the index $j$. For the discrete case, each value of $j$ defined a different generalized
coordinate $q_j$. Now for each value of $x$ there is a continuous function $q(x, t)$ which is a function of both
position and time.
Lagrange's equations of motion applied to the continuous Lagrangian in equation 16.4 give
$$\rho \frac{\partial^2 q}{\partial t^2} - E \frac{\partial^2 q}{\partial x^2} = 0 \tag{16.6}$$
This is the familiar wave equation in one dimension for a longitudinal wave on the continuous chain with a
phase velocity
$$v_\phi = \sqrt{\frac{E}{\rho}} \tag{16.7}$$
The continuous linear chain also can exhibit transverse modes which have a Lagrangian density where the
Young's modulus $E$ is replaced by the tension $T$ in the chain, and $\rho$ is replaced by the linear mass density $\mu$
of the chain, leading to a phase velocity for a transverse wave $v_\phi = \sqrt{T/\mu}$.
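The continuum limit can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the standard dispersion relation for a uniform discrete chain, $\omega = 2\sqrt{\kappa/m}\,|\sin(kd/2)|$, which is not derived in this section, together with the relations between the spring constant, Young's modulus, mass, and density quoted above. The material values and function names are hypothetical, chosen only to be of a steel-like order of magnitude.

```python
import math

def continuum_velocity(youngs_modulus, density):
    """Phase velocity sqrt(E / rho) of the continuous chain."""
    return math.sqrt(youngs_modulus / density)

def discrete_phase_velocity(youngs_modulus, density, spacing, wavelength):
    """Phase velocity omega/k for a discrete chain with element length `spacing`.

    The spring constant and mass per element follow from
    kappa = E * A / d and m = rho * A * d; the area A cancels in omega.
    """
    area = 1.0e-4                                  # arbitrary, cancels out
    kappa = youngs_modulus * area / spacing
    mass = density * area * spacing
    k = 2.0 * math.pi / wavelength
    omega = 2.0 * math.sqrt(kappa / mass) * abs(math.sin(k * spacing / 2.0))
    return omega / k

# Hypothetical steel-like values (assumptions for illustration only).
E_MODULUS = 2.0e11      # Pa
DENSITY = 7.8e3         # kg/m^3
WAVELENGTH = 1.0e-2     # m

v_continuum = continuum_velocity(E_MODULUS, DENSITY)
v_coarse = discrete_phase_velocity(E_MODULUS, DENSITY, 1.0e-3, WAVELENGTH)
v_atomic = discrete_phase_velocity(E_MODULUS, DENSITY, 2.5e-10, WAVELENGTH)
```

With a coarse spacing the discrete chain is dispersive and `v_coarse` falls below the continuum value, whereas for a spacing of atomic size `v_atomic` is indistinguishable from `v_continuum`, which is the sense in which the continuous description is valid for wavelengths much greater than the spacing.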
Following the same approach used in chapter 5.2, it is assumed that the stationary path for the action
integral is described by the function $q(x, t)$. Define a neighboring function using a parametric representation
$q(x, t; \epsilon)$ such that when $\epsilon = 0$, the extremum function $q = q(x, t; 0)$ yields the stationary action integral $S$.
16.3. THE LAGRANGIAN DENSITY FORMULATION FOR CONTINUOUS SYSTEMS 449
Assume that an infinitesimal fraction $\epsilon$ of a neighboring function $\eta(x, t)$ is added to the extremum path
$q(x, t; 0)$. That is, assume
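Then Hamilton's principle requires that the action integral be a stationary function value for $\epsilon = 0$, that is,
$S(\epsilon)$ is independent of $\epsilon$ which is satisfied if
$$\frac{\partial S(\epsilon)}{\partial \epsilon} = \int_{t_1}^{t_2} \int_{x_1}^{x_2} \left( \frac{\partial \mathcal{L}}{\partial q}\frac{\partial q}{\partial \epsilon} + \frac{\partial \mathcal{L}}{\partial \dot{q}}\frac{\partial \dot{q}}{\partial \epsilon} + \frac{\partial \mathcal{L}}{\partial q'}\frac{\partial q'}{\partial \epsilon} \right) dx\, dt = 0 \tag{16.13}$$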
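Since the auxiliary function $\eta(x, t)$ is arbitrary, then the integrand term in the square brackets of equation
16.19 must equal zero. That is,
$$\frac{\partial}{\partial t}\left( \frac{\partial \mathcal{L}}{\partial \dot{q}} \right) + \frac{\partial}{\partial x}\left( \frac{\partial \mathcal{L}}{\partial q'} \right) - \frac{\partial \mathcal{L}}{\partial q} = 0 \tag{16.20}$$
Equation 16.20 gives the equations of motion in terms of the Lagrangian density that has been derived
based on Hamilton's principle.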
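As a brief worked check, the equation of motion for a Lagrangian density can be applied to the density of the uniform linear chain to recover the one-dimensional wave equation. The symbols used here are assumptions where the extracted text is ambiguous: $E$ denotes Young's modulus, $\rho$ the volume mass density, $\dot{q} = \partial q/\partial t$, and $q' = \partial q/\partial x$.

```latex
\begin{align*}
\mathcal{L} &= \tfrac{1}{2}\left(\rho\,\dot{q}^{2} - E\,q'^{2}\right), \\
\frac{\partial \mathcal{L}}{\partial \dot{q}} &= \rho\,\dot{q}, \qquad
\frac{\partial \mathcal{L}}{\partial q'} = -E\,q', \qquad
\frac{\partial \mathcal{L}}{\partial q} = 0, \\
\frac{\partial}{\partial t}\!\left(\rho\,\dot{q}\right)
 + \frac{\partial}{\partial x}\!\left(-E\,q'\right) - 0 &= 0
\quad\Longrightarrow\quad
\rho\,\frac{\partial^{2} q}{\partial t^{2}} - E\,\frac{\partial^{2} q}{\partial x^{2}} = 0 .
\end{align*}
```

The result is the wave equation with phase velocity $\sqrt{E/\rho}$, assuming that $\rho$ and $E$ are uniform along the chain so that they can be taken outside the derivatives.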
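equations of motion in terms of the Lagrangian density for three spatial dimensions involves the straightforward
addition of the $y$ and $z$ coordinates. That is, in three dimensions the vector displacement is expressed
by the vector $\mathbf{q}(x, y, z, t)$ and the Lagrangian density is related to the Lagrangian by integration over three
dimensions. That is, they are related by the equation
$$L = \int \mathcal{L}\left(\mathbf{q}, \frac{\partial \mathbf{q}}{\partial t}, \nabla \cdot \mathbf{q}, x, y, z, t\right) d\tau \tag{16.21}$$
where, in cartesian coordinates, the volume element $d\tau = dx\,dy\,dz$. The Lagrangian density is a function
$\mathcal{L}(\mathbf{q}, \partial\mathbf{q}/\partial t, \nabla \cdot \mathbf{q}, x, y, z, t)$ where the one field quantity $q(x, t)$ has been extended to a spatial vector $\mathbf{q}(x, y, z, t)$
and the spatial derivatives $q'$ have been transformed into $\nabla \cdot \mathbf{q}$. Applying the method used for the
one-dimensional spatial system, to the three-dimensional system, leads to the following set of equations of motion
$$\frac{\partial}{\partial t}\left( \frac{\partial \mathcal{L}}{\partial\left(\partial \mathbf{q}/\partial t\right)} \right) + \frac{\partial}{\partial x}\left( \frac{\partial \mathcal{L}}{\partial\left(\partial \mathbf{q}/\partial x\right)} \right) + \frac{\partial}{\partial y}\left( \frac{\partial \mathcal{L}}{\partial\left(\partial \mathbf{q}/\partial y\right)} \right) + \frac{\partial}{\partial z}\left( \frac{\partial \mathcal{L}}{\partial\left(\partial \mathbf{q}/\partial z\right)} \right) - \frac{\partial \mathcal{L}}{\partial \mathbf{q}} = 0 \tag{16.22}$$
where the spatial derivatives have been written explicitly for clarity.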
Note that the equations of motion, equation 16.22, treat the spatial and time coordinates symmetrically.
This symmetry between space and time is unchanged by multiplying the spatial and time coordinate by
arbitrary numerical factors. This suggests the possibility of introducing a four-dimensional coordinate system
≡ { }
where the parameter is freely chosen. Using this 4-dimensional formalism allows equation 1622 to be
written more compactly as ⎛ ⎞
X4
⎝ L ⎠ L
q
− =0 (16.23)
q
As discussed in chapter 17 relativistic mechanics treats time and space symmetrically, that is, a four-
dimensional vector q ( ) can be used that treats time and the three spatial dimensions symmetrically
and equally. This four-dimensional space-time formulation allows the first four terms in equation 1622 to be
condensed into a single term which illustrates the symmetry underlying equation 1623. If the Lagrangian
density is Lorentz invariant, and if = then equation 1623 is covariant. Thus the Lagrangian density
formulation is ideally suited to the development of relativistically covariant descriptions of fields.
Chapter 16.3 illustrates, in general terms, how field theory can be expressed in a Lagrangian formulation
via use of the Lagrange density. It is equally possible to obtain a Hamiltonian formulation for continuous
systems analogous to that obtained for discrete systems. As summarized in chapter 8, the Hamiltonian
and Hamilton's canonical equations of motion are related directly to the Lagrangian by use of a Legendre
transformation. The Hamiltonian is defined as being
$$H \equiv \sum_i \left( \dot{q}_i \frac{\partial L}{\partial \dot{q}_i} \right) - L \tag{16.24}$$
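In the limit that the coordinates are continuous, then the summation in equation 16.26 can be
transformed into a volume integral over the Lagrangian density $\mathcal{L}$. In addition, a momentum density can be
represented by the vector field π where
$$\boldsymbol{\pi} \equiv \frac{\partial \mathcal{L}}{\partial \dot{\mathbf{q}}} \tag{16.27}$$
Then the obvious definition of the Hamiltonian density $\mathcal{H}$ is
$$H = \int \mathcal{H}\, d\tau = \int \left( \boldsymbol{\pi} \cdot \dot{\mathbf{q}} - \mathcal{L} \right) d\tau \tag{16.28}$$
$$\mathcal{H} = \boldsymbol{\pi} \cdot \dot{\mathbf{q}} - \mathcal{L} \tag{16.29}$$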
Unfortunately the Hamiltonian density formulation does not treat space and time symmetrically making
it more difficult to develop relativistically covariant descriptions of fields. Hamilton’s principle can be used
to derive the Hamilton equations of motion in terms of the Hamiltonian density analogous to the approach
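used to derive the Lagrangian density equations of motion. As described in Classical Mechanics, 2nd edition,
by Goldstein, the resultant Hamilton equations of motion for one dimension are
$$\frac{\partial \mathcal{H}}{\partial \pi} = \dot{q} \tag{16.30}$$
$$\frac{\partial \mathcal{H}}{\partial q} - \frac{\partial}{\partial x}\left( \frac{\partial \mathcal{H}}{\partial q'} \right) = -\dot{\pi} \tag{16.31}$$
$$\frac{\partial \mathcal{H}}{\partial t} = -\frac{\partial \mathcal{L}}{\partial t} \tag{16.32}$$
Note that equation 16.31 differs from that for discontinuous systems.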
The diagonal first term is the dilation term which corresponds to changes in the volume with no changes
in shape. The off-diagonal second term involves the shear terms that correspond to changes of the shape of
the body that also changes the volume. The constants $\lambda$ and $\mu$ are Lamé's moduli of elasticity which are
positive. The various moduli of elasticity, corresponding to different distortions in the shape and volume of
any solid body, can be derived from Lamé's moduli for the material.
The components of the elastic forces can be derived from the gradient of the elastic potential energy,
equation 16.42, by use of Gauss' law plus vector differential calculus. The components of the elastic force,
derived from the strain tensor σ, can be associated with the corresponding components of the stress tensor
T. Thus, for homogeneous isotropic linear materials, the components of the stress tensor are related to the
strain tensor by the relation
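$$T_{ij} = \lambda \delta_{ij} \sum_k \frac{\partial \xi_k}{\partial x_k} + \mu \left( \frac{\partial \xi_i}{\partial x_j} + \frac{\partial \xi_j}{\partial x_i} \right) = \lambda \delta_{ij} \sum_k \sigma_{kk} + 2\mu \sigma_{ij} \tag{16.43}$$
where it has been assumed that $\sigma_{ij} = \sigma_{ji}$. The two moduli of elasticity $\lambda$ and $\mu$ are material-dependent
constants. Equation 16.43 can be written in tensor notation as
$$\mathbf{T} = \lambda\, \mathrm{tr}(\boldsymbol{\sigma})\, \mathbf{I} + 2\mu\, \boldsymbol{\sigma} \tag{16.44}$$
where $\mathrm{tr}(\boldsymbol{\sigma})$ is the trace of the strain tensor and $\mathbf{I}$ is the identity matrix.
Equation 16.44 can be inverted to give the strain tensor components in terms of the stress tensor components.
$$\sigma_{ij} = \frac{1}{2\mu} \left[ T_{ij} - \frac{\lambda}{(3\lambda + 2\mu)} \delta_{ij} \sum_k T_{kk} \right] \tag{16.45}$$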
The various moduli of elasticity relate combinations of different stress and strain tensor components. The
following five elastic moduli are used frequently to describe elasticity in homogeneous isotropic media, and
all are related to Lamé’s two moduli of elasticity.
1) Young's modulus $E$ describes tensile elasticity which is axial stiffness of the length of a body to
deformation along the axis of the applied tensile force.
2) Bulk modulus $B$ defines the relative dilation or compression of a body's volume to pressure
applied uniformly in all directions.
$$B = \lambda + \frac{2}{3}\mu \tag{16.47}$$
The bulk modulus is an extension of Young's modulus to three dimensions and typically is larger than $E$.
The inverse of the bulk modulus is called the compressibility of the material.
3) Shear modulus $G$ describes the shear stiffness of a body to volume-preserving shear deformations.
The shear strain becomes a deformation angle given by the ratio of the displacement along the axis of the
shear force and the perpendicular moment arm. The shear modulus equals Lamé's constant $\mu$. That is,
$$G = \mu \tag{16.48}$$
4) Poisson's ratio $\nu$ is the negative ratio of the transverse to axial strain. It is a measure of the volume
conserving tendency of a body to contract in the directions perpendicular to the axis along which it is
stretched. In terms of Lamé's constants, Poisson's ratio equals
$$\nu = \frac{\lambda}{2(\lambda + \mu)} \tag{16.49}$$
Note that for a stable, isotropic elastic material, Poisson's ratio is bounded between $-1.0 \le \nu \le 0.5$ to ensure
that the $B$ and $G$ moduli have positive values. At the incompressible limit, $\nu = 0.5$, and the bulk modulus
and Lamé parameter $\lambda$ are infinite, that is, the compressibility is zero. Typical solids have Poisson's ratios
of $\nu \approx 0.05$ if hard and $\nu = 0.25$ if soft.
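The relations between Lamé's two moduli and the commonly used elastic moduli can be collected in a few lines. This is a minimal sketch: the bulk modulus, shear modulus, and Poisson's ratio follow the expressions quoted above, while the Young's modulus expression $E = \mu(3\lambda + 2\mu)/(\lambda + \mu)$ is the standard isotropic result and is an assumption here, since its equation does not survive in this section. The function and key names are illustrative.

```python
def moduli_from_lame(lam, mu):
    """Return the elastic moduli of an isotropic solid from Lame's moduli.

    lam, mu : Lame's first parameter and the shear parameter (same units).
    """
    if lam + mu == 0:
        raise ValueError("lam + mu must be non-zero")
    return {
        "young": mu * (3.0 * lam + 2.0 * mu) / (lam + mu),  # standard result
        "bulk": lam + 2.0 * mu / 3.0,
        "shear": mu,
        "poisson": lam / (2.0 * (lam + mu)),
    }

# Example with equal Lame moduli in arbitrary units.
example = moduli_from_lame(1.0, 1.0)
```

For equal Lamé moduli the sketch gives a Poisson's ratio of 0.25, and the returned values satisfy the usual consistency relations between Young's modulus, the shear modulus, the bulk modulus, and Poisson's ratio.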
The stiffness of elastic solids in terms of the elastic moduli of solids can be complicated due to the
geometry and composition of solid bodies. Often it is more convenient to express the stiffness in terms of
the spring constant $\kappa$ where
$$\kappa = \frac{EA}{L} \tag{16.50}$$
The spring constant is inversely proportional to the length $L$ of the spring because the strain of the material
is defined to be the fractional deformation, not the absolute deformation.
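That is, the inner product of the del operator, ∇, and the rank-2 stress tensor T, gives the vector force
density f. This force acting on the enclosed mass for the closed volume leads to an acceleration $\partial^2 \boldsymbol{\xi}/\partial t^2$.
Thus
$$\mathbf{F} = \oint \mathbf{T} \cdot d\mathbf{A} = \int \nabla \cdot \mathbf{T}\, d\tau = \int \rho \frac{\partial^2 \boldsymbol{\xi}}{\partial t^2}\, d\tau \tag{16.52}$$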
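Using equation 16.44 to relate the stress tensor T to the moduli of elasticity gives
$$\rho \frac{\partial^2 \xi_i}{\partial t^2} = \sum_j \left[ (\lambda + \mu) \frac{\partial^2 \xi_j}{\partial x_i \partial x_j} + \mu \frac{\partial^2 \xi_i}{\partial x_j^2} \right] \tag{16.53}$$
where $i = 1, 2, 3$. In general this equation is difficult to solve. However, for the simple case of a plane wave
in the $i = 1$ direction, the problem reduces to the following three equations
$$\rho \frac{\partial^2 \xi_1}{\partial t^2} = (\lambda + 2\mu) \frac{\partial^2 \xi_1}{\partial x_1^2} \tag{16.54}$$
$$\rho \frac{\partial^2 \xi_2}{\partial t^2} = \mu \frac{\partial^2 \xi_2}{\partial x_1^2} \tag{16.55}$$
$$\rho \frac{\partial^2 \xi_3}{\partial t^2} = \mu \frac{\partial^2 \xi_3}{\partial x_1^2} \tag{16.56}$$
Equation 16.54 corresponds to a longitudinal wave travelling with velocity $v_L = \sqrt{(\lambda + 2\mu)/\rho}$. Equations
16.55 and 16.56 correspond to two perpendicular transverse waves travelling with velocity $v_T = \sqrt{\mu/\rho}$. This
illustrates the important fact that longitudinal waves travel faster than transverse waves in an elastic solid.
Seismic waves in the Earth, generated by earthquakes, exhibit this property. Note that shearing stresses do
not exist in ideal liquids and gases since they cannot maintain shear forces and thus $\mu = 0$.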
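The two wave speeds can be compared with a short calculation. This is a sketch under stated assumptions: the Lamé moduli and density below are hypothetical rock-like values chosen only for illustration, and the function names are not taken from the text.

```python
import math

def longitudinal_velocity(lam, mu, rho):
    """Speed of the longitudinal wave, sqrt((lam + 2 mu) / rho)."""
    return math.sqrt((lam + 2.0 * mu) / rho)

def transverse_velocity(mu, rho):
    """Speed of the transverse (shear) wave, sqrt(mu / rho)."""
    return math.sqrt(mu / rho)

# Hypothetical rock-like values (assumptions, not data from the text).
LAM = 3.0e10   # Pa
MU = 3.0e10    # Pa
RHO = 2.7e3    # kg/m^3

v_long = longitudinal_velocity(LAM, MU, RHO)
v_trans = transverse_velocity(MU, RHO)
speed_ratio = v_long / v_trans   # equals sqrt((lam + 2 mu) / mu)
```

Since both Lamé moduli are positive, the ratio always exceeds $\sqrt{2}$, so the longitudinal wave arrives first. Setting `mu` to zero reproduces the statement that an ideal fluid supports no transverse wave.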
16.6. ELECTROMAGNETIC FIELD THEORY 455
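$$\mathbf{f} = (\rho \mathbf{E} + \mathbf{J} \times \mathbf{B}) \tag{16.58}$$
Maxwell's equations
$$\rho = \epsilon_0 \nabla \cdot \mathbf{E} \qquad \mathbf{J} = \frac{1}{\mu_0} \nabla \times \mathbf{B} - \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} \tag{16.59}$$
can be used to eliminate the charge and current densities in equation 16.57
$$\mathbf{f} = \epsilon_0 (\nabla \cdot \mathbf{E})\, \mathbf{E} + \left( \frac{1}{\mu_0} \nabla \times \mathbf{B} - \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} \right) \times \mathbf{B} \tag{16.60}$$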
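Vector calculus gives that
$$\frac{\partial}{\partial t}(\mathbf{E} \times \mathbf{B}) = \frac{\partial \mathbf{E}}{\partial t} \times \mathbf{B} + \mathbf{E} \times \frac{\partial \mathbf{B}}{\partial t} \tag{16.61}$$
while Faraday's law gives
$$\frac{\partial \mathbf{B}}{\partial t} = -\nabla \times \mathbf{E} \tag{16.62}$$
Equation 16.62 allows equation 16.61 to be rewritten as
$$\frac{\partial \mathbf{E}}{\partial t} \times \mathbf{B} = \frac{\partial}{\partial t}(\mathbf{E} \times \mathbf{B}) - \mathbf{E} \times \frac{\partial \mathbf{B}}{\partial t} = \frac{\partial}{\partial t}(\mathbf{E} \times \mathbf{B}) + \mathbf{E} \times (\nabla \times \mathbf{E}) \tag{16.63}$$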
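Equation 16.63 can be inserted into equation 16.60. In addition, a term $\frac{1}{\mu_0}(\nabla \cdot \mathbf{B})\, \mathbf{B}$ can be added since
$\nabla \cdot \mathbf{B} = 0$ which allows equation 16.60 to be written in the symmetric form
$$\mathbf{f} = \epsilon_0 (\nabla \cdot \mathbf{E})\, \mathbf{E} + \frac{1}{\mu_0} (\nabla \cdot \mathbf{B})\, \mathbf{B} + \frac{1}{\mu_0} (\nabla \times \mathbf{B}) \times \mathbf{B} - \epsilon_0 \frac{\partial \mathbf{E}}{\partial t} \times \mathbf{B} \tag{16.64}$$
$$= \epsilon_0 (\nabla \cdot \mathbf{E})\, \mathbf{E} + \frac{1}{\mu_0} (\nabla \cdot \mathbf{B})\, \mathbf{B} + \frac{1}{\mu_0} (\nabla \times \mathbf{B}) \times \mathbf{B} - \epsilon_0 \frac{\partial}{\partial t}(\mathbf{E} \times \mathbf{B}) - \epsilon_0 \mathbf{E} \times (\nabla \times \mathbf{E}) \tag{16.65}$$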
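Using the vector identity
$$\nabla (\mathbf{A} \cdot \mathbf{B}) = \mathbf{A} \times (\nabla \times \mathbf{B}) + \mathbf{B} \times (\nabla \times \mathbf{A}) + (\mathbf{A} \cdot \nabla)\, \mathbf{B} + (\mathbf{B} \cdot \nabla)\, \mathbf{A} \tag{16.66}$$
Let A = B = E then
$$\nabla \left( E^2 \right) = 2 \mathbf{E} \times (\nabla \times \mathbf{E}) + 2 (\mathbf{E} \cdot \nabla)\, \mathbf{E} \tag{16.67}$$
That is
$$\mathbf{E} \times (\nabla \times \mathbf{E}) = \frac{1}{2} \nabla \left( E^2 \right) - (\mathbf{E} \cdot \nabla)\, \mathbf{E} \tag{16.68}$$
Similarly
$$\mathbf{B} \times (\nabla \times \mathbf{B}) = \frac{1}{2} \nabla \left( B^2 \right) - (\mathbf{B} \cdot \nabla)\, \mathbf{B} \tag{16.69}$$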
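Inserting equations 16.68 and 16.69 into equation 16.65 gives
$$\mathbf{f} = \epsilon_0 \left[ (\nabla \cdot \mathbf{E})\, \mathbf{E} + (\mathbf{E} \cdot \nabla)\, \mathbf{E} - \frac{1}{2} \nabla E^2 \right] + \frac{1}{\mu_0} \left[ (\nabla \cdot \mathbf{B})\, \mathbf{B} + (\mathbf{B} \cdot \nabla)\, \mathbf{B} - \frac{1}{2} \nabla B^2 \right] - \epsilon_0 \frac{\partial}{\partial t}(\mathbf{E} \times \mathbf{B}) \tag{16.70}$$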
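This complicated formula can be simplified by defining the rank-2 Maxwell stress tensor T which has
components
$$T_{ij} \equiv \epsilon_0 \left( E_i E_j - \frac{1}{2} \delta_{ij} E^2 \right) + \frac{1}{\mu_0} \left( B_i B_j - \frac{1}{2} \delta_{ij} B^2 \right) \tag{16.71}$$
The inner product of the del operator and the Maxwell stress tensor is a vector with components of
$$(\nabla \cdot \mathbf{T})_j = \epsilon_0 \left[ (\nabla \cdot \mathbf{E})\, E_j + (\mathbf{E} \cdot \nabla)\, E_j - \frac{1}{2} \nabla_j E^2 \right] + \frac{1}{\mu_0} \left[ (\nabla \cdot \mathbf{B})\, B_j + (\mathbf{B} \cdot \nabla)\, B_j - \frac{1}{2} \nabla_j B^2 \right] \tag{16.72}$$
The above definition of the Maxwell stress tensor, plus the Poynting vector $\mathbf{S} = \frac{1}{\mu_0} (\mathbf{E} \times \mathbf{B})$, allows the force
density equation 16.58 to be written in the form
$$\mathbf{f} = \nabla \cdot \mathbf{T} - \epsilon_0 \mu_0 \frac{\partial \mathbf{S}}{\partial t} \tag{16.73}$$
The divergence theorem allows the total force, acting on the volume, to be written in the form
$$\mathbf{F} = \int \left( \nabla \cdot \mathbf{T} - \epsilon_0 \mu_0 \frac{\partial \mathbf{S}}{\partial t} \right) d\tau \tag{16.74}$$
$$= \oint \mathbf{T} \cdot d\mathbf{a} - \epsilon_0 \mu_0 \frac{d}{dt} \int \mathbf{S}\, d\tau \tag{16.75}$$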
Note that, if the Poynting vector is time independent, then the second term in equation 16.75 is zero and the
Maxwell stress tensor T is the force per unit area (stress) acting on the surface. The fact that T is a rank-2
tensor is apparent since the stress represents the ratio of the force-density vector f and the infinitesimal
area vector $d\mathbf{a}$, which do not necessarily point in the same directions.
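Then equation 16.76 implies that the total momentum flux density $\boldsymbol{\pi} = \boldsymbol{\pi}_{mech} + \boldsymbol{\pi}_{field}$ is related to Maxwell's
stress tensor by
$$\frac{\partial}{\partial t} \left( \boldsymbol{\pi}_{mech} + \boldsymbol{\pi}_{field} \right) = \nabla \cdot \mathbf{T} \tag{16.79}$$
That is, like the elasticity stress tensor, the divergence of Maxwell's stress tensor T equals the rate of change
of the total momentum density, that is, −T is the momentum flux density.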
This discussion of the Maxwell stress tensor and its relation to momentum in the electromagnetic field
illustrates the role that analytical formulations of classical mechanics can play in field theory.
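The component form of the Maxwell stress tensor lends itself to a direct numerical check. The following is a minimal sketch in SI units, assuming uniform field vectors supplied as three-component lists; the function names and the sample field values are illustrative assumptions. It evaluates the tensor components, the electromagnetic energy density, and the Poynting vector.

```python
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
MU0 = 1.25663706212e-6    # vacuum permeability, H/m

def maxwell_stress_tensor(E, B):
    """Return the 3x3 tensor T_ij = eps0 (E_i E_j - delta_ij E^2 / 2)
    + (1/mu0) (B_i B_j - delta_ij B^2 / 2)."""
    E2 = sum(c * c for c in E)
    B2 = sum(c * c for c in B)
    return [
        [
            EPS0 * (E[i] * E[j] - 0.5 * (i == j) * E2)
            + (B[i] * B[j] - 0.5 * (i == j) * B2) / MU0
            for j in range(3)
        ]
        for i in range(3)
    ]

def energy_density(E, B):
    """Electromagnetic energy density (eps0 E^2 + B^2 / mu0) / 2."""
    return 0.5 * (EPS0 * sum(c * c for c in E) + sum(c * c for c in B) / MU0)

def poynting_vector(E, B):
    """Poynting vector S = (E x B) / mu0."""
    return [
        (E[1] * B[2] - E[2] * B[1]) / MU0,
        (E[2] * B[0] - E[0] * B[2]) / MU0,
        (E[0] * B[1] - E[1] * B[0]) / MU0,
    ]

# Illustrative (assumed) uniform fields: E along x, B along z.
E_field = [1.0e3, 0.0, 0.0]      # V/m
B_field = [0.0, 0.0, 2.0e-6]     # T

T = maxwell_stress_tensor(E_field, B_field)
trace_T = T[0][0] + T[1][1] + T[2][2]
u = energy_density(E_field, B_field)
S = poynting_vector(E_field, B_field)
```

Two properties follow directly from the definition and serve as sanity checks: the tensor is symmetric, and its trace equals minus the energy density. For a purely electric field along one axis the diagonal component along the field is positive while the two transverse components are negative and equal in magnitude.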
16.7. IDEAL FLUID DYNAMICS 457
Mass conservation must hold for any arbitrary volume, therefore the continuity equation can be written in
the differential form
$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0 \tag{16.82}$$
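$$\frac{\partial \mathbf{v}}{\partial x} dx + \frac{\partial \mathbf{v}}{\partial y} dy + \frac{\partial \mathbf{v}}{\partial z} dz = (d\mathbf{r} \cdot \nabla)\, \mathbf{v} \tag{16.85}$$
Thus
$$d\mathbf{v} = \frac{\partial \mathbf{v}}{\partial t} dt + (d\mathbf{r} \cdot \nabla)\, \mathbf{v} \tag{16.86}$$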
Dividing both sides by $dt$ gives that the acceleration of the atoms in the fluid equals
$$\frac{d\mathbf{v}}{dt} = \frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v} \cdot \nabla)\, \mathbf{v} \tag{16.87}$$
Substituting equation 16.87 into 16.84 gives
$$\frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v} \cdot \nabla)\, \mathbf{v} = -\frac{1}{\rho} \nabla P \tag{16.88}$$
This is Euler's equation for hydrodynamics. The two terms on the left represent the acceleration in the
individual fluid components while the right-hand side lists the force density producing the acceleration.
Additional forces can be added to the right-hand side. For example, the gravitational force density $\mathbf{f}_g$
can be expressed in terms of the gravitational scalar potential $\phi$ to be
$$\mathbf{f}_g = -\rho \nabla \phi \tag{16.89}$$
v = −∇ (16.91)
This scalar potential field can be used to derive the vector velocity field for irrotational flow.
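Note that the (v · ∇) v term in Euler's equation (16.90) can be rewritten using the vector identity
$$(\mathbf{v} \cdot \nabla)\, \mathbf{v} = \frac{1}{2} \nabla \left( v^2 \right) - \mathbf{v} \times \nabla \times \mathbf{v} \tag{16.92}$$
Inserting equation 16.92 into Euler's equation 16.90 then gives
$$\frac{\partial \mathbf{v}}{\partial t} = \mathbf{v} \times \nabla \times \mathbf{v} - \frac{1}{\rho} \nabla \left( \frac{1}{2} \rho v^2 + P + \rho \phi \right) \tag{16.93}$$
Potential flow corresponds to time independent irrotational flow, that is, both $\partial \mathbf{v}/\partial t = 0$ and $\nabla \times \mathbf{v} = 0$. For
potential flow equation 16.93 reduces to
$$\nabla \left( \frac{1}{2} \rho v^2 + P + \rho \phi \right) = 0$$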
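The vanishing gradient for potential flow means that the quantity inside the bracket has the same value everywhere in the flow. The sketch below uses that to relate the pressure at two points. It assumes an incompressible fluid of uniform density, writes the bracketed quantity as $\frac{1}{2}\rho v^2 + P + \rho\phi$ with $\phi$ the gravitational potential per unit mass, and uses hypothetical water-like numbers; the function names are illustrative.

```python
def bernoulli_constant(rho, speed, pressure, phi):
    """Return (1/2) rho v^2 + P + rho phi at one point of the flow."""
    return 0.5 * rho * speed ** 2 + pressure + rho * phi

def downstream_pressure(rho, v1, p1, phi1, v2, phi2):
    """Pressure at point 2 given the state at point 1 and the speed at point 2.

    Assumes steady, irrotational, incompressible flow so that the
    Bernoulli constant is the same at both points.
    """
    constant = bernoulli_constant(rho, v1, p1, phi1)
    return constant - 0.5 * rho * v2 ** 2 - rho * phi2

# Hypothetical example: water speeding up from 1 m/s to 2 m/s at fixed height.
RHO_WATER = 1000.0      # kg/m^3
p2 = downstream_pressure(RHO_WATER, 1.0, 101325.0, 0.0, 2.0, 0.0)
```

In this example the pressure drops by $\frac{1}{2}\rho\,(v_2^2 - v_1^2) = 1500$ Pa where the flow speeds up, which is the familiar behavior in a constriction.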
The Navier-Stokes equations are nonlinear due to the (v · ∇) v term as well as being a function of
velocity. This non-linearity leads to a wide spectrum of dynamic behavior ranging from ordered laminar
flow to chaotic turbulence. Numerical solution of the Navier-Stokes equations is extremely difficult because
of the wide dynamic range of the dimensions of the coherent structures involved in turbulent motion. For
example, simulation calculations require use of a high resolution mesh which is a challenge to the capabilities
of current generation computers.
The microscopic boundary condition at the interface of the solid and fluid is that the fluid molecules
have zero average tangential velocity relative to the normal to the solid-fluid interface. This implies that
there is a boundary layer for which there is a gradient in the tangential velocity of the fluid between the
solid-fluid interface and the free-stream velocity. This velocity gradient produces vorticity in the fluid. When
the viscous forces are negligible then the angular momentum in any coherent vortex structure is conserved
leading to the vortex motion being preserved as it propagates.
$$\mathrm{Re} \equiv \frac{\text{Inertial forces}}{\text{Viscous forces}} = \frac{\rho v L}{\mu} = \frac{v L}{\nu} \tag{16.99}$$
varies inversely with Re leading to the drag forces that are roughly linear with velocity as described in chapter
2.10.5. The size and velocities of raindrops in a light rain shower correspond to such Reynolds numbers.
B) For 10 Re 30 the flow has two turbulent vortices immediately behind the body in the wake of
the cylinder, but the flow still is primarily laminar as illustrated.
C) For 40 Re 250 the pair of vortices peel off alternately producing a regular periodic sequence of
vortices although the flow still is laminar. This vortex sheet is called a von Kármán vortex sheet for which
the velocity at a given position, relative to the cylinder, is time dependent in contrast to the situation at
lower Reynolds numbers.
D) For 103 Re 105 viscous forces are negligible relative to the inertial effects of the vortices and
boundary-layer vortices have less time to diffuse into the larger region of the fluid, thus the boundary layer is
thinner. The boundary-layer flow exhibits a small scale chaotic turbulence in three dimensions superimposed
on regular alternating vortex structures. In this range $C_D$ is roughly constant and thus the drag forces are
proportional to the square of the velocity. This regime of Reynolds numbers corresponds to typical velocities
of moving automobiles.
E) For $Re \approx 10^6$, which is typical of a flying aircraft, the inertial effects dominate except in the narrow
boundary layer close to the solid-fluid interface. The chaotic region works its way further forward on the
cylinder, reducing the volume of the chaotic turbulent boundary layer, which results in a significant decrease
in $C_D$. For a sailplane wing flying at about 50, the boundary layer
reduces to the order of a millimeter in thickness at the leading edge and a centimeter at the trailing edge. At
these Reynolds numbers the airflow comprises a thin boundary layer, where viscous effects are important,
plus fluid flow in the bulk of the fluid where the vortex inertial terms dominate and viscous forces can be
ignored. That is, the viscous stress tensor term ∇ · T on the right-hand side of equation 16.97 can be
ignored, and the Navier-Stokes equation reduces to the simpler Euler equation for such inviscid fluid flow.
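As a rough numerical illustration of the regimes listed above, the following Python sketch evaluates the Reynolds number in its standard form $Re = \rho v L/\mu$ and maps it onto the regime labels quoted in the text. The function names, the air properties, and the droplet size and speed are illustrative assumptions rather than values from the text. The bound $Re < 10$ for regime A is inferred from the lower bound of regime B, and values falling between the quoted ranges are simply reported as transitional.

```python
def reynolds_number(density, speed, length, viscosity):
    """Return Re = rho * v * L / mu (dimensionless), SI units assumed."""
    return density * speed * length / viscosity


def flow_regime(re):
    """Map a Reynolds number onto the regime labels A-E quoted in the text.

    The ranges are those quoted in the text; the gaps between them
    (for example 30 < Re < 40) are reported as 'transitional'.
    """
    if re < 10:
        return "A"            # assumed bound: drag roughly linear in velocity
    if re < 30:
        return "B"            # two attached vortices behind the body
    if 40 <= re <= 250:
        return "C"            # periodic von Karman vortex shedding
    if 1e3 <= re <= 1e5:
        return "D"            # C_D roughly constant, drag ~ v^2
    if re > 1e5:
        return "E"            # assumed: the text only quotes Re ~ 10^6
    return "transitional"


# Assumed illustrative values: a drizzle droplet falling in air.
AIR_DENSITY = 1.2        # kg/m^3 (assumption)
AIR_VISCOSITY = 1.8e-5   # Pa s   (assumption)
re_droplet = reynolds_number(AIR_DENSITY, 0.7, 2.0e-4, AIR_VISCOSITY)
print(round(re_droplet, 2), flow_regime(re_droplet))
```

With these assumed numbers the droplet falls in the low Reynolds number regime, in line with the remark in the text about raindrops in a light shower.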
The importance of the inertia of the vortices is illustrated by the persistence of the vortex structure
and turbulence over a wide range of length scales characteristic of turbulent flow. The dynamic range of
the dimension of coherent vortex structures is enormous. For example, in the atmosphere the vortex size
ranges from $10^5$ m in diameter for hurricanes down to $10^{-3}$ m in thin boundary layers adjacent to an aircraft
wing. The transition from laminar to turbulent flow is illustrated by water flow over the hull of a ship which
involves laminar flow at the bow followed by turbulent flow behind the bow wave and at the stern of the
ship. The broad extent of the white foam of seawater along the side and the stern of a ship illustrates the
considerable energy dissipation produced by the turbulence. The boundary layer of a stalled aircraft wing
is another example. At a high angle of attack, the airflow on the lower surface of the wing remains laminar,
that is, the stream velocity profile, relative to the wing, increases smoothly from zero at the wing surface
outwards until it meets the ambient air velocity on the outer surface of the boundary layer, which is of the order
of a millimeter thick. The flow on the top surface of the wing initially is laminar before becoming turbulent
at which point the boundary layer rapidly increases in thickness. Further back the airflow detaches from
the wing surface and large-scale vortex structures lead to a wide boundary layer comparable in thickness to
the chord of the wing with vortex motion that leads to the airflow reversing its direction adjacent to the
upper surface of the wing which greatly increases drag. When the vortices begin to shed off the bounded
surface they do so at a certain frequency which can cause vibrations that can lead to structural failure if the
frequency of the shedding vortices is close to the resonance frequency of the structure.
Considerable time and effort are expended by aerodynamicists and hydrodynamicists designing aircraft
wings and ship hulls to maximize the length of the laminar region of the boundary layer to minimize drag.
When the Reynolds number is large the slightest imperfections in the shape of the wing, such as a speck of
dust, can trigger the transition from laminar to turbulent flow. The boundaries between adjacent large-scale
coherent structures are sensitively identified in computer simulations by large divergence of the streamlines
at any separatrix. A large positive, finite-time, Lyapunov exponent identifies divergence of the streamlines
which occurs at a separatrix between adjacent large-scale coherent vortex structures, whereas the Lyapunov
exponents are negative for converging streamlines within any coherent structure. Computations of turbulent
flow often combine the use of finite-time Lyapunov exponents to identify coherent structures, plus Lagrangian
mechanics for the equations of motion since the Lagrangian is a scalar function, it is frame independent, and
it gives far better results for fluid motion than using Newtonian mechanics. Thus the Lagrangian approach in
the continua is used extensively for calculations in aerodynamics, hydrodynamics, and studies of atmospheric
phenomena such as convection, hurricanes, tornadoes, etc.
16.9 Summary and implications
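Hamiltonian density formulation: In the limit that the coordinates are continuous, then the Hamiltonian
can be expressed in terms of a volume integral over the momentum density $\boldsymbol{\pi}$ and the Lagrangian
density $\mathcal{L}$ where
$\boldsymbol{\pi} \equiv \frac{\partial \mathcal{L}}{\partial \dot{\mathbf{q}}}$    (16.27)
Then the obvious definition of the Hamiltonian density $\mathcal{H}$ is
$H = \int \mathcal{H}\, dV = \int (\boldsymbol{\pi} \cdot \dot{\mathbf{q}} - \mathcal{L})\, dV$    (16.28)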
Linear elastic solids: The theory of continuous systems was applied to the case of linear elastic solids.
The stress tensor T is a rank 2 tensor defined as the ratio of the force vector F and the surface element
vector A. That is, the force vector is given by the inner product of the stress tensor T and the surface
element vector A.
F = T·A    (16.33)
The strain tensor σ also is a rank 2 tensor defined as the ratio of the strain vector ξ and infinitesimal
area A
ξ = σ·A    (16.38)
where the component form of the rank 2 strain tensor is
$\sigma_{ij} = \frac{1}{2}\left(\frac{\partial \xi_i}{\partial x_j} + \frac{\partial \xi_j}{\partial x_i}\right)$    (16.39)
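The modulus of elasticity is defined as the slope of the stress-strain curve. For linear, homogeneous,
elastic matter, the potential energy density $U$ separates into diagonal and off-diagonal components of the
strain tensor
$U = \frac{1}{2}\left[\lambda \sum_i (\sigma_{ii})^2 + 2\mu \sum_{ij} (\sigma_{ij})^2\right]$    (16.42)
where the constants $\lambda$ and $\mu$ are Lamé's moduli of elasticity which are positive. The stress tensor is related
to the strain tensor by
$T_{ij} = \lambda \delta_{ij} \sum_k \frac{\partial \xi_k}{\partial x_k} + \mu\left(\frac{\partial \xi_i}{\partial x_j} + \frac{\partial \xi_j}{\partial x_i}\right) = \lambda \delta_{ij} \sum_k \sigma_{kk} + 2\mu \sigma_{ij}$    (16.43)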
464 CHAPTER 16. ANALYTICAL FORMULATIONS FOR CONTINUOUS SYSTEMS
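Electromagnetic field theory: The rank 2 Maxwell stress tensor T has components
$T_{ij} \equiv \epsilon_0\left(E_i E_j - \frac{1}{2}\delta_{ij} E^2\right) + \frac{1}{\mu_0}\left(B_i B_j - \frac{1}{2}\delta_{ij} B^2\right)$    (16.71)
The divergence theorem allows the total electromagnetic force, acting on the volume $V$, to be written as
$\mathbf{F} = \int_V \left(\nabla \cdot \mathbf{T} - \epsilon_0 \mu_0 \frac{\partial \mathbf{S}}{\partial t}\right) d\tau = \oint \mathbf{T} \cdot d\mathbf{a} - \epsilon_0 \mu_0 \frac{d}{dt}\int_V \mathbf{S}\, d\tau$    (16.74)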
Viscous fluid dynamics: For incompressible flow the stress tensor term simplifies to $\nabla \cdot \mathbf{T} = \mu \nabla^2 \mathbf{v}$. Then
the Navier-Stokes equation becomes
$\rho\left[\frac{\partial \mathbf{v}}{\partial t} + \mathbf{v} \cdot \nabla \mathbf{v}\right] = -\nabla P + \mu \nabla^2 \mathbf{v} + \mathbf{f}$    (16.98)
where $\mu \nabla^2 \mathbf{v}$ is the viscosity drag term. The left-hand side of equation 16.98 represents the rate of change
of momentum per unit volume while the right-hand side represents the summation of the forces per unit
volume that are acting.
The Reynolds number is a dimensionless number that characterizes the ratio of inertial forces to viscous
forces in a viscous medium. The evolution of flow from laminar flow to turbulent flow, with increase of
Reynolds number, was discussed.
The classical mechanics of continuous fields encompasses a remarkably broad range of phenomena with
important applications to laminar and turbulent fluid flow, gravitation, electromagnetism, relativity, and
quantum fields.
Chapter 17
Relativistic mechanics
17.1 Introduction
Newtonian mechanics incorporates the Newtonian concept of the complete separation of space and time.
This theory reigned supreme from inception, in 1687, until November 1905 when Einstein pioneered the
Special Theory of Relativity. Relativistic mechanics undermines the Newtonian concepts of absoluteness of
time that is inherent to Newton’s formulation, as well as when recast in the Lagrangian and Hamiltonian
formulations of classical mechanics. Relativistic mechanics has had a profound impact on twentieth-century
physics and the philosophy of science. Classical mechanics is an approximation of relativistic mechanics
that is valid for velocities much less than the velocity of light in vacuum. The term “relativity” refers to
the fact that physical measurements are always made relative to some chosen reference frame. Naively one
may think that the transformation between different reference frames is trivial and contains little underlying
physics. However, Einstein showed that the results of measurements depend on the choice of coordinate
system, which revolutionized our concept of space and time.
Einstein’s work on relativistic mechanics comprised two major advances. The first advance is the 1905
Special Theory of Relativity which refers to nonaccelerating frames of reference. The second major advance
was the 1916 General Theory of Relativity which considers accelerating frames of reference and their relation
to gravity. The Special Theory is a limiting case of the General Theory of Relativity. The mathematically
complex General Theory of Relativity is required for describing accelerating frames, gravity, plus related
topics like Black Holes, or extremely accurate time measurements inherent to the Global Positioning System.
The present discussion will focus primarily on the mathematically simple Special Theory of Relativity since it
encompasses most of the physics encountered in atomic, nuclear and high energy physics. This chapter uses
the basic concepts of the Special Theory of Relativity to investigate the implications of extending Newtonian,
Lagrangian and Hamiltonian formulations of classical mechanics into the relativistic domain. The Lorentz-
invariant extended Hamiltonian and Lagrangian formalisms are introduced since they are applicable to the
Special Theory of Relativity. The General Theory of Relativity incorporates the gravitational force as a
geodesic phenomenon in a four-dimensional Riemannian structure based on space, time, and matter. A
superficial outline is given to the fundamental concepts and evidence that underlie the General Theory of
Relativity.
466 CHAPTER 17. RELATIVISTIC MECHANICS
Consider two coordinate systems shown in figure 17.1, where the primed frame is moving along the
$x_1$ axis of the fixed unprimed frame. A Galilean transformation implies that the following relations apply:
$x'_1 = x_1 - vt$    (17.1)
$x'_2 = x_2$
$x'_3 = x_3$
$t' = t$
$x_2 = x'_2$
$x_3 = x'_3$
$t = \gamma\left(t' + \frac{v x'_1}{c^2}\right)$
The Lorentz factor, defined above, is the key feature 2
Figure 17.4: The observer and mirror are at rest in the left-hand frame (a). The light beam takes a time
$\Delta t = d/c$ to travel to the mirror. In the right-hand frame (b) the source and mirror are travelling at a velocity
$v$ relative to the observer. The light travels further in the right-hand frame of reference (b) than in the
stationary frame (a). Since Einstein states that the velocity of light is the same in both frames of reference,
the time interval must be larger in frame (b) since the light travels further than in (a).
There are many experimental verifications of time dilation in physics. For example, a stationary muon
has a mean lifetime of $\tau = 2$ $\mu$sec, whereas the lifetime of a fast moving muon, produced in the upper
atmosphere by high-energy cosmic rays, was observed in 1941 to be longer and given by $\gamma\tau$ as described in
example 17.1. In 1972 Hafele and Keating used four accurate cesium atomic clocks to confirm time dilation.
Two clocks were flown on regularly scheduled airlines travelling around the World, one westward and the
other eastward. The other two clocks were used for reference. The westward moving clock was slow by
(273 ± 7) nsec compared to the predicted value of (275 ± 10) nsec. The Global Positioning System of 24
geosynchronous satellites is used for locating positions to within a few meters. It has an accuracy of a few
nanoseconds which requires allowance for time dilation and is a daily tribute to the correctness of Einstein’s
Theory of Relativity.
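The muon example can be checked numerically. The sketch below assumes the standard Lorentz factor $\gamma = 1/\sqrt{1-\beta^2}$ with $\beta = v/c$ and the dilated lifetime $\gamma\tau$ quoted above. The function names and the value $\beta = 0.998$ are illustrative assumptions, not values taken from the text.

```python
import math


def lorentz_factor(beta):
    """Return gamma = 1 / sqrt(1 - beta^2) for a speed beta = v/c."""
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must satisfy |beta| < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)


def dilated_lifetime(proper_lifetime, beta):
    """Mean lifetime observed in a frame where the particle moves at beta."""
    return lorentz_factor(beta) * proper_lifetime


TAU_MUON = 2.0e-6   # seconds, the rest-frame value quoted in the text
BETA = 0.998        # assumed speed of a cosmic-ray muon, for illustration

print(lorentz_factor(BETA))              # roughly 15.8
print(dilated_lifetime(TAU_MUON, BETA))  # roughly 3.2e-5 s
```

For this assumed speed the observed lifetime is more than an order of magnitude longer than the rest-frame lifetime, which is why such muons can reach the ground.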
Since $t_2 = t_1$, the measured lengths in the two frames are related by:
17.3.5 Simultaneity
The Lorentz transformations imply a new philosophy of space and time. A surprising consequence is that
the concept of simultaneity is frame dependent in contrast to the prediction of Newtonian mechanics.
Consider that two events occur in frame $F$ at $(x_1, t_1)$ and $(x_2, t_2)$. In frame $F'$ these two events occur at
$(x'_1, t'_1)$ and $(x'_2, t'_2)$. From the Lorentz transformation the time difference is
$t'_2 - t'_1 = \gamma\left[(t_2 - t_1) - \frac{v(x_2 - x_1)}{c^2}\right]$    (17.14)
Thus events that are simultaneous in frame $F$, that is $t_2 = t_1$, are not simultaneous in frame $F'$ if
$(x_2 - x_1) = L \neq 0$. That is, events that are simultaneous in one frame are not simultaneous in the other
frame if the events are spatially separated. The equivalent statement is that two clocks, spatially separated
by a distance $L$, which are synchronized in their rest frame, are not synchronized in a moving frame.
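A short numerical sketch may help make the frame dependence of simultaneity concrete. It evaluates the Lorentz-transformed time difference $t'_2 - t'_1 = \gamma[(t_2 - t_1) - v(x_2 - x_1)/c^2]$ of equation 17.14. The function name and the sample separation and speed are assumptions chosen for illustration.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def time_difference_primed(dt, dx, v, c=C):
    """Time separation of two events in the primed frame.

    dt, dx: time and space separation of the events in the unprimed frame.
    v: velocity of the primed frame along the x_1 axis.
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (dt - v * dx / c ** 2)


# Two events simultaneous in the unprimed frame (dt = 0), 300 m apart,
# viewed from a frame moving at 0.6c (assumed values).
dt_primed = time_difference_primed(0.0, 300.0, 0.6 * C)
print(dt_primed)  # negative: the event at larger x_1 occurs first in the primed frame
```

Setting `dx = 0` returns zero for any `v`, which reflects the statement that simultaneity is only lost for spatially separated events.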
Figure 17.5: If lightning strikes the front and rear of the carriage simultaneously, according to the man in
the fixed frame, then the woman in the moving frame sees the flash from the front first since she is moving
towards that approaching wavefront during the transit time of the light. Thus if the length of the carriage
in the stationary frame is $(x_2 - x_1) = L$ then the time difference is $\Delta t' = \gamma v L / c^2$.
Einstein discussed the example shown in figure 17.5, where lightning strikes both ends of a train
simultaneously in the stationary earth frame of reference. A woman on the train will see that the strikes are
not simultaneous since the wavefront from the front of the carriage will be seen first because she is moving
forward during the time the light from the two lightning flashes is travelling towards her. As a consequence
she observes that the two lightning flashes are not simultaneous. This explains why measurement of the
length of a moving rod, performed by simultaneously locating both ends in the fixed frame, implies that the
measurement occurs at different times for both ends in the moving frame resulting in a shorter apparent
length. The lack of simultaneity explains why one can get the apparent inconsistency that the moving
bicyclist sees the stationary street block to be length contracted, while in contrast, a pedestrian sees that
the bicycle is length contracted.
The concept of causality breaks down since $(t'_2 - t'_1)$ can be either positive or negative, therefore the
corresponding $\Delta t$ can be positive or negative. A consequence of the lack of simultaneity is that the image
shown by a photograph of a rapidly moving object is not a true representation of the moving object. Not
only is the body contracted in the direction of travel, but also it appears distorted because light arriving from
the far side of the body had to be emitted earlier, that is, when the body was at an earlier location, in order
to reach the observer simultaneously with light from the near side. The relativistic snake paradox, addressed
in Chapter 17 workshop exercise 1 is an example of the role of simultaneity in relativistic mechanics.
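$n\lambda = (c - v)\Delta t$
$\nu = \frac{c}{\lambda} = \frac{cn}{(c - v)\Delta t}$
According to the source, it emits $n$ waves of frequency $\nu_0$ during the proper time interval $\Delta t_0$, that is
$n = \nu_0 \Delta t_0$
This proper time interval $\Delta t_0$, in the source frame, corresponds to a time interval $\Delta t$ in the receiver frame
where
$\Delta t = \gamma \Delta t_0$
$\nu = \frac{1}{(1 - \beta)}\frac{\nu_0}{\gamma} = \nu_0 \frac{\sqrt{1 - (v/c)^2}}{(1 - \beta)} = \nu_0 \sqrt{\frac{1 + \beta}{1 - \beta}}$
where $\beta \equiv v/c$. This formula for source and receiver approaching each other also gives the correct answer for
source and receiver receding if the sign of $\beta$ is changed.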
This relativistic Doppler Effect accounts for the red shift observed for light emitted by receding stars and
galaxies, as well as many examples in atomic and nuclear physics involving moving sources of electromagnetic
radiation.
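The relativistic Doppler formula above, $\nu = \nu_0\sqrt{(1+\beta)/(1-\beta)}$ for an approaching source, is easy to evaluate numerically. The following is a minimal sketch under that formula; the function name and the sign convention (positive $\beta$ for approach, negative for recession) are labeled assumptions consistent with the remark about changing the sign of $\beta$.

```python
import math


def doppler_frequency(nu0, beta):
    """Observed frequency for a source of proper frequency nu0.

    beta = v/c is taken positive when source and receiver approach
    each other and negative when they recede (assumed convention).
    """
    if not -1.0 < beta < 1.0:
        raise ValueError("beta must satisfy |beta| < 1")
    return nu0 * math.sqrt((1.0 + beta) / (1.0 - beta))


nu0 = 1.0                                  # arbitrary units
blue = doppler_frequency(nu0, 0.6)         # approaching: frequency doubles
red = doppler_frequency(nu0, -0.6)         # receding: frequency halves
print(blue, red)
```

The product of the approaching and receding frequencies equals $\nu_0^2$, which provides a simple consistency check on the sign convention.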
Consider the two parallel coordinate frames with the primed frame moving at a velocity $v$ along the $x'_1$ axis
as shown in figure 17.1. Velocities of an object measured in both frames are defined to be
$u_i = \frac{dx_i}{dt}$    (17.16)
$u'_i = \frac{dx'_i}{dt'}$
Using the Lorentz transformations 17.3, 17.5 between the two frames moving with relative velocity $v$ along
the $x_1$ axis, gives that the velocity along the $x'_1$ axis is
$u'_1 = \frac{dx'_1}{dt'} = \frac{dx_1 - v\,dt}{dt - \frac{v}{c^2}dx_1} = \frac{u_1 - v}{1 - \frac{u_1 v}{c^2}}$    (17.17)
Similarly we get the velocities along the perpendicular $x'_2$ and $x'_3$ axes to be
$u'_2 = \frac{dx'_2}{dt'} = \frac{u_2}{\gamma\left(1 - \frac{u_1 v}{c^2}\right)}$    (17.18)
$u'_3 = \frac{dx'_3}{dt'} = \frac{u_3}{\gamma\left(1 - \frac{u_1 v}{c^2}\right)}$
When $\frac{u_1 v}{c^2} \to 0$ these velocity transformations become the usual Galilean relations for velocity addition.
Do not confuse u and u0 with v; that is, u and u0 are the velocities of some object measured in the unprimed
and primed frames of reference respectively, whereas v is the relative velocity of the origin of one frame with
respect to the origin of the other frame.
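The velocity transformations 17.17 and 17.18 can be exercised with a few lines of Python. This is a sketch that assumes the forms $u'_1 = (u_1 - v)/(1 - u_1 v/c^2)$ and $u'_{2,3} = u_{2,3}/[\gamma(1 - u_1 v/c^2)]$, uses units with $c = 1$ by default, and introduces the hypothetical helper name `transform_velocity`.

```python
import math


def transform_velocity(u, v, c=1.0):
    """Transform an object's velocity u = (u1, u2, u3) to a frame moving
    with velocity v along the x_1 axis of the unprimed frame."""
    u1, u2, u3 = u
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    denom = 1.0 - u1 * v / c ** 2
    return ((u1 - v) / denom, u2 / (gamma * denom), u3 / (gamma * denom))


# Light along x_1 still travels at c in the moving frame.
print(transform_velocity((1.0, 0.0, 0.0), 0.5))

# Two speeds of 0.5c combine to 0.8c rather than c.
print(transform_velocity((0.5, 0.0, 0.0), -0.5))

# Light along x_2 is tilted in the moving frame, but its speed stays c.
tilted = transform_velocity((0.0, 1.0, 0.0), 0.6)
print(tilted, math.sqrt(sum(comp ** 2 for comp in tilted)))
```

Note that `u` is the velocity of the object while `v` is the relative velocity of the frames, matching the warning in the text not to confuse the two.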
17.4.2 Momentum
Using the classical definition of momentum, that is $\mathbf{p} = m\mathbf{u}$, the linear momentum is not conserved using the
above relativistic velocity transformations if the mass $m$ is a scalar quantity. This problem originates from
the fact that both $\mathbf{x}$ and $t$ have non-trivial transformations and thus $\mathbf{u} = \frac{d\mathbf{x}}{dt}$ is frame dependent.
Linear momentum conservation can be retained by redefining momentum in a form that is identical in
all frames of reference, that is by referring to the proper time $\tau$ as measured in the rest frame of the moving
object. Therefore we define relativistic linear momentum as
$\mathbf{p} \equiv m\frac{d\mathbf{x}}{d\tau} = m\frac{d\mathbf{x}}{dt}\frac{dt}{d\tau}$    (17.19)
But we know the time dilation relation
$dt = \frac{d\tau}{\sqrt{1 - \frac{u^2}{c^2}}} = \gamma_u\, d\tau$    (17.20)
Note that the $\gamma_u$ in this relation refers to the velocity $u$ between the moving object and the frame; this is
quite different from the $\gamma = \frac{1}{\sqrt{1 - v^2/c^2}}$ which refers to the transformation between the two frames of reference.
Thus the new relativistic definition of momentum is
$\mathbf{p} \equiv m\frac{d\mathbf{x}}{d\tau} = \gamma_u m\frac{d\mathbf{x}}{dt} = \gamma_u m\mathbf{u}$    (17.21)
The relativistic definition of linear momentum is the same as the classical definition with the rest mass $m$
replaced by the relativistic mass $\gamma_u m$.1
1 Note that, until recently, the rest mass was denoted by $m_0$ and the relativistic mass was referred to as $m$. Modern texts
denote the rest mass by $m$ and the relativistic mass by $\gamma m$. This book follows the modern nomenclature for rest mass to avoid
confusion.
17.4. RELATIVISTIC KINEMATICS 473
17.4.4 Force
Newton’s second law $\mathbf{F} = \frac{d\mathbf{p}}{dt}$ is covariant under a Galilean transformation. In special relativity this definition
also applies using the relativistic definition of momentum $\mathbf{p}$. The fact that the relativistic momentum $\mathbf{p}$
is conserved in the force-free situation leads naturally to using the definition of force to be
$\mathbf{F} = \frac{d\mathbf{p}}{dt}$    (17.22)
Then the relativistic momentum is conserved if $\mathbf{F} = 0$.
17.4.5 Energy
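Classically, the work done is defined by
$W_{12} = \int_1^2 \mathbf{F} \cdot d\mathbf{r} = T_2 - T_1$    (17.23)
Assume $T_1 = 0$, let $d\mathbf{r} = \mathbf{u}\,dt$ and insert the relativistic force relation in equation 17.23, which gives
$W = T = \int_0^t \frac{d(\gamma m \mathbf{u})}{dt} \cdot \mathbf{u}\, dt = m\int_0^u u\, d(\gamma u)$    (17.24)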
of a paper clip, provides $E = 9 \times 10^{13}$ joules. This is the daily output of a 1 GW nuclear power station or
the explosive power of the Nagasaki or Hiroshima bombs.
As the velocity of a particle approaches $c$ then $\gamma$ and the relativistic mass $\gamma m$ both approach infinity.
This means that the force needed to accelerate the mass also approaches infinity, and thus no particle can
exceed the velocity of light. The energy continues to increase not by increasing the velocity but by increase
of the relativistic mass. Although the relativistic relation for kinetic energy is quite different from the
Newtonian relation, the Newtonian form is obtained for the case of $u \ll c$ in that
$T = mc^2\left(1 - \frac{u^2}{c^2}\right)^{-\frac{1}{2}} - mc^2 = mc^2\left(1 + \frac{1}{2}\frac{u^2}{c^2} + \cdots\right) - mc^2 \approx \frac{1}{2}mu^2$    (17.29)
An especially useful relativistic relation that can be derived from the above is
$E^2 = p^2c^2 + E_0^2$    (17.30)
This is useful because it provides a simple relation between total energy of a particle and its relativistic
linear momentum plus rest energy.
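The two results above can be verified numerically. The sketch below assumes the standard definitions $E = \gamma mc^2$, $p = \gamma mu$, $E_0 = mc^2$ and $T = (\gamma - 1)mc^2$, then checks equation 17.30 and the Newtonian limit 17.29. The function names and the test speeds are illustrative assumptions.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def gamma_factor(u, c=C):
    """Lorentz factor for a particle moving at speed u."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def total_energy(m, u, c=C):
    """E = gamma m c^2."""
    return gamma_factor(u, c) * m * c ** 2


def momentum(m, u, c=C):
    """p = gamma m u."""
    return gamma_factor(u, c) * m * u


def kinetic_energy(m, u, c=C):
    """T = (gamma - 1) m c^2."""
    return (gamma_factor(u, c) - 1.0) * m * c ** 2


m = 1.0  # kg, assumed test mass

# Check E^2 = p^2 c^2 + E_0^2 at u = 0.6c.
u_fast = 0.6 * C
lhs = total_energy(m, u_fast) ** 2
rhs = (momentum(m, u_fast) * C) ** 2 + (m * C ** 2) ** 2
print(lhs / rhs)  # very close to 1

# Newtonian limit: at u = 3e5 m/s (about 0.001c) T is close to m u^2 / 2.
u_slow = 3.0e5
print(kinetic_energy(m, u_slow) / (0.5 * m * u_slow ** 2))
```

The second ratio differs from unity by roughly $\frac{3}{4}(u/c)^2$, the next term in the expansion that equation 17.29 truncates.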
Integrate the left-hand side between 0 and and the right-hand side between and gives
µ ¶ ³´
1 1 +
ln = − ln
2 1 −
This reduces to ¡ ¢2
1−
= ¡ ¢2
1+
When → 0 this equation reduces to the non-relativistic answer given in equation 2123.
17.5. GEOMETRY OF SPACE-TIME 475
$\mathbf{s} = x^0\hat{\mathbf{e}}_0 + x^1\hat{\mathbf{e}}_1 + x^2\hat{\mathbf{e}}_2 + x^3\hat{\mathbf{e}}_3 = x'^0\hat{\mathbf{e}}'_0 + x'^1\hat{\mathbf{e}}'_1 + x'^2\hat{\mathbf{e}}'_2 + x'^3\hat{\mathbf{e}}'_3$    (17.31)
The convention used is that Greek subscripts (covariant) or superscripts (contravariant) designate a four
vector with $0 \le \mu \le 3$. The covariant unit vectors $\hat{\mathbf{e}}_\mu$ are written with the subscript $\mu$ which has 4 values
$0 \le \mu \le 3$. As described in appendix 3, using the Einstein convention the components are written with
the contravariant superscript $x^\mu$ where the time axis $x^0 = ct$, while the spatial coordinates, expressed in
cartesian coordinates, are $x^1 = x$, $x^2 = y$, and $x^3 = z$. With respect to a different (primed) unit vector basis
$\hat{\mathbf{e}}'_\mu$ the displacement must be unchanged as given by equation 17.31. In addition, equation 17.43 shows that
the magnitude $|\mathbf{s}|^2$ of the displacement four vector is invariant to a Lorentz transformation.
The most general Lorentz transformation between inertial coordinate systems $F$ and $F'$, in relative motion
with velocity v, assuming that the two sets of axes are aligned, and that their origins overlap when $t = t' = 0$,
is given by the symmetric matrix $\lambda$ where
$x'^\mu = \sum_\nu \lambda^\mu_{\ \nu} x^\nu$    (17.32)
This Lorentz transformation of the four vector X components can be written in matrix form as
X' = λX    (17.33)
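Assuming that the two sets of axes are aligned, then the elements of the Lorentz transformation are
given by
$\mathbf{X}' = \begin{pmatrix} ct' \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\beta_1 & -\gamma\beta_2 & -\gamma\beta_3 \\ -\gamma\beta_1 & 1 + (\gamma - 1)\frac{\beta_1^2}{\beta^2} & (\gamma - 1)\frac{\beta_1\beta_2}{\beta^2} & (\gamma - 1)\frac{\beta_1\beta_3}{\beta^2} \\ -\gamma\beta_2 & (\gamma - 1)\frac{\beta_1\beta_2}{\beta^2} & 1 + (\gamma - 1)\frac{\beta_2^2}{\beta^2} & (\gamma - 1)\frac{\beta_2\beta_3}{\beta^2} \\ -\gamma\beta_3 & (\gamma - 1)\frac{\beta_1\beta_3}{\beta^2} & (\gamma - 1)\frac{\beta_2\beta_3}{\beta^2} & 1 + (\gamma - 1)\frac{\beta_3^2}{\beta^2} \end{pmatrix} \cdot \begin{pmatrix} ct \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}$    (17.34)
where $\beta = \frac{v}{c}$ and $\gamma = \frac{1}{\sqrt{1 - \beta^2}}$ and assuming that the origin of $F$ transforms to the origin of $F'$ at (0, 0, 0, 0).
For the case illustrated in figure 17.1 where the corresponding axes of the two frames are parallel and in
relative motion with velocity $v$ in the $x^1$ direction, then the Lorentz transformation matrix 17.34 reduces to
$\begin{pmatrix} ct' \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} ct \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}$    (17.35)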
This Lorentz transformation matrix is called a standard boost since it only boosts from one frame to another
parallel frame. In general a rotation matrix also is incorporated into the transformation matrix for the
spatial variables.
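The general boost matrix and its reduction to the standard boost can be checked with a small pure-Python sketch. It assumes the textbook form of the rotation-free boost, with elements $\lambda_{00} = \gamma$, $\lambda_{0i} = \lambda_{i0} = -\gamma\beta_i$ and $\lambda_{ij} = \delta_{ij} + (\gamma - 1)\beta_i\beta_j/\beta^2$, acting on the column $(ct, x^1, x^2, x^3)$. The helper names `boost_matrix`, `apply_boost` and `interval` are hypothetical.

```python
import math


def boost_matrix(beta_vec):
    """Return the 4x4 rotation-free Lorentz boost for beta_vec = v/c."""
    b_sq = sum(b * b for b in beta_vec)
    if b_sq >= 1.0:
        raise ValueError("|beta| must be less than 1")
    lam = [[1.0 if m == n else 0.0 for n in range(4)] for m in range(4)]
    if b_sq == 0.0:
        return lam  # no relative motion: identity transformation
    gamma = 1.0 / math.sqrt(1.0 - b_sq)
    lam[0][0] = gamma
    for i in range(3):
        lam[0][i + 1] = lam[i + 1][0] = -gamma * beta_vec[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            lam[i + 1][j + 1] = delta + (gamma - 1.0) * beta_vec[i] * beta_vec[j] / b_sq
    return lam


def apply_boost(lam, x):
    """Matrix-vector product X' = lam . X for a four vector x."""
    return [sum(lam[m][n] * x[n] for n in range(4)) for m in range(4)]


def interval(x):
    """Invariant (ct)^2 - |r|^2 with the + - - - sign convention."""
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2


standard = boost_matrix((0.6, 0.0, 0.0))    # reduces to the standard boost
general = boost_matrix((0.3, 0.4, 0.5))     # boost in an arbitrary direction
event = [2.0, 0.5, -1.0, 0.25]              # (ct, x1, x2, x3), assumed values
print(standard[0][:2], standard[1][:2])
print(interval(event), interval(apply_boost(general, event)))
```

The two printed intervals agree to rounding error, illustrating that the boost leaves the space-time interval unchanged even when the velocity is not along a coordinate axis.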
The correct sign of the inner product is obtained by inclusion of the Minkowski metric defined by
The contravariant metric component $g^{\mu\nu}$ is defined as the $\mu\nu$ component of the inverse metric matrix $\mathbf{g}^{-1}$
where
$\mathbf{g}\mathbf{g}^{-1} = \mathbf{I} = \mathbf{g}^{-1}\mathbf{g}$    (17.40)
where I is the four-vector identity matrix. The contravariant components of the four vector can be expressed
in terms of the covariant components as
$x^\mu = \sum_{\nu=0}^{3} g^{\mu\nu} x_\nu$    (17.41)
Thus equations 17.39 and 17.41 can be used to transform between covariant and contravariant four vectors,
that is, to raise or lower the index $\mu$.
The scalar inner product of two four vectors can be written compactly as the scalar product of a covariant
four vector and a contravariant four vector. The Minkowski metric matrix can be absorbed into either X or
Y thus
$\mathbf{X} \cdot \mathbf{Y} = \sum_{\mu=0}^{3}\sum_{\nu=0}^{3} g_{\mu\nu} x^\mu y^\nu = \sum_{\mu=0}^{3} x_\mu y^\mu = \sum_{\nu=0}^{3} x^\nu y_\nu$    (17.42)
If this covariant expression is Lorentz invariant in one coordinate system, then it is Lorentz invariant in all
coordinate systems obtained by proper Lorentz transformations.
2 Older textbooks, such as all editions of Marion, and the first two editions of Goldstein, use the Euclidean Poincaré
4-dimensional space-time with the imaginary time axis $ict$. About half the scientific community, and modern physics textbooks
including this textbook and the 3rd edition of Goldstein, use the Bjorken-Drell $+, -, -, -$ sign convention given in equation
17.38 where $x^0 \equiv ct$ and $x^1, x^2, x^3$ are the spatial coordinates. The other half of the community, including mathematicians
and gravitation physicists, use the opposite $-, +, +, +$ sign convention. Further confusion is caused by a few books that assign
the time axis to be $x^4$ rather than $x^0$.
The scalar inner product of the invariant space-time interval is an especially important example.
$(ds)^2 \equiv d\mathbf{X} \cdot d\mathbf{X} = c^2(dt)^2 - (d\mathbf{r})^2 = (c\,dt)^2 - \sum_{i=1}^{3} dx_i^2 = (c\,d\tau)^2$    (17.43)
This is invariant to a Lorentz transformation as can be shown by applying the Lorentz standard boost
transformation given above. In particular, if $F'$ is the rest frame of the clock, then the invariant space-time
interval is simply given by the proper time interval, $ds = c\,d\tau$.
time $\tau$, then the time observed in the fixed frame can be obtained by looking at the interval $ds$. Because of
the invariance of the interval $ds^2$, then
$ds^2 = c^2 d\tau^2 = c^2 dt^2 - \left[dx_1^2 + dx_2^2 + dx_3^2\right]$    (17.44)
That is,
$d\tau = dt\left[1 - \frac{\left(dx_1^2 + dx_2^2 + dx_3^2\right)}{c^2 dt^2}\right]^{\frac{1}{2}} = dt\left[1 - \frac{v^2}{c^2}\right]^{\frac{1}{2}} = \frac{dt}{\gamma}$    (17.45)
that is $dt = \gamma\, d\tau$, which satisfies the normal expression for time dilation, 17.8.
Remember that the square of the four-dimensional space-time element of length $(ds)^2$ is invariant (17.43),
and is simply related to the proper time element $d\tau$. Thus the scalar product
$d\mathbf{X} \cdot d\mathbf{X} = ds^2 = c^2 d\tau^2 = c^2 dt^2 - \left[dx_1^2 + dx_2^2 + dx_3^2\right]$    (17.47)
Thus the proper time is an invariant.
The ratio of the four-vector element $d\mathbf{X}$ and the invariant proper time interval $d\tau$ is a four-vector called
the four-vector velocity U where
$\mathbf{U} = \frac{d\mathbf{X}}{d\tau} = \left(c\frac{dt}{d\tau}, \frac{d\mathbf{x}}{d\tau}\right) = \left(\gamma c, \gamma\frac{d\mathbf{x}}{dt}\right) = (\gamma c, \gamma\mathbf{u})$    (17.48)
where u is the particle velocity, and $\gamma = \frac{1}{\sqrt{1 - u^2/c^2}}$.
The four-vector momentum P can be obtained from the four-vector velocity by multiplying it by the
scalar rest mass $m$
$\mathbf{P} = m\mathbf{U} = (\gamma mc, \gamma m\mathbf{u})$    (17.49)
However,
$\gamma mc^2 = E$    (17.50)
thus the momentum four vector can be written as
$\mathbf{P} = \left(\frac{E}{c}, \mathbf{p}\right)$    (17.51)
where the vector p represents the three spatial components of the relativistic momentum. It is interesting to
realize that the Theory of Relativity couples not only the spatial and time coordinates, but also, it couples
their conjugate variables linear momentum p and total energy, $E$.
An additional feature of this momentum-energy four vector P, is that the scalar inner product P · P is
invariant to Lorentz transformations and equals $(mc)^2$ in the rest frame
$\mathbf{P} \cdot \mathbf{P} = \sum_{\mu=0}^{3}\sum_{\nu=0}^{3} g_{\mu\nu} P^\mu P^\nu = \sum_{\mu=0}^{3} P_\mu P^\mu = \left(\frac{E}{c}\right)^2 - |\mathbf{p}|^2 = m^2c^2$    (17.52)
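The invariance of $\mathbf{P} \cdot \mathbf{P}$ can be illustrated with a minimal sketch that contracts the four-momentum with the Minkowski metric in the $+, -, -, -$ convention. It assumes $\mathbf{P} = (\gamma mc, \gamma m\mathbf{u})$ and units with $c = 1$ by default; the names `METRIC`, `inner_product` and `four_momentum` are hypothetical helpers introduced here.

```python
import math

# Diagonal of the Minkowski metric in the + - - - sign convention.
METRIC = (1.0, -1.0, -1.0, -1.0)


def inner_product(a, b):
    """Scalar product of two four vectors using the diagonal metric."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def four_momentum(m, u, c=1.0):
    """Return P = (gamma m c, gamma m u) for rest mass m and velocity u."""
    u_sq = sum(ui * ui for ui in u)
    gamma = 1.0 / math.sqrt(1.0 - u_sq / c ** 2)
    return (gamma * m * c,) + tuple(gamma * m * ui for ui in u)


m = 2.0                                   # assumed rest mass, arbitrary units
for u in [(0.0, 0.0, 0.0), (0.3, 0.4, 0.0), (0.0, 0.0, 0.9)]:
    P = four_momentum(m, u)
    print(u, inner_product(P, P))         # close to m^2 c^2 = 4 each time
```

Whatever velocity is chosen, the contraction returns $m^2c^2$, which is the content of equation 17.52.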
where $L(\mathbf{q}(t), \dot{\mathbf{q}}(t), t)$ denotes the conventional Lagrangian. This approach implicitly assumes the Newtonian
concept of absolute time $t$ which is chosen to be the independent variable that characterizes the evolution
parameter of the system. The actual path $[\mathbf{q}(t), \dot{\mathbf{q}}(t)]$ the system follows is defined by the extremum of the
action integral $S(\mathbf{q}, \dot{\mathbf{q}}, t)$ which leads to the corresponding Euler-Lagrange equations. This assumption is
contrary to the Theory of Relativity which requires that the space and time variables be treated equally,
that is, the Lagrangian formalism must be covariant.
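The conventional action $S$ and extended action $\mathbb{S}$ address alternate characterizations of the same underlying
physical system, and thus the action principle implies that $\delta S = \delta\mathbb{S} = 0$ must hold simultaneously. That is,
$\int L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds}\, ds = \int \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) ds$    (17.57)
As discussed in chapter 9.3 there is a continuous spectrum of equivalent gauge-invariant Lagrangians for
which the Euler-Lagrange equations lead to identical equations of motion. Equation 17.57 is satisfied if the
conventional and extended Lagrangians are related by
$\mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) = L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds} + \frac{d\Lambda(\mathbf{q}, t)}{ds}$    (17.58)
where $\Lambda(\mathbf{q}, t)$ is a continuous function of $\mathbf{q}$ and $t$ that has continuous second derivatives. It is acceptable to
assume that $\frac{d\Lambda(\mathbf{q}, t)}{ds} = 0$, then the extended and conventional Lagrangians have a unique relation requiring
no simultaneous transformation of the dynamical variables. That is, assume
$\mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) = L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds}$    (17.59)
Note that the time derivative of $\mathbf{q}$ can be expressed in terms of the $s$ derivatives by
$\frac{d\mathbf{q}}{dt} = \frac{d\mathbf{q}/ds}{dt/ds}$    (17.60)
Thus, for a conventional Lagrangian with $n$ variables, the corresponding extended Lagrangian is a function
of $n + 1$ variables while the conventional and extended Lagrangians are related using equations 17.59 and
17.60.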
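The derivatives of the relation between the extended and conventional Lagrangians lead to
$\frac{\partial\mathbb{L}}{\partial q^\mu} = \frac{\partial L}{\partial q^\mu}\frac{dt}{ds}$    (17.61)
$\frac{\partial\mathbb{L}}{\partial t} = \frac{\partial L}{\partial t}\frac{dt}{ds}$    (17.62)
$\frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}$    (17.63)
$\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)} = L - \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{dt}$    (17.64)
where $1 \le \mu \le n$ since the $\mu = 0$ time derivatives are written explicitly in equations 17.62, 17.64.
Equations 17.63 and 17.64, summed over the extended range $0 \le \mu \le n$ of time and spatial dynamical
variables, imply
$\sum_{\mu=0}^{n}\frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)}\frac{dq^\mu}{ds} = \left(L - \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{dt}\right)\frac{dt}{ds} + \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{ds} = \mathbb{L}$    (17.65)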
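The homogeneity property expressed by equation 17.65 can be checked numerically for a simple case. The sketch below assumes the relation $\mathbb{L} = L\,\frac{dt}{ds}$ with $\frac{dq}{dt} = \frac{dq/ds}{dt/ds}$ (equations 17.59 and 17.60), takes a one-dimensional harmonic oscillator as a stand-in conventional Lagrangian, and evaluates the derivatives by central finite differences. The choice of Lagrangian, the parameter values and all function names are assumptions made for illustration.

```python
def conventional_lagrangian(q, q_dot, t, m=1.0, k=1.0):
    """Assumed example: 1D harmonic oscillator, L = m q_dot^2 / 2 - k q^2 / 2."""
    return 0.5 * m * q_dot ** 2 - 0.5 * k * q ** 2


def extended_lagrangian(q, dq_ds, t, dt_ds):
    """Extended Lagrangian: L evaluated at dq/dt = (dq/ds)/(dt/ds), times dt/ds."""
    return conventional_lagrangian(q, dq_ds / dt_ds, t) * dt_ds


def homogeneity_sum(q, dq_ds, t, dt_ds, h=1e-6):
    """Sum of (d L_ext / d velocity) * velocity over the velocities dq/ds and dt/ds,
    using central finite differences for the partial derivatives."""
    d_dq = (extended_lagrangian(q, dq_ds + h, t, dt_ds)
            - extended_lagrangian(q, dq_ds - h, t, dt_ds)) / (2.0 * h)
    d_dt = (extended_lagrangian(q, dq_ds, t, dt_ds + h)
            - extended_lagrangian(q, dq_ds, t, dt_ds - h)) / (2.0 * h)
    return d_dq * dq_ds + d_dt * dt_ds


# Arbitrary sample point in the extended velocity space (assumed values).
q, dq_ds, t, dt_ds = 0.7, 0.3, 0.0, 1.2
print(extended_lagrangian(q, dq_ds, t, dt_ds))
print(homogeneity_sum(q, dq_ds, t, dt_ds))
```

The two printed values agree to within the finite-difference error, as expected for a function that is homogeneous of first degree in the $n + 1$ velocities.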
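Assume that the definitions of the extended Lagrangian $\mathbb{L}$, and the extended Hamiltonian $\mathbb{H}$, are related
by a Legendre transformation, and are based on variational principles, analogous to the relation that exists
between the conventional Lagrangian $L$ and Hamiltonian $H$. The Legendre transformation requires defining
the extended generalized (canonical) momentum-energy four vector $\mathbf{P}(s) = \left(\frac{E(s)}{c}, \mathbf{p}(s)\right)$. The momentum
components of the momentum-energy four vector $\mathbf{P}(s) = \left(\frac{E(s)}{c}, \mathbf{p}(s)\right)$ are given by the $1 \le \mu \le n$ components
using equation 17.63
$p_\mu(s) = \frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}$    (17.68)
The $\mu = 0$ component of the momentum-energy four vector can be derived by recognizing that the right-hand
side of equation 17.64 is equal to $-H(\mathbf{q}, \mathbf{p}, t)$. That is, the corresponding generalized momentum $p_0$ that
is conjugate to $q^0 = ct$ is given by
$p_0 = \frac{\partial\mathbb{L}}{\partial\left(\frac{dq^0}{ds}\right)} = \frac{1}{c}\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)} = \frac{1}{c}\left(L - \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{dt}\right) = -\frac{H(\mathbf{q}, \mathbf{p}, t)}{c}$    (17.69)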
where the extended generalized force $\mathbb{Q}$ shown on the right-hand side of equation 17.70 accounts for all
forces not included in the potential energy term in the Lagrangian. The extended generalized force $\mathbb{Q}$ can
be factored into two terms as discussed in chapter 6, equation 6.60. The Lagrange multiplier term includes
$1 \le k \le m$ holonomic constraint forces where the holonomic constraints, which do no work, are expressed
in terms of the $m$ algebraic equations of holonomic constraint $g_k$. The $Q^{EXC}$ term includes the remaining
constraint forces and generalized forces that are not included in the Lagrange multiplier term or the potential
energy term of the Lagrangian.
For the case where $\mu = 0$, since $q^0 = ct$, then equation 17.70 reduces to
à !
L L X X
¡ ¢ − = − (17.71)
=1
=1
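These Euler-Lagrange equations of motion 17.70, 17.71 determine the $1 \le \mu \le n$ generalized coordinates
$q^\mu(s)$ plus $q^0 = ct(s)$ in terms of the independent variable $s$.
If the holonomic equations of constraint are time independent, that is $\frac{\partial g_k}{\partial t} = 0$, and if $\mathbb{Q}_0 = 0$, then
the $\mu = 0$ term of the Euler-Lagrange equations simplifies to
$\frac{d}{ds}\left(\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)}\right) - \frac{\partial\mathbb{L}}{\partial t} = 0$    (17.72)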
One interpretation is to select $L$ to be primary. Then $\mathbb{L}$ is derived from $L$ using equation 17.59 and $\mathbb{L}$
must satisfy the identity given by equation 17.66 while the Euler-Lagrange equations containing $\frac{dt}{ds}$ yield an
identity which implies that $\frac{dt}{ds}$ does not provide an equation of motion in terms of $t(s)$. Conversely, if $\mathbb{L}$ is
chosen to be primary, then $\mathbb{L}$ is no longer a homogeneous function and equation 17.66 serves as a constraint
on the motion that can be used to deduce $L$, while $\frac{dt}{ds}$ yields a non-trivial equation of motion in terms of
$t(s)$. In both cases the occurrence of a constraint surface results from the fact that the extended space has
$2n + 2$ variables to describe $2n + 1$ degrees of freedom, that is, one more degree of freedom than required for
the actual system.
The constant third term in the bracket is included to ensure that the extended Lagrangian converges to the
standard Lagrangian in the limit $\frac{dt}{ds} \to 1$.
Note that this extended Lagrangian is not homogeneous to first order in the velocities $\frac{d\mathbf{q}}{ds}$
as is required.
Equation 17.66 must be used to ensure that the extended Lagrangian is homogeneous. That is, it must satisfy the
constraint relation
$\left(\frac{dt}{ds}\right)^2 - \frac{1}{c^2}\left(\frac{d\mathbf{q}}{ds}\right)^2 - 1 = 0$
Inserting this constraint into the extended Lagrangian yields that the square bracket must equal $-2$.
Thus
$\mathbb{L} = \frac{1}{2}mc^2[-2] = -mc^2$
The constraint equation implies that
$\frac{ds}{dt} = \sqrt{1 - \frac{1}{c^2}\left(\frac{d\mathbf{q}}{dt}\right)^2} = \frac{1}{\gamma}$
Equation () is the conventional relativistic Lagrangian derived by assuming that the system evolution para-
meter is transformed to be along the world line where the invariant length replaces the proper time
interval
= = ()
The definition of the generalized (canonical) momentum
$p_i = \frac{\partial L}{\partial\dot{q}_i} = \gamma m\dot{q}_i$
leads to the relativistic expression for momentum given in equation 17.21.
The relativistic Lagrangian is an important example of a non-standard Lagrangian. It does not
equal the difference between the kinetic and potential energies, that is, the relativistic expression for kinetic
energy is given by 17.28 to be
$T = (\gamma - 1)mc^2$
The non-standard relativistic Lagrangian () can be used with the Euler-Lagrange equations to derive the
second-order equations of motion for both relativistic and non-relativistic problems within the Special Theory
of Relativity.
17.6. LORENTZ-INVARIANT FORMULATION OF LAGRANGIAN MECHANICS 483
If we adopt the definition that the relativistic canonical momentum is = then the left hand side is
the relativistic force while the right-hand side is the well-known Lorentz force of electromagnetism. Thus
the extended Lagrangian formulation correctly reproduces the well-known Lorentz force for a charged particle
moving in an electromagnetic field.
Struckmeier[Str08] assumes that the definitions of the extended Lagrangian $\mathbb{L}$, and the extended
Hamiltonian $\mathbb{H}$, are related by a Legendre transformation, and are based on variational principles, analogous to the
relation that exists between the conventional Lagrangian $L$ and Hamiltonian $H$. The Legendre transformation
requires defining the extended generalized (canonical) momentum-energy four vector $\mathbf{P}(s) = \left(\frac{E(s)}{c}, \mathbf{p}(s)\right)$.
The momentum components of the momentum-energy four vector $\mathbf{P}(s) = \left(\frac{E(s)}{c}, \mathbf{p}(s)\right)$ are given by the
$1 \le \mu \le n$ components using either the conventional or the extended Lagrangians as given in equation 17.68
$p_\mu(s) = \frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}$    (17.68)
where E() represents the instantaneous generalized energy of the conventional Hamiltonian at the point
but not the functional form of (q() p() ()). That is
E()6≡
= (q() p() ()) (17.76)
Note that E() does not give the function (q p ). Equations 1768 and 1769 give that
E()
0 () = − (17.77)
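The extended Hamiltonian $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, E(s))$, in an extended phase space, can be defined by the Legendre
transformation and the four-vector $\mathbf{P}$ to be
\begin{align}
\mathbb{H}(\mathbf{q}, \mathbf{p}, t, E(s)) &= \left(\mathbf{P}\cdot\frac{d\mathbf{q}}{ds}\right) - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}\right) \tag{17.78}\\
&= \sum_{\mu=0}^{n} p_\mu \frac{dq^\mu}{ds} - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}\right) \notag\\
&= -E\frac{dt}{ds} + \sum_{\mu=1}^{n} p_\mu \frac{dq^\mu}{ds} - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}\right) \tag{17.79}
\end{align}
where the $\mu = 0$ term has been written explicitly as $-E\frac{dt}{ds}$ in equation 17.79. The extended Hamiltonian
$\mathbb{H}(\mathbf{q}, \mathbf{p}, t, E(s))$ can carry all the information on the dynamical system that is carried by the extended
Lagrangian $\mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}\right)$ if the Hesse matrix is non-singular. That is, if
\[ \det\left(\frac{\partial^2 \mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)\partial\left(\frac{dq^\nu}{ds}\right)}\right) \neq 0 \tag{17.80} \]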
17.7. LORENTZ-INVARIANT FORMULATIONS OF HAMILTONIAN MECHANICS 485
If the extended Lagrangian $\mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}\right)$ is not homogeneous in the $n+1$ velocities $\frac{dq^\mu}{ds}$, then the extended
set of Euler-Lagrange equations 17.72 is not redundant. Thus equation 17.66 is not an identity but it can be
regarded as an implicit equation that is always satisfied by the extended set of Euler-Lagrange equations. As
a result, the Legendre transformation to an extended Hamiltonian exists. That is, equation 17.66 is identical
to the Legendre transform for $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, E(s))$ which was shown to equal zero. Therefore
\[ \mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), E(s)) = 0 \tag{17.81} \]
which means that the extended Hamiltonian $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, E(s))$ directly defines the restricted hypersurface on
which the particle motion is confined.
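The extended canonical equations of motion, derived using the extended Hamiltonian $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), E(s))$
with the usual Hamiltonian mechanics relations, are:
\[ \frac{\partial \mathbb{H}}{\partial p_\mu} = \frac{dq^\mu}{ds} \tag{17.82} \]
\[ \frac{\partial \mathbb{H}}{\partial q^\mu} = -\frac{dp_\mu}{ds} \tag{17.83} \]
\[ \frac{\partial \mathbb{H}}{\partial t} = \frac{dE}{ds} \tag{17.84} \]
\[ \frac{\partial \mathbb{H}}{\partial E} = -\frac{dt}{ds} \tag{17.85} \]
These canonical equations give that the total derivative of $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), E(s))$ with respect to $s$ is
\begin{align}
\frac{d\mathbb{H}}{ds} &= \frac{\partial \mathbb{H}}{\partial p_\mu}\frac{dp_\mu}{ds} + \frac{\partial \mathbb{H}}{\partial q^\mu}\frac{dq^\mu}{ds} + \frac{\partial \mathbb{H}}{\partial t}\frac{dt}{ds} + \frac{\partial \mathbb{H}}{\partial E}\frac{dE}{ds} \notag\\
&= \frac{dq^\mu}{ds}\frac{dp_\mu}{ds} - \frac{dp_\mu}{ds}\frac{dq^\mu}{ds} + \frac{dE}{ds}\frac{dt}{ds} - \frac{dt}{ds}\frac{dE}{ds} = 0 \tag{17.86}
\end{align}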
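That is, in contrast to the total time derivative of $H(\mathbf{q}, \mathbf{p}, t)$, the total $s$ derivative of the extended Hamiltonian
$\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), E(s))$ always vanishes, that is, $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), E(s))$ is autonomous, which is ideal
for use with Hamilton’s equations of motion. The constraints give that $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), E(s)) = 0$ (equation
17.81) and $\frac{d\mathbb{H}}{ds} = 0$ (equation 17.86), implying that the correlation between the extended and conventional
Hamiltonians is given by
\begin{align}
\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), E(s)) &= -E\frac{dt}{ds} + \sum_{\mu=1}^{n} p_\mu \frac{dq^\mu}{ds} - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}\right) \tag{17.87}\\
&= -E\frac{dt}{ds} + \sum_{\mu=1}^{n} p_\mu \frac{dq^\mu}{ds} - L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds} \tag{17.88}\\
&= -E\frac{dt}{ds} + \sum_{\mu=1}^{n} p_\mu \frac{dq^\mu}{ds} + \left[H(\mathbf{q}, \mathbf{p}, t) - \sum_{\mu=1}^{n} p_\mu \frac{dq^\mu}{dt}\right]\frac{dt}{ds} \tag{17.89}\\
&= \left(H(\mathbf{q}, \mathbf{p}, t) - E\right)\frac{dt}{ds} = 0 \tag{17.90}
\end{align}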
since only the term with $\mu = 0$ does not cancel in equation 17.79. Equations 17.81 and 17.90 give that both the
left and right-hand sides of equation 17.90 are zero while equation 17.86 implies that $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), E(s))$
is a constant of motion, that is, $s$ is a cyclic variable for $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), E(s))$. Formally one can consider
that the extended Hamiltonian is a constant which equals zero
\[ \mathbb{H}(\mathbf{q}, \mathbf{p}, t, E(s)) = \mathbb{E}(s) = 0 \tag{17.91} \]
Equations 17.84 and 17.85 imply that $(E, t)$ form a pair of canonically conjugate variables in addition to the
newly-introduced canonically-conjugate variables $(\mathbb{E}(s), s)$. Equation 17.90 shows that the motion in the
$2n + 2$ dimensional extended phase space is constrained to the surface $H(\mathbf{q}, \mathbf{p}, t) - E = 0$, reflecting the fact that the observed system has
one less degree of freedom than used by the extended Hamiltonian.
In summary, the Lorentz-invariant extended canonical formalism leads to Hamilton’s first-order equations
of motion in terms of derivatives with respect to $s$ where $s$ is related to the proper time $\tau$ for a relativistic
system.
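The structure of the extended canonical equations 17.82 to 17.85 can be illustrated with a deliberately simple, non-relativistic sketch. The example below is an assumption-laden illustration rather than the relativistic formalism of the text: it takes a driven oscillator with hypothetical parameters, adopts the trivial choice $dt/ds = 1$ so that the extended Hamiltonian is $H(q, p, t) - E$, and integrates $(q, p, t, E)$ with a hand-written fourth-order Runge-Kutta step. It then checks that the extended Hamiltonian stays on the surface of value zero even though the conventional $H$ is not conserved.

```python
import math

M, K, F0, OMEGA = 1.0, 1.0, 0.5, 0.7   # illustrative parameters (assumed)

def hamiltonian(q, p, t):
    """Conventional, explicitly time-dependent Hamiltonian H(q, p, t)."""
    return p * p / (2.0 * M) + 0.5 * K * q * q - q * F0 * math.cos(OMEGA * t)

def extended_hamiltonian(q, p, t, E):
    """Trivial extension (H - E) dt/ds with the assumed choice dt/ds = 1."""
    return hamiltonian(q, p, t) - E

def derivs(state):
    """Extended canonical equations for the state (q, p, t, E)."""
    q, p, t, E = state
    dq = p / M                                   # dq/ds =  dH_ext/dp
    dp = -K * q + F0 * math.cos(OMEGA * t)       # dp/ds = -dH_ext/dq
    dt = 1.0                                     # dt/ds = -dH_ext/dE
    dE = q * F0 * OMEGA * math.sin(OMEGA * t)    # dE/ds =  dH_ext/dt
    return (dq, dp, dt, dE)

def rk4_step(state, ds):
    """One classical fourth-order Runge-Kutta step in the parameter s."""
    def shift(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))
    k1 = derivs(state)
    k2 = derivs(shift(state, k1, ds / 2.0))
    k3 = derivs(shift(state, k2, ds / 2.0))
    k4 = derivs(shift(state, k3, ds))
    return tuple(s + ds * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

q0, p0, t0 = 1.0, 0.0, 0.0
h_start = hamiltonian(q0, p0, t0)
state = (q0, p0, t0, h_start)        # start on the surface H_ext = 0
max_drift = 0.0                      # largest |H_ext| seen along the path
max_change = 0.0                     # largest change of the conventional H
for _ in range(20000):
    state = rk4_step(state, 1.0e-3)
    q, p, t, E = state
    max_drift = max(max_drift, abs(extended_hamiltonian(q, p, t, E)))
    max_change = max(max_change, abs(hamiltonian(q, p, t) - h_start))
print(max_drift, max_change)
```

The first printed number should be at the level of integration error, while the second is of order the driving energy, which mirrors the contrast drawn above between the autonomous extended Hamiltonian and the time-dependent conventional one.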
As for the conventional Poisson bracket discussed in chapter 15, the extended Poisson bracket also leads to the
fundamental Poisson bracket relations
\[ [[q^\mu, q^\nu]] = 0 \qquad [[p_\mu, p_\nu]] = 0 \qquad [[q^\mu, p_\nu]] = \delta^\mu_\nu \tag{17.93} \]
where $\mu, \nu = 0, 1, \dots, n$. These are identical to the non-extended fundamental Poisson brackets.
The discussion of observables in Hamiltonian mechanics in chapter 15.2.5 can be trivially expanded to
the extended Poisson bracket representation. In particular, the total $s$ derivative of the function $G$ is given
by
\[ \frac{dG}{ds} = \frac{\partial G}{\partial s} + [[G, \mathbb{H}]] \tag{17.94} \]
If $G$ commutes with the extended Hamiltonian, that is, the Poisson bracket equals zero, and if $\frac{\partial G}{\partial s} = 0$, then
$\frac{dG}{ds} = 0$. That is, the observable $G$ is a constant of motion.
Substituting the fundamental variables for $G$ gives
\[ \frac{dp_\mu}{ds} = [[p_\mu, \mathbb{H}]] = -\frac{\partial \mathbb{H}}{\partial q^\mu} \qquad \frac{dq^\mu}{ds} = [[q^\mu, \mathbb{H}]] = \frac{\partial \mathbb{H}}{\partial p_\mu} \tag{17.95} \]
where $\mu = 0, 1, \dots, n$. These are Hamilton’s extended canonical equations of motion expressed in terms of
the system evolution parameter $s$. The extended Poisson bracket representation is a trivial extension of the
conventional canonical equations presented in chapter 15.3.
s s
2 4
4 2 Γ2 2 Γ2 (1 − 2 )
= Γ= 1− 2 2 = 2 = 1+
1 + cos[Γ( − 0 ] 1 − Γ2
The apses are min = (1+) for Γ( − 0 ) = 0 2 4 and max = (1−) for Γ( − 0 ) = 3. The
perihelion advances between cycles due to the change in relativistic mass during the trajectory as shown in
the adjacent figure. This precession leads to the fine structure observed in the optical spectra of the hydrogen
atom. The same precession of the perihelion occurs for planetary motion, however, there is a comparable
size effect due to gravity that requires use of general relativity to compute the trajectories.
Mach’s principle:
The 1883 work “The Science of Mechanics” by the philosopher/physicist, Ernst Mach, criticized Newton’s
concept of an absolute frame of reference, and suggested that local physical laws are determined by the large-
scale structure of the universe. Mach’s Principle assumes that local motion of a rotating frame is determined
by the large-scale distribution of matter, that is, relative to the fixed stars. Einstein’s interpretation of
Mach’s statement was that the inertial properties of a body are determined by the presence of other bodies
in the universe, and he named this concept “Mach’s Principle”.
Equivalence principle:
The equivalence principle comprises closely-related concepts dealing with the equivalence of gravitational and
inertial mass. The weak equivalence principle states that the inertial mass and gravitational mass of a
body are identical, leading to acceleration that is independent of the nature of the body. Galileo is reputed to have demonstrated
this at the Leaning Tower of Pisa. Recent measurements have shown that this weak equivalence principle
is obeyed to a sensitivity of $5 \times 10^{-13}$. Einstein’s equivalence principle states that the outcome of
any local non-gravitational experiment, in a freely falling laboratory, is independent of the velocity of the
laboratory and its location in space-time. This principle implies that the result of local experiments must be
independent of the velocity of the apparatus. Einstein’s equivalence principle has been tested by searching
for variations of dimensionless fundamental constants such as the fine structure constant. The strong
equivalence principle combines the weak equivalence and Einstein equivalence principles, and implies
that the gravitational constant is constant everywhere in the universe. The strong equivalence principle
suggests that gravity is geometrical in nature and does not involve any fifth force in nature. Tests of the
strong equivalence principle have involved searches for variations in the gravitational constant and masses
of fundamental particles throughout the life of the universe.
Principle of covariance
A physical law that is expressed in a covariant formulation has the same mathematical form in all coordinate
systems, and is usually expressed in terms of tensor fields. In the Special Theory of Relativity, the Lorentz,
rotational, translational and reflection transformations between inertial coordinate frames are covariant. The
covariant quantities are the 4-scalars, and 4-vectors in Minkowski space-time. Einstein recognized that the
principle of covariance, that is built into the Special Theory of Relativity, should apply equally to accelerated
relative motion in the General Theory of Relativity. He exploited tensor calculus to extend the Lorentz
covariance to the more general local covariance in the General Theory of Relativity. The reduction locally
of the general metric tensor to the Minkowski metric corresponds to free-falling motion, that is geodesic
motion, and thus encompasses gravitation.
17.8. THE GENERAL THEORY OF RELATIVITY 489
Correspondence principle
The Correspondence Principle states that the predictions of any new scientific theory must reduce to the
predictions of well established earlier theories under circumstances for which the preceding theory was known
to be valid. The Correspondence Principle is an important concept used both in quantum mechanics and
relativistic mechanics. Einstein’s Special Theory of Relativity satisfies the Correspondence Principle be-
cause it reduces to classical mechanics in the limit of velocities small compared to the speed of light. The
Correspondence Principle requires that the General Theory of Relativity reduce to the Special Theory of
Relativity for inertial frames, and should approximate Newton’s Theory of Gravitation in weak fields and at
low velocities.
Kepler problem In 1915 Einstein showed that the General Theory of Relativity explained the anomalous precession
of the perihelion of the planet Mercury, that is, the axes of the elliptical Kepler orbit are observed to precess.
Deflection of light Einstein’s prediction of the deflection of light in a gravitational field was confirmed by
Eddington during the solar eclipse of 29 May 1919. Pictures of stars in the region around the Sun showed that
their apparent locations were slightly shifted because the light from the stars had been deflected while passing
close to the Sun through its gravitational field.
Gravitational lensing The deflection of light by the gravitational attraction of a massive object situated
between a distant star and the observer has resulted in the observation of multiple images of a distant quasar.
Gravitational time dilation and frequency shift Processes occurring in a strong gravitational field are
slower than in a weak gravitational field; this is called gravitational time dilation. In addition, light climbing
out of a gravitational well is red shifted. The gravitational time dilation has been measured many times and
the successful operation of the Global Positioning System provides an ongoing validation. The gravitational
red shift has been confirmed in the laboratory using the precise Mössbauer effect in nuclear physics. Tests
in stronger gravitational fields are provided by studies of binary pulsars.
Black holes When the mass to radius ratio of a massive object becomes sufficiently large, general relativity
predicts formation of a black hole, which is a region of space from which neither light nor matter can escape.
Supermassive black holes, with masses that can be $10^6$ to $10^9$ solar masses, are thought to have played an
important role in formation of the galaxies.
Gravitational waves detection In 1916 Einstein predicted the existence of gravitational waves on the
basis of the theory of general relativity. The first indirect evidence for gravitational waves was obtained in
1976 by Hulse and Taylor who detected a decrease in the orbital period due to significant energy loss which
presumably was associated with emission of gravitational waves by the compact neutron star in the binary pulsar
PSR 1913+16. The most compelling direct observation of a gravitational wave was made
on 14 September 2015 by the LIGO Laser Interferometer Gravitational-Wave Observatories. The waveform
detected by the two LIGO observatories matched the predictions of General Relativity for gravitational waves
emanating from the inward spiral plus merger of a pair of black holes of around 36 and 29 solar masses,
followed by the ringdown of the resultant single black hole. The gravitational wave emitted by this cataclysmic merger
reached Earth as a ripple in space-time that changed the length of the 4 km LIGO arm by a thousandth of
the width of the proton. The gravitational energy emitted was $3.0^{+0.5}_{-0.5}\,c^2$ solar masses. A second observation
of gravitational waves was made on 26 December 2015, and four similar observations were made during
2017. The detection of such minuscule changes in space-time is a truly remarkable achievement. This direct
detection of gravitational waves resulted in the award of the 2017 Nobel Prize in Physics to Rainer Weiss, Barry
Barish, and Kip Thorne. Gravitational wave detection has opened an exciting and powerful new frontier in
astrophysics that could lead to exciting new physics.
17.10 Summary
Special theory of relativity: The Special Theory of Relativity is based on Einstein’s postulates:
1) The laws of nature are the same in all inertial frames of reference.
2) The velocity of light in vacuum is the same in all inertial frames of reference.
For a primed frame moving along the $x_1$ axis with velocity $v$, Einstein’s postulates imply the following
Lorentz transformations between the moving (primed) and stationary (unprimed) frames
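As a hedged numerical illustration of such a boost along one axis, the sketch below uses the standard form $t' = \gamma(t - \beta x)$, $x' = \gamma(x - \beta t)$ in units with $c = 1$, and checks that the interval $t^2 - x^2$ is unchanged. The function names and the sample event are assumptions chosen for the example, not notation from the text.

```python
import math

def lorentz_boost(t, x, beta):
    """Map an event (t, x) into a frame moving with velocity beta along x (units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return g * (t - beta * x), g * (x - beta * t)

def interval_squared(t, x):
    """Invariant interval t^2 - x^2 in units with c = 1."""
    return t * t - x * x

t, x, beta = 2.0, 1.0, 0.6          # illustrative event and relative velocity
t_prime, x_prime = lorentz_boost(t, x, beta)
print(t_prime, x_prime)
print(interval_squared(t, x), interval_squared(t_prime, x_prime))
```

Applying the same function with `-beta` to the primed coordinates returns the original event, which is a quick way to confirm that the inverse transformation is a boost with the opposite velocity.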
The General Theory of Relativity: An elementary summary was given of the fundamental concepts
of the General Theory of Relativity and the resultant unified description of the gravitational force plus
planetary motion as geodesic motion in a four-dimensional Riemannian structure. Variational mechanics
were shown to be ideally suited to applications of the General Theory of Relativity.
Philosophical implications: Newton’s equations of motion, and his Law of Gravitation, that reigned
supreme from 1687 to 1905, have been toppled from the throne by Einstein’s theories of relativistic mechanics.
By contrast, the complete independence of coordinate frames in the Lagrangian and Hamiltonian formulations of
classical mechanics, plus the underlying Principle of Least Action, are equally valid in both the relativistic and
non-relativistic regimes. As a consequence, relativistic Lagrangian and Hamiltonian formulations underlie
much of modern physics, especially quantum physics, which explains why relativistic mechanics plays such
an important role in classical dynamics.
Workshop exercises
1. A relativistic snake of proper length 100 cm is travelling to the right across a butcher’s table at $v = 0.6c$. You
hold two meat cleavers, one in each hand, which are 100 cm apart. You strike the table simultaneously with
both cleavers at the moment when the left cleaver lands just behind the tail of the snake. You rationalize that
since the snake is moving with $v = 0.6c$ then the length of the snake is Lorentz contracted by the factor $\gamma = 5/4$
and thus the Lorentz-contracted length of the snake is 80 cm and thus will not be harmed. However, the snake
reasons that relative to it the cleavers are moving at $v = 0.6c$ and thus are only 80 cm apart when they strike
the 100 cm long snake and thus it will be severed. Use the Lorentz transformation to resolve this paradox.
2. Explain what is meant by the following statement: “Lorentz transformations are orthogonal transformations
in Minkowski space.”
(a) Energy
(b) Momentum
(c) Mass
(d) Force
(e) Charge
(f) The length of a vector
(g) The length of a four-vector
4. What does it mean for two events to have a spacelike interval? What does it mean for them to have a timelike
interval? Draw a picture to support your answer. In which case can events be causally connected?
Problems
1. A supply rocket flies past two markers on the Space Station that are 50 apart in a time of 0.2 as measured
by an observer on the Space Station.
(a) What is the separation of the two markers as seen by the pilot riding in the supply rocket?
(b) What is the elapsed time as measured by the pilot in the supply rocket?
(c) What are the speeds calculated by the observer in the Space Station and the pilot of the supply rocket?
2. The Compton effect involves a photon of incident energy $E_i$ being scattered by an electron of mass $m_e$ which
initially is stationary. The photon scattered at an angle $\theta$ with respect to the incident photon has a final energy
$E_f$. Using the special theory of relativity derive a formula that relates $E_f$ and $E_i$ to $\theta$.
3. Pair creation involves production of an electron-positron pair by a photon. Show that such a process is
impossible unless some other body, such as a nucleus, is involved. Suppose that the nucleus has a mass $M$
and the electron mass $m_e$. What is the minimum energy that the photon must have in order to produce an
electron-positron pair?
4. A $K$ meson of rest energy 494 MeV decays into a $\mu$ meson of rest energy 106 MeV and a neutrino of zero
rest energy. Find the kinetic energies of the $\mu$ meson and the neutrino into which the $K$ meson decays while
at rest.
Chapter 18
The transition to quantum physics
18.1 Introduction
Classical mechanics, including extensions to relativistic velocities, embraces an unusually broad range of topics
ranging from astrophysics to nuclear and particle physics, from one-body to many-body statistical mechanics.
It is interesting to discuss the role of classical mechanics in the development of quantum mechanics which
plays a crucial role in physics. A valid question is “why discuss quantum mechanics in a classical mechanics
course?”. The answer is that quantum mechanics supersedes classical mechanics as the fundamental the-
ory of mechanics. Classical mechanics is an approximation applicable for situations where quantization is
unimportant. Thus there must be a correspondence principle that relates quantum mechanics to classical
mechanics, analogous to the relation between relativistic and non-relativistic mechanics. It is illuminating to
study the role played by the Hamiltonian formulation of classical mechanics in the development of quantal
theory and statistical mechanics. The Hamiltonian formulation is expressed in terms of the phase-space
variables q p for which there are well-established rules for transforming to quantal linear operators.
\[ E = h\nu = \hbar\omega \tag{18.1} \]
where $\nu$ is the frequency of the electromagnetic radiation and Planck’s constant, $h = 6.626 \times 10^{-34}$ J·s, was
the best fit parameter of the interpolation. That is, Planck assumed that energy comes in discrete bundles
of energy equal to $h\nu$ which are called quanta. By making this extreme assumption, in an act of desperation,
Planck was able to reproduce the experimental black body radiation spectrum. The assumption that energy
was exchanged in bundles hinted that the classical laws of physics were inadequate in the microscopic
domain. The older generation physicists initially refused to believe Planck’s hypothesis which underlies
quantum theory. It was the new generation physicists, like Einstein, Bohr, Heisenberg, Born, Schrödinger,
and Dirac, who developed Planck’s hypothesis leading to the revolutionary quantum theory.
In 1905, Einstein predicted the existence of the photon, derived the theory of specific heat, as well
as deriving the Theory of Special Relativity. It is remarkable to realize that he developed these three
revolutionary theories in one year, when he was only 26 years old. Einstein uncovered an inconsistency in
Planck’s derivation of the black body spectral distribution in that it assumed the statistical part of the energy
is quantized, whereas the electromagnetic radiation assumed Maxwell’s equations with oscillator energies
being continuous. Planck demanded that light of frequency $\nu$ be packaged in quanta whose energies were
multiples of $h\nu$, but Planck never thought that light would have particle-like behavior. Newton believed that
light involved corpuscles, and Hamilton developed the Hamilton-Jacobi theory seeking to describe light in
terms of the corpuscle theory. However, Maxwell had convinced physicists that light was a wave phenomenon;
interference plus diffraction effects were convincing manifestations of the wave-like properties of light. In
order to reproduce Planck’s prediction, Einstein had to treat black-body radiation as if it consisted of a gas
of photons, each photon having energy $E = h\nu$. This was a revolutionary concept that returned to Newton’s
corpuscle theory of light. Einstein realized that there were direct tests of his photon hypothesis, one of which
is the photo-electric effect. According to Einstein, each photon has an energy $E = h\nu$, in contrast to the
classical case where the energy of the photoelectron depends on the intensity of the light. Einstein predicted
that the ejected electron will have a kinetic energy
\[ T = h\nu - \phi \tag{18.2} \]
where $\phi$ is the work function which is the energy needed to remove an electron from a solid.
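A short numerical sketch of Einstein's photo-electric relation may help make the frequency threshold concrete. It assumes the relation in the form kinetic energy equals $h\nu$ minus the work function, uses the value of Planck's constant quoted above, and takes a work function of 2.3 eV purely as an illustrative number; the function name is hypothetical.

```python
H_PLANCK = 6.626e-34      # J s, the value quoted in the text
EV = 1.602e-19            # joules per electron-volt

def photoelectron_energy_ev(frequency_hz, work_function_ev):
    """Kinetic energy h*nu - phi in eV, or None if the photon is below threshold."""
    energy = H_PLANCK * frequency_hz / EV - work_function_ev
    return energy if energy > 0.0 else None

phi = 2.3                                        # eV, illustrative assumed work function
above = photoelectron_energy_ev(1.0e15, phi)     # ultraviolet photon
below = photoelectron_energy_ev(4.0e14, phi)     # red photon
print(above, below)
```

The function has no intensity argument at all: in this picture a brighter beam changes the number of ejected electrons, not the energy of each one, and no electron is ejected below the threshold frequency however intense the light.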
Many older scientists, including Planck, accepted Einstein’s theory of relativity but were skeptical of the
photon concept, even after Einstein’s photon concept was vindicated in 1915 by Millikan who showed that,
as predicted, the energy of the ejected photoelectron depended on the frequency, and not intensity, of the
light. In 1923 Compton demonstrated that electromagnetic radiation scattered by free electrons obeyed
simple two-body scattering laws, which finally convinced the many skeptics of the existence of the photon.
Table 18.1: Chronology of the development of quantum mechanics
Date  Author                  Development
1887  Hertz                   Discovered the photo-electric effect
1895  Röntgen                 Discovered x-rays
1896  Becquerel               Discovered radioactivity
1897  J.J. Thomson            Discovered the first fundamental particle, the electron
1898  Pierre & Marie Curie    Showed that thorium is radioactive, which founded nuclear physics
1900  Planck                  Quantization $E = h\nu$ explained the black-body spectrum
1905  Einstein                Theory of special relativity
1905  Einstein                Predicted the existence of the photon
1906  Einstein                Used Planck’s constant to explain specific heats of solids
1909  Millikan                The oil drop experiment measured the charge on the electron
1911  Rutherford              Discovered the atomic nucleus with radius $10^{-15}$ m
1912  Bohr                    Bohr model of the atom explained the quantized states of hydrogen
1914  Moseley                 X-ray spectra determined the atomic number of the elements
1915  Millikan                Used the photo-electric effect to confirm the photon hypothesis
1915  Wilson-Sommerfeld       Proposed quantization of the action-angle integral
1921  Stern-Gerlach           Observed space quantization in non-uniform magnetic field
1923  Compton                 Compton scattering of x-rays confirmed the photon hypothesis
1924  de Broglie              Postulated wave-particle duality for matter and EM waves
1924  Bohr                    Explicit statement of the correspondence principle
1925  Pauli                   Postulated the exclusion principle
1925  Goudsmit-Uhlenbeck      Postulated the spin of the electron of $s = \tfrac{1}{2}\hbar$
1925  Heisenberg              Matrix mechanics representation of quantum theory
1925  Dirac                   Related Poisson brackets and commutation relations
1926  Schrödinger             Wave mechanics
1927  G.P. Thomson/Davisson   Electron diffraction proved wave nature of electron
1928  Dirac                   Developed the Dirac relativistic wave equation
18.2. BRIEF SUMMARY OF THE ORIGINS OF QUANTUM THEORY 495
18.2.2 Quantization
By 1912 Planck, and others, had abandoned the concept that quantum theory was a branch of classical
mechanics, and were searching to see if classical mechanics was a special case of a more general quantum
physics, or quantum physics was a science altogether outside of classical mechanics. Also they were trying
to find a consistent and rational reason for quantization to replace the ad hoc assumption of Bohr.
In 1912 Sommerfeld proposed that, in every elementary process, the atom gains or loses a definite amount
of action between times $t_0$ and $t$ of
\[ S = \int_{t_0}^{t} L(t')\,dt' \tag{18.3} \]
where $S$ is the quantal analogue of the classical action function. It has been shown that the classical principle
of least action states that the action function is stationary for small variations of the trajectory. In 1915
Wilson and Sommerfeld recognized that the quantization of angular momentum could be expressed in terms
of the action-angle integral, that is equation 15.116. They postulated that, for every coordinate, the
action-angle variable is quantized
\[ \oint p_k\,dq_k = n_k h \tag{18.4} \]
where the action-angle variable integral is over one complete period of the motion. That is, they postulated
that Hamilton’s phase space is quantized, but the microscopic granularity is such that the quantization is
only manifest for atomic-sized domains. That is, $n_k$ is a small integer for atomic systems in contrast to
$n_k \approx 10^{74}$ for the Earth-Sun two-body system.
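The size of the quantum number for a macroscopic orbit can be estimated directly. The sketch below is a rough order-of-magnitude calculation, assuming the Wilson-Sommerfeld condition in the form $\oint p\,dq = nh$, a circular Earth orbit, and rounded values for the Earth's mass, orbital radius and period; these constants and the function name are assumptions introduced for the example.

```python
import math

H_PLANCK = 6.626e-34     # J s, the value quoted in the text
M_EARTH = 5.97e24        # kg, rounded (assumed)
R_ORBIT = 1.496e11       # m, mean Earth-Sun distance, rounded (assumed)
T_ORBIT = 3.156e7        # s, one year, rounded (assumed)

def orbital_quantum_number(mass, radius, period, h=H_PLANCK):
    """n = (closed-orbit integral of p dq) / h for a circular orbit."""
    speed = 2.0 * math.pi * radius / period
    angular_momentum = mass * speed * radius
    action = 2.0 * math.pi * angular_momentum   # integral of p_phi over one revolution
    return action / h

n_earth = orbital_quantum_number(M_EARTH, R_ORBIT, T_ORBIT)
print(f"n is of order 10^{math.floor(math.log10(n_earth))}")
```

With a quantum number this large, changing $n$ by one unit alters the orbit by an immeasurably small amount, which is why the granularity of phase space is invisible for planetary motion.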
Sommerfeld recognized that quantization of more than one degree of freedom is needed to obtain more
accurate description of the hydrogen atom. Sommerfeld reproduced the experimental data by assuming
quantization of the three degrees of freedom,
\[ \oint p_r\,dr = n_1 h \qquad \oint p_\theta\,d\theta = n_2 h \qquad \oint p_\phi\,d\phi = n_3 h \tag{18.5} \]
and solving Hamilton-Jacobi theory by separation of variables. In 1916 the Bohr-Sommerfeld model solved
the classical orbits for the hydrogen atom, including relativistic corrections as described in example 17.7.
This reproduced fine structure observed in the optical spectra of hydrogen. The use of the canonical trans-
formation to action-angle variables proved to be the ideal approach for solving many such problems in
quantum mechanics. In 1921 Stern and Gerlach demonstrated space quantization by observing the splitting
of atomic beams deflected by non-uniform magnetic fields. This result was a major triumph for quantum
theory. Sommerfeld declared that “With their bold experimental method, Stern and Gerlach demonstrated
not only the existence of space quantization, they also proved the atomic nature of the magnetic moment,
its quantum-theoretic origin, and its relation to the atomic structure of electricity.”
In 1925 Pauli’s Exclusion Principle proposed that no more than one electron can have identical quantum
numbers and that the atomic electronic state is specified by four quantum numbers. Two students, Goudsmit
and Uhlenbeck, suggested that a fourth two-valued quantum number was the electron spin of $\pm\hbar/2$. This
provided a plausible explanation for the structure of multi-electron atoms.
This relation, derived by de Broglie, is required to ensure that the particle travels at the group velocity
of the wave packet characterizing the particle. Note that although the relations used to characterize the
matter waves are purely classical, the physical content of such waves is beyond classical physics. In 1927 C.
Davisson and G.P. Thomson independently observed electron diffraction confirming wave/particle duality for
the electron. Ironically, J.J. Thomson discovered that the electron was a particle, whereas his son, G.P. Thomson,
demonstrated that it behaves as a wave.
Heisenberg developed the modern matrix formulation of quantum theory in 1925; he was 24 years old
at the time. A few months later Schrödinger developed wave mechanics based on de Broglie’s concept of
wave-particle duality. The matrix mechanics and wave mechanics formulations of quantum theory are radically different.
Heisenberg’s algebraic approach employs non-commuting quantities and unfamiliar mathematical techniques
that emphasized the discreteness characteristic of the corpuscle aspect. In contrast, Schrödinger used the
familiar analytical approach that is an extension of classical laws of motion and waves which stressed the
element of continuity.
18.3. HAMILTONIAN IN QUANTUM THEORY 497
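gives the Hamiltonian function $H(\mathbf{p}, \mathbf{q})$ of the matrices $\mathbf{q}$ and $\mathbf{p}$ which leads to Hamilton’s canonical equations
\[ \dot{\mathbf{q}} = \frac{\partial H}{\partial \mathbf{p}} \qquad \dot{\mathbf{p}} = -\frac{\partial H}{\partial \mathbf{q}} \tag{18.11} \]
\begin{align}
q_i p_j - p_j q_i &= i\hbar\,\delta_{ij} \tag{18.12}\\
q_i q_j - q_j q_i &= 0 \notag\\
p_i p_j - p_j p_i &= 0 \notag
\end{align}
Born realized that equation (18.12) is the only fundamental equation for introducing $\hbar$ into the theory in a
logical and consistent way.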
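The matrix commutation relation 18.12 can be checked on a small explicit example. The sketch below builds truncated harmonic-oscillator matrices for $q$ and $p$ from ladder operators; this choice of system, the units $\hbar = m = \omega = 1$, the basis size and all names are assumptions made for illustration. Because the infinite matrices are cut off at a finite size, the relation holds on the diagonal except for the last entry, which is a known artifact of truncation.

```python
import math

HBAR, MASS, OMEGA, N = 1.0, 1.0, 1.0, 6   # illustrative units and basis size (assumed)

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Lowering operator in the number basis: <n-1| a |n> = sqrt(n)
a = [[math.sqrt(j) if j == i + 1 else 0.0 for j in range(N)] for i in range(N)]
adag = [[a[j][i] for j in range(N)] for i in range(N)]   # real matrix, so transpose

cq = math.sqrt(HBAR / (2.0 * MASS * OMEGA))
cp = math.sqrt(MASS * OMEGA * HBAR / 2.0)
q = [[cq * (a[i][j] + adag[i][j]) for j in range(N)] for i in range(N)]
p = [[1j * cp * (adag[i][j] - a[i][j]) for j in range(N)] for i in range(N)]

qp, pq = matmul(q, p), matmul(p, q)
comm = [[qp[i][j] - pq[i][j] for j in range(N)] for i in range(N)]   # qp - pq

for k in range(N):
    print(k, comm[k][k])   # i*hbar on the diagonal, except the final truncated entry
```

The off-diagonal elements of the commutator vanish, so within the truncated basis $qp - pq$ is $i\hbar$ times the unit matrix apart from the last diagonal element.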
Chapter 15.2.4 discussed the formal correspondence between the Poisson bracket, defined in chapter 15.3,
and the commutator in classical mechanics. It was shown that the commutator of two functions equals a
constant multiplicative factor times the corresponding Poisson Bracket. That is
Dirac recognized that the correspondence between the classical Poisson bracket, and quantum commutator,
given by equation (18.13) provides a logical and consistent way that builds quantization directly into
the theory, rather than using an ad-hoc, case-dependent, hypothesis as used by the older quantum theory of
Bohr. The basis of Dirac’s quantization principle involves replacing the classical Poisson Bracket, $[F, G]$,
by the commutator, $\frac{1}{i\hbar}(FG - GF)$. That is,
\[ [F, G] \Longrightarrow \frac{1}{i\hbar}(FG - GF) \tag{18.17} \]
Hamilton’s canonical equations, as introduced in chapter 15, are only applicable to classical mechanics
since they assume that the position and conjugate momentum can be specified both exactly and
simultaneously, which contradicts Heisenberg’s Uncertainty Principle. In contrast, the Poisson bracket
generalization of Hamilton’s equations allows for non-commuting variables plus the corresponding uncertainty
principle. That is, the transformation from classical mechanics to quantum mechanics can be accomplished
simply by replacing the classical Poisson Bracket by the quantum commutator, as proposed by Dirac. The
formal analogy between classical Hamiltonian mechanics, and the Heisenberg representation of quantum
mechanics is strikingly apparent using the correspondence between the Poisson Bracket representation of
Hamiltonian mechanics and Heisenberg’s matrix mechanics.
The direct relation between the quantum commutator, and the corresponding classical Poisson Bracket,
applies to many observables. For example, the quantum analogs of Hamilton’s equations of motion are
given by use of Hamilton’s equations of motion, 15.53 and 15.56, and replacing each Poisson Bracket by the
corresponding commutator. That is
\[ \frac{dq_k}{dt} = \frac{\partial H}{\partial p_k} = [q_k, H] = \frac{1}{i\hbar}(q_k H - H q_k) \tag{18.18} \]
\[ \frac{dp_k}{dt} = -\frac{\partial H}{\partial q_k} = [p_k, H] = \frac{1}{i\hbar}(p_k H - H p_k) \tag{18.19} \]
Chapter 15.2.5 discussed the time dependence of observables in Hamiltonian mechanics. Equation 15.45
gave the total time derivative of any observable $G$ to be
\[ \frac{dG}{dt} = \frac{\partial G}{\partial t} + [G, H] \tag{18.20} \]
Equation 18.17 can be used to replace the Poisson Bracket by the quantum commutator, which gives the
corresponding time dependence of observables in quantum physics.
\[ \frac{dG}{dt} = \frac{\partial G}{\partial t} + \frac{1}{i\hbar}(GH - HG) \tag{18.21} \]
In quantum mechanics, equation 18.21 is called the Heisenberg equation. Note that if the observable $G$ is
chosen to be a fundamental canonical variable, then
$\frac{\partial q_k}{\partial t} = 0 = \frac{\partial p_k}{\partial t}$ and equation 15.20 reduces to Hamilton’s
equations of motion. If $G$ is a constant of motion, then $\frac{dG}{dt} = 0$, that is
\[ \frac{\partial G}{\partial t} + \frac{1}{i\hbar}(GH - HG) = 0 \tag{18.22} \]
Moreover, if $G$ is not an explicit function of time, then
\[ 0 = \frac{1}{i\hbar}(GH - HG) \tag{18.23} \]
That is, the transition to quantum physics shows that, if $G$ is a constant of motion, and is not explicitly
time dependent, then $G$ commutes with the Hamiltonian $H$.
The above discussion has illustrated the close and beautiful correspondence between the Poisson Bracket
representation of classical Hamiltonian mechanics, and the Heisenberg representation of quantum mechanics.
Dirac provided the elegant and simple correspondence principle connecting the Poisson bracket representation
of classical Hamiltonian mechanics, to the Heisenberg representation of quantum mechanics.
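where the action $S$ gives the phase of the wavefront, and the prefactor of the exponential gives the amplitude of the wave, as described in
chapter 15.4.4. The time dependence, that characterizes the motion of the wavefront, is contained in the
time dependence of $S$. This form for the wavefunction has the advantage that the wavefunction frequently
factors into a product of terms, e.g. $\psi = R(r)\Theta(\theta)\Phi(\phi)$ which corresponds to a summation of the exponents
$S = S_r + S_\theta + S_\phi - Et$. This summation form is exploited by separation of the variables, as discussed in
chapter 15.4.3.
Inserting (18.33) into equation (18.28), plus using the fact that
\[ \frac{\partial^2 \psi}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial \psi}{\partial x}\right) = \frac{\partial}{\partial x}\left(\frac{i}{\hbar}\frac{\partial S}{\partial x}\psi\right) = -\frac{1}{\hbar^2}\left(\frac{\partial S}{\partial x}\right)^2 \psi + \frac{i}{\hbar}\frac{\partial^2 S}{\partial x^2}\psi \tag{18.35} \]
leads to
\[ -\frac{\partial S}{\partial t} = \frac{1}{2m}(\nabla S \cdot \nabla S) + U(\mathbf{r}) - \frac{i\hbar}{2m}\nabla^2 S \tag{18.36} \]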
Note that if Planck’s constant $\hbar = 0$ then the imaginary term in equation (18.35) is zero, leading to 18.35
being real, and identical to the Hamilton-Jacobi result, equation 18.23. The fact that equation 18.35
equals the Hamilton-Jacobi equation in the limit $\hbar \to 0$ illustrates the close analogy between the wave-
particle duality of the classical Hamilton-Jacobi theory, and de Broglie’s wave-particle duality in Schrödinger’s
quantum wave-mechanics representation.
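The second-derivative identity labelled (18.35) can be verified numerically for a wavefunction of the form $e^{iS/\hbar}$ with constant amplitude. The sketch below is illustrative only: the one-dimensional action $S(x) = \sin x + x^2/2$, the evaluation point, and the function names are arbitrary assumptions. It also prints the ratio of the imaginary term to the real term, which scales linearly with $\hbar$ and so vanishes in the classical limit.

```python
import cmath
import math

def S(x):
    """Arbitrary smooth test action (an assumption for this check)."""
    return math.sin(x) + 0.5 * x * x

def dS(x):
    return math.cos(x) + x

def d2S(x):
    return 1.0 - math.sin(x)

def psi(x, hbar):
    """Wavefunction exp(i S / hbar) with unit amplitude."""
    return cmath.exp(1j * S(x) / hbar)

def d2psi_numeric(x, hbar, h=1.0e-4):
    """Second derivative of psi from a central finite difference."""
    return (psi(x + h, hbar) - 2.0 * psi(x, hbar) + psi(x - h, hbar)) / (h * h)

def d2psi_formula(x, hbar):
    """Right-hand side of the identity: [-(S'/hbar)^2 + i S''/hbar] psi."""
    return (-(dS(x) / hbar) ** 2 + 1j * d2S(x) / hbar) * psi(x, hbar)

def imaginary_to_real_ratio(x, hbar):
    """Size of the term hbar*S'' relative to (S')^2."""
    return hbar * abs(d2S(x)) / dS(x) ** 2

x0 = 0.8
error = abs(d2psi_numeric(x0, 1.0) - d2psi_formula(x0, 1.0))
print(error)
print(imaginary_to_real_ratio(x0, 1.0), imaginary_to_real_ratio(x0, 0.01))
```

The small finite-difference error confirms the identity at the chosen point, and the shrinking ratio shows how the term proportional to $\hbar$ becomes negligible, leaving the Hamilton-Jacobi form.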
The Schrödinger approach was accepted in 1926 and exploited extensively with tremendous success, since
it is much easier to grasp conceptually than is the algebraic approach of Heisenberg. Initially there was much
conflict between the proponents of these two contradictory approaches, but this was resolved by Schrödinger
who showed in 1926 that there is a formal mathematical identity between wave mechanics and matrix
mechanics. That is, these two quantal representations of Hamiltonian mechanics are equivalent, even though
they are built on either the Poisson bracket representation, or the Hamilton-Jacobi representation. Wave
mechanics is based intimately on the quantization rule of the action variable. Heisenberg’s Uncertainty
Principle is automatically satisfied by Schrödinger’s wave mechanics since the uncertainty principle is a
feature of all wave motion, as described in chapter 3.
In 1928 Dirac developed a relativistic wave equation which includes spin as an integral part. This Dirac
equation remains the fundamental wave equation of quantum mechanics. Unfortunately it is difficult to
apply.
Today the powerful and efficient Heisenberg representation is the dominant approach used in the field of
physics, whereas chemists tend to prefer the more intuitive Schrödinger wave mechanics approach. In either
case, the important role of Hamiltonian mechanics in quantum theory is undeniable.
The motivation for Feynman’s 1942 Ph.D. thesis, entitled “The Principle of Least Action in Quantum
Mechanics”, was to quantize the classical action at a distance in electrodynamics. This theory adopted an
overall space-time viewpoint for which the classical Hamiltonian approach, as used in conventional formu-
lations of quantum mechanics, is inapplicable. Feynman used the Lagrangian, plus the principle of least
action, to underlie his development of quantum field theory. To paraphrase Feynman’s Nobel Lecture, he
used a physical approach that is quite different from the customary Hamiltonian point of view for which the
system is discussed in great detail as a function of time. That is, you have the field at this moment, then a
differential equation gives you the field at a later moment and so on; that is, the Hamiltonian approach is a
time differential method. In Feynman’s least-action approach the action describes the character of the path
throughout all of space and time. The behavior of nature is determined by saying that the whole space-time
path has a certain character. The use of action involves both advanced and retarded terms that make it
difficult to transform back to the Hamiltonian form. The Feynman space-time approach is far beyond the
scope of this course. This topic will be developed in advanced graduate courses on quantum field theory.
18.6 Summary
The important point of this discussion is that variational formulations of classical mechanics provide a
rational and direct basis for the development of quantum mechanics. It has been shown that the final form
of quantum mechanics is closely related to the Hamiltonian formulation of classical mechanics. Quantum
mechanics supersedes classical mechanics as the fundamental theory of mechanics in that classical mechanics
only applies for situations where quantization is unimportant, and is the limiting case of quantum mechanics
when $\hbar \to 0$, which is in agreement with Bohr’s Correspondence Principle. The Dirac relativistic theory
of quantum mechanics is the ultimate quantal theory for the relativistic regime.
This discussion has barely scratched the surface of the correspondence between classical and quantal
mechanics, which goes far beyond the scope of this course. The goal of this chapter is to illustrate that
classical mechanics, in particular, Hamiltonian mechanics, underlies much of what you will learn in your
quantum physics courses. An interesting similarity between quantum mechanics and classical mechanics is
that physicists usually use the more visual Schrödinger wave representation in order to describe quantum
physics to the non-expert, which is analogous to the similar use of Newtonian physics in classical mechan-
ics. However, practicing physicists invariably use the more abstract Heisenberg matrix mechanics to solve
problems in quantum mechanics, analogous to widespread use of the variational approach in classical me-
chanics, because the analytical approaches are more powerful and have fundamental advantages. Quantal
problems in molecular, atomic, nuclear, and subnuclear systems, usually involve finding the normal modes
of a quantal system, that is, finding the eigen-energies, eigen-functions, spin, parity, and other observables
for the discrete quantized levels. Solving the equations of motion for the modes of quantal systems is sim-
ilar to solving the many-body coupled-oscillator problem in classical mechanics, where it was shown that
use of matrix mechanics is the most powerful representation. It is ironic that the introduction of matrix
methods to classical mechanics is a by-product of the development of matrix mechanics by Heisenberg, Born
and Jordan. This illustrates that classical mechanics not only played a pivotal role in the development of
quantum mechanics, but it also has benefitted considerably from the development of quantum mechanics;
that is, the synergistic relation between these two complementary branches of physics has been beneficial to
both classical and quantum mechanics.
Recommended reading
“Quantum Mechanics” by P.A.M. Dirac, Oxford Press, 1947.
“Conceptual Development of Quantum Mechanics” by Max Jammer, McGraw-Hill, 1966.
Chapter 19
Epilogue
Figure 19.1: Philosophical road map of the hierarchy of stages involved in analytical mechanics. Hamilton’s
Action Principle is the foundation of analytical mechanics. Stage 1 uses Hamilton’s Principle to derive the
Lagrangian and Hamiltonian. Stage 2 uses either the Lagrangian or Hamiltonian to derive the equations
of motion for the system. Stage 3 uses these equations of motion to solve for the actual motion using
the assumed initial conditions. The Lagrangian approach can be derived directly based on d’Alembert’s
Principle. Newtonian mechanics can be derived directly based on Newton’s Laws of Motion.
This book has introduced powerful analytical methods in physics that are based on applications of
variational principles to Hamilton’s Action Principle. These methods were pioneered in classical mechanics
by Leibniz, Lagrange, Euler, Hamilton, and Jacobi, during the remarkable Age of Enlightenment, and reached
full fruition at the start of the 20th century.
The philosophical roadmap, shown above, illustrates the hierarchy of philosophical approaches available
when using Hamilton’s Action Principle to derive the equations of motion of a system. The primary Stage 1
uses Hamilton’s Action functional, $S = \int L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt$, to derive the Lagrangian and Hamiltonian functionals.
Stage 1 provides the most fundamental and sophisticated level of understanding and involves specifying
all the active degrees of freedom, as well as the interactions involved. Stage 2 uses the Lagrangian or
Hamiltonian functionals, derived at Stage 1, in order to derive the equations of motion for the system of interest.
Stage 3 then uses the derived equations of motion to solve for the motion of the system, subject to a given
set of initial conditions.
Newton postulated equations of motion for nonrelativistic classical mechanics that are identical to those
derived by applying variational principles to Hamilton’s Principle. However, Newton’s Laws of Motion are
applicable only to nonrelativistic classical mechanics, and cannot exploit the advantages of using the more
fundamental Hamilton’s Action Principle, Lagrangian, and Hamiltonian. Newtonian mechanics requires that
all the active forces be included in the equations of motion, and involves dealing with vector quantities which
is more difficult than using the scalar functionals, action, Lagrangian, or Hamiltonian. Lagrangian mechanics
based on d’Alembert’s Principle does not exploit the advantages provided by Hamilton’s Action Principle.
Considerable advantages result from deriving the equations of motion based on Hamilton’s Principle,
rather than basing them on Newton’s postulated Laws of Motion. It is significantly easier to use variational
principles to handle the scalar functionals, action, Lagrangian, and Hamiltonian, rather than starting
with Newton’s vector differential equations of motion. The three hierarchical stages of analytical mechanics
facilitate accommodating extra degrees of freedom, symmetries, constraints, and other interactions. For
example, the symmetries identified by Noether’s theorem are more easily recognized during the primary “action”
and secondary “Hamiltonian/Lagrangian” stages, rather than at the subsequent “equations-of-motion”
stage. Constraint forces, and approximations, introduced at Stage 1 or Stage 2, are easier to implement
than at the subsequent Stage 3. The correspondence of Hamilton’s Action in classical and quantal mechanics,
as well as relativistic invariance, are crucial advantages for using the analytical approach in relativistic
mechanics, fluid motion, quantum mechanics, and field theory.
Philosophically, Newtonian mechanics is straightforward to understand since it uses vector differential
equations of motion that relate the instantaneous forces to the instantaneous accelerations. Moreover,
the concepts of momentum plus force are intuitive to visualize, and both cause and effect are embedded
in Newtonian mechanics. Unfortunately, Newtonian mechanics is incompatible with quantum physics, it
violates the relativistic concepts of space-time, and fails to provide the unified description of the gravitational
force plus planetary motion as geodesic motion in a four-dimensional Riemannian structure.
The remarkable philosophical implications embedded in applying variational principles to Hamilton’s
Principle, are based on the astonishing assumption that motion of a constrained system in nature follows
a path that minimizes the action integral. As a consequence, solving the equations of motion is reduced
to finding the optimum path that minimizes the action integral. The fact that nature follows optimization
principles is nonintuitive, and was considered to be metaphysical by many scientists and philosophers during
the 19th century, which delayed full acceptance of analytical mechanics until the development of the Theory
of Relativity and quantum mechanics. Variational formulations now have become the preeminent approach
in modern physics and they have toppled Newtonian mechanics from the throne of classical mechanics that
it occupied for two centuries.
The scope of this book extends beyond the typical classical mechanics textbook in order to illustrate
how Lagrangian and Hamiltonian dynamics provide the foundation upon which modern physics is built.
Knowledge of analytical mechanics is essential for the study of modern physics. The techniques and physics
discussed in this book reappear in different guises in many fields, but the basic physics is unchanged illustrat-
ing the intellectual beauty, the philosophical implications, and the unity of the field of physics. The breadth
of physics addressed by variational principles in classical mechanics, and the underlying unity of the field,
are epitomized by the wide range of dimensions, energies, and complexity involved. The dimensions range
from as large as $10^{27}$ m to quantal analogues of classical mechanics of systems spanning in size down to the
Planck length of $1.62 \times 10^{-35}$ m. Individual particles have been detected with kinetic energies ranging from
zero to greater than $10^{15}$ eV. The complexity of classical mechanics spans from one body to the statistical
mechanics of many-body systems. As a consequence, analytical variational methods have become the pre-
mier approach to describe systems from the very largest to the smallest, and from one-body to many-body
dynamical systems.
The goal of this book has been to illustrate the astonishing power of analytical variational methods for
understanding the physics underlying classical mechanics, as well as extensions to modern physics. However,
the present narrative remains unfinished in that fundamental philosophical and technical questions have
not been addressed. For example, analytical mechanics is based on the validity of the assumed principle of
economy. This book has not addressed the philosophical question, “is the principle of economy a fundamental
law of nature, or is it a fortuitous consequence of the fundamental laws of nature?”
In summary, Hamilton’s action principle, which is built into Lagrangian and Hamiltonian mechanics,
coupled with the availability of a wide arsenal of variational principles and mathematical techniques, provides
a remarkably powerful approach for deriving the equations of motion required to determine the response of
systems in a broad and diverse range of applications in science and engineering.
Appendix A
Matrix algebra
A.2 Matrices
Matrix algebra provides an elegant and powerful representation of multivariate operators, and coordinate
transformations that feature prominently in classical mechanics. For example they play a pivotal role in
finding the eigenvalues and eigenfunctions for coupled equations that occur in rigid-body rotation, and
coupled oscillator systems. An understanding of the role of matrix mechanics in classical mechanics facilitates
understanding of the equally important role played by matrix mechanics in quantal physics.
It is interesting that although determinants were used by physicists in the late 19th century, the concept
of matrix algebra was developed by Arthur Cayley in England in 1855 but many of these ideas were the work
of Hamilton, and the discussion of matrix algebra was buried in a more general discussion of determinants.
Matrix algebra was an esoteric branch of mathematics, little known by the physics community, until 1925
when Heisenberg proposed his innovative new quantum theory. The striking feature of this new theory
was its representation of physical quantities by sets of time-dependent complex numbers and a peculiar
multiplication rule. Max Born recognized that Heisenberg’s multiplication rule is just the standard “row
times column” multiplication rule of matrix algebra; a topic that he had encountered as a young student in a
mathematics course. In 1924 Richard Courant had just completed the first volume of the new text Methods
of Mathematical Physics during which Pascual Jordan had served as his young assistant working on matrix
manipulation. Fortuitously, Jordan and Born happened to share a carriage on a train to Hanover during
which Jordan overheard Born talk about his problems trying to work with matrices. Jordan introduced
himself to Born and offered to help. This led to publication, in September 1925, of the famous Born-Jordan
paper[Bor25a] that gave the first rigorous formulation of matrix mechanics in physics. This was followed in
November by the Born-Heisenberg-Jordan sequel[Bor25b] that established a logically consistent general method
for solving matrix mechanics problems plus a connection between the mathematics of matrix mechanics and
linear algebra. Matrix algebra developed into an important tool in mathematics and physics during World
War 2 and now it is an integral part of undergraduate linear algebra courses.
Most applications of matrix algebra in this book are restricted to real, symmetric, square matrices. The
size of a matrix is defined by the rank, which equals the row rank and column rank, i.e. the number of
independent row vectors or column vectors in the square matrix. It is presumed that you have studied
matrices in a linear algebra course. Thus the goal of this review is to list simple manipulation of symmetric
matrices and matrix diagonalization that will be used in this course. You are referred to a linear algebra
textbook if you need further details.
Matrix definition
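A matrix is a rectangular array of numbers with $M$ rows and $N$ columns. The notation used for an element
of a matrix is $A_{ij}$ where $i$ designates the row and $j$ designates the column of this matrix element in the
matrix $\mathbf{A}$. Convention denotes a matrix $\mathbf{A}$ as
$$\mathbf{A} \equiv \begin{pmatrix}
A_{11} & A_{12} & \cdots & A_{1(N-1)} & A_{1N} \\
A_{21} & A_{22} & \cdots & A_{2(N-1)} & A_{2N} \\
\vdots & \vdots & & \vdots & \vdots \\
A_{(M-1)1} & A_{(M-1)2} & \cdots & A_{(M-1)(N-1)} & A_{(M-1)N} \\
A_{M1} & A_{M2} & \cdots & A_{M(N-1)} & A_{MN}
\end{pmatrix} \tag{A.1}$$
Matrices can be square, $M = N$, or rectangular, $M \neq N$. Matrices having only one row or column are
called row or column vectors respectively, and need only a single subscript label. For example,
$$\mathbf{A} = \begin{pmatrix} A_1 \\ A_2 \\ \vdots \\ A_{M-1} \\ A_M \end{pmatrix} \tag{A.2}$$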
Matrix manipulation
Matrices are defined to obey certain rules for matrix manipulation as given below.
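1) Multiplication of a matrix $\mathbf{A}$ by a scalar $\lambda$ simply multiplies each matrix element by $\lambda$
$$C_{ij} = \lambda A_{ij} \tag{A.3}$$
2) Addition of two matrices $\mathbf{A}$ and $\mathbf{B}$ having the same rank, i.e. the number of columns, is given by
$$C_{ij} = A_{ij} + B_{ij} \tag{A.4}$$
3) Multiplication of a matrix $\mathbf{A}$ by a matrix $\mathbf{B}$ is defined only if the number of columns in $\mathbf{A}$ equals the
number of rows in $\mathbf{B}$. The product matrix $\mathbf{C}$ is given by the matrix product
$$\mathbf{C} = \mathbf{A}\cdot\mathbf{B} \tag{A.5}$$
$$C_{ij} = [\mathbf{A}\mathbf{B}]_{ij} = \sum_k A_{ik}B_{kj} \tag{A.6}$$
For example, if both $\mathbf{A}$ and $\mathbf{B}$ are rank three symmetric matrices then
$$\mathbf{C} = \mathbf{A}\cdot\mathbf{B} = \begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix}\cdot\begin{pmatrix} B_{11} & B_{12} & B_{13} \\ B_{21} & B_{22} & B_{23} \\ B_{31} & B_{32} & B_{33} \end{pmatrix}$$
$$= \begin{pmatrix}
A_{11}B_{11} + A_{12}B_{21} + A_{13}B_{31} & A_{11}B_{12} + A_{12}B_{22} + A_{13}B_{32} & A_{11}B_{13} + A_{12}B_{23} + A_{13}B_{33} \\
A_{21}B_{11} + A_{22}B_{21} + A_{23}B_{31} & A_{21}B_{12} + A_{22}B_{22} + A_{23}B_{32} & A_{21}B_{13} + A_{22}B_{23} + A_{23}B_{33} \\
A_{31}B_{11} + A_{32}B_{21} + A_{33}B_{31} & A_{31}B_{12} + A_{32}B_{22} + A_{33}B_{32} & A_{31}B_{13} + A_{32}B_{23} + A_{33}B_{33}
\end{pmatrix}$$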
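The row-times-column rule of equation A.6 can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `mat_mul` and the two symmetric matrices are assumptions chosen for the example, not taken from the text.

```python
def mat_mul(A, B):
    """Return C = A . B with C[i][j] = sum_k A[i][k] * B[k][j] (equation A.6)."""
    n_rows, n_inner, n_cols = len(A), len(B), len(B[0])
    if any(len(row) != n_inner for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(n_inner))
             for j in range(n_cols)]
            for i in range(n_rows)]


# Hypothetical symmetric 3x3 matrices, used only for illustration.
A = [[1, 2, 0],
     [2, 3, 1],
     [0, 1, 4]]
B = [[2, 0, 1],
     [0, 1, 0],
     [1, 0, 3]]

C = mat_mul(A, B)   # A . B
D = mat_mul(B, A)   # B . A
```

For these matrices the two products differ, so matrix multiplication does not commute in general. Because both factors are symmetric, `D` is the transpose of `C`.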
Transposed matrix $\mathbf{A}^T$
The transpose of a matrix $\mathbf{A}$ will be denoted by $\mathbf{A}^T$ and is given by interchanging rows and columns, that is
$$\left(\mathbf{A}^T\right)_{ij} = A_{ji} \tag{A.8}$$
The transpose of a column vector is a row vector. Note that older texts use the symbol à for the transpose.
Orthogonal matrix
A matrix with real elements is orthogonal if
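$$\mathbf{A}^T = \mathbf{A}^{-1} \tag{A.12}$$
That is
$$\sum_k \left(\mathbf{A}^T\right)_{ik}A_{kj} = \sum_k A_{ki}A_{kj} = \delta_{ij} \tag{A.13}$$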
Adjoint matrix $\mathbf{A}^\dagger$
For a matrix with complex elements, the adjoint matrix, denoted by $\mathbf{A}^\dagger$, is defined as the transpose of the
complex conjugate
$$\left(\mathbf{A}^\dagger\right)_{ij} = A^*_{ji} \tag{A.14}$$
Hermitian matrix
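The Hermitian conjugate of a complex matrix $\mathbf{H}$ is denoted as $\mathbf{H}^\dagger$ and is defined as
$$\mathbf{H}^\dagger = \left(\mathbf{H}^T\right)^* = \left(\mathbf{H}^*\right)^T \tag{A.15}$$
Therefore
$$H^\dagger_{ij} = H^*_{ji} \tag{A.16}$$
A matrix is Hermitian if it is equal to its adjoint
$$\mathbf{H}^\dagger = \mathbf{H} \tag{A.17}$$
that is
$$H^\dagger_{ij} = H^*_{ji} = H_{ij} \tag{A.18}$$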
A matrix that is both Hermitian and has real elements is a symmetric matrix since complex conjugation has
no effect.
Unitary matrix
A matrix with complex elements is unitary if its inverse is equal to the adjoint matrix
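$$\mathbf{U}^\dagger = \mathbf{U}^{-1} \tag{A.19}$$
which is equivalent to
$$\mathbf{U}^\dagger\mathbf{U} = \mathbf{I} \tag{A.20}$$
A unitary matrix with real elements is an orthogonal matrix as given in equation A.12.
The trace of a square matrix, denoted by $\mathrm{Tr}\,\mathbf{A}$, is defined as the sum of the diagonal matrix elements.
$$\mathrm{Tr}\,\mathbf{A} = \sum_{i=1}^{N} A_{ii} \tag{A.21}$$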
Real vectors The generalization of the scalar (dot) product in Euclidean space is called the inner product.
Exploiting the rules of matrix multiplication requires taking the transpose of the first column vector
to form a row vector which then is multiplied by the second column vector using the conventional rules for
matrix multiplication. That is, for rank $N$ vectors
$$[X]\cdot[Y] = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{pmatrix}\cdot\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{pmatrix} = [X]^T[Y] = \begin{pmatrix} X_1 & X_2 & \cdots & X_N \end{pmatrix}\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{pmatrix} = \sum_{i=1}^{N} X_iY_i \tag{A.22}$$
For rank $N = 3$ this inner product agrees with the conventional definition of the scalar product and gives a
result that is a scalar. For the special case when $[A]\cdot[B] = 0$ the two matrices are called orthogonal.
The magnitude squared of a column vector is given by the inner product
$$[X]\cdot[X] = \sum_{i=1}^{N} \left(X_i\right)^2 \geq 0 \tag{A.23}$$
Complex vectors For vectors having complex matrix elements the inner product is generalized to a form
that is consistent with equation A.22 when the column vector matrix elements are real.
$$[X]^*\cdot[Y] = [X]^\dagger[Y] = \begin{pmatrix} X_1^* & X_2^* & \cdots & X_{N-1}^* & X_N^* \end{pmatrix}\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_{N-1} \\ Y_N \end{pmatrix} = \sum_{i=1}^{N} X_i^*Y_i \tag{A.24}$$
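A minimal Python sketch of the inner product of equation A.24 follows. The helper name `inner` and the sample vectors are assumptions made for illustration; for real vectors the complex conjugation has no effect and the result reduces to equation A.22.

```python
def inner(X, Y):
    """Inner product [X]^dagger [Y] = sum_i conj(X_i) * Y_i (equation A.24)."""
    if len(X) != len(Y):
        raise ValueError("vectors must have the same number of elements")
    return sum(complex(x).conjugate() * y for x, y in zip(X, Y))


# Real vectors: reduces to the ordinary scalar product of equation A.22.
real_product = inner([1, 2, 3], [4, -5, 6])

# Complex vectors (hypothetical values chosen for illustration).
Z = [1 + 1j, 2j]
norm_squared = inner(Z, Z)           # real and non-negative
orthogonal = inner([1, 1j], [1, -1j])
```

Conjugating the first vector is what makes the magnitude squared of a complex vector real and non-negative.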
A.3 Determinants
Definition
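The determinant of a square matrix with $N$ rows equals a single number derived using the matrix elements
of the matrix. The determinant is denoted as $\det\mathbf{A}$ or $|\mathbf{A}|$ where
$$|\mathbf{A}| = \sum_{(j_1 j_2 \ldots j_N)} \varepsilon(j_1, j_2, \ldots, j_N)\,A_{1j_1}A_{2j_2}\cdots A_{Nj_N} \tag{A.26}$$
where the sum runs over the permutations $(j_1, j_2, \ldots, j_N)$ and $\varepsilon(j_1, j_2, \ldots, j_N)$ is the permutation index,
$+1$ or $-1$, depending on whether the number of permutations required to go from the normal order
$(1, 2, 3, \ldots, N)$ to the sequence $(j_1, j_2, j_3, \ldots, j_N)$ is even or odd.
For example for $N = 3$ the determinant is
$$|\mathbf{A}| = A_{11}A_{22}A_{33} + A_{12}A_{23}A_{31} + A_{13}A_{21}A_{32} - A_{13}A_{22}A_{31} - A_{11}A_{23}A_{32} - A_{12}A_{21}A_{33} \tag{A.27}$$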
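The permutation sum of equation A.26 can be checked against the explicit six-term expansion of equation A.27 with a short Python sketch. The function names and the sample matrix are assumptions made for illustration; the permutation sum costs $N!$ terms, so this is a teaching device rather than a practical algorithm for large matrices.

```python
from itertools import permutations


def permutation_sign(perm):
    """Return +1 for an even permutation of (0, 1, ..., N-1) and -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:              # each swap puts one element in its place
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign


def determinant(A):
    """Determinant as a signed sum over permutations of the column indices (A.26)."""
    total = 0
    for perm in permutations(range(len(A))):
        term = permutation_sign(perm)
        for row, col in enumerate(perm):
            term *= A[row][col]
        total += term
    return total


def determinant_3x3(A):
    """Explicit six-term expansion for N = 3 (A.27)."""
    return (A[0][0] * A[1][1] * A[2][2] + A[0][1] * A[1][2] * A[2][0]
            + A[0][2] * A[1][0] * A[2][1] - A[0][2] * A[1][1] * A[2][0]
            - A[0][0] * A[1][2] * A[2][1] - A[0][1] * A[1][0] * A[2][2])


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]          # hypothetical matrix used only for illustration
swapped = [A[1], A[0], A[2]]
transposed = [list(col) for col in zip(*A)]
```

The same functions can be used to confirm the properties listed next, for example that interchanging two rows changes the sign and that transposition leaves the determinant unchanged.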
Properties
1. The value of a determinant $|\mathbf{A}| = 0$, if
3. The value of a determinant changes sign if any two rows, or any two columns, are interchanged.
4. Transposing a square matrix does not change its determinant, $\left|\mathbf{A}^T\right| = |\mathbf{A}|$.
5. If any row (column) is multiplied by a constant factor then the value of the determinant is multiplied
by the same factor.
6. The determinant of a diagonal matrix equals the product of the diagonal matrix elements. That is,
when $A_{ij} = \lambda_i\delta_{ij}$ then $|\mathbf{A}| = \lambda_1\lambda_2\lambda_3\cdots\lambda_N$.
8. The determinant of the null matrix, for which all matrix elements are zero, is $|0| = 0$.
10. If each element of any row (column) appears as the sum (difference) of two or more quantities, then
the determinant can be written as a sum (difference) of two or more determinants of the same order.
For example for order $N = 2$
$$\begin{vmatrix} A_{11} \pm B_{11} & A_{12} \pm B_{12} \\ A_{21} & A_{22} \end{vmatrix} = \begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix} \pm \begin{vmatrix} B_{11} & B_{12} \\ A_{21} & A_{22} \end{vmatrix}$$
11. The determinant of a matrix product equals the product of the determinants. That is, if $\mathbf{C} = \mathbf{A}\mathbf{B}$ then
$|\mathbf{C}| = |\mathbf{A}|\,|\mathbf{B}|$.
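Cofactors are used to expand the determinant of a square matrix in order to evaluate the determinant.
$$A^{-1}_{ij} = \frac{1}{|\mathbf{A}|}a_{ji} \tag{A.30}$$
where $a_{ji}$ denotes the cofactor of the matrix element $A_{ji}$.
Equations A.28 and A.29 can be used to evaluate the $ij$ element of the matrix product $\mathbf{A}^{-1}\mathbf{A}$
$$\left(\mathbf{A}^{-1}\mathbf{A}\right)_{ij} = \sum_{k=1}^{N} A^{-1}_{ik}A_{kj} = \frac{1}{|\mathbf{A}|}\sum_{k=1}^{N} a_{ki}A_{kj} = \frac{1}{|\mathbf{A}|}\delta_{ij}|\mathbf{A}| = \delta_{ij} = I_{ij} \tag{A.31}$$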
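$$\mathbf{A}^{-1} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}^{-1} = \frac{1}{|\mathbf{A}|}\begin{bmatrix} A & B & C \\ D & E & F \\ G & H & I \end{bmatrix}^{T} = \frac{1}{|\mathbf{A}|}\begin{bmatrix} A & D & G \\ B & E & H \\ C & F & I \end{bmatrix}$$
$$= \frac{1}{aA + bB + cC}\begin{bmatrix}
A = (ei - fh) & D = -(bi - ch) & G = (bf - ce) \\
B = -(di - fg) & E = (ai - cg) & H = -(af - cd) \\
C = (dh - eg) & F = -(ah - bg) & I = (ae - bd)
\end{bmatrix} \tag{A.33}$$
where the functions $A, B, \ldots, I$ are equal to rank 2 determinants listed in equation A.33.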
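The cofactor construction of the inverse can be sketched in Python as follows. This is a hedged illustration that assumes a small square matrix with integer elements, so that exact fractions can be used; the function names and the sample matrix are hypothetical.

```python
from fractions import Fraction


def minor(A, i, j):
    """Matrix A with row i and column j removed."""
    return [[A[r][c] for c in range(len(A)) if c != j]
            for r in range(len(A)) if r != i]


def det(A):
    """Determinant by cofactor expansion along the first row."""
    if not A:
        return 1                      # determinant of the empty matrix
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))


def cofactor(A, i, j):
    """Signed rank N-1 determinant associated with element A[i][j]."""
    return (-1) ** (i + j) * det(minor(A, i, j))


def inverse(A):
    """Inverse with elements cofactor(j, i) / |A|, i.e. the transposed cofactor matrix."""
    d = det(A)
    if d == 0:
        raise ValueError("a matrix with zero determinant has no inverse")
    n = len(A)
    return [[Fraction(cofactor(A, j, i), d) for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


A = [[2, 1, 0],
     [1, 3, 1],
     [0, 1, 2]]                       # hypothetical matrix with |A| = 8
A_inv = inverse(A)
product = mat_mul(A_inv, A)           # should reproduce the identity matrix
```

Using `Fraction` avoids rounding error, so the product of the inverse with the original matrix is exactly the identity matrix.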
A.4 Reduction of a matrix to diagonal form
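$$\mathbf{X}' = \mathbf{R}\cdot\mathbf{X}$$
$$\mathbf{Y}' = \mathbf{R}\cdot\mathbf{Y} \tag{A.35}$$
$$\mathbf{R}\cdot(\mathbf{A}\cdot\mathbf{X}) = \mathbf{R}\cdot\mathbf{Y} \tag{A.36}$$
$$\mathbf{R}\cdot\mathbf{A}\cdot\mathbf{R}^{-1}\cdot\mathbf{R}\cdot\mathbf{X} = \mathbf{R}\cdot\mathbf{Y} \tag{A.37}$$
$$\mathbf{R}\cdot\mathbf{A}\cdot\mathbf{R}^{-1}\cdot\mathbf{X}' = \mathbf{A}'\cdot\mathbf{X}' = \mathbf{Y}' \tag{A.38}$$
using the fact that the identity matrix $\mathbf{I} = \mathbf{R}\cdot\mathbf{R}^{-1} = \mathbf{R}\cdot\mathbf{R}^T$ since the rotation matrix in $N$ dimensions is
orthogonal.
Thus we have that the rotated matrix
$$\mathbf{A}' = \mathbf{R}\cdot\mathbf{A}\cdot\mathbf{R}^T \tag{A.39}$$
Let us assume that this transformed matrix is diagonal, then it can be written as the product of the unit
matrix $\mathbf{I}$ and a vector of scalar numbers $\lambda$ called the characteristic roots as
$$\mathbf{A}' = \mathbf{R}\cdot\mathbf{A}\cdot\mathbf{R}^T = \lambda\mathbf{I} \tag{A.40}$$
or
$$\left[\lambda\mathbf{I} - \mathbf{A}'\right]\mathbf{X}' = 0 \tag{A.43}$$
This represents a set of $N$ homogeneous linear algebraic equations in $N$ unknowns $\mathbf{X}'$ where $\lambda$ is a set of
characteristic roots (eigenvalues) with corresponding eigenfunctions $\mathbf{X}'$. Ignoring the trivial case of $\mathbf{X}'$ being
zero, then (A.43) requires that the secular determinant of the bracket be zero, that is
$$\left|\lambda\mathbf{I} - \mathbf{A}'\right| = 0 \tag{A.44}$$
$$(\lambda - \lambda_1)(\lambda - \lambda_2)(\lambda - \lambda_3)\cdots(\lambda - \lambda_N) = 0 \tag{A.45}$$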
eigenvalues are identical, then the reduction to a true diagonal form is not possible and one has the freedom
to select an appropriate eigenvector that is orthogonal to the remaining axes.
In summary, the matrix can only be fully diagonalized if (a) all the eigenvalues are distinct, (b) the real
matrix is symmetric, (c) it is unitary.
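The rotation of equation A.39 can be illustrated for the simplest case, a real symmetric matrix of rank 2. The sketch below assumes the standard result that the rotation angle $\theta$ satisfying $\tan 2\theta = 2A_{12}/(A_{11} - A_{22})$ removes the off-diagonal elements; the function names and the sample matrix are hypothetical.

```python
import math


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def rotation(theta):
    """Orthogonal 2x2 rotation matrix R, for which R^T = R^-1."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


def diagonalize_symmetric_2x2(A):
    """Return (R, A') with A' = R . A . R^T diagonal, for real symmetric A."""
    theta = 0.5 * math.atan2(2.0 * A[0][1], A[0][0] - A[1][1])
    R = rotation(theta)
    return R, mat_mul(mat_mul(R, A), transpose(R))


A = [[2.0, 1.0],
     [1.0, 2.0]]                      # hypothetical symmetric matrix
R, A_prime = diagonalize_symmetric_2x2(A)
eigenvalues = (A_prime[0][0], A_prime[1][1])
```

For this matrix the rotation angle is $\pi/4$ and the diagonal elements of `A_prime` are the eigenvalues 3 and 1, which are also the roots of the secular determinant $(2 - \lambda)^2 - 1 = 0$.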
A frequent application of matrices in classical mechanics is for solving a system of homogeneous linear
equations of the form
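$$\begin{aligned}
A_{11}x_1 + A_{12}x_2 + \cdots + A_{1N}x_N &= 0 \\
A_{21}x_1 + A_{22}x_2 + \cdots + A_{2N}x_N &= 0 \\
\vdots \qquad &= \; \vdots \\
A_{N1}x_1 + A_{N2}x_2 + \cdots + A_{NN}x_N &= 0
\end{aligned} \tag{A.47}$$
Making the following definitions
$$\mathbf{A} = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1N} \\ A_{21} & A_{22} & \cdots & A_{2N} \\ \vdots & \vdots & & \vdots \\ A_{N1} & A_{N2} & \cdots & A_{NN} \end{pmatrix} \tag{A.48}$$
$$\mathbf{X} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix} \tag{A.49}$$
Then the set of linear equations can be written in a compact form using the matrices
$$\mathbf{A}\cdot\mathbf{X} = 0 \tag{A.50}$$
which can be solved using equation (A.43). Ensure that you are able to diagonalize matrices with rank
2 and 3. You can use Mathematica, Maple, MatLab, or other such mathematical computer programs to
diagonalize larger matrices.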
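This expands to
$$-\lambda(\lambda + 1)(\lambda - 1) = 0$$
Thus the three eigenvalues are $\lambda = -1, 0, 1$.
To find each eigenvector we substitute the corresponding eigenvalue into equation (A.48)
$$\begin{pmatrix} -\lambda & 1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$
This expands to
$$(1 - \lambda)(\lambda + 1)(\lambda - 1) = 0$$
Thus the three eigenvalues are $\lambda = -1, 1, 1$.
The eigenvectors are determined by substituting the corresponding eigenvalue into equation (A.42)
$$\begin{pmatrix} 1 - \lambda & 0 & 0 \\ 0 & -\lambda & 1 \\ 0 & 1 & -\lambda \end{pmatrix}\cdot\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$
The eigenvalue $\lambda = -1$ yields $2x = 0$ and $y + z = 0$. Thus the eigenvector is $\mathbf{r}_1 = (0, \frac{1}{\sqrt{2}}, \frac{-1}{\sqrt{2}})$. The
eigenvalue $\lambda = 1$ yields $-y + z = 0$. The eigenvector $\mathbf{r}_2$ must be perpendicular to $\mathbf{r}_1$ and there are an infinite
number of choices. Let us assume that $\mathbf{r}_2 = (0, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$, which satisfies equation (A.50); then the eigenvector
$\mathbf{r}_3$ must be perpendicular to both $\mathbf{r}_1$ and $\mathbf{r}_2$. For rank three this is found using
$$\mathbf{r}_3 = \mathbf{r}_1 \times \mathbf{r}_2 = (1, 0, 0)$$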
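The three eigenvectors of this degenerate example can be verified numerically. The sketch below assumes that the matrix being diagonalized is the symmetric matrix with rows (1, 0, 0), (0, 0, 1), (0, 1, 0), which is the matrix consistent with the secular equation $(1 - \lambda)(\lambda + 1)(\lambda - 1) = 0$ and with the conditions $y + z = 0$ and $-y + z = 0$ quoted above; the helper names are hypothetical.

```python
import math

# Assumed matrix, inferred from the secular equation and eigenvector conditions.
M = [[1, 0, 0],
     [0, 0, 1],
     [0, 1, 0]]


def mat_vec(M, v):
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def is_eigenvector(M, v, eigenvalue, tol=1e-12):
    """True if M . v = eigenvalue * v within the tolerance."""
    return all(abs(mv - eigenvalue * x) <= tol for mv, x in zip(mat_vec(M, v), v))


s = 1.0 / math.sqrt(2.0)
r1 = [0.0, s, -s]        # eigenvalue -1
r2 = [0.0, s, s]         # one choice for the degenerate eigenvalue +1
r3 = cross(r1, r2)       # completes the orthogonal set, eigenvalue +1
```

Any other unit vector in the plane perpendicular to `r1` would serve equally well as `r2`, which is the freedom of choice associated with the repeated eigenvalue.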
Appendix B
Vector algebra
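$$\mathbf{a} \pm \mathbf{b} = \pm\mathbf{b} + \mathbf{a} \tag{B.2}$$
$$\mathbf{a} + (\mathbf{b} + \mathbf{c}) = (\mathbf{a} + \mathbf{b}) + \mathbf{c}$$
$$\lambda(\mathbf{a} + \mathbf{b}) = \lambda\mathbf{a} + \lambda\mathbf{b}$$
The manipulation of vectors is greatly facilitated by use of components along an orthogonal coordinate
system defined by three orthogonal unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$. For example the cartesian coordinate system
is defined by three unit vectors which, by convention, are called $(\hat{\imath}, \hat{\jmath}, \hat{\mathbf{k}})$.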
where $\theta$ is the angle between the two vectors. It is a scalar and thus is independent of the orientation of
the coordinate axis system. Note that the scalar product commutes, is distributive, and associative with a
scalar multiplier, that is
Note that $\mathbf{a}\cdot\mathbf{a} = |a|^2$ and if $\mathbf{a}$ and $\mathbf{b}$ are perpendicular then $\cos\theta = 0$ and thus $\mathbf{a}\cdot\mathbf{b} = 0$.
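If the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis, that is, they are orthogonal unit vectors,
then from equations B.3 and B.4
$$\hat{\mathbf{e}}_i\cdot\hat{\mathbf{e}}_j = \delta_{ij} \tag{B.5}$$
If $\hat{\mathbf{a}}$ is the unit vector for the vector $\mathbf{a}$ then the scalar product of a vector $\mathbf{a}$ with one of these unit vectors
$\hat{\mathbf{e}}_i$ gives the cosine of the angle between the vector $\mathbf{a}$ and $\hat{\mathbf{e}}_i$, that is
$$\mathbf{a}\cdot\hat{\mathbf{e}}_1 = |a|\left(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_1\right) = |a|\cos\alpha \tag{B.6}$$
$$\mathbf{a}\cdot\hat{\mathbf{e}}_2 = |a|\left(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_2\right) = |a|\cos\beta$$
$$\mathbf{a}\cdot\hat{\mathbf{e}}_3 = |a|\left(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_3\right) = |a|\cos\gamma$$
where the cosines are called the direction cosines since they define the direction of the vector $\mathbf{a}$ with respect
to each orthogonal basis unit vector. Moreover, $\mathbf{a}\cdot\hat{\mathbf{e}}_1 = |a|\,\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_1 = |a|\cos\alpha$ is the component of $\mathbf{a}$ along the
$\hat{\mathbf{e}}_1$ axis. Thus the three components of the vector $\mathbf{a}$ are fully defined by the magnitude $|a|$ and the direction
cosines, corresponding to the angles $\alpha, \beta, \gamma$. That is,
$$a_1 = |a|\left(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_1\right) = |a|\cos\alpha \tag{B.7}$$
$$a_2 = |a|\left(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_2\right) = |a|\cos\beta$$
$$a_3 = |a|\left(\hat{\mathbf{a}}\cdot\hat{\mathbf{e}}_3\right) = |a|\cos\gamma$$
If the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis then the vector is fully defined by
$$\mathbf{a} = a_1\hat{\mathbf{e}}_1 + a_2\hat{\mathbf{e}}_2 + a_3\hat{\mathbf{e}}_3 \tag{B.8}$$
Consider two vectors
$$\mathbf{a} = a_1\hat{\mathbf{e}}_1 + a_2\hat{\mathbf{e}}_2 + a_3\hat{\mathbf{e}}_3$$
$$\mathbf{b} = b_1\hat{\mathbf{e}}_1 + b_2\hat{\mathbf{e}}_2 + b_3\hat{\mathbf{e}}_3$$
Then using B.5
$$\mathbf{a}\cdot\mathbf{b} = a_1b_1 + a_2b_2 + a_3b_3 = |a|\,|b|\cos\theta \tag{B.9}$$
where $\theta$ is the angle between the two vectors. In particular, since the direction cosine $\cos\alpha_a = \frac{a_1}{|a|}$, then
equation B.9 gives
$$\cos\theta = \cos\alpha_a\cos\alpha_b + \cos\beta_a\cos\beta_b + \cos\gamma_a\cos\gamma_b \tag{B.10}$$
Note that when $\theta = 0$ then B.10 gives
$$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1 \tag{B.11}$$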
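A brief numerical check of the direction-cosine relations B.10 and B.11 is sketched below in Python. The function names and the two sample vectors are assumptions chosen for illustration; the vectors were picked to be perpendicular so that $\cos\theta = 0$.

```python
import math


def magnitude(a):
    return math.sqrt(sum(x * x for x in a))


def direction_cosines(a):
    """Cosines of the angles between a and each orthonormal basis vector."""
    size = magnitude(a)
    return [x / size for x in a]


def cos_angle(a, b):
    """cos(theta) between a and b from products of direction cosines (B.10)."""
    return sum(ca * cb for ca, cb in zip(direction_cosines(a), direction_cosines(b)))


a = [1.0, 2.0, 2.0]      # |a| = 3
b = [2.0, -2.0, 1.0]     # |b| = 3, perpendicular to a

cosines_a = direction_cosines(a)
sum_of_squares = sum(c * c for c in cosines_a)   # B.11
cos_theta = cos_angle(a, b)
```

The sum of the squared direction cosines is unity for any non-zero vector, which is why only two of the three direction cosines are independent.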
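where the (Levi-Civita) permutation symbol $\varepsilon_{ijk}$ has the following properties
$$\begin{aligned}
\varepsilon_{ijk} &= 0 && \text{if an index is equal to any other index} \\
\varepsilon_{ijk} &= +1 && \text{if } i, j, k \text{ form an even permutation of } 1, 2, 3 \\
\varepsilon_{ijk} &= -1 && \text{if } i, j, k \text{ form an odd permutation of } 1, 2, 3
\end{aligned} \tag{B.14}$$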
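For example, if the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis, then $\hat{\mathbf{e}}_i \times \hat{\mathbf{e}}_j \equiv \sum_k \varepsilon_{ijk}\hat{\mathbf{e}}_k$, i.e.
$$\hat{\mathbf{e}}_1 \times \hat{\mathbf{e}}_2 = \hat{\mathbf{e}}_3 \qquad \hat{\mathbf{e}}_2 \times \hat{\mathbf{e}}_3 = \hat{\mathbf{e}}_1 \qquad \hat{\mathbf{e}}_3 \times \hat{\mathbf{e}}_1 = \hat{\mathbf{e}}_2 \tag{B.15}$$
$$\hat{\mathbf{e}}_2 \times \hat{\mathbf{e}}_1 = -\hat{\mathbf{e}}_3 \qquad \hat{\mathbf{e}}_3 \times \hat{\mathbf{e}}_2 = -\hat{\mathbf{e}}_1 \qquad \hat{\mathbf{e}}_1 \times \hat{\mathbf{e}}_3 = -\hat{\mathbf{e}}_2 \tag{B.16}$$
$$\hat{\mathbf{e}}_1 \times \hat{\mathbf{e}}_1 = 0 \qquad \hat{\mathbf{e}}_2 \times \hat{\mathbf{e}}_2 = 0 \qquad \hat{\mathbf{e}}_3 \times \hat{\mathbf{e}}_3 = 0 \tag{B.17}$$
$$\mathbf{a} \times \mathbf{b} = -\mathbf{b} \times \mathbf{a} \tag{B.18}$$
$$\mathbf{a} \times (\mathbf{b} + \mathbf{c}) = \mathbf{a} \times \mathbf{b} + \mathbf{a} \times \mathbf{c} \tag{B.19}$$
$$(\lambda\mathbf{a}) \times \mathbf{b} = \lambda(\mathbf{a} \times \mathbf{b}) \tag{B.20}$$
where $\theta$ is the angle between the two vectors and the determinant is evaluated for the top row. Examples of
vector products are torque $\mathbf{N} = \mathbf{r} \times \mathbf{F}$, angular momentum $\mathbf{L} = \mathbf{r} \times \mathbf{p}$, and the magnetic force $\mathbf{F} = q\mathbf{v} \times \mathbf{B}$.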
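The component form of the vector product, $(\mathbf{a} \times \mathbf{b})_i = \sum_{jk}\varepsilon_{ijk}a_jb_k$, can be sketched directly from the permutation symbol. In the Python sketch below the indices run over 0, 1, 2 as zero-based stand-ins for 1, 2, 3, and the closed-form expression used for $\varepsilon_{ijk}$ is a convenient shortcut assumed for this illustration rather than a formula from the text.

```python
def levi_civita(i, j, k):
    """Permutation symbol for indices in {0, 1, 2}: +1 even, -1 odd, 0 if repeated."""
    return (i - j) * (j - k) * (k - i) // 2


def cross(a, b):
    """Vector product with components sum_jk eps_ijk a_j b_k."""
    return [sum(levi_civita(i, j, k) * a[j] * b[k]
                for j in range(3) for k in range(3))
            for i in range(3)]


e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
a = [1, 2, 3]            # hypothetical vectors used only for illustration
b = [4, 5, 6]

a_cross_b = cross(a, b)
b_cross_a = cross(b, a)
```

The antisymmetry of $\varepsilon_{ijk}$ under interchange of any two indices is what makes `b_cross_a` the negative of `a_cross_b`, as stated in equation B.18.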
a· (b × c) = c· (a × b) = b· (c × a) = (a × b) · c = −a· (c × b) (B.21)
That is, the scalar triple product is invariant to cyclic permutations of the three vectors but changes sign for
interchange of two vectors. The scalar triple product is unchanged by swapping the scalar ($\cdot$) and vector ($\times$) operations.
Because of the symmetry the scalar triple product can be denoted as $[\mathbf{a}\,\mathbf{b}\,\mathbf{c}]$ and
The scalar triple product can be written in terms of the components using a determinant
$$[\mathbf{a}\,\mathbf{b}\,\mathbf{c}] = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \tag{B.23}$$
a × (b × c) = (a · c) b − (a · b) c (B.24)
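The triple-product identities B.21, B.23 and B.24 can be checked numerically with a short Python sketch. The helper names and the three integer vectors are assumptions chosen for illustration.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def scalar_triple(a, b, c):
    """Scalar triple product a . (b x c)."""
    return dot(a, cross(b, c))


def det3(a, b, c):
    """Determinant with rows a, b, c (equation B.23)."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


a, b, c = [1, 2, 3], [4, 5, 6], [7, 8, 10]   # hypothetical vectors

triple = scalar_triple(a, b, c)
vector_triple = cross(a, cross(b, c))
bac_cab = [dot(a, c) * bi - dot(a, b) * ci for bi, ci in zip(b, c)]
```

The magnitude of `triple` is the volume of the parallelepiped spanned by the three vectors, and its sign changes if any two of the vectors are interchanged.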
Workshop exercises
1. Partition the following exercises among the group. Once you have completed your problem, check with a
classmate before writing it on the board. After you have verified that you have found the correct solution,
write your answer in the space provided on the board, taking care to include the steps that you used to arrive
at your solution. The following information is needed.
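$\mathbf{a} = 3\mathbf{i} + 2\mathbf{j} - 9\mathbf{k}$, $\mathbf{b} = -2\mathbf{i} + 3\mathbf{k}$, $\mathbf{c} = -2\mathbf{i} + \mathbf{j} - 6\mathbf{k}$, $\mathbf{d} = \mathbf{i} + 9\mathbf{j} + 4\mathbf{k}$
$$\mathbf{E} = \begin{pmatrix} 2 & 7 & -4 \\ 3 & 1 & -2 \\ -2 & 0 & 5 \end{pmatrix} \quad \mathbf{F} = \begin{pmatrix} 3 & 4 \\ 5 & 6 \end{pmatrix} \quad \mathbf{G} = \begin{pmatrix} 2 & -4 \\ 7 & 1 \\ -1 & 1 \end{pmatrix} \quad \mathbf{H} = \begin{pmatrix} -8 & -1 & -3 \\ -4 & 2 & -2 \\ -1 & 0 & 0 \end{pmatrix}$$
Calculate each of the following
1. $|\mathbf{a} - (\mathbf{b} + 3\mathbf{c})|$               7. $(\mathbf{E}\mathbf{H})^T$
2. Component of $\mathbf{c}$ along $\mathbf{a}$      8. $|\mathbf{H}\mathbf{E}|$
3. Angle between $\mathbf{c}$ and $\mathbf{d}$       9. $\mathbf{E}\mathbf{H}\mathbf{G}$
4. $(\mathbf{b} \times \mathbf{d})\cdot\mathbf{a}$                   10. $\mathbf{E}\mathbf{G} - \mathbf{H}\mathbf{G}$
5. $(\mathbf{b} \times \mathbf{d}) \times \mathbf{a}$                11. $\mathbf{E}\mathbf{H} - \mathbf{H}^T\mathbf{E}^T$
6. $\mathbf{b} \times (\mathbf{d} \times \mathbf{a})$                12. $\mathbf{F}^{-1}$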
Problems
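[1] For what values of $a$ are the vectors $\mathbf{A} = 2a\,\hat{\imath} - 2\,\hat{\jmath} + a\,\hat{\mathbf{k}}$ and $\mathbf{B} = a\,\hat{\imath} + 2a\,\hat{\jmath} + 2\,\hat{\mathbf{k}}$ perpendicular?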
Show also that the product is unaffected by interchange of the scalar and vector product operations or by change in
(A × B) · C = A · (B × C) = B · (C × A) =(C × A) · B
Therefore we may use the notation $\mathbf{ABC}$ to denote the triple scalar product. Finally give a geometric
interpretation of $\mathbf{ABC}$ by computing the volume of the parallelepiped defined by the three vectors $\mathbf{A}, \mathbf{B}, \mathbf{C}$.
Appendix C
The methods of vector analysis provide a convenient representation of physical laws. However, the manip-
ulation of scalar and vector fields is greatly facilitated by use of components with respect to an orthogonal
coordinate system.
= ( ) (C.1)
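$$\mathbf{r} = x\,\hat{\imath} + y\,\hat{\jmath} + z\,\hat{\mathbf{k}} \tag{C.2}$$
Calculation of the time derivatives of the position vector is especially simple using cartesian coordinates
because the unit vectors $(\hat{\imath}, \hat{\jmath}, \hat{\mathbf{k}})$ are constant and independent in time. That is;
Since the time derivatives of the unit vectors are all zero then the velocity $\dot{\mathbf{r}} = \frac{d\mathbf{r}}{dt}$ reduces to the partial time
derivatives of $x$, $y$ and $z$. That is,
$$\dot{\mathbf{r}} = \dot{x}\,\hat{\imath} + \dot{y}\,\hat{\jmath} + \dot{z}\,\hat{\mathbf{k}} \tag{C.3}$$
Similarly the acceleration is given by
$$\ddot{\mathbf{r}} = \ddot{x}\,\hat{\imath} + \ddot{y}\,\hat{\jmath} + \ddot{z}\,\hat{\mathbf{k}} \tag{C.4}$$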
Curvilinear coordinate systems introduce a complication in that the unit vectors are time dependent in
contrast to the cartesian coordinate system where the unit vectors $(\hat{\imath}, \hat{\jmath}, \hat{\mathbf{k}})$ are independent and constant in time.
The introduction of this time dependence warrants further discussion.
Each of the three axes $q_i$ in curvilinear coordinate systems can be expressed in cartesian coordinates
$(x, y, z)$ as surfaces of constant $q_i$ given by the function
$$q_i = f_i(x, y, z) \tag{C.5}$$
where $i = 1, 2$ or $3$. An element of length $ds_i$ perpendicular to the surface $q_i$ is the distance between the
surfaces $q_i$ and $q_i + dq_i$ which can be expressed as
$$ds_i = h_i\,dq_i \tag{C.6}$$
where $h_i$ is a function of $(q_1, q_2, q_3)$. In cartesian coordinates $h_1$, $h_2$ and $h_3$ are all unity. The unit-length
vectors $\hat{\mathbf{q}}_1$, $\hat{\mathbf{q}}_2$, $\hat{\mathbf{q}}_3$ are perpendicular to the respective $q_1, q_2, q_3$ surfaces, and are oriented to have increasing
indices such that $\hat{\mathbf{q}}_1 \times \hat{\mathbf{q}}_2 = \hat{\mathbf{q}}_3$. The correspondence of the curvilinear coordinates, unit vectors, and transform
coefficients to cartesian, polar, cylindrical and spherical coordinates is given in table 1.
$$d\mathbf{s} = ds_1\hat{\mathbf{q}}_1 + ds_2\hat{\mathbf{q}}_2 + ds_3\hat{\mathbf{q}}_3 = h_1dq_1\hat{\mathbf{q}}_1 + h_2dq_2\hat{\mathbf{q}}_2 + h_3dq_3\hat{\mathbf{q}}_3 \tag{C.7}$$
$$d\tau = ds_1\,ds_2\,ds_3 = h_1h_2h_3\left(dq_1\,dq_2\,dq_3\right) \tag{C.8}$$
These are evaluated below for polar, cylindrical, and spherical coordinates.
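The transform coefficients $h_i$ can be estimated numerically for any orthogonal coordinate system. The Python sketch below assumes the standard identity $h_i = |\partial\mathbf{r}/\partial q_i|$, which is not stated in the text, and evaluates it by central finite differences for spherical coordinates; the function names, the step size and the sample point are hypothetical.

```python
import math


def spherical_to_cartesian(q):
    """Map (r, theta, phi) to cartesian (x, y, z)."""
    r, theta, phi = q
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def scale_factors(transform, q, step=1e-6):
    """Estimate h_i = |d r / d q_i| at the point q by central finite differences."""
    factors = []
    for i in range(len(q)):
        forward, backward = list(q), list(q)
        forward[i] += step
        backward[i] -= step
        derivative = [(p - m) / (2.0 * step)
                      for p, m in zip(transform(forward), transform(backward))]
        factors.append(math.sqrt(sum(d * d for d in derivative)))
    return factors


q = (2.0, 0.7, 1.2)                       # sample point (r, theta, phi)
h = scale_factors(spherical_to_cartesian, q)
volume_factor = h[0] * h[1] * h[2]        # multiplies dq1 dq2 dq3 in the volume element
```

For spherical coordinates the estimates should approach $h_r = 1$, $h_\theta = r$ and $h_\phi = r\sin\theta$, so that the volume factor is $r^2\sin\theta$.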
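since the unit vector $\hat{\mathbf{r}}$ is a constant with $|\hat{\mathbf{r}}| = 1$. Note that the infinitesimal $d\hat{\mathbf{r}}$ is perpendicular to the unit
vector $\hat{\mathbf{r}}$, that is, $d\hat{\mathbf{r}}$ points in the tangential direction $\hat{\boldsymbol{\theta}}$.
Similarly, the infinitesimal
$$d\hat{\boldsymbol{\theta}} = \hat{\boldsymbol{\theta}}_2 - \hat{\boldsymbol{\theta}}_1 = -\hat{\mathbf{r}}\,d\theta \tag{C.10}$$
which is perpendicular to the tangential $\hat{\boldsymbol{\theta}}$ unit vector and therefore points in the direction $-\hat{\mathbf{r}}$. The minus
sign causes $-\hat{\mathbf{r}}\,d\theta$ to be directed in the opposite direction to $\hat{\mathbf{r}}$.
$$\frac{d\hat{\mathbf{r}}}{dt} = \dot{\theta}\,\hat{\boldsymbol{\theta}} \tag{C.12}$$
$$\frac{d\hat{\boldsymbol{\theta}}}{dt} = -\dot{\theta}\,\hat{\mathbf{r}} \tag{C.13}$$
Note that the time derivatives of unit vectors are perpendicular to the corresponding unit vector, and the
unit vectors are coupled.
Consider that the velocity $\mathbf{v}$ is expressed as
$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{d}{dt}\left(r\hat{\mathbf{r}}\right) = \frac{dr}{dt}\hat{\mathbf{r}} + r\frac{d\hat{\mathbf{r}}}{dt} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}} \tag{C.14}$$
The velocity is resolved into a radial component $\dot{r}$ and an angular, transverse, component $r\dot{\theta}$.
Similarly the acceleration is given by
$$\mathbf{a} = \left(\ddot{r} - r\dot{\theta}^2\right)\hat{\mathbf{r}} + \left(r\ddot{\theta} + 2\dot{r}\dot{\theta}\right)\hat{\boldsymbol{\theta}} \tag{C.15}$$
where the $r\dot{\theta}^2\,\hat{\mathbf{r}}$ term is the effective centripetal acceleration while the $2\dot{r}\dot{\theta}\,\hat{\boldsymbol{\theta}}$ term is called the Coriolis term.
For the case when $\dot{r} = \ddot{r} = 0$, then the first bracket in C.15 is the centripetal acceleration while the second
bracket is the tangential acceleration.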
This discussion has shown that in contrast to the time independence of the cartesian unit basis vectors,
the unit basis vectors for curvilinear coordinates are time dependent which leads to components of the velocity
and acceleration involving coupled coordinates.
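The coupling of coordinates in the polar acceleration can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes an arbitrary illustrative trajectory $r(t) = 1 + 0.5t$, $\theta(t) = t^2$ (a hypothetical choice), evaluates the radial and transverse acceleration components $\ddot{r} - r\dot{\theta}^2$ and $r\ddot{\theta} + 2\dot{r}\dot{\theta}$, and compares them with a finite-difference second derivative of the cartesian position projected onto r̂ and θ̂.

```python
import math

def polar_state(t):
    """Hypothetical trajectory r(t) = 1 + 0.5 t, theta(t) = t^2, with time derivatives."""
    r, r_dot, r_ddot = 1.0 + 0.5 * t, 0.5, 0.0
    theta, theta_dot, theta_ddot = t * t, 2.0 * t, 2.0
    return r, r_dot, r_ddot, theta, theta_dot, theta_ddot

def polar_acceleration(t):
    """Radial and transverse acceleration components from the polar formula."""
    r, r_dot, r_ddot, _, theta_dot, theta_ddot = polar_state(t)
    a_r = r_ddot - r * theta_dot ** 2                    # includes the centripetal term
    a_theta = r * theta_ddot + 2.0 * r_dot * theta_dot   # includes the Coriolis term
    return a_r, a_theta

def cartesian_position(t):
    r, _, _, theta, _, _ = polar_state(t)
    return r * math.cos(theta), r * math.sin(theta)

def finite_difference_acceleration(t, h=1e-4):
    """Second central difference of (x, y), projected onto r-hat and theta-hat."""
    (x0, y0), (x1, y1), (x2, y2) = (cartesian_position(t + k * h) for k in (-1, 0, 1))
    ax = (x0 - 2.0 * x1 + x2) / h ** 2
    ay = (y0 - 2.0 * y1 + y2) / h ** 2
    theta = polar_state(t)[3]
    a_r = ax * math.cos(theta) + ay * math.sin(theta)
    a_theta = -ax * math.sin(theta) + ay * math.cos(theta)
    return a_r, a_theta
```

Calling `polar_acceleration(0.7)` and `finite_difference_acceleration(0.7)` should give matching components to within the finite-difference error, which illustrates that the extra terms arise purely from the time dependence of the unit vectors.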
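Coordinates             $r, \theta$
Distance element        $d\mathbf{s} = dr\,\hat{\mathbf{r}} + r\,d\theta\,\hat{\boldsymbol{\theta}}$
Area element            $da = r\,dr\,d\theta$
Unit vectors            $\hat{\mathbf{r}} = \hat{\mathbf{i}}\cos\theta + \hat{\mathbf{j}}\sin\theta$
                        $\hat{\boldsymbol{\theta}} = -\hat{\mathbf{i}}\sin\theta + \hat{\mathbf{j}}\cos\theta$
Time derivatives        $\frac{d\hat{\mathbf{r}}}{dt} = \dot{\theta}\,\hat{\boldsymbol{\theta}}$
of unit vectors         $\frac{d\hat{\boldsymbol{\theta}}}{dt} = -\dot{\theta}\,\hat{\mathbf{r}}$
Velocity                $\mathbf{v} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}}$
Kinetic energy          $\frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2\right)$
Acceleration            $\mathbf{a} = \left(\ddot{r} - r\dot{\theta}^2\right)\hat{\mathbf{r}} + \left(r\ddot{\theta} + 2\dot{r}\dot{\theta}\right)\hat{\boldsymbol{\theta}}$
Table 2: Differential relations plus a diagram of the unit vectors for 2-dimensional polar coordinates.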
522 APPENDIX C. ORTHOGONAL COORDINATE SYSTEMS
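Coordinates             $\rho, \phi, z$
Distance element        $d\mathbf{s} = d\rho\,\hat{\boldsymbol{\rho}} + \rho\,d\phi\,\hat{\boldsymbol{\phi}} + dz\,\hat{\mathbf{z}}$
Volume element          $d\tau = \rho\,d\rho\,d\phi\,dz$
Unit vectors            $\hat{\boldsymbol{\rho}} = \hat{\mathbf{i}}\cos\phi + \hat{\mathbf{j}}\sin\phi$
                        $\hat{\boldsymbol{\phi}} = -\hat{\mathbf{i}}\sin\phi + \hat{\mathbf{j}}\cos\phi$
                        $\hat{\mathbf{z}} = \hat{\mathbf{k}}$
Time derivatives        $\frac{d\hat{\boldsymbol{\rho}}}{dt} = \dot{\phi}\,\hat{\boldsymbol{\phi}}$
of unit vectors         $\frac{d\hat{\boldsymbol{\phi}}}{dt} = -\dot{\phi}\,\hat{\boldsymbol{\rho}}$
                        $\frac{d\hat{\mathbf{z}}}{dt} = 0$
Velocity                $\mathbf{v} = \dot{\rho}\,\hat{\boldsymbol{\rho}} + \rho\dot{\phi}\,\hat{\boldsymbol{\phi}} + \dot{z}\,\hat{\mathbf{z}}$
Kinetic energy          $\frac{m}{2}\left(\dot{\rho}^2 + \rho^2\dot{\phi}^2 + \dot{z}^2\right)$
Acceleration            $\mathbf{a} = \left(\ddot{\rho} - \rho\dot{\phi}^2\right)\hat{\boldsymbol{\rho}} + \left(\rho\ddot{\phi} + 2\dot{\rho}\dot{\phi}\right)\hat{\boldsymbol{\phi}} + \ddot{z}\,\hat{\mathbf{z}}$
Table 3: Differential relations plus a diagram of the unit vectors for cylindrical coordinates.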
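Coordinates             $r, \theta, \phi$
Distance element        $d\mathbf{s} = dr\,\hat{\mathbf{r}} + r\,d\theta\,\hat{\boldsymbol{\theta}} + r\sin\theta\,d\phi\,\hat{\boldsymbol{\phi}}$
Volume element          $d\tau = r^2\sin\theta\,dr\,d\theta\,d\phi$
Unit vectors            $\hat{\mathbf{r}} = \hat{\mathbf{i}}\sin\theta\cos\phi + \hat{\mathbf{j}}\sin\theta\sin\phi + \hat{\mathbf{k}}\cos\theta$
                        $\hat{\boldsymbol{\theta}} = \hat{\mathbf{i}}\cos\theta\cos\phi + \hat{\mathbf{j}}\cos\theta\sin\phi - \hat{\mathbf{k}}\sin\theta$
                        $\hat{\boldsymbol{\phi}} = -\hat{\mathbf{i}}\sin\phi + \hat{\mathbf{j}}\cos\phi$
Time derivatives        $\frac{d\hat{\mathbf{r}}}{dt} = \hat{\boldsymbol{\theta}}\,\dot{\theta} + \hat{\boldsymbol{\phi}}\,\dot{\phi}\sin\theta$
of unit vectors         $\frac{d\hat{\boldsymbol{\theta}}}{dt} = -\hat{\mathbf{r}}\,\dot{\theta} + \hat{\boldsymbol{\phi}}\,\dot{\phi}\cos\theta$
                        $\frac{d\hat{\boldsymbol{\phi}}}{dt} = -\hat{\mathbf{r}}\,\dot{\phi}\sin\theta - \hat{\boldsymbol{\theta}}\,\dot{\phi}\cos\theta$
Velocity                $\mathbf{v} = \dot{r}\,\hat{\mathbf{r}} + r\dot{\theta}\,\hat{\boldsymbol{\theta}} + r\dot{\phi}\sin\theta\,\hat{\boldsymbol{\phi}}$
Kinetic energy          $\frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2\right)$
Acceleration            $\mathbf{a} = \left(\ddot{r} - r\dot{\theta}^2 - r\dot{\phi}^2\sin^2\theta\right)\hat{\mathbf{r}}$
                        $\quad + \left(r\ddot{\theta} + 2\dot{r}\dot{\theta} - r\dot{\phi}^2\sin\theta\cos\theta\right)\hat{\boldsymbol{\theta}}$
                        $\quad + \left(r\ddot{\phi}\sin\theta + 2\dot{r}\dot{\phi}\sin\theta + 2r\dot{\theta}\dot{\phi}\cos\theta\right)\hat{\boldsymbol{\phi}}$
Table 4: Differential relations plus a diagram of the unit vectors for spherical coordinates.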
C.3. FRENET-SERRET COORDINATES 523
The distance and volume elements, the cartesian coordinate components of the spherical unit basis
vectors, and the unit vector time derivatives are shown in the table given in figure 4. The time dependence
of the unit vectors is used to derive the acceleration. As for the case of cylindrical coordinates, the r̂, θ̂ and
φ̂ components of the acceleration involve coupling of the coordinates and their time derivatives.
It is important to note that the angular unit vectors θ̂ and φ̂ are taken to be tangential to the circles of
rotation. However, for discussion of angular velocity or angular momentum it is more convenient to use the
axes of rotation defined by r̂ × θ̂ and r̂ × φ̂ for specifying the vector properties, which are perpendicular to
the unit vectors θ̂ and φ̂. Be careful not to confuse the unit vectors θ̂ and φ̂ with those used for the angular
velocities $\dot{\theta}$ and $\dot{\phi}$.
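$\frac{d\hat{\mathbf{t}}}{ds} = \kappa\,\hat{\mathbf{n}}$ (C.16)
$\frac{d\hat{\mathbf{b}}}{ds} = -\tau\,\hat{\mathbf{n}}$ (C.17)
$\frac{d\hat{\mathbf{n}}}{ds} = -\kappa\,\hat{\mathbf{t}} + \tau\,\hat{\mathbf{b}}$ (C.18)
The curvature $\kappa = 1/\rho$ where $\rho$ is the radius of curvature, and $\tau$ is the torsion, which can be either positive
or negative. For increasing $s$ a non-zero curvature implies that the triad of unit vectors rotates in a
right-handed sense about b̂. If the torsion is positive (negative) the triad of unit vectors rotates in a right
(left) handed sense about t̂.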
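Distance element        $d\mathbf{s}(t) = \hat{\mathbf{t}}\left|\frac{d\mathbf{r}(t)}{dt}\right| dt = \hat{\mathbf{t}}\,v(t)\,dt$
Unit vectors            $\hat{\mathbf{t}}(t) = \frac{\mathbf{v}(t)}{|\mathbf{v}(t)|}$
                        $\hat{\mathbf{n}}(t) = \frac{d\hat{\mathbf{t}}/dt}{|d\hat{\mathbf{t}}/dt|}$
                        $\hat{\mathbf{b}}(t) = \hat{\mathbf{t}} \times \hat{\mathbf{n}}$
Time derivatives        $\frac{d}{dt}\begin{pmatrix} \hat{\mathbf{t}} \\ \hat{\mathbf{n}} \\ \hat{\mathbf{b}} \end{pmatrix} = |v| \begin{pmatrix} 0 & \kappa & 0 \\ -\kappa & 0 & \tau \\ 0 & -\tau & 0 \end{pmatrix} \begin{pmatrix} \hat{\mathbf{t}} \\ \hat{\mathbf{n}} \\ \hat{\mathbf{b}} \end{pmatrix}$
of unit vectors
Velocity                $\mathbf{v}(t) = \frac{d\mathbf{r}(t)}{dt}$
Acceleration            $\mathbf{a}(t) = \frac{dv}{dt}\,\hat{\mathbf{t}} + \kappa v^2\,\hat{\mathbf{n}}$
Table 5: The differential relations plus a diagram of the corresponding unit vectors for the Frenet-Serret
coordinate system.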
The above equations also can be rewritten in the form using a new unit rotation vector ω where
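$\frac{d\hat{\mathbf{t}}}{ds} = \boldsymbol{\omega} \times \hat{\mathbf{t}}$ (C.20)
$\frac{d\hat{\mathbf{n}}}{ds} = \boldsymbol{\omega} \times \hat{\mathbf{n}}$ (C.21)
$\frac{d\hat{\mathbf{b}}}{ds} = \boldsymbol{\omega} \times \hat{\mathbf{b}}$ (C.22)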
In general the Frenet-Serret unit vectors are time dependent. If the curvature $\kappa = 0$ then the curve is a
straight line and n̂ and b̂ are not well defined. If the torsion $\tau$ is zero then the trajectory lies in a plane. Note
that a helix has constant curvature and constant torsion.
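The statement that a helix has constant curvature and torsion can be illustrated numerically. The sketch below is not from the original text; it assumes the helix parametrization $\mathbf{r}(t) = (a\cos t, a\sin t, bt)$ with hypothetical parameters `a` and `b`, and uses the standard differential-geometry expressions $\kappa = |\mathbf{r}' \times \mathbf{r}''| / |\mathbf{r}'|^3$ and $\tau = (\mathbf{r}' \times \mathbf{r}'') \cdot \mathbf{r}''' / |\mathbf{r}' \times \mathbf{r}''|^2$, which are introduced here as an assumption rather than derived in this appendix.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def helix_derivatives(t, a, b):
    """First three parameter derivatives of r(t) = (a cos t, a sin t, b t)."""
    r1 = (-a * math.sin(t), a * math.cos(t), b)
    r2 = (-a * math.cos(t), -a * math.sin(t), 0.0)
    r3 = (a * math.sin(t), -a * math.cos(t), 0.0)
    return r1, r2, r3

def curvature_and_torsion(t, a, b):
    """Curvature and torsion from the standard cross-product formulas."""
    r1, r2, r3 = helix_derivatives(t, a, b)
    c = cross(r1, r2)
    kappa = math.sqrt(dot(c, c)) / math.sqrt(dot(r1, r1)) ** 3
    tau = dot(c, r3) / dot(c, c)
    return kappa, tau
```

For any value of `t`, `curvature_and_torsion(t, a, b)` should return $a/(a^2+b^2)$ and $b/(a^2+b^2)$; setting `b = 0` gives zero torsion, consistent with a circle lying in a plane.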
The rate of change of a general vector field E along the trajectory can be written as
µ ¶
E
= t̂ + n̂+ b̂ + ω × E (C.23)
The Frenet-Serret coordinates are used in the life sciences to describe the motion of a moving organism
in a viscous medium. The Frenet-Serret coordinates also have applications to General Relativity.
Workshop exercises
1. The goal of this problem is to help you understand the origin of the equations that relate two different coordinate
systems. Refer to diagrams for cylindrical and spherical coordinates as your teaching assistant explains how to
arrive at expressions for $x_1$, $x_2$ and $x_3$ in terms of $\rho$, $\phi$ and $z$ and how to derive expressions for the velocity and
acceleration vectors in cylindrical coordinates. Now try to relate spherical and rectangular coordinate systems.
Your group should derive expressions relating the coordinates of the two systems, expressions relating the unit
vectors and their time derivatives of the two systems, and finally, expressions for the velocity and acceleration
in spherical coordinates.
Appendix D
Coordinate transformations
Coordinate systems can be translated, or rotated with respect to each other as well as being subject to spatial
inversion or time reversal. Scalars, vectors, and tensors are defined by their transformation properties under
rotation, spatial inversion and time reversal, and thus such transformations play a pivotal role in physics.
The velocities for a moving frame are given by the vector difference of the velocity in a stationary frame,
and the velocity of the origin of the moving frame. Linear accelerations can be handled similarly.
526 APPENDIX D. COORDINATE TRANSFORMATIONS
origins of both frames coincide. Rotation of a frame does not change the vector, only the vector components
of the unit basis states. Therefore
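$\mathbf{x} = \hat{\mathbf{e}}'_1 x'_1 + \hat{\mathbf{e}}'_2 x'_2 + \hat{\mathbf{e}}'_3 x'_3 = \hat{\mathbf{e}}_1 x_1 + \hat{\mathbf{e}}_2 x_2 + \hat{\mathbf{e}}_3 x_3$ (D.3)
Note that if one designates the unit vectors for the unprimed coordinate frame as $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ and those for
the primed coordinate frame as $(\hat{\mathbf{e}}'_1, \hat{\mathbf{e}}'_2, \hat{\mathbf{e}}'_3)$, then taking the scalar product of equation 3 sequentially with
each of the unit base vectors $(\hat{\mathbf{e}}'_1, \hat{\mathbf{e}}'_2, \hat{\mathbf{e}}'_3)$ leads to the following three relations
$x'_1 = (\hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_1)x_1 + (\hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_2)x_2 + (\hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_3)x_3$ (D.4)
$x'_2 = (\hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_1)x_1 + (\hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_2)x_2 + (\hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_3)x_3$
$x'_3 = (\hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_1)x_1 + (\hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_2)x_2 + (\hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_3)x_3$
Note that the $(\hat{\mathbf{e}}'_i \cdot \hat{\mathbf{e}}_j)$ are the direction cosines as defined by the scalar product of two unit vectors for axes
$i$, $j$, that is, they are the cosine of the angle between the two unit vectors.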
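Equation 4 can be written in matrix form as
$\mathbf{x}' = \boldsymbol{\lambda} \cdot \mathbf{x}$ (D.5)
where the “·” means the inner matrix product of the rotation matrix λ and the vector x where
$\mathbf{x}' \equiv \begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} \quad \mathbf{x} \equiv \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \quad \boldsymbol{\lambda} \equiv \begin{pmatrix} \hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_3 \\ \hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_3 \\ \hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_3 \end{pmatrix}$ (D.6)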
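The inverse procedure is obtained by multiplying equation 3 successively by one of the unit basis
vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ leading to three equations
$x_1 = (\hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}'_1)x'_1 + (\hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}'_2)x'_2 + (\hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}'_3)x'_3$ (D.7)
$x_2 = (\hat{\mathbf{e}}_2 \cdot \hat{\mathbf{e}}'_1)x'_1 + (\hat{\mathbf{e}}_2 \cdot \hat{\mathbf{e}}'_2)x'_2 + (\hat{\mathbf{e}}_2 \cdot \hat{\mathbf{e}}'_3)x'_3$
$x_3 = (\hat{\mathbf{e}}_3 \cdot \hat{\mathbf{e}}'_1)x'_1 + (\hat{\mathbf{e}}_3 \cdot \hat{\mathbf{e}}'_2)x'_2 + (\hat{\mathbf{e}}_3 \cdot \hat{\mathbf{e}}'_3)x'_3$
$\mathbf{x} = \boldsymbol{\lambda}^T \cdot \mathbf{x}'$ (D.8)
Thus
$\boldsymbol{\lambda}^T \cdot \boldsymbol{\lambda} = \mathbf{I}$
where I is the identity matrix. This implies that the rotation matrix λ is orthogonal with $\boldsymbol{\lambda}^T = \boldsymbol{\lambda}^{-1}$.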
It is convenient to rename the elements of the rotation matrix to be
Consider an arbitrary rotation through an angle $\theta$. Equations (10) and (11) can be used to relate
six of the nine quantities in the rotation matrix, so only three of the quantities are independent. That
is, because of equation (11) we have three equations which ensure that the transformation is unitary.
The fact that the rotation matrix should have three independent quantities is due to the fact that all rotations
can be expressed in terms of rotations about three orthogonal axes.
$i'$   $j$   $\theta_{i'j}$ (degrees)   $\lambda_{i'j} = \cos(\theta_{i'j})$
1    1    0            1
1    2    90           0
1    3    90           0
2    1    90           0
2    2    60           0.500
2    3    90 − 60      0.866
3    1    90           0
3    2    90 + 60      −0.866
3    3    60           0.500
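The direction cosines tabulated above can be assembled into a rotation matrix and tested for orthogonality. The following sketch, which is an illustration rather than part of the original text, builds the matrix from the tabulated angles (a 60 degree rotation about the 1-axis) and checks that the product of the matrix with its transpose is the identity and that the determinant is +1. The helper names are hypothetical.

```python
import math

# Angles in degrees between primed axis i' (row) and unprimed axis j (column),
# copied from the table of direction cosines.
ANGLES_DEG = [
    [0.0, 90.0, 90.0],
    [90.0, 60.0, 90.0 - 60.0],
    [90.0, 90.0 + 60.0, 60.0],
]

def direction_cosine_matrix(angles_deg):
    """Each matrix element is the cosine of the angle between the two axes."""
    return [[math.cos(math.radians(a)) for a in row] for row in angles_deg]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

lam = direction_cosine_matrix(ANGLES_DEG)
product = matmul(lam, transpose(lam))   # should be the identity matrix
determinant = det3(lam)                 # should be +1 for a proper rotation
```

Although the matrix has nine elements, the six orthogonality conditions leave only three independent quantities, which is why the check above succeeds for any set of angles describing a genuine rotation.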
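$\boldsymbol{\lambda}_{BA} = \boldsymbol{\lambda}_A \boldsymbol{\lambda}_B$ (D.19)
That is:
$\boldsymbol{\lambda}_{BA} = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix} \neq \boldsymbol{\lambda}_{AB}$ (D.20)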
An entirely different orientation results as illustrated in figure 1.
This behavior of finite rotations is a consequence of the fact that finite rotations do not commute, that
is, reversing the order does not give the same answer. Thus, if we associate the vectors A and B with
these rotations, then it implies that the product AB ≠ BA. That is, the product of finite rotation
matrices does not behave like that of true vectors, since finite rotations do not commute.
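The non-commutation of finite rotations can be confirmed by direct matrix multiplication. This short sketch, added for illustration, multiplies the two 90 degree rotation matrices that appear in equation D.20 in both orders; the variable names `rot_first` and `rot_second` are neutral placeholders and are not the labels used in the text.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# The two finite rotation matrices appearing on the left of equation D.20.
rot_first = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]
rot_second = [[1, 0, 0], [0, 0, 1], [0, -1, 0]]

product_one_order = matmul(rot_first, rot_second)
product_other_order = matmul(rot_second, rot_first)
rotations_commute = product_one_order == product_other_order
```

The first product reproduces the right-hand matrix of D.20, while the reversed order gives a different matrix, so `rotations_commute` is `False`.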
D.2. ROTATIONAL TRANSFORMATIONS 529
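$d\mathbf{r} = d\boldsymbol{\theta} \times \mathbf{r}$ (D.21)
$d\mathbf{r}_1 = d\boldsymbol{\theta}_1 \times \mathbf{r}$ (D.22)
and
$d\mathbf{r}_2 = d\boldsymbol{\theta}_2 \times (\mathbf{r} + d\mathbf{r}_1)$ (D.23)
Thus the final position vector for $d\boldsymbol{\theta}_1$ followed by $d\boldsymbol{\theta}_2$ is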
Note that the products of these two infinitesimal rotations, 25 and 27, are identical. That is, assuming
that second-order infinitesimals can be neglected, the infinitesimal rotations commute, and thus $d\boldsymbol{\theta}_1$
and $d\boldsymbol{\theta}_2$ are correctly represented by vectors.
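To see why neglecting second-order terms matters, the two orderings of a pair of small rotations can be compared numerically. The sketch below is illustrative only: it assumes a hypothetical position vector (1, 2, 3) and small rotations of size `eps` about the 1- and 2-axes, applies the first-order update $\mathbf{r} \to \mathbf{r} + d\boldsymbol{\theta} \times \mathbf{r}$ in both orders, and returns the magnitude of the difference.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def rotate_infinitesimal(r, dtheta):
    """First-order rotation: r -> r + dtheta x r."""
    dr = cross(dtheta, r)
    return tuple(ri + dri for ri, dri in zip(r, dr))

def order_difference_norm(eps):
    """|r_12 - r_21| for small rotations of size eps about axes 1 and 2."""
    r = (1.0, 2.0, 3.0)              # hypothetical position vector
    dtheta_1 = (eps, 0.0, 0.0)
    dtheta_2 = (0.0, eps, 0.0)
    r_12 = rotate_infinitesimal(rotate_infinitesimal(r, dtheta_1), dtheta_2)
    r_21 = rotate_infinitesimal(rotate_infinitesimal(r, dtheta_2), dtheta_1)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(r_12, r_21)))
```

Reducing `eps` by a factor of 10 reduces the difference by a factor of 100, so the discrepancy between the two orderings is second order in the rotation angle and vanishes in the infinitesimal limit.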
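The fact that $d\boldsymbol{\theta}$ is a vector allows angular velocity to be represented by a vector. That is, angular
velocity is the ratio of an infinitesimal rotation to an infinitesimal time.
$\boldsymbol{\omega} = \frac{d\boldsymbol{\theta}}{dt}$ (D.28)
Note that this implies that the velocity of the point can be expressed as
$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{d\boldsymbol{\theta}}{dt} \times \mathbf{r} = \boldsymbol{\omega} \times \mathbf{r}$ (D.29)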
It was shown in equation 12 that, for such an orthogonal matrix, the inverse matrix $\boldsymbol{\lambda}^{-1}$ equals the
transposed matrix $\boldsymbol{\lambda}^T$
$\boldsymbol{\lambda}^{-1} = \boldsymbol{\lambda}^T$
Inserting the orthogonality relation for the rotation matrix leads to the fact that the square of the determinant
of the rotation matrix equals one,
$|\boldsymbol{\lambda}|^2 = 1$ (D.31)
that is
$|\boldsymbol{\lambda}| = \pm 1$ (D.32)
A proper rotation is the rotation of a normal vector and has
$|\boldsymbol{\lambda}| = +1$ (D.33)
For all proper rotations the determinant of $\boldsymbol{\lambda}$ is +1 and thus the cross product also acts like a proper vector
under rotation. This is not true for improper rotations where $|\boldsymbol{\lambda}| = -1$.
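The different behaviour of the cross product under proper and improper transformations can be illustrated with a small numerical example. The sketch below is not from the original text: it assumes two hypothetical vectors A and B, a proper rotation of 90 degrees about the 3-axis, and the spatial inversion matrix (minus the identity, determinant −1), and compares transforming the cross product with taking the cross product of the transformed vectors.

```python
def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def apply(m, v):
    """Matrix times column vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

A = (1, 2, 3)      # hypothetical proper vectors
B = (-1, 0, 2)
C = cross(A, B)

proper = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]        # determinant +1
inversion = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]   # determinant -1

# Proper rotation: the cross product transforms like an ordinary vector.
rotated_then_crossed = cross(apply(proper, A), apply(proper, B))
crossed_then_rotated = apply(proper, C)

# Inversion: A and B change sign, but their cross product does not.
inverted_then_crossed = cross(apply(inversion, A), apply(inversion, B))
crossed_then_inverted = apply(inversion, C)
```

Under the proper rotation the two results agree, whereas under inversion the cross product of the inverted vectors equals `C` rather than `-C`, which is the defining behaviour of a pseudovector.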
A() = −A(−) (D.37)
B() = −B(−)
C = A × B (D.39)
C = B × A = −A × B (D.40)
D.4. TIME REVERSAL TRANSFORMATION 531
That is, handedness corresponds to a definite ordering of the cross product. Proper orthogonal transformations
are said to preserve chirality (Greek for handedness) of a coordinate system.
An example of the use of the right-handed system is the usual definition of cartesian unit vectors,
$\hat{\mathbf{i}} \times \hat{\mathbf{j}} = \hat{\mathbf{k}}$ (D.41)
An obvious question to be asked is whether the handedness of a coordinate system is merely a mathematical curiosity
or whether it has some deep underlying significance. Consider the Lorentz force
$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B})$ (D.42)
Since force and velocity are proper vectors, the magnetic B field must be a pseudovector. Note that
calculation of the B field occurs only in cross products such as,
$\nabla \times \mathbf{B} = \mu_0\,\mathbf{j}$ (D.43)
where the current density j is a proper vector. Another example is the Biot-Savart Law which expresses B
as
l × r
B = (D.44)
4 2
Thus even though B is a pseudo vector, the force F remains a proper vector. Thus if a left-handed coordinate
definition of B = 4 r×l
2 is used in 44, and F = (E + B ×v) in 42 then the same final physical
result would be obtained.
It was long thought that the laws of physics were symmetric with respect to spatial inversion (i.e. mirror
reflection), meaning that the choice between left-handed and right-handed representations (chirality) was
arbitrary. This is true for the gravitational, electromagnetic and strong forces, and is called the conservation
of parity. The fourth fundamental force in nature, the weak force, violates parity and favours handedness.
It turns out that right-handed ordinary matter is symmetrical with left-handed antimatter.
In addition to the two flavours of vectors, one has scalars and pseudoscalars defined by:
are invariant under time reversal. Since the force can be expressed as the gradient of a scalar potential for
a conservative field, then the potential also remains unchanged. That is
$\frac{d\mathbf{p}}{dt} = -\nabla U(r) = \mathbf{F}$ (D.47)
It is necessary to introduce tensor algebra, given in appendix , prior to discussion of the transformation
properties of observables which is the topic of appendix 5.
Workshop exercises
1. Suppose the $x_2$-axis of a rectangular coordinate system is rotated by 30° away from the $x_3$-axis around the
$x_1$-axis.
(a) Find the corresponding transformation matrix. Try to do this by drawing a diagram instead of going to
the book or the notes for a formula.
(b) Is this an orthogonal matrix? If so, show that it satisfies the main properties of an orthogonal matrix. If
not, explain why it fails to be orthogonal.
(c) Does this matrix represent a proper or an improper rotation? How do you know?
2. When you were first introduced to vectors, you most likely were told that a scalar is a quantity that is defined
by a magnitude, while a vector has both a magnitude and a direction. While this is certainly true, there is
another, more sophisticated way to define a scalar quantity and a vector quantity: through their transformation
properties. A scalar quantity transforms as $\phi' = \phi$ while a vector quantity transforms as $A'_i = \sum_j \lambda_{ij} A_j$. To
show that the scalar product does indeed transform as a scalar, note that:
$\mathbf{A}' \cdot \mathbf{B}' = \sum_i A'_i B'_i = \sum_i \left( \sum_j \lambda_{ij} A_j \right) \left( \sum_k \lambda_{ik} B_k \right) = \sum_j \sum_k \left( \sum_i \lambda_{ij} \lambda_{ik} \right) A_j B_k$
$= \sum_j \left( \sum_k \delta_{jk} B_k \right) A_j = \sum_j A_j B_j = \mathbf{A} \cdot \mathbf{B}$
Now you will show that the vector product transforms as a vector. Begin by writing out what you are trying
to show explicitly and show it to the teaching assistant. Once the teaching assistant has confirmed that you
have the correct expression, try to prove it. The vector product is a bit more difficult to work with than the
scalar product, so your teaching assistant is prepared to give you a hint if you get stuck.
3. Suppose you have two rectangular coordinate systems that share a common origin, but one system is rotated
by an angle $\theta$ with respect to the other. To describe this rotation, you have made use of the rotation matrix
$\lambda(\theta)$. (I’m changing the notation slightly to put the emphasis on the angle of rotation.)
(a) Verify that the product of two rotation matrices $\lambda(\theta_1)\lambda(\theta_2)$ is in itself a rotation matrix.
(b) In abstract algebra, a group $G$ is defined as a set of elements $g$ together with a binary operation ∗ acting
on that set such that four properties are satisfied:
i. (Closure) For any two elements $g_i$ and $g_j$ in the group $G$, the product of the elements, $g_i ∗ g_j$, is also
in the group $G$.
ii. (Associativity) For any three elements $g_i$, $g_j$, $g_k$ of the group $G$, $(g_i ∗ g_j) ∗ g_k = g_i ∗ (g_j ∗ g_k)$.
iii. (Existence of Identity) The group $G$ contains an identity element $e$ such that $g ∗ e = e ∗ g = g$ for
all $g \in G$.
iv. (Existence of Inverses) For each element $g \in G$, there exists an inverse element $g^{-1} \in G$ such that
$g ∗ g^{-1} = g^{-1} ∗ g = e$.
Show that if the product ∗ denotes the product of two matrices, then the set of rotation matrices together
with ∗ forms a group. This group is known as the special orthogonal group in two dimensions, also known
as SO(2).
(c) Is this group commutative? In abstract algebra, a commutative group is called an abelian group.
4. When you look in a mirror the image of you appears left-to-right reversed, that is, the image of your left ear
appears to be the right ear of the image and vice versa. Explain why the image is left-right reversed rather
than up-down reversed or reversed about some other axis; i.e. explain what breaks the symmetry that leads to
these properties of the mirror image.
Problems
[1] Find the transformation matrix that rotates the axis $x_3$ of a rectangular coordinate system 45° toward $x_1$ around
the $x_2$ axis.
[2] For simplicity, take $\boldsymbol{\lambda}$ to be a two-dimensional transformation matrix. Show by direct expansion that $|\boldsymbol{\lambda}|^2 = 1$.
Appendix E
Tensor algebra
E.1 Tensors
Mathematically scalars and vectors are the first two members of a hierarchy of entities, called tensors,
that behave under coordinate transformations as described in appendix . The use of the tensor notation
provides a compact and elegant way to handle transformations in physics.
A scalar is a rank 0 tensor with one component, that is invariant under change of the coordinate system.
$\phi(x', y', z') = \phi(x, y, z)$ (E.1)
A vector is a rank 1 tensor which has three components, that transform under rotation according to the
matrix relation
$\mathbf{x}' = \boldsymbol{\lambda} \cdot \mathbf{x}$ (E.2)
where λ is the rotation matrix. Equation 2 can be written in the suffix form as
$x'_i = \sum_{j=1}^{3} \lambda_{ij} x_j$ (E.3)
The above definitions of scalars and vectors can be subsumed into a class of entities called tensors of rank $n$
that have $3^n$ components. A scalar is a tensor of rank $n = 0$, with only $3^0 = 1$ component, whereas a vector
has rank $n = 1$, that is, the vector x has one suffix and $3^1 = 3$ components.
A second-order tensor $T_{ij}$ has rank $n = 2$ with two suffixes, that is, it has $3^2 = 9$ components that
transform under rotation as
$T'_{ij} = \sum_{k=1}^{3} \sum_{l=1}^{3} \lambda_{ik} \lambda_{jl} T_{kl}$ (E.4)
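For second-order tensors, the transformation formula given by equation 4 can be written more compactly
using matrices. Thus the second-order tensor can be written as a 3 × 3 matrix
$\mathbf{T} \equiv \begin{pmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{pmatrix}$ (E.5)
The rotational transformation given in equation 4 can be written in the form
$T'_{ij} = \sum_{k=1}^{3} \lambda_{ik} \left( \sum_{l=1}^{3} T_{kl} \lambda_{jl} \right) = \sum_{k=1}^{3} \lambda_{ik} \left( \sum_{l=1}^{3} T_{kl} \lambda^{T}_{lj} \right)$ (E.6)
where $\lambda^{T}_{lj}$ are the matrix elements of the transposed matrix $\boldsymbol{\lambda}^T$. The summations in 6 can be expressed
in both the tensor and conventional matrix form as the matrix product
$\mathbf{T}' = \boldsymbol{\lambda} \cdot \mathbf{T} \cdot \boldsymbol{\lambda}^T$ (E.7)
Equation 7 defines the rotational properties of a spherical tensor.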
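The equivalence of the component sum and the matrix product for a rank-2 tensor can be checked with a few lines of code. This is a minimal sketch under stated assumptions: the tensor entries and the 30 degree rotation about the 3-axis are arbitrary illustrative choices, and the function names are hypothetical. It evaluates the double sum over components and the matrix product of the rotation matrix, the tensor, and the transposed rotation matrix, and also compares the traces before and after the rotation.

```python
import math

def rotation_about_axis3(angle_rad):
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transform_by_components(lam, T):
    """T'_ij = sum over k, l of lam_ik * lam_jl * T_kl."""
    return [[sum(lam[i][k] * lam[j][l] * T[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

def transform_by_matrices(lam, T):
    """T' = lam . T . lam^T."""
    return matmul(matmul(lam, T), transpose(lam))

def trace(m):
    return sum(m[i][i] for i in range(3))

T = [[1.0, 2.0, 0.0], [2.0, 3.0, -1.0], [0.0, -1.0, 5.0]]  # arbitrary example
lam = rotation_about_axis3(math.radians(30.0))
T_components = transform_by_components(lam, T)
T_matrices = transform_by_matrices(lam, T)
```

Both routes give the same transformed tensor, and the trace is unchanged by the rotation, as expected for a scalar formed by contracting the two suffixes.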
534 APPENDIX E. TENSOR ALGEBRA
$T_{ij} = a_i b_j$ (E.9)
This second-order tensor product has a rank $n = 2$, that is, it equals the sum of the ranks of the two
vectors. Equation 8 is called a dyad since it was derived by taking the dyadic product of two vectors. In
general, multiplication, or division, of two vectors leads to second-order tensors. Note that this second-order
tensor product completes the triad of tensors possible when taking the product of two vectors. That is, the scalar
product a · b has rank $n = 0$, the vector product a × b has rank $n = 1$, and the tensor product a ⊗ b has rank¹
$n = 2$.
Higher-order tensors can be created by taking more complicated tensor products. For example, a rank-3
tensor can be created by taking the tensor outer product of the rank-2 tensor and a vector which, for
a dyadic tensor, can be written as the tensor product of three vectors. That is,
In summary, the rank of the tensor product equals the sum of the ranks of the tensors included in the tensor
product.
The scalar product a · b is a scalar number, and thus the inner-product tensor is the vector c renormalized
by the magnitude of the scalar product a · b. That is, it has a rank $n = 2 + 1 - 2 = 1$. Thus the inner product
of this rank-2 tensor with a vector gives a vector. The inner product of a rank-2 tensor with a rank-1 tensor
is used in this book for handling the rotation matrix, the inertia tensor for rigid-body rotation, and for the
stress and the strain tensors used to describe elasticity in solids.
Then the vector φ can be expressed compactly as the inner product of G and x, that is
φ = G·x
Equation 13 relates the contravariant components in the unprimed and primed frames.
Derivatives of a scalar function , such as
X X
0 = = = (E.14)
That is, covariant components of the tensor transform according to the relation
X
0 = (E.15)
It is important to differentiate between contravariant and covariant vectors. The Einstein superscript/subscript
convention for distinguishing between these two flavours of tensors is given in table 1
In linear algebra one can map from one coordinate system to another as illustrated in appendix . That
is, the tensor x can be expressed as components with respect to either the unprimed or primed coordinate
frames
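$\mathbf{x} = \hat{\mathbf{e}}'_1 x'_1 + \hat{\mathbf{e}}'_2 x'_2 + \hat{\mathbf{e}}'_3 x'_3 = \hat{\mathbf{e}}_1 x_1 + \hat{\mathbf{e}}_2 x_2 + \hat{\mathbf{e}}_3 x_3$ (E.16)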
For a −dimensional manifold the unit basis column vectors ê transform according to the transformation
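matrix λ
$\hat{\mathbf{e}}' = \boldsymbol{\lambda} \cdot \hat{\mathbf{e}}$ (E.17)
Since the tensor x is independent of the coordinate basis, the components of x must have the opposite
transform
$\mathbf{x}' = \left( \boldsymbol{\lambda}^{-1} \right)^T \cdot \mathbf{x}$ (E.18)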
This normal vector x is called a “contravariant vector” because it transforms contrary to the basis column
vector transformation.
The inverse of equation 18 gives that the column vector element
X
= λ 0 (E.19)
E.5. GENERALIZED INNER PRODUCT 537
Consider the case of a gradient with respect to the coordinate x in both the unprimed and primed bases.
Using the chain rule for the partial derivative then the component of the gradient in the primed frame can
be expanded as
X X
(∇ )0 = = = λ = (E.20)
0
0
Again the summation cancels the superscript and subscript. The Kronecker delta symbol is written as
X
= (E.24)
where is a unitary matrix called a covariant metric. The covariant metric transforms a contravariant to
a covariant tensor. For example the matrix element of a covariant tensor can be written as
X
= (E.26)
By association of the covariant metric with either of the vectors in the inner product gives
X X X
= = = (E.27)
Then
X
= (E.29)
Association of the contravariant metric with one of the vectors in the inner product gives the inner
product X X X
= = = (E.30)
For most situations in this book the metric is diagonal and unitary.
Table 2 : Transformation properties of scalar, vector, pseudovector, and tensor observables
under rotation, spatial inversion, and time reversal2
Physical Observable Rotation Space Time Name
(Tensor rank) inversion reversal
1) Classical Mechanics
Mass density 0 Even Even Scalar
Kinetic energy 2 2 0 Even Even Scalar
Potential energy () 0 Even Even Scalar
Lagrangian 0 Even Even Scalar
Hamiltonian 0 Even Even Scalar
Gravitational potential 0 Even Even Scalar
Coordinate r 1 Odd Even Vector
Velocity v 1 Odd Odd Vector
Momentum p 1 Odd Odd Vector
Angular momentum L=r×p 1 Even Odd Pseudovector
Force F 1 Odd Even Vector
Torque N=r×F 1 Even Even Pseudovector
Gravitational field g 1 Odd Even Vector
Inertia tensor I 2 Even Even Tensor
Elasticity stress tensor T 2 Even Even Tensor
2) Electromagnetism
Charge density 0 Even Even Scalar
Current density j 1 Odd Odd Vector
Electric field E 1 Odd Even Vector
Polarization P 1 Odd Even Vector
Displacement D 1 Odd Even Vector
Magnetic field B 1 Even Odd Pseudovector
Magnetization M 1 Even Odd Pseudovector
Magnetic field H 1 Even Odd Pseudovector
Poynting vector S=E×H 1 Odd Odd Vector
Dielectric tensor K 2 Even Even Tensor
Maxwell stress tensor T 2 Even Even Tensor
2 Based on table 6.1 in "Classical Electrodynamics", 2nd edition, by J.D. Jackson [?]
Appendix F
Multivariate calculus provides the framework for handling systems having many variables associated with
each of several bodies. It is assumed that the reader has studied linear differential equations plus multivariate
calculus and thus has been exposed to the calculus used in classical mechanics. Chapter 5 of this book
introduced variational calculus which covers several important aspects of multivariate calculus such as Euler’s
variational calculus and Lagrange multipliers. This appendix provides a brief review of a selection of other
aspects of multivariate calculus that feature prominently in classical mechanics.
540 APPENDIX F. ASPECTS OF MULTIVARIATE CALCULUS
typically comprise partial derivatives that act on scalar, vector, or tensor fields. Table 1 lists a few
elementary examples of the use of linear operators in this textbook. The first four linear operators involve
the widely used del operator ∇ to generate the gradient, divergence and curl as described in appendices
and . The fifth and sixth linear operators act on the Lagrangian in Lagrangian mechanics applications.
The final two linear operators act on the wavefunction for wave mechanics.
There are three ways of expressing operations such as addition, multiplication, transposition or inversion
of operations that are completely equivalent because they all are based on the same principles of linear
algebra. For example, a transformation O acting on a vector A can produce the vector B. The simplest
way to express this transformation is in terms of components
$B_i = \sum_{j=1}^{3} O_{ij} A_j$ (F.6)
Another way is to use matrix mechanics where the 3 × 3 matrix (O) transforms the column vector (A) to
the column vector (B), that is,
(B) = (O) (A) (F.7)
The third approach is to assume an operator O acts on the vector A
B = OA (F.8)
In classical mechanics, and quantum mechanics, these three equivalent approaches are used and exploited
extensively and interchangeably. In particular the rules of matrix manipulation, that are given in appendix
are synonymous, and equivalent to, those that apply for operator manipulation. If the operator is complex
then the operator properties are summarized as follows.
The generalization of the transpose for complex operators is the Hermitian conjugate $O^{\dagger}$
$O^{\dagger}_{ij} = O^{*}_{ji}$ (F.9)
For a real matrix the complex conjugation has no effect so the matrix is real and symmetric.
The generalization of orthogonal is unitary, for which the operator is unitary if it is non-singular and
$O^{-1} = O^{\dagger}$ (F.12)
which implies
$O^{\dagger} O = I = O O^{\dagger}$ (F.13)
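The Hermitian conjugate and the unitarity condition can be made concrete with small complex matrices. The sketch below is illustrative and not from the original text: the 2 × 2 matrices `U` and `H` are hypothetical examples chosen so that `U` is unitary and `H` is Hermitian, and the helper names are assumptions.

```python
import math

def hermitian_conjugate(m):
    """Transpose combined with complex conjugation of every element."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_identity(m, tol=1e-12):
    n = len(m)
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))

s = 1.0 / math.sqrt(2.0)
U = [[s + 0j, s * 1j], [s * 1j, s + 0j]]      # hypothetical unitary matrix
H = [[2 + 0j, 1 - 1j], [1 + 1j, 3 + 0j]]      # hypothetical Hermitian matrix

U_dagger = hermitian_conjugate(U)
unitary_left = is_identity(matmul(U_dagger, U))
unitary_right = is_identity(matmul(U, U_dagger))
hermitian = hermitian_conjugate(H) == H
```

For a real matrix the conjugation step does nothing, so the same two tests reduce to the orthogonality and symmetry conditions used for rotation matrices.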
F.3. TRANSFORMATION JACOBIAN 541
As shown in table 4, 1 2 3 = 2 sin that is, the Jacobian equals 2 sin Thus equation 16
can be written as
∙ ¸
3 (1 2 3 ) 3 2 2 ( )
1 2 3 = (sin ) = Ω (F.17)
1 2 3 1 2 3 Ω
The differential cross section is defined by
2 ( ) 3
≡ 2 (F.18)
Ω 1 2 3
where the 2 factor is absorbed into the cross section and the solid angle term is factored out
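the geometric relations $x_1 = r\sin\theta\cos\phi$, $x_2 = r\sin\theta\sin\phi$, $x_3 = r\cos\theta$. For this transformation the Jacobian
determinant equals
$J(r, \theta, \phi) = \begin{vmatrix} \sin\theta\cos\phi & r\cos\theta\cos\phi & -r\sin\theta\sin\phi \\ \sin\theta\sin\phi & r\cos\theta\sin\phi & r\sin\theta\cos\phi \\ \cos\theta & -r\sin\theta & 0 \end{vmatrix} = r^2\sin\theta$
Thus the three-dimensional volume integral transforms to
$\int f(x_1, x_2, x_3)\,dx_1\,dx_2\,dx_3 = \int f(r, \theta, \phi)\,J(r, \theta, \phi)\,dr\,d\theta\,d\phi = \int f(r, \theta, \phi)\,r^2\sin\theta\,dr\,d\theta\,d\phi$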
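The value of the spherical Jacobian determinant can be cross-checked numerically. The following sketch, which is an illustration rather than the author's method, builds the matrix of partial derivatives of the cartesian coordinates with respect to the spherical coordinates by central finite differences and compares its determinant with the product of the radius squared and the sine of the polar angle. The step size `h` and the sample point are arbitrary assumptions.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def numerical_jacobian(r, theta, phi, h=1e-6):
    """Matrix of dx_i/dq_k by central differences; rows i, columns k."""
    q = [r, theta, phi]
    columns = []
    for k in range(3):
        plus, minus = list(q), list(q)
        plus[k] += h
        minus[k] -= h
        xp = spherical_to_cartesian(*plus)
        xm = spherical_to_cartesian(*minus)
        columns.append([(xp[i] - xm[i]) / (2.0 * h) for i in range(3)])
    return [[columns[k][i] for k in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

r, theta, phi = 2.0, 0.8, 1.3          # arbitrary sample point
jacobian_numeric = det3(numerical_jacobian(r, theta, phi))
jacobian_formula = r ** 2 * math.sin(theta)
```

The two values agree to within the finite-difference error, and the same routine can be pointed at any other coordinate map by replacing `spherical_to_cartesian`.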
Workshop exercises
1. Below you will find a set of integrals. Your teaching assistant will divide you into groups and each group will
be assigned one integral to work on. Once your group has solved the integral, write the solution on the board
in the space provided by the teaching assistant.
R 2 R 4 R cos
(a) 2 sin
R0¡ ṙ 0
¢0
(b) − ṙ2
R
(c) A · a where A = ̂ + ̂ + k̂ and is the sphere 2 + 2 + 2 = 9.
R
(d) (∇ × A) · a where A = ̂ + ̂ + k̂ and is the surface defined by the paraboloid = 1 − 2 − 2 ,
where ≥ 0.
Appendix G
This appendix reviews vector differential calculus which is used extensively in both classical mechanics and
electromagnetism.
0
= (G.1)
That is, differentiation of scalar or vector fields with respect to a scalar operator does not change the
rotational behavior. In particular, the scalar differentials of vectors continue to obey the rules of ordinary
proper vectors. The scalar operator is used for calculation of velocity or acceleration.
0 = (G.3)
then the partial differential with respect to one component of the vector x0 gives
0 X
0 = (G.4)
0
544 APPENDIX G. VECTOR DIFFERENTIAL CALCULUS
Therefore
X 0 X
= = = (G.6)
0 0
Thus
0 X
0 = (G.7)
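That is, the vector derivative acting on a scalar field transforms like a proper vector.
Define the gradient, or ∇ operator, as
$\nabla \equiv \sum_i \hat{\mathbf{e}}_i \frac{\partial}{\partial x_i}$ (G.8)
where $\hat{\mathbf{e}}_i$ is the unit vector along the $x_i$ axis. In cartesian coordinates, the del vector operator is,
$\nabla \equiv \hat{\mathbf{i}}\frac{\partial}{\partial x} + \hat{\mathbf{j}}\frac{\partial}{\partial y} + \hat{\mathbf{k}}\frac{\partial}{\partial z}$ (G.9)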
The gradient was applied to the gravitational and electrostatic potential to derive the corresponding field.
For example, for electrostatics it was shown that the gradient of the scalar electrostatic potential field can
be written in cartesian coordinates as
$\mathbf{E} = -\nabla\phi$ (G.10)
Note that the gradient of a scalar field produces a vector field. You are familiar with this if you are a skier
in that the gravitational force pulls you down the line of steepest descent for the ski slope.
By contrast to the scalar product, both the gradient of a scalar field, and the vector product, are vector
fields for which the components along the coordinate axes transform in a specific manner, such as to keep the
length of the vector constant, as the coordinate frame is rotated. The gradient, scalar and vector products
with the ∇ operator are the first order derivatives of fields that occur most frequently in physics.
Second derivatives of fields also are used. Let us consider some possible combinations of the product of
two del operators.
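1) $\nabla \cdot (\nabla\phi) = \nabla^2\phi$
The scalar product of two del operators is a scalar under rotation. Evaluating the scalar product in
cartesian coordinates gives
$\left( \hat{\mathbf{i}}\frac{\partial}{\partial x} + \hat{\mathbf{j}}\frac{\partial}{\partial y} + \hat{\mathbf{k}}\frac{\partial}{\partial z} \right) \cdot \left( \hat{\mathbf{i}}\frac{\partial\phi}{\partial x} + \hat{\mathbf{j}}\frac{\partial\phi}{\partial y} + \hat{\mathbf{k}}\frac{\partial\phi}{\partial z} \right) = \frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} + \frac{\partial^2\phi}{\partial z^2}$ (G.13)
This also can be obtained without confusion by writing this product as;
$\nabla \cdot (\nabla\phi) = \nabla \cdot \nabla\phi = (\nabla \cdot \nabla)\phi$ (G.14)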
G.3. VECTOR DIFFERENTIAL OPERATORS IN CURVILINEAR COORDINATES 545
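where the scalar product of the del operator is a scalar, called the Laplacian $\nabla^2$, given by
$\nabla \cdot \nabla = \nabla^2 \equiv \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$ (G.15)
The Laplacian operator is encountered frequently in physics.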
2) $\nabla \times (\nabla\phi) = 0$
Note that the vector product of two identical vectors vanishes
$\mathbf{A} \times \mathbf{A} = 0$ (G.16)
Therefore
$\nabla \times (\nabla\phi) = 0$ (G.17)
This can be confirmed by evaluating the separate components along each axis.
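The component-by-component confirmation can be written out explicitly. The following worked step assumes that the scalar field, written here as $\phi$ (the symbol is an assumption), has continuous second derivatives so that mixed partial derivatives commute.

```latex
\begin{aligned}
\left[\nabla \times (\nabla\phi)\right]_x
  &= \frac{\partial}{\partial y}\left(\frac{\partial\phi}{\partial z}\right)
   - \frac{\partial}{\partial z}\left(\frac{\partial\phi}{\partial y}\right) = 0 \\
\left[\nabla \times (\nabla\phi)\right]_y
  &= \frac{\partial}{\partial z}\left(\frac{\partial\phi}{\partial x}\right)
   - \frac{\partial}{\partial x}\left(\frac{\partial\phi}{\partial z}\right) = 0 \\
\left[\nabla \times (\nabla\phi)\right]_z
  &= \frac{\partial}{\partial x}\left(\frac{\partial\phi}{\partial y}\right)
   - \frac{\partial}{\partial y}\left(\frac{\partial\phi}{\partial x}\right) = 0
\end{aligned}
```

Each component vanishes because the order of partial differentiation can be interchanged, which is the operator analogue of the vanishing cross product of two identical vectors.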
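3) $\nabla \cdot (\nabla \times \mathbf{A}) = 0$
This is zero because the cross product ∇ × A is perpendicular to ∇ and thus the dot product is zero.
4) $\nabla \times (\nabla \times \mathbf{A}) = \nabla(\nabla \cdot \mathbf{A}) - \nabla^2\mathbf{A}$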
The identity
A × (B × C) = B (A · C) − (A · B) C (G.18)
since ∇ · ∇ = ∇2
There are pitfalls in the discussion of second derivatives in that it is assumed that both del operators
operate on the same variable, otherwise the results are different.
G.3.1 Gradient:
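The gradient in curvilinear coordinates is
$\nabla f = \hat{\mathbf{q}}_1 \frac{1}{h_1}\frac{\partial f}{\partial q_1} + \hat{\mathbf{q}}_2 \frac{1}{h_2}\frac{\partial f}{\partial q_2} + \hat{\mathbf{q}}_3 \frac{1}{h_3}\frac{\partial f}{\partial q_3}$ (G.20)
In cylindrical coordinates
$\nabla f = \hat{\boldsymbol{\rho}}\frac{\partial f}{\partial\rho} + \hat{\boldsymbol{\varphi}}\frac{1}{\rho}\frac{\partial f}{\partial\varphi} + \hat{\mathbf{z}}\frac{\partial f}{\partial z}$ (G.21)
In spherical coordinates
$\nabla f = \hat{\mathbf{r}}\frac{\partial f}{\partial r} + \hat{\boldsymbol{\theta}}\frac{1}{r}\frac{\partial f}{\partial\theta} + \hat{\boldsymbol{\varphi}}\frac{1}{r\sin\theta}\frac{\partial f}{\partial\varphi}$ (G.22)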
G.3.2 Divergence:
The divergence can be expressed as
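$\nabla \cdot \mathbf{A} = \frac{1}{h_1 h_2 h_3}\left[ \frac{\partial}{\partial q_1}(A_1 h_2 h_3) + \frac{\partial}{\partial q_2}(A_2 h_3 h_1) + \frac{\partial}{\partial q_3}(A_3 h_1 h_2) \right]$ (G.23)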
G.3.3 Curl:
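$\nabla \times \mathbf{A} = \frac{1}{h_1 h_2 h_3} \begin{vmatrix} h_1\hat{\mathbf{q}}_1 & h_2\hat{\mathbf{q}}_2 & h_3\hat{\mathbf{q}}_3 \\ \frac{\partial}{\partial q_1} & \frac{\partial}{\partial q_2} & \frac{\partial}{\partial q_3} \\ h_1 A_1 & h_2 A_2 & h_3 A_3 \end{vmatrix}$ (G.26)
In cylindrical coordinates the curl is
$\nabla \times \mathbf{A} = \frac{1}{\rho} \begin{vmatrix} \hat{\boldsymbol{\rho}} & \rho\,\hat{\boldsymbol{\varphi}} & \hat{\mathbf{z}} \\ \frac{\partial}{\partial\rho} & \frac{\partial}{\partial\varphi} & \frac{\partial}{\partial z} \\ A_\rho & \rho A_\varphi & A_z \end{vmatrix}$ (G.27)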
G.3.4 Laplacian:
Taking the divergence of the gradient of a scalar gives
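$\nabla^2 f = \nabla \cdot \nabla f = \frac{1}{h_1 h_2 h_3}\left[ \frac{\partial}{\partial q_1}\left( \frac{h_2 h_3}{h_1}\frac{\partial f}{\partial q_1} \right) + \frac{\partial}{\partial q_2}\left( \frac{h_3 h_1}{h_2}\frac{\partial f}{\partial q_2} \right) + \frac{\partial}{\partial q_3}\left( \frac{h_1 h_2}{h_3}\frac{\partial f}{\partial q_3} \right) \right]$ (G.29)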
The gradient, divergence, curl and Laplacian are used extensively in curvilinear coordinate systems when
dealing with vector fields in Newtonian mechanics, electromagnetism, and fluid flow.
Appendix H
Field equations, such as for electromagnetic and gravitational fields, require both line integrals, and surface
integrals, of vector fields to evaluate potential, flux and circulation. These require use of the gradient, the
Divergence Theorem and Stokes Theorem which are discussed in the following sections.
$\Delta\phi = (\nabla\phi) \cdot d\mathbf{l}$ (H.1)
since the gradient of $\phi$, that is, $\nabla\phi$, is the rate of change of $\phi$ with $\mathbf{l}$. Discussions of gravitational and
electrostatic potential show that the line integral between points $a$ and $b$ is given in terms of the del operator
by
$\phi_b - \phi_a = \int_a^b (\nabla\phi) \cdot d\mathbf{l}$ (H.2)
This relates the difference in values of a scalar field at two points to the line integral of the dot product of
the gradient with the element of the line integral.
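The path independence implied by this relation can be demonstrated numerically. The sketch below is illustrative only: the scalar field $x^2 y + z$, the end points (0, 0, 0) and (1, 2, 3), and the two paths are all hypothetical choices. It sums the dot product of the gradient with short path elements along a straight path and along a curved path between the same end points.

```python
import math

def phi(p):
    """Hypothetical scalar field phi = x^2 y + z."""
    x, y, z = p
    return x * x * y + z

def grad_phi(p):
    """Analytic gradient of the field above."""
    x, y, z = p
    return (2.0 * x * y, x * x, 1.0)

def straight_path(s):
    """Straight line from (0, 0, 0) to (1, 2, 3) for s in [0, 1]."""
    return (s, 2.0 * s, 3.0 * s)

def curved_path(s):
    """A different path joining the same two end points."""
    return (s, 2.0 * s * s, 3.0 * s + math.sin(math.pi * s))

def line_integral_of_gradient(path, n=2000):
    """Sum of grad(phi) . dl over n short chords, evaluated at chord midpoints."""
    total = 0.0
    for k in range(n):
        p0, p1 = path(k / n), path((k + 1) / n)
        mid = tuple(0.5 * (a + b) for a, b in zip(p0, p1))
        dl = tuple(b - a for a, b in zip(p0, p1))
        total += sum(g * d for g, d in zip(grad_phi(mid), dl))
    return total

potential_difference = phi(straight_path(1.0)) - phi(straight_path(0.0))
integral_straight = line_integral_of_gradient(straight_path)
integral_curved = line_integral_of_gradient(curved_path)
```

Both integrals reproduce the difference in the field values at the end points, to within the discretization error of the sum.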
common to both $S_1$ and $S_2$ are equal and in the same direction. Then
the net flux through the sum of $S_1$ and $S_2$ is given by
$\oint_{S_1} \mathbf{F} \cdot d\mathbf{S} + \oint_{S_2} \mathbf{F} \cdot d\mathbf{S} = \oint_{S} \mathbf{F} \cdot d\mathbf{S}$ (H.4)
since the contributions of the common surface cancel in that the
flux out of $S_1$ is equal and opposite to the flux into $S_2$ over the surface $S_{cut}$.
That is, independent of how many times the volume enclosed by
$S$ is subdivided, the net flux for the sum of all the Gaussian surfaces
enclosing these subdivisions of the volume still equals $\oint_S \mathbf{F} \cdot d\mathbf{S}$.
Figure H.1: A volume V enclosed by a closed surface S is cut into two pieces at the surface $S_{cut}$. This gives
V1 enclosed by S1 and V2 enclosed by S2.
548 APPENDIX H. VECTOR INTEGRAL CALCULUS
Consider that the volume enclosed by $S$ is subdivided into $N$ subdivisions where $N \to \infty$. Even
though $\oint_{S_i} \mathbf{F} \cdot d\mathbf{S} \to 0$ as $N \to \infty$, the sum over surfaces of all the infinitesimal volumes remains unchanged
$\Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} = \sum_i^{N \to \infty} \oint_{S_i} \mathbf{F} \cdot d\mathbf{S}$ (H.5)
Thus we can take the limit of a sum of an infinite number of infinitesimal volumes as is needed to obtain a
differential form. The surface integral for each infinitesimal volume will equal zero, which is not useful, that
is, $\oint_{S_i} \mathbf{F} \cdot d\mathbf{S} \to 0$ as $N \to \infty$. However, the flux per unit volume has a finite value as $N \to \infty$. This ratio is
called the divergence of the vector field;
$\mathrm{div}\,\mathbf{F} = \lim_{\Delta\tau \to 0} \frac{\oint_S \mathbf{F} \cdot d\mathbf{S}}{\Delta\tau}$ (H.6)
where $\Delta\tau$ is the infinitesimal volume enclosed by surface $S$. The divergence of the vector field is a scalar
quantity.
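Thus the sum of flux over all infinitesimal subdivisions of the volume enclosed by a closed surface $S$
equals
$\Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} = \sum_i^{N \to \infty} \frac{\oint_{S_i} \mathbf{F} \cdot d\mathbf{S}}{\Delta\tau_i}\,\Delta\tau_i = \sum_i^{N \to \infty} \mathrm{div}\,\mathbf{F}\,\Delta\tau_i$ (H.7)
This is called the Divergence Theorem or Gauss’s Theorem. To avoid confusion with Gauss’s law in electrostatics,
it will be referred to as the Divergence theorem.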
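Thus the net flux out of the box due to the $z$ component of $\mathbf{F}$ is
$$\Delta\Phi_z = \Delta\Phi_z(z+\Delta z) - \Delta\Phi_z(z) = \frac{\partial F_z}{\partial z}\,\Delta x\,\Delta y\,\Delta z \tag{H.11}$$
Adding the similar $x$ and $y$ components for $\Delta\Phi$ gives
$$\Delta\Phi = \left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right)\Delta x\,\Delta y\,\Delta z \tag{H.12}$$
Figure H.2: Computation of flux out of an infinitesimal rectangular box, $\Delta x\,\Delta y\,\Delta z$.
Dividing by the volume of the box gives the flux per unit volume, that is, the divergence equals the bracketed sum of partial derivatives,
since $\Delta\tau = \Delta x\,\Delta y\,\Delta z$. But the right hand side of the equation equals the scalar product $\nabla\cdot\mathbf{F}$, that is,
$$\operatorname{div}\mathbf{F} = \nabla\cdot\mathbf{F} \tag{H.14}$$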
The divergence is a scalar quantity. The physical meaning of the divergence is that it gives the net flux per
unit volume flowing out of an infinitesimal volume. A positive divergence corresponds to a net outflow of
flux from the infinitesimal volume at any location while a negative divergence implies a net inflow of flux
to this infinitesimal volume.
It was shown that for an infinitesimal rectangular box
$$\Delta\Phi = \left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right)\Delta x\,\Delta y\,\Delta z = \nabla\cdot\mathbf{F}\,\Delta\tau \tag{H.15}$$
Integrating over the finite volume enclosed by the surface $S$ gives
$$\Phi = \oint_S\mathbf{F}\cdot d\mathbf{S} = \int_{\text{enclosed volume}}\nabla\cdot\mathbf{F}\,d\tau \tag{H.16}$$
The divergence theorem, developed by Gauss, is of considerable importance: it relates the surface integral of
a vector field, that is, the outgoing flux, to a volume integral of $\nabla\cdot\mathbf{F}$ over the enclosed volume.
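The divergence theorem can be illustrated by comparing both sides numerically for a simple case. The sketch below assumes a hypothetical field $\mathbf{F} = (x^2 y,\; yz,\; z^3)$ on the unit cube $0 \le x, y, z \le 1$, for which both integrals equal 2 analytically; the field, the cube, and the grid size are illustrative choices, not taken from the text.

```python
def F(x, y, z):
    # Hypothetical vector field used only for this check.
    return (x * x * y, y * z, z ** 3)

def div_F(x, y, z):
    # Analytic divergence: d(x^2 y)/dx + d(y z)/dy + d(z^3)/dz.
    return 2.0 * x * y + z + 3.0 * z * z

def midpoints(n):
    # Cell-centre coordinates of n equal cells on [0, 1].
    return [(k + 0.5) / n for k in range(n)]

def volume_integral(n=40):
    # Sum of div F times the small volume element over the unit cube.
    pts = midpoints(n)
    dtau = (1.0 / n) ** 3
    return sum(div_F(x, y, z) for x in pts for y in pts for z in pts) * dtau

def surface_flux(n=40):
    # Outward flux through the six faces of the unit cube.
    pts = midpoints(n)
    dS = (1.0 / n) ** 2
    flux = 0.0
    for u in pts:
        for v in pts:
            flux += (F(1.0, u, v)[0] - F(0.0, u, v)[0]) * dS  # faces x = 1 and x = 0
            flux += (F(u, 1.0, v)[1] - F(u, 0.0, v)[1]) * dS  # faces y = 1 and y = 0
            flux += (F(u, v, 1.0)[2] - F(u, v, 0.0)[2]) * dS  # faces z = 1 and z = 0
    return flux

print(surface_flux(), volume_integral())
```

The small residual difference between the two printed values comes from the midpoint approximation of the volume integral and shrinks as the grid is refined.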
This is true independent of the shape of the Gaussian surface leading to the differential form of Gauss’s law
for B
∇·B=0
That is, the local value of the divergence of B is zero everywhere.
[∇ + ()ḡ(z)] · r =0
The right hand side of this equation equals minus the weight of the displaced fluid. That is, the buoyancy force
equals the weight of the fluid displaced by the empty volume. Note that this proof applies both to compressible
fluids, where the density depends on pressure, as well as to incompressible fluids where the density is constant.
It also applies to situations where local gravity is position dependent. If an object of mass $M$ is completely
submerged then the net force on the object is $Mg - \int\rho\,g\,d\tau$, that is, its weight minus the weight of the displaced fluid. If the object floats on the surface
of a fluid then the buoyancy force must be calculated separately for the volume under the fluid surface and
the upper volume above the fluid surface. The buoyancy due to displaced air usually is negligible since the
density of air is about $10^{-3}$ times that of fluids such as water.
because the contributions along the common boundary cancel since they are taken in opposite directions if
$C_1$ and $C_2$ both are taken in the same direction. Note that the line integral, and corresponding enclosed area,
are vector quantities related by the right-hand rule and this must be taken into account when subdividing
the area. Thus the area can be subdivided into an infinite number of pieces for which
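$$\oint_C\mathbf{F}\cdot d\mathbf{l} = \sum_i^{N\to\infty}\oint_{C_i}\mathbf{F}\cdot d\mathbf{l} = \sum_i^{N\to\infty}\frac{\oint_{C_i}\mathbf{F}\cdot d\mathbf{l}}{\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}}}\,\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}} \tag{H.19}$$
where $\Delta\mathbf{S}_i$ is the infinitesimal area bounded by the closed sub-loop $C_i$ and $\Delta\mathbf{S}_i\cdot\hat{\mathbf{n}}$ is the normal component
of this area pointing along the $\hat{\mathbf{n}}$ direction, which is the direction along which the line integral points.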
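The component of the curl of the vector function along the direction $\hat{\mathbf{n}}$ is defined to be
$$(\operatorname{curl}\mathbf{F})\cdot\hat{\mathbf{n}} \equiv \lim_{\Delta S\to 0}\frac{\oint_C\mathbf{F}\cdot d\mathbf{l}}{\Delta\mathbf{S}\cdot\hat{\mathbf{n}}} \tag{H.20}$$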
which is identical to the right hand side of the relation for the curl in cartesian coordinates. That is;
$$\nabla\times\mathbf{F} = \operatorname{curl}\mathbf{F} \tag{H.29}$$
The physical meaning of the curl is that it is the circulation per unit area, or rotation, for an infinitesimal loop at any
location. In the German literature the curl is called the rotation, written $\operatorname{rot}\mathbf{F}$.
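The interpretation of the curl as circulation per unit area can be illustrated numerically. The sketch below assumes a hypothetical planar field $\mathbf{F} = (-y,\; xy,\; 0)$, whose curl along $z$ is $1 + y$ analytically, and evaluates the line integral around a small square loop in the $xy$ plane; the field, the loop centre, and the loop size are illustrative assumptions.

```python
def F(x, y):
    # Hypothetical planar field; its analytic curl along z is 1 + y.
    return (-y, x * y)

def circulation(x0, y0, h, n=100):
    # Counterclockwise line integral of F . dl around a square of side h
    # centred on (x0, y0), using the midpoint rule on each edge.
    total = 0.0
    ds = h / n
    for k in range(n):
        s = -h / 2 + (k + 0.5) * ds
        total += F(x0 + s, y0 - h / 2)[0] * ds  # bottom edge, traversed along +x
        total += F(x0 + h / 2, y0 + s)[1] * ds  # right edge, traversed along +y
        total -= F(x0 + s, y0 + h / 2)[0] * ds  # top edge, traversed along -x
        total -= F(x0 - h / 2, y0 + s)[1] * ds  # left edge, traversed along -y
    return total

h = 1e-3
curl_z = circulation(0.3, 0.7, h) / h ** 2  # circulation per unit enclosed area
print(curl_z)
```

For this loop, centred at $y = 0.7$, the ratio should reproduce the analytic value $1 + y = 1.7$. The counterclockwise sense of traversal corresponds, by the right-hand rule, to the normal pointing along $+z$.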
$$\mathbf{F} = \nabla\phi \tag{H.32}$$
since
$$\nabla\times(\nabla\phi) = 0 \tag{H.33}$$
That is, any curl-free vector field can be expressed in terms of the gradient of a scalar field.
The scalar field $\phi$ is not unique, that is, any constant $c$ can be added to $\phi$ since $\nabla c = 0$; that is, the
addition of the constant does not change the gradient. This independence to addition of a number to the
scalar potential is called a gauge invariance, discussed in chapter 132, for which
$$\mathbf{F} = \nabla\phi' = \nabla(\phi + c) = \nabla\phi \tag{H.34}$$
That is, this gauge-invariant transformation does not change the observable $\mathbf{F}$. The electrostatic field $\mathbf{E}$
and the gravitation field $\mathbf{g}$ are examples of irrotational fields that can be expressed as the gradient of scalar
potentials.
∇·F=0 (H.35)
everywhere. This is automatically obeyed if the field F is expressed in terms of the curl of a vector field G
such that
F=∇×G (H.36)
since ∇ · ∇ × G = 0. That is, any divergence-free vector field can be written as the curl of a related vector
field.
As discussed in chapter 132, the vector potential $\mathbf{G}$ is not unique in that a gauge transformation can be
made by adding the gradient of any scalar field, that is, the gauge transformation $\mathbf{G}' = \mathbf{G} + \nabla\varphi$ gives
$$\mathbf{F} = \nabla\times\mathbf{G}' = \nabla\times(\mathbf{G} + \nabla\varphi) = \nabla\times\mathbf{G} \tag{H.37}$$
This gauge transformation to the vector potential $\mathbf{G}'$ does not change the observable vector
field $\mathbf{F}$. The magnetic field $\mathbf{B}$ is an example of a solenoidal field that can be expressed in terms of the curl
of a vector potential $\mathbf{A}$.
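The gauge freedom of the vector potential can be checked with finite differences. The following sketch assumes a hypothetical vector potential `G` and a hypothetical gauge function `phi_gauge`, both chosen arbitrarily, and compares the curl before and after the gauge transformation using central differences with an assumed step `H`.

```python
import math

H = 1e-3  # finite-difference step (an assumed value)

def G(p):
    # Hypothetical vector potential.
    x, y, z = p
    return (y * z, x * x, math.sin(x) * y)

def phi_gauge(p):
    # Hypothetical scalar gauge function.
    x, y, z = p
    return x * y * z + math.cos(y)

def shifted(p, axis, delta):
    # Copy of point p displaced by delta along one coordinate axis.
    q = list(p)
    q[axis] += delta
    return tuple(q)

def grad(f, p):
    # Central-difference gradient of a scalar function f at point p.
    return tuple((f(shifted(p, i, H)) - f(shifted(p, i, -H))) / (2 * H)
                 for i in range(3))

def curl(V, p):
    # Central-difference curl of a vector function V at point p.
    def d(i, j):
        # Derivative of component V_i with respect to coordinate j.
        return (V(shifted(p, j, H))[i] - V(shifted(p, j, -H))[i]) / (2 * H)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def G_prime(p):
    # Gauge-transformed potential G' = G + grad(phi_gauge).
    g, dphi = G(p), grad(phi_gauge, p)
    return tuple(g[i] + dphi[i] for i in range(3))

point = (0.4, -0.2, 0.9)
curl_G = curl(G, point)
curl_G_prime = curl(G_prime, point)
curl_grad = curl(lambda p: grad(phi_gauge, p), point)
print(curl_G, curl_G_prime, curl_grad)
```

The first two printed vectors should agree and the third should vanish to within rounding error, which is the numerical counterpart of $\nabla\times(\nabla\varphi) = 0$.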
∇×E=0
Therefore theorem 1 states that it is possible to express this static electric field as the gradient of the scalar
electric potential $\phi$, where
$$\mathbf{E} = -\nabla\phi$$
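$$\nabla\cdot\mathbf{E} = -\nabla^2\phi - \frac{\partial(\nabla\cdot\mathbf{A})}{\partial t} = \frac{\rho}{\epsilon_0}$$
Similarly insertion of the vector potential $\mathbf{A}$ in Ampère’s Law gives
$$\nabla\times\mathbf{B} = \nabla\times(\nabla\times\mathbf{A}) = \mu_0\mathbf{j} + \mu_0\epsilon_0\frac{\partial\mathbf{E}}{\partial t} = \mu_0\mathbf{j} - \mu_0\epsilon_0\nabla\left(\frac{\partial\phi}{\partial t}\right) - \mu_0\epsilon_0\frac{\partial^2\mathbf{A}}{\partial t^2}$$
Using the vector identity $\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$ allows the above equation to be rewritten as
$$\left(\nabla^2\mathbf{A} - \mu_0\epsilon_0\frac{\partial^2\mathbf{A}}{\partial t^2}\right) - \nabla\left(\nabla\cdot\mathbf{A} + \mu_0\epsilon_0\frac{\partial\phi}{\partial t}\right) = -\mu_0\mathbf{j}$$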
The use of the scalar potential $\phi$ and vector potential $\mathbf{A}$ leads to the two coupled equations above. These
coupled equations can be transformed into two uncoupled equations by exploiting the freedom to make a gauge
transformation for the vector potential such that the middle brackets in both equations are zero.
That is, choosing the Lorentz gauge
$$\nabla\cdot\mathbf{A} = -\mu_0\epsilon_0\frac{\partial\phi}{\partial t}$$
simplifies the two equations to be
$$\nabla^2\phi - \mu_0\epsilon_0\frac{\partial^2\phi}{\partial t^2} = -\frac{\rho}{\epsilon_0}$$
$$\nabla^2\mathbf{A} - \mu_0\epsilon_0\frac{\partial^2\mathbf{A}}{\partial t^2} = -\mu_0\mathbf{j}$$
The virtue of using the Lorentz gauge, rather than the Coulomb gauge $\nabla\cdot\mathbf{A} = 0$, is that it separates the
equations for the scalar and vector potentials. Moreover, these two equations are the wave equations for these
two potential fields corresponding to a velocity $c = \frac{1}{\sqrt{\mu_0\epsilon_0}}$. This example illustrates the power of using the
concept of potentials in describing vector fields.
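As a quick arithmetic check of the wave velocity quoted above, the sketch below evaluates $1/\sqrt{\mu_0\epsilon_0}$ in SI units. The numerical constants are the CODATA 2018 recommended values, which are supplied here as an assumption since the text does not quote them.

```python
import math

MU_0 = 1.25663706212e-6       # vacuum permeability in N/A^2 (CODATA 2018, assumed)
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m (CODATA 2018, assumed)

# Propagation velocity of the potentials in the wave equations.
c = 1.0 / math.sqrt(MU_0 * EPSILON_0)
print(c)  # in m/s
```

The result is close to $3.0\times10^{8}$ m/s, the velocity of light in vacuum.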
Appendix I
Waveform analysis
where $\omega_0$ is the lowest (fundamental) frequency solution. For an aperiodic function a cosine decomposition
can be of the form
$$F(t) = \int_0^\infty A(\omega)\cos(\omega t + \phi(\omega))\,d\omega \tag{I.2}$$
Either of the complementary functions () ⇔ (), or () ⇔ ( ) are equivalent representations of
the harmonic content that can be used to describe signals and waves. The following two sections give an
introduction to Fourier analysis.
$$F(x) = \frac{a_0}{2} + \sum_{n=0}^{\infty} d_n\sin(nx + \theta_n) \tag{I.5}$$
where $n$ is an integer, and $\theta_n$ are phase shifts fit to the initial conditions.
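The normal modes of a discrete system form a complete set of solutions that satisfy the following orthogonality
relation
$$\int_0^{2\pi} f_m(x)\,f_n(x)\,dx = \delta_{mn} \tag{I.6}$$
where $\delta_{mn}$ is the Kronecker delta symbol defined in equation (10). Orthogonality can be used to determine
the coefficients for equations (3) to be
$$a_0 = \frac{1}{\pi}\int_{-\pi}^{+\pi} F(x)\,dx \tag{I.7}$$
$$a_n = \frac{1}{\pi}\int_{-\pi}^{+\pi} F(x)\cos(nx)\,dx \tag{I.8}$$
$$b_n = \frac{1}{\pi}\int_{-\pi}^{+\pi} F(x)\sin(nx)\,dx \tag{I.9}$$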
Similarly the coefficients for (4) and (5) are related to the above coefficients by
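Instead of the simple trigonometric form used in equations (3 − 5) the cosine and sine functions can
be expanded into the exponential form where
$$\cos nx = \frac{1}{2}\left(e^{inx} + e^{-inx}\right) \tag{I.10}$$
$$\sin nx = -\frac{i}{2}\left(e^{inx} - e^{-inx}\right)$$
then equation (3) becomes
$$F(x) = \sum_{n=-\infty}^{\infty} g_n e^{inx} \tag{I.11}$$
where $n$ is any integer and, from the orthogonality, the Fourier coefficients are given by
$$g_n = \frac{1}{2\pi}\int_{-\pi}^{+\pi} F(x)\,e^{-inx}\,dx \tag{I.12}$$
These coefficients are related to the cosine plus sine series amplitudes by
$$g_n = \frac{1}{2}\left(a_n - i b_n\right) \quad (\text{when } n \text{ is positive})$$
$$g_n = \frac{1}{2}\left(a_{|n|} + i b_{|n|}\right) \quad (\text{when } n \text{ is negative})$$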
These results show that the coefficients of the exponential series are in general complex, and that they
occur in conjugate pairs (that is, the imaginary part of a coefficient $g_n$ is equal but opposite in sign to that
for the coefficient $g_{-n}$). Although the introduction of complex coefficients may appear unusual, it should
be remembered that the real part of a pair of coefficients denotes the magnitude of the cosine wave of the
relevant frequency, and that the imaginary part denotes the magnitude of the sine wave. If a particular
pair of coefficients $g_n$ and $g_{-n}$ are real, then the component at the frequency $n\omega_0$ is simply a cosine; if $g_n$
and $g_{-n}$ are purely imaginary, the component is just a sine; and if, as is the general case, $g_n$ and $g_{-n}$ are
complex, both cosine and sine terms are present.
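The relation between the exponential coefficients and the cosine and sine amplitudes can be illustrated numerically. The sketch below assumes a hypothetical waveform $F(x) = 1 + 2\cos x + 3\sin 2x$ and evaluates the coefficient integrals with a uniform-grid sum over one period; the names `a`, `b`, and `g` follow the notation used for the trigonometric and exponential series coefficients and are otherwise illustrative.

```python
import cmath
import math

def F(x):
    # Hypothetical test waveform with a_0/2 = 1, a_1 = 2 and b_2 = 3.
    return 1.0 + 2.0 * math.cos(x) + 3.0 * math.sin(2.0 * x)

N = 256
DX = 2.0 * math.pi / N
XS = [-math.pi + (k + 0.5) * DX for k in range(N)]  # midpoints covering one period

def a(n):
    # Cosine amplitude: (1/pi) * integral of F(x) cos(nx) over one period.
    return sum(F(x) * math.cos(n * x) for x in XS) * DX / math.pi

def b(n):
    # Sine amplitude: (1/pi) * integral of F(x) sin(nx) over one period.
    return sum(F(x) * math.sin(n * x) for x in XS) * DX / math.pi

def g(n):
    # Exponential coefficient: (1/2pi) * integral of F(x) exp(-inx) over one period.
    return sum(F(x) * cmath.exp(-1j * n * x) for x in XS) * DX / (2.0 * math.pi)

for n in (1, 2):
    print(n, g(n), 0.5 * (a(n) - 1j * b(n)), g(-n))
```

For this waveform the $n = 1$ pair is real, corresponding to the pure cosine term, while the $n = 2$ pair is purely imaginary, corresponding to the pure sine term, and in both cases the coefficients for $+n$ and $-n$ are complex conjugates.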
The use of the exponential form of the Fourier series gives rise to the notion of ‘negative frequency’. Of
course, $F(t) = A\cos\omega t$ is a wave of a single frequency $\omega = n\omega_0$ radians/second, and may be represented
by a single line of height $A$ in a normal spectral diagram. However, using the exponential form of the Fourier
series results in both positive and negative components.
The coexistence of both negative and positive angular frequencies $\pm\omega$ can be understood by consideration
of the Argand diagram where the real component is plotted along the $x$-axis and the imaginary component
along the $y$-axis. The function $Ae^{+i\omega t}$ represents a vector of length $A$ that rotates with an angular velocity
$\omega$ in a positive direction, that is counterclockwise, whereas $Ae^{-i\omega t}$ represents the vector rotating in a negative
direction, that is clockwise. Thus the sum of the two rotating vectors, according to equations (3), leads
to cancellation of the opposite components on the imaginary axis and addition of the two $A\cos\omega t$ real
components on the $x$ axis. Subtraction leads to cancellation of the real components and addition of the
imaginary axis components.
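where $T$ is the period of the periodic force. Let $G(\omega) = T g_n$, $\omega = n\omega_0$ and take the limit for $T \to \infty$; then
equation (12) can be written as
$$G(\omega) = \int_{-\infty}^{+\infty} F(t)\,e^{-i\omega t}\,dt \tag{I.14}$$
Similarly making the same limit for $T \to \infty$ then $\omega_0 = \frac{2\pi}{T} \to d\omega$ and equation (11) becomes
$$F(t) = \sum_{n=-\infty}^{\infty}\frac{G(\omega)}{T}\,e^{in\omega_0 t} = \sum_{n=-\infty}^{\infty} G(\omega)\,e^{in\omega_0 t}\,\frac{\omega_0}{2\pi} = \frac{1}{2\pi}\int_{-\infty}^{+\infty} G(\omega)\,e^{i\omega t}\,d\omega \tag{I.15}$$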
Equation (15) shows how a non-repetitive time-domain wave form is related to its continuous spectrum.
These are known as Fourier integrals or Fourier transforms. They are of central importance for signal
processing. For convenience the transforms often are written in the operator formalism using the $\mathcal{F}$ symbol
in the form
$$F(t) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} G(\omega)\,e^{i\omega t}\,d\omega \equiv \mathcal{F}^{-1}\left[\frac{1}{2\pi}G(\omega)\right] \tag{I.16}$$
$$G(\omega) = \int_{-\infty}^{+\infty} F(t)\,e^{-i\omega t}\,dt \equiv \mathcal{F}\,F(t) \tag{I.17}$$
It is very important to grasp the significance of these two equations. The second tells us that the Fourier
transform $G(\omega)$ of the waveform $F(t)$ is continuously distributed in the frequency range between $\omega = \pm\infty$, whereas
the first shows how, in effect, the waveform may be synthesized from an infinite set of exponential functions
of the form $e^{\pm i\omega t}$, each weighted by the relevant value of $G(\omega)$. It is crucial to realize that this transformation
can go either way equally, that is, from $F(t)$ to $G(\omega)$ or vice versa.¹
¹ The only asymmetry in the Fourier transform relations comes from the $2\pi$ factor originating from the fact that by convention
physicists use the angular frequency $\omega = 2\pi\nu$ rather than the frequency $\nu$. In order to restore symmetry many papers use the
factor $\frac{1}{\sqrt{2\pi}}$ in both relations rather than using the $\frac{1}{2\pi}$ factor in equation 16 and unity in equation 17.
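That is, assume that the amplitude of the pulse is unity between $-\frac{\tau}{2} \le t \le \frac{\tau}{2}$. Then the Fourier transform is
$$G(\omega) = \int_{-\tau/2}^{+\tau/2} 1\cdot e^{-i\omega t}\,dt = \tau\left(\frac{\sin\frac{\omega\tau}{2}}{\frac{\omega\tau}{2}}\right)$$
which is an unnormalized $\operatorname{sinc}\left(\frac{\omega\tau}{2}\right)$ function. Note that the width of the pulse $\Delta t = \pm\frac{\tau}{2}$ leads to a frequency
envelope that has the first zeros at $\Delta\omega = \pm\frac{2\pi}{\tau}$. Thus the product of these widths is $\Delta t\cdot\Delta\omega = \pi$, which is
independent of the width of the pulse, that is $\Delta\omega = \frac{\pi}{\Delta t}$, which is an example of the uncertainty principle
which is applicable to all forms of wave motion.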
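The transform of the rectangular pulse can be verified by direct numerical integration. The sketch below assumes an illustrative pulse width `TAU = 2.0` in arbitrary time units and compares a midpoint-rule evaluation of the transform integral with the closed form $\tau\,\sin(\omega\tau/2)/(\omega\tau/2)$; it also checks that the first zero occurs at $\omega = 2\pi/\tau$.

```python
import cmath
import math

TAU = 2.0  # assumed pulse width, arbitrary time units

def G_numeric(omega, n=4000):
    # Midpoint-rule integral of 1 * exp(-i omega t) over -TAU/2 <= t <= TAU/2.
    dt = TAU / n
    total = 0j
    for k in range(n):
        t = -TAU / 2 + (k + 0.5) * dt
        total += cmath.exp(-1j * omega * t) * dt
    return total

def G_analytic(omega):
    # Closed form TAU * sin(omega TAU / 2) / (omega TAU / 2), valid for omega != 0.
    half = omega * TAU / 2
    return TAU * math.sin(half) / half

first_zero = 2 * math.pi / TAU  # first zero of the frequency envelope
print(G_numeric(1.3), G_analytic(1.3))
print(abs(G_numeric(first_zero)))
print((TAU / 2) * first_zero)  # product of the two half-widths
```

The imaginary part of the numerical transform vanishes because the pulse is symmetric about $t = 0$. Repeating the calculation with a different value of `TAU` moves the first zero but leaves the product of the widths equal to $\pi$.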
The Dirac $\delta$ function, which is sometimes referred to as the impulse function, has many important applications
to physics and signal processing. For example, a shell shot from a gun is given a mechanical impulse
imparting a certain momentum to the shell in a very short time. Other things being equal, one is interested
only in the impulse imparted to the shell, that is, the time integral of the force accelerating the shell in the
gun, rather than the details of the time dependence of the force. Since the force acts for a very short time
the Dirac delta function can be employed in such problems.
As described in section 311 and appendix , the Dirac delta function is employed in signal processing
when signals are sampled for short time intervals. The Fourier transform of the delta function is needed for
discussion of sampling of signals
$$G(\omega) = \int_{-\infty}^{+\infty}\delta(t - t_0)\,e^{-i\omega t}\,dt = e^{-i\omega t_0}$$
Since $e^{-i\omega t}$ essentially is constant over the infinitesimal time duration of the $\delta(t - t_0)$ function, and the
time integral of the $\delta$ function is unity, the term $e^{-i\omega t_0}$ has unit magnitude for any value of $\omega$ and has
a phase shift of $-\omega t_0$ radians. For $t_0 = 0$ the phase shift is zero and thus the Fourier transform of a
Dirac $\delta(t)$ function is $G(\omega) = 1$. That is, this is a uniform white spectrum for all values of $\omega$.
Figure I.1: Response of an underdamped linear oscillator with $\omega = 10$ and $\Gamma = 2$ to the following impulsive
forces. (a) Step-function force, zero for $t < 0$ and constant for $t > 0$. (b) Square-wave force, constant for
$0 < t < \tau$ with $\tau = 3$ and zero at other times. (c) Delta-function impulse $P = 1$.
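$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = \frac{F(t)}{m} \tag{I.18}$$
and assume that a step function is applied at time $t = 0$. That is;
$$\frac{F(t)}{m} = 0 \quad (t < 0), \qquad \frac{F(t)}{m} = a \quad (t > 0) \tag{I.19}$$
where $a$ is a constant. The initial conditions are that $x(0) = \dot{x}(0) = 0$.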
The transient or complementary solution is the solution of the linearly-damped harmonic oscillator
$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0 \tag{I.20}$$
This is independent of the driving force and the solution is given in the chapter 35 discussion of the linearly-
damped harmonic oscillator.
The particular, steady-state, solution is easy to obtain just by inspection since the force is a constant,
that is, the particular solution is
$$x_S = 0 \quad (t < 0), \qquad x_S = \frac{a}{\omega_0^2} \quad (t > 0)$$
Taking the sum of the transient and particular solutions, using the initial conditions, gives the final solution
to be
$$x(t) = \frac{a}{\omega_0^2}\left[1 - e^{-\frac{\Gamma}{2}t}\cos\omega_1 t - \frac{\Gamma}{2\omega_1}e^{-\frac{\Gamma}{2}t}\sin\omega_1 t\right] \tag{I.21}$$
where $\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$. This functional form is shown in figure 1. Note that the
transient response at $t = 0$ is equal and opposite to the particular solution, so that it cancels the particular solution when the latter jumps to its constant value. The oscillatory
behavior then is just that of the transient response.
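The closed-form step response can be cross-checked by integrating the equation of motion directly. The sketch below assumes illustrative parameter values $\omega_0 = 10$, $\Gamma = 2$, and $a = 1$, and uses a standard fourth-order Runge-Kutta integrator started from $x(0) = \dot{x}(0) = 0$; the integrator and its step size are choices made for this example only.

```python
import math

OMEGA0, GAMMA, A = 10.0, 2.0, 1.0  # assumed illustrative values
OMEGA1 = math.sqrt(OMEGA0 ** 2 - (GAMMA / 2) ** 2)

def x_exact(t):
    # Closed-form response to a step in F/m of size A applied at t = 0.
    decay = math.exp(-GAMMA * t / 2)
    return (A / OMEGA0 ** 2) * (1.0 - decay * math.cos(OMEGA1 * t)
                                - (GAMMA / (2 * OMEGA1)) * decay * math.sin(OMEGA1 * t))

def deriv(x, v):
    # Equation of motion for t > 0: x'' = A - GAMMA * x' - OMEGA0^2 * x.
    return v, A - GAMMA * v - OMEGA0 ** 2 * x

def x_rk4(t_end, dt=1e-3):
    # Fourth-order Runge-Kutta integration from rest at the origin.
    x, v = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = deriv(x, v)
        k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x

print(x_exact(1.0), x_rk4(1.0))
print(x_exact(20.0), A / OMEGA0 ** 2)  # long-time limit approaches the particular solution
```

The two values in the first line should agree closely, and the second line shows the displacement settling at the constant particular solution once the transient has decayed.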
A square impulse can be generated by the superposition of two opposite-sign step functions separated by
a time $\tau$ as shown in figure 1.
The square impulse can be taken to the limit where the width $\tau$ is negligibly small relative to the response
times of the system. It can be shown that letting $\tau \to 0$ but keeping the magnitude of the total impulse
$P = F\tau$ finite for the impulse at time $t_0$, leads to the solution for the $\delta$-function impulse occurring at $t_0$
$$x(t) = \frac{P}{m\omega_1}\,e^{-\frac{\Gamma}{2}(t - t_0)}\sin\omega_1(t - t_0) \qquad t > t_0 \tag{I.22}$$
This response to a delta function impulse is shown in figure 1 for the case where $t_0 = 0$. An example is
the response when the hammer strikes a piano string at $t = 0$.
Figure I.2: Decomposition of the function $F(t) = 2\sin(\omega t) + \sin(5\omega t) + \frac{1}{3}\sin(15\omega t) + \frac{1}{5}\sin(25\omega t)$ into a time-ordered
sequence of $\delta$-function samples.
As illustrated in figure 2, discrete-time waveform analysis involves repeatedly sampling the instantaneous
amplitude in a regular and repetitive sequence of $\delta$-function impulses. Since the superposition principle
applies for this linear system, the waveform can be described by a sum of an ordered series of delta-function
impulses where $t'$ is the time of an impulse. Integrating over all the $\delta$-function responses that have
occurred at time $t'$, that is prior to the time of interest $t$, leads to
$$x(t) = \int_{-\infty}^{t}\frac{F(t')}{m\omega_1}\,e^{-\frac{\Gamma}{2}(t - t')}\sin\omega_1(t - t')\,dt' \qquad t \ge t' \tag{I.24}$$
Superposition allows the summed response of the system to be written in an integral form
$$x(t) = \int_{-\infty}^{t} F(t')\,G(t - t')\,dt' \tag{I.26}$$
which gives the final time dependence of the forced system. This repetitive time-sampling approach avoids
the need of using Fourier analysis. Note that the Green’s function $G(t - t')$ includes implicitly the frequency
of the free undamped linear oscillator $\omega_0$, the frequency of the free damped linear oscillator $\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$, as well as the
damping coefficient $\Gamma$. Access to the combination of fast microcomputers coupled to fast digital sampling
techniques has made digital signal sampling the pre-eminent technique for signal recording of audio, video,
and detector signal processing.
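The superposition integral can be illustrated by summing delta-function responses for a simple force history. The sketch below assumes the impulse response per unit impulse and unit mass has the damped-sine form $e^{-\Gamma s/2}\sin(\omega_1 s)/\omega_1$, takes the force per unit mass to be a step of size `A` switched on at $t = 0$, and uses the illustrative values $\omega_0 = 10$ and $\Gamma = 2$. The summed response is compared with the closed-form step response.

```python
import math

OMEGA0, GAMMA, A = 10.0, 2.0, 1.0  # assumed illustrative values
OMEGA1 = math.sqrt(OMEGA0 ** 2 - (GAMMA / 2) ** 2)

def green(s):
    # Assumed response at delay s >= 0 to a unit impulse per unit mass.
    return math.exp(-GAMMA * s / 2) * math.sin(OMEGA1 * s) / OMEGA1

def force_per_mass(t):
    # Step function: zero before t = 0, constant A afterwards.
    return A if t > 0.0 else 0.0

def response(t, n=4000):
    # Midpoint-rule sum over impulses delivered at times t' between 0 and t.
    dt = t / n
    total = 0.0
    for k in range(n):
        t_prime = (k + 0.5) * dt
        total += force_per_mass(t_prime) * green(t - t_prime) * dt
    return total

def step_exact(t):
    # Closed-form step response used for comparison.
    decay = math.exp(-GAMMA * t / 2)
    return (A / OMEGA0 ** 2) * (1.0 - decay * math.cos(OMEGA1 * t)
                                - (GAMMA / (2 * OMEGA1)) * decay * math.sin(OMEGA1 * t))

print(response(0.7), step_exact(0.7))
```

Replacing `force_per_mass` with any other sampled force history gives the corresponding response from the same sum, which is the practical appeal of the time-sampling approach.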
Bibliography
[Ki85] T.W.B. Kibble, F.H. Berkshire. “Classical Mechanics, (5th edition)”, Imperial College Press,
London, 2004. Based on the textbook written by Kibble that was published in 1966 by McGraw-
Hill. The 4th and 5th editions were published jointly by Kibble and Berkshire. This excellent
and well-established textbook addresses the same undergraduate student audience as the present
textbook. This book covers the variational principles and applications with minimal discussion of
the philosophical implications of the variational approach.
[La49] C. Lanczos, “The Variational Principles of Mechanics”, University of Toronto Press, Toronto,
(1949)
An outstanding graduate textbook that has been one of the founding pillars of the field since
1949. It gives an excellent introduction to the philosophical aspects of the variational approach
to classical mechanics, and introduces the extended formulations of Lagrangian and Hamiltonian
mechanics that are applicable to relativistic mechanics.
[La60] L. D. Landau, E. M. Lifshitz, “Mechanics”, Volume 1 of a Course in Theoretical Physics, Perga-
mon Press (1960)
An outstanding, succinct, description of analytical mechanics that is devoid of any superfluous
text. This Course in Theoretical Physics is a masterpiece of scientific writing and is an essential
component of any physics library. The compactness and lack of examples makes this textbook
less suitable for most undergraduate students.
[Li94] Yung-Kuo Lim, “Problems and Solutions on Mechanics” (1994)
This compendium of 408 solved problems, which are taken from graduate qualifying examinations
in physics at several U.S. universities, provides an invaluable resource that complements this
textbook for study of Lagrangian and Hamiltonian mechanics.
[Ma65] J. B. Marion, “Classical Dynamics of Particles and Systems”, Academic Press, New York, (1965)
This excellent undergraduate text played a major role in introducing analytical mechanics to
the undergraduate curriculum. It has an outstanding collection of challenging problems. The 5th
edition has been published by S. T. Thornton and J. B. Marion, Thomson, Belmont, (2004).
[Me70] L. Meirovitch, “Methods of Analytical Dynamics”, McGraw-Hill New York, (1970)
An advanced engineering textbook that emphasizes solving practical problems, rather than the
underlying theory.
[Mu08] H. J. W. Müller-Kirsten, “Classical Mechanics and Relativity”, World Scientific, Singapore, (2008)
This modern graduate-level textbook emphasizes relativistic mechanics making it an excellent
complement to the present textbook.
[Pe82] I. Percival and D. Richards, “Introduction to Dynamics” Cambridge University Press, London,
(1982)
Provides a clear presentation of Lagrangian and Hamiltonian mechanics, including canonical
transformations, Hamilton-Jacobi theory, and action-angle variables.
[Sy60] J.L. Synge, “Principles of Classical Mechanics and Field Theory”, Volume III/I of “Handbuch
der Physik”, Springer-Verlag, Berlin (1960).
A classic graduate-level presentation of analytical mechanics.
[Ta05] J. R. Taylor, “Classical Mechanics”, University Science Books, Sausalito, (2006)
This undergraduate book gives a well-written descriptive introduction to analytical mechanics.
The scope of the book is limited and the problems are easy.
[Ray1881] J.W. Strutt, 3rd Baron Rayleigh, Proc. London Math. Soc., s1-4 (1), (1881) 357
[Ray1887] J.W. Strutt, 3rd Baron Rayleigh, The Theory of Sound, 1887 (Macmillan, London)
[Rou1860] E.J. Routh, Treatise on the dynamics of a system of rigid bodies, MacMillan (1860)
[Sim98] M. Simon, D. Cline, K. Vetter, et al, Unpublished
[Sta05] T. Stachowiak and T. Okada, Chaos, Solitons, and Fractals, 29 (2006) 417.
[Str00] S.H. Strogatz, Physica D43 (2000) 1
[Str05] J. Struckmeier, J. Phys. A: Math. Gen. 38 (2005) 1257
[Str08] J. Struckmeier, Int. J. of Mod. Phys. E18 (2008) 79
[Vir15] E.G. Virga, Phys. Rev. E91 (2015) 013203
[Win67] A.T. Winfree, J. Theoretical Biology 16 (1967) 15
Douglas Cline