VARIATIONAL PRINCIPLES

IN
CLASSICAL MECHANICS
SECOND EDITION

Douglas Cline
University of Rochester

24 November 2018

© 2018, 2017 by Douglas Cline

ISBN: 978-0-9988372-6-0 e-book (Adobe PDF)

ISBN: 978-0-9988372-7-7 print (Paperback)

Variational Principles in Classical Mechanics, 2nd edition

Contributors
Author: Douglas Cline
Illustrator: Meghan Sarkis

Published by University of Rochester River Campus Libraries

University of Rochester
Rochester, NY 14627

Variational Principles in Classical Mechanics, 2nd edition by Douglas Cline is licensed under a Creative
Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0), except
where otherwise noted.

You are free to:

• Share — copy or redistribute the material in any medium or format.

• Adapt — remix, transform, and build upon the material.

Under the following terms:

• Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes
were made. You must do so in any reasonable manner, but not in any way that suggests the licensor
endorses you or your use.
• NonCommercial — You may not use the material for commercial purposes.
• ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions
under the same license as the original.
• No additional restrictions — You may not apply legal terms or technological measures that legally
restrict others from doing anything the license permits.

The licensor cannot revoke these freedoms as long as you follow the license terms.

Version 2.0
Contents

Contents iii

Preface xvii

Prologue xix

1 A brief history of classical mechanics 1

1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Greek antiquity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Middle Ages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 Age of Enlightenment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.5 Variational methods in physics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.6 The 20th century revolution in physics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2 Review of Newtonian mechanics 9

2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Newton’s Laws of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3 Inertial frames of reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.4 First-order integrals in Newtonian mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4.1 Linear Momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4.2 Angular momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4.3 Kinetic energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.5 Conservation laws in classical mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.6 Motion of finite-sized and many-body systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.7 Center of mass of a many-body system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.8 Total linear momentum of a many-body system . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.8.1 Center-of-mass decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.8.2 Equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.9 Angular momentum of a many-body system . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.9.1 Center-of-mass decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.9.2 Equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.10 Work and kinetic energy for a many-body system . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.10.1 Center-of-mass kinetic energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.10.2 Conservative forces and potential energy . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.10.3 Total mechanical energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.10.4 Total mechanical energy for conservative systems . . . . . . . . . . . . . . . . . . . . . 20
2.11 Virial Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.12 Applications of Newton’s equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.12.1 Constant force problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.12.2 Linear Restoring Force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.12.3 Position-dependent conservative forces . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.12.4 Constrained motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.12.5 Velocity Dependent Forces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.12.6 Systems with Variable Mass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.12.7 Rigid-body rotation about a body-fixed rotation axis . . . . . . . . . . . . . . . . . . . 31
2.12.8 Time dependent forces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.13 Solution of many-body equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.13.1 Analytic solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.13.2 Successive approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.13.3 Perturbation method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.14 Newton’s Law of Gravitation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.14.1 Gravitational and inertial mass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.14.2 Gravitational potential energy $U$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.14.3 Gravitational potential $\phi$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.14.4 Potential theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.14.5 Curl of the gravitational field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.14.6 Gauss’s Law for Gravitation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.14.7 Condensed forms of Newton’s Law of Gravitation . . . . . . . . . . . . . . . . . . . . . 44
2.15 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
Workshop exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

3 Linear oscillators 53
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.2 Linear restoring forces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.3 Linearity and superposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.4 Geometrical representations of dynamical motion . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4.1 Configuration space $(q_i, q_j, t)$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.4.2 State space, $(q_i, \dot{q}_i, t)$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.4.3 Phase space, $(q_i, p_i, t)$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.4.4 Plane pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.5 Linearly-damped free linear oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.5.1 General solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.5.2 Energy dissipation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.6 Sinusoidally-driven, linearly-damped, linear oscillator . . . . . . . . . . . . . . . . . . . . . . . 62
3.6.1 Transient response of a driven oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.6.2 Steady state response of a driven oscillator . . . . . . . . . . . . . . . . . . . . . . . . 63
3.6.3 Complete solution of the driven oscillator . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.6.4 Resonance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.6.5 Energy absorption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.7 Wave equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.8 Travelling and standing wave solutions of the wave equation . . . . . . . . . . . . . . . . . . . 69
3.9 Waveform analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.9.1 Harmonic decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.9.2 The free linearly-damped linear oscillator . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.9.3 Damped linear oscillator subject to an arbitrary periodic force . . . . . . . . . . . . . 71
3.10 Signal processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.11 Wave propagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.11.1 Phase, group, and signal velocities of wave packets . . . . . . . . . . . . . . . . . . . . 74
3.11.2 Fourier transform of wave packets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.11.3 Wave-packet Uncertainty Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Workshop exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

4 Nonlinear systems and chaos 89

4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.2 Weak nonlinearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.3 Bifurcation, and point attractors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.4 Limit cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.4.1 Poincaré-Bendixson theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.4.2 van der Pol damped harmonic oscillator: . . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.5 Harmonically-driven, linearly-damped, plane pendulum . . . . . . . . . . . . . . . . . . . . . . 97
4.5.1 Close to linearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.5.2 Weak nonlinearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.5.3 Onset of complication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.5.4 Period doubling and bifurcation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.5.5 Rolling motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.5.6 Onset of chaos . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.6 Differentiation between ordered and chaotic motion . . . . . . . . . . . . . . . . . . . . . . . . 102
4.6.1 Lyapunov exponent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.6.2 Bifurcation diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.6.3 Poincaré Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.7 Wave propagation for non-linear systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.7.1 Phase, group, and signal velocities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.7.2 Soliton wave propagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
Workshop exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

5 Calculus of variations 111

5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.2 Euler’s differential equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
5.3 Applications of Euler’s equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
5.4 Selection of the independent variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
5.5 Functions with several independent variables $y_i(x)$ . . . . . . . . . . . . . . . . . . . . . . . . 119
5.6 Euler’s integral equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
5.7 Constrained variational systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
5.7.1 Holonomic constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
5.7.2 Geometric (algebraic) equations of constraint . . . . . . . . . . . . . . . . . . . . . . . 122
5.7.3 Kinematic (differential) equations of constraint . . . . . . . . . . . . . . . . . . . . . . 122
5.7.4 Isoperimetric (integral) equations of constraint . . . . . . . . . . . . . . . . . . . . . . 123
5.7.5 Properties of the constraint equations . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
5.7.6 Treatment of constraint forces in variational calculus . . . . . . . . . . . . . . . . . . . 124
5.8 Generalized coordinates in variational calculus . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5.9 Lagrange multipliers for holonomic constraints . . . . . . . . . . . . . . . . . . . . . . . . . . 126
5.9.1 Algebraic equations of constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
5.9.2 Integral equations of constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
5.10 Geodesic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
5.11 Variational approach to classical mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
5.12 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
Workshop exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

6 Lagrangian dynamics 135

6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
6.2 Newtonian plausibility argument for Lagrangian mechanics . . . . . . . . . . . . . . . . . . . 136
6.3 Lagrange equations from d’Alembert’s Principle . . . . . . . . . . . . . . . . . . . . . . . . . . 138
6.3.1 d’Alembert’s Principle of Virtual Work . . . . . . . . . . . . . . . . . . . . . . . . . . 138
6.3.2 Transformation to generalized coordinates . . . . . . . . . . . . . . . . . . . . . . . . . 139
6.3.3 Lagrangian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
6.4 Lagrange equations from Hamilton’s Action Principle . . . . . . . . . . . . . . . . . . . . . . 141
6.5 Constrained systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.5.1 Choice of generalized coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.5.2 Minimal set of generalized coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.5.3 Lagrange multipliers approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.5.4 Generalized forces approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
6.6 Applying the Euler-Lagrange equations to classical mechanics . . . . . . . . . . . . . . . . . . 144
6.7 Applications to unconstrained systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.8 Applications to systems involving holonomic constraints . . . . . . . . . . . . . . . . . . . . . 148
6.9 Applications involving non-holonomic constraints . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.10 Velocity-dependent Lorentz force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
6.11 Time-dependent forces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
6.12 Impulsive forces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
6.13 The Lagrangian versus the Newtonian approach to classical mechanics . . . . . . . . . . . . . 172
6.14 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
Workshop exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178

7 Symmetries, Invariance and the Hamiltonian 179

7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
7.2 Generalized momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
7.3 Invariant transformations and Noether’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . 181
7.4 Rotational invariance and conservation of angular momentum . . . . . . . . . . . . . . . . . . 183
7.5 Cyclic coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
7.6 Kinetic energy in generalized coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
7.7 Generalized energy and the Hamiltonian function . . . . . . . . . . . . . . . . . . . . . . . . . 186
7.8 Generalized energy theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
7.9 Generalized energy and total energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
7.10 Hamiltonian invariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
7.11 Hamiltonian for cyclic coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
7.12 Symmetries and invariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
7.13 Hamiltonian in classical mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
7.14 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
Workshop exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197

8 Hamiltonian mechanics 199

8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
8.2 Legendre Transformation between Lagrangian and Hamiltonian mechanics . . . . . . . . . . . 200
8.3 Hamilton’s equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
8.3.1 Canonical equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
8.4 Hamiltonian in different coordinate systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
8.4.1 Cylindrical coordinates $\rho, z, \phi$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
8.4.2 Spherical coordinates, $r, \theta, \phi$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
8.5 Applications of Hamiltonian Dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
8.6 Routhian reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
8.6.1 $R_{cyclic}$ - Routhian is a Hamiltonian for the cyclic variables . . . . . . . . . . . . . . . . 211
8.6.2 $R_{noncyclic}$ - Routhian is a Hamiltonian for the non-cyclic variables . . . . . . . . . . . 212
8.7 Variable-mass systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
8.7.1 Rocket propulsion: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
8.7.2 Moving chains: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
8.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
Workshop exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222

9 Hamilton’s Action Principle 225

9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
9.2 Hamilton’s Principle of Stationary Action
9.2.1 Stationary-action principle in Lagrangian mechanics . . . . . . . . . . . . . . . . . . . 226
9.2.2 Stationary-action principle in Hamiltonian mechanics . . . . . . . . . . . . . . . . . . 227
9.2.3 Abbreviated action . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
9.2.4 Hamilton’s Principle applied using initial boundary conditions . . . . . . . . . . . . . 229
9.3 Lagrangian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
9.3.1 Standard Lagrangian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
9.3.2 Gauge invariance of the standard Lagrangian . . . . . . . . . . . . . . . . . . . . . . . 232
9.3.3 Non-standard Lagrangians . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
9.3.4 Inverse variational calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
9.4 Application of Hamilton’s Action Principle to mechanics . . . . . . . . . . . . . . . . . . . . . 235
9.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236

10 Nonconservative systems 239

10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
10.2 Origins of nonconservative motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
10.3 Algebraic mechanics for nonconservative systems . . . . . . . . . . . . . . . . . . . . . . . . . 240
10.4 Rayleigh’s dissipation function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
10.4.1 Generalized dissipative forces for linear velocity dependence . . . . . . . . . . . . . . . 241
10.4.2 Generalized dissipative forces for nonlinear velocity dependence . . . . . . . . . . . . . 242
10.4.3 Lagrange equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
10.4.4 Hamiltonian mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
10.5 Dissipative Lagrangians . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
10.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247

11 Conservative two-body central forces 249

11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
11.2 Equivalent one-body representation for two-body motion . . . . . . . . . . . . . . . . . . . . . 250
11.3 Angular momentum L . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
11.4 Equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
11.5 Differential orbit equation: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
11.6 Hamiltonian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
11.7 General features of the orbit solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
11.8 Inverse-square, two-body, central force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
11.8.1 Bound orbits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
11.8.2 Kepler’s laws for bound planetary motion . . . . . . . . . . . . . . . . . . . . . . . . . 259
11.8.3 Unbound orbits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
11.8.4 Eccentricity vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
11.9 Isotropic, linear, two-body, central force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
11.9.1 Polar coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
11.9.2 Cartesian coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
11.9.3 Symmetry tensor $\mathbf{A}'$ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
11.10 Closed-orbit stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
11.11 The three-body problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
11.12 Two-body scattering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
11.12.1 Total two-body scattering cross section . . . . . . . . . . . . . . . . . . . . . . . . . . 273
11.12.2 Differential two-body scattering cross section . . . . . . . . . . . . . . . . . . . . . . . 274
11.12.3 Impact parameter dependence on scattering angle . . . . . . . . . . . . . . . . . . . . 274
11.12.4 Rutherford scattering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
11.13 Two-body kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
11.14 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
Workshop exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287

12 Non-inertial reference frames 289

12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
12.2 Translational acceleration of a reference frame . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
12.3 Rotating reference frame . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
12.3.1 Spatial time derivatives in a rotating, non-translating, reference frame . . . . . . . . . 290
12.3.2 General vector in a rotating, non-translating, reference frame . . . . . . . . . . . . . . 291
12.4 Reference frame undergoing rotation plus translation . . . . . . . . . . . . . . . . . . . . . . . 292
12.5 Newton’s law of motion in a non-inertial frame . . . . . . . . . . . . . . . . . . . . . . . . . . 292
12.6 Lagrangian mechanics in a non-inertial frame . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
12.7 Centrifugal force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
12.8 Coriolis force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
12.9 Routhian reduction for rotating systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
12.10 Effective gravitational force near the surface of the Earth . . . . . . . . . . . . . . . . . . . . 302
12.11 Free motion on the earth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
12.12 Weather systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
12.12.1 Low-pressure systems: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
12.12.2 High-pressure systems: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
12.13 Foucault pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
12.14 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
Workshop exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312

13 Rigid-body rotation 313

13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
13.2 Rigid-body coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
13.3 Rigid-body rotation about a body-fixed point . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
13.4 Inertia tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
13.5 Matrix and tensor formulations of rigid-body rotation . . . . . . . . . . . . . . . . . . . . . . 317
13.6 Principal axis system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
13.7 Diagonalize the inertia tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
13.8 Parallel-axis theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
13.9 Perpendicular-axis theorem for plane laminae . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
13.10 General properties of the inertia tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
13.10.1 Inertial equivalence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
13.10.2 Orthogonality of principal axes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
13.11 Angular momentum L and angular velocity ω vectors . . . . . . . . . . . . . . . . . . . . . . 325
13.12 Kinetic energy of rotating rigid body . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
13.13 Euler angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
13.14 Angular velocity ω . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
13.15 Kinetic energy in terms of Euler angular velocities . . . . . . . . . . . . . . . . . . . . . . . . 332
13.16 Rotational invariants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
13.17 Euler’s equations of motion for rigid-body rotation . . . . . . . . . . . . . . . . . . . . . . . . 334
13.18 Lagrange equations of motion for rigid-body rotation . . . . . . . . . . . . . . . . . . . . . . . 335
13.19 Hamiltonian equations of motion for rigid-body rotation . . . . . . . . . . . . . . . . . . . . . 337
13.20 Torque-free rotation of an inertially-symmetric rigid rotor . . . . . . . . . . . . . . . . . . . . 337
13.20.1 Euler’s equations of motion: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
13.20.2 Lagrange equations of motion: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
13.21 Torque-free rotation of an asymmetric rigid rotor . . . . . . . . . . . . . . . . . . . . . . . . . 343
13.22 Stability of torque-free rotation of an asymmetric body . . . . . . . . . . . . . . . . . . . . . . 344
13.23 Symmetric rigid rotor subject to torque about a fixed point . . . . . . . . . . . . . . . . . . . 347
13.24 The rolling wheel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
13.25 Dynamic balancing of wheels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
13.26 Rotation of deformable bodies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
13.27 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
Workshop exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362

14 Coupled linear oscillators 363

14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
14.2 Two coupled linear oscillators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
14.3 Normal modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
14.4 Center of mass oscillations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
14.5 Weak coupling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
14.6 General analytic theory for coupled linear oscillators . . . . . . . . . . . . . . . . . . . . . . . 369
14.6.1 Kinetic energy tensor T . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
14.6.2 Potential energy tensor V . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370
14.6.3 Equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
14.6.4 Superposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
14.6.5 Eigenfunction orthonormality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
14.6.6 Normal coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
14.7 Two-body coupled oscillator systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
14.8 Three-body coupled linear oscillator systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
14.9 Molecular coupled oscillator systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
14.10 Discrete Lattice Chain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388
14.10.1 Longitudinal motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388
14.10.2 Transverse motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388
14.10.3 Normal modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
14.10.4 Travelling waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
14.10.5 Dispersion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
14.10.6 Complex wavenumber . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393
14.11 Damped coupled linear oscillators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394
14.12 Collective synchronization of coupled oscillators . . . . . . . . . . . . . . . . . . . . . . . . . . 395
14.13 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
Workshop exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402

15 Advanced Hamiltonian mechanics 403

15.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
15.2 Poisson bracket representation of Hamiltonian mechanics . . . . . . . . . . . . . . . . . . . . 405
15.2.1 Poisson Brackets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
15.2.2 Fundamental Poisson brackets: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
15.2.3 Poisson bracket invariance to canonical transformations . . . . . . . . . . . . . . . . . 406
15.2.4 Correspondence of the commutator and the Poisson Bracket . . . . . . . . . . . . . . . 407
15.2.5 Observables in Hamiltonian mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . 408
15.2.6 Hamilton’s equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
15.2.7 Liouville’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
15.3 Canonical transformations in Hamiltonian mechanics . . . . . . . . . . . . . . . . . . . . . . . 417
15.3.1 Generating functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
15.3.2 Applications of canonical transformations . . . . . . . . . . . . . . . . . . . . . . . . . 420
15.4 Hamilton-Jacobi theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
15.4.1 Time-dependent Hamiltonian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
15.4.2 Time-independent Hamiltonian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424
15.4.3 Separation of variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
15.4.4 Visual representation of the action function . . . . . . . . . . . . . . . . . . . . . . . 432
15.4.5 Advantages of Hamilton-Jacobi theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 432
15.5 Action-angle variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
15.5.1 Canonical transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 433
15.5.2 Adiabatic invariance of the action variables . . . . . . . . . . . . . . . . . . . . . . . . 436
15.6 Canonical perturbation theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
15.7 Symplectic representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
15.8 Comparison of the Lagrangian and Hamiltonian formulations . . . . . . . . . . . . . . . . . . 440
15.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442
Workshop exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446

16 Analytical formulations for continuous systems 447

16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
16.2 The continuous uniform linear chain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
16.3 The Lagrangian density formulation for continuous systems . . . . . . . . . . . . . . . . . . . 448
16.3.1 One spatial dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448
16.3.2 Three spatial dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
16.4 The Hamiltonian density formulation for continuous systems . . . . . . . . . . . . . . . . . . 450
16.5 Linear elastic solids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
16.5.1 Stress tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
16.5.2 Strain tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
16.5.3 Moduli of elasticity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
16.5.4 Equations of motion in a uniform elastic medium . . . . . . . . . . . . . . . . . . . . . . 454
16.6 Electromagnetic field theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
16.6.1 Maxwell stress tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
16.6.2 Momentum in the electromagnetic field . . . . . . . . . . . . . . . . . . . . . . . . . . 456
16.7 Ideal fluid dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
16.7.1 Continuity equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
16.7.2 Euler’s hydrodynamic equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
16.7.3 Irrotational flow and Bernoulli’s equation . . . . . . . . . . . . . . . . . . . . . . . . . 458
16.7.4 Gas flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
16.8 Viscous fluid dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460
16.8.1 Navier-Stokes equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460
16.8.2 Reynolds number . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 461
16.8.3 Laminar and turbulent fluid flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 461
16.9 Summary and implications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463

17 Relativistic mechanics 465

17.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
17.2 Galilean Invariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 465
17.3 Special Theory of Relativity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
17.3.1 Einstein Postulates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
17.3.2 Lorentz transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
17.3.3 Time Dilation: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 468
17.3.4 Length Contraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
17.3.5 Simultaneity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
17.4 Relativistic kinematics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
17.4.1 Velocity transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
17.4.2 Momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
17.4.3 Center of momentum coordinate system . . . . . . . . . . . . . . . . . . . . . . . . . . 473
17.4.4 Force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
17.4.5 Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
17.5 Geometry of space-time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475
17.5.1 Four-dimensional space-time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475
17.5.2 Four-vector scalar products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476
17.5.3 Minkowski space-time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
17.5.4 Momentum-energy four vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 478
17.6 Lorentz-invariant formulation of Lagrangian mechanics . . . . . . . . . . . . . . . . . . . . . . 479
17.6.1 Parametric formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
17.6.2 Extended Lagrangian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
17.6.3 Extended generalized momenta . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 481
17.6.4 Extended Lagrange equations of motion . . . . . . . . . . . . . . . . . . . . . . . . . . 481
17.7 Lorentz-invariant formulations of Hamiltonian mechanics . . . . . . . . . . . . . . . . . . . . . 484
17.7.1 Extended canonical formalism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484
17.7.2 Extended Poisson Bracket representation . . . . . . . . . . . . . . . . . . . . . . . . . 486
17.7.3 Extended canonical transformation and Hamilton-Jacobi theory . . . . . . . . . . . . . 486
17.7.4 Validity of the extended Hamilton-Lagrange formalism . . . . . . . . . . . . . . . . . . 486
17.8 The General Theory of Relativity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 488
17.8.1 The fundamental concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 488
17.8.2 Einstein’s postulates for the General Theory of Relativity . . . . . . . . . . . . . . . . 489
17.8.3 Experimental evidence in support of the General Theory of Relativity . . . . . . . . . 489
17.9 Implications of relativistic theory to classical mechanics . . . . . . . . . . . . . . . . . . . . . 490
17.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
Workshop exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492

18 The transition to quantum physics 493

18.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493
18.2 Brief summary of the origins of quantum theory . . . . . . . . . . . . . . . . . . . . . . . . . 493
18.2.1 Bohr model of the atom . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
18.2.2 Quantization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495
18.2.3 Wave-particle duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
18.3 Hamiltonian in quantum theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497
18.3.1 Heisenberg’s matrix-mechanics representation . . . . . . . . . . . . . . . . . . . . . . . 497
18.3.2 Schrödinger’s wave-mechanics representation . . . . . . . . . . . . . . . . . . . . . . . 499
18.4 Lagrangian representation in quantum theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 500
18.5 Correspondence Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501
18.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502

19 Epilogue 503

Appendices

A Matrix algebra 505

A.1 Mathematical methods for mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
A.2 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
A.3 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509
A.4 Reduction of a matrix to diagonal form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511

B Vector algebra 515

B.1 Linear operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515
B.2 Scalar product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515
B.3 Vector product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516
B.4 Triple products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517

C Orthogonal coordinate systems 519

C.1 Cartesian coordinates (x, y, z) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 519
C.2 Curvilinear coordinate systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 519
C.2.1 Two-dimensional polar coordinates (r, θ) . . . . . . . . . . . . . . . . . . . . . . . . . . 520
C.2.2 Cylindrical Coordinates (ρ, φ, z) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 522
C.2.3 Spherical Coordinates (r, θ, φ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 522
C.3 Frenet-Serret coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 523

D Coordinate transformations 525

D.1 Translational transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
D.2 Rotational transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
D.2.1 Rotation matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
D.2.2 Finite rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528
D.2.3 Infinitesimal rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529
D.2.4 Proper and improper rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529
D.3 Spatial inversion transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 530
D.4 Time reversal transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531

E Tensor algebra 533

E.1 Tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
E.2 Tensor products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
E.2.1 Tensor outer product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
E.2.2 Tensor inner product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
E.3 Tensor properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
E.4 Contravariant and covariant tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 536
E.5 Generalized inner product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537
E.6 Transformation properties of observables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 538

F Aspects of multivariate calculus 539

F.1 Partial differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539
F.2 Linear operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539
F.3 Transformation Jacobian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541
F.3.1 Transformation of integrals: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541
F.3.2 Transformation of differential equations: . . . . . . . . . . . . . . . . . . . . . . . . . . 541
F.3.3 Properties of the Jacobian: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541
F.4 Legendre transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542

G Vector differential calculus 543

G.1 Scalar differential operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
G.1.1 Scalar field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
G.1.2 Vector field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
G.2 Vector differential operators in cartesian coordinates . . . . . . . . . . . . . . . . . . . . . . . 543
G.2.1 Scalar field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
G.2.2 Vector field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 544
G.3 Vector differential operators in curvilinear coordinates . . . . . . . . . . . . . . . . . . . . . . 545
G.3.1 Gradient: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 545
G.3.2 Divergence: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 546
G.3.3 Curl: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 546
G.3.4 Laplacian: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 546

H Vector integral calculus 547

H.1 Line integral of the gradient of a scalar field . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
H.2 Divergence theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
H.2.1 Flux of a vector field for Gaussian surface . . . . . . . . . . . . . . . . . . . . . . . . . 547
H.2.2 Divergence in cartesian coordinates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 548
H.3 Stokes Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 550
H.3.1 The curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 550
H.3.2 Curl in cartesian coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551
H.4 Potential formulations of curl-free and divergence-free fields . . . . . . . . . . . . . . . . . . . 553

I Waveform analysis 555

I.1 Harmonic waveform decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555
I.1.1 Periodic systems and the Fourier series . . . . . . . . . . . . . . . . . . . . . . . . . . . 555
I.1.2 Aperiodic systems and the Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . 557
I.2 Time-sampled waveform analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 558
I.2.1 Delta-function impulse response . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559
I.2.2 Green’s function waveform decomposition . . . . . . . . . . . . . . . . . . . . . . . . . 560

Bibliography 561

Index 565
Examples

2.1 Example: Exploding cannon shell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.2 Example: Billiard-ball collisions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3 Example: Bolas thrown by gaucho . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4 Example: Central force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5 Example: The ideal gas law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.6 Example: The mass of galaxies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.7 Example: Diatomic molecule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.8 Example: Roller coaster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.9 Example: Vertical fall in the earth’s gravitational field. . . . . . . . . . . . . . . . . . . . . . . . . 28
2.10 Example: Projectile motion in air . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.11 Example: Moment of inertia of a thin door . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.12 Example: Merry-go-round . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.13 Example: Cue pushes a billiard ball . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.14 Example: Center of percussion of a baseball bat . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.15 Example: Energy transfer in charged-particle scattering . . . . . . . . . . . . . . . . . . . . . . . 36
2.16 Example: Field of a uniform sphere . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.1 Example: Harmonically-driven series RLC circuit . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.2 Example: Vibration isolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.3 Example: Water waves breaking on a beach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
3.4 Example: Surface waves for deep water . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
3.5 Example: Electromagnetic waves in ionosphere . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.6 Example: Fourier transform of a Gaussian wave packet: . . . . . . . . . . . . . . . . . . . . . . . 79
3.7 Example: Fourier transform of a rectangular wave packet: . . . . . . . . . . . . . . . . . . . . . . 79
3.8 Example: Acoustic wave packet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.9 Example: Gravitational red shift . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.10 Example: Quantum baseball . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.1 Example: Non-linear oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.1 Example: Shortest distance between two points . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
5.2 Example: Brachistochrone problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
5.3 Example: Minimal travel cost . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
5.4 Example: Surface area of a cylindrically-symmetric soap bubble . . . . . . . . . . . . . . . . . . . 117
5.5 Example: Fermat’s Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
5.6 Example: Minimum of (∇φ)² in a volume . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
5.7 Example: Two dependent variables coupled by one holonomic constraint . . . . . . . . . . . . . . 127
5.8 Example: Catenary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.9 Example: The Queen Dido problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.1 Example: Motion of a free particle, U=0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.2 Example: Motion in a uniform gravitational field . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
6.3 Example: Central forces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
6.4 Example: Disk rolling on an inclined plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
6.5 Example: Two connected masses on frictionless inclined planes . . . . . . . . . . . . . . . . . . . 151
6.6 Example: Two blocks connected by a frictionless bar . . . . . . . . . . . . . . . . . . . . . . . . . 152
6.7 Example: Block sliding on a movable frictionless inclined plane . . . . . . . . . . . . . . . . . . . 153
6.8 Example: Sphere rolling without slipping down an inclined plane on a frictionless floor. . . . . . 154
6.9 Example: Mass sliding on a rotating straight frictionless rod. . . . . . . . . . . . . . . . . . . . . 154
6.10 Example: Spherical pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
6.11 Example: Spring plane pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6.12 Example: The yo-yo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
6.13 Example: Mass constrained to move on the inside of a frictionless paraboloid . . . . . . . . . . . 158
6.14 Example: Mass on a frictionless plane connected to a plane pendulum . . . . . . . . . . . . . . . 159
6.15 Example: Two connected masses constrained to slide along a moving rod . . . . . . . . . . . . . . 160
6.16 Example: Mass sliding on a frictionless spherical shell . . . . . . . . . . . . . . . . . . . . . . . . 161
6.17 Example: Rolling solid sphere on a spherical shell . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.18 Example: Solid sphere rolling plus slipping on a spherical shell . . . . . . . . . . . . . . . . . . . 165
6.19 Example: Small body held by friction on the periphery of a rolling wheel . . . . . . . . . . . . . . 166
6.20 Example: Plane pendulum hanging from a vertically-oscillating support . . . . . . . . . . . . . . 169
6.21 Example: Series-coupled double pendulum subject to impulsive force . . . . . . . . . . . . . . . . . 171
7.1 Example: Feynman’s angular-momentum paradox . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
7.2 Example: Atwood’s machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
7.3 Example: Conservation of angular momentum for rotational invariance: . . . . . . . . . . . . . . 183
7.4 Example: Diatomic molecules and axially-symmetric nuclei . . . . . . . . . . . . . . . . . . . . . 184
7.5 Example: Linear harmonic oscillator on a cart moving at constant velocity . . . . . . . . . . . . 189
7.6 Example: Isotropic central force in a rotating frame . . . . . . . . . . . . . . . . . . . . . . . . . 190
7.7 Example: The plane pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
7.8 Example: Oscillating cylinder in a cylindrical bowl . . . . . . . . . . . . . . . . . . . . . . . . . . 191
8.1 Example: Motion in a uniform gravitational field . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
8.2 Example: One-dimensional harmonic oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
8.3 Example: Plane pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
8.4 Example: Hooke’s law force constrained to the surface of a cylinder . . . . . . . . . . . . . . . . . 207
8.5 Example: Electron motion in a cylindrical magnetron . . . . . . . . . . . . . . . . . . . . . . . . 208
8.6 Example: Spherical pendulum using Hamiltonian mechanics . . . . . . . . . . . . . . . . . . . . . 213
8.7 Example: Spherical pendulum using R_cyclic(r, θ, φ, ṙ, θ̇, p_φ) . . . . . . . . . . . . . . . . . . . . . 214
8.8 Example: Spherical pendulum using R_noncyclic(r, θ, φ, p_r, p_θ, φ̇) . . . . . . . . . . . . . . . . . . . 215
8.9 Example: Single particle moving in a vertical plane under the influence of an inverse-square central force . . . . . . . . . . . . . . . . . . . . . . . . 216
8.10 Example: Folded chain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
8.11 Example: Falling chain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
9.1 Example: Gauge invariance in electromagnetism . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
10.1 Example: Driven, linearly-damped, coupled linear oscillators . . . . . . . . . . . . . . . . . . . . . 243
10.2 Example: Kirchhoff’s rules for electrical circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
10.3 Example: The linearly-damped, linear oscillator: . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
11.1 Example: Central force leading to a circular orbit r = 2R cos θ . . . . . . . . . . . . . . . . . . . 254
11.2 Example: Orbit equation of motion for a free body . . . . . . . . . . . . . . . . . . . . . . . . . . 256
11.3 Example: Linear two-body restoring force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
11.4 Example: Inverse square law attractive force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
11.5 Example: Attractive inverse cubic central force . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
11.6 Example: Spiralling mass attached by a string to a hanging mass . . . . . . . . . . . . . . . . . . 271
11.7 Example: Two-body scattering by an inverse cubic force . . . . . . . . . . . . . . . . . . . . . . . 277
12.1 Example: Accelerating spring plane pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
12.2 Example: Surface of rotating liquid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
12.3 Example: The pirouette . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
12.4 Example: Cranked plane pendulum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
12.5 Example: Nucleon orbits in deformed nuclei . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
12.6 Example: Free fall from rest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
12.7 Example: Projectile fired vertically upwards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
12.8 Example: Motion parallel to Earth’s surface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
13.1 Example: Inertia tensor of a solid cube rotating about the center of mass. . . . . . . . . . . . . . 320
13.2 Example: Inertia tensor about a corner of a solid cube. . . . . . . . . . . . . . . . . . . . . . . 321
13.3 Example: Inertia tensor of a hula hoop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
13.4 Example: Inertia tensor of a thin book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
13.5 Example: Rotation about the center of mass of a solid cube . . . . . . . . . . . . . . . . . . . . . 325
13.6 Example: Rotation about the corner of the cube . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
13.7 Example: Euler angle transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
13.8 Example: Rotation of a dumbbell . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
13.9 Example: Precession rate for torque-free rotating symmetric rigid rotor . . . . . . . . . . . . . . 342
13.10 Example: Tennis racquet dynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
13.11 Example: Rotation of asymmetrically-deformed nuclei . . . . . . . . . . . . . . . . . . . . . . . . 346
13.12 Example: The Spinning “Jack” . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
13.13 Example: The Tippe Top . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
13.14 Example: Tipping stability of a rolling wheel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
13.15 Example: Pivoting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
13.16 Example: Rolling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
13.17 Example: Forces on the bearings of a rotating circular disk . . . . . . . . . . . . . . . . . . . . . 355
14.1 Example: The Grand Piano . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
14.2 Example: Two coupled linear oscillators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
14.3 Example: Two equal masses series-coupled by two equal springs . . . . . . . . . . . . . . . . . . . 376
14.4 Example: Two parallel-coupled plane pendula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
14.5 Example: The series-coupled double plane pendula . . . . . . . . . . . . . . . . . . . . . . . . . . 379
14.6 Example: Three plane pendula; mean-field linear coupling . . . . . . . . . . . . . . . . . . . . . . 380
14.7 Example: Three plane pendula; nearest-neighbor coupling . . . . . . . . . . . . . . . . . . . . . . 382
14.8 Example: System of three bodies coupled by six springs . . . . . . . . . . . . . . . . . . . . . . . . 384
14.9 Example: Linear triatomic molecular CO2 . . . . . . . . . . . . . . . . . . . . . . 385
14.10 Example: Benzene ring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
14.11 Example: Two linearly-damped coupled linear oscillators . . . . . . . . . . . . . . . . . . . . . . . 394
14.12 Example: Collective motion in nuclei . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397
15.1 Example: Check that a transformation is canonical . . . . . . . . . . . . . . . . . . . . . . . . . . 406
15.2 Example: Angular momentum: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
15.3 Example: Lorentz force in electromagnetism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
15.4 Example: Wave motion: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
15.5 Example: Two-dimensional, anisotropic, linear oscillator . . . . . . . . . . . . . . . . . . . . . . 413
15.6 Example: The eccentricity vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
15.7 Example: The identity canonical transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 420
15.8 Example: The point canonical transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420
15.9 Example: The exchange canonical transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 420
15.10 Example: Infinitesimal point canonical transformation . . . . . . . . . . . . . . . . . . . . . . . 420
15.11 Example: 1-D harmonic oscillator via a canonical transformation . . . . . . . . . . . . . . . . . . 421
15.12 Example: Free particle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
15.13 Example: Point particle in a uniform gravitational field . . . . . . . . . . . . . . . . . . . . . . . 426
15.14 Example: One-dimensional harmonic oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
15.15 Example: The central force problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
15.16 Example: Linearly-damped, one-dimensional, harmonic oscillator . . . . . . . . . . . . . . . . . . 429
15.17 Example: Adiabatic invariance for the simple pendulum . . . . . . . . . . . . . . . . . . . . . . . 436
15.18 Example: Harmonic oscillator perturbation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
15.19 Example: Lindblad resonance in planetary and galactic motion . . . . . . . . . . . . . . . . . . . 439
16.1 Example: Acoustic waves in a gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
17.1 Example: Muon lifetime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
17.2 Example: Relativistic Doppler Effect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
17.3 Example: Twin paradox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 471
17.4 Example: Rocket propulsion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474
17.5 Example: Lagrangian for a relativistic free particle . . . . . . . . . . . . . . . . . . . . . . . . . . 482
17.6 Example: Relativistic particle in an external electromagnetic field . . . . . . . . . . . . . . . . . . 483
17.7 Example: The Bohr-Sommerfeld hydrogen atom . . . . . . . . . . . . . . . . . . . . . . . . . . . . 487
A.1 Example: Eigenvalues and eigenvectors of a real symmetric matrix . . . . . . . . . . . . . . . . . 512
A.2 Example: Degenerate eigenvalues of real symmetric matrix . . . . . . . . . . . . . . . . . . . . . 513
D.1 Example: Rotation matrix: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
D.2 Example: Proof that a rotation matrix is orthogonal . . . . . . . . . . . . . . . . . . . . . . . . . 528

E.1 Example: Displacement gradient tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 534
F.1 Example: Jacobian for transform from cartesian to spherical coordinates . . . . . . . . . . . . . . 541
H.1 Example: Maxwell’s Flux Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 549
H.2 Example: Buoyancy forces in fluids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 550
H.3 Example: Maxwell’s circulation equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552
H.4 Example: Electromagnetic fields: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 553
I.1 Example: Fourier transform of a single isolated square pulse: . . . . . . . . . . . . . . . . . . . . 558
I.2 Example: Fourier transform of the Dirac delta function: . . . . . . . . . . . . . . . . . . . . . . . 558
Preface

The goal of this book is to introduce the reader to the intellectual beauty, and philosophical implications,
of the fact that nature obeys variational principles plus Hamilton’s Action Principle which underlie the
Lagrangian and Hamiltonian analytical formulations of classical mechanics. These variational methods,
which were developed for classical mechanics during the 18th and 19th centuries, have become the preeminent
formalisms for classical dynamics, as well as for many other branches of modern science and engineering.
The ambitious goal of this book is to lead the reader from the intuitive Newtonian vectorial formulation, to
introduction of the more abstract variational principles that underlie Hamilton’s Principle and the related
Lagrangian and Hamiltonian analytical formulations. This culminates in discussion of the contributions of
variational principles to classical mechanics and the development of relativistic and quantum mechanics.
The broad scope of this book attempts to unify the undergraduate physics curriculum by bridging the
chasm that divides the Newtonian vector-differential formulation, and the integral variational formulation of
classical mechanics, as well as the corresponding philosophical approaches adopted in classical and quantum
mechanics. This book introduces the powerful variational techniques in mathematics, and their application to
physics. Application of the concepts of the variational approach to classical mechanics is ideal for illustrating
the power and beauty of applying variational principles.
The development of this textbook was influenced by three textbooks: The Variational Principles of
Mechanics by Cornelius Lanczos (1949) [La49], Classical Mechanics (1950) by Herbert Goldstein[Go50],
and Classical Dynamics of Particles and Systems (1965) by Jerry B. Marion[Ma65]. Marion’s excellent
textbook was unusual in partially bridging the chasm between the outstanding graduate texts by Goldstein
and Lanczos, and a bevy of introductory texts based on Newtonian mechanics that were available at that
time. The present textbook was developed to provide a more modern presentation of the techniques and
philosophical implications of the variational approaches to classical mechanics, with a breadth and depth
close to that provided by Goldstein and Lanczos, but in a format that better matches the needs of the
undergraduate student. An additional goal is to bridge the gap between classical and modern physics in the
undergraduate curriculum. The underlying philosophical approach adopted by this book was espoused by
Galileo Galilei: “You cannot teach a man anything; you can only help him find it within himself.”
This book was written in support of the physics junior/senior undergraduate course P235W entitled
“Variational Principles in Classical Mechanics” that the author taught at the University of Rochester from
1993 to 2015. Initially the lecture notes were distributed to students to allow pre-lecture study, facilitate
accurate transmission of the complicated formulae, and minimize note taking during lectures. These lecture
notes evolved into the present textbook. The target audience of this course typically comprised ≈ 70% ju-
nior/senior undergraduates, ≈ 25% sophomores, ≤ 5% graduate students, and the occasional well-prepared
freshman. The target audience was physics and astrophysics majors, but the course attracted a significant
fraction of majors from other disciplines such as mathematics, chemistry, optics, engineering, music, and the
humanities. As a consequence, the book includes appreciable introductory level physics, plus mathematical
review material, to accommodate the diverse range of prior preparation of the students. This textbook
includes material that extends beyond what reasonably can be covered during a one-term course. This sup-
plemental material is presented to show the importance and broad applicability of variational concepts to
classical mechanics. The book includes 164 worked examples to illustrate the concepts presented. Advanced
group-theoretic concepts are minimized to better accommodate the mathematical skills of the typical under-
graduate physics major. To conform with modern literature in this field, this book follows the widely-adopted
nomenclature used in “Classical Mechanics” by Goldstein[Go50], with recent additions by Johns[Jo05].
The second edition of this book has revised the presentation and includes recent developments in the
field. The book is broken into four major sections, the first of which presents a brief historical introduction
(chapter 1), followed by a review of the Newtonian formulation of mechanics plus gravitation (chapter
2), linear oscillators and wave motion (chapter 3), and an introduction to non-linear dynamics and chaos
(chapter 4). The second section introduces the variational principles of analytical mechanics that underlie
this book. It includes an introduction to the calculus of variations (chapter 5), the Lagrangian formulation of
mechanics with applications to holonomic and non-holonomic systems (chapter 6), and a discussion of symmetries,
invariance, plus Noether’s theorem (chapter 7). This book presents an introduction to the Hamiltonian, the
Hamiltonian formulation of mechanics, the Routhian reduction technique, and a discussion of the subtleties
involved in applying variational principles to variable-mass problems (chapter 8). The second edition of
this book presents a unified introduction to Hamilton’s Principle, introduces a new approach for applying
Hamilton’s Principle to systems subject to initial boundary conditions, and discusses how best to exploit the
hierarchy of related formulations based on action, Lagrangian/Hamiltonian, and equations of motion, when
solving problems subject to symmetries (chapter 9). A consolidated introduction to the application of the
variational approach to nonconservative systems is presented (chapter 10). The third section of the book
applies Lagrangian and Hamiltonian formulations of classical dynamics to central force problems (chapter 11),
motion in non-inertial frames (chapter 12), rigid-body rotation (chapter 13), and coupled linear oscillators
(chapter 14). The fourth section of the book introduces advanced applications of Hamilton’s Action Principle,
Lagrangian mechanics and Hamiltonian mechanics. These include Poisson brackets, Liouville’s theorem,
canonical transformations, Hamilton-Jacobi theory, the action-angle technique (chapter 15), and classical
mechanics in the continua (chapter 16). This is followed by a brief review of the revolution in classical
mechanics introduced by Einstein’s theory of relativistic mechanics. The extended theory of Lagrangian and
Hamiltonian mechanics is used to apply variational techniques to the Special Theory of Relativity, followed
by a discussion of the use of variational principles in the development of the General Theory of Relativity
(chapter 17). The book finishes with a brief review of the role of variational principles in bridging the gap
between classical mechanics and quantum mechanics (chapter 18). These advanced topics extend beyond
the typical syllabus for an undergraduate classical mechanics course. They are included to stimulate student
interest in physics by giving them a glimpse of the physics at the summit that they have already struggled
to climb. This glimpse illustrates the breadth of classical mechanics, and the pivotal role that variational
principles have played in the development of classical, relativistic, quantal, and statistical mechanics.
The front cover picture of this book shows a sailplane soaring high above the Italian Alps. This picture
epitomizes the unlimited horizon of opportunities provided when the full dynamic range of variational principles
is applied to classical mechanics. The adjacent pictures of the galaxy, and the skier, represent the wide
dynamic range of applicable topics that span from the origin of the universe, to everyday life. These cover
pictures reflect the beauty and unity of the foundation provided by variational principles to the development
of classical mechanics.
Information regarding the associated P235 undergraduate course at the University of Rochester is available
on the web site at http://www.pas.rochester.edu/~cline/P235/index.shtml. Information about the
author is available at the Cline home web site: http://www.pas.rochester.edu/~cline/index.html.
The author thanks Meghan Sarkis who prepared many of the illustrations, Joe Easterly who designed the
book cover plus the webpage, and Moriana Garcia who organized publication. Andrew Sifain developed the
diagnostic workshop questions. The author appreciates the permission, granted by Professor Struckmeier, to
quote his published article on the extended Hamilton-Lagrangian formalism. The author acknowledges the
feedback and suggestions made by many students who have taken this course, as well as helpful suggestions
by his colleagues: Andrew Abrams, Adam Hayes, Connie Jones, Andrew Melchionna, David Munson, Alice
Quillen, Richard Sarkis, James Schneeloch, Steven Torrisi, Dan Watson, and Frank Wolfs. These lecture
notes were typed in LaTeX using Scientific WorkPlace (MacKichan Software, Inc.), while Adobe Illustrator,
Photoshop, Origin, Mathematica, and MUPAD, were used to prepare the illustrations.

Douglas Cline,
University of Rochester, 2018
Prologue

Two dramatically different philosophical approaches to science were developed in the field of classical mechanics
during the 17th and 18th centuries. This time period coincided with the Age of Enlightenment in Europe
during which remarkable intellectual and philosophical developments occurred. This was a time when both
philosophical and causal arguments were equally acceptable in science, in contrast with current convention
where there appears to be tacit agreement to discourage use of philosophical arguments in science.

Snell’s Law: The genesis of two contrasting philosophical approaches to science relates back to early studies
of the reflection and refraction of light. The velocity of light in a medium of refractive index $n$ equals
$v = c/n$. Thus a light beam incident at an angle $\theta_1$ to the normal of a plane interface between medium 1
and medium 2 is refracted at an angle $\theta_2$ in medium 2 where the angles are related by Snell’s Law.
$$\frac{\sin\theta_1}{\sin\theta_2} = \frac{v_1}{v_2} = \frac{n_2}{n_1}$$ (Snell’s Law)
Ibn Sahl of Baghdad (984) first described the refraction of light, while Snell (1621) derived his law
mathematically. Both of these scientists used the “vectorial approach” where the light velocity $\mathbf{v}$
is considered to be a vector pointing in the direction of propagation.

Fermat’s Principle: Fermat’s principle of least time (1657), which is based on the work of Hero of
Alexandria (∼ 60) and Ibn al-Haytham (1021), states that “light travels between two given points along the
path of shortest time”. The transit time $\tau$ of a light beam between two locations $A$ and $B$ in a medium
with position-dependent refractive index $n(s)$ is given by
$$\tau = \int_{t_A}^{t_B} dt = \frac{1}{c}\int_A^B n(s)\, ds$$ (Fermat’s Principle)
Fermat’s Principle leads to the derivation of Snell’s Law.
Figure 1: Vectorial and variational representations of Snell’s Law for refraction of light.

Philosophically the physics underlying the contrasting vectorial and Fermat’s Principle derivations of
Snell’s Law are dramatically different. The vectorial approach is based on differential relations between the
velocity vectors in the two media, whereas Fermat’s variational approach is based on the fact that the light
preferentially selects a path for which the integral of the transit time between the initial location $A$ and
the final location $B$ is minimized. That is, the first approach is based on “vectorial mechanics” whereas Fermat’s approach is based on
variational principles in that the path between the initial and final locations is varied to find the path that
minimizes the transit time. Fermat’s enunciation of variational principles in physics played a key role in the
historical development, and subsequent exploitation, of the principle of least action in analytical formulations
of classical mechanics as discussed below.
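
The variational route from Fermat’s Principle to Snell’s Law can be illustrated numerically. The following Python sketch is not part of the original text; it assumes a flat interface at y = 0, a source at height a above it, a detector at depth b below it, and a horizontal separation d, all of which are illustrative choices, as are the function names. It varies the crossing point along the interface, selects the one that minimizes the transit time, and then compares the resulting ratio of sines with the ratio of refractive indices.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)


def transit_time(x, a, b, d, n1, n2):
    """Transit time from (0, a) in medium 1 to (d, -b) in medium 2,
    crossing the flat interface y = 0 at the trial point (x, 0)."""
    optical_path = n1 * math.hypot(x, a) + n2 * math.hypot(d - x, b)
    return optical_path / C


def fastest_crossing(a, b, d, n1, n2, tol=1e-10):
    """Golden-section search over x in [0, d] for the crossing point of
    least transit time (the transit time is convex in x)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, d
    while hi - lo > tol:
        x_left = hi - inv_phi * (hi - lo)
        x_right = lo + inv_phi * (hi - lo)
        if transit_time(x_left, a, b, d, n1, n2) < transit_time(x_right, a, b, d, n1, n2):
            hi = x_right  # minimum lies in [lo, x_right]
        else:
            lo = x_left   # minimum lies in [x_left, hi]
    return 0.5 * (lo + hi)


# Illustrative (assumed) geometry and media: air-like above, glass-like below.
a, b, d, n1, n2 = 1.0, 1.0, 1.0, 1.0, 1.5
x_min = fastest_crossing(a, b, d, n1, n2)

# Angles are measured from the normal to the interface.
sin_theta1 = x_min / math.hypot(x_min, a)
sin_theta2 = (d - x_min) / math.hypot(d - x_min, b)

ratio = sin_theta1 / sin_theta2
expected = n2 / n1
print(f"sin(theta1)/sin(theta2) = {ratio:.6f}, n2/n1 = {expected:.6f}")
```

No velocity vectors appear in this sketch; the refraction law emerges only from comparing the transit times of neighboring trial paths, which is the essence of the variational approach.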

Newtonian mechanics: Momentum and force are vectors that underlie the Newtonian formulation of
classical mechanics. Newton’s monumental treatise, entitled “Philosophiae Naturalis Principia Mathemat-
ica”, published in 1687, established his three universal laws of motion, the universal theory of gravitation,
the derivation of Kepler’s three laws of planetary motion, and the development of calculus. Newton’s three
universal laws of motion provide the most intuitive approach to classical mechanics in that they are based on
vector quantities like momentum, and the rate of change of momentum, which are related to force. Newton’s
equation of motion
$$\mathbf{F} = \frac{d\mathbf{p}}{dt}$$ (Newton’s equation of motion)
is a vector differential relation between the instantaneous forces and rate of change of momentum, or
equivalent instantaneous acceleration, all of which are vector quantities. Momentum and force are easy to visualize,
and both cause and effect are embedded in Newtonian mechanics. Thus, if all of the forces, including the
constraint forces, acting on the system are known, then the motion is solvable for two body systems. The
mathematics for handling Newton’s “vectorial mechanics” approach to classical mechanics is well established.

Analytical mechanics: Variational principles apply to many aspects of our daily life. Typical examples
include: selecting the optimum compromise in quality and cost when shopping, selecting the fastest route
to travel from home to work, or selecting the optimum compromise to satisfy the disparate desires of the
individuals comprising a family. Variational principles underlie the analytical formulation of mechanics. It
is astonishing that the laws of nature are consistent with variational principles involving the principle of
least action. Minimizing the action integral led to the development of the mathematical field of variational
calculus, plus the analytical variational approaches to classical mechanics, by Euler, Lagrange, Hamilton,
and Jacobi.
Leibniz, who was a contemporary of Newton, introduced methods based on a quantity called “vis viva”,
which is Latin for “living force” and equals twice the kinetic energy. Leibniz believed in the philosophy
that God created a perfect world where nature would be thrifty in all its manifestations. In 1707, Leibniz
proposed that the optimum path is based on minimizing the time integral of the vis viva, which is equiva-
lent to the action integral of Lagrangian/Hamiltonian mechanics. In 1744 Euler derived the Leibniz result
using variational concepts while Maupertuis restated the Leibniz result based on teleological arguments.
The development of Lagrangian mechanics culminated in the 1788 publication of Lagrange’s monumental
treatise entitled “Mécanique Analytique”. Lagrange used d’Alembert’s Principle to derive Lagrangian me-
chanics providing a powerful analytical approach to determine the magnitude and direction of the optimum
trajectories, plus the associated forces.
The culmination of the development of analytical mechanics occurred in 1834 when Hamilton proposed
his Principle of Least Action, as well as developing Hamiltonian mechanics which is the premier variational
approach in science. Hamilton’s concept of least action is defined to be the time integral of the Lagrangian.
Hamilton’s Action Principle (1834) minimizes the action integral $S$ defined by
$$S = \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt$$ (Hamilton’s Principle)
In the simplest form, the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ equals the difference between the kinetic energy $T$ and the
potential energy $U$. Hamilton’s Least Action Principle underlies Lagrangian mechanics. This Lagrangian is
a function of $n$ generalized coordinates $q_i$ plus their corresponding velocities $\dot{q}_i$. Hamilton also developed
the premier variational approach, called Hamiltonian mechanics, that is based on the Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$
which is a function of the $n$ fundamental position $q_i$ plus the conjugate momentum $p_i$ variables. In 1843
Jacobi provided the mathematical framework required to fully exploit the power of Hamiltonian mechanics.
Note that the Lagrangian, Hamiltonian, and the action integral, all are scalar quantities which simplifies
derivation of the equations of motion compared with the vector calculus used by Newtonian mechanics.
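
As a concrete illustration of the action being a single scalar that is minimized by the physical path, the following Python sketch (not from the original text) evaluates a discretized action for a particle thrown vertically in a uniform gravitational field. The Lagrangian is taken to be the kinetic energy minus the potential energy m g y; the midpoint-rule discretization, the mass, the flight time, and the sinusoidal trial perturbations are all illustrative assumptions. The Newtonian parabola is compared with nearby paths that share the same end points.

```python
import math


def discrete_action(path, dt, m=1.0, g=9.81):
    """Midpoint-rule approximation to the time integral of L = T - U
    for vertical motion, with T = m*v**2/2 and U = m*g*y."""
    action = 0.0
    for y0, y1 in zip(path[:-1], path[1:]):
        velocity = (y1 - y0) / dt
        y_mid = 0.5 * (y0 + y1)
        action += (0.5 * m * velocity ** 2 - m * g * y_mid) * dt
    return action


def free_fall_path(n_steps, t_total, g=9.81):
    """Newtonian trajectory y(t) = (g/2) t (t_total - t), which starts
    and ends at y = 0. Returns the sampled path and the time step."""
    dt = t_total / n_steps
    path = [0.5 * g * (k * dt) * (t_total - k * dt) for k in range(n_steps + 1)]
    return path, dt


def perturbed(path, amplitude):
    """Add a half-sine bump that vanishes at both ends, so the initial
    and final positions are unchanged."""
    n = len(path) - 1
    return [y + amplitude * math.sin(math.pi * k / n) for k, y in enumerate(path)]


true_path, dt = free_fall_path(n_steps=200, t_total=2.0)
s_true = discrete_action(true_path, dt)

for amplitude in (-0.5, -0.1, 0.1, 0.5):
    s_trial = discrete_action(perturbed(true_path, amplitude), dt)
    print(f"amplitude {amplitude:+.1f} m: action excess = {s_trial - s_true:.4f} J s")
```

Every trial path, whether it bulges above or below the parabola, yields a larger action than the Newtonian trajectory. Only scalar energies enter the calculation, with no force vectors, which reflects the simplification noted above.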
Figure 2 presents a philosophical roadmap illustrating the hierarchy of philosophical approaches based on
Hamilton’s Action Principle, that are available for deriving the equations of motion of a system. The primary
Stage 1 uses Hamilton’s Action functional, $S = \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt$, to derive the Lagrangian, and Hamiltonian
functionals which provide the most fundamental and sophisticated level of understanding. Stage 1 involves
specifying all the active degrees of freedom, as well as the interactions involved. Stage 2 uses the Lagrangian
or Hamiltonian functionals, derived at Stage 1, in order to derive the equations of motion for the system of
interest. Stage 3 then uses these derived equations of motion to solve for the motion of the system subject to
a given set of initial boundary conditions. Note that Lagrange first derived Lagrangian mechanics based on
d’Alembert’s Principle, while Newton’s Laws of Motion specify the equations of motion used in Newtonian
mechanics.

[Figure 2 diagram. Hamilton’s action principle; Stage 1: Hamiltonian, Lagrangian, d’Alembert’s Principle; Stage 2: Equations of motion, Newtonian mechanics; Stage 3: Solution for motion, Initial conditions.]

Figure 2: Philosophical road map of the hierarchy of stages involved in analytical mechanics. Hamilton’s
Action Principle is the foundation of analytical mechanics. Stage 1 uses Hamilton’s Principle to derive the
Lagrangian and Hamiltonian. Stage 2 uses either the Lagrangian or Hamiltonian to derive the equations
of motion for the system. Stage 3 uses these equations of motion to solve for the actual motion using
the assumed initial conditions. The Lagrangian approach can be derived directly based on d’Alembert’s
Principle. Newtonian mechanics can be derived directly based on Newton’s Laws of Motion. The advantages
and power of Hamilton’s Action Principle are unavailable if the Laws of Motion are derived using either
d’Alembert’s Principle or Newton’s Laws of Motion.

The analytical approach to classical mechanics appeared contradictory to Newton’s intuitive vector-
ial treatment of force and momentum. There is a dramatic difference in philosophy between the vector-
differential equations of motion derived by Newtonian mechanics, which relate the instantaneous force to
the corresponding instantaneous acceleration, and analytical mechanics, where minimizing the scalar action
integral involves integrals over space and time between specified initial and final states. Analytical mechanics
uses variational principles to determine the optimum trajectory, from a continuum of tentative possibilities,
by requiring that the optimum trajectory minimizes the action integral between specified initial and final
conditions.
Initially there was considerable prejudice and philosophical opposition to use of the variational principles
approach which is based on the assumption that nature follows the principles of economy. The variational
approach is not intuitive, and thus it was considered to be speculative and “metaphysical”, but it was
tolerated as an efficient tool for exploiting classical mechanics. This opposition to the variational principles
underlying analytical mechanics, delayed full appreciation of the variational approach until the start of the
20th century. As a consequence, the intuitive Newtonian formulation reigned supreme in classical mechanics
for over two centuries, even though the remarkable problem-solving capabilities of analytical mechanics were
recognized and exploited following the development of analytical mechanics by Lagrange.
The full significance and superiority of the analytical variational formulations of classical mechanics
became well recognized and accepted following the development of the Special Theory of Relativity in 1905.
The Theory of Relativity requires that the laws of nature be invariant to the reference frame. This is not
satisfied by the Newtonian formulation of mechanics which assumes one absolute frame of reference and a
separation of space and time. In contrast, the Lagrangian and Hamiltonian formulations of the principle of
least action remain valid in the Theory of Relativity, if the Lagrangian is written in a relativistically-invariant
form in space-time. The complete invariance of the variational approach to coordinate frames is precisely
the formalism necessary for handling relativistic mechanics.
Hamiltonian mechanics, which is expressed in terms of the conjugate variables (q p), relates classical
mechanics directly to the underlying physics of quantum mechanics and quantum field theory. As a conse-
quence, the philosophical opposition to exploiting variational principles no longer exists, and Hamiltonian
mechanics has become the preeminent formulation of modern physics. The reader is free to draw their own
conclusions regarding the philosophical question “is the principle of economy a fundamental law of classical
mechanics, or is it a fortuitous consequence of the fundamental laws of nature?”
From the late seventeenth century, until the dawn of modern physics at the start of the twentieth cen-
tury, classical mechanics remained a primary driving force in the development of physics. Classical mechanics
embraces an unusually broad range of topics spanning motion of macroscopic astronomical bodies to mi-
croscopic particles in nuclear and particle physics, at velocities ranging from zero to near the velocity of
light, from one-body to statistical many-body systems, as well as having extensions to quantum mechanics.
Introduction of the Special Theory of Relativity in 1905, and the General Theory of Relativity in 1916,
necessitated modifications to classical mechanics for relativistic velocities, and can be considered to be an
extended theory of classical mechanics. Since the 1920s, quantal physics has superseded classical mechanics
in the microscopic domain. Although quantum physics has played the leading role in the development of
physics during much of the past century, classical mechanics still is a vibrant field of physics that recently
has led to exciting developments associated with non-linear systems and chaos theory. This has spawned
new branches of physics and mathematics as well as changing our notion of causality.

Goals: The primary goal of this book is to introduce the reader to the powerful variational-principles
approaches that play such a pivotal role in classical mechanics and many other branches of modern science
and engineering. This book emphasizes the intellectual beauty of these remarkable developments, as well as
stressing the philosophical implications that have had a tremendous impact on modern science. A secondary
goal is to apply variational principles to solve advanced applications in classical mechanics in order to
introduce many sophisticated and powerful mathematical techniques that underlie much of modern physics.
This book starts with a review of Newtonian mechanics plus the solutions of the corresponding equations
of motion. This is followed by an introduction to Lagrangian mechanics, based on d’Alembert’s Principle,
in order to develop familiarity in applying variational principles to classical mechanics. This leads to intro-
duction of the more fundamental Hamilton’s Action Principle, plus Hamiltonian mechanics, to illustrate the
power provided by exploiting the full hierarchy of stages available for applying variational principles to clas-
sical mechanics. Finally the book illustrates how variational principles in classical mechanics were exploited
during the development of both relativistic mechanics and quantum physics. The connections and applica-
tions of classical mechanics to modern physics are emphasized throughout the book in an effort to span the
chasm that divides the Newtonian vector-differential formulation and the integral variational formulation of
classical mechanics. This chasm is especially applicable to quantum mechanics which is based completely on
variational principles. Note that variational principles, developed in the field of classical mechanics, now are
used in a diverse and wide range of fields outside of physics, including economics, meteorology, engineering,
and computing.
This study of classical mechanics involves climbing a vast mountain of knowledge, and the pathway to the
top leads to elegant and beautiful theories that underlie much of modern physics. This book exploits varia-
tional principles applied to four major topics in classical mechanics to illustrate the power and importance of
variational principles in physics. Being so close to the summit provides the opportunity to take a few extra
steps beyond the normal introductory classical mechanics syllabus to glimpse the exciting physics found at
the summit. This new physics includes topics such as quantum, relativistic, and statistical mechanics.
Chapter 1

A brief history of classical mechanics

1.1 Introduction
This chapter reviews the historical evolution of classical mechanics since considerable insight can be gained
from study of the history of science. There are two dramatically different approaches used in classical
mechanics. The first is the vectorial approach of Newton which is based on vector quantities like momentum,
force, and acceleration. The second is the analytical approach of Lagrange, Euler, Hamilton, and Jacobi,
that is based on the concept of least action and variational calculus. The more intuitive Newtonian picture
reigned supreme in classical mechanics until the start of the twentieth century. Variational principles, which
were developed during the nineteenth century, never aroused much enthusiasm in scientific circles due to
philosophical objections to the underlying concepts; this approach was merely tolerated as an efficient tool
for exploiting classical mechanics. A dramatic advance in the philosophy of science occurred at the start of
the 20th century leading to widespread acceptance of the superiority of using variational principles.

1.2 Greek antiquity

The great philosophers in ancient Greece played a key role by using the astronomical work of the Babylonians
to develop scientific theories of mechanics. Thales of Miletus (624 - 547 B.C.), the first of the seven
great Greek philosophers, developed geometry, and is hailed as the first true mathematician. Pythagoras
(570 - 495 B.C.) developed mathematics, and postulated that the earth is spherical. Democritus (460 -
370 B.C.) has been called the father of modern science, while Socrates (469 - 399 B.C.) is renowned for his
contributions to ethics. Plato (427-347 B.C.), who was a mathematician and student of Socrates, wrote
important philosophical dialogues. He founded the Academy in Athens which was the first institution of
higher learning in the Western world that helped lay the foundations of Western philosophy and science.
Aristotle (384-322 B.C.) is an important founder of Western philosophy encompassing ethics, logic,
science, and politics. His views on the physical sciences profoundly influenced medieval scholarship that
extended well into the Renaissance. He presented the first implied formulation of the principle of virtual
work in statics, and his statement that “what is lost in velocity is gained in force” is a veiled reference to
kinetic and potential energy. He adopted an Earth-centered model of the universe. Aristarchus (310 - 240
B.C.) argued that the Earth orbited the Sun and used measurements to imply the relative distances of the
Moon and the Sun. The Greek philosophers were relatively advanced in logic and mathematics and developed
concepts that enabled them to calculate areas and perimeters. Unfortunately their philosophical approach
neglected collecting quantitative and systematic data that is an essential ingredient to the advancement of
science.
Archimedes (287-212 B.C.) represented the culmination of science in ancient Greece. As an engineer
he designed machines of war, while as a scientist he made significant contributions to hydrostatics and
the principle of the lever. As a mathematician, he applied infinitesimals in a way that is reminiscent of
modern integral calculus, which he used to derive a value for π. Unfortunately much of the work of the
brilliant Archimedes subsequently fell into oblivion. Hero of Alexandria (10 - 70 A.D.) described the
principle of reflection that light takes the shortest path. This is an early illustration of the variational principle
of least time. Ptolemy (83 - 161 A.D.) wrote several scientific treatises that greatly influenced subsequent
philosophers. Unfortunately he adopted the incorrect geocentric solar system in contrast to the heliocentric
model of Aristarchus and others.

1.3 Middle Ages

The decline and fall of the Roman Empire in ∼410 A.D. marks the end of Classical Antiquity, and the
beginning of the Dark Ages in Western Europe (Christendom), while the Muslim scholars in Eastern Europe
continued to make progress in astronomy and mathematics. For example, in Egypt, Alhazen (965 - 1040
A.D.) expanded the principle of least time to reflection and refraction. The Dark Ages involved a long
scientific decline in Western Europe that lasted for about 900 years. Science was dominated by religious
dogma, all western scholars were monks, and the important scientific achievements of Greek antiquity were
forgotten. The works of Aristotle were reintroduced to Western Europe by Arabs in the early 13th century
leading to the concepts of forces in static systems which were developed during the fourteenth century.
This included concepts of the work done by a force, and the virtual work involved in virtual displacements.
Leonardo da Vinci (1452-1519) was a leader in mechanics at that time. He made seminal contributions
to science, in addition to his well known contributions to architecture, engineering, sculpture, and art.
Nicolaus Copernicus (1473-1543) rejected the geocentric theory of Ptolemy and formulated a scientifically-
based heliocentric cosmology that displaced the Earth from the center of the universe. The Ptolemaic view
was that heaven represented the perfect unchanging divine while the earth represented change plus chaos,
and the celestial bodies moved relative to the fixed heavens. The book, De revolutionibus orbium coelestium
(On the Revolutions of the Celestial Spheres), published by Copernicus in 1543, is regarded as the starting
point of modern astronomy and the defining epiphany that began the Scientific Revolution. The book De
Magnete written in 1600 by the English physician William Gilbert (1540-1603) presented the results of
well-planned studies of magnetism and strongly influenced the intellectual-scientific evolution at that time.
Johannes Kepler (1571-1630), a German mathematician, astronomer and astrologer, was a key
figure in the 17th century Scientific Revolution. He is best known for recognizing the connection between the
motions in the sky and physics. His laws of planetary motion were developed by later astronomers based on
his written work Astronomia nova, Harmonices Mundi, and Epitome of Copernican Astronomy. Kepler
was an assistant to Tycho Brahe (1546-1601) who for many years recorded accurate astronomical data
that played a key role in the development of Kepler’s theory of planetary motion. Kepler’s work provided
the foundation for Isaac Newton’s theory of universal gravitation. Unfortunately Kepler did not recognize
the true nature of the gravitational force.
Galileo Galilei (1564-1642) built on Aristotle’s principle by recognizing the law of inertia, the
persistence of motion if no forces act, and the proportionality between force and acceleration. This amounts
to recognition of work as the product of force times displacement in the direction of the force. He applied
virtual work to the equilibrium of a body on an inclined plane. He also showed that the same principle
applies to hydrostatic pressure that had been established by Archimedes, but he did not apply his concepts
in classical mechanics to the considerable knowledge base on planetary motion. Galileo is famous for the
apocryphal story that he dropped two cannon balls of different masses from the Tower of Pisa to demonstrate
that their speed of descent was independent of their mass.

1.4 Age of Enlightenment

The Age of Enlightenment is a term used to describe a phase in Western philosophy and cultural life in
which reason was advocated as the primary source and legitimacy for authority. It developed simultaneously
in Germany, France, Britain, the Netherlands, and Italy around the 1650’s and lasted until the French
Revolution in 1789. The intellectual and philosophical developments led to moral, social, and political
reforms. The principles of individual rights, reason, common sense, and deism were a revolutionary departure
from the existing theocracy, autocracy, oligarchy, aristocracy, and the divine right of kings. It led to political
revolutions in France and the United States. It marks a dramatic departure from the Early Modern period
which was noted for religious authority, absolute state power, guild-based economic systems, and censorship
of ideas. It opened a new era of rational discourse, liberalism, freedom of expression, and scientific method.
This new environment led to tremendous advances in both science and mathematics in addition to music,
literature, philosophy, and art. Scientific development during the 17th century included the pivotal advances
made by Newton and Leibniz at the beginning of the revolutionary Age of Enlightenment, culminating in the
development of variational calculus and analytical mechanics by Euler and Lagrange. The scientific advances
of this age include publication of two monumental books Philosophiae Naturalis Principia Mathematica by
Newton in 1687 and Mécanique analytique by Lagrange in 1788. These are the definitive two books upon
which classical mechanics is built.

René Descartes (1596-1650) attempted to formulate the laws of motion in 1644. He talked about
conservation of motion (momentum) in a straight line but did not recognize the vector character of momen-
tum. Pierre de Fermat (1601-1665) and René Descartes were two leading mathematicians in the first
half of the 17 century. Independently they discovered the principles of analytic geometry and developed
some initial concepts of calculus. Fermat and Blaise Pascal (1623-1662) were the founders of the theory
of probability.

Isaac Newton (1642-1727) made pioneering contributions to physics and mathematics as well as
being a theologian. At 18 he was admitted to Trinity College Cambridge where he read the writings of
modern philosophers like Descartes, and astronomers like Copernicus, Galileo, and Kepler. By 1665 he had
discovered the generalized binomial theorem, and began developing infinitesimal calculus. Due to a plague,
the university closed for two years in 1665 during which Newton worked at home developing the theory
of calculus that built upon the earlier work of Barrow and Descartes. He was elected Lucasian Professor
of Mathematics in 1669 at the age of 26. From 1670 Newton focussed on optics leading to his Hypothesis
of Light published in 1675 and his book Opticks in 1704. Newton described light as being made up of a
flow of extremely subtle corpuscles that also had associated wavelike properties to explain diffraction and
optical interference that he studied. Newton returned to mechanics in 1677 by studying planetary motion
and gravitation that applied the calculus he had developed. In 1687 he published his monumental treatise
entitled Philosophiae Naturalis Principia Mathematica which established his three universal laws of motion,
the universal theory of gravitation, derivation of Kepler’s three laws of planetary motion, and was his first
publication of the development of calculus which he called “the science of fluxions”. Newton’s laws of motion
are based on the concepts of force and momentum, that is, force equals the rate of change of momentum.
Newton’s postulate of an invisible force able to act over vast distances led him to be criticized for introducing
“occult agencies” into science. In a remarkable achievement, Newton completely solved the laws of mechanics.
His theory of classical mechanics and of gravitation reigned supreme until the development of the Theory
of Relativity in 1905. The followers of Newton envisioned the Newtonian laws to be absolute and universal.
This dogmatic reverence of Newtonian mechanics prevented physicists from an unprejudiced appreciation of
the analytic variational approach to mechanics developed during the 17th through 19th centuries. Newton
was the first scientist to be knighted and was appointed president of the Royal Society.

Gottfried Leibniz (1646-1716) was a brilliant German philosopher, a contemporary of Newton, who
worked on both calculus and mechanics. Leibniz started development of calculus in 1675, ten years after
Newton, but Leibniz published his work in 1684, which was three years before Newton’s Principia. Leibniz
made significant contributions to integral calculus and developed the notation currently used in calculus.
He introduced the name calculus based on the Latin word for the small stone used for counting. Newton
and Leibniz were involved in a protracted argument over who originated calculus. It appears that Leibniz
saw drafts of Newton’s work on calculus during a visit to England. Throughout their argument Newton
was the ghost writer of most of the articles in support of himself and he had them published under the
nom-de-plume of his friends. Leibniz made the tactical error of appealing to the Royal Society to intercede on
his behalf. Newton, as president of the Royal Society, appointed his friends to an “impartial” committee to
investigate this issue, then he wrote the committee’s report that accused Leibniz of plagiarism of Newton’s
work on calculus, after which he had it published by the Royal Society. Still unsatisfied he then wrote an
anonymous review of the report in the Royal Society’s own periodical. This bitter dispute lasted until the
death of Leibniz. When Leibniz died his work was largely discredited. The fact that he falsely claimed to be
a nobleman and added the prefix “von” to his name, coupled with Newton’s vitriolic attacks, did not help
his credibility. Newton is reported to have declared that he took great satisfaction in “breaking Leibniz’s
heart.” Studies during the 20th century have largely revived the reputation of Leibniz and he is recognized
to have made major contributions to the development of calculus.

Figure 1.1: Chronological roadmap of the parallel development of the Newtonian and Variational-principles
approaches to classical mechanics.

1.5 Variational methods in physics

Pierre de Fermat (1601-1665) revived the principle of least time, which states that light travels between
two given points along the path of shortest time and was used to derive Snell’s law in 1657. This enunciation
of variational principles in physics played a key role in the historical development of the variational principle
of least action that underlies the analytical formulations of classical mechanics.
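Fermat's principle lends itself to a quick numerical check. The following is a minimal sketch, assuming a flat interface between two uniform media and hypothetical values for the geometry and the speeds of light in each medium. It minimizes the travel time over the point where the ray crosses the interface and then compares the two sides of Snell's law. The helper names travel_time and minimize_golden are illustrative only, and golden-section search is simply one convenient choice of minimizer.

```python
import math


def travel_time(x, a, b, d, v1, v2):
    """Time for light to travel from (0, a) to the interface point (x, 0)
    at speed v1, and then on to (d, -b) at speed v2."""
    return math.hypot(x, a) / v1 + math.hypot(d - x, b) / v2


def minimize_golden(f, lo, hi, tol=1e-12):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    e = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(e):
            hi = e          # minimum lies in [lo, e]
        else:
            lo = c          # minimum lies in [c, hi]
        c = hi - inv_phi * (hi - lo)
        e = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)


# Hypothetical geometry and speeds (illustrative values, arbitrary units).
a, b, d = 1.0, 1.0, 2.0   # source height, target depth, horizontal separation
v1, v2 = 1.0, 0.75        # speed of light in the upper and lower media

# Crossing point on the interface that minimizes the total travel time.
x_min = minimize_golden(lambda x: travel_time(x, a, b, d, v1, v2), 0.0, d)

# Sines of the angles of incidence and refraction, measured from the normal.
sin_theta1 = x_min / math.hypot(x_min, a)
sin_theta2 = (d - x_min) / math.hypot(d - x_min, b)

# Snell's law in terms of speeds: sin(theta1) / v1 == sin(theta2) / v2.
snell_lhs = sin_theta1 / v1
snell_rhs = sin_theta2 / v2
print(snell_lhs, snell_rhs)
```

Under these assumptions the two printed quantities should agree to within the precision of the search, which is the content of Snell's law obtained from the least-time condition.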
Gottfried Leibniz (1646-1716) made significant contributions to the development of variational prin-
ciples in classical mechanics. In contrast to Newton’s laws of motion, which are based on the concept of
momentum, Leibniz devised a new theory of dynamics based on kinetic and potential energy that anticipates
the analytical variational approach of Lagrange and Hamilton. Leibniz argued for a quantity called the “vis
viva”, which is Latin for living force, that equals twice the kinetic energy. Leibniz argued that the change
in kinetic energy is equal to the work done. In 1687 Leibniz proposed that the optimum path is based on
minimizing the time integral of the vis viva, which is equivalent to the action integral. Leibniz used both
philosophical and causal arguments in his work which were acceptable during the Age of Enlightenment. Un-
fortunately for Leibniz, his analytical approach based on energies, which are scalars, appeared contradictory
to Newton’s intuitive vectorial treatment of force and momentum. There was considerable prejudice and
philosophical opposition to the variational approach which assumes that nature is thrifty in all of its actions.
The variational approach was considered to be speculative and “metaphysical” in contrast to the causal
arguments supporting Newtonian mechanics. This opposition delayed full appreciation of the variational
approach until the start of the 20th century.
Johann Bernoulli (1667-1748) was a Swiss mathematician who was a student of Leibniz’s calculus, and
sided with Leibniz in the Newton-Leibniz dispute over the credit for developing calculus. Also Bernoulli sided
with Descartes’ vortex theory of gravitation which delayed acceptance of Newton’s theory of gravitation
in Europe. Bernoulli pioneered development of the calculus of variations by solving the problems of the
catenary, the brachistochrone, and Fermat’s principle. Johann Bernoulli’s son Daniel played a significant
role in the development of the well-known Bernoulli Principle in hydrodynamics.
Pierre Louis Maupertuis (1698-1759) was a student of Johann Bernoulli and conceived the universal
hypothesis that in nature there is a certain quantity called action which is minimized. Although this bold
assumption correctly anticipates the development of the variational approach to classical mechanics, he
obtained his hypothesis by an entirely incorrect method. He was a dilettante whose mathematical prowess
was behind the high standards of that time, and he could not establish satisfactorily the quantity to be
minimized. His teleological¹ argument was influenced by Fermat’s principle and the corpuscle theory of light
that implied a close connection between optics and mechanics.
Leonhard Euler (1707-1783) was the preeminent Swiss mathematician of the 18th century and was
a student of Johann Bernoulli. Euler developed, with full mathematical rigor, the calculus of variations
following in the footsteps of Johann Bernoulli. Euler used variational calculus to solve minimum/maximum
isoperimetric problems that had attracted and challenged the early developers of calculus, Newton, Leibniz,
and Bernoulli. Euler also was the first to solve the rigid-body rotation problem using the three components
of the angular velocity as kinematical variables. Euler became blind in both eyes by 1766 but that did not
hinder his prolific output in mathematics due to his remarkable memory and mental capabilities. Euler’s
contributions to mathematics are remarkable in quality and quantity; for example during 1775 he published
one mathematical paper per week in spite of being blind. Euler implied the principle of least
action using vis viva, which is not the exact form explicitly developed by Lagrange.
Jean le Rond d’Alembert (1717-1785) was a French mathematician and physicist who had the
clever idea of extending use of the principle of virtual work from statics to dynamics. d’Alembert’s Principle
rewrites the principle of virtual work in the form

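$$\sum_{i=1}^{N} \left( \mathbf{F}_i - \dot{\mathbf{p}}_i \right) \cdot \delta \mathbf{r}_i = 0$$

where the inertial reaction force $\dot{\mathbf{p}}_i$ is subtracted from the corresponding force $\mathbf{F}_i$. This extension of the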
principle of virtual work applies equally to both statics and dynamics leading to a single variational principle.
¹ Teleology is any philosophical account that holds that final causes exist in nature, meaning that, analogous to purposes
found in human actions, nature inherently tends toward definite ends.

Joseph Louis Lagrange (1736-1813) was an Italian mathematician and a student of Leonhard Euler.
In 1788 Lagrange published his monumental treatise on analytical mechanics entitled Mécanique Analytique
which introduces his Lagrangian mechanics analytical technique which is based on d’Alembert’s Principle of
Virtual Work. Lagrangian mechanics is a remarkably powerful technique that is equivalent to minimizing
the action integral $S$ defined as

$$S = \int_{t_1}^{t_2} L \, dt$$

The Lagrangian $L$ frequently is defined to be the difference between the kinetic energy $T$ and potential
energy $U$. His theory only required the analytical form of these scalar quantities. In the preface of his
book he refers modestly to his extraordinary achievements with the statement “The reader will find no
figures in the work. The methods which I set forth do not require either constructions or geometrical or
mechanical reasonings: but only algebraic operations, subject to a regular and uniform rule of procedure.”
Lagrange also introduced the concept of undetermined multipliers to handle auxiliary conditions, which
plays a vital part in theoretical mechanics. William Hamilton, an outstanding figure in the analytical
formulation of classical mechanics, called Lagrange the “Shakespeare of mathematics,” on account of the
extraordinary beauty, elegance, and depth of the Lagrangian methods. Lagrange also pioneered numerous
significant contributions to mathematics. For example, Euler, Lagrange, and d’Alembert developed much of
the mathematics of partial differential equations. Lagrange survived the French Revolution, and, in spite of
being a foreigner, Napoleon named Lagrange to the Legion of Honour and made him a Count of the Empire
in 1808. Lagrange was honoured by being buried in the Pantheon.
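As a compact reminder of what minimizing the action entails, the sketch below writes the stationarity condition in standard textbook notation. The symbols $S$, $L$, $q_i$, and $\dot{q}_i$ (action, Lagrangian, generalized coordinates, and generalized velocities) are assumed here for illustration, and the result as written applies only to the simplest case of $n$ independent coordinates with no constraint forces or non-conservative forces.

```latex
S = \int_{t_1}^{t_2} L(q_i, \dot{q}_i, t)\, dt, \qquad
\delta S = 0 \;\Longrightarrow\;
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_i} \right)
- \frac{\partial L}{\partial q_i} = 0, \qquad i = 1, \dots, n
```

Only the scalar function $L$ is needed to generate these equations of motion, which is the sense in which the method requires algebraic operations rather than geometrical constructions.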
Carl Friedrich Gauss (1777-1855) was a German child prodigy who made many significant contri-
butions to mathematics, astronomy and physics. He did not work directly on the variational approach, but
Gauss’s law, the divergence theorem, and the Gaussian statistical distribution are important examples of
concepts that he developed and which feature prominently in classical mechanics as well as other branches
of physics, and mathematics.
Simeon Poisson (1781-1840) was a brilliant mathematician who was a student of Lagrange. He
developed the Poisson statistical distribution as well as the Poisson equation that features prominently in
electromagnetic and other field theories. His major contribution to classical mechanics is development, in
1809, of the Poisson bracket formalism which featured prominently in development of Hamiltonian mechanics
and quantum mechanics.
The zenith in development of the variational approach to classical mechanics occurred during the 19th
century primarily due to the work of Hamilton and Jacobi.
William Hamilton (1805-1865) was a brilliant Irish physicist, astronomer and mathematician who was
appointed professor of astronomy at Dublin when he was barely 22 years old. He developed the Hamiltonian
mechanics formalism of classical mechanics which now plays a pivotal role in modern classical and quantum
mechanics. He opened an entirely new world beyond the developments of Lagrange. Whereas the Lagrange
equations of motion are complicated second-order differential equations, Hamilton succeeded in transforming
them into a set of first-order differential equations with twice as many variables that consider momenta and
their conjugate positions as independent variables. The differential equations of Hamilton are linear, have
separated derivatives, and represent the simplest and most desirable form possible for differential equations to
be used in a variational approach. Hence the name “canonical variables” given by Jacobi. Hamilton exploited
the d’Alembert principle to give the first exact formulation of the principle of least action which underlies the
variational principles used in analytical mechanics. The form derived by Euler and Lagrange employed the
principle in a way that applies only for conservative (scleronomic) cases. A significant discovery of Hamilton
is his realization that classical mechanics and geometrical optics can be handled from one unified viewpoint.
In both cases he uses a “characteristic” function that has the property that, by mere differentiation, the
path of the body, or light ray, can be determined by the same partial differential equations. This solution is
equivalent to the solution of the equations of motion.
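For concreteness, Hamilton's first-order equations can be sketched in standard notation. The symbols $H$, $q_i$, and $p_i$ (Hamiltonian, generalized coordinates, and conjugate momenta) are assumed here for illustration, as is the absence of additional generalized forces. In this form the $n$ second-order Lagrange equations are replaced by $2n$ first-order equations.

```latex
\dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad
\dot{p}_i = -\frac{\partial H}{\partial q_i}, \qquad i = 1, \dots, n
```

Each time derivative appears alone on the left-hand side, which illustrates the "separated derivatives" property mentioned above.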
Carl Gustav Jacob Jacobi (1804-1851), a Prussian mathematician and contemporary of Hamilton,
made significant developments in Hamiltonian mechanics. He immediately recognized the extraordinary im-
portance of the Hamiltonian formulation of mechanics. Jacobi developed canonical transformation theory
and showed that the function, used by Hamilton, is only one special case of functions that generate suit-
able canonical transformations. He proved that any complete solution of the partial differential equation,
without the specific boundary conditions applied by Hamilton, is sufficient for the complete integration of
the equations of motion. This greatly extends the usefulness of Hamilton’s partial differential equations.
In 1843 Jacobi developed both the Poisson brackets, and the Hamilton-Jacobi, formulations of Hamiltonian
mechanics. The latter gives a single, first-order partial differential equation for the action function in terms
of the $n$ generalized coordinates which greatly simplifies solution of the equations of motion. He also
derived a principle of least action for time-independent cases that had been studied by Euler and Lagrange.
Jacobi developed a superior approach to the variational integral that, by eliminating time from the integral,
determined the path without saying anything about how the motion occurs in time.
James Clerk Maxwell (1831-1879) was a Scottish theoretical physicist and mathematician. His most
prominent achievement was formulating a classical electromagnetic theory that united previously unrelated
observations, plus equations of electricity, magnetism and optics, into one consistent theory. Maxwell’s
equations demonstrated that electricity, magnetism and light are all manifestations of the same phenomenon,
namely the electromagnetic field. Consequently, all other classic laws and equations of electromagnetism
were simplified cases of Maxwell’s equations. Maxwell’s achievements concerning electromagnetism have
been called the “second great unification in physics”. Maxwell demonstrated that electric and magnetic
fields travel through space in the form of waves, and at a constant speed of light. In 1864 Maxwell wrote “A
Dynamical Theory of the Electromagnetic Field” which proposed that light was in fact undulations in the
same medium that is the cause of electric and magnetic phenomena. His work in producing a unified model
of electromagnetism is one of the greatest advances in physics. Maxwell, in collaboration with Ludwig
Boltzmann (1844-1906), also helped develop the Maxwell-Boltzmann distribution, which is a statistical
means of describing aspects of the kinetic theory of gases. These two discoveries helped usher in the era of
modern physics, laying the foundation for such fields as special relativity and quantum mechanics. Boltzmann
founded the field of statistical mechanics and was an early staunch advocate of the existence of atoms and
molecules.
Henri Poincaré (1854-1912) was a French theoretical physicist and mathematician. He was the first to
present the Lorentz transformations in their modern symmetric form and discovered the remaining relativistic
velocity transformations. Although there is similarity to Einstein’s Special Theory of Relativity, Poincaré and
Lorentz still believed in the concept of the ether and did not fully comprehend the revolutionary philosophical
change implied by Einstein. Poincaré worked on the solution of the three-body problem in planetary motion
and was the first to discover a chaotic deterministic system which laid the foundations of modern chaos
theory. It rejected the long-held deterministic view that if the position and velocities of all the particles are
known at one time, then it is possible to predict the future for all time.
The last two decades of the 19th century saw the culmination of classical physics and several important
discoveries that led to a revolution in science that toppled classical physics from its throne. The end of the
19th century was a time during which tremendous technological progress occurred; flight, the automobile,
and turbine-powered ships were developed, Niagara Falls was harnessed for power, etc. During this period,
Heinrich Hertz (1857-1894) produced electromagnetic waves confirming their derivation using Maxwell’s
equations. Simultaneously he discovered the photoelectric effect which was crucial evidence in support of
quantum physics. Technical developments, such as photography, the induction spark coil, and the vacuum
pump played a significant role in scientific discoveries made during the 1890’s. At the end of the 19th century,
scientists thought that the basic laws were understood and worried that future physics would be in the fifth
decimal place; some scientists worried that little was left for them to discover. However, there remained a
few, presumed minor, unexplained discrepancies plus new discoveries that led to the revolution in science
that occurred at the beginning of the 20th century.

1.6 The 20th century revolution in physics

The two greatest achievements of modern physics occurred at the beginning of the 20th century. The first
was Einstein’s development of the Theory of Relativity; the Special Theory of Relativity in 1905 and the
General Theory of Relativity in 1915. This was followed in 1925 by the development of quantum mechanics.
Albert Einstein (1879-1955) developed the Special Theory of Relativity in 1905 and the General The-
ory of Relativity in 1915; both of these revolutionary theories had a profound impact on classical mechanics
and the underlying philosophy of physics. The Newtonian formulation of mechanics was shown to be an
approximation that applies only at low velocities, while the General Theory of Relativity superseded New-
ton’s Law of Gravitation and explained the Equivalence Principle. The Newtonian concepts of an absolute
frame of reference, plus the assumption of the separation of time and space, were shown to be invalid at
relativistic velocities. Einstein’s postulate that the laws of physics are the same in all inertial frames requires
a revolutionary change in the philosophy of time, space and reference frames which leads to a breakdown
in the Newtonian formalism of classical mechanics. By contrast, the Lagrange and Hamiltonian variational
formalisms of mechanics, plus the principle of least action, remain intact using a relativistically invariant
Lagrangian. The independence of the variational approach from the choice of reference frame is precisely what is
necessary for relativistic mechanics. The basic field equations also must remain invariant to the choice of coordinate
frame in the General Theory of Relativity, which also can be derived in terms of a relativistic
action principle. Thus the development of the Theory of Relativity unambiguously demonstrated the
superiority of the variational formulation of classical mechanics over the vectorial Newtonian formulation,
and thus the considerable effort made by Euler, Lagrange, Hamilton, Jacobi, and others in developing the
analytical variational formalism of classical mechanics finally came to fruition at the start of the 20th century.
Newton’s two crowning achievements, the Laws of Motion and the Laws of Gravitation, that had reigned
supreme since published in the Principia in 1687, were toppled from the throne by Einstein.
Emmy Noether (1882-1935) has been described as “the greatest ever woman mathematician”. In
1915 she proposed a theorem that a conservation law is associated with any differentiable symmetry of a
physical system. Noether’s theorem evolves naturally from Lagrangian and Hamiltonian mechanics and
she applied it to the four-dimensional world of general relativity. Noether’s theorem has had an important
impact in guiding the development of modern physics.
Other profound developments that had revolutionary impacts on classical mechanics were quantum
physics and quantum field theory. The 1913 model of atomic structure by Niels Bohr (1885-1962) and
the subsequent enhancements by Arnold Sommerfeld (1868-1951), were based completely on classical
Hamiltonian mechanics. The proposal of wave-particle duality by Louis de Broglie (1892-1987), made
in his 1924 thesis, was the catalyst leading to the development of quantum mechanics. In 1925 Werner
Heisenberg (1901-1976), and Max Born (1882-1970) developed a matrix representation of quantum
mechanics using non-commuting conjugate position and momenta variables.
Paul Dirac (1902-1984) showed in his Ph.D. thesis that Heisenberg’s matrix representation of quantum
physics is based on the Poisson Bracket generalization of Hamiltonian mechanics, which, in contrast to
Hamilton’s canonical equations, allows for non-commuting conjugate variables. In 1926 Erwin Schrödinger
(1887-1961) independently introduced the operational viewpoint and reinterpreted the partial differential
equation of Hamilton-Jacobi as a wave equation. His starting point was the optical-mechanical analogy of
Hamilton that is a built-in feature of the Hamilton-Jacobi theory. Schrödinger then showed that the wave
mechanics he developed, and the Heisenberg matrix mechanics, are equivalent representations of quantum
mechanics. In 1928 Dirac developed his relativistic equation of motion for the electron and pioneered the
field of quantum electrodynamics. Dirac also introduced the Lagrangian and the principle of least action to
quantum mechanics, and these ideas were developed into the path-integral formulation of quantum mechanics
and the theory of electrodynamics by Richard Feynman (1918-1988).
The concepts of wave-particle duality, and quantization of observables, both are beyond the classical
notions of infinite subdivisions in classical physics. In spite of the radical departure of quantum mechanics
from earlier classical concepts, the basic feature of the differential equations of quantal physics is their self-
adjoint character which means that they are derivable from a variational principle. Thus both the Theory of
Relativity, and quantum physics are consistent with the variational principle of mechanics, and inconsistent
with Newtonian mechanics. As a consequence Newtonian mechanics has been dislodged from the throne
it occupied since 1687, and the intellectually beautiful and powerful variational principles of analytical
mechanics have been validated.
The 2015 observation of gravitational waves is a remarkable recent confirmation of Einstein’s General
Theory of Relativity and the validity of the underlying variational principles in physics. Another advance in
physics is the understanding of the evolution of chaos in non-linear systems that has been achieved during the
past four decades. This advance is due to the availability of computers, which has reopened this interesting
branch of classical mechanics that was pioneered by Henri Poincaré about a century ago. Although classical
mechanics is the oldest and most mature branch of physics, there still remain new research opportunities in
this field of physics.
The focus of this book is to introduce the general principles of the mathematical variational principle
approach, and its applications to classical mechanics. It will be shown that the variational principles, that
were developed in classical mechanics, now play a crucial role in modern physics and mathematics, plus
many other fields of science and technology.
References:
Excellent sources of information regarding the history of major players in the field of classical mechanics
can be found on Wikipedia and in the book “The Variational Principles of Mechanics” by Lanczos [La49].
Chapter 2

Review of Newtonian mechanics

2.1 Introduction
It is assumed that the reader has been introduced to Newtonian mechanics applied to one or two point objects.
This chapter reviews Newtonian mechanics for motion of many-body systems as well as for macroscopic
sized bodies. Newton’s Law of Gravitation also is reviewed. The purpose of this review is to ensure that the
reader has a solid foundation of elementary Newtonian mechanics upon which to build the powerful analytic
Lagrangian and Hamiltonian approaches to classical dynamics.
Newtonian mechanics is based on application of Newton’s Laws of motion which assume that the concepts
of distance, time, and mass, are absolute, that is, motion is in an inertial frame. The Newtonian idea of
the complete separation of space and time, and the concept of the absoluteness of time, are violated by the
Theory of Relativity as discussed in chapter 17. However, for most practical applications, relativistic eﬀects
are negligible and Newtonian mechanics is an adequate description at low velocities. Therefore chapters
2 − 16 will assume velocities for which Newton’s laws of motion are applicable.

2.2 Newton’s Laws of motion

Newton defined a vector quantity called linear momentum p which is the product of mass and velocity.
$$\mathbf{p} = m\dot{\mathbf{r}} \tag{2.1}$$
Since the mass $m$ is a scalar quantity, the velocity vector $\dot{\mathbf{r}}$ and the linear momentum vector $\mathbf{p}$ are
collinear.
Newton’s laws, expressed in terms of linear momentum, are:
1 Law of inertia: A body remains at rest or in uniform motion unless acted upon by a force.
2 Equation of motion: A body acted upon by a force moves in such a manner that the time rate of change
of momentum equals the force.
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{2.2}$$

3 Action and reaction: If two bodies exert forces on each other, these forces are equal in magnitude and
opposite in direction.
Newton’s second law contains the essential physics relating the force F and the rate of change of linear
momentum p.
Newton’s first law, the law of inertia, is a special case of Newton’s second law in that if
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = 0 \tag{2.3}$$

then p is a constant of motion.
Newton’s third law also can be interpreted as a statement of the conservation of momentum, that is, for
a two particle system with no external forces acting,
$$\mathbf{F}_{12} = -\mathbf{F}_{21} \tag{2.4}$$

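If the forces acting on two bodies are their mutual action and reaction, then equation 2.4 simplifies to
$$\mathbf{F}_{12} + \mathbf{F}_{21} = \frac{d\mathbf{p}_1}{dt} + \frac{d\mathbf{p}_2}{dt} = \frac{d}{dt}(\mathbf{p}_1 + \mathbf{p}_2) = 0 \tag{2.5}$$
This implies that the total linear momentum ($\mathbf{P} = \mathbf{p}_1 + \mathbf{p}_2$) is a constant of motion.
Combining equations 2.1 and 2.2 leads to a second-order differential equation
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = m\frac{d^2\mathbf{r}}{dt^2} = m\ddot{\mathbf{r}} \tag{2.6}$$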
Note that the force on a body F, and the resultant acceleration a = r̈ are collinear. Appendix 2 gives
explicit expressions for the acceleration a in cartesian and curvilinear coordinate systems. The definition of
force depends on the definition of the mass $m$. Newton’s laws of motion are obeyed to a high precision for
velocities much less than the velocity of light. For example, recent experiments have shown they are obeyed
with an error in the acceleration of $\Delta a \le 5 \times 10^{-14}$ m/s$^2$.

2.3 Inertial frames of reference

An inertial frame of reference is one in which Newton’s Laws of motion are valid. It is a non-accelerated
frame of reference. An inertial frame must be homogeneous and isotropic. Physical experiments can be carried
out in different inertial reference frames. The Galilean transformation provides a means of converting between
two inertial frames of reference moving at a constant relative velocity. Consider two reference frames $O$ and
$O'$ with $O'$ moving with constant velocity $\mathbf{V}$ at time $t$. Figure 2.1 shows a Galilean transformation
which can be expressed in vector form.
$$\mathbf{r}' = \mathbf{r} - \mathbf{V}t \qquad t' = t \tag{2.7}$$
Equation 2.7 gives the boost, assuming Newton’s hypothesis that the time is invariant to change of inertial
frames of reference. The time differential of this transformation gives
$$\dot{\mathbf{r}}' = \dot{\mathbf{r}} - \mathbf{V} \qquad \ddot{\mathbf{r}}' = \ddot{\mathbf{r}} \tag{2.8}$$

Figure 2.1: Frame $O'$ moving with a constant velocity $\mathbf{V}$ with respect to frame $O$ at the time $t$.

Note that the forces in the primed and unprimed inertial frames are related by
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = m\ddot{\mathbf{r}} = m\ddot{\mathbf{r}}' = \mathbf{F}' \tag{2.9}$$

Thus Newton’s Laws of motion are invariant under a Galilean transformation, that is, the inertial mass is
unchanged under Galilean transformations. If Newton’s laws are valid in one inertial frame of reference,
then they are valid in any frame of reference in uniform motion with respect to the first frame of reference.
This invariance is called Galilean invariance. There are an infinite number of possible inertial frames all
connected by Galilean transformations.
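
As a quick numerical illustration of Galilean invariance, the following sketch applies the boost of equation 2.7 to a projectile trajectory and compares finite-difference accelerations in the two frames. The trajectory, the boost velocity, and all function names are illustrative assumptions and are not taken from the text.

```python
# Illustrative check that a Galilean boost leaves the acceleration unchanged.
# The projectile, the boost velocity V, and the helper names are assumptions.
g = 9.81                 # gravitational acceleration, m/s^2
V = (3.0, 1.5)           # constant velocity of frame O' relative to frame O, m/s


def r(t):
    """Position (x, y) of a projectile in frame O."""
    return (2.0 * t, 5.0 * t - 0.5 * g * t * t)


def r_prime(t):
    """Position in frame O' from the boost r' = r - V t, with t' = t."""
    x, y = r(t)
    return (x - V[0] * t, y - V[1] * t)


def acceleration(path, t, h=1e-3):
    """Central finite-difference estimate of the second time derivative."""
    before, here, after = path(t - h), path(t), path(t + h)
    return tuple((before[i] - 2.0 * here[i] + after[i]) / (h * h) for i in range(2))


a = acceleration(r, 0.4)
a_prime = acceleration(r_prime, 0.4)
print(a, a_prime)        # both should be close to (0, -g)
```

Because the boost only adds a term linear in time, the second derivative, and hence the force, is the same in both frames.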
Galilean invariance violates Einstein’s Theory of Relativity. In order to satisfy Einstein’s postulate
that the laws of physics are the same in all inertial frames, as well as satisfy Maxwell’s equations for
electromagnetism, it is necessary to replace the Galilean transformation by the Lorentz transformation. As
will be discussed in chapter 17, the Lorentz transformation leads to Lorentz contraction and time dilation, both
of which are related to the parameter $\gamma \equiv 1/\sqrt{1-(v/c)^2}$ where $c$ is the velocity of light in vacuum. Fortunately,
most situations in life involve velocities where $v \ll c$; for example, for a body moving at 25,000 m.p.h.
(11,111 m/s), which is the escape velocity for a body at the surface of the earth, the $\gamma$ factor differs from
unity by about $6.8 \times 10^{-10}$, which is negligible. Relativistic effects are significant only in nuclear and particle
physics as well as some exotic conditions in astrophysics. Thus, for the purpose of classical mechanics,
usually it is reasonable to assume that the Galilean transformation is valid and is well obeyed under most
practical conditions.
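
The size of the relativistic correction quoted above can be checked with a few lines of arithmetic. The sketch below evaluates the Lorentz factor minus one for the escape-velocity figure given in the text; the variable names and the use of `log1p`/`expm1` for numerical accuracy are choices made for this illustration.

```python
import math

c = 299_792_458.0        # speed of light in vacuum, m/s
v = 11_111.0             # escape-velocity figure quoted in the text, m/s

beta_squared = (v / c) ** 2
# gamma - 1 = (1 - beta^2)^(-1/2) - 1, evaluated with log1p/expm1 to avoid
# the precision loss of subtracting two numbers that are both close to 1.
gamma_minus_one = math.expm1(-0.5 * math.log1p(-beta_squared))
leading_order = 0.5 * beta_squared   # first term of the expansion in (v/c)^2

print(f"gamma - 1 = {gamma_minus_one:.3e}")
```

The result agrees with the leading-order estimate $v^2/2c^2$, which is why Newtonian mechanics is adequate at such speeds.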
2.4 First-order integrals in Newtonian mechanics

A fundamental goal of mechanics is to determine the equations of motion for an $n$-body system, where
the force $\mathbf{F}_i$ acts on the individual mass $m_i$ where $1 \le i \le n$. Newton’s second-order equation of motion,
equation 2.6, must be solved to calculate the instantaneous spatial locations, velocities, and accelerations for
each mass $m_i$ of an $n$-body system. Both $\mathbf{F}_i$ and $\ddot{\mathbf{r}}_i$ are vectors, each having three orthogonal components.
The solution of equation 2.6 involves integrating second-order equations of motion subject to a set of initial
conditions. Although this task appears simple in principle, it can be exceedingly complicated for many-body
systems. Fortunately, solution of the motion often can be simplified by exploiting three first-order integrals
of Newton’s equations of motion, that are related directly to conservation of either the linear momentum,
angular momentum, or energy of the system. In addition, for the special case of these three first-order
integrals, the internal motion of any many-body system can be factored out by a simple transformation into
the center of mass of the system. As a consequence, the following three first-order integrals are exploited
extensively in classical mechanics.

2.4.1 Linear Momentum

Newton’s Laws can be written as the differential and integral forms of the first-order time integral which
equals the change in linear momentum. That is
$$\mathbf{F}_i = \frac{d\mathbf{p}_i}{dt} \qquad \int_1^2 \mathbf{F}_i\,dt = \int_1^2 \frac{d\mathbf{p}_i}{dt}\,dt = (\mathbf{p}_2 - \mathbf{p}_1)_i \tag{2.10}$$
This allows Newton’s law of motion to be expressed directly in terms of the linear momentum $\mathbf{p}_i = m_i\dot{\mathbf{r}}_i$ of
each of the $1 \le i \le n$ bodies in the system. This first-order time integral features prominently in classical
mechanics since it connects to the important concept of linear momentum p. This first-order time integral
gives that the total linear momentum is a constant of motion when the sum of the external forces is zero.

2.4.2 Angular momentum

The angular momentum L of a particle  with linear momentum p with respect to an origin from which
the position vector r is measured, is defined by
L ≡ r × p (2.11)
The torque, or moment of the force N with respect to the same origin is defined to be
N ≡ r × F (2.12)
where r is the position vector from the origin to the point where the force F is applied. Note that the
torque N can be written as
p
N = r × (2.13)

Consider the time diﬀerential of the angular momentum, L 


L  r p

= (r × p ) = × p + r × (2.14)
   
However,
r r r
× p =  × =0 (2.15)
  
Equations 213 − 215 can be used to write the first-order time integral for angular momentum in either
diﬀerential or integral form as
Z 2 Z 2
L p L
= r × = N N  =  = (L2 − L1 ) (2.16)
  1 1 
Newton’s Law relates torque and angular momentum about the same axis. When the torque about any axis
is zero then angular momentum about that axis is a constant of motion. If the total torque is zero then the
total angular momentum, as well as the components about three orthogonal axes, all are constants.
2.4.3 Kinetic energy

The third first-order integral, that can be used for solving the equations of motion, is the first-order spatial
integral $\int_1^2 \mathbf{F}_i \cdot d\mathbf{r}_i$. Note that this spatial integral is a scalar in contrast to the first-order time integrals for
linear and angular momenta which are vectors. The work done on a mass $m_i$ by a force $\mathbf{F}_i$ in transforming
from condition 1 to 2 is defined to be
$$[W_{12}]_i \equiv \int_1^2 \mathbf{F}_i \cdot d\mathbf{r}_i \tag{2.17}$$
If $\mathbf{F}_i$ is the net resultant force acting on a particle $i$, then the integrand can be written as
$$\mathbf{F}_i \cdot d\mathbf{r}_i = \frac{d\mathbf{p}_i}{dt} \cdot d\mathbf{r}_i = m_i\frac{d\mathbf{v}_i}{dt} \cdot \frac{d\mathbf{r}_i}{dt}\,dt = m_i\frac{d\mathbf{v}_i}{dt} \cdot \mathbf{v}_i\,dt = \frac{m_i}{2}\frac{d}{dt}(\mathbf{v}_i \cdot \mathbf{v}_i)\,dt = d\left(\frac{1}{2}m_iv_i^2\right) = d[T]_i \tag{2.18}$$
where the kinetic energy of a particle $i$ is defined as
$$[T]_i \equiv \frac{1}{2}m_iv_i^2 \tag{2.19}$$
Thus the work done on the particle $i$, that is, $[W_{12}]_i$ equals the change in kinetic energy of the particle if
there is no change in other contributions to the total energy such as potential energy, heat dissipation, etc.
That is
$$[W_{12}]_i = \left[\frac{1}{2}mv_2^2 - \frac{1}{2}mv_1^2\right]_i = [T_2 - T_1]_i \tag{2.20}$$
Thus the differential, and corresponding first integral, forms of the kinetic energy can be written as
$$\mathbf{F}_i = \frac{dT_i}{d\mathbf{r}_i} \qquad \int_1^2 \mathbf{F}_i \cdot d\mathbf{r}_i = (T_2 - T_1)_i \tag{2.21}$$

If the work done on the particle is positive, then the final kinetic energy $T_2 > T_1$. Especially noteworthy is that
the kinetic energy $[T]_i$ is a scalar quantity which makes it simple to use. This first-order spatial integral is the
foundation of the analytic formulation of mechanics that underlies Lagrangian and Hamiltonian mechanics.
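
The work-energy relation of equations 2.17 and 2.20 can be illustrated numerically. The sketch below, which assumes a mass on a linear spring with made-up parameter values, integrates the work along the known trajectory and compares it with the change in kinetic energy; none of the names or numbers come from the text.

```python
import math

# Assumed illustrative system: mass m on a spring of constant k, amplitude A.
m, k, A = 2.0, 8.0, 0.5
omega = math.sqrt(k / m)


def x(t):
    """Exact position for simple harmonic motion."""
    return A * math.cos(omega * t)


def v(t):
    """Exact velocity."""
    return -A * omega * math.sin(omega * t)


def force(t):
    """Restoring force F = -k x evaluated along the trajectory."""
    return -k * x(t)


def work_done(t1, t2, steps=20000):
    """Trapezoidal estimate of W_12 = integral of F dx = integral of F v dt."""
    h = (t2 - t1) / steps
    total = 0.5 * (force(t1) * v(t1) + force(t2) * v(t2))
    for n in range(1, steps):
        t = t1 + n * h
        total += force(t) * v(t)
    return total * h


def kinetic_energy(t):
    return 0.5 * m * v(t) ** 2


t1, t2 = 0.1, 0.6
W12 = work_done(t1, t2)
dT = kinetic_energy(t2) - kinetic_energy(t1)
print(W12, dT)   # the two values should agree to within the quadrature error
```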

2.5 Conservation laws in classical mechanics

Elucidating the dynamics in classical mechanics is greatly simplified when conservation laws are applicable.
In nature, isolated many-body systems frequently conserve one or more of the first-order integrals for linear
momentum, angular momentum, and mass/energy. Note that mass and energy are coupled in the Theory
of Relativity, but for non-relativistic mechanics the conservation of mass and energy are decoupled. Other
observables such as lepton and baryon numbers are conserved, but these conservation laws usually can be
subsumed under conservation of mass for most problems in non-relativistic classical mechanics. The power
of conservation laws in calculating classical dynamics makes it useful to combine the conservation laws
with the first integrals for linear momentum, angular momentum, and work-energy, when solving problems
involving Newtonian mechanics. These three conservation laws will be derived assuming Newton’s laws of
motion; however, these conservation laws are fundamental laws of nature that apply well beyond the domain
of applicability of Newtonian mechanics.

2.6 Motion of finite-sized and many-body systems

Elementary presentations in classical mechanics discuss motion and forces involving single point particles.
However, in real life, single bodies have a finite size introducing new degrees of freedom such as rotation and
vibration, and frequently many finite-sized bodies are involved. A finite-sized body can be thought of as a
system of interacting particles such as the individual atoms of the body. The interactions between the parts
of the body can be strong which leads to rigid body motion where the positions of the particles are held
fixed with respect to each other, and the body can translate and rotate. When the interaction between the
bodies is weaker, such as for a diatomic molecule, additional vibrational degrees of relative motion between
the individual atoms are important. Newton’s third law of motion becomes especially important for such
many-body systems.
2.7 Center of mass of a many-body system

A finite sized body needs a reference point with respect to which the motion can be described. For example,
there are 8 corners of a cube that could serve as reference points, but the motion of each corner is complicated
if the cube is both translating and rotating. The treatment of the behavior of finite-sized bodies, or many-body
systems, is greatly simplified using the concept of center of mass. The center of mass is a particular fixed
point in the body that has an especially valuable property; that is, the translational motion of a finite sized
body can be treated like that of a point mass located at the center of mass. In addition the translational motion
is separable from the rotational-vibrational motion of a many-body system when the motion is described with
respect to the center of mass. Thus it is convenient at this juncture to introduce the concept of center of mass
of a many-body system.

Figure 2.2: Position vector with respect to the center of mass.

For a many-body system, the position vector $\mathbf{r}_i$, defined relative to the laboratory system, is related to the
position vector $\mathbf{r}'_i$ with respect to the center of mass, and the center-of-mass location $\mathbf{R}$ relative to the
laboratory system. That is, as shown in figure 2.2
$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.22}$$
This vector relation defines the transformation between the laboratory and center of mass systems. For
discrete and continuous systems respectively, the location of the center of mass is uniquely defined as being
where
$$\sum_i^n m_i\mathbf{r}'_i = \int_{body} \mathbf{r}'\rho\,dV = 0 \tag{Center of mass definition}$$
Define the total mass $M$ as
$$M = \sum_i^n m_i = \int_{body} \rho\,dV \tag{Total mass}$$
The average location of the system corresponds to the location of the center of mass since $\frac{1}{M}\sum_i^n m_i\mathbf{r}'_i = 0$,
that is
$$\frac{1}{M}\sum_i^n m_i\mathbf{r}_i = \mathbf{R} + \frac{1}{M}\sum_i^n m_i\mathbf{r}'_i = \mathbf{R} \tag{2.23}$$
The vector $\mathbf{R}$, which describes the location of the center of mass, depends on the origin and coordinate
system chosen. For a continuous mass distribution the location vector of the center of mass is given by
$$\mathbf{R} = \frac{1}{M}\sum_i^n m_i\mathbf{r}_i = \frac{1}{M}\int \mathbf{r}\rho\,dV \tag{2.24}$$

The center of mass can be evaluated by calculating the individual components along three orthogonal axes.
The center-of-mass frame of reference is defined as the frame for which the center of mass is stationary.
This frame of reference is especially valuable for elucidating the underlying physics which involves only the
relative motion of the many bodies. That is, the trivial translational motion of the center of mass frame,
which has no influence on the relative motion of the bodies, is factored out and can be ignored. For example,
a tennis ball (0.06 kg) approaching the earth ($6 \times 10^{24}$ kg) with velocity $v$ could be treated in three frames:
(a) assume the earth is stationary, (b) assume the tennis ball is stationary, or (c) the center-of-mass frame.
The latter frame ignores the center of mass motion which has no influence on the relative motion of the
tennis ball and the earth. The center of linear momentum and center of mass coordinate frames are identical
in Newtonian mechanics but not in relativistic mechanics as described in chapter 17.4.3.
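
The defining property of the center of mass can be verified directly for a small discrete system. The following sketch uses three arbitrary point masses (values chosen only for illustration) to compute the center-of-mass location of equation 2.24 and to confirm that the mass-weighted sum of the relative position vectors vanishes.

```python
# Arbitrary illustrative masses and laboratory-frame positions.
masses = [1.0, 2.0, 3.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 1.0)]


def center_of_mass(masses, positions):
    """R = (1/M) * sum_i m_i r_i, evaluated component by component."""
    M = sum(masses)
    return tuple(
        sum(m * r[axis] for m, r in zip(masses, positions)) / M
        for axis in range(3)
    )


R = center_of_mass(masses, positions)

# Position vectors relative to the center of mass, r'_i = r_i - R.
relative = [tuple(r[axis] - R[axis] for axis in range(3)) for r in positions]

# First moment of the mass distribution about the center of mass.
first_moment = tuple(
    sum(m * rp[axis] for m, rp in zip(masses, relative)) for axis in range(3)
)

print(R, first_moment)
```

The first moment is zero to rounding error for any choice of origin, which is what makes the center of mass a unique reference point.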
2.8 Total linear momentum of a many-body system

2.8.1 Center-of-mass decomposition
The total linear momentum P for a system of $n$ particles is given by
$$\mathbf{P} = \sum_i^n \mathbf{p}_i = \frac{d}{dt}\sum_i^n m_i\mathbf{r}_i \tag{2.25}$$
It is convenient to describe a many-body system by a position vector $\mathbf{r}'_i$ with respect to the center of mass.
$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.26}$$
That is,
$$\mathbf{P} = \sum_i^n \mathbf{p}_i = \frac{d}{dt}\sum_i^n m_i\mathbf{r}_i = \frac{d}{dt}M\mathbf{R} + \frac{d}{dt}\sum_i^n m_i\mathbf{r}'_i = \frac{d}{dt}M\mathbf{R} + 0 = M\dot{\mathbf{R}} \tag{2.27}$$
since $\sum_i^n m_i\mathbf{r}'_i = 0$ as given by the definition of the center of mass. That is,
$$\mathbf{P} = M\dot{\mathbf{R}} \tag{2.28}$$
Thus the total linear momentum for a system is the same as the momentum of a single particle of mass
$M = \sum_i^n m_i$ located at the center of mass of the system.

2.8.2 Equations of motion

The force acting on particle $i$ in an $n$-particle many-body system, can be separated into an external force
$\mathbf{F}_i^E$ plus internal forces $\mathbf{f}_{ij}$ between the $n$ particles of the system
$$\mathbf{F}_i = \mathbf{F}_i^E + \sum_{j \ne i}^n \mathbf{f}_{ij} \tag{2.29}$$
The origin of the external force is from outside of the system while the internal force is due to the mutual
interaction between the $n$ particles in the system. Newton’s Law tells us that
$$\dot{\mathbf{p}}_i = \mathbf{F}_i = \mathbf{F}_i^E + \sum_{j \ne i}^n \mathbf{f}_{ij} \tag{2.30}$$
Thus the rate of change of total momentum is
$$\dot{\mathbf{P}} = \sum_i^n \dot{\mathbf{p}}_i = \sum_i^n \mathbf{F}_i^E + \sum_i^n\sum_{j \ne i}^n \mathbf{f}_{ij} \tag{2.31}$$
Note that since the indices are dummy then
$$\sum_i^n\sum_{j \ne i}^n \mathbf{f}_{ij} = \sum_j^n\sum_{i \ne j}^n \mathbf{f}_{ji} \tag{2.32}$$
Substituting Newton’s third law $\mathbf{f}_{ij} = -\mathbf{f}_{ji}$ into equation 2.32 implies that
$$\sum_i^n\sum_{j \ne i}^n \mathbf{f}_{ij} = \sum_j^n\sum_{i \ne j}^n \mathbf{f}_{ji} = -\sum_i^n\sum_{j \ne i}^n \mathbf{f}_{ij} = 0 \tag{2.33}$$
which is satisfied only for the case where the summations equal zero. That is, for every internal force, there
is an equal and opposite reaction force that cancels that internal force.
Therefore the first-order integral for linear momentum can be written in differential and integral forms
as
$$\dot{\mathbf{P}} = \sum_i^n \mathbf{F}_i^E \qquad \int_1^2 \sum_i^n \mathbf{F}_i^E\,dt = \mathbf{P}_2 - \mathbf{P}_1 \tag{2.34}$$
The reaction of a body to an external force is equivalent to a single particle of mass $M$ located at the center
of mass assuming that the internal forces cancel due to Newton’s third law.
Note that the total linear momentum P is conserved if the net external force $\mathbf{F}^E$ is zero, that is
$$\mathbf{F}^E = \frac{d\mathbf{P}}{dt} = 0 \tag{2.35}$$
Therefore the P of the center of mass is a constant. Moreover, if the component of the force along any
direction $\hat{\mathbf{e}}$ is zero, that is,
$$\mathbf{F}^E \cdot \hat{\mathbf{e}} = \frac{d(\mathbf{P} \cdot \hat{\mathbf{e}})}{dt} = 0 \tag{2.36}$$
then $\mathbf{P} \cdot \hat{\mathbf{e}}$ is a constant. This fact is used frequently to solve problems involving motion in a constant force
field. For example, in the earth’s gravitational field, the momentum of an object moving in vacuum in the
vertical direction is time dependent because of the gravitational force, whereas the horizontal component of
momentum is constant if no forces act in the horizontal direction.

2.1 Example: Exploding cannon shell

Consider a cannon shell of mass $M$ moving along a parabolic trajectory in the earth’s gravitational field.
An internal explosion, generating an amount $E$ of mechanical energy, blows the shell into two parts. One
part of mass $kM$, where $k < 1$, continues moving along the same trajectory with velocity $v'$ while the other
part is reduced to rest. Find the velocity of the mass $kM$ immediately after the explosion.
It is important to remember that the energy release $E$ is given in the center of mass. If the velocity of the
shell immediately before the explosion is $v$ and $v'$ is the velocity of the $kM$ part immediately after the
explosion, then energy conservation gives that $\frac{1}{2}Mv^2 + E = \frac{1}{2}kMv'^2$.
The conservation of linear momentum gives $Mv = kMv'$. Eliminating $v$ from these equations gives
$$v' = \sqrt{\frac{2E}{k(1-k)M}}$$

[Figure: Exploding cannon shell]
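
The result of this example, $v' = \sqrt{2E/[k(1-k)M]}$, can be checked by substituting it back into the two conservation laws. The sketch below does this for one set of made-up values of $M$, $k$ and $E$; the numbers and the function name `fragment_speed` are assumptions made for the illustration.

```python
import math


def fragment_speed(E, M, k):
    """Speed v' of the fragment of mass k*M just after the explosion."""
    return math.sqrt(2.0 * E / (k * (1.0 - k) * M))


# Made-up values: shell mass (kg), mass fraction, released energy (J).
M, k, E = 10.0, 0.4, 3000.0

v_after = fragment_speed(E, M, k)
v_before = k * v_after                  # momentum conservation: M v = k M v'

energy_before = 0.5 * M * v_before ** 2 + E
energy_after = 0.5 * k * M * v_after ** 2

print(v_before, v_after, energy_before, energy_after)
```

For these values the fragment leaves at 50 m/s, the shell arrived at 20 m/s, and both sides of the energy balance equal 5000 J.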

2.2 Example: Billiard-ball collisions
A billiard ball with mass $m$ and incident velocity $v$ collides with an identical stationary ball. Assume that
the balls bounce off each other elastically in such a way that the incident ball is deflected at a scattering angle
$\theta$ to the incident direction. Calculate the final velocities $v_1$ and $v_2$ of the two balls and the scattering angle
$\phi$ of the target ball. The conservation of linear momentum in the incident direction $x$, and the perpendicular
direction give
$$v = v_1\cos\theta + v_2\cos\phi \qquad 0 = v_1\sin\theta - v_2\sin\phi$$
Energy conservation gives
$$\frac{m}{2}v^2 = \frac{m}{2}v_1^2 + \frac{m}{2}v_2^2$$
Solving these three equations gives $\phi = 90^\circ - \theta$, that is, the balls bounce off perpendicular to each other in
the laboratory frame. The final velocities are
$$v_1 = v\cos\theta \qquad v_2 = v\sin\theta$$
2.9 Angular momentum of a many-body system

2.9.1 Center-of-mass decomposition
As was the case for linear momentum, for a many-body system it is possible to separate the angular momentum
into two components. One component is the angular momentum about the center of mass and the
other component is the angular motion of the center of mass about the origin of the coordinate system. This
separation is done by describing the angular momentum of a many-body system using a position vector $\mathbf{r}'_i$
with respect to the center of mass plus the vector location $\mathbf{R}$ of the center of mass.
$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.37}$$
The total angular momentum
$$\mathbf{L} = \sum_i^n \mathbf{L}_i = \sum_i^n \mathbf{r}_i \times \mathbf{p}_i = \sum_i^n (\mathbf{R} + \mathbf{r}'_i) \times m_i\left(\dot{\mathbf{R}} + \dot{\mathbf{r}}'_i\right) = \sum_i^n m_i\left[\mathbf{r}'_i \times \dot{\mathbf{r}}'_i + \mathbf{r}'_i \times \dot{\mathbf{R}} + \mathbf{R} \times \dot{\mathbf{r}}'_i + \mathbf{R} \times \dot{\mathbf{R}}\right] \tag{2.38}$$
Note that if the position vectors are with respect to the center of mass, then $\sum_i^n m_i\mathbf{r}'_i = 0$ resulting in the
middle two terms in the bracket being zero, that is,
$$\mathbf{L} = \sum_i^n \mathbf{r}'_i \times \mathbf{p}'_i + \mathbf{R} \times \mathbf{P} \tag{2.39}$$

The total angular momentum separates into two terms, the angular momentum about the center of mass,
plus the angular momentum of the center of mass about the origin of the axis system. This factoring of the
angular momentum only applies for the center of mass. This is called Samuel König’s first theorem.
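
König’s first theorem can be checked numerically for any set of point masses. The sketch below, using arbitrary illustrative masses, positions and velocities, compares the total angular momentum about the origin with the sum of the angular momentum about the center of mass and the angular momentum of the center of mass; all helper names are assumptions made for this example.

```python
def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def weighted_mean(masses, vectors):
    """Mass-weighted average, used for both R and the center-of-mass velocity."""
    M = sum(masses)
    return tuple(sum(m * v[k] for m, v in zip(masses, vectors)) / M for k in range(3))


def total_angular_momentum(masses, positions, velocities):
    """Sum over particles of r_i x (m_i v_i)."""
    L = [0.0, 0.0, 0.0]
    for m, r, v in zip(masses, positions, velocities):
        l_i = cross(r, tuple(m * c for c in v))
        L = [L[k] + l_i[k] for k in range(3)]
    return tuple(L)


# Arbitrary illustrative system.
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
velocities = [(0.0, 1.0, 0.0), (1.0, 0.0, -1.0), (2.0, 1.0, 0.0)]

M = sum(masses)
R = weighted_mean(masses, positions)
V = weighted_mean(masses, velocities)

rel_pos = [tuple(r[k] - R[k] for k in range(3)) for r in positions]
rel_vel = [tuple(v[k] - V[k] for k in range(3)) for v in velocities]

L_lab = total_angular_momentum(masses, positions, velocities)
L_about_cm = total_angular_momentum(masses, rel_pos, rel_vel)
L_of_cm = cross(R, tuple(M * c for c in V))

print(L_lab, tuple(L_about_cm[k] + L_of_cm[k] for k in range(3)))
```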

2.9.2 Equations of motion

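The time derivative of the angular momentum
$$\dot{\mathbf{L}}_i = \frac{d}{dt}(\mathbf{r}_i \times \mathbf{p}_i) = \dot{\mathbf{r}}_i \times \mathbf{p}_i + \mathbf{r}_i \times \dot{\mathbf{p}}_i \tag{2.40}$$
But
$$\dot{\mathbf{r}}_i \times \mathbf{p}_i = m_i\dot{\mathbf{r}}_i \times \dot{\mathbf{r}}_i = 0 \tag{2.41}$$
Thus the torque $\mathbf{N}_i$ acting on mass $i$ is given by
$$\mathbf{N}_i = \dot{\mathbf{L}}_i = \mathbf{r}_i \times \dot{\mathbf{p}}_i = \mathbf{r}_i \times \mathbf{F}_i \tag{2.42}$$
Consider that the resultant force acting on particle $i$ in this $n$-particle system can be separated into an
external force $\mathbf{F}_i^E$ plus internal forces between the $n$ particles of the system
$$\mathbf{F}_i = \mathbf{F}_i^E + \sum_{j \ne i}^n \mathbf{f}_{ij} \tag{2.43}$$
The origin of the external force is from outside of the system while the internal force is due to the interaction
with the other $n - 1$ particles in the system. Newton’s Law tells us that
$$\dot{\mathbf{p}}_i = \mathbf{F}_i = \mathbf{F}_i^E + \sum_{j \ne i}^n \mathbf{f}_{ij} \tag{2.44}$$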
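The rate of change of total angular momentum is
$$\dot{\mathbf{L}} = \sum_i^n \dot{\mathbf{L}}_i = \sum_i^n \mathbf{r}_i \times \dot{\mathbf{p}}_i = \sum_i^n \mathbf{r}_i \times \mathbf{F}_i^E + \sum_i^n\sum_{j \ne i}^n \mathbf{r}_i \times \mathbf{f}_{ij} \tag{2.45}$$
Since $\mathbf{f}_{ij} = -\mathbf{f}_{ji}$ the last expression can be written as
$$\sum_i^n\sum_{j \ne i}^n \mathbf{r}_i \times \mathbf{f}_{ij} = \sum_i^n\sum_{j < i}^n (\mathbf{r}_i - \mathbf{r}_j) \times \mathbf{f}_{ij} \tag{2.46}$$
Note that $(\mathbf{r}_i - \mathbf{r}_j)$ is the vector $\mathbf{r}_{ij}$ connecting $j$ to $i$. For central forces the force vector $\mathbf{f}_{ij} = f_{ij}\hat{\mathbf{r}}_{ij}$, thus
$$\sum_i^n\sum_{j < i}^n (\mathbf{r}_i - \mathbf{r}_j) \times \mathbf{f}_{ij} = \sum_i^n\sum_{j < i}^n \mathbf{r}_{ij} \times f_{ij}\hat{\mathbf{r}}_{ij} = 0 \tag{2.47}$$
That is, for central internal forces the total internal torque on a system of particles is zero, and the rate of
change of total angular momentum for central internal forces becomes
$$\dot{\mathbf{L}} = \sum_i^n \mathbf{r}_i \times \mathbf{F}_i^E = \sum_i^n \mathbf{N}_i^E = \mathbf{N}^E \tag{2.48}$$
where $\mathbf{N}^E$ is the net external torque acting on the system. Equation 2.48 leads to the differential and integral
forms of the first integral relating the total angular momentum to total external torque.
$$\dot{\mathbf{L}} = \mathbf{N}^E \qquad \int_1^2 \mathbf{N}^E\,dt = \mathbf{L}_2 - \mathbf{L}_1 \tag{2.49}$$
Angular momentum conservation occurs in many problems involving zero external torques $\mathbf{N}^E = 0$ plus
two-body central forces $\mathbf{F} = f(r)\hat{\mathbf{r}}$ since the torque on the particle about the center of the force is zero
$$\mathbf{N} = \mathbf{r} \times \mathbf{F} = f(r)[\mathbf{r} \times \hat{\mathbf{r}}] = 0 \tag{2.50}$$
Examples are the central gravitational force for stellar or planetary systems in astrophysics, and the central
electrostatic force manifest for motion of electrons in the atom. In addition, the component of angular
momentum about any axis $\mathbf{L} \cdot \hat{\mathbf{e}}$ is conserved if the net external torque about that axis $\mathbf{N} \cdot \hat{\mathbf{e}} = 0$.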
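
The cancellation of internal forces and internal torques for central forces can be demonstrated with a small numerical example. The sketch below assumes three particles interacting through pairwise attractive inverse-square forces with the coupling constant set to one; the masses, positions and helper names are illustrative assumptions.

```python
import math


def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


# Arbitrary illustrative masses and positions.
masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.5, -1.0, 3.0)]


def internal_force(i, j):
    """Force on particle i due to particle j: attractive, inverse-square,
    directed along the line joining the two particles (a central force)."""
    d = tuple(positions[j][k] - positions[i][k] for k in range(3))
    dist = math.sqrt(sum(c * c for c in d))
    strength = masses[i] * masses[j] / dist ** 3   # coupling constant set to 1
    return tuple(strength * c for c in d)


net_force = [0.0, 0.0, 0.0]
net_torque = [0.0, 0.0, 0.0]
n = len(masses)
for i in range(n):
    for j in range(n):
        if j == i:
            continue
        f_ij = internal_force(i, j)
        torque = cross(positions[i], f_ij)     # torque about the origin
        for k in range(3):
            net_force[k] += f_ij[k]
            net_torque[k] += torque[k]

print(net_force, net_torque)   # both sums vanish to rounding error
```

The force sum vanishes for any internal forces obeying Newton’s third law, while the torque sum vanishes only because the assumed forces act along the line joining each pair.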

2.3 Example: Bolas thrown by gaucho

Consider the bolas thrown by a gaucho to catch cattle. This is a system with conserved linear and angular
momentum about certain axes. When the bolas leaves the gaucho’s hand the center of mass has a linear
velocity $\mathbf{V}_0$ plus an angular momentum about the center of mass of $\mathbf{L}$. If no external torques act, then the
center of mass of the bolas will follow a typical ballistic trajectory in the earth’s gravitational field while the
angular momentum vector $\mathbf{L}$ is conserved, that is, both in magnitude and direction. The tension in the ropes
connecting the three balls does not impact the motion of the system as long as the ropes do not snap due to
centrifugal forces.

[Figure: Bolas thrown by a gaucho]
2.10 Work and kinetic energy for a many-body system

2.10.1 Center-of-mass kinetic energy
For a many-body system the position vector $\mathbf{r}'_i$ with respect to the center of mass is given by
$$\mathbf{r}_i = \mathbf{R} + \mathbf{r}'_i \tag{2.51}$$
The location of the center of mass is uniquely defined as being at the location where $\int \mathbf{r}'\rho\,dV = 0$. The
velocity of the $i$th particle can be expressed in terms of the velocity of the center of mass $\dot{\mathbf{R}}$ plus the velocity
of the particle with respect to the center of mass $\dot{\mathbf{r}}'_i$. That is,
$$\dot{\mathbf{r}}_i = \dot{\mathbf{R}} + \dot{\mathbf{r}}'_i \tag{2.52}$$
The total kinetic energy $T$ is
$$T = \sum_i^n \frac{1}{2}m_iv_i^2 = \sum_i^n \frac{1}{2}m_i\dot{\mathbf{r}}_i \cdot \dot{\mathbf{r}}_i = \sum_i^n \frac{1}{2}m_i\dot{\mathbf{r}}'_i \cdot \dot{\mathbf{r}}'_i + \left(\sum_i^n m_i\dot{\mathbf{r}}'_i\right) \cdot \dot{\mathbf{R}} + \sum_i^n \frac{1}{2}m_i\dot{\mathbf{R}} \cdot \dot{\mathbf{R}} \tag{2.53}$$
For the special case of the center of mass, the middle term is zero since, by definition of the center of mass,
$\sum_i^n m_i\dot{\mathbf{r}}'_i = 0$. Therefore
$$T = \sum_i^n \frac{1}{2}m_iv_i'^2 + \frac{1}{2}MV^2 \tag{2.54}$$
Thus the total kinetic energy of the system is equal to the sum of the kinetic energy of a mass $M$ moving
with the center of mass velocity plus the kinetic energy of motion of the individual particles relative to the
center of mass. This is called Samuel König’s second theorem.
Note that for a fixed center-of-mass energy, the total kinetic energy $T$ has a minimum value of $\sum_i^n \frac{1}{2}m_iv_i'^2$
when the velocity of the center of mass $V = 0$. For a given internal excitation energy, the minimum energy
required to accelerate colliding bodies occurs when the colliding bodies have identical, but opposite, linear
momenta. That is, when the center-of-mass velocity $V = 0$.
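
König’s second theorem can be confirmed numerically in the same spirit. The sketch below, with arbitrary illustrative masses and velocities, compares the total kinetic energy in the laboratory frame with the sum of the kinetic energy relative to the center of mass and the kinetic energy of the total mass moving at the center-of-mass velocity; the names are assumptions made for this example.

```python
def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))


# Arbitrary illustrative masses and laboratory-frame velocities.
masses = [1.0, 2.0, 3.0]
velocities = [(0.0, 1.0, 0.0), (1.0, 0.0, -1.0), (2.0, 1.0, 0.0)]

M = sum(masses)
# Center-of-mass velocity V = (1/M) * sum_i m_i v_i.
V = tuple(sum(m * v[k] for m, v in zip(masses, velocities)) / M for k in range(3))

# Velocities relative to the center of mass, v'_i = v_i - V.
rel_vel = [tuple(v[k] - V[k] for k in range(3)) for v in velocities]

T_lab = sum(0.5 * m * dot(v, v) for m, v in zip(masses, velocities))
T_internal = sum(0.5 * m * dot(v, v) for m, v in zip(masses, rel_vel))
T_cm = 0.5 * M * dot(V, V)

print(T_lab, T_internal + T_cm)
```

Since the center-of-mass term is never negative, the internal kinetic energy is the smallest total kinetic energy the system can have for the same relative motion.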

2.10.2 Conservative forces and potential energy

In general, the line integral of a force field F, that is, $\int_1^2 \mathbf{F} \cdot d\mathbf{r}$, is both path and time dependent. However,
an important class of forces, called conservative forces, exist for which the following two facts are obeyed.
1) Time independence:
The force depends only on the particle position r, that is, it does not depend on velocity or time.
2) Path independence:
For any two points $P_1$ and $P_2$, the work done by F is independent of the path taken between $P_1$ and $P_2$.
If forces are path independent, then it is possible to define a scalar field, called potential energy, denoted
by $U(\mathbf{r})$, that is only a function of position. The path independence can be expressed by noting that the
integral around a closed loop is zero. That is
$$\oint \mathbf{F} \cdot d\mathbf{r} = 0 \tag{2.55}$$
Applying Stokes theorem for a path-independent force leads to the alternate statement that the curl is zero.
See appendix 33
$$\nabla \times \mathbf{F} = 0 \tag{2.56}$$
Note that the vector product of two del operators $\nabla$ acting on a scalar field $U$ equals
$$\nabla \times \nabla U = 0 \tag{2.57}$$
Thus it is possible to express a path-independent force field as the gradient of a scalar field, $U$, that is
$$\mathbf{F} = -\nabla U \tag{2.58}$$

Then the spatial integral
$$\int_1^2 \mathbf{F} \cdot d\mathbf{r} = -\int_1^2 (\nabla U) \cdot d\mathbf{r} = U_1 - U_2 \tag{2.59}$$
Thus for a path-independent force, the work done on the particle is given by the change in potential energy if there is no change in kinetic energy. For example, if an object is lifted against the gravitational field, then work is done on the particle and the final potential energy $U_2$ exceeds the initial potential energy $U_1$.

2.10.3 Total mechanical energy

The total mechanical energy $E$ of a particle is defined as the sum of the kinetic and potential energies.
$$E = T + U \tag{2.60}$$
Note that the potential energy is defined only to within an additive constant since the force $\mathbf{F} = -\nabla U$ depends only on the difference in potential energy. Similarly, the kinetic energy is not absolute since any inertial frame of reference can be used to describe the motion and the velocity of a particle depends on the relative velocities of inertial frames. Thus the total mechanical energy $E = T + U$ is not absolute.
If a single particle is subject to several path-independent forces, such as gravity, linear restoring forces, etc., then a potential energy $U_i$ can be ascribed to each of the $n$ forces where for each force $\mathbf{F}_i = -\nabla U_i$. In contrast to the forces, which add vectorially, these scalar potential energies are additive, $U = \sum_i^n U_i$. Thus the total mechanical energy for $n$ potential energies equals
$$E = T + U(\mathbf{r}) = T + \sum_i^n U_i(\mathbf{r}) \tag{2.61}$$


The time derivative of the total mechanical energy $E = T + U$ equals
$$\frac{dE}{dt} = \frac{dT}{dt} + \frac{dU}{dt} \tag{2.62}$$
Equation 2.18 gave that $dT = \mathbf{F} \cdot d\mathbf{r}$. Thus, the first term in equation 2.62 equals
$$\frac{dT}{dt} = \mathbf{F} \cdot \frac{d\mathbf{r}}{dt} \tag{2.63}$$
The potential energy can be a function of both position and time. Thus the time derivative of the potential energy due to change in both time and position is given as
$$\frac{dU}{dt} = \sum_i \frac{\partial U}{\partial x_i} \frac{dx_i}{dt} + \frac{\partial U}{\partial t} = (\nabla U) \cdot \frac{d\mathbf{r}}{dt} + \frac{\partial U}{\partial t} \tag{2.64}$$
The time derivative of the total mechanical energy is given using equations 2.63 and 2.64 in equation 2.62
$$\frac{dE}{dt} = \frac{dT}{dt} + \frac{dU}{dt} = \mathbf{F} \cdot \frac{d\mathbf{r}}{dt} + (\nabla U) \cdot \frac{d\mathbf{r}}{dt} + \frac{\partial U}{\partial t} = \left[\mathbf{F} + (\nabla U)\right] \cdot \frac{d\mathbf{r}}{dt} + \frac{\partial U}{\partial t} \tag{2.65}$$
Note that if the field is path independent, that is $\nabla \times \mathbf{F} = 0$, then the force and potential are related by
$$\mathbf{F} = -\nabla U \tag{2.66}$$
Therefore, for path independent forces, the first term in the time derivative of the total energy in equation 2.65 is zero. That is,
$$\frac{dE}{dt} = \frac{\partial U}{\partial t} \tag{2.67}$$
In addition, when the potential energy $U$ is not an explicit function of time, then $\frac{\partial U}{\partial t} = 0$ and thus the total energy is conserved. That is, for the combination of (a) path independence plus (b) time independence, the total energy of a conservative field is conserved.

Note that there are cases where the concept of potential is still useful even when it is time dependent, provided that path independence applies, i.e. $\mathbf{F} = -\nabla U$ at any instant. Examples are a Coulomb field problem where the charges are slowly changing due to leakage etc., or a peripheral collision between two charged bodies such as nuclei.

2.4 Example: Central force

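A particle of mass $m$ moves along a trajectory given by $x = x_0 \cos \omega_1 t$ and $y = y_0 \sin \omega_2 t$.
a) Find the $x$ and $y$ components of the force and determine the condition for which the force is a central force.
Differentiating with respect to time gives
$$\dot{x} = -x_0 \omega_1 \sin(\omega_1 t) \qquad \ddot{x} = -x_0 \omega_1^2 \cos(\omega_1 t)$$
$$\dot{y} = +y_0 \omega_2 \cos(\omega_2 t) \qquad \ddot{y} = -y_0 \omega_2^2 \sin(\omega_2 t)$$
Newton’s second law gives
$$\mathbf{F} = m(\ddot{x}\hat{\imath} + \ddot{y}\hat{\jmath}) = -m\left[x_0 \omega_1^2 \cos(\omega_1 t)\hat{\imath} + y_0 \omega_2^2 \sin(\omega_2 t)\hat{\jmath}\right] = -m\left[\omega_1^2 x \hat{\imath} + \omega_2^2 y \hat{\jmath}\right]$$
Note that if $\omega_1 = \omega_2 = \omega$ then
$$\mathbf{F} = -m\omega^2\left[x\hat{\imath} + y\hat{\jmath}\right] = -m\omega^2 \mathbf{r}$$
That is, it is a central force if $\omega_1 = \omega_2 = \omega$.
b) Find the potential energy as a function of $x$ and $y$.
Since
$$\mathbf{F} = -\nabla U = -\left[\frac{\partial U}{\partial x}\hat{\imath} + \frac{\partial U}{\partial y}\hat{\jmath}\right]$$
then
$$U = \frac{1}{2} m \left(\omega_1^2 x^2 + \omega_2^2 y^2\right)$$
assuming that $U = 0$ at the origin.
c) Determine the total energy of the particle and show that it is conserved.
The total energy
$$E = T + U = \frac{1}{2} m \left(\dot{x}^2 + \dot{y}^2\right) + \frac{1}{2} m \left(\omega_1^2 x^2 + \omega_2^2 y^2\right) = \frac{1}{2} m \left(x_0^2 \omega_1^2 + y_0^2 \omega_2^2\right)$$
since $\cos^2\theta + \sin^2\theta = 1$. Thus the total energy $E$ is a constant and is conserved.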
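The energy conservation in part c) can be checked numerically. The following is a minimal sketch that evaluates $T + U$ along the given trajectory; the parameter values (mass, amplitudes, frequencies) are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

def total_energy(t, m=1.0, x0=2.0, y0=0.5, w1=1.3, w2=0.7):
    """Total energy E = T + U along x = x0 cos(w1 t), y = y0 sin(w2 t)."""
    x, y = x0 * np.cos(w1 * t), y0 * np.sin(w2 * t)
    vx, vy = -x0 * w1 * np.sin(w1 * t), y0 * w2 * np.cos(w2 * t)
    kinetic = 0.5 * m * (vx**2 + vy**2)
    potential = 0.5 * m * (w1**2 * x**2 + w2**2 * y**2)
    return kinetic + potential

t = np.linspace(0.0, 50.0, 2001)
energy = total_energy(t)
# Closed-form value (1/2) m (x0^2 w1^2 + y0^2 w2^2) for the default parameters
expected = 0.5 * 1.0 * (2.0**2 * 1.3**2 + 0.5**2 * 0.7**2)
print(energy.min(), energy.max(), expected)
```

The minimum and maximum of the sampled energy agree with the closed-form value to rounding error, even though $\omega_1 \neq \omega_2$ so that the force is not central.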

2.10.4 Total mechanical energy for conservative systems

Equation 220 showed that, using Newton’s second law, F = p   the first-order spatial integral gives that
the work done, 12  is related to the change in the kinetic energy. That is,
Z 2
1 1
12 ≡ F · r = 22 − 12 = 2 − 1 (2.68)
1 2 2
The work done 12 also can be evaluated in terms of the known forces F in the spatial integral.
Consider that the resultant force acting on particle  in this -particle system can be separated into an
external force F
 plus internal forces between the  particles of the system

X
F = F
 + f (2.69)

6=

The origin of the external force is from outside of the system while the internal force is due to the interaction
with the other  − 1 particles in the system. Newton’s Law tells us that

X
ṗ = F = F
 + f (2.70)

6=

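The work done on the system by a force moving from configuration $1 \rightarrow 2$ is given by
$$W_{1 \rightarrow 2} = \sum_i^n \int_1^2 \mathbf{F}_i^E \cdot d\mathbf{r}_i + \sum_i^n \sum_{j \neq i}^n \int_1^2 \mathbf{f}_{ij} \cdot d\mathbf{r}_i \tag{2.71}$$
Since $\mathbf{f}_{ij} = -\mathbf{f}_{ji}$ then
$$W_{1 \rightarrow 2} = \sum_i^n \int_1^2 \mathbf{F}_i^E \cdot d\mathbf{r}_i + \sum_i^n \sum_{j < i}^n \int_1^2 \mathbf{f}_{ij} \cdot (d\mathbf{r}_i - d\mathbf{r}_j) \tag{2.72}$$
where $d\mathbf{r}_i - d\mathbf{r}_j = d\mathbf{r}_{ij}$ and $\mathbf{r}_{ij}$ is the vector from $j$ to $i$.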

Assume that both the external and internal forces are conservative, and thus can be derived from time independent potentials, that is
$$\mathbf{F}_i^E = -\nabla_i U_i^E \tag{2.73}$$
$$\mathbf{f}_{ij} = -\nabla_i U_{ij}^{int} \tag{2.74}$$
Then
$$W_{1 \rightarrow 2} = -\sum_i^n \int_1^2 \nabla_i U_i^E \cdot d\mathbf{r}_i - \sum_i^n \sum_{j < i}^n \int_1^2 \nabla_{ij} U_{ij}^{int} \cdot d\mathbf{r}_{ij}$$
$$= \sum_i^n U_i^E(1) - \sum_i^n U_i^E(2) + \sum_i^n \sum_{j < i}^n U_{ij}^{int}(1) - \sum_i^n \sum_{j < i}^n U_{ij}^{int}(2)$$
$$= U^E(1) - U^E(2) + U^{int}(1) - U^{int}(2) \tag{2.75}$$
Define the total external potential energy,
$$U^E = \sum_i^n U_i^E \tag{2.76}$$
and the total internal energy
$$U^{int} = \sum_i^n \sum_{j < i}^n U_{ij}^{int} \tag{2.77}$$
Equating the two equivalent equations for $W_{1 \rightarrow 2}$, that is 2.68 and 2.75, gives that
$$W_{1 \rightarrow 2} = T_2 - T_1 = U^E(1) - U^E(2) + U^{int}(1) - U^{int}(2) \tag{2.78}$$
Regrouping the terms in equation 2.78 gives
$$T_1 + U^E(1) + U^{int}(1) = T_2 + U^E(2) + U^{int}(2)$$
This shows that, for conservative forces, the total energy is conserved and is given by
$$E = T + U^E + U^{int} \tag{2.79}$$

The three first-order integrals for linear momentum, angular momentum, and energy provide powerful approaches for solving the motion of Newtonian systems due to the applicability of conservation laws for the corresponding linear and angular momentum plus energy conservation for conservative forces. In addition, the important concept of center-of-mass motion naturally separates out for these three first-order integrals. Although these conservation laws were derived assuming Newton’s Laws of motion, they are more generally applicable and surpass the range of validity of Newton’s Laws of motion. For example, in 1930 Pauli postulated the existence of the neutrino, which Fermi subsequently incorporated into his theory of $\beta$-decay, in order to account for the apparent non-conservation of energy and momentum in $\beta$-decay, because they did not wish to relinquish the concepts of energy and momentum conservation. The neutrino was first detected in 1956, confirming the correctness of this hypothesis.

2.11 Virial Theorem

The Virial theorem is an important theorem for a system of moving particles both in classical physics and quantum physics. The Virial Theorem is useful when considering a collection of many particles and has a special importance to central-force motion. For a general system of mass points with position vectors $\mathbf{r}_i$ and applied forces $\mathbf{F}_i$, consider the scalar product $G$
$$G \equiv \sum_i \mathbf{p}_i \cdot \mathbf{r}_i \tag{2.80}$$
where $i$ sums over all particles. The time derivative of $G$ is
$$\frac{dG}{dt} = \sum_i \mathbf{p}_i \cdot \dot{\mathbf{r}}_i + \sum_i \dot{\mathbf{p}}_i \cdot \mathbf{r}_i \tag{2.81}$$
However,
$$\sum_i \mathbf{p}_i \cdot \dot{\mathbf{r}}_i = \sum_i m_i \dot{\mathbf{r}}_i \cdot \dot{\mathbf{r}}_i = \sum_i m_i v_i^2 = 2T \tag{2.82}$$
Also, since $\dot{\mathbf{p}}_i = \mathbf{F}_i$
$$\sum_i \dot{\mathbf{p}}_i \cdot \mathbf{r}_i = \sum_i \mathbf{F}_i \cdot \mathbf{r}_i \tag{2.83}$$
Thus
$$\frac{dG}{dt} = 2T + \sum_i \mathbf{F}_i \cdot \mathbf{r}_i \tag{2.84}$$
The time average over a period $\tau$ is
$$\frac{1}{\tau} \int_0^\tau \frac{dG}{dt} dt = \frac{G(\tau) - G(0)}{\tau} = \langle 2T \rangle + \left\langle \sum_i \mathbf{F}_i \cdot \mathbf{r}_i \right\rangle \tag{2.85}$$
where the $\langle \rangle$ brackets refer to the time average. Note that if the motion is periodic and the chosen time $\tau$ equals a multiple of the period, then $\frac{G(\tau) - G(0)}{\tau} = 0$. Even if the motion is not periodic, if the coordinates and velocities of all the particles remain finite, then there is an upper bound to $G$. This implies that choosing $\tau \rightarrow \infty$ means that $\frac{G(\tau) - G(0)}{\tau} \rightarrow 0$. In both cases the left-hand side of the equation tends to zero giving the Virial theorem
$$\langle T \rangle = -\frac{1}{2} \left\langle \sum_i \mathbf{F}_i \cdot \mathbf{r}_i \right\rangle \tag{2.86}$$
The right-hand side of this equation is called the Virial of the system. For a single particle subject to a conservative central force $\mathbf{F} = -\nabla U$ the Virial theorem equals
$$\langle T \rangle = \frac{1}{2} \langle \nabla U \cdot \mathbf{r} \rangle = \frac{1}{2} \left\langle r \frac{\partial U}{\partial r} \right\rangle \tag{2.87}$$
If the potential is of the form $U = kr^{n+1}$, that is, $F = -k(n+1)r^n$, then $r \frac{\partial U}{\partial r} = (n+1)U$. Thus for a single particle in a central potential $U = kr^{n+1}$ the Virial theorem reduces to
$$\langle T \rangle = \frac{n+1}{2} \langle U \rangle \tag{2.88}$$
The following two special cases are of considerable importance in physics.
Hooke’s Law: Note that for a linear restoring force $n = 1$ then
$$\langle T \rangle = +\langle U \rangle \qquad (n = 1)$$
You may be familiar with this fact for simple harmonic motion where the average kinetic and potential energies are the same and both equal half of the total energy.

Inverse-square law: The other interesting case is for the inverse square law $n = -2$ where
$$\langle T \rangle = -\frac{1}{2} \langle U \rangle \qquad (n = -2)$$
The Virial theorem is useful for solving problems in that knowing the exponent $n$ of the field makes it possible to write down directly the average total energy in the field. For example, for $n = -2$
$$\langle E \rangle = \langle T \rangle + \langle U \rangle = -\frac{1}{2} \langle U \rangle + \langle U \rangle = \frac{1}{2} \langle U \rangle \tag{2.89}$$
This occurs for the Bohr model of the hydrogen atom where the kinetic energy of the bound electron is half the magnitude of the (negative) potential energy. The same result occurs for planetary motion in the solar system.
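The inverse-square result can be illustrated numerically. The sketch below integrates one period of a bound orbit in the potential $U = -k/r$ with a velocity-Verlet scheme and compares the time averages of $T$ and $U$. The units ($k = m = 1$), the initial conditions, and the choice of integrator are assumptions made for illustration only.

```python
import numpy as np

def kepler_virial_ratio(v0=0.8, steps=20000):
    """Return <T>/<U> averaged over one period of a bound orbit in U = -k/r.

    Units are chosen so that k = m = 1 (an assumption for illustration).
    """
    k = m = 1.0
    r = np.array([1.0, 0.0])
    v = np.array([0.0, v0])
    energy = 0.5 * m * v @ v - k / np.linalg.norm(r)
    a = -k / (2.0 * energy)                       # semi-major axis of the ellipse
    period = 2.0 * np.pi * np.sqrt(m * a**3 / k)  # Kepler's third law
    dt = period / steps

    def accel(pos):
        return -k * pos / (m * np.linalg.norm(pos) ** 3)

    t_sum = u_sum = 0.0
    acc = accel(r)
    for _ in range(steps):
        t_sum += 0.5 * m * v @ v
        u_sum += -k / np.linalg.norm(r)
        # one velocity-Verlet step
        v_half = v + 0.5 * dt * acc
        r = r + dt * v_half
        acc = accel(r)
        v = v_half + 0.5 * dt * acc
    return (t_sum / steps) / (u_sum / steps)

ratio = kepler_virial_ratio()
print(ratio)   # expected to be close to -0.5
```

Averaging over exactly one period matters here: over a fraction of an orbit the ratio $\langle T \rangle / \langle U \rangle$ differs from $-\frac{1}{2}$ because $G(\tau) - G(0)$ does not vanish.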

2.5 Example: The ideal gas law

The Virial theorem deals with average properties and has applications to statistical mechanics. Consider an ideal gas. According to the Equipartition theorem the average kinetic energy per atom in an ideal gas is $\frac{3}{2} kT$ where $T$ is the absolute temperature and $k$ is the Boltzmann constant. Thus the average total kinetic energy for $N$ atoms is $\langle KE \rangle = \frac{3}{2} NkT$. The right-hand side of the Virial theorem contains the force $\mathbf{F}_i$. For an ideal gas it is assumed that there are no interaction forces between atoms, that is, the only force is the force of constraint of the walls of the pressure vessel. The pressure $P$ is force per unit area and thus the instantaneous force on an area of wall $dA$ is $d\mathbf{F}_i = -\hat{\mathbf{n}} P dA$ where $\hat{\mathbf{n}}$ designates the unit vector normal to the surface. Thus the right-hand side of the Virial theorem is
$$-\frac{1}{2} \left\langle \sum_i \mathbf{F}_i \cdot \mathbf{r}_i \right\rangle = \frac{P}{2} \int \hat{\mathbf{n}} \cdot \mathbf{r} \, dA$$
Use of the divergence theorem thus gives that $\int \hat{\mathbf{n}} \cdot \mathbf{r} \, dA = \int \nabla \cdot \mathbf{r} \, dV = 3 \int dV = 3V$, so that $\frac{3}{2} NkT = \frac{3}{2} PV$. Thus the Virial theorem leads to the ideal gas law, that is
$$NkT = PV$$

2.6 Example: The mass of galaxies

The Virial theorem can be used to make a crude estimate of the mass of a cluster of galaxies. Assuming a spherically-symmetric cluster of $N$ galaxies, each of mass $m$, then the total mass of the cluster is $M = Nm$. A crude estimate of the cluster potential energy is
$$\langle U \rangle \approx -\frac{GM^2}{R} \qquad (\alpha)$$
where $R$ is the radius of a cluster. The average kinetic energy per galaxy is $\frac{1}{2} m \langle v^2 \rangle$ where $\langle v^2 \rangle$ is the average square of the galaxy velocities with respect to the center of mass of the cluster. Thus the total kinetic energy of the cluster is
$$\langle KE \rangle \approx \frac{Nm \langle v^2 \rangle}{2} = \frac{M \langle v^2 \rangle}{2} \qquad (\beta)$$
The Virial theorem tells us that a central force having a radial dependence of the form $F \propto r^n$ gives $\langle KE \rangle = \frac{n+1}{2} \langle U \rangle$. For the inverse-square gravitational force then
$$\langle KE \rangle = -\frac{1}{2} \langle U \rangle \qquad (\gamma)$$
Thus equations $\alpha$, $\beta$ and $\gamma$ give an estimate of the total mass of the cluster to be
$$M \approx \frac{R \langle v^2 \rangle}{G}$$
This estimate is larger than the value estimated from the luminosity of the cluster, implying that a large amount of “dark matter” must exist in galaxies. The nature of this dark matter remains an open question in physics.
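As a rough numerical illustration of the estimate $M \approx R \langle v^2 \rangle / G$, the sketch below evaluates it for a hypothetical cluster. The radius (1 Mpc) and the rms velocity (1000 km/s) are assumed round numbers used for illustration, not measured values for any particular cluster.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
MPC = 3.086e22     # one megaparsec in metres
M_SUN = 1.989e30   # solar mass in kg

def virial_mass(radius_m, v_rms):
    """Crude cluster mass M ~ R <v^2> / G from the Virial theorem."""
    return radius_m * v_rms**2 / G

# Hypothetical inputs, for illustration only
mass = virial_mass(radius_m=1.0 * MPC, v_rms=1.0e6)
print(f"{mass / M_SUN:.2e} solar masses")
```

With these assumed inputs the estimate is of order $10^{14}$ solar masses. Because the estimate scales as $\langle v^2 \rangle$, a factor of two uncertainty in the velocity dispersion changes the mass by a factor of four.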

2.12 Applications of Newton’s equations of motion

Newton’s equation of motion can be written in the form
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = m \frac{d\mathbf{v}}{dt} = m \frac{d^2\mathbf{r}}{dt^2} \tag{2.90}$$
A description of the motion of a particle requires a solution of this second-order differential equation of motion. This equation of motion may be integrated to find $\mathbf{r}(t)$ and $\mathbf{v}(t)$ if the initial conditions and the force field $\mathbf{F}(t)$ are known. Solution of the equation of motion can be complicated for many practical examples, but there are various approaches to simplify the solution. It is of value to learn efficient approaches to solving problems.
The following sequence is recommended
a) Make a vector diagram of the problem indicating forces, velocities, etc.
b) Write down the known quantities.
c) Before trying to solve the equation of motion directly, look to see if a basic conservation law applies. That is, check if any of the three first-order integrals can be used to simplify the solution. The use of conservation of energy or conservation of momentum can greatly simplify solving problems.
The following examples show the solution of typical types of problem encountered using Newtonian
mechanics.

2.12.1 Constant force problems

Problems having a constant force imply constant acceleration. The classic example is a block sliding on an inclined plane, where the block of mass $m$ is acted upon by both gravity and friction. The net force $\mathbf{F}$ is given by the vector sum of the gravitational force $\mathbf{F}_g$, normal force $\mathbf{N}$ and frictional force $\mathbf{f}_f$.
$$\mathbf{F} = \mathbf{F}_g + \mathbf{N} + \mathbf{f}_f = m\mathbf{a} \tag{2.91}$$
Taking components perpendicular to the inclined plane in the $y$ direction
$$-F_g \cos\theta + N = 0 \tag{2.92}$$
That is, since $F_g = mg$
$$N = mg \cos\theta \tag{2.93}$$
Similarly, taking components along the inclined plane in the $x$ direction
$$mg \sin\theta - f_f = m \frac{d^2x}{dt^2} \tag{2.94}$$
Using the concept of coefficient of friction $\mu$
$$f_f = \mu N \tag{2.95}$$
Thus the equation of motion can be written as
$$mg(\sin\theta - \mu \cos\theta) = m \frac{d^2x}{dt^2} \tag{2.96}$$
The block accelerates if $\sin\theta > \mu \cos\theta$, that is, $\tan\theta > \mu$. The acceleration is constant if $\mu$ and $\theta$ are constant, that is
$$\frac{d^2x}{dt^2} = g(\sin\theta - \mu \cos\theta) \tag{2.97}$$

Figure 2.3: Block on an inclined plane

Remember that if the block is stationary, the static friction force adjusts to balance the component of gravity along the plane, so that the effective friction coefficient satisfies $(\sin\theta - \mu \cos\theta) = 0$, that is, $\tan\theta = \mu$. However, there is a maximum static friction coefficient $\mu_S$ beyond which the block starts sliding. The kinetic coefficient of friction $\mu_K$ is applicable for sliding friction and usually $\mu_K < \mu_S$.
Another example of constant force and acceleration is motion of objects free falling in a uniform gravitational field when air drag is neglected. Then one obtains the simple relations such as $v = u + at$, etc.
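The inclined-plane result can be wrapped in a small helper. This is a minimal sketch of equation 2.97 together with the static-friction condition $\tan\theta \leq \mu_S$; the function name and the numerical values are assumptions for illustration.

```python
import math

def incline_acceleration(theta_deg, mu_k, mu_s=None, g=9.81):
    """Acceleration down the plane of a block released from rest (equation 2.97).

    Returns 0 when static friction can hold the block, i.e. tan(theta) <= mu_s.
    If mu_s is not given it is taken equal to mu_k.
    """
    mu_s = mu_k if mu_s is None else mu_s
    theta = math.radians(theta_deg)
    if math.tan(theta) <= mu_s:
        return 0.0
    return g * (math.sin(theta) - mu_k * math.cos(theta))

print(incline_acceleration(30.0, mu_k=0.2, mu_s=0.3))   # block slides
print(incline_acceleration(10.0, mu_k=0.2, mu_s=0.3))   # block stays at rest
```

For the 30 degree plane the block slides with an acceleration of roughly $3.2$ m/s$^2$, while on the 10 degree plane $\tan\theta$ is below $\mu_S$ and the block remains stationary.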

2.12.2 Linear Restoring Force

An important class of problems involves a linear restoring force, that is, they obey Hooke’s law. The equation of motion for this case is
$$F(x) = -kx = m\ddot{x} \tag{2.98}$$
It is usual to define
$$\omega_0^2 \equiv \frac{k}{m} \tag{2.99}$$
Then the equation of motion can be written as
$$\ddot{x} + \omega_0^2 x = 0 \tag{2.100}$$
which is the equation of the harmonic oscillator. Examples are small oscillations of a mass on a spring, vibrations of a stretched piano string, etc.
The solution of this second order equation is
$$x(t) = A \sin(\omega_0 t - \delta) \tag{2.101}$$
This is the well known sinusoidal behavior of the displacement for the simple harmonic oscillator. The angular frequency $\omega_0$ is
$$\omega_0 = \sqrt{\frac{k}{m}} \tag{2.102}$$
Note that this linear system has no dissipative forces, thus the total energy is a constant of motion as discussed previously. That is, it is a conservative system with a total energy $E$ given by
$$\frac{1}{2} m\dot{x}^2 + \frac{1}{2} kx^2 = E \tag{2.103}$$
The first term is the kinetic energy and the second term is the potential energy. The Virial theorem gives that for the linear restoring force the average kinetic energy equals the average potential energy.

2.12.3 Position-dependent conservative forces

The linear restoring force is an example of a conservative field. The total energy $E$ is conserved, and if the field is time independent, then the conservative forces are a function only of position. The easiest way to solve such problems is to use the concept of potential energy $U$ illustrated in Figure 2.4.
$$U_2 - U_1 = -\int_1^2 \mathbf{F} \cdot d\mathbf{r} \tag{2.104}$$
Consider a conservative force in one dimension. Since it was shown that the total energy $E = T + U$ is conserved for a conservative field, then
$$E = T + U = \frac{1}{2} mv^2 + U(x) \tag{2.105}$$
Therefore:
$$v = \frac{dx}{dt} = \pm\sqrt{\frac{2}{m}\left[E - U(x)\right]} \tag{2.106}$$
Integration of this gives
$$t - t_0 = \int_{x_0}^{x} \frac{\pm dx}{\sqrt{\frac{2}{m}\left[E - U(x)\right]}} \tag{2.107}$$
where $x = x_0$ when $t = t_0$. Knowing $U(x)$ it is possible to solve this equation as a function of time.
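Equation 2.107 can be evaluated numerically for any potential with known turning points. The sketch below computes the travel time between two turning points and checks it against the harmonic oscillator, for which the period $2\pi\sqrt{m/k}$ is known. The substitution $x = c + h\sin\phi$ and the midpoint rule are choices made here for illustration; the function name and parameter values are assumptions.

```python
import numpy as np

def half_period(potential, energy, x_left, x_right, m=1.0, n=20000):
    """Time to travel between turning points x_left and x_right (equation 2.107).

    The substitution x = c + h*sin(phi) removes the inverse-square-root
    singularity of the integrand at the turning points.
    """
    c, h = 0.5 * (x_left + x_right), 0.5 * (x_right - x_left)
    phi = (np.arange(n) + 0.5) * np.pi / n - 0.5 * np.pi   # midpoints in (-pi/2, pi/2)
    x = c + h * np.sin(phi)
    speed = np.sqrt(2.0 / m * (energy - potential(x)))
    return np.sum(h * np.cos(phi) / speed) * np.pi / n

# Check against the linear restoring force U = k x^2 / 2
k, m, E = 4.0, 1.0, 2.0
amplitude = np.sqrt(2.0 * E / k)
period = 2.0 * half_period(lambda x: 0.5 * k * x**2, E, -amplitude, amplitude, m)
print(period, 2.0 * np.pi * np.sqrt(m / k))
```

For other potentials the turning points must first be found by solving $U(x) = E$, and the same routine then gives the period of the bounded motion.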

It is possible to understand the general features of the solution just from inspection of the function $U(x)$. For example, as shown in figure 2.4, the motion for energy $E_1$ is periodic between the turning points $x_a$ and $x_b$. Since the potential energy curve is approximately parabolic between these limits the motion will be approximately simple harmonic. For $E_0$ the turning points coalesce to $x_o$, that is, there is no motion. For total energy $E_2$ the motion is periodic in two independent regimes, $x_c \leq x \leq x_d$ and $x_e \leq x \leq x_f$. Classically the particle cannot jump from one pocket to the other. A particle with total energy $E_3$ moves freely from infinity, stops and rebounds at $x = x_g$, and then returns to infinity. That is, the particle bounces off the potential at $x_g$. For energy $E_4$ the particle moves freely and is unbounded. For all these cases, the actual velocity is given by the above relation for $v(x)$. Thus the kinetic energy is largest where the potential is deepest. An example would be motion of a roller coaster car.

Figure 2.4: One-dimensional potential $U(x)$

Position-dependent forces are encountered extensively in classical mechanics. Examples are the many manifestations of motion in gravitational fields, such as interplanetary probes, a roller coaster, and automobile suspension systems. The linear restoring force is an especially simple example of a position-dependent force, while the most frequently encountered conservative potentials are in electrostatics and gravitation, for which the potential energies are:
$$U(r) = \frac{1}{4\pi\varepsilon_0} \frac{q_1 q_2}{r_{12}} \qquad \text{(Electrostatic potential energy)}$$
$$U(r) = -G \frac{m_1 m_2}{r_{12}} \qquad \text{(Gravitational potential energy)}$$
Knowing $U(r)$ it is possible to solve the equation of motion as a function of time.

2.7 Example: Diatomic molecule

An example of a conservative field is a vibrating diatomic molecule which has a potential energy dependence with separation distance $x$ that is described approximately by the Morse function
$$U(x) = U_0 \left[1 - e^{-\frac{(x - x_0)}{\delta}}\right]^2 - U_0$$
where $U_0$, $x_0$ and $\delta$ are parameters chosen to best describe the particular pair of atoms. The restoring force is given by
$$F(x) = -\frac{dU(x)}{dx} = -2\frac{U_0}{\delta} \left[1 - e^{-\frac{(x - x_0)}{\delta}}\right] \left[e^{-\frac{(x - x_0)}{\delta}}\right]$$
The potential has a minimum value of $U(x_0) = -U_0$ at $x = x_0$, where the force is zero.
Note that for small amplitude oscillations, where
$$(x - x_0) \ll \delta$$
the exponential term in the potential function can be expanded to give
$$U(x) \approx U_0 \left[1 - \left(1 - \frac{(x - x_0)}{\delta}\right)\right]^2 - U_0 \approx \frac{U_0}{\delta^2}(x - x_0)^2 - U_0$$
This gives a restoring force
$$F(x) = -\frac{dU(x)}{dx} = -2\frac{U_0}{\delta^2}(x - x_0)$$
That is, for small amplitudes the restoring force is linear.

Figure: Potential energy function $U(x)/U_0$ versus $x/\delta$ for the diatomic molecule.
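The quality of the small-amplitude approximation can be checked directly. The sketch below compares the exact Morse restoring force with its linear (Hooke’s law) approximation for a few displacements; the parameter values $U_0 = x_0 = \delta = 1$ are assumptions chosen for illustration.

```python
import numpy as np

def morse_potential(x, u0=1.0, x0=1.0, delta=1.0):
    """Morse function U(x) = U0 [1 - exp(-(x - x0)/delta)]^2 - U0."""
    return u0 * (1.0 - np.exp(-(x - x0) / delta)) ** 2 - u0

def morse_force(x, u0=1.0, x0=1.0, delta=1.0):
    """Exact restoring force F = -dU/dx."""
    e = np.exp(-(x - x0) / delta)
    return -2.0 * u0 / delta * (1.0 - e) * e

def linear_force(x, u0=1.0, x0=1.0, delta=1.0):
    """Small-amplitude (Hooke's law) approximation to the Morse force."""
    return -2.0 * u0 / delta**2 * (x - x0)

for dx in (0.01, 0.1, 0.5):
    print(dx, morse_force(1.0 + dx), linear_force(1.0 + dx))
```

The two forces agree closely for displacements that are a small fraction of $\delta$ and diverge as the displacement becomes comparable to $\delta$, which is the anharmonicity of the molecular vibration.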

2.12.4 Constrained motion

A frequently encountered problem involving position dependent forces is when the motion is constrained to follow a certain trajectory. Forces of constraint must exist to constrain the motion to a specific trajectory. Examples are the roller coaster, a rolling ball on an undulating surface, or a downhill skier, where the motion is constrained to follow the surface or track contours. The potential energy can be evaluated at all positions along the constrained trajectory for conservative forces such as gravity. However, the additional forces of constraint that must exist to constrain the motion can be complicated and depend on the motion. For example, the roller coaster must always balance the gravitational and centripetal forces. Fortunately forces of constraint $\mathbf{F}_C$ often are normal to the direction of motion and thus do not contribute to the total mechanical energy since then the work done $\mathbf{F}_C \cdot d\mathbf{l}$ is zero. Magnetic forces $\mathbf{F} = q\mathbf{v} \times \mathbf{B}$ exhibit this feature of having the force normal to the motion.
Solution of constrained problems is greatly simplified if the other forces are conservative and the forces
of constraint are normal to the motion, since then energy conservation can be used.

2.8 Example: Roller coaster

Consider motion of a roller coaster shown in the adjacent figure. This system is conservative if the friction and air drag are neglected, and then the forces of constraint are normal to the direction of motion.

Figure: Roller coaster (CC0 Public Domain)

The kinetic energy at any position is just given by energy conservation and the fact that
$$E = T + U$$
where $U$ depends on the height of the track at the given location. The kinetic energy is greatest when the potential energy is lowest. The forces of constraint can be deduced if the velocity of motion on the track is known. Assuming that the motion is confined to a vertical plane, then one has a centripetal force of constraint $\frac{mv^2}{\rho}$ normal to the track inwards towards the center of the radius of curvature $\rho$, plus the gravitation force downwards of $mg$.
The force that the car exerts on the track is $\frac{mv_t^2}{\rho} - mg$ upwards at the top of the loop, while it is $\frac{mv_b^2}{\rho} + mg$ downwards at the bottom of the loop, where $v_t$ and $v_b$ are the velocities at the top and bottom. To ensure that the car and occupants do not leave the required trajectory, the force upwards at the top of the loop has to be positive, that is, $v_t^2 \geq \rho g$. The velocity at the bottom of the loop is given by $\frac{1}{2} mv_b^2 = \frac{1}{2} mv_t^2 + 2mg\rho$ assuming that the track has a constant radius of curvature $\rho$. That is, at a minimum $v_b^2 = \rho g + 4\rho g = 5\rho g$. Therefore the occupants will feel an acceleration downwards of at least $\frac{v_b^2}{\rho} + g = 6g$ at the bottom of the loop. The first roller coaster was built with such a constant radius of curvature but an acceleration of $6g$ was too much for the average passenger. Therefore roller coasters are designed such that the radius of curvature is much larger at the bottom of the loop, as illustrated, in order to maintain sufficiently low $g$ loads and also ensure that the required constraint forces exist.
Note that the minimum velocity at the top of the loop, $v_t = \sqrt{\rho g}$, implies that if the cart starts from rest it must start at a height $h \geq \frac{\rho}{2}$ above the top of the loop if friction is negligible. Note that the solution for the rolling ball on such a roller coaster differs from that for a sliding object since one must include the rotational energy of the ball as well as the linear velocity.
Looping the loop in a sailplane involves the same physics, making it necessary to vary the elevator control to vary the radius of curvature throughout the loop to minimize the maximum $g$ load.
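The numbers quoted for the constant-radius loop follow from energy conservation and can be reproduced with a few lines. This is a minimal sketch assuming a frictionless circular loop; the function name and the 10 m radius are assumptions for illustration.

```python
import math

def loop_requirements(rho, g=9.81):
    """Minimum speeds and loads for a frictionless circular loop of radius rho."""
    v_top = math.sqrt(rho * g)                       # contact just maintained at the top
    v_bottom = math.sqrt(v_top**2 + 4.0 * rho * g)   # energy conservation over a height 2*rho
    load_bottom_g = (v_bottom**2 / rho + g) / g      # apparent load at the bottom, in units of g
    start_height = v_top**2 / (2.0 * g)              # release height above the top, from rest
    return v_top, v_bottom, load_bottom_g, start_height

v_top, v_bottom, load, height = loop_requirements(rho=10.0)
print(v_top, v_bottom, load, height)
```

The load at the bottom comes out as $6g$ and the release height as $\rho/2$ regardless of the radius chosen, since both results are independent of $\rho$ and $g$ when expressed in these units.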

2.12.5 Velocity Dependent Forces

Velocity dependent forces are encountered frequently in practical problems. An example is the motion of an object in a fluid, such as air, where viscous forces retard the motion. In general the retarding force has a complicated dependence on velocity. A quadratic-velocity drag force in air often can be expressed in the form,
$$\mathbf{F}_D(v) = -\frac{1}{2} c_D \rho A v^2 \hat{\mathbf{v}} \tag{2.108}$$
where $c_D$ is a dimensionless drag coefficient, $\rho$ is the density of air, $A$ is the cross sectional area perpendicular to the direction of motion, and $v$ is the velocity. Modern automobiles have drag coefficients as low as 0.3. As described in chapter 16, the drag coefficient $c_D$ depends on the Reynolds number which relates the inertial to viscous drag forces. Small sized objects at low velocity, such as light raindrops, have low Reynolds numbers for which $c_D$ is roughly proportional to $v^{-1}$ leading to a linear dependence of the drag force on velocity, i.e. $F_D(v) \propto v$. Larger objects moving at higher velocities, such as a car or sky-diver, have higher Reynolds numbers for which $c_D$ is roughly independent of velocity leading to a drag force $F_D(v) \propto v^2$. This drag force always points in the opposite direction to the unit velocity vector. Approximately for air
$$\mathbf{F}_D(v) = -\left(c_1 v + c_2 v^2\right) \hat{\mathbf{v}} \tag{2.109}$$
where for spherical objects of diameter $D$, $c_1 \approx 1.55 \times 10^{-4} D$ and $c_2 \approx 0.22 D^2$ in MKS units. Fortunately, the equation of motion usually can be integrated when the retarding force has a simple power law dependence. As an example, consider free fall in the Earth’s gravitational field.

2.9 Example: Vertical fall in the earth’s gravitational field.

Linear regime $c_1 v \gg c_2 v^2$
For small objects at low velocity, i.e. low Reynolds number, the drag approximately has a linear dependence on velocity. Then the equation of motion is
$$-mg - c_1 v = m \frac{dv}{dt}$$
Separate the variables and integrate
$$t = \int_{v_0}^{v} \frac{m \, dv}{-mg - c_1 v} = -\frac{m}{c_1} \ln\left(\frac{mg + c_1 v}{mg + c_1 v_0}\right)$$
That is
$$v = -\frac{mg}{c_1} + \left(\frac{mg}{c_1} + v_0\right) e^{-\frac{c_1}{m} t}$$
Note that for $t \gg \frac{m}{c_1}$ the velocity approaches a terminal velocity of $v_\infty = -\frac{mg}{c_1}$. The characteristic time constant is $\tau = \frac{m}{c_1} = \frac{|v_\infty|}{g}$. Note that if $v_0 = 0$ then
$$v = v_\infty \left(1 - e^{-\frac{t}{\tau}}\right)$$
For the case of small raindrops with $D = 0.5$ mm then $v_\infty = 8$ m/s (18 mph) and time constant $\tau = 0.8$ sec. Note that in the absence of air drag, these rain drops falling from 2000 m would attain a velocity of over 400 m.p.h. It is fortunate that the drag reduces the speed of rain drops to non-damaging values. Note that the above relation would predict high velocities for hail. Fortunately, the drag increases quadratically at the higher velocities attained by large rain drops or hail, and this limits the terminal velocity to moderate values. In the United States these velocities still are sufficient to do considerable crop damage in the mid-west.
Quadratic regime $c_2 v^2 \gg c_1 v$
For larger objects at higher velocities, i.e. high Reynolds number, the drag depends on the square of the velocity making it necessary to differentiate between objects rising and falling. The equation of motion is
$$-mg \pm c_2 v^2 = m \frac{dv}{dt}$$


where the positive sign is for falling objects and negative sign for rising objects. Integrating the equation of motion for falling gives
$$t = \int_{v_0}^{v} \frac{m \, dv}{-mg + c_2 v^2} = \tau \left(\tanh^{-1} \frac{v_0}{v_\infty} - \tanh^{-1} \frac{v}{v_\infty}\right)$$
where $\tau = \sqrt{\frac{m}{c_2 g}}$ and $v_\infty = \sqrt{\frac{mg}{c_2}}$. That is, $\tau = \frac{v_\infty}{g}$. For the case of a falling object with $v_0 = 0$, solving for velocity gives
$$v = -v_\infty \tanh\frac{t}{\tau}$$
where the negative sign indicates that the velocity is directed downwards and $v_\infty$ here denotes the terminal speed.
As an example, a 0.6 kg basket ball with $D = 0.25$ m will have $v_\infty = 20$ m/s (43 m.p.h.) and $\tau = 2.1$ s.
Consider President George H.W. Bush skydiving. Assume his mass is 70 kg and assume an equivalent spherical shape of the former President to have a diameter of $D = 1$ m. This gives that $v_\infty = 56$ m/s (120 mph) and $\tau = 5.6$ s. When Bush senior opens his 8 m diameter parachute his terminal velocity is estimated to decrease to 7 m/s (15 mph), which is close to the value for a typical (8 m) diameter emergency parachute which has a measured terminal velocity of 11 mph in spite of air leakage through the central vent needed to stabilize the parachute motion.
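The terminal velocities quoted in this example can be reproduced from the approximate drag coefficients for spheres in air given with equation 2.109. The sketch below is a minimal helper under those assumptions; the function names are hypothetical and $g = 9.8$ m/s$^2$ is assumed.

```python
import math

C1_PER_D = 1.55e-4   # linear drag coefficient c1 = 1.55e-4 * D (MKS), from the text
C2_PER_D2 = 0.22     # quadratic drag coefficient c2 = 0.22 * D**2 (MKS), from the text

def terminal_quadratic(mass, diameter, g=9.8):
    """Terminal speed and time constant when the drag is c2 * v**2."""
    c2 = C2_PER_D2 * diameter**2
    v_inf = math.sqrt(mass * g / c2)
    return v_inf, v_inf / g

def terminal_linear(mass, diameter, g=9.8):
    """Terminal speed and time constant when the drag is c1 * v."""
    c1 = C1_PER_D * diameter
    v_inf = mass * g / c1
    return v_inf, v_inf / g

print(terminal_quadratic(0.6, 0.25))    # basketball
print(terminal_quadratic(70.0, 1.0))    # skydiver treated as a 1 m diameter sphere
print(terminal_quadratic(70.0, 8.0))    # same mass under an 8 m diameter parachute
```

Applying the linear formula to a large object gives an unrealistically large terminal speed, which is why the quadratic term must dominate for hail, balls, and skydivers.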

2.10 Example: Projectile motion in air

Consider a projectile initially at $x = y = 0$ at $t = 0$, that is fired at an initial velocity $\mathbf{v}_0$ at an angle $\theta$ to the horizontal. In order to understand the general features of the solution, assume that the drag is proportional to velocity. This is incorrect for typical projectile velocities, but simplifies the mathematics. The equations of motion can be expressed as
$$m\ddot{x} = -\lambda m \dot{x}$$
$$m\ddot{y} = -\lambda m \dot{y} - mg$$
where $\lambda$ is the coefficient for air drag. Take the initial conditions at $t = 0$ to be $x = y = 0$, $\dot{x} = v_0 \cos\theta$, $\dot{y} = v_0 \sin\theta$.
Solving in the $x$ coordinate,
$$\frac{d\dot{x}}{dt} = -\lambda \dot{x}$$
Therefore
$$\dot{x} = v_0 \cos\theta \, e^{-\lambda t}$$
That is, the horizontal velocity decays to zero with a time constant $\tau = \frac{1}{\lambda}$.
Integration of the velocity equation gives
$$x = \frac{v_0 \cos\theta}{\lambda} \left(1 - e^{-\lambda t}\right)$$
Note that this implies that the body approaches a value of $x = \frac{v_0 \cos\theta}{\lambda}$ as $t \rightarrow \infty$.
The trajectory of an object is distorted from the parabolic shape that occurs for $\lambda = 0$, with a rapid drop in range as the drag coefficient increases. For realistic cases it is necessary to use a computer to solve this numerically.
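A numerical solution of the kind mentioned above can be sketched in a few lines. The example below integrates the equations of motion with the combined drag force of equation 2.109, using a simple midpoint (second-order Runge-Kutta) step. The function name, the time step, and the ball parameters are assumptions for illustration; this is a sketch, not a production integrator.

```python
import math

def projectile_range(v0, theta_deg, mass, c1=0.0, c2=0.0, g=9.8, dt=1.0e-4):
    """Horizontal range with a drag force -(c1*v + c2*v**2) along the velocity."""
    def accel(vx, vy):
        speed = math.hypot(vx, vy)
        drag = (c1 + c2 * speed) / mass      # drag acceleration per unit velocity
        return -drag * vx, -g - drag * vy

    theta = math.radians(theta_deg)
    x, y = 0.0, 0.0
    vx, vy = v0 * math.cos(theta), v0 * math.sin(theta)
    while True:
        ax, ay = accel(vx, vy)
        vxm, vym = vx + 0.5 * dt * ax, vy + 0.5 * dt * ay   # velocity at the midpoint
        axm, aym = accel(vxm, vym)
        x_new, y_new = x + dt * vxm, y + dt * vym
        if y_new < 0.0:
            # linear interpolation to the landing point y = 0
            return x + (x_new - x) * y / (y - y_new)
        x, y = x_new, y_new
        vx, vy = vx + dt * axm, vy + dt * aym

# A 0.6 kg ball of diameter 0.25 m, using the approximate coefficients for air
no_drag = projectile_range(30.0, 45.0, mass=0.6)
with_drag = projectile_range(30.0, 45.0, mass=0.6, c1=1.55e-4 * 0.25, c2=0.22 * 0.25**2)
print(no_drag, with_drag)
```

With the drag switched off the routine reproduces the vacuum range $v_0^2 \sin 2\theta / g$, which provides a simple check before trusting the result with drag included.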

2.12.6 Systems with Variable Mass

Classic examples of systems with variable mass are the rocket, a falling chain, and nuclear fission. Consider the problem of vertical rocket motion in a gravitational field using Newtonian mechanics. When there is a vertical gravitational external field, the vertical momentum is not conserved due to both gravity and the ejection of rocket propellant. In a time $dt$ the rocket ejects propellant $dm'$ vertically with exhaust velocity relative to the rocket of $u$. Thus the momentum imparted to this propellant is
$$dp' = -u \, dm' \tag{2.110}$$
Therefore the rocket is given an equal and opposite increase in momentum $dp$
$$dp = +u \, dm' \tag{2.111}$$


In the time interval $dt$ the net change in the linear momentum of the rocket plus fuel system is given by
$$dp = (m - dm')(v + dv) + dm'(v - u) - mv = m \, dv - u \, dm' \tag{2.112}$$
The rate of change of the linear momentum thus equals
$$F_{ext} = \frac{dp}{dt} = m \frac{dv}{dt} - u \frac{dm'}{dt} \tag{2.113}$$
Consider the problem for the special case of vertical ascent of the rocket against the external gravitational force $F_{ext} = -mg$. Then
$$-mg + u \frac{dm'}{dt} = m \frac{dv}{dt} \tag{2.114}$$
This can be rewritten as
$$-mg + u\dot{m}' = m\dot{v}$$
The second term comes from the variable rocket mass where the loss of mass of the rocket equals the mass of the ejected propellant. Assuming a constant fuel burn $\dot{m}' = \alpha$ then
$$\dot{m} = -\dot{m}' = -\alpha \tag{2.115}$$
where $\alpha > 0$. Then the equation becomes
$$dv = \left(-g + \frac{\alpha}{m} u\right) dt \tag{2.116}$$
Since
$$\frac{dm}{dt} = -\alpha \tag{2.117}$$
then
$$-\frac{dm}{\alpha} = dt \tag{2.118}$$
Inserting this in the above equation gives
$$dv = \left(\frac{g}{\alpha} - \frac{u}{m}\right) dm \tag{2.119}$$
Integration gives
$$v = -\frac{g}{\alpha}(m_0 - m) + u \ln\left(\frac{m_0}{m}\right) \tag{2.120}$$
But the change in mass is given by
$$\int_{m_0}^{m} dm = -\alpha \int_0^t dt \tag{2.121}$$
That is
$$m_0 - m = \alpha t \tag{2.122}$$
Thus
$$v = -gt + u \ln\left(\frac{m_0}{m}\right) \tag{2.123}$$

Figure 2.5: Vertical motion of a rocket in a gravitational field

Note that once the propellant is exhausted the rocket will continue to fly upwards as it decelerates in the gravitational field. You can easily calculate the maximum height. Note that this formula assumes that the acceleration due to gravity is constant whereas for large heights above the Earth it is necessary to use the true gravitational force $-\frac{GMm}{r^2}$ where $r$ is the distance from the center of the earth. In real situations it is necessary to include air drag which requires a computer to numerically solve the equations of motion. The highest rocket velocity is attained by maximizing the exhaust velocity and the ratio of initial to final mass. Because the terminal velocity is limited by the mass ratio, engineers construct multistage rockets that jettison the spent fuel containers and rockets. The variational-principle approach applied to variable mass problems is discussed in chapter 8.7.

2.12.7 Rigid-body rotation about a body-fixed rotation axis

The most general case of rigid-body rotation involves rotation about some body-fixed point with the orien-
tation of the rotation axis undefined. For example, an object spinning in space will rotate about the center
of mass with the rotation axis having any orientation. Another example is a child’s spinning top which spins
with arbitrary orientation of the axis of rotation about the pointed end which touches the ground about a
static location. Such rotation about a body-fixed point is complicated and will be discussed in chapter 13.
Rigid-body rotation is easier to handle if the orientation of the axis of rotation is fixed with respect to the
rigid body. An example of such motion is a hinged door.
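For a rigid body rotating with angular velocity $\boldsymbol{\omega}$ the total angular momentum $\mathbf{L}$ is given by

$$\mathbf{L} = \sum_i^n \mathbf{L}_i = \sum_i^n \mathbf{r}_i \times \mathbf{p}_i \tag{2.124}$$

For rotation equation appendix 29 gives

$$\mathbf{v}_i = \boldsymbol{\omega} \times \mathbf{r}_i \tag{294}$$

thus the angular momentum can be written as

$$\mathbf{L} = \sum_i^n \mathbf{r}_i \times \mathbf{p}_i = \sum_i^n m_i \mathbf{r}_i \times (\boldsymbol{\omega} \times \mathbf{r}_i) \tag{2.125}$$

The vector triple product can be simplified using the vector identity equation 24 giving

$$\mathbf{L} = \sum_i^n \left[ \left( m_i r_i^2 \right) \boldsymbol{\omega} - (\mathbf{r}_i \cdot \boldsymbol{\omega}) \, m_i \mathbf{r}_i \right] \tag{2.126}$$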


Rigid-body rotation about a body-fixed symmetry axis

The simplest case for rigid-body rotation is when the body has a symmetry axis with the angular velocity $\boldsymbol{\omega}$
parallel to this body-fixed symmetry axis. For this case $\mathbf{r}_i$ can be taken perpendicular to $\boldsymbol{\omega}$, for which
the second term in equation 2.126 vanishes, i.e. $(\mathbf{r}_i \cdot \boldsymbol{\omega}) = 0$, thus

$$\mathbf{L}_{sym} = \sum_i^n \left( m_i r_i^2 \right) \boldsymbol{\omega} \qquad (\mathbf{r}_i \text{ perpendicular to } \boldsymbol{\omega})$$

The moment of inertia about the symmetry axis is defined as

$$I_{sym} = \sum_i^n m_i r_i^2 \tag{2.127}$$

where $r_i$ is the perpendicular distance from the axis of rotation to the body $m_i$. For a continuous body the
moment of inertia can be generalized to an integral over the mass density $\rho$ of the body

$$I_{sym} = \int \rho r^2 \, dV \tag{2.128}$$

where $r$ is perpendicular to the rotation axis. The definition of the moment of inertia allows rewriting the
angular momentum about a symmetry axis $\mathbf{L}_{sym}$ in the form

$$\mathbf{L}_{sym} = I_{sym} \boldsymbol{\omega} \tag{2.129}$$

where the moment of inertia $I_{sym}$ is taken about the symmetry axis, assuming that the angular velocity
vector is parallel to the symmetry axis.

Rigid-body rotation about a non-symmetric body-fixed axis

In general the fixed axis of rotation is not aligned with a symmetry axis of the body, or the body does not
have a symmetry axis, both of which complicate the problem.
For illustration consider that the rigid body comprises a system of $n$ masses $m_i$ located at positions $\mathbf{r}_i$,
with the rigid body rotating about the $z$ axis with angular velocity $\boldsymbol{\omega}$. That is,

$$\boldsymbol{\omega} = \omega_z \hat{\mathbf{z}} \tag{2.130}$$

In cartesian coordinates the fixed-frame vector for particle $i$ is

$$\mathbf{r}_i = (x_i, y_i, z_i) \tag{2.131}$$

Using these in the cross product (294) gives

$$\mathbf{v}_i = \boldsymbol{\omega} \times \mathbf{r}_i = \begin{pmatrix} -\omega_z y_i \\ \omega_z x_i \\ 0 \end{pmatrix} \tag{2.132}$$

which is written as a column vector for clarity. Inserting $\mathbf{v}_i$ in the cross-product $\mathbf{r}_i \times \mathbf{v}_i$ gives the components
of the angular momentum to be

$$\mathbf{L} = \sum_i^n m_i \mathbf{r}_i \times \mathbf{v}_i = \sum_i^n m_i \omega_z \begin{pmatrix} -x_i z_i \\ -y_i z_i \\ x_i^2 + y_i^2 \end{pmatrix}$$

That is, the components of the angular momentum are

$$L_x = -\left( \sum_i^n m_i x_i z_i \right) \omega_z \equiv I_{xz} \omega_z \tag{2.133}$$

$$L_y = -\left( \sum_i^n m_i y_i z_i \right) \omega_z \equiv I_{yz} \omega_z$$

$$L_z = \left( \sum_i^n m_i \left[ x_i^2 + y_i^2 \right] \right) \omega_z \equiv I_{zz} \omega_z$$

Note that the perpendicular distance from the $z$ axis in cylindrical coordinates is $\rho_i = \sqrt{x_i^2 + y_i^2}$, thus the
angular momentum $L_z$ about the $z$ axis can be written as

$$L_z = \left( \sum_i^n m_i \rho_i^2 \right) \omega_z = I_{zz} \omega_z \tag{2.134}$$

where (2.134) gives the elementary formula for the moment of inertia $I_{zz} = I_{sym}$ about the $z$ axis given earlier
in (2.129).

Figure 2.6: A rigid rotating body comprising a single mass $m$ attached by a massless rod at a fixed angle $\theta$,
shown at the instant when $m$ happens to lie in the $yz$ plane. As the body rotates about the $z$ axis the mass $m$ has
a velocity and momentum into the page (the negative $x$ direction). Therefore the angular momentum
$\mathbf{L} = \mathbf{r} \times \mathbf{p}$ is in the direction shown, which is not parallel to the angular velocity $\boldsymbol{\omega}$.

The surprising result is that $L_x$ and $L_y$ are non-zero, implying that the total angular momentum vector $\mathbf{L}$ is
in general not parallel with $\boldsymbol{\omega}$. This can be understood by considering the single body $m$ shown in figure 2.6.
When the body is in the $y, z$ plane then $x = 0$ and $L_x = 0$. Thus the angular momentum vector $\mathbf{L}$ has a
component along the $-y$ direction as shown, which is not parallel with $\boldsymbol{\omega}$ and, since the vectors
$\boldsymbol{\omega}$, $\mathbf{L}$, $\mathbf{r}$ are coplanar, $\mathbf{L}$ must sweep around the rotation axis $\boldsymbol{\omega}$ to remain coplanar with the body as it rotates
about the $z$ axis. Instantaneously the velocity of the body $\mathbf{v}$ is into the plane of the paper and, since
$\mathbf{L} = m \, \mathbf{r} \times \mathbf{v}$, $\mathbf{L}$ is at an angle $(90^\circ - \theta)$ to the $z$ axis. This implies that a torque must be applied
to rotate the angular momentum vector. This explains why your automobile shakes if the rotation axis and
symmetry axis are not parallel for one wheel.
The first two moments in (2.133) are called products of inertia of the body, designated by the pair of
axes involved. Therefore, to avoid confusion, it is necessary to define the diagonal moment, which is called
the moment of inertia, by two subscripts as $I_{zz}$. Thus in general, a body can have three moments of inertia
about the three axes plus three products of inertia. This group of moments comprises the inertia tensor,
which will be discussed further in chapter 13. If a body has an axis of symmetry along the $z$ axis then the
summations will give $I_{xz} = I_{yz} = 0$ while $I_{zz}$ will be unchanged. That is, for rotation about a symmetry
axis the angular momentum and rotation axes are parallel. Any axis along which the angular momentum
and angular velocity coincide is called a principal axis of the body.
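
The statement that the angular momentum need not be parallel to the angular velocity can be illustrated with a minimal sketch that evaluates equation 2.125 directly. The helper names and the unit masses and positions below are illustrative assumptions; the single mass in the $yz$ plane mimics figure 2.6, and adding its mirror image makes $z$ a symmetry axis.

```python
def cross(a, b):
    """Vector cross product a x b for 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angular_momentum(masses, positions, omega):
    """L = sum_i m_i r_i x (omega x r_i), as in equation 2.125."""
    L = [0.0, 0.0, 0.0]
    for m, r in zip(masses, positions):
        v = cross(omega, r)          # velocity of the mass
        l = cross(r, v)              # r x v for this mass
        for k in range(3):
            L[k] += m * l[k]
    return tuple(L)

omega = (0.0, 0.0, 2.0)              # rotation about the z axis

# Single mass in the y-z plane: L acquires a -y component.
L_single = angular_momentum([1.0], [(0.0, 1.0, 1.0)], omega)

# Mirror-image pair: the products of inertia cancel and L is along z.
L_pair = angular_momentum([1.0, 1.0],
                          [(0.0, 1.0, 1.0), (0.0, -1.0, 1.0)], omega)
print(L_single, L_pair)
```

For the single mass the result has a $-y$ component, as equation 2.133 predicts, while for the symmetric pair only $L_z$ survives.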

2.11 Example: Moment of inertia of a thin door

Consider that the door has width $w$ and height $h$ and assume the door thickness is negligible with areal
density $\sigma$ kg/m$^2$. Assume that the door is hinged about the $y$ axis. The mass of a surface element of
dimension $dx \, dy$ at a distance $x$ from the rotation axis is $dm = \sigma \, dx \, dy$, thus the mass of the complete door
is $M = \sigma w h$. The moment of inertia about the $y$ axis is given by

$$I = \int_{x=0}^{w} \int_{y=0}^{h} \sigma x^2 \, dy \, dx = \frac{1}{3} \sigma h w^3 = \frac{1}{3} M w^2$$

2.12 Example: Merry-go-round

A child of mass $m$ jumps onto the outside edge of a circular merry-go-round of moment of inertia $I$,
radius $R$ and initial angular velocity $\omega_0$. What is the final angular velocity $\omega_f$?
If the initial angular momentum is $L_0$ and, assuming the child jumps with zero angular velocity, then the
conservation of angular momentum implies that

$$L_0 = L_f$$

$$I \omega_0 = I \omega_f + m v_f R$$

$$I \frac{v_0}{R} = \left( I + m R^2 \right) \frac{v_f}{R}$$

That is

$$\frac{\omega_f}{\omega_0} = \frac{v_f}{v_0} = \frac{I}{I + m R^2}$$

Note that this is true independent of the details of the acceleration of the initially stationary child.
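
As a small worked instance of this result, the sketch below evaluates the final angular velocity for illustrative values (a 500 kg m$^2$ merry-go-round of radius 2 m and a 30 kg child; these numbers and the function name are assumptions).

```python
def final_angular_velocity(I, m, R, omega0):
    """omega_f from I*omega0 = (I + m*R**2)*omega_f."""
    return I * omega0 / (I + m * R ** 2)

# Illustrative values in SI units.
I, m, R, omega0 = 500.0, 30.0, 2.0, 1.5
omega_f = final_angular_velocity(I, m, R, omega0)

# Angular momentum before and after the child lands.
L_before = I * omega0
L_after = (I + m * R ** 2) * omega_f
print(omega_f, L_before, L_after)
```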

2.13 Example: Cue pushes a billiard ball

Consider a billiard ball of mass $M$ and radius $R$ that is pushed by a cue in a direction that passes through
the center of gravity such that the ball attains a velocity $v_0$. The friction coefficient between the table and
the ball is $\mu$. How far does the ball move before the initial slipping motion changes to pure rolling motion?

Figure: Cue pushing a billiard ball horizontally at the height of the centre of rotation of the ball.

Since the direction of the cue force passes through the center of mass of the ball, it contributes zero
torque to the ball. Thus the initial angular momentum is zero at $t = 0$. The friction force $f$ points opposite
to the direction of motion and causes a torque $N$ about the center of mass, in the sense that spins the ball
up towards forward rolling.

$$N = f R = \mu M g R$$

Since the moment of inertia about the center of a uniform sphere is $I = \frac{2}{5} M R^2$, the angular acceleration
of the ball is

$$\dot{\omega} = \frac{\mu M g R}{I} = \frac{\mu M g R}{\frac{2}{5} M R^2} = \frac{5}{2} \frac{\mu g}{R} \tag{$\alpha$}$$

Moreover the frictional force causes a deceleration $a$ of the linear velocity of the center of mass of

$$a = -\frac{f}{M} = -\mu g \tag{$\beta$}$$

Integrating $\alpha$ from time zero to $t$ gives

$$\omega = \int_0^t \dot{\omega} \, dt = \frac{5}{2} \frac{\mu g}{R} t$$

The linear velocity of the center of mass at time $t$ is given by integration of equation $\beta$

$$v = v_0 + \int_0^t a \, dt = v_0 - \mu g t$$

The billiard ball stops sliding and only rolls when $v = \omega R$, that is, when

$$\frac{5}{2} \mu g t = v_0 - \mu g t$$

That is, when

$$t_{roll} = \frac{2}{7} \frac{v_0}{\mu g}$$

Thus the ball slips for a distance

$$s = \int_0^{t_{roll}} v \, dt = v_0 t_{roll} - \frac{\mu g}{2} t_{roll}^2 = \frac{12}{49} \frac{v_0^2}{\mu g}$$

Note that if the ball is pushed at a distance $h$ above the center of mass, besides the linear velocity there
is an initial angular velocity of

$$\omega = \frac{M v_0 h}{\frac{2}{5} M R^2} = \frac{5}{2} \frac{v_0 h}{R^2}$$

For the special case $h = \frac{2}{5} R$ the ball immediately assumes a pure non-slipping roll. For $h < \frac{2}{5} R$ one has
$\omega R < v_0$ while $h > \frac{2}{5} R$ corresponds to $\omega R > v_0$. In the latter case the frictional force points forward.
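
The slipping time and distance derived above can be checked with a simple time-stepping sketch that advances the linear and angular velocities until the rolling condition $v = \omega R$ is met. The function name, the step size and the ball parameters are illustrative assumptions.

```python
G_ACC = 9.81  # m/s^2

def slip_to_roll(v0, mu, R, g=G_ACC, dt=1e-5):
    """Euler-step v and omega until the ball stops slipping; return (time, distance)."""
    v, omega, s, t = v0, 0.0, 0.0, 0.0
    while v > omega * R:
        s += v * dt                       # distance covered while slipping
        v -= mu * g * dt                  # friction decelerates the center of mass
        omega += 2.5 * mu * g / R * dt    # friction torque spins the ball up
        t += dt
    return t, s

v0, mu, R = 2.0, 0.2, 0.028               # illustrative SI values
t_num, s_num = slip_to_roll(v0, mu, R)
t_exact = 2.0 * v0 / (7.0 * mu * G_ACC)
s_exact = 12.0 * v0 ** 2 / (49.0 * mu * G_ACC)
print(t_num, t_exact, s_num, s_exact)
```

Note that the radius drops out of both closed-form results, which the numerical run reproduces for any choice of $R$.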

2.12.8 Time dependent forces

Many problems involve action in the presence of a time dependent force. There are two extreme cases that
are often encountered. One case is an impulsive force that acts for a very short time, for example, striking
a ball with a bat, or the collision of two cars. The second case involves an oscillatory time dependent force.
The response to impulsive forces is discussed below whereas the response to oscillatory time-dependent forces
is discussed in chapter 3.

Translational impulsive forces

An impulsive force acts for a very short time relative to the response time of the mechanical system being
discussed. In principle the equation of motion can be solved if the complicated time dependence of the force,
$F(t)$, is known. However, often it is possible to use the much simpler approach employing the concept of an
impulse and the principle of the conservation of linear momentum.
Define the linear impulse $\mathbf{P}$ to be the first-order time integral of the time-dependent force.

$$\mathbf{P} \equiv \int \mathbf{F}(t) \, dt \tag{2.135}$$

Since $\mathbf{F}(t) = \frac{d\mathbf{p}}{dt}$, equation 2.135 gives that

$$\mathbf{P} = \int_0^t \frac{d\mathbf{p}}{dt'} \, dt' = \int_0^t d\mathbf{p} = \mathbf{p}(t) - \mathbf{p}_0 = \Delta \mathbf{p} \tag{2.136}$$

Thus the impulse $\mathbf{P}$ is an unambiguous quantity that equals the change in linear momentum of the object
that has been struck, which is independent of the details of the time dependence of the impulsive force.
Computation of the spatial motion still requires knowledge of $F(t)$ since equation 2.136 can be written as

$$\mathbf{v}(t) = \frac{1}{m} \int_0^t \mathbf{F}(t') \, dt' + \mathbf{v}_0 \tag{2.137}$$

Integration gives

$$\mathbf{r}(t) - \mathbf{r}_0 = \mathbf{v}_0 t + \int_0^t \left[ \frac{1}{m} \int_0^{t''} \mathbf{F}(t') \, dt' \right] dt'' \tag{2.138}$$

In general this is complicated. However, for the case of a constant force $\mathbf{F}(t) = \mathbf{F}_0$, this simplifies to the
constant acceleration equation

$$\mathbf{r}(t) - \mathbf{r}_0 = \mathbf{v}_0 t + \frac{1}{2} \frac{\mathbf{F}_0}{m} t^2 \tag{2.139}$$

where the constant acceleration $\mathbf{a} = \frac{\mathbf{F}_0}{m}$.
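
The distinction between equations 2.136 and 2.138 can be seen in a short sketch: two force pulses with the same time integral produce the same final velocity but different displacements. The pulse shapes (a constant force and a linear ramp), the parameter values and the function name are illustrative assumptions.

```python
def integrate_motion(force, m, t_end, steps=20_000):
    """Midpoint integration of dv/dt = F(t)/m for a body starting at rest at x = 0."""
    dt = t_end / steps
    v = x = 0.0
    for k in range(steps):
        dv = force((k + 0.5) * dt) / m * dt   # velocity change over this step
        x += (v + 0.5 * dv) * dt              # displacement over this step
        v += dv
    return v, x

tau, F0, m = 0.01, 100.0, 0.5                 # pulse length (s), force (N), mass (kg)

def constant_pulse(t):
    return F0                                 # impulse F0 * tau

def ramp_pulse(t):
    return 2.0 * F0 * t / tau                 # same impulse F0 * tau, different shape

v_const, x_const = integrate_motion(constant_pulse, m, tau)
v_ramp, x_ramp = integrate_motion(ramp_pulse, m, tau)
print(v_const, v_ramp, x_const, x_ramp)
```

Both pulses give $\Delta v = F_0 \tau / m$, whereas the displacement at the end of the pulse is $F_0 \tau^2 / 2m$ for the constant force (equation 2.139) and $F_0 \tau^2 / 3m$ for the ramp.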

Angular impulsive torques

Note that the principle of impulse also applies to angular motion. Define an impulsive torque $\mathbf{T}$ as the
first-order time integral of the time-dependent torque.

$$\mathbf{T} \equiv \int \mathbf{N}(t) \, dt \tag{2.140}$$

Since torque is related to the rate of change of angular momentum

$$\mathbf{N}(t) = \frac{d\mathbf{L}}{dt} \tag{2.141}$$

then

$$\mathbf{T} = \int_0^t \frac{d\mathbf{L}}{dt'} \, dt' = \int_0^t d\mathbf{L} = \mathbf{L}(t) - \mathbf{L}_0 = \Delta \mathbf{L} \tag{2.142}$$

Thus the impulsive torque $\mathbf{T}$ equals the change in angular momentum $\Delta \mathbf{L}$ of the struck body.

2.14 Example: Center of percussion of a baseball bat

When an impulsive force $\mathbf{P}$ strikes a bat of mass $M$ at a distance $\mathbf{s}$ from the center of mass, then both the
linear momentum of the center of mass, and the angular momentum about the center of mass, of the bat are changed.
Assume that the ball strikes the bat with an impulsive force $\mathbf{P} = \Delta \mathbf{p}^{ball}$ perpendicular to the
symmetry axis of the bat at the strike point $S$, which is a distance $s$ from the center of mass of the bat. The
translational impulse given to the bat equals the change in linear momentum of the ball as given by equation
2.136 coupled with the conservation of linear momentum

$$\mathbf{P} = \Delta \mathbf{p}^{bat}_{cm} = M \Delta \mathbf{v}^{bat}_{cm}$$

Similarly equation 2.142 gives that the angular impulse $\mathbf{T}$ equals the change in angular momentum about the
center of mass to be

$$\mathbf{T} = \mathbf{s} \times \mathbf{P} = \Delta \mathbf{L} = I_{cm} \Delta \boldsymbol{\omega}_{cm}$$

The above equations give that

$$\Delta \mathbf{v}_{cm} = \frac{\mathbf{P}}{M}$$

$$\Delta \boldsymbol{\omega}_{cm} = \frac{\mathbf{s} \times \mathbf{P}}{I_{cm}}$$

Assume that the bat was stationary prior to the strike, then after the strike the net translational velocity
of a point $O$ along the body-fixed symmetry axis of the bat at a distance $y$ from the center of mass, is given
by

$$\mathbf{v}(y) = \Delta \mathbf{v}_{cm} + \Delta \boldsymbol{\omega}_{cm} \times \mathbf{y} = \frac{\mathbf{P}}{M} + \frac{1}{I_{cm}} \left( (\mathbf{s} \times \mathbf{P}) \times \mathbf{y} \right) = \frac{\mathbf{P}}{M} + \frac{1}{I_{cm}} \left[ (\mathbf{s} \cdot \mathbf{y}) \mathbf{P} - (\mathbf{y} \cdot \mathbf{P}) \mathbf{s} \right]$$

Since $\mathbf{P}$ is perpendicular to the symmetry axis, along which both $\mathbf{s}$ and $\mathbf{y}$ lie, $(\mathbf{y} \cdot \mathbf{P}) = 0$, which simplifies
the above equation to

$$\mathbf{v}(y) = \Delta \mathbf{v}_{cm} + \Delta \boldsymbol{\omega}_{cm} \times \mathbf{y} = \frac{\mathbf{P}}{M} \left( 1 + \frac{M (\mathbf{s} \cdot \mathbf{y})}{I_{cm}} \right)$$

Note that the translational velocity of the location $O$ along the bat symmetry axis at a distance $y$ from the
center of mass is zero if the bracket equals zero, that is, if

$$\mathbf{s} \cdot \mathbf{y} = -\frac{I_{cm}}{M} = -k_{cm}^2$$

where $k_{cm}$ is called the radius of gyration of the body about the center of mass. When this condition is
satisfied there is no translational motion at the point $O$. This point on the $y$ axis lies on the opposite side of
the center of mass from the strike point $S$, and is called the center of percussion corresponding to the impulse
at the point $S$. The center of percussion often is referred to as the “sweet spot” for an object. For a baseball
bat the batter holds the bat at the center of percussion so that they do not feel an impulse in their hands when
the ball is struck at the point $S$. This principle is used extensively to design bats for all sports involving
striking a ball with a bat, such as cricket, squash, tennis, etc., as well as weapons such as swords and axes used
to decapitate opponents.
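
A minimal numerical sketch of this result follows, modelling the bat as a uniform rod so that $I_{cm} = M \ell^2 / 12$. The rod model, the dimensions and the function names are illustrative assumptions; a real bat has a non-uniform mass distribution.

```python
def center_of_percussion(I_cm, M, s):
    """Signed axis position y with zero velocity, from s*y = -I_cm/M."""
    return -I_cm / (M * s)

def point_velocity(P, M, I_cm, s, y):
    """Velocity just after the impulse P (applied at s) of the axis point at y."""
    return P / M * (1.0 + M * s * y / I_cm)

M, length = 0.9, 0.84          # hypothetical bat: 0.9 kg uniform rod, 0.84 m long
I_cm = M * length ** 2 / 12.0  # moment of inertia of a uniform rod about its center
s = 0.25                       # strike point, 0.25 m from the center of mass

y_cop = center_of_percussion(I_cm, M, s)
v_at_cop = point_velocity(1.0, M, I_cm, s, y_cop)
v_at_cm = point_velocity(1.0, M, I_cm, s, 0.0)
print(y_cop, v_at_cop, v_at_cm)
```

The negative sign of `y_cop` reflects that the center of percussion lies on the opposite side of the center of mass from the strike point.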

2.15 Example: Energy transfer in charged-particle scattering

Consider a particle of charge $+e_1$ moving with very high velocity $V_0$ along a straight line that passes a
distance $b$ from another charge $+e_2$ and mass $m$. Find the energy $E$ transferred to the mass $m$ during the
encounter assuming the force is given by Coulomb’s law of electrostatics. Since the charged particle $e_1$ moves at
very high speed it is assumed that charge $e_2$ does not change position during the encounter. Assume that charge
$e_1$ moves along the $y$ axis through the origin while charge $e_2$ is located on the $x$ axis at $x = b$.

Figure: Charged-particle scattering

Let us consider the impulse given to charge $e_2$ during the encounter. By symmetry the $y$ component must cancel
while the $x$ component is given by

$$dp_x = F_x \, dt = -\frac{e_1 e_2}{4 \pi \varepsilon_0 r^2} \cos\theta \, dt = -\frac{e_1 e_2}{4 \pi \varepsilon_0 r^2} \cos\theta \, \frac{dt}{d\theta} \, d\theta$$

But

$$r \dot{\theta} = -V_0 \cos\theta$$

where

$$\frac{b}{r} = \cos(\pi - \theta) = -\cos\theta$$

Thus

$$dp_x = -\frac{e_1 e_2}{4 \pi \varepsilon_0 V_0 b} \cos\theta \, d\theta$$

Integrating over $\frac{\pi}{2} < \theta < \frac{3\pi}{2}$ gives that the total momentum imparted to $e_2$ is

$$p_x = -\frac{e_1 e_2}{4 \pi \varepsilon_0 V_0 b} \int_{\frac{\pi}{2}}^{\frac{3\pi}{2}} \cos\theta \, d\theta = \frac{e_1 e_2}{2 \pi \varepsilon_0 V_0 b}$$

Thus the recoil energy of charge $e_2$ is given by

$$E_2 = \frac{p_x^2}{2m} = \frac{1}{2m} \left( \frac{e_1 e_2}{2 \pi \varepsilon_0 V_0 b} \right)^2$$
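
The momentum transfer can be cross-checked by integrating the transverse Coulomb force directly along the straight-line path, without the change of variable to $\theta$. In this sketch `k` stands for $e_1 e_2 / 4 \pi \varepsilon_0$; the function name, the finite path length and the numerical values are illustrative assumptions.

```python
def transverse_impulse(k, b, V0, span=1000.0, steps=200_000):
    """Integrate F_x dt for a charge passing at constant speed V0 along the y axis.

    k = e1*e2/(4*pi*eps0); the path runs from y = -span*b to y = +span*b.
    """
    y_max = span * b
    dy = 2.0 * y_max / steps
    total = 0.0
    for i in range(steps):
        y = -y_max + (i + 0.5) * dy
        total += k * b / (b * b + y * y) ** 1.5 * dy   # F_x at separation sqrt(b^2 + y^2)
    return total / V0                                   # dt = dy / V0

k, b, V0 = 2.0, 0.5, 10.0          # arbitrary consistent units
p_numeric = transverse_impulse(k, b, V0)
p_formula = 2.0 * k / (b * V0)     # equals e1*e2/(2*pi*eps0*V0*b)
print(p_numeric, p_formula)
```

The small residual difference comes from truncating the path at a finite distance rather than integrating to infinity.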

2.13 Solution of many-body equations of motion

The following are general methods used to solve Newton’s many-body equations of motion for practical
problems.

2.13.1 Analytic solution

In practical problems one has to solve a set of equations of motion since the forces depend on the location
of every body involved. For example one may be dealing with a set of coupled oscillators such as the
many components that comprise the suspension system of an automobile. Often the coupled equations of
motion comprise a set of coupled second-order differential equations. The first approach to solve such a
system is to try an analytic solution comprising a general solution of the homogeneous equation plus one
particular solution of the inhomogeneous equation. Another approach is to employ numeric integration using
a computer.

2.13.2 Successive approximation

When the system of coupled differential equations of motion is too complicated to solve analytically, one
can use the method of successive approximation. The differential equations are transformed to integral
equations. Then one starts with some initial conditions to make a first-order estimate of the functions. The
functions determined by this first-order estimate then are used in a second iteration, and this is repeated
until the solution converges. An example of this approach is when making Hartree-Fock calculations of the
electron distributions in an atom. The first-order calculation uses the electron distributions predicted by
the one-electron model of the atom. This result then is used to compute the influence of the electron charge
distribution around the nucleus on the charge distribution of the atom for a second iteration, etc.
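
The procedure described above can be sketched for a single first-order equation using Picard iteration, in which $\dot{x} = f(t, x)$ is rewritten as the integral equation $x(t) = x_0 + \int_0^t f(t', x(t')) \, dt'$ and iterated on a fixed grid. The test equation $\dot{x} = -x$, the grid size and the function name are illustrative assumptions; this is not the Hartree-Fock procedure itself.

```python
import math

def picard_solve(f, x0, t_end, n_grid=1001, tol=1e-10, max_iter=100):
    """Iterate x_{k+1}(t) = x0 + integral_0^t f(t', x_k(t')) dt' until it converges."""
    dt = t_end / (n_grid - 1)
    ts = [i * dt for i in range(n_grid)]
    xs = [x0] * n_grid                      # first estimate: x stays at its initial value
    for iteration in range(1, max_iter + 1):
        new = [x0]
        for i in range(1, n_grid):
            # trapezoid rule for the integral over [t_{i-1}, t_i] using the old estimate
            area = 0.5 * dt * (f(ts[i - 1], xs[i - 1]) + f(ts[i], xs[i]))
            new.append(new[-1] + area)
        change = max(abs(a - b) for a, b in zip(new, xs))
        xs = new
        if change < tol:
            return ts, xs, iteration
    raise RuntimeError("successive approximation did not converge")

ts, xs, n_iter = picard_solve(lambda t, x: -x, x0=1.0, t_end=1.0)
print(n_iter, xs[-1], math.exp(-1.0))
```

Each pass feeds the previous estimate back into the integral, which mirrors the iterate-until-converged loop described in the text.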

2.13.3 Perturbation method

The perturbation technique can be applied if the force separates into two parts $F = F_1 + F_2$ where $F_1 \gg F_2$
and the solution is known for the dominant $F_1$ part of the force. Then the correction to this solution due
to addition of the perturbation $F_2$ usually is easier to evaluate. As an example, consider that one of the
Space Shuttle thrusters fires. In principle one has all the gravitational forces acting plus the thrust force
of the thruster. The perturbation approach is to assume that the trajectory of the Space Shuttle in the
Earth’s gravitational field is known. Then the perturbation to this motion due to the very small thrust,
produced by the thruster, is evaluated as a small correction to the motion in the Earth’s gravitational field.
This perturbation technique is used extensively in physics, especially in quantum physics. An example
from my own research is scattering of a 1 GeV $^{208}$Pb ion in the Coulomb field of a $^{197}$Au nucleus. The
trajectory for elastic scattering is simple to calculate since neither nucleus is excited and the total energy and
momenta are conserved. However, usually one of these nuclei will be internally excited by the electromagnetic
interaction. This is called Coulomb excitation. The effect of the Coulomb excitation usually can be treated as
a perturbation by assuming that the trajectory is given by the elastic scattering solution and then calculating
the excitation probability assuming the Coulomb excitation of the nucleus is a small perturbation to the
trajectory.

2.14 Newton’s Law of Gravitation

Gravitation plays a fundamental role in classical mechanics as well as being an important example of a
conservative central $\left( \frac{1}{r} \right)^2$ force. Although you may not be familiar with use of vector calculus for the
gravitational field $\mathbf{g}$, it is assumed that you have met the identical approach for studies of the electric field
$\mathbf{E}$ in electrostatics. The primary difference is that mass $m$ replaces charge $e$ and gravitational field $\mathbf{g}$
replaces the electric field $\mathbf{E}$. This chapter reviews the concepts of vector calculus as used for study of
conservative inverse-square law central fields.
In 1666 Newton formulated the Theory of Gravitation which he eventually published in the Principia in 1687.
Newton’s Law of Gravitation states that each mass particle attracts every other particle in the universe with a
force that varies directly as the product of the masses and inversely as the square of the distance between them.
That is, the force on a gravitational point mass $m_G$ produced by a mass $M_G$ is

$$\mathbf{F}_m = -G \frac{m_G M_G}{r^2} \hat{\mathbf{r}} \tag{2.143}$$

where $\hat{\mathbf{r}}$ is the unit vector pointing from the gravitational mass $M_G$ to the gravitational mass $m_G$ as shown in
figure 2.7. Note that the force is attractive, that is, it points toward the other mass. This is in contrast to the
repulsive electrostatic force between two similar charges. Newton’s law was verified by Cavendish using a torsion
balance. The experimental value is $G = (6.6726 \pm 0.0008) \times 10^{-11}$ N·m$^2$/kg$^2$.

Figure 2.7: Gravitational force on mass $m$ due to an infinitesimal volume element of the mass density distribution.

The gravitational force between point particles can be extended to finite-sized bodies using the fact that
the gravitational force field satisfies the superposition principle, that is, the net force is the vector sum of the
individual forces between the component point particles. Thus the force summed over the mass distribution
is

$$\mathbf{F}_m(\mathbf{r}) = -G m_G \sum_{i=1}^{n} \frac{m_{G_i}}{r_i^2} \hat{\mathbf{r}}_i \tag{2.144}$$

where $\mathbf{r}_i$ is the vector from the gravitational mass $m_{G_i}$ to the gravitational mass $m_G$ at the position $\mathbf{r}$.
For a continuous gravitational mass distribution $\rho_G(\mathbf{r}')$, the net force on the gravitational mass $m_G$ at
the location $\mathbf{r}$ can be written as

$$\mathbf{F}_m(\mathbf{r}) = -G m_G \int_V \frac{\rho_G(\mathbf{r}') \, \widehat{(\mathbf{r} - \mathbf{r}')}}{|\mathbf{r} - \mathbf{r}'|^2} \, dV' \tag{2.145}$$

where $dV'$ is the volume element at the point $\mathbf{r}'$ as illustrated in figure 2.7.

2.14.1 Gravitational and inertial mass

Newton’s Laws use the concept of inertial mass $m_I \equiv m$ in relating the force $\mathbf{F}$ to acceleration $\mathbf{a}$

$$\mathbf{F} = m_I \mathbf{a} \tag{2.146}$$

and momentum $\mathbf{p}$ to velocity $\mathbf{v}$

$$\mathbf{p} = m_I \mathbf{v} \tag{2.147}$$

That is, inertial mass is the constant of proportionality relating the acceleration to the applied force.
The concept of gravitational mass $m_G$ is the constant of proportionality between the gravitational force
and the amount of matter. That is, on the surface of the earth, the gravitational force is assumed to be

$$\mathbf{F}_G = m_G \left[ -G \sum_{i=1}^{n} \frac{m_{G_i}}{r_i^2} \hat{\mathbf{r}}_i \right] = m_G \mathbf{g} \tag{2.148}$$

where $\mathbf{g}$ is the gravitational field which is a position-dependent force per unit gravitational mass pointing
towards the center of the Earth. The gravitational mass is measured when an object is weighed.
Newton’s Law of Gravitation leads to the relation for the gravitational field $\mathbf{g}(\mathbf{r})$ at the location $\mathbf{r}$ due
to a gravitational mass distribution at the location $\mathbf{r}'$ as given by the integral over the gravitational mass
density $\rho_G$

$$\mathbf{g}(\mathbf{r}) = -G \int_V \frac{\rho_G(\mathbf{r}') \, \widehat{(\mathbf{r} - \mathbf{r}')}}{|\mathbf{r} - \mathbf{r}'|^2} \, dV' \tag{2.149}$$

The acceleration of matter in a gravitational field relates the gravitational and inertial masses

$$\mathbf{F}_G = m_G \mathbf{g} = m_I \mathbf{a} \tag{2.150}$$

Thus

$$\mathbf{a} = \frac{m_G}{m_I} \mathbf{g} \tag{2.151}$$

That is, the acceleration of a body depends on the gravitational strength $g$ and the ratio of the gravitational
and inertial masses. It has been shown experimentally that all matter is subject to the same acceleration
in vacuum at a given location in a gravitational field. That is, $\frac{m_G}{m_I}$ is a constant common to all materials.
Galileo first showed this when he dropped objects from the Tower of Pisa. Modern experiments have shown
that this is true to 5 parts in $10^{13}$.
The exact equivalence of gravitational mass and inertial mass is called the weak principle of equivalence,
which underlies the General Theory of Relativity as discussed in chapter 17. It is convenient to use
the same unit for the gravitational and inertial masses and thus they both can be written in terms of the
common mass symbol $m$.

$$m_I = m_G = m \tag{2.152}$$

Therefore the subscripts $G$ and $I$ can be omitted in equations 2.150 and 2.152. Also the local acceleration
due to gravity $\mathbf{a}$ can be written as

$$\mathbf{a} = \mathbf{g} \tag{2.153}$$

The gravitational field $\mathbf{g} \equiv \frac{\mathbf{F}}{m}$ has units of N/kg in the MKS system while the acceleration $\mathbf{a}$ has units m/s$^2$.

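2.14.2 Gravitational potential energy $U$

Chapter 2.10.2 showed that a conservative field can be expressed in terms of the concept of a potential energy
$U(\mathbf{r})$ which depends on position. The potential energy difference $\Delta U_{a \to b}$ between two points $\mathbf{r}_a$ and $\mathbf{r}_b$
is the work done moving from $a$ to $b$ against a force $\mathbf{F}$. That is:

$$\Delta U_{a \to b} = U(\mathbf{r}_b) - U(\mathbf{r}_a) = -\int_a^b \mathbf{F} \cdot d\mathbf{l} \tag{2.154}$$

In general, this line integral depends on the path taken.

Figure 2.8: Work done against a force field moving from a to b.

Consider the gravitational field produced by a single point mass $m_1$. The work done moving a mass $m_0$ from
$\mathbf{r}_a$ to $\mathbf{r}_b$ in this gravitational field can be calculated along an arbitrary path shown in figure 2.8 by assuming
Newton’s law of gravitation. Then the force on $m_0$ due to point mass $m_1$ is

$$\mathbf{F} = -G \frac{m_1 m_0}{r^2} \hat{\mathbf{r}} \tag{2.155}$$

Expressing $d\mathbf{l}$ in spherical coordinates $d\mathbf{l} = dr \, \hat{\mathbf{r}} + r \, d\theta \, \hat{\boldsymbol{\theta}} + r \sin\theta \, d\phi \, \hat{\boldsymbol{\phi}}$ gives that the path
integral (2.154) from $(r_a, \theta_a, \phi_a)$ to $(r_b, \theta_b, \phi_b)$ is

$$\Delta U_{a \to b} = -\int_a^b \mathbf{F} \cdot d\mathbf{l} = G m_1 m_0 \int_a^b \frac{1}{r^2} \left[ \hat{\mathbf{r}} \cdot \hat{\mathbf{r}} \, dr + \hat{\mathbf{r}} \cdot \hat{\boldsymbol{\theta}} \, r \, d\theta + \hat{\mathbf{r}} \cdot \hat{\boldsymbol{\phi}} \, r \sin\theta \, d\phi \right] = G m_1 m_0 \int_{r_a}^{r_b} \frac{\hat{\mathbf{r}} \cdot \hat{\mathbf{r}}}{r^2} \, dr$$

$$= -G m_1 m_0 \left[ \frac{1}{r_b} - \frac{1}{r_a} \right] \tag{2.156}$$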

since the scalar product of the unit vectors $\hat{\mathbf{r}} \cdot \hat{\mathbf{r}} = 1$. Note that the second two terms also cancel since
$\hat{\mathbf{r}} \cdot \hat{\boldsymbol{\theta}} = \hat{\mathbf{r}} \cdot \hat{\boldsymbol{\phi}} = 0$ because the unit vectors are mutually orthogonal. Thus the line integral depends only on
the starting and ending radii and is independent of the angular coordinates or the detailed path taken between
$(r_a, \theta_a, \phi_a)$ and $(r_b, \theta_b, \phi_b)$.
Consider the Principle of Superposition for a gravitational field produced by a set of $n$ point masses. The
line integral then can be written as:

$$\Delta U^{net}_{a \to b} = -\int_a^b \mathbf{F}_{net} \cdot d\mathbf{l} = -\sum_{i=1}^{n} \int_a^b \mathbf{F}_i \cdot d\mathbf{l} = \sum_{i=1}^{n} \Delta U^{i}_{a \to b} \tag{2.157}$$

Thus the net potential energy difference is the sum of the contributions from each point mass producing the
gravitational force field. Since each component is conservative, then the total potential energy difference also
must be conservative. For a conservative force, this line integral is independent of the path taken; it depends
only on the starting and ending positions, $\mathbf{r}_a$ and $\mathbf{r}_b$. That is, the potential energy is a local function
dependent only on position. The usefulness of gravitational potential energy is that, since the gravitational
force is a conservative force, it is possible to solve many problems in classical mechanics using the fact
that the sum of the kinetic energy and potential energy is a constant. Note that the gravitational field is
conservative, since the potential energy difference $\Delta U^{net}_{a \to b}$ is independent of the path taken. It is conservative
because the force is radial and time independent; it is not due to the $\frac{1}{r^2}$ dependence of the field.
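
Path independence can be illustrated numerically by evaluating $-\int \mathbf{F} \cdot d\mathbf{l}$ along two different paths between the same end points and comparing with equation 2.156. In this sketch `K` stands for $G m_1 m_0$ in arbitrary units, and the two paths (a straight line and an out-of-plane spiral) and all names are illustrative assumptions.

```python
import math

K = 1.0   # stands for G*m1*m0 in arbitrary units

def force(p):
    """Attractive inverse-square force toward the origin at point p."""
    r = math.sqrt(sum(c * c for c in p))
    return tuple(-K * c / r ** 3 for c in p)

def potential_energy_change(path, steps=20_000):
    """Evaluate -integral of F.dl along path(s), s in [0, 1], by the midpoint rule."""
    total = 0.0
    prev = path(0.0)
    for i in range(1, steps + 1):
        cur = path(i / steps)
        mid = tuple(0.5 * (p + q) for p, q in zip(prev, cur))
        F = force(mid)
        total -= sum(f * (q - p) for f, p, q in zip(F, prev, cur))
        prev = cur
    return total

a, b = (1.0, 0.0, 0.0), (0.0, 3.0, 0.0)   # r_a = 1, r_b = 3

def straight(s):
    return tuple(p + s * (q - p) for p, q in zip(a, b))

def spiral(s):
    """Hypothetical detour that leaves the x-y plane but has the same end points."""
    r = 1.0 + 2.0 * s
    ang = 0.5 * math.pi * s
    return (r * math.cos(ang), r * math.sin(ang), math.sin(math.pi * s))

dU_straight = potential_energy_change(straight)
dU_spiral = potential_energy_change(spiral)
dU_formula = -K * (1.0 / 3.0 - 1.0 / 1.0)   # equation 2.156
print(dU_straight, dU_spiral, dU_formula)
```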

2.14.3 Gravitational potential $\phi$

Using $\mathbf{F} = m_0 \mathbf{g}$ gives that the change in potential energy due to moving a mass $m_0$ from $a$ to $b$ in a
gravitational field $\mathbf{g}$ is:

$$\Delta U^{net}_{a \to b} = -m_0 \int_a^b \mathbf{g}_{net} \cdot d\mathbf{l} \tag{2.158}$$

Note that the probe mass $m_0$ factors out from the integral. It is convenient to define a new quantity called
gravitational potential $\phi$ where

$$\Delta \phi^{net}_{a \to b} = \frac{\Delta U^{net}_{a \to b}}{m_0} = -\int_a^b \mathbf{g}_{net} \cdot d\mathbf{l} \tag{2.159}$$

That is, gravitational potential difference is the work that must be done, per unit mass, to move from a to
b with no change in kinetic energy. Be careful not to confuse the gravitational potential energy difference
$\Delta U_{a \to b}$ and gravitational potential difference $\Delta \phi_{a \to b}$, that is, $\Delta U$ has units of energy, J, while $\Delta \phi$ has
units of J/kg.
The gravitational potential is a property of the gravitational force field; it is given as minus the line
integral of the gravitational field from $a$ to $b$. The change in gravitational potential energy for moving a
mass $m_0$ from $a$ to $b$ is given in terms of gravitational potential by:

$$\Delta U^{net}_{a \to b} = m_0 \Delta \phi^{net}_{a \to b} \tag{2.160}$$

Superposition and potential

Previously it was shown that the gravitational force is conservative for the superposition of many masses.
To recap, if the gravitational field

$$\mathbf{g}_{net} = \mathbf{g}_1 + \mathbf{g}_2 + \mathbf{g}_3 \tag{2.161}$$

then

$$\Delta \phi^{net}_{a \to b} = -\int_a^b \mathbf{g}_{net} \cdot d\mathbf{l} = -\int_a^b \mathbf{g}_1 \cdot d\mathbf{l} - \int_a^b \mathbf{g}_2 \cdot d\mathbf{l} - \int_a^b \mathbf{g}_3 \cdot d\mathbf{l} = \sum_i^n \Delta \phi^{i}_{a \to b} \tag{2.162}$$

Thus gravitational potential is a simple additive scalar field because the Principle of Superposition applies.
The gravitational potential difference between two points differing by $h$ in height is $gh$. Clearly, the greater $g$
or $h$, the greater the energy released by the gravitational field when dropping a body through the height $h$. The
unit of gravitational potential is the Joule/kg.

2.14.4 Potential theory

The gravitational force and electrostatic force both obey the inverse square law, for which the field and
corresponding potential are related by:

$$\Delta \phi_{a \to b} = -\int_a^b \mathbf{g} \cdot d\mathbf{l} \tag{2.163}$$

For an arbitrary infinitesimal element distance $d\mathbf{l}$ the change in gravitational potential $d\phi$ is

$$d\phi = -\mathbf{g} \cdot d\mathbf{l} \tag{2.164}$$

Using cartesian coordinates both $\mathbf{g}$ and $d\mathbf{l}$ can be written as

$$\mathbf{g} = \hat{\mathbf{i}} g_x + \hat{\mathbf{j}} g_y + \hat{\mathbf{k}} g_z \qquad d\mathbf{l} = \hat{\mathbf{i}} \, dx + \hat{\mathbf{j}} \, dy + \hat{\mathbf{k}} \, dz \tag{2.165}$$

Taking the scalar product gives:

$$d\phi = -\mathbf{g} \cdot d\mathbf{l} = -g_x \, dx - g_y \, dy - g_z \, dz \tag{2.166}$$

Differential calculus expresses the change in potential $d\phi$ in terms of partial derivatives by:

$$d\phi = \frac{\partial \phi}{\partial x} dx + \frac{\partial \phi}{\partial y} dy + \frac{\partial \phi}{\partial z} dz \tag{2.167}$$

By association, 2.166 and 2.167 imply that

$$g_x = -\frac{\partial \phi}{\partial x} \qquad g_y = -\frac{\partial \phi}{\partial y} \qquad g_z = -\frac{\partial \phi}{\partial z} \tag{2.168}$$

Thus on each axis, the gravitational field can be written as minus the gradient of the gravitational potential.
In three dimensions, the gravitational field is minus the total gradient of potential and the gradient of the
scalar function $\phi$ can be written as:

$$\mathbf{g} = -\nabla \phi \tag{2.169}$$

In cartesian coordinates this equals

$$\mathbf{g} = -\left[ \hat{\mathbf{i}} \frac{\partial \phi}{\partial x} + \hat{\mathbf{j}} \frac{\partial \phi}{\partial y} + \hat{\mathbf{k}} \frac{\partial \phi}{\partial z} \right] \tag{2.170}$$

Thus the gravitational field is just minus the gradient of the gravitational potential, which always is
perpendicular to the equipotentials. Skiers are familiar with the concept of gravitational equipotentials and the
fact that the line of steepest descent, and thus maximum acceleration, is perpendicular to gravitational
equipotentials of constant height. The advantage of using potential theory for inverse-square law forces is that
scalar potentials replace the more complicated vector forces, which greatly simplifies calculation. Potential theory
plays a crucial role for handling both gravitational and electrostatic forces.
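
Equation 2.169 can be checked numerically by differentiating the point-mass potential $\phi = -GM/r$ with central differences and comparing with the inverse-square field. The value of `GM` is an approximate figure for the Earth, and the field point, step size and function names are illustrative assumptions.

```python
import math

GM = 3.986e14   # approximate G*M for the Earth, in m^3/s^2

def phi(x, y, z):
    """Gravitational potential of a point mass at the origin."""
    return -GM / math.sqrt(x * x + y * y + z * z)

def field_from_potential(x, y, z, h=100.0):
    """g = -grad(phi), estimated by central differences with step h in metres."""
    gx = -(phi(x + h, y, z) - phi(x - h, y, z)) / (2.0 * h)
    gy = -(phi(x, y + h, z) - phi(x, y - h, z)) / (2.0 * h)
    gz = -(phi(x, y, z + h) - phi(x, y, z - h)) / (2.0 * h)
    return gx, gy, gz

def field_direct(x, y, z):
    """Inverse-square field g = -GM * r_hat / r^2."""
    r = math.sqrt(x * x + y * y + z * z)
    return tuple(-GM * c / r ** 3 for c in (x, y, z))

point = (4.0e6, 3.0e6, 5.0e6)              # an arbitrary field point
g_numeric = field_from_potential(*point)
g_exact = field_direct(*point)
print(g_numeric, g_exact)
```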

2.14.5 Curl of the gravitational field

It has been shown that the gravitational field is conservative, that is, $\Delta U_{a \to b}$ is independent of the path
taken between $a$ and $b$. Therefore, equation 2.159 gives that the gravitational potential is independent of
the path taken between two points $a$ and $b$. Consider two possible paths between $a$ and $b$ as shown in figure 2.9.
The line integral from $a$ to $b$ via route 1 is equal and opposite to the line integral back from $b$ to $a$ via
route 2 if the gravitational field is conservative as shown earlier.

Figure 2.9: Circulation of the gravitational field.

A better way of expressing this is that the line integral of the gravitational field is zero around any closed
path. Thus the line integral between $a$ and $b$, via path 1, and returning back to $a$, via path 2, are equal and
opposite. That is, the net line integral for a closed loop is zero

$$\oint \mathbf{g}_{net} \cdot d\mathbf{l} = 0 \tag{2.171}$$

which is a measure of the circulation of the gravitational field. The fact that the circulation equals zero
corresponds to the statement that the gravitational field is radial for a point mass.
Stokes Theorem, discussed in appendix 3, states that

$$\oint_C \mathbf{F} \cdot d\mathbf{l} = \int_{\text{Area bounded by } C} (\nabla \times \mathbf{F}) \cdot d\mathbf{S} \tag{2.172}$$

Thus the zero circulation of the gravitational field can be rewritten as

$$\oint_C \mathbf{g} \cdot d\mathbf{l} = \int_{\text{Area bounded by } C} (\nabla \times \mathbf{g}) \cdot d\mathbf{S} = 0 \tag{2.173}$$

Since this is independent of the shape of the perimeter $C$, therefore

$$\nabla \times \mathbf{g} = 0 \tag{2.174}$$

That is, the gravitational field is a curl-free field.

A property of any curl-free field is that it can be expressed as the gradient of a scalar potential $\phi$ since

$$\nabla \times \nabla \phi = 0 \tag{2.175}$$

Therefore, the curl-free gravitational field can be related to a scalar potential $\phi$ as

$$\mathbf{g} = -\nabla \phi \tag{2.176}$$

Thus $\phi$ is consistent with the above definition of gravitational potential $\phi$ in that the scalar product

$$\Delta \phi_{a \to b} = -\int_a^b \mathbf{g}_{net} \cdot d\mathbf{l} = \int_a^b (\nabla \phi) \cdot d\mathbf{l} = \int_a^b \sum_i \frac{\partial \phi}{\partial x_i} dx_i = \int_a^b d\phi \tag{2.177}$$

An identical relation between the electric field and electric potential applies for the inverse-square law
electrostatic field.

Reference potentials:
Note that only differences in potential energy, $U$, and gravitational potential, $\phi$, are meaningful; the absolute
values depend on some arbitrarily chosen reference. However, often it is useful to measure gravitational
potential with respect to a particular arbitrarily chosen reference point $\phi_0$ such as sea level. Aircraft
pilots are required to set their altimeters to read with respect to sea level rather than their departure
airport. This ensures that aircraft leaving from, say, both Rochester (559 ft) and Denver (5000 ft) have
their altimeters set to a common reference to ensure that they do not collide. The gravitational field is minus
the gradient of the gravitational potential, which depends only on differences in potential, and thus is
independent of any constant reference.

Gravitational potential due to continuous distributions of mass
Suppose mass is distributed over a volume $V$ with a density $\rho$ at any point within the volume. The gravitational
potential at any field point $p$ due to an element of mass $dm = \rho \, dV'$ at the point $p'$ is given by:

$$\Delta \phi_{\infty \to p} = -G \int_V \frac{\rho(p') \, dV'}{r_{p'p}} \tag{2.178}$$

This integral is over a scalar quantity. Since gravitational potential $\phi$ is a scalar quantity, it is easier to
compute than is the vector gravitational field $\mathbf{g}$. If the scalar potential field is known, then the gravitational
field is derived by taking minus the gradient of the gravitational potential.

2.14.6 Gauss’s Law for Gravitation

The flux $\Phi$ of the gravitational field $\mathbf{g}$ through a surface $S$, as shown in figure 2.10, is defined as

$$\Phi \equiv \int_S \mathbf{g}\cdot d\mathbf{S} \tag{2.179}$$

Note that there are two possible perpendicular directions that could be chosen for the surface vector $d\mathbf{S}$. Using Newton’s law of gravitation for a point mass $m$, the flux through the surface $S$ is

$$\Phi = -Gm\int_S \frac{\hat{\mathbf{r}}\cdot d\mathbf{S}}{r^2} \tag{2.180}$$

Note that the solid angle subtended by the surface $d\mathbf{S}$ at an angle $\theta$ to the normal from the point mass is given by

$$d\Omega = \frac{\cos\theta\,dS}{r^2} = \frac{\hat{\mathbf{r}}\cdot d\mathbf{S}}{r^2} \tag{2.181}$$

Thus the net gravitational flux equals

$$\Phi = -Gm\int_S d\Omega \tag{2.182}$$

Figure 2.10: Flux of the gravitational field through an infinitesimal surface element dS.

Consider a closed surface where the direction of the surface vector $d\mathbf{S}$ is defined as outwards. The net flux out of this closed surface is given by

$$\Phi = -Gm\oint_S \frac{\hat{\mathbf{r}}\cdot d\mathbf{S}}{r^2} = -Gm\oint_S d\Omega = -4\pi Gm \tag{2.183}$$

This is independent of where the point mass lies within the closed surface and of the shape of the closed surface. Note that the net solid angle subtended is zero if the point mass lies outside the closed surface. Thus the flux is as given by equation 2.183 if the mass is enclosed by the closed surface, while it is zero if the mass is outside of the closed surface.
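The claim that the flux is $-4\pi Gm$ wherever the mass sits inside the surface, and zero when it is outside, can be checked numerically. The sketch below is illustrative only: it assumes a spherical surface of radius `R` centred on the origin and a point mass on the $z$ axis at distance `a` from the centre, so that the surface integral reduces to a one-dimensional integral over the polar angle. The mass value and the offsets are arbitrary choices.

```python
import math

G = 6.674e-11  # gravitational constant, N m^2 / kg^2

def flux_through_sphere(m, a, R=1.0, n=20000):
    """Flux of g through a sphere of radius R centred on the origin,
    for a point mass m on the z axis at distance a from the centre.
    Midpoint rule in the polar angle theta; azimuthal symmetry is used."""
    d_theta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        u = math.cos(theta)
        s2 = R * R + a * a - 2.0 * R * a * u         # squared distance from mass to surface point
        g_dot_n = -G * m * (R - a * u) / s2 ** 1.5   # component of g along the outward normal
        total += g_dot_n * 2.0 * math.pi * R * R * math.sin(theta) * d_theta
    return total

m = 5.0                                   # kg, arbitrary
expected = -4.0 * math.pi * G * m         # equation 2.183
inside_centre = flux_through_sphere(m, 0.0)
inside_offset = flux_through_sphere(m, 0.5)
outside = flux_through_sphere(m, 2.0)
```

Under these assumptions `inside_centre` and `inside_offset` should both reproduce `expected`, while `outside` should vanish to within the integration error.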
Since the flux for a point mass is independent of the location of the mass within the volume enclosed by the closed surface, and using the principle of superposition for the gravitational field, then for $n$ enclosed point masses the net flux is

$$\Phi \equiv \int_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\sum_i^n m_i \tag{2.184}$$

This can be extended to continuous mass distributions, with local mass density $\rho$, giving that the net flux is

$$\Phi \equiv \int_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_V \rho\,d\tau \tag{2.185}$$


Gauss’s Divergence Theorem was given in appendix 2 as

$$\Phi = \oint_S \mathbf{F}\cdot d\mathbf{S} = \int_V \nabla\cdot\mathbf{F}\,d\tau \tag{2.186}$$

Applying the Divergence Theorem to Gauss’s law gives that

$$\Phi = \oint_S \mathbf{g}\cdot d\mathbf{S} = \int_V \nabla\cdot\mathbf{g}\,d\tau = -4\pi G\int_V \rho\,d\tau$$

or

$$\int_V \left[\nabla\cdot\mathbf{g} + 4\pi G\rho\right] d\tau = 0 \tag{2.187}$$


This is true independent of the shape of the surface, thus the divergence of the gravitational field is

$$\nabla\cdot\mathbf{g} = -4\pi G\rho \tag{2.188}$$

This is a statement that the gravitational field of a point mass has a $1/r^2$ dependence.
Using the fact that the gravitational field is conservative, the field can be expressed as the gradient of the gravitational potential $\phi$

$$\mathbf{g} = -\nabla\phi \tag{2.189}$$

and Gauss’s law then becomes

$$\nabla\cdot\nabla\phi = 4\pi G\rho \tag{2.190}$$

which also can be written as Poisson’s equation

$$\nabla^2\phi = 4\pi G\rho \tag{2.191}$$

Knowing the mass distribution $\rho$ allows determination of the potential by solving Poisson’s equation.
A special case that often is encountered is when the mass distribution is zero in a given region. Then the potential for this region can be determined by solving Laplace’s equation with known boundary conditions.

$$\nabla^2\phi = 0 \tag{2.192}$$

For example, Laplace’s equation applies in the free space between the masses. It is used extensively in electrostatics to compute the electric potential between charged conductors which themselves are equipotentials.

2.14.7 Condensed forms of Newton’s Law of Gravitation

The above discussion has resulted in several alternative expressions of Newton’s Law of Gravitation that will
be summarized here. The most direct statement of Newton’s law is
$$\mathbf{g}(\mathbf{r}) = -G\int_V \frac{\rho(\mathbf{r}')\,\widehat{(\mathbf{r}-\mathbf{r}')}}{(\mathbf{r}-\mathbf{r}')^2}\,d\tau' \tag{2.193}$$

An elegant way to express Newton’s Law of Gravitation is in terms of the flux and circulation of the gravitational field. That is,
Flux:
$$\Phi \equiv \int_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_V \rho\,d\tau \tag{2.194}$$
Circulation:
$$\oint \mathbf{g}\cdot d\mathbf{l} = 0 \tag{2.195}$$
The flux and circulation are better expressed in terms of the vector differential concepts of divergence and curl.
Divergence:
$$\nabla\cdot\mathbf{g} = -4\pi G\rho \tag{2.196}$$
Curl:
$$\nabla\times\mathbf{g} = 0 \tag{2.197}$$
Remember that the flux and divergence of the gravitational field are statements that the field between point masses has a $1/r^2$ dependence. The circulation and curl are statements that the field between point masses is radial.
Because the gravitational field is conservative it is possible to use the concept of the scalar potential field $\phi$. This concept is especially useful for solving some problems since the gravitational potential can be evaluated using the scalar integral
$$\Delta\phi_{\infty\to r} = -G\int_V \frac{\rho(\mathbf{r}')\,d\tau'}{|\mathbf{r}-\mathbf{r}'|} \tag{2.198}$$

An alternate approach is to solve Poisson’s equation if the boundary values and mass distributions are known, where Poisson’s equation is:
$$\nabla^2\phi = 4\pi G\rho \tag{2.199}$$
These alternate expressions of Newton’s law of gravitation can be exploited to solve problems. The method of solution is identical to that used in electrostatics.

2.16 Example: Field of a uniform sphere

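Consider the simple case of the gravitational field due to a uniform sphere of matter of radius $R$ and mass $M$. Then the volume mass density is
$$\rho = \frac{3M}{4\pi R^3}$$
The gravitational field and potential for this uniform sphere of matter can be derived three ways:
a) The field can be evaluated by directly integrating over the volume
$$\mathbf{g}(\mathbf{r}) = -G\int_V \frac{\rho(\mathbf{r}')\,\widehat{(\mathbf{r}-\mathbf{r}')}}{(\mathbf{r}-\mathbf{r}')^2}\,d\tau'$$
b) The potential can be evaluated directly by integration of
$$\Delta\phi_{\infty\to r} = -G\int_V \frac{\rho(\mathbf{r}')\,d\tau'}{|\mathbf{r}-\mathbf{r}'|}$$
and then
$$\mathbf{g} = -\nabla\phi$$
c) The obvious spherical symmetry can be used in conjunction with Gauss’s law to easily solve this problem.
$$\int_S \mathbf{g}\cdot d\mathbf{S} = -4\pi G\int_V \rho\,d\tau$$
$$4\pi r^2 g(r) = -4\pi GM \qquad (r > R)$$
That is, for $r > R$
$$\mathbf{g} = -\frac{GM}{r^2}\,\hat{\mathbf{r}} \qquad (r > R)$$
Similarly, for $r < R$
$$4\pi r^2 g(r) = -4\pi G\,\frac{4}{3}\pi r^3\rho \qquad (r < R)$$
That is:
$$\mathbf{g} = -\frac{GM}{R^3}\,\mathbf{r} \qquad (r < R)$$

[Figure: Gravitational field $g$ and gravitational potential $\phi$ of a uniformly-dense spherical mass distribution of radius $R$. The plotted curves are labelled $-\frac{GM}{R^3}r$ and $-\frac{GM}{2R^3}(3R^2 - r^2)$ for $r < R$, and $-\frac{GM}{r^2}$ and $-\frac{GM}{r}$ for $r > R$.]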
The field inside the Earth is radial and is proportional to the distance from the center of the Earth. This is Hooke’s Law, and thus, ignoring air drag, any body dropped down a hole through the center of the Earth will undergo harmonic oscillations with an angular frequency of $\omega_0 = \sqrt{GM/R^3} = \sqrt{g/R}$. This gives a period of oscillation of 1.4 hours, which is about the length of a P235 lecture in classical mechanics, which may seem like a long time.
Clearly method (c) is much simpler to solve for this case. In general, look for a symmetry that allows
identification of a surface upon which the magnitude and direction of the field is constant. For such cases
use Gauss’s law. Otherwise use methods (a) or (b) whichever one is easiest to apply. Further examples will
not be given here since they are essentially identical to those discussed extensively in electrostatics.
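The results of method (c) can be evaluated numerically as a check on the quoted oscillation period. The sketch below treats the Earth as a uniform sphere, as the example does; the numerical values of $G$, $M$, and $R$ are assumed round values, not taken from the text, and the interior potential is the curve labelled in the accompanying figure.

```python
import math

G = 6.674e-11   # gravitational constant, N m^2 / kg^2
M = 5.972e24    # kg, assumed mass of the Earth
R = 6.371e6     # m, assumed mean radius of the Earth

def g_radial(r):
    """Radial component of g for a uniform sphere (negative means inward)."""
    if r < R:
        return -G * M * r / R ** 3      # linear, Hooke's-law region
    return -G * M / r ** 2              # inverse-square region

def phi(r):
    """Gravitational potential of the uniform sphere, zero at infinity."""
    if r < R:
        return -G * M * (3.0 * R ** 2 - r ** 2) / (2.0 * R ** 3)
    return -G * M / r

omega0 = math.sqrt(G * M / R ** 3)              # angular frequency in the tunnel
period_hours = 2.0 * math.pi / omega0 / 3600.0  # about 1.4 hours
surface_g = -g_radial(R)                        # about 9.8 m/s^2

# Consistency check: g = -d(phi)/dr at a point inside the sphere.
r_test, h = 0.5 * R, 1.0
g_from_phi = -(phi(r_test + h) - phi(r_test - h)) / (2.0 * h)
```

The real Earth is far from uniform in density, so the period found this way is only an estimate for the idealized model used in the example.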

2.15 Summary
Newton’s Laws of Motion:
A cursory review of Newtonian mechanics has been presented. The concept of inertial frames of reference
was introduced since Newton’s laws of motion apply only to inertial frames of reference.
Newton’s Law of motion
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{2.6}$$
leads to second-order equations of motion which can be difficult to handle for many-body systems.
Solution of Newton’s second-order equations of motion can be simplified using the three first-order integrals coupled with corresponding conservation laws. The first-order time integral for linear momentum is
$$\int_1^2 \mathbf{F}\,dt = \int_1^2 \frac{d\mathbf{p}}{dt}\,dt = (\mathbf{p}_2 - \mathbf{p}_1) \tag{2.10}$$
The first-order time integral for angular momentum is
$$\frac{d\mathbf{L}}{dt} = \mathbf{r}\times\frac{d\mathbf{p}}{dt} = \mathbf{N} \qquad \int_1^2 \mathbf{N}\,dt = \int_1^2 \frac{d\mathbf{L}}{dt}\,dt = (\mathbf{L}_2 - \mathbf{L}_1) \tag{2.16}$$
The first-order spatial integral is related to kinetic energy and the concept of work. That is
$$\mathbf{F} = \frac{dT}{d\mathbf{r}} \qquad \int_1^2 \mathbf{F}\cdot d\mathbf{r} = (T_2 - T_1) \tag{2.21}$$

The conditions that lead to conservation of linear and angular momentum and total mechanical energy were discussed for many-body systems. The important class of conservative forces was shown to apply if the position-dependent forces do not depend on time or velocity, and if the work done by a force, $\int_1^2 \mathbf{F}\cdot d\mathbf{r}$, is independent of the path taken between the initial and final locations. The total mechanical energy is a constant of motion when the forces are conservative.
It was shown that the motion of the center of mass of a many-body system or finite-sized body separates naturally for all three first-order integrals. The center of mass is that point about which

$$\sum_i^n m_i\mathbf{r}'_i = \int \mathbf{r}'\rho\,dV = 0 \qquad \text{(Centre of mass definition)}$$

where $\mathbf{r}'_i$ is the vector defining the location of mass $m_i$ with respect to the center of mass. The concept of center of mass greatly simplifies the description of the motion of finite-sized bodies and many-body systems by separating out the important internal interactions, and corresponding underlying physics, from the trivial overall translational motion of a many-body system.
The Virial theorem states that the time-averaged properties are related by
$$\langle T\rangle = -\frac{1}{2}\left\langle \sum_i \mathbf{F}_i\cdot\mathbf{r}_i \right\rangle \tag{2.86}$$
It was shown that the Virial theorem is useful for relating the time-averaged kinetic and potential energies,
especially for cases involving either linear or inverse-square forces.
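The Virial theorem can be checked numerically for the linear-force case. The sketch below assumes a one-dimensional harmonic oscillator with arbitrary illustrative values of `m`, `k`, and amplitude `A`, and averages over exactly one period. For $F = -kx$ the virial $-\frac{1}{2}\langle Fx\rangle$ equals the average potential energy, so the theorem predicts $\langle T\rangle = \langle U\rangle$.

```python
import math

m, k, A = 2.0, 8.0, 0.3          # assumed mass, spring constant, amplitude
omega = math.sqrt(k / m)
period = 2.0 * math.pi / omega
n = 10000                        # samples over exactly one period

T_sum = Fx_sum = U_sum = 0.0
for i in range(n):
    t = (i + 0.5) * period / n
    x = A * math.cos(omega * t)
    v = -A * omega * math.sin(omega * t)
    F = -k * x                   # linear restoring force
    T_sum += 0.5 * m * v * v
    Fx_sum += F * x
    U_sum += 0.5 * k * x * x

avg_T = T_sum / n                # time-averaged kinetic energy
virial = -0.5 * Fx_sum / n       # right-hand side of the Virial theorem
avg_U = U_sum / n                # time-averaged potential energy
```

With these values all three averages should equal $\frac{1}{4}kA^2$.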
Typical examples were presented of application of Newton’s equations of motion to solving systems
involving constant, linear, position-dependent, velocity-dependent, and time-dependent forces, to constrained
and unconstrained systems, as well as systems with variable mass. Rigid-body rotation about a body-fixed
rotation axis also was discussed.
It is important to be cognizant of the following limitations that apply to Newton’s laws of motion:
1) Newtonian mechanics assumes that all observables are measured to unlimited precision, that is, $t$, $E$, $\mathbf{p}$, $\mathbf{r}$ are known exactly. Quantum physics introduces limits to measurement due to wave-particle duality.
2) The Newtonian view is that time and position are absolute concepts. The Theory of Relativity shows that this is not true. Fortunately for most problems $v \ll c$ and thus Newtonian mechanics is an excellent approximation.
3) Another limitation, to be discussed later, is that it is impractical to solve the equations of motion for many interacting bodies such as all the molecules in a gas. Then it is necessary to resort to using statistical averages; this approach is called statistical mechanics.
Newton’s work constitutes a theory of motion in the universe that introduces the concept of causality.
Causality means that there is a one-to-one correspondence between cause and effect. Each force causes a known
effect that can be calculated. Thus the causal universe is pictured by philosophers to be a giant machine
whose parts move like clockwork in a predictable and predetermined way according to the laws of nature. This
is a deterministic view of nature. There are philosophical problems in that such a deterministic viewpoint
appears to be contrary to free will. That is, taken to the extreme it implies that you were predestined to
read this book because it is a natural consequence of this mechanical universe!

Newton’s Laws of Gravitation

Newton’s Laws of Gravitation and the Laws of Electrostatics are essentially identical since they both involve a central inverse square-law dependence of the forces. The important difference is that the gravitational force is attractive whereas the electrostatic force between identical charges is repulsive. That is, the gravitational constant $G$ is replaced by $-\frac{1}{4\pi\varepsilon_0}$, and the mass density $\rho$ becomes the charge density for the case of electrostatics. As a consequence it is unnecessary to make a detailed study of Newton’s law of gravitation since it is identical to what has already been studied in your accompanying electrostatic courses. Table 2.1 summarizes and compares the laws of gravitation and electrostatics. For both gravitation and electrostatics the field is central and conservative and depends as $\frac{1}{r^2}\hat{\mathbf{r}}$.
The laws of gravitation and electrostatics can be expressed in a more useful form in terms of the flux and circulation of the field, as given either in the vector integral or vector differential forms. The radial independence of the flux, and corresponding divergence, is a statement that the fields are radial and have a $\frac{1}{r^2}\hat{\mathbf{r}}$ dependence. The statement that the circulation, and corresponding curl, are zero is a statement that the fields are radial and conservative.

Table 21; Comparison of Newton’s law of gravitation and electrostatics.

Gravitation Electrostatics
Force field g ≡ F E ≡ F
Density Mass density  (r0 ) Charge density  (r0 )
R (r0 )(r−r0 ) 1
R (r0 )(r−r0 ) 0
Conservative central field g (r) = −  (r−r0 )2  0 E (r) = 4 2 
R R R 0  R 0)
(r−r
Flux Φ ≡  g · S = −4   Φ ≡  E · S = 10  
I  I 
Circulation g · l = 0 E · l = 0
Divergence ∇ · g = −4 ∇ · E = 10 
Curl ∇×g =0 ∇×E=0
R (0 )0 1
R (0 ) 0
Potential ∆∞→ = −  0  ∆∞→ = 4 0  0 
Poisson’s equation ∇2  = 4 ∇2  = − 10 

Both the gravitational and electrostatic central fields are conservative, making it possible to use the concept of the scalar potential field $\phi$. This concept is especially useful for solving some problems since the potential can be evaluated using a scalar integral. An alternate approach is to solve Poisson’s equation if the boundary values and mass distributions are known. The methods of solution of Newton’s law of gravitation are identical to those used in electrostatics and are readily accessible in the literature.

Workshop exercises
1. Spend a few minutes looking over the following problems, paying particular attention to the problems that
you think you might have trouble with. All of the problems are taken from an introductory physics course on
mechanics, so this should seem like review material. After you have had some time to look over the problems,
you will take turns stepping up to the board to solve one. When it is your turn, you may pick ANY of the
problems that have not already been solved. Depending on the number of students in the recitation, you may
be asked to solve more than one problem. Good luck!

(a) Justin fires a 12-gram bullet into a block of wood. The bullet travels at 190 m/s, penetrates the 2.0-kg
block of wood, and emerges going 150 m/s. If the block is stationary on a frictionless surface when hit,
how fast does it move after the bullet emerges?
(b) A mass $m$ at the end of a spring vibrates with a frequency of 0.88 Hz; when an additional 1.25 kg mass
is added to $m$, the frequency is 0.48 Hz. What is the value of $m$?
(c) Dan has a new chandelier in his living room. The chandelier is 27-kg and it hangs from the ceiling on a
vertical 4.0-m-long wire. What horizontal force would Dan need to use to displace its position 0.10 m to
one side? What will be the tension in the wire?
(d) Dianne has a new spring with a spring constant of 900 N/m that she bought at Springs-R-Us. She places
it vertically on a table and compresses it by 0.150 m. What upward speed can it give to a 0.300-kg ball
when released?
(e) A tiger leaps horizontally from a 6.5-m-high rock with a speed of 4.0 m/s. How far from the base of the
rock will she land?
(f) How much work must SuperRyan do to stop a 1300-kg car traveling at 100 km/hr?
(g) Jason catches a baseball 3.1 s after throwing it vertically upward. With what speed did he throw it and
what height did it reach?
(h) Laura is practicing her figure skating and during her finale she can increase her rotation rate from an
initial rate of 1.0 rev every 2.0 s to a final rate of 3.0 rev/s. If her initial moment of inertia was 4.6 kg·m²,
what is her final moment of inertia?
(i) On an icy day in Rochester (imagine that!), you worry about parking your car in your driveway, which
has an incline of 12°. Your neighbor Emily’s driveway has an incline of 9°, and Brian’s driveway across
the street has one of 6°. The coefficient of static friction between tire rubber and ice is 0.15. Which
driveway(s) will be safe to park a car?

2. Two particles are projected from the same point with velocities $v_1$ and $v_2$, at elevations $\alpha_1$ and $\alpha_2$, respectively
($\alpha_1 > \alpha_2$). Show that if they are to collide in mid-air the interval between the firings must be

$$\frac{2v_1v_2\sin(\alpha_1 - \alpha_2)}{g\,(v_1\cos\alpha_1 + v_2\cos\alpha_2)}$$

(If you don’t have time to solve this problem completely, then at least give an outline of how you would go
about solving the problem.)

3. Read each of the following statements and, without consulting anyone else, mark them true or false. If you are
unsure of any of them, make a guess. Once everyone has answered each of the statements individually, break
into small groups and compare your answers. Try to come to an agreement as a group. The Teaching Assistant
will then make sure everyone has the correct answer. Good luck!

(a) The conservation of linear momentum is a consequence of translational symmetry, or the homogeneity of
space.
(b) For an isolated system with no external forces acting on it, the angular momentum will remain constant
in both magnitude and direction.
(c) A reference frame is called an inertial frame if Newton’s laws are valid in that frame.
(d) Newtonian mechanics and the laws of electromagnetism are invariant under Galilean transformations.

(e) The law of conservation of angular momentum is a consequence of rotational symmetry, or the isotropy
of space.
(f) The center of mass of a system of particles moves like a single particle of mass $M$ (total mass of the
system) acted on by a single force $\mathbf{F}$ that is equal to the sum of all the external forces acting on the
system.
(g) If Newton’s laws are valid in one reference frame, then they are also valid in any reference frame accelerated
with respect to the first system.
(h) The law of conservation of energy is a consequence of inversion symmetry, or the invertibility of space.

4. The teeter toy comprises two identical weights which hang on drooping arms attached to a peg as shown.
The arrangement is unexpectedly stable and can be spun and rocked with little danger of toppling over.

[Figure: the teeter toy, with labels $l$, $L$, and $m$.]

(a) Find an expression for the potential energy of the teeter toy as a function of $\theta$ when the teeter toy is
cocked at an angle $\theta$ about the pivot point. For simplicity, consider only rocking motion in the vertical
plane.
(b) Determine the equilibrium value(s) of $\theta$.
(c) Determine whether the equilibrium is stable, unstable, or neutral for the value(s) of $\theta$ found in part (b).
(d) How could you determine the answers to parts (b) and (c) from a graph of the potential energy versus $\theta$?
(e) Expand the expression for the potential energy about $\theta = 0$ and determine the frequency of small
oscillations.

5. For each of the situations described below, determine which of the four functional forms of the force is most
appropriate. Consider motion only along one dimension.

• Constant force: $F = F_0$
• Time-dependent force: $F = F(t)$
• Velocity-dependent force: $F = F(v)$
• Distance-dependent force: $F = F(x)$

Go around the room and take turns answering a question. When it is your turn, pick a functional form and
explain why you chose the one you did. If you are unsure, make a guess or ask a question to get help from the
rest of the workshop. There may be more than one answer depending on your interpretation of the situation,
so be sure to explore all of the possibilities.

(a) A mass resting on a frictionless table is attached to a spring, which in turn is attached to a wall. The
mass is pulled to the side and executes simple harmonic motion in the horizontal direction.
(b) A freely-falling body subject to a constant gravitational field with no air resistance.
(c) An electron, initially at rest (treat it classically!), encounters an incoming electromagnetic wave of electric
field intensity $E$ given by $E = E_0\sin(\omega t + \phi)$.
(d) A large mass is affected by the gravitational field of another mass a distance $x$ away.

(e) A freely-falling body subject to a constant gravitational field with air resistance.
(f) A charged point particle is affected by the presence of another charged point particle a distance $x$ away.

6. A particle of mass $m$ is constrained to move on the frictionless inner surface of a cone of half-angle $\alpha$, as shown in the figure.

(a) Find the restrictions on the initial conditions such that the particle moves in a circular orbit about the
vertical axis.
(b) Determine whether this kind of orbit is stable.

7. Consider a thin rod of length  and mass  .

(a) Draw gravitational field lines and equipotential lines for the rod. What can you say about the equipotential
surfaces of the rod?
(b) Calculate the gravitational potential at a point  that is a distance  from one end of the rod and in a
direction perpendicular to the rod.
(c) Calculate the gravitational field at  by direct integration.
(d) Could you have used Gauss’s law to find the gravitational field at  ? Why or why not?

8. Consider a single particle of mass $m$.

(a) Determine the position $\mathbf{r}$ and velocity $\mathbf{v}$ of a particle in spherical coordinates.
(b) Determine the total mechanical energy of the particle in potential $U$.
(c) Assume the force is conservative. Show that $\mathbf{F} = -\nabla U$. Show that it agrees with Stokes’ theorem.
(d) Show that the angular momentum $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ of the particle is conserved. Hint: $\frac{d}{dt}(\mathbf{A}\times\mathbf{B}) = \mathbf{A}\times\frac{d\mathbf{B}}{dt} + \frac{d\mathbf{A}}{dt}\times\mathbf{B}$.

9. Consider a fluid with density $\rho$ and velocity $\mathbf{v}$ in some volume $V$. The mass current $\mathbf{J} = \rho\mathbf{v}$ determines the
amount of mass exiting the surface per unit time by the integral $\oint_S \mathbf{J}\cdot d\mathbf{A}$.

(a) Using the divergence theorem, prove the continuity equation, $\nabla\cdot\mathbf{J} + \frac{\partial\rho}{\partial t} = 0$.

10. A rocket of initial mass  burns fuel at constant rate  (kilograms per second), producing a constant force .
The total mass of available fuel is  . Assume the rocket starts from rest and moves in a fixed direction with
no external forces acting on it.

(a) Determine the equation of motion of the rocket.

(b) Determine the final velocity of the rocket.
(c) Determine the displacement of the rocket in time.

Problems
1. Consider a solid hemisphere of radius . Compute the coordinates of the center of mass relative to the center
of the spherical surface used to define the hemisphere.

2. A 2000 kg Ford Excursion was travelling south on Mt. Hope Avenue when it collided with your 1000 kg sports car travelling
west on Elmwood Avenue. The two badly-damaged cars became entangled in the collision and left a skid mark
that is 20 meters long in a direction 14° to the west of the original direction of travel of the Excursion. The
wealthy Excursion driver hires a high-powered lawyer who accuses you of speeding through the intersection.
Use your P235 knowledge, plus the police officer’s report of the recoil direction, the skid length, and knowledge
that the coefficient of sliding friction between the tires and road is $\mu = 0.6$, to deduce the original velocities of
both cars. Was either of the cars exceeding the 30 mph speed limit?

3. A particle of mass $m$ moving in one dimension has potential energy $U(x) = U_0\left[2\left(\frac{x}{a}\right)^2 - \left(\frac{x}{a}\right)^4\right]$ where $U_0$ and $a$
are positive constants.
a) Find the force $F(x)$ that acts on the particle.
b) Sketch $U(x)$. Find the positions of stable and unstable equilibrium.
c) What is the angular frequency $\omega$ of oscillations about the point of stable equilibrium?
d) What is the minimum speed the particle must have at the origin to escape to infinity?
e) At $t = 0$ the particle is at the origin and its velocity is positive and equal to the escape velocity. Find $x(t)$
and sketch the result.

4. a) Consider a single-stage rocket travelling in a straight line subject to an external force $F^{ex}$ acting along the
same line where $u$ is the exhaust velocity of the ejected fuel relative to the rocket. Show that the equation of
motion is
$$m\dot{v} = -u\dot{m} + F^{ex}$$

b) Specialize to the case of a rocket taking oﬀ vertically from rest in a uniform gravitational field  Assume
that the rocket ejects mass at a constant rate of ̇ = − where  is a positive constant. Solve the equation of
motion to derive the dependence of velocity on time.
c) The first couple of minutes of the launch of the Space Shuttle can be described roughly by: initial mass
$= 2\times10^6$ kg, mass after 2 minutes $= 1\times10^6$ kg, exhaust speed $u = 3000$ m/s, and initial velocity zero.
Estimate the velocity of the Space Shuttle after two minutes of flight.
d) Describe what would happen to a rocket where ̇  

5. A time-independent field $\mathbf{F}$ is conservative if $\nabla\times\mathbf{F} = 0$. Use this fact to test if the following fields are
conservative, and derive the corresponding potential $U$.
a)  =  +  +   =  +   =  + 

b)  = −−   = ln   = − + 

6. Consider a solid cylinder of mass  and radius  sliding without rolling down the smooth inclined face of a
wedge of mass  that is free to slide without friction on a horizontal plane floor. Use the coordinates shown
in the figure.
a) How far has the wedge moved by the time the cylinder has descended from rest a vertical distance  ?
b) Now suppose that the cylinder is free to roll down the wedge without slipping. How far does the wedge
move in this case if the cylinder rolls down a vertical distance  ?
c) In which case does the cylinder reach the bottom faster? How does this depend on the radius of the cylinder?

[Figure: cylinder on the inclined face of the wedge, with coordinate labels x and y.]

7. If the gravitational field vector is independent of the radial distance within a sphere, find the function describing
the mass density $\rho(r)$ of the sphere.
Chapter 3

Linear oscillators

3.1 Introduction
Oscillations are a ubiquitous feature in nature. Examples are the periodic motion of planets, the rise and fall
of the tides, water waves, the pendulum in a clock, musical instruments, sound waves, electromagnetic waves,
and wave-particle duality in quantal physics. Oscillatory systems all have the same basic mathematical form
although the names of the variables and parameters are different. The classical linear theory of oscillations
will be assumed in this chapter since: (1) The linear approximation is well obeyed when the amplitudes of
oscillation are small, that is, the restoring force obeys Hooke’s Law. (2) The Principle of Superposition
applies. (3) The linear theory allows most problems to be solved explicitly in closed form. This is in contrast
to non-linear systems where the motion can be complicated and even chaotic as discussed in chapter 4.

3.2 Linear restoring forces

An oscillatory system requires that there be a stable equilibrium about which the oscillations occur. Consider a conservative system with potential energy $U$ for which the force is given by

$$\mathbf{F} = -\nabla U \tag{3.1}$$

Figure 3.1 illustrates a conservative system that has three locations at which the restoring force is zero, that is, where the gradient of the potential is zero. Stable oscillations occur only around locations 1 and 3 whereas the system is unstable at the zero gradient location 2. Point 2 is called a separatrix in that an infinitesimal displacement of the particle from this separatrix will cause the particle to diverge towards either minimum 1 or 3 depending on which side of the separatrix the particle is displaced.

Figure 3.1: Stability for a one-dimensional potential U(x).

The requirements for stable oscillations about any point $x_0$ are that the potential energy must have the following properties.
Stability requirements
1) The potential has a stable position for which the restoring force is zero, i.e. $\left(\frac{dU}{dx}\right)_{x=x_0} = 0$
2) The potential $U$ must be positive and an even function of displacement $x - x_0$. That is, $\left(\frac{d^nU}{dx^n}\right)_{x_0} > 0$ where $n$ is even.
The requirement for the restoring force to be linear is that the restoring force for perturbation about a stable equilibrium at $x_0$ is of the form
$$\mathbf{F} = -k(x - x_0)\,\hat{\mathbf{x}} = m\ddot{x}\,\hat{\mathbf{x}} \tag{3.2}$$
The potential energy function for a linear oscillator has a pure parabolic shape about the minimum location, that is,
$$U = \frac{1}{2}k(x - x_0)^2 \tag{3.3}$$

where $x_0$ is the location of the minimum.

Fortunately, oscillatory systems often involve small amplitude oscillations about a stable minimum. For weak non-linear systems, where the amplitude of oscillation $\Delta x$ about the minimum is small, it is useful to make a Taylor expansion of the potential energy about the minimum. That is

$$U(\Delta x) = U(x_0) + \Delta x\frac{dU(x_0)}{dx} + \frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2} + \frac{\Delta x^3}{3!}\frac{d^3U(x_0)}{dx^3} + \frac{\Delta x^4}{4!}\frac{d^4U(x_0)}{dx^4} + \dots \tag{3.4}$$

By definition, at the minimum $\frac{dU(x_0)}{dx} = 0$ and thus equation 3.4 can be written as

$$\Delta U = U(\Delta x) - U(x_0) = \frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2} + \frac{\Delta x^3}{3!}\frac{d^3U(x_0)}{dx^3} + \frac{\Delta x^4}{4!}\frac{d^4U(x_0)}{dx^4} + \dots \tag{3.5}$$

For small amplitude oscillations, the system is linear if the second-order $\frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2}$ term in equation 3.5 is dominant.
The linearity for small amplitude oscillations greatly simplifies description of the oscillatory motion and
complicated chaotic motion is avoided. Most physical systems are approximately linear for small amplitude
oscillations, and thus the motion close to equilibrium approximates a linear harmonic oscillator.
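The linearization can be made concrete with a short numerical sketch. The potential used here, $U(x) = U_0\,[1 - \cos(x/a)]$, and the values of `U0`, `a`, and `m` are assumptions chosen purely for illustration; they do not come from the text. The second derivative at the minimum gives the effective spring constant, and the error of the parabolic approximation shows how quickly the higher-order terms of the Taylor expansion become important as the amplitude grows.

```python
import math

U0, a, m = 2.0, 0.5, 1.0   # assumed potential depth, length scale, and mass

def U(x):
    """Illustrative non-linear potential with a minimum at x = 0."""
    return U0 * (1.0 - math.cos(x / a))

def second_derivative(f, x0, h=1e-4):
    """Central-difference estimate of f''(x0)."""
    return (f(x0 + h) - 2.0 * f(x0) + f(x0 - h)) / h ** 2

k_eff = second_derivative(U, 0.0)       # effective spring constant d2U/dx2 at the minimum
omega_small = math.sqrt(k_eff / m)      # angular frequency of small oscillations

def harmonic_error(dx):
    """Relative error of the parabolic approximation at displacement dx."""
    exact = U(dx) - U(0.0)
    parabola = 0.5 * k_eff * dx ** 2
    return abs(parabola - exact) / exact

small = harmonic_error(0.1 * a)   # below 0.1 percent
large = harmonic_error(1.0 * a)   # several percent
```

For this potential the effective spring constant is $U_0/a^2$, and the parabolic term alone is accurate to better than 0.1% at a displacement of $0.1a$ but is off by almost 9% at a displacement of $a$.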

3.3 Linearity and superposition

An important aspect of linear systems is that the solutions obey the Principle of Superposition, that is, for the superposition of different oscillatory modes, the amplitudes add linearly. The linearly-damped linear oscillator is an example of a linear system in that it involves only linear operators, that is, it can be written in the operator form (appendix 2)
$$\left(\frac{d^2}{dt^2} + \Gamma\frac{d}{dt} + \omega_0^2\right)x(t) = A\cos\omega t \tag{3.6}$$
The quantity in the brackets on the left hand side is a linear operator that can be designated by $\mathbb{L}$ where
$$\mathbb{L}\,x(t) = F(t) \tag{3.7}$$
An important feature of linear operators is that they obey the principle of superposition. This property results from the fact that linear operators are distributive, that is
$$\mathbb{L}(x_1 + x_2) = \mathbb{L}(x_1) + \mathbb{L}(x_2) \tag{3.8}$$
Therefore if there are two solutions $x_1(t)$ and $x_2(t)$ for two different forcing functions $F_1(t)$ and $F_2(t)$
$$\mathbb{L}\,x_1(t) = F_1(t) \qquad \mathbb{L}\,x_2(t) = F_2(t) \tag{3.9}$$
then the addition of these two solutions, with arbitrary constants, also is a solution for linear operators.
$$\mathbb{L}(\alpha_1x_1 + \alpha_2x_2) = \alpha_1F_1(t) + \alpha_2F_2(t) \tag{3.10}$$
In general then
$$\mathbb{L}\left(\sum_{n=1}^{N}\alpha_nx_n(t)\right) = \left(\sum_{n=1}^{N}\alpha_nF_n(t)\right) \tag{3.11}$$
The left hand bracket can be identified as the linear combination of solutions
$$x(t) = \sum_{n=1}^{N}\alpha_nx_n(t) \tag{3.12}$$
while the driving force is a linear superposition of harmonic forces
$$F(t) = \sum_{n=1}^{N}\alpha_nF_n(t) \tag{3.13}$$

Thus these linear combinations also satisfy the general linear equation
$$\mathbb{L}\,x(t) = F(t) \tag{3.14}$$
Applicability of the Principle of Superposition to a system provides a tremendous advantage for handling
and solving the equations of motion of oscillatory systems.
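Superposition for the linearly-damped oscillator can be verified numerically. The sketch below is illustrative only: the values of `GAMMA` and `OMEGA0`, the two drive amplitudes and frequencies, and the mixing constants are arbitrary assumptions. It builds the steady-state response to each harmonic drive, forms a linear combination, and applies the operator with finite differences to confirm that the result matches the same combination of the drives.

```python
import cmath
import math

GAMMA, OMEGA0 = 0.4, 1.5   # assumed damping constant and natural angular frequency

def steady_state(A, w):
    """Steady-state solution x(t) for the drive A*cos(w*t)."""
    C = A / (OMEGA0 ** 2 - w ** 2 + 1j * GAMMA * w)   # complex amplitude
    def x(t):
        return (C * cmath.exp(1j * w * t)).real
    return x

def apply_L(x, t, h=1e-3):
    """Apply the operator (d2/dt2 + GAMMA d/dt + OMEGA0^2) by central differences."""
    d2 = (x(t + h) - 2.0 * x(t) + x(t - h)) / h ** 2
    d1 = (x(t + h) - x(t - h)) / (2.0 * h)
    return d2 + GAMMA * d1 + OMEGA0 ** 2 * x(t)

A1, w1 = 1.0, 0.7
A2, w2 = 0.5, 2.3
x1, x2 = steady_state(A1, w1), steady_state(A2, w2)
a1, a2 = 2.0, -3.0          # arbitrary constants in the linear combination

def combo(t):
    return a1 * x1(t) + a2 * x2(t)

def drive(t):
    return a1 * A1 * math.cos(w1 * t) + a2 * A2 * math.cos(w2 * t)

times = [0.1 * i for i in range(100)]
max_residual = max(abs(apply_L(combo, t) - drive(t)) for t in times)
```

The residual is limited only by the finite-difference step, which illustrates that the combined solution satisfies the combined driven equation without any new solving.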

3.4 Geometrical representations of dynamical motion

The powerful pattern-recognition capabilities of the human brain, coupled with geometrical representations
of the motion of dynamical systems, provide a sensitive probe of periodic motion. The geometry of the
motion often can provide more insight into the dynamics than inspection of mathematical functions. A
system with $n$ degrees of freedom is characterized by locations $q_i$, velocities $\dot{q}_i$ and momenta $p_i$, in addition to the time $t$ and instantaneous energy $H(t)$. Geometrical representations of the dynamical correlations are illustrated by the configuration space and phase space representations of these $2n + 2$ variables.

3.4.1 Configuration space $(q_i, q_j, t)$

A configuration space plot shows the correlated motion of two spatial coordinates $q_i$ and $q_j$ averaged over time. An example is the two-dimensional linear oscillator with two equations of motion and solutions
$$m\ddot{x} + k_xx = 0 \qquad m\ddot{y} + k_yy = 0 \tag{3.15}$$
$$x(t) = A\cos(\omega_xt) \qquad y(t) = B\cos(\omega_yt - \delta) \tag{3.16}$$
where $\omega = \sqrt{k/m}$. For unequal restoring force constants, $k_x \neq k_y$, the trajectory executes complicated Lissajous figures that depend on the angular frequencies $\omega_x$, $\omega_y$ and the phase factor $\delta$. When the ratio of the angular frequencies along the two axes is rational, that is, $\omega_x/\omega_y$ is a rational fraction, then the curve will repeat at regular intervals as shown in figure 3.2, and this shape depends on the phase difference. Otherwise the trajectory uniformly traverses the whole rectangle.
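The statement that the curve repeats when the frequency ratio is rational can be checked directly. The sketch below assumes integer angular frequencies 4 and 5, matching the Lissajous example plotted in figure 3.2, and an arbitrary phase; the helper names are hypothetical. For integer frequencies the shortest common repeat time is $2\pi$ divided by their greatest common divisor.

```python
import math

def lissajous(t, delta, wx=4, wy=5):
    """Point (x, y) on the curve x = cos(wx t), y = cos(wy t - delta)."""
    return math.cos(wx * t), math.cos(wy * t - delta)

def repeat_period(wx, wy):
    """Shortest time after which both integer-frequency cosines repeat."""
    return 2.0 * math.pi / math.gcd(wx, wy)

T = repeat_period(4, 5)     # equals 2*pi for the 4:5 frequency ratio
delta = math.pi / 4         # arbitrary phase difference
samples = [0.0, 0.37, 1.9, 5.2]

def gap(t):
    """Distance between the curve at t and at t + T."""
    x0, y0 = lissajous(t, delta)
    x1, y1 = lissajous(t + T, delta)
    return math.hypot(x1 - x0, y1 - y0)

closure_error = max(gap(t) for t in samples)
```

The closure error is at the level of floating-point rounding, confirming that the trajectory retraces itself after time `T` for any phase.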

Figure 3.2: Configuration plots of $(x, y)$ where $x = \cos(4t)$ and $y = \cos(5t - \delta)$ at four different phase values $\delta$. The curves are called Lissajous figures.
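
The repeat condition for a rational frequency ratio can be checked numerically. The following Python sketch is an illustration added here, not part of the original text; the helper names `lissajous_point` and `lissajous_period`, and the restriction to positive integer angular frequencies, are assumptions made for the example. For integer frequencies the curve of equation 3.16 closes after a time $2\pi/\gcd(\omega_x, \omega_y)$.

```python
import math

def lissajous_point(t, wx, wy, delta, A=1.0, B=1.0):
    """Position (x, y) from equation 3.16 at time t."""
    return A * math.cos(wx * t), B * math.cos(wy * t - delta)

def lissajous_period(wx, wy):
    """Repeat time of the figure, assuming positive integer angular frequencies."""
    return 2.0 * math.pi / math.gcd(wx, wy)

# Frequencies of figure 3.2; the phase value is an arbitrary choice.
wx, wy, delta = 4, 5, math.pi / 4
T = lissajous_period(wx, wy)
x0, y0 = lissajous_point(0.3, wx, wy, delta)
x1, y1 = lissajous_point(0.3 + T, wx, wy, delta)
print(T, abs(x1 - x0), abs(y1 - y0))   # the two differences should be near zero
```

The point at time $t + T$ coincides with the point at time $t$, which is the statement that the figure repeats at regular intervals.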

3.4.2 State space, $(q_i, \dot{q}_i, t)$

Visualization of a trajectory is enhanced by correlation of configuration $q_i$ and its corresponding velocity $\dot{q}_i$, which specifies the direction of the motion. The state space representation$^1$ is especially valuable when discussing Lagrangian mechanics, which is based on the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$.
The free undamped harmonic oscillator provides a simple illustration of state space. Consider a mass $m$ attached to a spring with linear spring constant $k$ for which the equation of motion is

$$-kx = m\ddot{x} = m\dot{x}\frac{d\dot{x}}{dx} \tag{3.17}$$

By integration this gives

$$\frac{1}{2}m\dot{x}^2 + \frac{1}{2}kx^2 = E \tag{3.18}$$

The first term in equation 3.18 is the kinetic energy, the second term is the potential energy, and $E$ is the total energy which is conserved for this system. This equation can be expressed in terms of the state space coordinates as

$$\frac{\dot{x}^2}{\left(\frac{2E}{m}\right)} + \frac{x^2}{\left(\frac{2E}{k}\right)} = 1 \tag{3.19}$$

This corresponds to the equation of an ellipse for a state-space plot of $\dot{x}$ versus $x$ as shown in figure 3.3. The elliptical paths shown correspond to contours of constant total energy, which is partitioned between kinetic and potential energy. For the coordinate axes shown, the motion of a representative point will be in a clockwise direction as the total oscillator energy is redistributed between potential and kinetic energy. The area of the ellipse is proportional to the total energy $E$.
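
The proportionality between the area and the energy follows directly from equation 3.19. The short worked step below is added for illustration; the symbols $a$ and $b$ for the semi-axes are introduced here and do not appear in the original text.

```latex
% Semi-axes of the state-space ellipse read off from equation 3.19
a = \sqrt{\frac{2E}{k}} \quad (\text{along } x), \qquad
b = \sqrt{\frac{2E}{m}} \quad (\text{along } \dot{x})

% Area enclosed by the ellipse
\text{Area} = \pi a b = \pi \sqrt{\frac{2E}{k}}\sqrt{\frac{2E}{m}} = \frac{2\pi E}{\sqrt{km}}
```

For fixed $k$ and $m$ the enclosed area therefore grows linearly with $E$.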

3.4.3 Phase space, $(q_i, p_i, t)$

Phase space, which was introduced by J.W. Gibbs for the field of statistical mechanics, provides a fundamental graphical representation in classical mechanics. The phase space coordinates $q_i, p_i$ are the conjugate coordinates $(\mathbf{q}, \mathbf{p})$ and are fundamental to Hamiltonian mechanics, which is based on the Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$. For a conservative system, only one phase-space curve passes through any point in phase space, like the flow of an incompressible fluid. This makes phase space more useful than state space where many curves pass through any location. Lanczos [La49] defined an extended phase space using four-dimensional relativistic space-time as discussed in chapter 17.
Since $p_x = m\dot{x}$ for the non-relativistic, one-dimensional, linear oscillator, equation 3.19 can be rewritten in the form

$$\frac{p_x^2}{2mE} + \frac{x^2}{\left(\frac{2E}{k}\right)} = 1 \tag{3.20}$$

This is the equation of an ellipse in the phase space diagram shown in Fig. 3.3-lower, which looks identical to Fig. 3.3-upper where the ordinate variable $p_x = m\dot{x}$. That is, the only difference is that the phase-space coordinates $(x, p_x)$ replace the state-space coordinates $(x, \dot{x})$. State space plots are used extensively in this chapter to describe oscillatory motion. Although phase space is more fundamental, both state space and phase space plots provide useful representations for characterizing and elucidating a wide variety of motion in classical mechanics. The following discussion of the undamped simple pendulum illustrates the general features of state space.

Figure 3.3: State space (upper), and phase space (lower) diagrams, for the linear harmonic oscillator.
$^1$ A universal name for the $(\mathbf{q}, \dot{\mathbf{q}}, t)$ representation has not been adopted in the literature. Therefore this book has adopted the name "state space" in common with reference [Ta05]. Lanczos [La49] uses the term "state space" to refer to the extended phase space $(\mathbf{q}, \mathbf{p}, t)$ discussed in chapter 17.

3.4.4 Plane pendulum

Consider a simple plane pendulum of mass $m$ attached to a string of length $l$ in a uniform gravitational field $g$. There is only one generalized coordinate, $\theta$. Since the moment of inertia of the simple plane-pendulum is $I = ml^2$, the kinetic energy is

$$T = \frac{1}{2}ml^2\dot{\theta}^2 \tag{3.21}$$

and the potential energy relative to the bottom dead center is

$$U = mgl(1 - \cos\theta) \tag{3.22}$$

Thus the total energy equals

$$E = \frac{1}{2}ml^2\dot{\theta}^2 + mgl(1 - \cos\theta) = \frac{p_\theta^2}{2ml^2} + mgl(1 - \cos\theta) \tag{3.23}$$

where $E$ is a constant of motion. Note that the angular momentum $p_\theta$ is not a constant of motion since its rate of change $\dot{p}_\theta = -mgl\sin\theta$ explicitly depends on $\theta$.
It is interesting to look at the solutions for the equation of motion for a plane pendulum on a $(\theta, \dot{\theta})$ state space diagram shown in figure 3.4. The curves shown are equally-spaced contours of constant total energy. Note that the trajectories are ellipses only at very small angles where $1 - \cos\theta \approx \frac{\theta^2}{2}$; the contours are non-elliptical for higher amplitude oscillations. When the energy is in the range $0 < E < 2mgl$ the motion corresponds to oscillations of the pendulum about $\theta = 0$. The center of the ellipse is at $(0, 0)$ which is a stable equilibrium point for the oscillation. However, when $|E| > 2mgl$ there is a phase change to rotational motion about the horizontal axis, that is, the pendulum swings around and over top dead center, i.e. it rotates continuously in one direction about the horizontal axis. The phase change occurs at $E = 2mgl$ and is designated by the separatrix trajectory.
Figure 34 shows two cycles for  to better illustrate
the cyclic nature of the phase diagram. The closed loops,
shown as fine solid lines, correspond to pendulum oscil-
lations about  = 0 or 2 for   2. The dashed
lines show rolling motion for cases where the total en-
ergy   2. The broad solid line is the separatrix
that separates the rolling and oscillatory motion. Note
that at the separatrix, the kinetic energy and ̇ are zero
when the pendulum is at top dead center which occurs
when  = ±The point ( 0) is an unstable equilib-
rium characterized by phase lines that are hyperbolic
to this unstable equilibrium point. Note that  = +
and − correspond to the same physical point, that is,
the phase diagram is better presented on a cylindri-
cal phase space representation since  is a cyclic vari-
able that cycles around the cylinder whereas ̇ oscillates
equally about zero having both positive and negative val- Figure 3.4: State space diagram for a plane pendu-
ues. The state-space diagram can be wrapped around a lum. The  axis is in units of  radians. Note that
cylinder, then the unstable and stable equilibrium points  = + and − correspond to the same physical
will be at diametrically opposite locations on the surface point, that is the phase diagram should be rolled
of the cylinder at ̇ = 0. For small oscillations about into a cylinder connected at  = ±.
equilibrium, also called librations, the correlation be-
tween ̇ and  is given by the clockwise closed loops wrapped on the cylindrical surface, whereas for energies
||  2 the positive ̇ corresponds to counterclockwise rotations while the negative ̇ corresponds to
clockwise rotations.
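
The division of the pendulum state space into librations, rotations and the separatrix can be sketched in a few lines of Python. This is an illustrative addition, not the author's code; the function names and the default values $m = l = 1$, $g = 9.81$ are assumptions chosen for the example. The separatrix branch follows from setting $E = 2mgl$ in equation 3.23, which gives $\dot{\theta} = \pm 2\sqrt{g/l}\,\cos(\theta/2)$ for $|\theta| \le \pi$.

```python
import math

def pendulum_energy(theta, theta_dot, m=1.0, l=1.0, g=9.81):
    """Total energy from equation 3.23."""
    return 0.5 * m * l**2 * theta_dot**2 + m * g * l * (1.0 - math.cos(theta))

def classify_motion(theta, theta_dot, m=1.0, l=1.0, g=9.81):
    """Compare the energy of a state-space point with the separatrix energy 2*m*g*l."""
    E = pendulum_energy(theta, theta_dot, m, l, g)
    E_sep = 2.0 * m * g * l
    if math.isclose(E, E_sep, rel_tol=1e-9):
        return "separatrix"
    return "libration" if E < E_sep else "rotation"

def separatrix_theta_dot(theta, l=1.0, g=9.81):
    """Upper (positive theta_dot) branch of the separatrix for |theta| <= pi."""
    return 2.0 * math.sqrt(g / l) * math.cos(theta / 2.0)

print(classify_motion(0.5, 0.0))                         # small swing released from rest
print(classify_motion(0.0, 7.0))                         # fast enough to go over the top
print(classify_motion(0.0, separatrix_theta_dot(0.0)))   # exactly on the separatrix
```

Points inside the separatrix lie on the closed loops of figure 3.4, while points outside it lie on the open rolling trajectories.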
State-space diagrams will be used for describing oscillatory motion in chapters 3 and 4. Phase space is used in statistical mechanics in order to handle the equations of motion for ensembles of $\sim 10^{23}$ independent particles since momentum is more fundamental than velocity. Rather than try to account separately for the motion of each particle for an ensemble, it is best to specify the region of phase space containing the ensemble. If the number of particles is conserved, then every point in the initial phase space must transform to corresponding points in the final phase space. This will be discussed in chapters 8.3 and 15.2.7.

3.5 Linearly-damped free linear oscillator

3.5.1 General solution
All simple harmonic oscillations are damped to some degree due to energy dissipation via friction, viscous forces, electrical resistance, etc. The motion of damped systems is not conservative in that energy is dissipated as heat. As was discussed in chapter 2 the damping force can be expressed as

$$\mathbf{F}_D(v) = -f(v)\hat{\mathbf{v}} \tag{3.24}$$

where the velocity dependent function $f(v)$ can be complicated. Fortunately there is a very large class of problems in electricity and magnetism, classical mechanics, molecular, atomic, and nuclear physics, where the damping force depends linearly on velocity, which greatly simplifies solution of the equations of motion. This chapter discusses the special case of linear damping.
Consider the free simple harmonic oscillator, that is, assuming no oscillatory forcing function, with a linear damping term $\mathbf{F}_D(v) = -b\mathbf{v}$ where the parameter $b$ is the damping factor. Then the equation of motion is

$$-kx - b\dot{x} = m\ddot{x} \tag{3.25}$$

This can be rewritten as

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0 \tag{3.26}$$

where the damping parameter

$$\Gamma = \frac{b}{m} \tag{3.27}$$

and the characteristic angular frequency

$$\omega_0 = \sqrt{\frac{k}{m}} \tag{3.28}$$

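The general solution to the linearly-damped free oscillator is obtained by inserting the complex trial solution $z = z_0 e^{i\omega t}$. Then

$$(i\omega)^2 z_0 e^{i\omega t} + i\omega\Gamma z_0 e^{i\omega t} + \omega_0^2 z_0 e^{i\omega t} = 0 \tag{3.29}$$

This implies that

$$\omega^2 - i\Gamma\omega - \omega_0^2 = 0 \tag{3.30}$$

The solution is

$$\omega_\pm = i\frac{\Gamma}{2} \pm \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \tag{3.31}$$

The two solutions $\omega_\pm$ share the same imaginary part $\frac{\Gamma}{2}$ and, when the square root is real, have real parts of opposite sign, that is, $\omega_- = -\omega_+^*$. Thus the solutions of the damped free oscillator are

$$z = z_1 e^{i\left(i\frac{\Gamma}{2} + \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}\right)t} + z_2 e^{i\left(i\frac{\Gamma}{2} - \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}\right)t} \tag{3.32}$$

This can be written as

$$z = e^{-\left(\frac{\Gamma}{2}\right)t}\left[z_1 e^{i\omega_1 t} + z_2 e^{-i\omega_1 t}\right] \tag{3.33}$$

where

$$\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \tag{3.34}$$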

Underdamped motion $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 > 0$

When $\omega_1^2 > 0$ the square root is real, so the solution can be written by taking the real part of $z$, which gives that equation 3.33 equals

$$x(t) = A e^{-\left(\frac{\Gamma}{2}\right)t}\cos(\omega_1 t - \beta) \tag{3.35}$$

where $A$ and $\beta$ are adjustable constants fit to the initial conditions. Therefore the velocity is given by

$$\dot{x}(t) = -A e^{-\frac{\Gamma t}{2}}\left[\omega_1\sin(\omega_1 t - \beta) + \frac{\Gamma}{2}\cos(\omega_1 t - \beta)\right] \tag{3.36}$$

This is the damped sinusoidal oscillation illustrated in figure 3.5. The solution has the following characteristics:
a) The oscillation amplitude decreases exponentially with a time constant $\tau_D = \frac{2}{\Gamma}$.
b) There is a small reduction in the frequency of the oscillation due to the damping, leading to $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$.
Figure 3.5: The amplitude-time dependence and state-space diagrams for the free linearly-damped harmonic oscillator. The upper row shows the underdamped system for the case with damping $\Gamma = \frac{\omega_0}{5}$. The lower row shows the overdamped ($\frac{\Gamma}{2} > \omega_0$) [solid line] and critically damped ($\frac{\Gamma}{2} = \omega_0$) [dashed line] cases, in both cases assuming that initially the system is at rest.

Figure 3.6: Real and imaginary solutions $\omega_\pm$ of the damped harmonic oscillator. A phase transition occurs at $\Gamma = 2\omega_0$. For $\Gamma < 2\omega_0$ (dashed) the two solutions are complex conjugates and imaginary. For $\Gamma > 2\omega_0$ (solid), there are two real solutions $\omega_+$ and $\omega_-$ with widely different decay constants where $\omega_+$ dominates the decay at long times.

Overdamped case $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 < 0$

In this case the square root of $\omega_1^2$ is imaginary and can be expressed as $\omega_1 = i\omega_1'$ where $\omega_1' \equiv \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}$ is real. Therefore the solution is obtained more naturally by using a real trial solution $z = z_0 e^{-\omega t}$ in equation 3.26, which leads to two roots

$$\omega_\pm = -\left[-\frac{\Gamma}{2} \pm \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}\right]$$

Thus the exponentially damped decay has two time constants $\frac{1}{\omega_+}$ and $\frac{1}{\omega_-}$

$$x(t) = \left[A_1 e^{-\omega_+ t} + A_2 e^{-\omega_- t}\right] \tag{3.37}$$

The time constant $\frac{1}{\omega_-} < \frac{1}{\omega_+}$, thus the second term $A_2 e^{-\omega_- t}$ in the bracket decays in a shorter time than the first term $A_1 e^{-\omega_+ t}$. As illustrated in figure 3.6 the decay rate, which is imaginary when underdamped, i.e. $\frac{\Gamma}{2} < \omega_0$, bifurcates into two real values $\omega_\pm$ for overdamped, i.e. $\frac{\Gamma}{2} > \omega_0$. At large times the dominant term when overdamped is for $\omega_+$ which has the smallest decay rate, that is, the longest decay constant $\tau_+ = \frac{1}{\omega_+}$. There is no oscillatory motion for the overdamped case; it slowly moves monotonically to zero as shown in figure 3.5. The amplitude decays away with a time constant that is longer than $\frac{2}{\Gamma}$.

Critically damped $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 = 0$

This is the limiting case where $\frac{\Gamma}{2} = \omega_0$. For this case the solution is of the form

$$x(t) = (A + Bt)e^{-\left(\frac{\Gamma}{2}\right)t} \tag{3.38}$$

This motion also is non-sinusoidal and evolves monotonically to zero. As shown in figure 3.5 the critically-damped solution goes to zero with the shortest time constant, that is, the largest decay rate. Thus analog electric meters are built almost critically damped so the needle moves to the new equilibrium value in the shortest time without oscillation.
It is useful to graphically represent the motion of the damped linear oscillator on either a state space $(\dot{x}, x)$ diagram or phase space $(p_x, x)$ diagram as discussed in chapter 3.4. The state space plots for the underdamped, overdamped, and critically-damped solutions of the damped harmonic oscillator are shown in figure 3.5. For underdamped motion the state space diagram spirals inwards to the origin, in contrast to critical or overdamped motion where the state and phase space diagrams move monotonically to zero.
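
The three damping regimes can be distinguished from the sign of $\omega_1^2 = \omega_0^2 - (\Gamma/2)^2$. The Python sketch below is an illustration added here under stated assumptions: the function names are hypothetical, and the overdamped decay rates are taken as $\omega_\pm = \frac{\Gamma}{2} \mp \sqrt{(\Gamma/2)^2 - \omega_0^2}$, so that $\omega_+$ is the slow rate that dominates at long times.

```python
import math

def damping_regime(omega0, gamma):
    """Regime implied by the sign of omega1^2 = omega0^2 - (gamma/2)^2."""
    omega1_sq = omega0**2 - (gamma / 2.0)**2
    if omega1_sq > 0:
        return "underdamped"
    if omega1_sq < 0:
        return "overdamped"
    return "critical"

def overdamped_rates(omega0, gamma):
    """Decay rates (omega_plus, omega_minus) = (slow, fast) of equation 3.37."""
    root = math.sqrt((gamma / 2.0)**2 - omega0**2)
    return gamma / 2.0 - root, gamma / 2.0 + root

def underdamped_x(t, omega0, gamma, A=1.0, beta=0.0):
    """Displacement of equation 3.35."""
    omega1 = math.sqrt(omega0**2 - (gamma / 2.0)**2)
    return A * math.exp(-gamma * t / 2.0) * math.cos(omega1 * t - beta)

slow, fast = overdamped_rates(1.0, 5.0)
print(damping_regime(1.0, 0.2), damping_regime(1.0, 2.0), damping_regime(1.0, 5.0))
print(slow, fast, slow * fast)   # the product of the two rates equals omega0**2
```

The slow rate is always smaller than $\Gamma/2$, which is why the overdamped amplitude decays with a time constant longer than $2/\Gamma$.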

3.5.2 Energy dissipation

The instantaneous energy is the sum of the instantaneous kinetic and potential energies

$$E = \frac{1}{2}m\dot{x}^2 + \frac{1}{2}kx^2 \tag{3.39}$$

where $x$ and $\dot{x}$ are given by the solution of the equation of motion.
Consider the total energy of the underdamped system

$$E = \frac{1}{2}m\dot{x}^2 + \frac{1}{2}m\omega_0^2 x^2 \tag{3.40}$$

where $k = m\omega_0^2$. The average total energy is given by substitution for $x$ and $\dot{x}$ and taking the average over one cycle. Since

$$x(t) = A e^{-\left(\frac{\Gamma}{2}\right)t}\cos(\omega_1 t - \beta) \tag{3.41}$$

then the velocity is given by

$$\dot{x}(t) = -A e^{-\frac{\Gamma}{2}t}\left[\omega_1\sin(\omega_1 t - \beta) + \frac{\Gamma}{2}\cos(\omega_1 t - \beta)\right] \tag{3.42}$$

Inserting equations 3.41 and 3.42 into 3.40 gives a small amplitude oscillation about an exponential decay for the energy $E$. Averaging over one cycle and using the fact that $\langle\sin\theta\cos\theta\rangle = 0$, and $\langle\sin^2\theta\rangle = \langle\cos^2\theta\rangle = \frac{1}{2}$, gives the time-averaged total energy as

$$\langle E\rangle = e^{-\Gamma t}\left(\frac{1}{4}mA^2\omega_1^2 + \frac{1}{4}mA^2\left(\frac{\Gamma}{2}\right)^2 + \frac{1}{4}mA^2\omega_0^2\right) \tag{3.43}$$

which can be written as

$$\langle E\rangle = E_0 e^{-\Gamma t} \tag{3.44}$$

Note that the energy of the linearly damped free oscillator decays away with a time constant $\tau = \frac{1}{\Gamma}$. That is, the intensity has a time constant that is half the time constant for the decay of the amplitude of the transient response. Note that the average kinetic and potential energies are identical, as implied by the Virial theorem, and both decay away with the same time constant. This relation between the mean life $\tau$ for decay of the damped harmonic oscillator and the damping width term $\Gamma$ occurs frequently in physics.
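
The $e^{-\Gamma t}$ energy decay can be checked numerically without any averaging, because the oscillatory factor in the instantaneous energy repeats exactly after one period $T_1 = 2\pi/\omega_1$. The Python sketch below is an illustrative addition; the function name `energy` and the parameter values are assumptions chosen for the example.

```python
import math

def energy(t, m, omega0, gamma, A=1.0, beta=0.0):
    """Instantaneous energy (equation 3.40) of the underdamped solution 3.41, 3.42."""
    omega1 = math.sqrt(omega0**2 - (gamma / 2.0)**2)
    phase = omega1 * t - beta
    envelope = A * math.exp(-gamma * t / 2.0)
    x = envelope * math.cos(phase)
    v = -envelope * (omega1 * math.sin(phase) + 0.5 * gamma * math.cos(phase))
    return 0.5 * m * v**2 + 0.5 * m * omega0**2 * x**2

m, omega0, gamma = 1.0, 2.0, 0.1                     # hypothetical values
T1 = 2.0 * math.pi / math.sqrt(omega0**2 - (gamma / 2.0)**2)
ratio = energy(3.0 + T1, m, omega0, gamma) / energy(3.0, m, omega0, gamma)
print(ratio, math.exp(-gamma * T1))                  # the two numbers should agree
```

The energy ratio across one period equals $e^{-\Gamma T_1}$, consistent with the energy time constant $\tau = 1/\Gamma$ being half the amplitude time constant $2/\Gamma$.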
The damping of an oscillator usually is characterized by a single parameter $Q$ called the Quality Factor where

$$Q \equiv \frac{\text{Energy stored in the oscillator}}{\text{Energy dissipated per radian}} \tag{3.45}$$

The energy loss per radian is given by

$$\Delta E = \left|\frac{dE}{dt}\right|\frac{1}{\omega_1} = \frac{E\Gamma}{\omega_1} = \frac{E\Gamma}{\sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}} \tag{3.46}$$

where the denominator $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$ is the frequency of the free damped linear oscillator.
Thus the Quality factor $Q$ equals

$$Q = \frac{E}{\Delta E} = \frac{\omega_1}{\Gamma} \tag{3.47}$$

The larger the $Q$ factor, the less damped is the system, and the greater is the number of cycles of the oscillation in the damped wave train. Chapter 3.11.3 shows that the longer the wave train, that is, the higher the $Q$ factor, the narrower is the frequency distribution around the central value. The Mössbauer effect in nuclear physics provides a remarkably long wave train that can be used to make high precision measurements. The high-$Q$ precision of the LIGO laser interferometer was used in the first successful observation of gravitational waves in 2015.

Table 3.1: Typical Q factors in nature.

Typical Q factors
Earth, for earthquake wave     250-1400
Piano string                   3000
Crystal in digital watch       $10^4$
Microwave cavity               $10^4$
Excited atom                   $10^7$
Neutron star                   $10^{12}$
LIGO laser                     $10^{13}$
Mössbauer effect in nucleus    $10^{14}$

3.6 Sinusoidally-driven, linearly-damped, linear oscillator

The linearly-damped linear oscillator, driven by a harmonic driving force, is of considerable importance to all branches of science and engineering. The equation of motion can be written as

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = \frac{F(t)}{m} \tag{3.48}$$

where $F(t)$ is the driving force. For mathematical simplicity the driving force is chosen to be a sinusoidal harmonic force. The solution of this second-order differential equation comprises two components, the complementary solution (transient response), and the particular solution (steady-state response).

3.6.1 Transient response of a driven oscillator

The transient response of a driven oscillator is given by the complementary solution of the above second-order differential equation

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0 \tag{3.49}$$

which is identical to the solution of the free linearly-damped harmonic oscillator. As discussed in section 3.5 the solution of the linearly-damped free oscillator is given by the real part of the complex variable $z$ where

$$z = e^{-\frac{\Gamma}{2}t}\left[z_1 e^{i\omega_1 t} + z_2 e^{-i\omega_1 t}\right] \tag{3.50}$$

and

$$\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \tag{3.51}$$

Underdamped motion $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 > 0$: When $\omega_1^2 > 0$ the square root is real, so the transient solution can be written by taking the real part of $z$, which gives

$$x(t)_T = \frac{F_0}{m}e^{-\frac{\Gamma}{2}t}\cos(\omega_1 t) \tag{3.52}$$

The solution has the following characteristics:
a) The amplitude of the transient solution decreases exponentially with a time constant $\tau_D = \frac{2}{\Gamma}$ while the energy decreases with a time constant of $\frac{1}{\Gamma}$.
b) There is a small downward frequency shift in that $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$.

Overdamped case $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 < 0$: In this case the square root is imaginary, which can be expressed in terms of $\omega_1' \equiv \sqrt{\left(\frac{\Gamma}{2}\right)^2 - \omega_0^2}$, which is real, and the solution is just an exponentially damped one

$$x(t)_T = \frac{F_0}{m}e^{-\frac{\Gamma}{2}t}\left[e^{\omega_1' t} + e^{-\omega_1' t}\right] \tag{3.53}$$

There is no oscillatory motion for the overdamped case; it slowly moves monotonically to zero. The total energy decays away with two time constants greater than $\frac{1}{\Gamma}$.

Critically damped $\omega_1^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 = 0$: For this case, as mentioned for the damped free oscillator, the solution is of the form

$$x(t)_T = (A + Bt)e^{-\frac{\Gamma}{2}t} \tag{3.54}$$

The critically-damped system has the shortest time constant.


3.6.2 Steady state response of a driven oscillator

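The particular solution of the differential equation gives the important steady state response, $x(t)_S$, to the forcing function. Consider that the forcing term is a single frequency sinusoidal oscillation

$$F(t) = F_0\cos(\omega t) \tag{3.55}$$

Thus the particular solution is the real part of the complex variable $z$ which is a solution of

$$\ddot{z} + \Gamma\dot{z} + \omega_0^2 z = \frac{F_0}{m}e^{i\omega t} \tag{3.56}$$

A trial solution is

$$z = z_0 e^{i\omega t} \tag{3.57}$$

This leads to the relation

$$-\omega^2 z_0 + i\omega\Gamma z_0 + \omega_0^2 z_0 = \frac{F_0}{m} \tag{3.58}$$

Solving for $z_0$ and multiplying the numerator and denominator by the factor $\left(\omega_0^2 - \omega^2\right) - i\Gamma\omega$ gives

$$z_0 = \frac{\frac{F_0}{m}}{\left(\omega_0^2 - \omega^2\right) + i\Gamma\omega} = \frac{\frac{F_0}{m}}{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}\left[\left(\omega_0^2 - \omega^2\right) - i\Gamma\omega\right] \tag{3.59}$$

The steady state solution $x(t)_S$ thus is given by the real part of $z$, that is

$$x(t)_S = \frac{\frac{F_0}{m}}{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}\left[\left(\omega_0^2 - \omega^2\right)\cos\omega t + \Gamma\omega\sin\omega t\right] \tag{3.60}$$

This can be expressed in terms of a phase $\delta$ defined as

$$\tan\delta \equiv \left(\frac{\Gamma\omega}{\omega_0^2 - \omega^2}\right) \tag{3.61}$$

As shown in figure 3.7 the hypotenuse of the triangle equals $\sqrt{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}$. Thus

$$\cos\delta = \frac{\omega_0^2 - \omega^2}{\sqrt{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}} \tag{3.62}$$

and

$$\sin\delta = \frac{\Gamma\omega}{\sqrt{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}} \tag{3.63}$$

Figure 3.7: Phase between driving force and resultant motion.

The phase $\delta$ represents the phase difference between the driving force and the resultant motion. For a fixed $\omega_0$ the phase $\delta = 0$ when $\omega = 0$ and increases to $\delta = \frac{\pi}{2}$ when $\omega = \omega_0$. For $\omega > \omega_0$ the phase $\delta \to \pi$ as $\omega \to \infty$.
The steady state solution can be re-expressed in terms of the phase shift $\delta$ as

$$x(t)_S = \frac{\frac{F_0}{m}}{\sqrt{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}}\left[\cos\delta\cos\omega t + \sin\delta\sin\omega t\right] = \frac{\frac{F_0}{m}}{\sqrt{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}}\cos(\omega t - \delta) \tag{3.64}$$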
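
The amplitude and phase lag of the steady-state response follow directly from the complex amplitude $z_0$ of equation 3.59. The Python sketch below is an illustration added here, not the author's code; the function name `steady_state` and the default values $F_0 = m = 1$ are assumptions. It uses the fact that $z_0 = |z_0| e^{-i\delta}$, so that the real part of $z_0 e^{i\omega t}$ is $|z_0|\cos(\omega t - \delta)$.

```python
import cmath
import math

def steady_state(omega, omega0, gamma, F0=1.0, m=1.0):
    """Return (amplitude, delta) such that x_S(t) = amplitude * cos(omega*t - delta)."""
    z0 = (F0 / m) / complex(omega0**2 - omega**2, gamma * omega)   # equation 3.59
    amplitude = abs(z0)
    delta = -cmath.phase(z0)          # phase lag, between 0 and pi for omega > 0
    return amplitude, delta

amp_res, delta_res = steady_state(1.0, 1.0, 0.1)    # driven at omega = omega0
amp_low, delta_low = steady_state(0.0, 2.0, 0.1)    # static limit omega = 0
print(amp_res, delta_res, amp_low, delta_low)
```

At $\omega = \omega_0$ the phase lag is $\pi/2$ and the amplitude is $F_0/(m\Gamma\omega_0)$, while in the static limit the lag vanishes and the amplitude is $F_0/(m\omega_0^2)$.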

Figure 3.8: Amplitude versus time, and state space plots of the transient solution (dashed) and total solution (solid) for two cases. The upper row shows the case where the driving frequency $\omega = \frac{\omega_1}{5}$ while the lower row shows the same for the case where the driving frequency $\omega = 5\omega_1$.

3.6.3 Complete solution of the driven oscillator

To summarize, the total solution of the sinusoidally forced linearly-damped harmonic oscillator is the sum of the transient and steady-state solutions of the equations of motion.

$$x(t)_{Total} = x(t)_T + x(t)_S \tag{3.65}$$

For the underdamped case, the transient solution is the complementary solution

$$x(t)_T = \frac{F_0}{m}e^{-\frac{\Gamma}{2}t}\cos(\omega_1 t - \beta) \tag{3.66}$$

where $\omega_1 = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$. The steady-state solution is given by the particular solution

$$x(t)_S = \frac{\frac{F_0}{m}}{\sqrt{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}}\cos(\omega t - \delta) \tag{3.67}$$

Note that the frequency of the transient solution is $\omega_1$ which in general differs from the driving frequency $\omega$. The phase shift $\beta - \delta$ for the transient component is set by the initial conditions. The transient response leads to a more complicated motion immediately after the driving function is switched on. Figure 3.8 illustrates the amplitude time dependence and state space diagram for the transient component, and the total response, when the driving frequency is either $\omega = \frac{\omega_1}{5}$ or $\omega = 5\omega_1$. Note that the modulation of the steady-state response by the transient response is unimportant once the transient response has damped out, leading to a constant elliptical state space trajectory. For cases where the initial conditions are $x = \dot{x} = 0$ then the transient solution has a relative phase difference $\beta - \delta = \pi$ radians at $t = 0$ and relative amplitudes such that the transient and steady-state solutions cancel at $t = 0$.
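
The cancellation of the transient and steady-state terms at $t = 0$ can be made concrete with a short Python sketch. This is an illustrative addition under stated assumptions: the function name `driven_solution` is hypothetical, and the underdamped transient is written in the equivalent form $e^{-\Gamma t/2}(c_1\cos\omega_1 t + c_2\sin\omega_1 t)$, with $c_1$ and $c_2$ fixed by the initial conditions $x = \dot{x} = 0$, rather than in the amplitude and phase form of equation 3.66.

```python
import math

def driven_solution(t, omega, omega0, gamma, F0=1.0, m=1.0):
    """Return (total, transient, steady) for an underdamped oscillator starting at rest at x = 0."""
    omega1 = math.sqrt(omega0**2 - (gamma / 2.0)**2)
    amp = (F0 / m) / math.hypot(omega0**2 - omega**2, gamma * omega)
    delta = math.atan2(gamma * omega, omega0**2 - omega**2)
    # Transient coefficients chosen so that x(0) = 0 and xdot(0) = 0.
    c1 = -amp * math.cos(delta)
    c2 = (0.5 * gamma * c1 - amp * omega * math.sin(delta)) / omega1
    transient = math.exp(-0.5 * gamma * t) * (
        c1 * math.cos(omega1 * t) + c2 * math.sin(omega1 * t))
    steady = amp * math.cos(omega * t - delta)
    return transient + steady, transient, steady

x0, xT0, xS0 = driven_solution(0.0, 0.5, 1.0, 0.2)
h = 1e-5
v0 = (driven_solution(h, 0.5, 1.0, 0.2)[0] - driven_solution(-h, 0.5, 1.0, 0.2)[0]) / (2 * h)
x_late, xT_late, xS_late = driven_solution(200.0, 0.5, 1.0, 0.2)
print(x0, v0, xT_late)    # all three values should be close to zero
```

At late times only the steady-state term survives, which corresponds to the constant elliptical state space trajectory described above.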
The characteristic sounds of different types of musical instruments depend very much on the admixture of transient solutions plus the number and mixture of oscillatory active modes. Percussive instruments, such as the piano, have a large transient component. The mixture of transient and steady-state solutions for forced oscillations occurs frequently in studies of $RLC$ networks in electrical circuit analysis.

3.6.4 Resonance
The discussion so far has covered the role of the transient and steady-state solutions of the driven damped harmonic oscillator, which occurs frequently in science and engineering. Another important aspect is resonance that occurs when the driving frequency $\omega$ approaches the natural frequency $\omega_1$ of the damped system. Consider the case where the time is sufficient for the transient solution to have decayed to zero.
Figure 3.9 shows the amplitude and phase for the steady-state response as $\omega$ goes through a resonance as the driving frequency is changed. The steady-state solution of the driven oscillator follows the driving force when $\omega \ll \omega_0$ in that the phase difference is zero and the amplitude is just $\frac{F_0}{k}$. The response of the system peaks at resonance, while for $\omega \gg \omega_0$ the harmonic system is unable to follow the more rapidly oscillating driving force and thus the phase of the induced oscillation is out of phase with the driving force and the amplitude of the oscillation tends to zero.

Figure 3.9: Resonance behavior for the linearly-damped, harmonically driven, linear oscillator.

Note that the resonance frequency for a driven damped oscillator differs from that for the undriven damped oscillator, and differs from that for the undamped oscillator. The natural frequency for an undamped harmonic oscillator is given by

$$\omega_0^2 = \frac{k}{m} \tag{3.68}$$

The transient solution is the same as damped free oscillations of a damped oscillator and has a frequency of the system $\omega_1$ given by

$$\omega_1^2 = \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2 \tag{3.69}$$

That is, damping slightly reduces the frequency.

For the driven oscillator the maximum value of the steady-state amplitude response is obtained by taking the maximum of the amplitude of $x(t)_S$, that is, when $\frac{dx_S}{d\omega} = 0$. This occurs at the resonance angular frequency $\omega_R$ where

$$\omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2 \tag{3.70}$$

No resonance occurs if $\omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2 < 0$ since then $\omega_R$ is imaginary and the amplitude decreases monotonically with increasing $\omega$. Note that the above three frequencies are identical if $\Gamma = 0$ but they differ when $\Gamma > 0$ and $\omega_R < \omega_1 < \omega_0$.
For the driven oscillator it is customary to define the quality factor $Q$ as

$$Q \equiv \frac{\omega_R}{\Gamma} \tag{3.71}$$

When $Q \gg 1$ the system has a narrow high resonance peak. As the damping increases the quality factor decreases, leading to a wider and lower peak. The resonance disappears when $Q \ll 1$.
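
The ordering $\omega_R < \omega_1 < \omega_0$ and the location of the amplitude peak can be verified with a brute-force scan. The Python sketch below is an illustration added here; the function names, the parameter values and the grid spacing are assumptions chosen for the example.

```python
import math

def characteristic_frequencies(omega0, gamma):
    """Return (omega_R, omega_1, omega_0); an entry is None when its square is not positive."""
    omega1_sq = omega0**2 - (gamma / 2.0)**2            # equation 3.69
    omegaR_sq = omega0**2 - 2.0 * (gamma / 2.0)**2      # equation 3.70
    omega1 = math.sqrt(omega1_sq) if omega1_sq > 0 else None
    omegaR = math.sqrt(omegaR_sq) if omegaR_sq > 0 else None
    return omegaR, omega1, omega0

def amplitude(omega, omega0, gamma, F0=1.0, m=1.0):
    """Steady-state amplitude from equation 3.64."""
    return (F0 / m) / math.hypot(omega0**2 - omega**2, gamma * omega)

omega0, gamma = 1.0, 0.4                                 # hypothetical values
omega_R, omega_1, _ = characteristic_frequencies(omega0, gamma)
grid = [0.5 + 0.0005 * i for i in range(2001)]           # scan 0.5 ... 1.5
omega_peak = max(grid, key=lambda w: amplitude(w, omega0, gamma))
print(omega_R, omega_1, omega0, omega_peak)
```

The scanned peak position agrees with $\omega_R$ to within the grid spacing, and for heavy damping such as $\Gamma = 2\omega_0$ the function reports no amplitude resonance.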

3.6.5 Energy absorption

Discussion of energy stored in resonant systems is best described using the steady state solution which is dominant after the transient solution has decayed to zero. Then

$$x(t)_S = \frac{\frac{F_0}{m}}{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}\left[\left(\omega_0^2 - \omega^2\right)\cos\omega t + \Gamma\omega\sin\omega t\right] \tag{3.72}$$

This can be rewritten as

$$x(t)_S = A_{el}\cos\omega t + A_{abs}\sin\omega t \tag{3.73}$$

where the elastic amplitude

$$A_{el} = \frac{\frac{F_0}{m}}{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}\left(\omega_0^2 - \omega^2\right) \tag{3.74}$$

while the absorptive amplitude

$$A_{abs} = \frac{\frac{F_0}{m}}{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2}\Gamma\omega \tag{3.75}$$

Figure 3.10 shows the behavior of the absorptive and elastic amplitudes as a function of angular frequency $\omega$. The absorptive amplitude is significant only near resonance whereas the elastic amplitude goes to zero at resonance. Note that the full width at half maximum of the absorptive amplitude peak equals $\Gamma$.

Figure 3.10: Elastic (solid) and absorptive (dashed) amplitudes of the steady-state solution for $\Gamma = 0.10\omega_0$.

The work done by the force $F_0\cos\omega t$ on the oscillator is

$$W = \int F\,dx = \int F\dot{x}\,dt \tag{3.76}$$

Thus the absorbed power $P(t)$ is given by

$$P(t) = \frac{dW}{dt} = F\dot{x} \tag{3.77}$$

The steady state response gives a velocity

$$\dot{x}(t)_S = -\omega A_{el}\sin\omega t + \omega A_{abs}\cos\omega t \tag{3.78}$$

Thus the steady-state instantaneous power input is

$$P(t) = F_0\cos\omega t\left[-\omega A_{el}\sin\omega t + \omega A_{abs}\cos\omega t\right] \tag{3.79}$$

The absorptive term steadily absorbs energy while the elastic term oscillates as energy is alternately absorbed or emitted. The time average over one cycle is given by

$$\langle P\rangle = F_0\left[-\omega A_{el}\langle\cos\omega t\sin\omega t\rangle + \omega A_{abs}\left\langle\cos^2\omega t\right\rangle\right] \tag{3.80}$$

where $\langle\cos\omega t\sin\omega t\rangle$ and $\left\langle\cos^2\omega t\right\rangle$ are the time averages over one cycle. The time average over one complete cycle for the first term in the bracket is

$$-\omega A_{el}\langle\cos\omega t\sin\omega t\rangle = 0 \tag{3.81}$$

while for the second term

$$\left\langle\cos^2\omega t\right\rangle = \frac{1}{T}\int_{t_0}^{T + t_0}\cos^2\omega t\,dt = \frac{1}{2} \tag{3.82}$$

Thus the time average power input is determined by only the absorptive term

$$\langle P\rangle = \frac{1}{2}F_0\omega A_{abs} = \frac{F_0^2}{2m}\frac{\Gamma\omega^2}{\left(\omega_0^2 - \omega^2\right)^2 + (\Gamma\omega)^2} \tag{3.83}$$
This shape of the power curve is a classic Lorentzian shape. Note that the maximum of the average kinetic energy occurs at $\omega_{KE} = \omega_0$ which is different from the peak of the amplitude which occurs at $\omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2$. The potential energy is proportional to the amplitude squared, i.e. $x_S^2$, and so peaks at the same angular frequency as the amplitude, that is, $\omega_{PE}^2 = \omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2$. The kinetic and potential energies resonate at different angular frequencies as a result of the fact that the driven damped oscillator is not conservative because energy is continually exchanged between the oscillator and the driving force system in addition to the energy dissipation due to the damping.
When $\omega \sim \omega_0 \gg \Gamma$, then the power equation simplifies since

$$\left(\omega_0^2 - \omega^2\right) = (\omega_0 + \omega)(\omega_0 - \omega) \approx 2\omega_0(\omega_0 - \omega) \tag{3.84}$$

Therefore

$$\langle P\rangle \simeq \frac{F_0^2}{8m}\frac{\Gamma}{(\omega_0 - \omega)^2 + \left(\frac{\Gamma}{2}\right)^2} \tag{3.85}$$

This is called the Lorentzian or Breit-Wigner shape. The half power points are at a frequency difference from resonance of $\pm\Delta\omega$ where

$$\Delta\omega = |\omega_0 - \omega| = \frac{\Gamma}{2} \tag{3.86}$$

Thus the full width at half maximum of the Lorentzian curve equals $\Gamma$. Note that the Lorentzian has a narrower peak but much wider tail relative to a Gaussian shape. At the peak of the absorbed power, the absorptive amplitude can be written as

$$A_{abs}(\omega = \omega_0) = \frac{F_0}{m}\frac{Q}{\omega_0^2} \tag{3.87}$$

That is, the peak amplitude increases with increase in $Q$. This explains the classic comedy scene where the soprano shatters the crystal glass because the highest quality crystal glass has a high $Q$ which leads to a large amplitude oscillation when she sings on resonance.
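
The width of the power resonance can be checked numerically against $\Gamma$. The Python sketch below is an illustrative addition; the helper names `mean_power` and `bisect` and the parameter values are assumptions. It locates the two half-power points of the exact expression in equation 3.83 by bisection and prints their separation.

```python
import math

def mean_power(omega, omega0, gamma, F0=1.0, m=1.0):
    """Time-averaged absorbed power, equation 3.83."""
    return (F0**2 / (2.0 * m)) * gamma * omega**2 / (
        (omega0**2 - omega**2)**2 + (gamma * omega)**2)

def bisect(f, lo, hi, steps=200):
    """Root of f on [lo, hi] by bisection; assumes f changes sign on the interval."""
    f_lo_positive = f(lo) > 0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

omega0, gamma = 1.0, 0.05                                # hypothetical values
half = 0.5 * mean_power(omega0, omega0, gamma)           # half of the peak power

def excess(w):
    return mean_power(w, omega0, gamma) - half

fwhm = bisect(excess, omega0, 2.0 * omega0) - bisect(excess, 0.5 * omega0, omega0)
print(fwhm, gamma)
```

The separation of the two half-power points reproduces $\Gamma$, in agreement with equation 3.86.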
The mean lifetime $\tau$ of the free linearly-damped harmonic oscillator, that is, the time for the energy of free oscillations to decay to $\frac{1}{e}$, was shown to be related to the damping coefficient $\Gamma$ by

$$\tau = \frac{1}{\Gamma} \tag{3.88}$$

Therefore we have the classical uncertainty principle for the linearly-damped harmonic oscillator that the measured full-width at half maximum of the energy resonance curve for forced oscillation and the mean life for decay of the energy of a free linearly-damped oscillator are related by

$$\Gamma\tau = 1 \tag{3.89}$$

This relation is correct only for a linearly-damped harmonic system. Comparable relations between the lifetime and damping width exist for different forms of damping.
One can demonstrate the above line width and decay time relationship using an acoustically driven electric guitar string. Similarly, the width of an electromagnetic spectral line is related to the lifetime of the decaying atomic or nuclear state. This classical uncertainty principle is exactly the same as the one encountered in quantum physics due to wave-particle duality. In nuclear physics it is difficult to measure the lifetime of states when $\tau < 10^{-13}$ s. For shorter lifetimes the value of $\Gamma$ can be determined from the shape of the resonance curve which can be measured directly when the damping is large.

3.1 Example: Harmonically-driven series RLC circuit

The harmonically-driven, resonant, series $RLC$ circuit is encountered frequently in AC circuits. Kirchhoff's Rules applied to the series $RLC$ circuit lead to the differential equation

$$L\ddot{q} + R\dot{q} + \frac{q}{C} = V_0\sin\omega t$$

where $q$ is charge, $L$ is the inductance, $C$ is the capacitance, $R$ is the resistance, and the applied voltage across the circuit is $V(\omega) = V_0\sin\omega t$. The linearity of the network allows use of the phasor approach which assumes that the current $I = I_0 e^{i\omega t}$, the voltage $V = V_0 e^{i(\omega t + \delta)}$, and the impedance is a complex number $Z = \frac{V_0}{I_0}e^{i\delta}$ where $\delta$ is the phase difference between the voltage and the current. For this circuit the impedance is given by

$$Z = R + i\left(\omega L - \frac{1}{\omega C}\right)$$

Because of the phases involved in this $RLC$ circuit, at resonance the maximum voltage across the resistor occurs at a frequency of $\omega_R = \omega_0$, across the capacitor the maximum voltage occurs at a frequency $\omega_C^2 = \omega_0^2 - \frac{R^2}{2L^2}$, and across the inductor $L$ the maximum voltage occurs at a frequency $\omega_L^2 = \frac{\omega_0^2}{1 - \frac{R^2}{2L^2\omega_0^2}}$, where $\omega_0^2 = \frac{1}{LC}$ is the resonance angular frequency when $R = 0$. Thus these resonance frequencies differ when $R > 0$.
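
The three peak frequencies quoted in this example can be evaluated for a concrete circuit. The Python sketch below is an illustration added here; the function name `rlc_peak_frequencies` and the component values are assumptions, and the formulas are only meaningful for light damping, $R^2 < 2L/C$.

```python
import math

def rlc_peak_frequencies(R, L, C):
    """Angular frequencies at which the voltage peaks across R, C and L (requires R**2 < 2*L/C)."""
    w0_sq = 1.0 / (L * C)
    shift = R**2 / (2.0 * L**2)
    w_R = math.sqrt(w0_sq)                               # resistor: omega_0
    w_C = math.sqrt(w0_sq - shift)                       # capacitor: below omega_0
    w_L = math.sqrt(w0_sq / (1.0 - shift / w0_sq))       # inductor: above omega_0
    return w_R, w_C, w_L

# Hypothetical component values: 10 ohm, 1 mH, 1 microfarad.
w_R, w_C, w_L = rlc_peak_frequencies(R=10.0, L=1e-3, C=1e-6)
print(w_C, w_R, w_L)
```

With these expressions $\omega_C\,\omega_L = \omega_0^2$, so the capacitor and inductor peaks straddle $\omega_0$ and coincide with it as $R \to 0$.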

3.7 Wave equation

Wave motion is a ubiquitous feature in nature. Mechanical wave motion is manifest by transverse waves
on fluid surfaces, longitudinal and transverse seismic waves travelling through the Earth, and vibrations of
mechanical structures such as suspended cables. Acoustical wave motion occurs on the stretched strings of
the violin, as well as the cavities of wind instruments. Wave motion occurs for deformable bodies where
elastic forces acting between the nearest-neighbor atoms of the body exert time-dependent forces on one
another. Electromagnetic wave motion includes wavelengths ranging from 105  radiowaves, to 10−13  -
rays. Matter waves are a prominent feature of quantum physics. All these manifestations of waves exhibit
the same general features of wave motion. Chapter 14 will introduce the collective modes of motion, called
the normal modes, of coupled, many-body, linear oscillators which act as independent modes of motion.
The basic elements of wavemotion are introduced at this juncture because the equations of wave motion are
simple, and wave motion features prominently in several chapters throughout this book.
Consider a travelling wave in one dimension for a linear system. If the wave is moving, then the wave
function $\Psi(x,t)$ describing the shape of the wave is a function of both $x$ and $t$. The instantaneous amplitude
of the wave $\Psi(x,t)$ could correspond to the transverse displacement of a wave on a string, the longitudinal
amplitude of a wave on a spring, the pressure of a longitudinal sound wave, the transverse electric or magnetic
fields in an electromagnetic wave, a matter wave, etc. If the wave train maintains its shape as it moves, then
one can describe the wave train by the function $f(\phi)$ where the coordinate $\phi$ is measured relative to the
shape of the wave, that is, it could correspond to the phase of a crest of the wave. Consider that $f(\phi = 0)$
corresponds to a constant phase, e.g. the peak of the travelling pulse; then, assuming that the wave travels
at a phase velocity $v$ in the $x$ direction and the peak is at $x = 0$ for $t = 0$, it is at $x = vt$ at time $t$.
That is, a point with phase $\phi$ fixed with respect to the waveform shape of the wave profile $f(\phi)$ moves in
the $+x$ direction for $\phi = x - vt$ and in the $-x$ direction for $\phi = x + vt$.
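General wave motion can be described by solutions of a wave equation. The wave equation can be
written in terms of the spatial and temporal derivatives of the wave function $\Psi(x,t)$. Consider the first partial
derivatives of $\Psi(x,t) = f(x \mp vt) = f(\phi)$
$$\frac{\partial \Psi}{\partial x} = \frac{d\Psi}{d\phi}\frac{\partial \phi}{\partial x} = \frac{d\Psi}{d\phi} \tag{3.90}$$
and
$$\frac{\partial \Psi}{\partial t} = \frac{d\Psi}{d\phi}\frac{\partial \phi}{\partial t} = \mp v\frac{d\Psi}{d\phi} \tag{3.91}$$
Factoring out $\frac{d\Psi}{d\phi}$ for the first derivatives gives
$$\frac{\partial \Psi}{\partial t} = \mp v\frac{\partial \Psi}{\partial x} \tag{3.92}$$
The sign in this equation depends on the sign of the wave velocity, making it not a generally useful formula.
Consider the second derivatives
$$\frac{\partial^2 \Psi}{\partial x^2} = \frac{d^2\Psi}{d\phi^2}\left(\frac{\partial \phi}{\partial x}\right)^2 = \frac{d^2\Psi}{d\phi^2} \tag{3.93}$$
and
$$\frac{\partial^2 \Psi}{\partial t^2} = \frac{d^2\Psi}{d\phi^2}\left(\frac{\partial \phi}{\partial t}\right)^2 = +v^2\frac{d^2\Psi}{d\phi^2} \tag{3.94}$$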
2 Ψ
Factoring out 2
gives
2Ψ 1 2Ψ
2
= 2 2 (3.95)
  
This wave equation in one dimension for a linear system is independent of the sign of the velocity. There
are an infinite number of possible shapes of waves, both travelling and standing, in one dimension, and all of these
must satisfy this one-dimensional wave equation. The converse is that any function that satisfies this
one-dimensional wave equation must be a wave in this one dimension.
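As a numerical illustration of the statement above, the following minimal sketch checks by finite differences that a profile of the form $f(x \mp vt)$ satisfies the one-dimensional wave equation for either sign of the velocity. The Gaussian profile, the speed `v = 2.0`, the step `h`, and all function names are assumptions chosen only for illustration; they are not taken from the text.

```python
import math

def pulse(phi):
    """Arbitrary smooth wave profile f(phi); a Gaussian is assumed here."""
    return math.exp(-phi * phi)

def psi(x, t, v=2.0, direction=-1):
    """Travelling wave Psi(x, t) = f(x + direction*v*t); direction=-1 moves toward +x."""
    return pulse(x + direction * v * t)

def wave_equation_residual(x, t, v=2.0, direction=-1, h=1e-3):
    """Central-difference estimate of d2Psi/dx2 - (1/v^2) d2Psi/dt2."""
    centre = psi(x, t, v, direction)
    d2x = (psi(x + h, t, v, direction) - 2 * centre
           + psi(x - h, t, v, direction)) / h**2
    d2t = (psi(x, t + h, v, direction) - 2 * centre
           + psi(x, t - h, v, direction)) / h**2
    return d2x - d2t / v**2

# Both travel directions should give a residual that is zero to within
# the finite-difference truncation error.
residual_right = wave_equation_residual(0.3, 0.1, direction=-1)
residual_left = wave_equation_residual(0.3, 0.1, direction=+1)
print(residual_right, residual_left)
```

The residual is not exactly zero because the second derivatives are approximated; it shrinks as `h` is reduced until round-off dominates.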
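The Wave Equation in three dimensions is
$$\nabla^2 \Psi \equiv \frac{\partial^2 \Psi}{\partial x^2} + \frac{\partial^2 \Psi}{\partial y^2} + \frac{\partial^2 \Psi}{\partial z^2} = \frac{1}{v^2}\frac{\partial^2 \Psi}{\partial t^2} \tag{3.96}$$
There are an unlimited number of possible solutions $\Psi$ to this wave equation, any one of which corresponds
to a wave motion with velocity $v$.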
The Wave Equation is applicable to all manifestations of wave motion, both transverse and longitudinal,
for linear systems. That is, it applies to waves on a string, water waves, seismic waves, sound waves,
electromagnetic waves, matter waves, etc. If it can be shown that a wave equation can be derived for any
system, discrete or continuous, then this is equivalent to proving the existence of waves of any waveform,
frequency, or wavelength travelling with the phase velocity given by the wave equation.[Cra65]

3.8 Travelling and standing wave solutions of the wave equation

The wave equation can exhibit both travelling and standing-wave solutions. Consider a one-dimensional
travelling wave with velocity $v$ having a specific wavenumber $k \equiv \frac{2\pi}{\lambda}$. Then the travelling wave is best
written in terms of the phase of the wave as
$$\Psi(x,t) = A(k)\, e^{i\frac{2\pi}{\lambda}(x \mp vt)} = A(k)\, e^{i(kx \mp \omega t)} \tag{3.97}$$
where the wave number $k \equiv \frac{2\pi}{\lambda}$, with $\lambda$ being the wavelength, and the angular frequency $\omega \equiv kv$. This particular
solution satisfies the wave equation and corresponds to a travelling wave with phase velocity $v = \frac{\omega}{k}$ in the
positive or negative $x$ direction depending on whether the sign is negative or positive. Assuming that the
superposition principle applies, then the superposition of these two particular solutions of the wave equation
can be written as
$$\Psi(x,t) = A(k)\left(e^{i(kx-\omega t)} + e^{i(kx+\omega t)}\right) = A(k)\, e^{ikx}\left(e^{-i\omega t} + e^{i\omega t}\right) = 2A(k)\, e^{ikx}\cos\omega t \tag{3.98}$$
Thus the superposition of two identical single wavelength travelling waves propagating in opposite directions
can correspond to a standing wave solution. Note that a standing wave is identical to a stationary normal
mode of the system discussed in chapter 14. This transformation between standing and travelling waves can
be reversed, that is, the superposition of two standing waves, i.e. normal modes, can lead to a travelling
wave solution of the wave equation.
Discussion of waveforms is simplified when using either of the following two limits.
1) The time dependence of the waveform at a given location $x = x_0$, which can be expressed using a
Fourier decomposition, appendix 2, of the time dependence as a function of angular frequency $\omega = n\omega_0$.
$$\Psi(x_0,t) = \sum_{n=-\infty}^{\infty} A_n e^{in(k_0 x_0 - \omega_0 t)} = \sum_{n=-\infty}^{\infty} B_n(x_0)\, e^{-in\omega_0 t} \tag{3.99}$$

2) The spatial dependence of the waveform at a given instant $t = t_0$, which can be expressed using a
Fourier decomposition of the spatial dependence as a function of wavenumber $k = nk_0$
$$\Psi(x,t_0) = \sum_{n=-\infty}^{\infty} A_n e^{in(k_0 x - \omega_0 t_0)} = \sum_{n=-\infty}^{\infty} C_n(t_0)\, e^{ink_0 x} \tag{3.100}$$

The above is applicable to both discrete and continuous linear oscillator systems, e.g. waves on a string.
In summary, stationary normal modes of a system are obtained by a superposition of travelling waves
travelling in opposite directions, or equivalently, travelling waves can result from a superposition of stationary
normal modes.

3.9 Waveform analysis

3.9.1 Harmonic decomposition
As described in the appendix, when superposition applies, a Fourier series decomposition of the form 3.101 can be made of any periodic function where
$$F(t) = \sum_{n=1}^{N} \alpha_n \cos(n\omega_0 t + \phi_n) \tag{3.101}$$
A more general Fourier Transform can be made for an aperiodic function where
$$F(t) = \int \alpha(\omega)\cos(\omega t + \phi(\omega))\, d\omega \tag{3.102}$$
Any linear system that is subject to the forcing function $F(t)$ has an output that can be expressed as a linear superposition of the solutions of the individual harmonic components of the forcing function. Fourier analysis of periodic waveforms in terms of harmonic trigonometric functions plays a key role in describing oscillatory motion in classical mechanics and signal processing for linear systems. Fourier’s theorem states that any arbitrary forcing function $F(t)$ can be decomposed into a sum of harmonic terms. As a consequence two equivalent representations can be used to describe signals and waves; the first is in the time domain which describes the time dependence of the signal. The second is in the frequency domain which describes the frequency decomposition of the signal. Fourier analysis relates these equivalent representations.

Figure 3.11: The time and frequency representations of a system exhibiting beats.

For example, the superposition of two equal intensity harmonic oscillators in the time domain is given by
$$y(t) = A\cos(\omega_1 t) + A\cos(\omega_2 t) = 2A\cos\left[\left(\frac{\omega_1+\omega_2}{2}\right)t\right]\cos\left[\left(\frac{\omega_1-\omega_2}{2}\right)t\right] \tag{3.103}$$
which leads to the phenomenon of beats as illustrated for both the time domain and frequency domain in figure 3.11.
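
The beat relation can be checked numerically. The sketch below, under the assumption of two illustrative angular frequencies (`omega_1 = 11`, `omega_2 = 9`) and unit amplitude, compares the sum of two cosines with the product of a carrier at the mean frequency and an envelope at half the difference frequency. The function and variable names are hypothetical.

```python
import math

A = 1.0                        # common amplitude (illustrative)
omega_1, omega_2 = 11.0, 9.0   # two nearby angular frequencies (illustrative)

def sum_form(t):
    """Direct superposition A cos(w1 t) + A cos(w2 t)."""
    return A * math.cos(omega_1 * t) + A * math.cos(omega_2 * t)

def product_form(t):
    """Carrier at the mean frequency times envelope at half the difference."""
    carrier = math.cos(0.5 * (omega_1 + omega_2) * t)
    envelope = math.cos(0.5 * (omega_1 - omega_2) * t)
    return 2 * A * carrier * envelope

times = [0.01 * n for n in range(1000)]
max_diff = max(abs(sum_form(t) - product_form(t)) for t in times)

# The envelope magnitude |cos((w1 - w2) t / 2)| repeats with this period.
beat_period = 2 * math.pi / abs(omega_1 - omega_2)
print(max_diff, beat_period)
```

Because the envelope enters through its magnitude, the beats repeat at the difference frequency $|\omega_1 - \omega_2|$ rather than at half of it.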

3.9.2 The free linearly-damped linear oscillator
The response of the free, linearly-damped, linear oscillator is one of the most frequently encountered waveforms in science and thus it is useful to investigate the Fourier transform of this waveform. The waveform amplitude for the underdamped case, shown in figure 3.5, is given by equation (3.35), that is
$$f(t) = A e^{-\frac{\Gamma}{2}t}\cos(\omega_1 t - \delta) \qquad t \geq 0 \tag{3.104}$$
$$f(t) = 0 \qquad t < 0 \tag{3.105}$$
where $\omega_1^2 = \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2$ and where $\omega_0$ is the angular frequency of the undamped system. The Fourier transform is given by
$$G(\omega) = \frac{\omega_0}{(\omega^2-\omega_1^2)^2 + (\Gamma\omega)^2}\left[(\omega^2-\omega_1^2) - i\Gamma\omega\right] \tag{3.106}$$
which is complex and has the famous Lorentz form.

Figure 3.12: The intensity $f(t)^2$ and Fourier transform $|G(\omega)|^2$ of the free linearly-underdamped harmonic oscillator with $\omega_0 = 10$ and damping $\Gamma = 1$.

The intensity of the wave gives

$$|f(t)|^2 = A^2 e^{-\Gamma t}\cos^2(\omega_1 t - \delta) \tag{3.107}$$
$$|G(\omega)|^2 = \frac{\omega_0^2}{(\omega^2-\omega_1^2)^2 + (\Gamma\omega)^2} \tag{3.108}$$

Note that since the average over $2\pi$ of $\cos^2$ is $\frac{1}{2}$, the average over the $\cos^2(\omega_1 t - \delta)$ term gives the
intensity $I(t) = \frac{A^2}{2}e^{-\Gamma t}$, which has a mean lifetime for the decay of $\tau = \frac{1}{\Gamma}$. The $|G(\omega)|^2$ distribution has the
classic Lorentzian shape, shown in figure 3.12, which has a full width at half-maximum, FWHM, equal to $\Gamma$.
Note that $G(\omega)$ is complex and thus one also can determine the phase shift $\delta$, which is given by the ratio of
the imaginary to real parts of equation 3.106, i.e. $\tan\delta = \frac{\Gamma\omega}{(\omega^2-\omega_1^2)}$.
The mean lifetime of the exponential decay of the intensity can be determined either by measuring $\tau$
from the time dependence, or by measuring the FWHM $\Gamma = \frac{1}{\tau}$ of the Fourier transform $|G(\omega)|^2$. In nuclear
and atomic physics excited levels decay by photon emission with the wave form of the free linearly-damped,
linear oscillator. Typically the mean lifetime $\tau$ can be measured when $\tau \gtrsim 10^{-12}$ s, whereas for
shorter lifetimes the radiation width $\Gamma$ becomes sufficiently large to be measured. Thus the two experimental
approaches are complementary.
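
The connection between the width of the Lorentzian and the damping can be illustrated numerically. The following minimal sketch evaluates $|G(\omega)|^2 = \omega_0^2 / [(\omega^2-\omega_1^2)^2 + (\Gamma\omega)^2]$ on a grid, using the values $\omega_0 = 10$ and $\Gamma = 1$ quoted for the figure, and estimates the full width at half-maximum. The grid range, step size, and variable names are assumptions made for this illustration.

```python
omega_0, gamma = 10.0, 1.0                   # values quoted for the figure
omega_1_sq = omega_0**2 - (gamma / 2) ** 2   # omega_1^2 of the damped oscillator

def intensity(omega):
    """Lorentzian-shaped |G(omega)|^2 of the free underdamped oscillator."""
    return omega_0**2 / ((omega**2 - omega_1_sq) ** 2 + (gamma * omega) ** 2)

step = 1e-4
grid = [5.0 + step * n for n in range(100001)]   # 5 <= omega <= 15
values = [intensity(w) for w in grid]
peak = max(values)

# Frequencies at which the intensity is at least half of its peak value
above_half = [w for w, s in zip(grid, values) if s >= peak / 2]
fwhm = above_half[-1] - above_half[0]
mean_lifetime = 1 / gamma                        # tau = 1 / Gamma

print(fwhm, mean_lifetime)
```

For weak damping the estimated width agrees with $\Gamma$ to well under one percent; the small residual difference reflects the fact that the FWHM equals $\Gamma$ only in the limit $\Gamma \ll \omega_0$.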

3.9.3 Damped linear oscillator subject to an arbitrary periodic force

Fourier’s theorem states that any arbitrary forcing function $F(t)$ can be decomposed into a sum of harmonic
terms. Consider the response of a damped linear oscillator to an arbitrary periodic force
$$F(t) = \sum_{n=0}^{N} F_0(\omega_n)\cos(\omega_n t + \phi_n) \tag{3.109}$$

For each harmonic term $\omega_n$ the response of a linearly-damped linear oscillator to the forcing function
$F(t) = F_0(\omega_n)\cos(\omega_n t)$ is given by equations (3.65-3.67) to be

$$x(t)_{Total} = x(t)_T + x(t)_S = \frac{F_0(\omega_n)}{m}\left[e^{-\frac{\Gamma}{2}t}\cos(\omega_1 t - \delta_n) + \frac{1}{\sqrt{(\omega_0^2-\omega_n^2)^2 + (\Gamma\omega_n)^2}}\cos(\omega_n t - \delta_n)\right] \tag{3.110}$$
The amplitude is obtained by substituting into (3.110) the derived values $\frac{F_0(\omega_n)}{m}$ from the Fourier analysis.

3.2 Example: Vibration isolation

Frequently it is desired to isolate instrumentation from the influence of horizontal and vertical external vibrations that exist in the environment. One arrangement to achieve this isolation is to mount a heavy base of mass $m$ on weak springs of spring constant $k$ plus weak damping. The response of this system is given by equation 3.110, which exhibits a resonance at the angular frequency $\omega_R^2 = \omega_0^2 - 2\left(\frac{\Gamma}{2}\right)^2$ associated with each resonant frequency $\omega_0$ of the system. For each resonant frequency the system amplifies the vibrational amplitude for angular frequencies close to resonance, that is, below $\sqrt{2}\,\omega_0$, while it attenuates the vibration roughly by a factor of $\left(\frac{\omega_0}{\omega}\right)^2$ at higher frequencies. To avoid the amplification near the resonance it is necessary to make $\omega_0$ very much smaller than the frequency range of the vibrational spectrum and have a moderately high $Q$ value. This is achieved by using a very heavy base and weak spring constant so that $\omega_0$ is very small. A typical table may have the resonance frequency at 0.5 Hz, which is well below typical perturbing vibrational frequencies, and thus the table attenuates the vibration by 99% at 5 Hz, with even more attenuation for higher frequency perturbations. This principle is used extensively in design of vibration-isolation tables for optics or microbalance equipment.

Figure: Seismic isolation of an optical bench.
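
The quoted attenuation can be reproduced with a short calculation. The sketch below assumes the steady-state amplitude of a force-driven, linearly-damped oscillator, $\propto 1/\sqrt{(\omega_0^2-\omega^2)^2 + (\Gamma\omega)^2}$, normalised to its static ($\omega \to 0$) value; the damping width `gamma = 0.3` rad/s and the function name are illustrative assumptions, not values from the text.

```python
import math

def amplitude_ratio(f, f0, gamma):
    """Steady-state amplitude at drive frequency f relative to the static response.

    f and f0 are in Hz; gamma is the damping width in rad/s.
    """
    w, w0 = 2 * math.pi * f, 2 * math.pi * f0
    return w0**2 / math.sqrt((w0**2 - w**2) ** 2 + (gamma * w) ** 2)

f0 = 0.5       # resonance frequency of the table quoted in the example, Hz
gamma = 0.3    # assumed weak damping, rad/s

ratio_at_5hz = amplitude_ratio(5.0, f0, gamma)       # well above resonance
rough_estimate = (f0 / 5.0) ** 2                     # (omega_0 / omega)^2 rule
ratio_at_resonance = amplitude_ratio(f0, f0, gamma)  # amplification near f0

print(ratio_at_5hz, rough_estimate, ratio_at_resonance)
```

With these numbers the response at 5 Hz is about 1% of the static response, consistent with the 99% attenuation quoted above, while the response at the resonance frequency is amplified by roughly $\omega_0/\Gamma$.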

3.10 Signal processing

It has been shown that the response of the linearly-damped linear oscillator, subject to any arbitrary periodic
force, can be calculated using a frequency decomposition, (Fourier analysis), of the force, appendix . The
response also can be calculated using a time-ordered discrete-time sampling of the pulse shape; that is, the
Green’s function approach, appendix . The linearly-damped, linear oscillator is the simplest example of
a linear system that exhibits both resonance and frequency-dependent response. Typically physical linear
systems exhibit far more complicated response functions having multiple resonances. For example, an au-
tomobile suspension system involves four wheels and associated springs plus dampers allowing the car to
rock sideways, or forward and backward, in addition to the up-down motion, when subject to the forces
produced by a rough road. Similarly a suspension bridge or aircraft wing can twist as well as bend due to
air turbulence, or a building can undergo complicated oscillations due to seismic waves. An acoustic system
exhibits similar complexity. Signal analysis and signal processing is of pivotal importance to elucidating the
response of complicated linear systems to complicated periodic forcing functions. Signal processing is used
extensively in engineering, acoustics, and science.
The response of a low-pass filter, such as an R-C circuit or a coaxial cable, to an input square wave,
shown in figure 3.13, provides a simple example of the relative advantages of using the complementary
Fourier analysis in the frequency domain, or the Green’s discrete-function analysis in the time domain. The
response of a repetitive square-wave input signal is shown in the time domain plus the Fourier transform to
the frequency domain. The middle curves show the time dependence for the response of the low-pass filter
to an impulse $h(t)$ and the corresponding Fourier transform $H(\omega)$. The output of the low-pass filter can
be calculated by folding the input square wave and impulse time dependence in the time domain as shown
on the left or by folding of their Fourier transforms shown on the right. Working in the frequency domain
the response of linear mechanical systems, such as an automobile suspension or a musical instrument, as
well as linear electronic signal processing systems such as amplifiers, loudspeakers and microphones, can
be treated as black boxes having a certain transfer function $H(\omega)$ describing the gain and phase shift
versus frequency. That is, the output wave frequency decomposition is

$$G(\omega)_{output} = H(\omega)\cdot G(\omega)_{input} \tag{3.111}$$

Working in the time domain, the low-pass system has an impulse response $h(t) = e^{-\frac{t}{\tau}}$, which is the
Fourier transform of the transfer function $H(\omega)$. In the time domain
$$y(t)_{output} = \int_{-\infty}^{\infty} x(\tau')\cdot h(t-\tau')\, d\tau' \tag{3.112}$$

This is shown schematically in figure 3.13. The Fourier transformation connects the three quantities in the
time domain with the corresponding three in the frequency domain. For example, the impulse response of
the low-pass filter has a fall time of $\tau$ which is related by a Fourier transform to the width of the transfer
function. Thus the time and frequency domain approaches are closely related and give the same result for
the output signal for the low-pass filter to the applied square-wave input signal. The result is that the
higher-frequency components are attenuated leading to slow rise and fall times in the time domain.
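
The time-domain folding (convolution) of an input with the impulse response can be sketched in a few lines. The example below is a minimal illustration, assuming a single square pulse as input, an exponential impulse response normalised to unit area (the normalisation factor $1/\tau$ is an assumption made so that the output settles at the input level), and hypothetical names `x`, `h`, `y`, and `convolve`.

```python
import math

dt, tau = 0.001, 0.05      # time step and fall time in seconds (illustrative)
n_samples = 400

# Input: square pulse that is high for the first 0.2 s and low afterwards
x = [1.0 if n * dt < 0.2 else 0.0 for n in range(n_samples)]

# Impulse response h(t) = exp(-t/tau) / tau for t >= 0 (unit-area normalisation assumed)
h = [math.exp(-n * dt / tau) / tau for n in range(n_samples)]

def convolve(signal, kernel, step):
    """Discrete approximation to y(t) = integral of signal(t') * kernel(t - t') dt'."""
    out = []
    for n in range(len(signal)):
        out.append(step * sum(signal[m] * kernel[n - m] for m in range(n + 1)))
    return out

y = convolve(x, h, dt)

# Analytic step response of a first-order low-pass filter at t = 0.1 s
expected_at_100ms = 1 - math.exp(-0.1 / tau)
print(y[100], expected_at_100ms)
```

The output rises and falls with the time constant $\tau$ instead of following the sharp edges of the input, which is the time-domain signature of the attenuated high-frequency components. The small difference from the analytic value comes from the finite time step.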
Analog signal processing and Fourier analysis were the primary tools to analyze and process all forms of
periodic motion during the 20th century. For example, musical instruments, mechanical systems, electronic
circuits, all employed resonant systems to enhance the desired frequencies and suppress the undesirable
frequencies and the signals could be observed using analog oscilloscopes. The remarkable development of
computing has enabled use of digital signal processing leading to a revolution in signal processing that has
had a profound impact on both science and engineering. The digital oscilloscope, which can sample at
frequencies above $10^{9}$ Hz, has replaced the analog oscilloscope because it allows sophisticated analysis of each
individual signal that was not possible using analog signal processing. For example, the analog approach in
nuclear physics used tiny analog electric signals, produced by many individual radiation detectors, that were
transmitted hundreds of meters via carefully shielded and expensive coaxial cables to the data room where
the signals were amplified and signal processed using analog filters to maximize the signal to noise in order to
separate the signal from the background noise. Stray electromagnetic radiation picked up via the cables sig-
nificantly degraded the signals. The performance and limitations of the analog electronics severely restricted
the pulse processing capabilities. Digital signal processing has rapidly replaced analog signal processing.

Figure 3.13: Response of an RC electrical circuit to an input square wave. The upper row shows the time
and the exponential-form frequency representations of the square-wave input signal. The middle row gives
the impulse response, and corresponding transfer function for the RC circuit. The bottom row shows the
corresponding output properties in both the time and frequency domains.

Analog to digital detector circuits are built directly into the electronics for each individual detector so that
only digital information needs to be transmitted from each detector to the analysis computers. Computer
processing provides unlimited and flexible processing capabilities for the digital signals greatly enhancing
the response and sensitivity of our detector systems. Digital CD and DVD disks are a common application of
digital signal processing.

3.11 Wave propagation

Wave motion typically involves a packet of waves encompassing a finite number of wave cycles. Information
in a wave only can be transmitted by starting, stopping, or modulating the amplitude of a wave train, which
is equivalent to forming a wave packet. For example, a musician will play a note for a finite time, and this
wave train propagates out as a wave packet of finite length. You have no information as to the frequency
and amplitude of the sound prior to the wave packet reaching you, or after the wave packet has passed you.
The velocity of the wavelets contained within the wave packet is called the phase velocity. For a dispersive
system the phase velocity of the wavelets contained within the wave packet is frequency dependent and the
shape of the wave packet travels at the group velocity which usually differs from the phase velocity. If
the shape of the wave packet is time dependent, then neither the phase velocity, which is the velocity of the
wavelets, nor the group velocity, which is the velocity of an instantaneous point fixed to the shape of the
wave packet envelope, represents the actual velocity of the overall wave packet.
A third wavepacket velocity, the signal velocity, is defined to be the velocity of the leading edge of the
energy distribution, and corresponding information content, of the wave packet. For most linear systems
the shape of the wave packet is not time dependent and then the group and signal velocities are identical.
However, the group and signal velocities can be very different for non-linear systems as discussed in chapter
4.7. Note that even when the phase velocity of the waves within the wave packet travels faster than the group
velocity of the shape, or the signal velocity of the energy content of the envelope of the wave packet, the
information contained in a wave packet is only manifest when the wave packet envelope reaches the detector
and this energy and information travel at the signal velocity. The modern ideas of wave propagation,
including Hamilton’s concept of group velocity, were developed by Lord Rayleigh when applied to the theory
of sound[Ray1887]. The concept of phase, group, and signal velocities played a major role in discussion of
electromagnetic waves as well as de Broglie’s development of wave-particle duality in quantum mechanics.

3.11.1 Phase, group, and signal velocities of wave packets

The concepts of wave packets, as well as their phase, group, and signal velocities, are of considerable impor-
tance for propagation of information and other manifestations of wave motion in science and engineering.
This importance warrants further discussion at this juncture.
Consider a particular $(k, \omega)$ component of a one-dimensional wave,

$$q(x,t) = A e^{i(kx \pm \omega t)} \tag{3.113}$$

The argument of the exponential is called the phase $\phi$ of the wave where

$$\phi \equiv kx - \omega t \tag{3.114}$$

If we move along the $x$ axis at a velocity such that the phase is constant then we perceive a stationary
pattern in this moving frame. The velocity of this wave is called the phase velocity. To ensure constant
phase requires that $\phi$ is constant, or assuming real $k$ and $\omega$

$$\omega\, dt = k\, dx \tag{3.115}$$

Therefore the phase velocity is defined to be

$$v_{phase} = \frac{\omega}{k} \tag{3.116}$$

The velocity discussed so far is just the phase velocity of the individual wavelets at the carrier frequency. If
$k$ or $\omega$ are complex then one must take the real parts to ensure that the velocity is real.
If the phase velocity of a wave is dependent on the wavelength, that is, $v_{phase}(k)$, then the system is
said to be dispersive in that the wave is dispersed according to the wavelength. The simplest illustration of
dispersion is the refraction of light in a glass prism which leads to dispersion of the light into the spectrum of
wavelengths. Dispersion leads to development of wave packets that travel at group and signal velocities that
usually differ from the phase velocity. To illustrate this behavior, consider two equal amplitude travelling
waves having slightly different wave number $k$ and angular frequency $\omega$. Superposition of these waves gives

$$q(x,t) = A\left(e^{i[kx-\omega t]} + e^{i[(k+\Delta k)x-(\omega+\Delta\omega)t]}\right) \tag{3.117}$$
$$= A e^{i\left[\left(k+\frac{\Delta k}{2}\right)x-\left(\omega+\frac{\Delta\omega}{2}\right)t\right]}\cdot\left\{e^{-i\left[\frac{\Delta k}{2}x-\frac{\Delta\omega}{2}t\right]} + e^{i\left[\frac{\Delta k}{2}x-\frac{\Delta\omega}{2}t\right]}\right\}$$
$$= 2A e^{i\left[\left(k+\frac{\Delta k}{2}\right)x-\left(\omega+\frac{\Delta\omega}{2}\right)t\right]}\cos\left[\frac{\Delta k}{2}x-\frac{\Delta\omega}{2}t\right]$$
This corresponds to a wave with the average carrier frequency modulated by the cosine term which has a
wavenumber of $\frac{\Delta k}{2}$ and angular frequency $\frac{\Delta\omega}{2}$, that is, this is the usual example of beats. The cosine term
modulates the average wave producing wave packets as shown in figure 3.11. The velocity of these wave
packets is called the group velocity given by requiring that the phase of the modulating term is constant,
that is
$$\frac{\Delta\omega}{2}\, dt = \frac{\Delta k}{2}\, dx \tag{3.118}$$
Thus the group velocity is given by
$$v_{group} = \frac{dx}{dt} = \frac{\Delta\omega}{\Delta k} \tag{3.119}$$
If dispersion is present then the group velocity $v_{group} = \frac{\Delta\omega}{\Delta k}$ does not equal the phase velocity $v_{phase} = \frac{\omega}{k}$.

Expanding the above example to superposition of $n$ waves gives
$$q(x,t) = \sum_{r=1}^{n} A_r e^{i(k_r x \pm \omega_r t)} \tag{3.120}$$

In the event that $n \to \infty$ and the frequencies are continuously distributed, then the summation is replaced
by an integral
$$q(x,t) = \int_{-\infty}^{\infty} A(k)\, e^{i(kx \pm \omega t)}\, dk \tag{3.121}$$
where the factor $A(k)$ represents the distribution amplitudes of the component waves, that is the spectral
decomposition of the wave. This is the usual Fourier decomposition of the spatial distribution of the wave.
Consider an extension of the linear superposition of two waves to a well defined wave packet where the
amplitude is nonzero only for a small range of wavenumbers $k_0 \pm \Delta k$
$$q(x,t) = \int_{k_0-\Delta k}^{k_0+\Delta k} A(k)\, e^{i(kx-\omega t)}\, dk \tag{3.122}$$

This functional shape is called a wave packet which only has meaning if $\Delta k \ll k_0$. The angular frequency
can be expressed by making a Taylor expansion around $k_0$
$$\omega(k) = \omega(k_0) + \left(\frac{d\omega}{dk}\right)_{k_0}(k-k_0) + \ldots \tag{3.123}$$

For a linear system the phase then reduces to

$$kx - \omega t = (k_0 x - \omega_0 t) + (k-k_0)x - \left(\frac{d\omega}{dk}\right)_{k_0}(k-k_0)\,t \tag{3.124}$$

The summation of terms in the exponent given by 3.124 leads to the amplitude 3.122 having the form of a
product where the integral becomes
$$q(x,t) = e^{i(k_0 x-\omega_0 t)}\int_{k_0-\Delta k}^{k_0+\Delta k} A(k)\, e^{i(k-k_0)\left[x-\left(\frac{d\omega}{dk}\right)_{k_0}t\right]}\, dk \tag{3.125}$$

The integral term modulates the $e^{i(k_0 x-\omega_0 t)}$ first term.

The group velocity is defined to be that for which the phase of the exponential term in the integral is
constant. Thus
$$v_{group} = \left(\frac{d\omega}{dk}\right)_{k_0} \tag{3.126}$$
Since $\omega = k\,v_{phase}$ then
$$v_{group} = v_{phase} + k\frac{\partial v_{phase}}{\partial k} \tag{3.127}$$

For non-dispersive systems the phase velocity is independent of the wave number $k$ or angular frequency $\omega$
and thus $v_{group} = v_{phase}$. The case discussed earlier, equation (3.103), for beating of two waves gives the
same relation in the limit that $\Delta\omega$ and $\Delta k$ are infinitesimal.
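
The relation between the group and phase velocities can be checked numerically for any smooth dispersion relation. The sketch below uses a hypothetical dispersion relation $\omega(k) = 2k + 0.1k^3$, chosen purely for illustration, and compares the direct derivative $d\omega/dk$ with $v_{phase} + k\,\partial v_{phase}/\partial k$ using central differences. All names and numerical values are assumptions.

```python
def omega(k):
    """Hypothetical dispersion relation, chosen only for illustration."""
    return 2.0 * k + 0.1 * k**3

def derivative(func, k, h=1e-5):
    """Central-difference estimate of d(func)/dk."""
    return (func(k + h) - func(k - h)) / (2 * h)

def v_phase(k):
    """Phase velocity omega / k."""
    return omega(k) / k

k0 = 1.5   # centre wavenumber of the packet (illustrative)

v_group_direct = derivative(omega, k0)                            # d(omega)/dk at k0
v_group_from_phase = v_phase(k0) + k0 * derivative(v_phase, k0)   # v_ph + k dv_ph/dk

print(v_group_direct, v_group_from_phase, v_phase(k0))
```

For this choice the two estimates agree at about 2.675, which differs from the phase velocity of 2.225, as expected for a dispersive system. Setting the cubic coefficient to zero makes the two velocities equal.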
The group velocity of a wave packet is of physical significance for dispersive media where $v_{group} = \left(\frac{d\omega}{dk}\right)_{k_0} \neq \frac{\omega}{k} = v_{phase}$. Every wave train has a finite extent and thus we usually observe the motion of a
group of waves rather than the wavelets moving within the wave packet. In general, for non-linear dispersive
systems the derivative $\frac{\partial v_{phase}}{\partial k}$ can be either positive or negative and thus in principle the group velocity
can either be greater than, or less than, the phase velocity. Moreover, if the group velocity is frequency
dependent, that is, when group velocity dispersion occurs, then the overall shape of the wave packet is time
dependent and thus the speed of a specific relative location defined by the shape of the envelope of the wave
packet does not represent the signal velocity of the wave packet. Brillouin showed that the distribution
of the energy, and corresponding information content, for any wave packet, travels at the signal velocity
which can be different from the group velocity if the shape of the envelope of the wave packet is time
dependent. For electromagnetic waves one has the possibility that the group velocity $v_{group} > v_{phase} = c$. In
1914 Brillouin[Bri14][Bri60] showed that the signal velocity of electromagnetic waves, defined by the leading
edge of the time-dependent envelope of the wave packet, never exceeds $c$ even though the group velocity
corresponding to the velocity of the instantaneous shape of the wave packet may exceed $c$. Thus, there is
no violation of Einstein’s fundamental principle of relativity that the velocity of an electromagnetic wave
cannot exceed $c$.

3.3 Example: Water waves breaking on a beach

The concepts of phase and group velocity are illustrated by the example of water waves moving at velocity
$v$ incident upon a straight beach at an angle $\alpha$ to the shoreline. Consider that the wavepacket comprises
many wavelengths of wavelength $\lambda$. During the time it takes the wave to travel a distance $\lambda$ the point where
the crest of one wave breaks on the beach travels a distance $\frac{\lambda}{\cos\alpha}$ along the beach. Thus the phase velocity of the
crest of the one wavelet in the wave packet is
$$v_{phase} = \frac{v}{\cos\alpha}$$
The velocity of the wave packet along the beach equals
$$v_{group} = v\cos\alpha$$

Note that for the wave moving parallel to the beach $\alpha = 0$ and $v_{phase} = v_{group} = v$. However, for $\alpha = \frac{\pi}{2}$,
$v_{phase} \to \infty$ and $v_{group} \to 0$. In general for waves breaking on the beach
$$v_{phase}\, v_{group} = v^2$$
The same behavior is exhibited by surface waves bouncing off the sides of the Erie canal, sound waves in
a trombone, and electromagnetic waves transmitted down a rectangular wave guide. In the latter case the
phase velocity exceeds the velocity of light $c$ in apparent violation of Einstein’s theory of relativity. However,
the information travels at the signal velocity which is less than $c$.

3.4 Example: Surface waves for deep water

In the “Theory of Sound”[Ray1887] Rayleigh discusses the example of surface waves for water. He derives
a dispersion relation for the phase velocity $v_{phase}$ and wavenumber $k$ which are related to the density $\rho$, depth
$l$, gravity $g$, and surface tension $T$, by
$$\omega^2 = \left(gk + \frac{Tk^3}{\rho}\right)\tanh(kl)$$
For deep water where the wavelength is short compared with the depth, that is $kl \gg 1$, then $\tanh(kl) \to 1$
and the dispersion relation is given approximately by
$$\omega^2 = gk + \frac{Tk^3}{\rho}$$
For long surface waves for deep water, that is, small $k$, then the gravitational first term in the dispersion
relation dominates and the group velocity is given by
$$v_{group} = \left(\frac{d\omega}{dk}\right) = \frac{1}{2}\sqrt{\frac{g}{k}} = \frac{1}{2}\frac{\omega}{k} = \frac{v_{phase}}{2}$$
That is, the group velocity is half of the phase velocity. Here the wavelets are building at the back of the wave
packet, progress through the wave packet and dissipate at the front. This can be demonstrated by dropping a
pebble into a calm lake. It will be seen that the surface disturbance comprises a wave packet moving outwards
at the group velocity with the individual waves within the wave packet expanding at twice the group velocity
of the wavepacket, that is, they are created at the inner radius of the wave packet and disappear at the outer
radius of the wave packet.
For small wavelength ripples, where $k$ is large, then the surface tension term dominates and the dispersion
relation is approximately given by
$$\omega^2 \simeq \frac{Tk^3}{\rho}$$
leading to a group velocity of
$$v_{group} = \left(\frac{d\omega}{dk}\right) = \frac{3}{2}v_{phase}$$
Here the group velocity exceeds the phase velocity and wavelets are building at the front of the wave packet and
dissipate at the back. Note that for this linear system, the Brillouin signal velocity equals the group velocity
for both gravity and surface tension waves for deep water.
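
The two limiting ratios of group to phase velocity can be verified numerically from the deep-water dispersion relation $\omega^2 = gk + Tk^3/\rho$. The sketch below assumes approximate values for water ($T \approx 0.072$ N/m, $\rho = 1000$ kg/m$^3$) and two illustrative wavenumbers, one in the gravity-dominated regime and one in the surface-tension-dominated regime. The function names and the step size are hypothetical.

```python
import math

g = 9.81        # gravitational acceleration, m/s^2
T = 0.072       # surface tension of clean water, N/m (approximate, assumed)
rho = 1000.0    # density of water, kg/m^3

def omega(k):
    """Deep-water dispersion relation: omega^2 = g k + T k^3 / rho."""
    return math.sqrt(g * k + T * k**3 / rho)

def velocity_ratio(k, rel_step=1e-6):
    """Ratio v_group / v_phase, with d(omega)/dk from a central difference."""
    h = k * rel_step
    v_group = (omega(k + h) - omega(k - h)) / (2 * h)
    v_phase = omega(k) / k
    return v_group / v_phase

ratio_gravity = velocity_ratio(0.1)       # wavelength of about 63 m
ratio_capillary = velocity_ratio(1.0e5)   # wavelength of about 0.06 mm

print(ratio_gravity, ratio_capillary)
```

The first ratio is close to 1/2 and the second close to 3/2. At intermediate wavelengths, near 1.7 cm for water, the two terms are comparable and the ratio passes through 1.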

3.5 Example: Electromagnetic waves in ionosphere

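The response to radio waves, incident upon a free electron plasma in the ionosphere, provides an excellent
example that involves cut-off frequency, complex wavenumber $k$, as well as the phase, group, and signal
velocities. Maxwell’s equations give the most general wave equation for electromagnetic waves to be
$$\nabla^2\mathbf{E} - \varepsilon\mu\frac{\partial^2\mathbf{E}}{\partial t^2} = \mu\frac{\partial\mathbf{j}_{free}}{\partial t} + \nabla\left(\frac{\rho_{free}}{\varepsilon}\right)$$
$$\nabla^2\mathbf{H} - \varepsilon\mu\frac{\partial^2\mathbf{H}}{\partial t^2} = -\nabla\times\mathbf{j}_{free}$$

where $\rho_{free}$ and $\mathbf{j}_{free}$ are the unbound charge and current densities. The effect of the bound charges and
currents is absorbed into $\varepsilon$ and $\mu$. Ohm’s Law can be written in terms of the electrical conductivity $\sigma$ which
is a constant
$$\mathbf{j}_{free} = \sigma\mathbf{E}$$
Assuming Ohm’s Law, plus assuming $\rho_{free} = 0$ in the plasma, gives the relations

$$\nabla^2\mathbf{E} - \varepsilon\mu\frac{\partial^2\mathbf{E}}{\partial t^2} - \sigma\mu\frac{\partial\mathbf{E}}{\partial t} = 0$$
$$\nabla^2\mathbf{H} - \varepsilon\mu\frac{\partial^2\mathbf{H}}{\partial t^2} - \sigma\mu\frac{\partial\mathbf{H}}{\partial t} = 0$$
The third term in both of these wave equations is a damping term that leads to a damped solution of an
electromagnetic wave in a good conductor.
The solution of these damped wave equations can be found by considering an incident wave

$$\mathbf{E} = E_0\,\hat{\mathbf{x}}\,e^{i(\omega t-kz)}$$

Substituting for $\mathbf{E}$ in the first damped wave equation gives

$$-k^2 + \omega^2\varepsilon\mu - i\omega\sigma\mu = 0$$

That is
$$k^2 = \omega^2\varepsilon\mu\left[1 - i\frac{\sigma}{\varepsilon\omega}\right]$$

In general $k$ is complex, that is, it has real $k_R$ and imaginary $k_I$ parts that lead to a solution of the form

$$\mathbf{E} = E_0\,e^{-k_I z}\,e^{i(\omega t-k_R z)}$$

The first exponential term is an exponential damping term while the second exponential term is the oscillating
term.
Consider that the plasma involves the motion of a bound damped electron, of charge $q$ and mass $m$, bound
in a one dimensional atom or lattice subject to an oscillatory electric field of frequency $\omega$. Assume that the
electromagnetic wave is travelling in the $\hat{\mathbf{z}}$ direction with the transverse electric field in the $\hat{\mathbf{x}}$ direction. The
equation of motion of an electron can be written as

$$\ddot{\mathbf{x}} + \Gamma\dot{\mathbf{x}} + \omega_0^2\mathbf{x} = \hat{\mathbf{x}}\frac{q}{m}E_0\,e^{i(\omega t-kz)}$$

where $\Gamma$ is the damping factor. The instantaneous displacement of the oscillating charge equals
$$\mathbf{x} = \frac{q}{m}\frac{1}{(\omega_0^2-\omega^2) + i\Gamma\omega}\,\hat{\mathbf{x}}E_0\,e^{i(\omega t-kz)}$$
and the velocity is
$$\dot{\mathbf{x}} = \frac{q}{m}\frac{i\omega}{(\omega_0^2-\omega^2) + i\Gamma\omega}\,\hat{\mathbf{x}}E_0\,e^{i(\omega t-kz)}$$
Thus the instantaneous current density is given by
$$\mathbf{j}_{free} = Nq\dot{\mathbf{x}} = \frac{Nq^2}{m}\frac{i\omega}{(\omega_0^2-\omega^2) + i\Gamma\omega}\,\hat{\mathbf{x}}E_0\,e^{i(\omega t-kz)}$$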

Therefore the electrical conductivity is given by

 2 
= 2
 ( 0 −  2 ) + Γ
Let us consider only unbound charges in the plasma, that is let  0 = 0. Then the conductivity is given by

 2 
=
 Γ −  2
For a low density ionized plasma   Γ thus the conductivity is given approximately by

 2
 ≈ −

Since  is pure imaginary, then j and E have a phase difference of 2 which implies that the average of
the Joule heating over a complete period is hj · Ei = 0 Thus there is no energy loss due to Joule heating
implying that the electromagnetic energy is conserved.
Substitution of  into the relation for 2
∙ ¸ ∙ ¸
  2
2 =  2  1 − =  2  1 −
  2
Define the Plasma oscillation frequency   to be
r
 2
 ≡

then 2 can be written as ∙ ³  ´2 ¸
2 2 
 =   1 − ()

For a low density plasma the dielectric constant  ' 1 and the relative permeability  ' 1 and thus
 =  0 ' 0 and  =  0 ' 0 . The velocity of light in vacuum  = √10  . Thus for low density
0
equation  can be written as
 2 =  2 + 2 2 ()
Differentiation of equation  with respect to  gives 2  2 2
 = 2  That is,   =  and the phase
velocity is r
 2
 = 2 + 2

There are three cases to consider.
1) $\omega > \omega_P$: For this case $\left[1-\left(\frac{\omega_P}{\omega}\right)^2\right] > 0$ and thus $k$ is a pure real number. Therefore the electromagnetic wave is transmitted with a phase velocity that exceeds $c$ while the group velocity is less than $c$.
2) $\omega < \omega_P$: For this case $\left[1-\left(\frac{\omega_P}{\omega}\right)^2\right] < 0$ and thus $k$ is a pure imaginary number. Therefore the electromagnetic wave is not transmitted in the ionosphere and is attenuated rapidly, as $e^{-(\omega_P/c)z}$ for $\omega \ll \omega_P$. However, since there are no Joule heating losses, the electromagnetic wave must be completely reflected. Thus the plasma oscillation frequency serves as a cut-off frequency. For this example the signal and group velocities are identical.
For the ionosphere $N = 10^{11}$ electrons/m$^3$, which corresponds to a plasma oscillation frequency of $\nu = \omega_P/2\pi \approx 3$ MHz. Thus electromagnetic waves in the AM waveband ($< 1.6$ MHz) are totally reflected by the ionosphere and bounce repeatedly around the Earth, whereas for VHF frequencies above 3 MHz the waves are transmitted and refracted, passing through the atmosphere. Thus light is transmitted by the ionosphere. By contrast, for a good conductor like silver, the plasma oscillation frequency is around $10^{16}$ Hz, which is in the far ultraviolet part of the spectrum. Thus all lower frequencies, such as light, are totally reflected by such a good conductor, whereas X-rays have frequencies above the plasma oscillation frequency and are transmitted.
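As a rough numerical check of the ionosphere example, the sketch below evaluates the plasma frequency for an electron density of $10^{11}$ m$^{-3}$ and the phase and group velocities from $\omega^2 = \omega_P^2 + c^2k^2$. The function names and the two test frequencies (1 MHz and 100 MHz) are illustrative assumptions, and $\varepsilon \simeq \varepsilon_0$ is assumed as in the low-density limit above.

```python
import math

# Physical constants in SI units.
ELECTRON_CHARGE = 1.602176634e-19   # C
ELECTRON_MASS = 9.1093837015e-31    # kg
EPSILON_0 = 8.8541878128e-12        # F/m
C_LIGHT = 2.99792458e8              # m/s


def plasma_angular_frequency(n_electrons):
    """omega_P = sqrt(N q^2 / (epsilon_0 m)) for N electrons per cubic metre."""
    return math.sqrt(n_electrons * ELECTRON_CHARGE**2 / (EPSILON_0 * ELECTRON_MASS))


def phase_and_group_velocity(omega, omega_p):
    """Velocities from omega^2 = omega_P^2 + c^2 k^2; None below the cut-off."""
    if omega <= omega_p:
        return None  # k is imaginary, so the wave is reflected
    k = math.sqrt(omega**2 - omega_p**2) / C_LIGHT
    return omega / k, C_LIGHT**2 * k / omega


omega_p = plasma_angular_frequency(1e11)
nu_p = omega_p / (2 * math.pi)
print(f"plasma frequency: {nu_p / 1e6:.2f} MHz")

for nu in (1.0e6, 100.0e6):  # illustrative AM and VHF frequencies
    result = phase_and_group_velocity(2 * math.pi * nu, omega_p)
    if result is None:
        print(f"{nu / 1e6:.0f} MHz: reflected")
    else:
        v_phase, v_group = result
        print(f"{nu / 1e6:.0f} MHz: v_phase/c = {v_phase / C_LIGHT:.6f}, "
              f"v_group/c = {v_group / C_LIGHT:.6f}")
```

The product of the two velocities equals $c^2$ for any transmitted frequency, which is a quick way to confirm the dispersion relation is being used consistently.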
3.11.2 Fourier transform of wave packets

The relation between the time distribution and the corresponding frequency distribution, or equivalently, the spatial distribution and the corresponding wave-number distribution, is of considerable importance in the discussion of wave packets and signal processing. It directly relates to the uncertainty principle that is a characteristic of all forms of wave motion. The relation between the time and corresponding frequency distribution is given via the Fourier transform discussed in the appendix. The following are two examples of the Fourier transforms of typical but rather different wavepacket shapes that are encountered often in science and engineering.

3.6 Example: Fourier transform of a Gaussian wave packet:

Assume that the amplitude of the wave is the Gaussian wave packet shown in the adjacent figure, where
$$G(\omega) = c\,e^{-\frac{(\omega-\omega_0)^2}{2\sigma_\omega^2}}$$
This leads to the Fourier transform
$$F(t) = c\,\sigma_\omega\sqrt{2\pi}\,e^{-\frac{\sigma_\omega^2t^2}{2}}\cos(\omega_0t)$$
Figure: Fourier transform of a Gaussian frequency distribution.
Note that the wavepacket has a standard deviation for the amplitude of $\sigma_t = \frac{1}{\sigma_\omega}$, that is, $\sigma_t\cdot\sigma_\omega = 1$. The Gaussian wavepacket results in the minimum product of the standard deviations of the frequency and time representations for a wavepacket. This has profound importance for all wave phenomena, and especially for quantum mechanics. Because matter exhibits wave-like behavior, the above property of wave packets leads to Heisenberg's Uncertainty Principle. For signal processing, it shows that if you truncate a wavepacket you will broaden the frequency distribution.
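The Gaussian transform pair can be checked numerically. The sketch below assumes the transform convention $F(t) = \int G(\omega)\cos(\omega t)\,d\omega$ with no $1/2\pi$ prefactor, and compares a trapezoid-rule estimate with the closed form $c\,\sigma_\omega\sqrt{2\pi}\,e^{-\sigma_\omega^2t^2/2}\cos(\omega_0t)$; the parameter values and function names are arbitrary illustrations.

```python
import math


def gaussian_spectrum(omega, omega0, sigma_w, c=1.0):
    """G(omega) = c exp(-(omega - omega0)^2 / (2 sigma_w^2))."""
    return c * math.exp(-(omega - omega0) ** 2 / (2 * sigma_w ** 2))


def cosine_transform(t, omega0, sigma_w, c=1.0, n=4000, span=8.0):
    """Trapezoid estimate of the integral of G(omega) cos(omega t) d(omega)."""
    lo = omega0 - span * sigma_w
    hi = omega0 + span * sigma_w
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        w = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * gaussian_spectrum(w, omega0, sigma_w, c) * math.cos(w * t)
    return total * h


def closed_form(t, omega0, sigma_w, c=1.0):
    """c sigma_w sqrt(2 pi) exp(-sigma_w^2 t^2 / 2) cos(omega0 t)."""
    return (c * sigma_w * math.sqrt(2 * math.pi)
            * math.exp(-(sigma_w * t) ** 2 / 2) * math.cos(omega0 * t))


for t in (0.0, 0.3, 1.0):
    print(t, cosine_transform(t, 10.0, 2.0), closed_form(t, 10.0, 2.0))
```

The time envelope $e^{-\sigma_\omega^2t^2/2}$ is itself a Gaussian of standard deviation $1/\sigma_\omega$, which is the origin of $\sigma_t\cdot\sigma_\omega = 1$.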

3.7 Example: Fourier transform of a rectangular wave packet:

Assume unity amplitude of the frequency distribution between $\omega_0-\Delta\omega \leq \omega \leq \omega_0+\Delta\omega$, that is, a single isolated square pulse of width $2\Delta\omega$ that is described by the rectangular function $\Pi$ defined as
$$\Pi(\omega) = \begin{cases} 1 & |\omega-\omega_0| < \Delta\omega \\ 0 & |\omega-\omega_0| > \Delta\omega \end{cases}$$

Then the Fourier transform is given by
$$F(t) = \left[\frac{\sin\Delta\omega\,t}{\Delta\omega\,t}\right]\cos\omega_0t$$

That is, the transform of a rectangular wavepacket gives a cosine wave modulated by an unnormalized sinc function, which is a nice example of a simple wave packet. On the right-hand side we have a wavepacket whose envelope has its first zeros at $\Delta t = \pm\frac{\pi}{\Delta\omega}$. Note that the product of the two measures of the widths is $\Delta\omega\cdot\Delta t = \pm\pi$.
Example 2 in the appendix considers a rectangular pulse of unity amplitude between $-\frac{\tau}{2} \leq t \leq \frac{\tau}{2}$, which resulted in a Fourier transform $G(\omega) = \tau\left(\frac{\sin\frac{\omega\tau}{2}}{\frac{\omega\tau}{2}}\right)$. That is, for a pulse of width $\Delta t = \pm\frac{\tau}{2}$ the frequency envelope has the first zero at $\Delta\omega = \pm\frac{2\pi}{\tau}$. Note that this is the complementary system to the one considered here, which has $\Delta\omega\cdot\Delta t = \pm\pi$, illustrating the symmetry of the Fourier transform and its inverse.
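The rectangular case can be verified the same way. The sketch below assumes the unnormalized convention $\int\Pi(\omega)\cos(\omega t)\,d\omega$, which equals $2\Delta\omega$ times the sinc-modulated cosine quoted above (the constant $2\Delta\omega$ is the omitted normalization), and confirms that the envelope first vanishes at $t = \pi/\Delta\omega$. The parameter values are arbitrary.

```python
import math


def rectangular_transform(t, omega0, delta_omega, n=2000):
    """Trapezoid estimate of the integral of cos(omega t) over omega0 +/- delta_omega."""
    lo = omega0 - delta_omega
    h = 2 * delta_omega / n
    total = 0.0
    for i in range(n + 1):
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.cos((lo + i * h) * t)
    return total * h


def sinc_form(t, omega0, delta_omega):
    """2 delta_omega [sin(delta_omega t) / (delta_omega t)] cos(omega0 t)."""
    x = delta_omega * t
    envelope = 1.0 if x == 0 else math.sin(x) / x
    return 2 * delta_omega * envelope * math.cos(omega0 * t)


OMEGA_0, DELTA_OMEGA = 10.0, 1.0   # assumed centre frequency and half-width
for t in (0.0, 0.7, 2.0, math.pi / DELTA_OMEGA):
    print(t, rectangular_transform(t, OMEGA_0, DELTA_OMEGA),
          sinc_form(t, OMEGA_0, DELTA_OMEGA))
```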
3.11.3 Wave-packet Uncertainty Principle

The Uncertainty Principle states that wavemotion exhibits a minimum product of the uncertainty in the
simultaneously measured width in time of a wave packet, and the distribution width of the frequency de-
composition of this wave packet. This was illustrated by the Fourier transforms of wave packets discussed
above where it was shown the product of the widths is minimized for a Gaussian-shaped wave packet. The
Uncertainty Principle implies that to make a precise measurement of the frequency of a sinusoidal wave
requires that the wave packet be infinitely long. If the duration of the wave packet is reduced then the
frequency distribution broadens. The crucial aspect needed for this discussion is that, for the amplitudes of any wavepacket, the standard deviations $\sigma(t) = \sqrt{\langle t^2\rangle - \langle t\rangle^2}$ characterizing the width of the spectral distribution in the angular frequency domain, $\sigma_A(\omega)$, and the width for the conjugate variable in time, $\sigma_A(t)$, are related:
$$\sigma_A(t)\cdot\sigma_A(\omega) \geq 1 \qquad \text{(Relation between amplitude uncertainties.)}$$
This product of the standard deviations equals unity only for the special case of Gaussian-shaped spectral distributions, and it is greater than unity for all other shaped spectral distributions.
The intensity of the wave is the square of the amplitude, leading to standard deviation widths for a Gaussian distribution where $\sigma_I(t)^2 = \frac{1}{2}\sigma_A(t)^2$, that is, $\sigma_I(t) = \frac{\sigma_A(t)}{\sqrt{2}}$. Thus the standard deviations for the spectral distribution and width of the intensity of the wavepacket are related by:
$$\sigma_I(t)\cdot\sigma_I(\omega) \geq \frac{1}{2} \qquad \text{(Uncertainty principle for frequency-time intensities)}$$
This states that the uncertainties with which you can simultaneously measure the time and frequency for the intensity of a given wavepacket are related. If you try to measure the frequency within a short time interval $\sigma_I(t)$ then the uncertainty in the frequency measurement is $\sigma_I(\omega) \geq \frac{1}{2\sigma_I(t)}$. Accurate measurement of the frequency requires measurement times that encompass many cycles of oscillation, that is, a long wavepacket.
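The two limiting products can be illustrated for the Gaussian case. The sketch below is a numerical check under stated assumptions: the widths are computed as standard deviations with the amplitude (or its square, the intensity) used as the weight, only the time envelope is used (the $\cos\omega_0t$ carrier is ignored), and the values of $\sigma_\omega$ and $\omega_0$ are arbitrary.

```python
import math

SIGMA_OMEGA = 2.0   # assumed width of the Gaussian amplitude spectrum
OMEGA_0 = 10.0      # assumed centre frequency


def spectrum_amplitude(omega):
    return math.exp(-(omega - OMEGA_0) ** 2 / (2 * SIGMA_OMEGA ** 2))


def envelope_amplitude(t):
    return math.exp(-(SIGMA_OMEGA * t) ** 2 / 2)


def std_dev(weight, lo, hi, n=20000):
    """Standard deviation of u under the non-negative weight(u), by midpoint sums."""
    h = (hi - lo) / n
    s0 = s1 = s2 = 0.0
    for i in range(n):
        u = lo + (i + 0.5) * h
        w = weight(u)
        s0 += w
        s1 += w * u
        s2 += w * u * u
    mean = s1 / s0
    return math.sqrt(s2 / s0 - mean ** 2)


def width_product(power):
    """sigma(t) * sigma(omega), using amplitude**power as the weight."""
    s_w = std_dev(lambda w: spectrum_amplitude(w) ** power, OMEGA_0 - 20, OMEGA_0 + 20)
    s_t = std_dev(lambda t: envelope_amplitude(t) ** power, -10.0, 10.0)
    return s_t * s_w


amplitude_product = width_product(1)   # expected to be close to 1
intensity_product = width_product(2)   # expected to be close to 1/2
print(amplitude_product, intensity_product)
```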
Exactly the same relations exist between the spectral distribution as a function of wavenumber $k_x$ and the corresponding spatial dependence of a wave on $x$, which are conjugate representations. Thus the spectral distribution plotted versus $k_x$ is directly related to the amplitude as a function of position $x$; the spectral distribution versus $k_y$ is related to the amplitude as a function of $y$; and the $k_z$ spectral distribution is related to the spatial dependence on $z$. Following the same arguments discussed above, the standard deviation $\sigma_I(k_x)$ characterizing the width of the spectral intensity distribution of $k_x$, and the standard deviation $\sigma_I(x)$ characterizing the spatial width of the wave packet intensity as a function of $x$, are related by the Uncertainty Principle for position-wavenumber. Thus in summary the temporal and spatial uncertainty principles of the intensity of wave motion are
$$\sigma_I(t)\cdot\sigma_I(\omega) \geq \frac{1}{2} \tag{3.128}$$
$$\sigma_I(x)\cdot\sigma_I(k_x) \geq \frac{1}{2} \qquad \sigma_I(y)\cdot\sigma_I(k_y) \geq \frac{1}{2} \qquad \sigma_I(z)\cdot\sigma_I(k_z) \geq \frac{1}{2}$$
This applies to all forms of wave motion, be they sound waves, water waves, electromagnetic waves, or matter waves.
As discussed in chapter 18, the transition to quantum mechanics involves relating the matter-wave properties to the energy and momentum of the corresponding particle. That is, in the case of matter waves, multiply both sides of equation 3.128 by $\hbar$ and use the de Broglie relations, which relate the particle energy to the angular frequency by $E = \hbar\omega$ and the particle momentum to the wavenumber by $\mathbf{p} = \hbar\mathbf{k}$. These lead to the Heisenberg Uncertainty Principle:
$$\sigma_I(t)\cdot\sigma_I(E) \geq \frac{\hbar}{2} \tag{3.129}$$
$$\sigma_I(x)\cdot\sigma_I(p_x) \geq \frac{\hbar}{2} \qquad \sigma_I(y)\cdot\sigma_I(p_y) \geq \frac{\hbar}{2} \qquad \sigma_I(z)\cdot\sigma_I(p_z) \geq \frac{\hbar}{2}$$
This uncertainty principle applies equally to the wavefunction of the electron in the
hydrogen atom, a proton in a nucleus, as well as to a wavepacket describing a particle wave moving along some
trajectory. This implies that, for a particle of given momentum, the wavefunction is spread out spatially. Planck's constant $\hbar = 1.054\times10^{-34}$ J·s $= 6.582\times10^{-16}$ eV·s is extremely small compared with energies and times encountered in normal life, and thus the effects due to the Uncertainty Principle are not important for macroscopic dimensions.
Confinement of a particle, of mass $m$, within $\pm\sigma(r)$ of a fixed location implies that there is a corresponding uncertainty in the momentum
$$\sigma(p_r) \geq \frac{\hbar}{2\sigma(r)} \tag{3.130}$$
Now the variance in momentum $\mathbf{p}$ is given by the difference between the average of the square, $\langle\mathbf{p}\cdot\mathbf{p}\rangle$, and the square of the average, $\langle\mathbf{p}\rangle^2$. That is
$$\sigma(\mathbf{p})^2 = \langle\mathbf{p}\cdot\mathbf{p}\rangle - \langle\mathbf{p}\rangle^2 \tag{3.131}$$

Assuming a fixed average location implies that $\langle\mathbf{p}\rangle = 0$, then
$$\langle\mathbf{p}\cdot\mathbf{p}\rangle = \sigma(p_r)^2 \geq \left(\frac{\hbar}{2\sigma(r)}\right)^2 \tag{3.132}$$
Since the kinetic energy is $\frac{p^2}{2m}$, it follows that
$$\text{Kinetic energy} = \frac{p^2}{2m} \geq \frac{\hbar^2}{8m\sigma(r)^2} \qquad \text{(Zero-point energy)}$$
This zero-point energy is the minimum kinetic energy that a particle of mass $m$ can have if confined within a distance $\pm\sigma(r)$. This zero-point energy is a consequence of wave-particle duality and the uncertainty between the size and wavenumber for any wave packet. It is a quantal effect in that the classical limit has $\hbar \to 0$, for which the zero-point energy $\to 0$.
Inserting numbers for the zero-point energy gives that an electron confined to the radius of the atom, that is $\sigma(r) = 10^{-10}$ m, has a zero-point kinetic energy of $\sim 1$ eV. Confining this electron to $3\times10^{-15}$ m, the size of a nucleus, gives a zero-point energy of $10^9$ eV (1 GeV). Confining a proton to the size of the nucleus gives a zero-point energy of 0.5 MeV. These values are typical of the level spacing observed in atomic and nuclear physics. If $\hbar$ were a large number, then a billiard ball confined to a billiard table would be a blur as it oscillated with the minimum zero-point kinetic energy. The smaller the spatial region to which the ball was confined, the larger would be its zero-point energy and momentum, causing it to rattle back and forth between the boundaries of the confined region. Life would be dramatically different if $\hbar$ were a large number.
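The quoted zero-point energies follow directly from $\hbar^2/(8m\sigma^2)$. The sketch below evaluates them; the function name is illustrative, and the non-relativistic formula is applied to all three cases as in the text, even though the GeV-scale electron value lies well outside its range of validity and should be read only as an order of magnitude.

```python
HBAR = 1.054571817e-34          # J s
EV = 1.602176634e-19            # J per eV
ELECTRON_MASS = 9.1093837015e-31   # kg
PROTON_MASS = 1.67262192369e-27    # kg


def zero_point_energy_ev(mass, sigma_r):
    """Minimum kinetic energy hbar^2 / (8 m sigma^2), returned in eV."""
    return HBAR ** 2 / (8 * mass * sigma_r ** 2) / EV


electron_in_atom = zero_point_energy_ev(ELECTRON_MASS, 1e-10)
electron_in_nucleus = zero_point_energy_ev(ELECTRON_MASS, 3e-15)
proton_in_nucleus = zero_point_energy_ev(PROTON_MASS, 3e-15)

print(f"electron, atomic size : {electron_in_atom:.2f} eV")
print(f"electron, nuclear size: {electron_in_nucleus:.2e} eV")
print(f"proton, nuclear size  : {proton_in_nucleus:.2e} eV")
```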
In summary, Heisenberg’s Uncertainty Principle is a well-known and crucially important aspect of quan-
tum physics. What is less well known, is that the Uncertainty Principle applies for all forms of wave motion,
that is, it is not restricted to matter waves. The following three examples illustrate application of the
Uncertainty Principle to acoustics, the nuclear Mössbauer effect, and quantum mechanics.

3.8 Example: Acoustic wave packet

A violinist plays the note middle C (261.625 Hz) with constant intensity for precisely 2 seconds. Using the fact that the velocity of sound in air is 343.2 m/s, calculate the following:
1) The wavelength of the sound wave in air: $\lambda = 343.2/261.625 = 1.312$ m.
2) The length of the wavepacket in air: Wavepacket length $= 343.2\times2 = 686.4$ m.
3) The fractional frequency width of the note: Since the wave packet has a square pulse shape of length $\tau = 2$ s, the Fourier transform is a sinc function having the first zeros when $\sin\frac{\omega\tau}{2} = 0$, that is, $\Delta\nu = \frac{1}{\tau}$. Therefore the fractional width is $\frac{\Delta\nu}{\nu} = \frac{1}{\nu\tau} = 0.0019$. Note that to achieve a purity of $\frac{\Delta\nu}{\nu} = 10^{-6}$ the violinist would have to play the note for 1.06 hours.
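The arithmetic in this example is easy to reproduce. The following sketch simply recomputes the three answers and the playing time needed for a fractional width of $10^{-6}$; the variable names are illustrative.

```python
SOUND_SPEED = 343.2      # m/s
FREQUENCY = 261.625      # Hz, middle C
DURATION = 2.0           # s

wavelength = SOUND_SPEED / FREQUENCY              # m
packet_length = SOUND_SPEED * DURATION            # m
fractional_width = 1.0 / (FREQUENCY * DURATION)   # delta_nu / nu with delta_nu = 1 / tau

# Duration needed for a fractional width of 1e-6, converted to hours.
hours_for_ppm = 1.0 / (FREQUENCY * 1e-6) / 3600.0

print(f"wavelength       = {wavelength:.3f} m")
print(f"packet length    = {packet_length:.1f} m")
print(f"fractional width = {fractional_width:.4f}")
print(f"time for 1e-6    = {hours_for_ppm:.2f} hours")
```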

3.9 Example: Gravitational red shift

The Mössbauer effect in nuclear physics provides a wave packet that has an exceptionally small fractional width in frequency. For example, the $^{57}$Fe nucleus emits a 14.4 keV deexcitation-energy photon, which corresponds to $\omega = E/\hbar \approx 2\times10^{19}$ rad/s, with a decay time of $\tau \approx 10^{-7}$ s. Thus the fractional width is $\frac{\Delta\omega}{\omega} \approx 3\times10^{-13}$.
In 1959 Pound and Rebka used this to test Einstein's general theory of relativity by measurement of the gravitational red shift between the attic and basement of the 22.5 m high physics building at Harvard. The magnitude of the predicted relativistic red shift is $\frac{\Delta\omega}{\omega} = 2.5\times10^{-15}$, which is what was observed with a fractional precision of about 1%.
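The size of the predicted shift can be estimated from the weak-field expression $gh/c^2$ for a height difference $h$. The sketch below assumes this standard approximation and a surface gravity of 9.81 m/s$^2$, neither of which is stated explicitly in the example.

```python
G_SURFACE = 9.81            # m/s^2, assumed surface gravitational field
C_LIGHT = 2.99792458e8      # m/s
HEIGHT = 22.5               # m, height difference in the experiment

# Weak-field gravitational red shift between two heights: g h / c^2.
fractional_shift = G_SURFACE * HEIGHT / C_LIGHT ** 2
print(f"fractional red shift = {fractional_shift:.2e}")
```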

3.10 Example: Quantum baseball

George Gamow, in his book "Mr. Tompkins in Wonderland", describes the strange world that would exist if $\hbar$ were a large number. As an example, consider playing baseball in a universe where $\hbar$ is a large number. The pitcher throws a 150 g ball 20 m to the batter at a speed of 40 m/s. For a strike to be thrown, the ball must be pitched within the 30 cm radius of the strike zone, that is, it is required that $\Delta x \leq 0.3$ m. The uncertainty relation tells us that the transverse velocity of the ball cannot be less than $\Delta v = \frac{\hbar}{2m\Delta x}$. The time of flight of the ball from the mound to batter is $t = 0.5$ s. Because of the transverse velocity uncertainty $\Delta v$, the ball will deviate $t\Delta v$ transversely from the strike zone. This also must not exceed the size of the strike zone, that is,
$$t\Delta v = \frac{\hbar t}{2m\Delta x} \leq 0.3\ \text{m} \qquad \text{(Due to transverse velocity uncertainty)}$$
Combining both of these requirements gives
$$\hbar \leq \frac{2m\Delta x^2}{t} = 5.4\times10^{-2}\ \text{J·s}$$
This is 32 orders of magnitude larger than $\hbar$, so quantal effects are negligible. However, if $\hbar$ exceeded the above value, then the pitcher would have difficulty throwing a reliable strike.
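The bound on $\hbar$ and the quoted 32 orders of magnitude can be reproduced in a few lines. This sketch uses the numbers of the example; the variable names are illustrative.

```python
import math

HBAR = 1.054571817e-34   # J s, actual value
MASS = 0.150             # kg
DISTANCE = 20.0          # m
SPEED = 40.0             # m/s
DELTA_X = 0.3            # m, radius of the strike zone

flight_time = DISTANCE / SPEED
# Largest hbar for which the transverse drift t * hbar / (2 m dx) stays within dx.
hbar_max = 2 * MASS * DELTA_X ** 2 / flight_time
orders_of_magnitude = math.log10(hbar_max / HBAR)

print(f"flight time = {flight_time} s")
print(f"hbar_max    = {hbar_max:.3g} J s")
print(f"log10(hbar_max / hbar) = {orders_of_magnitude:.1f}")
```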

3.12 Summary
Linear systems have the feature that the solutions obey the Principle of Superposition, that is, the am-
plitudes add linearly for the superposition of diﬀerent oscillatory modes. Applicability of the Principle of
Superposition to a system provides a tremendous advantage for handling and solving the equations of motion
of oscillatory systems.
Geometric representations of the motion of dynamical systems provide sensitive probes of periodic motion. Configuration space $(\mathbf{q}, \mathbf{q}, t)$, state space $(\mathbf{q}, \dot{\mathbf{q}}, t)$ and phase space $(\mathbf{q}, \mathbf{p}, t)$ are powerful geometric representations that are used extensively for recognizing periodic motion, where $\mathbf{q}$, $\dot{\mathbf{q}}$ and $\mathbf{p}$ are vectors in $n$-dimensional space.

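Linearly-damped free linear oscillator The free linearly-damped linear oscillator is characterized by the equation
$$\ddot{x} + \Gamma\dot{x} + \omega_0^2x = 0 \tag{3.26}$$
The solutions of the linearly-damped free linear oscillator are of the form
$$x = e^{-\left(\frac{\Gamma}{2}\right)t}\left[z_1e^{i\omega_1t} + z_2e^{-i\omega_1t}\right] \qquad \omega_1 \equiv \sqrt{\omega_0^2-\left(\frac{\Gamma}{2}\right)^2} \tag{3.33}$$

The solutions of the linearly-damped free linear oscillator have the following characteristic frequencies corresponding to the three levels of linear damping
$$\begin{array}{lll}
x(t) = Ae^{-\left(\frac{\Gamma}{2}\right)t}\cos(\omega_1t-\beta) & \text{underdamped} & \omega_1 = \sqrt{\omega_0^2-\left(\frac{\Gamma}{2}\right)^2} > 0 \\
x(t) = \left[A_1e^{-\omega_+t} + A_2e^{-\omega_-t}\right] & \text{overdamped} & \omega_\pm = -\left[-\frac{\Gamma}{2} \pm \sqrt{\left(\frac{\Gamma}{2}\right)^2-\omega_0^2}\right] \\
x(t) = (A + Bt)\,e^{-\left(\frac{\Gamma}{2}\right)t} & \text{critically damped} & \omega_1 = \sqrt{\omega_0^2-\left(\frac{\Gamma}{2}\right)^2} = 0
\end{array}$$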
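The three damping regimes are distinguished by the sign of $\omega_0^2 - (\Gamma/2)^2$. A minimal sketch of that classification is given below; the function name, the tolerance argument, and the sample parameter values are illustrative assumptions.

```python
import math


def classify_damping(omega0, gamma, tol=1e-12):
    """Return the damping regime and its characteristic frequency or frequencies."""
    discriminant = omega0 ** 2 - (gamma / 2) ** 2
    if abs(discriminant) <= tol:
        return "critically damped", 0.0
    if discriminant > 0:
        # Oscillatory solution with angular frequency omega_1.
        return "underdamped", math.sqrt(discriminant)
    # Two real decay rates: omega_+ = Gamma/2 - root and omega_- = Gamma/2 + root.
    root = math.sqrt(-discriminant)
    return "overdamped", (gamma / 2 - root, gamma / 2 + root)


for omega0, gamma in ((2.0, 1.0), (2.0, 4.0), (1.0, 4.0)):
    print(omega0, gamma, classify_damping(omega0, gamma))
```

For the overdamped case the product of the two decay rates equals $\omega_0^2$, which gives a simple consistency check on the signs.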
The energy dissipation for the linearly-damped free linear oscillator, time averaged over one period, is given by
$$\langle E\rangle = E_0e^{-\Gamma t} \tag{3.44}$$
The quality factor $Q$ characterizing the damping of the free oscillator is defined to be
$$Q = \frac{E}{\Delta E} = \frac{\omega_1}{\Gamma} \tag{3.47}$$
where $\Delta E$ is the energy dissipated per radian.
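The two expressions for $Q$ agree only for weak damping, which can be seen numerically. The sketch below assumes the energy decays as $E_0e^{-\Gamma t}$ and takes "per radian" to mean the time interval $1/\omega_1$; the parameter values are arbitrary.

```python
import math


def quality_factor(omega0, gamma):
    """Q = omega_1 / Gamma for an underdamped oscillator."""
    omega1 = math.sqrt(omega0 ** 2 - (gamma / 2) ** 2)
    return omega1 / gamma


def energy_ratio_per_radian(omega0, gamma):
    """E / (energy lost while the phase advances by one radian)."""
    omega1 = math.sqrt(omega0 ** 2 - (gamma / 2) ** 2)
    time_per_radian = 1.0 / omega1
    return 1.0 / (1.0 - math.exp(-gamma * time_per_radian))


q = quality_factor(100.0, 1.0)
ratio = energy_ratio_per_radian(100.0, 1.0)
print(f"omega_1 / Gamma = {q:.3f}, E / delta E = {ratio:.3f}")
```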

Sinusoidally-driven, linearly-damped, linear oscillator The linearly-damped linear oscillator, driven by a harmonic driving force, is of considerable importance to all branches of physics and engineering. The equation of motion can be written as
$$\ddot{x} + \Gamma\dot{x} + \omega_0^2x = \frac{F(t)}{m} \tag{3.49}$$
where $F(t)$ is the driving force. The complete solution of this second-order differential equation comprises two components, the complementary solution (transient response), and the particular solution (steady-state response). That is,
$$x(t)_{Total} = x(t)_T + x(t)_S \tag{3.65}$$
For the underdamped case, the transient solution is the complementary solution
$$x(t)_T = \frac{F_0}{m}\,e^{-\frac{\Gamma}{2}t}\cos(\omega_1t-\beta) \tag{3.66}$$
and the steady-state solution is given by the particular solution
$$x(t)_S = \frac{\frac{F_0}{m}}{\sqrt{(\omega_0^2-\omega^2)^2 + (\Gamma\omega)^2}}\cos(\omega t-\delta) \tag{3.67}$$

Resonance A detailed discussion of resonance and energy absorption for the driven linearly-damped linear oscillator was given. For resonance of the linearly-damped linear oscillator the maximum amplitudes occur at the following resonant frequencies
$$\begin{array}{ll}
\text{Resonant system} & \text{Resonant frequency} \\
\text{undamped free linear oscillator} & \omega_0 = \sqrt{\frac{k}{m}} \\
\text{linearly-damped free linear oscillator} & \omega_1 = \sqrt{\omega_0^2-\left(\frac{\Gamma}{2}\right)^2} \\
\text{driven linearly-damped linear oscillator} & \omega_R = \sqrt{\omega_0^2-2\left(\frac{\Gamma}{2}\right)^2}
\end{array}$$

The energy absorption for the steady-state solution for resonance is given by
$$x(t)_S = A_{el}\cos\omega t + A_{abs}\sin\omega t \tag{3.73}$$
where the elastic amplitude is
$$A_{el} = \frac{\frac{F_0}{m}}{(\omega_0^2-\omega^2)^2 + (\Gamma\omega)^2}\left(\omega_0^2-\omega^2\right) \tag{3.74}$$
while the absorptive amplitude is
$$A_{abs} = \frac{\frac{F_0}{m}}{(\omega_0^2-\omega^2)^2 + (\Gamma\omega)^2}\,\Gamma\omega \tag{3.75}$$
The time average power input is given by only the absorptive term
$$\langle P\rangle = \frac{1}{2}F_0\omega A_{abs} = \frac{F_0^2}{2m}\,\frac{\Gamma\omega^2}{(\omega_0^2-\omega^2)^2 + (\Gamma\omega)^2} \tag{3.133}$$

This power curve has the classic Lorentzian shape.
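The resonance formulas can be explored numerically. The sketch below evaluates the elastic and absorptive amplitudes and the mean power on a frequency grid, then locates the maxima by a simple scan; the function names, the grid, and the values $\omega_0 = 1$, $\Gamma = 0.2$, $F_0 = m = 1$ are illustrative assumptions.

```python
import math


def denominator(omega, omega0, gamma):
    return (omega0 ** 2 - omega ** 2) ** 2 + (gamma * omega) ** 2


def elastic_amplitude(omega, omega0, gamma, f0_over_m=1.0):
    return f0_over_m * (omega0 ** 2 - omega ** 2) / denominator(omega, omega0, gamma)


def absorptive_amplitude(omega, omega0, gamma, f0_over_m=1.0):
    return f0_over_m * gamma * omega / denominator(omega, omega0, gamma)


def steady_state_amplitude(omega, omega0, gamma, f0_over_m=1.0):
    return f0_over_m / math.sqrt(denominator(omega, omega0, gamma))


def mean_power(omega, omega0, gamma, f0=1.0, m=1.0):
    """<P> = (1/2) F0 omega A_abs."""
    return 0.5 * f0 * omega * absorptive_amplitude(omega, omega0, gamma, f0 / m)


OMEGA_0, GAMMA = 1.0, 0.2
grid = [0.5 + i * 1e-4 for i in range(10001)]
amplitude_peak = max(grid, key=lambda w: steady_state_amplitude(w, OMEGA_0, GAMMA))
power_peak = max(grid, key=lambda w: mean_power(w, OMEGA_0, GAMMA))
print(f"amplitude peaks near {amplitude_peak:.4f}, power peaks near {power_peak:.4f}")
```

In this sketch the displacement amplitude peaks slightly below $\omega_0$, at $\omega_R$, whereas the absorbed power peaks at $\omega_0$ itself.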

Wave propagation The wave equation was introduced and both travelling and standing wave solutions of the wave equation were discussed. Harmonic wave-form analysis, and the complementary time-sampled wave form analysis techniques, were introduced in this chapter and in the appendix. The relative merits of Fourier analysis and the digital Green's function waveform analysis were illustrated for signal processing.
The concepts of phase velocity, group velocity, and signal velocity were introduced. The phase velocity is given by
$$v_{phase} = \frac{\omega}{k} \tag{3.117}$$
and group velocity
$$v_{group} = \left(\frac{d\omega}{dk}\right)_{k_0} = v_{phase} + k\frac{\partial v_{phase}}{\partial k} \tag{3.128}$$
If the group velocity is frequency dependent then the information content of a wave packet travels at the signal velocity which can differ from the group velocity.
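The equivalence of the two expressions for the group velocity can be checked with finite differences. The sketch below uses the deep-water dispersion relation $\omega = \sqrt{gk}$ purely as an illustrative choice (it is not taken from this summary), for which the group velocity is half the phase velocity; the helper names and step size are assumptions.

```python
import math

G = 9.81  # m/s^2, assumed gravitational field


def omega_deep_water(k):
    """Illustrative dispersion relation omega(k) = sqrt(g k)."""
    return math.sqrt(G * k)


def derivative(f, x, rel_step=1e-5):
    """Central finite-difference estimate of df/dx."""
    h = rel_step * x
    return (f(x + h) - f(x - h)) / (2 * h)


def phase_velocity(k, dispersion):
    return dispersion(k) / k


def group_velocity_direct(k, dispersion):
    """d(omega)/dk evaluated at k."""
    return derivative(dispersion, k)


def group_velocity_from_phase(k, dispersion):
    """v_phase + k d(v_phase)/dk evaluated at k."""
    return (phase_velocity(k, dispersion)
            + k * derivative(lambda kk: phase_velocity(kk, dispersion), k))


k0 = 2 * math.pi / 10.0   # wavenumber of a 10 m wavelength
v_p = phase_velocity(k0, omega_deep_water)
v_g1 = group_velocity_direct(k0, omega_deep_water)
v_g2 = group_velocity_from_phase(k0, omega_deep_water)
print(f"v_phase = {v_p:.4f} m/s, v_group = {v_g1:.4f} m/s (direct), {v_g2:.4f} m/s (from phase)")
```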
The Wave-packet Uncertainty Principle implies that making a precise measurement of the frequency of a sinusoidal wave requires that the wave packet be infinitely long. The standard deviation $\sigma(t) = \sqrt{\langle t^2\rangle - \langle t\rangle^2}$ characterizing the width of the amplitude of the wavepacket spectral distribution in the angular frequency domain, $\sigma_A(\omega)$, and the corresponding width in time, $\sigma_A(t)$, are related by:
$$\sigma_A(t)\cdot\sigma_A(\omega) \geq 1 \qquad \text{(Relation between amplitude uncertainties.)}$$

The standard deviations for the spectral distribution and width of the intensity of the wave packet are related by:
$$\sigma_I(t)\cdot\sigma_I(\omega) \geq \frac{1}{2} \tag{3.134}$$
$$\sigma_I(x)\cdot\sigma_I(k_x) \geq \frac{1}{2} \qquad \sigma_I(y)\cdot\sigma_I(k_y) \geq \frac{1}{2} \qquad \sigma_I(z)\cdot\sigma_I(k_z) \geq \frac{1}{2}$$
This applies to all forms of wave motion, including sound waves, water waves, electromagnetic waves, and matter waves.
Workshop exercises
1. Given below is a list of statements followed by a list of reasons related to harmonic motion. For each of the statements, determine the reason(s) that make that statement true. You may do this in small groups or as one large group; the teaching assistant will decide what works best for your workshop.
Statements:

• We can neglect the higher order terms in the Taylor expansion of $F(x)$.
• The restoring force is a linear force.
• $F_0$ must vanish.
• $(dF/dx)_0$ is negative and $k$ is positive.
• We can write $F(x)$ as a Taylor series expansion.

Reasons:

• $F(x)$ depends only on $x$.
• A position of stable equilibrium exists and we call this point the origin of our coordinate system.
• $F(x)$ has continuous derivatives of all orders.
• The restoring force is directed toward the equilibrium position.
• We consider only small displacements.

2. Second-order ordinary diﬀerential equations are an important part of the physics of the harmonic oscillator.

(a) What do each of the following terms mean with respect to diﬀerential equations?
i. Ordinary
ii. Second-order
iii. Homogeneous
iv. Linear
(b) Give a mini-lesson on how to solve second-order diﬀerential equations by working through the following
examples. Don’t just provide a solution; explain the steps leading up to the solution.
i. $y'' + 5y' + 6y = 0$
ii. $y'' + y' + y = 0$
iii. $y'' + 4y' + 4y = 0$
iv.  00 −3 02
v. $y'' - 3y' - 4y = 2\sin x$

3. Harmonic oscillations occur for many diﬀerent types of systems and it is important to recognize when the
equations for harmonic motion apply. Three diﬀerent systems are described below. Each system can be
approximately described using the equations for harmonic motion. Break up into three groups—one group per
system. For your group’s system, answer the following questions:

(a) What approximations are necessary for this system to exhibit harmonic oscillations?
(b) What is the diﬀerential equation that governs the motion of this system? Use Newton’s second law to
arrive at this equation.
(c) What is the solution to the diﬀerential equation that you found in part (b)?
(d) What is the natural frequency of oscillations?

Here are the three systems:

• A mass $m$ is tied to a massless spring having a spring constant $k$. The system oscillates in one dimension along a horizontal frictionless surface.
• A particle of mass $m$ is attached to a weightless, extensionless rod to form a pendulum. The length of the rod is $l$ and the system oscillates in a single plane.
• A tube is bent into the shape of a U and is partially filled with a liquid of density $\rho$. The cross-sectional area of the tube is $A$ and the length of the tube filled with liquid is $L$. The liquid is initially displaced so that it is higher on one side of the tube than the other.

Once each group has answered all of the questions, share the results with the entire class.

4. Consider a mass $m$ attached to a spring of spring constant $k$. The spring is mounted horizontally so that the mass oscillates horizontally on a frictionless surface. The spring is attached to the wall on the right and the mass is initially moved to the right of its equilibrium position (compressing the spring) by a distance $A$ and released. Working individually, determine how (if at all) the period of the motion would be affected by each of the changes below. Once you have answered each part on your own, compare your answers with a classmate.

(a) The spring is replaced with a stiffer spring.
(b) The mass is initially displaced a distance $A$ to the left and released.
(c) The mass is replaced with a heavier mass.
(d) The mass is initially displaced a distance $B$ ($B \neq A$) to the right and released.

5. When you were first introduced to simple harmonic motion, you used the formula $m\ddot{x} = -kx$ to find the position of the oscillating mass as a function of time. This assumes that the origin is defined to be the equilibrium point. What happens if this is not the case? What would the equation of motion look like? How would the position of the oscillating mass as a function of time change?

6. For each of the situations described below, give a rough sketch of the state space diagram ($\dot{x}$ versus $x$) that represents the motion of each object. All of the motion takes place along the $x$-axis.

(a) An eggplant is at rest at a point on the $+x$ axis.
(b) A monkey on a skateboard skates with constant speed in the negative $x$ direction.
(c) A race car moving in the $+x$ direction undergoes constant acceleration until it abruptly stops.
(d) A cantaloupe undergoes simple harmonic motion. The initial location of the cantaloupe is at a point on the $+x$ axis.

7. Consider a simple harmonic oscillator consisting of a mass $m$ attached to a spring of spring constant $k$. For this oscillator $x(t) = A\sin(\omega_0t-\delta)$.

(a) Find an expression for $\dot{x}(t)$.
(b) Eliminate $t$ between $x(t)$ and $\dot{x}(t)$ to arrive at one equation similar to that for an ellipse.
(c) Rewrite the equation in part (b) in terms of $x$, $\dot{x}$, $k$, $m$, and the total energy $E$.
(d) Give a rough sketch of the phase space diagram ($\dot{x}$ versus $x$) for this oscillator. Also, on the same set of axes, sketch the phase space diagram for a similar oscillator with a total energy that is larger than the first oscillator.
(e) What direction are the paths that you have sketched? Explain your answer.
(f) Would diﬀerent trajectories for the same oscillator ever cross paths? Why or why not?

8. Consider a damped, driven oscillator consisting of a mass $m$ attached to a spring of spring constant $k$.

(a) What is the equation of motion for this system?

(b) Solve the equation in part (a). The solution consists of two parts, the complementary solution and the
particular solution. When might it be possible to safely neglect one part of the solution?
(c) What is the diﬀerence between amplitude resonance and kinetic energy resonance?
(d) How might phase space diagrams look for this type of oscillator? What variables would aﬀect the diagram?
9. A particle of mass $m$ is subject to the following force

$$\mathbf{F} = k\left(x^3 - 4x^2 + 3x\right)\hat{\mathbf{x}}$$
where $k$ is a constant.

(a) Determine the points where the particle is in equilibrium.
(b) Which of these points are stable and which are unstable?
(c) Is the motion bounded or unbounded?

10. A very long cylindrical shell has a mass density that depends upon the radial distance such that () = ,
where  is a constant. The inner radius of the shell is  and the outer radius is .

(a) Determine the direction and the magnitude of the gravitational field for all regions of space.
(b) If the gravitational potential is zero at the origin, what is the diﬀerence between the gravitational potential
at  =  and  = ?

11. A mass $m$ is constrained to move along one dimension. Two identical springs are attached to the mass, one on each side, and each spring is in turn attached to a wall. Both springs have the same spring constant $k$.

(a) Determine the frequency of the oscillation, assuming no damping.
(b) Now consider damping. It is observed that after $N$ oscillations, the amplitude of the oscillation has dropped to one-half of its initial value. Find an expression for the damping constant.
(c) How long does it take for the amplitude to decrease to one-quarter of its initial value?

12. Discuss the motion of a continuous string when plucked at one third of the length of the string. That is, the initial condition is $\dot{q}(x,0) = 0$, and
$$q(x,0) = \begin{cases} \frac{3A}{L}\,x & 0 \leq x \leq \frac{L}{3} \\ \frac{3A}{2L}\,(L-x) & \frac{L}{3} \leq x \leq L \end{cases}$$

13. When a particular driving force is applied to a stretched string it is observed that the string vibration is purely of the $n^{th}$ harmonic. Find the driving force.

14. Consider the two-mass system pivoted at its vertex where $M \neq m$. It undergoes oscillations of the angle $\theta$ with respect to the vertical in the plane of the triangle.

(Figure: triangle formed by the pivot and the two masses $M$ and $m$, with each side labelled $l$.)

(a) Determine the angular frequency of small oscillations.
(b) Use your result from part (a) to show $\omega^2 \approx \frac{g}{l}$ for $M \gg m$.
(c) Show that your result from part (a) agrees with $\omega^2 = \frac{U''(\theta_0)}{I}$ where $\theta_0$ is the equilibrium angle and $I$ is the moment of inertia.
(d) Assume the system has energy $E$. Set up an integral that determines the period of oscillation.

15. A cube of side  and mass  is immersed in water with density  past the point of equilibrium and then
released. Assume there is no damping due to the water.

(a) Show that the cube’s equation of motion is

2 
+  +  = 0
2
where  and  are constants. Determine  and  .
(b) The solution to the equation of motion is

 √ √
() = + 1 cos ( ) +  2 sin ( )

where 1 and 2 are constants. If (0) = −, determine ().

(c) Determine the period  of oscillation.

Problems
1. An unusual pendulum is made by fixing a string to a horizontal cylinder of radius $R$, wrapping the string several times around the cylinder, and then tying a mass $m$ to the loose end. In equilibrium the mass hangs a distance $l_0$ vertically below the edge of the cylinder. Find the potential energy if the pendulum has swung to an angle $\theta$ from the vertical. Show that for small angles, it can be written in the Hooke's Law form $U = \frac{1}{2}k\theta^2$. Comment on the value of $k$.

2. Consider the two-dimensional anisotropic oscillator with motion with $\omega_x = p\omega$ and $\omega_y = q\omega$.

a) Prove that if the ratio of the frequencies is rational (that is, $\frac{\omega_x}{\omega_y} = \frac{p}{q}$ where $p$ and $q$ are integers) then the motion is periodic. What is the period?
b) Prove that if the same ratio is irrational, the motion never repeats itself.

3. A simple pendulum consists of a mass $m$ suspended from a fixed point by a weightless, extensionless rod of length $l$.

a) Obtain the equation of motion, and in the approximation $\sin\theta \approx \theta$ show that the natural frequency is $\omega_0 = \sqrt{\frac{g}{l}}$, where $g$ is the gravitational field strength.
b) Discuss the motion in the event that the motion takes place in a viscous medium with retarding force $2m\sqrt{gl}\,\dot{\theta}$.
4. Derive the expression for the State Space paths of the plane pendulum if the total energy is $E > 2mgl$. Note that this is just the case of a particle moving in a periodic potential $U(\theta) = mgl(1-\cos\theta)$. Sketch the State Space diagram for both $E > 2mgl$ and $E < 2mgl$.

5. Consider the motion of a driven linearly-damped harmonic oscillator after the transient solution has died out, and suppose that it is being driven close to resonance, so that the driving frequency $\omega$ can be set equal to the resonance frequency.
a) Show that the oscillator's total energy is $E = \frac{1}{2}m\omega^2A^2$.
b) Show that the energy $\Delta E$ dissipated during one cycle by the damping force $\Gamma m\dot{x}$ is $\pi\Gamma m\omega A^2$.

6. Two masses m1 and m2 slide freely on a horizontal frictionless rail and are connected by a spring whose force
constant is k. Find the frequency of oscillatory motion for this system.

7. A particle of mass $m$ moves under the influence of a resistive force proportional to velocity and a potential $U$, that is,
$$F(x,\dot{x}) = -b\dot{x} - \frac{\partial U}{\partial x}$$
where $b > 0$ and $U(x) = (x^2-a^2)^2$.
a) Find the points of stable and unstable equilibrium.
b) Find the solution of the equations of motion for small oscillations around the stable equilibrium points.
c) Show that as $t \to \infty$ the particle approaches one of the stable equilibrium points for most choices of initial conditions. What are the exceptions? (Hint: You can prove this without finding the solutions explicitly.)
Chapter 4

Nonlinear systems and chaos

4.1 Introduction
In nature only a subset of systems have equations of motion that are linear. Contrary to the impression
given by the analytic solutions presented in undergraduate physics courses, most dynamical systems in
nature exhibit non-linear behavior that leads to complicated motion. Non-linear equations
usually do not have analytic solutions, superposition does not apply, and they predict phenomena such as
attractors, discontinuous period bifurcation, extreme sensitivity to initial conditions, rolling motion, and
chaos. During the past four decades, exciting discoveries have been made in classical mechanics that are
associated with the recognition that nonlinear systems can exhibit chaos. Chaotic phenomena have been
observed in most fields of science and engineering, such as weather patterns, fluid flow, motion of planets in
the solar system, epidemics, changing populations of animals, birds and insects, and the motion of electrons
in atoms. The complicated dynamical behavior predicted by non-linear differential equations is not limited
to classical mechanics, rather it is a manifestation of the mathematical properties of the solutions of the
differential equations involved, and thus is generally applicable to solutions of first or second-order non-
linear differential equations. It is important to understand that the systems discussed in this chapter follow
a fully deterministic evolution predicted by the laws of classical mechanics, the evolution for which is based
on the prior history. This behavior is completely different from a random walk where each step is based on a
random process. The complicated motion of deterministic non-linear systems stems in part from sensitivity
to the initial conditions.
The French mathematician Poincaré is credited with being the first to recognize the existence of chaos
during his investigation of the gravitational three-body problem in celestial mechanics. At the end of the
nineteenth century Poincaré noticed that such systems exhibit high sensitivity to initial conditions characteristic
of chaotic motion, and the existence of nonlinearity which is required to produce chaos. Poincaré's work
received little notice, in part because it was overshadowed by the parallel development of the Theory of Relativity
and quantum mechanics at the start of the 20th century. In addition, solving nonlinear equations of motion
is difficult, which discouraged work on nonlinear mechanics and chaotic motion. The field blossomed during
the 1960s when computers became sufficiently powerful to solve the nonlinear equations required to calculate
the long-time histories necessary to document the evolution of chaotic behavior. Laplace, and many other
scientists, believed in the deterministic view of nature which assumes that if the position and velocities of
all particles are known, then one can unambiguously predict the future motion using Newtonian mechanics.
Researchers in many fields of science now realize that this “clockwork universe” is invalid. That is, knowing
the laws of nature can be insufficient to predict the evolution of nonlinear systems in that the time evolu-
tion can be extremely sensitive to the initial conditions even though they follow a completely deterministic
development. There are two major classifications of nonlinear systems that lead to chaos in nature. The
first classification encompasses nondissipative Hamiltonian systems such as Poincaré’s three-body celestial
mechanics system. The other main classification involves driven, damped, non-linear oscillatory systems.
Nonlinearity and chaos is a broad and active field and thus this chapter will focus only on a few examples
that illustrate the general features of non-linear systems. Weak non-linearity is used to illustrate bifurcation
and asymptotic attractor solutions for which the system evolves independent of the initial conditions. The
common sinusoidally-driven linearly-damped plane pendulum illustrates several features characteristic of the
evolution of a non-linear system from order to chaos. The impact of non-linearity on wavepacket propagation
velocities and the existence of soliton solutions is discussed. The example of the three-body problem is
discussed in chapter 11. The transition from laminar flow to turbulent flow is illustrated by fluid mechanics
discussed in chapter 16.8. Analytic solutions of nonlinear systems usually are not available and thus one
must resort to computer simulations. As a consequence the present discussion focuses on the main features
of the solutions for these systems and ignores how the equations of motion are solved.

4.2 Weak nonlinearity

Most physical oscillators become non-linear with increase in amplitude of the oscillations. Consequences
of non-linearity include breakdown of superposition, introduction of additional harmonics, and complicated
chaotic motion that has great sensitivity to the initial conditions, as illustrated in this chapter. Weak
non-linearity is interesting since perturbation theory can be used to solve the non-linear equations of motion.
The potential energy function for a linear oscillator has a pure parabolic shape about the minimum
location, that is, $U = \frac{1}{2}k(x - x_0)^2$ where $x_0$ is the location of the minimum. Weakly non-linear systems have
small amplitude oscillations $\Delta x$ about the minimum, allowing use of the Taylor expansion

$$U(\Delta x) = U(x_0) + \Delta x\,\frac{dU(x_0)}{dx} + \frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2} + \frac{\Delta x^3}{3!}\frac{d^3U(x_0)}{dx^3} + \frac{\Delta x^4}{4!}\frac{d^4U(x_0)}{dx^4} + \dots \tag{4.1}$$

By definition, at the minimum $\frac{dU(x_0)}{dx} = 0$ and thus equation 4.1 can be written as

$$\Delta U = U(\Delta x) - U(x_0) = \frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2} + \frac{\Delta x^3}{3!}\frac{d^3U(x_0)}{dx^3} + \frac{\Delta x^4}{4!}\frac{d^4U(x_0)}{dx^4} + \dots \tag{4.2}$$

For small amplitude oscillations the system is linear when only the second-order $\frac{\Delta x^2}{2!}\frac{d^2U(x_0)}{dx^2}$ term in equation
4.2 is significant. The linearity for small amplitude oscillations greatly simplifies description of the oscillatory
motion in that superposition applies, and complicated chaotic motion is avoided. For slightly larger amplitude
motion, where the higher-order terms in the expansion are still much smaller than the second-order term,
perturbation theory can be used, as illustrated by the simple plane pendulum, which is non-linear since
the restoring force equals

$$mg\sin\theta \simeq mg\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \frac{\theta^7}{7!} + \dots\right) \tag{4.3}$$

This is linear only at very small angles where the higher-order terms in the expansion can be neglected.
Consider the equation of motion at small amplitudes for the harmonically-driven, linearly-damped plane
pendulum

$$\ddot{\theta} + \Gamma\dot{\theta} + \omega_0^2\sin\theta = \ddot{\theta} + \Gamma\dot{\theta} + \omega_0^2\left(\theta - \frac{\theta^3}{6}\right) = F_0\cos(\omega t) \tag{4.4}$$

where only the first two terms in the expansion 4.3 have been included. It was shown in chapter 3 that when
$\sin\theta \approx \theta$ then the steady-state solution of equation 4.4 is of the form

$$\theta(t) = A\cos(\omega t - \delta) \tag{4.5}$$

Insert this first-order solution into equation 4.4, then the cubic term in the expansion gives a term
$\cos^3\omega t = \frac{1}{4}(\cos 3\omega t + 3\cos\omega t)$. Thus the perturbation expansion to third order involves a solution of the form

$$\theta(t) = A\cos(\omega t - \delta) + B\cos 3(\omega t - \delta) \tag{4.6}$$

This perturbation solution shows that the non-linear term has distorted the signal by addition of the third
harmonic of the driving frequency with an amplitude that depends sensitively on $\theta$. This illustrates that the
superposition principle is not obeyed for this non-linear system, but, if the non-linearity is weak, perturbation
theory can be used to derive the solution of a non-linear equation of motion.
Figure 4.1 illustrates that for a potential $U(x) = 2x^2 + x^4$ the $x^4$ non-linear term is greatest at the
maximum amplitude $x$, which makes the total energy contours in state-space more rectangular than the
elliptical shape for the harmonic oscillator as shown in figure 3.3. The solution is of the form given in
equation 4.6.

Figure 4.1: The left side shows the potential energy for a symmetric potential $U(x) = 2x^2 + x^4$. The right
side shows the contours of constant total energy on a state-space diagram.

4.1 Example: Non-linear oscillator

Assume that a non-linear oscillator has a potential given by

$$U(x) = \frac{kx^2}{2} - \frac{m\lambda x^3}{3}$$

where $\lambda$ is small. Find the solution of the equation of motion to first order in $\lambda$, assuming $x = 0$ at $t = 0$.
The equation of motion for the nonlinear oscillator is

$$m\ddot{x} = -\frac{dU}{dx} = -kx + m\lambda x^2$$

If the $m\lambda x^2$ term is neglected, then the second-order equation of motion reduces to a normal linear oscillator
with

$$x_0 = A\sin(\omega_0 t + \varphi)$$

where

$$\omega_0 = \sqrt{\frac{k}{m}}$$

Assume that the solution to first order in $\lambda$ has the form

$$x = x_0 + \lambda x_1$$

where $x_1$ is the first-order correction. Substituting this into the equation of motion, neglecting terms of higher order than $\lambda$, and taking $\varphi = 0$ as required by $x = 0$ at $t = 0$, gives

$$\ddot{x}_1 + \omega_0^2 x_1 = x_0^2 = \frac{A^2}{2}\left[1 - \cos(2\omega_0 t)\right]$$

To solve this try a particular integral

$$x_1 = B + C\cos(2\omega_0 t)$$

Substituting this into the equation of motion gives

$$-3\omega_0^2 C\cos(2\omega_0 t) + \omega_0^2 B = \frac{A^2}{2} - \frac{A^2}{2}\cos(2\omega_0 t)$$

Comparison of the coefficients gives

$$B = \frac{A^2}{2\omega_0^2} \qquad C = \frac{A^2}{6\omega_0^2}$$

The homogeneous equation is

$$\ddot{x}_1 + \omega_0^2 x_1 = 0$$

which has a solution of the form

$$x_1 = D_1\sin(\omega_0 t) + D_2\cos(\omega_0 t)$$

Thus combining the particular and homogeneous solutions gives

$$x = (A + \lambda D_1)\sin(\omega_0 t) + \lambda\left[\frac{A^2}{2\omega_0^2} + D_2\cos(\omega_0 t) + \frac{A^2}{6\omega_0^2}\cos(2\omega_0 t)\right]$$

The initial condition $x = 0$ at $t = 0$ then gives

$$D_2 = -\frac{2A^2}{3\omega_0^2}$$

and

$$x = (A + \lambda D_1)\sin(\omega_0 t) + \frac{\lambda A^2}{\omega_0^2}\left[\frac{1}{2} - \frac{2}{3}\cos(\omega_0 t) + \frac{1}{6}\cos(2\omega_0 t)\right]$$

The constant $(A + \lambda D_1)$ is given by the initial amplitude and velocity.
This system is nonlinear in that the output amplitude is not proportional to the input amplitude. Secondly,
a second harmonic component is introduced in the output waveform; that is, for a non-linear
system the gain and frequency decomposition of the output differ from those of the input. Note that the frequency
composition is amplitude dependent. This particular example of a nonlinear system does not exhibit chaos.
The Laboratory for Laser Energetics uses nonlinear crystals to double the frequency of laser light.
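
The first-order result of this example can be checked numerically. The following is a minimal sketch, not part of the original example: it integrates $\ddot{x} = -\omega_0^2 x + \lambda x^2$ with a hand-written fourth-order Runge-Kutta step and compares the result with the zeroth-order and first-order expressions. The values $A = 1$, $\omega_0 = 1$, $\lambda = 0.05$, the choice $D_1 = 0$, and all function names are illustrative assumptions.

```python
import math

# Illustrative values (assumptions, not taken from the text)
A, OMEGA0, LAM = 1.0, 1.0, 0.05

def derivs(x, v):
    """Equation of motion x'' = -omega0^2 x + lambda x^2 as a first-order system."""
    return v, -OMEGA0**2 * x + LAM * x**2

def rk4_step(x, v, h):
    """One classical fourth-order Runge-Kutta step for the autonomous system."""
    k1x, k1v = derivs(x, v)
    k2x, k2v = derivs(x + h / 2 * k1x, v + h / 2 * k1v)
    k3x, k3v = derivs(x + h / 2 * k2x, v + h / 2 * k2v)
    k4x, k4v = derivs(x + h * k3x, v + h * k3v)
    x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
    v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return x, v

def x_zeroth(t):
    """Linear-oscillator solution with phi = 0."""
    return A * math.sin(OMEGA0 * t)

def x_first(t):
    """First-order perturbation solution with D1 = 0 (assumed)."""
    bracket = (0.5 - (2.0 / 3.0) * math.cos(OMEGA0 * t)
               + (1.0 / 6.0) * math.cos(2 * OMEGA0 * t))
    return A * math.sin(OMEGA0 * t) + LAM * A**2 / OMEGA0**2 * bracket

def max_errors(t_end=10.0, h=0.001):
    """Largest deviation of each approximation from the numerical solution."""
    x, v = 0.0, A * OMEGA0   # x(0) = 0; the velocity matches x_first at t = 0
    err0 = err1 = 0.0
    steps = int(round(t_end / h))
    for i in range(steps):
        x, v = rk4_step(x, v, h)
        t = (i + 1) * h
        err0 = max(err0, abs(x - x_zeroth(t)))
        err1 = max(err1, abs(x - x_first(t)))
    return err0, err1

err0, err1 = max_errors()
print(f"max error, zeroth order: {err0:.4f}")
print(f"max error, first order:  {err1:.4f}")
```

Under these assumptions the first-order expression should track the numerical solution noticeably better than the linear solution over the first few periods, with the residual arising from the neglected terms of order $\lambda^2$.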

4.3 Bifurcation, and point attractors

Interesting new phenomena, such as bifurcation, and attractors, occur when the non-linearity is large. In
chapter 3 it was shown that the state-space diagram $(\dot{x}, x)$ for an undamped harmonic oscillator is an
ellipse with dimensions defined by the total energy of the system. As shown in figure 3.5 for the damped
harmonic oscillator, the state-space diagram spirals inwards to the origin due to dissipation of energy.
Non-linearity distorts the shape of the ellipse or spiral on the state-space diagram, and thus the state-space, or
corresponding phase-space, diagrams provide useful representations of the motion of linear and non-linear
periodic systems.
The complicated motion of non-linear systems makes it necessary to distinguish between transient and
asymptotic behavior. The damped harmonic oscillator executes a transient spiral motion that asymptotically
approaches the origin. The transient behavior depends on the initial conditions, whereas the asymptotic limit
of the steady-state solution is a specific location, which is called a point attractor. The point attractor for
damped motion in the anharmonic potential well

$$U(x) = 2x^2 + x^4 \tag{4.7}$$

is at the minimum, which is the origin of the state-space diagram as shown in figure 4.1.
The more complicated one-dimensional potential well

$$U(x) = 8 - 4x^2 + 0.5x^4 \tag{4.8}$$

shown in figure 4.2 has two minima that are symmetric about $x = 0$ with a saddle of height 8.
The kinetic plus potential energies of a particle with mass $m = 2$ released in this potential will be
assumed to be given by

$$E(x, \dot{x}) = \dot{x}^2 + U(x) \tag{4.9}$$

The state-space plot in figure 4.2 shows contours of constant energy with the minima at $(x, \dot{x}) = (\pm 2, 0)$.
At slightly higher total energy the contours are closed loops around either of the two minima at $x = \pm 2$.
At total energies above the saddle energy of 8 the contours are peanut-shaped and are symmetric about
the origin. Assuming that the motion is weakly damped, then a particle released with total energy $E_{total}$
which is higher than $E_{saddle}$ will follow a peanut-shaped spiral trajectory centered at $(x, \dot{x}) = (0, 0)$ in the
state-space diagram for $E_{total} > E_{saddle}$. For $E_{total} < E_{saddle}$ there are two separate solutions for the two
minima centered at $x = \pm 2$ and $\dot{x} = 0$. This is an example of bifurcation where the one solution for
$E_{total} > E_{saddle}$ bifurcates into either of the two solutions for $E_{total} < E_{saddle}$.

Figure 4.2: The left side shows the potential energy for a bimodal symmetric potential $U(x) = 8 - 4x^2 + 0.5x^4$.
The right-hand figure shows contours of the sum of kinetic and potential energies on a state-space
diagram. For total energies above the saddle point the particle follows peanut-shaped trajectories in state-space
centered around $(x, \dot{x}) = (0, 0)$. For total energies below the saddle point the particle will have closed
trajectories about either of the two symmetric minima located at $(x, \dot{x}) = (\pm 2, 0)$. Thus the system solution
bifurcates when the total energy is below the saddle point.

For an initial total energy $E_{total} < E_{saddle}$ damping will result in spiral trajectories of the particle that
will be trapped in one of the two minima. For $E_{total} > E_{saddle}$ the particle trajectories are centered on the origin, giving
the impression that they will terminate at $(x, \dot{x}) = (0, 0)$ when the kinetic energy is dissipated. However, once
$E_{total} < E_{saddle}$ the particle will be trapped in one of the two minima and the trajectory will terminate
at the bottom of that potential energy minimum occurring at $(x, \dot{x}) = (\pm 2, 0)$. These two possible terminal
points of the trajectory are called point attractors. This example appears to have a single attractor for
$E_{total} > E_{saddle}$ which bifurcates leading to two attractors at $(x, \dot{x}) = (\pm 2, 0)$ for $E_{total} < E_{saddle}$. The
determination as to which minimum traps a given particle depends on exactly where the particle starts in
state space, the damping, etc. That is, for this case, where there is symmetry about the $x$-axis, if the
particle has an initial total energy $E_{total} > E_{saddle}$ then the initial conditions within $\pi$ radians of state space
will lead to trajectories that are trapped in the left minimum, and those in the other $\pi$ radians of state space will be
trapped in the right minimum. Trajectories starting near the split between these two halves of the starting
state space will be sensitive to the exact starting phase. This is an example of sensitivity to initial conditions.
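
The trapping of a damped particle in one of the two point attractors can be illustrated with a short simulation. This is a sketch under stated assumptions: the text does not specify the damping, so a linear damping force $-b\dot{x}$ with the hypothetical value $b = 1$ is used, together with $m = 2$ and the potential of equation 4.8. The function names are illustrative.

```python
MASS = 2.0      # mass used in the text
B_DAMP = 1.0    # hypothetical linear damping coefficient (assumption)

def total_energy(x, v):
    """Equation 4.9: E = v^2 + U(x) with U(x) = 8 - 4x^2 + 0.5x^4."""
    return v**2 + 8.0 - 4.0 * x**2 + 0.5 * x**4

def derivs(x, v):
    """m x'' = -dU/dx - b x' written as a first-order system."""
    force = 8.0 * x - 2.0 * x**3 - B_DAMP * v
    return v, force / MASS

def settle(x, v, t_end=100.0, h=0.01):
    """Integrate with fourth-order Runge-Kutta and return the final state."""
    for _ in range(int(round(t_end / h))):
        k1x, k1v = derivs(x, v)
        k2x, k2v = derivs(x + h / 2 * k1x, v + h / 2 * k1v)
        k3x, k3v = derivs(x + h / 2 * k2x, v + h / 2 * k2v)
        k4x, k4v = derivs(x + h * k3x, v + h * k3v)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return x, v

# Two mirror-image starts, both above the saddle energy of 8
start_a, start_b = (0.0, 3.0), (0.0, -3.0)
final_a = settle(*start_a)
final_b = settle(*start_b)
print("initial energy:", total_energy(*start_a))
print("final states:", final_a, final_b)
```

Because the equation of motion is unchanged under $(x, \dot{x}) \to (-x, -\dot{x})$, the two mirror-image starting points must end in opposite minima, which is a simple way to see both attractors at $(\pm 2, 0)$.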

4.4 Limit cycles

4.4.1 Poincaré-Bendixson theorem
Coupled first-order differential equations in two dimensions of the form

$$\dot{x} = f(x, y) \qquad \dot{y} = g(x, y) \tag{4.10}$$

occur frequently in physics. The state-space paths do not cross for such two-dimensional autonomous systems,
where an autonomous system is not explicitly dependent on time.
The Poincaré-Bendixson theorem states that paths in state space, and phase space, can have three possible behaviors:
(1) closed paths, like the elliptical paths for the undamped harmonic oscillator,
(2) paths that terminate at an equilibrium point as $t \to \infty$, like the point attractor for a damped harmonic oscillator,
(3) paths that tend to a limit cycle as $t \to \infty$.
The limit cycle is unusual in that the periodic motion tends asymptotically to the limit-cycle attractor
independent of whether the initial values are inside or outside the limit cycle. The balance of dissipative forces
and driving forces often leads to limit-cycle attractors, especially in biological applications. Identification of
limit-cycle attractors, as well as the trajectories of the motion towards these limit-cycle attractors, is more
complicated than for point attractors.

Figure 4.3: The Poincaré-Bendixson theorem allows the following three scenarios for two-dimensional
autonomous systems. (1) Closed paths as illustrated by the undamped harmonic oscillator. (2) Terminate at
an equilibrium point as $t \to \infty$, as illustrated by the damped harmonic oscillator, and (3) Tend to a limit
cycle as $t \to \infty$ as illustrated by the van der Pol oscillator.

4.4.2 van der Pol damped harmonic oscillator:

The van der Pol damped harmonic oscillator illustrates a non-linear equation that leads to a well-studied,
limit-cycle attractor that has important applications in diverse fields. The van der Pol oscillator has an
equation of motion given by

$$\frac{d^2x}{dt^2} + \mu\left(x^2 - 1\right)\frac{dx}{dt} + \omega_0^2 x = 0 \tag{4.11}$$

The non-linear $\mu\left(x^2 - 1\right)\frac{dx}{dt}$ damping term is unusual in that the sign changes when $|x| = 1$, leading to
positive damping for $|x| > 1$ and negative damping for $|x| < 1$. To simplify equation 4.11 assume that the term
$\omega_0^2 x = x$, that is, $\omega_0^2 = 1$.
This equation was studied extensively during the 1920s and 1930s by the Dutch engineer, Balthazar
van der Pol, for describing electronic circuits that incorporate feedback. The form of the solution can be
simplified by defining a variable $y \equiv \frac{dx}{dt}$. Then the second-order equation 4.11 can be expressed as two
coupled first-order equations.

$$y \equiv \frac{dx}{dt} \tag{4.12}$$

$$\frac{dy}{dt} = -x - \mu\left(x^2 - 1\right)y \tag{4.13}$$

It is advantageous to transform the $(\dot{x}, x)$ state space to polar coordinates by setting

$$x = r\cos\theta \qquad y = r\sin\theta \tag{4.14}$$

and using the fact that $r^2 = x^2 + y^2$. Therefore

$$r\frac{dr}{dt} = x\frac{dx}{dt} + y\frac{dy}{dt} \tag{4.15}$$

Similarly for the angle coordinate

$$\frac{dx}{dt} = \frac{dr}{dt}\cos\theta - r\sin\theta\frac{d\theta}{dt} \tag{4.16}$$

$$\frac{dy}{dt} = \frac{dr}{dt}\sin\theta + r\cos\theta\frac{d\theta}{dt} \tag{4.17}$$

Multiplying equation 4.16 by $y$ and 4.17 by $x$ and subtracting gives

$$r^2\frac{d\theta}{dt} = x\frac{dy}{dt} - y\frac{dx}{dt} \tag{4.18}$$

Figure 4.4: Solutions of the van der Pol system for $\mu = 0.2$ (top row) and $\mu = 5$ (bottom row), assuming that
$\omega_0^2 = 1$. The left column shows the time dependence $x(t)$. The right column shows the corresponding $(x, \dot{x})$
state space plots. Upper: Weak nonlinearity, $\mu = 0.2$; at large times the solution tends to one limit
cycle for initial values inside or outside the limit cycle attractor. The amplitude $x(t)$ for two initial
conditions approaches an approximately harmonic oscillation. Lower: Strong nonlinearity, $\mu = 5$; solutions
approach a common limit cycle attractor for initial values inside or outside the limit cycle attractor while
the amplitude $x(t)$ approaches a common approximate square-wave oscillation.

Equations 415 and 418 allow the van der Pol equations of motion to be written in polar coordinates
 ¡ ¢
= − 2 cos2  − 1  sin2  (4.19)

 ¡ ¢
= −1 −  2 cos2  − 1 sin  cos  (4.20)

The non-linear terms on the right-hand side of equations 419 − 20 have a complicated form.
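
The polar form can be checked against the Cartesian equations 4.12 and 4.13 at a set of sample points. The sketch below is illustrative only; the value $\mu = 0.7$, the sample grid, and the function names are assumptions chosen for the check.

```python
import math

def cartesian_rates(x, y, mu):
    """Equations 4.12 and 4.13 with omega_0^2 = 1: returns (dx/dt, dy/dt)."""
    return y, -x - mu * (x * x - 1.0) * y

def polar_rates(r, theta, mu):
    """Equations 4.19 and 4.20: returns (dr/dt, dtheta/dt)."""
    bracket = r * r * math.cos(theta) ** 2 - 1.0
    r_dot = -mu * bracket * r * math.sin(theta) ** 2
    theta_dot = -1.0 - mu * bracket * math.sin(theta) * math.cos(theta)
    return r_dot, theta_dot

def max_mismatch(mu=0.7):
    """Largest difference between the two formulations over a sample grid."""
    worst = 0.0
    for i in range(1, 6):
        r = 0.5 * i
        for j in range(12):
            theta = 2.0 * math.pi * j / 12 + 0.1
            x, y = r * math.cos(theta), r * math.sin(theta)
            x_dot, y_dot = cartesian_rates(x, y, mu)
            r_dot_c = (x * x_dot + y * y_dot) / r              # equation 4.15
            theta_dot_c = (x * y_dot - y * x_dot) / (r * r)    # equation 4.18
            r_dot_p, theta_dot_p = polar_rates(r, theta, mu)
            worst = max(worst, abs(r_dot_c - r_dot_p),
                        abs(theta_dot_c - theta_dot_p))
    return worst

print("largest mismatch:", max_mismatch())
```

The mismatch should be at the level of floating-point rounding, which confirms that equations 4.19 and 4.20 follow from 4.13 through the relations 4.15 and 4.18.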

Weak non-linearity: $\mu \ll 1$
In the limit that $\mu \to 0$, equations 4.19 and 4.20 correspond to a circular state-space trajectory similar to the
harmonic oscillator. That is, the solution is of the form

$$x(t) = \rho\sin(t - t_0) \tag{4.21}$$

where $\rho$ and $t_0$ are arbitrary parameters. For weak non-linearity, $\mu \ll 1$, the angular equation 4.20 has a
rotational frequency that is unity since the $\sin\theta\cos\theta$ term changes sign twice per period, in addition to the
small value of $\mu$. For $\mu \ll 1$ and $r < 1$ the $\left(r^2\cos^2\theta - 1\right)$ bracket in the radial equation 4.19
is negative, so $\frac{dr}{dt}$ is positive and the radius increases monotonically. For large $r$ the bracket is predominantly
positive, resulting in a spiral decrease in the radius. Averaging equation 4.19 over one rotation gives
$\left\langle \frac{dr}{dt} \right\rangle = \frac{\mu r}{8}\left(4 - r^2\right)$, which vanishes at $r = 2$. Thus, for very weak non-linearity, this radial behavior
results in the amplitude spiralling to a well defined limit-cycle attractor value of $\rho = 2$ as illustrated by
the state-space plots in figure 4.4 for cases where the initial condition is inside or external to the circular
attractor. The final amplitude for different initial conditions also approaches the same asymptotic behavior.
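
The approach to the limit cycle from inside and outside can be reproduced with a short integration of equations 4.12 and 4.13. This is a minimal sketch rather than the calculation used for figure 4.4; the starting points $(0.1, 0)$ and $(4, 0)$, the step size, and the function name are assumptions.

```python
def vdp_amplitude(x, y, mu, t_end=150.0, h=0.01):
    """Integrate the van der Pol system (omega_0^2 = 1) with fourth-order
    Runge-Kutta and return max |x| over the final 10 time units, which
    span more than one period of the weakly non-linear oscillation."""
    def f(x, y):
        return y, -x - mu * (x * x - 1.0) * y

    steps = int(round(t_end / h))
    tail = int(round(10.0 / h))
    amplitude = 0.0
    for i in range(steps):
        k1x, k1y = f(x, y)
        k2x, k2y = f(x + h / 2 * k1x, y + h / 2 * k1y)
        k3x, k3y = f(x + h / 2 * k2x, y + h / 2 * k2y)
        k4x, k4y = f(x + h * k3x, y + h * k3y)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        if i >= steps - tail:
            amplitude = max(amplitude, abs(x))
    return amplitude

inside = vdp_amplitude(0.1, 0.0, mu=0.2)    # starts inside the limit cycle
outside = vdp_amplitude(4.0, 0.0, mu=0.2)   # starts outside the limit cycle
print(f"final amplitude from inside:  {inside:.3f}")
print(f"final amplitude from outside: {outside:.3f}")
```

Both starting points should settle to an amplitude close to 2, consistent with the limit-cycle value quoted above for weak non-linearity.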

Dominant non-linearity: $\mu \gg 1$
For the case where the non-linearity is dominant, that is $\mu \gg 1$, then as shown in figure 4.4, the system
approaches a well defined attractor, but in this case it has a significantly skewed shape in state-space, while
the amplitude approximates a square wave. The solution remains close to $x = +2$ until $y = \dot{x} \approx +7$ and
then it relaxes quickly to $x = -2$ with $y = \dot{x} \approx 0$. This is followed by the mirror image. This behavior is
called a relaxed vibration in that a tension builds up slowly then dissipates by a sudden relaxation process.
The seesaw is an extreme example of a relaxation oscillator where the seesaw angle switches spontaneously
from one solution to the other when the difference in their moment arms changes sign.
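The study of feedback in electronic circuits was the stimulus for study of this equation by van der
Pol. However, Lord Rayleigh first identified such relaxation oscillator behavior in 1880 during studies of
vibrations of a stringed instrument excited by a bow, or the squeaking of a brake drum. In his discussion of
non-linear effects in acoustics, he derived the equation

$$\ddot{x} - \left(\epsilon - \dot{x}^2\right)\dot{x} + \omega_0^2 x = 0 \tag{4.22}$$

Differentiation of Rayleigh’s equation 4.22 gives

$$\dddot{x} - \left(\epsilon - 3\dot{x}^2\right)\ddot{x} + \omega_0^2\dot{x} = 0 \tag{4.23}$$

Using the substitution

$$y = y_0\sqrt{\frac{3}{\epsilon}}\,\dot{x} \tag{4.24}$$

leads to the relations

$$\dot{x} = \sqrt{\frac{\epsilon}{3}}\frac{y}{y_0} \qquad \ddot{x} = \sqrt{\frac{\epsilon}{3}}\frac{\dot{y}}{y_0} \qquad \dddot{x} = \sqrt{\frac{\epsilon}{3}}\frac{\ddot{y}}{y_0} \tag{4.25}$$

Substituting these relations into equation 4.23 gives

$$\sqrt{\frac{\epsilon}{3}}\frac{\ddot{y}}{y_0} - \sqrt{\frac{\epsilon}{3}}\left[\epsilon - \frac{3\epsilon}{3}\frac{y^2}{y_0^2}\right]\frac{\dot{y}}{y_0} + \omega_0^2\sqrt{\frac{\epsilon}{3}}\frac{y}{y_0} = 0 \tag{4.26}$$

Multiplying by $y_0\sqrt{\frac{3}{\epsilon}}$ and rearranging leads to the van der Pol equation

$$\ddot{y} - \frac{\epsilon}{y_0^2}\left(y_0^2 - y^2\right)\dot{y} + \omega_0^2 y = 0 \tag{4.27}$$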
The rhythm of a heartbeat driven by a pacemaker is an important application where the self-stabilization of
the attractor is a desirable characteristic to stabilize an irregular heartbeat; the medical term is arrhythmia.
The mechanism that leads to synchronization of the many pacemaker cells in the heart and human body due
to the influence of an implanted pacemaker is discussed in chapter 14.12. Another biological application of
limit cycles is the time variation of animal populations.
In summary the non-linear damping of the van der Pol oscillator leads to a self-stabilized, single
limit-cycle attractor that is insensitive to the initial conditions. The van der Pol oscillator has many important
applications such as bowed musical instruments, electrical circuits, and human anatomy as mentioned above.
The van der Pol oscillator illustrates the complicated manifestations of the motion that can be exhibited by
non-linear systems.

4.5 Harmonically-driven, linearly-damped, plane pendulum

The harmonically-driven, linearly-damped, plane pendulum illustrates many of the phenomena exhibited by
non-linear systems as they evolve from ordered to chaotic motion. It illustrates the remarkable fact that
determinism does not imply either regular behavior or predictability. The well-known, harmonically-driven
linearly-damped pendulum provides an ideal basis for an introduction to non-linear dynamics¹.
Consider a harmonically-driven linearly-damped plane pendulum of moment of inertia $I$ and mass $m$ in
a gravitational field that is driven by a torque due to a force $F(t) = F_D\cos\omega t$ acting at a moment arm $L$.
The damping term is $b$ and the angular displacement of the pendulum, relative to the vertical, is $\theta$. The
equation of motion of the harmonically-driven linearly-damped simple pendulum can be written as

$$I\ddot{\theta} + b\dot{\theta} + mgL\sin\theta = LF_D\cos\omega t \tag{4.28}$$

Note that the sinusoidal restoring force for the plane pendulum is non-linear for large angles $\theta$. The natural
angular frequency of the free pendulum at small amplitude is

$$\omega_0 = \sqrt{\frac{mgL}{I}} \tag{4.29}$$

A dimensionless parameter $\gamma$, which is called the drive strength, is defined by

$$\gamma \equiv \frac{F_D}{mg} \tag{4.30}$$

The equation of motion 4.28 can be generalized by introducing dimensionless units for both time $\tilde{t}$ and
relative drive frequency $\tilde{\omega}$ defined by

$$\tilde{t} \equiv \omega_0 t \qquad \tilde{\omega} \equiv \frac{\omega}{\omega_0} \tag{4.31}$$

In addition, define the inverse damping factor $Q$ as

$$Q \equiv \frac{I\omega_0}{b} \tag{4.32}$$

These definitions allow equation 4.28 to be written in the dimensionless form

$$\frac{d^2\theta}{d\tilde{t}^2} + \frac{1}{Q}\frac{d\theta}{d\tilde{t}} + \sin\theta = \gamma\cos\tilde{\omega}\tilde{t} \tag{4.33}$$

The behavior of the angle $\theta$ for the driven damped plane pendulum depends on the drive strength $\gamma$
and the damping factor $Q$. Consider the case where equation 4.33 is evaluated assuming that the damping
coefficient $Q = 2$, and that the relative angular frequency $\tilde{\omega} = \frac{2}{3}$, which is close to resonance where chaotic
phenomena are manifest. The Runge-Kutta method is used to solve this non-linear equation of motion.
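
A minimal sketch of such a Runge-Kutta calculation is given below. It is not the code used to produce the figures in this chapter; it is a plain fourth-order Runge-Kutta integration of equation 4.33 written as two first-order equations, with $Q = 2$ and $\tilde{\omega} = 2/3$ as in the text. The function names, the step size, and the number of drive periods are assumptions.

```python
import math

def rhs(t, theta, omega, gamma, q, drive):
    """Equation 4.33 as a first-order system: returns (dtheta/dt, domega/dt)."""
    return omega, -omega / q - math.sin(theta) + gamma * math.cos(drive * t)

def rk4_step(t, theta, omega, h, gamma, q, drive):
    """One classical fourth-order Runge-Kutta step in dimensionless time."""
    k1 = rhs(t, theta, omega, gamma, q, drive)
    k2 = rhs(t + h / 2, theta + h / 2 * k1[0], omega + h / 2 * k1[1], gamma, q, drive)
    k3 = rhs(t + h / 2, theta + h / 2 * k2[0], omega + h / 2 * k2[1], gamma, q, drive)
    k4 = rhs(t + h, theta + h * k3[0], omega + h * k3[1], gamma, q, drive)
    theta += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    omega += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return theta, omega

def simulate(gamma, q=2.0, drive=2.0 / 3.0, theta=0.0, omega=0.0,
             periods=30, steps_per_period=1000):
    """Return a list of (time, theta, omega) samples over whole drive periods."""
    h = 2.0 * math.pi / drive / steps_per_period
    history = []
    for i in range(periods * steps_per_period):
        theta, omega = rk4_step(i * h, theta, omega, h, gamma, q, drive)
        history.append(((i + 1) * h, theta, omega))
    return history

def steady_amplitude(history, steps_per_period=1000):
    """Largest |theta| during the final drive period."""
    return max(abs(theta) for _, theta, _ in history[-steps_per_period:])

amplitude = steady_amplitude(simulate(gamma=0.2))
print(f"steady-state amplitude for gamma = 0.2: {amplitude:.3f} rad")
```

Calling `simulate` with other values of `gamma` gives the time dependence and state-space points $(\theta, \omega)$ for the drive strengths discussed in the following sections.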

4.5.1 Close to linearity

For drive strength $\gamma = 0.2$ the amplitude is sufficiently small that $\sin\theta \simeq \theta$, superposition applies, and the
solution is identical to that for the driven linearly-damped linear oscillator. As shown in figure 4.5, once
the transient solution dies away, the steady-state solution asymptotically approaches one attractor that has
an amplitude of $\pm 0.3$ radians and a phase shift $\delta$ with respect to the driving force. The abscissa is given
in units of the dimensionless time $\tilde{t} = \omega_0 t$. The transient solution depends on the initial conditions and
dies away after about 5 periods, whereas the steady-state solution is independent of the initial conditions
and has a state-space diagram that has an elliptical shape, characteristic of the harmonic oscillator. For all
initial conditions, the time dependence and state space diagram for steady-state motion approach a unique
solution, called an “attractor”, that is, the pendulum oscillates sinusoidally with a given amplitude at the
frequency of the driving force and with a constant phase shift $\delta$, i.e.

$$\theta(t) = A\cos(\omega t - \delta) \tag{4.34}$$

This solution is identical to that for the harmonically-driven, linearly-damped, linear oscillator discussed in
chapter 3.6.

¹ A similar approach is used by the book "Chaotic Dynamics" by Baker and Gollub [Bak96].

Figure 4.5: Motion of the driven damped pendulum for drive strengths of $\gamma = 0.2$, $\gamma = 0.9$, $\gamma = 1.05$ and
$\gamma = 1.078$. The left side shows the time dependence of the deflection angle $\theta$ with the time axis expressed
in dimensionless units $\tilde{t}$. The right side shows the corresponding state-space plots. These plots assume
$\tilde{\omega} = \frac{\omega}{\omega_0} = \frac{2}{3}$, $Q = 2$, and the motion starts with $\theta = \omega = 0$.

Figure 4.6: The driven damped pendulum assuming that $\tilde{\omega} = \frac{2}{3}$, $Q = 2$, with initial conditions $\theta(0) = -\frac{\pi}{2}$,
$\omega(0) = 0$. The system exhibits period-two motion for a drive strength of $\gamma = 1.078$ as shown by the state
space diagram for cycles 10 to 20. For $\gamma = 1.081$ the system exhibits period-four motion shown for cycles
10 to 30.

4.5.2 Weak nonlinearity

Figure 45 shows that for drive strength  = 09, after the transient solution dies away, the steady-state
solution settles down to one attractor that oscillates at the drive frequency with an amplitude of slightly
more than 2 radians for which the small angle approximation fails. The distortion due to the non-linearity
is exhibited by the non-elliptical shape of the state-space diagram.
The observed behavior can be calculated using the successive approximation method discussed in chapter
42. That is, close to small angles the sine function can be approximated by replacing
1
sin  ≈  − 3
6
in equation 433 to give µ ¶
1 1
̈ + ̇ +  20  − 3 =  cos ̃ ̃ (4.35)
 6
As a first approximation assume that
(̃) ≈  cos(̃ ̃ − )
then the small 3 term in equation 435 contributes a term proportional to cos3 (̃ ̃ − ). But
1¡ ¢
cos3 (̃ ̃ − ) = cos 3(̃ ̃ − ) + 3 cos(̃ ̃ − )
4
That is, the nonlinearity introduces a small term proportional to cos 3( − ). Since the right-hand side of
equation 435 is a function of only cos  then the terms in  ̇ and ̈ on the left hand side must contain
the third harmonic cos 3( − ) term. Thus a better approximation to the solution is of the form
£ ¤
(̃) =  cos(̃ ̃ − ) +  cos 3(̃ ̃ − ) (4.36)
100 CHAPTER 4. NONLINEAR SYSTEMS AND CHAOS

where the admixture coeﬃcient   1. This successive approximation method can be repeated to add
additional terms proportional to cos ( − ) where  is an integer with  ≥ 3. Thus the nonlinearity
introduces progressively weaker -fold harmonics to the solution. This successive approximation approach
is viable only when the admixture coeﬃcient   1 Note that these harmonics are integer multiples of ,
thus the steady-state response is identical for each full period even though the state space contours deviate
from an elliptical shape.
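
The harmonic content of the steady-state motion can be estimated numerically by projecting $\theta(\tilde{t})$ onto $\cos k\tilde{\omega}\tilde{t}$ and $\sin k\tilde{\omega}\tilde{t}$ over a whole number of drive periods. The sketch below does this for $\gamma = 0.9$ using the full $\sin\theta$ restoring term of equation 4.33, with $Q = 2$, $\tilde{\omega} = 2/3$ and $\theta(0) = \omega(0) = 0$. The number of settling periods, the step size, and the function names are assumptions, and the ratio it prints is an estimate of the admixture rather than a value quoted in the text.

```python
import math

Q, DRIVE = 2.0, 2.0 / 3.0   # damping factor and relative drive frequency

def steady_state_samples(gamma, settle_periods=40, sample_periods=4, n=1000):
    """Integrate equation 4.33 with fourth-order Runge-Kutta and return
    (time, theta) samples covering whole drive periods after settling."""
    h = 2.0 * math.pi / DRIVE / n

    def f(t, th, om):
        return om, -om / Q - math.sin(th) + gamma * math.cos(DRIVE * t)

    th, om = 0.0, 0.0
    samples = []
    for i in range((settle_periods + sample_periods) * n):
        t = i * h
        if i >= settle_periods * n:
            samples.append((t, th))
        k1 = f(t, th, om)
        k2 = f(t + h / 2, th + h / 2 * k1[0], om + h / 2 * k1[1])
        k3 = f(t + h / 2, th + h / 2 * k2[0], om + h / 2 * k2[1])
        k4 = f(t + h, th + h * k3[0], om + h * k3[1])
        th += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        om += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return samples

def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic of the drive frequency."""
    c = sum(th * math.cos(k * DRIVE * t) for t, th in samples)
    s = sum(th * math.sin(k * DRIVE * t) for t, th in samples)
    return 2.0 * math.hypot(c, s) / len(samples)

samples = steady_state_samples(gamma=0.9)
a1 = harmonic_amplitude(samples, 1)
a3 = harmonic_amplitude(samples, 3)
print(f"fundamental amplitude: {a1:.3f} rad")
print(f"third-harmonic admixture a3/a1: {a3 / a1:.4f}")
```

A small but non-zero ratio `a3 / a1` is what equation 4.36 anticipates for weak non-linearity.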

4.5.3 Onset of complication

Figure 45 shows that for  = 105 the drive strength is suﬃciently strong to cause the transient solution for
the pendulum to rotate through two complete cycles before settling down to a single steady-state attractor
solution at the drive frequency. However, this attractor solution is shifted two complete rotations relative
to the initial condition. The state space diagram clearly shows the rolling motion of the transient solution
for the first two periods prior to the system settling down to a single steady-state attractor. The successive
approximation approach completely fails at this coupling strength since  oscillates through large values that
are multiples of 
Figure 45 shows that for drive strength  = 1078 the motion evolves to a much more complicated
periodic motion with a period that is three times the period of the driving force. Moreover the amplitude
exceeds 2 corresponding to the pendulum oscillating over top dead center with the centroid of the motion
oﬀset by 3 from the initial condition. Both the state-space diagram, and the time dependence of the motion,
illustrate the complexity of this motion which depends sensitively on the magnitude of the drive strength 
in addition to the initial conditions, ((0) (0)) and damping factor  as is shown in figure 46

4.5.4 Period doubling and bifurcation

For drive strength $\gamma = 1.078$ with the initial condition $(\theta(0), \omega(0)) = (0, 0)$, the system exhibits a regular
motion with a period that is three times the drive period. In contrast, if the initial condition is
$[\theta(0) = -\frac{\pi}{2}, \omega(0) = 0]$ then, as shown in figure 4.6, the steady-state solution has the drive frequency with no offset
in $\theta$, that is, it exhibits period-one oscillation. This appearance of two separate and very different attractors
for $\gamma = 1.078$, using different initial conditions, is called bifurcation.
An additional feature of the system response for $\gamma = 1.078$ is that changing the initial conditions to
$[\theta(0) = -\frac{\pi}{2}, \omega(0) = 0]$ shows that the even and odd periods of oscillation differ slightly
in shape and amplitude, that is, the system really has period-two oscillation. This period-two motion, i.e.
period doubling, is clearly illustrated by the state space diagram in that, although the motion still is
dominated by period-one oscillations, the even and odd cycles are slightly displaced. Thus, for different
initial conditions, the system for $\gamma = 1.078$ bifurcates into either of two attractors that have very different
waveforms, one of which exhibits period doubling.
The period doubling exhibited for $\gamma = 1.078$ is followed by a second period doubling when $\gamma = 1.081$ as
shown in figure 4.6. With increase in drive strength this period doubling keeps increasing in binary multiples
to period 8, 16, 32, 64 etc. Numerically it is found that the threshold for the first period doubling is $\gamma_1 = 1.0663$,
and that the doubling from two to four occurs at $\gamma_2 = 1.0793$, etc. Feigenbaum showed that, with increase in
drive strength, this cascade obeys the relation

$$(\gamma_{n+1} - \gamma_n) \simeq \frac{1}{\delta}(\gamma_n - \gamma_{n-1}) \tag{4.37}$$

where $\delta = 4.6692016$ is called a Feigenbaum number. As $n \to \infty$ this cascading sequence goes to a limit
$\gamma_c$ where

$$\gamma_c = 1.0829 \tag{4.38}$$

4.5.5 Rolling motion

It was shown that for $\gamma > 1.05$ the transient solution causes the pendulum to have angle excursions exceeding
$2\pi$, that is, the system rolls over top dead center. For drive strengths in the range $1.3 < \gamma < 1.4$ the
steady-state solution for the system undergoes continuous rolling motion as illustrated in figure 4.7. The time
dependence for the angle $\theta$ exhibits a periodic oscillatory motion superimposed upon a monotonic rolling
motion, whereas the time dependence of the angular frequency $\omega = \frac{d\theta}{dt}$ is periodic. The state space plot
for rolling motion corresponds to a chain of loops with a spacing of $2\pi$ between each loop. The state space
diagram for rolling motion is more compactly presented if the origin is shifted by $2\pi$ per revolution to keep
the plot within bounds as illustrated in figure 4.7.

Figure 4.7: Rolling motion for the driven damped plane pendulum for $\gamma = 1.4$. (a) The time dependence
of angle $\theta(t)$ increases by $2\pi$ per drive period whereas (b) the angular velocity $\omega(t)$ exhibits periodicity. (c)
The state space plot for rolling motion is shown with the origin shifted by $2\pi$ per revolution to keep the plot
within the bounds $-\pi < \theta < +\pi$.
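
Shifting the origin by $2\pi$ per revolution amounts to wrapping the angle into a single $2\pi$ interval before plotting. A minimal sketch of such a wrapping function is shown below; the function name is an illustrative assumption, and the half-open interval $[-\pi, +\pi)$ is a choice made here for definiteness.

```python
import math

def wrap_angle(theta):
    """Map any angle in radians onto the interval [-pi, pi)."""
    return (theta + math.pi) % (2.0 * math.pi) - math.pi

# A rolling pendulum that has completed five revolutions plus 0.3 rad
rolled = 5 * 2.0 * math.pi + 0.3
print(f"unwrapped: {rolled:.3f} rad, wrapped: {wrap_angle(rolled):.3f} rad")
```

Applying `wrap_angle` to every $\theta$ sample of a rolling trajectory folds the chain of loops onto one loop, as in panel (c) of figure 4.7.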

4.5.6 Onset of chaos

When the drive strength is increased to $\gamma = 1.105$ then the system does not approach a unique attractor
as illustrated by figure 4.8, which shows state space orbits for cycles 25 to 200. Note that these orbits do
not repeat, implying the onset of chaos. For drive strengths greater than $\gamma_c = 1.0829$ the driven damped
plane pendulum starts to exhibit chaotic behavior. The onset of chaotic motion is illustrated by making a
3-dimensional plot which combines the time coordinate with the state-space coordinates as illustrated in figure
4.8. This plot shows 16 trajectories starting at different initial values in the range $-0.15 < \theta < 0.15$
for $\gamma = 1.168$. Some solutions are erratic in that, while trying to oscillate at the drive frequency, they never
settle down to a steady periodic motion, which is characteristic of chaotic motion. Figure 4.8 illustrates
the considerable sensitivity of the motion to the initial conditions. That is, this deterministic system can
exhibit either order, or chaos, dependent on minuscule differences in initial conditions.
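
The sensitivity to initial conditions can be probed by integrating two copies of equation 4.33 whose starting angles differ by a tiny amount and comparing their separation in state space after many drive periods. The sketch below is an illustration only: the starting point $\theta(0) = -\pi/2$, $\omega(0) = 0$, the initial offset of $10^{-8}$ radians, the 40 drive periods, and the function names are all assumptions, with $Q = 2$ and $\tilde{\omega} = 2/3$ as in the text.

```python
import math

Q, DRIVE = 2.0, 2.0 / 3.0   # damping factor and relative drive frequency

def evolve(theta, omega, gamma, periods, n=500):
    """Integrate equation 4.33 over whole drive periods with fourth-order
    Runge-Kutta and return the final (theta, omega)."""
    h = 2.0 * math.pi / DRIVE / n

    def f(t, th, om):
        return om, -om / Q - math.sin(th) + gamma * math.cos(DRIVE * t)

    for i in range(periods * n):
        t = i * h
        k1 = f(t, theta, omega)
        k2 = f(t + h / 2, theta + h / 2 * k1[0], omega + h / 2 * k1[1])
        k3 = f(t + h / 2, theta + h / 2 * k2[0], omega + h / 2 * k2[1])
        k4 = f(t + h, theta + h * k3[0], omega + h * k3[1])
        theta += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        omega += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return theta, omega

def separation_growth(gamma, delta0=1e-8, periods=40):
    """Ratio of final to initial state-space separation of two trajectories."""
    th_a, om_a = evolve(-math.pi / 2, 0.0, gamma, periods)
    th_b, om_b = evolve(-math.pi / 2 + delta0, 0.0, gamma, periods)
    return math.hypot(th_a - th_b, om_a - om_b) / delta0

growth_ordered = separation_growth(gamma=0.2)
growth_chaotic = separation_growth(gamma=1.105)
print(f"separation growth factor, gamma = 0.2:   {growth_ordered:.3e}")
print(f"separation growth factor, gamma = 1.105: {growth_chaotic:.3e}")
```

For the weakly driven case both trajectories converge onto the same attractor, so the separation shrinks, whereas for a drive strength in the chaotic regime the separation is expected to grow by many orders of magnitude.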

Figure 4.8: Left: State-space orbits for the driven damped pendulum with $\gamma = 1.105$. Note that the orbits
do not repeat for cycles 25 to 200. Right: Time-state-space diagram for $\gamma = 1.168$. The plot shows 16
trajectories starting with different initial values in the range $-0.15 < \theta < 0.15$.

Figure 4.9: State-space plots for the harmonically-driven, linearly-damped, pendulum for driving amplitudes
of $\gamma = 0.5$ and $\gamma = 1.2$. These calculations were performed using the Runge-Kutta method by E. Shah
(private communication).

4.6 Differentiation between ordered and chaotic motion

Chapter 45 showed that motion in non-linear systems can exhibit both order and chaos. The transition
between ordered motion and chaotic motion depends sensitively on both the initial conditions and the model
parameters. It is surprisingly diﬃcult to unambiguously distinguish between complicated ordered motion
and chaotic motion. Moreover, the motion can fluctuate between order and chaos in an erratic manner
depending on the initial conditions. The extremely sensitivity to initial conditions of the motion for non-
linear systems, makes it essential to have quantitative measures that can characterize the degree of order, and
interpret the complicated dynamical motion of systems. As an illustration, consider the harmonically-driven,
linearly-damped, pendulum with  = 2 and driving force  () =  sin ̃ ̃ where ̃ = 23 . Figure 49 shows
the state-space plots for two driving amplitudes,  = 05 which leads to ordered motion, and  = 12
which leads to possible chaotic motion. It can be seen that for  = 05 the state-space diagram converges
to a single attractor once the transient solution has died away. This is in contrast to the case for  = 12
where the state-space diagram does not converge to a single attractor, but exhibits possible chaotic motion.
Three quantitative measures can be used to diﬀerentiate ordered motion from chaotic motion for this system;
namely, the Lyapunov exponent, the bifurcation diagram, and the Poincaré section, as illustrated below.

4.6.1 Lyapunov exponent

The Lyapunov exponent provides a quantitative and useful measure of the instability of trajectories, and how
quickly nearby initial conditions diverge. It compares two identical systems that start with an infinitesimally
small difference in the initial conditions in order to ascertain whether they converge to the same attractor
at long times, corresponding to a stable system, or whether they diverge to very different attractors,
characteristic of chaotic motion. If the initial separation between the trajectories in phase space at $t = 0$
is $|\delta Z_0|$, then to first order the time dependence of the difference can be assumed to depend
exponentially on time. That is,
$$|\delta Z(t)| \sim e^{\lambda t} |\delta Z_0| \tag{4.39}$$
where $\lambda$ is the Lyapunov exponent. That is, the Lyapunov exponent is defined to be

$$\lambda = \lim_{t \to \infty} \lim_{\delta Z_0 \to 0} \frac{1}{t} \ln \frac{|\delta Z(t)|}{|\delta Z_0|} \tag{4.40}$$

Systems for which the Lyapunov exponent $\lambda < 0$ (negative) converge exponentially to the same attractor
solution at long times since $|\delta Z(t)| \to 0$ for $t \to \infty$. By contrast, systems for which
$\lambda > 0$ (positive) diverge to completely different long-time solutions, that is,
$|\delta Z(t)| \to \infty$ for $t \to \infty$. Even for infinitesimally small differences in the initial
conditions, systems having a positive Lyapunov exponent diverge to different attractors, whereas when the
Lyapunov exponent $\lambda < 0$ they correspond to stable solutions.

Figure 4.10: Lyapunov plots of $\Delta\theta$ versus time for two initial starting points differing by
$\Delta\theta_0 = 0.001$. The parameters are $Q = 2$ and $F(t) = F_D \sin(\frac{2}{3} t)$ and $\Delta t = 0.04$.
The Lyapunov exponent for $F_D = 0.5$, which is drawn as a dashed line, is convergent with $\lambda = -0.251$.
For $F_D = 1.2$ the exponent is divergent as indicated by the dashed line which has a slope of
$\lambda = 0.1538$. These calculations were performed using the Runge-Kutta method by E. Shah
(private communication).

Figure 410 illustrates Lyapunov plots for the harmonically-driven, linearly-damped, plane pendulum,
with the same conditions discussed in chapter 45. Note that for the small driving amplitude  = 05
the Lyapunov plot converges to ordered motion with an exponent  = −0251 whereas for  = 12 the
plot diverges characteristic of chaotic motion with an exponent  = 01538 The Lyapunov exponent usually
fluctuates widely at the local oscillator frequency, and thus the time average of the Lyapunov exponent must
be taken over many periods of the oscillation to identify the general trend with time. Some systems near an
order-to-chaos transition can exhibit positive Lyapunov exponents for short times, characteristic of chaos,
and then converge to negative  at longer time implying ordered motion. The Lyapunov exponents are
used extensively to monitor the stability of the solutions for non-linear systems. For example the Lyapunov
exponent is used to identify whether fluid flow is laminar or turbulent as discussed in chapter 168.
A dynamical system in -dimensional phase space will have a set of  Lyapunov exponents {1  2    }
associated with a set of attractors, the importance of which depend on the initial conditions. Typically one
Lyapunov exponent dominates at one specific location in phase space, and thus it is usual to use the maximal
Lyapunov exponent to identify chaos.The Lyapunov exponent is a very sensitive measure of the onset of chaos
and provides an important test of the chaotic nature for the complicated motion exhibited by non-linear
systems.
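
The two-trajectory comparison described above can be sketched numerically. The following is a minimal illustration, not the calculation used for figure 4.10. It assumes a dimensionless equation of motion of the form $\dot{\theta} = \omega$, $\dot{\omega} = -\omega/Q - \sin\theta + F_D \sin(\tilde{\omega}\tilde{t})$, which is not written out in this section, and the function names, starting point, and integration time are hypothetical choices. The estimate is the average rate $\frac{1}{t}\ln(|\delta Z(t)|/|\delta Z_0|)$ over a finite time, so it only approximates the limit in equation 4.40.

```python
import math

def rhs(t, theta, omega, f_d, q=2.0, w_d=2.0 / 3.0):
    """Assumed dimensionless equations: theta' = omega,
    omega' = -omega/Q - sin(theta) + F_D sin(w_d t)."""
    return omega, -omega / q - math.sin(theta) + f_d * math.sin(w_d * t)

def rk4_step(t, theta, omega, dt, f_d):
    """Advance (theta, omega) by one fourth-order Runge-Kutta step."""
    k1t, k1w = rhs(t, theta, omega, f_d)
    k2t, k2w = rhs(t + dt / 2, theta + dt / 2 * k1t, omega + dt / 2 * k1w, f_d)
    k3t, k3w = rhs(t + dt / 2, theta + dt / 2 * k2t, omega + dt / 2 * k2w, f_d)
    k4t, k4w = rhs(t + dt, theta + dt * k3t, omega + dt * k3w, f_d)
    theta += dt / 6 * (k1t + 2 * k2t + 2 * k3t + k4t)
    omega += dt / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
    return theta, omega

def estimate_lyapunov(f_d, d0=1e-3, dt=0.04, t_end=60.0):
    """Average exponential rate ln(|dZ(t)|/|dZ0|)/t for two nearby starts."""
    a = (0.0, 0.0)   # reference trajectory (theta, omega), hypothetical start
    b = (d0, 0.0)    # neighbor displaced by d0 in angle
    steps = int(round(t_end / dt))
    for i in range(steps):
        t = i * dt
        a = rk4_step(t, a[0], a[1], dt, f_d)
        b = rk4_step(t, b[0], b[1], dt, f_d)
    # Separation measured as the Euclidean distance in (theta, omega) space.
    sep = math.hypot(b[0] - a[0], b[1] - a[1])
    return math.log(sep / d0) / (steps * dt)

if __name__ == "__main__":
    for f_d in (0.5, 1.2):
        print(f"F_D = {f_d}: lambda estimate = {estimate_lyapunov(f_d):+.3f}")
```

For a chaotic trajectory the separation cannot grow beyond the size of the attractor, so the finite-time estimate saturates and tends to underestimate $\lambda$ once the two trajectories have fully separated. A smaller initial separation, or averaging over many short segments, reduces this effect.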

4.6.2 Bifurcation diagram

The bifurcation diagram simplifies the presentation of the dynamical motion by sampling the status of
the system once per period, synchronized to the driving frequency, for many sets of initial conditions. The
results are presented graphically as a function of one parameter of the system in the bifurcation diagram. For
example, the wildly different behavior in the driven damped plane pendulum is represented on a bifurcation
diagram in figure 4.11, which shows the observed angular velocity $\omega$ of the pendulum sampled once per drive
cycle plotted versus drive strength. The bifurcation diagram is obtained by sampling either the angle $\theta$,
or angular velocity $\omega$, once per drive cycle, that is, it represents the observables of the pendulum using a
stroboscopic technique that samples the motion synchronous with the drive frequency. Bifurcation plots also
can be created as a function of either the time $\tilde{t}$, the damping factor $Q$, the normalized frequency
$\tilde{\omega} = \omega/\omega_0$, or the driving amplitude $F_D$.
In the domain with drive strength $\gamma < 1.0663$ there is one unique angle each drive cycle as illustrated
by the bifurcation diagram. For slightly higher drive strength, period-two bifurcation behavior results in
two different angles per drive cycle. The Lyapunov exponent is negative for this region, corresponding to
ordered motion. The cascade of period doubling with increase in drive strength is readily apparent until chaos
sets in at the critical drive strength $\gamma_c$ when there is a random distribution of sampled angular
velocities and the Lyapunov exponent becomes positive. Note that at $\gamma = 1.0845$ there is a brief interval
of period-6 motion followed by another region of chaos. Around $\gamma = 1.1$ there is a region that is primarily
chaotic, which is reflected by chaotic values of the angular velocity on the bifurcation plot and large positive
values of the Lyapunov exponent. The region around $\gamma = 1.12$ exhibits period-three motion and a negative
Lyapunov exponent corresponding to ordered motion. The $1.15 < \gamma < 1.25$ region is mainly chaotic and has a
large positive Lyapunov exponent. The region with $1.3 < \gamma < 1.4$ is striking in that this corresponds to
rolling motion with reemergence of period one and negative Lyapunov exponent. This period-1 motion is due to a
continuous rolling motion of the plane pendulum as shown in figure 4.7, where it is seen that the average
$\theta$ increases $2\pi$ per cycle, whereas the angular velocity $\omega$ exhibits a periodic motion. That is, on
average the pendulum is rotating $2\pi$ per cycle. Above $\gamma = 1.4$ the system starts to exhibit period
doubling followed by chaos reminiscent of the behavior seen at lower $\gamma$ values.

Figure 4.11: Bifurcation diagram samples the angular velocity $\omega$ once per period for the driven,
linearly-damped, plane pendulum plotted as a function of the drive strength $\gamma$. Regions of period doubling,
and chaos, as well as islands of stability all are manifest as the drive strength $\gamma$ is changed. Note that
the limited number of samples causes broadening of the lines adjacent to bifurcations.

These results show that the bifurcation diagram nicely illustrates the order to chaos transitions for the
harmonically-driven, linearly-damped, pendulum. Several transitions between order and chaos are seen to
occur. The apparent ordered and chaotic regimes are confirmed by the corresponding Lyapunov exponents
which alternate between negative and positive values for the ordered and chaotic regions respectively.
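
The stroboscopic sampling that underlies a bifurcation diagram can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure used to produce figure 4.11: it uses the dimensionless pendulum of chapter 4.6 with an assumed equation of motion $\dot{\omega} = -\omega/Q - \sin\theta + F_D \sin(\tilde{\omega}\tilde{t})$, and the function names, transient length, and clustering tolerance are hypothetical. One column of a bifurcation diagram is the set of sampled angular velocities for one value of the drive amplitude; repeating the calculation over a range of amplitudes fills in the diagram.

```python
import math

def strobe_omega(f_d, q=2.0, w_d=2.0 / 3.0, skip=60, keep=20, steps=300):
    """Return omega sampled once per drive period after `skip` transient periods."""
    period = 2.0 * math.pi / w_d
    dt = period / steps  # the time step divides the drive period exactly

    def rhs(t, th, om):
        # Assumed dimensionless equation of motion (not given in this section).
        return om, -om / q - math.sin(th) + f_d * math.sin(w_d * t)

    th, om, samples = 0.0, 0.0, []
    for n in range(skip + keep):
        for i in range(steps):
            t = (n * steps + i) * dt
            # One fourth-order Runge-Kutta step.
            k1 = rhs(t, th, om)
            k2 = rhs(t + dt / 2, th + dt / 2 * k1[0], om + dt / 2 * k1[1])
            k3 = rhs(t + dt / 2, th + dt / 2 * k2[0], om + dt / 2 * k2[1])
            k4 = rhs(t + dt, th + dt * k3[0], om + dt * k3[1])
            th += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
            om += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if n >= skip:
            samples.append(om)  # one stroboscopic sample per drive cycle
    return samples

def count_branches(samples, tol=1e-3):
    """Count clusters of sampled values separated by more than `tol`."""
    ordered = sorted(samples)
    return 1 + sum(1 for lo, hi in zip(ordered, ordered[1:]) if hi - lo > tol)

if __name__ == "__main__":
    for f_d in (0.5, 1.2):
        print(f"F_D = {f_d}: distinct sampled values = {count_branches(strobe_omega(f_d))}")
```

Period-1 motion gives a single sampled value, period-2 motion gives two, and chaotic motion gives a spread of values whose count grows with the number of samples retained.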

4.6.3 Poincaré Section

State-space plots are very useful for characterizing periodic motion, but they become too dense for useful
interpretation when the system approaches chaos as illustrated in figure 4.11. Poincaré sections solve this
difficulty by taking a stroboscopic sample once per cycle of the state-space diagram. That is, the point on
the state-space orbit is sampled once per drive period. For period-1 motion this corresponds to a single
point $(\theta, \omega)$. For period-2 motion this corresponds to two points, etc. For chaotic systems the
sequence of state-space sample points follows complicated trajectories. Figure 4.12 shows the Poincaré sections
for the corresponding state-space diagram shown in figure 4.9 for cycles 10 to 6000. Note the complicated curves
do not cross or repeat. Enlargements of any part of this plot will show increasingly dense parallel trajectories,
called fractals, that indicate the complexity of the chaotic cyclic motion. That is, zooming in on a small
section of this Poincaré plot shows many closely parallel trajectories. The fractal attractors are surprisingly
robust to large differences in initial conditions. Poincaré sections are a sensitive probe of periodic motion
for systems where periodic motion is not readily apparent.
In summary, the behavior of the well-known, harmonically-driven, linearly-damped, plane pendulum
becomes remarkably complicated at large driving amplitudes where non-linear effects dominate, that is,
when the restoring force is non-linear. The system exhibits bifurcation where it can evolve to multiple
attractors that depend sensitively on the initial conditions. The system exhibits both oscillatory, and rolling,
solutions depending on the amplitude of the motion. The system exhibits domains of simple ordered motion
separated by domains of very complicated ordered motion as well as chaotic regions. The transitions between
these dramatically different modes of motion are extremely sensitive to the amplitude and phase of the
driver. Eventually the motion becomes completely chaotic. The Lyapunov exponent, bifurcation diagram,
and Poincaré section plots are sensitive measures of the order of the motion. These three sensitive measures
of order and chaos are used extensively in many fields in classical mechanics. Considerable computing
capabilities are required to elucidate the complicated motion involved in non-linear systems. Examples
include laminar and turbulent flow in fluid dynamics and weather forecasting of hurricanes, where the
motion can span a wide dynamic range in dimensions from $10^{-5}$ to $10^{4}$.

Figure 4.12: Three Poincaré section plots for the harmonically-driven, linearly-damped, pendulum for various
initial conditions with $F_D = 1.2$, $\tilde{\omega} = 2/3$, and $\Delta t = \pi/100$. These calculations used the
Runge-Kutta method and were performed for 6000 cycles by E. Shah (private communication).

4.7 Wave propagation for non-linear systems

4.7.1 Phase, group, and signal velocities
Chapter 3 discussed the wave equation and solutions for linear systems. It was shown that, for linear systems,
the wave motion obeys superposition and exhibits dispersion, that is, a frequency-dependent phase velocity,
and, in some cases, attenuation. Nonlinear systems introduce intriguing new wave phenomena. For example,
for nonlinear systems, second and higher order terms must be included in the Taylor expansion given in equation
4.2. These second and higher order terms result in the group velocity being a function of $\omega$, that is, group
velocity dispersion occurs, which leads to the shape of the envelope of the wave packet being time dependent.
As a consequence the group velocity in the wave packet is not well defined, and does not equal the signal
velocity of the wave packet or the phase velocity of the wavelets. Nonlinear optical systems have been studied
experimentally where $v_{group} < c$, which is called slow light, while other systems have $v_{group} > c$, which is
called superluminal light. The ability to control the velocity of light in such optical systems is of considerable
current interest since it has signal transmission applications.
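The dispersion relation for a nonlinear system can be expressed as a Taylor expansion of the form
$$k = k_0 + \left(\frac{\partial k}{\partial \omega}\right)_{\omega=\omega_0} (\omega - \omega_0) + \frac{1}{2} \left(\frac{\partial^2 k}{\partial \omega^2}\right)_{\omega=\omega_0} (\omega - \omega_0)^2 + \ldots \tag{4.41}$$

where $\omega$ is used as the independent variable since it is invariant to phase transitions of the system. Note
that the factor for the first derivative term is the reciprocal of the group velocity
$$\left(\frac{\partial k}{\partial \omega}\right)_{\omega=\omega_0} \equiv \frac{1}{v_{group}} \tag{4.42}$$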
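while the factor for the second derivative term is

$$\left(\frac{\partial^2 k}{\partial \omega^2}\right)_{\omega=\omega_0} = \left[\frac{\partial}{\partial \omega} \frac{1}{v_{group}(\omega)}\right]_{\omega=\omega_0} = -\left(\frac{1}{v_{group}^2} \frac{\partial v_{group}}{\partial \omega}\right)_{\omega=\omega_0} \tag{4.43}$$

which gives the velocity dispersion for the system.

Since
$$k = \frac{\omega}{v_{phase}} \tag{4.44}$$
then differentiating with respect to $\omega$ gives
$$\frac{1}{v_{group}} \equiv \frac{\partial k}{\partial \omega} = \frac{1}{v_{phase}} + \omega \frac{\partial}{\partial \omega} \left(\frac{1}{v_{phase}}\right) \tag{4.45}$$
The inverse velocities for electromagnetic waves are best represented in terms of the corresponding refractive
indices $n$ where
$$n \equiv \frac{c}{v_{phase}} \tag{4.46}$$
and the group refractive index
$$n_{group} \equiv \frac{c}{v_{group}} \tag{4.47}$$
Then equation 4.45 can be written in the more convenient form
$$n_{group} = n + \omega \frac{\partial n}{\partial \omega} \tag{4.48}$$
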
Wave propagation for an optical system that is subject to a single resonance gives one example of nonlinear
frequency response that has applications to optics.
Figure 4.13 shows that the real and imaginary parts of the phase refractive index $n$ exhibit the
characteristic resonance frequency dependence of the sinusoidally-driven, linear oscillator that was discussed
in chapter 3.6 and as illustrated in figure 3.10. Figure 4.13 also shows the group refractive index $n_{group}$
computed using equation 4.48.
Note that at resonance, $n_{group}$ is reduced below the non-resonant value, which corresponds to superluminal
(fast) light, whereas in the wings of the resonance $n_{group}$ is larger than the non-resonant value,
corresponding to slow light. Thus the nonlinear dependence of the refractive index $n$ on angular frequency
$\omega$ leads to fast or slow group velocities for isolated wave packets. Velocities of light as slow as
17 m/sec have been observed. Experimentally the energy absorption that occurs on resonance makes it difficult
to observe the superluminal electromagnetic wave at resonance.
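
The behavior of the group refractive index near a resonance can be checked numerically from equation 4.48. The sketch below is illustrative only: it assumes a weak, single Lorentzian resonance for the real part of the phase index, and the function names and the parameters `w0`, `width`, and `strength` are hypothetical values chosen for the example, not values taken from figure 4.13.

```python
import math

def phase_index(w, w0=1.0, width=0.05, strength=1e-3):
    """Real part of n for one weak resonance (assumed Lorentzian line shape)."""
    x = w0 * w0 - w * w
    return 1.0 + strength * x / (x * x + (width * w) ** 2)

def group_index(w, dw=1e-6, **kw):
    """n_group = n + w dn/dw (equation 4.48), derivative by central difference."""
    dn_dw = (phase_index(w + dw, **kw) - phase_index(w - dw, **kw)) / (2.0 * dw)
    return phase_index(w, **kw) + w * dn_dw

if __name__ == "__main__":
    for w in (0.8, 1.0, 1.2):
        print(f"w = {w}: n = {phase_index(w):.4f}, n_group = {group_index(w):.4f}")
```

With these assumed parameters the slope $\partial n / \partial \omega$ is negative at the line center, so $n_{group}$ falls below the non-resonant value of 1 there, while in the wings the slope is positive and $n_{group}$ exceeds $n$, in line with the description above.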
Note that Sommerfeld and Brillouin showed that even though the group velocity may exceed $c$, the signal
velocity, which marks the arrival of the leading edge of the optical pulse, does not exceed $c$, the velocity of
light in vacuum, as was postulated by Einstein. [Bri14]

Figure 4.13: The real and imaginary parts of the phase refractive index $n$ plus the real part of the group
refractive index associated with an isolated atomic resonance.

4.7.2 Soliton wave propagation

The soliton is a fascinating and very special wave propagation phenomenon that occurs for certain non-linear
systems. The soliton is a self-reinforcing solitary localized wave packet that maintains its shape while
travelling long distances at a constant speed. Solitons are caused by a cancellation of phase modulation
resulting from non-linear velocity dependence, and the group velocity dispersive effects in a medium. Solitons
arise as solutions of a widespread class of weakly-nonlinear dispersive partial differential equations
describing many physical systems. Figure 4.14 shows a soliton comprising a solitary water wave approaching the
coast of Hawaii. While the soliton in Fig. 4.14 may appear like a normal wave, it is unique in that there are no
other waves accompanying it. This wave was probably created far away from the shore when a normal wave was
modulated by a geometrical change in the ocean depth, such as the rising sea floor, which forced it into the
appropriate shape for a soliton. The wave then was able to travel to the coast intact, despite the apparently
placid nature of the ocean near the beach. Solitons are notable in that they interact with each other in ways
very different from normal waves. Normal waves are known for their complicated interference patterns that depend
on the frequency and wavelength of the waves. Solitons can pass right through each other without being affected
at all. This makes solitons very appealing to scientists because soliton waves are more sturdy than normal
waves, and can therefore be used to transmit information in ways that are distinctly different from those for
normal wave motion. For example, optical solitons are used in optical fibers made of a dispersive, nonlinear
optical medium, to transmit optical pulses with an invariant shape.

Figure 4.14: A solitary wave approaches the coast of Hawaii. (Image: Robert Odom/University of Washington)

Solitons were first observed in 1834 by John Scott Russell (1808 − 1882). Russell was an engineer con-
ducting experiments to increase the efficiency of canal boats. His experimental and theoretical investigations
allowed him to recreate the phenomenon in wave tanks. Through his extensive studies, Scott Russell noticed
that soliton propagation exhibited the following properties:
• The waves are stable and hold their shape for long periods of time.
• The waves can travel over long distances at uniform speed.
• The speed of propagation of the wave depends on the size of the wave, with larger waves traveling
faster than smaller waves.
• The waves maintained their shape when they collided - seemingly passing right through each other.
Scott Russell’s work was met with scepticism by the scientific community. The problem with the Wave
of Translation was that it was an effect that depended on nonlinear effects, whereas previously existing
theories of hydrodynamics (such as those of Newton and Bernoulli) only dealt with linear systems. George
Biddell Airy, and George Gabriel Stokes, published papers attacking Scott Russell’s observations because
the observations could not be explained by their theories of wave propagation in water. Regardless, Scott
Russell was convinced of the prime importance of the Wave of Translation, and history proved that he was
correct. Scott Russell went on to develop the “wave line” system of hull construction that revolutionized
nineteenth century naval architecture, along with a number of other great accomplishments leading him to
fame and prominence. Despite all of the success in his career, he continued throughout his life to pursue his
studies of the Wave of Translation.
In 1895 Korteweg and de Vries developed a wave equation for surface waves in shallow water.

$$\frac{\partial \phi}{\partial t} + \frac{\partial^3 \phi}{\partial x^3} + 6\phi \frac{\partial \phi}{\partial x} = 0 \tag{4.49}$$
A solution of this equation has the characteristics of a solitary wave with fixed shape. It is given by
substituting the form $\phi(x,t) = f(x - ct)$ into the Korteweg-de Vries equation which gives

$$-c \frac{\partial f}{\partial x} + \frac{\partial^3 f}{\partial x^3} + 6f \frac{\partial f}{\partial x} = 0 \tag{4.50}$$
Integrating with respect to $x$ gives
$$3f^2 + \frac{d^2 f}{dx^2} - cf = C \tag{4.51}$$
where $C$ is a constant of integration. This non-linear equation has a solution
$$\phi(x,t) = \frac{1}{2} c \, \mathrm{sech}^2 \left[\frac{\sqrt{c}}{2} (x - ct - a)\right] \tag{4.52}$$

where $a$ is a constant. Equation 4.52 is the equation of a solitary wave moving in the $+x$ direction at a
velocity $c$.
Soliton behavior is observed in phenomena such as tsunamis, tidal bores that occur for some rivers,
signals in optical fibres, plasmas, atmospheric waves, vortex filaments, superconductivity, and gravitational
fields having cylindrical symmetry. Much work has been done on solitons for fibre optics applications. The
soliton’s inherent stability makes long-distance transmission possible without the use of repeaters, and could
potentially double the transmission capacity.
Before the discovery of solitons, mathematicians were under the impression that nonlinear partial differ-
ential equations could not be solved exactly. However, solitons led to the recognition that there are non-linear
systems that can be solved analytically. This discovery has prompted much investigation into these so-called
“integrable systems.” Such systems are rare, as most non-linear differential equations admit chaotic behavior
with no explicit solutions. Integrable systems nevertheless lead to very interesting mathematics ranging from
differential geometry and complex analysis to quantum field theory and fluid dynamics.
Many of the fundamental equations in physics (Maxwell’s, Schrödinger’s) are linear equations. However,
physicists have begun to recognize many areas of physics in which nonlinearity can result in qualitatively
new phenomena which cannot be constructed via perturbation theory starting from linearized equations.
These include phenomena in magnetohydrodynamics, meteorology, oceanography, condensed matter physics,
nonlinear optics, and elementary particle physics. For example, the European space mission Cluster detected
a soliton-like electrical disturbance that travelled through the ionized gas surrounding the Earth starting
about 50,000 kilometers from Earth and travelling towards the planet at about 8 km/s. It is thought that
this soliton was generated by turbulence in the magnetosphere.
Efforts to understand the nonlinearity of solitons have led to much research in many areas of physics. In
the context of solitons, their particle-like behavior (in that they are localized and preserved under collisions)
leads to a number of experimental and theoretical applications. The technique known as bosonization allows
viewing particles, such as electrons and positrons, as solitons in appropriate field equations. There are
numerous macroscopic phenomena, such as internal waves on the ocean, spontaneous transparency, and the
behavior of light in fiber optic cable, that are now understood in terms of solitons. These phenomena are
being applied to modern technology.

4.8 Summary
The study of the dynamics of non-linear systems remains a vibrant and rapidly evolving field in classical
mechanics as well as many other branches of science. This chapter has discussed examples of non-linear
systems in classical mechanics. It was shown that the superposition principle is broken even for weak
nonlinearity. It was shown that increased nonlinearity leads to bifurcation, point attractors, limit-cycle
attractors, and sensitivity to initial conditions.
Limit-cycle attractors: The Poincaré-Bendixson theorem for limit cycle attractors states that the
paths, both in state-space and phase-space, can take one of three possible forms:
(1) closed paths, like the elliptical paths for the undamped harmonic oscillator,
(2) paths that terminate at an equilibrium point as $t \to \infty$, like the point attractor for a damped
harmonic oscillator,
(3) paths that tend to a limit cycle as $t \to \infty$.
The limit cycle is unusual in that the periodic motion tends asymptotically to the limit-cycle attractor
independent of whether the initial values are inside or outside the limit cycle. The balance of dissipative forces
and driving forces often leads to limit-cycle attractors, especially in biological applications. Identification of
limit-cycle attractors, as well as the trajectories of the motion towards these limit-cycle attractors, is more
complicated than for point attractors.
The van der Pol oscillator is a common example of a limit-cycle system that has an equation of motion
of the form
$$\frac{d^2 x}{dt^2} + \mu \left(x^2 - 1\right) \frac{dx}{dt} + \omega_0^2 x = 0 \tag{4.11}$$
The van der Pol oscillator has a limit-cycle attractor that includes non-linear damping and exhibits
periodic solutions that asymptotically approach one attractor solution independent of the initial conditions.
There are many examples in nature that exhibit similar behavior.
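
The approach to the limit cycle from inside and from outside can be illustrated with a short numerical integration. This is a minimal sketch under stated assumptions: it integrates the van der Pol equation 4.11 with a fourth-order Runge-Kutta method, and the function name, the choice $\mu = \omega_0 = 1$, the two starting amplitudes, and the integration times are hypothetical values chosen for the example.

```python
def vdp_amplitude(x0, mu=1.0, w0=1.0, dt=0.01, t_end=80.0, t_tail=10.0):
    """Integrate x'' + mu (x^2 - 1) x' + w0^2 x = 0 and return the late-time max |x|."""
    def rhs(x, v):
        return v, -mu * (x * x - 1.0) * v - w0 * w0 * x

    x, v, amp = x0, 0.0, 0.0
    steps = int(round(t_end / dt))
    tail_start = steps - int(round(t_tail / dt))
    for i in range(steps):
        # One fourth-order Runge-Kutta step for (x, v).
        k1 = rhs(x, v)
        k2 = rhs(x + dt / 2 * k1[0], v + dt / 2 * k1[1])
        k3 = rhs(x + dt / 2 * k2[0], v + dt / 2 * k2[1])
        k4 = rhs(x + dt * k3[0], v + dt * k3[1])
        x += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if i >= tail_start:
            amp = max(amp, abs(x))  # amplitude over the last t_tail time units
    return amp

if __name__ == "__main__":
    print("start inside  (x0 = 0.1):", round(vdp_amplitude(0.1), 3))
    print("start outside (x0 = 4.0):", round(vdp_amplitude(4.0), 3))
```

Both starting points should settle onto the same oscillation amplitude, close to 2 for these parameters, which is the sense in which the limit cycle attracts trajectories that begin either inside or outside it.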
Harmonically-driven, linearly-damped, plane pendulum: The non-linearity of the well-known
driven linearly-damped plane pendulum was used as an example of the behavior of non-linear systems in
nature. It was shown that non-linearity leads to discontinuous period bifurcation, extreme sensitivity to
initial conditions, rolling motion and chaos.
Differentiation between ordered and chaotic motion: Lyapunov exponents, bifurcation diagrams,
and Poincaré sections were used to identify the transition from order to chaos. Chapter 16.8 discusses
the non-linear Navier-Stokes equations of viscous-fluid flow which lead to complicated transitions between
laminar and turbulent flow. Fluid flow exhibits remarkable complexity that nicely illustrates the dominant
role that non-linearity can have on the solutions of practical non-linear systems in classical mechanics.
Wave propagation for non-linear systems: Non-linear equations can lead to unexpected behavior
for wave packet propagation such as fast or slow light as well as soliton solutions. Moreover, it is notable
that some non-linear systems can lead to analytic solutions.
The complicated phenomena exhibited by the above non-linear systems are not restricted to classical
mechanics; rather they are a manifestation of the mathematical behavior of the solutions of the differential
equations involved. That is, this behavior is a general manifestation of the behavior of solutions for second-
order differential equations. Exploration of this complex motion has only become feasible with the advent
of powerful computer facilities during the past three decades. The breadth of phenomena exhibited by
these examples is manifest in myriads of other nonlinear systems, ranging from many-body motion, weather
patterns, growth of biological species, epidemics, motion of electrons in atoms, etc. Other examples of non-
linear equations of motion not discussed here, are the three-body problem, which is mentioned in chapter
11, and turbulence in fluid flow which is discussed in chapter 16.
It is stressed that the behavior discussed in this chapter is very different from the random walk problem
which is a stochastic process where each step is purely random and not deterministic. This chapter has
assumed that the motion is fully deterministic and rigorously follows the laws of classical mechanics. Even
though the motion is fully deterministic, and follows the laws of classical mechanics, the motion is extremely
sensitive to the initial conditions and the non-linearities can lead to chaos. Computer modelling is the only
viable approach for predicting the behavior of such non-linear systems. The complexity of solving non-linear
equations is the reason that this book will continue to consider only linear systems. Fortunately, in nature,
non-linear systems can be approximately linear when the small-amplitude assumption is applicable.

Workshop exercises
1. Consider the chaotic motion of the driven damped pendulum whose equation of motion is given by

$$\ddot{\theta} + \Gamma \dot{\theta} + \omega_0^2 \sin\theta = \gamma \omega_0^2 \cos \omega t$$

for which the Lyapunov exponent is $\lambda = 1$ with time measured in units of the drive period.

(a) Assume that you need to predict $\theta(t)$ with accuracy of $10^{-2}$, and that the initial value
$\theta(0)$ is known to within $10^{-6}$. What is the maximum time horizon $t_{max}$ for which you can predict
$\theta(t)$ to within the required accuracy?
(b) Suppose that you manage to improve the accuracy of the initial value to $10^{-9}$ (that is, a
thousand-fold improvement). What is the time horizon now for achieving the accuracy of $10^{-2}$?
(c) By what factor has $t_{max}$ improved with the 1000-fold improvement in initial measurement?
(d) What does this imply regarding long-term predictions of chaotic motion?
2. A non-linear oscillator satisfies the equation $\ddot{x} + \dot{x}^3 + x = 0$. Find the polar equations for
the motion in the state-space diagram. Show that any trajectory that starts within the circle $r < 1$ encircles
the origin infinitely many times in the clockwise direction. Show further that these trajectories in state space
terminate at the origin.

3. Consider the system of a mass suspended between two identical springs as shown.

If each spring is stretched a distance $d$ to attach the mass at the equilibrium position, the mass is subject
to two equal and oppositely directed forces of magnitude $kd$. Ignore gravity. Show that the potential in which
the mass moves is approximately
$$U(x) = \left\{\frac{kd}{l}\right\} x^2 + \left\{\frac{k(l - d)}{4 l^3}\right\} x^4$$
Construct a state-space diagram for this potential.

Problems
1. A non-linear oscillator satisfies the equation

$$\ddot{x} + (x^2 + \dot{x}^2 - 1)\dot{x} + x = 0$$

Find the polar equations for the motion in the state-space diagram. Show that any trajectory that starts in
the domain $1 < r < \sqrt{3}$ spirals clockwise and tends to the limit cycle $r = 1$. [The same is true of
trajectories that start in the domain $0 < r < 1$.] What is the period of the limit cycle?

2. A mass $m$ moves in one direction and is subject to a constant force $+F_0$ when $x < 0$ and to a constant
force $-F_0$ when $x > 0$. Describe the motion by constructing a state space diagram. Calculate the period of the
motion in terms of $m$, $F_0$ and the amplitude $A$. Disregard damping.

3. Investigate the motion of an undamped mass subject to a force of the form

$$F(x) = \begin{cases} -kx & |x| < a \\ -(k + \delta)x + \delta a & |x| > a \end{cases}$$
Chapter 5

Calculus of variations

5.1 Introduction
The prior chapters have focussed on the intuitive Newtonian approach to classical mechanics, which is based
on vector quantities like force, momentum, and acceleration. Newtonian mechanics leads to second-order
differential equations of motion. The calculus of variations underlies a powerful alternative approach to
classical mechanics that is based on identifying the path that minimizes an integral quantity. This integral
variational approach was first championed by Gottfried Wilhelm Leibniz, contemporaneously with Newton’s
development of the differential approach to classical mechanics.
During the 18th century, Bernoulli, who was a student of Leibniz, developed the field of variational
calculus which underlies the integral variational approach to mechanics. He solved the brachistochrone
problem which involves finding the path for which the transit time between two points is the shortest. The
integral variational approach also underlies Fermat’s principle in optics, which can be used to derive that
the angle of reflection equals the angle of incidence, as well as derive Snell’s law. Other applications of the
calculus of variations include solving the catenary problem, finding the maximum and minimum distances
between two points on a surface, polygon shapes having the maximum ratio of enclosed area to perimeter,
or maximizing profit in economics. Bernoulli developed the principle of virtual work used to describe
equilibrium in static systems, and d’Alembert extended the principle of virtual work to dynamical systems.
Euler, the preeminent Swiss mathematician of the 18th century and a student of Bernoulli, developed the
calculus of variations with full mathematical rigor. The development of the Lagrangian variational approach
to classical mechanics culminated in the work of Lagrange (1736-1813), who was a student of Euler.
The Euler-Lagrangian approach to classical mechanics stems from a deep philosophical belief that the
laws of nature are based on the principle of economy. That is, the physical universe follows paths through
space and time that are based on extrema principles. The standard Lagrangian $L$ is defined as the difference
between the kinetic and potential energy, that is

$$L = T - U \tag{5.1}$$

Chapters 6 through 9 will show that the laws of classical mechanics can be expressed in terms of Hamilton’s
variational principle which states that the motion of the system between the initial time $t_1$ and final time
$t_2$ follows a path that minimizes the scalar action integral $S$ defined as the time integral of the Lagrangian.
$$S = \int_{t_1}^{t_2} L \, dt \tag{5.2}$$

The calculus of variations provides the mathematics required to determine the path that minimizes the
action integral. This variational approach is both elegant and beautiful, and has withstood the rigors of
experimental confirmation. In fact, not only is it an exceedingly powerful alternative approach to the intuitive
Newtonian approach in classical mechanics, but Hamilton’s variational principle now is recognized to be more
fundamental than Newton’s Laws of Motion. The Lagrangian and Hamiltonian variational approaches to
mechanics are the only approaches that can handle the Theory of Relativity, statistical mechanics, and the
dichotomy of philosophical approaches to quantum physics.

112 CHAPTER 5. CALCULUS OF VARIATIONS

5.2 Euler’s differential equation

The calculus of variations, presented here, underlies the powerful variational approaches that were developed
for classical mechanics. Variational calculus, developed for classical mechanics, now has become an essential
approach to many other disciplines in science, engineering, economics, and medicine.
For the special case of one dimension, the calculus of variations reduces to varying the function $y(x)$ such
that the scalar functional $F$ is an extremum, that is, it is a maximum or minimum, where

$$F = \int_{x_1}^{x_2} f[y(x), y'(x); x] \, dx \tag{5.3}$$

Here $x$ is the independent variable, $y(x)$ the dependent variable, plus its first derivative $y' \equiv \frac{dy}{dx}$. The quantity
$f[y(x), y'(x); x]$ has some given dependence on $y$, $y'$ and $x$. The calculus of variations involves varying the
function $y(x)$ until a stationary value of $F$ is found, which is presumed to be an extremum. This means that
if a function $y = y(x)$ gives a minimum value for the scalar functional $F$, then any neighboring function, no
matter how close to $y(x)$, must increase $F$. For all paths, the integral $F$ is taken between two fixed points,
$(x_1, y_1)$ and $(x_2, y_2)$. Possible paths between the initial and final points are illustrated in figure 5.1. Relative to
any neighboring path, the functional $F$ must have a stationary value which is presumed to be the correct
extremum path.
Define a neighboring function using a parametric representation $y(\epsilon, x)$ such that for $\epsilon = 0$, $y = y(0, x) =
y(x)$ is the function that yields the extremum for $F$. Assume that an infinitesimally small fraction $\epsilon$ of the
neighboring function $\eta(x)$ is added to the extremum path $y(x)$. That is, assume

$$y(\epsilon, x) = y(0, x) + \epsilon \eta(x) \tag{5.4}$$

$$y'(\epsilon, x) \equiv \frac{dy(\epsilon, x)}{dx} = \frac{dy(0, x)}{dx} + \epsilon \frac{d\eta}{dx}$$

where it is assumed that the extremum function $y(0, x)$ and the auxiliary function $\eta(x)$ are well behaved
functions of $x$ with continuous first derivatives, and where $\eta(x)$ vanishes at $x_1$ and $x_2$ because, for all possible
paths, the function $y(\epsilon, x)$ must be identical with $y(x)$ at the end points of the path, i.e. $\eta(x_1) = \eta(x_2) = 0$.
The situation is depicted in figure 5.1. For any such parametric family of curves, $F$ can be expressed as
a function of $\epsilon$

$$F(\epsilon) = \int_{x_1}^{x_2} f[y(\epsilon, x), y'(\epsilon, x); x] \, dx \tag{5.5}$$

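The condition that the integral has a stationary (extremum) value is that $F$ be independent of $\epsilon$ to first
order along the path. That is, the extremum value occurs for $\epsilon = 0$ where

$$\left( \frac{\partial F}{\partial \epsilon} \right)_{\epsilon = 0} = 0 \tag{5.6}$$

for all functions $\eta(x)$. This is illustrated on the right side of figure 5.1.
Applying condition (5.6) to equation (5.5), and since $x$ is independent of $\epsilon$, then

$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial y} \frac{\partial y}{\partial \epsilon} + \frac{\partial f}{\partial y'} \frac{\partial y'}{\partial \epsilon} \right) dx = 0 \tag{5.7}$$

Since the limits of integration are fixed, the differential operation affects only the integrand. From equations
(5.4),

$$\frac{\partial y}{\partial \epsilon} = \eta(x) \tag{5.8}$$

and

$$\frac{\partial y'}{\partial \epsilon} = \frac{d\eta}{dx} \tag{5.9}$$

Consider the second term in the integrand

$$\int_{x_1}^{x_2} \frac{\partial f}{\partial y'} \frac{\partial y'}{\partial \epsilon} \, dx = \int_{x_1}^{x_2} \frac{\partial f}{\partial y'} \frac{d\eta}{dx} \, dx \tag{5.10}$$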
Figure 5.1: The left shows the extremum $y(x)$ and neighboring paths $y(\epsilon, x) = y(x) + \epsilon \eta(x)$ between $(x_1, y_1)$
and $(x_2, y_2)$ that minimizes the function $F = \int_{x_1}^{x_2} f[y(x), y'(x); x] \, dx$. The right shows the dependence of $F$
as a function of the admixture coefficient $\epsilon$ for a maximum (upper) or a minimum (lower) at $\epsilon = 0$.

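Integrate by parts

$$\int u \, dv = uv - \int v \, du \tag{5.11}$$

gives

$$\int_{x_1}^{x_2} \frac{\partial f}{\partial y'} \frac{d\eta}{dx} \, dx = \left[ \frac{\partial f}{\partial y'} \eta(x) \right]_{x_1}^{x_2} - \int_{x_1}^{x_2} \eta(x) \frac{d}{dx} \left( \frac{\partial f}{\partial y'} \right) dx \tag{5.12}$$

Note that the first term on the right-hand side is zero since by definition $\frac{\partial y}{\partial \epsilon} = \eta(x) = 0$ at $x_1$ and $x_2$. Thus

$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial y} \frac{\partial y}{\partial \epsilon} + \frac{\partial f}{\partial y'} \frac{\partial y'}{\partial \epsilon} \right) dx = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial y} \eta(x) - \eta(x) \frac{d}{dx} \left( \frac{\partial f}{\partial y'} \right) \right) dx$$

Thus equation 5.7 reduces to

$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial y} - \frac{d}{dx} \frac{\partial f}{\partial y'} \right) \eta(x) \, dx \tag{5.13}$$

The function $F$ will be an extremum if it is stationary at $\epsilon = 0$. That is,

$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2} \left( \frac{\partial f}{\partial y} - \frac{d}{dx} \frac{\partial f}{\partial y'} \right) \eta(x) \, dx = 0 \tag{5.14}$$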

This integral now appears to be independent of $\epsilon$. However, the functions $y$ and $y'$ occurring in the derivatives
are functions of $\epsilon$. Since $\left( \frac{\partial F}{\partial \epsilon} \right)_{\epsilon = 0}$ must vanish for a stationary value, and because $\eta(x)$ is an arbitrary function
subject to the conditions stated, then the above integrand must be zero. This derivation that the integrand
must be zero leads to Euler’s differential equation

$$\frac{\partial f}{\partial y} - \frac{d}{dx} \frac{\partial f}{\partial y'} = 0 \tag{5.15}$$

where $y$ and $y'$ are the original functions, independent of $\epsilon$. The basis of the calculus of variations is that the
function $y(x)$ that satisfies Euler’s equation is a stationary function. Note that the stationary value could
be either a maximum or a minimum value. When Euler’s equation is applied to mechanical systems using
the Lagrangian as the functional, then Euler’s differential equation is called the Euler-Lagrange equation.
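A small numerical check can make the stationarity condition concrete. The sketch below is an illustration only and is not taken from the text: it assumes the integrand $f = y'^2 + y^2$ on $0 \le x \le 1$ with $y(0) = 0$ and $y(1) = 1$, for which Euler's equation reduces to $y'' = y$ with solution $y = \sinh x / \sinh 1$, and it assumes the auxiliary function $\eta(x) = \sin(\pi x)$. The function name `functional` and the midpoint quadrature are hypothetical choices.

```python
import math

def functional(eps, n=2000):
    """Midpoint-rule estimate of F(eps) = integral of (y'^2 + y^2) dx on [0, 1]
    for the varied path y(eps, x) = y(0, x) + eps * eta(x)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        # Assumed extremum path: solution of y'' = y with y(0) = 0, y(1) = 1.
        y0 = math.sinh(x) / math.sinh(1.0)
        dy0 = math.cosh(x) / math.sinh(1.0)
        # Assumed auxiliary function eta(x) = sin(pi x); it vanishes at both ends.
        eta = math.sin(math.pi * x)
        deta = math.pi * math.cos(math.pi * x)
        y = y0 + eps * eta
        dy = dy0 + eps * deta
        total += (dy * dy + y * y) * h
    return total

step = 1e-3
# Central-difference estimate of dF/d(eps) at eps = 0; it should be close to zero.
slope_at_zero = (functional(step) - functional(-step)) / (2 * step)
# Change in F for a finite admixture; it should be positive for a minimum.
increase = functional(0.1) - functional(0.0)
print(slope_at_zero, increase)
```

Under these assumptions the slope at $\epsilon = 0$ vanishes to within the quadrature error, while any finite admixture of $\eta(x)$ raises $F$, which is the behaviour sketched for a minimum in figure 5.1.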

5.3 Applications of Euler’s equation

5.1 Example: Shortest distance between two points

Consider a path that lies in the $x$-$y$ plane. The infinitesimal length of arc is

$$ds = \sqrt{dx^2 + dy^2} = \left[ \sqrt{1 + \left( \frac{dy}{dx} \right)^2} \right] dx$$

Then the length of the arc is

$$J = \int_{1}^{2} ds = \int_{x_1}^{x_2} \left[ \sqrt{1 + \left( \frac{dy}{dx} \right)^2} \right] dx$$

The function $f$ is

$$f = \sqrt{1 + (y')^2}$$

Therefore

$$\frac{\partial f}{\partial y} = 0$$

and

$$\frac{\partial f}{\partial y'} = \frac{y'}{\sqrt{1 + (y')^2}}$$

Inserting these into Euler’s equation 5.15 gives

$$0 + \frac{d}{dx} \left( \frac{y'}{\sqrt{1 + (y')^2}} \right) = 0$$

Figure: Shortest distance between two points in a plane.

that is

$$\frac{y'}{\sqrt{1 + (y')^2}} = \text{constant} = C$$

This is valid if

$$y' = \frac{C}{\sqrt{1 - C^2}} = a$$

Therefore

$$y = ax + b$$
which is the equation of a straight line in the plane. Thus the shortest path between two points in a plane is
a straight line between these points, as is intuitively obvious. This stationary value obviously is a minimum.
This trivial example of the use of Euler’s equation to determine an extremum value has given the obvious
answer. It has been presented here because it provides a proof that a straight line is the shortest distance in
a plane and illustrates the power of the calculus of variations to determine extremum paths.

5.2 Example: Brachistochrone problem

The Brachistochrone problem involves finding the path having the minimum transit time between two
points. The Brachistochrone problem stimulated the development of the calculus of variations by Johann
Bernoulli and Euler. For simplicity, take the case of frictionless motion in the $x$-$y$ plane with a uniform
gravitational field acting in the $\hat{y}$ direction, as shown in the adjacent figure. The question is what
constrained path will result in the minimum transit time between two points $(x_1, y_1)$ and $(x_2, y_2)$.

Consider that the particle of mass $m$ starts at the origin $x_1 = 0$, $y_1 = 0$ with zero velocity. Since the
problem conserves energy and assuming that initially $E = KE + PE = 0$ then

$$\frac{1}{2} m v^2 - m g y = 0$$

That is

$$v = \sqrt{2 g y}$$

The transit time is given by

$$t = \int_{1}^{2} \frac{ds}{v} = \int_{1}^{2} \frac{\sqrt{dx^2 + dy^2}}{\sqrt{2 g y}} = \int_{y_1}^{y_2} \sqrt{\frac{(1 + x'^2)}{2 g y}} \, dy$$

where $x' \equiv \frac{dx}{dy}$. Note that, in this example, the independent variable has been chosen to be $y$ and the dependent
variable is $x(y)$.
The function $f$ of the integral is

$$f = \frac{1}{\sqrt{2g}} \sqrt{\frac{(1 + x'^2)}{y}}$$

Factor out the constant $\sqrt{2g}$ term, which does not affect the final equation, and note that

$$\frac{\partial f}{\partial x} = 0$$

$$\frac{\partial f}{\partial x'} = \frac{x'}{\sqrt{y \left( 1 + (x')^2 \right)}}$$

Therefore Euler’s equation gives

$$0 + \frac{d}{dy} \left( \frac{x'}{\sqrt{y \left( 1 + (x')^2 \right)}} \right) = 0$$

or

$$\frac{x'}{\sqrt{y \left( 1 + (x')^2 \right)}} = \text{constant} = \frac{1}{\sqrt{2a}}$$

That is

$$\frac{x'^2}{y \left( 1 + (x')^2 \right)} = \frac{1}{2a}$$

This may be rewritten as

$$x = \int_{y_1}^{y_2} \frac{y \, dy}{\sqrt{2 a y - y^2}}$$

Changing the variable to $y = a(1 - \cos\theta)$ gives $dy = a \sin\theta \, d\theta$, leading to the integral

$$x = \int a (1 - \cos\theta) \, d\theta$$

or

$$x = a(\theta - \sin\theta) + \text{constant}$$

Figure: The Brachistochrone problem involves finding the path for the minimum transit time for constrained
frictionless motion in a uniform gravitational field.

The parametric equations for a cycloid passing through the origin are

$$x = a(\theta - \sin\theta)$$

$$y = a(1 - \cos\theta)$$

which is the form of the solution found. That is, the shortest time between two points is obtained by constraining
the motion of the mass to follow a cycloid shape. Thus the mass first accelerates rapidly by falling
down steeply and then follows the curve and coasts upward at the end. The elapsed time is obtained by
inserting the above parametric relations for $x$ and $y$ in terms of $\theta$ into the transit time integral giving
$t = \sqrt{\frac{a}{g}} \, \theta$ where $a$ and $\theta$ are fixed by the end point coordinates. Thus the time to fall from starting with zero
velocity at the cusp to the minimum of the cycloid is $\pi \sqrt{\frac{a}{g}}$. If $y_2 = y_1 = 0$ then $x_2 = 2\pi a$ which defines the
shape of the cycloid and the minimum time is $2\pi \sqrt{\frac{a}{g}} = \sqrt{\frac{2\pi x_2}{g}}$. If the mass starts with a non-zero initial
velocity, then the starting point is not at the cusp of the cycloid, but down a distance $d$ such that the kinetic
energy equals the potential energy difference from the cusp.
A modern application of the Brachistochrone problem is determination of the optimum shape of the low-
friction emergency chute that passengers slide down to evacuate a burning aircraft. Bernoulli solved the
problem of rapid evacuation of an aircraft two centuries before the first flight of a powered aircraft.
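The transit time along the cycloid can be compared numerically with that along a straight incline joining the same end points. The following is a minimal sketch under stated assumptions: the values of `g` and `a`, the function names, and the choice of the cycloid minimum $(\pi a, 2a)$ as the end point are illustrative and do not come from the text.

```python
import math

g = 9.81   # gravitational acceleration in m/s^2 (assumed value)
a = 1.0    # cycloid parameter in metres (assumed value)

def cycloid_time(theta_end, n=1000):
    """Midpoint-rule estimate of the transit time along the cycloid
    x = a(theta - sin theta), y = a(1 - cos theta), starting at rest at the cusp."""
    h = theta_end / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        c = 1.0 - math.cos(theta)
        ds = a * math.sqrt(2.0 * c) * h      # arc-length element for this step
        v = math.sqrt(2.0 * g * a * c)       # speed from energy conservation, v = sqrt(2 g y)
        total += ds / v
    return total

def straight_line_time(x_end, y_end):
    """Time to slide from rest down a straight frictionless incline to (x_end, y_end),
    with y measured downward."""
    length = math.hypot(x_end, y_end)
    accel = g * y_end / length               # component of g along the incline
    return math.sqrt(2.0 * length / accel)

# End point: the minimum of the cycloid, reached at theta = pi.
t_cycloid = cycloid_time(math.pi)
t_line = straight_line_time(math.pi * a, 2.0 * a)
print(t_cycloid, math.pi * math.sqrt(a / g), t_line)
```

With these assumed values the numerical cycloid time agrees with $\pi \sqrt{a/g}$ and is shorter than the straight-line time, even though the cycloid is the longer path.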

5.3 Example: Minimal travel cost

Assume that the cost of flying an aircraft at height $y$ is $e^{-ky}$ per unit distance of flight-path, where $k$ is a
positive constant. Consider that the aircraft flies in the $(x, y)$-plane from the point $(-a, 0)$ to the point $(a, 0)$
where $y = 0$ corresponds to ground level, and where the $y$-axis points vertically upwards. Find the extremal
for the problem of minimizing the total cost of the journey.
The differential arc-length element of the flight path $ds$ can be written as

$$ds = \sqrt{dx^2 + dy^2} = \sqrt{1 + y'^2} \, dx$$

where $y' \equiv \frac{dy}{dx}$. Thus the cost integral to be minimized is

$$C = \int_{-a}^{+a} e^{-ky} \, ds = \int_{-a}^{+a} e^{-ky} \sqrt{1 + y'^2} \, dx$$

The function of this integral is

$$f = e^{-ky} \sqrt{1 + y'^2}$$

The partial differentials required for the Euler equations are

$$\frac{d}{dx} \frac{\partial f}{\partial y'} = \frac{y'' e^{-ky}}{\sqrt{1 + y'^2}} - \frac{k y'^2 e^{-ky}}{\sqrt{1 + y'^2}} - \frac{y'' y'^2 e^{-ky}}{(1 + y'^2)^{3/2}}$$

$$\frac{\partial f}{\partial y} = -k e^{-ky} \sqrt{1 + y'^2}$$

Therefore Euler’s equation equals

$$\frac{\partial f}{\partial y} - \frac{d}{dx} \frac{\partial f}{\partial y'} = -k e^{-ky} \sqrt{1 + y'^2} - \frac{y'' e^{-ky}}{\sqrt{1 + y'^2}} + \frac{k y'^2 e^{-ky}}{\sqrt{1 + y'^2}} + \frac{y'' y'^2 e^{-ky}}{(1 + y'^2)^{3/2}} = 0$$

This can be simplified by multiplying by $e^{ky} (1 + y'^2)^{3/2}$ to give

$$-k - 2k y'^2 - k y'^4 - y'' - y'' y'^2 + k y'^2 + k y'^4 + y'' y'^2 = 0$$

Cancelling terms gives

$$y'' + k \left( 1 + y'^2 \right) = 0$$

Separating the variables leads to

$$\arctan y' = \int \frac{dy'}{y'^2 + 1} = -k \int dx = -kx + c_1$$

Integration gives

$$y(x) = \int_{-a}^{x} y' \, dx = \int_{-a}^{x} \tan(c_1 - kx) \, dx = \frac{\ln(\cos(c_1 - kx)) - \ln(\cos(c_1 + ka))}{k} + c_2 = \frac{1}{k} \ln\left( \frac{\cos(c_1 - kx)}{\cos(c_1 + ka)} \right) + c_2$$

Using the initial condition that $y(-a) = 0$ gives $c_2 = 0$. Similarly the final condition $y(a) = 0$ implies that
$c_1 = 0$. Thus Euler’s equation has determined that the optimal trajectory that minimizes the cost integral $C$
is

$$y(x) = \frac{1}{k} \ln\left( \frac{\cos(kx)}{\cos(ka)} \right)$$
This example is typical of problems encountered in economics.
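The extremal $y(x) = \frac{1}{k} \ln\left( \frac{\cos(kx)}{\cos(ka)} \right)$ can be checked numerically. The sketch below is illustrative only: the values of `k` and `a`, the function names, and the two comparison paths (level flight at $y = 0$ and an arbitrary parabolic arc) are assumptions, and the derivatives are estimated by finite differences.

```python
import math

k = 1.0   # assumed cost-decay constant
a = 1.0   # assumed half-range; the solution requires k * a < pi / 2

def y_opt(x):
    """Extremal from the example: y = (1/k) ln(cos(kx) / cos(ka))."""
    return math.log(math.cos(k * x) / math.cos(k * a)) / k

def cost(path, n=4000):
    """Midpoint-rule estimate of the integral of exp(-k y) sqrt(1 + y'^2) dx,
    with y' taken from a central finite difference."""
    h = 2.0 * a / n
    d = 1e-6
    total = 0.0
    for i in range(n):
        x = -a + (i + 0.5) * h
        y = path(x)
        dy = (path(x + d) - path(x - d)) / (2.0 * d)
        total += math.exp(-k * y) * math.sqrt(1.0 + dy * dy) * h
    return total

def ode_residual(x, d=1e-4):
    """Finite-difference estimate of y'' + k (1 + y'^2), which should vanish."""
    d1 = (y_opt(x + d) - y_opt(x - d)) / (2.0 * d)
    d2 = (y_opt(x + d) - 2.0 * y_opt(x) + y_opt(x - d)) / (d * d)
    return d2 + k * (1.0 + d1 * d1)

cost_opt = cost(y_opt)
cost_level = cost(lambda x: 0.0)                        # flying at ground level
cost_parabola = cost(lambda x: 0.3 * (a * a - x * x))   # an arbitrary comparison arc
print(cost_opt, cost_level, cost_parabola, ode_residual(0.5))
```

For these assumed values the extremal costs less than either comparison path, and the residual of $y'' + k(1 + y'^2) = 0$ is zero to within the finite-difference error.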

5.4 Selection of the independent variable

A wide selection of variables can be chosen as the independent variable for variational calculus. The derivation
of Euler’s equation and example 5.1 both assumed that the independent variable is $x$ whereas example
5.2 used $y$ as the independent variable, example 5.3 used $x$, and Lagrange mechanics uses time $t$ as the
independent variable. Selection of which variable to use as the independent variable does not change the
physics of a problem, but some selections can simplify the mathematics for obtaining an analytic solution.
The following example of a cylindrically-symmetric soap-bubble surface formed by blowing a soap bubble that
stretches between two circular hoops illustrates the importance of selecting the independent variable carefully.

5.4 Example: Surface area of a cylindrically-symmetric soap bubble

Consider a cylindrically-symmetric soap-bubble surface formed by blowing a soap bubble that stretches between two
circular hoops. The surface energy, that results from the surface tension of the soap bubble, is minimized when the surface
area of the bubble is minimized. Assume that the axes of the two hoops lie along the $z$ axis as shown in the adjacent figure.
It is intuitively obvious that the soap bubble having the minimum surface area that is bounded by the two hoops will have
a circular cross section that is concentric with the symmetry axis, and the radius will be smaller between the two hoops.
Therefore, intuition can be used to simplify the problem to finding the shape of the contour of revolution around the axis
of symmetry that defines the shape of the surface of minimum surface area. Use cylindrical coordinates $(\rho, \theta, z)$ and assume
that hoop 1 at $z_1$ has radius $\rho_1$ and hoop 2 at $z_2$ has radius $\rho_2$. Consider the cases where either $\rho$, or $z$, are selected to
be the independent variable.

Figure: Cylindrically-symmetric surface formed by rotation about the $z$ axis of a soap bubble suspended between two
identical hoops centred on the $z$ axis.

The differential arc-length element of the circular annulus at constant $\theta$ between $z$ and $z + dz$ is given by
$ds = \sqrt{dz^2 + d\rho^2}$. Therefore the area of the infinitesimal circular annulus is $dS = 2\pi\rho \, ds$ which can be integrated
to give the area of the surface $S$ of the soap bubble bounded by the two circular hoops as

$$S = 2\pi \int_{1}^{2} \rho \sqrt{d\rho^2 + dz^2}$$

Independent variable $z$
Assuming that $z$ is the independent variable, then the surface area can be written as

$$S = 2\pi \int_{z_1}^{z_2} \rho \sqrt{1 + \left( \frac{d\rho}{dz} \right)^2} \, dz = 2\pi \int_{z_1}^{z_2} \rho \sqrt{1 + \rho'^2} \, dz$$

where $\rho' \equiv \frac{d\rho}{dz}$. The function of the surface integral is $f = \rho \sqrt{1 + \rho'^2}$. The derivatives are

$$\frac{\partial f}{\partial \rho} = \sqrt{1 + \rho'^2}$$

and

$$\frac{\partial f}{\partial \rho'} = \frac{\rho \rho'}{\sqrt{1 + (\rho')^2}}$$

Therefore Euler’s equation gives

$$\frac{d}{dz} \left( \frac{\rho \rho'}{\sqrt{1 + (\rho')^2}} \right) - \sqrt{1 + \rho'^2} = 0$$

This is not an easy equation to solve.

Independent variable $\rho$
Consider the case where the independent variable is chosen to be $\rho$, then the surface integral can be written
as

$$S = 2\pi \int_{\rho_1}^{\rho_2} \rho \sqrt{1 + \left( \frac{dz}{d\rho} \right)^2} \, d\rho = 2\pi \int_{\rho_1}^{\rho_2} \rho \sqrt{1 + z'^2} \, d\rho$$

where $z' \equiv \frac{dz}{d\rho}$. Thus the function of the surface integral is $f = \rho \sqrt{1 + z'^2}$. The derivatives are

$$\frac{\partial f}{\partial z} = 0$$

and

$$\frac{\partial f}{\partial z'} = \frac{\rho z'}{\sqrt{1 + (z')^2}}$$

Therefore Euler’s equation gives

$$0 + \frac{d}{d\rho} \left( \frac{\rho z'}{\sqrt{1 + (z')^2}} \right) = 0$$

That is

$$\frac{\rho z'}{\sqrt{1 + (z')^2}} = a$$

where $a$ is a constant. This can be rewritten as

$$z'^2 \left( \rho^2 - a^2 \right) = a^2$$

or

$$z' = \frac{dz}{d\rho} = \frac{a}{\sqrt{\rho^2 - a^2}}$$

The integral of this is

$$z = a \cosh^{-1}\left( \frac{\rho}{a} \right) + b$$

That is

$$\rho = a \cosh\left( \frac{z - b}{a} \right)$$

which is the equation of a catenary. The catenary is the shape of a uniform flexible cable hung in a uniform
gravitational field. The constants $a$ and $b$ are given by the end points. The physics of the solution must be
identical for either choice of independent variable. However, mathematically one case is easier to solve than
the other because, in the latter case, one term in Euler’s equation is zero.
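The catenary solution can be compared numerically with a simpler trial surface. The sketch below is an illustration under stated assumptions: two identical hoops of radius `R` at $z = \pm h$ (so the offset constant is zero by symmetry), a bisection bracket chosen by hand to pick the wider-necked root, and a straight cylinder as the comparison surface. None of these specific values or names come from the text.

```python
import math

R = 1.0   # hoop radius (assumed value)
h = 0.5   # hoops located at z = -h and z = +h (assumed value)

def hoop_mismatch(a):
    """Difference between the catenary radius at z = h and the hoop radius,
    for the symmetric catenary rho = a cosh(z / a)."""
    return a * math.cosh(h / a) - R

# Bisection for the catenary constant. The bracket [0.5, 1.0] is an assumption
# chosen so that hoop_mismatch changes sign and the wider-necked root is selected.
lo, hi = 0.5, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if hoop_mismatch(lo) * hoop_mismatch(mid) <= 0.0:
        hi = mid
    else:
        lo = mid
a = 0.5 * (lo + hi)

def surface_area(rho, drho, n=2000):
    """Midpoint-rule estimate of S = 2 pi * integral of rho sqrt(1 + rho'^2) dz."""
    dz = 2.0 * h / n
    total = 0.0
    for i in range(n):
        z = -h + (i + 0.5) * dz
        total += rho(z) * math.sqrt(1.0 + drho(z) ** 2) * dz
    return 2.0 * math.pi * total

area_catenary = surface_area(lambda z: a * math.cosh(z / a), lambda z: math.sinh(z / a))
area_cylinder = surface_area(lambda z: R, lambda z: 0.0)
print(a, area_catenary, area_cylinder)
```

For this assumed geometry the surface of revolution of the catenary has a smaller area than the cylinder joining the same hoops, consistent with the radius being smaller between the hoops.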

5.5 Functions with several independent variables $y_i(x)$

The discussion has focussed on systems having only a single function $y(x)$ such that the functional is an
extremum. It is more common to have a functional that is dependent upon several independent variables
$f[y_1(x), y_1'(x), y_2(x), y_2'(x), \ldots; x]$ which can be written as

$$F = \int_{x_1}^{x_2} \sum_{i=1}^{N} f[y_i(x), y_i'(x); x] \, dx \tag{5.16}$$

where $i = 1, 2, 3, \ldots, N$.
By analogy with the one dimensional problem, define neighboring functions $\eta_i$ for each variable. Then

$$y_i(\epsilon, x) = y_i(0, x) + \epsilon \eta_i(x) \tag{5.17}$$

$$y_i'(\epsilon, x) \equiv \frac{dy_i(\epsilon, x)}{dx} = \frac{dy_i(0, x)}{dx} + \epsilon \frac{d\eta_i}{dx}$$

where $\eta_i$ are independent functions of $x$ that vanish at $x_1$ and $x_2$. Using equations 5.12 and 5.17 leads to
the requirements for an extremum value to be

$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2} \sum_{i}^{N} \left( \frac{\partial f}{\partial y_i} \frac{\partial y_i}{\partial \epsilon} + \frac{\partial f}{\partial y_i'} \frac{\partial y_i'}{\partial \epsilon} \right) dx = \int_{x_1}^{x_2} \sum_{i}^{N} \left( \frac{\partial f}{\partial y_i} - \frac{d}{dx} \frac{\partial f}{\partial y_i'} \right) \eta_i(x) \, dx = 0 \tag{5.18}$$

If the variables $y_i(x)$ are independent, then the $\eta_i(x)$ are independent. Since the $\eta_i(x)$ are independent,
then evaluating the above equation at $\epsilon = 0$ implies that each term in the bracket must vanish independently.
That is, Euler’s differential equation becomes a set of $N$ equations for the $N$ independent variables

$$\frac{\partial f}{\partial y_i} - \frac{d}{dx} \frac{\partial f}{\partial y_i'} = 0 \tag{5.19}$$

where $i = 1, 2, 3, \ldots, N$. Thus, each of the $N$ equations can be solved independently when the $N$ variables are
independent. Note that Euler’s equation involves partial derivatives for the dependent variables $y_i$, $y_i'$ and
the total derivative for the independent variable $x$.

5.5 Example: Fermat’s Principle

In 1662 Fermat proposed that the propagation of light obeyed the generalized principle of least transit time.
In optics, Fermat’s principle, or the principle of least time, is the principle that the path taken between two
points by a ray of light is the path that can be traversed in the least time. Historically, the proof of Fermat’s
principle by Johann Bernoulli was one of the first triumphs of the calculus of variations, and served as a guiding
principle in the formulation of physical laws using variational calculus.
Consider the geometry shown in the figure, where the light travels from the point $P_1(0, y_1, 0)$ to the point
$P_2(x_2, -y_2, 0)$. The light beam intersects a plane glass interface at the point $O(x, 0, z)$.

Figure: Light incident upon a plane glass interface in the $(x, z)$ plane at $y = 0$.

The French mathematician Fermat discovered that the required path travelled by light is the path for which
the travel time $t$ is a minimum. That is, the transit time from the initial point $P_1$ to the final point $P_2$ is
given by

$$t = \int_{P_1}^{P_2} dt = \int_{P_1}^{P_2} \frac{ds}{v} = \frac{1}{c} \int_{P_1}^{P_2} n \, ds = \frac{1}{c} \int_{y_1}^{-y_2} n(x, y, z) \sqrt{1 + (x')^2 + (z')^2} \, dy$$

assuming that the velocity of light in any medium is given by $v = c/n$ where $n$ is the refractive index of the
medium and $c$ is the velocity of light in vacuum.

This is a problem that has two dependent variables $x(y)$ and $z(y)$ with $y$ chosen as the independent
variable. The integral can be broken into two parts $y_1 \to 0$ and $0 \to -y_2$

$$t = \frac{1}{c} \left[ \int_{y_1}^{0} n_1 \sqrt{1 + (x')^2 + (z')^2} \, dy + \int_{0}^{-y_2} n_2 \sqrt{1 + (x')^2 + (z')^2} \, dy \right]$$

The functionals are functions of $x'$ and $z'$ but not $x$ or $z$. Thus Euler’s equation for $z$ simplifies to

$$0 + \frac{d}{dy} \left( \frac{1}{c} \left( \frac{n_1 z'}{\sqrt{1 + x'^2 + z'^2}} + \frac{n_2 z'}{\sqrt{1 + x'^2 + z'^2}} \right) \right) = 0$$

This implies that $z' = 0$, therefore $z$ is a constant. Since the initial and final values were chosen to be
$z_1 = z_2 = 0$, therefore at the interface $z = 0$. Similarly Euler’s equations for $x$ are

$$0 + \frac{d}{dy} \left( \frac{1}{c} \left( \frac{n_1 x'}{\sqrt{1 + x'^2 + z'^2}} + \frac{n_2 x'}{\sqrt{1 + x'^2 + z'^2}} \right) \right) = 0$$

But $x' = \tan\theta_1$ for $n_1$ and $x' = -\tan\theta_2$ for $n_2$ and it was shown that $z' = 0$. Thus

$$0 + \frac{d}{dy} \left( \frac{1}{c} \left( \frac{n_1 \tan\theta_1}{\sqrt{1 + (\tan\theta_1)^2}} - \frac{n_2 \tan\theta_2}{\sqrt{1 + (\tan\theta_2)^2}} \right) \right) = \frac{d}{dy} \left( \frac{1}{c} (n_1 \sin\theta_1 - n_2 \sin\theta_2) \right) = 0$$

Therefore $\frac{1}{c}(n_1 \sin\theta_1 - n_2 \sin\theta_2)$ = constant which must be zero since when $n_1 = n_2$ then $\theta_1 = \theta_2$. Thus
Fermat’s principle leads to Snell’s Law.

$$n_1 \sin\theta_1 = n_2 \sin\theta_2$$

The geometry of this problem is simple enough to directly minimize the path rather than using Euler’s
equations for the two parameters as performed above. The lengths of the paths $P_1 O$ and $O P_2$ are

$$P_1 O = \sqrt{x^2 + y_1^2 + z^2}$$

$$O P_2 = \sqrt{(x_2 - x)^2 + y_2^2 + z^2}$$

The total transit time is given by

$$t = \frac{1}{c} \left( n_1 \sqrt{x^2 + y_1^2 + z^2} + n_2 \sqrt{(x_2 - x)^2 + y_2^2 + z^2} \right)$$

This problem involves two dependent variables, $x(y)$ and $z(y)$. To find the minima, set the partial derivatives
$\frac{\partial t}{\partial z} = 0$ and $\frac{\partial t}{\partial x} = 0$. That is,

$$\frac{\partial t}{\partial z} = \frac{1}{c} \left( \frac{n_1 z}{\sqrt{x^2 + y_1^2 + z^2}} + \frac{n_2 z}{\sqrt{(x_2 - x)^2 + y_2^2 + z^2}} \right) = 0$$

This is zero only if $z = 0$, that is the point $O$ lies in the plane containing $P_1$ and $P_2$. Similarly

$$\frac{\partial t}{\partial x} = \frac{1}{c} \left( \frac{n_1 x}{\sqrt{x^2 + y_1^2 + z^2}} - \frac{n_2 (x_2 - x)}{\sqrt{(x_2 - x)^2 + y_2^2 + z^2}} \right) = \frac{1}{c} (n_1 \sin\theta_1 - n_2 \sin\theta_2) = 0$$

This is zero only if Snell’s law applies, that is

$$n_1 \sin\theta_1 = n_2 \sin\theta_2$$

Fermat’s principle has shown that the refracted light is given by Snell’s Law, and is in a plane normal to the
surface. The laws of reflection also are given since then $n_1 = n_2 = n$ and the angle of reflection equals the
angle of incidence.
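The direct minimization described above is easy to reproduce numerically. The following is a minimal sketch, assuming $z = 0$, illustrative refractive indices and end-point coordinates, and a golden-section search as the minimizer; the names `transit_time` and `x_min` are hypothetical.

```python
import math

n1, n2 = 1.0, 1.5            # refractive indices (assumed values)
y1, x2, y2 = 1.0, 2.0, 1.0   # geometry of P1 = (0, y1) and P2 = (x2, -y2) (assumed values)

def transit_time(x):
    """Optical path length n1 * P1O + n2 * OP2 for a crossing point (x, 0) with z = 0.
    The constant factor 1/c is omitted because it does not move the minimum."""
    return n1 * math.hypot(x, y1) + n2 * math.hypot(x2 - x, y2)

# Golden-section search for the minimum of transit_time on 0 <= x <= x2.
ratio = (math.sqrt(5.0) - 1.0) / 2.0
lo, hi = 0.0, x2
for _ in range(100):
    left = hi - ratio * (hi - lo)
    right = lo + ratio * (hi - lo)
    if transit_time(left) < transit_time(right):
        hi = right
    else:
        lo = left
x_min = 0.5 * (lo + hi)

# Sines of the angles of incidence and refraction measured from the normal.
sin_theta1 = x_min / math.hypot(x_min, y1)
sin_theta2 = (x2 - x_min) / math.hypot(x2 - x_min, y2)
print(x_min, n1 * sin_theta1, n2 * sin_theta2)
```

At the crossing point that minimizes the transit time, the two printed products $n_1 \sin\theta_1$ and $n_2 \sin\theta_2$ agree to within the tolerance of the search, which is Snell's law.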

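5.6 Example: Minimum of $(\nabla\phi)^2$ in a volume

Find the function $\phi(x_1, x_2, x_3)$ that has the minimum value of $(\nabla\phi)^2$ per unit volume. For the volume
$V$ it is desired to minimize the following

$$J = \frac{1}{V} \iiint (\nabla\phi)^2 \, dx_1 dx_2 dx_3 = \frac{1}{V} \iiint \left[ \left( \frac{\partial\phi}{\partial x_1} \right)^2 + \left( \frac{\partial\phi}{\partial x_2} \right)^2 + \left( \frac{\partial\phi}{\partial x_3} \right)^2 \right] dx_1 dx_2 dx_3$$

Note that the variables $x_1, x_2, x_3$ are independent, and thus Euler’s equation for several independent variables
can be used. To minimize the functional $J$, the function

$$f = \left( \frac{\partial\phi}{\partial x_1} \right)^2 + \left( \frac{\partial\phi}{\partial x_2} \right)^2 + \left( \frac{\partial\phi}{\partial x_3} \right)^2$$

must satisfy the Euler equation

$$\frac{\partial f}{\partial\phi} - \sum_{i=1}^{3} \frac{\partial}{\partial x_i} \left( \frac{\partial f}{\partial\phi_i'} \right) = 0$$

where $\phi_i' = \frac{\partial\phi}{\partial x_i}$. Substituting $f$ into Euler’s equation gives

$$\sum_{i=1}^{3} \frac{\partial}{\partial x_i} \left( \frac{\partial\phi}{\partial x_i} \right) = 0$$

This is just Laplace’s equation

$$\nabla^2 \phi = 0$$

Therefore $\phi$ must satisfy Laplace’s equation in order that the functional $J$ be a minimum.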
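This result can be illustrated numerically by perturbing a function that satisfies Laplace's equation. The sketch below is not from the text: it assumes the unit cube as the volume, the harmonic function $x^2 + y^2 - 2z^2$ (written with $x, y, z$ in place of $x_1, x_2, x_3$), and a perturbation $\epsilon \sin(\pi x) \sin(\pi y) \sin(\pi z)$ that vanishes on the boundary.

```python
import math

def energy(eps, n=24):
    """Midpoint-rule estimate of the integral of (grad phi)^2 over the unit cube for
    phi = x^2 + y^2 - 2 z^2 + eps * sin(pi x) sin(pi y) sin(pi z)."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for x in pts:
        sx, cx = math.sin(math.pi * x), math.cos(math.pi * x)
        for y in pts:
            sy, cy = math.sin(math.pi * y), math.cos(math.pi * y)
            for z in pts:
                sz, cz = math.sin(math.pi * z), math.cos(math.pi * z)
                # Analytic gradient of the harmonic part plus the perturbation.
                gx = 2.0 * x + eps * math.pi * cx * sy * sz
                gy = 2.0 * y + eps * math.pi * sx * cy * sz
                gz = -4.0 * z + eps * math.pi * sx * sy * cz
                total += (gx * gx + gy * gy + gz * gz) * h ** 3
    return total

base = energy(0.0)
# Change in the integral for perturbations of either sign.
increases = [energy(eps) - base for eps in (-0.2, -0.1, 0.1, 0.2)]
print(base, increases)
```

Every perturbation, of either sign, increases the integral of $(\nabla\phi)^2$, as expected if the unperturbed function satisfying Laplace's equation is the minimum for the fixed boundary values.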

5.6 Euler’s integral equation

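An integral form of the Euler differential equation can be written which is useful for cases when the function
$f$ does not depend explicitly on the independent variable $x$, that is, when $\frac{\partial f}{\partial x} = 0$. Note that

$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y} \frac{dy}{dx} + \frac{\partial f}{\partial y'} \frac{dy'}{dx} \tag{5.20}$$

But

$$\frac{d}{dx} \left( y' \frac{\partial f}{\partial y'} \right) = \frac{dy'}{dx} \frac{\partial f}{\partial y'} + y' \frac{d}{dx} \frac{\partial f}{\partial y'} \tag{5.21}$$

Combining these two equations gives

$$\frac{d}{dx} \left( y' \frac{\partial f}{\partial y'} \right) = \frac{df}{dx} - \frac{\partial f}{\partial x} - y' \frac{\partial f}{\partial y} + y' \frac{d}{dx} \frac{\partial f}{\partial y'} \tag{5.22}$$

The last two terms can be rewritten as

$$y' \left( \frac{d}{dx} \frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y} \right) \tag{5.23}$$

which vanishes when the Euler equation is satisfied. Therefore the above equation simplifies to

$$\frac{\partial f}{\partial x} - \frac{d}{dx} \left( f - y' \frac{\partial f}{\partial y'} \right) = 0 \tag{5.24}$$

This integral form of Euler’s equation is especially useful when $\frac{\partial f}{\partial x} = 0$, that is, when $f$ does not depend
explicitly on the independent variable $x$. Then the first integral of equation 5.24 is a constant, i.e.

$$f - y' \frac{\partial f}{\partial y'} = \text{constant} \tag{5.25}$$

This is Euler’s integral variational equation. Note that the shortest distance between two points, the minimum
surface of rotation, and the brachistochrone, described earlier, all are examples where $\frac{\partial f}{\partial x} = 0$ and thus
the integral form of Euler’s equation is useful for solving these cases.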
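As an illustration of the first integral, consider the minimum surface of rotation with $z$ taken as the independent variable, so that $f = \rho \sqrt{1 + \rho'^2}$ has no explicit dependence on $z$. The sketch below evaluates $f - \rho' \frac{\partial f}{\partial \rho'}$ along the catenary $\rho = a \cosh\left( \frac{z - b}{a} \right)$; the values of `a` and `b` and the sample points are assumptions chosen only for the check.

```python
import math

a = 0.8   # catenary constant (assumed value)
b = 0.1   # axial offset (assumed value)

def first_integral(z):
    """Evaluate f - rho' * (df/d rho') for f = rho sqrt(1 + rho'^2)
    along the catenary rho = a cosh((z - b) / a)."""
    rho = a * math.cosh((z - b) / a)
    drho = math.sinh((z - b) / a)
    f = rho * math.sqrt(1.0 + drho * drho)
    df_ddrho = rho * drho / math.sqrt(1.0 + drho * drho)
    return f - drho * df_ddrho

values = [first_integral(z) for z in (-1.0, -0.5, 0.0, 0.5, 1.0)]
print(values)
```

The quantity takes the same value, equal to the catenary constant, at every sample point. In this sense the integral form bypasses the second-order equation that was described as not easy to solve when $z$ is the independent variable.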

5.7 Constrained variational systems

Imposing a constraint on a variational system implies:

1. The $N$ constrained coordinates $y_i(x)$ are correlated which violates the assumption made in chapter 5.5
that the $N$ variables are independent.

2. Constrained motion implies that constraint forces must be acting to account for the correlation of the
variables. These constraint forces must be taken into account in the equations of motion.

For example, for a disk rolling down an inclined plane without slipping, there are three coordinates $y$
[perpendicular to the wedge], $x$ [along the surface of the wedge], and the rotation angle $\theta$ shown in figure 5.2.
The constraint forces, $F_f$ and $N$, lead to the correlation of the variables such that $x = R\theta$, while $y = R$,
where $R$ is the radius of the disk. Basically there is only one independent variable, which can be either $x$
or $\theta$. The use of only one independent variable essentially buries the constraint forces under the rug, which is
fine if you only need to know the equation of motion. If you need to determine the forces of constraint then
it is necessary to include all coordinates explicitly in the equations of motion as discussed below.

Figure 5.2: A disk rolling down an inclined plane.

5.7.1 Holonomic constraints

Most systems involve restrictions or constraints that couple the coordinates. For example, the $y_i(x)$ may
be confined to a surface in coordinate space. The constraints mean that the coordinates $y_i(x)$ are not independent,
but are related by equations of constraint. A constraint is called holonomic if the equations of
constraint can be expressed in the form of an algebraic equation that directly and unambiguously specifies
the shape of the surface of constraint. A non-holonomic constraint does not provide an algebraic relation
between the correlated coordinates. In addition to the holonomy of the constraints, the equations of constraint
also can be grouped into the following three classifications depending on whether they are algebraic,
differential, or integral. These three classifications for the constraints exhibit different holonomy relating the
coupled coordinates. Fortunately the solution of constrained systems is greatly simplified if the equations of
constraint are holonomic.

5.7.2 Geometric (algebraic) equations of constraint

Geometric constraints can be expressed in the form of algebraic relations that directly specify the shape of
the surface of constraint in coordinate space $q_1, q_2, \ldots, q_j, \ldots, q_n$

$$g_k(q_1, q_2, \ldots, q_j, \ldots, q_n; t) = 0 \tag{5.26}$$

where $j = 1, 2, 3, \ldots, n$. There can be $m$ such equations of constraint where $0 \le k \le m$. An example of such a
geometric constraint is when the motion is confined to the surface of a sphere of radius $R$ in coordinate space
which can be written in the form $g = x^2 + y^2 + z^2 - R^2 = 0$. Such algebraic constraint equations are called
holonomic, which allows use of generalized coordinates as well as Lagrange multipliers to handle both the
constraint forces and the correlation of the coordinates.

5.7.3 Kinematic (differential) equations of constraint

The $m$ constraint equations also can be expressed in terms of the infinitesimal displacements of the form

$$\sum_{j=1}^{n} \frac{\partial g_k}{\partial q_j} dq_j + \frac{\partial g_k}{\partial t} dt = 0 \tag{5.27}$$

where $k = 1, 2, 3, \ldots, m$, $j = 1, 2, 3, \ldots, n$. If equation (5.27) represents the total differential of a function then
it can be integrated to give a holonomic relation of the form of equation 5.26. However, if equation 5.27 is
not the total differential, then it is non-holonomic and can be integrated only after having solved the full
problem.
An example of differential constraint equations is for a wheel rolling on a plane without slipping which is
non-holonomic and more complicated than might be expected. The wheel moving on a plane has five degrees
of freedom since the height $z$ is fixed. That is, the motion of the center of mass requires two coordinates
$(x, y)$ plus there are three angles $(\phi, \theta, \psi)$ where $\phi$ is the rotation angle for the wheel, $\theta$ is the pivot angle of
the axis, and $\psi$ is the tilt angle of the wheel. If the wheel slides then all five degrees of freedom are active.
If the axis of rotation of the wheel is horizontal, that is, the tilt angle $\psi = 0$ is constant, then this kinematic
system leads to three differential constraint equations. The wheel can roll with angular velocity $\dot{\phi}$, as well as
pivot which corresponds to a change in $\theta$. Combining these leads to two differential equations of constraint

$$dx - R \sin\theta \, d\phi = 0 \qquad dy + R \cos\theta \, d\phi = 0 \tag{5.28}$$

where $R$ is the radius of the wheel. These constraints are insufficient to provide finite relations between all the
coordinates. That is, the constraints cannot be reduced by integration to the form of equation 5.26 because there
is no functional relation between $\phi$ and the other three variables, $x, y, \theta$. Many rolling trajectories are possible
between any two points of contact on the plane that are related to different pivot angles. That is, the point of
contact of the disk could pivot plus roll in a circle returning to the same point where $x, y, \theta$ are unchanged whereas
the value of $\phi$ depends on the circumference of the circle. As a consequence the rolling constraint is non-holonomic
except for the case where the disk rolls in a straight line and remains vertical.
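The closed-loop argument can be illustrated by integrating the rolling constraints numerically. The sketch below is a hedged illustration: it assumes the constraints take the form $dx = R \sin\theta \, d\phi$ and $dy = -R \cos\theta \, d\phi$, with $\theta$ the pivot angle, $\phi$ the rotation angle and $R$ the wheel radius. These symbol names, the numerical values and the function name are assumptions made for the example.

```python
import math

R = 0.3   # wheel radius (assumed value)

def roll_in_circle(circle_radius, n=3600):
    """Integrate dx = R sin(theta) dphi and dy = -R cos(theta) dphi while the
    pivot angle theta sweeps one full turn, so the contact point traces a circle."""
    x = y = 0.0
    phi = 0.0                      # accumulated rotation angle of the wheel
    dtheta = 2.0 * math.pi / n
    for i in range(n):
        theta = (i + 0.5) * dtheta             # pivot angle at the middle of the step
        dphi = circle_radius * dtheta / R      # rolling without slipping along the arc
        x += R * math.sin(theta) * dphi
        y -= R * math.cos(theta) * dphi
        phi += dphi
    return x, y, phi

x_a, y_a, phi_a = roll_in_circle(1.0)
x_b, y_b, phi_b = roll_in_circle(2.0)
print((x_a, y_a, phi_a), (x_b, y_b, phi_b))
```

Both loops return the contact point to its starting position with the pivot angle restored after one full turn, yet the accumulated rotation angle differs between them. Under the stated assumptions this shows that the rotation angle is not a function of the other three coordinates.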

5.7.4 Isoperimetric (integral) equations of constraint

Equations of constraint also can be expressed in terms of direct integrals. This situation is encountered for
isoperimetric problems, such as finding the maximum volume bounded by a surface of fixed area, or the
shape of a hanging rope of fixed length. Integral constraints occur in economics when minimizing some cost
algorithm subject to a fixed total cost constraint.
A simple example of an isoperimetric problem involves finding the curve $y = y(x)$ such that the functional
has an extremum where the curve $y(x)$ satisfies boundary conditions such that $y(x_1) = a$ and $y(x_2) = b$,
that is

$$F(y) = \int_{x_1}^{x_2} f(y, y'; x) \, dx \tag{5.29}$$

is an extremum such that the perimeter also is constrained to satisfy

$$G(y) = \int_{x_1}^{x_2} g(y, y'; x) \, dx = l \tag{5.30}$$

where $l$ is a fixed length. This integral constraint is geometric and holonomic. Another example is finding
the minimum surface area of a closed surface subject to the enclosed volume being the constraint.

5.7.5 Properties of the constraint equations

Holonomic constraints Geometric constraints can be expressed in the form of an algebraic equation
that directly specifies the shape of the surface of constraint

$$g(y_1, y_2, y_3, \ldots; x) = 0 \tag{5.31}$$

Such a system is called holonomic since there is a direct relation between the coupled variables. An example
of such a holonomic geometric constraint is if the motion is confined to the surface of a sphere of radius $R$
which can be written in the form

$$g = x^2 + y^2 + z^2 - R^2 = 0 \tag{5.32}$$

Non-holonomic constraints There are many classifications of non-holonomic constraints that exist
if equation (5.31) is not satisfied. The algebraic approach is difficult to handle when the constraint is an
inequality, such as the requirement that the location is restricted to lie inside a spherical shell of radius $R$
which can be expressed as

$$g = x^2 + y^2 + z^2 - R^2 \le 0 \tag{5.33}$$

This non-holonomic constrained system has a one-sided constraint. Systems usually are non-holonomic if
the constraint is kinematic as discussed above.

Partial Holonomic constraints Partial-holonomic constraints are holonomic for a restricted range
of the constraint surface in coordinate space, and this range can be case specific. This can occur if the
constraint force is one-sided and perpendicular to the path. An example is the pendulum with the mass
attached to the fulcrum by a flexible string that provides tension but not compression. Then the pendulum
length is constant only if the tension in the string is positive. Thus the pendulum will be holonomic if
the gravitational plus centrifugal forces are such that the tension in the string is positive, but the system
becomes non-holonomic if the tension falls to zero, as can happen when the pendulum rotates to an upright
angle where the centrifugal force outwards is insufficient to compensate for the vertical downward component
of the gravitational force. There are many other examples where the motion of an object is holonomic when
the object is pressed against the constraint surface, such as the surface of the Earth, but is unconstrained if
the object leaves the surface.

Time dependence

A constraint is called scleronomic if the constraint is not explicitly time dependent. This ignores the time
dependence contained within the solution of the equations of motion. Fortunately a major fraction of
systems are scleronomic. The constraint is called rheonomic if the constraint is explicitly time dependent.
An example of a rheonomic system is where the size or shape of the surface of constraint is explicitly time
dependent such as a deflating pneumatic tire.

Energy conservation

The solution depends on whether the constraint is conservative or dissipative, that is, if friction or drag are
acting. The system will be conservative if there are no drag forces, and the constraint forces are perpendicular
to the trajectory of the path such as the motion of a charged particle in a magnetic field. Forces of constraint
can result from sliding of two solid surfaces, rolling of solid objects, fluid flow in a liquid or gas, or result from
electromagnetic forces. Energy dissipation can result from friction, drag in a fluid or gas, or finite resistance
of electric conductors leading to dissipation of induced electric currents in a conductor, e.g. eddy currents.
A rolling constraint is unusual in that friction between the rolling bodies is necessary to maintain rolling.
A disk on a frictionless inclined plane will conserve its angular momentum since there is no torque acting
if the rolling contact is frictionless, that is, the disk will just slide. If the friction is sufficient to stop sliding,
then the bodies will roll and not slide. A perfect rolling body does not dissipate energy since no work is
done at the instantaneous point of contact where both bodies are in zero relative motion and the force is
perpendicular to the motion. In real life, a rolling wheel can involve a very small energy dissipation due to
deformation at the point of contact coupled with non-elastic properties of the material used to make the
wheel and the plane surface. For example, a pneumatic tire can heat up and expand due to flexing of the
tire.

5.7.6 Treatment of constraint forces in variational calculus

There are three major approaches to handle constraint forces in variational calculus. All three of them exploit
the tremendous freedom and flexibility available when using generalized coordinates.
(1) The generalized coordinate approach, described in chapter 5.8, exploits the correlation of the $n$ coordinates due to the $m$ constraint forces to reduce the dimension of the equations of motion to $s = n - m$ degrees of freedom. This approach embeds the $m$ constraint forces into the choice of generalized coordinates and does not determine the constraint forces.
(2) The Lagrange multiplier approach, described in chapter 5.9, exploits generalized coordinates but includes the $m$ constraint forces in the Euler equations, so that the constraint forces are determined in addition to the $n$ equations of motion.
(3) The generalized forces approach, described in chapter 6.7.3, introduces constraint and other forces explicitly.

5.8 Generalized coordinates in variational calculus

Newtonian mechanics is based on a vectorial treatment of mechanics which can be difficult to apply when solving complicated problems in mechanics. Constraint forces acting on a system usually are unknown. In Newtonian mechanics constraint forces must be included explicitly so that they can be determined simultaneously with the solution of the dynamical equations of motion. The major advantage of the variational approaches is that solution of the dynamical equations of motion can be simplified by expressing the motion in terms of $n$ independent generalized coordinates. These generalized coordinates can be any set of independent variables, $q_i$, where $1 \le i \le n$, plus the corresponding velocities $\dot{q}_i$ for Lagrangian mechanics, or the corresponding canonical variables, $q_i, p_i$, for Hamiltonian mechanics. The scalar functional is expressed in terms of these $n$ generalized coordinates, and the variational approach employs this scalar functional to determine the trajectory. The generalized coordinates used for the variational approach do not need to be orthogonal; they only need to be independent, since they are used only to completely specify the magnitude of the scalar functional. This greatly expands the arsenal of possible generalized coordinates beyond what is available using Newtonian mechanics. For example, generalized coordinates can be the dimensionless amplitudes for the $n$ normal modes of coupled oscillator systems, or action-angle variables. In addition, generalized coordinates having different dimensions can be used for each of the $n$ variables. Each generalized coordinate, $q_i$, specifies an independent mode of the system, not a specific particle. For example, each normal mode of coupled oscillators can involve correlated motion of several coupled particles.
The major advantage of using generalized coordinates is that they can be chosen to be perpendicular to a corresponding constraint force, and therefore that specific constraint force does no work for motion along that generalized coordinate. Moreover, the constrained motion does no work in the direction of the constraint force for rigid constraints. Thus generalized coordinates allow specific constraint forces to be ignored in evaluation of the minimized functional. This freedom and flexibility of choice of generalized coordinates allows the correlated motion produced by the constraint forces to be embedded directly into the choice of the independent generalized coordinates, and the actual constraint forces can be ignored. Embedding the constraint-induced correlations into the generalized coordinates effectively "sweeps the constraint forces under the rug", which greatly simplifies the equations of motion for any system that involves constraint forces. Selection of the appropriate generalized coordinates can be obvious, and often it is performed subconsciously by the user.
Three variational approaches are used that employ generalized coordinates to derive the equations of
motion of a system that has $n$ generalized coordinates subject to $m$ constraints.
1) Minimal set of generalized coordinates: When the $m$ equations of constraint are holonomic, then the $m$ algebraic constraint relations can be used to transform the coordinates into $s = n - m$ independent generalized coordinates $q_i$. This approach reduces the number of unknowns, $n$, by the number of constraints, $m$, to give a minimal set of $s = n - m$ independent generalized dynamical variables. The forces of constraint are not explicitly discussed, or determined, when this generalized coordinate approach is employed. This approach greatly simplifies solution of dynamical problems by avoiding the need for explicit treatment of the constraint forces. This approach is straightforward for holonomic constraints, since the $n$ spatial coordinates $r_1(t), \dots, r_n(t)$ are coupled by $m$ algebraic equations which can be used to make the transformation to generalized coordinates. Thus the $n$ coupled spatial coordinates are transformed to $s = n - m$ independent generalized dynamical coordinates $q_1(t), \dots, q_s(t)$, and their generalized first derivatives $\dot{q}_1(t), \dots, \dot{q}_s(t)$. These generalized coordinates are independent, and thus it is possible to use Euler's equation for each independent parameter $q_i$
$$\frac{\partial f}{\partial q_i} - \frac{d}{dx}\frac{\partial f}{\partial q_i'} = 0 \tag{5.34}$$
where $i = 1, 2, 3, \dots, s$. There are $s = n - m$ such Euler equations. The freedom to choose generalized coordinates underlies the tremendous advantage of applying the variational approach.
2) Lagrange multipliers: The $n$ Lagrange equations, plus the $m$ equations of constraint, can be used to explicitly determine the $n$ generalized coordinates plus the $m$ constraint forces. That is, $n + m$ unknowns are determined. This approach is discussed in chapter 5.9.
3) Generalized forces: This approach introduces the constraint forces explicitly. This approach, applied to Lagrangian mechanics, is discussed in chapter 6.6.3.
The above three approaches exploit generalized coordinates to handle constraint forces as described in
chapter 6.
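As a hedged illustration of the minimal-set approach, consider a plane pendulum consisting of a mass on a rigid rod. The pendulum and the symbols $x$, $y$, $\ell$ and $\theta$ are assumptions chosen for this sketch rather than notation taken from the text above.

```latex
% n = 2 Cartesian coordinates (x, y), m = 1 holonomic constraint, s = n - m = 1
g(x, y) = x^2 + y^2 - \ell^2 = 0
% One generalized coordinate \theta, measured from the downward vertical:
x = \ell \sin\theta , \qquad y = -\ell \cos\theta
% The constraint is satisfied identically for every value of \theta:
x^2 + y^2 - \ell^2 = \ell^2 \left( \sin^2\theta + \cos^2\theta \right) - \ell^2 = 0
```

Because the constraint is built into the choice of $\theta$, a single Euler equation in $\theta$ suffices, and the tension in the rod never appears in the calculation.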

5.9 Lagrange multipliers for holonomic constraints

5.9.1 Algebraic equations of constraint
The Lagrange multiplier technique provides a powerful, and elegant, way to handle holonomic constraints
using Euler's equations¹. The general method of Lagrange multipliers for $n$ variables, with $m$ constraints,
is best introduced using Bernoulli's ingenious exploitation of virtual infinitesimal displacements, which
Lagrange signified by the symbol $\delta$. The term "virtual" refers to an intentional variation of the generalized
coordinates $\delta q_i$ in order to elucidate the local sensitivity of a function $F(q_i, x)$ to variation of the variable.
Contrary to the usual infinitesimal interval in differential calculus, where an actual displacement $dq_i$ occurs
during a time $dt$, a virtual displacement is imagined to be an instantaneous, infinitesimal, displacement of
a coordinate, not an actual displacement, in order to elucidate the local dependence of $F$ on the coordinate.
The local dependence of any functional $F$ on virtual displacements of all $n$ coordinates is given by taking
the partial differentials of $F$.
$$\delta F = \sum_{i}^{n} \frac{\partial F}{\partial q_i}\,\delta q_i \tag{5.35}$$

The function $F$ is stationary, that is, an extremum, if equation 5.35 equals zero. The extremum of the
functional $F$, given by equation 5.16, can be expressed in a compact form using the virtual displacement
formalism as
$$\delta F = \delta \int_{x_1}^{x_2} \sum_{i}^{n} f\left[q_i(x), q_i'(x); x\right] dx = \sum_{i}^{n} \frac{\partial F}{\partial q_i}\,\delta q_i = 0 \tag{5.36}$$

The auxiliary conditions, due to the $m$ holonomic algebraic constraints for the $n$ variables $q_i$, can be
expressed by the $m$ equations
$$g_k(\mathbf{q}) = 0 \tag{5.37}$$
where $1 \le k \le m$ and $1 \le i \le n$ with $m < n$. The variational problem for the $m$ holonomic constraint
equations also can be written in terms of $m$ differential equations where $1 \le k \le m$
$$\delta g_k = \sum_{i=1}^{n} \frac{\partial g_k}{\partial q_i}\,\delta q_i = 0 \tag{5.38}$$

Since equations 536 and 538 both equal zero, the  equations 538 can be multiplied by arbitrary
undetermined factors   and added to equations 536 to give.

 (  ) + 1 1 + 2 2 · ·  · ·  = 0 (5.39)

Note that this is not trivial in that although the sum of the constraint equations for each  is zero; the
individual terms of the sum are not zero.
Insert equations 536 plus 538 into 539 and collect all  terms, gives

Ã 
!
X  X 
+   = 0 (5.40)

 
=1

Note that all the  are free independent variations and thus the terms in the brackets, which are the
coeﬃcients of each  , individually must equal zero. For each of the  values of , the corresponding bracket
implies

 X 
+  =0 (5.41)
 
=1

This is equivalent to what would be obtained from the variational principle
$$\delta F + \sum_{k=1}^{m} \lambda_k\,\delta g_k = 0 \tag{5.42}$$
1 This textbook uses the symbol $q_i$ to designate a generalized coordinate, and $q_i'$ to designate the corresponding first derivative with respect to the independent variable, in order to differentiate the spatial coordinates from the more powerful generalized coordinates.

Equation 542 is equivalent to a variational problem for finding the stationary value of  0
Ã 
!
X
0
 ( ) =   +   = 0 (5.43)


where  0 is defined to be Ã !

X
0
 ≡ +   (5.44)
=1
The solution to equation 543 can be found using Euler’s diﬀerential equation 519 of variational calculus.
At the extremum  ( 0 ) = 0 corresponds to following contours of constant  0 which are in the surface that is
perpendicular to the gradients of the terms in  0 . The Lagrange multiplier constants are required because,
although these gradients are parallel at the extremum, the magnitudes of the gradients are not equal.
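The statement about parallel gradients of unequal magnitude can be checked on a small algebraic example. The sketch below is only an illustration: the function $F = q_1 q_2$, the single constraint $g = 2q_1 + 4q_2 - 8 = 0$, and the helper `solve_linear` are all assumptions made for this example. For this choice the stationarity conditions $\partial F/\partial q_i + \lambda\,\partial g/\partial q_i = 0$ together with $g = 0$ form a linear system in the three unknowns $(q_1, q_2, \lambda)$.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


# Unknowns ordered as (q1, q2, lam).
#   dF/dq1 + lam * dg/dq1 = q2 + 2 lam = 0
#   dF/dq2 + lam * dg/dq2 = q1 + 4 lam = 0
#   constraint              2 q1 + 4 q2 = 8
A = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 4.0],
     [2.0, 4.0, 0.0]]
rhs = [0.0, 0.0, 8.0]

q1, q2, lam = solve_linear(A, rhs)
grad_F = (q2, q1)      # gradient of F = q1 * q2
grad_g = (2.0, 4.0)    # gradient of g = 2 q1 + 4 q2 - 8

print("q1, q2, lambda =", q1, q2, lam)
print("grad F =", grad_F, " grad g =", grad_g)
```

For this example the two gradients point along the same direction but differ in length by a factor of two, and the multiplier $\lambda$ is exactly the factor that makes $\nabla F + \lambda \nabla g$ vanish.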
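The beauty of the Lagrange multipliers approach is that the auxiliary conditions do not have to be
handled explicitly, since they are handled automatically as $m$ additional free variables during solution of
Euler's equations for a variational problem with $n + m$ unknowns fit to $n + m$ equations. That is, the $n$
variables $q_i$ are determined by the variational procedure using the $n$ variational equations
$$\frac{d}{dx}\left(\frac{\partial F'}{\partial q_i'}\right) - \left(\frac{\partial F'}{\partial q_i}\right) = \frac{d}{dx}\left(\frac{\partial F}{\partial q_i'}\right) - \left(\frac{\partial F}{\partial q_i}\right) - \sum_{k}^{m} \lambda_k \frac{\partial g_k}{\partial q_i} = 0 \tag{5.45}$$
simultaneously with the $m$ variables $\lambda_k$ which are determined by the $m$ variational equations
$$\frac{d}{dx}\left(\frac{\partial F'}{\partial \lambda_k'}\right) - \left(\frac{\partial F'}{\partial \lambda_k}\right) = 0 \tag{5.46}$$
Since $F'$ does not depend on $\lambda_k'$, equation 5.46 reduces to the $m$ constraint equations $g_k = 0$.
Equation 5.45 usually is expressed as
$$\left(\frac{\partial F}{\partial q_i}\right) - \frac{d}{dx}\left(\frac{\partial F}{\partial q_i'}\right) + \sum_{k}^{m} \lambda_k \frac{\partial g_k}{\partial q_i} = 0 \tag{5.47}$$
The elegance of Lagrange multipliers is that a single variational approach allows simultaneous determination
of all $n + m$ unknowns. Chapter 6.2 shows that the forces of constraint are given directly by the $\lambda_k \frac{\partial g_k}{\partial q_i}$ terms.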


5.7 Example: Two dependent variables coupled by one holonomic constraint

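The powerful, and generally applicable, Lagrange multiplier technique is illustrated by considering the case
of only two dependent variables, $y(x)$ and $z(x)$, with the function $f(y(x), y'(x), z(x), z'(x); x)$ and with one
holonomic equation of constraint coupling these two dependent variables. The extremum is given by requiring
$$\frac{\partial F}{\partial \epsilon} = \int_{x_1}^{x_2} \left[ \left( \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} \right) \frac{\partial y}{\partial \epsilon} + \left( \frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'} \right) \frac{\partial z}{\partial \epsilon} \right] dx = 0 \tag{a}$$
with the constraint expressed by the auxiliary condition
$$g(y, z; x) = 0 \tag{b}$$
Note that the variations $\frac{\partial y}{\partial \epsilon}$ and $\frac{\partial z}{\partial \epsilon}$ are no longer independent because of the constraint equation, thus
the two terms in the brackets of equation (a) are not separately equal to zero at the extremum. However,
differentiating the constraint equation (b) gives
$$\frac{dg}{d\epsilon} = \left( \frac{\partial g}{\partial y}\frac{\partial y}{\partial \epsilon} + \frac{\partial g}{\partial z}\frac{\partial z}{\partial \epsilon} \right) = 0 \tag{c}$$
No $\frac{\partial g}{\partial x}$ term applies because, for the independent variable, $\frac{\partial x}{\partial \epsilon} = 0$. Introduce the neighboring paths by adding
the auxiliary functions
$$y(\epsilon, x) = y(x) + \epsilon\,\eta_1(x) \tag{d}$$
$$z(\epsilon, x) = z(x) + \epsilon\,\eta_2(x) \tag{e}$$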

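Inserting the differentials of equations (d) and (e) into (c) gives
$$\frac{dg}{d\epsilon} = \left( \frac{\partial g}{\partial y}\,\eta_1(x) + \frac{\partial g}{\partial z}\,\eta_2(x) \right) = 0 \tag{f}$$
implying that
$$\eta_2(x) = -\frac{\partial g / \partial y}{\partial g / \partial z}\,\eta_1(x)$$
Equation (a) can be rewritten as
$$\int_{x_1}^{x_2} \left[ \left( \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} \right) \eta_1(x) + \left( \frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'} \right) \eta_2(x) \right] dx = 0$$
$$\int_{x_1}^{x_2} \left[ \left( \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} \right) - \left( \frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'} \right) \frac{\partial g / \partial y}{\partial g / \partial z} \right] \eta_1(x)\, dx = 0 \tag{g}$$

Equation (g) now contains only a single arbitrary function $\eta_1(x)$ that is not restricted by the constraint. Thus
the bracket in the integrand of equation (g) must equal zero for the extremum. That is
$$\left( \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} \right) \left( \frac{\partial g}{\partial y} \right)^{-1} = \left( \frac{\partial f}{\partial z} - \frac{d}{dx}\frac{\partial f}{\partial z'} \right) \left( \frac{\partial g}{\partial z} \right)^{-1} \equiv -\lambda(x)$$

The left-hand side of this equation involves only derivatives of $f$ and $g$ with respect to $y$ and $y'$, while the
right-hand side involves only derivatives of $f$ and $g$ with respect to $z$ and $z'$. Because both sides are functions of $x$,
each side can be set equal to a common function $-\lambda(x)$. Thus the above equations can be written as
$$\frac{d}{dx}\frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y} = \lambda(x)\frac{\partial g}{\partial y} \qquad \frac{d}{dx}\frac{\partial f}{\partial z'} - \frac{\partial f}{\partial z} = \lambda(x)\frac{\partial g}{\partial z} \tag{h}$$
The complete solution for the three unknown functions $y(x)$, $z(x)$, and $\lambda(x)$ is obtained by solving the two
equations (h), plus the equation of constraint (b). The Lagrange multiplier $\lambda(x)$ is related to the force of
constraint. This example of two variables coupled by one holonomic constraint conforms with the general
relation for many variables and constraints given by equation 5.47.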
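A concrete instance may help to show how the two Euler equations and the constraint combine. The following worked sketch is an assumption made for illustration only: the integrand $f = \tfrac{1}{2}(y'^2 + z'^2)$, the unit-circle constraint, and the symbols $\varphi$, $\omega$ and $\varphi_0$ are not taken from the text.

```latex
% Assumed integrand and constraint:
f = \tfrac{1}{2}\left( y'^2 + z'^2 \right), \qquad g(y, z; x) = y^2 + z^2 - 1 = 0
% The two Euler equations with the multiplier term on the right-hand side:
y'' = 2\lambda(x)\, y, \qquad z'' = 2\lambda(x)\, z
% Differentiating the constraint twice gives y y'' + z z'' + y'^2 + z'^2 = 0, hence
2\lambda \left( y^2 + z^2 \right) = 2\lambda = -\left( y'^2 + z'^2 \right)
% Writing y = \cos\varphi(x), z = \sin\varphi(x) reduces the Euler equations to \varphi'' = 0:
\varphi(x) = \omega x + \varphi_0, \qquad \lambda = -\tfrac{1}{2}\,\omega^2
```

In this sketch the multiplier is constant and negative, which is consistent with interpreting $\lambda$ as a measure of the inward constraint force required to keep the path on the circle.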

5.9.2 Integral equations of constraint

The constraint equation also can be given in an integral form which is used frequently for isoperimetric
problems. Consider a one dependent-variable isoperimetric problem, for finding the curve $y = y(x)$ such that
the functional has an extremum, and the curve $y(x)$ satisfies boundary conditions such that $y(x_1) = a$ and
$y(x_2) = b$. That is
$$F(y) = \int_{x_1}^{x_2} f(y, y'; x)\, dx \tag{5.48}$$
is an extremum such that the fixed length $l$ of the perimeter satisfies the integral constraint
$$G(y) = \int_{x_1}^{x_2} g(y, y'; x)\, dx = l \tag{5.49}$$
Analogous to (5.44) these two functionals can be combined requiring that
$$\delta K(y, x, \lambda) \equiv \delta\left[ F(y) + \lambda G(y) \right] = \delta \int_{x_1}^{x_2} \left[ f + \lambda g \right] dx = 0 \tag{5.50}$$
That is, it is an extremum for both $y(x)$ and the Lagrange multiplier $\lambda$. This effectively involves finding the
extremum path for the function $K(y, x, \lambda) = F(y, x) + \lambda G(y, x)$ where both $y(x)$ and $\lambda$ are the minimized
variables. Therefore the curve $y(x)$ must satisfy the differential equation
$$\frac{d}{dx}\frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y} + \lambda \left[ \frac{d}{dx}\frac{\partial g}{\partial y'} - \frac{\partial g}{\partial y} \right] = 0 \tag{5.51}$$
subject to the boundary conditions $y(x_1) = a$, $y(x_2) = b$ and $G(y) = l$.

5.8 Example: Catenary

One isoperimetric problem is the catenary, which is the shape of a uniform rope or chain of fixed length $l$
that minimizes the gravitational potential energy. Let the rope have a uniform mass per unit length of $\sigma$
kg/m.

[Figure: The catenary]

The gravitational potential energy is
$$U = \sigma g \int_{1}^{2} y\, ds = \sigma g \int_{1}^{2} y \sqrt{dx^2 + dy^2} = \sigma g \int_{1}^{2} y \sqrt{1 + y'^2}\, dx$$
where the $g$ in the prefactor $\sigma g$ is the gravitational acceleration. The constraint is that the length be a constant $l$
$$l = \int_{1}^{2} ds = \int_{1}^{2} \sqrt{1 + y'^2}\, dx$$
Thus the function is $f(y, y'; x) = y\sqrt{1 + y'^2}$ while the integral constraint sets $g = \sqrt{1 + y'^2}$.

These need to be inserted into the Euler equation (5.51) by defining
$$F = f + \lambda g = (y + \lambda)\sqrt{1 + y'^2}$$
Note that this case is one where $\frac{\partial F}{\partial x} = 0$ and $\lambda$ is a constant; also
defining $z = y + \lambda$ then $z' = y'$. Therefore the Euler equation can be written in the integral form
$$F - z'\frac{\partial F}{\partial z'} = c = \text{constant}$$
Inserting the relation $F = z\sqrt{1 + z'^2}$ gives
$$z\sqrt{1 + z'^2} - z'\frac{z z'}{\sqrt{1 + z'^2}} = c$$
where $c$ is an arbitrary constant. This simplifies to
$$z'^2 = \left( \frac{z}{c} \right)^2 - 1$$
The integral of this is
$$z = c \cosh\left( \frac{x + b}{c} \right)$$
where $b$ and $c$ are arbitrary constants fixed by the locations of the two fixed ends of the rope.
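The catenary solution can be checked numerically. The sketch below assumes arbitrary illustrative values for the constants $b$ and $c$, and the function names are hypothetical. It evaluates the first integral $F - z'\,\partial F/\partial z'$ with $F = z\sqrt{1 + z'^2}$, and the residual of $z'^2 = (z/c)^2 - 1$, at a few sample points along $z = c\cosh((x + b)/c)$.

```python
import math


def catenary(x, b, c):
    """z(x) = c * cosh((x + b) / c), where z = y + lambda as in the text."""
    return c * math.cosh((x + b) / c)


def catenary_slope(x, b, c):
    """Analytic derivative z'(x) = sinh((x + b) / c)."""
    return math.sinh((x + b) / c)


def first_integral(x, b, c):
    """F - z' dF/dz' with F = z * sqrt(1 + z'^2); expected to equal c."""
    z = catenary(x, b, c)
    zp = catenary_slope(x, b, c)
    root = math.sqrt(1.0 + zp ** 2)
    return z * root - zp * (z * zp / root)


def slope_residual(x, b, c):
    """Residual of z'^2 = (z / c)^2 - 1; expected to vanish."""
    z = catenary(x, b, c)
    zp = catenary_slope(x, b, c)
    return zp ** 2 - ((z / c) ** 2 - 1.0)


B, C = 0.3, 1.7  # arbitrary illustrative constants
SAMPLES = (-1.0, 0.0, 0.5, 2.0)

if __name__ == "__main__":
    for x in SAMPLES:
        print(f"x = {x:+.1f}  first integral = {first_integral(x, B, C):.6f}"
              f"  residual = {slope_residual(x, B, C):.1e}")
```

The first integral reproduces the constant $c$ at every sample point to within rounding error, as expected for a solution of the integral form of the Euler equation.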

5.9 Example: The Queen Dido problem

A famous constrained isoperimetric problem appears in the legend of Dido, first Queen of Carthage. Legend says that,
when Dido landed in North Africa, she persuaded the local chief to sell her as much land as an oxhide could
contain. She cut an oxhide into narrow strips and joined them to make a continuous thread more than four
kilometers in length which was sufficient to enclose the land adjoining the coast on which Carthage was built.
Her problem was to enclose the maximum area for a given perimeter. Let us assume that the coast line is
straight and the ends of the thread are at $\pm a$ on the coast line. The enclosed area is given by
$$A = \int_{-a}^{+a} y\, dx$$
The constraint equation is that the total perimeter equals $l$.
$$\int_{-a}^{a} \sqrt{1 + y'^2}\, dx = l$$
Thus we have that the functional $f(y, y', x) = y$ and $g(y, y', x) = \sqrt{1 + y'^2}$. Then $\frac{\partial f}{\partial y} = 1$, $\frac{\partial f}{\partial y'} = 0$, $\frac{\partial g}{\partial y} = 0$
and $\frac{\partial g}{\partial y'} = \frac{y'}{\sqrt{1 + y'^2}}$. Inserting these into the Euler-Lagrange equation (5.51) gives
$$1 - \lambda \frac{d}{dx}\left[ \frac{y'}{\sqrt{1 + y'^2}} \right] = 0$$
That is
$$\frac{d}{dx}\left[ \frac{y'}{\sqrt{1 + y'^2}} \right] = \frac{1}{\lambda}$$
Integrating with respect to $x$ gives
$$\frac{\lambda y'}{\sqrt{1 + y'^2}} = x - b$$
where $b$ is a constant of integration. This can be rearranged to give
$$y' = \frac{\pm (x - b)}{\sqrt{\lambda^2 - (x - b)^2}}$$
The integral of this is
$$y = \mp\sqrt{\lambda^2 - (x - b)^2} + c$$
Rearranging this gives
$$(x - b)^2 + (y - c)^2 = \lambda^2$$
This is the equation of a circle centered at $(b, c)$. Setting the bounds to be $(-a, 0)$ to $(a, 0)$ gives that
$b = c = 0$ and the circle radius is $\lambda$. Thus the length of the thread must be $l = \pi\lambda$. Assuming that $l = 4$ km
then $\lambda = 1.27$ km and Queen Dido could buy an area of $\pi\lambda^2/2 = l^2/2\pi \approx 2.55$ km$^2$.
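The numbers in this example are easy to reproduce. The sketch below assumes a thread length of 4 km and a straight coast. For comparison it also evaluates the best rectangle that uses the coast as one side; that comparison is an addition made for illustration and is not part of the legend or of the derivation above. The function names are hypothetical.

```python
import math


def semicircle(thread_length):
    """Radius and enclosed area when the thread forms a semicircle
    against a straight coast, so that thread_length = pi * radius."""
    radius = thread_length / math.pi
    area = 0.5 * math.pi * radius ** 2
    return radius, area


def best_rectangle_area(thread_length):
    """Largest rectangle with the coast as one side: the thread covers
    width + 2 * depth, and width * depth is maximal at depth = length / 4."""
    depth = thread_length / 4.0
    width = thread_length - 2.0 * depth
    return width * depth


THREAD_KM = 4.0
radius_km, semicircle_km2 = semicircle(THREAD_KM)
rectangle_km2 = best_rectangle_area(THREAD_KM)

print(f"radius = {radius_km:.3f} km")
print(f"semicircle area = {semicircle_km2:.3f} km^2")
print(f"best rectangle area = {rectangle_km2:.3f} km^2")
```

The semicircle encloses about 2.55 km$^2$, compared with 2 km$^2$ for the best rectangle. Rounding the radius to 1.27 km before squaring gives a slightly lower figure of about 2.53 km$^2$.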

5.10 Geodesic
The geodesic is defined as the shortest path between two fixed points for motion that is constrained to lie
on a surface. Variational calculus provides a powerful approach for determining the equations of motion
constrained to follow a geodesic.
The use of variational calculus is illustrated by considering the geodesic constrained to follow the surface
of a sphere of radius $R$. As discussed in appendix 2.3, the element of path length on the surface of the
sphere is given in spherical coordinates as $ds = R\sqrt{d\theta^2 + (\sin\theta\, d\phi)^2}$. Therefore the distance $s$ between two
points 1 and 2 is
$$s = R \int_{1}^{2} \left[ \sqrt{\left( \frac{d\theta}{d\phi} \right)^2 + \sin^2\theta} \right] d\phi \tag{5.52}$$
The function $f$ for ensuring that $s$ be an extremum value uses
$$f = \sqrt{\theta'^2 + \sin^2\theta} \tag{5.53}$$
where $\theta' = \frac{d\theta}{d\phi}$. This is a case where $\frac{\partial f}{\partial \phi} = 0$ and thus the integral form of Euler's equation can be used
leading to the result that
$$\sqrt{\theta'^2 + \sin^2\theta} - \theta'\frac{\partial}{\partial \theta'}\sqrt{\theta'^2 + \sin^2\theta} = \text{constant} = a \tag{5.54}$$
This gives that
$$\sin^2\theta = a\sqrt{\theta'^2 + \sin^2\theta} \tag{5.55}$$
This can be rewritten as
$$\frac{d\phi}{d\theta} = \frac{1}{\theta'} = \frac{a \csc^2\theta}{\sqrt{1 - a^2\csc^2\theta}} \tag{5.56}$$
Solving for $\phi$ gives
$$\phi = \sin^{-1}\left( \frac{\cot\theta}{\beta} \right) + \alpha \tag{5.57}$$
where
$$\beta^2 \equiv \frac{1 - a^2}{a^2} \tag{5.58}$$
That is
$$\cot\theta = \beta \sin(\phi - \alpha) \tag{5.59}$$
Expanding the sine and cotangent gives
$$(\beta\cos\alpha)\, R\sin\theta\sin\phi - (\beta\sin\alpha)\, R\sin\theta\cos\phi = R\cos\theta \tag{5.60}$$
Since the brackets are constants, this can be written as
$$A\,(R\sin\theta\sin\phi) - B\,(R\sin\theta\cos\phi) = (R\cos\theta) \tag{5.61}$$
The terms in the brackets are just expressions for the rectangular coordinates $x, y, z$. That is,
$$Ay - Bx = z \tag{5.62}$$
This is the equation of a plane passing through the center of the sphere. Thus the geodesic on a sphere
is the intersection of the sphere with the plane that passes through the center of the sphere and through the initial and final locations.
This geodesic is called a great circle. Euler's equation gives both the maximum and minimum extremum
path lengths for motion on this great circle.
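The great-circle result can be checked by evaluating the path-length integral of equation 5.52 numerically for two competing paths on a unit sphere. The sketch below is an illustration under stated assumptions: the endpoints are placed at polar angle $\theta = \pi/4$ and azimuths $\phi = \pm\pi/4$, for which the great circle through them is $\cot\theta = \sqrt{2}\cos\phi$ (equation 5.59 with $\beta = \sqrt{2}$ and $\alpha = -\pi/2$). The function names and the midpoint rule are choices made for this example.

```python
import math


def path_length(theta, dtheta, phi1, phi2, steps=2000):
    """Midpoint-rule estimate of s / R, the integral over phi of
    sqrt(theta'(phi)^2 + sin^2(theta(phi)))."""
    h = (phi2 - phi1) / steps
    total = 0.0
    for i in range(steps):
        phi = phi1 + (i + 0.5) * h
        total += math.sqrt(dtheta(phi) ** 2 + math.sin(theta(phi)) ** 2) * h
    return total


BETA = math.sqrt(2.0)
THETA0 = math.pi / 4.0                      # polar angle of both endpoints
PHI1, PHI2 = -math.pi / 4.0, math.pi / 4.0  # azimuths of the endpoints


def gc_theta(phi):
    """Great circle: cot(theta) = BETA * cos(phi)."""
    return math.atan2(1.0, BETA * math.cos(phi))


def gc_dtheta(phi):
    """Analytic derivative d(theta)/d(phi) along the great circle."""
    u = BETA * math.cos(phi)
    return BETA * math.sin(phi) / (1.0 + u * u)


great_circle = path_length(gc_theta, gc_dtheta, PHI1, PHI2)
parallel = path_length(lambda phi: THETA0, lambda phi: 0.0, PHI1, PHI2)

print(f"great-circle length / R = {great_circle:.6f}")
print(f"constant-theta length / R = {parallel:.6f}")
```

The great-circle path has length $\pi R/3$, which is shorter than the path of constant $\theta$ joining the same two points, whose length is $(\pi/2) R \sin(\pi/4)$.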
Chapter 17 discusses the geodesic in the four-dimensional space-time coordinates that underlie the General
Theory of Relativity. As a consequence, the use of the calculus of variations to determine the equations of
motion for geodesics plays a pivotal role in the General Theory of Relativity.

5.11 Variational approach to classical mechanics

This chapter has introduced the general principles of variational calculus needed for understanding the Lagrangian and Hamiltonian approaches to classical mechanics. Although variational calculus was developed originally for classical mechanics, it has now grown to be an important branch of mathematics with applications to many other fields outside of physics. The prologue of this book emphasized the dramatic differences between the differential vectorial approach of Newtonian mechanics, and the integral variational approaches of Lagrangian and Hamiltonian mechanics. The Newtonian vectorial approach involves solving Newton's differential equations of motion that relate the force and momenta vectors. This requires knowledge of the time dependence of all the force vectors, including constraint forces, acting on the system, which can be very complicated. Chapter 2 showed that the first-order time integrals, equations 2.10 and 2.16, relate the initial and final total momenta without requiring knowledge of the complicated instantaneous forces acting during the collision of two bodies. Similarly, for conservative systems, the first-order spatial integral, equation 2.21, relates the initial and final total energies to the net work done on the system without requiring knowledge of the instantaneous force vectors. The first-order spatial integral has the advantage that it is a scalar quantity, in contrast to time integrals which are vector quantities. These first-order integral relations are used frequently in Newtonian mechanics to derive solutions of the equations of motion that avoid having to solve complicated differential equations of motion.
This chapter has illustrated that variational principles provide a means of deriving more detailed information, such as the trajectories for the motion between given initial and final conditions, by requiring that scalar functionals have extremum values. For example, the solution of the brachistochrone problem determined the trajectory having the minimum transit time, based on only the magnitudes of the kinetic and gravitational potential energies. Similarly, the catenary shape of a suspended chain was derived by minimizing the gravitational potential energy. The calculus of variations uses Euler's equations to determine directly the differential equations of motion of the system that lead to the functional of interest being stationary at an extremum. The Lagrangian and Hamiltonian variational approaches to classical mechanics are discussed in chapters 6 to 16. The broad range of applicability, the flexibility, and the power provided by variational approaches to classical mechanics and modern physics will be illustrated.

5.12 Summary
Euler's differential equation: The calculus of variations has been introduced and Euler's differential
equation was derived. The calculus of variations reduces to varying the functions $q_i(x)$, where $i = 1, 2, 3, \dots, n$,
such that the integral
$$F = \int_{x_1}^{x_2} f\left[q_i(x), q_i'(x); x\right] dx \tag{5.16}$$
is an extremum, that is, it is a maximum or minimum. Here $x$ is the independent variable, and $q_i(x)$ are
the dependent variables with first derivatives $q_i' \equiv \frac{dq_i}{dx}$. The quantity $f\left[q(x), q'(x); x\right]$ has some given
dependence on $q_i$, $q_i'$ and $x$. The calculus of variations involves varying the functions $q_i(x)$ until a stationary
value of $F$ is found which is presumed to be an extremum. It was shown that if the $q_i(x)$ are independent,
then the extremum value of $F$ leads to $n$ independent Euler equations
$$\frac{\partial f}{\partial q_i} - \frac{d}{dx}\frac{\partial f}{\partial q_i'} = 0 \tag{5.19}$$
where $i = 1, 2, 3, \dots, n$. This can be used to determine the functional form $q_i(x)$ that ensures that the integral
$F = \int_{x_1}^{x_2} f\left[q(x), q'(x); x\right] dx$ is a stationary value, that is, presumably a maximum or minimum value.
Note that Euler's equation involves partial derivatives for the dependent variables $q_i$, $q_i'$ and the total
derivative for the independent variable $x$.
Euler's integral equation: It was shown that if the function $f\left[q_i(x), q_i'(x); x\right]$ does not depend explicitly on
the independent variable, then Euler's differential equation can be written in an integral form. That is,
when $\frac{\partial f}{\partial x} = 0$, the first integral of the Euler equation is a constant
$$f - q'\frac{\partial f}{\partial q'} = \text{constant} \tag{5.25}$$
Constrained variational systems: Most applications involve constraints on the motion. The equations
of constraint can be classified according to whether the constraints are holonomic or non-holonomic, the time
dependence of the constraints, and whether the constraint forces are conservative.
Generalized coordinates in variational calculus: Independent generalized coordinates can be chosen
that are perpendicular to the rigid constraint forces and therefore the constraint does not contribute to the
functional being minimized. That is, the constraints are embedded into the generalized coordinates and thus
the constraints can be ignored when deriving the variational solution.
Minimal set of generalized coordinates: If the constraints are holonomic then the $m$ holonomic
equations of constraint can be used to transform the $n$ coupled generalized coordinates to $s = n - m$
independent generalized variables $q_i$, $q_i'$. The generalized coordinate method then uses Euler's equations to
determine these $s = n - m$ independent generalized coordinates.
$$\frac{\partial f}{\partial q_i} - \frac{d}{dx}\frac{\partial f}{\partial q_i'} = 0 \tag{5.34}$$
Lagrange multipliers for holonomic constraints: The Lagrange multipliers approach for $n$ variables,
plus $m$ holonomic equations of constraint, determines all $n + m$ unknowns for the system. The holonomic
forces of constraint acting on the $n$ variables are related to the Lagrange multiplier terms $\lambda_k(x)\frac{\partial g_k}{\partial q_i}$ that
are introduced into the Euler equations. That is,
$$\frac{\partial f}{\partial q_i} - \frac{d}{dx}\frac{\partial f}{\partial q_i'} + \sum_{k}^{m} \lambda_k(x)\frac{\partial g_k}{\partial q_i} = 0 \tag{5.47}$$
where the holonomic equations of constraint are given by
$$g_k(q_i; x) = 0 \tag{5.37}$$
The advantage of using the Lagrange multiplier approach is that the variational procedure simultaneously
determines both the equations of motion for the $n$ variables plus the $m$ constraint forces acting on the
system.

Workshop exercises
1. Find the extremal of the functional
$$J(x) = \int_{1}^{2} \frac{\dot{x}^2}{t^3}\, dt$$
that satisfies $x(1) = 3$ and $x(2) = 18$. Show that this extremal provides the global minimum of $J$.

2. Consider the use of equations of constraint.

(a) A particle is constrained to move on the surface of a sphere. What are the equations of constraint for this
system?
(b) A disk of mass $m$ and radius $R$ rolls without slipping on the outside surface of a half-cylinder of radius
$5R$. What are the equations of constraint for this system?
(c) What are holonomic constraints? Which of the equations of constraint that you found above are holonomic?
(d) Equations of constraint that do not explicitly contain time are said to be scleronomic. Moving constraints
are rheonomic. Are the equations of constraint that you found above scleronomic or rheonomic?

3. For each of the following systems, describe the generalized coordinates that would work best. There may be
more than one answer for each system.

(a) An inclined plane of mass $M$ is sliding on a smooth horizontal surface, while a particle of mass $m$ is
sliding on the smooth inclined surface.
(b) A disk rolls without slipping across a horizontal plane. The plane of the disk remains vertical, but it is
free to rotate about a vertical axis.
(c) A double pendulum consisting of two simple pendula, with one pendulum suspended from the bob of the
other. The two pendula have equal lengths and have bobs of equal mass. Both pendula are confined to
move in the same plane.
(d) A particle of mass $m$ is constrained to move on a circle of radius $R$. The circle rotates in space about
one point on the circle, which is fixed. The rotation takes place in the plane of the circle, with constant
angular speed $\omega$, in the absence of a gravitational force.
(e) A particle of mass $m$ is attracted toward a given point by a force of magnitude $k/r^2$, where $k$ is a constant.

4. Looking back at the systems in problem 3, which ones could have equations of constraint? How would you
classify the equations of constraint (holonomic, scleronomic, rheonomic, etc.)?

Problems
1. Find the extremal of the functional
$$J(x) = \int_{0}^{\pi} \left( 2x\sin t - \dot{x}^2 \right) dt$$
that satisfies $x(0) = x(\pi) = 0$. Show that this extremal provides the global maximum of $J$.
Z2 q
√
2. Find and describe the path  = () for which the the integral  1 + ( 0 )2  is stationary.
1

3. Find the dimensions of the parallelepiped of maximum volume circumscribed by a sphere of radius $R$.
4. Consider a single loop of the cycloid having a fixed value of $a$ as shown in the figure. A cart is released from
rest at any point $P_0$ on the track between $O$ and the lowest point $P$, that is, $P_0$ has a parameter
$0 < \theta_0 < \pi$.

[Figure: single loop of a cycloid, showing the origin $O$, the $x$ axis, the release point $P_0$ and the lowest point $P$.]

(a) Show that the time $T$ for the cart to slide from $P_0$ to $P$ is given by the integral
$$T(P_0 \to P) = \sqrt{\frac{a}{g}} \int_{\theta_0}^{\pi} \sqrt{\frac{1 - \cos\theta}{\cos\theta_0 - \cos\theta}}\, d\theta$$
(b) Prove that this time $T$ is equal to $\pi\sqrt{a/g}$, which is independent of the position $P_0$.
(c) Explain qualitatively how this surprising result can possibly be true.


5. Consider a medium for which the refractive index $n = \frac{a}{r^2}$ where $a$ is a constant and $r$ is the distance from
the origin. Use Fermat's Principle to find the path of a ray of light travelling in a plane containing the origin.
Hint: use two-dimensional polar coordinates with $\phi = \phi(r)$. Show that the resulting path is a circle through
the origin.

6. Find the shortest path between the $(x, y, z)$ points $(0, -1, 0)$ and $(0, 1, 0)$ on the conical surface
$$z = 1 - \sqrt{x^2 + y^2}$$
What is the length of this path? Note that this is the shortest mountain path around a volcano.

7. Show that the geodesic on the surface of a right circular cylinder is a segment of a helix.
Chapter 6

Lagrangian dynamics

6.1 Introduction
Newtonian mechanics is based on vector observables such as momentum and force, and Newton's equations
of motion can be derived if the forces are known. Newtonian mechanics becomes difficult to apply for many-body
systems that involve constraint forces. The alternative algebraic Lagrangian mechanics approach is
based on the concept of scalar energies which circumvent many of the difficulties in handling constraint forces
and many-body systems.
The Lagrangian approach to classical dynamics is based on the calculus of variations introduced in chapter
5. It was shown that the calculus of variations determines the function $q_i(x)$ such that the scalar functional
$$F = \int_{x_1}^{x_2} \sum_{i}^{n} f\left[q_i(x), q_i'(x); x\right] dx \tag{6.1}$$
is an extremum, that is, a maximum or minimum. Here $x$ is the independent variable, $q_i(x)$ are the $n$
dependent variables, and their derivatives are $q_i' \equiv \frac{dq_i}{dx}$, where $i = 1, 2, 3, \dots, n$. The function $f\left[q_i(x), q_i'(x); x\right]$ has
an assumed dependence on $q_i$, $q_i'$ and $x$. The calculus of variations determines the functional dependence
of the dependent variables $q_i(x)$ on the independent variable $x$ that is needed to ensure that $F$ is an
extremum. For $n$ independent variables, $F$ has a stationary point, which is presumed to be an extremum,
that is determined by solution of Euler's differential equations
$$\frac{d}{dx}\frac{\partial f}{\partial q_i'} - \frac{\partial f}{\partial q_i} = 0 \tag{6.2}$$
If the coordinates $q_i(x)$ are independent, then the Euler equations, (6.2), for each coordinate $i$ are independent.
However, for constrained motion, the constraints lead to auxiliary conditions that correlate the
coordinates. As shown in chapter 5, a transformation to independent generalized coordinates can be made
such that the correlations induced by the constraint forces are embedded into the choice of the independent
generalized coordinates. The use of generalized coordinates in Lagrangian mechanics simplifies derivation of
the equations of motion for constrained systems. For example, for a system of $n$ coordinates that involves
$m$ holonomic constraints, there are $s = n - m$ independent generalized coordinates. For such holonomic
constrained motion, it will be shown that the Euler equations can be solved using any of the following
three alternative ways.
1) The minimal set of generalized coordinates approach involves finding a set of $s = n - m$ independent
generalized coordinates $q_i$ that satisfy the assumptions underlying (6.2). These generalized coordinates
can be determined if the $m$ equations of constraint are holonomic, that is, related by algebraic equations of
constraint
$$g_k(q_i; x) = 0 \tag{6.3}$$
where $k = 1, 2, 3, \dots, m$. These equations uniquely determine the relationship between the $n$ correlated coordinates.
This method has the advantage that it reduces the system of $n$ coordinates, subject to $m$ constraints,
to $s = n - m$ independent generalized coordinates which reduces the dimension of the problem to be solved.
However, it does not explicitly determine the forces of constraint which are effectively swept under the rug.


2) The Lagrange multipliers approach takes account of the correlation between the $n$ coordinates and $m$ holonomic constraints by introducing the Lagrange multipliers $\lambda_k(x)$. These $n$ generalized coordinates $q_i$ are correlated by the $m$ holonomic constraints.
$$\frac{d}{dx}\frac{\partial f}{\partial y_i'} - \frac{\partial f}{\partial y_i} = \sum_{k}^{m} \lambda_k(x) \frac{\partial g_k}{\partial y_i} \tag{6.4}$$
where $i = 1, 2, 3, \ldots, n$. The Lagrange multiplier approach has the advantage that Euler’s calculus of variations automatically uses the $n$ Lagrange equations, plus the $m$ equations of constraint, to explicitly determine both the $n$ coordinates $q_i$ and the $m$ forces of constraint, which are related to the Lagrange multipliers $\lambda_k$ as given in equation (6.4). Chapter 6.2 shows that the $\sum_{k}^{m} \lambda_k(x) \frac{\partial g_k}{\partial y_i}$ terms are directly related to the holonomic forces of constraint.
3) The generalized force approach incorporates the forces of constraint explicitly, as will be shown in chapter 6.5.4. Incorporating the constraint forces explicitly allows use of holonomic, non-holonomic, and non-conservative constraint forces.
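The coordinate counting used in the three approaches can be made concrete with a plane pendulum. This is a minimal sketch; the pendulum, its length $l$, the angle $\theta$, and the choice of measuring $y$ downward from the pivot are assumptions introduced here for illustration.

```latex
% Plane pendulum: n = 2 cartesian coordinates (x, y), with y measured downward from the pivot
% One holonomic constraint (m = 1) fixes the length l of the rod
g_1(x, y) = x^2 + y^2 - l^2 = 0
% Minimal set: s = n - m = 1 generalized coordinate, the angle \theta from the vertical
x = l \sin\theta , \qquad y = l \cos\theta
% The constraint is then satisfied identically for every value of \theta
g_1 = l^2 \sin^2\theta + l^2 \cos^2\theta - l^2 = 0
```

In the minimal-set approach only $\theta$ is retained and the tension in the rod never appears. In the Lagrange multipliers approach both coordinates are kept and one multiplier $\lambda_1$ is added, which is what allows the constraint force to be recovered.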
Understanding the Lagrange formulation of classical mechanics is facilitated by use of a simple non-rigorous plausibility approach that is based on Newton’s laws of motion. This introductory plausibility approach will be followed by two more rigorous derivations of the Lagrangian formulation developed using either d’Alembert’s Principle or Hamilton’s Principle. These better elucidate the physics underlying the Lagrange and Hamiltonian analytic representations of classical mechanics. In 1788 Lagrange derived his equations of motion using the differential d’Alembert Principle, which extends to dynamical systems the Bernoulli Principle of infinitesimal virtual displacements and virtual work. The other approach, developed in 1834, uses the integral Hamilton’s Principle to derive the Lagrange equations. Hamilton’s Principle is discussed in more detail in chapter 9. Euler’s variational calculus underlies d’Alembert’s Principle and Hamilton’s Principle since both are based on the philosophical belief that the laws of nature prefer economy of motion. Chapters 6.2 to 6.5 show that both d’Alembert’s Principle and Hamilton’s Principle lead to the Euler-Lagrange equations. This will be followed by a series of examples that illustrate the use of Lagrangian mechanics in classical mechanics.

6.2 Newtonian plausibility argument for Lagrangian mechanics

Insight into the physics underlying Lagrange mechanics is given by showing the direct relationship between Newtonian and Lagrangian mechanics. The variational approaches to classical mechanics exploit the first-order spatial integral of the force, equation 2.17, which equals the work done between the initial and final conditions. The work done is a simple scalar quantity that depends on the initial and final location for conservative forces. Newton’s equation of motion is
$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{6.5}$$
The kinetic energy is given by
$$T = \frac{1}{2} m v^2 = \frac{\mathbf{p}\cdot\mathbf{p}}{2m} = \frac{p_x^2}{2m} + \frac{p_y^2}{2m} + \frac{p_z^2}{2m}$$
It can be seen that
$$\frac{\partial T}{\partial \dot{x}} = p_x \tag{6.6}$$
and
$$\frac{d}{dt}\frac{\partial T}{\partial \dot{x}} = \frac{dp_x}{dt} = F_x \tag{6.7}$$
Consider that the force acting on a mass $m$ is arbitrarily separated into two components, one part that is conservative, and thus can be written as the gradient of a scalar potential $U$, plus the excluded part of the force, $\mathbf{F}^{EX}$. The excluded part of the force $\mathbf{F}^{EX}$ could include non-conservative frictional forces as well as forces of constraint, which may be conservative or non-conservative. This separation allows the force to be written as
$$\mathbf{F} = -\nabla U + \mathbf{F}^{EX} \tag{6.8}$$

Along each of the $x_i$ axes,
$$\frac{d}{dt}\frac{\partial T}{\partial \dot{x}_i} = -\frac{\partial U}{\partial x_i} + F^{EX}_{x_i} \tag{6.9}$$
Equation (6.9) can be extended by transforming the cartesian coordinate $x_i$ to the generalized coordinates $q_i$.
Define the standard Lagrangian to be the difference between the kinetic energy and the potential energy, which can be written in terms of the generalized coordinates $q_i$ as
$$L(q_i, \dot{q}_i) \equiv T(\dot{q}_i) - U(q_i) \tag{6.10}$$
Assume that the potential is only a function of the generalized coordinates $q_i$, that is, $\frac{\partial U}{\partial \dot{q}_i} = 0$; then
$$\frac{\partial L}{\partial \dot{q}_i} = \frac{\partial T}{\partial \dot{q}_i} - \frac{\partial U}{\partial \dot{q}_i} = \frac{\partial T}{\partial \dot{q}_i} \tag{6.11}$$
Using the above equations allows Newton’s equation of motion (6.9) to be expressed as
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = F^{EX}_{q_i} \tag{6.12}$$
The excluded force $F^{EX}_{q_i}$ can be partitioned into a holonomic constraint force $F^{HC}_{q_i}$ plus any remaining excluded forces $F^{EXC}_{q_i}$, as given by
$$F^{EX}_{q_i} = F^{HC}_{q_i} + F^{EXC}_{q_i} \tag{6.13}$$
A comparison of equations (6.12, 6.13) and (6.4) shows that the holonomic constraint forces $F^{HC}_{q_i}$ that are contained in the excluded force $F^{EX}_{q_i}$ can be identified with the Lagrange multiplier term in equation 6.4.
$$F^{HC}_{q_i} \equiv \sum_{k}^{m} \lambda_k(t) \frac{\partial g_k}{\partial q_i} \tag{6.14}$$
That is, the Lagrange multiplier terms can be used to account for holonomic constraint forces $F^{HC}_{q_i}$. Thus equation 6.12 can be written as
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = \sum_{k}^{m} \lambda_k(t) \frac{\partial g_k}{\partial q_i} + F^{EXC}_{q_i} \tag{6.15}$$
where the Lagrange multiplier term accounts for holonomic constraint forces, and $F^{EXC}_{q_i}$ includes all the remaining forces that are not accounted for by the scalar potential $U$ or the Lagrange multiplier terms $F^{HC}_{q_i}$.
For holonomic, conservative forces it is possible to absorb all the forces into the potential $U$ plus the Lagrange multiplier term, that is, $F^{EXC}_{q_i} = 0$. Moreover, the use of a minimal set of generalized coordinates allows the holonomic constraint forces to be ignored by explicitly reducing the number of coordinates from $n$ dependent coordinates to $s = n - m$ independent generalized coordinates. That is, the correlations due to the constraint forces are embedded into the generalized coordinates. Then equation 6.15 reduces to the basic Euler differential equations.
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = 0 \tag{6.16}$$
Note that equation 6.16 is identical to Euler’s equation 5.34 if the independent variable $x$ is replaced by time $t$. Thus Newton’s equations of motion are equivalent to minimizing the action integral $S = \int_{t_1}^{t_2} L\,dt$, that is
$$\delta S = \delta \int_{t_1}^{t_2} L(q_i, \dot{q}_i; t)\,dt = 0 \tag{6.17}$$
which is Hamilton’s Principle. Hamilton’s Principle underlies many aspects of physics and, as discussed in chapter 9, is used as the starting point for developing classical mechanics. Hamilton’s Principle was postulated 46 years after Lagrange introduced Lagrangian mechanics.
The above plausibility argument, which is based on Newtonian mechanics, illustrates the close connection between the vectorial Newtonian mechanics and the algebraic Lagrangian mechanics approaches to classical mechanics.
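The equivalence argued above can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes a one-dimensional harmonic oscillator with illustrative values `M` and `K`, takes the known solution of Newton’s equation, and evaluates the left-hand side of equation 6.16 along that path using finite differences. All function names are hypothetical.

```python
import math

M, K = 1.0, 4.0  # illustrative mass and spring constant (assumed values)

def lagrangian(q, qdot):
    """Standard Lagrangian L = T - U for a one-dimensional harmonic oscillator."""
    return 0.5 * M * qdot**2 - 0.5 * K * q**2

def dL_dq(q, qdot, h=1e-5):
    """Partial derivative of L with respect to q (central difference)."""
    return (lagrangian(q + h, qdot) - lagrangian(q - h, qdot)) / (2 * h)

def dL_dqdot(q, qdot, h=1e-5):
    """Partial derivative of L with respect to qdot (central difference)."""
    return (lagrangian(q, qdot + h) - lagrangian(q, qdot - h)) / (2 * h)

def newton_path(t):
    """Solution of Newton's equation M*qddot = -K*q with q(0) = 1, qdot(0) = 0."""
    omega = math.sqrt(K / M)
    return math.cos(omega * t), -omega * math.sin(omega * t)

def euler_lagrange_residual(t, dt=1e-4):
    """d/dt(dL/dqdot) - dL/dq evaluated along the Newtonian path at time t."""
    p_after = dL_dqdot(*newton_path(t + dt))
    p_before = dL_dqdot(*newton_path(t - dt))
    return (p_after - p_before) / (2 * dt) - dL_dq(*newton_path(t))

max_residual = max(abs(euler_lagrange_residual(0.1 * n)) for n in range(1, 20))
print(f"largest |d/dt(dL/dqdot) - dL/dq| along the Newtonian path: {max_residual:.2e}")
```

The residual is limited only by finite-difference and round-off error, which is consistent with the Newtonian path satisfying the Euler-Lagrange equation for this system.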

6.3 Lagrange equations from d’Alembert’s Principle

6.3.1 d’Alembert’s Principle of Virtual Work
The Principle of Virtual Work provides a basis for a rigorous derivation of Lagrangian mechanics. Bernoulli introduced the concept of virtual infinitesimal displacement of a system mentioned in chapter 5.9.1. This refers to a change in the configuration of the system as a result of any arbitrary infinitesimal instantaneous change of the coordinates $\delta\mathbf{r}_i$ that is consistent with the forces and constraints imposed on the system at the instant $t$. Lagrange’s symbol $\delta$ is used to designate a virtual displacement, which is called “virtual” to imply that there is no change in time $t$, i.e. $\delta t = 0$. This distinguishes it from an actual displacement $d\mathbf{r}_i$ of body $i$ during a time interval $dt$ when the forces and constraints may change.
Suppose that the system of $N$ particles is in equilibrium, that is, the total force on each particle $i$ is zero. The virtual work done by the force $\mathbf{F}_i$ moving a distance $\delta\mathbf{r}_i$ is given by the dot product $\mathbf{F}_i \cdot \delta\mathbf{r}_i$. For equilibrium, the sum of all these products for the $N$ bodies also must be zero
$$\sum_{i}^{N} \mathbf{F}_i \cdot \delta\mathbf{r}_i = 0 \tag{6.18}$$
Decomposing the force $\mathbf{F}_i$ on particle $i$ into applied forces $\mathbf{F}_i^A$ and constraint forces $\mathbf{f}_i^C$ gives
$$\sum_{i}^{N} \mathbf{F}_i^A \cdot \delta\mathbf{r}_i + \sum_{i}^{N} \mathbf{f}_i^C \cdot \delta\mathbf{r}_i = 0 \tag{6.19}$$
The second term in equation 6.19 can be ignored if the virtual work due to the constraint forces is zero. This is rigorously true for rigid bodies and is valid for any forces of constraint where the constraint forces are perpendicular to the constraint surface and the virtual displacement is tangent to this surface. Thus if the constraint forces do no work, then (6.19) reduces to
$$\sum_{i}^{N} \mathbf{F}_i^A \cdot \delta\mathbf{r}_i = 0 \tag{6.20}$$
This relation is Bernoulli’s Principle of Static Virtual Work and is used to solve problems in statics.
Bernoulli introduced dynamics by using Newton’s Law to relate force and momentum.
$$\mathbf{F}_i = \dot{\mathbf{p}}_i \tag{6.21}$$
Equation (6.21) can be rewritten as
$$\mathbf{F}_i - \dot{\mathbf{p}}_i = 0 \tag{6.22}$$
In 1742, d’Alembert developed the Principle of Dynamic Virtual Work in the form
$$\sum_{i}^{N} (\mathbf{F}_i - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = 0 \tag{6.23}$$
Using equations (6.19) plus (6.23) gives
$$\sum_{i}^{N} (\mathbf{F}_i^A - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i + \sum_{i}^{N} \mathbf{f}_i^C \cdot \delta\mathbf{r}_i = 0 \tag{6.24}$$
For the special case where the virtual work of the forces of constraint is zero, equation 6.24 reduces to d’Alembert’s Principle
$$\sum_{i}^{N} (\mathbf{F}_i^A - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{r}_i = 0 \tag{6.25}$$
d’Alembert’s Principle, by a stroke of genius, cleverly transforms the principle of virtual work from the realm of statics to dynamics. Application of virtual work to statics primarily leads to algebraic equations between the forces, whereas d’Alembert’s principle applied to dynamics leads to differential equations.
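A short worked case may help show how static virtual work yields an algebraic relation between forces. The lever below, with arm lengths $a$ and $b$ and downward applied forces $F_1$ and $F_2$, is an assumed example that does not appear in the original text.

```latex
% Lever on a fixed pivot; downward forces F_1 and F_2 act at distances a and b on opposite sides
% A virtual rotation \delta\theta lowers the first end by a\,\delta\theta and raises the second by b\,\delta\theta
\delta W = F_1 \, a \, \delta\theta - F_2 \, b \, \delta\theta = 0
% The pivot does not move, so the constraint force there does no virtual work
% Since \delta\theta is arbitrary, the applied forces must satisfy
F_1 \, a = F_2 \, b
```

The constraint force at the pivot never enters the calculation, which is the practical benefit of discarding the second term of equation 6.19.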

6.3.2 Transformation to generalized coordinates

In classical mechanical systems the coordinates $\mathbf{r}_i$ usually are not independent due to the forces of constraint, and the constraint-force energy contributes to equation 6.24. These problems can be eliminated by expressing d’Alembert’s Principle in terms of virtual displacements of $n$ independent generalized coordinates $q_i$ of the system for which the constraint force term $\sum_{i}^{n} \mathbf{f}_i^C \cdot \delta\mathbf{q}_i = 0$. Then the individual variational coefficients $\delta q_i$ are independent and $(\mathbf{F}_i^A - \dot{\mathbf{p}}_i) \cdot \delta\mathbf{q}_i = 0$ can be equated to zero for each value of $i$.
The transformation of the $N$-body system to $n$ independent generalized coordinates $q_j$ can be expressed as
$$\mathbf{r}_i = \mathbf{r}_i(q_1, q_2, q_3, \ldots, q_n, t) \tag{6.26}$$
Assuming $n$ independent coordinates, the velocity $\mathbf{v}_i$ can be written in terms of generalized coordinates $q_j$ using the chain rule for partial differentiation.
$$\mathbf{v}_i \equiv \frac{d\mathbf{r}_i}{dt} = \sum_{j}^{n} \frac{\partial \mathbf{r}_i}{\partial q_j}\dot{q}_j + \frac{\partial \mathbf{r}_i}{\partial t} \tag{6.27}$$
The arbitrary virtual displacement $\delta\mathbf{r}_i$ can be related to the virtual displacement of the generalized coordinate $\delta q_j$ by
$$\delta\mathbf{r}_i = \sum_{j}^{n} \frac{\partial \mathbf{r}_i}{\partial q_j}\delta q_j \tag{6.28}$$
Note that by definition, a virtual displacement considers only displacements of the coordinates, and no time variation $\delta t$ is involved.
The above transformations can be used to express d’Alembert’s dynamical principle of virtual work in generalized coordinates. Thus the first term in d’Alembert’s Dynamical Principle, (6.25), becomes
$$\sum_{i}^{N} \mathbf{F}_i^A \cdot \delta\mathbf{r}_i = \sum_{i,j}^{N,n} \mathbf{F}_i^A \cdot \frac{\partial \mathbf{r}_i}{\partial q_j}\delta q_j = \sum_{j}^{n} Q_j\,\delta q_j \tag{6.29}$$
where $Q_j$ are called components of the generalized force,¹ defined as
$$Q_j \equiv \sum_{i}^{N} \mathbf{F}_i^A \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \tag{6.30}$$
Note that just as the generalized coordinates $q_j$ need not have the dimensions of length, so the $Q_j$ do not necessarily have the dimensions of force, but the product $Q_j\,\delta q_j$ must have the dimensions of work. For example, $Q_j$ could be torque and $\delta q_j$ could be the corresponding infinitesimal rotation angle.
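The definition in equation 6.30 can be evaluated directly once the transformation $\mathbf{r}(q)$ is known. The sketch below is an illustration under stated assumptions: a single particle in plane polar coordinates, an arbitrary applied force along $+y$, and hypothetical helper names `position` and `generalized_force`. The partial derivatives are approximated by central differences.

```python
import math

def position(q):
    """Transformation r(q) from plane polar coordinates q = (r, theta) to cartesian (x, y)."""
    r, theta = q
    return (r * math.cos(theta), r * math.sin(theta))

def generalized_force(force, q, j, h=1e-6):
    """Q_j = F . (dr/dq_j), with the partial derivative taken by central differences."""
    q_plus, q_minus = list(q), list(q)
    q_plus[j] += h
    q_minus[j] -= h
    r_plus, r_minus = position(q_plus), position(q_minus)
    dr_dqj = [(a - b) / (2 * h) for a, b in zip(r_plus, r_minus)]
    return sum(f * d for f, d in zip(force, dr_dqj))

q = (2.0, math.pi / 6)   # r = 2, theta = 30 degrees (assumed values)
force = (0.0, 3.0)       # applied force of magnitude 3 along +y (assumed values)

Q_r = generalized_force(force, q, 0)      # has the dimensions of force
Q_theta = generalized_force(force, q, 1)  # has the dimensions of torque

x, y = position(q)
torque = x * force[1] - y * force[0]      # torque about the origin, for comparison
print(Q_r, Q_theta, torque)
```

Here `Q_r` is the radial component of the force, while `Q_theta` agrees with the torque about the origin, matching the remark that a generalized force conjugate to an angle is a torque.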
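The second term in d’Alembert’s Principle (6.25) can be transformed using equation 6.28
$$\sum_{i}^{N} \dot{\mathbf{p}}_i \cdot \delta\mathbf{r}_i = \sum_{i}^{N} m_i \ddot{\mathbf{r}}_i \cdot \delta\mathbf{r}_i = \sum_{i,j}^{N,n} \left( m_i \ddot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \right) \delta q_j \tag{6.31}$$
The right-hand side of (6.31) can be rewritten as
$$\sum_{i,j}^{N,n} \left( m_i \ddot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \right) \delta q_j = \sum_{i,j}^{N,n} \left\{ \frac{d}{dt}\left( m_i \dot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \right) - m_i \dot{\mathbf{r}}_i \cdot \frac{d}{dt}\left( \frac{\partial \mathbf{r}_i}{\partial q_j} \right) \right\} \delta q_j \tag{6.32}$$
Note that equation (6.27) gives that
$$\frac{\partial \mathbf{v}_i}{\partial \dot{q}_j} = \frac{\partial \mathbf{r}_i}{\partial q_j} \tag{6.33}$$
therefore the first right-hand term in (6.32) can be written as
$$\frac{d}{dt}\left( m_i \dot{\mathbf{r}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \right) = \frac{d}{dt}\left( m_i \mathbf{v}_i \cdot \frac{\partial \mathbf{v}_i}{\partial \dot{q}_j} \right) \tag{6.34}$$
¹ This proof, plus the notation, conforms with that used by Goldstein [Go50] and by other texts on classical mechanics.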

The second right-hand term in (632) can be rewritten by interchanging the order of the diﬀerentiation with
respect to  and  µ ¶
 r v
= (6.35)
  
Substituting (634) and (635) into (632) gives

Ã  !  ½ µ ¶ ¾
X X r X  v v
ṗ · r =  r̈ ·  =  v · −  v ·  (6.36)
 
 
  ̇ 

Inserting (629) and (636) into d’Alembert’s Principle (625) leads to the relation
 
( Ã Ã !! Ã ! )
X X   X1  X1
 2 2
(F − ṗ ) · r = −   −   −   = 0 (6.37)
 
  ̇ 
2  
2
P
The  21  2 term can be identified with the system kinetic energy  . Thus d’Alembert Principle reduces
to the relation
 ∙½
X µ ¶ ¾ ¸
  
− −   = 0 (6.38)

  ̇ 

For cartesian coordinates  is a function only of velocities (̇ ̇ ̇) and thus the term  = 0 However,

as discussed in appendix 22, for curvilinear coordinates  6= 0 due to the curvature of the coordinates
as is illustrated for polar coordinates where v =̇r̂ + ̇θ̂.
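The polar-coordinate remark can be written out explicitly. The following is a brief sketch for a single particle of mass $m$ moving in a plane, which is an assumed example; it uses only the velocity quoted above and the bracket of equation (6.38).

```latex
% Kinetic energy from v = \dot{r}\,\hat{r} + r\dot{\theta}\,\hat{\theta} for one particle of mass m
T = \tfrac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right)
% T depends on the coordinate r itself, not only on the velocities
\frac{\partial T}{\partial r} = m r \dot{\theta}^2 \neq 0 , \qquad \frac{\partial T}{\partial \theta} = 0
% Setting the bracket of (6.38) to zero for q_j = r and for q_j = \theta
m \ddot{r} - m r \dot{\theta}^2 = Q_r , \qquad \frac{d}{dt}\left( m r^2 \dot{\theta} \right) = Q_\theta
```

The $m r \dot{\theta}^2$ term is the centripetal contribution, and the second equation states that the generalized force $Q_\theta$, a torque, equals the rate of change of angular momentum.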
If all the $n$ generalized coordinates $q_j$ are independent, then equation 6.38 implies that the term in the square brackets is zero for each individual value of $j$. This leads to the basic Euler-Lagrange equations of motion for each of the independent generalized coordinates
$$\left\{ \frac{d}{dt}\left( \frac{\partial T}{\partial \dot{q}_j} \right) - \frac{\partial T}{\partial q_j} \right\} = Q_j \tag{6.39}$$
where $n \geq j \geq 1$. That is, this leads to $n$ Euler-Lagrange equations of motion for the generalized forces $Q_j$. As discussed in chapter 5.8, when $m$ holonomic constraint forces apply, it is possible to reduce the system to $s = n - m$ independent generalized coordinates for which equation 6.25 applies.
In 1687 Leibniz proposed minimizing the time integral of his “vis viva”, which equals $2T$. That is,
$$\delta \int_{t_1}^{t_2} T\,dt = 0 \tag{6.40}$$
The variational equation 6.39 accomplishes the minimization of equation 6.40. It is remarkable that Leibniz anticipated the basic variational concept prior to the birth of the developers of Lagrangian mechanics, i.e., d’Alembert, Euler, Lagrange, and Hamilton.

6.3.3 Lagrangian
The handling of both conservative and non-conservative generalized forces $Q_j$ is best achieved by assuming that the generalized force $Q_j = \sum_{i}^{N} \mathbf{F}_i^A \cdot \frac{\partial \mathbf{r}_i}{\partial q_j}$ can be partitioned into a conservative velocity-independent term that can be expressed in terms of the gradient of a scalar potential, $-\nabla U_j$ (that is, $-\frac{\partial U}{\partial q_j}$ for the generalized coordinate $q_j$), plus an excluded generalized force $Q_j^{EX}$ which contains the non-conservative, velocity-dependent, and all the constraint forces not explicitly included in the potential $U$. That is,
$$Q_j = -\nabla U_j + Q_j^{EX} \tag{6.41}$$
Inserting (6.41) into (6.38), and assuming that the potential $U$ is velocity independent, allows (6.38) to be rewritten as
$$\sum_{j}^{n} \left[ \left\{ \frac{d}{dt}\left( \frac{\partial (T - U)}{\partial \dot{q}_j} \right) - \frac{\partial (T - U)}{\partial q_j} \right\} - Q_j^{EX} \right] \delta q_j = 0 \tag{6.42}$$

The definition of the Standard Lagrangian is
$$L \equiv T - U \tag{6.43}$$
then (6.42) can be written as
$$\sum_{j}^{n} \left[ \left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} - Q_j^{EX} \right] \delta q_j = 0 \tag{6.44}$$
Note that equation (6.44) contains the basic Euler-Lagrange equation (6.38) as a special case when $U = 0$. In addition, note that if all the generalized coordinates are independent, then the square bracket terms are zero for each value of $j$, which leads to the general Euler-Lagrange equations of motion
$$\left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = Q_j^{EX} \tag{6.45}$$
where $n \geq j \geq 1$.
Chapter 6.5.3 will show that the holonomic constraint forces can be factored out of the generalized force term $Q_j^{EX}$, which simplifies derivation of the equations of motion using Lagrangian mechanics. The general Euler-Lagrange equations of motion are used extensively in classical mechanics because conservative forces play a ubiquitous role in classical mechanics.

6.4 Lagrange equations from Hamilton’s Action Principle

Hamilton published two papers in 1834 and 1835 announcing a fundamental new dynamical principle that underlies both Lagrangian and Hamiltonian mechanics. Hamilton was seeking a theory of optics when he developed Hamilton’s Action Principle, plus the field of Hamiltonian mechanics, both of which play a crucial role in classical mechanics and modern physics. Hamilton’s Action Principle states “dynamical systems follow paths that minimize the time integral of the Lagrangian”. That is, the action functional $S$
$$S = \int_{t_1}^{t_2} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt \tag{6.46}$$
has a minimum value for the correct path of motion. Hamilton’s Action Principle can be written in terms of a virtual infinitesimal displacement $\delta$ as
$$\delta S = \delta \int_{t_1}^{t_2} L\,dt = 0 \tag{6.47}$$
Variational calculus therefore implies that a system of $s$ independent generalized coordinates must satisfy the basic Lagrange-Euler equations
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = 0 \tag{6.48}$$
Note that for $Q_j^{EX} = 0$ this is the same as equation 6.45, which was derived using d’Alembert’s Principle. This discussion has shown that Euler’s variational differential equation underlies both the differential variational d’Alembert Principle and the more fundamental integral Hamilton’s Action Principle. As discussed in chapter 9.2, Hamilton’s Principle of Stationary Action adds a fundamental new dimension to classical mechanics which leads to derivation of both Lagrangian and Hamiltonian mechanics. That is, both Hamilton’s Action Principle and d’Alembert’s Principle can be used to derive Lagrangian mechanics, leading to the most general Lagrange equations that are applicable to both holonomic and non-holonomic constraints, as well as conservative and non-conservative systems. In addition, Chapter 6.2 presented a plausibility argument showing that Lagrangian mechanics can be justified based on Newtonian mechanics. Hamilton’s Action Principle and d’Alembert’s Principle can be expressed in terms of generalized coordinates, which makes them much broader in scope than the equations of motion implied using Newtonian mechanics.
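Hamilton’s Action Principle can be explored numerically by comparing the action of the true path with that of nearby paths sharing the same end points. The sketch below is illustrative only: it assumes a particle in uniform gravity, illustrative constants `M`, `G`, `T_TOTAL`, `STEPS`, a simple discretization of the action integral, and trial paths of the form $y(t) + \epsilon \sin(\pi t / T)$. None of these choices come from the original text.

```python
import math

M, G = 1.0, 9.81            # illustrative mass and gravitational acceleration (assumed values)
T_TOTAL, STEPS = 1.0, 1000  # time interval and number of time slices (assumed values)

def true_path(t):
    """Newtonian free-fall path with y(0) = y(T_TOTAL) = 0."""
    return 0.5 * G * T_TOTAL * t - 0.5 * G * t**2

def action(eps):
    """Discretized action for the path y(t) + eps*sin(pi*t/T), which keeps both end points fixed."""
    dt = T_TOTAL / STEPS
    times = [n * dt for n in range(STEPS + 1)]
    ys = [true_path(t) + eps * math.sin(math.pi * t / T_TOTAL) for t in times]
    total = 0.0
    for y0, y1 in zip(ys, ys[1:]):
        kinetic = 0.5 * M * ((y1 - y0) / dt) ** 2
        potential = M * G * 0.5 * (y0 + y1)
        total += (kinetic - potential) * dt   # L = T - U on this time slice
    return total

actions = {eps: action(eps) for eps in (-0.1, -0.01, 0.0, 0.01, 0.1)}
for eps, value in actions.items():
    print(f"eps = {eps:+.2f}   S = {value:.6f}")
```

For this system the action is smallest at `eps = 0` and changes symmetrically for `+eps` and `-eps`, which is the numerical signature of $\delta S = 0$: the first-order variation vanishes and only the second-order term remains.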

6.5 Constrained systems

The motion for systems subject to constraints is difficult to calculate using Newtonian mechanics because all the unknown constraint forces must be included explicitly with the active forces in order to determine the equations of motion. Lagrangian mechanics avoids these difficulties by allowing selection of independent generalized coordinates that incorporate the correlated motion induced by the constraint forces. This allows the constraint forces acting on the system to be ignored by reducing the system to a minimal set of generalized coordinates. The holonomic constraint forces can be determined using the Lagrange multiplier approach, or all constraint forces can be determined by including them as generalized forces, as described below.

6.5.1 Choice of generalized coordinates

As discussed in chapter 58, the flexibility and freedom for selection of generalized coordinates is a consid-
erable advantage of Lagrangian mechanics when handling constrained systems. The generalized coordinates
can be any set of independent variables that completely specify the scalar action functional, equation 646.
The generalized coordinates are not required to be orthogonal as is required when using the vectorial New-
tonian approach. The secret to using generalized coordinates is to select coordinates that are perpendicular
to the constraint forces so that the constraint forces do no work. Moreover, if the constraints are rigid, then
the constraint forces do no work in the direction of the constraint
P force. As a consequence, the constraint
forces do not contribute to the action integral and thus the  f · r term in equation 619 can be omit-
ted from the action integral. Generalized coordinates allow reducing the number of unknowns from  to
 =  −  when the system has  holonomic constraints. In addition, generalized coordinates facilitate
using both the Lagrange multipliers, and the generalized forces, approaches for determining the constraint
forces.

6.5.2 Minimal set of generalized coordinates

The set of $n$ generalized coordinates $q_j$ is used to describe the motion of the system. No restrictions have been placed on the nature of the constraints other than that they are workless for a virtual displacement. If the $m$ constraints are holonomic, then it is possible to find sets of $s = n - m$ independent generalized coordinates $q_j$ that contain the $m$ constraint conditions implicitly in the transformation equations
$$\mathbf{r}_i = \mathbf{r}_i(q_1, q_2, q_3, \ldots, q_s, t) \tag{6.49}$$
For the case of $s = n - m$ unknowns, any virtual displacement $\delta q_j$ is independent of $\delta q_k$, therefore the only way for (6.44) to hold is for the term in brackets to vanish for each value of $j$, that is
$$\left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = Q_j^{EX} \tag{6.50}$$
where $j = 1, 2, 3, \ldots, s$. These are the Lagrange equations for the minimal set of $s$ independent generalized coordinates.
If all the generalized forces are conservative plus velocity independent, and are included in the potential $U$ so that $Q_j^{EX} = 0$, then (6.50) simplifies to
$$\left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = 0 \tag{6.51}$$
This is Euler’s differential equation, derived earlier using the calculus of variations. Thus d’Alembert’s Principle leads to a solution that minimizes the action integral, $\delta \int_{t_1}^{t_2} L\,dt = 0$, as stated by Hamilton’s Principle.

6.5.3 Lagrange multipliers approach

Equation (644) sums over all  coordinates for  particles, providing  equations of motion. If the 
constraints are holonomic they can be expressed by  algebraic equations of constraint

 (1  2    ) = 0 (6.52)


where $k = 1, 2, 3, \ldots, m$. Kinematic constraints can be expressed in terms of the infinitesimal displacements of the form
$$\sum_{j=1}^{n} \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)\,dq_j + \frac{\partial g_k}{\partial t}\,dt = 0 \tag{6.53}$$
where $k = 1, 2, 3, \ldots, m$, $j = 1, 2, 3, \ldots, n$, and where the $\frac{\partial g_k}{\partial q_j}$ and $\frac{\partial g_k}{\partial t}$ are functions of the generalized coordinates $q_j$, described by the vector $\mathbf{q}$, that are derived from the equations of constraint. As discussed in chapter 5.7, if (6.53) represents the total differential of a function, then it can be integrated to give a holonomic relation of the form of equation (6.52). However, if (6.53) is not the total differential, then it can be integrated only after having solved the full problem. If $\frac{\partial g_k}{\partial t} = 0$ then the $k^{th}$ constraint is scleronomic.
The discussion of Lagrange multipliers in chapter 5.9.1 showed that, for virtual displacements $\delta q_j$, the correlation of the generalized coordinates, due to the constraint forces, can be taken into account by multiplying (6.53) by unknown Lagrange multipliers $\lambda_k$ and summing over all $m$ constraints. Generalized forces can be partitioned into a Lagrange multiplier term plus a remainder force. That is
$$Q_j^{EX} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC} \tag{6.54}$$
since by definition $\delta t = 0$ for virtual displacements.

Chapter 591 showed that holonomic forces of constraint can be taken into account by introducing
the Lagrange undetermined multipliers approach, which is equivalent to defining an extended Lagrangian
0 (q q̇ λ) where
 X
X 

0 (q q̇ λ) = (q q̇) +  (q ) (6.55)
=1

=1
0
Finding the extremum for the extended Lagrangian  (q q̇ λ) using (647) gives

"½ µ ¶ ¾ X
#
X     
− −  (q ) −   = 0 (6.56)

  ̇  
=1

where 
 is the remaining part of the generalized force  after subtracting both the part of the force
absorbed in the potential energy  , which is buried in the Lagrangian
P , as well as the holonomic constraint
forces which are included in the Lagrange multiplier terms =1    (q ). The  Lagrange multipliers


 can be chosen arbitrarily in (656)  Utilizing the free choice of the  Lagrange multipliers  allows them
to be determined in such a way that the coeﬃcients of the first  infinitessimals, i.e. the square brackets
vanish. Therefore the expression in the square bracket must vanish for each value of 1 ≤  ≤ . Thus it
follows that ½ µ ¶ ¾ X 
   
− −  (q ) − 
 =0 (6.57)
  ̇  
=1

when  = 1 2  Thus (656) reduces to a sum over the remaining coordinates between  + 1 ≤  ≤ 

"½ µ ¶ ¾ X 
#
X     
− −  (q ) −   = 0 (6.58)
=+1
  ̇  
=1

In equation (658) the  =  −  infinitessimals  can be chosen freely since the  =  −  degrees
of freedom are independent. Therefore the expression in the square bracket must vanish for each value of
 + 1 ≤  ≤ . Thus it follows that
½ µ ¶ ¾ X 
   
− −  (q ) − 
 =0 (6.59)
  ̇  
=1

where  =  + 1  + 2  Combining equations (657) and (659) then gives the important general relation
that for 1 ≤  ≤ 
½ µ ¶ ¾ X 
   
− =  (q ) + 
 (6.60)
  ̇  
=1

To summarize, the Lagrange multiplier approach (6.60) automatically solves the $n$ equations plus the $m$ holonomic equations of constraint, which determines the $n + m$ unknowns, that is, the $n$ coordinates plus the $m$ forces of constraint. The beauty of the Lagrange multipliers is that all $n$ variables, plus the $m$ constraint forces, are found simultaneously by using the calculus of variations to determine the extremum for the expanded Lagrangian $L'(\mathbf{q}, \dot{\mathbf{q}}, \boldsymbol{\lambda}, t)$.
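As a hedged illustration of how a multiplier yields a constraint force, consider a plane pendulum treated with two coordinates and one constraint. The setup is an assumption made here: mass $m$, rod length $l$, angle $\theta$ measured from the downward vertical, and $g_0$ for the gravitational acceleration so that it is not confused with the constraint function $g_1$.

```latex
% Coordinates (r, \theta); holonomic constraint g_1(r) = r - l = 0, so n = 2 and m = 1
L = \tfrac{1}{2} m \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right) + m g_0 r \cos\theta
% Equation (6.60) with Q^{EXC} = 0, using \partial g_1/\partial r = 1 and \partial g_1/\partial\theta = 0
m \ddot{r} - m r \dot{\theta}^2 - m g_0 \cos\theta = \lambda_1
\frac{d}{dt}\left( m r^2 \dot{\theta} \right) + m g_0 r \sin\theta = 0
% Imposing the constraint r = l, so that \dot{r} = \ddot{r} = 0
\lambda_1 = -\left( m l \dot{\theta}^2 + m g_0 \cos\theta \right) , \qquad \ddot{\theta} = -\frac{g_0}{l} \sin\theta
```

The two Lagrange equations plus the one constraint equation fix the three unknowns $r$, $\theta$, and $\lambda_1$. The negative sign of $\lambda_1$ indicates a force along $-\hat{\mathbf{r}}$, that is, the tension in the rod directed toward the pivot.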

6.5.4 Generalized forces approach

The two right-hand terms in (660) can be understood to be those forces acting on the system that are
not
P absorbed into the scalar potential  component of the Lagrangian . The Lagrange multiplier terms


=1   (q ) account for the holonomic forces of constraint that are not included in the conservative
potential or in the generalized forces 
 . The generalized force

X r

 = F
 · (617)



is the sum of the components in the  direction for all external forces that have not been taken into account
by the scalar potential or the Lagrange multipliers. Thus the non-conservative generalized force  
contains non-holonomic constraint forces, including dissipative forces such as drag or friction, that are not
included in  or used in the Lagrange multiplier terms to account for the holonomic constraint forces.
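A dissipative force entering through $Q_j^{EXC}$ can be illustrated with a damped oscillator. The sketch below rests on assumptions not in the original text: a single coordinate $q$, Lagrangian $L = \frac{1}{2}M\dot{q}^2 - \frac{1}{2}Kq^2$, a linear drag $Q^{EXC} = -B\dot{q}$, illustrative constants, and a classical fourth-order Runge-Kutta integrator. It checks that the mechanical energy lost equals the work done against the generalized force.

```python
M, K, B = 1.0, 4.0, 0.3   # illustrative mass, spring constant, damping coefficient (assumed values)

def acceleration(q, v):
    """Lagrange equation M*qddot + K*q = Q_exc, with the generalized force Q_exc = -B*qdot."""
    return (-K * q - B * v) / M

def energy(q, v):
    """Mechanical energy T + U of the oscillator."""
    return 0.5 * M * v**2 + 0.5 * K * q**2

def simulate(q=1.0, v=0.0, dt=1e-3, steps=5000):
    """Integrate with classical RK4 and accumulate the work done against the drag force."""
    dissipated = 0.0
    for _ in range(steps):
        k1q, k1v = v, acceleration(q, v)
        k2q, k2v = v + 0.5 * dt * k1v, acceleration(q + 0.5 * dt * k1q, v + 0.5 * dt * k1v)
        k3q, k3v = v + 0.5 * dt * k2v, acceleration(q + 0.5 * dt * k2q, v + 0.5 * dt * k2v)
        k4q, k4v = v + dt * k3v, acceleration(q + dt * k3q, v + dt * k3v)
        q_new = q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6
        v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        dissipated += 0.5 * B * (v**2 + v_new**2) * dt   # trapezoidal estimate of -Q_exc*qdot*dt
        q, v = q_new, v_new
    return q, v, dissipated

initial_energy = energy(1.0, 0.0)
q_end, v_end, dissipated = simulate()
final_energy = energy(q_end, v_end)
print(initial_energy, final_energy, dissipated)
```

The drag force cannot be derived from a scalar potential, so it is kept on the right-hand side of the Lagrange equation rather than inside $L$; the energy balance confirms that it accounts for all of the energy lost by the conservative part of the system.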
The concept of generalized forces is illustrated by the case of spherical coordinate systems. The table below gives the displacement elements $\delta q_j$ (taken from table 4) and the generalized force for the three coordinates. Note that $Q_r$ has the dimensions of force and $Q_r\,\delta r$ has the units of energy. By contrast, equation 6.30 gives that $Q_\theta = F_\theta r$ and $Q_\phi = F_\phi r \sin\theta$, which have the dimensions of torque. However, $Q_\theta\,\delta\theta$ and $Q_\phi\,\delta\phi$ both have the dimensions of energy, as is required in equation 6.30. This illustrates that the units used for generalized forces depend on the units of the corresponding generalized coordinate.

Unit vector                   $\delta q_j$       Displacement along unit vector    $Q_j$                     $Q_j\,\delta q_j$
$\hat{\mathbf{r}}$            $\delta r$         $\delta r$                        $F_r$                     $F_r\,\delta r$
$\hat{\boldsymbol{\theta}}$   $\delta\theta$     $r\,\delta\theta$                 $F_\theta r$              $F_\theta r\,\delta\theta$
$\hat{\boldsymbol{\phi}}$     $\delta\phi$       $r \sin\theta\,\delta\phi$        $F_\phi r \sin\theta$     $F_\phi r \sin\theta\,\delta\phi$

6.6 Applying the Euler-Lagrange equations to classical mechanics

d’Alembert’s principle of virtual work has been used to derive the Euler-Lagrange equations, which also satisfy Hamilton’s Principle and the Newtonian plausibility argument. These imply that the actual path taken in configuration space $(q_i, \dot{q}_i, t)$ is the one that minimizes the action integral $\int_{t_1}^{t_2} L(q_i, \dot{q}_i; t)\,dt$. As a consequence, the Euler equations for the calculus of variations lead to the Lagrange equations of motion
$$\left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC} \tag{6.60}$$
for $n$ variables, with $m$ equations of constraint. The generalized forces $Q_j^{EXC}$ are those not included in the conservative potential energy $U$ or in the Lagrange multipliers approach for holonomic equations of constraint.²
The following is a logical procedure for applying the Euler-Lagrange equations to classical mechanics.

1) Select a set of independent generalized coordinates:

Select an optimum set of independent generalized coordinates as described in chapter 6.5.1. Use of generalized coordinates is always advantageous since they incorporate the constraints and can reduce the number of unknowns, both of which simplify use of Lagrangian mechanics.

² Euler’s differential equation is ubiquitous in Lagrangian mechanics. Thus, for brevity, it is convenient to define the concept of the Lagrange linear operator $\Lambda_j$ as described in appendix  2
$$\Lambda_j \equiv \frac{d}{dt}\frac{\partial}{\partial \dot{q}_j} - \frac{\partial}{\partial q_j}$$
where $\Lambda_j$ operates on the Lagrangian $L$. Then Euler’s equations can be written compactly in the form $\Lambda_j L = 0$.

2) Partition of the active forces:

The active forces should be partitioned into the following three groups:
(i) Conservative one-body forces plus the velocity-dependent electromagnetic force, which can be characterized by the scalar potential $U$ that is absorbed into the Lagrangian. The gravitational forces plus the velocity-dependent electromagnetic force can be absorbed into the potential $U$ as discussed in chapter 6.10. This approach is by far the easiest way to account for such forces in Lagrangian mechanics.
(ii) Holonomic constraint forces provide algebraic relations that couple some of the generalized coordinates. This coupling can be used either to reduce the number of generalized coordinates, or to determine these holonomic constraint forces using the Lagrange multiplier approach.
(iii) Generalized forces provide a mechanism for introducing non-conservative and non-holonomic constraint forces into Lagrangian mechanics. Typically generalized forces are used to introduce dissipative forces.
Typical systems can involve a mixture of all three categories of active forces. For example, mechanical systems often include gravity, introduced as a potential; holonomic constraint forces, determined using Lagrange multipliers; and dissipative forces, included as generalized forces.

3) Minimal set of generalized coordinates:

The ability to embed constraint forces directly into the generalized coordinates is a tremendous advantage enjoyed by the Lagrangian and Hamiltonian variational approaches to classical mechanics. If the constraint forces are not required, then choice of a minimal set of generalized coordinates significantly reduces the number of equations of motion that need to be solved.

4) Derive the Lagrangian:

The Lagrangian is derived in terms of the generalized coordinates, including the conservative forces that are buried in the scalar potential $U$.

5) Derive the equations of motion:

Equation (6.60) is solved to determine the $n$ generalized coordinates, plus the $m$ Lagrange multipliers characterizing the holonomic constraint forces, plus any generalized forces that were included. The holonomic constraint forces then are given by evaluating the $\lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$ terms for the $m$ holonomic forces.
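The five steps above can be sketched for a simple system. The example below is illustrative and rests on assumptions not in the original text: an ideal Atwood machine with masses `M1` and `M2`, a single generalized coordinate `y` that embeds the fixed string length, gravity absorbed into the potential, and no remaining generalized forces. The hypothetical helper `acceleration` solves the Euler-Lagrange equation for a one-coordinate Lagrangian with no explicit time dependence, using finite differences.

```python
M1, M2, G = 3.0, 1.0, 9.81   # illustrative masses and gravitational acceleration (assumed values)

def lagrangian(y, ydot):
    """Atwood machine with one generalized coordinate y (downward displacement of M1).

    The fixed string length is embedded in the coordinate: M2 rises by y when M1 falls by y.
    """
    kinetic = 0.5 * (M1 + M2) * ydot**2
    potential = -(M1 - M2) * G * y      # gravity enters through the scalar potential U
    return kinetic - potential

def acceleration(L, q, qdot, h=1e-3):
    """Solve d/dt(dL/dqdot) - dL/dq = 0 for qddot when L = L(q, qdot) has no explicit time dependence.

    The chain rule gives d/dt(dL/dqdot) = L_vv*qddot + L_vq*qdot; derivatives use central differences.
    """
    L_q = (L(q + h, qdot) - L(q - h, qdot)) / (2 * h)
    L_vv = (L(q, qdot + h) - 2 * L(q, qdot) + L(q, qdot - h)) / h**2
    L_vq = (L(q + h, qdot + h) - L(q + h, qdot - h)
            - L(q - h, qdot + h) + L(q - h, qdot - h)) / (4 * h**2)
    return (L_q - L_vq * qdot) / L_vv

y_ddot = acceleration(lagrangian, 0.3, 0.5)
expected = (M1 - M2) * G / (M1 + M2)   # standard Atwood-machine result, used as a cross-check
print(y_ddot, expected)
```

Because the constraint is built into the coordinate, the string tension never appears. Recovering it would require keeping both vertical positions as coordinates and adding one Lagrange multiplier for the string-length constraint.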


In summary, Lagrangian mechanics is based on energies, which are scalars, in contrast to Newtonian mechanics, which is based on vector forces and momentum. As a consequence, Lagrangian mechanics allows use of any set of independent generalized coordinates, which do not have to be orthogonal and can have very different units for different variables. The generalized coordinates can incorporate the correlations introduced by constraint forces.
The active forces are split into the following three categories:

1. Velocity-independent conservative forces are taken into account using scalar potentials $U$.
2. Holonomic constraint forces can be determined using Lagrange multipliers.
3. Non-holonomic constraints require use of generalized forces $Q_j^{EXC}$.

Use of the concept of scalar potentials is a trivial and powerful way to incorporate conservative forces in Lagrangian mechanics. The Lagrange multipliers approach requires using the Euler-Lagrange equations for $n + m$ coordinates but determines both holonomic constraint forces and equations of motion simultaneously. Non-holonomic constraints and dissipative forces can be incorporated into Lagrangian mechanics via use of generalized forces, which broadens the scope of Lagrangian mechanics.
Note that the equations of motion resulting from the Lagrange-Euler algebraic approach are the same
equations of motion as obtained using Newtonian mechanics. However, the Lagrangian is a scalar which
facilitates rotation into the most convenient frame of reference. This can greatly simplify determination of
the equations of motion when constraint forces apply. As discussed in chapter 17, the Lagrangian and the
Hamiltonian variational approaches to mechanics are the only viable way to handle relativistic, statistical,
and quantum mechanics.
146 CHAPTER 6. LAGRANGIAN DYNAMICS

6.7 Applications to unconstrained systems

Although most dynamical systems involve constrained motion, it is useful to consider examples of systems
subject to conservative forces with no constraints. For no constraints, the Lagrange-Euler equations (6.60) simplify to $\Lambda_j L = 0$ where $j = 1, 2, \ldots, n$ and the transformation to generalized coordinates is of no consequence.

6.1 Example: Motion of a free particle, U=0

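The Lagrangian in cartesian coordinates is $L = \frac{1}{2}m(\dot{x}^2 + \dot{y}^2 + \dot{z}^2)$. Then

$$\frac{\partial L}{\partial \dot{x}} = m\dot{x} \qquad \frac{\partial L}{\partial \dot{y}} = m\dot{y} \qquad \frac{\partial L}{\partial \dot{z}} = m\dot{z}$$
$$\frac{\partial L}{\partial x} = \frac{\partial L}{\partial y} = \frac{\partial L}{\partial z} = 0$$

Inserting these in the Lagrange equation gives

$$\Lambda_x L = \frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = \frac{d}{dt}(m\dot{x}) - 0 = 0$$

Thus

$$p_x = m\dot{x} = \text{constant}$$
$$p_y = m\dot{y} = \text{constant}$$
$$p_z = m\dot{z} = \text{constant}$$

That is, the linear momentum is conserved if $U$ is a constant, that is, if no forces apply. Note that momentum conservation has been derived without any direct reference to forces.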

6.2 Example: Motion in a uniform gravitational field

Consider motion in the $x$-$y$ plane. The kinetic energy is $T = \frac{1}{2}m(\dot{x}^2 + \dot{y}^2)$ while the potential energy is $U = mgy$ where $U(y = 0) = 0$. Thus

$$L = \frac{1}{2}m\left(\dot{x}^2 + \dot{y}^2\right) - mgy$$

Figure: Motion in a gravitational field.

Using the Lagrange equation for the $x$ coordinate gives

$$\Lambda_x L = \frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = \frac{d}{dt}(m\dot{x}) - 0 = 0$$

Thus the horizontal momentum $m\dot{x}$ is conserved and $\ddot{x} = 0$. The $y$ coordinate gives

$$\Lambda_y L = \frac{d}{dt}\frac{\partial L}{\partial \dot{y}} - \frac{\partial L}{\partial y} = \frac{d}{dt}(m\dot{y}) + mg = 0$$

Thus the Lagrangian produces the same results as derived using Newton’s Laws of Motion.

$$\ddot{x} = 0 \qquad \ddot{y} = -g$$

The importance of selecting the most convenient generalized coordinates is nicely illustrated by trying to solve this problem using polar coordinates $r, \theta$ where $r$ is the radial distance and $\theta$ the elevation angle from the $x$ axis as shown in the adjacent figure. Then

$$T = \frac{1}{2}m\dot{r}^2 + \frac{1}{2}m\left(r\dot{\theta}\right)^2$$
$$U = mgr\sin\theta$$

Thus

$$L = \frac{1}{2}m\dot{r}^2 + \frac{1}{2}m\left(r\dot{\theta}\right)^2 - mgr\sin\theta$$

$\Lambda_r L = 0$ for the $r$ coordinate gives

$$r\dot{\theta}^2 - g\sin\theta - \ddot{r} = 0$$

$\Lambda_\theta L = 0$ for the $\theta$ coordinate gives

$$-gr\cos\theta - 2r\dot{r}\dot{\theta} - r^2\ddot{\theta} = 0$$

These equations written in polar coordinates are more complicated than the result expressed in cartesian coordinates. This is because the potential energy depends directly on the $y$ coordinate, whereas it is a function of both $r$ and $\theta$. This illustrates the freedom for using different generalized coordinates, plus the importance of choosing a sensible set of generalized coordinates.
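As a rough numerical illustration of the point above, the following sketch integrates the polar equations of motion with a hand-written fourth-order Runge-Kutta step and compares the end point with the closed-form cartesian solution $x = x_0 + v_{x0}t$, $y = y_0 + v_{y0}t - \frac{1}{2}gt^2$. The value of `G`, the helper names, and the initial conditions are assumptions chosen for illustration only.

```python
import math

G = 9.81  # assumed gravitational acceleration in m/s^2


def polar_rhs(state):
    """Polar equations of motion: r'' = r*theta'^2 - g*sin(theta),
    theta'' = (-g*cos(theta) - 2*r'*theta') / r."""
    r, theta, r_dot, theta_dot = state
    r_ddot = r * theta_dot ** 2 - G * math.sin(theta)
    theta_ddot = (-G * math.cos(theta) - 2.0 * r_dot * theta_dot) / r
    return (r_dot, theta_dot, r_ddot, theta_ddot)


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def polar_end_point(x0, y0, vx0, vy0, t_end, steps=3000):
    """Integrate in polar coordinates, return the final (x, y)."""
    r = math.hypot(x0, y0)
    theta = math.atan2(y0, x0)
    r_dot = (x0 * vx0 + y0 * vy0) / r
    theta_dot = (x0 * vy0 - y0 * vx0) / r ** 2
    state = (r, theta, r_dot, theta_dot)
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(polar_rhs, state, dt)
    return state[0] * math.cos(state[1]), state[0] * math.sin(state[1])


def cartesian_end_point(x0, y0, vx0, vy0, t):
    """Closed-form solution of x'' = 0, y'' = -g."""
    return x0 + vx0 * t, y0 + vy0 * t - 0.5 * G * t * t
```

The polar integration needs a numerical solver and must avoid $r = 0$, whereas the cartesian form integrates by inspection, which is the practical content of the remark about choosing sensible coordinates.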

6.3 Example: Central forces

Consider a mass $m$ moving under the influence of a spherically-symmetric, conservative, attractive, inverse-square force. The potential then is

$$U = -\frac{k}{r}$$

It is natural to express the Lagrangian in spherical coordinates for this system. That is,

$$L = \frac{1}{2}m\dot{r}^2 + \frac{1}{2}m\left(r\dot{\theta}\right)^2 + \frac{1}{2}m\left(r\sin\theta\,\dot{\phi}\right)^2 + \frac{k}{r}$$

$\Lambda_r L = 0$ for the $r$ coordinate gives

$$m\ddot{r} - mr\left[\dot{\theta}^2 + \sin^2\theta\,\dot{\phi}^2\right] = -\frac{k}{r^2}$$

where the $mr\sin^2\theta\,\dot{\phi}^2$ term comes from the centripetal acceleration.
$\Lambda_\phi L = 0$ for the $\phi$ coordinate gives

$$\frac{d}{dt}\left(mr^2\sin^2\theta\,\dot{\phi}\right) = 0$$

This implies that the time derivative of the angular momentum about the $z$ axis is zero, $\dot{p}_\phi = 0$, and thus $p_\phi = mr^2\sin^2\theta\,\dot{\phi}$ is a constant of motion.
$\Lambda_\theta L = 0$ for the $\theta$ coordinate gives

$$\frac{d}{dt}\left(mr^2\dot{\theta}\right) - mr^2\sin\theta\cos\theta\,\dot{\phi}^2 = 0$$

That is,

$$\dot{p}_\theta = mr^2\sin\theta\cos\theta\,\dot{\phi}^2 = \frac{p_\phi^2\cos\theta}{mr^2\sin^3\theta}$$

Note that $p_\theta$ is a constant of motion if $p_\phi = 0$ and only the radial coordinate is influenced by the radial form of the central potential.

6.8 Applications to systems involving holonomic constraints

The equations of motion that result from the Lagrange-Euler algebraic approach are the same as those given by Newtonian mechanics. The solution of these equations of motion can be obtained mathematically using the chosen initial conditions. The following simple example of a disk rolling on an inclined plane is useful for comparing the merits of the Newtonian method with Lagrangian mechanics employing either minimal generalized coordinates, the Lagrange multipliers, or the generalized forces approaches.

6.4 Example: Disk rolling on an inclined plane

Consider a disk rolling down an inclined plane to compare the results obtained using Newton’s laws with the results obtained using Lagrange’s equations with either generalized coordinates, Lagrange multipliers, or generalized forces. All these cases assume that the friction is sufficient to ensure that the rolling equation of constraint applies and that the disk has a radius $R$ and moment of inertia $I$. Assume as generalized coordinates the distance along the inclined plane $y$, which is perpendicular to the normal constraint force $N$, the distance perpendicular to the inclined plane $x$, plus the rolling angle $\theta$. The constraint for rolling is holonomic

$$y - R\theta = 0$$

The frictional force is $F_f$. The constraint that it rolls along the plane implies

$$x - R = 0$$

Figure: Disk rolling without slipping on an inclined plane.

a) Newton’s laws of motion

Newton’s law for the components of the forces along the inclined plane gives

$$mg\sin\alpha - F_f = m\ddot{y} \tag{a}$$

Perpendicular to the inclined plane, Newton’s law gives

$$mg\cos\alpha = N \tag{b}$$

The torque on the disk gives

$$F_f R = I\ddot{\theta} \tag{c}$$

Assuming the disc rolls gives

$$\ddot{y} = R\ddot{\theta}$$

then

$$F_f = \frac{I}{R^2}\ddot{y}$$

Inserting this into equation (a) gives

$$\left(m + \frac{I}{R^2}\right)\ddot{y} - mg\sin\alpha = 0$$

The moment of inertia of a uniform solid circular disk is $I = \frac{1}{2}mR^2$. Therefore

$$\ddot{y} = \frac{2}{3}g\sin\alpha$$

and the frictional force is

$$F_f = \frac{mg}{3}\sin\alpha$$

which is smaller than the gravitational force along the plane, which is $mg\sin\alpha$.

b) Lagrange equations with a minimal set of generalized coordinates

Using the generalized coordinates defined above, the total kinetic energy is

$$T = \frac{1}{2}m\dot{y}^2 + \frac{1}{2}I\dot{\theta}^2$$

The conservative gravitational force can be absorbed into the potential energy

$$U = mg(l - y)\sin\alpha$$

Thus the Lagrangian is

$$L = \frac{1}{2}m\dot{y}^2 + \frac{1}{2}I\dot{\theta}^2 - mg(l - y)\sin\alpha$$

The holonomic equations of constraint are

$$g_1 = y - R\theta = 0$$
$$g_2 = x - R = 0$$

A holonomic constraint can be used to reduce the system to a single generalized coordinate $y$ plus generalized velocity $\dot{y}$. Expressed in terms of this single generalized coordinate, the Lagrangian becomes

$$L = \frac{1}{2}\left(m + \frac{I}{R^2}\right)\dot{y}^2 - mg(l - y)\sin\alpha$$

The Lagrange equation $\Lambda_y L = 0$ gives

$$mg\sin\alpha = \left(m + \frac{I}{R^2}\right)\ddot{y}$$

Again if $I = \frac{1}{2}mR^2$ then

$$\ddot{y} = \frac{2}{3}g\sin\alpha$$

The solution for the $x$ coordinate is trivial. This answer is identical to that obtained using Newton’s laws of motion. Note that no forces have been determined using the single generalized coordinate.

c) Lagrange equation with Lagrange multipliers

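Again the conservative gravitational force is absorbed into the scalar potential while the holonomic constraints are taken into account using Lagrange multipliers. Ignoring the trivial $x$ dependence, the Lagrangian is given above to be

$$L = \frac{1}{2}m\dot{y}^2 + \frac{1}{2}I\dot{\theta}^2 - mg(l - y)\sin\alpha$$

The constraint equations are

$$g_1 = y - R\theta = 0$$
$$g_2 = x - R = 0$$

The Lagrange equation for the $y$ coordinate

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{y}} - \frac{\partial L}{\partial y} = \lambda_1\frac{\partial g_1}{\partial y} + \lambda_2 \cdot 0$$

gives

$$m\ddot{y} - mg\sin\alpha = \lambda_1$$

The Lagrange equation for the $\theta$ coordinate

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}} - \frac{\partial L}{\partial \theta} = \lambda_1\frac{\partial g_1}{\partial \theta} + \lambda_2 \cdot 0$$

gives

$$I\ddot{\theta} = -\lambda_1 R$$

The constraint can be written as

$$\ddot{y} = R\ddot{\theta}$$

Letting $I = \frac{1}{2}mR^2$ and solving for $\ddot{y}$, $\ddot{\theta}$ and $\lambda_1$ gives

$$\lambda_1 = -\frac{mg}{\left(1 + \frac{mR^2}{I}\right)}\sin\alpha = -\frac{mg}{3}\sin\alpha$$

The frictional force is given by

$$F_f = \lambda_1\frac{\partial g_1}{\partial y} = \lambda_1 = -\frac{mg}{3}\sin\alpha$$

where the negative sign shows that this generalized force points up the plane, opposite to increasing $y$. Also

$$\ddot{y} = g\sin\alpha + \frac{\lambda_1}{m} = \frac{2}{3}g\sin\alpha$$

and the torque is

$$-\lambda_1 R = I\ddot{\theta}$$

which equals the friction torque of equation (c), where $F_f$ denoted the magnitude of the frictional force.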

d) Lagrange equation using a generalized force

Again the conservative gravitational force is absorbed into the scalar potential while the holonomic constraints are taken into account using generalized forces. Ignoring the trivial $x$ dependence, the Lagrangian was given above to be

$$L = \frac{1}{2}m\dot{y}^2 + \frac{1}{2}I\dot{\theta}^2 - mg(l - y)\sin\alpha$$

The generalized forces (6.30) are

$$Q_y = -F_f$$
$$Q_\theta = F_f R$$

The Euler-Lagrange equations are:
The $\Lambda_y L = Q_y$ Lagrange equation for the $y$ coordinate

$$m\ddot{y} - mg\sin\alpha = Q_y = -F_f$$

The $\Lambda_\theta L = Q_\theta$ Lagrange equation for the $\theta$ coordinate

$$I\ddot{\theta} = Q_\theta = F_f R$$

The constraint equation gives that $y = R\theta$, and assuming $I = \frac{1}{2}mR^2$ leads to the $\theta$ relation

$$\frac{I\ddot{\theta}}{R} = F_f = \frac{m}{2}\ddot{y}$$

Substituting this equation into the $y$ relation gives

$$m\ddot{y} - mg\sin\alpha = Q_y = -F_f = -\frac{m}{2}\ddot{y}$$

Thus

$$\ddot{y} = \frac{2}{3}g\sin\alpha$$

and

$$Q_y = -\frac{mg}{3}\sin\alpha$$

The four methods for handling the equations of constraint are all equivalent and result in the same equations of motion. The scalar Lagrangian mechanics is able to calculate the vector forces acting in a direct and simple way. Newton’s law approach is more intuitive for this simple case, and the ease and power of the Lagrangian approach are not apparent for this simple system.
The following series of examples will gradually increase in complexity, and will illustrate the power, elegance, and superiority of the Lagrangian approach compared with the Newtonian approach.
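The agreement between the four approaches in the rolling-disk example can be spot-checked numerically. The sketch below evaluates the Lagrange multiplier result $\lambda_1 = -mg\sin\alpha/(1 + mR^2/I)$ for an arbitrary moment of inertia and then recovers the accelerations; the function name and the sample parameter values are illustrative assumptions, not part of the original example.

```python
import math


def rolling_disk(m, R, inertia, g, alpha):
    """Return (y_ddot, theta_ddot, lambda_1) for a body of mass m, radius R
    and moment of inertia `inertia` rolling without slipping down an
    incline of angle alpha (radians)."""
    # Lagrange multiplier from eliminating y_ddot and theta_ddot
    lam = -m * g * math.sin(alpha) / (1.0 + m * R ** 2 / inertia)
    # y equation: m*y_ddot - m*g*sin(alpha) = lambda_1
    y_ddot = g * math.sin(alpha) + lam / m
    # Rolling constraint: y_ddot = R*theta_ddot
    theta_ddot = y_ddot / R
    return y_ddot, theta_ddot, lam


if __name__ == "__main__":
    m, R, g, alpha = 2.0, 0.5, 9.81, 0.4
    y_dd, th_dd, lam = rolling_disk(m, R, 0.5 * m * R ** 2, g, alpha)
    print("y_ddot / (g sin alpha) =", y_dd / (g * math.sin(alpha)))
    print("lambda_1 / (m g sin alpha) =", lam / (m * g * math.sin(alpha)))
```

For a uniform disk the two printed ratios should be close to $2/3$ and $-1/3$, matching the results above. Passing a different moment of inertia, for example $I = mR^2$ for a thin hoop, shows how the same equations cover other rolling bodies.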

6.5 Example: Two connected masses on frictionless inclined planes

Consider the system shown in the figure. This is a problem that has five constraints that will be solved using the method of generalized coordinates. The obvious generalized coordinates are $x_1$ and $x_2$, which are perpendicular to the normal constraint forces on the inclined planes. Another holonomic constraint is that the length of the rope connecting the masses is assumed to be constant. Thus the equation of constraint is that

$$x_1 + x_2 - l = 0$$

Figure: Two connected masses on frictionless inclined planes.

The other four constraints ensure that the two masses slide directly down the inclined planes in the plane shown. This is assumed implicitly by using only the variables $x_1$ and $x_2$. Let us choose $x_1$ as the primary generalized coordinate, thus

$$x_2 = l - x_1$$
$$y_1 = x_1\sin\theta_1$$
$$y_2 = (l - x_1)\sin\theta_2$$

The conservative gravitational force is absorbed into the potential energy given by

$$U = -m_1 g x_1\sin\theta_1 - m_2 g(l - x_1)\sin\theta_2$$

Since $\dot{x}_1 = -\dot{x}_2$ the kinetic energy is given by

$$T = \frac{1}{2}m_1\dot{x}_1^2 + \frac{1}{2}m_2\dot{x}_2^2 = \frac{1}{2}(m_1 + m_2)\dot{x}_1^2$$

The Lagrangian then gives that

$$L = \frac{1}{2}(m_1 + m_2)\dot{x}_1^2 + m_1 g x_1\sin\theta_1 + m_2 g(l - x_1)\sin\theta_2$$

Therefore

$$\frac{\partial L}{\partial \dot{x}_1} = (m_1 + m_2)\dot{x}_1$$
$$\frac{\partial L}{\partial x_1} = g(m_1\sin\theta_1 - m_2\sin\theta_2)$$

Thus

$$\Lambda_{x_1} L = \frac{d}{dt}\frac{\partial L}{\partial \dot{x}_1} - \frac{\partial L}{\partial x_1} = 0 = (m_1 + m_2)\ddot{x}_1 - g(m_1\sin\theta_1 - m_2\sin\theta_2)$$

Note that the system acts as though the inertial mass is $(m_1 + m_2)$ while the driving force comes from the difference of the forces. The acceleration is zero if

$$m_1\sin\theta_1 = m_2\sin\theta_2$$

A special case of this is the Atwood’s machine with a massless pulley shown in the adjacent figure. For this case $\theta_1 = \theta_2 = 90^\circ$. Thus

$$(m_1 + m_2)\ddot{x}_1 = g(m_1 - m_2)$$

Figure: Atwood’s machine.

Note that this problem has been solved without any reference to the force in the rope or the normal constraint forces on the inclined planes.
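The single equation of motion obtained above is easy to evaluate directly. The following sketch assumes a hypothetical helper `incline_acceleration` and sample masses; it simply evaluates $\ddot{x}_1 = g(m_1\sin\theta_1 - m_2\sin\theta_2)/(m_1 + m_2)$ and checks the Atwood’s machine and equilibrium special cases.

```python
import math


def incline_acceleration(m1, m2, theta1, theta2, g=9.81):
    """Acceleration of m1 down its incline for two masses joined by a rope
    of fixed length over frictionless inclines (angles in radians)."""
    return g * (m1 * math.sin(theta1) - m2 * math.sin(theta2)) / (m1 + m2)


if __name__ == "__main__":
    # Atwood's machine: both "inclines" are vertical
    print(incline_acceleration(3.0, 1.0, math.pi / 2, math.pi / 2))
    # Balanced case: m1*sin(theta1) equals m2*sin(theta2)
    print(incline_acceleration(1.0, 0.5, math.pi / 6, math.pi / 2))
```

The default value of `g` is an assumed constant and can be overridden by the caller.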

6.6 Example: Two blocks connected by a frictionless bar

Two identical masses $m$ are connected by a massless rigid bar of length $l$, and they are constrained to move in two frictionless slides, one vertical and the other horizontal, as shown in the adjacent figure. Assume that the conservative gravitational force acts along the negative $y$ axis and is incorporated into the scalar potential $U$. The generalized coordinate can be chosen to be the angle $\theta$ corresponding to a single degree of freedom. The relative cartesian coordinates of the blocks are given by

$$x = l\cos\theta$$
$$y = l\sin\theta$$

Thus

$$\dot{x} = -l(\sin\theta)\dot{\theta}$$
$$\dot{y} = l(\cos\theta)\dot{\theta}$$

Figure: Two frictionless masses that are connected by a bar and are constrained to slide in vertical and horizontal channels.

This constraint, which is absorbed into the generalized coordinate, is holonomic, scleronomic, and conservative.
The kinetic energy is given by

$$T = \frac{1}{2}m\left(l^2(\sin\theta)^2\dot{\theta}^2 + l^2(\cos\theta)^2\dot{\theta}^2\right) = \frac{1}{2}ml^2\dot{\theta}^2$$

The gravitational potential energy is given by

$$U = mgy = mgl\sin\theta$$

Thus the Lagrangian is

$$L = \frac{1}{2}ml^2\dot{\theta}^2 - mgl\sin\theta$$

Using the Lagrange operator equation $\Lambda_\theta L = 0$ gives

$$ml^2\ddot{\theta} + mgl\cos\theta = 0$$
$$\ddot{\theta} + \frac{g}{l}\cos\theta = 0$$

Multiplying by $\dot{\theta}$ yields

$$\ddot{\theta}\dot{\theta} + \frac{g}{l}\dot{\theta}\cos\theta = 0$$

This can be integrated to give

$$\frac{1}{2}\dot{\theta}^2 + \frac{g}{l}\sin\theta = E$$

where $E$ is a constant. That is

$$\dot{\theta} = \sqrt{2\left(E - \frac{g}{l}\sin\theta\right)}$$

Separation of the variables gives

$$dt = \frac{d\theta}{\sqrt{2\left(E - \frac{g}{l}\sin\theta\right)}}$$

Integration of this gives

$$t - t_0 = \int_{\theta_0}^{\theta}\frac{d\theta}{\sqrt{2\left(E - \frac{g}{l}\sin\theta\right)}}$$

The constants $E$ and $\theta_0$ are determined from the given initial conditions.
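The first integral $E = \frac{1}{2}\dot{\theta}^2 + \frac{g}{l}\sin\theta$ can be checked numerically. The sketch below integrates $\ddot{\theta} = -\frac{g}{l}\cos\theta$ with a hand-written Runge-Kutta step and compares $E$ before and after; the function names, step count, and initial conditions are assumptions made for illustration.

```python
import math


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate_bar(theta0, theta_dot0, g_over_l, t_end, steps=3000):
    """Integrate theta'' = -(g/l) cos(theta); return (theta, theta_dot)."""
    def rhs(state):
        theta, theta_dot = state
        return (theta_dot, -g_over_l * math.cos(theta))

    state = (theta0, theta_dot0)
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(rhs, state, dt)
    return state


def energy_constant(theta, theta_dot, g_over_l):
    """The constant E = theta_dot^2 / 2 + (g/l) sin(theta)."""
    return 0.5 * theta_dot ** 2 + g_over_l * math.sin(theta)
```

A call such as `simulate_bar(1.0, 0.0, 9.81, 0.3)` followed by `energy_constant` on the result should reproduce the initial value of $E$ to within the integration error.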

6.7 Example: Block sliding on a movable frictionless inclined plane

Consider a block of mass $m$ free to slide on a smooth frictionless inclined plane of mass $M$ that is free to slide horizontally as shown in the adjacent figure. The six degrees of freedom can be reduced to two independent generalized coordinates since the inclined plane and mass $m$ are confined to slide along specific non-orthogonal directions. Choose $x$ as the coordinate for movement of the inclined plane in the horizontal $\hat{x}$ direction and $x'$ as the position of the block with respect to the surface of the inclined plane in the $\hat{e}$ direction, which is inclined downward at an angle $\theta$. Thus the velocity of the inclined plane is

$$\mathbf{V} = \hat{x}\dot{x}$$

while the velocity of the small block on the inclined plane is

$$\mathbf{v} = \hat{x}\dot{x} + \hat{e}\dot{x}'$$

Figure: A block sliding on a frictionless movable inclined plane.

The kinetic energy is given by

$$T = \frac{1}{2}M\mathbf{V}\cdot\mathbf{V} + \frac{1}{2}m\mathbf{v}\cdot\mathbf{v} = \frac{1}{2}M\dot{x}^2 + \frac{1}{2}m\left[\dot{x}^2 + \dot{x}'^2 + 2\dot{x}\dot{x}'\cos\theta\right]$$

The conservative gravitational force is absorbed into the scalar potential energy, which depends only on the vertical position of the block and is taken to be zero at the top of the wedge.

$$U = -mgx'\sin\theta$$

Thus the Lagrangian is

$$L = \frac{1}{2}M\dot{x}^2 + \frac{1}{2}m\left[\dot{x}^2 + \dot{x}'^2 + 2\dot{x}\dot{x}'\cos\theta\right] + mgx'\sin\theta$$

Consider the Lagrange-Euler equation for the $x$ coordinate, $\Lambda_x L = 0$, which gives

$$\frac{d}{dt}\left[m(\dot{x} + \dot{x}'\cos\theta) + M\dot{x}\right] = 0 \tag{a}$$

which states that $[m(\dot{x} + \dot{x}'\cos\theta) + M\dot{x}]$ is a constant of motion. This constant of motion is just the total linear momentum of the complete system in the $x$ direction. That is, conservation of the linear momentum is satisfied automatically by the Lagrangian approach. The Newtonian approach also predicts conservation of the linear momentum since there are no external horizontal forces.
Consider the Lagrange equation for the $x'$ coordinate, $\Lambda_{x'} L = 0$, which gives

$$\frac{d}{dt}\left[\dot{x}' + \dot{x}\cos\theta\right] = g\sin\theta \tag{b}$$

Performing the time derivatives in equations (a) and (b) gives

$$m\left[\ddot{x} + \ddot{x}'\cos\theta\right] + M\ddot{x} = 0$$
$$\ddot{x}' + \ddot{x}\cos\theta = g\sin\theta$$

Solving for $\ddot{x}$ and $\ddot{x}'$ gives

$$\ddot{x} = \frac{-g\sin\theta\cos\theta}{\frac{(m + M)}{m} - \cos^2\theta}$$

and

$$\ddot{x}' = \frac{g\sin\theta}{1 - \frac{m\cos^2\theta}{(m + M)}}$$

This example illustrates the flexibility of being able to use non-orthogonal displacement vectors to specify the scalar Lagrangian energy. Newtonian mechanics would require more thought to solve this problem.
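The pair of coupled equations for $\ddot{x}$ and $\ddot{x}'$ is a $2 \times 2$ linear system, so the closed-form solutions can be cross-checked by solving that system directly. This is a minimal sketch using Cramer’s rule; the function names and the default value of `g` are assumptions for illustration.

```python
import math


def wedge_accelerations(m, M, theta, g=9.81):
    """Solve (m + M)*x_dd + m*cos(theta)*xp_dd = 0 and
    cos(theta)*x_dd + xp_dd = g*sin(theta) by Cramer's rule."""
    c, s = math.cos(theta), math.sin(theta)
    a11, a12, b1 = m + M, m * c, 0.0
    a21, a22, b2 = c, 1.0, g * s
    det = a11 * a22 - a12 * a21
    x_dd = (b1 * a22 - a12 * b2) / det
    xp_dd = (a11 * b2 - a21 * b1) / det
    return x_dd, xp_dd


def wedge_closed_form(m, M, theta, g=9.81):
    """Closed-form accelerations quoted in the example."""
    c, s = math.cos(theta), math.sin(theta)
    x_dd = -g * s * c / ((m + M) / m - c ** 2)
    xp_dd = g * s / (1.0 - m * c ** 2 / (m + M))
    return x_dd, xp_dd
```

In the limit of a very heavy wedge the first acceleration tends to zero and the second tends to $g\sin\theta$, the familiar result for a fixed incline.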

6.8 Example: Sphere rolling without slipping down an inclined plane on a frictionless floor.

A sphere of mass $m$ and radius $R$ rolls, without slipping, down an inclined plane of mass $M$ sitting on a frictionless horizontal floor as shown in the adjacent figure. Let $x$ be the horizontal position of the inclined plane, $\xi$ the distance the sphere has rolled along the inclined surface, and $\theta$ the angle of the incline. The velocity of the rolling sphere has horizontal and vertical components of

$$v_x = \dot{x} + \dot{\xi}\cos\theta$$
$$v_y = -\dot{\xi}\sin\theta$$

Assume that the system starts from rest at $t = 0$ with $x = 0$, $\xi = 0$ and $y = h$, so that $\dot{x} = \dot{\xi} = 0$. Choose the independent coordinates $x$ and $\xi$ as generalized coordinates, plus the holonomic constraint that relates $\xi$ to the rotation angle of the sphere for rolling without slipping. Then the Lagrangian is

$$L = \frac{M}{2}\dot{x}^2 + \frac{m}{2}\left[\dot{x}^2 + \dot{\xi}^2 + 2\dot{x}\dot{\xi}\cos\theta\right] + \frac{m}{5}\dot{\xi}^2 - mg(h - \xi\sin\theta)$$

Figure: Solid sphere rolling without slipping on an inclined plane on a frictionless horizontal floor.

Lagrange’s equations $\Lambda_x L = 0$ and $\Lambda_\xi L = 0$ give

$$(m + M)\ddot{x} + m\ddot{\xi}\cos\theta = 0$$
$$\ddot{x}\cos\theta + \frac{7}{5}\ddot{\xi} - g\sin\theta = 0$$

Eliminating $\ddot{x}$ gives

$$\left(\frac{7}{5} - \frac{m\cos^2\theta}{m + M}\right)\ddot{\xi} = g\sin\theta$$

Integrating this equation, assuming the initial conditions, results in

$$\xi = \frac{5(m + M)g\sin\theta}{2\left[7(m + M) - 5m\cos^2\theta\right]}t^2$$

Thus

$$x = -\frac{m\xi\cos\theta}{m + M} = -\frac{5mg\sin(2\theta)}{4\left[7(m + M) - 5m\cos^2\theta\right]}t^2$$

Note that these equations predict conservation of linear momentum for the block plus sphere.

6.9 Example: Mass sliding on a rotating straight frictionless rod.

Consider a mass $m$ sliding on a frictionless rod that rotates about one end of the rod with an angular velocity $\omega = \dot{\theta}$. Choose $r$ and $\theta$ to be generalized coordinates. Then the kinetic energy is given by

$$T = \frac{1}{2}m\dot{r}^2 + \frac{1}{2}mr^2\dot{\theta}^2$$

and potential energy

$$U = 0$$

Figure: Mass sliding on a rotating straight frictionless rod.

The Lagrange equation for $\theta$ gives

$$\Lambda_\theta L = \frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}} - \frac{\partial L}{\partial \theta} = \frac{d}{dt}\left(mr^2\dot{\theta}\right) = 0$$

Thus the angular momentum is constant

$$mr^2\dot{\theta} = \text{constant} = p_\theta$$

The Lagrange equation for $r$ gives

$$\Lambda_r L = \frac{d}{dt}\frac{\partial L}{\partial \dot{r}} - \frac{\partial L}{\partial r} = m\ddot{r} - mr\dot{\theta}^2 = 0$$

The $\theta$ equation states that the angular momentum is conserved for this case, which is what we expect since there are no external torques acting on the system. The $r$ equation states that the centrifugal acceleration is $\ddot{r} = \omega^2 r$. These equations of motion were derived without reference to the forces between the rod and mass.

6.10 Example: Spherical pendulum

The spherical pendulum is a classic holonomic problem in mechanics that involves rotation plus oscillation where the pendulum is free to swing in any direction. This also applies to a particle constrained to slide in a smooth frictionless spherical bowl under gravity, such as a bar of soap in a wet hemispherical sink. Consider the equation of motion of the spherical pendulum of mass $m$ and length $b$ shown in the adjacent figure. The most convenient generalized coordinates are $r, \theta, \phi$ with origin at the fulcrum, since the length is constrained to be $r = b$. The kinetic energy is

$$T = \frac{1}{2}mb^2\dot{\theta}^2 + \frac{1}{2}mb^2\sin^2\theta\,\dot{\phi}^2$$

The potential energy is

$$U = -mgb\cos\theta$$

giving that

$$L = \frac{1}{2}mb^2\dot{\theta}^2 + \frac{1}{2}mb^2\sin^2\theta\,\dot{\phi}^2 + mgb\cos\theta$$

Figure: Spherical pendulum.

The Lagrange equation for $\theta$

$$\Lambda_\theta L = \frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}} - \frac{\partial L}{\partial \theta} = 0$$

gives

$$mb^2\ddot{\theta} = mb^2\dot{\phi}^2\sin\theta\cos\theta - mgb\sin\theta$$

The Lagrange equation for $\phi$

$$\Lambda_\phi L = \frac{d}{dt}\frac{\partial L}{\partial \dot{\phi}} - \frac{\partial L}{\partial \phi} = \frac{d}{dt}\left[mb^2\sin^2\theta\,\dot{\phi}\right] = 0$$

gives

$$mb^2\sin^2\theta\,\dot{\phi} = p_\phi = \text{constant}$$

This is just the angular momentum $p_\phi$ for the pendulum rotating in the $\phi$ direction. Automatically the Lagrange approach shows that the angular momentum $p_\phi$ is a conserved quantity. This is what is expected from Newton’s Laws of Motion since there are no external torques applied about this vertical axis.
The equation of motion for $\theta$ can be simplified to

$$\ddot{\theta} + \frac{g}{b}\sin\theta - \frac{p_\phi^2\cos\theta}{m^2b^4\sin^3\theta} = 0$$

There are many possible solutions depending on the initial conditions. The pendulum can just oscillate in the $\theta$ direction, or rotate in the $\phi$ direction, or some combination of these. Note that if $p_\phi$ is zero, then the equation reduces to the simple harmonic pendulum, while the other extreme is when $\ddot{\theta} = 0$ for which the motion is that of a conical pendulum that rotates at a constant angle $\theta_0$ to the vertical axis.
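The conical-pendulum limit can be made concrete. Setting $\ddot{\theta} = 0$ in the first form of the $\theta$ equation gives $\dot{\phi}^2 = g/(b\cos\theta_0)$, a step that is not written out above and is added here as a derived assumption. The sketch below computes the corresponding $p_\phi$ and confirms that it makes $\ddot{\theta}$ vanish in the simplified equation; the function names and sample values are illustrative.

```python
import math


def theta_ddot(theta, p_phi, m, b, g):
    """theta'' from the simplified spherical-pendulum equation."""
    return (-(g / b) * math.sin(theta)
            + p_phi ** 2 * math.cos(theta)
            / (m ** 2 * b ** 4 * math.sin(theta) ** 3))


def conical_p_phi(theta0, m, b, g):
    """Angular momentum p_phi for steady rotation at polar angle theta0,
    using phi_dot^2 = g / (b cos(theta0)), valid for 0 < theta0 < pi/2."""
    phi_dot = math.sqrt(g / (b * math.cos(theta0)))
    return m * b ** 2 * math.sin(theta0) ** 2 * phi_dot


if __name__ == "__main__":
    m, b, g, theta0 = 0.2, 1.5, 9.81, 0.6
    p = conical_p_phi(theta0, m, b, g)
    print("theta_ddot at the conical angle:", theta_ddot(theta0, p, m, b, g))
```

With $p_\phi = 0$ the same function returns $-\frac{g}{b}\sin\theta$, the plane pendulum limit mentioned in the text.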

6.11 Example: Spring plane pendulum

A mass $m$ is suspended by a spring with spring constant $k$ in the gravitational field. Besides the longitudinal spring vibration, the spring performs a plane pendulum motion in the vertical plane, as illustrated in the adjacent figure. Find the Lagrangian, the equations of motion, and the force in the spring.
The system is holonomic, conservative, and scleronomic. Introduce plane polar coordinates with radial length $r$ and polar angle $\theta$ as generalized coordinates. The generalized coordinates are related to the cartesian coordinates by

$$y = r\cos\theta$$
$$x = r\sin\theta$$

Therefore the velocities are given by

$$\dot{y} = \dot{r}\cos\theta - r\dot{\theta}\sin\theta$$
$$\dot{x} = \dot{r}\sin\theta + r\dot{\theta}\cos\theta$$

The kinetic energy is given by

$$T = \frac{1}{2}m\left(\dot{x}^2 + \dot{y}^2\right) = \frac{1}{2}m\left(\dot{r}^2 + r^2\dot{\theta}^2\right)$$

Figure: Spring pendulum having spring constant $k$ and oscillating in a vertical plane.

The gravitational plus spring potential energies both can be absorbed into the potential $U$.

$$U = -mgr\cos\theta + \frac{k}{2}(r - r_0)^2$$

where $r_0$ denotes the rest length of the spring. The Lagrangian thus equals

$$L = \frac{1}{2}m\left(\dot{r}^2 + r^2\dot{\theta}^2\right) + mgr\cos\theta - \frac{k}{2}(r - r_0)^2$$

For the polar angle $\theta$, the Lagrange equation $\Lambda_\theta L = 0$ gives

$$\frac{d}{dt}\left(mr^2\dot{\theta}\right) = -mgr\sin\theta$$

The angular momentum is $p_\theta = mr^2\dot{\theta}$, thus the equation of motion can be written as

$$\dot{p}_\theta = -mgr\sin\theta$$

Alternatively, evaluating $\frac{d}{dt}\left(mr^2\dot{\theta}\right)$ gives

$$mr^2\ddot{\theta} = -mgr\sin\theta - 2mr\dot{r}\dot{\theta}$$

The last term on the right-hand side is the Coriolis force caused by the time variation of the pendulum length.
For the radial distance $r$ the Lagrange equation $\Lambda_r L = 0$ gives

$$m\ddot{r} = mr\dot{\theta}^2 + mg\cos\theta - k(r - r_0)$$

This equation just equals the tension in the spring, i.e. $F = m\ddot{r}$. The first term on the right-hand side represents the centrifugal radial acceleration, the second term is the component of the gravitational force, and the third term represents Hooke’s Law for the spring. For small amplitudes of $\theta$ the motion appears as a superposition of harmonic oscillations in the $r, \theta$ plane.
In this example the orthogonal coordinate approach used gave the tension in the spring, thus it is unnecessary to repeat this using the Lagrange multiplier approach.

6.12 Example: The yo-yo

Consider a yo-yo comprising a disc that has a string wrapped around it with one end attached to a fixed support. The disc is allowed to fall with the string unwinding as it falls, as illustrated in the adjacent figure. Derive the equations of motion and the forces of constraint via use of Lagrange multipliers. Use $y$ and $\phi$ as independent generalized coordinates.
The kinetic energy of the falling yo-yo is given by

$$T = \frac{1}{2}m\dot{y}^2 + \frac{1}{2}I\dot{\phi}^2 = \frac{1}{2}m\dot{y}^2 + \frac{1}{4}ma^2\dot{\phi}^2$$

where $m$ is the mass of the disc, $a$ the radius, and $I = \frac{1}{2}ma^2$ is the moment of inertia of the disc about its central axis. The potential energy of the disc is

$$U = -mgy$$

Thus the Lagrangian is

$$L = \frac{1}{2}m\dot{y}^2 + \frac{1}{4}ma^2\dot{\phi}^2 + mgy$$

Figure: The yo-yo comprises a falling disc unrolling from a string attached to the disc at one end and a fixed support at the other end.

The one equation of constraint is holonomic

$$g(y, \phi) = y - a\phi = 0$$

The two Lagrange equations are

$$\frac{\partial L}{\partial y} - \frac{d}{dt}\frac{\partial L}{\partial \dot{y}} + \lambda\frac{\partial g}{\partial y} = 0$$
$$\frac{\partial L}{\partial \phi} - \frac{d}{dt}\frac{\partial L}{\partial \dot{\phi}} + \lambda\frac{\partial g}{\partial \phi} = 0$$

with only one Lagrange multiplier $\lambda$. Evaluating these two Euler-Lagrange equations leads to two equations of motion

$$mg - m\ddot{y} + \lambda = 0$$
$$-\frac{1}{2}ma^2\ddot{\phi} - \lambda a = 0$$

Differentiating the equation of constraint gives

$$\ddot{\phi} = \frac{\ddot{y}}{a}$$

Inserting this into the second equation and solving the two equations gives

$$\lambda = -\frac{1}{3}mg$$

Inserting $\lambda$ into the two equations of motion gives

$$\ddot{y} = \frac{2}{3}g$$
$$\ddot{\phi} = \frac{2}{3}\frac{g}{a}$$

The generalized force of constraint is

$$F_y = \lambda\frac{\partial g}{\partial y} = -\frac{1}{3}mg$$

and the constraint torque is

$$N_\phi = \lambda\frac{\partial g}{\partial \phi} = \frac{1}{3}mga$$

Thus the string reduces the acceleration of the disc in the gravitational field by one third, to $\frac{2}{3}g$.
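The yo-yo algebra can be verified with exact rational arithmetic. This sketch eliminates $\ddot{\phi}$ and $\lambda$ for a general moment of inertia and then specializes to the uniform disc; the function name and the unit choices in the usage lines are assumptions for illustration.

```python
from fractions import Fraction


def yoyo_solution(m, a, g, inertia):
    """Return (y_ddot, phi_ddot, lam) for a yo-yo of mass m, radius a and
    moment of inertia `inertia`, using the constraint y = a*phi."""
    # From -I*phi_ddot - lam*a = 0 with phi_ddot = y_ddot/a:
    #   lam = -I*y_ddot/a**2
    # Substituting into m*g - m*y_ddot + lam = 0 gives:
    y_ddot = m * g / (m + inertia / a ** 2)
    phi_ddot = y_ddot / a
    lam = -inertia * y_ddot / a ** 2
    return y_ddot, phi_ddot, lam


if __name__ == "__main__":
    m, a, g = Fraction(1), Fraction(2), Fraction(1)  # units with m = g = 1
    y_dd, phi_dd, lam = yoyo_solution(m, a, g, m * a ** 2 / 2)
    print(y_dd, phi_dd, lam)
```

Using `Fraction` keeps the factors of one third and two thirds exact, so the comparison with the analytic result does not depend on floating-point tolerances.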

6.13 Example: Mass constrained to move on the inside of a frictionless paraboloid

A mass $m$ moves on the frictionless inner surface of a paraboloid

$$x^2 + y^2 = \rho^2 = az$$

with a gravitational potential energy of $U = mgz$.

This system is holonomic, scleronomic, and conservative. Choose cylindrical coordinates $\rho, \phi, z$ with respect to the vertical axis of the paraboloid to be the generalized coordinates.
The Lagrangian is

$$L = \frac{1}{2}m\left(\dot{\rho}^2 + \rho^2\dot{\phi}^2 + \dot{z}^2\right) - mgz$$

The equation of constraint is

$$g(\rho, z) = \rho^2 - az = 0$$

Figure: Mass constrained to slide on the inside of a frictionless paraboloid.

The Lagrange multiplier approach will be used to determine the forces of constraint.
For the $\rho$ coordinate, $\Lambda_\rho L = \lambda_1\frac{\partial g}{\partial \rho}$ gives

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\rho}} - \frac{\partial L}{\partial \rho} = \lambda_1 2\rho$$
$$m\left(\ddot{\rho} - \rho\dot{\phi}^2\right) = \lambda_1 2\rho \tag{a}$$

For the $\phi$ coordinate, $\Lambda_\phi L = 0$ gives

$$\frac{d}{dt}\left(m\rho^2\dot{\phi}\right) = \dot{p}_\phi = 0 \tag{b}$$

Thus the angular momentum $p_\phi$ is conserved, that is, it is a constant of motion.
For the $z$ coordinate, $\Lambda_z L = \lambda_1\frac{\partial g}{\partial z}$ gives

$$m\ddot{z} = -mg - \lambda_1 a \tag{c}$$

and the time differential of the constraint equation is

$$2\rho\dot{\rho} - a\dot{z} = 0 \tag{d}$$

The above four equations of motion can be used to determine $\rho, \phi, z, \lambda_1$.
The radius of the circle at the intersection of the plane $z = h$ with the paraboloid $\rho^2 = az$ is given by $\rho_0 = \sqrt{ah}$. For a constant height $z = h$, then $\ddot{z} = 0$ and equation (c) reduces to

$$\lambda_1 = -\frac{mg}{a}$$

Therefore the constraint force $F_\rho$ is given by

$$F_\rho = \lambda_1\frac{\partial g(\rho, z)}{\partial \rho} = -\frac{mg}{a}2\rho$$

Assuming that $\ddot{\rho} = 0$ then equation (a) for $\dot{\phi} = \omega$ and $\rho = \rho_0$ gives

$$m\left(0 - \rho_0\omega^2\right) = \lambda_1 2\rho_0 = -\frac{mg}{a}2\rho_0 = F_\rho$$

That is, the constraint force equals

$$F_\rho = -m\rho_0\omega^2$$

which is the usual centripetal force. These relations also give that the initial angular velocity required for such a stable trajectory with height $h$ is

$$\dot{\phi} = \omega = \sqrt{\frac{2g}{a}}$$
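The circular-orbit result for the paraboloid, $\omega = \sqrt{2g/a}$ independent of the height $h$, can be checked by substituting it back into equations (a) and (c). The sketch below returns the residuals of both equations; the function name and parameter values are assumptions for illustration.

```python
import math


def paraboloid_orbit_residuals(m, a, g, h):
    """Residuals of equations (a) and (c) for a circular orbit at height h
    on the paraboloid rho^2 = a*z, with rho_ddot = z_ddot = 0."""
    rho0 = math.sqrt(a * h)          # orbit radius at height h
    omega = math.sqrt(2.0 * g / a)   # angular velocity from the text
    lam = -m * g / a                 # Lagrange multiplier from equation (c)
    residual_a = m * (0.0 - rho0 * omega ** 2) - lam * 2.0 * rho0
    residual_c = -m * g - lam * a
    return residual_a, residual_c


if __name__ == "__main__":
    for height in (0.1, 1.0, 5.0):
        print(height, paraboloid_orbit_residuals(0.3, 2.0, 9.81, height))
```

Both residuals should be zero to rounding error for every height, which illustrates that the orbital angular velocity on this surface does not depend on $h$.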

6.14 Example: Mass on a frictionless plane connected to a plane pendulum

Two masses $m_1$ and $m_2$ are connected by a string of length $l$. Mass $m_1$ is on a horizontal frictionless table and it is assumed that mass $m_2$ moves in a vertical plane. This is another problem involving holonomic constrained motion. The constraints are:
1) $m_1$ moves in the horizontal plane
2) $m_2$ moves in the vertical plane
3) $r + s = l$. Therefore $\dot{s} = -\dot{r}$

Figure: Mass $m_2$ hanging from a rope that is connected to $m_1$, which slides on a frictionless plane.

There are $6 - 3 = 3$ remaining degrees of freedom after taking the constraints into account. Choose as a set of generalized coordinates $r$, $\theta$ and $\phi$. In terms of these three generalized coordinates, the kinetic energy is

$$T = \frac{1}{2}m_1\left(\dot{s}^2 + s^2\dot{\phi}^2\right) + \frac{1}{2}m_2\left(\dot{r}^2 + r^2\dot{\theta}^2\right) = \frac{1}{2}m_1\left(\dot{r}^2 + (l - r)^2\dot{\phi}^2\right) + \frac{1}{2}m_2\left(\dot{r}^2 + r^2\dot{\theta}^2\right)$$

The potential energy in terms of the generalized coordinates, relative to the horizontal plane, is

$$U = 0 - m_2 gr\cos\theta$$

Therefore the Lagrangian equals

$$L = \frac{1}{2}m_1\left(\dot{r}^2 + (l - r)^2\dot{\phi}^2\right) + \frac{1}{2}m_2\left(\dot{r}^2 + r^2\dot{\theta}^2\right) + m_2 gr\cos\theta$$

The differentials are

$$\frac{\partial L}{\partial r} = -m_1(l - r)\dot{\phi}^2 + m_2 r\dot{\theta}^2 + m_2 g\cos\theta$$
$$\frac{\partial L}{\partial \dot{r}} = (m_1 + m_2)\dot{r}$$
$$\frac{\partial L}{\partial \theta} = -m_2 gr\sin\theta$$
$$\frac{\partial L}{\partial \dot{\theta}} = m_2 r^2\dot{\theta}$$
$$\frac{\partial L}{\partial \phi} = 0$$
$$\frac{\partial L}{\partial \dot{\phi}} = m_1(l - r)^2\dot{\phi}$$

Thus the three Lagrange equations are

$$\Lambda_r L = (m_1 + m_2)\ddot{r} + m_1(l - r)\dot{\phi}^2 - m_2 r\dot{\theta}^2 - m_2 g\cos\theta = 0$$
$$\Lambda_\theta L = \frac{d}{dt}\left[m_2 r^2\dot{\theta}\right] + m_2 gr\sin\theta = 0$$

that is

$$2m_2 r\dot{r}\dot{\theta} + m_2 r^2\ddot{\theta} + m_2 gr\sin\theta = 0$$
$$\Lambda_\phi L = \frac{d}{dt}\left[m_1(l - r)^2\dot{\phi}\right] = 0$$

This last equation is a statement of the conservation of angular momentum. These three differential equations of motion can be solved for known initial conditions.
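As a sketch of how these three coupled equations might be solved for known initial conditions, the code below integrates them with a hand-written Runge-Kutta step and monitors the two conserved quantities, the angular momentum $m_1(l - r)^2\dot{\phi}$ and the total energy $T + U$. The helper names, parameter values, and initial state are assumptions chosen so that $0 < r < l$ throughout the short run.

```python
import math


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def make_rhs(m1, m2, l, g):
    """Build the right-hand side from the three Lagrange equations."""
    def rhs(state):
        r, theta, phi, r_dot, theta_dot, phi_dot = state
        s = l - r
        r_ddot = (-m1 * s * phi_dot ** 2 + m2 * r * theta_dot ** 2
                  + m2 * g * math.cos(theta)) / (m1 + m2)
        theta_ddot = (-g * math.sin(theta) - 2.0 * r_dot * theta_dot) / r
        phi_ddot = 2.0 * r_dot * phi_dot / s
        return [r_dot, theta_dot, phi_dot, r_ddot, theta_ddot, phi_ddot]
    return rhs


def invariants(state, m1, m2, l, g):
    """Return (p_phi, total energy) for the given state."""
    r, theta, phi, r_dot, theta_dot, phi_dot = state
    s = l - r
    p_phi = m1 * s ** 2 * phi_dot
    energy = (0.5 * m1 * (r_dot ** 2 + s ** 2 * phi_dot ** 2)
              + 0.5 * m2 * (r_dot ** 2 + r ** 2 * theta_dot ** 2)
              - m2 * g * r * math.cos(theta))
    return p_phi, energy


def simulate(state, m1, m2, l, g, t_end, steps):
    """Integrate the equations of motion and return the final state."""
    rhs = make_rhs(m1, m2, l, g)
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(rhs, state, dt)
    return state
```

Comparing `invariants` before and after a call to `simulate` gives a simple consistency check on both the derived equations and the integration step size.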

6.15 Example: Two connected masses constrained to slide along a moving rod
Consider two identical masses $m$ constrained to move along the axis of a thin straight rod, of mass $M$ and length $l$, which is free to both translate and rotate. Two identical springs link the two masses to the central point of the rod. Consider only motions of the system for which the extended lengths of the two springs are equal and opposite such that the two masses always are equal distances from the center of the rod, keeping the center of mass at the center of the rod. Find the equations of motion for this system.

Figure: Two identical masses $m$ constrained to slide on a moving rod of mass $M$. The masses are attached to the center of the rod by identical springs each having a spring constant $k$.

Use a fixed cartesian coordinate system $(x, y, z)$ and a moving frame with the origin $O$ at the center of the rod with its cartesian coordinates $(x_1, y_1, z_1)$ being parallel to the fixed coordinate frame as shown in the figure. Let $(r, \theta, \phi)$ be the spherical coordinates of a point referring to the center of the moving $(x_1, y_1, z_1)$ frame as shown in the figure. Then the two masses $m$ have spherical coordinates $(r, \theta, \phi)$ and $(-r, \theta, \phi)$ in the moving-rod fixed frame. The frictionless constraints are holonomic.
The kinetic energy of the system is equal to the kinetic energy for all the mass concentrated at the center of mass plus the kinetic energy about the center of mass. Since $O$ is the center of mass then the kinetic energy can be separated into three terms

$$T = T_{CM} + T^{rot}_{m} + T^{rot}_{M}$$

Note that since the kinetic energy is a scalar quantity it is rotationally invariant and thus can be evaluated in any rotated frame. Thus the kinetic energy of the center of mass is

$$T_{CM} = \frac{1}{2}(M + 2m)(\dot{x}^2 + \dot{y}^2 + \dot{z}^2)$$

The rotational kinetic energy of the two masses in the center of mass frame is

$$T^{rot}_{m} = m(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\dot{\phi}^2\sin^2\theta)$$

The rotational kinetic energy of the rod $T^{rot}_{M}$ is a scalar and thus can be evaluated in any rotated frame of reference fixed with respect to the principal axis system of the rod. The angular velocity of the rod about $O$ resolved along its principal axes is given by

$$\boldsymbol{\omega} = \dot{\phi}\cos\theta\,\hat{e}_r - \dot{\phi}\sin\theta\,\hat{e}_\theta - \dot{\theta}\,\hat{e}_\phi$$

The corresponding moments of inertia of the uniform infinitesimally-thin rod are $I_r = 0$, $I_\theta = \frac{1}{12}Ml^2$, $I_\phi = \frac{1}{12}Ml^2$. Hence the rotational kinetic energy of the rod is

$$T^{rot}_{M} = \frac{1}{2}\left(I_r\omega_r^2 + I_\theta\omega_\theta^2 + I_\phi\omega_\phi^2\right) = \frac{1}{24}Ml^2\left(\dot{\theta}^2 + \dot{\phi}^2\sin^2\theta\right)$$

The only potential energy is due to the two extended springs, which are assumed to have the same length $r$, where $r_0$ is the unstretched length.

$$U = 2\cdot\frac{1}{2}k(r - r_0)^2 = k(r - r_0)^2$$

Thus the Lagrangian is

$$L = \frac{1}{2}(M + 2m)(\dot{x}^2 + \dot{y}^2 + \dot{z}^2) + m(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\dot{\phi}^2\sin^2\theta) + \frac{1}{24}Ml^2\left(\dot{\theta}^2 + \dot{\phi}^2\sin^2\theta\right) - k(r - r_0)^2$$
Using Lagrange’s equations $\Lambda_j L = 0$ for the generalized coordinates gives

$$(M + 2m)\dot{x} = \text{constant} \qquad (\Lambda_x L = 0)$$
$$(M + 2m)\dot{y} = \text{constant} \qquad (\Lambda_y L = 0)$$
$$(M + 2m)\dot{z} = \text{constant} \qquad (\Lambda_z L = 0)$$
$$\left(2mr^2 + \frac{1}{12}Ml^2\right)\dot{\phi}\sin^2\theta = \text{constant} \qquad (\Lambda_\phi L = 0)$$
$$\ddot{r} - r\dot{\theta}^2 - r\dot{\phi}^2\sin^2\theta + \frac{k}{m}(r - r_0) = 0 \qquad (\Lambda_r L = 0)$$
$$\left(r^2 + \frac{Ml^2}{24m}\right)\ddot{\theta} + 2r\dot{r}\dot{\theta} - \left(r^2 + \frac{Ml^2}{24m}\right)\dot{\phi}^2\sin\theta\cos\theta = 0 \qquad (\Lambda_\theta L = 0)$$

The first three equations show that the three components of the linear momentum of the center of mass are constants of motion. The fourth equation shows that the component of the angular momentum about the $z_1$ axis is a constant of motion. Since the $z_1$ axis has been arbitrarily chosen then the total angular momentum must be conserved. The fifth and sixth equations give the radial and angular equations of motion of the oscillating masses $m$.

6.9 Applications involving non-holonomic constraints

In general, non-holonomic constraints can be handled by use of generalized forces $Q_j^{EXC}$ in the Lagrange-Euler equations (6.60). The following examples, 6.16 to 6.19, involve one-sided constraints which exhibit holonomic behavior for restricted ranges of the constraint surface in coordinate space, and this range is case specific. When the forces of constraint press the object against the constraint surface, then the system is holonomic, but the holonomic range of coordinate space is limited to situations where the constraint forces are positive. When the constraint force is negative, the object flies free from the constraint surface. In addition, when the frictional force $F_f > \mu_{static}N$, where $\mu_{static}$ is the static coefficient of friction, then the object slides, negating any rolling constraint that assumes static friction.

6.16 Example: Mass sliding on a frictionless spherical shell

Consider a mass $m$ that starts from rest at the top of a frictionless fixed spherical shell of radius $R$. The questions are: what is the force of constraint, and at what angle $\theta$ does the mass leave the surface of the spherical shell? The coordinates $r, \theta$ shown in the figure are the obvious generalized coordinates to use. The constraint will not apply if the force of constraint does not hold the mass against the surface of the spherical shell; that is, the system is only holonomic in a restricted domain.

Figure: Mass $m$ sliding on frictionless cylinder of radius $R$.

The Lagrangian is
$$L = \frac{1}{2}m\left(\dot r^2 + r^2\dot\theta^2\right) - mgr\cos\theta$$
This Lagrangian is applicable irrespective of whether the constraint is obeyed, where the constraint is given by
$$g(r, \theta) = r - R = 0$$
For the restricted domain where this system is holonomic, it can be solved using generalized coordinates, generalized forces, Lagrange multipliers, or Newtonian mechanics as illustrated below.
Minimal generalized coordinates:
The minimal number of generalized coordinates reduces the system to one coordinate $\theta$, which does not determine the constraint force that is needed to know if the constraint applies. Thus this approach is not useful for solving this partially-holonomic system.

Generalized forces:
The radial constraint has a corresponding generalized force $Q_r$. The Lagrange equation $\Lambda_r L = Q_r$ gives
$$m\ddot r + mg\cos\theta - mr\dot\theta^2 = Q_r \tag{a}$$
The Lagrange equation $\Lambda_\theta L = Q_\theta = 0$ since there is no tangential force for this frictionless system. Therefore
$$mr^2\ddot\theta - mgr\sin\theta + 2mr\dot r\dot\theta = 0 \tag{b}$$
When constrained to follow the surface of the spherical shell, the system is holonomic, i.e. $r = R$ and $\dot r = \ddot r = 0$. Thus the above two equations reduce to
$$mg\cos\theta - mR\dot\theta^2 = Q_r \tag{c}$$
$$mR^2\ddot\theta - mgR\sin\theta = 0$$
That is
$$\ddot\theta = \frac{g}{R}\sin\theta$$
Integrate to get $\dot\theta$ using the fact that
$$\ddot\theta = \frac{d\dot\theta}{dt} = \dot\theta\frac{d\dot\theta}{d\theta}$$
then
$$\int\ddot\theta\,d\theta = \int\dot\theta\,d\dot\theta = \frac{g}{R}\int\sin\theta\,d\theta$$
Therefore
$$\dot\theta^2 = \frac{2g}{R}(1 - \cos\theta) \tag{d}$$
assuming that $\dot\theta = 0$ at $\theta = 0$. Substituting equation (d) into equation (c) gives the constraint force, which is normal to the surface, to be
$$N = Q_r = mg(3\cos\theta - 2)$$
Note that $N = Q_r = 0$ when $\cos\theta = \frac{2}{3}$, that is, $\theta = 48.2^\circ$.
Lagrange multipliers:
For the holonomic regime, which obeys the constraint $g(r, \theta) = r - R = 0$, the Lagrange equation for $r$ is $\Lambda_r L = \lambda\frac{\partial g}{\partial r}$. Since $\frac{\partial g}{\partial r} = 1$ then
$$m\ddot r + mg\cos\theta - mr\dot\theta^2 = \lambda \tag{a}$$
The Lagrange equation for $\theta$ gives $\Lambda_\theta L = \lambda\frac{\partial g}{\partial\theta} = 0$ since $\frac{\partial g}{\partial\theta} = 0$. Thus
$$mr^2\ddot\theta - mgr\sin\theta + 2mr\dot r\dot\theta = 0 \tag{b}$$
As above, when constrained to follow the surface of the spherical shell, the system is holonomic with $r = R$ and $\dot r = \ddot r = 0$. Thus the above two equations reduce to
$$mg\cos\theta - mR\dot\theta^2 = \lambda \tag{c}$$
$$mR^2\ddot\theta - mgR\sin\theta = 0$$
That is, the answers are identical to those obtained using generalized forces, namely
$$\dot\theta^2 = \frac{2g}{R}(1 - \cos\theta) \tag{d}$$
assuming that $\dot\theta = 0$ at $\theta = 0$.
The force of constraint applied by the surface is
$$N = \lambda\frac{\partial g}{\partial r} = \lambda$$
Substituting equation (d) into equation (c) gives
$$N = \lambda = mg(3\cos\theta - 2)$$
Note that $N = 0$ when $\cos\theta = \frac{2}{3}$, that is, $\theta = 48.2^\circ$.

Both of the above methods give identical results and show that the force of constraint is negative when $\theta > 48.2^\circ$. Assuming that the shell cannot exert an attractive force that holds the mass against its surface, the mass will fly off the spherical shell when $\theta > 48.2^\circ$ and the system reduces to an unconstrained object falling freely in a uniform gravitational field, which is holonomic, that is, $N = \lambda = 0$. Then the equations of motion (a) and (b) reduce to
$$m\ddot r + mg\cos\theta - mr\dot\theta^2 = 0 \tag{e}$$
$$mr^2\ddot\theta - mgr\sin\theta + 2mr\dot r\dot\theta = 0 \tag{f}$$

Energy conservation:
This problem can be solved using energy conservation
$$\frac{1}{2}mv^2 = mgR[1 - \cos\theta]$$
Thus the centripetal acceleration is
$$\frac{v^2}{R} = 2g[1 - \cos\theta]$$
The normal force exerted by the surface vanishes when the centripetal acceleration equals the component of the gravitational acceleration along the radius, that is, when
$$\frac{v^2}{R} = 2g[1 - \cos\theta] = g\cos\theta$$
This occurs when $\cos\theta = \frac{2}{3}$. This is an unusual case where the Newtonian approach is the simplest.
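As a numerical cross-check of the result above, the following minimal Python sketch locates the zero of the constraint force $N = mg(3\cos\theta - 2)$ by bisection. The function names and the values of `m` and `g` are illustrative assumptions, not part of the original example; the departure angle itself is independent of both.

```python
import math

def normal_force(theta, m=1.0, g=9.81):
    """Constraint force N = m g (3 cos(theta) - 2) on the frictionless shell."""
    return m * g * (3.0 * math.cos(theta) - 2.0)

def departure_angle(tol=1e-12):
    """Bisect N(theta) = 0 on (0, pi/2): N > 0 at the top, N < 0 at pi/2."""
    lo, hi = 0.0, math.pi / 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if normal_force(mid) > 0.0:
            lo = mid   # still pressed against the surface
        else:
            hi = mid   # constraint force would have to be attractive
    return 0.5 * (lo + hi)

theta_off = departure_angle()
print("departure angle in degrees:", math.degrees(theta_off))
```

The bisection only uses the sign of $N$, which mirrors the physical statement that the constraint is one-sided: the holonomic regime ends where $N$ changes sign.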

6.17 Example: Rolling solid sphere on a spherical shell

This is a similar problem to the prior one with the added complication of rolling, which is assumed to occur in a vertical plane, making it holonomic. Here we would like to determine the forces of constraint to see when the solid sphere flies off the spherical shell and when the friction is insufficient to stop the rolling sphere from slipping.

Figure: Disk of mass $m$, radius $a$ rolling on a cylindrical surface of radius $R$.

The best generalized coordinates are the distance $r$ of the center of the sphere from the center of the spherical shell, together with the angles $\theta$ and $\phi$. It is important to note that $\phi$ is measured with respect to the vertical, not the time-dependent vector $\mathbf{r}$. That is, the direction of the radius $r$ is $\theta$, which is time dependent and thus is not a useful reference to use to define the angle $\phi$. Let us assume that the sphere is uniform with a moment of inertia of $I = \frac{2}{5}ma^2$. If the tangential frictional force $F_f$ is less than the limiting value $\mu N$, with $N > 0$, then the sphere will roll without slipping on the surface of the shell and both constraints apply. Under these conditions the system is holonomic and can be solved using Lagrange multipliers with the following equations of constraint:
1) The center of the sphere follows the surface of the shell
$$g_1 = r - R - a = 0$$
2) The sphere rolls without slipping
$$g_2 = a(\phi - \theta) - R\theta = 0$$
The kinetic energy is $T = \frac{1}{2}m\left(\dot r^2 + r^2\dot\theta^2\right) + \frac{1}{2}I\dot\phi^2$ and the potential energy is $U = mgr\cos\theta$. Thus the Lagrangian is
$$L = \frac{1}{2}m\left(\dot r^2 + r^2\dot\theta^2\right) + \frac{1}{2}I\dot\phi^2 - mgr\cos\theta$$
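Consider the solution using Lagrange multipliers for the holonomic regime where both constraints are satisfied. The constraints lead to the following differential constraint relations
$$\frac{\partial g_1}{\partial r} = 1 \qquad \frac{\partial g_1}{\partial\theta} = 0 \qquad \frac{\partial g_1}{\partial\phi} = 0$$
$$\frac{\partial g_2}{\partial r} = 0 \qquad \frac{\partial g_2}{\partial\phi} = a \qquad \frac{\partial g_2}{\partial\theta} = -(R + a)$$
The Lagrange operator equation $\Lambda_r L$ gives
$$\frac{d}{dt}\frac{\partial L}{\partial\dot r} - \frac{\partial L}{\partial r} = \lambda_1\frac{\partial g_1}{\partial r} + \lambda_2\frac{\partial g_2}{\partial r}$$
that is
$$m\ddot r + mg\cos\theta - mr\dot\theta^2 = \lambda_1 \tag{a}$$
$\Lambda_\theta L$ gives
$$mr^2\ddot\theta + 2mr\dot r\dot\theta - mgr\sin\theta = -\lambda_2(R + a) \tag{b}$$
$\Lambda_\phi L$ gives
$$I\ddot\phi = \lambda_2 a \tag{c}$$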
Since the center of the sphere rolling on the spherical shell must have
$$r = R + a$$
then
$$\dot r = \ddot r = 0$$
$$r\ddot\theta = a\ddot\phi$$
Substituting this into (c) gives
$$\frac{Ir}{a^2}\ddot\theta = \lambda_2$$
Inserting this into equation (b) gives
$$\lambda_2 = \frac{mg\sin\theta}{\left(1 + \frac{ma^2}{I}\right)}$$
The moment of inertia about the axis of a solid sphere is $I = \frac{2}{5}ma^2$. Then
$$\lambda_2 = \frac{2mg\sin\theta}{7}$$
But also
$$\ddot\theta = \dot\theta\frac{d\dot\theta}{d\theta} = \frac{a^2}{Ir}\lambda_2 = \frac{5}{2mr}\lambda_2 = \frac{5g\sin\theta}{7r}$$
Integrating gives
$$\int\dot\theta\,d\dot\theta = \frac{5g}{7r}\int\sin\theta\,d\theta$$
That is
$$\dot\theta^2 = \frac{10g}{7r}(1 - \cos\theta)$$
assuming that $\dot\theta = 0$ at $\theta = 0$. Inserting this into equation (a) gives
$$-\frac{10}{7}mg[1 - \cos\theta] + mg\cos\theta = \lambda_1$$
That is
$$\lambda_1 = \frac{mg}{7}[17\cos\theta - 10]$$
Note that this equals zero when
$$\cos\theta = \frac{10}{17}$$
For larger angles $\lambda_1$ is negative, implying that the solid sphere will fly off the surface of the spherical shell. The sphere will leave the surface of the shell when $\cos\theta = \frac{10}{17}$, that is, $\theta = 53.97^\circ$. This is a significantly larger angle than obtained for the similar problem where the mass is sliding on a frictionless shell, because the energy stored in rotation implies that the linear velocity of the mass is lower at a given angle $\theta$ for the case of a rolling sphere.
The above discussion has omitted an important fact: if $\mu < \infty$, the frictional force becomes insufficient to maintain the rolling constraint before $\theta = 53.97^\circ$, that is, the frictional force required for rolling will exceed the sliding limit $\mu N$. To determine when the rolling constraint fails it is necessary to determine the frictional torque
$$F_f a = -\lambda_2 a$$
Thus
$$F_f = -\lambda_2$$
It is in the negative direction because of the direction chosen for $\phi$. The required coefficient of friction $\mu$ is given by the ratio of the frictional force to the normal force, that is
$$\mu = \frac{\lambda_2}{\lambda_1} = \frac{2\sin\theta}{[17\cos\theta - 10]}$$
For $\mu = 1$ the sphere starts to slip when $\theta = 47.54^\circ$. Note that the sphere starts slipping before it flies off the shell, since a normal force is required to support a frictional force; the difference between the two angles depends on the coefficient of friction. The no-slipping constraint is not satisfied once the sphere starts slipping, and the frictional force should then equal $\mu\lambda_1$. Thus for angles beyond $47.54^\circ$ the problem needs to be solved with the rolling constraint replaced by a sliding non-conservative frictional force. This is best handled by including the frictional force and normal force as generalized forces. Fortunately this will be a small correction. The friction will slightly change the exact angle at which the normal force becomes zero and the system transitions to free motion of the sphere in a gravitational field.
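The two critical angles quoted in this example can be reproduced with a short Python sketch. It evaluates the fly-off condition $\cos\theta = 10/17$ directly and solves $\mu = 2\sin\theta/(17\cos\theta - 10)$ by bisection for an assumed coefficient of friction. The function names are illustrative assumptions, and the bisection relies on the required friction coefficient increasing monotonically from zero at the top of the shell.

```python
import math

def required_mu(theta):
    """Friction coefficient needed to sustain rolling: 2 sin(theta) / (17 cos(theta) - 10)."""
    return 2.0 * math.sin(theta) / (17.0 * math.cos(theta) - 10.0)

def fly_off_angle():
    """Angle at which the normal force lambda_1 = (m g / 7)(17 cos(theta) - 10) vanishes."""
    return math.acos(10.0 / 17.0)

def slip_angle(mu, tol=1e-12):
    """Bisect required_mu(theta) = mu between the top of the shell and the fly-off angle."""
    lo, hi = 0.0, fly_off_angle()
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if required_mu(mid) < mu:
            lo = mid   # static friction is still sufficient
        else:
            hi = mid   # rolling would need more friction than available
    return 0.5 * (lo + hi)

theta_fly = fly_off_angle()
theta_slip = slip_angle(mu=1.0)
print("fly-off angle (deg):", math.degrees(theta_fly))
print("slip angle for mu = 1 (deg):", math.degrees(theta_slip))
```

Calling `slip_angle` with other values of `mu` shows how the slipping angle approaches the fly-off angle as the coefficient of friction grows.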

6.18 Example: Solid sphere rolling plus slipping on a spherical shell

Consider the above case when the frictional force is insufficient to constrain the motion to rolling. Now the frictional force $F_f$ is given by
$$F_f = \mu N$$
when $N$ is positive.
This can be solved using generalized forces with the previous Lagrangian. Then
$$\frac{d}{dt}\frac{\partial L}{\partial\dot r} - \frac{\partial L}{\partial r} = Q_r = N$$
which gives
$$m\ddot r + mg\cos\theta - mr\dot\theta^2 = N$$
Similarly $\Lambda_\theta L = Q_\theta = -F_f(R + a)$ gives
$$mr^2\ddot\theta + 2mr\dot r\dot\theta - mgr\sin\theta = -F_f(R + a)$$
Similarly $\Lambda_\phi L = Q_\phi = F_f a$ gives
$$I\ddot\phi = F_f a$$
These can be solved by substituting the relation $F_f = \mu N$. The sphere flies off the spherical shell when $N \le 0$, leading to the free motion discussed in example 6.2. The problem of a solid uniform sphere rolling inside a hollow sphere can be solved the same way.

6.19 Example: Small body held by friction on the periphery of a rolling wheel
Assume that a small body of mass $m$ is balanced on a rolling wheel of mass $M$ and radius $R$ as shown in the figure. The wheel rolls in a vertical plane without slipping on a horizontal surface. This example illustrates that it is possible to use simultaneously a mixture of holonomic constraints, partially-holonomic constraints, and generalized forces.$^3$

Figure: Small body of mass $m$ held by friction on the periphery of a rolling wheel of mass $M$ and radius $R$.

Assume that at $t = 0$ the wheel touches the floor at $x = y = 0$ with the mass perched at the top of the wheel at $\theta = 0$. Let the frictional force acting on the mass $m$ be $F$ and the reaction force of the periphery of the wheel on the mass be $N$. Let $\dot\phi$ be the angular velocity of the wheel, and $\dot x$ the horizontal velocity of the center of the wheel. The polar coordinates $r, \theta$ of the mass $m$ are taken with $r$ measured from the center of the wheel and $\theta$ measured with respect to the vertical. Thus the cartesian coordinates of the small mass $m$ are $(x + r\sin\theta, R + r\cos\theta)$ with respect to the origin at $x = y = 0$.
The kinetic energy is given by
$$T = \frac{1}{2}M\dot x^2 + \frac{1}{2}I\dot\phi^2 + \frac{1}{2}m\left[\left(\dot x + r\dot\theta\cos\theta + \dot r\sin\theta\right)^2 + \left(\dot r\cos\theta - r\dot\theta\sin\theta\right)^2\right]$$
The gravitational force can be absorbed into the scalar potential term of the Lagrangian, which includes only the potential energy of the mass $m$ since the potential energy of the rolling wheel is constant.
$$U = +mg(R + r\cos\theta)$$
Thus the Lagrangian is
$$L = \frac{1}{2}(M + m)\dot x^2 + \frac{1}{2}I\dot\phi^2 + \frac{1}{2}m\left[r^2\dot\theta^2 + 2\dot x r\dot\theta\cos\theta + 2\dot x\dot r\sin\theta + \dot r^2\right] - mg(R + r\cos\theta)$$

The equations of constraint are:

1) The wheel rolls without slipping on the ground plane, leading to a holonomic constraint:
$$g_1 = x - R\phi = 0, \qquad \text{that is,} \qquad \dot x - R\dot\phi = 0$$
2) The mass $m$ is touching the periphery of the wheel, that is, the normal force $N > 0$. This is a one-sided restricted holonomic constraint.
$$g_2 = r - R = 0$$
3) The mass $m$ does not slip on the wheel if the frictional force $F < \mu N$. When this restricted holonomic constraint is satisfied, then
$$g_3 = \dot\theta - \dot\phi = 0$$

The rolling constraint is holonomic, and can be accounted for using one Lagrange multiplier $\lambda$ plus the differential constraint equations
$$\frac{\partial g_1}{\partial x} = 1 \qquad \frac{\partial g_1}{\partial\theta} = 0 \qquad \frac{\partial g_1}{\partial\phi} = -R \qquad \frac{\partial g_1}{\partial r} = 0$$

$^3$ This problem is solved in detail in example 3.19 of "Classical Mechanics and Relativity" by Muller-Kirsten [06].

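The other two constraints are non-holonomic, and thus these constraint forces are expressed in terms of two generalized forces $Q_\theta$ and $Q_r$ that are related to the tangential force $F$ and radial reaction force $N$. For simplicity, assume that the wheel is a thin-walled cylinder with a moment of inertia of
$$I = MR^2$$
The Euler-Lagrange equations for the four coordinates $x, \theta, \phi, r$ are
$$-\frac{d}{dt}\left[(M + m)\dot x + mr\dot\theta\cos\theta + m\dot r\sin\theta\right] + \lambda + Q_x = 0 \qquad (\Lambda_x)$$
$$-m\dot x r\dot\theta\sin\theta + m\dot x\dot r\cos\theta + mgr\sin\theta - \frac{d}{dt}\left(mr^2\dot\theta + m\dot x r\cos\theta\right) + Q_\theta = 0 \qquad (\Lambda_\theta)$$
$$-\frac{d}{dt}\left(MR^2\dot\phi\right) - \lambda R = 0 \qquad (\Lambda_\phi)$$
$$mr\dot\theta^2 + m\dot x\dot\theta\cos\theta - mg\cos\theta - \frac{d}{dt}\left(m\dot x\sin\theta + m\dot r\right) + Q_r = 0 \qquad (\Lambda_r)$$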

The generalized forces can be related to $F$ and $N$ using the definition
$$Q_j = \mathbf{F}(\mathbf{r})\cdot\frac{\partial\mathbf{r}}{\partial q_j}$$
where $\mathbf{F}(\mathbf{r})$ is the vectorial sum of the forces acting at $\mathbf{r}$. The components of the vector are $\mathbf{r} = (x + r\sin\theta, R + r\cos\theta)$, and $F$ and $N$ are in the directions defined in the figure, which leads to the generalized forces
$$Q_x = -F\cos\theta + N\sin\theta$$
$$Q_\theta = (-F\cos\theta + N\sin\theta)(R\cos\theta) - (F\sin\theta + N\cos\theta)R\sin\theta = -FR$$
$$Q_r = N$$
Solving the above 7 equations with $r = R$ gives
$$-m\ddot x\sin\theta + mR\dot\theta^2 - mg\cos\theta + N = 0$$
This last equation can be derived by Newtonian mechanics from consideration of the forces acting.
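The generalized forces quoted above can be checked numerically from the definition $Q_j = \mathbf{F}\cdot\partial\mathbf{r}/\partial q_j$. The Python sketch below is only an illustration: it assumes that $N$ acts along the outward radius and that $F$ acts tangentially, opposing increasing $\theta$, which is the force decomposition implied by the expression for $Q_x$. The function names and numerical values are hypothetical.

```python
import math

def position(x, theta, r, R):
    """Cartesian position of the small mass: (x + r sin(theta), R + r cos(theta))."""
    return (x + r * math.sin(theta), R + r * math.cos(theta))

def force(F, N, theta):
    """Assumed force on the mass: N along the outward radius, F opposing increasing theta."""
    return (-F * math.cos(theta) + N * math.sin(theta),
            F * math.sin(theta) + N * math.cos(theta))

def generalized_force(coord, x, theta, r, R, F, N, h=1e-6):
    """Q_j = F . dr/dq_j, with dr/dq_j taken from a central finite difference."""
    plus = {"x": x, "theta": theta, "r": r}
    minus = dict(plus)
    plus[coord] += h
    minus[coord] -= h
    xp, yp = position(plus["x"], plus["theta"], plus["r"], R)
    xm, ym = position(minus["x"], minus["theta"], minus["r"], R)
    fx, fy = force(F, N, theta)   # force evaluated at the unperturbed configuration
    return fx * (xp - xm) / (2 * h) + fy * (yp - ym) / (2 * h)

# Arbitrary illustrative values, with the mass on the rim (r = R)
x, theta, R, F, N = 0.3, 0.7, 2.0, 1.5, 4.0
Q_x = generalized_force("x", x, theta, R, R, F, N)
Q_theta = generalized_force("theta", x, theta, R, R, F, N)
Q_r = generalized_force("r", x, theta, R, R, F, N)
print(Q_x, Q_theta, Q_r)
```

Under these assumptions the three values should agree with $-F\cos\theta + N\sin\theta$, $-FR$, and $N$ respectively, to within finite-difference error.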
The above equations of motion can be used to calculate the motion for the following conditions.
a) Mass not slipping:
This occurs if $F \le \mu N$, which also implies that $N > 0$. That is a situation where the system is holonomic with $r = R$, $\dot\theta = \dot\phi$, $\dot x = R\dot\phi$, which can be solved using the generalized coordinate approach with only one independent coordinate, which can be taken to be $x$.
b) Mass slipping:
Here the no-slip constraint is violated and thus one has to explicitly include the generalized forces $Q_x$, $Q_\theta$, $Q_r$ and assume that sliding friction is given by $F = \mu N$.
c) Reaction force $N$ is negative:
Here the mass is not subject to any constraints and it is in free fall.

The above example illustrates the flexibility provided by Lagrangian mechanics that allows simultaneous use of Lagrange multipliers, generalized forces, and scalar potential to handle combinations of several holonomic and nonholonomic constraints for a complicated problem.

6.10 Velocity-dependent Lorentz force

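The Lorentz force in electromagnetism is unusual in that it is a velocity-dependent force, as well as being a conservative force that can be treated using the concept of potential. That is, the Lorentz force is
$$\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}) \tag{6.61}$$
It is interesting to use Maxwell's equations and Lagrangian mechanics to show that the Lorentz force can be represented by a conservative potential in Lagrangian mechanics.
Maxwell's equations can be written as
$$\begin{aligned}
\nabla\cdot\mathbf{E} &= \frac{\rho}{\epsilon_0}\\
\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} &= 0\\
\nabla\cdot\mathbf{B} &= 0\\
\nabla\times\mathbf{B} - \mu_0\epsilon_0\frac{\partial\mathbf{E}}{\partial t} &= \mu_0\mathbf{J}
\end{aligned} \tag{6.62}$$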

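Since $\nabla\cdot\mathbf{B} = 0$, it follows (see the vector calculus appendix) that $\mathbf{B}$ can be represented by the curl of a vector potential $\mathbf{A}$, that is
$$\mathbf{B} = \nabla\times\mathbf{A} \tag{6.63}$$
Substituting this into $\nabla\times\mathbf{E} + \frac{\partial\mathbf{B}}{\partial t} = 0$ gives that
$$\nabla\times\mathbf{E} + \frac{\partial(\nabla\times\mathbf{A})}{\partial t} = 0 \tag{6.64}$$
$$\nabla\times\left(\mathbf{E} + \frac{\partial\mathbf{A}}{\partial t}\right) = 0$$
Since this curl is zero, the bracketed term can be represented by the gradient of a scalar potential $\Phi$
$$\mathbf{E} + \frac{\partial\mathbf{A}}{\partial t} = -\nabla\Phi \tag{6.65}$$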
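The following shows that this relation corresponds to taking the gradient of a potential $U$ for the charge $q$, where the potential $U$ is given by the relation
$$U = q(\Phi - \mathbf{A}\cdot\mathbf{v}) \tag{6.66}$$
where $\Phi$ is the scalar electrostatic potential. This scalar potential $U$ can be employed in the Lagrange equations using the Lagrangian
$$L = \frac{1}{2}m\mathbf{v}\cdot\mathbf{v} - q(\Phi - \mathbf{A}\cdot\mathbf{v}) \tag{6.67}$$
The Lorentz force can be derived from this Lagrangian by considering the Lagrange equation for the cartesian coordinate $x$
$$\frac{d}{dt}\frac{\partial L}{\partial\dot x} - \frac{\partial L}{\partial x} = 0 \tag{6.68}$$
Using the above Lagrangian (6.67) gives
$$m\ddot x + q\left[\frac{\partial\Phi}{\partial x} + \frac{dA_x}{dt} - \frac{\partial\mathbf{A}}{\partial x}\cdot\mathbf{v}\right] = 0 \tag{6.69}$$
But
$$\frac{dA_x}{dt} = \frac{\partial A_x}{\partial t} + \frac{\partial A_x}{\partial x}\dot x + \frac{\partial A_x}{\partial y}\dot y + \frac{\partial A_x}{\partial z}\dot z \tag{6.70}$$
and
$$\frac{\partial\mathbf{A}}{\partial x}\cdot\mathbf{v} = \frac{\partial A_x}{\partial x}\dot x + \frac{\partial A_y}{\partial x}\dot y + \frac{\partial A_z}{\partial x}\dot z \tag{6.71}$$
Inserting equations 6.70 and 6.71 into 6.69 gives
$$F_x = m\ddot x = q\left[\left(-\frac{\partial\Phi}{\partial x} - \frac{\partial A_x}{\partial t}\right) + \left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right)\dot y - \left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right)\dot z\right] = q\left[\mathbf{E} + \mathbf{v}\times\mathbf{B}\right]_x \tag{6.72}$$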
Corresponding expressions can be obtained for $F_y$ and $F_z$. Thus the total force is the well-known Lorentz force
$$\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}) \tag{6.73}$$
This has demonstrated that the electromagnetic scalar potential
$$U = q(\Phi - \mathbf{A}\cdot\mathbf{v}) \tag{6.74}$$
satisfies Maxwell's equations, gives the Lorentz force, and can be absorbed into the Lagrangian. Note that the velocity-dependent Lorentz force is conservative since $\mathbf{E}$ is conservative, and because $(\mathbf{v}\times\mathbf{B})\cdot\mathbf{v} = 0$ the magnetic force does no work since it is perpendicular to the trajectory. The velocity-dependent conservative Lorentz force is an important and ubiquitous force that features prominently in many branches of science. It will be discussed further for the case of relativistic motion in example 17.6.
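The algebra leading from equation 6.69 to the Lorentz force can be spot-checked numerically. The Python sketch below evaluates the force obtained from the Lagrange equation, written for each cartesian axis, and compares it with $q(\mathbf{E} + \mathbf{v}\times\mathbf{B})$ using $\mathbf{E} = -\nabla\Phi - \partial\mathbf{A}/\partial t$ and $\mathbf{B} = \nabla\times\mathbf{A}$. The potentials `phi` and `A`, the step size `H`, and all function names are arbitrary illustrative assumptions; derivatives are taken by central finite differences.

```python
import math

H = 1e-5  # finite-difference step (an arbitrary illustrative choice)

def phi(r, t):
    """Illustrative scalar potential, chosen arbitrarily."""
    x, y, z = r
    return x * y + z * z + t * x

def A(r, t):
    """Illustrative vector potential, chosen arbitrarily."""
    x, y, z = r
    return (y * z * t, x * x + math.sin(z), x * y + t)

def d_dr(f, r, t, i):
    """Central-difference partial derivative of f with respect to coordinate i."""
    rp, rm = list(r), list(r)
    rp[i] += H
    rm[i] -= H
    fp, fm = f(rp, t), f(rm, t)
    if isinstance(fp, tuple):
        return tuple((a - b) / (2 * H) for a, b in zip(fp, fm))
    return (fp - fm) / (2 * H)

def dA_dt(r, t):
    """Central-difference partial time derivative of the vector potential."""
    return tuple((a - b) / (2 * H) for a, b in zip(A(r, t + H), A(r, t - H)))

def force_from_lagrangian(q, r, v, t):
    """F_i = -q [ dPhi/dx_i + dA_i/dt - (dA/dx_i) . v ], i.e. equation 6.69 per axis."""
    At = dA_dt(r, t)
    grads = [d_dr(A, r, t, j) for j in range(3)]  # grads[j][i] = dA_i/dx_j
    out = []
    for i in range(3):
        total_dAi = At[i] + sum(grads[j][i] * v[j] for j in range(3))
        dA_dxi_dot_v = sum(grads[i][k] * v[k] for k in range(3))
        out.append(-q * (d_dr(phi, r, t, i) + total_dAi - dA_dxi_dot_v))
    return out

def lorentz_force(q, r, v, t):
    """q (E + v x B) with E = -grad(Phi) - dA/dt and B = curl(A)."""
    At = dA_dt(r, t)
    grads = [d_dr(A, r, t, j) for j in range(3)]
    E = [-d_dr(phi, r, t, i) - At[i] for i in range(3)]
    B = [grads[1][2] - grads[2][1],
         grads[2][0] - grads[0][2],
         grads[0][1] - grads[1][0]]
    vxB = [v[1] * B[2] - v[2] * B[1],
           v[2] * B[0] - v[0] * B[2],
           v[0] * B[1] - v[1] * B[0]]
    return [q * (E[i] + vxB[i]) for i in range(3)]

q, r, v, t = 1.6, (0.3, -0.7, 1.1), (0.5, 0.2, -0.4), 0.9
F_lagrange = force_from_lagrangian(q, r, v, t)
F_lorentz = lorentz_force(q, r, v, t)
print(F_lagrange)
print(F_lorentz)
```

Both routes use the same numerical derivatives, so agreement between the two lists tests the rearrangement of terms in equations 6.70 to 6.72 rather than the accuracy of the finite differences.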

6.11 Time-dependent forces

All examples discussed in this chapter have assumed Lagrangians that are time independent. Mathematical systems where the ordinary differential equations do not depend explicitly on the independent variable, which in this case is time $t$, are called autonomous systems. Systems having differential equations governing the dynamical behavior that have time-dependent coefficients are called non-autonomous systems.
In principle it is trivial to incorporate time-dependent behavior into the equations of motion by introducing either a time-dependent generalized force $Q(\mathbf{r}, t)$, or allowing the Lagrangian to be time dependent. For example, in the rocket problem the mass is time dependent. In some cases the time-dependent forces can be represented by a time-dependent potential energy rather than using a generalized force. Solutions for non-autonomous systems can be considerably more difficult to obtain, and can involve regions where the motion is stable and other regions where the motion is unstable or chaotic, similar to the behavior discussed in chapter 4. The following case of a simple pendulum, whose support is undergoing vertical oscillatory motion, illustrates the complexities that can occur for systems involving time-dependent forces.

6.20 Example: Plane pendulum hanging from a vertically-oscillating support

Consider a plane pendulum having a mass $m$ fastened to a massless rigid rod of length $l$ that is at an angle $\theta(t)$ to the vertical gravitational field $g$. The pendulum is attached to a support that is subject to a vertical oscillatory force $F$ such that the vertical position $y$ of the support is
$$y = A\cos\omega t$$
The kinetic energy is
$$T = \frac{1}{2}m\left[\left(l\dot\theta\cos\theta\right)^2 + \left(\dot y + l\dot\theta\sin\theta\right)^2\right] = \frac{1}{2}m\left[l^2\dot\theta^2 + 2l\dot\theta\dot y\sin\theta + \dot y^2\right]$$
and the potential energy is
$$U = mg\left[l(1 - \cos\theta) + y\right]$$
Thus the Lagrangian is
$$L = \frac{1}{2}m\left[l^2\dot\theta^2 + 2l\dot\theta\dot y\sin\theta + \dot y^2\right] - mg\left[l(1 - \cos\theta) + y\right]$$
The Euler-Lagrange equations lead to equations of motion for $\theta$ and $y$
$$ml^2\ddot\theta + ml\ddot y\sin\theta + mgl\sin\theta = 0$$
$$ml\ddot\theta\sin\theta + ml\dot\theta^2\cos\theta + m\ddot y + mg = F$$

Assume the small-angle approximation where $\theta \to 0$; then these two equations reduce to
$$\ddot\theta + \left(\frac{g}{l} + \frac{\ddot y}{l}\right)\theta = 0$$
$$\ddot y + g = \frac{F}{m}$$
Substituting $\ddot y = -A\omega^2\cos\omega t$ into these equations gives
$$\ddot\theta + \left(\frac{g}{l} - \frac{A\omega^2}{l}\cos\omega t\right)\theta = 0$$
$$m\left(g - A\omega^2\cos\omega t\right) = F$$
These correspond to stable harmonic oscillations about $\theta \approx 0$ if the bracket term is positive, and to unstable motion if the bracket is negative. Thus, for small amplitude oscillation about $\theta \approx 0$, the motion of the system can be unstable whenever the bracket is negative, that is, when the acceleration $A\omega^2\cos\omega t > g$, and resonance behavior can occur coupling the pendulum period and the forcing frequency $\omega$.
This discussion also applies to the inverted pendulum, with a surprising result. It is well known that the pendulum is unstable near $\theta = \pi$. However, if the support is oscillating, then for $\theta \approx \pi$, with $\theta$ now denoting the small displacement from the inverted position, the equations of motion become
$$\ddot\theta - \left(\frac{g}{l} - \frac{A\omega^2}{l}\cos\omega t\right)\theta = 0$$
$$m\left(g - A\omega^2\cos\omega t\right) = F$$
The inverted pendulum has stable oscillations about $\theta \approx \pi$ if the bracket is negative, that is, if $A\omega^2\cos\omega t > g$. This illustrates that nonautonomous dynamical systems can involve either stable or unstable motion.
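The stabilization of the inverted pendulum can be explored numerically. The Python sketch below integrates the full nonlinear $\theta$ equation of motion, $l\ddot\theta + (g + \ddot y)\sin\theta = 0$ with $y = A\cos\omega t$, using a fourth-order Runge-Kutta step, and records the largest excursion from $\theta = \pi$. The parameter values, the step size, and the function name `simulate` are illustrative assumptions chosen so that the drive is fast and of small amplitude; they are not taken from the text.

```python
import math

def simulate(A, omega, l=1.0, g=9.81, theta0=math.pi - 0.05, t_end=2.0, dt=1e-4):
    """Integrate l*theta'' + (g + y'') sin(theta) = 0, y = A cos(omega t), by RK4.

    Returns the largest excursion |theta - pi| from the inverted position.
    """
    def accel(t, theta):
        y_ddot = -A * omega**2 * math.cos(omega * t)
        return -(g + y_ddot) / l * math.sin(theta)

    theta, w, t = theta0, 0.0, 0.0
    max_dev = abs(theta - math.pi)
    for _ in range(int(round(t_end / dt))):
        k1t, k1w = w, accel(t, theta)
        k2t, k2w = w + 0.5 * dt * k1w, accel(t + 0.5 * dt, theta + 0.5 * dt * k1t)
        k3t, k3w = w + 0.5 * dt * k2w, accel(t + 0.5 * dt, theta + 0.5 * dt * k2t)
        k4t, k4w = w + dt * k3w, accel(t + dt, theta + dt * k3t)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6.0
        w += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
        t += dt
        max_dev = max(max_dev, abs(theta - math.pi))
    return max_dev

dev_static = simulate(A=0.0, omega=0.0)      # fixed support: the pendulum falls over
dev_driven = simulate(A=0.1, omega=100.0)    # rapidly oscillating support
print("max excursion, fixed support:", dev_static)
print("max excursion, driven support:", dev_driven)
```

With a fixed support the pendulum released near the top falls away, whereas with the fast drive assumed here it should remain close to the inverted position over the simulated interval. Varying `A` and `omega` shows that this is not the case for every drive.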

6.12 Impulsive forces

Colliding bodies often involve large impulsive forces that act for a short time. As discussed in chapter 2.12.8, the treatment of impulsive forces or torques is greatly simplified if they act for a sufficiently short time that the displacement during the impact can be ignored, even though the instantaneous change in velocities may be large. The simplicity is achieved by taking the time integral of the Euler-Lagrange equations over the duration $\tau$ of the impulse and assuming $\tau \to 0$.
The impact of the impulse on a system can be handled two ways. The first approach is to use the Euler-Lagrange equation during the impulse to determine the equations of motion
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j} = Q_j^{EXC} \tag{6.75}$$
where the impulsive force is introduced using the generalized force $Q_j^{EXC}$. Knowing the initial conditions at time $t$, the conditions at the time $t + \tau$ are given by integration of equation 6.75 over the duration $\tau$ of the impulse, which gives
$$\int_t^{t+\tau}\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right)dt - \int_t^{t+\tau}\frac{\partial L}{\partial q_j}dt = \int_t^{t+\tau}Q_j^{EXC}dt \tag{6.76}$$
This integration determines the conditions at time $t + \tau$, which then are used as the initial conditions for the motion when the impulsive force $Q_j^{EXC}$ is zero.
The second approach is to realize that equation 6.76 can be rewritten in the form
$$\lim_{\tau\to 0}\int_t^{t+\tau}\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right)dt = \lim_{\tau\to 0}\left.\frac{\partial L}{\partial\dot q_j}\right|_t^{t+\tau} = \Delta p_j = \lim_{\tau\to 0}\int_t^{t+\tau}\left(Q_j^{EXC} + \frac{\partial L}{\partial q_j}\right)dt \tag{6.77}$$
Note that in the limit $\tau \to 0$ the integral of the time derivative of the generalized momentum $p_j = \frac{\partial L}{\partial\dot q_j}$ simplifies to give the change in generalized momentum $\Delta p_j$. In addition, assuming that the non-impulsive forces $\left(\frac{\partial L}{\partial q_j}\right)$ are finite and independent of the instantaneous impulsive force during the infinitesimal duration $\tau$, the contribution of the non-impulsive forces $\int_t^{t+\tau}\left(\frac{\partial L}{\partial q_j}\right)dt$ during the impulse can be neglected relative to the large impulsive force term $\lim_{\tau\to 0}\int_t^{t+\tau}Q_j^{EXC}dt$. Thus it can be assumed that
$$\Delta p_j = \lim_{\tau\to 0}\int_t^{t+\tau}Q_j^{EXC}dt = \tilde Q_j \tag{6.78}$$
where $\tilde Q_j$ is the generalized impulse associated with coordinate $j = 1, 2, 3, \ldots, n$. This generalized impulse can be derived from the time integral of the impulsive forces $\mathbf{P}_i$ given by equation 2.135 using the time integral of equation 6.77, that is
$$\Delta p_j = \tilde Q_j = \lim_{\tau\to 0}\int_t^{t+\tau}Q_j^{EXC}dt \equiv \lim_{\tau\to 0}\int_t^{t+\tau}\sum_i\mathbf{P}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j}dt = \sum_i\tilde{\mathbf{P}}_i\cdot\frac{\partial\mathbf{r}_i}{\partial q_j} \tag{6.79}$$
Note that the generalized impulse $\tilde Q_j$ can be a translational impulse $\tilde{\mathbf{P}}_j$ with corresponding translational variable $q_j$, or an angular impulsive torque $\tilde{\boldsymbol\tau}_j$ with corresponding angular variable $\phi_j$.
Impulsive force problems usually are solved in two stages. Either equations 6.76 or 6.79 are used to determine the conditions of the system immediately following the impulse. If $\tau \to 0$ then the impulse changes the generalized velocities $\dot q_j$ but not the generalized coordinates $q_j$. The subsequent motion then is determined using the Lagrangian equations of motion with the impulsive generalized force being zero, and assuming that the initial condition corresponds to the result of the impulse calculation.

6.21 Example: Series-coupled double pendulum subject to impulsive force

Consider a series-coupled double pendulum comprising two masses $m_1$ and $m_2$ connected by rigid massless rods of lengths $L_1$ and $L_2$ as shown in the figure. Initially the two pendula are at rest and hanging vertically when a horizontal impulse $\tilde P$ strikes the system at a distance $D$ below the upper fulcrum, where $L_1 < D < L_1 + L_2$. For this system the kinetic energies of the masses $m_1$ and $m_2$ are
$$T_1 = \frac{1}{2}m_1L_1^2\dot\theta_1^2$$
$$T_2 = \frac{1}{2}m_2\left[L_1^2\dot\theta_1^2 + 2L_1L_2\dot\theta_1\dot\theta_2\cos(\theta_1 - \theta_2) + L_2^2\dot\theta_2^2\right]$$

Figure: Two series-coupled plane pendula.

Note that the velocity of $m_2$ is the vector sum of the two velocities shown, separated by the angle $\theta_2 - \theta_1$. Thus the total kinetic energy is
$$T = \frac{1}{2}(m_1 + m_2)L_1^2\dot\theta_1^2 + m_2L_1L_2\dot\theta_1\dot\theta_2\cos(\theta_1 - \theta_2) + \frac{1}{2}m_2L_2^2\dot\theta_2^2$$
For small angles, $\cos(\theta_1 - \theta_2) \approx 1$ to lowest order, so that
$$T = \frac{1}{2}(m_1 + m_2)L_1^2\dot\theta_1^2 + m_2L_1L_2\dot\theta_1\dot\theta_2 + \frac{1}{2}m_2L_2^2\dot\theta_2^2$$
The total potential energy is
$$U = m_1gL_1(1 - \cos\theta_1) + m_2g\left[L_1(1 - \cos\theta_1) + L_2(1 - \cos\theta_2)\right]$$
$$= (m_1 + m_2)gL_1(1 - \cos\theta_1) + m_2gL_2(1 - \cos\theta_2)$$
Thus, assuming the small-angle approximation, the Lagrangian becomes
$$L = \frac{1}{2}(m_1 + m_2)L_1^2\dot\theta_1^2 + m_2L_1L_2\dot\theta_1\dot\theta_2 + \frac{1}{2}m_2L_2^2\dot\theta_2^2 - \left(\frac{1}{2}(m_1 + m_2)gL_1\theta_1^2 + \frac{1}{2}m_2gL_2\theta_2^2\right)$$

Use equation 679 to transform to the generalized coordinates 1 and 2 with the corresponding generalized
impulsive torques

̃1 = ̃ 1
̃2 = ̃ ( − 1 )

Since the system starts at rest where 1 = 2 = 0, then using equation 677 gives the change in angular
momentum immediately following the impulse to be
³ ´
1 21 ̇1 + 2 1 1 ̇1 + 2 ̇2 = ̃ 1
³ ´
2 2 1 ̇1 + 2 ̇2 = ̃ ( − 1 )

These two equations determine ̇1 and ̇2 immediately after the impulse; these can be used with 1 = 2 = 0
as initial conditions for solving the subsequent force-free motion when the generalized impulsive force is zero.
As described in example 145 the subsequent motion of this series coupled pendulum will be a superposition
of the two normal modes with amplitudes determined by the result of the impulse calculation.
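The two momentum-jump equations above form a linear system for the angular velocities immediately after the impulse. A minimal Python sketch of the solution by Cramer's rule follows; the function name and the numerical values of the masses, lengths, impulse, and strike distance are illustrative assumptions.

```python
def post_impulse_rates(m1, m2, L1, L2, P, D):
    """Solve the two momentum-jump equations for (theta1_dot, theta2_dot).

    (m1 + m2) L1^2 w1 + m2 L1 L2 w2 = P L1
     m2 L1 L2 w1      + m2 L2^2  w2 = P (D - L1)
    """
    a11, a12 = (m1 + m2) * L1**2, m2 * L1 * L2
    a21, a22 = m2 * L1 * L2, m2 * L2**2
    b1, b2 = P * L1, P * (D - L1)
    det = a11 * a22 - a12 * a21          # equals m1 m2 L1^2 L2^2, nonzero for physical inputs
    w1 = (b1 * a22 - a12 * b2) / det
    w2 = (a11 * b2 - a21 * b1) / det
    return w1, w2

# Illustrative values: equal masses and lengths, impulse striking the middle of the lower rod
w1, w2 = post_impulse_rates(m1=1.0, m2=1.0, L1=1.0, L2=1.0, P=1.0, D=1.5)
print("theta1_dot, theta2_dot just after the impulse:", w1, w2)
```

The returned pair, together with $\theta_1 = \theta_2 = 0$, supplies the initial conditions for the subsequent free motion.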

6.13 The Lagrangian versus the Newtonian approach to classical mechanics
It is useful to contrast the differences, and relative advantages, of the Newtonian and Lagrangian formulations of classical mechanics. The Newtonian force-momentum formulation is vectorial in nature and has cause and effect embedded in it. The Lagrangian approach is cast in terms of kinetic and potential energies, which involve only scalar functions, and the equations of motion come from a single scalar function, i.e. the Lagrangian. The directional properties of the equations of motion come from the requirement that the trajectory is specified by the principle of least action. The directional properties of the vectors in the Newtonian approach assist our intuition when setting up a problem, but the Lagrangian method is simpler mathematically when the mechanical system is more complex.
The major advantage of the variational approaches to mechanics is that solution of the dynamical equations of motion can be simplified by expressing the motion in terms of independent generalized coordinates. For Lagrangian mechanics these generalized coordinates can be any set of independent variables, $q_i$, where $1 \le i \le n$, plus the corresponding velocities $\dot q_i$. These independent generalized coordinates completely specify the scalar potential and kinetic energies used in the Lagrangian or Hamiltonian. The variational approach allows for a much larger arsenal of possible generalized coordinates than the typical vector coordinates used in Newtonian mechanics. For example, the generalized coordinates can be dimensionless amplitudes for the $n$ normal modes of coupled oscillator systems, or action-angle variables. Moreover, very different generalized coordinates can be used for each of the $n$ variables. The tremendous freedom plus flexibility of the choice of generalized coordinates is important when constraint forces are acting on the system. Generalized coordinates allow the constraint forces to be ignored by including auxiliary conditions to account for the kinematic constraints that lead to correlated motion. The Lagrange method provides an incredibly consistent and mechanistic problem-solving strategy for many-body systems subject to constraints. Expressed in terms of generalized coordinates, Lagrange's equations can be applied to a wide variety of physical problems including those involving fields. The manipulation of scalar quantities in a configuration space of generalized coordinates can greatly simplify problems compared with being confined to a rigid orthogonal coordinate system characterized by the Newtonian vector approach.
The use of generalized coordinates in Lagrange's equations of motion can be applied to a wide range of physical phenomena including field theory, such as for electromagnetic fields, which are beyond the applicability of Newton's equations of motion. The superiority of the Lagrangian approach compared to the Newtonian approach for solving problems in mechanics is apparent when dealing with holonomic constraint forces. Constraint forces must be known and included explicitly in the Newtonian equations of motion. Unfortunately, knowledge of the equations of motion is required to derive these constraint forces. For holonomic constrained systems, the equations of motion can be solved directly, without calculating the constraint forces, by using the minimal set of generalized coordinates approach to Lagrangian mechanics. Moreover, the Lagrange approach has significant philosophical advantages compared to the Newtonian approach.

6.14 Summary
Newtonian plausibility argument for Lagrangian mechanics:
A justification for introducing the calculus of variations to classical mechanics becomes apparent when the concept of the Lagrangian $L \equiv T - U$ is used in the functional and time $t$ is the independent variable. It was shown that Newton's equation of motion can be rewritten as
$$\frac{d}{dt}\frac{\partial L}{\partial\dot q_j} - \frac{\partial L}{\partial q_j} = Q_j^{EX} \tag{6.12}$$
where $Q_j^{EX}$ are the excluded forces of constraint plus any other conservative or non-conservative forces not included in the potential $U$. This corresponds to the Euler-Lagrange equation for determining the minimum of the time integral of the Lagrangian. Equation 6.12 can be written as
$$\frac{d}{dt}\frac{\partial L}{\partial\dot q_j} - \frac{\partial L}{\partial q_j} = \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC} \tag{6.15}$$
where the Lagrange multiplier term accounts for holonomic constraint forces, and $Q_j^{EXC}$ includes all additional forces not accounted for by the scalar potential $U$ or the Lagrange multiplier terms. The constraint forces can be included explicitly as generalized forces in the excluded term $Q_j^{EXC}$ of equation 6.15.
d'Alembert's Principle
It was shown that d'Alembert's Principle
$$\sum_i^N\left(\mathbf{F}_i^A - \dot{\mathbf{p}}_i\right)\cdot\delta\mathbf{r}_i = 0 \tag{6.25}$$
cleverly transforms the principle of virtual work from the realm of statics to dynamics. Application of virtual work to statics primarily leads to algebraic equations between the forces, whereas d'Alembert's principle applied to dynamics leads to differential equations of motion.
Lagrange equations from d'Alembert's Principle
After transforming to generalized coordinates, d'Alembert's Principle leads to
$$\sum_j^n\left[\left\{\frac{d}{dt}\left(\frac{\partial T}{\partial\dot q_j}\right) - \frac{\partial T}{\partial q_j}\right\} - Q_j\right]\delta q_j = 0 \tag{6.38}$$
If all the $n$ generalized coordinates $q_j$ are independent, then equation 6.38 implies that the term in the square brackets is zero for each individual value of $j$. That is, this implies the basic Euler-Lagrange equations of motion.
The handling of both conservative and non-conservative generalized forces $Q_j$ is best achieved by assuming that the generalized force $Q_j = \sum_i^n\mathbf{F}_i^A\cdot\frac{\partial\bar{\mathbf{r}}_i}{\partial q_j}$ can be partitioned into a conservative velocity-independent term, that can be expressed in terms of the gradient of a scalar potential, $-\nabla U_j$, plus an excluded generalized force $Q_j^{EX}$ which contains the non-conservative, velocity-dependent, and all the constraint forces not explicitly included in the potential $U_j$. That is,
$$Q_j = -\nabla U_j + Q_j^{EX} \tag{6.41}$$
Inserting (6.41) into (6.38), and assuming that the potential $U$ is velocity independent, allows (6.38) to be rewritten as
$$\sum_j^n\left[\left\{\frac{d}{dt}\left(\frac{\partial(T - U)}{\partial\dot q_j}\right) - \frac{\partial(T - U)}{\partial q_j}\right\} - Q_j^{EX}\right]\delta q_j = 0 \tag{6.42}$$
Expressed in terms of the standard Lagrangian $L = T - U$ this gives
$$\sum_j^n\left[\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} - Q_j^{EX}\right]\delta q_j = 0 \tag{6.44}$$
Note that equation (6.44) contains the basic Euler-Lagrange equation (6.38) for the special case when $U = 0$. In addition, note that if all the generalized coordinates are independent, then the square bracket terms are zero for each value of $j$, which leads to the $n$ general Euler-Lagrange equations of motion
$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot q_j}\right) - \frac{\partial L}{\partial q_j}\right\} = Q_j^{EX} \tag{6.45}$$
where $n \ge j \ge 1$.
Newtonian mechanics has trouble handling constraint forces because they lead to coupling of the degrees
of freedom. Lagrangian mechanics is more powerful since it provides the following three ways to handle such
correlated motion.
1) Minimal set of generalized coordinates
If the $n$ coordinates $q_j$ are independent, then the square bracket equals zero for each value of $j$ in equation 6.44, which corresponds to Euler’s equation for each of the $n$ independent coordinates. If the $n$ generalized coordinates are coupled by $m$ constraints, then the coordinates can be transformed to a minimal set of $s = n - m$ independent coordinates which then can be solved by applying equation 6.45 to the minimal set of $s$ independent coordinates.
2) Lagrange multipliers approach
The Lagrangian method concentrates solely on active forces, completely ignoring all other internal forces. In Lagrangian mechanics the generalized forces, corresponding to each generalized coordinate, can be partitioned three ways

$$Q_j = -\nabla U_j + \sum_{k=1}^m \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q^{EXC}_j$$

where the velocity-independent conservative forces can be absorbed into a scalar potential $U$, the holonomic constraint forces can be handled using the Lagrange multiplier term $\sum_{k=1}^m \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$, and the remaining part of the active forces can be absorbed into the generalized force $Q^{EXC}_j$. The scalar potential energy $U$ is handled by absorbing it into the standard Lagrangian $L = T - U$. If the constraint forces are holonomic then these forces are easily and elegantly handled by use of Lagrange multipliers. All remaining forces, including dissipative forces, can be handled by including them explicitly in the generalized force $Q^{EXC}_j$.
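Combining the above two equations gives

$$\sum_j^n \left[\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} - Q^{EXC}_j - \sum_{k=1}^m \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)\right]\delta q_j = 0 \tag{6.56}$$

Use of the Lagrange multipliers to handle the $m$ constraint forces ensures that all $n$ infinitesimals $\delta q_j$ are independent, implying that the expression in the square bracket must be zero for each of the $n$ values of $j$. This leads to $n$ Lagrange equations plus $m$ constraint relations

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = Q^{EXC}_j + \sum_{k=1}^m \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) \tag{6.60}$$

where $j = 1, 2, 3, \ldots, n$.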
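As an illustrative sketch of equation 6.60 (this example is supplied here for illustration and is not taken from the summary above), consider a plane pendulum of mass $m$ on a massless rod of length $l$, described by polar coordinates $(r, \theta)$ with $\theta$ measured from the downward vertical. The rod imposes the single holonomic constraint $g_1 = r - l = 0$, and it is assumed that $Q^{EXC}_j = 0$. The symbol $g$ without a subscript denotes the gravitational acceleration and should not be confused with the constraint function $g_1$.

$$L = \tfrac{1}{2} m \left(\dot{r}^2 + r^2 \dot{\theta}^2\right) + m g r \cos\theta$$

$$m\ddot{r} - m r \dot{\theta}^2 - m g \cos\theta = \lambda_1 \frac{\partial g_1}{\partial r} = \lambda_1$$

$$m r^2 \ddot{\theta} + 2 m r \dot{r}\dot{\theta} + m g r \sin\theta = \lambda_1 \frac{\partial g_1}{\partial \theta} = 0$$

Imposing the constraint $r = l$, $\dot{r} = \ddot{r} = 0$ gives

$$\ddot{\theta} = -\frac{g}{l}\sin\theta, \qquad \lambda_1 = -\left(m l \dot{\theta}^2 + m g \cos\theta\right)$$

Under these assumptions the two Lagrange equations plus the one constraint relation determine both the motion and the multiplier $\lambda_1$, which is the radial force of constraint. Its negative sign indicates that it is directed toward the pivot, and its magnitude is the tension in the rod.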
3) Generalized forces approach
The two right-hand terms in (6.60) can be understood to be those forces acting on the system that are not absorbed into the scalar potential $U$ component of the Lagrangian $L$. The Lagrange multiplier terms $\sum_{k=1}^m \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$ account for the holonomic forces of constraint that are not included in the conservative potential or in the generalized forces $Q^{EXC}_j$. The generalized force

$$Q^{EXC}_j = \sum_i^n \mathbf{F}^A_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \tag{6.17}$$



Applying the Euler-Lagrange equations in mechanics:

The optimal way to exploit Lagrangian mechanics is as follows:

1. Select a set of independent generalized coordinates.

2. Partition the active forces into three groups:

(a) Conservative one-body forces

(b) Holonomic constraint forces
(c) Generalized forces

3. Minimize the number of generalized coordinates.

4. Derive the Lagrangian
5. Derive the equations of motion

Velocity-dependent Lorentz force:

Usually velocity-dependent forces are non-holonomic. However, electromagnetism is a special case where the velocity-dependent Lorentz force $\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B})$ can be obtained from a velocity-dependent potential function $U(q, \dot{q}, t)$. It was shown that the velocity-dependent potential

$$U = q\Phi - q\mathbf{v} \cdot \mathbf{A} \tag{6.74}$$

leads to the Lorentz force where $\Phi$ is the scalar electric potential and $\mathbf{A}$ the vector potential.
Time-dependent forces:
It was shown that time-dependent forces can lead to complicated motion having both stable regions and
unstable regions of motion that can exhibit chaos.
Impulsive forces:
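A generalized impulse $\tilde{Q}_j$ can be derived for an instantaneous impulsive force from the time integral of the impulsive forces $\mathbf{P}_i$ given by equation 2.135 using the time integral of equation 6.78, that is

$$\Delta p_j = \tilde{Q}_j = \lim_{\tau \to 0} \int_t^{t+\tau} Q^{EXC}_j \, dt \equiv \lim_{\tau \to 0} \int_t^{t+\tau} \sum_i \mathbf{F}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \, dt = \sum_i \tilde{\mathbf{P}}_i \cdot \frac{\partial \mathbf{r}_i}{\partial q_j} \tag{6.79}$$

Note that the generalized impulse $\tilde{Q}_j$ can be a translational impulse $\tilde{\mathbf{P}}_j$ with corresponding translational variable $q_j$ or an angular impulsive torque $\tilde{\mathbf{T}}_j$ with corresponding angular variable $\phi_j$.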
Comparison of Newtonian and Lagrangian mechanics:
In contrast to Newtonian mechanics, which is based on knowing all the vector forces acting on a system,
Lagrangian mechanics can derive the equations of motion using generalized coordinates without requiring
knowledge of the constraint forces acting on the system. Lagrangian mechanics provides a remarkably
powerful, and incredibly consistent approach to solving for the equations of motion in classical mechanics,
and is especially powerful for handling systems that are subject to holonomic constraints.

Workshop exercises
1. A disk of mass $M$ and radius $R$ rolls without slipping down a plane inclined from the horizontal by an angle $\alpha$. The disk has a short weightless axle of negligible radius. From this axis is suspended a simple pendulum of length $l < R$ and whose bob has a mass $m$. Assume that the motion of the pendulum takes place in the plane of the disk.

(a) What generalized coordinates would be appropriate for this situation?

(b) Are there any equations of constraint? If so, what are they?
(c) Find Lagrange’s equations for this system.

2. A Lagrangian for a particular system can be written as

$$L = \frac{m}{2}\left(a\dot{x}^2 + 2b\dot{x}\dot{y} + c\dot{y}^2\right) - \frac{K}{2}\left(ax^2 + 2bxy + cy^2\right)$$

where $a$, $b$ and $c$ are arbitrary constants, but subject to the condition that $b^2 - 4ac \neq 0$.

(a) What are the equations of motion?

(b) Examine the case $a = 0 = c$. What physical system does this represent?
(c) Examine the case $b = 0$ and $c = -a$. What physical system does this represent?
(d) Based on your answers to (b) and (c), determine the physical system represented by the Lagrangian given above.

3. Consider a particle of mass $m$ moving in a plane and subject to an inverse square attractive force.

(a) Obtain the equations of motion.

(b) Is the angular momentum about the origin conserved?
(c) Obtain expressions for the generalized forces. Recall that the generalized forces are defined by

$$Q_j = \sum_i F_i \frac{\partial x_i}{\partial q_j}$$

4. Consider a Lagrangian function of the form $L(q_i, \dot{q}_i, \ddot{q}_i, t)$. Here the Lagrangian contains a time derivative of the generalized coordinates that is higher than the first. When working with such Lagrangians, the term “generalized mechanics” is used.

(a) Consider a system with one degree of freedom. By applying the methods of the calculus of variations, and assuming that Hamilton’s principle holds with respect to variations which keep both $q$ and $\dot{q}$ fixed at the end points, show that the corresponding Lagrange equation is

$$\frac{d^2}{dt^2}\left(\frac{\partial L}{\partial \ddot{q}}\right) - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}}\right) + \frac{\partial L}{\partial q} = 0$$

Such equations of motion have interesting applications in chaos theory.
(b) Apply this result to the Lagrangian

$$L = -\frac{m}{2} q\ddot{q} - \frac{k}{2} q^2$$

Do you recognize the equations of motion?

5. A bead of mass $m$ slides under gravity along a smooth wire bent in the shape of a parabola $x^2 = az$ in the vertical $(x, z)$ plane.

(a) What kind (holonomic, nonholonomic, scleronomic, rheonomic) of constraint acts on $m$?

(b) Set up Lagrange’s equation of motion for $x$ with the constraint embedded.

(c) Set up Lagrange’s equations of motion for both $x$ and $z$ with the constraint adjoined and a Lagrangian multiplier $\lambda$ introduced.
(d) Show that the same equation of motion for $x$ results from either of the methods used in part (b) or part (c).
(e) Express $\lambda$ in terms of $x$ and $\dot{x}$.
(f) What are the $x$ and $z$ components of the force of constraint in terms of $x$ and $\dot{x}$?

6. Consider the two Lagrangians

$$L(q, \dot{q}; t) \quad \text{and} \quad L'(q, \dot{q}; t) = L(q, \dot{q}; t) + \frac{dF(q, t)}{dt}$$

where $F(q, t)$ is an arbitrary function of the generalized coordinates $q(t)$. Show that these two Lagrangians yield the same Euler-Lagrange equations. As a consequence two Lagrangians that differ only by an exact time derivative are said to be equivalent.

7. Consider the double pendulum comprising masses $m_1$ and $m_2$ connected by inextensible strings as shown in the figure. Assume that the motion of the pendulum takes place in a vertical plane.

(a) Are there any equations of constraint? If so, what are they?
(b) Find Lagrange’s equations for this system.

[Figure: double pendulum suspended from the point O, with masses $m_1$ and $m_2$ (weights $m_1 g$ and $m_2 g$) and lower string of length $L_2$.]

8. Consider the system shown in the figure which consists of a mass $m$ suspended via a constrained massless link of length $L$ where the point $A$ is acted upon by a spring of spring constant $k$. The spring is unstretched when the massless link is horizontal. Assume that the holonomic constraints at $A$ and $B$ are frictionless.

(a) Derive the equations of motion for the system using the method of Lagrange multipliers.

[Figure: mass suspended by a massless link of length L, with coordinates $x_0$, $x$ and $y$ marked.]

9. Consider a pendulum, with mass $m$, connected to a (horizontally) moveable support of mass $M$.

(a) Determine the Lagrangian of the system.

(b) Determine the equations of motion for $\theta \ll 1$.
(c) Find an equation of motion in $\theta$ alone. What is the frequency of oscillation?
(d) What is the frequency of oscillation for $M \gg m$? Does this make sense?

Problems
1. A sphere of radius $\rho$ is constrained to roll without slipping on the lower half of the inner surface of a hollow cylinder of radius $R$. Determine the Lagrangian function, the equation of constraint, and the Lagrange equations of motion. Find the frequency of small oscillations.

2. A particle moves in a plane under the influence of a force $F = -A r^{\alpha - 1}$ directed toward the origin; $A$ and $\alpha$ $(> 0)$ are constants. Choose generalized coordinates with the potential energy zero at the origin.
a) Find the Lagrangian equations of motion.
b) Is the angular momentum about the origin conserved?
c) Is the total energy conserved?

3. Two blocks, each of mass $M$, are connected by an extensionless, uniform string of length $l$. One block is placed on a frictionless horizontal surface, and the other block hangs over the side, the string passing over a frictionless pulley. Describe the motion of the system:
a) when the mass of the string is negligible
b) when the string has mass $m$.

4. Two masses $m_1$ and $m_2$ $(m_1 \neq m_2)$ are connected by a rigid rod of length $d$ and of negligible mass. An extensionless string of length $l_1$ is attached to $m_1$ and connected to a fixed point of the support $P$. Similarly a string of length $l_2$ $(l_1 \neq l_2)$ connects $m_2$ and $P$. Obtain the equation of motion describing the motion in the plane of $m_1$, $m_2$, and $P$, and find the frequency of small oscillation around the equilibrium position.

5. A thin uniform rigid rod of length $2L$ and mass $M$ is suspended by a massless string of length $l$. Initially the system is hanging vertically downwards in the gravitational field $g$. Use as generalized coordinates the angles given in the diagram.
a) Derive the Lagrangian for the system.
b) Use the Lagrangian to derive the equations of motion.
c) A horizontal impulsive force $F$ in the $x$ direction strikes the bottom end of the rod for an infinitesimal time $\tau$. Derive the initial conditions for the system immediately after the impulse has occurred.
d) Draw a diagram showing the geometry of the pendulum shortly after the impulse when the displacement angles are significant.

[Figure: rod of length 2L and weight Mg suspended by the string from the point O, with axes x and y and the angular coordinates marked.]
Chapter 7

Symmetries, Invariance and the

Hamiltonian

7.1 Introduction
The chapter 6 discussion of Lagrangian dynamics illustrates the power of Lagrangian mechanics for deriving the equations of motion. In contrast to Newtonian mechanics, which is expressed in terms of force vectors acting on a system, the Lagrangian method, based on d’Alembert’s Principle or Hamilton’s Principle, is expressed in terms of the scalar kinetic and potential energies of the system. The Lagrangian approach is a sophisticated alternative to Newton’s laws of motion that simplifies derivation of the equations of motion by allowing constraint forces to be ignored. In addition, the use of Lagrange multipliers or generalized forces allows the Lagrangian approach to determine the constraint forces when these forces are of interest. The equations of motion, derived either from Newton’s Laws or Lagrangian dynamics, can be non-trivial to solve mathematically. It is necessary to integrate second-order differential equations, which for $n$ degrees of freedom imply $2n$ constants of integration.
Chapter 7 will explore the remarkable connection between symmetry and invariance of a system under transformation, and the related conservation laws that imply the existence of constants of motion. Even when the equations of motion cannot be solved easily, it is possible to derive important physical principles regarding the first-order integrals of motion of the system directly from the Lagrange equation, and to elucidate the underlying symmetries and invariance. This property is contained in Noether’s theorem, which states that conservation laws are associated with differentiable symmetries of a physical system.

7.2 Generalized momentum

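Consider a holonomic system of $N$ masses under the influence of conservative forces that depend on position $q_j$ but not velocity $\dot{q}_j$, that is, the potential is velocity independent. Then for the $x$ coordinate of particle $i$ for $N$ particles

$$\frac{\partial L}{\partial \dot{x}_i} = \frac{\partial T}{\partial \dot{x}_i} - \frac{\partial U}{\partial \dot{x}_i} = \frac{\partial T}{\partial \dot{x}_i} \tag{7.1}$$

$$= \frac{\partial}{\partial \dot{x}_i} \sum_{i=1}^N \frac{1}{2} m_i \left(\dot{x}_i^2 + \dot{y}_i^2 + \dot{z}_i^2\right) = m_i \dot{x}_i = p_{i,x}$$

Thus for a holonomic, conservative, velocity-independent potential we have

$$\frac{\partial L}{\partial \dot{x}_i} = p_{i,x} \tag{7.2}$$

which is the $x$ component of the linear momentum for the $i^{th}$ particle.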


This result suggests an obvious extension of the concept of momentum to generalized coordinates. The generalized momentum associated with the coordinate $q_j$ is defined to be

$$p_j \equiv \frac{\partial L}{\partial \dot{q}_j} \tag{7.3}$$

Note that $p_j$ also is called the conjugate momentum or canonical momentum to $q_j$ where $q_j, p_j$ are conjugate, or canonical, variables. Remember that the linear momentum $\mathbf{p}$ is the first-order time integral given by equation 2.10. If $q_j$ is not a spatial coordinate, then $p_j$ is the generalized momentum, not the kinematic linear momentum. For example, if $q_j$ is an angle, then $p_j$ will be angular momentum. That is, the generalized momentum may differ from the usual linear or angular momentum since the definition (7.3) is more general than the usual $p_x = m\dot{x}$ definition of linear momentum in classical mechanics. This is illustrated by the case of moving charged particles of mass $m_i$ and charge $q_i$ in an electromagnetic field. Chapter 6 showed that electromagnetic forces on a charge $q_i$ can be described in terms of a scalar potential $U_i$ where

$$U_i = q_i \left(\Phi - \mathbf{A} \cdot \mathbf{v}_i\right) \tag{7.4}$$

Thus the Lagrangian for the electromagnetic force can be written as

$$L = \sum_{i=1}^N \left[\frac{1}{2} m_i \mathbf{v}_i \cdot \mathbf{v}_i - q_i \left(\Phi - \mathbf{A} \cdot \mathbf{v}_i\right)\right] \tag{7.5}$$

The generalized momentum to the coordinate $x_i$ for charge $q_i$ and mass $m_i$ is given by the above Lagrangian

$$p_{i,x} = \frac{\partial L}{\partial \dot{x}_i} = m_i \dot{x}_i + q_i A_x \tag{7.6}$$

Note that this includes both the mechanical linear momentum and the correct electromagnetic momentum. The fact that the electromagnetic field carries momentum should not be a surprise since electromagnetic waves also carry energy as is illustrated by the transmission of radiant energy from the sun.
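The canonical momentum of equation 7.6 can be checked numerically. The following is a minimal sketch, assuming a single particle in a uniform magnetic field with zero scalar potential and the symmetric-gauge vector potential A = (1/2) B × r; the function names and the numerical values of the mass, charge, field, position and velocity are illustrative assumptions, not quantities taken from the text. It differentiates the Lagrangian of equation 7.5 with respect to each velocity component by central differences and compares the result with m v + q A.

```python
def cross(a, b):
    """Cartesian cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def vector_potential(r, B):
    """Symmetric gauge A = (1/2) B x r, valid for a uniform field B (assumed)."""
    return tuple(0.5 * c for c in cross(B, r))


def lagrangian(r, v, m, q, B, phi=0.0):
    """Single-particle form of equation 7.5: L = (1/2) m v.v - q (phi - A.v)."""
    A = vector_potential(r, B)
    return 0.5 * m * dot(v, v) - q * (phi - dot(A, v))


def canonical_momentum_numeric(r, v, m, q, B, h=1e-4):
    """Central-difference estimate of dL/dv_i for i = x, y, z."""
    p = []
    for i in range(3):
        v_plus, v_minus = list(v), list(v)
        v_plus[i] += h
        v_minus[i] -= h
        dL = lagrangian(r, v_plus, m, q, B) - lagrangian(r, v_minus, m, q, B)
        p.append(dL / (2.0 * h))
    return tuple(p)


def canonical_momentum_exact(r, v, m, q, B):
    """Equation 7.6 applied to each component: p = m v + q A."""
    A = vector_potential(r, B)
    return tuple(m * vi + q * Ai for vi, Ai in zip(v, A))


# Illustrative values (assumed).
m, q = 2.0, 1.5
B = (0.0, 0.0, 3.0)
r = (1.0, -2.0, 0.5)
v = (0.3, 0.4, -0.1)

p_num = canonical_momentum_numeric(r, v, m, q, B)
p_exact = canonical_momentum_exact(r, v, m, q, B)
print("numerical dL/dv:", p_num)
print("m v + q A      :", p_exact)
```

Because this Lagrangian is quadratic in the velocity, the central difference agrees with the analytic momentum up to rounding error, and the two printed vectors differ from the purely mechanical momentum m v by the term q A.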

7.1 Example: Feynman’s angular-momentum paradox

Feynman [Fey84] posed the following paradox. A circular insulating disk, mounted on frictionless bearings, has a circular ring of total charge $q$ uniformly distributed around the perimeter of the circular disk at the radius $R$. A superconducting long solenoid of radius $b$, where $b < R$, is fixed to the disk and is mounted coaxial with the bearings. The moment of inertia of the system about the rotation axis is $I$. Initially the disk plus superconducting solenoid are stationary with a steady current producing a uniform magnetic field $B_0$ inside the solenoid. Assume that a rise in temperature of the solenoid destroys the superconductivity, leading to a rapid dissipation of the electric current and resultant magnetic field. Assume that the system is free to rotate, no other forces or torques are acting on the system, and that the charge carriers in the solenoid have zero mass and thus do not contribute to the angular momentum. Does the system rotate when the current in the solenoid stops?

Initially the system is stationary with zero mechanical angular momentum. Faraday’s Law states that, when the magnetic field dissipates from $B_0$ to zero, there will be a torque $N$ acting on the circumferential charge $q$ at radius $R$ due to the change in magnetic flux $\Phi$. The emf induced around the ring is $-\frac{d\Phi}{dt}$, which corresponds to a tangential electric field $E_\phi = -\frac{1}{2\pi R}\frac{d\Phi}{dt}$, and thus

$$N(t) = q R E_\phi = -\frac{q}{2\pi}\frac{d\Phi}{dt}$$

Since $\frac{d\Phi}{dt} < 0$, this torque leads to an angular impulse which will equal the final mechanical angular momentum.

$$L^{final}_{mech} = T = \int N(t)\, dt = \frac{q\Phi}{2\pi}$$

The initial angular momentum in the electromagnetic field can be derived using equation 7.6 plus Stokes’ theorem (Appendix 3). Equation 2.142 gives that the final angular momentum equals the angular impulse. Each element $dl$ of the ring carries charge $\frac{q}{2\pi R}\, dl$, and thus electromagnetic momentum $\frac{q}{2\pi R}\mathbf{A}\, dl$ at the radius $R$. Summing the corresponding angular momenta around the ring gives

$$L^{initial}_{EM} = \frac{q}{2\pi}\oint \mathbf{A} \cdot d\mathbf{l} = \frac{q}{2\pi}\int \mathbf{B} \cdot d\mathbf{S} = \frac{q\Phi}{2\pi}$$

where $\Phi = \oint \mathbf{A} \cdot d\mathbf{l} = \int \mathbf{B} \cdot d\mathbf{S}$ is the initial total magnetic flux through the solenoid. Thus the total initial angular momentum is given by

$$L^{initial}_{total} = 0 + L^{initial}_{EM} = \frac{q\Phi}{2\pi}$$

Since the final electromagnetic field is zero the final total angular momentum is given by

$$L^{final}_{total} = L^{final}_{mech} + 0 = \frac{q\Phi}{2\pi}$$

Note that the total angular momentum is conserved. That is, initially all the angular momentum is stored in the electromagnetic field, whereas the final angular momentum is all mechanical. This resolves the paradox: the mechanical angular momentum alone is not conserved; only the total angular momentum of the system, that is, the sum of the mechanical and electromagnetic angular momenta, is conserved.

7.3 Invariant transformations and Noether’s Theorem

One of the great advantages of Lagrangian mechanics is the freedom it allows in choice of generalized coordinates which can simplify derivation of the equations of motion. For example, for any set of coordinates, $q_j$, a reversible point transformation can define another set of coordinates $q'_j$ such that

$$q'_j = q'_j(q_1, q_2, \ldots, q_n; t) \tag{7.7}$$

The new set of generalized coordinates satisfies Lagrange’s equations of motion with the new Lagrangian

$$L(q', \dot{q}', t) = L(q, \dot{q}, t) \tag{7.8}$$

The Lagrangian is a scalar, with units of energy, which does not change if the coordinate representation is changed. Thus $L(q', \dot{q}', t)$ can be derived from $L(q, \dot{q}, t)$ by substituting the inverse relation $q_j = q_j(q'_1, q'_2, \ldots, q'_n; t)$ into $L(q, \dot{q}, t)$. That is, the value of the Lagrangian $L$ is independent of which coordinate representation is used. Although the general form of Lagrange’s equations of motion is preserved in any point transformation, the explicit equations of motion for the new variables usually look different from those with the old variables. A typical example is the transformation from cartesian to spherical coordinates. For a given system, there can be particular transformations for which the explicit equations of motion are the same for both the old and new variables. Transformations for which the equations of motion are invariant are called invariant transformations. It will be shown that if the Lagrangian does not explicitly contain a particular coordinate of displacement $q_j$, then the corresponding conjugate momentum, $p_j$, is conserved. This relation is called Noether’s theorem which states “For each symmetry of the Lagrangian, there is a conserved quantity”.
Noether’s Theorem will be used to consider invariant transformations for two dependent variables, $x(t)$ and $\theta(t)$, plus their conjugate momenta $p_x$ and $p_\theta$. For a closed system, these provide up to six possible conservation laws for the three axes. Then we will discuss the independent variable $t$ and its relation to the Generalized Energy Theorem, which provides another possible conservation law. For simplicity, these discussions will assume that the systems are holonomic and conservative.
The Lagrange equations using generalized coordinates for holonomic systems were given by equation 6.60 to be

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \sum_{k=1}^m \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q^{EXC}_j \tag{7.9}$$

This can be written in terms of the generalized momentum as

$$\left\{\frac{d}{dt} p_j - \frac{\partial L}{\partial q_j}\right\} = \sum_{k=1}^m \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q^{EXC}_j \tag{7.10}$$

or equivalently as

$$\dot{p}_j = \frac{\partial L}{\partial q_j} + \left[\sum_{k=1}^m \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q^{EXC}_j\right] \tag{7.11}$$

Note that if the Lagrangian $L$ does not contain $q_j$ explicitly, that is, the Lagrangian is invariant to a linear translation, or equivalently, is spatially homogeneous, and if the Lagrange multiplier constraint force and generalized force terms are zero, then

$$\left[\frac{\partial L}{\partial q_j} + \sum_{k=1}^m \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q^{EXC}_j\right] = 0 \tag{7.12}$$

In this case the Lagrange equation reduces to

$$\dot{p}_j = \frac{\partial L}{\partial q_j} = 0 \tag{7.13}$$

Equation 7.13 corresponds to $p_j$ being a constant of motion. Stated in words, the generalized momentum $p_j$ is a constant of motion if the Lagrangian is invariant to a spatial translation of $q_j$, and the constraint plus generalized force terms are zero. Expressed another way, if the Lagrangian does not contain a given coordinate $q_j$ and the corresponding constraint plus generalized forces are zero, then the generalized momentum associated with this coordinate is conserved. Note that this example of Noether’s theorem applies to any component of $\mathbf{q}$. For example, in the uniform gravitational field at the surface of the earth, the Lagrangian does not depend on the $x$ and $y$ coordinates in the horizontal plane, thus $p_x$ and $p_y$ are conserved, whereas, due to the gravitational force, the Lagrangian does depend on the vertical $z$ axis and thus $p_z$ is not conserved.
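The uniform-gravity case can be written out explicitly. As a minimal worked sketch, assume a single particle of mass $m$ near the surface of the earth with no constraint forces and no excluded generalized forces, so that the bracketed terms of equation 7.11 vanish:

$$L = \tfrac{1}{2} m \left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right) - m g z$$

$$\dot{p}_x = \frac{\partial L}{\partial x} = 0, \qquad \dot{p}_y = \frac{\partial L}{\partial y} = 0, \qquad \dot{p}_z = \frac{\partial L}{\partial z} = -mg$$

Under these assumptions $p_x = m\dot{x}$ and $p_y = m\dot{y}$ are constants of motion, while $p_z(t) = p_z(0) - mgt$ changes linearly with time.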

7.2 Example: Atwood’s machine

Assume that the linear momentum is conserved for the Atwood’s machine shown in the figure below. Let the left mass rise a distance $x$ and the right mass rise a distance $y$. Then the middle mass must drop by $x + y$ to conserve the length of the string. The Lagrangian of the system is

$$L = \frac{1}{2}(4m)\dot{x}^2 + \frac{1}{2}(3m)(-\dot{x} - \dot{y})^2 + \frac{1}{2} m\dot{y}^2 - \left(4mgx + 3mg(-x - y) + mgy\right) = \frac{7}{2} m\dot{x}^2 + 3m\dot{x}\dot{y} + 2m\dot{y}^2 - mg(x - 2y)$$

[Figure: Example of an Atwood’s machine, with masses 4m, 3m and m and displacements x and y.]

Note that the transformation

$$x = x_0 + 2\epsilon$$
$$y = y_0 + \epsilon$$

results in the potential energy term $mg(x - 2y) = mg(x_0 - 2y_0)$, which is independent of $\epsilon$. As a result the Lagrangian is independent of $\epsilon$, which means that it is invariant to the small perturbation $\epsilon$ and thus $\frac{\partial L}{\partial \epsilon} = 0$. Therefore, according to Noether’s theorem, the corresponding linear momentum $p_\epsilon = \frac{\partial L}{\partial \dot{\epsilon}}$ is conserved. This conserved linear momentum then is given by

$$p_\epsilon = \frac{\partial L}{\partial \dot{\epsilon}} = \frac{\partial L}{\partial \dot{x}}\frac{\partial \dot{x}}{\partial \dot{\epsilon}} + \frac{\partial L}{\partial \dot{y}}\frac{\partial \dot{y}}{\partial \dot{\epsilon}} = m(7\dot{x} + 3\dot{y})(2) + m(3\dot{x} + 4\dot{y}) = m(17\dot{x} + 10\dot{y})$$

Thus, if the system starts at rest with $p_\epsilon = 0$, then $\dot{x}$ always equals $-\frac{10}{17}\dot{y}$ since $p_\epsilon$ is constant.
Note that this also can be shown using the Euler-Lagrange equations in that $\Lambda_x L = 0$ and $\Lambda_y L = 0$ give

$$7\ddot{x} + 3\ddot{y} = -g$$
$$3\ddot{x} + 4\ddot{y} = 2g$$

Adding the second equation to twice the first gives

$$17\ddot{x} + 10\ddot{y} = \frac{d}{dt}(17\dot{x} + 10\dot{y}) = 0$$

This is the result obtained directly using Noether’s theorem.
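The conservation law of this example can be checked numerically. The sketch below solves the two Euler-Lagrange equations for the accelerations, steps the velocities forward from rest, and monitors the Noether momentum m(17ẋ + 10ẏ). The value of g, the unit mass, the time span and the number of steps are illustrative assumptions.

```python
G = 9.81   # gravitational acceleration in m/s^2 (assumed value)
M = 1.0    # the unit mass m of the example, in kg (assumed value)


def accelerations(g=G):
    """Solve 7*xdd + 3*ydd = -g and 3*xdd + 4*ydd = 2*g by Cramer's rule."""
    det = 7.0 * 4.0 - 3.0 * 3.0
    xdd = ((-g) * 4.0 - 3.0 * (2.0 * g)) / det
    ydd = (7.0 * (2.0 * g) - 3.0 * (-g)) / det
    return xdd, ydd


def noether_momentum(xdot, ydot, m=M):
    """Conserved momentum of the example: p_eps = m (17 xdot + 10 ydot)."""
    return m * (17.0 * xdot + 10.0 * ydot)


def simulate(t_end=2.0, steps=2000):
    """Step the velocities forward from rest, recording p_eps at each step."""
    xdd, ydd = accelerations()
    dt = t_end / steps
    xdot = 0.0
    ydot = 0.0
    history = []
    for _ in range(steps):
        xdot += xdd * dt
        ydot += ydd * dt
        history.append(noether_momentum(xdot, ydot))
    return xdot, ydot, history


xdot, ydot, history = simulate()
print("largest |p_eps| during the motion:", max(abs(p) for p in history))
print("xdot / ydot:", xdot / ydot, " expected:", -10.0 / 17.0)
```

The accelerations are constant, so the simple velocity update is sufficient here. The individual velocities grow with time, yet the combination 17ẋ + 10ẏ stays at its initial value of zero to within rounding error.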

7.4 Rotational invariance and conservation of angular momentum

The arguments used above apply equally well to the conjugate variables $p_\theta$ and $\theta$ for rotation about any axis. The Lagrange equation is

$$\left\{\frac{d}{dt} p_\theta - \frac{\partial L}{\partial \theta}\right\} = \sum_{k=1}^m \lambda_k \frac{\partial g_k}{\partial \theta}(\mathbf{q}, t) + Q^{EXC}_\theta \tag{7.14}$$

If no constraint or generalized torques act on the system, then the right-hand side of equation 7.14 is zero. Moreover if the Lagrangian is not an explicit function of $\theta$, then $\frac{\partial L}{\partial \theta} = 0$, and assuming that the constraint plus generalized torques are zero, then $p_\theta$ is a constant of motion.
Noether’s Theorem illustrates this general result which can be stated as, if the Lagrangian is rotationally invariant about some axis, then the component of the angular momentum along that axis is conserved. Also this is true for the more general case where the Lagrangian is invariant to rotation about any axis, which leads to conservation of the total angular momentum.

7.3 Example: Conservation of angular momentum for rotational invariance:

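The Noether theorem result for rotational invariance about an axis also can be derived using cartesian coordinates as shown below. As discussed in the appendix, it is necessary to limit discussion of rotation to infinitesimal rotation angles in order to represent the rotation by a vector. Consider an infinitesimal rotation $\delta\boldsymbol{\theta}$ about some axis, which is a vector. As illustrated in the adjacent figure, this can be expressed as

$$\delta\mathbf{r}_i = \delta\boldsymbol{\theta} \times \mathbf{r}_i$$

[Figure: Infinitesimal rotation]

The velocity vectors also change on rotation of the system obeying the transformation equation which is common to all vectors, that is,

$$\delta\dot{\mathbf{r}}_i = \delta\boldsymbol{\theta} \times \dot{\mathbf{r}}_i$$

If the Lagrangian is unaffected by the orientation of the system, that is, it is rotationally invariant, then it can be shown that the angular momentum is conserved. For example, consider that the Lagrangian is invariant to rotation about some axis. Since the Lagrangian is a function

$$L = L(q_i, \dot{q}_i; t)$$

then the expression that the Lagrangian does not change due to an infinitesimal rotation $\delta\boldsymbol{\theta}$ about this axis can be expressed as

$$\delta L = \sum_i \frac{\partial L}{\partial x_i}\delta x_i + \sum_i \frac{\partial L}{\partial \dot{x}_i}\delta\dot{x}_i = 0$$

where cartesian coordinates have been used.
Using the generalized momentum

$$\frac{\partial L}{\partial \dot{x}_i} = p_i$$

then, Lagrange’s equation gives

$$\frac{d}{dt} p_i - \frac{\partial L}{\partial x_i} = 0$$

that is

$$\dot{p}_i = \frac{\partial L}{\partial x_i}$$

Inserting this into the expression for $\delta L$ gives

$$\delta L = \sum_i^3 \dot{p}_i \delta x_i + \sum_i^3 p_i \delta\dot{x}_i = 0$$

This is equivalent to the scalar products

$$\dot{\mathbf{p}} \cdot \delta\mathbf{r} + \mathbf{p} \cdot \delta\dot{\mathbf{r}} = 0$$

For an infinitesimal rotation, $\delta\mathbf{r} = \delta\boldsymbol{\theta} \times \mathbf{r}$ and $\delta\dot{\mathbf{r}} = \delta\boldsymbol{\theta} \times \dot{\mathbf{r}}$. Therefore

$$\dot{\mathbf{p}} \cdot (\delta\boldsymbol{\theta} \times \mathbf{r}) + \mathbf{p} \cdot (\delta\boldsymbol{\theta} \times \dot{\mathbf{r}}) = 0$$

The cyclic order can be permuted giving

$$\delta\boldsymbol{\theta} \cdot (\mathbf{r} \times \dot{\mathbf{p}}) + \delta\boldsymbol{\theta} \cdot (\dot{\mathbf{r}} \times \mathbf{p}) = 0$$

$$\delta\boldsymbol{\theta} \cdot \left[(\mathbf{r} \times \dot{\mathbf{p}}) + (\dot{\mathbf{r}} \times \mathbf{p})\right] = 0$$

$$\delta\boldsymbol{\theta} \cdot \frac{d}{dt}(\mathbf{r} \times \mathbf{p}) = 0$$

Because the infinitesimal angle $\delta\boldsymbol{\theta}$ is arbitrary, then the time derivative

$$\frac{d}{dt}(\mathbf{r} \times \mathbf{p}) = 0$$

about the axis of rotation $\delta\boldsymbol{\theta}$. But the bracket $(\mathbf{r} \times \mathbf{p})$ equals the angular momentum. That is,

$$\text{Angular momentum} = (\mathbf{r} \times \mathbf{p}) = \text{constant}$$

This proves Noether’s theorem that the angular momentum about any axis is conserved if the Lagrangian is rotationally invariant about that axis.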
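The result above can be illustrated numerically. The following is a minimal sketch, assuming a particle of unit mass moving in a plane under an attractive inverse-square central force, for which the Lagrangian is rotationally invariant about the axis normal to the plane. The force constant, initial conditions, time step and the choice of a velocity-Verlet integrator are all illustrative assumptions. Each kick changes p parallel to r and each drift changes r parallel to p, so neither step changes r × p.

```python
def force(x, y, k=1.0):
    """Attractive inverse-square central force F = -k r / |r|^3 (k assumed)."""
    r_cubed = (x * x + y * y) ** 1.5
    return -k * x / r_cubed, -k * y / r_cubed


def angular_momentum(x, y, px, py):
    """z component of r x p for motion confined to the x-y plane."""
    return x * py - y * px


def integrate(x, y, px, py, m=1.0, dt=1e-3, steps=20000):
    """Velocity-Verlet (kick-drift-kick) integration; returns the final state."""
    fx, fy = force(x, y)
    for _ in range(steps):
        # Half kick: momentum changes along the force, which is parallel to r.
        px += 0.5 * dt * fx
        py += 0.5 * dt * fy
        # Drift: position changes along p.
        x += dt * px / m
        y += dt * py / m
        # Second half kick with the force at the new position.
        fx, fy = force(x, y)
        px += 0.5 * dt * fx
        py += 0.5 * dt * fy
    return x, y, px, py


# Illustrative initial state (x, y, px, py) giving a bound elliptical orbit.
state_initial = (1.0, 0.0, 0.0, 0.8)
l_initial = angular_momentum(*state_initial)
state_final = integrate(*state_initial)
l_final = angular_momentum(*state_final)
print("initial r x p:", l_initial)
print("final   r x p:", l_final)
```

The position and momentum both change substantially over the integration, while r × p retains its initial value to within rounding error, as expected for a rotationally invariant Lagrangian.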

7.4 Example: Diatomic molecules and axially-symmetric nuclei

An interesting example of Noether’s theorem applies to diatomic molecules such as H$_2$, N$_2$, F$_2$, O$_2$, Cl$_2$ and Br$_2$. The electric field produced by the two charged nuclei of the diatomic molecule has cylindrical symmetry about the axis through the two nuclei. Electrons are bound to this dumbbell arrangement of the two nuclear charges which may be rotating and vibrating in free space. Assuming that there are no external torques acting on the diatomic molecule in free space, then the component of the angular momentum about any fixed axis in free space must be conserved according to Noether’s theorem, that is, the total angular momentum is conserved. What is especially interesting is that since the electrostatic potential, and thus the Lagrangian, of the diatomic molecule has cylindrical symmetry, that is $\frac{\partial L}{\partial \phi} = 0$, then the component of the angular momentum with respect to this symmetry axis also is conserved irrespective of how the diatomic molecule rotates or vibrates in free space. That is, an additional symmetry has been identified that leads to an additional conservation law that applies to the angular momentum.
Another example of Noether’s theorem is in nuclear physics where some nuclei have a spheroidal shape similar to an American football or a rugby ball. This spheroidal shape has an axis of symmetry along the long axis. The Lagrangian is rotationally invariant about the symmetry axis resulting in the angular momentum about the symmetry axis being conserved in addition to conservation of the total angular momentum.

7.5 Cyclic coordinates

Translational and rotational invariance occur when a system has a cyclic coordinate $q_k$. A cyclic coordinate is one that does not explicitly appear in the Lagrangian. The term cyclic is a natural name when one has cylindrical or spherical symmetry. In Hamiltonian mechanics a cyclic coordinate often is called an ignorable coordinate. By virtue of Lagrange’s equations

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_k} - \frac{\partial L}{\partial q_k} = 0 \tag{7.15}$$

then a cyclic coordinate $q_k$ is one for which $\frac{\partial L}{\partial q_k} = 0$. Thus

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_k} = \dot{p}_k = 0 \tag{7.16}$$

that is, $p_k$ is a constant of motion if the conjugate coordinate $q_k$ is cyclic. This is just Noether’s Theorem.
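As a short worked illustration of a cyclic coordinate (an example supplied here, assuming a particle of mass $m$ moving in a plane in a central potential $U(r)$ with no constraint or excluded generalized forces), the Lagrangian in plane polar coordinates is

$$L = \tfrac{1}{2} m \left(\dot{r}^2 + r^2 \dot{\theta}^2\right) - U(r)$$

The angle $\theta$ does not appear explicitly, so it is cyclic and equation 7.16 gives

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m r^2 \dot{\theta} = \text{constant}$$

In contrast, $r$ is not cyclic, and its conjugate momentum $p_r = m\dot{r}$ obeys

$$\dot{p}_r = \frac{\partial L}{\partial r} = m r \dot{\theta}^2 - \frac{dU}{dr}$$

which in general is non-zero.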

7.6 Kinetic energy in generalized coordinates

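Application of Noether’s theorem to the conservation of energy requires the kinetic energy to be expressed in generalized coordinates. In terms of fixed rectangular coordinates, the kinetic energy for $N$ bodies, each having three degrees of freedom, is expressed as

$$T = \frac{1}{2}\sum_{\alpha=1}^N \sum_{i=1}^3 m_\alpha \dot{x}_{\alpha,i}^2 \tag{7.17}$$

These can be expressed in terms of generalized coordinates as $x_{\alpha,i} = x_{\alpha,i}(q_j, t)$ and in terms of generalized velocities

$$\dot{x}_{\alpha,i} = \sum_{j=1}^n \frac{\partial x_{\alpha,i}}{\partial q_j}\dot{q}_j + \frac{\partial x_{\alpha,i}}{\partial t} \tag{7.18}$$

Taking the square of $\dot{x}_{\alpha,i}$ and inserting into the kinetic energy relation gives

$$T(\mathbf{q}, \dot{\mathbf{q}}, t) = \sum_\alpha \sum_{i,j,k} \frac{1}{2} m_\alpha \frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k}\dot{q}_j\dot{q}_k + \sum_\alpha \sum_{i,j} m_\alpha \frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial t}\dot{q}_j + \sum_\alpha \sum_i \frac{1}{2} m_\alpha \left(\frac{\partial x_{\alpha,i}}{\partial t}\right)^2 \tag{7.19}$$

This can be abbreviated as

$$T(\mathbf{q}, \dot{\mathbf{q}}, t) = T_2(\mathbf{q}, \dot{\mathbf{q}}, t) + T_1(\mathbf{q}, \dot{\mathbf{q}}, t) + T_0(\mathbf{q}, t) \tag{7.20}$$

where

$$T_2(\mathbf{q}, \dot{\mathbf{q}}, t) = \sum_\alpha \sum_{i,j,k} \frac{1}{2} m_\alpha \frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k}\dot{q}_j\dot{q}_k = \sum_{j,k} a_{jk}\dot{q}_j\dot{q}_k \tag{7.21}$$

$$T_1(\mathbf{q}, \dot{\mathbf{q}}, t) = \sum_\alpha \sum_{i,j} m_\alpha \frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial t}\dot{q}_j = \sum_j b_j \dot{q}_j \tag{7.22}$$

$$T_0(\mathbf{q}, t) = \sum_\alpha \sum_i \frac{1}{2} m_\alpha \left(\frac{\partial x_{\alpha,i}}{\partial t}\right)^2 \tag{7.23}$$

where

$$a_{jk} \equiv \sum_{\alpha=1}^N \sum_{i=1}^3 \frac{1}{2} m_\alpha \frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k} \tag{7.24}$$

When the transformed system is scleronomic, time does not appear explicitly in the transformation equations to generalized coordinates since $\frac{\partial x_{\alpha,i}}{\partial t} = 0$. Then $T_1 = T_0 = 0$, and the kinetic energy reduces to a homogeneous quadratic function of the generalized velocities

$$T(\mathbf{q}, \dot{\mathbf{q}}, t) = T_2(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{7.25}$$

A useful relation can be derived by taking the derivative of equation 7.21 with respect to $\dot{q}_l$. That is

$$\frac{\partial T_2(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \dot{q}_l} = \sum_k a_{lk}\dot{q}_k + \sum_j a_{jl}\dot{q}_j \tag{7.26}$$

Multiplying this by $\dot{q}_l$ and summing over $l$ gives

$$\sum_l \dot{q}_l \frac{\partial T_2(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \dot{q}_l} = \sum_{k,l} a_{lk}\dot{q}_k\dot{q}_l + \sum_{j,l} a_{jl}\dot{q}_j\dot{q}_l = 2\sum_{j,k} a_{jk}\dot{q}_j\dot{q}_k = 2T_2$$

Similarly, the products of the generalized velocities $\dot{q}_l$ with the corresponding derivatives of $T_1$ and $T_0$ give

$$\sum_l \dot{q}_l \frac{\partial T_2}{\partial \dot{q}_l} = 2T_2 \tag{7.27}$$

$$\sum_l \dot{q}_l \frac{\partial T_1(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \dot{q}_l} = T_1(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{7.28}$$

$$\sum_l \dot{q}_l \frac{\partial T_0(\mathbf{q}, t)}{\partial \dot{q}_l} = 0 \tag{7.29}$$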



Equation 725 gives that  = 2 when the transformed system is scleronomic, i.e.  = 0 and then the
kinetic energy is a quadratic function of the generalized velocities ̇ . Using the definition of the generalized
momentum equation 73 assuming  = 2 , and that the potential  is velocity independent, gives that
   2
 ≡ = − = (7.30)
 ̇  ̇  ̇  ̇
Then equation 727 reduces to the useful relation that
1X 1
2 = ̇  = q̇ · p (7.31)
2 2


where, for compactness, the summation is abbreviated as a scalar product.
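Relation 7.31 can be checked numerically for a scleronomic system. The sketch below is an illustration under stated assumptions: it uses the standard kinetic energy of a plane double pendulum as the example system, and the masses, string lengths, angles and angular velocities are arbitrary illustrative values. The generalized momenta are estimated by central differences of T with respect to the generalized velocities, and the sum of velocity times momentum is compared with 2T.

```python
import math


def kinetic_energy(q, qdot, m1=1.0, m2=2.0, l1=1.0, l2=0.5):
    """Kinetic energy of a plane double pendulum with angles q = (theta1, theta2).

    The transformation to the angles has no explicit time dependence, so T is
    a homogeneous quadratic function of the generalized velocities (T = T_2).
    The masses and lengths are assumed illustrative values.
    """
    theta1, theta2 = q
    omega1, omega2 = qdot
    return (0.5 * (m1 + m2) * l1 ** 2 * omega1 ** 2
            + 0.5 * m2 * l2 ** 2 * omega2 ** 2
            + m2 * l1 * l2 * omega1 * omega2 * math.cos(theta1 - theta2))


def generalized_momenta(q, qdot, h=1e-5):
    """Central-difference estimate of p_l = dT/d(qdot_l) for each coordinate."""
    momenta = []
    for idx in range(len(qdot)):
        up, down = list(qdot), list(qdot)
        up[idx] += h
        down[idx] -= h
        momenta.append((kinetic_energy(q, up) - kinetic_energy(q, down)) / (2.0 * h))
    return momenta


# Illustrative configuration and velocities (assumed).
q = (0.7, -0.4)
qdot = (1.3, -2.1)

p = generalized_momenta(q, qdot)
T = kinetic_energy(q, qdot)
qdot_dot_p = sum(v * p_l for v, p_l in zip(qdot, p))
print("T              :", T)
print("(1/2) qdot . p :", 0.5 * qdot_dot_p)
```

The two printed numbers agree to within rounding error, as equation 7.31 requires when the kinetic energy is a homogeneous quadratic function of the generalized velocities.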

7.7 Generalized energy and the Hamiltonian function

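Consider the time derivative of the Lagrangian, plus the fact that time is the independent variable in the
Lagrangian. Then the total time derivative is
$$\frac{dL}{dt} = \sum_j \frac{\partial L}{\partial q_j}\dot{q}_j + \sum_j \frac{\partial L}{\partial \dot{q}_j}\ddot{q}_j + \frac{\partial L}{\partial t} \tag{7.32}$$
The Lagrange equations for a conservative force are given by equation 6.60 to be
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) \tag{7.33}$$
The holonomic constraints can be accounted for using the Lagrange multiplier terms while the generalized
force $Q_j^{EXC}$ includes non-holonomic forces or other forces not included in the potential energy term of the
Lagrangian, or holonomic forces not accounted for by the Lagrange multiplier terms.
Substituting equation 7.33 into equation 7.32 gives
$$\frac{dL}{dt} = \sum_j \dot{q}_j\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] + \sum_j \frac{\partial L}{\partial \dot{q}_j}\ddot{q}_j + \frac{\partial L}{\partial t}$$
$$= \sum_j \frac{d}{dt}\left(\dot{q}_j\frac{\partial L}{\partial \dot{q}_j}\right) - \sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] + \frac{\partial L}{\partial t} \tag{7.34}$$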

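This can be written in the form
$$\frac{d}{dt}\left[\sum_j\left(\dot{q}_j\frac{\partial L}{\partial \dot{q}_j}\right) - L\right] = \sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] - \frac{\partial L}{\partial t} \tag{7.35}$$

Define Jacobi's Generalized Energy$^1$ $h(\mathbf{q},\dot{\mathbf{q}},t)$ by
$$h(\mathbf{q},\dot{\mathbf{q}},t) \equiv \sum_j\left(\dot{q}_j\frac{\partial L}{\partial \dot{q}_j}\right) - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.36}$$

The generalized momentum, equation 7.3, can be used to express the generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ in
terms of the canonical variables $q_j$ and $p_j$, plus time $t$. Define the Hamiltonian function to equal the
generalized energy expressed in terms of the conjugate variables $(q_j, p_j)$, that is,
$$H(\mathbf{q},\mathbf{p},t) \equiv h(\mathbf{q},\dot{\mathbf{q}},t) \equiv \sum_j\left(\dot{q}_j\frac{\partial L}{\partial \dot{q}_j}\right) - L(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j \dot{q}_j p_j - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.37}$$

This Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ underlies Hamiltonian mechanics which plays a profoundly important role in
most branches of physics as illustrated in chapters 8, 15 and 18.

$^1$ Most textbooks call the function $h(\mathbf{q},\dot{\mathbf{q}},t)$ Jacobi's energy integral. This book adopts the more descriptive name Generalized
energy in analogy with use of generalized coordinates $\mathbf{q}$ and generalized momentum $\mathbf{p}$.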


7.8 Generalized energy theorem

The Hamilton function, 737 plus equation 735 lead to the generalized energy theorem
" 
#
 (q p) (q q̇ ) X 
X  (q q̇ )
= = ̇  +  (q ) − (7.38)
  
 
=1
h P i

Note that for the special case where all the external forces 
 + 
=1   (q ) = 0, then

 
=− (7.39)
 
h P i

Thus the Hamiltonian is time independent if both   + 
=1   (q ) = 0 and the Lagrangian are
time-independent. For an isolated closed system having no external forces acting, then the Lagrangian is
time independent because the velocities are constant, and there is no external potential energy. That is, the
Lagrangian is time-independent, and
⎡ ⎤
µ ¶
 ⎣X   
̇ − ⎦ = =− =0 (7.40)
   ̇  

As a consequence, the Hamiltonian  (q p)  and generalized energy (q q̇ ), both are constants of motion
if the Lagrangian is a constant of motion, and if the external non-potential forces are zero. This is an example
of Noether’s theorem, where the symmetry of time independence leads to conservation of the conjugate
variable, which is the Hamiltonian or Generalized energy.

7.9 Generalized energy and total energy

The generalized kinetic energy, equation 7.20, can be used to write the generalized Lagrangian as

$$L(\mathbf{q},\dot{\mathbf{q}},t) = T_2(\mathbf{q},\dot{\mathbf{q}},t) + T_1(\mathbf{q},\dot{\mathbf{q}},t) + T_0(\mathbf{q},t) - U(\mathbf{q},t) \tag{7.41}$$

If the potential energy $U$ does not depend explicitly on velocities $\dot{q}_j$ or time, then

$$p_j = \frac{\partial L}{\partial \dot{q}_j} = \frac{\partial (T - U)}{\partial \dot{q}_j} = \frac{\partial T}{\partial \dot{q}_j} \tag{7.42}$$

Equation 7.42 can be used to write the Hamiltonian, equation 7.37, as

$$H(\mathbf{q},\mathbf{p},t) = \sum_j\left(\dot{q}_j\frac{\partial T_2}{\partial \dot{q}_j}\right) + \sum_j\left(\dot{q}_j\frac{\partial T_1}{\partial \dot{q}_j}\right) + \sum_j\left(\dot{q}_j\frac{\partial T_0}{\partial \dot{q}_j}\right) - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.43}$$

Using equations 7.27, 7.28 and 7.29 gives that the total generalized Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ equals

$$H(\mathbf{q},\mathbf{p},t) = 2T_2 + T_1 - (T_2 + T_1 + T_0 - U) = T_2 - T_0 + U \tag{7.44}$$

But the sum of the kinetic and potential energies equals the total energy. Thus equation 7.44 can be rewritten
in the form
$$H(\mathbf{q},\mathbf{p},t) = (T + U) - (T_1 + 2T_0) = E - (T_1 + 2T_0) \tag{7.45}$$
Note that, in general, Jacobi's generalized energy and the Hamiltonian do not equal the total energy $E$. However, in
the special case where the transformation is scleronomic, then $T_1 = T_0 = 0$, and if the potential energy $U$
does not depend explicitly on $\dot{q}_j$, then the generalized energy (Hamiltonian) equals the total energy, that is,
$H = E$. Recognition of the relation between the Hamiltonian and the total energy facilitates determining
the equations of motion.
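A rheonomic system makes the difference between $H$ and $E$ concrete. The sketch below uses an illustrative system that is not taken from the text: a bead of mass `M` sliding without friction on a straight wire that rotates in a horizontal plane at constant angular velocity `OMEGA`, with $U = 0$. Then $T_2 = \frac{1}{2}m\dot{r}^2$, $T_1 = 0$, $T_0 = \frac{1}{2}m\omega^2 r^2$, so equation 7.44 gives $H = T_2 - T_0$ while $E = T_2 + T_0$. The parameter values and the Runge-Kutta integrator are assumptions made only for this check.

```python
def rk4_step(f, y, t, dt):
    """One classical fourth-order Runge-Kutta step for dy/dt = f(y, t)."""
    k1 = f(y, t)
    k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)], t + 0.5 * dt)
    k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)], t + 0.5 * dt)
    k4 = f([yi + dt * ki for yi, ki in zip(y, k3)], t + dt)
    return [yi + dt * (c1 + 2 * c2 + 2 * c3 + c4) / 6.0
            for yi, c1, c2, c3, c4 in zip(y, k1, k2, k3, k4)]

M, OMEGA = 1.0, 2.0   # illustrative mass and angular velocity of the wire

def rhs(y, t):
    r, v = y
    return [v, OMEGA ** 2 * r]   # Lagrange equation: r'' = omega^2 r

def energies(y):
    r, v = y
    t2 = 0.5 * M * v ** 2
    t0 = 0.5 * M * (OMEGA * r) ** 2
    return t2 - t0, t2 + t0      # (H, E) with U = 0

y, t, dt = [1.0, 0.0], 0.0, 1e-3
h_start, e_start = energies(y)
for _ in range(1000):            # integrate to t = 1
    y = rk4_step(rhs, y, t, dt)
    t += dt
h_end, e_end = energies(y)
print(h_start, h_end)   # H stays (numerically) constant
print(e_start, e_end)   # E grows: the constraint force does work on the bead
```

The Lagrangian of this system has no explicit time dependence, so $H$ is conserved, whereas the total energy $E$ is not, in line with equation 7.45.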

7.10 Hamiltonian invariance

Sections 7.8 and 7.9 addressed two important and independent features of the Hamiltonian regarding: a) when
$H$ is conserved, and b) when $H$ equals the total mechanical energy. These important results are summarized
below with a discussion of the assumptions made in deriving the Hamiltonian, as well as the implications.

a) Conservation of generalized energy

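The generalized energy theorem (7.38) was given as
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \frac{dh(\mathbf{q},\dot{\mathbf{q}},t)}{dt} = \sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] - \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{7.46}$$
Note that when $\sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] = 0$, then equation 7.46 reduces to
$$\frac{dH}{dt} = -\frac{\partial L}{\partial t} \tag{7.47}$$
Also, when $\sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] = 0$ and if the Lagrangian is not an explicit function of time,
then the Hamiltonian is a constant of motion. That is, $H$ is conserved if, and only if, the Lagrangian, and
consequently the Hamiltonian, are not explicit functions of time, and if the external forces are zero.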

b) The generalized energy and total energy

If the following two requirements are satisfied
1) The kinetic energy has a homogeneous quadratic dependence on the generalized velocities, that is, the
transformation to generalized coordinates is independent of time, $\partial x_{\alpha,i}/\partial t = 0$.
2) The potential energy is not velocity dependent, thus the terms $\partial U/\partial \dot{q}_i = 0$.
Then equation 7.45 implies that the Hamiltonian equals the total mechanical energy, that is,
$$H = T + U = E \tag{7.48}$$
Expressed in words, the generalized energy (Hamiltonian) equals the total energy if the constraints are
time independent and the potential energy is velocity independent. This is equivalent to stating that, if the
constraints, or generalized coordinates, for the system are time independent, then $H = E$.
The four combinations of the above two independent conditions, assuming that the external forces term
in equation 7.46 is zero, are summarized in table 7.1.

Table 7.1: Hamiltonian and total energy

Hamiltonian time behavior                  Constraints and coordinate transformation
                                           Time independent              Time dependent
$dH/dt = -\partial L/\partial t = 0$       $H$ conserved, $H = E$        $H$ conserved, $H \neq E$
$dH/dt = -\partial L/\partial t \neq 0$    $H$ not conserved, $H = E$    $H$ not conserved, $H \neq E$

Note the following general facts regarding the Lagrangian and the Hamiltonian.
(1) the Lagrangian is indefinite with respect to addition of a constant to the scalar potential,
(2) the Lagrangian is indefinite with respect to addition of a constant velocity,
(3) there is no unique choice of generalized coordinates,
(4) the Hamiltonian is a scalar function that is derived from the Lagrangian scalar function,
(5) the generalized momentum is derived from the Lagrangian.
These facts, plus the ability to recognize the conditions under which $H$ is conserved, and when $H = E$,
can greatly facilitate solving problems as shown by the following examples.

7.5 Example: Linear harmonic oscillator on a cart moving at constant velocity

Consider a linear harmonic oscillator located on a cart that is moving with constant velocity $v_0$ in the $x$ direction, as shown
in the adjacent figure. Let the laboratory frame be the unprimed frame, and the cart frame be designated the primed frame.
Assume that $x = x'$ at $t = 0$. Then
$$x' = x - v_0 t \qquad \dot{x}' = \dot{x} - v_0 \qquad \ddot{x}' = \ddot{x}$$

[Figure: Harmonic oscillator on cart moving at uniform velocity $v_0$.]

The harmonic oscillator will have a potential energy of
$$U = \frac{1}{2}kx'^2 = \frac{1}{2}k(x - v_0 t)^2$$
Laboratory frame: The Lagrangian is
$$L(x,\dot{x},t) = \frac{m\dot{x}^2}{2} - \frac{1}{2}k(x - v_0 t)^2$$
The Lagrange equation $\Lambda_x L = 0$ gives the equation of motion to be
$$m\ddot{x} = -k(x - v_0 t)$$
The definition of generalized momentum gives
$$p = \frac{\partial L}{\partial \dot{x}} = m\dot{x}$$
The Hamiltonian is
$$H(x,p,t) = \sum_i \dot{q}_i\frac{\partial L}{\partial \dot{q}_i} - L = \frac{p^2}{2m} + \frac{1}{2}k(x - v_0 t)^2$$
The Hamiltonian is the sum of the kinetic and potential energies and equals the total energy of the system,
but it is not conserved since $L$ and $H$ are both explicit functions of time, that is $\frac{dH}{dt} = \frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} \neq 0$.
Physically this is understood in that energy must flow into and out of the external constraint keeping the cart
moving uniformly at a constant velocity $v_0$ against the reaction to the oscillating mass. That is, assuming
a uniform velocity for the moving cart constitutes a time-dependent constraint on the mass, and the force of
constraint does work in actual displacement of the complete system. If the constraint did not exist, then the
cart momentum would oscillate such that the total momentum of cart plus spring system is conserved.
Cart frame: Transform the Lagrangian to the primed coordinates in the moving frame of reference,
which also is an inertial frame. Then the Lagrangian $L$, in terms of the moving cart frame coordinates, is
$$L(x',\dot{x}',t) = \frac{m}{2}\left(\dot{x}'^2 + 2\dot{x}'v_0 + v_0^2\right) - \frac{1}{2}kx'^2$$
The Lagrange equation of motion $\Lambda_{x'} L = 0$ gives the equation of motion to be
$$m\ddot{x}' = -kx'$$
where $x'$ is the displacement of the mass with respect to the cart. This implies that an observer on the
cart will observe simple harmonic motion as is to be expected from the principle of equivalence in Galilean
relativity.
The definition of the generalized momentum gives the linear momentum in the primed frame coordinates
to be
$$p' = \frac{\partial L}{\partial \dot{x}'} = m\dot{x}' + mv_0$$
The cart-frame Hamiltonian also can be expressed in terms of the coordinates in the moving frame to be
$$H(x',p',t) = \dot{x}'\frac{\partial L}{\partial \dot{x}'} - L = \frac{(p' - mv_0)^2}{2m} + \frac{1}{2}kx'^2 - \frac{m}{2}v_0^2$$
Note that the Lagrangian and Hamiltonian expressed in terms of the coordinates in the cart frame of reference
are not explicitly time dependent, therefore $H$ is conserved. However, the cart-frame Hamiltonian does not
equal the total energy since the coordinate transformation is time dependent. Actually the first two terms in
the above Hamiltonian are the energy of the harmonic oscillator in the cart frame. This example shows that
the Hamiltonians differ when expressed in terms of either the laboratory or cart frames of reference.
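The two conclusions of this example can be checked numerically. The sketch below integrates the laboratory-frame equation of motion $m\ddot{x} = -k(x - v_0 t)$ and evaluates both the laboratory-frame Hamiltonian, $p^2/2m + \frac{1}{2}k(x - v_0t)^2$, and the cart-frame Hamiltonian, $(p' - mv_0)^2/2m + \frac{1}{2}kx'^2 - \frac{1}{2}mv_0^2$, along the trajectory. The values of `M`, `K`, `V0`, the initial conditions, and the Runge-Kutta integrator are illustrative assumptions.

```python
def rk4_step(f, y, t, dt):
    """One classical fourth-order Runge-Kutta step for dy/dt = f(y, t)."""
    k1 = f(y, t)
    k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)], t + 0.5 * dt)
    k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)], t + 0.5 * dt)
    k4 = f([yi + dt * ki for yi, ki in zip(y, k3)], t + dt)
    return [yi + dt * (c1 + 2 * c2 + 2 * c3 + c4) / 6.0
            for yi, c1, c2, c3, c4 in zip(y, k1, k2, k3, k4)]

M, K, V0 = 1.0, 4.0, 0.5   # illustrative mass, spring constant, cart velocity

def rhs(y, t):
    x, v = y
    return [v, -K * (x - V0 * t) / M]   # lab-frame equation of motion

def hamiltonians(y, t):
    x, v = y
    xp, vp = x - V0 * t, v - V0          # cart-frame position and velocity
    h_lab = 0.5 * M * v ** 2 + 0.5 * K * xp ** 2
    h_cart = 0.5 * M * vp ** 2 + 0.5 * K * xp ** 2 - 0.5 * M * V0 ** 2
    return h_lab, h_cart

y, t, dt = [0.3, V0], 0.0, 1e-3   # mass displaced 0.3 and at rest relative to the cart
lab, cart = [], []
for _ in range(5001):
    h_lab, h_cart = hamiltonians(y, t)
    lab.append(h_lab)
    cart.append(h_cart)
    y = rk4_step(rhs, y, t, dt)
    t += dt

print(max(lab) - min(lab))     # clearly nonzero: lab-frame H is not conserved
print(max(cart) - min(cart))   # essentially zero: cart-frame H is conserved
```

The laboratory-frame value equals the total energy but oscillates in time, while the cart-frame value is constant but differs from the total energy, which is the pattern summarized in table 7.1.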

7.6 Example: Isotropic central force in a rotating frame

Consider a mass $m$ subject to a central isotropic radial force with potential $U(r)$ as shown in the adjacent figure. Compare
the Hamiltonian $H$ in the fixed frame of reference $S$, with the Hamiltonian $H'$ in a frame of reference $S'$ that
is rotating about the center of the force with constant angular velocity $\omega$. Restrict this case to rotation about
one axis so that only two polar coordinates $r$ and $\theta$ need to be considered. The transformations are
$$r' = r \qquad \theta' = \theta - \omega t$$
Also
$$U(r) = U(r')$$

[Figure: Mass subject to radial force.]

Fixed frame of reference $S$:
$$L = T - U = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2\right) - U(r)$$
Since the Lagrangian is not explicitly time dependent, then the Hamiltonian is conserved. For this fixed-frame
Hamiltonian the generalized momenta are
$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta}$$
$$p_r = \frac{\partial L}{\partial \dot{r}} = m\dot{r}$$
The Hamiltonian equals
$$H(r,\theta,p_r,p_\theta) = \sum_i \dot{q}_i\frac{\partial L}{\partial \dot{q}_i} - L = \frac{1}{2m}\left(p_r^2 + \frac{p_\theta^2}{r^2}\right) + U(r) = E$$
The Hamiltonian in the fixed frame is conserved and equals the total energy, that is $H = T + U$.
Rotating frame of reference $S'$:
The above inertial fixed-frame Lagrangian can be written in terms of the primed (non-inertial rotating
frame) coordinates as
$$L = T - U = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2\right) - U(r) = \frac{m}{2}\left(\dot{r}'^2 + r'^2\left(\dot{\theta}' + \omega\right)^2\right) - U(r')$$
The generalized momenta derived from this Lagrangian are
$$p'_\theta = \frac{\partial L}{\partial \dot{\theta}'} = mr'^2\left(\dot{\theta}' + \omega\right) = mr'^2\dot{\theta}' + mr'^2\omega$$
$$p'_r = \frac{\partial L}{\partial \dot{r}'} = m\dot{r}' = p_r$$
The Hamiltonian expressed in terms of the non-inertial rotating frame coordinates is
$$H'(r',\theta',p'_r,p'_\theta) = p'_r\dot{r}' + p'_\theta\dot{\theta}' - L = \frac{1}{2m}\left(p_r'^2 + \frac{p_\theta'^2}{r'^2}\right) - \omega p'_\theta + U(r')$$
Note that $H'(r',\theta',p'_r,p'_\theta)$ is time independent and therefore is conserved, but $H'(r',\theta',p'_r,p'_\theta) \neq E$ because
the generalized coordinates are time dependent. In addition, $p'_\theta$ is conserved since
$$\dot{p}'_\theta = \frac{\partial L}{\partial \theta'} = -\frac{\partial H'}{\partial \theta'} = 0$$
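As a consistency check on the rotating-frame example, the following short derivation (a sketch that only uses the relations $r' = r$, $\theta' = \theta - \omega t$ and the standard polar-coordinate momenta) compares the rotating-frame Hamiltonian $H'$ with the fixed-frame result and with equation 7.45. Since $\dot{\theta}' + \omega = \dot{\theta}$, the rotating-frame momentum $p'_\theta = mr'^2(\dot{\theta}' + \omega)$ equals the fixed-frame angular momentum $p_\theta = mr^2\dot{\theta}$.

```latex
\begin{align*}
H' &= p'_r\dot{r}' + p'_\theta\dot{\theta}' - L
    = \frac{p_r'^2}{m} + p'_\theta\left(\frac{p'_\theta}{mr'^2} - \omega\right)
      - \left[\frac{p_r'^2}{2m} + \frac{p_\theta'^2}{2mr'^2} - U(r')\right] \\
   &= \frac{1}{2m}\left(p_r'^2 + \frac{p_\theta'^2}{r'^2}\right) + U(r') - \omega p'_\theta
    = E - \omega p_\theta \\[4pt]
T_1 + 2T_0 &= m r'^2\omega\dot{\theta}' + m r'^2\omega^2
            = \omega\, m r'^2\left(\dot{\theta}' + \omega\right) = \omega p_\theta
\end{align*}
```

The second line reproduces $H = E - (T_1 + 2T_0)$ of equation 7.45, and shows that $H'$ is conserved because both $E$ and $p_\theta$ are conserved.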

7.7 Example: The plane pendulum

The simple plane pendulum in a uniform gravitational field $g$ is an example that illustrates Hamiltonian
invariance. There is only one generalized coordinate, $\theta$, and the Lagrangian for this system is
$$L = \frac{1}{2}ml^2\dot{\theta}^2 + mgl\cos\theta$$

[Figure: The plane pendulum constrained to oscillate in a vertical plane in a uniform gravitational field.]

The momentum conjugate to $\theta$ is
$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = ml^2\dot{\theta}$$
which is the angular momentum about the pivot point. Using the Lagrange-Euler equation this gives that
$$\frac{dp_\theta}{dt} = \dot{p}_\theta = \frac{\partial L}{\partial \theta} = -mgl\sin\theta$$
Note that the angular momentum $p_\theta$ is not a constant of motion since the Lagrangian depends explicitly on $\theta$.
The Hamiltonian is
$$H = \sum_i p_i\dot{q}_i - L = p_\theta\dot{\theta} - L = \frac{1}{2}ml^2\dot{\theta}^2 - mgl\cos\theta = \frac{p_\theta^2}{2ml^2} - mgl\cos\theta$$

Note that the Lagrangian and Hamiltonian are not explicit functions of time, therefore the Hamiltonian is conserved.
Also the potential is velocity independent and the transformation to the generalized coordinate is time independent, thus the Hamiltonian
equals the total energy $E$ which is a constant of motion.
$$H = \frac{p_\theta^2}{2ml^2} - mgl\cos\theta = E$$

7.8 Example: Oscillating cylinder in a cylindrical bowl

It is important to correctly account for constraint forces when using Noether's theorem for constrained systems.
Noether's theorem assumes the variables are independent. This is illustrated by considering
the example of a solid cylinder rolling in a fixed cylindrical bowl. Assume that a uniform cylinder of radius $\rho$
and mass $m$ is constrained to roll without slipping on the inner surface of the lower half of a hollow cylinder of radius $R$.
The motion is constrained to ensure that the axes of both cylinders remain parallel and $R > \rho$.
The generalized coordinates are taken to be the angles $\theta$ and $\phi$
which are measured with respect to a fixed vertical axis. Then the
kinetic energy and potential energy are
$$T = \frac{1}{2}m\left[(R - \rho)\dot{\theta}\right]^2 + \frac{1}{2}I\dot{\phi}^2 \qquad U = \left[R - (R - \rho)\cos\theta\right]mg$$
where $m$ is the mass of the small cylinder and where $\theta = 0$ at the lowest position of the cylinder. The moment
of inertia of a uniform cylinder is $I = \frac{1}{2}m\rho^2$.
The Lagrangian is
$$L = T - U = \frac{1}{2}m\left[(R - \rho)\dot{\theta}\right]^2 + \frac{1}{4}m\rho^2\dot{\phi}^2 - \left[R - (R - \rho)\cos\theta\right]mg$$
Since the solid cylinder rotates without slipping inside the cylindrical shell, then the equation of constraint is
$$g(\theta,\phi) = R\theta - \rho(\phi + \theta) = 0$$
Using the Lagrangian, plus the one equation of constraint, requires one Lagrange multiplier. Then the
Lagrange equations of motion for $\theta$ and $\phi$ are
$$\left[\frac{\partial L}{\partial \theta} - \frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}}\right] + \lambda\frac{\partial g}{\partial \theta} = 0$$
$$\left[\frac{\partial L}{\partial \phi} - \frac{d}{dt}\frac{\partial L}{\partial \dot{\phi}}\right] + \lambda\frac{\partial g}{\partial \phi} = 0$$
Substituting the Lagrangian and the equation of constraint gives two equations of motion
$$-(R - \rho)mg\sin\theta - m(R - \rho)^2\ddot{\theta} + \lambda(R - \rho) = 0$$
$$-\frac{1}{2}m\rho^2\ddot{\phi} - \lambda\rho = 0$$
The lower equation of motion gives that
$$\lambda = -\frac{1}{2}m\rho\ddot{\phi}$$
Substituting the equation of constraint, $\rho\ddot{\phi} = (R - \rho)\ddot{\theta}$, into this gives
$$\lambda = -\frac{1}{2}m(R - \rho)\ddot{\theta}$$
Substituting this into the first equation of motion gives the equation of motion for $\theta$ to be
$$\ddot{\theta} = -\frac{2g}{3(R - \rho)}\sin\theta$$
that is
$$\lambda = \frac{mg}{3}\sin\theta$$
The torque acting on the small cylinder due to the frictional force is
$$F\rho = \frac{1}{2}m\rho^2\ddot{\phi} = -\lambda\rho$$
Thus the frictional force is
$$F = -\lambda = -\frac{mg}{3}\sin\theta$$
which has magnitude $\frac{mg}{3}\sin\theta$.
Noether's theorem can be used to ascertain if the angular momentum $p_\theta$ is a constant of motion. The
derivative of the Lagrangian
$$\frac{\partial L}{\partial \theta} = -(R - \rho)mg\sin\theta$$
is non-zero, and thus the Lagrange equations tell us that $\dot{p}_\theta \neq 0$. Therefore $p_\theta$ is not a constant of motion.
The Lagrangian is not an explicit function of $\phi$ which would suggest that $p_\phi$ is a constant of motion.
But this is incorrect because the constraint equation $\phi = \frac{(R - \rho)}{\rho}\theta$ couples $\theta$ and $\phi$, that is, they are not
independent variables, and thus $p_\theta$ and $p_\phi$ are coupled by the constraint equation. As a result $p_\phi$ is not a
constant of motion because it is directly coupled to $p_\theta$ which is not a constant of motion.
Thus neither $p_\theta$ nor $p_\phi$ are constants of motion. This illustrates that one must account carefully for equations
of constraint, and the concomitant constraint forces, when applying Noether's theorem which tacitly assumes
independent variables.
The Hamiltonian can be derived using the generalized momenta
$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m(R - \rho)^2\dot{\theta}$$
$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = \frac{1}{2}m\rho^2\dot{\phi}$$
Then the Hamiltonian is given by
$$H = p_\theta\dot{\theta} + p_\phi\dot{\phi} - L = \frac{p_\theta^2}{2m(R - \rho)^2} + \frac{p_\phi^2}{m\rho^2} + \left[R - (R - \rho)\cos\theta\right]mg$$
Note that the transformation to generalized coordinates is time independent and the potential is not velocity
dependent, thus the Hamiltonian also equals the total energy. Also the Hamiltonian is conserved since
$\frac{dH}{dt} = 0$.
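The conclusions of the rolling-cylinder example can be illustrated numerically. The sketch below integrates the equation of motion for $\theta$ that follows from the two Lagrange equations and the rolling constraint, namely $\frac{3}{2}(R - \rho)\ddot{\theta} = -g\sin\theta$, obtains $\dot{\phi}$ from the constraint $\rho\dot{\phi} = (R - \rho)\dot{\theta}$, and monitors the Hamiltonian and $p_\phi$. The numerical values of `M`, `G`, `R`, `RHO`, the initial angle, and the Runge-Kutta integrator are illustrative assumptions.

```python
import math

def rk4_step(f, y, t, dt):
    """One classical fourth-order Runge-Kutta step for dy/dt = f(y, t)."""
    k1 = f(y, t)
    k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)], t + 0.5 * dt)
    k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)], t + 0.5 * dt)
    k4 = f([yi + dt * ki for yi, ki in zip(y, k3)], t + dt)
    return [yi + dt * (c1 + 2 * c2 + 2 * c3 + c4) / 6.0
            for yi, c1, c2, c3, c4 in zip(y, k1, k2, k3, k4)]

M, G, R, RHO = 1.0, 9.81, 1.0, 0.2   # illustrative mass, gravity, bowl and cylinder radii

def rhs(y, t):
    theta, theta_dot = y
    return [theta_dot, -2.0 * G * math.sin(theta) / (3.0 * (R - RHO))]

def observables(y):
    theta, theta_dot = y
    phi_dot = (R - RHO) * theta_dot / RHO            # rolling constraint
    p_theta = M * (R - RHO) ** 2 * theta_dot
    p_phi = 0.5 * M * RHO ** 2 * phi_dot
    h = (p_theta ** 2 / (2 * M * (R - RHO) ** 2)
         + p_phi ** 2 / (M * RHO ** 2)
         + (R - (R - RHO) * math.cos(theta)) * M * G)
    return h, p_phi

y, t, dt = [0.5, 0.0], 0.0, 1e-3   # released from rest at theta = 0.5 rad
h_values, p_phi_values = [], []
for _ in range(3001):
    h, p_phi = observables(y)
    h_values.append(h)
    p_phi_values.append(p_phi)
    y = rk4_step(rhs, y, t, dt)
    t += dt

print(max(h_values) - min(h_values))           # essentially zero: H is conserved
print(max(p_phi_values) - min(p_phi_values))   # nonzero: p_phi is not conserved
```

Although $\phi$ does not appear in the Lagrangian, $p_\phi$ oscillates because the constraint ties it to $\dot{\theta}$, while the Hamiltonian remains constant.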

7.11 Hamiltonian for cyclic coordinates


It is interesting to discuss the properties of the Hamiltonian for cyclic coordinates $q_k$ for which $\frac{\partial L}{\partial q_k} = 0$.
Ignoring the external and Lagrange multiplier terms,
$$\dot{p}_k = \frac{\partial L}{\partial q_k} = -\frac{\partial H}{\partial q_k} = 0 \tag{7.49}$$
That is, a cyclic coordinate has a constant corresponding momentum $p_k$ for the Hamiltonian as well as
for the Lagrangian. Conversely, if a generalized coordinate does not occur in the Hamiltonian, then the
corresponding generalized momentum is conserved. Cyclic coordinates were discussed earlier when discussing
symmetries and conservation-law aspects of the Lagrangian. For example, if the Lagrangian, or Hamiltonian,
do not depend on a linear coordinate $x$, then $p_x$ is conserved. Similarly for $\theta$ and $p_\theta$. An extension of this
principle has been derived for the relationship between time independence and total energy of a system,
that is, the Hamiltonian equals the total energy if the transformation to generalized coordinates is time
independent and the potential is velocity independent.
A valuable feature of the Hamiltonian formulation is that it allows elimination of cyclic variables which
reduces the number of degrees of freedom to be handled. As a consequence, cyclic variables are called
ignorable variables in Hamiltonian mechanics. For example, consider that the Lagrangian has one cyclic
variable $q_n$. As a consequence, the Lagrangian does not depend on $q_n$, and thus it can be written as
$L = L(q_1,\ldots,q_{n-1};\dot{q}_1,\ldots,\dot{q}_n;t)$. The Lagrangian still contains $n$ generalized velocities, thus one still has to
treat $n$ degrees of freedom even though one degree of freedom $q_n$ is cyclic. However, in the Hamiltonian
formulation, only $n - 1$ degrees of freedom are required since the momentum for the cyclic degree of freedom
is a constant $p_n = \alpha$. Thus the Hamiltonian can be written as $H = H(q_1,\ldots,q_{n-1};p_1,\ldots,p_{n-1};\alpha;t)$, that is,
the Hamiltonian includes only $n - 1$ degrees of freedom. Thus the dimension of the problem has been reduced
by one since the conjugate cyclic (ignorable) variables $(q_n, p_n)$ are eliminated. Hamiltonian mechanics can
significantly reduce the dimension of the problem when the system involves several cyclic variables. This is
in contrast to the situation for the Lagrangian approach as discussed in chapters 8 and 15.
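The reduction in dimension can be sketched with the planar central-force problem of example 7.6. There $\theta$ is cyclic, so $p_\theta$ is a constant, written `ELL` below, and the Hamiltonian $H = \frac{p_r^2}{2m} + \frac{p_\theta^2}{2mr^2} + U(r)$ involves only the single pair $(r, p_r)$. The sketch assumes an illustrative potential $U(r) = \frac{1}{2}kr^2$ and uses Hamilton's equations $\dot{r} = \partial H/\partial p_r$, $\dot{p}_r = -\partial H/\partial r$, which the book develops in chapter 8; the parameter values and the integrator are likewise assumptions.

```python
def rk4_step(f, y, t, dt):
    """One classical fourth-order Runge-Kutta step for dy/dt = f(y, t)."""
    k1 = f(y, t)
    k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)], t + 0.5 * dt)
    k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)], t + 0.5 * dt)
    k4 = f([yi + dt * ki for yi, ki in zip(y, k3)], t + dt)
    return [yi + dt * (c1 + 2 * c2 + 2 * c3 + c4) / 6.0
            for yi, c1, c2, c3, c4 in zip(y, k1, k2, k3, k4)]

M, K = 1.0, 1.0   # illustrative mass and force constant for U(r) = K r^2 / 2
ELL = 1.0         # conserved momentum p_theta of the cyclic coordinate theta

def rhs(y, t):
    r, p_r = y
    # r' = dH/dp_r,  p_r' = -dH/dr; theta and p_theta no longer appear as variables.
    return [p_r / M, ELL ** 2 / (M * r ** 3) - K * r]

def hamiltonian(y):
    r, p_r = y
    return p_r ** 2 / (2 * M) + ELL ** 2 / (2 * M * r ** 2) + 0.5 * K * r ** 2

y, t, dt = [2.0, 0.0], 0.0, 5e-4
h_values, r_values = [], []
for _ in range(10001):
    h_values.append(hamiltonian(y))
    r_values.append(y[0])
    y = rk4_step(rhs, y, t, dt)
    t += dt

print(max(h_values) - min(h_values))   # essentially zero: reduced H is conserved
print(min(r_values), max(r_values))    # radial motion between the turning points
```

Only two first-order equations are integrated instead of four; the angle can be recovered afterwards, if needed, by integrating $\dot{\theta} = p_\theta/mr^2$.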

7.12 Symmetries and invariance

This chapter has shown that the symmetries of a system lead to invariance of physical quantities as was proposed
by Noether. The symmetry properties of the Lagrangian can lead to the conservation laws summarized
in table 7.2.

Table 7.2: Symmetries and conservation laws in classical mechanics

Symmetry               Lagrange property           Conserved quantity
Spatial homogeneity    Translational invariance    Linear momentum
Spatial isotropy       Rotational invariance       Angular momentum
Time invariance        Time independence           Total energy

The importance of the relations between invariance and symmetry cannot be overemphasized. It extends
beyond classical mechanics to quantum physics and field theory. For a three-dimensional closed system,
there are three possible constants for linear momentum, three for angular momentum, and one for energy. It
is especially interesting in that these, and only these, seven integrals have the property that they are additive
for the particles comprising a system, and this occurs independent of whether there is an interaction among
the particles. That is, this behavior is obeyed by the whole ensemble of particles for finite systems. Because
of their profound importance to physics, these relations between symmetry and invariance are used extensively.

7.13 Hamiltonian in classical mechanics

The Hamiltonian was defined by equation 7.37 during the discussion of time invariance and energy conservation.
The Hamiltonian is of much more profound importance to physics than implied by the ad hoc definition
given by equation 7.37. This relates to the fact that the Hamiltonian is written in terms of the fundamental
coordinate $\mathbf{q}$ and its generalized momentum $\mathbf{p}$ defined by equation 7.3.
It is more convenient to write the $n$ generalized coordinates $q_j$ plus their generalized momenta $p_j$ as
vectors, e.g. $\mathbf{q} \equiv (q_1, q_2, \ldots, q_n)$, $\mathbf{p} \equiv (p_1, p_2, \ldots, p_n)$. The generalized momenta conjugate to the coordinate $q_j$,
defined by 7.3, then can be written in the form
$$p_j = \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial \dot{q}_j} \tag{7.50}$$
Substituting this definition of the generalized momentum into the Hamiltonian defined in (7.37), and
expressing it in terms of the coordinate $\mathbf{q}$ and its conjugate generalized momenta $\mathbf{p}$, leads to
$$H(\mathbf{q},\mathbf{p},t) = \sum_j p_j\dot{q}_j - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.51}$$
$$= \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.52}$$
Note that the scalar product $\mathbf{p}\cdot\dot{\mathbf{q}} = \sum_j p_j\dot{q}_j$ equals $2T$ for systems that are scleronomic and when the
potential is velocity independent.
The crucial feature of the Hamiltonian is that it is expressed as $H(\mathbf{q},\mathbf{p},t)$, that is, it is a function
of the $n$ generalized coordinates $\mathbf{q}$ and their conjugate momenta $\mathbf{p}$, which are taken to be independent, in
addition to the independent variable, $t$. This is in contrast to the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$ which is a function
of the $n$ generalized coordinates $q_j$, the corresponding velocities $\dot{q}_j$, and time $t$. The velocities $\dot{\mathbf{q}}$ are the
time derivatives of the coordinates $\mathbf{q}$ and thus these are related. In physics, the fundamental conjugate
coordinates are $(\mathbf{q},\mathbf{p})$ which are the coordinates underlying the Hamiltonian. This is in contrast to $(\mathbf{q},\dot{\mathbf{q}})$
which are the coordinates that underlie the Lagrangian. Thus the Hamiltonian is more fundamental than
the Lagrangian and is a reason why Hamiltonian mechanics, rather than Lagrangian mechanics, was
used as the foundation for development of quantum and statistical mechanics.
Hamiltonian mechanics will be derived two other ways. Chapter 8 uses the Legendre transformation
between the conjugate variables $(\mathbf{q},\dot{\mathbf{q}},t)$ and $(\mathbf{q},\mathbf{p},t)$ where the generalized coordinate $\mathbf{q}$ and its conjugate
generalized momentum $\mathbf{p}$ are independent. This shows that Hamiltonian mechanics is based on the same
variational principles as those used to derive Lagrangian mechanics. Chapter 9 derives Hamiltonian mechanics
directly from Hamilton's Principle of Least action. Chapter 8 will introduce the algebraic Hamiltonian
mechanics, that is based on the Hamiltonian. The powerful capabilities provided by Hamiltonian mechanics
will be described in chapter 15.

7.14 Summary
This chapter has explored the importance of symmetries and invariance in Lagrangian mechanics and has
introduced the Hamiltonian. The following summarizes the important conclusions derived in this chapter.
Noether’s theorem:
Noether's theorem explores the remarkable connection between symmetry, plus the invariance of a system
under transformation, and related conservation laws which imply the existence of important physical
principles, and constants of motion. Transformations where the equations of motion are invariant are called
invariant transformations. Variables that are invariant to a transformation are called cyclic variables. It
was shown that if the Lagrangian does not explicitly contain a particular coordinate of displacement, $q_i$, then
the corresponding conjugate momentum, $p_i$, is conserved. This is Noether's theorem which states "For each
symmetry of the Lagrangian, there is a conserved quantity". In particular it was shown that translational
invariance in a given direction leads to the conservation of linear momentum in that direction, and rotational
invariance about an axis leads to conservation of angular momentum about that axis. These are the first-order
spatial and angular integrals of the equations of motion. Noether's theorem also relates the properties
of the Hamiltonian to time invariance of the Lagrangian, namely:
(1) $H$ is conserved if, and only if, the Lagrangian, and consequently the Hamiltonian, are not explicit
functions of time.
(2) The Hamiltonian gives the total energy if the constraints and coordinate transformations are time
independent and the potential energy is velocity independent. This is equivalent to stating that $H = E$ if the
constraints, or generalized coordinates, for the system are time independent.
Noether's theorem is of importance since it underlies the relation between symmetries, and invariance in
all of physics; that is, its applicability extends beyond classical mechanics.

Generalized momentum:
The generalized momentum associated with the coordinate $q_i$ is defined to be
$$\frac{\partial L}{\partial \dot{q}_i} \equiv p_i \tag{7.3}$$
where $p_i$ is also called the conjugate momentum (or canonical momentum) to $q_i$, where $q_i, p_i$ are
conjugate, or canonical, variables. Remember that the linear momentum $\mathbf{p}$ is the first-order time integral
given by equation 2.10. Note that if $q_i$ is not a spatial coordinate, then $p_i$ is not linear momentum, but is
the conjugate momentum. For example, if $q_i$ is an angle, then $p_i$ will be angular momentum.
Kinetic energy in generalized coordinates:
It was shown that the kinetic energy can be expressed in terms of generalized coordinates by
$$T(\mathbf{q},\dot{\mathbf{q}},t) = \sum_\alpha\sum_{i,j,k}\frac{1}{2}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial q_k}\dot{q}_j\dot{q}_k + \sum_\alpha\sum_{i,j}m_\alpha\frac{\partial x_{\alpha,i}}{\partial q_j}\frac{\partial x_{\alpha,i}}{\partial t}\dot{q}_j + \sum_\alpha\sum_i\frac{1}{2}m_\alpha\left(\frac{\partial x_{\alpha,i}}{\partial t}\right)^2 \tag{7.19}$$
$$= T_2(\mathbf{q},\dot{\mathbf{q}},t) + T_1(\mathbf{q},\dot{\mathbf{q}},t) + T_0(\mathbf{q},t) \tag{7.53}$$

For scleronomic systems with a potential that is velocity independent, then the kinetic energy can be
expressed as
$$T = T_2 = \frac{1}{2}\sum_l \dot{q}_l p_l = \frac{1}{2}\dot{\mathbf{q}}\cdot\mathbf{p} \tag{7.31}$$

Generalized energy
Jacobi's Generalized Energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ was defined as
$$h(\mathbf{q},\dot{\mathbf{q}},t) \equiv \sum_j\left(\dot{q}_j\frac{\partial L}{\partial \dot{q}_j}\right) - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.36}$$

Hamiltonian function
The Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ was defined in terms of the generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ and by introducing
the generalized momentum. That is
$$H(\mathbf{q},\mathbf{p},t) \equiv h(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j p_j\dot{q}_j - L(\mathbf{q},\dot{\mathbf{q}},t) = \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.37}$$

Generalized energy theorem

The equations of motion lead to the generalized energy theorem which states that the time dependence
of the Hamiltonian is related to the time dependence of the Lagrangian.
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j \dot{q}_j\left[Q_j^{EXC} + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] - \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{7.38}$$

Note that if all the generalized non-potential forces are zero, then the bracket in equation 7.38 is zero, and
if the Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion.
Generalized energy and total energy:
The generalized energy, and corresponding Hamiltonian, equal the total energy if:
1) The kinetic energy has a homogeneous quadratic dependence on the generalized velocities and the
transformation to generalized coordinates is independent of time, $\partial x_{\alpha,i}/\partial t = 0$.
2) The potential energy is not velocity dependent, thus the terms $\partial U/\partial \dot{q}_i = 0$.
Chapter 8 will introduce Hamiltonian mechanics that is built on the Hamiltonian, and chapter 15 will
explore applications of Hamiltonian mechanics.

Workshop exercises
1. Consider a particle of mass $m$ moving in a plane and subject to an inverse square attractive force.

(a) Obtain the equations of motion.
(b) Is the angular momentum about the origin conserved?
(c) Obtain expressions for the generalized forces.

2. Consider a Lagrangian function of the form $L(q_i, \dot{q}_i, \ddot{q}_i, t)$. Here the Lagrangian contains a time derivative
of the generalized coordinates that is higher than the first. When working with such Lagrangians, the term
"generalized mechanics" is used.

3. A uniform solid cylinder of radius $R$ and mass $M$ rests on a horizontal plane and an identical cylinder rests
on it touching along the top of the first cylinder with the axes of both cylinders parallel. The upper cylinder
is given an infinitesimal displacement so that both cylinders roll without slipping in the directions shown by
the arrows.

(a) Find the Lagrangian for this system.
(b) What are the constants of motion?
(c) Show that as long as the cylinders remain in contact then
$$\dot{\theta}^2 = \frac{12g(1 - \cos\theta)}{R(17 + 4\cos\theta - 4\cos^2\theta)}$$

[Figure: the two cylinders at $t = 0$ and at $t > 0$, with $x$ and $y$ axes.]

4. Consider a diatomic molecule which has a symmetry axis along the line through the center of the two atoms
comprising the molecule. Consider that this molecule is rotating about an axis perpendicular to the symmetry
axis and that there are no external forces acting on the molecule. Use Noether's Theorem to answer the
following questions:
a) Is the total angular momentum conserved?
b) Is the projection of the total angular momentum along a space-fixed $z$ axis conserved?
c) Is the projection of the angular momentum along the symmetry axis of the rotating molecule conserved?
d) Is the projection of the angular momentum perpendicular to the rotating symmetry axis conserved?

5. A bead of mass $m$ slides under gravity along a smooth wire bent in the shape of a parabola $x^2 = az$ in the
vertical $(x, z)$ plane.

(a) What kind (holonomic, nonholonomic, scleronomic, rheonomic) of constraint acts on $m$?
(b) Set up Lagrange's equation of motion for $x$ with the constraint embedded.
(c) Set up Lagrange's equations of motion for both $x$ and $z$ with the constraint adjoined and a Lagrangian
multiplier $\lambda$ introduced.
(d) Show that the same equation of motion for $x$ results from either of the methods used in part (b) or part
(c).
(e) Express $\lambda$ in terms of $x$ and $\dot{x}$.
(f) What are the $x$ and $z$ components of the force of constraint in terms of $x$ and $\dot{x}$?

Problems
1. Let the horizontal plane be the $x$-$y$ plane. A bead of mass $m$ is constrained to slide with speed $v$ along a
curve described by the function $y = f(x)$. What force does the curve apply to the bead? (Ignore gravity)

2. Consider the Atwood's machine shown. The masses are $4m$, $5m$, and $3m$. Let $x$ and $y$ be the heights of the
right two masses relative to their initial positions.
a) Solve this problem using the Euler-Lagrange equations.
b) Use Noether's theorem to find the conserved momentum.

[Figure: Atwood's machine; the right two masses, $5m$ and $3m$, are at heights $x$ and $y$.]

3. A cube of side $2b$ and center of mass $C$ is placed on a fixed horizontal cylinder of radius $r$ and center $O$ as
shown in the figure. Originally the cube is placed such that $C$ is centered above $O$ but it can roll from side to
side without slipping. (a) Assuming that $b < r$, use the Lagrangian approach to find the frequency for small
oscillations about the top of the cylinder. For simplicity make the small angle approximation for $L$ before using
the Lagrange-Euler equations. (b) What will be the motion if $b > r$? Note that the moment of inertia of the
cube about the center of mass is $\frac{2}{3}mb^2$.

[Figure: cube resting on the fixed cylinder, with labels h, b and O.]

4. Two equal masses of mass $m$ are glued to a massless hoop of radius $R$ that is free to rotate about its center in a
vertical plane. The angle between the masses is $2\theta$, as shown. Find the frequency of oscillations.

5. Three massless sticks, each of length $2r$ and each carrying a mass $m$ at the center of the stick, are
hinged at their ends as shown. The bottom end of the lower stick is hinged at the ground. They are held so
that the lower two sticks are vertical, and the upper one is tilted at a small angle $\varepsilon$ with respect to the vertical.
They are then released. At the instant of release what are the three equations of motion derived from the
Lagrangian, assuming that $\varepsilon$ is small? Use these to determine the initial angular accelerations of the
three sticks.
Chapter 8

Hamiltonian mechanics

8.1 Introduction
The three major formulations of classical mechanics are
1. Newtonian mechanics, which is the most intuitive vector formulation used in classical mechanics.
2. Lagrangian mechanics, which is a powerful algebraic formulation of classical mechanics derived using either d’Alembert’s Principle or Hamilton’s Principle. The latter states that a dynamical system follows a path for which the time integral of the difference between the kinetic and potential energies is stationary, usually a minimum.
3. Hamiltonian mechanics, which has a beautiful superstructure that, like Lagrangian mechanics, is built upon variational calculus, Hamilton’s principle, and Lagrangian mechanics.
Hamiltonian mechanics is introduced at this juncture since it is closely interwoven with Lagrangian mechanics. It plays a fundamental role in modern physics, but discussion of that role is deferred until chapters 15 and 18, where applications to modern physics are addressed.
The following important concepts were introduced in chapter 7:
The generalized momentum was defined to be given by
$$p_j \equiv \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial \dot{q}_j} \tag{8.1}$$
Note that, as discussed in chapter 7.2, if the potential is velocity dependent, such as for the Lorentz force, then the generalized momentum includes terms in addition to the usual mechanical momentum.
Jacobi’s generalized energy function $h(\mathbf{q},\dot{\mathbf{q}},t)$ was introduced where
$$h(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j \left(\dot{q}_j \frac{\partial L}{\partial \dot{q}_j}\right) - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{8.2}$$
The Hamiltonian function was defined to be given by expressing the generalized energy function, equation 8.2, in terms of the generalized momentum. That is, the Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ is expressed as
$$H(\mathbf{q},\mathbf{p},t) = \sum_j p_j \dot{q}_j - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{8.3}$$
The symbols $\mathbf{q}$, $\mathbf{p}$ designate vectors of $n$ generalized coordinates, $\mathbf{q} \equiv (q_1, q_2, \ldots, q_n)$, $\mathbf{p} \equiv (p_1, p_2, \ldots, p_n)$. Equation 8.3 can be written compactly in a symmetric form using the scalar product $\mathbf{p}\cdot\dot{\mathbf{q}} = \sum_j p_j \dot{q}_j$.
$$H(\mathbf{q},\mathbf{p},t) + L(\mathbf{q},\dot{\mathbf{q}},t) = \mathbf{p}\cdot\dot{\mathbf{q}} \tag{8.4}$$
A crucial feature of Hamiltonian mechanics is that the Hamiltonian is expressed as $H(\mathbf{q},\mathbf{p},t)$, that is, it is a function of the $n$ generalized coordinates and their conjugate momenta, which are taken to be independent, plus the independent variable, time. This contrasts with the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$, which is a function of the $n$ generalized coordinates $q_j$ and the corresponding velocities $\dot{q}_j$, that is, the time derivatives of the coordinates $q_j$, plus the independent variable, time.


8.2 Legendre Transformation between Lagrangian and Hamiltonian mechanics
Hamiltonian mechanics can be derived directly from Lagrangian mechanics by considering the Legendre transformation between the conjugate variables $(\mathbf{q},\dot{\mathbf{q}},t)$ and $(\mathbf{q},\mathbf{p},t)$. Such a derivation is of considerable importance in that it shows that Hamiltonian mechanics is based on the same variational principles as those used to derive Lagrangian mechanics; that is, d’Alembert’s Principle and Hamilton’s Principle. The general problem of converting Lagrange’s equations into the Hamiltonian form hinges on the inversion of equation (8.1) that defines the generalized momentum $\mathbf{p}$. This inversion is simplified by the fact that (8.1) is the first partial derivative of the Lagrangian scalar function $L(\mathbf{q},\dot{\mathbf{q}},t)$.
As described in the appendix (section 4), consider transformations between two functions $F(\mathbf{u},\mathbf{w})$ and $G(\mathbf{v},\mathbf{w})$, where $\mathbf{u}$ and $\mathbf{v}$ are the active variables related by the functional form

$$\mathbf{v} = \nabla_{\mathbf{u}} F(\mathbf{u},\mathbf{w}) \tag{8.5}$$

and where $\mathbf{w}$ designates passive variables. The function $\nabla_{\mathbf{u}} F(\mathbf{u},\mathbf{w})$ is the first-order derivative (gradient) of $F(\mathbf{u},\mathbf{w})$ with respect to the components of the vector $\mathbf{u}$. The Legendre transform states that the inverse formula can always be written as a first-order derivative

$$\mathbf{u} = \nabla_{\mathbf{v}} G(\mathbf{v},\mathbf{w}) \tag{8.6}$$

The function $G(\mathbf{v},\mathbf{w})$ is related to $F(\mathbf{u},\mathbf{w})$ by the symmetric relation

$$G(\mathbf{v},\mathbf{w}) + F(\mathbf{u},\mathbf{w}) = \mathbf{u}\cdot\mathbf{v} \tag{8.7}$$

where the scalar product $\mathbf{u}\cdot\mathbf{v} = \sum_{i=1}^{N} u_i v_i$.
Furthermore the first-order derivatives with respect to all the passive variables $w_i$ are related by

$$\nabla_{\mathbf{w}} F(\mathbf{u},\mathbf{w}) = -\nabla_{\mathbf{w}} G(\mathbf{v},\mathbf{w}) \tag{8.8}$$

The relationship between the functions $F(\mathbf{u},\mathbf{w})$ and $G(\mathbf{v},\mathbf{w})$ is symmetrical and each is said to be the Legendre transform of the other.
The general Legendre transform can be used to relate the Lagrangian and Hamiltonian by identifying the active variables $\mathbf{v}$ with $\mathbf{p}$ and $\mathbf{u}$ with $\dot{\mathbf{q}}$, the passive variables $\mathbf{w}$ with $(\mathbf{q},t)$, and the corresponding functions $F(\mathbf{u},\mathbf{w}) = L(\mathbf{q},\dot{\mathbf{q}},t)$ and $G(\mathbf{v},\mathbf{w}) = H(\mathbf{q},\mathbf{p},t)$. Thus the generalized momentum (8.1) corresponds to

$$\mathbf{p} = \nabla_{\dot{\mathbf{q}}} L(\mathbf{q},\dot{\mathbf{q}},t) \tag{8.9}$$

where $(\mathbf{q},t)$ are the passive variables. Then the Legendre transform states that the transformed variable $\dot{\mathbf{q}}$ is given by the relation
$$\dot{\mathbf{q}} = \nabla_{\mathbf{p}} H(\mathbf{q},\mathbf{p},t) \tag{8.10}$$
Since the functions $L(\mathbf{q},\dot{\mathbf{q}},t)$ and $H(\mathbf{q},\mathbf{p},t)$ are the Legendre transforms of each other, they satisfy the relation
$$H(\mathbf{q},\mathbf{p},t) + L(\mathbf{q},\dot{\mathbf{q}},t) = \mathbf{p}\cdot\dot{\mathbf{q}} \tag{8.11}$$
The function $H(\mathbf{q},\mathbf{p},t)$, which is the Legendre transform of the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$, is called the Hamiltonian function, and equation (8.11) is identical to our original definition of the Hamiltonian given by equation (8.3). The variables $\mathbf{q}$ and $t$ are passive variables, thus equation (8.8) gives that

$$\nabla_{\mathbf{q}} L(\dot{\mathbf{q}},\mathbf{q},t) = -\nabla_{\mathbf{q}} H(\mathbf{p},\mathbf{q},t) \tag{8.12}$$

Written in component form, equation 8.12 gives the partial derivative relations

$$\frac{\partial L(\dot{\mathbf{q}},\mathbf{q},t)}{\partial q_j} = -\frac{\partial H(\mathbf{p},\mathbf{q},t)}{\partial q_j} \tag{8.13}$$
$$\frac{\partial L(\dot{\mathbf{q}},\mathbf{q},t)}{\partial t} = -\frac{\partial H(\mathbf{p},\mathbf{q},t)}{\partial t} \tag{8.14}$$

Note that equations 813 and 814 are strictly a result of the Legendre transformation. To complete the
transformation from Lagrangian to Hamiltonian mechanics it is necessary to invoke the calculus of variations
via the Lagrange-Euler equations. The symmetry of the Legendre transform is illustrated by equation 811
Equation 731 gives that the scalar product p · q̇ =22  For scleronomic systems, with velocity indepen-
dent potentials  the standard Lagrangian  =  −  and  = 2 −  +  =  +  . Thus, for this simple
case, equation 811 reduces to an identity  +  = 2 .

8.3 Hamilton’s equations of motion

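The explicit form of the Legendre transform 8.10 gives that the time derivative of the generalized coordinate $q_j$ is
$$\dot{q}_j = \frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial p_j} \tag{8.15}$$

The Euler-Lagrange equation 6.60 is

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.16}$$

This gives the corresponding Hamilton equation for the time derivative of $p_j$ to be

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} = \dot{p}_j = \frac{\partial L}{\partial q_j} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.17}$$

Substituting equation 8.13 into equation 8.17 leads to the second Hamilton equation of motion

$$\dot{p}_j = -\frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial q_j} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \tag{8.18}$$

One can explore further the implications of Hamiltonian mechanics by taking the time derivative of (8.3), giving
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j \left( \dot{q}_j \frac{dp_j}{dt} + p_j \frac{d\dot{q}_j}{dt} - \frac{\partial L}{\partial q_j}\frac{dq_j}{dt} - \frac{\partial L}{\partial \dot{q}_j}\frac{d\dot{q}_j}{dt} \right) - \frac{\partial L}{\partial t} \tag{8.19}$$

Inserting the conjugate momenta $p_j \equiv \frac{\partial L}{\partial \dot{q}_j}$ and equation 8.17 into equation 8.19 results in
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j \left( \dot{q}_j \dot{p}_j + p_j \frac{d\dot{q}_j}{dt} - \left[ \dot{p}_j - \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} - Q_j^{EXC} \right] \dot{q}_j - p_j \frac{d\dot{q}_j}{dt} \right) - \frac{\partial L}{\partial t} \tag{8.20}$$

The second and fourth terms cancel, as well as the $\dot{q}_j \dot{p}_j$ terms, leaving
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j \left( \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \dot{q}_j \right) - \frac{\partial L}{\partial t} \tag{8.21}$$

This is the generalized energy theorem given by equation 7.38.

The total time derivative of the Hamiltonian also can be written as
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j \left( \frac{\partial H}{\partial q_j}\dot{q}_j + \frac{\partial H}{\partial p_j}\dot{p}_j \right) + \frac{\partial H}{\partial t} \tag{8.22}$$

Using equations 8.15 and 8.18 to substitute for $\frac{\partial H}{\partial p_j}$ and $\frac{\partial H}{\partial q_j}$ in equation 8.22 gives
$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j \left( \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \dot{q}_j \right) + \frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial t} \tag{8.23}$$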

Note that equation 823 must equal the generalized energy theorem, i.e. equation 821 Therefore,
 
=− (8.24)
 
In summary, Hamilton’s equations of motion are given by
(q p)
̇ = (8.25)

" #
(q p) X 
̇ = − +  + 
 (8.26)
 
=1
Ã"  # !
(q p) X X  (q q̇)

=  +  ̇ − (8.27)
 
 
=1

The symmetry of Hamilton’s equations of motion is illustrated when the Lagrange multiplier and gener-
alized forces are zero. Then

(q p)
̇ = (8.28)

(p q )
̇ = − (8.29)

(p q ) (p q ) (q̇ q)
= =− (8.30)
  
This simplified form illustrates the symmetry of Hamilton’s equations of motion. Many books present
the Hamiltonian only for this special simplified case where it is holonomic, conservative, and generalized
coordinates are used.
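
The simplified canonical equations 8.28 and 8.29 are a pair of first-order equations, so they can be stepped forward numerically once $H(q,p)$ is known. The following is a minimal sketch for one degree of freedom. The central-difference derivatives, the fourth-order Runge-Kutta stepper, and the quartic test Hamiltonian $H = p^2/2 + q^4/4$ are illustrative assumptions, and the function names are hypothetical rather than taken from the text.

```python
def hamilton_rhs(H, q, p, eps=1e-6):
    """Return (q_dot, p_dot) = (dH/dp, -dH/dq) using central differences."""
    dH_dp = (H(q, p + eps) - H(q, p - eps)) / (2.0 * eps)
    dH_dq = (H(q + eps, p) - H(q - eps, p)) / (2.0 * eps)
    return dH_dp, -dH_dq


def rk4_step(H, q, p, dt):
    """Advance (q, p) by one classical fourth-order Runge-Kutta step."""
    k1q, k1p = hamilton_rhs(H, q, p)
    k2q, k2p = hamilton_rhs(H, q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = hamilton_rhs(H, q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = hamilton_rhs(H, q + dt * k3q, p + dt * k3p)
    q_new = q + dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
    p_new = p + dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    return q_new, p_new


def integrate(H, q, p, dt, steps):
    """Repeatedly apply rk4_step and return the final (q, p)."""
    for _ in range(steps):
        q, p = rk4_step(H, q, p, dt)
    return q, p


def quartic_H(q, p):
    """Assumed test Hamiltonian with unit mass: H = p^2/2 + q^4/4."""
    return 0.5 * p * p + 0.25 * q ** 4


if __name__ == "__main__":
    q_end, p_end = integrate(quartic_H, 1.0, 0.0, dt=0.01, steps=1000)
    # H has no explicit time dependence, so it should stay close to H(1, 0).
    print(quartic_H(1.0, 0.0), quartic_H(q_end, p_end))
```

Because this test Hamiltonian has no explicit time dependence, equation 8.30 implies that $H$ is constant along the trajectory, which gives a simple consistency check on the integration.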

8.3.1 Canonical equations of motion

Hamilton’s equations of motion, summarized in equations 8.25 to 8.27, use either a minimal set of generalized coordinates, or the Lagrange multiplier terms, to account for holonomic constraints, or generalized forces $Q_j^{EXC}$ to account for non-holonomic or other forces. Hamilton’s equations of motion usually are called the canonical equations of motion. Note that the term “canonical” has nothing to do with religion or canon law; the reason for this name has bewildered many generations of students of classical mechanics. The term was introduced by Jacobi in 1837 to designate a simple and fundamental set of conjugate variables and equations. Note the symmetry of Hamilton’s two canonical equations, plus the fact that the canonical variables $p_k, q_k$ are treated as independent canonical variables. The Lagrange mechanics coordinates $(\mathbf{q},\dot{\mathbf{q}},t)$ are replaced by the Hamiltonian mechanics coordinates $(\mathbf{q},\mathbf{p},t)$, where the conjugate momenta $\mathbf{p}$ are taken to be independent of the coordinate $\mathbf{q}$.
Lagrange was the first to derive the canonical equations but he did not recognize them as a basic set of equations of motion. Hamilton derived the canonical equations of motion from his fundamental variational principle, chapter 9.2, and made them the basis for a far-reaching theory of dynamics. Hamilton’s equations give $2s$ first-order differential equations for $p_k, q_k$ for each of the $s = n - m$ degrees of freedom. Lagrange’s equations give $s$ second-order differential equations for the $s$ independent generalized coordinates $q_k, \dot{q}_k$.
It has been shown that $H(\mathbf{p},\mathbf{q},t)$ and $L(\dot{\mathbf{q}},\mathbf{q},t)$ are the Legendre transforms of each other. Although the Lagrangian formulation is ideal for solving numerical problems in classical mechanics, the Hamiltonian formulation provides a better framework for conceptual extensions to other fields of physics since it is written in terms of the fundamental conjugate coordinates, $\mathbf{q}, \mathbf{p}$. The Hamiltonian is used extensively in modern physics, including quantum physics, as discussed in chapters 15 and 18. For example, in quantum mechanics there is a straightforward relation between the classical and quantal representations of momenta; this does not exist for the velocities.
The concept of state space, introduced in chapter 3.3.2, applies naturally to Lagrangian mechanics since $(q, \dot{q})$ are the generalized coordinates used in Lagrangian mechanics. The concept of phase space, introduced in chapter 3.3.3, applies naturally to Hamiltonian mechanics since $(q, p)$ are the generalized coordinates used in Hamiltonian mechanics.

8.4 Hamiltonian in diﬀerent coordinate systems

Prior to solving problems using Hamiltonian mechanics, it is useful to express the Hamiltonian in cylindrical
and spherical coordinates for the special case of conservative forces since these are encountered frequently
in physics.

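8.4.1 Cylindrical coordinates $\rho, z, \phi$

Consider cylindrical coordinates $\rho, z, \phi$. Expressed in cartesian coordinates

$$x = \rho\cos\phi \qquad y = \rho\sin\phi \qquad z = z \tag{8.31}$$

Using appendix table 3 the Lagrangian can be written in cylindrical coordinates as
$$L = T - U = \frac{m}{2}\left(\dot{\rho}^2 + \rho^2\dot{\phi}^2 + \dot{z}^2\right) - U(\rho, z, \phi) \tag{8.32}$$
The conjugate momenta are
$$p_\rho = \frac{\partial L}{\partial \dot{\rho}} = m\dot{\rho} \tag{8.33}$$
$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m\rho^2\dot{\phi} \tag{8.34}$$
$$p_z = \frac{\partial L}{\partial \dot{z}} = m\dot{z} \tag{8.35}$$
Assuming a conservative force, $H$ is conserved. Since the transformation from cartesian to non-rotating generalized cylindrical coordinates is time independent, $H = E$. Then using (8.32) to (8.35) gives the Hamiltonian in cylindrical coordinates to be
$$H(\mathbf{q},\mathbf{p},t) = \sum_i p_i\dot{q}_i - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{8.36}$$
$$= \left(p_\rho\dot{\rho} + p_\phi\dot{\phi} + p_z\dot{z}\right) - \frac{m}{2}\left(\dot{\rho}^2 + \rho^2\dot{\phi}^2 + \dot{z}^2\right) + U(\rho, z, \phi)$$
$$= \frac{1}{2m}\left(p_\rho^2 + \frac{p_\phi^2}{\rho^2} + p_z^2\right) + U(\rho, z, \phi) \tag{8.37}$$

The canonical equations of motion in cylindrical coordinates can be written as

$$\dot{p}_\rho = -\frac{\partial H}{\partial \rho} = \frac{p_\phi^2}{m\rho^3} - \frac{\partial U}{\partial \rho} \tag{8.38}$$
$$\dot{p}_\phi = -\frac{\partial H}{\partial \phi} = -\frac{\partial U}{\partial \phi} \tag{8.39}$$
$$\dot{p}_z = -\frac{\partial H}{\partial z} = -\frac{\partial U}{\partial z} \tag{8.40}$$
$$\dot{\rho} = \frac{\partial H}{\partial p_\rho} = \frac{p_\rho}{m} \tag{8.41}$$
$$\dot{\phi} = \frac{\partial H}{\partial p_\phi} = \frac{p_\phi}{m\rho^2} \tag{8.42}$$
$$\dot{z} = \frac{\partial H}{\partial p_z} = \frac{p_z}{m} \tag{8.43}$$

Note that if $\phi$ is cyclic, that is $\frac{\partial U}{\partial \phi} = 0$, then the angular momentum about the $z$ axis, $p_\phi$, is a constant of motion. Similarly, if $z$ is cyclic, then $p_z$ is a constant of motion.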
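
One way to check the cylindrical Hamiltonian 8.37 is to compare it numerically with the energy $T + U$ computed from cartesian velocities for the same state. The sketch below does this for an arbitrary state; the function names, the sample potential, and the numerical values are illustrative assumptions rather than part of the text.

```python
import math


def cylindrical_hamiltonian(rho, z, phi, p_rho, p_phi, p_z, m, U):
    """Equation 8.37: H = (p_rho^2 + p_phi^2/rho^2 + p_z^2)/(2m) + U."""
    kinetic = (p_rho ** 2 + p_phi ** 2 / rho ** 2 + p_z ** 2) / (2.0 * m)
    return kinetic + U(rho, z, phi)


def cartesian_energy(rho, z, phi, rho_dot, z_dot, phi_dot, m, U):
    """T + U with T built from the cartesian velocity components."""
    # Time derivatives of x = rho cos(phi) and y = rho sin(phi).
    x_dot = rho_dot * math.cos(phi) - rho * phi_dot * math.sin(phi)
    y_dot = rho_dot * math.sin(phi) + rho * phi_dot * math.cos(phi)
    kinetic = 0.5 * m * (x_dot ** 2 + y_dot ** 2 + z_dot ** 2)
    return kinetic + U(rho, z, phi)


def sample_potential(rho, z, phi):
    """Hypothetical potential used only for the comparison."""
    return 0.5 * (rho ** 2 + z ** 2) + 0.1 * math.cos(phi)


if __name__ == "__main__":
    m = 2.0
    rho, z, phi = 1.5, -0.3, 0.7
    rho_dot, z_dot, phi_dot = 0.4, -0.2, 1.1
    # Conjugate momenta from equations 8.33 to 8.35.
    p_rho, p_phi, p_z = m * rho_dot, m * rho ** 2 * phi_dot, m * z_dot
    print(cylindrical_hamiltonian(rho, z, phi, p_rho, p_phi, p_z, m, sample_potential))
    print(cartesian_energy(rho, z, phi, rho_dot, z_dot, phi_dot, m, sample_potential))
```

The two printed values should agree to rounding error, since the transformation to cylindrical coordinates is time independent.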

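8.4.2 Spherical coordinates, $r, \theta, \phi$

Appendix table 4 shows that the spherical coordinates are related to the cartesian coordinates by

$$x = r\sin\theta\cos\phi \qquad y = r\sin\theta\sin\phi \qquad z = r\cos\theta \tag{8.44}$$

The Lagrangian is
$$L = T - U = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2\right) - U(r, \theta, \phi) \tag{8.45}$$
The conjugate momenta are
$$p_r = \frac{\partial L}{\partial \dot{r}} = m\dot{r} \tag{8.46}$$
$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta} \tag{8.47}$$
$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = mr^2\sin^2\theta\,\dot{\phi} \tag{8.48}$$

Assuming a conservative force, $H$ is conserved. Since the transformation from cartesian to generalized spherical coordinates is time independent, $H = E$. Thus using (8.46) to (8.48) the Hamiltonian is given in spherical coordinates by
$$H(\mathbf{q},\mathbf{p},t) = \sum_i p_i\dot{q}_i - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{8.49}$$
$$= \left(p_r\dot{r} + p_\theta\dot{\theta} + p_\phi\dot{\phi}\right) - \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2 + r^2\sin^2\theta\,\dot{\phi}^2\right) + U(r, \theta, \phi) \tag{8.50}$$
$$= \frac{1}{2m}\left(p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2\sin^2\theta}\right) + U(r, \theta, \phi) \tag{8.51}$$

Then the canonical equations of motion in spherical coordinates are

$$\dot{p}_r = -\frac{\partial H}{\partial r} = \frac{1}{mr^3}\left(p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}\right) - \frac{\partial U}{\partial r} \tag{8.52}$$
$$\dot{p}_\theta = -\frac{\partial H}{\partial \theta} = \frac{1}{mr^2}\left(\frac{p_\phi^2\cos\theta}{\sin^3\theta}\right) - \frac{\partial U}{\partial \theta} \tag{8.53}$$
$$\dot{p}_\phi = -\frac{\partial H}{\partial \phi} = -\frac{\partial U}{\partial \phi} \tag{8.54}$$
$$\dot{r} = \frac{\partial H}{\partial p_r} = \frac{p_r}{m} \tag{8.55}$$
$$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mr^2} \tag{8.56}$$
$$\dot{\phi} = \frac{\partial H}{\partial p_\phi} = \frac{p_\phi}{mr^2\sin^2\theta} \tag{8.57}$$

Note that if the coordinate $\phi$ is cyclic, that is $\frac{\partial U}{\partial \phi} = 0$, then the angular momentum $p_\phi$ is conserved. Also, if the $\theta$ coordinate is cyclic and $p_\phi = 0$, that is, there is no angular momentum about the $z$ axis, then equation 8.53 shows that $p_\theta$ is conserved.
An especially important spherically-symmetric Hamiltonian is that for a central field. Central fields, such as the gravitational or Coulomb fields of uniform spherical mass or charge distributions, are spherically symmetric, and then both $\theta$ and $\phi$ are cyclic. Thus the projection of the angular momentum $p_\phi$ about the $z$ axis is conserved for these spherically symmetric potentials. In addition, since both $p_\theta$ and $p_\phi$ are conserved, then the total angular momentum also must be conserved, as is predicted by Noether’s theorem.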
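
The conservation statements for a central field can be checked numerically by integrating the canonical equations 8.52 to 8.57. The sketch below assumes an attractive potential $U = -k/r$ with $m = k = 1$ and uses a fourth-order Runge-Kutta stepper; these choices, the initial state, and the function names are illustrative assumptions. It monitors $p_\phi$, the energy, and the quantity $p_\theta^2 + p_\phi^2/\sin^2\theta$, which is the squared total angular momentum in these coordinates.

```python
import math


def derivatives(state, m=1.0, k=1.0):
    """Right-hand sides of equations 8.52 to 8.57 for U = -k/r (assumed)."""
    r, theta, phi, p_r, p_theta, p_phi = state
    s, c = math.sin(theta), math.cos(theta)
    dU_dr = k / r ** 2
    return (
        p_r / m,                                              # r_dot
        p_theta / (m * r ** 2),                               # theta_dot
        p_phi / (m * r ** 2 * s ** 2),                        # phi_dot
        (p_theta ** 2 + p_phi ** 2 / s ** 2) / (m * r ** 3) - dU_dr,  # p_r_dot
        p_phi ** 2 * c / (m * r ** 2 * s ** 3),               # p_theta_dot
        0.0,                                                  # p_phi_dot (phi cyclic)
    )


def rk4_step(state, dt):
    """One fourth-order Runge-Kutta step for the six-component state."""
    def shifted(base, slope, h):
        return tuple(b + h * v for b, v in zip(base, slope))

    k1 = derivatives(state)
    k2 = derivatives(shifted(state, k1, 0.5 * dt))
    k3 = derivatives(shifted(state, k2, 0.5 * dt))
    k4 = derivatives(shifted(state, k3, dt))
    return tuple(
        y + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for y, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def evolve(state, dt, steps):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state


def angular_momentum_squared(state):
    """p_theta^2 + p_phi^2 / sin^2(theta)."""
    _, theta, _, _, p_theta, p_phi = state
    return p_theta ** 2 + p_phi ** 2 / math.sin(theta) ** 2


def energy(state, m=1.0, k=1.0):
    """Equation 8.51 with U = -k/r."""
    r = state[0]
    return (state[3] ** 2 + angular_momentum_squared(state) / r ** 2) / (2.0 * m) - k / r


if __name__ == "__main__":
    start = (1.0, 1.0, 0.0, 0.0, 0.3, 0.6)  # (r, theta, phi, p_r, p_theta, p_phi)
    end = evolve(start, dt=0.002, steps=2000)
    print(angular_momentum_squared(start), angular_momentum_squared(end))
    print(energy(start), energy(end))
```

In this run $p_\theta$ itself changes with time because $p_\phi \neq 0$, while the combination $p_\theta^2 + p_\phi^2/\sin^2\theta$ stays constant to within the integration error.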

8.5 Applications of Hamiltonian Dynamics

The equations of motion of a system can be derived using the Hamiltonian coupled with Hamilton’s equations of motion, that is, equations 8.25 to 8.27.
Formally the Hamiltonian is constructed from the Lagrangian. That is
1) Select a set of independent generalized coordinates $q_i$.
2) Partition the active forces.
3) Construct the Lagrangian $L(q_i, \dot{q}_i, t)$.
4) Derive the conjugate generalized momenta via $p_i = \frac{\partial L}{\partial \dot{q}_i}$.
5) Knowing $L$, $\dot{q}_i$, $p_i$ derive $H = \sum_i p_i\dot{q}_i - L$.
6) Derive $\dot{q}_j = \frac{\partial H}{\partial p_j}$ and $\dot{p}_j = -\frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial q_j} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC}$.

This procedure appears to be unnecessarily complicated compared to just using the Lagrangian plus Lagrangian mechanics to derive the equations of motion. Fortunately the above lengthy procedure often can be bypassed for conservative systems. That is, if the following conditions are satisfied:

a) $L = T(\dot{q}) - U(q)$, that is, $U(q)$ is independent of the velocity $\dot{q}$.
b) the generalized coordinates are time independent.
then it is possible to use the fact that $H = T + U = E$.
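
Steps 3 to 5 of the formal procedure can be sketched numerically for one degree of freedom. In the sketch below the momentum $p = \partial L/\partial\dot{q}$ is estimated by a central difference, the relation is inverted for $\dot{q}$ with a secant iteration, and $H = p\dot{q} - L$ is then evaluated. The function names, tolerances, and the harmonic test Lagrangian are illustrative assumptions; for a Lagrangian quadratic in $\dot{q}$ the inversion is linear and could equally be done by hand.

```python
def conjugate_momentum(L, q, q_dot, eps=1e-5):
    """Step 4: p = dL/dq_dot, estimated with a central difference."""
    return (L(q, q_dot + eps) - L(q, q_dot - eps)) / (2.0 * eps)


def velocity_from_momentum(L, q, p, guess=0.0, tol=1e-8, max_iter=50):
    """Invert p = dL/dq_dot for q_dot using a secant iteration."""
    v0, v1 = guess, guess + 1.0
    f0 = conjugate_momentum(L, q, v0) - p
    for _ in range(max_iter):
        f1 = conjugate_momentum(L, q, v1) - p
        if abs(f1) < tol or f1 == f0:
            break
        # Secant update; the right-hand side uses the old v0, v1, f0.
        v0, v1, f0 = v1, v1 - f1 * (v1 - v0) / (f1 - f0), f1
    return v1


def hamiltonian(L, q, p):
    """Step 5: H(q, p) = p * q_dot - L(q, q_dot) with q_dot expressed via p."""
    q_dot = velocity_from_momentum(L, q, p)
    return p * q_dot - L(q, q_dot)


def oscillator_L(q, q_dot, m=2.0, k=3.0):
    """Step 3 for an assumed test system: L = m q_dot^2 / 2 - k q^2 / 2."""
    return 0.5 * m * q_dot ** 2 - 0.5 * k * q ** 2


if __name__ == "__main__":
    # For this L the closed form is H = p^2/(2m) + k q^2/2.
    print(hamiltonian(oscillator_L, 0.5, 1.2))
```

For the assumed oscillator the result can be compared with the closed form $H = p^2/2m + kq^2/2$, which is the shortcut $H = T + U$ available when conditions a) and b) hold.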
The following five examples illustrate the use of Hamiltonian mechanics to derive the equations of motion.

8.1 Example: Motion in a uniform gravitational field

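Consider a mass $m$ in a uniform gravitational field acting in the $-\mathbf{z}$ direction. The Lagrangian for this simple case is
$$L = \frac{1}{2}m\left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right) - mgz$$
Therefore the generalized momenta are $p_x = \frac{\partial L}{\partial \dot{x}} = m\dot{x}$, $p_y = \frac{\partial L}{\partial \dot{y}} = m\dot{y}$, $p_z = \frac{\partial L}{\partial \dot{z}} = m\dot{z}$. The corresponding Hamiltonian $H$ is
$$H = \sum_i p_i\dot{q}_i - L = p_x\dot{x} + p_y\dot{y} + p_z\dot{z} - L$$
$$= \left(\frac{p_x^2}{m} + \frac{p_y^2}{m} + \frac{p_z^2}{m}\right) - \frac{1}{2}\left(\frac{p_x^2}{m} + \frac{p_y^2}{m} + \frac{p_z^2}{m}\right) + mgz = \frac{1}{2}\left(\frac{p_x^2}{m} + \frac{p_y^2}{m} + \frac{p_z^2}{m}\right) + mgz$$

Note that the Lagrangian is not explicitly time dependent, thus the Hamiltonian is a constant of motion.
Hamilton’s equations give that
$$\dot{x} = \frac{\partial H}{\partial p_x} = \frac{p_x}{m} \qquad -\dot{p}_x = \frac{\partial H}{\partial x} = 0$$
$$\dot{y} = \frac{\partial H}{\partial p_y} = \frac{p_y}{m} \qquad -\dot{p}_y = \frac{\partial H}{\partial y} = 0$$
$$\dot{z} = \frac{\partial H}{\partial p_z} = \frac{p_z}{m} \qquad -\dot{p}_z = \frac{\partial H}{\partial z} = mg$$
Combining these gives that $\ddot{x} = 0$, $\ddot{y} = 0$, $\ddot{z} = -g$. Note that the linear momenta $p_x$ and $p_y$ are constants of motion whereas the rate of change of $p_z$ is given by the gravitational force $-mg$. Note also that $H = T + U$ for this conservative system.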
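
Because each canonical equation in this example is first order with a constant or linear right-hand side, it can be integrated directly. The following worked step is an added sketch; the initial values $x_0$, $z_0$, $p_{x0}$, $p_{z0}$ at $t = 0$ are assumed symbols that do not appear in the text, and the $y$ motion follows the same pattern as the $x$ motion.

```latex
\begin{align*}
p_x(t) &= p_{x0}, \qquad x(t) = x_0 + \frac{p_{x0}}{m}\,t \\
p_z(t) &= p_{z0} - m g t \\
z(t) &= z_0 + \frac{1}{m}\int_0^t p_z(t')\,dt'
       = z_0 + \frac{p_{z0}}{m}\,t - \tfrac{1}{2} g t^{2} \\
\frac{p_z^2(t)}{2m} + m g z(t)
  &= \frac{p_{z0}^2}{2m} - p_{z0} g t + \tfrac{1}{2} m g^2 t^2
     + m g z_0 + p_{z0} g t - \tfrac{1}{2} m g^2 t^2
   = \frac{p_{z0}^2}{2m} + m g z_0
\end{align*}
```

The last line confirms explicitly that the $z$-dependent part of $H$ keeps its initial value, consistent with $H$ being a constant of motion.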

8.2 Example: One-dimensional harmonic oscillator

Consider a mass $m$ subject to a linear restoring force with spring constant $k$. The Lagrangian $L = T - U$ equals
$$L = \frac{1}{2}m\dot{x}^2 - \frac{1}{2}kx^2$$
Therefore the generalized momentum is
$$p_x = \frac{\partial L}{\partial \dot{x}} = m\dot{x}$$

The Hamiltonian $H$ is
$$H = \sum_i p_i\dot{q}_i - L = p_x\dot{x} - L$$
$$= p_x\frac{p_x}{m} - \frac{1}{2}\frac{p_x^2}{m} + \frac{1}{2}kx^2 = \frac{1}{2}\frac{p_x^2}{m} + \frac{1}{2}kx^2$$
Note that the Lagrangian is not explicitly time dependent, thus the Hamiltonian will be a constant of motion.
Hamilton’s equations give that
$$\dot{x} = \frac{\partial H}{\partial p_x} = \frac{p_x}{m}$$
or
$$p_x = m\dot{x}$$
In addition
$$-\dot{p}_x = \frac{\partial H}{\partial x} = kx$$
Combining these gives that
$$\ddot{x} + \frac{k}{m}x = 0$$
which is the equation of motion for the harmonic oscillator.
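
The two canonical equations for the oscillator, $\dot{x} = p_x/m$ and $\dot{p}_x = -kx$, can be stepped forward alternately, which is the idea behind the leapfrog scheme sketched below. The scheme, the step count, and the parameter values are illustrative assumptions and not part of the text. The check is that after one period $2\pi\sqrt{m/k}$ the state returns close to its starting point and the energy is nearly unchanged.

```python
import math


def leapfrog_step(x, p, dt, m, k):
    """One leapfrog step for H = p^2/(2m) + k x^2/2."""
    p_half = p - 0.5 * dt * k * x          # half step of p_dot = -dH/dx = -k x
    x_new = x + dt * p_half / m            # full step of x_dot = dH/dp = p/m
    p_new = p_half - 0.5 * dt * k * x_new  # second half step of p_dot
    return x_new, p_new


def oscillator_energy(x, p, m, k):
    """Value of the Hamiltonian for the current state."""
    return p * p / (2.0 * m) + 0.5 * k * x * x


def evolve_one_period(x0, p0, m, k, steps=10000):
    """Integrate over one analytic period T = 2 pi sqrt(m/k)."""
    period = 2.0 * math.pi * math.sqrt(m / k)
    dt = period / steps
    x, p = x0, p0
    for _ in range(steps):
        x, p = leapfrog_step(x, p, dt, m, k)
    return x, p


if __name__ == "__main__":
    m, k = 1.0, 4.0                        # assumed sample values
    x_end, p_end = evolve_one_period(1.0, 0.0, m, k)
    print(x_end, p_end)
    print(oscillator_energy(1.0, 0.0, m, k), oscillator_energy(x_end, p_end, m, k))
```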

8.3 Example: Plane pendulum

The plane pendulum of mass $m$ and length $l$, in a uniform gravitational field $g$, is an interesting system to consider. There is only one generalized coordinate, $\theta$, and the Lagrangian for this system is

$$L = \frac{1}{2}ml^2\dot{\theta}^2 + mgl\cos\theta$$
The momentum conjugate to $\theta$ is

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = ml^2\dot{\theta}$$
which is the angular momentum about the pivot point.
The Hamiltonian is

$$H = \sum_i p_i\dot{q}_i - L = p_\theta\dot{\theta} - L = \frac{1}{2}ml^2\dot{\theta}^2 - mgl\cos\theta = \frac{p_\theta^2}{2ml^2} - mgl\cos\theta$$

Hamilton’s equations of motion give

$$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{ml^2}$$
$$\dot{p}_\theta = -\frac{\partial H}{\partial \theta} = -mgl\sin\theta$$

Note that neither the Lagrangian nor the Hamiltonian is an explicit function of time, therefore the Hamiltonian is conserved. Also the potential is velocity independent and there is no coordinate transformation, thus the Hamiltonian equals the total energy, that is
$$H = \frac{p_\theta^2}{2ml^2} - mgl\cos\theta = E$$
where $E$ is a constant of motion. Note that the angular momentum $p_\theta$ is not a constant of motion since $\dot{p}_\theta$ explicitly depends on $\theta$.

The solutions for the plane pendulum on a $(\theta, p_\theta)$ phase diagram, shown in the adjacent figure, illustrate the motion. The upper phase-space plot shows the range $(\theta = \pm\pi, p_\theta)$. Note that $\theta = +\pi$ and $-\pi$ correspond to the same physical point, that is, the phase diagram should be rolled into a cylinder connected along the dashed lines. The lower phase-space plot shows two cycles for $\theta$ to better illustrate the cyclic nature of the phase diagram. The corresponding state-space diagram is shown in figure 3.4. The trajectories are ellipses for low energy $-mgl < E < mgl$, corresponding to oscillations of the pendulum about $\theta = 0$. The center of the ellipse $(0, 0)$ is a stable equilibrium point for the oscillation. However, there is a phase change to rotational motion about the horizontal axis when $|E| > mgl$, that is, the pendulum swings around a circle continuously, i.e. it rotates continuously in one direction about the horizontal axis. The phase change occurs at $E = mgl$ and is designated by the separatrix trajectory.

[Figure: Phase-space diagrams for the plane pendulum. The separatrix (bold line) separates the oscillatory solutions from the rolling solutions. The upper (a) shows one complete cycle while the lower (b) shows two complete cycles. The elliptic and hyperbolic points are marked.]

The plot of $p_\theta$ versus $\theta$ for the plane pendulum is better presented on a cylindrical phase space representation since $\theta$ is a cyclic variable that cycles around the cylinder, whereas $p_\theta$ oscillates equally about zero having both positive and negative values. When wrapped around a cylinder, the unstable and stable equilibrium points will be at diametrically opposite locations on the surface of the cylinder at $p_\theta = 0$. For small oscillations about equilibrium, also called librations, the correlation between $p_\theta$ and $\theta$ is given by the clockwise closed ellipses wrapped on the cylindrical surface, whereas for energies $|E| > mgl$ the positive $p_\theta$ corresponds to counterclockwise rotations while the negative $p_\theta$ corresponds to clockwise rotations.
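
The division of the pendulum phase space by the separatrix can be expressed directly in terms of the Hamiltonian $H = p_\theta^2/(2ml^2) - mgl\cos\theta$. The sketch below evaluates the energy of a phase-space point, classifies it as libration or rotation relative to the separatrix energy $mgl$, and gives the separatrix curve $p_\theta(\theta)$ obtained by setting $H = mgl$. The function names and the default value of $g$ are illustrative assumptions.

```python
import math


def pendulum_energy(theta, p_theta, m, l, g=9.81):
    """H = p_theta^2 / (2 m l^2) - m g l cos(theta)."""
    return p_theta ** 2 / (2.0 * m * l ** 2) - m * g * l * math.cos(theta)


def classify_motion(theta, p_theta, m, l, g=9.81):
    """Compare the energy with the separatrix energy E = m g l."""
    energy = pendulum_energy(theta, p_theta, m, l, g)
    separatrix_energy = m * g * l
    if energy < separatrix_energy:
        return "libration"
    if energy > separatrix_energy:
        return "rotation"
    return "separatrix"


def separatrix_momentum(theta, m, l, g=9.81):
    """Positive branch of p_theta on the separatrix.

    Setting H = m g l gives p_theta^2 = 2 m^2 g l^3 (1 + cos(theta)).
    """
    return m * l * math.sqrt(2.0 * g * l * (1.0 + math.cos(theta)))


if __name__ == "__main__":
    m, l = 1.0, 1.0                      # assumed sample values
    p_sep = separatrix_momentum(0.0, m, l)
    print(classify_motion(0.0, 0.5 * p_sep, m, l))
    print(classify_motion(0.0, 1.1 * p_sep, m, l))
```

The separatrix momentum vanishes at $\theta = \pm\pi$, which is the unstable (hyperbolic) equilibrium point with the pendulum inverted.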

8.4 Example: Hooke’s law force constrained to the surface of a cylinder

Consider the case where a mass $m$ is attracted by a force directed toward the origin and proportional to the distance from the origin. Determine the Hamiltonian if the mass is constrained to move on the surface of a cylinder defined by

$$x^2 + y^2 = R^2$$

It is natural to transform this problem to cylindrical coordinates $\rho, z, \phi$. Since the force is just Hooke’s law

$$\mathbf{F} = -k\mathbf{r}$$
the potential is the same as for the harmonic oscillator, that is
$$U = \frac{1}{2}kr^2 = \frac{1}{2}k(\rho^2 + z^2)$$
This is independent of $\phi$ and thus $\phi$ is cyclic.

[Figure: Mass attracted to origin by force proportional to distance from origin with the motion constrained to the surface of a cylinder.]

In cylindrical coordinates the velocity is
$$v^2 = \dot{\rho}^2 + \rho^2\dot{\phi}^2 + \dot{z}^2$$

Confinement to the surface of the cylinder means that

$$\rho = R \qquad \dot{\rho} = 0$$

Then the Lagrangian simplifies to

$$L = T - U = \frac{1}{2}m\left(R^2\dot{\phi}^2 + \dot{z}^2\right) - \frac{1}{2}k(R^2 + z^2)$$
The generalized coordinates are $\phi, z$ and the corresponding generalized momenta are
$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = mR^2\dot{\phi} \tag{a}$$
$$p_z = \frac{\partial L}{\partial \dot{z}} = m\dot{z} \tag{b}$$
The system is conservative, and the transformation from rectangular to cylindrical coordinates does not depend explicitly on time. Therefore the Hamiltonian is conserved and equals the total energy. That is
$$H = \sum_i p_i\dot{q}_i - L = \frac{p_\phi^2}{2mR^2} + \frac{p_z^2}{2m} + \frac{1}{2}k(R^2 + z^2) = E$$

The equations of motion then are given by the canonical equations

$$\dot{p}_\phi = -\frac{\partial H}{\partial \phi} = 0 \qquad \dot{\phi} = \frac{\partial H}{\partial p_\phi} = \frac{p_\phi}{mR^2} \tag{c}$$
$$\dot{p}_z = -\frac{\partial H}{\partial z} = -kz \qquad \dot{z} = \frac{\partial H}{\partial p_z} = \frac{p_z}{m} \tag{d}$$
Equations (a) and (c) imply that

$$p_\phi = m\rho^2\dot{\phi} = mR^2\dot{\phi} = \text{constant}$$

Thus the angular momentum about the axis of the cylinder is conserved, that is, $\phi$ is a cyclic variable. Combining equations (b) and (d) implies that

$$\ddot{z} + \frac{k}{m}z = 0$$

This is the equation for simple harmonic motion with angular frequency $\omega = \sqrt{k/m}$. The symmetries imply that this problem has the same solutions for the $z$ coordinate as the harmonic oscillator, while the $\phi$ coordinate moves with constant angular velocity.
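
The canonical equations of this example integrate in closed form, which makes the separation into a uniform rotation and an axial oscillation explicit. The following is an added sketch; the initial values $\phi_0$, $z_0$, and $p_{z0}$ at $t = 0$ are assumed symbols that do not appear in the text.

```latex
\begin{align*}
p_\phi(t) &= p_\phi = \text{constant}, \qquad
\phi(t) = \phi_0 + \frac{p_\phi}{m R^{2}}\,t \\
z(t) &= z_0\cos\omega t + \frac{p_{z0}}{m\omega}\sin\omega t, \qquad
\omega = \sqrt{\frac{k}{m}} \\
p_z(t) &= m\dot z(t) = p_{z0}\cos\omega t - m\omega z_0 \sin\omega t \\
E &= \frac{p_\phi^{2}}{2 m R^{2}} + \frac{p_{z0}^{2}}{2m}
     + \tfrac{1}{2} k \left(R^{2} + z_0^{2}\right)
\end{align*}
```

The trajectory is therefore a helix-like path on the cylinder whose axial displacement oscillates at $\omega$ while the azimuthal angle advances at the constant rate $p_\phi/(mR^2)$.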

8.5 Example: Electron motion in a cylindrical magnetron

A magnetron comprises a hot cylindrical wire cathode that emits electrons and is at a high negative voltage. It is surrounded by a larger diameter concentric cylindrical anode at ground potential. A uniform magnetic field runs parallel to the cylindrical axis of the magnetron. The electron beam excites a multiple set of microwave cavities located around the circumference of the cylindrical wall of the anode. The magnetron was invented in England during World War 2 to generate microwaves required for the development of radar.
Consider a non-relativistic electron of mass $m$ and charge $-e$ in a cylindrical magnetron moving between the central cathode wire, of radius $a$ at a negative electric potential $-\Phi_0$, and a concentric cylindrical anode conductor of radius $b$ which has zero electric potential. There is a uniform constant magnetic field $B$ parallel to the cylindrical axis of the magnetron.
Using SI units and cylindrical coordinates $(\rho, \phi, z)$ aligned with the axis of the magnetron, the electromagnetic force Lagrangian, given in chapter 6.10, equals
$$L = \frac{1}{2}m\dot{\mathbf{r}}^2 + e(\Phi - \dot{\mathbf{r}}\cdot\mathbf{A})$$
The electric and vector potentials for the magnetron geometry are
$$\Phi = -\Phi_0\frac{\ln(\rho/b)}{\ln(a/b)}$$
$$\mathbf{A} = \frac{1}{2}B\rho\,\hat{\boldsymbol{\phi}}$$

Thus expressed in cylindrical coordinates the Lagrangian equals

$$L = \frac{1}{2}m\left(\dot{\rho}^2 + \rho^2\dot{\phi}^2 + \dot{z}^2\right) + e\Phi - \frac{1}{2}eB\rho^2\dot{\phi}$$
The generalized momenta are
$$p_\rho = \frac{\partial L}{\partial \dot{\rho}} = m\dot{\rho}$$
$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m\rho^2\dot{\phi} - \frac{1}{2}eB\rho^2$$
$$p_z = \frac{\partial L}{\partial \dot{z}} = m\dot{z}$$
Note that the vector potential $\mathbf{A}$ contributes an additional term to the angular momentum $p_\phi$.
Using the above generalized momenta leads to the Hamiltonian

$$H = p_\rho\dot{\rho} + p_\phi\dot{\phi} + p_z\dot{z} - L$$
$$= \frac{1}{2}m\left(\dot{\rho}^2 + \rho^2\dot{\phi}^2 + \dot{z}^2\right) - e\Phi$$
$$= \frac{p_\rho^2}{2m} + \frac{1}{2m\rho^2}\left(p_\phi + \frac{1}{2}eB\rho^2\right)^2 + \frac{p_z^2}{2m} - e\Phi$$
$$= \frac{1}{2m}\left[p_\rho^2 + \left(\frac{p_\phi}{\rho} + \frac{1}{2}eB\rho\right)^2 + p_z^2\right] - e\Phi$$

Note that the terms linear in $\dot{\phi}$ cancel in the second line. The Hamiltonian is not an explicit function of time, therefore it is a constant of motion which equals the total energy.
$$H = \frac{1}{2m}\left[p_\rho^2 + \left(\frac{p_\phi}{\rho} + \frac{1}{2}eB\rho\right)^2 + p_z^2\right] - e\Phi = E$$

Since $\dot{p}_i = -\frac{\partial H}{\partial q_i}$, if $H$ is not an explicit function of $q_i$ then $\dot{p}_i = 0$, that is, $p_i$ is a constant of motion. Thus $p_\phi$ and $p_z$ are constants of motion.
Consider the initial conditions $\rho = a$, $\dot{\rho} = \dot{\phi} = \dot{z} = 0$. Then
$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m\rho^2\dot{\phi} - \frac{1}{2}eB\rho^2 = -\frac{1}{2}eBa^2$$
$$p_z = 0$$
$$H = \frac{1}{2m}\left[p_\rho^2 + \left(\frac{p_\phi}{\rho} + \frac{1}{2}eB\rho\right)^2 + p_z^2\right] + e\Phi_0\frac{\ln(\rho/b)}{\ln(a/b)} = e\Phi_0$$

Note that at $\rho = b$, $p_\rho$ is given by the last equation since the Hamiltonian equals the constant $e\Phi_0$. That is, assuming that $a \ll b$, then
$$p_\rho^2 = 2me\Phi_0 - \left(\frac{1}{2}eBb\right)^2$$
Define a critical magnetic field by
$$B_c \equiv \frac{2}{b}\sqrt{\frac{2m\Phi_0}{e}}$$
then
$$\left(p_\rho^2\right)_{\rho=b} = \left(B_c^2 - B^2\right)\left(\frac{1}{2}eb\right)^2$$
Note that if $B < B_c$ then $p_\rho$ is real at $\rho = b$. However, if $B > B_c$ then $p_\rho$ is imaginary at $\rho = b$, implying that there must be a maximum orbit radius $\rho_0$ for the electron where $\rho_0 < b$. That is, the electron trajectories are confined spatially to coaxial cylindrical orbits concentric with the magnetron electromagnetic fields. These closed electron trajectories excite the microwave cavities located in the nearby outer cylindrical wall of the anode.
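
The critical field follows from setting the radial momentum at the anode to zero, $2me\Phi_0 = (\tfrac{1}{2}eB_cb)^2$, which gives $B_c = (2/b)\sqrt{2m\Phi_0/e}$ in the limit $a \ll b$. The sketch below evaluates this expression and the sign of $p_\rho^2$ at $\rho = b$. The cathode voltage of 1000 V and anode radius of 1 cm are hypothetical sample values, and the function names are assumptions; the electron mass and charge are standard SI values.

```python
import math

ELECTRON_CHARGE = 1.602176634e-19  # C, magnitude of the electron charge
ELECTRON_MASS = 9.1093837015e-31   # kg


def critical_field(phi0, b):
    """B_c = (2/b) sqrt(2 m phi0 / e), valid in the limit a << b."""
    return (2.0 / b) * math.sqrt(2.0 * ELECTRON_MASS * phi0 / ELECTRON_CHARGE)


def p_rho_squared_at_anode(B, phi0, b):
    """p_rho^2 at rho = b: 2 m e phi0 - (e B b / 2)^2."""
    return (2.0 * ELECTRON_MASS * ELECTRON_CHARGE * phi0
            - (0.5 * ELECTRON_CHARGE * B * b) ** 2)


def reaches_anode(B, phi0, b):
    """True if p_rho is real at the anode, i.e. B is below the critical field."""
    return p_rho_squared_at_anode(B, phi0, b) > 0.0


if __name__ == "__main__":
    phi0, b = 1000.0, 0.01             # hypothetical: 1000 V, 1 cm
    B_c = critical_field(phi0, b)
    print(B_c)                          # roughly 0.021 T for these values
    print(reaches_anode(0.5 * B_c, phi0, b), reaches_anode(2.0 * B_c, phi0, b))
```

For fields above the critical value the function `reaches_anode` returns `False`, corresponding to electron orbits that turn back before reaching $\rho = b$.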

8.6 Routhian reduction

Noether’s theorem states that if the coordinate $q_j$ is cyclic, and if the Lagrange multiplier plus generalized force contributions for the $j^{th}$ coordinate are zero, then the canonical momentum of the cyclic variable, $p_j$, is a constant of motion, as is discussed in chapter 7.3. Therefore, for a cyclic variable the Hamiltonian does not depend on $q_j$ and the conjugate momentum $p_j$ is a constant, so the pair $(q_j, p_j)$ can be factored out of the Hamiltonian $H(\mathbf{p},\mathbf{q},t)$. This reduces the number of degrees of freedom included in the Hamiltonian. For this reason, cyclic variables are called ignorable variables in Hamiltonian mechanics. This advantage does not apply to the $(q_j, \dot{q}_j)$ variables used in Lagrangian mechanics since $\dot{q}_j$ is not a constant of motion for a cyclic coordinate. The ability to eliminate the cyclic variables as unknowns in the Hamiltonian is a valuable advantage of Hamiltonian mechanics that is exploited extensively for solving problems, as is described in chapter 15.
It is advantageous to have the ability to exploit both the Lagrangian and Hamiltonian formulations simultaneously when handling systems that involve a mixture of cyclic and non-cyclic coordinates. The equations of motion for each independent generalized coordinate can be derived independently of the remaining generalized coordinates. Thus it is possible to select either the Hamiltonian or the Lagrangian formulation for each generalized coordinate, independent of what is used for the other generalized coordinates. Routh[Rou1860] devised an elegant and useful hybrid technique that separates the cyclic and non-cyclic generalized coordinates in order to simultaneously exploit the differing advantages of both the Hamiltonian and Lagrangian formulations of classical mechanics. The Routhian reduction approach partitions the $\sum_{i=1}^{n} p_i\dot{q}_i$ kinetic energy term in the Hamiltonian into a cyclic group, plus a non-cyclic group, i.e.

$$H(q_1, \ldots, q_n; p_1, \ldots, p_n; t) = \sum_{i=1}^{n} p_i\dot{q}_i - L = \sum_{cyclic}^{m} p_i\dot{q}_i + \sum_{noncyclic}^{s} p_i\dot{q}_i - L \tag{8.58}$$

Routh’s clever idea was to define a new function, called the Routhian, that includes only one of the two partitions of the kinetic energy terms. This makes the Routhian a Hamiltonian for the coordinates for which the kinetic energy terms are included, while the Routhian acts like a negative Lagrangian for the coordinates where the kinetic energy term is omitted. This book defines two Routhians.

$$R_{cyclic}(q_1, \ldots, q_n; \dot{q}_1, \ldots, \dot{q}_s; p_{s+1}, \ldots, p_n; t) \equiv \sum_{cyclic}^{m} p_i\dot{q}_i - L \tag{8.59}$$
$$R_{noncyclic}(q_1, \ldots, q_n; p_1, \ldots, p_s; \dot{q}_{s+1}, \ldots, \dot{q}_n; t) \equiv \sum_{noncyclic}^{s} p_i\dot{q}_i - L \tag{8.60}$$

The first Routhian, called $R_{cyclic}$, includes the kinetic energy terms only for the cyclic variables, and behaves like a Hamiltonian for the cyclic variables and like a negative Lagrangian for the non-cyclic variables. The second Routhian, called $R_{noncyclic}$, includes the kinetic energy terms for only the non-cyclic variables, and behaves like a Hamiltonian for the non-cyclic variables and like a negative Lagrangian for the cyclic variables. These two Routhians complement each other in that they make the Routhian either a Hamiltonian for the cyclic variables, or the converse where the Routhian is a Hamiltonian for the non-cyclic variables. The Routhians use $(q_i, \dot{q}_i)$ to denote those coordinates for which the Routhian behaves like a Lagrangian, and $(q_i, p_i)$ for those coordinates where the Routhian behaves like a Hamiltonian. For uniformity, it is assumed that the degrees of freedom between $1 \le i \le s$ are non-cyclic, while those between $s+1 \le i \le n$ are ignorable cyclic coordinates.
The Routhian is a hybrid of Lagrangian and Hamiltonian mechanics. Some textbooks minimize discussion of the Routhian on the grounds that this hybrid approach is not fundamental. However, the Routhian is used extensively in engineering in order to derive the equations of motion for rotating systems. In addition it is used when dealing with rotating nuclei in nuclear physics, rotating molecules in molecular physics, and rotating galaxies in astrophysics. The Routhian reduction technique provides a powerful way to calculate the intrinsic properties for a rotating system in the rotating frame of reference. The Routhian approach is included in this textbook because it plays an important role in practical applications of rotating systems, plus it nicely illustrates the relative advantages of the Lagrangian and Hamiltonian formulations in mechanics.

8.6.1 R - Routhian is a Hamiltonian for the cyclic variables

The cyclic Routhian  is defined assuming that the variables between 1 ≤  ≤  are non-cyclic, where
 =  − , while the  variables between  + 1 ≤  ≤  are ignorable cyclic coordinates. The cyclic Routhian
 expresses the cyclic coordinates in terms of ( ) which are required for use by Hamilton’s equations,
while the non-cyclic variables are expressed in terms of ( ̇) for use by the Lagrange equations. That is,
the cyclic Routhian  is defined to be

X
 (1    ; ̇1   ̇ ; +1    ; ) ≡  ̇ −  (8.61)

P
where the summation   ̇ is over only the  cyclic variables  + 1 ≤  ≤ . Note that the Lagrangian
can be split into the cyclic and the non-cyclic parts

X
 (1    ; ̇1   ̇ ; +1    ; ) =  ̇ −  −  (8.62)


The first two terms on the right can be combined to give the Hamiltonian  for only the  cyclic
variables,  =  + 1  + 2  , that is
 (1    ; ̇1   ̇ ; +1    ; ) =  −  (8.63)
The Routhian  (1    ; ̇1   ̇ ; +1    ; ) also can be written in an alternate form

X 
X 
X
 (1    ; ̇1   ̇ ; +1    ; ) ≡  ̇ −  =  ̇ −  −  ̇ (8.64)
 =1 

X
= −  ̇ (8.65)


which is expressed as the complete Hamiltonian minus the kinetic energy term for the noncyclic coordinates.
The Routhian  behaves like a Hamiltonian for the  cyclic coordinates and behaves like a negative
Lagrangian  for all the  =  −  noncyclic coordinates  = 1 2   Thus the equations of motion
for the  non-cyclic variables are given using Lagrange’s equations of motion, while the Routhian behaves
like a Hamiltonian  for the  ignorable cyclic variables  =  + 1  
Ignoring both the Lagrange multiplier and generalized forces, then the partitioned equations of motion
for the non-cyclic and cyclic generalized coordinates are given in Table 81
Table 81; Equations of motion for the Routhian 
Lagrange equations Hamilton equations
Coordinates Noncyclic: 1 ≤  ≤  Cyclic: ( + 1) ≤  ≤ 

  

 = −   = −̇
Equations of motion

  

 ̇ =−  ̇  = ̇

Thus there are $m$ cyclic (ignorable) coordinates $(q, p)_{s+1}, \dots, (q, p)_n$ which obey Hamilton’s equations of
motion, while the first $s = n - m$ non-cyclic (non-ignorable) coordinates $(q, \dot{q})_1, \dots, (q, \dot{q})_s$ for $i = 1, 2, \dots, s$
obey Lagrange equations. The solution for the cyclic variables is trivial since their conjugate momenta are constants of motion,
and thus the Routhian $R_{cyclic}$ has reduced the number of equations of motion that must be solved from $n$ to
the $s = n - m$ non-cyclic variables. This Routhian provides an especially useful way to reduce the number
of equations of motion for rotating systems.
Note that several definitions of the Routhian are in use. For example, some books define this
Routhian as the negative of the definition used here so that it corresponds to a positive Lagrangian.
However, this sign usually cancels when deriving the equations of motion, thus the sign convention is
unimportant provided that it is used consistently.

8.6.2 R - Routhian is a Hamiltonian for the non-cyclic variables

The non-cyclic Routhian $R_{noncyclic}$ complements $R_{cyclic}$. Again the generalized coordinates between $1 \le i \le s$
are assumed to be non-cyclic, while those between $s+1 \le i \le n$ are ignorable cyclic coordinates. However,
the expressions in terms of $(q, p)$ and $(q, \dot{q})$ are interchanged, that is, the cyclic variables are expressed in
terms of $(q, \dot{q})$ and the non-cyclic variables are expressed in terms of $(q, p)$, which is the opposite of what was
used for $R_{cyclic}$.

$$R_{noncyclic}(q_1,\dots,q_n;\ p_1,\dots,p_s;\ \dot{q}_{s+1},\dots,\dot{q}_n;\ t) = \sum_{noncyclic}^{s} p_i \dot{q}_i - L_{noncyclic} - L_{cyclic} \tag{8.66}$$

$$= H_{noncyclic} - L_{cyclic} \tag{8.67}$$

It can be written in a frequently used form

$$R_{noncyclic}(q_1,\dots,q_n;\ p_1,\dots,p_s;\ \dot{q}_{s+1},\dots,\dot{q}_n;\ t) \equiv \sum_{noncyclic}^{s} p_i \dot{q}_i - L = \sum_{i=1}^{n} p_i \dot{q}_i - L - \sum_{cyclic}^{m} p_i \dot{q}_i$$

$$= H - \sum_{cyclic}^{m} p_i \dot{q}_i \tag{8.68}$$

This Routhian behaves like a Hamiltonian for the $s$ non-cyclic variables, which are expressed in terms of $q$
and $p$ as is appropriate for a Hamiltonian. This Routhian writes the $m$ cyclic coordinates in terms of $q$ and $\dot{q}$,
as is appropriate for a Lagrangian, and these are treated assuming that the Routhian $R_{noncyclic}$ is a negative Lagrangian for
these cyclic variables, as summarized in Table 8.2.

Table 82; Equations of motion for the Routhian 

Hamilton equations Lagrange equations
Coordinates Noncyclic: 1 ≤  ≤  Cyclic: ( + 1) ≤  ≤ 

  

 = −̇  = − 
Equations of motion

  

 = ̇  ̇ =−  ̇

This non-cyclic Routhian $R_{noncyclic}$ is especially useful since it equals the Hamiltonian for the non-cyclic
variables, that is, the kinetic energy for motion of the cyclic variables has been removed. Note that since the
cyclic variables are constants of motion, then $R_{noncyclic}$ is a constant of motion if $H$ is a constant of motion.
However, $R_{noncyclic}$ does not equal the total energy since the coordinate transformation is time dependent,
that is, $R_{noncyclic}$ corresponds to the energy of the non-cyclic parts of the motion. For example, when used
to describe rotational motion, $R_{noncyclic}$ corresponds to the energy in the non-inertial rotating body-fixed
frame of reference. This is especially useful in treating rotating systems such as rotating galaxies, rotating
machinery, molecules, or rotating strongly-deformed nuclei as discussed in chapter 12.9.
The Lagrangian and Hamiltonian are the fundamental algebraic approaches to classical mechanics. The
Routhian reduction method is a valuable hybrid technique that exploits a trick to reduce the number of
variables that have to be solved for complicated problems encountered in science and engineering. The
Routhian $R_{noncyclic}$ provides the most useful approach for solving the equations of motion for rotating
molecules, deformed nuclei, or astrophysical objects in that it gives the Hamiltonian in the non-inertial
body-fixed rotating frame of reference ignoring the rotational energy of the frame. By contrast, the cyclic
Routhian $R_{cyclic}$ is especially useful to exploit Lagrangian mechanics for solving problems in rigid-body
rotation such as the Tippe Top described in example 13.13.
Note that the Lagrangian, Hamiltonian, plus both the $R_{cyclic}$ and $R_{noncyclic}$ Routhians, all are scalars
under rotation, that is, they are rotationally invariant. However, they may be expressed in terms of the
coordinates in either the stationary or a rotating frame. The major difference is that the Routhian includes
only subsets of the kinetic energy term $\sum_i p_i \dot{q}_i$. The relative merits of using Lagrangian, Hamiltonian, and
both the $R_{cyclic}$ and $R_{noncyclic}$ Routhian reduction methods, are illustrated by the following examples.

8.6 Example: Spherical pendulum using Hamiltonian mechanics

The spherical pendulum provides a simple test case for comparison of the use of Lagrangian mechanics,
Hamiltonian mechanics, and both approaches to Routhian reduction. The Lagrangian mechanics solution of the
spherical pendulum is described in example 6.10. The solution using Hamiltonian mechanics is given in this
example, followed by solutions using both of the Routhian reduction approaches.

[Figure: Spherical pendulum]

Consider the equations of motion of a spherical pendulum of mass $m$ and length $b$. The generalized
coordinates are $\theta, \phi$ since the length is fixed at $r = b$. The kinetic energy is

$$T = \frac{1}{2} m b^2 \dot{\theta}^2 + \frac{1}{2} m b^2 \sin^2\theta\, \dot{\phi}^2$$

The potential energy is $U = -mgb\cos\theta$, giving that

$$L(r, \theta, \phi, \dot{r}, \dot{\theta}, \dot{\phi}) = \frac{1}{2} m b^2 \dot{\theta}^2 + \frac{1}{2} m b^2 \sin^2\theta\, \dot{\phi}^2 + mgb\cos\theta$$

The generalized momenta are

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m b^2 \dot{\theta} \qquad p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m b^2 \sin^2\theta\, \dot{\phi}$$

Since the system is conservative, and the transformation from rectangular to spherical coordinates does not
depend explicitly on time, the Hamiltonian is conserved and equals the total energy. The generalized
momenta allow the Hamiltonian to be written as

$$H(r, \theta, \phi, p_r, p_\theta, p_\phi) = \frac{p_\theta^2}{2 m b^2} + \frac{p_\phi^2}{2 m b^2 \sin^2\theta} - mgb\cos\theta$$

The equations of motion are

$$\dot{p}_\theta = -\frac{\partial H}{\partial \theta} = \frac{p_\phi^2 \cos\theta}{m b^2 \sin^3\theta} - mgb\sin\theta \tag{a}$$

$$\dot{p}_\phi = -\frac{\partial H}{\partial \phi} = 0 \tag{b}$$

$$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{m b^2} \tag{c}$$

$$\dot{\phi} = \frac{\partial H}{\partial p_\phi} = \frac{p_\phi}{m b^2 \sin^2\theta} \tag{d}$$

Taking the time derivative of equation (c) and using (a) to substitute for $\dot{p}_\theta$ gives

$$\ddot{\theta} - \frac{p_\phi^2 \cos\theta}{m^2 b^4 \sin^3\theta} + \frac{g}{b}\sin\theta = 0 \tag{e}$$

Note that equation (b) shows that $\phi$ is a cyclic coordinate. Thus

$$p_\phi = m b^2 \sin^2\theta\, \dot{\phi} = \text{constant}$$

that is, the angular momentum about the vertical axis is conserved. Note that although $p_\phi$ is a constant of
motion, $\dot{\phi} = \frac{p_\phi}{m b^2 \sin^2\theta}$ is a function of $\theta$ and thus in general it is not conserved. There are various solutions
depending on the initial conditions. If $p_\phi = 0$ then the pendulum is just the simple pendulum discussed
previously that can oscillate, or rotate, in the $\theta$ direction. The opposite extreme is where $p_\theta = 0$, where the
pendulum rotates in the $\phi$ direction with constant $\theta$. In general the motion is a complicated coupling of the
$\theta$ and $\phi$ motions.
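As a numerical illustration of the Hamiltonian treatment above, the following minimal sketch integrates Hamilton’s equations (a) to (d) for the spherical pendulum with a fixed-step fourth-order Runge-Kutta scheme. The parameter values ($m = b = 1$, $g = 9.81$), the step size, and all function names are illustrative assumptions, not part of the text. The sketch can be used to check that $p_\phi$ and $H$ stay constant along a trajectory while $\theta$ and $\dot{\phi}$ vary.

```python
import math


def derivatives(state, m=1.0, b=1.0, g=9.81):
    """Right-hand sides of Hamilton's equations for the spherical pendulum.

    state = (theta, phi, p_theta, p_phi); m, b, g are illustrative values.
    """
    theta, phi, p_theta, p_phi = state
    s, c = math.sin(theta), math.cos(theta)
    return (
        p_theta / (m * b**2),                               # theta_dot   =  dH/dp_theta
        p_phi / (m * b**2 * s**2),                          # phi_dot     =  dH/dp_phi
        p_phi**2 * c / (m * b**2 * s**3) - m * g * b * s,   # p_theta_dot = -dH/dtheta
        0.0,                                                # p_phi_dot   = -dH/dphi (phi is cyclic)
    )


def hamiltonian(state, m=1.0, b=1.0, g=9.81):
    """Total energy H(theta, p_theta, p_phi) of the spherical pendulum."""
    theta, _, p_theta, p_phi = state
    return (p_theta**2 / (2 * m * b**2)
            + p_phi**2 / (2 * m * b**2 * math.sin(theta)**2)
            - m * g * b * math.cos(theta))


def rk4_step(state, dt):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))

    k1 = derivatives(state)
    k2 = derivatives(shifted(k1, dt / 2))
    k3 = derivatives(shifted(k2, dt / 2))
    k4 = derivatives(shifted(k3, dt))
    return tuple(x + dt / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
                 for x, a1, a2, a3, a4 in zip(state, k1, k2, k3, k4))


def integrate(state, dt=1e-3, steps=2000):
    """Integrate for steps * dt seconds and return the final state."""
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state
```

For example, `integrate((1.0, 0.0, 0.0, 1.0))` starts the pendulum at $\theta = 1$ rad with $p_\theta = 0$ and $p_\phi = 1$; comparing `hamiltonian` before and after gives a quick check on the step size. The scheme is not symplectic, so a small energy drift is expected over long runs.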

8.7 Example: Spherical pendulum using $R_{cyclic}(r, \theta, \phi, \dot{r}, \dot{\theta}, p_\phi)$

The Lagrangian for the spherical pendulum is

$$L(r, \theta, \phi, \dot{r}, \dot{\theta}, \dot{\phi}) = \frac{1}{2} m b^2 \dot{\theta}^2 + \frac{1}{2} m b^2 \sin^2\theta\, \dot{\phi}^2 + mgb\cos\theta$$

Note that the Lagrangian is independent of $\phi$, therefore $\phi$ is an ignorable variable with

$$\dot{p}_\phi = \frac{\partial L}{\partial \phi} = -\frac{\partial R_{cyclic}}{\partial \phi} = 0$$

Therefore $p_\phi$ is a constant of motion equal to

$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m b^2 \sin^2\theta\, \dot{\phi}$$

The Routhian $R_{cyclic}(r, \theta, \phi, \dot{r}, \dot{\theta}, p_\phi)$ equals

$$R_{cyclic}(r, \theta, \phi, \dot{r}, \dot{\theta}, p_\phi) = p_\phi \dot{\phi} - L$$

$$= -\left[ \frac{1}{2} m b^2 \dot{\theta}^2 + \frac{1}{2} m b^2 \sin^2\theta\, \dot{\phi}^2 + mgb\cos\theta - m b^2 \sin^2\theta\, \dot{\phi}^2 \right]$$

$$= -\frac{1}{2} m b^2 \dot{\theta}^2 + \frac{1}{2} \frac{p_\phi^2}{m b^2 \sin^2\theta} - mgb\cos\theta$$

The Routhian $R_{cyclic}(r, \theta, \phi, \dot{r}, \dot{\theta}, p_\phi)$ behaves like a Hamiltonian for $\phi$ and like a Lagrangian $L' = -R_{cyclic}$
for $\theta$. Use of Hamilton’s canonical equations for $\phi$ gives

$$\dot{\phi} = \frac{\partial R_{cyclic}}{\partial p_\phi} = \frac{p_\phi}{m b^2 \sin^2\theta}$$

$$-\dot{p}_\phi = \frac{\partial R_{cyclic}}{\partial \phi} = 0$$

These two equations show that $p_\phi$ is a constant of motion given by

$$m b^2 \sin^2\theta\, \dot{\phi} = p_\phi = \text{constant} \tag{a}$$

Note that the Hamiltonian only includes the kinetic energy for the $\phi$ motion which is a constant of motion,
but this energy does not equal the total energy. This solution is what is predicted by Noether’s theorem due
to the symmetry of the Lagrangian about the vertical $z$ axis.
Since $R_{cyclic}(r, \theta, \phi, \dot{r}, \dot{\theta}, p_\phi)$ behaves like a Lagrangian for $\theta$, the Lagrange equation for $\theta$ is

$$\Lambda_\theta R_{cyclic} = \frac{d}{dt} \frac{\partial R_{cyclic}}{\partial \dot{\theta}} - \frac{\partial R_{cyclic}}{\partial \theta} = 0$$

where the negative sign of the Lagrangian in $R_{cyclic}(r, \theta, \phi, \dot{r}, \dot{\theta}, p_\phi)$ cancels. This leads to

$$m b^2 \ddot{\theta} = \frac{p_\phi^2 \cos\theta}{m b^2 \sin^3\theta} - mgb\sin\theta$$

that is

$$\ddot{\theta} - \frac{p_\phi^2 \cos\theta}{m^2 b^4 \sin^3\theta} + \frac{g}{b}\sin\theta = 0 \tag{b}$$

This result is identical to the one obtained using Lagrangian mechanics in example 6.10 and Hamiltonian
mechanics given in example 8.6. The Routhian $R_{cyclic}$ simplified the problem to one degree of freedom $\theta$ by
absorbing into the Hamiltonian the ignorable cyclic $\phi$ coordinate and its conserved conjugate momentum $p_\phi$.
Note that the central term in equation (b) is the centrifugal term which is due to rotation about the vertical
axis. This term is zero for plane pendulum motion when $p_\phi = 0$.
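The reduction to a single equation in $\theta$ can be checked numerically. The sketch below, under illustrative assumptions ($m = b = 1$, $g = 9.81$, hypothetical function names), evaluates the cyclic Routhian, applies the Lagrange equation to $\theta$ using a central finite difference for $\partial R_{cyclic}/\partial\theta$, and compares the result with the closed-form $\ddot{\theta}$. It also gives the value of $p_\phi$ for which $\ddot{\theta} = 0$, that is, steady conical motion at fixed $\theta_0$.

```python
import math

# Illustrative parameter values (assumptions, not taken from the text)
M, B, G = 1.0, 1.0, 9.81


def routhian_cyclic(theta, theta_dot, p_phi):
    """Cyclic Routhian of the spherical pendulum: p_phi * phi_dot - L."""
    return (-0.5 * M * B**2 * theta_dot**2
            + p_phi**2 / (2 * M * B**2 * math.sin(theta)**2)
            - M * G * B * math.cos(theta))


def theta_ddot_numeric(theta, theta_dot, p_phi, h=1e-6):
    """theta acceleration from the Lagrange equation applied to the Routhian.

    Since dR/d(theta_dot) = -M B^2 theta_dot, the Lagrange equation
    d/dt(dR/d theta_dot) - dR/d theta = 0 gives
    theta_ddot = -(dR/d theta) / (M B^2).
    """
    dR_dtheta = (routhian_cyclic(theta + h, theta_dot, p_phi)
                 - routhian_cyclic(theta - h, theta_dot, p_phi)) / (2 * h)
    return -dR_dtheta / (M * B**2)


def theta_ddot_closed(theta, p_phi):
    """Closed-form theta acceleration: centrifugal term minus gravity term."""
    return (p_phi**2 * math.cos(theta) / (M**2 * B**4 * math.sin(theta)**3)
            - (G / B) * math.sin(theta))


def conical_p_phi(theta0):
    """p_phi giving theta_ddot = 0 at theta0 (valid for 0 < theta0 < pi/2)."""
    return M * math.sqrt(B**3 * G / math.cos(theta0)) * math.sin(theta0)**2
```

Setting `p_phi = 0.0` in `theta_ddot_closed` recovers the plane pendulum, $\ddot{\theta} = -(g/b)\sin\theta$, while `conical_p_phi` corresponds to $\dot{\phi}^2 = g/(b\cos\theta_0)$.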

8.8 Example: Spherical pendulum using $R_{noncyclic}(r, \theta, \phi, p_r, p_\theta, \dot{\phi})$

For a rotational system the Routhian $R_{noncyclic}(r, \theta, \phi, p_r, p_\theta, \dot{\phi})$ also can be used to project out the
Hamiltonian for the active variables in the rotating body-fixed frame of reference. Consider the spherical pendulum
where the rotating frame is rotating with angular velocity $\dot{\phi}$. The Lagrangian for the spherical pendulum is

$$L(r, \theta, \phi, \dot{r}, \dot{\theta}, \dot{\phi}) = \frac{1}{2} m b^2 \dot{\theta}^2 + \frac{1}{2} m b^2 \sin^2\theta\, \dot{\phi}^2 + mgb\cos\theta$$

Note that the Lagrangian is independent of $\phi$, therefore $\phi$ is an ignorable variable with

$$\dot{p}_\phi = \frac{\partial L}{\partial \phi} = -\frac{\partial R_{noncyclic}}{\partial \phi} = 0$$

Therefore $p_\phi$ is a constant of motion equal to

$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = m b^2 \sin^2\theta\, \dot{\phi}$$

The total Hamiltonian is given by

$$H(r, \theta, \phi, p_r, p_\theta, p_\phi) = \sum_i p_i \dot{q}_i - L = \frac{p_\theta^2}{2 m b^2} + \frac{p_\phi^2}{2 m b^2 \sin^2\theta} - mgb\cos\theta$$

The Routhian for the rotating frame of reference $R_{noncyclic}$ is given by equation 8.68, that is

$$R_{noncyclic}(r, \theta, \phi, p_r, p_\theta, \dot{\phi}) = \sum_{i=1}^{n} p_i \dot{q}_i - p_\phi \dot{\phi} - L = H - p_\phi \dot{\phi}$$

$$= \frac{p_\theta^2}{2 m b^2} + \frac{p_\phi^2}{2 m b^2 \sin^2\theta} - mgb\cos\theta - p_\phi \dot{\phi}$$

$$= \frac{p_\theta^2}{2 m b^2} - \frac{1}{2} m b^2 \sin^2\theta\, \dot{\phi}^2 - mgb\cos\theta \tag{a}$$

This behaves like a negative Lagrangian for $\phi$ and a Hamiltonian for $\theta$. The conjugate momenta are

$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = -\frac{\partial R_{noncyclic}}{\partial \dot{\phi}} = m b^2 \sin^2\theta\, \dot{\phi}$$

$$\dot{p}_\phi = \frac{\partial L}{\partial \phi} = -\frac{\partial R_{noncyclic}}{\partial \phi} = 0$$

that is, $p_\phi$ is a constant of motion.
Hamilton’s equations of motion give

$$\dot{\theta} = \frac{\partial R_{noncyclic}}{\partial p_\theta} = \frac{p_\theta}{m b^2} \tag{b}$$

$$-\dot{p}_\theta = \frac{\partial R_{noncyclic}}{\partial \theta} = -\frac{p_\phi^2 \cos\theta}{m b^2 \sin^3\theta} + mgb\sin\theta \tag{c}$$

Equation (b) gives that

$$\frac{d}{dt}\dot{\theta} = \ddot{\theta} = \frac{\dot{p}_\theta}{m b^2}$$

Inserting this into equation (c) gives

$$\ddot{\theta} - \frac{p_\phi^2 \cos\theta}{m^2 b^4 \sin^3\theta} + \frac{g}{b}\sin\theta = 0$$

which is identical to the equation of motion (b) of example 8.7 derived using $R_{cyclic}$. The Hamiltonian in the rotating frame
is a constant of motion given by equation (a), but it does not include the total energy.
Note that these examples show that both forms of the Routhian, as well as the complete Lagrangian
formalism, shown in example 6.10, and the complete Hamiltonian formalism, shown in example 8.6, all give the
same equations of motion. This illustrates that Lagrangian, Hamiltonian, and Routhian mechanics all
give the same equations of motion, and this applies both in the static inertial frame and in a rotating frame
since the Lagrangian, Hamiltonian and Routhian all are scalars under rotation, that is, they are rotationally
invariant.
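A point that is easy to miss in this example is that $R_{noncyclic}$ is a function of $(\theta, p_\theta, \dot{\phi})$, so the partial derivative $\partial R_{noncyclic}/\partial\theta$ is taken at fixed $\dot{\phi}$, not at fixed $p_\phi$. The following sketch illustrates this numerically under assumed parameter values ($m = b = 1$, $g = 9.81$) and hypothetical function names: it checks that $R_{noncyclic} = H - p_\phi\dot{\phi}$ and that $-\partial R_{noncyclic}/\partial\theta$ at fixed $\dot{\phi}$ reproduces the $\dot{p}_\theta$ obtained from the full Hamiltonian.

```python
import math

# Illustrative parameter values (assumptions, not taken from the text)
M, B, G = 1.0, 1.0, 9.81


def hamiltonian(theta, p_theta, p_phi):
    """Full Hamiltonian of the spherical pendulum."""
    return (p_theta**2 / (2 * M * B**2)
            + p_phi**2 / (2 * M * B**2 * math.sin(theta)**2)
            - M * G * B * math.cos(theta))


def routhian_noncyclic(theta, p_theta, phi_dot):
    """Non-cyclic Routhian expressed in terms of (theta, p_theta, phi_dot)."""
    return (p_theta**2 / (2 * M * B**2)
            - 0.5 * M * B**2 * math.sin(theta)**2 * phi_dot**2
            - M * G * B * math.cos(theta))


def p_phi_of(theta, phi_dot):
    """Conjugate momentum p_phi = m b^2 sin^2(theta) phi_dot."""
    return M * B**2 * math.sin(theta)**2 * phi_dot


def p_theta_dot_from_routhian(theta, p_theta, phi_dot, h=1e-6):
    """Hamilton's equation -p_theta_dot = dR/dtheta, taken at fixed phi_dot."""
    dR_dtheta = (routhian_noncyclic(theta + h, p_theta, phi_dot)
                 - routhian_noncyclic(theta - h, p_theta, phi_dot)) / (2 * h)
    return -dR_dtheta


def p_theta_dot_from_hamiltonian(theta, p_phi):
    """p_theta_dot = -dH/dtheta, taken at fixed p_phi."""
    return (p_phi**2 * math.cos(theta) / (M * B**2 * math.sin(theta)**3)
            - M * G * B * math.sin(theta))
```

Both routes give the same $\dot{p}_\theta$ once $p_\phi$ and $\dot{\phi}$ are related through `p_phi_of`, which is the numerical counterpart of the statement that the two formalisms lead to the same equation of motion.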

8.9 Example: Single particle moving in a vertical plane under the influence of
an inverse-square central force
The Lagrangian for a single particle of mass $m$, moving in a vertical plane and subject to an inverse-square
central force, is specified by two generalized coordinates, $r$ and $\theta$:

$$L = \frac{m}{2}\left(\dot{r}^2 + r^2 \dot{\theta}^2\right) + \frac{k}{r}$$

The ignorable coordinate is $\theta$ since it is cyclic. Let the constant conjugate momentum be denoted by
$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = m r^2 \dot{\theta}$. Then the corresponding cyclic Routhian is

$$R_{cyclic}(r, \theta, \dot{r}, p_\theta) = p_\theta \dot{\theta} - L = \frac{p_\theta^2}{2 m r^2} - \frac{1}{2} m \dot{r}^2 - \frac{k}{r}$$

This Routhian is the equivalent one-dimensional potential $U_{eff}(r)$ minus the kinetic energy of radial motion.
Applying Hamilton’s equations to the cyclic coordinate $\theta$ gives

$$\dot{p}_\theta = 0 \qquad \dot{\theta} = \frac{p_\theta}{m r^2}$$

implying a solution

$$p_\theta = m r^2 \dot{\theta} = l$$

where the angular momentum $l$ is a constant.

The Lagrange-Euler equation can be applied to the non-cyclic coordinate $r$

$$\Lambda_r R_{cyclic} = \frac{d}{dt} \frac{\partial R_{cyclic}}{\partial \dot{r}} - \frac{\partial R_{cyclic}}{\partial r} = 0$$

where the negative sign of $R_{cyclic}$ cancels. This leads to the radial solution

$$m\ddot{r} - \frac{p_\theta^2}{m r^3} + \frac{k}{r^2} = 0$$

where $p_\theta = l$, which is a constant of motion in the centrifugal term. Thus the problem has been reduced to a
one-dimensional problem in radius $r$ that is in a rotating frame of reference.
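The reduced one-dimensional radial problem can be explored with a few lines of code. The sketch below is illustrative only: the function names and the default values $m = k = 1$ are assumptions. It evaluates the cyclic Routhian, the effective potential $U_{eff}(r) = p_\theta^2/2mr^2 - k/r$, and the radial acceleration, and returns the radius of the circular orbit for which the centrifugal and attractive terms balance.

```python
def routhian_cyclic(r, r_dot, p_theta, m=1.0, k=1.0):
    """Cyclic Routhian for the inverse-square central force problem."""
    return p_theta**2 / (2 * m * r**2) - 0.5 * m * r_dot**2 - k / r


def effective_potential(r, p_theta, m=1.0, k=1.0):
    """Equivalent one-dimensional potential: centrifugal barrier plus -k/r."""
    return p_theta**2 / (2 * m * r**2) - k / r


def r_ddot(r, p_theta, m=1.0, k=1.0):
    """Radial acceleration from the Lagrange equation applied to the Routhian."""
    return p_theta**2 / (m**2 * r**3) - k / (m * r**2)


def circular_radius(p_theta, m=1.0, k=1.0):
    """Radius at which r_ddot = 0, that is, the minimum of the effective potential."""
    return p_theta**2 / (m * k)
```

For a given angular momentum, `r_ddot(circular_radius(p_theta), p_theta)` vanishes, and adding the radial kinetic energy $\frac{1}{2}m\dot{r}^2$ back to `routhian_cyclic` recovers `effective_potential`.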

8.7 Variable-mass systems

Lagrangian and Hamiltonian mechanics assume that the total mass and energy of the system are conserved.
Variable-mass systems involve transferring mass and energy between donor and receptor bodies. However,
such systems still can be conservative if the Lagrangian or Hamiltonian includes all the active degrees of
freedom for the combined donor-receptor system. The following examples of variable-mass systems illustrate
subtle complications that occur when handling such problems using algebraic mechanics.

8.7.1 Rocket propulsion:

Newtonian mechanics was used to solve the rocket problem in chapter 2.12.6. The equation of motion
(2.113) relating the rocket thrust $F_{ex}$ to the rate of change of the momentum separated into two terms,

$$F_{ex} = \dot{p}_y = m\ddot{y} + \dot{m}\dot{y} \tag{8.69}$$

The first term is the usual mass times acceleration, while the second term arises from the rate of change of
mass times the velocity. The equation of motion for rocket motion is easily derived using either Lagrangian
or Hamiltonian mechanics by relating the rocket thrust to the generalized force $Q_j^{EXC}$.

8.7.2 Moving chains:

The motion of a flexible, frictionless, heavy chain that is falling in a gravitational field often can be split into
two coupled variable-mass partitions that have different chain-link velocities. These partitions are coupled
at the moving intersection between the chain partitions. That is, these partitions share time-dependent
fractions of the total chain mass. Moving chains were discussed first by Cayley in 1857 [Cay1857] and since
then the moving chain problem has had a controversial history due to the frequent erroneous assumption
that, in the gravitational field, the chain partitions fall with acceleration $g$ rather than applying the correct
energy conservation assumption for this conservative system. The following two examples of conservative
falling-chain systems illustrate solutions obtained using variational principles applied to a single chain that
is partitioned into two variable length sections.$^{1}$
Consider the following two possible scenarios for motion of a flexible, heavy, frictionless chain located in
a uniform gravitational field $g$. The first scenario is the “folded chain” system which assumes that one end
of the chain is held fixed, while the adjacent free end is released at the same altitude as the top of the fixed
arm, and this free end is allowed to fall in the constant gravitational field $g$. The second, “falling chain”,
scenario assumes that one end of the chain is hanging down through a hole in a frictionless, smooth, rigid,
horizontal table, with the stationary partition of the chain sitting on the table surrounding the hole. The
falling section of this chain is being pulled out of the stationary pile by the hanging partition. Both of these
systems are conservative since it is assumed that the total mass of the chain is fixed, and no dissipative forces
are acting. The chains are assumed to be inextensible, flexible, and frictionless, and subject to a uniform
gravitational field $g$ in the vertical $y$ direction. In both examples, the chain, with mass $M$ and length $L$, is
partitioned into a stationary segment plus a moving segment, where the mass per unit length of the chain
is $\mu = M/L$. These partitions are strongly coupled at their intersection, which propagates downward with time
for the “folded chain” and propagates upward, relative to the lower end of the falling chain, for the “falling
chain”. For the “folded chain”, the chain links are transferred from the moving segment to the stationary
segment as the moving section falls. By contrast, for the “falling chain”, the chain links are transferred
from the stationary upper section to the moving lower segment of the chain.

8.10 Example: Folded chain

The folded chain of length $L$ and mass-per-unit-length $\mu = M/L$ hangs vertically downwards in a
gravitational field $g$ with both ends held initially at the same height. The fixed end is attached to a fixed
support while the free end of the chain is dropped at time $t = 0$ with the free end at the same height and
adjacent to the fixed end. Let $y$ be the distance the falling free end is below the fixed end. Using an
idealized one-dimensional assumption, the Lagrangian $\mathcal{L}$ is given by

$$\mathcal{L}(y, \dot{y}) = \frac{\mu}{4}(L - y)\dot{y}^2 + \frac{1}{4}\mu g\left(L^2 + 2Ly - y^2\right) \tag{8.70}$$

[Figure: folded chain, with the fixed arm of length $(L+y)/2$ and the falling arm of length $(L-y)/2$]

where the second term is the negative of the gravitational potential energy, with $\left(L^2 + 2Ly - y^2\right)/4L$ being the
distance of the center of mass of the folded chain below the fixed upper end of the chain.
The Hamiltonian is given by

$$H(y, p_R) = p_R \dot{y} - \mathcal{L}(y, \dot{y}) = \frac{p_R^2}{\mu(L - y)} - \mu g\frac{\left(L^2 + 2Ly - y^2\right)}{4} \tag{8.71}$$

where $p_R$ is the linear momentum of the right-hand arm of the folded chain.
As shown in the discussion of the Generalized Energy Theorem (chapters 7.8 and 7.9), when all the
active forces are included in the Lagrangian and the Hamiltonian, then the total mechanical energy $E$ is
given by $E = H$. Moreover, both the Lagrangian and the Hamiltonian are time independent, since

$$\frac{dH}{dt} = \frac{\partial H}{\partial t} = -\frac{\partial \mathcal{L}}{\partial t} = 0 \tag{8.72}$$

Therefore the “folded chain” Hamiltonian equals the total energy, which is a constant of motion. Energy
conservation for this system can be used to give
1 Discussions with Professor Frank Wolfs stimulated inclusion of these two examples of moving chains.

 1 1
( − ) ̇ 2 − (2 + 2 −  2 ) = − 2 (8.73)
4 4 4
Solve for ̇ 2 gives
(2 −  2 )
̇ 2 =  (8.74)
−
The acceleration of the falling arm, ̈ is given by taking the time derivative of equation 874
¡ ¢
 2 −  2
̈ =  + (8.75)
2 ( − )

The rate of change in linear momentum for the moving right side of the chain, ̇ , is given by

(2 −  2 )
̇ =  ̈ + ̇ ̇ =   +   (8.76)
2 ( − )

For this energy-conserving chain, the tension in the chain 0 at the fixed end of the chain is given by
 1
0 = ( + ) + ̇ 2 (8.77)
2 4
Equations 874 and 876, imply that the tension  diverges to infinity when  → . Calkin and March
measured the  dependence of the chain tension at the support for the folded chain and observed the predicted
 dependence. The maximum tension was ' 25  which is consistent with that predicted using equation 877
after taking into account the finite size and mass of individual links in the chain. This result is very diﬀerent
from that obtained using the erroneous assumption that the right arm falls with the free-fall acceleration ,
which implies a maximum tension 0 = 2 . Thus the free-fall assumption disagrees with the experimental
results, in addition to violating energy conservation and the tenets of Lagrangian and Hamiltonian mechanics.
That is, the experimental result demonstrates unambiguously that the energy conservation predictions apply
in contradiction with the erroneous free-fall assumption.
The unusual feature of variable mass problems, such as the folded chain problem, is that the rate of change
of momentum in equation 876 includes two contributions to the force and rate of change of momentum,
that is, it includes both the acceleration term  ̈ plus the variable mass term ̇ ̇ that accounts for the
transfer of matter at the intersection of the moving and stationary partitions of the chain. At the transition
point of the chain, moving links are transferred from the moving section and are added to the stationary
subsection. Since this moving section is falling downwards, and the stationary section is stationary, then the
transferred momentum is in a downward direction corresponding to an increased eﬀective downward force.
Thus the measured acceleration of the moving arm actually is faster than . A related phenomenon is the
loud cracking sound heard when cracking a whip.
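The energy-conserving predictions for the folded chain are simple to tabulate. The following sketch evaluates equations 8.74, 8.75 and 8.77 as functions of $y$; the function name and the default values $M = L = 1$, $g = 9.81$ are illustrative assumptions. It can be used to confirm that the acceleration equals $g$ at release, exceeds $g$ afterwards, and that the support tension grows far beyond the free-fall prediction $2Mg$ as $y \to L$.

```python
def folded_chain(y, M=1.0, L=1.0, g=9.81):
    """Return (y_dot_squared, y_ddot, tension) for the folded chain, 0 <= y < L.

    M, L, g are illustrative values; mu = M / L is the mass per unit length.
    """
    mu = M / L
    # Equation 8.74: speed squared of the falling arm from energy conservation
    y_dot_sq = g * (2 * L * y - y**2) / (L - y)
    # Equation 8.75: acceleration of the falling arm
    y_ddot = g + g * (2 * L * y - y**2) / (2 * (L - y)**2)
    # Equation 8.77: tension at the fixed support
    tension = 0.5 * mu * g * (L + y) + 0.25 * mu * y_dot_sq
    return y_dot_sq, y_ddot, tension
```

Because $\ddot{y} = \frac{d}{dy}\left(\frac{1}{2}\dot{y}^2\right)$, a finite difference of the first returned value with respect to $y$ provides an independent check of the second.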

8.11 Example: Falling chain

The “falling chain” scenario assumes that one end of the chain is hanging down through a hole in a
frictionless, smooth, rigid, horizontal table, with the stationary partition of the chain lying on the
frictionless table surrounding the hole. The falling section of this chain is being pulled out of
the stationary pile by the hanging partition. The falling chain behaves differently from the folded chain.
For the “falling chain” let $y$ be the falling distance of the lower end of the chain measured
with respect to the table top. The Lagrangian and Hamiltonian are given by

$$\mathcal{L}(y, \dot{y}) = \frac{\mu}{2} y\dot{y}^2 + \mu g\frac{y^2}{2} \tag{8.78}$$

$$p_y = \frac{\partial \mathcal{L}}{\partial \dot{y}} = \mu y\dot{y} \tag{8.79}$$

$$H = \frac{p_y^2}{2\mu y} - \frac{\mu g y^2}{2} = E \tag{8.80}$$

[Figure: falling chain hanging a distance $y$ below the table top]

The Lagrangian and Hamiltonian are not explicitly time dependent, and the Hamiltonian equals the initial
total energy, $E_0$. Thus energy conservation can be used to give that

$$E = \frac{1}{2}\mu y\left(\dot{y}^2 - gy\right) = E_0 \tag{8.81}$$

Lagrange’s equation of motion gives

$$\dot{p}_y = m_y\ddot{y} + \dot{m}_y\dot{y} = m_y g + \frac{1}{2}\mu\dot{y}^2 \tag{8.82}$$

where $m_y = \mu y$. Since $\dot{m}_y\dot{y} = \mu\dot{y}^2$, this is equivalent to $m_y\ddot{y} = m_y g - T_0$ with $T_0 = \frac{1}{2}\mu\dot{y}^2$.
The important difference between the folded chain and the falling chain is that the moving component of the
falling chain is gaining mass with time rather than losing mass. Also the tension in the chain $T_0$ reduces the
acceleration of the falling chain, making it less than the free-fall value $g$. This is in contrast to the
folded chain system where the acceleration exceeds $g$.
The above discussion shows that Lagrangian and Hamiltonian mechanics can be applied to variable-mass systems if
both the donor and receptor degrees of freedom are included to ensure that the total mass is conserved.
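The behaviour of the falling chain can be sketched numerically from the Lagrangian 8.78 and the energy relation 8.81. The sketch below assumes, for illustration only, that the chain is released at rest with an initial hanging length $y_0 > 0$, so that $E_0 = -\frac{1}{2}\mu g y_0^2$ and $\dot{y}^2 = g\left(y^2 - y_0^2\right)/y$. Applying Lagrange’s equation to 8.78 gives $\ddot{y} = g - \dot{y}^2/2y$. The function names and the default $g = 9.81$ are assumptions.

```python
def speed_squared(y, y0, g=9.81):
    """y_dot^2 from energy conservation for a chain released at rest at y0 > 0."""
    return g * (y**2 - y0**2) / y


def acceleration(y, y0, g=9.81):
    """y_ddot from Lagrange's equation applied to the falling-chain Lagrangian.

    d/dt(mu y y_dot) = mu y_dot^2 / 2 + mu g y  gives  y_ddot = g - y_dot^2 / (2 y).
    """
    return g - speed_squared(y, y0, g) / (2 * y)
```

Under these assumptions the acceleration starts at $g$ when $y = y_0$ and decreases towards $g/2$ as $y$ becomes large compared with $y_0$, which is consistent with the statement that the falling chain accelerates more slowly than free fall.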

8.8 Summary
Hamilton’s equations of motion
Inserting the generalized momentum into Jacobi’s generalized energy relation was used to define the
Hamiltonian function to be

$$H(\mathbf{q}, \mathbf{p}, t) = \mathbf{p} \cdot \dot{\mathbf{q}} - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{8.3}$$

The Legendre transform of the Lagrange-Euler equations led to Hamilton’s equations of motion.

$$\dot{q}_j = \frac{\partial H}{\partial p_j} \tag{8.25}$$

$$\dot{p}_j = -\frac{\partial H}{\partial q_j} + \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \tag{8.26}$$

The generalized energy equation 7.38 gives the time dependence

$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \left( \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \dot{q}_j \right) - \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial t} \tag{8.27}$$

where

$$\frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} \tag{8.24}$$

The $p_j, q_j$ are treated as independent canonical variables. Lagrange was the first to derive the canonical
equations but he did not recognize them as a basic set of equations of motion. Hamilton derived the canonical
equations of motion from his fundamental variational principle and made them the basis for a far-reaching
theory of dynamics. Hamilton’s equations give $2n$ first-order differential equations for $p_j, q_j$ for each of the
$n$ degrees of freedom. Lagrange’s equations give $n$ second-order differential equations for the variables $q_j, \dot{q}_j$.
Routhian reduction technique
The Routhian reduction technique is a hybrid of Lagrangian and Hamiltonian mechanics that exploits
the advantages of both approaches for solving problems involving cyclic variables. It is especially useful for
solving motion in rotating systems in science and engineering. Two Routhians are used frequently for solving
the equations of motion of rotating systems. Assuming that the variables between $1 \le i \le s$ are non-cyclic,
while the $m$ variables between $s+1 \le i \le n$ are ignorable cyclic coordinates, then the two Routhians are:

$$R_{cyclic}(q_1,\dots,q_n;\ \dot{q}_1,\dots,\dot{q}_s;\ p_{s+1},\dots,p_n;\ t) = \sum_{cyclic}^{m} p_i \dot{q}_i - L = H - \sum_{noncyclic}^{s} p_i \dot{q}_i \tag{8.65}$$

$$R_{noncyclic}(q_1,\dots,q_n;\ p_1,\dots,p_s;\ \dot{q}_{s+1},\dots,\dot{q}_n;\ t) = \sum_{noncyclic}^{s} p_i \dot{q}_i - L = H - \sum_{cyclic}^{m} p_i \dot{q}_i \tag{8.68}$$

The Routhian $R_{cyclic}$ is a negative Lagrangian for the non-cyclic variables between $1 \le i \le s$, where
$s = n - m$, and is a Hamiltonian for the $m$ cyclic variables between $s+1 \le i \le n$. Since the cyclic
variables are constants of the Hamiltonian, their solution is trivial, and the number of variables included in
the Lagrangian is reduced from $n$ to $s = n - m$. The Routhian $R_{cyclic}$ is useful for solving some problems in
classical mechanics. The Routhian $R_{noncyclic}$ is a Hamiltonian for the non-cyclic variables between $1 \le i \le s$,
and is a negative Lagrangian for the $m$ cyclic variables between $s+1 \le i \le n$. Since the cyclic variables
are constants of motion, the Routhian $R_{noncyclic}$ also is a constant of motion but it does not equal the total
energy since the coordinate transformation is time dependent. The Routhian $R_{noncyclic}$ is especially valuable
for solving rotating many-body systems such as galaxies, molecules, or nuclei, since the Routhian $R_{noncyclic}$
is the Hamiltonian in the rotating body-fixed coordinate frame.
Variable mass systems:
Two examples of heavy flexible chains falling in a uniform gravitational field were used to illustrate
how variable mass systems can be handled using Lagrangian and Hamiltonian mechanics. The falling-mass
system is conservative assuming that both the donor plus the receptor body systems are included.
Comparison of Lagrangian and Hamiltonian mechanics
Lagrangian and Hamiltonian dynamics are two powerful and related variational algebraic formulations
of mechanics that are based on Hamilton’s action principle. They can be applied to any conservative degrees
of freedom as discussed in chapters 6, 8, and 15. Lagrangian and Hamiltonian mechanics both concentrate
solely on active forces and can ignore internal forces. They can handle many-body systems and allow
convenient generalized coordinates of choice, which is impractical or impossible using Newtonian
mechanics. Thus it is natural to compare the relative advantages of these two algebraic formalisms in order
to decide which should be used for a specific problem.
For a system with $n$ generalized coordinates, plus $m$ constraint forces that are not required to be known,
the Lagrangian approach, using a minimal set of generalized coordinates, reduces to only $s = n - m$
second-order differential equations and unknowns, compared to the Newtonian approach where there are
$n + m$ unknowns. Alternatively, use of Lagrange multipliers allows determination of the constraint forces,
resulting in $n + m$ second-order equations and unknowns. The Lagrangian potential function is limited
to conservative forces, Lagrange multipliers can be used to handle holonomic forces of constraint, while
generalized forces can be used to handle non-conservative and non-holonomic forces. The advantage of the
Lagrange equations of motion is that they can deal with any type of force, conservative or non-conservative,
and they directly determine $q$, $\dot{q}$ rather than $q$, $p$ which then requires relating $p$ to $\dot{q}$.
For a system with $n$ generalized coordinates, the Hamiltonian approach determines $2n$ first-order
differential equations which are easier to solve than second-order equations. However, the $2n$ solutions must be
combined to determine the equations of motion. The Hamiltonian approach is superior to the Lagrange
approach in its ability to obtain an analytical solution of the integrals of the motion. Hamiltonian dynamics also
has a means of determining the unknown variables for which the solution assumes a soluble form. Important
applications of Hamiltonian mechanics are to quantum mechanics and statistical mechanics, where quantum
analogs of $q_i$ and $p_i$ can be used to relate to the fundamental variables of Hamiltonian mechanics. This
does not apply for the variables $q_i$ and $\dot{q}_i$ of Lagrangian mechanics. The Hamiltonian approach is especially
powerful when the system has $m$ cyclic variables, since then the $m$ conjugate momenta $p_i$ are constants. Thus the
$m$ conjugate variables $(q_i, p_i)$ can be factored out of the Hamiltonian, which reduces the number of conjugate
variables required to $n - m$. This is not possible using the Lagrangian approach since, even though the $m$
coordinates $q_i$ can be factored out, the velocities $\dot{q}_i$ still must be included, thus the $n$ conjugate variables
must be included. The Lagrange approach is advantageous for obtaining a numerical solution of systems in
classical mechanics. However, Hamiltonian mechanics expresses the variables in terms of the fundamental
canonical variables $(\mathbf{q}, \mathbf{p})$ which provides a more fundamental insight into the underlying physics.$^{2}$

2 Recommended reading: "Classical Mechanics" H. Goldstein, Addison-Wesley, Reading (1950). The present chapter

closely follows the notation used by Goldstein to facilitate cross-referencing and reading the many other textbooks that have
adopted this notation.

Workshop exercises
1. A block of mass  rests on an inclined plane making an angle  with the horizontal. The inclined plane (a
triangular block of mass  ) is free to slide horizontally without friction. The block of mass  is also free to
slide on the larger block of mass  without friction.

(a) Construct the Lagrangian function.

(b) Derive the equations of motion for this system.
(c) Calculate the canonical momenta.
(d) Construct the Hamiltonian function.
(e) Find which of the two momenta found in part (c) is a constant of motion and discuss why it is so. If the
two blocks start from rest, what is the value of this constant of motion?

2. Discuss among yourselves the following four conditions that can exist for the Hamiltonian and give several
examples of systems exhibiting each of the four conditions.

(a) The Hamiltonian is conserved and equals the total mechanical energy
(b) The Hamiltonian is conserved but does not equal the total mechanical energy
(c) The Hamiltonian is not conserved but does equal the total mechanical energy
(d) The Hamiltonian is not conserved and does not equal the mechanical total energy.

3. A block of mass  rests on an inclined plane making an angle  with the horizontal. The inclined plane (a
triangular block of mass  ) is free to slide horizontally without friction. The block of mass  is also free to
slide on the larger block of mass  without friction.

(a) Construct the Lagrangian function.

4. Discuss among yourselves the following four conditions that can exist for the Hamiltonian and give several
examples of systems exhibiting each of the four conditions.
a) The Hamiltonian is conserved and equals the total mechanical energy
b) The Hamiltonian is conserved but does not equal the total mechanical energy
c) The Hamiltonian is not conserved but does equal the total mechanical energy
d) The Hamiltonian is not conserved and does not equal the mechanical total energy

5. Compare the Lagrangian formalism and the Hamiltonian formalism by creating a two-column chart. Label one
side “Lagrangian” and the other side “Hamiltonian” and discuss the similarities and differences. Here are some
ideas to get you started:

• What are the basic variables in each formalism?

• What are the form and number of the equations of motion derived in each case?
• How does the Lagrangian “state space” compare to the Hamiltonian “phase space”?

6. It can be shown that if ( ̇ ) is the Lagrangian of a particle moving in one dimension, then  = 0 where
0 ( ̇ ) = ( ̇ ) + 
 and  ( ) is an arbitrary function. This problem explores the consequences of
this on the Hamiltonian formalism.

(a) Relate the new canonical momentum 0 , for 0 , to the old canonical momentum , for .
(b) Express the new Hamiltonian  0 ( 0  0  ) for 0 in terms of the old Hamiltonian (  ) and  .
(c) Explicitly show that the new Hamilton’s equations for  0 are equivalent to the old Hamilton’s equations
for  .

7. A massless hoop of radius $R$ is rotating about an axis perpendicular to its central axis at constant angular
velocity $\omega$. A mass $m$ can freely slide around the hoop.

(a) Determine the Lagrangian of the system.

(b) Determine the Hamiltonian of the system. Does it equal the total mechanical energy?
(c) Determine the Lagrangian of the system with respect to a coordinate frame in which $H = T + U_{\mathrm{eff}}$. What
is $U_{\mathrm{eff}}$? What force generates the additional term in $U_{\mathrm{eff}}$?

8. Consider a pendulum of length $L$ attached to the end of a rod of length $l$. The rod is rotating at constant
angular velocity $\omega$ in the plane. Assume the pendulum is always taut.

(a) Determine the equations of motion.

(b) For what value of $\omega^2 l$ is this system the same as a plane pendulum in a constant gravitational field?
(c) Show that $H \neq E$. What is the reason?

Problems
1. A particle of mass $m$ in a gravitational field slides on the inside of a smooth parabola of revolution whose axis is
vertical. Using the distance from the axis $r$ and the azimuthal angle $\varphi$ as generalized coordinates, find the following.
a) The Lagrangian of the system.
b) The generalized momenta and the corresponding Hamiltonian.
c) The equation of motion for the coordinate $r$ as a function of time.
d) If $\frac{d\varphi}{dt} = 0$, show that the particle can execute small oscillations about the lowest point of the paraboloid and
find the frequency of these oscillations.

2. Consider a particle of mass $m$ which is constrained to move on the surface of a sphere of radius $R$. There are no
external forces of any kind acting on the particle.
a) What is the number of generalized coordinates necessary to describe the problem?
b) Choose a set of generalized coordinates and write the Lagrangian of the system.
c) What is the Hamiltonian of the system? Is it conserved?
d) Prove that the motion of the particle is along a great circle of the sphere.
3. A block of mass $m$ is attached to a wedge of mass $M$ by a spring with spring constant $k$. The inclined frictionless
surface of the wedge makes an angle $\alpha$ to the horizontal. The wedge is free to slide on a horizontal frictionless surface
as shown in the figure.
a) Given that the relaxed length of the spring is $d$, find the spring length $s_0$ when both block and wedge are stationary.
b) Find the Lagrangian for the system as a function of the $x$ coordinate of the wedge and the length of the spring $s$.
Write down the equations of motion.
c) What is the natural frequency of vibration?

4. A fly-ball governor comprises two masses $m$ connected by 4 hinged arms of length $l$ to a vertical shaft and to a
mass $M$ which can slide up or down the shaft without friction in a uniform vertical gravitational field as shown in
the figure. The assembly is constrained to rotate around the axis of the vertical shaft with the same angular velocity as
that of the vertical shaft. Neglect the mass of the arms and air friction, and assume that the mass $M$ has a negligible
moment of inertia. Assume that the whole system is constrained to rotate with a constant angular velocity $\omega_0$.
a) Choose suitable coordinates and use the Lagrangian to derive equations of motion of the system around the
equilibrium position.
b) Determine the height $z$ of the mass $M$ above its lowest position as a function of $\omega_0$.
c) Find the frequency of small oscillations about this steady motion.
d) Derive a Routhian that provides the Hamiltonian in the rotating system.
e) Is the total energy of the fly-ball governor in the rotating frame of reference constant in time?
f) Suppose that the shaft and assembly are not constrained to rotate at a constant angular velocity $\omega_0$, that is,
it is allowed to rotate freely at angular velocity $\dot{\varphi}$. What is the difference in the overall motion?

5. A rigid, straight, frictionless, massless rod rotates about the $z$ axis at an angular velocity $\dot{\theta}$. A mass $m$ slides
along the frictionless rod and is attached to the rod by a massless spring of spring constant $k$.
a) Derive the Lagrangian and the Hamiltonian.
b) Derive the equations of motion in the stationary frame using Hamiltonian mechanics.
c) What are the constants of motion?
d) If the rotation is constrained to have a constant angular velocity $\dot{\theta} = \omega$, then is the non-cyclic Routhian
$R_{noncyclic} = H - p_\theta \dot{\theta}$ a constant of motion, and does it equal the total energy?
e) Use the non-cyclic Routhian $R_{noncyclic}$ to derive the radial equation of motion in the rotating frame of reference
for the cranked system with $\dot{\theta} = \omega$.

6. A thin uniform rod of length $2L$ and mass $M$ is suspended from a massless string of length $l$ tied to a nail. Initially
the rod hangs vertically. A weak horizontal force $F$ is applied to the rod’s free end.
a) Write the Lagrangian for this system.
b) For very short times such that all angles are small, determine the angles that the string and the rod make with
the vertical. Start from rest at $t = 0$.
c) Draw a diagram to illustrate the initial motion of the rod.

7. A uniform ladder of mass $M$ and length $2L$ is leaning against a frictionless vertical wall with its feet on a
frictionless horizontal floor. Initially the stationary ladder is released at an angle $\theta_0 = 60^\circ$ to the floor. Assume
that the gravitational field $g = 9.81\ \mathrm{m/s^2}$ acts vertically downward and that the moment of inertia of the ladder about its
midpoint is $I = \frac{1}{3} M L^2$.
a) Derive the Lagrangian.
b) Derive the Hamiltonian.
c) Explain if the Hamiltonian is conserved and/or if it equals the total energy.
d) Use the Lagrangian to derive the equations of motion.
e) Derive the angle $\theta$ at which the ladder loses contact with the vertical wall.

8. The classical mechanics exam induces Jacob to try his hand at bungee jumping. Assume Jacob’s mass $m$
is suspended in a gravitational field by the bungee of unstretched length $b$ and spring constant $k$. Besides the
longitudinal oscillations due to the bungee jump, Jacob also swings with plane pendulum motion in a vertical plane.
Use polar coordinates $r, \theta$, neglect air drag, and assume that the bungee always is under tension.
a) Derive the Lagrangian.
b) Determine Lagrange’s equation of motion for angular motion and identify by name the forces contributing to
the angular motion.
c) Determine Lagrange’s equation of motion for radial oscillation and identify by name the forces contributing to
the tension in the spring.
d) Derive the generalized momenta.
e) Determine the Hamiltonian and give all of Hamilton’s equations of motion.
Chapter 9

Hamilton’s Action Principle

9.1 Introduction
Hamilton’s principle of stationary action was introduced in two papers published by Hamilton in 1834 and
1835. As mentioned in the Prologue, Hamilton’s Action Principle is the foundation of the hierarchy of three
philosophical stages that are used in applying analytical mechanics. The first stage is to use Hamilton’s
Action Principle to derive either the Hamiltonian or the Lagrangian for the system. The second stage is to use
either Lagrangian mechanics, or Hamiltonian mechanics, to derive the equations of motion for the system.
The third stage is to solve these equations of motion for the assumed initial conditions. Lagrange had
pioneered Lagrangian mechanics in 1788 based on d’Alembert’s Principle. Hamilton’s Action Principle now
underlies theoretical physics, and many other disciplines in mathematics and economics. In 1834 Hamilton
was seeking a theory of optics when he developed both his Action Principle, and the field of Hamiltonian
mechanics.
Hamilton’s Action Principle is based on defining the action functional$^1$ $S$ for $n$ generalized coordinates, which are expressed by the vector $\mathbf{q}$, and their corresponding velocity vector $\dot{\mathbf{q}}$.

$$S = \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt \tag{9.1}$$

The scalar action $S$ is a functional of the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$, integrated between an initial time $t_i$ and final time $t_f$. In principle, higher order time derivatives of the generalized coordinates could be included, but most systems in classical mechanics are described adequately by including only the generalized coordinates, plus their velocities. The definition of the action functional allows for more general Lagrangians than the simple Standard Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t) = T(\dot{\mathbf{q}}, t) - U(\mathbf{q}, t)$ that has been used throughout chapters 5-8. Hamilton stated that the actual trajectory of a mechanical system is that given by requiring that the action functional is stationary with respect to change of the variables. The action functional is stationary when the variational principle can be written in terms of a virtual infinitesimal displacement, $\delta$, to be

$$\delta S = \delta \int_{t_i}^{t_f} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt = 0 \tag{9.2}$$

Typically the stationary point corresponds to a minimum of the action functional. Applying variational calculus to the action functional leads to the same Lagrange equations of motion for systems as the equations derived using d’Alembert’s Principle, if the additional generalized force terms, $\sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) + Q_j^{EXC}$, are omitted in the corresponding equations of motion.
The resulting Lagrangian or Hamiltonian is used to derive the equations of motion, which then are solved for an assumed set of initial conditions. Prior to Hamilton’s Action Principle, Lagrange developed Lagrangian mechanics based on d’Alembert’s Principle, while the Newtonian equations of motion are defined in terms of Newton’s Laws of Motion.
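The stationarity condition of equation 9.2 can be checked numerically. The following is a minimal sketch, not taken from the text, that assumes a one-dimensional harmonic oscillator with unit mass and unit spring constant, the true path $x(t) = \sin t$ on $0 \le t \le 1$, and a hypothetical family of trial paths $x(t) + \epsilon \sin(\pi t)$ that share the same end points. The function name `action` and all parameter values are illustrative assumptions.

```python
import math

def action(eps, T=1.0, steps=2000):
    """Midpoint-rule estimate of S = integral of (v^2/2 - x^2/2) dt
    along the trial path x(t) = sin(t) + eps*sin(pi*t/T)."""
    dt = T / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        # Trial path and its velocity; the perturbation vanishes at t=0 and t=T.
        x = math.sin(t) + eps * math.sin(math.pi * t / T)
        v = math.cos(t) + eps * (math.pi / T) * math.cos(math.pi * t / T)
        total += (0.5 * v * v - 0.5 * x * x) * dt
    return total

S_true = action(0.0)
for eps in (-0.1, -0.01, 0.0, 0.01, 0.1):
    # The change in S is second order in eps: no linear term appears.
    print(f"eps = {eps:+.2f}   S - S_true = {action(eps) - S_true:+.3e}")
```

For this short time interval the change in the action is positive and quadratic in $\epsilon$ for both signs of $\epsilon$, which is what a stationary point that is also a minimum looks like.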
$^1$ The term "action functional" was named "Hamilton’s Principal Function" in older texts. The name usually is abbreviated to "action" in modern mechanics.


9.2 Hamilton’s Principle of Stationary Action

Hamilton’s crowning achievement was his use of the general form of Hamilton’s principle of stationary action, equation 9.2, to derive both Lagrangian mechanics and Hamiltonian mechanics. Consider the action $S_A$ for the extremum path of a system in configuration space, that is, along path $A$ for $j = 1, 2, \ldots, n$ coordinates $q_j(t_i)$ at initial time $t_i$ to $q_j(t_f)$ at a final time $t_f$, as shown in figure 9.1. Then the action $S_A$ is given by

$$S_A = \int_{t_i}^{t_f} L(\mathbf{q}(t), \dot{\mathbf{q}}(t), t)\,dt \tag{9.3}$$

Figure 9.1: Extremum path A, plus the neighboring path B, shown in configuration space.

As used in chapter 5.2, a family of neighboring paths is defined by adding an infinitesimal fraction $\epsilon$ of a continuous, well-behaved neighboring function $\eta_j$, where $\epsilon = 0$ for the extremum path. That is,

$$q_j(t, \epsilon) = q_j(t, 0) + \epsilon\,\eta_j(t) \tag{9.4}$$

In contrast to the variational case discussed when deriving Lagrangian mechanics, the variational path used here does not assume that the functions $\eta_j(t)$ vanish at the end points. Assume that the neighboring path $B$ has an action $S_B$ where

$$S_B = \int_{t_i+\Delta t}^{t_f+\Delta t} L(\mathbf{q}(t)+\delta\mathbf{q}(t),\, \dot{\mathbf{q}}(t)+\delta\dot{\mathbf{q}}(t),\, t)\,dt \tag{9.5}$$
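Expanding the integrand of $S_B$ in equation 9.5 gives that, relative to the extremum path $A$, the incremental change in action is

$$\delta S = S_B - S_A = \int_{t_i}^{t_f} \sum_j \left( \frac{\partial L}{\partial q_j}\,\delta q_j + \frac{\partial L}{\partial \dot{q}_j}\,\delta \dot{q}_j \right) dt + \left[ L\,\Delta t \right]_{t_i}^{t_f} \tag{9.6}$$

The second term in the integral can be integrated by parts since $\delta \dot{q}_j = \frac{d}{dt}\left(\delta q_j\right)$, leading to

$$\delta S = \int_{t_i}^{t_f} \sum_j \left( \frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} \right) \delta q_j\,dt + \left[ \sum_j \frac{\partial L}{\partial \dot{q}_j}\,\delta q_j + L\,\Delta t \right]_{t_i}^{t_f} \tag{9.7}$$

Note that equation 9.7 includes contributions from the entire path of the integral as well as the variations at the ends of the curve and the $\Delta t$ terms. Equation 9.7 leads to the following two pioneering principles of least action in variational mechanics that were developed by Hamilton.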

9.2.1 Stationary-action principle in Lagrangian mechanics

Derivation of Lagrangian mechanics in chapter 6 was based on the extremum path for neighboring paths between two given locations $\mathbf{q}(t_i)$ and $\mathbf{q}(t_f)$ that the system occupies at the initial and final times $t_i$ and $t_f$ respectively. For the special case where the end points do not vary, that is, when $\delta q_j(t_i) = \delta q_j(t_f) = 0$ and $\Delta t_i = \Delta t_f = 0$, the variation of the action $\delta S$ in equation 9.7 reduces for the stationary path to

$$\delta S = \int_{t_i}^{t_f} \sum_j \left( \frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} \right) \delta q_j\,dt = 0 \tag{9.8}$$

For independent generalized coordinates $q_j$, the integrand in brackets vanishes, leading to the Euler-Lagrange equations. Conversely, if the Euler-Lagrange equations in 9.8 are satisfied, then $\delta S = 0$, that is, the path is stationary. This leads to the statement that the path in configuration space between two configurations $\mathbf{q}(t_i)$ and $\mathbf{q}(t_f)$ that the system occupies at times $t_i$ and $t_f$ respectively, is that for which the action $S$ is stationary. This is a statement of Hamilton’s Principle.
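As a minimal worked illustration of equation 9.8, consider a one-dimensional harmonic oscillator. The mass $m$, spring constant $k$, and coordinate $x$ are assumptions introduced only for this example.

```latex
\begin{align*}
L(x,\dot{x}) &= \tfrac{1}{2} m \dot{x}^2 - \tfrac{1}{2} k x^2 \\
\frac{\partial L}{\partial x} &= -k x ,
\qquad
\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} = m \ddot{x} \\
\delta S &= \int_{t_i}^{t_f} \left( -k x - m \ddot{x} \right) \delta x \, dt = 0
\quad\Longrightarrow\quad
m \ddot{x} + k x = 0
\end{align*}
```

Since $\delta x$ is arbitrary between the fixed end points, the bracket must vanish at every time, which reproduces the familiar oscillator equation of motion.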

9.2.2 Stationary-action principle in Hamiltonian mechanics

Hamilton used the general variation of the least-action path to derive the basic equations of Hamiltonian mechanics. For the general path, the integral term in equation 9.7 vanishes because the Euler-Lagrange equations are obeyed for the stationary path. Thus the only remaining non-zero contributions are due to the end point terms, which can be written by defining the total variation of each end point to be

$$\Delta q_j = \delta q_j + \dot{q}_j\,\Delta t \tag{9.9}$$

where $\delta q_j$ and $\dot{q}_j$ are evaluated at $t_i$ and $t_f$. Then equation 9.7 reduces to

$$\delta S = \left[ \sum_j \frac{\partial L}{\partial \dot{q}_j}\,\delta q_j + L\,\Delta t \right]_{t_i}^{t_f} = \left[ \sum_j \frac{\partial L}{\partial \dot{q}_j}\,\Delta q_j + \left( -\sum_j \frac{\partial L}{\partial \dot{q}_j}\,\dot{q}_j + L \right) \Delta t \right]_{t_i}^{t_f} \tag{9.10}$$

Since the generalized momentum $p_j = \frac{\partial L}{\partial \dot{q}_j}$, equation 9.10 can be expressed in terms of the Hamiltonian and generalized momentum as

$$\delta S = \left[ \sum_j p_j\,\Delta q_j - H\,\Delta t \right]_{t_i}^{t_f} = \left[ \mathbf{p}\cdot\Delta\mathbf{q} - H\,\Delta t \right]_{t_i}^{t_f} \tag{9.11}$$

$$\frac{\partial S}{\partial q_j} = p_j = \frac{\partial L}{\partial \dot{q}_j} \tag{9.12}$$

Equation 9.11 contains Hamilton’s Principle of Least-action. Equation 9.12 gives an alternative relation of the generalized momentum $p_j$ that is expressed in terms of the action functional $S$. Note that equations 9.11 and 9.12 were derived directly without invoking reference to the Lagrangian.
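Integrating the action $\delta S$, equation 9.11, between the end points gives the action for the path between $t = t_i$ and $t = t_f$, that is, $S(q_j(t_i), t_i, q_j(t_f), t_f)$, to be

$$S(q_j(t_i), t_i, q_j(t_f), t_f) = \int_{t_i}^{t_f} \left[ \mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t) \right] dt \tag{9.13}$$

The stationary path is obtained by using the variational principle

$$\delta S = \delta \int_{t_i}^{t_f} \left[ \mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t) \right] dt = 0 \tag{9.14}$$

The integrand, $I = \left[ \mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t) \right]$, in this modified Hamilton’s principle, can be used in the $n$ Euler-Lagrange equations for $j = 1, 2, 3, \ldots, n$ to give

$$\frac{d}{dt}\left( \frac{\partial I}{\partial \dot{q}_j} \right) - \frac{\partial I}{\partial q_j} = \dot{p}_j + \frac{\partial H}{\partial q_j} = 0 \tag{9.15}$$

Similarly, the other $n$ Euler-Lagrange equations give

$$\frac{d}{dt}\left( \frac{\partial I}{\partial \dot{p}_j} \right) - \frac{\partial I}{\partial p_j} = -\dot{q}_j + \frac{\partial H}{\partial p_j} = 0 \tag{9.16}$$

Thus Hamilton’s principle of least-action leads to Hamilton’s equations of motion, that is, equations 9.15 and 9.16.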
The total time derivative of the action $S$, which is a function of the coordinates and time, is

$$\frac{dS}{dt} = \frac{\partial S}{\partial t} + \sum_j \frac{\partial S}{\partial q_j}\,\dot{q}_j = \frac{\partial S}{\partial t} + \mathbf{p}\cdot\dot{\mathbf{q}} \tag{9.17}$$

But the total time derivative of equation 9.13 equals

$$\frac{dS}{dt} = \mathbf{p}\cdot\dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t) \tag{9.18}$$

Combining equations 9.17 and 9.18 gives the Hamilton-Jacobi equation, which is discussed in chapter 15.4.

$$\frac{\partial S}{\partial t} + H(\mathbf{q}, \mathbf{p}, t) = 0 \tag{9.19}$$

In summary, Hamilton’s principle of least action leads directly to Hamilton’s equations of motion (9.15, 9.16) plus the Hamilton-Jacobi equation (9.19). Note that the above discussion has derived both Hamilton’s Action Principle (9.8) and Hamilton’s equations of motion (9.15, 9.16) directly from Hamilton’s variational concept of stationary action, $S$, without explicitly invoking the Lagrangian.
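The relations $\partial S/\partial q = p$ (equation 9.12) and $\partial S/\partial t + H = 0$ (equation 9.19) can be spot-checked numerically for the simplest case. The sketch below assumes a free particle of mass $m$ moving in one dimension from $(x_0, t_0)$ to $(x, t)$, for which the action along the true path is $S = m(x - x_0)^2 / 2(t - t_0)$. The function names, the mass, and the sample point are hypothetical choices made for this illustration.

```python
def principal_function(x, t, m=2.0, x0=0.0, t0=0.0):
    """Action of a free particle that travels from (x0, t0) to (x, t)."""
    return m * (x - x0) ** 2 / (2.0 * (t - t0))

def hamilton_jacobi_residual(x, t, m=2.0, h=1e-5):
    """Return (dS/dt + H, p), with derivatives from central finite differences."""
    dS_dt = (principal_function(x, t + h, m) - principal_function(x, t - h, m)) / (2 * h)
    p = (principal_function(x + h, t, m) - principal_function(x - h, t, m)) / (2 * h)
    H = p * p / (2.0 * m)  # free-particle Hamiltonian evaluated with p = dS/dx
    return dS_dt + H, p

residual, p = hamilton_jacobi_residual(x=1.5, t=0.7)
print("dS/dt + H =", residual)   # close to zero, up to finite-difference error
print("p = dS/dx =", p)          # close to m*(x - x0)/(t - t0)
```

The residual vanishes to within the finite-difference error, and the momentum obtained from the gradient of $S$ agrees with $m \times$ (average velocity), as expected for a free particle.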

9.2.3 Abbreviated action

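Hamilton’s Action Principle determines completely the path of the motion and the position on the path as a function of time. If the Lagrangian and the Hamiltonian are time independent, that is, conservative, then $H = E$ and equation 9.13 equals

$$S(q_j(t_i), t_i, q_j(t_f), t_f) = \int_{t_i}^{t_f} \left[ \mathbf{p}\cdot\dot{\mathbf{q}} - E \right] dt = \int_{\mathbf{q}(t_i)}^{\mathbf{q}(t_f)} \mathbf{p}\cdot\delta\mathbf{q} - E\,(t_f - t_i) \tag{9.20}$$

The $\int \mathbf{p}\cdot\delta\mathbf{q}$ term in equation 9.20 is called the abbreviated action, which is defined as

$$S_0 \equiv \int_{t_i}^{t_f} \mathbf{p}\cdot\dot{\mathbf{q}}\,dt = \int_{\mathbf{q}(t_i)}^{\mathbf{q}(t_f)} \mathbf{p}\cdot\delta\mathbf{q} \tag{9.21}$$

The abbreviated action can be simplified by assuming the standard Lagrangian $L = T - U$ with a velocity-independent potential $U$. Equation 8.4 then gives

$$S_0 \equiv \int_{t_i}^{t_f} \sum_j p_j \dot{q}_j\,dt = \int_{t_i}^{t_f} (L + H)\,dt = \int_{t_i}^{t_f} 2T\,dt = \int_{\mathbf{q}(t_i)}^{\mathbf{q}(t_f)} \mathbf{p}\cdot\delta\mathbf{q} \tag{9.22}$$
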
Abbreviated action provides for use of a simplified form of the principle of least action that is based on the kinetic energy, and not potential energy. For conservative systems it determines the path of the motion, but not the time dependence of the motion. Consider virtual motions where the path satisfies energy conservation, and where the end points are held fixed, that is, $\delta q_j = 0$, but allow for a variation $\delta t$ in the final time. Then using the Hamilton-Jacobi equation, 9.19,

$$\delta S = -H\,\delta t = -E\,\delta t \tag{9.23}$$

However, equations 9.20 and 9.21 give that

$$\delta S = \delta S_0 - E\,\delta t \tag{9.24}$$

Therefore

$$\delta S_0 = 0 \tag{9.25}$$

That is, the abbreviated action is stationary, typically a minimum, with respect to all paths that satisfy the conservation of energy, which can be written as

$$\delta S_0 = \delta \int_{t_i}^{t_f} 2T\,dt = 0 \tag{9.26}$$

Equation 9.26 is Maupertuis’ least-action principle, which he proposed in 1744 based on Fermat’s Principle in optics. Credit for the formulation of least action commonly is given to Maupertuis; however, the Maupertuis principle is similar to the use of least action applied to the “vis viva”, as was proposed by Leibniz four decades earlier. Maupertuis used teleological arguments, rather than scientific rigor, because of his limited mathematical capabilities. In 1744 Euler provided a scientifically rigorous argument, presented above, that underlies the Maupertuis principle. Euler derived the correct variational relation for the abbreviated action to be

$$\delta S_0 = \delta \int_{\mathbf{q}(t_i)}^{\mathbf{q}(t_f)} \sum_j p_j\,\delta q_j = 0 \tag{9.27}$$

Hamilton’s use of the principle of least action to derive both Lagrangian and Hamiltonian mechanics is a remarkable accomplishment. It underlies both Lagrangian and Hamiltonian mechanics and confirmed the conjecture of Maupertuis.
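The equalities in equations 9.20 to 9.22 can be verified numerically for a simple conservative system. The sketch below assumes a one-dimensional harmonic oscillator followed over exactly one period; the parameter values and variable names are illustrative assumptions, not values from the text. It compares $\int 2T\,dt$ with $\sum p\,\Delta q$, and checks that the full action equals $S_0 - E\,(t_f - t_i)$.

```python
import math

m, omega, amplitude = 1.0, 2.0, 0.5          # assumed oscillator parameters
E = 0.5 * m * omega**2 * amplitude**2        # conserved total energy
period = 2.0 * math.pi / omega
steps = 20000
dt = period / steps

def q(t):
    """Position on the true path."""
    return amplitude * math.sin(omega * t)

def p(t):
    """Momentum on the true path, p = m * dq/dt."""
    return m * amplitude * omega * math.cos(omega * t)

S0_from_2T = 0.0    # abbreviated action as the time integral of 2T
S0_from_pdq = 0.0   # abbreviated action as the path sum of p * (change in q)
S_full = 0.0        # full action, the time integral of L = T - U
for i in range(steps):
    t_mid = (i + 0.5) * dt
    T = p(t_mid) ** 2 / (2.0 * m)
    U = 0.5 * m * omega**2 * q(t_mid) ** 2
    S0_from_2T += 2.0 * T * dt
    S0_from_pdq += p(t_mid) * (q((i + 1) * dt) - q(i * dt))
    S_full += (T - U) * dt

print("integral of 2T dt :", S0_from_2T)
print("sum of p dq       :", S0_from_pdq)
print("2*pi*E/omega      :", 2.0 * math.pi * E / omega)
print("S and S0 - E*(tf - ti):", S_full, S0_from_2T - E * period)
```

Both estimates of the abbreviated action agree with the closed-form value $2\pi E/\omega$ for one period, and the full action over one period equals $S_0 - E\,(t_f - t_i)$, which is zero for this particular interval.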

9.2.4 Hamilton’s Principle applied using initial boundary conditions

Galley[Gal13] identified a subtle inconsistency in the applications of Hamilton’s Principle of Stationary Action to both Lagrangian and Hamiltonian mechanics. The inconsistency involves the fact that Hamilton’s Principle is defined as the action integral between the initial time $t_i$ and the final time $t_f$ as boundary conditions, that is, it is assumed to be time symmetric. However, most applications in Lagrangian and Hamiltonian mechanics assume that the action integral is evaluated based on the initial values as the boundary conditions, rather than the initial $t_i$ and final times $t_f$. That is, typical applications require use of a time-asymmetric version of Hamilton’s principle. Galley[Gal13][Gal14] proposed a framework for transforming Hamilton’s Principle to a time-asymmetric form in order to handle problems where the boundary conditions are based on using only the initial values at the initial time $t_i$, rather than the initial plus final times $(t_i, t_f)$ that are assumed in the time-symmetric definition of the action in Hamilton’s Principle.

Figure 9.2: The left schematic shows paths between the initial $\mathbf{q}(t_i)$ and final $\mathbf{q}(t_f)$ times for conservative mechanics. The solid line designates the path for which the action is stationary, while the dashed lines represent the varied paths. The right schematic shows the paths applied to the doubled degrees of freedom with two initial boundary conditions, that is, $\mathbf{q}_1(t_i)$ and $\mathbf{q}_2(t_i)$, plus assuming that both paths are identical at their intersection and that they intersect at the same final time, that is, $\mathbf{q}_1(t_f) = \mathbf{q}_2(t_f)$.

The following describes the framework proposed by Galley for transforming Hamilton’s Principle to a time-asymmetric form. Let $\mathbf{q}$ and $\dot{\mathbf{q}}$ designate sets of $n$ generalized coordinates, plus their velocities, where $\mathbf{q}$ and $\dot{\mathbf{q}}$ are the fundamental variables assumed in the definition of the Lagrangian used by Hamilton’s Principle. As illustrated schematically in figure 9.2, Galley proposed doubling the number of degrees of freedom for the system considered, that is, let $\mathbf{q} \to (\mathbf{q}_1, \mathbf{q}_2)$ and $\dot{\mathbf{q}} \to (\dot{\mathbf{q}}_1, \dot{\mathbf{q}}_2)$. In addition he defines two identical variational paths 1 and 2, where path 2 is the time reverse of path 1. That is, path 1 starts at the initial time $t_i$ and ends at $t_f$, whereas path 2 starts at $t_f$ and ends at $t_i$. That is, he assumes that $\mathbf{q}$ and $\dot{\mathbf{q}}$ specify the two paths in the space of the doubled degrees of freedom that are identical, and that they intersect at the final time $t_f$. The arrows shown on the paths in figure 9.2 designate the assumed direction of the time integration along these paths.
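For the doubled system of degrees of freedom, the total action for the sum of the two paths is given by the time integral of the doubled variables, $(\mathbf{q}_1, \mathbf{q}_2)$, which can be written as

$$S(\mathbf{q}_1, \mathbf{q}_2) = \int_{t_i}^{t_f} L(\mathbf{q}_1, \dot{\mathbf{q}}_1, t)\,dt + \int_{t_f}^{t_i} L(\mathbf{q}_2, \dot{\mathbf{q}}_2, t)\,dt = \int_{t_i}^{t_f} \left[ L(\mathbf{q}_1, \dot{\mathbf{q}}_1, t) - L(\mathbf{q}_2, \dot{\mathbf{q}}_2, t) \right] dt \tag{9.28}$$

The above relation assumes that the doubled variables $(\mathbf{q}_1, \dot{\mathbf{q}}_1)$ and $(\mathbf{q}_2, \dot{\mathbf{q}}_2)$ are decoupled from each other. More generally one can assume that the two sets of variables are coupled by some arbitrary function $K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)$. Then the action can be written as

$$S(\mathbf{q}_1, \mathbf{q}_2) = \int_{t_i}^{t_f} \left[ L(\mathbf{q}_1, \dot{\mathbf{q}}_1, t) - L(\mathbf{q}_2, \dot{\mathbf{q}}_2, t) + K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t) \right] dt \tag{9.29}$$

The effective Lagrangian for this doubled system then can be defined as

$$\Lambda(\mathbf{q}_1, \mathbf{q}_2, \dot{\mathbf{q}}_1, \dot{\mathbf{q}}_2, t) \equiv \left[ L(\mathbf{q}_1, \dot{\mathbf{q}}_1, t) - L(\mathbf{q}_2, \dot{\mathbf{q}}_2, t) + K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t) \right] \tag{9.30}$$

and the action can be written as

$$S(\mathbf{q}_1, \mathbf{q}_2) = \int_{t_i}^{t_f} \Lambda(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)\,dt \tag{9.31}$$
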
The coupling term $K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)$ for the doubled system of degrees of freedom must satisfy the following two properties.

(a) If it can be expressed as the difference of two scalar potentials, $\Delta U(\mathbf{q}_1, \mathbf{q}_2) = U(\mathbf{q}_1) - U(\mathbf{q}_2)$, then it can be absorbed into the potential term for each of the doubled variables in the Lagrangian. This implies that $K = 0$ and there is no reason to double the number of degrees of freedom because the system is conservative. Thus $K$ describes generalized forces that are not derivable from potential energy, that is, not conservative.
(b) A second property of the coupling term $K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)$ is that it must be antisymmetric under interchange of the arbitrary labels $1 \leftrightarrow 2$. That is,

$$K(\mathbf{q}_2, \dot{\mathbf{q}}_2, \mathbf{q}_1, \dot{\mathbf{q}}_1, t) = -K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t) \tag{9.32}$$

Therefore the antisymmetric function $K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)$ vanishes when $\mathbf{q}_2 = \mathbf{q}_1$.

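The variational condition requires that the action $S(\mathbf{q}_1, \mathbf{q}_2)$ has a well defined stationary point for the doubled system. This is achieved by parametrizing both coordinate paths as

$$\mathbf{q}_{1,2}(t, \epsilon) = \mathbf{q}_{1,2}(t, 0) + \epsilon\,\boldsymbol{\eta}_{1,2}(t) \tag{9.33}$$

where $\mathbf{q}_{1,2}(t, 0)$ are the coordinates for which the action is stationary, $\epsilon \ll 1$, and where $\boldsymbol{\eta}_{1,2}(t)$ are arbitrary functions of time denoting virtual displacements of the paths. The doubled system has two independent paths connecting the two initial boundary conditions at $t_i$, and it requires that these paths intersect at $t_f$. The variational system for the two intersecting paths requires specifying four conditions, two per path. Two of the four conditions are determined by requiring that at $t_i$ the initial boundary conditions satisfy $\boldsymbol{\eta}_{1,2}(t_i) = 0$. The remaining two conditions are derived by requiring that the variation of the action $S(\mathbf{q}_1, \mathbf{q}_2)$ satisfies

$$\left[ \frac{dS}{d\epsilon} \right]_{\epsilon=0} = 0 = \int_{t_i}^{t_f} dt \left\{ \boldsymbol{\eta}_1 \cdot \left[ \frac{\partial \Lambda}{\partial \mathbf{q}_1} - \frac{d\boldsymbol{\pi}_1}{dt} \right]_{\epsilon=0} - \boldsymbol{\eta}_2 \cdot \left[ -\frac{\partial \Lambda}{\partial \mathbf{q}_2} - \frac{d\boldsymbol{\pi}_2}{dt} \right]_{\epsilon=0} \right\} + \left[ \boldsymbol{\eta}_1 \cdot \boldsymbol{\pi}_1 - \boldsymbol{\eta}_2 \cdot \boldsymbol{\pi}_2 \right]_{t=t_f} \tag{9.34}$$

The canonical momenta $\boldsymbol{\pi}_{1,2}$ conjugate to the doubled coordinates $\mathbf{q}_{1,2}$ are defined using the nonconservative Lagrangian $\Lambda$ to be

$$\pi_1^I(\mathbf{q}_{1,2}, \dot{\mathbf{q}}_{1,2}) \equiv \frac{\partial \Lambda}{\partial \dot{q}_1^I(t)} = \frac{\partial L(\mathbf{q}_1, \dot{\mathbf{q}}_1, t)}{\partial \dot{q}_1^I(t)} + \frac{\partial K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)}{\partial \dot{q}_1^I(t)} \tag{9.35}$$

where the superscript $I$ designates the solution based on the initial conditions. Note that $\frac{\partial L(\mathbf{q}_1, \dot{\mathbf{q}}_1, t)}{\partial \dot{q}_1^I(t)}$ is the conservative conjugate momentum $p_1^I$, while the $\frac{\partial K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)}{\partial \dot{q}_1^I(t)}$ term is the part of the total momentum due to the nonconservative interaction. Because $L(\mathbf{q}_2, \dot{\mathbf{q}}_2, t)$ enters $\Lambda$ in equation 9.30 with a negative sign, the momentum for the second path, which follows from equation 9.35 by the interchange $1 \leftrightarrow 2$, is

$$\pi_2^I(\mathbf{q}_{1,2}, \dot{\mathbf{q}}_{1,2}) \equiv -\frac{\partial \Lambda}{\partial \dot{q}_2^I(t)} = \frac{\partial L(\mathbf{q}_2, \dot{\mathbf{q}}_2, t)}{\partial \dot{q}_2^I(t)} - \frac{\partial K(\mathbf{q}_1, \dot{\mathbf{q}}_1, \mathbf{q}_2, \dot{\mathbf{q}}_2, t)}{\partial \dot{q}_2^I(t)} \tag{9.36}$$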

The last term in equation 934 that is, the term [ 1  1 −  2  2 ]= results from integration by parts,
which will vanish if
1 ( ) 1 ( ) = 2 ( ) 2 ( ) (9.37)
The equality condition at the intersection of the two paths at  requires that

1 ( ) = 2 ( ) (9.38)

Therefore equations 937 and 938 imply that

 1 ( ) = 2 ( ) (9.39)

Therefore equations 938 and 939 constitute the equality condition that must be satisfied when the two
paths intersect at  . The equality condition ensures that the boundary term for integration by parts in
equation 934 will vanish for arbitrary variations provided that the two unspecified paths agree at the final
time  . Similarly the conjugate momenta 1 ( )  2 ( ) must agree, but otherwise are unspecified. As a
consequence, the equality condition ensures that the variational principle is consistent with the final state at
9.2. HAMILTON’S PRINCIPLE OF STATIONARY ACTION

 not being specified. That is, the equations of motion are only specified by the initial boundary conditions
of the time asymmetric action for the doubled system.
More physics insight is provided by using a more convenient parametrization of the coordinates in terms of their average and difference. That is, let

$$q_+^I \equiv \frac{q_1^I + q_2^I}{2} \qquad\qquad q_-^I \equiv q_1^I - q_2^I \tag{9.40}$$

Then the physical limit is

$$q_+^I \to q^I \qquad\qquad q_-^I \to 0 \tag{9.41}$$

That is, the average history is the relevant physical history, while the difference coordinate simply vanishes. For these coordinates, the nonconservative Lagrangian is $\Lambda(\mathbf{q}_+, \mathbf{q}_-, \dot{\mathbf{q}}_+, \dot{\mathbf{q}}_-, t)$ and the equality conditions reduce to

$$\mathbf{q}_-(t_f) = 0 \tag{9.42}$$

$$\boldsymbol{\pi}_-(t_f) = 0 \tag{9.43}$$

which implies that the physically relevant average (+) quantities are not specified at the final time $t_f$ in order to have a well-defined variational principle.
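The canonical momenta are given by

$$\pi_+^I = \frac{\pi_1^I + \pi_2^I}{2} = \frac{\partial \Lambda}{\partial \dot{q}_-^I} \tag{9.44}$$

$$\pi_-^I = \pi_1^I - \pi_2^I = \frac{\partial \Lambda}{\partial \dot{q}_+^I} \tag{9.45}$$

The equations of motion can be written as

$$\frac{d}{dt} \frac{\partial \Lambda}{\partial \dot{q}_\pm^I} = \frac{\partial \Lambda}{\partial q_\pm^I} \tag{9.46}$$

Equation 9.46 is satisfied identically for the + subscript, while, in the physical limit (PL), the negative subscript gives that

$$\left[ \frac{d}{dt} \frac{\partial \Lambda}{\partial \dot{q}_-^I} - \frac{\partial \Lambda}{\partial q_-^I} \right]_{PL} = 0 \tag{9.47}$$

Substituting for the Lagrangian $\Lambda$ gives that

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{q}^I} - \frac{\partial L}{\partial q^I} = \left[ \frac{\partial K}{\partial q_-^I} - \frac{d}{dt} \frac{\partial K}{\partial \dot{q}_-^I} \right]_{PL} \equiv Q^I(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{9.48}$$

where $Q^I$ is a generalized nonconservative force derived from $K$.

Note that equation 9.47 can be derived equally well by taking the direct functional derivative with respect to $q_-^I(t)$, that is,

$$0 = \left[ \frac{\delta S}{\delta q_-^I(t)} \right]_{PL} \tag{9.49}$$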
The above time-asymmetric formalism applies Hamilton’s action principle to systems that involve initial
boundary conditions while the second path corresponds to the final boundary conditions. This framework,
proposed recently by Galley[Gal13], provides a remarkable advance for the handling of nonconservative action
in Lagrangian and Hamiltonian mechanics.$^2$ This formalism directly incorporates the variational principle
for initial boundary conditions and causal dynamics that are usually required for applications of Lagrangian
and Hamiltonian mechanics. Currently, there is limited exploitation of this new formalism because there
has been insufficient time for it to become well known, for full recognition of its importance, and for the
development and publication of applications. Chapter 10 discusses an application of this formalism to
nonconservative systems in classical mechanics.
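A brief worked case may clarify how the coupling function $K$ generates a nonconservative force. The sketch below assumes a one-dimensional linearly damped oscillator with the coupling $K = -\lambda\, q_- \dot{q}_+$, a choice commonly used to illustrate this formalism. The symbols $m$, $k$, and the damping coefficient $\lambda$ are assumptions introduced only for this example.

```latex
\begin{align*}
L(q,\dot{q}) &= \tfrac{1}{2} m \dot{q}^2 - \tfrac{1}{2} k q^2 ,
\qquad
K(q_\pm,\dot{q}_\pm) = -\lambda\, q_- \dot{q}_+ \\
\text{Antisymmetry under } 1 \leftrightarrow 2 :&\quad
q_- \to -q_- ,\;\; \dot{q}_+ \to \dot{q}_+
\;\;\Longrightarrow\;\; K \to -K \\
Q &= \left[ \frac{\partial K}{\partial q_-}
      - \frac{d}{dt}\frac{\partial K}{\partial \dot{q}_-} \right]_{PL}
   = \left[ -\lambda \dot{q}_+ - 0 \right]_{PL} = -\lambda \dot{q} \\
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = Q
&\quad\Longrightarrow\quad
m \ddot{q} + \lambda \dot{q} + k q = 0
\end{align*}
```

The coupling vanishes in the physical limit $q_- \to 0$, yet its derivative with respect to $q_-$ survives and supplies the linear damping force, which could not be obtained from a potential.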
$^2$ This topic goes beyond the planned scope of this book. It is recommended that the reader refer to the work of Galley, Tsang, and Stein[Gal13, Gal14] for further discussion plus examples of applying this formalism to nonconservative systems in classical mechanics, electromagnetic radiation, RLC circuits, fluid dynamics, and field theory.

9.3 Lagrangian
9.3.1 Standard Lagrangian
Lagrangian mechanics, as introduced in chapter 6, was based on the concepts of kinetic energy and potential energy. d’Alembert’s principle of virtual work was used to derive Lagrangian mechanics in chapter 6 and this led to the definition of the standard Lagrangian. That is, the standard Lagrangian was defined in chapter 6.2 to be the difference between the kinetic and potential energies.

$$L(\mathbf{q}, \dot{\mathbf{q}}, t) = T(\dot{\mathbf{q}}, t) - U(\mathbf{q}, t) \tag{9.50}$$

Hamilton extended Lagrangian mechanics by defining Hamilton’s Principle, equation 9.2, which states that a dynamical system follows a path for which the action functional, that is, the time integral of the Lagrangian, is stationary. Chapter 6 showed that using the standard Lagrangian for defining the action functional leads to the Euler-Lagrange variational equations

$$\left\{ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_j} \right) - \frac{\partial L}{\partial q_j} \right\} = Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) \tag{9.51}$$

The Lagrange multiplier terms handle the holonomic constraint forces and $Q_j^{EXC}$ handles the remaining excluded generalized forces. Chapters 6-8 showed that the use of the standard Lagrangian, with the Euler-Lagrange equations (9.51), provides a remarkably powerful and flexible way to derive second-order equations of motion for dynamical systems in classical mechanics.
Note that the Euler-Lagrange equations, expressed solely in terms of the standard Lagrangian (9.51), that is, excluding the $Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$ terms, are valid only under the following conditions:

1. The forces acting on the system, apart from any forces of constraint, must be derivable from scalar
potentials.
2. The equations of constraint must be relations that connect the coordinates of the particles and may
be functions of time, that is, the constraints are holonomic.

The $Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t)$ terms extend the range of validity of using the standard Lagrangian in the Euler-Lagrange equations by introducing constraint and omitted forces explicitly.

Chapters 6−8 exploited Lagrangian mechanics based on use of the standard definition of the Lagrangian.
The present chapter will show that the powerful Lagrangian formulation, using the standard Lagrangian,
can be extended to include alternative non-standard Lagrangians that may be applied to dynamical systems
where use of the standard definition of the Lagrangian is inapplicable. If these non-standard Lagrangians
satisfy Hamilton’s Action Principle, 92, then they can be used with the Euler-Lagrange equations to generate
the correct equations of motion, even though the Lagrangian may not have the simple relation to the kinetic
and potential energies adopted by the standard Lagrangian. Currently, the development and exploitation of
non-standard Lagrangians is an active field of Lagrangian mechanics.

9.3.2 Gauge invariance of the standard Lagrangian

Note that the standard Lagrangian is not unique in that there is a continuous spectrum of equivalent
standard Lagrangians that all lead to identical equations of motion. This is because the Lagrangian $L$ is a
scalar quantity that is invariant with respect to coordinate transformations. The following transformations
change the standard Lagrangian, but leave the equations of motion unchanged.

1. The Lagrangian is indefinite with respect to addition of a constant to the scalar potential, which cancels
out when the derivatives in the Euler-Lagrange differential equations are applied.
2. The Lagrangian is indefinite with respect to addition of a constant kinetic energy.
3. The Lagrangian is indefinite with respect to addition of a total time derivative of the form $L_2 \to L_1 + \frac{d}{dt}\left[ \Lambda(q_i, t) \right]$, for any differentiable function $\Lambda(q_i, t)$ of the generalized coordinates plus time that
has continuous second derivatives.

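This last statement can be proved by considering a transformation between two related standard
Lagrangians of the form

$$L_2(\mathbf{q},\dot{\mathbf{q}},t) = L_1(\mathbf{q},\dot{\mathbf{q}},t) + \frac{d\Lambda(\mathbf{q},t)}{dt} = L_1(\mathbf{q},\dot{\mathbf{q}},t) + \left(\sum_j \frac{\partial \Lambda(\mathbf{q},t)}{\partial q_j}\dot{q}_j + \frac{\partial \Lambda(\mathbf{q},t)}{\partial t}\right) \tag{9.52}$$

This leads to a standard Lagrangian $L_2$ that has the same equations of motion as $L_1$, as is shown by
substituting equation 9.52 into the Euler-Lagrange equations. That is,

$$\frac{d}{dt}\left(\frac{\partial L_2}{\partial \dot{q}_j}\right) - \frac{\partial L_2}{\partial q_j} = \frac{d}{dt}\left(\frac{\partial L_1}{\partial \dot{q}_j}\right) - \frac{\partial L_1}{\partial q_j} + \frac{d}{dt}\frac{\partial \Lambda(\mathbf{q},t)}{\partial q_j} - \frac{\partial}{\partial q_j}\frac{d\Lambda(\mathbf{q},t)}{dt} = \frac{d}{dt}\left(\frac{\partial L_1}{\partial \dot{q}_j}\right) - \frac{\partial L_1}{\partial q_j} \tag{9.53}$$

The two terms involving $\Lambda$ cancel because $\Lambda$ has continuous second derivatives, so the order of
differentiation can be interchanged. Thus even though the related Lagrangians $L_1$ and $L_2$ are different, they
are completely equivalent in that they generate identical equations of motion.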
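
The cancellation in equation 9.53 can be checked numerically. The following is a minimal sketch, assuming a one-dimensional harmonic oscillator with unit mass and unit frequency for the first Lagrangian and the arbitrary gauge function $\Lambda(q,t) = q^2 \sin t$. The names `el_expression`, `L1`, `L2`, and `trial_path`, and the finite-difference step, are illustrative choices rather than anything defined in the text. The Euler-Lagrange expression is evaluated on a trial path that is deliberately not a solution, so the common value is non-zero and the comparison is not trivial.

```python
import math

def el_expression(L, path, t, h=1e-4):
    """Return d/dt(dL/dv) - dL/dq on the given path at time t (central differences)."""
    def velocity(s):
        return (path(s + h) - path(s - h)) / (2 * h)

    def dL_dv(s):
        q, v = path(s), velocity(s)
        return (L(q, v + h, s) - L(q, v - h, s)) / (2 * h)

    def dL_dq(s):
        q, v = path(s), velocity(s)
        return (L(q + h, v, s) - L(q - h, v, s)) / (2 * h)

    return (dL_dv(t + h) - dL_dv(t - h)) / (2 * h) - dL_dq(t)

def L1(q, v, t):
    # Assumed standard Lagrangian: unit-mass, unit-frequency harmonic oscillator.
    return 0.5 * v * v - 0.5 * q * q

def L2(q, v, t):
    # L1 plus the total time derivative of the assumed gauge function q^2 sin(t).
    dLambda_dt = 2.0 * q * math.sin(t) * v + q * q * math.cos(t)
    return L1(q, v, t) + dLambda_dt

def trial_path(t):
    # Arbitrary smooth path, deliberately not a solution of the equations of motion.
    return math.sin(1.3 * t) + 0.5 * t * t

for t in (0.3, 0.7, 1.9):
    print(t, el_expression(L1, trial_path, t), el_expression(L2, trial_path, t))
```

The two printed columns should agree to within the finite-difference error, which is what equation 9.53 asserts for any path, not only for solutions of the equations of motion.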
There is an unlimited range of equivalent standard Lagrangians that all lead to the same equations of
motion and satisfy the requirements of the Lagrangian. That is, there is no unique choice among the wide
range of equivalent standard Lagrangians expressed in terms of generalized coordinates. This discussion is
an example of gauge invariance in physics.
Modern theories in physics describe reality in terms of potential fields. Gauge invariance, which also is
called gauge symmetry, is a property of field theory for which different underlying fields lead to identical
observable quantities. Well-known examples are the static electric potential field and the gravitational
potential field where any arbitrary constant can be added to these scalar potentials with zero impact on the
observed static electric field or the observed gravitational field. Gauge theories constrain the laws of physics
in that the impact of gauge transformations must cancel out when expressed in terms of the observables.
Gauge symmetry plays a crucial role in both classical and quantal manifestations of field theory, e.g. it is
the basis of the Standard Model of electroweak and strong interactions.
Equivalent Lagrangians are a clear manifestation of gauge invariance as illustrated by equations 9.52 and 9.53,
which show that adding any total time derivative of a scalar function $\Lambda(\mathbf{q},t)$ to the Lagrangian has no
observable consequences on the equations of motion. That is, although addition of the total time derivative
of the scalar function $\Lambda(\mathbf{q},t)$ changes the value of the Lagrangian, it does not change the equations of motion
for the observables derived using equivalent standard Lagrangians.
For Lagrangian formulations of classical mechanics, the gauge invariance is readily apparent by direct
inspection of the Lagrangian.

9.1 Example: Gauge invariance in electromagnetism

The scalar electric potential $\Phi$ and the vector potential $\mathbf{A}$ fields in electromagnetism are examples of
gauge-invariant fields. These electromagnetic-potential fields are not directly observable; that is, the electromagnetic
observable quantities are the electric field $\mathbf{E}$ and magnetic field $\mathbf{B}$, which can be derived from the scalar and
vector potential fields $\Phi$ and $\mathbf{A}$. An advantage of using the potential fields is that they reduce the problem
from 6 components, 3 each for $\mathbf{E}$ and $\mathbf{B}$, to 4 components, one for the scalar field $\Phi$ and 3 for the vector
potential $\mathbf{A}$. The Lagrangian for the velocity-dependent Lorentz force, given by equation 6.67, provides an
example of gauge invariance. Equations 6.63 and 6.65 showed that the electric and magnetic fields can be
expressed in terms of scalar and vector potentials $\Phi$ and $\mathbf{A}$ by the relations

$$\mathbf{B} = \nabla \times \mathbf{A}$$

$$\mathbf{E} = -\nabla\Phi - \frac{\partial \mathbf{A}}{\partial t}$$

The equations of motion for a charge $q$ in an electromagnetic field can be obtained by using the Lagrangian

$$L = \frac{1}{2} m \mathbf{v}\cdot\mathbf{v} - q(\Phi - \mathbf{A}\cdot\mathbf{v})$$

Consider the transformations $(\mathbf{A},\Phi) \rightarrow (\mathbf{A}',\Phi')$ in the transformed Lagrangian $L'$ where

$$\mathbf{A}' = \mathbf{A} + \nabla\Lambda(\mathbf{r},t)$$

$$\Phi' = \Phi - \frac{\partial \Lambda(\mathbf{r},t)}{\partial t}$$

The transformed Lorentz-force Lagrangian $L'$ is related to the original Lorentz-force Lagrangian $L$ by

$$L' = L + q\left[\dot{\mathbf{r}}\cdot\nabla\Lambda(\mathbf{r},t) + \frac{\partial \Lambda(\mathbf{r},t)}{\partial t}\right] = L + q\frac{d}{dt}\Lambda(\mathbf{r},t)$$

Note that the additive term $q\frac{d}{dt}\Lambda(\mathbf{r},t)$ is an exact time differential. Thus the Lagrangian $L'$ is gauge invariant,
implying that identical equations of motion are obtained using either of these equivalent Lagrangians.
The force fields $\mathbf{E}$ and $\mathbf{B}$ can be used to show that the above transformation is gauge-invariant. That is,

$$\mathbf{E}' = -\nabla\Phi' - \frac{\partial \mathbf{A}'}{\partial t} = -\nabla\Phi - \frac{\partial \mathbf{A}}{\partial t} = \mathbf{E}$$

$$\mathbf{B}' = \nabla\times\mathbf{A}' = \nabla\times\mathbf{A} = \mathbf{B}$$

That is, the additive terms due to the scalar field $\Lambda(\mathbf{r},t)$ cancel. Thus the electromagnetic force fields following
a gauge-invariant transformation are shown to be identical, in agreement with what is inferred directly by
inspection of the Lagrangian.

9.3.3 Non-standard Lagrangians

The definition of the standard Lagrangian was based on d’Alembert’s differential variational principle. The
flexibility and power of Lagrangian mechanics can be extended to a broader range of dynamical systems
by employing an extended definition of the Lagrangian that is based on Hamilton’s Principle, equation 9.2.
Note that Hamilton’s Principle was introduced 46 years after development of the standard formulation of
Lagrangian mechanics. Hamilton’s Principle provides a general definition of the Lagrangian that applies
to standard Lagrangians, which are expressed as the difference between the kinetic and potential energies,
as well as to non-standard Lagrangians where there may be no clear separation into kinetic and potential
energy terms. These non-standard Lagrangians can be used with the Euler-Lagrange equations to generate
the correct equations of motion, even though they may have no relation to the kinetic and potential energies.
The extended definition of the Lagrangian based on Hamilton’s action functional, equation 9.1, can be exploited for
developing non-standard definitions of the Lagrangian that may be applied to dynamical systems where use
of the standard definition is inapplicable. First, non-standard Lagrangians can be as useful as the standard
Lagrangian for deriving equations of motion for a system. Secondly, non-standard Lagrangians that have no
energy interpretation are available for deriving the equations of motion for many nonconservative systems.
Thirdly, Lagrangians are useful irrespective of how they were derived. For example, they can be used to
derive conservation laws or the equations of motion. Coordinate transformation of the Lagrangian is much
simpler than transformation of the equations of motion. The relativistic Lagrangian defined in
chapter 17.6 is a well-known example of a non-standard Lagrangian.

9.3.4 Inverse variational calculus

Non-standard Lagrangians and Hamiltonians are not based on the concept of kinetic and potential energies.
Therefore, development of non-standard Lagrangians and Hamiltonians requires an alternative approach that
ensures that they satisfy Hamilton’s Principle, equation 9.2, which underlies the Lagrangian and Hamiltonian
formulations. One useful alternative approach is to derive the Lagrangian or Hamiltonian via an
inverse variational process based on the assumption that the equations of motion are known. Helmholtz
developed the field of inverse variational calculus, which plays an important role in development of non-standard
Lagrangians. An example of this approach is use of the well-known Lorentz force as the basis for deriving
a corresponding Lagrangian to handle systems involving electromagnetic forces. Inverse variational calculus
is a branch of mathematics that is beyond the scope of this textbook. The Douglas theorem[Dou41] states
that, if the three Helmholtz conditions are satisfied, then there exists a Lagrangian that, when used with the
Euler-Lagrange differential equations, leads to the given set of equations of motion. Thus, it will be assumed
that the inverse variational calculus technique can be used to derive a Lagrangian from known equations of
motion.

9.4 Application of Hamilton’s Action Principle to mechanics

Knowledge of the equations of motion is required to predict the response of a system to any set of initial
conditions. Hamilton’s action principle, which is built into Lagrangian and Hamiltonian mechanics, coupled
with the availability of a wide arsenal of variational principles and techniques, provides a remarkably powerful
and broad approach to deriving the equations of motion required to determine the system response.
As mentioned in the Prologue, derivation of the equations of motion for any system, based on Hamilton’s
Action Principle, separates naturally into a hierarchical set of three stages that differ in both sophistication
and understanding, as described below.
1. Action stage: The primary “action stage” employs Hamilton’s Action functional,
$S = \int_{t_i}^{t_f} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt$, to derive the Lagrangian and Hamiltonian functionals. This action stage
provides the most fundamental and sophisticated level of understanding. It involves specifying all the active
degrees of freedom, as well as the interactions involved. Symmetries incorporated at this primary action stage
can simplify subsequent use of the Hamiltonian and Lagrangian functionals.

2. Hamiltonian/Lagrangian stage: The “Hamiltonian/Lagrangian stage” uses the Lagrangian or
Hamiltonian functionals, that were derived at the action stage, in order to derive the equations of
motion for the system of interest. Symmetries, not already incorporated at the primary action stage,
may be included at this secondary stage.

3. Equations of motion stage: The “equations-of-motion stage” uses the derived equations of motion to
solve for the motion of the system subject to a given set of initial boundary conditions. Nonconservative
forces, such as dissipative forces, that were not included at the primary and secondary stages, may be
added at the equations of motion stage.
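
The statement that the action stage determines the motion can be illustrated numerically. The following is a minimal sketch, assuming a unit-mass harmonic oscillator with unit angular frequency, a discretized action built from a midpoint rule, and a perturbation $\sin(\pi t/T)$ that vanishes at both end points. The names `action` and `first_variation` and all parameter values are illustrative assumptions. It estimates the first-order change of the action for the true path $q(t) = \sin t$ and for a straight line joining the same end points.

```python
import math

def action(path, T=1.0, n=1000, m=1.0, omega=1.0):
    """Discretized action S = sum of (T - U) dt for a harmonic oscillator."""
    dt = T / n
    S = 0.0
    for k in range(n):
        q0, q1 = path(k * dt), path((k + 1) * dt)
        v = (q1 - q0) / dt          # velocity on the interval
        qm = 0.5 * (q0 + q1)        # midpoint position on the interval
        S += (0.5 * m * v * v - 0.5 * m * omega**2 * qm * qm) * dt
    return S

def first_variation(path, T=1.0, eps=1e-3):
    """Estimate dS/d(eps) for path + eps*eta, where eta vanishes at t=0 and t=T."""
    eta = lambda t: math.sin(math.pi * t / T)
    plus = action(lambda t: path(t) + eps * eta(t), T)
    minus = action(lambda t: path(t) - eps * eta(t), T)
    return (plus - minus) / (2 * eps)

true_path = math.sin                          # solves q'' = -q
straight_line = lambda t: math.sin(1.0) * t   # same end points, not a solution

print("true path     :", first_variation(true_path))
print("straight line :", first_variation(straight_line))
```

The first variation should be close to zero for the true path and clearly non-zero for the straight line, which is the content of Hamilton’s Principle for this simple system.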

Lagrange omitted the action stage when he used d’Alembert’s Principle to derive Lagrangian mechanics.
The Newtonian mechanics approach omits both the primary “action” stage, as well as the secondary
“Hamiltonian/Lagrangian” stage, since Newton’s Laws of Motion directly specify the “equations-of-motion stage”.
Thus these approaches did not allow exploiting the considerable advantages provided by use of action, the Lagrangian,
and the Hamiltonian. Newtonian mechanics requires that all the active forces be included when deriving the
equations of motion, which involves dealing with vector quantities. In Newtonian mechanics, symmetries
must be incorporated directly at the equations of motion stage, which is more difficult than when done at
the primary “action” stage, or the secondary “Lagrangian/Hamiltonian” stage. The “action” and
“Hamiltonian/Lagrangian” stages allow for use of the powerful arsenal of mathematical techniques that have been
developed for applying variational principles.
There are considerable advantages to deriving the equations of motion based on Hamilton’s Principle,
rather than deriving them using Newtonian mechanics. It is significantly easier to use variational principles to
handle the scalar functionals, action, Lagrangian, and Hamiltonian, rather than starting at the
equations-of-motion stage. For example, utilizing all three stages of algebraic mechanics facilitates accommodating
extra degrees of freedom, symmetries, and interactions. The symmetries identified by Noether’s theorem are
more easily recognized during the primary “action” and secondary “Hamiltonian/Lagrangian” stages rather
than at the subsequent “equations-of-motion” stage. Approximations made at the “action” stage are easier
to implement than at the “equations-of-motion” stage. Constrained motion is much more easily handled at
the primary “action”, or secondary “Hamiltonian/Lagrangian” stages, than at the equations-of-motion stage.
An important advantage of using Hamilton’s Action Principle is that there is a close relationship between
action in classical and quantal mechanics, as discussed in chapters 15 and 18. Algebraic principles, which
underlie analytical mechanics, naturally encompass applications to many branches of modern physics, such
as relativistic mechanics, fluid motion, and field theory.
In summary, the use of the single fundamental invariant quantity, action, as described above, provides a
powerful and elegant framework, that was developed first for classical mechanics, but now is exploited in a
wide range of science, engineering, and economics. An important feature of using the algebraic approach to
classical mechanics is the tremendous arsenal of powerful mathematical techniques that have been developed
for use of variational calculus applied to Lagrangian and Hamiltonian mechanics. Some of these variational
techniques were presented in chapters 6, 7, 8, and 9, while others will be introduced in chapter 15.

9.5 Summary
Hamilton’s 1834 publication, introducing both Hamilton’s Principle of Stationary Action and Hamiltonian
mechanics, marked the crowning achievement in the development of variational principles in classical
mechanics. A fundamental advantage of Hamiltonian mechanics is that it uses the conjugate coordinates
$\mathbf{q}, \mathbf{p}$, plus time $t$, which is a considerable advantage in most branches of physics and engineering. Compared
to Lagrangian mechanics, Hamiltonian mechanics has a significantly broader arsenal of powerful techniques
that can be exploited to obtain an analytical solution of the integrals of the motion for complicated systems,
as described in chapter 15. In addition, Hamiltonian dynamics provides a means of determining the
unknown variables for which the solution assumes a soluble form, and is ideal for study of the fundamental
underlying physics in applications to fields such as quantum or statistical physics. As a consequence,
Hamiltonian mechanics has become the preeminent variational approach used in modern physics.
This chapter has introduced and discussed Hamilton’s Principle of Stationary Action, which underlies
the elegant and remarkably powerful Lagrangian and Hamiltonian representations of algebraic mechanics.
The basic concepts employed in algebraic mechanics are summarized below.

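Hamilton’s Action Principle: As discussed in chapter 9.2, Hamiltonian mechanics is built upon Hamilton’s
action functional

$$S(\mathbf{q},\mathbf{p},t) = \int_{t_i}^{t_f} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt \tag{9.1}$$

Hamilton’s Principle of least action states that

$$\delta S(\mathbf{q},\mathbf{p},t) = \delta \int_{t_i}^{t_f} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt = 0 \tag{9.2}$$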


Generalized momentum $p_j$: In chapter 7.2, the generalized (canonical) momentum was defined in terms
of the Lagrangian $L$ to be

$$p_j \equiv \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial \dot{q}_j} \tag{7.3}$$

Chapter 9.2.2 defined the generalized momentum in terms of the action functional $S$ to be

$$p_j = \frac{\partial S(\mathbf{q},\mathbf{p},t)}{\partial q_j} \tag{9.12}$$

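Generalized energy $h(\mathbf{q},\dot{\mathbf{q}},t)$: Jacobi’s Generalized Energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ was defined in equation 7.37 as

$$h(\mathbf{q},\dot{\mathbf{q}},t) \equiv \sum_j \left(\dot{q}_j \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial \dot{q}_j}\right) - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.37}$$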

Hamiltonian function $H(\mathbf{q},\mathbf{p},t)$: The Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ was defined in terms of the generalized
energy $h(\mathbf{q},\dot{\mathbf{q}},t)$ plus the generalized momentum. That is

$$H(\mathbf{q},\mathbf{p},t) \equiv h(\mathbf{q},\dot{\mathbf{q}},t) = \sum_j p_j \dot{q}_j - L(\mathbf{q},\dot{\mathbf{q}},t) = \mathbf{p}\cdot\dot{\mathbf{q}} - L(\mathbf{q},\dot{\mathbf{q}},t) \tag{7.37}$$

where $\mathbf{p}, \mathbf{q}$ correspond to $n$-dimensional vectors, e.g. $\mathbf{q} \equiv (q_1, q_2, \ldots, q_n)$, and the scalar product
$\mathbf{p}\cdot\dot{\mathbf{q}} = \sum_i p_i \dot{q}_i$.
Chapter 8.2 used a Legendre transformation to derive this relation between the Hamiltonian and Lagrangian
functions. Note that whereas the Lagrangian $L(\mathbf{q},\dot{\mathbf{q}},t)$ is expressed in terms of the coordinates $\mathbf{q}$ plus
conjugate velocities $\dot{\mathbf{q}}$, the Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ is expressed in terms of the coordinates $\mathbf{q}$ plus their
conjugate momenta $\mathbf{p}$. For scleronomic systems, using the standard Lagrangian in equations 7.44 and 7.29
shows that the Hamiltonian simplifies to be equal to the total mechanical energy, that is, $H = T + U$.
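
The relation between the Hamiltonian and the Lagrangian can be checked with a short calculation. The following is a minimal sketch, assuming a one-dimensional oscillator with the hypothetical values $m = 2$ and $k = 3$; the function names are illustrative. The conjugate momentum is obtained from the Lagrangian by a central finite difference, the Hamiltonian is formed as $p\dot{q} - L$, and the result is compared with $T + U$.

```python
def lagrangian(q, v, m=2.0, k=3.0):
    # Assumed standard Lagrangian L = T - U for a mass on a spring.
    return 0.5 * m * v * v - 0.5 * k * q * q

def momentum(q, v, h=1e-6):
    # Generalized momentum p = dL/dv, estimated by a central difference.
    return (lagrangian(q, v + h) - lagrangian(q, v - h)) / (2 * h)

def hamiltonian(q, v):
    # Generalized energy p*v - L, evaluated here in terms of (q, v).
    return momentum(q, v) * v - lagrangian(q, v)

def total_energy(q, v, m=2.0, k=3.0):
    return 0.5 * m * v * v + 0.5 * k * q * q

q, v = 0.4, -1.5
print(hamiltonian(q, v), total_energy(q, v))
```

For this time-independent (scleronomic) example the two printed numbers should agree to within the finite-difference error.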

Generalized energy theorem: The equations of motion lead to the generalized energy theorem which
states that the time dependence of the Hamiltonian is related to the time dependence of the Lagrangian.

$$\frac{dH(\mathbf{q},\mathbf{p},t)}{dt} = \sum_j \dot{q}_j \left[Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t)\right] - \frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{7.38}$$

Note that if all the generalized non-potential forces and Lagrange multiplier terms are zero, and if the
Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion.

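Lagrange equations of motion: Equation 6.60 gives that the $n$ Lagrange equations of motion are

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \left[\sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] \tag{6.60}$$

where $j = 1, 2, 3, \ldots, n$.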

Hamilton’s equations of motion: Chapter 8.3 showed that a Legendre transform, plus the Lagrange-Euler
equations, (9.64, 9.65) lead to Hamilton’s equations of motion. Hamilton derived these equations of
motion directly from the action functional, as shown in chapter 9.2

$$\dot{q}_j = \frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial p_j} \tag{8.25}$$

$$\dot{p}_j = -\frac{\partial H}{\partial q_j}(\mathbf{q},\mathbf{p},t) + \left[\sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC}\right] \tag{8.26}$$

$$\frac{\partial H(\mathbf{q},\mathbf{p},t)}{\partial t} = -\frac{\partial L(\mathbf{q},\dot{\mathbf{q}},t)}{\partial t} \tag{8.24}$$

Note the symmetry of Hamilton’s two canonical equations. The canonical variables $p_k, q_k$ are treated
as independent canonical variables. Lagrange was the first to derive the canonical equations but he did not
recognize them as a basic set of equations of motion. Hamilton derived the canonical equations of motion
from his fundamental variational principle and made them the basis for a far-reaching theory of dynamics.
Hamilton’s equations give $2n$ first-order differential equations for $p_k, q_k$ for each of the $n$ degrees of freedom.
Lagrange’s equations give $n$ second-order differential equations for the variables $q_k, \dot{q}_k$.
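
The first-order structure of Hamilton’s equations makes them convenient for numerical integration. The following is a minimal sketch, assuming a one-dimensional oscillator with $H = p^2/2m + kq^2/2$, no constraint or non-potential forces, and $m = k = 1$. The leapfrog stepping scheme and the names `dH_dp`, `dH_dq`, `integrate`, and `hamiltonian_value` are illustrative choices, not something prescribed by the text.

```python
import math

def dH_dp(q, p, m=1.0):
    return p / m           # dq/dt = dH/dp

def dH_dq(q, p, k=1.0):
    return k * q           # dp/dt = -dH/dq

def hamiltonian_value(q, p, m=1.0, k=1.0):
    return p * p / (2.0 * m) + 0.5 * k * q * q

def integrate(q, p, dt=1e-3, steps=1000):
    """Advance (q, p) with a leapfrog scheme: half kick, drift, half kick."""
    for _ in range(steps):
        p -= 0.5 * dt * dH_dq(q, p)
        q += dt * dH_dp(q, p)
        p -= 0.5 * dt * dH_dq(q, p)
    return q, p

q0, p0 = 1.0, 0.0
q1, p1 = integrate(q0, p0)                    # total time = 1
print(q1, math.cos(1.0))                      # numerical versus exact coordinate
print(p1, -math.sin(1.0))                     # numerical versus exact momentum
print(hamiltonian_value(q1, p1) - hamiltonian_value(q0, p0))
```

Since this Hamiltonian has no explicit time dependence, the last printed number, the change in $H$, should be close to zero.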

Hamilton-Jacobi equation: Hamilton used Hamilton’s Principle plus equation 9.19 to derive the
Hamilton-Jacobi equation.

$$\frac{\partial S}{\partial t} + H(\mathbf{q},\mathbf{p},t) = 0 \tag{9.19}$$

The solution of Hamilton’s equations is trivial if the Hamiltonian is a constant of motion, or when a set of
generalized coordinates can be identified for which all the coordinates $q_i$ are constant, or are cyclic (also called
ignorable coordinates). Jacobi developed the mathematical framework of canonical transformation required
to exploit the Hamilton-Jacobi equation.

Hamilton’s Principle applied using initial boundary conditions: The definition of Hamilton’s Principle
assumes integration between the initial time $t_i$ and final time $t_f$. A recent development has extended
applications of Hamilton’s Principle to apply to systems that are defined in terms of only the initial
boundary conditions. This method doubles the number of degrees of freedom and uses a coupling Lagrangian
$K(\mathbf{q}_2, \dot{\mathbf{q}}_2, \mathbf{q}_1, \dot{\mathbf{q}}_1, t)$ between the corresponding $\mathbf{q}_1$ and $\mathbf{q}_2$ doubled degrees of freedom
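
$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = \left[\frac{\partial K}{\partial q_{-,j}} - \frac{d}{dt}\frac{\partial K}{\partial \dot{q}_{-,j}}\right] \equiv Q_j(\mathbf{q}_1, \dot{\mathbf{q}}_1, t) \tag{9.50}$$

and where $Q_j$ is a generalized nonconservative force derived from $K$.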


Standard Lagrangians: Derivation of Lagrangian mechanics, using d’Alembert’s principle of virtual
work, assumed that the Lagrangian is defined by equation 9.52

$$L(\mathbf{q},\dot{\mathbf{q}},t) = T(\dot{\mathbf{q}},t) - U(\mathbf{q},t) \tag{9.52}$$

This was used in equation 9.3 to derive the action in terms of the fundamental Lagrangian defined by equation
9.52. The assumption that the action $S$ is the fundamental property inverts this procedure, and now equation
9.3 is used to derive the Lagrangian. That is, the assumption that Hamilton’s Principle is the foundation
of algebraic mechanics defines the Lagrangian in terms of the fundamental action $S$.

Non-standard Lagrangians: The flexibility and power of Lagrangian mechanics can be extended to a
broader range of dynamical systems by employing an extended definition of the Lagrangian that assumes that
the action is the fundamental property, and then the Lagrangian is defined in terms of Hamilton’s variational
action principle using equation 9.2. It was illustrated that the inverse variational calculus formalism can
be used to identify non-standard Lagrangians that generate the required equations of motion. These
non-standard Lagrangians can be very different from the standard Lagrangian and do not separate into kinetic
and potential energy components. These alternative Lagrangians can be used to handle dissipative systems
which are beyond the range of validity when using standard Lagrangians. That is, it was shown that several
very different Lagrangians and Hamiltonians can be equivalent for generating useful equations of motion
of a system. Currently the use of non-standard Lagrangians is a narrow, but active, frontier of classical
mechanics with important applications to relativistic mechanics.

Gauge invariance of the standard Lagrangian: It was shown that there is a continuum of equivalent
standard Lagrangians that lead to the same set of equations of motion for a system. This feature is related
to gauge invariance in mechanics. The following transformations change the standard Lagrangian, but leave
the equations of motion unchanged.

1. The Lagrangian is indefinite with respect to addition of a constant to the scalar potential, which cancels
out when the derivatives in the Euler-Lagrange differential equations are applied.
2. Similarly the Lagrangian is indefinite with respect to addition of a constant kinetic energy.
3. The Lagrangian is indefinite with respect to addition of a total time derivative of the form
$L \rightarrow L + \frac{d}{dt}\left[\Lambda(\mathbf{q},t)\right]$ for any differentiable function $\Lambda(\mathbf{q},t)$ of the generalized coordinates, plus
time, that has continuous second derivatives.

Application of Hamilton’s Action Principle to mechanics: The derivation of the equations of motion
for any system can be separated into a hierarchical set of three stages that differ in both sophistication and
understanding. Variational principles are employed during the primary “action” stage and secondary
“Hamiltonian/Lagrangian” stage to derive the required equations of motion, which then are solved during the third
“equations-of-motion” stage. Hamilton’s action is a scalar functional that is the basis for deriving
the Lagrangian and Hamiltonian functions. The primary “action stage” uses Hamilton’s Action functional,
$S = \int_{t_i}^{t_f} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt$, to derive the Lagrangian and Hamiltonian functionals, and provides the most
fundamental and sophisticated level of understanding. The second
“Hamiltonian/Lagrangian stage” involves using the Lagrangian and Hamiltonian functionals to derive the
equations of motion. The third “equations-of-motion stage” uses the derived equations of motion to solve
for the motion subject to a given set of initial boundary conditions. The Newtonian mechanics approach
bypasses the primary “action” stage, as well as the secondary “Hamiltonian/Lagrangian” stage. That is,
Newtonian mechanics starts at the third “equations-of-motion” stage, which does not allow exploiting the
considerable advantages provided by use of action, the Lagrangian, and the Hamiltonian. Newtonian
mechanics requires that all the active forces be included when deriving the equations of motion, which involves
dealing with vector quantities. This is in contrast to the action, Lagrangian, and Hamiltonian which are
scalar functionals. Both the primary “action” stage, and the secondary “Lagrangian/Hamiltonian” stage,
exploit the powerful arsenal of mathematical techniques that have been developed for applying variational
principles.
Chapter 10

Nonconservative systems

10.1 Introduction
Hamilton’s action principle, Lagrangian mechanics, and Hamiltonian mechanics, all exploit the concept of
action which is a single, invariant, quantity. These algebraic formulations of mechanics all are based on
energy, which is a scalar quantity, and thus these formulations are easier to handle than the vector concept
of force employed in Newtonian mechanics. Algebraic formulations provide a powerful and elegant approach
to understand and develop the equations of motion of systems in nature. Chapters 6-9 applied variational
principles to Hamilton’s action principle which led to the Lagrangian, and Hamiltonian formulations that
simplify determination of the equations of motion for systems in classical mechanics.
A conservative force has the property that the total work done moving between two points is independent
of the path taken. That is, a conservative force is time symmetric and can be expressed in terms of the
gradient of a scalar potential $U$. Hamilton’s action principle implicitly assumes that the system is conservative
for those degrees of freedom that are built into the definition of the action, and the related Lagrangian, and
Hamiltonian. The focus of this chapter is to discuss the origins of nonconservative motion and how it can
be handled in algebraic mechanics.

10.2 Origins of nonconservative motion

Nonconservative degrees of freedom involve irreversible processes, such as dissipation and damping, and also
can result from coarse-graining, or from ignoring coupling to active degrees of freedom. The nonconservative role
of ignored active degrees of freedom is illustrated by the weakly-coupled double harmonic oscillator system
discussed below. Let the two harmonic oscillators have masses $(m_1, m_2)$, uncoupled angular frequencies
$(\omega_1, \omega_2)$, and oscillation amplitudes $(x_1, x_2)$. Assume that the coupling potential energy is $U = \kappa x_1 x_2$. The
Lagrangian for this weakly-coupled double oscillator is

$$L(x_1, x_2, \dot{x}_1, \dot{x}_2, t) = \frac{m_1}{2}\left(\dot{x}_1^2 - \omega_1^2 x_1^2\right) + \kappa x_1 x_2 + \frac{m_2}{2}\left(\dot{x}_2^2 - \omega_2^2 x_2^2\right) \tag{10.1}$$

Note that the total Lagrangian is conservative since the Lagrangian is explicitly time independent. As shown
in chapter 14.2, the solutions for the amplitudes of the oscillation for the coupled system are given by

$$x_1(t) = D \sin\left[\left(\frac{\omega_1 + \omega_2}{2}\right)t\right] \sin\left[\left(\frac{\omega_1 - \omega_2}{2}\right)t\right] \tag{10.2}$$

$$x_2(t) = D \cos\left[\left(\frac{\omega_1 + \omega_2}{2}\right)t\right] \cos\left[\left(\frac{\omega_1 - \omega_2}{2}\right)t\right] \tag{10.3}$$

The system exhibits the common “beats” behavior where the coupled harmonic oscillators have an angular
frequency that is the average oscillator frequency, $\omega_{average} = \left(\frac{\omega_1 + \omega_2}{2}\right)$, and the oscillation intensities are
modulated at the difference frequency, $\omega_{difference} = \left(\frac{\omega_1 - \omega_2}{2}\right)$. Although the total energy is conserved
for this conservative system, this shared energy flows back and forth between the two coupled harmonic
oscillators at the difference frequency. If the equations of motion for oscillator 1 ignore the coupling to the


motion of oscillator 2, that is, assume a constant average value $x_2 = \langle x_2 \rangle$ is used, then the intensity $|x_1|^2$ and
energy of the first oscillator still is modulated by the $\left|\sin\left[\left(\frac{\omega_1 - \omega_2}{2}\right)t\right]\right|^2$ term. Thus the total energy for this
truncated coupled-oscillator system is no longer conserved due to neglect of the energy flowing into and out
of oscillator 1 due to its coupling to oscillator 2. That is, the solution for the truncated system of oscillator
1 is not conservative since it is exchanging energy with the coupled, but ignored, second oscillator. This
elementary example illustrates that ignoring active degrees of freedom can transform a conservative system
into a nonconservative system, for which the equations of motion derived using the truncated Lagrangian are
incorrect.
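
The energy exchange described above can be seen in a short simulation. The following is a minimal sketch, assuming two identical unit-mass oscillators with $\omega_1 = \omega_2 = 1$, a weak coupling $\kappa = 0.1$ entering the Lagrangian with the sign shown in equation 10.1, and oscillator 1 released from rest at unit displacement. The velocity-Verlet integrator, the function name `simulate`, and all numerical values are illustrative assumptions. It tracks the energy of oscillator 1 alone and the total energy of the coupled system.

```python
def simulate(kappa=0.1, omega=1.0, dt=1e-3, t_end=35.0):
    """Return (min E1, max E1, largest drift of the total energy)."""
    x1, x2, v1, v2 = 1.0, 0.0, 0.0, 0.0

    def accel(x1, x2):
        # Equations of motion from L with unit masses and a +kappa*x1*x2 term.
        return -omega**2 * x1 + kappa * x2, -omega**2 * x2 + kappa * x1

    def energies(x1, x2, v1, v2):
        e1 = 0.5 * v1**2 + 0.5 * omega**2 * x1**2   # oscillator 1 alone
        e2 = 0.5 * v2**2 + 0.5 * omega**2 * x2**2   # oscillator 2 alone
        return e1, e1 + e2 - kappa * x1 * x2        # total includes coupling term

    e1, e_total0 = energies(x1, x2, v1, v2)
    e1_min = e1_max = e1
    drift = 0.0
    a1, a2 = accel(x1, x2)
    for _ in range(int(round(t_end / dt))):
        v1 += 0.5 * dt * a1
        v2 += 0.5 * dt * a2
        x1 += dt * v1
        x2 += dt * v2
        a1, a2 = accel(x1, x2)
        v1 += 0.5 * dt * a1
        v2 += 0.5 * dt * a2
        e1, e_total = energies(x1, x2, v1, v2)
        e1_min, e1_max = min(e1_min, e1), max(e1_max, e1)
        drift = max(drift, abs(e_total - e_total0))
    return e1_min, e1_max, drift

e1_min, e1_max, drift = simulate()
print("oscillator 1 energy range:", e1_min, e1_max)
print("largest change in total energy:", drift)
```

The energy of oscillator 1 should range from nearly zero up to about its initial value while the total energy stays essentially constant, so a description that keeps only oscillator 1 appears nonconservative.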
The above example illustrates the importance of including all active degrees of freedom when deriving the
equations of motion, in order to ensure that the total system is conservative. Unfortunately, nonconservative
systems due to viscous or frictional dissipation typically result from weak thermal interactions with an
enormous number of nearby atoms, which makes inclusion of all of these degrees of freedom impractical.
Even though the detailed behavior of such dissipative degrees of freedom may not be of direct interest, all
the active degrees of freedom must be included when applying Lagrangian or Hamiltonian mechanics.

10.3 Algebraic mechanics for nonconservative systems

Since Lagrangian and Hamiltonian formulations are invalid for the nonconservative degrees of freedom, the
following three approaches are used to include nonconservative degrees of freedom directly in the Lagrangian
and Hamiltonian formulations of mechanics.

1. Expand the number of degrees of freedom used to include all active degrees of freedom for the system,
so that the expanded system is conservative. This is the preferred approach when it is viable. Hamilton’s
action principle based on initial conditions, introduced in chapter 9.2.4, doubles the number of
degrees of freedom, which can be used to account for the dissipative forces, providing one approach to
solve nonconservative systems. However, this approach typically is impractical for handling dissipative
processes because of the large number of degrees of freedom that are involved in thermal dissipation.
2. Nonconservative forces can be introduced directly at the equations of motion stage as generalized forces
$Q_j^{EXC}$. This approach is used extensively. For the case of linear velocity dependence, Rayleigh’s
dissipation function provides an elegant and powerful way to express the generalized forces in terms of
scalar potential energies.
3. New degrees of freedom or effective forces can be postulated that are then incorporated into the
Lagrangian or the Hamiltonian in order to mimic the effects of the nonconservative forces.

Examples that exploit the above three ways to introduce nonconservative dissipative forces in algebraic
formulations are given below.

10.4 Rayleigh’s dissipation function

As mentioned above, nonconservative systems involving viscous or frictional dissipation, typically result from
weak thermal interactions with many nearby atoms, making it impractical to include a complete set of active
degrees of freedom. In addition, dissipative systems usually involve complicated dependences on the velocity
and surface properties that are best handled by including the dissipative drag force explicitly as a generalized
drag force in the Euler-Lagrange equations. The drag force can have any functional dependence on velocity,
position, or time.
F = − (q̇ q )v̂ (10.4)
Note that since the drag force is dissipative the dominant component of the drag force must point in the
opposite direction to the velocity vector.
In 1881 Lord Rayleigh[Ray1881, Ray1887] showed that if a dissipative force F depends linearly on velocity,
it can be expressed in terms of a scalar potential functional of the generalized coordinates called the Rayleigh
dissipation function R(q̇). The Rayleigh dissipation function is an elegant way to include linear velocity-
dependent dissipative forces in both Lagrangian and Hamiltonian mechanics, as is illustrated below for both
Lagrangian and Hamiltonian mechanics.

10.4.1 Generalized dissipative forces for linear velocity dependence

Consider $n$ equations of motion for the $n$ degrees of freedom, and assume that the dissipation depends linearly
on velocity. Then, allowing all possible cross coupling of the equations of motion for $q_j$, the equations of
motion can be written in the form

$$\sum_{j=1}^{n}\left[m_{ij}\ddot{q}_j + b_{ij}\dot{q}_j + c_{ij}q_j\right] - Q_i(t) = 0 \tag{10.5}$$

Multiplying equation 10.5 by $\dot{q}_i$, taking the time integral, and summing over $i$ gives the following energy equation

$$\sum_{i=1}^{n}\sum_{j=1}^{n}\int_0^t m_{ij}\ddot{q}_j\dot{q}_i\,dt + \sum_{i=1}^{n}\sum_{j=1}^{n}\int_0^t b_{ij}\dot{q}_j\dot{q}_i\,dt + \sum_{i=1}^{n}\sum_{j=1}^{n}\int_0^t c_{ij}q_j\dot{q}_i\,dt = \sum_{i=1}^{n}\int_0^t Q_i(t)\dot{q}_i\,dt \tag{10.6}$$

The right-hand term is the total energy supplied to the system by the external generalized forces $Q_i(t)$
at the time $t$. The first time-integral term on the left-hand side is the total kinetic energy, while the third
time-integral term equals the potential energy. The integrand of the second integral term on the left is defined
to equal $2\mathcal{R}(\dot{\mathbf{q}})$, where Rayleigh’s dissipation function $\mathcal{R}(\dot{\mathbf{q}})$ is defined as

$$\mathcal{R}(\dot{\mathbf{q}}) \equiv \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} b_{ij}\dot{q}_i\dot{q}_j \tag{10.7}$$

and the summations are over all $n$ particles of the system. This definition allows for complicated
cross-coupling effects between the $n$ particles.
The particle-particle coupling effects usually can be neglected, allowing use of the simpler definition that
includes only the diagonal terms. Then the diagonal form of the Rayleigh dissipation function simplifies to

$$\mathcal{R}(\dot{\mathbf{q}}) \equiv \frac{1}{2}\sum_{i=1}^{n} b_i \dot{q}_i^2 \tag{10.8}$$

Therefore the frictional force in the $q_i$ direction depends linearly on velocity $\dot{q}_i$, that is

$$F^f_{q_i} = -\frac{\partial \mathcal{R}(\dot{\mathbf{q}})}{\partial \dot{q}_i} = -b_i \dot{q}_i \tag{10.9}$$

In general, the dissipative force is the velocity gradient of the Rayleigh dissipation function,

$$\mathbf{F}^f = -\nabla_{\dot{\mathbf{q}}} \mathcal{R}(\dot{\mathbf{q}}) \tag{10.10}$$

The physical significance of the Rayleigh dissipation function is illustrated by calculating the work done
by one particle $i$ against friction, which is

$$dW_i^f = -\mathbf{F}_i^f \cdot d\mathbf{r} = -\mathbf{F}_i^f \cdot \dot{\mathbf{q}}_i\,dt = b_i \dot{q}_i^2\,dt \tag{10.11}$$

Therefore

$$2\mathcal{R}(\dot{\mathbf{q}}) = \frac{dW^f}{dt} \tag{10.12}$$

which is the rate of energy (power) loss due to the dissipative forces involved. The same relation is obtained
after summing over all the particles involved.
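
The interpretation of $2\mathcal{R}$ as the rate of energy loss can be checked numerically. The following is a minimal sketch, assuming a single damped oscillator $m\ddot{q} + b\dot{q} + kq = 0$ with the hypothetical values $m = 1$, $b = 0.2$, $k = 1$, so that $\mathcal{R} = \frac{1}{2} b \dot{q}^2$ and the frictional force is $-b\dot{q}$. The fourth-order Runge-Kutta stepper and the names `rayleigh`, `derivative`, `rk4_step`, and `energy_balance` are illustrative choices. The accumulated integral of $2\mathcal{R}$ is compared with the mechanical energy lost by the oscillator.

```python
def rayleigh(v, b):
    # Diagonal Rayleigh dissipation function for one degree of freedom.
    return 0.5 * b * v * v

def derivative(state, m, b, k):
    q, v, w = state
    # Friction force is -dR/dv = -b*v; w accumulates the integral of 2R.
    return (v, -(k * q + b * v) / m, 2.0 * rayleigh(v, b))

def rk4_step(state, dt, m, b, k):
    def shifted(s, ds, f):
        return tuple(si + f * di for si, di in zip(s, ds))
    d1 = derivative(state, m, b, k)
    d2 = derivative(shifted(state, d1, 0.5 * dt), m, b, k)
    d3 = derivative(shifted(state, d2, 0.5 * dt), m, b, k)
    d4 = derivative(shifted(state, d3, dt), m, b, k)
    return tuple(s + dt / 6.0 * (a + 2.0 * c + 2.0 * e + g)
                 for s, a, c, e, g in zip(state, d1, d2, d3, d4))

def energy_balance(m=1.0, b=0.2, k=1.0, dt=1e-3, steps=10000):
    """Return (mechanical energy lost, time integral of 2R) after steps*dt."""
    state = (1.0, 0.0, 0.0)                      # q, v, accumulated dissipation
    e0 = 0.5 * m * state[1]**2 + 0.5 * k * state[0]**2
    for _ in range(steps):
        state = rk4_step(state, dt, m, b, k)
    q, v, w = state
    return e0 - (0.5 * m * v * v + 0.5 * k * q * q), w

lost, dissipated = energy_balance()
print(lost, dissipated)
```

The two printed numbers should agree closely, consistent with equation 10.12.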
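Transforming the frictional force into generalized coordinates requires equation 6.27

$$\dot{\mathbf{r}}_i = \sum_k \frac{\partial\mathbf{r}_i}{\partial q_k}\dot{q}_k + \frac{\partial\mathbf{r}_i}{\partial t} \tag{10.13}$$

Note that the derivative with respect to $\dot{q}_k$ equals

$$\frac{\partial\dot{\mathbf{r}}_i}{\partial\dot{q}_k} = \frac{\partial\mathbf{r}_i}{\partial q_k} \tag{10.14}$$

Using equations 6.28 and 6.29 the $j$ component of the generalized frictional force $Q_j^f$ is given by

$$Q_j^f = \sum_{i=1}^{n}\mathbf{F}_i^f\cdot\frac{\partial\mathbf{r}_i}{\partial q_j} = \sum_{i=1}^{n}\mathbf{F}_i^f\cdot\frac{\partial\dot{\mathbf{r}}_i}{\partial\dot{q}_j} = -\sum_{i=1}^{n}\nabla_{v_i}\mathcal{R}(\dot{\mathbf{q}})\cdot\frac{\partial\dot{\mathbf{r}}_i}{\partial\dot{q}_j} = -\frac{\partial\mathcal{R}(\dot{\mathbf{q}})}{\partial\dot{q}_j} \tag{10.15}$$

Equation 10.15 provides an elegant expression for the generalized dissipative force $Q_j^f$ in terms of
Rayleigh's scalar dissipation potential $\mathcal{R}$.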
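The relations 10.7, 10.12, and 10.15 can be checked numerically. The following is a minimal sketch, assuming a symmetric damping matrix; the function names and the numerical values of `b` and `qdot` are hypothetical choices made only for illustration.

```python
def rayleigh(b, qdot):
    """Rayleigh dissipation function R = 1/2 * sum_ij b[i][j]*qdot[i]*qdot[j] (eq. 10.7)."""
    n = len(qdot)
    return 0.5 * sum(b[i][j] * qdot[i] * qdot[j] for i in range(n) for j in range(n))


def dissipative_force(b, qdot):
    """Generalized frictional force Q_j = -dR/dqdot_j (eq. 10.15), assuming b is symmetric."""
    n = len(qdot)
    return [-sum(b[j][k] * qdot[k] for k in range(n)) for j in range(n)]


def power_loss(b, qdot):
    """Rate of energy loss -sum_j Q_j*qdot_j, which should equal 2R (eq. 10.12)."""
    forces = dissipative_force(b, qdot)
    return -sum(f * v for f, v in zip(forces, qdot))


# Hypothetical symmetric damping matrix and generalized velocities.
b = [[2.0, 0.5], [0.5, 1.0]]
qdot = [1.0, -2.0]

print(rayleigh(b, qdot), dissipative_force(b, qdot), power_loss(b, qdot))
```

For a diagonal matrix `b` the force list reduces to $-b_i\dot{q}_i$, which is equation 10.9.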

10.4.2 Generalized dissipative forces for nonlinear velocity dependence

The above discussion of the Rayleigh dissipation function was restricted to the special case of linear
velocity-dependent dissipation. Virga[Vir15] proposed that the scope of the classical Rayleigh-Lagrange formalism
can be extended to include nonlinear velocity-dependent dissipation by assuming that the nonconservative
dissipative forces are defined by

$$\mathbf{F}^f_i = -\frac{\partial\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})}{\partial\dot{\mathbf{q}}} \tag{10.16}$$

where the generalized Rayleigh dissipation function $\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})$ satisfies the general Lagrange mechanics relation

$$\frac{\delta L}{\delta q} - \frac{\partial\mathcal{R}}{\partial\dot{q}} = 0 \tag{10.17}$$

This generalized Rayleigh's dissipation function eliminates the prior restriction to linear dissipation processes,
which greatly expands the range of validity for using Rayleigh's dissipation function.

10.4.3 Lagrange equations of motion

Linear dissipative forces can be directly, and elegantly, included in Lagrangian mechanics by using Rayleigh's
dissipation function as a generalized force $Q_j^f$. Inserting Rayleigh dissipation function 10.15 in the generalized
Lagrange equations of motion 6.60 gives

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] - \frac{\partial\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})}{\partial\dot{q}_j} \tag{10.18}$$

where $Q_j^{EXC}$ corresponds to the generalized forces remaining after removal of the generalized linear,
velocity-dependent, frictional force $Q_j^f$. The holonomic forces of constraint are absorbed into the Lagrange
multiplier term.

10.4.4 Hamiltonian mechanics

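If the nonconservative forces depend linearly on velocity, and are derivable from Rayleigh's dissipation
function according to equation 10.15, then using the definition of generalized momentum gives

$$\dot{p}_i = \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i} = \frac{\partial L}{\partial q_i} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_i}(\mathbf{q},t) + Q_i^{EXC}\right] - \frac{\partial\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})}{\partial\dot{q}_i} \tag{10.19}$$

$$\dot{p}_i = -\frac{\partial H(\mathbf{p},\mathbf{q},t)}{\partial q_i} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_i}(\mathbf{q},t) + Q_i^{EXC}\right] - \frac{\partial\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})}{\partial\dot{q}_i} \tag{10.20}$$

Thus Hamilton's equations become

$$\dot{q}_i = \frac{\partial H}{\partial p_i} \tag{10.21}$$

$$\dot{p}_i = -\frac{\partial H}{\partial q_i} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_i}(\mathbf{q},t) + Q_i^{EXC}\right] - \frac{\partial\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})}{\partial\dot{q}_i} \tag{10.22}$$

The Rayleigh dissipation function $\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})$ provides an elegant and convenient way to account for
dissipative forces in both Lagrangian and Hamiltonian mechanics.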

10.1 Example: Driven, linearly-damped, coupled linear oscillators

Consider the two identical, linearly damped, coupled oscillators (damping constant $\beta$) shown in the figure. A
periodic force $F = F_0\cos(\omega t)$ is applied to the left-hand mass $m$.

[Figure: Harmonically-driven, linearly-damped, coupled linear oscillators, with two masses $m$ at displacements $x_1$ and $x_2$.]

The kinetic energy of the system is

$$T = \frac{1}{2}m(\dot{x}_1^2 + \dot{x}_2^2)$$

The potential energy is

$$U = \frac{1}{2}\kappa x_1^2 + \frac{1}{2}\kappa x_2^2 + \frac{1}{2}\kappa'(x_2 - x_1)^2 = \frac{1}{2}(\kappa+\kappa')x_1^2 + \frac{1}{2}(\kappa+\kappa')x_2^2 - \kappa' x_1 x_2$$
Thus the Lagrangian equals

$$L = \frac{1}{2}m(\dot{x}_1^2 + \dot{x}_2^2) - \left[\frac{1}{2}(\kappa+\kappa')x_1^2 + \frac{1}{2}(\kappa+\kappa')x_2^2 - \kappa' x_1 x_2\right]$$

Since the damping is linear, it is possible to use the Rayleigh dissipation function

$$\mathcal{R} = \frac{1}{2}\beta(\dot{x}_1^2 + \dot{x}_2^2)$$

The applied generalized forces are

$$Q'_1 = F_0\cos(\omega t) \qquad Q'_2 = 0$$

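Using the Euler-Lagrange equations 10.18 to derive the equations of motion

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} + \frac{\partial\mathcal{R}}{\partial\dot{q}_j} = Q'_j + \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t)$$

gives

$$m\ddot{x}_1 + \beta\dot{x}_1 + (\kappa+\kappa')x_1 - \kappa' x_2 = F_0\cos(\omega t)$$

$$m\ddot{x}_2 + \beta\dot{x}_2 + (\kappa+\kappa')x_2 - \kappa' x_1 = 0$$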

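These two coupled equations can be decoupled and simplified by making a transformation to normal
coordinates, $\eta_1, \eta_2$ where

$$\eta_1 = x_1 - x_2 \qquad \eta_2 = x_1 + x_2$$

Thus

$$x_1 = \frac{1}{2}(\eta_1 + \eta_2) \qquad x_2 = \frac{1}{2}(\eta_2 - \eta_1)$$

Inserting these into the equations of motion gives

$$m(\ddot{\eta}_1 + \ddot{\eta}_2) + \beta(\dot{\eta}_1 + \dot{\eta}_2) + (\kappa+\kappa')(\eta_1 + \eta_2) - \kappa'(\eta_2 - \eta_1) = 2F_0\cos(\omega t)$$

$$m(\ddot{\eta}_2 - \ddot{\eta}_1) + \beta(\dot{\eta}_2 - \dot{\eta}_1) + (\kappa+\kappa')(\eta_2 - \eta_1) - \kappa'(\eta_1 + \eta_2) = 0$$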

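Adding and subtracting these two equations gives the following two decoupled equations

$$\ddot{\eta}_1 + \frac{\beta}{m}\dot{\eta}_1 + \frac{(\kappa+2\kappa')}{m}\eta_1 = \frac{F_0}{m}\cos(\omega t)$$

$$\ddot{\eta}_2 + \frac{\beta}{m}\dot{\eta}_2 + \frac{\kappa}{m}\eta_2 = \frac{F_0}{m}\cos(\omega t)$$

Define $\Gamma = \frac{\beta}{m}$, $\omega_1 = \sqrt{\frac{(\kappa+2\kappa')}{m}}$, $\omega_2 = \sqrt{\frac{\kappa}{m}}$, $A = \frac{F_0}{m}$. Then the two independent equations of motion become

$$\ddot{\eta}_1 + \Gamma\dot{\eta}_1 + \omega_1^2\eta_1 = A\cos(\omega t) \qquad \ddot{\eta}_2 + \Gamma\dot{\eta}_2 + \omega_2^2\eta_2 = A\cos(\omega t)$$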

This solution is a superposition of two independent, linearly-damped, driven normal modes $\eta_1$ and $\eta_2$ that
have different natural frequencies $\omega_1$ and $\omega_2$. For weak damping these two driven normal modes each undergo
damped oscillatory motion with the $\eta_1$ and $\eta_2$ normal modes exhibiting resonances at
$\omega'_1 = \sqrt{\omega_1^2 - 2\left(\frac{\Gamma}{2}\right)^2}$ and $\omega'_2 = \sqrt{\omega_2^2 - 2\left(\frac{\Gamma}{2}\right)^2}$.
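The normal-mode parameters and resonance frequencies of this example can be evaluated with a few lines of code. This is a minimal sketch: the function names and the numerical values of the mass, damping constant, and spring constants are hypothetical, and the steady-state amplitude used to locate the resonance is the standard result for a single driven, linearly-damped oscillator rather than something derived in this example.

```python
import math


def normal_modes(m, beta, kappa, kappa_c):
    """Return (Gamma, omega_1, omega_2) for the two decoupled normal modes."""
    gamma = beta / m
    omega_1 = math.sqrt((kappa + 2.0 * kappa_c) / m)  # mode eta_1 = x_1 - x_2
    omega_2 = math.sqrt(kappa / m)                    # mode eta_2 = x_1 + x_2
    return gamma, omega_1, omega_2


def resonance_frequency(omega_0, gamma):
    """Driving frequency of maximum amplitude, sqrt(omega_0^2 - 2*(Gamma/2)^2)."""
    return math.sqrt(omega_0**2 - 2.0 * (gamma / 2.0) ** 2)


def steady_state_amplitude(omega, omega_0, gamma, a):
    """Standard steady-state amplitude of a driven, linearly-damped oscillator."""
    return a / math.sqrt((omega_0**2 - omega**2) ** 2 + (gamma * omega) ** 2)


# Hypothetical parameters: m = 1, beta = 0.2, kappa = 4, kappa' = 2.5.
gamma, omega_1, omega_2 = normal_modes(1.0, 0.2, 4.0, 2.5)
for omega_0 in (omega_1, omega_2):
    print(omega_0, resonance_frequency(omega_0, gamma))
```

The amplitude function can be scanned in the driving frequency to confirm that each mode peaks at its own resonance frequency, slightly below the undamped natural frequency.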

10.2 Example: Kirchhoﬀ’s rules for electrical circuits

The mathematical equations governing the behavior of mechanical systems and $LRC$ electrical circuits
have a close similarity. Thus variational methods can be used to derive the analogous behavior for electrical
circuits. For example, for a system of $n$ separate circuits, the magnetic flux $\Phi_{jk}$ through circuit $j$ due to
electrical current $I_k = \dot{q}_k$ flowing in circuit $k$ is given by

$$\Phi_{jk} = M_{jk}\dot{q}_k$$

where $M_{jk}$ is the mutual inductance. The diagonal term $M_{jj} = L_j$ corresponds to the self inductance of
circuit $j$. The net magnetic flux $\Phi_j$ through circuit $j$ due to all $n$ circuits is the sum

$$\Phi_j = \sum_{k=1}^{n} M_{jk}\dot{q}_k$$

Thus the total magnetic energy $W_{mag}$, which is analogous to kinetic energy $T$, is given by summing over all
$n$ circuits to be

$$W_{mag} = T = \frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n} M_{jk}\dot{q}_j\dot{q}_k$$

Similarly the electrical energy $W_{elec}$ stored in the mutual capacitance $C_{jk}$ between the $n$ circuits, which
is analogous to potential energy $U$, is given by

$$W_{elec} = U = \frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n}\frac{q_j q_k}{C_{jk}}$$

Thus the standard Lagrangian for this electric system is given by

$$L = T - U = \frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n}\left[M_{jk}\dot{q}_j\dot{q}_k - \frac{q_j q_k}{C_{jk}}\right] \tag{a}$$

Assuming that Ohm's Law is obeyed, that is, the dissipation force depends linearly on velocity, then the
Rayleigh dissipation function can be written in the form

$$\mathcal{R} \equiv \frac{1}{2}\sum_{j=1}^{n}\sum_{k=1}^{n} R_{jk}\dot{q}_j\dot{q}_k \tag{b}$$

where $R_{jk}$ is the resistance matrix. Thus, for a symmetric resistance matrix, the dissipation force, expressed in
volts, is given by

$$Q_j^f = -\frac{\partial\mathcal{R}}{\partial\dot{q}_j} = -\sum_{k=1}^{n} R_{jk}\dot{q}_k \tag{c}$$

Inserting equations (a), (b), and (c) into equation 10.18, plus making the assumption that an additional
generalized electrical force $Q_j = \xi_j(t)$ volts is acting on circuit $j$, then the Euler-Lagrange equations give the
following equations of motion.

$$\sum_{k=1}^{n}\left[M_{jk}\ddot{q}_k + R_{jk}\dot{q}_k + \frac{q_k}{C_{jk}}\right] = \xi_j(t)$$

This is a generalized version of Kirchhoff's loop rule which can be seen by considering the case where the
diagonal term $j = k$ is the only non-zero term. Then

$$\left[M_{jj}\ddot{q}_j + R_{jj}\dot{q}_j + \frac{q_j}{C_{jj}}\right] = \xi_j(t)$$

This sum of the voltages is identical to the usual expression for Kirchhoff's loop rule. This example
illustrates the power of variational methods when applied to fields beyond classical mechanics.
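The generalized loop rule can be evaluated directly from the inductance, resistance, and capacitance matrices. The following is a minimal sketch; the function name and all numerical values are hypothetical, and an absent mutual capacitance is represented here by an infinite entry so that its term $q_k/C_{jk}$ vanishes.

```python
def loop_voltages(M, R, C, q, qdot, qddot):
    """Voltage sum for each circuit j: sum_k [M_jk*qddot_k + R_jk*qdot_k + q_k/C_jk]."""
    n = len(q)
    return [
        sum(M[j][k] * qddot[k] + R[j][k] * qdot[k] + q[k] / C[j][k] for k in range(n))
        for j in range(n)
    ]


INF = float("inf")  # stands for "no mutual capacitance" in this sketch

# Two hypothetical circuits with no coupling: only the diagonal terms are non-zero.
M = [[0.5, 0.0], [0.0, 2.0]]   # inductances (henry)
R = [[10.0, 0.0], [0.0, 4.0]]  # resistances (ohm)
C = [[0.5, INF], [INF, 0.25]]  # capacitances (farad)
q = [1.0, 0.5]                 # charges (coulomb)
qdot = [0.1, 0.25]             # currents (ampere)
qddot = [4.0, 1.0]             # rates of change of current (ampere per second)

print(loop_voltages(M, R, C, q, qdot, qddot))
```

With only diagonal terms present, each entry reduces to the single-loop sum of the inductor, resistor, and capacitor voltages, which must equal the applied voltage $\xi_j(t)$.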

10.5 Dissipative Lagrangians

The prior discussion of nonconservative systems mentioned the following three ways to incorporate dissipative
processes into Lagrangian or Hamiltonian mechanics. (1) Expand the number of degrees of freedom to include
all the active dissipative degrees of freedom as well as the conservative ones. (2) Use generalized forces
to incorporate dissipative processes. (3) Add dissipative terms to the Lagrangian or Hamiltonian to mimic
dissipation. The following illustrates the use of dissipative Lagrangians.
Bateman[Bat31] pointed out that an isolated dissipative system is physically incomplete, that is, a
complete system must comprise at least two coupled subsystems where energy is transferred from a dissipating
subsystem to an absorbing subsystem. A complete system should comprise both the dissipating and
absorbing systems to ensure that the total system Lagrangian and Hamiltonian are conserved, as is assumed
in conventional Lagrangian and Hamiltonian mechanics. Both Bateman and Dekker[Dek75] have illustrated
that the equations of motion for a linearly-damped, free, one-dimensional harmonic oscillator are derivable
using the Hamilton variational principle via introduction of a fictitious complementary subsystem that
mimics dissipative processes. The following example illustrates that deriving the equations of motion for the
linearly-damped, linear oscillator may be handled by three alternative equivalent non-standard Lagrangians
that assume either: (1) a multidimensional system, (2) explicitly time-dependent Lagrangians and
Hamiltonians, or (3) complex non-standard Lagrangians.

10.3 Example: The linearly-damped, linear oscillator:

Three toy dynamical models have been used to describe the linearly-damped, linear oscillator employing
very different non-standard Lagrangians to generate the required Hamiltonians, and to derive the correct
equations of motion.
1: Dual-component Lagrangian: $L_{Dual}$
Bateman proposed a dual system comprising a mass $m$ subject to two coupled one-dimensional variables
$(x, y)$ where $x$ is the observed variable and $y$ is the mirror variable for the subsystem that absorbs the energy
dissipated by the subsystem $x$.
Assume a non-standard Lagrangian of the form

$$L_{Dual} = \frac{m}{2}\left[\dot{x}\dot{y} - \frac{\Gamma}{2}\left[y\dot{x} - x\dot{y}\right] - \omega_0^2 xy\right] \tag{a}$$

where $\Gamma = \frac{b}{m}$ is the damping coefficient for a linear damping force $-b\dot{x}$. Minimizing by variation of the
auxiliary variable $y$, that is, $\Lambda_y L_{Dual} = 0$, leads to the uncoupled equation of motion for $x$

$$\frac{m}{2}\left[\ddot{x} + \Gamma\dot{x} + \omega_0^2 x\right] = 0 \tag{b}$$

Similarly minimizing by variation of the primary variable $x$, that is $\Lambda_x L_{Dual} = 0$, leads to the uncoupled equation
of motion for $y$

$$\frac{m}{2}\left[\ddot{y} - \Gamma\dot{y} + \omega_0^2 y\right] = 0 \tag{c}$$

Note that equation of motion (b), which was obtained by variation of the auxiliary variable $y$, corresponds
to that for the usual free, linearly-damped, one-dimensional harmonic oscillator for the $x$ variable which
dissipates energy as is discussed in chapter 3.5. The equation of motion (c) is obtained by variation of the
primary variable $x$ and corresponds to a free linear, one-dimensional, oscillator for the $y$ variable that is
absorbing the energy dissipated by the dissipating $x$ system.
The generalized momenta,

$$p_i \equiv \frac{\partial L_{Dual}}{\partial\dot{q}_i}$$

can be used to derive the corresponding Hamiltonian

$$H_{Dual}(x, p_x, y, p_y) = \left[p_x\dot{x} + p_y\dot{y} - L_{Dual}\right] = \frac{2p_x p_y}{m} - \frac{\Gamma}{2}\left[xp_x - yp_y\right] + \frac{m}{2}\left(\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2\right)xy \tag{d}$$

Note that this Hamiltonian is time independent, and thus is conserved for this complete dual-variable system.
Using Hamilton's equations of motion gives the same two uncoupled equations of motion as obtained using
the Lagrangian, i.e. (b) and (c).
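The conservation of the dual-system Hamiltonian can be illustrated numerically. The sketch below integrates the damped equation for $x$ and the anti-damped equation for $y$ with a fourth-order Runge-Kutta step, and evaluates the Hamiltonian in the velocity form $H = \frac{m}{2}\left(\dot{x}\dot{y} + \omega_0^2 xy\right)$ that results from substituting the generalized momenta back into the Hamiltonian. The function names, step size, and parameter values are hypothetical choices for illustration.

```python
def bateman_rhs(state, gamma, omega0):
    """Time derivatives of (x, vx, y, vy): x is damped, the mirror variable y is anti-damped."""
    x, vx, y, vy = state
    return (vx, -gamma * vx - omega0**2 * x,
            vy, +gamma * vy - omega0**2 * y)


def rk4_step(state, dt, gamma, omega0):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = bateman_rhs(state, gamma, omega0)
    k2 = bateman_rhs(shifted(state, k1, dt / 2), gamma, omega0)
    k3 = bateman_rhs(shifted(state, k2, dt / 2), gamma, omega0)
    k4 = bateman_rhs(shifted(state, k3, dt), gamma, omega0)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def dual_hamiltonian(state, m, omega0):
    """Dual-system Hamiltonian written in terms of velocities: (m/2)(vx*vy + omega0^2*x*y)."""
    x, vx, y, vy = state
    return 0.5 * m * (vx * vy + omega0**2 * x * y)


def integrate(state, dt, steps, gamma, omega0):
    for _ in range(steps):
        state = rk4_step(state, dt, gamma, omega0)
    return state


# Hypothetical parameters: m = 1, Gamma = 0.3, omega_0 = 2, start at rest with x = y = 1.
start = (1.0, 0.0, 1.0, 0.0)
end = integrate(start, 0.001, 5000, 0.3, 2.0)
print(dual_hamiltonian(start, 1.0, 2.0), dual_hamiltonian(end, 1.0, 2.0))
```

The oscillation energy of the $x$ subsystem alone decreases with time while that of the $y$ subsystem grows, yet the combined Hamiltonian stays constant to within the integration error.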
2: Time-dependent Lagrangian: $L_{Damped}$
The complementary subsystem of the above dual-component Lagrangian, that is added to the primary
dissipative subsystem, is the adjoint to the equations for the primary subsystem of interest. In some cases, a
set of the solutions of the complementary equations can be expressed in terms of the solutions of the primary
subsystem allowing the equations of motion to be expressed solely in terms of the variables of the primary
subsystem. Inspection of the solutions of the damped harmonic oscillator, presented in chapter 3.5, implies
that $x$ and $y$ must be related by the function

$$y = xe^{\Gamma t} \tag{e}$$

Therefore Bateman proposed a time-dependent, non-standard Lagrangian $L_{Damped}$ of the form

$$L_{Damped} = \frac{m}{2}e^{\Gamma t}\left[\dot{x}^2 - \omega_0^2 x^2\right] \tag{f}$$

This Lagrangian $L_{Damped}$ corresponds to a harmonic oscillator for which the mass $m = m_0 e^{\Gamma t}$ is accreting
exponentially with time in order to mimic the exponential energy dissipation. Use of this Lagrangian in the
Euler-Lagrange equations gives the solution

$$me^{\Gamma t}\left[\ddot{x} + \Gamma\dot{x} + \omega_0^2 x\right] = 0 \tag{g}$$

If the factor outside of the bracket is non-zero, then the equation in the bracket must be zero. The expression
in the bracket is the required equation of motion for the linearly-damped linear oscillator. This Lagrangian
generates a generalized momentum of

$$p = me^{\Gamma t}\dot{x}$$

and the Hamiltonian is

$$H_{Damped} = p\dot{x} - L_{Damped} = \frac{p^2}{2m}e^{-\Gamma t} + \frac{m}{2}\omega_0^2 e^{\Gamma t}x^2 \tag{h}$$

The Hamiltonian is time dependent as expected. This leads to Hamilton's equations of motion

$$\dot{x} = \frac{\partial H_{Damped}}{\partial p} = \frac{p}{m}e^{-\Gamma t} \tag{i}$$

$$-\dot{p} = \frac{\partial H_{Damped}}{\partial x} = m\omega_0^2 e^{\Gamma t}x \tag{j}$$

Taking the total time derivative of equation (i) and using equation (j) to substitute for $\dot{p}$ gives

$$e^{\Gamma t}\left[\ddot{x} + \Gamma\dot{x} + \omega_0^2 x\right] = 0 \tag{k}$$

If the term $e^{\Gamma t}$ is non-zero, then the term in brackets is zero. The term in the bracket is the usual equation
of motion for the linearly-damped harmonic oscillator.
3: Complex Lagrangian: $L_{Complex}$
Dekker proposed use of complex dynamical variables for solving the linearly-damped harmonic oscillator.
It exploits the fact that, in principle, each second order differential equation can be expressed in terms of
a set of first-order differential equations. This feature is the essential difference between Lagrangian and
Hamiltonian mechanics. Let $q$ be complex and assume it can be expressed in the form of a real variable $x$ as

$$q = \dot{x} - \left(i\omega - \frac{\Gamma}{2}\right)x \tag{l}$$

Substituting this complex variable into the relation

$$\left[\dot{q} + \left(i\omega + \frac{\Gamma}{2}\right)q\right] = 0 \tag{m}$$

leads to the second-order equation for the real variable $x$ of

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0 \tag{n}$$

This is the desired equation of motion for the linearly-damped harmonic oscillator. This result can be
shown explicitly by inserting equation (l) and its time derivative into equation (m), and using
$\omega^2 + \left(\frac{\Gamma}{2}\right)^2 = \omega_0^2$, i.e.

$$\dot{q} + \left(i\omega + \frac{\Gamma}{2}\right)q = \ddot{x} - \left(i\omega - \frac{\Gamma}{2}\right)\dot{x} + \left(i\omega + \frac{\Gamma}{2}\right)\left[\dot{x} - \left(i\omega - \frac{\Gamma}{2}\right)x\right] = \ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0 \tag{o}$$
This feature is exploited using the following Lagrangian

$$L_{Complex} = \frac{i}{2}\left(q^*\dot{q} - q\dot{q}^*\right) - \left[\omega - i\frac{\Gamma}{2}\right]q^* q \tag{p}$$

where $\omega^2 \equiv \omega_0^2 - \left(\frac{\Gamma}{2}\right)^2$. The Lagrangian $L_{Complex}$ is real for a conservative system and complex for a
dissipative system. Using the Lagrange-Euler equation for variation of $q^*$, that is, $\Lambda_{q^*}L_{Complex} = 0$, gives
equation (m) which leads to the required equation of motion (n).
The canonical conjugate momenta are given by

$$p = \frac{\partial L_{Complex}}{\partial\dot{q}} \qquad \tilde{p} = \frac{\partial L_{Complex}}{\partial\dot{q}^*} \tag{q}$$

The above Lagrangian plus canonically conjugate momenta lead to the complementary Hamiltonians

$$H(q, p, q^*, \tilde{p}) = \left(i\omega + \frac{\Gamma}{2}\right)\left(\tilde{p}q^* - pq\right) \tag{r}$$

$$\tilde{H}(q, p, q^*, \tilde{p}) = \left(i\omega - \frac{\Gamma}{2}\right)\left(\tilde{p}q^* - pq\right) \tag{s}$$

These Hamiltonians give Hamilton equations of motion that lead to the correct equations of motion for $q$
and $q^*$.
The above examples have shown that three very different, non-standard, Lagrangians, plus their
corresponding Hamiltonians, all lead to the correct equation of motion for the linearly-damped harmonic
oscillator. This illustrates the power of using non-standard Lagrangians to describe dissipative motion in classical
mechanics. However, postulating non-standard Lagrangians to produce the required equations of motion
appears to be of questionable usefulness. A fundamental approach is needed to build a firm foundation upon
which non-standard Lagrangian mechanics can be based. Non-standard Lagrangian mechanics remains an
active, albeit narrow, frontier of classical mechanics.

10.6 Summary
Dissipative drag forces are non-conservative and usually are velocity dependent. Chapter 4 showed that the
motion of non-linear dissipative dynamical systems can be highly sensitive to the initial conditions and can
lead to chaotic motion.

Algebraic mechanics for nonconservative systems Since Lagrangian and Hamiltonian formulations
are invalid for the nonconservative degrees of freedom, the following three approaches are used to include
nonconservative degrees of freedom directly in the Lagrangian and Hamiltonian formulations of mechanics.
1. Expand the number of degrees of freedom used to include all active degrees of freedom for the system,
so that the expanded system is conservative. This is the preferred approach when it is viable.
Unfortunately this approach typically is impractical for handling dissipative processes because of the large
number of degrees of freedom that are involved in thermal dissipation.
2. Nonconservative forces can be introduced directly at the equations of motion stage as generalized forces
$Q_j^{EXC}$. This approach is used extensively. For the case of linear velocity dependence, Rayleigh's
dissipation function provides an elegant and powerful way to express the generalized forces in terms of
a scalar potential function.
3. New degrees of freedom or effective forces can be postulated that are then incorporated into the
Lagrangian or the Hamiltonian in order to mimic the effects of the nonconservative forces.

Rayleigh's dissipation function Generalized dissipative forces that have a linear velocity dependence
can be easily handled in Lagrangian or Hamiltonian mechanics by introducing the powerful Rayleigh's
dissipation function $\mathcal{R}(\dot{\mathbf{q}})$ where

$$\mathcal{R}(\dot{\mathbf{q}}) \equiv \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} b_{ij}\dot{q}_i\dot{q}_j \tag{10.7}$$

This approach is used extensively in physics. This approach has been generalized to nonlinear velocity
dependence by defining a generalized Rayleigh dissipation function for which

$$\mathbf{F}^f_i = -\frac{\partial\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})}{\partial\dot{\mathbf{q}}} \tag{10.16}$$

where the generalized Rayleigh dissipation function $\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})$ satisfies the general Lagrange mechanics relation

$$\frac{\delta L}{\delta q} - \frac{\partial\mathcal{R}}{\partial\dot{q}} = 0 \tag{10.17}$$

This generalized Rayleigh's dissipation function eliminates the prior restriction to linear dissipation processes,
which greatly expands the range of validity for using Rayleigh's dissipation function.

Rayleigh dissipation in Lagrange equations of motion Linear dissipative forces can be directly, and
elegantly, included in Lagrangian mechanics by using Rayleigh's dissipation function as a generalized force
$Q_j^f$. Inserting Rayleigh dissipation function 10.15 in the generalized Lagrange equations of motion 6.60 gives

$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} = \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_j}(\mathbf{q},t) + Q_j^{EXC}\right] - \frac{\partial\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})}{\partial\dot{q}_j} \tag{10.18}$$

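Rayleigh dissipation in Hamiltonian mechanics If the nonconservative forces depend linearly on
velocity, and are derivable from Rayleigh's dissipation function according to equation 10.15, then using the
definition of generalized momentum gives

$$\dot{p}_i = \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i} = \frac{\partial L}{\partial q_i} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_i}(\mathbf{q},t) + Q_i^{EXC}\right] - \frac{\partial\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})}{\partial\dot{q}_i} \tag{10.19}$$

$$\dot{p}_i = -\frac{\partial H(\mathbf{p},\mathbf{q},t)}{\partial q_i} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_i}(\mathbf{q},t) + Q_i^{EXC}\right] - \frac{\partial\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})}{\partial\dot{q}_i} \tag{10.20}$$

Thus Hamilton's equations become

$$\dot{q}_i = \frac{\partial H}{\partial p_i} \tag{10.21}$$

$$\dot{p}_i = -\frac{\partial H}{\partial q_i} + \left[\sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q_i}(\mathbf{q},t) + Q_i^{EXC}\right] - \frac{\partial\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})}{\partial\dot{q}_i} \tag{10.22}$$

The Rayleigh dissipation function $\mathcal{R}(\mathbf{q},\dot{\mathbf{q}})$ provides an elegant and convenient way to account for
dissipative forces in both Lagrangian and Hamiltonian mechanics.

Dissipative Lagrangians or Hamiltonians New degrees of freedom or effective forces can be postulated
that are then incorporated into the Lagrangian or the Hamiltonian in order to mimic the effects of the
nonconservative forces. This approach has been used for special cases.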
Chapter 11

Conservative two-body central forces

11.1 Introduction
Conservative two-body central forces are important in physics because of the pivotal role that the Coulomb
and the gravitational forces play in nature. The Coulomb force plays a role in electrodynamics, molecular,
atomic, and nuclear physics, while the gravitational force plays an analogous role in celestial mechanics.
Therefore this chapter focusses on the physics of systems involving conservative two-body central forces
because of the importance and ubiquity of these conservative two-body central forces in nature.
A conservative two-body central force has the following three important attributes.

1. Conservative: A conservative force depends only on the particle position, that is, the force is not
time dependent. Moreover the work done by the force moving a body between any two points 1 and 2
is path independent. Conservative fields are discussed in chapter 2.10.

2. Two-body: A two-body force between two bodies depends only on the relative locations of the two
interacting bodies and is not influenced by the proximity of additional bodies. For two-body forces
acting between $n$ bodies, the force on body 1 is the vector superposition of the two-body forces due
to the interactions with each of the other $n - 1$ bodies. This differs from three-body forces where the
force between any two bodies is influenced by the proximity of a third body.

3. Central: A central force field depends on the distance $r_{12}$ from the origin of the force at point 1 to
the body location at point 2, and the force is directed along the line joining them, that is, $\hat{\mathbf{r}}_{12}$.

A conservative, two-body, central force combines the above three attributes and can be expressed as

$$\mathbf{F}_{21} = f(r_{12})\hat{\mathbf{r}}_{12} \tag{11.1}$$

The force field $\mathbf{F}_{21}$ has a magnitude $f(r_{12})$ that depends only on the magnitude of the relative separation
vector $\mathbf{r}_{12} = \mathbf{r}_2 - \mathbf{r}_1$ between the origin of the force at point 1 and point 2 where the force acts, and the force
is directed along the line joining them, that is, $\hat{\mathbf{r}}_{12}$.
Chapter 2.10 showed that if a two-body central force is conservative, then it can be written as the gradient
of a scalar potential energy $U(r)$ which is a function of the distance from the center of the force field.

$$\mathbf{F}_{21} = -\nabla U(r_{12}) \tag{11.2}$$

As discussed in chapter 2, the ability to represent the conservative central force by a scalar function $U(r)$
greatly simplifies the treatment of central forces.
The Coulomb and gravitational forces both are true conservative, two-body, central forces whereas the
nuclear force between nucleons in the nucleus has three-body components. Two bodies interacting via a
two-body central force is the simplest possible system to consider, but equation 11.1 is applicable equally
for $n$ bodies interacting via two-body central forces because the superposition principle applies for two-body
central forces. This chapter will focus first on the motion of two bodies interacting via conservative two-body
central forces followed by a brief discussion of the motion for $n > 2$ interacting bodies.


11.2 Equivalent one-body representation for two-body motion

The motion of two bodies, 1 and 2, interacting via two-body central forces, requires 6 spatial coordinates,
that is, three each for $\mathbf{r}_1$ and $\mathbf{r}_2$. Since the two-body central force only depends on the relative separation
$\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$ of the two bodies, it is more convenient to separate the 6 degrees of freedom into 3 spatial
coordinates of relative motion $\mathbf{r}$ plus 3 spatial coordinates for the center-of-mass location $\mathbf{R}$ as described in
chapter 2.7. It will be shown here that the equation of motion for relative motion of the two bodies in the center
of mass can be represented by an equivalent one-body problem which simplifies the mathematics.
Consider two bodies acted upon by a conservative two-body central force, where the position vectors $\mathbf{r}_1$
and $\mathbf{r}_2$ specify the location of each particle as illustrated in figure 11.1. An alternate set of six variables would
be the three components of the center of mass position vector $\mathbf{R}$ and the three components specifying the
difference vector $\mathbf{r}$ defined by figure 11.1. Define the vectors $\mathbf{r}'_1$ and $\mathbf{r}'_2$ as the position vectors of the masses
$m_1$ and $m_2$ with respect to the center of mass. Then

$$\mathbf{r}_1 = \mathbf{R} + \mathbf{r}'_1 \qquad \mathbf{r}_2 = \mathbf{R} + \mathbf{r}'_2 \tag{11.3}$$

[Figure 11.1: Center of mass coordinates for the two-body system.]

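By the definition of the center of mass

$$\mathbf{R} = \frac{m_1\mathbf{r}_1 + m_2\mathbf{r}_2}{m_1 + m_2} \tag{11.4}$$

and

$$m_1\mathbf{r}'_1 + m_2\mathbf{r}'_2 = 0 \tag{11.5}$$

so that

$$-\frac{m_1}{m_2}\mathbf{r}'_1 = \mathbf{r}'_2 \tag{11.6}$$

Therefore

$$\mathbf{r} = \mathbf{r}'_1 - \mathbf{r}'_2 = \frac{m_1 + m_2}{m_2}\mathbf{r}'_1 \tag{11.7}$$

that is,

$$\mathbf{r}'_1 = \frac{m_2}{m_1 + m_2}\mathbf{r} \tag{11.8}$$

Similarly,

$$\mathbf{r}'_2 = -\frac{m_1}{m_1 + m_2}\mathbf{r} \tag{11.9}$$

Substituting these into equation 11.3 gives

$$\mathbf{r}_1 = \mathbf{R} + \mathbf{r}'_1 = \mathbf{R} + \frac{m_2}{m_1 + m_2}\mathbf{r} \qquad \mathbf{r}_2 = \mathbf{R} + \mathbf{r}'_2 = \mathbf{R} - \frac{m_1}{m_1 + m_2}\mathbf{r} \tag{11.10}$$

That is, the two vectors $\mathbf{r}_1, \mathbf{r}_2$ are written in terms of the position vector for the center of mass $\mathbf{R}$ and the
position vector $\mathbf{r}$ for relative motion in the center of mass.
Assuming that the two-body central force is conservative and represented by $U(r)$, then the Lagrangian
of the two-body system can be written as

$$L = \frac{1}{2}m_1|\dot{\mathbf{r}}_1|^2 + \frac{1}{2}m_2|\dot{\mathbf{r}}_2|^2 - U(r) \tag{11.11}$$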

Differentiating equations 11.10 with respect to time, and inserting them into the Lagrangian, gives

$$L = \frac{1}{2}M|\dot{\mathbf{R}}|^2 + \frac{1}{2}\mu|\dot{\mathbf{r}}|^2 - U(r) \tag{11.12}$$

where the total mass $M$ is defined as

$$M = m_1 + m_2 \tag{11.13}$$

and the reduced mass $\mu$ is defined by

$$\mu \equiv \frac{m_1 m_2}{m_1 + m_2} \tag{11.14}$$

or equivalently

$$\frac{1}{\mu} = \frac{1}{m_1} + \frac{1}{m_2} \tag{11.15}$$

The total Lagrangian can be separated into two independent parts

$$L = \frac{1}{2}M|\dot{\mathbf{R}}|^2 + L_{cm} \tag{11.16}$$

where

$$L_{cm} = \frac{1}{2}\mu|\dot{\mathbf{r}}|^2 - U(r) \tag{11.17}$$

Assuming that no external forces are acting, then $\frac{\partial L}{\partial\mathbf{R}} = 0$ and the three Lagrange equations for each of the
three coordinates of the $\mathbf{R}$ coordinate can be written as

$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\mathbf{R}}} = \frac{d\mathbf{P}}{dt} = 0 \tag{11.18}$$

That is, for a pure central force, the center-of-mass momentum $\mathbf{P}$ is a constant of motion where

$$\mathbf{P} = \frac{\partial L}{\partial\dot{\mathbf{R}}} = M\dot{\mathbf{R}} \tag{11.19}$$
It is convenient to work in the center-of-mass frame using the effective Lagrangian $L_{cm}$. In the
center-of-mass frame of reference, the translational kinetic energy $\frac{1}{2}M|\dot{\mathbf{R}}|^2$ associated with center-of-mass
motion is ignored, and only the energy in the center-of-mass is considered. This center-of-mass energy is the
energy involved in the interaction between the colliding bodies. Thus, in the center-of-mass, the problem has
been reduced to an equivalent one-body problem of a mass $\mu$ moving about a fixed force center with a path
given by $\mathbf{r}$ which is the separation vector between the two bodies, as shown in figure 11.2. In reality, both
masses revolve around their center of mass, also called the barycenter, in the center-of-mass frame as shown
in figure 11.2. Knowing $\mathbf{r}$ allows the trajectory of each mass about the center of mass, $\mathbf{r}'_1$ and $\mathbf{r}'_2$, to be
calculated. Of course the true path in the laboratory frame of reference must take into account the translational
motion of the center of mass in addition to the motion of the equivalent one-body representation relative to
the barycenter. Be careful to remember the difference between the actual trajectories of each body, and the
effective trajectory assumed when using the reduced mass which only determines the relative separation $\mathbf{r}$ of
the two bodies. This reduction to an equivalent one-body problem greatly simplifies the solution of the motion,
but it misrepresents the actual trajectories and the spatial locations of each mass in space.
The equivalent one-body representation will be used extensively throughout this chapter.

[Figure 11.2: Orbits of a two-body system with mass ratio of 2 rotating about the center-of-mass, O. The dashed
ellipse is the equivalent one-body orbit with the center of force at the focus O.]
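The transformation between laboratory coordinates and the center-of-mass plus relative coordinates, equations 11.4 and 11.10, is easy to express in code. The following is a minimal sketch with hypothetical function names and test values; because the transformation is linear, the same functions apply to velocities, which allows the separation of the kinetic energy in equation 11.12 to be checked.

```python
def reduced_mass(m1, m2):
    """Reduced mass mu = m1*m2/(m1 + m2) (eq. 11.14)."""
    return m1 * m2 / (m1 + m2)


def to_cm_and_relative(m1, m2, r1, r2):
    """Return (R, r): center-of-mass vector (eq. 11.4) and relative vector r = r1 - r2."""
    total = m1 + m2
    R = tuple((m1 * a + m2 * b) / total for a, b in zip(r1, r2))
    r = tuple(a - b for a, b in zip(r1, r2))
    return R, r


def to_lab(m1, m2, R, r):
    """Invert the transformation using eq. 11.10."""
    total = m1 + m2
    r1 = tuple(Rc + m2 / total * rc for Rc, rc in zip(R, r))
    r2 = tuple(Rc - m1 / total * rc for Rc, rc in zip(R, r))
    return r1, r2


def squared_norm(v):
    return sum(c * c for c in v)


# Hypothetical masses and velocities.
m1, m2 = 2.0, 1.0
v1, v2 = (1.0, 0.0, 0.0), (0.0, -2.0, 0.0)
V, v = to_cm_and_relative(m1, m2, v1, v2)

kinetic_lab = 0.5 * m1 * squared_norm(v1) + 0.5 * m2 * squared_norm(v2)
kinetic_split = 0.5 * (m1 + m2) * squared_norm(V) + 0.5 * reduced_mass(m1, m2) * squared_norm(v)
print(kinetic_lab, kinetic_split)
```

The two printed kinetic energies agree, which is the statement that the Lagrangian separates into the center-of-mass term and the reduced-mass term.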

11.3 Angular momentum L

The notation used for the angular momentum vector is $\mathbf{L}$ where the magnitude is designated by $|\mathbf{L}| = l$.
Be careful not to confuse the angular momentum vector $\mathbf{L}$ with the Lagrangian $L$. Note that the angular
momentum for two-body rotation about the center of mass with angular velocity $\boldsymbol{\omega}$ is identical when evaluated
in either the laboratory or equivalent one-body representation. That is, using equations 11.8 and 11.9

$$\mathbf{L} = m_1 r_1'^2\boldsymbol{\omega} + m_2 r_2'^2\boldsymbol{\omega} = \mu r^2\boldsymbol{\omega} \tag{11.20}$$

The center-of-mass Lagrangian leads to the following two general properties regarding the angular
momentum vector $\mathbf{L}$.
1) The motion lies entirely in a plane perpendicular to the fixed direction of the total angular momentum
vector. This is because

$$\mathbf{L}\cdot\mathbf{r} = \mathbf{r}\times\mathbf{p}\cdot\mathbf{r} = 0 \tag{11.21}$$

that is, the radius vector is in the plane perpendicular to the total angular momentum vector. Thus, it is
possible to express the Lagrangian in polar coordinates, $(r, \psi)$, rather than spherical coordinates. In polar
coordinates the center-of-mass Lagrangian becomes

$$L_{cm} = \frac{1}{2}\mu\left(\dot{r}^2 + r^2\dot{\psi}^2\right) - U(r) \tag{11.22}$$
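2) If the potential is spherically symmetric, then the polar angle $\psi$ is cyclic and therefore Noether's
theorem gives that the angular momentum $\mathbf{p}_\psi \equiv \mathbf{L} = \mathbf{r}\times\mathbf{p}$ is a constant of motion. That is, since
$\frac{\partial L_{cm}}{\partial\psi} = 0$ then the Lagrange equations imply that

$$\dot{\mathbf{p}}_\psi = \frac{d}{dt}\frac{\partial L_{cm}}{\partial\dot{\boldsymbol{\psi}}} = 0 \tag{11.23}$$

where the vectors $\dot{\mathbf{p}}_\psi$ and $\dot{\boldsymbol{\psi}}$ imply that equation 11.23 refers to three independent equations corresponding
to the three components of these vectors. Thus the angular momentum $\mathbf{p}_\psi$ conjugate to $\boldsymbol{\psi}$ is a constant of
motion. The generalized momentum $\mathbf{p}_\psi$ is a first integral of the motion which equals

$$\mathbf{p}_\psi = \frac{\partial L_{cm}}{\partial\dot{\boldsymbol{\psi}}} = \mu r^2\dot{\boldsymbol{\psi}} = l\,\hat{\mathbf{p}}_\psi \tag{11.24}$$

where the magnitude of the angular momentum $l$, and the direction $\hat{\mathbf{p}}_\psi$, both are constants of motion.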
A simple geometric interpretation of equation 11.24 is illustrated in figure 11.3. The radius vector sweeps
out an area $d\mathbf{A}$ in time $dt$ where

$$d\mathbf{A} = \frac{1}{2}\mathbf{r}\times\mathbf{v}\,dt \tag{11.25}$$

and the vector $d\mathbf{A}$ is perpendicular to the $x-y$ plane. The rate of change of area is

$$\frac{d\mathbf{A}}{dt} = \frac{1}{2}\mathbf{r}\times\mathbf{v} \tag{11.26}$$

But the angular momentum is

$$\mathbf{L} = \mathbf{r}\times\mathbf{p} = \mu\mathbf{r}\times\mathbf{v} = 2\mu\frac{d\mathbf{A}}{dt} \tag{11.27}$$

[Figure 11.3: Area swept out by the radius vector in the time dt.]

Thus the conservation of angular momentum implies that the areal velocity $\frac{d\mathbf{A}}{dt}$ also is a constant of motion.
This fact is called Kepler's second law of planetary motion which he deduced in 1609 based on Tycho Brahe's
55 years of observational records of the motion of Mars. Kepler's second law implies that a planet moves
fastest when closest to the sun and slowest when farthest from the sun. Note that Kepler's second law is a
statement of the conservation of angular momentum which is independent of the radial form of the central
potential.
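The constancy of the areal velocity can be illustrated by integrating the equivalent one-body motion numerically. The sketch below assumes an attractive inverse-square force of hypothetical strength `k` acting on a reduced mass `mu`, and uses the velocity-Verlet scheme, which was chosen here because it preserves angular momentum for a central force up to rounding error. All names and parameter values are illustrative assumptions.

```python
import math


def acceleration(x, y, k, mu):
    """Acceleration of the reduced mass for an assumed attractive force of magnitude k/r^2."""
    r = math.hypot(x, y)
    return -k * x / (mu * r**3), -k * y / (mu * r**3)


def areal_velocities(x, y, vx, vy, k, mu, dt, steps):
    """Integrate with velocity Verlet and record dA/dt = (x*vy - y*vx)/2 after each step."""
    ax, ay = acceleration(x, y, k, mu)
    rates = []
    for _ in range(steps):
        x += vx * dt + 0.5 * ax * dt * dt
        y += vy * dt + 0.5 * ay * dt * dt
        ax_new, ay_new = acceleration(x, y, k, mu)
        vx += 0.5 * (ax + ax_new) * dt
        vy += 0.5 * (ay + ay_new) * dt
        ax, ay = ax_new, ay_new
        rates.append(0.5 * (x * vy - y * vx))
    return rates


# Hypothetical bound orbit: start at r = 1 with a purely tangential speed of 0.8.
rates = areal_velocities(1.0, 0.0, 0.0, 0.8, 1.0, 1.0, 0.001, 5000)
print(min(rates), max(rates))
```

Replacing the inverse-square force in `acceleration` by any other force directed along the radius vector leaves the areal velocity constant, in line with the remark that Kepler's second law does not depend on the radial form of the central potential.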

11.4 Equations of motion

The equations of motion for two bodies interacting via a conservative two-body central force can be
determined using the center of mass Lagrangian, $L_{cm}$, given by equation 11.22. For the radial coordinate, the
operator equation $\Lambda_r L_{cm} = 0$ for Lagrangian mechanics leads to

$$\frac{d}{dt}(\mu\dot{r}) - \mu r\dot{\psi}^2 + \frac{\partial U}{\partial r} = 0 \tag{11.28}$$

But

$$\dot{\psi} = \frac{l}{\mu r^2} \tag{11.29}$$

therefore the radial equation of motion is

$$\mu\ddot{r} = -\frac{\partial U}{\partial r} + \frac{l^2}{\mu r^3} \tag{11.30}$$

Similarly, for the angular coordinate, the operator equation $\Lambda_\psi L_{cm} = 0$ leads to equation 11.24. That is,
the angular equation of motion for the magnitude of $\mathbf{p}_\psi$ is

$$p_\psi = \frac{\partial L_{cm}}{\partial\dot{\psi}} = \mu r^2\dot{\psi} = l \tag{11.31}$$

Lagrange's equations have given two equations of motion, one dependent on radius $r$ and the other on
the polar angle $\psi$. Note that the radial acceleration is just a statement of Newton's Laws of motion for the
radial force $F_r$ in the center-of-mass system of

$$F_r = -\frac{\partial U}{\partial r} + \frac{l^2}{\mu r^3} \tag{11.32}$$

This can be written in terms of an effective potential

$$U_{eff}(r) \equiv U(r) + \frac{l^2}{2\mu r^2} \tag{11.33}$$

which leads to an equation of motion

$$F_r = \mu\ddot{r} = -\frac{\partial U_{eff}(r)}{\partial r} \tag{11.34}$$
Since $\frac{l^2}{\mu r^3} = \mu r\dot{\psi}^2$, the second term in equation 11.32, which derives from the second term of the
effective potential 11.33, is the usual centrifugal force that originates because the variable $r$ is in a
non-inertial, rotating frame of reference. Note that the angular equation of motion is independent of the radial
dependence of the conservative two-body central force.
Figure 11.4 shows, by dashed lines, the radial dependence of the potential corresponding to the attractive
inverse square law force, that is $U = -\frac{k}{r}$, and the potential corresponding to the centrifugal term
$\frac{l^2}{2\mu r^2}$ corresponding to a repulsive centrifugal force. The sum of these two potentials $U_{eff}(r)$, shown by the
solid line, has a minimum value at a certain radius similar to that manifest by the diatomic molecule discussed
in example 2.7.

[Figure 11.4: The attractive inverse-square law potential ($-\frac{k}{r}$), the centrifugal potential ($\frac{l^2}{2\mu r^2}$), and
the combined effective bound potential.]

It is remarkable that the six-dimensional equations of motion, for two bodies interacting via a two-body
central force, have been reduced to trivial center-of-mass translational motion, plus a one-dimensional
one-body problem given by (11.34) in terms of the relative separation $r$ and an effective potential $U_{eff}(r)$.
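The minimum of the effective potential can be located explicitly for the attractive inverse-square case. Setting the radial force 11.32 to zero for $U = -\frac{k}{r}$ gives the radius $r_0 = \frac{l^2}{\mu k}$, where the effective potential takes the value $-\frac{\mu k^2}{2l^2}$. The sketch below checks these two statements numerically; the function names and parameter values are hypothetical.

```python
def effective_potential(r, k, l, mu):
    """U_eff(r) = -k/r + l^2/(2*mu*r^2) for an attractive inverse-square force (eq. 11.33)."""
    return -k / r + l**2 / (2.0 * mu * r**2)


def radial_force(r, k, l, mu):
    """F_r = -dU_eff/dr = -k/r^2 + l^2/(mu*r^3) (eq. 11.32 with U = -k/r)."""
    return -k / r**2 + l**2 / (mu * r**3)


def circular_orbit_radius(k, l, mu):
    """Radius at which the radial force vanishes, r_0 = l^2/(mu*k)."""
    return l**2 / (mu * k)


# Hypothetical parameters.
k, l, mu = 2.0, 1.5, 0.5
r0 = circular_orbit_radius(k, l, mu)
print(r0, radial_force(r0, k, l, mu), effective_potential(r0, k, l, mu))
```

At $r_0$ the attractive force and the centrifugal term balance, which corresponds to a circular orbit of the equivalent one-body problem.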

11.5 Differential orbit equation

The differential orbit equation relates the shape of the orbital motion, in plane polar coordinates, to the radial dependence of the two-body central force. A Binet coordinate transformation, which depends on the functional form of $\mathbf{F}(\mathbf{r})$, can simplify the differential orbit equation. For the inverse-square law force, the best Binet transformed variable is $u$, which is defined to be
$$u \equiv \frac{1}{r} \tag{11.35}$$
Inserting the transformed variable $u$ into equation 11.29 gives
$$\dot\psi = \frac{l u^2}{\mu} \tag{11.36}$$
From the definition of the new variable
$$\frac{dr}{dt} = -u^{-2}\frac{du}{dt} = -u^{-2}\dot\psi\frac{du}{d\psi} = -\frac{l}{\mu}\frac{du}{d\psi} \tag{11.37}$$
Differentiating again gives
$$\frac{d^2 r}{dt^2} = -\frac{l}{\mu}\frac{d}{dt}\left(\frac{du}{d\psi}\right) = -\left(\frac{l u}{\mu}\right)^2\frac{d^2 u}{d\psi^2} \tag{11.38}$$
Substituting these into Lagrange's radial equation of motion gives
$$\frac{d^2 u}{d\psi^2} + u = -\frac{\mu}{l^2}\frac{1}{u^2}F\left(\frac{1}{u}\right) \tag{11.39}$$
Binet's differential orbit equation directly relates $\psi$ and $r$, which determines the overall shape of the orbit trajectory. This shape is crucial for understanding the orbital motion of two bodies interacting via a two-body central force. Note that for the special case of an inverse square-law force, that is where $F(\frac{1}{u}) = k u^2$, the right-hand side of equation 11.39 equals a constant $-\frac{\mu k}{l^2}$ since the orbital angular momentum is a conserved quantity.
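Binet's equation, $\frac{d^2u}{d\psi^2} + u = -\frac{\mu}{l^2 u^2}F(\frac{1}{u})$, is an ordinary differential equation in the angle and can be integrated numerically for any force law. The sketch below uses a classical fourth-order Runge-Kutta step; the function name, step count, and numerical values are illustrative assumptions. For the inverse-square force $F = k/r^2$ the right-hand side is the constant $-\mu k/l^2$, so the result can be compared with the closed form $u = -\frac{\mu k}{l^2}(1 + \epsilon\cos\psi)$.

```python
import math

def integrate_binet(force, u0, du0, l, mu, psi_end, n_steps):
    """Integrate d2u/dpsi2 + u = -(mu/l^2) F(1/u)/u^2 from psi = 0 with RK4.

    force(r) is the radial force; returns (u, du/dpsi) at psi_end.
    """
    def accel(u):
        return -u - mu * force(1.0 / u) / (l**2 * u**2)

    h = psi_end / n_steps
    u, v = u0, du0
    for _ in range(n_steps):
        k1u, k1v = v, accel(u)
        k2u, k2v = v + 0.5 * h * k1v, accel(u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, accel(u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, accel(u + h * k3u)
        u += h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return u, v

# Inverse-square attraction with assumed illustrative values
k, l, mu, eps = -1.0, 1.0, 1.0, 0.5
inverse_square = lambda r: k / r**2
u_start = -(mu * k / l**2) * (1.0 + eps)       # start at the point of closest approach
u_num, _ = integrate_binet(inverse_square, u_start, 0.0, l, mu, 2.0, 2000)
u_exact = -(mu * k / l**2) * (1.0 + eps * math.cos(2.0))
print(u_num, u_exact)
```

Passing a different `force` function gives the orbit shape for other central forces, for which no closed form may exist.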

11.1 Example: Central force leading to a circular orbit $r = 2R\cos\theta$

Binet's differential orbit equation can be used to derive the central force that leads to the assumed circular trajectory of $r = 2R\cos\theta$, where $R$ is the radius of the circular orbit. Note that this circular orbit passes through the origin of the central force when $r = 2R\cos\theta = 0$.

Figure: Circular trajectory passing through the origin of the central force.

Inserting this trajectory into Binet's differential orbit equation 11.39 gives
$$\frac{1}{2R}\frac{d^2(\cos\theta)^{-1}}{d\theta^2} + \frac{1}{2R}(\cos\theta)^{-1} = -\frac{\mu}{l^2}4R^2(\cos\theta)^2 F\left(\frac{1}{u}\right) \tag{$\alpha$}$$
Note that the differential is given by
$$\frac{d^2(\cos\theta)^{-1}}{d\theta^2} = \frac{d}{d\theta}\left(\frac{\sin\theta}{\cos^2\theta}\right) = \frac{2\sin^2\theta}{\cos^3\theta} + \frac{1}{\cos\theta}$$
Inserting this differential into equation $\alpha$ gives
$$\frac{2\sin^2\theta}{\cos^3\theta} + \frac{1}{\cos\theta} + \frac{1}{\cos\theta} = \frac{2}{\cos^3\theta} = -\frac{\mu}{l^2}8R^3(\cos\theta)^2 F\left(\frac{1}{u}\right)$$
Thus the radial dependence of the required central force is
$$F = -\frac{l^2}{\mu}\frac{2}{8R^3\cos^5\theta} = -\frac{8R^2 l^2}{\mu}\frac{1}{r^5} = -\frac{k}{r^5}$$
This corresponds to an attractive central force that depends on the inverse fifth power of the radius $r$. Note that this example is unrealistic since the assumed orbit implies that the potential and kinetic energies are infinite when $r \to 0$ at $\theta \to \frac{\pi}{2}$.
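The result of this example can be checked numerically. The sketch below, with hypothetical function names and assumed parameter values, evaluates both sides of Binet's equation for $u(\theta) = 1/(2R\cos\theta)$ and the force $F(r) = -\frac{8R^2l^2}{\mu}\frac{1}{r^5}$, using a central finite difference for the second derivative.

```python
import math

def binet_residual(theta, R, l, mu, h=1e-4):
    """Residual of d2u/dtheta2 + u + (mu/l^2) r^2 F(r) for the orbit r = 2 R cos(theta)."""
    u = lambda t: 1.0 / (2.0 * R * math.cos(t))
    # central finite difference for the second derivative of u
    d2u = (u(theta + h) - 2.0 * u(theta) + u(theta - h)) / h**2
    r = 1.0 / u(theta)
    force = -(8.0 * R**2 * l**2 / mu) / r**5    # inverse fifth-power attraction
    rhs = -(mu / l**2) * r**2 * force           # right-hand side of Binet's equation
    return d2u + u(theta) - rhs

for theta in (0.0, 0.5, 1.0):
    print(theta, binet_residual(theta, R=1.0, l=1.0, mu=1.0))
```

The residual vanishes to within the finite-difference error, independent of the assumed values of $l$ and $\mu$.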

11.6 Hamiltonian
Since the center-of-mass Lagrangian is not an explicit function of time, then
$$\frac{dH_{cm}}{dt} = -\frac{\partial L_{cm}}{\partial t} = 0 \tag{11.40}$$
Thus the center-of-mass Hamiltonian $H_{cm}$ is a constant of motion. However, since the transformation to center of mass can be time dependent, then $H_{cm} \neq E$, that is, $H_{cm}$ is not the total energy because the kinetic energy of the center-of-mass motion has been omitted from $H_{cm}$. Also, since no transformation is involved, then
$$H_{cm} = T_{cm} + U = E_{cm} \tag{11.41}$$
That is, the center-of-mass Hamiltonian $H_{cm}$ equals the center-of-mass total energy. The center-of-mass Hamiltonian then can be written using the effective potential (11.33) in the form
$$H_{cm} = \frac{p_r^2}{2\mu} + \frac{p_\psi^2}{2\mu r^2} + U(r) = \frac{p_r^2}{2\mu} + \frac{l^2}{2\mu r^2} + U(r) = \frac{p_r^2}{2\mu} + U_{eff}(r) = E_{cm} \tag{11.42}$$
It is convenient to express the center-of-mass Hamiltonian $H_{cm}$ in terms of the energy equation for the orbit in a central field using the transformed variable $u = \frac{1}{r}$. Substituting equations 11.33 and 11.37 into the Hamiltonian equation 11.42 gives the energy equation of the orbit
$$\frac{l^2}{2\mu}\left[\left(\frac{du}{d\psi}\right)^2 + u^2\right] + U\left(u^{-1}\right) = E_{cm} \tag{11.43}$$
Energy conservation allows the Hamiltonian to be used to solve problems directly. That is, since
$$E_{cm} = \frac{\mu\dot r^2}{2} + \frac{l^2}{2\mu r^2} + U(r) \tag{11.44}$$
then
$$\dot r = \frac{dr}{dt} = \pm\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)} \tag{11.45}$$
The time dependence can be obtained by integration
$$t = \int\frac{\pm dr}{\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} + \text{constant} \tag{11.46}$$
An inversion of this gives the solution in the standard form $r = r(t)$. However, it is more interesting to find the relation between $r$ and $\psi$. From relation 11.46 for $t$ then
$$dt = \frac{\pm dr}{\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \tag{11.47}$$
while equation 11.29 gives
$$d\psi = \frac{l\,dt}{\mu r^2} = \frac{\pm l\,dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \tag{11.48}$$
Therefore
$$\psi = \int\frac{\pm l\,dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} + \text{constant} \tag{11.49}$$
which can be used to calculate the angular coordinate. This gives the relation between the radial and angular coordinates which specifies the trajectory.
Although equations (11.45) and (11.49) formally give the solution, the actual solution can be derived analytically only for certain specific forms of the force law, and these solutions differ for attractive versus repulsive interactions.
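When no analytic solution exists, the angular integral $\psi = \int \frac{\pm l\,dr}{r^2\sqrt{2\mu(E_{cm} - U - l^2/2\mu r^2)}}$ can be evaluated numerically between the two radial turning points. The sketch below is one possible approach, not a prescribed method: the substitution $r = c - d\cos\phi$, with $c$ and $d$ the mean and half-range of the turning radii, removes the integrable square-root singularities at the turning points, and a midpoint rule avoids evaluating the integrand exactly there. The function name and all numerical values are assumptions.

```python
import math

def apsidal_angle(potential, energy, l, mu, r_min, r_max, n=2000):
    """Angle swept while r goes r_min -> r_max -> r_min (twice the turning-point integral)."""
    c = 0.5 * (r_max + r_min)
    d = 0.5 * (r_max - r_min)
    total = 0.0
    for i in range(n):
        phi = math.pi * (i + 0.5) / n             # midpoints avoid the turning points
        r = c - d * math.cos(phi)                  # r runs from r_min to r_max
        radicand = 2.0 * mu * (energy - potential(r) - l**2 / (2.0 * mu * r**2))
        total += l * d * math.sin(phi) / (r**2 * math.sqrt(radicand))
    return 2.0 * total * math.pi / n

# Inverse-square attraction U = k/r; turning points are the roots of E r^2 - k r - l^2/(2 mu) = 0
k, l, mu, energy = -1.0, 1.0, 1.0, -0.3
disc = math.sqrt(k**2 + 2.0 * energy * l**2 / mu)
r_min, r_max = (k + disc) / (2.0 * energy), (k - disc) / (2.0 * energy)
print(apsidal_angle(lambda r: k / r, energy, l, mu, r_min, r_max) / math.pi)
```

For the inverse-square potential the swept angle is $2\pi$, so the orbit closes after one radial oscillation; other potentials generally give an angle that is not a rational fraction of $2\pi$.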

11.7 General features of the orbit solutions

It is useful to look at the general features of the solutions of the equations of motion given by the equivalent one-body representation of the two-body motion. These orbits depend on the net center-of-mass energy $E_{cm}$. There are five possible situations depending on the center-of-mass total energy $E_{cm}$.
1) $E_{cm} > 0$: The trajectory is hyperbolic and has a minimum distance, but no maximum. The distance of closest approach is given when $\dot r = 0$. At the turning point $E_{cm} = U + \frac{l^2}{2\mu r^2}$.
2) $E_{cm} = 0$: It can be shown that the orbit for this case is parabolic.
3) $0 > E_{cm} > U_{min}$: For this case the equivalent orbit has both a maximum and minimum radial distance at which $\dot r = 0$. At the turning points the radial kinetic energy term is zero so $E_{cm} = U + \frac{l^2}{2\mu r^2}$. For the attractive inverse square law force the path is an ellipse with the focus at the center of attraction (Figure 11.5), which is Kepler's First Law. During the time that the radius ranges from $r_{min}$ to $r_{max}$ and back, the radius vector turns through an angle $\Delta\psi$ which is given by
$$\Delta\psi = 2\int_{r_{min}}^{r_{max}}\frac{\pm l\,dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \tag{11.50}$$
The general path prescribes a rosette shape which is a closed curve only if $\Delta\psi$ is a rational fraction of $2\pi$.
4) $E_{cm} = U_{min}$: In this case $r$ is a constant, implying that the path is circular since
$$\dot r = \frac{dr}{dt} = \pm\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)} = 0 \tag{11.51}$$
5) $E_{cm} < U_{min}$: For this case the square root is imaginary and there is no real solution.
In general the orbit is not closed, and such open orbits do not repeat. Bertrand's Theorem states that the inverse-square central force, and the linear harmonic oscillator, are the only radial dependences of the central force that lead to stable closed orbits.

11.2 Example: Orbit equation of motion for a free body

It is illustrative to use the differential orbit equation 11.39 to show that a body in free motion travels in a straight line. Assume that a line through the origin $O$ intersects perpendicular to the instantaneous trajectory at the point $Q$ which has polar coordinates $(r_0, \delta)$ relative to the origin. The point $P$ with polar coordinates $(r, \psi)$ lies on a straight line through $Q$ that is perpendicular to $OQ$ if, and only if, $r\cos(\psi - \delta) = r_0$. Since the force is zero, the differential orbit equation simplifies to
$$\frac{d^2u(\psi)}{d\psi^2} + u(\psi) = 0$$
A solution of this is
$$u(\psi) = \frac{1}{r_0}\cos(\psi - \delta)$$
where $r_0$ and $\delta$ are arbitrary constants. This can be rewritten as
$$r(\psi) = \frac{r_0}{\cos(\psi - \delta)}$$

Figure: Trajectory of a free body.

This is the equation of a straight line in polar coordinates as illustrated in the adjacent figure. This shows that a free body moves in a straight line if no forces are acting on the body.
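The straight-line character of the free-body solution $r(\psi) = r_0/\cos(\psi - \delta)$ can be verified by converting points on the curve to cartesian coordinates. In the sketch below (function names and values are illustrative assumptions), every point should satisfy the equation of the line through $Q$ perpendicular to $OQ$, namely $x\cos\delta + y\sin\delta = r_0$.

```python
import math

def free_body_point(psi, r0, delta):
    """Cartesian point on the polar curve r = r0 / cos(psi - delta)."""
    r = r0 / math.cos(psi - delta)
    return r * math.cos(psi), r * math.sin(psi)

def offset_from_line(x, y, r0, delta):
    """Signed offset from the line x cos(delta) + y sin(delta) = r0; zero on the line."""
    return x * math.cos(delta) + y * math.sin(delta) - r0

r0, delta = 2.0, 0.6                       # assumed closest distance and its direction
for psi in (delta - 1.0, delta - 0.3, delta + 0.4, delta + 1.2):
    x, y = free_body_point(psi, r0, delta)
    print(f"psi = {psi:+.2f}: offset = {offset_from_line(x, y, r0, delta):.2e}")
```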

11.8 Inverse-square, two-body, central force

The most important conservative, two-body, central interaction is the attractive inverse-square law force, which is encountered in both gravitational attraction and the Coulomb force. This force $\mathbf{F}(\mathbf{r})$ can be written in the form
$$\mathbf{F}(\mathbf{r}) = \frac{k}{r^2}\hat{\mathbf{r}} \tag{11.52}$$
The force constant $k$ is defined to be negative for an attractive force and positive for a repulsive force. In S.I. units the force constant $k = -Gm_1m_2$ for the gravitational force and $k = +\frac{q_1q_2}{4\pi\epsilon_0}$ for the Coulomb force. Note that this sign convention is the opposite of what is used in many books which use a negative sign in equation 11.52 and assume $k$ to be positive for an attractive force and negative for a repulsive force.
The conservative, inverse-square, two-body, central force is unique in that the underlying symmetries lead to four conservation laws, all of which are of pivotal importance in nature.
1. Conservation of angular momentum: Like all conservative central forces, the inverse-square central two-body force conserves angular momentum as proven in chapter 11.3.
2. Conservation of energy: This conservative central force can be represented in terms of a scalar potential energy $U(r)$ as given by equation 11.2 where for this central force
$$U(r) = \frac{k}{r} \tag{11.53}$$
Moreover, equation 11.42 showed that the center-of-mass Hamiltonian is conserved, that is, $H_{cm} = E_{cm}$.
3. Gauss' Law: For a conservative, inverse-square, two-body, central force, the flux of the force field out of any closed surface is proportional to the algebraic sum of the sources and sinks of this field that are located inside the closed surface. The net flux is independent of the distribution of the sources and sinks inside the closed surface, as well as the size and shape of the closed surface. Chapter 2.14.5 proved this for the gravitational force field.
4. Closed orbits: Two bodies interacting via the conservative, inverse-square, two-body, central force follow closed (degenerate) orbits as stated by Bertrand's Theorem. The first consequence of this symmetry is that Kepler's laws of planetary motion have stable, single-valued orbits. The second consequence of this symmetry is the conservation of the eccentricity vector discussed in chapter 11.8.4.
Observables that depend on Gauss's Law, or on closed planetary orbits, are extremely sensitive to addition of even a minuscule incremental exponent $\xi$ to the radial dependence $r^{-(2\pm\xi)}$ of the force. The statement that the inverse-square, two-body, central force leads to closed orbits can be proven by inserting equation 11.52 into the orbit differential equation,
$$\frac{d^2u}{d\psi^2} + u = -\frac{\mu}{l^2}\frac{1}{u^2}ku^2 = -\frac{\mu k}{l^2} \tag{11.54}$$
Using the transformation
$$y \equiv u + \frac{\mu k}{l^2} \tag{11.55}$$
the orbit equation becomes
$$\frac{d^2y}{d\psi^2} + y = 0 \tag{11.56}$$
A solution of this equation is
$$y = B\cos(\psi - \psi_0) \tag{11.57}$$
Therefore
$$u = \frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \epsilon\cos(\psi - \psi_0)\right] \tag{11.58}$$
This is the equation of a conic section. For an attractive, inverse-square, central force, equation 11.58 is the equation for an ellipse with the origin of $r$ at one of the foci of the ellipse that has eccentricity $\epsilon$ defined as
$$\epsilon \equiv -\frac{Bl^2}{\mu k} \tag{11.59}$$

Equation 1158 is the polar equation of a conic section. Equation 1158 also can be derived with the
origin at a focus by inserting the inverse square law potential into equation 1149 which gives
Z
±
= q + constant (11.60)
2 2 2
 2 + 2  − 

The solution of this gives s

" #
1  2 2
= =− 2 1+ 1+ cos ( −  0 ) (11.61)
  2

Equations 1158 and 1161 are identical if the eccentricity  equals

s
2 2
= 1+ (11.62)
2

The value of  0 merely determines the orientation of the major axis of the equivalent orbit. Without loss of
generality, it is possible to assume that the angle  is measured with respect to the major axis of the orbit,
that is  0 = 0. Then the equation can be written as
" s #
1   2 2
 = = − 2 [1 +  cos ()] = − 2 1 + 1 + cos () (11.63)
   2

This is the equation of a conic section where  is the eccentricity of the conic section. The conic section is a
hyperbola if   1, parabola if  = 1 ellipse if   1 and a circle if  = 0 All the equivalent one-body orbits
for an attractive force have the origin of the force at a focus of the conic section. The orbits depend on
whether the force is attractive or repulsive, on the conserved angular momentum  and on the center-of-mass
energy  .
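The dependence of the conic section on energy and angular momentum can be sketched in a few lines. The functions below, whose names and tolerance are illustrative assumptions, evaluate the eccentricity $\epsilon = \sqrt{1 + 2E_{cm}l^2/(\mu k^2)}$, classify the conic, and evaluate the orbit radius $r(\psi) = -\frac{l^2}{\mu k[1 + \epsilon\cos\psi]}$ with $k < 0$ for an attractive force.

```python
import math

def eccentricity(e_cm, l, mu, k):
    """Eccentricity from the center-of-mass energy and angular momentum."""
    radicand = 1.0 + 2.0 * e_cm * l**2 / (mu * k**2)
    if radicand < 0.0:
        raise ValueError("E_cm lies below the minimum of the effective potential")
    return math.sqrt(radicand)

def conic_type(eps, tol=1e-12):
    """Name of the conic section for a given eccentricity."""
    if eps < tol:
        return "circle"
    if eps < 1.0 - tol:
        return "ellipse"
    if eps <= 1.0 + tol:
        return "parabola"
    return "hyperbola"

def orbit_radius(psi, l, mu, k, eps):
    """Radius at angle psi measured from the major axis (psi_0 = 0)."""
    return -l**2 / (mu * k * (1.0 + eps * math.cos(psi)))

for e_cm in (-0.5, -0.3, 0.0, 0.5):           # assumed energies, with k = -1, l = mu = 1
    eps = eccentricity(e_cm, 1.0, 1.0, -1.0)
    print(e_cm, round(eps, 4), conic_type(eps))
```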

11.8.1 Bound orbits

Closed bound orbits occur only if the following requirements are satisfied.

1. The force must be attractive ($k < 0$); then equation 11.63 ensures that $r$ is positive.

2. For a closed elliptical orbit, the eccentricity $\epsilon < 1$ of the equivalent one-body representation of the orbit implies that the total center-of-mass energy $E_{cm} < 0$, that is, the closed orbit is bound.

Bound elliptical orbits have the center-of-force at one interior focus $F_1$ of the elliptical one-body representation of the orbit as shown in figure 11.5.

Figure 11.5: Bound elliptical orbit.

The minimum value of the orbit $r = r_{min}$ occurs when $\psi = 0$ where
$$r_{min} = -\frac{l^2}{\mu k[1 + \epsilon]} \tag{11.64}$$
This minimum distance is called the periapsis¹.

¹ The Greek term apsis refers to the points of greatest or least distance of approach for an orbiting body from one of the foci of the elliptical orbit. The terms periapsis and pericenter both are used to designate the closest distance of approach, while apoapsis and apocenter are used to designate the farthest distance of approach. Attaching the terms "peri-" and "apo-" to the general term "-apsis" is preferred over having different names for each object in the solar system. For example, frequently used terms are "-helion" for orbits around the sun, "-gee" for orbits around the earth, and "-cynthion" for orbits around the moon.

The maximum distance, $r = r_{max}$, which is called the apoapsis, occurs when $\psi = 180^\circ$
$$r_{max} = -\frac{l^2}{\mu k[1 - \epsilon]} \tag{11.65}$$
Remember that since $k < 0$ for bound orbits, the negative signs in equations 11.64 and 11.65 lead to $r > 0$. The most bound orbit is a circle having $\epsilon = 0$, which implies that $E_{cm} = -\frac{\mu k^2}{2l^2}$.
The shape of the elliptical orbit also can be described with respect to the center of the elliptical equivalent orbit by deriving the lengths of the semi-major axis $a$ and the semi-minor axis $b$ shown in figure 11.5
$$a = \frac{1}{2}(r_{min} + r_{max}) = -\frac{1}{2}\left(\frac{l^2}{\mu k[1 + \epsilon]} + \frac{l^2}{\mu k[1 - \epsilon]}\right) = -\frac{l^2}{\mu k[1 - \epsilon^2]} \tag{11.66}$$
$$b = a\sqrt{1 - \epsilon^2} = -\frac{l^2}{\mu k\sqrt{1 - \epsilon^2}} \tag{11.67}$$
Remember that the predicted bound elliptical orbit corresponds to the equivalent one-body representation for the two-body motion as illustrated in figure 11.2. This can be transformed to the individual spatial trajectories of each of the two bodies in an inertial frame.
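The geometry of a bound orbit follows directly from the conserved quantities. A minimal sketch, assuming the sign convention $k < 0$ for attraction and using a hypothetical function name and illustrative values, computes the periapsis $r_{min} = -\frac{l^2}{\mu k(1+\epsilon)}$, the apoapsis $r_{max} = -\frac{l^2}{\mu k(1-\epsilon)}$, and the semi-axes $a = \frac{1}{2}(r_{min} + r_{max})$ and $b = a\sqrt{1-\epsilon^2}$.

```python
import math

def bound_orbit_elements(e_cm, l, mu, k):
    """Return (eps, r_min, r_max, a, b) for a bound inverse-square orbit (k < 0, E_cm < 0)."""
    if k >= 0.0 or e_cm >= 0.0:
        raise ValueError("a bound orbit needs an attractive force (k < 0) and E_cm < 0")
    eps = math.sqrt(1.0 + 2.0 * e_cm * l**2 / (mu * k**2))
    p = -l**2 / (mu * k)                 # positive length scale l^2 / (mu |k|)
    r_min = p / (1.0 + eps)              # periapsis
    r_max = p / (1.0 - eps)              # apoapsis
    a = 0.5 * (r_min + r_max)            # semi-major axis
    b = a * math.sqrt(1.0 - eps**2)      # semi-minor axis
    return eps, r_min, r_max, a, b

eps, r_min, r_max, a, b = bound_orbit_elements(e_cm=-0.3, l=1.0, mu=1.0, k=-1.0)
print(f"eps = {eps:.4f}, r_min = {r_min:.4f}, r_max = {r_max:.4f}, a = {a:.4f}, b = {b:.4f}")
```

With these conventions the semi-major axis depends only on the energy, $a = k/(2E_{cm})$, while the semi-minor axis also depends on the angular momentum, $b = l/\sqrt{-2\mu E_{cm}}$.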

11.8.2 Kepler’s laws for bound planetary motion

Kepler's three laws of motion apply to the motion of two bodies in a bound orbit due to the attractive gravitational force for which $k = -Gm_1m_2$.
1) Each planet moves in an elliptical orbit with the sun at one focus.
2) The radius vector, drawn from the sun to a planet, describes equal areas in equal times.
3) The square of the period of revolution about the sun is proportional to the cube of the major axis of the orbit.
The motion of two bodies interacting via the gravitational force, which is a conservative, inverse-square, two-body central force, is best handled using the equivalent orbit representation. The first and second laws were proved in chapters 11.8 and 11.3. That is, the second law is equivalent to the statement that the angular momentum is conserved. The third law can be derived using the fact that the area of an ellipse is
$$A = \pi ab = \pi a^2\sqrt{1 - \epsilon^2} = \frac{\pi l}{\sqrt{-\mu k}}a^{\frac{3}{2}} \tag{11.68}$$
Equations 11.26 and 11.27 give that the rate of change of area swept out by the radius vector is
$$\frac{dA}{dt} = \frac{1}{2}r^2\dot\psi = \frac{l}{2\mu} \tag{11.69}$$
Therefore the period for one revolution $\tau$ is given by the time to sweep out one complete ellipse
$$\tau = \frac{A}{\left(\frac{dA}{dt}\right)} = 2\pi a^{\frac{3}{2}}\left(\frac{\mu}{-k}\right)^{\frac{1}{2}} \tag{11.70}$$
This leads to Kepler's 3rd law
$$\tau^2 = 4\pi^2\frac{\mu}{-k}a^3 \tag{11.71}$$
Bound orbits occur only for attractive forces, for which the force constant $k$ is negative and thus cancels the negative sign in equation 11.71. For example, for the gravitational force $k = -Gm_1m_2$.
Note that the reduced mass $\mu = \frac{m_1m_2}{m_1 + m_2}$ occurs in Kepler's 3rd law. That is, Kepler's third law can be written in terms of the actual masses of the bodies to be
$$\tau^2 = \frac{4\pi^2}{G(m_1 + m_2)}a^3 \tag{11.72}$$
In relating the relative periods of the different planets Kepler made the approximation that the mass of the planet $m_1$ is negligible relative to the mass of the sun $m_2$.
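Kepler's third law in the form $\tau^2 = \frac{4\pi^2}{G(m_1+m_2)}a^3$ is easy to evaluate. The sketch below applies it to the Earth-Sun system; the physical constants are approximate values supplied here for illustration and are not taken from the text.

```python
import math

G = 6.67430e-11            # gravitational constant in m^3 kg^-1 s^-2 (approximate)
M_SUN = 1.98847e30         # solar mass in kg (approximate)
M_EARTH = 5.9722e24        # Earth mass in kg (approximate)
A_EARTH = 1.495978707e11   # semi-major axis of the Earth's orbit in m (about 1 au)

def kepler_period(a, m1, m2):
    """Orbital period in seconds from Kepler's third law with both masses included."""
    return 2.0 * math.pi * math.sqrt(a**3 / (G * (m1 + m2)))

period_days = kepler_period(A_EARTH, M_EARTH, M_SUN) / 86400.0
print(f"Earth period: {period_days:.2f} days")
```

Because the period scales as $a^{3/2}$, an orbit with four times the semi-major axis around the same masses has eight times the period.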

The eccentricity of the major planets ranges from $\epsilon = 0.2056$ for Mercury, to $\epsilon = 0.0068$ for Venus. The Earth has an eccentricity of $\epsilon = 0.0167$ with $r_{min} = 91\times10^6$ miles and $r_{max} = 95\times10^6$ miles. On the other hand, $\epsilon = 0.967$ for Halley's comet, that is, the radius vector ranges from about 0.6 to about 35 times the radius of the orbit of the Earth, with a semi-major axis of about 18 times that radius.
The orbit energy can be derived by substituting the eccentricity, given by equation 11.62, into the semi-major axis length $a$ given by equation 11.66, which leads to the center-of-mass energy of
$$E_{cm} = \frac{k}{2a} \tag{11.73}$$
which is negative for a bound orbit since $k < 0$. However, the Hamiltonian, given by equation 11.42, implies that $E_{cm}$ is
$$E_{cm} = \frac{1}{2}\mu v^2 + \frac{k}{r} = \frac{k}{2a} \tag{11.74}$$
For the simple case of a circular orbit, $r = a$, the velocity $v$ equals
$$v = \sqrt{\frac{-k}{\mu a}} \tag{11.75}$$
For a circular orbit, the drag on a satellite lowers the total energy, resulting in a decrease in the radius of the orbit and a concomitant increase in velocity. That is, when the orbit radius is decreased, part of the potential energy that is released accounts for the work done against the drag, and the remaining part goes towards an increase of the kinetic energy. Also note that, as predicted by the Virial Theorem, the time-averaged kinetic energy is half the magnitude of the time-averaged potential energy for the inverse square law force, that is, $\langle T\rangle = -\frac{1}{2}\langle U\rangle$.

11.8.3 Unbound orbits

Attractive inverse-square central forces lead to hyperbolic orbits for $\epsilon > 1$ for which $E_{cm} > 0$, that is, the orbit is unbound. In addition, the orbits always are unbound for a repulsive force since $U = \frac{k}{r}$ is positive as is the kinetic energy $T_{cm}$, thus $E_{cm} = T_{cm} + U > 0$. The radial orbit equation for either an attractive or a repulsive force is
$$r = -\frac{l^2}{\mu k[1 + \epsilon\cos\psi]} \tag{11.76}$$
For a repulsive force $k$ is positive and $\frac{l^2}{\mu}$ always is positive. Therefore to ensure that $r$ remains positive the bracket term must be negative. That is
$$[1 + \epsilon\cos\psi] < 0 \qquad k > 0 \tag{11.77}$$
For an attractive force $k$ is negative and since $\frac{l^2}{\mu}$ is positive then the bracket term must be positive to ensure that $r$ is positive. That is,
$$[1 + \epsilon\cos\psi] > 0 \qquad k < 0 \tag{11.78}$$
Figure 11.6 shows both branches of the hyperbola for a given angle $\psi$ for the equivalent two-body orbits where the center of force is at the origin. For an attractive force, $k < 0$, the center of force is at the interior focus of the hyperbola, whereas for a repulsive force the center of force is at the exterior focus. For a given value of $|\psi|$ the asymptotes of the orbits both are displaced by the same impact parameter $b$ from parallel lines passing through the center of force. The scattering angle, between the outgoing direction of the scattered body and the incident direction, is designated to be $\theta$, which is related to the angle $\psi$ by $\theta = 180^\circ - 2\psi$.

Figure 11.6: Hyperbolic two-body orbits for a repulsive (left) and attractive (right) inverse-square, central two-body forces. Both orbits have the angular momentum vector pointing upwards out of the plane of the orbit.
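The relation $\theta = 180^\circ - 2\psi$ can be turned into a small scattering calculation. The sketch below rests on several assumptions that go beyond the passage above: $\psi$ is taken to be the angle between the symmetry axis of the hyperbola and an asymptote, so that $\cos\psi = 1/\epsilon$ (the orbit radius diverges where $1 + \epsilon\cos\psi$ vanishes), and the angular momentum is written in terms of the impact parameter as $l = b\sqrt{2\mu E_{cm}}$. The function name and numbers are illustrative.

```python
import math

def scattering_angle(e_cm, b, k, mu):
    """Scattering angle theta = pi - 2 psi for an unbound inverse-square orbit (E_cm > 0)."""
    l = b * math.sqrt(2.0 * mu * e_cm)                    # angular momentum from impact parameter
    eps = math.sqrt(1.0 + 2.0 * e_cm * l**2 / (mu * k**2))
    psi = math.acos(1.0 / eps)                            # half-angle between the asymptotes
    return math.pi - 2.0 * psi

theta = scattering_angle(e_cm=1.0, b=1.0, k=1.0, mu=1.0)
print(math.degrees(theta))
```

Under these assumptions the result agrees with the familiar closed form $\tan(\theta/2) = |k|/(2E_{cm}b)$, and the magnitude of the deflection is the same for attractive and repulsive forces of equal $|k|$.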

11.8.4 Eccentricity vector

Two bodies interacting via a conservative two-body central force have two invariant first-order integrals, namely the conservation of energy and the conservation of angular momentum. For the special case of the inverse-square law, there is a third invariant of the motion, which Hamilton called the eccentricity vector², that unambiguously defines the orientation and direction of the major axis of the elliptical orbit. It will be shown that the angular momentum plus the eccentricity vector completely define the plane and orientation of the orbit for a conservative inverse-square law central force.
Newton's second law for a central force can be written in the form
$$\dot{\mathbf{p}} = f(r)\hat{\mathbf{r}} \tag{11.79}$$
Note that the angular momentum $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ is conserved for a central force, that is $\dot{\mathbf{L}} = 0$. Therefore the time derivative of the product $\mathbf{p}\times\mathbf{L}$ reduces to
$$\frac{d}{dt}(\mathbf{p}\times\mathbf{L}) = \dot{\mathbf{p}}\times\mathbf{L} = f(r)\hat{\mathbf{r}}\times(\mathbf{r}\times\mu\dot{\mathbf{r}}) = f(r)\frac{\mu}{r}\left[\mathbf{r}(\mathbf{r}\cdot\dot{\mathbf{r}}) - r^2\dot{\mathbf{r}}\right] \tag{11.80}$$
This can be simplified using the fact that
$$\mathbf{r}\cdot\dot{\mathbf{r}} = \frac{1}{2}\frac{d}{dt}(\mathbf{r}\cdot\mathbf{r}) = r\dot r \tag{11.81}$$
thus
$$f(r)\frac{\mu}{r}\left[\mathbf{r}(\mathbf{r}\cdot\dot{\mathbf{r}}) - r^2\dot{\mathbf{r}}\right] = -\mu f(r)r^2\left[\frac{\dot{\mathbf{r}}}{r} - \frac{\mathbf{r}\dot r}{r^2}\right] = -\mu f(r)r^2\frac{d}{dt}\left(\frac{\mathbf{r}}{r}\right) \tag{11.82}$$
This allows equation 11.80 to be reduced to
$$\frac{d}{dt}(\mathbf{p}\times\mathbf{L}) = -\mu f(r)r^2\frac{d}{dt}\left(\frac{\mathbf{r}}{r}\right) \tag{11.83}$$
Assume the special case of the inverse-square law, equation 11.52, then the central force equation 11.83 reduces to
$$\frac{d}{dt}(\mathbf{p}\times\mathbf{L}) = -\frac{d}{dt}(\mu k\hat{\mathbf{r}}) \tag{11.84}$$
or
$$\frac{d}{dt}\left[(\mathbf{p}\times\mathbf{L}) + (\mu k\hat{\mathbf{r}})\right] = 0 \tag{11.85}$$
Define the eccentricity vector $\mathbf{A}$ as
$$\mathbf{A} \equiv (\mathbf{p}\times\mathbf{L}) + (\mu k\hat{\mathbf{r}}) \tag{11.86}$$
then equation 11.85 corresponds to
$$\frac{d\mathbf{A}}{dt} = 0 \tag{11.87}$$
This is a statement that the eccentricity vector $\mathbf{A}$ is a constant of motion for an inverse-square, central force.
The definition of the eccentricity vector $\mathbf{A}$ and angular momentum vector $\mathbf{L}$ implies a zero scalar product,
$$\mathbf{A}\cdot\mathbf{L} = 0 \tag{11.88}$$
Thus the eccentricity vector $\mathbf{A}$ and angular momentum $\mathbf{L}$ are mutually perpendicular, that is, $\mathbf{A}$ is in the plane of the orbit while $\mathbf{L}$ is perpendicular to the plane of the orbit. The eccentricity vector $\mathbf{A}$ always points along the major axis of the ellipse from the focus to the periapsis as illustrated on the left side in figure 11.7.
² The symmetry underlying the eccentricity vector is less intuitive than the energy or angular momentum invariants, leading to it being discovered independently several times during the past three centuries. Jakob Hermann was the first to identify this invariant for the special case of the inverse-square central force. Bernoulli generalized his proof in 1710. Laplace derived the invariant at the end of the 18th century using analytical mechanics. Hamilton derived the connection between the invariant and the orbit eccentricity. Gibbs derived the invariant using vector analysis. Runge published Gibbs' derivation in his textbook, which was referenced by Lenz in a 1924 paper on the quantal model of the hydrogen atom. Goldstein named this invariant the "Laplace-Runge-Lenz vector", while others have named it the "Runge-Lenz vector" or the "Lenz vector". This book uses Hamilton's more intuitive name of "eccentricity vector".

Figure 11.7: The elliptical trajectory and eccentricity vector $\mathbf{A}$ for two bodies interacting via the inverse-square, central force for eccentricity $\epsilon = 0.75$. The left plot shows the elliptical spatial trajectory where the semi-major axis is assumed to be on the $x$-axis and the angular momentum $\mathbf{L} = l\hat{\mathbf{z}}$ is out of the page. The force centre is at one focus of the ellipse. The vector coupling relation $\mathbf{A} \equiv (\mathbf{p}\times\mathbf{L}) + (\mu k\hat{\mathbf{r}})$ is illustrated at four points on the spatial trajectory. The right plot is a hodograph of the linear momentum $\mathbf{p}$ for this trajectory. The periapsis is denoted by the number 1 and the apoapsis is marked as 3 on both plots. Note that the eccentricity vector $\mathbf{A}$ is a constant that points parallel to the major axis towards the periapsis.

As a consequence, the two orthogonal vectors $\mathbf{A}$ and $\mathbf{L}$ completely define the plane of the orbit, plus the orientation of the major axis of the Kepler orbit, in this plane. The three vectors $\mathbf{A}$, $\mathbf{p}\times\mathbf{L}$, and $(\mu k\hat{\mathbf{r}})$ obey the triangle rule as illustrated in the left side of figure 11.7.
Hamilton noted the direct connection between the eccentricity vector $\mathbf{A}$ and the eccentricity $\epsilon$ of the conic section orbit. This can be shown by considering the scalar product
$$\mathbf{A}\cdot\mathbf{r} = Ar\cos\psi = \mathbf{r}\cdot(\mathbf{p}\times\mathbf{L}) + \mu kr \tag{11.89}$$
Note that the triple scalar product can be permuted to give
$$\mathbf{r}\cdot(\mathbf{p}\times\mathbf{L}) = (\mathbf{r}\times\mathbf{p})\cdot\mathbf{L} = \mathbf{L}\cdot\mathbf{L} = l^2 \tag{11.90}$$
Inserting equation 11.90 into 11.89 gives
$$\frac{1}{r} = -\frac{\mu k}{l^2}\left(1 - \frac{A}{\mu k}\cos\psi\right) \tag{11.91}$$
Note that equations 11.63 and 11.91 are identical if $\psi_0 = 0$. This implies that the eccentricity $\epsilon$ and $A$ are related by
$$\epsilon = -\frac{A}{\mu k} \tag{11.92}$$
where $k$ is defined to be negative for an attractive force. The relation between the eccentricity and total center-of-mass energy can be used to rewrite equation 11.62 in the form
$$A^2 = \mu^2k^2 + 2\mu E_{cm}l^2 \tag{11.93}$$
The combination of the eccentricity vector $\mathbf{A}$ and the angular momentum vector $\mathbf{L}$ completely specifies the orbit for an inverse square-law central force. The trajectory is in the plane perpendicular to the angular momentum vector $\mathbf{L}$, while the eccentricity, plus the orientation of the orbit, both are defined by the eccentricity vector $\mathbf{A}$. The eccentricity vector and angular momentum vector each have three independent coordinates, that is, these two vector invariants provide six constraints, while the scalar invariant energy $E_{cm}$ adds one additional constraint. The exact location of the particle moving along the trajectory is not defined and thus there are only five independent coordinates governed by the above seven constraints. Thus the eccentricity vector, angular momentum, and center-of-mass energy are related by the two equations 11.88 and 11.93.
Noether's theorem states that each conservation law is a manifestation of an underlying symmetry. Identification of the underlying symmetry responsible for the conservation of the eccentricity vector $\mathbf{A}$ is elucidated using equation 11.86 to give
$$(\mu k\hat{\mathbf{r}}) = \mathbf{A} - (\mathbf{p}\times\mathbf{L}) \tag{11.94}$$
Take the scalar product
$$(\mu k\hat{\mathbf{r}})\cdot(\mu k\hat{\mathbf{r}}) = (\mu k)^2 = p^2l^2 + A^2 - 2\mathbf{A}\cdot(\mathbf{p}\times\mathbf{L}) \tag{11.95}$$
Choose the angular momentum to be along the $z$-axis, that is, $\mathbf{L} = l\hat{\mathbf{z}}$, and, since $\mathbf{p}$ and $\mathbf{A}$ are perpendicular to $\mathbf{L}$, then $\mathbf{p}$ and $\mathbf{A}$ are in the $\hat{\mathbf{x}}-\hat{\mathbf{y}}$ plane. Assume that the semi-major axis of the elliptical orbit is along the $x$-axis, then the locus of the momentum vector on a momentum hodograph has the equation
$$p_x^2 + \left(p_y - \frac{A}{l}\right)^2 = \left(\frac{\mu k}{l}\right)^2 \tag{11.96}$$
Equation 11.96 implies that the locus of the momentum vector is a circle of radius $\left|\frac{\mu k}{l}\right|$ with the center displaced from the origin at coordinates $\left(0, \frac{A}{l}\right)$ as shown by the momentum hodograph on the right side of figure 11.7. The angle $\beta$ and eccentricity $\epsilon$ are related by
$$\cos\beta = -\frac{A/l}{\mu k/l} = -\frac{A}{\mu k} = \epsilon \tag{11.97}$$
For a circular orbit, $\epsilon = -\frac{A}{\mu k} = 0$, the hodograph circle is centered at the origin, and thus the magnitude $|\mathbf{p}|$ is a constant around the whole trajectory.
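The circular momentum hodograph can be verified without integrating the motion, using only the orbit equation. In the sketch below (illustrative names and values), the radius is $r = -\frac{l^2}{\mu k(1+\epsilon\cos\psi)}$, the angular velocity is $\dot\psi = l/(\mu r^2)$, and the radial velocity follows from $\dot r = -\frac{l}{\mu}\frac{du}{d\psi} = -\frac{k\epsilon}{l}\sin\psi$. The major axis is taken along $x$ with $A = -\mu k\epsilon$.

```python
import math

def momentum_on_orbit(psi, l, mu, k, eps):
    """Cartesian momentum (px, py) at polar angle psi on an inverse-square orbit."""
    r = -l**2 / (mu * k * (1.0 + eps * math.cos(psi)))
    r_dot = -(k * eps / l) * math.sin(psi)        # from dr/dt = -(l/mu) du/dpsi
    psi_dot = l / (mu * r**2)                      # conservation of angular momentum
    px = mu * (r_dot * math.cos(psi) - r * psi_dot * math.sin(psi))
    py = mu * (r_dot * math.sin(psi) + r * psi_dot * math.cos(psi))
    return px, py

def hodograph_residual(psi, l, mu, k, eps):
    """px^2 + (py - A/l)^2 - (mu k / l)^2, which should vanish on the orbit."""
    px, py = momentum_on_orbit(psi, l, mu, k, eps)
    a_mag = -mu * k * eps
    return px**2 + (py - a_mag / l) ** 2 - (mu * k / l) ** 2

for psi in (0.0, 1.0, 2.0, 3.0, 4.5):
    print(psi, hodograph_residual(psi, l=1.0, mu=1.0, k=-1.0, eps=0.75))
```

The residual vanishes for every angle, confirming that the momentum vector traces a circle of radius $|\mu k/l|$ centered at $(0, A/l)$.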
The inverse-square, central, two-body force is unusual in that it leads to stable closed bound orbits because the radial and angular frequencies are degenerate, i.e. $\omega_r = \omega_\psi$. In momentum space, the locus of the linear momentum vector $\mathbf{p}$ is a perfect circle, which is the underlying symmetry responsible for both the fact that the orbits are closed, and the invariance of the eccentricity vector. Mathematically this symmetry for the Kepler problem corresponds to the body moving freely on the boundary of a four-dimensional sphere in space and momentum. The invariance of the eccentricity vector is a manifestation of the special property of the inverse-square, central force under certain rotations in this four-dimensional space; this $O(4)$ symmetry is an example of a hidden symmetry.

11.9 Isotropic, linear, two-body, central force

Closed orbits occur for the two-dimensional linear oscillator when $\frac{\omega_x}{\omega_y}$ is a rational fraction as discussed in chapter 3.3. Bertrand's Theorem states that the linear oscillator, and the inverse-square law (Kepler problem), are the only two-body central forces that have single-valued, stable, closed orbits of the coupled radial and angular motion. The invariance of the eccentricity vector was the underlying symmetry leading to single-valued, stable, closed orbits for the Kepler problem. It is interesting to explore the symmetry that leads to stable closed orbits for the harmonic oscillator. For simplicity, the following discussion is restricted to the isotropic, harmonic, two-body, central force where $\omega_x = \omega_y = \omega$, for which the two-body, central force is linear
$$\mathbf{F}(\mathbf{r}) = k\mathbf{r} \tag{11.98}$$
where $k > 0$ corresponds to a repulsive force and $k < 0$ to an attractive force. This isotropic harmonic force can be expressed in terms of a spherical potential $U(r)$ where
$$U(r) = -\frac{1}{2}kr^2 \tag{11.99}$$
Since this is a central two-body force, both the equivalent one-body representation, and the conservation of angular momentum, are equally applicable to the harmonic two-body force. As discussed in section 11.3, since the two-body force is central, the motion is confined to a plane, and thus the Lagrangian can be expressed in polar coordinates. In addition, since the force is spherically symmetric, then the angular momentum is conserved. The orbit solutions are conic sections as described in chapter 11.7. The shape of the orbit for the harmonic two-body central force can be derived using either polar or cartesian coordinates as illustrated below.

11.9.1 Polar coordinates

The origin of the equivalent orbit for the harmonic force will be found to be at the center of an ellipse, rather than at a focus of the ellipse as found for the inverse square law. The shape of the orbit can be defined using a Binet differential orbit equation that employs the transformation
$$u' \equiv \frac{1}{r^2} \tag{11.100}$$
Then
$$\frac{du'}{d\psi} = -\frac{2}{r^3}\frac{dr}{d\psi} \tag{11.101}$$
The chain rule gives that
$$\dot r = \frac{dr}{d\psi}\dot\psi = -\frac{r^3}{2}\frac{du'}{d\psi}\dot\psi = -\frac{lr}{2\mu}\frac{du'}{d\psi} \tag{11.102}$$
Substituting this into the Hamiltonian $H_{cm}$, equation 11.42, gives
$$\frac{1}{2}\mu\dot r^2 = \frac{l^2}{8\mu u'}\left(\frac{du'}{d\psi}\right)^2 = E_{cm} - \frac{l^2}{2\mu}u' + \frac{k}{2u'} \tag{11.103}$$
Rearranging this equation gives
$$\left(\frac{du'}{d\psi}\right)^2 + 4u'^2 - \frac{8\mu E_{cm}}{l^2}u' = \frac{4\mu k}{l^2} \tag{11.104}$$
Addition of a constant to both sides of the equation completes the square
$$\left[\frac{d}{d\psi}\left(u' - \frac{\mu E_{cm}}{l^2}\right)\right]^2 + 4\left(u' - \frac{\mu E_{cm}}{l^2}\right)^2 = +\frac{4\mu k}{l^2} + 4\left(\frac{\mu E_{cm}}{l^2}\right)^2 \tag{11.105}$$
The right-hand side of equation 11.105 is a constant. The solution of 11.105 must be a sine or cosine function of twice the polar angle $\psi$. That is
$$\left(u' - \frac{\mu E_{cm}}{l^2}\right) = \left[\left(\frac{\mu E_{cm}}{l^2}\right)^2 + \frac{\mu k}{l^2}\right]^{\frac{1}{2}}\cos 2(\psi - \psi_0) \tag{11.106}$$
That is,
$$u' = \frac{1}{r^2} = \frac{\mu E_{cm}}{l^2}\left(1 + \left(1 + \frac{kl^2}{\mu E_{cm}^2}\right)^{\frac{1}{2}}\cos 2(\psi - \psi_0)\right) \tag{11.107}$$
Equation 11.107 corresponds to a closed orbit centered at the origin of the elliptical orbit as illustrated in figure 11.8. The eccentricity $\epsilon$ of this closed orbit is given by
$$\left(1 + \frac{kl^2}{\mu E_{cm}^2}\right)^{\frac{1}{2}} = \frac{\epsilon^2}{2 - \epsilon^2} \tag{11.108}$$
Equations 11.66 and 11.67 give that the eccentricity is related to the semi-major $a$ and semi-minor $b$ axes by
$$\epsilon^2 = 1 - \left(\frac{b}{a}\right)^2 \tag{11.109}$$

Note that for a repulsive force, $k > 0$, the square-root factor in equation 11.107 exceeds unity, so $r$ becomes infinite at a finite angle, leading to unbound hyperbolic orbits centered on the origin. For an attractive force, $k < 0$, the square-root factor is less than unity and the orbits are bound ellipses centered on the origin.
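The polar orbit equation for the attractive harmonic force, $\frac{1}{r^2} = \frac{\mu E_{cm}}{l^2}\left(1 + s\cos 2(\psi - \psi_0)\right)$ with $s = \left(1 + \frac{kl^2}{\mu E_{cm}^2}\right)^{1/2}$, can be checked against the cartesian equation of an ellipse centered on the origin. The sketch below uses illustrative values and hypothetical function names, and takes $\psi_0 = 0$ so that the shortest radius lies along $x$.

```python
import math

def harmonic_orbit_radius(psi, e_cm, l, mu, k, psi0=0.0):
    """Radius from 1/r^2 = (mu E/l^2) (1 + s cos 2(psi - psi0)) for the force F = k r."""
    c = mu * e_cm / l**2
    s = math.sqrt(1.0 + k * l**2 / (mu * e_cm**2))
    return 1.0 / math.sqrt(c * (1.0 + s * math.cos(2.0 * (psi - psi0))))

def harmonic_semi_axes(e_cm, l, mu, k):
    """Semi-major and semi-minor axes (a, b) from the extreme values of 1/r^2."""
    c = mu * e_cm / l**2
    s = math.sqrt(1.0 + k * l**2 / (mu * e_cm**2))
    return 1.0 / math.sqrt(c * (1.0 - s)), 1.0 / math.sqrt(c * (1.0 + s))

e_cm, l, mu, k = 2.0, 1.0, 1.0, -1.0            # attractive force, assumed values
a, b = harmonic_semi_axes(e_cm, l, mu, k)
for psi in (0.0, 0.8, 1.6, 2.4):
    r = harmonic_orbit_radius(psi, e_cm, l, mu, k)
    x, y = r * math.cos(psi), r * math.sin(psi)
    print(psi, (x / b) ** 2 + (y / a) ** 2)      # equals 1 on a centered ellipse
```

The eccentricity obtained from $\epsilon^2 = 1 - (b/a)^2$ reproduces the relation $s = \epsilon^2/(2 - \epsilon^2)$.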
Figure 11.8: The elliptical equivalent trajectory for two bodies interacting via the linear, central force for eccentricity $\epsilon = 0.75$. The left plot shows the elliptical spatial trajectory, with axes $x$ and $y$, where the semi-major axis is assumed to be on the $x$-axis and the angular momentum $\mathbf{L} = l\hat{\mathbf{z}}$ is out of the page. The force center is at the center of the ellipse. The right plot, with axes $p_x$ and $p_y$, is a hodograph of the linear momentum $\mathbf{p}$ for this trajectory.

11.9.2 Cartesian coordinates

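The isotropic harmonic oscillator, expressed in terms of cartesian coordinates in the $(x, y)$ plane of the orbit, is separable because there is no direct coupling term between the $x$ and $y$ motion. That is, the center-of-mass Lagrangian in the $(x, y)$ plane separates into independent motion for $x$ and $y$.
$$L = \frac{1}{2}\mu\dot{\mathbf{r}}\cdot\dot{\mathbf{r}} + \frac{1}{2}k\mathbf{r}\cdot\mathbf{r} = \left[\frac{1}{2}\mu\dot x^2 + \frac{1}{2}kx^2\right] + \left[\frac{1}{2}\mu\dot y^2 + \frac{1}{2}ky^2\right] \tag{11.110}$$
Solutions for the independent coordinates, and their corresponding momenta, are
$$\mathbf{r} = \hat{\mathbf{i}}A\cos(\omega t + \alpha) + \hat{\mathbf{j}}B\cos(\omega t + \beta) \tag{11.111}$$
$$\mathbf{p} = -\hat{\mathbf{i}}\mu A\omega\sin(\omega t + \alpha) - \hat{\mathbf{j}}\mu B\omega\sin(\omega t + \beta) \tag{11.112}$$
where $\omega = \sqrt{\frac{-k}{\mu}}$ for the attractive force, $k < 0$, and $A$, $B$ are the amplitudes of the $x$ and $y$ motion. Therefore
$$r^2 = x^2 + y^2 = [A\cos(\omega t + \alpha)]^2 + [B\cos(\omega t + \beta)]^2 = \frac{A^2 + B^2}{2} + \frac{\sqrt{A^4 + B^4 + 2A^2B^2\cos 2(\alpha - \beta)}}{2}\cos(2\omega t + \gamma_0) \tag{11.113}$$
where
$$\cos\gamma_0 = \frac{A^2\cos 2\alpha + B^2\cos 2\beta}{\sqrt{A^4 + B^4 + 2A^2B^2\cos 2(\alpha - \beta)}} \tag{11.114}$$
For a phase difference $\alpha - \beta = \pm\frac{\pi}{2}$ this equation describes an ellipse centered at the origin, with its axes along $x$ and $y$, which agrees with equation 11.107 that was derived using polar coordinates.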
The two normal modes of the isotropic harmonic oscillator are degenerate, therefore $x$, $y$ are equally good normal modes with two corresponding total energies, $E_1$, $E_2$, while the corresponding angular momentum $l$ points in the $z$ direction. Since $U = -\frac{1}{2}kr^2$,
$$E_1 = \frac{p_x^2}{2\mu} - \frac{1}{2}kx^2 \tag{11.115}$$
$$E_2 = \frac{p_y^2}{2\mu} - \frac{1}{2}ky^2 \tag{11.116}$$
$$l = \mu(x\dot y - y\dot x) = xp_y - yp_x \tag{11.117}$$
266 CHAPTER 11. CONSERVATIVE TWO-BODY CENTRAL FORCES
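The constancy of $E_1$, $E_2$ and $l$ along the solution, and the closed form for $r^2$, can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and parameter values are illustrative assumptions, and the closed form is written with the double angles $2\alpha$, $2\beta$ that follow from $\cos^2 x = \frac{1}{2}(1 + \cos 2x)$.

```python
import math

def oscillator_state(t, A, B, alpha, beta, k, mu):
    """Return (x, y, px, py) for the solution in equations 11.111 and 11.112."""
    omega = math.sqrt(k / mu)
    x = A * math.cos(omega * t + alpha)
    y = B * math.cos(omega * t + beta)
    px = -mu * omega * A * math.sin(omega * t + alpha)
    py = -mu * omega * B * math.sin(omega * t + beta)
    return x, y, px, py

def invariants(t, A, B, alpha, beta, k, mu):
    """Return (E1, E2, l) evaluated at time t."""
    x, y, px, py = oscillator_state(t, A, B, alpha, beta, k, mu)
    e1 = px**2 / (2 * mu) + 0.5 * k * x**2
    e2 = py**2 / (2 * mu) + 0.5 * k * y**2
    l = x * py - y * px
    return e1, e2, l

def r_squared_closed_form(t, A, B, alpha, beta, k, mu):
    """Mean value plus a single cosine at frequency 2*omega."""
    omega = math.sqrt(k / mu)
    re = A**2 * math.cos(2 * alpha) + B**2 * math.cos(2 * beta)
    im = A**2 * math.sin(2 * alpha) + B**2 * math.sin(2 * beta)
    amplitude = math.hypot(re, im)  # sqrt(A^4 + B^4 + 2 A^2 B^2 cos 2(alpha - beta))
    gamma0 = math.atan2(im, re)
    return 0.5 * (A**2 + B**2) + 0.5 * amplitude * math.cos(2 * omega * t + gamma0)

# Hypothetical parameter values, chosen only for illustration.
params = dict(A=1.0, B=0.6, alpha=0.3, beta=0.3 - math.pi / 2, k=2.0, mu=1.5)
for t in (0.0, 0.4, 1.1):
    x, y, _, _ = oscillator_state(t, **params)
    print(t, invariants(t, **params), x * x + y * y, r_squared_closed_form(t, **params))
```

Each printed row should show the same three invariants, and the last two columns should agree to rounding error.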

Figure 118 shows the closed elliptical equivalent orbit plus the corresponding momentum hodograph for
the isotropic harmonic two-body central force. Figures 117 and 118 contrast the differences between the
elliptical orbits for the inverse-square force, and those for the harmonic two-body central force. Although
the orbits for bound systems with the harmonic two-body force, and the inverse-square force, both lead to
elliptical bound orbits, there are important differences. Both the radial motion and momentum are two
valued per cycle for the reflection-symmetric harmonic oscillator, whereas the radius and momentum have
only one maximum and one minimum per revolution for the inverse-square law. Although the inverse-square,
and the isotropic, harmonic, two-body central forces both lead to closed bound elliptical orbits for which the
angular momentum is conserved and the orbits are planar, there is another important difference between the
orbits for these two interactions. The orbit equation for the Kepler problem is expressed with respect to a
foci of the elliptical equivalent orbit, as illustrated in figure 117, whereas the orbit equation for the isotropic
harmonic oscillator orbit is expressed with respect to the center of the ellipse as illustrated in figure 118.

11.9.3 Symmetry tensor $\mathbf{A}'$

The invariant vectors $\mathbf{L}$ and $\mathbf{A}$ provide a complete specification of the geometry of the bound orbits for the inverse square-law Kepler system. It is interesting to search for a similar invariant that fully specifies the orbits for the isotropic harmonic central force. In contrast to the Kepler problem, the harmonic force center is at the center of the elliptical orbit, and the orbit is reflection symmetric with the radial and angular frequencies related by $\omega_r = 2\omega_\psi$. Since the orbit is reflection-symmetric, the orientation of the major axis of the orbit cannot be uniquely specified by a vector. Therefore, for the harmonic interaction it is necessary to specify the orientation of the principal axis by the symmetry tensor. The symmetry of the isotropic harmonic, two-body, central force leads to the symmetry tensor $\mathbf{A}'$, which is an invariant of the motion analogous to the eccentricity vector $\mathbf{A}$. Like a rotation matrix, the symmetry tensor defines the orientation, but not direction, of the major principal axis of the elliptical orbit. In the plane of the polar orbit the 3 × 3 symmetry tensor $\mathbf{A}'$ reduces to a 2 × 2 matrix having matrix elements defined to be
$$A'_{ij} = \frac{p_i p_j}{2\mu} + \frac{1}{2}k x_i x_j \tag{11.118}$$
The diagonal matrix elements $A'_{11} = E_1$ and $A'_{22} = E_2$ are constants of motion. The off-diagonal term is given by
$$A'^{2}_{12} \equiv \left(\frac{p_x p_y}{2\mu} + \frac{1}{2}kxy\right)^2 = \left(\frac{p_x^2}{2\mu} + \frac{1}{2}kx^2\right)\left(\frac{p_y^2}{2\mu} + \frac{1}{2}ky^2\right) - \frac{k}{4\mu}(x p_y - y p_x)^2 = E_1E_2 - \frac{k l^2}{4\mu} \tag{11.119}$$
The terms on the right-hand side of equation 11.119 all are constants of motion, therefore $A'^{2}_{12}$ also is a constant of motion. Thus the 3 × 3 symmetry tensor $\mathbf{A}'$ can be reduced to a 2 × 2 symmetry tensor for which all the matrix elements are constants of motion, and the trace of the symmetry tensor is equal to the total energy.
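As a numerical illustration of these statements, the sketch below builds the 2 × 2 tensor along an oscillator trajectory and compares $A'^{2}_{12}$ with $E_1E_2 - kl^2/4\mu$. It is only a sketch: the helper names and the parameter values are assumptions introduced here, and the trajectory is the cosine solution with $\omega = \sqrt{k/\mu}$.

```python
import math

def oscillator_state(t, A, B, alpha, beta, k, mu):
    """Return (x, y, px, py) for x = A cos(wt + alpha), y = B cos(wt + beta)."""
    omega = math.sqrt(k / mu)
    x = A * math.cos(omega * t + alpha)
    y = B * math.cos(omega * t + beta)
    px = -mu * omega * A * math.sin(omega * t + alpha)
    py = -mu * omega * B * math.sin(omega * t + beta)
    return x, y, px, py

def symmetry_tensor(x, y, px, py, k, mu):
    """2x2 tensor with elements p_i p_j / (2 mu) + k x_i x_j / 2."""
    q = (x, y)
    p = (px, py)
    return [[p[i] * p[j] / (2 * mu) + 0.5 * k * q[i] * q[j] for j in range(2)]
            for i in range(2)]

def tensor_summary(t, A, B, alpha, beta, k, mu):
    """Return (A12 squared, E1*E2 - k l^2 / (4 mu), trace, A12) at time t."""
    x, y, px, py = oscillator_state(t, A, B, alpha, beta, k, mu)
    a = symmetry_tensor(x, y, px, py, k, mu)
    l = x * py - y * px
    return (a[0][1] ** 2,
            a[0][0] * a[1][1] - k * l**2 / (4 * mu),
            a[0][0] + a[1][1],
            a[0][1])

# Hypothetical values for illustration only.
params = dict(A=1.2, B=0.5, alpha=0.1, beta=0.9, k=3.0, mu=2.0)
for t in (0.0, 0.7, 2.0):
    print(t, tensor_summary(t, **params))
```

The first two entries of each row should agree, and every entry should be independent of $t$.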
In summary, the inverse-square and harmonic oscillator two-body central interactions both lead to closed, elliptical equivalent orbits, the plane of which is perpendicular to the conserved angular momentum vector. However, for the inverse-square force, the origin of the equivalent orbit is at the focus of the ellipse and $\omega_r = \omega_\psi$, whereas the origin is at the center of the ellipse and $\omega_r = 2\omega_\psi$ for the harmonic force. As a consequence, the elliptical orbit is reflection symmetric for the harmonic force but not for the inverse square force. The eccentricity vector and symmetry tensor both specify the major axes of these elliptical orbits, the planes of which are perpendicular to the angular momentum vector. The eccentricity vector and the symmetry tensor both are directly related to the eccentricity of the orbit and the total energy of the two-body system. Noether’s theorem states that the invariance of the eccentricity vector and symmetry tensor, plus the corresponding closed orbits, are manifestations of underlying symmetries. The dynamical SU(3) symmetry underlies the invariance of the symmetry tensor, whereas the dynamical O(4) symmetry underlies the invariance of the eccentricity vector. These symmetries lead to stable closed elliptical bound orbits only for these two specific two-body central forces, and not for other two-body central forces.

11.10 Closed-orbit stability

Bertrand’s theorem states that the linear oscillator and the inverse-square law are the only two-body, central forces for which all bound orbits are single-valued, stable closed orbits. The stability of closed orbits can be illustrated by studying their response to perturbations. For simplicity, the following discussion of stability will focus on circular orbits, but the general principles are the same for elliptical orbits.
A circular orbit occurs whenever the attractive force just balances the effective "centrifugal force" in the rotating frame. This can occur for any radial functional form for the central force. The effective potential, equation 11.33, will have a stationary point when
$$\left(\frac{\partial U_{eff}}{\partial r}\right)_{r=r_0} = 0 \tag{11.120}$$
that is, when
$$\left(\frac{\partial U}{\partial r}\right)_{r=r_0} - \frac{l^2}{\mu r_0^3} = 0 \tag{11.121}$$
This is equivalent to the statement that the net force is zero. Since the central attractive force is given by
$$F(r) = -\frac{\partial U}{\partial r} \tag{11.122}$$
then the stationary point occurs when
$$F(r_0) = -\frac{l^2}{\mu r_0^3} = -\mu r_0\dot{\psi}^2 \tag{11.123}$$
This is the so-called centrifugal force in the rotating frame. The Hamiltonian, equation 11.44, gives that
$$\dot{r} = \pm\sqrt{\frac{2}{\mu}\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)} \tag{11.124}$$
For a circular orbit $\dot{r} = 0$, that is
$$U = E_{cm} - \frac{l^2}{2\mu r^2} \tag{11.125}$$
A stable circular orbit is possible if both equations (11.121) and (11.125) are satisfied. Such a circular orbit will be a stable orbit at the minimum when
$$\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r=r_0} > 0 \tag{11.126}$$
Examples of stable and unstable orbits are shown in figure 11.9.

Figure 11.9: Stable and unstable effective central potentials. The repulsive centrifugal and the attractive potentials (k<0) are shown dashed. The solid curve is the effective potential.

Stability of a circular orbit requires that
$$\left(\frac{\partial^2 U}{\partial r^2}\right)_{r=r_0} + \frac{3l^2}{\mu r_0^4} > 0 \tag{11.127}$$
which, using equations 11.122 and 11.123, can be written in terms of the central force for a stable orbit as
$$-\left(\frac{\partial F}{\partial r}\right)_{r_0} - \frac{3F(r_0)}{r_0} > 0 \tag{11.128}$$
If the attractive central force can be expressed as a power law
$$F(r) = -kr^n \tag{11.129}$$
then stability requires
$$k r_0^{n-1}(3 + n) > 0 \tag{11.130}$$
or
$$n > -3 \tag{11.131}$$
Stable equivalent orbits will undergo oscillations about the stable orbit if perturbed. To first order, the restoring force on a bound reduced mass $\mu$ is given by
$$F = -\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r=r_0}(r - r_0) = \mu\ddot{r} \tag{11.132}$$
To the extent that this linear restoring force dominates over higher-order terms, a perturbation of the stable orbit will undergo simple harmonic oscillations about the stable orbit with angular frequency
$$\omega = \sqrt{\frac{\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r=r_0}}{\mu}} \tag{11.133}$$
The above discussion shows that a small amplitude radial oscillation $\xi = r - r_0$ about the stable orbit, with amplitude $A$, will be of the form
$$\xi = A\sin(\omega t + \delta)$$
The orbit will be closed if the product of the oscillation frequency $\omega/2\pi$ and the orbit period $\tau$ is an integer value.
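For the power-law force $F(r) = -kr^n$ the closure condition can be explored numerically. The sketch below is an illustration under stated assumptions, not the author's method: the function name is hypothetical, the curvature of the effective potential is estimated by a central difference of the net radial force, and the ratio it returns is the radial angular frequency divided by the orbital angular frequency $l/\mu r_0^2$ of the circular orbit.

```python
import math

def frequency_ratio(n, k=1.0, mu=1.0, r0=1.0, h=1e-4):
    """omega_radial / omega_orbital for a circular orbit of radius r0 under F(r) = -k r**n.

    Returns None when the circular orbit is not a minimum of the effective potential.
    """
    # Circular-orbit condition F(r0) = -l^2 / (mu r0^3) fixes the angular momentum.
    l_squared = mu * k * r0 ** (n + 3)

    def net_radial_force(r):
        # Central force plus the centrifugal term, i.e. -dU_eff/dr.
        return -k * r ** n + l_squared / (mu * r ** 3)

    # d^2 U_eff / dr^2 at r0, from a central difference of the net force.
    curvature = -(net_radial_force(r0 + h) - net_radial_force(r0 - h)) / (2 * h)
    if curvature <= 0:
        return None
    omega_radial = math.sqrt(curvature / mu)
    omega_orbital = math.sqrt(l_squared) / (mu * r0 ** 2)
    return omega_radial / omega_orbital

for n in (1, 0, -1, -2, -4):
    print(n, frequency_ratio(n))
```

Analytically the ratio is $\sqrt{3 + n}$, so it is an integer for the linear force ($n = 1$) and the inverse-square force ($n = -2$), and the function returns None for $n < -3$ where no stable circular orbit exists.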
The fact that planetary orbits in the gravitational field are observed to be closed is strong evidence that the gravitational force field must obey the inverse square law. Actually there are small precessions of planetary orbits due to perturbations of the gravitational field by bodies other than the sun, and due to relativistic effects. Also the gravitational field near the earth departs slightly from the inverse square law because the earth is not a perfect sphere, and the field does not have perfect spherical symmetry. The study of the precession of satellites around the earth has been used to determine the oblate quadrupole and slight octupole (pear shape) distortion of the shape of the earth.
The most famous test of the inverse square law for gravitation is the precession of the perihelion of Mercury. If the attractive force experienced by Mercury is of the form
$$\mathbf{F}(r) = -\frac{GmM}{r^{2+\alpha}}\hat{\mathbf{r}}$$
where $m$ and $M$ are the masses of Mercury and the Sun and $|\alpha|$ is small, then it can be shown that, for approximately circular orbits, the perihelion will advance by a small angle $\pi\alpha$ per orbit period. That is, the precession is zero if $\alpha = 0$, corresponding to an inverse square law dependence which agrees with Bertrand’s theorem. The position of the perihelion of Mercury has been measured with great accuracy showing that, after correcting for all known perturbations, the perihelion advances by 43(±5) seconds of arc per century, that is $5\times10^{-7}$ radians per revolution. This corresponds to $\alpha = 1.6\times10^{-7}$ which is small but still significant. This precession remained a puzzle for many years until 1915 when Einstein predicted that one consequence of his general theory of relativity is that the planetary orbit of Mercury should precess at 43 seconds of arc per century, which is in remarkable agreement with observations.
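The conversion from seconds of arc per century to radians per revolution, and from there to the exponent shift, is simple arithmetic that can be sketched as follows. Two assumptions are made here that are not spelled out in the text: a sidereal period for Mercury of about 0.2408 years, and an advance per orbit equal to $\pi$ times the exponent shift. The function names are illustrative.

```python
import math

ARCSEC_TO_RAD = math.pi / (180 * 3600)

def perihelion_advance_per_orbit(arcsec_per_century, period_years):
    """Convert a precession rate in arcsec/century to radians per revolution."""
    orbits_per_century = 100.0 / period_years
    return arcsec_per_century * ARCSEC_TO_RAD / orbits_per_century

def exponent_shift(advance_rad):
    """Exponent shift, assuming the advance per orbit equals pi times the shift."""
    return advance_rad / math.pi

MERCURY_PERIOD_YEARS = 0.2408  # assumed sidereal period, roughly 88 days

advance = perihelion_advance_per_orbit(43.0, MERCURY_PERIOD_YEARS)
shift = exponent_shift(advance)
print(f"advance per orbit = {advance:.2e} rad, exponent shift = {shift:.2e}")
```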

11.3 Example: Linear two-body restoring force

The effective potential for a linear two-body restoring force $F = -kr$ is
$$U_{eff} = \frac{1}{2}kr^2 + \frac{l^2}{2\mu r^2}$$
At the minimum
$$\left(\frac{\partial U_{eff}}{\partial r}\right)_{r=r_0} = kr_0 - \frac{l^2}{\mu r_0^3} = 0$$
Thus
$$r_0 = \left(\frac{l^2}{\mu k}\right)^{\frac{1}{4}}$$
and
$$\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r=r_0} = \frac{3l^2}{\mu r_0^4} + k = 4k > 0$$
which is a stable orbit. Small perturbations of such a stable circular orbit will have an angular frequency
$$\omega = \sqrt{\frac{\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r=r_0}}{\mu}} = 2\sqrt{\frac{k}{\mu}}$$
Note that this is twice the frequency for the planar harmonic oscillator with the same restoring coefficient. Because of the central repulsion, the effective potential well for this rotating oscillator example has about half the width of that for the corresponding planar harmonic oscillator. Note that the kinetic energy for the rotational motion, which is $\frac{l^2}{2\mu r_0^2}$, equals the potential energy $\frac{1}{2}kr_0^2$ at the minimum as predicted by the Virial Theorem for a linear two-body restoring force.

11.4 Example: Inverse square law attractive force

The effective potential for an inverse square law restoring force $\mathbf{F} = -\frac{k}{r^2}\hat{\mathbf{r}}$, where $k$ is assumed to be positive, is
$$U_{eff} = -\frac{k}{r} + \frac{l^2}{2\mu r^2}$$
At the minimum
$$\left(\frac{\partial U_{eff}}{\partial r}\right)_{r=r_0} = \frac{k}{r_0^2} - \frac{l^2}{\mu r_0^3} = 0$$
Thus
$$r_0 = \frac{l^2}{\mu k}$$
and
$$\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r=r_0} = \frac{3l^2}{\mu r_0^4} - \frac{2k}{r_0^3} = \frac{k}{r_0^3} > 0$$
which is a stable orbit. Small perturbations about such a stable circular orbit will have an angular frequency
$$\omega = \sqrt{\frac{\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r=r_0}}{\mu}} = \frac{\mu k^2}{l^3}$$
The kinetic energy for the rotational motion on this stable circular orbit, which is $\frac{l^2}{2\mu r_0^2}$, equals half the magnitude of the potential energy $-\frac{k}{r_0}$ at the minimum as predicted by the Virial Theorem.
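The results of the two preceding examples (linear and inverse-square forces) can be cross-checked by differentiating the effective potentials numerically. This is a minimal sketch with hypothetical values for the reduced mass, force constant and angular momentum; the finite-difference helpers are generic and not taken from the text.

```python
import math

def first_derivative(f, r, h=1e-5):
    """Central-difference estimate of df/dr."""
    return (f(r + h) - f(r - h)) / (2 * h)

def second_derivative(f, r, h=1e-4):
    """Central-difference estimate of d^2f/dr^2."""
    return (f(r + h) - 2 * f(r) + f(r - h)) / h**2

MU, K, L = 1.3, 2.0, 0.7  # hypothetical reduced mass, force constant, angular momentum

def u_eff_linear(r):
    return 0.5 * K * r**2 + L**2 / (2 * MU * r**2)

def u_eff_inverse_square(r):
    return -K / r + L**2 / (2 * MU * r**2)

# Circular-orbit radii quoted in the two examples.
r0_linear = (L**2 / (MU * K)) ** 0.25
r0_inverse = L**2 / (MU * K)

# Small-oscillation angular frequencies from the curvature at the minimum.
omega_linear = math.sqrt(second_derivative(u_eff_linear, r0_linear) / MU)
omega_inverse = math.sqrt(second_derivative(u_eff_inverse_square, r0_inverse) / MU)

print("linear:        ", omega_linear, "vs", 2 * math.sqrt(K / MU))
print("inverse square:", omega_inverse, "vs", MU * K**2 / L**3)
```

The two columns in each printed row should agree to within the finite-difference error.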

11.5 Example: Attractive inverse cubic central force

The inverse cubic force is an interesting example to investigate the stability of the orbit equations. One solution of the inverse cubic central force, for a reduced mass $\mu$, is a spiral orbit
$$r = r_0 e^{\alpha\psi}$$
That this is true can be shown by inserting this orbit into the differential orbit equation.
Using a Binet transformation of the variable $r$ to $u$ gives
$$u = \frac{1}{r} = \frac{1}{r_0}e^{-\alpha\psi}$$
$$\frac{du}{d\psi} = -\frac{\alpha}{r_0}e^{-\alpha\psi}$$
$$\frac{d^2u}{d\psi^2} = \frac{\alpha^2}{r_0}e^{-\alpha\psi}$$
Substituting these into the differential equation of the orbit
$$\frac{d^2u}{d\psi^2} + u = -\frac{\mu}{l^2}\frac{1}{u^2}F\left(\frac{1}{u}\right)$$
gives
$$\frac{\alpha^2}{r_0}e^{-\alpha\psi} + \frac{1}{r_0}e^{-\alpha\psi} = -\frac{\mu}{l^2}r_0^2e^{2\alpha\psi}F\left(\frac{1}{u}\right)$$
That is
$$F\left(\frac{1}{u}\right) = -\frac{(\alpha^2 + 1)l^2}{\mu}r_0^{-3}e^{-3\alpha\psi} = -\frac{(\alpha^2 + 1)l^2}{\mu r^3}$$
which is a central attractive inverse cubic force.
The time dependence of the spiral orbit can be derived since the angular momentum gives
$$\dot{\psi} = \frac{l}{\mu r^2} = \frac{l}{\mu r_0^2e^{2\alpha\psi}}$$
This can be written as
$$e^{2\alpha\psi}d\psi = \frac{l}{\mu r_0^2}dt$$
Integrating gives
$$\frac{e^{2\alpha\psi}}{2\alpha} = \frac{lt}{\mu r_0^2} + \beta$$
where $\beta$ is a constant. But the orbit gives
$$r^2 = r_0^2e^{2\alpha\psi} = \frac{2\alpha lt}{\mu} + 2\alpha r_0^2\beta$$
Thus the radius increases or decreases as the square root of the time. That is, an attractive inverse cubic central force does not have a stable orbit, which is what is expected since there is no minimum in the effective potential energy. Note that it is obvious that there will be no minimum or maximum for the effective potential energy since, if the force is $F = -\frac{k}{r^3}$, then the effective potential energy is
$$U_{eff} = -\frac{k}{2r^2} + \frac{l^2}{2\mu r^2} = \left(\frac{l^2}{\mu} - k\right)\frac{1}{2r^2}$$
which has no stable minimum or maximum.
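The linear growth of $r^2$ with time can be confirmed by integrating the angular equation of motion numerically. The sketch below assumes the spiral $r = r_0e^{\alpha\psi}$ with $\psi(0) = 0$, so that the integration constant gives $r^2 = r_0^2 + 2\alpha lt/\mu$; the fourth-order Runge-Kutta integrator and all names and values are illustrative choices, not from the text.

```python
import math

def integrate_psi(alpha, l, mu, r0, t_end, steps=2000):
    """Integrate psi_dot = l / (mu r0^2 exp(2 alpha psi)) from psi(0) = 0 with RK4."""
    def psi_dot(psi):
        return l / (mu * r0**2 * math.exp(2 * alpha * psi))

    psi = 0.0
    dt = t_end / steps
    for _ in range(steps):
        k1 = psi_dot(psi)
        k2 = psi_dot(psi + 0.5 * dt * k1)
        k3 = psi_dot(psi + 0.5 * dt * k2)
        k4 = psi_dot(psi + dt * k3)
        psi += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return psi

def r_squared_exact(alpha, l, mu, r0, t):
    """Closed form for r^2(t) when psi(0) = 0."""
    return r0**2 + 2 * alpha * l * t / mu

# Hypothetical values: alpha > 0 spirals outward, alpha < 0 spirals inward.
for alpha, t_end in ((0.2, 5.0), (-0.2, 2.0)):
    psi = integrate_psi(alpha, l=1.0, mu=1.0, r0=1.0, t_end=t_end)
    print(alpha, math.exp(2 * alpha * psi), r_squared_exact(alpha, 1.0, 1.0, 1.0, t_end))
```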


11.6 Example: Spiralling mass attached by a string to a hanging mass

An example of an application of orbit stability is the case shown in the adjacent figure. A particle of mass $m$ moves on a horizontal frictionless table. This mass is attached by a light string of fixed length $b$ and rotates about a hole in the table. The string is attached to a second equal mass $m$ that is hanging vertically downwards with no angular motion.

[Figure: Rotating mass $m$ on a frictionless horizontal table connected to a suspended mass $m$.]

The equations are most conveniently expressed in cylindrical coordinates $(r, \theta, z)$ with the origin at the hole in the table, and $z$ vertically upward. The fixed length of the string requires $z = r - b$. The potential energy is
$$U = mgz = mg(r - b)$$
The system is central and conservative, thus the Hamiltonian can be written as
$$H = \frac{m}{2}\left(\dot{r}^2 + r^2\dot{\theta}^2\right) + \frac{m}{2}\dot{z}^2 + mg(r - b) = E$$
The Lagrangian is independent of $\theta$, that is, $\theta$ is cyclic, thus the angular momentum $mr^2\dot{\theta} = l$ is a constant of motion. Substituting this into the Hamiltonian equation, with $\dot{z} = \dot{r}$, gives
$$m\dot{r}^2 + \frac{l^2}{2mr^2} + mg(r - b) = E$$
The effective potential is
$$U_{eff} = \frac{l^2}{2mr^2} + mg(r - b)$$
which is shown in the adjacent figure.

[Figure: Effective potential for two connected masses.]

The stationary value occurs when
$$\left(\frac{\partial U_{eff}}{\partial r}\right)_{r_0} = -\frac{l^2}{mr_0^3} + mg = 0$$
That is, when the angular momentum is related to the radius by
$$l^2 = m^2gr_0^3$$
Note that $r_0 = 0$ if $l = 0$.
The stability of the solution is given by the second derivative
$$\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r_0} = \frac{3l^2}{mr_0^4} = \frac{3mg}{r_0} > 0$$
Therefore the stationary point is stable.
Note that the equation of motion for the minimum can be expressed in terms of the restoring force on the two masses
$$2m\ddot{r} = -\left(\frac{\partial^2 U_{eff}}{\partial r^2}\right)_{r_0}(r - r_0)$$
Thus the system undergoes harmonic oscillation with frequency
$$\omega = \sqrt{\frac{\frac{3mg}{r_0}}{2m}} = \sqrt{\frac{3g}{2r_0}}$$
The solution of this system is stable and undergoes simple harmonic motion.
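A short numerical sketch makes the result concrete. It assumes the relations above, $l^2 = m^2gr_0^3$ and $\omega^2 = 3g/2r_0$, and also evaluates the orbital angular frequency $\dot{\theta} = l/mr_0^2$ for comparison; the function name and the input values are hypothetical.

```python
import math

def hanging_mass_orbit(l, m, g=9.81):
    """Return (r0, omega_radial, omega_orbital) for the table and hanging-mass system."""
    r0 = (l**2 / (m**2 * g)) ** (1.0 / 3.0)    # from l^2 = m^2 g r0^3
    omega_radial = math.sqrt(3 * g / (2 * r0))  # small radial oscillations
    omega_orbital = l / (m * r0**2)             # theta_dot on the circular orbit
    return r0, omega_radial, omega_orbital

r0, w_radial, w_orbital = hanging_mass_orbit(l=0.5, m=0.2)
print(r0, w_radial, w_orbital, w_radial / w_orbital)
```

Since $\dot{\theta}^2 = g/r_0$ on the circular orbit, the ratio of the two frequencies is $\sqrt{3/2}$ for any $l$ and $m$, which is not an integer, so the perturbed orbit is stable but does not close after one revolution.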

11.11 The three-body problem

Two bodies interacting via conservative central forces can be solved analytically for the inverse square law and the Hooke’s law radial dependences as already discussed. Central forces that have other radial dependences for the equations of motion may not be expressible in terms of simple functions, nevertheless the motion always can be given in terms of an integral. For a gravitational system comprising $n \geq 3$ bodies that are interacting via the two-body central gravitational force, the equations of motion can be written as
$$m_j\ddot{\mathbf{q}}_j = G\sum_{k \neq j}^{n} m_jm_k\frac{(\mathbf{q}_k - \mathbf{q}_j)}{|\mathbf{q}_k - \mathbf{q}_j|^3} \qquad (j = 1, 2, \ldots, n)$$
Even when all the $n$ bodies are interacting via two-body central forces, the problem usually is insoluble in terms of known analytic integrals. Newton first posed the difficulty of the three-body Kepler problem which has been studied extensively by mathematicians and physicists. No known general analytic integral solution has been found. Each body for the $n$-body system has 6 degrees of freedom, that is, 3 for position and 3 for momentum. The center-of-mass motion can be factored out, therefore the center-of-mass system for the $n$-body system has $6n - 10$ degrees of freedom after subtraction of 3 degrees for location of the center of mass, 3 for the linear momentum of the center of mass, 3 for rotation of the center of mass, and 1 for the total energy of the system. Thus for $n = 2$ there are $12 - 10 = 2$ degrees of freedom for the two-body system, which the Kepler approach takes to be $r$ and $\psi$. For $n = 3$ there are 8 degrees of freedom in the center of mass system that have to be determined.

Figure 11.10: A contour plot of the effective potential for the Sun-Earth gravitational system in the rotating frame where the Sun and Earth are stationary. The 5 Lagrange points $L_i$ are saddle points where the net force is zero. (Figure created by NASA)
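The right-hand side of the $n$-body equations of motion is straightforward to evaluate even though the resulting motion has no general analytic solution. The following sketch computes the acceleration of each body from all the others; the function name, the list-based data layout and the sample values are assumptions made for illustration.

```python
import math

def accelerations(masses, positions, G=6.674e-11):
    """Return the acceleration vector of each body due to all other bodies.

    masses: list of n masses; positions: list of n (x, y, z) tuples.
    """
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for j in range(n):
        for other in range(n):
            if other == j:
                continue
            # Separation vector pointing from body j to the other body.
            d = [positions[other][i] - positions[j][i] for i in range(3)]
            dist = math.sqrt(sum(c * c for c in d))
            for i in range(3):
                acc[j][i] += G * masses[other] * d[i] / dist**3
    return acc

# Hypothetical three-body configuration in units where G = 1.
masses = [1.0, 2.0, 3.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
for a in accelerations(masses, positions, G=1.0):
    print(a)
```

Because the pairwise forces are equal and opposite, the mass-weighted sum of the accelerations vanishes, which is the statement that the center-of-mass motion can be factored out.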
Numerical solutions to the three-body problem can be obtained using successive approximation or perturbation methods in computer calculations. The problem can be simplified by restricting the motion to either of the following two approximations:

1) Planar approximation
This approximation assumes that the three masses move in the same plane, that is, the number of degrees of freedom is reduced from 8 to 6, which simplifies the numerical solution.

2) Restricted three-body approximation

The restricted three-body approximation assumes that two of the masses are large and bound while the
third mass is negligible such that the perturbation of the motion of the larger two by the third body is
negligible. This approximation essentially reduces the system to a two body problem in order to calculate
the gravitational fields that act on the third much lighter mass.
Euler and Lagrange showed that the restricted three-body system has five points at which the combined gravitational attraction plus centripetal force of the two large bodies cancel. These are called the Lagrange points and are used for parking satellites in stable orbits with respect to the Earth-Moon system, or with respect to the Sun-Earth system. Figure 11.10 illustrates the five Lagrange points for the Earth-Sun system. Only two of the Lagrange points, $L_4$ and $L_5$, lead to stable orbits. Note that these Lagrange points are fixed with respect to the Earth-Sun system which rotates with respect to inertial coordinate frames. The discovery in the 1900s of the Trojan asteroids at the $L_4$ and $L_5$ Lagrange points of the Sun-Jupiter system confirmed the Lagrange predictions.
Poincaré showed that the motion of a light mass bound to two heavy bodies can exhibit extreme sensitivity to initial conditions as well as characteristics of chaos. The three-body problem has remained largely unsolved since Newton identified the difficulties involved.

11.12 Two-body scattering

Two moving bodies that are interacting via a central force scatter when the force is repulsive, or when an attractive system is unbound. Two-body scattering is encountered extensively in astronomy and in atomic, nuclear, and particle physics. The probability of such scattering is most conveniently expressed in terms of the scattering cross sections defined below.

11.12.1 Total two-body scattering cross section

The concept of scattering cross section for two-body scattering is most easily described for the total two-body cross section. The probability $P$ that a beam of $n_B$ incident point particles/second, distributed over a cross sectional area $A_B$, will hit a single solid object having a cross sectional area $\sigma$ is given by the ratio of the areas as illustrated in figure 11.11. That is,
$$P = \frac{\sigma}{A_B} \tag{11.134}$$
where it is assumed that $A_B \gg \sigma$. For a spherical target body of radius $r$, the cross section $\sigma = \pi r^2$. The scattering probability $P$ is proportional to the cross section $\sigma$, which is the cross section of the target body perpendicular to the beam; thus $\sigma$ has the units of area.

Figure 11.11: Scattering probability for an incident beam of cross sectional area $A$ by a target body of cross sectional area $\sigma$.

Since the incident beam of $n_B$ incident point particles/second has a cross sectional area $A_B$, it will have an areal density $I$ given by
$$I = \frac{n_B}{A_B} \quad \text{beam particles/m}^2\text{/s} \tag{11.135}$$
The number of beam particles scattered per second $N_S$ by this single target scatterer equals
$$N_S = Pn_B = \frac{\sigma}{A_B}n_B = \sigma I \tag{11.136}$$
Thus the cross section for scattering by this single target body is
$$\sigma = \frac{N_S}{I} = \frac{\text{scattered particles/s}}{\text{incident beam/m}^2\text{/s}}$$
Realistically one will have many target scatterers in the target and the total scattering probability increases proportionally to the number of target scatterers. That is, for a target comprising an areal density of $n_T$ target bodies per unit area of the incident beam, the number scattered will increase in proportion to the target areal density $n_T$. That is, there will be $n_TA_B$ scattering bodies that interact with the beam, assuming that the target has a larger area than the beam. Thus the total number scattered per second $N_S$ by a target that comprises multiple scatterers is
$$N_S = \sigma In_TA_B = \sigma n_Bn_T \tag{11.137}$$
Note that this is independent of the cross sectional area of the beam assuming that the target area is larger than that of the beam. That is, the number scattered per second is proportional to the cross section $\sigma$ times the product of the number of incident particles per second, $n_B$, and the areal density of target scatterers, $n_T$. Typical cross sections encountered in astrophysics are $\sigma \approx 10^{14}$ m², in atomic physics $\sigma \approx 10^{-20}$ m², and in nuclear physics $\sigma \approx 10^{-28}$ m² = 1 barn.³
N. B., the above proof assumed that the target size is larger than the cross sectional area of the incident beam. If the size of the target is smaller than the beam, then $n_B$ is replaced by the areal density/s of the beam $I$, and $n_T$ is replaced by the number of target particles $N_T$, and the cross-sectional size of the target cancels.

³ The term "barn" was chosen because nuclear physicists joked that the cross sections for neutron scattering by nuclei were as large as a barn door.
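The counting argument above reduces to two one-line formulas, sketched below with illustrative numbers. The function names are hypothetical, and the beam rate and target areal density are made-up values chosen only to show the orders of magnitude involved for a cross section of one barn.

```python
BARN = 1.0e-28  # m^2

def hit_probability(sigma, beam_area):
    """Probability that one beam particle hits a single target of cross section sigma."""
    if sigma > beam_area:
        raise ValueError("the argument assumes the beam area exceeds the cross section")
    return sigma / beam_area

def scattered_per_second(sigma, beam_rate, target_areal_density):
    """Number scattered per second for a target wider than the beam.

    sigma in m^2, beam_rate in particles/s, target_areal_density in targets/m^2.
    """
    return sigma * beam_rate * target_areal_density

# Hypothetical experiment: 1 barn, 1e10 beam particles/s, 1e23 target nuclei/m^2.
print(hit_probability(BARN, beam_area=1.0e-6))
print(scattered_per_second(BARN, beam_rate=1.0e10, target_areal_density=1.0e23))
```

Note that the beam area enters only the single-target probability; it drops out of the rate for an extended target.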


11.12.2 Diﬀerential two-body scattering cross section

The differential two-body scattering cross section gives much more detailed information on the scattering force than does the total cross section because of the correlation between the impact parameter and the scattering angle. That is, a measurement of the number of beam particles scattered into a given solid angle as a function of scattering angles $\theta, \phi$ probes the radial form of the scattering force.
The differential cross section for scattering of an incident beam by a single target body into a solid angle $d\Omega$ at scattering angles $\theta, \phi$ is defined to be
$$\frac{d\sigma}{d\Omega}(\theta\phi) \equiv \frac{1}{I}\frac{dN_S(\theta, \phi)}{d\Omega} \tag{11.138}$$
where the right-hand side is the ratio of the number scattered per target nucleus into solid angle $d\Omega(\theta, \phi)$ to the incident beam intensity $I$ particles/m²/s.

Figure 11.12: The equivalent one-body problem for scattering of a reduced mass $\mu$ by a force centre in the centre of mass system.

Reasoning similar to that used to derive equation 11.137 shows that the number of beam particles scattered into a solid angle $d\Omega$ for $n_B$ beam particles incident upon a target with areal density $n_T$ is
$$\frac{dN_S(\theta, \phi)}{d\Omega} = n_Bn_T\frac{d\sigma}{d\Omega}(\theta\phi) \tag{11.139}$$
Consider the equivalent one-body system for scattering of one body by a scattering force center in the center of mass. As shown in figures 11.6 and 11.12, the perpendicular distance between the center of force of the two body system and the trajectory of the incoming body at infinite distance is called the impact parameter $b$. For a central force the scattering system has cylindrical symmetry, therefore the solid angle $d\Omega(\theta\phi) = \sin\theta\,d\theta\,d\phi$ can be integrated over the azimuthal angle $\phi$ to give $d\Omega(\theta) = 2\pi\sin\theta\,d\theta$.
For the inverse-square, two-body, central force there is a one-to-one correspondence between impact parameter $b$ and scattering angle $\theta$ for a given bombarding energy. In this case, assuming conservation of flux means that the number of incident beam particles passing through the impact-parameter annulus between $b$ and $b + db$ must equal the number passing between the corresponding angles $\theta$ and $\theta + d\theta$. That is, for an incident beam flux of $I$ particles/m²/s the number of particles per second passing through the annulus is
$$I\,2\pi b\,|db| = 2\pi I\frac{d\sigma}{d\Omega}\sin\theta\,|d\theta| \tag{11.140}$$
The modulus is used to ensure that the number of particles is always positive. Thus
$$\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right| \tag{11.141}$$
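The relation between the differential cross section and the impact parameter can be evaluated numerically for any known $b(\theta)$. The sketch below differentiates $b(\theta)$ by central differences; as a test case it uses specular reflection from a hard sphere of radius $R$, for which $b = R\cos(\theta/2)$ and the cross section is the isotropic value $R^2/4$. The hard-sphere case and the function names are illustrative additions, not taken from the text.

```python
import math

def differential_cross_section(b_of_theta, theta, h=1e-6):
    """Evaluate (b / sin(theta)) * |db/dtheta| with a central-difference derivative."""
    db_dtheta = (b_of_theta(theta + h) - b_of_theta(theta - h)) / (2 * h)
    return b_of_theta(theta) / math.sin(theta) * abs(db_dtheta)

def hard_sphere_impact_parameter(theta, radius=1.0):
    """b(theta) for specular reflection from a hard sphere (illustrative test case)."""
    return radius * math.cos(theta / 2)

for theta_deg in (30, 90, 150):
    theta = math.radians(theta_deg)
    print(theta_deg, differential_cross_section(hard_sphere_impact_parameter, theta))
```

Integrating the constant $R^2/4$ over the full solid angle $4\pi$ recovers the total cross section $\pi R^2$ quoted earlier for a spherical target.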

11.12.3 Impact parameter dependence on scattering angle

If the function $\theta = f(b, E_{cm})$ is known, then it is possible to evaluate $\left|\frac{db}{d\theta}\right|$ which can be used in equation 11.141 to calculate the differential cross section. A simple and important case to consider is two-body elastic scattering for the inverse-square law force such as the Coulomb or gravitational forces. To avoid confusion in the following discussion, the center-of-mass scattering angle will be called $\theta$ while the angle used to define the hyperbolic orbits in the discussion of trajectories for the inverse square law will be called $\psi$.
In chapter 11.8 the equivalent one-body representation gave that the radial distance for a trajectory for the inverse square law is given by
$$\frac{1}{r} = -\frac{\mu k}{l^2}[1 + \epsilon\cos\psi] \tag{11.142}$$
Note that closest approach occurs when $\psi = 0$ while for $r \to \infty$ the bracket must equal zero, that is
$$\cos\psi_\infty = \pm\left|\frac{1}{\epsilon}\right| \tag{11.143}$$

The polar angle $\psi$ is measured with respect to the symmetry axis of the two-body system which is along the line of distance of closest approach as shown in figure 11.6. The geometry and symmetry show that the scattering angle $\theta$ is related to the trajectory angle $\psi_\infty$ by
$$\theta = \pi - 2\psi_\infty \tag{11.144}$$
Equation 11.50 gives that
$$\psi_\infty = \int_{r_{min}}^{\infty}\frac{\pm l\,dr}{r^2\sqrt{2\mu\left(E_{cm} - U - \frac{l^2}{2\mu r^2}\right)}} \tag{11.145}$$
Since
$$l^2 = b^2p^2 = 2\mu b^2E_{cm} \tag{11.146}$$
then the scattering angle can be written as
$$\psi_\infty = \frac{\pi - \theta}{2} = \int_{r_{min}}^{\infty}\frac{b\,dr}{r^2\sqrt{\left(1 - \frac{U}{E_{cm}}\right) - \frac{b^2}{r^2}}} \tag{11.147}$$
Let $u = \frac{1}{r}$, then
$$\psi_\infty = \frac{\pi - \theta}{2} = \int_{0}^{u_{max}}\frac{b\,du}{\sqrt{\left(1 - \frac{U}{E_{cm}}\right) - b^2u^2}} \tag{11.148}$$
where $u_{max} = 1/r_{min}$.
For the repulsive inverse square law
$$U = -\frac{k}{r} = -ku \tag{11.149}$$
where $k$ is taken to be positive for a repulsive force. Thus the scattering angle relation becomes
$$\psi_\infty = \frac{\pi - \theta}{2} = \int_{0}^{u_{max}}\frac{b\,du}{\sqrt{1 + \frac{k}{E_{cm}}u - b^2u^2}} \tag{11.150}$$
The solution of this equation is given by equation 11.63 to be
$$u = \frac{1}{r} = -\frac{\mu k}{l^2}[1 + \epsilon\cos\psi] \tag{11.151}$$
where the eccentricity
$$\epsilon = \sqrt{1 + \frac{2E_{cm}l^2}{\mu k^2}} \tag{11.152}$$
For $r \to \infty$, $u = 0$ then, as shown previously,
$$\left|\frac{1}{\epsilon}\right| = \cos\psi_\infty = \cos\frac{\pi - \theta}{2} = \sin\frac{\theta}{2} \tag{11.153}$$
Therefore
$$\frac{2E_{cm}b}{k} = \sqrt{\epsilon^2 - 1} = \cot\frac{\theta}{2} \tag{11.154}$$
that is, the impact parameter $b$ is given by the relation
$$b = \frac{k}{2E_{cm}}\cot\frac{\theta}{2} \tag{11.155}$$
Thus, for an inverse-square law force, the two-body scattering has a one-to-one correspondence between impact parameter $b$ and scattering angle $\theta$ as shown schematically in figure 11.13.

Figure 11.13: Impact parameter dependence on scattering angle for Rutherford scattering.

If $k$ is negative, which corresponds to an attractive inverse square law, then one gets the same relation between impact parameter and scattering angle except that the sign of the impact parameter $b$ is opposite. This means that the hyperbolic trajectory has an interior rather than exterior focus. That is, the trajectory partially orbits around the center of force rather than being repelled away.
Note that the distance of closest approach is related to the eccentricity $\epsilon$ by equation 11.151, therefore
$$r_{min} = \frac{k}{2E_{cm}}(1 + \epsilon) \tag{11.156}$$
$$r_{min} = \frac{k}{2E_{cm}}\left(1 + \frac{1}{\sin\frac{\theta}{2}}\right) \tag{11.157}$$
Note that for $\theta = 180^\circ$ then
$$E_{cm} = \frac{k}{r_{min}} = U(r_{min}) \tag{11.158}$$
which is what you would expect from equating the incident kinetic energy to the potential energy at the distance of closest approach.
For scattering of two nuclei by the repulsive Coulomb force, if the impact parameter becomes small enough, the attractive nuclear force also acts, leading to the impact-parameter dependent effective potentials illustrated in figure 11.14. Trajectory 1 does not overlap the nuclear force and thus is pure Coulomb. Trajectory 2 interacts at the periphery of the nuclear potential and the trajectory deviates from the pure Coulomb trajectory shown dashed. Trajectory 3 passes through the interior of the nuclear potential. These three trajectories all can lead to the same scattering angle and thus there no longer is a one-to-one correspondence between scattering angle and impact parameter.

Figure 11.14: Classical trajectories for scattering to a given angle by the repulsive Coulomb field plus the attractive nuclear field for three different impact parameters. Path 1 is pure Coulomb. Paths 2 and 3 include Coulomb plus nuclear interactions. The dashed parts of trajectories 2 and 3 correspond to only the Coulomb force acting, i.e. zero nuclear force.

11.12.4 Rutherford scattering

Two models of the atom evolved in the 1900’s: the Rutherford model assumed electrons orbiting around a small nucleus like planets around the sun, while J.J. Thomson’s "plum-pudding" model assumed the electrons were embedded in a uniform sphere of positive charge the size of the atom. When Rutherford derived his classical formula in 1911 he realized that it can be used to determine the size of the nucleus since the electric field obeys the inverse square law only when outside of the charged spherical nucleus. Inside a uniform sphere of charge the electric field is $\mathbf{E} \propto \mathbf{r}$ and thus the scattering cross section will not obey the Rutherford relation for distances of closest approach that are less than the radius of the sphere of positive charge. Observation of the angle beyond which the Rutherford formula breaks down immediately determines the radius of the nucleus.
For pure Coulomb scattering, equation 11.155 can be used to evaluate $\left|\frac{db}{d\theta}\right|$, which when used in equation 11.141 gives the center-of-mass Rutherford scattering cross section
$$\frac{d\sigma}{d\Omega} = \frac{1}{4}\left(\frac{k}{2E_{cm}}\right)^2\frac{1}{\sin^4\frac{\theta}{2}} \tag{11.159}$$
This cross section assumes elastic scattering by a repulsive two-body inverse-square central force. For scattering of nuclei in the Coulomb potential, the constant $k$ is given to be
$$k = \frac{Z_pZ_Te^2}{4\pi\epsilon_0} \tag{11.160}$$
The cross section, scattering angle and $E_{cm}$ of equation 11.159 are evaluated in the center-of-mass coordinate system, whereas usually two-body elastic scattering data involve scattering of the projectiles by a stationary target as discussed in chapter 11.13.
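The step from the impact-parameter relation to the Rutherford cross section can be checked numerically. The sketch below evaluates $b = (k/2E_{cm})\cot(\theta/2)$, differentiates it by central differences, and compares $\frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right|$ with the closed form. The constant $e^2/4\pi\epsilon_0 \approx 1.44$ MeV fm, the function names, and the sample charges and energy are assumptions introduced for illustration; lengths are in fm and energies in MeV.

```python
import math

COULOMB_MEV_FM = 1.439964  # e^2 / (4 pi epsilon_0) in MeV fm

def coulomb_constant(z_projectile, z_target):
    """Force constant k for two point charges, in MeV fm."""
    return z_projectile * z_target * COULOMB_MEV_FM

def impact_parameter(theta, k, e_cm):
    """b(theta) for the inverse-square force."""
    return k / (2 * e_cm) / math.tan(theta / 2)

def closest_approach(theta, k, e_cm):
    """Distance of closest approach for scattering angle theta."""
    return k / (2 * e_cm) * (1 + 1 / math.sin(theta / 2))

def rutherford(theta, k, e_cm):
    """Closed-form center-of-mass cross section, in fm^2 per steradian."""
    return 0.25 * (k / (2 * e_cm)) ** 2 / math.sin(theta / 2) ** 4

def cross_section_from_b(theta, k, e_cm, h=1e-6):
    """(b / sin(theta)) |db/dtheta| using a central-difference derivative."""
    db = (impact_parameter(theta + h, k, e_cm) - impact_parameter(theta - h, k, e_cm)) / (2 * h)
    return impact_parameter(theta, k, e_cm) / math.sin(theta) * abs(db)

k = coulomb_constant(2, 79)  # alpha particle on gold
e_cm = 7.5                   # hypothetical center-of-mass energy in MeV
for theta_deg in (30, 90, 150):
    theta = math.radians(theta_deg)
    print(theta_deg, rutherford(theta, k, e_cm), cross_section_from_b(theta, k, e_cm),
          closest_approach(theta, k, e_cm))
```

The two cross-section columns should agree, and the closest-approach column shows how larger scattering angles probe smaller distances.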

Geiger and Marsden performed scattering of 7.7 MeV $\alpha$ particles from a thin gold foil and proved that the differential scattering cross section obeyed the Rutherford formula back to angles corresponding to a distance of closest approach of $10^{-14}$ m, which is much smaller than the $10^{-10}$ m size of the atom. This validated the Rutherford model of the atom and immediately led to the Bohr model of the atom which played such a crucial role in the development of quantum mechanics. Bohr showed that the agreement with the Rutherford formula implies the Coulomb field obeys the inverse square law to small distances. This work was performed at Manchester University, England between 1908 and 1913. It is fortunate that the classical result is identical to the quantal cross section for scattering, otherwise the development of modern physics could have been delayed for many years.
Scattering of very heavy ions, such as $^{208}$Pb, can electromagnetically excite target nuclei. For the Coulomb force the impact parameter $b$ and the distance of closest approach $r_{min}$ are directly related to the scattering angle $\theta$ by equation 11.155. Thus observing the angle of the scattered projectile unambiguously determines the hyperbolic trajectory and thus the electromagnetic impulse given to the colliding nuclei. This process, called Coulomb excitation, uses the measured angular distribution of the scattered ions for inelastic excitation of the nuclei to precisely and unambiguously determine the Coulomb excitation cross section as a function of impact parameter. This unambiguously determines the shape of the nuclear charge distribution.

11.7 Example: Two-body scattering by an inverse cubic force

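Assume two-body scattering by a potential $U = \frac{k}{r^2}$ where $k > 0$. This corresponds to a repulsive two-body
force $\mathbf{F} = \frac{2k}{r^3}\hat{\mathbf{r}}$. Inserting this force into Binet's differential orbit equation 11.39 gives
$$\frac{d^2u}{d\psi^2} + u\left(1 + \frac{2\mu k}{l^2}\right) = 0$$
The solution is of the form $u = A\sin(\omega\psi + \beta)$ where $A$ and $\beta$ are constants of integration, $l = \mu r^2\dot{\psi}$ and
$$\omega^2 = \left(1 + \frac{2\mu k}{l^2}\right)$$
Initially $r = \infty$, $u = 0$ and therefore $\beta = 0$. Also at $r = \infty$, $E = \frac{1}{2}\mu\dot{r}_\infty^2$, that is $|\dot{r}_\infty| = \sqrt{\frac{2E}{\mu}}$. Then
$$\dot{r} = \frac{dr}{d\psi}\dot{\psi} = \frac{dr}{d\psi}\frac{l}{\mu r^2} = -\frac{l}{\mu}\frac{du}{d\psi} = -\frac{l}{\mu}A\omega\cos(\omega\psi)$$
The initial energy gives that $A = \frac{1}{l\omega}\sqrt{2\mu E}$. Hence the orbit equation is
$$u = \frac{1}{r} = \frac{\sqrt{2\mu E}}{l\omega}\sin(\omega\psi)$$
The above trajectory has a distance of closest approach, $r_{min}$, when $\omega\psi_{min} = \frac{\pi}{2}$. Moreover, due to the
symmetry of the orbit, the scattering angle $\theta$ is given by
$$\theta = \pi - 2\psi_0 = \pi\left(1 - \frac{1}{\omega}\right)$$
where $\psi_0 = \psi_{min} = \frac{\pi}{2\omega}$ is the angle at closest approach. Since $l^2 = \mu^2 b^2\dot{r}_\infty^2 = 2\mu b^2 E$ then
$$1 - \frac{\theta}{\pi} = \left(1 + \frac{2\mu k}{l^2}\right)^{-\frac{1}{2}} = \left(1 + \frac{k}{b^2 E}\right)^{-\frac{1}{2}}$$
This gives that the impact parameter $b$ is related to scattering angle by
$$b^2 = \frac{k}{E}\frac{(\pi - \theta)^2}{(2\pi - \theta)\,\theta}$$
This impact parameter relation can be used in equation 11.141 to give the differential cross section
$$\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right| = \frac{k}{E}\frac{\pi^2(\pi - \theta)}{(2\pi - \theta)^2\,\theta^2\sin\theta}$$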
These orbits are called Cotes spirals.
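As a numerical cross-check of this example, the following Python sketch evaluates the impact-parameter relation and the differential cross section quoted above, and compares the closed-form cross section with a finite-difference estimate of $(b/\sin\theta)\,|db/d\theta|$. The function names and the sample values of $k$ and $E$ are illustrative assumptions, in arbitrary but consistent units.

```python
import math

def impact_parameter(theta, k, E):
    """Impact parameter b(theta) for the repulsive potential U = k/r^2."""
    return math.sqrt((k / E) * (math.pi - theta) ** 2
                     / ((2.0 * math.pi - theta) * theta))

def scattering_angle(b, k, E):
    """Scattering angle theta = pi*(1 - 1/omega), with omega^2 = 1 + k/(b^2 E)."""
    omega = math.sqrt(1.0 + k / (b * b * E))
    return math.pi * (1.0 - 1.0 / omega)

def cross_section(theta, k, E):
    """Closed-form differential cross section for U = k/r^2."""
    return (k / E) * math.pi ** 2 * (math.pi - theta) / (
        (2.0 * math.pi - theta) ** 2 * theta ** 2 * math.sin(theta))

def cross_section_numeric(theta, k, E, h=1.0e-6):
    """(b/sin(theta)) * |db/dtheta| using a central difference for db/dtheta."""
    db = (impact_parameter(theta + h, k, E)
          - impact_parameter(theta - h, k, E)) / (2.0 * h)
    return impact_parameter(theta, k, E) / math.sin(theta) * abs(db)

if __name__ == "__main__":
    k, E, theta = 2.0, 3.0, 1.0   # hypothetical sample values
    print(cross_section(theta, k, E), cross_section_numeric(theta, k, E))
```

The two printed values should agree closely, which is a useful sanity check on the algebra leading from $b(\theta)$ to the cross section.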
278 CHAPTER 11. CONSERVATIVE TWO-BODY CENTRAL FORCES

11.13 Two-body kinematics

So far the discussion has been restricted to the center-of-momentum system. Actual scattering measurements
are performed in the laboratory frame, and thus it is necessary to transform the scattering angle, energies
and cross sections between the laboratory and center-of-momentum coordinate frames. In principle the
transformation between the center-of-momentum and laboratory frames is straightforward, using the vector
addition of the center-of-mass velocity vector and the center-of-momentum velocity vectors of the two bodies.
The following discussion assumes non-relativistic kinematics apply.
In chapter 28 it was shown that, for Newtonian mechanics, the center-of-mass and center-of-momentum
frames of reference are identical. By definition, in the center-of-momentum frame the vector sum of the
linear momentum of the incoming projectile 
 and target, 
 are equal and opposite. That is

p
 + p
 =0 (11.161)

Using the center-of-momentum frame, coupled with the conservation of linear momentum, implies that the
vector sum of the final momenta of the  reaction products, 


 also is zero. That is

X
p


=0 (11.162)
=1

An additional constraint is that energy conservation relates the initial and final kinetic energies by
$$\frac{\left(p_P^{cm}\right)^2}{2m_P} + \frac{\left(p_T^{cm}\right)^2}{2m_T} + Q = \frac{\left(p_P^{\prime\,cm}\right)^2}{2m_P'} + \frac{\left(p_T^{\prime\,cm}\right)^2}{2m_T'} \tag{11.163}$$
where primes denote the outgoing channel and the $Q$ value is the energy contributed to the final total kinetic energy by the reaction between the
incoming projectile and target. For exothermic reactions, $Q > 0$, the summed kinetic energy of the reaction products
exceeds the sum of the incoming kinetic energies, while for endothermic reactions, $Q < 0$, the summed kinetic
energy of the reaction products is less than that of the incoming channel.
For two-body kinematics, the following are three advantages to working in the center-of-momentum frame
of reference.

1. The two incident colliding bodies are colinear as are the two final bodies.
2. The linear momenta for the two colliding bodies are identical in both the incident channel and the
outgoing channel.
3. The total energy in the center-of-momentum coordinate frame is the energy available to the reac-
tion during the collision. The trivial kinetic energy of the center-of-momentum frame relative to the
laboratory frame is handled separately.

The kinematics for two-body reactions is easily determined using the conservation of linear momentum
along and perpendicular to the beam direction plus the conservation of energy, equations 11.161 to 11.163. Note that it
is common practice to use the term “center-of-mass” rather than “center-of-momentum” in spite of the fact
that, for relativistic mechanics, only the center-of-momentum is a meaningful concept.
General features of the transformation between the center-of-momentum and laboratory frames of reference
are best illustrated by elastic or inelastic scattering of nuclei where the two reaction products in the final
channel are identical to the incident bodies. Inelastic excitation of an excited state of energy $\Delta E_{ex}$ in either
reaction product corresponds to $Q = -\Delta E_{ex}$, while elastic scattering corresponds to $Q = -\Delta E_{ex} = 0$.
For inelastic scattering, the conservation of linear momenta for the outgoing (primed) channel in the
center-of-momentum frame simplifies to
$$\mathbf{p}_P^{\prime\,cm} + \mathbf{p}_T^{\prime\,cm} = 0 \tag{11.164}$$
that is, the linear momenta of the two reaction products are equal and opposite.
Assume that the center-of-momentum direction of the scattered projectile is at an angle $\theta_P^{cm} = \theta$ relative
to the direction of the incoming projectile and that the scattered target nucleus is scattered at a
center-of-momentum direction $\theta_T^{cm} = \pi - \theta$. Elastic scattering corresponds to simple scattering for which the
magnitudes of the incoming and outgoing projectile momenta are equal, that is, $\left|\mathbf{p}_P^{cm}\right| = \left|\mathbf{p}_P^{\prime\,cm}\right|$.

Figure 11.15: Vector hodograph of the scattered projectile and target velocities for a projectile, with incident
velocity $v_P$, that is elastically scattered by a stationary target body. The circles show the magnitude of the
projectile and target body final velocities in the center of mass. The center-of-mass velocity vectors are
shown as dashed lines while the laboratory vectors are shown as solid lines. The left hodograph shows
normal kinematics where the projectile mass is less than the target mass. The right hodograph shows
inverse kinematics where the projectile mass is greater than the target mass. For elastic scattering $u_P = u_P'$.

Velocities
The transformation between the center-of-momentum and laboratory frames requires knowledge of the particle
velocities which can be derived from the linear momenta since the particle masses are known. Assume
that a projectile, mass $m_P$, with incident energy $E_P$ in the laboratory frame bombards a stationary target
with mass $m_T$. The incident projectile velocity $v_P$ is given by
$$v_P = \sqrt{\frac{2E_P}{m_P}} \tag{11.165}$$
The initial velocities in the laboratory frame are taken to be
$$w_P = v_P \qquad w_T = 0 \qquad \text{(Initial Lab velocities)}$$
The final velocities in the laboratory frame after the inelastic collision are
$$w_P' \qquad w_T' \qquad \text{(Final Lab velocities)}$$
In the center-of-momentum coordinate system, equation 11.10 implies that the initial center-of-momentum
velocities are
$$u_P = \frac{m_T}{m_P + m_T}v_P \qquad u_T = \frac{m_P}{m_P + m_T}v_P \tag{11.166}$$
It is simple to derive that the final center-of-momentum velocities after the inelastic collision are given by
$$u_P' = \frac{m_T}{m_P + m_T}\sqrt{\frac{2}{m_P}\tilde{E}} \qquad u_T' = \frac{m_P}{m_P + m_T}\sqrt{\frac{2}{m_P}\tilde{E}} \tag{11.167}$$
The energy $\tilde{E}$ is defined to be given by
$$\tilde{E} = E_P + Q\left(1 + \frac{m_P}{m_T}\right) \tag{11.168}$$
where $Q = -\Delta E_{ex}$ and $\Delta E_{ex}$ is the excitation energy of the final excited states in the outgoing channel.

Angles
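The angles of the scattered recoils are written as
$$\theta_P^{lab} \qquad \theta_T^{lab} \qquad \text{(Final laboratory angles)}$$
and
$$\theta_P^{cm} = \theta \qquad \theta_T^{cm} = \pi - \theta \qquad \text{(Final CM angles)}$$
where $\theta$ is the center-of-mass (center-of-momentum) scattering angle.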

Figure 1115 shows that the angle relations between the laboratory and center of momentum frames for
the scattered projectile are connected by
r
sin( 
 −   )  
= ≡ (11.169)
sin 

 ̃

where
 1  1
= q = q (11.170)
 1 +    1 + 
( +
(1 + )  )

   


and 
is the energy per nucleon on the incident projectile.
Equation 11169 can be rewritten as

sin 
tan 
 =

(11.171)
cos 
 + 

Another useful relation from equation 11169 gives the center-of-momentum scattering angle in terms of
the laboratory scattering angle.

 = sin
−1
( sin  
 ) +   (11.172)
This gives the diﬀerence in angle between the lab scattering angle and the center-of-momentum scattering
angle. Be careful with this relation since 
 is two-valued for inverse kinematics corresponding to the two
possible signs for the solution.
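The two-valued nature of the inverse relation is easiest to see numerically. The following Python sketch is one possible implementation, assuming angles in radians and a given value of $\tau$. It converts a center-of-momentum angle to the laboratory angle using the tangent relation 11.171, and returns every center-of-momentum angle in $[0, \pi]$ that is compatible with a given laboratory angle. The function names are illustrative assumptions.

```python
import math

def lab_angle(theta_cm, tau):
    """Lab angle from the CM angle via tan(theta_lab) = sin/(cos + tau)."""
    return math.atan2(math.sin(theta_cm), math.cos(theta_cm) + tau)

def cm_angles(theta_lab, tau, tol=1.0e-9):
    """All CM angles in [0, pi] that map onto theta_lab (zero, one or two)."""
    s = tau * math.sin(theta_lab)
    if s > 1.0:
        return []                       # beyond the kinematic limit
    alpha = math.asin(s)
    # sin(theta_cm - theta_lab) = tau*sin(theta_lab) has two branches.
    candidates = [theta_lab + alpha, theta_lab + math.pi - alpha]
    return [t for t in candidates
            if 0.0 <= t <= math.pi and abs(lab_angle(t, tau) - theta_lab) < tol]

if __name__ == "__main__":
    for tau in (0.5, 2.0):              # normal and inverse kinematics
        theta_lab = lab_angle(1.0, tau)
        print(tau, theta_lab, cm_angles(theta_lab, tau))
```

For $\tau < 1$ only one branch survives, while for $\tau > 1$ both branches give valid center-of-momentum angles for the same laboratory angle.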
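The angle relations between the lab and center-of-momentum for the recoiling target nucleus are connected
by
$$\frac{\sin(\theta_T^{cm} - \theta_T^{lab})}{\sin\theta_T^{lab}} = \sqrt{\frac{E_P}{\tilde{E}}} \equiv \tilde{\tau} \tag{11.173}$$
That is
$$\theta_T^{cm} = \sin^{-1}(\tilde{\tau}\sin\theta_T^{lab}) + \theta_T^{lab} \tag{11.174}$$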

Figure 11.16: The kinematic correlation of the laboratory and center-of-mass scattering angles of the recoiling
projectile and target nuclei for scattering of 4.3 MeV/nucleon $^{104}$Pd on $^{208}$Pb (left) and for the inverse
4.3 MeV/nucleon $^{208}$Pb on $^{104}$Pd (right). The projectile scattering angles are shown by solid lines while the
recoiling target angles are shown by dashed lines. The blue curves correspond to elastic scattering, that is
$Q = 0$, while the red curves correspond to inelastic scattering with $Q = -5$ MeV.

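where
$$\tilde{\tau} = \frac{1}{\sqrt{1 + \frac{Q}{E_P}\left(1 + \frac{m_P}{m_T}\right)}} = \frac{1}{\sqrt{1 + \frac{Q}{E_P/m_P}\left(\frac{m_P + m_T}{m_P m_T}\right)}} \tag{11.175}$$
Note that $\tilde{\tau}$ is the same under interchange of the two nuclei at the same incident energy/nucleon, and
that $\tilde{\tau}$ is always larger than or equal to unity since $Q$ is negative. For elastic scattering $\tilde{\tau} = 1$ which gives
$$\theta_T^{lab} = \frac{1}{2}(\pi - \theta) \qquad \text{(Recoil lab angle for elastic scattering)}$$
For the target recoil equation 11.173 can be rewritten as
$$\tan\theta_T^{lab} = \frac{\sin\theta_T^{cm}}{\cos\theta_T^{cm} + \tilde{\tau}} \qquad \text{(Target lab to CM angle conversion)}$$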

Velocity vector hodographs provide useful insight into the behavior of the kinematic solutions. As shown
in figure 11.15, in the center-of-momentum frame the scattered projectile has a fixed final velocity $u_P'$, that
is, the velocity vector describes a circle as a function of $\theta$. The vector addition of this vector and the velocity
of the center-of-mass vector $-\mathbf{u}_T$ gives the laboratory frame velocity $w_P'$. Note that for normal kinematics,
where $m_P < m_T$, then $|u_T| < |u_P'|$ leading to a monotonic one-to-one mapping of the center-of-momentum
angle $\theta$ and $\theta_P^{lab}$. However, for inverse kinematics, where $m_P > m_T$, then $|u_T| > |u_P'|$ leading to two-valued
$\theta$ solutions at any fixed laboratory scattering angle $\theta_P^{lab}$.
Billiard ball collisions are an especially simple example where the two masses are identical and the collision
is essentially elastic. Then essentially $\tau = \tilde{\tau} = 1$, $\theta_P^{lab} = \frac{\theta}{2}$ and $\theta_T^{lab} = \frac{1}{2}\left(\pi - \theta\right)$, that is, the angle
between the scattered billiard balls is $\frac{\pi}{2}$.
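The right-angle result can be checked directly from the tangent relations for the projectile and for the target recoil, setting $\tau = \tilde{\tau} = 1$ and $\theta_T^{cm} = \pi - \theta$. The short derivation below is a sketch that uses only the half-angle identities:

$$\tan\theta_P^{lab} = \frac{\sin\theta}{\cos\theta + 1} = \tan\frac{\theta}{2} \qquad\qquad \tan\theta_T^{lab} = \frac{\sin(\pi - \theta)}{\cos(\pi - \theta) + 1} = \frac{\sin\theta}{1 - \cos\theta} = \cot\frac{\theta}{2}$$

$$\theta_P^{lab} + \theta_T^{lab} = \frac{\theta}{2} + \left(\frac{\pi}{2} - \frac{\theta}{2}\right) = \frac{\pi}{2}$$

The opening angle is therefore independent of the center-of-momentum scattering angle $\theta$ for equal masses and elastic scattering.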
Both normal and inverse kinematics are illustrated in figure 11.16 which shows the dependence of the
projectile and target scattering angles in the laboratory frame as a function of center-of-momentum scattering
angle for the Coulomb scattering of $^{104}$Pd by $^{208}$Pb, that is, for a mass ratio of 2 : 1. Both normal and
inverse kinematics are shown for the same bombarding energy of 4.3 MeV/nucleon for elastic scattering and
for inelastic scattering with a $Q$-value of $-5$ MeV.

Figure 11.17: Recoil energies, in MeV, versus laboratory scattering angle, shown on the left for scattering
of 447 MeV $^{104}$Pd by $^{208}$Pb with $Q = -5.0$ MeV, and shown on the right for scattering of 894 MeV $^{208}$Pb
on $^{104}$Pd with $Q = -5.0$ MeV.

Since $\sin(\theta_T^{cm} - \theta_T^{lab}) \leq 1$ then equation 11.173 implies that $\tilde{\tau}\sin\theta_T^{lab} \leq 1$. Since $\tilde{\tau}$ is always larger than
or equal to unity there is a maximum scattering angle in the laboratory frame for the recoiling target nucleus
given by
$$\sin\theta_T^{max} = \frac{1}{\tilde{\tau}} \tag{11.176}$$
For elastic scattering $\theta_T^{max} = \sin^{-1}\left(\frac{1}{\tilde{\tau}}\right) = 90^\circ$ since $\tilde{\tau} = 1$ for both 894 MeV $^{208}$Pb bombarding $^{104}$Pd, and
the inverse reaction using a 447 MeV $^{104}$Pd beam scattered by a $^{208}$Pb target. A $Q$-value of $-5$ MeV
gives $\tilde{\tau} = 1.002808$ which implies a maximum scattering angle of $\theta_T^{max} = 85.71^\circ$ for both 894 MeV $^{208}$Pb
bombarding $^{104}$Pd, and the inverse reaction of a 447 MeV $^{104}$Pd beam scattered by a $^{208}$Pb target. As a
consequence there are two solutions for $\theta_T^{cm}$ for any allowed value of $\theta_T^{lab}$ as illustrated in figure 11.16.
Since $\sin(\theta_P^{cm} - \theta_P^{lab}) \leq 1$ then equation 11.169 implies that $\tau\sin\theta_P^{lab} \leq 1$. For a 447 MeV $^{104}$Pd beam
scattered by a $^{208}$Pb target $\frac{m_P}{m_T} = 0.50$, thus $\tau = 0.5$ for elastic scattering which implies that there is no
upper bound to $\theta_P^{lab}$. This leads to a one-to-one correspondence between $\theta_P^{lab}$ and $\theta_P^{cm}$ for normal kinematics.
In contrast, the projectile has a maximum scattering angle in the laboratory frame for inverse kinematics
since $\frac{m_P}{m_T} = 2.0$ leading to an upper bound to $\theta_P^{lab}$ given by
$$\sin\theta_P^{max} = \frac{1}{\tau} \tag{11.177}$$

For elastic scattering $\tau = 2$ implying $\theta_P^{max} = 30^\circ$. In addition to having a maximum value for $\theta_P^{lab}$, when
$\tau > 1$ there also are two solutions for $\theta_P^{cm}$ for any allowed value of $\theta_P^{lab}$. For example, 894 MeV $^{208}$Pb
bombarding $^{104}$Pd leads to a maximum projectile scattering angle of $\theta_P^{max} = 30.0^\circ$ for elastic scattering and
$\theta_P^{max} = 29.907^\circ$ for $Q = -5$ MeV.

Kinetic energies

The initial total kinetic energy in the center-of-momentum frame is
$$E_I^{cm} = \frac{m_T}{m_P + m_T}E_P \tag{11.178}$$
The final total kinetic energy in the center-of-momentum frame is
$$E_F^{cm} = E_I^{cm} + Q = \frac{m_T}{m_P + m_T}\tilde{E} \tag{11.179}$$

In the laboratory frame the kinetic energies of the scattered projectile and recoiling target nucleus are
given by
$$E_P^{lab} = \left(\frac{m_T}{m_P + m_T}\right)^2\left(1 + \tau^2 + 2\tau\cos\theta_P^{cm}\right)\tilde{E} \tag{11.180}$$
$$E_T^{lab} = \frac{m_P m_T}{(m_P + m_T)^2}\left(1 + \tilde{\tau}^2 + 2\tilde{\tau}\cos\theta_T^{cm}\right)\tilde{E} \tag{11.181}$$
where $\theta_P^{cm}$ and $\theta_T^{cm}$ are the center-of-mass scattering angles respectively for the scattered projectile and
target nuclei.
For the chosen incident energies the normal and inverse reactions give the same center-of-momentum
energy of 298 MeV which is the energy available to the interaction between the colliding nuclei. However,
the kinetic energy of the center-of-momentum is 447 − 298 = 149 MeV for normal kinematics and 894 − 298 =
596 MeV for inverse kinematics. This trivial center-of-momentum kinetic energy does not contribute to the
reaction. Note that inverse kinematics focusses all the scattered nuclei into the forward hemisphere which
reduces the required solid angle for recoil-particle detection.
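The energy bookkeeping above can be reproduced with a short Python sketch. It assumes non-relativistic kinematics with masses approximated by mass numbers and energies in MeV, and it uses the relations $\tilde{E} = E_P + Q(1 + m_P/m_T)$, $\tilde{\tau} = \sqrt{E_P/\tilde{E}}$ and $\tau = (m_P/m_T)\tilde{\tau}$ together with the laboratory energy expressions for the projectile and target. The function names are illustrative assumptions.

```python
import math

def cm_energy(m_P, m_T, E_P):
    """Initial kinetic energy in the center-of-momentum frame."""
    return m_T / (m_P + m_T) * E_P

def lab_energies(m_P, m_T, E_P, Q, theta_cm):
    """Lab kinetic energies (projectile, target) at CM scattering angle theta_cm."""
    M = m_P + m_T
    E_tilde = E_P + Q * (1.0 + m_P / m_T)
    tau_t = math.sqrt(E_P / E_tilde)          # tau-tilde for the target
    tau = (m_P / m_T) * tau_t                 # tau for the projectile
    theta_P, theta_T = theta_cm, math.pi - theta_cm
    E_P_lab = (m_T / M) ** 2 * (1.0 + tau ** 2
                                + 2.0 * tau * math.cos(theta_P)) * E_tilde
    E_T_lab = m_P * m_T / M ** 2 * (1.0 + tau_t ** 2
                                    + 2.0 * tau_t * math.cos(theta_T)) * E_tilde
    return E_P_lab, E_T_lab

if __name__ == "__main__":
    print(cm_energy(104.0, 208.0, 447.0), cm_energy(208.0, 104.0, 894.0))
    E_P_lab, E_T_lab = lab_energies(104.0, 208.0, 447.0, -5.0, 1.0)
    print(E_P_lab + E_T_lab)   # should equal E_P + Q at every angle
```

The sum of the two laboratory energies equals $E_P + Q$ for any scattering angle, which provides a simple consistency check.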

Solid angles
The laboratory-frame solid angles for the scattered projectile and target are taken to be $d\omega_P$ and $d\omega_T$
respectively, while the center-of-momentum solid angles are $d\Omega_P$ and $d\Omega_T$ respectively. The Jacobian relating
the solid angles is
$$\frac{d\omega_P}{d\Omega_P} = \left(\frac{\sin\theta_P^{lab}}{\sin\theta_P^{cm}}\right)^2\left|\cos(\theta_P^{cm} - \theta_P^{lab})\right| \tag{11.182}$$
$$\frac{d\omega_T}{d\Omega_T} = \left(\frac{\sin\theta_T^{lab}}{\sin\theta_T^{cm}}\right)^2\left|\cos(\theta_T^{cm} - \theta_T^{lab})\right| \tag{11.183}$$
These can be used to transform the calculated center-of-momentum differential cross sections to the
laboratory frame for comparison with measured values. Note that relative to the center-of-momentum frame,
the forward focussing increases the observed differential cross sections in the forward laboratory frame and
decreases them in the backward hemisphere.
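The solid-angle Jacobian can be verified numerically, since for azimuthally symmetric scattering the ratio of solid angles is $|d\cos\theta^{lab}/d\cos\theta^{cm}|$. The Python sketch below compares that finite-difference ratio with the closed-form expression $(\sin\theta^{lab}/\sin\theta^{cm})^2\,|\cos(\theta^{cm} - \theta^{lab})|$, using the tangent relation between the two angles. The function names and the sample values of $\tau$ are assumptions made for illustration.

```python
import math

def lab_angle(theta_cm, tau):
    """Lab angle from the CM angle via tan(theta_lab) = sin/(cos + tau)."""
    return math.atan2(math.sin(theta_cm), math.cos(theta_cm) + tau)

def solid_angle_ratio(theta_cm, tau):
    """Closed-form lab-to-CM solid-angle ratio."""
    theta_lab = lab_angle(theta_cm, tau)
    return ((math.sin(theta_lab) / math.sin(theta_cm)) ** 2
            * abs(math.cos(theta_cm - theta_lab)))

def solid_angle_ratio_numeric(theta_cm, tau, h=1.0e-6):
    """|d cos(theta_lab) / d cos(theta_cm)| from a central difference."""
    d_lab = (math.cos(lab_angle(theta_cm + h, tau))
             - math.cos(lab_angle(theta_cm - h, tau)))
    d_cm = math.cos(theta_cm + h) - math.cos(theta_cm - h)
    return abs(d_lab / d_cm)

if __name__ == "__main__":
    for tau in (0.5, 2.0):
        print(tau, solid_angle_ratio(1.0, tau), solid_angle_ratio_numeric(1.0, tau))
```

Agreement between the two columns for both normal ($\tau < 1$) and inverse ($\tau > 1$) kinematics supports the form of the Jacobian used to transform cross sections between frames.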

Exploitation of two-body kinematics

Computing the above non-trivial transform relations between the center-of-mass and laboratory coordinate
frames for two-body scattering is used extensively in many fields of physics. This discussion has assumed non-
relativistic two-body kinematics. Relativistic two-body kinematics encompasses non-relativistic kinematics
as discussed in chapter 174. Many computer codes are available that can be used for making either non-
relativistic or relativistic transformations.
It is stressed that the underlying physics for two interacting bodies is identical irrespective of whether
the reaction is observed in the center-of-mass or the laboratory coordinate frames. That is, no new physics is
involved in the kinematic transformation. However, the transformation between these frames can dramati-
cally alter the angles and velocities of the observed scattered bodies which can be beneficial for experimental
detection. For example, in heavy-ion nuclear physics the projectile and target nuclei can be interchanged
leading to very diﬀerent velocities and scattering angles in the laboratory frame of reference. This can greatly
facilitate identification and observation of the velocity vectors of the scattered nuclei. In high-energy physics
it is advantageous to collide beams having identical, but opposite, linear momentum vectors, since then the
laboratory frame is the center-of-mass frame, and the energy required to accelerate the colliding bodies is
minimized.

11.14 Summary
This chapter has focussed on the classical mechanics of bodies interacting via conservative, two-body, central
interactions. The following are the main topics presented in this chapter.

Equivalent one-body representation for two bodies interacting via a central interaction
The equivalent one-body representation of the motion of two bodies interacting via a two-body central interaction
greatly simplifies solution of the equations of motion. The position vectors $\mathbf{r}_1$ and $\mathbf{r}_2$ are expressed in terms
of the center-of-mass vector $\mathbf{R}$ plus total mass $M = m_1 + m_2$ while the position vector $\mathbf{r}$ plus associated
reduced mass $\mu = \frac{m_1 m_2}{m_1 + m_2}$ describe the relative motion of the two bodies in the center of mass. The total
Lagrangian then separates into two independent parts
$$L = \frac{1}{2}M\left|\dot{\mathbf{R}}\right|^2 + L_{cm} \tag{11.16}$$
where the center-of-mass Lagrangian is
$$L_{cm} = \frac{1}{2}\mu\left|\dot{\mathbf{r}}\right|^2 - U(r) \tag{11.17}$$
Equations 11.10 and 11.11 can be used to derive the actual spatial trajectories of the two bodies expressed
in terms of $\mathbf{r}_1$ and $\mathbf{r}_2$, from the relative equations of motion, written in terms of $\mathbf{R}$ and $\mathbf{r}$ for the equivalent
one-body solution.

Angular momentum
Noether's theorem shows that the angular momentum is conserved if only a spherically-symmetric
two-body central force acts between the interacting two bodies. The plane of motion is perpendicular
to the angular momentum vector and thus the Lagrangian can be expressed in polar coordinates
as
$$L_{cm} = \frac{1}{2}\mu\left(\dot{r}^2 + r^2\dot{\psi}^2\right) - U(r) \tag{11.22}$$

Differential orbit equation of motion
The Binet transformation $u = \frac{1}{r}$ allows the center-of-mass
Lagrangian $L_{cm}$ for a central force $\mathbf{F} = F(r)\hat{\mathbf{r}}$ to be used to express the differential orbit equation for the
radial motion as
$$\frac{d^2u}{d\psi^2} + u = -\frac{\mu}{l^2}\frac{1}{u^2}F\left(\frac{1}{u}\right) \tag{11.39}$$
The Lagrangian and the Hamiltonian were both used to derive the equations of motion for two bodies
interacting via a two-body, conservative, central interaction. The general features of the conservation of angular
momentum and conservation of energy for a two-body, central potential were presented.

Inverse-square, two-body, central force
The inverse-square, two-body, central force is of pivotal importance
in nature since it applies to both the gravitational force and the Coulomb force. The underlying
symmetries of the inverse-square, two-body, central interaction lead to conservation of angular momentum,
conservation of energy, Gauss's law, and two-body orbits that are closed, degenerate
conic sections for which the eccentricity vector is conserved. The radial dependence, relative to the force
center lying at one focus of the conic section, is given by
$$\frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \epsilon\cos(\psi - \psi_0)\right] \tag{11.58}$$
where the orbit eccentricity $\epsilon$ equals
$$\epsilon = \sqrt{1 + \frac{2E_{cm}l^2}{\mu k^2}} \tag{11.62}$$
These lead to Kepler's three laws of motion for two bodies in a bound orbit due to the attractive gravitational
force for which $k = -Gm_1m_2$. The inverse-square law is special in that the eccentricity vector $\mathbf{A}$ is a third
invariant of the motion, where
$$\mathbf{A} \equiv (\mathbf{p} \times \mathbf{L}) + (\mu k\hat{\mathbf{r}}) \tag{11.86}$$

The eccentricity vector unambiguously defines the orientation and direction of the major axis of the elliptical
orbit. The invariance of the eccentricity vector, and the existence of stable closed orbits, are manifestations
of the dynamical O(4) symmetry.

Isotropic, harmonic, two-body, central force
The isotropic, harmonic, two-body, central interaction
is of interest since, like the inverse-square law force, it leads to closed elliptical orbits described by
$$\frac{1}{r^2} = \frac{E\mu}{l^2}\left(1 + \left(1 + \frac{kl^2}{E^2\mu}\right)^{\frac{1}{2}}\cos 2(\psi - \psi_0)\right) \tag{11.107}$$
where the eccentricity $\epsilon$ is given by
$$\left(1 + \frac{kl^2}{E^2\mu}\right)^{\frac{1}{2}} = \frac{\epsilon^2}{2 - \epsilon^2} \tag{11.108}$$
The harmonic force orbits are distinctly different from those for the inverse-square law in that the force center
is at the center of the ellipse, rather than at the focus for the inverse-square law force. This elliptical orbit
is reflection symmetric for the harmonic force, but not for the inverse square force. The isotropic harmonic
two-body force leads to invariance of the symmetry tensor, $\mathbf{A}'$, which is an invariant of the motion analogous
to the eccentricity vector $\mathbf{A}$. This leads to stable closed orbits, which are manifestations of the dynamical
SU(3) symmetry.

Orbit stability
Bertrand's theorem states that only the inverse square law and the linear radial dependences
of the central forces lead to stable closed bound orbits that do not precess. These are manifestations
of the dynamical symmetries that occur for these two specific radial forms of two-body forces.

The three-body problem
The difficulties encountered in solving the equations of motion for three bodies
that are interacting via two-body central forces were discussed. The three-body motion can include the
existence of chaotic motion. It was shown that solution of the three-body problem is simplified if either the
planar approximation, or the restricted three-body approximation, is applicable.

Two-body scattering
The total and differential two-body scattering cross sections were introduced. It
was shown that for the inverse-square law force there is a simple relation between the impact parameter $b$
and scattering angle $\theta$ given by
$$b = \frac{k}{2E_{cm}}\cot\frac{\theta}{2} \tag{11.155}$$
This led to the solution for the differential scattering cross-section for Rutherford scattering due to the
Coulomb interaction.
$$\frac{d\sigma}{d\Omega} = \frac{1}{4}\left(\frac{k}{2E_{cm}}\right)^2\frac{1}{\sin^4\frac{\theta}{2}} \tag{11.159}$$
This cross section assumes elastic scattering by a repulsive two-body inverse-square central force. For
scattering of nuclei in the Coulomb potential the constant $k$ is given to be
$$k = \frac{Z_P Z_T e^2}{4\pi\epsilon_0} \tag{11.160}$$
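The three relations above combine into a compact numerical recipe. The Python sketch below is a minimal illustration, assuming energies in MeV and lengths in fm, so that $e^2/4\pi\epsilon_0 \approx 1.44$ MeV fm and the cross section is in fm$^2$ per steradian. The function names, and the choice of 7.7 MeV $\alpha$ particles on gold (mass numbers 4 and 197) as the sample case, are assumptions made for illustration.

```python
import math

E2_OVER_4PI_EPS0 = 1.44   # MeV fm, approximate value of e^2/(4 pi epsilon_0)

def coulomb_constant(Z_P, Z_T):
    """k = Z_P Z_T e^2 / (4 pi epsilon_0) in MeV fm."""
    return Z_P * Z_T * E2_OVER_4PI_EPS0

def impact_parameter(theta, k, E_cm):
    """b = (k / 2 E_cm) cot(theta/2), in fm."""
    return k / (2.0 * E_cm) / math.tan(theta / 2.0)

def rutherford(theta, k, E_cm):
    """Rutherford differential cross section in fm^2/sr."""
    return 0.25 * (k / (2.0 * E_cm)) ** 2 / math.sin(theta / 2.0) ** 4

if __name__ == "__main__":
    k = coulomb_constant(2, 79)               # alpha particle on gold
    E_cm = 7.7 * 197.0 / (4.0 + 197.0)        # CM energy for a 7.7 MeV beam
    theta = math.radians(90.0)
    print(impact_parameter(theta, k, E_cm), rutherford(theta, k, E_cm))
    print(k / E_cm)   # head-on distance of closest approach, in fm
```

The head-on distance of closest approach, $k/E_{cm}$, comes out at a few times $10^{-14}$ m for this sample case, consistent with the length scale probed in the Geiger and Marsden measurements.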

Two-body kinematics
The transformation from the center-of-momentum frame to the laboratory frame of
reference was introduced. Such transformations are used extensively in many fields of physics for theoretical
modelling of scattering, and for analysis of experimental data.

Workshop exercises
1. Listed below are several statements concerning central force motion. For each statement, give the reason for
why the statement is true. If a statement is only true in certain situations, then explain when it holds and
when it doesn't. The system referred to below consists of mass $m_1$ located at $\mathbf{r}_1$ and mass $m_2$ located at $\mathbf{r}_2$.

• The potential energy of the system depends only on the difference $\mathbf{r}_1 - \mathbf{r}_2$, not on $\mathbf{r}_1$ and $\mathbf{r}_2$ separately.
• The potential energy of the system depends only on the magnitude of $\mathbf{r}_1 - \mathbf{r}_2$, not the direction.
• It is possible to choose an inertial reference frame in which the center of mass of the system is at rest.
• The total energy of the system is conserved.
• The total angular momentum of the system is conserved.
2. A particle of mass $m$ moves in a potential $U(r) = -U_0 e^{-\lambda^2 r^2}$.

(a) Given the constant $l$, find an implicit equation for the radius of the circular orbit. A circular orbit at
$r = \rho$ is possible if
$$\left.\left(\frac{\partial U_{eff}}{\partial r}\right)\right|_{r=\rho} = 0$$
where $U_{eff}$ is the effective potential.
(b) What is the largest value of $l$ for which a circular orbit exists? What is the value of the effective potential
at this critical orbit?

3. A particle of mass $m$ is observed to move in a spiral orbit given by the equation $r = k\theta$, where $k$ is a constant.
Is it possible to have such an orbit in a central force field? If so, determine the form of the force function.

4. The interaction energy between two atoms of mass $m$ is given by the Lennard-Jones potential,
$U(r) = \epsilon\left[(r_0/r)^{12} - 2(r_0/r)^{6}\right]$

(a) Determine the Lagrangian of the system where $\mathbf{r}_1$ and $\mathbf{r}_2$ are the positions of the first and second mass,
respectively.
(b) Rewrite the Lagrangian as a one-body problem in which the center-of-mass is stationary.
(c) Determine the equilibrium point and show that it is stable.
(d) Determine the frequency of small oscillations about the stable point.

5. Consider two bodies of mass $m$ in circular orbit of radius $r_0/2$, attracted to each other by a force $F(r)$, where
$r$ is the distance between the masses.

(a) Determine the Lagrangian of the system in the center-of-mass frame (Hint: a one-body problem subject
to a central force).

(b) Determine the angular momentum. Is it conserved?

(d) Expand your result in (c) about an equilibrium radius $r_0$ and show that the condition for stability
is $\frac{F'(r_0)}{F(r_0)} + \frac{3}{r_0} > 0$

6. Consider two charges of equal magnitude $q$ connected by a spring of spring constant $k'$ in circular orbit. Can
the charges oscillate about some equilibrium? If so, what condition must be satisfied?

7. Consider a mass $m$ in orbit around a mass $M$, which is subject to a force $\mathbf{F} = -\frac{k}{r^2}\hat{\mathbf{r}}$, where $r$ is the distance
between the masses. Show that the eccentricity vector $\mathbf{A} = \mathbf{p} \times \mathbf{L} - \mu k\hat{\mathbf{r}}$ is conserved.

Problems
1. Show that the areal velocity is constant for a particle moving under the influence of an attractive force given
by $F(r) = -kr$. Calculate the time averages of the kinetic and potential energies and compare with the
results of the virial theorem.

2. Assume that the Earth’s orbit is circular and that the Sun’s mass suddenly decreases by a factor of two. (a)
What orbit will the earth then have? (b) Will the Earth escape the solar system?

3. Discuss the motion of a particle in a central inverse-square-law force field for a superimposed force whose
magnitude is inversely proportional to the cube of the distance from the particle to force center; that is
$$F(r) = -\frac{k}{r^2} - \frac{\lambda}{r^3} \qquad (k, \lambda > 0)$$
Show that the motion is described by a precessing ellipse. Consider the cases
a) $\lambda < \frac{l^2}{\mu}$, b) $\lambda = \frac{l^2}{\mu}$, c) $\lambda > \frac{l^2}{\mu}$, where $l$ is the angular momentum and $\mu$ the reduced mass.

4. A communications satellite is in a circular orbit around the earth at a radius $R$ and velocity $v$. A rocket
accidentally fires quite suddenly, giving the rocket an outward velocity $v$ in addition to its original tangential
velocity $v$.
a) Calculate the ratio of the new energy and angular momentum to the old.
b) Describe the subsequent motion of the satellite and plot $T(r)$, $U(r)$, the net effective potential, and $E(r)$
after the rocket fires.

5. Two identical point objects, each of mass $m$, are bound by a linear two-body force $\mathbf{F} = -k\mathbf{r}$ where $\mathbf{r}$ is the
vector distance between the two point objects. The two point objects each slide on a horizontal frictionless
plane subject to a vertical gravitational field $\mathbf{g}$. The two-body system is free to translate, rotate and oscillate
on the surface of the frictionless plane.

a) Derive the Lagrangian for the complete system including translation and relative motion.
b) Use Noether’s theorem to identify all constants of motion.
c) Use the Lagrangian to derive the equations of motion for the system.
d) Derive the generalized momenta and the corresponding Hamiltonian.
e) Derive the period for small amplitude oscillations of the relative motion of the two masses.

6. A bound binary star system comprises two spherical stars of mass $m_1$ and $m_2$ bound by their mutual
gravitational attraction. Assume that the only force acting on the stars is their mutual gravitational attraction and let
$r$ be the instantaneous separation distance between the centers of the two stars where $r$ is much larger than
the sum of the radii of the stars.

a) Show that the two-body motion of the binary star system can be represented by an equivalent one-body system
and derive the Lagrangian for this system.
b) Show that the motion for the equivalent one-body system in the center of mass frame lies entirely in a plane
and derive the angle between the normal to the plane and the angular momentum vector.
c) Show whether $H$ is a constant of motion and whether it equals the total energy.
d) It is known that a solution to the equation of motion for the equivalent one-body orbit for this gravitational
force has the form
$$\frac{1}{r} = -\frac{\mu k}{l^2}\left[1 + \epsilon\cos\psi\right]$$
and that the angular momentum is a constant of motion $L = l$. Use these to prove that the attractive force leading
to this bound orbit is
$$\mathbf{F} = \frac{k}{r^2}\hat{\mathbf{r}}$$
where $k$ must be negative.

7. When performing the Rutherford experiment, Geiger and Marsden scattered 7.7 MeV $^{4}$He particles (alpha
particles) from $^{238}$U at a scattering angle in the laboratory frame of $\theta = 90^\circ$. Derive the following observables
as measured in the laboratory frame.

(a) The recoil scattering angle of the $^{238}$U in the laboratory frame.
(b) The scattering angles of the $^{4}$He and $^{238}$U in the center-of-mass frame
(c) The kinetic energies of the $^{4}$He and $^{238}$U in the laboratory frame
(d) The impact parameter
(e) The distance of closest approach $r_{min}$
Chapter 12

Non-inertial reference frames

12.1 Introduction
Newton’s Laws of motion apply only to inertial frames of reference. Inertial frames of reference make it
possible to use either Newton’s laws of motion, or Lagrangian, or Hamiltonian mechanics, to develop the
necessary equations of motion. There are certain situations where it is much more convenient to treat the
motion in a non-inertial frame of reference. Examples are motion in frames of reference undergoing trans-
lational acceleration, rotating frames of reference, or frames undergoing both translational and rotational
motion. This chapter will analyze the behavior of dynamical systems in accelerated frames of reference,
especially rotating frames such as on the surface of the Earth. Newtonian mechanics, as well as the La-
grangian and Hamiltonian approaches, will be used to handle motion in non-inertial reference frames by
introducing extra inertial forces that correct for the fact that the motion is being treated with respect to a
non-inertial reference frame. These inertial forces are often called fictitious even though they appear real in
the non-inertial frame. The underlying reasons for each of the inertial forces will be discussed followed by a
presentation of important applications.

12.2 Translational acceleration of a reference frame

Consider an inertial system $(x_{fix}, y_{fix}, z_{fix})$ which is fixed in space, and a non-inertial system
$(x'_{mov}, y'_{mov}, z'_{mov})$ that is moving in a direction relative to the fixed frame such as
to maintain constant orientations of the axes relative to the fixed frame, as illustrated in figure 12.1. The fixed frame is
designated to be the unprimed frame and, to avoid confusion, the subscript $fix$ is attached to the fixed coordinates
taken with respect to the fixed coordinate frame. Similarly, the translating reference frame, which is undergoing
translational acceleration, has the subscript $mov$ attached to the coordinates taken with respect to the translating frame of
reference. Newton's Laws of motion are obeyed only in the inertial (unprimed) reference frame. The respective position
vectors are related by
$$\mathbf{r}_{fix} = \mathbf{R}_{fix} + \mathbf{r}'_{mov} \tag{12.1}$$
where $\mathbf{r}_{fix}$ is the vector relative to the fixed frame, $\mathbf{r}'_{mov}$ is the vector relative to the translationally accelerating frame
and $\mathbf{R}_{fix}$ is the vector from the origin of the fixed frame to the origin of the accelerating frame. Differentiating equation
12.1 gives the velocity vector relation
$$\mathbf{v}_{fix} = \mathbf{V}_{fix} + \mathbf{v}'_{mov} \tag{12.2}$$
where $\mathbf{v}_{fix} = \frac{d\mathbf{r}_{fix}}{dt}$, $\mathbf{v}'_{mov} = \frac{d\mathbf{r}'_{mov}}{dt}$ and $\mathbf{V}_{fix} = \frac{d\mathbf{R}_{fix}}{dt}$. Similarly the acceleration vector relation is
$$\mathbf{a}_{fix} = \mathbf{A}_{fix} + \mathbf{a}'_{mov} \tag{12.3}$$
where $\mathbf{a}_{fix} = \frac{d^2\mathbf{r}_{fix}}{dt^2}$, $\mathbf{a}'_{mov} = \frac{d^2\mathbf{r}'_{mov}}{dt^2}$ and $\mathbf{A}_{fix} = \frac{d^2\mathbf{R}_{fix}}{dt^2}$.

Figure 12.1: Inertial reference frame (unprimed), and translational accelerating frame (primed).
In the fixed frame, Newton's laws give that
$$\mathbf{F}_{fix} = m\mathbf{a}_{fix} \tag{12.4}$$
The force in the fixed frame can be separated into two terms, one due to the acceleration $\mathbf{A}_{fix}$ of the accelerating frame of reference and one due to the acceleration $\mathbf{a}'_{mov}$ with respect to the accelerating frame,
$$\mathbf{F}_{fix} = m\mathbf{A}_{fix} + m\mathbf{a}'_{mov} \tag{12.5}$$
Relative to the accelerating reference frame the acceleration is given by
$$m\mathbf{a}'_{mov} = \mathbf{F}_{fix} - m\mathbf{A}_{fix} \tag{12.6}$$
The accelerating frame of reference can exploit Newton's Laws of motion using an effective translational force $\mathbf{F}'_{mov} \equiv \mathbf{F}_{fix} - m\mathbf{A}_{fix}$. The additional $-m\mathbf{A}_{fix}$ term is called an inertial force; it can be altered by choosing a different non-inertial frame of reference, that is, it is dependent on the frame of reference in which the observer is situated.
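As a rough numerical illustration of equation 12.6, the following sketch computes the acceleration seen in a translating frame and checks it against a finite-difference second derivative of $\mathbf{r}'_{mov} = \mathbf{r}_{fix} - \mathbf{R}_{fix}$. The function names, the mass, the force, and the frame acceleration are all illustrative assumptions, not values taken from the text.

```python
def accel_in_translating_frame(force, mass, frame_accel):
    """a'_mov = F_fix / m - A_fix, i.e. equation 12.6 divided by m."""
    return tuple(f / mass - a for f, a in zip(force, frame_accel))


def position(t, r0, v0, acc):
    """Constant-acceleration position r0 + v0 t + a t^2 / 2, per component."""
    return tuple(x + v * t + 0.5 * a * t * t for x, v, a in zip(r0, v0, acc))


# Illustrative (assumed) values: a 2 kg mass under gravity, frame accelerating along x.
mass = 2.0
force = (0.0, -19.6, 0.0)
frame_accel = (3.0, 0.0, 0.0)
a_fix = tuple(f / mass for f in force)

a_prime = accel_in_translating_frame(force, mass, frame_accel)


def relative_position(t):
    """r'_mov = r_fix - R_fix for a frame origin that starts at rest at the origin."""
    r_fix = position(t, (0.0, 0.0, 0.0), (1.0, 2.0, 0.0), a_fix)
    big_r = position(t, (0.0, 0.0, 0.0), (0.0, 0.0, 0.0), frame_accel)
    return tuple(p - q for p, q in zip(r_fix, big_r))


# Second central difference of r'_mov approximates a'_mov.
h = 1e-3
second_difference = tuple(
    (p - 2.0 * q + s) / h**2
    for p, q, s in zip(relative_position(1.0 + h),
                       relative_position(1.0),
                       relative_position(1.0 - h))
)
print(a_prime, second_difference)
```

The inertial term $-\mathbf{A}_{fix}$ appears in `a_prime` even though no physical force acts along $x$.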

12.3 Rotating reference frame

Consider a rotating frame of reference which will be designated as the double-primed (rotating) frame to differentiate it from the non-rotating primed (moving) frame. Both frames may be undergoing translational acceleration relative to the inertial fixed unprimed frame as described above.

12.3.1 Spatial time derivatives in a rotating, non-translating, reference frame

For simplicity assume that $\mathbf{R}_{fix} = \mathbf{V}_{fix} = 0$, that is, the primed reference frame is stationary and identical to the fixed stationary unprimed frame. The double-primed (rotating) frame is a non-inertial frame rotating with respect to the origin of the fixed primed frame. Appendix 23 shows that an infinitesimal rotation $d\theta$ about an instantaneous axis of rotation leads to an infinitesimal displacement $d\mathbf{r}^{R}$ where

$$d\mathbf{r}^{R} = d\boldsymbol{\theta} \times \mathbf{r}'_{mov} \tag{12.7}$$

Consider that during a time $dt$ the position vector in the fixed primed reference frame moves by an arbitrary infinitesimal distance $d\mathbf{r}'_{mov}$. As illustrated in figure 12.2, this infinitesimal distance in the primed non-rotating frame can be split into two parts:
a) $d\mathbf{r}^{R} = d\boldsymbol{\theta} \times \mathbf{r}'_{mov}$ which is due to rotation of the rotating frame with respect to the translating primed frame.
b) $d\mathbf{r}''_{rot}$ which is the motion with respect to the rotating (double-primed) frame.
That is, the motion has been arbitrarily divided into a part that is due to the rotation of the double-primed frame, plus the vector displacement measured in this rotating (double-primed) frame. It is always possible to make such a decomposition of the displacement as long as the vector sum can be written as

$$d\mathbf{r}'_{mov} = d\mathbf{r}''_{rot} + d\boldsymbol{\theta} \times \mathbf{r}'_{mov} \tag{12.8}$$

Figure 12.2: Infinitesimal displacement in the non-rotating primed frame and in the rotating double-primed reference frame.

Since $d\boldsymbol{\theta} = \boldsymbol{\omega}\,dt$ then the time differential of the displacement, equation 12.8, can be written as
$$\left(\frac{d\mathbf{r}'}{dt}\right)_{mov} = \left(\frac{d\mathbf{r}''}{dt}\right)_{rot} + \boldsymbol{\omega} \times \mathbf{r}'_{mov} \tag{12.9}$$
The important conclusion is that a velocity measured in a non-rotating reference frame $\left(\frac{d\mathbf{r}'}{dt}\right)_{mov}$ can be expressed as the sum of the velocity $\left(\frac{d\mathbf{r}''}{dt}\right)_{rot}$ measured relative to a rotating frame, plus the term $\boldsymbol{\omega} \times \mathbf{r}'_{mov}$ which accounts for the rotation of the frame. The division of the $d\mathbf{r}'_{mov}$ vector into two parts, a part due to rotation of the frame plus a part with respect to the rotating frame, is valid for any vector as shown below.

12.3.2 General vector in a rotating, non-translating, reference frame

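Consider an arbitrary vector $\mathbf{G}$ which can be expressed in terms of components along the three unit vector basis $\hat{\mathbf{e}}_i^{fix}$ in the fixed inertial frame as
$$\mathbf{G} = \sum_{i=1}^{3} (G_i)_{fix}\, \hat{\mathbf{e}}_i^{fix} \tag{12.10}$$
Neglecting translational motion, it can also be expressed in terms of the three unit vectors $\hat{\mathbf{e}}_i^{rot}$ of the non-inertial rotating frame as
$$\mathbf{G} = \sum_{i=1}^{3} (G_i)_{rot}\, \hat{\mathbf{e}}_i^{rot} \tag{12.11}$$
Since the unit basis vectors $\hat{\mathbf{e}}_i^{rot}$ are constant in the rotating frame, that is,
$$\left(\frac{d\hat{\mathbf{e}}_i^{rot}}{dt}\right)_{rot} = 0 \tag{12.12}$$
then the time derivative of $\mathbf{G}$ in the rotating coordinate system $\hat{\mathbf{e}}_i^{rot}$ can be written as
$$\left(\frac{d\mathbf{G}}{dt}\right)_{rot} = \sum_{i=1}^{3} \left(\frac{dG_i}{dt}\right)_{rot} \hat{\mathbf{e}}_i^{rot} \tag{12.13}$$
The inertial-frame time derivative taken with components along the rotating coordinate basis $\hat{\mathbf{e}}_i^{rot}$, equation 12.11, is
$$\left(\frac{d\mathbf{G}}{dt}\right)_{fix} = \sum_{i=1}^{3} \left(\frac{dG_i}{dt}\right)_{rot} \hat{\mathbf{e}}_i^{rot} + \sum_{i=1}^{3} (G_i)_{rot} \left(\frac{d\hat{\mathbf{e}}_i^{rot}}{dt}\right)_{fix} \tag{12.14}$$
Substituting the unit vector $\hat{\mathbf{e}}_i^{rot}$ for $\mathbf{r}'_{mov}$ in equation 12.9, and using equation 12.12, gives
$$\left(\frac{d\hat{\mathbf{e}}_i^{rot}}{dt}\right)_{fix} = \boldsymbol{\omega} \times \hat{\mathbf{e}}_i^{rot} \tag{12.15}$$
Substituting this into the second term of equation 12.14 gives
$$\left(\frac{d\mathbf{G}}{dt}\right)_{fix} = \left(\frac{d\mathbf{G}}{dt}\right)_{rot} + \boldsymbol{\omega} \times \mathbf{G} \tag{12.16}$$
This important identity relates the time derivatives of any vector expressed in both the inertial frame and the rotating non-inertial frame bases. Note that the $\boldsymbol{\omega} \times \mathbf{G}$ term originates from the fact that the unit basis vectors of the rotating reference frame are time dependent with respect to the non-rotating frame basis vectors as given by equation (12.15). Equation (12.16) is used extensively for problems involving rotating frames. For example, for the special case where $\mathbf{G} = \mathbf{r}'$, equation (12.16) relates the velocity vectors in the fixed and rotating frames as given in equation (12.9).
Another example is the vector $\dot{\boldsymbol{\omega}}$
$$\dot{\boldsymbol{\omega}}_{fix} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{fix} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rot} + \boldsymbol{\omega} \times \boldsymbol{\omega} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rot} = \dot{\boldsymbol{\omega}}_{rot} \tag{12.17}$$
That is, the angular acceleration $\dot{\boldsymbol{\omega}}$ has the same value in both the fixed and rotating frames of reference.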
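The identity in equation 12.16 can be checked numerically. The sketch below assumes a frame rotating about the $z$ axis at a constant rate and an arbitrary, made-up time dependence for the rotating-frame components of $\mathbf{G}$. It compares a finite-difference derivative in the fixed basis against $(d\mathbf{G}/dt)_{rot} + \boldsymbol{\omega} \times \mathbf{G}$. All names and numbers are illustrative.

```python
import math

OMEGA = 0.7                      # assumed rotation rate about z (rad/s)
OMEGA_VEC = (0.0, 0.0, OMEGA)


def cross(a, b):
    """Vector product a x b for 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotate_z(vec, angle):
    """Convert components along the rotating basis into fixed-basis components."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = vec
    return (c * x - s * y, s * x + c * y, z)


def g_rot(t):
    """Components of G along the rotating basis (arbitrary example)."""
    return (1.0 + t, 2.0 * t * t, 0.5)


def g_rot_dot(t):
    """Time derivative of those components, i.e. (dG/dt)_rot in the rotating basis."""
    return (1.0, 4.0 * t, 0.0)


def g_fix(t):
    """The same vector G expressed in the fixed basis."""
    return rotate_z(g_rot(t), OMEGA * t)


def fixed_derivative_numeric(t, h=1e-5):
    """Central-difference estimate of (dG/dt)_fix."""
    ahead, behind = g_fix(t + h), g_fix(t - h)
    return tuple((p - q) / (2.0 * h) for p, q in zip(ahead, behind))


def fixed_derivative_identity(t):
    """Right-hand side of equation 12.16, expressed in the fixed basis."""
    relative = rotate_z(g_rot_dot(t), OMEGA * t)
    rotation = cross(OMEGA_VEC, g_fix(t))
    return tuple(p + q for p, q in zip(relative, rotation))


numeric = fixed_derivative_numeric(1.3)
identity = fixed_derivative_identity(1.3)
print(numeric, identity)
```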

12.4 Reference frame undergoing rotation plus translation

Consider the case where the system is accelerating in translation as well as rotating, that is, the primed frame is the non-rotating translating frame. The position vector $\mathbf{r}_{fix}$ is taken with respect to the inertial fixed unprimed frame and can be written in terms of the fixed unit basis vectors $(\hat{\mathbf{i}}_{fix}, \hat{\mathbf{j}}_{fix}, \hat{\mathbf{k}}_{fix})$. This $\mathbf{r}_{fix}$ vector can be written as the vector sum of the translational motion $\mathbf{R}_{fix}$ of the origin of the rotating system with respect to the fixed frame, plus the position $\mathbf{r}'_{mov}$ with respect to this translating primed frame basis
$$\mathbf{r}_{fix} = \mathbf{R}_{fix} + \mathbf{r}'_{mov} \tag{12.18}$$
The time differential is
$$\left(\frac{d\mathbf{r}}{dt}\right)_{fix} = \left(\frac{d\mathbf{R}}{dt}\right)_{fix} + \left(\frac{d\mathbf{r}'}{dt}\right)_{mov} \tag{12.19}$$
The vector $\mathbf{r}'_{mov}$ is the position with respect to the translating frame of reference which can be expressed in terms of the unit vectors $(\hat{\mathbf{i}}'_{mov}, \hat{\mathbf{j}}'_{mov}, \hat{\mathbf{k}}'_{mov})$.
Equation 12.19 takes into account the translational motion of the moving primed frame basis. Now, assuming that the double-primed frame rotates about the origin of the moving primed frame, the net displacement with respect to the original inertial frame basis can be combined with equation 12.9 leading to the relation
$$\left(\frac{d\mathbf{r}}{dt}\right)_{fix} = \left(\frac{d\mathbf{R}}{dt}\right)_{fix} + \left(\frac{d\mathbf{r}''}{dt}\right)_{rot} + \boldsymbol{\omega} \times \mathbf{r}'_{mov} \tag{12.20}$$
Here the double-primed frame is both rotating and translating. Vectors in this frame are expressed in terms of the unit basis vectors $(\hat{\mathbf{i}}''_{rot}, \hat{\mathbf{j}}''_{rot}, \hat{\mathbf{k}}''_{rot})$.
Expressed as velocities, equation 12.20 can be written as
$$\mathbf{v}_{fix} = \mathbf{V}_{fix} + \mathbf{v}''_{rot} + \boldsymbol{\omega} \times \mathbf{r}'_{mov} \tag{12.21}$$
where:
$\mathbf{v}_{fix}$ is the velocity measured with respect to the inertial (unprimed) frame basis.
$\mathbf{V}_{fix}$ is the velocity of the origin of the non-inertial translating (primed) frame basis with respect to the origin of the inertial (unprimed) frame basis.
$\mathbf{v}''_{rot}$ is the velocity of the particle with respect to the non-inertial rotating (double-primed) frame basis, the origin of which is both translating and rotating.
$\boldsymbol{\omega} \times \mathbf{r}'_{mov}$ is the motion of the rotating (double-primed) frame with respect to the linearly-translating (primed) frame basis.
Thus this relation takes into account both the translational velocity and the rotation of the reference coordinate frame basis vectors.

12.5 Newton’s law of motion in a non-inertial frame

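The acceleration of the system in the rotating non-inertial frame can be derived by differentiating the general velocity relation for $\mathbf{v}_{fix}$, equation 12.21, in the fixed frame basis which gives
$$\mathbf{a}_{fix} = \left(\frac{d\mathbf{v}_{fix}}{dt}\right)_{fix} = \left(\frac{d\mathbf{V}_{fix}}{dt}\right)_{fix} + \left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{fix} + \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{fix} \times \mathbf{r}'_{mov} + \boldsymbol{\omega} \times \left(\frac{d\mathbf{r}'_{mov}}{dt}\right)_{fix} \tag{12.22}$$

Now we wish to use the general transformation to a rotating frame basis, which requires inclusion of the time dependence of the unit vectors in the rotating frame, that is,
$$\left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{fix} = \left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{rot} + \boldsymbol{\omega} \times \mathbf{v}''_{rot} \tag{12.23}$$
$$\left(\frac{d\boldsymbol{\omega}}{dt}\right)_{fix} \times \mathbf{r}'_{mov} = \left(\frac{d\boldsymbol{\omega}}{dt}\right)_{rot} \times \mathbf{r}'_{mov} \tag{12.24}$$
$$\boldsymbol{\omega} \times \left(\frac{d\mathbf{r}'_{mov}}{dt}\right)_{fix} = \boldsymbol{\omega} \times \mathbf{v}''_{rot} + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}'_{mov}) \tag{12.25}$$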

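Using equations 12.23, 12.24, and 12.25 gives

$$\mathbf{a}_{fix} = \mathbf{A}_{fix} + \mathbf{a}''_{rot} + 2\boldsymbol{\omega} \times \mathbf{v}''_{rot} + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}'_{mov}) + \dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov} \tag{12.26}$$
where the acceleration in the rotating frame is $\mathbf{a}''_{rot} = \left(\frac{d\mathbf{v}''_{rot}}{dt}\right)_{rot}$ while the velocity is $\mathbf{v}''_{rot} = \left(\frac{d\mathbf{r}''_{rot}}{dt}\right)_{rot}$ and $\mathbf{A}_{fix}$ is with respect to the fixed frame.
Newton's laws of motion are obeyed in the inertial frame, that is

$$\mathbf{F}_{fix} = m\mathbf{a}_{fix} = m\left(\mathbf{A}_{fix} + \mathbf{a}''_{rot} + 2\boldsymbol{\omega} \times \mathbf{v}''_{rot} + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}'_{mov}) + \dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov}\right) \tag{12.27}$$

In the double-primed frame, which may be both rotating and accelerating in translation, one can ascribe an effective force $\mathbf{F}^{eff}_{rot}$ that obeys an effective Newton's law for the acceleration $\mathbf{a}''_{rot}$ in the rotating frame

$$\mathbf{F}^{eff}_{rot} = m\mathbf{a}''_{rot} = \mathbf{F}_{fix} - m\left(\mathbf{A}_{fix} + 2\boldsymbol{\omega} \times \mathbf{v}''_{rot} + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}'_{mov}) + \dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov}\right) \tag{12.28}$$

Note that the effective force $\mathbf{F}^{eff}_{rot}$ comprises the physical force $\mathbf{F}_{fix}$ minus four non-inertial forces that are introduced to correct for the fact that the rotating reference frame is a non-inertial frame.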

12.6 Lagrangian mechanics in a non-inertial frame

The above derivation of the equations of motion in the rotating frame is based on Newtonian mechanics.
Lagrangian mechanics provides another derivation of these equations of motion for a rotating frame of
reference by exploiting the fact that the Lagrangian is a scalar which is frame independent, that is, it is
invariant to rotation of the frame of reference.
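The Lagrangian in any frame is given by
$$L = \frac{1}{2} m \mathbf{v}_{fix} \cdot \mathbf{v}_{fix} - U(r) \tag{12.29}$$
The scalar product $\mathbf{v}_{fix} \cdot \mathbf{v}_{fix}$ is the same in any rotated frame and can be evaluated in terms of the rotating frame variables using the same decomposition of the translational plus rotational motion as used previously and given in equation 12.21.
Equation (12.21) decomposes the velocity in the fixed inertial frame $\mathbf{v}_{fix}$ into three vector terms: the translational velocity $\mathbf{V}_{fix}$ of the translating frame, the velocity in the rotating-translating frame $\mathbf{v}''_{rot}$, and the rotational velocity $(\boldsymbol{\omega} \times \mathbf{r}'_{mov})$. Using equations 12.29 and 12.21, plus appendix equation 21 for the triple products, gives that the Lagrangian evaluated using $\mathbf{v}_{fix} \cdot \mathbf{v}_{fix}$ equals

$$L = \frac{1}{2} m \left[\mathbf{V}_{fix} \cdot \mathbf{V}_{fix} + \mathbf{v}''_{rot} \cdot \mathbf{v}''_{rot} + 2\mathbf{V}_{fix} \cdot \mathbf{v}''_{rot} + 2\mathbf{V}_{fix} \cdot (\boldsymbol{\omega} \times \mathbf{r}'_{mov}) + 2\mathbf{v}''_{rot} \cdot (\boldsymbol{\omega} \times \mathbf{r}'_{mov}) + (\boldsymbol{\omega} \times \mathbf{r}'_{mov})^2\right] - U(r) \tag{12.30}$$
This can be used to derive the canonical momentum in the rotating frame
$$\mathbf{p}''_{rot} = \frac{\partial L}{\partial \mathbf{v}''_{rot}} = m\left[\mathbf{V}_{fix} + \mathbf{v}''_{rot} + \boldsymbol{\omega} \times \mathbf{r}'_{mov}\right] \tag{12.31}$$
The Lagrange equations can be used to derive the equations of motion in terms of the variables evaluated in the rotating reference frame. The required Lagrange derivatives are
$$\frac{d}{dt} \frac{\partial L}{\partial \mathbf{v}''_{rot}} = m\left[\mathbf{A}_{rot} + \mathbf{a}''_{rot} + (\boldsymbol{\omega} \times \mathbf{v}''_{rot}) + (\dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov})\right] \tag{12.32}$$
and
$$\frac{\partial L}{\partial \mathbf{r}'_{mov}} = -m\left[(\boldsymbol{\omega} \times \mathbf{V}_{fix}) + (\boldsymbol{\omega} \times \mathbf{v}''_{rot}) + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}'_{mov})\right] - \nabla U \tag{12.33}$$
where the scalar triple product, equation 21, has been used. Thus the Lagrange equations give for the rotating frame basis that

$$m\mathbf{a}''_{rot} = -\nabla U - m\left[\mathbf{A}_{rot} + (\boldsymbol{\omega} \times \mathbf{V}_{fix}) + 2(\boldsymbol{\omega} \times \mathbf{v}''_{rot}) + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}'_{mov}) + (\dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov})\right] \tag{12.34}$$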

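The external force is identified as $\mathbf{F}_{fix} = -\nabla U$. Equation 12.16 can be used to transform between the fixed and the rotating bases.
$$\mathbf{A}_{fix} = \left[\mathbf{A}_{rot} + (\boldsymbol{\omega} \times \mathbf{V}_{fix})\right] \tag{12.35}$$

This leads to an effective force in the non-inertial translating plus rotating frame that corresponds to an effective Newtonian force of

$$\mathbf{F}^{eff}_{rot} = m\mathbf{a}''_{rot} = \mathbf{F}_{fix} - m\left[\mathbf{A}_{fix} + 2\boldsymbol{\omega} \times \mathbf{v}''_{rot} + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}'_{mov}) + (\dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov})\right] \tag{12.36}$$

where $\mathbf{A}_{fix}$ is expressed in the fixed frame. The derivation of equation 12.36 using Lagrangian mechanics confirms the identical formula 12.28 derived using Newtonian mechanics.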
The four correction terms for the non-inertial frame basis correspond to the following effective forces.
Translational acceleration: $\mathbf{F}_{trans} = -m\mathbf{A}_{fix}$. This is the usual inertial force experienced in a linearly accelerating frame of reference, where $\mathbf{A}_{fix}$ is with respect to the fixed frame.
Coriolis force: $\mathbf{F}_{cor} = -2m\boldsymbol{\omega} \times \mathbf{v}''_{rot}$. This is a new type of inertial force that is present only when a particle is moving in the rotating frame. This force is proportional to the velocity in the rotating frame and is independent of the position in the rotating frame.
Centrifugal force: $\mathbf{F}_{cf} = -m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}'_{mov})$. This is due to the centripetal acceleration of the particle owing to the rotation of the moving axis about the axis of rotation.
Transverse (azimuthal) force: $\mathbf{F}_{az} = -m\dot{\boldsymbol{\omega}} \times \mathbf{r}'_{mov}$. This is a straightforward term due to acceleration of the particle caused by the angular acceleration of the rotating axes.
The above inertial forces are correction terms arising from trying to extend Newton's laws of motion to a non-inertial frame involving both translation and rotation. These correction forces are often referred to as "fictitious" forces. However, these non-inertial forces are very real to an observer located in the non-inertial frame. Since the centrifugal and Coriolis terms are unusual they are discussed below.
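The four correction terms can be collected into one small helper. The following is a minimal sketch rather than a definitive implementation. The function name `inertial_forces`, the dictionary keys, and the sample numbers are assumptions made for illustration. Vectors are plain 3-tuples expressed in a common basis.

```python
def cross(a, b):
    """Vector product a x b for 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def scale(s, v):
    """Multiply a 3-tuple by a scalar."""
    return tuple(s * x for x in v)


def inertial_forces(mass, frame_accel, omega, omega_dot, r_mov, v_rot):
    """Return the four inertial force terms of equation 12.36.

    frame_accel is A_fix, r_mov is r'_mov, v_rot is v''_rot.
    """
    return {
        "translational": scale(-mass, frame_accel),
        "coriolis": scale(-2.0 * mass, cross(omega, v_rot)),
        "centrifugal": scale(-mass, cross(omega, cross(omega, r_mov))),
        "azimuthal": scale(-mass, cross(omega_dot, r_mov)),
    }


# Assumed sample state: rotation about z, particle on the x axis moving along y.
forces = inertial_forces(
    mass=1.0,
    frame_accel=(1.0, 0.0, 0.0),
    omega=(0.0, 0.0, 2.0),
    omega_dot=(0.0, 0.0, 3.0),
    r_mov=(1.0, 0.0, 0.0),
    v_rot=(0.0, 1.0, 0.0),
)
for name, value in forces.items():
    print(name, value)
```

For this sample state the centrifugal term points radially outward along $+x$ with magnitude $m\omega^2 r$, and the Coriolis term is perpendicular to both $\boldsymbol{\omega}$ and $\mathbf{v}''_{rot}$.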

12.7 Centrifugal force

The centrifugal force was defined as

$$\mathbf{F}_{cf} = -m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}'_{mov}) \tag{12.37}$$

Note that
$$\boldsymbol{\omega} \cdot \mathbf{F}_{cf} = 0 \tag{12.38}$$
therefore the centrifugal force is perpendicular to the axis of rotation.
Using the vector identity, equation 24, allows the centrifugal force to be written as
$$\mathbf{F}_{cf} = -m\left[(\boldsymbol{\omega} \cdot \mathbf{r}'_{mov})\,\boldsymbol{\omega} - \omega^2 \mathbf{r}'_{mov}\right] \tag{12.39}$$
For the case where the radius $\mathbf{r}'_{mov}$ is perpendicular to $\boldsymbol{\omega}$ then $\boldsymbol{\omega} \cdot \mathbf{r}'_{mov} = 0$ and thus for this special case

$$\mathbf{F}_{cf} = m\omega^2 \mathbf{r}'_{mov} \tag{12.40}$$

Figure 12.3: Centrifugal force.

The centrifugal force is experienced when riding in a car driven rapidly around a bend. The passenger experiences an apparent centrifugal (center fleeing) force that thrusts them to the outside of the bend relative to the inside of the turning car. In reality, relative to the fixed inertial frame, i.e. the road, the friction between the car tires and the road is changing the direction of the car towards the inside of the bend and the car seat is causing the centripetal (center seeking) acceleration of the passenger. A bucket of water attached to a rope can be swung around in a vertical plane without spilling any water if the centrifugal force exceeds the gravitational force at the top of the trajectory.
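A short numerical sketch can confirm that equations 12.37 and 12.39 agree and that the centrifugal force is perpendicular to $\boldsymbol{\omega}$. It also estimates the minimum angular speed for the swinging bucket from the condition $m\omega^2 r \geq mg$ at the top of the circle. The helper names, the default $g = 9.81$ m/s², and the sample vectors are illustrative assumptions.

```python
import math


def cross(a, b):
    """Vector product a x b for 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-tuples."""
    return sum(p * q for p, q in zip(a, b))


def centrifugal_force(mass, omega, r):
    """Equation 12.37: F_cf = -m omega x (omega x r)."""
    return tuple(-mass * c for c in cross(omega, cross(omega, r)))


def centrifugal_force_identity(mass, omega, r):
    """Equation 12.39: F_cf = -m [(omega . r) omega - omega^2 r]."""
    w2, wr = dot(omega, omega), dot(omega, r)
    return tuple(-mass * (wr * w - w2 * x) for w, x in zip(omega, r))


def min_angular_speed_for_bucket(radius, g=9.81):
    """Smallest omega for which m omega^2 r >= m g at the top of the circle."""
    return math.sqrt(g / radius)


omega = (0.3, -0.2, 1.5)      # assumed sample angular velocity (rad/s)
r = (0.8, 0.1, 0.4)           # assumed sample position (m)
f_direct = centrifugal_force(2.0, omega, r)
f_identity = centrifugal_force_identity(2.0, omega, r)
print(f_direct, f_identity, dot(omega, f_direct))
print(min_angular_speed_for_bucket(1.0))
```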

12.8 Coriolis force

The Coriolis force was defined to be
$$\mathbf{F}_{cor} = -2m\boldsymbol{\omega} \times \mathbf{v}''_{rot} \tag{12.41}$$
where $\mathbf{v}''_{rot}$ is the velocity measured in the rotating (double-primed) frame. The Coriolis force is an interesting force; it is perpendicular to both the axis of rotation and the velocity vector in the rotating frame, that is, it is analogous to the $\mathbf{v} \times \mathbf{B}$ Lorentz magnetic force.

Figure 12.4: Free-force motion of a hockey puck sliding on a rotating frictionless table of radius $R$ that is rotating with constant angular frequency $\omega$ out of the page.

The understanding of the Coriolis effect is facilitated by considering the physics of a hockey puck sliding on a rotating frictionless table. Assume that the table rotates with constant angular frequency $\boldsymbol{\omega} = \omega\hat{\mathbf{k}}$ about the $z$ axis. For this system the origin of the rotating system is fixed, and the angular frequency is constant, thus $\mathbf{A}$ and $\dot{\boldsymbol{\omega}} \times \mathbf{r}'$ are zero. Also it is assumed that there are no external forces acting on the hockey puck, thus the net acceleration of the puck sliding on the table, as seen in the rotating frame, simplifies to
$$\mathbf{a}''_{rot} = -2\boldsymbol{\omega} \times \mathbf{v}''_{rot} - \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}') = -2\omega\hat{\mathbf{k}} \times \mathbf{v}''_{rot} + \omega^2 \mathbf{r}' \tag{12.42}$$
The centrifugal acceleration $+\omega^2 \mathbf{r}'$ is radially outwards while the Coriolis acceleration $-2\omega\hat{\mathbf{k}} \times \mathbf{v}''_{rot}$ is to the right. Integration of the equations of motion can be used to calculate the trajectories in the rotating frame of reference.
Figure 12.4 illustrates trajectories of the hockey puck in the rotating reference frame when no external forces are acting, that is, in the inertial frame the puck moves in a straight line with constant velocity $\mathbf{v}_0$. In the rotating reference frame the Coriolis force accelerates the puck to the right leading to trajectories that exhibit spiral motion. The apparently complicated trajectories result from the observer being in the rotating frame, in which the straight inertial-frame trajectories of the moving puck appear as spiralling trajectories.
The Coriolis force is the reason that winds circulate in an anticlockwise direction about low-pressure regions in the Earth's northern hemisphere. It also has important consequences in many activities on Earth such as ballet dancing, ice skating, acrobatics, nuclear and molecular rotation, and the motion of missiles.
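The puck trajectories can be reproduced by integrating equation 12.42 directly. The sketch below uses a hand-written fourth-order Runge-Kutta step, which is a choice made here and not a method taken from the text. It compares the result with the exact answer obtained by rotating the straight inertial-frame path into the rotating frame. The angular frequency, the initial conditions, and all function names are illustrative assumptions. The two frames are assumed to coincide at $t = 0$.

```python
import math

OMEGA = 1.0   # assumed table angular frequency about z (rad/s)


def derivative(state):
    """Time derivative of (x, y, vx, vy) in the rotating frame, from equation 12.42."""
    x, y, vx, vy = state
    ax = 2.0 * OMEGA * vy + OMEGA**2 * x    # Coriolis + centrifugal, x component
    ay = -2.0 * OMEGA * vx + OMEGA**2 * y   # Coriolis + centrifugal, y component
    return (vx, vy, ax, ay)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = derivative(state)
    k2 = derivative(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = derivative(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = derivative(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate_rotating_frame(r0, v0_fix, t_end, steps):
    """Integrate the puck motion in the rotating frame and return (x'', y'')."""
    x0, y0 = r0
    # Rotating-frame velocity at t = 0: v'' = v_fix - omega x r.
    state = (x0, y0, v0_fix[0] + OMEGA * y0, v0_fix[1] - OMEGA * x0)
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state[0], state[1]


def exact_rotating_frame(r0, v0_fix, t):
    """Rotate the straight inertial-frame path into the rotating frame."""
    xf = r0[0] + v0_fix[0] * t
    yf = r0[1] + v0_fix[1] * t
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (c * xf + s * yf, -s * xf + c * yf)


numeric_xy = integrate_rotating_frame((0.1, 0.0), (0.5, 0.2), 3.0, 3000)
exact_xy = exact_rotating_frame((0.1, 0.0), (0.5, 0.2), 3.0)
print(numeric_xy, exact_xy)
```

Agreement between the two results indicates that the curved path in the rotating frame is simply the straight inertial-frame path viewed from the rotating axes.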

12.1 Example: Accelerating spring plane pendulum

Comparison of the relative merits of using a non-inertial frame versus an inertial frame is given by a spring pendulum attached to an accelerating fulcrum. As shown in the figure, the spring pendulum comprises a mass $m$ attached to a massless spring that has a rest length $r_0$ and spring constant $k$. The system is in a vertical gravitational field $g$ and the fulcrum of the pendulum is accelerating vertically upwards with a constant acceleration $a$. Assume that the spring pendulum oscillates only in the vertical $\theta$ plane.
Inertial frame:
This problem can be solved in the fixed inertial coordinate system with coordinates $(x, y)$. These coordinates, and their time derivatives, are given in terms of $r$ and $\theta$ by
$$x = r\sin\theta \qquad \dot{x} = \dot{r}\sin\theta + r\dot{\theta}\cos\theta$$
$$y = -r\cos\theta + \frac{1}{2}at^2 \qquad \dot{y} = r\dot{\theta}\sin\theta - \dot{r}\cos\theta + at$$
Thus
$$L = \frac{1}{2}m\left(\dot{x}^2 + \dot{y}^2\right) - mgy - \frac{1}{2}k(r - r_0)^2$$
$$= \frac{1}{2}m\left[\dot{r}^2 + r^2\dot{\theta}^2 + a^2t^2 + 2at\left(r\dot{\theta}\sin\theta - \dot{r}\cos\theta\right)\right] + mg\left(r\cos\theta - \frac{1}{2}at^2\right) - \frac{1}{2}k(r - r_0)^2$$

The Lagrange equations of motion are given by

$$\Lambda_r L = 0: \qquad \ddot{r} - r\dot{\theta}^2 - (a + g)\cos\theta + \frac{k}{m}(r - r_0) = 0$$

$$\Lambda_\theta L = 0: \qquad \ddot{\theta} + \frac{2}{r}\dot{r}\dot{\theta} + \frac{(a + g)}{r}\sin\theta = 0$$
The generalized momenta are
$$p_r = \frac{\partial L}{\partial \dot{r}} = m\dot{r} - mat\cos\theta$$
$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = mr^2\dot{\theta} + matr\sin\theta$$
These lead to the corresponding velocities of
$$\dot{r} = \frac{p_r}{m} + at\cos\theta$$
$$\dot{\theta} = \frac{p_\theta}{mr^2} - \frac{at\sin\theta}{r}$$
and thus the Hamiltonian is given by
$$H = p_r\dot{r} + p_\theta\dot{\theta} - L$$
$$= \frac{p_r^2}{2m} + \frac{p_\theta^2}{2mr^2} - \frac{at}{r}p_\theta\sin\theta + atp_r\cos\theta + \frac{1}{2}k(r - r_0)^2 + \frac{1}{2}mgat^2 - mgr\cos\theta$$
The Hamilton equations of motion give that
$$\dot{r} = \frac{\partial H}{\partial p_r} = \frac{p_r}{m} + at\cos\theta$$
$$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{mr^2} - \frac{at\sin\theta}{r}$$

These radial and angular velocities are the same as obtained using Lagrangian mechanics.
The Hamilton equations for $\dot{p}_r$ and $\dot{p}_\theta$ are given by
$$\dot{p}_r = -\frac{\partial H}{\partial r} = -\frac{at}{r^2}p_\theta\sin\theta - k(r - r_0) + mg\cos\theta + \frac{p_\theta^2}{mr^3}$$
Similarly
$$\dot{p}_\theta = -\frac{\partial H}{\partial \theta} = \frac{at}{r}p_\theta\cos\theta + atp_r\sin\theta - mgr\sin\theta$$
The transformation equations relating the generalized coordinates $r, \theta$ are time dependent so the Hamiltonian $H$ does not equal the total energy $E$. In addition neither the Lagrangian nor the Hamiltonian are conserved since they both are time dependent. The fact that the Hamiltonian is not conserved is obvious since the whole system is accelerating upwards leading to increasing kinetic and potential energies. Moreover, the time derivative of the angular momentum $\dot{p}_\theta$ is non-zero so the angular momentum $p_\theta$ is not conserved.
Non-inertial fulcrum frame:
This system also can be addressed in the accelerating non-inertial fulcrum frame of reference which is fixed to the fulcrum of the spring of the pendulum. In this non-inertial frame of reference, the acceleration of the frame can be taken into account using an effective acceleration $a$ which is added to the gravitational acceleration; that is, $g$ is replaced by an effective gravitational acceleration $(g + a)$. Then the Lagrangian in the fulcrum frame simplifies to
$$L_{fulcrum} = \frac{1}{2}m\dot{r}^2 + \frac{1}{2}mr^2\dot{\theta}^2 + m(g + a)(r\cos\theta) - \frac{1}{2}k(r - r_0)^2$$
The Lagrange equations of motion in the fulcrum frame are given by

$$\Lambda_r L_{fulcrum} = 0: \qquad \ddot{r} - r\dot{\theta}^2 - (a + g)\cos\theta + \frac{k}{m}(r - r_0) = 0$$

$$\Lambda_\theta L_{fulcrum} = 0: \qquad \ddot{\theta} + \frac{2}{r}\dot{r}\dot{\theta} + \frac{(a + g)}{r}\sin\theta = 0$$
These are identical to the Lagrange equations of motion derived in the inertial frame.
The $L_{fulcrum}$ can be used to derive the momenta in the non-inertial fulcrum frame
$$\tilde{p}_r = \frac{\partial L_{fulcrum}}{\partial \dot{r}} = m\dot{r}$$
$$\tilde{p}_\theta = \frac{\partial L_{fulcrum}}{\partial \dot{\theta}} = mr^2\dot{\theta}$$
which comprise only a part of the momenta derived in the inertial frame. These partial fulcrum momenta lead to a Hamiltonian for the fulcrum frame of
$$H_{fulcrum} = \tilde{p}_r\dot{r} + \tilde{p}_\theta\dot{\theta} - L_{fulcrum} = \frac{\tilde{p}_r^2}{2m} + \frac{\tilde{p}_\theta^2}{2mr^2} + \frac{1}{2}k(r - r_0)^2 - m(g + a)r\cos\theta$$
Both $L_{fulcrum}$ and $H_{fulcrum}$ are time independent and thus the fulcrum Hamiltonian $H_{fulcrum}$ is a constant of motion in the fulcrum frame. However, $H_{fulcrum}$ does not equal the total energy which is increasing with time due to the acceleration of the fulcrum frame relative to the inertial frame. This example illustrates that use of non-inertial frames can simplify solution of accelerating systems.
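The statement that $H_{fulcrum}$ is a constant of motion can be tested numerically. The sketch below integrates the two Lagrange equations of motion with a fourth-order Runge-Kutta step and evaluates $H_{fulcrum}$ before and after. The integrator, the parameter values, and the initial conditions are assumptions chosen only for illustration.

```python
import math

# Assumed illustrative parameters: mass, spring constant, rest length, g, fulcrum acceleration.
M, K, R0, G, A = 1.0, 20.0, 1.0, 9.81, 2.0


def derivative(state):
    """Time derivative of (r, theta, r_dot, theta_dot) from the Lagrange equations."""
    r, theta, r_dot, theta_dot = state
    r_ddot = r * theta_dot**2 + (A + G) * math.cos(theta) - (K / M) * (r - R0)
    theta_ddot = -2.0 * r_dot * theta_dot / r - (A + G) * math.sin(theta) / r
    return (r_dot, theta_dot, r_ddot, theta_ddot)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = derivative(state)
    k2 = derivative(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = derivative(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = derivative(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def fulcrum_hamiltonian(state):
    """H_fulcrum written in terms of velocities, using p_r = m r_dot and p_theta = m r^2 theta_dot."""
    r, theta, r_dot, theta_dot = state
    kinetic = 0.5 * M * (r_dot**2 + (r * theta_dot)**2)
    spring = 0.5 * K * (r - R0)**2
    gravity = -M * (G + A) * r * math.cos(theta)
    return kinetic + spring + gravity


state = (1.2, 0.3, 0.0, 0.0)          # assumed initial r, theta, r_dot, theta_dot
h_start = fulcrum_hamiltonian(state)
for _ in range(5000):                 # 5 s of motion with dt = 1 ms
    state = rk4_step(state, 1e-3)
h_end = fulcrum_hamiltonian(state)
print(h_start, h_end)
```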

12.2 Example: Surface of rotating liquid

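Find the shape of the surface of liquid in a bucket that rotates with angular speed $\omega$ as shown in the adjacent figure. Assume that the liquid is at rest in the frame of the bucket. Therefore, in the coordinate system rotating with the bucket of liquid, the centrifugal force is important whereas the Coriolis, translational, and transverse forces are zero. The external force is

$$\mathbf{F} = \mathbf{F}' + m\mathbf{g}$$

where $\mathbf{F}'$ is the pressure force, which is perpendicular to the surface, and $\mathbf{g}$ is the gravitational acceleration vector. At equilibrium the acceleration of the surface is zero, that is

$$m\mathbf{a}'' = 0 = \mathbf{F}' + m\left(\mathbf{g} - \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}')\right)$$

The effective gravitational field, that is, the effective gravitational force per unit mass, is

$$\mathbf{g}_{eff} = \left(\mathbf{g} - \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}')\right)$$
which must be perpendicular to the surface of the liquid since $\mathbf{F}'$ is perpendicular to the surface of a fluid, and the net force is zero. In cylindrical coordinates this can be written as

$$\mathbf{g}_{eff} = -g\hat{\mathbf{z}} + \omega^2\rho\,\hat{\boldsymbol{\rho}}$$

From the figure it can be deduced that

$$\tan\theta = \frac{dz}{d\rho} = \frac{\omega^2\rho}{g}$$
By integration
$$z = \frac{\omega^2\rho^2}{2g} + \text{constant}$$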

This is the equation of a paraboloid and corresponds to a parabolic gravitational equipotential energy surface.
Astrophysicists build large parabolic mirrors for telescopes by continuously spinning a large vat of glass while
it solidifies. This is much easier than grinding a large cylindrical block of glass into a parabolic shape.
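As a small worked sketch of the result above, the following functions evaluate the surface height and slope of the rotating liquid. They also give the focal length that follows from comparing $z = \omega^2\rho^2/2g$ with the standard paraboloid form $z = \rho^2/4f$, namely $f = g/2\omega^2$. That comparison is added here and is not stated in the text. The function names, the default $g = 9.81$ m/s², and the sample values are illustrative assumptions.

```python
def surface_height(rho, omega, g=9.81, z0=0.0):
    """Height of the liquid surface at radius rho: z = omega^2 rho^2 / (2 g) + z0."""
    return omega**2 * rho**2 / (2.0 * g) + z0


def surface_slope(rho, omega, g=9.81):
    """Slope dz/d(rho) = tan(theta) = omega^2 rho / g."""
    return omega**2 * rho / g


def focal_length(omega, g=9.81):
    """Focal length of the paraboloid z = rho^2 / (4 f), giving f = g / (2 omega^2)."""
    return g / (2.0 * omega**2)


omega = 2.0      # assumed angular speed (rad/s)
rho = 0.5        # assumed radius (m)
h = 1e-6
numeric_slope = (surface_height(rho + h, omega) - surface_height(rho - h, omega)) / (2.0 * h)
print(surface_height(rho, omega), surface_slope(rho, omega), numeric_slope)
print(focal_length(omega))
```

Spinning faster shortens the focal length, which is how the rotation rate sets the curvature of a spin-cast mirror.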

12.3 Example: The pirouette

An interesting application of the Coriolis force is the problem of a spinning ice skater or ballet dancer. Her angular frequency increases when she draws in her arms. The conventional explanation is that angular momentum is conserved in the absence of any external torques, which is correct. Thus since her moment of inertia decreases when she retracts her arms, her angular velocity must increase to maintain a constant angular momentum $\mathbf{L} = I\boldsymbol{\omega}$. But this explanation does not address the question as to what forces cause the angular frequency to increase. The real radial forces the skater feels when she retracts her arms cannot directly lead to angular acceleration since radial forces are perpendicular to the rotation. The following derivation shows that the Coriolis force $-2m\boldsymbol{\omega} \times \mathbf{v}''_{rot}$ acts tangentially, perpendicular to the radial retraction velocity of her arms, leading to the angular acceleration required to maintain constant angular momentum.
Consider that a mass $m$ is moving radially at a velocity $\dot{\mathbf{r}}''_{rot}$, then the Coriolis force in the rotating frame is
$$\mathbf{F}_{cor} = -2m\boldsymbol{\omega} \times \dot{\mathbf{r}}''_{rot}$$

This Coriolis force leads to an angular acceleration of the mass of

$$\dot{\boldsymbol{\omega}} = -\frac{2\boldsymbol{\omega} \times \dot{\mathbf{r}}''}{r''} \tag{$\alpha$}$$
which has magnitude $\dot{\omega} = -2\omega\dot{r}''/r''$, that is, the rotational frequency decreases if the radius is increased. Note that, as shown in equation 12.17, $\dot{\omega} = \dot{\omega}''$. This nonzero value of $\dot{\omega}$ obviously leads to an azimuthal force in addition to the Coriolis force.
Consider the rate of change of angular momentum for the rotating mass $m$ assuming that the angular momentum comes purely from the rotation $\omega$. Then in the rotating frame

$$\dot{\mathbf{p}}'' = \frac{d}{dt}\left(mr''^2\boldsymbol{\omega}\right) = 2mr''\dot{r}''\boldsymbol{\omega} + mr''^2\dot{\boldsymbol{\omega}}$$

Substituting equation $\alpha$ for $\dot{\omega}$ in the second term gives

$$\dot{\mathbf{p}}'' = 2mr''\dot{r}''\boldsymbol{\omega} - 2mr''\dot{r}''\boldsymbol{\omega} = 0$$

That is, the two terms cancel. Thus the angular momentum is conserved for this case where the velocity is radial. Note that, since $\mathbf{p}''$ is assumed to be colinear with $\boldsymbol{\omega}$, it is the same in both the stationary and rotating frames of reference and thus angular momentum is conserved in both frames. In addition, in the fixed frame, the angular momentum is conserved if no external torques are acting as assumed above.
Note that the rotational energy is
$$E_{rot} = \frac{1}{2}I\omega^2$$
Also the angular momentum is conserved, that is

$$\mathbf{p} = I\boldsymbol{\omega} = I\omega\hat{\boldsymbol{\omega}}$$
Substituting $\omega = \frac{p}{I}$ in the rotational energy gives

$$E_{rot} = \frac{1}{2}I\left(\frac{p}{I}\right)^2 = \frac{p^2}{2I}$$
Therefore the rotational energy actually increases as the moment of inertia decreases when the ice skater pulls her arms close to her body. This increase in rotational energy is provided by the work done as the dancer pulls her arms inward against the centrifugal force.
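Both views of the pirouette can be compared numerically. In the sketch below, `spin_up` applies conservation of angular momentum directly, while `integrate_coriolis_spin_up` integrates the magnitude form of equation $\alpha$, $\dot{\omega} = -2\omega\dot{r}/r$, for a point mass whose radius changes at a constant rate. The function names, the Runge-Kutta integrator, and the sample numbers are assumptions made for illustration.

```python
def spin_up(inertia_initial, omega_initial, inertia_final):
    """Return (omega_final, E_initial, E_final) assuming I * omega is conserved."""
    omega_final = inertia_initial * omega_initial / inertia_final
    energy_initial = 0.5 * inertia_initial * omega_initial**2
    energy_final = 0.5 * inertia_final * omega_final**2
    return omega_final, energy_initial, energy_final


def integrate_coriolis_spin_up(omega0, r_start, r_dot, t_end, steps):
    """Integrate d(omega)/dt = -2 omega r_dot / r with r(t) = r_start + r_dot * t (RK4)."""
    def omega_dot(t, omega):
        return -2.0 * omega * r_dot / (r_start + r_dot * t)

    dt = t_end / steps
    omega, t = omega0, 0.0
    for _ in range(steps):
        k1 = omega_dot(t, omega)
        k2 = omega_dot(t + 0.5 * dt, omega + 0.5 * dt * k1)
        k3 = omega_dot(t + 0.5 * dt, omega + 0.5 * dt * k2)
        k4 = omega_dot(t + dt, omega + dt * k3)
        omega += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return omega


# Halving the moment of inertia doubles omega and doubles the rotational energy.
omega_final, energy_initial, energy_final = spin_up(4.0, 2.0, 2.0)

# Point mass pulled in from r = 1.0 to r = 0.5 over 1 s; r^2 * omega should stay constant.
omega_integrated = integrate_coriolis_spin_up(2.0, 1.0, -0.5, 1.0, 1000)
print(omega_final, energy_initial, energy_final, omega_integrated)
```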

12.9 Routhian reduction for rotating systems

The Routhian reduction technique, that was introduced in chapter 86 is a hybrid variational approach. It
was devised by Routh to handle the cyclic and non-cyclic variables separately in order to simultaneously
exploit the diﬀering advantages of the Hamiltonian and Lagrangian formulations. The Routhian reduction
technique is a powerful method for handling rotating systems ranging from galaxies to molecules, or deformed
nuclei, as well as rotating machinery in engineering. A valuable feature of the Hamiltonian formulation is
that it allows elimination of cyclic variables which reduces the number of degrees of freedom to be handled.
As a consequence, cyclic variables are called ignorable variables in Hamiltonian mechanics. The Lagrangian,
the Hamiltonian and the Routhian all are scalars under rotation and thus are invariant to rotation of the
frame of reference. Note that often there are only two cyclic variables for a rotating system, that is, θ̇ = ω
and the corresponding canonical total angular momentum p = J.
As mentioned in chapter 86, there are two possible Routhians that are useful for handling rotation frames
of reference. For rotating systems the cyclic Routhian  simplifies to

 (1    ; ̇1   ̇ ; +1    ; ) =  −  = ω · J −  (12.43)

This Routhian behaves like a Hamiltonian for the ignorable cyclic coordinates ω J Simultaneously it behaves
like a negative Lagrangian  for all the other coordinates
The non-cyclic Routhian  complements  in that it is defined as

 (1    ; 1    ; ̇+1   ̇ ; ) =  −  =  − ω · J (12.44)

This non-cyclic Routhian behaves like a Hamiltonian for all the non-cyclic variables and behaves like a
negative Lagrangian for the two cyclic variables   . Since the cyclic variables are constants of motion,
then  is a constant of motion that equals the energy in the rotating frame if  is a constant of
motion. However,  does not equal the total energy since the coordinate transformation is time
dependent, that is, the Routhian  corresponds to the energy of the non-cyclic parts of the motion.
For example, the Routhian  for a system that is being cranked about the  axis at some fixed
angular frequency ̇ =  with corresponding total angular momentum p = J can be written as1

 =  −ω·J (12.45)

1 h 2
i
=  V · V + v” · v” + 2V · v” + 2V · (ω × r0 ) + 2v” · (ω × r0 ) + (ω × r0 ) − ω · J +  ()
2
Note that  is a constant of motion if  = 0 which is the case when the system is being cranked
at a constant angular frequency. However the Hamiltonian in the rotating frame  =  − ω · J is given
by  =  6=  since the coordinate transformation is time dependent. The canonical Hamilton
equations for the fourth and fifth terms in the bracket can be identified with the Coriolis force 2ω × v00 
while the last term in the bracket is identified with the centrifugal force. That is, define
1 2
 ≡ −  (ω × r0 ) (12.46)
2
where the gradient of  gives the usual centrifugal force.
 h 2
i £ ¤
F = −∇ = ∇  2 02 − (ω · r0 ) =   2 r0 − (ω · r0 ) ω = −ω × (ω × r0 ) (12.47)
2
The Routhian reduction method is used extensively in science and engineering to describe rotational
motion of rigid bodies, molecules, deformed nuclei, and astrophysical objects. The cyclic variables describe
the rotation of the frame and thus the Routhian  =  corresponds to the Hamiltonian for the
non-cyclic variables in the rotating frame.
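Equation 12.47 can be checked with a numerical gradient. This sketch evaluates the centrifugal potential of equation 12.46 and differentiates it by central differences, then compares the result with $-m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}')$. The helper names, the step size, and the sample vectors are illustrative assumptions.

```python
def cross(a, b):
    """Vector product a x b for 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-tuples."""
    return sum(p * q for p, q in zip(a, b))


def centrifugal_potential(mass, omega, r):
    """Equation 12.46: U_cf = -(1/2) m (omega x r)^2."""
    w = cross(omega, r)
    return -0.5 * mass * dot(w, w)


def numeric_gradient(func, r, h=1e-4):
    """Central-difference gradient of a scalar function of a 3-tuple."""
    grad = []
    for i in range(3):
        plus, minus = list(r), list(r)
        plus[i] += h
        minus[i] -= h
        grad.append((func(tuple(plus)) - func(tuple(minus))) / (2.0 * h))
    return tuple(grad)


mass = 1.5                    # assumed sample mass
omega = (0.2, -0.4, 1.1)      # assumed sample angular velocity
r = (0.7, 0.3, -0.5)          # assumed sample position

gradient = numeric_gradient(lambda pos: centrifugal_potential(mass, omega, pos), r)
force_from_potential = tuple(-g for g in gradient)
force_direct = tuple(-mass * c for c in cross(omega, cross(omega, r)))
print(force_from_potential, force_direct)
```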

1 For clarity sections 101 to 108 of this chapter adopted a naming convention that uses unprimed coordinates with the

subscript   for the inertial frame of reference, primed coordinates with the subscript  for the translating coordinates, and
double-primed coordinates with the subscript  for the translating plus rotating frame. For brevity the subsequent discussion
omits the redundant subscripts     since the single and double prime superscripts completely define the moving and
rotating frames of reference.

12.4 Example: Cranked plane pendulum

The cranked plane pendulum, which is also called the rotating plane pendulum, comprises a plane pendulum that is cranked around a vertical axis at a constant angular velocity $\dot{\phi} = \omega$ as determined by some external drive mechanism. The parameters are illustrated in the adjacent figure. The cranked pendulum nicely illustrates the advantages of working in a non-inertial rotating frame for a driven rotating system. Although the cranked plane pendulum looks similar to the spherical pendulum, there is one very important difference; for the spherical pendulum $p_\phi = ml^2\sin^2\theta\,\dot{\phi}$ is a constant of motion and thus the angular velocity varies with $\theta$, i.e. $\dot{\phi} = \frac{p_\phi}{ml^2\sin^2\theta}$, whereas for the cranked plane pendulum, the constant of motion is $\dot{\phi} = \omega$ and thus the angular momentum varies with $\theta$, i.e. $p_\phi = ml^2\omega\sin^2\theta$. For the cranked plane pendulum, the energy must flow into and out of the cranking drive system that is providing the constraint force to satisfy the equation of constraint
$$\dot{\phi} - \omega = 0$$
Figure: Cranked plane pendulum that is cranked around the vertical axis with angular velocity $\dot{\phi} = \omega$.

The easiest way to solve the equations of motion for the cranked plane pendulum is to use generalized coordinates to absorb the equation of constraint and applied constraint torque. This is done by incorporating the $\dot{\phi} = \omega$ constraint explicitly in the Lagrangian or Hamiltonian and solving for just $\theta$ in the rotating frame.
Assuming that $\dot{\phi} = \omega$ and using generalized coordinates to absorb the cranking constraint forces, the Lagrangian for the cranked pendulum can be written as
$$L = \frac{1}{2}ml^2\left(\dot{\theta}^2 + \sin^2\theta\,\omega^2\right) + mgl\cos\theta$$
The momentum conjugate to $\theta$ is
$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = ml^2\dot{\theta}$$
Consider the Routhian $R_{noncyclic} = p_\theta\dot{\theta} - L = H - p_\phi\dot{\phi}$ which acts as a Hamiltonian $H_{rot}$ in the rotating frame
$$R_{noncyclic} = p_\theta\dot{\theta} - L = H - p_\phi\dot{\phi} = \frac{p_\theta^2}{2ml^2} - \frac{1}{2}ml^2\omega^2\sin^2\theta - mgl\cos\theta$$
Note that if $\dot{\phi} = \omega$ is constant, then $R_{noncyclic}$ is a constant of motion for rotation about the $\phi$ axis since it is independent of $\phi$. Also $\frac{dR_{noncyclic}}{dt} = -\frac{\partial L}{\partial t} = 0$, thus the energy in the rotating non-inertial frame of the pendulum $H_{rot} = R_{noncyclic} = H - p_\phi\dot{\phi}$ is a constant of motion, but it does not equal the total energy since the rotating coordinate transformation is time dependent. The driver that cranks the system at a constant $\omega$ provides or absorbs the energy $dE = dW = \omega\,dp_\phi$ as $\theta$ changes in order to maintain a constant $\omega$.
The Routhian $R_{noncyclic}$ can be used to derive the equations of motion using Hamiltonian mechanics.
$$\dot{\theta} = \frac{\partial R_{noncyclic}}{\partial p_\theta} = \frac{p_\theta}{ml^2}$$
$$\dot{p}_\theta = -\frac{\partial R_{noncyclic}}{\partial \theta} = -mgl\sin\theta\left[1 - \frac{l}{g}\cos\theta\,\omega^2\right]$$
Since $\dot{p}_\theta = ml^2\ddot{\theta}$ then the equation of motion is
$$\ddot{\theta} + \frac{g}{l}\sin\theta\left[1 - \frac{l}{g}\cos\theta\,\omega^2\right] = 0 \tag{$\alpha$}$$
Assuming that $\sin\theta \approx \theta$, equation $\alpha$ leads to linear harmonic oscillator solutions about a minimum at $\theta = 0$ if the term in brackets is positive. That is, when the bracket $\left[1 - \frac{l}{g}\cos\theta\,\omega^2\right] > 0$ then equation $\alpha$ corresponds to a harmonic oscillator with angular velocity $\Omega$ given by
$$\Omega^2 = \frac{g}{l}\left[1 - \frac{l}{g}\cos\theta\,\omega^2\right]$$

The adjacent figure shows the phase-space diagrams for a plane pendulum rotating about a vertical axis at angular velocity $\omega$ for (a) $\omega < \sqrt{g/l}$ and (b) $\omega > \sqrt{g/l}$. The upper phase plot shows small $\omega$ when the square bracket of equation $\alpha$ is positive and the phase space trajectories are ellipses around the stable equilibrium point $(0, 0)$. As $\omega$ increases the bracket becomes smaller and changes sign when $\omega^2 \cos\theta = \frac{g}{l}$. For larger $\omega$ the bracket is negative leading to hyperbolic phase space trajectories around the $(\theta, p_\theta) = (0, 0)$ equilibrium point, that is, an unstable equilibrium point. However, new stable equilibrium points now occur at angles $(\theta, p_\theta) = (\pm\theta_0, 0)$ where $\cos\theta_0 = \frac{g}{l\omega^2}$. That is, the equilibrium point $(0, 0)$ undergoes bifurcation as illustrated in the lower figure. These new equilibrium points are stable as illustrated by the elliptical trajectories around these points. It is interesting that these new equilibrium points $\pm\theta_0$ move to larger angles given by $\cos\theta_0 = \frac{g}{l\omega^2}$ beyond the bifurcation point at $\frac{g}{l\omega^2} = 1$. For low energy the mass oscillates about the minimum at $\theta = \theta_0$ whereas the motion becomes more complicated for higher energy. The bifurcation corresponds to symmetry breaking since, under spatial reflection, the equilibrium point is unchanged at low rotational frequencies but it transforms from $+\theta_0$ to $-\theta_0$ once the solution bifurcates, that is, the symmetry is broken. Also chaos can occur at the separatrix that separates the bifurcation. Note that either the Lagrange multiplier approach, or the generalized force approach, can be used to determine the applied torque required to ensure a constant $\omega$ for the cranked pendulum.

Figure: Phase-space diagrams for the plane pendulum cranked at angular velocity $\omega$ about a vertical axis. Figure (a) is for $\omega < \sqrt{g/l}$ while (b) is for $\omega > \sqrt{g/l}$.
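The bifurcation described above can be explored numerically. The following is a minimal sketch, assuming illustrative values $g = 9.81$ m/s$^2$ and $l = 1$ m; the function names are hypothetical and are not taken from the text. It returns the stable equilibrium angle(s) and the small-oscillation frequency about $\theta = 0$ obtained from the bracket in the equation of motion.

```python
import math

def stable_equilibria(omega, g=9.81, l=1.0):
    """Stable equilibrium angles (rad) of the cranked pendulum.

    Below the bifurcation (l*omega^2 <= g) only theta = 0 is stable.
    Above it, the stable points are +/- theta0 with cos(theta0) = g/(l*omega^2).
    """
    if l * omega**2 <= g:
        return [0.0]
    theta0 = math.acos(g / (l * omega**2))
    return [-theta0, theta0]

def small_oscillation_frequency(omega, g=9.81, l=1.0):
    """Angular frequency of small oscillations about theta = 0.

    Uses Omega^2 = (g/l) * [1 - (l/g) * omega^2], i.e. the bracket evaluated
    at theta = 0. Returns None when the bracket is negative (unstable point).
    """
    big_omega_sq = (g / l) * (1.0 - (l / g) * omega**2)
    return math.sqrt(big_omega_sq) if big_omega_sq > 0 else None

if __name__ == "__main__":
    for omega in (0.0, 2.0, 4.0):  # rad/s, illustrative cranking rates
        print(omega, stable_equilibria(omega), small_oscillation_frequency(omega))
```

With these assumed values the critical cranking rate is $\sqrt{g/l} \approx 3.13$ rad/s, so the first two cases lie below the bifurcation and the third lies above it.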

12.5 Example: Nucleon orbits in deformed nuclei

Consider the rotation of an axially-symmetric, prolate-deformed nucleus. Many nuclei have a prolate spheroidal shape (the shape of a rugby ball) and they rotate perpendicular to the symmetry axis. In the non-inertial body-fixed frame, pairs of nucleons, each with angular momentum $j$, are bound in orbits with the projection of the angular momentum along the symmetry axis being conserved with value $\Omega = K$, which is a cyclic variable. Since the nucleus is of dimensions $10^{-14}$ m, quantization is important and the quantized binding energies of the individual nucleons are separated by spacings $\leq 500$ keV.
The Lagrangian and Hamiltonian are scalars and can be evaluated in any coordinate frame of reference. It is most useful to calculate the Hamiltonian for a deformed body in the non-inertial rotating body-fixed frame of reference. The body-fixed Hamiltonian corresponds to the Routhian $R_{rot}$

$$R_{rot} = H - \boldsymbol{\omega} \cdot \mathbf{J}$$

where it is assumed that the deformed nucleus has the symmetry axis along the $z$ direction and rotates about the $x$ axis. Since the Routhian is for a non-inertial rotating frame of reference it does not include the total energy but, if the shape is constant in time, then $R_{rot}$ and the corresponding body-fixed Hamiltonian are conserved and the energy levels for the nucleons bound in the spheroidal potential well can be calculated using a conventional quantum mechanical model.

Figure: Schematic diagram for the strong coupling of a nucleon to the deformation axis. The projection of $I$ on the symmetry axis is $K$, and the projection of $j$ is $\Omega$. For axial symmetry Noether's theorem gives that the projection of the angular momentum $J$ on the symmetry axis is a conserved quantity.

For a prolate spheroidal deformed potential well, the nucleon orbits that have the angular momentum nearly aligned to the symmetry axis correspond to nucleon trajectories that are restricted to the narrowest part of the spheroid, whereas trajectories with the angular momentum vector close to perpendicular to the symmetry axis have trajectories that probe the largest radii of the spheroid. The Heisenberg Uncertainty Principle, mentioned in chapter 3.11.3, describes how orbits restricted to the smallest dimension will have the highest linear momentum, and corresponding kinetic energy, and vice versa for the larger sized orbits. Thus the binding energy of different nucleon trajectories in the spheroidal potential well depends on the angle between the angular momentum vector and the symmetry axis of the spheroid as well as the deformation of the spheroid. A quantal nuclear model Hamiltonian is solved for assumed spheroidal-shaped potential wells. The corresponding orbits each have angular momenta $\mathbf{j}$ for which the projection of the angular momentum along the symmetry axis $\Omega$ is conserved, but the projection of $\mathbf{j}$ in the laboratory frame $j_z$ is not conserved since the potential well is not spherically symmetric. However, the total Hamiltonian is spherically symmetric in the laboratory frame, which is satisfied by allowing the deformed spheroidal potential well to rotate freely in the laboratory frame, and then $J^2$, $J_z$, and $\Omega$ all are conserved quantities. The attractive residual nucleon-nucleon pairing interaction results in pairs of nucleons being bound in time-reversed orbits $(j \times j)^0$, that is, with resultant total spin zero, in this spheroidal nuclear potential. Excitation of an even-even nucleus can break one pair and then the total projection of the angular momentum along the symmetry axis is $K = |\Omega_1 \pm \Omega_2|$, depending on whether the projections are parallel or antiparallel. More excitation energy can break several pairs and the projections continue to be additive. The binding energies calculated in the spheroidal potential well must be added to the rotational energy $E_{rot} = \frac{\mathbf{J}^2}{2\mathcal{J}}$ to get the total energy, where $\mathcal{J}$ is the moment of inertia. Nuclear structure measurements are in good agreement with the predictions of nuclear structure calculations that employ the Routhian approach.

12.10 Eﬀective gravitational force near the surface of the Earth

Consider that the translational acceleration of the center of the Earth can be neglected, and thus a set of non-rotating axes through the center of the Earth can be assumed to be approximately an inertial frame. The effects of the motion of the Earth around the Sun, or the motion of the Solar system in our Galaxy, are small compared with the effects due to the rotation of the Earth.
Consider a rotating frame attached to the surface of the earth as shown in figure 12.5. The vector with respect to the center of the Earth $\mathbf{r}$ can be decomposed into a vector to the origin of the reference frame fixed to the surface of the Earth $\mathbf{R}$ plus the vector with respect to this surface reference frame $\mathbf{r}'$.
$$\mathbf{r} = \mathbf{R} + \mathbf{r}' \tag{12.48}$$

If the external force is separated into the gravitational term $m\mathbf{g}$ plus some other physical force $\mathbf{F}$ then the acceleration in the non-inertial surface frame of reference is
$$\mathbf{a}' = \frac{\mathbf{F}}{m} + \mathbf{g} - \left(\mathbf{A} + 2\boldsymbol{\omega} \times \mathbf{v}' + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}') + \dot{\boldsymbol{\omega}} \times \mathbf{r}'\right) \tag{12.49}$$

Figure 12.5: Rotating frame at the surface of the Earth.

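But
$$\mathbf{V} = \left(\frac{d\mathbf{R}}{dt}\right)_{fixed} = \left(\frac{d\mathbf{R}}{dt}\right)_{rotating} + \boldsymbol{\omega} \times \mathbf{R} = \boldsymbol{\omega} \times \mathbf{R} \tag{12.50}$$
since in the rotating frame $\left(\frac{d\mathbf{R}}{dt}\right)_{rotating} = 0$. Also the acceleration
$$\mathbf{A} = \left(\frac{d\mathbf{V}}{dt}\right)_{fixed} = \left(\frac{d\mathbf{V}}{dt}\right)_{rotating} + \boldsymbol{\omega} \times \mathbf{V} = \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{R}) \tag{12.51}$$
since $\left(\frac{d\mathbf{V}}{dt}\right)_{rotating} = 0$. Substituting this into the above equation gives
$$\mathbf{a}' = \frac{\mathbf{F}}{m} + \mathbf{g} - \left(2\boldsymbol{\omega} \times \mathbf{v}' + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times [\mathbf{r}' + \mathbf{R}]) + \dot{\boldsymbol{\omega}} \times \mathbf{r}'\right)$$
$$= \frac{\mathbf{F}}{m} + \mathbf{g} - \left(2\boldsymbol{\omega} \times \mathbf{v}' + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}) + \dot{\boldsymbol{\omega}} \times \mathbf{r}'\right)$$
where $\mathbf{r}$ is with respect to the center of the Earth. This is as expected directly from equation 12.36. Since the angular frequency of the earth is a constant then $\dot{\boldsymbol{\omega}} \times \mathbf{r}' = 0$. Thus the acceleration can be written as
$$\mathbf{a}' = \frac{\mathbf{F}}{m} + \left[\mathbf{g} - \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r})\right] - 2\boldsymbol{\omega} \times \mathbf{v}' \tag{12.52}$$
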
The term in the square brackets combines the gravitational acceleration plus the centrifugal acceleration.
A measurement of the Earth's gravitational acceleration actually measures the term in the square brackets in equation 12.52, that is, an effective gravitational acceleration where

$$\mathbf{g}_{eff} = \mathbf{g} - \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}) \tag{12.53}$$

near the surface of the earth $\mathbf{r} \approx \mathbf{R}$. The effective gravitational force does not point towards the center of the Earth as shown in figure 12.6. A plumb line points, or an object falls, in the direction of $\mathbf{g}_{eff}$. The shape of the earth is such that the Earth's surface is perpendicular to $\mathbf{g}_{eff}$. This is the reason why the earth is distorted into an oblate ellipsoid, that is, it is flattened at the poles.

Figure 12.6: Effective gravitational acceleration.

The angle $\alpha$ between $\mathbf{g}_{eff}$ and the line pointing to the center of the earth is dependent on the latitude $\lambda = \frac{\pi}{2} - \theta$. Note that the colatitude $\theta$ is taken to be zero at the North pole whereas the latitude $\lambda$ is taken to be zero at the equator. The angle $\alpha$ can be estimated by assuming that $r' \ll R$; the centrifugal term then can be approximated by

$$|\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r})| \approx \omega^2 R \sin\theta = \omega^2 R \cos\lambda \tag{12.54}$$

This is quite small for the Earth since $\omega = 0.73 \times 10^{-4}$ rad/s and $R = 6371$ km, leading to a correction term $\omega^2 R \cos\lambda = 0.03 \cos\lambda$ m/s$^2$. Since

$$(g_{eff})_{horizontal} = \omega^2 R \cos\lambda \sin\lambda \tag{12.55}$$

and

$$(g_{eff})_{vertical} = g - \omega^2 R \cos^2\lambda \tag{12.56}$$
then the angle $\alpha$ between $\mathbf{g}_{eff}$ and $\mathbf{g}$ is given by

$$\alpha \simeq \tan\alpha = \frac{(g_{eff})_{horizontal}}{(g_{eff})_{vertical}} = \frac{\omega^2 R \cos\lambda \sin\lambda}{g - \omega^2 R \cos^2\lambda} \tag{12.57}$$

This has a maximum value at $\lambda = 45^\circ$, where $\alpha \approx 1.7 \times 10^{-3}$ rad, that is, $\alpha \approx 0.1^\circ$.
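
As a numerical check of the deviation angle between the effective and true gravitational accelerations, the sketch below evaluates the ratio of the horizontal to vertical components of the effective acceleration using the values of $\omega$ and $R$ quoted above. The value $g = 9.81$ m/s$^2$ and the function name are assumptions made for illustration.

```python
import math

OMEGA = 0.73e-4   # Earth's rotation rate (rad/s), value quoted in the text
R = 6371.0e3      # Earth's radius (m), value quoted in the text
G = 9.81          # gravitational acceleration (m/s^2), assumed value

def plumb_line_deviation_deg(lat_deg):
    """Angle (degrees) between g_eff and g at latitude lat_deg."""
    lam = math.radians(lat_deg)
    horizontal = OMEGA**2 * R * math.cos(lam) * math.sin(lam)
    vertical = G - OMEGA**2 * R * math.cos(lam)**2
    return math.degrees(math.atan2(horizontal, vertical))

if __name__ == "__main__":
    for lat in (0, 15, 30, 45, 60, 75, 90):
        print(f"latitude {lat:2d} deg: deviation {plumb_line_deviation_deg(lat):.4f} deg")
```

Under these assumptions the deviation vanishes at the equator and at the poles and peaks near $45^\circ$ latitude at roughly a tenth of a degree.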


12.11 Free motion on the earth

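The calculation of trajectories for objects as they move near the surface of the earth is frequently required for many applications. Such calculations require inclusion of the non-inertial Coriolis force. In the frame of reference fixed to the earth's surface, assuming that air resistance and other forces can be neglected, then the acceleration equals

$$\mathbf{a}' = \mathbf{g}_{eff} - 2\boldsymbol{\omega} \times \mathbf{v}' \tag{12.58}$$

Neglect the centrifugal correction term since it is very small, that is, let $\mathbf{g}_{eff} = \mathbf{g}$. Using the coordinate axes shown in figure 12.7, the surface-frame vectors have components

$$\boldsymbol{\omega} = 0\,\hat{\mathbf{i}}' + \omega \cos\lambda \,\hat{\mathbf{j}}' + \omega \sin\lambda \,\hat{\mathbf{k}}' \tag{12.59}$$

and
$$\mathbf{g}_{eff} = -g\,\hat{\mathbf{k}}' \tag{12.60}$$
Thus the Coriolis term is
$$2\boldsymbol{\omega} \times \mathbf{v}' = 2 \begin{vmatrix} \hat{\mathbf{i}}' & \hat{\mathbf{j}}' & \hat{\mathbf{k}}' \\ 0 & \omega \cos\lambda & \omega \sin\lambda \\ \dot{x}' & \dot{y}' & \dot{z}' \end{vmatrix}$$
$$= 2\left[\left(\dot{z}' \omega \cos\lambda - \dot{y}' \omega \sin\lambda\right)\hat{\mathbf{i}}' + \left(\dot{x}' \omega \sin\lambda\right)\hat{\mathbf{j}}' - \left(\dot{x}' \omega \cos\lambda\right)\hat{\mathbf{k}}'\right]$$

Figure 12.7: Rotating frame fixed on the surface of the Earth. The axes are $x$ (East), $y$ (North), and $z$ (vertical).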

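Therefore the equations of motion are

$$\ddot{\mathbf{r}}' = -g\,\hat{\mathbf{k}}' - 2\left[\hat{\mathbf{i}}'\left(\dot{z}' \omega \cos\lambda - \dot{y}' \omega \sin\lambda\right) + \hat{\mathbf{j}}'\, \dot{x}' \omega \sin\lambda - \hat{\mathbf{k}}'\, \dot{x}' \omega \cos\lambda\right] \tag{12.61}$$

That is, the components of this equation of motion are

$$\ddot{x}' = -2\omega\left(\dot{z}' \cos\lambda - \dot{y}' \sin\lambda\right) \tag{12.62}$$
$$\ddot{y}' = -2\omega \dot{x}' \sin\lambda$$
$$\ddot{z}' = -g + 2\omega \dot{x}' \cos\lambda$$

Integrating these differential equations gives

$$\dot{x}' = -2\omega\left(z' \cos\lambda - y' \sin\lambda\right) + \dot{x}'_0 \tag{12.63}$$
$$\dot{y}' = -2\omega x' \sin\lambda + \dot{y}'_0$$
$$\dot{z}' = -gt + 2\omega x' \cos\lambda + \dot{z}'_0$$

where $\dot{x}'_0, \dot{y}'_0, \dot{z}'_0$ are the initial velocities. Substituting the above velocity relations into the equation of motion for $\ddot{x}'$ gives
$$\ddot{x}' = 2\omega g t \cos\lambda - 2\omega\left(\dot{z}'_0 \cos\lambda - \dot{y}'_0 \sin\lambda\right) - 4\omega^2 x' \tag{12.64}$$
The last term $4\omega^2 x'$ is small and can be neglected, leading to a simple uncoupled second-order differential equation in $x'$. Integrating this twice, assuming that $x'_0 = y'_0 = z'_0 = 0$ plus the fact that $2\omega g \cos\lambda$ and $2\omega\left(\dot{z}'_0 \cos\lambda - \dot{y}'_0 \sin\lambda\right)$ are constant, gives
$$x' = \frac{1}{3}\omega g t^3 \cos\lambda - \omega t^2\left(\dot{z}'_0 \cos\lambda - \dot{y}'_0 \sin\lambda\right) + \dot{x}'_0 t \tag{12.65}$$
Similarly,
$$y' = \dot{y}'_0 t - \omega \dot{x}'_0 t^2 \sin\lambda \tag{12.66}$$
$$z' = -\frac{1}{2} g t^2 + \dot{z}'_0 t + \omega \dot{x}'_0 t^2 \cos\lambda \tag{12.67}$$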
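
The quality of the first-order approximation made above, in which terms of order $\omega^2$ are dropped, can be checked by integrating the coupled component equations numerically. The following is a minimal sketch using a hand-written fourth-order Runge-Kutta step. The constants, function names, and step count are assumptions chosen for illustration, with $x$ East, $y$ North, $z$ vertical, and the starting point taken as the origin.

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate (rad/s), assumed value
G = 9.81          # gravitational acceleration (m/s^2), assumed value

def deriv(state, lat):
    """Time derivative of (x, y, z, vx, vy, vz) for g minus the Coriolis term."""
    _, _, _, vx, vy, vz = state
    c, s = math.cos(lat), math.sin(lat)
    ax = -2.0 * OMEGA * (vz * c - vy * s)
    ay = -2.0 * OMEGA * vx * s
    az = -G + 2.0 * OMEGA * vx * c
    return (vx, vy, vz, ax, ay, az)

def rk4_step(state, dt, lat):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(si + h * ki for si, ki in zip(state, k))
    k1 = deriv(state, lat)
    k2 = deriv(shifted(k1, dt / 2.0), lat)
    k3 = deriv(shifted(k2, dt / 2.0), lat)
    k4 = deriv(shifted(k3, dt), lat)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(v0, lat, t_end, steps=2000):
    """Numerically integrate from the origin with initial velocity v0."""
    state = (0.0, 0.0, 0.0) + tuple(v0)
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, dt, lat)
    return state[:3]

def closed_form(v0, lat, t):
    """First-order displacements with the omega^2 terms neglected."""
    vx0, vy0, vz0 = v0
    c, s = math.cos(lat), math.sin(lat)
    x = OMEGA * G * t**3 * c / 3.0 - OMEGA * t**2 * (vz0 * c - vy0 * s) + vx0 * t
    y = vy0 * t - OMEGA * vx0 * t**2 * s
    z = -0.5 * G * t**2 + vz0 * t + OMEGA * vx0 * t**2 * c
    return (x, y, z)

if __name__ == "__main__":
    lat = math.radians(45.0)
    t_fall = math.sqrt(2.0 * 100.0 / G)  # time to fall 100 m from rest
    print("numeric    :", integrate((0.0, 0.0, 0.0), lat, t_fall))
    print("closed form:", closed_form((0.0, 0.0, 0.0), lat, t_fall))
```

For flight times of a few seconds the two results should agree closely, since the neglected terms are smaller than the retained Coriolis terms by a factor of order $\omega t$.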

Consider the following special cases:

12.6 Example: Free fall from rest

Assume that an object falls a height $h$ starting from rest at $t = 0$, $x' = 0$, $y' = 0$, $z' = h$. Then
$$x' = \frac{1}{3}\omega g t^3 \cos\lambda$$
$$y' = 0$$
$$z' = h - \frac{1}{2} g t^2$$
Substituting for $t$ gives
$$x' = \frac{1}{3}\omega \cos\lambda \sqrt{\frac{8h^3}{g}}$$
Thus the object drifts eastward as a consequence of the earth's rotation. Note that, relative to the fixed frame, the angular velocity of the body must increase as it falls to compensate for the reduced distance from the axis of rotation in order to ensure that the angular momentum is conserved.

12.7 Example: Projectile fired vertically upwards

An upward fired projectile with initial velocities $\dot{x}'_0 = \dot{y}'_0 = 0$ and $\dot{z}'_0 = v_0$ leads to the relations
$$x' = \frac{1}{3}\omega g t^3 \cos\lambda - \omega t^2 v_0 \cos\lambda$$
$$y' = 0$$
$$z' = -\frac{1}{2} g t^2 + v_0 t$$
Solving for $t$ when $z' = 0$ gives $t = 0$ and $t = \frac{2v_0}{g}$. Also since the maximum height $h$ that the projectile reaches is related by
$$v_0 = \sqrt{2gh}$$
then the final deflection is
$$x' = -\frac{4}{3}\omega \cos\lambda \sqrt{\frac{8h^3}{g}}$$
Thus the body drifts westwards.
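
The intermediate algebra behind the final deflection of the vertically fired projectile can be sketched as follows, assuming the first-order solution for the eastward displacement $x'$ with initial vertical speed $v_0$ at latitude $\lambda$, evaluated at the landing time $t = 2v_0/g$:

```latex
\begin{align*}
x' &= \omega t^2 \cos\lambda \left(\frac{g t}{3} - v_0\right)
    = \omega \cos\lambda \,\frac{4 v_0^2}{g^2}\left(\frac{2 v_0}{3} - v_0\right)
    = -\frac{4}{3}\,\omega \cos\lambda \,\frac{v_0^3}{g^2} \\
\frac{v_0^3}{g^2} &= \frac{(2 g h)^{3/2}}{g^2} = \sqrt{\frac{8 h^3}{g}}
\quad\Longrightarrow\quad
x' = -\frac{4}{3}\,\omega \cos\lambda \sqrt{\frac{8 h^3}{g}}
\end{align*}
```

The magnitude is four times the eastward drift of an object dropped from rest through the same height $h$, and the sign is opposite, that is, the drift is westward.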

12.8 Example: Motion parallel to Earth’s surface

For motion in the horizontal $x' - y'$ plane the deflection is always to the right in the northern hemisphere of the Earth since the vertical component of $\boldsymbol{\omega}$ is upwards and thus $-2\boldsymbol{\omega} \times \mathbf{v}'$ points to the right. In the southern hemisphere the vertical component of $\boldsymbol{\omega}$ is downward and thus $-2\boldsymbol{\omega} \times \mathbf{v}'$ points to the left. This is also shown using the above relations for the case of a projectile fired upwards in an easterly direction with velocity components $(\dot{x}'_0, 0, \dot{z}'_0)$. The resultant displacements are
$$x' = \frac{1}{3}\omega g t^3 \cos\lambda - \omega t^2 \dot{z}'_0 \cos\lambda + \dot{x}'_0 t$$
Similarly,
$$y' = -\omega \dot{x}'_0 t^2 \sin\lambda$$
$$z' = -\frac{1}{2} g t^2 + \dot{z}'_0 t + \omega \dot{x}'_0 t^2 \cos\lambda$$
The trajectory is non-planar and, in the northern hemisphere, the projectile drifts to the right, that is, southerly.
In the Battle of the River Plate, during World War 2, the gunners on the British cruisers Exeter, Ajax and Achilles found that their accurately aimed salvos against the German pocket battleship Graf Spee were falling 100 yards to the left. The designers of the gun sighting mechanisms had corrected for the Coriolis effect assuming the ships would fight at latitudes near $50^\circ$ north, not $50^\circ$ south.

12.12 Weather systems

Weather systems on Earth provide a classic example of motion in a rotating coordinate system. In the
northern hemisphere, air flowing into a low-pressure region is deflected to the right causing counterclockwise
circulation, whereas air flowing out of a high-pressure region is deflected to the right causing a clockwise
circulation. Trade winds on the Earth result from air rising or sinking due to thermal activity combined
with the Coriolis effect. Similar behavior is observed on other planets such as the Red Spot on Jupiter.
For a fluid or gas, equation (12.36) can be written in terms of the fluid density $\rho$ in the form
$$\rho \mathbf{a}'' = -\nabla P - \rho\left[2\boldsymbol{\omega} \times \mathbf{v}'' + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}')\right] \tag{12.68}$$
where the translational acceleration $\mathbf{A}$, the gravitational force, and the azimuthal acceleration $(\dot{\boldsymbol{\omega}} \times \mathbf{r}')$ terms are ignored. The external force per unit volume equals the pressure gradient $-\nabla P$ while $\boldsymbol{\omega}$ is the rotation vector of the earth.
In fluid flow, the Rossby number $R_o$ is defined to be
$$R_o = \frac{\text{inertial force}}{\text{Coriolis force}} \approx \frac{a''}{|2\boldsymbol{\omega} \times \mathbf{v}''|} \tag{12.69}$$
For large dimensional pressure systems in the atmosphere, e.g. $L \simeq 1000$ km, the Rossby number is $R_o \sim 0.1$ and thus the Coriolis force dominates and the radial acceleration can be neglected. This leads to a flow velocity $v \simeq 10$ m/s which is perpendicular to the pressure gradient $\nabla P$, that is, the air flows horizontally parallel to the isobars of constant pressure, which is called geostrophic flow. For much smaller dimension systems, such as at the wall of a hurricane, $L \simeq 50$ km and $v \simeq 50$ m/s, the Rossby number is $R_o \simeq 10$ and the Coriolis effect plays a much less significant role compared to the balance between the radial centrifugal forces and the pressure gradient. The same situation of the Coriolis forces being insignificant occurs for most small-scale vortices such as tornadoes, typical thermal vortices in the atmosphere, and for water draining a bath tub.
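
The Rossby-number estimates quoted above can be reproduced with a short calculation. This sketch assumes the common scaling in which the inertial acceleration is of order $v^2/L$ and the Coriolis acceleration is $2\omega v \sin\lambda$, so that $R_o \approx v / (2\omega L \sin\lambda)$. The latitude of $45^\circ$, the value of $\omega$, and the function name are illustrative assumptions rather than values from the text.

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate (rad/s), assumed value

def rossby_number(speed, length, lat_deg):
    """Estimate Ro = v / (2 * omega * sin(latitude) * L).

    speed in m/s, length in m, latitude in degrees. Assumes the inertial
    acceleration scales as v**2 / L.
    """
    coriolis_parameter = 2.0 * OMEGA * math.sin(math.radians(lat_deg))
    return speed / (coriolis_parameter * length)

if __name__ == "__main__":
    # Large-scale pressure system: L ~ 1000 km, v ~ 10 m/s
    print("synoptic system:", rossby_number(10.0, 1.0e6, 45.0))
    # Hurricane eye wall: L ~ 50 km, v ~ 50 m/s
    print("hurricane wall :", rossby_number(50.0, 5.0e4, 45.0))
```

The first case gives a value of order 0.1 and the second a value of order 10, consistent with the two regimes described in the text.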

12.12.1 Low-pressure systems:

It is interesting to analyze the motion of air circulating around a low pressure region at large radii where the motion is tangential. As shown in figure 12.8, a parcel of air circulating anticlockwise around the low with velocity $v$ involves a pressure difference $\Delta P$ acting on the surface area $S$ plus the centrifugal and Coriolis forces. Assuming that these forces are balanced such that $a'' \simeq 0$ then equation 12.68 simplifies to
$$\frac{v^2}{r} = \frac{1}{\rho}\nabla P - 2v\omega \sin\lambda \tag{12.70}$$
where the latitude $\lambda = \frac{\pi}{2} - \theta$. Thus the force equation can be written
$$\frac{1}{\rho}\frac{dP}{dr} = \frac{v^2}{r} + 2v\omega \sin\lambda \tag{12.71}$$
It is apparent that the combined outward Coriolis force plus outward centrifugal force, acting on the circulating air, can support a large pressure gradient.
The tangential velocity $v$ can be obtained by solving this equation to give
$$v = \sqrt{(r\omega \sin\lambda)^2 + \frac{r}{\rho}\frac{dP}{dr}} - r\omega \sin\lambda \tag{12.72}$$

Figure 12.8: Air flow and pressures around a low-pressure region.

Note that the velocity equals zero when  = 0 assuming that  is finite. That is, the velocity reaches a
maximum at a radius
1 1 
 = (1 + ) (12.73)
4  sin  

Figure 12.9: Hurricane Katrina over the Gulf of Mexico on 28 August 2005. [Published by the NOAA]

which occurs at the wall of the eye of the circulating low-pressure system.
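
The tangential wind speed around a low can be evaluated directly from the balance between the pressure gradient, the centrifugal term, and the Coriolis term. The sketch below solves that quadratic balance for the speed. The air density of 1.2 kg/m$^3$, the sample radius, pressure gradient, and latitude, and the function name are illustrative assumptions.

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate (rad/s), assumed value

def gradient_wind_low(r, dpdr, lat_deg, rho=1.2):
    """Tangential speed (m/s) around a low-pressure centre.

    Solves (1/rho) dP/dr = v**2 / r + 2 v omega sin(latitude) for v >= 0.
    r in m, dpdr (radial pressure gradient) in Pa/m, rho in kg/m^3.
    """
    a = r * OMEGA * math.sin(math.radians(lat_deg))
    return math.sqrt(a * a + r * dpdr / rho) - a

if __name__ == "__main__":
    # Illustrative case: 50 mb over 100 km (0.05 Pa/m) at r = 50 km, latitude 25 deg
    print(gradient_wind_low(5.0e4, 0.05, 25.0))
```

Because both the centrifugal and Coriolis terms oppose the inward pressure force, a real positive solution exists for any pressure gradient, which is why lows can sustain steep gradients and strong winds.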
Low pressure regions are produced by heating of air causing it to rise and resulting in an inflow of air to replace the rising air. Hurricanes form over warm water when the temperature exceeds $26^\circ$C and the moisture levels are above average. They are created at latitudes between $10^\circ$ and $15^\circ$ where the sea is warmest, but not closer to the equator where the Coriolis force drops to zero. About 90% of the heating of the air comes from the latent heat of vaporization due to the rising warm moist air condensing into water droplets in the cloud, similar to what occurs in thunderstorms. For hurricanes in the northern hemisphere, the air circulates anticlockwise inwards. Near the wall of the eye of the hurricane, the air rises rapidly to high altitudes at which it then flows clockwise and outwards and subsequently back down in the outer reaches of the hurricane. Both the wind velocity and pressure are low inside the eye, which can be cloud free. The strongest winds are in the vortex surrounding the eye of the hurricane, while weak winds exist in the counter-rotating vortex of sinking air that occurs far outside the hurricane.
Figure 129 shows the satellite picture of the hurricane Katrina, recorded on 28 August 2005. The eye of
the hurricane is readily apparent in this picture. The central pressure was 902002 (902) compared
with the standard atmospheric pressure of 1013002 (1013). This 111 pressure diﬀerence produced
steady winds in Katrina of 280 ( 175) with gusts up to 344 which resulted in 1833 fatalities.
Tornadoes are another example of a vortex low-pressure system that are the opposite extreme in both
size and duration compared with a hurricane. Tornadoes may last only ∼ 10 minutes and be quite small in
radius. Pressure drops of up to 100 have been recorded, but since they may only be a few 100 meters in
diameter, the pressure gradient can be much higher than for hurricanes leading to localized winds thought to
approach 500. Unfortunately, the instrumentation and buildings hit by a tornado often are destroyed
making study diﬃcult. Note that the the pressure gradient in small diameter of rope tornadoes is much
more destructive than for larger 14 mile diameter tornadoes, which results in stronger winds.

12.12.2 High-pressure systems:

In contrast to low-pressure systems, high-pressure systems are very different in that the Coriolis force points inward, opposing the outward pressure gradient and centrifugal force. That is,
$$\frac{v^2}{r} = 2v\omega \sin\lambda - \frac{1}{\rho}\frac{dP}{dr} \tag{12.74}$$
which gives that
$$v = r\omega \sin\lambda - \sqrt{(r\omega \sin\lambda)^2 - \frac{r}{\rho}\frac{dP}{dr}} \tag{12.75}$$
This implies that the maximum pressure gradient plus centrifugal force supported by the Coriolis force is
$$\frac{r}{\rho}\frac{dP}{dr} \leq (r\omega \sin\lambda)^2 \tag{12.76}$$
As a consequence, high pressure regions tend to have weak pressure gradients and light winds in contrast to the large pressure gradients plus concomitant damaging winds possible for low pressure systems such as hurricanes or tornadoes.
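
The limit on the pressure gradient around a high can be illustrated numerically. The sketch below solves the high-pressure force balance for the tangential speed and returns no solution when the pressure gradient exceeds the bound set by the Coriolis term. The air density of 1.2 kg/m$^3$, the sample radius and latitude, and the function names are illustrative assumptions.

```python
import math

OMEGA = 7.292e-5  # Earth's rotation rate (rad/s), assumed value

def max_pressure_gradient_high(r, lat_deg, rho=1.2):
    """Largest dP/dr (Pa/m) that the Coriolis force can balance at radius r."""
    return rho * r * (OMEGA * math.sin(math.radians(lat_deg)))**2

def gradient_wind_high(r, dpdr, lat_deg, rho=1.2):
    """Tangential speed (m/s) around a high-pressure centre.

    Solves v**2 / r = 2 v omega sin(latitude) - (1/rho) dP/dr and takes the
    smaller root. Returns None if no real solution exists.
    """
    a = r * OMEGA * math.sin(math.radians(lat_deg))
    discriminant = a * a - r * dpdr / rho
    if discriminant < 0:
        return None
    return a - math.sqrt(discriminant)

if __name__ == "__main__":
    r, lat = 5.0e5, 45.0  # 500 km from the centre, 45 degrees latitude
    g_max = max_pressure_gradient_high(r, lat)
    print("maximum gradient (Pa/m):", g_max)
    print("speed at half the maximum:", gradient_wind_high(r, 0.5 * g_max, lat))
    print("speed at twice the maximum:", gradient_wind_high(r, 2.0 * g_max, lat))
```

The maximum gradient in this example is of order $10^{-3}$ Pa/m, far below the gradients found in hurricanes, which is consistent with the light winds typical of high-pressure regions.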
The circulation behavior, exhibited by weather patterns, also applies to ocean currents and other liquid
flow on earth. However, the residual angular momentum of the liquid often can overcome the Coriolis terms.
Thus often it will be found experimentally that water exiting the bathtub does not circulate anticlockwise in
the northern hemisphere as predicted by the Coriolis force. This is because it was not stationary originally,
but rotating slowly.
Reliable prediction of weather is an extremely difficult, complicated and challenging task, which is of considerable importance in modern life. As discussed in chapter 16.8, fluid flow can be much more complicated than assumed in this discussion of air flow and weather. Both turbulent and laminar flow are possible. As a
consequence, computer simulations of weather phenomena are difficult because the air flow can be turbulent
and the transition from order to chaotic flow is very sensitive to the initial conditions. Typically the air
flow can involve both macroscopic ordered coherent structures over a wide dynamic range of dimensions,
coexisting with chaotic regions. Computer simulations of fluid flow often are performed based on Lagrangian
mechanics to exploit the scalar properties of the Lagrangian. Ordered coherent structures, ranging from
microscopic bubbles to hurricanes, can be recognized by exploiting Lyapunov exponents to identify the or-
dered motion buried in the underlying chaos. Thus the techniques discussed in classical mechanics are of
considerable importance outside of physics.

12.13 Foucault pendulum

A classic example of motion in non-inertial frames is the rotation of the Foucault pendulum on the surface of the earth. The Foucault pendulum is a spherical pendulum with a long suspension that oscillates in the $x - y$ plane with sufficiently small amplitude that the vertical velocity $\dot{z}$ is negligible. Assume that the pendulum is a simple pendulum of length $l$ and mass $m$ as shown in figure 12.10. The equation of motion is given by
$$\ddot{\mathbf{r}} = \mathbf{g} + \frac{\mathbf{T}}{m} - 2\boldsymbol{\Omega} \times \dot{\mathbf{r}} \tag{12.77}$$
where $\frac{\mathbf{T}}{m}$ is the acceleration produced by the tension in the pendulum suspension and the rotation vector of the earth is designated by $\boldsymbol{\Omega}$ to avoid confusion with the oscillation frequency of the pendulum $\omega$. The effective gravitational acceleration $\mathbf{g}$ is given by
$$\mathbf{g} = \mathbf{g}_0 - \boldsymbol{\Omega} \times \left[\boldsymbol{\Omega} \times (\mathbf{r} + \mathbf{R})\right] \tag{12.78}$$
that is, the true gravitational field $\mathbf{g}_0$ corrected for the centrifugal force.

Figure 12.10: Foucault pendulum.

Assume the small angle approximation for the pendulum deflection angle $\beta$, then $T_z = T \cos\beta \simeq T$ and $T_z = mg$, thus $T \simeq mg$. Then, as shown in figure 12.10, the horizontal components of the restoring force are

$$T_x = -mg\frac{x}{l} \tag{12.79}$$

$$T_y = -mg\frac{y}{l} \tag{12.80}$$

Since $\mathbf{g}$ is vertical, and neglecting terms involving $\dot{z}$, evaluating the cross product in equation (12.77) simplifies to

$$\ddot{x} = -g\frac{x}{l} + 2\dot{y}\,\Omega \cos\theta \tag{12.81}$$

$$\ddot{y} = -g\frac{y}{l} - 2\dot{x}\,\Omega \cos\theta \tag{12.82}$$

where $\theta$ is the colatitude which is related to the latitude $\lambda$ by

$$\cos\theta = \sin\lambda \tag{12.83}$$

The natural angular frequency of the simple pendulum is

$$\omega_0 = \sqrt{\frac{g}{l}} \tag{12.84}$$

while the $z$ component of the earth's angular velocity is

$$\Omega_z = \Omega \cos\theta \tag{12.85}$$

Thus equations 12.81 and 12.82 can be written as

$$\ddot{x} - 2\Omega_z \dot{y} + \omega_0^2 x = 0$$
$$\ddot{y} + 2\Omega_z \dot{x} + \omega_0^2 y = 0 \tag{12.86}$$

These are two coupled equations that can be solved by making a coordinate transformation.
Define a new coordinate that is a complex number

$$\eta = x + iy \tag{12.87}$$

Multiplying the second of the coupled equations 12.86 by $i$ and adding it to the first equation gives

$$(\ddot{x} + i\ddot{y}) + 2i\Omega_z(\dot{x} + i\dot{y}) + \omega_0^2(x + iy) = 0$$

which can be written as a differential equation for $\eta$

$$\ddot{\eta} + 2i\Omega_z \dot{\eta} + \omega_0^2 \eta = 0 \tag{12.88}$$

Note that the complex number $\eta$ contains the same information regarding the position in the $x - y$ plane as equations 12.86. The plot of $\eta$ in the complex plane, the Argand diagram, is a birds-eye view of the position coordinates $(x, y)$ of the pendulum. This second-order homogeneous differential equation has two independent solutions that can be derived by guessing a solution of the form

$$\eta(t) = A e^{-i\alpha t} \tag{12.89}$$

Substituting equation 12.89 into 12.88 gives that

$$\alpha^2 - 2\Omega_z \alpha - \omega_0^2 = 0$$

That is
$$\alpha = \Omega_z \pm \sqrt{\Omega_z^2 + \omega_0^2} \tag{12.90}$$

If the angular velocity of the pendulum $\omega_0 \gg \Omega$, then

$$\alpha \simeq \Omega_z \pm \omega_0 \tag{12.91}$$
Thus the solution is of the form
$$\eta(t) = e^{-i\Omega_z t}\left(A_+ e^{i\omega_0 t} + A_- e^{-i\omega_0 t}\right) \tag{12.92}$$
This can be written as
$$\eta(t) = A e^{-i\Omega_z t} \cos(\omega_0 t + \delta) \tag{12.93}$$
where the phase $\delta$ and amplitude $A$ depend on the initial conditions. Thus the plane of oscillation of the pendulum is defined by the ratio of the $x$ and $y$ coordinates, that is, the phase angle $\Omega_z t$. This phase angle rotates with angular velocity $\Omega_z$ where

$$\Omega_z = \Omega \cos\theta = \Omega \sin\lambda \tag{12.94}$$

At the north pole the earth rotates under the pendulum with angular velocity $\Omega$ and the axis of the pendulum is fixed in an inertial frame of reference. At lower latitudes, the pendulum precesses at the lower angular frequency $\Omega_z = \Omega \sin\lambda$ that goes to zero at the equator. For example, in Rochester, NY, $\lambda = 43^\circ$ N and therefore a Foucault pendulum precesses at $\Omega_z = 0.682\Omega$. That is, the pendulum precesses $245.5^\circ$/day.
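
The latitude dependence of the precession rate is easy to tabulate. The following sketch assumes one full rotation of the Earth ($360^\circ$) per day, as in the Rochester example above; the function names are hypothetical.

```python
import math

def foucault_precession_deg_per_day(lat_deg):
    """Precession of the oscillation plane in degrees per day at a given latitude.

    Uses Omega_z = Omega * sin(latitude) with Omega taken as 360 degrees per day.
    """
    return 360.0 * math.sin(math.radians(lat_deg))

def foucault_period_hours(lat_deg):
    """Hours for the oscillation plane to complete one full turn (inf at the equator)."""
    rate = foucault_precession_deg_per_day(lat_deg)
    return math.inf if rate == 0 else 24.0 * 360.0 / abs(rate)

if __name__ == "__main__":
    for lat in (0.0, 30.0, 43.0, 90.0):
        print(lat, foucault_precession_deg_per_day(lat), foucault_period_hours(lat))
```

At a latitude of $30^\circ$ the plane of oscillation takes two days to complete a full turn, while at the pole it takes one day.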

12.14 Summary
This chapter has focussed on describing motion in non-inertial frames of reference. It has been shown that the
force and acceleration in non-inertial frames can be related using either Newtonian or Lagrangian mechanics
by introducing additional inertial forces in the non-inertial reference frame.

Translational acceleration of a reference frame In a primed frame that is undergoing translational acceleration $\mathbf{A}$, the motion in this non-inertial frame can be calculated by addition of an inertial force $-m\mathbf{A}$, that leads to an equation of motion
$$m\mathbf{a}' = \mathbf{F} - m\mathbf{A} \tag{12.6}$$
Note that the primed frame is an inertial frame if $\mathbf{A} = 0$.

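Rotating reference frame It was shown that the time derivatives of a general vector $\mathbf{G}$ in both an inertial frame and a rotating reference frame are related by
$$\left(\frac{d\mathbf{G}}{dt}\right)_{fixed} = \left(\frac{d\mathbf{G}}{dt}\right)_{rotating} + \boldsymbol{\omega} \times \mathbf{G} \tag{12.16}$$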

where the ω × G term originates from the fact that the unit vectors in the rotating reference frame are time
dependent with respect to the inertial frame.

Reference frame undergoing both rotation and translation Both Newtonian and Lagrangian mechanics were used to show that for the case of translational acceleration plus rotation, the effective force in the non-inertial (double-primed) frame can be written as

$$\mathbf{F}_{eff} = m\mathbf{a}'' = \mathbf{F} - m\left(\mathbf{A} + \boldsymbol{\omega} \times \mathbf{V} + 2\boldsymbol{\omega} \times \mathbf{v}'' + \boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}') + \dot{\boldsymbol{\omega}} \times \mathbf{r}'\right) \tag{12.28, 12.36}$$

These inertial correction forces result from describing the system using a non-inertial frame. These inertial forces are felt when in the rotating-translating frame of reference. Thus the notion of these inertial forces can be very useful for solving problems in non-inertial frames. For the case of rotating frames, two important inertial forces are the centrifugal force, $-m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}')$, and the Coriolis force $-2m\boldsymbol{\omega} \times \mathbf{v}''$.

Routhian reduction for rotating systems It was shown that for non-inertial systems, identical equa-
tions of motion are derived using Newtonian, Lagrangian, Hamiltonian, and Routhian mechanics.

Terrestrial manifestations of rotation Examples of motion in rotating frames presented in the chapter
included projectile motion with respect to the surface of the Earth, rotation alignment of nucleons in rotating
nuclei, and weather phenomena.

Workshop exercises
1. Consider a fixed reference frame $S$ and a rotating frame $S'$. The origins of the two coordinate systems always coincide. By carefully drawing a diagram, derive an expression relating the coordinates of a point $P$ in the two systems. (This was covered in Chapter 2, but it is worth reviewing now.)

2. The effective force observed in a rotating coordinate system is given by equation 12.28.
(a) What is the significance of each term in this expression?
(b) Suppose you wanted to measure the gravitational force, both magnitude and direction, on a body of mass $m$ at rest on the surface of the Earth. What terms in the effective force can be neglected?
(c) Suppose you wanted to calculate the deflection of a projectile fired horizontally along the Earth's surface. What terms in the effective force can be neglected?
(d) Suppose you wanted to calculate the effective force on a small block of mass $m$ placed on a frictionless turntable rotating with a time-dependent angular velocity $\omega(t)$. What terms in the effective force can be neglected?

3. A plumb line is carried along in a moving train, with $m$ the mass of the plumb bob. Neglect any effects due to the rotation of the Earth and work in the noninertial frame of reference of the train.

(a) Find the tension in the cord and the deflection from the local vertical if the train is moving with constant acceleration $a_0$.
(b) Find the tension in the cord and the deflection from the local vertical if the train is rounding a curve of radius $\rho$ with constant speed $v_0$.

4. A bead on a rotating rod is free to slide without friction. The rod has a length $L$ and rotates about its end with angular velocity $\omega$. The bead is initially released from rest (relative to the rod) at the midpoint of the rod.

(a) Find the displacement of the bead along the rod as a function of time.
(b) Find the time when the bead leaves the end of the rod.
(c) Find the velocity (relative to the rod) of the bead when it leaves the end of the rod.

5. Here is a "thought experiment" for you to consider. Suppose you are in a small sailboat of mass $M$ at the Earth's equator. At the equator there is very little wind (this is known as the "equatorial doldrums"), so your sailboat is, more or less, sitting still. You have a small anchor of mass $m$ on deck and a single mast of height $h$ in the middle of the boat. How can you use the anchor to put the boat into motion? In which direction will the boat move?

6. Does water really flow in the other direction when you flush a toilet in the southern hemisphere? What (if anything) does the Coriolis force have to do with this?

7. We are presently at a latitude $\lambda$ (with respect to the equator) and Earth is rotating with constant angular velocity $\omega$. Consider the following two scenarios: Scenario A: A particle is thrown upward with initial speed $v_0$. Scenario B: An identical particle is dropped (at rest) from the maximum height of the particle in Scenario A. Circle all the true statements regarding the Coriolis deflection, assuming that the particles have landed for (a) and (b).

(a) The magnitude is greater in A than in B.
(b) The direction in A and B are the same.
(c) The direction in A does not change throughout flight.

Problems

1. If a projectile is fired due east from a point on the surface of the Earth at a northern latitude $\lambda$ with a velocity of magnitude $V_0$ and at an inclination to the horizontal of $\alpha$, show that the lateral deflection when the projectile strikes the Earth is
$$d = \frac{4V_0^3}{g^2}\,\omega \sin\lambda \sin^2\alpha \cos\alpha \tag{12.95}$$
where $\omega$ is the rotation frequency of the Earth.

2. Obtain an expression for the angular deviation of a particle projected from the North Pole in a path that lies close to the surface of the earth. Is the deviation significant for a missile that makes a 4800-km flight in 10 minutes? What is the "miss distance" if the missile is aimed directly at the target? Is the miss difference greater for a 19300-km flight at the same velocity?

3. An automobile drag racer drives a car with acceleration $a$ and instantaneous velocity $v$. The tires of radius $r_0$ are not slipping. Derive which point on the tire has the greatest acceleration relative to the ground. What is this acceleration?

4. Shot towers were popular in the eighteenth and nineteenth centuries for dropping melted lead down tall towers to form spheres for bullets. The lead solidified while falling and often landed in water to cool the lead bullets. Many such shot towers were built in New York State. Assume a shot tower was constructed at latitude $42^\circ$ N, and that the lead fell a distance of 27 m. In what direction and by how far did the lead bullets land from the direct vertical?
Chapter 13

Rigid-body rotation

13.1 Introduction
Rigid-body rotation features prominently in science, engineering, and sports. Prior chapters have focussed
primarily on motion of point particles. This chapter extends the discussion to motion of finite-sized rigid
bodies. A rigid body is a collection of particles where the relative separations remain rigidly fixed. In real
life, there is always some motion between individual atoms, but usually this microscopic motion can be
neglected when describing macroscopic properties. Note that the concept of perfect rigidity has limitations
in the theory of relativity: information cannot travel faster than the velocity of light, so signals
cannot be transmitted instantaneously between the ends of a rigid body, as perfect rigidity would imply.
The description of rigid-body rotation is most easily handled by specifying the properties of the body
in the rotating body-fixed coordinate frame whereas the observables are measured in the stationary inertial
laboratory coordinate frame. In the body-fixed coordinate frame, the primary observable for classical
mechanics is the inertia tensor of the rigid body which is well defined and independent of the rotational
motion. By contrast, in the stationary inertial frame the observables depend sensitively on the details of the
rotational motion. For example, when observed in the stationary fixed frame, rapid rotation of a long thin
cylindrical pencil about the longitudinal symmetry axis gives a time-averaged shape of the pencil that looks
like a thin cylinder, whereas the time-averaged shape is a flat disk for rotation about an axis perpendicular
to the symmetry axis of the pencil. In spite of this, the pencil always has the same unique inertia tensor
in the body-fixed frame. Thus the best solution for describing rotation of a rigid body is to use a rotation
matrix that transforms from the stationary fixed frame to the instantaneous body-fixed frame for which the
moment of inertia tensor can be evaluated. Moreover, the problem can be greatly simplified by transforming
to a body-fixed coordinate frame that is aligned with any symmetry axes of the body since then the inertia
tensor can be diagonal; this is called a principal axis system.
Rigid-body rotation can be broken into the following two classifications.
1) Rotation about a fixed axis:
A body can be constrained to rotate about an axis that has a fixed location and orientation relative to
the body. The hinged door is a typical example. Rotation about a fixed axis is straightforward since the
axis of rotation, plus the moment of inertia about this axis, are well defined; this case was discussed in
chapter 2.12.7.
2) Rotation about a point:
A body can be constrained to rotate about a fixed point of the body but the orientation of this rotation
axis about this point is unconstrained. One example is rotation of an object flying freely in space which can
rotate about the center of mass with any orientation. Another example is a child’s spinning top which has
one point constrained to touch the ground but the orientation of the rotation axis is undefined.
The prior discussion in chapter 2127 showed that rigid-body rotation is more complicated than assumed
in introductory treatments of rigid-body rotation. It is necessary to expand the concept of moment of inertia
to the concept of the inertia tensor, plus the fact that the angular momentum may not point along the
rotation axis. The most general case requires consideration of rotation about a body-fixed point where the
orientation of the axis of rotation is unconstrained. The concept of the inertia tensor of a rotating body is

313
314 CHAPTER 13. RIGID-BODY ROTATION

crucial for describing rigid-body motion. It will be shown that working in the body-fixed coordinate frame of
a rotating body allows a description of the equations of motion in terms of the inertia tensor for a given point
of the body, and that it is possible to rotate the body-fixed coordinate system into a principal axis system
where the inertia tensor is diagonal. For any principal axis, the angular momentum is parallel to the angular
velocity if it is aligned with a principal axis. The use of a principal axis system greatly simplifies treatment
of rigid-body rotation and exploits the powerful and elegant matrix algebra mentioned in appendix .
The following discussion of rigid-body rotation is broken into three topics, (1) the inertia tensor of the
rigid body, (2) the transformation between the rotating body-fixed coordinate system and the laboratory
frame, i.e., the Euler angles specifying the orientation of the body-fixed coordinate frame with respect to the
laboratory frame, and (3) Lagrange and Euler’s equations of motion for rigid bodies. This is followed by a
discussion of practical applications.

13.2 Rigid-body coordinates

Motion of a rigid body is a special case of motion of the $N$-body system when the relative positions of
the $N$ bodies are related. It was shown in chapter 2 that the motion of a rigid body can be broken into
a combination of a linear translation of some point in the body, plus rotation of the body about an axis
through that point. This is called Chasles’ Theorem. Thus the position of every particle in the rigid body
is fixed with respect to one point in the body. If the fixed point of the body is chosen to be the center of
mass, then, as discussed in chapter 2, it is possible to separate the kinetic energy, linear momentum, and
angular momentum into the center-of-mass motion, plus the motion about the center of mass. Thus the
behavior of the body can be described completely using only six independent coordinates governed by six
equations of motion, three for translation and three for rotation.
Referred to an inertial frame, the translational motion of the center of mass is governed by
\[ \mathbf{F} = \frac{d\mathbf{P}}{dt} \tag{13.1} \]

while the rotational motion about the center of mass is determined by
\[ \mathbf{N} = \frac{d\mathbf{L}}{dt} \tag{13.2} \]

where the external force F and external torque N are identified separately from the internal forces acting
between the particles in the rigid body. It will be assumed that the internal forces are central and thus do
not contribute to the angular momentum.
The location of any fixed point in the body, such as the center of mass, can be specified by three generalized
cartesian coordinates with respect to a fixed frame. The rotation of the body-fixed axis system about this
fixed point in the body can be described in terms of three independent angles with respect to the fixed frame.
There are several possible sets of orthogonal angles that can be used to describe the rotation. This book
uses the Euler angles $\phi, \theta, \psi$ which correspond first to a rotation $\phi$ about the $z$-axis, then a rotation $\theta$ about
the $x$ axis subsequent to the first rotation, and finally a rotation $\psi$ about the new $z$ axis following the first
two rotations. The Euler angles will be discussed in detail following introduction of the inertia tensor and
angular momentum.

13.3 Rigid-body rotation about a body-fixed point

With respect to some point $O$ fixed in the body coordinate system, the angular momentum of the body is
given by
\[ \mathbf{L} = \sum_{\alpha}^{N} \mathbf{L}_\alpha = \sum_{\alpha}^{N} \mathbf{r}_\alpha \times \mathbf{p}_\alpha \tag{13.3} \]

There are two especially convenient choices for the fixed point $O$. If no point in the body is fixed with
respect to an inertial coordinate system, then it is best to choose $O$ as the center of mass. If one point of
the body is fixed with respect to a fixed inertial coordinate system, such as a point on the ground where a
child’s spinning top touches, then it is best to choose this stationary point as the body-fixed point $O$.

Consider a rigid body composed of $N$ particles of mass $m_\alpha$ where $\alpha = 1, 2, 3, \ldots, N$. As discussed in chapter 12.4, if the
body rotates with an instantaneous angular velocity $\boldsymbol{\omega}$ about some fixed point, with respect to the body-fixed coordinate
system, and this point has an instantaneous translational velocity $\mathbf{V}$ with respect to the fixed (inertial) coordinate system,
see figure 13.1, then the instantaneous velocity $\mathbf{v}_\alpha$ of the $\alpha^{th}$ particle in the fixed frame of reference is given by
\[ \mathbf{v}_\alpha = \mathbf{V} + \mathbf{v}''_\alpha + \boldsymbol{\omega} \times \mathbf{r}'_\alpha \tag{13.4} \]
However, for a rigid body, the velocity of a body-fixed point with respect to the body is zero, that is $\mathbf{v}''_\alpha = 0$, thus
\[ \mathbf{v}_\alpha = \mathbf{V} + \boldsymbol{\omega} \times \mathbf{r}'_\alpha \tag{13.5} \]
Consider the translational velocity of the body-fixed point $O$ to be zero, i.e. $\mathbf{V} = 0$, and let $\mathbf{R} = 0$, then $\mathbf{r}_\alpha = \mathbf{r}'_\alpha$. These
assumptions allow the linear momentum of the particle $\alpha$ to be written as
\[ \mathbf{p}_\alpha = m_\alpha \mathbf{v}_\alpha = m_\alpha \boldsymbol{\omega} \times \mathbf{r}_\alpha \tag{13.6} \]
Therefore
\[ \mathbf{L} = \sum_{\alpha}^{N} \mathbf{r}_\alpha \times \mathbf{p}_\alpha = \sum_{\alpha}^{N} m_\alpha \mathbf{r}_\alpha \times (\boldsymbol{\omega} \times \mathbf{r}_\alpha) \tag{13.7} \]

Figure 13.1: Infinitesimal displacement $d\mathbf{r}'$ in the primed frame, broken into a part due to rotation of the primed frame plus a
part $d\mathbf{r}''$ due to displacement with respect to this rotating frame.
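Using the vector identity
\[ \mathbf{A} \times (\mathbf{B} \times \mathbf{A}) = A^2 \mathbf{B} - \mathbf{A}\,(\mathbf{A} \cdot \mathbf{B}) \]
leads to
\[ \mathbf{L} = \sum_{\alpha}^{N} m_\alpha \left[ r_\alpha^2 \boldsymbol{\omega} - \mathbf{r}_\alpha (\mathbf{r}_\alpha \cdot \boldsymbol{\omega}) \right] \tag{13.8} \]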
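The step from (13.7) to (13.8) can be checked numerically. The following is a minimal sketch, assuming a small set of made-up point masses and an angular velocity along the 3 axis; the function names and test data are illustrative and are not taken from the text.

```python
def cross(a, b):
    """Vector product a x b for 3-component sequences."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    """Scalar product a . b."""
    return sum(x*y for x, y in zip(a, b))

def angular_momentum_cross(masses, positions, omega):
    """L = sum_a m_a r_a x (omega x r_a), as in equation (13.7)."""
    L = [0.0, 0.0, 0.0]
    for m, r in zip(masses, positions):
        term = cross(r, cross(omega, r))
        for i in range(3):
            L[i] += m * term[i]
    return L

def angular_momentum_expanded(masses, positions, omega):
    """L = sum_a m_a [r_a^2 omega - r_a (r_a . omega)], as in equation (13.8)."""
    L = [0.0, 0.0, 0.0]
    for m, r in zip(masses, positions):
        r2 = dot(r, r)
        r_dot_omega = dot(r, omega)
        for i in range(3):
            L[i] += m * (r2 * omega[i] - r[i] * r_dot_omega)
    return L

# Hypothetical test data: three point masses, rotation about the 3 axis.
masses = [1.0, 2.0, 0.5]
positions = [(1.0, 0.0, 0.0), (0.0, 1.0, 1.0), (-1.0, 2.0, 0.5)]
omega = (0.0, 0.0, 3.0)

L_cross = angular_momentum_cross(masses, positions, omega)
L_expanded = angular_momentum_expanded(masses, positions, omega)
print(L_cross)
print(L_expanded)
```

Both forms give the same vector, and for this asymmetric mass distribution the angular momentum has components perpendicular to $\boldsymbol{\omega}$ even though $\boldsymbol{\omega}$ lies along a coordinate axis.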

The angular momentum can be expressed in terms of components of $\boldsymbol{\omega}$ and $\mathbf{r}'_\alpha$ relative to the body-fixed
frame. The following formulae can be written more compactly if $\mathbf{r}_\alpha = (x_\alpha, y_\alpha, z_\alpha)$, in the rotating body-fixed
frame, is written in the form $\mathbf{r}_\alpha = (x_{\alpha,1}, x_{\alpha,2}, x_{\alpha,3})$ where the axes are defined by the numbers $1, 2, 3$ rather
than $x, y, z$. In this notation, the angular momentum is written in component form as
\[ L_i = \sum_{\alpha}^{N} m_\alpha \left[ \omega_i \sum_{k} x_{\alpha,k}^2 - x_{\alpha,i} \left( \sum_{j} x_{\alpha,j}\,\omega_j \right) \right] \tag{13.9} \]

Assume the Kronecker delta relation
\[ \omega_i = \sum_{j}^{3} \omega_j \delta_{ij} \tag{13.10} \]
where
\[ \delta_{ij} = 1 \quad (i = j), \qquad \delta_{ij} = 0 \quad (i \neq j) \]

Substitute (1310) in (139) gives


" #
X X X
 =      2 −    
  
3
" Ã !#
X X X
=     2 −   (13.11)
  

13.4 Inertia tensor

The square bracket term in (1311) is called the moment of inertia tensor I which is usually referred to
as the inertia tensor " Ã 3 ! #

X X
2
 ≡     −   (13.12)
 

In most cases it is more useful to express the components of the inertia tensor in an integral form over
the mass distribution rather than a summation for $N$ discrete bodies. That is,
\[ I_{ij} = \int \rho(\mathbf{r}') \left( \delta_{ij} \left( \sum_{k}^{3} x_k^2 \right) - x_i x_j \right) dV \tag{13.13} \]


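The inertia tensor is easier to understand when written in cartesian coordinates $\mathbf{r}'_\alpha = (x_\alpha, y_\alpha, z_\alpha)$ rather
than in the form $\mathbf{r}'_\alpha = (x_{\alpha,1}, x_{\alpha,2}, x_{\alpha,3})$. Then, the diagonal moments of inertia of the inertia tensor are
\[
\begin{aligned}
I_{xx} &\equiv \sum_{\alpha}^{N} m_\alpha \left[ x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - x_\alpha^2 \right] = \sum_{\alpha}^{N} m_\alpha \left[ y_\alpha^2 + z_\alpha^2 \right] \\
I_{yy} &\equiv \sum_{\alpha}^{N} m_\alpha \left[ x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - y_\alpha^2 \right] = \sum_{\alpha}^{N} m_\alpha \left[ x_\alpha^2 + z_\alpha^2 \right] \\
I_{zz} &\equiv \sum_{\alpha}^{N} m_\alpha \left[ x_\alpha^2 + y_\alpha^2 + z_\alpha^2 - z_\alpha^2 \right] = \sum_{\alpha}^{N} m_\alpha \left[ x_\alpha^2 + y_\alpha^2 \right]
\end{aligned}
\tag{13.14}
\]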
 

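while the off-diagonal products of inertia are
\[
\begin{aligned}
I_{yx} = I_{xy} &\equiv -\sum_{\alpha}^{N} m_\alpha \left[ x_\alpha y_\alpha \right] \\
I_{zx} = I_{xz} &\equiv -\sum_{\alpha}^{N} m_\alpha \left[ x_\alpha z_\alpha \right] \\
I_{zy} = I_{yz} &\equiv -\sum_{\alpha}^{N} m_\alpha \left[ y_\alpha z_\alpha \right]
\end{aligned}
\tag{13.15}
\]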


Note that the products of inertia are symmetric in that
\[ I_{ij} = I_{ji} \tag{13.16} \]

The above notation for the inertia tensor allows the angular momentum (13.11) to be written as
\[ L_i = \sum_{j}^{3} I_{ij}\, \omega_j \tag{13.17} \]


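Expanded in cartesian coordinates
\[
\begin{aligned}
L_x &= I_{xx}\omega_x + I_{xy}\omega_y + I_{xz}\omega_z \\
L_y &= I_{yx}\omega_x + I_{yy}\omega_y + I_{yz}\omega_z \\
L_z &= I_{zx}\omega_x + I_{zy}\omega_y + I_{zz}\omega_z
\end{aligned}
\tag{13.18}
\]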

Note that every fixed point in a body has a specific inertia tensor. The components of the inertia tensor
at a specified point depend on the orientation of the coordinate frame whose origin is located at the specified
fixed point. For example, the inertia tensor for a cube is very different when the fixed point is at the center
of mass compared with when the fixed point is at a corner of the cube.
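As an illustration of (13.12) and (13.17), the sketch below builds the inertia tensor of a hypothetical dumbbell, two unit point masses at $\pm(1, 0, 1)$, and then evaluates $L_i = \sum_j I_{ij}\omega_j$ for rotation about the 3 axis. The helper names and the choice of body are assumptions made for this example only.

```python
def inertia_tensor(masses, positions):
    """I_ij = sum_a m_a (delta_ij r_a^2 - x_ai x_aj), as in equation (13.12)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r2 = sum(x * x for x in r)
        for i in range(3):
            for j in range(3):
                delta = 1.0 if i == j else 0.0
                I[i][j] += m * (delta * r2 - r[i] * r[j])
    return I

def mat_vec(A, v):
    """Matrix-vector product, i.e. L_i = sum_j I_ij omega_j of equation (13.17)."""
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

# Hypothetical dumbbell tilted at 45 degrees in the 1-3 plane.
masses = [1.0, 1.0]
positions = [(1.0, 0.0, 1.0), (-1.0, 0.0, -1.0)]

I = inertia_tensor(masses, positions)
omega = [0.0, 0.0, 1.0]
L = mat_vec(I, omega)
print(I)
print(L)
```

The product of inertia $I_{13} = -2$ is non-zero, so the angular momentum $(-2, 0, 2)$ is not parallel to the angular velocity $(0, 0, 1)$; the chosen axes are not principal axes of this body.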

13.5 Matrix and tensor formulations of rigid-body rotation

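The prior notation is clumsy and can be streamlined by use of matrix methods. Write the inertia tensor in
a matrix form as
\[ \{\mathbf{I}\} = \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} \tag{13.19} \]
The angular velocity and angular momentum both can be written as column vectors, that is
\[ \boldsymbol{\omega} = \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} \qquad \mathbf{L} = \begin{pmatrix} L_1 \\ L_2 \\ L_3 \end{pmatrix} \tag{13.20} \]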

As discussed in appendix 2, equation (1318) now can be written in tensor notation as an inner product
of the form
L = {I} · ω (13.21)
Note that the above notation uses boldface for the inertia tensor $\mathbf{I}$, implying a rank-2 tensor representation,
while the angular velocity $\boldsymbol{\omega}$ and the angular momentum $\mathbf{L}$ are written as column vectors. The inertia tensor
is a 9-component rank-2 tensor defined as the ratio of the angular momentum vector $\mathbf{L}$ and the angular
velocity $\boldsymbol{\omega}$.
\[ \{\mathbf{I}\} = \frac{\mathbf{L}}{\boldsymbol{\omega}} \tag{13.22} \]
Note that, as described in the appendix, the inner product of a vector $\boldsymbol{\omega}$, which is a rank-1 tensor, and a
rank-2 tensor $\{\mathbf{I}\}$ leads to the vector $\mathbf{L}$. This compact notation exploits the fact that the matrix and tensor
representations are completely equivalent, and are ideally suited to the description of rigid-body rotation.

13.6 Principal axis system

The inertia tensor is a real symmetric matrix because of the symmetry given by equation (13.16). A property
of real symmetric matrices is that there exists an orientation of the coordinate frame, with its origin at the
chosen body-fixed point $O$, such that the inertia tensor is diagonal. The coordinate system for which the
inertia tensor is diagonal is called the Principal axis system which has three perpendicular principal
axes. Thus, in the principal axis system, the inertia tensor has the form
\[ \{\mathbf{I}\} = \begin{pmatrix} I_{11} & 0 & 0 \\ 0 & I_{22} & 0 \\ 0 & 0 & I_{33} \end{pmatrix} \tag{13.23} \]

where $I_{jj}$ are real numbers, which are called the principal moments of inertia of the body, and are
usually written as $I_j$. When the angular velocity vector $\boldsymbol{\omega}$ points along any principal axis unit vector $\hat{\mathbf{j}}$, then
the angular momentum $\mathbf{L}$ is parallel to $\boldsymbol{\omega}$ and the magnitude of the principal moment of inertia about this
principal axis is given by the relation
\[ L_j\, \hat{\mathbf{j}} = I_j\, \omega_j\, \hat{\mathbf{j}} \tag{13.24} \]

The principal axes are fixed relative to the shape of the rigid body and they are invariant to the orientation
of the body-fixed coordinate system used to evaluate the inertia tensor. The advantage of having the body-
fixed coordinate frame aligned with the principal axis coordinate frame is that then the inertia tensor is
diagonal, which greatly simplifies the matrix algebra. Even when the body-fixed coordinate system is not
aligned with the principal axis frame, if the angular velocity is specified to point along a principal axis then
the corresponding moment of inertia will be given by (13.24).
In principle it is possible to locate the principal axes by varying the orientation of the angular velocity
vector ω to find those orientations for which the angular momentum L and angular velocity ω are parallel
which characterizes the principal axes. However, the best approach is to diagonalize the inertia tensor.

13.7 Diagonalize the inertia tensor

Finding the three principal axes involves diagonalizing the inertia tensor, which is the classic eigenvalue
problem discussed in the appendix. Solution of the eigenvalue problem for rigid-body motion corresponds to
a rotation of the coordinate frame to the principal axes resulting in the matrix equation
\[ \{\mathbf{I}\} \cdot \boldsymbol{\omega} = I \boldsymbol{\omega} \tag{13.25} \]
where $I$ comprises the three-valued eigenvalues, while the corresponding vector $\boldsymbol{\omega}$ is the eigenvector.
Appendix 4 gives the solution of the matrix relation
\[ \{\mathbf{I}\} \cdot \boldsymbol{\omega} = I \{\mathbb{I}\}\, \boldsymbol{\omega} \tag{13.26} \]
where $I$ are three-valued eigenvalues for the principal axis moments of inertia, and $\{\mathbb{I}\}$ is the unity tensor,
equation 24.
\[ \{\mathbb{I}\} \equiv \begin{Bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Bmatrix} \tag{13.27} \]
Rewriting (1326) gives
({I} −  {I}) · ω = 0 (13.28)
This is a matrix equation of the form A · ω =0 where A is a 3 × 3 matrix and ω is a vector with values
         The matrix equation A · ω =0 really corresponds to three simultaneous equations for the three
numbers         . It is a well-known property of equations like (1328) that they have a non-zero solution
if, and only if, the determinant det(A) is zero, that is
det(I−I)=0 (13.29)
This is called the characteristic equation, or secular equation, for the matrix $\mathbf{I}$. The determinant
involved is a cubic equation in the value of $I$ that gives the three principal moments of inertia. Inserting
one of the three values of $I$ into equation (13.17) gives the corresponding eigenvector $\boldsymbol{\omega}$. Applying the above
eigenvalue problem to rigid-body rotation corresponds to requiring that some arbitrary set of body-fixed
axes be the principal axes of inertia. This is obtained by rotating the body-fixed axis system such that
\[
\begin{aligned}
L_1 &= I_{11}\omega_1 + I_{12}\omega_2 + I_{13}\omega_3 = I \omega_1 \\
L_2 &= I_{21}\omega_1 + I_{22}\omega_2 + I_{23}\omega_3 = I \omega_2 \\
L_3 &= I_{31}\omega_1 + I_{32}\omega_2 + I_{33}\omega_3 = I \omega_3
\end{aligned}
\tag{13.30}
\]
or
\[
\begin{aligned}
(I_{11} - I)\,\omega_1 + I_{12}\,\omega_2 + I_{13}\,\omega_3 &= 0 \\
I_{21}\,\omega_1 + (I_{22} - I)\,\omega_2 + I_{23}\,\omega_3 &= 0 \\
I_{31}\,\omega_1 + I_{32}\,\omega_2 + (I_{33} - I)\,\omega_3 &= 0
\end{aligned}
\tag{13.31}
\]
These equations have a non-trivial solution for the ratios $\omega_1 : \omega_2 : \omega_3$ since the determinant vanishes, that is
\[ \begin{vmatrix} (I_{11} - I) & I_{12} & I_{13} \\ I_{21} & (I_{22} - I) & I_{23} \\ I_{31} & I_{32} & (I_{33} - I) \end{vmatrix} = 0 \tag{13.32} \]
The expansion of this determinant leads to a cubic equation with three roots for $I$. This is the secular
equation for $\{\mathbf{I}\}$ whose eigenvalues are the principal moments of inertia.
The directions of the principal axes, that is the eigenvectors, can be found by substituting the
corresponding solution for $I$ into the prior equation. Thus for eigensolution $I_1$ the eigenvector is given by
solving
\[
\begin{aligned}
(I_{11} - I_1)\,\omega_{11} + I_{12}\,\omega_{21} + I_{13}\,\omega_{31} &= 0 \\
I_{21}\,\omega_{11} + (I_{22} - I_1)\,\omega_{21} + I_{23}\,\omega_{31} &= 0 \\
I_{31}\,\omega_{11} + I_{32}\,\omega_{21} + (I_{33} - I_1)\,\omega_{31} &= 0
\end{aligned}
\tag{13.33}
\]

These equations are solved for the ratios $\omega_{11} : \omega_{21} : \omega_{31}$ which are the direction numbers of the principal axis
system corresponding to solution $I_1$. This principal axis system is defined relative to the original coordinate
system. This procedure is repeated to find the orientation of the other two mutually perpendicular principal
axes.
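The roots of the secular equation (13.32) can also be obtained numerically. The sketch below is one possible approach, not the method of the text: it uses the closed-form trigonometric solution of the cubic for a real symmetric $3 \times 3$ matrix, and then confirms that each root makes the secular determinant vanish. The example tensor is made up for illustration.

```python
import math

def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def principal_moments(I):
    """Roots of det(I - lam*1) = 0 for a real symmetric 3x3 matrix, largest first."""
    p1 = I[0][1] ** 2 + I[0][2] ** 2 + I[1][2] ** 2
    if p1 == 0.0:
        # Already diagonal: the diagonal elements are the principal moments.
        return sorted((I[0][0], I[1][1], I[2][2]), reverse=True)
    q = (I[0][0] + I[1][1] + I[2][2]) / 3.0
    p2 = sum((I[k][k] - q) ** 2 for k in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    B = [[(I[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    # Clamp against rounding so that acos stays defined.
    r = max(-1.0, min(1.0, det3(B) / 2.0))
    phi = math.acos(r) / 3.0
    lam_max = q + 2.0 * p * math.cos(phi)
    lam_min = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    lam_mid = 3.0 * q - lam_max - lam_min  # the trace is invariant
    return [lam_max, lam_mid, lam_min]

def secular_determinant(I, lam):
    """Left-hand side of the secular equation (13.32) for a trial value lam."""
    return det3([[I[i][j] - (lam if i == j else 0.0) for j in range(3)]
                 for i in range(3)])

# Hypothetical inertia tensor with one non-zero product of inertia.
I_example = [[3.0, 1.0, 0.0],
             [1.0, 3.0, 0.0],
             [0.0, 0.0, 5.0]]
moments = principal_moments(I_example)
print(moments)
```

For this example the principal moments are 5, 4 and 2 to within rounding, which can be confirmed by hand since the tensor separates into a $2 \times 2$ block and a single diagonal element.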

13.8 Parallel-axis theorem

The values of the components of the inertia tensor depend on both the location and the orientation about which the
body rotates relative to the body-fixed coordinate system. The parallel-axis theorem is valuable for relating the
inertia tensor for rotation about parallel axes passing through different points fixed with respect to the rigid body.
For example, one may wish to relate the inertia tensor through the center of mass to another location that is
constrained to remain stationary, like the tip of the spinning top.
Consider the mass $\alpha$ at the location $\mathbf{r} = (x_1, x_2, x_3)$ with respect to the origin of the center of mass body-fixed
coordinate system $O$. Transform to an arbitrary but parallel body-fixed coordinate system $Q$, that is, the coordinate
axes have the same orientation as the center of mass coordinate system. The location of the mass $\alpha$ with respect
to this arbitrary coordinate system is $\mathbf{R} = (X_1, X_2, X_3)$. That is, the general vectors for the two coordinate
systems are related by
\[ \mathbf{R} = \mathbf{a} + \mathbf{r} \tag{13.34} \]
where $\mathbf{a}$ is the vector connecting the origins of the coordinate systems $O$ and $Q$ illustrated in figure 13.2. The
elements of the inertia tensor with respect to axis system $Q$ are given by equation 13.12 to be
\[ J_{ij} \equiv \sum_{\alpha}^{N} m_\alpha \left[ \delta_{ij} \left( \sum_{k}^{3} X_{\alpha,k}^2 \right) - X_{\alpha,i}\, X_{\alpha,j} \right] \tag{13.35} \]

Figure 13.2: Transformation between two parallel body-coordinate systems, O and Q.

The components along the three axes for each of the two coordinate systems are related by
\[ X_i = a_i + x_i \tag{13.36} \]

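Substituting these into the above inertia tensor relation gives
\[
\begin{aligned}
J_{ij} &= \sum_{\alpha}^{N} m_\alpha \left[ \delta_{ij} \left( \sum_{k}^{3} (x_{\alpha,k} + a_k)^2 \right) - (x_{\alpha,i} + a_i)(x_{\alpha,j} + a_j) \right] \\
&= \sum_{\alpha}^{N} m_\alpha \left[ \delta_{ij} \left( \sum_{k}^{3} x_{\alpha,k}^2 \right) - x_{\alpha,i}\, x_{\alpha,j} \right]
 + \sum_{\alpha}^{N} m_\alpha \left[ \delta_{ij} \left( \sum_{k}^{3} \left( 2 x_{\alpha,k}\, a_k + a_k^2 \right) \right) - \left( a_i x_{\alpha,j} + a_j x_{\alpha,i} + a_i a_j \right) \right]
\end{aligned}
\tag{13.37}
\]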

The first summation on the right-hand side corresponds to the elements $I_{ij}$ of the inertia tensor in the
center-of-mass frame. Thus the terms can be regrouped to give
\[ J_{ij} \equiv I_{ij} + \sum_{\alpha}^{N} m_\alpha \left( \delta_{ij} \sum_{k}^{3} a_k^2 - a_i a_j \right) + \sum_{\alpha}^{N} m_\alpha \left[ 2 \delta_{ij} \sum_{k}^{3} x_{\alpha,k}\, a_k - a_i x_{\alpha,j} - a_j x_{\alpha,i} \right] \tag{13.38} \]
However, each term in the last bracket involves a sum of the form $\sum_{\alpha}^{N} m_\alpha x_{\alpha,k}$. Take the coordinate system
$O$ to be with respect to the center of mass for which
\[ \sum_{\alpha}^{N} m_\alpha \mathbf{r}'_\alpha = 0 \tag{13.39} \]


This also applies to each component $k$, that is
\[ \sum_{\alpha}^{N} m_\alpha x_{\alpha,k} = 0 \tag{13.40} \]


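Therefore all of the terms in the last bracket cancel leaving
\[ J_{ij} \equiv I_{ij} + \sum_{\alpha}^{N} m_\alpha \left( \delta_{ij} \sum_{k}^{3} a_k^2 - a_i a_j \right) \tag{13.41} \]
But $\sum_{\alpha}^{N} m_\alpha = M$ and $\sum_{k}^{3} a_k^2 = a^2$, thus
\[ J_{ij} \equiv I_{ij} + M \left( a^2 \delta_{ij} - a_i a_j \right) \tag{13.42} \]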

where $I_{ij}$ is the center-of-mass inertia tensor. This is the general form of Steiner’s parallel-axis theorem.
As an example, the moment of inertia around the $X_1$ axis is given by
\[ J_{11} \equiv I_{11} + M \left( \left( a_1^2 + a_2^2 + a_3^2 \right) \delta_{11} - a_1^2 \right) = I_{11} + M \left( a_2^2 + a_3^2 \right) \tag{13.43} \]

which corresponds to the elementary statement that the difference in the moments of inertia equals the
mass of the body multiplied by the square of the distance between the parallel axes, $x_1$ and $X_1$. Note that,
for a given axis direction, the moment of inertia of a body is a minimum when the axis passes through the center of mass.
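Equation (13.42) translates directly into a few lines of code. The following is a sketch under stated assumptions: the function name is hypothetical, and the test body is a uniform solid sphere, for which the standard center-of-mass result $I = \frac{2}{5}MR^2$ is taken as given rather than derived in this chapter.

```python
def parallel_axis_shift(I_cm, mass, a):
    """J_ij = I_ij + M (a^2 delta_ij - a_i a_j), as in equation (13.42).

    I_cm is the inertia tensor about the center of mass and a is the
    vector connecting the two parallel coordinate origins.
    """
    a2 = sum(c * c for c in a)
    return [[I_cm[i][j] + mass * ((a2 if i == j else 0.0) - a[i] * a[j])
             for j in range(3)]
            for i in range(3)]

# Uniform solid sphere with M = 1 and R = 1, so I_cm = (2/5) M R^2 times unity.
M, R = 1.0, 1.0
I_sphere_cm = [[0.4 if i == j else 0.0 for j in range(3)] for i in range(3)]

# Shift the origin to a point on the surface, displaced along the 3 axis.
J_surface = parallel_axis_shift(I_sphere_cm, M, (0.0, 0.0, R))
print(J_surface)
```

About the surface point the two moments perpendicular to the displacement increase from $\frac{2}{5}MR^2$ to $\frac{7}{5}MR^2$, while the moment about the axis along $\mathbf{a}$ is unchanged, so the sphere now has two equal principal moments and one distinct one.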

13.1 Example: Inertia tensor of a solid cube rotating about the center of mass.
The complicated expressions for the inertia tensor can be understood using the example of a uniform solid cube with side $b$,
density $\rho$, and mass $M = \rho b^3$, rotating about different axes. Assume that the origin of the coordinate system $O$ is at the center
of mass with the axes perpendicular to the centers of the faces of the cube.

Figure: Inertia tensor of a uniform solid cube of side $b$ about the center of mass $O$ and a corner of the cube $Q$. The vector $\mathbf{a}$ is the
vector distance between $O$ and $Q$.

The components of the inertia tensor can be calculated using (13.13) written as an integral over the mass distribution rather
than a summation.
\[ I_{ij} = \int \rho(\mathbf{r}') \left( \delta_{ij} \left( \sum_{k}^{3} x_k^2 \right) - x_i x_j \right) dV \]
Thus
\[ I_{11} = \rho \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} \left( x_2^2 + x_3^2 \right) dx_3\, dx_2\, dx_1 = \frac{1}{6} \rho b^5 = \frac{1}{6} M b^2 = I_{22} = I_{33} \]
By symmetry the diagonal moments of inertia about each face are identical. Similarly the products of inertia are given by
\[ I_{12} = -\rho \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} \int_{-b/2}^{b/2} \left( x_1 x_2 \right) dx_3\, dx_2\, dx_1 = 0 \]
Thus the inertia tensor is given by
\[ \{\mathbf{I}\}_{cm} = \frac{1}{6} M b^2 \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
Note that this inertia tensor is diagonal implying that this is the principal axis system. In this case all three
principal moments of inertia are identical, with the principal axes perpendicular to the centers of the faces of the cube. This is
as expected from the symmetry of the cubic geometry.

13.2 Example: Inertia tensor about a corner of a solid cube.

a) Direct calculation Let one corner of the cube be the origin of the coordinate system $Q$ and assume
that the three adjacent sides of the cube lie along the coordinate axes. The components of the inertia tensor
can be calculated using (13.13). Thus
\[ I_{11} = \rho \int_0^b \int_0^b \int_0^b \left( x_2^2 + x_3^2 \right) dx_3\, dx_2\, dx_1 = \frac{2}{3} \rho b^5 = \frac{2}{3} M b^2 \]
\[ I_{12} = -\rho \int_0^b \int_0^b \int_0^b \left( x_1 x_2 \right) dx_3\, dx_2\, dx_1 = -\frac{1}{4} \rho b^5 = -\frac{1}{4} M b^2 \]
Thus, evaluating all the nine components gives
\[ \{\mathbf{I}\}_{corner} = \frac{1}{12} M b^2 \begin{pmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{pmatrix} \]
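The integrals above can be cross-checked numerically. The sketch below applies a simple midpoint rule to (13.13) for a uniform cube with $M = 1$ and $b = 1$; the function name, the grid size, and the `origin_shift` argument are assumptions of this illustration, and the midpoint rule is only approximate for the quadratic terms.

```python
def cube_inertia_tensor(b, mass, origin_shift, n=20):
    """Midpoint-rule estimate of equation (13.13) for a uniform cube of side b.

    origin_shift gives the coordinates of the cube's lowest corner relative
    to the chosen origin; n is the number of cells along each edge.
    """
    rho = mass / b ** 3
    h = b / n
    dV = h ** 3
    centers = [(k + 0.5) * h for k in range(n)]
    I = [[0.0] * 3 for _ in range(3)]
    for u in centers:
        for v in centers:
            for w in centers:
                r = (origin_shift[0] + u, origin_shift[1] + v, origin_shift[2] + w)
                r2 = r[0] ** 2 + r[1] ** 2 + r[2] ** 2
                for i in range(3):
                    for j in range(3):
                        delta = r2 if i == j else 0.0
                        I[i][j] += rho * dV * (delta - r[i] * r[j])
    return I

# Origin at a corner (cube spans 0..b) and at the center (cube spans -b/2..b/2).
I_corner = cube_inertia_tensor(1.0, 1.0, (0.0, 0.0, 0.0))
I_center = cube_inertia_tensor(1.0, 1.0, (-0.5, -0.5, -0.5))
print(I_corner[0][0], I_corner[0][1])
print(I_center[0][0], I_center[0][1])
```

With this grid the corner values approach $\frac{2}{3}$ and $-\frac{1}{4}$, and the center values approach $\frac{1}{6}$ and $0$, in units of $Mb^2$, in agreement with the analytic results of examples 13.1 and 13.2.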

b) Parallel-axis theorem This inertia tensor also can be calculated using the parallel-axis theorem to
relate the moment of inertia about the corner to that at the center of mass. As shown in the figure, the
vector $\mathbf{a}$ has components
\[ a_1 = a_2 = a_3 = \frac{b}{2} \]
Applying the parallel-axis theorem gives
\[ J_{11} = I_{11} + M \left( a^2 - a_1^2 \right) = I_{11} + M \left( a_2^2 + a_3^2 \right) = \frac{1}{6} M b^2 + \frac{1}{2} M b^2 = \frac{2}{3} M b^2 \]
and similarly for $J_{22}$ and $J_{33}$. The off-diagonal terms are given by
\[ J_{12} = I_{12} + M \left( -a_1 a_2 \right) = -\frac{1}{4} M b^2 \]
Thus the inertia tensor, translated from the center of mass to the corner of the cube, is
\[ \{\mathbf{I}\}_{corner} = \begin{pmatrix} \frac{2}{3} M b^2 & -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 \\ -\frac{1}{4} M b^2 & \frac{2}{3} M b^2 & -\frac{1}{4} M b^2 \\ -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 & \frac{2}{3} M b^2 \end{pmatrix} = \frac{1}{12} M b^2 \begin{pmatrix} 8 & -3 & -3 \\ -3 & 8 & -3 \\ -3 & -3 & 8 \end{pmatrix} \]

This inertia tensor about the corner of the cube is the same as that obtained by direct integration.

c) Principal moments of inertia The coordinate axis frame used for rotation about the corner of the
cube is not a principal axis frame. Therefore let us diagonalize the inertia tensor to find the principal
axis frame and the principal moments of inertia about a corner. To achieve this requires solving the secular
determinant
\[ \begin{vmatrix} \left( \frac{2}{3} M b^2 - I \right) & -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 \\ -\frac{1}{4} M b^2 & \left( \frac{2}{3} M b^2 - I \right) & -\frac{1}{4} M b^2 \\ -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 & \left( \frac{2}{3} M b^2 - I \right) \end{vmatrix} = 0 \]
The value of a determinant is not affected by adding or subtracting any row or column from any other
row or column. Subtracting row 1 from row 2 gives
\[ \begin{vmatrix} \left( \frac{2}{3} M b^2 - I \right) & -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 \\ \left( -\frac{11}{12} M b^2 + I \right) & \left( \frac{11}{12} M b^2 - I \right) & 0 \\ -\frac{1}{4} M b^2 & -\frac{1}{4} M b^2 & \left( \frac{2}{3} M b^2 - I \right) \end{vmatrix} = 0 \]

The determinant of this matrix is straightforward to evaluate and equals
\[ \left( \frac{1}{6} M b^2 - I \right) \left( \frac{11}{12} M b^2 - I \right) \left( \frac{11}{12} M b^2 - I \right) = 0 \]

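Thus the roots are
\[ \{\mathbf{I}\} = \begin{pmatrix} \frac{1}{6} M b^2 & 0 & 0 \\ 0 & \frac{11}{12} M b^2 & 0 \\ 0 & 0 & \frac{11}{12} M b^2 \end{pmatrix} \]

The identical roots $I_{22} = I_{33} = \frac{11}{12} M b^2$ imply that the principal axis associated with $I_{11}$ must be a symmetry
axis. The orientation can be found by substituting $I_{11}$ into the above equation
\[ \left( \{\mathbf{I}\} - I \{\mathbb{I}\} \right) \cdot \boldsymbol{\omega} = \frac{1}{12} M b^2 \begin{pmatrix} 6 & -3 & -3 \\ -3 & 6 & -3 \\ -3 & -3 & 6 \end{pmatrix} \begin{pmatrix} \omega_{11} \\ \omega_{21} \\ \omega_{31} \end{pmatrix} = 0 \]

where the second subscript 1 attached to $\omega_{i1}$ signifies that this solution corresponds to $I_{11}$. This gives
\[
\begin{aligned}
2\omega_{11} - \omega_{21} - \omega_{31} &= 0 \\
-\omega_{11} + 2\omega_{21} - \omega_{31} &= 0 \\
-\omega_{11} - \omega_{21} + 2\omega_{31} &= 0
\end{aligned}
\]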

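Solving these three equations gives the unit vector for the first principal axis, for which $I_{11} = \frac{1}{6} M b^2$, to be
\[ \hat{\mathbf{e}}_1 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \]
This can be repeated to find the other two principal axes by substituting $I_{22} = \frac{11}{12} M b^2$. This
gives for the second principal moment $I_{22}$
\[ \left( \{\mathbf{I}\} - I \{\mathbb{I}\} \right) \cdot \boldsymbol{\omega} = \frac{1}{12} M b^2 \begin{pmatrix} -3 & -3 & -3 \\ -3 & -3 & -3 \\ -3 & -3 & -3 \end{pmatrix} \begin{pmatrix} \omega_{12} \\ \omega_{22} \\ \omega_{32} \end{pmatrix} = 0 \]

This results in three equations for the components of $\boldsymbol{\omega}$, but all three equations are the same, namely
\[ \omega_{12} + \omega_{22} + \omega_{32} = 0 \]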

This does not uniquely determine the direction of $\boldsymbol{\omega}$. However, it does imply that $\boldsymbol{\omega}_2$ corresponding to the
second principal axis has the property that
\[ \hat{\boldsymbol{\omega}}_2 \cdot \hat{\mathbf{e}}_1 = 0 \]
that is, any direction of $\hat{\mathbf{e}}_2$ that is perpendicular to $\hat{\mathbf{e}}_1$ is acceptable. In other words, any two orthogonal unit
vectors $\hat{\mathbf{e}}_2$ and $\hat{\mathbf{e}}_3$ that are perpendicular to $\hat{\mathbf{e}}_1$ are acceptable. This ambiguity exists whenever two eigenvalues
are equal; the three principal axes are only uniquely defined if all three eigenvalues are different. The same
ambiguity exists when all three eigenvalues are identical as occurs for the principal moments of inertia about
the center-of-mass of a uniform solid cube. This explains why the moment of inertia about the diagonal
of the cube that passes through the center of mass is the same as the principal moments about axes that pass
through the centers of the faces of the cube.
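The principal axes found in part (c) can be verified by direct multiplication. In the sketch below the corner tensor is written in units of $Mb^2$; the particular pair $\hat{\mathbf{e}}_2$, $\hat{\mathbf{e}}_3$ is one arbitrary choice among the infinitely many orthogonal pairs perpendicular to $\hat{\mathbf{e}}_1$, and the helper names are hypothetical.

```python
import math

# Corner inertia tensor of the uniform cube in units of M b^2.
I_corner = [[8 / 12, -3 / 12, -3 / 12],
            [-3 / 12, 8 / 12, -3 / 12],
            [-3 / 12, -3 / 12, 8 / 12]]

def mat_vec(A, v):
    """Matrix-vector product A . v."""
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    """Scalar product a . b."""
    return sum(x * y for x, y in zip(a, b))

def eigen_residual(A, v, lam):
    """Largest component of |A.v - lam v|; zero when v is an eigenvector."""
    Av = mat_vec(A, v)
    return max(abs(Av[i] - lam * v[i]) for i in range(3))

e1 = [1 / math.sqrt(3)] * 3                              # body diagonal
e2 = [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]          # one arbitrary choice
e3 = [1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6)]

res1 = eigen_residual(I_corner, e1, 1 / 6)
res2 = eigen_residual(I_corner, e2, 11 / 12)
res3 = eigen_residual(I_corner, e3, 11 / 12)
print(res1, res2, res3)
print(dot(e1, e2), dot(e1, e3), dot(e2, e3))
```

All three residuals vanish to rounding accuracy, confirming that the body diagonal carries the moment $\frac{1}{6}Mb^2$ while any direction perpendicular to it carries $\frac{11}{12}Mb^2$.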

13.9 Perpendicular-axis theorem for plane laminae

Rigid-body rotation of thin plane laminae is encountered frequently. Examples of such laminae
are a plane sheet of metal, a thin door, a bicycle wheel, a thin envelope or book. Deriving the inertia
tensor for a plane lamina is relatively simple because there are limits on the possible relative magnitude
of the principal moments of inertia. Consider that the principal axes are along the $x, y, z$ coordinate axes.
Then the sum of two principal moments of inertia about the center of mass is
\[
\begin{aligned}
I_x + I_y &= \int \rho \left( y^2 + z^2 \right) dV + \int \rho \left( x^2 + z^2 \right) dV \\
&= \int \rho \left( x^2 + y^2 \right) dV + 2 \int \rho z^2\, dV \geq \int \rho \left( x^2 + y^2 \right) dV = I_z
\end{aligned}
\tag{13.44}
\]

Note that for any body the three principal moments of inertia must satisfy the triangle rule that the sum of
any pair must exceed or equal the third. Moreover, if the body is a thin lamina with thickness $z = 0$, that
is, a thin plate in the $x - y$ plane, then
\[ I_x + I_y = I_z \tag{13.45} \]
This perpendicular-axis theorem can be very useful for solving problems involving rotation of plane laminae.
The opposite of a plane lamina is a long thin cylindrical needle of mass $m$, length $L$, and radius $R$.
Along the symmetry axis the principal moment is $I_z = \frac{1}{2} m R^2 \to 0$ as $R \to 0$, while perpendicular to the
symmetry axis $I_x = I_y = \frac{1}{12} m L^2$. These satisfy the triangle rule.
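A discrete version of (13.44) and (13.45) makes the role of the thickness explicit. The following sketch, with made-up point masses and a hypothetical helper function, shows that $I_x + I_y - I_z$ equals $2\sum_\alpha m_\alpha z_\alpha^2$, which vanishes for a lamina and is positive otherwise.

```python
def diagonal_moments(masses, positions):
    """Diagonal moments (I_xx, I_yy, I_zz) of a set of point masses, equation (13.14)."""
    Ixx = sum(m * (y * y + z * z) for m, (x, y, z) in zip(masses, positions))
    Iyy = sum(m * (x * x + z * z) for m, (x, y, z) in zip(masses, positions))
    Izz = sum(m * (x * x + y * y) for m, (x, y, z) in zip(masses, positions))
    return Ixx, Iyy, Izz

masses = [1.0, 2.0, 3.0]

# Lamina: every mass lies in the x-y plane (z = 0).
lamina = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (-1.0, -1.0, 0.0)]
Ix, Iy, Iz = diagonal_moments(masses, lamina)
lamina_excess = Ix + Iy - Iz

# Same x-y layout, but with the masses displaced out of the plane.
thick = [(1.0, 0.0, 0.5), (0.0, 2.0, -0.5), (-1.0, -1.0, 1.0)]
Tx, Ty, Tz = diagonal_moments(masses, thick)
thick_excess = Tx + Ty - Tz
expected_excess = 2.0 * sum(m * z * z for m, (_, _, z) in zip(masses, thick))

print(lamina_excess)
print(thick_excess, expected_excess)
```

The relation holds for the diagonal elements in any frame whose third axis is normal to the lamina, so it does not require the in-plane axes to be principal axes.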

13.3 Example: Inertia tensor of a hula hoop

The hula hoop is a thin plane circular ring of radius $R$ and mass $M$. Assume that the symmetry axis of
the circular ring is the 3 axis.
a) The principal moments of inertia about the center of mass: The principal moment of inertia along the
3 axis is $I_{33} = M R^2$. Then equation 13.45 plus symmetry tells us that the two principal moments of inertia
in the plane of the hula hoop must be $I_{11} = I_{22} = \frac{1}{2} M R^2$.
b) The principal moments of inertia about the periphery of the ring: Using the Parallel-axis theorem
tells us that the moment perpendicular to the plane of the hula hoop is $I_{33} = 2 M R^2$. In the plane of the hoop
the moment tangential to the hoop is $I_{11} = \frac{3}{2} M R^2$ and the moment radial to the hoop is $I_{22} = \frac{1}{2} M R^2$. The
hula dancer often swings the hoop about the periphery and perpendicular to the plane by swinging their hips.
Another movement is jumping through the hoop by rotating the hoop tangential to the periphery. Calculation
of such maneuvers requires knowledge of these principal moments of inertia.
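The hoop results can be reproduced by replacing the ring with many equal point masses. This is a rough sketch: the function name, the number of points, and the choice of the rim point $(R, 0, 0)$ as the second origin are assumptions of the illustration. With that choice the radial axis is the $x$ axis, the tangential axis is the $y$ axis, and the normal is the $z$ axis.

```python
import math

def hoop_diagonal_moments(radius, mass, origin=(0.0, 0.0, 0.0), n=360):
    """Diagonal moments (I_xx, I_yy, I_zz) of a thin ring in the x-y plane.

    The ring is centered on the coordinate origin and approximated by n equal
    point masses; the moments are evaluated about axes through `origin`.
    """
    dm = mass / n
    Ixx = Iyy = Izz = 0.0
    for k in range(n):
        angle = 2.0 * math.pi * k / n
        x = radius * math.cos(angle) - origin[0]
        y = radius * math.sin(angle) - origin[1]
        z = -origin[2]
        Ixx += dm * (y * y + z * z)
        Iyy += dm * (x * x + z * z)
        Izz += dm * (x * x + y * y)
    return Ixx, Iyy, Izz

M, R = 1.0, 1.0
about_center = hoop_diagonal_moments(R, M)
about_rim = hoop_diagonal_moments(R, M, origin=(R, 0.0, 0.0))
print(about_center)
print(about_rim)
```

In units of $MR^2$ the center values approach $(\frac{1}{2}, \frac{1}{2}, 1)$ and the rim values approach $(\frac{1}{2}, \frac{3}{2}, 2)$, matching parts (a) and (b), and both sets satisfy $I_{11} + I_{22} = I_{33}$.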

13.4 Example: Inertia tensor of a thin book

Consider a thin rectangular book of mass $M$, width $a$, and length $b$, with thickness $t \ll a$ and $t \ll b$.
About the center of mass the moment of inertia perpendicular to the plane of the book is $I_{33} = \frac{M}{12} \left( a^2 + b^2 \right)$. The
other two moments are $I_{11} = \frac{M}{12} a^2$ and $I_{22} = \frac{M}{12} b^2$, which satisfy equation 13.45.

13.10 General properties of the inertia tensor

13.10.1 Inertial equivalence
The elements of the inertia tensor, the values of the principal moments of inertia, and the orientation of the
principal axes for a rigid body, all depend on the choice of origin for the system. Recall that for the kinetic
energy to be separable into translational and rotational portions, the origin of the body coordinate system
must coincide with the center of mass of the body. However, for any choice of the origin of any body, there
always exists an orientation of the axes that diagonalizes the inertia tensor.
The inertial properties of a body for rotation about a specific body-fixed location are defined completely
by only three principal moments of inertia irrespective of the detailed shape of the body. As a result, the
inertial properties of any body about a body-fixed point are equivalent to that of an ellipsoid that has the
same three principal moments of inertia. The symmetry properties of this equivalent ellipsoidal body define
the symmetry of the inertial properties of the body. If a body has some simple symmetry then usually it is
obvious as to what will be the principal axes of the body.

Spherical top: $I_1 = I_2 = I_3$
A spherical top is a body having three degenerate principal moments of inertia. Such a body has the same
symmetry as the inertia tensor about the center of a uniform sphere. For a sphere it is obvious from the
symmetry that any orientation of three mutually orthogonal axes about the center of the uniform sphere are
equally good principal axes. For a uniform cube the principal axes of the inertia tensor about the center of
mass were shown to be aligned such that they pass through the center of each face, and the three principal
moments are identical; that is, inertially it is equivalent to a spherical top. A less obvious consequence of the
spherical symmetry is that any orientation of three mutually perpendicular axes about the center of mass of
a uniform cube is an equally good principal axis system.

Symmetric top: $I_1 = I_2 \neq I_3$
The equivalent ellipsoid for a body with two degenerate principal moments of inertia is a spheroid which has
cylindrical symmetry with the cylindrical axis aligned along the third axis. A body with $I_3 < I_1 = I_2$ is a
prolate spheroid while a body with $I_3 > I_1 = I_2$ is an oblate spheroid. Examples with a prolate spheroidal
equivalent inertial shape are a rugby ball, pencil, or a baseball bat. Examples of an oblate spheroid are an
orange, or a frisbee. A uniform sphere, or a uniform cube, rotating about a point displaced from the
center-of-mass also behaves inertially like a symmetric top. The cylindrical symmetry of the equivalent spheroid
makes it obvious that any mutually perpendicular axes that are normal to the axis of cylindrical symmetry
are equally good principal axes even when the cross section in the $1-2$ plane is square as opposed to circular.
A rotor is a diatomic-molecule shaped body which is a special case of a symmetric top where 1 = 0
and 2 = 3 . The rotation of a rotor is perpendicular to the symmetry axis since the rotational energy and
angular momentum about the symmetry axis are zero because the principal moment of inertia about the
symmetry axis is zero.

Asymmetric top: $I_1 \neq I_2 \neq I_3$
A body where all three principal moments of inertia are distinct, $I_1 \neq I_2 \neq I_3$, is called an asymmetric top. Some molecules and nuclei have asymmetric, triaxially-deformed shapes.

13.10.2 Orthogonality of principal axes

The body-fixed principal axes comprise an orthogonal set, for which the vectors $\mathbf{L}$ and $\boldsymbol{\omega}$ are simply related. Components of $\mathbf{L}$ and $\boldsymbol{\omega}$ can be taken along the three body-fixed axes denoted by $i$. Thus for the $m^{th}$ principal moment $I_m$
$$L_{im} = I_m \omega_{im} \tag{13.46}$$
Written in terms of the inertia tensor
$$L_{im} = \sum_{k}^{3} I_{ik}\,\omega_{km} = I_m \omega_{im} \tag{13.47}$$
Similarly the $n^{th}$ principal moment can be written as
$$L_{kn} = \sum_{i}^{3} I_{ki}\,\omega_{in} = I_n \omega_{kn} \tag{13.48}$$
Multiplying equation 13.47 by $\omega_{in}$ and summing over $i$ gives
$$\sum_{i,k} I_{ik}\,\omega_{km}\,\omega_{in} = \sum_{i} I_m\,\omega_{im}\,\omega_{in} \tag{13.49}$$
Similarly multiplying equation 13.48 by $\omega_{km}$ and summing over $k$ gives
$$\sum_{i,k} I_{ki}\,\omega_{in}\,\omega_{km} = \sum_{k} I_n\,\omega_{kn}\,\omega_{km} \tag{13.50}$$
The left-hand sides of these equations are identical since the inertia tensor is symmetric, that is $I_{ik} = I_{ki}$. Therefore subtracting these equations gives
$$\sum_{i} I_m\,\omega_{im}\,\omega_{in} - \sum_{k} I_n\,\omega_{kn}\,\omega_{km} = 0 \tag{13.51}$$
That is
$$(I_m - I_n) \sum_{i} \omega_{im}\,\omega_{in} = 0 \tag{13.52}$$
or
$$(I_m - I_n)\,\boldsymbol{\omega}_m \cdot \boldsymbol{\omega}_n = 0 \tag{13.53}$$
If $I_m \neq I_n$ then
$$\boldsymbol{\omega}_m \cdot \boldsymbol{\omega}_n = 0 \tag{13.54}$$
which implies that the $m$ and $n$ principal axes are perpendicular. However, if $I_m = I_n$ then equation 13.53 does not require that $\boldsymbol{\omega}_m \cdot \boldsymbol{\omega}_n = 0$, that is, these axes are not necessarily perpendicular, but, with no loss of generality, these two axes can be chosen to be perpendicular with any orientation in the plane perpendicular to the symmetry axis.
Summarizing the above discussion, the inertia tensor has the following properties.
1) Diagonalization may be accomplished by an appropriate rotation of the axes in the body.
2) The principal moments (eigenvalues) and principal axes (eigenvectors) are obtained as roots of the
secular determinant and are real.
3) The principal axes (eigenvectors) are real and orthogonal.
4) For a symmetric top with two identical principal moments of inertia, any orientation of two orthogonal axes perpendicular to the symmetry axis gives satisfactory eigenvectors.
5) For a spherical top with three identical principal moments of inertia, the principal axis system can have any orientation with respect to the origin.
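The properties listed above can be checked numerically. The following is a minimal sketch, assuming NumPy and a hypothetical rigid body made of six randomly placed point masses (the masses, positions, and the helper name `inertia_tensor` are illustrative and do not come from the text). It builds the discrete-body inertia tensor and diagonalizes it with a symmetric eigen-solver.

```python
import numpy as np

def inertia_tensor(masses, positions):
    """Inertia tensor I_ij = sum_a m_a (delta_ij r_a^2 - x_ai x_aj) for point masses."""
    r2 = np.sum(positions**2, axis=1)                  # |r_a|^2 for each particle
    diag_part = np.sum(masses * r2) * np.eye(3)        # sum_a m_a r_a^2 delta_ij
    outer_part = np.einsum("a,ai,aj->ij", masses, positions, positions)
    return diag_part - outer_part

rng = np.random.default_rng(0)          # arbitrary seed for a hypothetical body
masses = rng.uniform(0.5, 2.0, size=6)
positions = rng.normal(size=(6, 3))

I = inertia_tensor(masses, positions)
# eigh is intended for symmetric matrices; the columns of `axes` are the eigenvectors.
moments, axes = np.linalg.eigh(I)

symmetric = np.allclose(I, I.T)
orthonormal = np.allclose(axes.T @ axes, np.eye(3))
diagonalized = np.allclose(axes.T @ I @ axes, np.diag(moments))
print(moments, symmetric, orthonormal, diagonalized)
```

For this generic body the three principal moments are distinct, so the principal axes are fixed up to sign. For a symmetric or spherical top the solver still returns an orthogonal set, but its orientation within the degenerate subspace is an arbitrary choice, in line with properties 4 and 5.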

13.11 Angular momentum L and angular velocity ω vectors

The angular momentum is a primary observable for rotation. As discussed in chapter 13.5, the angular momentum $\mathbf{L}$ is compactly and elegantly written in matrix form using the tensor algebra relation
$$\mathbf{L} = \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} \cdot \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} \tag{13.55}$$
where $\boldsymbol{\omega}$ is the angular velocity, $\{\mathbf{I}\}$ the inertia tensor, and $\mathbf{L}$ the corresponding angular momentum.
Two important consequences of equation 13.55 are that:

• The angular momentum $\mathbf{L}$ and angular velocity $\boldsymbol{\omega}$ are not necessarily colinear.

• In general the principal axis system of the rotating rigid body is not aligned with either the angular momentum or angular velocity vectors.

An exception to these statements occurs when the angular velocity $\boldsymbol{\omega}$ is aligned along a principal axis for which the inertia tensor is diagonal, i.e. $L_j = I_{jj}\,\omega_j$, and then both $\mathbf{L}$ and $\boldsymbol{\omega}$ point along this principal axis. In general the angular momentum $\mathbf{L}$ and angular velocity $\boldsymbol{\omega}$ precess around each other. An important special case is for torque-free systems where Noether’s theorem implies that the angular momentum vector $\mathbf{L}$ is conserved in both magnitude and direction. In this case, the angular velocity $\boldsymbol{\omega}$ and the principal axis system both precess around the angular momentum vector $\mathbf{L}$. That is, the body appears to tumble with respect to the laboratory-fixed frame. Understanding rigid-body rotation requires care not to confuse the body-fixed principal axis coordinate frame, used to determine the inertia tensor, with the fixed laboratory frame where the motion is observed.

13.5 Example: Rotation about the center of mass of a solid cube

It is illustrative to use the inertia tensors of a uniform cube to compute the angular momentum for any applied angular velocity vector $\boldsymbol{\omega}$ using equation (13.55). If the angular velocity is along the $x$ axis, then using the inertia tensor for a solid cube of mass $M$ and side $b$, derived earlier, in equation (13.55) gives the angular momentum to be
$$\mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} = \frac{1}{6} M b^2 \omega \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \frac{1}{6} M b^2 \omega \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$$
This shows that $\mathbf{L}$ and $\boldsymbol{\omega}$ are colinear and thus the $x$ axis is a principal axis. By symmetry, the $y$ and $z$ body-fixed axes also must be principal axes.

Consider that the body is rotated about a diagonal of the cube for which the center of mass will be on the rotation axis. Then the angular velocity vector is written as $\boldsymbol{\omega} = \omega \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$ where the components $\omega_x = \omega_y = \omega_z = \omega \frac{1}{\sqrt{3}}$ give the angular velocity magnitude $\sqrt{\omega_x^2 + \omega_y^2 + \omega_z^2} = \omega$. Then
$$\mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} = \frac{1}{6} M b^2 \omega \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{6} M b^2 \omega \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{6} M b^2 \boldsymbol{\omega}$$

Note that $\mathbf{L}$ and $\boldsymbol{\omega}$ again are colinear, showing that the diagonal also is a principal axis. Moreover, the magnitude of $\mathbf{L}$ is identical for orientations of the rotation axis $\boldsymbol{\omega}$ passing through the center of mass when centered on either one face, or the diagonal, of the cube, implying that the principal moments of inertia about these axes are identical. This illustrates the important property that, when the three principal moments of inertia are identical, then any orientation of the coordinate system is an equally good principal axis system. That is, this corresponds to the spherical top where all orientations are principal axes, not just along the obvious symmetry axes.

13.6 Example: Rotation about the corner of the cube

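Let us repeat the above exercise for rotation about one corner of the cube. Consider that the angular velocity is along the $x$ axis. Then example (13.2) gives the angular momentum to be
$$\mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} = \frac{1}{12} M b^2 \omega \begin{pmatrix} +8 & -3 & -3 \\ -3 & +8 & -3 \\ -3 & -3 & +8 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \frac{1}{12} M b^2 \omega \begin{pmatrix} +8 \\ -3 \\ -3 \end{pmatrix}$$
The angular momentum is far from being aligned with the $x$ axis, that is, the $x$ axis is not a principal axis.
Consider that the body is rotated with the angular velocity aligned along the diagonal of the cube that passes through the corner and the center of mass. Then the angular velocity is written as $\boldsymbol{\omega} = \omega \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$ where the components $\omega_x = \omega_y = \omega_z = \omega \frac{1}{\sqrt{3}}$ ensure that the magnitude equals $\sqrt{\omega_x^2 + \omega_y^2 + \omega_z^2} = \omega$. Then
$$\mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} = \frac{1}{12} M b^2 \omega \frac{1}{\sqrt{3}} \begin{pmatrix} +8 & -3 & -3 \\ -3 & +8 & -3 \\ -3 & -3 & +8 \end{pmatrix} \cdot \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \frac{1}{12} M b^2 \omega \frac{1}{\sqrt{3}} \begin{pmatrix} 2 \\ 2 \\ 2 \end{pmatrix} = \frac{1}{6} M b^2 \boldsymbol{\omega}$$
This is a principal axis since $\mathbf{L}$ and $\boldsymbol{\omega}$ again are colinear, and the angular momentum is the same as for any axis through the center of mass of a uniform solid cube due to the high symmetry of the cube. If the angular velocity is perpendicular to the diagonal of the cube, then, for either of these perpendicular axes, the relation between $\mathbf{L}$ and $\boldsymbol{\omega}$ is given by
$$\mathbf{L} = \frac{1}{12} M b^2 \omega \frac{1}{\sqrt{2}} \begin{pmatrix} +8 & -3 & -3 \\ -3 & +8 & -3 \\ -3 & -3 & +8 \end{pmatrix} \cdot \begin{pmatrix} -1 \\ +1 \\ 0 \end{pmatrix} = \frac{1}{12} M b^2 \omega \frac{1}{\sqrt{2}} \begin{pmatrix} -11 \\ +11 \\ 0 \end{pmatrix} = \frac{11}{12} M b^2 \omega \frac{1}{\sqrt{2}} \begin{pmatrix} -1 \\ +1 \\ 0 \end{pmatrix} = \frac{11}{12} M b^2 \boldsymbol{\omega}$$
Note that this must be a principal axis for rotation about a corner of the cube since $\mathbf{L}$ and $\boldsymbol{\omega}$ are colinear. The angular momentum is the same for both possible orientations of $\boldsymbol{\omega}$ that are perpendicular to the diagonal through the center of mass. Diagonalizing the inertia tensor in example 13.2 also gave the above result with the symmetry axis along the diagonal of the cube.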
This example illustrates that it is not necessary to diagonalize the inertia tensor matrix to obtain the
principal axes. The corner of the cube has three mutually perpendicular principal axes independent of the
choice of a body-fixed coordinate frame. The advantage of the principal axis coordinate frame is that the
inertia tensor is diagonal, making evaluation of the angular momentum trivial. That is, there is no physics associated with the orientation chosen for the body-fixed coordinate frame; this frame only determines the ratio of the components of the inertia tensor along the chosen coordinates. Note that, if a body has an obvious symmetry, then intuition is a powerful way to identify the principal axis frame.
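The colinearity test used in this example is easy to automate. The sketch below, assuming NumPy, works in units where the cube mass and side length are both 1 (so the tensor is expressed in units of the mass times the side length squared); the helper name `is_principal_axis` is hypothetical. It flags a direction as a principal axis when the cross product of the angular momentum and the angular velocity vanishes.

```python
import numpy as np

# Inertia tensor about a corner of a uniform cube, in units of (mass * side**2).
# Setting mass = side = 1 is an assumption made only to keep the sketch dimensionless.
I_corner = np.array([[ 8, -3, -3],
                     [-3,  8, -3],
                     [-3, -3,  8]]) / 12.0

def is_principal_axis(inertia, omega):
    """True when L = I.omega is parallel to omega, i.e. L x omega = 0."""
    L = inertia @ omega
    return bool(np.allclose(np.cross(L, omega), 0.0))

along_edge = np.array([1.0, 0.0, 0.0])
along_diagonal = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)
perp_to_diagonal = np.array([-1.0, 1.0, 0.0]) / np.sqrt(2)

edge_ok = is_principal_axis(I_corner, along_edge)
diag_ok = is_principal_axis(I_corner, along_diagonal)
perp_ok = is_principal_axis(I_corner, perp_to_diagonal)

# For a unit vector along a principal axis, |L| equals the principal moment.
moment_diag = np.linalg.norm(I_corner @ along_diagonal)
moment_perp = np.linalg.norm(I_corner @ perp_to_diagonal)
print(edge_ok, diag_ok, perp_ok, moment_diag, moment_perp)
```

The edge direction fails the test, while the diagonal and the direction perpendicular to it pass, with moments of 1/6 and 11/12 in these units, matching the hand calculation above.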

13.12 Kinetic energy of rotating rigid body

An important observable is the kinetic energy of rotation of a rigid body. Consider a rigid body composed of $N$ particles of mass $m_\alpha$ where $\alpha = 1, 2, 3, \ldots, N$. If the body rotates with an instantaneous angular velocity $\boldsymbol{\omega}$ about some fixed point, with respect to the body coordinate system, and this point has an instantaneous translational velocity $\mathbf{V}$ with respect to the fixed (inertial) coordinate system, see figure 13.1, then the instantaneous velocity $\mathbf{v}_\alpha$ of the $\alpha^{th}$ particle in the fixed frame of reference is given by
$$\mathbf{v}_\alpha = \mathbf{V} + \mathbf{v}''_\alpha + \boldsymbol{\omega} \times \mathbf{r}'_\alpha \tag{13.56}$$
However, for a rigid body, the velocity of a body-fixed point with respect to the body is zero, that is $\mathbf{v}''_\alpha = 0$, thus
$$\mathbf{v}_\alpha = \mathbf{V} + \boldsymbol{\omega} \times \mathbf{r}'_\alpha \tag{13.57}$$
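The total kinetic energy is given by
$$\begin{aligned} T &= \sum_{\alpha}^{N} \frac{1}{2} m_\alpha \mathbf{v}_\alpha \cdot \mathbf{v}_\alpha = \frac{1}{2} \sum_{\alpha}^{N} m_\alpha \left(\mathbf{V} + \boldsymbol{\omega} \times \mathbf{r}'_\alpha\right) \cdot \left(\mathbf{V} + \boldsymbol{\omega} \times \mathbf{r}'_\alpha\right) \\ &= \frac{1}{2} \sum_{\alpha}^{N} m_\alpha V^2 + \sum_{\alpha}^{N} m_\alpha \mathbf{V} \cdot \boldsymbol{\omega} \times \mathbf{r}'_\alpha + \frac{1}{2} \sum_{\alpha}^{N} m_\alpha \left(\boldsymbol{\omega} \times \mathbf{r}'_\alpha\right) \cdot \left(\boldsymbol{\omega} \times \mathbf{r}'_\alpha\right) \end{aligned} \tag{13.58}$$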

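This is a general expression for the kinetic energy that is valid for any choice of the origin from which the body-fixed vectors $\mathbf{r}'_\alpha$ are measured. However, if the origin is chosen to be the center of mass, then, and only then, the middle term cancels. That is, since $\mathbf{V}$ and $\boldsymbol{\omega}$ are independent of the specific particle, then
$$\sum_{\alpha}^{N} m_\alpha \mathbf{V} \cdot \boldsymbol{\omega} \times \mathbf{r}'_\alpha = \mathbf{V} \cdot \boldsymbol{\omega} \times \left(\sum_{\alpha}^{N} m_\alpha \mathbf{r}'_\alpha\right) \tag{13.59}$$
But the definition of the center of mass is
$$\sum_{\alpha}^{N} m_\alpha \mathbf{r}'_\alpha = M \mathbf{R} \tag{13.60}$$
and $\mathbf{R} = 0$ in the body-fixed frame if the selected point in the body is the center of mass. Thus, when using the center of mass frame, the middle term of equation 13.58 is zero. Therefore, for the center of mass frame, the kinetic energy separates into two terms in the body-fixed frame
$$T = T_{trans} + T_{rot} \tag{13.61}$$
where
$$\begin{aligned} T_{trans} &= \frac{1}{2} \sum_{\alpha}^{N} m_\alpha V^2 \\ T_{rot} &= \frac{1}{2} \sum_{\alpha}^{N} m_\alpha \left(\boldsymbol{\omega} \times \mathbf{r}'_\alpha\right) \cdot \left(\boldsymbol{\omega} \times \mathbf{r}'_\alpha\right) \end{aligned} \tag{13.62}$$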

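The vector identity
$$(\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{A} \times \mathbf{B}) = A^2 B^2 - (\mathbf{A} \cdot \mathbf{B})^2 \tag{13.63}$$
can be used to simplify $T_{rot}$
$$T_{rot} = \frac{1}{2} \sum_{\alpha}^{N} m_\alpha \left[\omega^2 r'^2_\alpha - \left(\boldsymbol{\omega} \cdot \mathbf{r}'_\alpha\right)^2\right] \tag{13.64}$$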

The rotational kinetic energy $T_{rot}$ can be expressed in terms of components of $\boldsymbol{\omega}$ and $\mathbf{r}'_\alpha$ in the body-fixed frame. Also the following formulae are greatly simplified if $\mathbf{r}'_\alpha = (x_\alpha, y_\alpha, z_\alpha)$ in the rotating body-fixed frame is written in the form $\mathbf{r}'_\alpha = (x_{\alpha,1}, x_{\alpha,2}, x_{\alpha,3})$ where the axes are defined by the numbers $1, 2, 3$ rather than $x, y, z$. In this notation the rotational kinetic energy is written as
$$T_{rot} = \frac{1}{2} \sum_{\alpha}^{N} m_\alpha \left[\left(\sum_{i} \omega_i^2\right)\left(\sum_{k} x_{\alpha,k}^2\right) - \left(\sum_{i} \omega_i x_{\alpha,i}\right)\left(\sum_{j} \omega_j x_{\alpha,j}\right)\right] \tag{13.65}$$


Assume the Kronecker delta relation
$$\omega_i = \sum_{j}^{3} \omega_j \delta_{ij} \tag{13.66}$$
where $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ if $i \neq j$.
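Then the kinetic energy can be written more compactly
$$\begin{aligned} T_{rot} &= \frac{1}{2} \sum_{\alpha}^{N} m_\alpha \left[\left(\sum_{i} \omega_i^2\right)\left(\sum_{k} x_{\alpha,k}^2\right) - \left(\sum_{i} \omega_i x_{\alpha,i}\right)\left(\sum_{j} \omega_j x_{\alpha,j}\right)\right] \\ &= \frac{1}{2} \sum_{\alpha}^{N} \sum_{i,j}^{3} m_\alpha \left[\left(\omega_i \omega_j \delta_{ij}\right)\left(\sum_{k}^{3} x_{\alpha,k}^2\right) - \left(\omega_i x_{\alpha,i}\right)\left(\omega_j x_{\alpha,j}\right)\right] \\ &= \frac{1}{2} \sum_{i,j}^{3} \omega_i \omega_j \left[\sum_{\alpha}^{N} m_\alpha \left[\delta_{ij}\left(\sum_{k}^{3} x_{\alpha,k}^2\right) - x_{\alpha,i}\, x_{\alpha,j}\right]\right] \end{aligned} \tag{13.67}$$
The term in the outer square brackets is the inertia tensor defined in equation 13.12 for a discrete body. The inertia tensor components for a continuous body are given by equation 13.13.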
Thus the rotational component of the kinetic energy can be written in terms of the inertia tensor as
$$T_{rot} = \frac{1}{2} \sum_{i,j}^{3} I_{ij}\,\omega_i\,\omega_j \tag{13.68}$$
Note that when the inertia tensor is diagonal, then the evaluation of the kinetic energy simplifies to
$$T_{rot} = \frac{1}{2} \sum_{i}^{3} I_{ii}\,\omega_i^2 \tag{13.69}$$
which is the familiar relation in terms of the scalar moment of inertia $I$ discussed in elementary mechanics.
Equation 1368 also can be factored in terms of the angular momentum L.
1X 1X X 1X
 =      =     =    (13.70)
2  2  
2 

As mentioned earlier, tensor algebra is an elegant and compact way of expressing such matrix operations.
Thus it is possible to express the rotational kinetic energy as
⎛ ⎞ ⎛ ⎞
¡ ¢ 11 12 13 1
1
 =  1  2  3 · ⎝ 21 22 23 ⎠ · ⎝  2 ⎠ (13.71)
2
31 32 33 3
1
 ≡ T = ω · {I} · ω (13.72)
2
where the rotational energy T is a scalar. Using equation 1355 the rotational component of the kinetic
energy also can be written as
1
 ≡ T = ω · L (13.73)
2
which is the same as given by (1370). It is interesting to realize that even though L = {I} · ω is the inner
product of a tensor and a vector, it is a vector as illustrated by the fact that the inner product  = 12 ω·L =
1
2 ω · ({I} · ω) is a scalar. Note that the translational kinetic energy  must be added to the rotational
kinetic energy  to get the total kinetic energy as given by equation 1361
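The equivalence of the particle sum, the tensor form, and the angular-momentum form of the rotational kinetic energy can be confirmed numerically. This is a minimal sketch, assuming NumPy and a hypothetical body of five random point masses with an arbitrary angular velocity; none of the numerical values come from the text.

```python
import numpy as np

rng = np.random.default_rng(1)              # arbitrary seed for a hypothetical body
masses = rng.uniform(0.5, 2.0, size=5)
positions = rng.normal(size=(5, 3))         # body-fixed position vectors of the particles
omega = np.array([0.3, -1.2, 0.7])          # arbitrary angular velocity components

# Direct particle sum: T_rot = 1/2 sum_a m_a |omega x r_a|^2
velocities = np.cross(omega, positions)     # omega broadcasts against each row
T_particles = 0.5 * np.sum(masses * np.sum(velocities**2, axis=1))

# Inertia tensor of the discrete body: I_ij = sum_a m_a (delta_ij r_a^2 - x_ai x_aj)
r2 = np.sum(positions**2, axis=1)
I = np.sum(masses * r2) * np.eye(3) - np.einsum("a,ai,aj->ij", masses, positions, positions)

T_tensor = 0.5 * omega @ I @ omega          # 1/2 omega . {I} . omega
L = I @ omega                               # angular momentum L = {I} . omega
T_from_L = 0.5 * omega @ L                  # 1/2 omega . L
print(T_particles, T_tensor, T_from_L)
```

All three numbers agree to rounding error for any choice of masses, positions, and angular velocity, since the three expressions are algebraically identical.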

13.13 Euler angles

The description of rigid-body rotation is greatly facilitated by transforming from the space-fixed coordinate frame $(\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}})$ to a rotating body-fixed coordinate frame $(\hat{\mathbf{1}}, \hat{\mathbf{2}}, \hat{\mathbf{3}})$ for which the inertia tensor is diagonal. The appendix introduced the rotation matrix $\{\boldsymbol{\lambda}\}$ which can be used to rotate between the space-fixed coordinate system, which is stationary, and the instantaneous body-fixed frame which is rotating with respect to the space-fixed frame. The transformation can be represented by a matrix equation
$$(\hat{\mathbf{1}}, \hat{\mathbf{2}}, \hat{\mathbf{3}}) = \{\boldsymbol{\lambda}\} \cdot (\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}}) \tag{13.74}$$
where the space-fixed system is identified by unit vectors $(\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}})$ while $(\hat{\mathbf{1}}, \hat{\mathbf{2}}, \hat{\mathbf{3}})$ defines unit vectors in the rotated body-fixed system. The rotation matrix $\{\boldsymbol{\lambda}\}$ completely describes the instantaneous relative orientation of the two systems. Rigid-body rotation requires three independent angular parameters that specify the orientation of the rigid body such that the corresponding orthogonal transformation matrix is proper, that is, it has a determinant $|\lambda| = +1$ as given by equation (33).
As discussed in Appendix 2, the 9-component rotation matrix involves only three independent angles. There are many possible choices for these three angles. It is convenient to use the Euler angles, $\phi, \theta, \psi$ (also called Eulerian angles), shown in figure 13.3 (see footnote 1). The Euler angles are generated by a series of three rotations that rotate from the space-fixed $(\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}})$ system to the body-fixed $(\hat{\mathbf{1}}, \hat{\mathbf{2}}, \hat{\mathbf{3}})$ system. The rotation must be such that the space-fixed $z$ axis rotates by an angle $\theta$ to align with the body-fixed $3$ axis. This can be performed by rotating through an angle $\theta$ about the $\hat{\mathbf{n}} \equiv \hat{\mathbf{z}} \times \hat{\mathbf{3}}$ direction, where $\hat{\mathbf{z}}$ and $\hat{\mathbf{3}}$ designate the unit vectors along the "$z$" axes of the space-fixed and body-fixed frames respectively. The unit vector $\hat{\mathbf{n}} \equiv \hat{\mathbf{z}} \times \hat{\mathbf{3}}$ is the vector normal to the plane defined by the $\hat{\mathbf{z}}$ and $\hat{\mathbf{3}}$ unit vectors, and this unit vector is called the line of nodes. The chosen convention is that the unit vector $\hat{\mathbf{n}} = \hat{\mathbf{z}} \times \hat{\mathbf{3}}$ is along the "$x$" axis of an intermediate-axis frame designated by $(\hat{\mathbf{n}}, \hat{\mathbf{y}}', \hat{\mathbf{z}})$, that is, the unit vectors $\hat{\mathbf{y}}'$ and $\hat{\mathbf{z}}$ lie in the plane defined by the $\hat{\mathbf{z}}$ and $\hat{\mathbf{3}}$ unit vectors, to which $\hat{\mathbf{n}}$ is perpendicular. The sequence of three rotations is performed as summarized below.

Figure 13.3: The $z - x - z$ sequence of rotations corresponding to the Eulerian angles $(\phi, \theta, \psi)$. The first rotation $\phi$ about the space-fixed $z$ axis (blue) is from the $x$-axis (blue) to the line of nodes $\mathbf{n}$ (green). The second rotation $\theta$ about the line of nodes (green) is from the space-fixed $z$ axis (blue) to the body-fixed 3-axis (red). The third rotation $\psi$ about the body-fixed 3-axis (red) is from the line of nodes (green) to the body-fixed 1 axis (red).

1) Rotation $\phi$ about the space-fixed $\hat{\mathbf{z}}$ axis from the space $\hat{\mathbf{x}}$ axis to the line of nodes $\hat{\mathbf{n}}$: The first rotation $(\mathbf{x}, \mathbf{y}, \mathbf{z}) \cdot \lambda_\phi \rightarrow (\mathbf{n}, \mathbf{y}', \mathbf{z})$ is in a right-handed direction through an angle $\phi$ about the space-fixed $\mathbf{z}$ axis. Since the rotation takes place in the $\mathbf{x} - \mathbf{y}$ plane, the transformation matrix is
$$\{\lambda_\phi\} = \begin{pmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{13.75}$$
This leads to the intermediate coordinate system $(\mathbf{n}, \mathbf{y}', \mathbf{z})$ where the rotated $\mathbf{x}$ axis now is colinear with the $\mathbf{n}$ axis of the intermediate frame, that is, the line of nodes.
$$(\mathbf{n}, \mathbf{y}', \mathbf{z}) = \{\lambda_\phi\} \cdot (\mathbf{x}, \mathbf{y}, \mathbf{z}) \tag{13.76}$$
The precession angular velocity $\dot{\phi}$ is the rate of change of angle of the line of nodes with respect to the space $x$ axis about the space-fixed $z$ axis.

Footnote 1: The space-fixed coordinate frame and the body-fixed coordinate frames are unambiguously defined, that is, the space-fixed frame is stationary while the body-fixed frame is the principal-axis frame of the body. There are several possible intermediate frames that can be used to define the Euler angles. The $z - x - z$ sequence of rotations, used here, is used in most physics textbooks in classical mechanics. Unfortunately scientists and engineers use slightly different conventions for defining the Euler angles. As discussed in Appendix A of "Classical Mechanics" by Goldstein, nuclear and particle physicists have adopted the $z - y - z$ sequence of rotations while the US and UK aerodynamicists have adopted an $x - y - z$ sequence of rotations.

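2) Rotation $\theta$ about the line of nodes $\hat{\mathbf{n}}$ from the space $\hat{\mathbf{z}}$ axis to the body-fixed $\hat{\mathbf{3}}$ axis: The second rotation
$$(\mathbf{n}, \mathbf{y}', \mathbf{z}) \cdot \lambda_\theta \rightarrow (\mathbf{n}, \mathbf{y}'', \mathbf{3}) \tag{13.77}$$
is in a right-handed direction through the angle $\theta$ about the $\hat{\mathbf{n}}$ axis (line of nodes) so that the "$z$" axis becomes colinear with the body-fixed $\hat{\mathbf{3}}$ axis. Because the rotation now is in the $\hat{\mathbf{z}} - \hat{\mathbf{3}}$ plane, the transformation matrix is
$$\{\lambda_\theta\} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix} \tag{13.78}$$
The line of nodes, which is at the intersection of the space-fixed $x - y$ plane and the body-fixed $1 - 2$ plane shown in figure 13.3, points in the $\hat{\mathbf{n}} = \hat{\mathbf{z}} \times \hat{\mathbf{3}}$ direction. The new "$z$" axis now is the body-fixed $\hat{\mathbf{3}}$ axis. The angular velocity $\dot{\theta}$ is the rate of change of angle of the body-fixed $\hat{\mathbf{3}}$-axis relative to the space-fixed $\hat{\mathbf{z}}$-axis about the line of nodes.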

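3) Rotation $\psi$ about the body-fixed $\hat{\mathbf{3}}$ axis from the line of nodes to the body-fixed $\hat{\mathbf{1}}$ axis: The third rotation
$$(\mathbf{n}, \mathbf{y}'', \mathbf{3}) \cdot \lambda_\psi \rightarrow (\hat{\mathbf{1}}, \hat{\mathbf{2}}, \hat{\mathbf{3}}) \tag{13.79}$$
is in a right-handed direction through the angle $\psi$ about the new body-fixed $\hat{\mathbf{3}}$ axis. This third rotation transforms the rotated intermediate $(\mathbf{n}, \mathbf{y}'', \mathbf{3})$ frame to the final body-fixed coordinate system $(\hat{\mathbf{1}}, \hat{\mathbf{2}}, \hat{\mathbf{3}})$. The transformation matrix is
$$\{\lambda_\psi\} = \begin{pmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{13.80}$$
The spin angular velocity $\dot{\psi}$ is the rate of change of the angle of the body-fixed 1-axis with respect to the line of nodes about the body-fixed 3 axis.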
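The total rotation matrix $\{\boldsymbol{\lambda}\}$ is given by
$$\{\boldsymbol{\lambda}\} = \{\lambda_\psi\} \cdot \{\lambda_\theta\} \cdot \{\lambda_\phi\} \tag{13.81}$$
Thus the complete rotation from the space-fixed $(\mathbf{x}, \mathbf{y}, \mathbf{z})$ axis system to the body-fixed $(\mathbf{1}, \mathbf{2}, \mathbf{3})$ axis system is given by
$$(\mathbf{1}, \mathbf{2}, \mathbf{3}) = \{\boldsymbol{\lambda}\} \cdot (\mathbf{x}, \mathbf{y}, \mathbf{z}) \tag{13.82}$$
where $\{\boldsymbol{\lambda}\}$ is given by the triple product equation (13.81) leading to the rotation matrix
$$\{\boldsymbol{\lambda}\} = \begin{pmatrix} \cos\phi\cos\psi - \sin\phi\cos\theta\sin\psi & \sin\phi\cos\psi + \cos\phi\cos\theta\sin\psi & \sin\theta\sin\psi \\ -\cos\phi\sin\psi - \sin\phi\cos\theta\cos\psi & -\sin\phi\sin\psi + \cos\phi\cos\theta\cos\psi & \sin\theta\cos\psi \\ \sin\phi\sin\theta & -\cos\phi\sin\theta & \cos\theta \end{pmatrix} \tag{13.83}$$
The inverse transformation from the body-fixed axis system to the space-fixed axis system is given by
$$(\mathbf{x}, \mathbf{y}, \mathbf{z}) = \{\boldsymbol{\lambda}\}^{-1} \cdot (\mathbf{1}, \mathbf{2}, \mathbf{3}) \tag{13.84}$$
where the inverse matrix $\{\boldsymbol{\lambda}\}^{-1}$ equals the transposed rotation matrix $\{\boldsymbol{\lambda}\}^{T}$, that is,
$$\{\boldsymbol{\lambda}\}^{-1} = \{\boldsymbol{\lambda}\}^{T} = \begin{pmatrix} \cos\phi\cos\psi - \sin\phi\cos\theta\sin\psi & -\cos\phi\sin\psi - \sin\phi\cos\theta\cos\psi & \sin\phi\sin\theta \\ \sin\phi\cos\psi + \cos\phi\cos\theta\sin\psi & -\sin\phi\sin\psi + \cos\phi\cos\theta\cos\psi & -\cos\phi\sin\theta \\ \sin\theta\sin\psi & \sin\theta\cos\psi & \cos\theta \end{pmatrix} \tag{13.85}$$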

Taking the product $\{\boldsymbol{\lambda}\}\{\boldsymbol{\lambda}\}^{-1} = \{\boldsymbol{\lambda}\}\{\boldsymbol{\lambda}\}^{T} = 1$, the unit matrix, shows that the rotation matrix is orthogonal; it is proper since its determinant is $+1$.
The use of three different coordinate systems, space-fixed, the intermediate line of nodes, and the body-fixed frame, can be confusing at first glance. Basically the angle $\phi$ specifies the rotation about the space-fixed $z$ axis between the space-fixed $x$ axis and the line of nodes of the Euler angle intermediate frame. The angle $\psi$ specifies the rotation about the body-fixed 3 axis between the line of nodes and the body-fixed 1 axis. Note that although the space-fixed and body-fixed axes systems each are orthogonal, the Euler angle basis in general is not orthogonal. For rigid-body rotation the rotation angle $\phi$ about the space-fixed $z$ axis is time dependent, that is, the line of nodes is rotating with an angular velocity $\dot{\phi}$ with respect to the space-fixed coordinate frame. Similarly the body-fixed coordinate frame is rotating about the body-fixed 3 axis with angular velocity $\dot{\psi}$ relative to the line of nodes.
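The three-step construction can be sketched in code. The following is a minimal sketch, assuming NumPy; the function names (`rot_z`, `rot_x`, `euler_matrix`, `euler_closed_form`) and the test angles are illustrative choices, not part of the text. It composes the three elementary matrices in the order of equation 13.81, compares the result with the closed form of equation 13.83, and checks that the matrix is proper and orthogonal.

```python
import numpy as np

def rot_z(angle):
    """Rotation about a z-type axis, in the form of equations 13.75 and 13.80."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, s, 0.0],
                     [-s, c, 0.0],
                     [0.0, 0.0, 1.0]])

def rot_x(angle):
    """Rotation about the line of nodes, in the form of equation 13.78."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, s],
                     [0.0, -s, c]])

def euler_matrix(phi, theta, psi):
    """Triple product of equation 13.81; the first rotation sits rightmost."""
    return rot_z(psi) @ rot_x(theta) @ rot_z(phi)

def euler_closed_form(phi, theta, psi):
    """Closed-form matrix of equation 13.83."""
    cf, sf = np.cos(phi), np.sin(phi)
    ct, st = np.cos(theta), np.sin(theta)
    cp, sp = np.cos(psi), np.sin(psi)
    return np.array([
        [cf * cp - sf * ct * sp,  sf * cp + cf * ct * sp, st * sp],
        [-cf * sp - sf * ct * cp, -sf * sp + cf * ct * cp, st * cp],
        [sf * st,                 -cf * st,                ct],
    ])

phi, theta, psi = 0.4, 1.1, -0.7        # arbitrary test angles in radians
lam = euler_matrix(phi, theta, psi)

matches_closed_form = bool(np.allclose(lam, euler_closed_form(phi, theta, psi)))
is_orthogonal = bool(np.allclose(lam @ lam.T, np.eye(3)))
determinant = np.linalg.det(lam)
print(matches_closed_form, is_orthogonal, determinant)
```

Because matrix multiplication does not commute, reversing the order of the three factors generally gives a different matrix, so the ordering in equation 13.81 matters.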

13.7 Example: Euler angle transformation

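The definition of the Euler angles can be confusing, therefore it is useful to illustrate their use for a rotational transformation of a primed frame $(x', y', z')$ to an unprimed frame $(x, y, z)$. Assume the first rotation about the $z'$ axis is $\phi = 30^\circ$
$$\lambda_\phi = \begin{pmatrix} \frac{\sqrt{3}}{2} & \frac{1}{2} & 0 \\ -\frac{1}{2} & \frac{\sqrt{3}}{2} & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
Let the second rotation be $\theta = 45^\circ$ about the line of nodes, that is, the intermediate "$x$" axis. Then
$$\lambda_\theta = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}$$
Let the third rotation be $\psi = 90^\circ$ about the $z$ axis.
$$\lambda_\psi = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
Thus the net rotation corresponds to $\lambda = \lambda_\psi \lambda_\theta \lambda_\phi$, with the factors ordered as in equation 13.81,
$$\lambda = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} \frac{\sqrt{3}}{2} & \frac{1}{2} & 0 \\ -\frac{1}{2} & \frac{\sqrt{3}}{2} & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -\frac{1}{4}\sqrt{2} & \frac{1}{4}\sqrt{6} & \frac{1}{2}\sqrt{2} \\ -\frac{1}{2}\sqrt{3} & -\frac{1}{2} & 0 \\ \frac{1}{4}\sqrt{2} & -\frac{1}{4}\sqrt{6} & \frac{1}{2}\sqrt{2} \end{pmatrix}$$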

13.14 Angular velocity ω

It is useful to relate the rigid-body equations of motion in the space-fixed $(\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}})$ coordinate system to those in the body-fixed $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ coordinate system where the principal axis inertia tensor is defined. It was shown in the appendix that an infinitesimal rotation can be represented by a vector. Thus the time derivatives of these rotation angles can be associated with the components of the angular velocity $\boldsymbol{\omega}$ where the precession $\omega_\phi = \dot{\phi}$, the nutation $\omega_\theta = \dot{\theta}$, and the spin $\omega_\psi = \dot{\psi}$. Unfortunately the coordinates $(\phi, \theta, \psi)$ are with respect to mixed coordinate frames and thus are not orthogonal axes. That is, the Euler angular velocities are expressed in different coordinate frames, where the precession $\dot{\phi}$ is around the space-fixed $\hat{\mathbf{z}}$ axis measured relative to the $\hat{\mathbf{x}}$-axis, the spin $\dot{\psi}$ is around the body-fixed $\hat{\mathbf{e}}_3$ axis relative to the rotating line-of-nodes, and the nutation $\dot{\theta}$ is the angular velocity between the $\hat{\mathbf{z}}$ and $\hat{\mathbf{e}}_3$ axes and points along the instantaneous line-of-nodes in the $\hat{\mathbf{z}} \times \hat{\mathbf{e}}_3$ direction. By reference to figure 13.3 it can be seen that the components along the body-fixed axes are as given in Table 13.1.

Table 13.1: Euler angular velocity components in the body-fixed frame

Precession $\dot{\phi}$                              Nutation $\dot{\theta}$                          Spin $\dot{\psi}$
$\dot{\phi}_1 = \dot{\phi} \sin\theta \sin\psi$      $\dot{\theta}_1 = \dot{\theta} \cos\psi$         $\dot{\psi}_1 = 0$
$\dot{\phi}_2 = \dot{\phi} \sin\theta \cos\psi$      $\dot{\theta}_2 = -\dot{\theta} \sin\psi$        $\dot{\psi}_2 = 0$
$\dot{\phi}_3 = \dot{\phi} \cos\theta$               $\dot{\theta}_3 = 0$                             $\dot{\psi}_3 = \dot{\psi}$

Note that the precession angular velocity $\dot{\phi}$ is the angular velocity with which the body-fixed $\hat{\mathbf{e}}_3$ and $\hat{\mathbf{z}} \times \hat{\mathbf{3}}$ axes precess around the space-fixed $\hat{\mathbf{z}}$ axis. Table 13.1 gives the Euler angular velocities required to calculate the components of the angular velocity $\boldsymbol{\omega}$ for the body-fixed $(1, 2, 3)$ axis system. Collecting the individual components of $\boldsymbol{\omega}$ gives the components of the angular velocity of the body, relative to the space-fixed axes, in the body-fixed axis system $(1, 2, 3)$
$$\omega_1 = \dot{\phi}_1 + \dot{\theta}_1 + \dot{\psi}_1 = \dot{\phi} \sin\theta \sin\psi + \dot{\theta} \cos\psi \tag{13.86}$$
$$\omega_2 = \dot{\phi}_2 + \dot{\theta}_2 + \dot{\psi}_2 = \dot{\phi} \sin\theta \cos\psi - \dot{\theta} \sin\psi \tag{13.87}$$
$$\omega_3 = \dot{\phi}_3 + \dot{\theta}_3 + \dot{\psi}_3 = \dot{\phi} \cos\theta + \dot{\psi} \tag{13.88}$$
The angular velocity of the body about the body-fixed 3-axis, $\omega_3$, is the sum of the projection onto the 3-axis of the precession angular velocity $\dot{\phi}$ of the line-of-nodes with respect to the space-fixed $x$-axis, plus the spin angular velocity $\dot{\psi}$ about the body-fixed 3-axis with respect to the rotating line-of-nodes.
Similarly, the components of the body angular velocity $\boldsymbol{\omega}$ for the space-fixed axis system $(x, y, z)$ can be derived to be
$$\omega_x = \dot{\theta} \cos\phi + \dot{\psi} \sin\theta \sin\phi \tag{13.89}$$
$$\omega_y = \dot{\theta} \sin\phi - \dot{\psi} \sin\theta \cos\phi \tag{13.90}$$
$$\omega_z = \dot{\phi} + \dot{\psi} \cos\theta \tag{13.91}$$
Note that when $\theta = 0$ then the Euler angles are singular in that the space-fixed $z$ axis is parallel with the body-fixed 3 axis and there is no way of distinguishing between precession $\dot{\phi}$ and spin $\dot{\psi}$, leading to $\omega_z = \omega_3 = \dot{\phi} + \dot{\psi}$. When $\theta = \pi$ then the $z$ axis and 3 axis are antiparallel and $\omega_z = \dot{\phi} - \dot{\psi} = -\omega_3$. The other special case is when $\cos\theta = 0$ for which the Euler angle system is orthogonal and the space-fixed $\omega_z = \dot{\phi}$, that is, it equals the precession, while the body-fixed $\omega_3 = \dot{\psi}$, that is, it equals the spin. When the Euler angle basis is not orthogonal then equations (13.86 − 88) and (13.89 − 91) are needed for expressing the Euler equations of motion in either the body-fixed frame or the space-fixed frame respectively.
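Since the body-fixed and space-fixed components describe the same vector, they must be related by the Euler rotation matrix of equation 13.81. The following is a minimal numerical sketch of that consistency check, assuming NumPy; the function names and the sample angles and rates are hypothetical.

```python
import numpy as np

def euler_matrix(phi, theta, psi):
    """Rotation matrix from the space-fixed to the body-fixed frame (equation 13.81)."""
    def rot_z(a):
        return np.array([[np.cos(a), np.sin(a), 0.0],
                         [-np.sin(a), np.cos(a), 0.0],
                         [0.0, 0.0, 1.0]])
    def rot_x(a):
        return np.array([[1.0, 0.0, 0.0],
                         [0.0, np.cos(a), np.sin(a)],
                         [0.0, -np.sin(a), np.cos(a)]])
    return rot_z(psi) @ rot_x(theta) @ rot_z(phi)

def omega_body(theta, psi, dphi, dtheta, dpsi):
    """Body-fixed components, equations 13.86 to 13.88."""
    return np.array([
        dphi * np.sin(theta) * np.sin(psi) + dtheta * np.cos(psi),
        dphi * np.sin(theta) * np.cos(psi) - dtheta * np.sin(psi),
        dphi * np.cos(theta) + dpsi,
    ])

def omega_space(phi, theta, dphi, dtheta, dpsi):
    """Space-fixed components, equations 13.89 to 13.91."""
    return np.array([
        dtheta * np.cos(phi) + dpsi * np.sin(theta) * np.sin(phi),
        dtheta * np.sin(phi) - dpsi * np.sin(theta) * np.cos(phi),
        dphi + dpsi * np.cos(theta),
    ])

phi, theta, psi = 0.4, 1.1, -0.7          # arbitrary Euler angles in radians
dphi, dtheta, dpsi = 0.9, -0.3, 2.5       # arbitrary Euler angular velocities

w_body = omega_body(theta, psi, dphi, dtheta, dpsi)
w_space = omega_space(phi, theta, dphi, dtheta, dpsi)
consistent = bool(np.allclose(w_body, euler_matrix(phi, theta, psi) @ w_space))
print(w_body, w_space, consistent)
```

The check passes for any angles and rates, which is a useful guard against sign errors when transcribing the two sets of component equations.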
Equations 1386 − 88 for the components of the angular velocity in the body-fixed frame can be expressed
in terms of the Euler angle velocities in a matrix form as
⎛ ⎞ ⎛ ⎞ ⎛ ⎞
1 sin  sin  cos  0 ̇
⎝  2 ⎠ = ⎝ sin  cos  − sin  0 ⎠ · ⎝ ̇ ⎠ (13.92)
3 cos  0 1 ̇
again note that the transformation matrix is not orthogonal which is to be expected since the Euler angular
velocities are about axes that do not form a rectangular system of coordinates. Similarly equations 1389−91
for the angular velocity in the space-fixed frame can be expressed in terms of the Euler angle velocities in
matrix form as ⎛ ⎞ ⎛ ⎞ ⎛ ⎞
 0 cos  sin  sin  ̇
⎝   ⎠ = ⎝ 0 sin  sin  cos  ⎠ · ⎝ ̇ ⎠ (13.93)
 1 0 cos  ̇

13.15 Kinetic energy in terms of Euler angular velocities

The kinetic energy is a scalar quantity and thus is the same in both stationary and rotating frames of reference. It is much easier to evaluate the kinetic energy in the rotating principal-axis frame since the inertia tensor is diagonal in the principal-axis frame as given in equation 13.69
$$T_{rot} = \frac{1}{2} \sum_{i}^{3} I_i\,\omega_i^2 \tag{13.94}$$
Using equations 13.86 − 88 for the body-fixed angular velocities gives the rotational kinetic energy in terms of the Euler angular velocities and principal-frame moments of inertia to be
$$T_{rot} = \frac{1}{2}\left[I_1 \left(\dot{\phi} \sin\theta \sin\psi + \dot{\theta} \cos\psi\right)^2 + I_2 \left(\dot{\phi} \sin\theta \cos\psi - \dot{\theta} \sin\psi\right)^2 + I_3 \left(\dot{\phi} \cos\theta + \dot{\psi}\right)^2\right] \tag{13.95}$$
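As an illustrative special case, added here as a worked step rather than taken from the text, consider a symmetric top with $I_1 = I_2$. The first two squared terms of equation 13.95 then combine, because their cross terms in $\dot{\phi}\dot{\theta}$ cancel, and the dependence on the spin angle $\psi$ drops out:

```latex
\begin{align*}
\left(\dot{\phi}\sin\theta\sin\psi + \dot{\theta}\cos\psi\right)^2
  + \left(\dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi\right)^2
  &= \dot{\phi}^2\sin^2\theta\left(\sin^2\psi + \cos^2\psi\right)
   + \dot{\theta}^2\left(\cos^2\psi + \sin^2\psi\right) \\
  &= \dot{\phi}^2\sin^2\theta + \dot{\theta}^2 \\[4pt]
T_{rot}\Big|_{I_1 = I_2}
  &= \frac{1}{2} I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right)
   + \frac{1}{2} I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2
\end{align*}
```

In this case the kinetic energy depends on $\psi$ only through $\dot{\psi}$, which reflects the cylindrical symmetry of the body about its 3 axis.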

13.16 Rotational invariants

The scalar properties of a rotating body, such as mass $M$, Lagrangian $L$, and Hamiltonian $H$, are rotationally invariant, that is, they are the same in any body-fixed or laboratory-fixed coordinate frame. This fact also applies to scalar products of all vector observables such as angular momentum. For example the scalar product
$$\mathbf{L} \cdot \mathbf{L} = |\mathbf{L}|^2 \tag{13.96}$$
where $|\mathbf{L}|$ is the root mean square value of the angular momentum. An example of a scalar invariant is the scalar product of the angular velocity
$$\boldsymbol{\omega} \cdot \boldsymbol{\omega} = |\omega|^2 \tag{13.97}$$
where $|\omega|^2$ is the mean square angular velocity. The scalar product $\boldsymbol{\omega} \cdot \boldsymbol{\omega} = |\omega|^2$ can be calculated using the Euler-angle velocities for the body-fixed frame, equations 13.86 − 88, to be
$$\boldsymbol{\omega} \cdot \boldsymbol{\omega} = |\omega|^2 = \omega_1^2 + \omega_2^2 + \omega_3^2 = \dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2 + 2\dot{\phi}\dot{\psi} \cos\theta$$
Similarly, the scalar product can be calculated using the Euler angle velocities for the space-fixed frame using equations 13.89 − 91.
$$\boldsymbol{\omega} \cdot \boldsymbol{\omega} = |\omega|^2 = \omega_x^2 + \omega_y^2 + \omega_z^2 = \dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2 + 2\dot{\phi}\dot{\psi} \cos\theta$$
This shows the obvious result that the scalar product $\boldsymbol{\omega} \cdot \boldsymbol{\omega} = |\omega|^2$ is invariant to rotations of the coordinate frame, that is, it is identical when evaluated in either the space-fixed or body-fixed frames.
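The origin of the cross term can be seen without choosing either frame. As a short worked step, added for illustration, write the angular velocity as the sum of the three Euler contributions along the unit vectors $\hat{\mathbf{z}}$, $\hat{\mathbf{n}}$ (the line of nodes), and $\hat{\mathbf{3}}$, and use the facts that $\hat{\mathbf{n}}$ is perpendicular to both $\hat{\mathbf{z}}$ and $\hat{\mathbf{3}}$ while $\hat{\mathbf{z}} \cdot \hat{\mathbf{3}} = \cos\theta$:

```latex
\begin{align*}
\boldsymbol{\omega} &= \dot{\phi}\,\hat{\mathbf{z}} + \dot{\theta}\,\hat{\mathbf{n}} + \dot{\psi}\,\hat{\mathbf{3}} \\
\boldsymbol{\omega}\cdot\boldsymbol{\omega}
  &= \dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2
   + 2\dot{\phi}\dot{\theta}\,(\hat{\mathbf{z}}\cdot\hat{\mathbf{n}})
   + 2\dot{\theta}\dot{\psi}\,(\hat{\mathbf{n}}\cdot\hat{\mathbf{3}})
   + 2\dot{\phi}\dot{\psi}\,(\hat{\mathbf{z}}\cdot\hat{\mathbf{3}}) \\
  &= \dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2 + 2\dot{\phi}\dot{\psi}\cos\theta
\end{align*}
```

The single surviving cross term is a direct measure of the non-orthogonality of the Euler angle basis, and it vanishes only when $\cos\theta = 0$.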
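Note that for $\theta = 0$, the $\hat{\mathbf{3}}$ and $\hat{\mathbf{z}}$ axes are parallel, and perpendicular to the $\hat{\mathbf{n}}$ axis, then
$$|\omega|^2 = \left(\dot{\phi} + \dot{\psi}\right)^2 + \dot{\theta}^2$$
For the case when $\theta = 180^\circ$, the $\hat{\mathbf{3}}$ and $\hat{\mathbf{z}}$ axes are antiparallel, and perpendicular to the $\hat{\mathbf{n}}$ axis, then
$$|\omega|^2 = \left(\dot{\phi} - \dot{\psi}\right)^2 + \dot{\theta}^2$$
For the case when $\theta = 90^\circ$, the $\hat{\mathbf{3}}$, $\hat{\mathbf{z}}$, and $\hat{\mathbf{n}}$ axes are mutually perpendicular, that is, orthogonal, and then
$$|\omega|^2 = \dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2$$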

The time-averaged shape of a rapidly-rotating body, as seen in the fixed inertial frame, is very different
from the actual shape of the body, and this difference depends on the rotational frequency. For example, a
pencil rotating rapidly about an axis perpendicular to the body-fixed symmetry axis has an average shape
that is a flat disk in the laboratory frame which bears little resemblance to a pencil. The actual shape of the
pencil could be determined by taking high-speed photographs which display the instantaneous body-fixed
shape of the object at given times. Unfortunately for fast rotation, such as rotation of a molecule or a
nucleus, it is not possible to take photographs with sufficient speed and spatial resolution to observe the
instantaneous shape of the rotating body. What is measured is the average shape of the body as seen in the
fixed laboratory frame. In principle the shape observed in the fixed inertial frame can be related to the shape
in the body-fixed frame, but this requires knowing the body-fixed shape which in general is not known. For
example, a deformed nucleus may be both vibrating and rotating about some triaxially deformed average
shape which is a function of the rotational frequency. This is not apparent from the shapes measured in the
fixed frame for each of the excited states.
The fact that scalar products are rotationally invariant provides a powerful means of transforming products of observables in the body-fixed frame to those in the laboratory frame. In 1971 Cline developed a powerful model-independent method that utilizes rotationally-invariant products of the electromagnetic quadrupole operator $E2$ to relate the electromagnetic $E2$ properties for the observed levels of a rotating nucleus measured in the laboratory frame, to the electromagnetic $E2$ properties of the deformed rotating nucleus measured in the body-fixed frame.[Cli71, Cli72, Cli86] The method uses the fact that scalar products of the electromagnetic multipole operators are rotationally invariant. This allows transforming scalar products of a complete set of measured electromagnetic matrix elements, measured in the laboratory frame, into the electromagnetic properties in the body-fixed frame of the rotating nucleus. These rotational invariants provide a model-independent determination of the magnitude, triaxiality, and vibrational amplitudes of the average shapes in the body-fixed frame for individual observed nuclear states that may be undergoing both rotation and vibration. When the bombarding energy is below the Coulomb barrier, the scattering of a projectile nucleus by a target nucleus is due purely to the electromagnetic interaction since the distance of closest approach exceeds the range of the nuclear force. For such pure Coulomb collisions, the electromagnetic excitation of collective nuclei populates many excited states, as illustrated in figure 14.13, with cross sections that are a direct measure of the $E2$ matrix elements. These measured matrix elements are precisely those required to evaluate, in the laboratory frame, the $E2$ rotational invariants from which it is possible to deduce the intrinsic quadrupole shapes of the rotating-vibrating nuclear states in the body-fixed frame [Cli86].

13.17 Euler’s equations of motion for rigid-body rotation

Rigid-body rotation can be confusing in that two coordinate frames are involved and, in general, the angular
velocity and angular momentum are not aligned. The motion of the rigid body is observed in the space-fixed
inertial frame whereas it is simpler to calculate the equations of motion in the body-fixed principal axis
frame, for which the inertia tensor is known and is constant. The rigid body is rotating about the angular
velocity vector ω, which is not aligned with the angular momentum L. For torque-free motion, L is conserved
and has a fixed orientation in the space-fixed axis system. Euler’s equations of motion, presented below,
are given in the body-fixed frame for which the inertia tensor is known since this simplifies solution of the
equations of motion. However, this solution has to be rotated back into the space-fixed frame to describe
the rotational motion as seen by an observer in the inertial frame.
This chapter has introduced the inertial properties of a rigid body, as well as the Euler angles for
transforming between the body-fixed and inertial frames of reference. This has prepared the stage for
solving the equations of motion for rigid-body motion, namely, the dynamics of rotational motion about a
body-fixed point under the action of external forces. The Euler angles are used to specify the instantaneous
orientation of the rigid body.
In Newtonian mechanics, the rotational motion is governed by the equivalent of Newton’s second law given
in terms of the external torque N and angular momentum L

$$\mathbf{N} = \left(\frac{d\mathbf{L}}{dt}\right)_{space} \tag{13.98}$$

Note that this relation is expressed in the inertial space-fixed frame of reference, not the non-inertial
body-fixed frame. The subscript $space$ is added to emphasize that this equation is written in the inertial space-fixed
frame of reference. However, as already discussed, it is much more convenient to transform from the space-
fixed inertial frame to the body-fixed frame for which the inertia tensor of the rigid body is known. Thus the
next stage is to express the rotational motion in terms of the body-fixed frame of reference. For simplicity,
translational motion will be ignored.
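The rate of change of angular momentum can be written in terms of the body-fixed value, using the
transformation from the space-fixed inertial frame $(\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}})$ to the rotating frame
$(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ as given in chapter 10.3,

$$\mathbf{N} = \left(\frac{d\mathbf{L}}{dt}\right)_{space} = \left(\frac{d\mathbf{L}}{dt}\right)_{body} + \boldsymbol{\omega} \times \mathbf{L} \tag{13.99}$$

However, the body axis $\hat{\mathbf{e}}_i$ is chosen to be the principal axis such that

$$L_i = I_i \omega_i \tag{13.100}$$

where the principal moments of inertia are written as $I_i$. Thus the equation of motion can be written using
the body-fixed coordinate system as

$$\mathbf{N} = I_1\dot{\omega}_1\hat{\mathbf{e}}_1 + I_2\dot{\omega}_2\hat{\mathbf{e}}_2 + I_3\dot{\omega}_3\hat{\mathbf{e}}_3 + \begin{vmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ \omega_1 & \omega_2 & \omega_3 \\ I_1\omega_1 & I_2\omega_2 & I_3\omega_3 \end{vmatrix} \tag{13.101}$$

$$\mathbf{N} = \left(I_1\dot{\omega}_1 - (I_2 - I_3)\omega_2\omega_3\right)\hat{\mathbf{e}}_1 + \left(I_2\dot{\omega}_2 - (I_3 - I_1)\omega_3\omega_1\right)\hat{\mathbf{e}}_2 + \left(I_3\dot{\omega}_3 - (I_1 - I_2)\omega_1\omega_2\right)\hat{\mathbf{e}}_3 \tag{13.102}$$

where the components in the body-fixed axes are given by

$$\begin{aligned}
N_1 &= I_1\dot{\omega}_1 - (I_2 - I_3)\,\omega_2\omega_3 \\
N_2 &= I_2\dot{\omega}_2 - (I_3 - I_1)\,\omega_3\omega_1 \\
N_3 &= I_3\dot{\omega}_3 - (I_1 - I_2)\,\omega_1\omega_2
\end{aligned} \tag{13.103}$$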

These are the Euler equations for a rigid body in a force field expressed in the body-fixed coordinate
frame. They are applicable for any applied external torque N.
The motion of a rigid body depends on the structure of the body only via the three principal moments
of inertia $I_1$, $I_2$, and $I_3$. Thus all bodies having the same principal moments of inertia will behave exactly the
same even though the bodies may have very different shapes. As discussed earlier, the simplest geometrical
shape of a body having three different principal moments is a homogeneous ellipsoid. Thus, the rigid-body
motion often is described in terms of the equivalent ellipsoid that has the same principal moments.
A deficiency of Euler’s equations is that the solutions yield the time variation of ω as seen from the
body-fixed reference frame axes, and not in the observer’s fixed inertial coordinate frame. Similarly the components
of the external torques in the Euler equations are given with respect to the body-fixed axis system which
implies that the orientation of the body is already known. Thus for non-zero external torques the problem
cannot be solved until the orientation is known in order to determine the components $N_i$. However,
these difficulties disappear when the external torques are zero, or if the motion of the body is known and it
is required to compute the applied torques necessary to produce such motion.

13.18 Lagrange equations of motion for rigid-body rotation

The Euler equations of motion were derived using Newtonian concepts of torque and angular momentum.
It is of interest to derive the equations of motion using Lagrangian mechanics. It is convenient to use a
generalized torque $N$ and assume that $U = 0$ in the Lagrange-Euler equations. Note that the generalized
force is a torque since the corresponding generalized coordinate is an angle, and the conjugate momentum
is angular momentum. If the body-fixed frame of reference is chosen to be the principal axes system, then,
since the inertia tensor is diagonal in the principal axis frame, the kinetic energy is given in terms of the
principal moments of inertia as

$$T = \frac{1}{2}\sum_i I_i \omega_i^2 \tag{13.104}$$
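Using the Euler angles as generalized coordinates, then the Lagrange equation for the specific case of the $\psi$
coordinate and including a generalized force $N_\psi$ gives

$$\frac{d}{dt}\frac{\partial T}{\partial \dot{\psi}} - \frac{\partial T}{\partial \psi} = N_\psi \tag{13.105}$$

which can be expressed as

$$\frac{d}{dt}\sum_{i=1}^{3}\frac{\partial T}{\partial \omega_i}\frac{\partial \omega_i}{\partial \dot{\psi}} - \sum_{i=1}^{3}\frac{\partial T}{\partial \omega_i}\frac{\partial \omega_i}{\partial \psi} = N_\psi \tag{13.106}$$

Equation 13.104 gives

$$\frac{\partial T}{\partial \omega_i} = I_i \omega_i \tag{13.107}$$

Differentiating the angular velocity components in the body-fixed frame, equations (13.86 − 13.88), gives

$$\begin{aligned}
\frac{\partial \omega_1}{\partial \psi} &= \dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi = \omega_2, & \frac{\partial \omega_1}{\partial \dot{\psi}} &= 0 \\
\frac{\partial \omega_2}{\partial \psi} &= -\dot{\phi}\sin\theta\sin\psi - \dot{\theta}\cos\psi = -\omega_1, & \frac{\partial \omega_2}{\partial \dot{\psi}} &= 0 \\
\frac{\partial \omega_3}{\partial \psi} &= 0, & \frac{\partial \omega_3}{\partial \dot{\psi}} &= 1
\end{aligned}$$
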
Substituting these into the Lagrange equation (13.106) gives

$$\frac{d}{dt}\left(I_3\omega_3\right) - I_1\omega_1\omega_2 - I_2\omega_2\left(-\omega_1\right) = N_3 \tag{13.108}$$

since the $\psi$ and $\hat{\mathbf{e}}_3$ axes are colinear. This can be rewritten as

$$I_3\dot{\omega}_3 - (I_1 - I_2)\,\omega_1\omega_2 = N_3 \tag{13.109}$$

Any axis could have been designated the $\hat{\mathbf{e}}_3$ axis, thus the above equation can be generalized to all three
axes to give

$$\begin{aligned}
I_1\dot{\omega}_1 - (I_2 - I_3)\,\omega_2\omega_3 &= N_1 \\
I_2\dot{\omega}_2 - (I_3 - I_1)\,\omega_3\omega_1 &= N_2 \\
I_3\dot{\omega}_3 - (I_1 - I_2)\,\omega_1\omega_2 &= N_3
\end{aligned} \tag{13.110}$$

These are the Euler equations given previously in (13.103). Note that although the $\dot{\omega}_3$ equation is the equation
of motion for the $\psi$ coordinate, this is not true for the $\phi$ and $\theta$ rotations which are not along the body-fixed
1 and 2 axes as given in table 13.1.

13.8 Example: Rotation of a dumbbell

Consider the motion of the symmetric dumbbell shown in the adjacent figure. Let $|\mathbf{r}_1| = |\mathbf{r}_2| = b$. Let the
body-fixed coordinate system have its origin at $O$ and symmetry axis $\hat{\mathbf{e}}_3$ be along the weightless shaft toward
$m_1$, and let the velocities be $\mathbf{v}_i = v_i\,\hat{\mathbf{e}}_1$. The angular momentum is given by

$$\mathbf{L} = \sum_i m_i\, \mathbf{r}_i \times \mathbf{v}_i$$

Because L is perpendicular to the shaft, and L rotates around ω as the shaft rotates, let $\hat{\mathbf{e}}_2$ be along L

$$\mathbf{L} = L_2\,\hat{\mathbf{e}}_2$$

If $\alpha$ is the angle between ω and the shaft, the components of ω are

$$\omega_1 = 0 \qquad \omega_2 = \omega\sin\alpha \qquad \omega_3 = \omega\cos\alpha$$

Assume that the principal moments of the dumbbell are

$$I_1 = (m_1 + m_2)\,b^2 \qquad I_2 = (m_1 + m_2)\,b^2 \qquad I_3 = 0$$

Thus the angular momentum is given by

$$\begin{aligned}
L_1 &= I_1\omega_1 = 0 \\
L_2 &= I_2\omega_2 = (m_1 + m_2)\,b^2\,\omega\sin\alpha \\
L_3 &= I_3\omega_3 = 0
\end{aligned}$$

which is consistent with the angular momentum being along the $\hat{\mathbf{e}}_2$ axis.

[Figure: Rotation of a dumbbell.]

Using Euler’s equations, and assuming that the angular velocity is constant, i.e. $\dot{\omega} = 0$, then the components
of the torque required to satisfy this motion are

$$\begin{aligned}
N_1 &= -(m_1 + m_2)\,b^2\,\omega^2\sin\alpha\cos\alpha \\
N_2 &= 0 \\
N_3 &= 0
\end{aligned}$$

That is, this motion can only occur in the presence of the above applied torque which is in the direction
$-\hat{\mathbf{e}}_1$, that is, mutually perpendicular to $\hat{\mathbf{e}}_2$ and $\hat{\mathbf{e}}_3$. This torque can be written as N = ω × L.
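The dumbbell result can be checked numerically. The following is a minimal sketch, not part of the original text: the masses, half-length, angular speed and angle are arbitrary illustrative values, and the helper names `cross` and `euler_torque` are hypothetical. It evaluates the body-frame torque from Euler's equations (13.103) with constant angular velocity and compares it with ω × L and with the closed-form expression for the first torque component.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def euler_torque(I, w, wdot):
    """Body-frame torque components from Euler's equations (13.103)."""
    I1, I2, I3 = I
    w1, w2, w3 = w
    return (I1 * wdot[0] - (I2 - I3) * w2 * w3,
            I2 * wdot[1] - (I3 - I1) * w3 * w1,
            I3 * wdot[2] - (I1 - I2) * w1 * w2)

# Illustrative (assumed) values: two equal masses on a massless shaft.
m1 = m2 = 1.0               # masses
b = 0.5                     # distance of each mass from the origin O
omega = 3.0                 # magnitude of the angular velocity
alpha = math.radians(30.0)  # angle between omega and the shaft

I = ((m1 + m2) * b**2, (m1 + m2) * b**2, 0.0)          # principal moments
w = (0.0, omega * math.sin(alpha), omega * math.cos(alpha))
L = tuple(Ii * wi for Ii, wi in zip(I, w))             # L_i = I_i * omega_i

N_euler = euler_torque(I, w, (0.0, 0.0, 0.0))          # constant omega
N_cross = cross(w, L)                                  # omega x L
N1_formula = -(m1 + m2) * b**2 * omega**2 * math.sin(alpha) * math.cos(alpha)

print("Euler torque:", N_euler)
print("omega x L   :", N_cross)
```

Both routes give a torque along the negative first body axis only, as stated in the example.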
13.19 Hamiltonian equations of motion for rigid-body rotation

The Hamiltonian equations of motion are expressed in terms of the Euler angles plus their corresponding
canonical angular momenta $(\phi, \theta, \psi, p_\phi, p_\theta, p_\psi)$ in contrast to Lagrangian mechanics which is based on the
Euler angles plus their corresponding angular velocities $(\phi, \theta, \psi, \dot{\phi}, \dot{\theta}, \dot{\psi})$. The Hamiltonian approach is
conveniently expressed in terms of a set of Andoyer-Deprit action-angle coordinates that include the three Euler
angles, specifying the orientation of the body-fixed frame, plus the corresponding three angles specifying the
orientation of the spin frame of reference. This phase space approach[Dep67] can be employed for calcu-
lations of rotational motion in celestial mechanics that can include spin-orbit coupling. This Hamiltonian
approach is beyond the scope of the present textbook.

13.20 Torque-free rotation of an inertially-symmetric rigid rotor

13.20.1 Euler’s equations of motion:
There are many situations where one has rigid-body motion free of external torques, that is, N = 0. The
tumbling motion of a juggler’s baton, a diver, a rotating galaxy, or a frisbee are examples of rigid-body
rotation. For torque-free rotation, the body will rotate about the center of mass, and thus the inertia tensor
with respect to the center of mass is required. An inertially-symmetric rigid body has two identical principal
moments of inertia with $I_1 = I_2 \neq I_3$, and provides a simple example that illustrates the underlying motion.
The force-free Euler equations for the symmetric body in the body-fixed principal axis system are given by

$$(I_2 - I_3)\,\omega_2\omega_3 - I_1\dot{\omega}_1 = 0 \tag{13.111}$$
$$(I_3 - I_1)\,\omega_3\omega_1 - I_2\dot{\omega}_2 = 0 \tag{13.112}$$
$$I_3\dot{\omega}_3 = 0 \tag{13.113}$$

where $I_1 = I_2$ and $N_i = 0$ apply.

Note that for torque-free motion of an inertially symmetric body equation 13.113 implies that $\dot{\omega}_3 = 0$, i.e.
$\omega_3$ is a constant of motion for the symmetric rigid body, corresponding to the Euler angle $\psi$ being a cyclic
variable.

Figure 13.4: The force-free symmetric top angular velocity ω precesses on a conical trajectory about the
body-fixed symmetry axis 3̂.

Equations 13.111 and 13.112 can be written as two coupled equations

$$\dot{\omega}_1 + \Omega\,\omega_2 = 0 \tag{13.114}$$
$$\dot{\omega}_2 - \Omega\,\omega_1 = 0 \tag{13.115}$$

where the precession angular velocity $\Omega = -\dot{\psi}$ with respect to the body-fixed frame is defined to be

$$\Omega \equiv \left(\frac{I_3 - I_1}{I_1}\right)\omega_3 \tag{13.116}$$

Combining the time derivatives of equations 13.114 and 13.115 leads to two uncoupled equations

$$\ddot{\omega}_1 + \Omega^2\omega_1 = 0 \tag{13.117}$$
$$\ddot{\omega}_2 + \Omega^2\omega_2 = 0 \tag{13.118}$$

These are the differential equations for a harmonic oscillator with solutions

$$\begin{aligned}
\omega_1 &= A\cos\Omega t \\
\omega_2 &= A\sin\Omega t
\end{aligned} \tag{13.119}$$
These equations describe a vector $\mathbf{A}$ rotating in a circle of radius $A$ in the plane perpendicular to $\hat{\mathbf{e}}_3$, that
is, rotating in the $\hat{\mathbf{e}}_1 - \hat{\mathbf{e}}_2$ plane with angular frequency $\Omega = -\dot{\psi}$. Note that

$$\omega_1^2 + \omega_2^2 = A^2 \tag{13.120}$$

which is a constant. In addition $\omega_3$ is constant, therefore the magnitude of the total angular velocity

$$|\boldsymbol{\omega}| = \sqrt{\omega_1^2 + \omega_2^2 + \omega_3^2} = \text{constant} \tag{13.121}$$

The motion of the torque-free symmetric body is that the angular velocity ω precesses around the
symmetry axis $\hat{\mathbf{e}}_3$ of the body at an angle $\alpha$ with a constant precession frequency $\Omega$ with respect to the
body-fixed frame as shown in figure 13.4. Thus, to an observer on the body, ω traces out a cone around the
body-fixed symmetry axis. Note from (13.116) that the vectors $\Omega\hat{\mathbf{e}}_3$ and $\omega_3\hat{\mathbf{e}}_3$ are parallel when $\Omega$ is positive,
that is, $I_3 > I_1$ (oblate shape), and antiparallel if $I_3 < I_1$ (prolate shape).
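The body-frame precession can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than anything taken from the text: the moments of inertia, amplitude, spin and step size are arbitrary choices, and `euler_rhs` and `rk4_step` are hypothetical helper names. It integrates the torque-free Euler equations (13.111 to 13.113) with a classical fourth-order Runge-Kutta scheme and compares the result with the harmonic solution (13.119).

```python
import math

def euler_rhs(w, I):
    """Torque-free Euler equations solved for the angular accelerations."""
    I1, I2, I3 = I
    w1, w2, w3 = w
    return ((I2 - I3) * w2 * w3 / I1,
            (I3 - I1) * w3 * w1 / I2,
            (I1 - I2) * w1 * w2 / I3)

def rk4_step(w, I, dt):
    """Advance the body-frame angular velocity by one RK4 step."""
    def shifted(base, slope, scale):
        return tuple(x + scale * s for x, s in zip(base, slope))
    k1 = euler_rhs(w, I)
    k2 = euler_rhs(shifted(w, k1, dt / 2), I)
    k3 = euler_rhs(shifted(w, k2, dt / 2), I)
    k4 = euler_rhs(shifted(w, k3, dt), I)
    return tuple(x + dt * (a + 2 * b + 2 * c + d) / 6
                 for x, a, b, c, d in zip(w, k1, k2, k3, k4))

# Assumed illustrative values: an oblate symmetric top with I1 = I2 < I3.
I = (1.0, 1.0, 2.0)
A, w3 = 0.3, 5.0                      # transverse amplitude and axial spin
Omega = (I[2] - I[0]) / I[0] * w3     # equation 13.116

w = (A, 0.0, w3)                      # matches omega_1 = A cos(0), omega_2 = A sin(0)
dt, steps = 1e-3, 2000
for _ in range(steps):
    w = rk4_step(w, I, dt)

t = dt * steps
w_exact = (A * math.cos(Omega * t), A * math.sin(Omega * t), w3)
max_err = max(abs(a - b) for a, b in zip(w, w_exact))
print("numerical:", w)
print("analytic :", w_exact)
```

The transverse components trace a circle of radius A at the rate Ω while the third component stays fixed, which is the body cone described above.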
For the system considered, the orientation of the angular momentum vector L must be stationary in the
space-fixed inertial frame since the system is torque free, that is, L is a constant of motion. Also we have
that the projection of the angular momentum on the body-fixed symmetry axis is a constant of motion, that
is, it is a cyclic variable. Thus

$$L_3 = I_3\omega_3 = \frac{I_1 I_3}{(I_3 - I_1)}\,\Omega \tag{13.122}$$

Understanding the relation between the angular momentum and angular velocity is facilitated by considering
another constant of motion for the torque-free symmetric rotor, namely the rotational kinetic energy.

$$T_{rot} = \frac{1}{2}\,\boldsymbol{\omega}\cdot\mathbf{L} = \text{constant} \tag{13.123}$$

Since L is a constant for torque-free motion, and also the magnitude of ω was shown to be constant, therefore
the angle between these two vectors must be a constant to ensure that also $T_{rot} = \frac{1}{2}\boldsymbol{\omega}\cdot\mathbf{L}$ = constant. That
is, ω precesses around L at a constant angle $(\theta - \alpha)$ such that the projection of ω onto L is constant. Note
that

$$\boldsymbol{\omega} \times \hat{\mathbf{e}}_3 = \omega_2\hat{\mathbf{e}}_1 - \omega_1\hat{\mathbf{e}}_2 \tag{13.124}$$

and, for a symmetric rotor,

$$\mathbf{L}\cdot\boldsymbol{\omega} \times \hat{\mathbf{e}}_3 = I_1\omega_1\omega_2 - I_2\omega_1\omega_2 = 0 \tag{13.125}$$

since $I_1 = I_2$ for the symmetric rotor. Because $\mathbf{L}\cdot\boldsymbol{\omega} \times \hat{\mathbf{e}}_3 = 0$ for a symmetric top then L, ω and $\hat{\mathbf{e}}_3$ are
coplanar.
Figure 135 shows the geometry of the motion for both oblate and prolate axially-deformed bodies. To
an observer in the space-fixed inertial frame, the angular velocity ω traces out a cone that precesses with
angular velocity Ω around the space fixed L axis called the space cone. For convenience, figure 135 assumes
that L and the space-fixed inertial frame ẑ axis are colinear. The angular velocity ω also traces out the
body cone as it precesses about the body-fixed ê3 axis. Since L ω and eb3 are coplanar, then the ω vector is
at the intersection of the space and body cones as the body cone rolls around the space cone. That is, the
space and body cones have one generatrix in common which coincides with ω. As shown in figure 135, for
a needle the body cone appears to roll without slipping on the outside of the space cone at the precessional
velocity of Ω = − By contrast, as shown in figure 135 for an oblate (disc-shaped) symmetric top the
space cone rolls inside the body cone and the precession Ω is faster than .
Since no external torques are acting for torque-free motion, then the magnitude and direction of the total
angular momentum are conserved. The description of the motion is simplified if L is taken to be along the
space-fixed ẑ axis, then the Euler angle  is the angle between the body-fixed basis vector ê3 and space-fixed
basis vector ẑ. If at some instant in the body frame, it is assumed that eb2 is aligned in the plane of L ω
and eb3  then
1 = 0 2 =  sin  3 =  cos  (13.126)
If  is the angle between the angular velocity ω and the body-fixed ê3 axis, then at the same instant

1 = 0  2 =  sin   3 =  cos  (13.127)

Figure 13.5: Torque-free rotation of symmetric tops; (a) circular flat disk, (b) circular rod. The space-fixed
and body-fixed cones are shown by fine lines. The space-fixed axis system is designated by the unit vectors
(x̂, ŷ, ẑ) and the body-fixed principal axis system by unit vectors (1̂, 2̂, 3̂).

The components of the angular momentum also can be derived from $\mathbf{L} = \mathbf{I}\cdot\boldsymbol{\omega}$ to give

$$L_1 = I_1\omega_1 = 0 \qquad L_2 = I_2\omega_2 = I_1\omega\sin\alpha \qquad L_3 = I_3\omega_3 = I_3\omega\cos\alpha \tag{13.128}$$

Equations 13.126 and 13.128 give two relations for the ratio $\frac{L_2}{L_3}$, that is,

$$\frac{L_2}{L_3} = \tan\theta = \frac{I_1}{I_3}\tan\alpha \tag{13.129}$$

For a prolate spheroid $I_1 > I_3$, therefore $\theta > \alpha$ while $\Omega$ and $\omega_3$ have opposite signs.
For an oblate spheroid $I_1 < I_3$, therefore $\theta < \alpha$ while $\Omega$ and $\omega_3$ have the same sign.
The sense of precession can be understood if the body cone rolls without slipping on the outside of the
space cone with $\Omega$ in the opposite orientation to $\omega$ for the prolate case, while for the oblate case the space
cone rolls inside the body cone with $\Omega$ and $\omega$ oriented in similar directions. Note from (13.129) that $\theta = 0$
if $\alpha = 0$, that is, L, ω and the 3 axis are aligned corresponding to a principal axis. Similarly, $\theta = 90^\circ$ if
$\alpha = 90^\circ$, then again L and ω are aligned corresponding to them being principal axes.
Euler’s equations have been used to calculate the motion with respect to the body-fixed principal
axis system. However, the motion needs to be known relative to the space-fixed inertial frame where the
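motion is observed. This transformation can be done using the following relation

$$\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = \left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{body} + \boldsymbol{\omega}\times\hat{\mathbf{e}}_3 = \boldsymbol{\omega}\times\hat{\mathbf{e}}_3 \tag{13.130}$$

since the unit vector ê3 is stationary in the body-fixed frame. The vector product of ω × ê3 and ê3 gives

$$\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = \hat{\mathbf{e}}_3\times\left(\boldsymbol{\omega}\times\hat{\mathbf{e}}_3\right) = (\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3)\,\boldsymbol{\omega} - (\hat{\mathbf{e}}_3\cdot\boldsymbol{\omega})\,\hat{\mathbf{e}}_3 = \boldsymbol{\omega} - \omega_3\hat{\mathbf{e}}_3$$

therefore

$$\boldsymbol{\omega} = \hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} + \omega_3\hat{\mathbf{e}}_3 \tag{13.131}$$

The angular momentum equals $\mathbf{L} = \{\mathbf{I}\}\cdot\boldsymbol{\omega}$. Since $\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}$ is perpendicular to the ê3 axis, then
for the case with $I_1 = I_2$,

$$\mathbf{L} = I_1\,\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} + I_3\omega_3\hat{\mathbf{e}}_3 \tag{13.132}$$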
 

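Thus the angular momentum for a torque-free symmetric rigid rotor comprises two components, one being
the perpendicular component that precesses around ê3, and the other is $L_3$.
In the space-fixed frame assume that the ẑ axis is colinear with L. Then taking the scalar product of ê3
and L, using equation 13.132, gives

$$L_3 = \hat{\mathbf{e}}_3\cdot\mathbf{L} = I_1\,\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} + I_3\omega_3\,\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3 \tag{13.133}$$

The first term on the right is zero and thus equations 13.133 and 13.126 give

$$L_3 = I_3\omega_3 = L\cos\theta \tag{13.134}$$

The time dependence of the rotation of the body-fixed symmetry axis with respect to the space-fixed
axis system can be obtained by taking the vector product ê3 × L using equation 13.132 and using equation
24 to expand the triple vector product,

$$\begin{aligned}
\hat{\mathbf{e}}_3\times\mathbf{L} &= I_1\,\hat{\mathbf{e}}_3\times\left(\hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}\right) + I_3\omega_3\,\hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_3 \\
&= I_1\left[\left(\hat{\mathbf{e}}_3\cdot\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}\right)\hat{\mathbf{e}}_3 - (\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3)\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space}\right] + 0
\end{aligned} \tag{13.135}$$

since $(\hat{\mathbf{e}}_3\times\hat{\mathbf{e}}_3) = 0$. Moreover $(\hat{\mathbf{e}}_3\cdot\hat{\mathbf{e}}_3) = 1$, and $\hat{\mathbf{e}}_3\cdot\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = 0$ since they are perpendicular, then

$$\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} = \frac{\mathbf{L}}{I_1}\times\hat{\mathbf{e}}_3 \tag{13.136}$$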

This equation shows that the body-fixed symmetry axis ê3 precesses around L, where L is a constant
of motion for torque-free rotation. The true rotational angular velocity ω in the space-fixed frame, given by
equation 13.131, can be evaluated using equation 13.136. Remembering that it was assumed that L is in
the ẑ direction, that is, $\mathbf{L} = L\hat{\mathbf{z}}$, then

$$\begin{aligned}
\boldsymbol{\omega} &= \hat{\mathbf{e}}_3\times\left(\frac{d\hat{\mathbf{e}}_3}{dt}\right)_{space} + \omega_3\hat{\mathbf{e}}_3 \\
&= \frac{L}{I_1}\,\hat{\mathbf{e}}_3\times\left(\hat{\mathbf{z}}\times\hat{\mathbf{e}}_3\right) + \frac{L\cos\theta}{I_3}\,\hat{\mathbf{e}}_3 \\
&= \frac{L}{I_1}\,\hat{\mathbf{z}} + L\cos\theta\left(\frac{I_1 - I_3}{I_1 I_3}\right)\hat{\mathbf{e}}_3
\end{aligned} \tag{13.137}$$

That is, the symmetry axis of the axially-symmetric rigid rotor makes an angle $\theta$ to the angular momentum
vector $L\hat{\mathbf{z}}$ and precesses around $L\hat{\mathbf{z}}$ with a constant angular velocity $\frac{L}{I_1}$ while the axial spin of the rigid body
has a constant value $\omega_3 = \frac{L\cos\theta}{I_3}$. Thus, in the precessing frame, the rigid body appears to rotate about its fixed
symmetry axis with a constant angular velocity $\frac{L\cos\theta}{I_3} - \frac{L\cos\theta}{I_1} = L\cos\theta\left(\frac{I_1 - I_3}{I_1 I_3}\right)$. The precession of the
symmetry axis looks like a wobble superimposed on the spinning motion about the body-fixed symmetry
axis. The angular precession rate in the space-fixed frame can be deduced by using the fact that

$$\dot{\phi}\sin\theta = \omega\sin\alpha \tag{13.138}$$

Then using equation 13.129 allows equation 13.138 to be written as

$$\dot{\phi} = \omega\sqrt{1 + \left(\left(\frac{I_3}{I_1}\right)^2 - 1\right)\cos^2\alpha} \tag{13.139}$$

which gives the precession rate about the space-fixed axis in terms of the angular velocity $\omega$. Note that the
precession rate $\dot{\phi} > \omega$ if $\frac{I_3}{I_1} > 1$, that is, for oblate shapes, and $\dot{\phi} < \omega$ if $\frac{I_3}{I_1} < 1$, that is, for prolate shapes.
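Equation 13.139 is simple to evaluate directly. The following sketch is illustrative only: the function name `space_precession_rate`, the unit angular velocity and the small tilt angle are assumptions introduced here. It evaluates the space-frame precession rate for three limiting moment ratios.

```python
import math

def space_precession_rate(omega, alpha, ratio):
    """Space-frame precession rate from equation 13.139.

    omega: magnitude of the angular velocity
    alpha: angle between the angular velocity and the symmetry axis
    ratio: principal moment ratio I3/I1
    """
    return omega * math.sqrt(1.0 + (ratio**2 - 1.0) * math.cos(alpha)**2)

omega = 1.0      # assumed unit angular velocity
alpha = 1e-3     # assumed small tilt of omega from the symmetry axis (radians)

shapes = [("needle", 0.0), ("sphere", 1.0), ("thin disk", 2.0)]
rates = {name: space_precession_rate(omega, alpha, r) for name, r in shapes}

for name, rate in rates.items():
    print(f"{name:10s} precession rate / omega = {rate:.6f}")
```

For small tilt angles the rate approaches the moment ratio times ω, so prolate shapes precess more slowly than ω and oblate shapes more quickly.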

13.20.2 Lagrange equations of motion:

It is interesting to compare the equations of motion for torque-free rotation of an inertially-symmetric
rigid rotor derived using Lagrange mechanics with those derived previously using Euler’s equations based on
Newtonian mechanics. Assume that the principal moments about the fixed point of the symmetric top are
$I_1 = I_2 \neq I_3$ and that the kinetic energy equals the rotational kinetic energy, that is, it is assumed that the
translational kinetic energy $T_{trans} = 0$. Then the kinetic energy is given by

$$T = \frac{1}{2}\sum_i I_i\omega_i^2 = \frac{1}{2}I_1\left(\omega_1^2 + \omega_2^2\right) + \frac{1}{2}I_3\omega_3^2 \tag{13.140}$$

Equations (13.86 − 13.88) for the body-fixed frame give

$$\omega_1^2 = \left(\dot{\phi}\sin\theta\sin\psi + \dot{\theta}\cos\psi\right)^2 = \dot{\phi}^2\sin^2\theta\sin^2\psi + 2\dot{\phi}\dot{\theta}\sin\theta\sin\psi\cos\psi + \dot{\theta}^2\cos^2\psi \tag{13.141}$$

$$\omega_2^2 = \left(\dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi\right)^2 = \dot{\phi}^2\sin^2\theta\cos^2\psi - 2\dot{\phi}\dot{\theta}\sin\theta\sin\psi\cos\psi + \dot{\theta}^2\sin^2\psi \tag{13.142}$$

Therefore

$$\omega_1^2 + \omega_2^2 = \dot{\phi}^2\sin^2\theta + \dot{\theta}^2 \tag{13.143}$$

and

$$\omega_3^2 = \left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 \tag{13.144}$$

Therefore the kinetic energy is

$$T = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 \tag{13.145}$$

Since the system is torque free, the scalar potential energy $U$ can be assumed to be zero, and then the
Lagrangian equals

$$L = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 \tag{13.146}$$

The angular momentum about the space-fixed $z$ axis, $p_\phi$, is conjugate to $\phi$. From Lagrange’s equations

$$\dot{p}_\phi = \frac{\partial L}{\partial \phi} = 0 \tag{13.147}$$

that is, the angular momentum about the space-fixed $z$ axis, $p_\phi$, is a constant of motion given by

$$p_\phi = \frac{\partial L}{\partial \dot{\phi}} = \left(I_1\sin^2\theta + I_3\cos^2\theta\right)\dot{\phi} + I_3\dot{\psi}\cos\theta = \text{constant} \tag{13.148}$$

Similarly, the angular momentum about the body-fixed 3 axis is conjugate to $\psi$. From Lagrange’s equations

$$\dot{p}_\psi = \frac{\partial L}{\partial \psi} = 0 \tag{13.149}$$

that is, $p_\psi$ is a constant of motion given by

$$p_\psi = \frac{\partial L}{\partial \dot{\psi}} = I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right) = I_3\omega_3 = \text{constant} \tag{13.150}$$

The above two relations derived from the Lagrangian can be solved to give the precession angular velocity
$\dot{\phi}$ about the space-fixed ẑ axis

$$\dot{\phi} = \frac{p_\phi - p_\psi\cos\theta}{I_1\sin^2\theta} \tag{13.151}$$

and the spin about the body-fixed 3̂ axis $\dot{\psi}$ which is given by

$$\dot{\psi} = \frac{p_\psi}{I_3} - \frac{\left(p_\phi - p_\psi\cos\theta\right)\cos\theta}{I_1\sin^2\theta} \tag{13.152}$$
Since $p_\phi$ and $p_\psi$ are constants of motion, then the precessional angular velocity $\dot{\phi}$ about the space-fixed ẑ
axis, and the spin angular velocity $\dot{\psi}$, which is the spin frequency about the body-fixed 3̂ axis, are constants
that depend directly on $I_1$, $I_3$, and $\theta$.
There is one additional constant of motion available if no dissipative forces act on the system, that is,
energy conservation which implies that the total energy

$$E = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 \tag{13.153}$$

will be a constant of motion. But the second term on the right-hand side also is a constant of motion since
$p_\psi$ and $I_3$ both are constants, that is

$$\frac{1}{2}I_3\omega_3^2 = \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 = \frac{p_\psi^2}{2I_3} = \text{constant} \tag{13.154}$$

Thus energy conservation implies that the first term on the right-hand side also must be a constant given by

$$\frac{1}{2}I_1\left(\omega_1^2 + \omega_2^2\right) = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) = E - \frac{p_\psi^2}{2I_3} = \text{constant} \tag{13.155}$$

These results are identical to those given in equations 13.120 and 13.121 which were derived using Euler’s
equations. These results illustrate that the underlying physics of the torque-free rigid rotor is more easily
extracted using Lagrangian mechanics rather than using the Euler-angle approach of Newtonian mechanics.
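The consistency between the Lagrangian results and the Euler-equation results can be checked with a few lines of arithmetic. This is a minimal sketch under stated assumptions: the function name `angle_rates` and the numerical values are illustrative, and L is taken along the space-fixed ẑ axis so that $p_\phi = L$ and $p_\psi = L\cos\theta$. It evaluates equations 13.151 and 13.152 and compares them with the precession rate $L/I_1$ and with the body-frame precession Ω of equation 13.116.

```python
import math

def angle_rates(p_phi, p_psi, theta, I1, I3):
    """Precession and spin rates from equations 13.151 and 13.152."""
    sin_sq = math.sin(theta)**2
    phi_dot = (p_phi - p_psi * math.cos(theta)) / (I1 * sin_sq)
    psi_dot = p_psi / I3 - phi_dot * math.cos(theta)
    return phi_dot, psi_dot

# Assumed illustrative values for an oblate symmetric top.
I1, I3 = 1.0, 2.0
L = 4.0                      # magnitude of the angular momentum, along z
theta = math.radians(20.0)   # angle between the symmetry axis and z

p_phi = L                    # angular momentum about the space-fixed z axis
p_psi = L * math.cos(theta)  # projection on the body-fixed symmetry axis

phi_dot, psi_dot = angle_rates(p_phi, p_psi, theta, I1, I3)

omega3 = p_psi / I3                  # from p_psi = I3 * omega_3
Omega = (I3 - I1) / I1 * omega3      # equation 13.116

print("phi_dot =", phi_dot, " L/I1 =", L / I1)
print("psi_dot =", psi_dot, " -Omega =", -Omega)
```

With these assumptions the spin rate comes out equal and opposite to the body-frame precession rate, and the precession about ẑ equals $L/I_1$.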

13.9 Example: Precession rate for torque-free rotating symmetric rigid rotor
Table 132 lists the precession and spin angular velocities, in the space-fixed frame, for torque-free rotation
of three extreme symmetric-top geometries spinning with constant angular momentum  when the motion
is slightly perturbed such that  is at a small angle  to the symmetry axis. Note that this assumes the
perpendicular axis theorem, equation 1345 which states that for a thin laminae 1 + 2 = 3 giving, for a
thin circular disk, 1 = 2 and thus 3 = 21 

Table 132: Precession and spin rates for torque-free axial rotation of symmetric rigid rotors

3
Rigid-body symmetric shape Principal moment ratio 1 Precession rate ̇ Spin rate ̇

Symmetric needle 0 0 
Sphere 1  0
Thin circular disk 2 2 −

The precession angular velocity in the space frame ranges between 0 to 2 depending on whether the
body-fixed spin angular velocity is aligned or anti-aligned with the rotational frequency . For an extreme
prolate spheroid 31 = 0 the body-fixed spin angular velocity Ω = − 3 which cancels the angular velocity
 of the rotating frame resulting in a zero precession angular velocity of the body-fixed ê3 axis around the
space-fixed frame. The spin Ω = 0 in the body-fixed frame for the rigid sphere 31 = 1 and thus the precession
rate of the body-fixed ̂3 axis of the sphere around the space-fixed frame equals . For oblate spheroids and
thin disks, such as a frisbee, 31 = 2 making the body-fixed precession angular velocity Ω = + which adds
to the angular velocity  and increases the precession rate up to 2 as seen in the space-fixed frame. This
illustrates that the spin angular velocity can add constructively or destructively with the angular velocity 2

2 In his autobiography Surely You’re Joking, Mr. Feynman!, Richard Feynman wrote "I was in the [Cornell] cafeteria and some guy, fooling
around, throws a plate in the air. As the plate went up in the air I saw it wobble, and noticed that the red medallion of
Cornell on the plate going around. It was pretty obvious to me that the medallion went around faster than the wobbling. I
started to figure out the motion of the rotating plate. I discovered that when the angle is very slight, the medallion rotates
twice as fast as the wobble rate. It came out of a very complicated equation!". The quoted ratio (2 : 1) is incorrect; it should
be (1 : 2). Benjamin Chao in Physics Today of February 1989 speculated that Feynman’s error in inverting the factor of
two might be "in keeping with the spirit of the author and the book, another practical joke meant for those who do physics
without experimenting". He pointed out that this story occurred on page 157 of a book of length 314 pages (1:2). Observe the
dependence of the ratio of wobble to rotation angular velocities on the tilt angle.
13.21 Torque-free rotation of an asymmetric rigid rotor

The Euler equations of motion for the case of torque-free rotation of an asymmetric (triaxial) rigid rotor
about the center of mass, with principal moments of inertia $I_1 \neq I_2 \neq I_3$, lead to more complicated motion
than for the symmetric rigid rotor.3 The general features of the motion of the asymmetric rotor can be
deduced using the conservation of angular momentum and rotational kinetic energy.
Assuming that the external torques are zero then the Euler equations of motion can be written as

$$\begin{aligned}
I_1\dot{\omega}_1 &= (I_2 - I_3)\,\omega_2\omega_3 \\
I_2\dot{\omega}_2 &= (I_3 - I_1)\,\omega_3\omega_1 \\
I_3\dot{\omega}_3 &= (I_1 - I_2)\,\omega_1\omega_2
\end{aligned} \tag{13.156}$$

Since $L_i = I_i\omega_i$ for $i = 1, 2, 3$, then equation 13.156 gives

$$\begin{aligned}
I_2 I_3\dot{L}_1 &= (I_2 - I_3)\,L_2 L_3 \\
I_1 I_3\dot{L}_2 &= (I_3 - I_1)\,L_3 L_1 \\
I_1 I_2\dot{L}_3 &= (I_1 - I_2)\,L_1 L_2
\end{aligned} \tag{13.157}$$

Multiply the first equation by $I_1 L_1$, the second by $I_2 L_2$ and the third by $I_3 L_3$ and sum, which gives

$$I_1 I_2 I_3\left(L_1\dot{L}_1 + L_2\dot{L}_2 + L_3\dot{L}_3\right) = 0 \tag{13.158}$$

The vanishing of the bracket is equivalent to $\frac{d}{dt}\left(L_1^2 + L_2^2 + L_3^2\right) = 0$ which implies
that the total rotational angular momentum $L$ is a constant of motion as expected for this torque-free system,
even though the individual components $L_1$, $L_2$, $L_3$ may vary. That is

$$L_1^2 + L_2^2 + L_3^2 = L^2 \tag{13.159}$$

Note that equation 13.159 is the equation of a sphere of radius $L$.

Figure 13.6: Rotation of an asymmetric rigid rotor. The dark lines correspond to contours of constant total
rotational kinetic energy T, which has an ellipsoidal shape, projected onto the angular momentum L sphere
in the body-fixed frame.

Multiplying the first equation of 13.157 by $L_1$, the second by $L_2$, and the third by $L_3$, and summing gives

$$I_2 I_3 L_1\dot{L}_1 + I_1 I_3 L_2\dot{L}_2 + I_1 I_2 L_3\dot{L}_3 = 0 \tag{13.160}$$

Dividing 13.160 by $I_1 I_2 I_3$ gives $\frac{d}{dt}\left(\frac{L_1^2}{2I_1} + \frac{L_2^2}{2I_2} + \frac{L_3^2}{2I_3}\right) = 0$. This implies that the total rotational kinetic energy
$T$, given by

$$\frac{L_1^2}{2I_1} + \frac{L_2^2}{2I_2} + \frac{L_3^2}{2I_3} = T \tag{13.161}$$

is a constant of motion as expected when there are no external torques and zero energy dissipation. Note
that 13.161 is the equation of an ellipsoid.
Equations 13.159 and 13.161 both must be satisfied by the rotational motion for any value of the total
angular momentum L and kinetic energy $T$. Fig 13.6 shows a graphical representation of the intersection of
the $L$ sphere and $T$ ellipsoid as seen in the body-fixed frame. The angular momentum vector L must follow
the constant-energy contours given by where the $T$-ellipsoids intersect the $L$-sphere, shown for the case where
$I_3 > I_2 > I_1$. Note that the precession of the angular momentum vector L follows a trajectory that has
closed paths that circle around the principal axis with the smallest $I$, that is, ê1, or the principal axis with
the maximum $I$, that is, ê3. However, the angular momentum vector does not have a stable minimum for
precession around the intermediate principal moment of inertia axis ê2. In addition to the precession, the
angular momentum vector L executes nutation, that is, a nodding of the angle $\theta$.
For any fixed value of $L$, the kinetic energy has upper and lower bounds given by

$$\frac{L^2}{2I_3} \leq T \leq \frac{L^2}{2I_1} \tag{13.162}$$
3 Similar discussions of the freely-rotating asymmetric top are given by Landau and Lifshitz [La60] and by Gregory [Gr06].
Thus, for a given value of $L$, when $T = T_{min} = \frac{L^2}{2I_3}$ the orientation of L in the body-fixed frame is either
$(0, 0, +L)$ or $(0, 0, -L)$, that is, aligned with the ê3 axis along which the principal moment of inertia is largest.
For slightly higher kinetic energy the trajectory of L follows closed paths precessing around ê3. When the
kinetic energy $T = \frac{L^2}{2I_2}$ the angular momentum vector L follows either of the two thin-line trajectories, each
of which is a separatrix. These do not have closed orbits around ê2 and they separate the closed solutions
around either ê3 or ê1. For higher kinetic energy the precessing angular momentum vector follows closed
trajectories around ê1 and becomes fully aligned with ê1 at the upper-bound kinetic energy.
Note that for the special case when $I_3 > I_2 = I_1$, then the asymmetric rigid rotor equals the symmetric
rigid rotor for which Euler’s equations were solved exactly in chapter 13.20. For the symmetric
rigid rotor the $T$-ellipsoid becomes a spheroid aligned with the symmetry axis and thus the intersections
with the $L$-sphere lead to circular paths around the ê3 body-fixed principal axis, while the separatrix circles
the equator corresponding to the ê3 axis separating clockwise and anticlockwise precession about $L_3$. This
discussion shows that energy, plus angular momentum conservation, provide the general features of the
solution for the torque-free symmetric top that are in agreement with those derived using Euler’s equations
of motion.
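The two conservation laws used in this section can be illustrated by integrating equation 13.157 numerically. The sketch below is an illustration only: the moments of inertia, initial angular momentum, step size and helper names (`l_dot`, `rk4_step`, `invariants`) are assumptions made here. It uses a classical fourth-order Runge-Kutta integrator, starts close to the intermediate axis, and monitors $L^2$ and $T$ along the trajectory.

```python
def l_dot(L, I):
    """Equation 13.157 solved for dL/dt in the body-fixed frame."""
    I1, I2, I3 = I
    L1, L2, L3 = L
    return ((I2 - I3) * L2 * L3 / (I2 * I3),
            (I3 - I1) * L3 * L1 / (I1 * I3),
            (I1 - I2) * L1 * L2 / (I1 * I2))

def rk4_step(L, I, dt):
    """Advance the body-frame angular momentum by one RK4 step."""
    def shifted(base, slope, scale):
        return tuple(x + scale * s for x, s in zip(base, slope))
    k1 = l_dot(L, I)
    k2 = l_dot(shifted(L, k1, dt / 2), I)
    k3 = l_dot(shifted(L, k2, dt / 2), I)
    k4 = l_dot(shifted(L, k3, dt), I)
    return tuple(x + dt * (a + 2 * b + 2 * c + d) / 6
                 for x, a, b, c, d in zip(L, k1, k2, k3, k4))

def invariants(L, I):
    """Return (L^2, T) from equations 13.159 and 13.161."""
    L_sq = sum(c * c for c in L)
    T = sum(c * c / (2 * Ii) for c, Ii in zip(L, I))
    return L_sq, T

# Assumed illustrative values with I3 > I2 > I1.
I = (1.0, 2.0, 3.0)
L = (0.1, 1.0, 0.1)          # start near the intermediate axis
L_sq0, T0 = invariants(L, I)

dt = 1e-3
for _ in range(20000):
    L = rk4_step(L, I, dt)

L_sq1, T1 = invariants(L, I)
drift_L = abs(L_sq1 - L_sq0)
drift_T = abs(T1 - T0)
within_bounds = L_sq0 / (2 * I[2]) <= T0 <= L_sq0 / (2 * I[0])   # equation 13.162

print("drift in L^2:", drift_L, " drift in T:", drift_T)
print("T within bounds of 13.162:", within_bounds)
```

Although the individual components vary strongly for this starting point, the trajectory stays on the intersection of the $L$-sphere and the $T$-ellipsoid to within the integration error.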

13.22 Stability of torque-free rotation of an asymmetric body

It is of interest to extend the prior discussion to address the stability of an asymmetric rigid rotor undergoing
force-free rotation close to a principal axis, that is, when subject to small perturbations. Consider the case
of a general asymmetric rigid body with $I_3 > I_2 > I_1$. Let the system start with rotation about the ê1 axis,
that is, the principal axis associated with the moment of inertia $I_1$. Then

$$\boldsymbol{\omega} = \omega_1\hat{\mathbf{e}}_1 \tag{13.163}$$

Consider that a small perturbation is applied causing the angular velocity vector to be

$$\boldsymbol{\omega} = \omega_1\hat{\mathbf{e}}_1 + \lambda\hat{\mathbf{e}}_2 + \mu\hat{\mathbf{e}}_3 \tag{13.164}$$

where $\lambda$, $\mu$ are very small. The Euler equations (13.156) become

$$\begin{aligned}
(I_2 - I_3)\,\lambda\mu - I_1\dot{\omega}_1 &= 0 \\
(I_3 - I_1)\,\mu\,\omega_1 - I_2\dot{\lambda} &= 0 \\
(I_1 - I_2)\,\omega_1\lambda - I_3\dot{\mu} &= 0
\end{aligned}$$

Assuming that the product $\lambda\mu$ in the first equation is negligible, then $\dot{\omega}_1 = 0$, that is, $\omega_1$ is constant.
The other two equations can be solved to give

$$\dot{\lambda} = \left(\frac{(I_3 - I_1)}{I_2}\,\omega_1\right)\mu \tag{13.165}$$

$$\dot{\mu} = \left(\frac{(I_1 - I_2)}{I_3}\,\omega_1\right)\lambda \tag{13.166}$$

Take the time derivative of the first equation

$$\ddot{\lambda} = \left(\frac{(I_3 - I_1)}{I_2}\,\omega_1\right)\dot{\mu} \tag{13.167}$$

and substituting for $\dot{\mu}$ gives

$$\ddot{\lambda} + \left(\frac{(I_1 - I_3)(I_1 - I_2)}{I_2 I_3}\,\omega_1^2\right)\lambda = 0 \tag{13.168}$$

The solution of this equation is

$$\lambda(t) = A e^{i\Omega_1 t} + B e^{-i\Omega_1 t} \tag{13.169}$$

where

$$\Omega_1 = \omega_1\sqrt{\frac{(I_1 - I_3)(I_1 - I_2)}{I_2 I_3}} \tag{13.170}$$
Note that since it was assumed that 3  2  1  then Ω1 is real. The solution for () therefore represents a
stable oscillatory motion with precession frequency Ω1  The identical result is obtained for Ω1 = Ω1 = Ω1 
Thus the motion corresponds to a stable minimum about the ê1 axis with oscillations about the  =  = 0
minimum with period. s
(1 − 3 ) (1 − 2 )
Ω1 =  1 (13.171)
2 3
Permuting the indices gives that for perturbations applied to rotation about either the 2 or 3 axes give
precession frequencies s
(2 − 1 ) (2 − 3 )
Ω2 =  2 (13.172)
1 3
s
(3 − 2 ) (3 − 1 )
Ω3 =  3 (13.173)
1 2
Since 3  2  1 then Ω1 and Ω3 are real while Ω2 is imaginary. Thus, whereas rotation about either
the 3 or the 1 axes are stable, the imaginary solution about ê2 corresponds to a perturbation increasing
with time. Thus, only rotation about the largest or smallest moments of inertia are stable. Moreover for
the symmetric rigid rotor, with 1 = 2 6= 3  stability exists only about the symmetry axis ê3 independent
on whether the body is prolate or oblate. This result was implied from the discussion of energy and angular
momentum conservation in chapter 1320. Friction was not included in the above discussion. In the presence
of dissipative forces, such as friction or drag, only rotation about the principal axis corresponding to the
maximum moment of inertia is stable.
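
The stability argument above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it integrates the torque-free Euler equations with a hand-written fourth-order Runge-Kutta step and records how far the angular velocity strays from the initial spin axis. The moments (1, 2, 3), the perturbation size, the step size and all function names are illustrative assumptions.

```python
def euler_rhs(w, I):
    """Torque-free Euler equations solved for the angular accelerations."""
    w1, w2, w3 = w
    I1, I2, I3 = I
    return (
        (I2 - I3) * w2 * w3 / I1,
        (I3 - I1) * w3 * w1 / I2,
        (I1 - I2) * w1 * w2 / I3,
    )


def rk4_step(w, I, dt):
    """Advance the body-frame angular velocity by one classical Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = euler_rhs(w, I)
    k2 = euler_rhs(shifted(w, k1, dt / 2), I)
    k3 = euler_rhs(shifted(w, k2, dt / 2), I)
    k4 = euler_rhs(shifted(w, k3, dt), I)
    return tuple(
        wi + dt * (a + 2 * b + 2 * c + d) / 6
        for wi, a, b, c, d in zip(w, k1, k2, k3, k4)
    )


def max_transverse(I, axis, eps=1e-3, t_end=40.0, dt=0.01):
    """Largest angular-velocity magnitude perpendicular to `axis` during the run.

    The body starts with unit angular velocity about principal axis `axis`
    (0, 1 or 2) and a small perturbation eps on each of the other two axes.
    """
    w = tuple(1.0 if k == axis else eps for k in range(3))
    largest = 0.0
    for _ in range(int(round(t_end / dt))):
        w = rk4_step(w, I, dt)
        transverse = sum(w[k] ** 2 for k in range(3) if k != axis) ** 0.5
        largest = max(largest, transverse)
    return largest


if __name__ == "__main__":
    moments = (1.0, 2.0, 3.0)  # assumed illustrative values with I1 < I2 < I3
    for axis in range(3):
        print(f"axis {axis + 1}: max transverse = {max_transverse(moments, axis):.4f}")
```

With these assumed values the perturbation stays at the size of `eps` for the axes of smallest and largest moment, while for the intermediate axis it grows until the body tumbles.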
Stability of rigid-body rotation has broad applications to rotation of satellites, molecules and nuclei.
The first U.S. satellite, Explorer 1, was launched in 1958 with the rotation axis aligned with the cylindrical
axis which was the minimum principal moment of inertia. After a few hours the satellite started tumbling
with increasing amplitude due to a flexible antenna dissipating and transferring energy to the perpendicular
axis which had the largest moment of inertia. Torque-free motion of a deformed rigid body is a ubiquitous
phenomenon in many branches of science, engineering, and sports, as illustrated by the following examples.

13.10 Example: Tennis racquet dynamics

A tennis racquet is an asymmetric body that exhibits the above rotational behavior. Assume that the head of a
tennis racquet is a uniform thin circular disk of radius $R$ and mass $M$ which is attached to a cylindrical handle
of diameter $r = R/10$, length $2R$, and mass $M$ as shown in the figure. The principal moments of inertia about the
three axes through the center-of-mass can be calculated by addition of the moments for the circular disk and the
cylindrical handle and using both the parallel-axis and the perpendicular-axis theorems.

Figure: Principal rotation axes for the center of mass of a tennis racket. The 1 and 2 axes are in the
plane of the racket head and the 3 axis is perpendicular to the plane of the racket head.

Axis    Head                                                Handle                  Racquet
1       $\frac{1}{4}MR^2 + MR^2 = \frac{5}{4}MR^2$          $\frac{4}{3}MR^2$       $\frac{31}{12}MR^2$
2       $\frac{1}{4}MR^2 + 0 = \frac{1}{4}MR^2$             $\frac{1}{200}MR^2$     $\frac{51}{200}MR^2$
3       $\frac{1}{2}MR^2 + MR^2 = \frac{3}{2}MR^2$          $\frac{4}{3}MR^2$       $\frac{17}{6}MR^2$

Note that $I_{11} : I_{22} : I_{33} = 2.5833 : 0.2550 : 2.8333$. Inserting these principal moments of inertia into
equations 13.171 to 13.173 gives the following precession frequencies

$$\Omega_1 = i\,0.8976\,\omega_1 \qquad \Omega_2 = 0.9056\,\omega_2 \qquad \Omega_3 = 0.9892\,\omega_3$$

The imaginary precession frequency $\Omega_1$ about the 1 axis implies unstable rotation leading to tumbling,
whereas the minimum moment $I_{22}$ and maximum moment $I_{33}$ imply stable rotation about the 2 and 3 axes.
This rotational behavior is easily demonstrated by throwing a tennis racquet and is called the tennis racquet
theorem. The center of percussion, example 2.14, is another important inertial property of a tennis racquet.
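
The quoted frequency ratios can be reproduced directly from equations 13.171 to 13.173. The snippet below is a small sketch; the function name `precession_ratios` is an illustrative assumption, and the moments are entered in units of $MR^2$ as listed in the table.

```python
import cmath


def precession_ratios(I1, I2, I3):
    """Return (Omega1/omega1, Omega2/omega2, Omega3/omega3) as complex numbers.

    A purely imaginary ratio signals an exponentially growing perturbation,
    that is, unstable rotation about that axis.
    """
    r1 = cmath.sqrt((I1 - I3) * (I1 - I2) / (I2 * I3))
    r2 = cmath.sqrt((I2 - I1) * (I2 - I3) / (I1 * I3))
    r3 = cmath.sqrt((I3 - I2) * (I3 - I1) / (I1 * I2))
    return r1, r2, r3


if __name__ == "__main__":
    # Racquet moments in units of M R^2, taken from the table in the example.
    for k, ratio in enumerate(precession_ratios(31 / 12, 51 / 200, 17 / 6), start=1):
        kind = "imaginary (unstable)" if abs(ratio.imag) > 0 else "real (stable)"
        print(f"Omega_{k}/omega_{k} = {abs(ratio):.4f}  {kind}")
```

Using `cmath.sqrt` rather than `math.sqrt` lets the intermediate axis, here axis 1, return an imaginary value instead of raising an error.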
13.11 Example: Rotation of asymmetrically-deformed nuclei

Some nuclei and molecules have average shapes that have significant asymmetric deformation leading to
interesting quantal analogs of the rotational properties of an asymmetrically-deformed rigid body. The major
diﬀerence between a quantal and a classical rotor is that the energies, and angular momentum are quantized,
rather than being continuously variable quantities. Otherwise, the quantal rotors exhibit general features
similar to the classical analog. Studies [Cli86] of the rotational behavior of asymmetrically-deformed nuclei
exploit three aspects of classical mechanics, namely classical Coulomb trajectories, rotational invariants, and
the properties of ellipsoidal rigid-bodies.
Ellipsoidal deformation can be specified by the dimensions along each of the three principal axes. Bohr
and Mottelson parameterized the ellipsoidal deformation in terms of three parameters: $R_0$, which is the radius
of the equivalent sphere; $\beta$, which is a measure of the magnitude of the ellipsoidal deformation from the sphere;
and $\gamma$, which specifies the deviation of the shape from axial symmetry. The ellipsoidal intrinsic shape can be
expressed in terms of the deviation from the equivalent sphere by the equation

$$\delta R(\theta,\phi) = R(\theta,\phi) - R_0 = R_0 \sum_{\mu=-2}^{+2} \alpha^*_{2\mu}\, Y_{2\mu}(\theta,\phi)$$

where $Y_{\lambda\mu}(\theta,\phi)$ is a Laplace spherical harmonic defined as

$$Y_{\lambda\mu}(\theta,\phi) = \sqrt{\frac{(2\lambda+1)}{4\pi}\frac{(\lambda-\mu)!}{(\lambda+\mu)!}}\; P_{\lambda\mu}(\cos\theta)\, e^{-i\mu\phi}$$

and $P_{\lambda\mu}(\cos\theta)$ is an associated Legendre function of $\cos\theta$. Spherical harmonics are the angular portion of a
set of solutions to Laplace’s equation. Represented in a system of spherical coordinates, Laplace’s spherical
harmonics $Y_{\lambda\mu}(\theta,\phi)$ are a specific set of spherical harmonics that form an orthogonal system. Spherical
harmonics are important in many theoretical and practical applications.
In the principal axis frame of the body, there are three non-zero quadrupole deformation parameters
which can be written in terms of the deformation parameters $\beta, \gamma$, where $\alpha_{20} = \beta\cos\gamma$, $\alpha_{21} = \alpha_{2-1} = 0$ and
$\alpha_{22} = \alpha_{2-2} = \frac{1}{\sqrt{2}}\beta\sin\gamma$. Using these in the shape expansion above gives the three semi-axis dimensions in the principal
axis frame (primed frame),

$$\delta R_k = R_0\sqrt{\frac{5}{4\pi}}\,\beta\cos\left(\gamma - \frac{2\pi k}{3}\right)$$

Note that for $\gamma = 0$, then $\delta R_1 = \delta R_2 = -\frac{1}{2}\sqrt{\frac{5}{4\pi}}\,\beta R_0$ while $\delta R_3 = +\sqrt{\frac{5}{4\pi}}\,\beta R_0$, that is, the body has prolate
deformation with the symmetry axis along the 3 axis. The same prolate shape is obtained for $\gamma = \frac{2\pi}{3}$ and
$\gamma = \frac{4\pi}{3}$ with the prolate symmetry axes along the 1 and 2 axes respectively. For $\gamma = \frac{\pi}{3}$, then $\delta R_1 = \delta R_3 =
+\frac{1}{2}\sqrt{\frac{5}{4\pi}}\,\beta R_0$ while $\delta R_2 = -\sqrt{\frac{5}{4\pi}}\,\beta R_0$, that is, the body has oblate deformation with the symmetry axis along
the 2 axis. The same oblate shape is obtained for $\gamma = \pi$ and $\gamma = \frac{5\pi}{3}$ with the oblate symmetry axes along
the 3 and 1 axes respectively. For other values of $\gamma$ the shape is ellipsoidal.
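
The prolate and oblate cases listed above follow from evaluating the cosine expression for the three semi-axes. The sketch below does this numerically; the function name `delta_R` and the default $R_0 = 1$ are illustrative assumptions.

```python
import math


def delta_R(beta, gamma, R0=1.0):
    """Semi-axis deviations (dR1, dR2, dR3) from the equivalent sphere.

    Implements dR_k = R0 * sqrt(5 / (4 pi)) * beta * cos(gamma - 2 pi k / 3).
    """
    scale = R0 * math.sqrt(5.0 / (4.0 * math.pi)) * beta
    return tuple(
        scale * math.cos(gamma - 2.0 * math.pi * k / 3.0) for k in (1, 2, 3)
    )


if __name__ == "__main__":
    beta = 0.3  # assumed illustrative deformation
    for label, gamma in [("gamma = 0", 0.0), ("gamma = pi/3", math.pi / 3)]:
        values = ", ".join(f"{x:+.4f}" for x in delta_R(beta, gamma))
        print(f"{label}: dR1, dR2, dR3 = {values}")
```

For any $\gamma$ the three deviations sum to zero, since the three cosines are spaced by $2\pi/3$ in phase.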
For the asymmetric deformed rigid body, the rotational Hamiltonian can be expressed in the form [Dav58]

$$H = \sum_{k=1}^{3} \frac{|R_k|^2}{4B\beta^2\sin^2\left(\gamma' - \frac{2\pi k}{3}\right)}$$

where the rotational angular momentum is $\mathbf{R}$. The principal moments of inertia are related by the triaxiality
parameter $\gamma'$, which the authors of [Dav58] assumed is identical to the shape parameter $\gamma$. For axial symmetry the moment of
inertia about the symmetry axis is taken to be zero for a quantal system since rotation of the potential well
about the symmetry axis corresponds to no change in the potential well, or corresponding rotation of the bound
nucleons. That is, the nucleus is not a rigid body; the nucleons only rotate to the extent that the ellipsoidal
potential well is cranked around such that the nucleons must follow the rotation of the potential well. In
addition, vibrational modes coexist about the average asymmetric deformation, plus octupole deformation
often coexists with the above quadrupole deformed modes.
13.23 Symmetric rigid rotor subject to torque about a fixed point

The motion of a symmetric top rotating in a gravitational field, with one point at a fixed location, is
encountered frequently in rotational motion. Examples are the gyroscope and a child’s spinning top.
Rotation of a rigid rotor subject to torque about a fixed point is a case where it is necessary to take the
inertia tensor with respect to the fixed point in the body, and not at the center of mass.
Consider the geometry, shown in figure 13.7, where the symmetric top of mass $M$ is spinning about a fixed tip
that is displaced by a distance $h$ from the center of mass. The tip of the top is assumed to be at the origin of
both the space-fixed frame $(x, y, z)$ and the body-fixed frame $(1, 2, 3)$. Assume that the translational velocity
is zero and let the principal moments about the fixed point of the symmetric top be $I_1 = I_2 \neq I_3$.

Figure 13.7: Symmetric top spinning about one fixed point.

The Lagrange equations of motion can be derived assuming that the kinetic energy equals the rotational kinetic
energy, that is, it is assumed that the translational kinetic energy $T_{trans} = 0$. Then the kinetic energy of an
inertially-symmetric rigid rotor can be derived for the torque-free symmetric top as given in equation 13.145 to be
$$T = \frac{1}{2}\sum_i I_i\omega_i^2 = \frac{1}{2}I_1\left(\omega_1^2 + \omega_2^2\right) + \frac{1}{2}I_3\omega_3^2 \tag{13.174}$$
$$T = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 \tag{13.175}$$
Since the potential energy is $U = Mgh\cos\theta$ then the Lagrangian equals
$$L = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 - Mgh\cos\theta \tag{13.176}$$
The angular momentum about the space-fixed $z$ axis, $p_\phi$, is conjugate to $\phi$. From Lagrange’s equations
$$\dot{p}_\phi = \frac{\partial L}{\partial\phi} = 0 \tag{13.177}$$
that is, $p_\phi$ is a constant of motion given by the generalized momentum
$$p_\phi = \frac{\partial L}{\partial\dot{\phi}} = \left(I_1\sin^2\theta + I_3\cos^2\theta\right)\dot{\phi} + I_3\dot{\psi}\cos\theta = L_z = \text{constant} \tag{13.178}$$
where $L_z$ is the angular momentum projection along the space-fixed $z$ axis.
Similarly, the angular momentum about the body-fixed 3 axis, $p_\psi$, is conjugate to $\psi$. From Lagrange’s equations
$$\dot{p}_\psi = \frac{\partial L}{\partial\psi} = 0 \tag{13.179}$$
that is, $p_\psi$ is a constant of motion given by the generalized momentum
$$p_\psi = \frac{\partial L}{\partial\dot{\psi}} = I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right) = L_3 = \text{constant} \tag{13.180}$$
where $L_3$ is the angular momentum projection along the body-fixed 3 axis. The above two relations can be
solved to give the precessional angular velocity $\dot{\phi}$ about the space-fixed $z$ axis
$$\dot{\phi} = \frac{p_\phi - p_\psi\cos\theta}{I_1\sin^2\theta} = \frac{L_z - L_3\cos\theta}{I_1\sin^2\theta} \tag{13.181}$$
and the spin angular velocity $\dot{\psi}$ about the body-fixed 3 axis
$$\dot{\psi} = \frac{p_\psi}{I_3} - \frac{\left(p_\phi - p_\psi\cos\theta\right)\cos\theta}{I_1\sin^2\theta} = \frac{L_3}{I_3} - \frac{\left(L_z - L_3\cos\theta\right)\cos\theta}{I_1\sin^2\theta} \tag{13.182}$$
Since $p_\phi = L_z$ and $p_\psi = L_3$ are constants of motion, these rotational angular velocities depend on only
$I_1$, $I_3$, and $\theta$.
There is one further constant of motion available if no frictional forces act on the system, that is, energy
conservation. This implies that the total energy
$$E = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + \frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 + Mgh\cos\theta \tag{13.183}$$
will be a constant of motion. But the middle term on the right-hand side also is a constant of motion
$$\frac{1}{2}I_3\left(\dot{\phi}\cos\theta + \dot{\psi}\right)^2 = \frac{p_\psi^2}{2I_3} = \frac{L_3^2}{2I_3} = \text{constant} \tag{13.184}$$
Thus energy conservation can be rewritten by defining an energy $E'$ where
$$E' \equiv E - \frac{p_\psi^2}{2I_3} = \frac{1}{2}I_1\left(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2\right) + Mgh\cos\theta = \text{constant} \tag{13.185}$$
This can be written as
$$E' = \frac{1}{2}I_1\dot{\theta}^2 + \frac{\left(p_\phi - p_\psi\cos\theta\right)^2}{2I_1\sin^2\theta} + Mgh\cos\theta \tag{13.186}$$
which can be expressed as
$$E' = \frac{1}{2}I_1\dot{\theta}^2 + V(\theta) \tag{13.187}$$
where $V(\theta)$ is an effective potential
$$V(\theta) \equiv \frac{\left(p_\phi - p_\psi\cos\theta\right)^2}{2I_1\sin^2\theta} + Mgh\cos\theta = \frac{\left(L_z - L_3\cos\theta\right)^2}{2I_1\sin^2\theta} + Mgh\cos\theta \tag{13.188}$$

Figure 13.8: Effective potential diagram for a spinning symmetric top as a function of theta.

The effective potential $V(\theta)$ is shown in figure 13.8. It is clear that the motion of a symmetric top with
effective energy $E'$ is confined to angles $\theta_1 < \theta < \theta_2$.
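
The limits $\theta_1$ and $\theta_2$ are the angles at which $V(\theta) = E'$. The sketch below locates them with a simple grid scan followed by bisection; this root-finding approach, the function names, and the parameter values (in arbitrary consistent units) are illustrative assumptions rather than part of the text.

```python
import math


def effective_potential(theta, p_phi, p_psi, I1, Mgh):
    """V(theta) of equation 13.188."""
    return ((p_phi - p_psi * math.cos(theta)) ** 2
            / (2.0 * I1 * math.sin(theta) ** 2)
            + Mgh * math.cos(theta))


def turning_points(E_prime, p_phi, p_psi, I1, Mgh, n=2000):
    """Angles in (0, pi) where V(theta) = E', by grid scan plus bisection."""
    def f(theta):
        return effective_potential(theta, p_phi, p_psi, I1, Mgh) - E_prime

    grid = [math.pi * (i + 0.5) / n for i in range(n)]  # avoids theta = 0 and pi
    roots = []
    for a, b in zip(grid, grid[1:]):
        fa = f(a)
        if fa * f(b) < 0.0:
            for _ in range(60):  # bisection keeps the sign change bracketed
                mid = 0.5 * (a + b)
                fm = f(mid)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            roots.append(0.5 * (a + b))
    return roots


if __name__ == "__main__":
    # Assumed illustrative values: p_phi = 2, p_psi = 3, I1 = 1, Mgh = 1, E' = 1.
    theta_1, theta_2 = turning_points(1.0, 2.0, 3.0, 1.0, 1.0)
    print(f"theta_1 = {theta_1:.4f} rad, theta_2 = {theta_2:.4f} rad")
```

If $E'$ lies below the minimum of $V(\theta)$ the function returns an empty list, since no motion is possible at that energy.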
Note that the above result also is obtained if the Routhian is used, rather than the Lagrangian, as
mentioned in chapter 8.7, and defined by equation (8.65). That is, the Routhian can be written as

$$R(\theta, \dot{\theta}, p_\phi, p_\psi) = p_\phi\dot{\phi} + p_\psi\dot{\psi} - L = H(\phi, p_\phi, \psi, p_\psi) - L(\theta, \dot{\theta})$$
$$= -\frac{1}{2}I_1\dot{\theta}^2 + \frac{\left(p_\phi - p_\psi\cos\theta\right)^2}{2I_1\sin^2\theta} + \frac{p_\psi^2}{2I_3} + Mgh\cos\theta \tag{13.189}$$

The Routhian $R(\theta, \dot{\theta}, p_\phi, p_\psi)$ acts like a Hamiltonian for the $(\phi, p_\phi)$ and $(\psi, p_\psi)$ variables which are
constants of motion, and thus are ignorable variables. The Routhian acts as the negative Lagrangian for the
remaining variable $\theta$ with rotational kinetic energy $\frac{1}{2}I_1\dot{\theta}^2$ and effective potential energy $V_{eff}$

$$V_{eff} = \frac{\left(p_\phi - p_\psi\cos\theta\right)^2}{2I_1\sin^2\theta} + \frac{p_\psi^2}{2I_3} + Mgh\cos\theta = V(\theta) + \frac{p_\psi^2}{2I_3}$$

The equation of motion describing the system in the rotating frame is given by one Lagrange equation

$$\frac{d}{dt}\left(\frac{\partial R}{\partial\dot{\theta}}\right) - \frac{\partial R}{\partial\theta} = 0$$

The negative sign of the Routhian cancels out when used in the Lagrange equation. Thus, in the rotating
frame of reference, the system is reduced to a single degree of freedom, the nutation angle $\theta$, with effective
energy $E'$ given by equations 13.186 to 13.188.
Figure 13.9: Nutational motion of the body-fixed symmetry axis projected onto the space-fixed unit sphere.
The three cases are (a) $\dot{\phi}$ never vanishes, (b) $\dot{\phi} = 0$ at $\theta = \theta_2$, (c) $\dot{\phi}$ changes sign between $\theta_1$ and $\theta_2$.

The motion of the symmetric top is simplest at the minimum value of the effective potential curve, where
$E' = V_{min}$, at which the nutation $\theta$ is restricted to a single value $\theta = \theta_0$. The motion is a steady precession
at a fixed angle of inclination, that is, the “sleeping top”. Solving for $\left(\frac{dV}{d\theta}\right)_{\theta=\theta_0} = 0$ gives that
$$p_\phi - p_\psi\cos\theta_0 = \frac{p_\psi\sin^2\theta_0}{2\cos\theta_0}\left[1 \pm \sqrt{1 - \frac{4MghI_1\cos\theta_0}{p_\psi^2}}\right] \tag{13.190}$$

If $\theta_0 < \frac{\pi}{2}$, then to ensure that the solution is real requires a minimum value of the angular momentum on the
body-fixed axis of $p_\psi^2 \geq 4MghI_1\cos\theta_0$. If $\theta_0 > \frac{\pi}{2}$ then there is no minimum angular momentum projection
on the body-fixed axis. There are two possible solutions to the quadratic relation corresponding to either a
slow or fast precessional frequency. Usually the slow precession is observed.
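
Equation 13.190 can be verified numerically: inserting either root into the effective potential should make its slope vanish at $\theta_0$. The following sketch does this with a central finite difference; the function names, the step size, and the parameter values (arbitrary consistent units) are illustrative assumptions.

```python
import math


def effective_potential(theta, p_phi, p_psi, I1, Mgh):
    """V(theta) of equation 13.188."""
    return ((p_phi - p_psi * math.cos(theta)) ** 2
            / (2.0 * I1 * math.sin(theta) ** 2)
            + Mgh * math.cos(theta))


def steady_p_phi(theta0, p_psi, I1, Mgh, fast=False):
    """p_phi giving steady precession at theta0, from equation 13.190.

    fast=False selects the minus sign (slow precession), fast=True the plus sign.
    Assumes cos(theta0) is non-zero.
    """
    disc = 1.0 - 4.0 * Mgh * I1 * math.cos(theta0) / p_psi ** 2
    if disc < 0.0:
        raise ValueError("p_psi is too small for steady precession at this angle")
    sign = 1.0 if fast else -1.0
    offset = (p_psi * math.sin(theta0) ** 2 / (2.0 * math.cos(theta0))
              * (1.0 + sign * math.sqrt(disc)))
    return offset + p_psi * math.cos(theta0)


def slope(theta, p_phi, p_psi, I1, Mgh, h=1e-6):
    """Central-difference estimate of dV/dtheta."""
    return (effective_potential(theta + h, p_phi, p_psi, I1, Mgh)
            - effective_potential(theta - h, p_phi, p_psi, I1, Mgh)) / (2.0 * h)


if __name__ == "__main__":
    theta0, p_psi, I1, Mgh = 0.6, 3.0, 1.0, 1.0  # assumed illustrative values
    for fast in (False, True):
        p_phi = steady_p_phi(theta0, p_psi, I1, Mgh, fast)
        rate = (p_phi - p_psi * math.cos(theta0)) / (I1 * math.sin(theta0) ** 2)
        print(f"fast={fast}: precession rate = {rate:.4f}, "
              f"dV/dtheta = {slope(theta0, p_phi, p_psi, I1, Mgh):.2e}")
```

The precession rate printed for each root uses equation 13.181, so the two roots correspond directly to the slow and fast precession mentioned above.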
For the general case, where $E' > V_{min}$, the nutation angle $\theta$ between the space-fixed and body-fixed 3
axes varies in the range $\theta_1 < \theta < \theta_2$. This axis exhibits a nodding variation which is called nutation. Figure
13.9 shows the projection of the body-fixed symmetry axis on the unit sphere in the space-fixed frame. Note
that the observed nutation behavior depends on the relative sizes of $p_\phi$ and $p_\psi\cos\theta$. For certain values, the
precession $\dot{\phi}$ changes sign between the two limiting values of $\theta$, producing a looping motion as shown in figure
13.9c. Another condition is where the precession is zero at $\theta_2$, producing a cusp at $\theta_2$ as illustrated in figure
13.9b. This behavior can be demonstrated using the gyroscope or the symmetric top.

13.12 Example: The Spinning “Jack”

The game “Jacks” is played using metal Jacks, each of which comprises six equal masses $m$ at the opposite
ends of orthogonal axes of length $l$. Consider one jack spinning around the body-fixed 3-axis with the lower
mass at a fixed point on the ground, and with a steady precession around the space-fixed vertical axis $z$ with
angle $\theta$ as shown. Assume that the body-fixed axes align with the arms of the jack.

Figure: Jack comprises six bodies of mass $m$ at each end of orthogonal arms of length $l$.

The principal moments of inertia about one mass are given by the parallel axis theorem to be
$I_2 = I_1 = 4ml^2 + 6ml^2 = 10ml^2$ and $I_3 = 4ml^2$.
In the rotating body-fixed frame the torque due to gravity has components
$$\mathbf{N} = \begin{pmatrix} 6mgl\sin\theta\sin\psi \\ 6mgl\sin\theta\cos\psi \\ 0 \end{pmatrix}$$
and the components of the angular velocity are
$$\boldsymbol{\omega} = \begin{pmatrix} \dot{\phi}\sin\theta\sin\psi + \dot{\theta}\cos\psi \\ \dot{\phi}\sin\theta\cos\psi - \dot{\theta}\sin\psi \\ \dot{\phi}\cos\theta + \dot{\psi} \end{pmatrix}$$
Using Euler’s equations (13.103) for the above components of $\mathbf{N}$ and
$\boldsymbol{\omega}$ in the body-fixed frame, gives
$$10\dot{\omega}_1 - 6\omega_2\omega_3 = \frac{6g}{l}\sin\theta\sin\psi \tag{a}$$

$$10\dot{\omega}_2 - 6\omega_1\omega_3 = \frac{6g}{l}\sin\theta\cos\psi \tag{b}$$

$$4\dot{\omega}_3 = 0 \tag{c}$$

Equation (c) relates the spin about the 3 axis, the precession, and the angle to the vertical $\theta$, that is

$$\omega_3 = \dot{\phi}\cos\theta + \dot{\psi} = \Omega\cos\theta + s = \text{constant}$$

where $\dot{\psi} \equiv s$ is the spin and $\dot{\phi} \equiv \Omega$ is the precession angular velocity.

If the spin axis is nearly vertical, $\theta \approx 0$ and thus $\sin\theta \approx \theta$ and $\cos\theta \approx 1$. Multiplying equation (a) $\times \sin\psi$ +
(b) $\times \cos\psi$ and using the equations for the components of $\boldsymbol{\omega}$ gives
$$5\ddot{\theta} + \left(2\Omega s - 3\Omega^2 - \frac{3g}{l}\right)\theta = 0$$

The bracket must be positive to have stable sinusoidal oscillations. That is, the spin angular velocity $s$
required for the jack to spin about a stable vertical axis is given by
$$s > \frac{3\Omega}{2} + \frac{3g}{2\Omega l}$$
This example illustrates the conditions required for stable rotation of any axially-symmetric top.
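
The stability condition for the jack is easy to evaluate numerically. The sketch below encodes the bracket of the linearized equation and the resulting minimum spin; the function names, the arm length and the precession rate are illustrative assumptions, and a positive precession rate is assumed.

```python
def jack_bracket(spin, precession, arm_length, g=9.81):
    """Coefficient of theta in the linearized equation: 2*Omega*s - 3*Omega^2 - 3g/l."""
    return 2.0 * precession * spin - 3.0 * precession ** 2 - 3.0 * g / arm_length


def jack_min_spin(precession, arm_length, g=9.81):
    """Spin s at which the bracket changes sign: 3*Omega/2 + 3g/(2*Omega*l)."""
    return 1.5 * precession + 1.5 * g / (precession * arm_length)


if __name__ == "__main__":
    omega_prec, arm = 2.0, 0.02   # assumed: 2 rad/s precession, 2 cm arms
    s_min = jack_min_spin(omega_prec, arm)
    print(f"minimum spin = {s_min:.1f} rad/s")
    print(f"bracket at 1.1 * s_min = {jack_bracket(1.1 * s_min, omega_prec, arm):.1f}")
```

Under this formula the required spin is smallest when $\Omega = \sqrt{g/l}$, where it equals $3\sqrt{g/l}$.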

13.13 Example: The Tippe Top

The Tippe Top comprises a section of a sphere, to which a short cylindrical rod is mounted on the planar
section, as illustrated. When the Tippe Top is spun on a horizontal surface this top exhibits the perverse
behavior of transitioning from rotation with the spherical head resting on the horizontal surface, to flipping
over such that it rotates resting on its elongated cylindrical rod. The orientation of angular momentum remains
roughly vertical as expected from conservation of angular momentum. This implies that the rotation with respect
to the body-fixed axes must invert as the top inverts. The center of mass is raised when the top inverts; the
additional potential energy is provided by a reduction in the rotational kinetic energy.

Figure: The geometry of the Tippe Top of radius $r$ spinning on a horizontal surface with slipping friction
acting between the top and the horizontal plane. The center of mass is a distance $a$ from the center of the
spherical section along the axis of symmetry of the top.

The Tippe Top behavior was first discovered in the 1890’s but adequate solutions of the equations of motion
have only been developed since the 1950’s. Since the top precesses around the vertical axis, the point of contact
is not on the symmetry axis of the top. Sliding friction between the surface of the spinning top and the horizontal
surface provides a torque that causes the precession of the top to increase and eventually flip up onto the
cylindrical peg. The Tippe Top is typical of many phenomena in physics where the underlying physics principle can
be recognized but a detailed and rigorous solution can be complicated.
The system has five degrees of freedom, $x, y$ which specify the location on the horizontal plane, plus the
three Euler angles $(\phi, \theta, \psi)$. The paper by Cohen[Coh77] explains the motion in terms of Euler angles using
the laboratory to body-fixed transformation relation. It shows that friction plays a pivotal role in the motion
contrary to some earlier claims. Ciocci and Langerock[Cio07] used the Routhian $R$ to reduce the number
of degrees of freedom from 5 to 2, namely $\theta$ which is the tilt angle, and $\phi'$ which is the orientation of the
tilt. This Routhian $R$ is a Lagrangian in two dimensions that was used to derive the equations of motion
via the Lagrange-Euler equations

$$\frac{d}{dt}\left(\frac{\partial R}{\partial\dot{\theta}}\right) - \frac{\partial R}{\partial\theta} = Q_\theta$$
$$\frac{d}{dt}\left(\frac{\partial R}{\partial\dot{\phi}'}\right) - \frac{\partial R}{\partial\phi'} = Q_{\phi'}$$

where $Q_\theta$ and $Q_{\phi'}$ are generalized torques about the two angles that take into account the sliding frictional
forces. This sophisticated Routhian reduction approach provides an exhaustive and refined solution for the
Tippe Top and confirms that sliding friction plays a key role in the unusual behavior of the Tippe Top.

13.24 The rolling wheel

As discussed in chapter 57 the rolling wheel is a non-holonomic system that is simple in principle, but
in practice the solution can be complicated, as illustrated by the Tippe Top. Chapter 1323 discussed the
motion of a symmetric top rotating about a fixed point on the symmetry axis when subject to a torque. The
rolling wheel involves rotation of a symmetric rigid body that is subject to torques. However, the point of
contact of the wheel with a static plane is on the periphery of the wheel, and friction at the point of contact
is assumed to ensure zero slip. Note that friction is necessary to ensure that the rotating object rolls without
slipping, but the frictional force does no work for pure rolling of an undeformable rigid wheel.
The coordinate system employed is shown in Figure 1310. For simplicity it is better to use a moving
coordinate frame (1 2 3) that is fixed to the orientation of the wheel with the origin at the center of mass
of the wheel, but this moving reference frame does not include the angular velocity ̇ of the disk about the
3 axis. That is, the moving (1 2 3) frame has angular velocities

1 = ̇ (13.191)
2 = ̇ sin 
3 = ̇ cos 

The frame fixed in the rotating wheel must include the additional angular velocity of the disk ̇ about the
ê3 axis, that is

Ω1 =  1 = ̇ (13.192)
Ω2 =  2 = ̇ sin 
Ω3 =  3 + ̇ = ̇ cos  + ̇

where Ω designates the angular velocity of the rotating disk, while ω designates the rotation of the moving
frame (1 2 3).
The principle moments of inertia of a thin circular disk are related by the perpendicular axis theorem
(chapter 139)
1 + 2 = 3
Since 1 = 2 for a uniform disk, therefore 3 = 21 .
Equation 1216 can be used to relate the vector forces F in the space-fixed frame to the rate of change
of momenta in the moving frame (1 2 3) 

F = ṗ = ṗ + ω × p (13.193)

This leads to the following relations for the three components in the moving frame

1 = ̇1 +  2 3 −  3 2 (13.194)
2 −   sin  = ̇2 +  3 1 −  1 3
3 −   cos  = ̇3 +  1 2 −  2 1
where $F_1, F_2, F_3$ are the reactive forces acting, as shown in figure 13.10.

Figure 13.10: Uniform disk rolling on a horizontal plane as viewed in the (a) fixed frame, and (b) rolling
disk frame. The space-fixed axis system is $(x, y, z)$, while the moving reference frame $(1, 2, 3)$ is centered at
the center of mass of the disk with the 1, 2 axes in the plane of the disk. The disk is rotating with a uniform
angular velocity $\dot{\psi}$ about the 3 axis and rolling in the direction that is at an angle $\phi$ relative to the $x$ axis.

Similarly, the torques $\mathbf{N}$ in the space-fixed frame can be related to the rate of change of angular momentum
by
$$\mathbf{N} = \dot{\mathbf{L}}_{space} = \dot{\mathbf{L}}_{moving} + \boldsymbol{\omega}\times\mathbf{L} \tag{13.195}$$
where $\mathbf{L} = \mathbf{I}\cdot\boldsymbol{\Omega}$. This leads to the following relations for the three torque equations in the moving frame
$$N_1 = -F_3 R = I_1\dot{\Omega}_1 + I_3\Omega_3\omega_2 - I_2\Omega_2\omega_3 \tag{13.196}$$
$$N_2 = 0 = I_1\dot{\Omega}_2 + I_1\Omega_1\omega_3 - I_3\Omega_3\omega_1$$
$$N_3 = F_1 R = I_3\dot{\Omega}_3 + I_2\Omega_2\omega_1 - I_1\Omega_1\omega_2$$
The rolling constraints are
$$p_1 + MR\,\Omega_3 = 0 \tag{13.197}$$
$$p_2 = 0$$
$$p_3 - MR\,\Omega_1 = 0$$
where $p_i = Mv_i$. Combining equations 13.194, 13.196, and 13.197 gives
$$\left(I_1 + MR^2\right)\dot{\Omega}_1 + \left(I_3 + MR^2\right)\omega_2\Omega_3 - I_2\,\omega_3\Omega_2 = -MgR\cos\theta \tag{13.198}$$
$$I_1\dot{\Omega}_2 + I_1\,\omega_3\Omega_1 - I_3\,\omega_1\Omega_3 = 0$$
$$\left(I_3 + MR^2\right)\dot{\Omega}_3 + I_2\,\omega_1\Omega_2 - \left(I_1 + MR^2\right)\omega_2\Omega_1 = 0$$
These are the torque equations about the point of contact.
Introduction of equations 13.191 and 13.192 into equation 13.198 expresses the equations of motion in
terms of the Euler angles to be
$$\left(I_1 + MR^2\right)\ddot{\theta} + \left(I_3 + MR^2\right)\dot{\phi}\sin\theta\left(\dot{\phi}\cos\theta + \dot{\psi}\right) - I_1\dot{\phi}^2\sin\theta\cos\theta = -MgR\cos\theta \tag{13.199}$$
$$I_1\ddot{\phi}\sin\theta + 2I_1\dot{\phi}\dot{\theta}\cos\theta - I_3\dot{\theta}\left(\dot{\phi}\cos\theta + \dot{\psi}\right) = 0$$
$$\left(I_3 + MR^2\right)\left(\ddot{\phi}\cos\theta - \dot{\phi}\dot{\theta}\sin\theta + \ddot{\psi}\right) - MR^2\dot{\phi}\dot{\theta}\sin\theta = 0$$
Equations 13199 are non-linear, and a closed-form solution is possible only for limited cases such as when
 = 90◦ .
Note that the above equations of motion also can be derived using Lagrangian mechanics knowing that
1 ¡ 2 ¢ 1 ¡ ¢ 1
=  1 + 22 + 32 + 1 Ω21 + Ω22 + 3 Ω23 −   cos 
2 2 2
The diﬀerential equations of constraint can be derived from equations 13197 to be
 −  cos  = 0
 −  sin  = 0
Use of generalized forces plus the Lagrange-Euler equations (645) can be used to derive the equations of
motion and solve for the components of the constraint force 1  2  and 3 .
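
One way to work with equations 13.199 numerically is to write each as a residual that must vanish along a valid motion. The sketch below does this and evaluates the residuals for steady upright rolling; the function name, the argument layout and the disk parameters are illustrative assumptions.

```python
import math


def rolling_disk_residuals(state, accel, I1, I3, M, R, g=9.81):
    """Left-hand side minus right-hand side of the three equations 13.199.

    state = (theta, theta_dot, phi_dot, psi_dot)
    accel = (theta_ddot, phi_ddot, psi_ddot)
    All three residuals vanish for a motion that satisfies the equations.
    """
    theta, theta_dot, phi_dot, psi_dot = state
    theta_ddot, phi_ddot, psi_ddot = accel
    s, c = math.sin(theta), math.cos(theta)
    big_omega3 = phi_dot * c + psi_dot          # Omega_3 of equation 13.192
    mr2 = M * R * R
    r1 = ((I1 + mr2) * theta_ddot
          + (I3 + mr2) * phi_dot * s * big_omega3
          - I1 * phi_dot ** 2 * s * c
          + M * g * R * c)
    r2 = (I1 * phi_ddot * s
          + 2.0 * I1 * phi_dot * theta_dot * c
          - I3 * theta_dot * big_omega3)
    r3 = ((I3 + mr2) * (phi_ddot * c - phi_dot * theta_dot * s + psi_ddot)
          - mr2 * phi_dot * theta_dot * s)
    return r1, r2, r3


if __name__ == "__main__":
    M, R = 2.0, 0.35                            # assumed uniform disk
    I1, I3 = 0.25 * M * R * R, 0.5 * M * R * R
    upright = (math.pi / 2, 0.0, 0.0, 5.0)      # vertical wheel, constant spin
    print(rolling_disk_residuals(upright, (0.0, 0.0, 0.0), I1, I3, M, R))
```

For the upright state the residuals are zero to rounding error, while a tilted stationary-axis state leaves a non-zero first residual, which is the gravitational torque that starts the wheel tipping.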

13.14 Example: Tipping stability of a rolling wheel

A circular wheel rolling in a vertical plane at high angular velocity initially rolls in a straight line and
remains vertical. However, below a certain angular velocity, gyroscopic forces become weaker and the wheel
will tip sideways and veer rapidly from the initial direction. It is interesting to estimate the minimum angular
velocity of the disk such that it does not start to tip over sideways.
Note that equations 13.199 are satisfied for $\theta = \frac{\pi}{2}$, $\dot{\phi} = 0$ and $\dot{\psi} = \Omega_3 = $ constant. Assume a small
disturbance causes the tilt angle to be $\theta = \frac{\pi}{2} + \alpha$ where $\alpha$ is small and that $\dot{\phi}$ is non-zero but small, that is
$\dot{\theta} = \dot{\alpha}$ and $\dot{\phi}$ are small. Keeping only terms to first order in the third of equations 13.199 and integrating
gives
$$\dot{\phi}\cos\theta + \dot{\psi} = \Omega_3 \tag{a}$$
The first two of equations 13.198 become
$$\left(I_1 + MR^2\right)\ddot{\alpha} + \left(I_3 + MR^2\right)\dot{\phi}\,\Omega_3 - MgR\,\alpha = 0 \tag{b}$$
$$I_1\ddot{\phi} - I_3\Omega_3\dot{\alpha} = 0 \tag{c}$$
Integrating equation (c) gives
$$\dot{\phi} = \frac{I_3\Omega_3}{I_1}\,\alpha \tag{d}$$
Inserting (d) into (b) gives
$$\left(I_1 + MR^2\right)\ddot{\alpha} + \left[\left(I_3 + MR^2\right)\frac{I_3\Omega_3^2}{I_1} - MgR\right]\alpha = 0 \tag{e}$$
Equation (e) has a stable oscillatory solution when the square bracket is positive, that is,
$$\Omega_3^2 > \frac{I_1 MgR}{I_3\left(I_3 + MR^2\right)} \tag{f}$$
which gives the minimum angular velocity required for stable rolling motion. For angular velocity less than the
minimum, the square bracket in equation (e) is negative, leading to exponentially decaying and exponentially
growing solutions, so that the tilt diverges. For a uniform disk the perpendicular axis theorem gives $I_3 = 2I_1 = \frac{1}{2}MR^2$ for which equation (f)
gives
$$\Omega_3^2 > \frac{g}{3R} \tag{g}$$
Therefore the critical linear velocity of the wheel is
$$v = \Omega_3 R > \sqrt{\frac{gR}{3}} \tag{h}$$
The bicycle wheel provides a common example of the tipping of a rolling wheel. For the typical 0.35 m
radius of a bicycle wheel, this gives a critical velocity of $v > 1.07$ m/s $= 2.4$ mph (see footnote 4).
4 The stability of the bicycle is sensitive to the castor and other aspects of the steering geometry of the front wheel, in
addition to the gyroscopic effects. Excellent articles on this subject have been written by D.E.H. Jones, Physics Today 23(4)
(1970) 34, and also by J. Lowell & H.D. McKell, American Journal of Physics 50 (1982) 1106.
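
The bicycle-wheel estimate in example 13.14 can be reproduced with a few lines. This is a minimal sketch assuming $g = 9.81$ m/s$^2$; the function names and the 2 kg test mass are illustrative and not from the text.

```python
import math


def critical_spin_squared(I1, I3, M, R, g=9.81):
    """Right-hand side of inequality (f): I1 M g R / (I3 (I3 + M R^2))."""
    return I1 * M * g * R / (I3 * (I3 + M * R * R))


def critical_speed_uniform_disk(R, g=9.81):
    """Critical rolling speed sqrt(g R / 3) for a uniform disk, inequality (h)."""
    return math.sqrt(g * R / 3.0)


if __name__ == "__main__":
    R = 0.35                                  # bicycle wheel radius in metres
    v = critical_speed_uniform_disk(R)
    print(f"critical speed = {v:.2f} m/s = {v / 0.44704:.1f} mph")
```

The general expression (f) reduces to $g/3R$ when the uniform-disk moments $I_1 = \frac{1}{4}MR^2$ and $I_3 = \frac{1}{2}MR^2$ are inserted, independent of the mass.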
13.15 Example: Pivoting

A rolling and a pivoting body can lead to confusion as to whether to compute the angular momentum and
kinetic energy with respect to the center of mass, or the point of contact on the circumference of the body for
rolling, or of the pivot point for a fixed pivot. For pivoting or rolling of a wheel it is useful to compare the
angular momentum and total energy computed with respect to (1) the center of mass of a cylinder and (2)
the point of contact of the cylinder with the horizontal ground plane.
Consider a cylinder of radius $R$ and mass $M$ pivoting about the point of contact with the plane with
angular velocity $\omega = \frac{v}{R}$ where $v$ is the instantaneous velocity of the center of mass. The angular momentum
about the pivot point is
$$\mathbf{L}_{pivot} = \int \mathbf{R}\times\mathbf{v}\,dm = I_{pivot}\,\boldsymbol{\omega}$$

The parallel-axis theorem relates the moment of inertia with respect to the pivot point and center of mass

$$I_{pivot} = MR^2 + I_{cm}$$

The angular velocities of the center of mass, and about the center of mass, are identical since the pivot point
is fixed, that is
$$\omega_{pivot} = \omega_{cm} = \omega$$

Thus the angular momentum about the pivot point is given by the sum of the angular momenta

$$\mathbf{L}_{pivot} = I_{pivot}\,\boldsymbol{\omega} = MR^2\boldsymbol{\omega} + I_{cm}\boldsymbol{\omega}$$

That is, the angular momentum is the sum of the angular momentum of the body about the center of mass,
plus the angular momentum of the center of mass about the pivot point. This is an example of Chasles
theorem.
The kinetic energy is given only by the rotational energy since the pivot point is stationary

$$T = \frac{1}{2}I_{pivot}\,\omega^2 = \frac{1}{2}MR^2\omega^2 + \frac{1}{2}I_{cm}\,\omega^2 = \frac{1}{2}Mv^2 + \frac{1}{2}I_{cm}\,\omega^2$$
That is, it equals the kinetic energy of rotation about the center of mass plus the instantaneous kinetic energy
for translation of the center of mass in agreement with Chasles theorem. Thus, for pivoting, the angular
momentum and kinetic energy are the same if evaluated using either center of mass coordinates or using the
pivot point as the reference point.
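
The bookkeeping in the pivoting example can be illustrated with numbers. The sketch below assumes a uniform solid cylinder, for which $I_{cm} = \frac{1}{2}MR^2$; that choice, the function name and the numerical values are illustrative assumptions.

```python
def pivot_quantities(M, R, omega, I_cm):
    """Return (I_pivot, L_pivot, T) for a body pivoting about a point on its rim."""
    I_pivot = M * R ** 2 + I_cm          # parallel-axis theorem
    L_pivot = I_pivot * omega            # angular momentum about the pivot
    T = 0.5 * I_pivot * omega ** 2       # purely rotational kinetic energy
    return I_pivot, L_pivot, T


if __name__ == "__main__":
    M, R, omega = 3.0, 0.2, 4.0          # assumed illustrative values, SI units
    I_cm = 0.5 * M * R ** 2              # uniform solid cylinder (assumption)
    I_pivot, L_pivot, T = pivot_quantities(M, R, omega, I_cm)
    v = omega * R                        # speed of the center of mass
    print(f"L about pivot = {L_pivot:.3f}")
    print(f"T = {T:.3f}, center-of-mass split = {0.5 * M * v**2 + 0.5 * I_cm * omega**2:.3f}")
```

The two kinetic-energy values printed on the last line agree, which is the statement that the pivot-point and center-of-mass decompositions give the same energy.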

13.16 Example: Rolling

Consider the same system except the cylinder is rolling without slipping on a plane. The subtle difference
between pivoting and rolling is that the rolling point of contact and the center of mass are moving at the same
velocity in contrast to pivoting where the point of contact is stationary. Thus for rolling there is no angular
momentum of the center of mass with respect to the point of contact. Therefore the angular momentum about
the instantaneous point of contact is

$$\mathbf{L} = \mathbf{L}_{cm} + \mathbf{L}_{rot} = MR^2\cdot 0 + I_{cm}\,\boldsymbol{\omega} = I_{cm}\,\boldsymbol{\omega}$$

That is, the angular momentum only includes the angular momentum about the center of mass which is
smaller than the angular momentum for the same body pivoting about a point on the periphery of the cylinder.
The kinetic energy is given by

$$T = \frac{1}{2}Mv^2 + \frac{1}{2}I_{cm}\,\omega^2 = \frac{1}{2}MR^2\omega^2 + \frac{1}{2}I_{cm}\,\omega^2$$
Thus the angular momentum is significantly smaller for rolling relative to pivoting of a given body, whereas
the kinetic energy is the same for both rolling or pivoting of a given body.
13.25 Dynamic balancing of wheels

For rotating machinery it is crucial that rotors be both statically and dynamically balanced. Static balance
means that the center of mass is on the axis of rotation. Dynamic balance means that the axis of rotation is
a principal axis.
For example, consider the symmetric rotor that has its symmetry axis at an angle $\theta$ to the axis of rotation.
In this case the system is statically balanced since the center of gravity is on the axis of rotation. However,
the rotation axis is at an angle $\theta$ to the symmetry axis. This implies that the axle has to provide a torque
to maintain rotation that is not along a principal axis. If you distort the front wheel of your car by hitting it
sideways against the sidewalk curb, or if the wheel is not dynamically balanced, then you will find that the
steering wheel can vibrate wildly at certain speeds due to the torques caused by dynamic imbalance shaking
the steering mechanism. This can be especially bad when the rotation frequency is close to a resonant
frequency of the suspension system. Insist that your automobile wheels are dynamically balanced when you
change tires; static balancing will not eliminate the dynamic imbalance forces. Another example is that the
ailerons, rudder, and elevator on aircraft usually are dynamically balanced to stop the buildup of oscillations
that can couple to flexing and flutter of the airframe, which can lead to airframe failure.

13.17 Example: Forces on the bearings of a rotating circular disk

A homogeneous circular disk of mass $M$ and radius $R$ rotates with constant angular velocity $\omega$ about a
body-fixed axis passing through the center of the circular disk as shown in the adjacent figure. The rotation axis
is inclined at an angle $\theta$ to the symmetry axis of the circular disk by bearings on both sides of the disk spaced a
distance $d$ apart. Determine the forces on the bearings.

Figure: Rotation of circular disk about an axis that is at an angle $\theta$ to the symmetry axis of the circular disk.

Choose the body-fixed axes such that $\hat{\mathbf{e}}_3$ is along the symmetry axis of the circular disk, and $\hat{\mathbf{e}}_1$ points in the
plane of the disk symmetry axis and the rotation axis. These axes are the principal axes for which the inertia
tensor can be calculated to be
$$\mathbf{I} = \frac{MR^2}{4}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}$$
Note that for this thin plane lamina disk $I_{11} + I_{22} = I_{33}$.
The components of the angular velocity vector $\boldsymbol{\omega}$ along the three body-fixed axes are given by
$$\boldsymbol{\omega} = (\omega\sin\theta,\; 0,\; \omega\cos\theta)$$
Since it is assumed that $\dot{\omega} = 0$, then substituting into Euler’s equations (13.103) gives the torques acting to
be

$$N_1 = N_3 = 0$$
$$N_2 = -\frac{1}{4}MR^2\omega^2\sin\theta\cos\theta$$
That is, the torque is in the $\hat{\mathbf{e}}_2$ direction. Thus the forces $\mathbf{F}$ on the bearings can be calculated since $\mathbf{N} = \mathbf{r}\times\mathbf{F}$,
thus
$$|F| = \frac{|N_2|}{2d} = \frac{MR^2\omega^2\sin 2\theta}{16\,d}$$
Estimate the size of these forces for the front wheel of your car travelling at 70 m.p.h. if the rotation axis is
displaced by $2^\circ$ from the symmetry axis of the wheel.
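
A rough answer to the closing estimate can be sketched as follows. The wheel mass, radius and bearing length scale are assumed values chosen only for illustration, and the wheel is treated as the uniform thin disk of the example; only the 70 m.p.h. speed and the $2^\circ$ misalignment come from the text.

```python
import math


def bearing_torque(M, R, omega, theta):
    """|N_2| = (1/4) M R^2 omega^2 sin(theta) cos(theta) from Euler's equations."""
    return 0.25 * M * R ** 2 * omega ** 2 * math.sin(theta) * math.cos(theta)


def bearing_force(M, R, omega, theta, d):
    """|F| = M R^2 omega^2 sin(2 theta) / (16 d), the expression in the example."""
    return M * R ** 2 * omega ** 2 * math.sin(2.0 * theta) / (16.0 * d)


if __name__ == "__main__":
    M, R, d = 10.0, 0.30, 0.10            # assumed: 10 kg wheel, 0.30 m radius, d = 0.10 m
    speed = 70 * 0.44704                  # 70 m.p.h. in m/s
    omega = speed / R                     # rolling without slipping
    theta = math.radians(2.0)
    print(f"omega = {omega:.1f} rad/s, |F| = {bearing_force(M, R, omega, theta, d):.0f} N")
```

With these assumed values the force on each bearing comes out at a few hundred newtons, and it scales with the square of the road speed.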
Figure 13.11: Forward two-and-a-half somersaults with two twists demonstrates unequivocally that a diver
can initiate continuous twisting in midair. In the illustrated maneuver the diver does more than one full
somersault before he starts to twist. To maintain the twisting the diver does not have to move his legs.[Fro80]

13.26 Rotation of deformable bodies

The discussion in this chapter has assumed that the rotating body is a rigid body. However, there is a
broad and important class of problems in classical mechanics where the rotating body is deformable that
leads to intriguing new phenomena. The classic example is the cat, which, if dropped upside down with zero
angular momentum, is able to distort its body plus tail in order to rotate such that it lands on its feet in
spite of the fact that there are no external torques acting and thus the angular momentum is conserved.
Another example is the high diver doing a forward two-and-a-half somersault with two twists.[Fro80] Once
the diver leaves the board then the total angular momentum must be conserved since there are no external
torques acting on the system. The diver begins a somersault by rotating about a horizontal axis which is a
principal axis that is perpendicular to the axis of his body passing through his hips. Initially the angular
momentum and angular velocity are parallel and point perpendicular to the symmetry axis. The
diver first goes into a tuck, which greatly reduces his moment of inertia about the axis of his somersault and
concomitantly increases his angular velocity about this axis, and he performs one full somersault prior to
initiating twisting. Then the diver twists his body and moves his arms to destroy the axial symmetry of his
body, which changes the direction of the principal axes of the inertia tensor. This causes the angular velocity
to change in both direction and magnitude such that the angular momentum remains conserved. The angular
velocity now is no longer parallel to the angular momentum resulting in a component along the length of
the body causing it to twist while somersaulting. This twisting motion will continue until the symmetry
of the diver’s body is restored which is done just before entering the water. By skilled timing, and body
movement, the diver restores the symmetry of his body to the optimum orientation for entering the water.
Such phenomena involving deformable bodies are important to the motion of ballet dancers, jugglers, astronauts
in space, and satellites. The above rotational phenomena would be impossible if the cat or diver were
rigid bodies having a fixed inertia tensor. Calculation of the dynamics of the motion of deformable bodies
is complicated and beyond the scope of this book, but the concept of a time-dependent transformation of
the inertia tensor underlies the subsequent motion. The theory is complicated since it is difficult even to
quantify what corresponds to rotation as the body morphs from one shape to another. Further information
on this topic can be found in the literature [Fro80].

13.27 Summary
This chapter has introduced the important topic of rigid-body rotation, which has many applications in
physics, engineering, sports, etc.

Inertia tensor The concept of the inertia tensor was introduced where the 9 components of the inertia
tensor are given by
$$I_{ij} = \int \rho(\mathbf{r}') \left( \delta_{ij} \left( \sum_{k}^{3} x_k^2 \right) - x_i x_j \right) dV \tag{13.14}$$

Steiner’s parallel-axis theorem
$$J_{11} \equiv I_{11} + M\left( \left(a_1^2 + a_2^2 + a_3^2\right)\delta_{11} - a_1^2 \right) = I_{11} + M\left(a_2^2 + a_3^2\right) \tag{13.43}$$
relates the inertia tensor about the center of mass to that about a parallel axis system not through the center
of mass.
Diagonalization of the inertia tensor about any point was used to find the corresponding principal axes
of the rigid body.

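Angular momentum The angular momentum $\mathbf{L}$ for rigid-body rotation is expressed in terms of the
inertia tensor and angular velocity $\boldsymbol{\omega}$ by
$$\mathbf{L} = \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} \cdot \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} = \{\mathbf{I}\} \cdot \boldsymbol{\omega} \tag{13.56}$$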

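Rotational kinetic energy The rotational kinetic energy is
$$T_{rot} = \frac{1}{2} \begin{pmatrix} \omega_1 & \omega_2 & \omega_3 \end{pmatrix} \cdot \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} \cdot \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix} \tag{13.72}$$
$$T_{rot} \equiv \mathbf{T} = \frac{1}{2} \boldsymbol{\omega} \cdot \{\mathbf{I}\} \cdot \boldsymbol{\omega} = \frac{1}{2} \boldsymbol{\omega} \cdot \mathbf{L} \tag{13.73}$$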
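The two summary relations $\mathbf{L} = \{\mathbf{I}\} \cdot \boldsymbol{\omega}$ and $T = \frac{1}{2} \boldsymbol{\omega} \cdot \mathbf{L}$ can be illustrated with a short sketch. The example body (a thin uniform disk with unit mass and radius, expressed in its principal-axis frame) and the 30 degree tilt of the rotation axis are assumptions chosen for illustration only.

```python
import math

def mat_vec(matrix, vector):
    """Product of a 3x3 matrix (nested lists) with a 3-vector."""
    return [sum(matrix[i][j] * vector[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

# Assumed body: thin uniform disk, M = R = 1, symmetry axis along body axis 3.
inertia = [[0.25, 0.0, 0.0],
           [0.0, 0.25, 0.0],
           [0.0, 0.0, 0.50]]
alpha = math.radians(30.0)                       # tilt of omega from axis 3
omega = [math.sin(alpha), 0.0, math.cos(alpha)]  # unit angular velocity

L = mat_vec(inertia, omega)                      # L = {I} . omega
T = 0.5 * dot(omega, L)                          # T = (1/2) omega . L

# Angle between L and omega; non-zero unless omega lies along a principal axis.
cos_angle = dot(omega, L) / math.sqrt(dot(omega, omega) * dot(L, L))
angle_deg = math.degrees(math.acos(cos_angle))
print(L, T, angle_deg)
```

Because the inertia tensor is not a multiple of the identity, $\mathbf{L}$ and $\boldsymbol{\omega}$ are not parallel here, which is the origin of the bearing torques discussed for unbalanced wheels.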

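Euler angles The Euler angles relate the space-fixed and body-fixed principal axes. The angular velocity
$\boldsymbol{\omega}$ expressed in terms of the Euler angles has components for the angular velocity in the body-fixed axis system
$(1, 2, 3)$

$$\omega_1 = \dot{\phi}_1 + \dot{\theta}_1 + \dot{\psi}_1 = \dot{\phi} \sin\theta \sin\psi + \dot{\theta} \cos\psi \tag{13.86}$$

$$\omega_2 = \dot{\phi}_2 + \dot{\theta}_2 + \dot{\psi}_2 = \dot{\phi} \sin\theta \cos\psi - \dot{\theta} \sin\psi \tag{13.87}$$

$$\omega_3 = \dot{\phi}_3 + \dot{\theta}_3 + \dot{\psi}_3 = \dot{\phi} \cos\theta + \dot{\psi} \tag{13.88}$$
Similarly, the components of the angular velocity for the space-fixed axis system $(x, y, z)$ are
$$\omega_x = \dot{\theta} \cos\phi + \dot{\psi} \sin\theta \sin\phi \tag{13.89}$$
$$\omega_y = \dot{\theta} \sin\phi - \dot{\psi} \sin\theta \cos\phi \tag{13.90}$$
$$\omega_z = \dot{\phi} + \dot{\psi} \cos\theta \tag{13.91}$$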
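As a consistency check on the two sets of components, the magnitude of the angular velocity must be the same in the body-fixed and space-fixed frames. The sketch below assumes the standard forms of equations 13.86 to 13.91 in terms of the Euler angles $(\phi, \theta, \psi)$; the numerical angles and rates are arbitrary test values.

```python
import math

def omega_body(phi, theta, psi, dphi, dtheta, dpsi):
    """Angular velocity components along the body-fixed axes (1, 2, 3)."""
    w1 = dphi * math.sin(theta) * math.sin(psi) + dtheta * math.cos(psi)
    w2 = dphi * math.sin(theta) * math.cos(psi) - dtheta * math.sin(psi)
    w3 = dphi * math.cos(theta) + dpsi
    return (w1, w2, w3)

def omega_space(phi, theta, psi, dphi, dtheta, dpsi):
    """Angular velocity components along the space-fixed axes (x, y, z)."""
    wx = dtheta * math.cos(phi) + dpsi * math.sin(theta) * math.sin(phi)
    wy = dtheta * math.sin(phi) - dpsi * math.sin(theta) * math.cos(phi)
    wz = dphi + dpsi * math.cos(theta)
    return (wx, wy, wz)

def norm_sq(v):
    return sum(c * c for c in v)

# Arbitrary test state: (phi, theta, psi, dphi, dtheta, dpsi).
state = (0.3, 1.1, -0.7, 0.5, -0.2, 2.0)
body_sq = norm_sq(omega_body(*state))
space_sq = norm_sq(omega_space(*state))
print(body_sq, space_sq)
```

Both expressions reduce to $\dot{\phi}^2 + \dot{\theta}^2 + \dot{\psi}^2 + 2\dot{\phi}\dot{\psi}\cos\theta$, a rotational invariant.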

Rotational invariants The powerful concept of the rotational invariance of scalar properties was intro-
duced. Important examples of rotational invariants are the Hamiltonian, Lagrangian, and Routhian.

Euler equations of motion for rigid-body motion The dynamics of rigid-body rotational motion was
explored and the Euler equations of motion were derived using both Newtonian and Lagrangian mechanics.

$$N_1 = I_1 \dot{\omega}_1 - (I_2 - I_3)\, \omega_2 \omega_3 \tag{13.103}$$

$$N_2 = I_2 \dot{\omega}_2 - (I_3 - I_1)\, \omega_3 \omega_1$$

$$N_3 = I_3 \dot{\omega}_3 - (I_1 - I_2)\, \omega_1 \omega_2$$

Lagrange equations of motion for rigid-body motion The Euler equations of motion for rigid-body
motion, given in equation 13.103, were derived using the Lagrange-Euler equations.

Torque-free motion of rigid bodies The Euler equations and Lagrangian mechanics were used to study
torque-free rotation of both symmetric and asymmetric bodies including discussion of the stability of torque-
free rotation.

Rotating symmetric body subject to a torque The complicated motion exhibited by a symmetric top,
that is spinning about one fixed point and subject to a torque, was introduced and solved using Lagrangian
mechanics.

The rolling wheel The non-holonomic motion of rolling wheels was introduced, as well as the importance
of static and dynamic balancing of rotating machinery.

Rotation of deformable bodies The complicated non-holonomic motion involving rotation of deformable
bodies was introduced.

Workshop exercises
1. Three objects are described below. Break up into three groups, one group per object, and determine the inertia
tensor.

• A very thin sheet with a mass density  =  where  is a positive constant. The sheet lies in the 
plane and its sides are both of length .
• An inclined-plane shaped block of mass  is oriented with one corner at the origin as shown.

• An equilateral triangle made up of three thin rods of length  and uniform mass density .

2. Consider the objects described in problem 1.

(a) For the first object (the thin sheet), determine the principal moments of inertia.
(b) For the second object (the inclined plane), determine the principal axes.
(c) For the third object (the equilateral triangle), determine the products of inertia.

3. Consider the inertia tensor.

(a) What are the advantages of diagonalizing the inertia tensor?

(b) How can the inertia tensor be diagonalized?
(c) What can you say about a tensor that is real and symmetric?

4. A hollow spherical shell has a mass  and radius .

(a) Calculate the inertia tensor for a set of coordinates whose origin is at the center of mass of the shell.
(b) Now suppose that the shell is rolling without slipping toward a step of height , where   . The shell
has a linear velocity  . What is the angular momentum of the shell relative to the tip of the step?
(c) The shell now strikes the tip of the step inelastically (so that the point of contact sticks to the step,
but the shell can still rotate about the tip of the step). What is the angular momentum of the shell
immediately after contact?
(d) Finally, find the minimum velocity which enables the shell to surmount the step. Express your result in
terms of    and .

5. The vectors ̂, ̂ , and ̂ constitute a set of orthogonal right-handed axes. The vectors ̂ + ̂ − 2̂ , −̂ + ̂ , and
̂ + ̂ + ̂ are also perpendicular to one another.

(a) Write out the set of direction cosines relating the new axes to the old.
(b) How are the Eulerian angles defined? Describe this transformation by a set of Eulerian angles.

6. A torsional pendulum consists of a vertical wire attached to a mass which can rotate about the vertical axis.
Consider three torsional pendula which consist of identical wires from which identical homogeneous solid cubes
are hung. One cube is hung from a corner, one from midway along an edge, and one from the middle of a face
as shown. What are the ratios of the periods of the three pendula?

7. A dumbbell comprises two equal point masses  connected by a massless rigid rod of length 2 which is
constrained to rotate about an axle fixed to the center of the rod at an angle  as shown in the figure. The
center of the rod is at the origin of the coordinates, the axle along the  -axis, and the dumbbell lies in the
 −  plane at  = 0. The angular velocity  is a constant in time and is directed along the  axis.
a) Calculate all elements of the inertia tensor. Be sure to specify the coordinate system used.
b) Using the calculated inertia tensor find the angular momentum of the dumbbell in the laboratory frame as
a function of time.
c) Using the equation $\mathbf{L} = \mathbf{r} \times \mathbf{p}$, calculate the angular momentum and show that it is equal to the answer
of part (b).
d) Calculate the torque on the axle as a function of time.
e) Calculate the kinetic energy of the dumbbell.
[Figure for problem 7: axes x and z with origin O.]

8. A heavy symmetric top has a mass  with the center of mass a distance  from the fixed point about which
it spins and 1 = 2 6= 3 . The top is precessing at a steady angular velocity Ω about the vertical space-fixed
 axis. What is the minimum spin  0 about the body-fixed symmetry axis, that is, the 3 axis assuming that
the 3 axis is inclined at an angle  =  with respect to the vertical  axis. Solve the problem at the instant
when the   3 1 axes all are in the same plane as shown in the figure.
[Figure for problem 8: axes z, x, and 1 with origin O.]

9. Consider an object whose center of mass is at the origin and whose inertia tensor is
⎛ ⎞
12 −12 0
 =  ⎝ −12 12 0 ⎠
0 0 1

(a) Determine the principal moments of inertia and the principal axes. Guess the object.
(b) Determine the rotation matrix  and compute † . Do the diagonal elements match with your results
from (a)? Note: columns of  are eigenvectors of  .

(c) Assume  = (̂
+ ̂). Determine  in the rotating coordinate system. Are  and  in the same
√
2
direction? What does this mean?

(d) Repeat (c) for  = √
2
(̂ − ̂). What is diﬀerent and why?
(e) For which case will there be a non-zero torque required?

(f) Determine the rotational kinetic energy for the case  = √
2
(̂ − ̂)?

10. Consider a wheel (solid disk) of mass  and radius . The wheel is subject to angular velocities   =   ̂
where ̂ is normal to the surface and   =   ̂ .

(a) Choose a set of principal axes by observation.

(b) Determine the angular velocities and angular momentum along the principal axes. Note: $I_1 = \frac{1}{2} M R^2$ and
$I_2 = I_3 = \frac{1}{4} M R^2$.
(c) Determine the torque.
(d) Determine the rotation matrix that rotates the fixed coordinate system to the body coordinate system.

11. Determine the principal moments of inertia of an ellipsoid given by the equation,

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1$$

12. Determine the principal moments of inertia of a sphere of radius  with a cavity of radius  located  from the
center of the sphere.

13. Three equal masses $m$ form the vertices of an equilateral triangle of side length $a$. The masses are located at
$\left(0, 0, \frac{a}{\sqrt{3}}\right)$, $\left(0, \frac{a}{2}, -\frac{a}{2\sqrt{3}}\right)$, and $\left(0, -\frac{a}{2}, -\frac{a}{2\sqrt{3}}\right)$, such that the center-of-mass is located at the origin.

(a) Determine the principal moments of inertia and principal axes.

Now consider the same system rotated $45^\circ$ about the $\hat{z}$-axis. The masses are located at $\left(0, 0, \frac{a}{\sqrt{3}}\right)$, $\left(-\frac{a}{2\sqrt{2}}, \frac{a}{2\sqrt{2}}, -\frac{a}{2\sqrt{3}}\right)$,
and $\left(\frac{a}{2\sqrt{2}}, -\frac{a}{2\sqrt{2}}, -\frac{a}{2\sqrt{3}}\right)$, respectively.
(b) Determine the principal moments of inertia and principal axes.
(c) Could you have answered (b) without explicitly determining the inertia tensor? How?

Problems
1. Calculate the moments of inertia 1  2  3 for a homogeneous cone of mass  whose height is  and whose
base has a radius  Choose the 3 -axis along the symmetry axis of the cone.
a) Choose the origin at the apex of the cone, and calculate the elements of the inertia tensor.
b) Make a transformation such that the center of mass of the cone is the origin and find the principal moments
of inertia.

2. Four masses, all of mass  lie in the  −  plane at positions ( ) = ( 0) (− 0) (0 +2) (0 −2)
These are joined by massless rods to form a rigid body
(a) Find the inertial tensor, using the    axes as a reference system. Exhibit the tensor as a matrix.
(b) Consider a direction given by the unit vector ̂ that lies equally between the positive    axes; that is
it makes equal angles with these three directions. Find the moment of inertia for rotation about this ̂ axis.
(c) Given that at a certain time  the angular velocity vector lies along the above direction ̂, find, for that
instant, the angle between the angular momentum vector and ̂

3. A homogeneous cube, each edge of which has a length $l$, initially is in a position of unstable equilibrium with
one edge of the cube in contact with a horizontal plane. The cube then is given a small displacement causing
it to tip over and fall. Show that the angular velocity of the cube when one face strikes the plane is given by
$$\omega^2 = A \frac{g}{l} \left( \sqrt{2} - 1 \right)$$
where $A = \frac{3}{2}$ if the edge cannot slide on the plane, and where $A = \frac{12}{5}$ if sliding can occur without friction.

4. A symmetric body moves without the influence of forces or torques. Let 3 be the symmetry axis of the body
and  be along 03 . The angle between  and 3 is . Let  and  initially be in the 2 − 3 plane. What is
the angular velocity of the symmetry axis about  in terms of 1  3  and ?

5. Consider a thin rectangular plate with dimensions  by  and mass  Determine the torque necessary to
rotate the thin plate with angular velocity  about a diagonal. Explain the physical behavior for the case when
 = .
Chapter 14

Coupled linear oscillators

14.1 Introduction
Chapter 3 discussed the behavior of a single linearly-damped linear oscillator subject to a harmonic force.
No account was taken of the influence of the single oscillator on the driver for the case of forced oscillations.
Many systems in nature comprise complicated free or forced oscillations of coupled-oscillator systems.
Examples of coupled oscillators include automobile suspension systems, electronic circuits, electromagnetic fields,
musical instruments, atoms bound in a crystal, neural circuits in the brain, and networks of pacemaker cells in
the heart. Energy can be transferred back and forth between coupled oscillators as the motion evolves.
It is possible to describe the motion of coupled linear oscillators in terms of a sum over independent normal
coordinates, i.e. normal modes, even though the motion may be very complicated. These normal modes
are constructed from the original coordinates in such a way that the normal modes are uncoupled. Finding
the normal modes of coupled oscillator systems is a ubiquitous problem encountered in all
branches of science and engineering. As discussed in chapter 3, oscillatory motion of non-linear systems
can be complicated. Fortunately most oscillatory systems are approximately linear when the amplitude of
oscillation is small. This discussion assumes that the oscillation amplitudes are sufficiently small to ensure
linearity.

14.2 Two coupled linear oscillators

Consider the two-coupled linear oscillator, shown in figure 14.1, which comprises two identical masses each
connected to fixed locations by identical springs having a force constant $\kappa$. A spring with force constant
$\kappa'$ couples the two oscillators. The equilibrium lengths of the outer two springs are $l$ while that of the
coupling spring is $l'$. The problem is simplified by restricting the motion to be along the line connecting
the masses and assuming fixed endpoints. The small displacements of $m_1$ and $m_2$ are taken to be $x_1$ and $x_2$
with respect to the equilibrium positions $l$ and $l + l'$ respectively. The restoring force on $m_1$ is
$-\kappa x_1 - \kappa' (x_1 - x_2)$ while the restoring force on $m_2$ is $-\kappa x_2 - \kappa' (x_2 - x_1)$. This coupled
double-oscillator system exhibits basic features of coupled linear oscillator systems.

[Figure 14.1: Two coupled linear oscillators. The equilibrium spring-lengths are $l$ for the outer springs and $l'$ for the
coupling spring. The displacements from the stable locations are given by $x_1$ and $x_2$. The separation between the two
masses is $r$ and the location of the center-of-mass is $R_{cm}$.]

Assuming $m_1 = m_2 = m$ then the equations of motion are
$$m \ddot{x}_1 + (\kappa + \kappa') x_1 - \kappa' x_2 = 0 \tag{14.1}$$
$$m \ddot{x}_2 + (\kappa + \kappa') x_2 - \kappa' x_1 = 0$$

Assume that the motion for these coupled equations is oscillatory with a solution of the form
$$x_1 = B_1 e^{i\omega t} \tag{14.2}$$
$$x_2 = B_2 e^{i\omega t}$$
where the constants $B$ may be complex to take into account both the magnitude and phase. Substituting
these possible solutions into the equations of motion gives
$$-m\omega^2 B_1 e^{i\omega t} + (\kappa + \kappa') B_1 e^{i\omega t} - \kappa' B_2 e^{i\omega t} = 0 \tag{14.3}$$
$$-m\omega^2 B_2 e^{i\omega t} + (\kappa + \kappa') B_2 e^{i\omega t} - \kappa' B_1 e^{i\omega t} = 0$$
Collecting terms, and cancelling the common exponential factor, gives
$$\left( \kappa + \kappa' - m\omega^2 \right) B_1 - \kappa' B_2 = 0 \tag{14.4}$$
$$\left( \kappa + \kappa' - m\omega^2 \right) B_2 - \kappa' B_1 = 0$$
The existence of a non-trivial solution of these two simultaneous equations requires that the determinant of
the coefficients of $B_1$ and $B_2$ must vanish, that is
$$\begin{vmatrix} \kappa + \kappa' - m\omega^2 & -\kappa' \\ -\kappa' & \kappa + \kappa' - m\omega^2 \end{vmatrix} = 0 \tag{14.5}$$
The expansion of this secular determinant yields
$$\left( \kappa + \kappa' - m\omega^2 \right)^2 - \kappa'^2 = 0 \tag{14.6}$$
Solving for $\omega$ gives
$$\omega = \sqrt{\frac{\kappa + \kappa' \pm \kappa'}{m}} \tag{14.7}$$
That is, there are two characteristic frequencies (or eigenfrequencies) for the system
$$\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}} \tag{14.8}$$
$$\omega_2 = \sqrt{\frac{\kappa}{m}} \tag{14.9}$$

[Figure 14.2: Displacement of each of two coupled linear harmonic oscillators with $\kappa = 4$ and $\kappa' = 1$ in relative units.]

Since superposition applies for these linear equations, the general solution can be written as a sum of the
terms that account for the two possible values of $\omega$.
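The secular equation can also be solved numerically, which gives a quick check on the two eigenfrequencies. The sketch below treats the expanded determinant as a quadratic in $\lambda = \omega^2$; the function name and the values $\kappa = 4$, $\kappa' = 1$, $m = 1$ are choices made here for illustration.

```python
import math

def eigenfrequencies(kappa, kappa_c, m):
    """Solve (kappa + kappa' - m w^2)^2 - kappa'^2 = 0 for the two values of w.

    Expanding in lam = w^2 gives
        m^2 lam^2 - 2 m (kappa + kappa') lam + (kappa + kappa')^2 - kappa'^2 = 0.
    """
    a = m * m
    b = -2.0 * m * (kappa + kappa_c)
    c = (kappa + kappa_c) ** 2 - kappa_c ** 2
    disc = math.sqrt(b * b - 4.0 * a * c)
    lam_high = (-b + disc) / (2.0 * a)
    lam_low = (-b - disc) / (2.0 * a)
    return math.sqrt(lam_high), math.sqrt(lam_low)

w1, w2 = eigenfrequencies(4.0, 1.0, 1.0)
print(w1, w2)
```

The higher root corresponds to $\sqrt{(\kappa + 2\kappa')/m}$ and the lower root to $\sqrt{\kappa/m}$.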
Figure 142 shows the solutions for a case where  = 4 and 0 = 1 in arbitrary units, with the q initial
6
condition that 2 =  and 1 = ̇1 = ̇2 = 0. The two characteristic frequencies are  1 =  and
q
4
2 =  . The characteristic beats phenomenon is exhibited where the envelope over one complete cycle of
the low frequency encompasses several higher frequency oscillations. That is, the solution is
∙µ ¶ ¸ ∙µ ¶ ¸
 £ 1  ¤ 1 + 2 1 − 2
2 () =  + −1  + 2  + −2  =  cos  cos  (14.10)
4 2 2
while
∙µ ¶ ¸ ∙µ ¶ ¸
 £ 1  ¤ 1 + 2 1 − 2
1 () =  + −1  − 2  − −2  =  sin  sin  (14.11)
4 2 2
The energy in the two-coupled oscillators flows back and forth between the coupled oscillators as illus-
trated in figure 142.
A better understanding of the energy flow occurring between the two coupled oscillators is given by
using a $(x_1, x_2)$ configuration-space plot, shown in figure 14.3. The flow of energy occurring between the two
coupled oscillators can be represented by choosing normal-mode coordinates $\eta_1$ and $\eta_2$ that are rotated by
$45^\circ$ with respect to the spatial coordinates $(x_1, x_2)$. These normal-mode coordinates $(\eta_1, \eta_2)$ correspond to
the two normal modes of the coupled double-oscillator system.

14.3 Normal modes

The normal modes of the two-coupled oscillator system are obtained by a transformation to a pair of normal
coordinates $(\eta_1, \eta_2)$ that are independent and correspond to the two normal modes. The pair of normal
coordinates for this case are
$$\eta_1 \equiv x_1 - x_2 \tag{14.12}$$
$$\eta_2 \equiv x_1 + x_2$$
that is
$$x_1 = \frac{1}{2} (\eta_2 + \eta_1) \tag{14.13}$$
$$x_2 = \frac{1}{2} (\eta_2 - \eta_1)$$
Substituting these into the equations of motion (14.1) gives
$$m \left( \ddot{\eta}_1 + \ddot{\eta}_2 \right) + (\kappa + 2\kappa')\, \eta_1 + \kappa\, \eta_2 = 0 \tag{14.14}$$
$$m \left( \ddot{\eta}_1 - \ddot{\eta}_2 \right) + (\kappa + 2\kappa')\, \eta_1 - \kappa\, \eta_2 = 0$$
Adding and subtracting these two equations gives
$$m \ddot{\eta}_1 + (\kappa + 2\kappa')\, \eta_1 = 0 \tag{14.15}$$
$$m \ddot{\eta}_2 + \kappa\, \eta_2 = 0$$

[Figure 14.3: Motion of two coupled harmonic oscillators in the $(x_1, x_2)$ spatial configuration space and in terms of the
normal modes $(\eta_1, \eta_2)$. Initial conditions are $x_2 = D$, $x_1 = \dot{x}_1 = \dot{x}_2 = 0$.]

Note that the two coordinates $\eta_1$ and $\eta_2$ are uncoupled and therefore are independent. The solutions of
these equations are
$$\eta_1(t) = A_1^{+} e^{i\omega_1 t} + A_1^{-} e^{-i\omega_1 t} \tag{14.16}$$
$$\eta_2(t) = A_2^{+} e^{i\omega_2 t} + A_2^{-} e^{-i\omega_2 t}$$
where $\eta_1$ corresponds to angular frequency $\omega_1$, and $\eta_2$ corresponds to $\omega_2$. The two coordinates $\eta_1$ and $\eta_2$ are
called the normal coordinates and the two solutions are the normal modes with corresponding
angular frequencies, $\omega_1$ and $\omega_2$.
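The decoupling produced by the normal coordinates can be seen as a $45^\circ$ rotation that diagonalizes the force-constant matrix of the two equations of motion. The sketch below uses normalized coordinates $(x_1 - x_2)/\sqrt{2}$ and $(x_1 + x_2)/\sqrt{2}$, which differ from the unnormalized $\eta_1$, $\eta_2$ of the text only by a constant factor; the values $\kappa = 4$ and $\kappa' = 1$ are assumed for illustration.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

kappa, kappa_c = 4.0, 1.0

# Force-constant matrix acting on (x1, x2) in the equations of motion.
K = [[kappa + kappa_c, -kappa_c],
     [-kappa_c, kappa + kappa_c]]

# Rows are the normalized normal coordinates in terms of (x1, x2).
s = 1.0 / math.sqrt(2.0)
R = [[s, -s],   # proportional to eta_1 = x1 - x2
     [s, s]]    # proportional to eta_2 = x1 + x2

K_normal = matmul(matmul(R, K), transpose(R))
print(K_normal)
```

The off-diagonal elements vanish and the diagonal elements are $\kappa + 2\kappa'$ and $\kappa$, that is, $m\omega_1^2$ and $m\omega_2^2$.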
The $(\eta_1, \eta_2)$ axes of the two normal modes correspond to a rotation of $45^\circ$ in configuration space,
figure 14.3. The initial conditions chosen correspond to $\eta_1 = -\eta_2$ and thus both modes are excited with equal
intensity. Note that there are 5 lobes along the $\eta_2$ axis versus 4 lobes along the $\eta_1$ axis reflecting the ratio
of the eigenfrequencies $\omega_1$ and $\omega_2$. Also note that the diamond shape of the motion in the $(x_1, x_2)$
configuration space illustrates that the extrema amplitudes for $x_2$ are a maximum when $x_1$ is zero, and vice versa.
This is equivalent to the statement that the energies in the two modes are coupled with the energy for
the first oscillator being a maximum when the energy is a minimum for the second oscillator, and vice versa.
By contrast, in the $(\eta_1, \eta_2)$ configuration space, the motion is bounded by a rectangle parallel to the
$(\eta_1, \eta_2)$ axes reflecting the fact that the extrema amplitudes, and corresponding energies, for the $\eta_1$ normal
mode are constant and independent of the motion for the $\eta_2$ normal mode, and vice versa. The decoupling of
the two normal modes is best illustrated by considering the case when only one of these two normal modes is
excited. For the initial conditions $x_1(0) = -x_2(0)$ and $\dot{x}_1(0) = -\dot{x}_2(0)$, then $\eta_2(t) = 0$. That is,
only the $\eta_1(t)$ normal mode is excited with frequency $\omega_1$, which corresponds to motion confined to the $\eta_1$ axis
of figure 14.3.

[Figure 14.4: Normal modes for two coupled oscillators. Upper panel: antisymmetric mode $\eta_1$ (out of phase). Lower
panel: symmetric mode $\eta_2$ (in phase).]

As shown in figure 144,  1 () is the antisymmetric mode in which the two masses oscillate out of phase
such as to keep the center of mass of the two masses stationary. For the initial conditions 1 (0) = 2 (0) 
 
and 1 (0) = 2 (0)  then  1 () = 0 that is, only the 2 () normal mode is excited. The 2 () normal mode
is the symmetric mode where the two masses oscillate in phase with frequency  2 ; it corresponds to motion
along the 2 axis For the symmetric phase, both masses move together leading to a constant extension of
the coupling spring. As a result the frequency  2 of the symmetric mode 2 () is lower than the frequency
 1 of the asymmetric mode 1 ()  That is, the asymmetric mode is stiﬀer since all three springs provide
active restoring forces, compared to the symmetric mode where the coupling spring is uncompressed. In
general, for attractive forces the lowest frequency always occurs for the mode with the highest symmetry.

14.4 Center of mass oscillations

Transforming the coordinates into the center of mass of the two oscillating masses elucidates an interesting
feature of the normal modes for the two-coupled linear oscillator. As illustrated in figure 14.1, the
center-of-mass coordinate for the two mass system is

$$2 R_{cm} = l + x_1 + l + l' + x_2 = 2l + l' + \eta_2$$

while the relative separation distance is

$$r = (l + l' + x_2) - (l + x_1) = l' - \eta_1$$

That is, the two normal modes are

$$\eta_1 = l' - r \tag{14.17}$$
$$\eta_2 = 2 R_{cm} - 2l - l'$$
The $\eta_1$ mode, which has angular frequency $\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}}$, corresponds to an oscillation of the relative
separation $r$, while the center-of-mass location $R_{cm}$ is stationary. By contrast, the $\eta_2$ mode, with angular
frequency $\omega_2 = \sqrt{\frac{\kappa}{m}}$, corresponds to an oscillation of the center of mass $R_{cm}$ with the relative separation $r$
being a constant.
Figure 145 illustrates the decoupled center-of-mass
 , and relative motions  for both normal modes of
the coupled double-oscillator system. The diﬀerence in 2.0

angular frequencies and amplitudes is readily apparent.

It is of interest to consider the special case where the 1.5
Rcm
spring constant  = 0 for the two outside
q springs. Then
0
the angular frequencies are  1 = 2  and  2 = 0 for
1.0

the two normal modes. When  = 0 the 2 mode is a r

0.5
spurious center-of-mass mode since it corresponds to an
oscillation with  2 = 0 in spite of the fact that there
are no forces acting on the center of mass. That is, the 0.0
0 1 2 3 4 5 6 7 8 9 10

center-of-mass momentum must be a constant of motion. t

This spurious center-of-mass oscillation is a consequence
of measuring the displacements (1  2 ) with respect to
an arbitrary external reference that is not related to the
center of mass of the coupled system. Spurious center- Figure 14.5: Time dependence of the center-of-
of-mass modes are encountered frequently in many-body mass  and relative separation  for two cou-
coupled oscillator systems such as molecules and nuclei. pled linear oscillators assuming spring constants
In such cases it is necessary to project out the center-of- of  = 4 and 0 =  .
mass motion to eliminate such spurious solutions as will
be discussed later.

14.5 Weak coupling

If one of the two coupled linear oscillator masses is held fixed, then the other free mass will oscillate with a
frequency
$$\omega_0 = \sqrt{\frac{\kappa + \kappa'}{m}} \tag{14.18}$$
The effect of coupling of the two oscillators is to split the degeneracy of the frequency for each mass to
$$\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}} > \omega_0 = \sqrt{\frac{\kappa + \kappa'}{m}} > \omega_2 = \sqrt{\frac{\kappa}{m}} \tag{14.19}$$
Thus the degeneracy is broken, and the two normal modes have frequencies straddling the single-oscillator
frequency.
It is interesting to consider the case where the coupling is weak because this situation occurs frequently
in nature. The coupling is weak if the coupling constant $\kappa' \ll \kappa$. Then
$$\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}} = \sqrt{\frac{\kappa}{m}} \sqrt{1 + 4\varepsilon} \tag{14.20}$$
where
$$\varepsilon \equiv \frac{\kappa'}{2\kappa} \ll 1 \tag{14.21}$$
Thus
$$\omega_1 \approx \sqrt{\frac{\kappa}{m}} (1 + 2\varepsilon) \tag{14.22}$$
The natural frequency of a single oscillator was shown to be
$$\omega_0 = \sqrt{\frac{\kappa + \kappa'}{m}} \approx \sqrt{\frac{\kappa}{m}} (1 + \varepsilon) \tag{14.23}$$
that is
$$\sqrt{\frac{\kappa}{m}} \approx \omega_0 (1 - \varepsilon) \tag{14.24}$$
Thus the frequencies for the normal modes for weak coupling can be written as
$$\omega_1 \approx \sqrt{\frac{\kappa}{m}} (1 + 2\varepsilon) \approx \omega_0 (1 - \varepsilon)(1 + 2\varepsilon) \approx \omega_0 (1 + \varepsilon) \tag{14.25}$$
while
$$\omega_2 = \sqrt{\frac{\kappa}{m}} \approx \omega_0 (1 - \varepsilon) \tag{14.26}$$
That is, the two solutions are split equally about the single uncoupled oscillator value given by
$\omega_0 = \sqrt{\frac{\kappa + \kappa'}{m}} \approx \sqrt{\frac{\kappa}{m}} (1 + \varepsilon)$. Note that the single uncoupled oscillator frequency $\omega_0$ depends on
the coupling strength $\kappa'$.

[Figure 14.6: Normal-mode frequencies for n=2 and n=3 weakly-coupled oscillators.]

This splitting of the characteristic frequencies is a feature exhibited by many systems of $n$ identical
oscillators where half of the frequencies are shifted upwards and half downward. If $n$ is odd, then the central
frequency is unshifted as illustrated for the case of $n = 3$. An example of this behavior is the Zeeman effect
where the magnetic field couples the atomic motion resulting in a hyperfine splitting of the energy
levels as illustrated.
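The quality of the weak-coupling approximation can be checked numerically. The sketch below compares the exact normal-mode frequencies with $\omega_0 (1 \pm \varepsilon)$, where $\varepsilon = \kappa'/2\kappa$; the function name and the sample values $\kappa = 1$, $\kappa' = 0.02$, $m = 1$ are illustrative assumptions.

```python
import math

def weak_coupling(kappa, kappa_c, m):
    """Return (exact, approximate) normal-mode frequency pairs and epsilon."""
    eps = kappa_c / (2.0 * kappa)
    w0 = math.sqrt((kappa + kappa_c) / m)          # one mass held fixed
    exact = (math.sqrt((kappa + 2.0 * kappa_c) / m),
             math.sqrt(kappa / m))
    approx = (w0 * (1.0 + eps), w0 * (1.0 - eps))  # first order in epsilon
    return exact, approx, eps

exact, approx, eps = weak_coupling(1.0, 0.02, 1.0)
errors = [abs(e - a) for e, a in zip(exact, approx)]
print(exact, approx, errors)
```

The discrepancies are of order $\varepsilon^2$, as expected for an expansion truncated at first order.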

There are myriad examples involving weakly-coupled oscillators in many aspects of the natural world.
The example of collective modes in nuclear physics, illustrated in example 14.13, is typical of applications to
physics, while there are many examples applied to musical instruments, acoustics, and engineering.
Weakly-coupled oscillators are a dominant theme throughout biology as illustrated by congregations of synchronously
flashing fireflies, crickets that chirp in unison, an audience clapping at the end of a performance, networks
of pacemaker cells in the heart, insulin-secreting cells in the pancreas, and neural networks in the brain and
spinal cord that control rhythmic behaviors such as breathing, walking, and eating. Synchronous motion of
a large number of weakly-coupled oscillators often leads to large collective motion of weakly-coupled systems
as discussed in chapter 14.12.

14.1 Example: The Grand Piano

[Figure: Schematic diagram of the action for a grand piano, including the strings, bridge and sounding board. Labelled
parts: hitchpin, damper, string, bridge, hammer, pin block, jack, soundboard, ribs, key. Note that there are either two
or three parallel strings per note that are hit by a single hammer.]
The grand piano provides an excellent example of a weakly-coupled harmonic oscillator system that has
normal modes. There are either two or three parallel strings per note that are stretched tightly parallel to the
top of the horizontal sounding board. The strings press downwards on the bridge that is attached to the top of
the sounding board. The strings for each note are excited when struck vertically upwards by a single hammer.
In the bass section of the piano each note comprises two strings tuned to nearly the same frequency. The
coupling of the motion of the strings is via the bridge plus sounding board. Normally, the hammer strikes both
strings simultaneously exciting the vertical symmetric mode, not the vertical antisymmetric mode. The bridge
is connected to the sounding board which moves the largest amount for the symmetric mode where both strings
move the bridge in phase. This strong coupling produces a loud sound. The antisymmetric mode does not
move the sounding board much since the strings at the bridge move out of phase. Consequently, the symmetric
mode, that is strongly coupled to the sounding board, damps out more rapidly than the antisymmetric mode
which is weakly coupled to the sound board and thus has a longer time constant for decay since the radiated
sound energy is lower than the symmetric mode.
The una-corda pedal (soft pedal) for a grand piano moves the action sideways such that the hammer strikes
only one of the two strings, or two of the three strings, resulting in both the symmetric and antisymmetric
modes being excited equally. The una-corda pedal produces a characteristically different tone than when the
hammer simultaneously hits the coupled strings; that is, it produces a smaller transient component. The
symmetric mode rapidly damps due to energy propagation by the sounding board. Thus the longer lasting
antisymmetric mode becomes more prominent when both modes are equally excited using the una-corda pedal.
The symmetric and antisymmetric modes have slightly different frequencies and produce beats which also
contributes to the different timbre produced using the una-corda pedal. For the mid and upper frequency
range, the piano has three strings per note which have one symmetric mode and two separate antisymmetric
modes. To further complicate matters, the strings also can oscillate horizontally which couples weakly to the
bridge plus sounding board. The strengths with which these different modes are excited depend on subtle differences
in the shape and roughness of the hammer head striking the strings. Primarily the hammer excites the two
vertical modes rather than the horizontal modes.

14.6 General analytic theory for coupled linear oscillators

The above discussion of a coupled double-oscillator system has shown that it is possible to select symmetric
and antisymmetric normal modes that are independent and each have characteristic frequencies. The normal
coordinates for these two normal modes correspond to linear superpositions of the spatial amplitudes of the
two oscillators and can be obtained by a rotation into the appropriate normal coordinate system. Extension
of this to systems comprising $n$ coupled linear oscillators requires development of a general analytic theory
that is capable of finding the normal modes plus their eigenvalues and eigenvectors. As illustrated for the
double oscillator, the solution of many coupled linear oscillators is a classic eigenvalue problem where one has
to rotate to the principal axis system to project out the normal modes. The following discussion presents a
general approach to the problem of finding the normal coordinates for a system of $n$ coupled linear oscillators.
Consider a conservative system of $n$ coupled oscillators, described in terms of generalized coordinates
$q_k$ and $\dot{q}_k$ with subscript $k = 1, 2, 3, \ldots, n$ for a system with $n$ degrees of freedom. The coupled oscillators are
assumed to have a stable equilibrium with generalized coordinates $q_{k0}$ at equilibrium. In addition, it is
assumed that the oscillation amplitudes are sufficiently small to ensure that the system is linear.
For the equilibrium position $q_k = q_{k0}$ the Lagrange equations must satisfy
$$\dot{q}_k = 0 \tag{14.27}$$
$$\ddot{q}_k = 0$$
Every non-zero term of the form $\frac{d}{dt} \frac{\partial L}{\partial \dot{q}_k}$ in Lagrange’s equations must contain at least either $\dot{q}_k$ or $\ddot{q}_k$ which
are zero at equilibrium; thus all such terms vanish at equilibrium. That is, at equilibrium
$$\left( \frac{\partial L}{\partial q_k} \right)_0 = \left( \frac{\partial T}{\partial q_k} \right)_0 - \left( \frac{\partial U}{\partial q_k} \right)_0 = 0 \tag{14.28}$$
where the subscript 0 designates at equilibrium.

14.6.1 Kinetic energy tensor T

In chapter 76 it was shown that, in terms of fixed rectangular coordinates, the kinetic energy for  bodies,
with  generalized coordinates, is expressed as
 3
1 XX
 =  ̇2 (14.29)
2 =1 =1

Expressing these in terms of generalized coordinates  =  (  ) where  = 1 2  then the generalized
velocities are given by
X
 
̇ = ̇ + (14.30)
=1
 
As discussed in chapter 76 if the system is scleronomic then the partial time derivative

=0 (14.31)

Thus the kinetic energy, equation 1429, of a scleronomic system can be written as a homogeneous quadratic
function of the generalized velocities

1X
 =  ̇ ̇ (14.32)
2


where the components of the kinetic energy tensor T are


X 3
X  
 ≡  (14.33)
 
 

Note that if the velocities ̇ correspond to translational velocity, then the kinetic energy tensor T corresponds
to an eﬀective mass tensor, whereas if the velocities correspond to angular rotational velocities, then the
kinetic energy tensor T corresponds to the inertia tensor.

It is possible to make an expansion of the $T_{jk}$ about the equilibrium values of the form
$$T_{jk}(q_1, q_2, \ldots, q_n) = T_{jk}(q_{i0}) + \sum_{l}\left(\frac{\partial T_{jk}}{\partial q_l}\right)_0 q_l + \cdots \tag{14.34}$$
Only the first (constant) term will be kept, since the second and higher terms are of the same order as the
higher-order terms ignored in the Taylor expansion of the potential. Thus, at the equilibrium point, assume that
$\left(\frac{\partial T_{jk}}{\partial q_l}\right)_0 = 0$ where $l = 1, 2, 3, \ldots, n$.

14.6.2 Potential energy tensor V

Equations 1428 plus 1434 imply that µ ¶

=0 (14.35)
 0
where  = 1 2 3 
Make a Taylor expansion about equilibrium for the potential energy, assuming for simplicity that the
coordinates have been translated to ensure that  = 0 at equilibrium. This gives
X µ  ¶ 1X
µ 2
 
¶
 (1  2   ) = 0 +  +   +  (14.36)
 0 2   0
 
³ ´

The linear term is zero since  
= 0 at the equilibrium point, and without loss of generality, the
0
potential can be measured with respect to 0 . Assume that the amplitudes are small, then the expansion
can be restricted to the quadratic term, corresponding to the simple linear oscillator potential
µ 2 ¶
1X   1X
 (1  2   ) − 0 =  0 (1  2   ) =   =    (14.37)
2   0 2
 

That is
1X
 0 (1  2   ) =    (14.38)
2


where the components of the potential energy tensor V are defined as

µ 2 0 ¶
 
 ≡ (14.39)
  0
Note that the order of diﬀerentiation is unimportant and thus the quantity  is symmetric
 =  (14.40)
The motion of the system has been specified for small oscillations around the equilibrium position and
it has been shown that  0 (1  2   ) has a minimum value at equilibrium which is taken to be zero for
convenience.
In conclusion, equations (1432) and (1438) give

1X
 =  ̇ ̇ (14.41)
2

X
1
0 =    (14.42)
2


where the components of the kinetic energy tensor T and potential energy tensor V are
Ã 3
!
X X  
 ≡  (14.43)
 
 
0
µ 2 0 ¶
 
 ≡ (14.44)
  0

Note that $T_{jk}$ and $V_{jk}$ may have different units, but all the terms in the summations for both $T$ and $U'$ have
units of energy. The $T_{jk}$ and $V_{jk}$ values are evaluated at the equilibrium point, and thus both $T_{jk}$ and $V_{jk}$
are $n \times n$ arrays of values evaluated at the equilibrium location.

14.6.3 Equations of motion

Both the kinetic energy and potential energy terms are products of the coordinates, leading to a set of
coupled equations that are complicated to solve. The problem is greatly simplified by selecting a set of
normal coordinates for which both $\mathbf{T}$ and $\mathbf{V}$ are diagonal; then the coupling terms disappear. Thus a
coordinate transformation must be found that simultaneously diagonalizes $\mathbf{T}$ and $\mathbf{V}$ in order to obtain a
set of normal coordinates.

The kinetic energy $T$ is only a function of generalized velocities $\dot{q}_k$ while the conservative potential energy
is only a function of the generalized coordinates $q_k$. Thus the Lagrange equations
$$\frac{\partial L}{\partial q_k} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_k} = 0 \tag{14.45}$$
reduce to
$$\frac{\partial U}{\partial q_k} + \frac{d}{dt}\frac{\partial T}{\partial \dot{q}_k} = 0 \tag{14.46}$$
But
$$\frac{\partial U}{\partial q_k} = \sum_{j} V_{jk}\, q_j \tag{14.47}$$
and
$$\frac{\partial T}{\partial \dot{q}_k} = \sum_{j} T_{jk}\,\dot{q}_j \tag{14.48}$$
Thus the Lagrange equations reduce to the following set of equations of motion,
$$\sum_{j}\left(V_{jk}\, q_j + T_{jk}\,\ddot{q}_j\right) = 0 \tag{14.49}$$
There is one such equation for each $k$ where $1 \le k \le n$, giving a set of $n$ second-order linear homogeneous differential equations
with constant coefficients. Since the system is oscillatory, it is natural to try a solution of the form
$$q_j(t) = a_j e^{i(\omega t - \delta)} \tag{14.50}$$
Assuming that the system is conservative, then this implies that $\omega$ is real, since an imaginary term for $\omega$
would lead to an exponential damping term. The arbitrary constants are the real amplitude $a_j$ and the
phase $\delta$. Substitution of this trial solution for each $k$ leads to a set of equations
$$\sum_{j}\left(V_{jk} - \omega^2 T_{jk}\right) a_j = 0 \tag{14.51}$$
where the common factor $e^{i(\omega t - \delta)}$ has been removed. Equation 14.51 corresponds to a set of $n$ linear
homogeneous algebraic equations that the $a_j$ amplitudes must satisfy for each $k$. For a non-trivial solution
to exist, the determinant of the coefficients must vanish, that is
$$\begin{vmatrix}
V_{11} - \omega^2 T_{11} & V_{12} - \omega^2 T_{12} & V_{13} - \omega^2 T_{13} & \cdots \\
V_{12} - \omega^2 T_{12} & V_{22} - \omega^2 T_{22} & V_{23} - \omega^2 T_{23} & \cdots \\
V_{13} - \omega^2 T_{13} & V_{23} - \omega^2 T_{23} & V_{33} - \omega^2 T_{33} & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{vmatrix} = 0 \tag{14.52}$$
where the symmetry $V_{jk} = V_{kj}$ has been included. This is the standard eigenvalue problem for which
the above determinant gives the secular equation or the characteristic equation. It is an equation
of degree $n$ in $\omega^2$. The $n$ roots of this equation are $\omega_r^2$ where $\omega_r$ are the characteristic frequencies or
eigenfrequencies of the normal modes.
Substitution of $\omega_r^2$ into equation 14.51 determines the ratio $a_{1r} : a_{2r} : a_{3r} : \cdots : a_{nr}$ for this solution,
which defines the components of the $n$-dimensional eigenvector $\mathbf{a}_r$. That is, solution of the secular equation
has determined the eigenvalues and eigenvectors of the $n$ solutions of the coupled-channel system.
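The secular problem of equations 14.51 and 14.52 can be sketched numerically. The following is a minimal illustration, not part of the original text: the helper name `normal_modes`, the use of NumPy, and the numerical values of the mass and spring constants are all assumptions made for the example. It reduces the generalized problem to an ordinary symmetric eigenvalue problem through a Cholesky factorization of the kinetic energy tensor, which requires that tensor to be positive definite.

```python
import numpy as np

def normal_modes(V, T):
    """Solve (V - w^2 T) a = 0 for symmetric V and symmetric positive-definite T.

    Returns (omega, a): omega[r] is the r-th eigenfrequency in ascending order,
    and column a[:, r] is the matching eigenvector scaled so a.T @ T @ a = I.
    """
    V = np.asarray(V, dtype=float)
    T = np.asarray(T, dtype=float)
    L = np.linalg.cholesky(T)          # T = L @ L.T
    Linv = np.linalg.inv(L)
    C = Linv @ V @ Linv.T              # ordinary symmetric problem C y = w^2 y
    w2, y = np.linalg.eigh(C)          # eigenvalues returned in ascending order
    a = Linv.T @ y                     # back-transform to the original coordinates
    return np.sqrt(np.clip(w2, 0.0, None)), a

# Hypothetical values: two equal masses m, wall springs kappa, coupling spring kappa_c
m, kappa, kappa_c = 1.0, 4.0, 2.5
V = [[kappa + kappa_c, -kappa_c], [-kappa_c, kappa + kappa_c]]
T = [[m, 0.0], [0.0, m]]
omega, a = normal_modes(V, T)
print(omega)   # expected: sqrt(kappa/m) and sqrt((kappa + 2*kappa_c)/m)
```

With these illustrative numbers the two frequencies should come out as $\sqrt{\kappa/m}$ and $\sqrt{(\kappa + 2\kappa')/m}$, the values derived analytically for the coupled double oscillator.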

14.6.4 Superposition
The equations of motion $\sum_j \left(V_{jk}\, q_j + T_{jk}\,\ddot{q}_j\right) = 0$ are linear equations that satisfy superposition. Thus the
most general solution $q_j(t)$ can be a superposition of the $n$ eigenvectors $\mathbf{a}_r$, that is
$$q_j(t) = \sum_{r}^{n} a_{jr}\, e^{i(\omega_r t - \delta_r)} \tag{14.53}$$
Only the real part of $q_j(t)$ is meaningful, that is,
$$q_j(t) = \mathrm{Re}\sum_{r}^{n} a_{jr}\, e^{i(\omega_r t - \delta_r)} = \sum_{r}^{n} a_{jr}\cos(\omega_r t - \delta_r) \tag{14.54}$$
Thus the most general solution of these linear equations involves a sum over the eigenvectors of the
system, each multiplied by a cosine function oscillating at the corresponding eigenfrequency.

14.6.5 Eigenfunction orthonormality

It can be shown that the eigenvectors are orthogonal. In addition, the above procedure only determines ratios
of amplitudes, thus there is an indeterminacy that can be used to normalize the $a_{jr}$. Thus the eigenvectors
form an orthonormal set. Orthonormality of the eigenfunctions for the rank 3 inertia tensor was illustrated
in chapter 13.10.2. Similar arguments apply that allow extending orthonormality to higher rank cases such
as that for $n$-body coupled oscillators.
The eigenfunction orthogonality for $n$ coupled oscillators can be proved by writing equation 14.51
for both the $s^{th}$ root and the $r^{th}$ root. That is,
$$\sum_{j} V_{jk}\, a_{js} = \omega_s^2 \sum_{j} T_{jk}\, a_{js} \tag{14.55}$$
$$\sum_{k} V_{jk}\, a_{kr} = \omega_r^2 \sum_{k} T_{jk}\, a_{kr} \tag{14.56}$$
Multiply equation 14.55 by $a_{kr}$ and sum over $k$. Similarly multiply equation 14.56 by $a_{js}$ and sum over $j$.
These summations lead to
$$\sum_{j,k} V_{jk}\, a_{js} a_{kr} = \omega_s^2 \sum_{j,k} T_{jk}\, a_{js} a_{kr} \tag{14.57}$$
$$\sum_{j,k} V_{jk}\, a_{js} a_{kr} = \omega_r^2 \sum_{j,k} T_{jk}\, a_{js} a_{kr} \tag{14.58}$$
Note that the left-hand sides of these two equations are identical. Thus taking the difference between these
equations gives
$$\left(\omega_r^2 - \omega_s^2\right)\sum_{j,k} T_{jk}\, a_{js} a_{kr} = 0 \tag{14.59}$$
Note that if $\left(\omega_r^2 - \omega_s^2\right) \neq 0$, that is, assuming that the eigenfrequencies are not degenerate, then to ensure
that equation 14.59 is zero requires that
$$\sum_{j,k} T_{jk}\, a_{js} a_{kr} = 0 \qquad r \neq s \tag{14.60}$$
This shows that the eigenfunctions are orthogonal. If the eigenfrequencies are degenerate, i.e. $\omega_r^2 = \omega_s^2$,
then, with no loss of generality, the axes $\mathbf{a}_r$ and $\mathbf{a}_s$ can be chosen to be orthogonal.
The eigenfunction normalization can be chosen freely since only ratios of the eigenfunction components
$a_{jr}$ are determined when $\omega_r$ is used in equation 14.51. The kinetic energy, given by equation 14.32,
must be positive, or zero for the case of a static system. That is
$$T = \frac{1}{2}\sum_{j,k} T_{jk}\,\dot{q}_j\dot{q}_k \geq 0 \tag{14.61}$$

Use the time derivative of equation 1454 to determine ̇ and insert into equation 1461 gives that the kinetic
energy is
 
1X 1X X
 =  ̇ ̇ =       cos (   −   )  cos (   −   ) (14.62)
2 2 
 

For the diagonal term  = 


"  #
1X 1X 2 2
X
 =  ̇ ̇ =   cos (   −   )    ≥ 0 (14.63)
2 2 
 

Since the term in the square brackets must be positive, then

X
   ≥ 0 (14.64)


Since this sum must be a positive number, and the magnitude of the amplitudes can be chosen freely, then
it is possible to normalize the eigenfunction amplitudes to unity. That is, choose that
X
   = 1 (14.65)


The orthogonality equation, 1460 and the normalization equation 1465 can be combined into a single
orthonormalization equation
X
   =   (14.66)


This has shown that the eigenvectors form an orthonormal set.

Since the   component of the  eigenvector is  , then the  eigenvector can be written in the form
X
a =  eb (14.67)


where eb are the unit vectors for the generalized coordinates.

14.6.6 Normal coordinates

The above general solution of the coupled-oscillator problem is best expressed in terms of the normal
coordinates, which are independent. It is more transparent if the superposition of the normal modes is written
in the form
$$q_j(t) = \sum_{r}^{n} \beta_r\, a_{jr}\, e^{i\omega_r t} \tag{14.68}$$
where the complex factor $\beta_r$ includes the arbitrary scale factor to allow for arbitrary amplitudes $\eta_r$, as well
as the fact that the amplitudes $a_{jr}$ have been normalized and the phase factor $\delta_r$ has been chosen.
Define
$$\eta_r(t) \equiv \beta_r e^{i\omega_r t} \tag{14.69}$$
then equation 14.68 can be written as
$$q_j(t) = \sum_{r}^{n} a_{jr}\,\eta_r(t) \tag{14.70}$$
Equation 14.70 can be expressed schematically as the matrix multiplication
$$\mathbf{q} = \{\mathbf{a}\}\cdot\boldsymbol{\eta} \tag{14.71}$$
The $\eta_r(t)$ are the normal coordinates which can be expressed in the form
$$\boldsymbol{\eta} = \{\mathbf{a}\}^{-1}\mathbf{q} \tag{14.72}$$
Each normal mode $\eta_r$ corresponds to a single eigenfrequency, $\omega_r$, which satisfies the linear oscillator equation
$$\ddot{\eta}_r + \omega_r^2\,\eta_r = 0 \tag{14.73}$$

14.7 Two-body coupled oscillator systems

The two-body coupled oscillator is the simplest coupled-oscillator system that illustrates the general fea-
tures of coupled oscillators. The following four examples involve parallel and series couplings of two linear
oscillators or two plane pendula.

14.2 Example: Two coupled linear oscillators

The coupled double-oscillator problem, figure 14.1, discussed in chapter 14.2, can be used to demonstrate
that the general analytic theory gives the same solution as obtained by direct solution of the equations of
motion in chapter 14.2.
1) The first stage is to determine the potential and kinetic energies using an appropriate set of generalized
coordinates, which here are $x_1$ and $x_2$. The potential energy is
$$U = \frac{1}{2}\kappa x_1^2 + \frac{1}{2}\kappa x_2^2 + \frac{1}{2}\kappa'(x_2 - x_1)^2 = \frac{1}{2}(\kappa + \kappa')x_1^2 + \frac{1}{2}(\kappa + \kappa')x_2^2 - \kappa' x_1 x_2$$
while the kinetic energy is given by
$$T = \frac{1}{2}m\dot{x}_1^2 + \frac{1}{2}m\dot{x}_2^2$$
2) The second stage is to evaluate the potential energy $\mathbf{V}$ and kinetic energy $\mathbf{T}$ tensors. The potential
energy tensor $\mathbf{V}$ is nondiagonal since $V_{jk}$ gives
$$V_{11} \equiv \left(\frac{\partial^2 U}{\partial x_1 \partial x_1}\right)_0 = \kappa + \kappa' = V_{22}$$
$$V_{12} = \left(\frac{\partial^2 U}{\partial x_1 \partial x_2}\right)_0 = -\kappa' = V_{21}$$
That is, the potential energy tensor $\mathbf{V}$ is
$$\mathbf{V} = \begin{Bmatrix} \kappa + \kappa' & -\kappa' \\ -\kappa' & \kappa + \kappa' \end{Bmatrix}$$
Similarly, the kinetic energy is given by
$$T = \frac{1}{2}m\dot{x}_1^2 + \frac{1}{2}m\dot{x}_2^2 = \frac{1}{2}\sum_{j,k} T_{jk}\,\dot{x}_j\dot{x}_k$$
Since $T_{11} = T_{22} = m$ and $T_{12} = T_{21} = 0$ then the kinetic energy tensor $\mathbf{T}$ is
$$\mathbf{T} = \begin{Bmatrix} m & 0 \\ 0 & m \end{Bmatrix}$$
Note that for this case, the kinetic energy tensor $\mathbf{T}$ equals the mass tensor, which is diagonal, whereas the
potential energy tensor equals the spring constant tensor, which is nondiagonal.
3) The third stage is to use the potential energy $\mathbf{V}$ and kinetic energy $\mathbf{T}$ tensors to evaluate the secular
determinant using equation 14.52
$$\begin{vmatrix} \kappa + \kappa' - m\omega^2 & -\kappa' \\ -\kappa' & \kappa + \kappa' - m\omega^2 \end{vmatrix} = 0$$
The expansion of this secular determinant yields
$$\left(\kappa + \kappa' - m\omega^2\right)^2 - \kappa'^2 = 0$$
That is
$$\left(\kappa + \kappa' - m\omega^2\right) = \pm\kappa'$$

Solving for $\omega$ gives
$$\omega = \sqrt{\frac{\kappa + \kappa' \pm \kappa'}{m}}$$
The solutions are
$$\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}}$$
$$\omega_2 = \sqrt{\frac{\kappa}{m}}$$
which is the same as derived previously (equations 14.7-9).

4) The fourth step is to insert either one of these eigenfrequencies into the secular equation 14.51
$$\sum_{j}\left(V_{jk} - \omega_r^2 T_{jk}\right) a_{jr} = 0$$
Consider this secular equation for $k = 1$
$$\left(\kappa + \kappa' - m\omega_r^2\right) a_{1r} - \kappa' a_{2r} = 0$$
Then for the first eigenfrequency $\omega_1$, that is, $k = 1$, $r = 1$
$$\left(\kappa + \kappa' - \kappa - 2\kappa'\right) a_{11} - \kappa' a_{21} = 0$$
which simplifies to
$$a_{11} = -a_{21}$$
Similarly, for the other eigenfrequency $\omega_2$, that is, $k = 1$, $r = 2$
$$\left(\kappa + \kappa' - \kappa\right) a_{12} - \kappa' a_{22} = 0$$
which simplifies to
$$a_{12} = a_{22}$$
5) The final stage is to write the general coordinates in terms of the normal coordinates $\eta_r(t) \equiv \beta_r e^{i\omega_r t}$. Thus
$$x_1 = a_{11}\eta_1 + a_{12}\eta_2 = a_{11}\eta_1 + a_{22}\eta_2$$
and
$$x_2 = a_{21}\eta_1 + a_{22}\eta_2 = -a_{11}\eta_1 + a_{22}\eta_2$$
Adding or subtracting gives that the normal modes are
$$\eta_1 = \frac{1}{2a_{11}}(x_1 - x_2)$$
$$\eta_2 = \frac{1}{2a_{22}}(x_2 + x_1)$$
Thus the symmetric normal mode $\eta_2$ corresponds to an oscillation of the center-of-mass with the lower
frequency $\omega_2 = \sqrt{\frac{\kappa}{m}}$. This frequency is the same as for one single mass on a spring of spring constant
$\kappa$, which is as expected since they vibrate in unison and thus the coupling spring force does not act. The
antisymmetric mode $\eta_1$ has the higher frequency $\omega_1 = \sqrt{\frac{\kappa + 2\kappa'}{m}}$ since the restoring force includes both the
main spring plus the coupling spring.
The above example illustrates that the general analytic theory for coupled linear oscillators gives the
same answer as obtained in chapter 14.2 using Newton's equations of motion. However, the general analytic
theory is a more powerful technique for solving complicated coupled oscillator systems. Thus the general
analytic theory will be used for solving all the following coupled oscillator problems.

14.3 Example: Two equal masses series-coupled by two equal springs

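Consider the series-coupled system shown in the figure.

Figure: Two equal masses series-coupled by two equal springs.

1) The first stage is to determine the potential and kinetic energies using an appropriate set of generalized
coordinates, which here are $x_1$ and $x_2$. The potential energy is
$$U = \frac{1}{2}\kappa x_1^2 + \frac{1}{2}\kappa(x_2 - x_1)^2 = \kappa x_1^2 + \frac{1}{2}\kappa x_2^2 - \kappa x_1 x_2$$
while the kinetic energy is given by
$$T = \frac{1}{2}m\dot{x}_1^2 + \frac{1}{2}m\dot{x}_2^2$$
2) The second stage is to evaluate the potential energy $\mathbf{V}$ and mass $\mathbf{T}$ tensors. The potential energy tensor
$\mathbf{V}$ is nondiagonal since $V_{jk}$ gives
$$V_{11} \equiv \left(\frac{\partial^2 U}{\partial x_1 \partial x_1}\right)_0 = 2\kappa$$
$$V_{12} = \left(\frac{\partial^2 U}{\partial x_1 \partial x_2}\right)_0 = -\kappa = V_{21}$$
$$V_{22} = \left(\frac{\partial^2 U}{\partial x_2 \partial x_2}\right)_0 = \kappa$$
That is, the potential energy tensor $\mathbf{V}$ is
$$\mathbf{V} = \begin{Bmatrix} 2\kappa & -\kappa \\ -\kappa & \kappa \end{Bmatrix}$$
Similarly, since the kinetic energy is given by
$$T = \frac{1}{2}m\dot{x}_1^2 + \frac{1}{2}m\dot{x}_2^2 = \frac{1}{2}\sum_{j,k} T_{jk}\,\dot{x}_j\dot{x}_k$$
then $T_{11} = T_{22} = m$ and $T_{12} = T_{21} = 0$. Thus the kinetic energy tensor $\mathbf{T}$ is
$$\mathbf{T} = \begin{Bmatrix} m & 0 \\ 0 & m \end{Bmatrix}$$
Note that for this case the kinetic energy tensor is diagonal whereas the potential energy tensor is nondiagonal.
3) The third stage is to use the potential energy $\mathbf{V}$ and kinetic energy $\mathbf{T}$ tensors to evaluate the secular
determinant using equation 14.52
$$\begin{vmatrix} 2\kappa - m\omega^2 & -\kappa \\ -\kappa & \kappa - m\omega^2 \end{vmatrix} = 0$$
The expansion of this secular determinant yields
$$\left(2\kappa - m\omega^2\right)\left(\kappa - m\omega^2\right) - \kappa^2 = 0$$
That is
$$\omega^4 - 3\frac{\kappa}{m}\omega^2 + \frac{\kappa^2}{m^2} = 0$$
The solutions are
$$\omega_1 = \frac{\sqrt{5} + 1}{2}\sqrt{\frac{\kappa}{m}} \qquad \omega_2 = \frac{\sqrt{5} - 1}{2}\sqrt{\frac{\kappa}{m}}$$
4) The fourth step is to insert these eigenfrequencies into the secular equation 14.51
$$\sum_{j}\left(V_{jk} - \omega_r^2 T_{jk}\right) a_{jr} = 0$$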

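Consider $k = 1$ in the above equation
$$\left(2\kappa - m\omega_r^2\right) a_{1r} - \kappa a_{2r} = 0$$
Then for eigenfrequency $\omega_1$, that is, $k = 1$, $r = 1$
$$\frac{\sqrt{5} - 1}{2}\, a_{11} = -a_{21}$$
Similarly, for $k = 1$, $r = 2$
$$\frac{\sqrt{5} + 1}{2}\, a_{12} = a_{22}$$
5) The final stage is to write the general coordinates in terms of the normal coordinates $\eta_r(t) \equiv \beta_r e^{i\omega_r t}$.
Thus
$$x_1 = a_{11}\eta_1 + a_{12}\eta_2 = a_{11}\eta_1 + \frac{2a_{22}}{\sqrt{5} + 1}\eta_2$$
and
$$x_2 = a_{21}\eta_1 + a_{22}\eta_2 = -\left(\frac{\sqrt{5} - 1}{2}\right) a_{11}\eta_1 + a_{22}\eta_2$$
Solving this pair of equations for $\eta_1$ and $\eta_2$ gives that the normal modes are
$$\eta_1 = \frac{1}{a_{11}\sqrt{5}}\left(\left(\frac{\sqrt{5} + 1}{2}\right)x_1 - x_2\right)$$
$$\eta_2 = \frac{1}{a_{22}\sqrt{5}}\left(x_1 + \left(\frac{\sqrt{5} + 1}{2}\right)x_2\right)$$
Thus the symmetric normal mode has the lower frequency $\omega_2 = \frac{\sqrt{5} - 1}{2}\sqrt{\frac{\kappa}{m}}$. The antisymmetric mode has the
frequency $\omega_1 = \frac{\sqrt{5} + 1}{2}\sqrt{\frac{\kappa}{m}}$ since both springs provide the restoring force. This case is interesting in that for
both normal modes, the amplitudes for the motion of the two masses are different.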
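The golden-ratio frequencies and amplitude ratios of this example can be checked by solving the 2×2 secular determinant directly as a quadratic in $\omega^2$. The following is a minimal sketch in plain Python; the helper name `secular_roots_2x2` and the choice of units $\kappa = m = 1$ are assumptions made for illustration.

```python
import math

def secular_roots_2x2(V, T):
    """Roots w^2 of det(V - w^2 T) = 0 for symmetric 2x2 V and T, lower root first."""
    (v11, v12), (_, v22) = V
    (t11, t12), (_, t22) = T
    # det(V - x T) = A x^2 + B x + C with x = w^2
    A = t11 * t22 - t12 ** 2
    B = -(v11 * t22 + v22 * t11 - 2.0 * v12 * t12)
    C = v11 * v22 - v12 ** 2
    disc = math.sqrt(B * B - 4.0 * A * C)
    return (-B - disc) / (2.0 * A), (-B + disc) / (2.0 * A)

kappa, m = 1.0, 1.0   # illustrative units
w2_low, w2_high = secular_roots_2x2([[2 * kappa, -kappa], [-kappa, kappa]],
                                    [[m, 0.0], [0.0, m]])
omega_2, omega_1 = math.sqrt(w2_low), math.sqrt(w2_high)

# Amplitude ratios from the k = 1 equation: (2 kappa - m w^2) a_1r = kappa a_2r
ratio_1 = (2 * kappa - m * w2_high) / kappa   # a_21 / a_11
ratio_2 = (2 * kappa - m * w2_low) / kappa    # a_22 / a_12
print(omega_1, omega_2, ratio_1, ratio_2)
```

The two ratios have opposite signs and unequal magnitudes, in line with the remark that the two masses move with different amplitudes in both normal modes.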

14.4 Example: Two parallel-coupled plane pendula

Consider the coupled double pendulum system shown in the adjacent figure, which comprises two parallel plane
pendula weakly coupled by a spring. The angles $\theta_1$ and $\theta_2$ are chosen to be the generalized coordinates and the
potential energy is chosen to be zero at equilibrium. Then the kinetic energy is
$$T = \frac{1}{2}m\left(b\dot{\theta}_1\right)^2 + \frac{1}{2}m\left(b\dot{\theta}_2\right)^2$$

Figure: Two parallel-coupled plane pendula.

As discussed in chapter 3, it is necessary to make the small-angle approximation in order to make the equations
of motion for the simple pendulum linear and solvable analytically. That is,
$$U = mgb(1 - \cos\theta_1) + mgb(1 - \cos\theta_2) + \frac{1}{2}\kappa\left(b\sin\theta_1 - b\sin\theta_2\right)^2 \simeq \frac{mgb}{2}\left(\theta_1^2 + \theta_2^2\right) + \frac{\kappa b^2}{2}\left(\theta_1 - \theta_2\right)^2$$
assuming the small angle approximation $\sin\theta \approx \theta$ and $(1 - \cos\theta) \approx \frac{\theta^2}{2}$.
The second stage is to evaluate the kinetic energy $\mathbf{T}$ and potential energy $\mathbf{V}$ tensors
$$\mathbf{T} = \begin{Bmatrix} mb^2 & 0 \\ 0 & mb^2 \end{Bmatrix} \qquad \mathbf{V} = \begin{Bmatrix} mgb + \kappa b^2 & -\kappa b^2 \\ -\kappa b^2 & mgb + \kappa b^2 \end{Bmatrix}$$

Note that for this case the kinetic energy tensor is diagonal whereas the potential energy tensor is nondiagonal.
The third stage is to evaluate the secular determinant
$$\begin{vmatrix} mgb + \kappa b^2 - \omega^2 mb^2 & -\kappa b^2 \\ -\kappa b^2 & mgb + \kappa b^2 - \omega^2 mb^2 \end{vmatrix} = 0$$
which gives the characteristic equation
$$\left(mgb + \kappa b^2 - \omega^2 mb^2\right)^2 = \left(\kappa b^2\right)^2$$
or
$$mg + \kappa b - \omega^2 mb = \pm\kappa b$$
The two solutions are
$$\omega_1^2 = \frac{g}{b} \qquad \omega_2^2 = \frac{g}{b} + \frac{2\kappa}{m}$$
The fourth step is to insert these eigenfrequencies into equation 14.51
$$\sum_{j}\left(V_{jk} - \omega_r^2 T_{jk}\right) a_{jr} = 0$$
Consider $k = 1$
$$\left(mgb + \kappa b^2 - \omega_r^2 mb^2\right) a_{1r} - \kappa b^2 a_{2r} = 0$$
Then for the first eigenfrequency, $\omega_1$, the subscripts are $k = 1$, $r = 1$
$$\left(mgb + \kappa b^2 - \frac{g}{b}mb^2\right) a_{11} - \kappa b^2 a_{21} = 0$$
which simplifies to
$$a_{11} = a_{21}$$
Similarly, for $k = 1$, $r = 2$
$$\left(mgb + \kappa b^2 - \left(\frac{g}{b} + \frac{2\kappa}{m}\right)mb^2\right) a_{12} - \kappa b^2 a_{22} = 0$$
which simplifies to
$$a_{12} = -a_{22}$$
The final stage is to write the general coordinates in terms of the normal coordinates
$$\theta_1 = a_{11}\eta_1 + a_{12}\eta_2 = a_{11}\eta_1 - a_{22}\eta_2$$
and
$$\theta_2 = a_{21}\eta_1 + a_{22}\eta_2 = a_{11}\eta_1 + a_{22}\eta_2$$
Adding or subtracting these equations gives that the normal modes are
$$\eta_1 = \frac{1}{2a_{11}}(\theta_1 + \theta_2) \qquad \eta_2 = \frac{1}{2a_{22}}(\theta_2 - \theta_1)$$
As for the case of the double oscillator discussed in example 14.2, the symmetric normal mode corresponds
to an oscillation of the center-of-mass, with zero relative motion of the two pendula, which has the lower
frequency $\omega_1 = \sqrt{\frac{g}{b}}$. This frequency is the same as for one independent pendulum, as expected since they
vibrate in unison and thus the only restoring force is gravity. The antisymmetric mode corresponds to
relative motion of the two pendula with stationary center-of-mass and has the frequency $\omega_2 = \sqrt{\frac{g}{b} + \frac{2\kappa}{m}}$
since the restoring force includes both the coupling spring and gravity.
This example introduces the role of degeneracy, which occurs in this system if the coupling of the pendula
is zero, that is, $\kappa = 0$, leading to both frequencies being equal, i.e. $\omega_1 = \omega_2 = \sqrt{\frac{g}{b}}$. When $\kappa = 0$, then both
$\{\mathbf{T}\}$ and $\{\mathbf{V}\}$ are diagonal and thus in the $(\theta_1, \theta_2)$ space the two pendula are independent normal modes.
However, the symmetric and antisymmetric normal modes, as derived above, are equally good normal modes.
In fact, since the modes are degenerate, any linear combination of the motions of the independent pendula is an
equally good normal mode and thus one can use any set of orthogonal normal modes to describe the motion.
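The lifting of the degeneracy by the coupling spring can be seen numerically. The sketch below assumes NumPy, a hypothetical helper `pendula_frequencies`, and illustrative values for the mass, length, and spring constant; it exploits the fact that the kinetic energy tensor here is $mb^2$ times the unit matrix, so the secular problem reduces to an ordinary symmetric eigenvalue problem.

```python
import numpy as np

def pendula_frequencies(m, b, kappa, g=9.81):
    """Eigenfrequencies (ascending) of two parallel plane pendula coupled by a spring."""
    V = np.array([[m * g * b + kappa * b**2, -kappa * b**2],
                  [-kappa * b**2, m * g * b + kappa * b**2]])
    # T = m b^2 * identity, so det(V - w^2 T) = 0 is an ordinary eigenproblem
    w2 = np.linalg.eigvalsh(V / (m * b**2))
    return np.sqrt(w2)

# Assumed illustrative parameters
w_coupled = pendula_frequencies(m=0.5, b=1.0, kappa=2.0)
w_free = pendula_frequencies(m=0.5, b=1.0, kappa=0.0)
print(w_coupled)   # sqrt(g/b) and sqrt(g/b + 2 kappa/m)
print(w_free)      # both equal to sqrt(g/b): degenerate
```

Setting `kappa=0.0` makes the two returned frequencies coincide, which is the degenerate case discussed above.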

14.5 Example: The series-coupled double plane pendula

The double-pendula system comprises one plane pendulum attached to the end of another plane pendulum, both
oscillating in the same plane. The kinetic and potential energies for this system are given in example 6.21 to be
$$T = \frac{1}{2}(m_1 + m_2)L_1^2\dot{\theta}_1^2 + m_2 L_1 L_2\dot{\theta}_1\dot{\theta}_2\cos(\theta_1 - \theta_2) + \frac{1}{2}m_2 L_2^2\dot{\theta}_2^2$$
$$U = (m_1 + m_2)gL_1(1 - \cos\theta_1) + m_2 gL_2(1 - \cos\theta_2)$$

Figure: Two series-coupled plane pendula.

a) Small-amplitude linear regime

Use of the small-angle approximation makes this system linear and solvable analytically. That is, $U$ and $T$ become
$$U = \frac{1}{2}(m_1 + m_2)gL_1\theta_1^2 + \frac{1}{2}m_2 gL_2\theta_2^2$$
$$T = \frac{1}{2}(m_1 + m_2)L_1^2\dot{\theta}_1^2 + m_2 L_1 L_2\dot{\theta}_1\dot{\theta}_2 + \frac{1}{2}m_2 L_2^2\dot{\theta}_2^2$$
Thus the kinetic energy and potential energy tensors are
$$\mathbf{T} = \begin{Bmatrix} (m_1 + m_2)L_1^2 & m_2 L_1 L_2 \\ m_2 L_1 L_2 & m_2 L_2^2 \end{Bmatrix} \qquad \mathbf{V} = \begin{Bmatrix} (m_1 + m_2)gL_1 & 0 \\ 0 & m_2 gL_2 \end{Bmatrix}$$
Note that $\mathbf{T}$ is nondiagonal, whereas $\mathbf{V}$ is diagonal, which is opposite to the case of the two parallel-coupled
plane pendula.
The solution of this case is simpler if it is assumed that $m_1 = m_2 = m$ and $L_1 = L_2 = L$. Then
$$\mathbf{T} = mL^2\begin{Bmatrix} 2 & 1 \\ 1 & 1 \end{Bmatrix} \qquad \mathbf{V} = mL^2\begin{Bmatrix} 2\omega_0^2 & 0 \\ 0 & \omega_0^2 \end{Bmatrix}$$
where $\omega_0 = \sqrt{\frac{g}{L}}$, which is the frequency of a single pendulum.
The next stage is to evaluate the secular determinant
$$mL^2\begin{vmatrix} 2(\omega_0^2 - \omega^2) & -\omega^2 \\ -\omega^2 & (\omega_0^2 - \omega^2) \end{vmatrix} = 0$$
The eigenvalues are
$$\omega_1^2 = (2 - \sqrt{2})\,\omega_0^2 \qquad \omega_2^2 = (2 + \sqrt{2})\,\omega_0^2$$
As shown in the adjacent figure, the normal modes for this system are
$$\eta_1 = \frac{1}{2a_{11}}\left(\theta_1 + \frac{\theta_2}{\sqrt{2}}\right) \qquad \eta_2 = \frac{1}{2a_{12}}\left(\theta_1 - \frac{\theta_2}{\sqrt{2}}\right)$$

Figure: Normal modes for two series-coupled plane pendula.

The second mass has a $\sqrt{2}$ larger amplitude that is in phase for solution 1 and out of phase for solution 2.
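The small-amplitude eigenvalues and the $\sqrt{2}$ amplitude ratio can be verified numerically for this case, where the kinetic energy tensor is nondiagonal. The sketch below assumes NumPy, a hypothetical helper `double_pendulum_modes`, and illustrative unit masses and lengths (the symbols $L_1$, $L_2$ for the pendulum lengths follow the notation used here).

```python
import numpy as np

def double_pendulum_modes(m1, m2, L1, L2, g=9.81):
    """Small-angle eigenfrequencies (ascending) and amplitude ratios a_2r / a_1r."""
    T = np.array([[(m1 + m2) * L1**2, m2 * L1 * L2],
                  [m2 * L1 * L2, m2 * L2**2]])
    V = np.array([[(m1 + m2) * g * L1, 0.0],
                  [0.0, m2 * g * L2]])
    # V a = w^2 T a  is equivalent to  (T^{-1} V) a = w^2 a
    w2, a = np.linalg.eig(np.linalg.solve(T, V))
    order = np.argsort(w2.real)
    w2, a = w2.real[order], a.real[:, order]
    return np.sqrt(w2), a[1] / a[0]

g, L = 9.81, 1.0   # assumed illustrative values
omega, ratio = double_pendulum_modes(1.0, 1.0, L, L, g)
print(omega**2 / (g / L))   # expected 2 - sqrt(2) and 2 + sqrt(2)
print(ratio)                # expected +sqrt(2) (in phase) and -sqrt(2) (out of phase)
```

The same function can be called with unequal masses or lengths, for which the closed-form eigenvalues quoted above no longer apply.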
b) Large amplitude chaotic regime

Stachowiak and Okada [Sta05] used computer simulations to numerically analyze the behavior of this
system with increase in the oscillation amplitudes. Poincaré sections, bifurcation diagrams, and Lyapunov
exponents all confirm that this system evolves from regular normal-mode oscillatory behavior in the linear
regime at low energy, to chaotic behavior at high excitation energies where non-linearity dominates. This
behavior is analogous to that of the driven, linearly-damped, harmonic pendulum described in chapter 3.5.

14.8 Three-body coupled linear oscillator systems

Chapter 147 discussed parallel and series arrangements of two coupled oscillators. Extending from two to
three coupled linear oscillators introduces interesting new characteristics of coupled oscillator systems. For
more than two coupled oscillators, coupled oscillator systems separate into two classifications depending on
whether each oscillator is coupled to the remaining  − 1 oscillators, or when the coupling is only to the
nearest neighbors as illustrated below.

14.6 Example: Three plane pendula; mean-field linear coupling

Consider three identical pendula with mass $m$ and length $b$, suspended from a common support that yields slightly
to pendulum motion, leading to a coupling between all three pendula as illustrated in the adjacent figure. Assume
that the three pendula all move in the same plane. This case is analogous to the piano where three strings in the
treble section are coupled by the slightly-yielding common bridge plus sounding board, leading to coupling between
each of the three coupled oscillators. This case illustrates the important concept of degeneracy.

Figure: Three plane pendula with complete linear coupling.

The generalized coordinates are the angles $\theta_1$, $\theta_2$, and $\theta_3$. Assume that the support yields such that the actual
deflection angle for pendulum 1 is
$$\theta_1' = \theta_1 - \frac{\epsilon}{2}(\theta_2 + \theta_3)$$
where the coupling coefficient $\epsilon$ is small and involves all the pendula, not just the nearest neighbors. Assume
that the same coupling relation exists for the other angle coordinates. The gravitational potential energy of
each pendulum is given by
$$U_1 = mgb(1 - \cos\theta_1') \approx \frac{1}{2}mgb\,\theta_1'^2$$
assuming the small angle approximation. Ignoring terms of order $\epsilon^2$ gives that the potential energy is
$$U = \frac{mgb}{2}\left(\theta_1'^2 + \theta_2'^2 + \theta_3'^2\right) = \frac{mgb}{2}\left(\theta_1^2 + \theta_2^2 + \theta_3^2 - 2\epsilon\theta_1\theta_2 - 2\epsilon\theta_1\theta_3 - 2\epsilon\theta_2\theta_3\right)$$
The kinetic energy evaluated at the equilibrium location is
$$T = \frac{1}{2}m\left(b\dot{\theta}_1\right)^2 + \frac{1}{2}m\left(b\dot{\theta}_2\right)^2 + \frac{1}{2}m\left(b\dot{\theta}_3\right)^2$$
The next stage is to evaluate the $\{\mathbf{T}\}$ and $\{\mathbf{V}\}$ tensors
$$\mathbf{T} = mb^2\begin{Bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Bmatrix} \qquad \mathbf{V} = mgb\begin{Bmatrix} 1 & -\epsilon & -\epsilon \\ -\epsilon & 1 & -\epsilon \\ -\epsilon & -\epsilon & 1 \end{Bmatrix}$$
The third stage is to evaluate the secular determinant which can be written as
$$mgb\begin{vmatrix} 1 - \frac{b}{g}\omega^2 & -\epsilon & -\epsilon \\ -\epsilon & 1 - \frac{b}{g}\omega^2 & -\epsilon \\ -\epsilon & -\epsilon & 1 - \frac{b}{g}\omega^2 \end{vmatrix} = 0$$
Expanding and factoring gives
$$\left(\frac{b}{g}\omega^2 - 1 - \epsilon\right)\left(\frac{b}{g}\omega^2 - 1 - \epsilon\right)\left(\frac{b}{g}\omega^2 - 1 + 2\epsilon\right) = 0$$

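The roots are
$$\omega_1 = \sqrt{\frac{g}{b}}\sqrt{1 + \epsilon} \qquad \omega_2 = \sqrt{\frac{g}{b}}\sqrt{1 + \epsilon} \qquad \omega_3 = \sqrt{\frac{g}{b}}\sqrt{1 - 2\epsilon}$$
This case results in two degenerate eigenfrequencies, $\omega_1 = \omega_2$, while $\omega_3$ is the lowest eigenfrequency.
The eigenvectors can be determined by substitution of the eigenfrequencies into
$$\sum_{j}\left(V_{jk} - \omega_r^2 T_{jk}\right) a_{jr} = 0$$
Consider the lowest eigenfrequency $\omega_3$, i.e. $r = 3$. For $k = 1$, substituting $\omega_3 = \sqrt{\frac{g}{b}}\sqrt{1 - 2\epsilon}$ and dividing by the
common factor $mgb$ gives
$$2\epsilon a_{13} - \epsilon a_{23} - \epsilon a_{33} = 0$$
while for $r = 3$, $k = 2$
$$-\epsilon a_{13} + 2\epsilon a_{23} - \epsilon a_{33} = 0$$
Solving these gives
$$a_{13} = a_{23} = a_{33}$$
Assuming that the eigenfunction is normalized to unity
$$a_{13}^2 + a_{23}^2 + a_{33}^2 = 1$$
then for the third eigenvector $\mathbf{a}_3$
$$a_{13} = a_{23} = a_{33} = \frac{1}{\sqrt{3}}$$
This solution corresponds to all three pendula oscillating in phase with the same amplitude, that is, a coherent
oscillation.
Derivation of the eigenfunctions for the other two eigenfrequencies is complicated because of the degeneracy
$\omega_1 = \omega_2$; there are only five independent equations to specify the six unknowns for the eigenvectors
$\mathbf{a}_1$ and $\mathbf{a}_2$. That is, the eigenvectors can be chosen freely as long as the orthogonality and normalization are
satisfied. For example, setting $a_{31} = 0$ to remove the indeterminacy results in the $\{\mathbf{a}\}$ matrix
$$\{\mathbf{a}\} = \begin{Bmatrix} \frac{1}{2}\sqrt{2} & \frac{1}{6}\sqrt{6} & \frac{1}{3}\sqrt{3} \\ -\frac{1}{2}\sqrt{2} & \frac{1}{6}\sqrt{6} & \frac{1}{3}\sqrt{3} \\ 0 & -\frac{1}{3}\sqrt{6} & \frac{1}{3}\sqrt{3} \end{Bmatrix}$$
and thus the solution is given by
$$\begin{Bmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \end{Bmatrix} = \begin{Bmatrix} \frac{1}{2}\sqrt{2} & \frac{1}{6}\sqrt{6} & \frac{1}{3}\sqrt{3} \\ -\frac{1}{2}\sqrt{2} & \frac{1}{6}\sqrt{6} & \frac{1}{3}\sqrt{3} \\ 0 & -\frac{1}{3}\sqrt{6} & \frac{1}{3}\sqrt{3} \end{Bmatrix}\begin{Bmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \end{Bmatrix}$$
The normal modes are obtained by taking the inverse matrix $\{\mathbf{a}\}^{-1}$ and using $\{\boldsymbol{\eta}\} = \{\mathbf{a}\}^{-1}\{\boldsymbol{\theta}\}$. Note
that since $\{\mathbf{a}\}$ is real and orthogonal, then $\{\mathbf{a}\}^{-1}$ equals the transpose of $\{\mathbf{a}\}$. That is,
$$\begin{Bmatrix} \eta_1 \\ \eta_2 \\ \eta_3 \end{Bmatrix} = \begin{Bmatrix} \frac{1}{2}\sqrt{2} & -\frac{1}{2}\sqrt{2} & 0 \\ \frac{1}{6}\sqrt{6} & \frac{1}{6}\sqrt{6} & -\frac{1}{3}\sqrt{6} \\ \frac{1}{3}\sqrt{3} & \frac{1}{3}\sqrt{3} & \frac{1}{3}\sqrt{3} \end{Bmatrix}\begin{Bmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \end{Bmatrix}$$
The normal mode $\eta_3$ has eigenfrequency
$$\omega_3 = \sqrt{\frac{g}{b}}\sqrt{1 - 2\epsilon}$$
and eigenvector
$$\boldsymbol{\eta}_3 = \frac{1}{\sqrt{3}}(\theta_1, \theta_2, \theta_3)$$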

This corresponds to the in-phase oscillation of all three pendula.

The other two degenerate solutions are
$$\boldsymbol{\eta}_1 = \frac{1}{\sqrt{2}}(\theta_1, -\theta_2, 0) \qquad \boldsymbol{\eta}_2 = \frac{1}{\sqrt{6}}(\theta_1, \theta_2, -2\theta_3)$$
with eigenvalues
$$\omega_1 = \omega_2 = \sqrt{\frac{g}{b}}\sqrt{1 + \epsilon}$$
These two degenerate normal modes correspond to two pendula oscillating out of phase with the same amplitude,
or two oscillating in phase with the same amplitude and the third out of phase with twice the amplitude.
An important result of this toy model is that the most symmetric mode $\eta_3$ is pushed far from all the other
modes. Note that for this example, the coherent mode $\eta_3$ corresponds to the center-of-mass oscillation with
no relative motion between the three pendula. This is in contrast to the eigenvectors $\eta_1$ and $\eta_2$ which both
correspond to relative motion of the pendula such that there is zero center-of-mass motion. This mean-field
coupling behavior is exhibited by collective motion in nuclei as discussed in example 14.12.
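The eigenvalue pattern of the mean-field coupling, two degenerate modes plus one coherent mode split off from them, can be checked with a few lines of NumPy. This is only a sketch: the value of the coupling coefficient is an assumed illustrative number, and the potential energy tensor is written in units of $mgb$ so that the eigenvalues are $b\omega^2/g$.

```python
import numpy as np

eps = 0.05   # assumed illustrative coupling coefficient
V = np.array([[1.0, -eps, -eps],
              [-eps, 1.0, -eps],
              [-eps, -eps, 1.0]])          # V / (m g b); T / (m b^2) is the identity

lam, vecs = np.linalg.eigh(V)              # lam = b w^2 / g in ascending order
coherent = vecs[:, 0] * np.sign(vecs[0, 0])

# The {a} matrix quoted in the example (columns a_1, a_2, a_3)
s2, s3, s6 = np.sqrt(2.0), np.sqrt(3.0), np.sqrt(6.0)
a_book = np.array([[s2 / 2, s6 / 6, s3 / 3],
                   [-s2 / 2, s6 / 6, s3 / 3],
                   [0.0, -s6 / 3, s3 / 3]])
residual = V @ a_book - a_book @ np.diag([1 + eps, 1 + eps, 1 - 2 * eps])

print(lam)        # 1 - 2 eps, then 1 + eps twice
print(coherent)   # (1, 1, 1) / sqrt(3)
```

The residual vanishes, confirming that each column of the quoted matrix is an eigenvector, and the transpose of `a_book` is its inverse because the columns are orthonormal.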

14.7 Example: Three plane pendula; nearest-neighbor coupling

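There is a large and important class of coupled oscillators where the coupling is only between nearest neighbors;
a crystalline lattice is a classic example. A toy model for such a system is the case of three identical pendula
coupled by two identical springs, where only the nearest neighbors are coupled as shown in the adjacent figure.
Assume the identical pendula are of length $b$ and mass $m$. As in the last example, the kinetic energy evaluated
at the equilibrium location is
$$T = \frac{1}{2}mb^2\dot{\theta}_1^2 + \frac{1}{2}mb^2\dot{\theta}_2^2 + \frac{1}{2}mb^2\dot{\theta}_3^2$$

Figure: Three plane pendula with nearest-neighbour coupling.

The gravitational potential energy of each pendulum equals $mgb(1 - \cos\theta) \approx \frac{1}{2}mgb\,\theta^2$, thus
$$U_{grav} = \frac{1}{2}mgb\left(\theta_1^2 + \theta_2^2 + \theta_3^2\right)$$
while the potential energy in the springs is given by
$$U_{spring} = \frac{1}{2}\kappa b^2\left[(\theta_2 - \theta_1)^2 + (\theta_3 - \theta_2)^2\right] = \frac{1}{2}\kappa b^2\left[\theta_1^2 + 2\theta_2^2 + \theta_3^2 - 2\theta_1\theta_2 - 2\theta_2\theta_3\right]$$
Thus the total potential energy is given by
$$U = \frac{1}{2}mgb\left(\theta_1^2 + \theta_2^2 + \theta_3^2\right) + \frac{1}{2}\kappa b^2\left[\theta_1^2 + 2\theta_2^2 + \theta_3^2 - 2\theta_1\theta_2 - 2\theta_2\theta_3\right]$$
The Lagrangian then becomes
$$L = \frac{1}{2}mb^2\left(\dot{\theta}_1^2 + \dot{\theta}_2^2 + \dot{\theta}_3^2\right) - \frac{1}{2}\left(mgb + \kappa b^2\right)\theta_1^2 - \frac{1}{2}\left(mgb + 2\kappa b^2\right)\theta_2^2 - \frac{1}{2}\left(mgb + \kappa b^2\right)\theta_3^2 + \kappa b^2\left(\theta_1\theta_2 + \theta_2\theta_3\right)$$
Using this in the Euler-Lagrange equations gives the equations of motion
$$mb^2\ddot{\theta}_1 + (mgb + \kappa b^2)\theta_1 - \kappa b^2\theta_2 = 0$$
$$mb^2\ddot{\theta}_2 + (mgb + 2\kappa b^2)\theta_2 - \kappa b^2(\theta_1 + \theta_3) = 0$$
$$mb^2\ddot{\theta}_3 + (mgb + \kappa b^2)\theta_3 - \kappa b^2\theta_2 = 0$$
The general analytic approach requires the $\mathbf{T}$ and $\mathbf{V}$ energy tensors given by
$$\mathbf{T} = mb^2\begin{Bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{Bmatrix} \qquad \mathbf{V} = \begin{Bmatrix} mgb + \kappa b^2 & -\kappa b^2 & 0 \\ -\kappa b^2 & mgb + 2\kappa b^2 & -\kappa b^2 \\ 0 & -\kappa b^2 & mgb + \kappa b^2 \end{Bmatrix}$$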

Note that in contrast to the prior case of three fully-coupled pendula, for the nearest neighbor case the potential
energy tensor {V} is non-zero only on the diagonal and ±1 components ¡ parallel
¢ to the diagonal.
The third stage is to evaluate the secular determinant of the V −  2 T matrix, that is
¯ ¯
¯  + 2 −  2 2 −2 0 ¯
¯ ¯
¯ − 2 2
 + 2 −   2 2
− 2 ¯=0
¯ ¯
¯ 0 − 2 2
 +  −   2 2 ¯

This results in the characteristic equation

¡ ¢¡ ¢¡ ¢
 −  2 2  + 2 −  2 2  + 32 −  2 2 = 0

which results in the three non-degenerate eigenfrequencies for the normal modes.
The normal modes are similar to the prior case of complete linear
coupling, pas shown in the adjacent figure.
 1 =  This lowest mode 1 involves the three pendula oscillating
in phase such that the springs are not stretched or compressed thus the 1
period of this coherent oscillation is the same as an independent pendulum
of mass  and length . That is
1
η 1 = √ (1  2  3 )
3
p
 2 =  +  
 This second mode  2 has the central mass stationary with
the outer pendula oscillating with the same amplitude and out of phase.
That is
1
η 2 = √ (1  0 −3 )
2
q 2
 3 =  + 3
 . This third mode  3 involves the outer pendula in phase
with the same amplitude while the central pendulum oscillating with angle
3 = −21 . That is
1
η3 = √ (1  −22  3 )
6
Similar to the prior case of three completely-coupled pendula, the coherent normal mode $\boldsymbol{\eta}_1$ corresponds to an
oscillation of the center-of-mass with no relative motion, while $\boldsymbol{\eta}_2$ and $\boldsymbol{\eta}_3$ correspond to relative motion of
the pendula with a stationary center of mass. In contrast to the prior example of complete coupling, for nearest
neighbor coupling the two higher lying solutions are not degenerate. That is, the nearest neighbor coupling
solutions differ from those obtained when all masses are linearly coupled.

Figure: Normal modes of three plane pendula with nearest-neighbour coupling.

It is interesting to note that this example combines two coupling mechanisms that can be used to predict the
solutions for two extreme cases by switching off one of these coupling mechanisms. Switching off the coupling
springs, by setting $\kappa = 0$, makes all three normal frequencies degenerate with $\omega_1 = \omega_2 = \omega_3 = \sqrt{\frac{g}{b}}$. This
corresponds to three independent identical pendula each with frequency $\omega = \sqrt{\frac{g}{b}}$. The three linear
combinations $\eta_1, \eta_2, \eta_3$ also have this same frequency; in particular $\eta_1$ corresponds to an in-phase oscillation of
the three pendula. The three uncoupled pendula are independent and any combination of the three modes is
allowed since the three frequencies are degenerate.
The other extreme is to let $g = 0$, that is, switch off the gravitational field, or let $b \rightarrow \infty$; then the only
coupling is due to the two springs. This results in $\omega_1 = 0$ because there is no restoring force acting on the
coherent motion of the three in-phase coupled oscillators; as a result, oscillatory motion cannot be sustained
since it corresponds to a center of mass oscillation with no external forces acting, which is spurious. That
is, this spurious solution corresponds to constant linear translation.
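The three eigenfrequencies and mode shapes quoted above can be checked by substituting them back into $(\mathbf{V} - \omega^2\mathbf{T})\mathbf{a} = 0$. The following is a minimal sketch of such a check, not part of the original derivation; the function names and the numerical values of $m$, $g$, $b$, $\kappa$ are illustrative assumptions.

```python
import math

def residual(V, T, a, omega2):
    """Largest component of (V - omega^2 T) a; zero for a true normal mode."""
    n = len(a)
    return max(abs(sum((V[i][j] - omega2 * T[i][j]) * a[j] for j in range(n)))
               for i in range(n))

def pendula_tensors(m, g, b, kappa):
    """T and V tensors for three pendula with nearest-neighbour spring coupling."""
    grav = m * g * b           # gravitational restoring term m g b
    spring = kappa * b * b     # spring coupling term kappa b^2
    T = [[m * b * b if i == j else 0.0 for j in range(3)] for i in range(3)]
    V = [[grav + spring, -spring, 0.0],
         [-spring, grav + 2 * spring, -spring],
         [0.0, -spring, grav + spring]]
    return T, V

m, g, b, kappa = 0.5, 9.81, 1.2, 3.0     # assumed, illustrative values only
T, V = pendula_tensors(m, g, b, kappa)

# (omega^2, unnormalized mode shape) pairs quoted in the text
modes = [
    (g / b,                 [1.0, 1.0, 1.0]),
    (g / b + kappa / m,     [1.0, 0.0, -1.0]),
    (g / b + 3 * kappa / m, [1.0, -2.0, 1.0]),
]
residuals = [residual(V, T, a, w2) for w2, a in modes]
frequencies = [math.sqrt(w2) for w2, _ in modes]

# Limiting case g = 0: the in-phase mode becomes the spurious omega = 0 solution.
T0, V0 = pendula_tensors(m, 0.0, b, kappa)
zero_mode_residual = residual(V0, T0, [1.0, 1.0, 1.0], 0.0)
```

All residuals should vanish to rounding error, and setting `kappa = 0` in the same functions collapses the three frequencies onto $\sqrt{g/b}$.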

14.8 Example: System of three bodies coupled by six springs

Consider the completely-coupled mechanical system shown in the adjacent figure.
1) The first stage is to determine the potential and kinetic energies using an appropriate set of
generalized coordinates, which here are $q_1$, $q_2$, and $q_3$. The potential energy is the sum of the potential
energies for each of the six springs
$$U = \frac{3}{2}\kappa q_1^2 + \frac{3}{2}\kappa q_2^2 + \frac{3}{2}\kappa q_3^2 - \kappa q_1 q_2 - \kappa q_1 q_3 - \kappa q_2 q_3$$
while the kinetic energy is given by
$$T = \frac{1}{2} m \dot{q}_1^2 + \frac{1}{2} m \dot{q}_2^2 + \frac{1}{2} m \dot{q}_3^2$$
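2) The second stage is to evaluate the potential energy $\mathbf{V}$ and kinetic energy $\mathbf{T}$ tensors.
$$\mathbf{V} = \begin{Bmatrix} 3\kappa & -\kappa & -\kappa \\ -\kappa & 3\kappa & -\kappa \\ -\kappa & -\kappa & 3\kappa \end{Bmatrix}
\qquad
\mathbf{T} = \begin{Bmatrix} m & 0 & 0 \\ 0 & m & 0 \\ 0 & 0 & m \end{Bmatrix}$$
Note that for this case the kinetic energy tensor is diagonal whereas the potential energy tensor is
nondiagonal and corresponds to complete coupling of the three coordinates.
3) The third stage is to use the potential $\mathbf{V}$ and kinetic $\mathbf{T}$ energy tensors to evaluate the secular
determinant giving
$$\begin{vmatrix}
\left(3\kappa - m\omega^2\right) & -\kappa & -\kappa \\
-\kappa & \left(3\kappa - m\omega^2\right) & -\kappa \\
-\kappa & -\kappa & \left(3\kappa - m\omega^2\right)
\end{vmatrix} = 0$$

Figure: System of three bodies coupled by six springs.

The expansion of this secular determinant yields
$$\left(\kappa - m\omega^2\right)\left(4\kappa - m\omega^2\right)\left(4\kappa - m\omega^2\right) = 0$$
The solution for this completely-coupled system has two degenerate eigenvalues.
$$\omega_1 = \omega_2 = 2\sqrt{\frac{\kappa}{m}} \qquad \omega_3 = \sqrt{\frac{\kappa}{m}}$$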
4) The fourth stage is to insert these eigenfrequencies into the secular equation
$$\sum_k \left(V_{jk} - \omega_r^2 T_{jk}\right) a_{kr} = 0$$
to determine the coefficients $a_{kr}$.

5) The final stage is to write the general coordinates in terms of the normal coordinates.
The result is that the angular frequency $\omega_3 = \sqrt{\frac{\kappa}{m}}$ corresponds to a normal mode for which the three
masses oscillate in phase corresponding to a center-of-mass oscillation with no relative motion of the masses.
$$\eta_3 = \frac{1}{\sqrt{3}}(q_1 + q_2 + q_3)$$
For this coherent motion only one spring per mass is stretched, resulting in the same frequency as one
mass on a spring. The other two solutions correspond to the three masses oscillating out of phase, which
implies all three springs are stretched and thus the angular frequency is higher. Since the two eigenvalues
$\omega_1 = \omega_2 = 2\sqrt{\frac{\kappa}{m}}$ are degenerate, there are only five independent equations to specify the six unknowns
for the degenerate eigenvalues. Thus it is possible to select a combination of the eigenvectors $\eta_1$ and $\eta_2$ such
that the combination is orthogonal to $\eta_3$. Choose $a_{31} = 0$ to remove the indeterminacy. Then adding or
subtracting gives that the normal modes are
$$\eta_1 = \frac{1}{\sqrt{2}}(q_1 - q_2 + 0) \qquad \eta_2 = \frac{1}{\sqrt{6}}(q_1 + q_2 - 2q_3)$$
These two degenerate normal modes correspond to relative motion of the masses with stationary center-of-
mass.
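As a cross-check of the degenerate pair, the sketch below verifies numerically that the three quoted combinations are eigenvectors of $\mathbf{V}$ with $\omega^2 = 4\kappa/m$, $4\kappa/m$, and $\kappa/m$, and that they are mutually orthogonal. The helper names and the values of $\kappa$ and $m$ are assumptions made only for illustration.

```python
import math

def matvec(M, v):
    """Matrix-vector product for small nested-list matrices."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

kappa, m = 2.0, 0.5                       # assumed, illustrative values only
V = [[3 * kappa if i == j else -kappa for j in range(3)] for i in range(3)]

modes = {
    "eta1": [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0],
    "eta2": [1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6)],
    "eta3": [1 / math.sqrt(3)] * 3,
}

# T = m * identity, so omega^2 = (a . V a) / (m a . a) for an eigenvector a.
omega_squared = {name: dot(a, matvec(V, a)) / (m * dot(a, a))
                 for name, a in modes.items()}

# Residual of V a - m omega^2 a confirms each vector really is an eigenvector.
residuals = {name: max(abs(Va - m * omega_squared[name] * x)
                       for Va, x in zip(matvec(V, a), a))
             for name, a in modes.items()}

overlaps = [dot(modes[p], modes[q])
            for p, q in (("eta1", "eta2"), ("eta1", "eta3"), ("eta2", "eta3"))]
```

Because the two higher modes are degenerate, any orthogonal pair spanning the same plane would pass the same check; the choice $a_{31} = 0$ simply selects one such pair.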

14.9 Molecular coupled oscillator systems

There are many examples of coupled oscillations in atomic and molecular physics most of which involve
nearest-neighbor coupling. The following two examples are for molecular coupled oscillators. The triatomic
molecule is a typical linearly-coupled molecular oscillator. The benzene molecule is an elementary example
of a ring structure coupled oscillator.

14.9 Example: Linear triatomic molecule CO$_2$

Molecules provide excellent examples of vibrational modes involving nearest neighbor coupling. Depending
on the atomic structure, triatomic molecules can be either linear, like CO$_2$, or bent like water, H$_2$O, which
has a bend angle of $\theta = 109^\circ$. A molecule with $n$ atoms has $3n$ degrees of freedom. There are three degrees
of freedom for translation and three degrees of freedom for rotation leaving $3n - 6$ degrees of freedom for
vibrations. A triatomic molecule has three vibrational modes, two longitudinal and one transverse. Consider
the normal modes for vibration of the linear molecule CO$_2$.

Longitudinal modes
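The coordinate system used is illustrated in the adjacent figure.
The Lagrangian for this system is
$$L = \left(\frac{m}{2}\dot{x}_1^2 + \frac{M}{2}\dot{x}_2^2 + \frac{m}{2}\dot{x}_3^2\right) - \frac{\kappa}{2}\left[(x_2 - x_1)^2 + (x_3 - x_2)^2\right]$$
Evaluating the kinetic energy tensor gives
$$\mathbf{T} = \begin{Bmatrix} m & 0 & 0 \\ 0 & M & 0 \\ 0 & 0 & m \end{Bmatrix}$$

while the potential energy tensor gives

$$\mathbf{V} = \kappa \begin{Bmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{Bmatrix}$$

The secular equation becomes

$$\begin{vmatrix}
\left(-m\omega^2 + \kappa\right) & -\kappa & 0 \\
-\kappa & \left(-M\omega^2 + 2\kappa\right) & -\kappa \\
0 & -\kappa & \left(-m\omega^2 + \kappa\right)
\end{vmatrix} = 0$$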

Note that the same answer is obtained using Newtonian mechanics. That is, the force equation gives

$$m\ddot{x}_1 - \kappa(x_2 - x_1) = 0$$
$$M\ddot{x}_2 + \kappa(x_2 - x_1) - \kappa(x_3 - x_2) = 0$$
$$m\ddot{x}_3 + \kappa(x_3 - x_2) = 0$$

Let the solution be of the form

$$x_j = a_j e^{i\omega t} \qquad j = 1, 2, 3$$
Substituting this solution gives
$$\left(-m\omega^2 + \kappa\right) a_1 - \kappa a_2 = 0$$
$$-\kappa a_1 + \left(-M\omega^2 + 2\kappa\right) a_2 - \kappa a_3 = 0$$
$$-\kappa a_2 + \left(-m\omega^2 + \kappa\right) a_3 = 0$$

This leads to the same secular determinant as given above with the matrix elements clustered along the
diagonal for nearest-neighbor problems.

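Expanding the determinant and collecting terms yields
$$\omega^2\left(-m\omega^2 + \kappa\right)\left(-mM\omega^2 + \kappa M + 2\kappa m\right) = 0$$
Equating any of the three factors to zero gives
$$\omega_1 = 0$$
$$\omega_2 = \sqrt{\frac{\kappa}{m}}$$
$$\omega_3 = \sqrt{\left(\frac{\kappa}{m} + \frac{2\kappa}{M}\right)}$$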
The solutions are:
1) $\omega_1 = 0$: This solution gives $\eta_1 = a\{1, 1, 1\}$. This mode is not an oscillation at all, but is a pure
translation of the system as a whole as shown in the adjacent figure. There is no change in the restoring forces
since the system moves such as not to change the length of the springs, that is, they stay in their equilibrium
positions.

Figure: Normal modes of a linear triatomic molecule.

This motion corresponds to a spurious oscillation of the center of mass that results from referencing the
three atom locations with respect to some fixed reference point. This reference point should have been chosen
as the center of mass since the motion of the center-of-mass already has been taken into account separately.
Spurious center of mass oscillations occur any time that the reference point is not at the center of mass for
an isolated system with no external forces acting.
2) $\omega_2 = \sqrt{\frac{\kappa}{m}}$: This solution corresponds to $\eta_2 = a\{1, 0, -1\}$ and is shown in the adjacent figure. The
central mass $M$ remains stationary while the two end masses vibrate longitudinally in opposite directions
with the same amplitude. This mode has a stationary center of mass. For CO$_2$ the electrical geometry is
O$^-$C$^{++}$O$^-$. Mode 2 for CO$_2$ does not radiate electromagnetically because the center of charge is stationary
with respect to the center of mass, that is, the electric dipole moment is constant.
3) $\omega_3 = \sqrt{\left(\frac{\kappa}{m} + \frac{2\kappa}{M}\right)}$: This solution corresponds to $\eta_3 = a\left\{1, -2\left(\frac{m}{M}\right), 1\right\}$. As shown in the adjacent
figure, this motion corresponds to the two end masses vibrating in unison while the central mass vibrates
oppositely with a different amplitude such that the center-of-mass is stationary. This CO$_2$ mode does radiate
electromagnetically since it corresponds to an oscillating electric dipole.
It is interesting to note that the ratio $\frac{\omega_3}{\omega_2} = 1.915$ for CO$_2$ and the ratio of the two modes is independent
of the potential energy tensor $\mathbf{V}$. That is
$$\frac{\omega_3}{\omega_2} = \sqrt{\left(1 + 2\frac{m}{M}\right)}$$

Transverse modes
The solutions are:
4) $\omega_4 = \sqrt{\frac{2\kappa(2m + M)}{mM}}$: This is the only non-spurious transverse mode $\eta_4$, which corresponds to the two
outside masses vibrating in unison transverse to the symmetry axis while the central mass vibrates oppositely.
This mode radiates electric dipole radiation since the electric dipole is oscillating.
5) $\omega_5 = 0$: This transverse solution $\eta_5$ has all three nuclei vibrating in unison transverse to the symmetry
axis and corresponds to a spurious center of mass oscillation.
6) $\omega_6 = 0$: This transverse solution $\eta_6$ corresponds to a stationary central mass with the two outside
masses vibrating oppositely. This corresponds to a rotational oscillation of the molecule which is spurious
since there are no torques acting on the molecule for a central force. Rotational motion usually is taken into
account separately.
The normal modes for the bent triatomic molecule are similar except that the oscillator coupling strength
is reduced by the factor $\cos\theta$ where $\theta$ is the bend angle.

14.10 Example: Benzene ring

The benzene ring comprises six carbon atoms bound in a plane hexagonal ring. A classical analog of the
benzene ring comprises 6 identical masses $m$ on a frictionless ring of radius $r$ bound by 6 identical springs
with linear spring constant $\kappa$, as illustrated in the adjacent figure. Consider only the in-plane motion; then
the kinetic energy is given by
$$T = \frac{1}{2} m r^2 \sum_{j=1}^{6} \dot{\theta}_j^2$$
The potential energy equals
$$U = \frac{1}{2}\kappa r^2 \sum_{j=1}^{6} (\theta_{j+1} - \theta_j)^2 = \kappa r^2 \left[\sum_{j=1}^{6} \theta_j^2 - \theta_1\theta_2 - \theta_2\theta_3 - \theta_3\theta_4 - \theta_4\theta_5 - \theta_5\theta_6 - \theta_6\theta_1\right]$$
where $j = 7 \equiv 1$. Thus the kinetic energy and potential energy tensors are given by
$$\mathbf{T} = m r^2 \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
\qquad
\mathbf{V} = \kappa r^2 \begin{pmatrix}
2 & -1 & 0 & 0 & 0 & -1 \\
-1 & 2 & -1 & 0 & 0 & 0 \\
0 & -1 & 2 & -1 & 0 & 0 \\
0 & 0 & -1 & 2 & -1 & 0 \\
0 & 0 & 0 & -1 & 2 & -1 \\
-1 & 0 & 0 & 0 & -1 & 2
\end{pmatrix}$$
This nearest-neighbor system includes non-zero $(n, 1)$ and $(1, n)$ elements due to the ring structure. Define
$X = \frac{m\omega^2}{\kappa} - 2$; then the solution of the set of linear homogeneous equations requires that
$$\begin{vmatrix}
X & 1 & 0 & 0 & 0 & 1 \\
1 & X & 1 & 0 & 0 & 0 \\
0 & 1 & X & 1 & 0 & 0 \\
0 & 0 & 1 & X & 1 & 0 \\
0 & 0 & 0 & 1 & X & 1 \\
1 & 0 & 0 & 0 & 1 & X
\end{vmatrix} = 0$$
that is
$$(X - 2)(X - 1)^2 (X + 1)^2 (X + 2) = 0$$
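The eigenvalues and eigenfunctions are given in the table.

Figure: Classical analog of a benzene molecular ring.

n    X     $\omega^2$              Normal modes
1    2     $4\frac{\kappa}{m}$     $\theta_1 - \theta_2 + \theta_3 - \theta_4 + \theta_5 - \theta_6$
2    1     $3\frac{\kappa}{m}$     $-\theta_1 + \theta_3 - \theta_4 + \theta_6$
3    1     $3\frac{\kappa}{m}$     $-\theta_1 + \theta_2 - \theta_4 + \theta_5$
4    -1    $\frac{\kappa}{m}$      $\theta_1 - \theta_3 - \theta_4 + \theta_6$
5    -1    $\frac{\kappa}{m}$      $-\theta_1 - \theta_2 + \theta_4 + \theta_5$
6    -2    $0$                     $\theta_1 + \theta_2 + \theta_3 + \theta_4 + \theta_5 + \theta_6$
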
Note the following properties of the normal modes and their frequencies.
$n = 1$: Adjacent masses vibrate $180^\circ$ out of phase, thus each spring has maximal compression or extension,
leading to the energy of this normal mode being the highest.
$n = 2, 3$: These two solutions are degenerate and correspond to two pairs of masses vibrating out of phase
while the third pair of masses are stationary. Thus the energy of this normal mode is slightly lower than the
$n = 1$ normal mode. Any combination of these degenerate normal modes is an equally good solution.
$n = 4, 5$: From the figure it can be seen that both of these solutions correspond to a center of mass
oscillation and thus these modes are spurious.
$n = 6$: This vibrational mode has zero energy corresponding to zero restoring force and all six masses
moving uniformly in the same direction. This mode corresponds to the rotation of the benzene molecule about
the symmetry axis of the ring which usually is taken into account assuming a separate rotational component.
This classical analog of the benzene molecule is interesting because it simultaneously exhibits degenerate
normal modes, spurious center of mass oscillation, and a rotational mode.
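The table entries can be verified by applying the ring coupling matrix (2 on the diagonal, $-1$ for each of the two neighbours, with wrap-around) to each listed mode. The sketch below does this with integer arithmetic, in units where $\kappa/m = 1$; it also evaluates the closed-form eigenvalues $\omega^2 = \frac{2\kappa}{m}\left(1 - \cos\frac{2\pi k}{6}\right)$ of a six-site ring, a standard result for circulant matrices that is quoted here as an assumption rather than taken from the text. The function names are hypothetical.

```python
import math

def apply_ring(theta):
    """(C theta)_j = 2 theta_j - theta_(j-1) - theta_(j+1), with periodic wrap-around."""
    n = len(theta)
    return [2 * theta[j] - theta[j - 1] - theta[(j + 1) % n] for j in range(n)]

def ring_omega_squared(n_masses, kappa, m):
    """Closed-form omega^2 values for a ring of identical masses and springs."""
    return [(2.0 * kappa / m) * (1.0 - math.cos(2.0 * math.pi * k / n_masses))
            for k in range(n_masses)]

# (omega^2 in units of kappa/m, mode coefficients for theta_1..theta_6)
table = [
    (4, [1, -1, 1, -1, 1, -1]),
    (3, [-1, 0, 1, -1, 0, 1]),
    (3, [-1, 1, 0, -1, 1, 0]),
    (1, [1, 0, -1, -1, 0, 1]),
    (1, [-1, -1, 0, 1, 1, 0]),
    (0, [1, 1, 1, 1, 1, 1]),
]
table_is_consistent = all(apply_ring(v) == [lam * x for x in v] for lam, v in table)

spectrum = sorted(round(w2, 9) for w2 in ring_omega_squared(6, kappa=1.0, m=1.0))
```

The sorted spectrum contains the values 0, 1, 1, 3, 3, 4 in units of $\kappa/m$, matching the $(X + 2)\kappa/m$ column of the table.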

14.10 Discrete Lattice Chain

Crystalline lattices and linear molecules are important classes of coupled oscillator systems where nearest
neighbor interactions dominate. A crystalline lattice comprises thousands of coupled oscillators in a three-
dimensional matrix with atomic spacing of a few $10^{-10}$ m. Even though a full description of the dynamics of
crystalline lattices demands a quantal treatment, a classical treatment is of interest since classical mechanics
underlies many features of the motion of atoms in a crystalline lattice. The linear discrete lattice chain is
the simplest example of many-body coupled oscillator systems that can illuminate the physics underlying a
range of interesting phenomena in solid-state physics. As illustrated in example 2.7, the linear approximation
usually is applicable for small-amplitude displacements of nearest-neighbor interacting systems, which
greatly simplifies treatment of the lattice chain. The linear discrete lattice chain involves three independent
polarization modes, one longitudinal mode, plus two perpendicular transverse modes. The $3n$ degrees of
freedom for the $n$ atoms, on a discrete linear lattice chain, are partitioned with $n$ degrees of freedom for each
of the three polarization modes. These three polarization modes each have $n$ normal modes, or $n$ travelling
waves, and exhibit quantization, dispersion, and can have a complex wave number.

14.10.1 Longitudinal motion

The equations of motion for longitudinal modes of the lattice chain can be derived by considering a linear
chain of $n$ identical masses, of mass $m$, separated by a uniform spacing $d$ as shown in Fig 14.7. Assume
that the $n$ masses are coupled by $n + 1$ springs, with spring constant $\kappa$, where both ends of the chain are
fixed, that is, the displacements $q_0 = q_{n+1} = 0$ and velocities $\dot{q}_0 = \dot{q}_{n+1} = 0$. The force required to stretch a
length $d$ of the chain a longitudinal displacement $q_j$ for mass $j$ is $F_j = \kappa q_j$. Thus the potential energy for
stretching the spring for segment $(q_{j-1} - q_j)$ is $U_j = \frac{\kappa}{2}(q_{j-1} - q_j)^2$. The total potential and kinetic energies
are
$$U = \frac{\kappa}{2}\sum_{j=1}^{n+1} (q_{j-1} - q_j)^2 \tag{14.74}$$
$$T = \frac{1}{2} m \sum_{j=1}^{n} \dot{q}_j^2 \tag{14.75}$$
Since $\dot{q}_{n+1} = 0$ the kinetic energy and Lagrangian can be extended to $j = n + 1$, that is, the Lagrangian can
be written as
$$L = \frac{1}{2}\sum_{j=1}^{n+1} \left(m\dot{q}_j^2 - \kappa (q_{j-1} - q_j)^2\right) \tag{14.76}$$
Using this Lagrangian in the Lagrange-Euler equations gives the following second-order equation of motion
for longitudinal oscillations
$$\ddot{q}_j = \omega_0^2 (q_{j-1} - 2q_j + q_{j+1}) \tag{14.77}$$
where $j = 1, 2, \ldots, n$ and where
$$\omega_0 \equiv \sqrt{\frac{\kappa}{m}} \tag{14.78}$$

Figure 14.7: Portion of a lattice chain of identical masses $m$ connected by identical springs of spring
constant $\kappa$. The displacement of the $j$th mass from the equilibrium position is $q_j$, assumed to be positive to
the right.

14.10.2 Transverse motion

The equations of motion for transverse motion on a linear discrete lattice chain, illustrated in figure 14.8,
can be derived by considering the displacements $q_j$ of the $j$th mass for $n$ identical masses, with mass $m$,
separated by equal spacings $d$, and assuming that the tension in the string is $\tau$. Assuming that the
transverse deflections $q_j$ are small, then the spring between masses $j - 1$ and $j$ is stretched to a length
$$d' = \sqrt{d^2 + (q_j - q_{j-1})^2} \tag{14.79}$$

Thus the incremental stretching is
$$\delta d \sim \frac{(q_j - q_{j-1})^2}{2d} \tag{14.80}$$
The work done against the tension $\tau$ is $\tau \cdot \delta d$ per segment. Thus the total potential energy is
$$U = \frac{\tau}{2d}\sum_{j=1}^{n+1} (q_{j-1} - q_j)^2 \tag{14.81}$$
where $q_0$ and $q_{n+1}$ are identically zero.
The kinetic energy is
$$T = \frac{1}{2} m \sum_{j=1}^{n} \dot{q}_j^2 \tag{14.82}$$
Since $\dot{q}_{n+1} = 0$ the kinetic energy and Lagrangian summations can be extended to $j = n + 1$, that is
$$L = \frac{1}{2}\sum_{j=1}^{n+1} \left(m\dot{q}_j^2 - \frac{\tau}{d}(q_{j-1} - q_j)^2\right) \tag{14.83}$$

Figure 14.8: Transverse motion of a linear discrete lattice chain.

Using this Lagrangian in the Lagrange-Euler equations gives the following second-order equation of motion
for transverse oscillations
$$\ddot{q}_j = \omega_0^2 (q_{j-1} - 2q_j + q_{j+1}) \tag{14.84}$$
where $j = 1, 2, \ldots, n$ and
$$\omega_0 \equiv \sqrt{\frac{\tau}{md}} \tag{14.85}$$
The normal modes for the transverse modes comprise standing waves that satisfy the same boundary
conditions as for the longitudinal modes. The $n$ equations of motion for longitudinal motion, equation
14.77, or transverse motion, equation 14.84, are identical in form. The major difference is that $\omega_0$ for the
transverse normal modes, $\omega_0 \equiv \sqrt{\frac{\tau}{md}}$, differs from that for the longitudinal modes, which is $\omega_0 \equiv \sqrt{\frac{\kappa}{m}}$. Thus
the following discussion of the normal modes on a discrete lattice chain is identical in form for both transverse
and longitudinal waves.
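The step from the lattice-chain Lagrangian to the equation of motion can be written out explicitly. The following is a sketch of that intermediate algebra for the longitudinal case, using the symbols $q_j$, $m$, $\kappa$, $\tau$, $d$, and $\omega_0$ as they are used in this section; the transverse case follows by replacing $\kappa$ with $\tau/d$.

```latex
% q_j appears in exactly two terms of the sum in the Lagrangian:
% the segment (q_{j-1} - q_j) and the segment (q_j - q_{j+1}).
\begin{align*}
\frac{\partial L}{\partial \dot{q}_j} &= m\dot{q}_j
  \qquad\Longrightarrow\qquad
  \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_j} = m\ddot{q}_j \\
\frac{\partial L}{\partial q_j}
  &= -\frac{\kappa}{2}\left[-2\,(q_{j-1} - q_j) + 2\,(q_j - q_{j+1})\right]
   = \kappa\,(q_{j-1} - 2q_j + q_{j+1}) \\
m\ddot{q}_j &= \kappa\,(q_{j-1} - 2q_j + q_{j+1})
  \qquad\Longrightarrow\qquad
  \ddot{q}_j = \omega_0^2\,(q_{j-1} - 2q_j + q_{j+1}),
  \quad \omega_0^2 = \frac{\kappa}{m}
\end{align*}
```

With $\kappa \rightarrow \tau/d$ the same steps give $\omega_0^2 = \tau/(md)$ for transverse motion.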

14.10.3 Normal modes

The normal modes of the $n$ equations of motion on the discrete lattice chain are either longitudinal or
transverse standing waves that satisfy the boundary conditions at the extreme ends of the lattice chain.
The solutions can be given by assuming that the $n$ identical masses on the chain oscillate with a common
frequency $\omega$. Then the displacement amplitude for the $j$th mass can be written in the form
$$q_j(t) = a_j e^{i\omega t} \tag{14.86}$$
where the amplitude $a_j$ can be complex. Substitution into the preceding $n$ equations of motion, 14.77, 14.84,
yields the following recursion relation
$$\left(-\omega^2 + 2\omega_0^2\right) a_j - \omega_0^2 (a_{j-1} + a_{j+1}) = 0 \tag{14.87}$$
where $j = 1, 2, \ldots, n$. Note that the boundary conditions, $q_0 = 0$ and $q_{n+1} = 0$, require that $a_0 = a_{n+1} = 0$.
The above recursion relation corresponds to a system of $n$ homogeneous algebraic equations with $n$
unknowns $a_1, a_2, \ldots, a_n$. A non-trivial solution is given by setting the determinant of its coefficients equal to
zero
$$\begin{vmatrix}
-\omega^2 + 2\omega_0^2 & -\omega_0^2 & 0 & \cdots & 0 \\
-\omega_0^2 & -\omega^2 + 2\omega_0^2 & -\omega_0^2 & \cdots & 0 \\
0 & -\omega_0^2 & -\omega^2 + 2\omega_0^2 & \cdots & \vdots \\
\vdots & \vdots & \ddots & \ddots & -\omega_0^2 \\
0 & 0 & \cdots & -\omega_0^2 & -\omega^2 + 2\omega_0^2
\end{vmatrix} = 0 \tag{14.88}$$

This secular determinant corresponds to the special case of nearest neighbor interactions with the kinetic
energy tensor $\mathbf{T}$ being diagonal and the potential energy tensor $\mathbf{V}$ involving coupling only to adjacent
masses. The secular determinant is of order $n$ and thus determines exactly $n$ eigenfrequencies $\omega_r$ for each
polarization mode.
For large $n$ the solution of this problem is more efficiently obtained by using a recursion relation approach,
rather than solving the above secular determinant. The trick is to assume that the phase differences $\phi_r$
between the motion of adjacent masses all are identical for a given polarization. Then the amplitude for the
$j$th mass for the $r$th frequency mode $\omega_r$ is of the form
$$a_{jr} = a_r e^{i(j\phi_r - \delta_r)} \tag{14.89}$$
Inserting the above into the recursion relation (14.87) gives
$$\left(-\omega_r^2 + 2\omega_0^2\right) - \omega_0^2\left[e^{-i\phi_r} + e^{i\phi_r}\right] = 0 \tag{14.90}$$
which reduces to
$$\omega_r^2 = 2\omega_0^2 - 2\omega_0^2\cos\phi_r = 4\omega_0^2\sin^2\frac{\phi_r}{2}$$
that is
$$\omega_r = 2\omega_0\sin\frac{\phi_r}{2} \tag{14.91}$$
where $r = 1, 2, 3, \ldots, n$.
Now it is necessary to determine the phase angle $\phi_r$, which can be done by applying the boundary
conditions for standing waves on the lattice chain. These boundary conditions for stationary modes require
that the ends of the lattice chain are nodes, that is $a_{0r} = a_{(n+1)r} = 0$. Using the fact that only the real
part of $a_{jr}$ has physical meaning leads to the amplitude for the $j$th mass for the $r$th mode to be
$$a_{jr} = a_r\cos(j\phi_r - \delta_r) \tag{14.92}$$
The boundary condition $a_{0r} = 0$ requires that the phase $\delta_r = \frac{\pi}{2}$. That is
$$a_{jr} = a_r\cos\left(j\phi_r - \frac{\pi}{2}\right) = a_r\sin j\phi_r \tag{14.93}$$
where $j = 1, 2, \ldots, n$.
The boundary condition for $j = n + 1$ gives
$$a_{(n+1)r} = 0 = a_r\sin(n + 1)\phi_r \tag{14.94}$$
Therefore
$$(n + 1)\phi_r = r\pi \tag{14.95}$$
where $r = 1, 2, 3, \ldots, n$. That is
$$\phi_r = \frac{r\pi}{n + 1} = \frac{r\pi d}{(n + 1)d} = \frac{r\pi d}{L} = k_r d \tag{14.96}$$
where $L = (n + 1)d$ is the total length of the discrete lattice chain.

The $n$ eigenfrequencies for a given polarization are given by
$$\omega_r = 2\omega_0\sin\frac{r\pi}{2(n + 1)} = 2\omega_0\sin\frac{r\pi d}{2(n + 1)d} = 2\omega_0\sin\frac{r\pi d}{2L} = 2\omega_0\sin\frac{k_r d}{2} \tag{14.97}$$
where the corresponding wavenumber $k_r$ is given by
$$k_r = \frac{r\pi}{(n + 1)d} = \frac{r\pi}{L} = \frac{2\pi}{\lambda_r} \tag{14.98}$$
This implies that the normal modes are quantized with half-wavelengths $\frac{\lambda_r}{2} = \frac{L}{r}$.
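The standing-wave solution can be checked by substituting the mode shape $a_{jr} \propto \sin\frac{jr\pi}{n+1}$ and the frequency $\omega_r = 2\omega_0\sin\frac{r\pi}{2(n+1)}$ back into the recursion relation 14.87. The following is a minimal sketch of that substitution for a chain with $n = 5$; the function names and the value of $\omega_0$ are illustrative assumptions.

```python
import math

def mode_frequency(r, n, omega0):
    """omega_r = 2 omega0 sin(r pi / (2 (n + 1)))."""
    return 2.0 * omega0 * math.sin(r * math.pi / (2 * (n + 1)))

def mode_shape(r, n):
    """a_jr / a_r = sin(j r pi / (n + 1)) for j = 0 .. n+1 (fixed ends included)."""
    return [math.sin(j * r * math.pi / (n + 1)) for j in range(n + 2)]

def recursion_residual(r, n, omega0):
    """Largest violation of (-w^2 + 2 w0^2) a_j - w0^2 (a_(j-1) + a_(j+1)) = 0."""
    w2 = mode_frequency(r, n, omega0) ** 2
    a = mode_shape(r, n)
    return max(abs((2 * omega0 ** 2 - w2) * a[j] - omega0 ** 2 * (a[j - 1] + a[j + 1]))
               for j in range(1, n + 1))

n, omega0 = 5, 1.0                     # assumed, illustrative values only
frequencies = [mode_frequency(r, n, omega0) for r in range(1, n + 1)]
worst_residual = max(recursion_residual(r, n, omega0) for r in range(1, n + 1))
```

The residual vanishes to rounding error for every $r$, and all $n$ frequencies lie below $2\omega_0$.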

Figure 14.9 (six panels, $r = 1$ to $r = 6$; amplitude from $-1.0$ to $1.0$ versus fractional distance along the chain from
0 to 1.0): Plots of the maximal vibrational amplitudes $a_{jr}$ for the $r$th frequency sinusoidal mode, versus
distance along the chain, for transverse normal modes of a vibrating discrete lattice with $n = 5$. Only $r =
1, 2, 3, 4, 5$ are distinct modes because $r = 6$ is a null mode. Note that the modes with $r = 7, 8, 9, 10, 11, 12$,
shown dashed, duplicate the locations of the mass displacement given by the lower-order modes.

Combining equations 1496 and 1493 gives the maximum amplitudes for the eigenvectors to be
 
 =  sin 
(14.99)
2
For  independent linear oscillators there are only  independent normal modes, that is, for  =  + 1 the
sine function in equation 1497 must be zero. Beyond  =  the equations do not describe physically new
situations. This is illustrated by figure 149 which shows the transverse modes of a lattice chain with  = 5.
There are only  = 5 independent normal modes of this system since  =  + 1 = 6 corresponds to a null
mode with all  () = 0. Also note that the solutions for    + 1 shown dashed, replicate the mass
locations of modes with    + 1, that is, the modes with   6 are replicas of the lower-order modes.
Note that   has a maximum value   ≤ 2 0 since the sine function cannot exceed unity. This leads
to a maximum frequency   = 2 0  called the cut-oﬀ frequency, which occurs when   = . That is, the
null-mode occurs when  =  + 1 for which equation 1499 equals zero The range of  quantized normal
modes that can occur is intuitive. That is, the longest half-wavelength max
2 =  = ( + 1) equals the total

length of the discrete lattice chain. The shortest half-wavelength −
2

=  is set by the lattice spacing.
Thus the discrete wavenumbers of the normal modes, for each polarization, range from 1 to 1 where  is
an integer.
Assuming real   the normal coordinate  and corresponding frequency   are,
 =    (14.100)
Equations 1497 and 1499 give the angular frequency and displacement. Note that superposition applies
since this system is linear. Therefore the most general solution for each polarization can be any superposition
of the form  ∙ ¸
X 
 () =  sin (14.101)
=1
( + 1)
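The statement that modes beyond $r = n$ add nothing new can be illustrated directly from the sine form of the amplitudes. The sketch below, for $n = 5$, checks that $r = n + 1$ gives zero displacement at every mass and that each mode $r > n + 1$ reproduces, up to an overall sign, the displacements of mode $2(n+1) - r$. The pairing rule is a consequence of $\sin(2\pi j - x) = -\sin x$ and is stated here as an observation, not quoted from the text; the names are hypothetical.

```python
import math

def displacements(r, n):
    """Relative amplitudes sin(j r pi / (n + 1)) at the masses j = 1 .. n."""
    return [math.sin(j * r * math.pi / (n + 1)) for j in range(1, n + 1)]

n = 5
null_mode = displacements(n + 1, n)                    # r = 6: all masses at rest

# Each higher index r is paired with the lower-order mode 2(n + 1) - r.
aliases = {r: 2 * (n + 1) - r for r in range(n + 2, 2 * (n + 1))}

def replicates_with_sign_flip(r, s, n, tol=1e-12):
    """True when mode r equals minus mode s at every mass position."""
    return all(abs(x + y) < tol for x, y in zip(displacements(r, n), displacements(s, n)))

all_replicated = all(replicates_with_sign_flip(r, s, n) for r, s in aliases.items())
```

For $n = 5$ the pairs are $7 \leftrightarrow 5$, $8 \leftrightarrow 4$, $9 \leftrightarrow 3$, $10 \leftrightarrow 2$, and $11 \leftrightarrow 1$.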

14.10.4 Travelling waves

Travelling waves are equally good solutions of the equations of motion 14.77, 14.84 as are the normal modes.
Travelling waves on the one-dimensional lattice chain will be of the form
$$q(j, t) = a e^{i(\omega t \pm kx)} \tag{14.102}$$
where the distance along the chain $x = jd$, that is, it is quantized in units of the cell spacing $d$, with $j$ being
an integer. The positive sign in the exponent corresponds to a wave travelling in the $-x$ direction while
the negative sign corresponds to a wave travelling in the $+x$ direction. The velocity of a fixed phase of the
travelling wave must satisfy that $\omega t \pm kx$ is a constant. This will occur if the phase velocity of the wave is
given by
$$v_{phase} = \left|\frac{dx}{dt}\right| = \frac{\omega}{k} \tag{14.103}$$
The wave has a frequency $\nu = \frac{\omega}{2\pi}$ and wavelength $\lambda = \frac{2\pi}{k}$, thus the phase velocity $v_{phase} = \frac{\omega}{k} = \nu\lambda$.
Inserting the travelling wave 14.102 into the transverse equation of motion 14.84 for the discrete lattice
chain gives, for each mass $j = 1, 2, \ldots, n$,
$$-\omega_k^2 = \omega_0^2\left(e^{-ikd} - 2 + e^{ikd}\right) \tag{14.104}$$
That is
$$\omega_k = \pm 2\omega_0\sin\frac{kd}{2} \tag{14.105}$$
The phase $kd$ is determined by the Born-von Karman periodic boundary condition that assumes that the
chain is duplicated indefinitely on either side. Thus, for $n$ discrete masses, $k$ must satisfy the
condition that $q_j = q_{j+n}$. That is
$$e^{iknd} = 1 \tag{14.106}$$
That is
$$k_r = \frac{2\pi r}{nd} \tag{14.107}$$
Note that the periodic boundary condition gives $n$ discrete modes for wavenumbers between
$$-\frac{\pi}{d} \leq k_r \leq +\frac{\pi}{d} \tag{14.108}$$
where the index
$$r = -\frac{n}{2}, \; -\frac{n}{2} + 1, \; \ldots, \; \frac{n}{2} - 1, \; \frac{n}{2}$$
Thus equation 14.105 becomes
$$\omega_r = \pm 2\omega_0\sin\frac{k_r d}{2} \tag{14.109}$$
Equation 14.109 is a dispersion relation that is identical to equation 14.97 derived during the discussion of the
normal modes of the lattice chain. This confirms that the travelling waves on the lattice chain are equally
good solutions as the normal standing-wave modes. Clearly, superposition of the standing-wave normal modes
can lead to travelling waves and vice versa.

Figure 14.10: Plot of the dispersion curve ($\omega$ versus $k$) for a monoatomic linear lattice chain subject to only
nearest neighbor interactions. The first Brillouin zone is the segment between $-\frac{\pi}{d} \leq k \leq \frac{\pi}{d}$ which covers all
independent solutions.

14.10.5 Dispersion

The lattice chain is an interesting example of a dispersive system in that $\omega_k$ is a function of $k$. Figure 14.10
shows a plot of the dispersion curve ($\omega$ versus $k$) for a monoatomic linear lattice chain subject to only nearest
neighbor interactions. Note that $\omega$ depends linearly on $k$ for small $k$ and that $\frac{d\omega}{dk} = 0$ at the boundaries of
the first Brillouin zone.
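The allowed travelling-wave wavenumbers and their frequencies can be tabulated in a few lines. This sketch assumes an even number of masses and takes the index over the half-open range $r = -\frac{n}{2}, \ldots, \frac{n}{2} - 1$, since the two zone-boundary values $k = \pm\frac{\pi}{d}$ describe the same displacement pattern; that convention, the function names, and the numerical values are assumptions for illustration.

```python
import cmath
import math

def allowed_wavenumbers(n, d):
    """k_r = 2 pi r / (n d) for r = -n/2, ..., n/2 - 1 (n assumed even)."""
    return [2.0 * math.pi * r / (n * d) for r in range(-n // 2, n // 2)]

def dispersion(k, d, omega0):
    """Magnitude of the frequency, omega = 2 omega0 |sin(k d / 2)|."""
    return 2.0 * omega0 * abs(math.sin(k * d / 2.0))

n, d, omega0 = 8, 1.0, 1.0             # assumed, illustrative values only
ks = allowed_wavenumbers(n, d)

# Born-von Karman condition: exp(i k n d) = 1 for every allowed k.
periodic = all(abs(cmath.exp(1j * k * n * d) - 1.0) < 1e-12 for k in ks)

omegas = [dispersion(k, d, omega0) for k in ks]
```

The largest frequency in the list is the cut-off value $2\omega_0$, reached at the zone boundary, and the $k = 0$ entry has zero frequency.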

The lattice chain has a phase velocity for the $k$th wave given by
$$v_{phase} = \frac{\omega_k}{k} = \omega_0 d\,\frac{\left|\sin\frac{kd}{2}\right|}{\frac{kd}{2}} \tag{14.110}$$
while the group velocity is
$$v_{group} = \left(\frac{d\omega}{dk}\right) = \omega_0 d\cos\frac{kd}{2} \tag{14.111}$$
Note that in the limit when $\frac{kd}{2} \rightarrow 0$ the phase velocity and group velocity are identical, that is, $v_{phase} =
v_{group} = \omega_0 d$.

14.10.6 Complex wavenumber

The maximum allowed frequency, which is called the cut-off frequency, $\omega_c = 2\omega_0$, occurs when $kd = \pi$, that
is, $\frac{\lambda}{2} = d$. That is, the minimum half-wavelength equals the spacing $d$ between the discrete masses. At the
cut-off frequency, the phase velocity is $v_{phase} = \frac{2}{\pi}\omega_0 d$ and the group velocity $v_{group} = 0$.
It is interesting to note that $\omega$ can exceed the cut-off frequency $\omega_c = 2\omega_0$ if $k$ is assumed to be complex,
that is, if
$$k = k_R - i\Gamma \tag{14.112}$$
Then
$$\omega = 2\omega_0\sin\frac{kd}{2} = 2\omega_0\sin\frac{d}{2}(k_R - i\Gamma) = 2\omega_0\left(\sin\frac{k_R d}{2}\cosh\frac{\Gamma d}{2} - i\cos\frac{k_R d}{2}\sinh\frac{\Gamma d}{2}\right) \tag{14.113}$$
To ensure that $\omega$ is real, the imaginary term must be zero, that is
$$\cos\frac{k_R d}{2} = 0 \tag{14.114}$$
Therefore
$$\sin\frac{k_R d}{2} = 1 \tag{14.115}$$
that is, $k_R = \frac{\pi}{d}$, and the dispersion relation between $\omega$ and $\Gamma$ for $\omega > 2\omega_0$ becomes
$$\omega = 2\omega_0\cosh\frac{\Gamma d}{2} \tag{14.116}$$
which increases with $\Gamma$. Thus, when $\omega > \omega_c = 2\omega_0$ then the amplitude of the wave is of the form
$$q(x, t) = a e^{-\Gamma x} e^{i(\omega t - k_R x)} \tag{14.117}$$
which corresponds to a spatially damped oscillatory wave with phase velocity
$$v_{phase} = \frac{\omega}{k_R} \tag{14.118}$$
and damping factor $\Gamma$.
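The above-cut-off solution can be checked numerically with complex arithmetic. The sketch below picks a driving frequency above $2\omega_0$, solves $\omega = 2\omega_0\cosh\frac{\Gamma d}{2}$ for $\Gamma$, and confirms both that the complex wavenumber $k = \frac{\pi}{d} - i\Gamma$ returns a real frequency and that the damped wave satisfies the lattice equation of motion. The symbol names, function names, and numerical values are assumptions chosen for illustration.

```python
import cmath
import math

def damping_constant(omega, d, omega0):
    """Gamma from omega = 2 omega0 cosh(Gamma d / 2); requires omega >= 2 omega0."""
    return (2.0 / d) * math.acosh(omega / (2.0 * omega0))

def frequency_from_k(k_real, gamma, d, omega0):
    """Evaluate omega = 2 omega0 sin(k d / 2) for complex k = k_real - i gamma."""
    k = complex(k_real, -gamma)
    return 2.0 * omega0 * cmath.sin(k * d / 2.0)

def recursion_residual(omega, k, d, omega0, j=3):
    """|-w^2 a_j - w0^2 (a_(j-1) - 2 a_j + a_(j+1))| for a_j = exp(-i k j d)."""
    def a(idx):
        return cmath.exp(-1j * k * idx * d)
    return abs(-omega ** 2 * a(j) - omega0 ** 2 * (a(j - 1) - 2 * a(j) + a(j + 1)))

d, omega0 = 1.0, 1.0                   # assumed, illustrative values only
omega = 3.0                            # above the cut-off frequency 2 omega0
gamma = damping_constant(omega, d, omega0)
omega_check = frequency_from_k(math.pi / d, gamma, d, omega0)
attenuation_per_cell = math.exp(-gamma * d)   # amplitude ratio between adjacent masses
residual = recursion_residual(omega, complex(math.pi / d, -gamma), d, omega0)
```

The imaginary part of `omega_check` is zero to rounding error, and the amplitude falls by the factor $e^{-\Gamma d}$ from one mass to the next.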
There are many examples in physics where the wavenumber is complex as exhibited by the discrete lattice
chain for $\frac{\lambda}{2} \leq d$. Other examples are electromagnetic waves in conductors or plasma (example 3.5), matter
waves tunnelling through a potential barrier, or standing waves on musical instruments which have a complex
wavenumber $k$ due to damping.
This simple toy model of the discrete linear lattice chain has illustrated that classical mechanics explains
many features of the many-body nearest-neighbor coupled linear oscillator system, including normal modes,
standing and travelling waves, cut-off frequency, dispersion, and complex wavenumber. These phenomena
feature prominently in applications of the quantal discrete coupled-oscillator system to solid-state physics.

14.11 Damped coupled linear oscillators

The discussion of coupled linear oscillators has neglected non-conservative damping forces which always exist
to some extent in physical systems. In general, dissipative forces are non-linear, which greatly complicates
solving the equations of motion for such coupled oscillator systems. However, for some systems the dissipative
forces depend linearly on velocity, which allows use of the Rayleigh dissipation function, described in chapter
10.4. The most general definition of the Rayleigh dissipation function, 10.4, was given to be
$$\mathcal{R} = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} b_{ij}\dot{q}_i\dot{q}_j \tag{14.119}$$
For this special case, it was shown in chapter 10 that the Lagrange equations can be written in terms of the
Rayleigh dissipation function as
$$\left\{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_j}\right) - \frac{\partial L}{\partial q_j}\right\} + \frac{\partial \mathcal{R}}{\partial \dot{q}_j} = Q_j \tag{14.120}$$
where $Q_j$ are generalized forces acting on the system that are not absorbed into the potential $U$. Using
equations 14.43, 14.44 and 14.120 allows the equations of motion for damped coupled linear oscillators to
be written in a matrix form as
$$\{\mathbf{T}\}\ddot{\mathbf{q}} + \{\mathbf{C}\}\dot{\mathbf{q}} + \{\mathbf{V}\}\mathbf{q} = \{\mathbf{Q}\} \tag{14.121}$$
where the symmetric matrices $\{\mathbf{T}\}$, $\{\mathbf{C}\}$, and $\{\mathbf{V}\}$ are positive definite for positive definite systems. Rayleigh
pointed out that in the special case where the damping matrix $\{\mathbf{C}\}$ is a linear combination of the $\{\mathbf{T}\}$ and
$\{\mathbf{V}\}$ matrices, then the matrix $\{\mathbf{C}\}$ is diagonal in the normal coordinates, leading to a separation of the damped
system into normal modes. As discussed in chapter 4, many systems in nature are linear for small amplitude
oscillations, allowing use of the Rayleigh dissipation function which provides an analytic solution. However, in
general, except for when $\{\mathbf{C}\}$ is small, this separation into normal modes is not possible for damped systems and
the solutions must be obtained numerically.
The following example illustrates approaches used to handle linearly-damped coupled-oscillator systems.

14.11 Example: Two linearly-damped coupled linear oscillators

Consider the two coupled oscillator system shown where the two carts have spring constants $\kappa_1$, $\kappa_2$ and
linear damping constants $b_1$, $b_2$. As discussed in example 14.3, the kinetic energy is given by
$$T = \frac{1}{2} m_1\dot{x}_1^2 + \frac{1}{2} m_2\dot{x}_2^2 \tag{a}$$
and the potential energy is given by
$$U = \frac{1}{2}\left[\kappa_1 x_1^2 + \kappa_2 (x_2 - x_1)^2\right] = \frac{1}{2}\left[(\kappa_1 + \kappa_2) x_1^2 - 2\kappa_2 x_1 x_2 + \kappa_2 x_2^2\right] \tag{b}$$

Figure: Two linearly-damped coupled linear oscillators.

Similarly the Rayleigh dissipation function has the form
$$\mathcal{R} = \frac{1}{2}\left[b_1\dot{x}_1^2 + b_2 (\dot{x}_2 - \dot{x}_1)^2\right] = \frac{1}{2}\left[(b_1 + b_2)\dot{x}_1^2 - 2b_2\dot{x}_1\dot{x}_2 + b_2\dot{x}_2^2\right] \tag{c}$$
Inserting $(a)$, $(b)$, and $(c)$ into equation 14.120 gives the two equations of motion to be

$$m_1\ddot{x}_1 + (b_1 + b_2)\dot{x}_1 - b_2\dot{x}_2 + (\kappa_1 + \kappa_2) x_1 - \kappa_2 x_2 = 0$$

$$m_2\ddot{x}_2 - b_2\dot{x}_1 + b_2\dot{x}_2 - \kappa_2 x_1 + \kappa_2 x_2 = 0$$
When the drag is zero the solution of these two coupled equations can be separated into two independent
normal modes of the system as described earlier. Usually it is not possible to separate the motion into
decoupled normal modes except for certain cases where the dissipative forces can be described by Rayleigh’s
dissipation function.
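Whether the damping leaves the normal modes decoupled can be tested by projecting the damping matrix onto the undamped mode vectors. The sketch below does this for the two-cart system, comparing damping constants proportional to the spring constants (so that the damping matrix is a multiple of the potential energy tensor) with an arbitrary choice. The function names, the symbols `b1`, `b2` for the damping constants, and all numerical values are assumptions for illustration.

```python
import math

def bilinear(M, u, v):
    """u^T M v for a 2x2 matrix M."""
    return sum(u[i] * M[i][j] * v[j] for i in range(2) for j in range(2))

def undamped_modes(m1, m2, k1, k2):
    """Eigenvalues omega^2 and unnormalized eigenvectors of V a = omega^2 T a."""
    v11, v12, v22 = k1 + k2, -k2, k2
    # det(V - lam T) = m1 m2 lam^2 - (v11 m2 + v22 m1) lam + (v11 v22 - v12^2)
    qa, qb, qc = m1 * m2, -(v11 * m2 + v22 * m1), v11 * v22 - v12 ** 2
    root = math.sqrt(qb * qb - 4.0 * qa * qc)
    lams = [(-qb - root) / (2.0 * qa), (-qb + root) / (2.0 * qa)]
    # First row of (V - lam T) a = 0 gives a = (-v12, v11 - lam m1).
    return [(lam, [-v12, v11 - lam * m1]) for lam in lams]

def damping_matrix(b1, b2):
    """Damping matrix read off from the two equations of motion."""
    return [[b1 + b2, -b2], [-b2, b2]]

m1, m2, k1, k2 = 1.0, 2.0, 3.0, 1.0          # assumed, illustrative values only
(_, mode_a), (_, mode_b) = undamped_modes(m1, m2, k1, k2)

# Damping proportional to the springs: C = 0.1 V, so the modes stay decoupled.
proportional = bilinear(damping_matrix(0.1 * k1, 0.1 * k2), mode_a, mode_b)
# Arbitrary damping constants: the off-diagonal modal term no longer vanishes.
general = bilinear(damping_matrix(0.3, 0.5), mode_a, mode_b)
```

A vanishing off-diagonal modal term means each normal coordinate obeys its own damped-oscillator equation; a non-zero term means the damping mixes the undamped modes.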

14.12 Collective synchronization of coupled oscillators

Collective synchronization of coupled oscillators is a multifaceted phenomenon where large ensembles of
coupled oscillators, with comparable natural frequencies, self synchronize leading to coherent collective modes
of motion. Biological examples include congregations of synchronously flashing fireflies, crickets that chirp in
unison, an audience clapping at the end of a performance, networks of pacemaker cells in the heart, insulin-
secreting cells in the pancreas, as well as neural networks in the brain and spinal cord that control rhythmic
behaviors such as breathing, walking, and eating. Example 14.13 illustrates an application to nuclei.
An ensemble of coupled oscillators will have a frequency distribution with a finite width. It is interesting
to elucidate how an ensemble of coupled oscillators that has a finite-width frequency distribution can
self-synchronize its motion to a unique common frequency, and how that synchronization is maintained over
long time periods. The answers to these issues provide insight into the dynamics of coupled oscillators.
The discussion of coupled oscillators has implicitly assumed $N$ identical undamped linear oscillators that
have identical, infinitely-sharp, natural frequencies $\omega_i$. In nature typical coupled oscillators can have a
finite-width frequency distribution $g(\omega)$ about some average value, due to the natural variability of the oscillator
parameters for biological systems, the manufacturing tolerances for mechanical oscillators, or the natural
Lorentzian frequency distribution associated with the uncertainty principle that occurs even for atomic clocks
where the oscillator frequencies are defined directly by the physical constants. The following discussion assumes
that the ensemble of coupled oscillators has such a frequency distribution $g(\omega)$.
Undamped linear oscillators have elliptical closed-path trajectories in phase space whereas dissipation
leads to a spiral attractor unless the system is driven such as to preserve the total energy. As described
in chapter 4.4, many systems in nature, especially biological systems, have closed limit cycles in phase
space where the energy lost to dissipation is replenished by a driving mechanism. The simplest systems for
understanding collective synchronization of coupled oscillators are those that involve closed limit cycles in
phase space.
N. Wiener first recognized the ubiquity of collective synchronization in the natural world, but his
mathematical approach, based on Fourier integrals, was not suited to this problem. A more fruitful approach was
pioneered in 1967 by an undergraduate student A.T. Winfree[Win67] who recognized that the long-time
behavior of a large ensemble of limit-cycle oscillators can be characterized in the simplest terms by considering
only the phase of closed phase-space trajectories. He assumed that the instantaneous state of an ensemble
of oscillators can be represented by points distributed around the circular phase-space diagram shown in
figure 14.11. For uncoupled oscillators these points will be distributed randomly around the circle, whereas
coupling of the oscillators will result in a spatial correlation of the points. That is, the dynamics of the
phases can be visualized as a swarm of points running around the unit circle in the complex plane of the
phase space diagram. The complex order parameter of this swarm can be defined to be the magnitude and
phase of the centroid of this swarm
$$r e^{i\psi} = \frac{1}{N} \sum_{j=1}^{N} e^{i\theta_j} \tag{14.122}$$

The centroid of the ensemble of points on the phase diagram has a
magnitude $r$ designating the offset of the centroid from the center of
the circular phase diagram, and $\psi$ which is the phase of this centroid.
A uniform distribution of points around the unit circle will lead to a
centroid $r = 0$. Correlated motion leads to a bunching of the points
around some phase value leading to a non-zero centroid $r$ and angle
$\psi$. If the swarm acts like a fully-coupled single oscillator then $r \approx 1$
with an appropriate phase $\psi$.

Figure 14.11: Order parameter for weakly-coupled oscillators.

The Kuramoto model[Kur75, Str00] incorporates Winfree's
intuition by mapping the limit cycles onto a simple circular phase
diagram and incorporating the long-term dynamics of coupled
oscillators in terms of the relative phases for a mean-field system. That
is, the angular velocity of the phase $\dot{\theta}_i$ for the $i^{th}$ oscillator is
$$\dot{\theta}_i = \omega_i + \sum_{j=1}^{N} \Gamma_{ij}(\theta_j - \theta_i) \tag{14.123}$$

Figure 14.12: Kuramoto model of collective synchronization of coupled oscillators. The left and center
plots show the time and coupling strength dependence of the order parameter $r$. The right plot shows the
frequency dependence including coupling (solid line) and without coupling (dashed line).

where $i = 1, 2, \ldots, N$. Kuramoto recognized that mean-field coupling was the most tractable system to solve,
that is, a system where the coupling is applicable equally to all the oscillators. Moreover, he assumed an
equally-weighted, pure sinusoidal coupling for the coupling term $\Gamma_{ij}(\theta_j - \theta_i)$ between the coupled oscillators.
That is, he assumed
$$\Gamma_{ij}(\theta_j - \theta_i) = \frac{K}{N} \sin(\theta_j - \theta_i) \tag{14.124}$$
where $K \geq 0$ is the coupling strength, and the factor $1/N$ ensures that the model is well behaved as $N \to \infty$.
Kuramoto assumed that the frequency distribution $g(\omega)$ was unimodal and symmetric about the mean
frequency $\Omega$, that is $g(\Omega + \omega) = g(\Omega - \omega)$.
This problem can be simplified by exploiting the rotational symmetry and transforming to a frame of
reference that is rotating at an angular frequency $\Omega$. That is, use the transformation $\theta_i' = \theta_i - \Omega t$ where
$\theta_i'$ is measured in the rotating frame. This makes $g(\omega)$ unimodal with a symmetric frequency distribution
about $\omega = 0$. Dropping the primes, the phase velocity in this rotating frame is
$$\dot{\theta}_i = \omega_i + \frac{K}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_i) \tag{14.125}$$


Kuramoto observed that the phase-space distribution can be expressed in terms of the order parameters $r, \psi$
in that equation 14.122 can be multiplied on both sides by $e^{-i\theta_i}$ to give
$$r e^{i(\psi - \theta_i)} = \frac{1}{N} \sum_{j=1}^{N} e^{i(\theta_j - \theta_i)} \tag{14.126}$$
Equating the imaginary parts yields
$$r \sin(\psi - \theta_i) = \frac{1}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_i) \tag{14.127}$$
This allows equation 14.125 to be written as
$$\dot{\theta}_i = \omega_i + K r \sin(\psi - \theta_i) \tag{14.128}$$
for $i = 1, 2, \ldots, N$. Equation 14.128 reflects the mean-field aspect of the model in that each oscillator $\theta_i$ is
attracted to the phase of the mean field $\psi$ rather than to the phase of another individual oscillator.
Simulations showed that the evolution of the order parameter with coupling strength $K$ is as illustrated
in figure 14.12. This simulation shows the following. (1) For all $K$ below a certain threshold $K_c$, the order parameter
decays to an incoherent jitter as expected for random scatter of $N$ points. (2) When $K > K_c$ this incoherent
state becomes unstable and the order parameter $r$ grows exponentially reflecting the nucleation of small
clusters of oscillators that are mutually synchronized. (3) The population of individual oscillators splits
into two groups. The oscillators near the center of the distribution lock together in phase at the mean
angular frequency $\Omega$ and co-rotate with average phase $\psi(t)$, whereas those frequencies lying further from
the center continue to rotate independently at their natural frequencies and drift relative to the coherent
cluster frequency $\Omega$. As a consequence this mixed state is only partially synchronized as illustrated on the
right side of figure 14.12. The synchronized fraction has a $\delta$-function behavior for the frequency distribution
which grows in intensity with further increase in $K$. The unsynchronized component has nearly the original
frequency distribution $g(\omega)$ except that it is depleted in the region of the locked frequency due to strength
absorbed by the $\delta$-function component.
Kuramoto's toy model nicely illustrates the essential features of the evolution of collective synchronization
with coupling strength. It has been applied to the study of neuronal synchronization in the brain[Cum07]. The
model illustrates that the collective synchronization of coupled oscillators leads to a component that has a
single frequency for correlated motion which can be much narrower than the inherent frequency distribution
of the ensemble of coupled oscillators.
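The mean-field form of the Kuramoto model, $\dot{\theta}_i = \omega_i + K r \sin(\psi - \theta_i)$, is simple enough to simulate directly. The sketch below is an illustration rather than the simulation used for figure 14.12: it assumes a Gaussian distribution of natural frequencies with unit width in the rotating frame, a simple Euler time step, and hypothetical values for the ensemble size, step size, and seed. It returns the order parameter $r$ averaged over the second half of the run.

```python
import cmath
import math
import random


def order_parameter(thetas):
    """Return (r, psi) for the centroid (1/N) * sum_j exp(i * theta_j)."""
    z = sum(cmath.exp(1j * th) for th in thetas) / len(thetas)
    return abs(z), cmath.phase(z)


def simulate_kuramoto(K, n=200, dt=0.05, steps=400, seed=1):
    """Euler-integrate the mean-field Kuramoto equations.

    Natural frequencies are drawn from a unit-width Gaussian centered on zero
    (an assumption of this sketch); phases start uniformly distributed.
    Returns r averaged over the second half of the run.
    """
    rng = random.Random(seed)
    omegas = [rng.gauss(0.0, 1.0) for _ in range(n)]
    thetas = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    r_history = []
    for _ in range(steps):
        r, psi = order_parameter(thetas)
        r_history.append(r)
        # Each oscillator is pulled toward the mean-field phase psi.
        thetas = [
            th + dt * (w + K * r * math.sin(psi - th))
            for th, w in zip(thetas, omegas)
        ]
    tail = r_history[steps // 2:]
    return sum(tail) / len(tail)


r_uncoupled = simulate_kuramoto(K=0.0)  # incoherent: r stays small
r_coupled = simulate_kuramoto(K=4.0)    # strong coupling: r approaches 1
```

With no coupling the order parameter remains at the level of random scatter, whereas a coupling strength well above threshold drives most of the ensemble into a phase-locked cluster.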

14.12 Example: Collective motion in nuclei

The nucleus is an unusual quantal system that involves the coupled motion of the many nucleons. It
exhibits features characteristic of the many-body classical coupled oscillator with coupling between all the
valence nucleons. Nuclear structure can be described by a shell model of individual nucleons bound in weakly
interacting orbits in a central average mean field that is produced by the summed attraction of all the nucleons
in the nucleus. However, nuclei also exhibit features characteristic of collective rotation and vibration of a
quantal fluid. For example, beautiful rotational bands up to spin over $60\hbar$ are observed in heavy nuclei. These
rotational bands are similar to those observed in the rotational structure of diatomic molecules. Actinide
nuclei also can fission into two large fragments which is another manifestation of collective motion.
Figure 1413 shows the case of collective bands in 238  populated by Coulomb exciting a 1355 
238
 beam by a 208   target. This case exhibits both quadrupole and octupole collective rotational bands up
to spin 40. The inset shows the moment of inertia plotted versus the angular rotational energy ~ The
electromagnetic 2 transition rates correspond to collective motion of ≈ 32 nucleons. Collective motion of
many nucleons is the antithesis of shell model motion where the nucleons are assumed to follow independent
orbiting motion like planets around the Sun. Although the nucleus is a quantal system, this strange dichotomy
can be understood in terms of a classical rotating system having weak linear coupling between each of many
similar harmonic oscillators; which in this case, are nucleons bound in a spheroidally-deformed shell-model
potential well.
The essential general feature of weakly-coupled identical oscillators is illustrated by the solutions of the
three linearly-coupled identical oscillators where the most symmetric state is displaced in frequency from the
remaining states. For $N$ identical oscillators, one state is displaced significantly in energy from the remaining
$N - 1$ degenerate states. This most symmetric state is pushed downwards in energy if the residual coupling
force is attractive, and it is pushed upwards if the coupling force is repulsive. This symmetric state corresponds
to the coherent oscillation of all the coupled oscillators, and carries all of the strength for the corresponding
dominant multipole for the coupling force. In the nucleus this state corresponds to coherent shape oscillations
of many nucleons.
The weak residual electric quadrupole and octupole nucleon-nucleon correlations in the nucleon-nucleon
interactions generate collective quadrupole and octupole motion in nuclei. The collective synchronization
of such coherent quadrupole and octupole excitation leads to collective bands of states, that correspond to
synchronized in-phase motion of the protons and neutrons in the valence oscillator shell. These modes
correspond to rotations and vibrations about the center of mass. The attractive residual nucleon-nucleon
interaction couples the many individual particle excitations in a given shell producing one coherent state
that is pushed downwards in energy far from the remaining $N - 1$ degenerate states. This coherent state
involves correlated motion of the nucleons that corresponds to a macroscopic oscillation of a charged fluid.
For non-closed shell nuclei like $^{238}$U, the dominant quadrupole multipole in the residual nucleon-nucleon
interaction leads to the ground state being a coherent state corresponding to ≈ 16 protons plus ≈ 20 neutrons
oscillating in phase. The collective motion of the charged protons leads to electromagnetic $E2$ radiation
with a transition decay amplitude being about 16 times larger than for a single proton. This corresponds to
the radiative decay probability being enhanced by a factor of ≈ 256 relative to radiation by a single proton. This
collective state corresponds to a macroscopic quadrupole deformation at low excitation energies that exhibits
both collective rotational and vibrational degrees of freedom as shown in the figure. This coherent state is
analogous to the correlated flow of individual water molecules in a tidal wave. The weaker octupole term in
the residual interaction leads to an octupole [pear-shaped] coupled oscillator coherent state lying slightly above
the quadrupole coherent state. In contrast to the rotational motion of strongly quadrupole-deformed
nuclei, the octupole deformation exhibits more vibrational-like properties than rotational motion of a charged
tidal wave. The observed large increase in moment of inertia at higher rotational frequencies, shown in the
inset, is due to the Coriolis force aligning the individual valence nucleons along the rotational axis. Thus,
although the nucleus $^{238}$U is the epitome of a complicated many-body quantal system, it is apparent that
basic classical mechanics of coupled oscillators, and rotation, underlie the physics phenomena exhibited by
synchronized collective motion in the nuclear many-body system.

Figure 14.13: Collective rotational bands in the nucleus $^{238}$U excited by Coulomb excitation. [Sim98]

The close correspondence between classical mechanics predictions and the excitation phenomena
observed for the $^{238}$U nucleus is surprising for a system that is the epitome of a many-body quantal fluid.
The following list identifies other manifestations of classical mechanics discussed in this book that were
exploited for study of such correlated motion of many-body nuclear systems.

1. Coincident detection of the excited nuclei recoiling in vacuum was used to identify the exact
scattering angles, plus recoil velocities, of the scattered nuclei. This specifies the hyperbolic Rutherford
trajectory for each scattered nucleus, the nuclear masses, and their recoil velocities. The deexcitation
$\gamma$-rays emitted in flight by each recoiling nucleus were detected in coincidence with the scattered
nuclei. Knowledge of the recoil velocities and scattering angles enabled correction for the Doppler shift
in energy of each detected coincident $\gamma$-ray to enhance the experimental energy resolution achieved by
the $\gamma$-ray detectors.

2. The transition energies and angular distribution of the deexcitation $\gamma$-rays determined the energies,
spins, and parities of the excited states in $^{235}$U.

3. The measured yields of the coincident deexcitation $\gamma$-rays determined the excitation cross section as a
function of the nuclear scattering angle.
14.13. SUMMARY 399

4. A full quantal calculation for this system is beyond the capabilities of modern computers since the
experiment involves excitation of ∼ 100 excited levels, coupled by ∼ 1000 electromagnetic matrix
elements, and the scattering involves inclusion of thousands of partial waves due to the long range of the
Coulomb potential for the heavy mass of the scattered nuclei. Therefore a semi-classical approximation
is used for the quantal calculation of the electromagnetic excitation cross sections as a function of time
as the scattered nuclei traverse Rutherford's hyperbolic Coulomb scattering trajectory for each scattered
nucleus.

5. The measured cross sections for the deexcitation $\gamma$-rays are compared with the predicted cross sections
to determine the ∼ 1000 electromagnetic matrix elements connecting the states in $^{235}$U.

6. The electromagnetic matrix elements have been measured in the laboratory frame of reference.
Much more insight into the collective motion in $^{235}$U is obtained by transforming the electromagnetic
matrix elements into the body-fixed frame of reference for this rotating deformed body. Rotational
invariants, described in chapter 13.16, are used to derive the electromagnetic properties in the rotating
body-fixed frame of reference which unambiguously determines the electromagnetic shape for each excited
nuclear state observed in $^{235}$U.

7. Hamiltonian mechanics, based on the Routhian, is used to make theoretical model calculations
of the nuclear structure of $^{235}$U in the rotating body-fixed frame for comparison with the experimental
data derived from this experiment.

This experiment illustrates that classical mechanics plays a key role in all aspects of the study of the
nuclear structure of the many-body nuclear quantal system.

14.13 Summary
This chapter has focused on many-body coupled linear oscillator systems, which are a ubiquitous feature in
nature. The main conclusions are summarized below.

Normal modes: It was shown that coupled linear oscillators exhibit normal modes and normal coordinates
that correspond to independent modes of oscillation with characteristic eigenfrequencies $\omega_r$.

General analytic theory for coupled linear oscillators Lagrangian mechanics was used to derive the
general analytic procedure for solution of the many-body coupled oscillator problem which reduces to the
conventional eigenvalue problem. A summary of the procedure for solving coupled oscillator problems is as
follows:
1) Choose generalized coordinates $q_j$ and evaluate $T$ and $U$.
$$T = \frac{1}{2} \sum_{j,k} T_{jk} \dot{q}_j \dot{q}_k \tag{14.41}$$
and
$$U = \frac{1}{2} \sum_{j,k} V_{jk} q_j q_k \tag{14.42}$$
where the components of the T and V tensors are
$$T_{jk} \equiv \sum_{\alpha}^{N} m_\alpha \sum_{i}^{3} \frac{\partial x_{\alpha,i}}{\partial q_j} \frac{\partial x_{\alpha,i}}{\partial q_k} \tag{14.43}$$
and
$$V_{jk} \equiv \left( \frac{\partial^2 U}{\partial q_j \partial q_k} \right)_0 \tag{14.44}$$
2) Determine the eigenvalues $\omega_r$ using the secular determinant.
$$\begin{vmatrix} V_{11} - \omega^2 T_{11} & V_{12} - \omega^2 T_{12} & V_{13} - \omega^2 T_{13} & \ldots \\ V_{12} - \omega^2 T_{12} & V_{22} - \omega^2 T_{22} & V_{23} - \omega^2 T_{23} & \ldots \\ V_{13} - \omega^2 T_{13} & V_{23} - \omega^2 T_{23} & V_{33} - \omega^2 T_{33} & \ldots \\ \ldots & \ldots & \ldots & \ldots \end{vmatrix} = 0 \tag{14.52}$$
3) The eigenvectors are obtained by inserting the eigenvalues $\omega_r$ into
$$\sum_{j} \left( V_{jk} - \omega_r^2 T_{jk} \right) a_{jr} = 0 \tag{14.51}$$
4) From the initial conditions determine the complex scale factors $\beta_r$ where
$$\eta_r(t) \equiv \beta_r e^{i\omega_r t} \tag{14.58}$$
5) Determine the normal coordinates where each $\eta_r$ is a normal mode. The normal coordinates can be
expressed as
$$\boldsymbol{\eta} = \{\mathbf{a}\}^{-1} \mathbf{q} \tag{14.61}$$
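Steps 2 and 3 of this procedure can be sketched numerically for the smallest non-trivial case. The following is an illustrative sketch, not the book's method, that solves the secular determinant of a two-coordinate system in closed form and reads the eigenvector ratio from one row of $\sum_j (V_{jk} - \omega_r^2 T_{jk}) a_{jr} = 0$. The function name `normal_modes_2x2` and the example values (unit masses, spring constants 1.0 and a coupling spring of 0.5) are assumptions made for the illustration.

```python
import math


def normal_modes_2x2(T, V):
    """Solve det(V - w^2 T) = 0 for symmetric 2x2 tensors T and V.

    Returns a list of (omega, (a1, a2)) pairs sorted by frequency, with each
    eigenvector scaled to unit Euclidean length. Assumes T is positive
    definite, V is positive semi-definite, and the two eigenvalues differ.
    """
    (t11, t12), (_, t22) = T
    (v11, v12), (_, v22) = V
    # det(V - lam*T) = a*lam^2 + b*lam + c, with lam = omega^2
    a = t11 * t22 - t12 ** 2
    b = -(v11 * t22 + v22 * t11 - 2.0 * v12 * t12)
    c = v11 * v22 - v12 ** 2
    disc = math.sqrt(max(b * b - 4.0 * a * c, 0.0))
    modes = []
    for lam in sorted([(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)]):
        # The first row of (V - lam*T) a = 0 fixes the ratio a1 : a2.
        a1, a2 = -(v12 - lam * t12), v11 - lam * t11
        if abs(a1) + abs(a2) < 1e-12:
            # First row vanished identically; use the second row instead.
            a1, a2 = v22 - lam * t22, -(v12 - lam * t12)
        norm = math.hypot(a1, a2)
        modes.append((math.sqrt(lam), (a1 / norm, a2 / norm)))
    return modes


# Assumed example: two unit masses, outer springs 1.0, coupling spring 0.5.
T_example = ((1.0, 0.0), (0.0, 1.0))
V_example = ((1.5, -0.5), (-0.5, 1.5))
modes = normal_modes_2x2(T_example, V_example)
```

For this example the lower mode has the two masses moving in phase and the higher mode has them moving in antiphase. For larger systems a general-purpose generalized eigenvalue solver would normally replace the closed-form quadratic.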

Few-body coupled oscillator systems The general analytic theory was used to determine the solutions
for parallel and series couplings of two and three linear oscillators. The phenomena observed include degen-
erate and non-degenerate eigenvalues and spurious center-of-mass oscillatory modes. There are two broad
classifications for three or more coupled oscillators, that is, either complete coupling of all oscillators, or
coupling of the nearest-neighbor oscillators. It is observed that the eigenvalue corresponding to the most
coherent motion of the coupled oscillators corresponds to the most collective motion and its eigenvalue is dis-
placed the most in energy from the remaining eigenvalues. For some systems this coherent collective mode
corresponded to a center-of-mass motion with no internal excitation of the other modes, while the other
eigenvalues corresponded to modes with internal excitation of the oscillators such that the center of mass
is stationary. The above procedure has been applied to two classifications of coupling: complete coupling of
many oscillators, and nearest-neighbor coupling. Both degenerate and spurious center-of-mass modes were
observed. Strong collective shape degrees of freedom in nuclei are examples of complete coupling due to the
weak residual interactions between nucleons in the nucleus. It was seen that, for many coupled oscillators,
one coherent state separates from the other states and this coherent state carries the bulk of the collective
strength.

Discrete lattice chain Transverse and longitudinal modes of motion on the discrete lattice chain were
discussed because of the important role the chain plays in nature, such as in crystalline lattice structures. Both normal
modes and travelling waves were discussed including the phenomena of dispersion and cut-off frequencies.
Molecules and the crystalline lattice chains are examples where nearest neighbor coupling is manifest. It
was shown that, for the $n$-oscillator discrete lattice chain, there are only $n$ independent longitudinal modes
plus $n$ modes for each of the two transverse polarizations, and that the angular frequency $\omega_r \leq 2\omega_0$, that is, a cut-off
frequency exists.

Damped coupled linear oscillators It was shown that linearly-damped coupled oscillator systems can
be solved analytically using the concept of the Rayleigh dissipation function.

Collective synchronization of coupled oscillators The Kuramoto schematic phase model was used
to illustrate how weak residual forces can cause collective synchronization of the motion of many coupled
oscillators. This is applicable to biological systems as well as mechanical systems.

Workshop exercises
1. Consider two masses (each of mass $m$) connected by a spring to each other and by springs to fixed positions.
Motion is only allowed along one dimension. (This is exactly the same system that is discussed in chapter
14.2 on coupled oscillations.) Let each of the two oscillator springs have a force constant $\kappa$ and let the force
constant of the coupling spring be $\kappa'$. Let $x_1$ and $x_2$ be the coordinates as described in the textbook.

(a) Draw a picture of the two masses displaced by a small amount. Using the picture, try to make sense of
the equations of motion as given in the text:

$$m\ddot{x}_1 + (\kappa + \kappa') x_1 - \kappa' x_2 = 0 \qquad m\ddot{x}_2 + (\kappa + \kappa') x_2 - \kappa' x_1 = 0$$

(b) Each of the trial solutions is written in the form $B e^{i\omega t}$. Why are the trial solutions written this way?
Are there any other ways to write the trial solution?
(c) For a nontrivial solution to exist for the pair of simultaneous equations resulting from the substitution of
the trial solution, the determinant of the coefficients of $B_1$ and $B_2$ must vanish. Why must this be the
case? Is a similar statement true when considering three masses? What about $n$ masses?
(d) Suppose you had the actual two-mass system sitting in front of you. How could you create antisymmetric
motion? How could you create symmetric motion? Can you describe each of these motions using a set of
suitable initial conditions?

2. Two particles, each with mass $m$, move in one dimension in a region near a local minimum of the potential
energy where the potential energy is approximately given by
$$U = \frac{1}{2} k \left( 7x_1^2 + 4x_2^2 + 4x_1 x_2 \right)$$
where $k$ is a constant.

(a) Determine the frequencies of oscillation.

(b) Determine the normal coordinates.

3. What is degeneracy? When does it arise?

4. The Lagrangian of three coupled oscillators is given by:
X3 ∙ ¸
̇2 2
− + 0 (1 2 + 2 3 )
=1
2 2

Find 2 () for the following initial conditions (at  = 0):

(1  2  3 ) = (0  0 0) :::::: (̇1  ̇2  ̇3 ) = (0 0 0 )

5. A mechanical analog of the benzene molecule comprises a discrete lattice chain of 6 point masses  connected
in a plane hexagonal ring by 6 identical springs each with spring constant  and length .
a) List the wave numbers of the allowed undamped longitudinal standing waves.
b) Calculate the phase velocity and group velocity for longitudinal travelling waves on the ring.
c) Determine the time dependence of a longitudinal standing wave for a angular frequency  = 2   , that
is, twice the cut-oﬀ frequency.

6. Consider a one dimensional, two-mass, three-spring system governed by the matrix ,

µ ¶
4 −2
=
−2 7

such that  =  2 ,
(a) Determine the eigenfrequencies and normal coordinates.
(b) Choose a set of initial conditions such that the system oscillates at its highest eigenfrequency.
(c) Determine the solutions 1 () and 2 ().

Problems
1. Four identical masses  are connected by four identical springs, spring constant  and constrained to move
on a frictionless circle of radius  as shown on the left in the figure.
a) How many normal modes of small oscillation are there?
b) What are the eigenfrequencies of the small oscillations?
c) Describe the motion of the four masses for each eigenfrequency.

2. Consider the two identical coupled oscillators given on the right in the figure assuming 1 = 2 = . Let both
oscillators be linearly damped with a damping constant  . A force  = 0 cos() is applied to mass 1 .
Write down the pair of coupled diﬀerential equations that describe the motion. Obtain a solution by expressing
the diﬀerential equations in terms of the normal coordinates. Show that the normal coordinates  1 and  2
exhibit resonance peaks at the characteristic frequencies  1 and  2 respectively.

3. As shown on the left below the mass $M$ moves horizontally along a frictionless rail. A pendulum is hung from
$M$ with a weightless rod of length $l$ with a mass $m$ at its end.
a) Prove that the eigenfrequencies are
$$\omega_1 = 0 \qquad \omega_2 = \sqrt{\frac{g}{Ml}(M + m)}$$
b) Describe the normal modes.

[Figure: mass $M$ on a horizontal rail, coordinate $x$, with the pendulum suspended from it.]
Chapter 15

Advanced Hamiltonian mechanics

15.1 Introduction
This study of classical mechanics has involved climbing a vast mountain of knowledge, while the pathway to
the top has led us to elegant and beautiful theories that underlie much of modern physics. Being so close to
the summit provides the opportunity to take a few extra steps in order to provide a glimpse of applications
to physics at the summit. These are described in chapters 15 − 18.
Hamilton's development of Hamiltonian mechanics in 1834 is the crowning achievement for applying
variational principles to classical mechanics. A fundamental feature of Hamiltonian mechanics is that it uses
the conjugate coordinates $\mathbf{q}, \mathbf{p}$ plus time $t$, which is a considerable advantage in most branches of physics
and engineering. Compared to Lagrangian mechanics, Hamiltonian mechanics has a significantly broader
arsenal of powerful techniques that can be exploited to obtain an analytical solution of the integrals of the
motion for complicated systems. In addition, Hamiltonian dynamics provides a means of determining the
unknown variables for which the solution assumes a soluble form, and is ideal for study of the fundamental
underlying physics in applications to fields such as quantum or statistical physics. As a consequence, Hamil-
tonian mechanics has become the preeminent variational approach used in modern physics. This chapter
introduces the following four techniques in Hamiltonian mechanics: (1) the elegant Poisson bracket repre-
sentation of Hamiltonian mechanics, which played a pivotal role in the development of quantum theory; (2)
the powerful Hamilton-Jacobi theory coupled with Jacobi’s development of canonical transformation theory;
(3) action-angle variable theory; and (4) canonical perturbation theory.
Prior to further development of the theory of Hamiltonian mechanics, it is useful to summarize the major
formulae relevant to Hamiltonian mechanics that have been presented in chapters 7, 8, and 9.
Action functional $S$:
As discussed in chapter 9.2, Hamiltonian mechanics is built upon Hamilton's action functional
$$S(\mathbf{q}, \mathbf{p}, t) = \int_{t_1}^{t_2} L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt \tag{15.1}$$
Hamilton's Principle of least action states that
$$\delta S(\mathbf{q}, \mathbf{p}, t) = \delta \int_{t_1}^{t_2} L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt = 0 \tag{15.2}$$
Generalized momentum $p$:
In chapter 7.2, the generalized (canonical) momentum was defined in terms of the Lagrangian $L$ to be
$$p_j \equiv \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \dot{q}_j} \tag{15.3}$$
Chapter 9.2 defined the generalized momentum in terms of the action functional $S$ to be
$$p_j = \frac{\partial S(\mathbf{q}, \mathbf{p}, t)}{\partial q_j} \tag{15.4}$$


404 CHAPTER 15. ADVANCED HAMILTONIAN MECHANICS

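Generalized energy $h(\mathbf{q}, \dot{\mathbf{q}}, t)$:

Jacobi's Generalized Energy $h(\mathbf{q}, \dot{\mathbf{q}}, t)$ was defined in equation 7.37 as
$$h(\mathbf{q}, \dot{\mathbf{q}}, t) \equiv \sum_j \left( \dot{q}_j \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial \dot{q}_j} \right) - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{15.5}$$
Hamiltonian function:
The Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$ was defined in terms of the generalized energy $h(\mathbf{q}, \dot{\mathbf{q}}, t)$ plus the generalized
momentum. That is
$$H(\mathbf{q}, \mathbf{p}, t) \equiv h(\mathbf{q}, \dot{\mathbf{q}}, t) = \sum_j p_j \dot{q}_j - L(\mathbf{q}, \dot{\mathbf{q}}, t) = \mathbf{p} \cdot \dot{\mathbf{q}} - L(\mathbf{q}, \dot{\mathbf{q}}, t) \tag{15.6}$$
where $\mathbf{p}, \mathbf{q}$ correspond to $n$-dimensional vectors, e.g. $\mathbf{q} \equiv (q_1, q_2, \ldots, q_n)$ and the scalar product $\mathbf{p} \cdot \dot{\mathbf{q}} = \sum_j p_j \dot{q}_j$.
Chapter 8.2 used a Legendre transformation to derive this relation between the Hamiltonian and Lagrangian
functions. Note that whereas the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ is expressed in terms of the coordinates $\mathbf{q}$ plus
conjugate velocities $\dot{\mathbf{q}}$, the Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$ is expressed in terms of the coordinates $\mathbf{q}$ plus their
conjugate momenta $\mathbf{p}$. For scleronomic systems, plus assuming the standard Lagrangian, equations
7.44 and 7.29 give that the Hamiltonian simplifies to equal the total mechanical energy, that is, $H = T + U$.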
Generalized energy theorem:
The equations of motion lead to the generalized energy theorem which states that the time dependence
of the Hamiltonian is related to the time dependence of the Lagrangian.
$$\frac{dH(\mathbf{q}, \mathbf{p}, t)}{dt} = \sum_j \dot{q}_j \left[ Q_j^{EXC} + \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j}(\mathbf{q}, t) \right] - \frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial t} \tag{15.7}$$
Note that if all the generalized non-potential forces and Lagrange multiplier terms are zero, and if the
Lagrangian is not an explicit function of time, then the Hamiltonian is a constant of motion.
Hamilton's equations of motion:
Chapter 8.3 showed that a Legendre transform plus the Lagrange-Euler equations led to Hamilton's
equations of motion. Hamilton derived these equations of motion directly from the action functional, as
shown in chapter 9.2
$$\dot{q}_j = \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial p_j} \tag{15.8}$$
$$\dot{p}_j = -\frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial q_j} + \left[ \sum_{k=1}^{m} \lambda_k \frac{\partial g_k}{\partial q_j} + Q_j^{EXC} \right] \tag{15.9}$$
$$\frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial t} = -\frac{\partial L(\mathbf{q}, \dot{\mathbf{q}}, t)}{\partial t} \tag{15.10}$$
Note the symmetry of Hamilton's two canonical equations. The canonical variables $q_j, p_j$ are treated
as independent canonical variables. Lagrange was the first to derive the canonical equations but he did not
recognize them as a basic set of equations of motion. Hamilton derived the canonical equations of motion
from his fundamental variational principle and made them the basis for a far-reaching theory of dynamics.
Hamilton's equations give $2s$ first-order differential equations for $p_j, q_j$ for the $s$ degrees of freedom.
Lagrange's equations give $s$ second-order differential equations for the variables $q_j, \dot{q}_j$.
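Because Hamilton's equations are first order, they can be stepped forward in time directly from the Hamiltonian. The sketch below is an illustration under stated assumptions, not a method taken from the text: it uses a one-dimensional harmonic oscillator with $m = k = 1$, no constraint or non-potential forces, central finite differences for the partial derivatives of $H$, and a semi-implicit (symplectic) Euler update. All names and step sizes are hypothetical.

```python
import math


def hamiltonian(q, p, m=1.0, k=1.0):
    """H = T + U for a one-dimensional harmonic oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * q * q


def dH_dq(H, q, p, h=1e-6):
    """Central-difference estimate of the partial derivative dH/dq."""
    return (H(q + h, p) - H(q - h, p)) / (2.0 * h)


def dH_dp(H, q, p, h=1e-6):
    """Central-difference estimate of the partial derivative dH/dp."""
    return (H(q, p + h) - H(q, p - h)) / (2.0 * h)


def integrate(H, q, p, dt, steps):
    """Semi-implicit Euler integration of Hamilton's equations."""
    for _ in range(steps):
        p = p - dt * dH_dq(H, q, p)  # p_dot = -dH/dq
        q = q + dt * dH_dp(H, q, p)  # q_dot = +dH/dp, using the updated p
    return q, p


dt = 1.0e-3
steps = int(round(2.0 * math.pi / dt))  # about one period for m = k = 1
q_end, p_end = integrate(hamiltonian, 1.0, 0.0, dt, steps)
energy_drift = abs(hamiltonian(q_end, p_end) - hamiltonian(1.0, 0.0))
```

Since this Lagrangian has no explicit time dependence and no non-potential forces act, the Hamiltonian should be a constant of motion, and the computed `energy_drift` stays small, limited only by the step size.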
Hamilton-Jacobi equation:
Hamilton used Hamilton's Principle to derive the Hamilton-Jacobi equation.
$$\frac{\partial S}{\partial t} + H(\mathbf{q}, \mathbf{p}, t) = 0 \tag{15.11}$$
The solution of Hamilton's equations is trivial if the Hamiltonian is a constant of motion, or when a set
of generalized coordinates can be identified for which all the coordinates $q_j$ are constant, or are cyclic (also
called ignorable coordinates). Jacobi developed the mathematical framework of canonical transformations
required to exploit the Hamilton-Jacobi equation.
15.2. POISSON BRACKET REPRESENTATION OF HAMILTONIAN MECHANICS 405

15.2 Poisson bracket representation of Hamiltonian mechanics

15.2.1 Poisson Brackets
Poisson brackets were developed by Poisson, who was a student of Lagrange. Hamilton's canonical equations
of motion describe the time evolution of the canonical variables $(q_i, p_i)$ in phase space. Jacobi showed that the
framework of Hamiltonian mechanics can be restated in terms of the elegant and powerful Poisson bracket
formalism. The Poisson bracket representation of Hamiltonian mechanics provides a direct link between
classical mechanics and quantum mechanics.
The Poisson bracket of any two continuous functions of generalized coordinates $F(q, p)$ and $G(q, p)$ is
defined to be
$$[F, G]_{qp} \equiv \sum_i \left( \frac{\partial F}{\partial q_i} \frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i} \frac{\partial G}{\partial q_i} \right) \tag{15.12}$$
Note that the above definition of the Poisson bracket leads to the following identity, antisymmetry, linearity,
Leibniz rules, and Jacobi Identity.
$$[F, F] = 0 \tag{15.13}$$
$$[F, G] = -[G, F] \tag{15.14}$$
$$[G, F + Y] = [G, F] + [G, Y] \tag{15.15}$$
$$[G, FY] = [G, F]\, Y + F\, [G, Y] \tag{15.16}$$
$$0 = [F, [G, Y]] + [G, [Y, F]] + [Y, [F, G]] \tag{15.17}$$
where $G$, $F$, and $Y$ are functions of the canonical variables plus time. Jacobi's identity (15.17) states that
the sum of the cyclic permutation of the double Poisson brackets of three functions is zero. Jacobi's identity
plays a useful role in Hamiltonian mechanics as will be shown.

15.2.2 Fundamental Poisson brackets:

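The Poisson brackets of the canonical variables themselves are called the fundamental Poisson brackets.
They are
$$[q_k, q_l]_{qp} = \sum_i \left( \frac{\partial q_k}{\partial q_i} \frac{\partial q_l}{\partial p_i} - \frac{\partial q_k}{\partial p_i} \frac{\partial q_l}{\partial q_i} \right) = \sum_i \left( \delta_{ki} \cdot 0 - 0 \cdot \delta_{li} \right) = 0 \tag{15.18}$$
$$[p_k, p_l]_{qp} = \sum_i \left( \frac{\partial p_k}{\partial q_i} \frac{\partial p_l}{\partial p_i} - \frac{\partial p_k}{\partial p_i} \frac{\partial p_l}{\partial q_i} \right) = \sum_i \left( 0 \cdot \delta_{li} - \delta_{ki} \cdot 0 \right) = 0 \tag{15.19}$$
$$[q_k, p_l]_{qp} = \sum_i \left( \frac{\partial q_k}{\partial q_i} \frac{\partial p_l}{\partial p_i} - \frac{\partial q_k}{\partial p_i} \frac{\partial p_l}{\partial q_i} \right) = \sum_i \left( \delta_{ki} \cdot \delta_{li} - 0 \cdot 0 \right) = \delta_{kl} \tag{15.20}$$
In summary, the fundamental Poisson brackets equal
$$[q_k, q_l]_{qp} = 0 \tag{15.21}$$
$$[p_k, p_l]_{qp} = 0 \tag{15.22}$$
$$[q_k, p_l]_{qp} = -[p_l, q_k]_{qp} = \delta_{kl} \tag{15.23}$$
Note that the Poisson bracket is antisymmetric under interchange of $p$ and $q$. It is interesting that the only
non-zero fundamental Poisson bracket is for conjugate variables where $k = l$, that is
$$[q_k, p_k]_{qp} = 1 \tag{15.24}$$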
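The definition of the Poisson bracket and the fundamental brackets can be checked numerically. The following sketch is an illustration only: it approximates the partial derivatives in the definition by central finite differences at a chosen phase-space point. The function name `poisson_bracket`, the test point, and the sample functions are assumptions introduced for the example.

```python
import math


def poisson_bracket(F, G, q, p, h=1e-5):
    """Finite-difference estimate of the Poisson bracket [F, G] at (q, p).

    F and G are callables taking (q_list, p_list); q and p are sequences of
    equal length holding the coordinates and conjugate momenta.
    """
    def partial(func, which, i):
        q_plus, q_minus = list(q), list(q)
        p_plus, p_minus = list(p), list(p)
        if which == "q":
            q_plus[i] += h
            q_minus[i] -= h
        else:
            p_plus[i] += h
            p_minus[i] -= h
        return (func(q_plus, p_plus) - func(q_minus, p_minus)) / (2.0 * h)

    return sum(
        partial(F, "q", i) * partial(G, "p", i)
        - partial(F, "p", i) * partial(G, "q", i)
        for i in range(len(q))
    )


# Coordinate and momentum functions for a two-degree-of-freedom system.
q0 = lambda q, p: q[0]
q1 = lambda q, p: q[1]
p0 = lambda q, p: p[0]
p1 = lambda q, p: p[1]

point_q, point_p = [0.3, -1.2], [0.7, 2.0]  # arbitrary phase-space point

bracket_q0_p0 = poisson_bracket(q0, p0, point_q, point_p)  # conjugate pair
bracket_q0_p1 = poisson_bracket(q0, p1, point_q, point_p)  # non-conjugate pair
bracket_q0_q1 = poisson_bracket(q0, q1, point_q, point_p)  # two coordinates
```

The same function can be applied to arbitrary smooth functions of the canonical variables, for example to confirm the antisymmetry property for a pair of nonlinear functions.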


15.2.3 Poisson bracket invariance to canonical transformations

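The Poisson brackets are invariant under a canonical transformation from one set of canonical variables
$(q_k, p_k)$ to a new set of canonical variables $(Q_k, P_k)$ where $Q_k \rightarrow Q_k(\mathbf{q}, \mathbf{p})$ and $P_k \rightarrow P_k(\mathbf{q}, \mathbf{p})$. This is shown
by transforming equation 15.12 to the new variables by the following derivation
$$[F, G]_{qp} = \sum_i \left( \frac{\partial F}{\partial q_i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial q_i} \right) \tag{15.25}$$
$$= \sum_{ik} \left[ \frac{\partial F}{\partial q_i}\left( \frac{\partial G}{\partial Q_k}\frac{\partial Q_k}{\partial p_i} + \frac{\partial G}{\partial P_k}\frac{\partial P_k}{\partial p_i} \right) - \frac{\partial F}{\partial p_i}\left( \frac{\partial G}{\partial Q_k}\frac{\partial Q_k}{\partial q_i} + \frac{\partial G}{\partial P_k}\frac{\partial P_k}{\partial q_i} \right) \right] \tag{15.26}$$

The terms can be rearranged to give

$$[F, G]_{qp} = \sum_k \left( \frac{\partial G}{\partial Q_k}[F, Q_k]_{qp} + \frac{\partial G}{\partial P_k}[F, P_k]_{qp} \right) \tag{15.27}$$

Let $F = Q_k$ and replace $G$ by $F$, and use the fact that, for a canonical transformation, the fundamental Poisson brackets are $[Q_k, Q_j]_{qp} = 0$
and $[Q_k, P_j]_{qp} = \delta_{kj}$; then equation 15.27 reduces to
$$[Q_k, F]_{qp} = \sum_j \left( \frac{\partial F}{\partial Q_j}[Q_k, Q_j]_{qp} + \frac{\partial F}{\partial P_j}[Q_k, P_j]_{qp} \right) = \sum_j \frac{\partial F}{\partial P_j}\,\delta_{kj} \tag{15.28}$$

That is

$$[F, Q_k]_{qp} = -\frac{\partial F}{\partial P_k} \tag{15.29}$$

Similarly
$$[P_k, F]_{qp} = \sum_j \left( \frac{\partial F}{\partial Q_j}[P_k, Q_j]_{qp} + \frac{\partial F}{\partial P_j}[P_k, P_j]_{qp} \right) \tag{15.30}$$

leading to

$$[F, P_k]_{qp} = \frac{\partial F}{\partial Q_k} \tag{15.31}$$

Substituting equations (15.29) and (15.31) into equation (15.27) gives
$$[F, G]_{qp} = \sum_k \left( \frac{\partial F}{\partial Q_k}\frac{\partial G}{\partial P_k} - \frac{\partial F}{\partial P_k}\frac{\partial G}{\partial Q_k} \right) = [F, G]_{QP} \tag{15.32}$$

Thus the canonical variable subscripts $(q, p)$ and $(Q, P)$ can be ignored since the Poisson bracket is
invariant to any canonical transformation of canonical variables. Conversely, if the Poisson
bracket is independent of the transformation, then the transformation is canonical.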

15.1 Example: Check that a transformation is canonical

The independence of Poisson brackets to canonical transformations can be used to test if a transformation
is canonical. Assume that the transformation equations between two sets of coordinates are given by
$$Q = \ln\left(1 + q^{\frac{1}{2}}\cos p\right) \qquad P = 2\left(1 + q^{\frac{1}{2}}\cos p\right) q^{\frac{1}{2}}\sin p$$

Evaluating the Poisson brackets gives $[Q, Q] = 0$, $[P, P] = 0$ while

$$[Q, P] = \frac{\partial Q}{\partial q}\frac{\partial P}{\partial p} - \frac{\partial Q}{\partial p}\frac{\partial P}{\partial q}$$
$$= \frac{q^{-\frac{1}{2}}\cos p}{1 + q^{\frac{1}{2}}\cos p}\left[-q\sin^2 p + \left(1 + q^{\frac{1}{2}}\cos p\right) q^{\frac{1}{2}}\cos p\right] + \frac{q^{\frac{1}{2}}\sin^2 p}{1 + q^{\frac{1}{2}}\cos p}\left[\cos p + \left(1 + q^{\frac{1}{2}}\cos p\right) q^{-\frac{1}{2}}\right] = 1$$
Therefore if $q, p$ are canonical with a Poisson bracket $[q, p] = 1$, then so are $Q, P$ since $[Q, P] = 1 = [q, p]$.

Since it has been shown that this transformation is canonical, it is possible to go further and determine
the function that generates this transformation. Solving the transformation equations for $q$ and $P$ gives
$$q = \left(e^Q - 1\right)^2 \sec^2 p \qquad P = 2e^Q\left(e^Q - 1\right)\tan p$$

Since the transformation is canonical, there exists a generating function $F_3(p, Q)$ such that

$$q = -\frac{\partial F_3}{\partial p} \qquad P = -\frac{\partial F_3}{\partial Q}$$

The transformation function $F_3(p, Q)$ can be obtained using

$$F_3(p, Q) = \int \left( \frac{\partial F_3}{\partial p}\,dp + \frac{\partial F_3}{\partial Q}\,dQ \right) = -\int \left( q\,dp + P\,dQ \right)$$
$$= -\int \left[ \left(e^Q - 1\right)^2 \sec^2 p\,dp + 2e^Q\left(e^Q - 1\right)\tan p\,dQ \right] = -\int d\left[\left(e^Q - 1\right)^2 \tan p\right]$$

This then gives that the required generating function is

$$F_3(p, Q) = -\left(e^Q - 1\right)^2 \tan p$$

This example illustrates how to determine a useful generating function and prove that the transformation is
canonical.
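
The result of this example can be cross-checked numerically. The sketch below assumes central finite differences are accurate enough at a few arbitrarily chosen sample points (restricted to $q > 0$ and $1 + q^{1/2}\cos p > 0$); the helper names `bracket` and `generating_function_residuals` are hypothetical.

```python
import math

def Q(q, p):
    return math.log(1 + math.sqrt(q) * math.cos(p))

def P(q, p):
    return 2 * (1 + math.sqrt(q) * math.cos(p)) * math.sqrt(q) * math.sin(p)

def F3(p, Qn):
    """Generating function F3(p, Q) = -(exp(Q) - 1)^2 tan(p)."""
    return -(math.exp(Qn) - 1) ** 2 * math.tan(p)

def bracket(f, g, q, p, h=1e-6):
    """Central-difference Poisson bracket [f, g] for one degree of freedom."""
    df_dq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    df_dp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dg_dq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dg_dp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return df_dq * dg_dp - df_dp * dg_dq

# Arbitrary sample points (q, p) chosen for illustration.
sample_points = [(0.5, 0.3), (2.0, 1.0), (1.3, -0.7)]

# Each entry should be close to 1 if the transformation is canonical.
brackets = [bracket(Q, P, q, p) for q, p in sample_points]

def generating_function_residuals(q, p, h=1e-6):
    """Residuals of q = -dF3/dp and P = -dF3/dQ at the point (q, p)."""
    Qn = Q(q, p)
    dF3_dp = (F3(p + h, Qn) - F3(p - h, Qn)) / (2 * h)
    dF3_dQ = (F3(p, Qn + h) - F3(p, Qn - h)) / (2 * h)
    return (-dF3_dp - q, -dF3_dQ - P(q, p))

residuals = [generating_function_residuals(q, p) for q, p in sample_points]
```

Passing at a handful of points is only a consistency check, not a proof; the analytic evaluation above remains the actual demonstration.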

15.2.4 Correspondence of the commutator and the Poisson Bracket

In classical mechanics there is a formal correspondence between the Poisson bracket and the commutator.
This can be shown by deriving the Poisson bracket of four functions taken in two pairs. The derivation
requires deriving the two possible Poisson brackets involving three functions.
$$[F_1 F_2, G] = \sum_i \left\{ \left( \frac{\partial F_1}{\partial q_i}F_2 + F_1\frac{\partial F_2}{\partial q_i} \right)\frac{\partial G}{\partial p_i} - \left( \frac{\partial F_1}{\partial p_i}F_2 + F_1\frac{\partial F_2}{\partial p_i} \right)\frac{\partial G}{\partial q_i} \right\}$$
$$= [F_1, G]\,F_2 + F_1\,[F_2, G] \tag{15.33}$$
$$[F, G_1 G_2] = [F, G_1]\,G_2 + G_1\,[F, G_2] \tag{15.34}$$

These two Poisson brackets for three functions can be used to derive the Poisson bracket of four functions,
taken in pairs. This can be accomplished in two ways, using either equation 15.33 or 15.34 first.

$$[F_1 F_2, G_1 G_2] = [F_1, G_1 G_2]\,F_2 + F_1\,[F_2, G_1 G_2]$$
$$= \{[F_1, G_1]\,G_2 + G_1\,[F_1, G_2]\}\,F_2 + F_1\,\{[F_2, G_1]\,G_2 + G_1\,[F_2, G_2]\}$$
$$= [F_1, G_1]\,G_2 F_2 + G_1\,[F_1, G_2]\,F_2 + F_1\,[F_2, G_1]\,G_2 + F_1 G_1\,[F_2, G_2] \tag{15.35}$$

The alternative approach gives

$$[F_1 F_2, G_1 G_2] = [F_1 F_2, G_1]\,G_2 + G_1\,[F_1 F_2, G_2]$$
$$= [F_1, G_1]\,F_2 G_2 + F_1\,[F_2, G_1]\,G_2 + G_1\,[F_1, G_2]\,F_2 + G_1 F_1\,[F_2, G_2] \tag{15.36}$$

These two alternate derivations give different relations for the same Poisson bracket. Equating the alternative
equations 15.35 and 15.36 gives that

$$[F_1, G_1]\,(F_2 G_2 - G_2 F_2) = (F_1 G_1 - G_1 F_1)\,[F_2, G_2]$$

This can be factored into separate relations, the left-hand side for body 1 and the right-hand side for body
2.
$$\frac{(F_1 G_1 - G_1 F_1)}{[F_1, G_1]} = \frac{(F_2 G_2 - G_2 F_2)}{[F_2, G_2]} = \lambda \tag{15.37}$$

Since the left-hand ratio holds for $F_1, G_1$ independent of $F_2, G_2$, and vice versa, then they must equal
a constant $\lambda$ that does not depend on $F_1, G_1$, does not depend on $F_2, G_2$, and $\lambda$ must commute with
$(F_1 G_1 - G_1 F_1)$. That is, $\lambda$ must be a constant number independent of these variables.
$$(F_1 G_1 - G_1 F_1) = \lambda\,[F_1, G_1] \equiv \lambda \sum_i \left( \frac{\partial F_1}{\partial q_i}\frac{\partial G_1}{\partial p_i} - \frac{\partial F_1}{\partial p_i}\frac{\partial G_1}{\partial q_i} \right) \tag{15.38}$$

Equation 15.38 is an especially important result which states that, to within a multiplicative constant number
$\lambda$, there is a one-to-one correspondence between the Poisson bracket and the commutator of two independent
functions. An important implication is that if two functions $F_i, G_i$ have a Poisson bracket that is zero, then
the commutator of the two functions also must be zero, that is, $F_i$ and $G_i$ commute.
Consider the special case where the variables $F_1$ and $G_1$ correspond to the fundamental canonical
variables $(q_k, p_l)$. Then the commutators of the fundamental canonical variables are given by

$$q_k p_l - p_l q_k = \lambda\,[q_k, p_l] = \lambda\,\delta_{kl} \tag{15.39}$$
$$q_k q_l - q_l q_k = \lambda\,[q_k, q_l] = 0 \tag{15.40}$$
$$p_k p_l - p_l p_k = \lambda\,[p_k, p_l] = 0 \tag{15.41}$$

In 1925, Paul Dirac, a 23-year-old graduate student at Cambridge, recognized that the formal correspondence
between the Poisson bracket in classical mechanics and the corresponding commutator provides a logical
and consistent way to bridge the chasm between the Hamiltonian formulation of classical mechanics and
quantum mechanics. He realized that making the assumption that the constant $\lambda \equiv i\hbar$ leads to Heisenberg's
fundamental commutation relations in quantum mechanics, as is discussed in chapter 18.3.1. Assuming that
$\lambda \equiv i\hbar$ provides a logical and consistent way that builds quantization directly into classical mechanics, rather
than using the ad hoc, case-dependent hypotheses that were used by the older quantum theory of Bohr.

15.2.5 Observables in Hamiltonian mechanics

Poisson brackets, and the corresponding commutation relations, are especially useful for elucidating which
observables are constants of motion, and whether any two observables can be measured simultaneously and
exactly. The properties of any observable are determined by the following two criteria.

Time dependence:
The total time derivative of a function $G(q_i, p_i, t)$ is defined by
$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + \sum_i \left( \frac{\partial G}{\partial q_i}\dot{q}_i + \frac{\partial G}{\partial p_i}\dot{p}_i \right) \tag{15.42}$$

Hamilton's canonical equations give that

$$\dot{q}_i = \frac{\partial H}{\partial p_i} \tag{15.43}$$
$$\dot{p}_i = -\frac{\partial H}{\partial q_i} \tag{15.44}$$

Substituting these in the above relation gives
$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + \sum_i \left( \frac{\partial G}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial G}{\partial p_i}\frac{\partial H}{\partial q_i} \right)$$

that is
$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + [G, H] \tag{15.45}$$
This important equation states that the total time derivative of any function $G(q_i, p_i, t)$ can be expressed in
terms of the partial time derivative plus the Poisson bracket of $G(q_i, p_i, t)$ with the Hamiltonian.

Any observable $G(q_i, p_i, t)$ will be a constant of motion if $\frac{dG}{dt} = 0$, and thus equation (15.45) gives

$$\frac{\partial G}{\partial t} + [G, H] = 0 \qquad \text{(if } G \text{ is a constant of motion)}$$

That is, it is a constant of motion when

$$\frac{\partial G}{\partial t} = [H, G] \tag{15.46}$$

Moreover, this can be extended further to the statement that if the constant of motion $G$ is not explicitly
time dependent then
$$[G, H] = 0 \tag{15.47}$$
The Poisson bracket with the Hamiltonian is zero for a constant of motion $G$ that is not explicitly time
dependent. Often it is more useful to turn this statement around with the statement that if $[G, H] = 0$ and
$\frac{\partial G}{\partial t} = 0$, then $\frac{dG}{dt} = 0$, implying that $G$ is a constant of motion.

Independence
Consider two observables $F(q_i, p_i, t)$ and $G(q_i, p_i, t)$. The independence of these two observables is determined
by the Poisson bracket
$$[G, F] = -[F, G] \tag{15.48}$$
If this Poisson bracket is zero, that is, if the two observables $F(q_i, p_i, t)$ and $G(q_i, p_i, t)$ commute, then their
values are independent and can be measured independently. However, if the Poisson bracket $[G, F] \neq 0$, that
is, $F(q_i, p_i, t)$ and $G(q_i, p_i, t)$ do not commute, then $F$ and $G$ are correlated since interchanging the order of
the Poisson bracket changes the sign, which implies that the measured value for $G$ depends on whether $F$ is
simultaneously measured.
A useful property of Poisson brackets is that if $G$ and $F$ both are constants of motion, then the double
Poisson bracket $[H, [F, G]] = 0$. This can be proved using Jacobi's identity

$$[F, [G, H]] + [G, [H, F]] + [H, [F, G]] = 0 \tag{15.49}$$

If $[G, H] = 0$ and $[F, H] = 0$ then $[H, [F, G]] = 0$, that is, the Poisson bracket $[F, G]$ commutes with $H$. Note
that if $F$ and $G$ do not depend explicitly on time, that is, $\frac{\partial F}{\partial t} = \frac{\partial G}{\partial t} = 0$, then combining equations (15.45)
and (15.49) leads to Poisson's theorem that relates the total time derivatives.
$$\frac{d}{dt}[F, G] = \left[\frac{dF}{dt}, G\right] + \left[F, \frac{dG}{dt}\right] \tag{15.50}$$

This implies that if $F$ and $G$ are invariants, that is, $\frac{dF}{dt} = \frac{dG}{dt} = 0$, then the Poisson bracket $[F, G]$ is an
invariant if $F$ and $G$ are not explicitly time dependent.

15.2 Example: Angular momentum:

Angular momentum, $\mathbf{L}$, provides an example of the use of Poisson brackets to elucidate which observables
can be determined simultaneously. Consider that the Hamiltonian is time independent with a spherically
symmetric potential $U(r)$. Then it is best to treat such a spherically symmetric potential using spherical
coordinates since the potential is independent of both $\theta$ and $\phi$.
The Poisson brackets in classical mechanics can be used to tell us if two observables will commute. Since
$U(r)$ is time independent, the Hamiltonian in spherical coordinates is
$$H = T + U = \frac{1}{2m}\left( p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2\sin^2\theta} \right) + U(r)$$

Evaluating the Poisson bracket using the above Hamiltonian gives

$$[p_\phi, H] = 0$$

Since $p_\phi$ is not an explicit function of time, $\frac{\partial p_\phi}{\partial t} = 0$, then $\frac{dp_\phi}{dt} = 0$, that is, the angular momentum about
the $z$ axis, $L_z = p_\phi$, is a constant of motion.
The total angular momentum $L^2$ commutes with the Hamiltonian, that is
$$\left[L^2, H\right] = \left[p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}, H\right] = 0$$
Since the total angular momentum $L^2 = p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}$ is not explicitly time dependent, it also must be
a constant of motion. Note that Noether's theorem gives that both the angular momenta $L^2$ and $p_\phi$ are
constants of motion. Also since the Poisson brackets are
$$[p_\phi, H] = 0$$
$$\left[L^2, H\right] = 0$$
then Jacobi's identity, equation 15.17, can be used to imply that
$$\left[\left[L^2, p_\phi\right], H\right] = 0$$
That is, the Poisson bracket $\left[L^2, p_\phi\right]$ is a constant of motion. Note that if $L^2$ and $p_\phi$ commute, that is,
$\left[L^2, p_\phi\right] = 0$, then they can be measured simultaneously with unlimited accuracy, and this also satisfies that
$\left[L^2, p_\phi\right]$ commutes with $H$.
The $(x, y, z)$ components of the angular momentum $\mathbf{L}$ are given by

$$L_x = \sum_{i=1}^{n} (\mathbf{r}_i \times \mathbf{p}_i)_x = \sum_{i=1}^{n} \left( y_i p_{z,i} - z_i p_{y,i} \right)$$
$$L_y = \sum_{i=1}^{n} (\mathbf{r}_i \times \mathbf{p}_i)_y = \sum_{i=1}^{n} \left( z_i p_{x,i} - x_i p_{z,i} \right)$$
$$L_z = \sum_{i=1}^{n} (\mathbf{r}_i \times \mathbf{p}_i)_z = \sum_{i=1}^{n} \left( x_i p_{y,i} - y_i p_{x,i} \right)$$

Evaluate the Poisson bracket

$$[L_x, L_y] = \sum_{i=1}^{n} \left[ \left( \frac{\partial L_x}{\partial x_i}\frac{\partial L_y}{\partial p_{x,i}} - \frac{\partial L_x}{\partial p_{x,i}}\frac{\partial L_y}{\partial x_i} \right) + \left( \frac{\partial L_x}{\partial y_i}\frac{\partial L_y}{\partial p_{y,i}} - \frac{\partial L_x}{\partial p_{y,i}}\frac{\partial L_y}{\partial y_i} \right) + \left( \frac{\partial L_x}{\partial z_i}\frac{\partial L_y}{\partial p_{z,i}} - \frac{\partial L_x}{\partial p_{z,i}}\frac{\partial L_y}{\partial z_i} \right) \right]$$
$$= \sum_{i=1}^{n} \left[ (0) + (0) + \left( x_i p_{y,i} - y_i p_{x,i} \right) \right] = L_z$$

Similarly, the Poisson brackets for $L_x, L_y, L_z$ are

$$[L_x, L_y] = L_z$$
$$[L_y, L_z] = L_x$$
$$[L_z, L_x] = L_y$$
where $x$, $y$, and $z$ are taken in a right-handed cyclic order. This usually is written in the form
$$[L_i, L_j] = \epsilon_{ijk} L_k$$
where the Levi-Civita density $\epsilon_{ijk}$ equals zero if two of the $ijk$ indices are identical, otherwise it is $+1$ for a
cyclic permutation of $i, j, k$, and $-1$ for a non-cyclic permutation.
Note that since these Poisson brackets are nonzero, the components of the angular momentum $L_x, L_y, L_z$
do not commute and thus they cannot be measured precisely simultaneously. Thus we see that although $L^2$ and
$L_i$ are simultaneous constants of motion, where the subscript $i$ can be either $x$, $y$, or $z$, only one component
$L_i$ can be measured simultaneously with $L^2$. This behavior is exhibited by rigid-body rotation where the body
precesses around one component of the total angular momentum, $L_z$, such that the total angular momentum,
$L^2$, plus the component along one axis, $L_z$, are constants of motion. Then $L_x^2 + L_y^2 = L^2 - L_z^2$ is constant
but not the individual $L_x$ or $L_y$.
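
The angular momentum brackets in this example can be checked numerically for a single particle. This is a minimal sketch that assumes a central-difference approximation to the partial derivatives; `poisson_bracket`, the sample point, and the choice $U(r) = -1/r$ with unit mass are hypothetical and serve only as an illustration.

```python
import math

def poisson_bracket(f, g, q, p, h=1e-4):
    """Central-difference estimate of [f, g] at the phase-space point (q, p)."""
    def partial(func, use_q, i):
        qs, ps = list(q), list(p)
        coords = qs if use_q else ps
        coords[i] += h
        upper = func(qs, ps)
        coords[i] -= 2 * h
        lower = func(qs, ps)
        return (upper - lower) / (2 * h)

    return sum(
        partial(f, True, i) * partial(g, False, i)
        - partial(f, False, i) * partial(g, True, i)
        for i in range(len(q))
    )

# Cartesian components of L = r x p for one particle; r = [x, y, z].
Lx = lambda r, p: r[1] * p[2] - r[2] * p[1]
Ly = lambda r, p: r[2] * p[0] - r[0] * p[2]
Lz = lambda r, p: r[0] * p[1] - r[1] * p[0]
L2 = lambda r, p: Lx(r, p) ** 2 + Ly(r, p) ** 2 + Lz(r, p) ** 2

# Assumed central-force Hamiltonian with unit mass and U(r) = -1/r.
H = lambda r, p: 0.5 * sum(c * c for c in p) - 1 / math.sqrt(sum(c * c for c in r))

r0, p0 = [0.3, -1.2, 0.8], [1.1, 0.4, -0.6]

lx_ly = poisson_bracket(Lx, Ly, r0, p0)   # expected to equal Lz
lz_value = Lz(r0, p0)
l2_lz = poisson_bracket(L2, Lz, r0, p0)   # expected to vanish
lz_h = poisson_bracket(Lz, H, r0, p0)     # expected to vanish
```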

15.2.6 Hamilton’s equations of motion

An especially important application of Poisson brackets is that Hamilton's canonical equations of motion
can be expressed directly in the Poisson bracket form. The Poisson bracket representation of Hamiltonian
mechanics has important implications for quantum mechanics as will be described in chapter 18.
In equation (15.45) assume that $G$ is a fundamental coordinate, that is, $G \equiv q_k$. Since $q_k$ is not explicitly
time dependent, then
$$\frac{dq_k}{dt} = \frac{\partial q_k}{\partial t} + [q_k, H] \tag{15.51}$$
$$= 0 + \sum_i \left( \frac{\partial q_k}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial q_k}{\partial p_i}\frac{\partial H}{\partial q_i} \right)$$
$$= \sum_i \left( \delta_{ki}\frac{\partial H}{\partial p_i} - 0 \cdot \frac{\partial H}{\partial q_i} \right)$$
$$= \frac{\partial H}{\partial p_k} \tag{15.52}$$

That is

$$\dot{q}_k = [q_k, H] = \frac{\partial H}{\partial p_k} \tag{15.53}$$

Similarly consider the fundamental canonical momentum $G \equiv p_k$. Since it is not explicitly time dependent,
then
$$\frac{dp_k}{dt} = \frac{\partial p_k}{\partial t} + [p_k, H] \tag{15.54}$$
$$= 0 + \sum_i \left( \frac{\partial p_k}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial p_k}{\partial p_i}\frac{\partial H}{\partial q_i} \right)$$
$$= \sum_i \left( 0 \cdot \frac{\partial H}{\partial p_i} - \delta_{ki}\frac{\partial H}{\partial q_i} \right)$$
$$= -\frac{\partial H}{\partial q_k} \tag{15.55}$$

That is

$$\dot{p}_k = [p_k, H] = -\frac{\partial H}{\partial q_k} \tag{15.56}$$

Thus, it is seen that the Poisson bracket form of the equations of motion includes the Hamilton equations
of motion. That is,

$$\dot{q}_k = [q_k, H] = \frac{\partial H}{\partial p_k} \tag{15.57}$$
$$\dot{p}_k = [p_k, H] = -\frac{\partial H}{\partial q_k} \tag{15.58}$$

The above shows that the full structure of Hamilton's equations of motion can be expressed directly in
terms of Poisson brackets.
The elegant formulation of Poisson brackets has the same form in all canonical coordinates as the
Hamiltonian formulation. However, the normal Hamilton canonical equations in classical mechanics assume
implicitly that one can specify the exact position and momentum of a particle simultaneously at any point in time,
which is applicable only to classical mechanics variables that are continuous functions of the coordinates,
and not to quantized systems. The important feature of the Poisson bracket representation of Hamilton's
equations is that it generalizes Hamilton's equations into a form (15.57, 15.58) where the Poisson bracket is
equally consistent with both classical and quantum mechanics in that it allows for non-commuting canonical
variables and Heisenberg's Uncertainty Principle. Thus the generalization of Hamilton's equations, via use
of the Poisson brackets, provides one of the most powerful analytic tools applicable to both classical and
quantal dynamics. It played a pivotal role in the derivation of quantum theory as described in chapter 18.
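
Equations (15.57) and (15.58) can be used directly as a numerical recipe. The sketch below assumes a one-dimensional harmonic oscillator with unit mass and unit spring constant, evaluates the brackets by central differences, and advances the motion with a classical fourth-order Runge-Kutta step; the names `bracket`, `flow`, and `rk4_step` are hypothetical and the integrator is an illustrative choice, not one prescribed by the text.

```python
import math

def bracket(f, g, q, p, h=1e-5):
    """Central-difference Poisson bracket [f, g] for one degree of freedom."""
    df_dq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    df_dp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dg_dq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dg_dp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return df_dq * dg_dp - df_dp * dg_dq

def hamiltonian(q, p, m=1.0, k=1.0):
    """Assumed example: H = p^2 / 2m + k q^2 / 2."""
    return p * p / (2 * m) + 0.5 * k * q * q

coordinate = lambda q, p: q
momentum = lambda q, p: p

def flow(q, p):
    """(dq/dt, dp/dt) = ([q, H], [p, H]), equations (15.57) and (15.58)."""
    return bracket(coordinate, hamiltonian, q, p), bracket(momentum, hamiltonian, q, p)

def rk4_step(q, p, dt):
    """One classical Runge-Kutta step of the bracket equations of motion."""
    k1q, k1p = flow(q, p)
    k2q, k2p = flow(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = flow(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = flow(q + dt * k3q, p + dt * k3p)
    return (q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6,
            p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6)

q, p = 1.0, 0.0          # initial conditions
steps, dt = 1000, 0.001
for _ in range(steps):
    q, p = rk4_step(q, p, dt)
t_final = steps * dt     # the exact solution is q = cos(t), p = -sin(t)
```

For this choice of Hamiltonian the integrated values can be compared with $q = \cos t$ and $p = -\sin t$ at `t_final`.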

15.3 Example: Lorentz force in electromagnetism

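Consider a charge $q$ and mass $m$ in constant electromagnetic fields with scalar potential $\Phi$ and vector
potential $\mathbf{A}$. Chapter 6.10 showed that the Lagrangian for electromagnetism can be written as
$$L = \frac{1}{2}m\,\dot{\mathbf{x}} \cdot \dot{\mathbf{x}} - q\left( \Phi - \mathbf{A} \cdot \dot{\mathbf{x}} \right)$$
The generalized momentum then is given by
$$\mathbf{p} = \frac{\partial L}{\partial \dot{\mathbf{x}}} = m\dot{\mathbf{x}} + q\mathbf{A}$$
Thus the Hamiltonian can be written as
$$H = (\mathbf{p} \cdot \dot{\mathbf{x}}) - L = \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} + q\Phi$$
The Hamilton equations of motion give
$$\dot{\mathbf{x}} = [\mathbf{x}, H] = \frac{(\mathbf{p} - q\mathbf{A})}{m}$$
and
$$\dot{\mathbf{p}} = [\mathbf{p}, H] = -q\nabla\Phi + \frac{q}{m}\left\{ (\mathbf{p} - q\mathbf{A}) \times (\nabla \times \mathbf{A}) \right\}$$
Define the magnetic field to be
$$\mathbf{B} \equiv \nabla \times \mathbf{A}$$
and the electric field to be
$$\mathbf{E} = -\nabla\Phi - \frac{\partial \mathbf{A}}{\partial t}$$
then the Lorentz force can be written as
$$\mathbf{F} = \dot{\mathbf{p}} = q\left( \mathbf{E} + \dot{\mathbf{x}} \times \mathbf{B} \right)$$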

15.4 Example: Wavemotion:

Assume that one is dealing with traveling waves of the form Ψ = (   −) for a one-dimensional
1

conservative system of many identical coupled linear oscillators. Then evaluating the following Poisson
brackets gives

[  ] = 0
[ ] = 0
[ ] = 0
[ ] = 0

Thus     and  are constants of motion. However,

[  ] =
6 0
[ ] = 6 0

Thus one cannot simultaneously measure the conjugate variables ( ) or ( ). This is the Uncertainty
Principle that is manifest by all forms of wave motion in classical and quantal mechanics as discussed in
chapter 3.11.3.

15.5 Example: Two-dimensional, anisotropic, linear oscillator

Consider a mass $m$ bound by an anisotropic, two-dimensional, linear oscillator potential. As discussed
in chapter 11, the motion can be described as lying entirely in the $x$-$y$ plane that is perpendicular to the
angular momentum $\mathbf{L}$. It is interesting to derive the equations of motion for this system using the Poisson
bracket representation of Hamiltonian mechanics.
The kinetic energy is given by
$$T(\dot{x}, \dot{y}) = \frac{1}{2}m\left( \dot{x}^2 + \dot{y}^2 \right)$$
The linear binding is reproduced assuming a quadratic scalar potential energy of the form
$$U(x, y) = \frac{1}{2}k\left( x^2 + y^2 \right) + \eta xy$$
where $\eta$ is the anharmonic strength that couples the modes of the isotropic linear oscillator.
a) NORMAL MODES: As discussed in chapter 14, a transformation to the normal modes of the system
is given by using variables $(\alpha, \beta)$ where

$$\alpha \equiv \frac{1}{\sqrt{2}}(x + y) \qquad \beta \equiv \frac{1}{\sqrt{2}}(x - y)$$
Expressing the kinetic and potential energies in terms of the new coordinates gives
$$T(\dot{\alpha}, \dot{\beta}) = \frac{1}{4}m\left[ \left( \dot{\alpha} + \dot{\beta} \right)^2 + \left( \dot{\alpha} - \dot{\beta} \right)^2 \right] = \frac{1}{2}m\left( \dot{\alpha}^2 + \dot{\beta}^2 \right)$$
$$U = \frac{1}{4}k\left[ (\alpha + \beta)^2 + (\alpha - \beta)^2 \right] + \frac{1}{2}\eta\left( \alpha^2 - \beta^2 \right) = \frac{1}{2}(k + \eta)\,\alpha^2 + \frac{1}{2}(k - \eta)\,\beta^2$$
Note that the coordinate transformation makes the Lagrangian separable, that is
$$L = \frac{1}{2}m\left( \dot{\alpha}^2 + \dot{\beta}^2 \right) - \frac{1}{2}(k + \eta)\,\alpha^2 - \frac{1}{2}(k - \eta)\,\beta^2 = L_\alpha + L_\beta$$
where
$$L_\alpha = \frac{1}{2}m\dot{\alpha}^2 - \frac{1}{2}(k + \eta)\,\alpha^2 \qquad L_\beta = \frac{1}{2}m\dot{\beta}^2 - \frac{1}{2}(k - \eta)\,\beta^2$$
This shows that the transformation has separated the system into two normal modes that are harmonic
oscillators with angular frequencies
$$\omega_1 = \sqrt{\frac{k + \eta}{m}} \qquad \omega_2 = \sqrt{\frac{k - \eta}{m}}$$
Note that the non-isotropic harmonic oscillator reduces to the isotropic linear oscillator when $\eta = 0$.
b) HAMILTONIAN: The canonical momenta are given by

$$p_\alpha = \frac{\partial L}{\partial \dot{\alpha}} = m\dot{\alpha}$$
$$p_\beta = \frac{\partial L}{\partial \dot{\beta}} = m\dot{\beta}$$
The definition of the Hamiltonian gives
$$H = p_\alpha\dot{\alpha} + p_\beta\dot{\beta} - L = \frac{1}{2m}\left( p_\alpha^2 + p_\beta^2 \right) + \frac{1}{2}(k + \eta)\,\alpha^2 + \frac{1}{2}(k - \eta)\,\beta^2$$
Note that this can be separated as
$$H = H_\alpha + H_\beta$$
where
$$H_\alpha = \frac{1}{2m}p_\alpha^2 + \frac{1}{2}(k + \eta)\,\alpha^2 \qquad H_\beta = \frac{1}{2m}p_\beta^2 + \frac{1}{2}(k - \eta)\,\beta^2$$

Using the Poisson bracket expression for the time dependence, equation 15.45, and using the fact that
the Hamiltonian is not explicitly time dependent, that is, $\frac{\partial H_\alpha}{\partial t} = 0$, gives

$$\frac{dH_\alpha}{dt} = \frac{\partial H_\alpha}{\partial t} + [H_\alpha, H] = 0 + [H_\alpha, H_\alpha + H_\beta] = [H_\alpha, H_\beta]$$
$$= \frac{\partial H_\alpha}{\partial \alpha}\frac{\partial H_\beta}{\partial p_\alpha} + \frac{\partial H_\alpha}{\partial \beta}\frac{\partial H_\beta}{\partial p_\beta} - \frac{\partial H_\alpha}{\partial p_\alpha}\frac{\partial H_\beta}{\partial \alpha} - \frac{\partial H_\alpha}{\partial p_\beta}\frac{\partial H_\beta}{\partial \beta} = 0$$

Similarly $\frac{dH_\beta}{dt} = 0$. This implies that the Hamiltonians for both normal modes, $H_\alpha$ and $H_\beta$, are
time-independent constants of motion which are equal to the total energy for each mode.
c) ANGULAR MOMENTUM: The angular momentum for motion in the $\alpha\beta$ plane is perpendicular to
the $\alpha\beta$ plane with a magnitude of
$$L = m\left( \alpha\dot{\beta} - \beta\dot{\alpha} \right) = \alpha p_\beta - \beta p_\alpha$$
The time dependence of the angular momentum is given by
$$\frac{dL}{dt} = \frac{\partial L}{\partial t} + [L, H] = 0 + \frac{\partial L}{\partial \alpha}\frac{\partial H}{\partial p_\alpha} - \frac{\partial L}{\partial p_\alpha}\frac{\partial H}{\partial \alpha} + \frac{\partial L}{\partial \beta}\frac{\partial H}{\partial p_\beta} - \frac{\partial L}{\partial p_\beta}\frac{\partial H}{\partial \beta}$$
$$= \frac{p_\alpha p_\beta}{m} + k\alpha\beta + \eta\alpha\beta - \frac{p_\alpha p_\beta}{m} - k\alpha\beta + \eta\alpha\beta = 2\eta\alpha\beta$$

Note that if $\eta = 0$ then the two eigenfrequencies are degenerate, $\omega_1 = \omega_2$, that is, the system reduces to
the isotropic harmonic oscillator in the $\alpha\beta$ plane that was discussed in chapter 11.9. In addition, $\frac{dL}{dt} = 0$
for $\eta = 0$, that is, the angular momentum $L$ in the $\alpha\beta$ plane is a constant of motion when $\eta = 0$.
d) SYMMETRY TENSOR: The symmetry tensor was defined in chapter 11.9.3 to be
$$A'_{ij} = \frac{p_i p_j}{2m} + \frac{1}{2}k\,x_i x_j$$
where $i$ and $j$ can correspond to either $x$ or $y$. The symmetry tensor defines the orientation of the major
axis of the elliptical orbit for the two-dimensional, isotropic, linear oscillator as described in chapter 11.9.3.
The isotropic oscillator has been shown to have two normal modes that are degenerate, therefore $x$ and
$y$ are equally good normal modes. The Hamiltonian showed that, for $\eta = 0$, the
total energy is conserved, as well as the energies for each of the two normal modes, which are

$$H_x = \frac{p_x^2}{2m} + \frac{1}{2}kx^2 \qquad H_y = \frac{p_y^2}{2m} + \frac{1}{2}ky^2$$
Consider the matrix element
$$A'_{ij} = \frac{p_i p_j}{2m} + \frac{1}{2}k\,x_i x_j$$
where $i, j$ each can represent $x$ or $y$. Then for each matrix element
$$\frac{dA'_{ij}}{dt} = \frac{\partial A'_{ij}}{\partial t} + \left[A'_{ij}, H\right] = 0 + \frac{\partial A'_{ij}}{\partial x}\frac{\partial H}{\partial p_x} - \frac{\partial A'_{ij}}{\partial p_x}\frac{\partial H}{\partial x} + \frac{\partial A'_{ij}}{\partial y}\frac{\partial H}{\partial p_y} - \frac{\partial A'_{ij}}{\partial p_y}\frac{\partial H}{\partial y} = 0$$

That is, each matrix element $A'_{ij}$ commutes with the Hamiltonian

$$\left[A'_{ij}, H\right] = 0$$

Thus the Poisson bracket representation of Hamiltonian mechanics has been used to prove that the
symmetry tensor $A'_{ij} = \frac{p_i p_j}{2m} + \frac{1}{2}k\,x_i x_j$ is a constant of motion for the isotropic harmonic oscillator. That is,
all the elements $A'_{ij}$ of the symmetric tensor $\mathbf{A}'$ commute with the Hamiltonian.

Note that the three constants of motion, $\mathbf{L}$, $\mathbf{A}'$ and $H$, for the isotropic, two-dimensional, linear oscillator
form a closed algebra under the Poisson bracket formalism.
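
The normal-mode results of this example lend themselves to a quick numerical check. The sketch below assumes illustrative parameter values for $m$, $k$, and $\eta$ and an arbitrary phase-space point; `poisson_bracket` is a hypothetical central-difference helper, with `q = [alpha, beta]` and `p = [p_alpha, p_beta]`.

```python
def poisson_bracket(f, g, q, p, h=1e-4):
    """Central-difference estimate of [f, g] at the phase-space point (q, p)."""
    def partial(func, use_q, i):
        qs, ps = list(q), list(p)
        coords = qs if use_q else ps
        coords[i] += h
        upper = func(qs, ps)
        coords[i] -= 2 * h
        lower = func(qs, ps)
        return (upper - lower) / (2 * h)

    return sum(
        partial(f, True, i) * partial(g, False, i)
        - partial(f, False, i) * partial(g, True, i)
        for i in range(len(q))
    )

# Assumed illustrative parameters.
m, k, eta = 1.0, 2.0, 0.5

# Normal-mode Hamiltonians and the angular momentum L = alpha p_beta - beta p_alpha.
H_alpha = lambda q, p: p[0] ** 2 / (2 * m) + 0.5 * (k + eta) * q[0] ** 2
H_beta = lambda q, p: p[1] ** 2 / (2 * m) + 0.5 * (k - eta) * q[1] ** 2
H = lambda q, p: H_alpha(q, p) + H_beta(q, p)
L = lambda q, p: q[0] * p[1] - q[1] * p[0]

q0, p0 = [0.6, -0.9], [0.2, 1.5]

dH_alpha_dt = poisson_bracket(H_alpha, H, q0, p0)   # expected to vanish
dL_dt = poisson_bracket(L, H, q0, p0)               # expected 2 eta alpha beta
expected_dL_dt = 2 * eta * q0[0] * q0[1]
```

Setting `eta = 0.0` in the sketch makes `dL_dt` vanish, consistent with the angular momentum being conserved only for the isotropic oscillator.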

15.6 Example: The eccentricity vector

Chapter 1184 showed that Hamilton’s eccentricity vector for the inverse square-law attractive force,

A ≡ (p × L) + (r̂)
15.2. POISSON BRACKET REPRESENTATION OF HAMILTONIAN MECHANICS 415

is a constant of motion that specifies the major axis of the elliptical orbit. The eccentricity vector for the
inverse-square-law force can be investigated using Poisson Brackets as was done for the symmetry tensor
above. It can be shown that

[   ] =  
µ 2 ¶
p 
[   ] = −2 +   (a)
2 
Note that the bracket on the right-hand side of equation () equals the Hamiltonian  for the inverse square-
law attractive force, and thus the Poisson bracket equals
µ 2 ¶
p 
[   ] = −2 +   = −2 
2 
For the Hamiltonian  it can be shown that the Poisson bracket

[ A] = 0

That is, the eccentricity vector commutes with the Hamiltonian and thus it is a constant of motion. Previously
this result was obtained directly using the equations of motion as given in equation 1187. Note that the three
constants of motion, L, A and H form a closed algebra under the Poisson Bracket formalism similar to
the triad of constants of motion, L, A0 and H that occur for the two-dimensional, isotropic linear oscillator
described above. Examples 155 and 156 illustrate that the Poisson Brackets representation of Hamiltonian
mechanics is a powerful probe of the underlying physics, as well as confirming the results obtained directly
from the equations of motion as described in chapter 1184 and 11 9 3 .

15.2.7 Liouville’s Theorem

Liouville's theorem illustrates an application of Poisson brackets
to Hamiltonian phase space that has important implications for
statistical physics. The trajectory of a single particle in phase
space is completely determined by the equations of motion if the
initial conditions are known. However, many-body systems have
so many degrees of freedom that it becomes impractical to solve all
the equations of motion of the many bodies. An example is a
statistical ensemble in a gas, a plasma, or a beam of particles.
Usually it is not possible to specify the exact point in phase space
for such complicated systems. However, it is possible to define an
ensemble of points in phase space that encompasses all possible
trajectories for the complicated system. That is, the statistical
distribution of particles in phase space can be specified.

[Figure 15.1: Infinitesimal element of area in phase space, with axes $q_i$ and $p_i$ and sides $dq_i$ and $dp_i$.]

Consider a density $\rho$ of representative points in $(\mathbf{q}, \mathbf{p})$ phase
space. The number $N$ of systems in the volume element $dv$ is
$$N = \rho\,dv \tag{15.59}$$

where it is assumed that the infinitesimal volume element
$dv = dq_1\,dq_2 \ldots dq_s\,dp_1\,dp_2 \ldots dp_s$ contains many possible systems
so that $\rho$ can be considered a continuous distribution. For
the conjugate variables $(q_i, p_i)$ shown in figure 15.1, the number
of representative points moving across the left-hand edge into
the area per unit time is
$$\rho\,\dot{q}_i\,dp_i \tag{15.60}$$
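The number of representative points flowing out of the area along the right-hand edge is
$$\left[ \rho\,\dot{q}_i + \frac{\partial}{\partial q_i}\left( \rho\,\dot{q}_i \right) dq_i \right] dp_i \tag{15.61}$$

Hence the net increase in $\rho$ in the infinitesimal rectangular element $dq_i\,dp_i$ due to flow in the horizontal
direction is
$$-\frac{\partial}{\partial q_i}\left( \rho\,\dot{q}_i \right) dq_i\,dp_i \tag{15.62}$$

Similarly, the net gain due to flow in the vertical direction is
$$-\frac{\partial}{\partial p_i}\left( \rho\,\dot{p}_i \right) dq_i\,dp_i \tag{15.63}$$

Thus the total increase in the element $dq_i\,dp_i$ per unit time is
$$-\left[ \frac{\partial}{\partial q_i}\left( \rho\,\dot{q}_i \right) + \frac{\partial}{\partial p_i}\left( \rho\,\dot{p}_i \right) \right] dq_i\,dp_i \tag{15.64}$$
Assume that the total number of points must be conserved; then the total increase in the number of
points inside the element $dq_i\,dp_i$ must equal the net change in $\rho$ on the infinitesimal surface element per
unit time. That is
$$\left( \frac{\partial \rho}{\partial t} \right) dq_i\,dp_i \tag{15.65}$$

Thus summing over all possible values of $i$ gives
$$\frac{\partial \rho}{\partial t} + \sum_i \left[ \frac{\partial}{\partial q_i}\left( \rho\,\dot{q}_i \right) + \frac{\partial}{\partial p_i}\left( \rho\,\dot{p}_i \right) \right] = 0 \tag{15.66}$$
or
$$\frac{\partial \rho}{\partial t} + \sum_i \left[ \frac{\partial \rho}{\partial q_i}\dot{q}_i + \frac{\partial \rho}{\partial p_i}\dot{p}_i \right] + \rho\sum_i \left[ \frac{\partial \dot{q}_i}{\partial q_i} + \frac{\partial \dot{p}_i}{\partial p_i} \right] = 0 \tag{15.67}$$
Inserting Hamilton's canonical equations into both brackets and differentiating the last bracket results in
$$\frac{\partial \rho}{\partial t} + \sum_i \left[ \frac{\partial \rho}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial \rho}{\partial p_i}\frac{\partial H}{\partial q_i} \right] + \rho\sum_i \left[ \frac{\partial^2 H}{\partial q_i\,\partial p_i} - \frac{\partial^2 H}{\partial p_i\,\partial q_i} \right] = 0 \tag{15.68}$$
The two terms in the last bracket cancel and thus
$$\frac{\partial \rho}{\partial t} + \sum_i \left[ \frac{\partial \rho}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial \rho}{\partial p_i}\frac{\partial H}{\partial q_i} \right] = \frac{\partial \rho}{\partial t} + [\rho, H] = 0 \tag{15.69}$$

However, this just equals $\frac{d\rho}{dt}$, therefore
$$\frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + [\rho, H] = 0 \tag{15.70}$$
This is called Liouville's theorem, which states that the rate of change of density of representative
points vanishes, that is, the density of points is a constant in the Hamiltonian phase space along a specific
trajectory. Liouville's theorem means that the system acts like an incompressible fluid that moves such as to
occupy an equal volume in phase space at every instant, even though the shape of the phase-space volume
may change, that is, the phase-space density of the fluid remains constant. Equation (15.70) is another
illustration of the basic Poisson bracket relation (15.45) and the usefulness of Poisson brackets in physics.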
Liouville's theorem is crucially important to the statistical mechanics of ensembles where exact knowledge
of the system is unavailable and only statistical averages are known. An example is the focussing of beams of charged
particles by beam handling systems. At a focus of the beam, the transverse width in $x$ is minimized, while
the width in $p_x$ is largest since the beam is converging to the focus, whereas a parallel beam has maximum
width $x$ and minimum spreading width $p_x$. However, the product $x\,p_x$ remains constant throughout the
focussing system. For a two-dimensional beam, this applies equally for the $y$ and $p_y$ coordinates, etc. It follows
that the final beam quality for any beam transport system is ultimately limited by the emittance of
the source of the beam, that is, the initial area of the phase space distribution. Note that Liouville's theorem
only applies to Hamiltonian $q$-$p$ phase space, not to $q$-$\dot{q}$ Lagrangian state space. As a consequence,
Hamiltonian dynamics, rather than Lagrangian dynamics, is used to discuss ensembles in statistical physics.
Note that Liouville's theorem is applicable only for conservative systems, that is, where Hamilton's
equations of motion apply. For dissipative systems the phase space volume shrinks with time rather than
being a constant of the motion.
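
The contrast between conservative and dissipative flow can be illustrated with a small numerical sketch. It assumes two simple systems with known exact solutions, both with unit mass: a harmonic oscillator (Hamiltonian) and a free particle with linear friction $\dot{p} = -\gamma p$ (dissipative). The function names, the parameter values, and the initial rectangular cell are hypothetical choices for illustration.

```python
import math

def polygon_area(points):
    """Shoelace formula for the area enclosed by an ordered list of (q, p) vertices."""
    total = 0.0
    for (q1, p1), (q2, p2) in zip(points, points[1:] + points[:1]):
        total += q1 * p2 - q2 * p1
    return abs(total) / 2

def oscillator_flow(q, p, t, omega=2.0):
    """Exact flow of H = p^2/2 + omega^2 q^2/2 (unit mass)."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return q * c + (p / omega) * s, p * c - omega * q * s

def friction_flow(q, p, t, gamma=0.5):
    """Exact flow of dq/dt = p, dp/dt = -gamma p (not a Hamiltonian system)."""
    decay = math.exp(-gamma * t)
    return q + p * (1 - decay) / gamma, p * decay

# A small rectangular cell of representative points in the (q, p) plane.
cell = [(1.0, 0.0), (1.2, 0.0), (1.2, 0.3), (1.0, 0.3)]
t = 1.7

area_initial = polygon_area(cell)
area_hamiltonian = polygon_area([oscillator_flow(q, p, t) for q, p in cell])
area_dissipative = polygon_area([friction_flow(q, p, t) for q, p in cell])
```

Because both flows are linear maps of the phase plane, the cell stays a polygon; its area is unchanged by the oscillator flow and is multiplied by $e^{-\gamma t}$ by the friction flow.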

15.3 Canonical transformations in Hamiltonian mechanics

Hamiltonian mechanics is an especially elegant and powerful way to derive the equations of motion for
complicated systems. Unfortunately, integrating the equations of motion to derive a solution can be a challenge.
Hamilton recognized this difficulty, so he proposed using generating functions to make canonical
transformations which transform the equations into a known soluble form. Jacobi, a contemporary mathematician,
recognized the importance of Hamilton's pioneering developments in Hamiltonian mechanics, and therefore
he developed a sophisticated mathematical framework for exploiting the generating function formalism in
order to make the canonical transformations required to solve Hamilton's equations of motion.
In the Lagrange formulation, transforming coordinates $(q_i, \dot{q}_i)$ to cyclic generalized coordinates $(Q_i, \dot{Q}_i)$
simplifies finding the Euler-Lagrange equations of motion. For the Hamiltonian formulation, the concept of
coordinate transformations is extended to include simultaneous canonical transformation of both the spatial
coordinates $q_i$ and the conjugate momenta $p_i$ from $(q_i, p_i)$ to $(Q_i, P_i)$, where both of the canonical variables
are treated equally in the transformation. Compared to Lagrangian mechanics, Hamiltonian mechanics has
twice as many variables, which is an asset rather than a liability since it widens the realm of possible
canonical transformations.
Hamiltonian mechanics has the advantage that generating functions can be exploited to make canonical
transformations to find solutions, which avoids having to use direct integration. Canonical transformations
are the foundation of Hamiltonian mechanics; they underlie Hamilton-Jacobi theory and action-angle variable
theory, both of which are powerful means for exploiting Hamiltonian mechanics to solve problems in physics
and engineering. The concept underlying canonical transformations is that, if the equations of motion are
simplified by using a new set of generalized variables $(\mathbf{Q}, \mathbf{P})$ compared to using the original set of variables
$(\mathbf{q}, \mathbf{p})$, then an advantage has been gained. The solution, expressed in terms of the generalized variables
$(\mathbf{Q}, \mathbf{P})$, can be transformed back to express the solution in terms of the original coordinates, $(\mathbf{q}, \mathbf{p})$.
Only a specialized subset of transformations will be considered, namely canonical transformations that
preserve the canonical form of Hamilton's equations of motion. That is, given that the original set of variables
$(q_i, p_i)$ satisfy Hamilton's equations
$$\dot{\mathbf{q}} = \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial \mathbf{p}} \qquad -\dot{\mathbf{p}} = \frac{\partial H(\mathbf{q}, \mathbf{p}, t)}{\partial \mathbf{q}} \tag{15.71}$$
for some Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$, then the transformation to coordinates $Q_i(\mathbf{q}, \mathbf{p}, t)$, $P_i(\mathbf{q}, \mathbf{p}, t)$ is canonical
if, and only if, there exists a function $\mathcal{H}(\mathbf{Q}, \mathbf{P}, t)$ such that the $\mathbf{P}$ and $\mathbf{Q}$ are still governed by Hamilton's
equations. That is,
$$\dot{\mathbf{Q}} = \frac{\partial \mathcal{H}(\mathbf{Q}, \mathbf{P}, t)}{\partial \mathbf{P}} \qquad -\dot{\mathbf{P}} = \frac{\partial \mathcal{H}(\mathbf{Q}, \mathbf{P}, t)}{\partial \mathbf{Q}} \tag{15.72}$$
where $\mathcal{H}(\mathbf{Q}, \mathbf{P}, t)$ plays the role of the Hamiltonian for the new variables. Note that $\mathcal{H}(\mathbf{Q}, \mathbf{P}, t)$ may be
very different from the old Hamiltonian $H(\mathbf{q}, \mathbf{p}, t)$. The invariance of the Poisson bracket to canonical
transformations, chapter 15.2.3, provides a powerful test that the transformation is canonical.
Hamilton's Principle of least action, discussed in chapter 9, states that
$$\delta S = \delta\int_{t_1}^{t_2} L(\mathbf{q}, \dot{\mathbf{q}}, t)\,dt = \delta\int_{t_1}^{t_2} \left[ \mathbf{p} \cdot \dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t) \right] dt = 0 \tag{15.73}$$

Similarly, applying Hamilton's Principle of least action to the new Lagrangian $\mathcal{L}(\mathbf{Q}, \dot{\mathbf{Q}}, t)$ gives
$$\delta S = \delta\int_{t_1}^{t_2} \mathcal{L}(\mathbf{Q}, \dot{\mathbf{Q}}, t)\,dt = \delta\int_{t_1}^{t_2} \left[ \mathbf{P} \cdot \dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q}, \mathbf{P}, t) \right] dt = 0 \tag{15.74}$$

The discussion of gauge-invariant Lagrangians, chapter 9.3, showed that $L$ and $\mathcal{L}$ can be related by the total
time derivative of a generating function $F$ where
$$\frac{dF}{dt} = L - \mathcal{L} \tag{15.75}$$

The generating function $F$ can be any well-behaved function with continuous second derivatives of both the
old and new canonical variables $\mathbf{p}$, $\mathbf{q}$, $\mathbf{P}$, $\mathbf{Q}$ and $t$. Thus the integrands of (15.73) and (15.74) are related by
$$\lambda\left[ \mathbf{p} \cdot \dot{\mathbf{q}} - H(\mathbf{q}, \mathbf{p}, t) \right] = \mathbf{P} \cdot \dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q}, \mathbf{P}, t) + \frac{dF}{dt} \tag{15.76}$$

where $\lambda$ is a possible scale transformation. A scale transformation, such as changing units, is trivial, and will
be assumed to be absorbed into the coordinates, making $\lambda = 1$. A transformation for which $\lambda \neq 1$ is called an extended
canonical transformation.

15.3.1 Generating functions

The generating function $F$ has to be chosen such that the transformation from the initial variables $(\mathbf{q}, \mathbf{p})$
to the final variables $(\mathbf{Q}, \mathbf{P})$ is a canonical transformation. The chosen generating function contributes to
(15.76) only if it is a function of the old plus new variables. The four possible types of generating functions
of the first kind are $F_1(\mathbf{q}, \mathbf{Q}, t)$, $F_2(\mathbf{q}, \mathbf{P}, t)$, $F_3(\mathbf{p}, \mathbf{Q}, t)$, and $F_4(\mathbf{p}, \mathbf{P}, t)$. These four generating functions
lead to relatively simple canonical transformations, as shown below.

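Type 1: $F = F_1(\mathbf{q},\mathbf{Q},t)$:

The total time derivative of the generating function $F = F_1(\mathbf{q},\mathbf{Q},t)$ is given by
$$\frac{dF(\mathbf{q},\mathbf{Q},t)}{dt} = \left[\frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial \mathbf{q}}\cdot\dot{\mathbf{q}} + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial \mathbf{Q}}\cdot\dot{\mathbf{Q}}\right] + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial t} \tag{15.77}$$

Insert equation (15.77) into equation (15.76), and assume that the trivial scale factor $\lambda = 1$, then
$$\left[\mathbf{p} - \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial \mathbf{q}}\right]\cdot\dot{\mathbf{q}} - H(\mathbf{q},\mathbf{p},t) = \left[\mathbf{P} + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial \mathbf{Q}}\right]\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t) + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial t}$$

Assume that the generating function $F_1$ determines the canonical variables $\mathbf{p}$ and $\mathbf{P}$ to be
$$\mathbf{p} = \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial \mathbf{q}} \qquad \mathbf{P} = -\frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial \mathbf{Q}} \tag{15.78}$$

then the terms in each square bracket cancel, leading to the required canonical transformation
$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F_1(\mathbf{q},\mathbf{Q},t)}{\partial t} \tag{15.79}$$
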

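Type 2: $F = F_2(\mathbf{q},\mathbf{P},t) - \mathbf{Q}\cdot\mathbf{P}$:

The total time derivative of the generating function $F = F_2(\mathbf{q},\mathbf{P},t) - \mathbf{Q}\cdot\mathbf{P}$ is given by
$$\frac{dF}{dt} = \left[\frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial \mathbf{q}}\cdot\dot{\mathbf{q}} + \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial \mathbf{P}}\cdot\dot{\mathbf{P}} - \mathbf{P}\cdot\dot{\mathbf{Q}} - \dot{\mathbf{P}}\cdot\mathbf{Q}\right] + \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial t} \tag{15.80}$$

Insert this into equation (15.76), and assume that the trivial scale factor $\lambda = 1$, then
$$\left(\mathbf{p} - \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial \mathbf{q}}\right)\cdot\dot{\mathbf{q}} - H(\mathbf{q},\mathbf{p},t) = \mathbf{P}\cdot\dot{\mathbf{Q}} - \mathbf{P}\cdot\dot{\mathbf{Q}} + \left[\frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial \mathbf{P}} - \mathbf{Q}\right]\cdot\dot{\mathbf{P}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t) + \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial t}$$

Assume that the generating function $F_2$ determines the canonical variables $\mathbf{p}$ and $\mathbf{Q}$ to be
$$\mathbf{p} = \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial \mathbf{q}} \qquad \mathbf{Q} = \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial \mathbf{P}} \tag{15.81}$$

then the terms in brackets cancel, leading to the required transformation
$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F_2(\mathbf{q},\mathbf{P},t)}{\partial t} \tag{15.82}$$
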

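Type 3: $F = F_3(\mathbf{p},\mathbf{Q},t) + \mathbf{q}\cdot\mathbf{p}$:

The total time derivative of the generating function $F = F_3(\mathbf{p},\mathbf{Q},t) + \mathbf{q}\cdot\mathbf{p}$ is given by
$$\frac{dF}{dt} = \left[\frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial \mathbf{p}}\cdot\dot{\mathbf{p}} + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial \mathbf{Q}}\cdot\dot{\mathbf{Q}} + \dot{\mathbf{q}}\cdot\mathbf{p} + \mathbf{q}\cdot\dot{\mathbf{p}}\right] + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial t} \tag{15.83}$$

Insert this into equation (15.76), and assume that the trivial scale factor $\lambda = 1$, then
$$-\left[\mathbf{q} + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial \mathbf{p}}\right]\cdot\dot{\mathbf{p}} - H(\mathbf{q},\mathbf{p},t) = \left[\mathbf{P} + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial \mathbf{Q}}\right]\cdot\dot{\mathbf{Q}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t) + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial t}$$

Assume that the generating function $F_3$ determines the canonical variables $\mathbf{q}$ and $\mathbf{P}$ to be
$$\mathbf{q} = -\frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial \mathbf{p}} \qquad \mathbf{P} = -\frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial \mathbf{Q}} \tag{15.84}$$
then the terms in brackets cancel, leading to the required transformation
$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F_3(\mathbf{p},\mathbf{Q},t)}{\partial t} \tag{15.85}$$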

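Type 4: $F = F_4(\mathbf{p},\mathbf{P},t) + \mathbf{q}\cdot\mathbf{p} - \mathbf{Q}\cdot\mathbf{P}$:

The total time derivative of the generating function $F = F_4(\mathbf{p},\mathbf{P},t) + \mathbf{q}\cdot\mathbf{p} - \mathbf{Q}\cdot\mathbf{P}$ is given by
$$\frac{dF}{dt} = \left[\frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial \mathbf{p}}\cdot\dot{\mathbf{p}} + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial \mathbf{P}}\cdot\dot{\mathbf{P}} + \dot{\mathbf{q}}\cdot\mathbf{p} + \mathbf{q}\cdot\dot{\mathbf{p}} - \dot{\mathbf{Q}}\cdot\mathbf{P} - \mathbf{Q}\cdot\dot{\mathbf{P}}\right] + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial t} \tag{15.86}$$

Insert this into equation (15.76), and assume that the trivial scale factor $\lambda = 1$, then
$$-\left[\mathbf{q} + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial \mathbf{p}}\right]\cdot\dot{\mathbf{p}} - H(\mathbf{q},\mathbf{p},t) = \left[\frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial \mathbf{P}} - \mathbf{Q}\right]\cdot\dot{\mathbf{P}} - \mathcal{H}(\mathbf{Q},\mathbf{P},t) + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial t}$$

Assume that the generating function $F_4$ determines the canonical variables $\mathbf{q}$ and $\mathbf{Q}$ to be
$$\mathbf{q} = -\frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial \mathbf{p}} \qquad \mathbf{Q} = \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial \mathbf{P}} \tag{15.87}$$
then the terms in brackets cancel, leading to the required transformation
$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F_4(\mathbf{p},\mathbf{P},t)}{\partial t} \tag{15.88}$$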

Note that the last three generating functions require the inclusion of additional bilinear products of
$q_i, p_i, Q_i, P_i$ in order for the terms to cancel to give the required result. The addition of the bilinear terms
ensures that the resultant generating function $F$ is the same using any of the four generating functions
$F_1, F_2, F_3, F_4$. Frequently the $F_2(\mathbf{q},\mathbf{P},t)$ generating function is the most convenient. The four possible
generating functions of the first kind, given above, are related by Legendre transformations. A canonical
transformation does not have to conform to only one of the four generating functions $F_i$ for all the degrees
of freedom; it can be a mixture of different flavors for the different degrees of freedom. The properties of
the generating functions are summarized in table 15.1.
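
As a brief worked illustration of the Legendre-transformation statement, the following sketch assumes that $F_1$ and $F_2$ describe the same canonical transformation, so that $F = F_1(\mathbf{q},\mathbf{Q},t) = F_2(\mathbf{q},\mathbf{P},t) - \mathbf{Q}\cdot\mathbf{P}$, and uses only equations (15.78) and (15.81). It is a consistency check, not a general proof.

```latex
% Assumption: F_1 and F_2 generate the same transformation.
F_2(\mathbf{q},\mathbf{P},t) = F_1(\mathbf{q},\mathbf{Q},t) + \mathbf{Q}\cdot\mathbf{P}
% Differentiate, using p = \partial F_1/\partial q and P = -\partial F_1/\partial Q from (15.78):
dF_2 = \mathbf{p}\cdot d\mathbf{q} - \mathbf{P}\cdot d\mathbf{Q}
       + \mathbf{Q}\cdot d\mathbf{P} + \mathbf{P}\cdot d\mathbf{Q}
       + \frac{\partial F_1}{\partial t}\,dt
     = \mathbf{p}\cdot d\mathbf{q} + \mathbf{Q}\cdot d\mathbf{P}
       + \frac{\partial F_1}{\partial t}\,dt
% Reading off the coefficients recovers (15.81):
\frac{\partial F_2}{\partial \mathbf{q}} = \mathbf{p}, \qquad
\frac{\partial F_2}{\partial \mathbf{P}} = \mathbf{Q}, \qquad
\frac{\partial F_2}{\partial t} = \frac{\partial F_1}{\partial t}
```

The $-\mathbf{P}\cdot d\mathbf{Q}$ term is cancelled by the bilinear product $\mathbf{Q}\cdot\mathbf{P}$, which is the usual mechanism by which a Legendre transformation exchanges the independent variable $\mathbf{Q}$ for $\mathbf{P}$.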

Table 151 Canonical transformation generating functions

Generating function Generating function derivatives Trivial special examples
 = 1 (q Q )  = 

1
 = − 1
 1 =    =   = −
2 2
 = 2 (q P ) − Q · P  =   =  2 =    =   = 
3 3
 = 3 (p Q ) + q · p  = −   = −  3 =    = −  = −
 = 4 (p P ) + q · p − Q · P  = − 

4
  = 4
  4 =     =     = −

The partial derivatives of the generating functions $F_i$ determine the corresponding conjugate variables
not explicitly included in the generating function $F_i$. Note that, for the first trivial example $F_1 = q_i Q_i$, the
old momenta become the new coordinates, $Q_i = p_i$, and vice versa, $P_i = -q_i$. This illustrates that it is
better to name them "conjugate variables" rather than "momenta" and "coordinates".
In summary, Jacobi developed a mathematical framework for finding the generating function $F$
required to make a canonical transformation to a new Hamiltonian $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ that has a known solution.
That is,
$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial F}{\partial t} \tag{15.89}$$
When $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ is a constant, then a solution has been obtained. The inverse transformation for this solution
$\mathbf{Q}(t), \mathbf{P}(t) \rightarrow \mathbf{q}(t), \mathbf{p}(t)$ now can be used to express the final solution in terms of the original variables of the
system.
Note the special case when $\mathcal{H}(\mathbf{Q},\mathbf{P},t) = 0$; then equation 15.89 reduces to the Hamilton-Jacobi
relation (15.11)
$$H(\mathbf{q},\mathbf{p},t) + \frac{\partial S}{\partial t} = 0 \tag{15.11}$$
In this case, the generating function $F$ determines the action functional $S$ required to solve the Hamilton-
Jacobi equation (15.110). Since equation (15.89) has transformed the Hamiltonian $H(\mathbf{q},\mathbf{p},t) \rightarrow \mathcal{H}(\mathbf{Q},\mathbf{P},t)$
for which $\mathcal{H}(\mathbf{Q},\mathbf{P},t) = 0$, the solution $\mathbf{Q}(t), \mathbf{P}(t)$ for the Hamiltonian $\mathcal{H}(\mathbf{Q},\mathbf{P},t) = 0$ is obtained easily.
This approach underlies Hamilton-Jacobi theory presented in chapter 15.4.

15.3.2 Applications of canonical transformations

The canonical transformation procedure may appear unnecessarily complicated for solving the examples
given in this book, but it is essential for solving the complicated systems that occur in nature. For example,
canonical transformations can be used to transform time-dependent (non-autonomous) Hamiltonians to
time-independent (autonomous) Hamiltonians for which the solutions are known. Example 15.19 describes
such a system. Canonical transformations provide a remarkably powerful approach for solving the equations
of motion in Hamiltonian mechanics, especially when using the Hamilton-Jacobi approach discussed in
chapter 15.4.

15.7 Example: The identity canonical transformation

The identity transformation $F_2(\mathbf{q},\mathbf{P}) = \mathbf{q}\cdot\mathbf{P}$ satisfies (15.89) if the following relations are satisfied:
$p_i = \frac{\partial F_2}{\partial q_i} = P_i$, $Q_i = \frac{\partial F_2}{\partial P_i} = q_i$, $\mathcal{H} = H$. Note that the new and old coordinates are identical, hence $F_2 = q_i P_i$
generates the identity transformation $Q_i = q_i$, $P_i = p_i$.

15.8 Example: The point canonical transformation

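Consider the point transformation $F_2(\mathbf{q},\mathbf{P}) = \mathbf{f}(\mathbf{q})\cdot\mathbf{P}$ where $\mathbf{f}(\mathbf{q})$ is some function of $\mathbf{q}$. This
transformation satisfies (15.89) if the following relations are satisfied:
$Q_i = \frac{\partial F_2}{\partial P_i} = f_i(\mathbf{q})$, $p_i = \frac{\partial F_2}{\partial q_i} = \sum_j \frac{\partial f_j(\mathbf{q})}{\partial q_i} P_j$,
$\mathcal{H} = H$. Point transformations correspond to point-to-point transformations of coordinates.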
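
The point-transformation relations above can be checked numerically. The following is a minimal sketch for one degree of freedom, assuming a hypothetical choice $f(q) = q + q^3$ (not taken from the text) and using the Poisson-bracket test of canonicity mentioned earlier; the function names are illustrative only.

```python
def f(q):
    """Hypothetical point transformation Q = f(q); here f(q) = q + q**3."""
    return q + q**3

def df(q):
    """Derivative f'(q), which enters p = f'(q) P."""
    return 1.0 + 3.0 * q**2

def new_variables(q, p):
    """Map (q, p) -> (Q, P) generated by F2(q, P) = f(q) P."""
    Q = f(q)          # Q = dF2/dP
    P = p / df(q)     # inverted from p = dF2/dq = f'(q) P
    return Q, P

def poisson_bracket_QP(q, p, h=1e-5):
    """Central-difference estimate of {Q, P} with respect to (q, p)."""
    Q_qp, P_qp = new_variables(q + h, p)
    Q_qm, P_qm = new_variables(q - h, p)
    Q_pp, P_pp = new_variables(q, p + h)
    Q_pm, P_pm = new_variables(q, p - h)
    dQ_dq = (Q_qp - Q_qm) / (2 * h)
    dQ_dp = (Q_pp - Q_pm) / (2 * h)
    dP_dq = (P_qp - P_qm) / (2 * h)
    dP_dp = (P_pp - P_pm) / (2 * h)
    return dQ_dq * dP_dp - dQ_dp * dP_dq

bracket = poisson_bracket_QP(q=0.7, p=-1.3)
print(f"{{Q, P}} = {bracket:.8f}")
```

A bracket equal to one, to within the finite-difference error, is what a canonical transformation requires; the test point $(q, p) = (0.7, -1.3)$ is arbitrary.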

15.9 Example: The exchange canonical transformation

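The exchange transformation $F_1(\mathbf{q},\mathbf{Q}) = \mathbf{q}\cdot\mathbf{Q}$ satisfies (15.89) if the following relations are satisfied:
$p_i = \frac{\partial F_1}{\partial q_i} = Q_i$, $P_i = -\frac{\partial F_1}{\partial Q_i} = -q_i$, $\mathcal{H} = H$. That is, the coordinates and momenta have been interchanged.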

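15.10 Example: Infinitesimal point canonical transformation

Consider an infinitesimal point canonical transformation, that is, one that is infinitesimally close to the identity transformation.
$$F_2(\mathbf{q},\mathbf{P},t) = \mathbf{q}\cdot\mathbf{P} + \epsilon G(\mathbf{q},\mathbf{P},t)$$
satisfies (15.89) if the following relations are satisfied
$$p_i = \frac{\partial F_2}{\partial q_i} = P_i + \epsilon\frac{\partial G(\mathbf{q},\mathbf{P},t)}{\partial q_i}$$
$$Q_i = \frac{\partial F_2}{\partial P_i} = q_i + \epsilon\frac{\partial G(\mathbf{q},\mathbf{P},t)}{\partial P_i}$$
Thus the infinitesimal changes in $q_i$ and $p_i$ are given by
$$\delta q_i(\mathbf{q},\mathbf{p}) = Q_i - q_i = \epsilon\frac{\partial G(\mathbf{q},\mathbf{P},t)}{\partial P_i} = \epsilon\frac{\partial G(\mathbf{q},\mathbf{p},t)}{\partial p_i} + O(\epsilon^2)$$
$$\delta p_i(\mathbf{q},\mathbf{p}) = P_i - p_i = -\epsilon\frac{\partial G(\mathbf{q},\mathbf{P},t)}{\partial q_i} = -\epsilon\frac{\partial G(\mathbf{q},\mathbf{p},t)}{\partial q_i} + O(\epsilon^2)$$

Thus $G(\mathbf{q},\mathbf{P},t)$ is the generator of the infinitesimal canonical transformation.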
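
The following sketch illustrates the two relations above for one degree of freedom. It assumes, purely for illustration, that the generator is a harmonic-oscillator Hamiltonian, $G = P^2/2m + kq^2/2$, in which case the map is a small step of the motion; the parameter values and function names are hypothetical.

```python
def step(q, p, eps, m=1.0, k=1.0):
    """Map (q, p) -> (Q, P) generated by F2 = q P + eps G(q, P),
    with the assumed generator G = P**2 / (2 m) + k q**2 / 2."""
    P = p - eps * k * q     # from p = P + eps dG/dq
    Q = q + eps * P / m     # from Q = q + eps dG/dP
    return Q, P

def jacobian_determinant(q, p, eps, h=1e-6):
    """Finite-difference estimate of d(Q, P)/d(q, p), which equals {Q, P}."""
    Qa, Pa = step(q + h, p, eps)
    Qb, Pb = step(q - h, p, eps)
    Qc, Pc = step(q, p + h, eps)
    Qd, Pd = step(q, p - h, eps)
    dQ_dq, dP_dq = (Qa - Qb) / (2 * h), (Pa - Pb) / (2 * h)
    dQ_dp, dP_dp = (Qc - Qd) / (2 * h), (Pc - Pd) / (2 * h)
    return dQ_dq * dP_dp - dQ_dp * dP_dq

det = jacobian_determinant(q=0.3, p=0.8, eps=0.05)
print(f"d(Q, P)/d(q, p) = {det:.10f}")
```

For this generator the determinant equals one for any step size, not just to first order in $\epsilon$, which is why maps built from a generating function are useful as structure-preserving integration steps.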

15.11 Example: 1-D harmonic oscillator via a canonical transformation

The classic one-dimensional harmonic oscillator provides an example of the use of canonical transformations.
Consider the Hamiltonian where $\omega^2 = \frac{k}{m}$, then
$$H = \frac{p^2}{2m} + \frac{kq^2}{2} = \frac{1}{2m}\left(p^2 + m^2\omega^2 q^2\right)$$
This form of the Hamiltonian is a sum of two squares, suggesting a canonical transformation for which
$\mathcal{H}$ is cyclic in a new coordinate. A guess for a canonical transformation is of the form $p = m\omega q\cot Q$, which
is of the $F_1(q,Q)$ type where $F_1$ equals $F_1(q,Q) = \frac{m\omega q^2}{2}\cot Q$. Using (15.78) gives
$$p = \frac{\partial F_1(q,Q)}{\partial q} = m\omega q\cot Q$$
$$P = -\frac{\partial F_1(q,Q)}{\partial Q} = \frac{m\omega q^2}{2\sin^2 Q}$$

Solving for the coordinates $(q, p)$ yields
$$q = \sqrt{\frac{2P}{m\omega}}\sin Q \tag{a}$$
$$p = \sqrt{2m\omega P}\cos Q \tag{b}$$

Inserting these into $H$ gives
$$\mathcal{H} = \omega P(\cos^2 Q + \sin^2 Q) = \omega P$$
which implies that $Q$ is a cyclic coordinate.
The Hamiltonian is conservative, since it does not explicitly depend on time, and it equals the total energy
since the transformation to generalized coordinates is time independent. Thus
$$\mathcal{H} = E = \omega P$$

Since
$$\dot{Q} = \frac{\partial \mathcal{H}}{\partial P} = \omega$$
then
$$Q = \omega t + \delta$$
Substituting $Q$ into (a) gives the well-known solution of the one-dimensional harmonic oscillator
$$q = \sqrt{\frac{2E}{m\omega^2}}\sin(\omega t + \delta)$$
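
The harmonic-oscillator transformation of example 15.11 can be checked numerically. This is a minimal sketch, assuming arbitrary illustrative values for $m$, $\omega$, $Q$ and $P$; it evaluates equations (a) and (b), confirms that the Hamiltonian reduces to $\omega P$, and estimates the Poisson bracket $\{q, p\}$ with respect to $(Q, P)$ by central differences.

```python
import math

m, omega = 2.0, 3.0   # assumed illustrative values

def old_variables(Q, P):
    """Equations (a) and (b): (Q, P) -> (q, p)."""
    q = math.sqrt(2.0 * P / (m * omega)) * math.sin(Q)
    p = math.sqrt(2.0 * m * omega * P) * math.cos(Q)
    return q, p

def hamiltonian(q, p):
    """H = (p**2 + m**2 omega**2 q**2) / (2 m)."""
    return (p**2 + (m * omega * q)**2) / (2.0 * m)

def bracket_qp(Q, P, h=1e-6):
    """Central-difference Poisson bracket {q, p} taken with respect to (Q, P)."""
    q1, p1 = old_variables(Q + h, P)
    q2, p2 = old_variables(Q - h, P)
    q3, p3 = old_variables(Q, P + h)
    q4, p4 = old_variables(Q, P - h)
    dq_dQ, dp_dQ = (q1 - q2) / (2 * h), (p1 - p2) / (2 * h)
    dq_dP, dp_dP = (q3 - q4) / (2 * h), (p3 - p4) / (2 * h)
    return dq_dQ * dp_dP - dq_dP * dp_dQ

Q, P = 0.4, 1.5
q, p = old_variables(Q, P)
energy = hamiltonian(q, p)
bracket = bracket_qp(Q, P)
print(f"H = {energy:.6f}, omega * P = {omega * P:.6f}, bracket = {bracket:.8f}")
```

Agreement of `energy` with `omega * P`, together with a unit bracket, is consistent with the transformation being canonical and with $Q$ being cyclic.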

15.4 Hamilton-Jacobi theory

Hamilton used the Principle of Least Action to derive the Hamilton-Jacobi relation (chapter 15.3)
$$H(\mathbf{q},\mathbf{p},t) + \frac{\partial S}{\partial t} = 0 \tag{15.11}$$
where $\mathbf{q}, \mathbf{p}$ refer to the $1 \leq i \leq n$ variables $q_i, p_i$ and $S(q_j(t_1), t_1, q_j(t_2), t_2)$ is the action functional. Integration
of this first-order partial differential equation is non-trivial, which is a major handicap for practical
exploitation of the Hamilton-Jacobi equation. This stimulated Jacobi to develop the mathematical framework
for the canonical transformations that are required to solve the Hamilton-Jacobi equation. Jacobi's approach
is to exploit generating functions for making a canonical transformation to a new Hamiltonian $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$
that equals zero.
$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial S}{\partial t} = 0 \tag{15.90}$$
The generating function for solving the Hamilton-Jacobi equation then equals the action functional $S$.
The Hamilton-Jacobi theory is based on selecting a canonical transformation to new coordinates $(Q_i, P_i)$
all of which are either constant, or the $Q_i$ are cyclic, which implies that the corresponding momenta $P_i$ are
constants. In either case, a solution to the equations of motion is obtained. A remarkable feature of Hamilton-
Jacobi theory is that the canonical transformation is completely characterized by a single generating function,
$S$. The canonical equations likewise are characterized by a single Hamiltonian function, $H$. Moreover, the
generating function $S$ and Hamiltonian function $H$ are linked together by equation 15.11. The underlying
goal of Hamilton-Jacobi theory is to transform the Hamiltonian to a known form such that the canonical
equations become directly integrable. Since this transformation depends on a single scalar function, the
problem is reduced to solving a single partial differential equation.

15.4.1 Time-dependent Hamiltonian

Jacobi's complete integral $S(q_i, P_i, t)$
The principle underlying Jacobi's approach to Hamilton-Jacobi theory is to provide a recipe for finding
the generating function $F = S$ needed to transform the Hamiltonian $H(\mathbf{q},\mathbf{p},t)$ to the new Hamiltonian
$\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ using equation 15.90. When the derivatives of the transformed Hamiltonian $\mathcal{H}(\mathbf{Q},\mathbf{P},t)$ are zero,
then the equations of motion become
$$\dot{Q}_i = \frac{\partial \mathcal{H}}{\partial P_i} = 0 \tag{15.91}$$
$$\dot{P}_i = -\frac{\partial \mathcal{H}}{\partial Q_i} = 0 \tag{15.92}$$
and thus $Q_i$ and $P_i$ are constants of motion. The new Hamiltonian $\mathcal{H}$ must be related to the original
Hamiltonian $H$ by a canonical transformation for which
$$\mathcal{H}(\mathbf{Q},\mathbf{P},t) = H(\mathbf{q},\mathbf{p},t) + \frac{\partial S}{\partial t} \tag{15.93}$$
Equations 15.91 and 15.92 are automatically satisfied if the new Hamiltonian $\mathcal{H} = 0$, since then equation
15.93 gives that the generating function $S$ satisfies equation 15.90.
Any of the four types of generating function can be used. Jacobi chose the type 2 generating function
as being the most useful for many practical cases, that is, $S(q_i, P_i, t)$, which is called Jacobi's complete
integral.
For generating functions $F_1$ and $F_2$ the generalized momenta are derived from the action by the derivative
$$p_i = \frac{\partial S}{\partial q_i} \tag{15.4}$$
Using this generalized momentum to replace $p_i$ in the Hamiltonian $H$, given in equation (15.93), leads to the
Hamilton-Jacobi equation expressed in terms of the action $S$
$$H\left(q_1, \ldots, q_n; \frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n}; t\right) + \frac{\partial S}{\partial t} = 0 \tag{15.94}$$

The Hamilton-Jacobi equation, (1594) can be written more compactly using tensors q and ∇ to designate
 
(1   ) and 1
  
respectively. That is


(q ∇ ) + =0 (15.95)

Equation (1595) is a first-order partial differential equation in  + 1 variables which are the old spatial
coordinates  plus time . The new momenta  have not been specified except that they are constants
since H = 0
Assume the existence of a solution of (1595) of the form (    ) = (1   ; 1  +1 ; ) where
the generalized momenta  = 1  2   plus  are the  + 1 independent constants of integration in the
transformed frame. One constant of integration is irrelevant to the solution since only partial derivatives of
(    ) with respect to  and  are involved. Thus, if  is a solution of the first-order partial differential
equation, then so is  +  where  is a constant. Thus it can be assumed that one of the  + 1 constants of
integration is just an additive constant which can be ignored leading effectively to a solution

(    ) = (1   ; 1   ; ) (15.96)

where none of the  independent constants are solely additive. Such generating function solutions are called
complete solutions of the first-order partial diﬀerential equations since all constants of integration are known.
It is possible to assume that the  generalized momenta,  are constants  , where the  are the
constants This allows the generalized momentum to be written as

(q α )
 = (15.97)

Similarly, Hamilton’s equations of motion give the conjugate coordinate Q = β where   are constants That
is
(q α )
 =   = (15.98)

The above procedure has determined the complete set of 2 constants (Q = β P = α). It is possible to
invert the canonical transformation to express the above solution, which is expressed in terms of  =  
and  =   back to the original coordinates, that is,  =  (  ) and momenta  =  (  ) which is
the required solution.

Hamilton's principle function $S_H(\mathbf{q}, t; \mathbf{q}_0, t_0)$

Hamilton's approach to solving the Hamilton-Jacobi equation (15.95) is to seek a canonical transformation
from variables $(\mathbf{p}, \mathbf{q})$ at time $t$ to a new set of constant quantities, which may be the initial values $(\mathbf{q}_0, \mathbf{p}_0)$
at time $t = 0$. Hamilton's principle function $S_H(\mathbf{q}, t; \mathbf{q}_0, t_0)$ is the generating function for this canonical
transformation from the variables $(\mathbf{q}, \mathbf{p})$ at time $t$ to the initial variables $(\mathbf{q}_0, \mathbf{p}_0)$ at time $t_0$. Hamilton's
principle function $S_H(\mathbf{q}, t; \mathbf{q}_0, t_0)$ is directly related to Jacobi's complete integral $S(q_i, P_i, t)$.
Note that $S_H$ is the generating function of a canonical transformation from the present time $(\mathbf{q}, \mathbf{p}, t)$
variables to the initial $(\mathbf{q}_0, \mathbf{p}_0, t_0)$, whereas Jacobi's $S$ is the generating function of a canonical transformation
from the present $(\mathbf{q}, \mathbf{p}, t)$ variables to the constant variables $(\mathbf{Q} = \boldsymbol{\beta}, \mathbf{P} = \boldsymbol{\alpha})$. For the Hamilton approach,
the canonical transformation can be accomplished in two steps using $S$, by first transforming from $(\mathbf{q}, \mathbf{p}, t)$
at time $t$ to $(\boldsymbol{\beta}, \boldsymbol{\alpha})$, then transforming from $(\boldsymbol{\beta}, \boldsymbol{\alpha})$ to $(\mathbf{q}_0, \mathbf{p}_0, t_0)$. That is, this two-step process corresponds
to
$$S_H(\mathbf{q}, t; \mathbf{q}_0, t_0) = S(\mathbf{q}, \boldsymbol{\alpha}, t) - S(\mathbf{q}_0, \boldsymbol{\alpha}, t_0) \tag{15.99}$$
Hamilton's principle function $S_H(\mathbf{q}, t; \mathbf{q}_0, t_0)$ is related to Jacobi's complete integral $S(\mathbf{q}, \boldsymbol{\alpha}, t)$ and it will not
be discussed further in this book.

15.4.2 Time-independent Hamiltonian

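Frequently the Hamiltonian does not explicitly depend on time. For the standard Lagrangian with time-
independent constraints and transformation, $H(\mathbf{q}, \mathbf{p}) = E$, which is the total energy. For this case,
the Hamilton-Jacobi equation simplifies to give
$$\frac{\partial S}{\partial t} = -H(\mathbf{q}, \mathbf{p}) = -E(\boldsymbol{\alpha}) \tag{15.100}$$
The integration of the time dependence is trivial, and thus the action integral for a time-independent
Hamiltonian equals
$$S(\mathbf{q}, \boldsymbol{\alpha}, t) = W(\mathbf{q}, \boldsymbol{\alpha}) - E(\boldsymbol{\alpha})\,t \tag{15.101}$$
That is, the action integral has separated into a time-independent term $W(\mathbf{q}, \boldsymbol{\alpha})$, which is called Hamilton's
characteristic function, plus a time-dependent term $-E(\boldsymbol{\alpha})\,t$. Thus using equations 15.97 and 15.101 gives
that the generalized momentum is
$$p_i = \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial q_i} \tag{15.102}$$
The physical significance of Hamilton's characteristic function $W(\mathbf{q}, \boldsymbol{\alpha})$ can be understood by taking the
total time derivative
$$\frac{dW}{dt} = \sum_i \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial q_i}\dot{q}_i = \sum_i p_i \dot{q}_i$$
Taking the time integral then gives
$$W(\mathbf{q}, \boldsymbol{\alpha}) = \int \sum_i p_i \dot{q}_i\,dt = \int \sum_i p_i\,dq_i \tag{15.103}$$
Note that this equals the abbreviated action described in chapter 9.2.3, that is, $W(\mathbf{q}, \boldsymbol{\alpha}) = S_0(\mathbf{q}, \boldsymbol{\alpha})$.
Inserting the action $S(\mathbf{q}, \boldsymbol{\alpha}, t)$ into the Hamilton-Jacobi equation (15.12) gives
$$H\left(\mathbf{q}; \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial \mathbf{q}}\right) = E(\boldsymbol{\alpha}) \tag{15.104}$$
This is called the time-independent Hamilton-Jacobi equation. Usually it is convenient to have $E$
equal the total energy. However, sometimes it is more convenient to exclude the energy $E(\boldsymbol{\alpha})$ from the
set of constants, in which case the set is $(\alpha_1, \alpha_2, \ldots, \alpha_{n-1})$; the Routhian exploits this feature.
The equations of the canonical transformation expressed in terms of $W(\mathbf{q}, \boldsymbol{\alpha})$ are
$$p_i = \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial q_i} \qquad \beta_i + \frac{\partial E(\boldsymbol{\alpha})}{\partial \alpha_i}\,t = \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial \alpha_i} \tag{15.105}$$
These equations show that Hamilton's characteristic function $W(\mathbf{q}, \boldsymbol{\alpha})$ is itself the generating function of a
time-independent canonical transformation from the old variables $(q, p)$ to a set of new variables
$$Q_i = \beta_i + \frac{\partial E(\boldsymbol{\alpha})}{\partial \alpha_i}\,t \qquad P_i = \alpha_i \tag{15.106}$$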
Table 152 summarizes the time-dependent and time-independent forms of the Hamilton-Jacobi equation.

Table 152; Hamilton-Jacobi formulations

Hamiltonian Time dependent (  ) Time independent ( )
Transformed Hamiltonian H= 0 H is cyclic
Canonical transformed variables All   are constants of motion All  are constants of motion
H H
Transformed equations of motion ̇ =  
= 0 therefore  =   ̇ =  
=   therefore  =   +  
H H
̇ = −  
= 0 therefore  =  ̇ = − 
= 0 therefore  = 
Generating function Jacobi’s complete integral (q P ) Characteristic Function  (q P)

 
Hamilton-Jacobi equation ( 1    ; 1
  
; )+ 
 = 0 ( 1    ;  
1    ) = 


Transformation equations  = 
 = 

 
 =  =    =  =    + 

15.4.3 Separation of variables

Exploitation of the Hamilton-Jacobi theory requires finding a suitable action function $S$. When the
Hamiltonian is time independent, equation 15.101 shows that the time dependence of the action integral
separates out from the dependence on the spatial variables. For many systems, Hamilton's characteristic
function $W(\mathbf{q}, \mathbf{P})$ separates into a simple sum of terms, each of which is a function of a single variable. That
is,
$$W(\mathbf{q}, \boldsymbol{\alpha}) = W_1(q_1) + W_2(q_2) + \cdots + W_n(q_n) \tag{15.107}$$
where each function in the summation on the right depends only on a single variable. Then equation (15.100)
reduces to
$$H\left(q_1, \ldots, q_n; \frac{\partial W}{\partial q_1}, \ldots, \frac{\partial W}{\partial q_n}\right) = E \tag{15.108}$$
where $E$ is the constant denoting the total energy.
Hamilton's characteristic function $W(\mathbf{q}, \mathbf{P})$ can be used with equations (15.101), (15.102), (15.91),
(15.92), and (15.93) to derive
$$p_i = \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial q_i} \qquad Q_i = \frac{\partial W(\mathbf{q}, \boldsymbol{\alpha})}{\partial \alpha_i} \tag{15.109}$$
$$\dot{Q}_i = \frac{\partial \mathcal{H}}{\partial P_i} = 0 \qquad \dot{P}_i = -\frac{\partial \mathcal{H}}{\partial Q_i} = 0 \tag{15.110}$$
$$\mathcal{H} = H + \frac{\partial S}{\partial t} = H - E = 0 \tag{15.111}$$
which has reduced the problem to a simple sum of one-dimensional first-order differential equations.
If the $q_k$ variable is cyclic, then the Hamiltonian is not a function of $q_k$ and the $k$th term in Hamilton's
characteristic function equals $W_k = \alpha_k q_k$, which separates out from the summation in equation 15.107. That
is, all cyclic variables can be factored out of $W(\mathbf{q}, \boldsymbol{\alpha})$, which greatly simplifies solution of the Hamilton-Jacobi
equation. As a consequence, the ability of the Hamilton-Jacobi method to make a canonical transformation that
separates the system into many cyclic or independent variables, which can be solved trivially, is a remarkably
powerful way for solving the equations of motion in Hamiltonian mechanics.

15.12 Example: Free particle

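Consider the motion of a free particle of mass $m$ in a force-free region. Then equation 15.93 reduces to
$$H\left(q_1, \ldots, q_n; \frac{\partial S}{\partial q_1}, \ldots, \frac{\partial S}{\partial q_n}; t\right) + \frac{\partial S}{\partial t} = 0$$
Since no forces act, and the momentum $\mathbf{p} = \nabla S$, the Hamilton-Jacobi equation reduces to
$$\frac{1}{2m}\left(\nabla S\right)^2 + \frac{\partial S}{\partial t} = 0 \tag{a}$$
The Hamiltonian is time independent, thus equation 15.101 applies
$$S(\mathbf{q}, \boldsymbol{\alpha}, t) = W(\mathbf{q}, \boldsymbol{\alpha}) - E(\boldsymbol{\alpha})\,t$$
Since the Hamiltonian does not explicitly depend on the coordinates $(x, y, z)$, the coordinates are cyclic
and separation of the variables, 15.107, gives that the action
$$S = \boldsymbol{\alpha}\cdot\mathbf{r} - Et \tag{b}$$
For equation (b) to be a solution of equation (a) requires that
$$E = \frac{1}{2m}\boldsymbol{\alpha}^2 \tag{c}$$
Therefore
$$S = \boldsymbol{\alpha}\cdot\mathbf{r} - \frac{1}{2m}\boldsymbol{\alpha}^2\,t \tag{d}$$
Since, by equation (15.98), the new coordinate
$$\mathbf{Q} = \frac{\partial S}{\partial \boldsymbol{\alpha}} = \mathbf{r} - \frac{\boldsymbol{\alpha}}{m}\,t$$
is a constant, the equation of motion and the conjugate momentum are given by
$$\mathbf{r} = \mathbf{Q} + \frac{\boldsymbol{\alpha}}{m}\,t \qquad \mathbf{p} = \nabla S = \boldsymbol{\alpha}$$
Thus the Hamilton-Jacobi relation has given both the equation of motion and the linear momentum $\mathbf{p}$.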
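
The free-particle result can be verified numerically. The sketch below works in one dimension with assumed values of $m$ and $\alpha$; it checks by central differences that $S = \alpha x - \alpha^2 t / 2m$ satisfies equation (a), and that $\partial S / \partial \alpha$ stays constant along the straight-line motion $x = x_0 + (\alpha/m)\,t$. The names `hj_residual` and `beta` are illustrative.

```python
m, alpha = 2.0, 3.0   # assumed mass and constant momentum (one dimension)

def S(x, t, a=alpha):
    """Action of equation (d) in one dimension: S = a x - a**2 t / (2 m)."""
    return a * x - a**2 * t / (2.0 * m)

def hj_residual(x, t, h=1e-4):
    """(dS/dx)**2 / (2 m) + dS/dt evaluated with central differences."""
    dS_dx = (S(x + h, t) - S(x - h, t)) / (2 * h)
    dS_dt = (S(x, t + h) - S(x, t - h)) / (2 * h)
    return dS_dx**2 / (2.0 * m) + dS_dt

def beta(x, t, h=1e-4):
    """New coordinate Q = dS/d(alpha); it should stay constant along the motion."""
    return (S(x, t, alpha + h) - S(x, t, alpha - h)) / (2 * h)

residual = hj_residual(x=1.2, t=0.7)
x0 = 0.5
betas = [beta(x0 + alpha / m * t, t) for t in (0.0, 1.0, 2.5)]
print(f"HJ residual = {residual:.2e}, beta values = {betas}")
```

The constant value of `beta` equals the initial position `x0`, which identifies the new coordinate with the starting point of the trajectory.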

15.13 Example: Point particle in a uniform gravitational field

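The Hamiltonian is
$$H = \frac{1}{2m}\left(p_x^2 + p_y^2 + p_z^2\right) + mgz$$
Since the system is conservative, the Hamilton-Jacobi equation can be written in terms of Hamilton's
characteristic function $W$
$$E = \frac{1}{2m}\left[\left(\frac{\partial W}{\partial x}\right)^2 + \left(\frac{\partial W}{\partial y}\right)^2 + \left(\frac{\partial W}{\partial z}\right)^2\right] + mgz$$

Assuming that the variables can be separated, $W = X(x) + Y(y) + Z(z)$ leads to
$$p_x = \frac{\partial X(x)}{\partial x} = \alpha_x$$
$$p_y = \frac{\partial Y(y)}{\partial y} = \alpha_y$$
$$p_z = \frac{\partial Z(z)}{\partial z} = \sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}$$

Thus by integration the total $W$ equals
$$W = \int_{x_0}^{x} \alpha_x\,dx + \int_{y_0}^{y} \alpha_y\,dy + \int_{z_0}^{z} \left(\sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}\right) dz$$

Therefore using (15.106) gives
$$\frac{\partial W}{\partial E} = t - t_0 = \int_{z_0}^{z} \frac{m\,dz}{\sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}}$$
$$\beta_x = \text{constant} = (x - x_0) - \int_{z_0}^{z} \frac{\alpha_x\,dz}{\sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}}$$
$$\beta_y = \text{constant} = (y - y_0) - \int_{z_0}^{z} \frac{\alpha_y\,dz}{\sqrt{2m(E - mgz) - \alpha_x^2 - \alpha_y^2}}$$

If $x_0, y_0, z_0$ is the position of the particle at time $t = t_0$ then $\beta_x = \beta_y = 0$, and from (15.106)
$$x - x_0 = \left(\frac{\alpha_x}{m}\right)(t - t_0)$$
$$y - y_0 = \left(\frac{\alpha_y}{m}\right)(t - t_0)$$
$$z - z_0 = \left(\frac{\sqrt{2m(E - mgz_0) - \alpha_x^2 - \alpha_y^2}}{m}\right)(t - t_0) - \frac{1}{2}g(t - t_0)^2$$

This corresponds to a parabola, as should be expected for this trivial example.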
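
The time integral in this example can be checked against the parabolic solution. The following sketch assumes illustrative values for the mass, the initial height and the initial upward speed, takes $t_0 = 0$, and evaluates $\int m\,dz / p_z$ with a simple midpoint rule on the rising branch of the trajectory, where $p_z$ is positive.

```python
import math

m, g = 1.5, 9.81                 # assumed illustrative values
z0, vz0 = 2.0, 5.0               # initial height and upward speed
alpha_x, alpha_y = 1.0, -0.5     # constant horizontal momenta
E = (alpha_x**2 + alpha_y**2) / (2 * m) + 0.5 * m * vz0**2 + m * g * z0

def p_z(z):
    """p_z = sqrt(2 m (E - m g z) - alpha_x**2 - alpha_y**2) on the rising branch."""
    return math.sqrt(2 * m * (E - m * g * z) - alpha_x**2 - alpha_y**2)

def elapsed_time(z, steps=20000):
    """Midpoint-rule estimate of t - t0 = integral from z0 to z of m dz / p_z."""
    dz = (z - z0) / steps
    return sum(m / p_z(z0 + (i + 0.5) * dz) for i in range(steps)) * dz

t1 = 0.3                                   # earlier than the apex time vz0 / g
z1 = z0 + vz0 * t1 - 0.5 * g * t1**2       # height from the parabolic solution
t_from_integral = elapsed_time(z1)
print(f"t from integral = {t_from_integral:.8f}, expected {t1}")
```

The integral is restricted to times before the apex because $p_z$ vanishes there and changes sign on the falling branch, where the integral would have to be split at the turning point.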


15.14 Example: One-dimensional harmonic oscillator

As discussed in example 1511 the Hamiltonian for the one-dimensional harmonic oscillator can be written
as
1 ¡ 2 ¢
=  + 2  2  2 = 
2
q

assuming it is conservative and where  =  
Hamilton’s characteristic function  can be used where

 (  ) =  ( ) − 


 =

Inserting the generalized momentum  into the Hamiltonian gives
Ã∙ ¸2 !
1 
+ 2  2  2 = 
2 

Integration of this equation gives r

√ Z
 2  2
 = 2  1−
2
That is Z r
√  2  2
 = 2  1− − 
2
Note that r Z
(  ) 2 
= q −
  1 − 
2 2
2

This can be integrated to give Ã r !

1  2
 = arcsin  + 0
 2
That is r
2
= sin  ( − 0 )
 2
This is the familiar solution of the undamped harmonic oscillator.
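
The integration step in this example can be spelled out with a standard substitution. The following is a short sketch, assuming that the constant $\beta = \partial S / \partial E$ of equation (15.98) is identified with $-t_0$; the substitution variable $u$ is introduced here only for illustration.

```latex
% Substitution (assumed for illustration): \sin u = q \sqrt{m\omega^2 / 2E}
dq = \sqrt{\frac{2E}{m\omega^2}}\,\cos u\,du, \qquad
\sqrt{1 - \frac{m\omega^2 q^2}{2E}} = \cos u
% so the integral in \partial S / \partial E becomes
\beta = \frac{\partial S}{\partial E}
      = \sqrt{\frac{m}{2E}}\,\sqrt{\frac{2E}{m\omega^2}} \int du - t
      = \frac{u}{\omega} - t
% With \beta = -t_0 this gives
u = \omega\,(t - t_0)
\quad\Longrightarrow\quad
q = \sqrt{\frac{2E}{m\omega^2}}\,\sin\omega(t - t_0)
```

The two constants of the motion are therefore the energy $E$, which fixes the amplitude, and $t_0$, which fixes the phase.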

15.15 Example: The central force problem

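The problem of a particle acted upon by a central force occurs frequently in physics. Consider the mass $m$
acted upon by a time-independent central potential energy $U(r)$. The Hamiltonian is time independent and
can be written in spherical coordinates as
$$H = \frac{1}{2m}\left(p_r^2 + \frac{1}{r^2}p_\theta^2 + \frac{1}{r^2\sin^2\theta}p_\phi^2\right) + U(r) = E$$

The Hamiltonian is conservative, thus the time-independent Hamilton-Jacobi equation is
$$\frac{1}{2m}\left[\left(\frac{\partial W}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial W}{\partial \theta}\right)^2 + \frac{1}{r^2\sin^2\theta}\left(\frac{\partial W}{\partial \phi}\right)^2\right] + U(r) = E$$

Try a separable solution for Hamilton's characteristic function $W$ of the form
$$W = R(r) + \Theta(\theta) + \Phi(\phi)$$

The Hamilton-Jacobi equation then becomes
$$\frac{1}{2m}\left[\left(\frac{\partial R}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial \Theta}{\partial \theta}\right)^2 + \frac{1}{r^2\sin^2\theta}\left(\frac{\partial \Phi}{\partial \phi}\right)^2\right] + U(r) = E$$

This can be rearranged into the form
$$2mr^2\sin^2\theta\left\{\frac{1}{2m}\left[\left(\frac{\partial R}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial \Theta}{\partial \theta}\right)^2\right] + U(r) - E\right\} = -\left(\frac{\partial \Phi}{\partial \phi}\right)^2$$

The left-hand side is independent of $\phi$ whereas the right-hand side is independent of $r$ and $\theta$. Both sides
must equal a constant which is set to equal $-L_z^2$, that is
$$\frac{1}{2m}\left[\left(\frac{\partial R}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial \Theta}{\partial \theta}\right)^2\right] + U(r) + \frac{L_z^2}{2mr^2\sin^2\theta} = E$$
$$\left(\frac{\partial \Phi}{\partial \phi}\right)^2 = L_z^2$$

The equation in $r$ and $\theta$ can be rearranged in the form
$$2mr^2\left[\frac{1}{2m}\left(\frac{\partial R}{\partial r}\right)^2 + U(r) - E\right] = -\left[\left(\frac{\partial \Theta}{\partial \theta}\right)^2 + \frac{L_z^2}{\sin^2\theta}\right]$$

The left-hand side is independent of $\theta$ and the right-hand side is independent of $r$, so both must equal a
constant which is set to be $-L^2$
$$\frac{1}{2m}\left(\frac{\partial R}{\partial r}\right)^2 + U(r) + \frac{L^2}{2mr^2} = E$$
$$\left(\frac{\partial \Theta}{\partial \theta}\right)^2 + \frac{L_z^2}{\sin^2\theta} = L^2$$
The variables now are completely separated and, by rearrangement plus integration, one obtains
$$R(r) = \sqrt{2m}\int \sqrt{E - U(r) - \frac{L^2}{2mr^2}}\,dr$$
$$\Theta(\theta) = \int \sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}}\,d\theta$$
$$\Phi(\phi) = L_z\,\phi$$

Substituting these into $W = R(r) + \Theta(\theta) + \Phi(\phi)$ gives
$$W = \sqrt{2m}\int \sqrt{E - U(r) - \frac{L^2}{2mr^2}}\,dr + \int \sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}}\,d\theta + L_z\,\phi$$
Hamilton's characteristic function $W$ is the generating function from coordinates $(r, \theta, \phi, p_r, p_\theta, p_\phi)$ to new
coordinates, which are cyclic, and new momenta that are constant and taken to be the separation constants
$E, L, L_z$.
$$p_r = \frac{\partial W}{\partial r} = \sqrt{2m}\sqrt{E - U(r) - \frac{L^2}{2mr^2}}$$
$$p_\theta = \frac{\partial W}{\partial \theta} = \sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}}$$
$$p_\phi = \frac{\partial W}{\partial \phi} = L_z$$

Similarly, using (15.109) gives the new coordinates $\beta_E, \beta_L, \beta_{L_z}$
$$\beta_E + t = \frac{\partial W}{\partial E} = \sqrt{\frac{m}{2}}\int \frac{dr}{\sqrt{E - U(r) - \frac{L^2}{2mr^2}}}$$
$$\beta_L = \frac{\partial W}{\partial L} = \sqrt{2m}\int \frac{\left(\frac{-L}{2mr^2}\right)dr}{\sqrt{E - U(r) - \frac{L^2}{2mr^2}}} + \int \frac{L\,d\theta}{\sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}}}$$
$$\beta_{L_z} = \frac{\partial W}{\partial L_z} = \int \frac{\left(\frac{-L_z}{\sin^2\theta}\right)d\theta}{\sqrt{L^2 - \frac{L_z^2}{\sin^2\theta}}} + \phi$$

These equations lead to the elliptical, parabolic, or hyperbolic orbits discussed in chapter 11.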
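
As a small numerical illustration of the radial equation, the sketch below assumes the attractive inverse-square potential $U(r) = -k/r$, which is an assumption made here and is not specified in the example, together with arbitrary values of $E$ and $L$ for a bound orbit. It locates the radial turning points where $p_r = 0$ and compares them with the standard conic-section expressions $r = \ell / (1 \pm e)$.

```python
import math

m, k = 1.0, 1.0      # assumed mass and force constant, U(r) = -k / r
E, L = -0.3, 0.9     # assumed bound-orbit energy and angular momentum

def p_r_squared(r):
    """p_r**2 = 2 m (E - U(r) - L**2 / (2 m r**2)) for the assumed U(r) = -k / r."""
    return 2 * m * (E + k / r - L**2 / (2 * m * r**2))

# Turning points: roots of E r**2 + k r - L**2 / (2 m) = 0.
disc = math.sqrt(k**2 + 2 * E * L**2 / m)
r_min = (-k + disc) / (2 * E)
r_max = (-k - disc) / (2 * E)

# Standard conic-section orbit parameters for comparison.
ecc = math.sqrt(1 + 2 * E * L**2 / (m * k**2))
semi_latus = L**2 / (m * k)

print(f"r_min = {r_min:.6f}, l/(1+e) = {semi_latus / (1 + ecc):.6f}")
print(f"r_max = {r_max:.6f}, l/(1-e) = {semi_latus / (1 - ecc):.6f}")
```

The radial motion is confined to the interval where `p_r_squared` is non-negative, so the two roots are the perihelion and aphelion distances of the elliptical orbit.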

15.16 Example: Linearly-damped, one-dimensional, harmonic oscillator

A canonical treatment of the linearly-damped harmonic oscillator provides an example that combines use
of non-standard Lagrangian and Hamiltonians, a canonical transformation to an autonomous system, and
use of Hamilton-Jacobi theory to solve this transformed system. It shows that Hamilton-Jacobi theory can be
used to determine directly the solutions for the linearly-damped harmonic oscillator.
Non-standard Hamiltonian:
In chapter 35 the equation of motion for the linearly-damped, one-dimensional, harmonic oscillator was
given to be
£ ¤
̈ + Γ̇ +  20  = 0 ()
2
Example 103 showed that three non-standard Lagrangians give equation of motion  when used with the
standard Euler-Lagrange variational equations. One of these was the Bateman[Bat31] time-dependent La-
grangian
 £ ¤
2 ( ̇ ) = Γ ̇ 2 −  20  2 ()
2
This Lagrangian gave the generalized momentum to be
2
= = ̇Γ ()
 ̇
which was used with equation 153 to derive the Hamiltonian

2 1
2 (  ) = ̇ − 2 ( ̇ ) = −Γ +  20  2 Γ ()
2 2
Note that both the Lagrangian and Hamiltonian are explicitly time dependent and thus they are not
conserved quantities. This is as expected for this dissipative system.
Hamilton-Jacobi theory:
The form of the non-autonomous Hamiltonian (d) suggests use of the generating function for a canonical
transformation to an autonomous Hamiltonian, for which $\mathcal{H}$ is a constant of motion.
$$S(q, P, t) = F_2(q, P, t) = qPe^{\frac{\Gamma t}{2}} = QP \tag{e}$$
Then the canonical transformation gives
$$p = \frac{\partial S}{\partial q} = Pe^{\frac{\Gamma t}{2}} \tag{f}$$
$$Q = \frac{\partial S}{\partial P} = qe^{\frac{\Gamma t}{2}}$$
Inserting this canonical transformation into the above Hamiltonian leads to the transformed Hamiltonian that
is autonomous.
$$\mathcal{H}(Q, P, t) = H_2(q, p, t) + \frac{\partial F_2}{\partial t} = \frac{P^2}{2m} + \frac{\Gamma}{2}QP + \frac{m\omega_0^2}{2}Q^2 \tag{g}$$
That is, the transformed Hamiltonian $\mathcal{H}(Q, P, t)$ is not explicitly time dependent, and thus is conserved.
Expressed in the original canonical variables $(q, p)$, the transformed Hamiltonian $\mathcal{H}(Q, P, t)$
$$\mathcal{H}(Q, P, t) = \frac{p^2}{2m}e^{-\Gamma t} + \frac{\Gamma}{2}qp + \frac{m\omega_0^2}{2}q^2 e^{\Gamma t}$$
is a constant of motion which was not readily apparent when using the original Hamiltonian. This unexpected
result illustrates the usefulness of canonical transformations for solving dissipative systems. The Hamilton-
Jacobi theory now can be used to solve the equations of motion for the transformed variables $(Q, P)$ plus the
transformed Hamiltonian $\mathcal{H}(Q, P, t)$. The derivative of the generating function gives
$$P = \frac{\partial S}{\partial Q} \tag{h}$$
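Use equation (h) to substitute for $P$ in the Hamiltonian $\mathcal{H}(Q, P, t)$ (equation (g)), then the Hamilton-
Jacobi method gives
$$\frac{1}{2m}\left(\frac{\partial S}{\partial Q}\right)^2 + \frac{\Gamma}{2}Q\frac{\partial S}{\partial Q} + \frac{m\omega_0^2}{2}Q^2 + \frac{\partial S}{\partial t} = 0$$
This equation is separable as described in 15.107 and thus let
$$S(Q, \alpha, t) = W(Q, \alpha) - \alpha t$$
where $\alpha$ is a separation constant. Then
$$\left[\frac{1}{2m}\left(\frac{\partial W}{\partial Q}\right)^2 + \frac{\Gamma}{2}Q\frac{\partial W}{\partial Q} + \frac{m\omega_0^2}{2}Q^2\right] = \alpha \tag{i}$$

To simplify the equations define the variable $x$ as
$$x \equiv Q\sqrt{m\omega_0} \tag{j}$$
then equation (i) can be written as
$$\left(\frac{\partial W}{\partial x}\right)^2 + Ax\frac{\partial W}{\partial x} + \left(x^2 - B\right) = 0 \tag{k}$$
where $A = \frac{\Gamma}{\omega_0}$ and $B = \frac{2\alpha}{\omega_0}$. Assume initial conditions $q(0) = q_0$ and $\dot{q}(0) = 0$.

For this case the separation constant $\alpha > 0$, therefore $B > 0$. Note that equation (k) is a simple
second-order algebraic relation, the solution of which is
$$\frac{\partial W}{\partial x} = -\frac{Ax}{2} \pm \sqrt{B - \left[1 - \left(\frac{A}{2}\right)^2\right]x^2} \tag{l}$$

The choice of the sign is irrelevant for this case and thus the positive sign is chosen. There are three possible
cases for the solution depending on whether the square-root term $\sqrt{1 - \left(\frac{A}{2}\right)^2}$ is real, zero, or imaginary.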
Case 1: $\frac{A}{2} < 1$, that is, $\frac{\Gamma}{2\omega_0} < 1$
Define $C = \sqrt{1 - \left(\frac{A}{2}\right)^2}$. Then equation (l) can be integrated to give
$$W = -\frac{Ax^2}{4} + \int \sqrt{B - C^2x^2}\,dx \tag{m}$$
and
$$\beta = \frac{\partial S}{\partial \alpha} = -t + \frac{1}{\omega_0}\int \frac{dx}{\sqrt{B - C^2x^2}}$$
This integral gives
$$\sin^{-1}\left(\frac{Cx}{\sqrt{B}}\right) = C\omega_0(t + \beta) \equiv \omega t + \delta$$
where
$$\omega = C\omega_0 = \omega_0\sqrt{1 - \left(\frac{\Gamma}{2\omega_0}\right)^2} = \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2} \tag{n}$$
Transforming back to the original variable $q$ gives
$$q(t) = Ke^{-\frac{\Gamma t}{2}}\sin(\omega t + \delta) \tag{o}$$
where $K$ and $\delta$ are given by the initial conditions. Equation (o) is identical to the solution for the underdamped
linearly-damped linear oscillator given previously in equation 3.35.
Case 2: $\frac{A}{2} = 1$, that is, $\frac{\Gamma}{2\omega_0} = 1$
In this case $C = \sqrt{1 - \left(\frac{A}{2}\right)^2} = 0$ and thus equation (m) simplifies to
$$W = -\frac{Ax^2}{4} + \sqrt{B}\,x$$
and
$$\beta = \frac{\partial S}{\partial \alpha} = -t + \frac{x}{\omega_0\sqrt{B}}$$
Therefore the solution is
$$q(t) = e^{-\frac{\Gamma t}{2}}(F + Gt) \tag{p}$$
where $F$ and $G$ are constants given by the initial conditions. This is the solution for the critically-damped
linearly-damped, linear oscillator given previously in equation 3.38.
Case 3: $\frac{A}{2} > 1$, that is, $\frac{\Gamma}{2\omega_0} > 1$
Define a real constant $D$ where $D = \sqrt{\left(\frac{A}{2}\right)^2 - 1} = iC$, then
$$W = -\frac{Ax^2}{4} + \int \sqrt{B + D^2x^2}\,dx$$
Then
$$\beta = \frac{\partial S}{\partial \alpha} = -t + \frac{1}{\omega_0}\int \frac{dx}{\sqrt{B + D^2x^2}}$$
This last integral gives
$$\sinh^{-1}\left(\frac{Dx}{\sqrt{B}}\right) = D\omega_0(t + \beta) \equiv \omega t + \delta$$
where
$$\omega = D\omega_0 = \omega_0\sqrt{\left(\frac{\Gamma}{2\omega_0}\right)^2 - 1}$$
Then the original variable gives
$$q(t) = Ke^{-\frac{\Gamma t}{2}}\sinh(\omega t + \delta) \tag{q}$$
This is the classic solution of the overdamped linearly-damped, linear harmonic oscillator given previously in
equation 3.37. The canonical transformation from a non-autonomous to an autonomous system allowed use
of Hamiltonian mechanics to solve the damped oscillator problem.
Note that this example used Bateman’s non-standard Lagrangian, and corresponding Hamiltonian, for
handling a dissipative linear oscillator system where the dissipation depends linearly on velocity. This non-
standard Lagrangian led to the correct equations of motion and solutions when applied using either the
time-dependent Lagrangian, or time-dependent Hamiltonian, and these solutions agree with those given in
chapter 35 which were derived using Newtonian mechanics.

15.4.4 Visual representation of the action function $S$

The important role of the action integral $S$ can be illuminated by considering the case of a single point mass $m$ moving in a time-independent potential $U(\mathbf{r})$. Then the action reduces to

$$S(\mathbf{q},\boldsymbol{\alpha},t) = W(\mathbf{q},\boldsymbol{\alpha}) - Et \tag{15.112}$$

Let $q_1 = x$, $q_2 = y$, $q_3 = z$, $p_1 = p_x$, $p_2 = p_y$, $p_3 = p_z$. The momentum components are given by

$$p_i = \frac{\partial W(\mathbf{q},\boldsymbol{\alpha})}{\partial q_i} \tag{15.113}$$

which corresponds to

$$\mathbf{p} = \nabla W = \nabla S \tag{15.114}$$

That is, the time-independent Hamilton-Jacobi equation is

$$\frac{1}{2m}\left|\nabla W\right|^2 + U(\mathbf{r}) = E \tag{15.115}$$

This implies that the particle momentum is given by the gradient of Hamilton's characteristic function and is perpendicular to surfaces of constant $W$ as illustrated in figure 15.2. The constant $S$ surfaces are time dependent as given by equation (15.101). Thus, if at time $t = 0$ the equi-action surface $S_0(\mathbf{q},\mathbf{P}) = W_0(\mathbf{q},\mathbf{P}) = 0$, then at $t = 1$ the same surface $S_0(\mathbf{q},\mathbf{P}) = 0$ now coincides with the $W_0(\mathbf{q},\mathbf{P}) = E$ surface, etc. That is, the equi-action surfaces move through space separately from the motion of the single point mass.

Figure 15.2: Surfaces of constant action integral $S$ (dashed lines) and the corresponding particle momenta (solid lines) with arrows showing the direction.
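As a minimal worked illustration of the moving equi-action surfaces, consider the special case of a free particle, $U = 0$. This special case, and the symbols $u$ and $v$ for the surface speed and the particle speed, are assumptions introduced here for illustration only.

```latex
% Free particle (U = 0): a solution of the time-independent
% Hamilton-Jacobi equation (15.115) is a plane-wave-like W.
\begin{align}
W &= \mathbf{p}\cdot\mathbf{r}, \qquad |\mathbf{p}| = \sqrt{2mE}
  && \Rightarrow\ \frac{1}{2m}|\nabla W|^{2} = \frac{p^{2}}{2m} = E \\
S &= \mathbf{p}\cdot\mathbf{r} - Et
  && \text{(surfaces of constant $S$ are planes normal to $\mathbf{p}$)} \\
u &= \frac{E}{|\mathbf{p}|} = \sqrt{\frac{E}{2m}}
  && \text{(speed of a plane $S=\text{constant}$ along $\mathbf{p}$)} \\
v &= \frac{|\mathbf{p}|}{m} = \sqrt{\frac{2E}{m}} = 2u
  && \text{(speed of the particle)}
\end{align}
```

In this sketch the equi-action planes advance at half the particle speed, which makes explicit that the surfaces of constant $S$ do not travel with the point mass.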
The above pictorial representation is analogous to the situation for motion of a wavefront for electromagnetic waves in optics, or matter waves in quantum physics, where the wave equation separates into the form $\phi = \phi_0 e^{iS/\hbar} = \phi_0 e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}$. Hamilton's goal was to create a unified theory for optics that was equally applicable to particle motion in classical mechanics. Thus the optical-mechanical analogy of the Hamilton-Jacobi theory has culminated in a universal theory that describes wave-particle duality; this was a Holy Grail of classical mechanics since Newton's time. It played an important role in development of the Schrödinger representation of quantum mechanics.

15.4.5 Advantages of Hamilton-Jacobi theory

Initially, only a few scientists, like Jacobi, recognized the advantages of Hamiltonian mechanics. In 1843 Jacobi made some brilliant mathematical developments in Hamilton-Jacobi theory that greatly enhanced exploitation of Hamiltonian mechanics. Hamilton-Jacobi theory now serves as a foundation for contemporary physics, such as quantum and statistical mechanics. A major advantage of Hamilton-Jacobi theory, compared to other formulations of analytic mechanics, is that it provides a single, first-order partial differential equation for the action $S$, which is a function of the $n$ generalized coordinates $\mathbf{q}$ and time $t$. The generalized momenta no longer appear explicitly in the Hamiltonian in equations 15.94 and 15.95. Note that the generalized momenta do not explicitly appear in the equivalent Euler-Lagrange equations of Lagrangian mechanics, but these comprise a system of $n$ second-order differential equations for the time evolution of the generalized coordinates $\mathbf{q}$. Hamilton's equations of motion are a system of $2n$ first-order equations for the time evolution of the generalized coordinates and their conjugate momenta.
An important advantage of the Hamilton-Jacobi theory is that it provides a formulation of classical
mechanics in which motion of a particle can be represented by a wave. In this sense, the Hamilton-Jacobi
equation fulfilled a long-held goal of theoretical physics, that dates back to Johann Bernoulli, of finding an
analogy between the propagation of light and the motion of a particle. This goal motivated Hamilton to
develop Hamiltonian mechanics. A consequence of this wave-particle analogy is that the Hamilton-Jacobi
formalism featured prominently in the derivation of the Schrödinger equation during the development of
quantum-wave mechanics.

15.5 Action-angle variables

15.5.1 Canonical transformation
Systems possessing periodic solutions are a ubiquitous feature in physics. The periodic motion can be either an oscillation, for which the trajectory in phase space is a closed loop (libration), or rolling (rotational) motion as discussed in chapter 3.4.4. For many problems involving periodic motion, the interest often lies in the frequencies of motion rather than the detailed shape of the trajectories in phase space. The action-angle variable approach uses a canonical transformation to action and angle variables, which provides a powerful and elegant method to exploit Hamiltonian mechanics. In particular, it can determine the frequencies of periodic motion without having to calculate the exact trajectories for the motion. This method was introduced by the French astronomer Ch. E. Delaunay (1816-1872) for applications to orbits in celestial mechanics, but it has equally important applications beyond celestial mechanics, such as to bound solutions of the atom in quantum mechanics.
The action-angle method replaces the momenta in the Hamilton-Jacobi procedure by the action phase integral for the closed loop (libration) trajectory in phase space defined by

$$J \equiv \oint p\,dq \tag{15.116}$$

where for each cyclic variable the integral is taken over one complete period of oscillation. The cyclic variable $I$ is called the action variable where

$$I \equiv \frac{1}{2\pi}J = \frac{1}{2\pi}\oint p\,dq \tag{15.117}$$

The canonical variable conjugate to the action variable $I$ is the angle variable $\phi$. Note that the name "action variable" is used to differentiate $I$ from the action functional $S = \int L\,dt$, which has the same units, i.e. angular momentum.
The general principle underlying the use of action-angle variables is illustrated by considering one body, of mass $m$, subject to a one-dimensional bound conservative potential energy $U(q)$. The Hamiltonian is given by

$$H(p,q) = \frac{p^2}{2m} + U(q) \tag{15.118}$$

This bound system has a $(q,p)$ phase space contour for each energy $H = E$

$$p(q,E) = \pm\sqrt{2m\left(E - U(q)\right)} \tag{15.119}$$

For an oscillatory system the two-valued momentum of equation 15.119 is non-trivial to handle. By contrast, the area $J \equiv \oint p\,dq$ of the closed loop in phase space is a single-valued scalar quantity that depends on $E$ and $U(q)$. Moreover, Liouville's theorem states that the area of the closed contour in phase space $J \equiv \oint p\,dq$ is invariant to canonical transformations. These facts suggest the use of a new pair of conjugate variables, $(\phi, I)$, where $I(E)$ uniquely labels the trajectory, and corresponding area, of a closed loop in phase space for each value of $E$, and the single-valued function $\phi$ is a corresponding angle that specifies the exact point along the phase-space contour as illustrated in figure 15.3.
For simplicity consider the linear harmonic oscillator where

$$U(q) = \frac{1}{2}m\omega^2 q^2 \tag{15.120}$$

Then the Hamiltonian, 15.118, equals

$$H(p,q) = \frac{p^2}{2m} + \frac{1}{2}m\omega^2 q^2 \tag{15.121}$$

Hamilton's equations of motion give that

$$\dot{p} = -\frac{\partial H}{\partial q} = -m\omega^2 q \tag{15.122}$$

$$\dot{q} = \frac{\partial H}{\partial p} = \frac{p}{m} \tag{15.123}$$

The solution of equations 15.122 and 15.123 is of the form

$$q = A\cos\left(\omega(t - t_0)\right) \tag{15.124}$$

$$p = -m\omega A\sin\left(\omega(t - t_0)\right) \tag{15.125}$$

where $A$ and $t_0$ are integration constants. For the harmonic oscillator, equations 15.124 and 15.125 correspond to the usual elliptical contours in phase space, as illustrated in figure 15.3.
The action-angle canonical transformation involves making the transform

$$(q,p) \rightarrow (\phi, I) \tag{15.126}$$

where $I$ is defined by equation 15.117 and the angle $\phi$ is the corresponding canonical angle. The logical approach to this canonical transformation for the harmonic oscillator is to define $q$ and $p$ in terms of $\phi$ and $I$

$$q = \sqrt{\frac{2I}{m\omega}}\cos\phi \tag{15.127}$$

$$p = -\sqrt{2m\omega I}\,\sin\phi \tag{15.128}$$

which have the same form as equations 15.124 and 15.125. Note that the Poisson bracket is unity

$$[q,p]_{(\phi,I)} = 1$$

which implies that the above transformation is canonical, and thus the phase space area $I(E) \equiv \frac{1}{2\pi}\oint p\,dq$ is conserved.
For this canonical transformation the transformed Hamiltonian $\mathcal{H}(\phi,I)$ is

$$\mathcal{H}(\phi,I) = \frac{1}{2m}\left(2m\omega I\right)\sin^2\phi + \frac{1}{2}m\omega^2\frac{2I}{m\omega}\cos^2\phi = \omega I \tag{15.129}$$

Note that this Hamiltonian is a constant that is independent of the angle $\phi$ and thus Hamilton's equations of motion give

$$\dot{I} = -\frac{\partial \mathcal{H}(\phi,I)}{\partial \phi} = 0 \tag{15.130}$$

$$\dot{\phi} = \frac{\partial \mathcal{H}(\phi,I)}{\partial I} = \omega \tag{15.131}$$

Thus we have mapped the harmonic oscillator to new coordinates $(\phi, I)$ where

$$\omega = \frac{\partial \mathcal{H}(\phi,I)}{\partial I} = \frac{E}{I} \tag{15.132}$$

$$\phi = \omega(t - t_0) \tag{15.133}$$

Figure 15.3: The potential energy $U(q)$ (upper) and corresponding phase space $(q,p)$ (middle) for the harmonic oscillator at four equally spaced total energies $E$. The corresponding action-angles $(\phi, I)$ resulting from a canonical transformation of this system are shown in the lower plot.

That is, the phase space has been mapped from ellipses, with area proportional to $E$ in the $(q,p)$ phase space, to a cylindrical $(\phi, I)$ phase space where $I = E/\omega$ are constant values that are independent of the angle, while $\phi$ increases linearly with time. Thus the variables $(q,p)$ are periodic with modulus $\Delta\phi = 2\pi$.

$$q(\phi + 2\pi, I) = q(\phi, I) \tag{15.134}$$

$$p(\phi + 2\pi, I) = p(\phi, I) \tag{15.135}$$

The period $\tau$ of the periodic oscillatory motion is given simply by $\Delta\phi = 2\pi = \omega\tau$, which is the well known result for the harmonic oscillator. Note that the action-angle variable canonical transformation has determined the frequency of the periodic motion without solving the detailed trajectory of the motion.
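The statement that the frequency follows from the action variable alone can be checked numerically. The following is a minimal sketch, assuming illustrative values for the mass and angular frequency; the function names `action_variable` and `frequency_from_action` are hypothetical helpers introduced here. It evaluates $I = \frac{1}{2\pi}\oint p\,dq$ directly from $p(q,E)$ of equation 15.119 and then estimates $\omega = \partial E/\partial I$ by a finite difference, without ever computing a trajectory.

```python
import math

M, OMEGA = 1.0, 2.0  # illustrative mass and angular frequency (assumed values)


def potential(q):
    """Harmonic oscillator potential U(q) = m w^2 q^2 / 2."""
    return 0.5 * M * OMEGA**2 * q**2


def action_variable(energy, n=2000):
    """Return I = (1/2pi) * closed-loop integral of p dq at the given energy.

    The substitution q = a*sin(s) keeps the integrand smooth at the
    turning points q = +/- a, where p = sqrt(2m(E - U)) goes to zero.
    """
    a = math.sqrt(2.0 * energy / (M * OMEGA**2))  # turning point, U(a) = E
    ds = math.pi / n
    half_loop = 0.0
    for k in range(n):
        s = -0.5 * math.pi + (k + 0.5) * ds       # midpoint rule in s
        q = a * math.sin(s)
        p = math.sqrt(max(0.0, 2.0 * M * (energy - potential(q))))
        half_loop += p * a * math.cos(s) * ds     # p dq with dq = a cos(s) ds
    # The +p and -p branches contribute equally to the closed loop.
    return 2.0 * half_loop / (2.0 * math.pi)


def frequency_from_action(e1, e2):
    """Finite-difference estimate of omega = dE/dI."""
    return (e2 - e1) / (action_variable(e2) - action_variable(e1))


if __name__ == "__main__":
    print(action_variable(1.0), 1.0 / OMEGA)      # I should equal E / omega
    print(frequency_from_action(1.0, 2.0), OMEGA)
```

For the harmonic oscillator $I$ is linear in $E$, so the finite difference is exact up to rounding; for an anharmonic potential the same sketch would need a turning-point solver and a smaller energy step.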

The above example of the harmonic oscillator has shown that, for integrable periodic systems, it is possible to identify a canonical transformation to $(\phi, I)$ such that the Hamiltonian is independent of the angle $\phi$, which specifies the instantaneous location on the constant energy contour $E$. If the phase space contour is a separatrix, then it divides phase space into invariant regions containing phase-space contours with differing behavior. The action-angle variables are not useful for separatrix contours. For rolling motion, the system rotates with continuously increasing, or decreasing, angle, and there is no natural boundary for the action-angle variable since the phase space trajectory is continuous and not closed. However, the action-angle approach still is valid if the motion involves periodic as well as rolling motion.
The example of the one-dimensional, one-body, harmonic oscillator can be expanded to the more general case for many bodies in three dimensions. This is illustrated by considering multiple periodic systems for which the Hamiltonian is conservative and where the equations of the canonical transformation are separable. The generalized momenta then can be written as

$$p_i = \frac{\partial W_i(q_i;\alpha_1,\alpha_2,\dots,\alpha_n)}{\partial q_i} \tag{15.136}$$

for which each $p_i$ is a function of $q_i$ and the $n$ integration constants $\alpha_j$

$$p_i = p_i(q_i,\alpha_1,\alpha_2,\dots,\alpha_n) \tag{15.137}$$

The momentum $p_i(q_i,\alpha_1,\alpha_2,\dots,\alpha_n)$ represents the trajectory of the system in the $(q_i,p_i)$ phase space that is characterized by Hamilton's characteristic function $W(q,\alpha)$. Combining equations 15.116 and 15.136 gives

$$J_i \equiv \oint p_i\,dq_i = \oint \frac{\partial W_i(q_i;\alpha_1,\alpha_2,\dots,\alpha_n)}{\partial q_i}\,dq_i \tag{15.138}$$

Since $q_i$ is merely a variable of integration, each active action variable $I_i = J_i/2\pi$ is a function of the $n$ constants of integration in the Hamilton-Jacobi equation. Because of the independence of the separable-variable pairs $(q_i,p_i)$, the $I_i$ form $n$ independent functions of the $\alpha_i$ and hence are suitable for use as a new set of constant momenta. Thus the characteristic function $W$ can be written as

$$W(q_1,\dots,q_n;I_1,\dots,I_n) = \sum_j W_j(q_j;I_1,\dots,I_n) \tag{15.139}$$

while the Hamiltonian is only a function of the momenta, $H(I_1,\dots,I_n)$.

The generalized coordinate conjugate to $I_i$ is known as the angle variable $\phi_i$, which is defined by the transformation equation

$$\phi_i = \frac{\partial W}{\partial I_i} = \sum_{j=1}^{n}\frac{\partial W_j(q_j;I_1,\dots,I_n)}{\partial I_i} \tag{15.140}$$

The corresponding equation of motion for $\phi_i$ is given by

$$\dot{\phi}_i = \frac{\partial H(\mathbf{I})}{\partial I_i} = 2\pi\nu_i(I_1,\dots,I_n) \tag{15.141}$$

where $\nu_i(\mathbf{I})$ are constant functions of the action variables $I_i$ with a solution

$$\phi_i = 2\pi\nu_i t + \beta_i \tag{15.142}$$

that is, they are linear functions of time. The constants $\nu_i$ can be identified with the frequencies of the multiple periodic motions.
The action-angle variables appear to be no different than a particular set of transformed coordinates. Their merit appears when the physical interpretation is assigned to $\nu_i$. Consider the change $\delta\phi_i$ as the $q_j$ are changed infinitesimally

$$\delta\phi_i = \sum_j \frac{\partial \phi_i}{\partial q_j}\,dq_j = \sum_j \frac{\partial^2 W}{\partial I_i\,\partial q_j}\,dq_j \tag{15.143}$$

The derivative with respect to $q_j$ vanishes except for the $W_j$ component of $W$. Thus equation 15.143 reduces to

$$\delta\phi_i = \frac{\partial}{\partial I_i}\sum_j p_j(q_j,\mathbf{I})\,dq_j \tag{15.144}$$

Therefore, the total change in $\phi_i$ as the system goes through one complete cycle is

$$\Delta\phi_i = \sum_j \frac{\partial}{\partial I_i}\oint p_j(q_j,\mathbf{I})\,dq_j = 2\pi \tag{15.145}$$

where $\frac{\partial}{\partial I_i}$ is outside the integral since the $I_i$ are constants for cyclic motion. Thus $\Delta\phi_i = 2\pi = \omega_i\tau_i$ where $\tau_i$ is the period for one cycle of oscillation, and the angular frequency $\omega_i$ is given by

$$\frac{\omega_i}{2\pi} = \nu_i = \frac{1}{\tau_i} \tag{15.146}$$

Thus the frequency $\nu_i$ associated with the periodic motion is the reciprocal of the period $\tau_i$. The secret here is that the derivative of $H$ with respect to the action variable $I_i$, given by equation (15.141), directly determines the frequency of the periodic motion without the need to solve the complete equations of motion. Note that multiple periodic motion can be represented by a Fourier expansion of the form
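$$q_k = \sum_{j_1=-\infty}^{\infty}\sum_{j_2=-\infty}^{\infty}\cdots\sum_{j_n=-\infty}^{\infty} a^{k}_{j_1\dots j_n}\,e^{i2\pi\left(j_1\nu_1 + j_2\nu_2 + j_3\nu_3 + \dots + j_n\nu_n\right)t} \tag{15.147}$$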

Although the action-angle approach to Hamilton-Jacobi theory does not produce complete equations of motion, it does provide the frequency decomposition that often is the physics of interest. The reason that the powerful action-angle variable approach has been introduced here is that it is used extensively in celestial mechanics. The action-angle concept also played a key role in the development of quantum mechanics, in that Sommerfeld recognized that Bohr's ad hoc assumption that angular momentum is quantized could be expressed in terms of quantization of the action variable, as is mentioned in chapter 18.

15.5.2 Adiabatic invariance of the action variables

When the Hamiltonian depends on time it can be quite difficult to solve for the motion because it is difficult
to find constants of motion for time-dependent systems. However, if the time dependence is sufficiently
slow, that is, if the motion is adiabatic, then there exist dynamical variables that are almost constant which
can be used to solve for the motion. In particular, such approximate constants are the familiar action-angle
integrals. The adiabatic invariance of the action variables played an important role in the development of
quantum mechanics during the 1911 Solvay Conference. This was a time when physicists were grappling with
the concepts of quantum mechanics. Einstein used the following classical mechanics example of adiabatic
invariance, applied to the simple pendulum, in order to illustrate the concept of adiabatic invariance of the
action. This example demonstrates the power of using action-angle variables.

15.17 Example: Adiabatic invariance for the simple pendulum

Consider that the pendulum is made up of a point mass $m$ suspended from a pivot by a light string of length $L$ that is swinging freely in a vertical plane. Derive the dependence of the amplitude of the oscillations $\theta_0$, assuming $\theta$ is small, if the string is very slowly shortened by a factor of 2, that is, assume that the change in length during one period of the oscillation is very small.
The tension in the string $T$ is given by

$$T = mg\left\langle\cos\theta\right\rangle + \left\langle\frac{mL^2\dot{\theta}^2}{L}\right\rangle$$

Let the pendulum angle be oscillatory

$$\theta = \theta_0\cos(\omega t + \phi_0)$$

Then the average mean square amplitude and velocity over one period are

$$\left\langle\theta^2\right\rangle = \left\langle\left[\theta_0\cos(\omega t + \phi_0)\right]^2\right\rangle = \frac{\theta_0^2}{2}$$

$$\left\langle\dot{\theta}^2\right\rangle = \left\langle\left[-\theta_0\omega\sin(\omega t + \phi_0)\right]^2\right\rangle = \frac{\omega^2\theta_0^2}{2}$$

Since, for the simple pendulum, $\omega^2 = g/L$, then the tension in the string

$$T = mg\left(1 - \frac{\left\langle\theta^2\right\rangle}{2}\right) + mL\left\langle\dot{\theta}^2\right\rangle = mg\left(1 + \frac{\theta_0^2}{4}\right)$$

Assuming that $\theta_0$ is a small angle, and that the change in length $\Delta L$ is very small during one period $\tau$, then the work done on the pendulum is

$$\Delta W = -T\,\Delta L = -mg\,\Delta L - mg\frac{\theta_0^2}{4}\Delta L \tag{a}$$

while the change in internal oscillator energy is

$$\Delta(-mgL\cos\theta_0) = \Delta\left[-mgL\left(1 - \frac{\theta_0^2}{2}\right)\right] = -mg\,\Delta L + \frac{1}{2}mg\,\Delta(L\theta_0^2) = -mg\,\Delta L + \frac{1}{2}mg\,\theta_0^2\,\Delta L + mgL\,\theta_0\,\Delta\theta_0 \tag{b}$$

The work done must balance the increment in internal energy, therefore

$$L\theta_0\,\Delta\theta_0 + \frac{3\theta_0^2\,\Delta L}{4} = 0$$

or

$$L\theta_0^2\,\Delta\ln\left(\theta_0 L^{3/4}\right) = 0$$

Therefore it follows that

$$\theta_0 L^{3/4} = \text{constant} \tag{c}$$

or

$$\theta_0 \propto L^{-3/4}$$

Thus shortening the length of the pendulum string from $L$ to $L/2$ adiabatically corresponds to the amplitude increasing by a factor $2^{3/4} = 1.68$.
Consider the action-angle integral for one closed period $\tau = \frac{2\pi}{\omega}$ for this problem

$$J = \oint p_\theta\,d\theta = \oint mL^2\dot{\theta}\cdot\dot{\theta}\,dt = mL^2\left\langle\dot{\theta}^2\right\rangle\frac{2\pi}{\omega} = \pi mL^2\omega\,\theta_0^2 = \pi m g^{1/2}\,\theta_0^2\,L^{3/2} = \text{constant}$$

where the last step is due to equation (c).

The above example shows that the action integral $J = \text{constant}$, that is, it is invariant to an adiabatic change. In retrospect this result is as expected in that the action integral should be minimized.
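The scaling $\theta_0 \propto L^{-3/4}$ can be checked with a short numerical experiment. The following is a minimal sketch, not part of the original example: it assumes a linear ramp of the string length, the small-angle equation of motion $\ddot{\theta} = -2(\dot{L}/L)\dot{\theta} - (g/L)\theta$ that follows from $\frac{d}{dt}(mL^2\dot{\theta}) = -mgL\theta$, and a hand-written fourth-order Runge-Kutta integrator. The function name `amplitude_ratio` and all parameter values are illustrative assumptions.

```python
import math


def amplitude_ratio(L0=1.0, L1=0.5, T=400.0, dt=0.002, g=9.81, theta0=0.01):
    """Ratio of final to initial swing amplitude for a slowly shortened string.

    The length changes linearly from L0 to L1 over a time T that spans
    several hundred oscillation periods, so the change is nearly adiabatic.
    """
    rate = (L1 - L0) / T  # constant dL/dt (negative: the string shortens)

    def deriv(t, theta, theta_dot):
        length = L0 + rate * t
        # Small-angle equation of motion with time-dependent length.
        return theta_dot, -2.0 * rate / length * theta_dot - g / length * theta

    t, theta, theta_dot = 0.0, theta0, 0.0
    for _ in range(int(round(T / dt))):
        # Classical fourth-order Runge-Kutta step.
        k1 = deriv(t, theta, theta_dot)
        k2 = deriv(t + dt / 2, theta + dt / 2 * k1[0], theta_dot + dt / 2 * k1[1])
        k3 = deriv(t + dt / 2, theta + dt / 2 * k2[0], theta_dot + dt / 2 * k2[1])
        k4 = deriv(t + dt, theta + dt * k3[0], theta_dot + dt * k3[1])
        theta += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        theta_dot += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += dt

    # Instantaneous amplitude from theta and theta_dot at the final length.
    omega_final = math.sqrt(g / L1)
    return math.hypot(theta, theta_dot / omega_final) / theta0


if __name__ == "__main__":
    print(amplitude_ratio(), 2 ** 0.75)  # the two values should be close
```

Making the ramp faster (smaller `T`) degrades the agreement, which is the numerical signature that the invariance of the action holds only in the adiabatic limit.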

15.6 Canonical perturbation theory

Most examples in classical mechanics discussed so far have been capable of exact solutions. In real life, the
majority of problems cannot be solved exactly. For example, in celestial mechanics the two-body Kepler
problem can be solved exactly, but solution of the three-body problem is intractable. Typical systems in
celestial mechanics are never as simple as the two-body Kepler system because of the influence of additional
bodies. Fortunately in most cases the influence of additional bodies is suﬃciently small to allow use of
perturbation theory. That is, the restricted three-body approximation can be employed for which the system
is reduced to considering it as an exactly solvable two-body problem, subject to a small perturbation to this
solvable two-body system. Note that even though the change in the Hamiltonian due to the perturbing term
may be small, the impact on the motion can be especially large near a resonance.
Consider that the Hamiltonian, subject to a time-dependent perturbation, is written as

$$H(q,p,t) = H_0(q,p,t) + \Delta H(q,p,t)$$

where $H_0(q,p,t)$ designates the unperturbed Hamiltonian and $\Delta H(q,p,t)$ designates the perturbing term. For the unperturbed system the Hamilton-Jacobi equation is given by

$$\mathcal{H}(Q,P,t) = H_0\left(q_1,\dots,q_n;\frac{\partial S}{\partial q_1},\dots,\frac{\partial S}{\partial q_n};t\right) + \frac{\partial S}{\partial t} = 0 \tag{15.90}$$

where $S(q,P,t)$ is the generating function for the canonical transformation $(q,p) \rightarrow (Q,P)$. The perturbed $S(q,P,t)$ remains a canonical transformation, but the transformed Hamiltonian $\mathcal{H}(Q,P,t) \neq 0$. That is,

$$\mathcal{H}(Q,P,t) = H_0 + \Delta H(q,p,t) + \frac{\partial S}{\partial t} = \Delta H(q,p,t) \tag{15.148}$$

The equations of motion satisfied by the transformed variables now are

$$\dot{Q}_i = \frac{\partial \Delta H}{\partial P_i} \qquad \dot{P}_i = -\frac{\partial \Delta H}{\partial Q_i} \tag{15.149}$$

These equations remain as difficult to solve as the full Hamiltonian. However, the perturbation technique assumes that $\Delta H$ is small, and that one can neglect the change of $(Q,P)$ over the perturbing interval. Therefore, to a first approximation, the unperturbed values of $\frac{\partial \Delta H}{\partial P_i}$ and $\frac{\partial \Delta H}{\partial Q_i}$ can be used in equations 15.149. A detailed explanation of canonical perturbation theory is presented in chapter 12 of Goldstein [Go50].

15.18 Example: Harmonic oscillator perturbation

(a) Consider first the Hamilton-Jacobi equation for the generating function $S(q,\alpha,t)$ for the case of a single free particle subject to the Hamiltonian $H = \frac{1}{2}p^2$. Find the canonical transformation $q = q(\beta,\alpha)$ and $p = p(\beta,\alpha)$ where $\beta$ and $\alpha$ are the transformed coordinate and momentum respectively.
The Hamilton-Jacobi equation is

$$\frac{\partial S}{\partial t} + H(q,p,t) = 0$$

Using $p = \frac{\partial S}{\partial q}$ in the Hamiltonian $H = \frac{1}{2}p^2$ gives

$$\frac{\partial S}{\partial t} + \frac{1}{2}\left(\frac{\partial S}{\partial q}\right)^2 = 0$$

Since $H$ does not depend on $q$, $t$ explicitly, the two terms on the left hand side of the equation can be set equal to $-\gamma$, $\gamma$ respectively, where $\gamma$ is at most a function of $p$. Then the generating function is

$$S = \sqrt{2\gamma}\,q - \gamma t$$

Set $\alpha = \sqrt{2\gamma}$, then the generating function can be written as

$$S = \alpha q - \frac{1}{2}\alpha^2 t$$

The constant $\alpha$ can be identified with the new momentum $P$. Then the transformation equations become

$$p = \frac{\partial S}{\partial q} = \alpha = P \qquad Q = \frac{\partial S}{\partial \alpha} = q - \alpha t = \beta$$

That is

$$q = \beta + \alpha t$$

which corresponds to motion with a uniform velocity $\alpha$ in the $q$, $p$ system.
(b) Consider that the Hamiltonian is perturbed by addition of the potential $U = \frac{q^2}{2}$, which corresponds to the harmonic oscillator. Then

$$H = \frac{1}{2}p^2 + \frac{q^2}{2}$$

Consider the transformed Hamiltonian

$$\mathcal{H} = H + \frac{\partial S}{\partial t} = \frac{1}{2}p^2 + \frac{q^2}{2} - \frac{\alpha^2}{2} = \frac{q^2}{2} = \frac{1}{2}(\beta + \alpha t)^2$$

Hamilton's equations of motion

$$\dot{Q} = \frac{\partial \mathcal{H}}{\partial P} \qquad \dot{P} = -\frac{\partial \mathcal{H}}{\partial Q}$$

give that

$$\dot{\beta} = (\beta + \alpha t)\,t$$

$$\dot{\alpha} = -(\beta + \alpha t)$$

These two equations can be solved to give

$$\ddot{\alpha} + \alpha = 0$$

which is the equation of a harmonic oscillator, showing that $\alpha$ is harmonic of the form $\alpha = \alpha_0\sin(t + \delta)$ where $\alpha_0$, $\delta$ are constants of motion. Thus

$$\beta = -\dot{\alpha} - \alpha t = -\alpha_0\left[\cos(t + \delta) + t\sin(t + \delta)\right]$$

The transformation equations then give

$$p = \alpha = \alpha_0\sin(t + \delta)$$

$$q = \beta + \alpha t = -\dot{\alpha} = -\alpha_0\cos(t + \delta)$$

Hence the solution for the perturbed system is harmonic, which is to be expected since the potential has a quadratic dependence on position.
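The solution quoted in part (b) can be verified by substituting it back into the two first-order equations $\dot{\beta} = (\beta + \alpha t)t$ and $\dot{\alpha} = -(\beta + \alpha t)$. The following is a minimal numerical sketch of that substitution; the constants `ALPHA0` and `DELTA` are arbitrary illustrative values, and the derivatives are approximated by central differences rather than computed analytically.

```python
import math

ALPHA0, DELTA = 0.7, 0.3  # arbitrary illustrative constants of motion


def alpha(t):
    """Transformed momentum alpha(t) = alpha0 sin(t + delta)."""
    return ALPHA0 * math.sin(t + DELTA)


def beta(t):
    """Transformed coordinate beta(t) = -alpha0 [cos(t+delta) + t sin(t+delta)]."""
    return -ALPHA0 * (math.cos(t + DELTA) + t * math.sin(t + DELTA))


def residuals(t, h=1e-5):
    """Residuals of the two Hamilton equations at time t (both should vanish)."""
    alpha_dot = (alpha(t + h) - alpha(t - h)) / (2 * h)
    beta_dot = (beta(t + h) - beta(t - h)) / (2 * h)
    q = beta(t) + alpha(t) * t            # q = beta + alpha t
    return beta_dot - q * t, alpha_dot + q


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.5):
        print(t, residuals(t))
```

Both residuals stay at the level of the finite-difference error, and $q = \beta + \alpha t$ reduces to $-\alpha_0\cos(t+\delta)$, so $\dot{q} = p$ as required by $H = \frac{1}{2}p^2 + \frac{1}{2}q^2$.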

15.19 Example: Lindblad resonance in planetary and galactic motion

Use of canonical perturbation theory in celestial mechanics has been exploited by Professor Alice Quillen and her group. They combine use of action-angle variables and Hamilton-Jacobi theory to investigate the role of Lindblad resonance in planetary motion, and also in stellar motion in galaxies. A Lindblad resonance is an orbital resonance in which the orbital period of a celestial body is a simple multiple of some forcing frequency. Even for very weak perturbing forces, such resonance behavior can lead to orbit capture and chaotic motion.
For planetary motion the planet masses are about 1/1000 that of the central star, so the perturbations to Kepler orbits are small. However, Lindblad resonance for planetary motion led to Saturn's rings, which result from perturbations produced by the moons of Saturn that sculpt and clear dust rings. Stellar orbits in disk galaxies are perturbed a few percent by non-axially-symmetric galactic features such as spiral arms or bars. Lindblad resonances perturb stellar motion and drive spiral density waves at distances from the center of a galactic disk where the natural frequency of the radial component of a star's orbital velocity is close to the frequency of the fluctuations in the gravitational field due to passage through spiral arms or bars. If a star's orbital speed around a galactic center is greater than that of the part of a spiral arm through which it is traversing, then an inner Lindblad resonance occurs, which speeds up the star's orbital speed, moving the orbit outwards. If the orbital speed is less than that of a spiral arm, an outer Lindblad resonance occurs, causing inward movement of the orbit.

15.7 Symplectic representation

Hamilton's first-order equations of motion are symmetric if the generalized and constraint force terms in equation 15.9 are excluded.

$$\dot{\mathbf{q}} = \frac{\partial H}{\partial \mathbf{p}} \qquad -\dot{\mathbf{p}} = \frac{\partial H}{\partial \mathbf{q}}$$

This stimulated attempts to treat the canonical variables $(\mathbf{q},\mathbf{p})$ in a symmetric form using group theory. Some graduate textbooks in classical mechanics have adopted use of symplectic symmetry in order to unify the presentation of Hamiltonian mechanics. For a system of $n$ degrees of freedom, a column matrix $\boldsymbol{\eta}$ is constructed that has $2n$ elements where

$$\eta_j = q_j \qquad \eta_{n+j} = p_j \qquad j \leq n \tag{15.150}$$

Therefore the column matrix $\frac{\partial H}{\partial \boldsymbol{\eta}}$ has elements

$$\left(\frac{\partial H}{\partial \boldsymbol{\eta}}\right)_j = \frac{\partial H}{\partial q_j} \qquad \left(\frac{\partial H}{\partial \boldsymbol{\eta}}\right)_{n+j} = \frac{\partial H}{\partial p_j} \qquad j \leq n \tag{15.151}$$

The symplectic matrix $\mathbf{J}$ is defined as being a $2n$ by $2n$ skew-symmetric, orthogonal matrix that is broken into four $n \times n$ null or unit matrices according to the scheme

$$\mathbf{J} = \begin{pmatrix} [0] & +[1] \\ -[1] & [0] \end{pmatrix} \tag{15.152}$$

where $[0]$ is the $n$-dimensional null matrix, for which all elements are zero. Also $[1]$ is the $n$-dimensional unit matrix, for which the diagonal matrix elements are unity and all off-diagonal matrix elements are zero. The $\mathbf{J}$ matrix accounts for the opposite signs used in the equations for $\dot{\mathbf{q}}$ and $\dot{\mathbf{p}}$. The symplectic representation allows Hamilton's equations of motion to be written in the compact form

$$\dot{\boldsymbol{\eta}} = \mathbf{J}\frac{\partial H}{\partial \boldsymbol{\eta}} \tag{15.153}$$

This textbook does not use the elegant symplectic representation since it ignores the important generalized forces and Lagrange multiplier forces.
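The compact form 15.153 can be made concrete with a few lines of code. The following is a minimal sketch for a single degree of freedom, assuming a harmonic oscillator Hamiltonian $H = p^2/2m + m\omega^2 q^2/2$ with unit mass and frequency; the function names `symplectic_matrix`, `grad_h` and `eta_dot` are hypothetical helpers introduced here for illustration.

```python
def symplectic_matrix(n):
    """Build the 2n x 2n matrix J = [[0, +1], [-1, 0]] of equation 15.152."""
    size = 2 * n
    J = [[0.0] * size for _ in range(size)]
    for i in range(n):
        J[i][n + i] = 1.0    # upper-right block: +[1]
        J[n + i][i] = -1.0   # lower-left block: -[1]
    return J


def grad_h(eta, m=1.0, omega=1.0):
    """dH/d(eta) for H = p^2/2m + m w^2 q^2/2 with eta = (q, p), n = 1."""
    q, p = eta
    return [m * omega**2 * q, p / m]


def eta_dot(eta):
    """Evaluate d(eta)/dt = J dH/d(eta), equation 15.153."""
    size = len(eta)
    J = symplectic_matrix(size // 2)
    g = grad_h(eta)
    return [sum(J[i][k] * g[k] for k in range(size)) for i in range(size)]


if __name__ == "__main__":
    # q = 1, p = 2 gives q_dot = p/m = 2 and p_dot = -m w^2 q = -1.
    print(eta_dot([1.0, 2.0]))  # [2.0, -1.0]
```

The single matrix product reproduces both of Hamilton's equations, with the sign difference between $\dot{q}$ and $\dot{p}$ carried entirely by the off-diagonal blocks of $\mathbf{J}$.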

15.8 Comparison of the Lagrangian and Hamiltonian formulations

Common features
The discussion of Lagrangian and Hamiltonian dynamics has illustrated the power of such algebraic formulations. Both approaches are based on application of variational principles to scalar energy, which gives the freedom to concentrate solely on active forces and to ignore internal forces. Both methods can handle many-body systems and exploit canonical transformations, which are impractical or impossible using the vectorial Newtonian mechanics. These algebraic approaches simplify the calculation of the motion for constrained systems by representing the vector force fields, as well as the corresponding equations of motion, in terms of either the Lagrangian function $L(\mathbf{q},\dot{\mathbf{q}},t)$ or the action functional $S(\mathbf{q},\mathbf{p},t)$, which are related by the definite integral

$$S(\mathbf{q},\mathbf{p},t) = \int_{t_1}^{t_2} L(\mathbf{q},\dot{\mathbf{q}},t)\,dt \tag{15.1}$$

The Lagrangian function $L(\mathbf{q},\dot{\mathbf{q}},t)$ and the action functional $S(\mathbf{q},\mathbf{p},t)$ are scalar functions under rotation, but they determine the vector force fields and the corresponding equations of motion. Thus the use of the rotationally-invariant functions $L(\mathbf{q},\dot{\mathbf{q}},t)$ and $S(\mathbf{q},\mathbf{p},t)$ provides a simple representation of the vector force fields. This is analogous to the use of scalar potential fields $U(\mathbf{q},t)$ to represent the electrostatic and gravitational vector force fields. Like scalar potential fields, Lagrangian and Hamiltonian mechanics represent the observables as derivatives of $L(\mathbf{q},\dot{\mathbf{q}},t)$ and $S(\mathbf{q},\mathbf{p},t)$, and the absolute values of $L(\mathbf{q},\dot{\mathbf{q}},t)$ and $S(\mathbf{q},\mathbf{p},t)$ are undefined; only differences in $L(\mathbf{q},\dot{\mathbf{q}},t)$ and $S(\mathbf{q},\mathbf{p},t)$ are observable. For example, the generalized momenta are given by the derivatives $p_j \equiv \frac{\partial L}{\partial \dot{q}_j}$ and $p_j = \frac{\partial S}{\partial q_j}$. The physical significance of the least action $S(\mathbf{q},\boldsymbol{\alpha},t)$ is illustrated when the canonically transformed momentum $\mathbf{P} = \boldsymbol{\alpha}$ is a constant. Then the generalized momenta and the Hamilton-Jacobi equation imply that the total time derivative of the action equals

$$\frac{dS}{dt} = \frac{\partial S}{\partial q_j}\dot{q}_j + \frac{\partial S}{\partial t} = \mathbf{p}\cdot\dot{\mathbf{q}} - H = L \tag{15.154}$$

The indefinite integral of this equation reproduces the definite integral (15.1) to within an arbitrary constant, i.e.

$$S(\mathbf{q},\mathbf{p},t) = \int L(\mathbf{q},\dot{\mathbf{q}},t)\,dt + \text{constant} \tag{15.155}$$

Lagrangian formulation:

Consider a system with $n$ independent generalized coordinates, plus $m$ constraint forces that are not required to be known. The Lagrangian approach can reduce the system to a minimal system of $s = n - m$ independent generalized coordinates leading to $s = n - m$ second-order differential equations. By comparison, the Newtonian approach uses $n + m$ unknowns. Alternatively, the Lagrange multipliers approach allows determination of the holonomic constraint forces, resulting in $s = n + m$ second-order equations to determine $s = n + m$ unknowns. The Lagrangian potential function is limited to conservative forces, but generalized forces can be used to handle non-conservative and non-holonomic forces. The advantage of the Lagrange equations of motion is that they can deal with any type of force, conservative or non-conservative, and they directly determine $q$, $\dot{q}$ rather than $q$, $p$, which then requires relating $p$ to $\dot{q}$. The Lagrange approach is superior to the Hamiltonian approach if a numerical solution is required for typical undergraduate problems in classical mechanics. However, Hamiltonian mechanics has a clear advantage for addressing more profound and philosophical questions in physics.

Hamiltonian formulation:

For a system with $n$ independent generalized coordinates, and $m$ constraint forces, the Hamiltonian approach determines $2n$ first-order differential equations. In contrast to Lagrangian mechanics, where the Lagrangian is a function of the coordinates and their velocities, the Hamiltonian uses the variables $\mathbf{q}$ and $\mathbf{p}$, rather than velocity. The Hamiltonian has twice as many independent variables as the Lagrangian, which is a great advantage, not a disadvantage, since it broadens the realm of possible transformations that can be used to simplify the solutions. Hamiltonian mechanics uses the conjugate coordinates $\mathbf{q}$, $\mathbf{p}$ corresponding to phase space. This is an advantage in most branches of physics and engineering. Compared to Lagrangian mechanics, Hamiltonian mechanics has a significantly broader arsenal of powerful techniques that can be exploited to obtain an analytical solution of the integrals of the motion for complicated systems. These techniques include the Poisson bracket formulation, canonical transformations, the Hamilton-Jacobi approach, the action-angle variables, and canonical perturbation theory. In addition, Hamiltonian dynamics also provides a means of determining the unknown variables for which the solution assumes a soluble form, and it is ideal for study of the fundamental underlying physics in applications to other fields such as quantum or statistical physics. However, the Hamiltonian approach inherently assumes that the system is conservative, putting it at a disadvantage with respect to the Lagrangian approach. The appealing symmetry of the Hamiltonian equations, plus their ability to utilize canonical transformations, makes it the formalism of choice for examination of system dynamics. For example, Hamilton-Jacobi theory, action-angle variables and canonical perturbation theory are used extensively to solve complicated multibody orbit perturbations in celestial mechanics by finding a canonical transformation that transforms the perturbed Hamiltonian to a solved unperturbed Hamiltonian.
The Hamiltonian formalism features prominently in quantum mechanics since there are well established rules for transforming the classical coordinates and momenta into linear operators used in quantum mechanics. The variables $\mathbf{q}$, $\dot{\mathbf{q}}$ used in Lagrangian mechanics do not have simple analogs in quantum physics. As a consequence, the Poisson bracket formulation and the action-angle variables of Hamiltonian mechanics played a key role in development of matrix mechanics by Heisenberg, Born, and Dirac, while the Hamilton-Jacobi formulation played a key role in development of Schrödinger's wave mechanics. Similarly, Hamiltonian mechanics is the preeminent variational approach used in statistical mechanics.

15.9 Summary
This chapter has gone beyond what is normally covered in an undergraduate course in classical mechanics,
in order to illustrate the power of the remarkable arsenal of methods available for solution of the equations of
motion using Hamiltonian mechanics. This has included the Poisson bracket representation of Hamiltonian
formulation of mechanics, canonical transformations, Hamilton-Jacobi theory, action-angle variables, and
canonical perturbation theory. The purpose was to illustrate the power of variational principles in Hamil-
tonian mechanics and how they relate to fields such as quantum mechanics. The following are the key points
made in this chapter.

Poisson brackets: The elegant and powerful Poisson bracket formalism of Hamiltonian mechanics was
introduced. The Poisson bracket of any two continuous functions of generalized coordinates  ( ) and
( ) is defined to be
X µ    
¶
[ ] ≡ − (1513)

   

The fundamental Poisson brackets equal

[   ] = 0 (1521)

[   ] = 0 (1522)

[   ] = − [   ] =   (1523)

The Poisson bracket is invariant to a canonical transformation from ( ) to (  ). That is
X µ    
¶
[ ] = − = [ ] (1532)
   


There is a one-to-one correspondence between the commutator and Poisson Bracket of two independent
functions,
(1 1 − 1 1 ) =  [1  1 ] (1538)
where  is an independent constant. In particular 1 1 commute of the Poisson Bracket [1  1 ] = 0.

Poisson Bracket representation of Hamiltonian mechanics: It has been shown that the Poisson
bracket formalism contains the Hamiltonian equations of motion and is invariant to canonical transformations.
Also this formalism extends Hamilton’s canonical equations to non-commuting canonical variables.
Hamilton’s equations of motion can be expressed directly in terms of the Poisson brackets
$$\dot{q}_k = [q_k, H] = \frac{\partial H}{\partial p_k} \tag{15.57}$$
$$\dot{p}_k = [p_k, H] = -\frac{\partial H}{\partial q_k} \tag{15.58}$$

An important result is that the total time derivative of any operator $G$ is given by
$$\frac{dG}{dt} = \frac{\partial G}{\partial t} + [G, H] \tag{15.45}$$
Poisson brackets provide a powerful means of determining which observables are time independent and
whether diﬀerent observables can be measured simultaneously with unlimited precision. It was shown that
the Poisson bracket is invariant to canonical transformations, which is a valuable feature for Hamiltonian
mechanics. Poisson brackets were used to prove Liouville’s theorem which plays an important role in the use
of Hamiltonian phase space in statistical mechanics. The Poisson bracket is equally applicable to continuous
solutions in classical mechanics and to discrete solutions in quantized systems.
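
The Poisson bracket relations summarized above can be checked numerically. The following is a minimal sketch, not part of the original text, that estimates the bracket of equation 15.13 with central finite differences. The function name `poisson_bracket`, the step size `h`, and the test system (a unit mass in a planar central potential $U = r^4$) are assumptions chosen only for illustration.

```python
def poisson_bracket(F, G, q, p, h=1e-5):
    """Numerically evaluate [F, G]_{qp} at the phase-space point (q, p).

    F and G are functions F(q, p) of two equal-length coordinate lists.
    Partial derivatives are estimated with central differences of step h.
    """
    def dq(f, i):
        q_plus, q_minus = list(q), list(q)
        q_plus[i] += h
        q_minus[i] -= h
        return (f(q_plus, p) - f(q_minus, p)) / (2 * h)

    def dp(f, i):
        p_plus, p_minus = list(p), list(p)
        p_plus[i] += h
        p_minus[i] -= h
        return (f(q, p_plus) - f(q, p_minus)) / (2 * h)

    return sum(dq(F, i) * dp(G, i) - dp(F, i) * dq(G, i) for i in range(len(q)))


# Hypothetical test system: unit mass in a planar central potential U(r) = r^4.
def hamiltonian(q, p):
    return 0.5 * (p[0] ** 2 + p[1] ** 2) + (q[0] ** 2 + q[1] ** 2) ** 2


def angular_momentum(q, p):
    return q[0] * p[1] - q[1] * p[0]


point_q, point_p = [0.7, -0.3], [0.2, 1.1]

# Fundamental brackets: [q_0, p_0] should be 1 and [q_0, p_1] should be 0.
bracket_q0_p0 = poisson_bracket(lambda q, p: q[0], lambda q, p: p[0], point_q, point_p)
bracket_q0_p1 = poisson_bracket(lambda q, p: q[0], lambda q, p: p[1], point_q, point_p)

# dL/dt = [L, H] vanishes for a central potential, whereas
# dp_0/dt = [p_0, H] = -dH/dq_0 does not.
bracket_L_H = poisson_bracket(angular_momentum, hamiltonian, point_q, point_p)
bracket_p0_H = poisson_bracket(lambda q, p: p[0], hamiltonian, point_q, point_p)
```

Because neither function depends explicitly on time, equation 15.45 reduces to $dG/dt = [G, H]$, so a vanishing bracket with the Hamiltonian identifies a constant of motion. Here the angular momentum passes that test while the linear momentum component does not.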

Canonical transformations: A transformation between a canonical set of variables $(q, p)$ with Hamiltonian
$H(q, p, t)$ to another set of canonical variables $(Q, P)$ with Hamiltonian $\mathcal{H}(Q, P, t)$ can be achieved
using a generating function $F$ such that
$$\mathcal{H}(Q, P, t) = H(q, p, t) + \frac{\partial F}{\partial t} \tag{15.89}$$

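Possible generating functions are summarized in the following table.

$$
\begin{array}{lll}
\text{Generating function} & \text{Generating function derivatives} & \text{Trivial special case} \\
\hline
F = F_1(\mathbf{q}, \mathbf{Q}, t) & p_i = \dfrac{\partial F_1}{\partial q_i}, \quad P_i = -\dfrac{\partial F_1}{\partial Q_i} & F_1 = q_i Q_i, \quad Q_i = p_i, \quad P_i = -q_i \\
F = F_2(\mathbf{q}, \mathbf{P}, t) - \mathbf{Q} \cdot \mathbf{P} & p_i = \dfrac{\partial F_2}{\partial q_i}, \quad Q_i = \dfrac{\partial F_2}{\partial P_i} & F_2 = q_i P_i, \quad Q_i = q_i, \quad P_i = p_i \\
F = F_3(\mathbf{p}, \mathbf{Q}, t) + \mathbf{q} \cdot \mathbf{p} & q_i = -\dfrac{\partial F_3}{\partial p_i}, \quad P_i = -\dfrac{\partial F_3}{\partial Q_i} & F_3 = p_i Q_i, \quad Q_i = -q_i, \quad P_i = -p_i \\
F = F_4(\mathbf{p}, \mathbf{P}, t) + \mathbf{q} \cdot \mathbf{p} - \mathbf{Q} \cdot \mathbf{P} & q_i = -\dfrac{\partial F_4}{\partial p_i}, \quad Q_i = \dfrac{\partial F_4}{\partial P_i} & F_4 = p_i P_i, \quad Q_i = p_i, \quad P_i = -q_i
\end{array}
$$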

If the canonical transformation makes $\mathcal{H}(Q, P, t) = 0$ then the conjugate variables $(Q, P)$ are constants
of motion. Similarly, if $\mathcal{H}(Q, P, t)$ is cyclic in a coordinate $Q_i$, then the corresponding $P_i$ is a constant of motion.

Hamilton-Jacobi theory: Hamilton-Jacobi theory determines the generating function required to perform
the canonical transformation that leads to a powerful method for obtaining the equations of motion for
a system. The Hamilton-Jacobi theory uses the action function $S \equiv F_2$ as a generating function, and the
canonical momentum is given by
$$p_i = \frac{\partial S}{\partial q_i} \tag{15.4}$$
This can be used to replace $p_i$ in the Hamiltonian $H$, leading to the Hamilton-Jacobi equation
$$H\left(q; \frac{\partial S}{\partial q}; t\right) + \frac{\partial S}{\partial t} = 0 \tag{15.94}$$

Solutions of the Hamilton-Jacobi equation were obtained by separation of variables. The close optical-
mechanical analogy of the Hamilton-Jacobi theory is an important advantage of this formalism that led to
it playing a pivotal role in the development of wave mechanics by Schrödinger.

Action-angle variables: Action-angle variables exploit a canonical transformation from $(q, p) \rightarrow (\phi, I)$ where
$$I \equiv \frac{1}{2\pi} J = \frac{1}{2\pi} \oint p\, dq \tag{15.117}$$
For periodic motion the phase-space trajectory is closed with area given by $J$ and this area is conserved for
the above canonical transformation. For a conserved Hamiltonian the action variable $I$ is independent of
the angle variable $\phi$. The time dependence of the angle variable $\phi$ directly determines the frequency of the
periodic motion without recourse to calculation of the detailed trajectory of the periodic motion.
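
As an illustration of the action variable, the loop integral $\frac{1}{2\pi}\oint p\,dq$ can be evaluated numerically for a one-dimensional harmonic oscillator. This sketch is an added example under stated assumptions: the function name `action_variable`, the angular parametrization of the orbit, and the numerical values are all hypothetical choices, not taken from the text.

```python
import math


def action_variable(energy, mass, omega, steps=20000):
    """Estimate I = (1/2pi) * loop integral of p dq for a 1D harmonic oscillator.

    The closed orbit of energy E is parametrized by an angle theta:
        q = A cos(theta),  p = -m*omega*A sin(theta),  A = sqrt(2E / (m omega^2)).
    The loop integral is evaluated with the midpoint rule.
    """
    amplitude = math.sqrt(2.0 * energy / (mass * omega ** 2))
    dtheta = 2.0 * math.pi / steps
    loop_integral = 0.0
    for n in range(steps):
        theta = (n + 0.5) * dtheta
        p = -mass * omega * amplitude * math.sin(theta)
        dq = -amplitude * math.sin(theta) * dtheta  # dq = (dq/dtheta) dtheta
        loop_integral += p * dq
    return loop_integral / (2.0 * math.pi)


# Illustrative values in arbitrary units (assumptions).
E, m, w = 3.0, 2.0, 1.5
I = action_variable(E, m, w)
```

For this system the enclosed phase-space area is that of an ellipse, so the estimate should agree with $I = E/\omega$. Writing the Hamiltonian as $H = \omega I$ then gives the frequency directly as $\dot{\phi} = \partial H / \partial I = \omega$, without solving for the trajectory.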

Canonical perturbation theory: Canonical perturbation theory is a valuable method of handling multi-
body interactions. The adiabatic invariance of the action-angle variables provides a powerful approach for
exploiting canonical perturbation theory.

Comparison of Lagrangian and Hamiltonian formulations: The remarkable power, and intellectual
beauty, provided by use of variational principles to exploit the underlying principles of economy in
nature, has had a long and rich history. It has led to profound developments in many branches of theoretical
physics. However, it is noted that although the above algebraic formulations of classical mechanics have been
used for over two centuries, the important limitations of these algebraic formulations to non-linear systems
remain a challenge that still is being addressed.
It has been shown that the Lagrangian and Hamiltonian formulations represent the vector force fields,
and the corresponding equations of motion, in terms of the Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ or the action
functional $S(\mathbf{q}, \mathbf{p}, t)$, which are scalars under rotation. The Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ is related to the
action functional $S(\mathbf{q}, \mathbf{p}, t)$ by
$$S(\mathbf{q}, \mathbf{p}, t) = \int_{t_1}^{t_2} L(\mathbf{q}, \dot{\mathbf{q}}, t)\, dt \tag{15.1}$$

These functions are analogous to electric potential, in that the observables are derived by taking derivatives
of the Lagrangian function $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ or the action functional $S(\mathbf{q}, \mathbf{p}, t)$. The Lagrangian formulation is more
convenient for deriving the equations of motion for simple mechanical systems. The Hamiltonian formulation
has a greater arsenal of techniques for solving complicated problems, and it uses the canonical variables $(q_k, p_k)$
which are the variables of choice for applications to quantum mechanics and statistical mechanics.

Workshop exercises
1. Poisson brackets are a powerful means of elucidating when observables are constants of motion and whether
two observables can be simultaneously measured with unlimited precision. Consider a spherically symmetric
Hamiltonian
$$H = \frac{1}{2m}\left( p_r^2 + \frac{p_\theta^2}{r^2} + \frac{p_\phi^2}{r^2 \sin^2\theta} \right) + U(r)$$
for a mass $m$ where $U(r)$ is a central potential. Use the Poisson bracket plus the time dependence to determine
the following:

(a) Does $p_\phi$ commute with $H$ and is it a constant of motion?
(b) Does $p_\theta^2 + \frac{p_\phi^2}{\sin^2\theta}$ commute with $H$ and is it a constant of motion?
(c) Does $p_r$ commute with $H$ and is it a constant of motion?
(d) Does $p_\phi$ commute with $p_\theta$ and what does the result imply?

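2. Consider the Poisson brackets for angular momentum $\mathbf{L}$

(a) Show $\{L_i, r_j\} = \epsilon_{ijk} r_k$, where the Levi-Civita tensor is
$$\epsilon_{ijk} = \begin{cases} +1 & \text{if } ijk \text{ are cyclically permuted} \\ -1 & \text{if } ijk \text{ are anti-cyclically permuted} \\ 0 & \text{if } i = j \text{ or } i = k \text{ or } j = k \end{cases}$$

(b) Show $\{L_i, p_j\} = \epsilon_{ijk} p_k$.

(c) Show $\{L_i, L_j\} = \epsilon_{ijk} L_k$. The following identity may be useful: $\epsilon_{ijk}\epsilon_{ilm} = \delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl}$.
(d) Show $\{L_i, L^2\} = 0$.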

3. Consider the Hamiltonian of a two-dimensional harmonic oscillator,
$$H = \frac{\mathbf{p}^2}{2m} + \frac{1}{2} m \left( \omega_1^2 r_1^2 + \omega_2^2 r_2^2 \right)$$
What condition is satisfied if $L^2$ is a conserved quantity?

Problems
1. Consider the motion of a particle of mass $m$ in an isotropic harmonic oscillator potential $U = \frac{1}{2}kr^2$ and take
the orbital plane to be the $x$-$y$ plane. The Hamiltonian is then
$$H \equiv S_0 = \frac{1}{2m}\left(p_x^2 + p_y^2\right) + \frac{1}{2}k\left(x^2 + y^2\right)$$
Introduce the three quantities
$$S_1 = \frac{1}{2m}\left(p_x^2 - p_y^2\right) + \frac{1}{2}k\left(x^2 - y^2\right)$$
$$S_2 = \frac{1}{m} p_x p_y + k x y$$
$$S_3 = \omega\left(x p_y - y p_x\right)$$
with $\omega = \sqrt{\frac{k}{m}}$. Use Poisson brackets to solve the following:
a) Show that $[S_0, S_i] = 0$ for $i = 1, 2, 3$, proving that $(S_1, S_2, S_3)$ are constants of motion.
b) Show that
$$[S_1, S_2] = 2\omega S_3$$
$$[S_2, S_3] = 2\omega S_1$$
$$[S_3, S_1] = 2\omega S_2$$
so that $(2\omega)^{-1}(S_1, S_2, S_3)$ have the same Poisson bracket relations as the components of a 3-dimensional angular
momentum.
c) Show that
$$S_0^2 = S_1^2 + S_2^2 + S_3^2$$
2. Assume that the transformation equations between the two sets of coordinates $(q, p)$ and $(Q, P)$ are
$$Q = \ln\left(1 + q^{\frac{1}{2}} \cos p\right)$$
$$P = 2\left(1 + q^{\frac{1}{2}} \cos p\right) q^{\frac{1}{2}} \sin p$$

a) Assuming that $q, p$ are canonical variables, i.e. $[q, p] = 1$, show directly from the above transformation
equations that $Q, P$ are canonical variables.
b) Show that the generating function that generates this transformation between the two sets of canonical variables
is
$$F_3 = -\left[e^{Q} - 1\right]^2 \tan p$$

3. Consider a bound two-body system comprising a mass $m$ in an orbit at a distance $r$ from a mass $M$. The
attractive central force binding the two-body system is
$$\mathbf{F} = \frac{k}{r^2}\hat{\mathbf{r}}$$
where $k$ is negative. Use Poisson brackets to prove that the eccentricity vector $\mathbf{A} = \mathbf{p} \times \mathbf{L} + \mu k \hat{\mathbf{r}}$ is a conserved
quantity.

4. (a) Consider the case of a single mass $m$ where the Hamiltonian $H = \frac{1}{2}p^2$. Use the generating function
$S(q, P, t)$ to solve the Hamilton-Jacobi equation with the canonical transformation $q = q(Q, P)$ and $p = p(Q, P)$,
and determine the equations relating the $(q, p)$ variables to the transformed coordinate and momentum
$(Q, P)$.
(b) If there is a perturbing Hamiltonian $\Delta H = \frac{1}{2}q^2$, then $P$ will not be constant. Express the transformed
Hamiltonian $\mathcal{H}$ (using the transformation given above) in terms of $P$, $Q$ and $t$. Solve for $Q(t)$ and $P(t)$ and
show that the perturbed solution $q[Q(t), P(t)]$, $p[Q(t), P(t)]$ is simple harmonic.
Chapter 16

Analytical formulations for continuous systems

16.1 Introduction
Lagrangian and Hamiltonian mechanics have been used to determine the equations of motion for discrete systems
having a finite number of discrete variables $q_i$ where $1 \le i \le n$. There are important classes of systems
where it is more convenient to treat the system as being continuous. For example, the interatomic spacing in
solids is a few $10^{-10}\,\mathrm{m}$ which is negligible compared with the size of typical macroscopic, three-dimensional
solid objects. As a consequence, for wavelengths much greater than the atomic spacing in solids, it is useful
to treat macroscopic crystalline lattice systems as continuous three-dimensional uniform solids, rather
than as three-dimensional discrete lattice chains. Fluid and gas dynamics are other examples of continuous
mechanical systems. Another important class of continuous systems involves the theory of fields, such as
electromagnetic fields. Lagrangian and Hamiltonian mechanics of the continua extend classical mechanics
into the advanced topic of field theory. This chapter goes beyond the scope of a typical undergraduate
classical mechanics course in order to provide a brief glimpse of how Lagrangian and Hamiltonian mechanics
can underlie advanced and important aspects of the mechanics of the continua, including field theory.

16.2 The continuous uniform linear chain

The Lagrangian for the discrete lattice chain, for longitudinal modes, is given by equation 14.76 to be
$$L = \frac{1}{2} \sum_{j=1}^{n+1} \left( m \dot{q}_j^2 - \kappa \left(q_{j-1} - q_j\right)^2 \right) \tag{16.1}$$
where the $n$ masses are attached in series to $n+1$ identical springs of length $d$ and spring constant $\kappa$. Assume
that the spring has a uniform cross-section area $A$ and length $d$. Then each spring volume element $\Delta\tau = Ad$
has a mass $m$, that is, the volume mass density $\rho = \frac{m}{\Delta\tau}$ or $m = \rho\,\Delta\tau$. Chapter 16.5.3 will show that the
spring constant $\kappa = \frac{EA}{d}$ where $E$ is Young’s modulus, $A$ is the cross sectional area of the chain element, and
$d$ is the length of the element. Then the spring constant can be written as $\kappa = \frac{E\,\Delta\tau}{d^2}$. Therefore equation
16.1 can be expressed as a sum over volume elements $\Delta\tau = Ad$
$$L = \frac{1}{2} \sum_{j=1}^{n+1} \left( \rho\,\dot{q}_j^2 - E \left( \frac{q_{j-1} - q_j}{d} \right)^2 \right) \Delta\tau \tag{16.2}$$

In the limit that $n \rightarrow \infty$ and the spacing $d = \delta x \rightarrow 0$ then the summation in equation 16.2 can be written
as a volume integral where $x = jd$ is the distance along the linear chain and the volume element $\Delta\tau \rightarrow 0$.
Then the Lagrangian can be written as the integral over the volume element $d\tau$ rather than a summation
over $\Delta\tau$. That is,
$$L = \frac{1}{2} \int \left( \rho\,\dot{q}^2 - E \left( \frac{\partial q(x, t)}{\partial x} \right)^2 \right) d\tau \tag{16.3}$$

The discrete-chain coordinate $q_j(t)$ is assumed to be a continuous function $q(x, t)$ for the uniform chain. Thus
the integral form of the Lagrangian can be expressed as
$$L = \frac{1}{2} \int \left( \rho\,\dot{q}^2 - E \left( \frac{\partial q(x, t)}{\partial x} \right)^2 \right) d\tau = \int \mathcal{L}\, d\tau \tag{16.4}$$
where the function $\mathcal{L}$ is called the Lagrangian density defined by
$$\mathcal{L} \equiv \frac{1}{2} \left( \rho\,\dot{q}^2 - E \left( \frac{\partial q(x, t)}{\partial x} \right)^2 \right) \tag{16.5}$$

The variable $x$ in the Lagrangian density is not a generalized coordinate; it only serves the role of a continuous
index played previously by the index $j$. For the discrete case, each value of $j$ defined a different generalized
coordinate $q_j$. Now for each value of $x$ there is a continuous function $q(x, t)$ which is a function of both
position and time.
Lagrange’s equations of motion applied to the continuous Lagrangian in equation 16.4 give
$$\rho \frac{\partial^2 q}{\partial t^2} - E \frac{\partial^2 q}{\partial x^2} = 0 \tag{16.6}$$
This is the familiar wave equation in one dimension for a longitudinal wave on the continuous chain with a
phase velocity
$$v = \sqrt{\frac{E}{\rho}} \tag{16.7}$$

The continuous linear chain also can exhibit transverse modes which have a Lagrangian density where
Young’s modulus $E$ is replaced by the tension $T$ in the chain, and $\rho$ is replaced by the linear mass density $\mu$
of the chain, leading to a phase velocity for a transverse wave $v = \sqrt{\frac{T}{\mu}}$.
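
The passage from the discrete chain to the continuum can be illustrated numerically. The sketch below compares the continuum phase velocity of equation 16.7 with the phase velocity of the discrete chain, using $m = \rho A d$ and $\kappa = EA/d$ as in the text. It assumes the standard dispersion relation of a uniform mass-spring chain, $\omega(k) = 2\sqrt{\kappa/m}\,|\sin(kd/2)|$, which is not derived in this section, and the function names and numerical values are hypothetical.

```python
import math


def continuum_phase_velocity(youngs_modulus, density):
    """Phase velocity v = sqrt(E / rho) of equation 16.7."""
    return math.sqrt(youngs_modulus / density)


def discrete_phase_velocity(youngs_modulus, density, area, spacing, wavelength):
    """Phase velocity omega/k for masses m = rho*A*d joined by springs kappa = E*A/d.

    Uses the assumed mass-spring chain dispersion relation
        omega(k) = 2 * sqrt(kappa / m) * |sin(k * d / 2)|.
    """
    mass = density * area * spacing
    kappa = youngs_modulus * area / spacing
    k = 2.0 * math.pi / wavelength
    omega = 2.0 * math.sqrt(kappa / mass) * abs(math.sin(0.5 * k * spacing))
    return omega / k


# Illustrative, roughly steel-like numbers in SI units (assumptions).
E, rho, A, d = 2.0e11, 7.8e3, 1.0e-4, 3.0e-10
v_continuum = continuum_phase_velocity(E, rho)
v_long = discrete_phase_velocity(E, rho, A, d, wavelength=1.0e-6)  # wavelength >> d
v_short = discrete_phase_velocity(E, rho, A, d, wavelength=4 * d)  # wavelength ~ d
```

For wavelengths much longer than the spacing, the discrete result approaches $\sqrt{E/\rho}$ and the cross-section area drops out, which is the regime in which the continuum description is justified. At wavelengths comparable to the spacing the discrete chain is dispersive and the two values differ noticeably.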

16.3 The Lagrangian density formulation for continuous systems

16.3.1 One spatial dimension
In general the Lagrangian density can be a function of $q$, $\nabla q$, $\frac{\partial q}{\partial t}$, $x$, $y$, $z$ and $t$. It is of interest that Hamilton’s
principle leads to a set of partial differential equations of motion, based on the Lagrangian density, that are
analogous to the Lagrange equations of motion for discrete systems. When deriving the Lagrangian equations
of motion in terms of the Lagrangian density using Hamilton’s principle, the notation is simplified if the
system is limited to one spatial coordinate $x$. In addition, it is convenient to use the compact notation
where the spatial derivative is written $q' \equiv \frac{\partial q}{\partial x}$ and the time derivative is $\dot{q} \equiv \frac{\partial q}{\partial t}$, and the one-dimensional
Lagrangian density is assumed to be a function $\mathcal{L}(q, q', \dot{q}, x, t)$. The appearance of the derivative $q' \equiv \frac{\partial q}{\partial x}$ as
an argument of the Lagrange density is a consequence of the continuous dependence of $q$ on $x$. In principle,
higher-order derivatives could occur but they do not arise in most problems of physical interest.
Assuming that the one spatial dimension is $x$, then Hamilton’s principle of least action can be expressed
in terms of the Lagrangian density as
$$\delta S = \delta \int_{t_1}^{t_2} L(q, \dot{q}, t)\, dt = \delta \int_{t_1}^{t_2} \int_{x_1}^{x_2} \mathcal{L}(q, q', \dot{q}, x, t)\, dx\, dt \tag{16.8}$$

Following the same approach used in chapter 5.2, it is assumed that the stationary path for the action
integral is described by the function $q(x, t)$. Define a neighboring function using a parametric representation
$q(x, t; \epsilon)$ such that when $\epsilon = 0$, the extremum function $q = q(x, t)$ yields the stationary action integral $S$.
Assume that an infinitesimal fraction $\epsilon$ of a neighboring function $\eta(x, t)$ is added to the extremum path
$q(x, t)$. That is, assume
$$q(x, t; \epsilon) = q(x, t) + \epsilon\,\eta(x, t) \tag{16.9}$$
$$q'(x, t; \epsilon) \equiv \frac{\partial q(x, t; \epsilon)}{\partial x} = \frac{\partial q(x, t)}{\partial x} + \epsilon \frac{\partial \eta(x, t)}{\partial x} = q'(x, t) + \epsilon\,\eta'(x, t) \tag{16.10}$$
$$\dot{q}(x, t; \epsilon) \equiv \frac{\partial q(x, t; \epsilon)}{\partial t} = \frac{\partial q(x, t)}{\partial t} + \epsilon \frac{\partial \eta(x, t)}{\partial t} = \dot{q}(x, t) + \epsilon\,\dot{\eta}(x, t) \tag{16.11}$$
where it is assumed that both the extremum function $q(x, t)$ and the auxiliary function $\eta(x, t)$ are well
behaved functions of $x$ and $t$ with continuous first derivatives, and that $\eta(x, t) = 0$ at $(x_1, t_1)$ and $(x_2, t_2)$
because, for all possible paths, the function $q(x, t; \epsilon)$ must be identical with $q(x, t)$ at the end points of the
path, i.e. $\eta(x_1, t_1) = \eta(x_2, t_2) = 0$.
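A parametric family of curves $S(\epsilon)$, as a function of the admixture coefficient $\epsilon$, is described by the
function
$$S(\epsilon) = \int_{t_1}^{t_2} \int_{x_1}^{x_2} \mathcal{L}\left(q(x, t; \epsilon), q'(x, t; \epsilon), \dot{q}(x, t; \epsilon), x, t\right) dx\, dt \tag{16.12}$$

Then Hamilton’s principle requires that the action integral be stationary for $\epsilon = 0$, that is,
$S(\epsilon)$ is stationary with respect to $\epsilon$ at $\epsilon = 0$, which is satisfied if
$$\frac{\partial S(\epsilon)}{\partial \epsilon} = \int_{t_1}^{t_2} \int_{x_1}^{x_2} \left( \frac{\partial \mathcal{L}}{\partial q}\frac{\partial q}{\partial \epsilon} + \frac{\partial \mathcal{L}}{\partial \dot{q}}\frac{\partial \dot{q}}{\partial \epsilon} + \frac{\partial \mathcal{L}}{\partial q'}\frac{\partial q'}{\partial \epsilon} \right) dx\, dt = 0 \tag{16.13}$$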

Equations 169 1610and 1611 give the partial diﬀerentials


= ( ) (16.14)

 0
=  0 ( ) (16.15)

 ̇
= ̇( ) (16.16)

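Integration by parts in both the $x$ and $t$ terms in equation 16.13, plus using the fact that $\eta(x_1, t_1) =
\eta(x_2, t_2) = 0$ at both end points, yields
$$\int_{t_1}^{t_2} \frac{\partial \mathcal{L}}{\partial \dot{q}}\frac{\partial \dot{q}}{\partial \epsilon}\, dt = -\int_{t_1}^{t_2} \frac{\partial}{\partial t}\left( \frac{\partial \mathcal{L}}{\partial \dot{q}} \right) \frac{\partial q}{\partial \epsilon}\, dt \tag{16.17}$$
$$\int_{x_1}^{x_2} \frac{\partial \mathcal{L}}{\partial q'}\frac{\partial q'}{\partial \epsilon}\, dx = -\int_{x_1}^{x_2} \frac{\partial}{\partial x}\left( \frac{\partial \mathcal{L}}{\partial q'} \right) \frac{\partial q}{\partial \epsilon}\, dx \tag{16.18}$$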

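Therefore Hamilton’s principle, equation 16.13, becomes
$$\frac{\partial S(\epsilon)}{\partial \epsilon} = \int_{t_1}^{t_2} \int_{x_1}^{x_2} \left[ \frac{\partial \mathcal{L}}{\partial q} - \frac{\partial}{\partial t}\left( \frac{\partial \mathcal{L}}{\partial \dot{q}} \right) - \frac{\partial}{\partial x}\left( \frac{\partial \mathcal{L}}{\partial q'} \right) \right] \eta(x, t)\, dx\, dt = 0 \tag{16.19}$$

Since the auxiliary function $\eta(x, t)$ is arbitrary, then the integrand term in the square brackets of equation
16.19 must equal zero. That is,
$$\frac{\partial}{\partial t}\left( \frac{\partial \mathcal{L}}{\partial \dot{q}} \right) + \frac{\partial}{\partial x}\left( \frac{\partial \mathcal{L}}{\partial q'} \right) - \frac{\partial \mathcal{L}}{\partial q} = 0 \tag{16.20}$$
Equation 16.20 gives the equations of motion in terms of the Lagrangian density that has been derived
based on Hamilton’s principle.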
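
As a consistency check, equation 16.20 can be applied to the Lagrangian density of the uniform chain, equation 16.5. The following worked steps are a sketch that assumes the mass density $\rho$ and Young's modulus $E$ are uniform along the chain, and uses the compact notation $q' \equiv \partial q / \partial x$.

```latex
\begin{align*}
\mathcal{L} &= \tfrac{1}{2}\left(\rho\,\dot{q}^{2} - E\,q'^{2}\right), \\
\frac{\partial\mathcal{L}}{\partial\dot{q}} &= \rho\,\dot{q}, \qquad
\frac{\partial\mathcal{L}}{\partial q'} = -E\,q', \qquad
\frac{\partial\mathcal{L}}{\partial q} = 0, \\
\frac{\partial}{\partial t}\left(\rho\,\dot{q}\right)
  + \frac{\partial}{\partial x}\left(-E\,q'\right) - 0
&= \rho\,\frac{\partial^{2} q}{\partial t^{2}}
  - E\,\frac{\partial^{2} q}{\partial x^{2}} = 0 .
\end{align*}
```

The last line reproduces the one-dimensional wave equation 16.6, so the Lagrangian density formulation recovers the same equation of motion as the continuum limit of the discrete chain.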

16.3.2 Three spatial dimensions

Equation 164 expresses the Lagrangian as an integral of the Lagrangian density over a single continuous
index ( ) where the Lagrangian density is a function L(  
     ). The derivation of the Lagrangian
450 CHAPTER 16. ANALYTICAL FORMULATIONS FOR CONTINUOUS SYSTEMS

equations of motion in terms of the Lagrangian density for three spatial dimensions involves the straightfor-
ward addition of the  and  coordinates. That is, in three dimensions the vector displacement is expressed
by the vector q (   ) and the Lagrangian density is related to the Lagrangian by integration over three
dimensions. That is, they are related by the equation
Z
q
 = L(q  ∇ · q    ) (16.21)

where, in cartesian coordinates, the volume element  = . The Lagrangian density is a function
L(q q
  ∇ · q    ) where the one field quantity ( ) has been extended to a spatial vector q (   )
and the spatial derivatives  0 have been transformed into ∇ · q. Applying the method used for the one-
dimensional spatial system, to the three-dimensional system, leads to the following set of equations of motion
Ã ! Ã ! Ã ! Ã !
 L  L  L  L L
+ + + − =0 (16.22)
 q
 q

 q

 q

q

where the    spatial derivatives have been written explicitly for clarity.
Note that the equations of motion, equation 1622, treat the spatial and time coordinates symmetrically.
This symmetry between space and time is unchanged by multiplying the spatial and time coordinate by
arbitrary numerical factors. This suggests the possibility of introducing a four-dimensional coordinate system

 ≡ {   }

where the parameter  is freely chosen. Using this 4-dimensional formalism allows equation 1622 to be
written more compactly as ⎛ ⎞
X4
 ⎝ L ⎠ L
q
− =0 (16.23)

  
q


As discussed in chapter 17, relativistic mechanics treats time and space symmetrically, that is, a four-dimensional
vector $\mathbf{q}(x, y, z, t)$ can be used that treats time and the three spatial dimensions symmetrically
and equally. This four-dimensional space-time formulation allows the first four terms in equation 16.22 to be
condensed into a single term which illustrates the symmetry underlying equation 16.23. If the Lagrangian
density is Lorentz invariant, and if $\alpha = ic$ then equation 16.23 is covariant. Thus the Lagrangian density
formulation is ideally suited to the development of relativistically covariant descriptions of fields.

16.4 The Hamiltonian density formulation for continuous systems

Chapter 163 illustrates, in general terms, how field theory can be expressed in a Lagrangian formulation
via use of the Lagrange density. It is equally possible to obtain a Hamiltonian formulation for continuous
systems analogous to that obtained for discrete systems. As summarized in chapter 8, the Hamiltonian
and Hamilton’s canonical equations of motion are related directly to the Lagrangian by use of a Legendre
transformation. The Hamiltonian is defined as being
X µ  ¶
≡ ̇ − (16.24)

 ̇

The generalized momentum is defined to be


 ≡ (16.25)
 ̇
Equation (1625) allows the Hamiltonian (1624) to be written in terms of the conjugate momenta as
X X
 (    ) =  ̇ − (  ̇  ) = ( ̇ −  (  ̇  )) (16.26)
 

where the Lagrangian

P has been partitioned into the terms for each of the individual coordinates, that is,
(  ̇  ) =   (  ̇  ).

In the limit that the coordinates $q_i, p_i$ are continuous, then the summation in equation 16.26 can be
transformed into a volume integral over the Lagrangian density $\mathcal{L}$. In addition, a momentum density can be
represented by the vector field $\boldsymbol{\pi}$ where
$$\boldsymbol{\pi} \equiv \frac{\partial \mathcal{L}}{\partial \dot{\mathbf{q}}} \tag{16.27}$$
Then the obvious definition of the Hamiltonian density $\mathcal{H}$ is
$$H = \int \mathcal{H}\, d\tau = \int \left( \boldsymbol{\pi} \cdot \dot{\mathbf{q}} - \mathcal{L} \right) d\tau \tag{16.28}$$

where the Hamiltonian density is defined to be
$$\mathcal{H} = \boldsymbol{\pi} \cdot \dot{\mathbf{q}} - \mathcal{L} \tag{16.29}$$

Unfortunately the Hamiltonian density formulation does not treat space and time symmetrically, making
it more difficult to develop relativistically covariant descriptions of fields. Hamilton’s principle can be used
to derive the Hamilton equations of motion in terms of the Hamiltonian density analogous to the approach
used to derive the Lagrangian density equations of motion. As described in Classical Mechanics, 2nd edition,
by Goldstein, the resultant Hamilton equations of motion for one dimension are
$$\frac{\partial \mathcal{H}}{\partial \pi} = \dot{q} \tag{16.30}$$
$$\frac{\partial \mathcal{H}}{\partial q} - \frac{\partial}{\partial x}\frac{\partial \mathcal{H}}{\partial q'} = -\dot{\pi} \tag{16.31}$$
$$\frac{\partial \mathcal{H}}{\partial t} = -\frac{\partial \mathcal{L}}{\partial t} \tag{16.32}$$
Note that equation 16.31 differs from the corresponding equation for discrete systems.

16.5 Linear elastic solids

Elasticity is a property of matter where the atomic forces in matter act to restore the shape of a solid when
distorted due to the application of external forces. A perfectly elastic material returns to its original shape
if the external force producing the deformation is removed. Materials are elastic when the external forces
do not exceed the elastic limit. Above the elastic limit, solids can exhibit plastic flow and concomitant heat
dissipation. Such non-elastic behavior in solids occurs when they are subject to strong external forces.
The discussion of linear systems, in chapters 3 and 14, focussed on one dimensional systems, such as the
linear chain, where the transverse rigidity of the chain was ignored. An extension of the one-dimensional
linear chain to two-dimensional membranes, such as a drum skin, is straightforward if the membrane is thin
enough so that the rigidity of the membrane can be ignored. Elasticity for three-dimensional solids requires
accounting for the strong elastic forces exerted against any change in shape in addition to elastic forces
opposing change in volume. The stiﬀness of solids to changes in shape, or volume, is best represented using
the concepts of stress and strain.
Forces in matter can be divided into two classes: (1) body forces, such as gravity, which act on each
volume element, and (2) surface forces, which are the forces that act on both sides of any infinitesimal
surface element inside the solid. Surface forces can have components along the normal to the infinitesimal
surface, as well as shear components in the plane of the surface element. Typically solids are elastic to both
normal and shear components of the surface forces whereas shear forces in liquids and gases lead to fluid
flow plus viscous forces due to energy dissipation. As described below, the forces acting on an infinitesimal
surface element are best expressed in terms of the stress tensor, while the relative distortion of the shape,
or volume, of the body is best expressed in terms of the strain tensor. The moduli of elasticity relate the
corresponding stress and strain tensors. The moduli of elasticity are constant in linear elastic
solids and thus the stress is proportional to the strain provided that the strains do not exceed the elastic
limit.

16.5.1 Stress tensor

Consider an infinitesimal surface area $dA$ of an arbitrary closed volume element $d\tau$ inside the medium.
The surface area element is defined as a vector $d\mathbf{A} = \hat{\mathbf{n}}\,dA$ where $\hat{\mathbf{n}}$ is the outward normal to the closed
surface that encloses the volume element. Assume that $d\mathbf{F}$ is the force element exerted by the outside on
the material inside the volume element. The stress tensor $\mathbf{T}$ is defined as the ratio of $d\mathbf{F}$ and $d\mathbf{A}$ where the
force vector $d\mathbf{F}$ is given by the inner product of the stress tensor $\mathbf{T}$ and the surface element vector $d\mathbf{A}$. That
is,
$$d\mathbf{F} = \mathbf{T} \cdot d\mathbf{A} \tag{16.33}$$
Since both $d\mathbf{F}$ and $d\mathbf{A}$ are vectors, then equation 16.33 implies that the stress tensor must be a second-rank
tensor as described in the appendix, that is, the stress tensor is analogous to the rotation matrix or the inertia
tensor. Note that if $d\mathbf{F}$ and $\hat{\mathbf{n}}\,dA$ are collinear, then the stress tensor $\mathbf{T}$ reduces to the conventional pressure
$P$. The general stress tensor equals the momentum flux density and has the dimensions of pressure.

16.5.2 Strain tensor

Forces applied to a solid body can lead to translational or rotational acceleration, in addition to changing
the shape or volume of the body. Elastic forces do not act when an overall displacement $\boldsymbol{\xi}$ of an infinitesimal
volume occurs, such as is involved in translational or rotational motion. Elastic forces act to oppose position-dependent
differences in the displacement vector $\boldsymbol{\xi}$, that is, the strain depends on the tensor product $\nabla \otimes \boldsymbol{\xi}$.
For an elastic medium, the strain depends only on the applied stress and not on the prior loading history.
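Consider that the matter at the location $\mathbf{r}$ is subject to an elastic displacement $\boldsymbol{\xi}$, and similarly at a
displaced location $\mathbf{r}' = \mathbf{r} + \sum_i dx_i\,\hat{\mathbf{e}}_i$ where $x_i$ are cartesian coordinates. The net relative displacement
between $\mathbf{r}$ and $\mathbf{r}'$ is given by
$$d\xi^2 = \sum_i \left( dx_i + d\xi_i \right)^2 - \sum_i \left( dx_i \right)^2 = \sum_{ij} \left[ \left( \frac{\partial \xi_i}{\partial x_j} + \frac{\partial \xi_j}{\partial x_i} \right) + \sum_k \frac{\partial \xi_k}{\partial x_i}\frac{\partial \xi_k}{\partial x_j} \right] dx_i\, dx_j \tag{16.34}$$
Ignoring the second order term $\frac{\partial \xi_k}{\partial x_i}\frac{\partial \xi_k}{\partial x_j}$, this equation gives that the $i$ component of $d\boldsymbol{\xi}$ is
$$d\xi_i = \sum_j \frac{1}{2}\left( \frac{\partial \xi_i}{\partial x_j} + \frac{\partial \xi_j}{\partial x_i} \right) dx_j \tag{16.35}$$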

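Define the elements of the strain tensor to be given by
$$\sigma_{ij} = \frac{1}{2}\left( \frac{\partial \xi_i}{\partial x_j} + \frac{\partial \xi_j}{\partial x_i} \right) \tag{16.36}$$
then
$$d\xi_i = \sum_j \sigma_{ij}\, dx_j \tag{16.37}$$
Thus the strain tensor $\boldsymbol{\sigma}$ is a rank-2 tensor defined as the ratio of the strain vector $d\boldsymbol{\xi}$ and the infinitesimal
area vector $d\mathbf{A}$
$$d\boldsymbol{\xi} = \boldsymbol{\sigma} \cdot d\mathbf{A} \tag{16.38}$$
where the component form of the rank-2 strain tensor is
$$\boldsymbol{\sigma} = \frac{1}{2}
\begin{pmatrix}
2\frac{\partial \xi_1}{\partial x_1} & \frac{\partial \xi_1}{\partial x_2} + \frac{\partial \xi_2}{\partial x_1} & \frac{\partial \xi_1}{\partial x_3} + \frac{\partial \xi_3}{\partial x_1} \\
\frac{\partial \xi_2}{\partial x_1} + \frac{\partial \xi_1}{\partial x_2} & 2\frac{\partial \xi_2}{\partial x_2} & \frac{\partial \xi_2}{\partial x_3} + \frac{\partial \xi_3}{\partial x_2} \\
\frac{\partial \xi_3}{\partial x_1} + \frac{\partial \xi_1}{\partial x_3} & \frac{\partial \xi_3}{\partial x_2} + \frac{\partial \xi_2}{\partial x_3} & 2\frac{\partial \xi_3}{\partial x_3}
\end{pmatrix} \tag{16.39}$$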
The potential-energy density for linear elastic forces is quadratic in the strain components. That is, it is
of the form
$$U = \sum_{ijkl} \frac{1}{2} C_{ijkl}\, \sigma_{ij}\, \sigma_{kl} \tag{16.40}$$
where $C_{ijkl}$ is a rank-4 tensor. No preferential directions remain for a homogeneous isotropic elastic body
which allows for two contractions, thereby reducing the potential energy density to the inner product
$$U = \sum_{ij} \frac{1}{2} C \left( \sigma_{ij} \right)^2 \tag{16.41}$$


16.5.3 Moduli of elasticity

The modulus of elasticity of a body is defined to be the slope of the stress-strain curve and thus, in
principle, it is a complicated rank-4 tensor that characterizes the elastic properties of a material. Thus the
general theory of elasticity is complicated because the elastic properties depend on the orientation of the
microscopic composition of the elastic matter. The theory simplifies considerably for homogeneous, isotropic
linear materials below the elastic limit, where the strain is proportional to the applied stress. That is, the
modulus of elasticity then reduces by contractions to a constant scalar value that depends on the properties
of the matter involved.
The potential energy density for homogeneous, isotropic, linear material, equation 16.41, can be separated
into diagonal and off-diagonal components of the strain tensor. That is,
$$U = \frac{1}{2}\left[ \lambda \left( \sum_i \sigma_{ii} \right)^2 + 2\mu \sum_{ij} \left( \sigma_{ij} \right)^2 \right] \tag{16.42}$$

The diagonal first term is the dilation term which corresponds to changes in the volume with no changes
in shape. The off-diagonal second term involves the shear terms that correspond to changes of the shape of
the body that also change the volume. The constants $\lambda$ and $\mu$ are Lamé’s moduli of elasticity which are
positive. The various moduli of elasticity, corresponding to different distortions in the shape and volume of
any solid body, can be derived from Lamé’s moduli for the material.
The components of the elastic forces can be derived from the gradient of the elastic potential energy,
equation 16.42, by use of Gauss’ law plus vector differential calculus. The components of the elastic force,
derived from the strain tensor $\boldsymbol{\sigma}$, can be associated with the corresponding components of the stress tensor
$\mathbf{T}$. Thus, for homogeneous isotropic linear materials, the components of the stress tensor are related to the
strain tensor by the relation
$$T_{ij} = \lambda\,\delta_{ij} \sum_k \frac{\partial \xi_k}{\partial x_k} + \mu \left( \frac{\partial \xi_i}{\partial x_j} + \frac{\partial \xi_j}{\partial x_i} \right) = \lambda\,\delta_{ij} \sum_k \sigma_{kk} + 2\mu\,\sigma_{ij} \tag{16.43}$$

where it has been assumed that $\sigma_{ij} = \sigma_{ji}$. The two moduli of elasticity $\lambda$ and $\mu$ are material-dependent
constants. Equation 16.43 can be written in tensor notation as
$$\mathbf{T} = \lambda\, tr(\boldsymbol{\sigma})\,\mathbf{I} + 2\mu\,\boldsymbol{\sigma} \tag{16.44}$$

where $tr(\boldsymbol{\sigma})$ is the trace of the strain tensor and $\mathbf{I}$ is the identity matrix.
Equation 1644 can be inverted to give the strain tensor components in terms of the stress tensor com-
ponents. " #
1  X
  =  −    (16.45)
2 (3 + 2)


The various moduli of elasticity relate combinations of different stress and strain tensor components. The
following five elastic moduli are used frequently to describe elasticity in homogeneous isotropic media, and
all are related to Lamé’s two moduli of elasticity.
1) Young’s modulus $E$ describes tensile elasticity, which is the axial stiffness of the length of a body to
deformation along the axis of the applied tensile force.
$$E \equiv \frac{T_{11}}{\sigma_{11}} = \frac{\mu\,(3\lambda + 2\mu)}{(\lambda + \mu)} \tag{16.46}$$

2) Bulk modulus $K$, the ratio of the applied pressure change to the resulting fractional volume change, defines
the relative dilation or compression of a body’s volume to pressure applied uniformly in all directions.
$$K = \lambda + \frac{2}{3}\mu \tag{16.47}$$
The bulk modulus is an extension of Young’s modulus to three dimensions and typically is larger than $E$.
The inverse of the bulk modulus is called the compressibility of the material.

3) Shear modulus $G$ describes the shear stiffness of a body to volume-preserving shear deformations.
The shear strain becomes a deformation angle given by the ratio of the displacement along the axis of the
shear force and the perpendicular moment arm. The shear modulus $G$ equals Lamé’s constant $\mu$. That is,
$$G = \mu \tag{16.48}$$
4) Poisson’s ratio $\nu$ is the negative ratio of the transverse to axial strain. It is a measure of the volume
conserving tendency of a body to contract in the directions perpendicular to the axis along which it is
stretched. In terms of Lamé’s constants, Poisson’s ratio equals
$$\nu = \frac{\lambda}{2\,(\lambda + \mu)} \tag{16.49}$$
Note that for a stable, isotropic elastic material, Poisson’s ratio is bounded between $-1.0 \le \nu \le 0.5$ to ensure
that the $E$, $G$ and $K$ moduli have positive values. At the incompressible limit, $\nu = 0.5$, and the bulk modulus
and Lamé parameter $\lambda$ are infinite, that is, the compressibility is zero. Typical solids have Poisson’s ratios
of $\nu \approx 0.05$ if hard and $\nu = 0.25$ if soft.
The stiffness of elastic solids in terms of the elastic moduli of solids can be complicated due to the
geometry and composition of solid bodies. Often it is more convenient to express the stiffness in terms of
the spring constant $\kappa$ where
$$\kappa = \frac{EA}{d} \tag{16.50}$$

The spring constant is inversely proportional to the length of the spring because the strain of the material
is defined to be the fractional deformation, not the absolute deformation.
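
The relations between Lamé's constants and the moduli listed above are simple enough to collect in a short helper. The sketch below is an added illustration: the function names `moduli_from_lame` and `spring_constant`, the dictionary keys, and the sample values are assumptions, while the formulas are equations 16.46 to 16.50.

```python
def moduli_from_lame(lam, mu):
    """Return Young's modulus, bulk modulus, shear modulus and Poisson's ratio
    computed from Lame's constants lam and mu (equations 16.46 to 16.49)."""
    youngs = mu * (3.0 * lam + 2.0 * mu) / (lam + mu)
    bulk = lam + 2.0 * mu / 3.0
    shear = mu
    poisson = lam / (2.0 * (lam + mu))
    return {"E": youngs, "K": bulk, "G": shear, "nu": poisson}


def spring_constant(youngs, area, length):
    """Spring constant kappa = E * A / d of a uniform rod (equation 16.50)."""
    return youngs * area / length


# Illustrative case lam = mu in arbitrary units (an assumption, not a material fit).
moduli = moduli_from_lame(1.0, 1.0)

# Doubling the length of the rod halves its spring constant.
kappa_short = spring_constant(moduli["E"], area=2.0, length=1.0)
kappa_long = spring_constant(moduli["E"], area=2.0, length=2.0)
```

With $\lambda = \mu$ the formulas give $\nu = 0.25$, and the results satisfy the standard cross-checks $E = 2G(1 + \nu)$ and $E = 3K(1 - 2\nu)$, which follow algebraically from equations 16.46 to 16.49.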

16.5.4 Equations of motion in a uniform elastic medium

The divergence theorem (8) relates the volume integral of the divergence of T to the vector force density
F acting on the closed surface. I Z Z
F = T·A = ∇ · T = f  (16.51)

That is, the inner product of the del operator, ∇,I and the rank-2 stress tensor T, give the vector force
2
density f . This force acting on the enclosed mass   for the closed volume, leads to an acceleration 2 .
Thus I Z I
2ξ
F = T·A = ∇ · T =  2  (16.52)

Use equation 1644 to relate the stress tensor T to the moduli of elasticity gives
" #
 2 ξ X  2 ξ  2 ξ
 2 = ( + ) + 2 (16.53)
 
  

where  = 1 2 3. In general this equation is diﬃcult to solve. However, for the simple case of a plane wave
in the  = 1 direction, the problem reduces to the following three equations
 2 ξ1  2 ξ1
 = ( + 2) (16.54)
2 21
2ξ  2 ξ2
 22 =  (16.55)
 21
2ξ 2ξ
 23 =  23 (16.56)
 1
q
(+2)
Equation 1654 corresponds to a longitudinal wave travelling with velocity  =  . Equations
q

1655 1656 correspond to two perpendicular transverse waves travelling with velocity  = . This il-
lustrates the important fact that longitudinal waves travel faster than transverse waves in an elastic solid.
Seismic waves in the Earth, generated by earthquakes, exhibit this property. Note that shearing stresses do
not exist in ideal liquids and gases since they cannot maintain shear forces and thus  = 0
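The two wave velocities can be evaluated numerically. The sketch below assumes the longitudinal speed $\sqrt{(\lambda+2\mu)/\rho}$ and transverse speed $\sqrt{\mu/\rho}$ quoted above; the function name `wave_speeds` and the material values are hypothetical, chosen only to illustrate typical magnitudes.

```python
import math


def wave_speeds(lam, mu, rho):
    """Return (longitudinal, transverse) wave speeds in an isotropic elastic solid.

    lam, mu: Lame parameters in Pa; rho: mass density in kg/m^3.
    """
    v_long = math.sqrt((lam + 2.0 * mu) / rho)
    v_trans = math.sqrt(mu / rho)
    return v_long, v_trans


# Illustrative (assumed) values for a steel-like solid.
v_long, v_trans = wave_speeds(lam=100e9, mu=80e9, rho=7800.0)
print(f"longitudinal: {v_long:.0f} m/s, transverse: {v_trans:.0f} m/s")

# With mu = 0 (ideal fluid) the transverse wave cannot propagate.
v_long_fluid, v_trans_fluid = wave_speeds(lam=2.2e9, mu=0.0, rho=1000.0)
```

Because $\lambda + 2\mu > \mu$ whenever both moduli are positive, the longitudinal speed always exceeds the transverse speed, which is the ordering seen for seismic waves.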

16.6 Electromagnetic field theory

16.6.1 Maxwell stress tensor
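Analytical formulations for continuous systems, developed for describing elasticity, are generally applicable to other fields, such as the electromagnetic field. The use of the Maxwell stress tensor $\mathbf{T}$ to describe momentum in the electromagnetic field is an important example of the application of continuum mechanics in field theory.
The Lorentz force can be written as
$$\mathbf{F} = \int \rho\,(\mathbf{E} + \mathbf{v}\times\mathbf{B})\,d\tau = \int (\rho\mathbf{E} + \mathbf{J}\times\mathbf{B})\,d\tau = \int \mathbf{f}\,d\tau \tag{16.57}$$
where the force density $\mathbf{f}$ is defined to be
$$\mathbf{f} = \rho\mathbf{E} + \mathbf{J}\times\mathbf{B} \tag{16.58}$$
Maxwell's equations
$$\rho = \epsilon_0\nabla\cdot\mathbf{E} \qquad \mathbf{J} = \frac{1}{\mu_0}\nabla\times\mathbf{B} - \epsilon_0\frac{\partial\mathbf{E}}{\partial t} \tag{16.59}$$
can be used to eliminate the charge and current densities in equation 16.57
$$\mathbf{f} = \epsilon_0(\nabla\cdot\mathbf{E})\,\mathbf{E} + \left(\frac{1}{\mu_0}\nabla\times\mathbf{B} - \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right)\times\mathbf{B} \tag{16.60}$$
Vector calculus gives that
$$\frac{\partial}{\partial t}(\mathbf{E}\times\mathbf{B}) = \frac{\partial\mathbf{E}}{\partial t}\times\mathbf{B} + \mathbf{E}\times\frac{\partial\mathbf{B}}{\partial t} \tag{16.61}$$
while Faraday's law gives
$$\frac{\partial\mathbf{B}}{\partial t} = -\nabla\times\mathbf{E} \tag{16.62}$$
Equation 16.62 allows equation 16.61 to be rewritten as
$$\frac{\partial\mathbf{E}}{\partial t}\times\mathbf{B} = \frac{\partial}{\partial t}(\mathbf{E}\times\mathbf{B}) - \mathbf{E}\times\frac{\partial\mathbf{B}}{\partial t} = \frac{\partial}{\partial t}(\mathbf{E}\times\mathbf{B}) + \mathbf{E}\times(\nabla\times\mathbf{E}) \tag{16.63}$$
Equation 16.63 can be inserted into equation 16.60. In addition, a term $\frac{1}{\mu_0}(\nabla\cdot\mathbf{B})\,\mathbf{B}$ can be added since $\nabla\cdot\mathbf{B} = 0$, which allows equation 16.60 to be written in the symmetric form
$$\mathbf{f} = \epsilon_0(\nabla\cdot\mathbf{E})\,\mathbf{E} + \frac{1}{\mu_0}(\nabla\cdot\mathbf{B})\,\mathbf{B} + \frac{1}{\mu_0}(\nabla\times\mathbf{B})\times\mathbf{B} - \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\times\mathbf{B} \tag{16.64}$$
$$= \epsilon_0(\nabla\cdot\mathbf{E})\,\mathbf{E} + \frac{1}{\mu_0}(\nabla\cdot\mathbf{B})\,\mathbf{B} + \frac{1}{\mu_0}(\nabla\times\mathbf{B})\times\mathbf{B} - \epsilon_0\frac{\partial}{\partial t}(\mathbf{E}\times\mathbf{B}) - \epsilon_0\mathbf{E}\times(\nabla\times\mathbf{E}) \tag{16.65}$$
Using the vector identity
$$\nabla(\mathbf{A}\cdot\mathbf{B}) = \mathbf{A}\times(\nabla\times\mathbf{B}) + \mathbf{B}\times(\nabla\times\mathbf{A}) + (\mathbf{A}\cdot\nabla)\,\mathbf{B} + (\mathbf{B}\cdot\nabla)\,\mathbf{A} \tag{16.66}$$
let $\mathbf{A} = \mathbf{B} = \mathbf{E}$; then
$$\nabla\left(E^2\right) = 2\mathbf{E}\times(\nabla\times\mathbf{E}) + 2(\mathbf{E}\cdot\nabla)\,\mathbf{E} \tag{16.67}$$
That is
$$\mathbf{E}\times(\nabla\times\mathbf{E}) = \frac{1}{2}\nabla\left(E^2\right) - (\mathbf{E}\cdot\nabla)\,\mathbf{E} \tag{16.68}$$
Similarly
$$\mathbf{B}\times(\nabla\times\mathbf{B}) = \frac{1}{2}\nabla\left(B^2\right) - (\mathbf{B}\cdot\nabla)\,\mathbf{B} \tag{16.69}$$
Inserting equations 16.68 and 16.69 into equation 16.65 gives
$$\mathbf{f} = \epsilon_0\left[(\nabla\cdot\mathbf{E})\,\mathbf{E} + (\mathbf{E}\cdot\nabla)\,\mathbf{E} - \frac{1}{2}\nabla E^2\right] + \frac{1}{\mu_0}\left[(\nabla\cdot\mathbf{B})\,\mathbf{B} + (\mathbf{B}\cdot\nabla)\,\mathbf{B} - \frac{1}{2}\nabla B^2\right] - \epsilon_0\frac{\partial}{\partial t}(\mathbf{E}\times\mathbf{B}) \tag{16.70}$$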

This complicated formula can be simplified by defining the rank-2 Maxwell stress tensor $\mathbf{T}$ which has components
$$T_{ij} \equiv \epsilon_0\left(E_iE_j - \frac{1}{2}\delta_{ij}E^2\right) + \frac{1}{\mu_0}\left(B_iB_j - \frac{1}{2}\delta_{ij}B^2\right) \tag{16.71}$$
The inner product of the del operator and the Maxwell stress tensor is a vector with $j$ components of
$$(\nabla\cdot\mathbf{T})_j = \epsilon_0\left[(\nabla\cdot\mathbf{E})E_j + (\mathbf{E}\cdot\nabla)E_j - \frac{1}{2}\nabla_jE^2\right] + \frac{1}{\mu_0}\left[(\nabla\cdot\mathbf{B})B_j + (\mathbf{B}\cdot\nabla)B_j - \frac{1}{2}\nabla_jB^2\right] \tag{16.72}$$
The above definition of the Maxwell stress tensor, plus the Poynting vector $\mathbf{S} = \frac{1}{\mu_0}(\mathbf{E}\times\mathbf{B})$, allows the force density equation 16.58 to be written in the form
$$\mathbf{f} = \nabla\cdot\mathbf{T} - \epsilon_0\mu_0\frac{\partial\mathbf{S}}{\partial t} \tag{16.73}$$
The divergence theorem allows the total force, acting on the volume $\tau$, to be written in the form
$$\mathbf{F} = \int\left(\nabla\cdot\mathbf{T} - \epsilon_0\mu_0\frac{\partial\mathbf{S}}{\partial t}\right)d\tau \tag{16.74}$$
$$= \oint\mathbf{T}\cdot d\mathbf{a} - \epsilon_0\mu_0\frac{d}{dt}\int\mathbf{S}\,d\tau \tag{16.75}$$
Note that, if the Poynting vector is time independent, then the second term in equation 16.75 is zero and the Maxwell stress tensor $\mathbf{T}$ is the force per unit area (stress) acting on the surface. The fact that $\mathbf{T}$ is a rank-2 tensor is apparent since the stress represents the ratio of the force-density vector $\mathbf{f}$ and the infinitesimal area vector $d\mathbf{a}$, which do not necessarily point in the same directions.

16.6.2 Momentum in the electromagnetic field

Chapter 72 showed that the electromagnetic field carries a linear momentum A where  is the charge on a
body and A is the electromagnetic vector potential. It is useful to use the Maxwell stress tensor to express
the momentum density directly in terms of the electric and magnetic fields.
Newton’s law of motion can be used to write equation equation 1675 as
I Z
p 
F= = T·a−0 0 Sdτ (16.76)
 
where p is the total mechanical linear momentum of the volume  . Equation 1676 implies that the electro-
magnetic field carries a linear momentum
Z
p  = 0 0 Sdτ (16.77)
I
The T·a term in equation 1676 is the momentum per unit time flowing into the closed surface.
In field theory it can be useful to describe the behavior in terms of the momentum flux density π. Thus
the momentum flux density π   in the electromagnetic field is

π   =0 0 S (16.78)

Then equation 1676 implies that the total momentum flux density π = π  +π   is related to Maxwell’s
stress tensor by

(π  + π   ) = ∇ · T (16.79)

That is, like the elasticity stress tensor, the divergence of Maxwell’s stress tensor T equals the rate of change
of the total momentum density, that is, −T is the momentum flux density.
This discussion of the Maxwell stress tensor and its relation to momentum in the electromagnetic field
illustrates the role that analytical formulations of classical mechanics can play in field theory.

16.7 Ideal fluid dynamics

The distinction between a solid and a fluid is that a fluid flows under shear stress whereas the elasticity of solids opposes distortion and flow. Shear stress in a fluid is opposed by dissipative viscous forces, which depend on velocity, as opposed to elastic solids where the shear stress is opposed by the elastic forces which depend on the displacement. An ideal fluid is one where the viscous forces are negligible, and thus the shear stress Lamé parameter $\mu = 0$.

16.7.1 Continuity equation

Fluid dynamics requires a different philosophical approach than that used to describe the motion of an ensemble of known solid bodies. The prior discussions of classical mechanics used, as variables, the coordinates of each member of an ensemble of particles with known masses. This approach is not viable for fluids which involve an enormous number of individual atoms as the fundamental bodies of the fluid. The best philosophical approach for describing fluid dynamics is to employ continuum mechanics using definite fixed volume elements $d\tau$ and describe the fluid in terms of macroscopic variables of the fluid such as mass density $\rho$, pressure $P$, and fluid velocity $\mathbf{v}$.
Conservation of fluid mass requires that the rate of change of mass in a fixed volume must equal the net inflow of mass.
$$\frac{\partial}{\partial t}\int\rho\,d\tau + \oint\rho\mathbf{v}\cdot d\mathbf{a} = 0 \tag{16.80}$$
Using the divergence theorem (2) allows this to be written as
$$\int\left(\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v})\right)d\tau = 0 \tag{16.81}$$
Mass conservation must hold for any arbitrary volume, therefore the continuity equation can be written in the differential form
$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0 \tag{16.82}$$
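The content of the continuity equation can be illustrated numerically in one dimension. The sketch below is not a method from the text: it assumes a constant positive flow velocity, a periodic grid, and a first-order upwind finite-volume update, and the name `advect_upwind` is hypothetical. The point is only that, because each cell loses exactly the mass its neighbour gains, the total mass is unchanged.

```python
import math


def advect_upwind(rho, v, dx, dt):
    """One upwind step of d(rho)/dt + d(rho*v)/dx = 0 for constant v > 0.

    The grid is periodic: rho[-1] is the left neighbour of rho[0].
    """
    flux = [r * v for r in rho]  # mass flux rho*v leaving each cell to the right
    return [rho[i] - (dt / dx) * (flux[i] - flux[i - 1]) for i in range(len(rho))]


dx, dt, v = 1.0, 0.5, 1.0  # v*dt/dx = 0.5 keeps the scheme stable
rho = [1.0 + 0.5 * math.exp(-((i - 20) / 4.0) ** 2) for i in range(64)]

mass_initial = sum(rho) * dx
for _ in range(100):
    rho = advect_upwind(rho, v, dx, dt)
mass_final = sum(rho) * dx

print(mass_initial, mass_final)
```

The density bump moves and spreads as the steps proceed, but the two printed totals agree to rounding error, which is the discrete analogue of the integral statement of mass conservation.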


16.7.2 Euler’s hydrodynamic equation

The fluid surrounding a volume $\tau$ exerts a net force $\mathbf{F}$ that equals the surface integral of the pressure $P$. This force can be transformed to a volume integral of $\nabla P$. The net force then will lead to an acceleration of the volume element. That is
$$\mathbf{F} = -\oint P\,d\mathbf{a} = -\int\nabla P\,d\tau = \int\rho\frac{d\mathbf{v}}{dt}\,d\tau \tag{16.83}$$
Thus the force density $\mathbf{f}$ is given by
$$\mathbf{f} = -\nabla P = \rho\frac{d\mathbf{v}}{dt} \tag{16.84}$$
Note that the acceleration $\frac{d\mathbf{v}}{dt}$ in equation 16.83 refers to the rate of change of velocity for individual atoms in the fluid, not the rate of change of fluid velocity at a fixed point in space. These two accelerations are related by noting that, during the time $dt$, the change in velocity $d\mathbf{v}$ of a given fluid particle is composed of two parts, namely (1) the change during $dt$ in the velocity at a fixed point in space, and (2) the difference between the velocities at that same instant in time at two points displaced a distance $d\mathbf{r}$ apart, where $d\mathbf{r}$ is the distance moved by a given fluid particle during the time $dt$. The first part is given by $\frac{\partial\mathbf{v}}{\partial t}dt$ at a given point $(x, y, z)$ in space. The second part equals
$$\frac{\partial\mathbf{v}}{\partial x}dx + \frac{\partial\mathbf{v}}{\partial y}dy + \frac{\partial\mathbf{v}}{\partial z}dz = (d\mathbf{r}\cdot\nabla)\,\mathbf{v} \tag{16.85}$$
Thus
$$d\mathbf{v} = \frac{\partial\mathbf{v}}{\partial t}dt + (d\mathbf{r}\cdot\nabla)\,\mathbf{v} \tag{16.86}$$

Dividing both sides by $dt$ gives that the acceleration of the atoms in the fluid equals
$$\frac{d\mathbf{v}}{dt} = \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\,\mathbf{v} \tag{16.87}$$
Substituting equation 16.87 into 16.84 gives
$$\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\,\mathbf{v} = -\frac{1}{\rho}\nabla P \tag{16.88}$$
This is Euler's equation for hydrodynamics. The two terms on the left represent the acceleration in the individual fluid components while the right-hand side gives the force per unit mass producing the acceleration.
Additional forces can be added to the right-hand side. For example, the gravitational force density $\rho\mathbf{g}$ can be expressed in terms of the gravitational scalar potential $V$ to be
$$\rho\mathbf{g} = -\rho\nabla V \tag{16.89}$$
Inclusion of the gravitational field force density in Euler's equation gives
$$\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\,\mathbf{v} = -\frac{1}{\rho}\nabla(P + \rho V) \tag{16.90}$$

16.7.3 Irrotational flow and Bernoulli’s equation

Streamlined flow corresponds to irrotational flow, that is, $\nabla\times\mathbf{v} = 0$. Since irrotational flow is curl free, the velocity streamlines can be represented by a scalar potential field $\phi$. That is
$$\mathbf{v} = -\nabla\phi \tag{16.91}$$
This scalar potential field $\phi$ can be used to derive the vector velocity field for irrotational flow.
Note that the $(\mathbf{v}\cdot\nabla)\,\mathbf{v}$ term in Euler's equation (16.90) can be rewritten using the vector identity
$$(\mathbf{v}\cdot\nabla)\,\mathbf{v} = \frac{1}{2}\nabla\left(v^2\right) - \mathbf{v}\times(\nabla\times\mathbf{v}) \tag{16.92}$$
Inserting equation 16.92 into Euler's equation 16.90 then gives
$$\frac{\partial\mathbf{v}}{\partial t} = \mathbf{v}\times(\nabla\times\mathbf{v}) - \frac{1}{\rho}\nabla\left(\frac{1}{2}\rho v^2 + P + \rho V\right) \tag{16.93}$$
Potential flow corresponds to time independent irrotational flow, that is, both $\frac{\partial\mathbf{v}}{\partial t} = 0$ and $\nabla\times\mathbf{v} = 0$. For potential flow equation 16.93 reduces to
$$\nabla\left(\frac{1}{2}\rho v^2 + P + \rho V\right) = 0$$
which implies that
$$\left(\frac{1}{2}\rho v^2 + P + \rho V\right) = \text{constant} \tag{16.94}$$
This is the famous Bernoulli's equation that relates the interplay of the fluid velocity, pressure and gravitational energy. Bernoulli's equation plays important roles in both hydrodynamics and aerodynamics.
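As a small worked illustration of Bernoulli's equation, consider liquid draining through a small hole a depth $h$ below the free surface of a large open tank. This example is an assumption added for illustration: it takes the gravitational potential to be $gh$ measured from the hole, the pressure to be atmospheric at both the surface and the hole, and the surface speed to be negligible. The names `bernoulli_constant` and `outflow_speed` are hypothetical.

```python
import math


def bernoulli_constant(rho, speed, pressure, potential):
    """Evaluate (1/2) rho v^2 + P + rho V at one point of the flow."""
    return 0.5 * rho * speed ** 2 + pressure + rho * potential


def outflow_speed(depth, g=9.81):
    """Speed of liquid leaving a hole at the given depth below a free surface."""
    return math.sqrt(2.0 * g * depth)


rho = 1000.0      # water density in kg/m^3 (assumed)
p_atm = 101325.0  # ambient pressure in Pa (assumed)
g, h = 9.81, 2.0  # gravitational acceleration and depth of the hole

at_surface = bernoulli_constant(rho, 0.0, p_atm, g * h)
at_hole = bernoulli_constant(rho, outflow_speed(h, g), p_atm, 0.0)
print(outflow_speed(h, g), at_surface, at_hole)
```

Equating the two evaluations of the Bernoulli constant gives $v = \sqrt{2gh}$, so the outflow speed is independent of the fluid density in this idealized case.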

16.7.4 Gas flow

Fluid dynamics applied to gases is a straightforward extension of fluid dynamics that employs standard ther-
modynamical concepts. The following example illustrates the application of fluid mechanics for calculating
the velocity of sound in a gas.

16.1 Example: Acoustic waves in a gas

Propagation of acoustic waves in a gas provides an example of using the three-dimensional Lagrangian density. Only longitudinal waves occur in a gas and the velocity is given by the thermodynamics of the gas. Let the displacement of each gas molecule be designated by the general coordinate $\mathbf{q}$ with corresponding velocity $\dot{\mathbf{q}}$. Let the gas density be $\rho_0$; then the kinetic energy density (KED) of an infinitesimal volume of gas $\Delta\tau$ is given by
$$\Delta(KED) = \frac{1}{2}\rho_0\dot{\mathbf{q}}^2$$
The rapid contractions and expansions of the gas in an acoustic wave occur adiabatically such that the product $PV^\gamma$ is a constant, where $\gamma = \frac{\text{specific heat at constant pressure}}{\text{specific heat at constant volume}}$. Therefore the change in potential energy density $\Delta(PED)$, which equals the work done on the gas per unit volume, is given to second order by
$$\Delta(PED) = -\frac{1}{V_0}\int_{V_0}^{V_0+\Delta V}P\,dV = -\frac{P_0}{V_0}\Delta V - \frac{1}{2V_0}\left(\frac{\partial P}{\partial V}\right)_0(\Delta V)^2 = -\frac{P_0}{V_0}\Delta V + \frac{\gamma P_0}{2V_0^2}(\Delta V)^2$$
Since, for small changes, the volume and density of a fixed mass of gas are related by
$$\frac{\Delta V}{V_0} = -\frac{\Delta\rho}{\rho_0}$$
then the fractional change in the density $\sigma$ is related to the density by
$$\rho = \rho_0(1+\sigma)$$
This implies that the potential energy density (PED) is given by
$$\Delta(PED) = P_0\sigma + \frac{\gamma P_0}{2}\sigma^2$$
The mass flowing out of the volume $V_0$ must equal the decrease in mass caused by the fractional change in density of the volume, that is
$$\rho_0\oint\mathbf{q}\cdot d\mathbf{S} = -\rho_0\int\sigma\,dV$$
The divergence theorem gives that
$$\oint\mathbf{q}\cdot d\mathbf{S} = \int\nabla\cdot\mathbf{q}\,dV = -\int\sigma\,dV$$
Thus the fractional density change $\sigma$ is given by minus the divergence of $\mathbf{q}$
$$\sigma = -\nabla\cdot\mathbf{q}$$
This allows the potential energy density to be written as
$$\Delta(PED) = -P_0\nabla\cdot\mathbf{q} + \frac{\gamma P_0}{2}(\nabla\cdot\mathbf{q})^2$$
Combining the kinetic energy density and the potential energy density gives the complete Lagrangian density for an acoustic wave in a gas to be
$$\mathcal{L} = \frac{1}{2}\rho_0\dot{\mathbf{q}}^2 + P_0\nabla\cdot\mathbf{q} - \frac{\gamma P_0}{2}(\nabla\cdot\mathbf{q})^2$$
Inserting this Lagrangian density in the corresponding equations of motion, equation 16.23, gives that
$$\nabla^2\mathbf{q} - \frac{\rho_0}{\gamma P_0}\frac{\partial^2\mathbf{q}}{\partial t^2} = 0$$
where $P_0$ and $\rho_0$ are the ambient pressure and density of the gas. This is the wave equation where the phase velocity of sound is given by
$$v = \sqrt{\frac{\gamma P_0}{\rho_0}}$$

16.8 Viscous fluid dynamics

Viscous fluid dynamics is a branch of classical mechanics that plays a pivotal role in a wide range of aspects
of life, such as blood flow in human anatomy, weather, hydraulic engineering, and transportation by land,
sea, and air. Viscous fluid flow provides nature's most common manifestation of nonlinearity and turbulence
in classical mechanics, and provides an excellent illustration of possible solutions of non-linear equations of
motion introduced in chapter 4. A detailed description of turbulence remains a challenging problem and
this subject has the reputation of being the last great unsolved problem in classical mechanics. There is
an apocryphal story that Werner Heisenberg was asked, if given the opportunity, what would he like to ask
God. His reply was “When I meet God, I am going to ask him two questions: Why relativity? and why
turbulence?, I really believe he will only have an answer to the first”.
In contrast to solids, fluids do not have elastic restoring forces to support shear stress because the fluid
flows. Shear stresses in fluids are balanced by viscous forces which are velocity dependent. There are two
mechanisms that lead to shear stress acting between adjacent fluid layers in relative motion. The first
mechanism involves laminar flow where the viscous forces produce shear stress between adjacent layers of
the fluid which are moving parallel along adjacent streamlines at differing velocities. Viscous forces typically
dominate laminar flow. High viscosity fluids like honey exhibit laminar flow and are more difficult to stir
or pour compared with low-viscosity fluids like water. The second mechanism involves turbulent flow where
shear stress is due to momentum transfer between adjacent layers when the flow breaks up into large-scale
coherent vortex structures which carry most of the kinetic energy. These eddies lead to transverse motion
that transfers momentum plus heat between adjacent layers and leads to higher drag. The wing-tip vortex
produced by the wing tip of an aircraft is an example of a dynamically-distinct, large-scale, coherent vortex
structure which has considerable angular momentum and decays by fragmentation into a cascade of smaller
scale structures.

16.8.1 Navier-Stokes equation

Viscous forces acting on the small-scale coherent structures eventually dissipate the energy in turbulent motion. The viscous drag can be handled in terms of a stress tensor $\mathbf{T}$ analogous to its use when accounting for the elastic restoring forces in elasticity as discussed in chapter 16.5.3. That is, the viscous force density is related to the deceleration of the volume element by
$$\frac{\partial}{\partial t}(\rho\mathbf{v}) = -\nabla\cdot\mathbf{T} \tag{16.95}$$
where the components of the stress tensor are
$$T_{ij} = T_{ji} = P\delta_{ij} + \rho v_iv_j \tag{16.96}$$
Note that the stress tensor gives the momentum flux density tensor, which involves a diagonal term proportional to pressure $P$ plus a viscous drag term that is proportional to the product of two velocities.
The Navier-Stokes equations are the fundamental equations characterizing fluid flow. They are based on application of Newton's second law of motion to fluids together with the assumption that the fluid stress is the sum of a diffusing viscous term plus a pressure term. Combining Euler's equation, 16.90, with 16.95 gives the Navier-Stokes equation
$$\rho\left[\frac{\partial\mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}\right] = -\nabla P + \nabla\cdot\mathbf{T} + \mathbf{f} \tag{16.97}$$
where $\rho$ is the fluid density, $\mathbf{v}$ is the flow velocity vector, $P$ the pressure, $\mathbf{T}$ is the shear stress tensor viscous drag term, and $\mathbf{f}$ represents external body forces per unit volume such as gravity acting on the fluid.
For incompressible flow the stress tensor term simplifies to $\nabla\cdot\mathbf{T} = \mu\nabla^2\mathbf{v}$. Then the Navier-Stokes equation simplifies to
$$\rho\left[\frac{\partial\mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}\right] = -\nabla P + \mu\nabla^2\mathbf{v} + \mathbf{f} \tag{16.98}$$
where $\mu\nabla^2\mathbf{v}$ is the viscosity drag term. The left-hand side of equation 16.98 represents the rate of change of momentum per unit volume while the right-hand side represents the summation of the forces per unit volume that are acting.

The Navier-Stokes equations are nonlinear due to the (v · ∇) v term as well as being a function of
velocity. This non-linearity leads to a wide spectrum of dynamic behavior ranging from ordered laminar
flow to chaotic turbulence. Numerical solution of the Navier-Stokes equations is extremely difficult because
of the wide dynamic range of the dimensions of the coherent structures involved in turbulent motion. For
example, simulation calculations require use of a high resolution mesh which is a challenge to the capabilities
of current generation computers.
The microscopic boundary condition at the interface of the solid and fluid is that the fluid molecules
have zero average tangential velocity relative to the solid-fluid interface. This implies that
there is a boundary layer for which there is a gradient in the tangential velocity of the fluid between the
solid-fluid interface and the free-stream velocity. This velocity gradient produces vorticity in the fluid. When
the viscous forces are negligible then the angular momentum in any coherent vortex structure is conserved
leading to the vortex motion being preserved as it propagates.

16.8.2 Reynolds number

Fluid flow can be characterized by the Reynolds number $\text{Re}$ which is a dimensionless number that is a measure of the ratio of the inertial forces $\rho v^2/L$ to viscous forces $\mu v/L^2$. That is,
$$\text{Re} \equiv \frac{\text{Inertial forces}}{\text{Viscous forces}} = \frac{\rho vL}{\mu} = \frac{vL}{\nu} \tag{16.99}$$
where $v$ is the relative velocity between the free fluid flow and the solid surface, $L$ is a characteristic linear dimension, $\mu$ is the dynamic viscosity of the fluid, $\nu$ is the kinematic viscosity ($\nu = \mu/\rho$), and $\rho$ is the density of the fluid. The Law of Similarity implies that at a given Reynolds number, for a specific shaped solid body, the fluid flow behaves identically independent of the size of the body. Thus one can use small models in wind tunnels, or water-flow tanks, to accurately model fluid flow that can be scaled up to a full-sized aircraft or boats by scaling $v$ and $L$ to give the same Reynolds number.

16.8.3 Laminar and turbulent fluid flow

Fluid flow over a cylinder illustrates the general features of fluid flow. The drag force $F_D$ acting on a cylinder of diameter $D$ and length $l$ with the cylindrical axis perpendicular to the fluid flow, is given by
$$F_D = \frac{1}{2}\rho v^2DlC_D \tag{16.100}$$
where $C_D$ is the coefficient of drag. Figure 16.1 shows the dependence of the drag coefficient $C_D$ as a function of the Reynolds number, for fluid flow that is transverse to a smooth circular cylinder. The lower part of figure 16.1 shows the streamlines for flow around the cylinder at various Reynolds numbers for the points identified by the letters A, B, C, D, and E on the plot of the drag coefficient versus Reynolds number for a smooth cylinder.

[Figure 16.1: Upper: The dependence of the coefficient of drag $C_D$ on Reynolds number $\text{Re}$ for fluid flow perpendicular to a smooth circular cylinder of diameter $D$ and length $l$. Lower: Typical flow patterns for flow past a circular cylinder at various Reynolds numbers as indicated in the upper figure. Panels: (A) No separation; (B) Steady separation bubble; (C) Oscillating Karman vortex street wake; (D) Laminar boundary layer, wide turbulent wake; (E) Turbulent boundary layer, narrow turbulent wake.]

A) At low velocities, where $\text{Re} \leq 1$, the flow is laminar around the cylinder in that the low vorticity is damped by the viscous forces and the $\mathbf{v}\cdot\nabla\mathbf{v}$ term in equation 16.98 can be ignored. The coefficient of drag $C_D$ varies inversely with $\text{Re}$ leading to the drag forces that are roughly linear with velocity as described in chapter 2.10.5. The size and velocities of raindrops in a light rain shower correspond to such Reynolds numbers.
B) For $10 < \text{Re} < 30$ the flow has two turbulent vortices immediately behind the body in the wake of the cylinder, but the flow still is primarily laminar as illustrated.
C) For $40 < \text{Re} < 250$ the pair of vortices peel off alternately producing a regular periodic sequence of vortices although the flow still is laminar. This vortex sequence is called a von Kármán vortex street for which the velocity at a given position, relative to the cylinder, is time dependent in contrast to the situation at lower Reynolds numbers.
D) For $10^3 < \text{Re} < 10^5$ viscous forces are negligible relative to the inertial effects of the vortices and boundary-layer vortices have less time to diffuse into the larger region of the fluid, thus the boundary layer is thinner. The boundary-layer flow exhibits a small scale chaotic turbulence in three dimensions superimposed on regular alternating vortex structures. In this range $C_D$ is roughly constant and thus the drag forces are proportional to the square of the velocity. This regime of Reynolds numbers corresponds to typical velocities of moving automobiles.
E) For $\text{Re} \approx 10^6$, which is typical of a flying aircraft, the inertial effects dominate except in the narrow boundary layer close to the solid-fluid interface. The chaotic region works its way further forward on the cylinder reducing the volume of the chaotic turbulent boundary layer which results in a significant decrease in $C_D$. For a sailplane wing flying at about 50, the boundary layer reduces to the order of a millimeter in thickness at the leading edge and a centimeter at the trailing edge. At these Reynolds numbers the airflow comprises a thin boundary layer, where viscous effects are important, plus fluid flow in the bulk of the fluid where the vortex inertial terms dominate and viscous forces can be ignored. That is, the viscous stress tensor term $\nabla\cdot\mathbf{T}$ on the right-hand side of equation 16.97 can be ignored, and the Navier-Stokes equation reduces to the simpler Euler equation for such inviscid fluid flow.
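The regimes A to E can be tied together with a short numerical sketch that evaluates the Reynolds number and the drag force of equation 16.100 and then reports which of the listed ranges applies. The function names, the label returned for values falling between the quoted ranges, the choice to treat any value of $10^6$ or more as regime E, and the air properties used are all assumptions made for illustration.

```python
def reynolds_number(rho, v, L, mu):
    """Re = rho * v * L / mu for density rho, speed v, length L, dynamic viscosity mu."""
    return rho * v * L / mu


def drag_force(rho, v, D, l, C_D):
    """Drag on a cylinder of diameter D and length l: (1/2) rho v^2 D l C_D."""
    return 0.5 * rho * v ** 2 * D * l * C_D


def flow_regime(Re):
    """Map a Reynolds number onto the regimes A-E quoted for a smooth cylinder."""
    if Re <= 1.0:
        return "A"
    if 10.0 < Re < 30.0:
        return "B"
    if 40.0 < Re < 250.0:
        return "C"
    if 1.0e3 < Re < 1.0e5:
        return "D"
    if Re >= 1.0e6:  # assumption: the text only quotes Re of about 10^6
        return "E"
    return "between quoted regimes"


# Assumed properties of air near room temperature.
rho_air, mu_air = 1.2, 1.8e-5
Re_wire = reynolds_number(rho_air, v=1.0, L=0.001, mu=mu_air)  # 1 mm wire at 1 m/s
Re_model = reynolds_number(rho_air, v=10.0, L=0.0001, mu=mu_air)  # smaller, faster
print(Re_wire, flow_regime(Re_wire), Re_model)
```

The last two evaluations illustrate the Law of Similarity: reducing the size tenfold while increasing the speed tenfold leaves the Reynolds number, and hence the flow pattern, unchanged.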
The importance of the inertia of the vortices is illustrated by the persistence of the vortex structure
and turbulence over a wide range of length scales characteristic of turbulent flow. The dynamic range of
the dimension of coherent vortex structures is enormous. For example, in the atmosphere the vortex size
ranges from $10^5$ m in diameter for hurricanes down to $10^{-3}$ m in thin boundary layers adjacent to an aircraft
wing. The transition from laminar to turbulent flow is illustrated by water flow over the hull of a ship which
involves laminar flow at the bow followed by turbulent flow behind the bow wave and at the stern of the
ship. The broad extent of the white foam of seawater along the side and the stern of a ship illustrates the
considerable energy dissipation produced by the turbulence. The boundary layer of a stalled aircraft wing
is another example. At a high angle of attack, the airflow on the lower surface of the wing remains laminar,
that is, the stream velocity profile, relative to the wing, increases smoothly from zero at the wing surface
outwards until it meets the ambient air velocity on the outer surface of the boundary layer which is the order
of a millimeter thick. The flow on the top surface of the wing initially is laminar before becoming turbulent
at which point the boundary layer rapidly increases in thickness. Further back the airflow detaches from
the wing surface and large-scale vortex structures lead to a wide boundary layer comparable in thickness to
the chord of the wing with vortex motion that leads to the airflow reversing its direction adjacent to the
upper surface of the wing which greatly increases drag. When the vortices begin to shed off the bounded
surface they do so at a certain frequency which can cause vibrations that can lead to structural failure if the
frequency of the shedding vortices is close to the resonance frequency of the structure.
Considerable time and effort are expended by aerodynamicists and hydrodynamicists designing aircraft
wings and ship hulls to maximize the length of the laminar region of the boundary layer to minimize drag.
When the Reynolds number is large the slightest imperfections in the shape of the wing, such as a speck of
dust, can trigger the transition from laminar to turbulent flow. The boundaries between adjacent large-scale
coherent structures are sensitively identified in computer simulations by large divergence of the streamlines
at any separatrix. A large positive, finite-time, Lyapunov exponent identifies divergence of the streamlines
which occurs at a separatrix between adjacent large-scale coherent vortex structures, whereas the Lyapunov
exponents are negative for converging streamlines within any coherent structure. Computations of turbulent
flow often combine the use of finite-time Lyapunov exponents to identify coherent structures, plus Lagrangian
mechanics for the equations of motion since the Lagrangian is a scalar function, it is frame independent, and
it gives far better results for fluid motion than using Newtonian mechanics. Thus the Lagrangian approach in
the continua is used extensively for calculations in aerodynamics, hydrodynamics, and studies of atmospheric
phenomena such as convection, hurricanes, tornadoes, etc.

16.9 Summary and implications

The goal of this chapter is to provide a glimpse into the classical mechanics of the continua which introduces
the Lagrangian density and Hamiltonian density formulations of classical mechanics.

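Lagrangian density formulation: In three dimensions the Lagrangian density $\mathcal{L}(\mathbf{q}, \frac{d\mathbf{q}}{dt}, \nabla\cdot\mathbf{q}, x, y, z, t)$ is related to the Lagrangian $L$ by taking the volume integral of the Lagrangian density.
$$L = \int\mathcal{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, \nabla\cdot\mathbf{q}, x, y, z, t\right)d\tau \tag{16.21}$$
Applying Hamilton's Principle to the three-dimensional Lagrangian density leads to the following set of differential equations of motion
$$\frac{\partial}{\partial t}\left(\frac{\partial\mathcal{L}}{\partial\frac{\partial\mathbf{q}}{\partial t}}\right) + \frac{\partial}{\partial x}\left(\frac{\partial\mathcal{L}}{\partial\frac{\partial\mathbf{q}}{\partial x}}\right) + \frac{\partial}{\partial y}\left(\frac{\partial\mathcal{L}}{\partial\frac{\partial\mathbf{q}}{\partial y}}\right) + \frac{\partial}{\partial z}\left(\frac{\partial\mathcal{L}}{\partial\frac{\partial\mathbf{q}}{\partial z}}\right) - \frac{\partial\mathcal{L}}{\partial\mathbf{q}} = 0 \tag{16.22}$$

Hamiltonian density formulation: In the limit that the coordinates $q, p$ are continuous, then the Hamiltonian can be expressed as a volume integral involving the momentum density $\boldsymbol{\pi}$ and the Lagrangian density $\mathcal{L}$ where
$$\boldsymbol{\pi} \equiv \frac{\partial\mathcal{L}}{\partial\dot{\mathbf{q}}} \tag{16.27}$$
Then the obvious definition of the Hamiltonian density $\mathcal{H}$ is
$$H = \int\mathcal{H}\,d\tau = \int(\boldsymbol{\pi}\cdot\dot{\mathbf{q}} - \mathcal{L})\,d\tau \tag{16.28}$$
where the Hamiltonian density is given by
$$\mathcal{H} = \boldsymbol{\pi}\cdot\dot{\mathbf{q}} - \mathcal{L} \tag{16.29}$$
These Lagrangian and Hamiltonian density formulations are of considerable importance to field theory and fluid mechanics.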

Linear elastic solids: The theory of continuous systems was applied to the case of linear elastic solids. The stress tensor $\mathbf{T}$ is a rank 2 tensor defined as the ratio of the force vector $d\mathbf{F}$ and the surface element vector $d\mathbf{A}$. That is, the force vector is given by the inner product of the stress tensor $\mathbf{T}$ and the surface element vector $d\mathbf{A}$.
$$d\mathbf{F} = \mathbf{T}\cdot d\mathbf{A} \tag{16.33}$$
The strain tensor $\boldsymbol{\sigma}$ also is a rank 2 tensor defined as the ratio of the strain vector $\boldsymbol{\xi}$ and infinitesimal area $d\mathbf{A}$
$$d\boldsymbol{\xi} = \boldsymbol{\sigma}\cdot d\mathbf{A} \tag{16.38}$$
where the component form of the rank 2 strain tensor is
$$\sigma_{ij} = \frac{1}{2}\left(\frac{\partial\xi_i}{\partial x_j} + \frac{\partial\xi_j}{\partial x_i}\right) \qquad i, j = 1, 2, 3 \tag{16.39}$$
The modulus of elasticity is defined as the slope of the stress-strain curve. For linear, homogeneous, elastic matter, the potential energy density $U$ separates into diagonal and off-diagonal components of the strain tensor
$$U = \frac{1}{2}\left[\lambda\Big(\sum_i\sigma_{ii}\Big)^2 + 2\mu\sum_{ij}\left(\sigma_{ij}\right)^2\right] \tag{16.42}$$
where the constants $\lambda$ and $\mu$ are Lamé's moduli of elasticity which are positive. The stress tensor is related to the strain tensor by
$$T_{ij} = \lambda\delta_{ij}\sum_k\frac{\partial\xi_k}{\partial x_k} + \mu\left(\frac{\partial\xi_i}{\partial x_j} + \frac{\partial\xi_j}{\partial x_i}\right) = \lambda\delta_{ij}\sum_k\sigma_{kk} + 2\mu\sigma_{ij} \tag{16.43}$$

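Electromagnetic field theory: The rank 2 Maxwell stress tensor $\mathbf{T}$ has components
$$T_{ij} \equiv \epsilon_0\left(E_iE_j - \frac{1}{2}\delta_{ij}E^2\right) + \frac{1}{\mu_0}\left(B_iB_j - \frac{1}{2}\delta_{ij}B^2\right) \tag{16.71}$$
The divergence theorem allows the total electromagnetic force, acting on the volume $\tau$, to be written as
$$\mathbf{F} = \int\left(\nabla\cdot\mathbf{T} - \epsilon_0\mu_0\frac{\partial\mathbf{S}}{\partial t}\right)d\tau = \oint\mathbf{T}\cdot d\mathbf{a} - \epsilon_0\mu_0\frac{d}{dt}\int\mathbf{S}\,d\tau \tag{16.74}$$
The total momentum density is related to the Maxwell stress tensor by
$$\frac{\partial}{\partial t}\left(\boldsymbol{\pi}_{\text{mech}} + \boldsymbol{\pi}_{\text{field}}\right) = \nabla\cdot\mathbf{T} \tag{16.79}$$
where the electromagnetic field momentum density is given by the Poynting vector $\mathbf{S}$ as $\boldsymbol{\pi}_{\text{field}} = \epsilon_0\mu_0\mathbf{S}$.

Ideal fluid dynamics: Mass conservation leads to the continuity equation
$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0 \tag{16.82}$$
Euler's hydrodynamic equation gives
$$\frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\,\mathbf{v} = -\frac{1}{\rho}\nabla(P + \rho V) \tag{16.90}$$
where $V$ is the scalar gravitational potential. If the flow is irrotational and time independent then
$$\left(\frac{1}{2}\rho v^2 + P + \rho V\right) = \text{constant} \tag{16.94}$$

Viscous fluid dynamics: For incompressible flow the stress tensor term simplifies to $\nabla\cdot\mathbf{T} = \mu\nabla^2\mathbf{v}$. Then the Navier-Stokes equation becomes
$$\rho\left[\frac{\partial\mathbf{v}}{\partial t} + \mathbf{v}\cdot\nabla\mathbf{v}\right] = -\nabla P + \mu\nabla^2\mathbf{v} + \mathbf{f} \tag{16.98}$$
where $\mu\nabla^2\mathbf{v}$ is the viscosity drag term. The left-hand side of equation 16.98 represents the rate of change of momentum per unit volume while the right-hand side represents the summation of the forces per unit volume that are acting.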
The Reynolds number is a dimensionless number that characterizes the ratio of inertial forces to viscous
forces in a viscous medium. The evolution of flow from laminar flow to turbulent flow, with increase of
Reynolds number, was discussed.
The classical mechanics of continuous fields encompasses a remarkably broad range of phenomena with
important applications to laminar and turbulent fluid flow, gravitation, electromagnetism, relativity, and
quantum fields.
Chapter 17

Relativistic mechanics

17.1 Introduction
Newtonian mechanics incorporates the Newtonian concept of the complete separation of space and time.
This theory reigned supreme from inception, in 1687, until November 1905 when Einstein pioneered the
Special Theory of Relativity. Relativistic mechanics undermines the Newtonian concepts of absoluteness of
time that is inherent to Newton’s formulation, as well as when recast in the Lagrangian and Hamiltonian
formulations of classical mechanics. Relativistic mechanics has had a profound impact on twentieth-century
physics and the philosophy of science. Classical mechanics is an approximation of relativistic mechanics
that is valid for velocities much less than the velocity of light in vacuum. The term “relativity” refers to
the fact that physical measurements are always made relative to some chosen reference frame. Naively one
may think that the transformation between different reference frames is trivial and contains little underlying
physics. However, Einstein showed that the results of measurements depend on the choice of coordinate
system, which revolutionized our concept of space and time.
Einstein’s work on relativistic mechanics comprised two major advances. The first advance is the 1905
Special Theory of Relativity which refers to nonaccelerating frames of reference. The second major advance
was the 1916 General Theory of Relativity which considers accelerating frames of reference and their relation
to gravity. The Special Theory is a limiting case of the General Theory of Relativity. The mathematically
complex General Theory of Relativity is required for describing accelerating frames, gravity, plus related
topics like Black Holes, or extremely accurate time measurements inherent to the Global Positioning System.
The present discussion will focus primarily on the mathematically simple Special Theory of Relativity since it
encompasses most of the physics encountered in atomic, nuclear and high energy physics. This chapter uses
the basic concepts of the Special Theory of Relativity to investigate the implications of extending Newtonian,
Lagrangian and Hamiltonian formulations of classical mechanics into the relativistic domain. The Lorentz-
invariant extended Hamiltonian and Lagrangian formalisms are introduced since they are applicable to the
Special Theory of Relativity. The General Theory of Relativity incorporates the gravitational force as a
geodesic phenomenon in a four-dimensional Riemannian structure based on space, time, and matter. A
superficial outline is given to the fundamental concepts and evidence that underlie the General Theory of
Relativity.

17.2 Galilean Invariance

As discussed in chapter 23, an inertial frame is one in which Newton’s Laws of motion apply. Inertial frames
are non-accelerating frames so that pseudo forces are not induced. All reference frames moving at constant
velocity relative to an inertial reference, are inertial frames. Newton’s Laws of nature are the same in all
inertial frames of reference and therefore there is no way of determining absolute motion because no inertial
frame is preferred over any other. This is called Galilean-Newtonian invariance. Galilean invariance assumes
that the concepts of space and time are completely separable. Time is assumed to be an absolute quantity
that is invariant to transformations between coordinate systems in relative motion. Also the element of
length is the same in diﬀerent Galilean frames of reference.


Consider two coordinate systems shown in figure 171, where the primed frame is moving along the 
axis of the fixed unprimed frame. A Galilean transformation implies that the following relations apply;

01 = 1 −  (17.1)
02 = 2
03 = 3
0 = 

Note that at any instant $t$ the infinitesimal units of length in the two systems are identical since

$$ds^2 = \sum_{i=1}^{3} dx_i^2 = \sum_{i=1}^{3} dx_i^{\prime\,2} = ds^{\prime\,2} \tag{17.2}$$

These are the mathematical expressions of the Newtonian idea of space and time. An immediate consequence
of the Galilean transformation is that the velocity of light must differ in different inertial reference frames.

Figure 17.1: Motion of the primed frame along the $x_1$ axis with velocity $v$ relative to the parallel unprimed frame.

At the end of the 19th century physicists thought they had discovered a way of identifying an absolute
inertial frame of reference, that is, it must be the frame of the medium that transmits light in vacuum.
Maxwell’s laws of electromagnetism predict that electromagnetic radiation in vacuum travels at
$c = \frac{1}{\sqrt{\mu_0 \varepsilon_0}} = 2.998 \times 10^{8}$ m/s. Maxwell did not address the frame of reference in which
this speed applied. In the nineteenth century all known wave phenomena were transmitted by some medium, such as
waves on a string, water waves, and sound waves in air. Physicists thus envisioned that light was transmitted by
some unobserved medium which they called the ether. This ether had mystical properties: it existed everywhere,
even in outer space, and yet had no other observed consequences. The ether obviously should be the absolute
frame of reference.
In the 1880’s, Michelson and Morley performed an experiment in Cleveland to try to detect this ether.
They transmitted light back and forth along two perpendicular paths in an interferometer, shown in
figure 17.2, and assumed that the earth’s motion about the sun led to movement through the ether.

Figure 17.2: The Michelson interferometer used for the Michelson-Morley experiment. Interference of the two
beams of coherent light leads to fringes that depend on the differences in phase along the two paths.
(Figure labels: light source, semi-transparent mirror, and two mirrors, each at a distance $L$.)

The time taken to travel a return trip is longer in a moving medium, if the medium moves in the direction
of the motion, compared to travel in a stationary medium. For example, you lose more time moving against
a headwind than you gain travelling back with the wind. The time difference $\Delta t$ for a round trip to
a distance $L$, between travelling in the direction of motion in the ether, versus travelling the same
distance perpendicular to the movement in the ether, is given by
$\Delta t \approx \frac{L}{c}\left(\frac{v}{c}\right)^2$ where $v$ is the relative velocity of the ether and $c$
is the velocity of light.
Interference fringes between perpendicular light beams in an optical interferometer provide an extremely
sensitive measure of this time difference. Michelson and Morley observed no measurable time difference at
any time during the year, that is, the relative motion of the earth within the ether is less than
1/6 the velocity of the earth around the sun. Their conclusion was that either the ether was dragged along
with the earth, or the velocity of light was dependent on the velocity of the source, but these did not jibe
with other observations. The failure of this experiment to detect evidence for an absolute inertial frame
is important; it confounded physicists for two decades until Einstein’s Special Theory
of Relativity explained the result.
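The size of the effect that Michelson and Morley were looking for can be estimated numerically. The following Python sketch compares the exact round-trip time difference expected for a classical ether with the leading-order estimate $(L/c)(v/c)^2$. The arm length and the orbital speed of the earth used here are assumed illustrative values; they are not taken from the text.

```python
import math

C = 2.998e8  # speed of light in m/s

def round_trip_parallel(L, v, c=C):
    """Round-trip time along the direction of motion through a classical ether."""
    return L / (c - v) + L / (c + v)

def round_trip_perpendicular(L, v, c=C):
    """Round-trip time perpendicular to the motion through a classical ether."""
    return 2.0 * L / math.sqrt(c**2 - v**2)

def delta_t_leading_order(L, v, c=C):
    """Leading-order time difference (L/c)(v/c)^2."""
    return (L / c) * (v / c) ** 2

# Assumed illustrative values (hypothetical, not from the text)
L_ARM = 11.0      # effective arm length in metres
V_ORBIT = 3.0e4   # orbital speed of the earth in m/s

exact = round_trip_parallel(L_ARM, V_ORBIT) - round_trip_perpendicular(L_ARM, V_ORBIT)
approx = delta_t_leading_order(L_ARM, V_ORBIT)
print(f"exact difference  = {exact:.3e} s")
print(f"leading-order est = {approx:.3e} s")
```

The difference is a tiny fraction of a femtosecond for these assumed values, which is why an interferometric measurement is needed.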

17.3 Special Theory of Relativity

17.3.1 Einstein Postulates
In September 1905, at the age of 26, Einstein published a seminal paper entitled “On the electrodynamics of
moving bodies”. He considered the relation between space and time in inertial frames of reference that are
in relative motion. In this paper he made the following postulates.
1) The laws of nature are the same in all inertial frames of reference.
2) The velocity of light in vacuum is the same in all inertial frames of reference.
Note that Einstein’s first postulate, coupled with Maxwell’s equations, leads to the statement that the
velocity of light in vacuum is a universal constant. Thus the second postulate is unnecessary since it is an
obvious consequence of the first postulate plus Maxwell’s equations which are basic laws of physics. This
second postulate explained the null result of the Michelson-Morley experiment. However, it was not this
experimental result that led Einstein to the theory of special relativity; he deduced the Special Theory of
Relativity from consideration of Maxwell’s equations of electromagnetism. Although Einstein’s postulates
appear reasonable, they lead to the following surprising implications.

17.3.2 Lorentz transformation

Galilean invariance leads to violation of the Einstein postulate that the velocity of light is a universal con-
stant in all frames of reference. It is necessary to assume a new transformation law that renders physical
laws relativistically invariant. Maxwell’s equations are relativistically invariant, which led to some electro-
magnetic phenomena that could not be explained using Galilean invariance. In 1904 Lorentz proposed a new
transformation to replace the Galilean transformation in order to explain such electromagnetic phenomena.
Einstein’s genius was that he derived the transformation, that had been proposed by Lorentz, directly from
the postulates of the Special Theory of Relativity. The Lorentz transformation satisfies Einstein’s theory of
relativity, and has been confirmed to be correct by many experiments.
For the geometry shown in figure 171, the Lorentz transformations are:
0 =  ( − ) (17.3)
0 = 
0 = ³
 ´
0 =  − 2

where the Lorentz  factor
1
≡q ¡ ¢2 (17.4)
1 − 
The inverse transformations are

$$\begin{aligned}
x &= \gamma (x' + vt') \\
y &= y' \\
z &= z' \\
t &= \gamma \left( t' + \frac{vx'}{c^2} \right)
\end{aligned} \tag{17.5}$$

The Lorentz $\gamma$ factor, defined above, is the key feature differentiating the Lorentz transformations
from the Galilean transformation. Note that $\gamma \geq 1$; also $\gamma \to 1.0$ as $v \to 0$ and $\gamma$
increases to infinity as $v/c \to 1$, as illustrated in figure 17.3. A useful fact that will be used later is
that for $v/c \ll 1$

$$\gamma \to 1 + \frac{1}{2}\left(\frac{v}{c}\right)^2 \qquad \text{(limit for } v \ll c \text{)}$$

Figure 17.3: The dependence of the Lorentz $\gamma$ factor on $v/c$.

Note that for $v \ll c$ then $\gamma \approx 1$ and the Lorentz transformation reduces to the Galilean
transformation.
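The behaviour of the Lorentz factor can be checked numerically. The following Python sketch evaluates equation 17.4 and the low-velocity expansion $1 + \frac{1}{2}(v/c)^2$ for a few values of $\beta = v/c$. The function names are illustrative choices, not notation from the text.

```python
import math

def lorentz_gamma(beta):
    """Lorentz factor for beta = v/c, following equation 17.4."""
    if not abs(beta) < 1.0:
        raise ValueError("beta must satisfy |beta| < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)

def gamma_low_velocity(beta):
    """Leading terms of the expansion of gamma, valid for beta << 1."""
    return 1.0 + 0.5 * beta * beta

for beta in (0.0, 0.01, 0.1, 0.5, 0.9, 0.99):
    print(f"beta = {beta:4.2f}  gamma = {lorentz_gamma(beta):8.5f}  "
          f"expansion = {gamma_low_velocity(beta):8.5f}")
```

The two columns agree closely at small $\beta$ and diverge as $\beta \to 1$, where the exact factor grows without bound.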

Figure 17.4: The observer and mirror are at rest in the left-hand frame (a). The light beam takes a time
$\Delta t = \frac{d}{c}$ to travel to the mirror. In the right-hand frame (b) the source and mirror are travelling at a velocity
$v$ relative to the observer. The light travels further in the right-hand frame of reference (b) than in the
stationary frame (a). Since Einstein states that the velocity of light is the same in both frames of reference
then the time interval must be larger in frame (b) since the light travels further than in (a).

17.3.3 Time Dilation

Consider that a clock is fixed at $x'_0$ in a moving frame and measures the time interval between two events
in the moving frame, i.e. $\Delta t' = t'_2 - t'_1$. According to the Lorentz transformation, the times in the fixed
frame are given by:

$$\begin{aligned}
t_1 &= \gamma \left( t'_1 + \frac{v x'_0}{c^2} \right) \\
t_2 &= \gamma \left( t'_2 + \frac{v x'_0}{c^2} \right)
\end{aligned} \tag{17.6}$$

Thus the time interval is given by:

$$t_2 - t_1 = \gamma (t'_2 - t'_1) \tag{17.7}$$

The time between events in the rest frame of the clock, $\Delta\tau \equiv \Delta t'$, is called the proper time, which
always is the shortest time interval measured between a given pair of events, and is represented by the symbol $\tau$. That is

$$\Delta t = \gamma \Delta t' = \gamma \Delta\tau \tag{17.8}$$

Note that any other frame of reference, moving with respect to the clock frame, will measure a larger time
interval because $\gamma \geq 1.0$, which implies that the fixed frame perceives that the moving
clock is slow by the factor $\gamma$.
The plausibility of this time dilation can be understood by looking at the simple geometry of the space
ship example shown in figure 17.4. Pretend that the clock in the proper frame of the space ship is based on
the time for the light to travel to and from the mirror in the space ship. In this proper frame the light has
the shortest distance to travel, and the proper transit time is

$$\Delta\tau = \frac{2d}{c} \tag{17.9}$$

In the fixed frame, the component of velocity in the direction of the mirror is $\sqrt{c^2 - v^2}$ using the Pythagoras
theorem, assuming that the light cannot travel faster than $c$. Thus the transit time towards and back from
the mirror must be

$$\Delta t = \frac{2d}{c\sqrt{1 - \left(\frac{v}{c}\right)^2}} = \gamma \Delta\tau \tag{17.10}$$

which is the predicted time dilation.
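The light-clock argument can be verified with a few lines of arithmetic. The Python sketch below computes the round-trip transit time in the clock frame (equation 17.9) and in the fixed frame (equation 17.10) and confirms that their ratio equals $\gamma$. The values of $d$ and $v$ are assumed for illustration, in units where $c = 1$.

```python
import math

def proper_transit_time(d, c=1.0):
    """Round-trip time 2d/c in the rest frame of the clock (equation 17.9)."""
    return 2.0 * d / c

def fixed_frame_transit_time(d, v, c=1.0):
    """Round-trip time seen from the fixed frame (equation 17.10).

    The component of the light's velocity toward the mirror is sqrt(c^2 - v^2).
    """
    return 2.0 * d / math.sqrt(c * c - v * v)

# Assumed illustrative values in units where c = 1
d, v = 1.0, 0.8
gamma = 1.0 / math.sqrt(1.0 - v * v)
ratio = fixed_frame_transit_time(d, v) / proper_transit_time(d)
print(f"ratio of transit times = {ratio:.6f}, gamma = {gamma:.6f}")
```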

There are many experimental verifications of time dilation in physics. For example, a stationary muon
has a mean lifetime of $\tau_\mu = 2$ $\mu$sec, whereas the lifetime of a fast moving muon, produced in the upper
atmosphere by high-energy cosmic rays, was observed in 1941 to be longer and given by $\gamma\tau_\mu$ as described in
example 17.1. In 1972 Hafele and Keating used four accurate cesium atomic clocks to confirm time dilation.
Two clocks were flown on regularly scheduled airlines travelling around the World, one westward and the
other eastward. The other two clocks were used for reference. The westward moving clock gained
$(273 \pm 7)$ nsec compared to the predicted value of $(275 \pm 10)$ nsec. The Global Positioning System of 24
orbiting satellites is used for locating positions to within a few meters. It has an accuracy of a few
nanoseconds which requires allowance for time dilation and is a daily tribute to the correctness of Einstein’s
Theory of Relativity.

17.3.4 Length Contraction

The Lorentz transformation leads to a contraction of the apparent length of an object in a moving frame
as seen from a fixed frame. The length of a ruler in its own frame of reference is called the proper length.
Consider an accurately measured rod of known proper length $L_p = x'_2 - x'_1$ that is at rest in the moving
primed frame. The locations of both ends of this rod are measured at a given time in the stationary frame,
$t_1 = t_2$, by taking a photograph of the moving rod. The corresponding locations in the moving frame are:

$$\begin{aligned}
x'_2 &= \gamma (x_2 - v t_2) \\
x'_1 &= \gamma (x_1 - v t_1)
\end{aligned} \tag{17.11}$$

Since $t_2 = t_1$, the measured lengths in the two frames are related by:

$$x'_2 - x'_1 = \gamma (x_2 - x_1) \tag{17.12}$$

That is, the lengths are related by:

$$L = \frac{1}{\gamma} L_p \tag{17.13}$$

Note that the moving rod appears shorter in the direction of motion. As $v \to c$ the apparent length
shrinks to zero in the direction of motion while the dimensions perpendicular to the direction of motion are
unchanged. This is called the Lorentz contraction. If you could ride your bicycle at close to the speed of
light, you would observe that stationary cars, buildings, people, all would appear to be squeezed thin along
the direction that you are travelling. Also objects that are further away down any side street would be
distorted in the direction of travel. A photograph taken by a stationary observer would show the moving
bicycle to be Lorentz contracted along the direction of travel and the stationary objects would be normal.

17.3.5 Simultaneity
The Lorentz transformations imply a new philosophy of space and time. A surprising consequence is that
the concept of simultaneity is frame dependent in contrast to the prediction of Newtonian mechanics.
Consider that two events occur in frame $S$ at $(x_1, t_1)$ and $(x_2, t_2)$. In frame $S'$ these two events occur at
$(x'_1, t'_1)$ and $(x'_2, t'_2)$. From the Lorentz transformation the time difference is

$$t'_2 - t'_1 = \gamma \left[ (t_2 - t_1) - \frac{v (x_2 - x_1)}{c^2} \right] \tag{17.14}$$

If two events are simultaneous in frame $S$, that is $(t_2 - t_1) = 0$, then

$$t'_2 - t'_1 = \gamma \left[ \frac{v (x_1 - x_2)}{c^2} \right] \tag{17.15}$$

Thus the events are not simultaneous in frame $S'$ if $(x_2 - x_1) = L \neq 0$. That is, two events that are simultaneous
in one frame are not simultaneous in the other frame if the events are spatially separated. The equivalent
statement is that two clocks, spatially separated by a distance $L$, which are synchronized in their rest
frame, are not synchronized in a moving frame.
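The loss of simultaneity can be illustrated by transforming two events explicitly. The Python sketch below applies the Lorentz transformation of equation 17.3 to two events that are simultaneous in the unprimed frame but separated by a distance $L$. The numerical values and the use of units with $c = 1$ are assumptions made for illustration.

```python
import math

def lorentz_transform(x, t, v, c=1.0):
    """Transform an event (x, t) in S to (x', t') in S' using equation 17.3."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c**2)

# Assumed illustrative values in units where c = 1
v = 0.6      # relative velocity of the primed frame
L = 10.0     # spatial separation of the two events in S

x1p, t1p = lorentz_transform(0.0, 0.0, v)   # event 1 at x = 0, t = 0
x2p, t2p = lorentz_transform(L, 0.0, v)     # event 2 at x = L, t = 0

gamma = 1.0 / math.sqrt(1.0 - v * v)
print(f"t'_2 - t'_1 = {t2p - t1p:.4f}  (equation 17.15 gives {-gamma * v * L:.4f})")
print(f"x'_2 - x'_1 = {x2p - x1p:.4f}  (equation 17.12 gives {gamma * L:.4f})")
```

The same two transformed events also reproduce the length relation of equation 17.12, since the endpoints were located simultaneously in the unprimed frame.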

Figure 17.5: If lightning strikes the front and rear of the carriage simultaneously, according to the man in
the fixed frame, then the woman in the moving frame sees the flash from the front first since she is moving
towards that approaching wavefront during the transit time of the light. Thus if the length of the carriage
in the stationary frame is $(x_2 - x_1) = L$ then the time difference is $\Delta t' = \gamma \frac{vL}{c^2}$.

Einstein discussed the example shown in figure 17.5, where lightning strikes both ends of a train simultaneously
in the stationary earth frame of reference. A woman on the train will see that the strikes are
not simultaneous since the wavefront from the front of the carriage will be seen first because she is moving
forward during the time the light from the two lightning flashes is travelling towards her. As a consequence
she observes that the two lightning flashes are not simultaneous. This explains why measurement of the
length of a moving rod, performed by simultaneously locating both ends in the fixed frame, implies that the
measurement occurs at different times for the two ends in the moving frame, resulting in a shorter apparent
length. The lack of simultaneity explains the apparent inconsistency that the moving bicyclist
sees the stationary street block to be length contracted while, in contrast, a pedestrian sees that
the bicycle is length contracted.
The time ordering of spatially separated events that are simultaneous in one frame is frame dependent, since
$(t'_2 - t'_1)$ in equation 17.15 can be either positive or negative depending on the sign of $v(x_1 - x_2)$.
A consequence of the lack of simultaneity is that the image
shown by a photograph of a rapidly moving object is not a true representation of the moving object. Not
only is the body contracted in the direction of travel, but also it appears distorted because light arriving from
the far side of the body had to be emitted earlier, that is, when the body was at an earlier location, in order
to reach the observer simultaneously with light from the near side. The relativistic snake paradox, addressed
in Chapter 17 workshop exercise 1, is an example of the role of simultaneity in relativistic mechanics.

17.1 Example: Muon lifetime

Many people have trouble comprehending time dilation and Lorentz contraction predicted by the Special
Theory of Relativity. The predictions appear to be crazy, but there are many examples where time dilation
and Lorentz contraction are observed experimentally such as the decay in flight of the muon. At rest, the
muon decays with a mean lifetime of 2 $\mu$sec. Muons are created high in the atmosphere due to cosmic ray
bombardment. A typical muon travels at $v = 0.998c$ which corresponds to $\gamma \approx 15$. Time dilation implies
that the lifetime of the moving muon in the earth’s frame of reference is 30 $\mu$s. The speed of the muon is
essentially $c$ in both frames of reference, and it would travel 600 m in 2 $\mu$s and 9000 m in 30 $\mu$s. In fact,
it is observed that the muon does travel, on average, 9000 m in the earth frame of reference before decaying.
Is this inconsistent with the view of someone travelling with the muon? In the muon’s moving frame, the
lifetime is only 2 $\mu$s, but the Lorentz contraction of distance means that 9000 m in the earth frame appears
to be only 600 m in the muon moving frame; a distance it travels in 2 $\mu$sec. Thus in both frames of reference
we have consistent explanations, that is, the muon travels the height of the mountain in one lifetime.
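The arithmetic of this example can be reproduced directly. The Python sketch below uses the rest lifetime and speed quoted above; note that it evaluates $\gamma$ exactly from $v = 0.998c$, so the printed numbers differ slightly from the rounded values used in the text.

```python
import math

C = 2.998e8        # speed of light in m/s
TAU_REST = 2.0e-6  # muon mean lifetime at rest in s, as quoted in the text
BETA = 0.998       # muon speed as a fraction of c, as quoted in the text

gamma = 1.0 / math.sqrt(1.0 - BETA**2)
lifetime_earth = gamma * TAU_REST             # dilated lifetime in the earth frame
distance_earth = BETA * C * lifetime_earth    # distance travelled in the earth frame
distance_muon = distance_earth / gamma        # same path, Lorentz contracted in the muon frame

print(f"gamma                      = {gamma:.2f}")
print(f"lifetime in earth frame    = {lifetime_earth * 1e6:.1f} microseconds")
print(f"distance in earth frame    = {distance_earth:.0f} m")
print(f"distance in muon frame     = {distance_muon:.0f} m")
print(f"speed x rest lifetime      = {BETA * C * TAU_REST:.0f} m")
```

The last two lines agree, which is the consistency between the two frames that the example describes.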

17.2 Example: Relativistic Doppler Eﬀect

The relativistic Doppler effect is encountered frequently in physics and astronomy. Consider monochromatic
electromagnetic radiation from a source, such as a star, that is moving towards the detector at a
velocity $v$. During the time $\Delta t$ in the frame of the receiver, the source emits $n$ cycles of the sinusoidal
waveform. Thus the length of this waveform, as seen by the receiver, is $n\lambda$ which equals

$$n\lambda = (c - v)\Delta t$$

The frequency as measured by the receiver is

$$\nu = \frac{c}{\lambda} = \frac{cn}{(c - v)\Delta t}$$

According to the source, it emits $n$ waves of frequency $\nu_0$ during the proper time interval $\Delta t'$, that is

$$n = \nu_0 \Delta t'$$

This proper time interval $\Delta t'$, in the source frame, corresponds to a time interval $\Delta t$ in the receiver frame
where

$$\Delta t = \gamma \Delta t'$$

Thus the frequency measured by the receiver is

$$\nu = \frac{1}{\left(1 - \frac{v}{c}\right)} \frac{\nu_0}{\gamma} = \nu_0 \frac{\sqrt{1 - \left(\frac{v}{c}\right)^2}}{\left(1 - \frac{v}{c}\right)} = \nu_0 \sqrt{\frac{1 + \beta}{1 - \beta}}$$

where $\beta \equiv \frac{v}{c}$. This formula for source and receiver approaching each other also gives the correct answer for
source and receiver receding if the sign of $\beta$ is changed.
This relativistic Doppler effect accounts for the red shift observed for light emitted by receding stars and
galaxies, as well as many examples in atomic and nuclear physics involving moving sources of electromagnetic
radiation.
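As a numerical check of the derivation above, the following Python sketch evaluates the Doppler-shifted frequency in two ways: from the intermediate expression $\nu_0 / [\gamma (1 - \beta)]$ and from the final closed form. The function names and the sample value of $\beta$ are illustrative assumptions.

```python
import math

def doppler_closed_form(nu0, beta):
    """Received frequency for an approaching source; use beta < 0 for a receding source."""
    return nu0 * math.sqrt((1.0 + beta) / (1.0 - beta))

def doppler_from_time_dilation(nu0, beta):
    """Same quantity from the intermediate step nu0 / (gamma (1 - beta))."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return nu0 / (gamma * (1.0 - beta))

nu0 = 1.0  # source frequency in arbitrary units
for beta in (0.6, -0.6):
    print(f"beta = {beta:+.1f}: closed form = {doppler_closed_form(nu0, beta):.4f}, "
          f"from time dilation = {doppler_from_time_dilation(nu0, beta):.4f}")
```

For $\beta = 0.6$ the frequency is doubled for an approaching source and halved for a receding one.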

17.3 Example: Twin paradox

A problem that troubled physicists for many years is called the twin paradox. Consider two identical
twins, Jack and Jill. Assume that Jill travels in a space ship at a speed corresponding to $\gamma = 4$ for 20 years, as measured
by Jack’s clock, and then returns taking another 20 years, according to Jack. Thus, Jack has aged 40 years
by the time his twin sister returns home. However, Jill’s clock measures $20/4 = 5$ years for each half of the
trip so that she thinks she travelled for 10 years total time according to her clock. Thus she has aged only 10
years on the trip, that is, now she is 30 years younger than her twin brother. Note that, according to Jill, the
distance she travelled out and back was 1/4 the distance according to Jack, so she perceives no inconsistency
between her clock and the speed of the space ship. This was called a paradox because some people claimed that
Jill will perceive that the earth and Jack moved away at the same relative speed in the opposite direction and
thus according to Jill, Jack should be 30 years younger, not her. Moreover, some claimed that this problem
is symmetric and therefore both twins must still be the same age since there is no way of telling who was
moving away from whom. This argument is incorrect because Jill was able to sense that she accelerated to
$\gamma = 4$ which destroys the symmetry argument. The effect is observed with accelerated beams of unstable
particles such as the muon and was confirmed by the results of the experiment where cesium atomic clocks were
flown around the Earth. Thus the Twin paradox is not a paradox; the fact is that Jill will be younger than
her twin brother.
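The numbers in this example follow from $\gamma = 4$ alone. The Python sketch below recovers Jill's speed from $\gamma$ and tabulates the elapsed times and distances in both frames. Distances are expressed in light-years, which is a choice of units made here for convenience.

```python
import math

GAMMA = 4.0                # Jill's Lorentz factor, from the text
JACK_YEARS_PER_LEG = 20.0  # duration of each leg on Jack's clock, from the text

beta = math.sqrt(1.0 - 1.0 / GAMMA**2)         # Jill's speed as a fraction of c
jill_years_per_leg = JACK_YEARS_PER_LEG / GAMMA
jack_total = 2.0 * JACK_YEARS_PER_LEG
jill_total = 2.0 * jill_years_per_leg
distance_jack = beta * JACK_YEARS_PER_LEG      # one-way distance in light-years, Jack's frame
distance_jill = distance_jack / GAMMA          # Lorentz-contracted distance in Jill's frame

print(f"Jill's speed            = {beta:.4f} c")
print(f"Jack ages {jack_total:.0f} years, Jill ages {jill_total:.0f} years")
print(f"one-way distance: {distance_jack:.2f} ly (Jack), {distance_jill:.2f} ly (Jill)")
print(f"Jill's distance / Jill's time = {distance_jill / jill_years_per_leg:.4f} c")
```

The last line shows that Jill's contracted distance divided by her own elapsed time gives the same speed, so she sees no inconsistency.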

17.4 Relativistic kinematics

17.4.1 Velocity transformations

Consider the two parallel coordinate frames with the primed frame moving at a velocity $v$ along the $x'_1$ axis
as shown in figure 17.1. Velocities of an object measured in both frames are defined to be

$$\begin{aligned}
u_i &= \frac{dx_i}{dt} \\
u'_i &= \frac{dx'_i}{dt'}
\end{aligned} \tag{17.16}$$

Using the Lorentz transformations 17.3, 17.5 between the two frames moving with relative velocity $v$ along
the $x_1$ axis, gives that the velocity along the $x'_1$ axis is

$$u'_1 = \frac{dx'_1}{dt'} = \frac{dx_1 - v\,dt}{dt - \frac{v}{c^2} dx_1} = \frac{u_1 - v}{1 - \frac{u_1 v}{c^2}} \tag{17.17}$$

Similarly we get the velocities along the perpendicular $x'_2$ and $x'_3$ axes to be

$$\begin{aligned}
u'_2 &= \frac{dx'_2}{dt'} = \frac{u_2}{\gamma \left(1 - \frac{u_1 v}{c^2}\right)} \\
u'_3 &= \frac{dx'_3}{dt'} = \frac{u_3}{\gamma \left(1 - \frac{u_1 v}{c^2}\right)}
\end{aligned} \tag{17.18}$$

When $\frac{u_1 v}{c^2} \to 0$ these velocity transformations become the usual Galilean relations for velocity addition.
Do not confuse $\mathbf{u}$ and $\mathbf{u}'$ with $\mathbf{v}$; that is, $\mathbf{u}$ and $\mathbf{u}'$ are the velocities of some object measured in the unprimed
and primed frames of reference respectively, whereas $\mathbf{v}$ is the relative velocity of the origin of one frame with
respect to the origin of the other frame.
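The velocity transformations 17.17 and 17.18 can be explored numerically. The Python sketch below implements them for a boost along the $x_1$ axis and checks two properties: light retains speed $c$ in the primed frame, and the Galilean result is recovered at low speeds. Units with $c = 1$ and the sample velocities are assumptions made for illustration.

```python
import math

def transform_velocity(u, v, c=1.0):
    """Transform velocity components (u1, u2, u3) to a frame moving at v along x1.

    Implements equations 17.17 and 17.18.
    """
    u1, u2, u3 = u
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    denom = 1.0 - u1 * v / c**2
    return ((u1 - v) / denom, u2 / (gamma * denom), u3 / (gamma * denom))

# A light ray travelling along x2 in S, seen from S' moving at 0.6c along x1
up = transform_velocity((0.0, 1.0, 0.0), 0.6)
speed = math.sqrt(sum(component**2 for component in up))
print(f"u' = {up}, |u'| = {speed:.6f}")

# Low-speed limit: the result approaches the Galilean value u1 - v
slow = transform_velocity((1.0e-4, 0.0, 0.0), 2.0e-4)
print(f"slow u'_1 = {slow[0]:.6e} (Galilean value {1.0e-4 - 2.0e-4:.6e})")
```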

17.4.2 Momentum
Using the classical definition of momentum, that is $\mathbf{p} = m\mathbf{u}$, the linear momentum is not conserved using the
above relativistic velocity transformations if the mass $m$ is a scalar quantity. This problem originates from
the fact that both $\mathbf{x}$ and $t$ have non-trivial transformations and thus $\mathbf{u} = \frac{d\mathbf{x}}{dt}$ is frame dependent.
Linear momentum conservation can be retained by redefining momentum in a form that is identical in
all frames of reference, that is by referring to the proper time $\tau$ as measured in the rest frame of the moving
object. Therefore we define relativistic linear momentum as

$$\mathbf{p} \equiv m \frac{d\mathbf{x}}{d\tau} = m \frac{d\mathbf{x}}{dt} \frac{dt}{d\tau} \tag{17.19}$$

But we know the time dilation relation

$$dt = \frac{d\tau}{\sqrt{1 - \frac{u^2}{c^2}}} = \gamma_u \, d\tau \tag{17.20}$$

Note that the $\gamma_u$ in this relation refers to the velocity $u$ between the moving object and the frame; this is
quite different from the $\gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}}$ which refers to the transformation between the two frames of reference.
Thus the new relativistic definition of momentum is

$$\mathbf{p} \equiv m \frac{d\mathbf{x}}{d\tau} = \gamma_u m \frac{d\mathbf{x}}{dt} = \gamma_u m \mathbf{u} \tag{17.21}$$

The relativistic definition of linear momentum is the same as the classical definition with the rest mass
$m$ replaced by the relativistic mass $\gamma_u m$.$^{1}$

$^{1}$ Note that, until recently, the rest mass was denoted by $m_0$ and the relativistic mass was referred to as $m$. Modern texts
denote the rest mass by $m$ and the relativistic mass by $\gamma m$. This book follows the modern nomenclature for rest mass to avoid
confusion.

17.4.3 Center of momentum coordinate system

The classical relations for handling the kinematics of colliding objects carry over to special relativity when the
relativistic definition of linear momentum, equation 17.21, is assumed. That is, one can continue to apply
conservation of linear momentum. However, there is one important conceptual difference for relativistic
dynamics in that the center of mass no longer is a meaningful concept due to the interrelation of mass
and energy. However, this problem is eliminated by considering the center of momentum coordinate system
which, as in the non-relativistic case, is the frame where the total linear momentum of the system is zero.
Using the concept of center of momentum incorporates the formalism of classical non-relativistic kinematics.

17.4.4 Force
Newton’s second law $\mathbf{F} = \frac{d\mathbf{p}}{dt}$ is covariant under a Galilean transformation. In special relativity this definition
also applies using the relativistic definition of momentum $\mathbf{p}$. The fact that the relativistic momentum $\mathbf{p}$
is conserved in the force-free situation leads naturally to defining the force as

$$\mathbf{F} = \frac{d\mathbf{p}}{dt} \tag{17.22}$$

Then the relativistic momentum is conserved if $\mathbf{F} = 0$.

17.4.5 Energy
The classical definition of work done is

$$W_{12} = \int_1^2 \mathbf{F} \cdot d\mathbf{r} = T_2 - T_1 \tag{17.23}$$

Assuming $T_1 = 0$, letting $d\mathbf{r} = \mathbf{u}\,dt$, and inserting the relativistic force relation in equation 17.23, gives

$$W = T = \int_0^t \frac{d}{dt}(\gamma_u m \mathbf{u}) \cdot \mathbf{u} \, dt = m \int_0^u u \, d(\gamma_u u) \tag{17.24}$$

Integrating by parts, followed by algebraic manipulation, gives

$$\begin{aligned}
T &= \gamma_u m u^2 - m \int_0^u \frac{u \, du}{\sqrt{1 - \frac{u^2}{c^2}}} = \gamma_u m u^2 + mc^2 \sqrt{1 - \frac{u^2}{c^2}} - mc^2 \\
&= \frac{m u^2}{\sqrt{1 - \frac{u^2}{c^2}}} + \frac{mc^2}{\sqrt{1 - \frac{u^2}{c^2}}} \left(1 - \frac{u^2}{c^2}\right) - mc^2 = mc^2 (\gamma_u - 1)
\end{aligned} \tag{17.25}$$

Define the rest energy $E_0$

$$E_0 \equiv mc^2 \tag{17.26}$$

and total relativistic energy $E$

$$E \equiv \gamma_u mc^2 \tag{17.27}$$

then equation 17.25 can be written as

$$E = T + E_0 = \gamma_u mc^2 \tag{17.28}$$

This is the famous Einstein relativistic energy that relates the equivalence of mass and energy. The total
relativistic energy $E$ is a conserved quantity in nature. It is an extension of the conservation of energy, and
manifestations of the equivalence of energy and mass occur extensively in the real world.
In nuclear physics we often convert mass to energy and back again to mass. For example, gamma
rays with energies greater than 1.022 MeV, which are pure electromagnetic energy, can be converted to an
electron plus positron both of which have rest mass. The positron can then annihilate a different electron in
another atom resulting in emission of two 511 keV gamma rays in back to back directions to conserve linear
momentum. A dramatic example of Einstein’s equation is a nuclear reactor. One gram of material, the mass
of a paper clip, provides $E = 9 \times 10^{13}$ joules. This is the daily output of a 1 GW nuclear power station or
the explosive power of the Nagasaki or Hiroshima bombs.
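The figures quoted for one gram of material can be checked with simple arithmetic. The Python sketch below evaluates $E = mc^2$ for a mass of one gram and compares it with the energy delivered in one day by a station running at a constant 1 GW; the assumption of constant output over exactly 24 hours is made here for illustration.

```python
C = 2.998e8              # speed of light in m/s
MASS = 1.0e-3            # one gram expressed in kg
POWER = 1.0e9            # 1 GW expressed in watts
SECONDS_PER_DAY = 24 * 60 * 60

rest_energy = MASS * C**2                       # rest energy in joules
station_daily_output = POWER * SECONDS_PER_DAY  # energy delivered in one day, in joules

print(f"rest energy of one gram  = {rest_energy:.3e} J")
print(f"1 GW for one day         = {station_daily_output:.3e} J")
print(f"ratio                    = {rest_energy / station_daily_output:.2f}")
```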
As the velocity of a particle $u$ approaches $c$ then $\gamma_u$ and the relativistic mass $\gamma_u m$ both approach infinity.
This means that the force needed to accelerate the mass also approaches infinity, and thus no particle can
exceed the velocity of light. The energy continues to increase not by increasing the velocity but by increase
of the relativistic mass. Although the relativistic relation for kinetic energy is quite different from the
Newtonian relation, the Newtonian form is obtained for the case of $u \ll c$ in that

$$T = mc^2 \left(1 - \frac{u^2}{c^2}\right)^{-\frac{1}{2}} - mc^2 = mc^2 \left(1 + \frac{1}{2} \frac{u^2}{c^2} + \cdots \right) - mc^2 \approx \frac{1}{2} m u^2 \tag{17.29}$$

An especially useful relativistic relation that can be derived from the above is

$$E^2 = p^2 c^2 + E_0^2 \tag{17.30}$$

This is useful because it provides a simple relation between total energy of a particle and its relativistic
linear momentum plus rest energy.
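Both results above can be checked numerically. The Python sketch below evaluates the relativistic momentum, total energy, and kinetic energy from equations 17.21, 17.27, and 17.25, then confirms equation 17.30 and the Newtonian limit of equation 17.29. Units with $m = c = 1$ and the sample speeds are assumptions made for illustration.

```python
import math

def gamma_u(u, c=1.0):
    """Lorentz factor for a particle moving at speed u."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)

def momentum(m, u, c=1.0):
    """Relativistic linear momentum, equation 17.21."""
    return gamma_u(u, c) * m * u

def total_energy(m, u, c=1.0):
    """Total relativistic energy, equation 17.27."""
    return gamma_u(u, c) * m * c**2

def kinetic_energy(m, u, c=1.0):
    """Relativistic kinetic energy, equation 17.25."""
    return (gamma_u(u, c) - 1.0) * m * c**2

m, c = 1.0, 1.0  # assumed units
u_fast = 0.6
E = total_energy(m, u_fast, c)
p = momentum(m, u_fast, c)
E0 = m * c**2
print(f"E^2 = {E**2:.6f},  p^2 c^2 + E0^2 = {(p * c)**2 + E0**2:.6f}")

u_slow = 1.0e-3
T_rel = kinetic_energy(m, u_slow, c)
T_newton = 0.5 * m * u_slow**2
print(f"T (relativistic) = {T_rel:.9e},  T (Newtonian) = {T_newton:.9e}")
```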

17.4 Example: Rocket propulsion

Consider a rocket, having initial mass $M$, that is accelerated in a straight line in free space by exhausting
propellant at a constant speed $v_p$ relative to the rocket. Let $u$ be the speed of the rocket relative to its initial
rest frame $S$ when its rest mass has decreased to $m$. At this instant the rocket is at rest in the inertial frame
$S'$. At a proper time $\tau + d\tau$ the rest mass is $m - dm$ and it has acquired a velocity increment $du'$ relative to
$S'$, and propellant of rest mass $dm_p$ has been expelled with velocity $v_p$ relative to $S'$. At proper time $\tau$ in $S'$
the rest energy is $mc^2$. At the time $\tau + d\tau$ energy conservation requires that

$$\gamma_{du'} (m - dm) c^2 + \gamma_{v_p} \, dm_p \, c^2 = mc^2$$

At the same instant, conservation of linear momentum requires

$$\gamma_{du'} (m - dm) \, du' - \gamma_{v_p} \, dm_p \, v_p = 0$$

To first order these two equations simplify to

$$\begin{aligned}
dm_p &= \sqrt{1 - \left(\frac{v_p}{c}\right)^2} \, dm \\
m \, du' &= \gamma_{v_p} v_p \, dm_p
\end{aligned}$$

Therefore

$$m \, du' = v_p \, dm \tag{$\alpha$}$$

The velocity increment $du'$ in frame $S'$ can be transformed back to frame $S$ using equation 17.5, that is

$$u + du = \frac{u + du'}{1 + \frac{u \, du'}{c^2}} \approx u + \left(1 - \left(\frac{u}{c}\right)^2\right) du' \tag{$\beta$}$$

Equations $\alpha$ and $\beta$ yield a differential equation for $u(m)$ of

$$\frac{du}{1 - \left(\frac{u}{c}\right)^2} = v_p \frac{dm}{m}$$

Here $dm$ denotes the decrease in rest mass, so the rest mass itself changes by $-dm$. Integrating the left-hand
side between 0 and $u$ and the right-hand side between $M$ and $m$ gives

$$\frac{c}{2} \ln \left( \frac{1 + u/c}{1 - u/c} \right) = -v_p \ln \left( \frac{m}{M} \right)$$

This reduces to

$$\frac{m}{M} = \left( \frac{1 - \frac{u}{c}}{1 + \frac{u}{c}} \right)^{\frac{c}{2 v_p}}$$

When $u/c \to 0$ this equation reduces to the non-relativistic answer given in equation 2.123.

17.5 Geometry of space-time

17.5.1 Four-dimensional space-time
In 1906 Poincaré showed that the Lorentz transformation can be regarded as a rotation in a 4-dimensional
Euclidean space-time introduced by adding an imaginary fourth space-time coordinate $ict$ to the three
real spatial coordinates. In 1908 Minkowski reformulated Einstein’s Special Theory of Relativity in this
4-dimensional Euclidean space-time vector space and concluded that the spatial variables $x_i$, where $(i = 1, 2, 3)$,
plus the time $x_0 = ict$ are equivalent variables and should be treated equally using a covariant representation
of both space and time. The idea of using an imaginary time axis $ict$ to make space-time Euclidean was
elegant, but it obscured the non-Euclidean nature of space-time as well as causing difficulties when generalized
to non-inertial accelerating frames in the General Theory of Relativity. As a consequence, the use of the
imaginary $ict$ has been abandoned in modern work. Minkowski developed an alternative non-Euclidean
metric that treats all four coordinates $(ct, x, y, z)$ as a four-dimensional Minkowski space with all coordinates
being real, and introduces the required minus sign explicitly.
Analogous to the usual 3-dimensional cartesian coordinates, the displacement four vector $d\mathbf{s}$ is defined
using the four components along the four unit vectors in either the unprimed or primed coordinate frames.

$$d\mathbf{s} = dx^0 \hat{\mathbf{e}}_0 + dx^1 \hat{\mathbf{e}}_1 + dx^2 \hat{\mathbf{e}}_2 + dx^3 \hat{\mathbf{e}}_3 = dx'^0 \hat{\mathbf{e}}'_0 + dx'^1 \hat{\mathbf{e}}'_1 + dx'^2 \hat{\mathbf{e}}'_2 + dx'^3 \hat{\mathbf{e}}'_3 \tag{17.31}$$

The convention used is that greek subscripts (covariant) or superscripts (contravariant) designate a four
vector with $0 \leq \mu \leq 3$. The covariant unit vectors $\hat{\mathbf{e}}_\mu$ are written with the subscript $\mu$ which has 4 values
$0 \leq \mu \leq 3$. As described in appendix 3, using the Einstein convention the components $dx^\mu$ are written with
the contravariant superscript $\mu$ where the time axis $x^0 = ct$, while the spatial coordinates, expressed in
cartesian coordinates, are $x^1 = x$, $x^2 = y$, and $x^3 = z$. With respect to a different (primed) unit vector basis
$\hat{\mathbf{e}}'_\mu$, the displacement must be unchanged as given by equation 17.31. In addition, equation 17.43 shows that
the magnitude $|d\mathbf{s}|^2$ of the displacement four vector is invariant to a Lorentz transformation.
The most general Lorentz transformation between inertial coordinate systems $S$ and $S'$, in relative motion
with velocity $\mathbf{v}$, assuming that the two sets of axes are aligned, and that their origins overlap when $t = t' = 0$,
is given by the symmetric matrix $\lambda$ where

$$x'^\mu = \sum_\nu \lambda^\mu_{\;\nu} x^\nu \tag{17.32}$$

This Lorentz transformation of the four vector $\mathbf{X}$ components can be written in matrix form as

$$\mathbf{X}' = \boldsymbol{\lambda} \mathbf{X} \tag{17.33}$$
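Assuming that the two sets of axes are aligned, then the elements of the Lorentz transformation $\boldsymbol{\lambda}$ are
given by

$$\mathbf{X}' = \begin{pmatrix} ct' \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} =
\begin{pmatrix}
\gamma & -\gamma\beta_1 & -\gamma\beta_2 & -\gamma\beta_3 \\
-\gamma\beta_1 & 1 + (\gamma - 1)\frac{\beta_1^2}{\beta^2} & (\gamma - 1)\frac{\beta_1 \beta_2}{\beta^2} & (\gamma - 1)\frac{\beta_1 \beta_3}{\beta^2} \\
-\gamma\beta_2 & (\gamma - 1)\frac{\beta_1 \beta_2}{\beta^2} & 1 + (\gamma - 1)\frac{\beta_2^2}{\beta^2} & (\gamma - 1)\frac{\beta_2 \beta_3}{\beta^2} \\
-\gamma\beta_3 & (\gamma - 1)\frac{\beta_1 \beta_3}{\beta^2} & (\gamma - 1)\frac{\beta_2 \beta_3}{\beta^2} & 1 + (\gamma - 1)\frac{\beta_3^2}{\beta^2}
\end{pmatrix}
\cdot \begin{pmatrix} ct \\ x^1 \\ x^2 \\ x^3 \end{pmatrix} \tag{17.34}$$

where $\beta = \frac{v}{c}$ and $\gamma = \frac{1}{\sqrt{1 - \beta^2}}$, and assuming that the origin of $S$ transforms to the origin of $S'$ at $(0, 0, 0, 0)$.
For the case illustrated in figure 17.1 where the corresponding axes of the two frames are parallel and in
relative motion with velocity $v$ in the $x^1$ direction, then the Lorentz transformation matrix 17.34 reduces to

$$\begin{pmatrix} ct' \\ x'^1 \\ x'^2 \\ x'^3 \end{pmatrix} =
\begin{pmatrix}
\gamma & -\gamma\beta & 0 & 0 \\
-\gamma\beta & \gamma & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\cdot \begin{pmatrix} ct \\ x^1 \\ x^2 \\ x^3 \end{pmatrix} \tag{17.35}$$

This Lorentz transformation matrix is called a standard boost since it only boosts from one frame to another
parallel frame. In general a rotation matrix also is incorporated into the transformation matrix $\boldsymbol{\lambda}$ for the
spatial variables.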
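The structure of the boost matrix can be made concrete with a short numerical sketch. The Python code below builds the general boost of equation 17.34 from the components of $\boldsymbol{\beta}$, confirms that it reduces to the standard boost of equation 17.35 when the motion is along the $x^1$ axis, and checks that the interval $(ct)^2 - x^2 - y^2 - z^2$ is unchanged by an oblique boost. Plain nested lists are used instead of a matrix library, and all numerical values are assumptions made for illustration.

```python
import math

def boost_matrix(beta_vec):
    """General Lorentz boost (equation 17.34) for velocity components beta_i = v_i / c."""
    b = tuple(beta_vec)
    b_sq = sum(bi * bi for bi in b)
    if b_sq == 0.0:
        return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    gamma = 1.0 / math.sqrt(1.0 - b_sq)
    lam = [[0.0] * 4 for _ in range(4)]
    lam[0][0] = gamma
    for i in range(3):
        lam[0][i + 1] = lam[i + 1][0] = -gamma * b[i]
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            lam[i + 1][j + 1] = delta + (gamma - 1.0) * b[i] * b[j] / b_sq
    return lam

def apply(lam, X):
    """Matrix product X' = lambda X for a four vector X = (ct, x, y, z)."""
    return [sum(lam[mu][nu] * X[nu] for nu in range(4)) for mu in range(4)]

def interval(X):
    """Invariant interval (ct)^2 - x^2 - y^2 - z^2."""
    return X[0] ** 2 - X[1] ** 2 - X[2] ** 2 - X[3] ** 2

standard = boost_matrix((0.6, 0.0, 0.0))   # reduces to equation 17.35
for row in standard:
    print([round(element, 6) for element in row])

X = [2.0, 1.0, -1.0, 0.5]                  # assumed sample four vector
X_prime = apply(boost_matrix((0.3, 0.4, 0.5)), X)
print(interval(X), interval(X_prime))
```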

17.5.2 Four-vector scalar products

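Scalar products of vectors and tensors usually are invariant to rotations in three-dimensional space providing
an easy way to solve problems. The scalar, or inner, product of two four vectors is defined by

$$\begin{aligned}
\mathbf{X} \cdot \mathbf{Y} = X^\mu g_{\mu\nu} Y^\nu &= \begin{pmatrix} X^0 & X^1 & X^2 & X^3 \end{pmatrix} \cdot
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}
\cdot \begin{pmatrix} Y^0 \\ Y^1 \\ Y^2 \\ Y^3 \end{pmatrix} \\
&= X^0 Y^0 - X^1 Y^1 - X^2 Y^2 - X^3 Y^3
\end{aligned} \tag{17.36}$$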

The correct sign of the inner product is obtained by inclusion of the Minkowski metric $g_{\mu\nu}$ defined by

$$g_{\mu\nu} \equiv \hat{\mathbf{e}}_\mu \cdot \hat{\mathbf{e}}_\nu \tag{17.37}$$

that is, it can be represented by the matrix

$$\mathbf{g} \equiv \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix} \tag{17.38}$$

The sign convention used in the Minkowski metric, equation 17.38, has been chosen with the time coordinate
$(c\,dt)^2$ positive which makes $(ds)^2 > 0$ for objects moving at less than the speed of light and corresponds to
$ds$ being real.$^{2}$
The presence of the Minkowski metric matrix in the inner product of four vectors complicates General
Relativity, and thus the Einstein convention has been adopted where the components of the contravariant
four-vector $\mathbf{X}$ are written with superscripts $X^\mu$. See also the appendix. The corresponding covariant
four-vector components are written with the subscript $X_\mu$, which is related to the contravariant four-vector
components $X^\nu$ using the $\mu\nu$ component of the covariant Minkowski metric matrix $g$. That is
$$
X_\mu = \sum_{\nu=0}^{3} g_{\mu\nu} X^\nu \tag{17.39}
$$

The contravariant metric component $g^{\mu\nu}$ is defined as the $\mu\nu$ component of the inverse metric matrix $g^{-1}$
where
$$
g g^{-1} = I = g^{-1} g \tag{17.40}
$$
where $I$ is the four-dimensional identity matrix. The contravariant components of the four vector can be expressed
in terms of the covariant components as
$$
X^\mu = \sum_{\nu=0}^{3} g^{\mu\nu} X_\nu \tag{17.41}
$$

Thus equations 1739 and 1741 can be used to transform between covariant and contravariant four vectors,
that is, to raise or lower the index .
The scalar inner product of two four vectors can be written compactly as the scalar product of a covariant
four vector and a contravariant four vector. The Minkowski metric matrix can be absorbed into either X or
Y thus
X 3
3 X 3
X 3
X
X·Y=      =    =    (17.42)
=0 =0 =0 =0

If this covariant expression is Lorentz invariant in one coordinate system, then it is Lorentz invariant in all
coordinate systems obtained by proper Lorentz transformations.
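Raising and lowering indices with the metric can be illustrated with a few lines of code. This is a minimal sketch under the $+,-,-,-$ convention of equation 17.38; the helper names `lower`, `raise_index` and `dot` and the sample vectors are assumptions made for illustration only.

```python
# Minkowski metric of equation 17.38; its inverse has the same entries
G = [[1, 0, 0, 0],
     [0, -1, 0, 0],
     [0, 0, -1, 0],
     [0, 0, 0, -1]]

def lower(x_up):
    """X_mu = sum_nu g_{mu nu} X^nu (equation 17.39)."""
    return [sum(G[m][n] * x_up[n] for n in range(4)) for m in range(4)]

def raise_index(x_down):
    """X^mu = sum_nu g^{mu nu} X_nu (equation 17.41)."""
    return [sum(G[m][n] * x_down[n] for n in range(4)) for m in range(4)]

def dot(x_up, y_up):
    """X.Y = sum_mu X_mu Y^mu (equation 17.42), with the metric absorbed into X."""
    return sum(xl * yu for xl, yu in zip(lower(x_up), y_up))

X = [5.0, 1.0, 2.0, 3.0]    # contravariant components (x0, x1, x2, x3)
Y = [2.0, 0.5, -1.0, 4.0]
```

Lowering an index only flips the sign of the three spatial components, so lowering and then raising returns the original contravariant components.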
2 Older textbooks, such as all editions of Marion, and the first two editions of Goldstein, use the Euclidean Poincaré
4-dimensional space-time with the imaginary time axis $ict$. About half the scientific community, and modern physics textbooks
including this textbook and the 3rd edition of Goldstein, use the Bjorken-Drell $+,-,-,-$ sign convention given in equation
17.38 where $x^0 \equiv ct$ and $x^1, x^2, x^3$ are the spatial coordinates. The other half of the community, including mathematicians
and gravitation physicists, use the opposite $-,+,+,+$ sign convention. Further confusion is caused by a few books that assign
the time axis $ct$ to be $x^4$ rather than $x^0$.
17.5. GEOMETRY OF SPACE-TIME 477

The scalar inner product of the invariant space-time interval is an especially important example.
$$
(\Delta s)^2 \equiv \mathbf{X}\cdot\mathbf{X} = c^2(\Delta t)^2 - (\Delta\mathbf{r})^2 = (c\Delta t)^2 - \sum_{i=1}^{3} (\Delta x_i)^2 = (c\Delta\tau)^2 \tag{17.43}
$$
This is invariant to a Lorentz transformation as can be shown by applying the Lorentz standard boost
transformation given above. In particular, if $S'$ is the rest frame of the clock, then the invariant space-time
interval $\Delta s$ is simply given by $c$ times the proper time interval $\Delta\tau$.
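The invariance claimed above can be checked explicitly for the standard boost of equation 17.35. The following worked step is a sketch added for illustration; it uses only $\beta = v/c$ and $\gamma = 1/\sqrt{1-\beta^2}$ as defined for equation 17.34.

```latex
\begin{aligned}
(c\Delta t')^2 - (\Delta x'^1)^2
  &= \gamma^2\left(c\Delta t - \beta\,\Delta x^1\right)^2
   - \gamma^2\left(\Delta x^1 - \beta\,c\Delta t\right)^2 \\
  &= \gamma^2\left(1-\beta^2\right)\left[(c\Delta t)^2 - (\Delta x^1)^2\right] \\
  &= (c\Delta t)^2 - (\Delta x^1)^2
\end{aligned}
```

The cross terms $\mp 2\gamma^2\beta\, c\Delta t\,\Delta x^1$ cancel, and since $\Delta x'^2 = \Delta x^2$ and $\Delta x'^3 = \Delta x^3$ for the standard boost, the full interval $(\Delta s)^2$ is unchanged.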

17.5.3 Minkowski space-time

Figure 17.6 illustrates a three-dimensional $(ct, x^1, x^2)$ representation of the 4-dimensional space-time
diagram where it is assumed that $x^3 = 0$. The fact that the speed of light has a fixed value leads to the
concept of the light cone defined by the locus of $|\mathbf{r}| = c|t|$.

Inside the light cone

The vertex of the cones represents the present. Locations inside the upper cone represent the future while
the past is represented by locations inside the lower cone. Note that
$(\Delta s)^2 = c^2(\Delta t)^2 - (\Delta\mathbf{r})^2 > 0$ inside both the future and past
light cones. Thus the space-time interval $\Delta s$ is real and positive for the future, whereas it is real and
negative for the past relative to the vertex of the light cone. A world line
is the trajectory a particle follows as a function of time in
Minkowski space. In the interior of the future light cone
$\Delta s > 0$ and, since it is real, it can be asserted unambiguously
that any point inside this forward cone must occur later than
at the vertex of the cone, that is, it is the absolute future.
A Lorentz transformation can rotate Minkowski space such
that the axis $x'^0$ goes through any point within this light cone
and then the “world line” is pure time like. Similarly, any
point inside the backward light cone unambiguously occurred
before the vertex, i.e. it is absolute past.

Outside the light cone

Outside the light cone $(\Delta s)^2 = c^2(\Delta t)^2 - (\Delta\mathbf{r})^2 < 0$
and thus $\Delta s$ is imaginary and the interval is called space like. A space-like plane hypersurface in spatial
coordinates is shown for the present time in the unprimed frame. A rotation in Minkowski
space can be made to $x'^0$ such that the space-like hypersurface
now is tilted relative to the hypersurface shown, and thus any
point $P$ outside the light cone can be made to occur later,
simultaneous, or earlier than at the vertex depending on the
orientation of the space-like hypersurface. This startling situation implies that the time ordering of two
points, each outside the other's light cone, can be reversed, which has profound implications related to the
concept of simultaneity and the notion of causality.

Figure 17.6: The light cone in the $(ct, x^1, x^2)$ space is defined by the condition
$\mathbf{X}\cdot\mathbf{X} = c^2t^2 - r^2 = 0$ and divides space-time into the forward and backward light cones,
with $t > 0$ and $t < 0$ respectively; the interiors of the forward and backward light cones
are called absolute future and absolute past.

For the special case of two events lying on the light cone,
$(\Delta s)^2 = c^2(\Delta t)^2 - \left[(\Delta x_1)^2 + (\Delta x_2)^2 + (\Delta x_3)^2\right] = 0$ and thus
these events are separated by a light ray travelling at velocity $c$. Only events separated by time-like intervals
can be connected causally. The world line of a particle must lie within its light cone. The division of intervals
into space-like and time-like, because of their invariance, is an absolute concept. That is, it is independent
of the frame of reference.
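The classification of intervals, and the frame dependence of time ordering for space-like separations, can be sketched as follows. This is an illustrative example only; the function names and the sample separations are assumptions, and the boost is the standard boost of equation 17.35.

```python
import math

def interval_squared(dct, dx):
    """(Delta s)^2 = (c Delta t)^2 - |Delta r|^2 for dx = (dx1, dx2, dx3)."""
    return dct * dct - sum(d * d for d in dx)

def classify(dct, dx, tol=1e-12):
    """Label the separation of two events relative to the light cone."""
    s2 = interval_squared(dct, dx)
    if abs(s2) <= tol:
        return "light-like"
    return "time-like" if s2 > 0 else "space-like"

def boosted_dct(dct, dx1, beta):
    """c Delta t' after a standard boost along x1 (first row of equation 17.35)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (dct - beta * dx1)

# Space-like pair: the second event is later, simultaneous, or earlier depending on the frame
later = boosted_dct(1.0, 2.0, 0.0)
simultaneous = boosted_dct(1.0, 2.0, 0.5)
earlier = boosted_dct(1.0, 2.0, 0.8)

# Time-like pair: the ordering survives even a large boost
still_later = boosted_dct(2.0, 1.0, 0.8)
```

For the time-like pair no boost with $|\beta| < 1$ can change the sign of $c\Delta t'$, which is the numerical counterpart of the absolute future and absolute past described above.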
The concept of proper time can be expanded by considering a clock at rest in frame $S'$ which is moving
with uniform velocity $v$ with respect to a rest frame $S$. The clock at rest in the $S'$ frame measures the proper
time $\tau$, then the time observed in the fixed frame can be obtained by looking at the interval $ds$. Because of
the invariance of the interval $ds^2$, then
$$
ds^2 = c^2 d\tau^2 = c^2 dt^2 - \left[dx_1^2 + dx_2^2 + dx_3^2\right] \tag{17.44}
$$
That is,
$$
d\tau = dt\left[1 - \frac{dx_1^2 + dx_2^2 + dx_3^2}{c^2 dt^2}\right]^{\frac{1}{2}} = dt\left[1 - \frac{v^2}{c^2}\right]^{\frac{1}{2}} = \frac{dt}{\gamma} \tag{17.45}
$$
that is $dt = \gamma\, d\tau$ which satisfies the normal expression for time dilation, 17.8.
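A short numerical sketch of equations 17.44 and 17.45 is given below. It is illustrative only: the helper names are hypothetical, and units with $c = 1$ are assumed in the usage lines.

```python
import math

def gamma_factor(beta):
    """Lorentz factor for beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def lab_time(proper_time, beta):
    """dt = gamma * d tau (equation 17.45)."""
    return gamma_factor(beta) * proper_time

def proper_time_from_interval(dct, dx, c=1.0):
    """d tau = sqrt((c dt)^2 - |dx|^2) / c, from the invariant interval (equation 17.44)."""
    return math.sqrt(dct * dct - sum(d * d for d in dx)) / c

dt = lab_time(4.0, 0.6)                    # a clock moving at 0.6c ticks off 4 time units
dx = (0.6 * dt, 0.0, 0.0)                  # distance covered in the frame S during dt
tau = proper_time_from_interval(dt, dx)    # recovers the proper time from the interval
```

Both routes give the same proper time, which is simply the statement that $ds^2$ has the same value in $S$ and $S'$.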

17.5.4 Momentum-energy four vector

The previous four-vector discussion can be elegantly exploited using the covariant Minkowski space-time
representation. Separating the spatial and time components of the differential four vector gives
$$
d\mathbf{X} = (c\,dt, d\mathbf{x}) \tag{17.46}
$$
Remember that the square of the four-dimensional space-time element of length $(ds)^2$ is invariant (17.43),
and is simply related to the proper time element $d\tau$. Thus the scalar product
$$
d\mathbf{X}\cdot d\mathbf{X} = ds^2 = c^2 d\tau^2 = c^2 dt^2 - \left[dx_1^2 + dx_2^2 + dx_3^2\right] \tag{17.47}
$$
Thus the proper time is an invariant.
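The ratio of the four-vector element $d\mathbf{X}$ and the invariant proper time interval $d\tau$ is a four-vector called
the four-vector velocity $\mathbf{U}$ where
$$
\mathbf{U} = \frac{d\mathbf{X}}{d\tau} = \left(c\frac{dt}{d\tau}, \frac{d\mathbf{x}}{d\tau}\right) = \gamma_u\left(c, \frac{d\mathbf{x}}{dt}\right) = \gamma_u (c, \mathbf{u}) \tag{17.48}
$$
where $\mathbf{u}$ is the particle velocity, and $\gamma_u = \frac{1}{\sqrt{1 - \frac{u^2}{c^2}}}$.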
The four-vector momentum $\mathbf{P}$ can be obtained from the four-vector velocity by multiplying it by the
scalar rest mass $m$
$$
\mathbf{P} = m\mathbf{U} = (\gamma_u m c, \gamma_u m \mathbf{u}) \tag{17.49}
$$
However,
$$
\gamma_u m c = \frac{E}{c} \tag{17.50}
$$
thus the momentum four vector can be written as
$$
\mathbf{P} = \left(\frac{E}{c}, \mathbf{p}\right) \tag{17.51}
$$
where the vector $\mathbf{p}$ represents the three spatial components of the relativistic momentum. It is interesting to
realize that the Theory of Relativity couples not only the spatial and time coordinates, but also their
conjugate variables, linear momentum $\mathbf{p}$ and total energy $E$.
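An additional feature of this momentum-energy four vector $\mathbf{P}$ is that the scalar inner product $\mathbf{P}\cdot\mathbf{P}$ is
invariant to Lorentz transformations and equals $(mc)^2$ in the rest frame
$$
\mathbf{P}\cdot\mathbf{P} = \sum_{\mu=0}^{3}\sum_{\nu=0}^{3} g_{\mu\nu} P^\mu P^\nu = \sum_{\mu=0}^{3} P_\mu P^\mu = \left(\frac{E}{c}\right)^2 - |\mathbf{p}|^2 = m^2c^2 \tag{17.52}
$$
which leads to the well-known equation
$$
E^2 = p^2c^2 + E_0^2 \tag{17.53}
$$
where $E_0 = mc^2$ is the rest energy.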
The Lorentz transformation matrix $\lambda$ can be applied to $\mathbf{P}$
$$
\mathbf{P}' = \lambda\mathbf{P} \tag{17.54}
$$
The Lorentz invariant four-vector representation is illustrated by applying the Lorentz transformation
shown in figure 17.1, which gives $p_1' = \gamma\left(p_1 - \frac{v}{c^2}E\right)$, $p_2' = p_2$, $p_3' = p_3$, and $E' = \gamma(E - v p_1)$.
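The invariance of $\mathbf{P}\cdot\mathbf{P}$ can be illustrated numerically. The sketch below is not from the original text; it assumes units with $c = 1$ in the usage lines, and the function names are hypothetical.

```python
import math

def four_momentum(m, u, c=1.0):
    """P = (E/c, p) = gamma_u m (c, u) for rest mass m and velocity u (equations 17.49, 17.51)."""
    u2 = sum(ui * ui for ui in u)
    gamma_u = 1.0 / math.sqrt(1.0 - u2 / (c * c))
    return [gamma_u * m * c] + [gamma_u * m * ui for ui in u]

def minkowski_square(p):
    """P.P = (E/c)^2 - |p|^2 (equation 17.52)."""
    return p[0] ** 2 - sum(pi * pi for pi in p[1:])

def standard_boost(p, beta):
    """Apply the standard boost of equation 17.35 to a four vector."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [gamma * (p[0] - beta * p[1]), gamma * (p[1] - beta * p[0]), p[2], p[3]]

P = four_momentum(m=2.0, u=(0.6, 0.0, 0.0))   # E/c = 2.5, p1 = 1.5
P_rest = standard_boost(P, 0.6)               # boost into the particle rest frame
P_other = standard_boost(P, -0.3)             # any other frame moving along x1
```

In the rest frame the spatial momentum vanishes and $E/c = mc$, while `minkowski_square` returns $m^2c^2$ in every frame.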

17.6 Lorentz-invariant formulation of Lagrangian mechanics

17.6.1 Parametric formulation
The Lagrangian and Hamiltonian formalisms in classical mechanics are based on the Newtonian concept
of absolute time $t$ which serves as the system evolution parameter in Hamilton's Principle. This approach
violates the Special Theory of Relativity. The extended Lagrangian and Hamiltonian formalism is a parametric
approach, pioneered by Lanczos[La49], that introduces a system evolution parameter $s$ that serves
as the independent variable in the action integral, and all the space-time variables $q_i(s), t(s)$ are dependent
on the evolution parameter $s$. This extended formalism recasts Lagrangian and Hamiltonian mechanics in a form
that is compatible with the Special Theory of Relativity. The importance of the Lorentz-invariant extended
formulation of Lagrangian and Hamiltonian mechanics has been recognized for decades.[La49, Go50, Sy60]
Recently there has been a resurgence of interest in the extended Lagrangian and Hamiltonian formalism
stimulated by the papers of Struckmeier[Str05, Str08] and this formalism has featured prominently in recent
textbooks by Johns[Jo05] and Greiner[Gr10]. This parametric approach develops manifestly-covariant
Lagrangian and Hamiltonian formalisms that treat equally all $2n+1$ space-time canonical variables. It provides
a plausible manifestly-covariant Lagrangian for the one-body system, but serious problems exist extending
this to the $N$-body system when $N > 1$. Generalizing the Lagrangian and Hamiltonian formalisms into the
domain of the Special Theory of Relativity is of fundamental importance to physics, while the parametric
approach gives insight into the philosophy underlying use of variational methods in classical mechanics (see footnote 3).
In conventional Lagrangian mechanics, the equations of motion for the $n$ generalized coordinates are
derived by requiring that the action integral be stationary, that is, Hamilton's Principle.
$$
\delta S(\mathbf{q}, \dot{\mathbf{q}}, t) = \delta\int_{t_i}^{t_f} L(\mathbf{q}(t), \dot{\mathbf{q}}(t), t)\, dt = 0 \tag{17.55}
$$
where $L(\mathbf{q}(t), \dot{\mathbf{q}}(t), t)$ denotes the conventional Lagrangian. This approach implicitly assumes the Newtonian
concept of absolute time $t$ which is chosen to be the independent variable that serves as the evolution
parameter of the system. The actual path $[\mathbf{q}(t), \dot{\mathbf{q}}(t)]$ the system follows is defined by the extremum of the
action integral $S(\mathbf{q}, \dot{\mathbf{q}}, t)$ which leads to the corresponding Euler-Lagrange equations. This assumption is
contrary to the Theory of Relativity which requires that the space and time variables be treated equally,
that is, the Lagrangian formalism must be covariant.

17.6.2 Extended Lagrangian

Lanczos[La49] proposed making the Lagrangian covariant by introducing a general evolution parameter $s$
and treating the time as a dependent variable $t(s)$ on an equal footing with the configuration space variables
$q^\mu(s)$. That is, the time becomes a dependent variable $q^0(s) = ct(s)$ similar to the spatial variables $q^\mu(s)$
where $1 \le \mu \le n$. The dynamical system then is described as motion confined to a hypersurface within an
extended space where the value of the extended Hamiltonian and the evolution parameter $s$ constitute an
additional pair of canonically conjugate variables in the extended space. That is, the canonical momentum
$p_0$ corresponding to $q^0 = ct$ is $p_0 = \frac{E}{c}$ similar to the momentum-energy four vector, equation 17.51.
An extended Lagrangian $\mathbb{L}(\mathbf{q}(s), \frac{d\mathbf{q}(s)}{ds}, t(s), \frac{dt(s)}{ds})$ can be defined which can be written compactly as
$\mathbb{L}(q^\mu(s), \frac{dq^\mu(s)}{ds})$ where the index $0 \le \mu \le n$ denotes the entire range of space-time variables.
This extended Lagrangian can be used in an extended action functional $\mathbb{S}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$ to give an extended
version of Hamilton's Principle (see footnote 4)
$$
\delta\mathbb{S}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) = \delta\int_{s_i}^{s_f} \mathbb{L}\left(q^\mu(s), \frac{dq^\mu(s)}{ds}\right) ds = 0 \tag{17.56}
$$
3 Chapters 176 and 177 reproduce the Struckmeier presentation.[Str08]
4 These formula involve total and partial derivatives with respect to both time,  and parameter . For clarity, the derivatives
are written out in full because Lanczos[La49] and Johns[Jo05] use the opposite convention for the dot and prime superscripts
as abbreviations for the diﬀerentials with respect to  and . The blackboard bold format is used to designate the extended
versions of the action , Lagrangian  and Hamiltonian .

The conventional action $S$ and extended action $\mathbb{S}$ address alternate characterizations of the same underlying
physical system, and thus the action principle implies that $\delta S = \delta\mathbb{S} = 0$ must hold simultaneously. That is,
$$
\delta\int_{s_i}^{s_f} L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds}\, ds = \delta\int_{s_i}^{s_f} \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) ds \tag{17.57}
$$
As discussed in chapter 9.3 there is a continuous spectrum of equivalent gauge-invariant Lagrangians for
which the Euler-Lagrange equations lead to identical equations of motion. Equation 17.57 is satisfied if the
conventional and extended Lagrangians are related by
$$
\mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) = L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds} + \frac{d\Lambda(\mathbf{q}, t)}{ds} \tag{17.58}
$$
where $\Lambda(\mathbf{q}, t)$ is a continuous function of $\mathbf{q}$ and $t$ that has continuous second derivatives. It is acceptable to
assume that $\frac{d\Lambda(\mathbf{q}, t)}{ds} = 0$, then the extended and conventional Lagrangians have a unique relation requiring
no simultaneous transformation of the dynamical variables. That is, assume
$$
\mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) = L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds} \tag{17.59}
$$
Note that the time derivative of $\mathbf{q}$ can be expressed in terms of the $s$ derivatives by
$$
\frac{d\mathbf{q}}{dt} = \frac{d\mathbf{q}/ds}{dt/ds} \tag{17.60}
$$
Thus, for a conventional Lagrangian with $n$ variables, the corresponding extended Lagrangian is a function
of $n+1$ variables while the conventional and extended Lagrangians are related using equations 17.59 and
17.60.
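The derivatives of the relation between the extended and conventional Lagrangians lead to
$$
\frac{\partial\mathbb{L}}{\partial q^\mu} = \frac{\partial L}{\partial q^\mu}\frac{dt}{ds} \tag{17.61}
$$
$$
\frac{\partial\mathbb{L}}{\partial t} = \frac{\partial L}{\partial t}\frac{dt}{ds} \tag{17.62}
$$
$$
\frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)} \tag{17.63}
$$
$$
\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)} = L - \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{dt} \tag{17.64}
$$
where $1 \le \mu \le n$ since the $\mu = 0$ time derivatives are written explicitly in equations 17.62 and 17.64.
Equations 17.63 and 17.64, summed over the extended range $0 \le \mu \le n$ of time and spatial dynamical
variables, imply
$$
\sum_{\mu=0}^{n}\frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)}\frac{dq^\mu}{ds}
= L\frac{dt}{ds} - \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{dt}\frac{dt}{ds} + \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{ds} = \mathbb{L} \tag{17.65}
$$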

Equation 1765 can be written in the form

X ½ 6≡  
L   = 0 if L is not homogeneous in 
L− ³ ´ = (17.66)
=0 
  ≡ 0 if L is homogeneous in 



If the extended Lagrangian L(q q  
   ) is homogeneous to first order in the +1 variables  , then Euler’s
theorem on homogeneous functions trivially implies the relation given in equation 1766. Struckmeier[Str08]

identified a subtle but important point that if L is not homogeneous in    then equation 1766 is not an
identity but is an implicit equation that is always satisfied as the system evolves according to the solution
of the extended Euler-Lagrange
equations. Then equation 1759 is satisfied without it being a homogeneous
form in the +1 velocities  . This introduces a new class of non-homogeneous Lagrangians. The relativistic
free particle, discussed in example 175 is a case of a non-homogeneous extended Lagrangian.

17.6.3 Extended generalized momenta

The generalized momentum is defined by
$$
p_\mu = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)} \tag{17.67}
$$
Assume that the definitions of the extended Lagrangian $\mathbb{L}$, and the extended Hamiltonian $\mathbb{H}$, are related
by a Legendre transformation, and are based on variational principles, analogous to the relation that exists
between the conventional Lagrangian $L$ and Hamiltonian $H$. The Legendre transformation requires defining
the extended generalized (canonical) momentum-energy four vector $\mathbf{P}(s) = (\frac{E(s)}{c}, \mathbf{p}(s))$. The momentum
components of the momentum-energy four vector are given by the $1 \le \mu \le n$ components
using equation 17.63
$$
p_\mu(s) = \frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)} \tag{17.68}
$$
The $\mu = 0$ component of the momentum-energy four vector can be derived by recognizing that the right-hand
side of equation 17.64 is equal to $-H(\mathbf{q}, \mathbf{p}, t)$. That is, the corresponding generalized momentum $p_0$ that
is conjugate to $q^0 = ct$ is given by
$$
p_0 = \frac{\partial\mathbb{L}}{\partial\left(\frac{dq^0}{ds}\right)} = \frac{1}{c}\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)}
= \frac{1}{c}\left(L - \sum_{\mu=1}^{n}\frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)}\frac{dq^\mu}{dt}\right) = -\frac{H(\mathbf{q}, \mathbf{p}, t)}{c} \tag{17.69}
$$

17.6.4 Extended Lagrange equations of motion

By direct analogy with the non-relativistic action integral 17.55, the extremum for the relativistic action
integral $\mathbb{S}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$ is obtained using the Euler-Lagrange equations derived from equation 17.56 where the
independent variable is $s$. This implies that for $0 \le \mu \le n$
$$
\frac{d}{ds}\left(\frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)}\right) - \frac{\partial\mathbb{L}}{\partial q^\mu} = \mathbb{Q}^{EX}_\mu = \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial q^\mu} + \mathbb{Q}^{EXC}_\mu \tag{17.70}
$$
where the extended generalized force $\mathbb{Q}^{EX}_\mu$ shown on the right-hand side of equation 17.70 accounts for all
forces not included in the potential energy term in the Lagrangian. The extended generalized force $\mathbb{Q}^{EX}_\mu$ can
be factored into two terms as discussed in chapter 6, equation 6.60. The Lagrange multiplier term includes
$1 \le k \le m$ holonomic constraint forces where the $m$ holonomic constraints, which do no work, are expressed
in terms of the $m$ algebraic equations of holonomic constraint $g_k$. The $\mathbb{Q}^{EXC}_\mu$ term includes the remaining
constraint forces and generalized forces that are not included in the Lagrange multiplier term or the potential
energy term of the Lagrangian.
For the case where $\mu = 0$, since $q^0 = ct$, then equation 17.70 reduces to
$$
\frac{d}{ds}\left(\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)}\right) - \frac{\partial\mathbb{L}}{\partial t} = \sum_{k=1}^{m}\lambda_k\frac{\partial g_k}{\partial t} + \mathbb{Q}^{EXC}_0 \tag{17.71}
$$
These Euler-Lagrange equations of motion 17.70, 17.71 determine the $1 \le \mu \le n$ generalized coordinates
$q^\mu(s)$ plus $q^0 = ct(s)$ in terms of the independent variable $s$.
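If the holonomic equations of constraint are time independent, that is $\frac{\partial g_k}{\partial t} = 0$, and if $\mathbb{Q}^{EXC}_0 = 0$, then
the $\mu = 0$ term of the Euler-Lagrange equations simplifies to
$$
\frac{d}{ds}\left(\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)}\right) - \frac{\partial\mathbb{L}}{\partial t} = 0 \tag{17.72}
$$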

One interpretation is to select $L$ to be primary. Then $\mathbb{L}$ is derived from $L$ using equation 17.59 and $\mathbb{L}$
must satisfy the identity given by equation 17.66, while the Euler-Lagrange equation containing $\frac{dt}{ds}$ yields an
identity, which implies that $\mathbb{L}$ does not provide an equation of motion in terms of $t(s)$. Conversely, if $\mathbb{L}$ is
chosen to be primary, then $\mathbb{L}$ is no longer a homogeneous function and equation 17.66 serves as a constraint
on the motion that can be used to deduce $L$, while $\frac{dt}{ds}$ yields a non-trivial equation of motion in terms of
$t(s)$. In both cases the occurrence of a constraint surface results from the fact that the extended space has
$2n+2$ variables to describe $2n+1$ degrees of freedom, that is, one more degree of freedom than required for
the actual system.

17.5 Example: Lagrangian for a relativistic free particle

The standard Lagrangian $L = T - U$ is not Lorentz invariant. The extended Lagrangian $\mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$
introduces the independent variable $s$ which treats both the space variables $\mathbf{q}(s)$ and time variable $q^0 = ct(s)$
equally. This can be achieved by defining the non-standard Lagrangian
$$
\mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) = \frac{1}{2}mc^2\left[\frac{1}{c^2}\left(\frac{d\mathbf{q}}{ds}\right)^2 - \left(\frac{dt}{ds}\right)^2 - 1\right] \tag{a}
$$
The constant third term in the bracket is included to ensure that the extended Lagrangian converges to the
standard Lagrangian in the limit $\frac{dt}{ds} \rightarrow 1$.
Note that the extended Lagrangian (a) is not homogeneous to first order in the velocities $\frac{dq^\mu}{ds}$, so
equation 17.66 is not satisfied identically and instead must be imposed as a constraint. That is, the motion
must satisfy the constraint relation
$$
\left(\frac{dt}{ds}\right)^2 - \frac{1}{c^2}\left(\frac{d\mathbf{q}}{ds}\right)^2 - 1 = 0 \tag{b}
$$
Inserting (b) into the extended Lagrangian (a) yields that the square bracket in equation (a) must equal $-2$.
Thus, on the constraint surface,
$$
\mathbb{L} = \frac{1}{2}mc^2[-2] = -mc^2 \tag{c}
$$
The constraint equation (b) implies that
$$
\frac{ds}{dt} = \sqrt{1 - \frac{1}{c^2}\left(\frac{d\mathbf{q}}{dt}\right)^2} = \frac{1}{\gamma} \tag{d}
$$
Using equation (d) gives that the relativistic Lagrangian is
$$
L = \mathbb{L}\frac{ds}{dt} = -\frac{mc^2}{\gamma} = -mc^2\sqrt{1 - \beta^2} \tag{e}
$$
Equation (e) is the conventional relativistic Lagrangian derived by assuming that the system evolution
parameter $s$ is transformed to be along the world line, where the invariant interval $ds$ is the proper time
interval
$$
ds = d\tau = \frac{dt}{\gamma} \tag{f}
$$
The definition of the generalized (canonical) momentum
$$
p_i = \frac{\partial L}{\partial\dot{q}_i} = \gamma m\dot{q}_i \tag{g}
$$
leads to the relativistic expression for momentum given in equation 17.21.
The relativistic Lagrangian is an important example of a non-standard Lagrangian. Equation (e) does not
equal the difference between the kinetic and potential energies, that is, the relativistic expression for kinetic
energy is given by 17.28 to be
$$
T = (\gamma - 1)mc^2 \tag{h}
$$
The non-standard relativistic Lagrangian (e) can be used with the Euler-Lagrange equations to derive the
second-order equations of motion for both relativistic and non-relativistic problems within the Special Theory
of Relativity.
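The chain of steps in this example can be checked numerically. The sketch below is an illustration only: it assumes one spatial dimension and units with $m = c = 1$, uses the extended Lagrangian $\frac{1}{2}mc^2\left[\frac{1}{c^2}\left(\frac{dq}{ds}\right)^2 - \left(\frac{dt}{ds}\right)^2 - 1\right]$ together with the constraint $\left(\frac{dt}{ds}\right)^2 - \frac{1}{c^2}\left(\frac{dq}{ds}\right)^2 = 1$, and the function names are hypothetical.

```python
import math

M, C = 1.0, 1.0   # illustrative units with m = c = 1

def extended_lagrangian(dq_ds, dt_ds):
    """(1/2) m c^2 [ (dq/ds)^2 / c^2 - (dt/ds)^2 - 1 ] for one spatial dimension."""
    return 0.5 * M * C ** 2 * ((dq_ds / C) ** 2 - dt_ds ** 2 - 1.0)

def on_constraint(velocity):
    """Return (dq/ds, dt/ds) that satisfy the constraint for a given dq/dt."""
    gamma = 1.0 / math.sqrt(1.0 - (velocity / C) ** 2)
    return gamma * velocity, gamma

def conventional_lagrangian(velocity):
    """L = (extended L) * ds/dt evaluated on the constraint surface."""
    dq_ds, dt_ds = on_constraint(velocity)
    return extended_lagrangian(dq_ds, dt_ds) / dt_ds

def momentum(velocity, h=1e-6):
    """p = dL/d(qdot) by central difference, for comparison with gamma m qdot."""
    return (conventional_lagrangian(velocity + h)
            - conventional_lagrangian(velocity - h)) / (2 * h)

v = 0.6
L_ext = extended_lagrangian(*on_constraint(v))   # expected to equal -m c^2
L_conv = conventional_lagrangian(v)              # expected to equal -m c^2 / gamma
p = momentum(v)                                  # expected to equal gamma m v
```

On the constraint surface the extended Lagrangian takes the constant value $-mc^2$, and differentiating the resulting conventional Lagrangian reproduces $p = \gamma m\dot{q}$.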

17.6 Example: Relativistic particle in an external electromagnetic field

A charged particle moving at relativistic speed in an external electromagnetic field provides an example
of the use of the relativistic Lagrangian.
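In the discussion of classical mechanics it was shown that the velocity-dependent Lorentz force can be
absorbed into the scalar electric potential $\Phi$ plus the vector magnetic potential $\mathbf{A}$. That is, the potential
energy is given by equation 7.6 to be $U = q(\Phi - \mathbf{A}\cdot\mathbf{v})$. Including this in the relativistic free-particle
Lagrangian of example 17.5 gives
$$
L = -\frac{mc^2}{\gamma} - U = -mc^2\sqrt{1 - \beta^2} - q\Phi + q\mathbf{A}\cdot\mathbf{v}
$$
The three spatial partial derivatives can be written in vector notation as
$$
\frac{\partial L}{\partial\mathbf{r}} = -q\nabla\Phi + q\nabla(\mathbf{v}\cdot\mathbf{A}) \tag{a}
$$
and the generalized momentum is given by
$$
\mathbf{p} = \frac{\partial L}{\partial\mathbf{v}} = \gamma m\mathbf{v} + q\mathbf{A}
$$
which has the same form as the non-relativistic answer given by equation 7.6, with $m\mathbf{v}$ replaced by $\gamma m\mathbf{v}$. That is, it
includes the momentum of the electromagnetic field plus the linear momentum of the moving particle.
The total time derivative of the generalized momentum is
$$
\frac{d\mathbf{p}}{dt} = \frac{d}{dt}\left(\frac{\partial L}{\partial\mathbf{v}}\right) = \frac{d}{dt}(\gamma m\mathbf{v}) + q\frac{d\mathbf{A}}{dt} \tag{b}
$$
where the last term is given by the chain rule
$$
\frac{d\mathbf{A}}{dt} = \frac{\partial\mathbf{A}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{A} \tag{c}
$$
Using equations (a), (b), (c) in the Euler-Lagrange equation gives
$$
\frac{d}{dt}\left(\frac{\partial L}{\partial\mathbf{v}}\right) = \frac{\partial L}{\partial\mathbf{r}}
$$
$$
\frac{d}{dt}(\gamma m\mathbf{v}) + q\frac{d\mathbf{A}}{dt} = -q\nabla\Phi + q\nabla(\mathbf{v}\cdot\mathbf{A})
$$
Collecting terms and using the well-known vector-product identity
$\mathbf{v}\times(\nabla\times\mathbf{A}) = \nabla(\mathbf{v}\cdot\mathbf{A}) - (\mathbf{v}\cdot\nabla)\mathbf{A}$, plus the definition $\mathbf{B} = \nabla\times\mathbf{A}$, gives
$$
\begin{aligned}
\frac{d}{dt}(\gamma m\mathbf{v}) &= q\left[-\nabla\Phi - \frac{\partial\mathbf{A}}{\partial t}\right] + q\left[\nabla(\mathbf{v}\cdot\mathbf{A}) - (\mathbf{v}\cdot\nabla)\mathbf{A}\right] \\
&= q\left[-\nabla\Phi - \frac{\partial\mathbf{A}}{\partial t}\right] + q\left[\mathbf{v}\times(\nabla\times\mathbf{A})\right] \\
\mathbf{F} &= q\left[\mathbf{E} + \mathbf{v}\times\mathbf{B}\right]
\end{aligned}
$$
If we adopt the definition that the relativistic linear momentum is $\mathbf{p} = \gamma m\mathbf{v}$, then the left-hand side is
the relativistic force while the right-hand side is the well-known Lorentz force of electromagnetism. Thus
the extended Lagrangian formulation correctly reproduces the well-known Lorentz force for a charged particle
moving in an electromagnetic field.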
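The vector identity used in the last step, $\mathbf{v}\times(\nabla\times\mathbf{A}) = \nabla(\mathbf{v}\cdot\mathbf{A}) - (\mathbf{v}\cdot\nabla)\mathbf{A}$ with $\mathbf{v}$ held fixed during the spatial differentiation, can be spot-checked numerically. The sketch below is illustrative only; the test field `vector_potential` is an arbitrary hypothetical choice and the derivatives are taken by central differences.

```python
def vector_potential(r):
    """Hypothetical smooth test field A(r); any differentiable choice would do."""
    x, y, z = r
    return (y * z, x * x, x * y + z * z)

def partial(f, r, k, h=1e-5):
    """Central-difference derivative of the vector function f along axis k."""
    rp, rm = list(r), list(r)
    rp[k] += h
    rm[k] -= h
    return [(a - b) / (2 * h) for a, b in zip(f(rp), f(rm))]

def jacobian(f, r):
    """J[k][i] = dA_i / dx_k."""
    return [partial(f, r, k) for k in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def v_cross_curl(v, r):
    """Left-hand side: v x (curl A)."""
    J = jacobian(vector_potential, r)
    curl = [J[1][2] - J[2][1], J[2][0] - J[0][2], J[0][1] - J[1][0]]
    return cross(v, curl)

def grad_minus_convective(v, r):
    """Right-hand side: grad(v.A) - (v.grad)A, with v treated as a constant vector."""
    J = jacobian(vector_potential, r)
    grad_vA = [sum(v[i] * J[k][i] for i in range(3)) for k in range(3)]
    v_grad_A = [sum(v[k] * J[k][i] for k in range(3)) for i in range(3)]
    return [g - a for g, a in zip(grad_vA, v_grad_A)]

v = (0.3, -0.2, 0.5)
r = (1.0, 2.0, -1.5)
lhs = v_cross_curl(v, r)
rhs = grad_minus_convective(v, r)
```

The velocity is treated as independent of position because $\mathbf{r}$ and $\mathbf{v}$ are independent variables in the Lagrangian, which is what allows $\nabla(\mathbf{v}\cdot\mathbf{A})$ to be rewritten in this way.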

17.7 Lorentz-invariant formulations of Hamiltonian mechanics

17.7.1 Extended canonical formalism
A Lorentz-invariant formulation of Hamiltonian mechanics can be developed that is built upon the extended
Lagrangian formalism assuming that the Hamiltonian and Lagrangian are related by a Legendre
transformation. That is,
$$
H(\mathbf{q}, \mathbf{p}, t) = \sum_{\mu=1}^{n} p_\mu\frac{dq^\mu}{dt} - L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right) \tag{17.73}
$$
where the generalized momentum is defined by
$$
p_\mu = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)} \tag{17.74}
$$

Struckmeier[Str08] assumes that the definitions of the extended Lagrangian $\mathbb{L}$, and the extended
Hamiltonian $\mathbb{H}$, are related by a Legendre transformation, and are based on variational principles, analogous to the
relation that exists between the conventional Lagrangian $L$ and Hamiltonian $H$. The Legendre transformation
requires defining the extended generalized (canonical) momentum-energy four vector $\mathbf{P}(s) = (\frac{E(s)}{c}, \mathbf{p}(s))$.
The momentum components of the momentum-energy four vector are given by the
$1 \le \mu \le n$ components using either the conventional or the extended Lagrangians as given in equation 17.68
$$
p_\mu(s) = \frac{\partial\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)} = \frac{\partial L}{\partial\left(\frac{dq^\mu}{dt}\right)} \tag{17.68}
$$

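The $\mu = 0$ component of the momentum-energy four vector is given by equation 17.69
$$
p_0 = \frac{1}{c}\frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)} = -\frac{H(\mathbf{q}, \mathbf{p}, t)}{c} = -\frac{\mathcal{E}(s)}{c} \tag{17.75}
$$
where $\mathcal{E}(s)$ represents the instantaneous generalized energy of the conventional Hamiltonian at the point $s$,
but not the functional form of $H(\mathbf{q}(s), \mathbf{p}(s), t(s))$. That is
$$
\mathcal{E}(s) \overset{\not\equiv}{=} H(\mathbf{q}(s), \mathbf{p}(s), t(s)) \tag{17.76}
$$
Note that $\mathcal{E}(s)$ does not give the function $H(\mathbf{q}, \mathbf{p}, t)$. Equations 17.68 and 17.69 give that
$$
p_0(s) = -\frac{\mathcal{E}(s)}{c} \tag{17.77}
$$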
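The extended Hamiltonian $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$, in an extended phase space, can be defined by the Legendre
transformation and the four-vector $\mathbf{P}$ to be
$$
\begin{aligned}
\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s)) &= \left(\mathbf{P}\cdot\frac{d\mathbf{q}}{ds}\right) - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) && (17.78) \\
&= \sum_{\mu=0}^{n} p_\mu\frac{dq^\mu}{ds} - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) \\
&= \sum_{\mu=1}^{n} p_\mu\frac{dq^\mu}{ds} - \mathcal{E}\frac{dt}{ds} - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) && (17.79)
\end{aligned}
$$
where the $p_0$ term has been written explicitly as $-\mathcal{E}\frac{dt}{ds}$ in equation 17.79. The extended Hamiltonian
$\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$ can carry all the information on the dynamical system that is carried by the extended
Lagrangian $\mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$ if the Hesse matrix is non-singular. That is, if
$$
\det\left(\frac{\partial^2\mathbb{L}}{\partial\left(\frac{dq^\mu}{ds}\right)\partial\left(\frac{dq^\nu}{ds}\right)}\right) \neq 0 \tag{17.80}
$$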



If the extended Lagrangian $\mathbb{L}(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds})$ is not homogeneous in the $n+1$ velocities $\frac{dq^\mu}{ds}$, then the extended
set of Euler-Lagrange equations 17.72 is not redundant. Thus equation 17.66 is not an identity but it can be
regarded as an implicit equation that is always satisfied by the extended set of Euler-Lagrange equations. As
a result, the Legendre transformation to an extended Hamiltonian exists. That is, equation 17.66 is identical
to the Legendre transform for $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$ which was shown to equal zero. Therefore
$$
\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s)) = 0 \tag{17.81}
$$
which means that the extended Hamiltonian $\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s))$ directly defines the restricted hypersurface on
which the particle motion is confined.
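The extended canonical equations of motion, derived using the extended Hamiltonian $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$
with the usual Hamiltonian mechanics relations, are:
$$
\frac{\partial\mathbb{H}}{\partial p_\mu} = \frac{dq^\mu}{ds} \tag{17.82}
$$
$$
\frac{\partial\mathbb{H}}{\partial q^\mu} = -\frac{dp_\mu}{ds} \tag{17.83}
$$
$$
\frac{\partial\mathbb{H}}{\partial t} = \frac{d\mathcal{E}}{ds} \tag{17.84}
$$
$$
\frac{\partial\mathbb{H}}{\partial\mathcal{E}} = -\frac{dt}{ds} \tag{17.85}
$$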
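These canonical equations give that the total derivative of $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$ with respect to $s$ is
$$
\begin{aligned}
\frac{d\mathbb{H}}{ds} &= \frac{\partial\mathbb{H}}{\partial q^\mu}\frac{dq^\mu}{ds} + \frac{\partial\mathbb{H}}{\partial p_\mu}\frac{dp_\mu}{ds} + \frac{\partial\mathbb{H}}{\partial t}\frac{dt}{ds} + \frac{\partial\mathbb{H}}{\partial\mathcal{E}}\frac{d\mathcal{E}}{ds} \\
&= -\frac{dp_\mu}{ds}\frac{dq^\mu}{ds} + \frac{dq^\mu}{ds}\frac{dp_\mu}{ds} + \frac{d\mathcal{E}}{ds}\frac{dt}{ds} - \frac{dt}{ds}\frac{d\mathcal{E}}{ds} = 0
\end{aligned}
\tag{17.86}
$$
That is, in contrast to the total time derivative of $H(\mathbf{q}, \mathbf{p}, t)$, the total $s$ derivative of the extended
Hamiltonian $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$ always vanishes, that is, $\mathbb{H}$ is autonomous, which is ideal
for use with Hamilton's equations of motion. The constraints give that $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s)) = 0$ (equation
17.81) and $\frac{d\mathbb{H}}{ds} = 0$ (equation 17.86), implying that the correlation between the extended and conventional
Hamiltonians is given by
$$
\begin{aligned}
\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s)) &= \sum_{\mu=1}^{n} p_\mu\frac{dq^\mu}{ds} - \mathcal{E}\frac{dt}{ds} - \mathbb{L}\left(\mathbf{q}, \frac{d\mathbf{q}}{ds}, t, \frac{dt}{ds}\right) && (17.87) \\
&= \sum_{\mu=1}^{n} p_\mu\frac{dq^\mu}{ds} - \mathcal{E}\frac{dt}{ds} - L\left(\mathbf{q}, \frac{d\mathbf{q}}{dt}, t\right)\frac{dt}{ds} && (17.88) \\
&= \sum_{\mu=1}^{n} p_\mu\frac{dq^\mu}{ds} - \mathcal{E}\frac{dt}{ds} + \left[H(\mathbf{q}, \mathbf{p}, t) - \sum_{\mu=1}^{n} p_\mu\frac{dq^\mu}{dt}\right]\frac{dt}{ds} && (17.89) \\
&= \left(H(\mathbf{q}, \mathbf{p}, t) - \mathcal{E}\right)\frac{dt}{ds} = 0 && (17.90)
\end{aligned}
$$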

since only the term with $\mu = 0$ does not cancel in equation 17.79. Equations 17.81 and 17.90 give that both the
left and right-hand sides of equation 17.90 are zero, while equation 17.86 implies that $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$
is a constant of motion, that is, $s$ is a cyclic variable for $\mathbb{H}(\mathbf{q}(s), \mathbf{p}(s), t(s), \mathcal{E}(s))$. Formally one can consider
that the extended Hamiltonian is a constant which equals zero
$$
\mathbb{H}(\mathbf{q}, \mathbf{p}, t, \mathcal{E}(s)) = \mathbb{E}(s) = 0 \tag{17.91}
$$
Equations 17.84 and 17.85 imply that $(\mathcal{E}, t)$ form a pair of canonically conjugate variables in addition to the
newly-introduced canonically-conjugate variables $(\mathbb{E}(s), s)$. Equation 17.90 shows that the motion in the
$2n+2$ extended phase space is constrained to the surface $\mathbb{H} = 0$, reflecting the fact that the observed system has
one less degree of freedom than used by the extended Hamiltonian.
In summary, the Lorentz-invariant extended canonical formalism leads to Hamilton's first-order equations
of motion in terms of derivatives with respect to $s$, where $s$ is related to the proper time $\tau$ for a relativistic
system.
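As an illustration that is not part of the original text, the Legendre transformation 17.79 can be carried out for the extended free-particle Lagrangian of example 17.5, $\mathbb{L} = \frac{1}{2}mc^2\left[\frac{1}{c^2}\left(\frac{d\mathbf{q}}{ds}\right)^2 - \left(\frac{dt}{ds}\right)^2 - 1\right]$. The steps below are a sketch under that assumption.

```latex
\begin{aligned}
\mathbf{p} &= \frac{\partial\mathbb{L}}{\partial\left(\frac{d\mathbf{q}}{ds}\right)} = m\frac{d\mathbf{q}}{ds},
\qquad
-\mathcal{E} = \frac{\partial\mathbb{L}}{\partial\left(\frac{dt}{ds}\right)} = -mc^2\frac{dt}{ds} \\
\mathbb{H} &= \mathbf{p}\cdot\frac{d\mathbf{q}}{ds} - \mathcal{E}\frac{dt}{ds} - \mathbb{L}
            = \frac{\mathbf{p}^2}{2m} - \frac{\mathcal{E}^2}{2mc^2} + \frac{1}{2}mc^2 \\
\mathbb{H} &= 0 \;\;\Longrightarrow\;\; \mathcal{E}^2 = \mathbf{p}^2c^2 + m^2c^4 \\
\frac{d\mathbf{q}}{ds} &= \frac{\partial\mathbb{H}}{\partial\mathbf{p}} = \frac{\mathbf{p}}{m},
\qquad
\frac{dt}{ds} = -\frac{\partial\mathbb{H}}{\partial\mathcal{E}} = \frac{\mathcal{E}}{mc^2} = \gamma
\end{aligned}
```

The constraint $\mathbb{H} = 0$ of equation 17.81 reproduces the energy-momentum relation 17.53, and the ratio of the two canonical equations gives $\frac{d\mathbf{q}}{dt} = \frac{\mathbf{p}c^2}{\mathcal{E}}$, the usual relativistic velocity.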

17.7.2 Extended Poisson Bracket representation

Struckmeier[Str08] investigated the usefulness of the extended formalism when applied to the Poisson bracket
representation of Hamiltonian mechanics. The extended Poisson bracket for two differentiable functions $F$
and $G$ is defined as
$$
[[F, G]] = \sum_{\mu=1}^{n}\left(\frac{\partial F}{\partial q^\mu}\frac{\partial G}{\partial p_\mu} - \frac{\partial F}{\partial p_\mu}\frac{\partial G}{\partial q^\mu}\right) - \frac{\partial F}{\partial t}\frac{\partial G}{\partial\mathcal{E}} + \frac{\partial F}{\partial\mathcal{E}}\frac{\partial G}{\partial t} \tag{17.92}
$$
As for the conventional Poisson bracket discussed in chapter 15, the extended Poisson bracket also leads to the
fundamental Poisson bracket relations
$$
[[q^\mu, q^\nu]] = 0 \qquad [[p_\mu, p_\nu]] = 0 \qquad [[q^\mu, p_\nu]] = \delta^\mu_\nu \tag{17.93}
$$
where $\mu, \nu = 0, 1, \ldots, n$. These are identical to the non-extended fundamental Poisson brackets.
The discussion of observables in Hamiltonian mechanics in chapter 15.2.5 can be trivially expanded to
the extended Poisson bracket representation. In particular, the total $s$ derivative of the function $F$ is given
by
$$
\frac{dF}{ds} = \frac{\partial F}{\partial s} + [[F, \mathbb{H}]] \tag{17.94}
$$
If $F$ commutes with the extended Hamiltonian, that is, the Poisson bracket equals zero, and if $\frac{\partial F}{\partial s} = 0$, then
$\frac{dF}{ds} = 0$. That is, the observable $F$ is a constant of motion.
Substituting the fundamental variables for $F$ gives
$$
\frac{dp_\mu}{ds} = [[p_\mu, \mathbb{H}]] = -\frac{\partial\mathbb{H}}{\partial q^\mu} \qquad \frac{dq^\mu}{ds} = [[q^\mu, \mathbb{H}]] = \frac{\partial\mathbb{H}}{\partial p_\mu} \tag{17.95}
$$
where $\mu = 0, 1, \ldots, n$. These are Hamilton's extended canonical equations of motion expressed in terms of
the system evolution parameter $s$. The extended Poisson bracket representation is a trivial extension of the
conventional canonical equations presented in chapter 15.3.
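The extended Poisson bracket can be evaluated numerically for a single spatial degree of freedom. The sketch below is illustrative only: the bracket is taken in the form $[[F,G]] = \frac{\partial F}{\partial q}\frac{\partial G}{\partial p} - \frac{\partial F}{\partial p}\frac{\partial G}{\partial q} - \frac{\partial F}{\partial t}\frac{\partial G}{\partial\mathcal{E}} + \frac{\partial F}{\partial\mathcal{E}}\frac{\partial G}{\partial t}$, derivatives are central differences, and the test Hamiltonian `H_ext` is an assumed free-particle form $\frac{p^2}{2m} - \frac{\mathcal{E}^2}{2mc^2} + \frac{1}{2}mc^2$ with $m = c = 1$.

```python
def partials(f, z, h=1e-6):
    """Central-difference gradient of f at the phase-space point z = (q, p, t, E)."""
    grads = []
    for k in range(4):
        zp, zm = list(z), list(z)
        zp[k] += h
        zm[k] -= h
        grads.append((f(*zp) - f(*zm)) / (2 * h))
    return grads

def extended_bracket(F, G, z):
    """[[F, G]] for one spatial degree of freedom plus the (t, E) pair."""
    Fq, Fp, Ft, Fe = partials(F, z)
    Gq, Gp, Gt, Ge = partials(G, z)
    return Fq * Gp - Fp * Gq - Ft * Ge + Fe * Gt

def H_ext(q, p, t, E, m=1.0, c=1.0):
    """Assumed free-particle extended Hamiltonian, used only as a test case."""
    return p * p / (2 * m) - E * E / (2 * m * c * c) + 0.5 * m * c * c

def q_of(q, p, t, E): return q
def p_of(q, p, t, E): return p
def t_of(q, p, t, E): return t
def E_of(q, p, t, E): return E

z = (0.3, 0.75, 0.0, 1.25)                  # a point on the surface H_ext = 0
dq_ds = extended_bracket(q_of, H_ext, z)    # equals dH/dp
dt_ds = extended_bracket(t_of, H_ext, z)    # equals -dH/dE
dp_ds = extended_bracket(p_of, H_ext, z)    # equals -dH/dq, zero for a free particle
dE_ds = extended_bracket(E_of, H_ext, z)    # equals dH/dt, zero here
```

The ratio `dq_ds / dt_ds` gives the ordinary velocity $\frac{dq}{dt}$, and the vanishing of `dE_ds` reflects the fact that this test Hamiltonian has no explicit time dependence.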

17.7.3 Extended canonical transformation and Hamilton-Jacobi theory

Struckmeier[Str08] presented plausible extended versions of canonical transformation and Hamilton-Jacobi
theories that can be used to provide a Lorentz-invariant formulation of Hamiltonian mechanics for relativistic
one-body systems. A detailed description can be found in Struckmeier[Str08] (see footnote 5).

17.7.4 Validity of the extended Hamilton-Lagrange formalism

It has been shown that the extended Lagrangian and Hamiltonian formalism, based on the parametric model
of Lanczos[La49], leads to a plausible manifestly-covariant approach for the one-body system. The general
features developed for handling Lagrangian and Hamiltonian mechanics carry over to the Special Theory
of Relativity assuming the use of a non-standard, extended Lagrangian or Hamiltonian. This expansion of
the range of validity of the well-known Hamiltonian and Lagrangian mechanics into the relativistic domain
is important, and reduces any Lorentz transformation to a canonical transformation. The validity of this
extended Hamilton-Lagrange formalism has been criticized, and problems exist extending this approach to
the $N$-body system for $N > 1$. For example, as discussed by Goldstein[Go50] and Johns[Jo05], each of
the $N$ moving bodies has its own world line and momentum. Defining the total momentum $\mathbf{P}$ requires
knowing simultaneously the momenta of the individual bodies, but simultaneity is body dependent and
thus even the total momentum is not a simple four vector. A general method is required that will allow
using a manifestly-covariant Lagrangian or Hamiltonian for the $N$-body system. For the one-body system,
the extended Hamilton-Lagrange formalism provides a powerful and logical approach to exploit analytical
mechanics in the relativistic domain that retains the form of the conventional Lagrangian/Hamiltonian
formalisms. Note that Noether's theorem relating energy and time is readily apparent using the extended
formalism.

5 Note that Greiner[Gr10] includes a reproduction of the Struckmeier paper[Str08].


17.7 Example: The Bohr-Sommerfeld hydrogen atom

The classical relativistic hydrogen atom was first solved by Sommerfeld in 1916. Sommerfeld used Bohr’s
“old quantum theory” plus Hamiltonian mechanics to make an important step in the development of quantum
mechanics by obtaining the first-order expressions for the fine structure of the hydrogen atom. As in the
non-relativistic case, the motion is confined to a plane allowing use of planar polar coordinates. Thus the
relativistic Lagrangian is given by
$$L = -\frac{mc^2}{\gamma} - U = -mc^2\sqrt{1 - \frac{\dot{r}^2 + r^2\dot{\theta}^2}{c^2}} + \frac{e^2}{r}$$
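The canonical momenta are given by

$$p_\theta = \frac{\partial L}{\partial \dot{\theta}} = \gamma m r^2 \dot{\theta}$$

$$p_r = \frac{\partial L}{\partial \dot{r}} = \gamma m \dot{r}$$

$$\dot{p}_\theta = \frac{\partial L}{\partial \theta} = 0$$

$$\dot{p}_r = \frac{\partial L}{\partial r} = \gamma m r \dot{\theta}^2 - \frac{e^2}{r^2}$$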
As for the non-relativistic case, $\theta$ is a cyclic variable and thus the
angular momentum $p_\theta = \gamma m r^2 \dot{\theta}$ is conserved.
The relativistic Hamiltonian for the Coulomb potential between an electron and the proton, assuming that
the motion is confined to a plane, which allows use of planar polar coordinates, is

$$H = \sqrt{p_r^2 c^2 + \frac{p_\theta^2 c^2}{r^2} + m^2 c^4} - \frac{e^2}{r}$$

[Figure: The advance of the perihelion of bound orbits due to the dependence of the relativistic mass on velocity.]
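The same equations of motion are obtained using Hamiltonian mechanics, that is:

$$\dot{\theta} = \frac{\partial H}{\partial p_\theta} = \frac{p_\theta}{\gamma m r^2}$$

$$\dot{r} = \frac{\partial H}{\partial p_r} = \frac{p_r}{\gamma m}$$

$$\dot{p}_\theta = -\frac{\partial H}{\partial \theta} = 0$$

$$\dot{p}_r = -\frac{\partial H}{\partial r} = \gamma m r \dot{\theta}^2 - \frac{e^2}{r^2}$$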
The radial dependence can be solved using either Lagrangian or Hamiltonian mechanics, but the solution
is non-trivial. Using the same techniques applied to solve Kepler’s problem leads to the radial solution

$$r = \frac{a}{1 + \epsilon \cos[\Gamma(\theta - \theta_0)]} \qquad \Gamma = \sqrt{1 - \frac{e^4}{c^2 p_\theta^2}} \qquad a = \frac{c^2 p_\theta^2 \Gamma^2}{E e^2} \qquad \epsilon = \sqrt{1 + \frac{\Gamma^2\left(1 - \frac{m^2 c^4}{E^2}\right)}{1 - \Gamma^2}}$$

where $E$ is the total energy. The apses are $r_{\min} = \frac{a}{1+\epsilon}$ for $\Gamma(\theta - \theta_0) = 0, 2\pi, 4\pi, \ldots$ and
$r_{\max} = \frac{a}{1-\epsilon}$ for $\Gamma(\theta - \theta_0) = \pi, 3\pi, \ldots$. The
perihelion advances between cycles due to the change in relativistic mass during the trajectory as shown in
the adjacent figure. This precession leads to the fine structure observed in the optical spectra of the hydrogen
atom. The same precession of the perihelion occurs for planetary motion; however, there is an effect of
comparable size due to gravity that requires use of general relativity to compute the trajectories.
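
The size of this precession can be estimated numerically. Successive perihelia occur when $\Gamma(\theta - \theta_0)$ increases by $2\pi$, so the perihelion advances by $\Delta\theta = 2\pi(1/\Gamma - 1)$ per radial cycle, where $\Gamma = \sqrt{1 - e^4/(c^2 p_\theta^2)}$. The following Python sketch evaluates this under two assumptions that are not part of the example itself: Gaussian units, for which $e^2/(\hbar c)$ equals the fine-structure constant $\alpha$, and an angular momentum $p_\theta$ given as a multiple of $\hbar$. The function names are illustrative only.

```python
import math

# Fine-structure constant alpha = e^2/(hbar c) in Gaussian units (CODATA value).
ALPHA = 7.2973525693e-3


def precession_factor(l_over_hbar: float) -> float:
    """Return Gamma = sqrt(1 - e^4/(c^2 p_theta^2)) for p_theta = l_over_hbar * hbar."""
    ratio = ALPHA / l_over_hbar  # e^2 / (c p_theta), dimensionless
    if ratio >= 1.0:
        raise ValueError("Gamma is real only for e^2/(c p_theta) < 1")
    return math.sqrt(1.0 - ratio**2)


def perihelion_advance(l_over_hbar: float) -> float:
    """Advance of the perihelion per radial cycle, in radians."""
    return 2.0 * math.pi * (1.0 / precession_factor(l_over_hbar) - 1.0)


if __name__ == "__main__":
    # Assumed illustrative values: p_theta = 1, 2, 3 units of hbar.
    for l in (1, 2, 3):
        print(f"p_theta = {l} hbar: advance = {perihelion_advance(l):.3e} rad")
```

For $p_\theta = \hbar$ the advance is close to $\pi\alpha^2$, that is, of order $10^{-4}$ radians per cycle, and it decreases as $p_\theta$ increases.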

17.8 The General Theory of Relativity

Einstein’s General Theory of Relativity expands the scope of relativistic mechanics to include non-inertial
accelerating frames plus a unified theory of gravitation. That is, the General Theory of Relativity
incorporates both the Special Theory of Relativity and Newton’s Law of Universal Gravitation. It provides
a unified theory of gravitation that is a geometric property of space and time. In particular, the curvature
of space-time is directly related to the four-momentum of matter and radiation. Unfortunately, Einstein’s
equations of general relativity are nonlinear partial diﬀerential equations that are diﬃcult to solve exactly,
and the theory requires knowledge of Riemannian geometry that goes beyond the scope of this book. The
following summarizes the fundamental variational concepts underlying the theory, and the experimental
evidence in support of the General Theory of Relativity.

17.8.1 The fundamental concepts

Einstein incorporated the following concepts in the General Theory of Relativity.

Mach’s principle:

The 1883 work “The Science of Mechanics” by the philosopher/physicist, Ernst Mach, criticized Newton’s
concept of an absolute frame of reference, and suggested that local physical laws are determined by the large-
scale structure of the universe. Mach’s Principle assumes that local motion of a rotating frame is determined
by the large-scale distribution of matter, that is, relative to the fixed stars. Einstein’s interpretation of
Mach’s statement was that the inertial properties of a body are determined by the presence of other bodies
in the universe, and he named this concept “Mach’s Principle”.

Equivalence principle:

The equivalence principle comprises closely-related concepts dealing with the equivalence of gravitational and
inertial mass. The weak equivalence principle states that the inertial mass and gravitational mass of a
body are identical, leading to acceleration that is independent of the nature of the body. Galileo demonstrated
this at the Leaning Tower of Pisa. Recent measurements have shown that this weak equivalence principle
is obeyed to a sensitivity of $5 \times 10^{-13}$. Einstein’s equivalence principle states that the outcome of
any local non-gravitational experiment, in a freely falling laboratory, is independent of the velocity of the
laboratory and its location in space-time. This principle implies that the result of local experiments must be
independent of the velocity of the apparatus. Einstein’s equivalence principle has been tested by searching
for variations of dimensionless fundamental constants such as the fine structure constant. The strong
equivalence principle combines the weak equivalence and Einstein equivalence principles, and implies
that the gravitational constant is constant everywhere in the universe. The strong equivalence principle
suggests that gravity is geometrical in nature and does not involve any fifth force in nature. Tests of the
strong equivalence principle have involved searches for variations in the gravitational constant $G$ and masses
of fundamental particles throughout the life of the universe.

Principle of covariance

A physical law that is expressed in a covariant formulation has the same mathematical form in all coordinate
systems, and is usually expressed in terms of tensor fields. In the Special Theory of Relativity, the Lorentz,
rotational, translational and reflection transformations between inertial coordinate frames are covariant. The
covariant quantities are the 4-scalars, and 4-vectors in Minkowski space-time. Einstein recognized that the
principle of covariance, that is built into the Special Theory of Relativity, should apply equally to accelerated
relative motion in the General Theory of Relativity. He exploited tensor calculus to extend the Lorentz
covariance to the more general local covariance in the General Theory of Relativity. The reduction locally
of the general metric tensor to the Minkowski metric corresponds to free-falling motion, that is geodesic
motion, and thus encompasses gravitation.

Principle of minimal gravitational coupling

The principle of minimal gravitational coupling requires that the total Lagrangian for the field equations of
general relativity consist of two additive parts, one part corresponding to the free gravitational Lagrangian,
and the other part to external source fields in curved space-time.

Correspondence principle
The Correspondence Principle states that the predictions of any new scientific theory must reduce to the
predictions of well established earlier theories under circumstances for which the preceding theory was known
to be valid. The Correspondence Principle is an important concept used both in quantum mechanics and
relativistic mechanics. Einstein’s Special Theory of Relativity satisfies the Correspondence Principle be-
cause it reduces to classical mechanics in the limit of velocities small compared to the speed of light. The
Correspondence Principle requires that the General Theory of Relativity reduce to the Special Theory of
Relativity for inertial frames, and should approximate Newton’s Theory of Gravitation in weak fields and at
low velocities.
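
The low-velocity limit quoted above for the Special Theory of Relativity can be checked numerically. The sketch below compares the relativistic kinetic energy $(\gamma - 1)mc^2$ with the Newtonian value $\frac{1}{2}mv^2$. The function names and the chosen speeds are illustrative assumptions, not taken from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def relativistic_kinetic_energy(mass: float, v: float) -> float:
    """Kinetic energy (gamma - 1) m c^2 in joules."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return (gamma - 1.0) * mass * C**2


def newtonian_kinetic_energy(mass: float, v: float) -> float:
    """Kinetic energy (1/2) m v^2 in joules."""
    return 0.5 * mass * v**2


def kinetic_energy_ratio(v: float) -> float:
    """Ratio of relativistic to Newtonian kinetic energy (independent of mass)."""
    return relativistic_kinetic_energy(1.0, v) / newtonian_kinetic_energy(1.0, v)


if __name__ == "__main__":
    # The ratio approaches 1 as v/c becomes small.
    for beta in (0.001, 0.01, 0.1, 0.6):
        print(f"v = {beta} c: ratio = {kinetic_energy_ratio(beta * C):.6f}")
```

The ratio tends to 1 for $v \ll c$, with the leading deviation being of order $(v/c)^2$, which is the sense in which the Newtonian result is recovered.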

17.8.2 Einstein’s postulates for the General Theory of Relativity

Einstein realized that the Equivalence Principle relating the gravitational and inertial masses implies that
the constancy of the velocity of light in vacuum cannot hold in the presence of a gravitational field. That
is, the Minkowskian line element must be replaced by a more general line element that takes gravity into
account. Einstein proposed that the Minkowskian line element in four-dimensional space-time, be replaced
by introducing a four-dimensional Riemannian geometrical structure where space, time, and matter are
combined. As described by Lanczos[La49], [Har03], [Mu08] this astonishingly bold proposal implies that
planetary motion is described as purely a geodesic phenomenon in a certain four-space of Riemannian
structure, where the geodesic is the equation of a curve on a manifold for any possible set of coordinates.
This implies that the concept of “gravitational force” is discarded, and planetary motion is a manifestation
of a pure geodesic phenomenon for forceless motion in a four-dimensional Riemannian structure.
Chapters 6-9 showed that the Lagrangian and Hamiltonian representations of variational principles are
powerful approaches for determining the equations governing geodesic constrained motion that are
independent of the chosen frame of reference, as is also required by the General Theory of Relativity. Thus
variational principles provide a theoretical representation for the General Theory of Relativity. The
Einstein-Hilbert action is defined as

$$S = \int \left[ \frac{1}{16\pi G c^{-4}} R + \mathcal{L} \right] \sqrt{-g}\, d^4x \tag{17.96}$$

where $G$ is the gravitational constant, $R$ is the Ricci scalar, $\mathcal{L}$ accounts for matter fields, and $g$ is the
determinant of the metric tensor matrix. Variational principles applied to the Einstein-Hilbert action lead to
Einstein’s sophisticated and advanced relativistic field equations of the General Theory of Relativity. Thus
the variational approach unifies relativistic mechanics and classical field theories, such as mechanics and
electromagnetism, which also were formulated in terms of least action. In relativistic mechanics, the use of
action identifies the gravitational coupling of the metric to matter as well as identifying conserved quantities
and symmetries using Noether’s theorem. The Einstein-Hilbert action expands the scope of variational
principles to include general relativity, illustrating the crucial role played by variational principles in physics.
To summarize, the Special Theory of Relativity implies that the Newtonian concepts of absolute frame
of reference and separation of space and time are invalid. The General Theory of Relativity goes beyond
the Special Theory by implying that the gravitational force, and the resultant planetary motion, can be
described as pure geodesic phenomena for forceless motion in a four-dimensional Riemannian structure.

17.8.3 Experimental evidence in support of the General Theory of Relativity

The following experimental evidence in support of Einstein’s Theory of General Relativity is compelling.

Kepler problem In 1915 Einstein showed that relativistic mechanics explained the anomalous precession
of the perihelion of the planet Mercury, that is, the axes of the elliptical Kepler orbit are observed to precess.

Deflection of light Einstein’s prediction of the deflection of light in a gravitational field was confirmed by
Eddington during the solar eclipse of 29 May 1919. Pictures of stars in the region around the Sun showed that
their apparent locations were slightly shifted because the light from the stars had been bent while passing
through the Sun’s gravitational field.

Gravitational lensing The deflection of light by the gravitational attraction of a massive object situated
between a distant star and the observer has resulted in the observation of multiple images of a distant quasar.

Gravitational time dilation and frequency shift Processes occurring in a strong gravitational field are
slower than in a weak gravitational field; this is called gravitational time dilation. In addition, light climbing
out of a gravitational well is red shifted. The gravitational time dilation has been measured many times and
the successful operation of the Global Positioning System provides an ongoing validation. The gravitational
red shift has been confirmed in the laboratory using the precise Mössbauer eﬀect in nuclear physics. Tests
in stronger gravitational fields are provided by studies of binary pulsars.
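
As a rough illustration of the size of the effect for the Global Positioning System, the sketch below uses the standard weak-field approximations: a clock at radius $r$ in the Earth's field has a fractional rate offset $\Phi/c^2$ with $\Phi = -GM/r$, and a clock moving with speed $v$ runs slow by $v^2/(2c^2)$. The orbit radius, the circular-orbit assumption, and the neglect of the Earth's rotation and oblateness are simplifying assumptions made here, not values taken from the text.

```python
GM_EARTH = 3.986004418e14   # Earth's gravitational parameter GM in m^3/s^2
C = 299_792_458.0           # speed of light in m/s
R_EARTH = 6.371e6           # assumed mean radius of the Earth in m
R_GPS = 2.6561e7            # assumed GPS orbit radius in m
SECONDS_PER_DAY = 86_400.0


def gravitational_offset(r_low: float, r_high: float) -> float:
    """Fractional rate by which the higher clock runs fast (weak-field limit)."""
    return GM_EARTH / C**2 * (1.0 / r_low - 1.0 / r_high)


def velocity_offset(r_orbit: float) -> float:
    """Fractional rate by which a clock on a circular orbit runs slow (negative)."""
    v_squared = GM_EARTH / r_orbit  # circular-orbit speed squared
    return -v_squared / (2.0 * C**2)


def net_offset_microseconds_per_day() -> float:
    """Net daily gain of the satellite clock relative to a ground clock."""
    net = gravitational_offset(R_EARTH, R_GPS) + velocity_offset(R_GPS)
    return net * SECONDS_PER_DAY * 1e6


if __name__ == "__main__":
    print(f"net offset: {net_offset_microseconds_per_day():.1f} microseconds per day")
```

Under these assumptions the gravitational term dominates the velocity term, so the satellite clock gains a few tens of microseconds per day relative to a clock on the ground.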

Black holes When the mass to radius ratio of a massive object becomes suﬃciently large, general relativity
predicts formation of a black hole, which is a region of space from which neither light nor matter can escape.
Supermassive black holes, with a mass that can be $10^6$ to $10^9$ solar masses, are thought to have played an
important role in formation of the galaxies.

Gravitational waves detection In 1916 Einstein predicted the existence of gravitational waves on the
basis of the theory of general relativity. The first implied detection of gravitational waves was made in
1976 by Hulse and Taylor, who detected a decrease in the orbital period due to significant energy loss which
presumably was associated with emission of gravitational waves by the compact neutron star in the binary
pulsar PSR 1913+16. The most compelling direct evidence for observation of a gravitational wave was obtained
on 14 September 2015 by the LIGO Laser Interferometer Gravitational-Wave Observatories. The waveform
detected by the two LIGO observatories matched the predictions of General Relativity for gravitational waves
emanating from the inward spiral plus merger of a pair of black holes of around 36 and 29 solar masses,
followed by the ringdown of the resultant single black hole. The gravitational wave emitted by this cataclysmic
merger reached Earth as a ripple in space-time that changed the length of the 4 km LIGO arm by a thousandth of
the width of the proton. The gravitational energy emitted was $3.0^{+0.5}_{-0.5}\,c^2$ solar masses. A second
observation of gravitational waves was made on 26 December 2015, and four similar observations were made during
2017. The detection of such minuscule changes in space-time is a truly remarkable achievement. This direct
detection of gravitational waves resulted in the award of the 2017 Nobel Prize to Rainer Weiss, Barry
Barish, and Kip Thorne. Gravitational wave detection has opened an exciting and powerful new frontier in
astrophysics that could lead to exciting new physics.

17.9 Implications of relativistic theory to classical mechanics

Einstein’s theories of relativity have had an enormous impact on twentieth century physics and the philosophy
of science. Relativistic mechanics is crucial to an understanding of the physics of the atom, nucleus and the
substructure of the nucleons, but the impacts are minimal in everyday experience. The Special Theory of
Relativity replaces Newton’s Laws of motion; i.e. Newton’s law is only an approximation applicable for low
velocities. The General Theory of Relativity replaces Newton’s Law of Gravitation and provides a natural
explanation of the equivalence principle. Einstein’s theories of relativity imply a profound and fundamental
change in the view of the separation of space, time, and mass, that contradicts the basic tenets that are
the foundation of Newtonian mechanics. The Newtonian concepts of absolute frame of reference, plus the
separation of space, time, and mass, are invalid at high velocities. Lagrangian and Hamiltonian variational
approaches to classical mechanics provide the formalism necessary for handling relativistic mechanics. The
present chapter has shown that logical extensions of Lagrangian and Hamiltonian mechanics lead to the
relativistically-invariant extended Lagrangian and Hamiltonian formulations of mechanics which are adequate
for handling one-body systems. However, major unsolved problems remain applying these formulations to
systems that have more than one body.

17.10 Summary
Special theory of relativity: The Special Theory of Relativity is based on Einstein’s postulates;
1) The laws of nature are the same in all inertial frames of reference.
2) The velocity of light in vacuum is the same in all inertial frames of reference.
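For a primed frame moving along the $x$ axis with velocity $v$, Einstein’s postulates imply the following
Lorentz transformations between the moving (primed) and stationary (unprimed) frames

$$\begin{aligned}
x' &= \gamma (x - vt) & x &= \gamma (x' + vt') \\
y' &= y & y &= y' \\
z' &= z & z &= z' \\
t' &= \gamma \left( t - \frac{vx}{c^2} \right) & t &= \gamma \left( t' + \frac{vx'}{c^2} \right)
\end{aligned}$$

where the Lorentz $\gamma$ factor is $\gamma \equiv \frac{1}{\sqrt{1 - \left(\frac{v}{c}\right)^2}}$.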
Lorentz transformations were used to illustrate Lorentz contraction, time dilation, and simultaneity. An
elementary review was given of relativistic kinematics including discussion of velocity transformation, linear
momentum, center-of-momentum frame, forces and energy.
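
The Lorentz transformation summarized above can be exercised numerically. The following Python sketch, with illustrative function names and an arbitrarily chosen event, applies the boost along $x$ and its inverse, and evaluates the interval $(ct)^2 - x^2$ in both frames.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_gamma(v: float) -> float:
    """Lorentz factor for relative velocity v (|v| < c)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def boost(t: float, x: float, v: float) -> tuple[float, float]:
    """Transform an event (t, x) to the frame moving with velocity v along x."""
    g = lorentz_gamma(v)
    return g * (t - v * x / C**2), g * (x - v * t)


def inverse_boost(t_prime: float, x_prime: float, v: float) -> tuple[float, float]:
    """Transform back to the unprimed frame; equivalent to a boost by -v."""
    return boost(t_prime, x_prime, -v)


def interval_squared(t: float, x: float) -> float:
    """Invariant interval (ct)^2 - x^2 for motion along x."""
    return (C * t) ** 2 - x**2


if __name__ == "__main__":
    v = 0.6 * C
    t, x = 1.0e-6, 100.0  # assumed example event: 1 microsecond, 100 m
    t_p, x_p = boost(t, x, v)
    print("primed event:", t_p, x_p)
    print("round trip:", inverse_boost(t_p, x_p, v))
    print("intervals:", interval_squared(t, x), interval_squared(t_p, x_p))
```

Up to floating-point rounding, the round trip recovers the original event and the interval is the same in both frames.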

Geometry of space-time: The concepts of four-dimensional space-time were introduced. A discussion of
four-vector scalar products introduced the use of contravariant and covariant tensors plus the Minkowski
metric with which the scalar product was defined. The Minkowski representation of space-time and the
momentum-energy four vector also were introduced.

Lorentz-invariant formulation of Lagrangian mechanics: The Lorentz-invariant extended Lagrangian
formalism, developed by Struckmeier[Str08], based on the parametric approach pioneered by
Lanczos[La49], provides a viable Lorentz-invariant extension of conventional Lagrangian mechanics that
is applicable for one-body motion in the realm of the Special Theory of Relativity.

Lorentz-invariant formulation of Hamiltonian mechanics: The Lorentz-invariant extended Hamiltonian
formalism, developed by Struckmeier based on the parametric approach pioneered by Lanczos, was
introduced. It provides a viable Lorentz-invariant extension of conventional Hamiltonian mechanics that is
applicable for one-body motion in the realm of the Special Theory of Relativity. In particular, it was shown
that the Lorentz-invariant extended Hamiltonian is conserved, making it ideally suited for solving complicated
systems using Hamiltonian mechanics via use of the Poisson-bracket representation of Hamiltonian
mechanics, canonical transformations, and the Hamilton-Jacobi techniques.

The General Theory of Relativity: An elementary summary was given of the fundamental concepts
of the General Theory of Relativity and the resultant unified description of the gravitational force plus
planetary motion as geodesic motion in a four-dimensional Riemannian structure. Variational mechanics
were shown to be ideally suited to applications of the General Theory of Relativity.

Philosophical implications: Newton’s equations of motion, and his Law of Gravitation, that reigned
supreme from 1687 to 1905, have been toppled from the throne by Einstein’s theories of relativistic mechanics.
By contrast, the complete independence of coordinate frames in Lagrangian and Hamiltonian formulations of
classical mechanics, plus the underlying Principle of Least Action, are equally valid in both the relativistic and
non-relativistic regimes. As a consequence, relativistic Lagrangian and Hamiltonian formulations underlie
much of modern physics, especially quantum physics, which explains why relativistic mechanics plays such
an important role in classical dynamics.

Workshop exercises
1. A relativistic snake of proper length 100 cm is travelling to the right across a butcher’s table at $v = 0.6c$. You
hold two meat cleavers, one in each hand, which are 100 cm apart. You strike the table simultaneously with
both cleavers at the moment when the left cleaver lands just behind the tail of the snake. You rationalize that
since the snake is moving with $v = 0.6c$ then the length of the snake is Lorentz contracted by the factor $\gamma = 5/4$
and thus the Lorentz-contracted length of the snake is 80 cm and thus will not be harmed. However, the snake
reasons that relative to it the cleavers are moving at $v = 0.6c$ and thus are only 80 cm apart when they strike
the 100 cm long snake and thus it will be severed. Use the Lorentz transformation to resolve this paradox.

2. Explain what is meant by the following statement: “Lorentz transformations are orthogonal transformations
in Minkowski space.”

3. Which of the following are invariant quantities in space-time?

(a) Energy
(b) Momentum
(c) Mass
(d) Force
(e) Charge
(f) The length of a vector
(g) The length of a four-vector

4. What does it mean for two events to have a spacelike interval? What does it mean for them to have a timelike
interval? Draw a picture to support your answer. In which case can events be causally connected?

Problems
1. A supply rocket flies past two markers on the Space Station that are 50 m apart in a time of 0.2 μs as measured
by an observer on the Space Station.

(a) What is the separation of the two markers as seen by the pilot riding in the supply rocket?
(b) What is the elapsed time as measured by the pilot in the supply rocket?
(c) What are the speeds calculated by the observer in the Space Station and the pilot of the supply rocket?

2. The Compton effect involves a photon of incident energy $E_i$ being scattered by an electron of mass $m_e$ which
initially is stationary. The photon scattered at an angle $\theta$ with respect to the incident photon has a final energy
$E_f$. Using the special theory of relativity derive a formula that relates $E_f$ and $E_i$ to $\theta$.

3. Pair creation involves production of an electron-positron pair by a photon. Show that such a process is
impossible unless some other body, such as a nucleus, is involved. Suppose that the nucleus has a mass $M$
and the electron mass $m_e$. What is the minimum energy that the photon must have in order to produce an
electron-positron pair?

4. A $K$ meson of rest energy 494 MeV decays into a $\mu$ meson of rest energy 106 MeV and a neutrino of zero
rest energy. Find the kinetic energies of the $\mu$ meson and the neutrino into which the $K$ meson decays while
at rest.
Chapter 18

The transition to quantum physics

18.1 Introduction
Classical mechanics, including extensions to relativistic velocities, embraces an unusually broad range of topics
ranging from astrophysics to nuclear and particle physics, from one-body to many-body statistical mechanics.
It is interesting to discuss the role of classical mechanics in the development of quantum mechanics which
plays a crucial role in physics. A valid question is “why discuss quantum mechanics in a classical mechanics
course?”. The answer is that quantum mechanics supersedes classical mechanics as the fundamental the-
ory of mechanics. Classical mechanics is an approximation applicable for situations where quantization is
unimportant. Thus there must be a correspondence principle that relates quantum mechanics to classical
mechanics, analogous to the relation between relativistic and non-relativistic mechanics. It is illuminating to
study the role played by the Hamiltonian formulation of classical mechanics in the development of quantal
theory and statistical mechanics. The Hamiltonian formulation is expressed in terms of the phase-space
variables q p for which there are well-established rules for transforming to quantal linear operators.

18.2 Brief summary of the origins of quantum theory

The last decade of the 19th century saw the culmination of classical physics. By 1900 scientists thought
that the basic laws of mechanics, electromagnetism, and statistical mechanics were understood and worried
that future physics would be reduced to confirming theories to the fifth decimal place, with few major new
discoveries to be made. However, technical developments such as photography, vacuum pumps, induction
coil, etc., led to important discoveries that revolutionized physics and toppled classical mechanics from its
throne at the beginning of the 20th century. Table 18.1 summarizes some of the major milestones leading
up to the development of quantum mechanics.
Max Planck searched for an explanation of the spectral shape of the black-body electromagnetic radia-
tion. He found an interpolation between two conflicting theories, one that reproduced the short wavelength
behavior, and the other the long wavelength behavior. Planck’s interpolation required assuming that electro-
magnetic radiation was not emitted with a continuous range of energies, but that electromagnetic radiation
is emitted in discrete bundles of energy called quanta. In December 1900 he presented his theory which
reproduced precisely the measured black body spectral distribution by assuming that the energy carried by
a single quantum must be an integer multiple of $h\nu$:

$$E = h\nu = \frac{hc}{\lambda} \tag{18.1}$$

where $\nu$ is the frequency of the electromagnetic radiation and Planck’s constant, $h = 6.626 \times 10^{-34}$ J·s, was
the best fit parameter of the interpolation. That is, Planck assumed that energy comes in discrete bundles
of energy equal to $h\nu$ which are called quanta. By making this extreme assumption, in an act of desperation,
Planck was able to reproduce the experimental black body radiation spectrum. The assumption that energy
was exchanged in bundles hinted that the classical laws of physics were inadequate in the microscopic
domain. The older generation physicists initially refused to believe Planck’s hypothesis which underlies
quantum theory. It was the new generation physicists, like Einstein, Bohr, Heisenberg, Born, Schrödinger,
and Dirac, who developed Planck’s hypothesis leading to the revolutionary quantum theory.
In 1905, Einstein predicted the existence of the photon, derived the theory of specific heat, as well
as deriving the Theory of Special Relativity. It is remarkable to realize that he developed these three
revolutionary theories in one year, when he was only 26 years old. Einstein uncovered an inconsistency in
Planck’s derivation of the black body spectral distribution in that it assumed the statistical part of the energy
is quantized, whereas the electromagnetic radiation assumed Maxwell’s equations with oscillator energies
being continuous. Planck demanded that light of frequency $\nu$ be packaged in quanta whose energies were
multiples of $h\nu$, but Planck never thought that light would have particle-like behavior. Newton believed that
light involved corpuscles, and Hamilton developed the Hamilton-Jacobi theory seeking to describe light in
terms of the corpuscle theory. However, Maxwell had convinced physicists that light was a wave phenomenon;
interference plus diffraction effects were convincing manifestations of the wave-like properties of light. In
order to reproduce Planck’s prediction, Einstein had to treat black-body radiation as if it consisted of a gas
of photons, each photon having energy $E = h\nu$. This was a revolutionary concept that returned to Newton’s
corpuscle theory of light. Einstein realized that there were direct tests of his photon hypothesis, one of which
is the photo-electric effect. According to Einstein, each photon has an energy $E = h\nu$, in contrast to the
classical case where the energy of the photoelectron depends on the intensity of the light. Einstein predicted
that the ejected electron will have a kinetic energy
$$E_k = h\nu - W \tag{18.2}$$
where $W$ is the work function which is the energy needed to remove an electron from a solid.
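
Einstein's photoelectric relation, kinetic energy equals photon energy $h\nu$ minus the work function, is easy to evaluate numerically. The sketch below is a minimal illustration. The function names, the wavelengths, and the work function of 2.3 eV are hypothetical values chosen for the example and are not taken from the text.

```python
H = 6.62607015e-34      # Planck's constant in J s
C = 299_792_458.0       # speed of light in m/s
EV = 1.602176634e-19    # one electron volt in joules


def photon_energy_ev(wavelength_m: float) -> float:
    """Photon energy E = h nu = h c / lambda, in electron volts."""
    return H * C / wavelength_m / EV


def photoelectron_energy_ev(wavelength_m: float, work_function_ev: float) -> float:
    """Maximum kinetic energy of the ejected electron in eV.

    Returns 0.0 when the photon energy is below the work function,
    in which case no electron is ejected.
    """
    return max(photon_energy_ev(wavelength_m) - work_function_ev, 0.0)


if __name__ == "__main__":
    work_function = 2.3  # hypothetical work function in eV
    for wavelength_nm in (300.0, 400.0, 700.0):
        energy = photoelectron_energy_ev(wavelength_nm * 1e-9, work_function)
        print(f"{wavelength_nm:.0f} nm: kinetic energy = {energy:.3f} eV")
```

The light intensity does not enter the calculation at all: in the photon picture the intensity sets the number of ejected electrons, whereas the frequency sets the energy of each one.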
Many older scientists, including Planck, accepted Einstein’s theory of relativity but were skeptical of the
photon concept, even after Einstein’s photon concept was vindicated in 1915 by Millikan who showed that,
as predicted, the energy of the ejected photoelectron depended on the frequency, and not intensity, of the
light. In 1923 Compton demonstrated that electromagnetic radiation scattered by free electrons obeyed
simple two-body scattering laws which finally convinced the many skeptics of the existence of the photon.
Table 181: Chronology of the development of quantum mechanics
Date Author Development
1887 Hertz Discovered the photo-electric effect
1895 Röntgen Discovered x-rays
1896 Becquerel Discovered radioactivity
1897 J.J. Thomson Discovered the first fundamental particle, the electron
1898 Pierre & Marie Curie Showed that thorium is radioactive which founded nuclear physics
1900 Planck Quantization  =  explained the black-body spectrum
1905 Einstein Theory of special relativity
1905 Einstein Predicted the existence of the photon
1906 Einstein Used Planck’s constant to explain specific heats of solids
1909 Millikan The oil drop experiment measured the charge on the electron
1911 Rutherford Discovered the atomic nucleus with radius 10−15 
1912 Bohr Bohr model of the atom explained the quantized states of hydrogen
1914 Moseley X-ray spectra determined the atomic number of the elements.
1915 Millikan Used the photo-electric effect to confirm the photon hypothesis.
1915 Wilson-Sommerfeld Proposed quantization of the action-angle integral
1921 Stern-Gerlach Observed space quantization in non-uniform magnetic field
1923 Compton Compton scattering of x-rays confirmed the photon hypothesis
1924 de Broglie Postulated wave-particle duality for matter and EM waves
1924 Bohr Explicit statement of the correspondence principle
1925 Pauli Postulated the exclusion principle
1925 Goudsmit-Uhlenbeck Postulated the spin of the electron of  = 12 h
1925 Heisenberg Matrix mechanics representation of quantum theory
1925 Dirac Related Poisson brackets and commutation relations
1926 Schrödinger Wave mechanics
1927 G.P. Thomson/Davisson Electron diffraction proved wave nature of electron
1928 Dirac Developed the Dirac relativistic wave equation

18.2.1 Bohr model of the atom

The Rutherford scattering experiment, performed at Manchester in 1911, discovered that the Au atom
comprised a positively charged nucleus of radius $\approx 10^{-14}$ m which is much smaller than the $1.35 \times 10^{-10}$ m
radius of the Au atom. Stimulated by this discovery, Niels Bohr joined Rutherford at Manchester in 1912
where he developed the Bohr model of the atom. This theory was remarkably successful in spite of having
serious inconsistencies and deficiencies. Bohr’s model assumptions were:
1) Electromagnetic radiation is quantized with $E = h\nu$.
2) Electromagnetic radiation exhibits behavior characteristic of the emission of photons with energy
$E = h\nu$ and momentum $p = \frac{E}{c}$. That is, it exhibits both wave-like and particle-like behavior.
3) Electrons are in stationary orbits that do not radiate, which contradicts the predictions of classical
electromagnetism.
4) The orbits are quantized such that the electron angular momentum is an integer multiple of $\frac{h}{2\pi} = \hbar$.
5) Atomic electromagnetic radiation is emitted with photon energy equal to the difference in binding
energy between the two atomic levels involved, $h\nu = E_1 - E_2$.
The first two assumptions are due to Planck and Einstein, while the last three were made by Niels Bohr.
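
As an illustration of how these assumptions are used, combining the angular-momentum quantization of assumption 4 with the classical Coulomb force balance for a circular orbit gives the familiar Bohr energies $E_n = -\frac{1}{2}\alpha^2 m c^2 / n^2$, where $\alpha$ is the fine-structure constant. The following Python sketch evaluates these levels and the photon wavelength from assumption 5. It assumes the non-relativistic limit and an infinitely heavy nucleus, and the function names are illustrative only.

```python
ALPHA = 7.2973525693e-3     # fine-structure constant
ME_C2_EV = 510_998.95       # electron rest energy m c^2 in eV
HC_EV_NM = 1239.841984      # h c in eV nm


def bohr_energy_ev(n: int) -> float:
    """Energy of the n-th Bohr orbit of hydrogen in eV (negative for bound states)."""
    if n < 1:
        raise ValueError("the quantum number n must be a positive integer")
    return -0.5 * ALPHA**2 * ME_C2_EV / n**2


def transition_wavelength_nm(n_upper: int, n_lower: int) -> float:
    """Wavelength of the photon emitted in the transition n_upper -> n_lower."""
    photon_energy = bohr_energy_ev(n_upper) - bohr_energy_ev(n_lower)
    if photon_energy <= 0.0:
        raise ValueError("n_upper must exceed n_lower for emission")
    return HC_EV_NM / photon_energy


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"n = {n}: E = {bohr_energy_ev(n):.4f} eV")
    print(f"3 -> 2 transition: {transition_wavelength_nm(3, 2):.1f} nm")
```

The ground-state energy comes out near $-13.6$ eV, and the $3 \to 2$ transition lies in the red part of the visible spectrum, close to the observed Balmer line of hydrogen.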
The deficiencies of the Bohr model were the philosophical problems of violating the tenets of classical
physics in explaining hydrogen-like atoms, that is, the theory was prescriptive, not deductive. The Bohr
model was based implicitly on the assumption that quantum theory contains classical mechanics as a limiting
case. Bohr explicitly stated this assumption which he called the correspondence principle, and which
played a pivotal role in the development of the older quantum theory. In 1924 Bohr justified the inconsis-
tencies of the old quantum theory by writing “As frequently emphasized, these principles, although they
are formulated by the help of classical conceptions, are to be regarded purely as laws of quantum theory,
which give us, not withstanding the formal nature of quantum theory, a hope in the future of a consistent
theory, which at the same time reproduces the characteristic features of quantum theory, important for its
applicability, and, nevertheless, can be regarded as a rational generalization of classical electrodynamics.”
The old quantum theory was remarkably successful in reproducing the black-body spectrum, specific heats
of solids, the hydrogen atom, and the periodic table of the elements. Unfortunately, from a methodological
point of view, the theory was a hodgepodge of hypotheses, principles, theorems, and computational recipes,
rather than a logically consistent theory. Every problem was first solved in terms of classical mechanics,
and then would pass through a mysterious quantization procedure involving the correspondence principle.
Although built on the foundation of classical mechanics, it required Bohr’s hypotheses which violated the
laws of classical mechanics and predictions of Maxwell’s equations.

18.2.2 Quantization
By 1912 Planck, and others, had abandoned the concept that quantum theory was a branch of classical
mechanics, and were searching to see if classical mechanics was a special case of a more general quantum
physics, or quantum physics was a science altogether outside of classical mechanics. Also they were trying
to find a consistent and rational reason for quantization to replace the ad hoc assumption of Bohr.
In 1912 Sommerfeld proposed that, in every elementary process, the atom gains or loses a definite amount
of action between times $0$ and $t$ of
$$S = \int_0^{t} L(t')\,dt' \tag{18.3}$$

where $S$ is the quantal analogue of the classical action function. It has been shown that the classical principle
of least action states that the action function is stationary for small variations of the trajectory. In 1915
Wilson and Sommerfeld recognized that the quantization of angular momentum could be expressed in terms
of the action-angle integral, that is equation 15.116. They postulated that, for every coordinate, the
action-angle variable is quantized
$$\oint p_k\,dq_k = n h \tag{18.4}$$

where the action-angle variable integral is over one complete period of the motion. That is, they postulated
that Hamilton’s phase space is quantized, but the microscopic granularity is such that the quantization is
only manifest for atomic-sized domains. That is, $n$ is a small integer for atomic systems in contrast to
$n \approx 10^{74}$ for the Earth-Sun two-body system.
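The contrast between atomic and macroscopic quantum numbers can be checked with a rough estimate. The sketch below evaluates the Wilson-Sommerfeld integral for a circular orbit, where it reduces to the momentum times the circumference, divided by $h$. The masses, speeds, and radii are assumed round values that are not taken from this chapter, and the function name is purely illustrative.

```python
import math

# Assumed round values in SI units (not taken from the text).
H_PLANCK = 6.62607015e-34      # Planck constant h, J s
M_EARTH = 5.972e24             # Earth mass, kg
R_ORBIT = 1.496e11             # mean Earth-Sun distance, m
V_ORBIT = 2.978e4              # mean orbital speed of the Earth, m/s
M_ELECTRON = 9.1093837e-31     # electron mass, kg
BOHR_RADIUS = 5.29177e-11      # radius of the lowest Bohr orbit, m
V_ELECTRON = 2.18769e6         # electron speed in the lowest Bohr orbit, m/s


def action_quantum_number(mass, speed, radius):
    """Return n = (closed-orbit integral of p dq) / h for a circular orbit.

    On a circular orbit the momentum is constant along the path, so the
    integral is the momentum times the circumference 2*pi*r.
    """
    action = mass * speed * 2.0 * math.pi * radius
    return action / H_PLANCK


n_earth = action_quantum_number(M_EARTH, V_ORBIT, R_ORBIT)
n_electron = action_quantum_number(M_ELECTRON, V_ELECTRON, BOHR_RADIUS)
print(f"Earth-Sun orbit:             n ~ 10^{math.floor(math.log10(n_earth))}")
print(f"Hydrogen ground-state orbit: n ~ {n_electron:.3f}")
```

The electron case returns a value of order unity, while the planetary case is larger by many tens of orders of magnitude, which is why the granularity of phase space is invisible for macroscopic orbits.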

Sommerfeld recognized that quantization of more than one degree of freedom is needed to obtain a more
accurate description of the hydrogen atom. Sommerfeld reproduced the experimental data by assuming
quantization of the three degrees of freedom,
$$\oint p_r\,dr = n_1 h \qquad \oint p_\theta\,d\theta = n_2 h \qquad \oint p_\phi\,d\phi = n_3 h \tag{18.5}$$

and solving the Hamilton-Jacobi equation by separation of variables. In 1916 the Bohr-Sommerfeld model solved
the classical orbits for the hydrogen atom, including relativistic corrections as described in example 17.7.
This reproduced fine structure observed in the optical spectra of hydrogen. The use of the canonical trans-
formation to action-angle variables proved to be the ideal approach for solving many such problems in
quantum mechanics. In 1921 Stern and Gerlach demonstrated space quantization by observing the splitting
of atomic beams deflected by non-uniform magnetic fields. This result was a major triumph for quantum
theory. Sommerfeld declared that “With their bold experimental method, Stern and Gerlach demonstrated
not only the existence of space quantization, they also proved the atomic nature of the magnetic moment,
its quantum-theoretic origin, and its relation to the atomic structure of electricity.”
In 1925 Pauli’s Exclusion Principle proposed that no more than one electron can have identical quantum
numbers and that the atomic electronic state is specified by four quantum numbers. Two students, Goudsmit
and Uhlenbeck suggested that a fourth two-valued quantum number was the electron spin of ± 2 . This
provided a plausible explanation for the structure of multi-electron atoms.

18.2.3 Wave-particle duality

In his 1924 doctoral thesis, Prince Louis de Broglie proposed the hypothesis of wave-particle duality which
was a pivotal development in quantum theory. de Broglie used the classical concept of a matter wavepacket,
analogous to classical wave packets discussed in chapter 3.11. He assumed that both the group and signal
velocities of a matter wave packet must equal the velocity of the corresponding particle. By analogy with
Einstein’s relation for the photon, and using the Theory of Special Relativity, de Broglie assumed that
$$\hbar\omega = E = \frac{mc^2}{\sqrt{1 - \frac{v^2}{c^2}}} \tag{18.6}$$

The group velocity is required to equal the velocity of the mass $m$

$$v_{group} = \left(\frac{d\omega}{dk}\right) = \left(\frac{d\omega}{dv}\right)\left(\frac{dv}{dk}\right) = v \tag{18.7}$$
This gives
$$\left(\frac{dk}{dv}\right) = \frac{1}{v}\left(\frac{d\omega}{dv}\right) = \left(\frac{m}{\hbar}\right)\left(1 - \frac{v^2}{c^2}\right)^{-\frac{3}{2}} \tag{18.8}$$
Integration of this equation assuming that $k = 0$ when $v = 0$, then gives
$$\hbar\mathbf{k} = \frac{m\mathbf{v}}{\sqrt{1 - \frac{\mathbf{v}\cdot\mathbf{v}}{c^2}}} = \mathbf{p} \tag{18.9}$$

This relation, derived by de Broglie, is required to ensure that the particle travels at the group velocity
of the wave packet characterizing the particle. Note that although the relations used to characterize the
matter waves are purely classical, the physical content of such waves is beyond classical physics. In 1927 C.
Davisson and G.P. Thomson independently observed electron diffraction confirming wave/particle duality for
the electron. Ironically, J.J. Thomson discovered that the electron was a particle, whereas his son attributed
it to an electron wave.
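The requirement that the particle travels at the group velocity of its wave packet can be checked numerically. The sketch below takes the relativistic relations for $\hbar\omega$ and $\hbar k$ as functions of $v$, estimates $d\omega/dk$ by a central difference, and compares the result with $v$. The electron mass, the step size, and the function names are assumptions chosen only for illustration.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 2.99792458e8         # speed of light, m/s
M_E = 9.1093837015e-31   # electron mass, kg (illustrative choice of particle)


def angular_frequency(v, m=M_E):
    """Einstein relation: hbar*omega = m c^2 / sqrt(1 - v^2/c^2)."""
    return m * C ** 2 / (HBAR * math.sqrt(1.0 - (v / C) ** 2))


def wave_number(v, m=M_E):
    """de Broglie relation: hbar*k = m v / sqrt(1 - v^2/c^2)."""
    return m * v / (HBAR * math.sqrt(1.0 - (v / C) ** 2))


def group_velocity(v, m=M_E, rel_step=1e-4):
    """Estimate d(omega)/dk from central differences taken in v."""
    dv = rel_step * v
    d_omega = angular_frequency(v + dv, m) - angular_frequency(v - dv, m)
    d_k = wave_number(v + dv, m) - wave_number(v - dv, m)
    return d_omega / d_k


for beta in (0.01, 0.5, 0.9):
    v = beta * C
    print(f"v/c = {beta:4.2f}   v_group / v = {group_velocity(v) / v:.6f}")
```

The ratio stays at unity to within the finite-difference error for slow and fast particles alike, whereas the phase velocity $\omega/k = c^2/v$ computed from the same two functions exceeds $c$.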
Heisenberg developed the modern matrix formulation of quantum theory in 1925; he was 24 years old
at the time. A few months later Schrödinger developed wave mechanics based on de Broglie’s concept of
wave-particle duality. The matrix-mechanics and wave-mechanics formulations of quantum theory are radically different.
Heisenberg’s algebraic approach employs non-commuting quantities and unfamiliar mathematical techniques
that emphasize the discreteness characteristic of the corpuscle aspect. In contrast, Schrödinger used the
familiar analytical approach that is an extension of classical laws of motion and waves which stressed the
element of continuity.

18.3 Hamiltonian in quantum theory

18.3.1 Heisenberg’s matrix-mechanics representation
The algebraic Heisenberg representation of quantum theory is analogous to the algebraic Hamiltonian rep-
resentation of classical mechanics, and shows best how quantum theory evolved from, and is related to,
classical mechanics. Heisenberg decided to ignore the prevailing conceptual theories, such as classical me-
chanics, and based his quantum theory on observables. This approach was influenced by the success of
Bohr’s older quantum theory and Einstein’s theory of relativity. He abandoned the classical notions that
the canonical variables $q_k, p_k$ can be measured directly and simultaneously. Secondly he wished to absorb the
correspondence principle directly into the theory instead of it being an ad hoc procedure tailored to each
application. Heisenberg considered the Fourier decomposition of transition amplitudes between discrete states
and found that the conjugate variables do not commute. Heisenberg derived, for the first
time, the correct energy levels of the one-dimensional harmonic oscillator as $E = \hbar\omega(n + \frac{1}{2})$ which was a
significant achievement. Born recognized that Heisenberg’s strange multiplication and commutation rules for
two variables corresponded to matrix algebra. Prior to 1925 matrix algebra was an obscure branch of pure
mathematics not known or used by the physics community. Heisenberg, Born, and the young mathematician
Jordan developed the commutation rules of matrix mechanics. Heisenberg’s approach represents the
classical position and momentum coordinates $q, p$ by matrices $\mathbf{q}$ and $\mathbf{p}$, with corresponding matrix elements
$q_{mn}$ and $p_{mn}$. Born showed that the trace of the matrix

$$H(\mathbf{p}\mathbf{q}) = \mathbf{p}\dot{\mathbf{q}} - L \tag{18.10}$$

gives the Hamiltonian function $H(\mathbf{p}, \mathbf{q})$ of the matrices $\mathbf{q}$ and $\mathbf{p}$ which leads to Hamilton’s canonical equations

$$\dot{\mathbf{q}} = \frac{\partial H}{\partial \mathbf{p}} \qquad \dot{\mathbf{p}} = -\frac{\partial H}{\partial \mathbf{q}} \tag{18.11}$$

Heisenberg and Born also showed that the commutators of $\mathbf{q}$ and $\mathbf{p}$ are

$$q_j p_k - p_k q_j = i\hbar\,\delta_{jk} \tag{18.12}$$
$$q_j q_k - q_k q_j = 0$$
$$p_j p_k - p_k p_j = 0$$

Born realized that equation (18.12) is the only fundamental equation for introducing $\hbar$ into the theory in a
logical and consistent way.
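The commutation rule and the oscillator energy levels can be illustrated with explicit matrices. The sketch below builds the position and momentum matrices of the one-dimensional harmonic oscillator in its energy basis, truncated to a finite size, and evaluates both $\mathbf{q}\mathbf{p} - \mathbf{p}\mathbf{q}$ and the Hamiltonian. The natural units, the basis size, and the helper names are assumptions made for illustration. Because the true matrices are infinite, the last diagonal element of each truncated result is spoiled; a finite commutator must have zero trace.

```python
import math

HBAR = 1.0    # natural units, an illustrative choice
MASS = 1.0
OMEGA = 1.0
N = 6         # size of the truncated basis; the true matrices are infinite


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def oscillator_matrices(size=N):
    """Position and momentum matrices in the oscillator energy basis."""
    q = [[0j] * size for _ in range(size)]
    p = [[0j] * size for _ in range(size)]
    q_scale = math.sqrt(HBAR / (2.0 * MASS * OMEGA))
    p_scale = math.sqrt(HBAR * MASS * OMEGA / 2.0)
    for n in range(1, size):
        root = math.sqrt(n)           # coupling between levels n-1 and n
        q[n - 1][n] = q[n][n - 1] = q_scale * root
        p[n - 1][n] = -1j * p_scale * root
        p[n][n - 1] = 1j * p_scale * root
    return q, p


q, p = oscillator_matrices()
qp, pq = matmul(q, p), matmul(p, q)
commutator = [[qp[i][j] - pq[i][j] for j in range(N)] for i in range(N)]
q2, p2 = matmul(q, q), matmul(p, p)
hamiltonian = [[p2[i][j] / (2.0 * MASS) + 0.5 * MASS * OMEGA ** 2 * q2[i][j]
                for j in range(N)] for i in range(N)]

print("diag (qp - pq) / (i hbar):",
      [round((commutator[n][n] / (1j * HBAR)).real, 6) for n in range(N)])
print("diag H / (hbar omega):    ",
      [round(hamiltonian[n][n].real / (HBAR * OMEGA), 6) for n in range(N)])
```

Apart from the final, truncation-affected entry, the commutator diagonal equals $i\hbar$ and the Hamiltonian diagonal follows $\hbar\omega(n + \frac{1}{2})$.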
Chapter 1524 discussed the formal correspondence between the Poisson bracket, defined in chapter 153,
and the commutator in classical mechanics. It was shown that the commutator of two functions equals a
constant multiplicative factor  times the corresponding Poisson Bracket. That is

(  −   ) =  [   ] (18.13)

where the multiplicative factor  is a number independent of    , and the commutator.

In 1925, Paul Dirac, a 23-year old graduate student at Cambridge, recognized the crucial importance of
the above correspondence between the commutator and the Poisson Bracket of two functions, to relating
classical mechanics and quantum mechanics. Dirac noted that if the constant $\lambda$ is assigned the value $\lambda = i\hbar$,
then equation 18.13 directly relates Heisenberg’s commutation relations between the fundamental canonical
variables $(q_j, p_k)$ to the corresponding classical Poisson Bracket $[q_j, p_k]$. That is,

$$q_j p_k - p_k q_j = i\hbar\,[q_j, p_k] = i\hbar\,\delta_{jk} \tag{18.14}$$

$$q_j q_k - q_k q_j = i\hbar\,[q_j, q_k] = 0 \tag{18.15}$$
$$p_j p_k - p_k p_j = i\hbar\,[p_j, p_k] = 0 \tag{18.16}$$

Dirac recognized that the correspondence between the classical Poisson bracket, and quantum commutator,
given by equation (18.13), provides a logical and consistent way that builds quantization directly into
the theory, rather than using an ad-hoc, case-dependent, hypothesis as used by the older quantum theory of
Bohr. The basis of Dirac’s quantization principle involves replacing the classical Poisson Bracket, $[F_j, G_k]$,
by the commutator, $\frac{1}{i\hbar}(F_j G_k - G_k F_j)$. That is,

$$[F_j, G_k] \Longrightarrow \frac{1}{i\hbar}(F_j G_k - G_k F_j) \tag{18.17}$$

Hamilton’s canonical equations, as introduced in chapter 15, are only applicable to classical mechanics
since they assume that the position and conjugate momentum can be specified both exactly and
simultaneously, which contradicts Heisenberg’s Uncertainty Principle. In contrast, the Poisson bracket
generalization of Hamilton’s equations allows for non-commuting variables plus the corresponding uncertainty
principle. That is, the transformation from classical mechanics to quantum mechanics can be accomplished
simply by replacing the classical Poisson Bracket by the quantum commutator, as proposed by Dirac. The
formal analogy between classical Hamiltonian mechanics, and the Heisenberg representation of quantum
mechanics is strikingly apparent using the correspondence between the Poisson Bracket representation of
Hamiltonian mechanics and Heisenberg’s matrix mechanics.
The direct relation between the quantum commutator, and the corresponding classical Poisson Bracket,
applies to many observables. For example, the quantum analogs of Hamilton’s equations of motion are
given by use of Hamilton’s equations of motion, 15.53 and 15.56, and replacing each Poisson Bracket by the
corresponding commutator. That is

$$\frac{dq_k}{dt} = \frac{\partial H}{\partial p_k} = [q_k, H] = \frac{1}{i\hbar}(q_k H - H q_k) \tag{18.18}$$
$$\frac{dp_k}{dt} = -\frac{\partial H}{\partial q_k} = [p_k, H] = \frac{1}{i\hbar}(p_k H - H p_k) \tag{18.19}$$

Chapter 1525 discussed the time dependence of observables in Hamiltonian mechanics. Equation 1545
gave the total time derivative of any observable  to be

 
= + [ ] (18.20)
 
Equation 1817 can be used to replace the Poisson Bracket by the quantum commutator, which gives the
corresponding time dependence of observables in quantum physics.

  1
= + ( − ) (18.21)
  ~
In quantum mechanics, equation 1821 is called the Heisenberg equation. Note that if the observable  is
chosen to be a fundamental canonical variable, then  
 = 0 =  and equation 1520 reduces to Hamilton’s


equations 1818 and 1819.

The analogies between classical mechanics and quantum mechanics extend further. For example, if $G$ is
a constant of motion, that is $\frac{dG}{dt} = 0$, then Heisenberg’s equation of motion gives

$$\frac{\partial G}{\partial t} + \frac{1}{i\hbar}(GH - HG) = 0 \tag{18.22}$$
Moreover, if $G$ is not an explicit function of time, then

$$0 = \frac{1}{i\hbar}(GH - HG) \tag{18.23}$$
That is, the transition to quantum physics shows that, if $G$ is a constant of motion, and is not explicitly
time dependent, then $G$ commutes with the Hamiltonian $H$.
The above discussion has illustrated the close and beautiful correspondence between the Poisson Bracket
representation of classical Hamiltonian mechanics, and the Heisenberg representation of quantum mechanics.
Dirac provided the elegant and simple correspondence principle connecting the Poisson bracket representation
of classical Hamiltonian mechanics, to the Heisenberg representation of quantum mechanics.

18.3.2 Schrödinger’s wave-mechanics representation

The wave mechanics formulation of quantum mechanics, by the Austrian theorist Schrödinger, was built on
the wave-particle duality concept that was proposed in 1924 by Louis de Broglie. Schrödinger developed
his wave mechanics representation of quantum physics a year after the development of matrix mechanics
by Heisenberg and Born. The Schrödinger wave equation is based on the non-relativistic Hamilton-Jacobi
representation of a wave equation, melded with the operator formalism of Born and Wiener. The 39-year old
Schrödinger was an expert in classical mechanics and wave theory, which was invaluable when he developed
the important Schrödinger equation. As mentioned in chapter 15.4.4, the Hamilton-Jacobi theory is a
formalism of classical mechanics that allows the motion of a particle to be represented by a wave. That is,
the wavefronts are surfaces of constant action $S$ and the particle momenta are normal to these
constant-action surfaces, that is, $\mathbf{p} = \nabla S$. The wave-particle duality of Hamilton-Jacobi theory is a natural way to
handle the wave-particle duality proposed by de Broglie.
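Consider the classical Hamilton-Jacobi equation for one body, given by 18.20

$$\frac{\partial S}{\partial t} + H(\mathbf{q}, \nabla S, t) = 0 \tag{18.24}$$

If the Hamiltonian is time independent, then equation 15.90 gives that

$$\frac{\partial S}{\partial t} = -H(\mathbf{q}, \mathbf{p}, t) = -E(\boldsymbol{\alpha}) \tag{18.25}$$

The integration of the time dependence is trivial, and thus the action integral for a time-independent
Hamiltonian is
$$S(\mathbf{q}, \boldsymbol{\alpha}, t) = W(\mathbf{q}, \boldsymbol{\alpha}) - E(\boldsymbol{\alpha})\,t \tag{18.26}$$
A formal transformation gives

$$E = -\frac{\partial S}{\partial t} \qquad \mathbf{p} = \nabla S \tag{18.27}$$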

Consider that the classical time-independent Hamiltonian, for motion of a single particle, is represented
by the Hamilton-Jacobi equation.
$$H = \frac{\mathbf{p}^2}{2m} + U(\mathbf{q}) = -\frac{\partial S}{\partial t} \tag{18.28}$$
Substituting for $\mathbf{p}$ leads to the classical Hamilton-Jacobi relation in terms of the action $S$
$$\frac{1}{2m}(\nabla S \cdot \nabla S) + U(\mathbf{q}) = -\frac{\partial S}{\partial t} \tag{18.29}$$
By analogy with the Hamilton-Jacobi equation, Schrödinger proposed the quantum operator equation

$$i\hbar\frac{\partial \psi}{\partial t} = \hat{H}\psi \tag{18.30}$$

where $\hat{H}$ is an operator given by

$$\hat{H} = -\frac{\hbar^2}{2m}\nabla^2 + U(\mathbf{q}) \tag{18.31}$$
In 1926 Max Born and Norbert Wiener introduced the operator formalism into matrix mechanics for predic-
tion of observables and this has become an integral part of quantum theory. In the operator formalism, the
observables are represented by operators that project the corresponding observable from the wavefunction.
That is, the quantum operator formalism for the assumed momentum and energy operators, that operate
on the wavefunction $\psi$, are
$$p_x = \frac{\hbar}{i}\frac{\partial}{\partial x} \qquad E = -\frac{\hbar}{i}\frac{\partial}{\partial t} \tag{18.32}$$
Formal transformations of $p$ and $E$ in the Hamiltonian (18.28) lead to the time-independent Schrödinger
equation
$$-\frac{\hbar^2}{2m}\frac{\partial^2 \psi}{\partial x^2} + U(x)\psi = E\psi \tag{18.33}$$
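The time-independent Schrödinger equation can be solved numerically to see how discrete energies emerge from a wave equation. The sketch below is a minimal shooting method, assuming a harmonic-oscillator potential $U(x) = \frac{1}{2}x^2$ in units where $\hbar = m = \omega = 1$; the potential, the integration range, and the function names are illustrative choices, not taken from the text. The wavefunction is integrated across the well and the energy is bisected until the solution no longer diverges at the far boundary.

```python
def shoot(energy, half_width=5.0, steps=4000):
    """Integrate psi'' = (x^2 - 2E) psi from -L to +L and return psi(+L).

    Units are chosen so that hbar = m = omega = 1 and U(x) = x^2 / 2
    (an assumed harmonic-oscillator potential, used only as a test case).
    """
    h = 2.0 * half_width / steps
    psi_prev, psi = 0.0, 1.0e-10          # psi(-L) = 0, tiny arbitrary slope
    for i in range(1, steps):
        x = -half_width + i * h
        # Central-difference form of the second derivative
        psi_next = 2.0 * psi - psi_prev + h * h * (x * x - 2.0 * energy) * psi
        psi_prev, psi = psi, psi_next
    return psi


def eigen_energy(e_low, e_high, tolerance=1.0e-9):
    """Bisect on E for the sign change of psi(+L) that marks a bound state."""
    f_low = shoot(e_low)
    while e_high - e_low > tolerance:
        e_mid = 0.5 * (e_low + e_high)
        f_mid = shoot(e_mid)
        if f_low * f_mid > 0.0:
            e_low, f_low = e_mid, f_mid
        else:
            e_high = e_mid
    return 0.5 * (e_low + e_high)


# Each bracket is assumed to contain exactly one level.
levels = [eigen_energy(n + 0.3, n + 0.7) for n in range(3)]
for n, energy in enumerate(levels):
    print(f"n = {n}   E / (hbar omega) = {energy:.6f}")
```

With these settings the three values land close to $n + \frac{1}{2}$, in agreement with the oscillator levels that Heisenberg obtained from matrix mechanics.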

Assume that the wavefunction is of the form

$$\psi = A e^{\frac{iS}{\hbar}} \tag{18.34}$$

where the action $S$ gives the phase of the wavefront, and $A$ the amplitude of the wave, as described in
chapter 15.4.4. The time dependence, that characterizes the motion of the wavefront, is contained in the
time dependence of $S$. This form for the wavefunction has the advantage that the wavefunction frequently
factors into a product of terms, e.g. $\psi = R(r)\Theta(\theta)\Phi(\phi)$ which corresponds to a summation of the exponents
$S = S_r + S_\theta + S_\phi - Et$. This summation form is exploited by separation of the variables, as discussed in
chapter 15.4.3.
Inserting $\psi$ (18.34) into equation (18.30), treating the amplitude $A$ as constant, plus using the fact that
$$\frac{\partial^2 \psi}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial \psi}{\partial x}\right) = \frac{\partial}{\partial x}\left(\frac{i}{\hbar}\frac{\partial S}{\partial x}\psi\right) = -\frac{1}{\hbar^2}\left(\frac{\partial S}{\partial x}\right)^2\psi + \frac{i}{\hbar}\frac{\partial^2 S}{\partial x^2}\psi \tag{18.35}$$

leads to
$$-\frac{\partial S}{\partial t} = \frac{1}{2m}(\nabla S \cdot \nabla S) + U(\mathbf{q}) - \frac{i\hbar}{2m}\nabla^2 S \tag{18.36}$$
Note that if Planck’s constant $\hbar = 0$ then the imaginary term in equation (18.36) is zero, leading to 18.36
being real, and identical to the Hamilton-Jacobi result, equation 18.29. The fact that equation 18.36
equals the Hamilton-Jacobi equation in the limit $\hbar \to 0$, illustrates the close analogy between the
wave-particle duality of the classical Hamilton-Jacobi theory, and de Broglie’s wave-particle duality in Schrödinger’s
quantum wave-mechanics representation.
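The classical limit can be made concrete with a small numerical experiment. The sketch below assumes a unit-amplitude wavefunction $\psi = e^{iS/\hbar}$ with a hypothetical smooth action profile $S(x)$, chosen only for illustration. It first checks, by finite differences, that the second derivative of $\psi$ splits into a term in $(\partial S/\partial x)^2$ and a term in $i\hbar\,\partial^2 S/\partial x^2$, and then shows how the size of the second term relative to the first shrinks as $\hbar$ is reduced.

```python
import cmath
import math


def S(x):
    """Hypothetical smooth action profile, chosen only for illustration."""
    return math.sin(x) + 0.3 * x * x


def dS(x):
    return math.cos(x) + 0.6 * x


def d2S(x):
    return -math.sin(x) + 0.6


def psi(x, hbar):
    """Unit-amplitude wavefunction psi = exp(i S / hbar)."""
    return cmath.exp(1j * S(x) / hbar)


def second_derivative_numeric(x, hbar, h=1e-4):
    """Central-difference estimate of d^2 psi / dx^2."""
    return (psi(x + h, hbar) - 2.0 * psi(x, hbar) + psi(x - h, hbar)) / h ** 2


def second_derivative_formula(x, hbar):
    """Analytic form: [-(S'/hbar)^2 + i S''/hbar] psi."""
    return (-(dS(x) / hbar) ** 2 + 1j * d2S(x) / hbar) * psi(x, hbar)


x0 = 1.0
mismatch = abs(second_derivative_numeric(x0, 1.0) - second_derivative_formula(x0, 1.0))
print("finite-difference check of the second-derivative identity:", mismatch)

ratios = []
for hbar in (1.0, 0.1, 0.01):
    quantum_term = abs(hbar * d2S(x0))    # magnitude of the i*hbar*S'' term
    classical_term = dS(x0) ** 2          # the (dS/dx)^2 term
    ratios.append(quantum_term / classical_term)
    print(f"hbar = {hbar:5.2f}   |hbar S''| / (S')^2 = {ratios[-1]:.5f}")
```

The ratio scales linearly with $\hbar$, so the imaginary correction becomes negligible compared with the Hamilton-Jacobi term as $\hbar \to 0$.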
The Schrödinger approach was accepted in 1926 and exploited extensively with tremendous success, since
it is much easier to grasp conceptually than is the algebraic approach of Heisenberg. Initially there was much
conflict between the proponents of these two seemingly contradictory approaches, but this was resolved by Schrödinger
who showed in 1926 that there is a formal mathematical identity between wave mechanics and matrix
mechanics. That is, these two quantal representations of Hamiltonian mechanics are equivalent, even though
they are built on either the Poisson bracket representation, or the Hamilton-Jacobi representation. Wave
mechanics is based intimately on the quantization rule of the action variable. Heisenberg’s Uncertainty
Principle is automatically satisfied by Schrödinger’s wave mechanics since the uncertainty principle is a
feature of all wave motion, as described in chapter 3.
In 1928 Dirac developed a relativistic wave equation which includes spin as an integral part. This Dirac
equation remains the fundamental wave equation of quantum mechanics. Unfortunately it is difficult to
apply.
Today the powerful and efficient Heisenberg representation is the dominant approach used in the field of
physics, whereas chemists tend to prefer the more intuitive Schrödinger wave mechanics approach. In either
case, the important role of Hamiltonian mechanics in quantum theory is undeniable.

18.4 Lagrangian representation in quantum theory

The classical notion of canonical coordinates and momenta, has a simple quantum analog which has al-
lowed the Hamiltonian theory of classical mechanics, that is based on canonical coordinates, to serve as the
foundation for the development of quantum mechanics. The alternative Lagrangian formulation for classical
dynamics is described in terms of coordinates and velocities, instead of coordinates and momenta. The La-
grangian and Hamiltonian formulations are closely related, and it may appear that the Lagrangian approach
is more fundamental. The Lagrangian method allows collecting together all the equations of motion and
expressing them as stationary properties of the action integral, and thus it may appear desirable to base
quantum mechanics on the Lagrangian theory of classical mechanics. Unfortunately, the Lagrangian equa-
tions of motion involve partial derivatives with respect to coordinates, and their velocities, and the meaning
ascribed to such derivatives is difficult in quantum mechanics. The close correspondence between Poisson
brackets and the commutation rules leads naturally to Hamiltonian mechanics. However, Dirac showed that
Lagrangian mechanics can be carried over to quantum mechanics using canonical transformations such that
the classical Lagrangian is considered to be a function of coordinates at time $t$ and $t + dt$ rather than of
coordinates and velocities.

The motivation for Feynman’s 1942 Ph.D thesis, entitled “The Principle of Least Action in Quantum
Mechanics”, was to quantize the classical action at a distance in electrodynamics. This theory adopted an
overall space-time viewpoint for which the classical Hamiltonian approach, as used in conventional formu-
lations of quantum mechanics, is inapplicable. Feynman used the Lagrangian, plus the principle of least
action, to underlie his development of quantum field theory. To paraphrase Feynman’s Nobel Lecture, he
used a physical approach that is quite different from the customary Hamiltonian point of view for which the
system is discussed in great detail as a function of time. That is, you have the field at this moment, then a
differential equation gives you the field at a later moment and so on; that is, the Hamiltonian approach is a
time differential method. In Feynman’s least-action approach the action describes the character of the path
throughout all of space and time. The behavior of nature is determined by saying that the whole space-time
path has a certain character. The use of action involves both advanced and retarded terms that make it
difficult to transform back to the Hamiltonian form. The Feynman space-time approach is far beyond the
scope of this course. This topic will be developed in advanced graduate courses on quantum field theory.

18.5 Correspondence Principle

The Correspondence Principle implies that any new theory in physics must reduce to preceding theories
that have been proven to be valid. For example, Einstein’s Special Theory of Relativity satisfies the Corre-
spondence Principle since it reduces to classical mechanics for velocities small compared with the velocity
of light. Similarly, the General Theory of Relativity reduces to Newton’s Law of Gravitation in the limit
of weak gravitational fields. Bohr’s Correspondence Principle requires that the predictions of quantum me-
chanics must reproduce the predictions of classical physics in the limit of large quantum numbers. Bohr’s
Correspondence Principle played a pivotal role in the development of the old quantum theory, from its
inception in 1912 until 1925 when the old quantum theory was superseded by the current matrix and wave
mechanics representations of quantum mechanics.
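Bohr’s Correspondence Principle can be illustrated with the hydrogen atom of the old quantum theory. The sketch below assumes the standard Bohr-model level energies, proportional to $-1/n^2$, and circular orbits, neither of which is spelled out in this section. It compares the photon frequency for the transition $n \to n-1$ with the classical orbital frequency of the $n$-th orbit, both expressed in units of the Rydberg frequency. The function names are illustrative.

```python
def transition_frequency(n):
    """Photon frequency for the jump n -> n-1, in units of the Rydberg frequency."""
    return 1.0 / (n - 1) ** 2 - 1.0 / n ** 2


def classical_orbit_frequency(n):
    """Orbital frequency of the n-th circular Bohr orbit, in the same units."""
    return 2.0 / n ** 3


ratios = {}
for n in (2, 10, 100, 1000):
    ratios[n] = transition_frequency(n) / classical_orbit_frequency(n)
    print(f"n = {n:5d}   quantum / classical frequency = {ratios[n]:.5f}")
```

The ratio is far from unity for the lowest levels and approaches unity as $n$ grows, which is the sense in which the quantal prediction reproduces the classical radiation frequency for large quantum numbers.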
Quantum theory now is a well-established field of physics that is equally as fundamental as is classical
mechanics. The Correspondence Principle now is used to project out the analogous classical-mechanics
phenomena that underlie the observed properties of quantal systems. For example, this book has studied
the classical-mechanics analogs of the observed behavior for typical quantal systems, such as the vibrational
and rotational modes of the molecule, and the vibrational modes of the crystalline lattice. The nucleus is the
epitome of a many-body, strongly-interacting, quantal system. Example 14.12 showed that there is a close
correspondence between classical-mechanics predictions, and quantal predictions, for both the rotational and
vibrational collective modes of the nucleus, as well as for the single-particle motion of the nucleons in the
nuclear mean field, such as the onset of Coriolis-induced alignment. This use of the Correspondence Principle
can provide considerable insight into the underlying classical physics embedded in quantal systems.

18.6 Summary
The important point of this discussion is that variational formulations of classical mechanics provide a
rational, and direct basis, for the development of quantum mechanics. It has been shown that the final form
of quantum mechanics is closely related to the Hamiltonian formulation of classical mechanics. Quantum
mechanics supersedes classical mechanics as the fundamental theory of mechanics in that classical mechanics
only applies for situations where quantization is unimportant, and is the limiting case of quantum mechanics
when $\hbar \to 0$ which is in agreement with Bohr’s Correspondence Principle. The Dirac relativistic theory
of quantum mechanics is the ultimate quantal theory for the relativistic regime.
This discussion has barely scratched the surface of the correspondence between classical and quantal
mechanics, which goes far beyond the scope of this course. The goal of this chapter is to illustrate that
classical mechanics, in particular, Hamiltonian mechanics, underlies much of what you will learn in your
quantum physics courses. An interesting similarity between quantum mechanics and classical mechanics is
that physicists usually use the more visual Schrödinger wave representation in order to describe quantum
physics to the non-expert, which is analogous to the similar use of Newtonian physics in classical mechan-
ics. However, practicing physicists invariably use the more abstract Heisenberg matrix mechanics to solve
problems in quantum mechanics, analogous to widespread use of the variational approach in classical me-
chanics, because the analytical approaches are more powerful and have fundamental advantages. Quantal
problems in molecular, atomic, nuclear, and subnuclear systems, usually involve finding the normal modes
of a quantal system, that is, finding the eigen-energies, eigen-functions, spin, parity, and other observables
for the discrete quantized levels. Solving the equations of motion for the modes of quantal systems is sim-
ilar to solving the many-body coupled-oscillator problem in classical mechanics, where it was shown that
use of matrix mechanics is the most powerful representation. It is ironic that the introduction of matrix
methods to classical mechanics is a by-product of the development of matrix mechanics by Heisenberg, Born
and Jordan. This illustrates that classical mechanics not only played a pivotal role in the development of
quantum mechanics, but it also has benefitted considerably from the development of quantum mechanics;
that is, the synergistic relation between these two complementary branches of physics has been beneficial to
both classical and quantum mechanics.
Recommended reading
“Quantum Mechanics” by P.A.M. Dirac, Oxford Press, 1947.
“Conceptual Development of Quantum Mechanics” by Max Jammer, McGraw-Hill, 1966.
Chapter 19

Epilogue

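[Figure 19.1 diagram. Stage 1: Hamilton’s action principle leads to the Hamiltonian and the Lagrangian; d’Alembert’s Principle also leads to the Lagrangian. Stage 2: the Hamiltonian or Lagrangian gives the equations of motion; Newtonian mechanics also gives the equations of motion. Stage 3: the equations of motion, together with the initial conditions, give the solution for the motion.]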

Figure 19.1: Philosophical road map of the hierarchy of stages involved in analytical mechanics. Hamilton’s
Action Principle is the foundation of analytical mechanics. Stage 1 uses Hamilton’s Principle to derive the
Lagrangian and Hamiltonian. Stage 2 uses either the Lagrangian or Hamiltonian to derive the equations
of motion for the system. Stage 3 uses these equations of motion to solve for the actual motion using
the assumed initial conditions. The Lagrangian approach can be derived directly based on d’Alembert’s
Principle. Newtonian mechanics can be derived directly based on Newton’s Laws of Motion.

This book has introduced powerful analytical methods in physics that are based on applications of
variational principles to Hamilton’s Action Principle. These methods were pioneered in classical mechanics
by Leibniz, Lagrange, Euler, Hamilton, and Jacobi, during the remarkable Age of Enlightenment, and reached
full fruition at the start of the 20 century.
The philosophical roadmap, shown above, illustrates the hierarchy of philosophical approaches available
when using Hamilton’s Action Principle Rto derive the equations of motion of a system. The primary Stage1

uses Hamilton’s Action functional,  =  (q q̇) to derive the Lagrangian, and Hamiltonian function-
als. Stage1 provides the most fundamental and sophisticated level of understanding and involves specifying
all the active degrees of freedom, as well as the interactions involved. Stage2 uses the Lagrangian or Hamil-
tonian functionals, derived at Stage1, in order to derive the equations of motion for the system of interest.
Stage3 then uses the derived equations of motion to solve for the motion of the system, subject to a given
set of initial boundary conditions.
Newton postulated equations of motion for nonrelativistic classical mechanics that are identical to those
derived by applying variational principles to Hamilton’s Principle. However, Newton’s Laws of Motion are
applicable only to nonrelativistic classical mechanics, and cannot exploit the advantages of using the more
fundamental Hamilton’s Action Principle, Lagrangian, and Hamiltonian. Newtonian mechanics requires that
all the active forces be included in the equations of motion, and involves dealing with vector quantities which
is more difficult than using the scalar functionals, action, Lagrangian, or Hamiltonian. Lagrangian mechanics
based on d’Alembert’s Principle does not exploit the advantages provided by Hamilton’s Action Principle.
Considerable advantages result from deriving the equations of motion based on Hamilton’s Principle,
rather than basing them on Newton’s postulated Laws of Motion. It is significantly easier to use variational
principles to handle the scalar functionals, action, Lagrangian, and Hamiltonian, rather than starting
with Newton’s vector differential equations-of-motion. The three hierarchical stages of analytical mechanics
facilitate accommodating extra degrees of freedom, symmetries, constraints, and other interactions. For
example, the symmetries identified by Noether’s theorem are more easily recognized during the primary “ac-
tion” and secondary “Hamiltonian/Lagrangian” stages, rather than at the subsequent “equations-of-motion”
stage. Constraint forces, and approximations, introduced at Stage 1 or Stage 2, are easier to implement
than at the subsequent Stage 3. The correspondence of Hamilton’s Action in classical and quantal mechanics,
as well as relativistic invariance, are crucial advantages for using the analytical approach in relativistic
mechanics, fluid motion, quantum mechanics, and field theory.
Philosophically, Newtonian mechanics is straightforward to understand since it uses vector differential
equations of motion that relate the instantaneous forces to the instantaneous accelerations. Moreover,
the concepts of momentum plus force are intuitive to visualize, and both cause and effect are embedded
in Newtonian mechanics. Unfortunately, Newtonian mechanics is incompatible with quantum physics, it
violates the relativistic concepts of space-time, and fails to provide the unified description of the gravitational
force plus planetary motion as geodesic motion in a four-dimensional Riemannian structure.
The remarkable philosophical implications embedded in applying variational principles to Hamilton’s
Principle, are based on the astonishing assumption that motion of a constrained system in nature follows
a path that minimizes the action integral. As a consequence, solving the equations of motion is reduced
to finding the optimum path that minimizes the action integral. The fact that nature follows optimization
principles is nonintuitive, and was considered to be metaphysical by many scientists and philosophers during
the 19 century, which delayed full acceptance of analytical mechanics until the development of the Theory
of Relativity and quantum mechanics. Variational formulations now have become the preeminent approach
in modern physics and they have toppled Newtonian mechanics from the throne of classical mechanics that
it occupied for two centuries.
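The statement that solving the equations of motion reduces to finding the path that makes the action stationary can be tried directly on a computer. The sketch below is a minimal illustration, assuming a particle thrown vertically in uniform gravity that leaves height zero and returns to it after a fixed time. The path is sampled at equally spaced times, the discretized action is made stationary by repeatedly relaxing each interior point, and the result is compared with the familiar parabola obtained from Newton’s equation of motion. All numerical values and function names are illustrative assumptions.

```python
G = 9.8          # gravitational acceleration, m/s^2 (assumed value)
MASS = 1.0       # kg
T_TOTAL = 1.0    # flight time, s
N_STEPS = 20     # number of time intervals
DT = T_TOTAL / N_STEPS


def action(path):
    """Discretized S = sum over intervals of (kinetic - potential) * dt."""
    total = 0.0
    for x0, x1 in zip(path[:-1], path[1:]):
        kinetic = 0.5 * MASS * ((x1 - x0) / DT) ** 2
        potential = MASS * G * 0.5 * (x0 + x1)
        total += (kinetic - potential) * DT
    return total


def stationary_path(sweeps=3000):
    """Relax interior points until dS/dx_i = 0, with both end points held at 0."""
    path = [0.0] * (N_STEPS + 1)
    for _ in range(sweeps):
        for i in range(1, N_STEPS):
            # Setting dS/dx_i = 0 gives 2 x_i = x_(i-1) + x_(i+1) + g dt^2
            path[i] = 0.5 * (path[i - 1] + path[i + 1] + G * DT * DT)
    return path


path = stationary_path()
exact = [0.5 * G * (i * DT) * (T_TOTAL - i * DT) for i in range(N_STEPS + 1)]
max_error = max(abs(a - b) for a, b in zip(path, exact))
bumped = [x + (0.05 if 0 < i < N_STEPS else 0.0) for i, x in enumerate(path)]

print("max deviation from the analytic parabola:", max_error)
print("action on the stationary path:", action(path))
print("action on a perturbed path:   ", action(bumped))
```

For this simple system the stationary path is a true minimum, so any perturbed path with the same end points has a larger action. In general Hamilton’s Principle only requires the action to be stationary.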
The scope of this book extends beyond the typical classical mechanics textbook in order to illustrate
how Lagrangian and Hamiltonian dynamics provide the foundation upon which modern physics is built.
Knowledge of analytical mechanics is essential for the study of modern physics. The techniques and physics
discussed in this book reappear in different guises in many fields, but the basic physics is unchanged illustrat-
ing the intellectual beauty, the philosophical implications, and the unity of the field of physics. The breadth
of physics addressed by variational principles in classical mechanics, and the underlying unity of the field,
are epitomized by the wide range of dimensions, energies, and complexity involved. The dimensions range
from as large as $10^{27}$ m to quantal analogues of classical mechanics of systems spanning in size down to the
Planck length of $1.62 \times 10^{-35}$ m. Individual particles have been detected with kinetic energies ranging from
zero to greater than $10^{15}$ eV. The complexity of classical mechanics spans from one body to the statistical
mechanics of many-body systems. As a consequence, analytical variational methods have become the pre-
mier approach to describe systems from the very largest to the smallest, and from one-body to many-body
dynamical systems.
The goal of this book has been to illustrate the astonishing power of analytical variational methods for
understanding the physics underlying classical mechanics, as well as extensions to modern physics. However,
the present narrative remains unfinished in that fundamental philosophical and technical questions have
not been addressed. For example, analytical mechanics is based on the validity of the assumed principle of
economy. This book has not addressed the philosophical question, “is the principle of economy a fundamental
law of nature, or is it a fortuitous consequence of the fundamental laws of nature? ”
In summary, Hamilton’s action principle, which is built into Lagrangian and Hamiltonian mechanics,
coupled with the availability of a wide arsenal of variational principles and mathematical techniques, provides
a remarkably powerful approach for deriving the equations of motion required to determine the response of
systems in a broad and diverse range of applications in science and engineering.
Appendix A

Matrix algebra

A.1 Mathematical methods for mechanics

Development of classical mechanics has involved a close and synergistic interweaving of physics and mathematics
that continues to play a key role in these fields. The concepts of scalar and vector fields play a pivotal
role in describing the force fields and particle motion in both the Newtonian formulation of classical mechan-
ics and electromagnetism. Thus it is imperative that you be familiar with the sophisticated mathematical
formalism used to treat multivariate scalar and vector fields in classical mechanics. Ordinary and partial
differential equations up to second order, as well as integration of algebraic and trigonometric functions play
a major role in classical mechanics. It is assumed that you already have a working knowledge of differential
and integral calculus in sufficient depth to handle this material. Computer codes, such as Mathematica,
MatLab, and Maple, or symbolic calculators, can be used to obtain mathematical solutions for complicated
cases.
The following 9 appendices provide brief summaries of matrix algebra, vector algebra, orthogonal co-
ordinate systems, coordinate transformations, tensor algebra, multivariate calculus, vector differential plus
integral calculus, Fourier analysis and time-sampled waveform analysis. The manipulation of scalar and
vector fields is greatly facilitated by transforming to orthogonal curvilinear coordinate systems that match
the symmetries of the problem. These appendices discuss the necessity to account for the time dependence
of the orthogonal unit vectors for curvilinear coordinate systems. It is assumed that, except for coordinate
transformations and tensor algebra, you have been introduced to these topics in linear algebra and other
physics courses, and thus the purpose of these appendices is to serve as a reference and brief review.

A.2 Matrices
Matrix algebra provides an elegant and powerful representation of multivariate operators, and coordinate
transformations that feature prominently in classical mechanics. For example they play a pivotal role in
finding the eigenvalues and eigenfunctions for coupled equations that occur in rigid-body rotation, and
coupled oscillator systems. An understanding of the role of matrix mechanics in classical mechanics facilitates
understanding of the equally important role played by matrix mechanics in quantal physics.
It is interesting that although determinants were used by physicists in the late 19th century, the concept
of matrix algebra was developed by Arthur Cayley in England in 1855 but many of these ideas were the work
of Hamilton, and the discussion of matrix algebra was buried in a more general discussion of determinants.
Matrix algebra was an esoteric branch of mathematics, little known by the physics community, until 1925
when Heisenberg proposed his innovative new quantum theory. The striking feature of this new theory
was its representation of physical quantities by sets of time-dependent complex numbers and a peculiar
multiplication rule. Max Born recognized that Heisenberg’s multiplication rule is just the standard “row
times column” multiplication rule of matrix algebra; a topic that he had encountered as a young student in a
mathematics course. In 1924 Richard Courant had just completed the first volume of the new text Methods
of Mathematical Physics during which Pascual Jordan had served as his young assistant working on matrix
manipulation. Fortuitously, Jordan and Born happened to share a carriage on a train to Hanover during


which Jordan overheard Born talk about his problems trying to work with matrices. Jordan introduced
himself to Born and offered to help. This led to publication, in September 1925, of the famous Born-Jordan
paper[Bor25a] that gave the first rigorous formulation of matrix mechanics in physics. This was followed in
November by the Born-Heisenberg-Jordan sequel[Bor25b] that established a logically consistent general method
for solving matrix mechanics problems plus a connection between the mathematics of matrix mechanics and
linear algebra. Matrix algebra developed into an important tool in mathematics and physics during World
War 2 and now it is an integral part of undergraduate linear algebra courses.
Most applications of matrix algebra in this book are restricted to real, symmetric, square matrices. The
size of a matrix is defined by the rank, which equals the row rank and column rank, i.e. the number of
independent row vectors or column vectors in the square matrix. It is presumed that you have studied
matrices in a linear algebra course. Thus the goal of this review is to list simple manipulation of symmetric
matrices and matrix diagonalization that will be used in this course. You are referred to a linear algebra
textbook if you need further details.

Matrix definition
A matrix is a rectangular array of numbers with $M$ rows and $N$ columns. The notation used for an element
of a matrix is $A_{ij}$ where $i$ designates the row and $j$ designates the column of this matrix element in the
matrix $\mathbf{A}$. Convention denotes a matrix $\mathbf{A}$ as
$$\mathbf{A} \equiv \begin{pmatrix}
A_{11} & A_{12} & \cdots & A_{1(N-1)} & A_{1N} \\
A_{21} & A_{22} & \cdots & A_{2(N-1)} & A_{2N} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
A_{(M-1)1} & A_{(M-1)2} & \cdots & A_{(M-1)(N-1)} & A_{(M-1)N} \\
A_{M1} & A_{M2} & \cdots & A_{M(N-1)} & A_{MN}
\end{pmatrix} \tag{A.1}$$
Matrices can be square, $M = N$, or rectangular, $M \neq N$. Matrices having only one row or column are
called row or column vectors respectively, and need only a single subscript label. For example,
$$\mathbf{A} = \begin{pmatrix} A_1 \\ A_2 \\ \vdots \\ A_{M-1} \\ A_M \end{pmatrix} \tag{A.2}$$

Matrix manipulation
Matrices are defined to obey certain rules for matrix manipulation as given below.
1) Multiplication of a matrix by a scalar $\gamma$ simply multiplies each matrix element by $\gamma$
$$C_{ij} = \gamma A_{ij} \tag{A.3}$$
2) Addition of two matrices $\mathbf{A}$ and $\mathbf{B}$ having the same dimensions, i.e. the same number of rows and the same number of columns, is given by
$$C_{ij} = A_{ij} + B_{ij} \tag{A.4}$$

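3) Multiplication of a matrix $\mathbf{A}$ by a matrix $\mathbf{B}$ is defined only if the number of columns in $\mathbf{A}$ equals the
number of rows in $\mathbf{B}$. The product matrix $\mathbf{C}$ is given by the matrix product
$$\mathbf{C} = \mathbf{A} \cdot \mathbf{B} \tag{A.5}$$
$$C_{ij} = [AB]_{ij} = \sum_{k} A_{ik} B_{kj} \tag{A.6}$$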

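For example, if both $\mathbf{A}$ and $\mathbf{B}$ are rank three symmetric matrices then
$$\mathbf{C} = \mathbf{A} \cdot \mathbf{B} =
\begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix} \cdot
\begin{pmatrix} B_{11} & B_{12} & B_{13} \\ B_{21} & B_{22} & B_{23} \\ B_{31} & B_{32} & B_{33} \end{pmatrix}$$
$$= \begin{pmatrix}
A_{11}B_{11} + A_{12}B_{21} + A_{13}B_{31} & A_{11}B_{12} + A_{12}B_{22} + A_{13}B_{32} & A_{11}B_{13} + A_{12}B_{23} + A_{13}B_{33} \\
A_{21}B_{11} + A_{22}B_{21} + A_{23}B_{31} & A_{21}B_{12} + A_{22}B_{22} + A_{23}B_{32} & A_{21}B_{13} + A_{22}B_{23} + A_{23}B_{33} \\
A_{31}B_{11} + A_{32}B_{21} + A_{33}B_{31} & A_{31}B_{12} + A_{32}B_{22} + A_{33}B_{32} & A_{31}B_{13} + A_{32}B_{23} + A_{33}B_{33}
\end{pmatrix}$$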

In general, multiplication of matrices A and B is noncommutative, i.e.

$$\mathbf{A} \cdot \mathbf{B} \neq \mathbf{B} \cdot \mathbf{A} \tag{A.7}$$
In the special case when A · B = B · A then the matrices are said to commute.
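The multiplication rule (A.6) and the noncommutativity stated in (A.7) can be checked with a few lines of code. The following is a minimal sketch in plain Python; the function name `matmul` and the two 2 x 2 matrices are illustrative choices, not taken from the text.

```python
def matmul(a, b):
    """Return C = A . B using C_ij = sum_k A_ik B_kj (equation A.6)."""
    if len(a[0]) != len(b):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Illustrative matrices (hypothetical values chosen for this sketch)
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

AB = matmul(A, B)  # B on the right swaps the columns of A
BA = matmul(B, A)  # B on the left swaps the rows of A
print(AB, BA, AB == BA)
```

Because the two products differ, this particular pair of matrices does not commute.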

Transposed matrix $\mathbf{A}^{T}$
The transpose of a matrix $\mathbf{A}$ will be denoted by $\mathbf{A}^{T}$ and is given by interchanging rows and columns, that is
$$\left(A^{T}\right)_{ij} = A_{ji} \tag{A.8}$$

The transpose of a column vector is a row vector. Note that older texts use the symbol Ã for the transpose.

Identity (unity) matrix I

The identity (unity) matrix $\mathbf{I}$ is diagonal with diagonal elements equal to 1, that is
$$I_{ij} = \delta_{ij} \tag{A.9}$$
where the Kronecker delta symbol is defined by
$$\delta_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases} \tag{A.10}$$

Inverse matrix $\mathbf{A}^{-1}$

If a matrix is non-singular, that is, its determinant is non-zero, then it is possible to define an inverse matrix
$\mathbf{A}^{-1}$. A non-singular square matrix has an inverse matrix for which the product
$$\mathbf{A} \cdot \mathbf{A}^{-1} = \mathbf{I} \tag{A.11}$$

Orthogonal matrix
A matrix with real elements is orthogonal if
$$\mathbf{A}^{T} = \mathbf{A}^{-1} \tag{A.12}$$
That is
$$\sum_{k} \left(A^{T}\right)_{ik} A_{kj} = \sum_{k} A_{ki} A_{kj} = \delta_{ij} \tag{A.13}$$

Adjoint matrix $\mathbf{A}^{\dagger}$
For a matrix with complex elements, the adjoint matrix, denoted by $\mathbf{A}^{\dagger}$, is defined as the transpose of the
complex conjugate
$$\left(A^{\dagger}\right)_{ij} = A^{*}_{ji} \tag{A.14}$$

Hermitian matrix
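The Hermitian conjugate of a complex matrix $\mathbf{H}$ is denoted as $\mathbf{H}^{\dagger}$ and is defined as
$$\mathbf{H}^{\dagger} = \left(\mathbf{H}^{T}\right)^{*} = \left(\mathbf{H}^{*}\right)^{T} \tag{A.15}$$
Therefore
$$H^{\dagger}_{ij} = H^{*}_{ji} \tag{A.16}$$
A matrix is Hermitian if it is equal to its adjoint
$$\mathbf{H}^{\dagger} = \mathbf{H} \tag{A.17}$$
that is
$$H^{\dagger}_{ij} = H^{*}_{ji} = H_{ij} \tag{A.18}$$
A matrix that is both Hermitian and has real elements is a symmetric matrix since complex conjugation has
no effect.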

Unitary matrix

A matrix with complex elements is unitary if its inverse is equal to the adjoint matrix

U† = U−1 (A.19)

which is equivalent to
U† U = I (A.20)

A unitary matrix with real elements is an orthogonal matrix as given in equation (A.12).

Trace of a square matrix $\mathrm{Tr}\,\mathbf{A}$

The trace of a square matrix, denoted by $\mathrm{Tr}\,\mathbf{A}$, is defined as the sum of the diagonal matrix elements.

$$\mathrm{Tr}\,\mathbf{A} = \sum_{i=1}^{N} A_{ii} \tag{A.21}$$

Inner product of column vectors

Real vectors The generalization of the scalar (dot) product in Euclidean space is called the inner prod-
uct. Exploiting the rules of matrix multiplication requires taking the transpose of the first column vector
to form a row vector which then is multiplied by the second column vector using the conventional rules for
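matrix multiplication. That is, for rank $N$ vectors

$$[\mathbf{X}] \cdot [\mathbf{Y}] =
\begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{pmatrix} \cdot
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{pmatrix}
= [\mathbf{X}]^{T} [\mathbf{Y}]
= \begin{pmatrix} X_1 & X_2 & \cdots & X_N \end{pmatrix}
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{pmatrix}
= \sum_{i=1}^{N} X_i Y_i \tag{A.22}$$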

For rank $N = 3$ this inner product agrees with the conventional definition of the scalar product and gives a
result that is a scalar. For the special case when $[\mathbf{A}] \cdot [\mathbf{B}] = 0$ the two column vectors are called orthogonal.
The magnitude squared of a column vector is given by the inner product

$$[\mathbf{X}] \cdot [\mathbf{X}] = \sum_{i=1}^{N} \left(X_i\right)^{2} \geq 0 \tag{A.23}$$

Note that this is never negative, and it is zero only when every element of the vector is zero.

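Complex vectors For vectors having complex matrix elements the inner product is generalized to a form
that is consistent with equation (A.22) when the column vector matrix elements are real.
$$[\mathbf{X}]^{*} \cdot [\mathbf{Y}] = [\mathbf{X}]^{\dagger} [\mathbf{Y}]
= \begin{pmatrix} X_1^{*} & X_2^{*} & \cdots & X_{N-1}^{*} & X_N^{*} \end{pmatrix}
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_{N-1} \\ Y_N \end{pmatrix}
= \sum_{i=1}^{N} X_i^{*} Y_i \tag{A.24}$$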

For the special case of a vector with itself

$$[\mathbf{X}]^{*} \cdot [\mathbf{X}] = [\mathbf{X}]^{\dagger} [\mathbf{X}] = \sum_{i=1}^{N} X_i^{*} X_i \geq 0 \tag{A.25}$$

A.3 Determinants
Definition
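The determinant of a square matrix with $N$ rows equals a single number derived using the matrix elements
of the matrix. The determinant is denoted as $\det \mathbf{A}$ or $|\mathbf{A}|$ where

$$|\mathbf{A}| = \sum_{j_1, j_2, \ldots, j_N} \varepsilon(j_1, j_2, \ldots, j_N)\, A_{1j_1} A_{2j_2} \cdots A_{Nj_N} \tag{A.26}$$

where the sum runs over all permutations $(j_1, j_2, \ldots, j_N)$ of the column indices, and $\varepsilon(j_1, j_2, \ldots, j_N)$ is the
permutation index, which equals $+1$ or $-1$ depending on whether the number of interchanges required to go
from the normal order $(1, 2, 3, \ldots, N)$ to the sequence $(j_1, j_2, j_3, \ldots, j_N)$ is even or odd.
For example for $N = 3$ the determinant is

$$|\mathbf{A}| = A_{11}A_{22}A_{33} + A_{12}A_{23}A_{31} + A_{13}A_{21}A_{32} - A_{13}A_{22}A_{31} - A_{11}A_{23}A_{32} - A_{12}A_{21}A_{33} \tag{A.27}$$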

Properties
1. The value of a determinant $|\mathbf{A}| = 0$, if

(a) all elements of a row (column) are zero.

(b) all elements of a row (column) are identical with, or multiples of, the corresponding elements of
another row (column).

2. The value of a determinant is unchanged if

(a) rows and columns are interchanged.

(b) a linear combination of any number of rows is added to any one row.

3. The value of a determinant changes sign if two rows, or any two columns, are interchanged.
4. Transposing a square matrix does not change its determinant: $\left|\mathbf{A}^{T}\right| = |\mathbf{A}|$

5. If any row (column) is multiplied by a constant factor then the value of the determinant is multiplied
by the same factor.

6. The determinant of a diagonal matrix equals the product of the diagonal matrix elements. That is,
when $A_{ij} = \lambda_i \delta_{ij}$ then $|\mathbf{A}| = \lambda_1 \lambda_2 \lambda_3 \cdots \lambda_N$

7. The determinant of the identity (unity) matrix |I| = 1.

8. The determinant of the null matrix, for which all matrix elements are zero, |0| = 0

9. A singular matrix has a determinant equal to zero.

10. If each element of any row (column) appears as the sum (difference) of two or more quantities, then
the determinant can be written as a sum (difference) of two or more determinants of the same order.
For example for order $N = 2$

$$\begin{vmatrix} A_{11} \pm B_{11} & A_{12} \pm B_{12} \\ A_{21} & A_{22} \end{vmatrix}
= \begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix}
\pm \begin{vmatrix} B_{11} & B_{12} \\ A_{21} & A_{22} \end{vmatrix}$$

11. The determinant of a matrix product equals the product of the determinants. That is, if $\mathbf{C} = \mathbf{A}\mathbf{B}$ then
|C| = |A| |B|

Cofactor of a square matrix

For a square matrix having $N$ rows the cofactor is obtained by removing the $i^{th}$ row and the $j^{th}$ column
and then collapsing the remaining matrix elements into a square matrix with $N - 1$ rows while preserving
the order of the matrix elements. This is called the complementary minor which is denoted as $A^{(ij)}$. The
matrix elements of the cofactor square matrix $\mathbf{a}$ are obtained by multiplying the determinant of the $A^{(ij)}$
complementary minor by the phase factor $(-1)^{i+j}$. That is
$$a_{ij} = (-1)^{i+j} \left| A^{(ij)} \right| \tag{A.28}$$

The cofactor matrix has the property that

$$\sum_{k=1}^{N} a_{ik} A_{jk} = \delta_{ij} |\mathbf{A}| = \sum_{k=1}^{N} A_{ki} a_{kj} \tag{A.29}$$

Cofactors are used to expand the determinant of a square matrix in order to evaluate the determinant.

Inverse of a non-singular matrix

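The $(i, j)$ matrix elements of the inverse matrix $\mathbf{A}^{-1}$ of a non-singular matrix $\mathbf{A}$ are given by the ratio of
the cofactor $a_{ji}$ and the determinant $|\mathbf{A}|$, that is

$$A^{-1}_{ij} = \frac{1}{|\mathbf{A}|}\, a_{ji} \tag{A.30}$$

Equations (A.28) and (A.29) can be used to evaluate the $(i, j)$ element of the matrix product $\mathbf{A}^{-1}\mathbf{A}$

$$\left(\mathbf{A}^{-1}\mathbf{A}\right)_{ij} = \sum_{k=1}^{N} A^{-1}_{ik} A_{kj}
= \frac{1}{|\mathbf{A}|} \sum_{k=1}^{N} a_{ki} A_{kj}
= \frac{1}{|\mathbf{A}|}\, \delta_{ij} |\mathbf{A}| = \delta_{ij} = I_{ij} \tag{A.31}$$

This agrees with equation (A.11) that $\mathbf{A} \cdot \mathbf{A}^{-1} = \mathbf{I}$.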

The inverse of rank 2 or 3 matrices is required frequently when determining the eigen-solutions for rigid-
body rotation, or coupled oscillator, problems in classical mechanics as described in chapters 11 and 12.
Therefore it is convenient to list explicitly the inverse matrices for both rank 2 and rank 3 matrices.

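Inverse for rank 2 matrices:

$$\mathbf{A}^{-1} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1}
= \frac{1}{|\mathbf{A}|} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
= \frac{1}{(ad - bc)} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \tag{A.32}$$
where the determinant of $\mathbf{A}$ is written explicitly in equation (A.32).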

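Inverse for rank 3 matrices:

$$\mathbf{A}^{-1} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}^{-1}
= \frac{1}{|\mathbf{A}|} \begin{bmatrix} A & B & C \\ D & E & F \\ G & H & I \end{bmatrix}^{T}
= \frac{1}{|\mathbf{A}|} \begin{bmatrix} A & D & G \\ B & E & H \\ C & F & I \end{bmatrix}$$
$$= \frac{1}{aA + bB + cC} \begin{bmatrix}
A = (ei - fh) & D = -(bi - ch) & G = (bf - ce) \\
B = -(di - fg) & E = (ai - cg) & H = -(af - cd) \\
C = (dh - eg) & F = -(ah - bg) & I = (ae - bd)
\end{bmatrix} \tag{A.33}$$

where the functions $A, B, C, D, E, F, G, H, I$ are equal to the rank 2 determinants listed in equation (A.33).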

A.4 Reduction of a matrix to diagonal form

Solving coupled linear equations can be reduced to diagonalization of a matrix. Consider the matrix A
operating on the vector X to produce a vector Y, that are expressed as components with respect to the
unprimed coordinate frame, i.e.
A·X=Y (A.34)
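Consider that the unitary real matrix $\mathbf{R}$ with rank $N$ rotates the $N$-dimensional unprimed coordinate
frame into the primed coordinate frame such that $\mathbf{A}$, $\mathbf{X}$ and $\mathbf{Y}$ are transformed to $\mathbf{A}'$, $\mathbf{X}'$ and $\mathbf{Y}'$ in the
rotated primed coordinate frame. Then

$$\mathbf{X}' = \mathbf{R} \cdot \mathbf{X}$$
$$\mathbf{Y}' = \mathbf{R} \cdot \mathbf{Y} \tag{A.35}$$

With respect to the primed coordinate frame equation (A.34) becomes

$$\mathbf{R} \cdot (\mathbf{A} \cdot \mathbf{X}) = \mathbf{R} \cdot \mathbf{Y} \tag{A.36}$$
$$\mathbf{R} \cdot \mathbf{A} \cdot \mathbf{R}^{-1} \cdot \mathbf{R} \cdot \mathbf{X} = \mathbf{R} \cdot \mathbf{Y} \tag{A.37}$$
$$\mathbf{R} \cdot \mathbf{A} \cdot \mathbf{R}^{-1} \cdot \mathbf{X}' = \mathbf{A}' \cdot \mathbf{X}' = \mathbf{Y}' \tag{A.38}$$

using the fact that the identity matrix $\mathbf{I} = \mathbf{R} \cdot \mathbf{R}^{-1} = \mathbf{R} \cdot \mathbf{R}^{T}$ since the rotation matrix in $N$ dimensions is
orthogonal.
Thus we have that the rotated matrix

$$\mathbf{A}' = \mathbf{R} \cdot \mathbf{A} \cdot \mathbf{R}^{T} \tag{A.39}$$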

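Let us assume that this transformed matrix is diagonal, then it can be written as the product of the unit
matrix $\mathbf{I}$ and a vector of scalar numbers called the characteristic roots $\lambda$ as

$$\mathbf{A}' = \mathbf{R} \cdot \mathbf{A} \cdot \mathbf{R}^{T} = \lambda \mathbf{I} \tag{A.40}$$

using the fact that $\mathbf{R}^{T} = \mathbf{R}^{-1}$ then gives

$$\mathbf{R} \cdot (\lambda \mathbf{I}) = \mathbf{A}' \cdot \mathbf{R} \tag{A.41}$$

Let both sides of equation (A.41) act on $\mathbf{X}'$ which gives

$$\lambda \mathbf{I} \cdot \mathbf{X}' = \mathbf{A}' \cdot \mathbf{X}' \tag{A.42}$$

or
$$\left[ \lambda \mathbf{I} - \mathbf{A}' \right] \mathbf{X}' = 0 \tag{A.43}$$
This represents a set of $N$ homogeneous linear algebraic equations in $N$ unknowns $\mathbf{X}'$ where $\lambda$ is a set of
characteristic roots (eigenvalues) with corresponding eigenfunctions $\mathbf{X}'$. Ignoring the trivial case of $\mathbf{X}'$ being
zero, then (A.43) requires that the secular determinant of the bracket be zero, that is
$$\left| \lambda \mathbf{I} - \mathbf{A}' \right| = 0 \tag{A.44}$$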

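The determinant can be expanded and factored into the form

$$(\lambda - \lambda_1)(\lambda - \lambda_2)(\lambda - \lambda_3) \cdots (\lambda - \lambda_N) = 0 \tag{A.45}$$

where the $N$ eigenvalues of the matrix $\mathbf{A}'$ are $\lambda = \lambda_1, \lambda_2, \ldots, \lambda_N$.

The eigenvectors $\mathbf{X}'$ corresponding to each eigenvalue are determined by substituting a given eigenvalue
$\lambda_i$ into the relation
$$\mathbf{X}'^{T} \cdot \mathbf{A}' \cdot \mathbf{X}' = \left[ \lambda_i \delta_{ij} \right] \tag{A.46}$$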
If all the eigenvalues are distinct, i.e. different, then this set of $N$ equations completely determines the ratio
of the components of each eigenvector along the axes of the coordinate frame. However, when two or more
eigenvalues are identical, the corresponding eigenvectors are not uniquely determined, and one has the freedom
to select appropriate eigenvectors that are orthogonal to each other and to the remaining axes.
In summary, a matrix is guaranteed to be fully diagonalizable if any one of the following holds: (a) all the
eigenvalues are distinct, (b) the real matrix is symmetric, (c) it is unitary.
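A frequent application of matrices in classical mechanics is for solving a system of homogeneous linear
equations of the form
$$\begin{aligned}
A_{11} x_1 + A_{12} x_2 + \cdots + A_{1N} x_N &= 0 \\
A_{21} x_1 + A_{22} x_2 + \cdots + A_{2N} x_N &= 0 \\
\vdots \qquad\qquad & \\
A_{N1} x_1 + A_{N2} x_2 + \cdots + A_{NN} x_N &= 0
\end{aligned} \tag{A.47}$$
Making the following definitions
$$\mathbf{A} = \begin{pmatrix}
A_{11} & A_{12} & \cdots & A_{1N} \\
A_{21} & A_{22} & \cdots & A_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
A_{N1} & A_{N2} & \cdots & A_{NN}
\end{pmatrix} \tag{A.48}$$
$$\mathbf{X} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix} \tag{A.49}$$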
Then the set of linear equations can be written in a compact form using the matrices

A · X =0 (A.50)

which can be solved using equation (43). Ensure that you are able to diagonalize a matrices with rank
2 and 3. You can use Mathematica, Maple, MatLab, or other such mathematical computer programs to
diagonalize larger matrices.

A.1 Example: Eigenvalues and eigenvectors of a real symmetric matrix

Consider the matrix
$$\mathbf{A} = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
The secular determinant is given by (A.44)
$$\begin{vmatrix} -\lambda & 1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{vmatrix} = 0$$

This expands to
$$-\lambda(\lambda + 1)(\lambda - 1) = 0$$
Thus the three eigenvalues are $\lambda = -1, 0, 1$.
To find each eigenvector we substitute the corresponding eigenvalue into equation (A.43)
$$\begin{pmatrix} -\lambda & 1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

The eigenvalue $\lambda = -1$ yields $x + y = 0$ and $z = 0$. Thus the eigenvector is $\mathbf{r}_1 = (\frac{1}{\sqrt{2}}, \frac{-1}{\sqrt{2}}, 0)$. The
eigenvalue $\lambda = 0$ yields $x = 0$ and $y = 0$. Thus the eigenvector is $\mathbf{r}_2 = (0, 0, 1)$. The eigenvalue $\lambda = 1$
yields $-x + y = 0$ and $z = 0$. Thus the eigenvector is $\mathbf{r}_3 = (\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0)$. The orthogonality of these three
eigenvectors, which correspond to three distinct eigenvalues, can be verified.
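The orthogonality claim for this example can be verified numerically. The sketch below, in plain Python, checks that each quoted vector satisfies the eigenvalue equation for this matrix and that the three vectors are orthonormal; the helper names `apply` and `dot` are illustrative.

```python
from math import sqrt, isclose

# Matrix and eigenpairs quoted in the example above
A = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 0]]
s = 1 / sqrt(2)
eigenpairs = [(-1, (s, -s, 0.0)),
              (0, (0.0, 0.0, 1.0)),
              (1, (s, s, 0.0))]


def apply(matrix, vec):
    """Matrix acting on a column vector."""
    return tuple(sum(m * v for m, v in zip(row, vec)) for row in matrix)


def dot(u, v):
    """Scalar product of two real vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Each pair should satisfy A r = lambda r
residual_ok = all(
    isclose(x, lam * y, abs_tol=1e-12)
    for lam, r in eigenpairs
    for x, y in zip(apply(A, r), r)
)

# The Gram matrix of pairwise scalar products should be the identity
vectors = [r for _, r in eigenpairs]
gram = [[dot(u, v) for v in vectors] for u in vectors]
print(residual_ok, gram)
```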

A.2 Example: Degenerate eigenvalues of real symmetric matrix

This example illustrates how to generate eigenvectors corresponding to degenerate eigenvalues. Consider
the matrix
$$\mathbf{A} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$
The secular determinant is given by (A.44)
$$\begin{vmatrix} 1 - \lambda & 0 & 0 \\ 0 & -\lambda & 1 \\ 0 & 1 & -\lambda \end{vmatrix} = 0$$

This expands to
$$(1 - \lambda)(\lambda + 1)(\lambda - 1) = 0$$
Thus the three eigenvalues are $\lambda = -1, 1, 1$.
The eigenvectors are determined by substituting the corresponding eigenvalue into equation (A.43)
$$\begin{pmatrix} 1 - \lambda & 0 & 0 \\ 0 & -\lambda & 1 \\ 0 & 1 & -\lambda \end{pmatrix} \cdot
\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

The eigenvalue $\lambda = -1$ yields $2x = 0$ and $y + z = 0$. Thus the eigenvector is $\mathbf{r}_1 = (0, \frac{1}{\sqrt{2}}, \frac{-1}{\sqrt{2}})$. The
eigenvalue $\lambda = 1$ yields $-y + z = 0$. The eigenvector $\mathbf{r}_2$ must be perpendicular to $\mathbf{r}_1$ and there are an infinite
number of choices. Let us assume that $\mathbf{r}_2 = (0, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})$, which satisfies $-y + z = 0$; then the eigenvector
$\mathbf{r}_3$ must be perpendicular to both $\mathbf{r}_1$ and $\mathbf{r}_2$. For rank three this is found using

$$\mathbf{r}_3 = \mathbf{r}_1 \times \mathbf{r}_2 = (1, 0, 0)$$
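The construction of the third eigenvector by a cross product can be checked with a short plain-Python sketch. The helper names `cross` and `apply` are illustrative; the matrix and the vectors are those of the example above.

```python
from math import sqrt, isclose


def cross(u, v):
    """Vector product of two three-component vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def apply(matrix, vec):
    """Matrix acting on a column vector."""
    return tuple(sum(m * v for m, v in zip(row, vec)) for row in matrix)


A = [[1, 0, 0],
     [0, 0, 1],
     [0, 1, 0]]
s = 1 / sqrt(2)
r1 = (0.0, s, -s)   # eigenvalue -1
r2 = (0.0, s, s)    # one allowed choice for the degenerate eigenvalue +1
r3 = cross(r1, r2)  # perpendicular to both r1 and r2

# r3 should be (1, 0, 0) and should again belong to the eigenvalue +1
print(r3, apply(A, r3))
```

Any other unit vector in the plane spanned by `r2` and `r3` would serve equally well as an eigenvector for the repeated eigenvalue.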
Appendix B

Vector algebra

B.1 Linear operations

The important force fields in classical mechanics, namely, gravitation, electric, and magnetic, are vector
fields that have a position-dependent magnitude and direction. Thus, it is useful to summarize the algebra
of vector fields.
A vector $\mathbf{a}$ has both a magnitude $|a|$ and a direction defined by the unit vector $\hat{\mathbf{e}}_a$, that is, the vector
can be written as a bold character $\mathbf{a}$ where
$$\mathbf{a} = a\,\hat{\mathbf{e}}_a \tag{B.1}$$
where by convention the implied modulus sign is omitted. The hat symbol on the vector $\hat{\mathbf{e}}_a$ designates that
this is a unit vector with modulus $|\hat{\mathbf{e}}_a| = 1$.
Vector force fields are assumed to be linear, and consequently they obey the principle of superposition,
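are commutative, associative, and distributive as illustrated below for three vectors $\mathbf{a}, \mathbf{b}, \mathbf{c}$ plus a scalar
multiplier $\gamma$

$$\mathbf{a} \pm \mathbf{b} = \pm\mathbf{b} + \mathbf{a} \tag{B.2}$$
$$\mathbf{a} + (\mathbf{b} + \mathbf{c}) = (\mathbf{a} + \mathbf{b}) + \mathbf{c}$$
$$\gamma(\mathbf{a} + \mathbf{b}) = \gamma\mathbf{a} + \gamma\mathbf{b}$$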

The manipulation of vectors is greatly facilitated by use of components along an orthogonal coordinate
system defined by three orthogonal unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$. For example the cartesian coordinate system
is defined by three unit vectors which, by convention, are called $(\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}})$.

B.2 Scalar product

Multiplication of two vectors can produce a 9-component tensor that can be represented by a $3 \times 3$ matrix
as discussed in the appendix on tensor algebra. There are two special cases for vector multiplication that are important for
vector algebra; the first is the scalar product, and the second is the vector product.
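The scalar product of two vectors is defined to be

$$\mathbf{a} \cdot \mathbf{b} = |a|\,|b| \cos\theta \tag{B.3}$$

where $\theta$ is the angle between the two vectors. It is a scalar and thus is independent of the orientation of
the coordinate axis system. Note that the scalar product commutes, is distributive, and associative with a
scalar multiplier, that is

$$\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a} \tag{B.4}$$
$$\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c}$$
$$(\gamma\mathbf{a}) \cdot \mathbf{b} = \gamma(\mathbf{b} \cdot \mathbf{a})$$

Note that $\mathbf{a} \cdot \mathbf{a} = |a|^{2}$ and if $\mathbf{a}$ and $\mathbf{b}$ are perpendicular then $\cos\theta = 0$ and thus $\mathbf{a} \cdot \mathbf{b} = 0$.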


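If the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis, that is, they are orthogonal unit vectors,
then from equations (B.3) and (B.4)
$$\hat{\mathbf{e}}_i \cdot \hat{\mathbf{e}}_j = \delta_{ij} \tag{B.5}$$
If $\hat{\mathbf{a}}$ is the unit vector for the vector $\mathbf{a}$ then the scalar product of a vector $\mathbf{a}$ with one of these unit vectors
$\hat{\mathbf{e}}_i$ gives the magnitude $|a|$ times the cosine of the angle between the vector $\mathbf{a}$ and $\hat{\mathbf{e}}_i$, that is
$$\mathbf{a} \cdot \hat{\mathbf{e}}_1 = |a|\,(\hat{\mathbf{a}} \cdot \hat{\mathbf{e}}_1) = |a| \cos\alpha \tag{B.6}$$
$$\mathbf{a} \cdot \hat{\mathbf{e}}_2 = |a|\,(\hat{\mathbf{a}} \cdot \hat{\mathbf{e}}_2) = |a| \cos\beta$$
$$\mathbf{a} \cdot \hat{\mathbf{e}}_3 = |a|\,(\hat{\mathbf{a}} \cdot \hat{\mathbf{e}}_3) = |a| \cos\gamma$$
where the cosines are called the direction cosines since they define the direction of the vector $\mathbf{a}$ with respect
to each orthogonal basis unit vector. Moreover, $\mathbf{a} \cdot \hat{\mathbf{e}}_1 = |a|\,\hat{\mathbf{a}} \cdot \hat{\mathbf{e}}_1 = |a| \cos\alpha$ is the component of $\mathbf{a}$ along the
$\hat{\mathbf{e}}_1$ axis. Thus the three components of the vector $\mathbf{a}$ are fully defined by the magnitude $|a|$ and the direction
cosines, corresponding to the angles $\alpha, \beta, \gamma$. That is,
$$a_1 = |a|\,(\hat{\mathbf{a}} \cdot \hat{\mathbf{e}}_1) = |a| \cos\alpha \tag{B.7}$$
$$a_2 = |a|\,(\hat{\mathbf{a}} \cdot \hat{\mathbf{e}}_2) = |a| \cos\beta$$
$$a_3 = |a|\,(\hat{\mathbf{a}} \cdot \hat{\mathbf{e}}_3) = |a| \cos\gamma$$
If the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis then the vector is fully defined by
$$\mathbf{a} = a_1 \hat{\mathbf{e}}_1 + a_2 \hat{\mathbf{e}}_2 + a_3 \hat{\mathbf{e}}_3 \tag{B.8}$$
Consider two vectors
$$\mathbf{a} = a_1 \hat{\mathbf{e}}_1 + a_2 \hat{\mathbf{e}}_2 + a_3 \hat{\mathbf{e}}_3$$
$$\mathbf{b} = b_1 \hat{\mathbf{e}}_1 + b_2 \hat{\mathbf{e}}_2 + b_3 \hat{\mathbf{e}}_3$$
Then using (B.5)
$$\mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3 = |a|\,|b| \cos\theta \tag{B.9}$$
where $\theta$ is the angle between the two vectors. In particular, since the direction cosine $\cos\alpha = \frac{a_1}{|a|}$, then
equation (B.9) gives
$$\cos\theta = \cos\alpha_a \cos\alpha_b + \cos\beta_a \cos\beta_b + \cos\gamma_a \cos\gamma_b \tag{B.10}$$
Note that when $\theta = 0$ then (B.10) gives
$$\cos^{2}\alpha + \cos^{2}\beta + \cos^{2}\gamma = 1 \tag{B.11}$$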
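Equations (B.9) to (B.11) can be illustrated numerically. The following plain-Python sketch computes direction cosines for two vectors and compares the two expressions for $\cos\theta$; the vectors and helper names are assumptions chosen for the example.

```python
from math import sqrt, isclose


def dot(u, v):
    """Scalar product in an orthonormal basis (equation B.9)."""
    return sum(ui * vi for ui, vi in zip(u, v))


def norm(u):
    """Magnitude |u| of a vector."""
    return sqrt(dot(u, u))


def direction_cosines(u):
    """Cosines of the angles between u and each basis unit vector (equation B.7)."""
    n = norm(u)
    return tuple(c / n for c in u)


# Illustrative vectors (hypothetical values)
a = (1.0, 2.0, 2.0)
b = (2.0, -1.0, 2.0)

cos_theta = dot(a, b) / (norm(a) * norm(b))                               # from (B.9)
cos_theta_dc = dot(direction_cosines(a), direction_cosines(b))            # from (B.10)
sum_sq = sum(c * c for c in direction_cosines(a))                         # from (B.11)
print(cos_theta, cos_theta_dc, sum_sq)
```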

B.3 Vector product

The vector product of two vectors is defined to be
$$\mathbf{c} = \mathbf{a} \times \mathbf{b} = |a|\,|b| \sin\theta\,\hat{\mathbf{n}} \tag{B.12}$$
where $\theta$ is the angle between the vectors and $\hat{\mathbf{n}}$ is a unit vector perpendicular to the plane defined by $\mathbf{a}$
and $\mathbf{b}$ such that the unit vectors $(\hat{\mathbf{a}}, \hat{\mathbf{b}}, \hat{\mathbf{n}})$ obey a right-handed screw rule. The vector product acts like a
pseudovector which comprises a normal vector multiplied by a sign factor that depends on the handedness
of the system as described in appendix 3.
The components of $\mathbf{c}$ are defined by the relation
$$c_i \equiv \sum_{jk} \varepsilon_{ijk}\, a_j b_k \tag{B.13}$$

where the (Levi-Civita) permutation symbol $\varepsilon_{ijk}$ has the following properties
$$\begin{aligned}
\varepsilon_{ijk} &= 0 && \text{if an index is equal to any other index} \\
\varepsilon_{ijk} &= +1 && \text{if } i, j, k \text{ form an even permutation of } 1, 2, 3 \\
\varepsilon_{ijk} &= -1 && \text{if } i, j, k \text{ form an odd permutation of } 1, 2, 3
\end{aligned} \tag{B.14}$$

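For example, if the three unit vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ form an orthonormal basis, then $\hat{\mathbf{e}}_j \times \hat{\mathbf{e}}_k = \sum_i \varepsilon_{ijk}\, \hat{\mathbf{e}}_i$, i.e.

$$\hat{\mathbf{e}}_1 \times \hat{\mathbf{e}}_2 = \hat{\mathbf{e}}_3 \qquad \hat{\mathbf{e}}_2 \times \hat{\mathbf{e}}_3 = \hat{\mathbf{e}}_1 \qquad \hat{\mathbf{e}}_3 \times \hat{\mathbf{e}}_1 = \hat{\mathbf{e}}_2 \tag{B.15}$$
$$\hat{\mathbf{e}}_2 \times \hat{\mathbf{e}}_1 = -\hat{\mathbf{e}}_3 \qquad \hat{\mathbf{e}}_3 \times \hat{\mathbf{e}}_2 = -\hat{\mathbf{e}}_1 \qquad \hat{\mathbf{e}}_1 \times \hat{\mathbf{e}}_3 = -\hat{\mathbf{e}}_2 \tag{B.16}$$
$$\hat{\mathbf{e}}_1 \times \hat{\mathbf{e}}_1 = 0 \qquad \hat{\mathbf{e}}_2 \times \hat{\mathbf{e}}_2 = 0 \qquad \hat{\mathbf{e}}_3 \times \hat{\mathbf{e}}_3 = 0 \tag{B.17}$$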

The vector product anticommutes in that

a × b = −b × a (B.18)

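However, it is distributive and associative with a scalar multiplier

$$\mathbf{a} \times (\mathbf{b} + \mathbf{c}) = \mathbf{a} \times \mathbf{b} + \mathbf{a} \times \mathbf{c} \tag{B.19}$$
$$(\gamma\mathbf{a}) \times \mathbf{b} = \gamma(\mathbf{a} \times \mathbf{b}) \tag{B.20}$$

Note that when $\sin\theta = 0$ then $\mathbf{a} \times \mathbf{b} = 0$ and in particular, $\mathbf{a} \times \mathbf{a} = 0$.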

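Consider two vectors

$$\mathbf{a} = a_1 \hat{\mathbf{e}}_1 + a_2 \hat{\mathbf{e}}_2 + a_3 \hat{\mathbf{e}}_3$$

$$\mathbf{b} = b_1 \hat{\mathbf{e}}_1 + b_2 \hat{\mathbf{e}}_2 + b_3 \hat{\mathbf{e}}_3$$

Then using equations (B.12) and (B.15) to (B.17)

$$\mathbf{a} \times \mathbf{b} = |a|\,|b| \sin\theta\,\hat{\mathbf{n}} =
\begin{vmatrix} \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}
= \hat{\mathbf{e}}_1 (a_2 b_3 - a_3 b_2) + \hat{\mathbf{e}}_2 (a_3 b_1 - a_1 b_3) + \hat{\mathbf{e}}_3 (a_1 b_2 - a_2 b_1)$$

where $\theta$ is the angle between the two vectors and the determinant is expanded along the top row. Examples of
vector products are torque $\mathbf{N} = \mathbf{r} \times \mathbf{F}$, angular momentum $\mathbf{L} = \mathbf{r} \times \mathbf{p}$, and the magnetic force $\mathbf{F} = q\mathbf{v} \times \mathbf{B}$ on a charge $q$.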
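The component definition (B.13) with the Levi-Civita symbol (B.14) can be compared directly with the determinant expansion. The sketch below uses plain Python with zero-based indices 0, 1, 2 standing in for 1, 2, 3; the closed-form expression used for the permutation symbol is a convenience chosen for this example, and the vectors are illustrative.

```python
def levi_civita(i, j, k):
    """Permutation symbol for indices in {0, 1, 2}: +1, -1, or 0 (equation B.14)."""
    return (i - j) * (j - k) * (k - i) // 2


def cross_levi_civita(a, b):
    """c_i = sum_jk epsilon_ijk a_j b_k (equation B.13)."""
    return tuple(
        sum(levi_civita(i, j, k) * a[j] * b[k]
            for j in range(3) for k in range(3))
        for i in range(3)
    )


def cross_determinant(a, b):
    """Expansion of the determinant form along its top row."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


# Illustrative vectors (hypothetical values)
a = (1, 2, 3)
b = (4, 5, 6)
c = cross_levi_civita(a, b)
print(c, cross_determinant(a, b), cross_levi_civita(b, a))
```

Swapping the order of the two vectors reverses the sign of every component, in line with (B.18).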

B.4 Triple products

The following scalar and vector triple products can be formed from the product of three vectors and are
used frequently.

Scalar triple products

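There are several permutations of scalar triple products of three vectors $[\mathbf{a}, \mathbf{b}, \mathbf{c}]$ that are identical.

$$\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \mathbf{c} \cdot (\mathbf{a} \times \mathbf{b}) = \mathbf{b} \cdot (\mathbf{c} \times \mathbf{a}) = (\mathbf{a} \times \mathbf{b}) \cdot \mathbf{c} = -\mathbf{a} \cdot (\mathbf{c} \times \mathbf{b}) \tag{B.21}$$

That is, the scalar triple product is invariant to cyclic permutations of the three vectors but changes sign for
interchange of two vectors. The scalar triple product is unchanged by swapping the scalar $(\cdot)$ and vector $(\times)$ operations.
Because of the symmetry the scalar triple product can be denoted as $[\mathbf{a}, \mathbf{b}, \mathbf{c}]$ and

$$\begin{aligned}
[\mathbf{a}, \mathbf{b}, \mathbf{c}] &> 0 && \text{if } [\mathbf{a}, \mathbf{b}, \mathbf{c}] \text{ is right-handed} \\
[\mathbf{a}, \mathbf{b}, \mathbf{c}] &= 0 && \text{if } [\mathbf{a}, \mathbf{b}, \mathbf{c}] \text{ is coplanar} \\
[\mathbf{a}, \mathbf{b}, \mathbf{c}] &< 0 && \text{if } [\mathbf{a}, \mathbf{b}, \mathbf{c}] \text{ is left-handed}
\end{aligned} \tag{B.22}$$

The scalar triple product can be written in terms of the components using a determinant
$$[\mathbf{a}, \mathbf{b}, \mathbf{c}] = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \tag{B.23}$$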

Vector triple product

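The vector triple product $\mathbf{a} \times (\mathbf{b} \times \mathbf{c})$ is a vector. Since $(\mathbf{b} \times \mathbf{c})$ is perpendicular to the plane of $\mathbf{b}$ and $\mathbf{c}$, then
$\mathbf{a} \times (\mathbf{b} \times \mathbf{c})$ must lie in the plane containing $\mathbf{b}$ and $\mathbf{c}$. Therefore the triple product can be expanded in terms of
$\mathbf{b}$ and $\mathbf{c}$, as given by the following identity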

a × (b × c) = (a · c) b − (a · b) c (B.24)
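Both triple-product results, the cyclic invariance (B.21) and the expansion (B.24), are easy to spot-check numerically. The plain-Python sketch below does so for one set of integer vectors; the vectors and helper names are illustrative assumptions.

```python
def dot(u, v):
    """Scalar product of two three-component vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(u, v):
    """Vector product of two three-component vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


# Illustrative vectors (hypothetical values)
a = (1, 2, 3)
b = (4, 5, 6)
c = (7, 8, 10)

# Scalar triple product and its cyclic permutations (equation B.21)
abc = dot(a, cross(b, c))
cab = dot(c, cross(a, b))
bca = dot(b, cross(c, a))
acb = dot(a, cross(c, b))   # two vectors interchanged, so the sign flips

# Vector triple product compared with (a.c) b - (a.b) c (equation B.24)
lhs = cross(a, cross(b, c))
rhs = tuple(dot(a, c) * bi - dot(a, b) * ci for bi, ci in zip(b, c))
print(abc, cab, bca, acb, lhs, rhs)
```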

Workshop exercises

1. Partition the following exercises among the group. Once you have completed your problem, check with a
classmate before writing it on the board. After you have verified that you have found the correct solution,
write your answer in the space provided on the board, taking care to include the steps that you used to arrive
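at your solution. The following information is needed.
a = 3i + 2j − 9k      b = −2i + 3k      c = −2i + j − 6k      d = i + 9j + 4k
$$\mathbf{E} = \begin{pmatrix} 2 & 7 & -4 \\ 3 & 1 & -2 \\ -2 & 0 & 5 \end{pmatrix} \qquad
\mathbf{F} = \begin{pmatrix} 3 & 4 \\ 5 & 6 \end{pmatrix} \qquad
\mathbf{G} = \begin{pmatrix} 2 & -4 \\ 7 & 1 \\ -1 & 1 \end{pmatrix} \qquad
\mathbf{H} = \begin{pmatrix} -8 & -1 & -3 \\ -4 & 2 & -2 \\ -1 & 0 & 0 \end{pmatrix}$$
Calculate each of the following
1  |a − (b + 3c)|                7  $(\mathbf{E}\mathbf{H})^{T}$
2  Component of c along a        8  |HE|
3  Angle between c and d         9  EHG
4  (b × d) · a                   10 EG − HG
5  (b × d) × a                   11 $\mathbf{E}\mathbf{H} - \mathbf{H}^{T}\mathbf{E}^{T}$
6  b × (d × a)                   12 $\mathbf{F}^{-1}$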

Problems
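[1] For what values of $a$ are the vectors $\mathbf{A} = 2a\hat{\mathbf{i}} - 2\hat{\mathbf{j}} + a\hat{\mathbf{k}}$ and $\mathbf{B} = a\hat{\mathbf{i}} + 2a\hat{\mathbf{j}} + 2\hat{\mathbf{k}}$ perpendicular?

[2] Show that the triple scalar product $(\mathbf{A} \times \mathbf{B}) \cdot \mathbf{C}$ can be written as

$$(\mathbf{A} \times \mathbf{B}) \cdot \mathbf{C} = \begin{vmatrix} A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \\ C_1 & C_2 & C_3 \end{vmatrix}$$

Show also that the product is unaffected by interchange of the scalar and vector product operations or by change in
the order of $\mathbf{A}, \mathbf{B}, \mathbf{C}$ as long as they are in cyclic order, that is

$$(\mathbf{A} \times \mathbf{B}) \cdot \mathbf{C} = \mathbf{A} \cdot (\mathbf{B} \times \mathbf{C}) = \mathbf{B} \cdot (\mathbf{C} \times \mathbf{A}) = (\mathbf{C} \times \mathbf{A}) \cdot \mathbf{B}$$

Therefore we may use the notation $\mathbf{ABC}$ to denote the triple scalar product. Finally give a geometric interpretation
of $\mathbf{ABC}$ by computing the volume of the parallelepiped defined by the three vectors $\mathbf{A}, \mathbf{B}, \mathbf{C}$.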
Appendix C

Orthogonal coordinate systems

The methods of vector analysis provide a convenient representation of physical laws. However, the manip-
ulation of scalar and vector fields is greatly facilitated by use of components with respect to an orthogonal
coordinate system.

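C.1 Cartesian coordinates $(x, y, z)$

Cartesian coordinates (rectangular) provide the simplest orthogonal rectangular coordinate system. The
unit vectors specifying the direction along the three orthogonal axes are taken to be $(\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}})$. In cartesian
coordinates scalar and vector functions are written as

$$\phi = \phi(x, y, z) \tag{C.1}$$
$$\mathbf{r} = x\hat{\mathbf{i}} + y\hat{\mathbf{j}} + z\hat{\mathbf{k}} \tag{C.2}$$

Calculation of the time derivatives of the position vector is especially simple using cartesian coordinates
because the unit vectors $(\hat{\mathbf{i}}, \hat{\mathbf{j}}, \hat{\mathbf{k}})$ are constant and independent in time. That is,

$$\frac{d\hat{\mathbf{i}}}{dt} = \frac{d\hat{\mathbf{j}}}{dt} = \frac{d\hat{\mathbf{k}}}{dt} = 0$$

Since the time derivatives of the unit vectors are all zero then the velocity $\dot{\mathbf{r}} = \frac{d\mathbf{r}}{dt}$ reduces to the time
derivatives of $x$, $y$ and $z$. That is,
$$\dot{\mathbf{r}} = \dot{x}\hat{\mathbf{i}} + \dot{y}\hat{\mathbf{j}} + \dot{z}\hat{\mathbf{k}} \tag{C.3}$$
Similarly the acceleration is given by
$$\ddot{\mathbf{r}} = \ddot{x}\hat{\mathbf{i}} + \ddot{y}\hat{\mathbf{j}} + \ddot{z}\hat{\mathbf{k}} \tag{C.4}$$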

C.2 Curvilinear coordinate systems

There are many examples in physics where the symmetry of the problem makes it more convenient to solve
motion at a point $P(x, y, z)$ using non-cartesian curvilinear coordinate systems. For example, problems
having spherical symmetry are most conveniently handled using a spherical coordinate system $(r, \theta, \varphi)$
with the origin at the center of spherical symmetry. Such problems occur frequently in electrostatics and
gravitation; e.g. solutions of the atom, or planetary systems. Note that a cartesian coordinate system still
is required to define the origin plus the polar and azimuthal angles $\theta, \varphi$. Using spherical coordinates for
a spherically symmetric system allows the problem to be factored into a cyclic angular part, the solution
of which involves spherical harmonics that are common to all such spherically-symmetric problems, plus a
one-dimensional radial part that contains the specifics of the particular spherically-symmetric potential.
Similarly, for problems involving cylindrical symmetry, it is much more convenient to use a cylindrical
coordinate system $(\rho, \varphi, z)$. Again it is necessary to use a cartesian coordinate system to define the origin
and angle $\varphi$. Motion in a plane can be handled using two dimensional polar coordinates.


Curvilinear coordinate systems introduce a complication in that the unit vectors are time dependent in
contrast to cartesian coordinate system where the unit vectors (î ĵ k̂) are independent and constant in time.
The introduction of this time dependence warrants further discussion.
Each of the three axes $q_i$ in curvilinear coordinate systems can be expressed in cartesian coordinates
$(x, y, z)$ as surfaces of constant $q_i$ given by the function

$$q_i = f_i(x, y, z) \tag{C.5}$$

where $i = 1, 2$ or $3$. An element of length $ds_i$ perpendicular to the surface $q_i$ is the distance between the
surfaces $q_i$ and $q_i + dq_i$ which can be expressed as

$$ds_i = h_i\, dq_i \tag{C.6}$$

where $h_i$ is a function of $(q_1, q_2, q_3)$. In cartesian coordinates $h_1$, $h_2$ and $h_3$ are all unity. The unit-length
vectors $\hat{\mathbf{q}}_1$, $\hat{\mathbf{q}}_2$, $\hat{\mathbf{q}}_3$ are perpendicular to the respective $q_1, q_2, q_3$ surfaces, and are oriented to have increasing
indices such that $\hat{\mathbf{q}}_1 \times \hat{\mathbf{q}}_2 = \hat{\mathbf{q}}_3$. The correspondence of the curvilinear coordinates, unit vectors, and transform
coefficients to cartesian, polar, cylindrical and spherical coordinates is given in table 1.

Curvilinear    q1   q2   q3     q̂1   q̂2   q̂3     h1   h2   h3

Cartesian      x    y    z      î    ĵ    k̂      1    1    1
Polar          r    θ           r̂    θ̂           1    r
Cylindrical    ρ    ϕ    z      ρ̂    ϕ̂    ẑ      1    ρ    1
Spherical      r    θ    ϕ      r̂    θ̂    ϕ̂      1    r    r sin θ

Table 1: Curvilinear coordinates

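The differential distance and volume elements are given by

$$d\mathbf{s} = ds_1 \hat{\mathbf{q}}_1 + ds_2 \hat{\mathbf{q}}_2 + ds_3 \hat{\mathbf{q}}_3 = h_1 dq_1 \hat{\mathbf{q}}_1 + h_2 dq_2 \hat{\mathbf{q}}_2 + h_3 dq_3 \hat{\mathbf{q}}_3 \tag{C.7}$$

$$d\tau = ds_1 \, ds_2 \, ds_3 = h_1 h_2 h_3 \, (dq_1 \, dq_2 \, dq_3) \tag{C.8}$$

These are evaluated below for polar, cylindrical, and spherical coordinates.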
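As a rough numerical illustration of equations (C.6) and (C.8), the sketch below estimates the scale factors $h_i$ for spherical coordinates by finite differences. It assumes the standard relation $h_i = |\partial \mathbf{r} / \partial q_i|$, which is not stated explicitly in the text above, and the function names are illustrative only.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """Map spherical (r, theta, phi) to cartesian (x, y, z)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def scale_factors(q, step=1e-6):
    """Estimate h_i = |d r / d q_i| by central differences (assumed definition)."""
    factors = []
    for i in range(3):
        forward = list(q)
        backward = list(q)
        forward[i] += step
        backward[i] -= step
        p_plus = spherical_to_cartesian(*forward)
        p_minus = spherical_to_cartesian(*backward)
        # Distance moved per unit change of the coordinate q_i
        factors.append(math.dist(p_plus, p_minus) / (2.0 * step))
    return factors

r, theta, phi = 2.0, 0.7, 1.2
h_r, h_theta, h_phi = scale_factors((r, theta, phi))
# Product h_1 h_2 h_3 is the factor multiplying dq_1 dq_2 dq_3 in (C.8)
volume_factor = h_r * h_theta * h_phi
print(h_r, h_theta, h_phi, volume_factor)
```

Within the finite-difference error, the three values should reproduce the spherical row of table 1, that is $1$, $r$, and $r \sin\theta$, and their product should match the factor $r^2 \sin\theta$ in the spherical volume element.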

C.2.1 Two-dimensional polar coordinates $(r, \theta)$

The complication and implications of time-dependent unit vectors are best illustrated by considering two-dimensional polar coordinates, which are the simplest curvilinear coordinate system. Polar coordinates are a special case of cylindrical coordinates when $z$ is held fixed, or a special case of spherical coordinates when $\phi$ is held fixed.
Consider the motion of a point $P$ as it moves along a curve $\mathbf{s}(t)$ such that in the time interval $dt$ it moves from $P(1)$ to $P(2)$, as shown in the figure in table 2. The two-dimensional polar coordinates have unit vectors $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$, which are orthogonal and change from $\hat{\mathbf{r}}_1$, $\hat{\boldsymbol{\theta}}_1$ to $\hat{\mathbf{r}}_2$, $\hat{\boldsymbol{\theta}}_2$ in the time $dt$. Note that for these polar coordinates the angle unit vector $\hat{\boldsymbol{\theta}}$ is taken to be tangential to the rotation, since this is the direction of motion of a point on the circumference at radius $r$.
The net changes shown in the figure in table 2 are

$$d\hat{\mathbf{r}} = \hat{\mathbf{r}}_2 - \hat{\mathbf{r}}_1 = |\hat{\mathbf{r}}| \, d\theta \, \hat{\boldsymbol{\theta}} = d\theta \, \hat{\boldsymbol{\theta}} \tag{C.9}$$

since the unit vector $\hat{\mathbf{r}}$ has the constant magnitude $|\hat{\mathbf{r}}| = 1$. Note that the infinitesimal $d\hat{\mathbf{r}}$ is perpendicular to the unit vector $\hat{\mathbf{r}}$, that is, $d\hat{\mathbf{r}}$ points in the tangential direction $\hat{\boldsymbol{\theta}}$.
Similarly, the infinitesimal

$$d\hat{\boldsymbol{\theta}} = \hat{\boldsymbol{\theta}}_2 - \hat{\boldsymbol{\theta}}_1 = -d\theta \, \hat{\mathbf{r}} \tag{C.10}$$

is perpendicular to the tangential unit vector $\hat{\boldsymbol{\theta}}$ and therefore points in the direction $-\hat{\mathbf{r}}$. The minus sign causes $-d\theta \, \hat{\mathbf{r}}$ to be directed in the opposite direction to $\hat{\mathbf{r}}$.

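The net distance element $d\mathbf{s}$ is given by

$$d\mathbf{s} = dr \, \hat{\mathbf{r}} + r \, d\hat{\mathbf{r}} = dr \, \hat{\mathbf{r}} + r \, d\theta \, \hat{\boldsymbol{\theta}} \tag{C.11}$$

This agrees with the prediction obtained using table 1.

The time derivatives of the unit vectors are given by equations (C.9) and (C.10) to be

$$\frac{d\hat{\mathbf{r}}}{dt} = \frac{d\theta}{dt} \hat{\boldsymbol{\theta}} \tag{C.12}$$

$$\frac{d\hat{\boldsymbol{\theta}}}{dt} = -\frac{d\theta}{dt} \hat{\mathbf{r}} \tag{C.13}$$

Note that the time derivatives of unit vectors are perpendicular to the corresponding unit vector, and the unit vectors are coupled.
Consider that the velocity $\mathbf{v}$ is expressed as

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{d}{dt}(r \hat{\mathbf{r}}) = \frac{dr}{dt} \hat{\mathbf{r}} + r \frac{d\hat{\mathbf{r}}}{dt} = \dot{r} \hat{\mathbf{r}} + r \dot{\theta} \hat{\boldsymbol{\theta}} \tag{C.14}$$

The velocity is resolved into a radial component $\dot{r}$ and an angular, transverse, component $r\dot{\theta}$.
Similarly the acceleration is given by

$$\mathbf{a} = \frac{d\mathbf{v}}{dt} = \frac{d\dot{r}}{dt} \hat{\mathbf{r}} + \dot{r} \frac{d\hat{\mathbf{r}}}{dt} + \dot{r} \dot{\theta} \hat{\boldsymbol{\theta}} + r \frac{d\dot{\theta}}{dt} \hat{\boldsymbol{\theta}} + r \dot{\theta} \frac{d\hat{\boldsymbol{\theta}}}{dt} = \left( \ddot{r} - r\dot{\theta}^2 \right) \hat{\mathbf{r}} + \left( r\ddot{\theta} + 2\dot{r}\dot{\theta} \right) \hat{\boldsymbol{\theta}} \tag{C.15}$$

where the $r\dot{\theta}^2 \hat{\mathbf{r}}$ term is the effective centripetal acceleration while the $2\dot{r}\dot{\theta} \hat{\boldsymbol{\theta}}$ term is called the Coriolis term. For the case when $\dot{r} = \ddot{r} = 0$, the first bracket in (C.15) is the centripetal acceleration while the second bracket is the tangential acceleration.
This discussion has shown that, in contrast to the time independence of the cartesian unit basis vectors, the unit basis vectors for curvilinear coordinates are time dependent, which leads to components of the velocity and acceleration involving coupled coordinates.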
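The polar acceleration formula (C.15) can be checked numerically against a direct cartesian calculation. The sketch below is only an illustration: the trajectory $r(t) = 1 + 0.5t$, $\theta(t) = t^2$ is an arbitrary assumption chosen so that the radial, centripetal, tangential, and Coriolis contributions are all present, and the function names are hypothetical.

```python
import math

def radius(t):
    return 1.0 + 0.5 * t      # assumed r(t): r_dot = 0.5, r_ddot = 0

def angle(t):
    return t * t              # assumed theta(t): theta_dot = 2t, theta_ddot = 2

def cartesian_position(t):
    r, theta = radius(t), angle(t)
    return (r * math.cos(theta), r * math.sin(theta))

def numerical_polar_acceleration(t, dt=1e-4):
    """Second central difference in x, y, projected onto r-hat and theta-hat."""
    x0, y0 = cartesian_position(t - dt)
    x1, y1 = cartesian_position(t)
    x2, y2 = cartesian_position(t + dt)
    ax = (x0 - 2.0 * x1 + x2) / dt ** 2
    ay = (y0 - 2.0 * y1 + y2) / dt ** 2
    theta = angle(t)
    a_r = ax * math.cos(theta) + ay * math.sin(theta)
    a_theta = -ax * math.sin(theta) + ay * math.cos(theta)
    return a_r, a_theta

def analytic_polar_acceleration(t):
    """Components of equation (C.15) for the assumed trajectory."""
    r, r_dot, r_ddot = radius(t), 0.5, 0.0
    theta_dot, theta_ddot = 2.0 * t, 2.0
    return (r_ddot - r * theta_dot ** 2,
            r * theta_ddot + 2.0 * r_dot * theta_dot)

t = 0.8
numeric = numerical_polar_acceleration(t)
analytic = analytic_polar_acceleration(t)
print(numeric, analytic)
```

The two results should agree to within the finite-difference error, which supports the sign and the factor of 2 in the Coriolis term for this particular trajectory.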

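Coordinates            $r, \theta$
Distance element       $d\mathbf{s} = dr \, \hat{\mathbf{r}} + r \, d\theta \, \hat{\boldsymbol{\theta}}$
Area element           $dA = r \, dr \, d\theta$
Unit vectors           $\hat{\mathbf{r}} = \hat{\mathbf{i}} \cos\theta + \hat{\mathbf{j}} \sin\theta$
                       $\hat{\boldsymbol{\theta}} = -\hat{\mathbf{i}} \sin\theta + \hat{\mathbf{j}} \cos\theta$
Time derivatives       $\frac{d\hat{\mathbf{r}}}{dt} = \dot{\theta} \, \hat{\boldsymbol{\theta}}$
of unit vectors        $\frac{d\hat{\boldsymbol{\theta}}}{dt} = -\dot{\theta} \, \hat{\mathbf{r}}$
Velocity               $\mathbf{v} = \dot{r} \hat{\mathbf{r}} + r\dot{\theta} \hat{\boldsymbol{\theta}}$
Kinetic energy         $\frac{m}{2} \left( \dot{r}^2 + r^2 \dot{\theta}^2 \right)$
Acceleration           $\mathbf{a} = \left( \ddot{r} - r\dot{\theta}^2 \right) \hat{\mathbf{r}} + \left( r\ddot{\theta} + 2\dot{r}\dot{\theta} \right) \hat{\boldsymbol{\theta}}$

Table 2: Differential relations plus a diagram of the unit vectors for 2-dimensional polar coordinates.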

C.2.2 Cylindrical Coordinates $(\rho, \phi, z)$

The three-dimensional cylindrical coordinates $(\rho, \phi, z)$ are obtained by adding the motion along the symmetry axis $\hat{\mathbf{z}}$ to the case for polar coordinates. The unit basis vectors are shown in Table 3, where the angular unit vector $\hat{\boldsymbol{\phi}}$ is taken to be tangential, corresponding to the direction a point on the circumference would move. The distance and volume elements, the cartesian coordinate components of the cylindrical unit basis vectors, and the unit vector time derivatives are shown in Table 3. The time dependence of the unit vectors is used to derive the acceleration. As for the two-dimensional polar coordinates, the $\hat{\boldsymbol{\rho}}$ and $\hat{\boldsymbol{\phi}}$ direction components of the acceleration for cylindrical coordinates are coupled functions of $\rho$, $\dot{\rho}$, $\ddot{\rho}$, $\dot{\phi}$, and $\ddot{\phi}$.

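Coordinates            $\rho, \phi, z$
Distance element       $d\mathbf{s} = d\rho \, \hat{\boldsymbol{\rho}} + \rho \, d\phi \, \hat{\boldsymbol{\phi}} + dz \, \hat{\mathbf{z}}$
Volume element         $d\tau = \rho \, d\rho \, d\phi \, dz$
Unit vectors           $\hat{\boldsymbol{\rho}} = \hat{\mathbf{i}} \cos\phi + \hat{\mathbf{j}} \sin\phi$
                       $\hat{\boldsymbol{\phi}} = -\hat{\mathbf{i}} \sin\phi + \hat{\mathbf{j}} \cos\phi$
                       $\hat{\mathbf{z}} = \hat{\mathbf{k}}$
Time derivatives       $\frac{d\hat{\boldsymbol{\rho}}}{dt} = \dot{\phi} \, \hat{\boldsymbol{\phi}}$
of unit vectors        $\frac{d\hat{\boldsymbol{\phi}}}{dt} = -\dot{\phi} \, \hat{\boldsymbol{\rho}}$
                       $\frac{d\hat{\mathbf{z}}}{dt} = 0$
Velocity               $\mathbf{v} = \dot{\rho} \hat{\boldsymbol{\rho}} + \rho\dot{\phi} \hat{\boldsymbol{\phi}} + \dot{z} \hat{\mathbf{z}}$
Kinetic energy         $\frac{m}{2} \left( \dot{\rho}^2 + \rho^2 \dot{\phi}^2 + \dot{z}^2 \right)$
Acceleration           $\mathbf{a} = \left( \ddot{\rho} - \rho\dot{\phi}^2 \right) \hat{\boldsymbol{\rho}} + \left( \rho\ddot{\phi} + 2\dot{\rho}\dot{\phi} \right) \hat{\boldsymbol{\phi}} + \ddot{z} \hat{\mathbf{z}}$

Table 3: Differential relations plus a diagram of the unit vectors for cylindrical coordinates.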

C.2.3 Spherical Coordinates $(r, \theta, \phi)$

The three-dimensional spherical coordinates can be treated the same way as cylindrical coordinates. The unit basis vectors are shown in Table 4, where the angular unit vectors $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\phi}}$ are taken to be tangential, corresponding to the direction a point on the circumference moves for a positive rotation angle.

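Coordinates            $r, \theta, \phi$
Distance element       $d\mathbf{s} = dr \, \hat{\mathbf{r}} + r \, d\theta \, \hat{\boldsymbol{\theta}} + r \sin\theta \, d\phi \, \hat{\boldsymbol{\phi}}$
Volume element         $d\tau = r^2 \sin\theta \, dr \, d\theta \, d\phi$
Unit vectors           $\hat{\mathbf{r}} = \hat{\mathbf{i}} \sin\theta \cos\phi + \hat{\mathbf{j}} \sin\theta \sin\phi + \hat{\mathbf{k}} \cos\theta$
                       $\hat{\boldsymbol{\theta}} = \hat{\mathbf{i}} \cos\theta \cos\phi + \hat{\mathbf{j}} \cos\theta \sin\phi - \hat{\mathbf{k}} \sin\theta$
                       $\hat{\boldsymbol{\phi}} = -\hat{\mathbf{i}} \sin\phi + \hat{\mathbf{j}} \cos\phi$
Time derivatives       $\frac{d\hat{\mathbf{r}}}{dt} = \hat{\boldsymbol{\theta}} \dot{\theta} + \hat{\boldsymbol{\phi}} \dot{\phi} \sin\theta$
of unit vectors        $\frac{d\hat{\boldsymbol{\theta}}}{dt} = -\hat{\mathbf{r}} \dot{\theta} + \hat{\boldsymbol{\phi}} \dot{\phi} \cos\theta$
                       $\frac{d\hat{\boldsymbol{\phi}}}{dt} = -\hat{\mathbf{r}} \dot{\phi} \sin\theta - \hat{\boldsymbol{\theta}} \dot{\phi} \cos\theta$
Velocity               $\mathbf{v} = \dot{r} \hat{\mathbf{r}} + r\dot{\theta} \hat{\boldsymbol{\theta}} + r\dot{\phi} \sin\theta \, \hat{\boldsymbol{\phi}}$
Kinetic energy         $\frac{m}{2} \left( \dot{r}^2 + r^2 \dot{\theta}^2 + r^2 \sin^2\theta \, \dot{\phi}^2 \right)$
Acceleration           $\mathbf{a} = \left( \ddot{r} - r\dot{\theta}^2 - r\dot{\phi}^2 \sin^2\theta \right) \hat{\mathbf{r}}$
                       $\quad + \left( r\ddot{\theta} + 2\dot{r}\dot{\theta} - r\dot{\phi}^2 \sin\theta \cos\theta \right) \hat{\boldsymbol{\theta}}$
                       $\quad + \left( r\ddot{\phi} \sin\theta + 2\dot{r}\dot{\phi} \sin\theta + 2r\dot{\theta}\dot{\phi} \cos\theta \right) \hat{\boldsymbol{\phi}}$

Table 4: Differential relations plus a diagram of the unit vectors for spherical coordinates.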

The distance and volume elements, the cartesian coordinate components of the spherical unit basis vectors, and the unit vector time derivatives are shown in Table 4. The time dependence of the unit vectors is used to derive the acceleration. As for the case of cylindrical coordinates, the $\hat{\mathbf{r}}$, $\hat{\boldsymbol{\theta}}$, and $\hat{\boldsymbol{\phi}}$ components of the acceleration involve coupling of the coordinates and their time derivatives.
It is important to note that the angular unit vectors $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\phi}}$ are taken to be tangential to the circles of rotation. However, for discussion of angular velocity or angular momentum it is more convenient to specify the vector properties using the axes of rotation defined by $\hat{\mathbf{r}} \times \hat{\boldsymbol{\theta}}$ and $\hat{\mathbf{r}} \times \hat{\boldsymbol{\phi}}$, which are perpendicular to the unit vectors $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\phi}}$ respectively. Be careful not to confuse the unit vectors $\hat{\boldsymbol{\theta}}$ and $\hat{\boldsymbol{\phi}}$ with those used for the angular velocities $\dot{\theta}$ and $\dot{\phi}$.

C.3 Frenet-Serret coordinates

The cartesian, polar, cylindrical, and spherical coordinate systems are all orthogonal coordinate systems that are fixed in space. There are situations where it is more convenient to use the Frenet-Serret coordinates, which comprise an orthogonal coordinate system that is fixed to a particle moving along a continuous, differentiable trajectory in three-dimensional Euclidean space. Let $s(t)$ represent a monotonically increasing arc length along the trajectory of the particle motion as a function of time $t$. The Frenet-Serret coordinates, shown in table 5, are the three instantaneous orthogonal unit vectors $\hat{\mathbf{t}}$, $\hat{\mathbf{n}}$, and $\hat{\mathbf{b}}$. The tangent unit vector $\hat{\mathbf{t}}$ is the instantaneous tangent to the curve. The normal unit vector $\hat{\mathbf{n}}$ lies in the plane of curvature of the trajectory, points towards the center of the instantaneous radius of curvature, and is perpendicular to the tangent unit vector $\hat{\mathbf{t}}$. The binormal unit vector $\hat{\mathbf{b}} = \hat{\mathbf{t}} \times \hat{\mathbf{n}}$ is perpendicular to the plane of curvature and is mutually perpendicular to the other two Frenet-Serret unit vectors. The Frenet-Serret unit vectors are defined by the relations

$$\frac{d\hat{\mathbf{t}}}{ds} = \kappa \hat{\mathbf{n}} \tag{C.16}$$

$$\frac{d\hat{\mathbf{b}}}{ds} = -\tau \hat{\mathbf{n}} \tag{C.17}$$

$$\frac{d\hat{\mathbf{n}}}{ds} = -\kappa \hat{\mathbf{t}} + \tau \hat{\mathbf{b}} \tag{C.18}$$

The curvature is $\kappa = \frac{1}{\rho}$, where $\rho$ is the radius of curvature, and $\tau$ is the torsion, which can be either positive or negative. For increasing $s$, a non-zero curvature $\kappa$ implies that the triad of unit vectors rotates in a right-handed sense about $\hat{\mathbf{b}}$. If the torsion $\tau$ is positive (negative), the triad of unit vectors rotates in a right (left) handed sense about $\hat{\mathbf{t}}$.

Distance element       $d\mathbf{s}(t) = \hat{\mathbf{t}} \left| \frac{d\mathbf{r}(t)}{dt} \right| dt = \hat{\mathbf{t}} \, v(t) \, dt$
Unit vectors           $\hat{\mathbf{t}}(t) = \frac{\mathbf{v}(t)}{|\mathbf{v}(t)|}$
                       $\hat{\mathbf{n}}(t) = \frac{d\hat{\mathbf{t}}/dt}{\left| d\hat{\mathbf{t}}/dt \right|}$
                       $\hat{\mathbf{b}}(t) = \hat{\mathbf{t}} \times \hat{\mathbf{n}}$
Time derivatives       $\frac{d}{dt} \begin{pmatrix} \hat{\mathbf{t}} \\ \hat{\mathbf{n}} \\ \hat{\mathbf{b}} \end{pmatrix} = |\mathbf{v}| \begin{pmatrix} 0 & \kappa & 0 \\ -\kappa & 0 & \tau \\ 0 & -\tau & 0 \end{pmatrix} \begin{pmatrix} \hat{\mathbf{t}} \\ \hat{\mathbf{n}} \\ \hat{\mathbf{b}} \end{pmatrix}$
of unit vectors
Velocity               $\mathbf{v}(t) = \frac{d\mathbf{r}(t)}{dt}$
Acceleration           $\mathbf{a}(t) = \frac{dv}{dt} \hat{\mathbf{t}} + \kappa v^2 \hat{\mathbf{n}}$

Table 5: The differential relations plus a diagram of the corresponding unit vectors for the Frenet-Serret coordinate system.

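The above equations also can be rewritten using a new rotation vector $\boldsymbol{\omega}$, where

$$\boldsymbol{\omega} = \tau \hat{\mathbf{t}} + \kappa \hat{\mathbf{b}} \tag{C.19}$$

Then equations (C.16) to (C.18) are transformed to

$$\frac{d\hat{\mathbf{t}}}{ds} = \boldsymbol{\omega} \times \hat{\mathbf{t}} \tag{C.20}$$

$$\frac{d\hat{\mathbf{n}}}{ds} = \boldsymbol{\omega} \times \hat{\mathbf{n}} \tag{C.21}$$

$$\frac{d\hat{\mathbf{b}}}{ds} = \boldsymbol{\omega} \times \hat{\mathbf{b}} \tag{C.22}$$

In general the Frenet-Serret unit vectors are time dependent. If the curvature $\kappa = 0$ then the curve is a straight line and $\hat{\mathbf{n}}$ and $\hat{\mathbf{b}}$ are not well defined. If the torsion is zero then the trajectory lies in a plane. Note that a helix has constant curvature and constant torsion.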
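The statement that a helix has constant curvature and torsion can be illustrated numerically. The sketch below assumes the helix parametrization $\mathbf{r}(u) = (a\cos u, a\sin u, bu)$ and the standard formulas $\kappa = |\mathbf{r}' \times \mathbf{r}''| / |\mathbf{r}'|^3$ and $\tau = (\mathbf{r}' \times \mathbf{r}'') \cdot \mathbf{r}''' / |\mathbf{r}' \times \mathbf{r}''|^2$, neither of which appears in the text above; the helper names are hypothetical.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def helix_derivatives(a, b, u):
    """First three derivatives of r(u) = (a cos u, a sin u, b u)."""
    d1 = (-a * math.sin(u), a * math.cos(u), b)
    d2 = (-a * math.cos(u), -a * math.sin(u), 0.0)
    d3 = (a * math.sin(u), -a * math.cos(u), 0.0)
    return d1, d2, d3

def curvature_and_torsion(d1, d2, d3):
    """Assumed standard formulas for a curve with arbitrary parameter."""
    c = cross(d1, d2)
    kappa = math.sqrt(dot(c, c)) / dot(d1, d1) ** 1.5
    tau = dot(c, d3) / dot(c, c)
    return kappa, tau

a, b = 2.0, 0.5
samples = [curvature_and_torsion(*helix_derivatives(a, b, u))
           for u in (0.0, 0.9, 2.3, 4.1)]
for kappa, tau in samples:
    print(kappa, tau)
```

For these formulas, every sample point should give the same values, $\kappa = a/(a^2 + b^2)$ and $\tau = b/(a^2 + b^2)$, independent of the position along the helix.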
The rate of change of a general vector field $\mathbf{E}$ along the trajectory can be written as

$$\frac{d\mathbf{E}}{ds} = \left( \frac{dE_t}{ds} \hat{\mathbf{t}} + \frac{dE_n}{ds} \hat{\mathbf{n}} + \frac{dE_b}{ds} \hat{\mathbf{b}} \right) + \boldsymbol{\omega} \times \mathbf{E} \tag{C.23}$$

The Frenet-Serret coordinates are used in the life sciences to describe the motion of a moving organism in a viscous medium. The Frenet-Serret coordinates also have applications to General Relativity.

Workshop exercises
1. The goal of this problem is to help you understand the origin of the equations that relate two different coordinate systems. Refer to diagrams for cylindrical and spherical coordinates as your teaching assistant explains how to arrive at expressions for $x_1$, $x_2$, and $x_3$ in terms of $\rho$, $\phi$, and $z$ and how to derive expressions for the velocity and acceleration vectors in cylindrical coordinates. Now try to relate spherical and rectangular coordinate systems. Your group should derive expressions relating the coordinates of the two systems, expressions relating the unit vectors and their time derivatives of the two systems, and finally, expressions for the velocity and acceleration in spherical coordinates.
Appendix D

Coordinate transformations

Coordinate systems can be translated, or rotated with respect to each other as well as being subject to spatial
inversion or time reversal. Scalars, vectors, and tensors are defined by their transformation properties under
rotation, spatial inversion and time reversal, and thus such transformations play a pivotal role in physics.

D.1 Translational transformations

Translational transformations are frequently involved in transforming between the center-of-mass and laboratory frames for reaction kinematics, as well as when performing vector addition of central forces for cases where the centers are displaced. Both the classical Galilean transformation and the relativistic Lorentz transformation are handled the same way. Consider two parallel orthonormal coordinate frames where the origin of $O'(x', y', z')$ is displaced by a time-dependent vector $\mathbf{a}(t)$ from the origin of frame $O(x, y, z)$. Then the Galilean transformation for a vector $\mathbf{r}$ in frame $O$ to $\mathbf{r}'$ in frame $O'$ is given by

$$\mathbf{r}'(x', y', z') = \mathbf{r}(x, y, z) + \mathbf{a}(t) \tag{D.1}$$

The velocities for a moving frame are given by the vector difference of the velocity in a stationary frame and the velocity of the origin of the moving frame. Linear accelerations can be handled similarly.

D.2 Rotational transformations

D.2.1 Rotation matrix
Rotational transformations of the coordinate system are used extensively in physics. The transformation
properties of fields under rotation define the scalar and vector properties of fields, as well as rotational
symmetry and conservation of angular momentum.
Rotation of the coordinate frame does not change the value of any scalar observable such as mass, temperature, etc. That is, a scalar quantity is invariant under coordinate rotation from $(x, y, z) \rightarrow (x', y', z')$:

$$\phi(x', y', z') = \phi(x, y, z) \tag{D.2}$$

By contrast, the components of a vector along the coordinate axes change under rotation of the coordinate axes. This difference in transformation properties under rotation between a scalar and a vector is important and defines both scalars and vectors.
Matrix mechanics, described in appendix , provides the most convenient way to handle coordinate
rotations. The transformation matrix, between coordinate systems having diﬀering orientations is called the
rotation matrix. This transforms the components of any vector with respect to one coordinate frame to
the components with respect to a second coordinate frame rotated with respect to the first frame.
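Assume a point $P$ has coordinates $(x_1, x_2, x_3)$ with respect to a certain coordinate system. Consider rotation to another coordinate frame for which the point $P$ has coordinates $(x'_1, x'_2, x'_3)$ and assume that the origins of both frames coincide. Rotation of a frame does not change the vector, only the vector components of the unit basis states. Therefore

$$\mathbf{x} = \hat{\mathbf{e}}'_1 x'_1 + \hat{\mathbf{e}}'_2 x'_2 + \hat{\mathbf{e}}'_3 x'_3 = \hat{\mathbf{e}}_1 x_1 + \hat{\mathbf{e}}_2 x_2 + \hat{\mathbf{e}}_3 x_3 \tag{D.3}$$

Note that if one designates that the unit vectors for the unprimed coordinate frame are $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$ and for the primed coordinate frame $(\hat{\mathbf{e}}'_1, \hat{\mathbf{e}}'_2, \hat{\mathbf{e}}'_3)$, then taking the scalar product of equation (D.3) sequentially with each of the unit base vectors $(\hat{\mathbf{e}}'_1, \hat{\mathbf{e}}'_2, \hat{\mathbf{e}}'_3)$ leads to the following three relations

$$\begin{aligned}
x'_1 &= (\hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_1) x_1 + (\hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_2) x_2 + (\hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_3) x_3 \\
x'_2 &= (\hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_1) x_1 + (\hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_2) x_2 + (\hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_3) x_3 \\
x'_3 &= (\hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_1) x_1 + (\hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_2) x_2 + (\hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_3) x_3
\end{aligned} \tag{D.4}$$

Note that the $(\hat{\mathbf{e}}'_i \cdot \hat{\mathbf{e}}_j)$ are the direction cosines as defined by the scalar product of two unit vectors for axes $i, j$, that is, they are the cosine of the angle between the two unit vectors.
Equation (D.4) can be written in matrix form as

$$\mathbf{x}' = \boldsymbol{\lambda} \cdot \mathbf{x} \tag{D.5}$$

where the "·" means the inner matrix product of the rotation matrix $\boldsymbol{\lambda}$ and the vector $\mathbf{x}$, where

$$\mathbf{x}' \equiv \begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} \qquad \mathbf{x} \equiv \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \qquad \boldsymbol{\lambda} \equiv \begin{pmatrix} \hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_1 \cdot \hat{\mathbf{e}}_3 \\ \hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_2 \cdot \hat{\mathbf{e}}_3 \\ \hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_1 & \hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_2 & \hat{\mathbf{e}}'_3 \cdot \hat{\mathbf{e}}_3 \end{pmatrix} \tag{D.6}$$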
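The inverse procedure is obtained by taking the scalar product of equation (D.3) successively with each of the unit basis vectors $(\hat{\mathbf{e}}_1, \hat{\mathbf{e}}_2, \hat{\mathbf{e}}_3)$, leading to three equations

$$\begin{aligned}
x_1 &= (\hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}'_1) x'_1 + (\hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}'_2) x'_2 + (\hat{\mathbf{e}}_1 \cdot \hat{\mathbf{e}}'_3) x'_3 \\
x_2 &= (\hat{\mathbf{e}}_2 \cdot \hat{\mathbf{e}}'_1) x'_1 + (\hat{\mathbf{e}}_2 \cdot \hat{\mathbf{e}}'_2) x'_2 + (\hat{\mathbf{e}}_2 \cdot \hat{\mathbf{e}}'_3) x'_3 \\
x_3 &= (\hat{\mathbf{e}}_3 \cdot \hat{\mathbf{e}}'_1) x'_1 + (\hat{\mathbf{e}}_3 \cdot \hat{\mathbf{e}}'_2) x'_2 + (\hat{\mathbf{e}}_3 \cdot \hat{\mathbf{e}}'_3) x'_3
\end{aligned} \tag{D.7}$$

Equation (D.7) can be written in matrix form as

$$\mathbf{x} = \boldsymbol{\lambda}^T \cdot \mathbf{x}' \tag{D.8}$$

where $\boldsymbol{\lambda}^T$ is the transpose of $\boldsymbol{\lambda}$.

Note that substituting equation (D.5) into equation (D.8) gives

$$\mathbf{x} = \boldsymbol{\lambda}^T \cdot (\boldsymbol{\lambda} \cdot \mathbf{x}) = \left( \boldsymbol{\lambda}^T \cdot \boldsymbol{\lambda} \right) \cdot \mathbf{x} \tag{D.9}$$

Thus

$$\left( \boldsymbol{\lambda}^T \cdot \boldsymbol{\lambda} \right) = \mathbf{I}$$

where $\mathbf{I}$ is the identity matrix. This implies that the rotation matrix $\boldsymbol{\lambda}$ is orthogonal with $\boldsymbol{\lambda}^T = \boldsymbol{\lambda}^{-1}$.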
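It is convenient to rename the elements of the rotation matrix to be

$$\lambda_{ij} \equiv (\hat{\mathbf{e}}'_i \cdot \hat{\mathbf{e}}_j) \tag{D.10}$$

so that the rotation matrix is written more compactly as

$$\boldsymbol{\lambda} \equiv \begin{pmatrix} \lambda_{11} & \lambda_{12} & \lambda_{13} \\ \lambda_{21} & \lambda_{22} & \lambda_{23} \\ \lambda_{31} & \lambda_{32} & \lambda_{33} \end{pmatrix}$$

and equation (D.4) becomes

$$\begin{aligned}
x'_1 &= \lambda_{11} x_1 + \lambda_{12} x_2 + \lambda_{13} x_3 \\
x'_2 &= \lambda_{21} x_1 + \lambda_{22} x_2 + \lambda_{23} x_3 \\
x'_3 &= \lambda_{31} x_1 + \lambda_{32} x_2 + \lambda_{33} x_3
\end{aligned} \tag{D.11}$$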

Consider an arbitrary rotation through an angle $\theta$. Equations (D.10) and (D.11) can be used to relate six of the nine quantities $\lambda_{ij}$ in the rotation matrix, so only three of the quantities are independent. That is, because of equation (D.11) we have three equations which ensure that the transformation is unitary.

$$\lambda_{i1}^2 + \lambda_{i2}^2 + \lambda_{i3}^2 = 1 \tag{D.12}$$

Also requiring that the axes be orthogonal gives three equations

$$\sum_j \lambda_{ij} \lambda_{kj} = 0 \qquad i \neq k \tag{D.13}$$

These six relations can be expressed as

$$\sum_j \lambda_{ij} \lambda_{kj} = \delta_{ik} \tag{D.14}$$

The fact that the rotation matrix has three independent quantities reflects the fact that all rotations can be expressed in terms of rotations about three orthogonal axes.

D.1 Example: Rotation matrix

Consider a point $P(x_1, x_2, x_3) = P(3, 4, 5)$ in the unprimed coordinate system. Consider the same point $P(x'_1, x'_2, x'_3)$ in the primed coordinate system, which has been rotated by an angle of $60^\circ$ about the $x_1$ axis as shown. The direction cosines $\lambda_{i'j} = \cos(\theta_{i'j})$ can be determined from the figure to be the following

$i'$    $j$    $\theta_{i'j}$ (degrees)    $\lambda_{i'j} = \cos(\theta_{i'j})$
1       1      0                           1
1       2      90                          0
1       3      90                          0
2       1      90                          0
2       2      60                          0.500
2       3      90 - 60                     0.866
3       1      90                          0
3       2      90 + 60                     -0.866
3       3      60                          0.500

Thus the rotation matrix is

$$\boldsymbol{\lambda} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0.500 & 0.866 \\ 0 & -0.866 & 0.500 \end{pmatrix}$$

The transformed point $P'(x'_1, x'_2, x'_3)$ therefore is given by

$$\begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0.500 & 0.866 \\ 0 & -0.866 & 0.500 \end{pmatrix} \cdot \begin{pmatrix} 3 \\ 4 \\ 5 \end{pmatrix} = \begin{pmatrix} 3 \\ 6.330 \\ -0.964 \end{pmatrix}$$

Note that the radial coordinate $r = r' = \sqrt{50}$. That is, the rotational transformation is unitary and thus the magnitude of the vector is unchanged.
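The arithmetic of this example can be reproduced with a few lines of code. The sketch below is a minimal illustration using only the standard library; the function names `rotation_about_x1` and `apply` are hypothetical, and the matrix layout follows the direction cosines tabulated above.

```python
import math

def rotation_about_x1(angle_deg):
    """Rotation matrix for a frame rotated by angle_deg about the x1 axis."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [[1.0, 0.0, 0.0],
            [0.0, c, s],
            [0.0, -s, c]]

def apply(matrix, vector):
    """Matrix-vector product x' = lambda . x"""
    return [sum(row[j] * vector[j] for j in range(3)) for row in matrix]

point = [3.0, 4.0, 5.0]
rotated = apply(rotation_about_x1(60.0), point)

# The magnitude of the vector should be unchanged by the rotation
norm_before = math.sqrt(sum(x * x for x in point))
norm_after = math.sqrt(sum(x * x for x in rotated))
print([round(x, 3) for x in rotated], norm_before, norm_after)
```

The rounded components should match the column vector quoted in the example, and both norms should equal $\sqrt{50}$ up to rounding error.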

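D.2 Example: Proof that a rotation matrix is orthogonal

Consider the rotation matrix

$$\boldsymbol{\lambda} = \frac{1}{9} \begin{pmatrix} 4 & 7 & -4 \\ 1 & 4 & 8 \\ 8 & -4 & 1 \end{pmatrix}$$

The product

$$\boldsymbol{\lambda}^T \cdot \boldsymbol{\lambda} = \frac{1}{81} \begin{pmatrix} 4 & 1 & 8 \\ 7 & 4 & -4 \\ -4 & 8 & 1 \end{pmatrix} \cdot \begin{pmatrix} 4 & 7 & -4 \\ 1 & 4 & 8 \\ 8 & -4 & 1 \end{pmatrix} = \frac{1}{81} \begin{pmatrix} 81 & 0 & 0 \\ 0 & 81 & 0 \\ 0 & 0 & 81 \end{pmatrix} = \mathbf{I}$$

which implies that $\boldsymbol{\lambda}$ is orthogonal.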

D.2.2 Finite rotations

Consider two finite $90^\circ$ rotations $A$ and $B$ illustrated in figure D.1. The $A$ rotation is $90^\circ$ around the $x_3$ axis in a right-handed direction as shown. In such a rotation the axes transform to $x'_1 = x_2$, $x'_2 = -x_1$, $x'_3 = x_3$ and the rotation matrix is

$$\boldsymbol{\lambda}_A = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{D.15}$$

The second rotation $\boldsymbol{\lambda}_B$ is a right-handed rotation about the $x'_1$ axis, which formerly was the $x_2$ axis. Then $x''_1 = x'_1$, $x''_2 = x'_3$, $x''_3 = -x'_2$ and the rotation matrix is

$$\boldsymbol{\lambda}_B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} \tag{D.16}$$

Figure D.1: Order of two finite rotations for a parallelepiped.

Consider the product of these two finite rotations, which corresponds to a single rotation matrix $\boldsymbol{\lambda}_{AB}$

$$\boldsymbol{\lambda}_{AB} = \boldsymbol{\lambda}_B \boldsymbol{\lambda}_A \tag{D.17}$$

That is:

$$\boldsymbol{\lambda}_{AB} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \tag{D.18}$$

Now consider that the order of these two rotations is reversed.

$$\boldsymbol{\lambda}_{BA} = \boldsymbol{\lambda}_A \boldsymbol{\lambda}_B \tag{D.19}$$

That is:

$$\boldsymbol{\lambda}_{BA} = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \end{pmatrix} \neq \boldsymbol{\lambda}_{AB} \tag{D.20}$$

An entirely different orientation results, as illustrated in figure D.1.
This behavior is a consequence of the fact that finite rotations do not commute, that is, reversing the order does not give the same answer. Thus, if we associate the vectors $\mathbf{A}$ and $\mathbf{B}$ with these rotations, then it implies that the product $\mathbf{AB} \neq \mathbf{BA}$. That is, finite rotations do not behave like true vectors, since their products do not commute.
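The non-commutation of the two $90^\circ$ rotations can be confirmed by multiplying the matrices of equations (D.15) and (D.16) in both orders. The sketch below is illustrative only; the names `lambda_a`, `lambda_b`, and `matmul` are assumptions introduced for the example.

```python
def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Rotation A: 90 degrees about the x3 axis, equation (D.15)
lambda_a = [[0, 1, 0],
            [-1, 0, 0],
            [0, 0, 1]]

# Rotation B: 90 degrees about the new x1 axis, equation (D.16)
lambda_b = [[1, 0, 0],
            [0, 0, 1],
            [0, -1, 0]]

lambda_ab = matmul(lambda_b, lambda_a)   # A first, then B, as in (D.17)
lambda_ba = matmul(lambda_a, lambda_b)   # B first, then A, as in (D.19)

print(lambda_ab)
print(lambda_ba)
print(lambda_ab == lambda_ba)
```

The two products should reproduce the right-hand sides of equations (D.18) and (D.20), and the final comparison should be false.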

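D.2.3 Infinitesimal rotations

Infinitesimal rotations do not suffer from the noncommutation defect of finite rotations. If the position vector of a point changes from $\mathbf{r}$ to $\mathbf{r} + d\mathbf{r}$ then the geometrical situation is represented correctly by

$$d\mathbf{r} = d\boldsymbol{\theta} \times \mathbf{r} \tag{D.21}$$

where $d\boldsymbol{\theta}$ is a quantity whose magnitude is equal to the infinitesimal rotation angle and which has a direction along the instantaneous axis of rotation, as illustrated in figure D.2.

Figure D.2: Infinitesimal rotation

That the infinitesimal angle $d\boldsymbol{\theta}$ is a vector is shown by proving that two infinitesimal rotations $d\boldsymbol{\theta}_1$ and $d\boldsymbol{\theta}_2$ commute. The changes in the position vector of the point are

$$d\mathbf{r}_1 = d\boldsymbol{\theta}_1 \times \mathbf{r} \tag{D.22}$$

and

$$d\mathbf{r}_2 = d\boldsymbol{\theta}_2 \times (\mathbf{r} + d\mathbf{r}_1) \tag{D.23}$$

Thus the final position vector for $d\boldsymbol{\theta}_1$ followed by $d\boldsymbol{\theta}_2$ is

$$\mathbf{r} + d\mathbf{r}_1 + d\mathbf{r}_2 = \mathbf{r} + d\boldsymbol{\theta}_1 \times \mathbf{r} + d\boldsymbol{\theta}_2 \times (\mathbf{r} + d\mathbf{r}_1) \tag{D.24}$$

Assuming that the second-order infinitesimals can be ignored gives

$$\mathbf{r} + d\mathbf{r}_1 + d\mathbf{r}_2 = \mathbf{r} + d\boldsymbol{\theta}_1 \times \mathbf{r} + d\boldsymbol{\theta}_2 \times \mathbf{r} \tag{D.25}$$

Consider now the inverse order of rotations.

$$\mathbf{r} + d\mathbf{r}_2 + d\mathbf{r}_1 = \mathbf{r} + d\boldsymbol{\theta}_2 \times \mathbf{r} + d\boldsymbol{\theta}_1 \times (\mathbf{r} + d\mathbf{r}_2) \tag{D.26}$$

Again, neglecting the second-order infinitesimals gives

$$\mathbf{r} + d\mathbf{r}_2 + d\mathbf{r}_1 = \mathbf{r} + d\boldsymbol{\theta}_2 \times \mathbf{r} + d\boldsymbol{\theta}_1 \times \mathbf{r} \tag{D.27}$$

Note that the results of these two orderings of the infinitesimal rotations, (D.25) and (D.27), are identical. That is, assuming that second-order infinitesimals can be neglected, the infinitesimal rotations commute, and thus $d\boldsymbol{\theta}_1$ and $d\boldsymbol{\theta}_2$ are correctly represented by vectors.
The fact that $d\boldsymbol{\theta}$ is a vector allows angular velocity to be represented by a vector. That is, angular velocity is the ratio of an infinitesimal rotation to an infinitesimal time.

$$\boldsymbol{\omega} = \frac{d\boldsymbol{\theta}}{dt} \tag{D.28}$$

Note that this implies that the velocity of the point can be expressed as

$$\mathbf{v} = \frac{d\mathbf{r}}{dt} = \frac{d\boldsymbol{\theta}}{dt} \times \mathbf{r} = \boldsymbol{\omega} \times \mathbf{r} \tag{D.29}$$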
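The claim that small rotations commute once second-order terms are neglected can be illustrated numerically. The sketch below builds exact rotation matrices about two different axes and measures how much the two orderings differ; the choice of axes, the angles, and the helper names are assumptions made for the example.

```python
import math

def rot_x1(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def rot_x2(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator_size(angle):
    """Largest element of R1 R2 - R2 R1 for two rotations by the same angle."""
    ab = matmul(rot_x1(angle), rot_x2(angle))
    ba = matmul(rot_x2(angle), rot_x1(angle))
    return max(abs(ab[i][j] - ba[i][j]) for i in range(3) for j in range(3))

small = commutator_size(1e-3)
medium = commutator_size(1e-2)
finite = commutator_size(math.pi / 2)
print(small, medium, medium / small, finite)
```

Increasing the angle by a factor of 10 should increase the mismatch by a factor of about 100, showing that the difference between the two orderings is second order in the angle, whereas for $90^\circ$ rotations the mismatch is of order one.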
 

D.2.4 Proper and improper rotations

The requirement that the coordinate axes be orthogonal, and that the transformation be unitary, leads to the relation between the components of the rotation matrix

$$\sum_j \lambda_{ij} \lambda_{kj} = \delta_{ik} \tag{D.30}$$

It was shown in equation (D.12) that, for such an orthogonal matrix, the inverse matrix $\boldsymbol{\lambda}^{-1}$ equals the transposed matrix $\boldsymbol{\lambda}^T$

$$\boldsymbol{\lambda}^{-1} = \boldsymbol{\lambda}^T$$

Inserting the orthogonality relation for the rotation matrix, and using the fact that a matrix and its transpose have the same determinant, leads to the fact that the square of the determinant of the rotation matrix equals one,

$$|\boldsymbol{\lambda}|^2 = 1 \tag{D.31}$$

that is

$$|\boldsymbol{\lambda}| = \pm 1 \tag{D.32}$$

A proper rotation is the rotation of a normal vector and has

$$|\boldsymbol{\lambda}| = +1 \tag{D.33}$$

An improper rotation corresponds to

$$|\boldsymbol{\lambda}| = -1 \tag{D.34}$$

An improper rotation implies a rotation plus a spatial reflection, which cannot be achieved by any combination of only rotations.
Consider the cross product of two vectors $\mathbf{c} = \mathbf{a} \times \mathbf{b}$. It can be shown that the cross product behaves under rotation as:

$$c'_i = |\boldsymbol{\lambda}| \sum_j \lambda_{ij} c_j \tag{D.35}$$

For all proper rotations the determinant $|\boldsymbol{\lambda}| = +1$ and thus the cross product also acts like a proper vector under rotation. This is not true for improper rotations, where $|\boldsymbol{\lambda}| = -1$.
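Equation (D.35) can be spot-checked for one proper and one improper transformation. In the sketch below, the $90^\circ$ rotation about the third axis and the full inversion matrix are taken as test cases, and the vectors `a` and `b` are arbitrary illustrative values; none of the helper names come from the text.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

rotation = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]       # proper, determinant +1
inversion = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]    # improper, determinant -1

a = [1.0, 2.0, 3.0]
b = [-2.0, 0.5, 4.0]

def cross_of_transformed(m):
    """c' computed directly as a' x b'."""
    return cross(apply(m, a), apply(m, b))

def determinant_times_transformed_cross(m):
    """Right-hand side of (D.35): |lambda| times lambda applied to a x b."""
    d = det3(m)
    return [d * x for x in apply(m, cross(a, b))]

for matrix in (rotation, inversion):
    print(det3(matrix), cross_of_transformed(matrix),
          determinant_times_transformed_cross(matrix))
```

For both matrices the two vectors printed on each line should agree. For the inversion, the cross product of the inverted vectors equals the original cross product, which is the sign behavior that distinguishes it from an ordinary vector.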

D.3 Spatial inversion transformation

Spatial inversion, that is, mirror reflection, corresponds to reflection of all coordinate vectors, $\hat{\mathbf{i}} \rightarrow -\hat{\mathbf{i}}$, $\hat{\mathbf{j}} \rightarrow -\hat{\mathbf{j}}$, and $\hat{\mathbf{k}} \rightarrow -\hat{\mathbf{k}}$. Such a transformation corresponds to the transformation matrix

$$\boldsymbol{\lambda} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} = - \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{D.36}$$

Thus $|\boldsymbol{\lambda}| = -1$, that is, it corresponds to an improper rotation. A spatial inversion for two vectors $\mathbf{A}(\mathbf{r})$ and $\mathbf{B}(\mathbf{r})$ corresponds to

$$\begin{aligned}
\mathbf{A}(\mathbf{r}) &= -\mathbf{A}(-\mathbf{r}) \\
\mathbf{B}(\mathbf{r}) &= -\mathbf{B}(-\mathbf{r})
\end{aligned} \tag{D.37}$$

Figure D.3: Inversion of an object corresponds to reflection about the origin of all axes.

That is, normal polar vectors change sign under spatial reflection. However, the cross product $\mathbf{C} = \mathbf{A} \times \mathbf{B}$ does not change sign under spatial inversion since the product of the two minus signs is positive. That is,

$$\mathbf{C}(\mathbf{r}) = +\mathbf{C}(-\mathbf{r}) \tag{D.38}$$

Thus the cross product behaves differently from a polar vector. This improper behavior is characteristic of an axial vector, which also is called a pseudovector.
Examples of pseudovectors are angular momentum, spin, magnetic field, etc. These pseudovectors are defined using the right-hand rule and thus have handedness. For a right-handed system

$$\mathbf{C} = \mathbf{A} \times \mathbf{B} \tag{D.39}$$

Changing to a left-handed system leads to

$$\mathbf{C} = \mathbf{B} \times \mathbf{A} = -\mathbf{A} \times \mathbf{B} \tag{D.40}$$

That is, handedness corresponds to a definite ordering of the cross product. Proper orthogonal transformations are said to preserve chirality (Greek for handedness) of a coordinate system.
An example of the use of the right-handed system is the usual definition of cartesian unit vectors,

$$\hat{\mathbf{i}} \times \hat{\mathbf{j}} = \hat{\mathbf{k}} \tag{D.41}$$

An obvious question to be asked is whether the handedness of a coordinate system is merely a mathematical curiosity or whether it has some deep underlying significance. Consider the Lorentz force

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}) \tag{D.42}$$

Since force and velocity are proper vectors, the magnetic field $\mathbf{B}$ must be a pseudovector. Note that in calculations the $\mathbf{B}$ field occurs only in cross products such as

$$\nabla \times \mathbf{B} = \mu_0 \mathbf{j} \tag{D.43}$$

where the current density $\mathbf{j}$ is a proper vector. Another example is the Biot-Savart law, which expresses $\mathbf{B}$ as

$$\mathbf{B} = \frac{\mu_0}{4\pi} I \int \frac{d\mathbf{l} \times \hat{\mathbf{r}}}{r^2} \tag{D.44}$$

Thus even though $\mathbf{B}$ is a pseudovector, the force $\mathbf{F}$ remains a proper vector. Thus if a left-handed coordinate definition $\mathbf{B} = \frac{\mu_0}{4\pi} I \int \frac{\hat{\mathbf{r}} \times d\mathbf{l}}{r^2}$ is used in (D.44), and $\mathbf{F} = q(\mathbf{E} + \mathbf{B} \times \mathbf{v})$ in (D.42), then the same final physical result would be obtained.
It was long thought that the laws of physics were symmetric with respect to spatial inversion (i.e. mirror reflection), meaning that the choice between left-handed and right-handed representations (chirality) was arbitrary. This is true for the gravitational, electromagnetic, and strong forces, and is called the conservation of parity. The fourth fundamental force in nature, the weak force, violates parity and favours handedness. It turns out that right-handed ordinary matter is symmetrical with left-handed antimatter.
In addition to the two flavours of vectors, one has scalars and pseudoscalars defined by:

$$\phi(\mathbf{r}) = +\phi(-\mathbf{r}) \tag{D.45}$$

$$\phi(\mathbf{r}) = -\phi(-\mathbf{r}) \tag{D.46}$$

An example of a pseudoscalar is the scalar product $\mathbf{A} \cdot (\mathbf{B} \times \mathbf{C})$.

D.4 Time reversal transformation

The basic laws of classical mechanics are invariant to the sense of the direction of time. Under time reversal the vector $\mathbf{r}$ is unchanged while both the momentum $\mathbf{p}$ and the time $t$ change sign; thus the time derivative $\mathbf{F} = \frac{d\mathbf{p}}{dt}$ is invariant to time reversal. That is, the force is unchanged and Newton's laws $\mathbf{F} = \frac{d\mathbf{p}}{dt}$ are invariant under time reversal. Since the force can be expressed as the gradient of a scalar potential for a conservative field, the potential also remains unchanged. That is

$$\frac{d\mathbf{p}}{dt} = -\nabla U(\mathbf{r}) = \mathbf{F} \tag{D.47}$$

It is necessary to introduce tensor algebra, given in appendix , prior to discussion of the transformation
properties of observables which is the topic of appendix 5.

Workshop exercises
1. Suppose the $x_2$-axis of a rectangular coordinate system is rotated by $30^\circ$ away from the $x_3$-axis around the $x_1$-axis.

(a) Find the corresponding transformation matrix. Try to do this by drawing a diagram instead of going to the book or the notes for a formula.
(b) Is this an orthogonal matrix? If so, show that it satisfies the main properties of an orthogonal matrix. If not, explain why it fails to be orthogonal.
(c) Does this matrix represent a proper or an improper rotation? How do you know?

2. When you were first introduced to vectors, you most likely were told that a scalar is a quantity that is defined by a magnitude, while a vector has both a magnitude and a direction. While this is certainly true, there is another, more sophisticated way to define a scalar quantity and a vector quantity: through their transformation properties. A scalar quantity transforms as $\phi' = \phi$ while a vector quantity transforms as $A'_i = \sum_j \lambda_{ij} A_j$. To show that the scalar product does indeed transform as a scalar, note that:

$$\begin{aligned}
\mathbf{A}' \cdot \mathbf{B}' &= \sum_i A'_i B'_i = \sum_i \left( \sum_j \lambda_{ij} A_j \right) \left( \sum_k \lambda_{ik} B_k \right) = \sum_j \sum_k \left( \sum_i \lambda_{ij} \lambda_{ik} \right) A_j B_k \\
&= \sum_j \left( \sum_k \delta_{jk} A_j B_k \right) = \sum_j A_j B_j = \mathbf{A} \cdot \mathbf{B}
\end{aligned}$$

Now you will show that the vector product transforms as a vector. Begin by writing out what you are trying to show explicitly and show it to the teaching assistant. Once the teaching assistant has confirmed that you have the correct expression, try to prove it. The vector product is a bit more difficult to work with than the scalar product, so your teaching assistant is prepared to give you a hint if you get stuck.

3. Suppose you have two rectangular coordinate systems that share a common origin, but one system is rotated by an angle $\theta$ with respect to the other. To describe this rotation, you have made use of the rotation matrix $\lambda(\theta)$. (I'm changing the notation slightly to put the emphasis on the angle of rotation.)

(a) Verify that the product of two rotation matrices $\lambda(\theta_1)\lambda(\theta_2)$ is in itself a rotation matrix.
(b) In abstract algebra, a group $G$ is defined as a set of elements $g$ together with a binary operation $*$ acting on that set such that four properties are satisfied:
i. (Closure) For any two elements $a$ and $b$ in the group $G$, the product of the elements, $a * b$, is also in the group $G$.
ii. (Associativity) For any three elements $a$, $b$, $c$ of the group $G$, $(a * b) * c = a * (b * c)$.
iii. (Existence of Identity) The group $G$ contains an identity element $e$ such that $a * e = e * a = a$ for all $a \in G$.
iv. (Existence of Inverses) For each element $a \in G$, there exists an inverse element $a^{-1} \in G$ such that $a * a^{-1} = a^{-1} * a = e$.
Show that if the product $*$ denotes the product of two matrices, then the set of rotation matrices together with $*$ forms a group. This group is known as the special orthogonal group in two dimensions, also known as $SO(2)$.
(c) Is this group commutative? In abstract algebra, a commutative group is called an abelian group.

4. When you look in a mirror the image of you appears left-to-right reversed, that is, the image of your left ear appears to be the right ear of the image and vice versa. Explain why the image is left-right reversed rather than up-down reversed or reversed about some other axis; i.e. explain what breaks the symmetry that leads to these properties of the mirror image.

Problems
[1] Find the transformation matrix that rotates the axis $x_3$ of a rectangular coordinate system $45^\circ$ toward $x_1$ around the $x_2$ axis.

[2] For simplicity, take $\boldsymbol{\lambda}$ to be a two-dimensional transformation matrix. Show by direct expansion that $|\boldsymbol{\lambda}|^2 = 1$.
Appendix E

Tensor algebra

E.1 Tensors
Mathematically scalars and vectors are the first two members of a hierarchy of entities, called tensors,
that behave under coordinate transformations as described in appendix . The use of the tensor notation
provides a compact and elegant way to handle transformations in physics.
A scalar is a rank-0 tensor with one component that is invariant under change of the coordinate system.

$$\phi(x', y', z') = \phi(x, y, z) \tag{E.1}$$

A vector is a rank-1 tensor which has three components that transform under rotation according to the matrix relation

$$\mathbf{x}' = \boldsymbol{\lambda} \cdot \mathbf{x} \tag{E.2}$$

where $\boldsymbol{\lambda}$ is the rotation matrix. Equation (E.2) can be written in the suffix form as

$$x'_i = \sum_{j=1}^{3} \lambda_{ij} x_j \tag{E.3}$$

The above definitions of scalars and vectors can be subsumed into a class of entities called tensors of rank $n$ that have $3^n$ components. A scalar is a tensor of rank $n = 0$, with only $3^0 = 1$ component, whereas a vector has rank $n = 1$, that is, the vector $\mathbf{x}$ has one suffix $i$ and $3^1 = 3$ components.
A second-order tensor $T_{ij}$ has rank $n = 2$ with two suffixes, that is, it has $3^2 = 9$ components that transform under rotation as

$$T'_{ij} = \sum_{k=1}^{3} \sum_{l=1}^{3} \lambda_{ik} \lambda_{jl} T_{kl} \tag{E.4}$$

For second-order tensors, the transformation formula given by equation (E.4) can be written more compactly using matrices. Thus the second-order tensor can be written as a $3 \times 3$ matrix

$$\mathbf{T} \equiv \begin{pmatrix} T_{11} & T_{12} & T_{13} \\ T_{21} & T_{22} & T_{23} \\ T_{31} & T_{32} & T_{33} \end{pmatrix} \tag{E.5}$$

The rotational transformation given in equation (E.4) can be written in the form

$$T'_{ij} = \sum_{k=1}^{3} \lambda_{ik} \left( \sum_{l=1}^{3} T_{kl} \lambda_{jl} \right) = \sum_{k=1}^{3} \lambda_{ik} \left( \sum_{l=1}^{3} T_{kl} \lambda^T_{lj} \right) \tag{E.6}$$

where $\lambda^T_{lj}$ are the matrix elements of the transposed matrix $\boldsymbol{\lambda}^T$. The summations in (E.6) can be expressed in both the tensor and conventional matrix form as the matrix product

$$\mathbf{T}' = \boldsymbol{\lambda} \cdot \mathbf{T} \cdot \boldsymbol{\lambda}^T \tag{E.7}$$

Equation (E.7) defines the rotational properties of a spherical tensor.


E.2 Tensor products

E.2.1 Tensor outer product
Tensor products feature prominently when using tensors to represent transformations. A second-order tensor
T can be formed by using the tensor product, also called outer product, of two vectors a and b which,
written in suffix form, is
$$\mathbf{T} \equiv \mathbf{a}\otimes\mathbf{b} = \begin{pmatrix} a_1b_1 & a_1b_2 & a_1b_3 \\ a_2b_1 & a_2b_2 & a_2b_3 \\ a_3b_1 & a_3b_2 & a_3b_3 \end{pmatrix} \tag{E.8}$$
In component form the matrix elements of this matrix are given by
$$T_{ij} = a_ib_j \tag{E.9}$$
This second-order tensor product has a rank $r = 2$, that is, it equals the sum of the ranks of the two
vectors. Equation E.8 is called a dyad since it was derived by taking the dyadic product of two vectors. In
general, multiplication, or division, of two vectors leads to second-order tensors. Note that this second-order
tensor product completes the triad of tensors possible by taking the product of two vectors. That is, the scalar
product a · b has rank $r = 0$, the vector product a × b has rank $r = 1$, and the tensor product a ⊗ b has rank¹
$r = 2$.
Higher-order tensors can be created by taking more complicated tensor products. For example, a rank-3
tensor can be created by taking the tensor outer product of the rank-2 tensor $T_{ij}$ and a vector $c_k$ which, for
a dyadic tensor, can be written as the tensor product of three vectors. That is,
$$T_{ijk} = T_{ij}c_k = a_ib_jc_k \tag{E.10}$$

In summary, the rank of the tensor product equals the sum of the ranks of the tensors included in the tensor
product.

E.2.2 Tensor inner product

The lowest rank tensor product, which is called the inner product, is obtained by taking the tensor product
of two tensors for the special case where one index is repeated, and taking the sum over this repeated index.
Summing over this repeated index, which is called contraction, removes the two indices for which the index
is repeated, resulting in a tensor that has rank $r$ equal to the sum of the ranks minus 2 for one contraction.
That is, the product tensor has rank $r = r_1 + r_2 - 2$.
The simplest example is the inner product of two vectors which has rank $r = 1 + 1 - 2 = 0$, that is, it is
the scalar product that equals the trace of the outer-product matrix a ⊗ b, and this inner product is commutative.
An especially important case is the inner product of a rank-2 dyad a ⊗ b given by equation E.8 with a
vector c, that is, the inner product T = a ⊗ b · c. Written in component form, the inner product is
$$\sum_{i}^{3}a_ib_ic_k = \left(\sum_{i}^{3}a_ib_i\right)c_k = (\mathbf{a}\cdot\mathbf{b})\,c_k \tag{E.11}$$
The scalar product a · b is a scalar number, and thus the inner-product tensor is the vector c renormalized
by the magnitude of the scalar product a · b. That is, it has a rank $r = 2 + 1 - 2 = 1$. Thus the inner product
of this rank-2 tensor with a vector gives a vector. The inner product of a rank-2 tensor with a rank-1 tensor
is used in this book for handling the rotation matrix, the inertia tensor for rigid-body rotation, and for the
stress and the strain tensors used to describe elasticity in solids.
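
The rank counting for outer products and contractions can be illustrated with a short numerical sketch. The function names and the vectors a, b, c below are hypothetical choices made for illustration; the contraction follows the component form of equation E.11, where the repeated index is summed.

```python
def outer(a, b):
    """Outer (dyadic) product, T_ij = a_i b_j: rank 1 + 1 = 2."""
    return [[ai * bj for bj in b] for ai in a]

def trace(m):
    """Contraction of the two indices of a rank-2 tensor: rank 2 - 2 = 0."""
    return sum(m[i][i] for i in range(len(m)))

def contract_dyad_with_vector(a, b, c):
    """Sum over the repeated index i of a_i b_i c_k: rank 2 + 1 - 2 = 1."""
    return [sum(a[i] * b[i] * c[k] for i in range(len(a)))
            for k in range(len(c))]

# Hypothetical example vectors.
a = [1.0, 2.0, 3.0]
b = [4.0, -1.0, 0.5]
c = [2.0, 0.0, -1.0]

T = outer(a, b)                               # nine components
dot_ab = sum(x * y for x, y in zip(a, b))     # scalar product a . b
contracted = contract_dyad_with_vector(a, b, c)

print(trace(T), dot_ab)   # the trace of the dyad equals a . b
print(contracted)         # the vector c scaled by a . b
```

The dyad has nine components, its trace reproduces the scalar product, and the contracted rank-3 product is a three-component vector, in line with the rule that each contraction lowers the rank by 2.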

E.1 Example: Displacement gradient tensor

The displacement gradient tensor provides an example of the use of the matrix representation to manipulate
tensors. Let $\boldsymbol{\phi}(x_1, x_2, x_3)$ be a vector field expressed in a cartesian basis. The definition of the gradient
G = ∇φ gives that
$$d\boldsymbol{\phi} = \mathbf{G}\cdot d\mathbf{x}$$
¹ The common convention is to denote the scalar product as a · b, the vector product as a × b, and the tensor product as a ⊗ b.

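Calculating the components of $d\boldsymbol{\phi}$ in terms of $d\mathbf{x}$ gives
$$d\phi_1 = \frac{\partial\phi_1}{\partial x_1}dx_1 + \frac{\partial\phi_1}{\partial x_2}dx_2 + \frac{\partial\phi_1}{\partial x_3}dx_3$$
$$d\phi_2 = \frac{\partial\phi_2}{\partial x_1}dx_1 + \frac{\partial\phi_2}{\partial x_2}dx_2 + \frac{\partial\phi_2}{\partial x_3}dx_3$$
$$d\phi_3 = \frac{\partial\phi_3}{\partial x_1}dx_1 + \frac{\partial\phi_3}{\partial x_2}dx_2 + \frac{\partial\phi_3}{\partial x_3}dx_3$$
Using index notation, with summation over the repeated index $j$ implied, this can be written as
$$d\phi_i = \frac{\partial\phi_i}{\partial x_j}dx_j$$
The second-rank gradient tensor G can be represented in the matrix form as
$$\mathbf{G} = \begin{pmatrix} \frac{\partial\phi_1}{\partial x_1} & \frac{\partial\phi_1}{\partial x_2} & \frac{\partial\phi_1}{\partial x_3} \\ \frac{\partial\phi_2}{\partial x_1} & \frac{\partial\phi_2}{\partial x_2} & \frac{\partial\phi_2}{\partial x_3} \\ \frac{\partial\phi_3}{\partial x_1} & \frac{\partial\phi_3}{\partial x_2} & \frac{\partial\phi_3}{\partial x_3} \end{pmatrix}$$
Then the vector $d\boldsymbol{\phi}$ can be expressed compactly as the inner product of G and $d\mathbf{x}$, that is
$$d\boldsymbol{\phi} = \mathbf{G}\cdot d\mathbf{x}$$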

E.3 Tensor properties

In principle one must distinguish between a 3×3 square matrix, and the tensor component representations of
a rank-2 tensor. However, as illustrated by the previous discussion, for orthogonal transformations, the tensor
components of the second rank tensor transform identically with the matrix components. Thus functionally,
the matrix formulation and tensor representations are identical. As a consequence, all the terminology and
operations used in matrix mechanics are equally applicable to the tensor representation.
The tensor representation of the rotation matrix provides the simplest example of the equivalence of
the matrix and tensor representations of transformations. Appendix 2 showed that the unitary rotation
matrix λ acting on a vector x transforms it to the vector x′ that is rotated with respect to x. That is, the
transformation is
$$\mathbf{x}' = \boldsymbol{\lambda}\cdot\mathbf{x} \tag{5}$$
where
$$\mathbf{x}' \equiv \begin{pmatrix} x'_1 \\ x'_2 \\ x'_3 \end{pmatrix} \qquad \mathbf{x} \equiv \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \qquad \boldsymbol{\lambda} \equiv \begin{pmatrix} \hat{e}'_1\cdot\hat{e}_1 & \hat{e}'_1\cdot\hat{e}_2 & \hat{e}'_1\cdot\hat{e}_3 \\ \hat{e}'_2\cdot\hat{e}_1 & \hat{e}'_2\cdot\hat{e}_2 & \hat{e}'_2\cdot\hat{e}_3 \\ \hat{e}'_3\cdot\hat{e}_1 & \hat{e}'_3\cdot\hat{e}_2 & \hat{e}'_3\cdot\hat{e}_3 \end{pmatrix} \tag{6}$$
Appendix 2 showed that the rotation matrix λ requires 9 components to fully specify the transformation
from the initial 3-component vector x to the rotated vector x′. The rotation tensor is a dyad as well as being
unitary and dimensionless. Note that equation 5 is an example of the inner product of a rank-2 rotation
tensor acting on a vector leading to another vector that is rotated with respect to the first vector.
In general, rank-2 tensors have dimensions and are not unitary. For example, the angular velocity vector
ω and the angular momentum vector L are related by the inner product of the inertia tensor {I} and ω.
That is
L ={I} · ω (116)
The inertia tensor has dimensions of mass × length² and relates two very different vector observables. The
stress tensor and the strain tensor, discussed in chapter 15, provide another example of second-order tensors
that are used to transform one vector observable to another vector observable analogous to the case of the
rotation matrix or the inertia tensor.
Note that pseudo-tensors can be used to make a rotational transformation plus a change in the sign.
That is, they lead to a parity inversion.
The tensor notation is used extensively in physics since it provides a powerful, elegant, and compact
representation for describing transformations.

E.4 Contravariant and covariant tensors

In general the configuration space used to specify a dynamical system is not a Euclidean space in that
there may not be a system of coordinates for which the distance between any two neighboring points can
be represented by the sum of the squares of the coordinate differentials. For example, a set of cartesian
coordinates does not exist for the two-dimensional motion of a single particle constrained to the curved surface
of a fixed sphere. Such curved spaces need to be represented in terms of Riemannian geometry rather
than Euclidean geometry. Curved configuration spaces occur in some branches of physics such as Einstein’s
General Theory of Relativity.
Tensors have transformation properties that can be either contravariant or covariant. Consider a set of
generalized coordinates $q'^{i}$ that are a function of the coordinates $q^{j}$. Then infinitesimal changes $dq^{j}$ will lead
to infinitesimal changes $dq'^{i}$ where
$$dq'^{i} = \sum_{j}\frac{\partial q'^{i}}{\partial q^{j}}dq^{j} \tag{E.12}$$
Contravariant components of a tensor transform according to the relation
$$A'^{i} = \sum_{j}\frac{\partial q'^{i}}{\partial q^{j}}A^{j} \tag{E.13}$$
Equation E.13 relates the contravariant components in the unprimed and primed frames.
Derivatives of a scalar function $\phi$, such as $A_j = \partial\phi/\partial q^{j}$, transform as
$$A'_{i} = \frac{\partial\phi}{\partial q'^{i}} = \sum_{j}\frac{\partial\phi}{\partial q^{j}}\frac{\partial q^{j}}{\partial q'^{i}} = \sum_{j}\frac{\partial q^{j}}{\partial q'^{i}}A_{j} \tag{E.14}$$
That is, covariant components of the tensor transform according to the relation
$$A'_{i} = \sum_{j}\frac{\partial q^{j}}{\partial q'^{i}}A_{j} \tag{E.15}$$
It is important to differentiate between contravariant and covariant vectors. The Einstein superscript/subscript
convention for distinguishing between these two flavours of tensors is given in table 1.

Table 1. Einstein notation for tensors.

$A^{i}$ denotes a contravariant vector
$A_{i}$ denotes a covariant vector

In linear algebra one can map from one coordinate system to another as illustrated in appendix . That
is, the tensor x can be expressed as components with respect to either the unprimed or primed coordinate
frames
$$\mathbf{x} = \hat{e}'_1x'_1 + \hat{e}'_2x'_2 + \hat{e}'_3x'_3 = \hat{e}_1x_1 + \hat{e}_2x_2 + \hat{e}_3x_3 \tag{E.16}$$
For an $n$-dimensional manifold the unit basis column vectors ê transform according to the transformation
matrix λ
$$\hat{\mathbf{e}}' = \boldsymbol{\lambda}\cdot\hat{\mathbf{e}} \tag{E.17}$$
Since the tensor x is independent of the coordinate basis, the components of x must have the opposite
transform
$$\mathbf{x}' = \left(\boldsymbol{\lambda}^{-1}\right)^{T}\cdot\mathbf{x} \tag{E.18}$$
This normal vector x is called a “contravariant vector” because it transforms contrary to the basis column
vector transformation.
The inverse of equation E.18 gives that the column vector element
$$x_j = \sum_{i}\left(\boldsymbol{\lambda}^{T}\right)_{ji}x'_i \tag{E.19}$$


Consider the case of a gradient with respect to the coordinate x in both the unprimed and primed bases.
Using the chain rule for the partial derivative, the component of the gradient in the primed frame can
be expanded as
$$(\nabla\phi)'_i = \frac{\partial\phi}{\partial x'_i} = \sum_{j}\frac{\partial\phi}{\partial x_j}\frac{\partial x_j}{\partial x'_i} = \sum_{j}\left(\boldsymbol{\lambda}^{T}\right)_{ji}\frac{\partial\phi}{\partial x_j} = \sum_{j}\lambda_{ij}\frac{\partial\phi}{\partial x_j} \tag{E.20}$$
That is, the gradient transforms as
$$\nabla'\phi = \boldsymbol{\lambda}\cdot\nabla\phi \tag{E.21}$$
That is, a gradient transforms as a covariant vector, like the unit vectors, whereas a vector x is contravariant
under transformation.
Normally the basis is orthonormal, $\left(\boldsymbol{\lambda}^{-1}\right)^{T} = \boldsymbol{\lambda}$, and thus there is no difference between contravariant and
covariant vectors. However, for curved coordinate systems, such as non-Euclidean geometry in the General
Theory of Relativity, the covariant and contravariant vectors behave differently.
The Einstein convention is extended to apply to matrices by writing the elements of the matrix A as
$A^{i}{}_{j}$ while the elements of the transposed matrix $\mathbf{A}^{-1}$ are written as $A_{j}{}^{i}$. The matrix product for A with a
contravariant vector X is written as
$$X'^{i} = \sum_{j}A^{i}{}_{j}X^{j} \tag{E.22}$$
where the summation over $j$ effectively cancels the identical superscript and subscript $j$.
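Similarly a covariant vector, such as a gradient, is written as
$$(\nabla'\phi)_i = \sum_{j}\left[\left(\mathbf{A}^{-1}\right)^{T}\right]_{i}{}^{j}(\nabla\phi)_j = \sum_{j}\left(\mathbf{A}^{-1}\right)^{j}{}_{i}(\nabla\phi)_j \tag{E.23}$$
Again the summation cancels the $j$ superscript and subscript. The Kronecker delta symbol is written as
$$\sum_{j}A^{i}{}_{j}\left(\mathbf{A}^{-1}\right)^{j}{}_{k} = \delta^{i}_{k} \tag{E.24}$$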


E.5 Generalized inner product

The generalized definition of an inner product is
$$\mathbf{A}\cdot\mathbf{B} = \sum_{ij}g_{ij}A^{i}B^{j} \tag{E.25}$$
where $g_{ij}$ is a unitary matrix called a covariant metric. The covariant metric transforms a contravariant to
a covariant tensor. For example the matrix element of a covariant tensor $A_i$ can be written as
$$A_i = \sum_{j}g_{ij}A^{j} \tag{E.26}$$
Associating the covariant metric with either of the vectors in the inner product gives
$$\mathbf{A}\cdot\mathbf{B} = \sum_{ij}g_{ij}A^{i}B^{j} = \sum_{j}A_{j}B^{j} = \sum_{i}A^{i}B_{i} \tag{E.27}$$
Similarly it can be defined in terms of an orthogonal contravariant metric $g^{ij}$ where
$$\mathbf{A}\cdot\mathbf{B} = \sum_{ij}g^{ij}A_{i}B_{j} \tag{E.28}$$
Then
$$A^{i} = \sum_{j}g^{ij}A_{j} \tag{E.29}$$
Association of the contravariant metric with one of the vectors in the inner product gives the inner
product
$$\mathbf{A}\cdot\mathbf{B} = \sum_{ij}g^{ij}A_{i}B_{j} = \sum_{i}A^{i}B_{i} = \sum_{j}A_{j}B^{j} \tag{E.30}$$
For most situations in this book the metric $g_{ij}$ is diagonal and unitary.
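
As an illustration of how a diagonal metric lowers an index and enters the inner product, the sketch below uses a diagonal metric with entries (+1, −1, −1, −1). That choice of metric, the function names, and the sample components are assumptions made for this example only; they are not taken from the text.

```python
def lower_index(g, v_up):
    """A_i = sum_j g_ij A^j: covariant components from contravariant ones."""
    n = len(v_up)
    return [sum(g[i][j] * v_up[j] for j in range(n)) for i in range(n)]

def inner_product(g, a_up, b_up):
    """sum_ij g_ij A^i B^j, the generalized inner product."""
    n = len(a_up)
    return sum(g[i][j] * a_up[i] * b_up[j]
               for i in range(n) for j in range(n))

# Assumed example: a diagonal metric whose square is the identity.
g = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]

A_up = [2.0, 1.0, 0.0, 3.0]    # hypothetical contravariant components A^i
B_up = [1.0, 4.0, 2.0, -1.0]   # hypothetical contravariant components B^j

A_down = lower_index(g, A_up)
B_down = lower_index(g, B_up)

full_sum = inner_product(g, A_up, B_up)
via_A_down = sum(A_down[j] * B_up[j] for j in range(4))
via_B_down = sum(A_up[i] * B_down[i] for i in range(4))
print(full_sum, via_A_down, via_B_down)
```

All three expressions agree, which is the content of attaching the metric to either vector of the inner product.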

E.6 Transformation properties of observables

In physics, observables can be represented by spherical tensors which specify the angular momentum and
parity characteristics of the observable, and the tensor rank is independent of the time dependence. The
transformation properties of these tensors, coupled with their time-reversal invariance, specify the funda-
mental characteristics of the observables.
Table 2 summarizes the transformation properties under rotation, spatial inversion and time reversal
for observables encountered in classical mechanics and electrodynamics. Note that observables can be scalar,
vector, pseudovector, or second-order tensors, under rotation, and even or odd under either space inversion
or time inversion. For example, in classical mechanics the inertia tensor I relates the angular velocity vector
ω to the angular momentum vector L by taking the inner product L = I · ω. In general I is not diagonal and
thus the angular momentum is not parallel to the angular velocity ω. A similar example in electrodynamics
is the dielectric tensor K which relates the displacement field D to the electric field E by D = K · E. For
anisotropic crystal media K is not diagonal leading to the electric field vectors E and D not being parallel.
As discussed in chapter 7, Noether’s Theorem states that symmetries of the transformation properties lead
to important conservation laws. The behavior of classical systems under rotation relates to the conservation
of angular momentum, the behavior under spatial inversion relates to parity conservation, and time-reversal
invariance relates to conservation of energy. That is, conservative forces conserve energy and are time-reversal
invariant.

Table 2 : Transformation properties of scalar, vector, pseudovector, and tensor observables
under rotation, spatial inversion, and time reversal2
Physical Observable Rotation Space Time Name
(Tensor rank) inversion reversal
1) Classical Mechanics
Mass density  0 Even Even Scalar
Kinetic energy 2 2 0 Even Even Scalar
Potential energy  () 0 Even Even Scalar
Lagrangian  0 Even Even Scalar
Hamiltonian  0 Even Even Scalar
Gravitational potential  0 Even Even Scalar
Coordinate r 1 Odd Even Vector
Velocity v 1 Odd Odd Vector
Momentum p 1 Odd Odd Vector
Angular momentum L=r×p 1 Even Odd Pseudovector
Force F 1 Odd Even Vector
Torque N=r×F 1 Even Even Pseudovector
Gravitational field g 1 Odd Even Vector
Inertia tensor I 2 Even Even Tensor
Elasticity stress tensor T 2 Even Even Tensor

2) Electromagnetism
Charge density  0 Even Even Scalar
Current density j 1 Odd Odd Vector
Electric field E 1 Odd Even Vector
Polarization P 1 Odd Even Vector
Displacement D 1 Odd Even Vector
Magnetic  field B 1 Even Odd Pseudovector
Magnetization M 1 Even Odd Pseudovector
Magnetic  field H 1 Even Odd Pseudovector
Poynting vector S=E×H 1 Odd Odd Vector
Dielectric tensor K 2 Even Even Tensor
Maxwell stress tensor T 2 Even Even Tensor

2 Based on table 6.1 in "Classical Electrodynamics" 2 edition, by J.D. Jackson [?]
Appendix F

Aspects of multivariate calculus

Multivariate calculus provides the framework for handling systems having many variables associated with
each of several bodies. It is assumed that the reader has studied linear diﬀerential equations plus multivariate
calculus and thus has been exposed to the calculus used in classical mechanics. Chapter 5 of this book
introduced variational calculus which covers several important aspects of multivariate calculus such as Euler’s
variational calculus and Lagrange multipliers. This appendix provides a brief review of a selection of other
aspects of multivariate calculus that feature prominently in classical mechanics.

F.1 Partial diﬀerentiation

The extension of the derivative to multivariate calculus involves use of partial derivatives. The partial
derivative with respect to the variable $x_i$ of a multivariate function $f(x_1, x_2, \dots, x_n)$ involves taking the
normal one-variable derivative with respect to $x_i$ assuming that the other $n - 1$ variables are held constant.
That is,
$$\frac{\partial f(x_1, x_2, \dots, x_n)}{\partial x_i} = \lim_{\delta x_i\to 0}\left[\frac{f(x_1, x_2, \dots, x_{i-1}, (x_i + \delta x_i), \dots, x_n) - f(x_1, x_2, \dots, x_i, \dots, x_n)}{\delta x_i}\right] \tag{F.1}$$
If it is assumed that the function $f(\mathbf{x})$ is continuously differentiable to $k$th order, then
all partial derivatives of that order or less are independent of the order in which they are performed. That
is,
$$\frac{\partial^2 f(\mathbf{x})}{\partial x_i\,\partial x_j} = \frac{\partial^2 f(\mathbf{x})}{\partial x_j\,\partial x_i} \tag{F.2}$$
The chain rule for partial differentiation gives that
$$\frac{\partial f(x_1, x_2, \dots, x_n)}{\partial q_j} = \sum_{i=1}^{n}\frac{\partial f(\mathbf{x})}{\partial x_i}\frac{\partial x_i(\mathbf{q})}{\partial q_j} \tag{F.3}$$
The total differential of a multivariate function $f(\mathbf{x})$ is
$$df = \sum_{i=1}^{n}\frac{\partial f(\mathbf{x})}{\partial x_i}dx_i \tag{F.4}$$
This can be extended to higher-order derivatives using the operator formalism
$$d^{k}f(\mathbf{x}) = \left(dx_1\frac{\partial}{\partial x_1} + \cdots + dx_n\frac{\partial}{\partial x_n}\right)^{k}f(\mathbf{x}) = \sum_{i_1,\dots,i_k=1}^{n}\frac{\partial^{k}f(\mathbf{x})}{\partial x_{i_1}\cdots\partial x_{i_k}}dx_{i_1}\cdots dx_{i_k} \tag{F.5}$$

F.2 Linear operators

The linear operator notation provides a powerful, elegant, and compact way to express, and apply, the
equations of multivariate calculus; it is used extensively in mathematics and physics. The linear operators


typically comprise partial derivatives that act on scalar, vector, or tensor fields. Table  1 lists a few
elementary examples of the use of linear operators in this textbook. The first four linear operators involve
the widely used del operator ∇ to generate the gradient, divergence and curl as described in appendices 
and . The fifth and sixth linear operators act on the Lagrangian in Lagrangian mechanics applications.
The final two linear operators act on the wavefunction for wave mechanics.

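Name | Partial derivative | Field | Action
Gradient | $\nabla \equiv \hat{\mathbf{i}}\frac{\partial}{\partial x} + \hat{\mathbf{j}}\frac{\partial}{\partial y} + \hat{\mathbf{k}}\frac{\partial}{\partial z}$ | Scalar potential $\phi$ | $\mathbf{E} = \nabla\phi$
Divergence | $\nabla\cdot \equiv \left(\hat{\mathbf{i}}\frac{\partial}{\partial x} + \hat{\mathbf{j}}\frac{\partial}{\partial y} + \hat{\mathbf{k}}\frac{\partial}{\partial z}\right)\cdot$ | Vector field E | $\nabla\cdot\mathbf{E}$
Curl | $\nabla\times \equiv \left(\hat{\mathbf{i}}\frac{\partial}{\partial x} + \hat{\mathbf{j}}\frac{\partial}{\partial y} + \hat{\mathbf{k}}\frac{\partial}{\partial z}\right)\times$ | Vector field E | $\nabla\times\mathbf{E}$
Laplacian | $\nabla^2 = \nabla\cdot\nabla \equiv \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$ | Scalar potential $\phi$ | $\nabla^2\phi$
Euler-Lagrange | $\Lambda_j \equiv \frac{d}{dt}\frac{\partial}{\partial\dot{q}_j} - \frac{\partial}{\partial q_j}$ | Scalar Lagrangian $L$ | $\Lambda_jL = 0$
Canonical momentum | $p_j \equiv \frac{\partial}{\partial\dot{q}_j}$ | Scalar Lagrangian $L$ | $p_j \equiv \frac{\partial L}{\partial\dot{q}_j}$
Canonical momentum | $p_j \equiv \frac{\hbar}{i}\frac{\partial}{\partial q_j}$ | Wavefunction Ψ | $p_j\Psi \equiv \frac{\hbar}{i}\frac{\partial\Psi}{\partial q_j}$
Hamiltonian | $H = i\hbar\frac{\partial}{\partial t}$ | Wavefunction Ψ | $H\Psi = i\hbar\frac{\partial\Psi}{\partial t} = E\Psi$

Table 1: Examples of linear operators used in this textbook.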

There are three ways of expressing operations such as addition, multiplication, transposition or inversion
of operations that are completely equivalent because they all are based on the same principles of linear
algebra. For example, a transformation O acting on a vector A can produce the vector B. The simplest
way to express this transformation is in terms of components
$$B_i = \sum_{j=1}^{3}O_{ij}A_j \tag{F.6}$$

Another way is to use matrix mechanics where the 3 × 3 matrix (O) transforms the column vector (A) to
the column vector (B), that is,
(B) = (O) (A) (F.7)
The third approach is to assume an operator O acts on the vector A

B = OA (F.8)

In classical mechanics, and quantum mechanics, these three equivalent approaches are used and exploited
extensively and interchangeably. In particular the rules of matrix manipulation, that are given in appendix
 are synonymous, and equivalent to, those that apply for operator manipulation. If the operator is complex
then the operator properties are summarized as follows.
The generalization of the transpose for complex operators is the Hermitian conjugate $O^{\dagger}$
$$O^{\dagger}_{ij} = O^{*}_{ji} \tag{F.9}$$
Note also that
$$\mathbf{O}^{\dagger} = \left(\mathbf{O}^{*}\right)^{T} = \left(\mathbf{O}^{T}\right)^{*} \tag{F.10}$$
The generalization of a symmetric matrix is Hermitian, that is, $O$ is equal to its Hermitian conjugate
$$O_{ij} = O^{\dagger}_{ij} = O^{*}_{ji} \tag{F.11}$$
For a real matrix the complex conjugation has no effect so the matrix is real and symmetric.
The generalization of orthogonal is unitary for which the operator is unitary if it is non-singular and
$$O^{-1} = O^{\dagger} \tag{F.12}$$
which implies
$$OO^{\dagger} = I = O^{\dagger}O \tag{F.13}$$

F.3 Transformation Jacobian

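The Jacobian determinant, which is usually called the Jacobian, is used extensively in mechanics for both
rotational and translational coordinate transformations. The Jacobian determinant is defined as being the
ratio of the $n$-dimensional volume element $dx_1dx_2\dots dx_n$ in one coordinate system, to the volume element
$dy_1dy_2\dots dy_n$ in the second coordinate system. That is
$$J(y_1y_2\dots y_n) \equiv \frac{dx_1dx_2\dots dx_n}{dy_1dy_2\dots dy_n} = \begin{vmatrix} \frac{\partial x_1}{\partial y_1} & \frac{\partial x_1}{\partial y_2} & \cdots & \frac{\partial x_1}{\partial y_n} \\ \frac{\partial x_2}{\partial y_1} & \frac{\partial x_2}{\partial y_2} & \cdots & \frac{\partial x_2}{\partial y_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial x_n}{\partial y_1} & \frac{\partial x_n}{\partial y_2} & \cdots & \frac{\partial x_n}{\partial y_n} \end{vmatrix} \tag{F.14}$$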

F.3.1 Transformation of integrals:

Consider a coordinate transformation for the integral of the function $f(x_1, x_2, \dots, x_n)$ to the integral of a
function $g(y_1, y_2, \dots, y_n)$ where $x_i = x_i(y_1, y_2, \dots, y_n)$. The coordinate transformation of the integral equation
can be expressed in terms of the Jacobian $J(y_1y_2\dots y_n)$
$$\int f(x_1, x_2, \dots, x_n)\,dx_1dx_2\dots dx_n = \int g(y_1, y_2, \dots, y_n)\,dx_1dx_2\dots dx_n \tag{F.15}$$
$$= \int g(y_1, y_2, \dots, y_n)\frac{dx_1dx_2\dots dx_n}{dy_1dy_2\dots dy_n}\,dy_1dy_2\dots dy_n = \int g(y_1, y_2, \dots, y_n)\,J(y_1, y_2, \dots, y_n)\,dy_1dy_2\dots dy_n$$

F.3.2 Transformation of diﬀerential equations:

The differential cross sections for scattering can be defined either by the number of a definite kind of
particle per event going into the volume element in momentum space $dp_1dp_2dp_3$, or by the number going
into the solid angle element $d\Omega$ having momentum between $p$ and $p + dp$. That is, the first definition can be
written as a differential equation
$$\frac{d^3\sigma(p_1, p_2, p_3)}{dp_1dp_2dp_3}\,dp_1dp_2dp_3 = \frac{d^3\sigma(p_1(p\theta\phi), p_2(p\theta\phi), p_3(p\theta\phi))}{dp_1dp_2dp_3}\,\frac{\partial(p_1, p_2, p_3)}{\partial(p, \theta, \phi)}\,dp\,d\theta\,d\phi \tag{F.16}$$
As shown in table 4, $dp_1dp_2dp_3 = p^2\sin\theta\,dp\,d\theta\,d\phi$, that is, the Jacobian equals $p^2\sin\theta$. Thus equation F.16
can be written as
$$\frac{d^3\sigma(p_1, p_2, p_3)}{dp_1dp_2dp_3}\,dp_1dp_2dp_3 = \left[\frac{d^3\sigma}{dp_1dp_2dp_3}\,p^2\right]dp\,(\sin\theta\,d\theta\,d\phi) = \frac{d^2\sigma(p, \theta, \phi)}{dp\,d\Omega}\,dp\,d\Omega \tag{F.17}$$
The differential cross section is defined by
$$\frac{d^2\sigma(p, \theta, \phi)}{dp\,d\Omega} \equiv p^2\frac{d^3\sigma}{dp_1dp_2dp_3} \tag{F.18}$$
where the $p^2$ factor is absorbed into the cross section and the solid angle term is factored out.

F.3.3 Properties of the Jacobian:

In classical mechanics the Jacobian often is extended from 3 dimensions to $n$-dimensional transformations.
The Jacobian is unity for unitary transformations such as rotations and linear translations which implies that
the volume element is preserved. It will be shown that this also is true for a certain class of transformations
in classical mechanics that are called canonical transformations. The Jacobian transforms the local density
to be correct for any scale transformations such as transforming linear dimensions from centimeters to inches.

F.1 Example: Jacobian for transform from cartesian to spherical coordinates

Consider the transform in the three-dimensional integral $\int f(x_1, x_2, x_3)\,dx_1dx_2dx_3$ under transformation
from cartesian coordinates $(x_1, x_2, x_3)$ to spherical coordinates $(r, \theta, \phi)$. The transformation is governed by
the geometric relations $x_1 = r\sin\theta\cos\phi$, $x_2 = r\sin\theta\sin\phi$, $x_3 = r\cos\theta$. For this transformation the Jacobian
determinant equals
$$J(r\theta\phi) = \begin{vmatrix} \sin\theta\cos\phi & r\cos\theta\cos\phi & -r\sin\theta\sin\phi \\ \sin\theta\sin\phi & r\cos\theta\sin\phi & r\sin\theta\cos\phi \\ \cos\theta & -r\sin\theta & 0 \end{vmatrix} = r^2\sin\theta$$
Thus the three-dimensional volume integral transforms to
$$\int f(x_1, x_2, x_3)\,dx_1dx_2dx_3 = \int g(r\theta\phi)\,J(r\theta\phi)\,dr\,d\theta\,d\phi = \int g(r\theta\phi)\,r^2\sin\theta\,dr\,d\theta\,d\phi$$
which is the well-known volume integral in spherical coordinates.
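
The Jacobian of this example can be confirmed numerically by differentiating the coordinate relations and evaluating the determinant. The sketch below assumes an arbitrary evaluation point and grid size; the function names are illustrative. It also uses the Jacobian as the weight in a midpoint-rule integral to recover the volume of a unit sphere.

```python
import math

def to_cartesian(r, theta, phi):
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def jacobian_matrix(r, theta, phi, h=1e-6):
    """Matrix of dx_i/dq_j with q = (r, theta, phi), by central differences."""
    q = [r, theta, phi]
    columns = []
    for j in range(3):
        up, dn = list(q), list(q)
        up[j] += h
        dn[j] -= h
        x_up, x_dn = to_cartesian(*up), to_cartesian(*dn)
        columns.append([(x_up[i] - x_dn[i]) / (2.0 * h) for i in range(3)])
    return [[columns[j][i] for j in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Arbitrary (hypothetical) evaluation point.
r, theta, phi = 2.0, 0.8, 1.1
J_numeric = det3(jacobian_matrix(r, theta, phi))
J_exact = r ** 2 * math.sin(theta)

def sphere_volume(radius, n=200):
    """Midpoint-rule integral of r^2 sin(theta) dr dtheta dphi over a sphere."""
    dr, dtheta = radius / n, math.pi / n
    total = 0.0
    for a in range(n):
        rr = (a + 0.5) * dr
        for b in range(n):
            th = (b + 0.5) * dtheta
            total += rr * rr * math.sin(th) * dr * dtheta
    return 2.0 * math.pi * total   # the phi integral contributes 2*pi

volume = sphere_volume(1.0)
print(J_numeric, J_exact, volume, 4.0 * math.pi / 3.0)
```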

F.4 Legendre transformation

Hamiltonian mechanics can be derived directly from Lagrange mechanics by considering the Legendre
transformation between the conjugate variables $(\mathbf{q}, \dot{\mathbf{q}}, t)$ and $(\mathbf{q}, \mathbf{p}, t)$. Such a derivation is of considerable
importance in that it shows that Hamiltonian mechanics is based on the same variational principles as those
used to derive Lagrangian mechanics; that is d’Alembert’s Principle or Hamilton’s Principle. The general
problem of converting Lagrange’s equations into the Hamiltonian form hinges on the inversion of equation
(83) that defines the generalized momentum p. This inversion is simplified by the fact that (83) is the first
partial derivative of the Lagrangian $L(\mathbf{q}, \dot{\mathbf{q}}, t)$ which is a scalar function.
Consider transformations between two functions $F(\mathbf{u}, \mathbf{w})$ and $G(\mathbf{v}, \mathbf{w})$ where u and v are the active
variables related by the functional form
$$\mathbf{v} = \nabla_{\mathbf{u}}F(\mathbf{u}, \mathbf{w}) \tag{F.19}$$
and where w designates passive variables and $\nabla_{\mathbf{u}}F(\mathbf{u}, \mathbf{w})$ is the first-order derivative of $F(\mathbf{u}, \mathbf{w})$, i.e. the
gradient, with respect to the components of the vector u. The Legendre transform states that the inverse
formula can always be written in the form
$$\mathbf{u} = \nabla_{\mathbf{v}}G(\mathbf{v}, \mathbf{w}) \tag{F.20}$$
where the function $G(\mathbf{v}, \mathbf{w})$ is related to $F(\mathbf{u}, \mathbf{w})$ by the symmetric relation
$$G(\mathbf{v}, \mathbf{w}) + F(\mathbf{u}, \mathbf{w}) = \mathbf{u}\cdot\mathbf{v} \tag{F.21}$$
and where the scalar product $\mathbf{u}\cdot\mathbf{v} = \sum_{i=1}^{N}u_iv_i$.
Furthermore the derivatives with respect to all the passive variables $\{w_i\}$ are related by
$$\nabla_{\mathbf{w}}F(\mathbf{u}, \mathbf{w}) = -\nabla_{\mathbf{w}}G(\mathbf{v}, \mathbf{w}) \tag{F.22}$$
The relationship between the functions $F(\mathbf{u}, \mathbf{w})$ and $G(\mathbf{v}, \mathbf{w})$ is symmetrical and each is said to be the
Legendre transform of the other.
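
A one-variable example makes the four Legendre relations concrete. The sketch below assumes the quadratic function F(u, w) = w u²/2 with a single active variable u and a single passive variable w; this choice, and all function names, are illustrative rather than taken from the text. For this F the transform is G(v, w) = v²/(2w).

```python
def F(u, w):
    """Assumed example function, quadratic in the active variable u."""
    return 0.5 * w * u * u

def v_from_u(u, w):
    """Equation F.19: v = dF/du."""
    return w * u

def G(v, w):
    """Legendre transform of F, obtained from G = u*v - F with u = v/w."""
    return v * v / (2.0 * w)

def derivative(func, x, h=1e-6):
    """Central-difference estimate of a one-variable derivative."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

u, w = 1.7, 2.5               # hypothetical values
v = v_from_u(u, w)

sum_FG = F(u, w) + G(v, w)                        # F.21: equals u*v
u_recovered = derivative(lambda vv: G(vv, w), v)  # F.20: u = dG/dv
dF_dw = derivative(lambda ww: F(u, ww), w)        # passive-variable derivative
dG_dw = derivative(lambda ww: G(v, ww), w)        # F.22: equal and opposite
print(sum_FG, u * v, u_recovered, dF_dw, dG_dw)
```

With u read as a velocity, w as a mass and v as a momentum, F and G are the two familiar forms of the kinetic energy, which is why the same construction carries the Lagrangian into the Hamiltonian.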

Workshop exercises

1. Below you will find a set of integrals. Your teaching assistant will divide you into groups and each group will
be assigned one integral to work on. Once your group has solved the integral, write the solution on the board
in the space provided by the teaching assistant.
R 2 R 4 R cos 
(a) 2 sin 
R0¡ ṙ 0
¢0
(b)  − r̇2 
R
(c)  A · a where A = ̂ + ̂ +  k̂ and  is the sphere 2 +  2 +  2 = 9.
R
(d)  (∇ × A) · a where A = ̂ + ̂ + k̂ and  is the surface defined by the paraboloid  = 1 − 2 −  2 ,
where  ≥ 0.
Appendix G

Vector diﬀerential calculus

This appendix reviews vector diﬀerential calculus which is used extensively in both classical mechanics and
electromagnetism.

G.1 Scalar diﬀerential operators

G.1.1 Scalar field
¡¢
Diﬀerential operators like time  do not change the rotational properties of scalars or proper vectors. A

scalar operator  acting on a scalar field (), in a rotated coordinated frame 0 (0  0  0 ) is unchanged.

0 
= (G.1)
 

G.1.2 Vector field

Similarly for a proper vector field
$$\frac{\partial A'_i}{\partial t} = \sum_{j}\lambda_{ij}\frac{\partial A_j}{\partial t} \tag{G.2}$$
That is, differentiation of scalar or vector fields with respect to a scalar operator does not change the
rotational behavior. In particular, the scalar differentials of vectors continue to obey the rules of ordinary
proper vectors. The scalar operator $\frac{\partial}{\partial t}$ is used for calculation of velocity or acceleration.

G.2 Vector diﬀerential operators in cartesian coordinates

Vector differential operators, such as the gradient operator, are important in physics. The action of vector
operators differs along different orthogonal axes.

G.2.1 Scalar field

Consider a continuous, single-valued scalar function $\phi(x, y, z)$. Since
$$\phi' = \phi \tag{G.3}$$
then the partial differential with respect to one component $x'_i$ of the vector x′ gives
$$\frac{\partial\phi'}{\partial x'_i} = \sum_{j}\frac{\partial\phi}{\partial x_j}\frac{\partial x_j}{\partial x'_i} \tag{G.4}$$
The inverse rotation gives that
$$x_j = \sum_{k}\lambda_{kj}x'_k \tag{G.5}$$


Therefore
$$\frac{\partial x_j}{\partial x'_i} = \sum_{k}\lambda_{kj}\frac{\partial x'_k}{\partial x'_i} = \sum_{k}\lambda_{kj}\delta_{ik} = \lambda_{ij} \tag{G.6}$$
Thus
$$\frac{\partial\phi'}{\partial x'_i} = \sum_{j}\lambda_{ij}\frac{\partial\phi}{\partial x_j} \tag{G.7}$$
That is, the vector derivative acting on a scalar field transforms like a proper vector.
Define the gradient, or ∇ operator, as
$$\nabla \equiv \sum_{i}\hat{\mathbf{e}}_i\frac{\partial}{\partial x_i} \tag{G.8}$$
where $\hat{\mathbf{e}}_i$ is the unit vector along the $x_i$ axis. In cartesian coordinates, the del vector operator is
$$\nabla \equiv \hat{\mathbf{i}}\frac{\partial}{\partial x} + \hat{\mathbf{j}}\frac{\partial}{\partial y} + \hat{\mathbf{k}}\frac{\partial}{\partial z} \tag{G.9}$$
The gradient was applied to the gravitational and electrostatic potential to derive the corresponding field.
For example, for electrostatics it was shown that the gradient of the scalar electrostatic potential field $\phi$ can
be written in cartesian coordinates as
$$\mathbf{E} = -\nabla\phi \tag{G.10}$$
Note that the gradient of a scalar field produces a vector field. You are familiar with this if you are a skier
in that the gravitational force pulls you down the line of steepest descent for the ski slope.

G.2.2 Vector field

Another possible operation for the del operator is the scalar product with a vector. Using the definition of
a scalar product in cartesian coordinates gives
$$\nabla\cdot\mathbf{A} = \hat{\mathbf{i}}\cdot\hat{\mathbf{i}}\frac{\partial A_x}{\partial x} + \hat{\mathbf{j}}\cdot\hat{\mathbf{j}}\frac{\partial A_y}{\partial y} + \hat{\mathbf{k}}\cdot\hat{\mathbf{k}}\frac{\partial A_z}{\partial z} = \frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z} \tag{G.11}$$
This scalar derivative of a vector field is called the divergence. Note that the scalar product produces a
scalar field which is invariant to rotation of the coordinate axes.
The vector product of the del operator with another vector is called the curl, which is used extensively
in physics. It can be written in the determinant form
$$\nabla\times\mathbf{A} = \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ A_x & A_y & A_z \end{vmatrix} \tag{G.12}$$
By contrast to the scalar product, both the gradient of a scalar field, and the vector product, are vector
fields for which the components along the coordinate axes transform in a specific manner, such as to keep the
length of the vector constant, as the coordinate frame is rotated. The gradient, scalar and vector products
with the ∇ operator are the first order derivatives of fields that occur most frequently in physics.
Second derivatives of fields also are used. Let us consider some possible combinations of the product of
two del operators.
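1) $\nabla\cdot(\nabla\phi) = \nabla^2\phi$
The scalar product of two del operators is a scalar under rotation. Evaluating the scalar product in
cartesian coordinates gives
$$\left(\hat{\mathbf{i}}\frac{\partial}{\partial x} + \hat{\mathbf{j}}\frac{\partial}{\partial y} + \hat{\mathbf{k}}\frac{\partial}{\partial z}\right)\cdot\left(\hat{\mathbf{i}}\frac{\partial\phi}{\partial x} + \hat{\mathbf{j}}\frac{\partial\phi}{\partial y} + \hat{\mathbf{k}}\frac{\partial\phi}{\partial z}\right) = \frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} + \frac{\partial^2\phi}{\partial z^2} \tag{G.13}$$
This also can be obtained without confusion by writing this product as
$$\nabla\cdot(\nabla\phi) = \nabla\cdot\nabla\phi = (\nabla\cdot\nabla)\,\phi \tag{G.14}$$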

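where the scalar product of the del operator is a scalar, called the Laplacian $\nabla^2$, given by
$$\nabla\cdot\nabla = \nabla^2 \equiv \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} \tag{G.15}$$
The Laplacian operator is encountered frequently in physics.

2) $\nabla\times(\nabla\phi) = 0$
Note that the vector product of two identical vectors
$$\mathbf{A}\times\mathbf{A} = 0 \tag{G.16}$$
Therefore
$$\nabla\times(\nabla\phi) = 0 \tag{G.17}$$
This can be confirmed by evaluating the separate components along each axis.

3) $\nabla\cdot(\nabla\times\mathbf{A}) = 0$
This is zero because the cross-product ∇ × A is perpendicular to ∇ and thus the dot product is zero.

4) $\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$
The identity
$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \mathbf{B}\,(\mathbf{A}\cdot\mathbf{C}) - (\mathbf{A}\cdot\mathbf{B})\,\mathbf{C} \tag{G.18}$$
can be used to give
$$\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A} \tag{G.19}$$
since $\nabla\cdot\nabla = \nabla^2$.
There are pitfalls in the discussion of second derivatives in that it is assumed that both del operators
operate on the same variable, otherwise the results are different.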
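
The second-derivative identities can be spot-checked with finite differences. In the sketch below the scalar field `phi`, the vector field `A`, the evaluation point, and the step size are arbitrary assumptions; central differences are used throughout, so the results are approximate.

```python
import math

STEP = 1e-3

def d(func, p, axis, h=STEP):
    """Central difference of a scalar-valued func of a 3-component point."""
    up, dn = list(p), list(p)
    up[axis] += h
    dn[axis] -= h
    return (func(up) - func(dn)) / (2.0 * h)

def grad(phi):
    return lambda p: [d(phi, p, i) for i in range(3)]

def div(A):
    return lambda p: sum(d(lambda q, i=i: A(q)[i], p, i) for i in range(3))

def curl(A):
    def dA(p, axis, comp):
        """Derivative of component `comp` of A along `axis`."""
        return d(lambda q: A(q)[comp], p, axis)
    return lambda p: [dA(p, 1, 2) - dA(p, 2, 1),
                      dA(p, 2, 0) - dA(p, 0, 2),
                      dA(p, 0, 1) - dA(p, 1, 0)]

# Hypothetical smooth test fields.
def phi(p):
    return p[0] ** 2 * p[1] + math.sin(p[2]) * p[0]

def A(p):
    return [p[1] * p[2], p[0] ** 2 * p[2], p[0] * p[1] + p[2] ** 2]

point = [0.3, -0.7, 1.2]

curl_grad = curl(grad(phi))(point)          # identity 2: should vanish
div_curl = div(curl(A))(point)              # identity 3: should vanish
laplacian = div(grad(phi))(point)           # identity 1
exact_laplacian = 2.0 * point[1] - math.sin(point[2]) * point[0]
print(curl_grad, div_curl, laplacian, exact_laplacian)
```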

G.3 Vector diﬀerential operators in curvilinear coordinates

As discussed in Appendix  there are many situations where the symmetries make it more convenient to use
orthogonal curvilinear coordinate systems rather than cartesian coordinates. Thus it is necessary to extend
vector derivatives from cartesian to curvilinear coordinates. Table 1 can be used for expressing vector
derivatives in curvilinear coordinate systems.

G.3.1 Gradient:
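The gradient in curvilinear coordinates is
$$\nabla f = \hat{\mathbf{q}}_1\frac{1}{h_1}\frac{\partial f}{\partial q_1} + \hat{\mathbf{q}}_2\frac{1}{h_2}\frac{\partial f}{\partial q_2} + \hat{\mathbf{q}}_3\frac{1}{h_3}\frac{\partial f}{\partial q_3} \tag{G.20}$$
where the coefficients $h_i$ are listed in table 1.

For cylindrical coordinates this becomes
$$\nabla f = \hat{\boldsymbol{\rho}}\frac{\partial f}{\partial\rho} + \hat{\boldsymbol{\varphi}}\frac{1}{\rho}\frac{\partial f}{\partial\varphi} + \hat{\mathbf{z}}\frac{\partial f}{\partial z} \tag{G.21}$$
In spherical coordinates
$$\nabla f = \hat{\mathbf{r}}\frac{\partial f}{\partial r} + \hat{\boldsymbol{\theta}}\frac{1}{r}\frac{\partial f}{\partial\theta} + \hat{\boldsymbol{\varphi}}\frac{1}{r\sin\theta}\frac{\partial f}{\partial\varphi} \tag{G.22}$$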

G.3.2 Divergence:
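The divergence can be expressed as
$$\nabla\cdot\mathbf{A} = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial q_1}(A_1h_2h_3) + \frac{\partial}{\partial q_2}(A_2h_3h_1) + \frac{\partial}{\partial q_3}(A_3h_1h_2)\right] \tag{G.23}$$
In cylindrical coordinates the divergence is
$$\nabla\cdot\mathbf{A} = \frac{1}{\rho}\frac{\partial}{\partial\rho}(\rho A_\rho) + \frac{1}{\rho}\frac{\partial A_\varphi}{\partial\varphi} + \frac{\partial A_z}{\partial z} = \frac{A_\rho}{\rho} + \frac{\partial A_\rho}{\partial\rho} + \frac{1}{\rho}\frac{\partial A_\varphi}{\partial\varphi} + \frac{\partial A_z}{\partial z} \tag{G.24}$$
In spherical coordinates the divergence is
$$\nabla\cdot\mathbf{A} = \frac{1}{r^2\sin\theta}\left[\frac{\partial}{\partial r}\left(r^2\sin\theta\,A_r\right) + \frac{\partial}{\partial\theta}(r\sin\theta\,A_\theta) + \frac{\partial}{\partial\varphi}(rA_\varphi)\right] \tag{G.25}$$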

G.3.3 Curl:
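$$\nabla\times\mathbf{A} = \frac{1}{h_1h_2h_3}\begin{vmatrix} h_1\hat{\mathbf{q}}_1 & h_2\hat{\mathbf{q}}_2 & h_3\hat{\mathbf{q}}_3 \\ \frac{\partial}{\partial q_1} & \frac{\partial}{\partial q_2} & \frac{\partial}{\partial q_3} \\ h_1A_1 & h_2A_2 & h_3A_3 \end{vmatrix} \tag{G.26}$$
In cylindrical coordinates the curl is
$$\nabla\times\mathbf{A} = \frac{1}{\rho}\begin{vmatrix} \hat{\boldsymbol{\rho}} & \rho\hat{\boldsymbol{\varphi}} & \hat{\mathbf{z}} \\ \frac{\partial}{\partial\rho} & \frac{\partial}{\partial\varphi} & \frac{\partial}{\partial z} \\ A_\rho & \rho A_\varphi & A_z \end{vmatrix} \tag{G.27}$$
In spherical coordinates the curl is
$$\nabla\times\mathbf{A} = \frac{1}{r^2\sin\theta}\begin{vmatrix} \hat{\mathbf{r}} & r\hat{\boldsymbol{\theta}} & r\sin\theta\,\hat{\boldsymbol{\varphi}} \\ \frac{\partial}{\partial r} & \frac{\partial}{\partial\theta} & \frac{\partial}{\partial\varphi} \\ A_r & rA_\theta & r\sin\theta\,A_\varphi \end{vmatrix} \tag{G.28}$$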

G.3.4 Laplacian:
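Taking the divergence of the gradient of a scalar gives
$$\nabla^2f = \nabla\cdot\nabla f = \frac{1}{h_1h_2h_3}\left[\frac{\partial}{\partial q_1}\left(\frac{h_2h_3}{h_1}\frac{\partial f}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\left(\frac{h_3h_1}{h_2}\frac{\partial f}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\left(\frac{h_1h_2}{h_3}\frac{\partial f}{\partial q_3}\right)\right] \tag{G.29}$$
The Laplacian of a scalar function $f$ in cylindrical coordinates is
$$\nabla^2f = \frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial f}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2f}{\partial\varphi^2} + \frac{\partial^2f}{\partial z^2} \tag{G.30}$$
The Laplacian of a scalar function $f$ in spherical coordinates is
$$\nabla^2f = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial f}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial f}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2f}{\partial\varphi^2} \tag{G.31}$$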

The gradient, divergence, curl and Laplacian are used extensively in curvilinear coordinate systems when
dealing with vector fields in Newtonian mechanics, electromagnetism, and fluid flow.
Appendix H

Vector integral calculus

Field equations, such as for electromagnetic and gravitational fields, require both line integrals, and surface
integrals, of vector fields to evaluate potential, flux and circulation. These require use of the gradient, the
Divergence Theorem and Stokes Theorem which are discussed in the following sections.

H.1 Line integral of the gradient of a scalar field

The change $\Delta\phi$ in a scalar field for an infinitesimal step $d\mathbf{l}$ along a path can be written as
$$\Delta\phi = (\nabla\phi)\cdot d\mathbf{l} \tag{H.1}$$
since the gradient of $\phi$, that is, $\nabla\phi$, is the rate of change of $\phi$ with $d\mathbf{l}$. Discussions of gravitational and
electrostatic potential show that the line integral between points $a$ and $b$ is given in terms of the del operator
by
$$\phi_b - \phi_a = \int_{a}^{b}(\nabla\phi)\cdot d\mathbf{l} \tag{H.2}$$
This relates the difference in values of a scalar field at two points to the line integral of the dot product of
the gradient with the element of the line integral.
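
Equation H.2 can be illustrated by summing (∇φ) · dl along a discretized path and comparing with the difference of the end-point values. The scalar field, the helical path, and the number of steps below are assumptions made for this sketch.

```python
import math

def phi(x, y, z):
    """Hypothetical scalar field."""
    return x * x * y + z

def grad_phi(x, y, z):
    """Analytic gradient of the field above."""
    return (2.0 * x * y, x * x, 1.0)

def path(t):
    """Hypothetical helical path parametrized by t."""
    return (math.cos(t), math.sin(t), t)

def line_integral_of_gradient(t_start, t_end, steps=2000):
    """Sum of grad(phi) . dl over short chords, gradient taken at chord midpoints."""
    total = 0.0
    dt = (t_end - t_start) / steps
    for k in range(steps):
        p0 = path(t_start + k * dt)
        p1 = path(t_start + (k + 1) * dt)
        mid = tuple(0.5 * (u + v) for u, v in zip(p0, p1))
        dl = tuple(v - u for u, v in zip(p0, p1))
        total += sum(g * s for g, s in zip(grad_phi(*mid), dl))
    return total

integral = line_integral_of_gradient(0.0, math.pi)
difference = phi(*path(math.pi)) - phi(*path(0.0))
print(integral, difference)
```

The sum depends only on the end points of the path, so replacing the helix by any other path between the same two points would give the same value to within the discretization error.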

H.2 Divergence theorem

H.2.1 Flux of a vector field for Gaussian surface
Consider the flux Φ of a vector field F for a closed surface, usually called a Gaussian surface, $S$ shown in
figure H.1.
$$\Phi = \oint_{S}\mathbf{F}\cdot d\mathbf{S} \tag{H.3}$$
If the enclosed volume is cut into two pieces enclosed by surfaces $S_1 = S_a + S_{ab}$ and $S_2 = S_b + S_{ab}$, the flux
through the surface $S_{ab}$ common to both $S_1$ and $S_2$ is equal and in the same direction. Then
the net flux through the sum of $S_1$ and $S_2$ is given by
$$\oint_{S_1}\mathbf{F}\cdot d\mathbf{S} + \oint_{S_2}\mathbf{F}\cdot d\mathbf{S} = \oint_{S}\mathbf{F}\cdot d\mathbf{S} \tag{H.4}$$
since the contributions of the common surface $S_{ab}$ cancel in that the flux out of $S_1$ is equal and opposite to
the flux into $S_2$ over the surface $S_{ab}$. That is, independent of how many times the volume enclosed by
$S$ is subdivided, the net flux for the sum of all the Gaussian surfaces enclosing these subdivisions of the
volume still equals $\oint_{S}\mathbf{F}\cdot d\mathbf{S}$.

Figure H.1: A volume V enclosed by a closed surface S is cut into two pieces at the surface $S_{ab}$. This gives
$V_1$ enclosed by $S_1$ and $V_2$ enclosed by $S_2$.

Consider that the volume enclosed by $S$ is subdivided into $N$ subdivisions where $N \to \infty$. Then, even
though $\oint_{S_i} \mathbf{F} \cdot d\mathbf{S} \to 0$ as $N \to \infty$, the sum over the surfaces of all the infinitesimal volumes remains unchanged

$$\Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} = \sum_i^{N \to \infty} \oint_{S_i} \mathbf{F} \cdot d\mathbf{S} \tag{H.5}$$

Thus we can take the limit of a sum of an infinite number of infinitesimal volumes, as is needed to obtain a
differential form. The surface integral for each infinitesimal volume tends to zero, which is not useful, that
is, $\oint_{S_i} \mathbf{F} \cdot d\mathbf{S} \to 0$ as $N \to \infty$. However, the flux per unit volume has a finite value as $N \to \infty$. This ratio is
called the divergence of the vector field;
$$\operatorname{div}\mathbf{F} = \lim_{\Delta\tau_i \to 0} \frac{\oint_{S_i} \mathbf{F} \cdot d\mathbf{S}}{\Delta\tau_i} \tag{H.6}$$
where $\Delta\tau_i$ is the infinitesimal volume enclosed by surface $S_i$. The divergence of the vector field is a scalar
quantity.
Thus the sum of flux over all infinitesimal subdivisions of the volume enclosed by a closed surface $S$
equals
$$\Phi = \oint_S \mathbf{F} \cdot d\mathbf{S} = \sum_i^{N \to \infty} \frac{\oint_{S_i} \mathbf{F} \cdot d\mathbf{S}}{\Delta\tau_i}\,\Delta\tau_i = \sum_i^{N \to \infty} \operatorname{div}\mathbf{F}\,\Delta\tau_i \tag{H.7}$$

In the limit $N \to \infty$, $\Delta\tau_i \to 0$ this becomes the integral;

$$\Phi = \oint_{\text{closed surface}} \mathbf{F} \cdot d\mathbf{S} = \int_{\text{enclosed volume}} \operatorname{div}\mathbf{F}\,d\tau \tag{H.8}$$

This is called the Divergence Theorem or Gauss’s Theorem. To avoid confusion with Gauss’s law in
electrostatics, it will be referred to as the Divergence theorem.
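
The Divergence theorem can be illustrated numerically. The sketch below is an assumption-laden example rather than anything from the text: it picks an arbitrary polynomial field on the unit cube and compares the outward flux through the six faces with the volume integral of the divergence, both estimated by the midpoint rule.

```python
def field(x, y, z):
    """Illustrative vector field F (an assumption, chosen for easy checking)."""
    return (x * x, x * y, z ** 3)

def divergence(x, y, z):
    """Analytic divergence of the field above: 2x + x + 3z^2."""
    return 3.0 * x + 3.0 * z * z

def flux_through_cube(n=50):
    """Outward flux through the faces of the unit cube [0,1]^3 (midpoint rule)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, v = (i + 0.5) * h, (j + 0.5) * h
            # each pair of opposite faces: outward normals +axis and -axis
            total += (field(1.0, u, v)[0] - field(0.0, u, v)[0]) * h * h
            total += (field(u, 1.0, v)[1] - field(u, 0.0, v)[1]) * h * h
            total += (field(u, v, 1.0)[2] - field(u, v, 0.0)[2]) * h * h
    return total

def volume_integral_of_divergence(n=50):
    """Volume integral of div F over the unit cube (midpoint rule)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                total += divergence((i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h) * h ** 3
    return total

print("surface flux    :", flux_through_cube())
print("volume integral :", volume_integral_of_divergence())
```

For this field both quantities equal 5/2 analytically, so the two printed values should agree up to the discretisation error of the volume sum.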

H.2.2 Divergence in cartesian coordinates.

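Consider the special case of an infinitesimal rectangular box, size $\Delta x$, $\Delta y$, $\Delta z$, shown in figure H.2.
Consider the net flux for the $z$ component $F_z$ entering the surface $\Delta x \Delta y$ at location $(x, y, z)$.
$$\Delta\Phi_z^{in} = \left(F_z + \frac{\Delta x}{2}\frac{\partial F_z}{\partial x} + \frac{\Delta y}{2}\frac{\partial F_z}{\partial y}\right)\Delta x \Delta y \tag{H.9}$$

The net flux of the $z$ component out of the surface at $z + \Delta z$ is

$$\Delta\Phi_z^{out} = \left(F_z + \Delta z\frac{\partial F_z}{\partial z} + \frac{\Delta x}{2}\frac{\partial F_z}{\partial x} + \frac{\Delta y}{2}\frac{\partial F_z}{\partial y}\right)\Delta x \Delta y \tag{H.10}$$

Thus the net flux out of the box due to the $z$ component of $\mathbf{F}$ is

$$\Delta\Phi_z = \Delta\Phi_z^{out} - \Delta\Phi_z^{in} = \frac{\partial F_z}{\partial z}\,\Delta x \Delta y \Delta z \tag{H.11}$$

Adding the similar $x$ and $y$ components for $\Delta\Phi$ gives
$$\Delta\Phi = \left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right)\Delta x \Delta y \Delta z \tag{H.12}$$

This gives that the divergence of the vector field $\mathbf{F}$ is

$$\operatorname{div}\mathbf{F} = \lim_{\Delta\tau \to 0} \frac{\oint \mathbf{F} \cdot d\mathbf{S}}{\Delta\tau} = \left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right) \tag{H.13}$$

Figure H.2: Computation of flux out of an infinitesimal rectangular box, $\Delta x$, $\Delta y$, $\Delta z$.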

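since $\Delta\tau = \Delta x\,\Delta y\,\Delta z$. But the right hand side of the equation equals the scalar product $\nabla \cdot \mathbf{F}$, that is,

$$\operatorname{div}\mathbf{F} = \nabla \cdot \mathbf{F} \tag{H.14}$$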

The divergence is a scalar quantity. The physical meaning of the divergence is that it gives the net flux per
unit volume flowing out of an infinitesimal volume. A positive divergence corresponds to a net outflow of
flux from the infinitesimal volume at any location while a negative divergence implies a net inflow of flux
to this infinitesimal volume.
It was shown that for an infinitesimal rectangular box
$$\Delta\Phi = \left(\frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}\right)\Delta x \Delta y \Delta z = \nabla \cdot \mathbf{F}\,\Delta\tau \tag{H.15}$$
Integrating over the finite volume enclosed by the surface $S$ gives
$$\Phi = \oint_{\text{closed surface}} \mathbf{F} \cdot d\mathbf{S} = \int_{\text{enclosed volume}} \nabla \cdot \mathbf{F}\,d\tau \tag{H.16}$$

This is another way of expressing the Divergence theorem

$$\Phi = \oint_{\text{closed surface}} \mathbf{F} \cdot d\mathbf{S} = \int_{\text{enclosed volume}} \operatorname{div}\mathbf{F}\,d\tau \tag{H.17}$$

The divergence theorem, developed by Gauss, is of considerable importance: it relates the surface integral of
a vector field, that is, the outgoing flux, to a volume integral of $\nabla \cdot \mathbf{F}$ over the enclosed volume.

H.1 Example: Maxwell’s Flux Equations

As an example of the usefulness of this relation, consider Gauss’s law for the flux in Maxwell’s
equations.
Gauss’s law for the electric field
$$\Phi_E = \oint_{\text{closed surface}} \mathbf{E} \cdot d\mathbf{S} = \frac{1}{\varepsilon_0}\int_{\text{enclosed volume}} \rho\,d\tau$$
But the divergence relation gives that
$$\Phi_E = \oint_{\text{closed surface}} \mathbf{E} \cdot d\mathbf{S} = \int_{\text{enclosed volume}} \nabla \cdot \mathbf{E}\,d\tau$$

Combining these gives
$$\oint_{\text{closed surface}} \mathbf{E} \cdot d\mathbf{S} = \int_{\text{enclosed volume}} \nabla \cdot \mathbf{E}\,d\tau = \frac{1}{\varepsilon_0}\int_{\text{enclosed volume}} \rho\,d\tau$$
This is true independent of the shape of the surface or enclosed volume, leading to the differential form
of Maxwell’s first law, that is, Gauss’s law for the electric field.
$$\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}$$
The differential form of Gauss’s law relates $\nabla \cdot \mathbf{E}$ to the charge density $\rho$ at that same location. This is
much easier to evaluate than the surface and volume integrals required by the integral form of Gauss’s law.
Gauss’s law for magnetism
$$\Phi_B = \oint_{\text{closed surface}} \mathbf{B} \cdot d\mathbf{S} = 0$$

Using the divergence theorem gives that

$$\Phi_B = \oint_{\text{closed surface}} \mathbf{B} \cdot d\mathbf{S} = \int_{\text{enclosed volume}} \nabla \cdot \mathbf{B}\,d\tau = 0$$

This is true independent of the shape of the Gaussian surface, leading to the differential form of Gauss’s law
for $\mathbf{B}$
$$\nabla \cdot \mathbf{B} = 0$$
That is, the local value of the divergence of $\mathbf{B}$ is zero everywhere.
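
The statement that Gauss’s law holds independent of the shape of the surface can be illustrated numerically. The following sketch makes several assumptions that are not in the text: the source is a single point charge in units where $q/\varepsilon_0 = 1$, the Gaussian surface is a cube of side 2 centred on the origin, and the face integrals are estimated with a simple midpoint rule.

```python
import math

def flux_through_cube(charge_pos, half_side=1.0, n=100):
    """Flux of E = r / (4 pi r^3), i.e. a point charge with q/eps0 = 1,
    through a cube centred on the origin (midpoint rule on the six faces)."""
    cx, cy, cz = charge_pos
    h = 2.0 * half_side / n
    total = 0.0
    for axis in range(3):                 # faces perpendicular to x, y, z
        for sign in (-1.0, 1.0):          # outward normal is sign * unit vector
            for i in range(n):
                for j in range(n):
                    u = -half_side + (i + 0.5) * h
                    v = -half_side + (j + 0.5) * h
                    point = [u, v]
                    point.insert(axis, sign * half_side)
                    rx, ry, rz = point[0] - cx, point[1] - cy, point[2] - cz
                    r3 = (rx * rx + ry * ry + rz * rz) ** 1.5
                    e_normal = (rx, ry, rz)[axis] / (4.0 * math.pi * r3)
                    total += sign * e_normal * h * h
    return total

print("charge at centre :", flux_through_cube((0.0, 0.0, 0.0)))
print("charge off centre:", flux_through_cube((0.3, -0.2, 0.5)))
print("charge outside   :", flux_through_cube((3.0, 0.0, 0.0)))
```

With the charge anywhere inside the cube the flux should come out close to 1, and with the charge outside it should be close to 0, consistent with $\nabla \cdot \mathbf{E}$ vanishing wherever there is no charge.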

H.2 Example: Buoyancy forces in fluids

Buoyancy in fluids provides an example of the use of flux in physics. Consider a fluid of density $\rho(z)$
in a gravitational field $\mathbf{g}(z) = -g(z)\hat{\mathbf{z}}$ where the $z$ axis points in the opposite direction to the gravitational
force. Pressure equals force per unit area and is a scalar quantity. For a conservative fluid system, in static
equilibrium, the net work done per unit volume for an infinitesimal displacement $d\mathbf{r}$ is zero. The net pressure
difference is $P(\mathbf{r} + d\mathbf{r}) - P(\mathbf{r}) = \nabla P \cdot d\mathbf{r}$ while the work done by gravity per unit volume
is $\rho(z)\mathbf{g}(z) \cdot d\mathbf{r}$. Thus energy conservation gives

$$[\nabla P - \rho(z)\mathbf{g}(z)] \cdot d\mathbf{r} = 0$$

which can be expanded as

$$\frac{\partial P}{\partial z} = -\rho(z) g(z) \qquad (\alpha)$$

$$\frac{\partial P}{\partial x} = \frac{\partial P}{\partial y} = 0$$
Integrating the net forces normal to the surface over any closed surface enclosing an empty volume, inside
the fluid, gives a net buoyancy force on this volume that simplifies using the Divergence theorem
$$\oint \mathbf{F} \cdot d\mathbf{S} = \oint P\,\hat{\mathbf{S}} \cdot d\mathbf{S} = \oint P\,dS = \int \left(\frac{\partial P}{\partial x} + \frac{\partial P}{\partial y} + \frac{\partial P}{\partial z}\right) d\tau$$

Using equations ($\alpha$) leads to the net buoyancy force

$$\oint \mathbf{F} \cdot d\mathbf{S} = \int \frac{\partial P}{\partial z}\,d\tau = -\int \rho(z) g(z)\,d\tau$$

The right hand side of this equation equals minus the weight of the displaced fluid. That is, the buoyancy force
equals the weight of the fluid displaced by the empty volume. Note that this proof applies both to compressible
fluids, where the density depends on pressure, as well as to incompressible fluids where the density is constant.
It also applies to situations where local gravity $g$ is position dependent. If an object of mass $M$ is completely
submerged then the net force on the object is $Mg - \int \rho(z) g(z)\,d\tau$. If the object floats on the surface
of a fluid then the buoyancy force must be calculated separately for the volume under the fluid surface and
the upper volume above the fluid surface. The buoyancy due to displaced air usually is negligible since the
density of air is about $10^{-3}$ times that of fluids such as water.
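
Archimedes’ result can be checked by integrating the pressure force directly over a surface. The sketch below is illustrative only and rests on stated assumptions: a fluid of constant density in uniform gravity (the simplest case covered by the argument above), a fully submerged sphere, and round numerical values chosen for the example.

```python
import math

RHO = 1000.0          # fluid density in kg/m^3 (assumed constant)
G = 9.81              # gravitational acceleration in m/s^2 (assumed uniform)
P_SURFACE = 101325.0  # pressure at z = 0 in Pa (assumed)

def pressure(z):
    """Hydrostatic pressure for constant density, from dP/dz = -rho g."""
    return P_SURFACE - RHO * G * z

def buoyancy_force_on_sphere(radius, z_centre, n_theta=2000):
    """z component of the pressure force, -(surface integral of P n dS),
    on a submerged sphere, summed over thin rings of constant polar angle."""
    d_theta = math.pi / n_theta
    force_z = 0.0
    for k in range(n_theta):
        theta = (k + 0.5) * d_theta
        z = z_centre + radius * math.cos(theta)
        # ring area 2 pi R^2 sin(theta) d_theta; outward normal has n_z = cos(theta)
        ring_area = 2.0 * math.pi * radius ** 2 * math.sin(theta) * d_theta
        force_z -= pressure(z) * math.cos(theta) * ring_area
    return force_z

def displaced_weight(radius):
    """Weight of the fluid displaced by a sphere of the given radius."""
    return RHO * G * (4.0 / 3.0) * math.pi * radius ** 3

print("pressure force (N)  :", buoyancy_force_on_sphere(0.1, -2.0))
print("displaced weight (N):", displaced_weight(0.1))
```

The upward pressure force should match the weight of the displaced fluid and should not depend on the depth `z_centre`, since the constant part of the pressure integrates to zero over a closed surface.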

H.3 Stokes Theorem

H.3.1 The curl
Maxwell’s laws relate the circulation of the field around a closed loop to the rate of change of flux through
the surface bounded by the closed loop. It is possible to write these integral equations in a diﬀerential form
as follows.
Consider the line integral around a closed loop $C$ shown in figure H.3.
If this area is subdivided into two areas enclosed by loops $C_1$ and $C_2$, then the sum of the line integrals
is the same
$$\oint_C \mathbf{F} \cdot d\mathbf{l} = \oint_{C_1} \mathbf{F} \cdot d\mathbf{l} + \oint_{C_2} \mathbf{F} \cdot d\mathbf{l} \tag{H.18}$$

because the contributions along the common boundary cancel since they are taken in opposite directions if
$C_1$ and $C_2$ both are taken in the same direction. Note that the line integral, and corresponding enclosed area,
are vector quantities related by the right-hand rule and this must be taken into account when subdividing
the area. Thus the area can be subdivided into an infinite number of pieces for which
$$\oint_C \mathbf{F} \cdot d\mathbf{l} = \sum_i^{N \to \infty} \oint_{C_i} \mathbf{F} \cdot d\mathbf{l} = \sum_i^{N \to \infty} \frac{\oint_{C_i} \mathbf{F} \cdot d\mathbf{l}}{\Delta\mathbf{S}_i \cdot \hat{\mathbf{n}}}\,\Delta\mathbf{S}_i \cdot \hat{\mathbf{n}} \tag{H.19}$$

where $\Delta\mathbf{S}_i$ is the infinitesimal area bounded by the closed sub-loop $C_i$ and $\Delta\mathbf{S}_i \cdot \hat{\mathbf{n}}$ is the normal component
of this area pointing along the $\hat{\mathbf{n}}$ direction, which is the direction along which the line integral points.
The component of the curl of the vector function along the direction $\hat{\mathbf{n}}$ is defined to be
$$(\operatorname{curl}\mathbf{F}) \cdot \hat{\mathbf{n}} \equiv \lim_{\Delta S \to 0} \frac{\oint_{C_i} \mathbf{F} \cdot d\mathbf{l}}{\Delta\mathbf{S}_i \cdot \hat{\mathbf{n}}} \tag{H.20}$$

Thus the line integral can be written as

$$\oint_C \mathbf{F} \cdot d\mathbf{l} = \sum_i^{N \to \infty} \frac{\oint_{C_i} \mathbf{F} \cdot d\mathbf{l}}{\Delta\mathbf{S}_i \cdot \hat{\mathbf{n}}}\,\Delta\mathbf{S}_i \cdot \hat{\mathbf{n}} = \int_S \left[(\operatorname{curl}\mathbf{F}) \cdot \hat{\mathbf{n}}\right] d\mathbf{S} \cdot \hat{\mathbf{n}} \tag{H.21}$$

The product $\hat{\mathbf{n}} \cdot \hat{\mathbf{n}} = 1$, that is, this is true independent of the direction of the infinitesimal loop.
Thus the above relation leads to Stokes Theorem
$$\oint_C \mathbf{F} \cdot d\mathbf{l} = \int_{\text{closed area}} (\operatorname{curl}\mathbf{F}) \cdot d\mathbf{S} \tag{H.22}$$

This relates the line integral to a surface integral over a surface bounded by the loop.

Figure H.3: The circulation around a path is equal to the sum of the circulations around subareas made by
subdividing the area.
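
Stokes Theorem can be illustrated with a short numerical check. The sketch below assumes a planar field chosen for convenience (it is not from the text) and compares the counterclockwise circulation around the unit square with the surface integral of the $z$ component of the curl over that square.

```python
def field(x, y):
    """Illustrative planar field F = (-y^3, x^3) (an assumption)."""
    return (-y ** 3, x ** 3)

def curl_z(x, y):
    """z component of curl F for the field above: dFy/dx - dFx/dy."""
    return 3.0 * x * x + 3.0 * y * y

def circulation(n=1000):
    """Counterclockwise line integral of F . dl around the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += field(s, 0.0)[0] * h   # bottom edge, traversed in +x
        total += field(1.0, s)[1] * h   # right edge, traversed in +y
        total -= field(s, 1.0)[0] * h   # top edge, traversed in -x
        total -= field(0.0, s)[1] * h   # left edge, traversed in -y
    return total

def surface_integral_of_curl(n=200):
    """Integral of (curl F)_z over the unit square (midpoint rule)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += curl_z((i + 0.5) * h, (j + 0.5) * h) * h * h
    return total

print("circulation      :", circulation())
print("surface integral :", surface_integral_of_curl())
```

Both quantities equal 2 analytically for this field, so the printed values should agree closely.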

H.3.2 Curl in cartesian coordinates

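Consider the infinitesimal rectangle $\Delta x \Delta y$ pointing in the $\hat{\mathbf{k}}$ direction shown in figure H.4.
The line integral, taken in a right-handed way around $\hat{\mathbf{k}}$, gives
$$\oint \mathbf{F} \cdot d\mathbf{l} = F_x \Delta x + \left(F_y + \frac{\partial F_y}{\partial x}\Delta x\right)\Delta y - \left(F_x + \frac{\partial F_x}{\partial y}\Delta y\right)\Delta x - F_y \Delta y = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\Delta x \Delta y \tag{H.23}$$
Thus since $\Delta x \Delta y = \Delta S_z$ the $z$ component of the curl is given by
$$(\operatorname{curl}\mathbf{F}) \cdot \hat{\mathbf{k}} = \frac{\oint \mathbf{F} \cdot d\mathbf{l}}{\Delta\mathbf{S} \cdot \hat{\mathbf{n}}} = \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right) \tag{H.24}$$
The same argument gives the component of the curl in the $y$ direction
$$(\operatorname{curl}\mathbf{F}) \cdot \hat{\mathbf{j}} = \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right) \tag{H.25}$$
Similarly the same argument gives the component of the curl in the $x$ direction
$$(\operatorname{curl}\mathbf{F}) \cdot \hat{\mathbf{i}} = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right) \tag{H.26}$$

Figure H.4: Circulation around an infinitesimal rectangle $\Delta x \Delta y$ in the $z$ direction.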

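Thus combining the three components of the curl gives

$$\operatorname{curl}\mathbf{F} = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right)\hat{\mathbf{i}} + \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\hat{\mathbf{j}} + \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\hat{\mathbf{k}} \tag{H.27}$$

Note that the cross-product of the del operator with the vector $\mathbf{F}$ is

$$\nabla \times \mathbf{F} = \begin{vmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ \frac{\partial}{\partial x} & \frac{\partial}{\partial y} & \frac{\partial}{\partial z} \\ F_x & F_y & F_z \end{vmatrix} \tag{H.28}$$

which is identical to the right hand side of the relation for the curl in cartesian coordinates. That is;
$$\nabla \times \mathbf{F} = \operatorname{curl}\mathbf{F} \tag{H.29}$$

Therefore Stokes Theorem can be rewritten as

$$\oint_C \mathbf{F} \cdot d\mathbf{l} = \int_{\text{closed area}} (\operatorname{curl}\mathbf{F}) \cdot d\mathbf{S} = \int_{\text{closed area}} (\nabla \times \mathbf{F}) \cdot d\mathbf{S} \tag{H.30}$$

The physical meaning of the curl is that it is the circulation, or rotation, per unit area for an infinitesimal loop
at any location. The curl is also called the rotation, written rot $\mathbf{F}$, especially in the German literature.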
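
The definition of the curl as circulation per unit area can be illustrated by shrinking a small square loop around a point. The sketch below uses an assumed illustrative field (not from the text) and approximates each edge of the loop by a single sample at its midpoint, as in the derivation of equation H.23.

```python
import math

def field(x, y):
    """Illustrative planar field F = (y^2, x y + sin x) (an assumption)."""
    return (y * y, x * y + math.sin(x))

def curl_z_exact(x, y):
    """Analytic dFy/dx - dFx/dy for the field above."""
    return math.cos(x) - y

def circulation_per_area(x0, y0, side):
    """Circulation around a square of the given side centred on (x0, y0),
    divided by the enclosed area; each edge is sampled at its midpoint."""
    half = 0.5 * side
    loop = (
        field(x0, y0 - half)[0] * side      # bottom edge, traversed in +x
        + field(x0 + half, y0)[1] * side    # right edge, traversed in +y
        - field(x0, y0 + half)[0] * side    # top edge, traversed in -x
        - field(x0 - half, y0)[1] * side    # left edge, traversed in -y
    )
    return loop / (side * side)

for side in (1e-1, 1e-2, 1e-3):
    print(side, circulation_per_area(0.3, 0.7, side), curl_z_exact(0.3, 0.7))
```

As the side of the loop decreases, the circulation per unit area should approach the analytic value of the $z$ component of the curl at that point.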

H.3 Example: Maxwell’s circulation equations

As an example of the use of the curl, consider Faraday’s Law
$$\oint_C \mathbf{E} \cdot d\mathbf{l} = -\int_{\text{closed area}} \frac{\partial \mathbf{B}}{\partial t} \cdot d\mathbf{S}$$

Using Stokes Theorem gives
$$\oint_C \mathbf{E} \cdot d\mathbf{l} = \int_{\text{closed area}} (\nabla \times \mathbf{E}) \cdot d\mathbf{S}$$

These two relations are independent of the shape of the closed loop, thus we obtain Faraday’s Law in the
differential form
$$(\nabla \times \mathbf{E}) = -\frac{\partial \mathbf{B}}{\partial t}$$
A differential form of the Ampère-Maxwell law also can be obtained from
$$\oint_C \mathbf{B} \cdot d\mathbf{l} = \mu_0 \int_{\text{closed area}} \left(\mathbf{j} + \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}\right) \cdot d\mathbf{S}$$
Using Stokes Theorem
$$\oint_C \mathbf{B} \cdot d\mathbf{l} = \int_{\text{closed area}} (\nabla \times \mathbf{B}) \cdot d\mathbf{S}$$

Again this is independent of the shape of the loop and thus we obtain the
Ampère-Maxwell law in differential form
$$\nabla \times \mathbf{B} = \mu_0 \mathbf{j} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}$$
The differential forms of Maxwell’s circulation relations are easier to apply than the integral equations
because the differential form relates the curl to the time derivatives at the same specific location.

H.4 Potential formulations of curl-free and divergence-free fields

Interesting consequences result from the Divergence theorem and Stokes Theorem for vector fields that are
either curl-free or divergence-free. In particular two theorems result from the second derivatives of a vector
field.

Theorem 1; Curl-free (irrotational) fields:

For curl-free fields
$$\nabla \times \mathbf{F} = 0 \tag{H.31}$$
everywhere. This is automatically obeyed if the vector field is expressed as the gradient of a scalar field

$$\mathbf{F} = \nabla\phi \tag{H.32}$$

since
$$\nabla \times (\nabla\phi) = 0 \tag{H.33}$$
That is, any curl-free vector field can be expressed in terms of the gradient of a scalar field.
The scalar field $\phi$ is not unique, that is, any constant $c$ can be added to $\phi$ since $\nabla c = 0$, that is, the
addition of the constant $c$ does not change the gradient. This independence to addition of a number to the
scalar potential is called a gauge invariance, discussed in chapter 13.2, for which

$$\mathbf{F} = \nabla\phi' = \nabla(\phi + c) = \nabla\phi \tag{H.34}$$

That is, this gauge-invariant transformation does not change the observable $\mathbf{F}$. The electrostatic field $\mathbf{E}$
and the gravitational field $\mathbf{g}$ are examples of irrotational fields that can be expressed as the gradient of scalar
potentials.

Theorem 2; Divergence-free (solenoidal) fields:

For divergence-free fields

$$\nabla \cdot \mathbf{F} = 0 \tag{H.35}$$
everywhere. This is automatically obeyed if the field $\mathbf{F}$ is expressed in terms of the curl of a vector field $\mathbf{G}$
such that
$$\mathbf{F} = \nabla \times \mathbf{G} \tag{H.36}$$
since $\nabla \cdot (\nabla \times \mathbf{G}) = 0$. That is, any divergence-free vector field can be written as the curl of a related vector
field.
As discussed in chapter 13.2, the vector potential $\mathbf{G}$ is not unique in that a gauge transformation can be
made by adding the gradient of any scalar field, that is, the gauge transformation $\mathbf{G}' = \mathbf{G} + \nabla\varphi$ gives

$$\mathbf{F} = \nabla \times \mathbf{G}' = \nabla \times (\mathbf{G} + \nabla\varphi) = \nabla \times \mathbf{G} \tag{H.37}$$

This gauge transformation to the vector potential $\mathbf{G}'$ does not change the observable vector
field $\mathbf{F}$. The magnetic field $\mathbf{B}$ is an example of a solenoidal field that can be expressed in terms of the curl
of a vector potential $\mathbf{A}$.
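
The two identities underlying these theorems, $\nabla \times (\nabla\phi) = 0$ and $\nabla \cdot (\nabla \times \mathbf{G}) = 0$, can be checked numerically. The following sketch uses central finite differences with an assumed step size and arbitrary illustrative fields; none of the function names or fields come from the text.

```python
import math

H = 1e-3  # finite-difference step (an assumed value)

def partial(f, axis, p):
    """Central-difference partial derivative of scalar function f at point p."""
    lo, hi = list(p), list(p)
    lo[axis] -= H
    hi[axis] += H
    return (f(*hi) - f(*lo)) / (2.0 * H)

def grad(f):
    """Numerical gradient of a scalar field, returned as a vector field."""
    return lambda x, y, z: tuple(partial(f, a, (x, y, z)) for a in range(3))

def curl(F):
    """Numerical curl of a vector field F(x, y, z) -> (Fx, Fy, Fz)."""
    def curl_at(x, y, z):
        p = (x, y, z)
        d = lambda i, a: partial(lambda *q: F(*q)[i], a, p)  # dF_i / dx_a
        return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
    return curl_at

def div(F):
    """Numerical divergence of a vector field."""
    return lambda x, y, z: sum(
        partial(lambda *q, i=i: F(*q)[i], i, (x, y, z)) for i in range(3)
    )

def phi(x, y, z):
    """Illustrative scalar potential (an assumption)."""
    return x * x * y + x * math.sin(z)

def G(x, y, z):
    """Illustrative vector potential (an assumption)."""
    return (y * z, x * x, math.sin(x * y))

point = (0.3, -0.4, 0.5)
print("curl(grad phi):", curl(grad(phi))(*point))
print("div(curl G)   :", div(curl(G))(*point))
```

Both results should vanish to within rounding error, because central differences along different axes commute just as the exact partial derivatives do.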

H.4 Example: Electromagnetic fields:

Electromagnetic interactions are encountered frequently in classical mechanics so it is useful to discuss
the use of potential formulations of electrodynamics.
For electrostatics, Maxwell’s equations give that

$$\nabla \times \mathbf{E} = 0$$

Therefore theorem 1 states that it is possible to express this static electric field as the gradient of the scalar
electric potential $\phi$, where
$$\mathbf{E} = -\nabla\phi$$

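For electrodynamics, Maxwell’s equations give that

$$(\nabla \times \mathbf{E}) + \frac{\partial \mathbf{B}}{\partial t} = 0$$

Assume that the magnetic field can be expressed in terms of the vector potential, $\mathbf{B} = \nabla \times \mathbf{A}$; then
the above equation becomes
$$\nabla \times \left(\mathbf{E} + \frac{\partial \mathbf{A}}{\partial t}\right) = 0$$

Theorem 1 gives that this curl-less field can be expressed as the gradient of a scalar field, here taken to
be the electric potential $\phi$.
$$\left(\mathbf{E} + \frac{\partial \mathbf{A}}{\partial t}\right) = -\nabla\phi$$

that is
$$\mathbf{E} = -\left(\nabla\phi + \frac{\partial \mathbf{A}}{\partial t}\right)$$

Gauss’s law states that
$$\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}$$
which can be rewritten as
$$\nabla \cdot \mathbf{E} = -\nabla^2\phi - \frac{\partial (\nabla \cdot \mathbf{A})}{\partial t} = \frac{\rho}{\varepsilon_0} \qquad (\alpha)$$
Similarly insertion of the vector potential $\mathbf{A}$ in Ampère’s Law gives
$$\nabla \times \mathbf{B} = \nabla \times (\nabla \times \mathbf{A}) = \mu_0 \mathbf{j} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t} = \mu_0 \mathbf{j} - \mu_0 \varepsilon_0 \nabla\left(\frac{\partial \phi}{\partial t}\right) - \mu_0 \varepsilon_0 \frac{\partial^2 \mathbf{A}}{\partial t^2}$$
Using the vector identity $\nabla \times (\nabla \times \mathbf{A}) = \nabla(\nabla \cdot \mathbf{A}) - \nabla^2 \mathbf{A}$ allows the above equation to be rewritten as
$$\left(\nabla^2 \mathbf{A} - \mu_0 \varepsilon_0 \frac{\partial^2 \mathbf{A}}{\partial t^2}\right) - \nabla\left(\nabla \cdot \mathbf{A} + \mu_0 \varepsilon_0 \frac{\partial \phi}{\partial t}\right) = -\mu_0 \mathbf{j} \qquad (\beta)$$

The use of the scalar potential $\phi$ and vector potential $\mathbf{A}$ leads to two coupled equations ($\alpha$) and ($\beta$). These
coupled equations can be transformed into two uncoupled equations by exploiting the freedom to make a gauge
transformation for the vector potential such that the second bracket in equation ($\beta$) is zero.
That is, choosing the Lorentz gauge
$$\nabla \cdot \mathbf{A} = -\mu_0 \varepsilon_0 \frac{\partial \phi}{\partial t}$$

simplifies equations ($\alpha$) and ($\beta$) to be

$$\nabla^2 \phi - \mu_0 \varepsilon_0 \frac{\partial^2 \phi}{\partial t^2} = -\frac{\rho}{\varepsilon_0}$$

$$\nabla^2 \mathbf{A} - \mu_0 \varepsilon_0 \frac{\partial^2 \mathbf{A}}{\partial t^2} = -\mu_0 \mathbf{j}$$

The virtue of using the Lorentz gauge, rather than the Coulomb gauge $\nabla \cdot \mathbf{A} = 0$, is that it separates the
equations for the scalar and vector potentials. Moreover, these two equations are the wave equations for these
two potential fields corresponding to a velocity $c = \frac{1}{\sqrt{\mu_0 \varepsilon_0}}$. This example illustrates the power of using the
concept of potentials in describing vector fields.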
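
As a quick numerical aside, the wave velocity implied by the two wave equations can be evaluated directly. The constants below are assumed reference values (the classical defined value of $\mu_0$ and a CODATA value of $\varepsilon_0$); they are not quoted in the text.

```python
import math

MU_0 = 4.0e-7 * math.pi        # vacuum permeability in H/m (classical defined value)
EPSILON_0 = 8.8541878128e-12   # vacuum permittivity in F/m (CODATA value, assumed)

def wave_speed(mu, epsilon):
    """Propagation speed of the potentials in the Lorentz gauge, 1/sqrt(mu*epsilon)."""
    return 1.0 / math.sqrt(mu * epsilon)

print("c = %.6e m/s" % wave_speed(MU_0, EPSILON_0))
```

The result should reproduce the speed of light in vacuum, which is why the potentials propagate as electromagnetic waves.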
Appendix I

Waveform analysis

I.1 Harmonic waveform decomposition

The response of any linear system that is subject to a time-dependent forcing function $F(t)$ can be expressed as a linear
superposition of frequency-dependent solutions of the individual harmonic decomposition $G(\omega)$ of the forcing
function. Similarly, the response of any linear system subject to a spatially-dependent forcing function $F(x)$ can be expressed
as a linear superposition of the wavenumber-dependent solutions of the individual harmonic decomposition
$G(k_x)$ of the forcing function. Fourier analysis provides the mathematical procedure for the transformation
between the periodic waveforms and the harmonic content, that is, $F(t) \Leftrightarrow G(\omega)$, or $F(x) \Leftrightarrow G(k_x)$. Fourier’s
theorem states that any arbitrary forcing function $F(t)$ can be decomposed into a sum of harmonic terms.
For example for a time-dependent periodic forcing function the decomposition can be a cosine series of the
form
$$F(t) = \sum_{n=1}^{\infty} A_n \cos(n\omega_0 t + \delta_n) \tag{I.1}$$

where $\omega_0$ is the lowest (fundamental) frequency solution. For an aperiodic function a cosine decomposition
can be of the form
$$F(t) = \int_0^{\infty} A(\omega) \cos(\omega t + \delta(\omega))\,d\omega \tag{I.2}$$

Either of the complementary functions $F(t) \Leftrightarrow G(\omega)$, or $F(x) \Leftrightarrow G(k_x)$, are equivalent representations of
the harmonic content that can be used to describe signals and waves. The following two sections give an
introduction to Fourier analysis.

I.1.1 Periodic systems and the Fourier series

Discrete solutions occur for systems when periodic boundary conditions exist. The response of periodic
systems can be described in either the time versus angular frequency domains, or equivalently, the spatial
coordinate $x$ versus the corresponding wave number $k_x$. For periodic systems this decomposition leads to
the Fourier series where a generalized phase coordinate $\phi$ can be used to represent either the time or spatial
coordinates, that is, with $\phi = \omega_0 t$ or $\phi = k_0 x$ respectively. The Fourier series relates the two representations
of the discrete wave solutions for such periodic systems.
Fourier’s theorem states that for a general periodic system any arbitrary forcing function $F(\phi)$ can be
decomposed into a sum of sinusoidal or cosinusoidal terms. The summation can be represented by three
equivalent series expansions given below, where $\phi = \omega_0 t$ or $\phi = \mathbf{k}_0 \cdot \mathbf{r}$ and where $\omega_0$, $\mathbf{k}_0$ are the fundamental
angular frequency and fundamental wave number respectively.
$$F(\phi) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left[a_n \cos(n\phi) + b_n \sin(n\phi)\right] \tag{I.3}$$
$$F(\phi) = \frac{a_0}{2} + \sum_{n=1}^{\infty} c_n \cos(n\phi + \alpha_n) \tag{I.4}$$
$$F(\phi) = \frac{a_0}{2} + \sum_{n=1}^{\infty} d_n \sin(n\phi + \beta_n) \tag{I.5}$$

where $n$ is an integer, and $\alpha_n$, $\beta_n$ are phase shifts fit to the initial conditions.
The normal modes of a discrete system form a complete set of solutions that satisfy the following
orthogonality relation
$$\int_0^{2\pi} f_n(\phi)\, f_m(\phi)\, d\phi = \pi\,\delta_{nm} \tag{I.6}$$

where $\delta_{nm}$ is the Kronecker delta symbol defined in equation (10). Orthogonality can be used to determine
the coefficients for equation (I.3) to be
$$a_0 = \frac{1}{\pi}\int_{-\pi}^{+\pi} F(\phi)\, d\phi \tag{I.7}$$
$$a_n = \frac{1}{\pi}\int_{-\pi}^{+\pi} F(\phi) \cos(n\phi)\, d\phi \tag{I.8}$$
$$b_n = \frac{1}{\pi}\int_{-\pi}^{+\pi} F(\phi) \sin(n\phi)\, d\phi \tag{I.9}$$

Similarly the coefficients for (I.4) and (I.5) are related to the above coefficients by

$$c_n^2 = d_n^2 = a_n^2 + b_n^2$$

Instead of the simple trigonometric form used in equations (I.3) to (I.5) the cosine and sine functions can
be expanded into the exponential form where
$$\cos n\phi = \frac{1}{2}\left(e^{in\phi} + e^{-in\phi}\right) \tag{I.10}$$
$$\sin n\phi = -\frac{i}{2}\left(e^{in\phi} - e^{-in\phi}\right)$$
then equation (I.3) becomes
$$F(\phi) = \sum_{n=-\infty}^{\infty} g_n e^{in\phi} \tag{I.11}$$

where $n$ is any integer and, from the orthogonality, the Fourier coefficients are given by
$$g_n = \frac{1}{2\pi}\int_{-\pi}^{+\pi} F(\phi)\, e^{-in\phi}\, d\phi \tag{I.12}$$

These coefficients are related to the cosine plus sine series amplitudes by
$$g_n = \frac{1}{2}(a_n - i b_n) \qquad (\text{when } n \text{ is positive})$$
$$g_n = \frac{1}{2}(a_{|n|} + i b_{|n|}) \qquad (\text{when } n \text{ is negative})$$
These results show that the coefficients of the exponential series are in general complex, and that they
occur in conjugate pairs (that is, the imaginary part of a coefficient $g_n$ is equal but opposite in sign to that
for the coefficient $g_{-n}$). Although the introduction of complex coefficients may appear unusual, it should
be remembered that the real part of a pair of coefficients denotes the magnitude of the cosine wave of the
relevant frequency, and that the imaginary part denotes the magnitude of the sine wave. If a particular
pair of coefficients $g_n$ and $g_{-n}$ are real, then the component at the frequency $n\omega_0$ is simply a cosine; if $g_n$
and $g_{-n}$ are purely imaginary, the component is just a sine; and if, as is the general case, $g_n$ and $g_{-n}$ are
complex, both cosine and sine terms are present.
The use of the exponential form of the Fourier series gives rise to the notion of ‘negative frequency’. Of
course, $F(t) = A \cos \omega_n t$ is a wave of a single frequency $\omega_n = n\omega_0$ radians/second, and may be represented
by a single line of height $A$ in a normal spectral diagram. However, using the exponential form of the Fourier
series results in both positive and negative $\omega$ components.
The coexistence of both negative and positive angular frequencies $\pm\omega$ can be understood by consideration
of the Argand diagram where the real component is plotted along the $x$-axis and the imaginary component
along the $y$-axis. The function $A e^{+i\omega t}$ represents a vector of length $A$ that rotates with an angular velocity $\omega$
in a positive direction, that is counterclockwise, whereas $A e^{-i\omega t}$ represents the vector rotating in a negative
direction, that is clockwise. Thus the sum of the two rotating vectors, according to equations (I.3), leads
to cancellation of the opposite components on the imaginary $y$ axis and addition of the two $A \cos \omega t$ real
components on the $x$ axis. Subtraction leads to cancellation of the real $x$ components and addition of the
imaginary $y$ axis components.
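
The conjugate-pair structure of the exponential coefficients can be seen in a small numerical example. The sketch below assumes an illustrative waveform with known amplitudes ($a_0/2 = 1$, $a_1 = 2$, $b_2 = 3$) and evaluates the coefficient integral of equation I.12, taking the kernel as $e^{-in\phi}$; the coefficient is called `g_n` here as an assumed name.

```python
import cmath
import math

def waveform(phi):
    """Illustrative periodic waveform with a0/2 = 1, a1 = 2 and b2 = 3."""
    return 1.0 + 2.0 * math.cos(phi) + 3.0 * math.sin(2.0 * phi)

def exponential_coefficient(func, n, samples=2000):
    """Midpoint-rule estimate of g_n = (1/2pi) * integral of F(phi) exp(-i n phi)."""
    h = 2.0 * math.pi / samples
    total = 0j
    for k in range(samples):
        phi = -math.pi + (k + 0.5) * h
        total += func(phi) * cmath.exp(-1j * n * phi) * h
    return total / (2.0 * math.pi)

for n in (-2, -1, 0, 1, 2):
    g_n = exponential_coefficient(waveform, n)
    print(n, complex(round(g_n.real, 6), round(g_n.imag, 6)))
```

The cosine term should appear as a real pair $g_{\pm 1} = 1$, while the sine term should appear as a purely imaginary conjugate pair $g_{\pm 2} = \mp 1.5i$, in line with the relations between the exponential and trigonometric coefficients.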

I.1.2 Aperiodic systems and the Fourier Transform

The Fourier transform (also called the Fourier integral) does for the non-repetitive signal waveform what
the Fourier series does for the repetitive signal. It was shown that the line spectrum of a recurrent periodic
pulse waveform is modified as the pulse duration decreases, assuming the period of the waveform (and hence
its fundamental component) remains unchanged. Suppose now that the duration of the pulses remains fixed
but the separation between them increases, giving rise to an increasing period. In the limit, only a single
rectangular pulse remains, its neighbors having moved away on either side towards $\pm\infty$. In this case, the
fundamental frequency $\omega_0$ tends towards zero and the harmonics become extremely closely spaced and of
vanishingly small amplitudes, that is, the system approximates a continuous spectrum.
Mathematically, this situation may be expressed by modifications to the exponential form of the Fourier
series already derived. Let the phase factor $\phi = \omega_0 t$ in equation (I.12) then
$$g_n = \frac{\omega_0}{2\pi}\int_{-\pi/\omega_0}^{+\pi/\omega_0} F(t)\, e^{-in\omega_0 t}\, dt = \frac{1}{T}\int_{-T/2}^{+T/2} F(t)\, e^{-in\omega_0 t}\, dt \tag{I.13}$$

where $T$ is the period of the periodic force. Let $G(\omega) = T g_n$, $\omega = n\omega_0$ and take the limit for $T \to \infty$; then
equation (I.12) can be written as
$$G(\omega) = \int_{-\infty}^{+\infty} F(t)\, e^{-i\omega t}\, dt \tag{I.14}$$
Similarly making the same limit for $T \to \infty$ then $\omega_0 = \frac{2\pi}{T} \to d\omega$ and equation (I.11) becomes

$$F(t) = \sum_{n=-\infty}^{\infty} \frac{G(\omega)}{T}\, e^{in\omega_0 t} = \sum_{n=-\infty}^{\infty} G(\omega)\, e^{i\omega t}\, \frac{\omega_0}{2\pi} = \frac{1}{2\pi}\int_{-\infty}^{+\infty} G(\omega)\, e^{i\omega t}\, d\omega \tag{I.15}$$

Equation (I.15) shows how a non-repetitive time-domain waveform is related to its continuous spectrum.
These are known as Fourier integrals or Fourier transforms. They are of central importance for signal
processing. For convenience the transforms often are written in the operator formalism using the $\mathcal{F}$ symbol
in the form
$$F(t) = \frac{1}{2\pi}\int_{-\infty}^{+\infty} G(\omega)\, e^{i\omega t}\, d\omega \equiv \mathcal{F}^{-1}\left[\frac{1}{2\pi} G(\omega)\right] \tag{I.16}$$
$$G(\omega) = \int_{-\infty}^{+\infty} F(t)\, e^{-i\omega t}\, dt \equiv \mathcal{F}\, F(t) \tag{I.17}$$

It is very important to grasp the significance of these two equations. Equation (I.17) tells us that the Fourier
transform of the waveform $F(t)$ is continuously distributed in the frequency range between $\omega = \pm\infty$, whereas
equation (I.16) shows how, in effect, the waveform may be synthesized from an infinite set of exponential functions
of the form $e^{\pm i\omega t}$, each weighted by the relevant value of $G(\omega)$. It is crucial to realize that this transformation
can go either way equally, that is, from $G(\omega)$ to $F(t)$ or vice versa.[1]

[1] The only asymmetry in the Fourier transform relations comes from the $2\pi$ factor originating from the fact that by convention
physicists use the angular frequency $\omega = 2\pi\nu$ rather than the frequency $\nu$. In order to restore symmetry many papers use the
factor $\frac{1}{\sqrt{2\pi}}$ in both relations rather than using the $\frac{1}{2\pi}$ factor in equation I.16 and unity in equation I.17.

I.1 Example: Fourier transform of a single isolated square pulse:

Consider a single isolated square pulse of width $\tau$ that is described by the rectangular function $\Pi$ defined
as
$$\Pi(t) = \begin{cases} 1 & |t| < \frac{\tau}{2} \\ 0 & |t| > \frac{\tau}{2} \end{cases}$$

That is, assume that the amplitude of the pulse is unity between $-\frac{\tau}{2} \leq t \leq \frac{\tau}{2}$. Then the Fourier transform is
$$G(\omega) = \int_{-\tau/2}^{+\tau/2} e^{-i\omega t}\, dt = \tau\left(\frac{\sin\frac{\omega\tau}{2}}{\frac{\omega\tau}{2}}\right)$$

which is an unnormalized sinc function. Note that the width of the pulse $\Delta t = \pm\frac{\tau}{2}$ leads to a frequency
envelope that has the first zeros at $\Delta\omega = \pm\frac{2\pi}{\tau}$. Thus the product of these widths $\Delta t \cdot \Delta\omega = \pm\pi$, which is
independent of the width of the pulse, that is, $\Delta\omega = \frac{\pi}{\Delta t}$, which is an example of the uncertainty principle
which is applicable to all forms of wave motion.
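
The transform of the square pulse can be verified by direct numerical integration. The following is a minimal sketch assuming the transform convention with kernel $e^{-i\omega t}$ and an illustrative pulse width; the function names are hypothetical.

```python
import cmath
import math

def pulse_transform_numeric(omega, tau, samples=20000):
    """Midpoint-rule estimate of the integral of exp(-i omega t) over |t| < tau/2."""
    h = tau / samples
    total = 0j
    for k in range(samples):
        t = -0.5 * tau + (k + 0.5) * h
        total += cmath.exp(-1j * omega * t) * h
    return total

def pulse_transform_analytic(omega, tau):
    """Closed form tau * sin(omega tau / 2) / (omega tau / 2)."""
    x = 0.5 * omega * tau
    return tau if x == 0.0 else tau * math.sin(x) / x

tau = 2.0
for omega in (0.0, 1.0, 3.0, 2.0 * math.pi / tau):
    numeric = pulse_transform_numeric(omega, tau)
    print(omega, round(numeric.real, 6), round(pulse_transform_analytic(omega, tau), 6))
```

The last frequency in the loop is the first zero of the envelope, $\omega = 2\pi/\tau$, so halving the pulse width doubles the frequency at which the spectrum first vanishes.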

I.2 Example: Fourier transform of the Dirac delta function:

The Dirac delta function, $\delta(t - t_0)$, is a pulse of extremely short duration and unit area at $t = t_0$ and is
zero at all other times. That is,
$$1 = \int_{-\infty}^{+\infty} \delta(t - t_0)\, dt$$

The Dirac function, which is sometimes referred to as the impulse function, has many important appli-
cations to physics and signal processing. For example, a shell shot from a gun is given a mechanical impulse
imparting a certain momentum to the shell in a very short time. Other things being equal, one is interested
only in the impulse imparted to the shell, that is, the time integral of the force accelerating the shell in the
gun, rather than the details of the time dependence of the force. Since the force acts for a very short time
the Dirac delta function can be employed in such problems.
As described in section 311 and appendix , the Dirac delta function is employed in signal processing
when signals are sampled for short time intervals. The Fourier transform of the delta function is needed for
discussion of sampling of signals
$$G(\omega) = \int_{-\infty}^{+\infty} \delta(t - t_0)\, e^{-i\omega t}\, dt = e^{-i\omega t_0}$$

Since $e^{-i\omega t}$ essentially is constant over the infinitesimal time duration of the $\delta(t - t_0)$ function, and the
time integral of the $\delta$ function is unity, the term $e^{-i\omega t_0}$ has unit magnitude for any value of $\omega$ and has
a phase shift of $-\omega t_0$ radians. For $t_0 = 0$ the phase shift is zero and thus the Fourier transform of a
Dirac $\delta(t)$ function is $G(\omega) = 1$. That is, this is a uniform white spectrum for all values of $\omega$.

I.2 Time-sampled waveform analysis

An alternative approach for analyzing periodic signals, that is complementary to the Fourier analysis
harmonic decomposition, is time-sampled (discrete-sample) waveform analysis where the signal amplitude is
measured repetitively at regular time intervals in a time-ordered sequence, that is, a sequence of samples of
the instantaneous delta-function amplitudes is recorded. Typically an amplitude-to-digital converter is used
to digitize the amplitude for each measured sample and the digital numbers are recorded; this process is
called digital signal processing.
The general principles are best explained by first considering the response of a linear system to a step
function impulse, followed by a square impulse, and leading to the response to a $\delta$-function impulsive driving
force.

Figure I.1: Response of a underdamped linear oscillator with  = 10, and Γ = 2 to the following impulsive
force. (a) Step function force  = 0 for   0 and  =  for   0 (b) Square-wave force where  =  for
0     for  = 3 and  = 0 at other times. (c) Delta-function impulse  = 1.

I.2.1 Delta-function impulse response

Consider the damped oscillator equation

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = \frac{F(t)}{m} \tag{I.18}$$

and assume that a step function is applied at time $t = 0$. That is;

$$\frac{F(t)}{m} = 0 \quad (t < 0) \qquad\qquad \frac{F(t)}{m} = a \quad (t > 0) \tag{I.19}$$
where $a$ is a constant. The initial conditions are that $x(0) = \dot{x}(0) = 0$.
The transient or complementary solution is the solution of the linearly-damped harmonic oscillator

$$\ddot{x} + \Gamma\dot{x} + \omega_0^2 x = 0 \tag{I.20}$$

This is independent of the driving force and the solution is given in the chapter 3.5 discussion of the
linearly-damped harmonic oscillator.
The particular, steady-state, solution is easy to obtain just by inspection since the force is a constant,
that is, the particular solution is

$$x = 0 \quad (t < 0) \qquad\qquad x = \frac{a}{\omega_0^2} \quad (t > 0)$$

Taking the sum of the transient and particular solutions, using the initial conditions, gives the final solution
to be
$$x(t) = \frac{a}{\omega_0^2}\left[1 - e^{-\frac{\Gamma}{2}t}\cos\omega_1 t - \frac{\Gamma e^{-\frac{\Gamma}{2}t}}{2\omega_1}\sin\omega_1 t\right] \tag{I.21}$$
where $\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$. This functional form is shown in figure I.1. Note that the amplitude of the
transient response equals $-a/\omega_0^2$ at $t = 0$ to cancel the particular solution when it jumps to $+a/\omega_0^2$. The oscillatory
behavior then is just that of the transient response.
A square impulse can be generated by the superposition of two opposite-sign step functions separated by
a time $\tau$ as shown in figure I.1.
The square impulse can be taken to the limit where the width $\tau$ is negligibly small relative to the response
times of the system. It can be shown that letting $\tau \to 0$ but keeping the magnitude of the total impulse
$P = a\tau$ finite for the impulse at time $t_0$, leads to the solution for the $\delta$-function impulse occurring at $t_0$

$$x(t) = \frac{P}{\omega_1}\, e^{-\frac{\Gamma}{2}(t - t_0)} \sin\omega_1(t - t_0) \qquad t > t_0 \tag{I.22}$$
This response to a delta function impulse is shown in figure I.1 for the case where $t_0 = 0$. An example is
the response when the hammer strikes a piano string at $t = 0$.
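
The step-function response can be cross-checked by integrating the equation of motion numerically. The sketch below assumes the parameter values $\omega_0 = 10$ and $\Gamma = 2$ quoted in the caption of figure I.1, a unit step $a = 1$, and a standard fourth-order Runge-Kutta integrator; it compares the numerical solution of equation I.18 with the closed form I.21.

```python
import math

OMEGA_0 = 10.0   # undamped angular frequency (value quoted in figure I.1)
GAMMA = 2.0      # damping coefficient (value quoted in figure I.1)
A = 1.0          # step height F/m for t > 0 (assumed)
OMEGA_1 = math.sqrt(OMEGA_0 ** 2 - (GAMMA / 2.0) ** 2)

def step_response_analytic(t):
    """Closed-form response to the step, equation I.21."""
    decay = math.exp(-0.5 * GAMMA * t)
    return (A / OMEGA_0 ** 2) * (
        1.0
        - decay * math.cos(OMEGA_1 * t)
        - GAMMA * decay / (2.0 * OMEGA_1) * math.sin(OMEGA_1 * t)
    )

def step_response_rk4(t_end, dt=1e-3):
    """Integrate x'' + Gamma x' + omega_0^2 x = A from rest with RK4."""
    def deriv(x, v):
        return v, A - GAMMA * v - OMEGA_0 ** 2 * x

    x = v = 0.0
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = deriv(x, v)
        k2x, k2v = deriv(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = deriv(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = deriv(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return x

for t in (0.25, 0.5, 1.0, 3.0):
    print(t, step_response_rk4(t), step_response_analytic(t))
```

At late times both solutions should settle at the particular solution $a/\omega_0^2$ once the transient has decayed.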

Figure I.2: Decomposition of the function $F(t) = 2\sin(\omega t) + \sin(5\omega t) + \frac{1}{3}\sin(15\omega t) + \frac{1}{5}\sin(25\omega t)$ into a time-ordered
sequence of $\delta$-function samples.

I.2.2 Green’s function waveform decomposition

The response of the linearly-damped linear oscillator to an delta function impulse, that has been expressed
above, can be used to exploit the powerful Green’s technique for decomposition of any general forcing
function. That is, if the driven system is linear, then the principle of superposition is applicable and allowing
expression of the inhomogeneous part of the diﬀerential equation as the sum of individual delta functions.
That is;
X∞ X∞
 ()
̈ + Γ̇ +  20  = =  () (I.23)
=−∞
 =−∞

As illustrated in figure 2 discrete-time waveform analysis involves repeatedly sampling the instantaneous
amplitude in a regular and repetitive sequence of -function impulses. Since the superposition principle
applies for this linear system then the waveform can be described by a sum of an ordered series of delta-
function impulses where 0 is the time of an impulse. Integrating over all the -function responses that have
occurred at time 0 , that is prior to the time of interest  leads to
Z 
 (0 ) − Γ2 (−0 )
 () =  sin  1 ( − 0 ) 0  ≥ 0 (I.24)
−∞  1

The Green’s function  ( − 0 ) is defined by

1 − Γ2 (−0 )
( − 0 ) =  sin  1 ( − 0 )  ≥ 0 (I.25)
 1
= 0   0

Superposition allows the summed response of the system to be written in an integral form
Z 
() =  (0 )( − 0 )0 (I.26)
−∞

which gives the final time dependence of the forced system. This repetitive time-sampling approach avoids
the need to use Fourier analysis. Note that the Green’s function $G(t - t')$ implicitly includes the frequency
of the free undamped linear oscillator $\omega_0$, the frequency of the free damped linear oscillator
$\omega_1 \equiv \sqrt{\omega_0^2 - \left(\frac{\Gamma}{2}\right)^2}$, as well as the damping coefficient $\Gamma$.
The combination of fast microcomputers and fast digital sampling techniques has made digital signal
sampling the pre-eminent technique for recording and processing audio, video, and detector signals.
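As a check on the Green’s function integral, the response to a step force can be computed by discrete convolution and compared with the closed form. For $F(t') = F_0$ when $t' \ge 0$ and zero before, the integral evaluates to $x(t) = \frac{F_0}{m\omega_0^2}\left[1 - e^{-\frac{\Gamma}{2}t}\left(\cos\omega_1 t + \frac{\Gamma}{2\omega_1}\sin\omega_1 t\right)\right]$. The sketch below assumes illustrative parameter values and a step force; neither appears in the text, and the variable names are chosen only for this example.

```python
import numpy as np

# Assumed illustrative parameters (not taken from the text)
m, Gamma, omega0 = 1.0, 0.5, 3.0
omega1 = np.sqrt(omega0**2 - (Gamma / 2)**2)   # damped angular frequency
F0 = 2.0                                       # assumed height of the step force

dt = 2e-3
t = np.arange(0.0, 20.0, dt)

# Green's function G(tau) evaluated on tau >= 0 (it vanishes for tau < 0)
G = np.exp(-0.5 * Gamma * t) * np.sin(omega1 * t) / (m * omega1)

# Step force: F = F0 for t >= 0 and zero before
F = np.full_like(t, F0)

# Rectangle-rule approximation of x(t) = integral of F(t') G(t - t') dt';
# the truncation error is of order dt.
x_num = np.convolve(F, G)[:t.size] * dt

# Closed-form step response obtained by evaluating the same integral analytically
x_exact = F0 / (m * omega0**2) * (
    1.0 - np.exp(-0.5 * Gamma * t)
    * (np.cos(omega1 * t) + Gamma / (2 * omega1) * np.sin(omega1 * t))
)

max_err = np.max(np.abs(x_num - x_exact))
print(f"largest difference between numerical and closed-form response: {max_err:.2e}")
```

At long times the closed form tends to the static displacement $F_0/(m\omega_0^2)$, as expected for a constant force acting on a spring. Direct convolution costs a number of operations proportional to the square of the number of samples, so an FFT-based convolution would be a natural substitute for long records.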
Bibliography

[1] SELECTION OF TEXTBOOKS ON CLASSICAL MECHANICS

[Ar78] V. I. Arnold, “Mathematical methods of Classical Mechanics”, 2nd edition, Springer-Verlag (1978)
This textbook provides an elegant and advanced exposition of classical mechanics expressed in
the language of diﬀerential topology.
[Co50] H.C. Corben and P. Stehle, “Classical Mechanics”, John Wiley (1950)
This classic textbook covers the material at the same level and comparable scope as the present
textbook.
[Fo05] G. R. Fowles, G. L. Cassiday, “Analytical Mechanics”, Thomson Brooks/Cole, Belmont, (2005)
An elementary undergraduate text that emphasizes computer simulations.
[Go50] H. Goldstein, “Classical Mechanics”, Addison-Wesley, Reading (1950)
This has remained the gold standard graduate textbook in classical mechanics since 1950.
Goldstein’s book is the best graduate-level reference to supplement the present textbook. The lack of
worked examples is an impediment to using Goldstein for undergraduate courses. The 3rd edition,
published by Goldstein, Poole, and Safko (2002), uses the symplectic notation that makes the
book less friendly to undergraduates. The Cline book adopts the nomenclature used by Goldstein
to provide a consistent presentation of the material.
[Gr06] R. D. Gregory, “Classical Mechanics”, Cambridge University Press (2006)
This outstanding, and original, introduction to analytical mechanics was written by a mathemati-
cian. It is ideal for the undergraduate, but the breadth of the material covered is limited.
[Gr10] W. Greiner, “Classical Mechanics, Systems of particles and Hamiltonian Dynamics”, 2nd edition,
Springer (2010). This excellent modern graduate textbook is similar in scope and approach to
the present text. Greiner includes many interesting worked examples, as well as a reproduction
of the Struckmeier[Str08] presentation of the extended Lagrangian and Hamiltonian mechanics
formalism of Lanczos[La49].
[Jo98] J. V. José and E. J. Saletan, “Classical Dynamics, A Contemporary Approach”, Cambridge
University Press (1998)
This modern advanced graduate-level textbook emphasizes configuration manifolds and tangent
bundles which makes it unsuitable for use by most undergraduate students.
[Jo05] O. D. Johns, “Analytical Mechanics for Relativity and Quantum Mechanics”, 2nd edition,
Oxford University Press (2005). Excellent modern graduate text that emphasizes the Lanczos[La49]
parametric approach to Special Relativity. The Johns and Cline textbooks were developed
independently but are similar in scope and approach. For consistency, the name “generalized energy”,
which was introduced by Johns, has been adopted in the Cline textbook.


[Ki85] T.W.B. Kibble, F.H. Berkshire. “Classical Mechanics, (5th edition)”, Imperial College Press,
London, 2004. Based on the textbook written by Kibble that was published in 1966 by McGraw-
Hill. The 4th and 5th editions were published jointly by Kibble and Berkshire. This excellent
and well-established textbook addresses the same undergraduate student audience as the present
textbook. This book covers the variational principles and applications with minimal discussion of
the philosophical implications of the variational approach.
[La49] C. Lanczos, “The Variational Principles of Mechanics”, University of Toronto Press, Toronto,
(1949)
An outstanding graduate textbook that has been one of the founding pillars of the field since
1949. It gives an excellent introduction to the philosophical aspects of the variational approach
to classical mechanics, and introduces the extended formulations of Lagrangian and Hamiltonian
mechanics that are applicable to relativistic mechanics.
[La60] L. D. Landau, E. M. Lifshitz, “Mechanics”, Volume 1 of a Course in Theoretical Physics, Perga-
mon Press (1960)
An outstanding, succinct, description of analytical mechanics that is devoid of any superfluous
text. This Course in Theoretical Physics is a masterpiece of scientific writing and is an essential
component of any physics library. The compactness and lack of examples makes this textbook
less suitable for most undergraduate students.
[Li94] Yung-Kuo Lim, “Problems and Solutions on Mechanics” (1994)
This compendium of 408 solved problems, which are taken from graduate qualifying examinations
in physics at several U.S. universities, provides an invaluable resource that complements this
textbook for study of Lagrangian and Hamiltonian mechanics.
[Ma65] J. B. Marion, “Classical Dynamics of Particles and Systems”, Academic Press, New York, (1965)
This excellent undergraduate text played a major role in introducing analytical mechanics to
the undergraduate curriculum. It has an outstanding collection of challenging problems. The 5th
edition has been published by S. T. Thornton and J. B. Marion, Thomson, Belmont, (2004).
[Me70] L. Meirovitch, “Methods of Analytical Dynamics”, McGraw-Hill New York, (1970)
An advanced engineering textbook that emphasizes solving practical problems, rather than the
underlying theory.
[Mu08] H. J. W. Müller-Kirsten, “Classical Mechanics and Relativity”, World Scientific, Singapore, (2008)
This modern graduate-level textbook emphasizes relativistic mechanics making it an excellent
complement to the present textbook.
[Pe82] I. Percival and D. Richards, “Introduction to Dynamics”, Cambridge University Press, London,
(1982)
Provides a clear presentation of Lagrangian and Hamiltonian mechanics, including canonical
transformations, Hamilton-Jacobi theory, and action-angle variables.
[Sy60] J.L. Synge, “Principles of Classical Mechanics and Field Theory”, Volume III/I of “Handbuch
der Physik”, Springer-Verlag, Berlin (1960).
A classic graduate-level presentation of analytical mechanics.
[Ta05] J. R. Taylor, “Classical Mechanics”, University Science Books, Sausalito, (2006)
This undergraduate book gives a well-written descriptive introduction to analytical mechanics.
The scope of the book is limited and the problems are easy.

[2] GENERAL REFERENCES

[Bak96] L. Baker, J.P. Gollub, Chaotic Dynamics, 2nd edition, 1996 (Cambridge University Press)
[Bat31] H. Bateman, Phys. Rev. 38 (1931) 815
[Bau31] P.S. Bauer, Proc. Natl. Acad. Sci. 17 (1931) 311
[Bor25a] M. Born and P. Jordan, Zur Quantenmechanik, Zeitschrift für Physik, 34, (1925) 858-888.
[Bor25b] M. Born, W. Heisenberg, and P. Jordan, Zur Quantenmechanik II, Zeitschrift für Physik, 35,
(1925), 557-615.
[Boy08] R. W. Boyd, Nonlinear Optics, 3rd edition, 2008 (Academic Press, NY)
[Bri14] L. Brillouin, Ann. Physik 44 (1914)
[Bri60] L. Brillouin, Wave Propagation and Group Velocity, 1960 (Academic Press, New York)
[Cay1857] A. Cayley, Proc. Roy. Soc. London 8 (1857) 506
[Cei10] J.L. Cieśliński, T. Nikiciuk, J. Phys. A:Math. Theor. 43 (2010) 175205
[Cio07] Ciocci and Langerock, Regular and Chaotic Dynamics, 12 (2007) 602
[Cli71] D. Cline, Proc. Orsay Coll. on Intermediate Nuclei, Ed. Foucher, Perrin, Veneroni, 4 (1971).
[Cli72] D. Cline and C. Flaum, Proc. of the Int. Conf. on Nuclear Structure Studies Using Electron
Scattering, Sendai, Ed. Shoa, Ui, 61 (1972).
[Cli86] D. Cline, Ann. Rev. Nucl. Part. Sci. 36, (1986) 683.
[Coh77] R.J. Cohen, Amer. J. of Phys. 45 (1977) 12
[Cra65] F.S. Crawford, Berkeley Physics Course 3: Waves, 1970 (McGraw-Hill, New York)
[Cum07] D. Cumin, C.P. Unsworth, Physica D 226 (2007) 181
[Dav58] A. S. Davydov and G. F. Filippov. Nuclear Physics, 8 (1958) 237
[Dek75] H. Dekker, Z. Physik, B21 (1975) 295
[Dep67] A. Deprit, American J. of Phys 35, no.5 424 (1967)
[Dir30] P.A.M. Dirac, Quantum Mechanics, Oxford University Press, (1930).
[Dou41] D. Douglas, Trans. Am. Math. Soc. 50 (1941) 71
[Fey84] R.P. Feynman, R.B. Leighton, M. Sands, The Feynman Lectures, (Addison-Wesley, Reading,
MA,1984) Vol. 2, p17.5
[Fro80] C. Frohlich, Scientific American, 242 (1980) 154
[Gal13] C. R. Galley, Physical Review Letters, 110 (2013) 174301
[Gal14] C. R. Galley, D. Tsang, L.C. Stein, arXiv:1412.3082v1 [math-phys] 9 Dec 2014
[Har03] James B. Hartle, Gravity: An Introduction to Einstein’s General Relativity (Addison Wesley,
2003)
[Kur75] International Symposium on Math. Problems in Theoretical Physics, Lecture Notes in Physics,
Vol. 39, Springer, NY (1975)
[Mus08a] Z.E. Musielak, J. Phys. A. Math. Theor. 41 (2008) 055205
[Mus08b] Z.E. Musielak, D. Rouy, L.D. Swift, Chaos, Solitons, Fractals 38 (2008) 894

[Ray1881] J.W. Strutt, 3rd Baron Rayleigh, Proc. London Math. Soc., s1-4 (1), (1881) 357
[Ray1887] J.W. Strutt, 3rd Baron Rayleigh, The Theory of Sound, 1887 (Macmillan, London)
[Rou1860] E.J. Routh, Treatise on the dynamics of a system of rigid bodies, MacMillan (1860)
[Sim98] M. Simon, D. Cline, K. Vetter, et al, Unpublished
[Sta05] T. Stachowiak and T. Okada, Chaos, Solitons, and Fractals, 29 (2006) 417.
[Str00] S.H. Strogatz, Physica D 143 (2000) 1
[Str05] J. Struckmeier, J. Phys. A: Math. Gen. 38 (2005) 1257
[Str08] J. Struckmeier, Int. J. of Mod. Phys. E18 (2008) 79
[Vir15] E.G. Virga, Phys. Rev. E 91 (2015) 013203
[Win67] A.T. Winfree, J. Theoretical Biology 16 (1967) 15
Index
Abbreviated action, 228 Bohr-Sommerfeld atom
Action special relativity, 487
abbreviated action, 228 Brahe
Hamilton’s Action Principle, 226 history, 2
Action-angle variables Bulk modulus of elasticity, 453
Hamilton-Jacobi theory, 433 Buoyancy forces, 550
Sommerfeld atom, 495
Adiabatic invariance Calculus of variations
action variables, 436 brachistochrone, 111
plane pendulum, 436 Euler, 111
Analytical mechanics, xviii history, 111
Andoyer-Deprit variables Leibniz, xviii
rigid-body rotation, 337 Canonical equation of motion
Archimedes Hamilton’s equations of motion, 202
history, 1 Canonical perturbation theory
Aristotle Hamilton-Jacobi theory, 438
history, 1 harmonic oscillator perturbation, 438
Asymmetric rotor Canonical transformations
stability of torque-free rotation, 344 generating function, 418
Asymmetric top Hamilton method, 420
5 somersaults plus 3 rotations of high diver, 356 Hamilton’s equations of motion, 417
separatrix, 344 Hamilton-Jacobi theory, 422
tennis racket motion, 345 identity transformation, 420
torque-free rotation, 343 Jacobi method, 420, 422
Attractor one-dimensional harmonic oscillator, 421
van der Pol oscillator, 94 Cartesian coordinates, 519
Autonomous system, 93, 169 Cayley
history, 505
Barycenter, 251 Center of momentum
Bernoulli bolas, 17
history, 5 Center of percussion, 35
principle of virtual work, 138 Central forces
virtual work, 111 two-body forces, 249
Bertrand’s Theorem Centre of mass
orbit stability, 263 finite sized objects, 12
Bertrand’s theorem Centre of momentum
orbit solution, 256 relativistic kinematics, 473
Bicycle stability Centrifugal force
rolling wheel, 353 parabolic mirror, 297
Bifurcation Chaos
non-linear system, 103 Lyapunov exponent, 103
Billiard ball, 15 onset of chaos for non-linear system, 101
Bohr Characteristic function
history, 8 Hamilton-Jacobi theory, 424
model of the atom, 495 Chasles’ theorem


rigid-body rotation, 314 Coupled linear oscillators

Collective synchronization benzene ring, 387
coupled oscillators, 395 collective motion in nuclei, 397
Commutation relation continuous lattice chain, 447
Poisson bracket, 407, 497 discrete lattice chain, 388
Conjugate momentum, 180, 195 equations of motion, 371
Conservation laws general analytic theory, 369
angular momentum, 11 grand piano, 368
linear momentum, 12 kinetic energy tensor, 369
Conservation laws in mechanics, 21 linear triatomic molecule, 385
Conservation of linear momentum normal coordinates, 373
exploding cannon shell, 15 potential energy tensor, 370
many-body systems, 14 superposition, 372
Conservation of momentum three bodies coupled by six springs, 384
billiard ball collisions, 15 three fully-coupled plane pendula, 380
Conservative forces three nearest-neighbour coupled plane pendula,
central force, 20 382
central two-body forces, 249 two linearly-damped coupled oscillators, 394
path independence, 18 two parallel-coupled plane pendula, 377
potential energy, 18 two series-coupled oscillators, three springs, 374
time independence, 18 two series-coupled oscillators, two springs, 376
Constrained motion two-series coupled plane pendula, 379
Euler’s equations, 122 viscously-damped coupled oscillators, 394
geodesic motion, 130, 489 Coupled oscillators
Constraint forces collective synchronization, 395
scleronomic, 185, 186 Kuramoto model, 395
Constraints Covariant tensor, 536
kinematic equations of constraint, 122 four vector, 476
non-holonomic, 123 Cut-oﬀ frequency, 393
partial holonomic, 124 Cyclic coordinates
rheonomic, 124 conservation of momentum, 184
rolling wheel, 123 Cylindrical coordinates
scleronomic, 124, 143 Hamiltonian, 203
Continuity equation
fluid mechanics, 457 d’Alembert
Continuous linear chain, 447 history, 5
Contravariant tensor, 536 virtual work, 111
four vector, 476 d’Alembert’s principle
Coordinate systems Lagrange equations, 139, 200, 542
cartesian coordinates, 519 virtual work, 138
curvilinear, 519 da Vinci, Leonardo
cylindrical basis vectors, 522 history, 2
cylindrical coordinates, 522 Damped linear oscillator
polar coordinates, 520 response to arbitrary periodic force, 72
spherical basis vectors, 522 Damped oscillator
spherical coordinates, 522 attractor, 97
Coordinate transformation critically damped, 60
rotational, 525 energy dissipation, 61
translational, 525 Fourier transform, 70
Copernicus free linearly damped, 58
history, 2 overdamped, 60
Correspondence principle, 489 Q factor, 61
Bohr, 408, 498, 501 underdamped, 59
Dirac, 408, 498 de Broglie
Coulomb excitation, 277 history, 8

de Broglie waves field equations, 553

group velocity, 496 Equations of motion
Heisenberg’s uncertainty principle, 80 analytic solution, 37
matter waves, 496 successive approximation, 37
Delta-function analysis, 560 Equivalence principle
Green’s method, 560 weak equivalence principle, 488
Descartes Equivalent Lagrangians
history, 3 gauge invariance, 233
Differential orbit equation, 254 Euler
Dirac calculus of variations, 111, see Euler’s equation
history, 8 history, 5
Lagrangian approach to quantum mechanics, 500 Euler angles
Poisson brackets in quantum physics, 408, 497 definition, 329
relativistic quantum theory, 500 line of nodes, 330
Discrete lattice chain Euler’s equation of motion
cut-off frequency, 393 rigid-body rotation, 334
dispersion, 393 Euler’s equations
longitudinal modes, 388 brachistochrone, 114
normal modes, 389 calculus of variations, 112
transverse modes, 389 catenary, 129
Discrete-function analysis classical mechanics, 131
linear systems, 558 constrained motion, 122
Driven damped oscillator Dido problem, 129
absorptive amplitude, 66, 83 Fermat’s principle, 119
arbitrary periodic harmonic force, 71 generalized coordinates, 125
elastic amplitude, 66, 83 geodesic, 130
energy absorption, 65 Lagrange multipliers, 125
Green’s method, 560 minimum Laplacian, 121
harmonically driven, 62 second form, 121
Lorentzian (Breit-Wigner) line shape, 67 selection of independent variable, 117
phase shift, 63 several independent variables, 119
resonance, 65 shortest distance between two points, 114
Steady state response to harmonic drive, 63 Euler’s equations
transient response, 62 minimal travel cost, 116
uncertainty principle, 67 Euler’s equations of rigid-body rotation
Lagrangian derivation, 335
Eccentricity vector Newtonian derivation, 335
hidden symmetry, 263 Euler’s first equation, 113
Poisson Brackets, 414 Euler’s hydrodynamic equation
two-body motion, 261 fluid dynamics, 457
Einstein
General theory of relativity, 465, 488 Faraday’s law, 552
history, 7 Fast light
photoelectric effect, 494 wave packets, 106
postulates of special relativity, 467 Fermat
Special theory of relativity, 465 history, 3, 5
theory of relativity, 10 Fermat’s Principle, xvii
Einstein’s equivalence principle Feynman
general theory of relativity, 488 history, 8
Elasticity Least action in quantum mechanics, 501
modulus of elasticity, 53 Finite size bodies
spring constant, 454 centre of mass, 13
strain tensor, 452 First order integrals
stress tensor, 452 kinetic energy, 12
Electromagnetic fields momentum, 11

Newton’s laws, 11 gravitational waves, 490

Fluid dynamics Mach’s principle, 488
Bernoulli’s equation, 458 principle of covariance, 488
continuity equation, 457 rotation of the perihelion of mercury, 489
Euler’s hydrodynamic equation, 457 Generalized coordinates, 125, 135, 139, 142, 172
gas flow, 458 minimal set, 142
ideal fluid, 457 Generalized energy, 186, 195, 236, 404
irrotational flow, 458 Generalized energy theorem
Navier-Stokes equation, 460 Hamiltonian mechanics, 187
viscous flow, 460 Generalized force, 139, 144, 174
Fluid flow Generalized momentum, 180
drag force, 461 Geodesic motion, 130, 489
Navier-Stokes equation, 460 Gilbert
Reynolds number, 461 history, 2
Force Gravitation, 38
constraint forces, 145 conservative, 39
generalized force, 145 curl, 41
partition forces, 145 determination of field from potential, 41
Four vector Gauss’s law, 43
special theory of relativity, 475 Newton’s laws, 44
Four vectors Poisson’s equation, 45
contravariant, 476 potential, 40
covariant, 476 potential energy, 39
momentum energy, 478 potential theory, 41
scalar product, 476 reference potential, 42
Four-dimensional space-time superposition, 40
Riemannian geometry, 489 uniform sphere of mass, 45
special theory of relativity, 475 Gravitational wave
Fourier analysis, 70, 555 General Theory of Relativity, 490
Fourier series, 555 Green’s function method, 560
Fourier transform, 557 Group velocity
Fourier series discrete lattice chain, 393
cosine and sine series, 556 surface waves on deep water, 76
exponential series, 556 wave packets, 73, 74, 105
periodic systems, 555
Fourier transform Hamilton
Dirac delta function, 558 history, 6, 505
gaussian wavepacket, 79 variational principle, 111
linearly-damped linear oscillator, 70 Hamilton’s Action Principle
rectangular wavepacket, 79 Hamilton-Jacobi equation, 228
single square pulse, 558 Lagrange equations, 141, 200, 225, 542
wavepackets, 79 stationary action, 226
Hamilton’s equations of motion
Galilean invariance, 10 canonical transformations, 417
Galileo Hamilton’s principle function
history, 2 Hamilton-Jacobi theory, 423
Gauge invariance, 233 Hamilton-Jacobi equation
Gauss Hamilton’s Action Principle, 228
history, 6 Hamilton-Jacobi theory, 422
General theory of relativity action variable, 435
black holes, 490 action-angle variables, 433
deflection of light, 490 central-force problem, 427
gravitational lensing, 490 free particle, 425
gravitational time dilation and frequency shift, Hamilton’s characteristic function, 424
490 Hamilton’s principle function, 423

Hamilton-Jacobi formulations, 424 Holonomic constraints

Jacobi’s complete integral, 422 generalized forces, 144, 174
Lindblad resonance, 439 geometric constraints, 122
one-dimensional oscillator, 427 isoperimetric constraints, 123
Schrodinger equation, 500 Lagrange multipliers, 126
separation of variables, 425 Hurricane
uniform gravitational field, 426 Katrina, 307
visual representation of characteristic function,
432 Impact parameter
wave-particle duality, 432 two-body scattering, 260
Hamiltonian Impulsive force
central field, 204 angular impulsive force, 35, 170
classical mechanics, 193 translational impulse, 34, 170
conservation, 188 Inertia tensor
cyclic coordinates, 193 about center of mass of uniform cube, 320
cylindrical coordinates, 203 about corner of uniform solid cube, 321
definition, 450 characteristic (secular) equation, 318
isotropic central force, 190 components, 316
linear oscillator on moving cart, 189 diagonalization, 318
spherical coordinates, 204 general properties, 323
total energy, 188 hula hoop, 323
total energy conservation, 187 moments of inertia, 316
two body motion, 255 parallel-axis theorem, 319
Hamiltonian mechanics perpendicular-axis theorem, 322
characteristic function, 424 plane laminae, 322
comparison with Lagrangian mechanics, 441 principal axes, 317
electron motion in electric and magnetic fields, principal moments of inertia, 317
208 products of inertia, 316
thin book, 323
equations of motion, 202
Inertial frame, 10
extended formalism, 481, 484
Galilean invariance, 465
generalized energy, 186, 195, 236, 404
Inner product
generalized energy theorem, 187
tensor algebra, 537
Hooke’s law for constrained motion, 207
tensors, 317
Legendre transform, 200, 542
Inverse variational calculus, 234
non-conservative forces, 242, 248
Irrotational flow, 458
observable independence, 409
observable time dependence, 409 Jacobi
observables, 408 energy integral, 186
one-dimensional harmonic oscillator, 205 history, 6
plane pendulum, 57, 206 Jacobi’s complete integral
Poisson brackets, 411 Hamilton-Jacobi theory, 422
spherical pendulum, 213 Jacobian
Harmonic oscillator example, 542
symmetry tensor, 266 general properties, 541
Heisenberg transformation of diﬀerentials, 541
history, 8, 505 transformation of integrals, 541
uncertainty principle, 80 Jacobian determinant, 541
Heisenberg matrix representation
quantum mechanics, 497 Kepler
Hidden symmetry history, 2
Laplace-Runge-Lenz vector, 263 laws of planetary motion, 259, 284
Hodograph Kinetic energy
inverse-square law, 262 generalized coordinates, 185
linear central force, 265 scleronomic systems, 185, 186
two-body scattering, 279 Kirchhoﬀ’s rules, 67, 244

Kuramoto model two masses sliding on inclined planes, 151

coupled oscillators, 395 unconstrained motion, 146
velocity-dependent Lorentz force, 168
Lagrange yo-yo, 157
calculus of variations, 111 Lame’s modulus of elasticity, 453
history, 5 Legendre transform
Lagrange equations Hamiltonian and Lagrangian mechanics, 200, 542
d’Alembert’s principle, 139 Leibniz
Hamilton’s action principle, 225 history, 3, 5
Hamilton’s principle, 141 vis viva, xviii
Lagrange multipliers, 142 Linear oscillator
Lagrange equations critically damped, 60
generalized coordinates, 142 driven, 62
Lagrange multipliers energy dissipation, 61
algebraic equations of constraint, 126 linear damping, 58
Euler equations, 125 Lissajous figures, 55
integral equations of constraint, 128 overdamped, 60
Lagrangian Q factor, 61
definition, 111 resonance, 65
equivalent lagrangians, 232 Steady state response of driven oscillator, 63
extended formalism, 479 superposition, 54
non standard, 234, 238 transient response of driven oscillator, 62
relativistic free particle, 482 underdamped, 59
rotating frame, 293 Linear systems
special relativity, 482 Fourier harmonic analysis, 70
standard, 232 Linear velocity-dependent dissipation, 241
state space, 202 Linearly-damped linear oscillator
time dependent, 169 characteristic frequency, 58
Lagrangian density, 448 damping parameter, 58
Lagrangian mechanics Liouville’s theorem
Atwoods machine, 151 phase space, 415
block sliding on moveable inclined plane, 153 Lissajous figure, 55
body on periphery of rolling wheel, 166 Lorentz
central forces, 147 relativistic transformation, 467
comparison with Hamiltonian mechanics, 441 Lorentz force in electromagnetism
comparison with Newtonian mechanics, 172 Poisson brackets, 412
cyclic coordinates, 184 Lorentz transformation
disk rolling on inclined plane, 148 Minkowski metric, 476
generalized coordinates, 125, 172 Lyapunov exponent
holonomic constraints, 122, 144 onset of chaos, 103
mass sliding on paraboloid, 158
mass sliding on rotating rod, 154 Mach’s principle
motion in gravitational field, 146 general theory of relativity, 488
motion of a free particle, 146 Many-body systems
non-conservative forces, 247 angular momentum, 16
partial holonomic systems, 161 energy conservation, 18
plane pendulum, 191 linear momentum, 14
solid sphere sliding on hemispherical surface, 165 Mass
sphere rolling down inclined plane on frictionless gravitational, 39
floor, 154 inertial, 38
spherical pendulum, 155 Matrix algebra, 505
spring pendulum, 156 addition, 506
swinging mass connected to a rotating mass, 159 adjoint matrix, 507
two connected blocks sliding without friction, 152 degenerate eigenvalues, 513
two connected masses sliding on rigid rail, 160 diagonalization, 511

example of eigenvectors, 512 perturbation methods, 37

Hermitian matrix, 507 position-dependent forces, 26
history, 505 projectile motion, 29
identity matrix, 507 rocket problem, 30
inverse matrix, 507 roller coaster, 27
matrix multiplication, 506 time-dependent forces, 34
orthogonal matrix, 507 variable mass, 29
scalar multiplication, 506 velocity-dependent forces, 28
secular determinant, 511 vertical fall in gravitational field, 28
transpose matrix, 507 Noether’s theorem
unitary matrix, 508 Atwoods machine, 182
Maupertuis conservation of angular momentum, 183
action principle, 228 conservation of linear momentum, 182
history, 5 diatomic molecule, 184
Max Born, 8 history, 8
history, 505 invariant transformations, 181
quantum mechanics, 499 rotational invariance, 183
Maxwell stress tensor, 455 symmetries and invariance, 179, 193
Maxwell’s equations symmetry in deformed nuclei, 184
Gauss’s law and flux, 549 translational invariance, 182
Michelson and Morley experiment Non-conservative forces
ether velocity, 466 projectile motion, 247
Minkowski metric, 476 Rayleigh dissipation force, 241
Minkowski space time Non-holonomic systems
special relativity, 477 non-conservative forces, 247
Modulus of elasticity velocity-dependent Lorentz force, 247
bulk modulus, 453 Non-inertial frames
Lame’s modulus, 453 centrifugal force, 294
Poisson’s ratio, 454 Coriolis force, 294, 295
shear modulus, 454 effective forces acting, 294
Young’s modulus, 453 effective gravitation , 303
Moment of inertia Foucault pendulum, 308
thin door, 33 free fall on earth, 305
Momentum horizontal motion on the earth, 305
angular momentum, 11 Lagrangian and Hamiltonian, 293
linear momentum, 9, 15 low-pressure systems, 306
Multivariate calculus Newtonian mechanics, 292
linear operators, 540 nucleon orbits in spheroidal potential well, 301
partial differentiation, 539 pirouette, 298
projectile fired vertically upwards, 305
Navier-Stokes equation projectile motion near surface of earth, 302
fluid flow, 460 Rossby number, 306
Newton rotating frame, 290
equations of motion, 24 rotation plus translation, 292
history, 3 time derivatives for a rotating frame, 291
laws of gravitation, 38 trajectories for free motion on earth, 304
laws of motion, 9 translation, 289
Principia, xviii, 3 transverse, azimuthal, force, 294
Newton’s laws of gravitation, 44 weather systems, 306
Newtonian mechanics Non-linear systems
conservative forces, 25 bifurcation, 92, 103
constant force problems, 24 driven damped plane pendulum, 97
constrained motion, 27 limit cycle, 93
diatomic molecule, 26 onset of chaos, 101
linear restoring force, 25 period doubling, 100

point attractor, 92 Poincare sections

sensitivity to initial conditions, 101 state-space plots, 104
soliton, 107 Poincare-Bendixson theorem
turbulence in fluid flow, 460 non-linear systems, 93
van der Pol oscillator, 94 Poisson
weak non-linearity, 90 history, 6
Norbert Wiener Poisson brackets
quantum mechanics, 499 angular momentum conservation, 409
Normal modes, 365 canonical transformation, 406
commutation relation, 407, 497
Orbit equation definition, 405
differential orbit equation, 254 fundamental, 405
free body motion, 254, 256 Hamilton equations of motion, 411
Orbit stability invariance to canonical transformations, 406
Bertrand’s theorem, 263 Lorentz force in electromagnetism, 412
constant restoring force, 271 time dependence, 408
Hooke’s law restoring force, 268 two-dimensional oscillator, 413
inverse square law, 269 wave motion and uncertainty principle, 412
two-body motion, 267 Poisson’s ratio, 454
Potential theory
Parallel-axis theorem
gravitation, 41
inertia tensor, 319
Precession rate
Pascal
inertially-symmetric rigid rotor, 340
history, 3
Principle of covariance
Pauli exclusion principle
general theory of relativity, 488
quantum physics, 496
Principle of equivalence
Pendulum
weak principle, 39
Foucault, 308
Principle of minimal gravitational coupling, 489
plane, 57, 206
plane pendulum, 191 Q-factor
spherical, 155, 213 damped linear oscillator, 61
spring pendulum, 156 Quantum mechanics
Permutation symbol, 516 Heisenberg, 496
Perpendicular axis theorem Heisenberg’s matrix representation, 497
inertia tensor, 322 Max Born, 499
Phase space Norbert Wiener, 499
harmonic oscillator, 56 Paul Dirac, 408, 497
Liouville’s theorem, 415 Pauli exclusion principle, 496
Phase velocity Schrodinger, 496
wave packets, 105 Schrodinger wave mechanics, 499
wavepackets, 73, 74 Queen Dido’s problem, 129
Philosophical developments, xix
Photoelectric effect Radius of gyration, 36
Einstein, 494 Rayleigh dissipation function, 241
Millikan, 494 Rayleigh’s dissipation function
Planck Hamiltonian mechanics, 242, 248
constant, 493 Ohm’s law, 244
history, 493 Reduced mass
Plane pendulum two-body motion, 251
state space, 57 Refractive index, 106
Plato Relativistic Doppler effect
history, 1 special theory of relativity, 471
Poincare Relativistic four vector
chaos, 89 scalar product, 476
history, 7 Restricted holonomic systems
three-body problem, 89 mass sliding on hemispherical shell, 161

sphere rolling on a hemispherical shell, 163 Rotational transformation

Reynolds number rotation matrix, 525
fluid flow, 461 Routh
laminar flow, 462 Routhian reduction, 210
turbulent flow, 462 Routhian reduction, 210
Rheonomic constraint, 124 cyclic and non-cyclic Routhians, 299
Riemannian geometry, 489 inverse-square central potential, 216
Rigid-body rotation non-cyclic Routhian, 212
about a body-fixed non-symmetry axis, 32 rotating frames, 299
about a body-fixed point, 314 rotation of a symmetric top about a fixed point,
about a point, 313 348
about body-fixed symmetry axis, 31 Routhian, 210
about fixed axis, 313 spherical pendulum, cyclic Routhian, 214
Andoyer-Deprit variables, 337 spherical pendulum, non-cyclic Routhian, 215
angular momentum, 325 Routhian reduction
angular momentum about corner of a uniform cyclic Routhian, 211
cube, 326 Rutherford scattering, 274
angular momentum of cube about centre of mass, cross section, 276
325 distance of closest approach, 276
angular velocities in terms of the Euler angle ve- impact parameter, 275
locities, 333
billiards, 33 Scattering
body-fixed axis, 31 energy transfer, 36
Chasles’ theorem, 314 Schrodinger
Euler equations for torque-free motion, 336 history, 8
Euler’s equations of motion, 334 Schrodinger equation
Hamiltonian approach, 337 Hamilton-Jacobi equation, 500
inertia tensor, 316 Schrodinger wave mechanics
kinetic energy, 327 quantum mechanics, 499
kinetic energy in terms of Euler angular veloci- Scleronomic constraint, 124, 143, 185, 186, 369
ties, 332 Shear modulus of elasticity, 454
matrix formulation, 317 Signal processing
nutation, 331 coaxial cable, 72
parallel-axis theorem, 319 discrete-function analysis, 558
pivoting versus rolling, 354 Signal velocity
precession, 331 wave packets, 73, 105
rolling, 354 Simultaneity
rotating dumbbell, 336 Special theory of relativity, 469
spin, 331 Slow light
stability for torque-free motion, 344 wave packets, 106
stability of a rolling wheel, 353 Snell’s law, xvii
static and dynamic balancing, 355 Soliton
symmetric top about a fixed point, 347 non-linear systems, 107
torque-free rotation of symmetric top, 337 Soliton wave, 107
Rigid-body rotation about a point Sommerfeld
tippe top, 350 history, 8
Rolling wheel Sommerfeld atom
symmetric rigid-body rotation , 351 quantum of action, 495
Rotation matrix, 525 Spatial inversion transformation, 530
example, 527 Special theory of relativity
finite rotations, 528 Bohr-Sommerfeld atom, 487
infinitesimal rotations, 529 energy, 473
proper and improper rotations, 529 extended Hamiltonian formalism, 481, 484
Rotational invariants Extended Lagrangian formulation, 479
scalar products, 333 force, 473

four-dimensional space-time, 475 Poisson Brackets, 414

Lagrangian, 482
Lorentz spatial contraction, 469 Teleology, 5, 228
Minkowski space, 477 Tennis racket rotation
momentum transformations, 472 asymmetric-rotor rotation, 345
momentum-energy four vector, 478 Tensor algebra
relativistic Doppler eﬀect, 471 contravariant tensor, 536
simultaneity, 469 covariant tensor, 536
time dilation, 468 inner product, 317, 533, 537
twin paradox, 471 outer product, 534
velocity transformations, 472 transformation properties, 538
Spherical coordinates Three-body problem
Hamiltonian, 204 Lagrange points, 272
Spherical harmonic oscillator planar approximation, 272
two-body force, 263 restricted 3-body problem, 272
Spherical pendulum Time dependent force
Hamiltonian mechanics, 213 nonautonomous systems, 169
Lagrangian mechanics, 155 Time invariance
Spring constant, 454 conservation of energy, 186
Standard Lagrangian, 232 Time reversal transformation, 531
State space Tippe top
Lagrangian mechanics, 202 symmetric rigid-body rotation about a point, 350
plane pendulum, 57 Tornadoes
State-space orbits weather systems, 307
Poincare sections, 104 Torque free rotation of asymmetric body, 346
Stern Gerlach Total mechanical energy, 19
space quantization, 496 Transformation properties of common observables, 538
Strain tensor Translational invariance
elasticity, 452 Noether’s theorem, 182
Stress tensor Tumbling of an asymmetric rotor
elasticity, 452 rigid-body rotation, 356
Strong equivalence principle Turbulence in fluid flow
general theory of relativity, 488 non-linear system, 460
Superposition Twin paradox
Fourier series, 555 special theory of relativity, 471
harmonic wave analysis, 70 Two-body central forces
linear equation of motion, 54 conservative forces, 249
Symmetric top Two-body kinematics, 278
Feynman’s wobbling plate, 342 angle transformation, 280
nutation , 349 recoil energies, 282
oblate spheroid, 339 velocity transformation, 279
precession, 349 Two-body motion
precession rate for torque-free symmetric top, angular momentum, 251
342 apocenter, 259
prolate spheroid, 339 barycenter, 251
rotation about a fixed point, 347 bound orbits, 258
spin, 350 equations of motion, 253
spinning jack, 349 equivalent one-body representation, 250
torque-free rotation, 337 Hamiltonian, 255
Symmetries inverse cubic central force, 270
invariance, 193 inverse square law, 257
Noether’s theorem, 181 isotropic harmonic oscillator, 263
Symmetry tensor Kepler’s laws, 259, 284
anisotropic harmonic oscillator, 414 Laplace-Runge-Lenz vector, 261
isotropic harmonic oscillator, 266 orbit solutions , 256
    orbit stability, 267
    pericenter, 258
    properties of objects in solar system, 260
    reduced mass, 251
    unbound orbits, 260
Two-body scattering
    differential cross section, 274
    impact parameter, 260
    Rutherford scattering, 274
    total cross section, 273
Two-coupled harmonic oscillators
    centre-of-mass oscillations, 366
    eigenfrequencies, 364
    grand piano, 368
    normal modes, 363, 365
    symmetric and antisymmetric normal modes, 366
    weak coupling, 367

Uncertainty principle
    Heisenberg, 80
    quantum baseball, 82
Uncertainty principle for wave motion, 80
Unity of classical and quantum mechanics, 504

van der Pol oscillator
    attractor, 94
    strong non-linearity, 96
    weak non-linearity, 95
Variational principles
    calculus of variations, 111
    philosophy, 111, 504
    principle of economy, 111, 504
Vector algebra
    linear operations, 515
Vector differential calculus
    scalar differential operator, 543
    scalar differential operators, 543
Vector differential operators
    curl, 546
    curvilinear coordinates, 545
    divergence, 546
    gradient, 544, 545
    Laplacian, 545, 546
    scalar product, 544
    vector product, 544
Vector integral calculus
    curl, 551
    curl in cartesian coordinates, 551
    curl-free field, 553
    divergence in cartesian coordinates, 548
    divergence theorem, 548
    divergence-free field, 553
    Gauss’s theorem, 547
    line integral, 547
    Stokes theorem, 550
Vector multiplication
    scalar product, 515
    scalar triple product, 517
    vector product, 516
    vector triple product, 518
Vibration isolation
    linearly-damped oscillator, 71
Virial theorem, 22
    Hooke’s law, 22
    ideal gas law, 23
    inverse square law, 23
    mass of galaxies, 23
Virtual work
    d’Alembert’s principle, 138
    principle, 138

Wave equation, 68
    stationary wave solutions, 69
    travelling wave solutions, 69
Wave motion
    discrete-function analysis, 558
    dispersion on discrete lattice chain, 393
    electromagnetic waves in ionosphere, 77
    group velocity for discrete lattice chain, 393
    group velocity for water waves, 76
    group velocity of de Broglie waves, 496
    plasma oscillation frequency, 78
    uncertainty principle, 80
    water waves breaking on a beach, 76
Wave packets
    fast light, 106
    Fourier transform, 79
    group velocity, 73, 74, 105
    phase velocity, 73, 74, 105
    signal velocity, 73, 105
    slow light, 106
    uncertainty principle, 80
Wave-particle duality
    de Broglie, 496, 499
    Hamilton-Jacobi theory, 432
    Schrodinger, 499
Weak equivalence principle
    general theory of relativity, 488
Weather systems
    high-pressure systems, 308
    low-pressure systems, 306
    tornadoes, 307
Work
    definition, 12, 20

Young’s modulus of elasticity, 453

Zeeman effect
    weakly-coupled normal modes, 367

VARIATIONAL PRINCIPLES in CLASSICAL MECHANICS
SECOND EDITION
Douglas Cline

Two dramatically different philosophical approaches to classical mechanics
were proposed during the 17th – 18th centuries. Newton developed his vectorial
formulation that uses time-dependent differential equations of motion to relate
vector observables like force and rate of change of momentum. Euler, Lagrange,
Hamilton, and Jacobi developed powerful alternative variational formulations
based on the assumption that nature follows the principle of least action. These
variational formulations now play a pivotal role in science and engineering.

This book introduces variational principles and their application to classical
mechanics. The relative merits of the intuitive Newtonian vectorial formulation
and the more powerful variational formulations are compared. Applications to
a wide variety of topics illustrate the intellectual beauty, remarkable power, and
broad scope provided by use of variational principles in physics.

This second edition adds discussion of the use of variational principles applied
to the following topics:
(1) Systems subject to initial boundary conditions
(2) The hierarchy of related formulations based on action, Lagrangian,
Hamiltonian, and equations of motion, to systems that involve symmetries
(3) Variational principles applied to non-conservative systems
(4) Variable-mass systems
(5) The General Theory of Relativity

Douglas Cline is a Professor of Physics in the Department of Physics and
Astronomy, University of Rochester, Rochester, New York.
