Gimbal Lock
ISBN 978-1-283-50546-8
Published by:
White Word Publications
48 West 48 Street, Suite 1116,
New York, NY 10036, United States
Email: [email protected]
Table of Contents
Chapter 1 - Rotational Symmetry
WT
Chapter 6 - Precession
________________________WORLD TECHNOLOGIES________________________
Chapter 1
Rotational Symmetry
The triskelion appearing on the Isle of Man flag.
Generally speaking, an object with rotational symmetry is an object that looks the same
after a certain amount of rotation. An object may have more than one rotational
symmetry; for instance, if reflections or turning it over are not counted, the triskelion
appearing on the Isle of Man's flag has three rotational symmetries (or "a threefold
rotational symmetry"). More examples may be seen below. The degree of rotational
symmetry is the number of degrees through which the shape must be turned to look the
same with a different side or vertex in the original position; a full turn, which brings
back the same side or vertex, does not count.
Formal treatment
Formally, rotational symmetry is symmetry with respect to some or all rotations in m-
dimensional Euclidean space. Rotations are direct isometries, i.e., isometries preserving
orientation. Therefore a symmetry group of rotational symmetry is a subgroup of E+(m).
Symmetry with respect to all rotations about all points implies translational symmetry
with respect to all translations, so space is homogeneous, and the symmetry group is the
whole E(m). With the modified notion of symmetry for vector fields the symmetry group
can also be E+(m).
For symmetry with respect to rotations about a point we can take that point as origin.
These rotations form the special orthogonal group SO(m), the group of m×m orthogonal
matrices with determinant 1. For m=3 this is the rotation group.
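The defining conditions of SO(m) can be checked numerically. The following is a minimal sketch for m = 2 in plain Python; the helper names (`rotation_2d`, `is_special_orthogonal_2d`) are illustrative and do not come from the text.

```python
import math

def rotation_2d(theta):
    """Matrix rotating the plane counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def det_2x2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_special_orthogonal_2d(m, tol=1e-12):
    """True if m^T m = I and det m = 1, within the tolerance."""
    p = matmul(transpose(m), m)
    orthogonal = all(abs(p[i][j] - (1.0 if i == j else 0.0)) < tol
                     for i in range(2) for j in range(2))
    return orthogonal and abs(det_2x2(m) - 1.0) < tol

R = rotation_2d(math.radians(120))       # a threefold rotation
reflection = [[1.0, 0.0], [0.0, -1.0]]   # orthogonal, but determinant -1

print(is_special_orthogonal_2d(R))           # rotation: belongs to SO(2)
print(is_special_orthogonal_2d(reflection))  # reflection: does not
```

The reflection passes the orthogonality test but fails the determinant test, which is what separates direct isometries from the rest of the orthogonal group.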
In another meaning of the word, the rotation group of an object is the symmetry group
within E+(m), the group of direct isometries; in other words, the intersection of the full
symmetry group and the group of direct isometries. For chiral objects it is the same as the
full symmetry group.
The notation for n-fold symmetry is Cₙ or simply "n". The actual symmetry group is
specified by the point or axis of symmetry, together with the n. For each point or axis of
symmetry the abstract group type is the cyclic group Zₙ of order n. Although for the latter
also the notation Cₙ is used, the geometric and abstract Cₙ should be distinguished: there
are other symmetry groups of the same abstract group type which are geometrically
different.
Examples without additional reflection symmetry:
• n = 2, 180°: the dyad, quadrilaterals with this symmetry are the parallelograms;
other examples: letters Z, N, S; apart from the colors: yin and yang
• n = 3, 120°: triad, triskelion, Borromean rings; sometimes the term trilateral
symmetry is used;
• n = 4, 90°: tetrad, swastika
• n = 6, 60°: hexad, Raëlian symbol (new version)
• n = 8, 45°: octad, octagonal muqarnas ceiling (computer-generated)
If there is, e.g., rotational symmetry with respect to an angle of 100°, then there is also
symmetry with respect to one of 20°, the greatest common divisor of 100° and 360°.
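The greatest-common-divisor rule can be tried directly. This is a small sketch assuming whole-degree angles; `rotational_order` is a hypothetical helper name.

```python
import math

def rotational_order(angle_degrees):
    """Smallest symmetry angle, and the fold order n, implied by symmetry
    under a rotation of angle_degrees (a whole number of degrees)."""
    smallest = math.gcd(angle_degrees, 360)
    return smallest, 360 // smallest

smallest_angle, n = rotational_order(100)
print(smallest_angle, n)
```

For 100° this gives a smallest angle of 20°, that is, an 18-fold symmetry.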
A typical 3D object with rotational symmetry (possibly also with perpendicular axes) but
no mirror symmetry is a propeller.
Examples
C2 (more examples)
Roundabout traffic sign
Snoldelev Stone's interlocked drinking horns design
For discrete symmetry with multiple symmetry axes through the same point, there are the
following possibilities:
In the case of the Platonic solids, the 2-fold axes are through the midpoints of opposite
edges, the number of them is half the number of edges. The other axes are through
opposite vertices and through centers of opposite faces, except in the case of the
tetrahedron, where the 3-fold axes are each through one vertex and the center of one face.
Rotational symmetry with respect to any angle is, in two dimensions, circular symmetry.
The fundamental domain is a half-line.
In three dimensions we can distinguish cylindrical symmetry and spherical symmetry
(no change when rotating about one axis, or for any rotation). That is, no dependence on
the angle using cylindrical coordinates and no dependence on either angle using spherical
coordinates. The fundamental domain is a half-plane through the axis, and a radial half-
line, respectively. Axisymmetric or axisymmetrical are adjectives which refer to an
object having cylindrical symmetry, or axisymmetry. An example of approximate
spherical symmetry is the Earth (with respect to density and other physical and chemical
properties).
Rotational symmetry with translational symmetry
2-fold rotational symmetry together with single translational symmetry is one of the
Frieze groups. There are two rotocenters per primitive cell.
Together with double translational symmetry the rotation groups are the following
wallpaper groups, with axes per primitive cell:
• 2-fold rotocenters (including possible 4-fold and 6-fold), if present at all, form the
translate of a lattice equal to the translational lattice, scaled by a factor 1/2. In the
case of translational symmetry in one dimension, a similar property applies, though
the term "lattice" does not apply.
• 3-fold rotocenters (including possible 6-fold), if present at all, form a regular
hexagonal lattice equal to the translational lattice, rotated by 30° (or equivalently
90°), and scaled by a factor 1/√3.
• 4-fold rotocenters, if present at all, form a regular square lattice equal to the
translational lattice, rotated by 45°, and scaled by a factor 1/√2.
• 6-fold rotocenters, if present at all, form a regular hexagonal lattice which is the
translate of the translational lattice.
Arrangement within a primitive cell of 2-, 3-, and 6-fold rotocenters, alone or in
combination (consider the 6-fold symbol as a combination of a 2- and a 3-fold symbol);
in the case of 2-fold symmetry only, the shape of the parallelogram can be different. For
the case p6, a fundamental domain is indicated in yellow.
Scaling of a lattice divides the number of points per unit area by the square of the scale
factor. Therefore the number of 2-, 3-, 4-, and 6-fold rotocenters per primitive cell is 4, 3,
2, and 1, respectively, again including 4-fold as a special case of 2-fold, etc.
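The counts 4, 3, 2 and 1 follow from dividing by the square of each scale factor. The sketch below takes the scale factors 1/2, 1/√3, 1/√2 and 1 for the 2-, 3-, 4- and 6-fold rotocenter lattices; these are the standard wallpaper-group values and should be read as assumptions of the sketch, as is the name `rotocenters_per_cell`.

```python
import math

# Assumed scale factor of each rotocenter lattice relative to the
# translational lattice (standard wallpaper-group values).
SCALE_FACTOR = {
    2: 1 / 2,
    3: 1 / math.sqrt(3),
    4: 1 / math.sqrt(2),
    6: 1.0,
}

def rotocenters_per_cell(fold):
    """Points per primitive cell: scaling a lattice by s multiplies the
    number of points per unit area by 1 / s**2."""
    return round(1 / SCALE_FACTOR[fold] ** 2)

counts = [rotocenters_per_cell(fold) for fold in (2, 3, 4, 6)]
print(counts)
```

Rounding is used only to absorb floating-point error in the square roots.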
3-fold rotational symmetry at one point and 2-fold at another one (or ditto in 3D with
respect to parallel axes) implies rotation group p6, i.e. double translational symmetry and
6-fold rotational symmetry at some point (or, in 3D, parallel axis). The translation
distance for the symmetry generated by one such pair of rotocenters is 2√3 times their
distance.
Chapter 2
Angular Momentum Operator
In quantum mechanics, the angular momentum operator is an operator analogous to
classical angular momentum. The angular momentum operator plays a central role in the
theory of atomic physics and other quantum problems involving rotational symmetry. In
both classical and quantum mechanical systems, angular momentum (together with linear
momentum and energy) is one of the three fundamental properties of motion.
Intuitive meaning
Angular momentum quantifies the rotational aspect of motion. Like energy and linear
momentum, angular momentum in an isolated system is conserved. The concept of an
angular momentum operator is necessary in quantum mechanics, as calculations of
angular momentum must be made upon a wave function, rather than on a point or rigid
body as classical calculations entail. This is because at the scale of quantum mechanics,
the matter analyzed is best described by a wave equation or probability amplitude, rather
than as a collection of fixed points or as a rigid body. Vector calculus is used in
calculations of angular momentum, as angular momentum has components in each of the
three spatial dimensions.
Mathematical definition
Angular momentum L is mathematically defined as the cross product of a wave
function's position operator (r) and momentum operator (p):
$$\mathbf{L} = \mathbf{r} \times \mathbf{p}$$
In the special case of a single particle with no electric charge and no spin, the angular
momentum operator can be written in the position basis as a single vector equation:
$$\mathbf{L} = -i\hbar\,(\mathbf{r} \times \nabla)$$
where ∇ is the gradient operator. This is a commonly encountered form of the angular
momentum operator, though not the most general one.
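The components of the angular momentum operator do not commute with one another:
$$[L_i, L_j] = i\hbar\,\varepsilon_{ijk} L_k$$
where ε_ijk is the Levi-Civita symbol and a sum over the repeated index k is implied.
However, the square of the total angular momentum (L²) (defined as the sum of the
squares of the three Cartesian components) commutes with its components as follows:
$$[L_i, L^2] = 0$$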
This means that no two individual components of quantum angular momentum can be
simultaneously specified for a given system, whereas the total angular momentum can be
simultaneously specified along with any one of the operator's components. The lack of
commutation of the individual components of the angular momentum is an instance of what
is known in physics as an uncertainty principle.
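These statements can be checked on a concrete matrix representation. The sketch below uses the standard 3×3 matrices for l = 1 in units where ħ = 1 (an assumption of the example, not something given in the text) and verifies that Lx and Ly fail to commute while L² commutes with each component.

```python
import math

S = 1 / math.sqrt(2)

# l = 1 matrices in the basis m = +1, 0, -1, with hbar set to 1 (assumed).
LX = [[0, S, 0], [S, 0, S], [0, S, 0]]
LY = [[0, -1j * S, 0], [1j * S, 0, -1j * S], [0, 1j * S, 0]]
LZ = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def commutator(a, b):
    """[a, b] = ab - ba."""
    return add(matmul(a, b), scale(-1, matmul(b, a)))

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

ZERO = scale(0, LZ)
L_SQUARED = add(add(matmul(LX, LX), matmul(LY, LY)), matmul(LZ, LZ))

print(close(commutator(LX, LY), scale(1j, LZ)))   # [Lx, Ly] = i Lz
print(close(commutator(LX, LY), ZERO))            # components do not commute
print(all(close(commutator(L_SQUARED, c), ZERO) for c in (LX, LY, LZ)))
```

For l = 1 the matrix L² comes out as 2 times the identity, matching l(l+1) = 2, which is why it commutes with everything in this representation.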
Even more importantly, the angular momentum operator commutes with the Hamiltonian
of such a chargeless and spinless particle when it is in a central potential (i.e., when the
potential energy function depends only on |r|):
$$[L_i, H] = 0$$
The Hamiltonian H represents the energy of the system and is used to generate
translations through time. Thus, operators which commute with H represent conserved
quantities. In this case, there is an exact analogy to the classical conservation of angular
momentum in central potentials.
The first commutation relation above is an example of what is generally known as a Lie
algebra. In this case, the Lie algebra is that of SU(2) or SO(3), the rotation group in three
dimensions. The second commutation relation indicates that L² is a Casimir invariant.
The third commutation relation states that the angular momentum is a constant of motion,
and is a special case of Liouville's equation for quantum mechanics, or more precisely, of
Ehrenfest's theorem.
In classical physics
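When solving to find eigenstates of this operator, we obtain the following
$$L^2\,|l, m\rangle = \hbar^2\, l(l+1)\,|l, m\rangle$$
$$L_z\,|l, m\rangle = \hbar\, m\,|l, m\rangle$$
where
$$\langle \theta, \varphi \,|\, l, m \rangle = Y_{l,m}(\theta, \varphi)$$
are the spherical harmonics.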
Chapter 3
Angular Momentum
This gyroscope remains upright while spinning due to its angular momentum.
The angular momentum L of a particle about a given origin is defined as:
$$\mathbf{L} = \mathbf{r} \times \mathbf{p}$$
where r is the particle's position from the origin, p = mv is its linear momentum, and ×
denotes the cross product.
The angular momentum of a system of particles (e.g. a rigid body) is the sum of angular
momenta of the individual particles. For a rigid body rotating around an axis of symmetry
(e.g. the blades of a ceiling fan), the angular momentum can be expressed as the product of
the body's moment of inertia I (a measure of an object's resistance to changes in its
rotation rate) and its angular velocity ω:
$$\mathbf{L} = I\boldsymbol{\omega}$$
In this way, angular momentum is sometimes described as the rotational analog of linear
momentum.
Angular momentum is conserved in a system where there is no net external torque, and
its conservation helps explain many diverse phenomena. For example, the increase in
rotational speed of a spinning figure skater as the skater's arms are contracted is a
consequence of conservation of angular momentum. The very high rotational rates of
neutron stars can also be explained in terms of angular momentum conservation.
Moreover, angular momentum conservation has numerous applications in physics and
engineering (e.g. the gyrocompass).
Relationship between force (F), torque (τ), momentum (p), and angular momentum (L)
vectors in a rotating system
Definition
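The angular momentum L of a particle about a given origin is defined as:
$$\mathbf{L} = \mathbf{r} \times \mathbf{p}$$
where r is the position vector of the particle relative to the origin, p is the linear
momentum of the particle, and × denotes the cross product.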
As seen from the definition, the derived SI units of angular momentum are newton metre
seconds (N·m·s or kg·m²·s⁻¹) or joule seconds. Because of the cross product, L is a
pseudovector perpendicular to both the radial vector r and the momentum vector p, and
its direction is given by the right-hand rule.
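As an illustration of the cross-product definition, the sketch below computes L = r × p for a single particle and confirms that the result is perpendicular to both r and p. The mass, position and velocity are made-up numbers chosen for easy arithmetic.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

mass = 2.0                      # kg (hypothetical)
r = (1.0, 0.0, 0.0)             # m, position relative to the origin
v = (0.0, 3.0, 0.0)             # m/s
p = tuple(mass * vi for vi in v)

L = cross(r, p)                 # kg·m²/s
print(L)
print(dot(L, r), dot(L, p))     # both zero: L is perpendicular to r and p
```

With r along x and p along y, the right-hand rule puts L along +z, which is what the computed vector shows.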
For an object with a fixed mass that is rotating about a fixed symmetry axis, the angular
momentum is expressed as the product of the moment of inertia of the object and its
angular velocity vector:
$$\mathbf{L} = I\boldsymbol{\omega}$$
where I is the moment of inertia of the object (in general, a tensor quantity), and ω is the
angular velocity.
The angular momentum of a particle or rigid body in rectilinear motion (pure translation)
is a vector with constant magnitude and direction. If the path of the particle or rigid body
passes through the given origin, its angular momentum is zero.
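Both claims about rectilinear motion can be checked with a short sketch. The particle below moves at constant velocity along a straight line; the numbers and helper names are illustrative assumptions.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def position_at(r0, v, t):
    """Position after time t for straight-line motion from r0 at velocity v."""
    return tuple(r + vi * t for r, vi in zip(r0, v))

def angular_momentum(mass, position, velocity):
    momentum = tuple(mass * vi for vi in velocity)
    return cross(position, momentum)

mass = 1.5
v = (3.0, 0.0, 0.0)

# Path along the line y = 2, which misses the origin.
samples = [angular_momentum(mass, position_at((1.0, 2.0, 0.0), v, t), v)
           for t in (0.0, 1.0, 2.5)]

# Path along the x axis, which passes through the origin.
through_origin = angular_momentum(mass, position_at((-2.0, 0.0, 0.0), v, 0.5), v)

print(samples)
print(through_origin)
```

The angular momentum is the same at every sampled time because only the perpendicular distance from the origin to the line (here 2) enters the cross product.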
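If a system consists of several particles, the total angular momentum about a point can be
obtained by adding (or integrating) all the angular momenta of the constituent particles:
$$\mathbf{L} = \sum_i \mathbf{R}_i \times m_i \mathbf{V}_i$$
where Rᵢ is the position vector of particle i from the reference point, mᵢ is its mass, and Vᵢ
is its velocity. The center of mass is defined by:
$$\mathbf{R} = \frac{1}{M} \sum_i m_i \mathbf{R}_i$$
where M = Σ mᵢ is the total mass of all the particles.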
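If we define rᵢ as the displacement of particle i from the center of mass, and vᵢ as the
velocity of particle i with respect to the center of mass, then we have
$$\mathbf{R}_i = \mathbf{R} + \mathbf{r}_i$$
and
$$\mathbf{V}_i = \mathbf{V} + \mathbf{v}_i$$
and also
$$\sum_i m_i \mathbf{r}_i = 0$$
and
$$\sum_i m_i \mathbf{v}_i = 0$$
so that the total angular momentum with respect to the reference point is
$$\mathbf{L} = \sum_i (\mathbf{R} + \mathbf{r}_i) \times m_i (\mathbf{V} + \mathbf{v}_i) = \mathbf{R} \times M\mathbf{V} + \sum_i \mathbf{r}_i \times m_i \mathbf{v}_i$$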
The first term is just the angular momentum of the center of mass. It is the same angular
momentum one would obtain if there were just one particle of mass M moving at velocity
V located at the center of mass. The second term is the angular momentum that is the
result of the particles moving relative to their center of mass. This second term can be
even further simplified if the particles form a rigid body, in which case it is the product of
moment of inertia and angular velocity of the spinning motion (as above). The same
result is true if the discrete point masses discussed above are replaced by a continuous
distribution of matter.
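The split into a center-of-mass term and a relative term can be verified numerically. The sketch below uses three point masses with made-up positions and velocities and compares the direct sum with the two-term decomposition; all names are illustrative.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(k, a):
    return tuple(k * x for x in a)

def vector_sum(vectors):
    total = (0.0, 0.0, 0.0)
    for vec in vectors:
        total = add(total, vec)
    return total

masses = [1.0, 2.0, 3.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 1.0)]
velocities = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 2.0)]

M = sum(masses)
R = scale(1 / M, vector_sum(scale(m, Ri) for m, Ri in zip(masses, positions)))
V = scale(1 / M, vector_sum(scale(m, Vi) for m, Vi in zip(masses, velocities)))

# Direct sum over particles, about the reference point.
L_direct = vector_sum(cross(Ri, scale(m, Vi))
                      for m, Ri, Vi in zip(masses, positions, velocities))

# Center-of-mass term plus motion relative to the center of mass.
L_center = cross(R, scale(M, V))
L_relative = vector_sum(cross(sub(Ri, R), scale(m, sub(Vi, V)))
                        for m, Ri, Vi in zip(masses, positions, velocities))
L_split = add(L_center, L_relative)

print(L_direct)
print(L_split)
```

The two results agree up to rounding, because the cross terms vanish once the mass-weighted sums of the relative positions and velocities are zero.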
For many applications where one is only concerned about rotation around one axis, it is
sufficient to discard the pseudovector nature of angular momentum, and treat it like a
scalar where it is positive when it corresponds to a counter-clockwise rotation, and
negative when clockwise. To do this, just take the definition of the cross product and discard
the unit vector, so that angular momentum becomes:
$$L = |\mathbf{r}||\mathbf{p}| \sin\theta_{r,p}$$
where θr,p is the angle between r and p measured from r to p; an important distinction
because without it, the sign of the cross product would be meaningless. From the above,
it is possible to reformulate the definition to either of the following:
$$L = \pm|\mathbf{p}||\mathbf{r}_\perp|$$
where r⊥ is called the lever arm distance to p.
The easiest way to conceptualize this is to consider the lever arm distance to be the
distance from the origin to the line that p travels along. With this definition, it is
necessary to consider the direction of p (pointed clockwise or counter-clockwise) to
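figure out the sign of L. Equivalently:
$$L = \pm|\mathbf{r}||\mathbf{p}_\perp|$$
where p⊥ is the component of p that is perpendicular to r.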
It is a misconception that angular momentum is always about the same axis as angular
velocity. Sometimes the two are not parallel; in these cases the angular momentum
component along the axis of rotation is the product of the angular velocity and the moment of
inertia about the given axis of rotation.
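A small sketch with an inertia tensor that has off-diagonal terms shows L and ω pointing in different directions. The tensor values are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def mat_vec(m, v):
    """Product of a 3x3 matrix (list of rows) with a 3-vector."""
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical inertia tensor (kg·m²) of a body that is not symmetric
# about the z axis, hence the off-diagonal xz terms.
INERTIA = [[2.0, 0.0, -1.0],
           [0.0, 3.0, 0.0],
           [-1.0, 0.0, 4.0]]

axis = (0.0, 0.0, 1.0)               # axis of rotation
omega = tuple(2.0 * a for a in axis) # 2 rad/s about z

L = mat_vec(INERTIA, omega)
L_along_axis = dot(L, axis)
inertia_about_axis = dot(axis, mat_vec(INERTIA, axis))

print(L)                                          # has an x component
print(L_along_axis, inertia_about_axis * 2.0)     # equal
```

The x component of L is what makes it non-parallel to ω, while its z component is still the moment of inertia about z times the angular speed.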
K = Iω² / 2
Conservation of angular momentum
The torque caused by the two opposing forces Fg and -Fg causes a change in the angular
momentum L in the direction of that torque (since torque is the time derivative of angular
momentum). This causes the top to precess.
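The time derivative of angular momentum is the torque:
$$\boldsymbol{\tau} = \frac{d\mathbf{L}}{dt} = \frac{d\mathbf{r}}{dt} \times \mathbf{p} + \mathbf{r} \times \frac{d\mathbf{p}}{dt} = 0 + \mathbf{r} \times \mathbf{F} = \mathbf{r} \times \mathbf{F}$$
(The cross-product of velocity and momentum is zero, because these vectors are parallel.)
So requiring the system to be "closed" here is mathematically equivalent to zero external
torque acting on the system:
$$\mathbf{L}_{\mathrm{system}} = \mathrm{constant} \leftrightarrow \sum \boldsymbol{\tau}_{\mathrm{ext}} = 0$$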
where τext is any torque applied to the system of particles. It is assumed that internal
interaction forces obey Newton's third law of motion in its strong form, that is, that the
forces between particles are equal and opposite and act along the line between the
particles.
In orbits, the angular momentum is distributed between the spin of the planet itself and
the angular momentum of its orbit:
$$\mathbf{L}_{\mathrm{total}} = \mathbf{L}_{\mathrm{spin}} + \mathbf{L}_{\mathrm{orbit}}$$
If a planet is found to rotate slower than expected, then astronomers suspect that the
planet is accompanied by a satellite, because the total angular momentum is shared
between the planet and its satellite in order to be conserved.
The conservation of angular momentum explains the angular acceleration of an ice skater
as she brings her arms and legs close to the vertical axis of rotation. By bringing part of
the mass of her body closer to the axis she decreases her body's moment of inertia. Because
angular momentum is constant in the absence of external torques, the angular velocity
(rotational speed) of the skater has to increase.
The same phenomenon results in the extremely fast spin of compact stars (like white dwarfs,
neutron stars and black holes) when they are formed out of much larger and more slowly
rotating stars (indeed, decreasing the size of an object 10⁴ times results in an increase of its
angular velocity by a factor of 10⁸).
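The skater and the collapsing star are both applications of I₁ω₁ = I₂ω₂. The sketch below assumes that mass and shape are unchanged during the contraction, so that the moment of inertia scales with the square of the size; the skater's numbers are hypothetical.

```python
import math

def angular_velocity_after(inertia_before, omega_before, inertia_after):
    """Angular velocity that keeps L = I * omega unchanged when I changes."""
    return inertia_before * omega_before / inertia_after

def spin_up_factor(size_reduction):
    """Growth factor of omega when the size shrinks by size_reduction at
    fixed mass and shape, assuming I is proportional to size squared."""
    return size_reduction ** 2

# Skater pulling in her arms: I drops from 4.0 to 1.6 kg·m² (made-up values).
skater_omega = angular_velocity_after(4.0, 2.0, 1.6)

# Star shrinking by a factor of 10**4 in size.
star_factor = spin_up_factor(1.0e4)

print(skater_omega)
print(star_factor)
```

The quadratic dependence on size is why a 10⁴-fold contraction yields a 10⁸-fold increase in angular velocity.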
Angular momentum in quantum mechanics
In quantum mechanics, angular momentum is quantized – that is, it cannot vary
continuously, but only in "quantum leaps" between certain allowed values. The orbital
angular momentum of a subatomic particle, that is due to its motion through space, is
always a whole-number multiple of ħ ("h-bar," known as the reduced Planck's constant).
Furthermore, experiments show that most subatomic particles have a permanent, built-in
angular momentum, which is not due to their motion through space. This spin angular
momentum comes in units of ħ/2. For example, an electron standing at rest has an
angular momentum of ħ/2.
Basic definition
The classical definition of angular momentum as L = r × p depends on six numbers:
rx, ry, rz, px, py, and pz. Translating this into quantum-mechanical terms, the Heisenberg
uncertainty principle tells us that it is not possible for all six of these numbers to be
measured simultaneously with arbitrary precision. Therefore, there are limits to what can
be known or measured about a particle's angular momentum. It turns out that the best that
one can do is to simultaneously measure both the angular momentum vector's magnitude
and its component along one axis.
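Mathematically, angular momentum in quantum mechanics is defined not as a quantity but
as an operator on the wave function:
$$\mathbf{L} = \mathbf{r} \times \mathbf{p}$$
where r and p are the position and momentum operators respectively. In particular, for a
single particle with no electric charge and no spin, the angular momentum operator can
be written in the position basis as
$$\mathbf{L} = -i\hbar\,(\mathbf{r} \times \nabla)$$
where ∇ is the vector differential operator del (also called "nabla"). This orbital angular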
momentum operator is the most commonly encountered form of the angular momentum
operator, though not the only one. It satisfies the following canonical commutation
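relations:
$$[L_i, L_j] = i\hbar\,\varepsilon_{ijk} L_k$$
where ε_ijk is the (antisymmetric) Levi-Civita symbol and a sum over the repeated index k
is implied. From this follows
$$[L_i, L^2] = 0$$
Since
$$L_x = -i\hbar \left( y \frac{\partial}{\partial z} - z \frac{\partial}{\partial y} \right), \quad L_y = -i\hbar \left( z \frac{\partial}{\partial x} - x \frac{\partial}{\partial z} \right), \quad L_z = -i\hbar \left( x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x} \right)$$
it follows, for example,
$$[L_x, L_y] = i\hbar L_z$$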
Given a quantized total angular momentum j which is the sum of two individual
quantized angular momenta l1 and l2,
$$\mathbf{j} = \mathbf{l}_1 + \mathbf{l}_2$$
the quantum number j associated with its magnitude can range from | l1 − l2 | to l1 + l2 in
integer steps where l1 and l2 are quantum numbers corresponding to the magnitudes of the
individual angular momenta.
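The range of j can be enumerated directly. This is a minimal sketch using exact fractions so that half-integer quantum numbers are handled without rounding; `allowed_j` is a hypothetical helper name.

```python
from fractions import Fraction

def allowed_j(l1, l2):
    """Quantum numbers j from |l1 - l2| to l1 + l2 in integer steps."""
    l1, l2 = Fraction(l1), Fraction(l2)
    j = abs(l1 - l2)
    values = []
    while j <= l1 + l2:
        values.append(j)
        j += 1
    return values

print([str(j) for j in allowed_j(1, Fraction(1, 2))])  # l = 1 with s = 1/2
print([str(j) for j in allowed_j(2, 1)])
```

Coupling l1 = 1 with l2 = 1/2 yields j = 1/2 and 3/2, and coupling 2 with 1 yields j = 1, 2 and 3.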
If φ is the angle around a specific axis, for example the azimuthal angle around the z axis,
then the angular momentum along this axis is the generator of rotations around this axis:
$$L_z = -i\hbar \frac{\partial}{\partial \varphi}$$
The eigenfunctions of Lz are therefore exp(i·ml·φ), and since φ has a period of 2π, ml must be
an integer.
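The periodicity argument can be illustrated numerically: a function of the form exp(i·m·φ) returns to the same value after φ increases by 2π only when m is an integer. The sketch below is illustrative, and the sample angle 0.7 is arbitrary.

```python
import cmath
import math

def is_single_valued(m, phi=0.7, tol=1e-9):
    """True if exp(i*m*phi) is unchanged when phi increases by 2*pi."""
    before = cmath.exp(1j * m * phi)
    after = cmath.exp(1j * m * (phi + 2 * math.pi))
    return abs(after - before) < tol

print(is_single_valued(2))     # integer m: single-valued
print(is_single_valued(0.5))   # half-integer m: changes sign after one turn
```

For half-integer m the function picks up a factor of −1 after a full turn, so it cannot serve as a single-valued function of position.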
For a particle with a spin S, this takes into account only the angular dependence of the
location of the particle, for example its orbit in an atom. It is therefore known as orbital
angular momentum. However, when one rotates the system, one also changes the spin.
Therefore the total angular momentum, which is the full generator of rotations, is Ji = Li +
Si. Being an angular momentum, J satisfies the same commutation relations as L, as will
be explained below, namely
$$[J_i, J_j] = i\hbar\,\varepsilon_{ijk} J_k$$
Acting with J on the wavefunction ψ of a particle generates a rotation: exp(iφJz)ψ is the
wavefunction ψ rotated around the z axis by an angle φ (with Jz measured in units of ħ).
For an infinitesimal rotation by an angle dφ, the rotated wavefunction is ψ + idφJzψ. This
is similarly true for rotations around any axis.
In a charged particle the momentum gets a contribution from the electromagnetic field,
and the angular momenta L and J change accordingly.
Since angular momentum is the generator of rotations, its commutation relations follow
the commutation relations of the generators of the three-dimensional rotation group
SO(3). This is why J always satisfies these commutation relations. In d dimensions, the
angular momentum will satisfy the same commutation relations as the generators of the
d-dimensional rotation group SO(d).
SO(3) has the same Lie algebra (i.e. the same commutation relations) as SU(2).
Generators of SU(2) can have half-integer eigenvalues, and so can mj. Indeed for
fermions the spin S and total angular momentum J are half-integer. In fact this is the
most general case: j and mj are either integers or half-integers.
Technically, this is because the universal cover of SO(3) is isomorphic to SU(2), and the
representations of the latter are fully known. The Ji span the Lie algebra and J² is the Casimir
invariant, and it can be shown that if the eigenvalues of Jz and J² are mj and j(j+1) then
mj and j are both integer multiples of one-half. j is non-negative and mj takes values
between −j and j in integer steps.
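Angular momentum operators usually occur when solving a problem with spherical
symmetry in spherical coordinates. Then, the angular momentum in space representation
is:
$$L^2 = -\hbar^2 \left[ \frac{1}{\sin\theta} \frac{\partial}{\partial\theta} \left( \sin\theta \frac{\partial}{\partial\theta} \right) + \frac{1}{\sin^2\theta} \frac{\partial^2}{\partial\varphi^2} \right]$$
where θ is the polar angle and φ the azimuthal angle. Its eigenfunctions are the spherical
harmonics Yl,m(θ, φ), with eigenvalues ħ²l(l+1).
Thus, a particle whose wave function is the spherical harmonic Yl,m has an orbital angular
momentum
$$L = \hbar \sqrt{l(l+1)}$$
with a z-component
$$L_z = m\hbar$$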
For a charged particle in an electromagnetic field, the canonical momentum p is not gauge
invariant; the physical (kinetic) momentum is
$$\mathbf{p} - \frac{e\mathbf{A}}{c}$$
where e is the electric charge, c the speed of light and A the vector potential. Thus, for
example, the Hamiltonian of a charged particle of mass m in an electromagnetic field is
then
$$H = \frac{1}{2m} \left( \mathbf{p} - \frac{e\mathbf{A}}{c} \right)^2 + e\varphi$$
where φ is the scalar potential. This is the Hamiltonian that gives the Lorentz force law.
The gauge-invariant angular momentum, or "kinetic angular momentum" is given by
$$\mathbf{K} = \mathbf{r} \times \left( \mathbf{p} - \frac{e\mathbf{A}}{c} \right)$$
Chapter 4
Angular Momentum Coupling
In quantum mechanics, the procedure of constructing eigenstates of total angular
momentum out of eigenstates of separate angular momenta is called angular momentum
coupling. For instance, the orbit and spin of a single particle can interact through spin-
orbit interaction, in which case the complete physical picture must include spin-orbit
coupling. Or two charged particles, each with a well-defined angular momentum, may
interact by Coulomb forces, in which case coupling of the two one-particle angular
momenta to a total angular momentum is a useful step in the solution of the two-particle
Schrödinger equation. In both cases the separate angular momenta are no longer
constants of motion, but the sum of the two angular momenta usually still is. Angular
momentum coupling in atoms is of importance in atomic spectroscopy. Angular
momentum coupling of electron spins is of importance in quantum chemistry. Also in the
nuclear shell model angular momentum coupling is ubiquitous.
An example of the first situation is an atom whose electrons only feel the Coulomb field
of its nucleus. If we ignore the electron-electron interaction (and other small interactions
such as spin-orbit coupling), the orbital angular momentum l of each electron commutes
with the total Hamiltonian. In this model the atomic Hamiltonian is a sum of kinetic
energies of the electrons and the spherically symmetric electron-nucleus interactions. The
individual electron angular momenta l(i) commute with this Hamiltonian. That is, they
are conserved properties of this approximate model of the atom.
An example of the second situation is a rigid rotor moving in field-free space. A rigid
rotor has a well-defined, time-independent, angular momentum.
These two situations originate in classical mechanics. The third kind of conserved
angular momentum, associated with spin, does not have a classical counterpart. However,
all rules of angular momentum coupling apply to spin as well.
In general, the conservation of angular momentum implies full rotational symmetry
(described by the groups SO(3) and SU(2)) and, conversely, spherical symmetry implies
conservation of angular momentum. If two or more physical systems have conserved
angular momenta, it can be useful to add these momenta to a total angular momentum of
the combined system—a conserved property of the total system. The building of
eigenstates of the total conserved angular momentum from the angular momentum
eigenstates of the individual subsystems is referred to as angular momentum coupling.
As an example we consider two electrons, 1 and 2, in an atom (say the helium atom). If
there is no electron-electron interaction, but only electron-nucleus interaction, the two
electrons can be rotated around the nucleus independently of each other; nothing happens
to their energy. Both operators, l(1) and l(2), are conserved. However, if we switch on the
electron-electron interaction depending on the distance d(1,2) between the electrons, then
only a simultaneous and equal rotation of the two electrons will leave d(1,2) invariant. In
such a case neither l(1) nor l(2) is a constant of motion but L = l(1) + l(2) is. Given
eigenstates of l(1) and l(2), the construction of eigenstates of L (which still is conserved)
is the coupling of the angular momenta of electron 1 and 2.
Reiterating slightly differently the above: one expands the quantum states of composed
systems (i.e. made of subunits like two hydrogen atoms or two electrons) in basis sets
which are made of tensor products of quantum states which in turn describe the
subsystems individually. We assume that the states of the subsystems can be chosen as
eigenstates of their angular momentum operators (and of their component along any
arbitrary z axis). The subsystems are therefore correctly described by a set of l, m
quantum numbers. When there is interaction between the subsystems, the total
Hamiltonian contains terms that do not commute with the angular momentum operators acting on the
subsystems only. However, these terms do commute with the total angular momentum
operator. Sometimes one refers to the non-commuting interaction terms in the
Hamiltonian as angular momentum coupling terms, because they necessitate the angular
momentum coupling.
Spin-orbit coupling
The behavior of atoms and smaller particles is well described by the theory of quantum
mechanics, in which each particle has an intrinsic angular momentum called spin and
specific configurations (of e.g. electrons in an atom) are described by a set of quantum
numbers. Collections of particles also have angular momenta and corresponding quantum
numbers, and under different circumstances the angular momenta of the parts add in
different ways to form the angular momentum of the whole. Angular momentum
coupling is a category including some of the ways that subatomic particles can interact
with each other.
In the macroscopic world of orbital mechanics, the term spin-orbit coupling is sometimes
used in the same sense as spin-orbital resonance.
LS coupling
In light atoms (generally Z < 30), electron spins si interact among themselves so they
combine to form a total spin angular momentum S. The same happens with orbital
angular momenta li, forming a total orbital angular momentum L. The interaction
between the quantum numbers L and S is called Russell–Saunders coupling or LS
coupling. Then S and L add together and form a total angular momentum J:
$$\mathbf{J} = \mathbf{L} + \mathbf{S}$$
where
$$\mathbf{L} = \sum_i \mathbf{l}_i$$
and
$$\mathbf{S} = \sum_i \mathbf{s}_i$$
This is an approximation which is good as long as any external magnetic fields are weak.
In larger magnetic fields, these two momenta decouple, giving rise to a different splitting
pattern in the energy levels (the Paschen–Back effect), and the size of the LS coupling term
becomes small.
jj coupling
In heavier atoms the situation is different. In atoms with bigger nuclear charges, spin-
orbit interactions are frequently as large as or larger than spin-spin interactions or orbit-orbit
interactions. In this situation, each orbital angular momentum li tends to combine with
the corresponding individual spin angular momentum si, giving rise to individual total
angular momenta ji. These then add up to form the total angular momentum J:
$$\mathbf{J} = \sum_i \mathbf{j}_i = \sum_i (\mathbf{l}_i + \mathbf{s}_i)$$
Spin-spin coupling
Spin-spin coupling is the coupling of the intrinsic angular momentum (spin) of different
particles. Such coupling between pairs of nuclear spins is an important feature of nuclear
magnetic resonance (NMR) spectroscopy as it can provide detailed information about the
structure and conformation of molecules. Spin-spin coupling between nuclear spin and
electronic spin is responsible for hyperfine structure in atomic spectra.
Term symbols
Term symbols are used to represent the states and spectral transitions of atoms; they are
found from the coupling of angular momenta mentioned above. When the state of an atom
has been specified with a term symbol, the allowed transitions can be found through
selection rules by considering which transitions would conserve angular momentum. A
photon has spin 1, and when there is a transition with emission or absorption of a photon
the atom will need to change state to conserve angular momentum. The term symbol
selection rules are: ΔS = 0; ΔL = 0, ±1; Δl = ±1; ΔJ = 0, ±1.
The expression "term symbol" is derived from the "term series" associated with the
Rydberg states of an atom and their energy levels. In the Rydberg formula the frequency
or wave number of the light emitted by a hydrogen-like atom is proportional to the
difference between the two terms of a transition. The series known to early spectroscopy
were designated sharp, principal, diffuse and fundamental and consequently the letters S,
P, D, and F were used to represent the orbital angular momentum states of an atom.
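As a sketch of how term symbols follow from LS coupling, the function below lists the labels for given L and S. It assumes the conventional layout of a term symbol, multiplicity 2S+1 followed by the letter for L and then the value of J, which the text does not spell out; only the letters S, P, D and F named above are supported, and `term_symbols` is a hypothetical helper.

```python
from fractions import Fraction

LETTERS = "SPDF"  # letters for L = 0, 1, 2, 3, as named in the text

def term_symbols(L, S):
    """Term symbols (2S+1, letter for L, then J) for J = |L-S| ... L+S."""
    if not 0 <= L < len(LETTERS):
        raise ValueError("only L = 0..3 (S, P, D, F) is handled in this sketch")
    S = Fraction(S)
    multiplicity = 2 * S + 1
    j = abs(L - S)
    symbols = []
    while j <= L + S:
        symbols.append(str(multiplicity) + LETTERS[L] + str(j))
        j += 1
    return symbols

print(term_symbols(1, Fraction(1, 2)))  # one electron in a p orbital
print(term_symbols(0, 0))               # closed shell
```

A single p electron (L = 1, S = 1/2) gives the doublet pair written here as 2P1/2 and 2P3/2, with the multiplicity and J shown inline rather than as superscript and subscript.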
Relativistic effects
In very heavy atoms, relativistic shifting of the energies of the electron energy levels
accentuates the spin-orbit coupling effect. Thus, for example, uranium molecular orbital
diagrams must directly incorporate relativistic symbols when considering interactions
with other atoms.
Nuclear coupling
In atomic nuclei, the spin-orbit interaction is much stronger than for atomic electrons, and
is incorporated directly into the nuclear shell model. In addition, unlike atomic-electron
term symbols, the lowest energy state is not L − S, but rather, l + s. All nuclear levels
whose l value (orbital angular momentum) is greater than zero are thus split in the shell
model to create states designated by l + s and l − s. Due to the nature of the shell model,
which assumes an average potential rather than a central Coulombic potential, the
nucleons that go into the l + s and l − s nuclear states are considered degenerate within
each orbital (e.g. the 2p3/2 level contains four nucleons, all of the same energy; higher in
energy is the 2p1/2 level, which contains two equal-energy nucleons).
Chapter 5
Angular Velocity
In physics, the angular velocity is a vector quantity (more precisely, a pseudovector)
which specifies the angular speed of an object and the axis about which the object is
rotating. The SI unit of angular velocity is radians per second, although it may be
measured in other units such as degrees per second, revolutions per second, revolutions
per minute, degrees per hour, etc. It is sometimes also called the rotational velocity and
its magnitude the rotational speed, typically measured in cycles or rotations per unit time
(e.g. revolutions per minute). Angular velocity is usually represented by the symbol
omega (ω, rarely Ω).
The direction of the angular velocity vector is perpendicular to the plane of rotation, in a
direction which is usually specified by the right-hand rule.
Angular velocity describes the speed of rotation and the orientation of the instantaneous
axis about which the rotation occurs. The direction of the angular velocity pseudovector
will be along the axis of rotation; in this case (counter-clockwise rotation) the vector
points up.
The angular velocity of the particle at P with respect to the origin O is determined by the
perpendicular component of the velocity vector v.
A radial motion produces no change in the direction of the particle relative to the origin,
so for purposes of finding the angular velocity the parallel (radial) component can be
ignored. Therefore, the rotation is completely produced by the tangential motion (like that
of a particle moving along a circumference), and the angular velocity is completely
determined by the perpendicular (tangential) component.
The rate of change of the angular position φ of the particle is related to
the cross-radial velocity by:
$$v_\perp = |\mathbf{r}|\,\frac{d\phi}{dt}$$
Utilizing θ, the angle between vectors v∥ and v, or equivalently the angle between
vectors r and v, gives:
$$v_\perp = |\mathbf{v}|\,\sin\theta$$
Combining the above two equations and defining the angular velocity as ω = dφ/dt yields:
$$\omega = \frac{|\mathbf{v}|\,\sin\theta}{|\mathbf{r}|}$$
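As a minimal sketch of this relation, the following assumes the two-dimensional result ω = |v| sin(θ) / |r| and uses the identity x·v_y − y·v_x = |r||v| sin(θ). The function name `angular_velocity_2d` and the sample numbers are illustrative and do not come from the text.

```python
def angular_velocity_2d(r, v):
    """Angular velocity about the origin of a particle at r with velocity v.

    r and v are (x, y) tuples. Only the cross-radial part of v contributes.
    """
    x, y = r
    vx, vy = v
    r2 = x * x + y * y
    if r2 == 0.0:
        raise ValueError("angular velocity about the origin is undefined at r = 0")
    # x*vy - y*vx equals |r| |v| sin(theta); dividing by |r|^2 gives |v| sin(theta) / |r|.
    return (x * vy - y * vx) / r2


# Purely tangential motion on a circle of radius 2 at speed 3.
omega_circle = angular_velocity_2d((2.0, 0.0), (0.0, 3.0))
# Adding a radial component (5 along r) leaves the angular velocity unchanged.
omega_spiral = angular_velocity_2d((2.0, 0.0), (5.0, 3.0))
```

The sign of the result follows the convention stated for two dimensions: rotation from the x axis towards the y axis is positive.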
In two dimensions the angular velocity is a single number which has no direction. A
single number which has no direction is either a scalar or a pseudoscalar, the difference
being that a scalar does not change its sign when the x and y axes are exchanged (or
inverted), while a pseudoscalar does. The angle as well as the angular velocity is a
pseudoscalar. The positive direction of rotation is taken, by convention, to be in the
direction towards the y axis from the x axis. If the axes are inverted but the sense of a
rotation is not, then the sign of the angle of rotation, and therefore of the angular velocity
as well, will change.
It is important to note that the pseudoscalar angular velocity of a particle depends upon
the choice of the origin.
In three dimensions, the angular velocity becomes a bit more complicated. The angular
velocity in this case is generally thought of as a vector, or more precisely, a pseudovector.
It now has not only a magnitude, but a direction as well. The magnitude is the angular
speed, and the direction describes the axis of rotation. The right-hand rule indicates the
positive direction of the angular velocity pseudovector.
Let u be a unit vector along the instantaneous rotation axis, oriented so that the rotation
appears counter-clockwise when viewed from the tip of the vector. The angular velocity vector
can then be defined as:
$$\boldsymbol{\omega} = \frac{d\phi}{dt}\,\mathbf{u}$$
Just as in the two dimensional case, a particle will have a component of its velocity along
the radius from the origin to the particle, and another component perpendicular to that
radius. The combination of the origin point and the perpendicular component of the
velocity defines a plane of rotation in which the behavior of the particle (for that instant)
appears just as it does in the two dimensional case. The axis of rotation is then a line
normal to this plane, and this axis defines the direction of the angular velocity
pseudovector, while the magnitude is the same as the pseudoscalar value found in the
2-dimensional case. Using the unit vector u along the rotation axis defined before, the
angular velocity vector may be written in a manner similar to that for two dimensions:
$$\boldsymbol{\omega} = \frac{|\mathbf{v}|\,\sin\theta}{|\mathbf{r}|}\,\mathbf{u}$$
which, by the definition of the cross product, can be written:
$$\boldsymbol{\omega} = \frac{\mathbf{r}\times\mathbf{v}}{|\mathbf{r}|^{2}}$$
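The cross-product form can be sketched in a few lines, assuming the relation ω = (r × v) / |r|². The helper names `cross` and `angular_velocity_3d` are illustrative only.

```python
def cross(a, b):
    """Cross product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def angular_velocity_3d(r, v):
    """Angular velocity pseudovector of a particle about the origin."""
    r2 = sum(c * c for c in r)
    if r2 == 0.0:
        raise ValueError("angular velocity about the origin is undefined at r = 0")
    return tuple(c / r2 for c in cross(r, v))


# Counter-clockwise motion in the xy plane: the pseudovector points along +z.
omega = angular_velocity_3d((2.0, 0.0, 0.0), (0.0, 3.0, 0.0))
```

For motion confined to the xy plane the z component reproduces the two-dimensional pseudoscalar value.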
If a point rotates with angular velocity ω2 in a frame F2 which itself rotates with angular
velocity ω1 with respect to an external frame F1, we can define the sum ω1 + ω2 as the angular
velocity vector of the point with respect to F1.
With the operation defined in this way, angular velocity, which is a pseudovector, also
becomes a real vector because it has two operations (addition and multiplication by a scalar):
This is the definition of a vector space. Therefore pseudovectors are a subset of the real
vectors, despite their name suggesting the opposite. The only property that is difficult
to prove is the commutativity of the addition. This can be proven from the fact
that the velocity tensor W is skew-symmetric. Therefore R = e^{Wt} is a rotation matrix, and
over a time dt it is an infinitesimal rotation matrix. It can therefore be expanded as
$$R = I + W\,dt + \tfrac{1}{2}(W\,dt)^{2} + \dots$$
The composition of rotations is not commutative, but for infinitesimal
rotations the first-order approximation of the previous series can be taken, so that, to first
order in dt, (I + W1·dt)(I + W2·dt) = (I + W2·dt)(I + W1·dt), and therefore ω1 + ω2 = ω2 + ω1.
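The claim that finite rotations do not commute while infinitesimal ones do (to first order) can be checked numerically. The sketch below is illustrative only: it compares rotations about the x and y axes applied in both orders, and the helper names are assumptions rather than anything defined in the text.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def commutator_size(angle):
    """Largest entry of Rx(angle) Ry(angle) - Ry(angle) Rx(angle)."""
    ab = matmul(rot_x(angle), rot_y(angle))
    ba = matmul(rot_y(angle), rot_x(angle))
    return max(abs(ab[i][j] - ba[i][j]) for i in range(3) for j in range(3))


large = commutator_size(1.0)    # finite rotations: order matters
small = commutator_size(1e-3)   # nearly infinitesimal: mismatch is second order
```

The mismatch scales with the square of the angle, so it vanishes at first order, which is what the argument for commutativity of the addition relies on.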
Rotating frames
Given a rotating frame composed of three unit vectors, all three must have the
same angular speed at any instant. In such a frame each vector is a particular case of the
previous case (moving particle), in which the magnitude of the vector is constant.
Though it is just a particular case of the previous one, it is a very important one because of
its relationship with the study of rigid bodies, and special tools have been developed for this
case. There are two possible ways to describe the angular velocity of a rotating frame:
the angular velocity vector and the angular velocity tensor. Both entities are related and
can be calculated from each other.
The angular velocity vector of a frame is defined as the angular velocity of each of the
vectors of the frame, in a way consistent with the general definition.
By Euler's rotation theorem, a rotating frame has an instantaneous axis of rotation at any
instant. In the case of a frame, the angular velocity vector lies along the instantaneous
axis of rotation.
Any cross-section in a plane perpendicular to this axis has to behave as a two-dimensional
rotation. Thus, the magnitude of the angular velocity vector at a given time t
is consistent with the two-dimensional case.
Addition of angular velocity vectors in frames
Schematic construction for addition of angular velocity vectors for rotating frames
As in the general case, the addition operation for angular velocity vectors can be defined
using movement composition. In the case of rotating frames, the movement composition
is simpler than the general case because the final matrix is always a product of rotation
matrices.
any vector e of the frame we obtain , and therefore
As the columns of the matrix of the frame are the components of its vectors, this also
allows ω to be calculated from the matrix of the frame and its derivative.
Diagram showing Euler frame in green
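The components of the angular velocity pseudovector were first calculated by Leonhard
Euler using his Euler angles and an intermediate frame made out of the intermediate
frames of the construction:
• One axis of the reference frame (the precession axis)
• The line of nodes of the moving frame with respect to the reference frame (the nutation axis)
• One axis of the moving frame (the intrinsic rotation axis)
Euler proved that the projection of the angular velocity pseudovector onto each of these three
axes is the derivative of its associated angle (which is equivalent to decomposing the
instantaneous rotation into three instantaneous Euler rotations). Therefore, with u1, u2 and u3
unit vectors along these three axes:
$$\boldsymbol{\omega} = \dot{\alpha}\,\mathbf{u}_1 + \dot{\beta}\,\mathbf{u}_2 + \dot{\gamma}\,\mathbf{u}_3$$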
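This basis is not orthonormal and it is difficult to use, but now the velocity vector can be
changed to the fixed frame or to the moving frame with just a change of basis. For
example, changing to the mobile frame (for the z-x-z convention, with Euler angles α, β and γ):
$$\boldsymbol{\omega} = (\dot{\alpha}\sin\beta\sin\gamma + \dot{\beta}\cos\gamma)\,\mathbf{I} + (\dot{\alpha}\sin\beta\cos\gamma - \dot{\beta}\sin\gamma)\,\mathbf{J} + (\dot{\alpha}\cos\beta + \dot{\gamma})\,\mathbf{K}$$
where I, J, K are unit vectors for the frame fixed in the moving body.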
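The change to the mobile frame can be sketched as follows. This assumes the z-x-z Euler convention, with α the precession angle, β the nutation angle and γ the intrinsic rotation angle, and the standard textbook expressions for the body-frame components; both the convention and the function name `body_angular_velocity` are assumptions, since the original expression is not reproduced in the text.

```python
import math


def body_angular_velocity(alpha_dot, beta_dot, gamma_dot, beta, gamma):
    """Components of omega along the body axes I, J, K (z-x-z convention assumed)."""
    w_i = alpha_dot * math.sin(beta) * math.sin(gamma) + beta_dot * math.cos(gamma)
    w_j = alpha_dot * math.sin(beta) * math.cos(gamma) - beta_dot * math.sin(gamma)
    w_k = alpha_dot * math.cos(beta) + gamma_dot
    return (w_i, w_j, w_k)


# With beta = 0 the precession and intrinsic rotation axes coincide,
# so their rates simply add along K.
aligned = body_angular_velocity(1.0, 0.0, 2.0, beta=0.0, gamma=0.0)

# A generic state, used to check the magnitude against the non-orthonormal basis.
generic = body_angular_velocity(0.4, 0.3, 0.2, beta=0.9, gamma=0.5)
```

Because the three Euler axes are not mutually orthogonal, the squared magnitude is α̇² + β̇² + γ̇² + 2 α̇ γ̇ cos β rather than a plain sum of squares.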
The components of the angular velocity vector can be calculated from infinitesimal
rotations (if available) as follows:
• As any rotation matrix has the eigenvalue +1, the eigenvector belonging to this eigenvalue
gives the rotation axis.
• Its magnitude can be deduced from the value of the infinitesimal rotation.
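The angular velocity tensor can be introduced from rotation matrices. Any vector r that
rotates around an axis with an angular velocity vector ω (as defined before) satisfies:
$$\frac{d\mathbf{r}}{dt} = \boldsymbol{\omega}\times\mathbf{r}$$
We can introduce here the angular velocity tensor associated with the angular velocity
ω = (ωx, ωy, ωz):
$$W(t) = \begin{pmatrix} 0 & -\omega_z(t) & \omega_y(t) \\ \omega_z(t) & 0 & -\omega_x(t) \\ -\omega_y(t) & \omega_x(t) & 0 \end{pmatrix}$$
This tensor W(t) acts as if it were the operator (ω ×):
$$\boldsymbol{\omega}(t)\times\mathbf{r}(t) = W(t)\,\mathbf{r}(t)$$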
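Given the orientation matrix A(t) of a frame, we can obtain its instantaneous angular velocity
tensor W as follows. We know that:
$$\frac{d\mathbf{r}}{dt} = W\,\mathbf{r}$$
As the angular velocity must be the same for the three vectors of a rotating frame A(t), we can
write for all three (the columns of A):
$$\frac{dA}{dt} = W\,A$$
And therefore the angular velocity tensor we are looking for is:
$$W = \frac{dA}{dt}\,A^{-1} = \frac{dA}{dt}\,A^{\mathsf{T}}$$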
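A minimal numerical sketch of this recipe follows, assuming the relation W = (dA/dt)·Aᵀ for an orientation matrix whose columns are the frame vectors. The derivative is approximated by a central finite difference, and the frame used is a simple rotation about the z axis. All function names are illustrative.

```python
import math


def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def angular_velocity_tensor(frame, t, h=1e-6):
    """Approximate W(t) = dA/dt * A^T using a central difference for dA/dt."""
    a_plus, a_minus = frame(t + h), frame(t - h)
    a_dot = [[(a_plus[i][j] - a_minus[i][j]) / (2.0 * h) for j in range(3)]
             for i in range(3)]
    return matmul(a_dot, transpose(frame(t)))


def dual_vector(w):
    """Read (wx, wy, wz) off a skew-symmetric matrix W."""
    return (w[2][1], w[0][2], w[1][0])


rate = 0.7  # rad/s, constant spin about z (illustrative value)
W = angular_velocity_tensor(lambda t: rot_z(rate * t), t=1.3)
omega = dual_vector(W)
```

For this frame the recovered vector should point along z with magnitude equal to the chosen rate, and W should be skew-symmetric up to the finite-difference error.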
In general, the angular velocity in an n-dimensional space is the time derivative of the
angular displacement tensor which is a second rank skew-symmetric tensor.
This tensor W will have n(n-1)/2 independent components and this number is the
dimension of the Lie algebra of the Lie group of rotations of an n-dimensional inner
product space.
Exponential of W
Since dA/dt = W·A, this can be read as a differential equation that defines A(t)
knowing W(t).
If the angular velocity is constant then W is also constant and the equation can be
integrated. Taking A(0) = I, the result is:
$$A(t) = e^{W t}$$
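To illustrate the integrated solution, the sketch below evaluates the matrix exponential of W·t with a truncated power series, assuming a constant angular velocity and the initial orientation A(0) = I. The truncation at 30 terms is an arbitrary choice that is ample for the small arguments used here, and the helper names are illustrative.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def skew(omega):
    """Angular velocity tensor W built from the vector (wx, wy, wz)."""
    wx, wy, wz = omega
    return [[0.0, -wz, wy], [wz, 0.0, -wx], [-wy, wx, 0.0]]


def expm_series(m, terms=30):
    """Matrix exponential I + M + M^2/2! + ... truncated after `terms` terms."""
    result = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, m)]  # M^k / k!
        result = [[result[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return result


def orientation_at(omega, t):
    """A(t) = exp(W t) for constant omega, assuming A(0) = I."""
    return expm_series([[w * t for w in row] for row in skew(omega)])


# Constant spin of 0.5 rad/s about z for 2 s: a rotation by 1 radian about z.
A = orientation_at((0.0, 0.0, 0.5), 2.0)
```

In practice a closed form such as Rodrigues' rotation formula, or a library matrix exponential, would normally replace the hand-written series.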
W is skew-symmetric
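It is possible to prove that angular velocity tensors are skew-symmetric matrices. Since A is
a rotation matrix, A·Aᵀ = I. Differentiating with respect to time, and writing
W = (dA/dt)·Aᵀ, gives:
$$\frac{dA}{dt}\,A^{\mathsf{T}} + A\,\frac{dA^{\mathsf{T}}}{dt} = W + W^{\mathsf{T}} = 0$$
Thus, W is the negative of its transpose, which implies it is a skew-symmetric matrix.
As it is a skew-symmetric matrix it has a Hodge dual vector, which is precisely the
previous angular velocity vector ω = (ωx, ωy, ωz):
$$W = \begin{pmatrix} 0 & -\omega_z & \omega_y \\ \omega_z & 0 & -\omega_x \\ -\omega_y & \omega_x & 0 \end{pmatrix}$$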
Coordinate-free description
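At any instant t, the angular velocity tensor is a linear map between the position vectors r(t)
and the velocity vectors v(t) of a rigid body rotating around the origin:
$$\mathbf{v} = W\,\mathbf{r}$$
where we omitted the t parameter, and regard v and r as elements of the same 3-
dimensional Euclidean vector space V.
The relation between this linear map and the angular velocity pseudovector ω is the
following. Because W is the derivative of an orthogonal transformation, the bilinear form
$$B(\mathbf{r}, \mathbf{s}) = (W\mathbf{r})\cdot\mathbf{s}$$
is skew-symmetric. (Here · stands for the scalar product.) So we can apply
the fact of exterior algebra that there is a unique linear form L on Λ2V such that
$$L(\mathbf{r}\wedge\mathbf{s}) = B(\mathbf{r}, \mathbf{s})$$
where r ∧ s ∈ Λ2V is the wedge product of r and s.
Introducing ω := ⋆L*, the Hodge dual of L* (the bivector dual to L), and applying further
Hodge dual identities, we arrive at
$$(W\mathbf{r})\cdot\mathbf{s} = \star(\star L^{*}\wedge\mathbf{r}\wedge\mathbf{s}) = \star(\boldsymbol{\omega}\wedge\mathbf{r}\wedge\mathbf{s}) = \star(\boldsymbol{\omega}\wedge\mathbf{r})\cdot\mathbf{s} = (\boldsymbol{\omega}\times\mathbf{r})\cdot\mathbf{s}$$
where
$$\boldsymbol{\omega}\times\mathbf{r} := \star(\boldsymbol{\omega}\wedge\mathbf{r})$$
by definition.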
Angular velocity as a vector field
Position of point P located in the rigid body (shown in blue). Ri is the position with
respect to the lab frame, centered at O and ri is the position with respect to the rigid body
frame, centered at O' . The origin of the rigid body frame is at vector position R from the
lab frame.
The same equations for the angular velocity can be obtained by reasoning about a rotating rigid
body. Here it is not assumed that the rigid body rotates around the origin. Instead, it can be
assumed to rotate around an arbitrary point which is moving with a linear velocity V(t) at
each instant.
To obtain the equations it is convenient to imagine a rigid body attached to the frames and
consider a coordinate system that is fixed with respect to the rigid body. Then we will
study the coordinate transformations between this coordinate system and the fixed "laboratory"
system.
As shown in the figure on the right, the lab system's origin is at point O, the rigid body
system origin is at O' and the vector from O to O' is R. A particle (i) in the rigid body is
located at point P and the vector position of this particle is Ri in the lab frame, and at
position ri in the body frame. It is seen that the position of the particle can be written:
$$\mathbf{R}_i = \mathbf{R} + \mathbf{r}_i$$
The defining characteristic of a rigid body is that the distance between any two points in a
rigid body is unchanging in time. This means that the length of the vector $\mathbf{r}_i$ is
unchanging. By Euler's rotation theorem, we may replace the vector $\mathbf{r}_i$ with
$\mathcal{R}\,\mathbf{r}_{io}$, where $\mathcal{R}$ is a 3x3 rotation matrix and $\mathbf{r}_{io}$
is the position of the particle at some fixed point in time, say t=0. This replacement is
useful, because now it is only the rotation matrix $\mathcal{R}$ which is changing in time and
not the reference vector $\mathbf{r}_{io}$, as the rigid body rotates
about point O'. Also, since the three columns of the rotation matrix represent the three
versors of a reference frame rotating together with the rigid body, any rotation about any
axis now becomes visible, while the vector $\mathbf{r}_i$ would not rotate if the rotation axis
were parallel to it, and hence it would only describe a rotation about an axis perpendicular to it
(i.e., it would not see the component of the angular velocity pseudovector parallel to it,
and would only allow the computation of the component perpendicular to it). The
position of the particle is now written as:
$$\mathbf{R}_i = \mathbf{R} + \mathcal{R}\,\mathbf{r}_{io}$$
Taking the time derivative yields the velocity of the particle:
$$\mathbf{V}_i = \mathbf{V} + \frac{d\mathcal{R}}{dt}\,\mathbf{r}_{io}$$
where Vi is the velocity of the particle (in the lab frame) and V is the velocity of O' (the
origin of the rigid body frame). Since $\mathcal{R}$ is a rotation matrix its inverse is its
transpose. So we substitute $I = \mathcal{R}^{\mathsf{T}}\mathcal{R}$:
$$\mathbf{V}_i = \mathbf{V} + \frac{d\mathcal{R}}{dt}\,\mathcal{R}^{\mathsf{T}}\mathcal{R}\,\mathbf{r}_{io} = \mathbf{V} + \frac{d\mathcal{R}}{dt}\,\mathcal{R}^{\mathsf{T}}\,\mathbf{r}_i$$
or
$$\mathbf{V}_i = \mathbf{V} + W\,\mathbf{r}_i$$
where $W = \frac{d\mathcal{R}}{dt}\,\mathcal{R}^{\mathsf{T}}$ is the previous angular velocity tensor.
It can be proved that this is a skew-symmetric matrix, so we can take its dual to get a
3-dimensional pseudovector which is precisely the previous angular velocity vector ω:
$$\boldsymbol{\omega} = [\omega_x, \omega_y, \omega_z]$$
Substituting ω for W into the above velocity expression, and replacing matrix
multiplication by an equivalent cross product:
$$\mathbf{V}_i = \mathbf{V} + \boldsymbol{\omega}\times\mathbf{r}_i$$
It can be seen that the velocity of a point in a rigid body can be divided into two terms -
the velocity of a reference point fixed in the rigid body plus the cross product term
involving the angular velocity of the particle with respect to the reference point. This
angular velocity is the "spin" angular velocity of the rigid body as opposed to the angular
velocity of the reference point O' about the origin O.
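The two-term decomposition can be illustrated with a wheel rolling without slipping. The sketch assumes the relation Vi = V + ω × ri, takes the wheel centre as the reference point O', and uses illustrative numbers (a wheel of radius 0.5 moving along +x at speed 3, with z pointing up).

```python
def cross(a, b):
    """Cross product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def point_velocity(v_ref, omega, r_i):
    """Lab-frame velocity of a body point at r_i relative to the reference point."""
    spin_term = cross(omega, r_i)
    return tuple(v + s for v, s in zip(v_ref, spin_term))


speed, radius = 3.0, 0.5
v_centre = (speed, 0.0, 0.0)
omega = (0.0, speed / radius, 0.0)  # spin about the axle (the y axis)

v_contact = point_velocity(v_centre, omega, (0.0, 0.0, -radius))  # point touching the ground
v_top = point_velocity(v_centre, omega, (0.0, 0.0, radius))       # top of the wheel
```

The contact point comes out at rest while the top moves at twice the centre speed, consistent with the contact point lying on the instantaneous axis of rotation.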
Consistency
We have supposed that the rigid body rotates around an arbitrary point. We should prove
that the angular velocity previously defined is independent of the choice of origin,
which means that the angular velocity is an intrinsic property of the spinning rigid body.
Proving the independence of angular velocity from choice of origin
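See the graph to the right: The origin of the lab frame is O, while O1 and O2 are two fixed
points on the rigid body, whose velocities are v1 and v2 respectively. Suppose the angular
velocity with respect to O1 and O2 is ω1 and ω2 respectively. Let r1 and r2 be the positions
of a point P of the body relative to O1 and O2. Since point P and O2 each have
only one velocity,
$$\mathbf{v}_1 + \boldsymbol{\omega}_1\times\mathbf{r}_1 = \mathbf{v}_2 + \boldsymbol{\omega}_2\times\mathbf{r}_2$$
$$\mathbf{v}_2 = \mathbf{v}_1 + \boldsymbol{\omega}_1\times(\mathbf{r}_1 - \mathbf{r}_2)$$
Combining the two equations gives
$$(\boldsymbol{\omega}_1 - \boldsymbol{\omega}_2)\times\mathbf{r}_2 = 0$$
Since the point P (and thus r2) is arbitrary, it follows that ω1 = ω2.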
If the reference point lies on the instantaneous axis of rotation, the expression for the
velocity of a point in the rigid body has just the angular velocity term. This is because the
velocity of a point on the instantaneous axis of rotation is zero. An example of an
instantaneous axis of rotation is the hinge of a door. Another example is the point of contact
of a purely rolling spherical rigid body.
Chapter 6
Precession
Precession of a gyroscope
Precession is a change in the orientation of the rotation axis of a rotating body. It can be
defined as a change in direction of the rotation axis in which the second Euler angle
(nutation) is constant. In physics, there are two types of precession: torque-free and
torque-induced.
Torque-free
Torque-free precession occurs when the axis of rotation differs slightly from an axis
about which the object can rotate stably: a maximum or minimum principal axis.
Poinsot's construction is an elegant geometrical method for visualizing the torque-free
motion of a rotating rigid body. For example, when a plate is thrown, the plate may have
some rotation around an axis that is not its axis of symmetry. This occurs because the
angular momentum (L) is constant in the absence of torques. Therefore it will have to be
constant in the external reference frame, but the moment of inertia tensor (I) is non-
constant in this frame because of the lack of symmetry. Therefore the spin angular
velocity vector (ωs) about the spin axis will have to evolve in time so that the matrix
product L = Iωs remains constant.
When an object is not perfectly solid, internal vortices will tend to damp torque-free
precession, and the rotation axis will align itself with one of the inertia axes of the body.
The torque-free precession rate of an object with an axis of symmetry, such as a disk,
spinning about an axis not aligned with that axis of symmetry can be calculated as
follows:
$$\omega_p = \frac{I_s\,\omega_s}{I_p\cos\alpha}$$
where ω_p is the precession rate, ω_s is the spin rate about the axis of symmetry, α is the
angle between the axis of symmetry and the axis about which it precesses, I_s is the
moment of inertia about the axis of symmetry, and I_p is the moment of inertia about either of
the other two perpendicular principal axes. They should be the same, due to the symmetry
of the disk.
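As a small worked example, the sketch below evaluates the torque-free precession rate under the assumption that it is given by ω_p = I_s ω_s / (I_p cos α). The thin uniform disk, for which I_s = 2 I_p, and the numerical values are illustrative choices.

```python
import math


def torque_free_precession_rate(spin_rate, tilt_angle, i_s, i_p):
    """Precession rate of a symmetric body: I_s * w_s / (I_p * cos(alpha))."""
    return i_s * spin_rate / (i_p * math.cos(tilt_angle))


# Thin uniform disk of mass m and radius r: I_s = m r^2 / 2 and I_p = m r^2 / 4.
mass, radius = 0.2, 0.15
i_s = 0.5 * mass * radius ** 2
i_p = 0.25 * mass * radius ** 2

# A slightly tilted disk wobbles at close to twice its spin rate.
wobble = torque_free_precession_rate(spin_rate=10.0, tilt_angle=0.05, i_s=i_s, i_p=i_p)
```

For small tilt angles the cosine is close to 1, so the wobble rate of a thin disk approaches 2 ω_s.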
For a generic solid object without any axis of symmetry, the evolution of the object's
orientation, represented (for example) by a rotation matrix R that transforms internal to
external coordinates, may be numerically simulated. Given the object's fixed internal
moment of inertia tensor I_0 and fixed external angular momentum L, the instantaneous
angular velocity is
$$\boldsymbol{\omega}(R) = R\,I_0^{-1}R^{\mathsf{T}}\mathbf{L}$$
Precession occurs by repeatedly recalculating ω and applying a small rotation vector ω dt
for the short time dt, e.g.
$$R_{new} = \exp\left([\boldsymbol{\omega}(R_{old})]_\times\,dt\right)R_{old}$$
for the skew-symmetric matrix [ω]×. The errors
induced by finite time steps tend to increase the rotational kinetic energy,
$$E(R) = \boldsymbol{\omega}(R)\cdot\mathbf{L}/2$$
This unphysical tendency can be counteracted by repeatedly
applying a small rotation vector v perpendicular to both ω and L, noting that
$$E(\exp([\mathbf{v}]_\times)R) \approx E(R) + (\boldsymbol{\omega}(R)\times\mathbf{L})\cdot\mathbf{v}$$
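The stepping scheme described above can be sketched as follows. The sketch assumes the update R_new = exp([ω dt]×) R_old with ω = R I_0⁻¹ Rᵀ L, evaluates the exponential with Rodrigues' formula, and takes a diagonal inertia tensor with illustrative principal moments. It omits the energy-correcting rotation, so a small energy drift is expected.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, x):
    return [sum(a[i][k] * x[k] for k in range(3)) for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def rotation_from_vector(v):
    """exp([v]x) via Rodrigues' formula; |v| is the rotation angle."""
    theta = math.sqrt(sum(c * c for c in v))
    k = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    k2 = matmul(k, k)
    if theta < 1e-12:
        a, b = 1.0, 0.5  # small-angle limits of sin(t)/t and (1 - cos t)/t^2
    else:
        a, b = math.sin(theta) / theta, (1.0 - math.cos(theta)) / theta ** 2
    return [[(1.0 if i == j else 0.0) + a * k[i][j] + b * k2[i][j]
             for j in range(3)] for i in range(3)]


def angular_velocity(rot, inv_moments, ang_mom):
    """omega = R I0^-1 R^T L for a diagonal body-frame inertia tensor."""
    body_l = matvec(transpose(rot), ang_mom)
    return matvec(rot, [inv_moments[i] * body_l[i] for i in range(3)])


def kinetic_energy(rot, inv_moments, ang_mom):
    w = angular_velocity(rot, inv_moments, ang_mom)
    return 0.5 * sum(w[i] * ang_mom[i] for i in range(3))


def simulate(rot, inv_moments, ang_mom, dt, steps):
    """Repeatedly apply the small rotation omega*dt to the orientation."""
    for _ in range(steps):
        w = angular_velocity(rot, inv_moments, ang_mom)
        rot = matmul(rotation_from_vector([c * dt for c in w]), rot)
    return rot


inv_moments = [1.0 / 1.0, 1.0 / 2.0, 1.0 / 3.0]  # illustrative principal moments 1, 2, 3
L = [0.3, 1.0, 0.2]                               # fixed external angular momentum
R0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
E0 = kinetic_energy(R0, inv_moments, L)
R1 = simulate(R0, inv_moments, L, dt=1e-3, steps=2000)
E1 = kinetic_energy(R1, inv_moments, L)
```

Each step is an exact rotation, so R stays orthogonal; comparing E1 with E0 gives a direct measure of the time-stepping error that the correction described in the text is meant to remove.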
Another type of torque-free precession can occur when there are multiple reference
frames at work. For example, the earth is subject to local torque induced precession due
to the gravity of the sun and moon acting upon the earth’s axis, but at the same time the
solar system is moving around the galactic center. Consequently, an accurate
measurement of the earth’s axial reorientation relative to objects outside the frame of the
moving galaxy (such as distant quasars commonly used as precession measurement
reference points) must account for a minor amount of non-local torque-free precession,
due to the solar system’s motion.
Torque-induced
Torque-induced precession (gyroscopic precession) is the phenomenon in which the axis
of a spinning object (e.g. a part of a gyroscope) "wobbles" when a torque is applied to it.
The phenomenon is commonly seen in a spinning toy top, but all rotating objects can
undergo precession. If the speed of the rotation and the magnitude of the torque are
constant the axis will describe a cone, its movement at any instant being at right angles to
the direction of the torque. In the case of a toy top, if the axis is not perfectly vertical the
torque is applied by the force of gravity tending to tip it over.
The response of a rotating system to an applied torque. When the device swivels, and
some roll is added, the wheel tends to pitch.
The device depicted on the right here is gimbal mounted. From inside to outside there are
three axes of rotation: the hub of the wheel, the gimbal axis and the vertical pivot.
To distinguish between the two horizontal axes, rotation around the wheel hub will be
called 'spinning', and rotation around the gimbal axis will be called 'pitching.' Rotation
around the vertical pivot axis is called 'rotation'.
First, imagine that the entire device is rotating around the (vertical) pivot axis. Then,
spinning of the wheel (around the wheel hub) is added. Imagine the gimbal axis to be
locked, so that the wheel cannot pitch. The gimbal axis has sensors that measure whether
there is a torque around the gimbal axis.
In the picture, a section of the wheel has been named dm1. At the depicted moment in
time, section dm1 is at the perimeter of the rotating motion around the (vertical) pivot
axis. Section dm1 therefore has a large velocity with respect to the
rotation around the pivot axis. As dm1 is forced closer to the pivot axis of the rotation
(by the wheel spinning further), the Coriolis effect makes dm1 tend to move in the
direction of the top-left arrow in the diagram (shown at 45°), in the direction of rotation
around the pivot axis. Section dm2 of the wheel starts out at the vertical pivot axis, and
thus initially has zero velocity with respect to the rotation around the
pivot axis, before the wheel spins further. A force (again, a Coriolis force) would be
required to increase section dm2's velocity up to the velocity at the
perimeter of the rotating motion around the pivot axis. If that force is not provided, then
section dm2's inertia will make it move in the direction of the top-right arrow. Note that
both arrows point in the same direction.
The same reasoning applies for the bottom half of the wheel, but there the arrows point in
the opposite direction to that of the top arrows. Combined over the entire wheel, there is a
torque around the gimbal axis when some spinning is added to rotation around a vertical
axis.
It is important to note that the torque around the gimbal axis arises without any delay; the
response is instantaneous.
In the discussion above, the setup was kept unchanging by preventing pitching around the
gimbal axis. In the case of a spinning toy top, when the spinning top starts tilting, gravity
exerts a torque. However, instead of rolling over, the spinning top just pitches a little.
This pitching motion reorients the spinning top with respect to the torque that is being
exerted. The result is that the torque exerted by gravity - via the pitching motion - elicits
gyroscopic precession (which in turn yields a counter torque against the gravity torque)
rather than causing the spinning top to fall to its side.
Gyroscopic precession also plays a large role in the flight controls on helicopters. Since
the driving force behind helicopters is the rotor disk (which rotates), gyroscopic
precession comes into play. If the rotor disk is to be tilted forward (to gain forward
velocity), its rotation requires that the downward net force on the blade be applied
roughly 90 degrees (depending on blade configuration) before, or when the blade is to
one side of the pilot and rotating forward.
To ensure the pilot's inputs are correct, the aircraft has corrective linkages which vary the
blade pitch in advance of the blade's position relative to the swashplate. Although the
swashplate moves in the intuitively correct direction, the blade pitch links are arranged to
transmit the pitch in advance of the blade's position.
Classical (Newtonian)
The torque caused by the two opposing forces Fg and -Fg causes a change in the angular
momentum L in the direction of that torque. This causes the top to precess.
Precession is the result of the angular velocity of rotation and the angular velocity
produced by the torque. It is an angular velocity about a line which makes an angle with
the permanent rotation axis, and this angle lies in a plane at right angles to the plane of
the couple producing the torque. The permanent axis must turn towards this line, since
the body cannot continue to rotate about any line which is not a principal axis of
maximum moment of inertia; that is, the permanent axis turns in a direction at right
angles to that in which the torque might be expected to turn it. If the rotating body is
symmetrical and its motion unconstrained, and if the torque on the spin axis is at right
angles to that axis, the axis of precession will be perpendicular to both the spin axis and
torque axis.
Under these circumstances the angular velocity of precession is given by:
$$\omega_p = \frac{m g r}{I_s\,\omega_s}$$
in which I_s is the moment of inertia, ω_s is the angular velocity of spin about the spin axis,
and m·g and r are the force and the lever arm that produce the torque. The torque vector
originates at the center of mass. Using ω = 2π/T, we find that the period of precession is
given by:
$$T_p = \frac{4\pi^{2} I_s}{m g r\,T_s} = \frac{4\pi^{2} I_s}{\tau\,T_s}$$
in which I_s is the moment of inertia, T_s is the period of spin about the spin axis, and τ is
the torque. In general the problem is more complicated than this, however.
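The two expressions can be cross-checked numerically. The sketch assumes ω_p = m g r / (I_s ω_s) and T_p = 4π² I_s / (m g r T_s); the gyroscope parameters below are made-up values for illustration only.

```python
import math


def precession_rate(mass, g, r, i_s, spin_rate):
    """Angular velocity of precession: torque (m g r) over spin angular momentum."""
    return mass * g * r / (i_s * spin_rate)


def precession_period(mass, g, r, i_s, spin_period):
    """Period of precession expressed through the spin period T_s."""
    return 4.0 * math.pi ** 2 * i_s / (mass * g * r * spin_period)


# Illustrative toy gyroscope: 0.1 kg, centre of mass 3 cm from the pivot,
# I_s = 5e-5 kg m^2, spinning 50 times per second.
mass, g, r, i_s = 0.1, 9.81, 0.03, 5e-5
spin_period = 1.0 / 50.0
spin_rate = 2.0 * math.pi / spin_period

omega_p = precession_rate(mass, g, r, i_s, spin_rate)
period_p = precession_period(mass, g, r, i_s, spin_period)
```

Both routes describe the same motion, so 2π / omega_p and period_p should agree; a faster spin gives a slower precession.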
Relativistic
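The special and general theories of relativity give three types of corrections to the
Newtonian precession of a gyroscope near a large mass such as the Earth, described
above. They are:
• Thomas precession, a special-relativistic correction accounting for the observer being in a
rotating non-inertial frame.
• de Sitter precession, a general-relativistic correction accounting for the Schwarzschild
metric of curved space near a large non-rotating mass.
• Lense–Thirring precession, a general-relativistic correction accounting for the
frame dragging by the Kerr metric of curved space near a large rotating mass.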
Astronomy
In astronomy, precession refers to any of several gravity-induced, slow and continuous
changes in an astronomical body's rotational axis or orbital path.
Axial precession (precession of the equinoxes)
Precessional movement.
Precession of the equinox in relation to the distant stars
Axial precession is the movement of the rotational axis of an astronomical body, whereby
the axis slowly traces out a cone. In the case of the Earth, this type of precession is also
known as the precession of the equinoxes or precession of the equator. The Earth goes
through one such complete precessional cycle in a period of approximately 26,000 years,
during which the positions of stars as measured in the equatorial coordinate system will
slowly change; the change is actually due to the change of the coordinates. Over this
cycle the Earth's north axial pole moves from where it is now, within 1° of Polaris, in a
circle around the ecliptic pole, with an angular radius of about 23.5 degrees (or
approximately 23 degrees 27 arcminutes). The shift is 1 degree in 72 years, where the
angle is taken from the observer, not from the center of the circle.
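The quoted rate and the quoted cycle length can be checked against each other with simple arithmetic. The sketch uses only the figure of 1 degree per 72 years from the text; the variable names are illustrative.

```python
YEARS_PER_DEGREE = 72  # rate quoted in the text

# A full precessional cycle covers 360 degrees.
full_cycle_years = 360 * YEARS_PER_DEGREE

# The same rate expressed in arcseconds per year (1 degree = 3600 arcseconds).
arcseconds_per_year = 3600 / YEARS_PER_DEGREE
```

The result, 25,920 years, is consistent with the approximate figure of 26,000 years, and corresponds to a drift of 50 arcseconds per year.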
Aristarchus of Samos (c. 280 BC) is the earliest known astronomer to recognize and
assess the precession of the equinoxes at almost 1° per century (which is not far from the
actual value for antiquity, 1.38°). The precession was later explained by
Newtonian physics. Being an oblate spheroid, the Earth has a nonspherical shape, bulging
outward at the equator. The gravitational tidal forces of the Moon and Sun apply torque
as they attempt to pull the equatorial bulge into the plane of the ecliptic. The portion of
the precession due to the combined action of the Sun and the Moon is called lunisolar
precession.
The inclination of Earth's orbit drifts up and down. Relative to its present orbit this drift
has a period of about 70,000 years. Relative to the invariable plane it has a 100,000 year
period. The invariable plane represents the angular momentum of the solar system, and is
approximately the orbital plane of Jupiter.
Perihelion precession
Planets revolving around the Sun follow elliptical (oval) orbits that rotate gradually over time
(apsidal precession). The eccentricity of this ellipse is exaggerated for visualization. Most
orbits in the Solar System have a much smaller eccentricity, making them nearly circular.
The orbit of a planet around the Sun is not really an ellipse but a flower-petal shape
because the major axis of each planet's elliptical orbit also precesses within its orbital
plane, partly in response to perturbations in the form of the changing gravitational forces
exerted by other planets. This is called perihelion precession or apsidal precession.
Discrepancies between the observed perihelion precession rate of the planet Mercury and
that predicted by classical mechanics were prominent among the forms of experimental
evidence leading to the acceptance of Einstein's Theory of Relativity (in particular, his
General Theory of Relativity), which accurately predicted the anomalies.
These periodic changes of Earth's orbital parameters, combined with the precession of the
equinoxes and of the inclination of the Earth's axis on its orbit, are an important part of
the astronomical theory of ice ages.
Chapter 7
Euler Angles
The Euler angles are three angles introduced by Leonhard Euler to describe the
orientation of a rigid body. To describe such an orientation in 3-dimensional Euclidean
space three parameters are required.
Euler angles also represent three composed rotations that move a reference frame to a
given referred frame. This is equivalent to saying that any orientation can be achieved by
composing three elemental rotations (rotations around a single axis of a basis), and also
equivalent to saying that any rotation matrix can be decomposed as a product of three
elemental rotation matrices.
Without considering the possibilities of different signs for the angles or moving the
reference frame, there are twelve different conventions divided in two groups. One of
them is called "proper" Euler angles and the other Tait–Bryan angles. Sometimes "Euler
angles" is used for all of them.
Definition
Euler angles - The xyz (fixed) system is shown in blue, the XYZ (rotated) system is
shown in red. The line of nodes, labeled N, is shown in green.
Euler angles are a means of representing the spatial orientation of any frame (coordinate
system) as a composition of rotations from a frame of reference (coordinate system). In
the following the fixed system is denoted in lower case (x,y,z) and the rotated system is
denoted in upper case letters (X,Y,Z).
The definition is static. Given a reference frame and the one whose orientation we want
to describe, first we define the line of nodes (N) as the intersection of the xy and the XY
coordinate planes (in other words, the line of nodes is the line perpendicular to both the z
and Z axes). Then we define its Euler angles as:
• α (or ψ) is the angle between the x-axis and the line of nodes.
• β (or θ) is the angle between the z-axis and the Z-axis.
• γ (or φ) is the angle between the line of nodes and the X-axis.
Euler angles between two frames are defined only if both frames have the same
handedness. Euler angles are just one of the several ways of specifying the relative
orientation of two such coordinate systems. Different authors may use different sets of
angles to describe these orientations, or different names for the same angles, leading to
different conventions. Therefore any discussion employing Euler angles should always be
preceded by their definition.
Normally, angles are defined in such a way that they are positive when they rotate
counter-clockwise. How they rotate depends on which side of the rotation plane we
observe them from; the positive side is the one of the positive axis of rotation.
• The ranges of α and γ are defined modulo 2π radians. A valid range could be (-π, π].
• The range of β covers π radians (but can't be said to be modulo π). For example, it could be
[0, π] or [-π/2, π/2].
The angles α, β and γ are uniquely determined except for the singular case that the xy and
the XY planes are identical, the z axis and the Z axis having the same or opposite
directions. Indeed, if the z-axis and the Z-axis are the same, β = 0 and only (α+γ) is
uniquely defined (not the individual values), and, similarly, if the z-axis and the Z-axis
are opposite, β = π and only (α-γ) is uniquely defined (not the individual values). These
ambiguities are known as gimbal lock in applications.
Conventions
There are two main types of conventions called "proper" Euler angles and Tait–Bryan
angles, after Peter Guthrie Tait and George H. Bryan, also known as Nautical or Cardan
angles, after Cardan.
Their static difference is the definition for the line of nodes. In the first case two
homologous planes (planes overlapping when the angles are zero) are used. In the second
one, they are replaced by non-homologous planes (perpendicular when angles are zero).
Nevertheless, it is unusual to use static conventions when speaking about Euler angles.
The intrinsic rotations equivalence or the extrinsic rotations equivalence is used instead.
According to these equivalences, proper Euler angles are equivalent to three composed
rotations repeating exactly one axis. Tait-Bryan angles are equivalent to three composed
rotations about three different axes.
No information is lost when using the rotation equivalence because the static parameters
can be calculated from the name of the convention. For example, given the convention X-
Y’-Z’’, the first rotation is perpendicular to "x" and the last one to "Z". Therefore the
planes are the yz and the XY, and the line of nodes is the intersection of these two.
There are six possibilities of choosing the proper Euler angles. Using the static definition
they correspond to the three possible combinations of homologous planes (XY, XZ and
YZ) with the two possible options to measure the angles from (given the line of nodes
defined by the XY planes, for example, either X-N or Y-N can be taken as the first angle).
Hence the six possibilities.
There is an intrinsic rotations equivalence which is normally used to name the possible
conventions of Euler angles. If we are told that some angles are given using the
convention Z-X’-Z’’, this means that they are equivalent to three concatenated intrinsic
rotations around some moving axes Z, X’ and Z’’ in that order. This composition is
non-commutative. It has to be applied in such a way that in the beginning one of the
intrinsic axes moves together with the line of nodes. The above diagram convention is
usually named this way.
Nevertheless, sometimes the extrinsic rotations equivalence could be used. If this is the
case, the given angles are backwards, meaning that the first angle is the intrinsic rotation
and the last one the precession. The name of the convention would be indistinguishable
from the previous one, even if the angles' order is the opposite, being something like z-x-
z (here lowercase is used to remark extrinsic composition).
To specify that the given order means intrinsic composition, sometimes a similar notation
is used, but stating explicitly which rotation axes are different for each step, as in
Z-X’-Z’’. Using this notation, Z-X-Z would mean extrinsic composition.
Tait-Bryan angles
Tait-Bryan angles statically defined. Z-X’-Y’’ convention
There are also six Tait-Bryan combinations. They come from the two possible
non-homologous planes that exist when one is given (given XY, there are two
non-homologous planes, XZ and YZ). The three possible planes at the reference frame
multiplied by the two options for each one yield the six possible conventions.
All six combinations behave in an identical way. Using the intrinsic rotations
equivalence, Tait-Bryan angles correspond to three rotations about three different axes,
Z-X-Y for example. The enclosed image shows the ZXY convention. The other five
Tait-Bryan conventions are obtained by selecting different axes of rotation.
These three angles are normally called Heading, Elevation and Bank, or Yaw, Pitch and
Roll. The second terms have to be used carefully because they are also the names for the
three aircraft principal axes.
Both intrinsic and extrinsic conventions can also be used for Tait-Bryan angles, so every
convention name has two possible meanings. For example, X-Y-Z under the intrinsic
convention means that an X rotation is performed first, followed by rotations about the
moved Y and Z axes; under the extrinsic convention it means that after the X rotation,
rotations about the fixed Y and Z axes are performed. The meaning is different in the two
cases.
Geometric derivation
Projections of Z vector.
Projections of Y vector.
The fastest way to get the Euler Angles of a given frame is to write the three given
vectors as columns of a matrix and compare it with the expression of the theoretical
matrix. Hence the three Euler Angles can be calculated.
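Nevertheless, the same result can be reached without matrix calculus, in a more
geometrical way. Assuming a frame with unit vectors (X, Y, Z) as in the main diagram,
and writing Z2, Z3 and Y3 for their components in the fixed frame, it can be seen that:
\[ \cos\beta = Z_3 , \qquad \sin\beta = \sqrt{1 - Z_3^2} , \]
as Z2 is a double projection of a unit vector:
\[ \cos\alpha \, \sin\beta = -Z_2 , \qquad \cos\alpha = -Z_2 / \sqrt{1 - Z_3^2} . \]
There is a similar construction for Y3, projecting it first over the plane defined by the axis
Z and the line of nodes. As the angle between the planes is 90° − β and cos(90° − β) =
sin(β), this leads to:
\[ \sin\beta \, \cos\gamma = Y_3 , \qquad \cos\gamma = Y_3 / \sqrt{1 - Z_3^2} , \]
and finally, using the inverse cosine function arccos:
\[ \alpha = \arccos\!\left( -Z_2 / \sqrt{1 - Z_3^2} \right) , \qquad \beta = \arccos( Z_3 ) , \qquad \gamma = \arccos\!\left( Y_3 / \sqrt{1 - Z_3^2} \right) . \]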
It is interesting to note that the cosine inverse function yields two possible values for the
argument. In this geometrical description only one of the solutions is valid. When Euler
angles are defined as a sequence of rotations all the solutions can be valid, but there will
be only one inside the angles ranges. This is because the sequence of rotations to reach
the target frame is not unique if the ranges are not previously defined.
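The geometric recipe can be tried numerically. The sketch below assumes the Z-X-Z convention with active rotations and column vectors, so that the Z axis of the rotated frame has components Z2 = −cos(α)sin(β), Z3 = cos(β) and the Y axis has Y3 = sin(β)cos(γ). The function names are illustrative only.

```python
import math

def zxz_matrix(alpha, beta, gamma):
    """Rz(alpha) * Rx(beta) * Rz(gamma); its columns are the X, Y, Z axes in xyz."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [ca * cg - sa * cb * sg, -ca * sg - sa * cb * cg, sa * sb],
        [sa * cg + ca * cb * sg, -sa * sg + ca * cb * cg, -ca * sb],
        [sb * sg, sb * cg, cb],
    ]

def clamp(value):
    """Keep rounding noise from pushing an arccos argument outside [-1, 1]."""
    return max(-1.0, min(1.0, value))

def euler_from_axes(Y, Z):
    """Recover (alpha, beta, gamma) from the Y and Z unit vectors using arccos only."""
    z2, z3, y3 = Z[1], Z[2], Y[2]
    sin_beta = math.sqrt(max(0.0, 1.0 - z3 * z3))   # non-negative for beta in [0, pi]
    if sin_beta < 1e-12:
        raise ValueError("singular case: z and Z axes are parallel")
    alpha = math.acos(clamp(-z2 / sin_beta))
    beta = math.acos(clamp(z3))
    gamma = math.acos(clamp(y3 / sin_beta))
    return alpha, beta, gamma

R = zxz_matrix(0.7, 1.1, 0.4)
Y = [R[0][1], R[1][1], R[2][1]]
Z = [R[0][2], R[1][2], R[2][2]]
print(euler_from_axes(Y, Z))
```

Because arccos only returns values in [0, π], a frame built with α = −0.7 is reported with α = +0.7: this is the two-valued ambiguity mentioned above, and a second component (or atan2) would be needed to choose the correct branch.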
Euler angles as composition of intrinsic rotations
Any target frame can be reached using a specific sequence of intrinsic rotations (mobile
frame rotations), whose values are exactly the Euler angles of the target frame. The
Z-X'-Z" convention is used in this example.
Starting with an initial set of mobile axes, say XYZ overlapping the reference axes xyz, a
composition of three intrinsic rotations (rotations only about the mobile frame axes,
assuming active composition) can be used to reach any target frame with an origin
coincident with that of XYZ from the reference frame. The value of the rotations are the
Euler Angles.
The position of the mobile axes can be reached using three rotations with angles α, β, γ in
three ways equivalent to the former definition, as follows:
The XYZ system rotates while the xyz is fixed. Starting with the XYZ system overlapping
the reference frame xyz, the same rotations as before can be performed using only
rotations around the mobile axes XYZ.
• Rotate the XYZ-system about the Z-axis by α. The X-axis now lies on the line of
nodes.
• Rotate the XYZ-system again about the now rotated X-axis by β. The Z-axis is
now in its final orientation, and the X-axis remains on the line of nodes.
• Rotate the XYZ-system a third time about the new Z-axis by γ.
Any convention for proper Euler angles is equivalent to three such rotations in which one
axis is repeated (ZXZ for example). Tait-Bryan angles are also equivalent to three
composed rotations, but in this case all three rotations are around different axes (ZXY for
example).
Usually conventions are named according to this equivalence.
A rotation represented by Euler angles with (φ,θ,ψ)=(−60°, 30°, 45°) using the 3-1-3 (Z-
X-Z) co-moving axes rotations
The same rotation alternatively expressed by (φ,θ,ψ)=(45°, 30°, −60°) using the 3-1-3 (Z-
X-Z) fixed axes rotations
A composition of extrinsic rotations (rotations about the reference frame axes,
assuming active composition) can also be used to reach any target frame. Let the xyz
system be fixed while the XYZ system rotates. Start with the rotating XYZ system
coinciding with the fixed xyz system.
• Rotate the XYZ-system about the z-axis by γ. The X-axis is now at angle γ with
respect to the x-axis.
• Rotate the XYZ-system again about the x-axis by β. The Z-axis is now at angle β
with respect to the z-axis.
• Rotate the XYZ-system a third time about the z-axis by α. The first and third axes
are identical.
Let us call (e), (f), (g), (h) the successive frames obtained from the initial reference
frame (e) by the successive intrinsic rotations described above, and u, v, w, t the
successive vectors obtained with those rotations. We write (x)e for the column matrix
representing a vector x in the frame (e). If necessary, we also add a lower index to any
matrix to indicate the frame in which it operates. We call (Zα), (Xβ), (Zγ) the successive
rotations of our example. Thus we can write, when describing the intrinsic operations:
When describing the intrinsic rotations in the (e) reference frame we must of course
transform the matrices used to represent the rotations. Then by the rules of matrix algebra
we get :
The relation (5) can then of course be interpreted in extrinsic manner as a succession of
rotations around the (e) axes.
Again, proper Euler angles repeat an axis and Tait-Bryan angles do not. As before, this
kind of composition is non-commutative.
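The equivalence between the two descriptions can be checked numerically. The sketch below assumes active rotations, column vectors and the Z-X-Z sequence; it performs the intrinsic rotations literally, by rotating about the current (moved) axes with Rodrigues' formula, and compares the result with extrinsic rotations about the fixed axes taken in reverse order. All function names are illustrative.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def axis_rotation(axis, angle):
    """Rodrigues' formula: active rotation by `angle` about the unit vector `axis`."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]

def column(M, j):
    return [M[i][j] for i in range(3)]

def intrinsic_zxz(alpha, beta, gamma):
    """Rotate about the mobile axes Z, X', Z'' (columns of R are X, Y, Z in xyz)."""
    R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    R = matmul(axis_rotation(column(R, 2), alpha), R)   # about Z
    R = matmul(axis_rotation(column(R, 0), beta), R)    # about the moved X'
    R = matmul(axis_rotation(column(R, 2), gamma), R)   # about the moved Z''
    return R

def extrinsic_zxz(first, second, third):
    """Rotate about the fixed axes z, x, z by the given angles, in that order."""
    R = axis_rotation((0.0, 0.0, 1.0), first)
    R = matmul(axis_rotation((1.0, 0.0, 0.0), second), R)
    R = matmul(axis_rotation((0.0, 0.0, 1.0), third), R)
    return R

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

a, b, g = 0.3, 0.5, 1.2
print(max_diff(intrinsic_zxz(a, b, g), extrinsic_zxz(g, b, a)))   # rounding noise only
print(max_diff(intrinsic_zxz(a, b, g), extrinsic_zxz(a, b, g)))   # clearly non-zero
```

The first comparison shows that intrinsic angles (α, β, γ) give the same frame as extrinsic angles applied in the order γ, β, α; the second shows that keeping the same order in both descriptions gives a different frame, which is the non-commutativity noted above.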
Euler rotations
Euler rotations of the Earth. Intrinsic (green), Precession (blue) and Nutation (red)
Euler rotations are defined as the movement obtained by changing one of the Euler angles
while leaving the other two constant. Euler rotations are never expressed in terms of the
external frame, or in terms of the co-moving rotated body frame, but in a mixture. They
constitute a mixed axes of rotation system, where the first angle moves the line of nodes
around the external axis z, the second rotates around the line of nodes and the third one is
an intrinsic rotation around an axis fixed in the body that moves.
These rotations are called Precession, Nutation, and intrinsic rotation. While they are
rotations when they are applied over individual frames, only precession is valid as a
rotation operator, and only precession can be expressed in general as a matrix in the basis
of the space.
Gimbal analogy
Three axes z-x-z-gimbal showing Euler angles. External frame and external axis 'x' are
not shown. Axes 'Y' are perpendicular to each gimbal ring, together with a simple
diagram showing how the axes 'Y' of intermediate frames are located in the main
diagram.
If we suppose a set of frames, able to move each with respect to the former according to
just one angle, like a gimbal, there will be one initial, one final and two in the middle,
which are called intermediate frames. The two in the middle work as two gimbal rings
that allow the last frame to reach any orientation in space.
Intermediate frames
The gimbal rings indicate some intermediate frames. They can be defined statically too.
Taking some vectors i, j and k over the axes x, y and z, and vectors I, J, K over X, Y and
Z, and a vector N over the line of nodes, some intermediate frames can be defined using
the vector cross product, as follows:
These intermediate frames are equivalent to those of the gimbal. They are such that they
differ from the previous one in just a single elemental rotation. This proves that:
• Any target frame can be reached from the reference frame just composing three
rotations.
• The values of these three rotations are exactly the Euler angles of the target frame.
Relationship to other representations
Euler angles are one way to represent orientations. There are others, and it is possible to
change to and from other conventions.
Matrix orientation
Using the equivalence between Euler angles and rotation composition, it is possible to
change to and from matrix convention.
Fixed (world) axes and column vectors, with intrinsic composition (composition of
rotations about body axes) of active rotations and the right-handed rule for the positive
sign of the angles are assumed. This means for example that a convention named (YXZ)
is the result of performing first an intrinsic Y rotation, followed by an X and a Z
rotations, in the moving axes. Its matrix is the product of Rot(Y,θ1) Rot(X,θ2) Rot(Z,θ3)
like this:
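\[ R = \begin{bmatrix} c_1 c_3 + s_1 s_2 s_3 & c_3 s_1 s_2 - c_1 s_3 & c_2 s_1 \\ c_2 s_3 & c_2 c_3 & -s_2 \\ c_1 s_2 s_3 - c_3 s_1 & c_1 c_3 s_2 + s_1 s_3 & c_1 c_2 \end{bmatrix} \]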
Subindexes refer to the order in which the angles are applied. Trigonometric notation has
been simplified. For example, c1 means cos(θ1) and s2 means sin(θ2). As we assumed
intrinsic and active compositions, θ1 is the external angle of the static definition (angle
between fixed axis x and line of nodes) and θ3 the internal angle (from the line of nodes
to rotated axis X). The following table can be used both ways, to obtain an orientation
matrix from Euler angles and to obtain Euler angles from the matrix. The possible
combinations of rotations equivalent to Euler angles are shown here.
Proper Euler angles    Tait-Bryan angles
XZX                    XZY
XYX                    XYZ
YXY                    YXZ
YZY                    YZX
ZYZ                    ZYX
ZXZ                    ZXY
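As an illustration of using such a matrix "both ways", the sketch below builds the YXZ matrix as the product Rot(Y,θ1) Rot(X,θ2) Rot(Z,θ3) and then reads the angles back from individual entries. It assumes active rotations, column vectors and cos(θ2) > 0; the function names are illustrative.

```python
import math

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def yxz_matrix(t1, t2, t3):
    """Orientation matrix for the YXZ convention: Rot(Y,t1) Rot(X,t2) Rot(Z,t3)."""
    return matmul(matmul(rot_y(t1), rot_x(t2)), rot_z(t3))

def yxz_angles(R):
    """Invert yxz_matrix, assuming cos(t2) > 0 (away from the singular case)."""
    t2 = math.asin(max(-1.0, min(1.0, -R[1][2])))   # R[1][2] = -s2
    t1 = math.atan2(R[0][2], R[2][2])               # c2*s1 over c1*c2
    t3 = math.atan2(R[1][0], R[1][1])               # c2*s3 over c2*c3
    return t1, t2, t3

R = yxz_matrix(0.4, -0.3, 1.0)
print(yxz_angles(R))
```

Using atan2 on two entries, rather than a single arccos, avoids having to choose between two candidate angles by hand.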
Quaternions
Geometric algebra
Properties
The Euler angles form a chart on all of SO(3), the special orthogonal group of rotations in
3D space. The chart is smooth except for a polar coordinate style singularity along β=0.
The space of rotations is called in general "The Hypersphere of rotations", though this is
a misnomer: the group Spin(3) is isometric to the hypersphere S3, but the rotation space
SO(3) is instead isometric to the real projective space RP3 which is a 2-fold quotient
space of the hypersphere. This 2-to-1 ambiguity is the mathematical origin of spin in
physics.
A similar three angle decomposition applies to SU(2), the special unitary group of
rotations in complex 2D space, with the difference that β ranges from 0 to 2π. These are
also called Euler angles.
The Haar measure for Euler angles has the simple form sin(β)dαdβdγ, usually normalized
by a factor of 1/8π².
For example, to generate uniformly randomized orientations, let α and γ be uniform from
0 to 2π, let z be uniform from −1 to 1, and let β = arccos(z).
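The sampling recipe translates directly into code. This is a minimal sketch using Python's standard random module; the function name random_euler_angles is hypothetical.

```python
import math
import random

def random_euler_angles(rng=random):
    """Draw (alpha, beta, gamma) so that the resulting orientation is uniformly
    distributed: alpha and gamma uniform on [0, 2*pi], cos(beta) uniform on [-1, 1]."""
    alpha = rng.uniform(0.0, 2.0 * math.pi)
    gamma = rng.uniform(0.0, 2.0 * math.pi)
    z = rng.uniform(-1.0, 1.0)
    beta = math.acos(z)
    return alpha, beta, gamma

rng = random.Random(0)            # seeded only to make the run repeatable
print(random_euler_angles(rng))
```

Drawing β itself uniformly from [0, π] would over-represent orientations near β = 0 and β = π, because the sin(β) weight in the Haar measure vanishes there.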
Higher dimensions
It is possible to define parameters analogous to the Euler angles in dimensions higher
than three.
The number of degrees of freedom of a rotation matrix is always less than the dimension
of the matrix squared. That is, the elements of a rotation matrix are not all completely
independent. For example, the rotation matrix in dimension 2 has only one degree of
freedom, since all four of its elements depend on a single angle of rotation. A rotation
matrix in dimension 3 (which has nine elements) has three degrees of freedom,
corresponding to each independent rotation, for example by its three Euler angles or a
magnitude one (unit) quaternion.
In SO(4) the rotation matrix is defined by two quaternions, and is therefore 6-parametric
(three degrees of freedom for every quaternion). The 4x4 rotation matrices have therefore
6 out of 16 independent components.
Any set of 6 parameters that define the rotation matrix could be considered an extension
of Euler angles to dimension 4.
In general, the number of Euler angles in dimension D is quadratic in D; since any one
rotation consists of choosing two dimensions to rotate between, the total number of
rotations available in dimension D is D(D − 1)/2, which for D = 2, 3, 4 yields 1, 3, 6.
Applications
A gyroscope keeps its rotation axis constant. Therefore, angles measured in this frame are
equivalent to angles measured in the lab frame
The main advantage of Euler angles over other orientation descriptions is that they are
directly measurable from a gimbal mounted in a vehicle. As gyroscopes keep their rotation
axis constant, angles measured in a gyro frame are equivalent to angles measured in the
lab frame. Therefore gyros are used to know the actual orientation of moving spacecraft,
and Euler angles are directly measurable. The intrinsic rotation angle cannot be read from
a single gimbal, so there has to be more than one gimbal in a spacecraft. Normally there
are at least three for redundancy. There is also a relation to the well-known gimbal lock
problem of mechanical engineering.
Heading, elevation and bank for an aircraft with axes DIN 9300
The most popular application is to describe aircraft attitudes, normally using a Tait-Bryan
convention so that zero degrees elevation represents the horizontal attitude. Tait-Bryan
angles represent the orientation of the aircraft with respect to a reference axis system
(world frame) with three angles, which in the context of an aircraft are normally called
Heading, Elevation and Bank. When dealing with vehicles, different axes conventions are
possible.
When studying rigid bodies in general, one calls the xyz system space coordinates, and
the XYZ system body coordinates. The space coordinates are treated as unmoving, while
the body coordinates are considered embedded in the moving body. Calculations
involving acceleration, angular acceleration, angular velocity, angular momentum, and
kinetic energy are often easiest in body coordinates, because then the moment of inertia
tensor does not change in time. If one also diagonalizes the rigid body's moment of
inertia tensor (with nine components, six of which are independent), then one has a set of
coordinates (called the principal axes) in which the moment of inertia tensor has only
three non-zero components.
The angular velocity of a rigid body takes a simple form using Euler angles in the moving
frame. Also the Euler's rigid body equations are simpler because the inertia tensor is
constant in that frame.
Others
Industrial robot operating in a foundry.
Euler angles, normally in the Tait-Bryan convention, are also used in robotics for
speaking about the degrees of freedom of a wrist. They are also used in Electronic
stability control in a similar way.
Gun fire control systems require corrections to gun-order angles (bearing and elevation)
to compensate for deck tilt (pitch and roll). In traditional systems, a stabilizing gyroscope
with a vertical spin axis corrects for deck tilt, and stabilizes the optical sights and radar
antenna. However, gun barrels point in a direction different from the line of sight to the
target, to anticipate target movement and fall of the projectile due to gravity, among other
factors. Gun mounts roll and pitch with the deck plane, but also require stabilization. Gun
orders include angles computed from the vertical gyro data, and those computations
involve Euler angles.
Euler angles are also used extensively in the quantum mechanics of angular momentum.
In quantum mechanics, explicit descriptions of the representations of SO(3) are very
important for calculations, and almost all the work has been done using Euler angles. In
the early history of quantum mechanics, when physicists and chemists had a sharply
negative reaction towards abstract group theoretic methods (called the Gruppenpest),
reliance on Euler angles was also essential for basic theoretical work.
Chapter 8
Gimbal Lock
A set of three gimbals mounted together to allow three degrees of freedom. When all
three gimbals are lined up (in the same plane), the system can only move in two
dimensions from this configuration, not three, and is in gimbal lock. In this case it can
pitch or yaw, but not roll (rotate in the plane that the axes all lie in).
Adding a fourth rotational axis can solve the problem of gimbal lock, but it requires the
outermost ring to be actively driven so that it stays 90 degrees out of alignment with the
innermost axis (the flywheel shaft). Without active driving of the outermost ring, all four
axes can become aligned in a plane as shown above, again leading to gimbal lock and
inability to roll.
Gimbal lock is the loss of one degree of freedom in a three-dimensional space that
occurs when the axes of two of the three gimbals are driven into a parallel configuration,
"locking" the system into rotation in a degenerate two-dimensional space.
The word lock is misleading: no gimbal is restrained. All three gimbals can still rotate
freely about their respective axes of suspension. Nevertheless, because of the parallel
orientation of two of the gimbal axes there is no axis available to accommodate rotation
along one axis.
Gimbals
A gimbal is a ring that is suspended so it can rotate about an axis. Gimbals are typically
nested one within another to accommodate rotation about multiple axes.
They appear in gyroscopes and in inertial measurement units to allow the inner gimbal's
orientation to remain fixed while the outer gimbal suspension assumes any orientation. In
compasses, flywheel energy storage mechanisms, or more commonly drink holders, they
allow objects to remain upright. They are used to orient thrusters on rockets.
Some coordinate systems in mathematics behave as if there were real gimbals used to
measure the angles, notably Euler angles.
For cases of three or fewer nested gimbals, gimbal lock inevitably occurs at some point in
the system, due to properties of covering spaces (described below).
Gimbal lock can occur in gimbal systems with two degrees of freedom such as a
theodolite with rotations about an azimuth and elevation in two dimensions. These
systems can gimbal lock at zenith and nadir, because at those points azimuth is not well-
defined, and rotation in the azimuth direction does not change the direction the theodolite
is pointing.
Consider tracking a helicopter flying towards the theodolite from the horizon. The
theodolite is a telescope mounted on a tripod so that it can move in azimuth and elevation
to track the helicopter. The helicopter flies towards the theodolite and is tracked by the
telescope in elevation and azimuth. The helicopter flies immediately above the tripod (i.e.
it is at zenith) when it changes direction and flies at 90 degrees to its previous course. The
telescope cannot track this maneuver without a discontinuous jump in one or both of the
gimbal orientations. There is no continuous motion that allows it to follow the target. It is
in gimbal lock. So there is an infinity of directions around zenith for which the telescope
cannot continuously track all movements of a target. Note that even if the helicopter does not
pass through zenith, but only near zenith, so that gimbal lock does not occur, the system
must still move exceptionally rapidly to track it, as it rapidly passes from one bearing to
the other. The closer to zenith the nearest point is, the faster this must be done, and if it
actually goes through zenith, the limit of these "increasingly rapid" movements becomes
infinitely fast, namely discontinuous.
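The growth of the required azimuth rate can be illustrated with a short numerical sketch. It assumes a target flying in a straight line at constant speed whose ground track misses the point directly above the theodolite by a horizontal distance offset; the function name and the numbers are illustrative, not taken from the text.

```python
import math

def peak_azimuth_rate(offset, speed=10.0, span=20.0, steps=20000):
    """Largest |d(azimuth)/dt| in rad/s while a target flies along the x direction
    at `speed`, passing the zenith at horizontal distance `offset` (> 0).
    The pass is sampled over `span` seconds centred on the closest approach."""
    dt = span / steps
    previous = math.atan2(offset, -speed * span / 2.0)
    peak = 0.0
    for i in range(1, steps + 1):
        t = -span / 2.0 + i * dt
        azimuth = math.atan2(offset, speed * t)
        peak = max(peak, abs(azimuth - previous) / dt)
        previous = azimuth
    return peak

for offset in (10.0, 1.0, 0.1):
    print(offset, peak_azimuth_rate(offset))
```

For this geometry the peak rate is speed/offset, reached at the closest approach, so each tenfold reduction of the offset demands a tenfold faster azimuth drive; in the limit of zero offset the azimuth has to jump by π at a single instant.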
To recover from gimbal lock the user has to go around the zenith – explicitly: reduce the
elevation, change the azimuth to match the azimuth of the target, then change the
elevation to match the target.
Mathematically, this corresponds to the fact that spherical coordinates do not define a
coordinate chart on the sphere at zenith and nadir. Alternatively, that the corresponding
map T2→S2 from the torus T2 to the sphere S2 (given by the point with given azimuth and
elevation) is not a covering map at these points.
Gimbal lock in three dimensions
Gimbal lock: two out of the three gimbals are in the same plane, one degree of freedom is
lost
Consider a case of a level sensing platform on an aircraft flying due North with its three
gimbal axes mutually perpendicular (i.e., roll, pitch and yaw angles each zero). If the
aircraft pitches up 90 degrees, the aircraft and platform's Yaw axis gimbal becomes
parallel to the Roll axis gimbal, and changes about yaw can no longer be compensated
for.
The word lock is misleading: no gimbal is restrained, all three gimbals can still rotate
freely about their respective axis of suspension. Nevertheless, because of the parallel
orientation of both the yaw and roll gimbal axes, there is no axis available to
accommodate yaw rotation.
Solutions
This problem may be overcome by use of a fourth gimbal, intelligently driven by a motor
so as to maintain a large angle between roll and yaw gimbal axes. Another solution is to
rotate one or more of the gimbals to an arbitrary position when gimbal lock is detected
and thus reset the device.
Modern practice is to avoid the use of gimbals entirely. In the context of inertial
navigation systems, that can be done by mounting the inertial sensors directly to the body
of the vehicle (this is called a strapdown system) and integrating sensed rotation and
acceleration digitally using quaternion methods to derive vehicle orientation and velocity.
Another way to replace gimbals is to use fluid bearings or a flotation chamber.
A well-known gimbal lock incident happened in the Apollo 11 Moon mission. On this
spacecraft, a set of gimbals was used on an inertial measurement unit (IMU). The
engineers were aware of the gimbal lock problem but had declined to use a fourth gimbal.
They preferred an alternate solution using an indicator that would be triggered when near
to 85 degrees pitch.
"Near that point, in a closed stabilization loop, the torque motors could theoretically be
commanded to flip the gimbal 180 degrees instantaneously. Instead, in the LM, the
computer flashed a 'gimbal lock' warning at 70 degrees and froze the IMU at 85 degrees"
—Paul Fjeld, Apollo Lunar Surface Journal
Rather than try to drive the gimbals faster than they could go, the system simply gave up
and froze the platform. From this point, the spacecraft would have to be manually moved
away from the gimbal lock position, and the platform would have to be manually
realigned using the stars as a reference.
After the Lunar Module had landed, Mike Collins aboard the Command Module joked
"How about sending me a fourth gimbal for Christmas?"
Robotics
In robotics, gimbal lock is commonly referred to as "wrist flip", due to the use of a
"triple-roll wrist" in robotic arms, where three axes of the wrist, controlling yaw, pitch,
and roll, all pass through a common point.
An example of a wrist flip, also called a wrist singularity, is when the path through which
the robot is traveling causes the first and third axes of the robot's wrist to line up. The
second wrist axis then attempts to spin 180° in zero time to maintain the orientation of
the end effector. The result of a singularity can be quite dramatic and can have adverse
effects on the robot arm, the end effector, and the process.
The importance of non-singularities in robotics has led the American National Standard
for Industrial Robots and Robot Systems — Safety Requirements to define it as "a
condition caused by the collinear alignment of two or more robot axes resulting in
unpredictable robot motion and velocities".
In formal language, gimbal lock occurs because the map from Euler angles to rotations
(topologically, from the 3-torus T3 to the real projective space RP3) is not a covering map
– it is not a local homeomorphism at every point, and thus at some points the rank
(degrees of freedom) must drop below 3, at which point gimbal lock occurs. Euler angles
provide a means for giving a numerical description of any rotation in three dimensional
space using three numbers, but not only is this description not unique, but there are some
points where not every change in the target space (rotations) can be realized by a change
in the source space (Euler angles). This is a topological constraint – there is no covering
map from the 3-torus to the 3-dimensional real projective space; the only (non-trivial)
covering map is from the 3-sphere, as in the use of quaternions.
To make a comparison, all translations can be described using three numbers x, y, and
z, as the succession of three consecutive linear movements along three perpendicular axes
X, Y and Z. The same holds for rotations: all rotations can be described using
three numbers α, β, and γ, as the succession of three rotational movements around three
axes that are perpendicular one to the next. This similarity between linear coordinates and
angular coordinates makes Euler angles very intuitive, but unfortunately they suffer from
the gimbal lock problem.
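Using the Z-X-Z convention, a rotation can be written as the matrix product
\[ R = \begin{bmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\beta & -\sin\beta \\ 0 & \sin\beta & \cos\beta \end{bmatrix} \begin{bmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
with α and γ constrained in the interval [−π, π], and β constrained in the interval [0, π].
Let's examine for example what happens when β = 0. Knowing that cos 0 = 1 and
sin 0 = 0, the above expression becomes equal to:
\[ R = \begin{bmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
The second matrix is the identity matrix and has no effect on the product. Carrying out
matrix multiplication of first and third matrices:
\[ R = \begin{bmatrix} \cos\alpha\cos\gamma - \sin\alpha\sin\gamma & -\cos\alpha\sin\gamma - \sin\alpha\cos\gamma & 0 \\ \sin\alpha\cos\gamma + \cos\alpha\sin\gamma & -\sin\alpha\sin\gamma + \cos\alpha\cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
And finally using the trigonometry formulas:
\[ R = \begin{bmatrix} \cos(\alpha+\gamma) & -\sin(\alpha+\gamma) & 0 \\ \sin(\alpha+\gamma) & \cos(\alpha+\gamma) & 0 \\ 0 & 0 & 1 \end{bmatrix} \]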
Changing α's and γ's values in the above matrix has the same effects: the rotation's angle
α + γ changes, but the rotation's axis remains in the Z direction. The last column and the
last line in the matrix won't change: one degree of freedom has been lost.
The only solution for α and γ to recover different roles is to get β away from the 0 value.
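The loss of a degree of freedom can be confirmed numerically. The sketch below assumes the Z-X-Z convention written as the product Rz(α) Rx(β) Rz(γ) of active rotation matrices; the helper names are illustrative.

```python
import math

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def zxz(alpha, beta, gamma):
    return matmul(matmul(rot_z(alpha), rot_x(beta)), rot_z(gamma))

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

# With beta = 0, two different (alpha, gamma) pairs with the same sum coincide ...
print(max_diff(zxz(0.2, 0.0, 0.5), zxz(0.6, 0.0, 0.1)))
# ... and both equal a single rotation about Z by alpha + gamma.
print(max_diff(zxz(0.2, 0.0, 0.5), rot_z(0.7)))
# Away from beta = 0 the same two pairs give different rotations.
print(max_diff(zxz(0.2, 0.3, 0.5), zxz(0.6, 0.3, 0.1)))
```

The first two differences are at the level of rounding error, while the third is clearly non-zero.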
One can choose another convention for representing a rotation with a matrix using Euler
angles than the Z-X-Z convention above, and also choose other variation intervals for the
angles, but at the end there is always at least one value for which a degree of freedom is
lost.
Note that the gimbal lock problem does not make Euler angles "wrong" (they always play
at least their role of a well-defined coordinates system), but it makes them unsuited for
some practical applications.
There is no problem similar to gimbal lock with quaternions. This can be explained
intuitively by the fact that a quaternion describes a rotation in one single move ("please
turn θ radians around the axis directed along vector v"), while the Euler angles are made
of three successive rotations.
Besides that, quaternions also have other advantages over Euler angles.
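A minimal sketch of the "single move" description follows: a unit quaternion is built from an axis and an angle and applied to a vector as q v q*. Quaternions are stored as (w, x, y, z) tuples, and the function names are illustrative rather than taken from any particular library.

```python
import math

def quat_from_axis_angle(axis, theta):
    """Unit quaternion (w, x, y, z) for a rotation by `theta` radians about `axis`."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    half = theta / 2.0
    scale = math.sin(half) / norm
    return (math.cos(half), x * scale, y * scale, z * scale)

def quat_mul(p, q):
    """Hamilton product of two quaternions."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )

def rotate(q, v):
    """Rotate the 3-vector `v` by the unit quaternion `q` using q * v * conj(q)."""
    w, x, y, z = q
    conjugate = (w, -x, -y, -z)
    pure = (0.0, v[0], v[1], v[2])
    _, rx, ry, rz = quat_mul(quat_mul(q, pure), conjugate)
    return (rx, ry, rz)

q = quat_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2.0)
print(rotate(q, (1.0, 0.0, 0.0)))   # the x axis is carried onto the y axis
```

Every rotation is reached by one axis and one angle, so no particular angle value makes two of the parameters redundant.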
Chapter 9
Icosahedral Symmetry
A Soccer ball, a common example of a spherical truncated icosahedron, has full
icosahedral symmetry.
A regular dodecahedron has the same set of symmetries, since it is the dual of the
icosahedron.
As point group
The icosahedral rotation group I with fundamental domain
Apart from the two infinite series of prismatic and antiprismatic symmetry, the discrete
point symmetries (or equivalently, symmetries on the sphere) with the largest symmetry
groups are rotational icosahedral symmetry (also called chiral icosahedral symmetry, the
symmetry of chiral objects) and full icosahedral symmetry (also called achiral
icosahedral symmetry).
Icosahedral symmetry is not compatible with translational symmetry, so there are no
associated crystallographic point groups or space groups.
These correspond to the icosahedral groups (rotational and full) being the (2,3,5) triangle
groups.
The first presentation was given by William Rowan Hamilton in 1856, in his paper on
Icosian Calculus.
Note that other presentations are possible, for instance as an alternating group (for I).
Group structure
The icosahedral rotation group I is of order 60. The group I is isomorphic to A5, the
alternating group of even permutations of five objects. This isomorphism can be realized
by I acting on various compounds, notably the compound of five cubes (which inscribe in
the dodecahedron), the compound of five octahedra, or either of the two compounds of
five tetrahedra (which are enantiomorphs, and inscribe in the dodecahedron).
The group contains 5 versions of T, 10 versions of D3 (one per threefold axis), and 6
versions of D5.
The full icosahedral group Ih has order 120. It has I as normal subgroup of index 2. The
group Ih is isomorphic to I × C2, or A5 × C2, with the inversion in the center
corresponding to element (identity,-1), where C2 is written multiplicatively.
Ih acts on the compound of five cubes and the compound of five octahedra, but -1 acts as
the identity (as cubes and octahedra are centrally symmetric). It acts on the compound of
ten tetrahedra: I acts on the two chiral halves (compounds of five tetrahedra), and -1
interchanges the two halves. Notably, it does not act as S5, and these groups are not
isomorphic.
The group contains 10 versions of D3d and 6 versions of D5d (symmetries like antiprisms).
Commonly confused groups
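The following groups all have order 120, but are not isomorphic:
• S5, the symmetric group on 5 elements
• Ih, the full icosahedral group (the subject of this chapter)
• 2I, the binary icosahedral group
They correspond to the following short exact sequences (the last of which does not split)
and product:
$$1 \to A_5 \to S_5 \to C_2 \to 1$$
$$I_h = A_5 \times C_2$$
$$1 \to C_2 \to 2I \to A_5 \to 1$$
In words,
• A5 is a normal subgroup of S5
• A5 is a factor of Ih, which is a direct product
• A5 is a quotient group of 2I
These can also be related to linear groups over the finite field with five elements, which
exhibit the subgroups and covering groups directly; none of these are the full icosahedral
group:
• A5 ≅ PSL(2,5), the projective special linear group
• S5 ≅ PGL(2,5), the projective general linear group
• 2I ≅ SL(2,5), the special linear group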
Conjugacy classes
• identity
• 12 × rotation by 72°, order 5
• 12 × rotation by 144°, order 5
• 20 × rotation by 120°, order 3
• 15 × rotation by 180°, order 2
• inversion
• 12 × rotoreflection by 108°, order 10
• 12 × rotoreflection by 36°, order 10
• 20 × rotoreflection by 60°, order 6
• 15 × reflection, order 2
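The class sizes listed for the rotation group I can be cross-checked through its isomorphism with A5. The sketch below is an illustrative check, not part of the original text: it enumerates the even permutations of five objects and counts elements by order. The two classes of 12 rotations of order 5 appear together as 24 elements of order 5.

```python
from collections import Counter
from itertools import permutations

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0

def order(p):
    """Smallest n >= 1 such that p composed with itself n times is the identity."""
    identity = tuple(range(len(p)))
    q, n = p, 1
    while q != identity:
        q = tuple(p[i] for i in q)  # next power of p
        n += 1
    return n

a5 = [p for p in permutations(range(5)) if is_even(p)]
orders = Counter(order(p) for p in a5)
print(len(a5), dict(sorted(orders.items())))
# prints: 60 {1: 1, 2: 15, 3: 20, 5: 24}
```

The counts 1 + 15 + 20 + 24 = 60 match the identity, the 15 half-turns, the 20 rotations by 120°, and the 12 + 12 rotations by 72° and 144°.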
Subgroups
• Ih, I, Th and T
• D2h
• D5d, D3d
• D5, D3 and D2
• C2h
• C5v, C3v and C2v
• C5, C3 and C2
• S10, S6 and S2=Ci
• E and Cs
All of these classes of subgroups are conjugate (i.e., all vertex stabilizers are conjugate),
and admit geometric interpretations.
Note that the stabilizer of a vertex/edge/face/polyhedron and its opposite are equal, since
−1 is central.
Vertex stabilizers
Stabilizers of an opposite pair of vertices can be interpreted as stabilizers of the axis they
generate.
Edge stabilizers
Face stabilizers
Polyhedron stabilizers
For each of these, there are 5 conjugate copies, and the conjugation action gives a map,
indeed an isomorphism, I → A5 < S5.
Fundamental domain
Fundamental domains for the icosahedral rotation group and the full icosahedral group
are given by:
The icosahedral rotation group I with fundamental domain
The full icosahedral group Ih with fundamental domain
Fundamental domain in the disdyakis triacontahedron
In the disdyakis triacontahedron one full face is a fundamental domain; other solids with
the same symmetry can be obtained by adjusting the orientation of the faces, e.g.
flattening selected subsets of faces to combine each subset into one face, or replacing
each face by multiple faces, or a curved surface.
Solids with icosahedral symmetry
Full icosahedral symmetry
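Platonic solids - regular polyhedra (all faces of the same type).
• {5,3} (dodecahedron)
• {3,5} (icosahedron)
Archimedean solids - polyhedra with more than one polygon face type.
• 4.6.10 (truncated icosidodecahedron)
• 5.6.6 (truncated icosahedron)
• 3.10.10 (truncated dodecahedron)
• 3.4.5.4 (rhombicosidodecahedron)
• 3.5.3.5 (icosidodecahedron)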
Related geometries
Icosahedral symmetry is equivalently the projective special linear group PSL(2,5), and is
the symmetry group of the modular curve X(5), and more generally PSL(2,p) is the
symmetry group of the modular curve X(p). The modular curve X(5) is geometrically a
dodecahedron with a cusp at the center of each polygonal face, which demonstrates the
symmetry group.
This geometry, and associated symmetry group, was studied by Felix Klein as the
monodromy groups of a Belyi surface – a Riemann surface with a holomorphic map to
the Riemann sphere, ramified only at 0, 1, and infinity (a Belyi function) – the cusps are
the points lying over infinity, while the vertices and the centers of each edge lie over 0
and 1; the degree of the covering (number of sheets) equals 5.
This arose from his efforts to give a geometric setting for why icosahedral symmetry
arose in the solution of the quintic equation, with the theory given in the famous (Klein
1888); a modern exposition is given in (Tóth 2002, Section 1.6, Additional Topic: Klein's
Theory of the Icosahedron, p. 66).
Klein's investigations continued with his discovery of order 7 and order 11 symmetries in
(Klein 1878/79b) and (Klein 1879) (and associated coverings of degree 7 and 11) and
dessins d'enfants, the first yielding the Klein quartic, whose associated geometry has a
tiling by 24 heptagons (with a cusp at the center of each).
Similar geometries occur for PSL(2,n) and more general groups for other modular curves.
More exotically, there are special connections between the groups PSL(2,5) (order 60),
PSL(2,7) (order 168) and PSL(2,11) (order 660), which also admit geometric
interpretations – PSL(2,5) is the symmetries of the icosahedron (genus 0), PSL(2,7) of the
Klein quartic (genus 3), and PSL(2,11) the buckyball surface (genus 70). These groups
form a "trinity" in the sense of Vladimir Arnold, which gives a framework for the various
relationships.
Chapter 10
Rotation Matrix
In linear algebra, a rotation matrix is a matrix that is used to perform a rotation in
Euclidean space. For example, the matrix
$$R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
rotates points in the xy-Cartesian plane counterclockwise through an angle θ about the
origin of the Cartesian coordinate system. To perform the rotation, the position of each
point must be represented by a column vector v, containing the coordinates of the point.
A rotated vector is obtained by using the matrix multiplication Rv.
In two and three dimensions, rotation matrices are among the simplest algebraic
descriptions of rotations, and are used extensively for computations in geometry, physics,
and computer graphics. Though most applications involve rotations in two or three
dimensions, rotation matrices can be defined for n-dimensional space.
Rotation matrices are always square, with real entries. Algebraically, a rotation matrix in
n dimensions is an n × n special orthogonal matrix, i.e. an orthogonal matrix whose
determinant is 1:
$$R^{T} = R^{-1}, \qquad \det R = 1.$$
The set of all rotation matrices forms a group, known as the rotation group or the special
orthogonal group. It is a subset of the orthogonal group, which includes reflections and
consists of all orthogonal matrices with determinant 1 or -1, and of the special linear
group, which includes all volume-preserving transformations and consists of matrices
with determinant 1.
Rotations in two dimensions
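A counterclockwise rotation of a vector through angle θ. The vector is initially aligned
with the x-axis.
In two dimensions, every rotation matrix has the form
$$R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix},$$
and rotates a column vector by the matrix multiplication
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.$$
The coordinates (x′, y′) of the point (x, y) after rotation are therefore
$$x' = x\cos\theta - y\sin\theta, \qquad y' = x\sin\theta + y\cos\theta.$$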
If a standard right-handed Cartesian coordinate system is used, with the x axis to the right
and the y axis up, the rotation R(θ) is counterclockwise. If a left-handed Cartesian
coordinate system is used, with x directed to the right but y directed down, R(θ) is
clockwise. Such non-standard orientations are rarely used in mathematics but are
common in 2D computer graphics, which often have the origin in the top left corner and
the y-axis down the screen or page.
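As a small illustration of the planar case, the sketch below applies the standard counterclockwise rotation to a point; the function name rotate2d is an assumption made for this example.

```python
import math

def rotate2d(x, y, theta):
    """Rotate the point (x, y) about the origin by theta radians.

    Counterclockwise in a right-handed frame (y up); the same numbers
    appear as a clockwise turn on a screen whose y axis points down.
    """
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)

# A vector along the x axis, turned by 90 degrees, lands on the y axis.
x_new, y_new = rotate2d(1.0, 0.0, math.pi / 2)
print(round(x_new, 12), round(y_new, 12))
```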
Common rotations
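Particularly useful are the matrices for 90° and 180° rotations:
$$R(90^\circ) = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \quad \text{(90° counterclockwise rotation)}$$
$$R(180^\circ) = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} \quad \text{(180° rotation in either direction, a half-turn)}$$
$$R(270^\circ) = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \quad \text{(270° counterclockwise rotation, the same as a 90° clockwise rotation)}$$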
Basic rotations
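The following three basic (gimbal-like) rotation matrices rotate vectors about the x, y, or z
axis, in three dimensions:
$$R_x(\theta) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}, \quad R_y(\theta) = \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}, \quad R_z(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
Each of these basic vector rotations appears counterclockwise when the axis about which
it occurs points toward the observer and the coordinate system is right-handed. Rz, for
instance, would rotate toward the y-axis a vector aligned with the x-axis.
This is similar to the rotation produced by the above-mentioned 2-D rotation matrix.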
General rotations
Other rotation matrices can be obtained from these three using matrix multiplication. For
example, the product
represents a rotation whose yaw, pitch, and roll are α, β, and γ, respectively. Similarly,
the product
represents a rotation whose Euler angles are α, β, and γ (using the y-x-z convention for
Euler angles).
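A short sketch of how basic rotations combine by matrix multiplication. The specific product order used here, Rz(α) Ry(β) Rx(γ), is an assumption chosen for illustration, since conventions differ; the helper names are likewise illustrative.

```python
import math

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(R, v):
    """Multiply the 3x3 matrix R by the column vector v."""
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]

# Rz turns a vector aligned with the x axis toward the y axis.
v = apply(rot_z(math.pi / 2), [1.0, 0.0, 0.0])

# Assumed order for illustration: R = Rz(alpha) Ry(beta) Rx(gamma).
alpha, beta, gamma = 0.1, 0.2, 0.3
R = matmul(rot_z(alpha), matmul(rot_y(beta), rot_x(gamma)))
# Reversing the order of the factors gives a different rotation.
R_rev = matmul(rot_x(gamma), matmul(rot_y(beta), rot_z(alpha)))
gap = max(abs(R[i][j] - R_rev[i][j]) for i in range(3) for j in range(3))
print(v, gap)
```

Because the two products differ, any statement of yaw, pitch, and roll angles is meaningful only together with the multiplication order.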
Every rotation in three dimensions is defined by its axis — a direction that is left fixed by
the rotation — and its angle — the amount of rotation about that axis (Euler rotation
theorem).
There are several methods to compute an axis and an angle from a rotation matrix. Here,
we only describe the method based on the computation of the eigenvectors and
eigenvalues of the rotation matrix. It is also possible to use the trace of the rotation
matrix.
Determining the axis
A rotation R around axis u can be decomposed using 3 endomorphisms P, (I - P), and Q.
Given a rotation matrix R, a vector u parallel to the rotation axis must satisfy
$$R\mathbf{u} = \mathbf{u},$$
since the rotation of u around the rotation axis must result in u. The equation above may
be solved for u, which is unique up to a scalar factor. The equation may also be rewritten as
$$(R - I)\mathbf{u} = 0,$$
which shows that u lies in the null space of R − I. Viewed another way, u is an eigenvector of
R corresponding to the eigenvalue λ = 1 (every 3×3 rotation matrix must have this
eigenvalue).
To find the angle of a rotation, once the axis of the rotation is known, select a vector v
perpendicular to the axis. Then the angle of the rotation is the angle between v and Rv.
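The following sketch recovers an axis and angle from a 3×3 rotation matrix. Note that it does not use the eigenvector computation described above; as a simpler stand-in it reads the angle from the trace and the axis from the skew-symmetric part of the matrix, which works only when the angle is strictly between 0 and π. The function name axis_angle is an assumption.

```python
import math

def axis_angle(R):
    """Return (unit axis, angle) for a 3x3 rotation matrix R.

    Assumes 0 < angle < pi, so that R - R^T is nonzero.
    """
    trace = R[0][0] + R[1][1] + R[2][2]
    # trace = 1 + 2 cos(angle); clamp to guard against rounding error.
    angle = math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0)))
    # The skew part of R is sin(angle) times the cross-product matrix of the axis.
    ax = (R[2][1] - R[1][2], R[0][2] - R[2][0], R[1][0] - R[0][1])
    norm = math.sqrt(sum(a * a for a in ax))
    return tuple(a / norm for a in ax), angle

t = math.radians(40.0)
Rz = [[math.cos(t), -math.sin(t), 0.0],
      [math.sin(t), math.cos(t), 0.0],
      [0.0, 0.0, 1.0]]
axis, angle = axis_angle(Rz)
# The axis is left fixed by the rotation: R u = u.
fixed = [sum(Rz[i][k] * axis[k] for k in range(3)) for i in range(3)]
print(axis, math.degrees(angle))
```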
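For some applications, it is helpful to be able to make a rotation with a given axis. Given
a unit vector u = (u_x, u_y, u_z), where u_x² + u_y² + u_z² = 1, the matrix for a rotation by an
angle of θ about an axis in the direction of u is
$$R = \begin{bmatrix} c + u_x^2(1-c) & u_x u_y(1-c) - u_z s & u_x u_z(1-c) + u_y s \\ u_y u_x(1-c) + u_z s & c + u_y^2(1-c) & u_y u_z(1-c) - u_x s \\ u_z u_x(1-c) - u_y s & u_z u_y(1-c) + u_x s & c + u_z^2(1-c) \end{bmatrix}$$
where c = cos θ and s = sin θ.
This can be written more concisely as
$$R = \cos\theta\, I + \sin\theta\, [\mathbf{u}]_\times + (1 - \cos\theta)\, \mathbf{u} \otimes \mathbf{u},$$
where [u]× is the skew-symmetric form of u, ⊗ is the tensor product and I is the identity
matrix. This is a matrix form of Rodrigues' rotation formula, with
$$\mathbf{u} \otimes \mathbf{u} = \begin{bmatrix} u_x^2 & u_x u_y & u_x u_z \\ u_x u_y & u_y^2 & u_y u_z \\ u_x u_z & u_y u_z & u_z^2 \end{bmatrix}, \qquad [\mathbf{u}]_\times = \begin{bmatrix} 0 & -u_z & u_y \\ u_z & 0 & -u_x \\ -u_y & u_x & 0 \end{bmatrix}.$$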
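A minimal sketch of building a rotation matrix from a unit axis and an angle, written out entry by entry from Rodrigues' formula R = cos θ I + sin θ [u]× + (1 − cos θ) u⊗u. The function name is an assumption, and the caller is assumed to pass a unit vector.

```python
import math

def rotation_from_axis_angle(u, theta):
    """Rotation by theta (radians) about the unit vector u = (ux, uy, uz)."""
    ux, uy, uz = u
    c, s = math.cos(theta), math.sin(theta)
    k = 1.0 - c
    return [
        [c + ux * ux * k, ux * uy * k - uz * s, ux * uz * k + uy * s],
        [uy * ux * k + uz * s, c + uy * uy * k, uy * uz * k - ux * s],
        [uz * ux * k - uy * s, uz * uy * k + ux * s, c + uz * uz * k],
    ]

# About the z axis this reduces to the basic rotation Rz.
Rz90 = rotation_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)

# An oblique axis is left fixed by its own rotation.
n = math.sqrt(3.0)
u = (1.0 / n, 1.0 / n, 1.0 / n)
R = rotation_from_axis_angle(u, 0.8)
Ru = [sum(R[i][k] * u[k] for k in range(3)) for i in range(3)]
print(Ru)
```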
In three dimensions, for any rotation matrix R_a(θ), where a is a rotation axis and θ
a rotation angle,
• (where I is the identity matrix)
• The eigenvalues of R_a(θ) are {1, e^{iθ}, e^{−iθ}} = {1, cos θ + i sin θ, cos θ − i sin θ}
For instance, in two dimensions the properties hold with the following exceptions:
• a is not a given axis, but a point (rotation center) which must coincide
with the origin of the coordinate system in which the rotation is
represented.
• Consequently, the four elements of the rotation matrix depend only on θ,
hence we write R(θ), rather than R_a(θ)
• The eigenvalues of R(θ) are e^{iθ} and e^{−iθ}, that is, cos θ ± i sin θ
Examples
• The 2×2 rotation matrix • The 3×3 matrix
• The 3×3 rotation matrix
is not square, and so cannot be a rotation matrix; yet M^T M yields a 3×3 identity
matrix (the columns are orthonormal).
describes an isoclinic rotation, a
rotation through equal angles (180°)
through two orthogonal planes.
Geometry
In Euclidean geometry, a rotation is an example of an isometry, a transformation that
moves points without changing the distances between them. Rotations are distinguished
from other isometries by two additional properties: they leave (at least) one point fixed,
and they leave "handedness" unchanged. By contrast, a translation moves every point, a
reflection exchanges left- and right-handed ordering, and a glide reflection does both.
If we take the fixed point as the origin of a Cartesian coordinate system, then every point
can be given coordinates as a displacement from the origin. Thus we may work with the
vector space of displacements instead of the points themselves. Now suppose (p_1,…,p_n)
are the coordinates of the vector p from the origin, O, to point P. Choose an orthonormal
basis for our coordinates; then the squared distance to P, by Pythagoras, is
$$d^2(O,P) = \|\mathbf{p}\|^2 = \sum_{r=1}^{n} p_r^2 = \mathbf{p}^T \mathbf{p}.$$
A geometric rotation transforms lines to lines, and preserves ratios of distances between
points. From these properties we can show that a rotation is a linear transformation of the
vectors, and thus can be written in matrix form, Qp. The fact that a rotation preserves, not
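just ratios, but distances themselves, we can state as
$$\mathbf{p}^T \mathbf{p} = (Q\mathbf{p})^T (Q\mathbf{p})$$
or
$$\mathbf{p}^T I \mathbf{p} = (\mathbf{p}^T Q^T)(Q\mathbf{p}) = \mathbf{p}^T (Q^T Q)\, \mathbf{p}.$$
Because this equation holds for all vectors, p, we conclude that every rotation matrix, Q,
satisfies the orthogonality condition,
$$Q^T Q = I.$$
Rotations preserve handedness because they cannot change the ordering of the axes,
which implies the special matrix condition,
$$\det Q = +1.$$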
Equally important, we can show that any matrix satisfying these two conditions acts as a
rotation.
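The two conditions (an orthogonal matrix with determinant +1) translate directly into a numerical test. The sketch below is one possible check for small matrices stored as nested lists; the function names and the tolerance are assumptions.

```python
def det(M):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def is_rotation_matrix(Q, tol=1e-9):
    """True when Q^T Q = I and det Q = +1, both within tol."""
    n = len(Q)
    if any(len(row) != n for row in Q):
        return False  # not square
    for i in range(n):
        for j in range(n):
            dot = sum(Q[k][i] * Q[k][j] for k in range(n))  # entry (i, j) of Q^T Q
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return abs(det(Q) - 1.0) <= tol

quarter_turn = [[0.0, -1.0], [1.0, 0.0]]   # orthogonal, det +1
reflection = [[1.0, 0.0], [0.0, -1.0]]     # orthogonal, det -1
squeeze = [[2.0, 0.0], [0.0, 0.5]]         # det +1, not orthogonal
print(is_rotation_matrix(quarter_turn),
      is_rotation_matrix(reflection),
      is_rotation_matrix(squeeze))
# prints: True False False
```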
Multiplication
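The inverse of a rotation matrix is its transpose, which is also a rotation matrix:
$$(Q^T)^T (Q^T) = Q Q^T = I, \qquad \det Q^T = \det Q = +1.$$
The product of two rotation matrices is also a rotation matrix:
$$(Q_1 Q_2)^T (Q_1 Q_2) = Q_2^T (Q_1^T Q_1) Q_2 = I, \qquad \det(Q_1 Q_2) = (\det Q_1)(\det Q_2) = +1.$$
For n greater than 2, multiplication of n×n rotation matrices is not commutative.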
Noting that any identity matrix is a rotation matrix, and that matrix multiplication is
associative, we may summarize all these properties by saying that the n×n rotation
matrices form a group, which for n > 2 is non-abelian. Called a special orthogonal group,
and denoted by SO(n), SO(n,R), SOn, or SOn(R), the group of n×n rotation matrices is
isomorphic to the group of rotations in an n-dimensional space. This means that
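multiplication of rotation matrices corresponds to composition of rotations. When the
matrices act on column vectors, as above, the rotations are applied in right-to-left order
of their corresponding matrices: in the product Q1Q2, the rotation Q2 is applied first.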
Ambiguities
In most cases the effect of the ambiguity is equivalent to the effect of a transposition of
the rotation matrix.
Decompositions
Independent planes
so that
Two features are noteworthy. First, one of the roots (or eigenvalues) is 1, which tells us
that some direction is unaffected by the matrix. For rotations in three dimensions, this is
the axis of the rotation (a concept that has no meaning in any other dimension). Second,
the other two roots are a pair of complex conjugates, whose product is 1 (the constant
term of the quadratic), and whose sum is 2 cos θ (the negated linear term). This
factorization is of interest for 3×3 rotation matrices because the same thing occurs for all
of them. (As special cases, for a null rotation the "complex conjugates" are both 1, and
for a 180° rotation they are both −1.) Furthermore, a similar factorization holds for any
n×n rotation matrix. If the dimension, n, is odd, there will be a "dangling" eigenvalue of
1; and for any dimension the rest of the polynomial factors into quadratic terms like the
one here (with the two special cases noted). We are guaranteed that the characteristic
polynomial will have degree n and thus n eigenvalues. And since a rotation matrix
commutes with its transpose, it is a normal matrix, so can be diagonalized. We conclude
that every rotation matrix, when expressed in a suitable coordinate system, partitions into
independent rotations of two-dimensional subspaces, at most n⁄2 of them.
The sum of the entries on the main diagonal of a matrix is called the trace; it does not
change if we reorient the coordinate system, and always equals the sum of the
eigenvalues. This has the convenient implication for 2×2 and 3×3 rotation matrices that
the trace reveals the angle of rotation, θ, in the two-dimensional (sub-)space. For a 2×2
matrix the trace is 2 cos(θ), and for a 3×3 matrix it is 1+2 cos(θ). In the three-dimensional
case, the subspace consists of all vectors perpendicular to the rotation axis (the invariant
direction, with eigenvalue 1). Thus we can extract from any 3×3 rotation matrix a rotation
axis and an angle, and these completely determine the rotation.
Sequential angles
The constraints on a 2×2 rotation matrix imply that it must have the form
$$Q = \begin{bmatrix} a & -b \\ b & a \end{bmatrix}$$
with a² + b² = 1. Therefore we may set a = cos θ and b = sin θ, for some angle θ. To solve
for θ it is not enough to look at a alone or b alone; we must consider both together to
place the angle in the correct quadrant, using a two-argument arctangent function.
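A brief sketch of why both entries are needed. It uses Python's math.atan2 as the two-argument arctangent; the function name angle_of is an assumption.

```python
import math

def angle_of(a, b):
    """Angle theta with a = cos(theta), b = sin(theta), in (-pi, pi]."""
    return math.atan2(b, a)

theta = 2.5  # radians, second quadrant
a, b = math.cos(theta), math.sin(theta)

recovered_pos = angle_of(a, b)    # recovers +2.5
recovered_neg = angle_of(a, -b)   # the mirror-image rotation, -2.5
from_a_alone = math.acos(a)       # cannot distinguish +2.5 from -2.5
print(recovered_pos, recovered_neg, from_a_alone)
```

Looking at a alone gives the same value for both rotations, while the two-argument form separates them.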
Now consider the first column of a 3×3 rotation matrix,
$$\begin{bmatrix} a \\ b \\ c \end{bmatrix}.$$
Although a² + b² will probably not equal 1, but some value r² < 1, we can use a slight
variation of the previous computation to find a so-called Givens rotation that transforms
the column to
$$\begin{bmatrix} r \\ 0 \\ c \end{bmatrix},$$
zeroing b. This acts on the subspace spanned by the x and y axes. We can then repeat the
process for the xz subspace to zero c. Acting on the full matrix, these two rotations
produce the schematic form
$$Q_{xz} Q_{xy} Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & * & * \\ 0 & * & * \end{bmatrix}.$$
Shifting attention to the second column, a Givens rotation of the yz subspace can now
zero the z value. This brings the full matrix to the form
$$Q_{yz} Q_{xz} Q_{xy} Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},$$
which is an identity matrix. Thus we have decomposed Q as
$$Q = Q_{xy}^{-1} Q_{xz}^{-1} Q_{yz}^{-1}.$$
An n×n rotation matrix has (n−1) + (n−2) + ⋯ + 2 + 1 = n(n−1)/2
entries below the diagonal to zero. We can zero them by extending the same idea of
stepping through the columns with a series of rotations in a fixed sequence of planes. We
conclude that the set of n×n rotation matrices, each of which has n² entries, can be
parameterized by n(n−1)/2 angles.
One reason for the large number of options is that, as noted previously, rotations in three
dimensions (and higher) do not commute. If we reverse a given sequence of rotations, we
get a different outcome. This also implies that we cannot compose two rotations by
adding their corresponding angles. Thus Euler angles are not vectors, despite a similarity
in appearance as a triple of numbers.
Nested dimensions
is embedded in the upper left corner:
This is no illusion; not just one, but many, copies of n-dimensional rotations are found
within (n+1)-dimensional rotations, as subgroups. Each embedding leaves one direction
fixed, which in the case of 3×3 matrices is the rotation axis. For example, we have
fixing the x axis, the y axis, and the z axis, respectively. The rotation axis need not be a
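coordinate axis; if u = (x,y,z) is a unit vector in the desired direction, then
$$Q_{\mathbf{u}}(\theta) = \begin{bmatrix} 0 & -z & y \\ z & 0 & -x \\ -y & x & 0 \end{bmatrix} \sin\theta + (I - \mathbf{u}\mathbf{u}^T)\cos\theta + \mathbf{u}\mathbf{u}^T = \begin{bmatrix} (1-x^2)c_\theta + x^2 & -z s_\theta - xy c_\theta + xy & y s_\theta - xz c_\theta + xz \\ z s_\theta - xy c_\theta + xy & (1-y^2)c_\theta + y^2 & -x s_\theta - yz c_\theta + yz \\ -y s_\theta - xz c_\theta + xz & x s_\theta - yz c_\theta + yz & (1-z^2)c_\theta + z^2 \end{bmatrix}$$
where c_θ = cos θ, s_θ = sin θ, is a rotation by angle θ leaving axis u fixed.
A direction in (n+1)-dimensional space is a unit-magnitude vector, which we may consider
a point on a generalized sphere, S^n. The rotation group SO(n+1) can thus be described as
a fiber bundle,
$$SO(n) \hookrightarrow SO(n+1) \to S^n,$$
where for every direction in the "base space", S^n, the "fiber" over it in the "total space",
SO(n+1), is a copy of the "fiber space", SO(n), namely the rotations that keep that
direction fixed.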
Thus we can build an n×n rotation matrix by starting with a 2×2 matrix, aiming its fixed
axis on S² (the ordinary sphere in three-dimensional space), aiming the resulting rotation
on S³, and so on up through S^(n−1). A point on S^n can be selected using n numbers, so we
again have n(n−1)/2 numbers to describe any n×n rotation matrix.
When an n×n rotation matrix, Q, does not include −1 as an eigenvalue, so that none of the
planar rotations of which it is composed are 180° rotations, then Q+I is an invertible
matrix. Most rotation matrices fit this description, and for them we can show that
(Q−I)(Q+I)^(−1) is a skew-symmetric matrix, A. Thus A^T = −A; and since the diagonal is
necessarily zero, and since the upper triangle determines the lower one, A contains
n(n−1)/2 independent numbers. Conveniently, I−A is invertible whenever A is
skew-symmetric; thus we can recover the original matrix using the Cayley transform,
$$A \mapsto (I + A)(I - A)^{-1},$$
which maps any skew-symmetric matrix A to a rotation matrix. In fact, aside from the
noted exceptions, we can produce any rotation matrix in this way. Although in practical
applications we can hardly afford to ignore 180° rotations, the Cayley transform is still a
potentially useful tool, giving a parameterization of most rotation matrices without
trigonometric functions.
If we condense the skew entries into a vector, (x,y,z), then we produce a 90° rotation
around the x axis for (1,0,0), around the y axis for (0,1,0), and around the z axis for
(0,0,1). The 180° rotations are just out of reach; for, in the limit as x goes to infinity,
(x,0,0) does approach a 180° rotation around the x axis, and similarly for other directions.
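The 3×3 Cayley transform can be sketched with plain arithmetic and no trigonometric functions. The layout of the skew entries (x, y, z) within A below follows the usual cross-product matrix and is an assumption, as are the helper names.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def inverse3(M):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[adj[r][s] / det for s in range(3)] for r in range(3)]

def cayley(x, y, z):
    """Map the skew-symmetric matrix built from (x, y, z) to (I + A)(I - A)^-1."""
    A = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]
    eye = [[1.0 if r == s else 0.0 for s in range(3)] for r in range(3)]
    plus = [[eye[r][s] + A[r][s] for s in range(3)] for r in range(3)]
    minus = [[eye[r][s] - A[r][s] for s in range(3)] for r in range(3)]
    return matmul(plus, inverse3(minus))

Q = cayley(1.0, 0.0, 0.0)  # a 90 degree rotation around the x axis
for row in Q:
    print(row)
```

Larger skew entries push the result toward a 180° rotation without ever reaching it, which is the limitation noted above.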
Lie theory
Lie group
We have established that n×n rotation matrices form a group, the special orthogonal
group, SO(n). This algebraic structure is coupled with a topological structure, in that the
operations of multiplication and taking the inverse (which here is merely transposition)
are continuous functions of the matrix entries. Thus SO(n) is a classic example of a
topological group. (In purely topological terms, it is a compact manifold.) Furthermore,
the operations are not only continuous, but smooth, so SO(n) is a differentiable manifold
and a Lie group (Baker (2003); Fulton & Harris (1991)).
Most properties of rotation matrices depend very little on the dimension, n; yet in Lie
group theory we see systematic differences between even dimensions and odd
dimensions. As well, there are some irregularities below n = 5; for example, SO(4) is,
anomalously, not a simple Lie group, but instead isomorphic to the product of S³ and
SO(3).
Lie algebra
Associated with every Lie group is a Lie algebra, a linear space equipped with a bilinear
alternating product called a bracket. The algebra for SO(n) is denoted by
$$\mathfrak{so}(n)$$
and consists of all skew-symmetric n×n matrices (as implied by differentiating the
orthogonality condition, I = Q^T Q). The bracket, [A_1,A_2], of two skew-symmetric matrices
is defined to be A_1A_2−A_2A_1, which is again a skew-symmetric matrix. This Lie algebra
bracket captures the essence of the Lie group product via infinitesimals.
For 2×2 rotation matrices, the Lie algebra is a one-dimensional vector space, multiples of
$$\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}.$$
Here the bracket always vanishes, which tells us that, in two dimensions, rotations
commute. Not so in any higher dimension. For 3×3 rotation matrices, we have a
three-dimensional vector space with the convenient basis
$$A_x = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}, \quad A_y = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{bmatrix}, \quad A_z = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.$$
We can conveniently identify any matrix in this Lie algebra with a vector in R³,
$$\boldsymbol{\omega} = (x, y, z) \;\mapsto\; \tilde{\boldsymbol{\omega}} = x A_x + y A_y + z A_z = \begin{bmatrix} 0 & -z & y \\ z & 0 & -x \\ -y & x & 0 \end{bmatrix}.$$
Under this identification, the so(3) bracket has a memorable description; it is the vector
cross product,
$$[\tilde{\mathbf{u}}, \tilde{\mathbf{v}}] = \widetilde{\mathbf{u} \times \mathbf{v}}.$$
Notice this implies that v is in the null space of the skew-symmetric matrix with which it
is identified, because v×v is always the zero vector.
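The correspondence between the bracket and the cross product can be checked numerically. In this sketch, hat is an assumed name for the map from a vector to its skew-symmetric matrix, using the standard cross-product layout.

```python
def hat(v):
    """Skew-symmetric matrix identified with the vector v = (x, y, z)."""
    x, y, z = v
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]

u, v = (1.0, 2.0, 3.0), (-2.0, 0.5, 4.0)
lhs = bracket(hat(u), hat(v))   # bracket of the two matrices
rhs = hat(cross(u, v))          # matrix of the cross product
# v lies in the null space of its own matrix, since v x v = 0.
null = [sum(hat(v)[i][k] * v[k] for k in range(3)) for i in range(3)]
print(lhs == rhs, null)
```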
Exponential map
Connecting the Lie algebra to the Lie group is the exponential map, which we define
using the familiar power series for e^x (Wedderburn 1934, §8.02),
$$\exp(A) = I + A + \tfrac{1}{2!}A^2 + \tfrac{1}{3!}A^3 + \cdots = \sum_{k=0}^{\infty} \frac{1}{k!} A^k.$$
For any skew-symmetric A, exp(A) is always a rotation matrix.
An important practical example is the 3×3 case, where we have seen we can identify
every skew-symmetric matrix with a vector ω = uθ, where u = (x,y,z) is a unit magnitude
vector. Recall that u is in the null space of the matrix associated with ω, so that if we use
a basis with u as the z axis the final column and row will be zero. Thus we know in
advance that the exponential matrix must leave u fixed. It is mathematically impossible to
supply a straightforward formula for such a basis as a function of u (its existence would
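violate the hairy ball theorem), but direct exponentiation is possible, and yields
$$\exp(\tilde{\boldsymbol{\omega}}) = \begin{bmatrix} 2(x^2-1)s^2 + 1 & 2xys^2 - 2zcs & 2xzs^2 + 2ycs \\ 2xys^2 + 2zcs & 2(y^2-1)s^2 + 1 & 2yzs^2 - 2xcs \\ 2xzs^2 - 2ycs & 2yzs^2 + 2xcs & 2(z^2-1)s^2 + 1 \end{bmatrix},$$
where the left-hand side is the exponential of the skew-symmetric matrix identified with ω,
and c = cos θ⁄2, s = sin θ⁄2. We recognize this as our matrix for a rotation around axis u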
by angle θ. We also note that this mapping of skew-symmetric matrices is quite different
from the Cayley transform discussed earlier.
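The power-series definition can be tried directly. The sketch below sums a truncated series (30 terms is an arbitrary choice, ample for small angles) and compares the result with the known rotation about the z axis; the function names are assumptions.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_series(A, terms=30):
    """Truncated power series I + A + A^2/2! + ... for a small square matrix."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term, starts at A^0 / 0!
    for k in range(1, terms):
        # A^k / k! obtained from A^(k-1) / (k-1)! by one multiply and one divide.
        term = [[entry / k for entry in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

theta = 1.2
# Skew-symmetric matrix for omega = theta * (0, 0, 1).
A = [[0.0, -theta, 0.0], [theta, 0.0, 0.0], [0.0, 0.0, 0.0]]
R = expm_series(A)
expected = [[math.cos(theta), -math.sin(theta), 0.0],
            [math.sin(theta), math.cos(theta), 0.0],
            [0.0, 0.0, 1.0]]
err = max(abs(R[i][j] - expected[i][j]) for i in range(3) for j in range(3))
print(err)
```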
In any dimension, if we choose some nonzero A and consider all its scalar multiples,
exponentiation yields rotation matrices along a geodesic of the group manifold, forming a
one-parameter subgroup of the Lie group. More broadly, the exponential map provides a
homeomorphism between a neighborhood of the origin in the Lie algebra and a
neighborhood of the identity in the Lie group. In fact, we can produce any rotation matrix
as the exponential of some skew-symmetric matrix, so for these groups the exponential
map is a surjection.
Baker–Campbell–Hausdorff formula
Suppose we are given A and B in the Lie algebra. Their exponentials, exp(A) and exp(B),
are rotation matrices, which we can multiply. Since the exponential map is a surjection,
we know that for some C in the Lie algebra, exp(A)exp(B) = exp(C), and we write
When exp(A) and exp(B) commute (which always happens for 2×2 matrices, but not
higher), then C = A+B, mimicking the behavior of complex exponentiation. The general
case is given by the BCH formula, a series expanded in terms of the bracket (Hall 2004,
Ch. 3; Varadarajan 1984, §2.15). For matrices, the bracket is the same operation as the
commutator, which detects lack of commutativity in multiplication. The general formula
begins as follows.
$$C = A + B + \tfrac{1}{2}[A,B] + \tfrac{1}{12}[A,[A,B]] - \tfrac{1}{12}[B,[A,B]] + \cdots$$
Representation of a rotation matrix as a sequential angle decomposition, as in Euler
angles, may tempt us to treat rotations as a vector space, but the higher order terms in the
BCH formula reveal that to be a mistake.
We again take special interest in the 3×3 case, where [A,B] equals the cross product, A×B.
If A and B are linearly independent, then A, B, and A×B can be used as a basis; if not, then
A and B commute. And conveniently, in this dimension the summation in the BCH
formula has a closed form (Engø 2001) as αA+βB+γ(A×B).
Spin group
The Lie group of n×n rotation matrices, SO(n), is a compact and path-connected
manifold, and thus locally compact and connected. However, it is not simply connected,
so Lie theory tells us it is a kind of "shadow" (a homomorphic image) of a universal
covering group. Often the covering group, which in this case is the spin group denoted by
Spin(n), is simpler and more natural to work with (Baker 2003, Ch. 5; Fulton & Harris
1991, pp. 299–315).
In the case of planar rotations, SO(2) is topologically a circle, S1. Its universal covering
group, Spin(2), is isomorphic to the real line, R, under addition. In other words, whenever
we use angles of arbitrary magnitude, which we often do, we are essentially taking
advantage of the convenience of the "mother space". Every 2×2 rotation matrix is
produced by a countable infinity of angles, separated by integer multiples of 2π.
Correspondingly, the fundamental group of SO(2) is isomorphic to the integers, Z.
In the case of spatial rotations, the fundamental group of SO(3) is isomorphic to the
two-element group, Z2. We can also describe Spin(3) as isomorphic to quaternions of unit
norm under
multiplication, or to certain 4×4 real matrices, or to 2×2 complex special unitary
matrices.
This is our third version of this matrix, here as a rotation around non-unit axis vector
(x,y,z) by angle 2θ, where cos θ = w and |sin θ| = ||(x,y,z)||. (The proper sign for sin θ is
implied once the signs of the axis components are decided.)
Many features of this case are the same for higher dimensions. The coverings are all two-
to-one, with SO(n), n > 2, having fundamental group Z2. The natural setting for these
groups is within a Clifford algebra. And the action of the rotations is produced by a kind
of "sandwich", denoted by qvq∗.
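For unit quaternions, the "sandwich" can be sketched with the Hamilton product. The component order (w, x, y, z) and the function names are assumptions of this example. It also shows the two-to-one covering: q and −q produce the same rotation.

```python
import math

def qmul(a, b):
    """Hamilton product of quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q via q v q*."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    p = (0.0,) + tuple(v)  # embed v as a pure quaternion
    return qmul(qmul(q, p), conj)[1:]

half = math.pi / 4  # half of a 90 degree rotation angle
q = (math.cos(half), 0.0, 0.0, math.sin(half))  # rotation about the z axis
neg_q = tuple(-c for c in q)

r1 = rotate(q, (1.0, 0.0, 0.0))
r2 = rotate(neg_q, (1.0, 0.0, 0.0))
print(r1, r2)
```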
Infinitesimal rotations
The matrices in the Lie algebra are not themselves rotations; the skew-symmetric
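matrices are derivatives, proportional differences of rotations. An actual "differential
rotation", or infinitesimal rotation matrix has the form
$$dA_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -d\theta \\ 0 & d\theta & 1 \end{bmatrix},$$
where dθ is vanishingly small. These matrices do not satisfy all the same properties as
ordinary finite rotation matrices under the usual treatment of infinitesimals (Goldstein,
Poole & Safko 2002, §4.8). To understand what this means, consider
$$dA_x^T\, dA_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 + d\theta^2 & 0 \\ 0 & 0 & 1 + d\theta^2 \end{bmatrix},$$
differing from an identity matrix by second order infinitesimals, which we discard. So to
first order, an infinitesimal rotation matrix is an orthogonal matrix. Next we examine the
square of the matrix.
$$dA_x^2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 - d\theta^2 & -2\,d\theta \\ 0 & 2\,d\theta & 1 - d\theta^2 \end{bmatrix}$$
Again discarding second order effects, we see that the angle simply doubles. This hints at
the most essential difference in behavior, which we can exhibit with the assistance of a
second infinitesimal rotation,
$$dA_y = \begin{bmatrix} 1 & 0 & d\varphi \\ 0 & 1 & 0 \\ -d\varphi & 0 & 1 \end{bmatrix}.$$
Comparing the two products,
$$dA_x\, dA_y = \begin{bmatrix} 1 & 0 & d\varphi \\ d\theta\, d\varphi & 1 & -d\theta \\ -d\varphi & d\theta & 1 \end{bmatrix}, \qquad dA_y\, dA_x = \begin{bmatrix} 1 & d\theta\, d\varphi & d\varphi \\ 0 & 1 & -d\theta \\ -d\varphi & d\theta & 1 \end{bmatrix},$$
we see that they differ only in the second order term dθ dφ, so dA_x dA_y = dA_y dA_x,
again to first order. In other words, the order in which infinitesimal rotations are
applied is irrelevant. This useful fact makes, for example, the derivation of rigid body
rotation relatively simple.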
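A numerical illustration of first-order commutativity, under the assumption that the infinitesimal rotations about x and y are approximated by the identity plus a small skew-symmetric part. The two products differ only by a term the size of the product of the two small angles.

```python
def small_rot_x(d):
    """First-order rotation about the x axis by the small angle d."""
    return [[1.0, 0.0, 0.0], [0.0, 1.0, -d], [0.0, d, 1.0]]

def small_rot_y(d):
    """First-order rotation about the y axis by the small angle d."""
    return [[1.0, 0.0, d], [0.0, 1.0, 0.0], [-d, 0.0, 1.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

d_theta, d_phi = 1e-4, 2e-4
xy = matmul(small_rot_x(d_theta), small_rot_y(d_phi))
yx = matmul(small_rot_y(d_phi), small_rot_x(d_theta))

# Largest entrywise difference between the two orderings.
gap = max(abs(xy[i][j] - yx[i][j]) for i in range(3) for j in range(3))
print(gap, d_theta * d_phi)  # the gap equals the second-order product
```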
But we must always be careful to distinguish (the first order treatment of) these
infinitesimal rotation matrices from both finite rotation matrices and from derivatives of
rotation matrices (namely skew-symmetric matrices). Contrast the behavior of finite
rotation matrices in the BCH formula with that of infinitesimal rotation matrices, where
all the commutator terms will be second order infinitesimals so we do have a vector
space.
Conversions
We have seen the existence of several decompositions that apply in any dimension,
namely independent planes, sequential angles, and nested dimensions. In all these cases
we can either decompose a matrix or construct one. We have also given special attention
to 3×3 rotation matrices, and these warrant further attention, in both directions
(Stuelpnagel 1964).
Quaternion
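Given the unit quaternion q = (w,x,y,z), the equivalent 3×3 rotation matrix is
$$Q = \begin{bmatrix} 1 - 2y^2 - 2z^2 & 2xy - 2zw & 2xz + 2yw \\ 2xy + 2zw & 1 - 2x^2 - 2z^2 & 2yz - 2xw \\ 2xz - 2yw & 2yz + 2xw & 1 - 2x^2 - 2y^2 \end{bmatrix}.$$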
Now every quaternion component appears multiplied by two in a term of degree two, and
if all such terms are zero what's left is an identity matrix. This leads to an efficient, robust
conversion from any quaternion — whether unit, nonunit, or even zero — to a 3×3
rotation matrix.
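One way such a robust conversion might look is sketched below: every degree-two term is scaled by 2/n, where n is the squared norm, so that non-unit quaternions give the same matrix as their normalized form and the zero quaternion gives the identity. The function name and the (w, x, y, z) argument order are assumptions.

```python
def quat_to_matrix(w, x, y, z):
    """3x3 rotation matrix for any quaternion (unit, non-unit, or zero)."""
    n = w * w + x * x + y * y + z * z
    s = 0.0 if n == 0.0 else 2.0 / n  # scale factor; 2 for a unit quaternion
    wx, wy, wz = s * w * x, s * w * y, s * w * z
    xx, xy, xz = s * x * x, s * x * y, s * x * z
    yy, yz, zz = s * y * y, s * y * z, s * z * z
    return [
        [1.0 - (yy + zz), xy - wz, xz + wy],
        [xy + wz, 1.0 - (xx + zz), yz - wx],
        [xz - wy, yz + wx, 1.0 - (xx + yy)],
    ]

# The non-unit quaternion (1, 0, 0, 1) is a 90 degree rotation about the z axis.
Rz90 = quat_to_matrix(1.0, 0.0, 0.0, 1.0)
# The zero quaternion falls back to the identity matrix.
ident = quat_to_matrix(0.0, 0.0, 0.0, 0.0)
print(Rz90, ident)
```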
Freed from the demand for a unit quaternion, we find that nonzero quaternions act as
homogeneous coordinates for 3×3 rotation matrices. The Cayley transform, discussed
earlier, is obtained by scaling the quaternion so that its w component is 1. For a 180°
rotation around any axis, w will be zero, which explains the Cayley limitation.
The sum of the entries along the main diagonal (the trace), plus one, equals
4−4(x²+y²+z²), which is 4w². Thus we can write the trace itself as 2w²+2w²−1; and from
the previous version of the matrix we see that the diagonal entries themselves have the
same form: 2x²+2w²−1, 2y²+2w²−1, and 2z²+2w²−1. So we can easily compare the
magnitudes of all four quaternion components using the matrix diagonal. We can, in fact,
obtain all four magnitudes using sums and square roots, and choose consistent signs
using the skew-symmetric part of the off-diagonal entries.
t = Qxx+Qyy+Qzz (trace of Q)
r = sqrt(1+t)
w = 0.5*r
x = copysign(0.5*sqrt(1+Qxx-Qyy-Qzz), Qzy-Qyz)
y = copysign(0.5*sqrt(1-Qxx+Qyy-Qzz), Qxz-Qzx)
z = copysign(0.5*sqrt(1-Qxx-Qyy+Qzz), Qyx-Qxy)
Alternatively, a single square root and one division suffice:
t = Qxx+Qyy+Qzz
r = sqrt(1+t)
s = 0.5/r
w = 0.5*r
x = (Qzy-Qyz)*s
y = (Qxz-Qzx)*s
z = (Qyx-Qxy)*s
This is numerically stable so long as the trace, t, is not negative; otherwise, we risk
dividing by (nearly) zero. In that case, suppose Qxx is the largest diagonal entry, so x will
have the largest magnitude (the other cases are similar); then the following is safe.
r = sqrt(1+Qxx-Qyy-Qzz)
s = 0.5/r
w = (Qzy-Qyz)*s
x = 0.5*r
y = (Qxy+Qyx)*s
z = (Qzx+Qxz)*s
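The branches can be combined into one routine, sketched here in Python. The name matrix_to_quat is hypothetical, and the y and z branches are our own extrapolation of "the other cases are similar". Each branch takes the square root of 1 plus the dominant diagonal entry minus the other two, which equals four times the square of the corresponding quaternion component.

```python
import math


def matrix_to_quat(Q):
    """Return (w, x, y, z) for a 3x3 rotation matrix given as nested rows.

    The sign of the result is arbitrary: q and -q encode the same rotation.
    """
    (qxx, qxy, qxz), (qyx, qyy, qyz), (qzx, qzy, qzz) = Q
    t = qxx + qyy + qzz
    if t >= 0.0:
        # 1 + t = 4w^2 is at least 1 here, so dividing by r is safe.
        r = math.sqrt(1.0 + t)
        s = 0.5 / r
        return (0.5 * r, (qzy - qyz) * s, (qxz - qzx) * s, (qyx - qxy) * s)
    if qxx >= qyy and qxx >= qzz:
        r = math.sqrt(1.0 + qxx - qyy - qzz)  # equals 2|x|
        s = 0.5 / r
        return ((qzy - qyz) * s, 0.5 * r, (qxy + qyx) * s, (qzx + qxz) * s)
    if qyy >= qzz:
        r = math.sqrt(1.0 - qxx + qyy - qzz)  # equals 2|y|
        s = 0.5 / r
        return ((qxz - qzx) * s, (qxy + qyx) * s, 0.5 * r, (qyz + qzy) * s)
    r = math.sqrt(1.0 - qxx - qyy + qzz)  # equals 2|z|
    s = 0.5 / r
    return ((qyx - qxy) * s, (qzx + qxz) * s, (qyz + qzy) * s, 0.5 * r)
```

A 180° rotation about the x axis, diag(1,−1,−1), has trace −1 and so exercises the second branch, giving (0,1,0,0).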
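If the matrix contains significant error, such as accumulated numerical error, we may
construct a symmetric 4×4 matrix,
\[
K = \frac{1}{3}\begin{bmatrix}
Q_{xx}+Q_{yy}+Q_{zz} & Q_{zy}-Q_{yz} & Q_{xz}-Q_{zx} & Q_{yx}-Q_{xy} \\
Q_{zy}-Q_{yz} & Q_{xx}-Q_{yy}-Q_{zz} & Q_{yx}+Q_{xy} & Q_{zx}+Q_{xz} \\
Q_{xz}-Q_{zx} & Q_{yx}+Q_{xy} & Q_{yy}-Q_{xx}-Q_{zz} & Q_{zy}+Q_{yz} \\
Q_{yx}-Q_{xy} & Q_{zx}+Q_{xz} & Q_{zy}+Q_{yz} & Q_{zz}-Q_{xx}-Q_{yy}
\end{bmatrix},
\]
and find the eigenvector, (w,x,y,z), of its largest-magnitude eigenvalue. (If Q is truly a
rotation matrix, that value will be 1.) The quaternion so obtained will correspond to the
rotation matrix closest to the given matrix (Bar-Itzhack 2000).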
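A minimal Python sketch of this eigenvector approach follows, assuming the input is already near a rotation matrix. The 4×4 matrix is written out with rows and columns ordered to match the eigenvector (w,x,y,z). The shifted power iteration is our stand-in for a library eigenvalue routine and is not claimed to be the procedure of the cited paper. The name closest_quat and the iteration count are illustrative.

```python
import math


def closest_quat(Q, iterations=50):
    """Estimate the quaternion (w, x, y, z) nearest to a noisy rotation matrix."""
    (qxx, qxy, qxz), (qyx, qyy, qyz), (qzx, qzy, qzz) = Q
    # Symmetric 4x4 matrix (before the factor 1/3), ordered as (w, x, y, z).
    K = [
        [qxx + qyy + qzz, qzy - qyz, qxz - qzx, qyx - qxy],
        [qzy - qyz, qxx - qyy - qzz, qyx + qxy, qzx + qxz],
        [qxz - qzx, qyx + qxy, qyy - qxx - qzz, qzy + qyz],
        [qyx - qxy, qzx + qxz, qzy + qyz, qzz - qxx - qyy],
    ]
    # Apply the factor 1/3 and shift by 1/3. For an exact rotation the
    # eigenvalues 1, -1/3, -1/3, -1/3 become 4/3, 0, 0, 0, so the wanted
    # eigenvector dominates and the eigenvectors themselves are unchanged.
    A = [[K[i][j] / 3.0 + (1.0 / 3.0 if i == j else 0.0) for j in range(4)]
         for i in range(4)]
    # Start from the column with the largest diagonal entry; for an exact
    # rotation every nonzero column is already proportional to the answer.
    k = max(range(4), key=lambda i: A[i][i])
    v = [A[i][k] for i in range(4)]
    for _ in range(iterations):
        v = [sum(A[i][j] * v[j] for j in range(4)) for i in range(4)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
    return tuple(v)
```

Because uniformly scaling the input only scales the 4×4 matrix, a matrix such as 1.1 times a rotation returns the quaternion of that rotation.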
Polar decomposition
If the n×n matrix M is non-singular, its columns are linearly independent vectors; thus the
Gram–Schmidt process can adjust them to be an orthonormal basis. Stated in terms of
numerical linear algebra, we convert M to an orthogonal matrix, Q, using QR
decomposition. However, we often prefer a Q "closest" to M, which this method does not
accomplish. For that, the tool we want is the polar decomposition (Fan & Hoffman 1955;
Higham 1989).
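To measure closeness, we may use any matrix norm invariant under orthogonal
transformations. A convenient choice is the squared Frobenius norm, ||Q−M||F², which is
the sum of the squares of the element differences. Writing this in terms of the trace, Tr,
our goal is:
\[
\text{Find } Q \text{ minimizing } \operatorname{Tr}\left((Q-M)^{T}(Q-M)\right), \text{ subject to } Q^{T}Q = I.
\]
Though written in matrix terms, the objective function is just a quadratic polynomial. We
can minimize it in the usual way, by finding where its derivative is zero. For a 3×3
matrix, the orthogonality constraint implies six scalar equalities that the entries of Q must
satisfy. To incorporate the constraint(s), we may employ a standard technique, Lagrange
multipliers, assembled as a symmetric matrix, Y. Thus our method is:
\[
\text{Differentiate } \operatorname{Tr}\left((Q-M)^{T}(Q-M) + (Q^{T}Q-I)\,Y\right) \text{ with respect to the entries of } Q, \text{ and equate to zero.}
\]
Consider a 2×2 example. Taking the derivative with respect to Qxx, Qxy, Qyx, Qyy in turn,
we assemble a matrix,
\[
2\begin{bmatrix}
Q_{xx}-M_{xx}+Q_{xx}Y_{xx}+Q_{xy}Y_{xy} & Q_{xy}-M_{xy}+Q_{xx}Y_{xy}+Q_{xy}Y_{yy} \\
Q_{yx}-M_{yx}+Q_{yx}Y_{xx}+Q_{yy}Y_{xy} & Q_{yy}-M_{yy}+Q_{yx}Y_{xy}+Q_{yy}Y_{yy}
\end{bmatrix}.
\]
In general, we obtain the equation
\[
0 = 2(Q-M) + 2QY,
\]
so that
\[
M = Q(I+Y) = QS,
\]
where Q is orthogonal and S = I+Y is symmetric. For the stationary point to be a minimum,
S must be positive definite. The product QS is then the polar decomposition of M, with S
the positive square root of MᵀM, since MᵀM = SQᵀQS = S².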
When M is non-singular, the Q and S factors of the polar decomposition are uniquely
determined. However, the determinant of S is positive because S is positive definite, so Q
inherits the sign of the determinant of M. That is, Q is only guaranteed to be orthogonal,
not a rotation matrix. This is unavoidable; an M with negative determinant has no
uniquely defined closest rotation matrix.
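As a rough illustration, the orthogonal polar factor of a non-singular 3×3 matrix can be approximated by repeatedly averaging the current iterate with its inverse transpose, a Newton-type iteration from numerical linear algebra. This is a swapped-in technique chosen for brevity, not necessarily the construction used in the references cited above. The helper names and the fixed iteration count are our own assumptions.

```python
def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def inverse3(A):
    """Invert a non-singular 3x3 matrix using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = A
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def polar_orthogonal_factor(M, iterations=30):
    """Approximate the orthogonal factor Q of the polar decomposition M = QS.

    Iterates Q <- (Q + inverse(Q) transposed) / 2, starting from Q = M.
    Assumes M is non-singular and reasonably well conditioned.
    """
    Q = [list(map(float, row)) for row in M]
    for _ in range(iterations):
        Qit = transpose(inverse3(Q))
        Q = [[0.5 * (Q[r][c] + Qit[r][c]) for c in range(3)] for r in range(3)]
    return Q
```

Applied to diag(1,1,−2), whose determinant is negative, the iteration settles on diag(1,1,−1): an orthogonal matrix, but a reflection rather than a rotation, as the text warns.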
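Axis and angle
To efficiently construct a rotation matrix Q from an angle θ and a unit axis u, we can take
advantage of symmetry and skew-symmetry within the entries. If x, y, and z are the
components of the unit vector representing the axis, and
c = cos(θ); s = sin(θ); C = 1-c
then Q is
\[
Q(\theta) = \begin{bmatrix}
xxC + c & xyC - zs & xzC + ys \\
yxC + zs & yyC + c & yzC - xs \\
zxC - ys & zyC + xs & zzC + c
\end{bmatrix}.
\]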
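The construction can be sketched in Python as follows. The function name axis_angle_to_matrix is illustrative, the angle is assumed to be in radians, and the axis is assumed to be already normalized. Each entry is a symmetric term in C plus either c on the diagonal or a skew-symmetric term in s off the diagonal.

```python
import math


def axis_angle_to_matrix(axis, theta):
    """Build a 3x3 rotation matrix from a unit axis (x, y, z) and angle theta."""
    x, y, z = axis
    c = math.cos(theta)
    s = math.sin(theta)
    C = 1.0 - c
    return [
        [x * x * C + c, x * y * C - z * s, x * z * C + y * s],
        [y * x * C + z * s, y * y * C + c, y * z * C - x * s],
        [z * x * C - y * s, z * y * C + x * s, z * z * C + c],
    ]
```

With axis (0,0,1) and θ = π/2 this gives, up to rounding, the familiar quarter turn in the xy plane.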
Determining an axis and angle, like determining a quaternion, is only possible up to sign;
that is, (u,θ) and (−u,−θ) correspond to the same rotation matrix, just like q and −q. As
well, axis-angle extraction presents additional difficulties. The angle can be restricted to
be from 0° to 180°, but angles are formally ambiguous by multiples of 360°. When the
angle is zero, the axis is undefined. When the angle is 180°, the matrix becomes
symmetric, which has implications in extracting the axis. Near multiples of 180°, care is
needed to avoid numerical problems: in extracting the angle, a two-argument arctangent
with atan2(sin θ,cos θ) equal to θ avoids the insensitivity of arccosine; and in
computing the axis magnitude to force unit magnitude, a brute-force approach can lose
accuracy through underflow (Moler & Morrison 1983).
x = Qzy-Qyz
y = Qxz-Qzx
z = Qyx-Qxy
r = sqrt(x*x + y*y + z*z)
t = Qxx+Qyy+Qzz (trace of matrix Q)
θ = atan2(r,t−1)
The x, y, and z components of the axis would then be divided by r. A fully robust
approach will use different code when the trace t is negative, as with quaternion
extraction. When r is zero because the angle is zero, an axis must be provided from some
source other than the matrix.
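A minimal Python sketch of this extraction follows, under the same caveats. It does not include the separate code path recommended for a negative trace, and it returns None for the axis whenever r is zero. The name matrix_to_axis_angle is hypothetical. The three-argument math.hypot (Python 3.8 or later) is used for the magnitude in place of a hand-written sum of squares.

```python
import math


def matrix_to_axis_angle(Q):
    """Return (axis, theta) from a 3x3 rotation matrix given as nested rows.

    axis is a unit 3-tuple, or None when the skew-symmetric part vanishes.
    theta is in radians, in the range 0 to pi.
    """
    (qxx, qxy, qxz), (qyx, qyy, qyz), (qzx, qzy, qzz) = Q
    x = qzy - qyz
    y = qxz - qzx
    z = qyx - qxy
    r = math.hypot(x, y, z)          # equals 2*sin(theta)
    t = qxx + qyy + qzz              # trace; t - 1 equals 2*cos(theta)
    theta = math.atan2(r, t - 1.0)
    if r == 0.0:
        # Angle 0 (axis undefined) or exactly 180 degrees (axis not
        # recoverable from the skew-symmetric part): the caller must decide.
        return None, theta
    return (x / r, y / r, z / r), theta
```

For the quarter turn about z, the routine returns the axis (0,0,1) and the angle π/2.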
Euler angles
Complexity of conversion escalates with Euler angles (used here in the broad sense). The
first difficulty is to establish which of the twenty-four variations of Cartesian axis order
we will use. Suppose the three angles are θ1, θ2, θ3; physics and chemistry may interpret
these as the zyz product
\[
Q(\theta_1,\theta_2,\theta_3) = Q_z(\theta_1)\,Q_y(\theta_2)\,Q_z(\theta_3).
\]
One systematic approach begins with choosing the right-most axis. Among all
permutations of (x,y,z), only two place that axis first; one is an even permutation and the
other odd. Choosing parity thus establishes the middle axis. That leaves two choices for
the left-most axis, either duplicating the first or not. These three choices give us 3×2×2 =
12 variations; we double that to 24 by choosing static or rotating axes.
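The counting argument can be checked with a short Python sketch. The function name and the string encoding of an axis order (left-most axis first) are our own conventions. "Even" is taken to mean that the middle axis is the cyclic successor of the right-most axis, and "odd" the cyclic predecessor.

```python
from itertools import product


def euler_axis_orders():
    """Enumerate the 12 axis orders, written left-most axis first."""
    axes = "xyz"
    orders = []
    for first, parity, repeat in product(range(3), ("even", "odd"), (True, False)):
        # 'first' is the right-most axis, i.e. the first choice made.
        step = 1 if parity == "even" else 2
        middle = (first + step) % 3
        # The left-most axis either repeats the first or is the remaining one.
        last = first if repeat else 3 - first - middle
        orders.append(axes[last] + axes[middle] + axes[first])
    return orders


print(sorted(euler_axis_orders()))
```

Pairing each of the 12 orders with a choice of static or rotating axes gives the 24 variations.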
This is enough to construct a matrix from angles, but triples differing in many ways can
give the same rotation matrix. For example, suppose we use the zyz convention above;
then we have the following equivalent pairs:
Angles for any order can be found using a concise common routine (Herter & Lott 1993;
Shoemake 1994).
The problem of singular alignment, the mathematical analog of physical gimbal lock,
occurs when the middle rotation aligns the axes of the first and last rotations. It afflicts
every axis order at either even or odd multiples of 90°. These singularities are not
characteristic of the rotation matrix as such, and only occur with the usage of Euler
angles.
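Both the non-uniqueness of angle triples and the singular alignment can be checked numerically. The Python sketch below assumes the zyz convention means Q = Qz(θ1)Qy(θ2)Qz(θ3) with angles in degrees. The helper names and the sample triples are our own illustrations.

```python
import math


def rot_z(a):
    """Rotation by a radians about the z axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(a):
    """Rotation by a radians about the y axis."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def matmul(A, B):
    """Product of two 3x3 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def zyz(t1, t2, t3):
    """Assumed convention: Qz(t1) Qy(t2) Qz(t3), angles in degrees."""
    r = math.radians
    return matmul(rot_z(r(t1)), matmul(rot_y(r(t2)), rot_z(r(t3))))


def same(A, B, tol=1e-12):
    """True when two 3x3 matrices agree entry by entry within tol."""
    return all(abs(A[i][j] - B[i][j]) <= tol for i in range(3) for j in range(3))


# Singular alignment: with a zero middle angle only t1 + t3 matters.
print(same(zyz(72, 0, 0), zyz(40, 0, 32)))
# Adding multiples of 360 degrees changes nothing.
print(same(zyz(90, 45, -105), zyz(-270, -315, 255)))
# Negating the middle angle while shifting the outer angles by 180 degrees.
print(same(zyz(45, 60, -30), zyz(-135, -60, 150)))
```

In the first comparison the middle rotation is the identity, so the two z rotations merge into one and one degree of freedom is lost, which is the singular alignment described above.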
The singularities are avoided when considering and manipulating the rotation matrix as
orthonormal row vectors (in 3D applications often named 'right'-vector, 'up'-vector and
'out'-vector) instead of as angles. The singularities are also avoided when working with
quaternions.
Uniform random rotation matrices
In two dimensions, intuition suggests that a uniformly random rotation has its angle uniformly
distributed between 0 and 2π. That intuition is correct, but does not carry over to higher
dimensions. For example, if we decompose 3×3 rotation matrices in axis-angle form, the
angle should not be uniformly distributed; the probability that (the magnitude of) the
angle is at most θ should be (θ − sin θ)/π, for 0 ≤ θ ≤ π.
Since SO(n) is a connected and locally compact Lie group, we have a simple standard
criterion for uniformity, namely that the distribution be unchanged when composed with
any arbitrary rotation (a Lie group "translation"). This definition corresponds to what is
called Haar measure. León, Massé & Rivest (2006) show how to use the Cayley
transform to generate and test matrices according to this criterion.
We can also generate a uniform distribution in any dimension using the subgroup
algorithm of Diaconis & Shahshahani (1987). This recursively exploits the nested
dimensions group structure of SO(n), as follows. Generate a uniform angle and construct
a 2×2 rotation matrix. To step from n to n+1, generate a vector v uniformly distributed on
the n-sphere, Sⁿ, embed the n×n matrix in the next larger size with last column (0,…,0,1),
and rotate the larger matrix so the last column becomes v.
As usual, we have special alternatives for the 3×3 case. Each of these methods begins
with three independent random scalars uniformly distributed on the unit interval. Arvo
(1992) takes advantage of the odd dimension to change a Householder reflection to a
rotation by negation, and uses that to aim the axis of a uniform planar rotation.
Euler angles can also be used, though not with each angle uniformly distributed
(Murnaghan 1962; Miles 1965).
For the axis-angle form, the axis is uniformly distributed over the unit sphere of
directions, S², while the angle has the non-uniform distribution over [0,π] noted
previously (Miles 1965).
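The axis-angle recipe can be sketched in Python from three uniform scalars. The cylindrical-projection sampling of the sphere and the bisection used to invert the distribution function (θ − sin θ)/π are our own implementation choices, not taken from the cited papers, and the function names are hypothetical.

```python
import math
import random


def angle_from_uniform(u):
    """Invert F(theta) = (theta - sin(theta)) / pi on [0, pi] by bisection."""
    lo, hi = 0.0, math.pi
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if (mid - math.sin(mid)) / math.pi < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def random_axis_angle(rng=random):
    """Draw (axis, angle) so that the corresponding rotation is uniform.

    rng is any object with a random() method returning floats in [0, 1).
    """
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    # Uniform point on the unit sphere: uniform height z, uniform longitude.
    z = 2.0 * u1 - 1.0
    phi = 2.0 * math.pi * u2
    rho = math.sqrt(max(0.0, 1.0 - z * z))
    axis = (rho * math.cos(phi), rho * math.sin(phi), z)
    return axis, angle_from_uniform(u3)
```

Passing a seeded generator such as random.Random(1) as rng makes the draws reproducible.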