
Foundations of Robotics


Chapter 6

Mathematical Building Blocks: From Geometry to Quaternions to Bayesian

Rebecca Stower, Bruno Belzile, and David St-Onge

6.1 Learning Objectives

The objective at the end of this chapter is to be able to:


• use vector and matrix operations;
• represent translation, scaling, and symmetry in matrix operations;
• understand the use and limitation of Euler’s angles and quaternions;
• use homogeneous transformations;
• use derivatives to find the optima of a function and to linearize a function;
• understand the importance and the definition of a Gaussian distribution;
• use t-tests and ANOVAs to validate statistical hypotheses.

6.2 Introduction

Several of the bodies of knowledge related to robotics are grounded in physics and
statistics. While this book tries to cover each topic in an accessible manner, the
large majority of these book chapters expect a minimal background in
mathematics. The following pages summarize a wide range of mathematical
concepts from geometry to statistics. Throughout this chapter, relevant Python
functions are included.

R. Stower (B)
Department of Psychology, Université Vincennes-Paris 8, Saint-Denis, France
e-mail: [email protected]
B. Belzile ·D. St-Onge
Department of Mechanical Engineering, ÉTS Montréal, Montreal, Canada
e-mail: [email protected]
D. St-Onge
e-mail: [email protected]

© The Author(s) 2022
D. Herath and D. St-Onge (eds.), Foundations of Robotics,
https://doi.org/10.1007/978-981-19-1983-1_6

Fig. 6.1 Different coordinate systems in 3D space

6.3 Basic Geometry and Linear Algebra

In this section, a brief non-exhaustive summary of basic concepts in Euclidean
geometry is given. Moreover, some linear algebra operations, useful for the
manipulation of components in different arrays, are recalled.

6.3.1 Coordinate Systems

A coordinate system is a “system for specifying points using coordinates measured
in some specified way.”1 The most common, which you have most probably used
in the past, is the Cartesian coordinate system, shown in Fig. 6.1. In this case,
more precisely in 3D space, we have an origin, i.e., the point from where the
coordinates are measured, and three independent and orthogonal axes, X, Y, and
Z. Three axes are needed and they must be independent, but they do not need to
be orthogonal. However, for practical reasons in most (but not all) applications,
orthogonal axes are preferred (Hassenpflug, 1995).
You may encounter some common alternatives to Cartesian coordinates that
can be more appropriate for some applications, such as spherical and cylindrical
coordinates. In the former, the coordinates are defined by a distance ρ from the
origin and two angles, i.e., θ and φ. In the latter, which is an extension of polar
coordinates in 2D, a radial distance r , an azimuth (angle) θ , and an axial
coordinate (height) z are needed. While a point is uniquely defined with Cartesian
coordinates, it is not totally the case with spherical and cylindrical coordinates;
more precisely, the origin is defined by an infinite set of coordinates with those
two systems, as the angles are not defined at the origin. Moreover, you can
add/subtract multiples of 360° to every angle and you will end up with the same
point, but different coordinates. Finally, you should be careful with cylindrical
and spherical coordinates, as the variables used to define the individual
coordinates may be switched, depending on the convention used, which usually
differs if you are talking to a physicist, a mathematician, or an engineer.2

1 https://mathworld.wolfram.com/CoordinateSystem.html.
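As a minimal sketch of these alternative systems, the snippet below converts a Cartesian point to cylindrical and spherical coordinates and back. The function names are hypothetical, and the angle convention is an assumption (θ is the azimuth measured from X in the XY-plane, φ is the angle measured from the Z-axis); as noted above, other conventions swap these symbols.

```python
import numpy as np

def cartesian_to_cylindrical(x, y, z):
    """Return (r, theta, z); theta is the azimuth in radians."""
    r = np.hypot(x, y)
    theta = np.arctan2(y, x)
    return r, theta, z

def cartesian_to_spherical(x, y, z):
    """Return (rho, theta, phi); phi is measured from the Z-axis (assumed convention)."""
    rho = np.sqrt(x**2 + y**2 + z**2)
    theta = np.arctan2(y, x)
    # The angles are undefined at the origin; 0 is returned there by choice.
    phi = np.arccos(z / rho) if rho > 0 else 0.0
    return rho, theta, phi

def spherical_to_cartesian(rho, theta, phi):
    """Inverse of cartesian_to_spherical under the same assumed convention."""
    x = rho * np.sin(phi) * np.cos(theta)
    y = rho * np.sin(phi) * np.sin(theta)
    z = rho * np.cos(phi)
    return x, y, z

point = (1.0, 1.0, 1.0)
rho, theta, phi = cartesian_to_spherical(*point)
back = spherical_to_cartesian(rho, theta, phi)

# Adding a full turn to an angle changes the coordinates, not the point
same = spherical_to_cartesian(rho, theta + 2 * np.pi, phi)
```

Here `back` and `same` both recover the original Cartesian point, which illustrates why spherical coordinates do not define a point uniquely.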

6.3.2 Vector/Matrix Representation

In mathematics, a vector is “a quantity that has magnitude and direction and that
is commonly represented by a directed line segment whose length represents the
magnitude and whose orientation in space represents the direction.”3 As you may
notice, this definition does not refer to components and reference frames, which
we often come across when vectors are involved. This is because there is a
common confusion between the physical quantity represented by a vector and the
representation of that same quantity in a coordinate system with one-dimensional
arrays. The same word, vector, is used to refer to these arrays, but you should be
careful to distinguish the two. Commonly, an arrow over a lower case letter defines
a vector, the physical quantity, for example $\vec{a}$, and a lower case bold letter
represents a vector defined in a coordinate system, i.e., with components, for
example, $\mathbf{a}$. You should note, however, that authors sometimes use
different conventions. In this book, the coordinate system used to represent a
vector is denoted by a superscript. For example, the variable $\mathbf{b}^S$ is the
embodiment of $\vec{b}$ in frame $S$, while $\mathbf{b}^T$ is the embodiment of
$\vec{b}$ in frame $T$. They do not have the same components, but they remain
the same vector.
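Vectors $\vec{a}$ and $\vec{b}$ in an n-dimensional Euclidean space can be
displayed with their components as

$$\mathbf{a} = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ \vdots \\ a_{n-1} \\ a_n \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_{n-1} \\ b_n \end{bmatrix} \tag{6.1}$$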

For example, vectors $\vec{c}$ and $\vec{d}$ are shown in Fig. 6.2. As can be
seen, two reference frames are also displayed. Their components in these frames
are

$$\mathbf{c}^S = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \mathbf{c}^T = \begin{bmatrix} 0 \\ 1.4142 \end{bmatrix}, \quad \mathbf{d}^S = \begin{bmatrix} 1 \\ 3 \end{bmatrix}, \quad \mathbf{d}^T = \begin{bmatrix} -1.4142 \\ 2.8284 \end{bmatrix} \tag{6.2}$$

2 See https://mathworld.wolfram.com/SphericalCoordinates.html.
3 https://www.merriam-webster.com/dictionary/vector.

Fig. 6.2 Planar vectors and their components in different frames

import numpy as np  # Import library

# arrays
a = np.array([1, 1])  # vector
A = np.array([[1, 2],
              [3, 4]])  # matrix (note the outer brackets)
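To make the distinction between a vector and its components more concrete, the sketch below recomputes the components of (6.2) in frame T from those in frame S. It assumes that frame T is frame S rotated by −45° about the Z-axis; this angle is inferred from the numerical values in (6.2) and is not stated in the text. The variable names are hypothetical.

```python
import numpy as np

# Assumption: frame T is frame S rotated by -45 degrees about Z
angle = np.deg2rad(-45.0)

# Columns of R_ST are the unit axes of T expressed in S
R_ST = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle),  np.cos(angle)]])

c_S = np.array([1.0, 1.0])
d_S = np.array([1.0, 3.0])

# Projecting onto the axes of T gives the components in T
c_T = R_ST.T @ c_S
d_T = R_ST.T @ d_S

# The components differ, but the magnitude (a property of the
# physical vector) is the same in both frames
norm_c_S, norm_c_T = np.linalg.norm(c_S), np.linalg.norm(c_T)
```

Under this assumption, `c_T` and `d_T` match the values given in (6.2) to four decimals, while the norms are identical in both frames.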

Similarly, tensors are used to represent physical properties of a body (and many
other things). More formally, tensors are algebraic objects defining multilinear
relationships between other objects in a vector space. Do not focus too much on
the mathematical definition, but instead on what you already know. You have
already encountered some tensors in this chapter, since scalars and vectors (the
physical quantity, not the array) are, respectively, rank-0 and rank-1 tensors.4
Therefore, tensors can be seen as their generalization. One example of rank-2
tensors is the inertia tensor of a rigid body, which basically represents how the
mass is distributed in a rigid body (which does not depend on a reference frame).
For the sake of numerical computation, the representation of a rank-2 tensor in a
coordinate system can be done with what we call a matrix. You should be careful,
however, not to confuse matrices and rank-2 tensors. Indeed, all rank-2 tensors can
be represented by a matrix, but not all matrices are rank-2 tensors. In other words,
matrices are just boxes (arrays) with numbers inside (components) that can be
used to represent different objects, rank-2 tensors among them. Matrices are
generally represented by upper case bold letters, e.g., $\mathbf{A}$. Matrices, which
have components, can also be defined in specific reference frames. Therefore, the
superscript to denote the reference frame also applies to matrices in the book, e.g.,
$\mathbf{H}^S$ is a homogeneous transformation matrix (will be seen in Sect. 6.4.4)
defined in frame $S$.
Other common matrices with typical characteristics include:
• the square matrix, which is a matrix with an equal number of rows and columns;
• the diagonal matrix, which only has nonzero components on its diagonal, i.e.,
components (1, 1), (2, 2), . . . , (n, n);
• the identity matrix 1, which is a (n×n) matrix with only 1 on the diagonal, the
other components all being equal to 0.

4 For more information on tensors and their rank: https://mathworld.wolfram.com/Tensor.html.

6.3.3 Basic Vector/Matrix Operations

Vectors and matrices are powerful and versatile mathematical tools with several
handy properties and operations. We recall below those most useful in robotics.
Dot Product
Addition and multiplication by a scalar are simply distributed over the components
of a vector. Beyond these, the two most relevant operations in robotics are the dot
and cross products. The dot product is also known as the scalar product, as the
result of the dot product of two arbitrary vectors is a scalar. Let $\vec{a}$ and
$\vec{b}$ be two arbitrary vectors and their corresponding magnitudes5 be
$\|\vec{a}\|$ and $\|\vec{b}\|$, then the dot product of these two vectors is

$$\vec{a} \cdot \vec{b} = \|\vec{a}\| \|\vec{b}\| \cos\theta \tag{6.3}$$

where θ is the angle between those two vectors. If the two vectors are orthogonal,
by definition, the result will be zero. If components are used, then we have

$$\mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3 + \dots + a_{n-1} b_{n-1} + a_n b_n \tag{6.4}$$

import numpy as np  # Import library

# dot product
np.dot(a, b)  # dot product of two array-like inputs
np.linalg.multi_dot([a, b, c])  # dot product of two or more arrays in a single call
# magnitude of a vector
np.linalg.norm(a)

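Using the numerical values previously given in (6.2), i.e., taking
$\vec{a} = \vec{c}$ and $\vec{b} = \vec{d}$, the dot product of $\vec{a}$ and
$\vec{b}$ is:

$$\vec{a} \cdot \vec{b} = 1.4142 \cdot 3.1623 \cos(0.4636) = 4 \tag{6.5}$$

$$\mathbf{a}^S \cdot \mathbf{b}^S = 1 \cdot 1 + 1 \cdot 3 = 4 \tag{6.6}$$

$$\mathbf{a}^T \cdot \mathbf{b}^T = 0 \cdot (-1.4142) + 1.4142 \cdot 2.8284 = 4 \tag{6.7}$$

As you can see from this example, both the geometric and algebraic definitions of
the dot product are equivalent.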
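The three results (6.5) to (6.7) can be checked numerically. The sketch below is only an illustration with hypothetical variable names; the angle between the vectors is computed from their directions in frame S, and the components in frame T are the rounded values of (6.2), so the result is only approximately 4 in that frame.

```python
import numpy as np

# Components from (6.2), with a = c and b = d
a_S, b_S = np.array([1.0, 1.0]), np.array([1.0, 3.0])
a_T, b_T = np.array([0.0, 1.4142]), np.array([-1.4142, 2.8284])

# Algebraic definition (6.4), evaluated in each frame
dot_S = np.dot(a_S, b_S)
dot_T = np.dot(a_T, b_T)

# Geometric definition (6.3): angle between the two directions
theta = np.arctan2(b_S[1], b_S[0]) - np.arctan2(a_S[1], a_S[0])
dot_geometric = np.linalg.norm(a_S) * np.linalg.norm(b_S) * np.cos(theta)

print(dot_S, dot_T, dot_geometric, theta)
```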
Cross Product
The other type of multiplication with vectors is the cross product. Contrary to the
dot product, the cross product of two vectors results in another vector, not a scalar.
Again, both vectors must have the same dimension. With $\vec{a}$ and $\vec{b}$
used above, the cross product is defined as

$$\vec{a} \times \vec{b} = \|\vec{a}\| \|\vec{b}\| \sin\theta \, \vec{e} \tag{6.8}$$

where, as with the dot product, θ is the angle between $\vec{a}$ and $\vec{b}$, and
$\vec{e}$ is a unit vector6 orthogonal to the first two. Its direction is established
with the right-hand rule. In 3D space, the components of the resulting vector can
be computed with the following formula:

$$\mathbf{a} \times \mathbf{b} = \begin{bmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{bmatrix} \tag{6.9}$$

where $\mathbf{a} = [a_1 \; a_2 \; a_3]^\top$ and $\mathbf{b} = [b_1 \; b_2 \; b_3]^\top$.

5 Length, always positive.


The right-hand rule is used to easily determine the direction of a vector resulting from the cross p

import numpy as np  # Import library

# cross product
np.cross(a, b)

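Again, using the numerical values used above in (6.2), with $\vec{a} = \vec{c}$
and $\vec{b} = \vec{d}$, we can compute the cross product. Of course, since these
two vectors are planar and the cross product is defined over 3D space, the third
component in Z is assumed equal to zero. The result is given below:

$$\vec{a} \times \vec{b} = 1.4142 \cdot 3.1623 \sin(0.4636) \, \vec{k} = 2 \vec{k} \tag{6.10}$$

$$\mathbf{a}^S \times \mathbf{b}^S = \begin{bmatrix} 1 \cdot 0 - 0 \cdot 3 \\ 0 \cdot 1 - 1 \cdot 0 \\ 1 \cdot 3 - 1 \cdot 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 2 \end{bmatrix} \tag{6.11}$$

$$\mathbf{a}^T \times \mathbf{b}^T = \begin{bmatrix} 1.4142 \cdot 0 - 0 \cdot 2.8284 \\ 0 \cdot (-1.4142) - 0 \cdot 0 \\ 0 \cdot 2.8284 - 1.4142 \cdot (-1.4142) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 2 \end{bmatrix} \tag{6.12}$$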
where $\vec{k}$ is the unit vector parallel to the Z-axis. By this definition, you can
observe that the unit vector defining the Z-axis of a Cartesian coordinate frame is
simply the cross product of the unit vectors defining the X- and Y-axes, following
the order given by the right-hand rule. These three unit vectors are commonly
labeled $\vec{i}$, $\vec{j}$ and $\vec{k}$, as shown in Fig. 6.3. You should note that
the cross product of unit vector $\vec{a}$ with $\vec{j}$ also results in a vector
parallel to $\vec{k}$, since $\vec{a}$ is also in the XY-plane. Moreover, as you can
see with the cross product of $\vec{i}$ and $\vec{k}$ illustrated in the same figure,
a vector is not attached to a particular point in space. As mentioned before, it is
defined by a direction and a magnitude, thus the location where it is represented
does not have any impact on the cross product result.

6 With a magnitude of 1.

Fig. 6.3 Unit vectors defining a Cartesian frame
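As a quick numerical check of (6.9) to (6.11), the following sketch computes the cross product of the two planar vectors padded with a zero Z component, and verifies a few properties mentioned above. The variable names are hypothetical.

```python
import numpy as np

# Planar vectors of (6.2) in frame S, padded with a zero Z component
a_S = np.array([1.0, 1.0, 0.0])
b_S = np.array([1.0, 3.0, 0.0])

cross_S = np.cross(a_S, b_S)         # same result as (6.11)
cross_reversed = np.cross(b_S, a_S)  # swapping the operands flips the sign

# Unit vectors of a Cartesian frame: k is the cross product of i and j
i = np.array([1.0, 0.0, 0.0])
j = np.array([0.0, 1.0, 0.0])
k = np.cross(i, j)

# The result is orthogonal to both operands
is_orthogonal = np.isclose(np.dot(cross_S, a_S), 0.0) and np.isclose(np.dot(cross_S, b_S), 0.0)
```

Note that, unlike the dot product, the cross product changes sign when the order of the operands is reversed.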
Matrix Multiplication
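Similarly to vectors, the addition and multiplication by a scalar are also distributed
over the components for matrices. On the other hand, the matrix multiplication is
a little more complicated. Let matrix $\mathbf{A}$ be defined by row vectors and
matrix $\mathbf{B}$ be defined by column vectors, i.e.,

$$\mathbf{A} = \begin{bmatrix} \mathbf{a}_1 \\ \mathbf{a}_2 \\ \mathbf{a}_3 \\ \vdots \\ \mathbf{a}_{n-1} \\ \mathbf{a}_n \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \mathbf{b}_3 & \dots & \mathbf{b}_{n-1} & \mathbf{b}_n \end{bmatrix} \tag{6.13}$$

Then, the matrix multiplication is defined as

$$\mathbf{A}\mathbf{B} = \begin{bmatrix}
\mathbf{a}_1 \cdot \mathbf{b}_1 & \mathbf{a}_1 \cdot \mathbf{b}_2 & \mathbf{a}_1 \cdot \mathbf{b}_3 & \dots & \mathbf{a}_1 \cdot \mathbf{b}_{n-1} & \mathbf{a}_1 \cdot \mathbf{b}_n \\
\mathbf{a}_2 \cdot \mathbf{b}_1 & \mathbf{a}_2 \cdot \mathbf{b}_2 & \mathbf{a}_2 \cdot \mathbf{b}_3 & \dots & \mathbf{a}_2 \cdot \mathbf{b}_{n-1} & \mathbf{a}_2 \cdot \mathbf{b}_n \\
\mathbf{a}_3 \cdot \mathbf{b}_1 & \mathbf{a}_3 \cdot \mathbf{b}_2 & \mathbf{a}_3 \cdot \mathbf{b}_3 & \dots & \mathbf{a}_3 \cdot \mathbf{b}_{n-1} & \mathbf{a}_3 \cdot \mathbf{b}_n \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
\mathbf{a}_{n-1} \cdot \mathbf{b}_1 & \mathbf{a}_{n-1} \cdot \mathbf{b}_2 & \mathbf{a}_{n-1} \cdot \mathbf{b}_3 & \dots & \mathbf{a}_{n-1} \cdot \mathbf{b}_{n-1} & \mathbf{a}_{n-1} \cdot \mathbf{b}_n \\
\mathbf{a}_n \cdot \mathbf{b}_1 & \mathbf{a}_n \cdot \mathbf{b}_2 & \mathbf{a}_n \cdot \mathbf{b}_3 & \dots & \mathbf{a}_n \cdot \mathbf{b}_{n-1} & \mathbf{a}_n \cdot \mathbf{b}_n
\end{bmatrix} \tag{6.14}$$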

While this result may seem scary at first, you can see that the (i, j) component7 is
simply the dot product of the ith row of the first matrix and the jth column of the
second matrix. The number of columns of the first matrix (A) must be equal to the
number of rows of the second matrix (B).

7 The (i, j) component is the component on the ith row and jth column.
import numpy as np  # Import library

# matrix multiplication
np.matmul(A, B)  # for array-like inputs
A @ B  # for ndarray inputs

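To illustrate this operation, let $\mathbf{A}$ and $\mathbf{B}$ be (2 × 2) matrices, i.e.,

$$\mathbf{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} 1 & 0 \\ -1 & 2 \end{bmatrix} \tag{6.15}$$

then, the result of the matrix multiplication is

$$\mathbf{A}\mathbf{B} = \begin{bmatrix} 1 \cdot 1 - 2 \cdot 1 & 1 \cdot 0 + 2 \cdot 2 \\ 3 \cdot 1 - 4 \cdot 1 & 3 \cdot 0 + 4 \cdot 2 \end{bmatrix} = \begin{bmatrix} -1 & 4 \\ -1 & 8 \end{bmatrix} \tag{6.16}$$

It is critical that you understand that matrix multiplication is not commutative,
which means the order matters, as you can see in the following example with
matrices $\mathbf{A}$ and $\mathbf{B}$ used above:

$$\mathbf{A}\mathbf{B} = \begin{bmatrix} -1 & 4 \\ -1 & 8 \end{bmatrix}, \quad \text{but} \quad \mathbf{B}\mathbf{A} = \begin{bmatrix} 1 & 2 \\ 5 & 6 \end{bmatrix} \tag{6.17}$$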
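The example of (6.15) to (6.17) can be reproduced in a few lines. The sketch below also rebuilds the product component by component, following the row-by-column rule of (6.14); `AB_manual` is a hypothetical name used for illustration only.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[1, 0],
              [-1, 2]])

AB = A @ B  # [[-1, 4], [-1, 8]]
BA = B @ A  # [[1, 2], [5, 6]]

# Component (i, j) is the dot product of row i of A and column j of B
AB_manual = np.array([[np.dot(A[i, :], B[:, j]) for j in range(2)]
                      for i in range(2)])

print(AB)
print(BA)
print(np.array_equal(AB, BA))  # the order of the operands matters
```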

Transpose of a Matrix
Another common operation on a matrix is the computation of its transpose,
namely an operation which flips a matrix over its diagonal. The generated matrix,
denoted $\mathbf{A}^\top$, has the row and column indices switched with respect to
$\mathbf{A}$. For instance, with a (3 × 3) matrix $\mathbf{C}$, its transpose is defined as

$$\mathbf{C}^\top = \begin{bmatrix} c_{1,1} & c_{1,2} & c_{1,3} \\ c_{2,1} & c_{2,2} & c_{2,3} \\ c_{3,1} & c_{3,2} & c_{3,3} \end{bmatrix}^\top = \begin{bmatrix} c_{1,1} & c_{2,1} & c_{3,1} \\ c_{1,2} & c_{2,2} & c_{3,2} \\ c_{1,3} & c_{2,3} & c_{3,3} \end{bmatrix} \tag{6.18}$$

import numpy as np  # Import library

# matrix transpose
np.transpose(A)  # function for array-like input
A.transpose()  # method for ndarray
A.T  # attribute for ndarray

Since vectors (arrays of components) are basically (n × 1) matrices, the transpose
can be used to compute the dot product of two vectors with a matrix
multiplication, i.e.,

$$\mathbf{a} \cdot \mathbf{b} = \mathbf{a}^\top \mathbf{b} = a_1 b_1 + a_2 b_2 + \dots + a_n b_n \tag{6.19}$$

Determinant and Inverse of a Matrix


Finally, a brief introduction to the inverse of a matrix is necessary, as it is quite
common in robotics, from mechanics to control to optimization. Let $\mathbf{A}$ be
an (n × n) square matrix;8 this matrix is invertible if there exists a matrix
$\mathbf{B}$ such that

$$\mathbf{A}\mathbf{B} = \mathbf{1}, \quad \text{and} \quad \mathbf{B}\mathbf{A} = \mathbf{1} \tag{6.20}$$

Then, matrix $\mathbf{B}$ is the inverse of $\mathbf{A}$ and therefore can be written as
$\mathbf{A}^{-1}$. The components of $\mathbf{A}^{-1}$ can be computed formally with
the following formula:

$$\mathbf{A}^{-1} = \frac{1}{\det(\mathbf{A})} \mathbf{C}^\top \tag{6.21}$$

where $\det(\mathbf{A})$ is called the determinant of $\mathbf{A}$ and $\mathbf{C}$ is the
cofactor matrix9 of $\mathbf{A}$. The determinant of a matrix, a scalar sometimes
labeled $|\mathbf{A}|$, is equal to, in the case of a (2 × 2) matrix,

$$\det(\mathbf{A}) = ad - bc, \quad \text{where} \quad \mathbf{A} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \tag{6.22}$$

Similarly, for a (3 × 3) matrix, we have

$$\det(\mathbf{A}) = a(ei - fh) - b(di - fg) + c(dh - eg), \quad \text{where} \quad \mathbf{A} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \tag{6.23}$$

The determinant of a matrix is critical when it comes to the computation of its
inverse, as a determinant of 0 corresponds to a singular matrix, which does not
have an inverse. The inverse of a (2 × 2) matrix can be computed with the following
formula

$$\mathbf{A}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}, \quad \text{where} \quad \mathbf{A} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \tag{6.24}$$

Similarly, for a (3 × 3) matrix, we have

$$\mathbf{A}^{-1} = \frac{1}{\det(\mathbf{A})} \begin{bmatrix} (ei - fh) & -(bi - ch) & (bf - ce) \\ -(di - fg) & (ai - cg) & -(af - cd) \\ (dh - eg) & -(ah - bg) & (ae - bd) \end{bmatrix}, \quad \text{where} \quad \mathbf{A} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \tag{6.25}$$

8 Same number of rows and columns.
9 The cofactor matrix will not be introduced here for the sake of brevity, but its definition can be
found in any linear algebra textbook.

import numpy as np  # Import library

# matrix determinant
np.linalg.det(A)
# matrix inverse
np.linalg.inv(A)

As you can see from Eq. (6.25), you cannot invert a matrix with a determinant
equal to zero, since it would result in a division by zero. The inverse of a matrix is
a useful tool to solve a system of linear equations. Indeed, a system of n equations
with n unknowns can be cast in matrix form as

$$\mathbf{A}\mathbf{x} = \mathbf{b} \tag{6.26}$$

where the unknowns are the components of $\mathbf{x}$, the constants are the
components of $\mathbf{b}$ and the factors in front of each unknown are the
components of matrix $\mathbf{A}$. Therefore, we can find the solution of this
system, namely the values of the unknown variables, as

$$\mathbf{x} = \mathbf{A}^{-1}\mathbf{b} \tag{6.27}$$
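The sketch below applies (6.24) and (6.27) to a small system chosen for illustration (2x + y = 5 and x + 3y = 10; these numbers are not from the text). It also shows `np.linalg.solve`, which solves the system directly without forming the inverse and is generally the preferred call in numerical code.

```python
import numpy as np

# Illustrative system: 2x + y = 5, x + 3y = 10
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

det_A = np.linalg.det(A)  # non-zero, so A is invertible

# Solution through the explicit inverse, as in (6.27)
x_inv = np.linalg.inv(A) @ b

# Same solution without forming the inverse
x_solve = np.linalg.solve(A, b)

# Closed-form (2 x 2) inverse of (6.24)
a11, a12, a21, a22 = A.ravel()
A_inv_formula = np.array([[a22, -a12],
                          [-a21, a11]]) / (a11 * a22 - a12 * a21)

print(x_inv, x_solve)
```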

Generalized Inverses
However, if we have more equations (m) than the number of unknowns (n), the
system is overdetermined, and thus $\mathbf{A}$ is no longer a square matrix. Its
dimensions are (m × n). An exact solution to this system of equations cannot
generally be found. In this case, we use a generalized inverse, a strategy to find an
optimal solution. Several generalized inverses, or pseudo-inverses, can be found in
the literature (Ben-Israel and Greville, 2003), each with a different optimization
criterion. For the sake of this book, only one type is presented here, the Moore–
Penrose generalized inverse (MPGI). In the case of overdetermined systems, the
MPGI is used to find the approximate solution that minimizes the Euclidean norm
of the error, which is defined as

$$\mathbf{e}_0 = \mathbf{b} - \mathbf{A}\mathbf{x}_0 \tag{6.28}$$

where $\mathbf{x}_0$ and $\mathbf{e}_0$ are the approximate solution and the residual
error, respectively. The approximate solution is computed with

$$\mathbf{x}_0 = \mathbf{A}^L \mathbf{b}, \quad \mathbf{A}^L = (\mathbf{A}^\top \mathbf{A})^{-1} \mathbf{A}^\top \tag{6.29}$$

where $\mathbf{A}^L$ is named the left Moore–Penrose generalized inverse (LMPGI),
since $\mathbf{A}^L \mathbf{A} = \mathbf{1}$. As an exercise, you can try to prove this
equation.
There is another MPGI that can be useful in robotics, but not quite as common
as the LMPGI, the right Moore–Penrose generalized inverse (RMPGI). The right
generalized inverse is defined as

$$\mathbf{A}^R \equiv \mathbf{A}^\top (\mathbf{A}\mathbf{A}^\top)^{-1}, \quad \mathbf{A}\mathbf{A}^R = \mathbf{1} \tag{6.30}$$

where $\mathbf{A}$ is an (m × n) matrix with m < n, i.e., representing a system of
linear equations with more unknowns than equations. In this case, this system
admits infinitely many solutions. Therefore, we are not looking for the best
approximate solution, but one solution with the minimum (Euclidean) norm. For
example, in robotics, when there is an infinite set of joint configurations possible
to perfectly reach an arbitrary position with a manipulator, the RMPGI can give
you the one minimizing the joint rotations. With both generalized inverses
presented here, we assume that $\mathbf{A}$ is full rank, which means that its
individual columns are independent if m > n, or its individual rows are independent
if m < n. In the case of a square matrix (m = n), a full rank matrix is simply
non-singular.
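Both generalized inverses can be sketched with a few NumPy calls. The matrices below are arbitrary full-rank examples chosen for illustration, and the variable names are hypothetical. For full-rank matrices, `np.linalg.pinv` returns the same result as the explicit formulas (6.29) and (6.30).

```python
import numpy as np

# Overdetermined system (m > n): 3 equations, 2 unknowns
A_tall = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])
b_tall = np.array([1.0, 2.0, 2.0])

A_L = np.linalg.inv(A_tall.T @ A_tall) @ A_tall.T  # left inverse (6.29)
x0 = A_L @ b_tall                                  # least-squares solution
e0 = b_tall - A_tall @ x0                          # residual error (6.28)

# Underdetermined system (m < n): 2 equations, 3 unknowns
A_wide = np.array([[1.0, 1.0, 0.0],
                   [0.0, 1.0, 1.0]])
b_wide = np.array([1.0, 2.0])

A_R = A_wide.T @ np.linalg.inv(A_wide @ A_wide.T)  # right inverse (6.30)
x_min = A_R @ b_wide                               # minimum-norm solution

print(np.allclose(A_L, np.linalg.pinv(A_tall)))
print(np.allclose(A_R, np.linalg.pinv(A_wide)))
```

In the overdetermined case the residual `e0` is generally non-zero, whereas in the underdetermined case `x_min` satisfies the equations exactly.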

6.4 Geometric Transformations

It is crucial in robotics to be able to describe geometric relations in a clear and
unambiguous way. This is done with coordinate systems and reference frames as
mentioned above. You may have studied already four kinds of geometric
transformation: translation, scaling, symmetry (mirror), and rotation. We will
quickly go over each of them, as they all are useful for computer-assisted design.
However, keep in mind that transformations used to map coordinates in one frame
into another use only translation and rotation.
For clarity, we will present all geometric transformations in matrix form, to
leverage the powerful operations and properties as well as their condensed
format. Using the vector introduction above (Sect. 6.3.2), the simplest geometric
element will be used to introduce the transformation, the point:

$$\mathbf{P}_{2D}(x, y) = \begin{bmatrix} x \\ y \end{bmatrix}, \quad \mathbf{P}_{3D}(x, y, z) = \begin{bmatrix} x \\ y \\ z \end{bmatrix} \tag{6.31}$$

In fact, you only need to apply transformations to point entities in order to
transform any 2D and 3D geometry. From a set of points, you can define
connected pairs, i.e., edges or lines, and from a set of lines you can define loops,
i.e., surfaces. Finally, a set of selected surfaces can define a solid (Fig. 6.4).

6.4.1 Basic Transformations

Let’s start with a numerical example: given a point at x = 1 and y = 2 that we
intend to move by 2 units toward positive x and by 4 units toward positive y. The
algebraic form of this operation is simply $x' = x + 2$ and $y' = y + 4$, which can
be written in matrix form:

$$\mathbf{P}' = \mathbf{P} + \mathbf{T} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} + \begin{bmatrix} 2 \\ 4 \end{bmatrix} \tag{6.32}$$

Fig. 6.4 Basic geometrical transformations, from left to right: translation, scaling and mirror
(symmetry)

Similar reasoning applies in three dimensions. Now imagine we use point
$\mathbf{P}$ to define a line with the origin $\mathbf{P}_0 = [0 \; 0]^\top$ and that we
want to stretch this line with a scaling factor of 2. The algebraic form of this
operation is $x' = x \times 2$ and $y' = y \times 2$, which can be written in matrix form:

$$\mathbf{P}' = \mathbf{S}\mathbf{P} = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} \tag{6.33}$$

This scaling operation is referred to as proportional, since both axes have the same
scaling factor. Using different scaling factors will deform the geometry. If, instead
of scaling the geometry, we use a similar diagonal matrix to change the sign of one
or more of its components, it will generate a symmetry. For instance, a symmetry
with respect to the y-axis is written:

$$\mathbf{P}' = \mathbf{S}\mathbf{P} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} \tag{6.34}$$

These operations are simple and do not change when increasing the dimensions
from two to three. Rotations, however, are not as simple.
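The three basic transformations of (6.32) to (6.34) can be reproduced with NumPy as follows. This is a minimal sketch with hypothetical variable names; the mirror matrix is named `M` here only to avoid reusing `S` for two different matrices.

```python
import numpy as np

P = np.array([1.0, 2.0])

# Translation (6.32): an addition
T = np.array([2.0, 4.0])
P_translated = P + T   # [3, 6]

# Proportional scaling (6.33): a diagonal matrix with equal factors
S = np.diag([2.0, 2.0])
P_scaled = S @ P       # [2, 4]

# Symmetry with respect to the y-axis (6.34): sign change on x
M = np.diag([-1.0, 1.0])
P_mirrored = M @ P     # [-1, 2]

print(P_translated, P_scaled, P_mirrored)
```

The three results correspond to the transformed points P′ labeled in Fig. 6.4.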

6.4.2 2D/3D Rotations

A rotation is a geometric transformation that is more easily introduced with polar
coordinates (see Fig. 6.5):

$$\mathbf{P} = \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} r\cos(\alpha) \\ r\sin(\alpha) \end{bmatrix}. \tag{6.35}$$

Then a rotation θ applied to this vector consists in:

$$\mathbf{P}' = \begin{bmatrix} r\cos(\alpha + \theta) \\ r\sin(\alpha + \theta) \end{bmatrix}, \tag{6.36}$$

which can be split with respect to the angles using common trigonometric identities
leading to

$$\mathbf{P}' = \begin{bmatrix} x\cos(\theta) - y\sin(\theta) \\ x\sin(\theta) + y\cos(\theta) \end{bmatrix} = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}. \tag{6.37}$$

Fig. 6.5 Planar rotation and polar coordinates
The resulting (2 × 2) matrix is referred to as the rotation matrix, and its format is
unique in 2D. Any rotation in the plane can be represented by this matrix, using
the right-hand rule for the sign of θ. This matrix is unique because a single
rotation axis exists for planar geometry: the perpendicular to the plane (often set as
the z-axis). For geometry in three-dimensional space, there is an infinite number of
potential rotation axes; just visualize the rotational motions you can apply to an
object in your hand. One approach to this challenge consists in defining a direction
vector in space and a rotation angle around it, since Leonhard Euler taught us that
“in three-dimensional space, any displacement of a rigid body such that a point on
the rigid body remains fixed, is equivalent to a single rotation about some axis
that runs through the fixed point.” While this representation is appealing to
humans fond of geometry, it is not practical to implement in computer programs
for generalized rotations. Instead, we can decompose any three-dimensional
rotation into a sequence of three rotations around principal axes. This approach
relies on what are called Euler angles and is the most common representation of
three-dimensional rotation. We only need to define three matrices:
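$$\mathbf{R}_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos(\psi) & -\sin(\psi) \\ 0 & \sin(\psi) & \cos(\psi) \end{bmatrix}, \tag{6.38}$$

$$\mathbf{R}_y = \begin{bmatrix} \cos(\phi) & 0 & \sin(\phi) \\ 0 & 1 & 0 \\ -\sin(\phi) & 0 & \cos(\phi) \end{bmatrix}, \tag{6.39}$$

$$\mathbf{R}_z = \begin{bmatrix} \cos(\theta) & -\sin(\theta) & 0 \\ \sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{6.40}$$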

Although these matrices are the only ones required to represent any rotation, they
still leave two arbitrary choices: 1. the orientation of the principal axes (x − y − z)
in space, 2. the order of the rotations. Rotation matrices are multiplication
operations over geometry features, and, as mentioned above, these operations are
not commutative. The solution is to agree over a universal set of conventions:

XYX, XYZ, XZX, XZY, YXY, YXZ, YZX,
YZY, ZXY, ZXZ, ZYX, and ZYZ. (6.41)

These twelve conventions still need their axes orientation to be defined: each
axis can either be fixed to the inertial frame (often referred to as extrinsic
rotations) or attached to the body rotating (often referred to as intrinsic rotations).
For instance, the fixed rotation matrix for the XYZ convention is:

$$\mathbf{R}_z \mathbf{R}_y \mathbf{R}_x = \begin{bmatrix}
\cos\theta\cos\phi & \cos\theta\sin\phi\sin\psi - \sin\theta\cos\psi & \cos\theta\sin\phi\cos\psi + \sin\theta\sin\psi \\
\sin\theta\cos\phi & \sin\theta\sin\phi\sin\psi + \cos\theta\cos\psi & \sin\theta\sin\phi\cos\psi - \cos\theta\sin\psi \\
-\sin\phi & \cos\phi\sin\psi & \cos\phi\cos\psi
\end{bmatrix}. \tag{6.42}$$

While using a fixed frame may seem easier to visualize, most embedded
controllers require their rotational motion to be expressed in the body frame, i.e.,
one attached to the object and moving with it. The same convention XYZ, but in
mobile frame, is:

$$\mathbf{R}'_x \mathbf{R}'_y \mathbf{R}'_z = \begin{bmatrix}
\cos\phi\cos\theta & -\cos\phi\sin\theta & \sin\phi \\
\cos\psi\sin\theta + \sin\psi\sin\phi\cos\theta & \cos\psi\cos\theta - \sin\psi\sin\phi\sin\theta & -\sin\psi\cos\phi \\
\sin\psi\sin\theta - \cos\psi\sin\phi\cos\theta & \sin\psi\cos\theta + \cos\psi\sin\phi\sin\theta & \cos\psi\cos\phi
\end{bmatrix}. \tag{6.43}$$
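The two products above can be checked numerically by composing the elementary matrices (6.38) to (6.40). The sketch below uses hypothetical helper names (`rot_x`, `rot_y`, `rot_z`) and arbitrary angles; it compares a few entries of the products with the closed-form expressions and confirms that the fixed and mobile conventions give different matrices for the same angles.

```python
import numpy as np

def rot_x(psi):
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

# Arbitrary angles for illustration
psi, phi, theta = np.deg2rad([30.0, 45.0, 60.0])

R_fixed = rot_z(theta) @ rot_y(phi) @ rot_x(psi)   # product of (6.42)
R_mobile = rot_x(psi) @ rot_y(phi) @ rot_z(theta)  # product of (6.43)

# Entry (2, 2) of (6.42) and entry (2, 3) of (6.43), in closed form
entry_fixed_22 = (np.sin(theta) * np.sin(phi) * np.sin(psi)
                  + np.cos(theta) * np.cos(psi))
entry_mobile_23 = -np.sin(psi) * np.cos(phi)

# A rotation matrix is orthogonal with a determinant of +1
is_rotation = (np.allclose(R_fixed @ R_fixed.T, np.eye(3))
               and np.isclose(np.linalg.det(R_fixed), 1.0))
```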
In aviation, the most common convention is the ZYX (roll–pitch–yaw) also
called the Tait–Bryan variant. In robotics, each manufacturer and software
developer decides on the convention they prefer to use, for instance, FANUC and
KUKA use the fixed XYZ Euler angle convention, while ABB uses the mobile
ZYX Euler angle convention. As for computer-assisted design, the Euler angles
used in CATIA and SolidWorks are described by the mobile ZYZ Euler angles
convention.

Fig. 6.6 Vector representation of planar rotation using the imaginary axis i ($i^2 = -1$)

The Euler angle representation is known to have a significant limitation: gimbal
lock. In a nutshell, each convention suffers from singular orientation(s), i.e.,
orientations at which two axes are overlaid, thus both having the same effect on
rotation. With two axes generating the same rotation, our three-dimensional space
is no longer fully reachable; i.e., one rotation is not possible anymore. Gimbal lock
has become a rather well-known issue in spacecraft control since Apollo’s mission
suffered from it (Jones and Fjeld, 2006). Nevertheless, Euler angles remain the
most common and intuitive representation of three-dimensional rotation and
orientation, but other, often more complex, representations were introduced to
cope with this limitation.
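Gimbal lock can be observed numerically. The sketch below assumes the fixed XYZ convention of (6.42) and sets the middle angle φ to 90°; the helper names are hypothetical. In that configuration the rotation depends only on the difference ψ − θ, so two different triplets of angles produce the same orientation.

```python
import numpy as np

def rot_x(psi):
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def fixed_xyz(psi, phi, theta):
    """Fixed-frame XYZ rotation, i.e., the product Rz Ry Rx."""
    return rot_z(theta) @ rot_y(phi) @ rot_x(psi)

phi_locked = np.pi / 2  # singular orientation for this convention

# Two different angle triplets sharing the same difference psi - theta
R1 = fixed_xyz(np.deg2rad(30.0), phi_locked, np.deg2rad(10.0))
R2 = fixed_xyz(np.deg2rad(50.0), phi_locked, np.deg2rad(30.0))

print(np.allclose(R1, R2))  # both triplets give the same rotation
```

Away from the singular orientation, the same two triplets would give two distinct rotation matrices.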

6.4.3 Quaternion

One such gimbal-lock-free representation is the quaternion. The quaternion is a
rather complex mathematical concept with respect to the level required for this
textbook. We will not try to define exactly the quaternion in terms of its
mathematical construction, and we will not detail all of its properties and
operations. Instead, you should be able to grasp the concept thanks to a comparison
with the imaginary numbers, a more common mathematical concept.
We recall that the imaginary axis (i) is orthogonal to the real numbers one (see
Fig. 6.6), with the unique property $i^2 = -1$. Together they create a planar reference
frame that can be used to express rotations:

$$R(\theta) = \cos(\theta) + \sin(\theta) i. \tag{6.44}$$

In other words, we can write a rotation in the plane as a vector with an imaginary
part. Now, imagine adding two more rotations as defined above with Euler
angles: we will need two more “imaginary” orthogonal axes to represent these
rotations. Equation 6.44 becomes:

$$R(\theta) = \cos(\theta) + \sin(\theta)(xi + yj + zk). \tag{6.45}$$


While this can be easily confused with a vector-angle representation, remember
that i − j − k define “imaginary” axes, not coordinates in the Cartesian space. These
axes hold similar properties as the more common i imaginary axis:

$$\|i, j, k\| = 1, \quad ji = -k, \quad ij = k, \quad i^2 = -1. \tag{6.46}$$

For most people, quaternions are not easy to visualize compared to Euler angles,
but they provide a singularity-free representation and several computing
advantages. This is why ROS (see Chap. 5) developers selected this representation
as their standard.
In Python, the scipy library contains a set of functions to easily change from one
representation to another:
# Import the library
from scipy.spatial.transform import Rotation as R

# Create a rotation with Euler angles
mat = R.from_euler('yxz', [45, 0, 30], degrees=True)
print("Euler: ", mat.as_euler('yxz', degrees=True))
# Print the resulting quaternion
print("Quaternion: ", mat.as_quat())
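The properties of (6.46) can be verified with a small hand-written quaternion product. The sketch below is an illustration only: `quat_multiply` is a hypothetical helper implementing the Hamilton product, and quaternions are stored here as [w, x, y, z] (scalar first), which is a choice made for this example; scipy's `as_quat()` returns the scalar part last by default.

```python
import numpy as np

def quat_multiply(p, q):
    """Hamilton product of two quaternions stored as [w, x, y, z]."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return np.array([
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    ])

# The three "imaginary" axes as pure quaternions
i = np.array([0.0, 1.0, 0.0, 0.0])
j = np.array([0.0, 0.0, 1.0, 0.0])
k = np.array([0.0, 0.0, 0.0, 1.0])

ij = quat_multiply(i, j)  # equals k
ji = quat_multiply(j, i)  # equals -k
ii = quat_multiply(i, i)  # equals -1, stored as [-1, 0, 0, 0]

print(ij, ji, ii)
```

As with matrices, the product is not commutative: swapping the operands changes the sign of the result.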

6.4.4 Homogeneous Transformation Matrices

A standardized way to apply a transformation from one coordinate system to
another, i.e., to map a vector from one reference frame to another, is to use
homogeneous transformation matrices. Indeed, a homogeneous transformation
matrix can be used to describe both the position and orientation of an object.
The (4 × 4) homogeneous transformation matrix is defined as

$$\mathbf{H}^T_S \equiv \begin{bmatrix} \mathbf{Q} & \mathbf{p} \\ \mathbf{0}^\top & 1 \end{bmatrix} \tag{6.47}$$

where $\mathbf{Q}$ is the (3 × 3) rotation (orientation) matrix, $\mathbf{p}$ is the
three-dimensional vector defining the Cartesian position $[x, y, z]^\top$ of the origin
and $\mathbf{0}$ is the three-dimensional null vector. As can be seen with the
superscript and subscript of $\mathbf{H}$, the matrix defines one reference frame
with respect to the other, here frames $S$ and $T$. While $\mathbf{Q}$ and
$\mathbf{p}$ together hold 12 components, they are not all independent, since the
position and orientation in the Cartesian space add up to 6 degrees-of-freedom
(DoF). Whereas the translation introduced above was defined as an addition, the
homogeneous matrix merges it with rotation and makes it possible to use only
multiplications.
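A homogeneous transformation matrix can be assembled from a rotation matrix and a position vector as sketched below. The helper name `homogeneous` and the numerical values are assumptions made for illustration. Points are written in homogeneous coordinates, i.e., with a trailing 1, so that a single matrix multiplication applies both the rotation and the translation.

```python
import numpy as np

def homogeneous(Q, p):
    """Assemble the (4 x 4) matrix of (6.47) from Q (3 x 3) and p (3,)."""
    H = np.eye(4)
    H[:3, :3] = Q
    H[:3, 3] = p
    return H

# Illustrative values: rotation of 90 degrees about Z, then a translation
theta = np.pi / 2
Q = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
p = np.array([1.0, 2.0, 0.0])
H = homogeneous(Q, p)

# Point (1, 0, 0) in homogeneous coordinates (trailing 1)
point = np.array([1.0, 0.0, 0.0, 1.0])
moved = H @ point  # rotation and translation in one multiplication

# The inverse transformation maps the point back
recovered = np.linalg.inv(H) @ moved
```

Successive transformations can then be chained by multiplying their homogeneous matrices, keeping in mind that the order matters.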

6.5 Basic Probability

6.5.1 Likelihood

When we talk about probability, we are typically interested in predicting the
likelihood of some event occurring, expressed as P(event). On the most basic level,
this can be conceptualized as a proportion representing the number of event(s) we
are interested in (i.e., that fulfill some particular criteria), divided by the total
number of equally likely events.
Below is a summary of the notation for describing the different kinds and
combinations of probability events which will be used throughout the rest of this
section (Table 6.1).
As an example, imagine we have a typical (non-loaded) 6-sided die. Each of the
six sides has an equal likelihood of occurring each time we roll the die. So, the
total number of possible outcomes on a single die roll, each with equal
probability of occurring, is 6. Thus, we can represent the probability of any specific
number occurring on a roll as a proportion over 6.
For example, the probability of rolling a 3 is expressed as:

$$P(3) = \frac{1}{6} \tag{6.48}$$

The probability of an event not occurring is always the complement of the
probability of it occurring, or 1 − P(event). This is known as the rule of subtraction.

$$P(A) = 1 - P(A') \tag{6.49}$$

So in the aforementioned example, the probability of not rolling a 3 is:

$$P(3') = 1 - \frac{1}{6} = \frac{5}{6} \tag{6.50}$$
We could also change our criteria to be more general, for example to calculate
the probability of rolling an even number. In this case, we can now count 3
possible outcomes which match our criteria (rolling a 2, 4, or 6), but the total
number of possible events remains at 6. So, the probability of rolling an even
number is:

$$P(\text{even}) = \frac{3}{6} = \frac{1}{2} \tag{6.51}$$

Table 6.1 Common probability notations

Notation    Meaning
P(A)        Probability of A occurring
P(A′)       Probability of A not occurring
P(A ∩ B)    Probability of both A and B occurring
P(A ∪ B)    Probability of either A or B occurring
P(A|B)      Probability of A occurring given B occurs

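The die probabilities computed so far can be checked by enumerating the
outcomes. The following is a minimal sketch, not part of the original chapter;
the helper name `probability` is hypothetical, and exact fractions are used to
avoid rounding.

```python
from fractions import Fraction

# All equally likely outcomes of one roll of a fair 6-sided die.
OUTCOMES = range(1, 7)


def probability(criterion):
    """Proportion of outcomes for which `criterion(outcome)` is True."""
    favorable = sum(1 for outcome in OUTCOMES if criterion(outcome))
    return Fraction(favorable, len(OUTCOMES))


p_three = probability(lambda n: n == 3)     # Eq. (6.48)
p_not_three = 1 - p_three                   # Eq. (6.50), rule of subtraction
p_even = probability(lambda n: n % 2 == 0)  # Eq. (6.51)

print(p_three, p_not_three, p_even)  # 1/6 5/6 1/2
```

Counting outcomes this way only works because every face of the die is
equally likely.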
Now, imagine we expanded on this criterion of rolling even numbers, to
calculate the probability of rolling either an even number OR a number greater
than 3. We now have two different criteria which we are interested in (being an
even number or being greater than 3) and want to calculate the probability that a
single dice roll results in either of these outcomes.
To begin with, we could try simply adding the probability of each individual
outcome together:

$$P(\text{even} \cup {>}3) = \frac{3}{6} + \frac{3}{6} = \frac{6}{6} = 1 \tag{6.52}$$

We have ended up with a probability of 1, or in other words, a 100% chance of
rolling a number which is either even or greater than 3. Since we already know
there are numbers on the die which do not meet either of the criteria, we can
deduce that this conclusion is incorrect.
The miscalculation stems from the fact that there are numbers which are both
even numbers AND greater than 3 (namely 4 and 6). By just adding the
probabilities together, we have “double-counted” their likelihood of occurring. In
Fig. 6.7, we can see that if we create a Venn diagram of even numbers and
numbers > 3, they overlap in the middle with the values of 4 and 6. If we think of
probability as calculating the total area of these circles, then we only need to count
the overlap once.
So to overcome this double-counting, we subtract the probability of both events
occurring simultaneously (in this example, the probability of rolling a number
which is both an even number AND greater than 3) from the summed probability
of the individual events occurring:

$$P(\text{even} \cup {>}3) = \frac{3}{6} + \frac{3}{6} - \frac{2}{6} = \frac{4}{6} = \frac{2}{3} \tag{6.53}$$

More generally, this is known as the rule of addition and takes the general form:

$$P(A \cup B) = P(A) + P(B) - P(A \cap B) \tag{6.54}$$
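
The rule of addition can be verified on the die example with Python sets, where
`&` is the intersection and `|` the union. This is an illustrative sketch; the
names `prob`, `even`, and `greater_than_3` are chosen here and do not come from
the chapter.

```python
from fractions import Fraction

outcomes = set(range(1, 7))
even = {n for n in outcomes if n % 2 == 0}       # {2, 4, 6}
greater_than_3 = {n for n in outcomes if n > 3}  # {4, 5, 6}


def prob(event):
    """Probability of an event given as a set of equally likely outcomes."""
    return Fraction(len(event), len(outcomes))


# Naive sum: counts the overlap {4, 6} twice, as in Eq. (6.52).
naive_sum = prob(even) + prob(greater_than_3)

# Rule of addition, Eq. (6.54): subtract the overlap once.
by_rule = prob(even) + prob(greater_than_3) - prob(even & greater_than_3)

# Direct count of the union, for comparison.
by_counting = prob(even | greater_than_3)

print(naive_sum, by_rule, by_counting)  # 1 2/3 2/3
```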

Fig. 6.7 Venn diagram of even numbers and numbers greater than 3

In the case where two outcomes cannot happen simultaneously (i.e., there is no
overlap in the Venn diagram), then P(A ∪ B) = P(A) + P(B), as P(A ∩ B) = 0.
Such events are known as mutually exclusive events.
Finally, imagine we slightly changed our criteria again, so that we are now
interested in the probability of rolling both an even number AND a number
greater than 3. You might have noticed we actually already used the probability
of both an even number and a number greater than three occurring in the previous
equation to calculate the probability of either of the two events occurring,
P(even ∩ > 3) = 2/6 = 1/3.
This is because in this example we have a small number of outcomes, meaning it
is relatively easy to just count the number of outcomes which match our criteria.
However, in more complicated scenarios the calculation is not as straightforward.
So, to begin thinking about the question of how to calculate the probability of
two events happening simultaneously, we can first ask what is the probability of
one of the events occurring, given the other event has already occurred. In this
example, we could calculate the probability of rolling a number greater than 3,
given that the number rolled is already even. That is, if we have already rolled the
die and know that the outcome is an even number, what is the likelihood that it is
also greater than 3? We already know that there are three sides of the die which
have even numbers (2, 4, or 6). This means our number of possible outcomes, if
we know the outcome is even, is reduced from 6 to 3. We can then count the
number of outcomes from this set which are greater than 3. This gives us two
outcomes (4 and 6). Thus, the
probability of rolling a number greater than 3, given that it is also even is:

$$P({>}3 \mid \text{even}) = \frac{2}{3} \tag{6.55}$$

However, this calculation still overestimates the probability of both events
occurring simultaneously, as we have reduced our scenario to one where we are
100% sure one of the outcomes has occurred (we have already assumed that the
outcome of the roll is an even number). So, to overcome this, we can then multiply
this conditional probability by the overall probability of rolling an even
number, which we know from before is P(even) = 3/6.

$$P(\text{even} \cap {>}3) = \frac{3}{6} \times \frac{2}{3} = \frac{6}{18} = \frac{1}{3} \tag{6.56}$$

This gives us the same value, P(A ∩ B) = 1/3, that we saw in our previous equation.
This is also called the rule of multiplication, with the general form:

$$P(A \cap B) = P(A)\,P(B \mid A) \tag{6.57}$$

One additional factor to consider when calculating probability is whether events
are dependent or independent. In the dice example, these events are dependent, as
one event happening (rolling an even number) affects the probability of the other
event happening (rolling a number greater than 3). The overall probability of
rolling a number greater than 3 is 1/2, but increases to 2/3 if we already know
that the number rolled is even.
If events are independent, i.e., do not affect each other’s probability of occurring,
the rule of multiplication reduces to:

$$P(A \cap B) = P(A) \times P(B) \tag{6.58}$$

The rule of multiplication also forms the basis for Bayes’ theorem, to be
discussed in the next section.
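
As a sketch of the rule of multiplication and of the dependence check described
above, conditional probabilities on the die can be computed by restricting the
set of outcomes. The function `prob` and the extra event `at_most_2` are
illustrative assumptions added here, not part of the chapter.

```python
from fractions import Fraction

outcomes = set(range(1, 7))
even = {2, 4, 6}
greater_than_3 = {4, 5, 6}
at_most_2 = {1, 2}  # extra event, used only to show an independent pair


def prob(event, given=None):
    """P(event), or P(event | given) when a conditioning event is supplied."""
    space = outcomes if given is None else given
    return Fraction(len(event & space), len(space))


# Rule of multiplication, Eq. (6.57): P(A and B) = P(A) * P(B | A).
p_joint = prob(even) * prob(greater_than_3, given=even)


def independent(a, b):
    """Events are independent when P(A and B) equals P(A) * P(B), Eq. (6.58)."""
    return prob(a & b) == prob(a) * prob(b)


print(p_joint)                            # 1/3
print(independent(even, greater_than_3))  # False
print(independent(even, at_most_2))       # True
```

Knowing that the roll is even changes the chance of it being greater than 3
(from 1/2 to 2/3), but not the chance of it being at most 2 (1/3 either way).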

6.5.2 Bayes’ Theorem

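Bayes’ rule is a prominent principle used in artificial intelligence to calculate the
probability of a robot’s next steps given the steps the robot has already executed.
It follows from writing the rule of multiplication both ways,
P(A ∩ B) = P(A) P(B|A) = P(B) P(A|B), and dividing by P(B).
Bayes’ theorem is defined as:

$$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(B)} \tag{6.59}$$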

Robots (and sometimes humans) are equipped with noisy sensors and have
limited information on their environment. Imagine a mobile robot using vision to
detect objects and its own location. If it detects an oven it can use that information
to infer where it is. What you know is that the probability of seeing an oven in a
bathroom is pretty low, whereas it is high in a kitchen. You are not 100% sure
about this, because you might have just bought it and left it in the living room, or
your eyes are “wrong” (your vision sensors are noisy and erroneous), but it is
probabilistically more likely. Then, it seems reasonable to guess that, given you
have seen an oven, you are “more likely” to be in a kitchen than in a bathroom.
Bayes’ theorem provides one (not the only one) mechanism to perform this
reasoning.
P(room) is the “prior” belief before you’ve seen the oven, P(oven|room)
provides the likelihood of seeing an oven in some room, and P(room|oven) is your
new belief after seeing the oven. This is also called the “posterior” probability, the
conditional probability that results after considering the available evidence (in this
case an observation of the oven).
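
The oven example can be sketched numerically as follows. All prior and
likelihood values below are made up for illustration, and the room names are
assumptions; the denominator P(oven) is obtained by summing
P(room) × P(oven|room) over the rooms (law of total probability), a step the
text above does not spell out.

```python
# Hypothetical prior belief about which room the robot is in, P(room).
prior = {"kitchen": 0.2, "bathroom": 0.2, "living room": 0.6}

# Hypothetical likelihood of detecting an oven in each room, P(oven | room).
likelihood = {"kitchen": 0.8, "bathroom": 0.02, "living room": 0.05}


def posterior_given_oven(prior, likelihood):
    """Apply Bayes' theorem, Eq. (6.59), to every room after seeing an oven."""
    # P(oven): total probability of the observation over all rooms.
    evidence = sum(prior[room] * likelihood[room] for room in prior)
    return {room: prior[room] * likelihood[room] / evidence for room in prior}


posterior = posterior_given_oven(prior, likelihood)
for room, belief in posterior.items():
    print(f"P({room} | oven) = {belief:.3f}")
```

With these assumed numbers, the belief of being in the kitchen rises well above
its prior, while the posterior beliefs still sum to 1.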

6.5.3 Gaussian Distribution

Moving away from our dice example, we know that in real life things do not
always have an equal probability of occurring. When different outcomes have
different probabilities of occurring, we can think about these probabilities in
terms of frequencies. That is, in a given number of repetitions of an event, how
frequently is a specific outcome likely to occur? We can plot these frequencies
on a frequency histogram, which counts the number of times each event has
occurred. This logic forms the basis of frequentist statistics, which we discuss
further in Sect. 6.7.

Fig. 6.8 Normal distribution

The Gaussian, or normal, distribution (aka the “bell curve”) refers to a
frequency distribution or histogram of data where the data points are
symmetrically distributed; that is, there is a “peak” in the distribution
(representing the mean) around which most values in the dataset occur, which
then decreases symmetrically on either side as the values become less frequent
(see Fig. 6.8). Many naturally occurring datasets follow a normal distribution,
for example, the height of the population, test scores on many exams, and the
weight of lubber grasshoppers. In robotics, we can see a normal distribution on
the output of several sensors. In fact, the central limit theorem states that
the mean (or sum) of a big enough sample of independent measurements will
approximately follow a normal distribution (even if the individual measurements
were not normally distributed to begin with), making it a useful starting point
for many statistical analyses.
We can use the normal distribution to predict the likelihood of a data point
falling within a certain area under the curve. Specifically, we know that if our data
is normally distributed, 68.27% of data points will fall within 1 standard deviation
of the mean, 95.45% will fall within 2 standard deviations, and 99.73% will fall
within 3 standard deviations. In probability terms, we could phrase this as “there is
a 68.27% likelihood that a value picked at random will be within one standard
deviation of the mean.” The further away from the mean (the peak of the curve) a
value is, the lower its probability of occurring. The total probability of all values
in the normal distribution (i.e., the total area under the curve) is equal to 1.
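
The percentages quoted above can be reproduced with the error function from
Python's standard library, using the identity that the probability of a
normally distributed value falling within k standard deviations of the mean is
erf(k/√2). This is a small check; the function name `prob_within` is
hypothetical.

```python
import math


def prob_within(k):
    """Probability that a normal variable lies within k standard deviations."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * prob_within(k):.2f}%")
# within 1 standard deviation(s): 68.27%
# within 2 standard deviation(s): 95.45%
# within 3 standard deviation(s): 99.73%
```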
Mathematically, the curve is described by a probability density function, where
the probability of falling within a given interval is equal to the area under
the curve for this interval. In other words, we can use the normal distribution
to calculate the probability density of seeing a value, x, given the mean, μ,
and variance, σ² (the square of the standard deviation, σ):

$$p(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \tag{6.60}$$

Fig. 6.9 Derivative of a function gives the instantaneous slope of that function. Locations with null derivative are in green: the optimums

We can see that there are actually only two parameters which need to be input,
μ and σ². The simplicity of this representation is also relevant to computer
science and robotics applications.
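
Because only the mean and the variance are needed, Eq. (6.60) fits in a few
lines of code. The sketch below assumes a hypothetical range sensor reporting
2.0 m on average with a variance of 0.01 m²; these numbers and the function
names are illustrative only. It also checks numerically that the total area
under the curve is close to 1.

```python
import math


def gaussian_pdf(x, mu, sigma2):
    """Probability density of x for a normal distribution, Eq. (6.60)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


def area_under_curve(mu, sigma2, half_width=8.0, steps=10_000):
    """Midpoint-rule integral of the density over mu +/- half_width sigmas."""
    sigma = math.sqrt(sigma2)
    start = mu - half_width * sigma
    dx = 2 * half_width * sigma / steps
    return sum(gaussian_pdf(start + (i + 0.5) * dx, mu, sigma2) * dx
               for i in range(steps))


mu, sigma2 = 2.0, 0.01  # hypothetical sensor mean (m) and variance (m^2)
print(f"density at the mean: {gaussian_pdf(mu, mu, sigma2):.3f}")
print(f"density 0.2 m away:  {gaussian_pdf(mu + 0.2, mu, sigma2):.3f}")
print(f"total area:          {area_under_curve(mu, sigma2):.6f}")
```

Note that a density is not a probability: with a small variance, its value at
the mean can exceed 1 while the area under the curve remains 1.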
In a standard normal distribution, the mean is equal to 0, and the standard
deviation is 1. The values of any normally distributed dataset can be
transformed to fit these parameters using the following formula:

$$z = \frac{x - \mu}{\sigma} \tag{6.61}$$

These transformed values are known as z-scores. Thus, if we have the mean
and standard deviation of any normally distributed dataset, we can convert it into
z-scores. This process is called standardization, and it is useful because it means
we can then use the aforementioned properties of the normal distribution to work
out the likelihood of a specific value occurring in any dataset which is normally
distributed, independent of its actual mean and standard deviation. This is because
each z-score is associated with a specific probability of occurring (we already
know the probabilities for z-scores at exactly 1, 2, and 3 standard deviations
above/below the mean). You can check all z-score probabilities using z-tables.10
From these, we can calculate the percentage of the population which falls either
above or below a certain z-score. A z-score can then be considered a test statistic
representing the likelihood of a specific result occurring in a (normally distributed)
dataset. This becomes important when conducting inferential statistics, to be
discussed later in this chapter.
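
Instead of a printed z-table, the cumulative distribution function of the
standard normal distribution can be queried directly. The sketch below uses
`statistics.NormalDist` from the Python standard library (Python 3.8 or later);
the dataset (heights with a mean of 1.70 m and a standard deviation of 0.10 m)
is a hypothetical example.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Standardize a value x, Eq. (6.61)."""
    return (x - mu) / sigma


standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Hypothetical dataset: heights with mean 1.70 m and standard deviation 0.10 m.
z = z_score(x=1.85, mu=1.70, sigma=0.10)
below = standard_normal.cdf(z)  # proportion of the population below this z-score
above = 1.0 - below             # proportion above it

print(f"z = {z:.2f}, below = {below:.3f}, above = {above:.3f}")
# z = 1.50, below = 0.933, above = 0.067
```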

6.6 Derivatives

Differential calculus is an essential tool for most of the mathematical concepts in
robotics: from finding optimal gains to the linearization of complex dynamic systems.

10 https://www.ztable.net/.

The derivative of a function f(x) is the rate at which its value changes. It can be
approximated by f′(x) ≈ Δf(x)/Δx. However, several algebraic functions have known
exact derivatives, such as d(xⁿ)/dx = n xⁿ⁻¹. In robotics, we manipulate derivatives
for physical variables such as the velocity (ẋ), the derivative of the position (x),
and the acceleration (ẍ), the derivative of the velocity. On top of this, derivatives
can be helpful to find a function's optimum: when the derivative of a function is
equal to zero, we are typically at either a (local) minimum or a (local) maximum
(see Fig. 6.9).
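
The approximation of a derivative as a ratio of small differences can be tried
directly in code. The sketch below uses a central difference (one common choice,
assumed here) on the example function f(x) = x³ − 3x, which is not from the
chapter; its exact derivative, 3x² − 3, follows from the power rule and vanishes
at x = −1 (local maximum) and x = 1 (local minimum).

```python
def derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def f(x):
    """Example function; its exact derivative is 3x^2 - 3."""
    return x ** 3 - 3 * x


for x in (-1.0, 0.0, 1.0, 2.0):
    approx = derivative(f, x)
    exact = 3 * x ** 2 - 3
    print(f"x = {x:+.1f}  approx = {approx:+.6f}  exact = {exact:+.6f}")
```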
Several properties are useful to remember, such as the fact that the derivative
operator can be distributed over addition:

$$[f(x) + g(x)]' = f'(x) + g'(x), \tag{6.62}$$

and distributed over nested functions:

$$[f(g(x))]' = f'(g(x))\,g'(x). \tag{6.63}$$

Finally, derivative operators can be distributed over a multivariate function,
using partial derivatives, i.e., derivatives with respect to each variable
independently. For instance:

$$\frac{\partial [A x_1 + B x_2]}{\partial x_1} = A. \tag{6.64}$$
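
These properties can be checked numerically. The sketch below compares
finite-difference estimates with the rule for nested functions, using
f = sin and g(x) = x² as assumed examples, and with the partial derivative of
Ax₁ + Bx₂, using arbitrary values for A and B. The helper names are
hypothetical.

```python
import math


def derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def partial(func, point, index, h=1e-6):
    """Finite-difference partial derivative of func along point[index]."""
    plus, minus = list(point), list(point)
    plus[index] += h
    minus[index] -= h
    return (func(*plus) - func(*minus)) / (2 * h)


def g(x):
    return x ** 2


# Nested functions, Eq. (6.63): [sin(g(x))]' = cos(g(x)) * g'(x).
x0 = 0.7
chain_numeric = derivative(lambda x: math.sin(g(x)), x0)
chain_rule = math.cos(g(x0)) * 2 * x0

# Partial derivative, Eq. (6.64), with arbitrary constants A and B.
A, B = 2.0, 5.0


def linear(x1, x2):
    return A * x1 + B * x2


d_dx1 = partial(linear, (1.0, 3.0), index=0)

print(f"chain rule: numeric = {chain_numeric:.6f}, analytic = {chain_rule:.6f}")
print(f"partial derivative with respect to x1 = {d_dx1:.6f} (A = {A})")
```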

6.6.1 Taylor Series

Robotics is all about trying to control complex dynamic systems in complex
dynamic environments. Most often these systems and models present nonlinear
dynamics. For instance, the drag forces on airplanes and submarines affect the
vehicle's acceleration in proportion to the square of its velocity. One way to
cope with this complexity is to simplify the equation using a polynomial (a sum
of terms in various powers) approximation. The most popular is certainly the
Taylor series:

$$f(x)\big|_a = f(a) + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \cdots \tag{6.65}$$

which approximates f(x) around the point x = a using a combination of its
derivatives. If we want our approximation to linearize the function, we will
keep only the first two terms:

$$f(x) \approx f(a) + f'(a)(x - a) \tag{6.66}$$
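
As an illustration of Eq. (6.66), the sketch below linearizes a quadratic drag
model, f(v) = c v², around an operating velocity a. The drag coefficient and
the velocities are made-up values, and the function names are hypothetical. The
approximation is exact at v = a and degrades as v moves away from it.

```python
def drag(v, c=0.5):
    """Quadratic drag model f(v) = c * v^2 (hypothetical coefficient c)."""
    return c * v ** 2


def drag_slope(v, c=0.5):
    """Exact derivative f'(v) = 2 * c * v."""
    return 2 * c * v


def linearized_drag(v, a, c=0.5):
    """First-order Taylor approximation of the drag around v = a, Eq. (6.66)."""
    return drag(a, c) + drag_slope(a, c) * (v - a)


a = 10.0  # operating point (m/s), chosen for illustration
for v in (9.0, 10.0, 10.5, 15.0):
    exact = drag(v)
    approx = linearized_drag(v, a)
    print(f"v = {v:5.1f}  exact = {exact:7.3f}  linear = {approx:7.3f}  "
          f"error = {exact - approx:6.3f}")
```

For this model the neglected term is exactly c (v − a)², so the error grows
quadratically with the distance from the operating point.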

6.6.2 Jacobian

Now instead of a single function depending on a single variable, you will often find
yourself with a set of equations each depending on several variables. For instance,

$$f_1 = Axy, \quad f_2 = Cy^2 + Dz, \quad \text{and} \quad f_3 = E/x + Fy + Gz \tag{6.67}$$

which can be written as a vector:

$$\mathbf{F} = \begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix}. \tag{6.68}$$

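You can linearize this system of equations using Taylor’s series:

$$\mathbf{F} \approx \mathbf{F}(a) + \mathbf{J} \begin{bmatrix} x - x_a \\ y - y_a \\ z - z_a \end{bmatrix}, \tag{6.69}$$

where J is the matrix of partial derivatives of the functions (evaluated at the
linearization point a), often referred to as the Jacobian, in this case:

$$\mathbf{J} = \begin{bmatrix} \partial f_1/\partial x & \partial f_1/\partial y & \partial f_1/\partial z \\ \partial f_2/\partial x & \partial f_2/\partial y & \partial f_2/\partial z \\ \partial f_3/\partial x & \partial f_3/\partial y & \partial f_3/\partial z \end{bmatrix} = \begin{bmatrix} Ay & Ax & 0 \\ 0 & 2Cy & D \\ -E/x^2 & F & G \end{bmatrix}. \tag{6.70}$$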
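
The Jacobian of this example system can be cross-checked against finite
differences. In the sketch below, the constants A, C, D, E, F, G and the
evaluation point are arbitrary values chosen for illustration, and all function
names are hypothetical. Plain Python lists are used so that no external library
is needed.

```python
# Arbitrary constants for the example system of Eq. (6.67).
A, C, D, E, F, G = 1.0, 2.0, 3.0, 4.0, 5.0, 6.0


def system(x, y, z):
    """The three functions f1, f2, f3 stacked as a vector."""
    return [A * x * y, C * y ** 2 + D * z, E / x + F * y + G * z]


def analytic_jacobian(x, y, z):
    """Matrix of partial derivatives (rows: functions, columns: x, y, z)."""
    return [[A * y, A * x, 0.0],
            [0.0, 2 * C * y, D],
            [-E / x ** 2, F, G]]


def numeric_jacobian(func, point, h=1e-6):
    """Central finite-difference estimate of the Jacobian of func at point."""
    columns = []
    for j in range(len(point)):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = func(*plus), func(*minus)
        columns.append([(p - m) / (2 * h) for p, m in zip(f_plus, f_minus)])
    return [list(row) for row in zip(*columns)]  # transpose columns into rows


a = (1.0, 2.0, 3.0)         # linearization point
delta = (0.01, -0.02, 0.01)  # small displacement away from a

J = analytic_jacobian(*a)
J_numeric = numeric_jacobian(system, a)

# Linearized prediction: F(a) + J * delta, compared with the exact value.
F_a = system(*a)
linear = [F_a[i] + sum(J[i][j] * delta[j] for j in range(3)) for i in range(3)]
exact = system(*(a[j] + delta[j] for j in range(3)))

print("analytic J:", J)
print("linearized:", [round(v, 4) for v in linear])
print("exact:     ", [round(v, 4) for v in exact])
```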

In Chap. 10, the Jacobian is leveraged as a matrix to relate the task space (end
effector velocities) to the joint space (actuator velocities). A Jacobian matrix
derived for a single function, i.e., a single row matrix, is called a gradient, noted
(for a geometric function in Cartesian space):

$$\nabla f = \begin{bmatrix} \partial f/\partial x & \partial f/\partial y & \partial f/\partial z \end{bmatrix}. \tag{6.71}$$

The gradient is a useful tool to find the optimum of a function by traveling
along it, an approach whose stochastic variant is very useful in machine
learning (see Chap. 15).
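
Traveling along the gradient to reach an optimum can be sketched with plain
(deterministic) gradient descent, shown here on a simple two-variable bowl
whose minimum is known to be at (1, −2). The function, step size, and starting
point are assumptions chosen for illustration; the stochastic variants used in
machine learning estimate the gradient from random subsets of data instead.

```python
def f(x, y):
    """Example function with a single minimum at (1, -2)."""
    return (x - 1) ** 2 + (y + 2) ** 2


def gradient(x, y):
    """Analytic gradient of f: (df/dx, df/dy)."""
    return 2 * (x - 1), 2 * (y + 2)


def gradient_descent(start, step=0.1, iterations=200):
    """Repeatedly move against the gradient to decrease f."""
    x, y = start
    for _ in range(iterations):
        gx, gy = gradient(x, y)
        x -= step * gx
        y -= step * gy
    return x, y


x_opt, y_opt = gradient_descent(start=(5.0, 5.0))
print(f"minimum found near x = {x_opt:.4f}, y = {y_opt:.4f}")
# minimum found near x = 1.0000, y = -2.0000
```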

6.7 Basic Statistics

When conducting research in robotics, and especially user studies, you will often
have data you have collected in pursuit of answering a specific research question.
Typically, such research questions are framed around the relationship between an
independent variable and a dependent variable. For example, you might ask how
the number of drones (independent variable) in a mission affects the operator’s
cognitive workload (dependent variable). Being able to analyze the data you have
collected is then necessary to communicate the outcomes from your research.
Chapter 13 gives more detail on how to design and conduct user studies; for now,
we will begin explaining some of the analyses you can perform once you have
obtained some data!
