FFR125 Lecture Notes

ROBOTS
MATTIAS WAHDE
All rights reserved. No part of these lecture notes may be reproduced or transmitted
in any form or by any means, electronic or mechanical, without permission
in writing from the author.
1 Autonomous robots 1
1.1 Robot types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Robotic hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.1 Construction material . . . . . . . . . . . . . . . . . . . . 5
1.2.2 Sensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.3 Actuators . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2.4 Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
4 Animal behavior 45
4.1 Introduction and motivation . . . . . . . . . . . . . . . . . . . . . 45
4.2 Bottom-up approaches vs. top-down approaches . . . . . . . . . 46
4.3 Nervous systems of animals . . . . . . . . . . . . . . . . . . . . . 46
4.4 Ethology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.4.1 Reflexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.4.2 Kineses and taxes . . . . . . . . . . . . . . . . . . . . . . . 48
4.4.3 Fixed action patterns . . . . . . . . . . . . . . . . . . . . . 50
4.4.4 Complex behaviors . . . . . . . . . . . . . . . . . . . . . . 52
8 Decision-making 105
8.1 Introduction and motivation . . . . . . . . . . . . . . . . . . . . . 105
8.1.1 Taxonomy for decision-making methods . . . . . . . . . 105
8.2 The utility function method . . . . . . . . . . . . . . . . . . . . . 106
8.2.1 State variables . . . . . . . . . . . . . . . . . . . . . . . . . 107
8.2.2 Utility functions . . . . . . . . . . . . . . . . . . . . . . . . 109
8.2.3 Activation of brain processes . . . . . . . . . . . . . . . . 110
2 CHAPTER 1. AUTONOMOUS ROBOTS
Figure 1.1: Left panel: A Boe-bot. Right panel: A wheeled robot currently under construction
in the Adaptive systems research group at Chalmers.
Figure 1.2: A Kondo humanoid robot. Left panel: Front view. Right panel: Rear view.
important being legged robots and wheeled robots. Other kinds, such as fly-
ing robots, exist as well, but will not be considered in this course. The class
of legged robots can be subdivided based on the number of legs, the most
common types being bipedal robots (with two legs) and quadrupedal robots
(with four legs). Most bipedal robots resemble humans, at least to some extent;
such robots are referred to as humanoid robots. An example of a humanoid
robot is shown in Fig. 1.2. Humanoid robots that (unlike the robot shown in
Fig. 1.2) not only have the approximate shape of a human, but have also been
equipped with more detailed human-like features, e.g. artificial skin, artificial
hair etc., are called androids. It should be noted that the term humanoid refers
to the shape of the robot, not its size; in fact, many humanoid robots are quite
small. For example, the Kondo robot shown in Fig. 1.2 is approximately 0.35
m tall.
Some robots are partly humanoid. For example, the wheeled robot shown
in the right panel of Fig. 1.1 is currently being equipped with a humanoid up-
per body. Unlike a fully humanoid robot, this robot need not be actively bal-
anced, but will still exhibit many desirable features of humanoid robots, such
as two arms for grasping and lifting objects, gesturing etc., as well as a head
that will be equipped with two cameras for stereo vision and microphones
providing capabilities for listening and speaking.
Figure 1.4: Left panel: Aluminium parts used in the construction of a rotating base for a
humanoid upper body. The servo motor used for rotating the base is also shown, as well as the
screws, washers and nuts. Right panel: The assembled base.
1.2.2 Sensors
The purpose of robotic sensors is to measure either some physical characteris-
tic of the robot (for example, its acceleration) or some aspect of its environment
(for example, the detected intensity of a light source). The raw data thus ob-
tained must then, in most cases, be processed further before being used in the
brain of the robot. For example, an infrared (IR) proximity sensor may pro-
vide a voltage (depending on the distance to the detected object) as its read-
ing, which can then be converted to a distance, using the characteristics of the
sensor available from its data sheet.
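As an illustration, such a voltage-to-distance conversion might look as follows. This is a sketch only: the inverse-law form and the constants k and v0 are hypothetical placeholders, and for a real sensor (such as the Sharp GP2D12 shown in Fig. 1.5) the constants must be fitted to the curve in the data sheet.

```python
def ir_voltage_to_distance(v, k=0.29, v0=0.05):
    """Convert an IR proximity sensor voltage v (volts) to a distance
    (meters), using a hypothetical inverse-law fit d = k / (v - v0).

    k and v0 are placeholder values; real values must be fitted to
    the sensor's data sheet curve."""
    if v <= v0:
        raise ValueError("voltage outside the sensor's valid range")
    return k / (v - v0)
```

Note that the voltage decreases with distance, so a lower reading corresponds to a more distant object.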
Figure 1.5: Left panel: A Khepera II robot. Note the IR proximity sensors (small black
rectangles around the periphery of the robot), consisting of an emitter and a detector. Right
panel: A Sharp GP2D12 infrared sensor.
Needless to say, there exists a great variety of sensors for mobile robots.
Here, only a brief introduction will be given, focusing on a few fundamental
sensor types.
Figure 1.6: The left panel shows a simple encoder, with a single detector (A), that measures
the interruptions of a light beam, producing the curve shown below the encoder. In the right
panel, two detectors are used, making it possible to determine also the direction of rotation.
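The direction detection in the right panel of Fig. 1.6 can be sketched in code. With two detectors offset by a quarter period, consecutive (A, B) states form a Gray-code sequence; traversing the sequence forward or backward reveals the direction of rotation. This is a minimal illustration of the principle, not the firmware of any particular encoder.

```python
# Forward rotation cycles through this Gray-code sequence of
# (A, B) detector states; backward rotation traverses it in reverse.
_FORWARD = [(0, 0), (1, 0), (1, 1), (0, 1)]

def decode_quadrature(states):
    """Return the net step count for a sequence of (A, B) readings."""
    count = 0
    for prev, curr in zip(states, states[1:]):
        if curr == prev:
            continue  # no transition
        i = _FORWARD.index(prev)
        if curr == _FORWARD[(i + 1) % 4]:
            count += 1   # forward step
        elif curr == _FORWARD[(i - 1) % 4]:
            count -= 1   # backward step
        # any other transition indicates a missed (invalid) state change
    return count
```

With a single detector (left panel), only the count is available; the sign of the rotation cannot be recovered.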
Ultrasound sensors, also known as sonar sensors or simply sonars, are based
on time-of-flight measurement. Thus, in order to detect the distance to an ob-
ject, a sonar emits a brief pulse of ultrasonic sound, typically in the frequency
range 40–50 kHz.² The sensor then awaits the echo. Once the echo has been
detected, the distance to the object can be obtained using the fact that sound
travels at a speed of around 340 m/s. As in the case of IR sensors, there is
both a lower and an upper limit for the detection range of a sonar sensor. If
the distance to an object is too small, the sensor simply does not have enough
time to switch from emission to listening, and the signal is lost. Similarly, if
the distance is too large, the echo may be too weak to be detected.
Fig. 1.7 shows a Ping ultrasonic distance sensor, which is commonly used
in connection with the Boe-bot. This sensor can detect distances to objects in
the range [0.02, 3.00] m.
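The time-of-flight computation described above amounts to a one-line formula; note the factor 1/2, since the measured time covers the round trip to the object and back. The range limits in this sketch are those of the Ping sensor mentioned above.

```python
SPEED_OF_SOUND = 340.0  # m/s, approximately, in air at room temperature

def sonar_distance(echo_time, d_min=0.02, d_max=3.00):
    """Distance (m) from a measured round-trip echo time (s).

    Readings outside [d_min, d_max] are reported as invalid (None):
    too close, and the sensor has no time to switch from emission
    to listening; too far, and the echo is too weak to detect."""
    d = SPEED_OF_SOUND * echo_time / 2.0
    if not d_min <= d <= d_max:
        return None
    return d
```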
Laser range finders (LRFs) commonly rely, like sonar sensors, on time-of-flight
measurements, but involve the speed of light rather than the speed of sound.
Thus, a laser range finder emits pulses of laser light (in the form of thin beams),
² For comparison, a human ear can detect sounds in the range 20 Hz to 20 kHz. Thus, the
sound pulse emitted by a sonar sensor is not audible.
Figure 1.8: Left panel: A Hokuyo URG-04LX laser range finder. Right panel: A typical
reading, showing the distance to the nearest object in various directions. The pink rays indicate
directions in which no detection is made. The maximum range of the sensor is 4 m.
and measures the time it takes for the pulse to bounce off a target and return to
the range finder. An LRF carries out a sweep over many directions³, resulting
in an accurate local map of distances to objects along the line-of-sight of each
ray. LRFs are generally very accurate sensors, but they are also much more
expensive than sonar sensors and IR sensors.
A Hokuyo URG-04LX LRF is shown in the left panel of Fig. 1.8. This sensor
has a range of around four meters, with an accuracy of around 1 mm. It can
generate readings in 683 different directions, with a frequency of around 10
Hz. As of the time of writing (Jan. 2010), a Hokuyo URG-04LX costs around
2,000 USD. The right panel of Fig. 1.8 shows a typical reading, obtained from
the software delivered with the LRF.
Cameras
Cameras are used as the eyes of a robot. In many cases, two cameras are used,
in order to provide the robot with binocular vision, allowing it to estimate the
range to detected objects. There are many cameras available for robots, for
example the CMUCam series which has been developed especially for use in
mobile robots. The processor connected to the CMUCam is capable of basic
image processing. At the time of writing (Jan. 2010), a CMUCam costs on the
order of 150 USD. A low-cost alternative is to use ordinary webcams, for which
prices start around 15 USD. Fig. 1.9 shows a simple robotic head consisting of
two servo motors (see below) and a single webcam.
However, while the actual cameras may not be very costly, the use of cam-
eras is computationally a very expensive procedure. Even at a low resolution,
³ A typical angular interval for an LRF is around 180–240 degrees.
Figure 1.9: A simple robotic head, consisting of two servo motors and a webcam.
say 320 × 240 pixels, a webcam will deliver a flow of around 1.5 MB/s, assuming
a frame rate of 20 Hz and a single byte of data per pixel. The actual data
transfer is easily handled by a Universal serial bus (USB), but the data must
not only be transferred but also analyzed, something which is far from trivial.
An introduction to image processing for robots will be given in a later chapter.
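As a quick sanity check on these numbers, the raw data rate is simply width × height × frame rate × bytes per pixel. The helper below is a small illustration, not part of any camera API.

```python
def camera_data_rate(width, height, fps, bytes_per_pixel=1):
    """Raw (uncompressed) video data rate in bytes per second."""
    return width * height * fps * bytes_per_pixel

# 320 x 240 pixels, 20 Hz, one byte per pixel:
rate = camera_data_rate(320, 240, 20)  # 1,536,000 bytes/s, i.e. about 1.5 MB/s
```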
Other sensors
Figure 1.10: An accelerometer. The motion of the small object (mass m) resulting from the
acceleration of the larger object to which the accelerometer is attached can be used for deducing
the acceleration.
1.2.3 Actuators
An actuator is a device that allows a robot to take action, i.e. to move or manip-
ulate the surroundings in some other way. Motors, of course, are very common
types of actuators. Other kinds of actuation include, for example, the use of
loudspeakers (for human-robot interaction).
Movements can be generated in various ways, using e.g. electrical motors,
pneumatic or hydraulic systems etc. In this course, we shall only consider
electrical, direct-current (DC) motors and, in particular, servo motors. Thus,
when referring to actuation in this course, the use of such motors is implied.
Note that actuation normally requires the use of a motor controller in con-
nection with the actual motor. This is so, since the microcontroller (see below)
responsible for sending commands to the motor cannot, in general, provide
sufficient current to drive the motor. The issue of motor control will be consid-
ered briefly in connection with the discussion of servo motors below.
Figure 1.11: A conducting wire in a magnetic field. B denotes the magnetic field strength
and I the current through the wire. The Lorentz force F acting on the wire is given by
F = I ℓ × B, where ℓ is the length of the wire segment in the field.
I
Figure 1.12: A conducting loop of wire placed in a magnetic field. Due to the forces acting
on the loop, it will begin to turn. The loop is shown from above in the right panel, and from
the side in the left panel.
DC motors
Electrical direct current (DC) motors are based on the principle that a force
acts on a wire in a magnetic field if a current is passed through the wire, as
illustrated in Fig. 1.11. If instead a current is passed through a closed loop of
wire, as illustrated in Fig. 1.12, the forces acting on the two sides of the loop
will point in opposite directions, making the loop turn. A standard DC motor
consists of an outer stationary cylinder (the stator), providing the magnetic
field, and an inner, rotating part (the rotor). From Fig. 1.12 it is clear that the
loop will reverse its direction of rotation after a half-turn, unless the direction
of the current is reversed. The role of the commutator, connected to the rotor
of a DC motor, is to reverse the current through the motor every half-turn, thus
allowing continuous rotation. Finally, carbon brushes, attached to the stator,
complete the electric circuit of the DC motor. There are types of DC motors
that use electromagnets rather than a permanent magnet, and also types that
are brushless. However, a detailed description of such motors is beyond the
scope of this text.

Figure 1.13: The equivalent electrical circuit for a DC motor.
DC motors are controlled by varying the applied voltage. The equations
for DC motors can be divided into an electrical and a mechanical part. The
motor can be modelled electrically by the equivalent circuit shown in Fig. 1.13.
Letting V denote the applied voltage, and ω the angular speed of the motor
shaft, the electrical equation takes the form
V = L di/dt + R i + VEMF,    (1.1)
where i is the current flowing through the circuit, L is the inductance of the
motor, R its resistance, and VEMF the voltage (the back EMF) counteracting V .
The back EMF depends on the angular velocity, and can be written as
VEMF = ce ω, (1.2)
where ce is the electrical constant of the motor. For a DC motor, the generated
torque τg is directly proportional to the current, i.e.
τg = ct i, (1.3)
where ct is the torque constant of the motor. Turning now to the mechanical
equation, Newton’s second law gives
I dω/dt = Σ τ,    (1.4)
where I is the combined moment of inertia of the motor and its load, and Στ
is the total torque acting on the motor. For the DC motor, the equation takes
the form
I dω/dt = τg − τf − τ,    (1.5)
© 2012, 2016, Mattias Wahde, [email protected]
Figure 1.14: Left panel: A HiTec 645MG servo. The suffix MG indicates that the servo
is equipped with a metal gear train. Right panel: A Parallax servo, which has been modified
for continuous rotation. Servos of this kind are used on the Boe-bot. The circular (left) and
star-shaped (right) white plastic objects are the servo horns.
where τf is the frictional torque opposing the motion and τ is the (output)
torque acting on the load. The frictional torque can be divided into two parts,
the Coulomb friction (cC sgn(ω)) and the viscous friction (cv ω). Thus, the
equations for the DC motor can now be summarized as
τg = (ct/R) V − (ct L/R) di/dt − (ce ct/R) ω,    (1.6)
I dω/dt = τg − cC sgn(ω) − cv ω − τ.    (1.7)
In many cases, the time constant of the electrical circuit is much shorter than
that of the physical motion, so the inductance term can be neglected. Further-
more, for simplicity, the dynamics of the mechanical part can also be neglected
under certain circumstances (e.g. if the moment of inertia of the motor and
load is small). Thus, setting di/dt and dω/dt to zero, the steady-state DC mo-
tor equations, determining the torque τ on the load for a given applied voltage
V and a given angular velocity ω
τg = (ct/R) V − (ce ct/R) ω,    (1.8)
τ = τg − cC sgn(ω) − cv ω, (1.9)
are obtained. In many cases, the axis of a DC motor rotates too fast and gener-
ates a torque that is too weak for driving a robot. Thus, a gear box is commonly
used, which reduces the rotation speed taken out from the motor (on the sec-
ondary drive shaft) while, at the same time, increasing the torque. For an ideal
(loss-free) gear box, the output torque and rotation speed are given by
τout = Gτ,
Figure 1.15: Pulse width modulation control of a servo motor. The lengths of the pulses
determine the requested position angle of the motor output shaft. The interval between pulses
(typically around 20 ms) is denoted T .
ωout = ω/G,    (1.10)
where G is the gear ratio.
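Under the steady-state assumptions above, Eqs. (1.8)-(1.10) combine into a small routine. The routine itself follows the equations in the text, but all numerical parameter values used when calling it are made-up examples; real values come from the motor's data sheet.

```python
def sgn(x):
    """Sign function, with sgn(0) = 0."""
    return (x > 0) - (x < 0)

def motor_output(V, omega, ct, ce, R, cC, cv, G=1.0):
    """Steady-state output torque and speed of a DC motor driving an
    ideal (loss-free) gear box with gear ratio G, per Eqs. (1.8)-(1.10)."""
    tau_g = (ct / R) * V - (ce * ct / R) * omega   # generated torque, Eq. (1.8)
    tau = tau_g - cC * sgn(omega) - cv * omega     # torque on the load, Eq. (1.9)
    return G * tau, omega / G                      # gear box, Eq. (1.10)
```

For example, with V = 6 V, ω = 100 rad/s, and (made-up) constants ct = ce = 0.01, R = 1, cC = 0.001, cv = 0.0001, a gear ratio G = 50 multiplies the output torque by 50 while dividing the output speed by the same factor.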
Servo motors
A servo motor is essentially a DC motor equipped with control electronics and
a gear train (whose purpose is to increase the torque to the required level for
moving the robot, as described above). The actual motor, the gear train, and
the control electronics, are housed in a plastic container. A servo horn (either
plastic or metal) makes it possible to connect the servo motor to a wheel or
some other structure. Fig. 1.14 shows two examples of servo motors.
The angular position of a servo motor’s output shaft is determined using a
potentiometer. In a standard servo, the angle is constrained to a given range
[−αmax , αmax ], and the role of the control electronics is to make sure that the
servo rotates to a set position α (given by the user). A servo is fitted with a
three-wire cable. One wire connects the servo to a power source (for exam-
ple, a motor controller or, in some cases, a microcontroller board) and another
wire connects it to ground. The third wire is responsible for sending signals to
the servo motor. In servo motors, a technique called pulse width modulation
Figure 1.16: An arm of a humanoid robot. The allowed rotation range of the elbow is around
100 degrees.
(PWM) is used: Signals in the form of pulses are sent (e.g. from a microcon-
troller) to the control electronics of the servo motor. The duration of the pulses
determines the required position, to which the servo will (attempt to) rotate, as
shown in Fig. 1.15. For a walking robot (or for a humanoid upper body), the
limitation to a given angular range poses no problem: The allowed rotation
range of a servo is normally sufficient for, say, an elbow or a shoulder joint. As
an example, an arm of a humanoid robot is shown in Fig. 1.16. For this particu-
lar robot, the rotation range for the elbow joint is around 100 degrees, i.e. easily
within the range of a standard servo (around 180 degrees). The limitation is, of
course, not very suitable for motors driving the wheels of a robot. Fortunately,
servo motors can be modified to allow continuous rotation. The Boe-bot that
will be built in the second half of the course uses Parallax continuous rotation
servos (see the right panel of Fig. 1.14), rather than standard servos.
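The mapping from pulse width to set angle can be sketched as follows. The 1.0-2.0 ms range, with 1.5 ms as the center position, is a common convention assumed here; the exact values vary between servo models and must be taken from the data sheet.

```python
def pulse_to_angle(pulse_ms, alpha_max=90.0):
    """Map a PWM pulse width (ms) to a servo set angle (degrees).

    Assumes the common convention that 1.5 ms is the center position
    and 1.0 / 2.0 ms are the extremes [-alpha_max, +alpha_max];
    check the data sheet for any particular servo."""
    if not 1.0 <= pulse_ms <= 2.0:
        raise ValueError("pulse width outside the assumed valid range")
    return (pulse_ms - 1.5) / 0.5 * alpha_max
```

A continuous-rotation servo, by contrast, typically interprets the same pulse widths as a speed command rather than a position command.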
Other motors
There are many different types of motors, in addition to standard DC motors
and servo motors. An example is the stepper motor, which is also a version
of the DC motor, namely one that moves in fixed angular increments, as the
name implies. However, in this course, only standard DC motors and servo
motors will be considered.
1.2.4 Processors
Sensors and actuators are necessary for a robot to be able to perceive its envi-
ronment and to move or manipulate the environment in various ways. How-
Figure 1.17: A Board of Education (BOE) microcontroller board, with a Basic Stamp II
(BS2) microcontroller attached. In addition to the microcontroller, the BOE has a serial port
for communication with a PC (used, for example, when uploading a program onto the BS2),
as well as sockets for attaching sensors and electronic circuits. In this case, a simple circuit
involving a single LED has been built on the BOE. The two black sockets in the upper right
corner are used for connecting up to four servo motors.
ever, in addition to sensors and actuators, there must also be a system for an-
alyzing the sensory information, making decisions concerning what actions to
take, and sending the necessary signals to the actuators.
In autonomous robots, it is common to use several processors to represent
the brain of the robot. Typically, high-level tasks, such as decision-making, are
carried out on a standard PC, for example a laptop computer mounted on the
robot, whereas low-level tasks are carried out by microcontrollers, which will
now be introduced briefly.
Microcontrollers
Figure 1.18: An example of a typical robotic brain architecture, for a differentially steered
two-wheeled robot equipped with wheel encoders, three sonar sensors, one LRF, and two web
cameras.
⁵ A separate motor controller (equipped with its own power source) is often used for
robotics applications, since the power source for the microcontroller may not be able to
deliver sufficient current for driving the motors as well.
2.1 Kinematics
Kinematics is the process of determining the range of possible movements
for a robot, without consideration of the forces acting on the robot, but tak-
ing into account the various constraints on the motion. The kinematic equa-
tions for a robot depend on the robot’s structure, i.e. the number of wheels,
the type of wheels used etc. Here, only the case of differentially steered two-
wheeled robots will be considered. For balance, a two-wheeled robot must also
have one or several supporting wheels (or some other form of ground contact,
such as a ball in a material with low friction). The influence of the supporting
wheels on the kinematics and dynamics will not be considered.
22 CHAPTER 2. KINEMATICS AND DYNAMICS
Figure 2.2: Left panel: Kinematical constraints force a differentially steered robot to move in
a direction perpendicular to a line through the wheel axes. Right panel: For a wheel that rolls
without slipping, the equation v = ωr holds.
v = ωr, (2.1)
However, even if the speed estimates are very accurate, there are other
sources of error as well. For example, the robot’s wheels may slip occasionally.
Furthermore, the kinematic model may not provide a completely accurate es-
timate of the robot’s true kinematics (for example, no wheel is ever perfectly
circular).
Once wheel speed estimates are available, the pose can be estimated, us-
ing a kinematic model as described above. The process of estimating a robot’s
position and heading based on wheel encoder data is called odometry. Due
to the limited accuracy of velocity estimates, the estimated pose of the robot
will be subject to an error, which grows with time. Thus, in order to maintain
a sufficiently accurate pose estimate for navigation over long distances, an in-
dependent method of odometric recalibration must be used. This issue will
be considered in a later chapter.
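A single odometry update step can be sketched as follows, using the kinematic relations V = (vL + vR)/2 and ϕ̇ = −(vL − vR)/(2R) from this chapter (with R half the distance between the wheels, following the notation of Eq. (2.19)). The coordinate convention (x along cos ϕ) is an assumption of this sketch.

```python
import math

def odometry_step(x, y, phi, v_left, v_right, R, dt):
    """One Euler step of odometric pose estimation for a
    differentially steered robot. R is half the distance between
    the wheels; dt is the time between encoder readings."""
    V = (v_left + v_right) / 2.0          # forward speed, Eq. (2.18)
    phi_dot = -(v_left - v_right) / (2.0 * R)  # turning rate, Eq. (2.19)
    x += V * math.cos(phi) * dt
    y += V * math.sin(phi) * dt
    phi += phi_dot * dt
    return x, y, phi
```

Each step inherits the errors of the previous one, which is precisely why the odometric pose estimate drifts over time, as discussed above.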
Normally, the wheel speeds are not given a priori. Instead, the signals sent
to the motors by the robotic brain (perhaps in response to external events, such
as detection of obstacles) will determine the torques applied to the motor axes.
In order to determine the motion of the robot one must then consider not only
its kinematics but also its dynamics. This will be the topic of the next section.
2.2 Dynamics
The kinematics considered in the previous section determines the range of pos-
sible motions for a robot, given the constraints which, in the case of the two-
wheeled differential robot, enforce motion in the direction perpendicular to
the wheel axes. However, kinematics says nothing about the way in which a
particular motion is achieved. Dynamics, by contrast, considers the motion of
the robot in response to the forces (and torques) acting on it. In the case of the
two-wheeled, differentially steered robot, the two motors generate torques (as
described above) that propel the wheels forward, as shown in Fig. 2.3. The fric-
tional force at the contact point with the ground will try to move the ground
backwards. By Newton’s third law, a reaction force of the same magnitude
will attempt to move the wheel forward. In addition to the torque τ from the
motor (assumed to be known) and the reaction force F from the ground, a re-
action force ρ from the main body of the robot will act on the wheel, mediated
by the wheel axis (the length of which is neglected in this derivation). Using
Newton’s second law, the equations of motion for the wheels take the form
mv̇L = FL − ρL , (2.10)
mv̇R = FR − ρR , (2.11)
Iw φ̈L = τL − FL r, (2.12)
Figure 2.3: Left panel: Free-body diagram showing one of the two wheels of the robot. Right
panel: Free-body diagram for the body of the robot and for the two wheels. Only the forces
acting in the horizontal plane are shown.
and
Iw φ̈R = τR − FR r, (2.13)
where m is the mass of the wheel, Iw is its moment of inertia, and r its radius.
It is assumed that the two wheels have identical properties. The right panel of
Fig. 2.3 shows free-body diagrams of the robot and the two wheels, seen from
above. Newton’s equations for the main body of the robot (mass M ) take the
form
M V̇ = ρL + ρR (2.14)
and
I ϕ̈ = (−ρL + ρR ) R, (2.15)
where I is its moment of inertia.
In the six equations above there are 10 unknown variables, namely vL , vR ,
FL , FR , ρL , ρR , φL , φR , V , and ϕ. Four additional equations can be obtained
from kinematical considerations. As noted above, the requirement that the
wheels should roll without slipping leads to the equations
vL = rφ̇L (2.16)
and
vR = rφ̇R . (2.17)
Furthermore, the two kinematic equations (see Sect. 2.1)
V = (vL + vR)/2,    (2.18)
and
ϕ̇ = −(vL − vR)/(2R)    (2.19)
complete the set of equations for the dynamics of the differentially steered
robot. Combining Eq. (2.10) with Eq. (2.12) and Eq. (2.11) with Eq. (2.13), the
equations
m v̇L = (τL − Iw φ̈L)/r − ρL,    (2.20)

m v̇R = (τR − Iw φ̈R)/r − ρR    (2.21)
are obtained. Inserting the kinematic conditions from Eqs. (2.16) and (2.17), ρL
and ρR can be expressed as
ρL = τL/r − (Iw/r² + m) v̇L,    (2.22)

and

ρR = τR/r − (Iw/r² + m) v̇R.    (2.23)
Inserting Eqs. (2.22) and (2.23) in Eq. (2.14), one obtains the acceleration of the
center-of-mass of the robot body as
M V̇ = ρL + ρR = (τL + τR)/r − (Iw/r² + m)(v̇L + v̇R) = (τL + τR)/r − 2 (Iw/r² + m) V̇,    (2.24)
where, in the last step, the derivative with respect to time of Eq. (2.18) has been
used. Rearranging terms, one can write Eq. (2.24) as
M V̇ = A (τL + τR ) , (2.25)
where
A = 1 / [r (1 + 2 (Iw/(M r²) + m/M))].    (2.26)
For the angular motion, using Eqs. (2.22) and (2.23), Eq. (2.15) can be expressed
as
I ϕ̈ = (−ρL + ρR) R = (−τL + τR) R/r + R (Iw/r² + m)(v̇L − v̇R),    (2.27)
Differentiating Eq. (2.19) with respect to time, and inserting the resulting ex-
pression for v̇R − v̇L in Eq. (2.27), one obtains the equation for the angular
motion as

I ϕ̈ = (−τL + τR) R/r − 2R² (Iw/r² + m) ϕ̈.    (2.28)
Rearranging terms, one obtains

I ϕ̈ = B (−τL + τR),    (2.29)
where
B = 1 / [r/R + 2 (Iw R/(I r) + m R r/I)].    (2.30)
Due to the limited strength of the motors and to friction, as well as other losses
(for instance in the transmission), there are of course limits on the speed and
rotational velocity of the robot. Thus, the differential equations for V and ϕ
should also include damping terms. In practice, for any given robot, the exact
form of these terms must be determined through experiments (i.e. through sys-
tem identification). A simple approximation is to use linear damping terms,
so that the equations of motion for the robot become
M V̇ + αV = A (τL + τR ) , (2.31)
and
I ϕ̈ + β ϕ̇ = B (−τL + τR ) , (2.32)
where α and β are constants. Note that, if the mass m and moment of inertia
I w of the wheels are small compared to the mass M and moment of inertia I
of the robot, respectively, the expression for A can be simplified to
A = 1/r.    (2.33)
Similarly, the expression for B can be simplified to
B = R/r.    (2.34)
Given the torques τL and τR generated by the two motors in response to the
signals sent from the robotic brain, the motion of the robot can thus be obtained
by integration of Eqs. (2.31) and (2.32).
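Given values for the constants, Eqs. (2.31) and (2.32) are easily integrated numerically. The forward-Euler sketch below starts the robot from rest under constant motor torques; the parameter values used in any particular call are, of course, made-up examples unless obtained from system identification.

```python
def simulate(tau_L, tau_R, M, I, A, B, alpha, beta, dt, n_steps):
    """Forward Euler integration of Eqs. (2.31) and (2.32):
        M dV/dt      + alpha * V       = A * (tau_L + tau_R)
        I d(phidot)/dt + beta * phidot = B * (-tau_L + tau_R)
    for constant torques, starting from rest. Returns the final
    speed V and angular velocity phidot."""
    V = 0.0
    phidot = 0.0
    for _ in range(n_steps):
        V += dt * (A * (tau_L + tau_R) - alpha * V) / M
        phidot += dt * (B * (-tau_L + tau_R) - beta * phidot) / I
    return V, phidot
```

With equal torques, the robot approaches the steady-state speed V = A(τL + τR)/α in a straight line; with opposite torques, it spins in place.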
3.1 Simulators
Over the years, several different simulators for mobile robots have appeared,
with varying degrees of complexity. One of the most ambitious simulators
to date is Robotics studio from Microsoft, which allows the user to simulate
many of the commercially available mobile robots, or even to assemble a (virtual) robot.
30 CHAPTER 3. SIMULATION OF AUTONOMOUS ROBOTS
(Steps visible in the figure include: 2. Process information; 4. Move robot; 5. Update arena.)
Figure 3.1: The flow of a single-robot simulation. Steps 1 through 6 are carried out in each
time step of the simulation.
taken, particularly regarding steps 1-3. To be specific, one must make sure that
these steps can be executed (on the real robot) in a time which does not exceed
the time step length in the simulation. Here, it is important to distinguish
between two different types of events, namely (1) those events that take a long
time to complete in simulation, but would take a very short time in a real robot,
and (2) those events that are carried out rapidly in simulation, but would take
a long time to complete in a real robot.
An example of an event of type (1) is collision-checking. If performed
in a straightforward, brute-force way, the possibility of a collision between
the (circular, say) body of the robot and an object must be checked by going
through all lines in a 2D-projection of the arena. A better way (used, for ex-
ample, in GPRSim) is to introduce an invisible grid, and only check for colli-
sions between the robot and those objects that (partially) cover the grid cells
that are also covered by the robot. However, even when such a procedure is
used, collision-checking may nevertheless be very time-consuming in simula-
tion whereas, in a real robot, it amounts simply to reading a bumper sensor (or,
as on the Boe-bot, a whisker), and transferring the signal (which, in this case, is
binary, i.e. a single bit of information) from the sensor to the brain of the robot.
Events of this type cause no (timing) problems at the stage of transferring the
results to a real robot, even though they may slow down the simulation con-
siderably.
An example of an event of type (2) is the reading of sensors. For example,
an IR sensor can be modelled using simple ray-tracing (see below) and,
provided that the number of rays used is not too large, the update can be
carried out within the duration ∆t (the simulation time step) when transferred
to the real robot.

Figure 3.2: A timing diagram. The boxes indicate the time required to complete the corre-
sponding event in hardware, i.e. a real robot. In order for the simulation to be realistic, the
time step ∆t used in the simulation must be longer than the total duration (in hardware) of all
events taking place within a time step.

An example of a timing diagram for a generic robot (not a Khepera)
is shown in Fig. 3.2. In the case shown in the figure, two IR proximity sensors
are read, the information is processed (for example, by being passed through
an artificial neural network), and the motor signals (voltages, in the case of
standard DC motors) are then transferred to the motors. The figure shows a
case which could be realistically simulated, with the given time step length
∆t. However, if two additional IR sensors were to be added, the simulation
would become unrealistic: The real robot would not be able to complete all
steps during the time ∆t.
For the simple robotic brains considered in this course, step 2 would gener-
ally be carried out almost instantaneously (compared to step 1) in a real robot.
Similarly, the transfer of motor signals to a DC motor is normally very rapid
(note, however, that the dynamics of the motors may be such that it is pointless
to send commands with a frequency exceeding a certain threshold).
To summarize, a sequence of events that takes, say, several seconds per
time step to complete in simulation (e.g. the case of collision-checking in a very
complex arena) may be perfectly simple to transfer to a real robot, whereas a
sequence of events (such as the reading of a large set of IR sensors) that can
be completed almost instantaneously in a simulated robot, may simply not be
transferable to a real robot, unless a dedicated processor for signal processing
and signal transfer is used.
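The feasibility criterion illustrated in Fig. 3.2 boils down to a simple budget check: the summed hardware durations of all events in a time step must not exceed ∆t. The event durations below are made-up examples, not measurements from any particular robot.

```python
def timing_feasible(event_durations, dt):
    """True if the per-time-step events (durations in seconds, as they
    would be measured on the real robot) fit within the simulation
    time step dt, as required by the timing diagram in Fig. 3.2."""
    return sum(event_durations) <= dt

# Hypothetical example: two IR readings (2 ms each), processing
# (0.1 ms), and motor signal transfer (0.5 ms), with dt = 10 ms.
ok = timing_feasible([0.002, 0.002, 0.0001, 0.0005], dt=0.01)
```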
3.2.2 Noise
Another aspect that should be considered in simulations is noise. Real sensors
and actuators are invariably noisy, on several levels. Furthermore, even sen-
sors that are supposed to be identical often show very different characteristics
in practice. In addition, regardless of the noise level of a particular sensor, the
frequency with which readings can be updated is limited, thus introducing
another source of noise, in certain cases. For example, the limited sampling
frequency of wheel encoders implies that, even in the (unrealistic) case where
the kinematic model is perfect and there are no other sources of noise, the in-
tegrals in the kinematic equations (Eqs. (2.7)-(2.9)) can only be approximately
computed.
Thus, in any realistic robot simulation, noise must be added, at all relevant
levels. Noise can be added in several different ways. A common method (used
in GPRSim and ARSim) is to take the original reading S of a sensor and add
noise to form the actual reading Ŝ as

Ŝ = z S,    z ∈ N(1, σ),

where N(1, σ) denotes the normal (Gaussian) distribution with mean 1 and
standard deviation σ. An alternative method is to use lookup tables containing
recorded readings from the actual sensor.
This method has the advantage of forming simulated readings from actual
sensor readings, rather than introducing a model for the noise. Furthermore,
using lookup tables, it is straightforward to account for the individual nature
of supposedly identical sensors. However, a clear disadvantage is the need for
generating the lookup tables, which often must contain a very large number of
samples taken not only at various distances, but also, perhaps, at various angles
between the forward direction of the sensor and the surface of the obstacle.
Thus, the first method, using a specific noise distribution, is normally used
instead.
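As an illustration, the multiplicative noise model described above can be sketched as follows. This is a Python sketch (ARSim and GPRSim are written in Matlab); the function name is illustrative.

```python
import random

def noisy_reading(s, sigma, rng=random):
    """Return the simulated reading S_hat = z * S, with z drawn from
    N(1, sigma). The noise is multiplicative: the relative noise level
    is the same for weak and strong readings."""
    z = rng.gauss(1.0, sigma)
    return z * s
```

Passing an explicit `random.Random` instance makes a simulation run reproducible, which is useful when comparing robotic brains under identical noise realizations.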
3.2.3 Sensing
In addition to correct timing of events and the addition of noise in sensors and
actuators, it is necessary to make sure that the sensory signals received by the
simulated robot do not contain more information than could be provided by
the sensors of the corresponding real robot. For example, in the simulation of
a robot equipped only with wheel encoders (for odometry), it is not allowed
to provide the simulated robot with continuously updated and error-free po-
sition measurements. Instead, the simulated wheel encoders, including noise
and other inaccuracies, should be the only source of information regarding the
position of the simulated robot.
In both GPRSim and ARSim, several different sensors have been imple-
mented, namely (1) wheel encoders, (2) IR proximity sensors, and (3) com-
passes. In addition, GPRSim (but not ARSim) also features (4) sonar sensors
and (5) laser range finders (LRFs). An important subclass of (simulated) sen-
sors are ray-based sensors, which use a simple form of ray tracing in order to
generate their readings.
Figure 3.3: Left panel: A screenshot from GPRSim, showing an LRF taking a reading in
an arena containing only walls. Right panel: A two-dimensional representation of the sensor
reading. The dotted ray points in the forward direction of the robot which, in this case, coincides
with the forward direction of the LRF.
Given the direction of a sensor ray, its equation can be determined. Let (xa, ya) denote the start point of the
ray (which will be equal to the position of the sensor, if the size of the latter
can be neglected). Once the absolute direction (βi ) of the sensor ray has been
determined, the end point (xb , yb ) of an unobstructed ray (i.e. one that does
not hit any obstacle) can be obtained as

xb = xa + D cos βi ,    yb = ya + D sin βi ,

where D denotes the sensor range. Similarly, any line corresponding to the
side of an arena object can be defined using the coordinates of its start and
end points. Note that, in Fig. 3.3, all lines defining arena objects coincide with
coordinate axes, but this is, of course, not always the case. Now, in the case of
two lines of infinite length, defined by the equations yk = ck + dk x, k = 1, 2,
it is trivial to find the intersection point (if any) simply by setting y1 = y2 .
However, here we are dealing with line segments of finite length. In this case,
the intersection point can be determined as follows: Letting P_i^a = (x_i^a, y_i^a) and
P_i^b = (x_i^b, y_i^b) denote the start and end points, respectively, of line i, i = 1, 2,
the equations for an arbitrary point P_i along the two line segments can be
written

P_1 = P_1^a + t (P_1^b − P_1^a),     (3.4)

and

P_2 = P_2^a + u (P_2^b − P_2^a),     (3.5)
where (t, u) ∈ [0, 1]. Solving the equation P1 = P2 for t and u gives, after some
algebra,
t = [ (x_2^b − x_2^a)(y_1^a − y_2^a) − (y_2^b − y_2^a)(x_1^a − x_2^a) ] / [ (y_2^b − y_2^a)(x_1^b − x_1^a) − (x_2^b − x_2^a)(y_1^b − y_1^a) ]     (3.6)

and

u = [ (x_1^b − x_1^a)(y_1^a − y_2^a) − (y_1^b − y_1^a)(x_1^a − x_2^a) ] / [ (y_2^b − y_2^a)(x_1^b − x_1^a) − (x_2^b − x_2^a)(y_1^b − y_1^a) ].     (3.7)
An intersection occurs if both t and u are in the range [0, 1]. Assuming that the
first line (with points given by P1 ) is the sensor ray, the distance d between the
sensor and the obstacle, along the ray in question, can then easily be formed
by simply determining P1 using the t value found, and computing
d = |P_1 − P_1^a| = |t (P_1^b − P_1^a)|.     (3.8)
If the two lines happen to be parallel, the denominator becomes equal to zero2 .
Thus, this case must be handled separately.
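The intersection test of Eqs. (3.4)-(3.8) can be sketched as follows. This is a Python sketch; the function name and the tolerance used for detecting (near-)parallel lines are illustrative choices.

```python
def ray_segment_distance(p1a, p1b, p2a, p2b, eps=1e-12):
    """Distance from p1a to the intersection of segment p1a-p1b (the
    sensor ray) with segment p2a-p2b, or None if they do not intersect.

    Solves P1 = P2 for t and u (Eqs. (3.6)-(3.7)) and accepts the
    intersection only if both parameters lie in [0, 1]."""
    x1a, y1a = p1a; x1b, y1b = p1b
    x2a, y2a = p2a; x2b, y2b = p2b
    denom = (y2b - y2a) * (x1b - x1a) - (x2b - x2a) * (y1b - y1a)
    if abs(denom) < eps:          # parallel (or coincident) lines:
        return None               # handled separately, as noted in the text
    t = ((x2b - x2a) * (y1a - y2a) - (y2b - y2a) * (x1a - x2a)) / denom
    u = ((x1b - x1a) * (y1a - y2a) - (y1b - y1a) * (x1a - x2a)) / denom
    if not (0.0 <= t <= 1.0 and 0.0 <= u <= 1.0):
        return None
    # d = |t (P1^b - P1^a)|, Eq. (3.8)
    dx, dy = t * (x1b - x1a), t * (y1b - y1a)
    return (dx * dx + dy * dy) ** 0.5
```

For a ray from (0, 0) to (2, 0) crossing a wall segment from (1, −1) to (1, 1), the function returns the distance 1.0 to the wall.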
In simulations, for any time step during which the readings of a particular
sensor are to be obtained, the first step is to determine the origin of the sensor
rays (i.e. the position of the sensor), as well as their directions. An example is
shown in Fig. 3.4. Here, a sensor is placed at a point ps , relative to the center
of the robot. The absolute position Ps (relative to an external, fixed coordinate
system) is given by
Ps = X + ps , (3.9)
where X = (X, Y ) is the position of the (center of mass of the) robot. Assuming
that the front direction of the sensor is at an angle α relative to the direction of
heading (ϕ) of the robot, and that the sensor readings are to be formed using
N equally spaced rays over an opening angle of γ, the absolute direction βi of
the ith ray equals
βi = ϕ + α − γ/2 + (i − 1)δγ,     (3.10)

where δγ is given by

δγ = γ/(N − 1).     (3.11)
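The ray directions of Eqs. (3.10)-(3.11) can be computed as in the following Python sketch; the handling of the single-ray case N = 1 is an assumption, since δγ is only defined for N > 1.

```python
import math

def ray_directions(phi, alpha, gamma, n):
    """Absolute directions beta_i = phi + alpha - gamma/2 + (i-1)*dgamma,
    with dgamma = gamma/(N-1), for N equally spaced rays over the opening
    angle gamma (Eqs. (3.10)-(3.11))."""
    if n == 1:
        return [phi + alpha]      # assumed: a single ray along the sensor axis
    dgamma = gamma / (n - 1)
    return [phi + alpha - gamma / 2.0 + (i - 1) * dgamma
            for i in range(1, n + 1)]
```

For a forward-mounted sensor (α = 0) on a robot heading along ϕ = 0, with γ = π/2 and N = 5, the rays span −π/4 to π/4, with the middle ray pointing straight ahead.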
Now, the use of the ray readings differs between different simulated sensors.
Let us first consider a simulated IR sensor. Here, the set of sensor rays is used
only as an artificial construct needed when forming the rather fuzzy reading of
such a sensor. In this case, the rays themselves are merely a convenient com-
putational tool. Thus, for IR sensors, the robotic brain is not given information
regarding the individual rays. Instead, only the complete reading S is pro-
vided, and it is given by
S = (1/N) Σ_{i=1}^{N} ρi ,     (3.12)

2 In case the two lines are not only parallel but also coincident, both the numerators and the
denominators are equal to zero in the equations for t and u.
Figure 3.4: The right panel shows a robot equipped with two IR sensors, and the left panel
shows a blow-up of the left sensor. In this case, the number of rays (N ) was equal to 5. The
leftmost and rightmost rays, which also indicate the opening angle γ of the IR sensor, are shown
as solid lines, whereas the three intermediate rays are shown as dotted lines.
where ρi is the ray reading of ray i. Ideally, the value of N should be very large
for the simulated sensor to represent accurately a real IR sensor. However, in
practice, rather small values of N (3-5, say) are used in simulation, so that the
reading can be obtained quickly. The loss of accuracy is rarely important, since
the (fuzzy) reading of an IR sensor is normally used only for proximity detec-
tion (rather than, say, mapping or localization). An illustration of a simulated
IR sensor is given in Fig. 3.4.
A common phenomenological model for IR sensor readings (used in GPRSim
and ARSim) defines ρi as
ρi = min( (c1/di² + c2) cos κi , 1 ),     (3.13)
where c1 and c2 are non-negative constants, di > 0 is the distance to the nearest
object along ray i, and
κi = −γ/2 + (i − 1)δγ,     (3.14)
is the relative ray angle of ray i. If di > D (the range of the sensor), ρi = 0.
Note that it is assumed that κi ∈ [−π/2, π/2], i.e. the opening angle cannot
exceed π radians. Typical opening angles are π/2 or less. It should also be
noted that this IR sensor model has limitations; for example, the model does
not take into account the orientation of the obstacle’s surface (relative to the
direction of the sensor rays) and neither does it account for the different IR
reflectivity of different materials.
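A sketch of this IR sensor model (Eqs. (3.12)-(3.14)) in Python; the function name and argument names are illustrative.

```python
import math

def ir_reading(dists, c1, c2, gamma, sensor_range):
    """Scalar IR reading S = (1/N) * sum(rho_i), with
    rho_i = min((c1/d_i^2 + c2) * cos(kappa_i), 1), and rho_i = 0 when
    d_i > D (no obstacle within range). dists holds the per-ray
    distances d_i to the nearest obstacle."""
    n = len(dists)
    dgamma = gamma / (n - 1) if n > 1 else 0.0
    total = 0.0
    for i, d in enumerate(dists, start=1):
        if d > sensor_range:
            continue                          # out of range: rho_i = 0
        kappa = -gamma / 2.0 + (i - 1) * dgamma   # relative ray angle, Eq. (3.14)
        rho = min((c1 / (d * d) + c2) * math.cos(kappa), 1.0)
        total += rho
    return total / n
```

Note how the min(..., 1) cap saturates the reading for very close obstacles, mirroring the limited dynamic range of a real IR proximity sensor.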
For simulated sonar sensors (which are included in GPRSim but not in AR-
Sim), the rays are also only used as a convenient computational tool, but the
final reading S is formed in a different way. Typically, sonar sensors give rather
accurate distance measurements in the range [Dmin , Dmax ], but sometimes fail
to give a reading at all. Thus, in GPRSim, the reading of a sonar sensor is
formed as S = mini di with probability p and Dmax (no detection) with proba-
bility 1 − p. Also, if S < Dmin the reading is set to Dmin . Typically, the value of
p is very close to 1. The number of rays (N ) is usually around 3 for simulated
sonars.
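The sonar reading can be sketched as follows (Python; clipping the reading to Dmax when all rays are unobstructed is an assumption, and the function name is illustrative).

```python
import random

def sonar_reading(dists, d_min, d_max, p=0.99, rng=random):
    """Sonar reading: min_i d_i with probability p, and D_max (no
    detection) with probability 1 - p. Readings below D_min are set to
    D_min; readings beyond D_max are clipped to D_max (assumed)."""
    if rng.random() > p:
        return d_max                      # occasional failure to detect
    return min(max(min(dists), d_min), d_max)
```

With p close to 1 (as in GPRSim), the occasional non-detections appear as isolated spikes at Dmax in an otherwise accurate distance signal.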
A simulated LRF, by contrast, gives a vector-valued reading, S, where each
component Si is obtained simply as the distance di to the nearest obstacle along
the ray. Thus, for LRFs, the sensor rays have a specific physical interpretation,
made possible by the fact that the laser beam emitted by an LRF is very narrow.
In GPRSim, if di > D, the corresponding laser ray reading is set to -1, to indi-
cate the absence of any obstacle within range of the ray in question. Note that
LRFs are only implemented in GPRSim. It would not be difficult to add such a
sensor to ARSim, but since an LRF typically takes readings in 1,000 different
directions (thus requiring the same number of rays), such sensors would make
ARSim run very slowly.
As a final remark regarding ray-based sensors, it should be noted that a
given sensor ray i may intersect several arena object lines (see, for example,
Fig. 3.3). In such cases, di is taken as the shortest distance obtained for the ray.
3.2.4 Actuators
A commonly used actuator in mobile robots is the DC motor. The equations
describing such motors are given in Chapter 1.
In both GPRSim and ARSim, a standard DC motor has been implemented.
In this motor, the input signal is the applied voltage. Both the electrical and
mechanical dynamics of the motors are neglected. Thus the torque acting
on the motor shaft axis is given by Eqs. (1.8) and (1.9). Gears are imple-
mented in both simulators, so that the torques acting on the wheels are given
by Eqs. (1.10). However, the simulators also include the possibility of setting
a maximum torque τmax which cannot be exceeded, regardless of the output
torque τout obtained from Eqs. (1.10).
In addition, GPRSim (but not ARSim) also allows simulation of velocity-
regulated motors. Unlike the voltage signal used in the standard DC motor, a
velocity-regulated motor takes as input a desired reference speed vref for the
wheel attached to the motor axis. The robot then tries to reach this speed value,
using proportional control. The actual output torque of a velocity-regulated
motor is given by
τ = K (vref − v).     (3.15)
In this model, a change in vref generates an immediate change in the torque. In
a real motor, the torques cannot change instantaneously. However, Eq. (3.15)
Figure 3.5: Left panel: A simulated robot (from GPRSim), consisting of more than 100
objects. Right panel: An example (in blue) of a collision geometry.
between the circular body of the robot and any line representing a side of an
arena object.
3.2.6 Motion
Once the torques acting on the wheels have been generated, the motion of
the robot is obtained through numerical integration of Eqs. (2.31) and (2.32).
In both GPRSim and ARSim, the integration is carried out using simple first-
order (Euler) integration. For each time step, V̇ and ϕ̈ are computed using
Eqs. (2.31) and (2.32), respectively. The new values V 0 and ϕ̇0 of V and ϕ̇ are
then computed as
V 0 = V + V̇ ∆t, (3.16)
ϕ̇0 = ϕ̇ + ϕ̈∆t, (3.17)
where ∆t is the time step length (typically set to 0.01 s). The value of ϕ is then
updated in the same way, using ϕ′ = ϕ + ϕ̇ ∆t.
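One Euler integration step can be sketched as follows (Python; using the old value of ϕ̇ in the heading update is an assumed ordering).

```python
def euler_step(v, phi, phi_dot, v_dot, phi_ddot, dt=0.01):
    """One first-order (Euler) integration step:
    V' = V + Vdot * dt, phi_dot' = phi_dot + phi_ddot * dt,
    and phi' = phi + phi_dot * dt (Eqs. (3.16)-(3.17))."""
    v_new = v + v_dot * dt
    phi_dot_new = phi_dot + phi_ddot * dt
    phi_new = phi + phi_dot * dt      # heading updated with the old phi_dot
    return v_new, phi_new, phi_dot_new
```

First-order integration is adequate here because the time step is short (0.01 s) relative to the dynamics of a typical differentially steered robot.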
are motor behaviors (that make use of the robot’s motors) and some of which
are cognitive processes, i.e. processes that do not make use of any motors. In
addition, GPRBS features a decision-making system based on the concept of
utility. One of the main properties of GPRBS is that this structure allows sev-
eral processes to run in parallel, making it possible to build complex robotic
brains. In fact, the specific aim of the development of GPRBS is to move be-
yond the often very simple robotic brains defined within standard BBR.
In GPRBS, all brain processes are specified in a standardized format, which
simplifies the development of new brain processes, since many parts of an
already existent process often can be used when writing a new process. How-
ever, at the same time, GPRBS (as implemented in GPRSim) is a bit complex
to use, especially since it is intended for use in research, rather than as an edu-
cational tool. Thus, in this course, ARSim will be used instead. This simulator
allows the user to write simple brain processes (as well as a basic decision-
making system) in any desired format (i.e. without using GPRBS). Methods
for writing brain processes will be described further in a later chapter.
>> TestRunRobot
and press return. The robot appears in a square arena with four walls and
two obstacles, as shown in Fig. 3.6. The robot is shown as a circle, and its di-
rection of motion is indicated by a thin line. The IR sensors (of which there are
two in the default simulation) are shown as smaller circles. The rays used for
determining the sensor readings (three per sensor, per default) are shown as
lines emanating from the sensors. In the default simulation, the robot executes
1,000 time steps of length 0.01 s, unless it is interrupted by a collision with an
obstacle or a wall.
The flow of the simulation basically follows the structure given in Fig. 3.1.
The first lines of code in the TestRunRobot.m file are devoted to adding the
various ARSim function libraries to Matlab’s search path. The arena objects are
then created and added to the arena. Next, the brain of the robot is created (by
a call to CreateBrain), and the setup is completed by creating the sensors
and motors, and adding them to the robot.
Before the actual simulation starts, the robot’s position, heading, velocity,
and angular speed are set, and the plot of the arena (including the robot) is
created. Optionally, a variable motionResults, storing information about
Figure 3.6: A typical screenshot from an ARSim simulation. The black lines emanating from
the two IR proximity sensors of the robot are the rays used for determining sensor readings.
46 CHAPTER 4. ANIMAL BEHAVIOR
4.4 Ethology
4.4.1 Reflexes
Reflexes, the simplest forms of behavior, are involuntary reactions to exter-
nal stimuli. An example is the withdrawal reflex, which is present even in
very simple animals (and, of course, in humans as well). However, even re-
flexes show a certain degree of modulation. For example, some reflexes exhibit
warm-up, meaning that they do not reach their maximum intensity instanta-
neously (an example is the scratch reflex in dogs). Also, reflexes may exhibit
fatigue, by which is meant a reduced, and ultimately disappearing, intensity
even if the stimulus remains unchanged. Two obvious reasons for fatigue may
be muscular or sensory exhaustion, i.e. either an inability to move or an inabil-
ity to sense. However, these explanations are often wrong, since the animal
may be perfectly capable of carrying out other actions, involving both muscles
and sensors, even though it fails to show the particular reflex response under
study. An alternative explanation concerns neural exhaustion, i.e. an inability
of the nervous system to transmit signals (possibly as a result of neurotrans-
mitter depletion). An example1 is the behavior of Sarcophagus (don’t ask - you
don’t want to know) larvae. These animals generally move away from light.
However, if placed in a tunnel, illuminated at the entrance, and with a dead
end (no pun intended), they move to the end of the tunnel, turn around, and
move towards the light, out of the tunnel. This is interesting, since these larvae
will (if not constrained) always move away from light. However, this is neither
a case of muscular exhaustion nor sensory exhaustion. Instead, the larvae have
simply exhausted their neural circuits responsible for the turning behavior.
follow the gradient in the same way as the fly maggots do. Interestingly, E. Coli
bacteria are nevertheless able to move towards, and accumulate in, regions of
high food concentration, an ability which is exploited by another predatory
bacterium, M. Xanthus, which secretes a substance that attracts E. Coli in what
Shi and Zusman2 have called “fatal attraction”. The fact that the M. Xanthus
are able to feed on E. Coli is made all the more interesting by the fact that the
latter move around 200 times faster than the former. Now two questions arise:
How do the E. Coli find regions of higher food concentration, and how do the
M. Xanthus catch them?
Figure 4.1: An illustration of the switch between straight-line swimming and tumbling in E.
Coli bacteria, based on a model with a single leaky integrator given in Eq. (4.1). The left panel
shows the variation of the attractant concentration X(t), and the right panel shows U (t). The
straight-line swimming behavior (B1 ) is active between t = 1 and t = 3.
falls towards zero (and eventually becomes negative, if b > a). However, if
there is a decrease in X, i.e. if the bacterium begins to leave the region of high
attractant concentration, U (t) becomes negative, and B2 is activated, keeping
the bacterium approximately stationary. Thus, the chemotaxis of E. Coli can be
modeled with a single leaky integrator.
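The behavior switch can be sketched as follows; since Eq. (4.1) is not reproduced in this section, the sketch assumes one plausible single-leaky-integrator form, U = aX − bY with Ẏ = X − Y, which matches the qualitative behavior described above (U decays toward zero under a constant stimulus, becomes negative if b > a, and becomes negative when X drops).

```python
def simulate_chemotaxis(x_signal, a=1.0, b=1.0, threshold=0.0, dt=0.01):
    """Behavior switch driven by a single leaky integrator (assumed form,
    see the lead-in). B1 (straight-line swimming) is active while
    U >= threshold; otherwise B2 (tumbling). Returns one behavior label
    per time step."""
    y = 0.0
    behaviors = []
    for x in x_signal:
        y += (x - y) * dt      # leaky integration of the attractant level
        u = a * x - b * y      # high on increases in X, negative on decreases
        behaviors.append("B1" if u >= threshold else "B2")
    return behaviors
```

With a = b = 1 and threshold 0 (the parameters of Fig. 4.2), a step increase in the attractant concentration keeps B1 active, and a subsequent drop immediately activates B2.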
Finally, note the importance of taxes in simple behaviors for autonomous
robots. For example, it is easy (using, for example, two IR sensors) to equip a
robot with the ability to follow a light gradient. Thus, for example, if a light
source is placed next to, say, a battery charging station, the robot may achieve
long-term autonomy, by activating a phototactic behavior whenever the bat-
tery level drops below a certain threshold. Problems involving such behavior
selection in autonomous robots will be studied further in later chapters in con-
nection with the topic of utility. The simple chemotaxis of the E. Coli can be
considered as a case of utility-based behavior selection, in which one behavior,
B2 , has a fixed utility T , and the utility of the other, B1 , is given by U (t).
Kinesis is an even simpler concept, in which the level of activity (e.g. move-
ment) of an organism depends on the level of some stimulus, but is undirected.
An example is the behavior of wood lice: If the ambient humidity is high (a
condition favored by wood lice), they typically move very little. If the humid-
ity drops, they will increase their rate of movement.
Figure 4.2: The motion of simulated E. Coli bacteria based on the behavior switch defined
in the main text. 100 bacteria were simulated, and the parameters a and b were both equal to
1. The attractant had a Gaussian distribution, with its peak at the center of the image. The
threshold was set to 0. The left panel shows the initial distribution of bacteria, and the right
panel shows the distribution after 10 seconds of simulation, using a time step of 0.01.
of IRMs, e.g. whether they really are inborn mechanisms, have been called
into question.
small feat, keeping in mind that the entrance to the nest is a small hole in the
ground.
How does Cataglyphis achieve such accurate navigation? This issue has
been studied meticulously by Wehner and his colleagues and students3 . Their
experiments have shown that Cataglyphis has the capability of computing dis-
tance travelled and also to combine the position information thus obtained
with heading information obtained using an ingenious form of compass, based
on the pattern of light polarization over the sky. Combining the odometric in-
formation with the compass information, Cataglyphis is able to carry out path
integration, i.e. to determine, at any time, the direction vector (from the nest to
its current position), regardless of any twists and turns during the outbound
section of its movement. Once a food item has been found, the ant will use the
stored direction vector in order to return to the nest on an almost straight line.
3
See e.g. Wehner, R. Desert ant navigation: how miniature brains solve complex tasks, J Comp
Physiol, 189, pp. 579-588, 2003.
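Path integration as described above can be sketched as follows. This is a simplified dead-reckoning illustration in Python, not Wehner's actual model; the function names are illustrative.

```python
import math

def home_vector(steps):
    """Accumulate the vector from the nest to the current position from a
    sequence of (heading, distance) pairs, i.e. odometric step lengths
    combined with compass headings."""
    x = y = 0.0
    for heading, dist in steps:
        x += dist * math.cos(heading)
        y += dist * math.sin(heading)
    return x, y

def homing_direction(steps):
    """Direction pointing straight back to the nest: the opposite of the
    accumulated outbound vector."""
    x, y = home_vector(steps)
    return math.atan2(-y, -x)
```

After moving 3 m east and then 4 m north, however twisty the path segments in between, the stored vector is (3, 4) and the homing direction points back along its negation.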
It should be noted that the light polarization pattern varies with the move-
ment of the sun in the sky. Thus, in order to use its compass over long peri-
ods of time, the ant also needs an ephemeris function, i.e. a function that de-
scribes the position of the sun during the day. Experiments have shown that
Cataglyphis indeed has such a function.
Even with the path integration, finding the exact spot of the nest is quite
difficult: As in the case of robotics, the odometric information and the compass
angle have limited accuracy. However, the tiny brain of Cataglyphis (weighing
in at around 0.1 mg, in a body weighing around 10 mg), is equipped with
yet another amazing skill, namely pattern matching: Basically, when leaving
its nest, Cataglyphis takes (and stores) a visual snapshot (using its eyes) of the
scenery around the nest. Then, as the ant approaches the nest (as determined
by the vector obtained from path integration), the ant will match its current
view to the stored snapshot, and thus find the nest.
It is interesting to note the similarities between robot navigation (which
will be described in detail in a later chapter) and Cataglyphis navigation: In
both cases, the agent (robot or ant) combines path integration with an indepen-
dent calibration method based on landmark recognition in order to maintain
accurate estimates of its pose.
In fact, the description of Cataglyphis navigation above is greatly simplified.
For example, the interplay between path integration and landmark detection
is quite a complex case of decision-making. Furthermore, recent research4 has
shown that, in the vicinity of the nest, Cataglyphis uses not only visual but
also olfactory landmarks (that is, landmarks based on odour). This illustrates
another important principle, namely that of redundancy. If one sensory modal-
ity, or a procedure (such as pattern matching) derived from it, fails, the agent
(robot or ant) should be able to use a different sensory modality to achieve its
objective.
4
See Steck, K. et al. Smells like home: Desert ants, Cataglyphis fortis, use olfactory landmarks to
pinpoint the nest, Frontiers in Zoology 6, 2009.
The quest to generate intelligent machines has now (2011) been underway for
about a half century. While much progress has been made during this period
of time, the intelligence of most mobile robots in use today reaches, at best,
the level of insects. Indeed, during the last twenty years, many of the efforts
in robotics research have been inspired by rather simple biological organisms,
with the aim of understanding and implementing basic, survival-related be-
haviors in robots, before proceeding with more advanced behaviors involv-
ing, for example, high-level reasoning. These efforts have been made mainly
within the paradigm of behavior-based robotics (BBR), an approach to ma-
chine intelligence which differs quite significantly from traditional artificial
intelligence (AI). However, researchers have not (yet) succeeded in generat-
ing truly intelligent machines using either BBR, AI or a combination thereof.
This chapter begins with a brief discussion of the paradigms introduced
above. Next, a more detailed introduction to BBR will be given. Finally, the
topic of generating basic robotic (motor) behaviors will be considered.
56 CHAPTER 5. APPROACHES TO MACHINE INTELLIGENCE
Figure 5.1: A comparison of the information flow in classical AI (left panel) and in BBR
(right panel). For BBR, any number of behaviors may be involved, and the figure only shows
an example involving four behaviors.
even if the brain of the moth were capable of such a feat (it is not), it would
most likely find itself eaten before deciding what to do. Instead, moths use a
much simpler procedure: Their evasive behavior is in fact based on only a few
neurons and an ingenious positioning of the ears on the body. This simple sys-
tem enables the moth to fly away from an approaching bat and, if it is unable
to shake off the pursuer, start to fly erratically (and finally drop toward
the ground) to confuse the predator.
As one may infer from the left panel of Fig. 5.1, classical AI is strongly fo-
cused on high-level reasoning, i.e. an advanced cognitive procedure displayed
in humans and, perhaps, some other mammals. Attempting to emulate such
complex biological systems has proven too difficult as a starting point
for research in robotics: Classical AI has had great success in many of the sub-
fields it has spawned (e.g. pattern recognition, path planning etc.), but has
made little progress toward the goal of generating truly intelligent machines,
capable of autonomous operation.
First law: A robot may not injure a human being, or, through inaction,
allow a human being to come to harm.
Second law: A robot must obey orders given it by human beings, except
where such orders would conflict with the first law.
Third law: A robot must protect its own existence as long as such protec-
tion does not conflict with the first or second law.
While Asimov’s laws certainly can serve as an inspiration for researchers work-
ing on autonomous robots, a full implementation of those laws would be a
daunting task, requiring reasoning and deliberation by the robot on a level
well beyond the reach of the current state of the art. However, in a basic sense,
BBR and GPRBS clearly deal with behaviors related to Asimov’s laws.
can carry out some action (or a sequence of actions), for example setting its
motor signals (left and right, for a differentially steered robot) to particular
values.
In ARSim, the program flow essentially follows the diagram shown in
Fig. 3.1. Thus, in each time step of the simulation, (1) the robot probes the
state of the environment using its sensors. With the updated sensor readings,
the robot then (2) selects an action and (3) generates motor signals (one for each
motor), which are then sent to the motors. Next, (4) the position and velocity
are updated, as well as (5) the arena (if needed). Finally, (6) the termination
criteria (for example, collisions) are considered.
Note that, as mentioned in Chapter 3, for such a simulation to be realistic
(i.e. implementable in a real robot), the time required, in a corresponding real
robot, for the execution of steps (1) - (3) must be shorter than the simulation
time step. By default, ARSim uses an updating frequency of 100 Hz (i.e. a time
step of 0.01 s), which is attainable by the simple IR sensors used in the default
setup. Furthermore, in the simple behaviors considered here, the deliberations
in step (2) usually amount to checking a few if-then-else clauses, a procedure
that a modern processor completes within a few microseconds.
The writing of basic behaviors for autonomous robots will be exemplified
(1) in the form of a wandering behavior, which allows a robot to explore its
surroundings, provided that no obstacles are present, and (2) using a simple
navigation behavior which makes the robot move a given distance (in a straight
line), using odometry. Other, more complex behaviors, will be considered in
the home problems.
5.3.2 Wandering
The task of robot navigation, in a general sense, is a very complex one, since
it normally requires that the robot should know its position at all times which,
in turn, requires accurate positioning (or localization), a procedure which will
be considered briefly in the next example. Simpler aspects of the motion of a
robot can be considered without the need to introduce localization. For exam-
ple wandering is an important behavior in, for instance, an exploration robot
or a guard robot that is required to cover an area as efficiently as possible.
In order to implement a specific behavior in ARSim, one must modify the
CreateBrain and BrainStep functions. For the wandering behavior, they
take the shape shown in code listings 5.1 and 5.2.
As can be seen in code listing 5.1, the CreateBrain function defines all the
relevant variables and parameters of the robotic brain. The parameter values
remain constant during the simulation, whereas the variables, of course, do
not. Even though the variables are given values in CreateBrain, those val-
ues are typically modified in the first state (initialization) of the behavior; see
Code listing 5.1: The CreateBrain function for the wandering example.
function b = CreateBrain;

%% Variables:
leftMotorSignal = 0;
rightMotorSignal = 0;
currentState = 0;

%% Parameters:
forwardMotorSignal = 0.5;
turnMotorSignal = 0.7;
turnProbability = 0.01;
stopTurnProbability = 0.03;
leftTurnProbability = 0.50;

b = struct('LeftMotorSignal',leftMotorSignal,...
           'RightMotorSignal',rightMotorSignal,...
           'CurrentState',currentState,...
           'ForwardMotorSignal',forwardMotorSignal,...
           'TurnMotorSignal',turnMotorSignal,...
           'TurnProbability',turnProbability,...
           'StopTurnProbability',stopTurnProbability,...
           'LeftTurnProbability',leftTurnProbability);
code listing 5.2. Note the capitalization used when defining a Matlab struct.
Code listing 5.2: The BrainStep function for the wandering example.
function b = BrainStep(robot, time);

b = robot.Brain;
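Given the parameters defined in listing 5.1, the logic of the wandering behavior can be sketched as follows. This is a Python sketch of one plausible realization, with illustrative state names; it is not the ARSim implementation itself.

```python
import random

def wandering_step(state, p, rng=random):
    """One step of a wandering finite-state machine. In the forward
    state, a turn starts with probability turnProbability (to the left
    with probability leftTurnProbability); in a turning state, forward
    motion resumes with probability stopTurnProbability. p is a dict of
    the parameters from CreateBrain. Returns the new state and the
    (left, right) motor signals."""
    if state == "forward":
        if rng.random() < p["turnProbability"]:
            state = ("turnLeft" if rng.random() < p["leftTurnProbability"]
                     else "turnRight")
    elif rng.random() < p["stopTurnProbability"]:
        state = "forward"

    if state == "forward":
        signals = (p["forwardMotorSignal"], p["forwardMotorSignal"])
    elif state == "turnLeft":
        signals = (-p["turnMotorSignal"], p["turnMotorSignal"])
    else:  # turnRight
        signals = (p["turnMotorSignal"], -p["turnMotorSignal"])
    return state, signals
```

Because the turn probabilities are small (0.01 and 0.03 per 0.01 s time step), the robot moves in fairly long straight stretches interrupted by occasional turns of random duration.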
Code listing 5.3: The CreateBrain function for the navigation example.
function b = CreateBrain;

%% Variables:
leftMotorSignal = 0;
rightMotorSignal = 0;
currentState = 0;
initialPositionX = 0; % Arbitrary value here - set in state 0.
initialPositionY = 0; % Arbitrary value here - set in state 0.

%% Parameters:
desiredMovementDistance = 1;
motorSignalConstant = 0.90;
atDestinationThreshold = 0.02;

b = struct('LeftMotorSignal',leftMotorSignal,...
           'RightMotorSignal',rightMotorSignal,...
           'CurrentState',currentState,...
           'InitialPositionX',initialPositionX,...
           'InitialPositionY',initialPositionY,...
           'DesiredMovementDistance',desiredMovementDistance,...
           'MotorSignalConstant',motorSignalConstant,...
           'AtDestinationThreshold',atDestinationThreshold);
5.4 Navigation
Purposeful navigation normally requires estimates of position and heading. In
ARSim, the robot can be equipped with wheel encoders, from which odomet-
ric estimates of position and heading can be obtained. Note that the odometer
is calibrated upon initialization, i.e. the estimate is set equal to the true pose.
However, when the robot is moving, the odometric estimate will soon devi-
ate from the true pose. In ARSim, the odometric estimate (referred to as the
odometric ghost) can be seen by setting the variable ShowOdometricGhost to
true.
A simple example of navigation, in which a robot is required to move 1
m in its initial direction of heading, is given in code listings 5.3 and 5.4. As
in the previous example, the variables and parameters are introduced in the
CreateBrain function. The BrainStep function is again represented as an
FSM. In State 0, the initial position and heading are stored and the FSM then
jumps to State 1, in which the motor signal s (range [−1, 1]) is set as
D−d
s=a , (5.1)
D
© 2012, 2016, Mattias Wahde, [email protected]
Code listing 5.4: The BrainStep function for the navigation example.
function b = BrainStep(robot, time);

b = robot.Brain;

if (b.CurrentState ~= 0)
  deltaX = robot.Odometer.EstimatedPosition(1) - b.InitialPositionX;
  deltaY = robot.Odometer.EstimatedPosition(2) - b.InitialPositionY;
  distanceTravelled = sqrt(deltaX*deltaX + deltaY*deltaY);
end
The motor signal s is then applied to both wheels of the differentially steered
robot. As can be seen, the motor signals will gradually drop from a to (almost)
zero. However, when |D − d| drops below bD, where b ≪ 1 is a constant, the
FSM jumps to State 2, in which the robot stops.
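The proportional control of Eq. (5.1), including the stopping criterion, can be sketched as follows (Python; returning None to signal the jump to the stopped state is an illustrative choice, and the default parameter values mirror motorSignalConstant and atDestinationThreshold in listing 5.3).

```python
def navigation_motor_signal(d, D, a=0.90, b=0.02):
    """Motor signal s = a * (D - d) / D of Eq. (5.1), applied to both
    wheels of the differentially steered robot. Returns None when
    |D - d| < b * D, i.e. when the FSM should jump to the stopped
    state."""
    if abs(D - d) < b * D:
        return None               # at destination: stop
    return a * (D - d) / D
```

The signal starts at a when d = 0 and falls toward zero as d approaches D, so the robot decelerates smoothly instead of overshooting the target distance.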
In the previous chapter, the concept of robotic behaviors was introduced and
exemplified by means of some basic motor behaviors. Albeit very simple, such
behaviors can be tailored to solve a variety of tasks such as, for example, wan-
dering, wall following and various forms of obstacle avoidance. However,
there are also clear limitations. In this chapter, some more advanced motor
behaviors will be studied. First, behaviors for exploration and navigation will
be considered. Both of these two types of behavior require accurate pose esti-
mates for the robot. It is assumed that the robot is equipped with a (cognitive)
Odometry brain process, providing continuous pose (and velocity) estimates.
As mentioned earlier, such estimates are subject to odometric drift, and there-
fore an independent method for localization (i.e. odometric recalibration) is
always required in realistic applications. Such a method will be studied in
the final section of this chapter. However, exploration and navigation are im-
portant problems in their own right and, in order to first concentrate on those
problems, it will thus (unrealistically) be assumed, in the first two sections
of the chapter, that the robot obtains perfect, noise-free pose estimates using
odometry only.
6.1 Exploration
Purposeful navigation requires some form of map of the robot’s environment.
In many cases, however, no map is available a priori. Instead, it is the robot’s
task to acquire the map, in a process known as simultaneous localization and
mapping (SLAM). In (autonomous) SLAM, a robot is released in an unknown
arena, and it is then supposed to move in such a way that, during its motion,
its long-range sensors (typically an LRF) cover every part of the arena, so that
68 CHAPTER 6. EXPLORATION, NAVIGATION, AND LOCALIZATION
the sensor readings can be used for generating a map. This is a rather diffi-
cult task since, during exploration and mapping, the robot must keep track
of its position using, for odometric recalibration, the (incomplete, but grow-
ing) map that it is currently generating. SLAM is an active research topic, for
which many different methods have been suggested. A currently popular ap-
proach is probabilistic robotics, in which the robot maintains a probability
density function from which its position is inferred. However, SLAM is be-
yond the scope of this text. Instead, the simpler, but still challenging, topic of
exploration given perfect positioning (as mentioned in the introduction to this
chapter) will be considered.
Exploration can be carried out for different reasons. In some applications,
such as lawn mowing, vacuum cleaning, clearing mine fields etc., the robot
must physically cover as much as possible of the floor or ground in its envi-
ronment. Thus, the robot must carry out area coverage. In some applications,
e.g. vacuum cleaning, it is often sufficient that the robot carries out a more or
less aimless wandering that, eventually, will make it cover the entire floor. In
other applications, such as mapping, it is unnecessary for the robot to physi-
cally visit every spot in the arena. Instead, what matters is that its long-range
sensor, typically an LRF (or a camera), is able to sense every place in the arena
at some point during the robot’s motion. The problem of exploring an arena
such that the long-range sensor(s) reach all points in the arena will here be
referred to as sensory area coverage.
Exploring an arena, without any prior knowledge regarding its structure,
is far from trivial. However, a motor behavior (in the GPRBS framework) for
sensory area coverage has recently (2009) been implemented1 . This Exploration
behavior has been used both in the simulator GPRSim and in a real robot (as
a part of SLAM). In both cases, the robot is assumed to be equipped with an
LRF. The algorithm operates as follows: A node is placed at the current (esti-
mated) position of the robot. Next, the robot generates a set of nodes at a given
distance (D) from its current position (estimated using the Odometry process).
Before any nodes are placed, the robot uses the LRF (with an opening (sweep)
angle α) to find feasible angular intervals for node placement, i.e. angular in-
tervals in which the distance to the nearest obstacle exceeds D + ∆, where ∆
is a parameter measuring the margin between a node and the nearest obsta-
cle behind the node. The exact details of the node placement procedure will
not be given here. Suffice it to say that, in order to be feasible, an angular in-
terval must have a width γ exceeding a lower limit γmin in order for a node
to be placed (at the center of the angular interval). Furthermore, if the width
of a feasible angular interval is sufficiently large, more than one node may be
placed in the interval. An illustration of feasible angular intervals and node
placement is given in Fig. 6.1.
1 See Wahde, M. and Sandberg, D. An algorithm for sensory area coverage by mobile robots operating in complex arenas, Proc. of AMiRE 2009, pp. 179-186, 2009.
Figure 6.1: An illustration of the node placement method in the Exploration behavior. The
left panel shows the distances obtained over the 180 degree opening angle of the LRF (note that
individual rays are not shown). The inner semi-circle has a radius of D (the node placement
distance) whereas the radius of the outer semi-circle is D + ∆. The right panel shows the re-
sulting distribution of nodes. Note that one of the two feasible angular intervals is sufficiently
wide to allow two nodes to be placed.
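The test for feasible angular intervals can be sketched as follows (a simplified Python version; since the exact node placement procedure is not described in the text, the even ray spacing and the placement of a single node per interval are assumptions):

```python
import math

def feasible_intervals(lrf_distances, sweep_angle, D, delta, gamma_min):
    """Find angular intervals in which exploration nodes may be placed.

    lrf_distances: LRF readings, assumed evenly spaced over sweep_angle
    (radians). An interval is feasible if every ray in it sees farther
    than D + delta and the interval width is at least gamma_min.
    Returns a list of (start_angle, end_angle) pairs.
    """
    n = len(lrf_distances)
    step = sweep_angle / (n - 1)
    intervals, start = [], None
    for i, dist in enumerate(lrf_distances):
        if dist > D + delta:
            if start is None:
                start = i * step          # interval opens at this ray
        else:
            if start is not None and (i - 1) * step - start >= gamma_min:
                intervals.append((start, (i - 1) * step))
            start = None
    if start is not None and (n - 1) * step - start >= gamma_min:
        intervals.append((start, (n - 1) * step))
    return intervals

def place_nodes(intervals):
    """Place one node at the center of each feasible interval (simplest case)."""
    return [(a + b) / 2 for (a, b) in intervals]
```

In the actual behavior, a sufficiently wide interval may receive more than one node, as in the right panel of Fig. 6.1; the single-node placement above is the minimal case.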
Figure 6.2: The early stages of a run using the exploration algorithm, showing a robot ex-
ploring a single rectangular room without a door. The arena contains a single, low obstacle,
which cannot be detected using the LRF (since it is mounted above the highest point of the
obstacle). In each panel, the target node is shown with two thick circles, pending nodes are
shown as a single thick circle, completed nodes as a filled disc, and unreachable nodes as a filled
square. Upper left panel: The robot, whose size is indicated by a thin open circle, starts at
node 1, generating three new pending nodes (2, 3, and 4). Upper right panel: Having reached
node 4, the robot sets the status of that node to completed, and then generates new pending
nodes. Lower left panel: Here, the robot has concluded (based on IR proximity readings) that
node 6 is unreachable, and it therefore selects the nearest pending node (5, in this case) based
on path distance, as the new target node. Lower right panel: Having reached node 5, due to
the minimum distance requirement (between nodes) the robot can only generate one new node
(7). It rotates to face that node, and then moves towards it etc.
The robot maintains, for each pending node, a path (a sequence of nodes) leading from the active node to that pending node. For example, if the robot is located at node 1 and generates three pending nodes (2, 3, and 4), the paths will be (1, 2), (1, 3) and (1, 4). The robot next
selects the nearest node, based on the path length, as the target node. In many
cases (e.g. when more than one node can be placed), several nodes are at the
same (estimated) distance from the robot. In such cases, the robot (arbitrarily)
selects one of those nodes as the target node. For the paths just described, the
path length equals the Cartesian distance between the nodes. If a path contains more than two elements, however, the path length will differ from the
Cartesian distance, unless all the nodes in the path lie along a straight line. The
path length is more relevant since, when executing the exploration algorithm
described here, the robot will generally follow the path, even though direct
movement between the active node and a target node is also possible under
certain circumstances; see below.
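The selection of the target node based on path length can be sketched as follows (illustrative Python; the representation of paths as tuples of node ids follows the notation above, while the data structures themselves are assumptions):

```python
import math

def path_length(path, positions):
    """Length of a path given as a sequence of node ids, e.g. (4, 1, 2).

    positions: dict mapping node id -> (x, y) coordinates.
    """
    return sum(math.dist(positions[a], positions[b])
               for a, b in zip(path, path[1:]))

def select_target(paths, positions):
    """Pick the pending node whose path from the active node is shortest.

    paths: dict mapping pending node id -> path (tuple of node ids).
    Ties are broken arbitrarily (here: by dictionary order, via min).
    """
    return min(paths, key=lambda node: path_length(paths[node], positions))
```

For a two-element path such as (1, 2), the path length reduces to the Cartesian distance between the two nodes, as noted above.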
Next, the robot rotates to face the target node, and then proceeds towards
it; see the upper left panel of Fig. 6.2. During the motion, one of two things
can happen: Either (i) the robot reaches the target node or, (ii) using the output
from a Proximity detection brain process (assumed available), it concludes that
the target node cannot be reached along the current path. Note that, in order
for the Proximity detection brain process to be useful, the sensors it uses should
be mounted at a different (smaller) height compared to the LRF.
In case (i), illustrated in the upper right panel of Fig. 6.2, once the target
node has been reached, it is given the status completed and is then set as the
new active node. At this point, the paths to the remaining pending nodes are
updated. Continuing with the example above, if the robot moves to node 4,
the paths to the other pending nodes (2 and 3) will be updated to (4, 1, 2) and
(4, 1, 3). Furthermore, having reached node 4, the robot generates new pend-
ing nodes. Note that the robot need not be located exactly on node 4; instead,
a node is considered to be reached when the robot passes within a distance a
from it. The new nodes are added as described above. The minimum distance
requirement between added nodes (see above) is also enforced. Proceeding
with the example, the robot might, at this stage, add nodes 5 and 6, with the
paths (4, 5) and (4, 6). Again, the robot selects the nearest node based on the
path length, rotates to face that node, and then starts moving towards it etc.
Note that the robot can (and will) visit completed nodes more than once. How-
ever, by the construction described above, only pending nodes can be target
nodes.
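The path update performed when the robot reaches a new active node can be sketched as follows (a minimal Python sketch; the dictionary representation is an assumption):

```python
def update_paths(paths, new_active):
    """Prepend the new active node to each pending node's path.

    E.g. moving from active node 1 to node 4 turns the path (1, 2)
    into (4, 1, 2), as in the example in the text.
    paths: dict mapping pending node id -> path (tuple of node ids).
    """
    return {node: (new_active,) + path for node, path in paths.items()}
```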
In case (ii), i.e. when the target node cannot be reached, the node is assigned
the status unreachable, and the robot instead selects another target node and
proceeds towards it, along the appropriate path. This situation is illustrated in
the lower left panel of Fig. 6.2: Here, using its Proximity detection brain process,
the robot concludes that it cannot reach node 6. It therefore marks this node
unreachable, sets it as the active node, and then sets the nearest pending node
as the new target, in this case node 5. One may wonder why case (ii) can oc-
cur, since the robot uses the LRF before assigning new nodes. The reason, of
course, is that the LRF (which is assumed to be two-dimensional) only scans
the arena at a given height, thus effectively only considering a horizontal slice
of the arena. A low obstacle may therefore be missed, until the robot comes
sufficiently close to it, so that the Proximity detection brain process can detect it.
Figure 6.3: An illustration of a problem that might occur during exploration. Moving in one
particular direction (left panel) the robot is able to place and follow the nodes shown. However,
upon returning (right panel), the robot may conclude that it will be unable to pass the node
near the corner, due to the proximity detection triggered as the robot approaches the node, with
the wall right in front of it.
Note that unreachable nodes are exempt from the minimum distance require-
ment. This is so, since a given node may be unreachable from one direction but
perhaps reachable from some other direction. Thus, the exploration algorithm
is allowed to place new pending nodes arbitrarily close to unreachable nodes.
One should note that robust exploration of any arena is more difficult than
it might seem. An example of a problem that might occur is shown in Fig. 6.3.
Here, the robot passes quite near a corner on its outbound journey (left panel),
but no proximity detection is triggered. By contrast, upon returning (right
panel) a proximity detection is triggered which, in turn, may force the robot to
abandon its current path. In fact, the Exploration behavior contains a method
(which will not be described here) for avoiding such deadlocks. In the (very
rare) cases in which even the deadlock avoidance method fails, the robot sim-
ply stops, and reports its failure.
Because of the path following strategy described above, the robot may
sometimes take an unnecessarily long path from the active node to the tar-
get node. However, this does not happen so often since, in most cases, the
robot will proceed directly to a newly added node, for which the path length
is the same as the Cartesian distance. However, when the robot cannot place
any more nodes (something that occurs, for example, when it reaches a cor-
ner), a distant node may become the target node. Therefore, in cases where the
path to the target node differs from the direct path, the robot rotates to face the
target node (provided that it is located within a distance L, where L should be
smaller than or equal to the range R of the LRF). Next, if the robot concludes
(based on its LRF readings) that it can reach the target node, it then proceeds
directly towards it, rather than following the path. However, also in this case,
Figure 6.4: Left panel: The robot in its initial position in an unexplored arena. Right panel:
The final result obtained after executing the Exploration behavior. The completed (visited)
nodes are shown as green dots, whereas the (single) unreachable node is shown as a red dot.
The final target node (the last node visited) is shown as a blue dot. In this case, the robot
achieved better than 99.5% sensory area coverage of the arena.
it is possible that a (low) obstacle prevents the robot from reaching its target,
in which case the robot instead switches to following the path as described
above.
The robot continues this cycle of node placement and movement between
nodes, until all nodes have been processed (i.e. either having been visited or
deemed unreachable), at which point the exploration of the arena is complete.
The Exploration behavior consists of an FSM with 17 states, which will not be described in detail here.
Figure 6.5: Three examples of grids that can be used in connection with grid-based naviga-
tion methods. In the left panel, the nodes are not equidistant, unlike the middle panel which
shows a regular lattice with equidistant nodes. The regular lattice can also be represented as
grid cells, as shown in the right panel. Note that the right and middle panels show equivalent
grids.
6.2 Navigation
In this section, it will again be assumed that the robot has access to accurate
estimates of its pose (from the Odometry brain process), and the question that
will be considered is: Given that the robot knows its pose and velocity, how
can it navigate between two arbitrary points in an arena? In the robotics liter-
ature, many methods for navigation have been presented, three of which will
be studied in detail in this section.
Figure 6.6: Left panel: An example of automatic grid generation. The walls of the arena are
shown as green thin lines. The black regions represent the forbidden parts of the arena, either
unreachable locations or positions near walls and other obstacles. The grid cell boundaries
are shown as thick yellow lines. Right panel: An example of a path between two points in
the arena. The basic path (connecting grid cells) was generated using Dijkstra’s algorithm
(see below). The final path, shown in the figure, was adjusted to include changes of directions
within grid cells, thus minimizing the length of the path. Note that all cells are convex, so that
the path segments within a cell can safely be generated as straight lines between consecutive
waypoints.
In grid-based navigation methods, each edge connecting two nodes is associated with a cost, typically taken as the Euclidean distance between the nodes. Thus, for example, in the grids
shown in the middle and right panels of Fig. 6.5, the cost of moving between
adjacent nodes would be equal to 1 (length unit), whereas, in the grid shown
in the left panel the cost would vary depending on which nodes are involved.
An interesting issue is the generation of a navigation grid, given a two-
dimensional map of an arena. This problem is far from trivial, especially in
complex arenas with many walls and other objects. Furthermore, the grid gen-
eration should take the robot’s size (with an additional margin) into account,
in order to avoid situations where the robot must pass very close to a wall or
some other object. The grid-based navigation methods described below gener-
ate paths between grid cells. On a regular grid with small, quadratic cells (as in
the examples below) it is sometimes sufficient to let the robot move on straight
lines between the cell centers. However, the generated path may then become
somewhat ragged. Furthermore, in more complex grids, where the cells are of
different size, following a straight line between cell centers may result in an
unnecessarily long path. Thus, in such cases, the robot must normally modify
its heading within a cell, in order to find the shortest path.
When generating a grid, one normally requires the grid cells to be convex,
so that all points on a straight line between any two points in the cell are also
part of the cell. One way of doing so is to generate a grid consisting of triangular cells, which will all be convex. However, such grids may not be optimal:
The pointiness of the grid cells may force the robot to make many unnecessary
(and sharp) turns. An algorithm for constructing, from a map, a general grid
consisting of convex cells with four or more sides (i.e. non-triangular) exists
as well3. Fig. 6.6 shows an example of a grid generated with this algorithm.
Because of its complexity, the algorithm will not be considered in detail here.
Instead, in the examples below, we shall consider grids consisting of small
quadratic cells, and we will neglect changes of direction within grid cells.

The first grid-based navigation method considered here, best-first search (BFS), proceeds as follows:

1. Place the robot at the start node, which then becomes the current node.
Assign the status unvisited to all nodes.

2. Go through each of the cells ai that are (i) unvisited and (ii) directly reachable (via an edge) from the current node c. Such nodes are referred to as
neighbors of the current node. Compute the cost of going from ai to the
target node t, using the heuristic f (ai ).

3. Select the node amin associated with the lowest cost, based on the cost
values computed in Step 2.

4. Set the status of the current node c as visited, and move to amin , which then
becomes the current node.

5. Return to Step 2.
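The greedy search described in the numbered steps above can be sketched as follows (a minimal Python version on a grid of unit cells, with the Euclidean distance to the target as the heuristic f; all names are illustrative):

```python
import math

def greedy_best_first(start, target, blocked, width, height):
    """Greedy best-first search on a 4-connected grid of unit cells.

    At each step, move to the unvisited neighbor whose straight-line
    (Euclidean) distance to the target -- the heuristic f -- is smallest.
    Returns the sequence of visited cells, or None if the search gets stuck.
    """
    def neighbors(cell):
        x, y = cell
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in blocked:
                yield (nx, ny)

    current, visited, path = start, {start}, [start]
    while current != target:
        candidates = [c for c in neighbors(current) if c not in visited]
        if not candidates:
            return None   # greedy search can get stuck (cf. the detour in Fig. 6.8)
        current = min(candidates, key=lambda c: math.dist(c, target))
        visited.add(current)
        path.append(current)
    return path
```

With no obstacles, the robot heads straight for the target; with an extended obstacle in between, the greedy choice can produce the long detour discussed below.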
Figure 6.8: Two examples of paths generated using the BFS algorithm. The cells (nodes)
that were checked during path generation are shown in light green, whereas the actual path is
shown in dark green and with a solid line. The yellow cell is the start node and the white cell
is the target node.
The BFS method is very efficient in the absence of obstacles
or when the obstacles are few, small, and far apart. An example of such a
path generated with BFS is shown in the left panel of Fig. 6.8. As can be seen,
the robot quickly moves from the start node to the target node. However,
if there are extended obstacles between the robot’s current position and the
target node, the BFS algorithm will not find the shortest path, as shown in the
right panel of Fig. 6.8. Because of its greedy approach to the target, the robot
will find itself in front of the obstacle, and must then make a rather long detour
to arrive at the target node.
Dijkstra’s algorithm
Like BFS, Dijkstra’s algorithm also relies on a grid in which the edges are as-
sociated with non-negative costs. Here, the cost will simply be taken as the
Euclidean distance between nodes. Instead of focusing on the (estimated) cost
of going from a given node to the target node, Dijkstra’s algorithm considers
the distance between the start node and the node under consideration, as de-
scribed in Fig. 6.9. In Step 2, the distance from the start node s to any node ai is
computed using the (known) distance from the initial node to the current node
c and simply adding the distance between c and ai . This algorithm will check a
large number of nodes, in an expanding pattern from the start node, as shown
in Fig. 6.10. In order to determine the actual path to follow, whenever a new
node a is checked, a note is made regarding the predecessor node p, i.e. the
1. Place the robot at the start node s, which then becomes the current node.
Assign the distance value 0 to the start node, and infinity to all other
nodes (in practice, use a very large, finite value). Set the status of all
nodes to unvisited.

2. Go through each unvisited neighbor ai of the current node c. Compute the
distance from the start node s to ai by adding the distance between c and
ai to the (known) distance from the start node to c. If this distance is
smaller than the distance value currently assigned to ai , update the value,
and record c as the predecessor node of ai .

3. After checking all the neighbors of the current node, set its status to visited.

4. Select the node (among all the unvisited, accessible nodes in the grid)
with the smallest distance from the start node, and set it as the new current node.

5. If the new current node is the target node t, go to Step 6; otherwise,
return to Step 2.

6. When the target has been reached, use the predecessor nodes to trace a
path from the target node to the start node. Finally, reverse the order of
the nodes to find the path from the start node to the target node.
node that was the current node when checking node a. When the target has
been found, the path connecting it to the initial node can be obtained by going
through the predecessor nodes backwards, from the target node to the initial
node.
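These steps can be sketched in Python as follows (a minimal version on a weighted graph; a priority queue replaces the linear scan over unvisited nodes, and all names are illustrative):

```python
import heapq

def dijkstra(edges, start, target):
    """Dijkstra's algorithm with predecessor bookkeeping, as described above.

    edges: dict mapping node -> list of (neighbor, cost), costs non-negative.
    Returns (shortest path from start to target, its total cost).
    """
    dist = {start: 0.0}
    predecessor = {}
    queue = [(0.0, start)]
    visited = set()
    while queue:
        d, current = heapq.heappop(queue)
        if current in visited:
            continue          # stale queue entry; node already finalized
        visited.add(current)
        if current == target:
            break
        for neighbor, cost in edges.get(current, []):
            new_dist = d + cost
            if new_dist < dist.get(neighbor, float('inf')):
                dist[neighbor] = new_dist
                predecessor[neighbor] = current   # note the predecessor node
                heapq.heappush(queue, (new_dist, neighbor))
    # Trace the path backwards through the predecessors, then reverse it.
    path, node = [target], target
    while node != start:
        node = predecessor[node]
        path.append(node)
    return path[::-1], dist[target]
```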
Unlike the BFS algorithm, Dijkstra’s algorithm is guaranteed to find the
shortest path4 from the start node to the target node. However, a drawback
with Dijkstra’s algorithm is that it typically searches many nodes that, in the
end, turn out to be quite irrelevant. Looking at the search patterns in Figs. 6.8
and 6.10, one may hypothesize that a combination of the two algorithms would
be useful. Indeed, there is an algorithm, known as A* that combines the BFS
and Dijkstra algorithms. Like Dijkstra’s algorithm, A* is guaranteed to find the
shortest path. Moreover, it does so more efficiently than Dijkstra’s algorithm.
However, A* is beyond the scope of this text.
4 There may be more than one such path: Dijkstra’s algorithm will select one of them.
Figure 6.10: Two examples of paths generated using Dijkstra’s algorithm. The cells (nodes)
that were checked during path generation are shown in light green, whereas the actual path is
shown in dark green and with a solid line. The yellow cell is the start node and the white cell
is the target node.
Potential fields
As shown in Fig. 6.11, a potential field can be interpreted as a landscape with
hills and valleys, and the motion of a robot can be compared to that of a ball
rolling through this landscape. The navigation target is assigned a potential
corresponding to a gentle downhill slope, whereas obstacles should generate
potentials corresponding to steep hills.
In principle, a variety of different equations could be used for defining dif-
ferent kinds of potentials. An example, namely a potential with ellipsoidal
equipotential surfaces, and exponential variation with (ellipsoidal) distance
from the center of the potential, takes the mathematical form
\phi(x, y; x_p, y_p, \alpha, \beta, \gamma) = \alpha \, e^{-\left(\frac{x - x_p}{\beta}\right)^2 - \left(\frac{y - y_p}{\gamma}\right)^2}, \qquad (6.1)
where (x, y) is the current (estimated) position at which the potential is calcu-
lated, (xp , yp ) is the position of the object generating the potential, and α, β and
γ are constants (not to be confused with the constants defined in connection
Obstacle
Goal
Figure 6.11: A potential field containing a single obstacle and a navigation goal.
with the equations of motion in Chapter 2 and the sensor equations in Chapter
3). Now, looking at the mathematical form of the potentials, one can see that
an attractive potential (a valley) is formed if α is negative, whereas a positive
value of α will generate a repulsive potential (a hill).
Normally, the complete potential field contains many potentials of the form
given in Eq. (6.1), so that the total potential becomes
\Phi(x, y) = \sum_{i=1}^{k} \phi_i(x, y; x_{p,i}, y_{p,i}, \alpha_i, \beta_i, \gamma_i), \qquad (6.2)
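Eqs. (6.1) and (6.2), together with the gradient used for steering, can be sketched as follows (Python; the numerical differentiation step h is an assumption, as an analytical gradient could of course also be derived):

```python
import math

def phi(x, y, xp, yp, alpha, beta, gamma):
    """A single potential, Eq. (6.1)."""
    return alpha * math.exp(-((x - xp) / beta)**2 - ((y - yp) / gamma)**2)

def total_potential(x, y, potentials):
    """The total field, Eq. (6.2); potentials is a list of
    (xp, yp, alpha, beta, gamma) tuples, one per object."""
    return sum(phi(x, y, *p) for p in potentials)

def gradient(x, y, potentials, h=1e-6):
    """Numerical gradient of the field; the robot moves along -gradient."""
    gx = (total_potential(x + h, y, potentials)
          - total_potential(x - h, y, potentials)) / (2 * h)
    gy = (total_potential(x, y + h, potentials)
          - total_potential(x, y - h, potentials)) / (2 * h)
    return gx, gy
```

A navigation target is modeled with α < 0 (a valley) and an obstacle with α > 0 (a hill); the robot's desired direction of motion at (x, y) is the negative of the returned gradient.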
Figure 6.12: An illustration of potential field navigation in GPRSim. Upper left panel: A
simple arena, with a robot following a potential field toward a target position in the upper left
corner of the arena. Upper right panel: The corresponding potential field, generated by a total
of nine potentials (one for the target, one for each of the walls, and one for each pillar). Lower
left panel: A contour plot of the potential field, in which the target position can be seen in the
upper left corner. Lower right panel: The trajectory followed by the robot. Note that, in this
simulation, the odometric readings were (unrealistically) noise-free.
Figure 6.13: The locking phenomenon. Following the gradient of the potential field the
robot, whose trajectory is shown as a sequence of black dots, moves along the x-axis toward the
goal, located at (2, 0). However, because of the local minimum in the potential field, the robot
eventually gets stuck.
In potential field navigation, the goal is to reach the local minimum represented by the
navigation target. However, depending on the shape of the arena (and there-
fore the potential field), there may also appear one or several unwanted local
minima in the field, at which the robot may become trapped.
This is called the locking phenomenon and it is illustrated in Fig. 6.13.
Here, a robot encounters a wedge-shaped obstacle represented by three poten-
tials. At a large distance from the obstacle, the robot will be directed toward
the goal potential, which is located behind the obstacle as seen from the start-
ing position of the robot. However, as the robot approaches the obstacles their
repulsive potentials will begin to be noticeable. Attracted by the goal, the robot
will thus eventually find itself stuck inside the wedge, at a local minimum of
the potential.
In order to avoid locking phenomena, the path between the robot and the
goal can be supplied with waypoints, represented by attractive potentials (for
example, of the form given in Eq. (6.1)) with rather small extension. Of course,
the introduction of waypoints leads to the problem of determining where to
put them. An analysis of such methods will not be given here8 . Suffice it to
say that the problem of waypoint placement can be solved in various ways to
aid the robot in its navigation. A waypoint should be removed once the robot
has passed within a distance L from it, to avoid situations in which the robot
finds itself stuck at a waypoint.
The potential field method also has several advantages, one of them being
that the direction of motion is obtained simply by computing the gradient of
the potential field at the current position, without the need to generate an en-
tire path from the current position to the navigation target. Furthermore, the
potential field is defined for all points in the arena. Thus, if the robot tem-
porarily must suspend its navigation (for example, in order to avoid a moving
obstacle), it can easily resume the navigation from wherever it happens to be
located when the Obstacle avoidance behavior is deactivated.
In the discussion above, only stationary obstacles were considered. Of
course, moving obstacles can be included as well. In fact, the potential field
method is commonly used in conjunction with, say, a grid-based navigation
method, such that the latter generates the nominal path of the robot, whereas
the potential field method is used for adjusting the path to avoid moving ob-
stacles. However, methods for reliably detecting moving obstacles are beyond
the scope of this text.
The desired speed differential ∆V (the difference between the right and left
wheel speeds) can then be applied by setting

v_R = V_{\rm nav} + \frac{\Delta V}{2}, \qquad v_L = V_{\rm nav} - \frac{\Delta V}{2}, \qquad (6.7)

where v_R and v_L are the reference speeds of the right and left wheels, respectively. Note that one can of course only set the desired (reference) speed values;
the actual speed values obtained depend on the detailed dynamics of the robot
and its motors.
If the reference angle differs strongly from the estimated heading (which
can happen, for example, in situations where the robot comes sufficiently close
to an obstacle whose potential generates a steep hill), the robot may have to
suspend its normal navigation and instead carry out a pure rotation, setting
vL = −Vrot , vR = Vrot for a left (counterclockwise) rotation, where Vrot is the ro-
tation velocity, defined along with Vnav (and the other constants) during setup.
In case the robot should carry out a clockwise rotation, the signs are reversed.
The direction of rotation is, of course, determined by the sign of the differ-
ence between the reference angle and the (estimated) heading. In this case, the
robot should turn until the difference between the reference angle and the esti-
mated heading drops below a user-specified threshold, at which point normal
navigation can resume.
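The steering logic just described can be sketched as follows (Python; the proportional gain k mapping the heading error to ∆V, and the wrap-around handling of angles, are assumptions not specified in the text):

```python
import math

def wheel_speeds(reference_angle, heading, v_nav, v_rot,
                 k=1.0, rotation_threshold=math.pi / 4):
    """Set reference wheel speeds from the heading error, cf. Eq. (6.7).

    If the heading error is large, carry out a pure rotation instead of
    normal navigation. Returns (v_left, v_right).
    """
    # Heading error, wrapped to (-pi, pi].
    error = math.atan2(math.sin(reference_angle - heading),
                       math.cos(reference_angle - heading))
    if abs(error) > rotation_threshold:
        # Pure rotation; the sign of the error selects the direction,
        # with vL = -v_rot, vR = v_rot for a counterclockwise rotation.
        return (-v_rot, v_rot) if error > 0 else (v_rot, -v_rot)
    delta_v = k * error                                 # assumed proportional steering
    return v_nav - delta_v / 2, v_nav + delta_v / 2     # Eq. (6.7)
```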
6.3 Localization
In Sects. 6.1 and 6.2, it was (unrealistically) assumed that the robot’s odome-
try would provide perfect estimates of the pose. In reality, this will never be
the case, and therefore the problem of recalibrating the odometric readings,
from time to time, is a fundamental problem in robotics. Doing so requires a
method for localization independent from odometry, and such methods usu-
ally involve LRFs (even though cameras are also sometimes used), sensors that
are difficult to simulate in ARSim (because of the large number of rays which
would slow down the simulation considerably). Therefore, in this section, lo-
calization will be described as it is implemented in the simulator GPRSim and
in GPRBS, where LRFs are used.
Robot localization requires two brain processes: The cognitive Odometry
process and an independent process for odometric recalibration, which both
in GPRSim and in GPRBS goes under the name Laser localization, since the
behavior for odometric recalibration uses the readings of an LRF, together with
a map, to infer its current location using scan matching, as described below.
In fact, the problem of localization can be approached in many different
ways. For outdoor applications, a robot may be equipped with GPS, which
in many cases will give sufficiently accurate position estimates. However, in
indoor applications (standard) GPS cannot be used, since the signal is too weak
to penetrate the walls of a building. Of course, it is possible to set up a local
GPS system, for example by projecting IR beacons on the ceiling, using which
Figure 6.14: An illustration of the need for localization in mobile robot navigation. In the
left panel, the robot navigates using odometry only. As a result, the odometric trajectory
(red) deviates quite significantly from the actual (green) trajectory. In the right panel, Laser
localization was activated periodically, leading to much improved odometric estimates.
the robot can deduce its position by means of triangulation9 . However, such a
system requires that the arena should be adapted to the robot, something that
might not always be desirable or even possible.
The localization method (combining odometry and laser scan matching)
that will be described here is normally used together with some navigation
behavior. Thus, the robotic brain will consist of at least two motor behav-
iors, in which case decision-making also becomes important. This topic will
be studied in a later chapter: For now, the Laser localization behavior will be
considered separately.
Figure 6.15: An enlargement of the most significant correction in odometric readings (in the
right panel of Fig. 6.14) resulting from the Laser localization behavior.
If the scan matching is successful, the odometric pose is reset to the position suggested by the Laser
localization behavior.
The left panel of Fig. 6.14 illustrates the need for localization: The nav-
igation task shown in Fig. 6.12 was considered again (with the same start-
ing point, but a different initial direction of motion), this time with realistic
(i.e. non-zero) levels of noise in the wheel encoders and, therefore, also in the
odometry. As can be seen in the figure, the odometric drift causes a rather
large discrepancy between the actual trajectory (green) and the odometric es-
timate (red). In the right panel, the robotic brain contained two behaviors (in
addition to the cognitive Odometry process), namely Potential field navigation
and Laser localization. The Laser localization behavior was activated periodically
(thus deactivating the Potential field navigation behavior), each time recalibrat-
ing (if necessary) the odometric readings. As can be seen in the right panel
of Fig. 6.14, with laser localization in place, the discrepancy between the odo-
metric and actual trajectories is reduced significantly. At one point, the Laser
localization behavior was required to make a rather large correction of the odo-
metric readings. That particular event is shown enlarged in Fig. 6.15. As can
be seen, the odometric readings undergo a discrete step at the moment of lo-
calization.
When activated, the localization behavior10 considered here first stops the
10 See Sandberg, D., Wolff, K., and Wahde, M. A robot localization method based on laser scan
Figure 6.16: Two examples of scan matching. The leftmost panel in each row shows a few
rays (solid lines) from an actual LRF reading (plotted in the map used by the virtual LRF), and
the middle panels show the virtual LRF readings (dotted lines) in a case in which the estimated
pose differs quite strongly from the correct one (upper row), and one in which the difference
is small (bottom row). The direction of heading is illustrated with arrows. The right panel in
each row shows both the actual LRF rays and the virtual ones. The figure also illustrates the
map, which consists of a sequence of lines.
robot, and then takes a reading of the LRF. Next, it tries to match this reading
to a virtual reading taken by placing a virtual LRF (hereafter: vLRF) at various
positions in the map. Two examples of scan matching are shown in Fig. 6.16.
The three panels in the upper row show a situation in which the odometry
has drifted significantly. The upper left panel shows the readings (i.e. laser ray
distances) from an actual LRF mounted on top of the robot (not shown). Note
that, for clarity, the figure only shows a few of the many (typically hundreds of)
laser ray directions. The upper middle panel shows the readings of the vLRF,
placed at the initial position and heading obtained from odometry. As can be
seen in the upper right panel, the two scans match rather badly. By contrast,
the three panels of the bottom row show a situation in which the pose error
is small. The purpose of the search algorithm described below is to correct
the odometry, i.e. to reach a situation similar to the one shown in the
bottom row of Fig. 6.16. Fig. 6.17 shows another example of a good (left panel)
and a bad (right panel) scan match. In the case shown in the left panel, the
Figure 6.17: Matching of LRF rays (in a different arena than the one used in the examples
above). The readings of the actual LRF are shown in green, and those of the virtual LRF are
shown in red. Left panel: An almost exact match. Right panel: In this case, the odometry has
drifted enough to cause a large discrepancy between the actual and virtual LRF rays.
odometric pose estimate is quite good, so that the rays from the actual LRF
(green) match those of the vLRF quite well, at the current pose estimate. By
contrast, in the situation shown in the right panel, the odometry has drifted
significantly.
for those indices i for which both the real LRF and the vLRF detect an obstacle
(i.e. obtain a reading different from -1), whereas χi is equal to zero for indices i
such that either the real LRF or the vLRF (or both) do not detect any obstacle
(out to the range R of the LRF). ν denotes the number of rays actually used in
forming the error measure, i.e. the number of rays for which χi is equal to one.
As can be seen, ε is a measure of the (normalized) average relative deviation
in detected distances between the real LRF and the vLRF.
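As a concrete illustration, the error computation just described can be sketched in a few lines of code. This is a minimal sketch, not the implementation from the text; since the defining equation is not reproduced in this excerpt, the error is assumed here to be the average relative deviation |di − δi|/di over the rays with χi = 1:

```python
def matching_error(d_real, d_virtual):
    """Average relative deviation between real and virtual LRF ray distances.

    A ray contributes (chi_i = 1) only if both scanners detect an obstacle,
    i.e. if neither reading equals -1. The sum is divided by nu, the number
    of contributing rays.
    """
    terms = [abs(d - dv) / d
             for d, dv in zip(d_real, d_virtual)
             if d != -1 and dv != -1]       # chi_i = 1 only for these rays
    if not terms:                           # no overlapping detections
        return float("inf")
    return sum(terms) / len(terms)          # normalize by nu

# Example: four rays, two of which are detected by both scanners
err = matching_error([2.0, -1, 4.0, 5.0], [2.2, 3.0, -1, 5.0])
```

In this example only the first and last rays contribute, giving an error of (0.1 + 0)/2 = 0.05.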
Since the di are given and the δi depend on the pose of the vLRF, one may write
ε = ε(p). Now, if the odometric pose estimate happens to be exact, the virtual
and actual LRF scans will be (almost) identical (depending on the accuracy of
the map and the noise level in the real LRF), resulting in a very small matching
error, in which case the localization behavior can be deactivated and the robot
may continue its navigation. However, if the error ε exceeds a user-defined
threshold T, the robot can conclude that its odometric estimates are not suf-
ficiently accurate, and it must therefore try to minimize the matching error by
trying various poses in the map, i.e. by carrying out a number of virtual scans,
in order to determine the actual pose of the robot. The scan matching problem
can thus be formulated as the optimization problem of finding the pose p = pv
that minimizes ε = ε(p). Once this pose has been found, the new odometric
pose pnew is set equal to pv.
Note that it is assumed that the robot is standing still during localization.
This restriction (which, in principle, can be removed) is introduced in order
to (i) avoid having to correct for the motion occurring during the laser sweep,
which typically lasts between 0.01 and 0.1 s, and (ii) avoid having to correct for
the motion that would otherwise take place during the scan matching procedure,
which normally takes a (non-negligible) fraction of a second. Thus, only one
scan needs to be carried out using the real LRF mounted on the robot. The
remaining work consists of generating virtual scans in the map, at a sequence
of poses, and matching these to the actual LRF readings. Unlike some other scan
matching methods, the method used here does not attempt to fit lines to the
LRF readings. Instead, the LRF rays (actual and virtual, as described above)
are used directly during scan matching. The sequence of poses for the vLRF
is generated as follows: First, the actual LRF scan is carried out, generating
the distances di. Next, a virtual scan is carried out (in the map) at the current
estimated position p0. If the error ε0 = ε(p0) is below the threshold T, local-
ization is complete. If not, the algorithm picks a random pose pj (where j = 1
in the first iteration) in a rectangular box of size Lx × Ly × Lϕ, centered on p0
in pose space, and computes the matching error εj = ε(pj). The constants Lx
and Ly are typically set to around 0.1 m, and the constant Lϕ is set to around
0.1 radians.
The process is repeated until, for some j = j1, an error εj1 < ε0 is found. At
this point, the rectangular box is re-centered to pj1, and the search continues,
now picking a random pose in the rectangular box centered on pj1. Once a
position pj2 is found for which εj2 < εj1, the rectangular box is again re-centered,
etc. The procedure is repeated for a given number (N) of iterations13.

Figure 6.18: An illustration of the sequence of poses generated during two executions of the
search algorithm (with arrows indicating the direction of heading). In each panel, the actual
position (measured in meters) of the robot is indicated with a filled square. The initial estimated
position (i.e. from odometry, before correction) is shown as a filled disc, and the final estimated
position is visualized as an open square. The intermediate points generated during the search
are represented as open discs and are shown together with the corresponding iteration number.
Note that, for clarity, only some of the intermediate points are shown.
Even though the algorithm is designed to improve both the position and
the heading simultaneously, in practice, the result of running the algorithm is
usually to correct the heading first (which is easiest, since an error in heading
typically has a larger effect on the scan match than a position error), as can be
seen clearly in the right panel of Fig. 6.18. At this stage, the estimated pose
can make a rather large excursion in position space. However, once a fairly
correct heading has been found, the estimated position normally converges
quite rapidly to the correct position.
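The search procedure described above amounts to a simple stochastic local search in pose space. The sketch below is not the authors' implementation: the matching error ε(p), which in reality comes from comparing real and virtual LRF scans, is replaced by a synthetic quadratic surrogate so that the example is self-contained, and TRUE_POSE, p_odo, and all numerical values are purely illustrative.

```python
import random

random.seed(0)  # fixed seed, only to make the sketch reproducible

TRUE_POSE = (0.60, -2.10, 0.35)   # hypothetical actual pose (x, y, phi)

def match_error(pose):
    """Surrogate for eps(p): zero at the true pose, growing with distance."""
    return sum((a - b) ** 2 for a, b in zip(pose, TRUE_POSE))

def localize(p0, n_iterations=500, Lx=0.1, Ly=0.1, Lphi=0.1):
    """Random search in a box of size Lx x Ly x Lphi, re-centered on
    the best pose found so far (cf. the procedure in the text)."""
    best_pose, best_err = p0, match_error(p0)
    for _ in range(n_iterations):
        candidate = (best_pose[0] + random.uniform(-Lx / 2, Lx / 2),
                     best_pose[1] + random.uniform(-Ly / 2, Ly / 2),
                     best_pose[2] + random.uniform(-Lphi / 2, Lphi / 2))
        err = match_error(candidate)
        if err < best_err:            # improvement: re-center the box here
            best_pose, best_err = candidate, err
    return best_pose, best_err

p_odo = (0.55, -2.20, 0.10)          # drifted odometric estimate
p_new, err = localize(p_odo)         # corrected pose and residual error
```

As noted in the footnote of the text, this procedure resembles an evolutionary algorithm with a population size of 1: each accepted candidate becomes the new center of the search box.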
13 Note that this algorithm resembles an evolutionary algorithm with a population size of 1.
94 CHAPTER 7. UTILITY AND RATIONAL DECISION-MAKING
7.1 Utility
In order to underline further the fact that the expected payoff alone is not
what determines one’s inclination to accept a bet, consider a repeated lottery
in which a fair coin (equal probability for heads and tails) is tossed repeat-
edly, and where the player receives 2^k dollars if the first head, say, occurs after
k tosses of the coin. The probability pk of this event occurring equals (1/2)^k.
Thus, the expected payoff from playing this lottery (taking into account the
cost r of entering the lottery) would be
P = -r + \sum_{k=1}^{\infty} p_k c_k = -r + \sum_{k=1}^{\infty} \left( \frac{1}{2} \right)^k 2^k = -r + \sum_{k=1}^{\infty} 1, \qquad (7.2)
which is infinite! Thus, if the expected payoff P was all that mattered, a player
should be willing to pay any finite sum of money, however large, in order to
participate in this lottery, since the expected payoff would be larger. This is,
of course, absurd; few people would be willing to bet their entire savings on a
lottery. The situation just described is called the St. Petersburg paradox, and it
was formulated by Bernoulli, who proposed a way of resolving the paradox,
by postulating that, rather than the expected payoff itself, it is a player’s per-
ception of the amount of money gained that determines her actions. Bernoulli
postulated that the subjective value of N currency units (e.g. dollars) varies
essentially as the logarithm of N . Let W denote a person’s wealth before par-
ticipating in the lottery, and r the cost of entering the lottery. Using Bernoulli’s
postulate, the subjective value Ps of the expected payoff can then be written as
the change in wealth for different outcomes, multiplied by the probability of
the outcome in question, i.e.
P_s = \sum_{k=1}^{\infty} \left( \ln(W - r + 2^k) - \ln W \right) 2^{-k}. \qquad (7.3)
The subjective value of a certain amount of money, set arbitrarily to the log-
arithm of the amount by Bernoulli, is a special case of the concept of utility,
which can be used for weighing different situations against each other and,
thus, to decide which action to take.
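The finiteness of Ps in Eq. (7.3) is easy to verify numerically. The sketch below sums the series for example values of W and r (both chosen arbitrarily here, not taken from the text):

```python
import math

def subjective_value(W, r, n_terms=200):
    """Partial sum of Ps = sum_k (ln(W - r + 2^k) - ln W) * 2^-k, Eq. (7.3)."""
    return sum((math.log(W - r + 2.0 ** k) - math.log(W)) * 2.0 ** -k
               for k in range(1, n_terms + 1))

# Example: wealth $1000, entry cost $10 (illustrative values)
Ps = subjective_value(W=1000.0, r=10.0)
```

The terms decay like k/2^k, so the partial sums converge very quickly: in sharp contrast to the divergent expected payoff of Eq. (7.2), the subjective value of the lottery is a small finite number.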
In fact, it has been proven (rigorously) by von Neumann and Morgenstern2
that, given certain assumptions that will be listed below, there exists a utility
function which maps members ci of the set of outcomes to a numerical value
u(ci ), called the utility of ci , which has the following properties:
2. u is affine, i.e. u(pc1 + (1 − p)c2) = pu(c1) + (1 − p)u(c2) for all p ∈ [0, 1].
If these four axioms hold4 it is possible to prove the existence of a utility func-
tion with the properties listed above (the proof will not be given here). How-
ever, most decision-making situations involve uncertainty, i.e. many different
outcomes are possible. Such situations can also be handled, since it follows
(not completely trivially, though, at least not for the most general case) from
the utility function properties listed above that the expected utility U (c) of a
mixed consequence c = p1c1 + p2c2 + . . . + pncn, where pk is the probability for
consequence ck , is given by
U(c) = \sum_k p_k u(c_k), \qquad (7.5)
U (cm ) = U (pc1 + qc2 ) = pu(c1 ) + qu(c2 ) = 0.2 × 5 + 0.8 × (−10) = −7. (7.6)
Using the same procedure, utility values can be derived for any mixed conse-
quence (i.e. for any value of p ∈ [0, 1]).
Next, consider the consequence c of not trying to cross the street at all, and
therefore not meeting B (who, let us say, owes A a considerable amount of
money, which A expects to be able to collect upon meeting B). In that case, c1
is preferred to c and (unless A is absolutely desperate to collect the debt) c is
preferred to c2 . Thus
u(c1 ) > U (c) > u(c2 ), (7.7)
but how should U (c) be determined? Given some time to think, and by con-
sidering the two outcomes (c1 and c2 ) with known utility values, A will come
up with a value of p at which he is indifferent between the mixed consequence
pc1 + (1 − p)c2 and c. Because of the affinity property of the utility function,
one then obtains

U(c) = pu(c1) + (1 − p)u(c2). (7.8)
Let us say that A has a rather high tolerance for risk, so that the point of indif-
ference occurs at p = 0.1. Then

U(c) = 0.1 × 5 + 0.9 × (−10) = −8.5. (7.9)
Thus, the expected utility for the consequence c has now been determined, and
the expected utility of any other consequence preferred to c2 but not to c1 can
be computed in a similar way. Note that there is no right answer – the exact
shape of the utility function depends on A’s preferences. For example, if he
were more cautious, he would perhaps conclude that the point of indifference
occurs at p = 0.5 rather than p = 0.1, in which case U (c) = −2.5.
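The indifference-point calculations above can be restated in a few lines of code; the utility values u(c1) = 5 and u(c2) = −10 are those used in the street-crossing example:

```python
def mixed_utility(p, u1=5.0, u2=-10.0):
    """Expected utility of the mixture p*c1 + (1 - p)*c2,
    using the affinity property U(pc1 + (1-p)c2) = p u(c1) + (1-p) u(c2)."""
    return p * u1 + (1.0 - p) * u2

risk_tolerant = mixed_utility(0.1)   # A's indifference point: -8.5
cautious = mixed_utility(0.5)        # a more cautious A: -2.5
```

The two values reproduce the discussion in the text: the more cautious the person, the larger the probability p of the good outcome must be before the mixture feels equivalent to the sure consequence c.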
As a more formal example, consider a lottery in which a person (C) is in-
different between the consequence c30 of a certain gain of $30 and the mixed
consequence of gaining $100 with probability p = 0.5 and gaining nothing
with probability p = 0.5. Assume further that C has assigned utility u(c0) = 0
and u(c100) = 1 to the consequences of gaining $0 and $100, respectively. Sim-
plifying the notation somewhat by writing u(x) for u(cx), one then finds

u(30) = 0.5 × u(0) + 0.5 × u(100) = 0.5. (7.10)

Figure 7.1: A typical utility function, showing diminishing (sub-linear) utility values for
larger amounts.
In other words, the certainty equivalent for the mixed consequence in Eq. (7.10)
is equal to $30. Proceeding in a similar way, using certainty equivalents, the
expected utility for any amount $x in the range [0, 100] can be determined, and
the utility function can be plotted. A common shape of the resulting curve is
shown in Fig. 7.1. The curve shows the utility function for a person who is
risk-averse, i.e. who would prefer a sure payoff of x $ to a lottery resulting in
the same expected payoff. If a person is risk-neutral, the utility function will
be a straight line through the origin and, similarly, for a risk-prone person, the
utility function would bend upwards. Put differently, risk-aversion implies
that the second derivative U''(x) of the utility function is negative.
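The connection between concavity and risk aversion can be illustrated numerically. In the sketch below, u(x) = √x is chosen purely as an example of a concave utility function (the text does not prescribe any particular form); the certainty equivalent of the 50/50 lottery over $0 and $100 then falls below the expected payoff of $50:

```python
import math

def expected_utility(u, outcomes, probs):
    """Expected utility of a lottery, Eq. (7.5)."""
    return sum(p * u(x) for x, p in zip(outcomes, probs))

u = math.sqrt                                          # example concave utility
eu = expected_utility(u, [0.0, 100.0], [0.5, 0.5])     # expected utility
certainty_equivalent = eu ** 2                         # u^-1(eu) for u = sqrt
```

Here the certainty equivalent is $25: a risk-averse agent with this utility function would accept any sure payment above $25 in place of a lottery whose expected payoff is $50.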
At this point, it should be noted that, even if all the axioms necessary for
the existence of a utility function hold true, it is not always the case that utilities can
be computed in a simple way. In the examples considered above, the state of
nature was known, i.e. the various consequences and their probabilities could
be listed, and all that remained was the uncertainty due to randomness. How-
ever, in many situations, the state of nature is not known, and it is therefore
much more difficult to assign appropriate utility values to guide behavior. The
general theory of decision-making under uncertainty is, however, beyond the
scope of this text.
Decision-making in Stentor
Stentor is a simple, single-celled animal (see Fig. 7.3) that attaches itself to
e.g. a rock, and uses its hair-like cilia to sweep nutrients into its trumpet-
shaped mouth. Obviously, this animal has no nervous system, but is nev-
ertheless capable of quite complicated self-preserving behaviors: Besides the
feeding behavior (B1), Stentor is equipped with four different avoidance be-
haviors (in response, for example, to the presence of a noxious substance).

Figure 7.3: A Stentor. Reproduced with kind permission of Dr. Ralf Wagner.

In increasing order of energy expenditure, the avoidance behaviors are (a) turning
away (B2), (b) reversing the cilia, which interrupts the feeding activity (B3),
(c) contraction, followed by waiting (B4 ), and (d) detachment, that is, breaking
away from the object to which the animal is attached (B5 ). Despite its sim-
ple architecture, Stentor is able to execute the various avoidance behaviors in
a rational sequence, i.e. starting with B2 and, if this behavior is insufficient to
escape the noxious substance, proceeding with B3 etc. However, the sequence
of activation of the different avoidance behaviors is not fixed: Sometimes, B2
is followed by B4 instead of B3 etc. How can such a simple animal be capa-
ble of such complex behavior, given its utter inability to reason about which
activity to perform? It turns out, as described in detail by Staddon6, that Sten-
tor’s behavior can be accounted for by a very simple model, involving several
leaky integrator elements. A two-parameter leaky integrator is described by
the equation
\frac{dU}{dt} + a U(t) = b X(t), \qquad (7.11)
where a and b are constants and X is the external stimulus. Now, let Ui denote
the utility associated with executing Bi, and set U1 = C = constant. For the
avoidance behaviors, the utilities are obtained from

\frac{dU_i}{dt} + a_i U_i(t) = b_i X(t), \quad i = 2, 3, 4, 5. \qquad (7.12)

6 See Staddon, J.E.R. Adaptive dynamics, MIT Press, 2001.

Figure 7.4: Left panel: The variation of the utility functions for the 5 behaviors. Solid curve:
B1, large-dotted curve: B2, small-dotted curve: B3, dot-dashed curve: B4, plus-sign curve:
B5. Right panel: The variation X(t) of the concentration of the noxious substance.
Now, given initial values of the utilities U2, U3, U4, and U5 (here all set, arbitrar-
ily, to zero), and the variation of X with time, the utility for each behavior can
be computed at all times. Using a utility-maximizing (rational!) procedure for
behavior selection (i.e. decision-making), in which the behavior B_{i_{sel}} corre-
sponding to maximum current utility is selected, i.e.

i_{sel} = \arg\max_i U_i(t), \qquad (7.13)
the behavior of Stentor, including the variable activation sequence of the avoid-
ance behaviors, can be modelled quite accurately, by selecting appropriate val-
ues for the constants ai and bi . An example is shown in Fig. 7.4. Here, it was
assumed that the variation X(t) of the noxious substance can be modelled as
\frac{dX}{dt} + k_1 X = X_1, \qquad (7.14)

if B1 is active, and

\frac{dX}{dt} + k_2 X = X_2, \qquad (7.15)
if any other behavior is active. k1 , k2 , X1 , and X2 are non-negative constants,
satisfying X1 > X2 (i.e. the amount of noxious substance tends towards higher
values if no evasive action is taken). The left panel of the figure shows the
variation of Ui for i = 1, . . . , 5. As can be seen in the figure, when X becomes
sufficiently large, B2 is activated for a short while. Next B3 , B4 , and B5 are
activated, in order. With the parameter values used in the example, the fall in
X is quite slow. Thus, in subsequent activations of the avoidance behaviors,
B2 − B4 are skipped, and the Stentor immediately activates B5 .
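The model just described is straightforward to simulate with simple Euler integration. The sketch below does not use the parameter setting behind Fig. 7.4; all constants (ai, bi, k1, k2, X1, X2, and the constant utility of B1) are illustrative choices, intended only to show a rational activation sequence emerging from Eqs. (7.12), (7.14), and (7.15) combined with utility-maximizing behavior selection:

```python
DT = 0.01                               # Euler time step [s]
A = {2: 1.0, 3: 0.6, 4: 0.4, 5: 0.2}    # decay constants a_i (illustrative)
B = {2: 1.0, 3: 0.8, 4: 0.6, 5: 0.4}    # gain constants b_i (illustrative)
K1, K2, X1, X2 = 0.5, 0.5, 2.0, 0.5     # noxious-substance dynamics, X1 > X2

def simulate(t_end=40.0, u1=0.3):
    U = {1: u1, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0}   # U1 = C = constant
    X = 0.0                                        # noxious substance level
    selected = []
    for _ in range(int(t_end / DT)):
        i_sel = max(U, key=U.get)                  # pick behavior of max utility
        selected.append(i_sel)
        # Eq. (7.12): leaky integrators for the avoidance utilities
        for i in (2, 3, 4, 5):
            U[i] += DT * (B[i] * X - A[i] * U[i])
        # Eqs. (7.14)-(7.15): X tends to higher values while feeding (B1)
        if i_sel == 1:
            X += DT * (X1 - K1 * X)
        else:
            X += DT * (X2 - K2 * X)
    return selected

seq = simulate()   # sequence of selected behaviors over time
```

With these constants, feeding (B1) is active at first; as X accumulates, the fast but weak avoidance behavior B2 is triggered before the slow but strong B5, reproducing the kind of ordered escalation described in the text.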
7 See McFarland, D. Animal behaviour, Addison-Wesley, 1999.
8 See McFarland, D. and Bösser, T. Intelligent behavior in animals and robots, MIT Press, 1993.
9 See Wahde, M. A method for behavioural organization for autonomous robots based on evolutionary
optimization of utility functions, J. Systems and Control Engineering, 217, pp. 249-258, 2003, and
Wahde, M. A general-purpose method for decision-making in autonomous robots, Lecture Notes in
Artificial Intelligence, 5579, pp. 1-10, 2009.
106 CHAPTER 8. DECISION-MAKING
the battery charging behavior does not lead to any direct increase in perfor-
mance, it should nevertheless be activated, so that the robot can continue car-
rying out its floor-sweeping task when the batteries have been charged. Hence,
in order to receive the highest possible performance score over a long period
of time, the robot must, in fact, maximize utility.
Utility also provides a means of allocating limited resources in an optimal
way. The life of any animal (or robot) inevitably involves many trade-offs,
where less relevant behaviors must be sacrificed or at least postponed in or-
der to carry out the most relevant behaviors, i.e. those associated with the largest
utility values. Next, a brief description of the UF method will be given, starting
with the important concept of state variables.
Figure 8.1: A schematic representation of an SPS. In general, an SPS takes raw sensory
readings as input and generates state variables as output.
or the maximum

z = \max_{i \in [1, j]} \Sigma_i, \qquad (8.3)
Figure 8.2: An illustration of the noise tolerance resulting from the differential form of the
utility functions. The left panel shows an example of signal loss in a state variable z, in turn
caused by signal loss in the sensor(s) whose readings are used when forming the state variable:
At around t = 5 s, the state variable suddenly dropped to zero for 0.15 s. Right panel: The
corresponding change in a utility function, defined by τ u̇ + u = σ(b + wz), where τ = 0.3,
b = 1.0 and w = 0.5. The sigmoid parameter c was set to 1.0.
Now, from the form of the utility functions given in Eq. (8.4) it is clear that
the utility values will depend on the state variables zk (k = 1, . . . , m). Ideally,
the state variables should provide the robot with all the information needed to
make an informed decision regarding which brain processes to keep active in
any situation encountered. However, in some cases, one may wish to directly
activate or de-activate some brain process. For that reason, the Γi (i = 1, . . . , n)
parameters, henceforth simply referred to as gamma parameters, have been
introduced (see Eq. (8.4)). The values of the gamma parameters are not ob-
tained from the state variables, but are instead set directly by the brain pro-
cesses. For example, a brain process j may, under certain circumstances, set
the parameter Γi of some brain process i either to a large positive value (in
order to raise the utility ui , so as to activate brain process i) or a large nega-
tive value (to achieve the opposite effect, i.e. de-activation of brain process i).
However, once the intended result has been achieved, Γi should return to its
default value of zero. This is achieved by letting Γi vary (at all times) as

\tau_i^{\Gamma} \dot{\Gamma}_i + \Gamma_i = 0,

where τ_i^Γ is a time constant7, determining the decay rate of Γi. Thus, in the
normal situation Γi is equal to zero, whereas if some brain process abruptly
sets Γi to a value different from zero, it subsequently falls off exponentially.
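The resulting time course of a gamma parameter is simply an exponential decay back to zero. In the small sketch below, the initial value Γ0 and the time constant are example values, not taken from the text:

```python
import math

def gamma(t, gamma0=10.0, tau=0.5):
    """Gamma parameter at time t after being set to gamma0 by a brain
    process; first-order decay gives Gamma(t) = Gamma0 * exp(-t / tau)."""
    return gamma0 * math.exp(-t / tau)
```

After a few time constants the parameter is back near its default value of zero, so the forced activation (or de-activation) of the targeted brain process is only temporary.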
The differential form of Eq. (8.4) has the important effect of filtering out the
noise which is always present in sensor readings and therefore also in the state
variables. Thus, for example, if a given sensor fails to give readings for a brief
interval of time (a few ms, say), the corresponding state variable will drop to
zero. However, because of the time constant (usually of order 0.1 - 0.3 s) in the
utility functions, a brief interruption in the sensor readings will only cause a
small fluctuation in the corresponding utility value, as illustrated in Fig. 8.2.
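This filtering effect is easy to reproduce numerically. The sketch below integrates the utility equation from the caption of Fig. 8.2, τ u̇ + u = σ(b + wz), with the same parameter values (τ = 0.3, b = 1.0, w = 0.5, c = 1.0) and a 0.15 s dropout in z; since the exact form of the sigmoid is not specified in this excerpt, a logistic function σ(x) = 1/(1 + exp(−cx)) is assumed:

```python
import math

TAU, B, W, C, DT = 0.3, 1.0, 0.5, 1.0, 0.001

def sigma(x):
    """Assumed sigmoid: logistic function with parameter c."""
    return 1.0 / (1.0 + math.exp(-C * x))

def simulate(t_end=10.0, drop_start=5.0, drop_len=0.15):
    """Integrate tau*du/dt + u = sigma(b + w*z) with a brief dropout in z."""
    u = sigma(B + W * 1.0)          # start at the steady-state utility
    u_min = u
    t = 0.0
    while t < t_end:
        z = 0.0 if drop_start <= t < drop_start + drop_len else 1.0
        u += DT * (sigma(B + W * z) - u) / TAU   # Euler step
        u_min = min(u_min, u)
        t += DT
    return u_min

u_steady = sigma(B + W * 1.0)       # utility with z = 1
u_min = simulate()                  # lowest utility during the dropout
```

Even though z drops all the way to zero, the utility only sags by a few percent before recovering, because the dropout (0.15 s) is short compared to the time constant τ = 0.3 s.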
Figure 8.3: An example of utility function dynamics, showing the variation (with time) of
three utility functions u1 (red), u2 (green) and u3 (blue).
est utility is active. In the particular case shown in the figure the utility (u1 ) for
one of the behaviors (B1) was essentially constant. Since it is only the relative
utility values that matter, it is common to let one such process have (asymptot-
ically) constant utility values, by setting all the corresponding weights wik to
zero. However, for the other two behaviors, B2 and B3, whose utility functions
u2 and u3 are shown as green and blue curves, respectively, some weights wik
were non-zero. In the particular case shown here, B1 was active most of the
time, the only interruptions occurring around t = 21 s and t = 27 s, at which
points B3 was activated instead, the first time for around 2.5 s and the second
time for less than 1 s. In the time interval covered by the figure, B2 was never
activated.
As mentioned above, the UF method is mainly intended for motor tasks,
and it has been successfully applied in several tasks of that kind, both in sim-
ulated and real robots8.
8 A specific example can be found in: Wahde, M. A general-purpose method for decision-making
in autonomous robots, LNCS, 5579, 1-10, 2009.
The following is an alphabetical list of all the Matlab functions associated with
ARSim. For each function, the library to which the function belongs is given,
along with the interface of the function and a brief description. For further
information, see the actual source code for the function in question.
AddMotionResults
Library: ResultFunctions
Interface:
motionresults = AddMotionResults(oldMotionResults, time, robot)
Description: This function updates the motion results by adding the current
position, velocity, heading, and sensor readings of the robot.
BrainStep
Library: –
Interface: b = BrainStep(robot, time);
Description: BrainStep implements the decision-making system (i.e.
the brain) of the robot. The detailed form of this function will vary from ex-
periment to experiment.
CalibrateOdometer
Library: RobotFunctions
Interface: o = CalibrateOdometer(robot)
Description: In simulations in which an odometer is used, a call to CalibrateOdometer
is made just before the start of the simulation, in order to set the correct posi-
tion and heading of the robot.
See also: CreateOdometer
114 Appendix A: Matlab functions in ARSim
CheckForCollisions
Library: RobotFunctions
Interface: coll = CheckForCollisions(arena, robot);
Description: This function carries out a collision check, by running through all
arena objects (polygons) line by line, and checking for intersections between
the current line and the spherical body of the robot.
CreateArena
Library: ArenaFunctions
Interface: arena = CreateArena(name,size,objectArray)
Description: This function generates an arena, given an array of arena objects.
See also: CreateArenaObject
CreateArenaObject
Library: ArenaFunctions
Interface: arenaobject = CreateArenaObject(name,vertexArray)
Description: This function generates an arena object, given an array of coor-
dinates for vertices.
CreateBrain
Library: –
Interface: b = CreateBrain;
Description: This function generates the brain of a robot. Its exact form will
vary from experiment to experiment.
CreateCompass
Library: RobotFunctions
Interface: c = CreateCompass(name,sigma);
Description: This function generates a compass which can be used for esti-
mating the heading of the robot. The parameter sigma determines the noise
level.
CreateIRSensor
Library: RobotFunctions
Interface: s = CreateIRSensor(name,relativeAngle,size,numberOfRays,
openingAngle,range,c1,c2,sigma);
CreateMotor
Library: RobotFunctions
Interface: m = CreateMotor(name);
Description: CreateMotor generates a DC motor, using settings suitable for
a robot with a mass of a few kg.
CreateOdometer
Library: RobotFunctions
Interface: o = CreateOdometer(name, sigma);
Description: This function generates an odometer, which, in turn, provides
estimates for the position and heading of the robot. The parameter sigma
determines the noise level.
CreateRobot
Library: RobotFunctions
Interface: robot = CreateRobot(name,mass,momentOfInertia,radius,
wheelRadius,rayBasedSensorArray,
motorArray,compass,odometer,brain)
Description: CreateRobot sets up a robot, and computes the dynamical pa-
rameters typical of a robot with a mass of a few kg.
GetCompassReading
Library: RobotFunctions
Interface: c = GetCompassReading(robot, dt);
Description: This function updates the compass readings of a robot.
GetDistanceToLineAlongRay
Library: RobotFunctions
Interface: l = GetDistanceToLineAlongRay(beta,p1,p2,x1,y1);
Description: This function, which is used by the IR sensors, computes the
distance from a given point (x1 , y1 ) to a line segment.
See also: GetIRSensorReading, GetDistanceToNearestObject.
GetDistanceToNearestObject
Library: RobotFunctions
Interface: d = GetDistanceToNearestObject(beta, x, y, arena);
Description: This function, which is used by the IR sensors, determines the
distance between an IR sensor and the nearest object along a given ray.
See also: GetIRSensorReading.
GetIRSensorReading
Library: RobotFunctions
Interface: s = GetIRSensorReading(sensor,arena);
Description: GetIRSensorReading determines the reading of an IR sensor.
GetMinMaxAngle
Library: RobotFunctions
Interface: [aMin,aMax] = GetMinMaxAngle(v1,v2);
Description: This function determines the direction angles of the vectors con-
necting the origin of the coordinate system to the endpoints of a line segment.
See also: GetDistanceToNearestObject.
GetMotorSignalsFromBrain
Library: RobotFunctions
Interface: s = GetMotorSignalsFromBrain(brain);
Description: This function extracts the motor signals (one for each motor)
from the brain of the robot.
See also: MoveRobot.
GetOdometerReading
Library: RobotFunctions
Interface: o = GetOdometerReading(robot, dt);
Description: This function updates the odometer readings of a robot.
GetRayBasedSensorReadings
Library: RobotFunctions
Interface: s = GetRayBasedSensorReadings(robot, arena)
Description: This function obtains the reading of all (IR) sensors of the robot.
See also: GetIRSensorReading.
GetTorque
Library: RobotFunctions
Interface: m = GetTorque(motor, voltage);
Description: This function determines the torque delivered by a DC motor,
given a value of the applied voltage.
InitializeMotionResults
Library: ResultFunctions
Interface: motionResults = InitializeMotionResults(robot)
Description: This function initializes a Matlab structure used for storing the
results of the simulation, i.e. the position, velocity, heading, and sensor read-
ings of the robot.
InitPlot
Library: PlotFunctions
Interface: plotHandle = InitializePlot(robot, arena)
Description: This function generates the plot of the robot and the arena.
See also: CreateArena, CreateRobot.
MoveRobot
Library: RobotFunctions
Interface: r = MoveRobot(robot,dt);
Description: MoveRobot moves the robot according to the equations of mo-
tion for a differentially steered two-wheeled robot.
ScaleMotorSignals
Library: RobotFunctions
Interface: v = ScaleMotorSignals(robot,s);
Description: This function scales the motor signals (s) to the appropriate
range, as set by the voltage requirements of the robot’s DC motors.
SetPositionAndVelocity
Library: RobotFunctions
Interface: r = SetPosition(robot,position,heading,
velocity,angularSpeed);
Description: This function places the robot at a given location, and also sets its
direction of motion, velocity, and angular velocity.
ShowRobot
Library: PlotFunctions
Interface: ShowRobot(plot,robot)
Description: ShowRobot updates the plot of the robot using Matlab’s handle
graphics: Each part of the plot of the robot can be accessed and its position can
be updated. ShowRobot also supports the plotting of an odometric ghost,
i.e. a plot showing the robot at the location determined by its odometer.
See also: MoveRobot.
UpdateMotorAxisAngularSpeed
Library: RobotFunctions
Interface: r = UpdateMotorAxisAngularSpeed(robot)
Description: This function determines the angular speed of each motor axis,
using the wheel speed and wheel radius.
UpdateSensorPositions
Library: RobotFunctions
Interface: s = UpdateSensorPositions(robot);
Description: This function updates the positions (and directions) of the sen-
sors as the robot is moved.