Probabilistic mechanisms in human sensorimotor control
Daniel Wolpert, University College London
Q. Why do we have a brain?
A. To produce adaptable and complex movements
Movement is the only way we have of interacting with the world
Communication: speech, gestures, writing
Sensory, memory and cognitive processes → future motor outputs
Sea squirt: once it settles and no longer needs to move, it digests much of its own nervous system
Why study computational sensorimotor control?
Principles of Neural Science, Kandel et al.
[Figure: page count of successive editions of Principles of Neural Science vs. year (1980-2050), growing linearly (r² = 0.96); labels: Theory, Experiments]
The complexity of motor control
What to move where vs. moving
Noise makes motor control hard
Noise = randomness
The motor system is noisy
Perceptual noise (noisy, partial, ambiguous, variable) limits resolution
Motor noise limits control
David Marr's levels of understanding (1982)
1) The level of computational theory of the system
2) The level of algorithm and representation, which are used to make the computations
3) The level of implementation: the underlying hardware or "machinery" on which the computations are carried out.
Tutorial Outline
Sensorimotor integration
Static multi-sensory integration
Bayesian integration
Dynamic sensor fusion & the Kalman filter
Action evaluation
Intrinsic loss function
Extrinsic loss functions
Prediction
Internal model and likelihood estimation
Sensory filtering
Control
Optimal feedforward control
Optimal feedback control
Motor learning of predictable and stochastic
environments
Review papers on www.wolpertlab.com
Multi-sensory integration
Multiple modalities can provide information about the same quantity
e.g. location of hand in space
Vision
Proprioception
Sensory input can be
Ambiguous
Noisy
What are the computations used in
integrating these sources?
Ideal Observers
Consider n signals x_i, i = 1…n
x_i = x + ε_i,  ε_i ~ N(0, σ_i²)
Maximum likelihood estimation (MLE):
P(x_1, x_2, …, x_n | x) = ∏_i P(x_i | x)
x̂ = Σ_i w_i x_i  with  w_i = (1/σ_i²) / Σ_j (1/σ_j²)
σ_x̂² = 1 / Σ_j (1/σ_j²) < σ_k²  for every k
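The MLE combination rule above can be checked numerically; the cue noise levels below are assumed values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two noisy reports of the same quantity (e.g. seen and felt hand position);
# the noise SDs are assumed values for illustration.
x_true = 10.0
sigma = np.array([1.0, 2.0])              # sigma_i for each cue
n_trials = 100_000

# x_i = x + eps_i, eps_i ~ N(0, sigma_i^2)
samples = x_true + rng.normal(0.0, sigma, size=(n_trials, 2))

# MLE weights w_i = (1/sigma_i^2) / sum_j (1/sigma_j^2)
precision = 1.0 / sigma**2
w = precision / precision.sum()
x_hat = samples @ w                       # fused estimate per trial

var_pred = 1.0 / precision.sum()          # predicted fused variance
print(w, var_pred, x_hat.var())           # weights [0.8 0.2]; both variances ~0.8
```

The fused variance (0.8) is below that of either single cue (1.0 and 4.0), which is the inequality on the slide.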
Two examples of multi-sensory integration
Visual-haptic integration (Ernst & Banks 2002)
Two-alternative forced-choice size judgment
[Figure: psychometric functions (probability vs. size / size difference) for visual, haptic, and visual-haptic (with discrepancy) conditions; their widths give σ_V, σ_H, σ_VH]
Measure: visual reliability σ_V², haptic reliability σ_H²
Predict: the combined visual-haptic noise and the weighting of each cue
Visual-haptic integration
Weights: w_H = σ_V² / (σ_V² + σ_H²),  w_V = 1 − w_H
Combined variance: σ_VH² = σ_V² σ_H² / (σ_V² + σ_H²)
The combined standard deviation (~discrimination threshold) is below that of either cue alone
Optimal integration of vision and haptic information in size judgement
Visual-proprioceptive integration
Classical claim from prism adaptation
vision dominates proprioception
Reliability of proprioception depends on location
Reliability of visual localization is anisotropic
(Van Beers, 1998)
Integration models with discrepancy
Winner takes all
Optimal integration: Σ̂ = (Σ_P⁻¹ + Σ_V⁻¹)⁻¹,  x̂ = Σ̂ (Σ_P⁻¹ x_P + Σ_V⁻¹ x_V)
Linear weighting of the means: x̂ = A x_V + B x_H, e.g. x̂ = w x_V + (1 − w) x_H
Prisms displace along the azimuth
Measure V and P
Apply visuomotor discrepancy during right hand reach
Measure change in V and P to get relative adaptation
Adaptation: vision 0.33, proprioception 0.67
(Van Beers, Wolpert & Haggard, 2002)
Visual-proprioceptive discrepancy in depth
Adaptation
Vision 0.72, proprioception 0.28
Visual adaptation in depth > visual adaptation in azimuth (p<0.01)
> Proprioceptive adaptation in depth (p<0.05)
Proprioception dominates vision in depth
Priors and Reverend Thomas Bayes
1702-1761
I now send you an essay which I have found among the papers of our
deceased friend Mr Bayes, and which, in my opinion, has great merit....
Essay towards solving a problem in the doctrine of chances.
Philosophical Transactions of the Royal Society of London, 1764.
Bayes rule
A = Disease
B = Positive blood test
P(A, B) = P(A|B) P(B) = P(B|A) P(A)
(joint probability of A and B; A given B)
Neuroscience
A = state of the world, B = sensory input
P(state|sensory input) = P(sensory input|state) P(state) / P(sensory input)
Posterior (belief in the state AFTER sensory input) = Likelihood × Prior / Evidence
Prior: belief in the state BEFORE sensory input
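Plugging numbers into the slide's disease/blood-test example makes the rule concrete; the prevalence and test accuracies below are assumed for illustration.

```python
# Bayes' rule on the disease (A) / positive blood test (B) example.
# All probabilities here are assumed values for illustration.
p_disease = 0.01              # P(A): prior prevalence
p_pos_given_disease = 0.95    # P(B|A): test sensitivity
p_pos_given_healthy = 0.05    # false-positive rate

# P(B): total probability of a positive test
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# P(A|B) = P(B|A) P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))   # 0.161
```

Despite an accurate test, a positive result leaves only a ~16% chance of disease, because the prior P(A) is so low.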
Bayesian Motor Learning
Real-world tasks have variability, e.g. estimating a ball's bounce location
Task statistics (Prior): not all locations are equally likely
Sensory feedback (Evidence): combine multiple cues to reduce uncertainty
Prior × Evidence = optimal estimate (Posterior)
Bayes rule
P(state|sensory input) ∝ P(sensory input|state) P(state)
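For a Gaussian prior and a Gaussian likelihood the posterior has a closed form; the numbers below are illustrative, not the experiment's values.

```python
import numpy as np

# Posterior for a Gaussian prior over a lateral shift and a Gaussian
# likelihood from one noisy observation (all numbers assumed, in cm).
mu_prior, sd_prior = 1.0, 0.5      # prior over shifts
x_sensed, sd_sense = 2.0, 1.0      # noisy observation of the true shift

prec_prior = 1.0 / sd_prior**2
prec_sense = 1.0 / sd_sense**2

# Product of two Gaussians: precision-weighted mean, precisions add
mu_post = (prec_prior * mu_prior + prec_sense * x_sensed) / (prec_prior + prec_sense)
sd_post = np.sqrt(1.0 / (prec_prior + prec_sense))

print(mu_post, sd_post)   # 1.2: the estimate is pulled from 2.0 toward the prior mean
```

The less reliable the sensory input (larger sd_sense), the more the estimate is biased toward the prior mean, which is exactly the behavioural signature tested below.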
Does sensorimotor learning use Bayes rule?
If so, is it implemented
Implicitly: mapping sensory inputs to motor outputs to minimize error?
Explicitly: using separate representations of the statistics of the prior and sensory noise?
Prior
Probability
Task in which we control
1) prior statistics of the task
2) sensory uncertainty
Lateral shift (cm)
(Körding & Wolpert, Nature, 2004)
Sensory feedback (likelihood); learning; generalization
After ~1000 trials: test with a 2 cm shift and no visual feedback
[Figure: probability vs. lateral shift (cm); prior centred near 1 cm]
Models
[Figure: predicted average error (bias, cm) vs. lateral shift (cm) for three models: full compensation, Bayesian compensation, and a learned mapping]
Results: single subject
[Figure: bias (cm) vs. lateral shift (cm) against the full-compensation, Bayesian and mapping predictions]
Supports model 2: Bayesian
Results: 10 subjects
[Figure: bias (cm) vs. lateral shift (cm) against the full-compensation, Bayesian and mapping predictions; the inferred prior (normalized) over lateral shift matches the imposed prior]
Supports model 2: Bayesian
Bayesian integration
Subjects can learn
multimodal priors
priors over forces
different priors one after the other
(Körding & Wolpert, NIPS 2004; Körding, Ku & Wolpert, J. Neurophysiol. 2004)
Statistics of the world shape our brain
Objects
Configurations of our body
Statistics of visual/auditory stimuli → representation in visual/auditory cortex
Statistics of early experience → what can be perceived in later life
(e.g. statistics of spoken language)
Statistics of action
With limited neural resources, statistics of motor tasks → motor performance
4 x 6-DOF electromagnetic sensors
battery & notebook PC
Phase relationships and symmetry bias
Multi-sensory integration
In general the relative weightings of the senses are sensitive to their direction-dependent variability
The CNS:
represents the distribution of tasks
estimates its own sensory uncertainty
combines these two sources in a Bayesian way
Supports an optimal integration model
Loss Functions in the Sensorimotor System
P(state|sensory input) ∝ P(sensory input|state) P(state)
Likelihood
Prior
Posterior
Probability
Posterior
Target Position
E[Loss] = ∫ Loss(state, action) P(state | sensory input) dstate
Bayes estimator: x̂_B(s) = argmin over actions of E[Loss]
What is the performance criterion (loss, cost, utility, reward)?
In statistics & machine learning it is often assumed that we wish to minimize squared error, for analytic or algorithmic tractability
What measure of error does the brain care about?
Loss function: f(error)
Two scenarios, two movements each; errors are (2, 2) in Scenario 1 and (1, 3) in Scenario 2:
Loss = error²: 4 + 4 = 8 vs. 1 + 9 = 10 → prefers Scenario 1
Loss = |error|: 2 + 2 = 4 vs. 1 + 3 = 4 → indifferent
Loss = √|error|: 1.4 + 1.4 = 2.8 vs. 1 + 1.7 = 2.7 → prefers Scenario 2
Loss=1+1.7=2.7
Virtual pea shooter
[Figure: skewed probability distribution of landing positions around the aimed mean; starting location; position axis −0.2 to 0.2 cm]
(Körding & Wolpert, PNAS, 2004)
Probed distributions and optimal means
Distributions: error distributions of varying asymmetry (α = 0.2, 0.3, 0.5, 0.8)
Possible loss functions:
MODE: maximize hits
MEDIAN: Loss = |error|
MEAN: Loss = (error)²
Robust estimator: downweights large errors
[Figure: distributions over error (cm), −2 to 1]
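The correspondence between loss function and optimal aim point (mode, median, mean) can be verified by brute force on a skewed distribution; the gamma distribution and hit tolerance below are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Outcomes from a skewed (gamma) distribution, chosen for illustration:
# mode = 1, median ~ 1.68, mean = 2
samples = rng.gamma(shape=2.0, scale=1.0, size=50_000)

candidates = np.linspace(0.0, 6.0, 301)

def best_aim(loss):
    """Aim point minimizing the average loss over the sampled outcomes."""
    mean_loss = [loss(np.abs(samples - c)).mean() for c in candidates]
    return candidates[int(np.argmin(mean_loss))]

hit_aim = best_aim(lambda e: (e > 0.2).astype(float))  # miss beyond 0.2 -> mode
abs_aim = best_aim(lambda e: e)                        # |error|  -> median
sq_aim = best_aim(lambda e: e ** 2)                    # error^2  -> mean
print(hit_aim, abs_aim, sq_aim)                        # roughly 1.0, 1.7, 2.0
```

On a skewed distribution the three optimal aim points separate, which is what lets the experiment infer the loss function from where subjects aim.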
Shift of mean against asymmetry (n=8)
Mean squared error with robustness to outliers
Personalised loss function
Loss = |error|^α, with α fit per subject (candidate values 1.0-3.9)
Bayesian decision theory
Increasing probability of avoiding keeper
Increasing probability of being within the net
Imposed loss function
(Trommershäuser et al 2003)
[Figure: stimulus configurations with reward (+100) and penalty (0, −100, −500) regions]
Optimal performance with complex regions
State estimation
State of the body/world
Set of time-varying parameters which together with
Dynamic equations of motion
Fixed parameters of the system (e.g. mass)
Allow prediction of the future behaviour
Tennis ball
Position
Velocity
Spin
State estimation
NOISE
NOISE
Observer
Kalman filter
Minimum variance estimator
Estimate the time-varying state
Can't directly observe the state, only a measurement
x_{t+1} = A x_t + B u_t + w_t
y_t = C x_t + v_t
x̂_{t+1} = A x̂_t + B u_t + K_t [y_t − C x̂_t]
State estimation
x̂_{t+1} = A x̂_t + B u_t   (forward dynamic model)
          + K_t [y_t − C x̂_t]   (Kalman-gain correction of the sensory error)
Kalman Filter
Optimal state estimation is a mixture
Predictive estimation (FF)
Sensory feedback (FB)
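A scalar sketch of the filter defined above; the system matrices and noise variances are assumed values for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Scalar instance of the slide's equations (A, B, C and the noise
# variances q, r are assumed):
#   x_{t+1} = A x_t + B u_t + w_t,   y_t = C x_t + v_t
A, B, C = 1.0, 1.0, 1.0
q, r = 0.01, 0.25                 # process / measurement noise variances

x, x_hat, P = 0.0, 0.0, 1.0       # true state, estimate, estimate variance
for t in range(200):
    u = 0.05                                          # motor command
    x = A * x + B * u + rng.normal(0.0, np.sqrt(q))   # true dynamics
    y = C * x + rng.normal(0.0, np.sqrt(r))           # noisy measurement

    x_pred = A * x_hat + B * u                        # forward-model prediction (FF)
    P_pred = A * P * A + q
    K = P_pred * C / (C * P_pred * C + r)             # Kalman gain K_t
    x_hat = x_pred + K * (y - C * x_pred)             # prediction + gain * innovation (FB)
    P = (1.0 - K * C) * P_pred

print(abs(x - x_hat), P)          # estimation error and its predicted variance
```

The converged estimate variance P is far below the measurement variance r: mixing prediction with feedback beats either alone.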
Eye position example: the location of an object is computed from retinal location and gaze direction
[Figure: motor command → forward model (FM) → predicted eye position; percept vs. actual]
Sensory likelihood
P(state|sensory input) ∝ P(sensory input|state) P(state)
(Wolpert & Kawato, Neural Networks 1998
Haruno, Wolpert, Kawato, Neural Computation 2001)
Sensory prediction
Our sensors report
Afferent information: changes in the outside world (external source)
Re-afferent information: changes we cause (internal source)
Tickling
Self-administered tactile stimuli rated as less ticklish than
externally administered tactile stimuli. (Weiskrantz et al, 1971)
Does prediction underlie tactile cancellation in tickle?
[Figure: tickle rating rank is lower for self-produced than for robot-produced tactile stimuli, P<0.001]
Gain control or precise spatio-temporal prediction?
Spatio-temporal prediction
[Figures: tickle rating rank rises with the delay between the movement and the touch (0, 100, 200, 300 ms) and with the rotation of the stimulus direction (0, 30, 60, 90 degrees), approaching the robot-produced/external condition; P<0.001]
(Blakemore, Frith & Wolpert. J. Cog. Neurosci. 1999)
The escalation of force
Tit-for-tat
Force escalates under rules designed
to achieve parity: Increase by ~40% per turn
(Shergill, Bays, Frith & Wolpert, Science, 2003)
Perception of force
70% overestimate in force
Perception of force
Labeling of movements: large sensory discrepancy
Defective prediction in patients with schizophrenia (patients vs. controls)
The CNS predicts sensory consequences
Sensory cancellation in force production
Defects may be related to delusions of control
Motor Learning
Required if:
the organism's environment, body or task change
changes are unpredictable so cannot be pre-specified
we want to master social-convention skills, e.g. writing
Trade-off between:
innate behaviour (evolution): hard-wired, fast, resistant to change
learning (intra-life): adaptable, slow, malleable
Motor Learning
Supervised learning is good for forward models: the predicted outcome can be compared to the actual behaviour to generate an error signal
Weakly electric fish (Bell 2001)
Produce electric pulses to
recognize objects in the dark or in murky habitats
for social communication.
The fish electric organ is composed of electrocytes,
modified muscle cells producing action potentials
EOD = electric organ discharges
Amplitude of the signal is between 30 mV and 7V
Driven by a pacemaker in the medulla, which triggers each discharge
Sensory filtering
Skin receptors are derived from the lateral line system
Removal of expected or predicted sensory input is one of the very general
functions of sensory processing.
Predictive/associative mechanisms for changing environments
Primary afferent terminate in cerebellar-like structures
Primary afferents terminate on principal cells either directly or via interneurons
Block EOD discharge with curare
Specific for Timing (120ms), Polarity, Amplitude & Spatial distribution
Proprioceptive Prediction
Tail bend affects sensory feedback
Passive bend phase-locked to the stimulus
Learning rule
Changes in synaptic strength require principal-cell spike discharge
The change depends on the timing of the EPSP relative to the spike (T2 − T1)
Anti-Hebbian learning
A forward model can be learned through self-supervised learning
Anti-Hebbian rule in the cerebellar-like structure of the electric fish
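A toy sketch of this negative-image idea (all sizes, rates and signals below are assumed): a weight vector driven by an anti-Hebbian rule converges to the negative image of the predictable re-afferent input and cancels it, leaving only unpredicted input.

```python
import numpy as np

rng = np.random.default_rng(3)

# Anti-Hebbian negative-image learning, in the spirit of the electric fish:
# corollary-discharge weights w learn to cancel the predictable part of the
# sensory input after each discharge. Sizes and rates are assumed.
T = 50                                            # time bins after each command
reafference = np.sin(np.linspace(0, np.pi, T))    # predictable self-generated input
w = np.zeros(T)                                   # negative-image weights
eta = 0.05

for trial in range(500):
    sensory = reafference + 0.05 * rng.normal(size=T)  # self-generated + noise
    out = sensory + w                                  # principal-cell response
    w -= eta * out        # anti-Hebbian: depress in proportion to the output

residual = np.abs(reafference + w).mean()
print(residual)           # near zero: w converged to the negative image
```

Because the update is driven by the cell's own output, any consistently predictable input is removed, which is why only unexpected (externally caused) stimulation survives.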
Motor planning (what is the goal of motor control)
Tasks are usually specified at a symbolic level
Motor system works at a detailed level, specifying muscle activations
Gap between high and low-level specification
Any high level task can be achieved in infinitely many low-level ways
Duration
Hand Trajectory
Joint
Muscles
Motor evolution/learning results in stereotypy
Stereotypy between repetitions and individuals
Eye-saccades
Arm- movements
Main sequence (saccades)
Donders' law
Listing's law
2/3 power law
Fitts' law
Models
HOW models
Neurophysiological or black box models
Explain roles of brain areas/processing units in
generating behavior
WHY models
Why did the How system get to be the way it is?
Unifying principles of movement production
Evolutionary/Learning
Assume few neural constraints
The Assumption of Optimality
Movements have evolved to maximize fitness
improve through evolution/learning
every possible movement which can achieve a task has a cost
we select movement with the lowest cost
Overall cost = cost1 + cost2 + cost3 + …
Optimality principles
Parsimonious performance criteria → elaborate predictions
Requires:
admissible control laws (here: open-loop)
musculoskeletal & world model
scalar quantitative definition of task performance, usually a time integral of f(state, action)
What is the cost?
Occasionally the task specifies the cost:
jump as high as possible
exert maximal force
Usually the task does not specify the cost directly:
locomotion is well modelled by energy minimization
energy alone is not a good cost for eyes or arms
What is the cost?
Saccadic eye movements
little useful vision above ~4 deg/sec of retinal slip
frequent: 2-3 per second
deprive us of vision for ~90 minutes/day
→ minimize movement time
Arm movements
Movements are smooth
Minimum jerk (rate of change of acceleration) of the hand
(Flash & Hogan 1985)
Cost = ∫0..T [ x'''(t)² + y'''(t)² ] dt
x(t) = x0 + (xT − x0)(10τ³ − 15τ⁴ + 6τ⁵)
y(t) = y0 + (yT − y0)(10τ³ − 15τ⁴ + 6τ⁵)
τ = t/T
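The closed-form profile above can be evaluated directly; the amplitude and duration below are assumed values.

```python
import numpy as np

# Minimum-jerk position profile from the slide's closed form
# x(t) = x0 + (xT - x0)(10 tau^3 - 15 tau^4 + 6 tau^5), tau = t/T
def min_jerk(x0, xT, T, n=101):
    tau = np.linspace(0.0, 1.0, n)
    return x0 + (xT - x0) * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)

T = 0.5                                 # 500 ms reach (assumed)
x = min_jerk(0.0, 10.0, T)              # 10 cm amplitude (assumed)
v = np.gradient(x, T / 100)             # numerical velocity

print(x[0], x[-1])                      # endpoints 0 and 10 are met
print(np.argmax(v))                     # 50: bell-shaped speed, peaking at T/2
```

The profile starts and ends at rest with a symmetric bell-shaped speed curve, the stereotyped shape observed in point-to-point reaches.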
Smoothness: minimum torque change (Uno et al, 1989)
Cost = ∫0..T [ (dτs/dt)² + (dτe/dt)² ] dt   (shoulder and elbow torques τs, τe)
The ideal cost for goal-directed movement
Makes sense - some evolutionary/learning advantage
Simple for CNS to measure
Generalizes to different systems
e.g. eye, head, arm
Generalizes to different tasks
e.g. pointing, grasping, drawing
Reproduces & predicts behavior
Motor command noise
[Figure: noise added to the motor command enters the motor system; position output; error minimized by rapidity]
Fundamental constraint=Signal-dependent noise
Signal-dependent noise:
Constant coefficient of variation
SD (motor command) ~ Mean (motor command)
Evidence from
Experiments: SD (Force) ~ Mean (Force)
Modelling
Spikes drawn from a renewal process
Recruitment properties of motor units
(Jones, Hamilton & Wolpert , J. Neurophysiol., 2002)
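The constant coefficient of variation is easy to simulate; the value of k below is assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Signal-dependent noise: the SD of the realized force grows with its mean,
# so the coefficient of variation stays constant (k = 0.1 assumed).
k = 0.1
means = np.array([1.0, 5.0, 20.0])                     # commanded levels (a.u.)
forces = means + rng.normal(size=(100_000, 3)) * (k * means)

cv = forces.std(axis=0) / forces.mean(axis=0)
print(np.round(cv, 3))                                  # ~[0.1 0.1 0.1]
```

Because SD scales with the mean, larger commands are intrinsically less precise, which is the constraint the minimum-variance framework exploits.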
Task optimization in the presence of SDN
An average motor command → a probability distribution (statistics) of movement
Controlling the statistics of action
Given SDN, task optimization = optimizing f(statistics)
Finding optimal trajectories for linear systems
System with impulse response p(t); signal-dependent noise: Var[u(t)] = k² u²(t)
Constraints (movement of amplitude A, duration M):
E[x(M)] = ∫0..M u(τ) p(M − τ) dτ = A
E[x⁽ⁿ⁾(M)] = ∫0..M u(τ) p⁽ⁿ⁾(M − τ) dτ = 0   (end at rest)
Cost (post-movement variance at time T):
Var[x(T)] = ∫0..M Var[u(τ)] p²(T − τ) dτ = ∫0..M k² u²(τ) p²(T − τ) dτ = ∫0..M w(τ) u²(τ) dτ
Linear constraints with quadratic cost:
can use quadratic programming or isoperimetric optimization
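A discretized sketch of this "linear constraints with quadratic cost" step; the plant (a 2nd-order impulse response), durations and bin size are all assumed for illustration.

```python
import numpy as np

# Discretized minimum-variance command under signal-dependent noise,
# solved as an equality-constrained quadratic program (assumed plant).
dt, M, H = 0.01, 100, 50                 # 1 s movement + 0.5 s hold, 10 ms bins
t = np.arange(M + H) * dt
p = t * np.exp(-t / 0.1)                 # impulse response p(t)
dp = np.gradient(p, dt)                  # p'(t)

# Constraints C u = b: mean position A = 1 and mean velocity 0 at time M
P = np.array([p[M - j] for j in range(M)]) * dt
D = np.array([dp[M - j] for j in range(M)]) * dt
C = np.vstack([P, D])
b = np.array([1.0, 0.0])

# Cost sum_j w_j u_j^2: SDN variance u_j^2 p^2 accumulated over the hold period
w = np.array([np.sum(p[M - j:M + H - j] ** 2) for j in range(M)])

# Closed form for min u^T diag(w) u subject to C u = b:
#   u* = W^-1 C^T (C W^-1 C^T)^-1 b
Winv = np.diag(1.0 / w)
u = Winv @ C.T @ np.linalg.solve(C @ Winv @ C.T, b)

print(P @ u, D @ u)                      # constraint check: amplitude met, at rest
```

Because the cost is quadratic in u and the constraints are linear, the Lagrangian solution is available in closed form; larger problems would use a general QP solver.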
Saccade predictions
3rd-order linear system; SDN on the motor command
Minimizing the jerk of the motor command predicts a very slow saccade: 22 degrees in 270 ms (normally ~70 ms)
[Figure: eye position (degrees) vs. time (ms)]
Head-free saccades
Free parameter: eye-to-head noise ratio
[Figures: predicted gaze, head and eye trajectories (degrees vs. time in ms) for noise parameters σ1=0.15, σ2=0.08 and σ1=0.3, σ2=0.3, compared with data]
(Tomlinson & Bahra, 1986)
Coordination: head and eye
For a fixed duration T, Var(A) = k A²
Eye only: Var(A) = k A²
Head only: Var(A) = k A²
Eye & head (each moving A/2): Var(A) = k (A/2)² + k (A/2)² = k A²/2
Movement extent vs. target eccentricity
[Figure: angular deviation of eye and head at target acquisition as a function of gaze amplitude (degrees)]
Arm movements
Smoothness
A non-smooth movement requires an abrupt change in velocity; given a low-pass system this needs a large motor command, and hence increased noise
Smoothness ↔ accuracy
Drawing (2/3 power law): f = path error
Obstacle avoidance: f = limit on the probability of collision
Feedforward control
Ignores role of feedback
Generates desired movements
Cannot model trial-to-trial variability
Optimal feedback control (Todorov 2004)
Optimize performance over all possible feedback control laws
Treats feedback law as fully programmable
command=f(state)
Models based on reinforcement learning: optimal cost-to-go functions
Requires a Bayesian state estimator
Minimal intervention principle
Do not correct deviations
from average behaviour
unless they affect task
performance
Acting is expensive
energetically
noise
Leads to
uncontrolled manifold
synergies
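The minimal intervention principle can be seen in a toy linear-quadratic sketch (dynamics, costs and horizon all assumed): when only one state dimension enters the task cost, the optimal feedback gain on the other dimension is zero, so deviations along it go uncorrected.

```python
import numpy as np

# Finite-horizon LQR via the backward Riccati recursion. Two independent
# state dimensions; only dimension 0 is task-relevant (assumed toy system).
A = np.eye(2)                 # simple persistent dynamics
B = np.eye(2)
Q = np.diag([1.0, 0.0])       # state cost: only dim 0 matters for the task
R = 0.1 * np.eye(2)           # effort cost on both commands

S = Q.copy()                  # terminal cost-to-go
gains = []
for _ in range(20):           # backward pass: K_t = (R + B'SB)^-1 B'SA
    K = np.linalg.solve(R + B.T @ S @ B, B.T @ S @ A)
    S = Q + A.T @ S @ (A - B @ K)
    gains.append(K)
Ks = gains[::-1]              # gains in forward-time order

print(np.round(Ks[0], 3))     # the gain on dimension 1 is zero
```

Acting costs effort (and, with SDN, adds noise), so the optimal law corrects only task-relevant error: the signature of the uncontrolled manifold and of synergies.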
Optimal control with SDN
Biologically plausible theoretical underpinning
for both eye, head, arm movements
No need to construct highly derived signals to
estimate the cost of the movement
Controlling statistics in the presence of noise
What is being adapted?
Possible to break down the control process:
Visuomotor rearrangements
Dynamic perturbations
[timing, coordination, sequencing]
Internal models capture the relationship between sensory and motor variables
Altering dynamics
Altering Kinematics
Representation of transformations
Look-up table: stored x → θ pairs (1 → 10, 3 → 35, …): high storage, high flexibility, low generalization
Non-physical parameters: θ = f(x, α), e.g. asin
Physical parameters: θ = acos(x/L): low storage, low flexibility, high generalization
Generalization paradigm
Baseline: assess performance over the domain of interest (e.g. workspace)
Exposure: perturbation (new task); limit the exposure to a subdomain
Test: re-assess performance over the entire domain of interest
Difficulty of learning
(Cunningham 1989, JEPP-HPP)
Rotations of the visual field from 0 to 180 degrees
Difficulty increases from 0 to 90 and decreases from 120 to 180
What is the natural parameterization?
Viscous curl field
(Shadmehr & Mussa-Ivaldi 1994, J. Neurosci.)
Representation from generalization: Dynamic
1. Test: movements over the entire workspace
2. Learning: viscous curl field in the right-hand workspace
3. Test: movements over the left workspace
Two possible interpretations: force = f(hand velocity) or torque = f(joint velocity)
Result: joint-based learning of dynamics (Shadmehr & Mussa-Ivaldi 1994, J. Neurosci.)
[Figure: left-hand workspace before and after learning, compared with the Cartesian-field prediction]
Visuomotor coordinates
Joint angles (θ1, θ2, θ3); spherical polar (r, θ, φ); Cartesian (x, y, z)
Representation: Visuomotor
1. Test: pointing accuracy to a set of targets
2. Learning: visuomotor remapping with feedback only at one target
3. Test: pointing accuracy to a set of targets
Prediction: generalization follows eye-centred spherical coordinates
[Figure: predicted pointing errors across target planes (y = 16.2-36.2 cm, z = −43.6 to −27.6 cm) vs. x (cm)]
(Vetter et al, J. Neurophys, 1999)
Generalization paradigms can be used to assess:
extent of generalization
coordinate system of transformations
Altering dynamics: Viscous curl field
[Figure: hand paths before the force, early and late during exposure to the force, and on removal of the force]
Stiffness control
A muscle's activation level sets its spring constant k (or resting length)
Equilibrium point control: a set of muscle activations (k1, k2, k3) defines a posture
The CNS learns a spatial mapping, e.g. hand positions (x, y, z) → muscle activations (k1, k2, k3)
Equilibrium control: low stiffness vs. high stiffness
The hand stiffness can vary with muscle activation levels.
Controlling stiffness
Burdet et al (Nature, 2001)
Stiffness ellipses
Internal models to learn stable tasks
Stiffness for unpredictable tasks
Summary
Sensorimotor integration
Static multi-sensory integration
Bayesian integration
Dynamic sensor fusion & the Kalman filter
Action evaluation
Intrinsic loss function
Extrinsic loss functions
Prediction
Internal model and likelihood estimation
Sensory filtering
Control
Optimal feedforward control
Optimal feedback control
Motor learning of predictable and stochastic environments
Wolpert-lab papers on www.wolpertlab.com
References
Bell, C. (2001). "Memory-based expectations in electrosensory systems." Current Opinion in Neurobiology 11: 481-487.
Burdet, E., R. Osu, et al. (2001). "The central nervous system stabilizes unstable dynamics by
learning optimal impedance." Nature 414(6862): 446-9.
Cunningham, H. A. (1989). "Aiming error under transformed spatial maps suggest a structure for
visual-motor maps." J. Exp. Psychol. 15:3: 493-506.
Ernst, M. O. and M. S. Banks (2002). "Humans integrate visual and haptic information in a
statistically optimal fashion." Nature 415(6870): 429-33.
Flash, T. and N. Hogan (1985). "The co-ordination of arm movements: An experimentally
confirmed mathematical model " J. Neurosci. 5: 1688-1703.
Shadmehr, R. and F. Mussa-Ivaldi (1994). "Adaptive representation of dynamics during learning
of a motor task." J. Neurosci. 14:5: 3208-3224.
Todorov, E. (2004). "Optimality principles in sensorimotor control." Nat Neurosci 7(9): 907-15.
Trommershauser, J., L. T. Maloney, et al. (2003). "Statistical decision theory and the selection of
rapid, goal-directed movements." J Opt Soc Am A Opt Image Sci Vis 20(7): 1419-33.
Uno, Y., M. Kawato, et al. (1989). "Formation and control of optimal trajectories in human
multijoint arm movements: Minimum torque-change model " Biological Cybernetics 61: 89-101.
van Beers, R. J., A. C. Sittig, et al. (1998). "The precision of proprioceptive position sense." Exp
Brain Res 122(4): 367-77.
Weiskrantz, L., J. Elliott, et al. (1971). "Preliminary observations on tickling oneself." Nature
230(5296): 598-9.