Experimental Design and Statistical Parametric Mapping Ch3

Chapter 3

Spatial Normalization using Basis Functions

John Ashburner & Karl J. Friston

The Wellcome Dept. of Imaging Neuroscience,
12 Queen Square, London WC1N 3BG, UK.

Contents
3.1 Introduction
3.2 Method
3.2.1 A Maximum A Posteriori Solution
3.2.2 Affine Registration
3.2.3 Nonlinear Registration
3.2.4 Linear Regularization for Nonlinear Registration
3.2.5 Templates and Intensity Transformations
3.3 Discussion

Abstract
This chapter describes the steps involved in registering images of different subjects
into roughly the same co-ordinate system, where the co-ordinate system is defined
by a template image (or series of images). The method only uses up to a few hundred
parameters, so can only model global brain shape. It works by estimating the
optimum coefficients for a set of bases, by minimizing the sum of squared differences
between the template and source image, while simultaneously maximizing the
smoothness of the transformation using a maximum a posteriori (MAP) approach.


3.1 Introduction
Sometimes it is desirable to warp images from a number of individuals into roughly the same
standard space to allow signal averaging across subjects. This procedure is known as spatial
normalization. In functional imaging studies, spatial normalization of the images is useful for
determining what happens generically over individuals. A further advantage of using spatially
normalized images is that activation sites can be reported according to their Euclidean co-ordinates
within a standard space [21]. The most commonly adopted co-ordinate system within the brain
imaging community is that described by [32], although new standards are now emerging that are
based on digital atlases [18, 19, 27].
Methods of registering images can be broadly divided into label based and intensity based.
Label based techniques identify homologous features (labels) in the source and reference images
and find the transformations that best superpose them. The labels can be points, lines or
surfaces. Homologous features are often identified manually, but this process is time consuming and
subjective. Another disadvantage of using points as landmarks is that there are very few readily
identifiable discrete points in the brain. Lines and surfaces are more readily identified, and in
many instances they can be extracted automatically (or at least semi-automatically). Once they
are identified, the spatial transformation is effected by bringing the homologies together. If the
labels are points, then the required transformation at each of those points is known. Between
the points, the deforming behavior is not known, so it is forced to be as ‘smooth’ as possible.
There are a number of methods for modeling this smoothness. The simplest models include fitting
splines through the points in order to minimize bending energy [5, 4]. More complex forms of
interpolation, such as viscous fluid models, are often used when the labels are surfaces [35, 16].
Intensity (non-label) based approaches identify a spatial transformation that optimizes some
voxel-similarity measure between a source and reference image, where both are treated as
unlabeled continuous processes. The matching criterion is usually based upon minimizing the sum of
squared differences or maximizing the correlation between the images. For this criterion to be
successful, it requires the reference to appear like a warped version of the source image. In other
words, there must be correspondence in the grey levels of the different tissue types between the
source and reference images. In order to warp together images of different modalities, a few
intensity based methods have been devised that involve optimizing an information theoretic measure
[31, 33]. Intensity matching methods are usually very susceptible to poor starting estimates, so
more recently a number of hybrid approaches have emerged that combine intensity based methods
with matching user defined features (typically sulci).
A potentially enormous number of parameters are required to describe the nonlinear
transformations that warp two images together (i.e., the problem is very high dimensional). However,
much of the spatial variability can be captured using just a few parameters. Some research groups
use only a nine- or twelve-parameter affine transformation to approximately register images of
different subjects, accounting for differences in position, orientation and overall brain dimensions.
Low spatial frequency global variability of head shape can be accommodated by describing
deformations by a linear combination of low frequency basis functions. One widely used basis function
registration method is part of the AIR package [36, 37], which uses polynomial basis functions to
model shape variability. For example, a two dimensional third order polynomial basis function
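mapping can be defined:

$$\begin{aligned}
y_1 = {} & q_1 + q_2 x_1 + q_3 x_1^2 + q_4 x_1^3 \\
& + q_5 x_2 + q_6 x_1 x_2 + q_7 x_1^2 x_2 \\
& + q_8 x_2^2 + q_9 x_1 x_2^2 \\
& + q_{10} x_2^3 \\
y_2 = {} & q_{11} + q_{12} x_1 + q_{13} x_1^2 + q_{14} x_1^3 \\
& + q_{15} x_2 + q_{16} x_1 x_2 + q_{17} x_1^2 x_2 \\
& + q_{18} x_2^2 + q_{19} x_1 x_2^2 \\
& + q_{20} x_2^3
\end{aligned}$$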
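As a rough illustration of the polynomial mapping above, the following sketch evaluates the twenty-coefficient model with NumPy. The function names `poly3_basis` and `poly3_map` and the zero-based coefficient layout are assumptions made for this example; they are not taken from the AIR package.

```python
import numpy as np

def poly3_basis(x1, x2):
    """Return the ten polynomial terms in the order used for q1..q10 in the text."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    return np.stack([
        np.ones_like(x1), x1, x1**2, x1**3,   # q1 .. q4
        x2, x1 * x2, x1**2 * x2,              # q5 .. q7
        x2**2, x1 * x2**2,                    # q8, q9
        x2**3,                                # q10
    ], axis=-1)

def poly3_map(q, x1, x2):
    """Map (x1, x2) to (y1, y2); q[0:10] drives y1 and q[10:20] drives y2."""
    q = np.asarray(q, dtype=float)
    B = poly3_basis(x1, x2)
    return B @ q[:10], B @ q[10:]

# Identity mapping: y1 = x1 (q2 = 1) and y2 = x2 (q15 = 1).
q_identity = np.zeros(20)
q_identity[1] = 1.0
q_identity[14] = 1.0
```

With `q_identity`, every point maps to itself; perturbing the higher-order coefficients introduces smooth, global distortions.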

Other low-dimensional registration methods may employ a number of other forms of basis function
to parameterize the warps. These include Fourier bases [10], sine and cosine transform basis
functions [9, 3], B-splines [31, 33], and piecewise affine or trilinear basis functions (see [25] for a
review). The small number of parameters will not allow every feature to be matched exactly, but
it will permit the global head shape to be modeled. The rationale for adopting a low dimensional
approach is that it allows rapid modeling of global brain shape.
The deformations required to transform images to the same space are not clearly defined.
Unlike rigid-body transformations, where the constraints are explicit, those for warping are more
arbitrary. Regularization schemes are therefore necessary when attempting image registration
with many parameters, thus ensuring that voxels remain close to their neighbors. Regularization
is often incorporated by some form of Bayesian scheme, using estimators such as the maximum a
posteriori (MAP) estimate or the minimum variance estimate (MVE). Often, the prior probability
distributions used by registration schemes are linear, and include minimizing the membrane energy
of the deformation field [1, 24], the bending energy [5] or the linear-elastic energy [28, 16]. None
of these linear penalties explicitly preserves the topology (see footnote 1) of the warped images, although cost
functions that incorporate this constraint have been devised [17, 2]. A number of methods involve
repeated Gaussian smoothing of the estimated deformation fields [14]. These methods can be
classed among the elastic registration methods because convolving a deformation field is a form
of linear regularization [8].
An alternative to using a Bayesian scheme incorporating some form of elastic prior could
be to use a viscous fluid model [11, 12, 8, 34] to estimate the warps. In these models, finite
difference methods are often used to solve the partial differential equations that model one image
as it “flows” to the same shape as the other. The major advantage of these methods is that
they are able to account for large deformations and also ensure that the topology of the warped
image is preserved. Viscous fluid models are able to warp almost any image so that it looks
like any other image, while still preserving the original topology. These methods can be classed
as “plastic” as it is not the deformation fields themselves that are regularized, but rather the
increments to the deformations at each iteration.

3.2 Method
This chapter describes the steps involved in registering images of different subjects into roughly
the same co-ordinate system, where the co-ordinate system is defined by a template image (or
series of images).

1 The word “topology” is used in the same sense as in “Topological Properties of Smooth Anatomical Maps” [13]. If spatial transformations are not one-to-one and continuous, then the topological properties of different structures can change.

This section begins by introducing a modification to the optimization method described in
Section ??, such that more robust maximum a posteriori (MAP) parameter estimates can be
obtained. It works by estimating the optimum coefficients for a set of bases, by minimizing
the sum of squared differences between the template and source image, while simultaneously
minimizing the deviation of the transformation from its expected value. In order to adopt the
MAP approach, it is necessary to have estimates of the likelihood of obtaining the fit given the
data, which requires prior knowledge of spatial variability, and also knowledge of the variance
associated with each observation. True Bayesian approaches assume that the variance associated
with each voxel is already known, whereas the approach described here is a type of Empirical
Bayesian method, which attempts to estimate this variance from the residual errors (see Chapter
13). Because the registration is based on smooth images, correlations between neighboring voxels
are considered when estimating the variance. This makes the same approach suitable for the
spatial normalization of both high quality MR images, and low resolution noisy PET images.
The first step in registering images from different subjects involves determining the optimum
12 parameter affine transformation. Unlike Chapter 2 – where the images to be matched together
are from the same subject – zooms and shears are needed to register heads of different shapes and
sizes. Prior knowledge of the variability of head sizes is included within a Bayesian framework in
order to increase the robustness and accuracy of the method.
The next part describes nonlinear registration for correcting gross differences in head shapes
that cannot be accounted for by the affine normalization alone. The nonlinear warps are modeled
by linear combinations of smooth discrete cosine transform basis functions. A fast algorithm is
described that utilizes Taylor’s Theorem and the separable nature of the basis functions, meaning
that most of the nonlinear spatial variability between images can be automatically corrected
within a few minutes. For speed and simplicity, a relatively small number of parameters
(approximately 1000) are used to describe the nonlinear components of the registration. The MAP
scheme requires some form of prior distribution for the basis function coefficients, so a number of
different forms for this distribution are then presented.
The last part of this section describes a variety of possible models for intensity transforms. In
addition to spatial transformations, it is sometimes desirable to also include intensity transforms
in the registration model, as one image may not look exactly like a spatially transformed version
of the other.

3.2.1 A Maximum A Posteriori Solution

A Bayesian registration scheme is used in order to obtain a maximum a posteriori estimate
of the registration parameters. Given some prior knowledge of the variability of brain shapes
and sizes that may be encountered, a MAP registration scheme is able to give a more accurate
(although biased) estimate of the true shapes of the brains. This is illustrated by a very simple
one dimensional example in Figure 3.1. The use of a MAP parameter estimate reduces any
potential over-fitting of the data, which may lead to unnecessary deformations that only reduce
the residual variance by a tiny amount. It also makes the registration scheme more robust by
reducing the search space of the algorithm, and therefore the number of potential local minima.
Bayes’ rule can be expressed as:

p(q|b) ∝ p(b|q)p(q) (3.1)

where p(q) is the prior probability of parameters q, p(b|q) is the conditional probability that
b is observed given q and p(q|b) is the posterior probability of q, given that measurement b
has been made. The maximum a posteriori (MAP) estimate for parameters q is the mode of
p(q|b). The maximum likelihood (ML) estimate is a special case of the MAP estimate, in which
p(q) is uniform over all values of q. For our purposes, p(q) represents a known prior probability
distribution from which the parameters are drawn, p(b|q) is the likelihood of obtaining the data
b given the parameters, and p(q|b) is the function to be maximized. The optimization can be
simplified by assuming that all probability distributions can be approximated by multi-normal
(multidimensional and normal) distributions, and can therefore be described by a mean vector
and a covariance matrix.

Figure 3.1: This figure illustrates a hypothetical example with one parameter, where the prior
probability distribution is better described than the likelihood. The solid Gaussian curve (a)
represents the prior probability distribution (p.d.f.), and the dashed curve (b) represents a maximum
likelihood parameter estimate (from fitting to observed data) with its associated certainty.
The true parameter is known to be drawn from distribution (a), but it can be estimated with
the certainty described by distribution (b). Without the MAP scheme, a more precise estimate
would probably be obtained for the true parameter by taking the most likely a priori value, rather
than the value obtained from a maximum likelihood fit to the data. This would be analogous to
cases where the number of parameters is reduced in a maximum likelihood registration model in
order to achieve a better solution (e.g., see Section 3.2.2). The dotted line (c) shows the posterior p.d.f.
obtained using Bayesian statistics. The maximum value of (c) is the MAP estimate. It combines
previously known information with that from the data to give a more accurate estimate.

A probability is related to its Gibbs form by $p(a) \propto e^{-H(a)}$. Therefore the posterior probability
is maximized when its Gibbs form is minimized. This is equivalent to minimizing $H(\mathbf{b}|\mathbf{q}) + H(\mathbf{q})$
(the posterior potential). In this expression, $H(\mathbf{b}|\mathbf{q})$ (the likelihood potential) is related
to the residual sum of squares. If the parameters are assumed to be drawn from a multi-normal
distribution described by a mean vector $\mathbf{q}_0$ and covariance matrix $\mathbf{C}_0$, then $H(\mathbf{q})$ (the prior
potential) is simply given by:

$$H(\mathbf{q}) = (\mathbf{q} - \mathbf{q}_0)^T \mathbf{C}_0^{-1} (\mathbf{q} - \mathbf{q}_0)$$

Eqn. ?? gives the following maximum likelihood updating rule for the parameter estimation:

$$\mathbf{q}_{ML}^{(n+1)} = \mathbf{q}^{(n)} - \left(\mathbf{A}^T \mathbf{A}\right)^{-1} \mathbf{A}^T \mathbf{b} \qquad (3.2)$$

Assuming equal variance for each observation ($\sigma^2$) and ignoring covariances among them, the
formal covariance matrix of the fit on the assumption of normally distributed errors is given by
$\sigma^2 \left(\mathbf{A}^T \mathbf{A}\right)^{-1}$. When the distributions are normal, the MAP estimate is simply the average of the
prior and likelihood estimates, weighted by the inverses of their respective covariance matrices:

$$\mathbf{q}^{(n+1)} = \left(\mathbf{C}_0^{-1} + \mathbf{A}^T \mathbf{A}/\sigma^2\right)^{-1} \left(\mathbf{C}_0^{-1} \mathbf{q}_0 + \mathbf{A}^T \mathbf{A}\, \mathbf{q}_{ML}^{(n+1)}/\sigma^2\right) \qquad (3.3)$$

The MAP optimization scheme is obtained by combining Eqns. 3.2 and 3.3:

$$\mathbf{q}^{(n+1)} = \left(\mathbf{C}_0^{-1} + \mathbf{A}^T \mathbf{A}/\sigma^2\right)^{-1} \left(\mathbf{C}_0^{-1} \mathbf{q}_0 + \mathbf{A}^T \mathbf{A}\, \mathbf{q}^{(n)}/\sigma^2 - \mathbf{A}^T \mathbf{b}/\sigma^2\right) \qquad (3.4)$$
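A single iteration of Eqns. 3.2 and 3.4 can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the function names `ml_update` and `map_update` are hypothetical, `A` is taken to be the matrix of derivatives of the residuals `b` with respect to the parameters, and linear systems are solved directly rather than forming explicit inverses.

```python
import numpy as np

def ml_update(q, A, b):
    """One maximum likelihood (Gauss-Newton) step, as in Eqn. 3.2."""
    return q - np.linalg.solve(A.T @ A, A.T @ b)

def map_update(q, A, b, q0, C0_inv, sigma2):
    """One MAP step, as in Eqn. 3.4.

    q0 and C0_inv are the prior mean and inverse prior covariance;
    sigma2 is the (assumed common) variance of the observations.
    """
    AtA = A.T @ A
    Atb = A.T @ b
    lhs = C0_inv + AtA / sigma2
    rhs = C0_inv @ q0 + AtA @ q / sigma2 - Atb / sigma2
    return np.linalg.solve(lhs, rhs)
```

When `C0_inv` is all zeros (an uninformative prior), `map_update` returns the same result as `ml_update`, which matches the statement that the ML estimate is a special case of the MAP estimate.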

For the sake of the registration, it is assumed that the exact form for the prior probability
distribution ($N(\mathbf{q}_0, \mathbf{C}_0)$) is known. However, because the registration may need to be done on a
wide range of different image modalities, with differing contrasts and signal to noise ratios, it is
not possible to easily and automatically know what value to use for $\sigma^2$. In practice, $\sigma^2$ is assumed
to be the same for all observations, and is estimated from the sum of squared differences from
the current iteration:

$$\sigma^2 = \sum_{i=1}^{I} b_i(\mathbf{q})^2 / \nu$$

where $\nu$ refers to the degrees of freedom. If the sampling is sparse relative to the smoothness,
then $\nu \simeq I - J$, where $I$ is the number of sampled locations in the images and $J$ is the number
of estimated parameters (see footnote 2).

2 Strictly speaking, the computation of the degrees of freedom should be more complicated than this, as this simple model does not account for the regularization.

However, complications arise because the images are smooth, resulting in the observations not
being independent, and a reduction in the effective number of degrees of freedom. The degrees of
freedom are corrected using the principles described by [23] (although this approach is not strictly
correct [38], it gives an estimate that is close enough for these purposes). The effective degrees of
freedom are estimated by assuming that the difference between f and g approximates a continuous,
zero-mean, homogeneous, smoothed Gaussian random field. The approximate parameter of a
Gaussian point spread function describing the smoothness in direction k (assuming that the axes
of the Gaussian are aligned with the axes of the image co-ordinate system) can be obtained by
[29]:

$$w_k = \sqrt{\frac{\sum_{i=1}^{I} b_i(\mathbf{q})^2}{2 \sum_{i=1}^{I} \left(\nabla_k b_i(\mathbf{q})\right)^2}}$$

Multiplying $w_k$ by $\sqrt{8 \log_e(2)}$ produces an estimate of the full width at half maximum of the
Gaussian. If the images are sampled on a regular grid where the spacing in each direction is $s_k$,
the number of effective degrees of freedom (see footnote 3) becomes approximately:

$$\nu = (I - J) \prod_k \frac{s_k}{w_k (2\pi)^{1/2}}$$

This is essentially a scaling of $I - J$ by the number of resolution elements per voxel.
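The variance estimate described above can be sketched as follows. The helper names `effective_dof` and `estimate_sigma2` are hypothetical, the spatial derivatives are approximated with `np.gradient` (a finite-difference assumption, not necessarily how the derivatives are obtained in practice), and each per-direction factor is capped at one so that the result never exceeds I − J, in the spirit of footnote 3.

```python
import numpy as np

def effective_dof(b, n_params, spacing):
    """Approximate effective degrees of freedom of a residual image b.

    b        : residual image (1-D, 2-D or 3-D array)
    n_params : number of estimated parameters (J in the text)
    spacing  : sample spacing s_k along each axis
    """
    b = np.asarray(b, dtype=float)
    ss = np.sum(b**2)
    dof = float(b.size - n_params)                 # I - J
    for k, s_k in enumerate(spacing):
        grad_k = np.gradient(b, s_k, axis=k)       # finite-difference derivative
        w_k = np.sqrt(ss / (2.0 * np.sum(grad_k**2)))
        # Resolution elements per voxel along axis k, never more than one.
        dof *= min(1.0, s_k / (w_k * np.sqrt(2.0 * np.pi)))
    return dof

def estimate_sigma2(b, n_params, spacing):
    """Residual variance: sum of squared residuals over effective d.o.f."""
    b = np.asarray(b, dtype=float)
    return np.sum(b**2) / effective_dof(b, n_params, spacing)
```

A smoother residual image gives larger values of `w_k`, fewer effective degrees of freedom, and therefore a larger estimate of the variance for the same sum of squares.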

This approach has the advantage that when the parameter estimates are far from the solution,
$\sigma^2$ is large, so the problem becomes more heavily regularized with more emphasis being placed
on the prior information. For nonlinear warping, this is analogous to a coarse to fine registration
scheme. The penalty against higher frequency warps is greater than that for those of low frequency
(see Section 3.2.4). In the early iterations, the estimated $\sigma^2$ is higher, leading to a heavy penalty
against all warps, but with more against those of higher frequency. The algorithm does not fit
much of the high frequency information until $\sigma^2$ has been reduced. In addition to a gradual
reduction in $\sigma^2$ due to the decreasing residual squared difference, $\sigma^2$ is also reduced because
the estimated smoothness is decreased, leading to more effective degrees of freedom. Both these
factors are influential in making the registration scheme more robust to local minima.

3.2.2 Affine Registration

Almost all between subject co-registration or spatial normalization methods for brain images
begin by determining the optimal nine or twelve parameter affine transformation that registers the
images together. This step is normally performed automatically by minimizing (or maximizing)
some mutual function of the images. The objective of affine registration is to fit the source image
f to a template image g, using a twelve parameter affine transformation. The images may be
scaled quite differently, so an additional intensity scaling parameter is included in the model.
Without constraints and with poor data, simple ML parameter optimization (similar to that
described in Section ??) can produce some extremely unlikely transformations. For example,
when there are only a few slices in the image, it is not possible for the algorithms to determine
an accurate zoom in the out of plane direction. Any estimate of this value is likely to have very
large errors. When a regularized approach is not used, it may be better to assign a fixed value
for this difficult-to-determine parameter, and simply fit for the remaining ones.
By incorporating prior information into the optimization procedure, a smooth transition between
fixed and fitted parameters can be achieved. When the error for a particular fitted parameter
is known to be large, then that parameter will be based more upon the prior information. In
order to adopt this approach, the prior distribution of the parameters should be known. This can
be derived from the zooms and shears determined by registering a large number of brain images
to the template.

3 Note that this only applies when $s_k < w_k (2\pi)^{1/2}$, otherwise $\nu = I - J$. Alternatively, to circumvent this problem the degrees of freedom can be better estimated by $(I - J) \prod_k \mathrm{erf}(2^{-3/2} s_k / w_k)$. This gives a similar result to the approximation by [23] for smooth images, but never allows the computed value to exceed $I - J$.

3.2.3 Nonlinear Registration

The nonlinear spatial normalization approach described here assumes that the image has already
been approximately registered with the template according to a twelve-parameter affine
registration. This section illustrates how the parameters describing global shape differences (not
accounted for by affine registration) between an image and template can be determined.
The model for defining nonlinear warps uses deformations consisting of a linear combination
of low-frequency periodic basis functions. The spatial transformation from co-ordinates $\mathbf{x}_i$ to
co-ordinates $\mathbf{y}_i$ is:

$$\begin{aligned}
y_{1i} &= x_{1i} + u_{1i} = x_{1i} + \sum_j q_{j1} d_j(\mathbf{x}_i) \\
y_{2i} &= x_{2i} + u_{2i} = x_{2i} + \sum_j q_{j2} d_j(\mathbf{x}_i) \\
y_{3i} &= x_{3i} + u_{3i} = x_{3i} + \sum_j q_{j3} d_j(\mathbf{x}_i)
\end{aligned}$$

where $q_{jk}$ is the jth coefficient for dimension k, and $d_j(\mathbf{x})$ is the jth basis function at position $\mathbf{x}$.
The choice of basis functions depends upon the distribution of warps likely to be required, and
also upon how translations at borders should behave. If points at the borders over which the
transform is computed are not required to move in any direction, then the basis functions should
consist of the lowest frequencies of the three dimensional discrete sine transform (DST). If there
are to be no constraints at the borders, then a three dimensional discrete cosine transform (DCT)
is more appropriate. Both of these transforms use the same set of basis functions to represent
warps in each of the directions. Alternatively, a mixture of DCT and DST basis functions can
be used to constrain translations at the surfaces of the volume to be parallel to the surface only
(sliding boundary conditions). By using a different combination of DCT and DST basis functions,
the corners of the volume can be fixed and the remaining points on the surface can be free to
move in all directions (bending boundary conditions) [9]. These various boundary conditions are
illustrated in Figure 3.2.
The basis functions used here are the lowest frequency components of the three (or two)
dimensional DCT. In one dimension, the DCT of a function is generated by pre-multiplication
with the matrix $\mathbf{D}^T$, where the elements of the $I \times M$ matrix $\mathbf{D}$ are defined by:

$$d_{i1} = \frac{1}{\sqrt{I}}, \qquad i = 1..I$$

$$d_{im} = \sqrt{\frac{2}{I}} \cos\left(\frac{\pi (2i-1)(m-1)}{2I}\right), \qquad i = 1..I,\ m = 2..M$$

A set of low frequency two dimensional DCT basis functions are shown in Figure 3.3, and a
schematic example of a two dimensional deformation based upon the DCT is shown in Figure
3.4.
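The one dimensional DCT matrix defined above can be generated with a few lines of NumPy. The function name `dct_basis` is an assumption for this sketch; the indices i and m are one-based inside the function so that the expressions match the definition in the text.

```python
import numpy as np

def dct_basis(I, M):
    """Return the I x M matrix D holding the M lowest frequency DCT basis functions."""
    i = np.arange(1, I + 1)[:, None]        # sample index, 1..I
    m = np.arange(1, M + 1)[None, :]        # basis function index, 1..M
    D = np.sqrt(2.0 / I) * np.cos(np.pi * (2 * i - 1) * (m - 1) / (2 * I))
    D[:, 0] = 1.0 / np.sqrt(I)              # first column is the constant term
    return D

D = dct_basis(16, 5)
# The columns are orthonormal, so D.T @ D is (numerically) the identity.
orthonormal = np.allclose(D.T @ D, np.eye(5))
```

Because the columns are orthonormal (for M no larger than I), pre-multiplying a sampled function by the transpose of `D` gives its low frequency DCT coefficients, and multiplying those coefficients by `D` gives a smooth approximation of the function.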
As for affine registration, the optimization involves minimizing the sum of squared differences
between a source (f ) and template image (g). The images may be scaled differently, so an
additional parameter (w) is needed to accommodate this difference. The minimized function is
then:

$$\sum_i \left(f(\mathbf{y}_i) - w g(\mathbf{x}_i)\right)^2$$

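The approach described in Section ?? is used to optimize the parameters $\mathbf{q}_1$, $\mathbf{q}_2$, $\mathbf{q}_3$ and $w$,
and requires derivatives of the function $f(\mathbf{y}_i) - w g(\mathbf{x}_i)$ with respect to each parameter. These
can be obtained using the chain rule:

$$\begin{aligned}
\frac{\partial f(\mathbf{y}_i)}{\partial q_{j1}} &= \frac{\partial f(\mathbf{y}_i)}{\partial y_{1i}} \frac{\partial y_{1i}}{\partial q_{j1}} = \frac{\partial f(\mathbf{y}_i)}{\partial y_{1i}} d_j(\mathbf{x}_i) \\
\frac{\partial f(\mathbf{y}_i)}{\partial q_{j2}} &= \frac{\partial f(\mathbf{y}_i)}{\partial y_{2i}} \frac{\partial y_{2i}}{\partial q_{j2}} = \frac{\partial f(\mathbf{y}_i)}{\partial y_{2i}} d_j(\mathbf{x}_i) \\
\frac{\partial f(\mathbf{y}_i)}{\partial q_{j3}} &= \frac{\partial f(\mathbf{y}_i)}{\partial y_{3i}} \frac{\partial y_{3i}}{\partial q_{j3}} = \frac{\partial f(\mathbf{y}_i)}{\partial y_{3i}} d_j(\mathbf{x}_i)
\end{aligned}$$

Figure 3.2: Different boundary conditions. Above left: fixed boundaries (generated purely from
DST basis functions). Above right: sliding boundaries (from a mixture of DCT and DST basis
functions). Below left: bending boundaries (from a different mixture of DCT and DST basis
functions). Below right: free boundary conditions (purely from DCT basis functions).

Figure 3.3: The lowest frequency basis functions of a two dimensional Discrete Cosine Transform.

Figure 3.4: In two dimensions, a deformation field consists of two scalar fields: one for horizontal
deformations, and the other for vertical deformations. The images on the left show deformations
as a linear combination of basis images (see Figure 3.3). The center column shows the same
deformations in a more intuitive sense. The deformation is applied by overlaying it on a source
image, and re-sampling (right).

$$\begin{aligned}
& \boldsymbol{\alpha} = \mathbf{0} \\
& \boldsymbol{\beta} = \mathbf{0} \\
& \text{for } j = 1 \ldots J \\
& \quad \mathbf{C} = \mathbf{d2}_{j,:}^T \mathbf{d2}_{j,:} \\
& \quad \mathbf{E}_1 = \mathrm{diag}(\nabla_1 \mathbf{f}_{:,j}) \mathbf{D}_1 \\
& \quad \mathbf{E}_2 = \mathrm{diag}(\nabla_2 \mathbf{f}_{:,j}) \mathbf{D}_1 \\
& \quad \boldsymbol{\alpha} = \boldsymbol{\alpha} + \begin{bmatrix}
\mathbf{C} \otimes (\mathbf{E}_1^T \mathbf{E}_1) & \mathbf{C} \otimes (\mathbf{E}_1^T \mathbf{E}_2) & -\mathbf{d2}_{j,:}^T \otimes (\mathbf{E}_1^T \mathbf{g}_{:,j}) \\
(\mathbf{C} \otimes (\mathbf{E}_1^T \mathbf{E}_2))^T & \mathbf{C} \otimes (\mathbf{E}_2^T \mathbf{E}_2) & -\mathbf{d2}_{j,:}^T \otimes (\mathbf{E}_2^T \mathbf{g}_{:,j}) \\
(-\mathbf{d2}_{j,:}^T \otimes (\mathbf{E}_1^T \mathbf{g}_{:,j}))^T & (-\mathbf{d2}_{j,:}^T \otimes (\mathbf{E}_2^T \mathbf{g}_{:,j}))^T & \mathbf{g}_{:,j}^T \mathbf{g}_{:,j}
\end{bmatrix} \\
& \quad \boldsymbol{\beta} = \boldsymbol{\beta} + \begin{bmatrix}
\mathbf{d2}_{j,:}^T \otimes (\mathbf{E}_1^T (\mathbf{f}_{:,j} - w \mathbf{g}_{:,j})) \\
\mathbf{d2}_{j,:}^T \otimes (\mathbf{E}_2^T (\mathbf{f}_{:,j} - w \mathbf{g}_{:,j})) \\
-\mathbf{g}_{:,j}^T (\mathbf{f}_{:,j} - w \mathbf{g}_{:,j})
\end{bmatrix} \\
& \text{end}
\end{aligned}$$

Figure 3.5: A two dimensional illustration of the fast algorithm for computing $\mathbf{A}^T \mathbf{A}$ ($\boldsymbol{\alpha}$) and
$\mathbf{A}^T \mathbf{b}$ ($\boldsymbol{\beta}$).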

The approach involves iteratively computing AT A and AT b. However, because there are
many parameters to optimize, these computations can be very time consuming. There now
follows a description of a very efficient way of computing these matrices.

A Fast Algorithm

A fast algorithm for computing AT A and AT b is shown in Figure 3.5. The remainder of this
section explains the matrix terminology used, and why it is so efficient.
For simplicity, the algorithm is only illustrated in two dimensions. Images f and g are consid-
ered as I × J matrices F and G respectively. Row i of F is denoted by fi,: , and column j by f:,j .
The basis functions used by the algorithm are generated in a separable form from matrices D1
and D2 , with dimensions I × M and J × N respectively. By treating the transform coefficients
as M × N matrices Q1 and Q2 , the deformation fields can be rapidly constructed by computing
D1 Q1 D2 T and D1 Q2 D2 T .
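The separable construction of a deformation field can be checked numerically. In the sketch below, the function names are hypothetical and the coefficients are arbitrary illustrative values; it compares the fast product of D1, Q and the transpose of D2 with the equivalent, but much larger, Kronecker product formulation, assuming column-major vectorization of the coefficient matrix.

```python
import numpy as np

def dct_basis(I, M):
    """I x M matrix of the M lowest frequency DCT basis functions."""
    i = np.arange(1, I + 1)[:, None]
    m = np.arange(1, M + 1)[None, :]
    D = np.sqrt(2.0 / I) * np.cos(np.pi * (2 * i - 1) * (m - 1) / (2 * I))
    D[:, 0] = 1.0 / np.sqrt(I)
    return D

def deformation_field(Q, D1, D2):
    """Fast separable form: one displacement component on the I x J grid."""
    return D1 @ Q @ D2.T

def deformation_field_kron(Q, D1, D2):
    """Equivalent form using the full (I*J) x (M*N) basis matrix."""
    q = Q.reshape(-1, order="F")                 # column-major vec(Q)
    u = np.kron(D2, D1) @ q
    return u.reshape(D1.shape[0], D2.shape[0], order="F")

D1 = dct_basis(8, 3)                             # I = 8, M = 3
D2 = dct_basis(6, 2)                             # J = 6, N = 2
Q = np.arange(6, dtype=float).reshape(3, 2)      # illustrative coefficients
U = deformation_field(Q, D1, D2)
```

Both functions give the same field, but the separable form never builds the large Kronecker product, which is where the saving in time and memory comes from.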
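Between each iteration, image F is re-sampled according to the latest parameter estimates.
The derivatives of F are also re-sampled to give $\nabla_1 \mathbf{F}$ and $\nabla_2 \mathbf{F}$. The ith elements of these three
matrices contain $f(\mathbf{y}_i)$, $\partial f(\mathbf{y}_i)/\partial y_{1i}$ and $\partial f(\mathbf{y}_i)/\partial y_{2i}$ respectively.
The notation $\mathrm{diag}(\nabla_1 \mathbf{f}_{:,j}) \mathbf{D}_1$ simply means multiplying each element of row i of $\mathbf{D}_1$ by $\nabla_1 f_{i,j}$,
and the symbol ‘⊗’ refers to the Kronecker tensor product. If $\mathbf{D}_2$ is a matrix of order $J \times N$, and
$\mathbf{D}_1$ is a second matrix, then:

$$\mathbf{D}_2 \otimes \mathbf{D}_1 = \begin{bmatrix}
d2_{11} \mathbf{D}_1 & \cdots & d2_{1N} \mathbf{D}_1 \\
\vdots & \ddots & \vdots \\
d2_{J1} \mathbf{D}_1 & \cdots & d2_{JN} \mathbf{D}_1
\end{bmatrix}$$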

The advantage of the algorithm shown in Figure 3.5 is that it utilizes some of the useful
properties of Kronecker tensor products. This is especially important when the algorithm is
implemented in three dimensions. The limiting factor to the algorithm is no longer the time
taken to create the curvature matrix (AT A), but is now the amount of memory required to store
it and the time taken to invert it.
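The column-by-column accumulation can be sketched in NumPy and compared against a brute-force construction of the full matrix A. Everything here is illustrative: the function names are hypothetical, the parameters are assumed to be ordered as the coefficients for the first direction, then the second, then w, and the last column of A is taken to be −g (the derivative of f(y_i) − w g(x_i) with respect to w), which fixes the signs of the terms involving g.

```python
import numpy as np

def fast_normal_equations(F, G, dF1, dF2, D1, D2, w):
    """Accumulate alpha = A^T A and beta = A^T b one image column at a time."""
    MN = D1.shape[1] * D2.shape[1]
    alpha = np.zeros((2 * MN + 1, 2 * MN + 1))
    beta = np.zeros(2 * MN + 1)
    for j in range(F.shape[1]):
        d2 = D2[j, :]                       # row j of D2
        C = np.outer(d2, d2)
        E1 = dF1[:, j][:, None] * D1        # diag(grad_1 f_{:,j}) D1
        E2 = dF2[:, j][:, None] * D1        # diag(grad_2 f_{:,j}) D1
        g = G[:, j]
        r = F[:, j] - w * g                 # residuals for this column
        a13 = -np.kron(d2, E1.T @ g)
        a23 = -np.kron(d2, E2.T @ g)
        alpha[:MN, :MN] += np.kron(C, E1.T @ E1)
        alpha[:MN, MN:2 * MN] += np.kron(C, E1.T @ E2)
        alpha[MN:2 * MN, :MN] += np.kron(C, E1.T @ E2).T
        alpha[MN:2 * MN, MN:2 * MN] += np.kron(C, E2.T @ E2)
        alpha[:MN, -1] += a13
        alpha[-1, :MN] += a13
        alpha[MN:2 * MN, -1] += a23
        alpha[-1, MN:2 * MN] += a23
        alpha[-1, -1] += g @ g
        beta[:MN] += np.kron(d2, E1.T @ r)
        beta[MN:2 * MN] += np.kron(d2, E2.T @ r)
        beta[-1] += -(g @ r)
    return alpha, beta

def brute_force_normal_equations(F, G, dF1, dF2, D1, D2, w):
    """Reference version that builds the full design matrix A explicitly."""
    B = np.kron(D2, D1)                     # (I*J) x (M*N) basis matrix

    def vec(X):
        return X.reshape(-1, order="F")     # column-major vectorization

    A = np.hstack([vec(dF1)[:, None] * B,
                   vec(dF2)[:, None] * B,
                   -vec(G)[:, None]])
    b = vec(F - w * G)
    return A.T @ A, A.T @ b
```

The fast version only ever forms small Kronecker products (between an N × N matrix and an M × M matrix), whereas the reference version needs the whole of A in memory at once.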

3.2.4 Linear Regularization for Nonlinear Registration

Without regularization in the nonlinear registration, it is possible to introduce unnecessary
deformations that only reduce the residual sum of squares by a tiny amount (see Figure 3.6). This
could potentially make the algorithm very unstable. Regularization is achieved by minimizing
the sum of squared difference between the template and the warped image, while simultaneously
minimizing some function of the deformation field. The principles are Bayesian and make use of
the MAP scheme described in Section 3.2.1.
The first requirement for a MAP approach is to define some form of prior distribution for the
parameters. For a simple linear (see footnote 4) approach, the priors consist of an a priori estimate of the mean
of the parameters (assumed to be zero), and also a covariance matrix describing the distribution
of the parameters about this mean. There are many possible forms for these priors, each of which
describes some form of ‘energy’ term. If the true prior distribution of the parameters is known
(somehow derived from a large number of subjects), then C0 could be an empirically determined
covariance matrix describing this distribution. This approach would have the advantage that the
resulting deformations are more typically “brain like”, and so increase the face validity of the
approach.
The three distinct forms of linear regularization that will now be described are based upon
membrane energy, bending energy and linear-elastic energy. None of these schemes enforces a strict
one to one mapping between the source and template images, but this makes little difference
for the small deformations required here. Each of these models needs some form of elasticity
constants (λ and sometimes µ). Values of these constants that are too large will provide too
much regularization and result in greatly underestimated deformations. If the values are too
small, there will not be enough regularization and the resulting deformations will over-fit the
data.

Membrane Energy

The simplest model used for linear regularization is based upon minimizing the membrane energy
of the deformation field u [1, 24]. By summing over i points in three dimensions, the membrane
energy of u is given by:

$$\sum_i \sum_{j=1}^{3} \sum_{k=1}^{3} \lambda \left(\frac{\partial u_{ji}}{\partial x_{ki}}\right)^2$$

where $\lambda$ is simply a scaling constant. The membrane energy can be computed from the coefficients
of the basis functions by $\mathbf{q}_1^T \mathbf{H} \mathbf{q}_1 + \mathbf{q}_2^T \mathbf{H} \mathbf{q}_2 + \mathbf{q}_3^T \mathbf{H} \mathbf{q}_3$, where $\mathbf{q}_1$, $\mathbf{q}_2$ and $\mathbf{q}_3$ refer to vectors
containing the parameters describing translations in the three dimensions. The matrix $\mathbf{H}$ is
defined by:

$$\begin{aligned}
\mathbf{H} = {} & \lambda \left(\dot{\mathbf{D}}_3^T \dot{\mathbf{D}}_3\right) \otimes \left(\mathbf{D}_2^T \mathbf{D}_2\right) \otimes \left(\mathbf{D}_1^T \mathbf{D}_1\right) \\
& + \lambda \left(\mathbf{D}_3^T \mathbf{D}_3\right) \otimes \left(\dot{\mathbf{D}}_2^T \dot{\mathbf{D}}_2\right) \otimes \left(\mathbf{D}_1^T \mathbf{D}_1\right) \\
& + \lambda \left(\mathbf{D}_3^T \mathbf{D}_3\right) \otimes \left(\mathbf{D}_2^T \mathbf{D}_2\right) \otimes \left(\dot{\mathbf{D}}_1^T \dot{\mathbf{D}}_1\right)
\end{aligned}$$

where the notation $\dot{\mathbf{D}}_1$ refers to the first derivatives of $\mathbf{D}_1$.

4 Although the cost function associated with these priors is quadratic, the priors are linear in the sense that they minimize the sum of squares of a linear combination of the model parameters. This is analogous to solving a set of linear equations by minimizing a quadratic cost function.

Figure 3.6: The image shown at the top-left is the template image. At the top-right is an
image that has been registered with it using a 12-parameter affine registration. The image at
the bottom-left is the same image registered using the 12-parameter affine registration, followed
by a regularized global nonlinear registration. It should be clear that the shape of the image
approaches that of the template much better after nonlinear registration. At the bottom right is
the image after the same affine transformation and nonlinear registration, but this time without
using any regularization. The mean squared difference between the image and template after
the affine registration was 472.1. After the regularized nonlinear registration this was reduced to
302.7. Without regularization, a mean squared difference of 287.3 is achieved, but this is at the
expense of introducing a lot of unnecessary warping.

Assuming that the parameters consist of $[q_1^T \; q_2^T \; q_3^T \; w]^T$, matrix $C_0^{-1}$ from Eqn. 3.4 can
be constructed from $H$ by:

$$C_0^{-1} = \begin{bmatrix} H & 0 & 0 & 0 \\ 0 & H & 0 & 0 \\ 0 & 0 & H & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
$H$ is all zeros, except for the diagonal. Elements on the diagonal represent the reciprocal of the
a priori variance of each parameter. If all the DCT matrices are $I \times M$, then each diagonal element
is given by:

$$h_{j+M(k-1+M(l-1))} = \lambda \pi^{2} I^{-2} \left( (j-1)^{2} + (k-1)^{2} + (l-1)^{2} \right)$$

over $j = 1 \ldots M$, $k = 1 \ldots M$ and $l = 1 \ldots M$.
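The relationship between the Kronecker-product form of H and its closed-form diagonal can be checked numerically. The following is a minimal sketch, assuming the orthonormal DCT-II basis for the matrices D (with derivatives taken analytically with respect to the sample index) and the same I and M in all three dimensions; the function names `dct`, `membrane_H` and `membrane_diagonal` are illustrative and are not taken from the chapter or from any software package.

```python
import numpy as np

def dct(I, M, order=0):
    """First M DCT-II basis functions at I sample points (order=0),
    or their column-wise first derivatives (order=1)."""
    i = np.arange(1, I + 1)[:, None]            # sample index 1..I
    m = np.arange(M)[None, :]                   # frequency index 0..M-1
    phase = np.pi * (2 * i - 1) * m / (2 * I)
    scale = np.full((1, M), np.sqrt(2.0 / I))
    scale[0, 0] = np.sqrt(1.0 / I)              # constant (lowest frequency) column
    if order == 0:
        return scale * np.cos(phase)
    if order == 1:
        return -scale * (np.pi * m / I) * np.sin(phase)
    raise ValueError("order must be 0 or 1")

def membrane_H(I, M, lam):
    """H assembled from Kronecker products, dimension 3 first, dimension 1 last."""
    D, dD = dct(I, M, 0), dct(I, M, 1)
    G, dG = D.T @ D, dD.T @ dD
    kron3 = lambda a, b, c: np.kron(np.kron(a, b), c)
    return lam * (kron3(dG, G, G) + kron3(G, dG, G) + kron3(G, G, dG))

def membrane_diagonal(I, M, lam):
    """Closed-form diagonal; zero-based (j, k, l) stand in for (j-1, k-1, l-1)."""
    idx = np.arange(M)
    j, k, l = np.meshgrid(idx, idx, idx, indexing="ij")
    h = lam * np.pi ** 2 / I ** 2 * (j ** 2 + k ** 2 + l ** 2)
    return h.ravel(order="F")                   # j varies fastest, as in the text

I, M, lam = 6, 3, 0.5                           # small illustrative sizes
H = membrane_H(I, M, lam)
h = membrane_diagonal(I, M, lam)

# Membrane energy contribution of one (random) coefficient vector q1.
q1 = np.random.default_rng(0).normal(size=M ** 3)
energy_q1 = q1 @ H @ q1
```

Because H is diagonal under these assumptions, the energy reduces to a weighted sum of squared coefficients, so the full matrix never needs to be stored in practice.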

Bending Energy

Bookstein’s thin plate splines [6, 5] minimize the bending energy of deformations. For a two
dimensional deformation, the bending energy is defined by:
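$$\lambda \sum_{i} \left( \left( \frac{\partial^{2} u_{1i}}{\partial x_{1i}^{2}} \right)^{2} + \left( \frac{\partial^{2} u_{1i}}{\partial x_{2i}^{2}} \right)^{2} + 2 \left( \frac{\partial^{2} u_{1i}}{\partial x_{1i} \partial x_{2i}} \right)^{2} \right) + \lambda \sum_{i} \left( \left( \frac{\partial^{2} u_{2i}}{\partial x_{1i}^{2}} \right)^{2} + \left( \frac{\partial^{2} u_{2i}}{\partial x_{2i}^{2}} \right)^{2} + 2 \left( \frac{\partial^{2} u_{2i}}{\partial x_{1i} \partial x_{2i}} \right)^{2} \right)$$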

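This can be computed by:

$$\begin{aligned}
& \lambda q_1^T (\ddot{D}_2 \otimes D_1)^T (\ddot{D}_2 \otimes D_1) q_1 + \lambda q_1^T (D_2 \otimes \ddot{D}_1)^T (D_2 \otimes \ddot{D}_1) q_1 \\
& + 2 \lambda q_1^T (\dot{D}_2 \otimes \dot{D}_1)^T (\dot{D}_2 \otimes \dot{D}_1) q_1 + \lambda q_2^T (\ddot{D}_2 \otimes D_1)^T (\ddot{D}_2 \otimes D_1) q_2 \\
& + \lambda q_2^T (D_2 \otimes \ddot{D}_1)^T (D_2 \otimes \ddot{D}_1) q_2 + 2 \lambda q_2^T (\dot{D}_2 \otimes \dot{D}_1)^T (\dot{D}_2 \otimes \dot{D}_1) q_2
\end{aligned}$$

where the notation $\dot{D}_1$ and $\ddot{D}_1$ refers to the column-wise first and second derivatives of $D_1$. This
is simplified to $q_1^T H q_1 + q_2^T H q_2$ where:

$$H = \lambda \left( (\ddot{D}_2^T \ddot{D}_2) \otimes (D_1^T D_1) + (D_2^T D_2) \otimes (\ddot{D}_1^T \ddot{D}_1) + 2 \, (\dot{D}_2^T \dot{D}_2) \otimes (\dot{D}_1^T \dot{D}_1) \right)$$

Matrix $C_0^{-1}$ from Eqn. 3.4 can be constructed from $H$ as:

$$C_0^{-1} = \begin{bmatrix} H & 0 & 0 \\ 0 & H & 0 \\ 0 & 0 & 0 \end{bmatrix}$$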

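with values on the diagonals of $H$ given by:

$$h_{j+(k-1) \times M} = \lambda \left( \left( \frac{\pi (j-1)}{I} \right)^{4} + \left( \frac{\pi (k-1)}{I} \right)^{4} + 2 \left( \frac{\pi (j-1)}{I} \right)^{2} \left( \frac{\pi (k-1)}{I} \right)^{2} \right)$$

over $j = 1 \ldots M$ and $k = 1 \ldots M$.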
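The same kind of numerical check can be made for the two dimensional bending energy. This is a sketch under the same assumptions as before (orthonormal DCT-II bases, analytic derivatives, equal I and M in both dimensions), with hypothetical helper names `dct`, `bending_H` and `bending_diagonal`.

```python
import numpy as np

def dct(I, M, order=0):
    """First M DCT-II basis functions at I points, or their column-wise
    first (order=1) or second (order=2) derivatives."""
    i = np.arange(1, I + 1)[:, None]
    m = np.arange(M)[None, :]
    phase = np.pi * (2 * i - 1) * m / (2 * I)
    scale = np.full((1, M), np.sqrt(2.0 / I))
    scale[0, 0] = np.sqrt(1.0 / I)
    w = np.pi * m / I                           # frequency of each basis function
    if order == 0:
        return scale * np.cos(phase)
    if order == 1:
        return -scale * w * np.sin(phase)
    if order == 2:
        return -scale * w ** 2 * np.cos(phase)
    raise ValueError("order must be 0, 1 or 2")

def bending_H(I, M, lam):
    """H for the 2D bending energy, built from Kronecker products."""
    D, dD, ddD = (dct(I, M, o) for o in (0, 1, 2))
    G, dG, ddG = D.T @ D, dD.T @ dD, ddD.T @ ddD
    return lam * (np.kron(ddG, G) + np.kron(G, ddG) + 2 * np.kron(dG, dG))

def bending_diagonal(I, M, lam):
    """Closed-form diagonal; zero-based (j, k) stand in for (j-1, k-1)."""
    idx = np.arange(M)
    j, k = np.meshgrid(idx, idx, indexing="ij")
    wj, wk = np.pi * j / I, np.pi * k / I
    h = lam * (wj ** 4 + wk ** 4 + 2 * wj ** 2 * wk ** 2)
    return h.ravel(order="F")                   # position j + (k-1)*M, j fastest

I, M, lam = 8, 4, 1.0                           # small illustrative sizes
H = bending_H(I, M, lam)
h = bending_diagonal(I, M, lam)
```

The diagonal grows with the fourth power of frequency, so under this prior the higher-frequency coefficients are assigned much smaller a priori variances than under the membrane energy model.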

Linear-Elastic Energy

The linear-elastic energy [28] of a two dimensional deformation field is:

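$$\sum_{j=1}^{2} \sum_{k=1}^{2} \sum_{i} \left[ \frac{\lambda}{2} \left( \frac{\partial u_{ji}}{\partial x_{ji}} \right) \left( \frac{\partial u_{ki}}{\partial x_{ki}} \right) + \frac{\mu}{4} \left( \frac{\partial u_{ji}}{\partial x_{ki}} + \frac{\partial u_{ki}}{\partial x_{ji}} \right)^{2} \right]$$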

where λ and µ are the Lamé elasticity constants. The elastic energy of the deformations can be
computed by:

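$$\begin{aligned}
& (\mu + \lambda/2) \, q_1^T (D_2 \otimes \dot{D}_1)^T (D_2 \otimes \dot{D}_1) q_1 + (\mu + \lambda/2) \, q_2^T (\dot{D}_2 \otimes D_1)^T (\dot{D}_2 \otimes D_1) q_2 \\
& + \mu/2 \, q_1^T (\dot{D}_2 \otimes D_1)^T (\dot{D}_2 \otimes D_1) q_1 + \mu/2 \, q_2^T (D_2 \otimes \dot{D}_1)^T (D_2 \otimes \dot{D}_1) q_2 \\
& + \mu/2 \, q_1^T (\dot{D}_2 \otimes D_1)^T (D_2 \otimes \dot{D}_1) q_2 + \mu/2 \, q_2^T (D_2 \otimes \dot{D}_1)^T (\dot{D}_2 \otimes D_1) q_1 \\
& + \lambda/2 \, q_1^T (D_2 \otimes \dot{D}_1)^T (\dot{D}_2 \otimes D_1) q_2 + \lambda/2 \, q_2^T (\dot{D}_2 \otimes D_1)^T (D_2 \otimes \dot{D}_1) q_1
\end{aligned}$$

A regularization based upon this model requires an inverse covariance matrix that is not a
simple diagonal matrix. This matrix is constructed as follows:

$$C_0^{-1} = \begin{bmatrix} H_1 & H_3 & 0 \\ H_3^T & H_2 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

where:

$$\begin{aligned}
H_1 &= (\mu + \lambda/2) \, (D_2^T D_2) \otimes (\dot{D}_1^T \dot{D}_1) + \mu/2 \, (\dot{D}_2^T \dot{D}_2) \otimes (D_1^T D_1) \\
H_2 &= (\mu + \lambda/2) \, (\dot{D}_2^T \dot{D}_2) \otimes (D_1^T D_1) + \mu/2 \, (D_2^T D_2) \otimes (\dot{D}_1^T \dot{D}_1) \\
H_3 &= \lambda/2 \, (D_2^T \dot{D}_2) \otimes (\dot{D}_1^T D_1) + \mu/2 \, (\dot{D}_2^T D_2) \otimes (D_1^T \dot{D}_1)
\end{aligned}$$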
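To illustrate how the blocks H1, H2 and H3 fit together, the sketch below assembles the inverse covariance matrix for a small two dimensional problem and compares the resulting quadratic form with the linear-elastic energy summed directly over the sample points. It assumes orthonormal DCT-II bases with analytic derivatives and the same basis in both dimensions; all variable names and the values chosen for the Lamé constants are illustrative only.

```python
import numpy as np

def dct(I, M, order=0):
    """First M DCT-II basis functions at I points (order=0) or their
    column-wise first derivatives (order=1)."""
    i = np.arange(1, I + 1)[:, None]
    m = np.arange(M)[None, :]
    phase = np.pi * (2 * i - 1) * m / (2 * I)
    scale = np.full((1, M), np.sqrt(2.0 / I))
    scale[0, 0] = np.sqrt(1.0 / I)
    if order == 0:
        return scale * np.cos(phase)
    return -scale * (np.pi * m / I) * np.sin(phase)

I, M, lam, mu = 8, 3, 0.3, 0.7                  # illustrative sizes and constants
D, dD = dct(I, M, 0), dct(I, M, 1)
G, dG, C = D.T @ D, dD.T @ dD, D.T @ dD         # C is D^T times its derivative

H1 = (mu + lam / 2) * np.kron(G, dG) + mu / 2 * np.kron(dG, G)
H2 = (mu + lam / 2) * np.kron(dG, G) + mu / 2 * np.kron(G, dG)
H3 = lam / 2 * np.kron(C, C.T) + mu / 2 * np.kron(C.T, C)

n = M * M
z = np.zeros((n, 1))                            # row/column for the intensity scaling w
C0_inv = np.block([[H1, H3, z],
                   [H3.T, H2, z],
                   [z.T, z.T, np.zeros((1, 1))]])

# Direct evaluation of the energy from the partial derivatives of the field.
rng = np.random.default_rng(1)
q1, q2 = rng.normal(size=n), rng.normal(size=n)
B1, B2 = np.kron(D, dD), np.kron(dD, D)         # derivatives along x1 and x2
e11, e12 = B1 @ q1, B2 @ q1                     # du1/dx1, du1/dx2
e21, e22 = B1 @ q2, B2 @ q2                     # du2/dx1, du2/dx2
direct = np.sum(lam / 2 * (e11 + e22) ** 2
                + mu * (e11 ** 2 + e22 ** 2)
                + mu / 2 * (e12 + e21) ** 2)

p = np.concatenate([q1, q2, [0.0]])             # parameters [q1, q2, w]
quadratic = p @ C0_inv @ p
```

The off-diagonal block H3 couples the two displacement components, which is why this prior cannot be stored as a simple vector of reciprocal variances.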

3.2.5 Templates and Intensity Transformations

Sections 3.2.2 and 3.2.3 have modeled a single intensity scaling parameter ($q_{13}$ and $w$ respectively),
but more generally, the optimization can be assumed to involve two sets of parameters: those
that describe spatial transformations ($q_s$), and those that describe intensity transformations ($q_t$).
This means that the difference function can be expressed in the generic form:

$$b_i(q) = f(s(x_i, q_s)) - t(x_i, q_t)$$

where $f$ is the source image, $s()$ is a vector function describing the spatial transformations based
upon parameters $q_s$ and $t()$ is a scalar function describing intensity transformations based on
parameters $q_t$. $x_i$ represents the co-ordinates of the $i$th sampled point.
The previous subsections simply considered matching one image to a scaled version of another,
in order to minimize the sum of squared differences between them. For this case, t(xi , qt ) is simply

Figure 3.7: Example template images. Above: T1 weighted MRI, T2 weighted MRI and PD
weighted MRI. Below: Grey matter probability distribution, White matter probability distribution
and CSF probability distribution. All the data were generated at the McConnell Brain Imaging
Center, Montréal Neurological Institute at McGill University, and are based on the averages
of about 150 normal brains. The original images were reduced to 2 mm resolution and convolved
with an 8 mm FWHM Gaussian kernel to be used as templates for spatial normalization.

Figure 3.8: Two dimensional histograms of template images (intensities shown as log(1+n), where
n is the value in each bin). The histograms were based on the whole volumes of the template
images shown in the top row of Figure 3.7.

equal to qt1 g(xi ), where qt1 is a simple scaling parameter and g is a template image. This is most
effective when there is a linear relation between the image intensities. Typically, the template
images used for spatial normalization will be similar to those shown in the top row of Figure 3.7.
The simplest least squares fitting method is not optimal when there is not a linear relationship
between the images. Examples of nonlinear relationships are illustrated in Figure 3.8, which
shows histograms (scatter-plots) of image intensities plotted against each other.
An important idea is that a given image can be matched not to one reference image, but to
a series of images that all conform to the same space. The idea here is that (ignoring the spatial
differences) any given image can be expressed as a linear combination of a set of reference images.
For example, these reference images might include different modalities (e.g., PET, SPECT,
$^{18}$F-DOPA, $^{18}$F-deoxy-glucose, T1-weighted MRI, T2*-weighted MRI, etc.) or different anatomical
tissues (e.g., grey matter, white matter, and CSF segmented from the same T1-weighted MRI)
or different anatomical regions (e.g., cortical grey matter, sub-cortical grey matter, cerebellum,
etc.) or finally any combination of the above. Any given image, irrespective of its modality, could
be approximated with a function of these images. A simple example using two images would be:

$$b_i(q) = f(s(x_i, q_s)) - (q_{t1} g_1(x_i) + q_{t2} g_2(x_i))$$
In Figure 3.9, a plane of a T1 weighted MRI is modeled by a linear combination of the five other
template images shown in Figure 3.7. Similar models were used to simulate T2 and PD weighted
MR images. The linearity of the scatter-plots (compared to those in Figure 3.8) shows that MR
images of a wide range of different contrasts can be modeled by a linear combination of a limited
number of template images. Visual inspection shows that the simulated images are very similar
to those shown in Figure 3.7.
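As a rough illustration of modeling an image by a linear combination of templates, the sketch below fits the intensity weights by ordinary least squares while holding the spatial parameters fixed. This is a simplification: in the method described here the intensity and spatial parameters are estimated together. The arrays are synthetic stand-ins for template and warped source intensities, and `fit_template_weights` is a hypothetical helper name.

```python
import numpy as np

def fit_template_weights(f_warped, templates):
    """Least-squares weights q_t so that sum_k q_t[k] * templates[k] approximates f_warped.

    f_warped  : (N,) source intensities sampled at the transformed points
    templates : list of (N,) template intensities at the same points
    Returns the weights and the residuals b_i.
    """
    G = np.column_stack(templates)
    q_t, *_ = np.linalg.lstsq(G, f_warped, rcond=None)
    return q_t, f_warped - G @ q_t

# Synthetic data: two "templates" and a source that is an exact mixture of them.
rng = np.random.default_rng(0)
g1 = rng.uniform(0.0, 1.0, 500)
g2 = rng.uniform(0.0, 1.0, 500)
f_warped = 0.7 * g1 + 0.3 * g2

q_t, b = fit_template_weights(f_warped, [g1, g2])
```

With real images the residuals would not vanish, and the recovered weights indicate how much each template contributes to the contrast of the source image.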
Alternatively, the intensities could vary spatially (for example due to inhomogeneities in the
MRI scanner). Linear variations in intensity over the field of view can be accounted for by
optimizing a function of the form:
$$b_i(q) = f(s(x_i, q_s)) - (q_{t1} g(x_i) + q_{t2} x_{1i} g(x_i) + q_{t3} x_{2i} g(x_i) + q_{t4} x_{3i} g(x_i))$$
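The four intensity terms in this expression are linear in the parameters, so for fixed spatial parameters they can be gathered into a design matrix whose columns are the template and the template modulated by each co-ordinate. The sketch below shows this with synthetic values; `intensity_field_design` is a hypothetical name and the numbers are chosen only for illustration.

```python
import numpy as np

def intensity_field_design(g, x):
    """Columns g, x1*g, x2*g, x3*g for template values g (N,) and co-ordinates x (N, 3)."""
    return np.column_stack([g, x[:, 0] * g, x[:, 1] * g, x[:, 2] * g])

rng = np.random.default_rng(0)
N = 1000
x = rng.uniform(-1.0, 1.0, size=(N, 3))         # sample co-ordinates (arbitrary units)
g = rng.uniform(0.5, 1.5, size=N)               # template intensities at those points

# Simulate a source whose intensity varies linearly across the field of view.
true_q = np.array([1.2, 0.10, -0.05, 0.02])
A = intensity_field_design(g, x)
f_warped = A @ true_q

q_t, *_ = np.linalg.lstsq(A, f_warped, rcond=None)
```

Replacing the co-ordinate columns by products of the template with smooth basis functions would extend the same construction to more complex intensity variations.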
More complex variations could be included by modulating with other basis functions (such as the
DCT basis function set described in Section 3.2.3) [22]. The examples shown so far have been
linear in their parameters describing intensity transformations. A simple example of an intensity
transformation that is nonlinear would be:
$$b_i(q) = f(s(x_i, q_s)) - q_{t1} \, g(x_i)^{q_{t2}}$$
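For a nonlinear intensity transformation of this kind, an iterative least-squares scheme needs the derivatives of the differences with respect to both intensity parameters. The sketch below writes these out, assuming the power-law form with strictly positive template intensities, and compares them with central finite differences; the function names and test values are illustrative.

```python
import numpy as np

def residual(f_warped, g, q_t1, q_t2):
    """b_i = f_i - q_t1 * g_i ** q_t2 (spatial parameters held fixed)."""
    return f_warped - q_t1 * g ** q_t2

def jacobian(g, q_t1, q_t2):
    """Analytic derivatives of b_i with respect to q_t1 and q_t2."""
    gp = g ** q_t2
    return np.column_stack([-gp, -q_t1 * gp * np.log(g)])

rng = np.random.default_rng(0)
g = rng.uniform(0.1, 1.0, 50)                   # positive template intensities
f_warped = rng.uniform(0.0, 1.0, 50)            # arbitrary source intensities
q_t1, q_t2, h = 1.5, 0.8, 1e-6

J = jacobian(g, q_t1, q_t2)

# Central finite differences as an independent check of the analytic form.
J_fd = np.column_stack([
    (residual(f_warped, g, q_t1 + h, q_t2) - residual(f_warped, g, q_t1 - h, q_t2)) / (2 * h),
    (residual(f_warped, g, q_t1, q_t2 + h) - residual(f_warped, g, q_t1, q_t2 - h)) / (2 * h),
])
```

Unlike the linear cases, the second column depends on the current parameter values, so these derivatives would have to be recomputed at every iteration.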

Figure 3.9: Simulated images of T1, T2 and PD weighted images, and histograms of the real
images versus the simulated images.

Collins et al. [15] suggested that, rather than matching the image itself to the template, some function of
the image should be matched to a template image transformed in the same way. They found that
the use of gradient magnitude transformations led to more robust solutions, especially in cases
of limited brain coverage or intensity inhomogeneity artifacts (in MR images). Other rotationally
invariant moments also contain useful matching information [30]. The algorithms described here
perform most efficiently with smooth images. Much of the high frequency information in the
images is lost in the smoothing step, but information about important image features may be
retained in separate (smoothed) moment images. Simultaneous registrations using these extracted
features may be a useful technique for preserving information, while still retaining the advantages
of using smooth images in the registration.
Another idea for introducing more accuracy by making use of internal consistency would be to
simultaneously spatially normalize co-registered images to corresponding templates. For example,
by simultaneously matching a PET image to a PET template, at the same time as matching a
structural MR image to a corresponding MR template, more accuracy could be obtained than
by matching the images individually. A similar approach could be devised for simultaneously
matching different tissue types from classified images together [26], although a more powerful
approach is to incorporate tissue classification and registration into the same Bayesian model
[20].

3.3 Discussion
The criteria for ‘good’ spatial transformations can be framed in terms of validity, reliability and
computational efficiency. The validity of a particular transformation device is not easy to define
or measure and indeed varies with the application. For example a rigid body transformation may
be perfectly valid for realignment but not for spatial normalization of an arbitrary brain into a
standard stereotaxic space. Generally the sorts of validity that are important in spatial transfor-
mations can be divided into (i) Face validity, established by demonstrating the transformation
does what it is supposed to and (ii) Construct validity, assessed by comparison with other tech-
niques or constructs. Face validity is a complex issue in functional mapping. At first glance, face
validity might be equated with the co-registration of anatomical homologues in two images. This
would be complete and appropriate if the biological question referred to structural differences or
modes of variation. In other circumstances however this definition of face validity is not appro-
priate. For example, the purpose of spatial normalization (either within or between subjects) in
functional mapping studies is to maximize the sensitivity to neuro-physiological change elicited
by experimental manipulation of sensorimotor or cognitive state. In this case a better definition
of a valid normalization is that which maximizes condition-dependent effects with respect to error
(and, if relevant, inter-subject) effects. This will probably be achieved when functional anatomy is
congruent, which may or may not be the same as registering structural anatomy.
Because the deformations are only defined by a few hundred parameters, the nonlinear regis-
tration method described here does not have the potential precision of some other methods. High
frequency deformations cannot be modeled because the deformations are restricted to the lowest
spatial frequencies of the basis functions. This means that the current approach is unsuitable for
attempting exact matches between fine cortical structures (see Figures 3.10 and 3.11).
The current method is relatively fast, taking on the order of 30 seconds per iteration,
depending upon the number of basis functions used. The speed is partly a result of the small
number of parameters involved, and the simple optimization algorithm that assumes an almost
quadratic error surface. Because the images are first matched using a simple affine transformation,
there is less ‘work’ for the algorithm to do, and a good registration can be achieved with only a
few iterations (fewer than 20). The method does not rigorously enforce a one-to-one match between

Figure 3.10: Images of six subjects registered using a 12-parameter affine registration (see also
Figure 3.11). The affine registration matches the positions and sizes of the images.

Figure 3.11: Six subjects' brains registered with both affine and basis function registration (see
also Figure 3.10). The basis function registration estimates the global shapes of the brains, but
is not able to account for high spatial frequency warps.

the brains being registered. However, by estimating only the lowest frequency deformations and
by using appropriate regularization, this constraint is rarely broken.
The approach in this chapter searches for a MAP estimate of the parameters defining the
warps. However, optimization problems for complex nonlinear models such as those used for
image registration can easily get caught in local minima, so there is no guarantee that the estimate
determined by the algorithm is globally optimal. Even if the best MAP estimate is achieved,
there will be many other potential solutions that have similar probabilities of being correct. A
further complication arises from the fact that there is no one-to-one match between the small
structures (especially gyral and sulcal patterns) of any two brains. This means that it is not
possible to obtain a single objective high frequency match however good an algorithm is for
determining the best MAP estimate. Because of these issues, registration using the minimum
variance estimate (MVE) may be more appropriate. Rather than searching for the single most
probable solution, the MVE is the average of all possible solutions, weighted by their individual
posterior probabilities. Although useful approximations have been devised [28, 9], this estimate is
still difficult to achieve in practice because of the enormous amount of computing power required.
The MVE is probably more appropriate than the MAP estimate for spatial normalization, as it
is (on average) closer to the “true” solution. However, if the errors associated with the parameter
estimates and also the priors are normally distributed, then the MVE and the MAP estimate are
identical. This is partially satisfied by smoothing the images before registering them.
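The distinction between the MAP estimate and the MVE can be made concrete with a one-parameter toy posterior evaluated on a grid. The sketch below is only an illustration of the definitions given above (the posterior mode versus the posterior-weighted average); the densities are invented, and real registration posteriors have far too many dimensions to be tabulated in this way.

```python
import numpy as np

theta = np.linspace(-6.0, 6.0, 2401)            # grid of candidate parameter values

def gauss(mean, sd):
    """Gaussian density on the grid, up to a constant factor."""
    return np.exp(-0.5 * ((theta - mean) / sd) ** 2) / sd

def map_and_mve(unnormalised_posterior):
    """Return (MAP estimate, MVE) for a posterior tabulated on the grid."""
    p = unnormalised_posterior / unnormalised_posterior.sum()
    return theta[np.argmax(p)], np.sum(theta * p)

# Gaussian posterior: the mode and the mean coincide.
map_g, mve_g = map_and_mve(gauss(0.5, 1.0))

# Bimodal posterior: the mode sits on the taller peak, the mean lies between the peaks.
map_b, mve_b = map_and_mve(0.6 * gauss(-1.0, 0.3) + 0.4 * gauss(2.0, 0.5))
```

In the bimodal case the MVE falls in a region of low posterior probability, which shows that the two estimates can differ substantially once the posterior departs from a normal distribution.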
When higher spatial frequency warps are to be fitted, more DCT coefficients are required to
describe the deformations. There are practical problems that occur when more than about the
8 × 8 × 8 lowest frequency DCT components are used. One of these is the problem of storing and
inverting the curvature matrix (AT A). Even with deformations limited to 8 × 8 × 8 coefficients,
there are at least 1537 unknown parameters, requiring a curvature matrix of about 18Mbytes
(using double precision floating point arithmetic). High-dimensional registration methods that
search for more parameters should be used when more precision is required in the deformations.
In practice however, it may be meaningless to even attempt an exact match between brains
beyond a certain resolution. There is not a one-to-one relationship between the cortical structures
of one brain and those of another, so any method that attempts to match brains exactly must
be folding the brain to create sulci and gyri that do not really exist. Even if an exact match is
possible, because the registration problem is not convex, the solutions obtained by high-dimensional
warping techniques may not be truly optimal. High-dimensional registration methods
are often very good at registering grey matter with grey matter (for example), but there is no
guarantee that the registered grey matter arises from homologous cortical features.
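The parameter count and storage figure quoted above for 8 × 8 × 8 coefficients can be reproduced with a few lines of arithmetic. The sketch assumes three displacement components, a single intensity scaling parameter and a dense curvature matrix stored in 8-byte double precision; `curvature_matrix_cost` is a hypothetical helper name.

```python
def curvature_matrix_cost(m_per_axis=8, n_dims=3, n_intensity=1, bytes_per_value=8):
    """Number of unknowns and bytes needed for a dense curvature matrix A^T A."""
    n_params = n_dims * m_per_axis ** 3 + n_intensity
    n_bytes = n_params ** 2 * bytes_per_value
    return n_params, n_bytes

n_params, n_bytes = curvature_matrix_cost()
print(n_params, round(n_bytes / 2 ** 20, 1))    # 1537 18.0

# Doubling the number of coefficients per axis multiplies the storage by roughly 64.
n_params_16, n_bytes_16 = curvature_matrix_cost(m_per_axis=16)
```

The storage grows with the sixth power of the number of coefficients per axis, which is why the dense formulation becomes impractical well before the deformations become high-dimensional.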
Also, structure and function are not always tightly linked. Even if structurally equivalent
regions can be brought into exact register, it does not mean that the same is true for regions that
perform the same or similar functions. For inter-subject averaging, an assumption is made that
functionally equivalent regions lie in approximately the same parts of the brain. This leads to
the current rationale for smoothing images from multi-subject functional imaging studies prior
to performing statistical analyses. Constructive interference of the smeared activation signals
then has the effect of producing a signal that is roughly in an average location. In order to
account for substantial fine scale warps in a spatial normalization, it is necessary for some voxels
to increase their volumes considerably, and for others to shrink to an almost negligible size. The
contribution of the shrunken regions to the smoothed images is tiny, and the sensitivity of the
tests for detecting activations in these regions is reduced. This is another argument in favor of
spatially normalizing only on a global scale.
The constrained normalization described here assumes that the template resembles a warped
version of the image. Modifications are required in order to apply the method to diseased or
lesioned brains. One possible approach is to assume different weights for different brain regions

[7]. Lesioned areas can be assigned lower weights, so that they have much less influence on the
final solution.
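The effect of region-specific weights can be seen in a deliberately simple case: estimating a single intensity scaling between a template and an image containing a simulated lesion. The sketch below uses a weighted sum of squared differences with weights of zero inside the lesion; the one dimensional signals and the name `weighted_scale` are illustrative assumptions, not part of the method as published.

```python
import numpy as np

def weighted_scale(f, g, w):
    """Scaling q that minimises sum_i w_i * (f_i - q * g_i) ** 2."""
    return np.sum(w * f * g) / np.sum(w * g * g)

g = np.linspace(1.0, 2.0, 100)                  # template intensities
f = 2.0 * g                                     # image is twice the template ...
f[40:60] = 0.0                                  # ... except in a simulated lesion

w = np.ones_like(g)
w[40:60] = 0.0                                  # zero weight inside the lesion

q_unweighted = weighted_scale(f, g, np.ones_like(g))
q_masked = weighted_scale(f, g, w)
```

Without the weights the lesion pulls the estimate away from the true scaling, whereas with the lesion masked out the estimate is determined by the intact tissue only. The same weighting would apply to the spatial parameters.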
The registration scheme described in this chapter is constrained to describe warps with a
few hundred parameters. More powerful and less expensive computers are rapidly evolving,
so algorithms that are currently applicable will become increasingly redundant as it becomes
feasible to attempt more precise registrations. Scanning hardware is also improving, leading
to improvements in the quality and diversity of images that can be obtained. Currently, most
registration algorithms only use the information from a single image from each subject. This is
typically a T1 MR image, which provides limited information that simply delineates grey and
white matter. For example, further information that is not available in the more conventional
sequences could be obtained from diffusion weighted imaging. Knowledge of major white matter
tracts should provide structural information more directly related to connectivity and implicitly
function, possibly leading to improved registration of functionally specialized areas.

Bibliography
[1] Y. Amit, U. Grenander, and M. Piccioni. Structural image restoration through deformable
templates. Journal of the American Statistical Association, 86:376–387, 1991.
[2] J. Ashburner and K. J. Friston. High-dimensional nonlinear image registration. NeuroImage,
7(4):S737, 1998.
[3] J. Ashburner and K. J. Friston. Nonlinear spatial normalization using basis functions. Human
Brain Mapping, 7(4):254–266, 1999.
[4] F. L. Bookstein. Principal warps: Thin-plate splines and the decomposition of deformations.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(6):567–585, 1989.
[5] F. L. Bookstein. Landmark methods for forms without landmarks: Morphometrics of group
differences in outline shape. Medical Image Analysis, 1(3):225–243, 1997.
[6] F. L. Bookstein. Quadratic variation of deformations. In J. Duncan and G. Gindi, editors,
Proc. Information Processing in Medical Imaging, pages 15–28, Berlin, Heidelberg, New
York, 1997. Springer-Verlag.
[7] M. Brett, A. P. Leff, C. Rorden, and J. Ashburner. Spatial normalization of brain images
with focal lesions using cost function masking. NeuroImage, 14(2):486–500, 2001.
[8] M. Bro-Nielsen and C. Gramkow. Fast fluid registration of medical images. Lecture Notes
in Computer Science, 1131:267–276, 1996.
[9] G. E. Christensen. Deformable shape models for anatomy. Doctoral thesis, Washington
University, Sever Institute of Technology, 1994.
[10] G. E. Christensen. Consistent linear elastic transformations for image matching. In A. Kuba
et al., editor, Proc. Information Processing in Medical Imaging, pages 224–237, Berlin, Hei-
delberg, 1999. Springer-Verlag.
[11] G. E. Christensen, R. D. Rabbitt, and M. I. Miller. 3D brain mapping using a deformable
neuroanatomy. Physics in Medicine and Biology, 39:609–618, 1994.
[12] G. E. Christensen, R. D. Rabbitt, and M. I. Miller. Deformable templates using large
deformation kinematics. IEEE Transactions on Image Processing, 5:1435–1447, 1996.

[13] G. E. Christensen, R. D. Rabbitt, M. I. Miller, S. C. Joshi, U. Grenander, T. A. Coogan, and
D. C. Van Essen. Topological properties of smooth anatomic maps. In Y. Bizais, C. Barillot,
and R. Di Paola, editors, Proc. Information Processing in Medical Imaging, pages 101–112,
Dordrecht, The Netherlands, 1995. Kluwer Academic Publishers.
[14] D. L. Collins, A. C. Evans, C. Holmes, and T. M. Peters. Automatic 3D segmentation of
neuro-anatomical structures from MRI. In Y. Bizais, C. Barillot, and R. Di Paola, editors,
Proc. Information Processing in Medical Imaging, pages 139–152, Dordrecht, The Nether-
lands, 1995. Kluwer Academic Publishers.
[15] D. L. Collins, P. Neelin, T. M. Peters, and A. C. Evans. Automatic 3D intersubject registra-
tion of MR volumetric data in standardized Talairach space. Journal of Computer Assisted
Tomography, 18:192–205, 1994.
[16] C. Davatzikos. Spatial normalization of 3D images using deformable models. Journal of
Computer Assisted Tomography, 20(4):656–665, 1996.
[17] P. J. Edwards, D. L. G. Hill, and D. J. Hawkes. Image guided interventions using a three
component tissue deformation model. In Proc. Medical Image Understanding and Analysis,
1997.
[18] A. C. Evans, D. L. Collins, S. R. Mills, E. D. Brown, R. L. Kelly, and T. M. Peters. 3D
statistical neuroanatomical models from 305 MRI volumes. In Proc. IEEE-Nuclear Science
Symposium and Medical Imaging Conference, pages 1813–1817, 1993.
[19] A. C. Evans, M. Kamber, D. L. Collins, and D. Macdonald. An MRI-based probabilistic
atlas of neuroanatomy. In S. Shorvon, D. Fish, F. Andermann, G. M. Bydder, and H. Stefan,
editors, Magnetic Resonance Scanning and Epilepsy, volume 264 of NATO ASI Series A,
Life Sciences, pages 263–274. Plenum Press, 1994.
[20] B. Fischl, D. H. Salat, E. Busa, M. Albert, M. Dieterich, C. Haselgrove, A. van der Kouwe,
R. Killiany, D. Kennedy, S. Klaveness, A. Montillo, N. Makris, B. Rosen, and A. M. Dale.
Whole brain segmentation: Automated labeling of neuroanatomical structures in the human
brain. Neuron, 33:341–355, 2002.
[21] P. T. Fox. Spatial normalization origins: Objectives, applications, and alternatives. Human
Brain Mapping, 3:161–164, 1995.
[22] K. J. Friston, J. Ashburner, C. D. Frith, J.-B. Poline, J. D. Heather, and R. S. J. Frackowiak.
Spatial registration and normalization of images. Human Brain Mapping, 2:165–189, 1995.
[23] K. J. Friston, A. P. Holmes, J.-B. Poline, P. J. Grasby, S. C. R. Williams, R. S. J. Frackowiak,
and R. Turner. Analysis of fMRI time series revisited. NeuroImage, 2:45–53, 1995.
[24] J. C. Gee, D. R. Haynor, L. Le Briquer, and R. K. Bajcsy. Advances in elastic matching
theory and its implementation. In P. Cinquin, R. Kikinis, and S. Lavallee, editors, Proc.
CVRMed-MRCAS’97, Heidelberg, 1997. Springer-Verlag.
[25] C. A. Glasbey and K. V. Mardia. A review of image warping methods. Journal of Applied
Statistics, 25:155–171, 1998.
[26] C. D. Good, I. S. Johnsrude, J. Ashburner, R. N. A. Henson, K. J. Friston, and R. S. J.
Frackowiak. A voxel-based morphometric study of ageing in 465 normal adult human brains.
NeuroImage, 14:21–36, 2001.
[27] J. C. Mazziotta, A. W. Toga, A. Evans, P. Fox, and J. Lancaster. A probabilistic atlas of the
human brain: Theory and rationale for its development. NeuroImage, 2:89–101, 1995.

[28] M. I. Miller, G. E. Christensen, Y. Amit, and U. Grenander. Mathematical textbook of
deformable neuroanatomies. Proc. National Academy of Sciences, 90:11944–11948, 1993.
[29] J.-B. Poline, K. J. Friston, K. J. Worsley, and R. S. J. Frackowiak. Estimating smoothness in
statistical parametric maps: Confidence intervals on p-values. Journal of Computer Assisted
Tomography, 19(5):788–796, 1995.
[30] D. Shen and C. Davatzikos. HAMMER: Hierarchical attribute matching mechanism for
elastic registration. IEEE Transactions on Medical Imaging, 21(11):1421–1439, 2002.
[31] C. Studholme, R. T. Constable, and J. S. Duncan. Accurate alignment of functional EPI
data to anatomical MRI using a physics-based distortion model. IEEE Transactions on
Medical Imaging, 19(11):1115–1127, 2000.
[32] J. Talairach and P. Tournoux. Coplanar stereotaxic atlas of the human brain. Thieme
Medical, New York, 1988.
[33] P. Thévenaz and M. Unser. Optimization of mutual information for multiresolution image
registration. IEEE Transactions on Image Processing, 9(12):2083–2099, 2000.
[34] J.-P. Thirion. Fast non-rigid matching of 3D medical images. Technical Report 2547, Institut
National de Recherche en Informatique et en Automatique, May 1995. Available from
http://www.inria.fr/RRRT/RR-2547.html.
[35] P. M. Thompson and A. W. Toga. Visualization and mapping of anatomic abnormalities
using a probabilistic brain atlas based on random fluid transformations. In Proc. Visualization
in Biomedical Computing, pages 383–392, 1996.
[36] R. P. Woods, S. T. Grafton, C. J. Holmes, S. R. Cherry, and J. C. Mazziotta. Automated
image registration: I. General methods and intrasubject, intramodality validation. Journal
of Computer Assisted Tomography, 22(1):139–152, 1998.
[37] R. P. Woods, S. T. Grafton, J. D. G. Watson, N. L. Sicotte, and J. C. Mazziotta. Automated
image registration: II. Intersubject validation of linear and nonlinear models. Journal of
Computer Assisted Tomography, 22(1):153–165, 1998.
[38] K. J. Worsley and K. J. Friston. Analysis of fMRI time-series revisited - again. NeuroImage,
2:173–181, 1995.

No ratings yet
@integration of Technologies and Systems For Precision Animal Agriculture
23 pages
Economic Values For Production, Functional and Fertility Traits in Milk Production Systems in Southern Brazil
No ratings yet
Economic Values For Production, Functional and Fertility Traits in Milk Production Systems in Southern Brazil
10 pages
Economic Analysis of Animal Diseases Fao
No ratings yet
Economic Analysis of Animal Diseases Fao
94 pages
@information - and - Communication - Technologies (Ict) and Precision
No ratings yet
@information - and - Communication - Technologies (Ict) and Precision
53 pages
Farm Management Strategies To Increase Robustness and Resilience of Farming Systems
No ratings yet
Farm Management Strategies To Increase Robustness and Resilience of Farming Systems
16 pages
Estimating US Dairy Clinical Disease Costs With A Stochastic Simulation Model
No ratings yet
Estimating US Dairy Clinical Disease Costs With A Stochastic Simulation Model
15 pages
@precision Livestock Farming Research A Global Scientometric Review
No ratings yet
@precision Livestock Farming Research A Global Scientometric Review
29 pages
Unlocking The Potential of Knowledge Economy For Rural Resilience The Role of Digital Platforms
No ratings yet
Unlocking The Potential of Knowledge Economy For Rural Resilience The Role of Digital Platforms
16 pages
A28 (Tavuk Karaciğer Kemik EurPoulSci)
No ratings yet
A28 (Tavuk Karaciğer Kemik EurPoulSci)
9 pages
Development of A Novel Stall Design For Dairy Cattle Part II The e - 2022 - An
No ratings yet
Development of A Novel Stall Design For Dairy Cattle Part II The e - 2022 - An
9 pages
The Ottoman Turks Seen Through The Eyes
100% (1)
The Ottoman Turks Seen Through The Eyes
10 pages
Year 5 Maths Week 1
No ratings yet
Year 5 Maths Week 1
58 pages
Stat Cookbook
No ratings yet
Stat Cookbook
31 pages
EM Alert Limits PDA - Full
No ratings yet
EM Alert Limits PDA - Full
9 pages
Nomenclature in Evaluation of Analytical Methods
100% (1)
Nomenclature in Evaluation of Analytical Methods
25 pages
Econometrics: Two Variable Regression: The Problem of Estimation
No ratings yet
Econometrics: Two Variable Regression: The Problem of Estimation
28 pages
The Simple Linear Regression Model: Specification and Estimation
No ratings yet
The Simple Linear Regression Model: Specification and Estimation
66 pages
Baum 2013
No ratings yet
Baum 2013
44 pages
International Statistical Institute (ISI)
No ratings yet
International Statistical Institute (ISI)
15 pages
Torój - OLS Revisited
No ratings yet
Torój - OLS Revisited
45 pages
Bae Cher 2016
No ratings yet
Bae Cher 2016
17 pages
Unit 2 Statistical Estimation
No ratings yet
Unit 2 Statistical Estimation
15 pages
Module 05 Estimation of Parameters
No ratings yet
Module 05 Estimation of Parameters
3 pages
Mlelectures PDF
No ratings yet
Mlelectures PDF
24 pages
Unit5 Updated
No ratings yet
Unit5 Updated
69 pages
GARCH-M Model
No ratings yet
GARCH-M Model
11 pages
Statistical Inference For Data Science Compress
No ratings yet
Statistical Inference For Data Science Compress
78 pages
Accelerated Globalization and The Dynamics of DEINDUSTRIALIZATION Inclusive and Sustainable Industrial Development Working Paper Series WP 27 - 2018
No ratings yet
Accelerated Globalization and The Dynamics of DEINDUSTRIALIZATION Inclusive and Sustainable Industrial Development Working Paper Series WP 27 - 2018
43 pages
2 Msa
No ratings yet
2 Msa
31 pages
RP ch07
No ratings yet
RP ch07
29 pages
Multiple Regression Analysis: Inference: Wooldridge: Introductory Econometrics: A Modern Approach, 5e
No ratings yet
Multiple Regression Analysis: Inference: Wooldridge: Introductory Econometrics: A Modern Approach, 5e
23 pages
Fall 2019 Ltam Syllabus PDF
No ratings yet
Fall 2019 Ltam Syllabus PDF
7 pages
IFoA Syllabus 2019-2017
No ratings yet
IFoA Syllabus 2019-2017
201 pages
Chapter4 Estimation
No ratings yet
Chapter4 Estimation
28 pages
Gender-Biased Evaluation or Actual Differences? Fairness in The Evaluation of Faculty Teaching
No ratings yet
Gender-Biased Evaluation or Actual Differences? Fairness in The Evaluation of Faculty Teaching
19 pages
Homework Problems Stat 479: March 28, 2012
No ratings yet
Homework Problems Stat 479: March 28, 2012
44 pages
Sampling Distributions
No ratings yet
Sampling Distributions
92 pages
Unit 2 Notes - Final
No ratings yet
Unit 2 Notes - Final
32 pages
Solution To Exercise 6.2
No ratings yet
Solution To Exercise 6.2
2 pages
Cost Estimate
No ratings yet
Cost Estimate
3 pages


mapping can be defined:

$y_1 = q_1 + q_2 x_1 + q_3 x_1^2 + q_4 x_1^3 + \ldots$

This section begins by introducing a modification to the optimization method described in

3.2.1 A Maximum A Posteriori Solution

A Bayesian registration scheme is used in order to obtain a maximum a posteriori estimate

p(q|b) ∝ p(b|q)p(q) (3.1)
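
As a minimal sketch of how Eqn. 3.1 leads to a regularized estimate, consider a toy problem with a single scalar parameter q, a linear model b_i = a_i q + noise, Gaussian noise and a Gaussian prior. These modelling choices, and the names `map_estimate`, `noise_var`, `prior_mean` and `prior_var`, are assumptions made for illustration only; they are not the chapter's registration algorithm. Under these assumptions, maximizing p(b|q)p(q) has a closed form in which the data and the prior are each weighted by their precision.

```python
def map_estimate(a, b, noise_var, prior_mean, prior_var):
    """MAP estimate of a scalar q for the toy model b_i = a_i * q + noise.

    Assumes a Gaussian likelihood with variance noise_var and a Gaussian
    prior N(prior_mean, prior_var); both are illustrative assumptions.
    """
    # Data term of the posterior mode: sum(a_i * b_i) / noise_var
    data_term = sum(ai * bi for ai, bi in zip(a, b)) / noise_var
    # Precision (inverse variance) contributed by the data
    data_precision = sum(ai * ai for ai in a) / noise_var
    # Precision contributed by the prior
    prior_precision = 1.0 / prior_var
    return (data_term + prior_precision * prior_mean) / (data_precision + prior_precision)


a = [1.0, 2.0, 3.0]
b = [2.1, 3.9, 6.2]
# A very broad prior leaves the estimate close to ordinary least squares.
weak = map_estimate(a, b, noise_var=1.0, prior_mean=0.0, prior_var=1e9)
# A very tight prior pulls the estimate towards the prior mean.
strong = map_estimate(a, b, noise_var=1.0, prior_mean=0.0, prior_var=1e-9)
```

In this toy setting the prior plays the role of the regularization: the tighter it is, the less the estimate is allowed to move away from the prior mean in order to reduce the squared residuals.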

simple model does not account for the regularization.

This is essentially a scaling of I − J by the number of resolution elements per voxel.

3.2.2 Affine Registration

3.2.3 Nonlinear Registration

The approach described in Section ?? is used to optimize the parameters $q_1$, $q_2$, $q_3$ and $w$,

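$$\alpha = \alpha + \begin{bmatrix} C \otimes (E_1^T E_1) & C \otimes (E_1^T E_2) & -{d_2}_{j,:}^T \otimes (E_1^T g_{:,j}) \\ (C \otimes (E_1^T E_2))^T & C \otimes (E_2^T E_2) & -{d_2}_{j,:}^T \otimes (E_2^T g_{:,j}) \\ \vdots & \vdots & \vdots \end{bmatrix}$$

$$\beta = \beta + \begin{bmatrix} {d_2}_{j,:}^T \otimes (E_1^T (f_{:,j} - w g_{:,j})) \\ {d_2}_{j,:}^T \otimes (E_2^T (f_{:,j} - w g_{:,j})) \\ \vdots \end{bmatrix}$$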

can be obtained using the chain rule:

3.2.4 Linear Regularization for Nonlinear Registration

Without regularization in the nonlinear registration, it is possible to introduce unnecessary de-

where the notation Ḋ1 refers to the first derivatives of D1 .

be constructed from H by:

This can be computed by:

Matrix $C_0^{-1}$ from Eqn. 3.4 can be constructed from H as:

with values on the diagonals of H given by:

The linear-elastic energy [28] of a two-dimensional deformation field is:

3.2.5 Templates and Intensity Transformations

$b_i(q) = f(s(x_i, q_s)) - t(x_i, q_t)$
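
The residual above can be illustrated with a one-dimensional sketch. The concrete forms used here are assumptions chosen for illustration, not the chapter's definitions: the spatial mapping is taken to be affine, s(x, q_s) = q_s[0] + q_s[1] x, the intensity model is a single scaling of one template, t(x, q_t) = q_t g(x), and the source f is sampled by linear interpolation. The helper names `interp` and `residuals` are hypothetical.

```python
def interp(samples, x):
    """Linearly interpolate `samples` (defined at 0, 1, 2, ...) at position x."""
    x = min(max(x, 0.0), len(samples) - 1.0)  # clamp to the sampled range
    i = min(int(x), len(samples) - 2)         # index of the left neighbour
    frac = x - i
    return (1.0 - frac) * samples[i] + frac * samples[i + 1]


def residuals(f, g, q_s, q_t):
    """b_i = f(s(x_i, q_s)) - t(x_i, q_t) on the template grid x_i = 0..len(g)-1.

    Assumed forms: s(x, q_s) = q_s[0] + q_s[1] * x (1D affine mapping)
    and t(x, q_t) = q_t * g(x) (template scaled by one intensity factor).
    """
    return [interp(f, q_s[0] + q_s[1] * x) - q_t * g[x] for x in range(len(g))]


g = [0.0, 1.0, 4.0, 9.0, 16.0]         # template samples
f = [0.0, 0.0, 2.0, 8.0, 18.0, 32.0]   # source: twice the template, shifted by one sample

# Parameters that undo the shift and the intensity scaling give zero residuals.
b_matched = residuals(f, g, q_s=(1.0, 1.0), q_t=2.0)
sse_matched = sum(bi * bi for bi in b_matched)

# Identity parameters leave a non-zero sum of squared differences.
b_identity = residuals(f, g, q_s=(0.0, 1.0), q_t=1.0)
sse_identity = sum(bi * bi for bi in b_identity)
```

The sum of squared residuals is the quantity that the registration minimizes with respect to both the spatial parameters and the intensity parameters; in this sketch it falls to zero only when both sets of parameters are correct.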

[13] G. E. Christensen, R. D. Rabbitt, M. I. Miller, S. C. Joshi, U. Grenander, T. A. Coogan, and

[28] M. I. Miller, G. E. Christensen, Y. Amit, and U. Grenander. Mathematical textbook of
