SIAM J. IMAGING SCIENCES © 2011 Society for Industrial and Applied Mathematics
Vol. 4, No. 2, pp. 543–572
Three-Dimensional Structure Determination from Common Lines in Cryo-EM by Eigenvectors and Semidefinite Programming

A. Singer and Y. Shkolnisky
Abstract. The cryo-electron microscopy reconstruction problem is to find the three-dimensional (3D) structure of a macromolecule given noisy samples of its two-dimensional projection images at unknown random directions. Present algorithms for finding an initial 3D structure model are based on the angular reconstitution method in which a coordinate system is established from three projections, and the orientation of the particle giving rise to each image is deduced from common lines among the images. However, a reliable detection of common lines is difficult due to the low signal-to-noise ratio of the images. In this paper we describe two algorithms for finding the unknown imaging directions of all projections by minimizing global self-consistency errors. In the first algorithm, the minimizer is obtained by computing the three largest eigenvectors of a specially designed symmetric matrix derived from the common lines, while the second algorithm is based on semidefinite programming (SDP). Compared with existing algorithms, the advantages of our algorithms are fivefold: first, they accurately estimate all orientations at very low common-line detection rates; second, they are extremely fast, as they involve only the computation of a few top eigenvectors or a sparse SDP; third, they are nonsequential and use the information in all common lines at once; fourth, they are amenable to a rigorous mathematical analysis using spectral analysis and random matrix theory; and finally, the algorithms are optimal in the sense that they reach the information theoretic Shannon bound up to a constant for an idealized probabilistic model.
Key words. cryo-electron microscopy, angular reconstitution, random matrices, semicircle law, semidefinite programming, rotation group SO(3), tomography

AMS subject classifications. 92E10, 68U10, 33C55, 60B20, 90C22
DOI. 10.1137/090767777
1. Introduction. Cryo-electron microscopy (cryo-EM) is a technique by which biological
macromolecules are imaged in an electron microscope. The molecules are rapidly frozen in
a thin (∼100 nm) layer of vitreous ice, trapping them in a nearly physiological state [1, 2].
Cryo-EM images, however, have very low contrast due to the absence of heavy-metal stains or
other contrast enhancements, and have very high noise due to the small electron doses that can
be applied to the specimen. Thus, to obtain a reliable three-dimensional (3D) density map
of a macromolecule, the information from thousands of images of identical molecules must
be combined. When the molecules are arrayed in a crystal, the necessary signal-averaging of
Received by the editors August 11, 2009; accepted for publication (in revised form) February 15, 2011; published
electronically June 7, 2011. This work was partially supported by award R01GM090200 from the National Institute
of General Medical Sciences and by award 485/10 from the Israel Science Foundation. The content is solely the
responsibility of the authors and does not necessarily represent the official views of the National Institute of General
Medical Sciences or the National Institutes of Health.
http://www.siam.org/journals/siims/4-2/76777.html
Department of Mathematics and PACM, Princeton University, Fine Hall, Washington Road, Princeton, NJ
08544-1000 ([email protected]).
Department of Applied Mathematics, School of Mathematical Sciences, Tel Aviv University, Tel Aviv 69978,
Israel ([email protected]).
noisy images is straightforwardly performed. More challenging is the problem of single-particle
reconstruction (SPR), where a 3D density map is to be obtained from images of individual
molecules present in random positions and orientations in the ice layer [1].
Because it does not require the formation of crystalline arrays of macromolecules, SPR
is a very powerful and general technique, which has been successfully used for 3D structure
determination of many protein molecules and complexes roughly 500 kDa or larger in size. In
some cases, sufficient resolution (∼0.4 nm) has been obtained from SPR to allow tracing of
the polypeptide chain and identification of residues in proteins [3, 4, 5]; however, even with
lower resolutions many important features can be identified [6].
Much progress has been made in algorithms that, given a starting 3D structure, are able
to refine that structure on the basis of a set of negative-stain or cryo-EM images, which are taken to be projections of the 3D object. Datasets typically range from 10^4 to 10^5 particle images, and refinements require tens to thousands of CPU-hours. As the starting point for the refinement process, however, some sort of ab initio estimate of the 3D structure must be made. If the molecule is known to have some preferred orientation, then it is possible to find
an ab initio 3D structure using the random conical tilt method [7, 8]. There are two known
solutions to the ab initio estimation problem of the 3D structure that do not involve tilting.
The first solution is based on the method of moments [9, 10] that exploits the known analytical
relation between the second order moments of the 2D projection images and the second order
moments of the (unknown) 3D volume in order to reveal the unknown orientations of the
particles. However, the method of moments is very sensitive to errors in the data and is
of rather academic interest [11, section 2.1, p. 251]. The second solution, on which present
algorithms are based, is the angular reconstitution method of Van Heel [12] in which a
coordinate system is established from three projections, and the orientation of the particle
giving rise to each image is deduced from common lines among the images. This method fails,
however, when the particles are too small or the signal-to-noise ratio is too low, as in such
cases it is difficult to correctly identify the common lines (see section 2 and Figure 2 for a
more detailed explanation about common lines).
Ideally one would want to do the 3D reconstruction directly from projections in the form
of raw images. However, the determination of common lines from the very noisy raw images
is typically too error-prone. Instead, the determination of common lines is performed on pairs
of class averages, namely, averages of particle images that correspond to the same viewing
direction. To reduce variability, class averages are typically computed from particle images
that have already been rotationally and translationally aligned [1, 13]. The choice of reference
images for the alignment is, however, arbitrary and can represent a source of bias in the
classification process. This therefore sets the goal for an ab initio reconstruction algorithm
that requires as little averaging as possible.
By now there is a long history of common-line-based algorithms. As mentioned earlier,
the common lines between three projections uniquely determine their relative orientations up
to handedness (chirality). This observation is the basis of the angular reconstitution method
of Van Heel [12], which was also developed independently by Vainshtein and Goncharov [14].
Other historical aspects of the method can be found in [15]. Farrow and Ottensmeyer [16]
used quaternions to obtain the relative orientation of a new projection in a least squares
sense. The main problem with such sequential approaches is that they are sensitive to false
detection of common lines, which leads to the accumulation of errors (see also [13, p. 336]).
Penczek, Zhu, and Frank [17] tried to obtain the rotations corresponding to all projections
simultaneously by minimizing a global energy functional. Unfortunately, minimization of
the energy functional requires a brute force search in a huge parametric space of all possible
orientations for all projections. Mallick et al. [18] suggested an alternative Bayesian approach,
in which the common line between a pair of projections can be inferred from their common
lines with different projection triplets. The problem with this particular approach is that
it requires too many (at least seven) common lines to be correctly identied simultaneously.
Therefore, it is not suitable in cases where the detection rate of correct common lines is low.
In [19] we introduced an improved Bayesian approach based on voting that requires only
two common lines to be correctly identified simultaneously and can therefore distinguish the
correctly identified common lines from the incorrect ones at much lower detection rates. The
common lines that passed the voting procedure are then used by our graph-based approach
[20] to assign Euler angles to all projection images. As shown in [19], the combination of
the voting method with the graph-based method resulted in a 3D ab initio reconstruction
of the E. coli 50S ribosomal subunit from real microscope images that had undergone only
rudimentary averaging.
The two-dimensional (2D) variant of the ab initio reconstruction problem in cryo-EM,
namely, the reconstruction of 2D objects from their one-dimensional (1D) projections taken
at random and unknown directions, has a somewhat shorter history, starting with the work
of Basu and Bresler [21, 22], who considered the mathematical uniqueness of the problem as
well as the statistical and algorithmic aspects of reconstruction from noisy projections. In [23]
we detailed a graph-Laplacian based approach for the solution of this problem. Although the
two problems are related, there is a striking difference between the ab initio reconstruction
problems in 2D and 3D. In the 3D problem, the Fourier transforms of any pair of 2D projection
images share a common line, which provides some non-trivial information about their relative
orientations. In the 2D problem, however, the intersection of the Fourier transforms of any 1D
projection sinograms is the origin, and this trivial intersection point provides no information
about the angle between the projection directions. This is a significant difference, and, as a
result, the solution methods to the two problems are also quite different. Hereafter we solely
consider the 3D ab initio reconstruction problem as it arises in cryo-EM.
In this paper we introduce two common-line-based algorithms for finding the unknown
orientations of all projections in a globally consistent way. Both algorithms are motivated
by relaxations of a global minimization problem of a particular self-consistency error (SCE)
that takes into account the matching of common lines between all pairs of images. A similar
SCE was used in [16] to assess the quality of their angular reconstitution techniques. Our
approach is different in the sense that we actually minimize the SCE in order to find the
imaging directions. The precise definition of our global SCE is given in section 2.
In section 3, we present our first recovery algorithm, in which the global minimizer is
approximated by the top three eigenvectors of a specially designed symmetric matrix derived
from the common-line data. We describe how the unknown rotations are recovered from these
eigenvectors. The underlying assumption for the eigenvector method to succeed is that the
unknown rotations are sampled from the uniform distribution over the rotation group SO(3),
namely, that the molecule has no preferred orientation. Although it is motivated by a certain
global optimization problem, the exact mathematical justification for the eigenvector method
is provided later in section 6, where we show that the computed eigenvectors are discrete
approximations of the eigenfunctions of a certain integral operator.
In section 4, we use a different relaxation of the global optimization problem, which leads to our second recovery method based on semidefinite programming (SDP) [24]. Our SDP algorithm has similarities to the Goemans–Williamson max-cut algorithm [25]. The SDP approach does not require the previous assumption that the rotations are sampled from the uniform distribution over SO(3).
Compared with existing algorithms, the main advantage of our methods is that they correctly find the orientations of all projections at amazingly low common-line detection rates
as they take into account all the geometric information in all common lines at once. In fact,
the estimation of the orientations improves as the number of images increases. In section 5
we describe the results of several numerical experiments using the two algorithms, showing
successful recoveries at very low common-line detection rates. For example, both algorithms
successfully recover a meaningful ab initio coordinate system from 500 projection images when
only 20% of the common lines are correctly identified. The eigenvector method is extremely efficient, and the estimated 500 rotations were obtained in a matter of seconds on a standard laptop machine.

In section 6, we show that in the limit of an infinite number of projection images, the symmetric matrix that we design converges to a convolution integral operator on the rotation group SO(3). This observation explains many of the spectral properties that the matrix exhibits. In particular, this allows us to demonstrate that the top three eigenvectors provide the recovery of all rotations. Moreover, in section 7 we analyze a probabilistic model which is introduced in section 5 and show that the effect of the misidentified common lines is equivalent to a random matrix perturbation. Thus, using classical results in random matrix theory, we demonstrate that the top three eigenvalues and eigenvectors are stable as long as the detection rate of common lines exceeds 6√2/(5√N), where N is the number of images. From the practical point of view, this result implies that 3D reconstruction is possible even at extreme levels of noise, provided that enough projections are taken. From the theoretical point of view, we show that this detection rate achieves the information theoretic Shannon bound up to a constant, rendering the optimality of our method for ab initio 3D structure determination from common lines under this idealized probabilistic model.
2. The global self-consistency error. Suppose we collect N 2D digitized projection images P_1, . . . , P_N of a 3D object taken at unknown random orientations. To each projection image P_i (i = 1, . . . , N) there corresponds a 3 × 3 unknown rotation matrix R_i describing its orientation (see Figure 1).

Figure 1. Schematic drawing of the imaging process: every projection image corresponds to some unknown 3D rotation R_i ∈ SO(3) of the unknown molecule.

Excluding the contribution of noise, the pixel intensities correspond to line integrals of the electric potential induced by the molecule along the path of the imaging electrons, that is,

(2.1)  P_i(x, y) = ∫ φ_i(x, y, z) dz,

where φ(x, y, z) is the electric potential of the molecule in some fixed laboratory coordinate system and φ_i(r) = φ(R_i^{-1} r) with r = (x, y, z). The projection operator (2.1) is also known as the X-ray transform [26]. Our goal is to find all rotation matrices R_1, . . . , R_N given the dataset of noisy images.

The Fourier projection-slice theorem (see, e.g., [26, p. 11]) says that the 2D Fourier transform of a projection image, denoted P̂, is the restriction of the 3D Fourier transform of the projected object φ̂ to the central plane (i.e., going through the origin) Π perpendicular to the imaging direction, that is,

(2.2)  P̂(ω) = φ̂(ω),  ω ∈ Π.
As every two nonparallel planes intersect at a line, it follows from the Fourier projection-slice theorem that any two projection images have a common line of intersection in the Fourier domain. Therefore, if P̂_i and P̂_j are the 2D Fourier transforms of projections P_i and P_j, then there must be a central line in P̂_i and a central line in P̂_j on which the two transforms agree (see Figure 2). This pair of lines is known as the common line. We parameterize the common line by ξ(x_ij, y_ij) in P̂_i and by ξ(x_ji, y_ji) in P̂_j, where ξ ∈ ℝ is the radial frequency and (x_ij, y_ij) and (x_ji, y_ji) are two unit vectors for which

(2.3)  P̂_i(ξ x_ij, ξ y_ij) = P̂_j(ξ x_ji, ξ y_ji)  for all ξ ∈ ℝ.

It is instructive to consider the unit vectors (x_ij, y_ij) and (x_ji, y_ji) as 3D vectors by zero-padding. Specifically, we define c_ij and c_ji as

(2.4)  c_ij = (x_ij, y_ij, 0)^T,
(2.5)  c_ji = (x_ji, y_ji, 0)^T.
Figure 2. Fourier projection-slice theorem and common lines: the Fourier transforms P̂_i and P̂_j of projections P_i and P_j agree along a pair of central lines, parameterized by (x_ij, y_ij) and (x_ji, y_ji), which are mapped to the same line in 3D Fourier space, R_i c_ij = R_j c_ji.
Being the common line of intersection, the mapping of c_ij by R_i must coincide with the mapping of c_ji by R_j:

(2.6)  R_i c_ij = R_j c_ji  for 1 ≤ i < j ≤ N.
These can be viewed as \binom{N}{2} linear equations for the 6N variables corresponding to the first two columns of the rotation matrices (as c_ij and c_ji have a zero third entry, the third column of each rotation matrix does not contribute in (2.6)). Such overdetermined systems of linear equations are usually solved by the least squares method [17]. Unfortunately, the least squares approach is inadequate in our case due to the typically large proportion of falsely detected common lines that will dominate the sum of squares error in

(2.7)  min_{R_1, . . . , R_N} Σ_{i≠j} ||R_i c_ij − R_j c_ji||^2.
Moreover, the global least squares problem (2.7) is nonconvex and therefore extremely difficult to solve if one requires the matrices R_i to be rotations, that is, when adding the constraints

(2.8)  R_i R_i^T = I,  det(R_i) = 1  for i = 1, . . . , N,

where I is the 3 × 3 identity matrix. A relaxation method that neglects the constraints (2.8) will simply collapse to the trivial solution R_1 = · · · = R_N = 0, which obviously does not satisfy the constraint (2.8). Such a collapse is easily prevented by fixing one of the rotations, for example, by setting R_1 = I, but this would not make the robustness problem of the least squares method go away. We therefore take a different approach for solving the global optimization problem.
Since c_ij and c_ji are 3D unit vectors (||c_ij|| = ||c_ji|| = 1), their rotations are also unit vectors; that is, ||R_i c_ij|| = ||R_j c_ji|| = 1. It follows that the minimization problem (2.7) is equivalent to the maximization problem of the sum of dot products

(2.9)  max_{R_1, . . . , R_N} Σ_{i≠j} R_i c_ij · R_j c_ji,

subject to the constraints (2.8). For the true assignment of rotations, the dot product R_i c_ij · R_j c_ji equals 1 whenever the common line between images i and j is correctly detected. Dot products corresponding to misidentified common lines can take any value between −1 and 1, and if we assume that such misidentified lines have random directions, then such dot products can be considered as independent and identically distributed (i.i.d.) zero-mean random variables taking values in the interval [−1, 1]. The objective function in (2.9) is the summation over all possible dot products. Summing up dot products that correspond to misidentified common lines results in many cancelations, whereas summing up dot products of correctly identified common lines is simply a sum of ones. We may consider the contribution of the falsely detected common lines as a random walk on the real line, where steps to the left and to the right are equally probable. From this interpretation it follows that the total contribution of the misidentified common lines to the objective function (2.9) is proportional to the square root of the number of misidentifications, whereas the contribution of the correctly identified common lines is linear. This square-root diminishing effect of the misidentifications makes the global optimization (2.9) extremely robust compared with the least squares approach, which is much more sensitive because its objective function is dominated by the misidentifications.
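To make the preceding discussion concrete, the global objective (2.9) can be evaluated directly from a candidate set of rotations and the detected common lines. The following minimal numpy sketch is illustrative only (it is not the authors' code); the array layout is our own assumption: the rotations are stored as an (N, 3, 3) array R and the zero-padded common-line vectors of (2.4)–(2.5) as an (N, N, 3) array c with c[i, j] = c_ij.

```python
import numpy as np

def global_objective(R, c):
    """Sum of R_i c_ij . R_j c_ji over all ordered pairs i != j, as in (2.9).

    R : (N, 3, 3) array of rotation matrices.
    c : (N, N, 3) array with c[i, j] = c_ij (zero-padded common-line vector).
    """
    N = R.shape[0]
    total = 0.0
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            total += np.dot(R[i] @ c[i, j], R[j] @ c[j, i])
    return total
```

For a correct assignment of rotations, each correctly detected pair contributes 1 to the sum, while the contributions of misidentified pairs largely cancel.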
These intuitive arguments regarding the statistical attractiveness of the optimization problem (2.9) will later be put on firm mathematical ground using random matrix theory as elaborated in section 7. Still, in order for the optimization problem (2.9) to be of any practical use, we must show that its solution can be efficiently computed. We note that our objective function is closely related to the SCE of Farrow and Ottensmeyer [16, eq. (6), p. 1754] given by

(2.10)  SCE = Σ_{i≠j} arccos(R_i c_ij · R_j c_ji).
This SCE was introduced and used in [16] to measure the success of their quaternion-based
sequential iterative angular reconstitution methods. At the small price of deleting the well-
behaved monotonic nonlinear arccos function in (2.10), we arrive at (2.9), which, as we will
soon show, has the great advantage of being amenable to efficient global nonsequential optimization by either spectral or semidefinite programming relaxations.
3. Eigenvector relaxation. The objective function in (2.9) is quadratic in the unknown rotations R_1, . . . , R_N, which means that if the constraints (2.8) are properly relaxed, then the solution to the maximization problem (2.9) would be related to the top eigenvectors of the matrix defining the quadratic form. In this section we give a precise definition of that matrix and show how the unknown rotations can be recovered from its top three eigenvectors.
We first define the four N × N matrices S^{11}, S^{12}, S^{21}, and S^{22} using all available common-line data (2.4)–(2.5) as

(3.1)  S^{11}_{ij} = x_ij x_ji,  S^{12}_{ij} = x_ij y_ji,  S^{21}_{ij} = y_ij x_ji,  S^{22}_{ij} = y_ij y_ji

for 1 ≤ i ≠ j ≤ N, while their diagonals are set to zero:

S^{11}_{ii} = S^{12}_{ii} = S^{21}_{ii} = S^{22}_{ii} = 0,  i = 1, . . . , N.

Clearly, S^{11} and S^{22} are symmetric matrices (S^{11} = S^{11T} and S^{22} = S^{22T}), while S^{12} = S^{21T}. It follows that the 2N × 2N matrix S given by

(3.2)  S = [ S^{11}  S^{12} ; S^{21}  S^{22} ]

is symmetric (S = S^T) and stores all available common-line information. More importantly, the top eigenvectors of S will reveal all rotations in a manner we describe below.
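Assembling S from the detected common lines is mechanical; here is a short numpy sketch (again an illustration rather than the authors' code), using the same hypothetical (N, N, 3) array c as above.

```python
import numpy as np

def build_S(c):
    """Assemble the 2N x 2N common-line matrix S of (3.1)-(3.2).

    c : (N, N, 3) array with c[i, j] = c_ij = (x_ij, y_ij, 0).
    """
    N = c.shape[0]
    x, y = c[:, :, 0], c[:, :, 1]
    S11 = x * x.T          # S11_ij = x_ij * x_ji
    S12 = x * y.T          # S12_ij = x_ij * y_ji
    S21 = y * x.T          # S21_ij = y_ij * x_ji
    S22 = y * y.T          # S22_ij = y_ij * y_ji
    S = np.block([[S11, S12], [S21, S22]])
    # zero the diagonals of all four blocks
    idx = np.arange(N)
    S[idx, idx] = 0.0
    S[idx, N + idx] = 0.0
    S[N + idx, idx] = 0.0
    S[N + idx, N + idx] = 0.0
    return S
```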
We denote the columns of the rotation matrix R_i by R_i^1, R_i^2, and R_i^3, and write the rotation matrices as

(3.3)  R_i = ( R_i^1  R_i^2  R_i^3 ) = [ x_i^1  x_i^2  x_i^3 ; y_i^1  y_i^2  y_i^3 ; z_i^1  z_i^2  z_i^3 ],  i = 1, . . . , N.
Only the first two columns of the R_i's need to be recovered, because the third column is given by the cross product: R_i^3 = R_i^1 × R_i^2. We therefore need to recover the six N-dimensional coordinate vectors x^1, y^1, z^1, x^2, y^2, z^2 that are defined by

(3.4)  x^1 = (x_1^1  x_2^1  · · ·  x_N^1)^T,  y^1 = (y_1^1  y_2^1  · · ·  y_N^1)^T,  z^1 = (z_1^1  z_2^1  · · ·  z_N^1)^T,
(3.5)  x^2 = (x_1^2  x_2^2  · · ·  x_N^2)^T,  y^2 = (y_1^2  y_2^2  · · ·  y_N^2)^T,  z^2 = (z_1^2  z_2^2  · · ·  z_N^2)^T.

Alternatively, we need to find the following three 2N-dimensional vectors x, y, and z:

(3.6)  x = [ x^1 ; x^2 ],  y = [ y^1 ; y^2 ],  z = [ z^1 ; z^2 ].
Using this notation we rewrite the objective function (2.9) as

(3.7)  Σ_{i≠j} R_i c_ij · R_j c_ji = x^T S x + y^T S y + z^T S z,
which is a result of the following index manipulation:

        Σ_{i≠j} R_i c_ij · R_j c_ji
          = Σ_{i≠j} x_ij x_ji R_i^1 · R_j^1 + x_ij y_ji R_i^1 · R_j^2 + y_ij x_ji R_i^2 · R_j^1 + y_ij y_ji R_i^2 · R_j^2
(3.8)     = Σ_{i≠j} S^{11}_{ij} R_i^1 · R_j^1 + S^{12}_{ij} R_i^1 · R_j^2 + S^{21}_{ij} R_i^2 · R_j^1 + S^{22}_{ij} R_i^2 · R_j^2
          = Σ_{i,j} S^{11}_{ij} (x_i^1 x_j^1 + y_i^1 y_j^1 + z_i^1 z_j^1) + S^{12}_{ij} (x_i^1 x_j^2 + y_i^1 y_j^2 + z_i^1 z_j^2)
                  + S^{21}_{ij} (x_i^2 x_j^1 + y_i^2 y_j^1 + z_i^2 z_j^1) + S^{22}_{ij} (x_i^2 x_j^2 + y_i^2 y_j^2 + z_i^2 z_j^2)
          = x^{1T} S^{11} x^1 + y^{1T} S^{11} y^1 + z^{1T} S^{11} z^1 + x^{1T} S^{12} x^2 + y^{1T} S^{12} y^2 + z^{1T} S^{12} z^2
                  + x^{2T} S^{21} x^1 + y^{2T} S^{21} y^1 + z^{2T} S^{21} z^1 + x^{2T} S^{22} x^2 + y^{2T} S^{22} y^2 + z^{2T} S^{22} z^2
(3.9)     = x^T S x + y^T S y + z^T S z.
The equality (3.7) shows that the maximization problem (2.9) is equivalent to the maximization problem

(3.10)  max_{R_1, . . . , R_N} x^T S x + y^T S y + z^T S z,

subject to the constraints (2.8). In order to make this optimization problem tractable, we relax the constraints and look for the solution of the proxy maximization problem

(3.11)  max_{||x||=1} x^T S x.
The connection between the solution to (3.11) and that of (3.10) will be made shortly. Since S is a symmetric matrix, it has a complete set of orthonormal eigenvectors v_1, . . . , v_{2N} satisfying

S v_n = λ_n v_n,  n = 1, . . . , 2N,

with real eigenvalues λ_1 ≥ λ_2 ≥ · · · ≥ λ_{2N}. The solution to the maximization problem (3.11) is therefore given by the top eigenvector v_1 with largest eigenvalue λ_1:

(3.12)  v_1 = argmax_{||x||=1} x^T S x.

If the unknown rotations are sampled from the uniform distribution (Haar measure) over SO(3), that is, when the molecule has no preferred orientation, then the largest eigenvalue should have multiplicity three, corresponding to the vectors x, y, and z, as the symmetry of the problem in this case suggests that there is no reason to prefer x over y and z that appear in (3.10). At this point, the reader may still wonder what is the mathematical justification
that fills in the gap between (3.10) and (3.11). The required formal justification is provided in section 6, where we prove that in the limit of infinitely many images (N → ∞) the matrix S converges to an integral operator over SO(3) for which x, y, and z in (3.6) are eigenfunctions sharing the same eigenvalue. The computed eigenvectors of the matrix S are therefore discrete approximations of the eigenfunctions of the limiting integral operator. In particular, the linear subspace spanned by the top three eigenvectors of S is a discrete approximation of the subspace spanned by x, y, and z.
We therefore expect to be able to recover the first two columns of the rotation matrices R_1, . . . , R_N from the top three computed eigenvectors v_1, v_2, v_3 of S. Since the eigenspace of x, y, and z is of dimension three, the vectors x, y, and z should be approximately obtained by a 3 × 3 orthogonal transformation applied to the computed eigenvectors v_1, v_2, v_3. This global orthogonal transformation is an inherent degree of freedom in the estimation of rotations from common lines. That is, it is possible to recover the molecule only up to a global orthogonal transformation, that is, up to rotation and possibly reflection. This recovery is performed by constructing for every i = 1, . . . , N a 3 × 3 matrix

A_i = ( A_i^1  A_i^2  A_i^3 )

whose columns are given by

(3.13)  A_i^1 = (v_1(i), v_2(i), v_3(i))^T,  A_i^2 = (v_1(N + i), v_2(N + i), v_3(N + i))^T,  A_i^3 = A_i^1 × A_i^2,

where v_k(i) denotes the ith entry of the eigenvector v_k.
In practice, due to erroneous common lines and deviations from the uniformity assumption, the matrix A_i is approximately a rotation, so we estimate R_i as the closest rotation matrix to A_i in the Frobenius matrix norm. This is done via the well-known procedure [27] R_i = U_i V_i^T, where A_i = U_i Σ_i V_i^T is the singular value decomposition of A_i. A second set of valid rotations R̃_i is obtained from the matrices Ã_i whose columns are given by

(3.14)  Ã_i^1 = diag(1, 1, −1) A_i^1,  Ã_i^2 = diag(1, 1, −1) A_i^2,  Ã_i^3 = Ã_i^1 × Ã_i^2,

via their singular value decomposition, that is, R̃_i = Ũ_i Ṽ_i^T, where Ã_i = Ũ_i Σ̃_i Ṽ_i^T. The second set of rotations R̃_i amounts to a global reflection of the molecule; it is a well-known fact that the chirality of the molecule cannot be determined from common-line data. Thus, in the absence of any other information, it is impossible to prefer one set of rotations over the other.
From the computational point of view, we note that a simple way of computing the top three eigenvectors is using the iterative power method, where three initial randomly chosen vectors are repeatedly multiplied by the matrix S and then orthonormalized by the Gram–Schmidt (QR) procedure until convergence. The number of iterations required by such a procedure is determined by the spectral gap between the third and fourth eigenvalues. The spectral gap is further discussed in sections 5–7. In practice, for large values of N we use the MATLAB function eigs to compute the few top eigenvectors, while for small N we compute all eigenvectors using the MATLAB function eig. We remark that the computational bottleneck for large N is often the storage of the 2N × 2N matrix S rather than the time complexity of computing the top eigenvectors.
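Outside of MATLAB, the recovery procedure of this section (top three eigenvectors of S followed by the rounding of (3.13)) can be sketched in a few lines of Python; this is a minimal illustration under the array conventions introduced earlier, not the authors' code, and the determinant check at the end is merely a safeguard, with the handedness ambiguity still resolved globally via (3.14).

```python
import numpy as np
from scipy.sparse.linalg import eigsh

def rotations_from_S(S):
    """Estimate rotations from the top three eigenvectors of S, cf. (3.12)-(3.13).

    S : (2N, 2N) symmetric numpy array. Returns an (N, 3, 3) array of rotations,
    defined only up to a global orthogonal transformation.
    """
    N = S.shape[0] // 2
    vals, vecs = eigsh(S, k=3, which='LA')     # three largest eigenpairs
    V = vecs[:, np.argsort(vals)[::-1]]        # columns ordered v1, v2, v3
    R = np.empty((N, 3, 3))
    for i in range(N):
        A = np.empty((3, 3))
        A[:, 0] = V[i, :]                      # A_i^1 from the ith entries, cf. (3.13)
        A[:, 1] = V[N + i, :]                  # A_i^2 from the (N+i)th entries
        A[:, 2] = np.cross(A[:, 0], A[:, 1])   # A_i^3 = A_i^1 x A_i^2
        U, _, Vt = np.linalg.svd(A)            # closest rotation in the Frobenius norm
        if np.linalg.det(U @ Vt) < 0:          # safeguard against a stray reflection
            U[:, -1] *= -1
        R[i] = U @ Vt
    return R
```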
4. Relaxation by a semidefinite program. In this section we present an alternative relaxation of (2.9) using semidefinite programming (SDP) [24], which draws similarities with the Goemans–Williamson SDP for finding the maximum cut in a weighted graph [25]. The relaxation of the SDP is tighter than the eigenvector relaxation and does not require the assumption that the rotations are uniformly sampled over SO(3).

The SDP formulation begins with the introduction of two 3 × N matrices R^1 and R^2 defined by concatenating the first columns and second columns of the N rotation matrices, respectively,

(4.1)  R^1 = ( R_1^1  R_2^1  · · ·  R_N^1 ),  R^2 = ( R_1^2  R_2^2  · · ·  R_N^2 ).

We also concatenate R^1 and R^2 to define a 3 × 2N matrix R given by

(4.2)  R = ( R^1  R^2 ) = ( R_1^1  R_2^1  · · ·  R_N^1  R_1^2  R_2^2  · · ·  R_N^2 ).
The Gram matrix G for the matrix R is a 2N × 2N matrix of inner products between the 3D column vectors of R, that is,

(4.3)  G = R^T R.

Clearly, G is a rank-3 positive semidefinite matrix (G ⪰ 0), which can be conveniently written as a block matrix

(4.4)  G = [ G^{11}  G^{12} ; G^{21}  G^{22} ] = [ R^{1T} R^1  R^{1T} R^2 ; R^{2T} R^1  R^{2T} R^2 ].
The orthogonality of the rotation matrices (R_i^T R_i = I) implies that

(4.5)  G^{11}_{ii} = G^{22}_{ii} = 1,  i = 1, 2, . . . , N,

and

(4.6)  G^{12}_{ii} = G^{21}_{ii} = 0,  i = 1, 2, . . . , N.

From (3.8) it follows that the objective function (2.9) is the trace of the matrix product SG:

(4.7)  Σ_{i≠j} R_i c_ij · R_j c_ji = trace(SG).
A natural relaxation of the optimization problem (2.9) is thus given by the SDP

(4.8)   max_{G ∈ ℝ^{2N×2N}}  trace(SG)
(4.9)   subject to  G ⪰ 0,
(4.10)  G^{11}_{ii} = G^{22}_{ii} = 1,  G^{12}_{ii} = G^{21}_{ii} = 0,  i = 1, 2, . . . , N.

The only constraint missing in this SDP formulation is the nonconvex rank-3 constraint on the Gram matrix G. The matrix R is recovered from the Cholesky decomposition of the solution G of the SDP (4.8)–(4.10). If the rank of G is greater than 3, then we project the rows of R onto the subspace spanned by the top three eigenvectors of G and recover the rotations using the procedure that was detailed in the previous section in (3.13). We note that except for the orthogonality constraint (4.6), the semidefinite program (4.8)–(4.10) is identical to the Goemans–Williamson SDP for finding the maximum cut in a weighted graph [25].

From the complexity point of view, SDP can be solved in polynomial time to any given precision, but even the most sophisticated SDP solvers that exploit the sparsity structure of the max cut problem are not competitive with the much faster eigenvector method. At first glance it may seem that the SDP (4.8)–(4.10) should outperform the eigenvector method in terms of producing more accurate rotation matrices. However, our simulations show that the accuracy of both methods is almost identical when the rotations are sampled from the uniform distribution over SO(3). As the eigenvector method is much faster, it should also be the method of choice whenever the rotations are a priori known to be uniformly sampled.
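For completeness, the relaxation (4.8)–(4.10) can be prototyped with a generic SDP modeling package. The sketch below uses cvxpy, which is our own choice for illustration (the experiments in section 5 were run with SDPLR in MATLAB), and is practical only for modest N.

```python
import cvxpy as cp
import numpy as np

def solve_sdp(S):
    """Solve the relaxation (4.8)-(4.10) with a generic SDP solver (small N only)."""
    twoN = S.shape[0]
    N = twoN // 2
    G = cp.Variable((twoN, twoN), PSD=True)          # G is symmetric positive semidefinite
    constraints = []
    for i in range(N):
        constraints += [G[i, i] == 1,                 # G^{11}_ii = 1
                        G[N + i, N + i] == 1,         # G^{22}_ii = 1
                        G[i, N + i] == 0,             # G^{12}_ii = 0
                        G[N + i, i] == 0]             # G^{21}_ii = 0
    prob = cp.Problem(cp.Maximize(cp.trace(S @ G)), constraints)
    prob.solve()
    # rank-3 projection: top three eigenvectors of G, scaled by sqrt(eigenvalue)
    vals, vecs = np.linalg.eigh(G.value)
    R_factor = (vecs[:, -3:] * np.sqrt(np.maximum(vals[-3:], 0))).T   # 3 x 2N
    return R_factor
```

The returned 3 × 2N factor plays the role of R in (4.2); the rotations are then extracted with the rounding procedure of (3.13).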
5. Numerical simulations. We performed several numerical experiments that illustrate
the robustness of the eigenvector and the SDP methods to false identifications of common
lines. All simulations were performed in MATLAB on a Lenovo Thinkpad X300 laptop with
Intel Core 2 CPU L7100 1.2GHz with 4GB RAM running Windows Vista.
5.1. Experiments with simulated rotations. In the first series of simulations we tried to imitate the experimental setup by using the following procedure. In each simulation, we randomly sampled N rotations from the uniform distribution over SO(3). This was done by randomly sampling N vectors in ℝ^4 whose coordinates are i.i.d. Gaussians, followed by normalizing these vectors to the unit 3D sphere S^3 ⊂ ℝ^4. The normalized vectors are viewed as unit quaternions which we converted into 3 × 3 rotation matrices R_1, . . . , R_N. We then computed all pairwise common-line vectors

c_ij = R_i^{−1} (R_i^3 × R_j^3) / ||R_i^3 × R_j^3||   and   c_ji = R_j^{−1} (R_i^3 × R_j^3) / ||R_i^3 × R_j^3||

(see also the discussion following (6.2)). For each pair of rotations, with probability p we kept the values of c_ij and c_ji unchanged, while with probability 1 − p we replaced c_ij and c_ji by two random vectors that were sampled from the uniform distribution over the unit circle in the plane. The parameter p ranges from 0 to 1 and indicates the proportion of the correctly detected common lines. For example, p = 0.1 means that only 10% of the common lines are identified correctly, and the other 90% of the entries of the matrix S are filled in with random entries corresponding to some randomly chosen unit vectors.
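The simulation just described is easy to reproduce. The following sketch (ours, not the authors' code) samples uniform rotations via normalized Gaussian quaternions and generates clean or corrupted common-line vectors c_ij, c_ji in the (N, N, 3) layout used in the earlier sketches.

```python
import numpy as np

def simulate_common_lines(N, p, rng=np.random.default_rng(0)):
    """Sample N uniform rotations and their (possibly corrupted) common-line vectors.

    Each pair of common lines is kept with probability p and otherwise replaced by
    random unit vectors, following the probabilistic model of section 5.1.
    Returns (R, c) with R of shape (N, 3, 3) and c of shape (N, N, 3), c[i, j] = c_ij.
    """
    # uniform rotations from normalized Gaussian 4-vectors (unit quaternions)
    q = rng.standard_normal((N, 4))
    q /= np.linalg.norm(q, axis=1, keepdims=True)
    w, x, y, z = q.T
    R = np.stack([
        np.stack([1 - 2*(y**2 + z**2), 2*(x*y - w*z),       2*(x*z + w*y)],       axis=-1),
        np.stack([2*(x*y + w*z),       1 - 2*(x**2 + z**2), 2*(y*z - w*x)],       axis=-1),
        np.stack([2*(x*z - w*y),       2*(y*z + w*x),       1 - 2*(x**2 + y**2)], axis=-1),
    ], axis=1)                                   # (N, 3, 3)

    c = np.zeros((N, N, 3))
    for i in range(N):
        for j in range(i + 1, N):
            if rng.random() < p:                 # correctly detected common line
                v = np.cross(R[i, :, 2], R[j, :, 2])
                v /= np.linalg.norm(v)
                c[i, j] = R[i].T @ v             # c_ij = R_i^{-1}(R_i^3 x R_j^3)/||...||
                c[j, i] = R[j].T @ v
            else:                                # misidentified: random unit vectors
                a, b = rng.uniform(0, 2*np.pi, size=2)
                c[i, j] = (np.cos(a), np.sin(a), 0.0)
                c[j, i] = (np.cos(b), np.sin(b), 0.0)
    return R, c
```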
Figure 3 shows the distribution of the eigenvalues of the matrix S for two different values of N and four different values of the probability p. It took a matter of seconds to compute each of the eigenvalue histograms shown in Figure 3.

Figure 3. Eigenvalue histograms of the matrix S for different values of N and p: (a) N = 100, p = 1; (b) N = 100, p = 0.5; (c) N = 100, p = 0.25; (d) N = 100, p = 0.1; (e) N = 500, p = 1; (f) N = 500, p = 0.5; (g) N = 500, p = 0.1; (h) N = 500, p = 0.05.

Evident from the eigenvalue histograms
is the spectral gap between the three largest eigenvalues and the remaining eigenvalues, as
long as p is not too small. As p decreases, the spectral gap narrows down, until it com-
pletely disappears at some critical value p_c, which we call the threshold probability. Figure 3 indicates that the value of the critical probability for N = 100 is somewhere between 0.1 and 0.25, whereas for N = 500 it is bounded between 0.05 and 0.1. The algorithm is therefore more likely to cope with a higher percentage of misidentifications by using more images (larger N).

When p decreases, not only does the gap narrow, but also the histogram of the eigenvalues becomes smoother. The smooth part of the histogram seems to follow the semicircle law of Wigner [28, 29], as illustrated in Figure 3. The support of the semicircle gets slightly larger as p decreases, while the top three eigenvalues shrink significantly. In the next sections we will provide a mathematical explanation for the numerically observed eigenvalue histograms and for the emergence of Wigner's semicircle.
A further investigation into the results of the numerical simulations also reveals that the rotations that were recovered by the top three eigenvectors successfully approximated the sampled rotations, as long as p was above the threshold probability p_c. The accuracy of our methods is measured by the following procedure. Denote by R̃_1, . . . , R̃_N the rotations as estimated by either the eigenvector or SDP methods, and by R_1, . . . , R_N the true sampled rotations. First, note that (2.6) implies that the true rotations can be recovered only up to a fixed 3 × 3 orthogonal transformation O, since if R_i c_ij = R_j c_ji, then also O R_i c_ij = O R_j c_ji. In other words, a completely successful recovery satisfies R̃_i^{−1} R_i = O for all i = 1, . . . , N for some fixed orthogonal matrix O. In practice, however, due to erroneous common lines and deviation from uniformity (for the eigenvector method), there does not exist an orthogonal transformation O that perfectly aligns all the estimated rotations with the true ones. But we may still look for the optimal rotation Ô that minimizes the sum of squared distances between the estimated rotations and the true ones:

(5.1)  Ô = argmin_{O ∈ SO(3)} Σ_{i=1}^{N} ||R_i − O R̃_i||_F^2,
where || · ||_F denotes the Frobenius matrix norm. That is, Ô is the optimal solution to the registration problem between the two sets of rotations in the sense of minimizing the mean squared error (MSE). Using properties of the trace, in particular tr(AB) = tr(BA) and tr(A) = tr(A^T), we notice that

(5.2)  Σ_{i=1}^{N} ||R_i − O R̃_i||_F^2 = Σ_{i=1}^{N} tr[(R_i − O R̃_i)(R_i − O R̃_i)^T] = Σ_{i=1}^{N} tr(2I − 2 O R̃_i R_i^T) = 6N − 2 tr(O Σ_{i=1}^{N} R̃_i R_i^T).
Let Q be the 3 × 3 matrix

(5.3)  Q = (1/N) Σ_{i=1}^{N} R̃_i R_i^T;
then from (5.2) it follows that the MSE is given by

(5.4)  (1/N) Σ_{i=1}^{N} ||R_i − O R̃_i||_F^2 = 6 − 2 tr(OQ).
Arun, Huang, and Blostein [27] proved that tr(OQ) ≤ tr(V U^T Q) for all O ∈ SO(3), where Q = U Σ V^T is the singular value decomposition of Q. It follows that the MSE is minimized by the orthogonal matrix Ô = V U^T, and the MSE in such a case is given by

(5.5)  MSE = (1/N) Σ_{i=1}^{N} ||R_i − Ô R̃_i||_F^2 = 6 − 2 tr(V U^T U Σ V^T) = 6 − 2 Σ_{r=1}^{3} σ_r,

where σ_1, σ_2, σ_3 are the singular values of Q. In particular, the MSE vanishes whenever Q is an orthogonal matrix, because in such a case σ_1 = σ_2 = σ_3 = 1.
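In code, the registration and the MSE of (5.3)–(5.5) amount to a single SVD; a minimal sketch with hypothetical (N, 3, 3) arrays R_true and R_est (names ours):

```python
import numpy as np

def registration_mse(R_true, R_est):
    """Align estimated rotations to the true ones and return the MSE of (5.5).

    The optimal alignment is O_hat = V U^T, where
    Q = (1/N) sum_i R_est[i] R_true[i]^T = U Sigma V^T.
    """
    N = R_true.shape[0]
    Q = sum(R_est[i] @ R_true[i].T for i in range(N)) / N
    U, sigma, Vt = np.linalg.svd(Q)
    O_hat = Vt.T @ U.T
    mse = 6.0 - 2.0 * sigma.sum()   # equals (1/N) sum_i ||R_true[i] - O_hat R_est[i]||_F^2
    return mse, O_hat
```

Because of the handedness ambiguity, the MSE is computed for both valid sets of rotations and the smaller value is reported, as described next.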
In our simulations we compute the MSE (5.5) for each of the two valid sets of rotations (due to the handedness ambiguity; see (3.13)–(3.14)) and always present the smaller of the two. Table 1 compares the MSEs that were obtained by the eigenvector method with the ones obtained by the SDP method for N = 100 and N = 500 with the same common-line input data. The SDP was solved using SDPLR, a package for solving large-scale SDP problems [30], in MATLAB.
5.2. Experiments with simulated noisy projections. In the second series of experiments, we tested the eigenvector and SDP methods on simulated noisy projection images of a ribosomal subunit for different numbers of projections (N = 100, 500, 1000) and different levels of noise. For each N, we generated N noise-free centered projections of the ribosomal subunit, whose corresponding rotations were uniformly distributed on SO(3). Each projection was of size 129 × 129 pixels. Next, we fixed a signal-to-noise ratio (SNR), and added to each
Table 1
The MSE of the eigenvector and SDP methods for N = 100 (left) and N = 500 (right) and different values of p.

(a) N = 100
 p      MSE(eig)   MSE(sdp)
 1      0.0055     4.8425e-05
 0.5    0.0841     0.0676
 0.25   0.7189     0.7140
 0.15   2.8772     2.8305
 0.1    4.5866     4.7814
 0.05   4.8029     5.1809

(b) N = 500
 p      MSE(eig)   MSE(sdp)
 1      0.0019     1.0169e-05
 0.5    0.0166     0.0143
 0.25   0.0973     0.0911
 0.15   0.3537     0.3298
 0.1    1.2739     1.1185
 0.05   5.4371     5.3568
Figure 4. Simulated projection with various levels of additive Gaussian white noise: (a) clean; (b) SNR = 1; (c) SNR = 1/2; (d) SNR = 1/4; (e) SNR = 1/8; (f) SNR = 1/16; (g) SNR = 1/32; (h) SNR = 1/64; (i) SNR = 1/128; (j) SNR = 1/256.
clean projection additive Gaussian white noise¹ of the prescribed SNR. The SNR in all our experiments is defined by

(5.6)  SNR = Var(Signal) / Var(Noise),

where Var is the variance (energy), Signal is the clean projection image, and Noise is the noise realization of that image. Figure 4 shows one of the projections at different SNR levels. The SNR values used throughout this experiment were 2^{−k} with k = 0, . . . , 9. Clean projections were generated by setting SNR = 2^{20}.
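Generating noisy projections at a prescribed SNR as in (5.6) only requires scaling white noise to the appropriate variance; a short sketch (ours):

```python
import numpy as np

def add_white_noise(projection, snr, rng=np.random.default_rng(0)):
    """Add Gaussian white noise so that Var(Signal)/Var(Noise) = snr, cf. (5.6)."""
    noise_var = projection.var() / snr
    noise = rng.standard_normal(projection.shape) * np.sqrt(noise_var)
    return projection + noise
```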
We computed the 2D Fourier transform of all projections on a polar grid discretized into L = 72 central lines, corresponding to an angular resolution of 360°/72 = 5°. We constructed the matrix S according to (3.1)–(3.2) by comparing all \binom{N}{2} pairs of projection images; for each pair we detected the common line by computing all L^2/2 possible different normalized correlations between their Fourier central lines, of which the pair of central lines having the maximum normalized correlation was declared as the common line. Table 2 shows the proportion p of the correctly detected common lines as a function of the SNR (we consider a common line as correctly identified if each of the estimated direction vectors (x_ij, y_ij) and (x_ji, y_ji) is within 10° of its true direction).

¹Perhaps a more realistic model for the noise is that of a correlated Poissonian noise rather than the Gaussian white noise model that is used in our simulations. Correlations are expected due to the varying width of the ice layer and the point-spread-function of the camera [1]. A different noise model would most certainly have an effect on the detection rate of correct common lines, but this issue is shared by all common-line-based algorithms and is not specific to our presented algorithms.

Table 2
The proportion p of correctly detected common lines as a function of the SNR. As expected, p is not a function of the number of images N.

(a) N = 100          (b) N = 500          (c) N = 1000
 SNR     p            SNR     p            SNR     p
 clean   0.997        clean   0.997        clean   0.997
 1       0.968        1       0.967        1       0.966
 1/2     0.930        1/2     0.922        1/2     0.919
 1/4     0.828        1/4     0.817        1/4     0.813
 1/8     0.653        1/8     0.639        1/8     0.638
 1/16    0.444        1/16    0.433        1/16    0.437
 1/32    0.247        1/32    0.248        1/32    0.252
 1/64    0.108        1/64    0.113        1/64    0.115
 1/128   0.046        1/128   0.046        1/128   0.047
 1/256   0.023        1/256   0.023        1/256   0.023
 1/512   0.017        1/512   0.015        1/512   0.015
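The common-line detection step described above reduces, in its simplest form, to a search for the most correlated pair of central lines. The sketch below is a bare-bones illustration (not the authors' implementation), assuming the polar Fourier transform of each projection is stored as an (L, n_r) complex array with one central line per row; in practice the lines are usually filtered and only half of the L^2 pairs need to be examined.

```python
import numpy as np

def detect_common_line(F_i, F_j):
    """Pick the pair of central lines with maximal normalized correlation.

    F_i, F_j : (L, n_r) complex arrays, one polar Fourier central line per row.
    Returns the row indices (l_i, l_j) of the detected common line.
    """
    # normalize every central line to unit norm
    A = F_i / np.linalg.norm(F_i, axis=1, keepdims=True)
    B = F_j / np.linalg.norm(F_j, axis=1, keepdims=True)
    corr = np.real(A @ B.conj().T)        # corr[k, l] = Re <line k of i, line l of j>
    l_i, l_j = np.unravel_index(np.argmax(corr), corr.shape)
    return l_i, l_j
```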
The eigenvalues of K and S are the same, with the eigenvectors of K being vectors of length 2N obtained from the eigenvectors of S by reshuffling their entries. We therefore try to understand the operation of matrix-vector multiplication of K with some arbitrary vector f of length 2N. It is convenient to view the vector f as N vectors in ℝ^2 obtained by sampling the function f : SO(3) → ℝ^2 at R_1, . . . , R_N, that is,

(6.4)  f_i = f(R_i),  i = 1, . . . , N.

The matrix-vector multiplication is thus given by

(6.5)  (Kf)_i = Σ_{j=1}^{N} K_{ij} f_j = Σ_{j=1}^{N} K(R_i, R_j) f(R_j),  i = 1, . . . , N.

If the rotations R_1, . . . , R_N are i.i.d. random variables uniformly distributed over SO(3), then the expected value of (Kf)_i conditioned on R_i is

(6.6)  E[(Kf)_i | R_i] = (N − 1) ∫_{SO(3)} K(R_i, R) f(R) dR,

where dR is the Haar measure (recall that by being a zero matrix, K(R_i, R_i) does not contribute to the sum in (6.5)). The eigenvectors of K are therefore discrete approximations to
the eigenfunctions of the integral operator 𝒦 given by

(6.7)  (𝒦f)(R_1) = ∫_{SO(3)} K(R_1, R_2) f(R_2) dR_2,

due to the law of large numbers, with the kernel K : SO(3) × SO(3) → ℝ^{2×2} given by (6.3).
We are thus interested in the eigenfunctions of the integral operator 𝒦 given by (6.7).

The integral operator 𝒦 is a convolution operator over SO(3). Indeed, note that K given in (6.3) satisfies

(6.8)  K(gR_1, gR_2) = K(R_1, R_2)  for all g ∈ SO(3),

because (gR_1^3) × (gR_2^3) = g(R_1^3 × R_2^3) and g g^{−1} = g^{−1} g = I. It follows that the kernel K depends only upon the ratio R_1^{−1} R_2, because we can choose g = R_1^{−1}, so that

K(R_1, R_2) = K(I, R_1^{−1} R_2),

and the integral operator 𝒦 of (6.7) becomes

(6.9)  (𝒦f)(R_1) = ∫_{SO(3)} K(I, R_1^{−1} R_2) f(R_2) dR_2.
We will therefore define the convolution kernel K̃ : SO(3) → ℝ^{2×2} as

(6.10)  K̃(U^{−1}) ≡ K(I, U) = [ 1 0 0 ; 0 1 0 ] (I^3 × U^3)(I^3 × U^3)^T / ||I^3 × U^3||^2 · U · [ 1 0 0 ; 0 1 0 ]^T,

where I^3 = (0 0 1)^T is the third column of the identity matrix I. We rewrite the integral operator 𝒦 from (6.7) in terms of K̃ as
(6.11)  (𝒦f)(R_1) = ∫_{SO(3)} K̃(R_2^{−1} R_1) f(R_2) dR_2 = ∫_{SO(3)} K̃(U) f(R_1 U^{−1}) dU,

where we used the change of variables U = R_2^{−1} R_1. Equation (6.11) implies that 𝒦 is a convolution operator over SO(3) given by [33, p. 158]

(6.12)  𝒦f = K̃ ∗ f.

Similar to the convolution theorem for functions over the real line, the Fourier transform of a convolution over SO(3) is the product of their Fourier transforms, where the Fourier transform is defined by a complete system of irreducible matrix-valued representations of SO(3) (see, e.g., [33, Theorem (4.14), p. 159]).
Let ρ̃^θ ∈ SO(3) be the rotation by an angle θ around the z-axis, and let ρ^θ ∈ SO(2) be the planar rotation by the same angle:

ρ̃^θ = [ cos θ  −sin θ  0 ; sin θ  cos θ  0 ; 0  0  1 ],  ρ^θ = [ cos θ  −sin θ ; sin θ  cos θ ].

The kernel K̃ satisfies the invariance property

(6.13)  K̃((ρ̃^θ U)^{−1}) = ρ^θ K̃(U^{−1}).

Indeed, ρ̃^θ I^3 = I^3 and (ρ̃^θ U)^3 = ρ̃^θ U^3, so

(6.14)  I^3 × (ρ̃^θ U)^3 = (ρ̃^θ I^3) × (ρ̃^θ U^3) = ρ̃^θ (I^3 × U^3),

from which it follows that

(6.15)  ||I^3 × (ρ̃^θ U)^3|| = ||ρ̃^θ (I^3 × U^3)|| = ||I^3 × U^3||,

because ρ̃^θ is a rotation, and

(6.16)  (I^3 × (ρ̃^θ U)^3)(I^3 × (ρ̃^θ U)^3)^T = ρ̃^θ (I^3 × U^3)(I^3 × U^3)^T (ρ̃^θ)^T.

Combining (6.15) and (6.16) yields

(6.17)  (I^3 × (ρ̃^θ U)^3)(I^3 × (ρ̃^θ U)^3)^T / ||I^3 × (ρ̃^θ U)^3||^2 · (ρ̃^θ U) = ρ̃^θ (I^3 × U^3)(I^3 × U^3)^T / ||I^3 × U^3||^2 · U,

which together with the definition of K̃ in (6.10) demonstrates the invariance property (6.13).
The fact that 𝒦 is a convolution satisfying the invariance property (6.13) implies that the eigenfunctions of 𝒦 are related to the spherical harmonics. This relation, as well as the exact computation of the eigenvalues, will be established in a separate publication [34]. We note that the spectrum of 𝒦 would have been much easier to compute if the normalization factor ||I^3 × U^3||^2 did not appear in the kernel function K̃ of (6.10). Indeed, in such a case, K̃ would have been a third order polynomial, and all eigenvalues corresponding to higher order representations would have vanished.
We note that (6.6) implies that the top eigenvalue of S^clean, denoted λ_1(S^clean), scales linearly with N; that is, with high probability,

(6.18)  λ_1(S^clean) = N λ_1(𝒦) + O(√N),

where the O(√N) term is the standard deviation of the sum in (6.5). Moreover, from the top eigenvalues observed in Figures 3(a), 3(e), 5(a), 6(a), and 7(a), corresponding to p = 1 and p values close to 1, it is safe to speculate that

(6.19)  λ_1(𝒦) = 1/2,

as the top eigenvalues are approximately 50, 250, and 500 for N = 100, 500, and 1000, respectively.
We calculate λ_1(𝒦) analytically by showing that the three columns of

(6.20)  f(U) = [ 1 0 0 ; 0 1 0 ] U^{−1}

are eigenfunctions of 𝒦. Notice that since U^{−1} = U^T, f(U) is equal to the first two columns of the rotation matrix U. This means, in particular, that U can be recovered from f(U). Since the eigenvectors of S, as computed by our algorithm (3.6), are discrete approximations of the eigenfunctions of 𝒦, it is possible to use the three eigenvectors of S that correspond to the three eigenfunctions of 𝒦 given by f(U) to recover the unknown rotation matrices.
We now verify that the columns of f(U) are eigenfunctions of 𝒦. Plugging (6.20) into (6.11) and employing (6.10) give

(6.21)  (𝒦f)(R) = ∫_{SO(3)} [ 1 0 0 ; 0 1 0 ] (I^3 × U^3)(I^3 × U^3)^T / ||I^3 × U^3||^2 · U [ 1 0 0 ; 0 1 0 ; 0 0 0 ] U^{−1} R^{−1} dU.

From U U^{−1} = I it follows that

(6.22)  U [ 1 0 0 ; 0 1 0 ; 0 0 0 ] U^{−1} = U I U^{−1} − U [ 0 0 0 ; 0 0 0 ; 0 0 1 ] U^{−1} = I − U^3 U^{3T}.

Combining (6.22) with the fact that (I^3 × U^3)^T U^3 = 0, we obtain

(6.23)  (I^3 × U^3)(I^3 × U^3)^T / ||I^3 × U^3||^2 · U [ 1 0 0 ; 0 1 0 ; 0 0 0 ] U^{−1} = (I^3 × U^3)(I^3 × U^3)^T / ||I^3 × U^3||^2.

Letting U^3 = (x y z)^T, the cross product I^3 × U^3 is given by

(6.24)  I^3 × U^3 = (−y  x  0)^T,

whose squared norm is

(6.25)  ||I^3 × U^3||^2 = x^2 + y^2 = 1 − z^2,

and

(6.26)  (I^3 × U^3)(I^3 × U^3)^T = [ y^2  −xy  0 ; −xy  x^2  0 ; 0  0  0 ].

It follows from (6.21) and the identities (6.23)–(6.26) that

(6.27)  (𝒦f)(R) = ∫_{SO(3)} 1/(1 − z^2) [ y^2  −xy  0 ; −xy  x^2  0 ] dU · R^{−1}.
The integrand in (6.27) is only a function of U^3. The integral over SO(3) therefore collapses to an integral over the unit sphere S^2 with the uniform measure dμ (satisfying ∫_{S^2} dμ = 1), given by

(6.28)  (𝒦f)(R) = ∫_{S^2} 1/(1 − z^2) [ y^2  −xy  0 ; −xy  x^2  0 ] dμ · R^{−1}.

From symmetry it follows that ∫_{S^2} xy/(1 − z^2) dμ = 0 and that ∫_{S^2} x^2/(1 − z^2) dμ = ∫_{S^2} y^2/(1 − z^2) dμ. As x^2/(1 − z^2) + y^2/(1 − z^2) = 1 on the sphere, we conclude that ∫_{S^2} x^2/(1 − z^2) dμ = ∫_{S^2} y^2/(1 − z^2) dμ = 1/2 and

(6.29)  (𝒦f)(R) = (1/2) [ 1 0 0 ; 0 1 0 ] R^{−1} = (1/2) f(R).

This shows that the three functions defined by (6.20), which are the same as those defined in (3.6), are the three eigenfunctions of 𝒦 with the corresponding eigenvalue λ_1(𝒦) = 1/2, as was speculated before in (6.19) based on the numerical evidence.
The remaining spectrum is analyzed in [34], where it is shown that the eigenvalues of 𝒦 are

(6.30)  λ_l(𝒦) = (−1)^{l+1} / (l(l + 1)),

with multiplicities 2l + 1 for l = 1, 2, 3, . . . . An explicit expression for all eigenfunctions is also given in [34]. In particular, the spectral gap between the top eigenvalue λ_1(𝒦) = 1/2 and the next largest eigenvalue λ_3(𝒦) = 1/12 is

(6.31)  Δ(𝒦) = λ_1(𝒦) − λ_3(𝒦) = 5/12.
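The predictions (6.18) and (6.30) are easy to check numerically by building the clean matrix S for uniformly sampled rotations and inspecting its spectrum. The following self-contained sketch is ours (it assumes scipy is available for the uniform sampling); it should show three eigenvalues near N/2, followed by a group near N/12 and a negative group near −N/6.

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Build the clean matrix S for uniformly sampled rotations and inspect its spectrum.
N = 500
R = Rotation.random(N, random_state=0).as_matrix()        # (N, 3, 3) uniform rotations

S = np.zeros((2 * N, 2 * N))
for i in range(N):
    for j in range(i + 1, N):
        v = np.cross(R[i][:, 2], R[j][:, 2])
        v /= np.linalg.norm(v)
        cij, cji = R[i].T @ v, R[j].T @ v                 # clean common-line vectors
        block = np.outer(cij[:2], cji[:2])                # 2x2 block of (3.1)
        S[i, j], S[i, N + j] = block[0]
        S[N + i, j], S[N + i, N + j] = block[1]
        S[j, i], S[j, N + i] = block.T[0]
        S[N + j, i], S[N + j, N + i] = block.T[1]

eigs = np.sort(np.linalg.eigvalsh(S))[::-1]
print(eigs[:5] / N)    # roughly [0.5, 0.5, 0.5, 0.083, 0.083]
```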
7. Wigner's semicircle law and the threshold probability. As indicated by the numerical experiments of section 5, false detections of common lines due to noise lead to the emergence of what seems to be Wigner's semicircle for the distribution of the eigenvalues of S. In this section we provide a simple mathematical explanation for this phenomenon.

Consider the simplified probabilistic model of section 5.1 that assumes that every common line is detected correctly with probability p, independently of all other common lines, and that with probability 1 − p the common lines are falsely detected and are uniformly distributed over the unit circle. The expected value of the noisy matrix S, whose entries are correct with probability p, is given by

(7.1)  E S = p S^clean,

because the contribution of the falsely detected common lines to the expected value vanishes by the assumption that their directions are distributed uniformly on the unit circle. From (7.1) it follows that S can be decomposed as

(7.2)  S = p S^clean + W,
where W is a 2N × 2N zero-mean random matrix whose entries are given by

(7.3)  W_ij = (1 − p) S^clean_ij  with probability p,  and  W_ij = −p S^clean_ij + X_ij X_ji  with probability 1 − p,

where X_ij and X_ji are two independent random variables obtained by projecting two independent random vectors uniformly distributed on the unit circle onto the x-axis. For small values of p, the variance of W_ij is dominated by the variance of the term X_ij X_ji. Symmetry implies that E X_ij^2 = E X_ji^2 = 1/2, from which we have that

(7.4)  E W_ij^2 = E X_ij^2 X_ji^2 + O(p) = 1/4 + O(p).
Wigner [28, 29] showed that the limiting distribution of the eigenvalues of random n × n symmetric matrices with independent zero-mean entries of variance σ^2, scaled down by √n, is the semicircle law supported on [−2σ, 2σ]; in particular, the largest eigenvalue concentrates near 2σ√n, and moderate values of n = 2N already suffice, with the probabilistic error bound given in [35]. The eigenvalues of W are therefore distributed according to Wigner's semicircle law whose support, up to small O(p) terms and finite sample fluctuations, is [−√(2N), √(2N)] (for example, √200 ≈ 14.14 for N = 100 and √1000 ≈ 31.6 for N = 500, in agreement with the eigenvalue histograms of Figure 3). In particular, the largest eigenvalue of W satisfies

(7.5)  λ_1(W) = √(2N) (1 + O(p)),

while, by (6.18) and (6.31), the spectral gap between the top (multiplicity-three) eigenvalue of S^clean and the rest of its spectrum is

(7.6)  Δ(S^clean) = (5/12) N + O(√N).
In [38, 39, 40] it is proved that the top eigenvalue of the matrix A + W, composed of a rank-1 matrix A and a random matrix W, will be pushed away from the semicircle with high probability if the condition

(7.7)  λ_1(A) > (1/2) λ_1(W)
is satisfied. Clearly, for matrices A that are not necessarily of rank 1, the condition (7.7) can be replaced by

(7.8)  Δ(A) > (1/2) λ_1(W),

where Δ(A) is the spectral gap. Therefore, the condition

(7.9)  p Δ(S^clean) > (1/2) λ_1(W)

guarantees that the top three eigenvalues of S will reside away from the semicircle. Substituting (7.5) and (7.6) in (7.9) results in

(7.10)  p ( (5/12) N + O(√N) ) > (1/2) √(2N) (1 + O(p)),

from which it follows that the threshold probability p_c is given by

(7.11)  p_c = 6√2 / (5√N) + O(N^{−1}).
For example, the threshold probabilities predicted for N = 100, N = 500, and N = 1000 are p_c ≈ 0.17, p_c ≈ 0.076, and p_c ≈ 0.054, respectively. These values match the numerical results of section 5.1 and are also in good agreement with the numerical experiments for the noisy projections presented in section 5.2.
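The leading-order term of (7.11) reproduces these numbers directly:

```python
import numpy as np

# Threshold probability p_c = 6*sqrt(2)/(5*sqrt(N)) of (7.11), leading order only.
for N in (100, 500, 1000):
    p_c = 6 * np.sqrt(2) / (5 * np.sqrt(N))
    print(N, round(p_c, 3))    # prints approximately 0.17, 0.076, 0.054
```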
From the perspective of information theory, the threshold probability (7.11) is nearly optimal. To see this, notice that estimating N rotations to a given finite precision requires O(N) bits of information. For p ≪ 1, the common line between a pair of images provides O(p^2) bits of information (see [41, section 5, eq. (82)]). Since there are N(N − 1)/2 pairs of common lines, the entropy of the rotations cannot decrease by more than O(p^2 N^2). Comparing p^2 N^2 to N, we conclude that the threshold probability p_c of any recovery method cannot be lower than O(1/√N). The last statement can be made precise by Fano's inequality and Wolfowitz's converse, also known as the weak and strong converse theorems to the coding theorem, which provide a lower bound for the probability of error in terms of the conditional entropy (see, e.g., [42, Chapter 8.9, pp. 204–207] and [43, Chapter 5.8, pp. 173–176]). This demonstrates the near-optimality of our eigenvector method, and we refer the reader to section 5 in [41] for a complete discussion of the information theory aspects of this problem.
8. Summary and discussion. In this paper we presented efficient methods for computing the rotations of all cryo-EM particles from common-line information in a globally consistent way. Our algorithms, one based on a spectral method (computation of eigenvectors) and the other based on SDP (a version of max-cut), are able to find the correct set of rotations even at very low common-line detection rates. Using random matrix theory and spectral analysis on SO(3), we showed that rotations obtained by the eigenvector method can lead to a meaningful ab initio model as long as the proportion of correctly detected common lines exceeds 6√2/(5√N) (assuming a simplified probabilistic model for the errors). It remains to
be seen how these algorithms will perform on real raw projection images or on their class
averages, and to compare their performance to the recently proposed voting recovery algorithm
[19], whose usefulness has already been demonstrated on real datasets. Although the voting
algorithm and the methods presented here try to solve the same problem, the methods and
their underlying mathematical theory are dierent. While the voting procedure is based on
a Bayesian approach and is probabilistic in its nature, the approach here is analytical and is
based on spectral analysis of convolution operators over SO(3) and random matrix theory.
The algorithms presented here can be regarded as a continuation of the general methodol-
ogy initiated in [41], where we showed how the problem of estimating a set of angles from their
noisy oset measurements can be solved using either eigenvectors or SDP. Notice, however,
that the problem considered here of recovering a set of rotations from common-line mea-
surements between their corresponding images is dierent and more involved mathematically
than the angular synchronization problem that is considered in [41]. Specically, the common-
line-measurement between two projection images P
i
and P
j
provides only partial information
about the ratio R
1
i
R
j
. Indeed, the common line between two images determines only two
out of the three Euler angles (the missing third degree of freedom can be determined only by
a third image). The success of the algorithms presented here shows that it is also possible to
integrate all the partial oset measurements between all rotations in a globally consistent way
that is robust to noise. Although the algorithms presented in this paper and in [41] seem to be
quite similar, the underlying mathematical foundation of the eigenvector algorithm presented
here is different, as it crucially relies on the spectral properties of the convolution operator
over SO(3).
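For concreteness, the partial information can be written out explicitly (a restatement in standard common-line notation, where $\alpha_{ij}$ denotes the direction of the common line within image $P_i$): the detected pair $(\alpha_{ij}, \alpha_{ji})$ encodes the single unit-vector equation
\[
R_i (\cos\alpha_{ij}, \sin\alpha_{ij}, 0)^T \;=\; R_j (\cos\alpha_{ji}, \sin\alpha_{ji}, 0)^T,
\]
which imposes two scalar constraints on the three degrees of freedom of the ratio $R_i^{-1} R_j$.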
We would like to point out two possible extensions of our algorithms. First, it is possible
to include confidence information about the common lines. Specifically, the normalized correlation
value of a common line is an indication of its likelihood of being correctly identified.
In other words, common lines with higher normalized correlations have a better chance of
being correct. We can therefore associate a weight $w_{ij}$ with the common line between $P_i$ and
$P_j$ to indicate our confidence in it, and multiply the corresponding $2\times 2$ rank-1 submatrix of
$S$ by this weight. This extension gives only a small improvement in terms of the MSE, as seen
in our experiments, which will be reported elsewhere. Another possible extension is to include
multiple hypotheses about the common line between two projections. This can be done by
replacing the $2\times 2$ rank-1 matrix associated with the top common line between $P_i$ and $P_j$ by
a weighted average of such $2\times 2$ rank-1 matrices corresponding to the different hypotheses.
On the one hand, this extension should benefit from the fact that the probability that one
of the hypotheses is the correct one is larger than that of just the common line with the top
correlation. On the other hand, since at most one hypothesis can be correct, all hypotheses
except maybe one are incorrect, and this leads to an increase in the variance of the random
Wigner matrix. Therefore, we often find the single hypothesis version favorable compared to
the multiple hypotheses version. The corresponding random matrix theory analysis and the
supporting numerical experiments will be reported in a separate publication.
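As a sketch of how these two extensions would enter the construction of $S$ (our illustration, with a hypothetical interface; the weights could be, e.g., normalized correlation values), the weighted and multiple-hypothesis variants simply modify how each $2\times 2$ block is formed:

```python
import numpy as np

def rank1_block(aij, aji):
    # 2x2 rank-1 block built from the common-line angles in images i and j.
    xij = np.array([np.cos(aij), np.sin(aij)])
    xji = np.array([np.cos(aji), np.sin(aji)])
    return np.outer(xij, xji)

def weighted_block(w_ij, aij, aji):
    # Confidence weighting: scale the block of the top common line by the
    # weight w_ij (e.g., its normalized correlation value).
    return w_ij * rank1_block(aij, aji)

def multi_hypothesis_block(hypotheses):
    # 'hypotheses' is a list of (weight, aij, aji) triples, one triple per
    # candidate common line for the pair (i, j); the block is the weighted
    # average of the candidate rank-1 blocks, with weights summing to one.
    w = np.array([h[0] for h in hypotheses], dtype=float)
    w /= w.sum()
    return sum(wk * rank1_block(aij, aji)
               for wk, (_, aij, aji) in zip(w, hypotheses))
```

The single-hypothesis weighting keeps each pair's confidence as an absolute scale factor, whereas the multiple-hypothesis average normalizes the competing candidates of a pair so that they jointly contribute one averaged rank-1 block.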
Finally, we note that the techniques and analysis applied here to solve the cryo-EM prob-
lem can be translated to the computer vision problem of structure from motion, where lines
perpendicular to the epipolar lines play the role of the common lines. This particular appli-
cation will be the subject of a separate publication.
Acknowledgments. We are indebted to Fred Sigworth and Ronald Coifman for introduc-
ing us to the cryo-electron microscopy problem and for many stimulating discussions. We
would like to thank Ronny Hadani, Ronen Basri, and Boaz Nadler for many valuable discus-
sions on representation theory, computer vision, and random matrix theory. We also thank
Lanhui Wang for conducting some of the numerical simulations.
REFERENCES
[1] J. Frank, Three-Dimensional Electron Microscopy of Macromolecular Assemblies: Visualization of Bio-
logical Molecules in Their Native State, Oxford University Press, New York, 2006.
[2] L. Wang and F. J. Sigworth, Cryo-EM and single particles, Physiology (Bethesda), 21 (2006), pp. 13–18.
[3] R. Henderson, Realizing the potential of electron cryo-microscopy, Q. Rev. Biophys., 37 (2004), pp. 3–13.
[4] S. J. Ludtke, M. L. Baker, D. H. Chen, J. L. Song, D. T. Chuang, and W. Chiu, De novo backbone trace of GroEL from single particle electron cryomicroscopy, Structure, 16 (2008), pp. 441–448.
[5] X. Zhang, E. Settembre, C. Xu, P. R. Dormitzer, R. Bellamy, S. C. Harrison, and N. Grigorieff, Near-atomic resolution using electron cryomicroscopy and single-particle reconstruction, Proc. Natl. Acad. Sci. USA, 105 (2008), pp. 1867–1872.
[6] W. Chiu, M. L. Baker, W. Jiang, M. Dougherty, and M. F. Schmid, Electron cryomicroscopy of biological machines at subnanometer resolution, Structure, 13 (2005), pp. 363–372.
[7] M. Radermacher, T. Wagenknecht, A. Verschoor, and J. Frank, Three-dimensional reconstruction from a single-exposure, random conical tilt series applied to the 50S ribosomal subunit of Escherichia coli, J. Microsc., 146 (1987), pp. 113–136.
[8] M. Radermacher, T. Wagenknecht, A. Verschoor, and J. Frank, Three-dimensional structure of the large subunit from Escherichia coli, EMBO J., 6 (1987), pp. 1107–1114.
[9] D. B. Salzman, A method of general moments for orienting 2D projections of unknown 3D objects, Comput. Vision Graphics Image Process., 50 (1990), pp. 129–156.
[10] A. B. Goncharov, Integral geometry and three-dimensional reconstruction of randomly oriented identical particles from their electron microphotos, Acta Appl. Math., 11 (1988), pp. 199–211.
[11] P. A. Penczek, R. A. Grassucci, and J. Frank, The ribosome at improved resolution: New techniques for merging and orientation refinement in 3D cryo-electron microscopy of biological particles, Ultramicroscopy, 53 (1994), pp. 251–270.
[12] M. Van Heel, Angular reconstitution: A posteriori assignment of projection directions for 3D reconstruction, Ultramicroscopy, 21 (1987), pp. 111–123.
[13] M. Van Heel, B. Gowen, R. Matadeen, E. V. Orlova, R. Finn, T. Pape, D. Cohen, H. Stark, R. Schmidt, M. Schatz, and A. Patwardhan, Single-particle electron cryo-microscopy: Towards atomic resolution, Q. Rev. Biophys., 33 (2000), pp. 307–369.
[14] B. Vainshtein and A. Goncharov, Determination of the spatial orientation of arbitrarily arranged identical particles of an unknown structure from their projections, in Proceedings of the 11th International Congress on Electron Microscopy, 1986, pp. 459–460.
[15] M. Van Heel, E. V. Orlova, G. Harauz, H. Stark, P. Dube, F. Zemlin, and M. Schatz, Angular reconstitution in three-dimensional electron microscopy: Historical and theoretical aspects, Scanning Microscopy, 11 (1997), pp. 195–210.
[16] M. Farrow and P. Ottensmeyer, A posteriori determination of relative projection directions of arbitrarily oriented macromolecules, JOSA A, 9 (1992), pp. 1749–1760.
[17] P. A. Penczek, J. Zhu, and J. Frank, A common-lines based method for determining orientations for N > 3 particle projections simultaneously, Ultramicroscopy, 63 (1996), pp. 205–218.
[18] S. P. Mallick, S. Agarwal, D. J. Kriegman, S. J. Belongie, B. Carragher, and C. S. Potter, Structure and view estimation for tomographic reconstruction: A Bayesian approach, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Volume II, 2006, pp. 2253–2260.
[19] A. Singer, R. R. Coifman, F. J. Sigworth, D. W. Chester, and Y. Shkolnisky, Detecting consistent common lines in cryo-EM by voting, J. Struct. Biol., 169 (2010), pp. 312–322.
[20] R. R. Coifman, Y. Shkolnisky, F. J. Sigworth, and A. Singer, Reference free structure determination through eigenvectors of center of mass operators, Appl. Comput. Harmon. Anal., 28 (2010), pp. 296–312.
[21] S. Basu and Y. Bresler, Uniqueness of tomography with unknown view angles, IEEE Trans. Image Process., 9 (2000), pp. 1094–1106.
[22] S. Basu and Y. Bresler, Feasibility of tomography with unknown view angles, IEEE Trans. Image Process., 9 (2000), pp. 1107–1122.
[23] R. R. Coifman, Y. Shkolnisky, F. J. Sigworth, and A. Singer, Graph Laplacian tomography from unknown random projections, IEEE Trans. Image Process., 17 (2008), pp. 1891–1899.
[24] L. Vandenberghe and S. Boyd, Semidefinite programming, SIAM Rev., 38 (1996), pp. 49–95.
[25] M. X. Goemans and D. P. Williamson, Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming, J. ACM, 42 (1995), pp. 1115–1145.
[26] F. Natterer, The Mathematics of Computerized Tomography, Classics Appl. Math. 32, SIAM, Philadelphia, 2001.
[27] K. Arun, T. Huang, and S. Bolstein, Least-squares fitting of two 3-D point sets, IEEE Trans. Pattern Anal. Mach. Intell., 9 (1987), pp. 698–700.
[28] E. P. Wigner, Characteristic vectors of bordered matrices with infinite dimensions, Ann. of Math., 62 (1955), pp. 548–564.
[29] E. P. Wigner, On the distribution of the roots of certain symmetric matrices, Ann. of Math., 67 (1958), pp. 325–327.
[30] S. Burer and R. D. C. Monteiro, A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization, Math. Program. Ser. B, 95 (2003), pp. 329–357.
[31] A. Averbuch and Y. Shkolnisky, 3D Fourier based discrete Radon transform, Appl. Comput. Harmon. Anal., 15 (2003), pp. 33–69.
[32] A. Averbuch, R. R. Coifman, D. L. Donoho, M. Israeli, and Y. Shkolnisky, A framework for discrete integral transformations I: The pseudo-polar Fourier transform, SIAM J. Sci. Comput., 30 (2008), pp. 764–784.
[33] R. R. Coifman and G. Weiss, Representations of compact groups and spherical harmonics, Enseignement Math., 14 (1968), pp. 121–173.
[34] R. Hadani and A. Singer, Representation theoretic patterns in cryo electron microscopy I: The intrinsic reconstitution algorithm, Ann. of Math., to appear.
[35] N. Alon, M. Krivelevich, and V. H. Vu, On the concentration of eigenvalues of random symmetric matrices, Israel J. Math., 131 (2002), pp. 259–267.
[36] A. Soshnikov, Universality at the edge of the spectrum in Wigner random matrices, Comm. Math. Phys., 207 (1999), pp. 697–733.
[37] C. A. Tracy and H. Widom, Level-spacing distributions and the Airy kernel, Comm. Math. Phys., 159 (1994), pp. 151–174.
[38] S. Péché, The largest eigenvalues of small rank perturbations of Hermitian random matrices, Probab. Theory Related Fields, 134 (2006), pp. 127–174.
[39] D. Féral and S. Péché, The largest eigenvalue of rank one deformation of large Wigner matrices, Comm. Math. Phys., 272 (2007), pp. 185–228.
[40] Z. Füredi and J. Komlós, The eigenvalues of random symmetric matrices, Combinatorica, 1 (1981), pp. 233–241.
[41] A. Singer, Angular synchronization by eigenvectors and semidefinite programming, Appl. Comput. Harmon. Anal., 30 (2010), pp. 20–36.
[42] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley, New York, 1991.
[43] R. G. Gallager, Information Theory and Reliable Communication, Wiley, New York, 1968.