
Micron 97 (2017) 41–55

Contents lists available at ScienceDirect

Micron
journal homepage: www.elsevier.com/locate/micron

SD-SEM: sparse-dense correspondence for 3D reconstruction of microscopic samples
Ahmadreza Baghaie a,∗, Ahmad P. Tafti b, Heather A. Owen c, Roshan M. D’Souza d, Zeyun Yu e

a Department of Electrical Engineering, University of Wisconsin-Milwaukee, WI, USA
b Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, WI, USA
c Department of Biological Sciences, University of Wisconsin-Milwaukee, WI, USA
d Department of Mechanical Engineering, University of Wisconsin-Milwaukee, WI, USA
e Departments of Electrical Engineering and Computer Science, University of Wisconsin-Milwaukee, WI, USA

Article history:
Received 28 August 2016
Received in revised form 15 March 2017
Accepted 15 March 2017
Available online 20 March 2017

Keywords:
Scanning electron microscope (SEM)
3D reconstruction
Feature descriptors
Dense correspondence

Abstract

Scanning electron microscopy (SEM) imaging has been a principal component of many studies in biomedical, mechanical, and materials sciences since its emergence. Despite the high resolution of captured images, they remain two-dimensional (2D). In this work, a novel framework using sparse-dense correspondence is introduced and investigated for 3D reconstruction of stereo SEM images. SEM micrographs of microscopic samples are captured by tilting the specimen stage by a known angle. The pair of SEM micrographs is then rectified using sparse scale invariant feature transform (SIFT) features/descriptors and a contrario RANSAC for matching outlier removal to ensure a gross horizontal displacement between corresponding points. This is followed by dense correspondence estimation using dense SIFT descriptors, employing a factor graph representation of the energy minimization functional and loopy belief propagation (LBP) as the means of optimization. Given the pixel-by-pixel correspondence and the tilt angle of the specimen stage during the acquisition of micrographs, depth can be recovered. Extensive tests reveal the strength of the proposed method for high-quality reconstruction of microscopic samples.

© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

Scanning electron microscopy (SEM) imaging has been a crucial technique for various studies in biomedical, mechanical, and materials sciences (Bozzola and Russell, 1999; Egerton, 2006; Jensen, 2012) and has contributed tremendously to observations of the surface structure of microscopic samples since its emergence. In a SEM, a focused electron beam is used for scanning the surface of the samples. The secondary electron (SE) or back-scattered electron (BSE) detectors are aimed to capture the signals generated by interactions of the beam with the surface. The detection of the BSE signal has proven to be beneficial for compositional studies of materials, while SE suits topographical analysis of the samples being examined. However, SEM micrographs still remain two-dimensional (2D). Therefore, having a high fidelity three dimensional reconstruction for a more effective analysis of the surface attributes and topography of the microscopic sample is of high importance. This has attracted many researchers to devise robust and reliable algorithms for 3D reconstruction of microscopic samples captured by SEM (Danzl and Scherer, 2003; Samak et al., 2007; Marinello et al., 2008; Tafti et al., 2015, 2016a,b; Eulitz and Reiss, 2015).

Multiview stereopsis has been an active research area in the computer vision community in recent years, with applications in scene reconstruction, movie making, medical visualization, virtual tourism, mobile robot navigation, virtual reality, and computer aided design (Hartley and Zisserman, 2003; Agarwal et al., 2011; Musialski et al., 2013). General 3D scene reconstruction techniques can be categorized into three major classes: (1) single-view (also known as shape from shading (SfS)), (2) multi-view (also known as shape from motion (SfM)), and (3) hybrid (Tafti et al., 2015). In the single-view class, a set of 2D images from a single viewpoint with varying lighting conditions is used for reconstruction. In contrast, in the multi-view class, the 3D surface is reconstructed by combining the information gathered from a set of 2D images acquired by changing the imaging viewpoint. In such techniques, at first, feature points are detected in the input images. This step is followed by finding the corresponding points in the images and then using projection geometry theory for estimating the camera

∗ Corresponding author.
E-mail address: [email protected] (A. Baghaie).

http://dx.doi.org/10.1016/j.micron.2017.03.009
0968-4328/© 2017 Elsevier Ltd. All rights reserved.
projection matrices. The hybrid class combines the advantages of the single-view and multi-view techniques for a more accurate 3D reconstruction (Tafti et al., 2015). Here, we put the focus on the multi-view class and therefore, having a good understanding of the feature-point detectors and also local descriptors is beneficial.

Feature detection and description are among the core components in many computer vision algorithms, and a wide range of approaches and techniques have been introduced in the past few decades to address the need for more robust and accurate feature detection/description. Even though there is no universal and exact definition of a feature that is independent of the specific application intended, methods of feature detection can be categorized into four major categories (Bernal et al., 2010): edge detectors, corner detectors, blob detectors and region detectors. The process of feature detection is usually followed by feature description, which uses a set of algorithms for describing the neighborhood of the detected features. Generally speaking, the methods of feature description can also be classified into four major classes (Bernal et al., 2010): shape descriptors, color descriptors, texture descriptors and motion descriptors. By detecting the features and defining the descriptors, one can use the new representation of the input images for a wide range of applications such as wide baseline matching, object and texture recognition, image retrieval, robot localization, video data mining, image mosaicing and recognition of object categories (Szeliski, 2010; Wöhler, 2012).

In single-view 3D surface reconstruction, creating a full model of the microscopic sample is generally not possible since the images are limited to only one view-point. Moreover, recreating the SEM micrographs of the sample under different illumination conditions is difficult. On the other hand, multi-view approaches offer a more general and achievable framework for the task. Using such techniques, a more realistic and complete reconstruction can be created. However, the need for more sophisticated matching methods that require higher computational power is inevitable. The use of multi-view techniques for 3D reconstruction of scanning electron microscopy (SEM) images has been explored in the literature in the past few years (Samak et al., 2007; Eulitz and Reiss, 2015; Zolotukhin et al., 2013; Tafti et al., 2016a,b). Still, there is room for improvement in accuracy and the needed computational resources for 3D reconstruction. Sparse feature based approaches aim to find a set of features in the input images to be represented by an aggregated set of descriptors acquired from their neighborhood. After matching between the features in the images acquired from multiple viewpoints, the projective transformations between the matches are estimated and the set of 3D points is generated. The bottleneck of such techniques is in the first step of the procedure: feature detection. Feature detection techniques are designed for detection of common features that are seen in images, including edges, corners, T-junctions, blobs, etc. The presence of noise can have negative impacts on the detected features. Moreover, the microscopic samples/surfaces to be imaged may contain areas with minimal intensity and depth variations in which no features can be detected. These cause the features, and subsequently the 3D points, to be distributed non-uniformly and very sparsely. This will greatly affect the subsequent mesh generation and surface reconstruction steps. It should also be noted that several images from different view-points are needed for building a realistic reconstruction of the microscopic sample. This increases the computational complexity of the methods. Still, even with the use of multiple views, very fine details are missed when using sparse feature based approaches. In the case of micrographs such as those acquired in biological studies, the presence of numerous microscopic objects with varying sizes causes the sparse feature-based approach to fail. These problems can be remedied by using dense correspondence between multi-view pixels. This can enable realistic topographical reconstruction of the samples and also eliminate the manual effort needed for acquiring more samples.

In this work, the use of sparse and dense correspondence for high resolution 3D reconstruction of stereo SEM micrographs is introduced and investigated in great depth and detail. Using the proposed approach, we are able to reconstruct high quality uniform meshes of the imaged surfaces which can be later used for further quantitative analysis regarding the topology and surface attributes. With the current work, we design and develop a new optimized framework for high fidelity 3D surface reconstruction from SEM micrographs. This is achieved by combining a sparse feature matching approach with high quality dense matching, which results in a highly realistic reconstruction of the microscopic samples. For the current work, the scale invariant feature transform (SIFT) method is considered as the basis. Even though the input SEM micrographs are captured in a controlled manner with careful parameter settings, they still require pre-processing steps in order to be considered as a pair of true stereo images. The reason can be sought in the differences between regular stereo imaging using a stereo rig with two cameras capable of synchronous image capturing and the procedure used in SEM in which, instead, several images are captured by tilting the specimen stage. Using sparse SIFT feature detection/description in conjunction with naive nearest neighbor matching followed by a contrario RAndom SAmple Consensus (RANSAC) for eliminating the matching outliers, a set of initial matches is found. These matches are later used for rectifying the input micrographs. As the result of rectification, the displacements between the matching points will be highly concentrated along the horizontal direction. This will greatly reduce the alignment errors introduced during the image acquisition. Following this, dense SIFT descriptors are employed for the process of dense matching. However, due to the higher dimensionality of such dense matching in comparison to the sparse matching, more rigorous approaches should be considered. By defining an energy functional that models the dense correspondence problem and representing it as a factor graph, well-known probabilistic methodologies are utilized for optimization. The result is a dense displacement field that captures the movements of individual pixels between the micrographs. The dense field can be later used for 3D point cloud and surface mesh generation.

The rest of the paper is organized as follows. Section 2 contains a detailed explanation of the various steps of the proposed method. It starts with a brief overview of the proposed method followed by subsections on the SEM imaging protocol, scale invariant feature transform (SIFT) feature detection and description, epipolar rectification using sparse SIFT features and a contrario RANSAC approach, dense correspondence for vertical/horizontal disparity computation by use of dense SIFT features, disparity refinement by taking advantage of a fast approximation variant of bilateral filtering, and finally depth estimation. In Section 3, the results generated by the proposed framework are presented with detailed comparisons with the state-of-the-art sparse feature based approaches. Pointers for future research are provided in Section 4. Section 5 concludes the paper.

2. Methods

The overview of the proposed method for high fidelity 3D reconstruction of microscopic samples using a pair of SEM micrographs captured by tilting the specimen stage by a known angle is shown in Fig. 1. After imaging the microscopic samples, three distinct steps have to be taken: (1) Pre-processing: begins with sparse feature detection/description using SIFT (Lowe, 2004). Using the detected features and employing a contrario RANSAC approach, matching outliers are eliminated and the fundamental matrix is
Fig. 1. Flowchart of the proposed framework.

Fig. 2. Example of the high fidelity 3D reconstruction of microscopic samples using a pair of SEM micrographs captured by tilting the specimen stage: (a) a set of stereo SEM
images of a Tapetal Cell with known tilting angle (9◦ ), (b) result of sparse SIFT feature detection/description with a contrario RANSAC approach for matching outlier removal,
(c) set of rectified images, (d) bilaterally filtered horizontal disparity map, (e) a magnified view of the high quality surface mesh generated using the dense point cloud, and
(f) a magnified view of the high fidelity surface model.

approximated. This is later used for rectifying the input pair (Fusiello and Irsara, 2008; Monasse, 2011). The rectification process will cause the displacements to be more concentrated along the horizontal direction. (2) Dense matching: in this step, dense SIFT descriptors are employed in combination with a factor graph representation of the energy functional to be minimized, as well as loopy belief propagation as the means of optimization, in order to find a point-by-point matching between the pixels of the input micrographs (Liu et al., 2011). (3) Post-processing: disparity filtering is done by utilizing fast bilateral filtering (Paris and Durand, 2009) in order to achieve edge- and discontinuity-preserving smoothing. This is followed by disparity/depth conversion and 3D dense point cloud and surface mesh generation. An example of the results generated by the proposed framework is shown in Fig. 2 for the Tapetal Cell micrograph pair. In the following subsections each of the steps is elaborated in more detail.

2.1. SEM imaging protocol

In this work, a Hitachi S-4800 field emission scanning electron microscope (FE-SEM) has been utilized to generate the micrographs. This SEM is equipped with a computer controlled 5-axis motorized specimen stage which enables movements in the x, y and z directions as well as tilt (−5 to 70°) and rotation (0–360°). Specimen
Table 1
Summary of the dataset: The micrographs are acquired from Tapetal Cell, Copper Bar, Copper Grid, Hexagonal Grid and Pollen Grain using a Hitachi S-4800 field emission scanning electron microscope (FE-SEM) with sizes ranging from 512 × 384 to 1280 × 960 pixels and tilt angles in the range 3–11° (Acc. Volt. = Acceleration Voltage, Work. Dist. = Working Distance, Magn. = Magnification).

Sample:         Tapetal Cell | Copper Bar | Copper Grid | Hexagonal Grid | Pollen Grain
Size (pixels):  1280 × 960   | 512 × 384  | 1280 × 960  | 1280 × 960     | 854 × 640
Tilt angle:     9°           | 11°        | 7°          | 10°            | 3°

manipulations, such as tilt, z-positioning and rotation of the specimen stage, as well as image pre-processing and capture functions, were operated through the Hitachi PC-SEM software. The working distance which gives the required depth of focus was determined at the maximum tilt for every single sample at the magnification chosen for image capture. As the specimen was tilted in successive 1° increments until reaching the final value through the software application, the SEM image was centered by moving the stage in the x- and/or y-axes manually. It should be noted that the micrographs are only acquired at the beginning and end of the process of tilting. The micrographs were acquired with an accelerating voltage of 3 kV, utilizing the signals from both the upper and lower SE detectors in a mixed manner, as shown in Fig. 3. The magnification and working distance were held fixed in each captured image of the tilt series. Contrast and brightness were adjusted manually to keep consistency between SEM micrographs. The SEM’s calibration, tilt angle, magnification, tilt eucentricity and working distance are among the parameters that affect the amount of distortion observed in the SEM micrographs (Marinello et al., 2008). However, the effects of such distortions are more dominant at high specimen tilting angles and small magnifications (lower than 1000×). Therefore, given the low tilt angles and mid-range magnification factors used here, analytic assessment of the distortions is not performed and the captured micrographs are only inspected visually to ensure a minimal amount of distortion. A more rigorous assessment of the distortions, considering all of the parameters involved in the process of multiview SEM micrograph acquisition, is needed. Similar works are done in Nolze (2007), Marinello et al. (2008) and Guery et al. (2013). Table 1 summarizes the data used in this work. Micrographs from Tapetal Cell, Copper Bar, Copper Grid, Hexagonal Grid and Pollen Grain are considered for evaluating the performance and accuracy of the proposed approach.

Fig. 3. SEM imaging procedure used for this study.

2.2. Scale invariant feature transform (SIFT)

The four stages of feature detection/description involved in the SIFT method can be summarized as (Lowe, 2004): (1) scale-space extrema detection, (2) keypoint localization, (3) orientation assignment and (4) keypoint descriptors. For the first step, a Gaussian function is considered as the scale-space kernel based on the work of Lindeberg (1994). By finding the scale-space extrema in the response of the image to difference-of-Gaussian (DoG) masks, not only is a good approximation for the scale-normalized Laplacian of Gaussian (LoG) function provided, but also, as pointed out in the work of Mikolajczyk and Schmid (2002), the detected features are more stable. The local extrema of the response of the image to the DoG masks of different scales are found in a 3 × 3 × 3 neighborhood of the interest point. For accurate localization of the keypoints in the set of candidate keypoints, a three dimensional quadratic function is fitted to the local sample points. By applying a threshold on the value of this fitting function at the extremum, keypoints located in low contrast regions that are highly affected by noise are eliminated. Moreover, thresholding the ratio of principal curvatures can also eliminate poorly defined feature points near the edges. After finalizing the keypoints, orientations can be assigned. This is done by using the gradients computed in the first step of the algorithm when computing DoG responses. Creating a 36-bin histogram of orientations in the keypoint’s neighborhood is the next step. Each neighbor contributes to the histogram by a weight computed based on its gradient magnitude and also by a Gaussian weighted circular window around the keypoint.

The final step is the local image descriptor generation. Using the location, scale and orientation determined for each keypoint up until now, the local descriptor is created in a manner which makes it invariant to differences in illumination and viewpoint. This is done by combining the gradients at keypoint locations, as computed in the previous steps, weighted by a Gaussian function over each 4 × 4 sub-region in a 16 × 16 neighborhood around the keypoint into 8-bin histograms. This results in a 4 × 4 × 8 = 128 element vector for
each keypoint. Normalizing the feature vectors to unit length will reduce the effect of linear illumination changes. This is usually followed by thresholding the normalized vector and re-normalizing it again to reduce the effects of large gradient magnitudes.

In the current work, SIFT is used in two different manners. For the step of sparse image matching required for epipolar rectification, the general SIFT approach is used for locating the feature points and computing the corresponding descriptors. However, for the dense matching, feature detection is eliminated and SIFT descriptors are computed for all the pixels contained in the input images. For more information regarding the details and implementation of SIFT the reader is referred to Lowe (2004).

2.3. Epipolar rectification

Given a set of two SEM images of a microscopic sample captured from different viewpoints, the epipolar rectification step attempts to transform the images in such a way that we only have horizontal displacement (disparity) between the corresponding pixels within the images. Assuming a set of sparse naively matched corresponding points generated by SIFT followed by the a contrario RANSAC (ORSA) outlier removal algorithm (Moisan and Stival, 2004), represented as 3-vectors of homogeneous coordinates for the left (X_l) and right (X_r) images, the epipolar constraint can be written as (Hartley and Zisserman, 2003):

X_l^T F X_r = 0    (1)

where F is the fundamental matrix that captures the rigidity constraint of the scene. For a rectified pair, the fundamental matrix takes the special form:

F = [e_1]_\times = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}    (2)

which means that the epipoles are at infinity in the horizontal direction. Therefore, the process of rectification involves finding homographies to be applied to the left and right images to satisfy the epipolar constraint equation when F = [e_1]_\times. This can be represented in mathematical form as:

X_l^T F X_r = 0 \equiv (H_l X_l)^T [e_1]_\times (H_r X_r) = 0    (3)

Having a rotation matrix R for the camera around the focus point, a homography matrix can be formulated as follows:

H = K R K^{-1}    (4)

where K is the camera parameters matrix with (x_c, y_c) as the image center (principal point) and f as the unknown focal length:

K = \begin{pmatrix} f & 0 & x_c \\ 0 & f & y_c \\ 0 & 0 & 1 \end{pmatrix}    (5)

Following the formulation proposed in Fusiello and Irsara (2008) and Monasse (2011), we look for rotation matrices R_l and R_r and a focal length which satisfy:

E(x_l, y_l, x_r, y_r) = X_l^T K^{-T} R_l^T K^T [e_1]_\times K R_r K^{-1} X_r = 0    (6)

where R_r = R_z(\theta_{rz}) R_y(\theta_{ry}) R_x(\theta_{rx}), R_l = R_z(\theta_{lz}) R_y(\theta_{ly}) and K = K(f = 3^g (w + h)), with w and h as the width and height of the input images, respectively, and g in the range [−1, 1]. It should also be noted that due to the specific form of [e_1]_\times, all of the rotations around the x direction are eliminated since R_x^T [e_1]_\times R_x = [e_1]_\times. Assuming the Sampson error as:

E_s^2 = E^T (J J^T)^{-1} E    (7)

where J is the matrix of partial derivatives of E with respect to the 4 variables:

J = \left( (F X_r)_1 \; (F X_r)_2 \; (F^T X_l)_1 \; (F^T X_l)_2 \right)    (8)

we have

E_s(X_l, X_r)^2 = \frac{E(X_l, X_r)^2}{\|[e_3]_\times F^T X_l\|^2 + \|[e_3]_\times F X_r\|^2}    (9)

Utilizing Levenberg–Marquardt (Nocedal and Wright, 2006), the method seeks the parameters (\theta_{ly}, \theta_{lz}, \theta_{rx}, \theta_{ry}, \theta_{rz}, g) which minimize the sum of Sampson errors over the matching pairs. The optimized parameters are then used for building the two homographies to be applied to the left and right view images. More elaboration regarding the theory and implementation aspects of the rectification method can be found in Fusiello and Irsara (2008) and Monasse (2011).

2.4. SIFT-flow for dense correspondence

Discontinuity preserving pixel/feature matching is a key component of many computer vision applications. This is unlike many general purpose image registration approaches in which the computed displacement maps are assumed to be smooth (Baghaie and Yu, 2014; Baghaie et al., 2014). In such cases, even though the image grid is deformed during the process of registration, the underlying geometry is considered as a whole, without the possibility of folding or overlapping. However, in computer vision applications, the objects that are contained in the images are projective representations of three dimensional objects in the real world. Therefore, the assumption of having discontinuity is necessary as a representation of the difference in the relative distances of the objects to the camera. Here, the problem of matching between image pixels is modeled as a dual-layer factor graph, with de-coupled components for horizontal/vertical flow to account for sliding motion. This model is based on the work of Liu et al. (2011), which takes advantage of a truncated L1 norm for achieving discontinuity preservation and higher speed in matching. Assuming F_1 and F_2 as two dense multi-dimensional SIFT descriptor images, and p = (x, y) as the grid coordinates of the image, the objective function to be minimized can be written as follows:

E(w) = \sum_p \min(\|F_1(p) - F_2(p + w(p))\|_1, t) + \sum_p \eta (|u(p)| + |v(p)|) + \sum_{(p,q) \in \varepsilon} \left[ \min(\alpha |u(p) - u(q)|, d) + \min(\alpha |v(p) - v(q)|, d) \right]    (10)

in which w(p) = (u(p), v(p)) is the flow vector at point p and \varepsilon is the set of neighboring pixel pairs. The three summations in this equation are the data, small displacement and smoothness terms, respectively. The data term minimizes the difference between the feature descriptors along the flow vector, while the small displacement term keeps the displacements as small as possible when no information is available. Finally, the smoothness term guarantees that the flow vectors of neighboring pixels are similar. A few parameters are used in this formulation: t and d as the data and smoothness thresholds, and \eta and \alpha as the small displacement and smoothness coefficients, respectively. The values are set to the default values proposed by Liu et al. (2011).

As is obvious, in this formulation the horizontal and vertical components are de-coupled. This is mainly for reducing the computational complexity, but it gives the additional benefit of being able to account for sliding motions during the process of image matching. The objective function is formulated as a factor graph, with p and q as the variable nodes while the factor nodes represent the data, small displacement and smoothness terms. The flow is then
Fig. 4. Factor graph representation of the energy minimization functional with de-coupled horizontal and vertical components.
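The objective in Eq. (10) is easy to state concretely before turning to the optimization machinery. Below is a hedged numpy sketch that evaluates the decoupled data, small-displacement and truncated-smoothness terms for a given integer flow field; the parameter values are placeholders for illustration, not the defaults of Liu et al. (2011).

```python
import numpy as np

def siftflow_energy(F1, F2, u, v, t=100.0, d=40.0, alpha=2.0, eta=0.005):
    """Evaluate the decoupled SIFT-flow objective (Eq. (10)) for an
    integer flow field (u, v). F1, F2: H x W x C dense descriptor arrays.
    Out-of-bounds targets are clamped to the image border."""
    H, W, _ = F1.shape
    ys, xs = np.mgrid[0:H, 0:W]
    yt = np.clip(ys + v, 0, H - 1)
    xt = np.clip(xs + u, 0, W - 1)
    # Data term: truncated L1 descriptor difference along the flow.
    data = np.minimum(np.abs(F1 - F2[yt, xt]).sum(axis=2), t).sum()
    # Small-displacement term keeps the flow near zero absent evidence.
    small = eta * (np.abs(u) + np.abs(v)).sum()
    # Truncated L1 smoothness over the 4-neighbor grid, decoupled
    # per component (separate horizontal and vertical layers).
    smooth = 0.0
    for comp in (u, v):
        smooth += np.minimum(alpha * np.abs(np.diff(comp, axis=0)), d).sum()
        smooth += np.minimum(alpha * np.abs(np.diff(comp, axis=1)), d).sum()
    return data + small + smooth
```

The loopy belief propagation described next searches over discrete (u, v) labels to drive this quantity down; evaluating it directly, as here, is only useful for checking candidate flows.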

extracted by using a dual-layer loopy belief propagation algorithm. previous iteration is eliminated. Also, normalizing messages at each
Fig. 4 shows the factor graph suggested by Liu et al. (2011) for opti- iteration does not have any effect on the final beliefs and has the
mizing the energy functional of dense matching problem. By using benefit of preventing numerical underflow (Pearl, 2014).
a coarse-to-fine (multi-resolution) matching scheme, one is able to Factor graphs are a means of unifying the directed and
reduce the computational complexity and hence the computation undirected graphical models with the same representation
time while achieving lower values for the energy functional. (Kschischang et al., 2001). Such graphs are derived by the main
Belief propagation (BP) is a technique for exact inference of assumption of representing complicated global functions of many
marginal probabilities for singly connected distributions (Barber, variables by factorizing them as a product of several local functions
2012). Generally speaking, each node in the graph computes a belief of subsets of variables. Generally speaking, a factor graph can be
based on the messages that it receives from its children and also defined as F = (G, P) in which G is the structure of the graph and
from its parents. Such a technique is purely local, which means P is the parameter of the graph. G being a bipartite graph can be
that the updates are unaware of the global structure of the graph: the graph may contain loops and therefore be multiply connected (Pearl, 2014). In this case, BP cannot compute an exact solution, but at best an approximation, which can nonetheless be surprisingly accurate (Barber, 2012). Use of graphical models in image processing tasks usually falls within the category of loopy graphs, which means different variants of BP are used and studied for solving different problems in this area (Szeliski et al., 2008; Baghaie, 2016).

In the general formulation of BP, and subsequently loopy BP (LBP) (Murphy et al., 1999), we assume that node X computes its belief b(x) = P(X = x|E), where E is the observed evidence; the belief is computed by combining the messages from the node's children, λ_{Y_j}(x), and its parents, π_X(u_k). Assuming λ_X(x) as the node's message to itself representing the evidence, we have:

b(x) = α λ(x) π(x)   (11)

where

λ^{(t)}(x) = λ_X(x) ∏_j λ_{Y_j}^{(t)}(x)   (12)

and

π^{(t)}(x) = ∑_u P(X = x | U = u) ∏_k π_X^{(t)}(u_k)   (13)

The message that X passes to its parent U_i is given by:

λ_X^{(t+1)}(u_i) = α ∑_x λ^{(t)}(x) ∑_{u_k : k ≠ i} P(x|u) ∏_{k ≠ i} π_X^{(t)}(u_k)   (14)

and the message that X sends to its child Y_j is given by:

π_{Y_j}^{(t+1)}(x) = α π^{(t)}(x) λ_X(x) ∏_{k ≠ j} λ_{Y_k}^{(t)}(x)   (15)

As can be seen, when a message is generated to pass from node A to node B, the contribution of the message from node B to A is excluded (note the k ≠ i and k ≠ j restrictions in Eqs. (14) and (15)).

The factor graph is defined as G = ({X, F}, E), where X and F are variable nodes and factor nodes, respectively, while E is a set of edges connecting a factor f_i and a variable x ∈ X_j. Given evidence as a set of variables with observed values, the process of belief propagation consists of passing local messages between nodes in order to compute the marginals of all nodes. Even though the same concept is used for belief propagation in directed graphs, here the process can be formulated as passing messages between variable and factor nodes. In this case, two types of messages are passed: (1) messages from a variable node to a factor node (μ_{x→f}) and (2) messages from a factor node to a variable node (μ_{f→x}):

μ_{x→f}(x) ∝ ∏_{h ∈ N_x \ {f}} μ_{h→x}(x)   (16)

μ_{f→x}(x) ∝ ∑_{X_f \ {x}} ( f(X_f) ∏_{y ∈ N_f \ {x}} μ_{y→f}(y) )   (17)

where x and y are variables, f and h are factors, and N_f and N_x denote the neighbors of the corresponding nodes in the graph. In acyclic graphs, the process of message passing is terminated after two messages are passed on every edge, one in each direction. In such graphs, the process results in an exact inference. Unlike acyclic graphs, belief propagation is done in an iterative manner in cyclic graphs. The process is terminated when the changes in the passed messages fall below a predetermined threshold, and the result is considered an approximate inference.

Several modifications to the general formulation of the energy minimization procedure were proposed by Liu et al. (2011) and are also considered in this work. Different from the general formulation of optical flow, here the smoothness term is decoupled to allow separate horizontal and vertical flows. This reduces the computational complexity of the energy minimization significantly. In this implementation, at first the intra-layer messages are updated for horizontal and vertical flows, and then the inter-layer messages are updated between horizontal and vertical flows. Moreover,
A. Baghaie et al. / Micron 97 (2017) 41–55 47

Fig. 5. Effects of various spatial (σ_s = {1, 3, 5}, from left to right) and range (σ_r = {1, 3, 5}, from top to bottom) variances for bilateral filtering of the disparity of Pollen Grain. The difference maps between the initial disparity map and the refined maps are also presented.

sequential belief propagation (BP-S) (Szeliski et al., 2008) is used for better convergence.

2.5. Disparity refinement: bilateral filtering

Since the result of the previous step is in general a discrete labeling of the horizontal/vertical disparity maps, neighboring pixels may have different labels. These differences manifest themselves as sudden jumps in the final 3D reconstruction results. One should note that this is not always problematic, especially in regions where sharp variations of the disparity levels represent edges and different depths. However, in uniform regions these small variations in disparity values should be smoothed for a more visually pleasing reconstruction result.

Bilateral filtering has been shown to provide high capability in noise/variation reduction while preserving edges contained in the images (Paris et al., 2009). The general idea is similar to simple Gaussian filtering. However, unlike Gaussian filtering, which only takes spatial proximity into consideration, bilateral filtering takes both spatial and range information into account. Assuming the noisy image I, the general formulation for the bilateral filtered result Î at pixel location p is:

Î_p = (1 / W_p) ∑_{q ∈ S} G_{σ_s}(||p − q||) G_{σ_r}(|I_p − I_q|) I_q   (18)

with the normalization factor defined as W_p = ∑_{q ∈ S} G_{σ_s}(||p − q||) G_{σ_r}(|I_p − I_q|), where q is a neighboring pixel in the neighborhood S, and G_{σ_s} and G_{σ_r} are the Gaussian weighting functions for spatial and range data, respectively. Direct implementation of the bilateral filter is computationally expensive and, therefore, several approximation techniques have been proposed in the literature for speeding up the process (Weiss, 2006; Pham and Van Vliet, 2005; Durand and Dorsey, 2002; Paris and Durand, 2009). Here, the approximation method proposed by Paris and Durand (2009) is used. In their formulation, the image is first converted to a volumetric bilateral grid with homogeneous values. It is shown that the bilateral filter can be approximated by a Gaussian convolution applied to the grid, followed by sampling and normalization of the homogeneous values. The spatial (σ_s) and range (σ_r) variation parameters are chosen experimentally. Fig. 5 displays the effects of different values of these parameters on the smoothness of the computed horizontal disparity map for the Pollen Grain. We aim to smooth the minimal variations in uniform regions while preserving sudden jumps in disparity values associated with bigger differences in depth. Looking closely at the various spatial and range variances shown in Fig. 5, it is obvious that bigger values of the variances cause the disparity map to be over-smoothed. This can be seen in the corresponding difference maps, where more edge details are visible, indicating that more edge information is smoothed out. To eliminate this while still having a reasonable smoothing effect, the values of σ_s and σ_r are both set to 3, experimentally, for creating the results presented here.

2.6. Depth estimation

As mentioned before, stereo rectification transforms the images in a manner in which the displacements will be grossly concentrated in the horizontal direction. This greatly simplifies the process of depth estimation. This is especially useful for the case of 3D reconstruction of SEM images, since the tilt angles are very small with a high amount of overlap between stereo image pairs. For more general problems like large scale multiple view stereo (MVS), the proposed technique is not directly applicable and more sophisticated methods are needed (Shen, 2013; Tola et al., 2010; Furukawa and Ponce, 2010).

The horizontal disparity computed from the previous step can be utilized for estimating the depth of the individual pixels contained in the images. This requires several parameters to be known: tilt angle, magnification and the size of each pixel in sample units. Fig. 6 shows the relationship between the computed horizontal disparity and the height for a few sample points. This can be represented using a simple trigonometric equation (Roy et al., 2012; Szeliski, 2010; Xie, 2011):

h = d p / (2 sin(θ/2))   (19)

which uses the computed horizontal disparity d, the pixel size in sample units (p) and the total tilt angle (θ) to estimate the height (h).

Fig. 6. Relationship between the estimated height (h) and the computed horizontal disparity (d) using the pixel size in sample units (p) and the total tilt angle (θ).
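The two post-processing steps above, edge-preserving smoothing of the disparity map (Eq. (18)) and the disparity-to-height conversion (Eq. (19)), can be sketched as follows. This is a brute-force illustration under our own naming; the paper itself uses the much faster bilateral-grid approximation of Paris and Durand (2009), and the toy disparity map is arbitrary:

```python
import numpy as np

def bilateral_filter(img, sigma_s=3.0, sigma_r=3.0, radius=5):
    """Brute-force bilateral filter, Eq. (18); adequate only for small maps."""
    h, w = img.shape
    out = np.empty_like(img, dtype=float)
    for i in range(h):
        for j in range(w):
            i0, i1 = max(0, i - radius), min(h, i + radius + 1)
            j0, j1 = max(0, j - radius), min(w, j + radius + 1)
            patch = img[i0:i1, j0:j1].astype(float)
            yy, xx = np.mgrid[i0:i1, j0:j1]
            g_s = np.exp(-((yy - i) ** 2 + (xx - j) ** 2) / (2 * sigma_s ** 2))
            g_r = np.exp(-((patch - img[i, j]) ** 2) / (2 * sigma_r ** 2))
            w_pq = g_s * g_r
            out[i, j] = (w_pq * patch).sum() / w_pq.sum()  # normalized by W_p
    return out

def disparity_to_height(d, pixel_size, tilt_deg):
    """Eq. (19): h = d * p / (2 sin(theta/2)), theta being the total tilt angle."""
    return d * pixel_size / (2.0 * np.sin(np.radians(tilt_deg) / 2.0))

# Toy disparity map: a step edge plus noise; the filter smooths the noise in
# the flat regions while largely keeping the depth discontinuity.
rng = np.random.default_rng(1)
d = rng.normal(0.0, 0.2, (32, 32))
d[:, 16:] += 5.0
d_refined = bilateral_filter(d)
height = disparity_to_height(d_refined, pixel_size=0.1, tilt_deg=10.0)
```

Each pixel of `height`, together with its (x, y) grid position in sample units, then yields one 3D point of the cloud used later in Section 3.3.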

3. Results and discussion

3.1. Pre-processing results

Assessing the performance of the proposed framework, which consists of several steps, is done using several sets of SEM images (Tapetal Cell, Copper Bar, Copper Grid, Hexagonal Grid, Pollen Grain) captured by a Hitachi S-4800 field emission scanning electron microscope (FE-SEM), equipped with a computer controlled 5 axis motorized specimen stage which enables movements in the x, y and z directions as well as tilt (−5 to 70◦) and rotation (0–360◦) (Tafti et al., 2016c). As the specimen was tilted in successive 1◦ increments, the SEM image was centered by moving the stage in the x- and/or y-axes manually. One should note that this does not have any effect on the relative disparity and subsequent estimated depth variations between pixels of the images, and may only cause the average disparity to be elevated or lowered. The micrographs were acquired with an accelerating voltage of 3 kV, utilizing the signals from both the upper and lower SE detectors, as shown in Fig. 3.

As the first step, sparse SIFT features/descriptors are located following the approach outlined in Section 2.2. This step is straightforward and no optimization is taken in order to increase/decrease the number of features. This step is followed by sparse feature matching, implemented by employing a contrario RANSAC to ensure better outlier removal. Epipolar rectification, for finding the appropriate homography transforms for the input micrographs in order to have more horizontally concentrated disparity maps, is next. Table 2 summarizes the results of sparse SIFT matching and the subsequent epipolar rectification for all of the micrograph sets. The first and second rows in the table indicate the number of SIFT features found in the first and second micrographs of each set. As can be seen, the number of detected features is minuscule in comparison to the total number of pixels contained within the images. This number is further reduced after finding the corresponding matches (see the third row in Table 2). However, it should be noted that these features are not used for 3D reconstruction of the microscopic samples, and while having a small number of SIFT features can be problematic in the case of sparse feature based reconstruction, here it does not have a negative impact. In fact, having only eight true matches is enough for estimating the fundamental matrix which captures the rigidity constraint of the scene (Hartley, 1997; Moisan et al., 2016; Fusiello and Irsara, 2008). The computed homography transforms for the first and second micrographs are displayed in the table as well. This is followed by the initial and final Sampson rectification errors. As expected, since the SEM micrographs are captured in a very controlled manner, rectification errors are not very large to begin with. However, epipolar rectification is recommended to ensure minimal operator introduced errors as a result of manual manipulation of the specimen stage. This will guarantee a truthful three dimensional reconstruction of the underlying microscopic sample.

3.2. Dense correspondence performance

For a more rigorous assessment of the dense SIFT descriptors for the problem of dense correspondence, several well-known descriptors are used here for comparison: Gabor descriptors, histogram of oriented gradients (HOG) and speeded-up robust features (SURF). Describing the frequency components of the images is crucial in many computer vision applications. The Fourier transform decomposes the frequency components of an image into its basis functions. However, the spatial relations between the image pixels are not preserved in this representation (Mikolajczyk and Schmid, 2005). Gabor filters, on the other hand, are designed to overcome this problem, with many applications in texture representation. The general equation for the complex Gabor filter can be defined as the multiplication of a complex sinusoidal plane wave function with an oriented Gaussian envelope function (Gabor, 1946; Movellan, 2002). For the current work, the efficient implementation of the Gabor filters in the frequency domain is used (Ilonen et al., 2005). The number of filter frequencies and the number of orientations of the filters are set to 10 and 8 respectively, which results in a set of 80 filters. Histogram of Oriented Gradients (HOG) is based on computing a histogram of gradients in pre-determined directions (Dalal and Triggs, 2005; Uijlings et al., 2015). The main idea comes from the observation that local object appearance and shape can usually be characterized by the distribution of local intensity gradients or edge directions. This can be implemented by dividing the image into small spatial regions and, for each region, accumulating a one-dimensional histogram of gradient directions or edge orientations. Here, assuming a 5 × 5 sectioning of the input images with the number of orientations equal to 8, the final dense descriptors are of size 200. Finally, speeded-up robust features (SURF) is used in a dense manner in order to generate dense descriptors for the purpose of dense correspondence. The original formulation of SURF consists of both feature detection and description steps, similar to that of SIFT. However, for the current work, the step of feature detection is eliminated and only feature description is done on all the pixels of the images. Following the procedures explained in Bay et al. (2008) and by defining sub-regions around the pixels at each scale, it is possible to generate descriptors of size 64 for each pixel. The reader is referred to the above-mentioned references for more details on the implementations and parameters of each of the descriptors.

Optical flow estimation refers to the estimation of displacements of intensity patterns in image sequences (Horn and Schunck, 1981; Fortun et al., 2015). Several widely used databases are available on the web for assessing the performance of optical flow estimation algorithms (Baker et al., 2011; Geiger et al., 2012). Here,

Table 2
Summary of rectification results using sparse SIFT matching and the subsequent epipolar rectification. The first and second rows indicate the number of SIFT features found in the first and second micrographs of each set, while the third row is the result of the a contrario methodology for matching the SIFT features according to a homography transform. The fourth and fifth rows show the computed homography transformation matrices resulting from epipolar rectification by minimizing the Sampson error. Finally, the initial and final Sampson rectification errors are presented.

                          Tapetal Cell   Copper Bar   Copper Grid   Hexagonal Grid   Pollen Grain
im.1 # SIFT keypoints     438            347          183           558              471
im.2 # SIFT keypoints     391            335          177           558              476
ORSA # SIFT matches       67             201          32            16               274
Initial rect. err. (pix)  1.895          1.241        0.9568        0.2826           0.5757
Final rect. err. (pix)    0.3703         0.1702       0.7198        0.1912           0.3244

H1 (rows of each 3 × 3 matrix separated by semicolons):
  Tapetal Cell:   [0.9739 0.0239 10.37; −0.0434 0.9992 55.24; ≈0 ≈0 1.026]
  Copper Bar:     [0.9105 −0.5868 127.1; 0.5727 0.8205 −118.4; 0.0001 −0.0001 0.9456]
  Copper Grid:    [0.9991 −0.00312 3.772; 0.0023 0.9999 −3.221; ≈0 ≈0 1.000]
  Hexagonal Grid: [1.003 −0.0008 −4.120; 0.0037 0.9999 −4.828; ≈0 ≈0 0.9961]
  Pollen Grain:   [0.9999 0.0027 −4.803; −0.0094 0.9999 1.228; ≈0 ≈0 1.000]

H2:
  Tapetal Cell:   [0.9733 0.0245 10.53; −0.0430 1.000 48.23; ≈0 ≈0 1.025]
  Copper Bar:     [0.9113 −0.5873 126.9; 0.5730 0.8204 −118.6; 0.0001 −0.0001 0.9450]
  Copper Grid:    [0.9987 −0.0037 3.751; 0.0019 0.9999 −2.164; ≈0 ≈0 1.001]
  Hexagonal Grid: [1.002 −0.0002 −3.614; 0.002 0.9999 −3.024; ≈0 ≈0 0.9970]
  Pollen Grain:   [1.000 0.0051 −6.254; −0.01223 0.9999 2.037; ≈0 ≈0 0.9996]
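To make the role of these homographies concrete, a rectifying transform from Table 2 can be applied to pixel coordinates in homogeneous form. The sketch below uses the Tapetal Cell H1; the sample point is arbitrary:

```python
import numpy as np

# H1 for the Tapetal Cell set, as reported in Table 2.
H1 = np.array([[0.9739,  0.0239, 10.37],
               [-0.0434, 0.9992, 55.24],
               [0.0,     0.0,    1.026]])

def apply_homography(H, x, y):
    """Map a pixel (x, y) through a 3x3 homography in homogeneous coordinates."""
    xh, yh, w = H @ np.array([x, y, 1.0])
    return xh / w, yh / w

# Rectified coordinates of an arbitrary pixel of the first micrograph.
x_r, y_r = apply_homography(H1, 100.0, 200.0)
```

After both micrographs are warped this way, corresponding points share (nearly) the same row, so the remaining disparity is concentrated in the horizontal direction.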

Fig. 7. Results of the optical flow estimation for Grove3 (first row), Urban3 (second row), RubberWhale (third row) and Venus (fourth row) using the four dense descriptors. In
each row the set of input images is shown in the first and second column followed by the ground truth of optical flow in the third column. Optical flow estimates computed
using dense Gabor, dense HOG, dense SURF and dense SIFT descriptors are shown respectively.

for assessing the performance of the four dense descriptors for the problem of dense correspondence, samples from the Middlebury dataset (http://vision.middlebury.edu/flow/) are utilized. Columns 1–3 in Fig. 7 show the set of initial first and second frames as well as the ground truth optical flow for the Grove3, Urban3, RubberWhale and Venus sequences.

Assessing the performance of the dense descriptors for optical flow estimation requires defining proper metrics for comparison. Here, three widely accepted metrics for optical flow assessment are used (Baker et al., 2011). Angular error (AE), as the most commonly used measure of performance for optical flow estimation, is defined based on the 3D angle between the vectors (u, v, 1.0) and (u_GT, v_GT, 1.0), where (u, v) is the computed flow vector and (u_GT, v_GT) is the ground truth flow. AE can be computed by taking the dot product of the vectors divided by the product of their lengths as follows:

AE = cos⁻¹( (1.0 + u u_GT + v v_GT) / ( √(1.0 + u² + v²) √(1.0 + u_GT² + v_GT²) ) )   (20)

Here, the average of AE (AAE) over the image domain is used as one of the metrics.

Endpoint error (EE), on the other hand, computes the absolute error between the estimated optical flow and the ground truth. Given this, the EE can be defined as:

EE = √( (u − u_GT)² + (v − v_GT)² )   (21)

Here, the average of EE (AEE) over the image domain is used as a metric.

Having two input images (I1 and I2) as inputs of the optical flow estimation algorithm for computing the flow from I1 to I2, it is possible to reconstruct an interpolated version of the second image by applying the computed dense flow to I1. This can be called Î2. The interpolation error (IE) can be defined based on the root mean square (RMS) difference between the true I2 and the interpolated Î2 using:

IE = √( (1/N) ∑_{x,y} (I2(x, y) − Î2(x, y))² )   (22)

where N is the number of pixels.

Table 3 summarizes the results of the four dense descriptors in the dense correspondence framework for the Grove3, Urban3, RubberWhale and Venus sequences. For each sequence, the results of the comparison between the estimated optical flow and the ground truth using AAE and AEE as metrics are presented in the first and second columns, while the results for IE are shown in the third. It should be noted that, in order to eliminate the marginal errors, a margin of 20 pixels is considered for computing the IE. Inspecting the results, it can be seen that dense SIFT is able to capture the structural representations of the objects contained in the sequences in a more robust and reliable way overall, which results in more accurate estimates of optical flow. However, as is apparent when comparing the results of dense SURF and dense SIFT, this may not necessarily translate to more accurate matching in terms of interpolation error. While dense SIFT descriptors provide lower errors for the optical flow estimation problem, the IEs are higher than those of dense SURF descriptors. In SURF, several simplifying approximations are made in order to achieve higher speeds when creating descriptors. In this method, second order derivatives of Gaussian functions are approximated with binary Haar-wavelets. This approximation, even though it has minimal effects when using the method in its sparse feature detection/description manner, results in granular patterns in the final optical flow estimations, as can be seen most obviously in uniform regions of the optical flow map in the Venus sequence. These effects can be attributed to the energy functional falling into local minima in the process of optimization in these regions. While this may cause the interpolation error to achieve lower values, the optical flow estimates are less accurate. In the problem of dense correspondence for the purpose of 3D reconstruction of microscopic samples that is under consideration here, being able to capture the movements of various regions of microscopic samples between the images of a sequence more accurately is more important than having smaller interpolation errors, especially considering the edge effects in SEM image acquisition, which cause subtle variations in the intensity levels of corresponding regions of the microscopic samples when the incident angle of the electron beam differs between the images of the sequence. Therefore, dense SIFT descriptors are better suited for the current work and result in more accurate dense correspondence and, subsequently, 3D reconstructions.
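The three metrics of Eqs. (20)–(22) are straightforward to express in numpy; the sketch below uses our own function names and a trivial toy flow:

```python
import numpy as np

def aae(u, v, u_gt, v_gt):
    """Average angular error, Eq. (20), in degrees."""
    num = 1.0 + u * u_gt + v * v_gt
    den = np.sqrt(1.0 + u**2 + v**2) * np.sqrt(1.0 + u_gt**2 + v_gt**2)
    return np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0))).mean()

def aee(u, v, u_gt, v_gt):
    """Average endpoint error, Eq. (21)."""
    return np.sqrt((u - u_gt)**2 + (v - v_gt)**2).mean()

def ie(i2, i2_hat):
    """Interpolation error, Eq. (22): RMS intensity difference."""
    return np.sqrt(((i2 - i2_hat)**2).mean())

# Sanity checks: identical flows give zero error; a constant
# one-pixel horizontal offset gives an AEE of exactly 1.
u = v = np.zeros((8, 8))
assert aae(u, v, u, v) == 0.0
assert aee(u + 1.0, v, u, v) == 1.0
```

In practice a border margin (20 pixels in the text) would be cropped from the arrays before computing IE.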

Table 3
Performance comparison between various dense descriptors for the problem of dense correspondence and optical flow estimation, using average angular error (AAE), average endpoint error (AEE) and interpolation error (IE) as metrics.

Methods/metrics   Grove3               Urban3               RubberWhale          Venus
                  AAE    AEE   IE      AAE    AEE   IE      AAE    AEE   IE      AAE   AEE   IE
Dense-Gabor       13.27  1.34  15.11   18.64  3.00  8.50    13.47  0.42  3.54    9.30  0.60  6.34
Dense-HOG         12.99  1.36  12.51   16.00  3.45  5.27    11.09  0.36  3.19    6.45  0.47  5.86
Dense-SURF        12.72  1.37  11.50   13.81  3.93  5.49    10.73  0.36  3.12    6.78  0.56  5.75
Dense-SIFT        12.22  1.30  13.15   11.02  3.01  5.27    10.47  0.35  3.34    4.79  0.40  6.00

Fig. 8. Dense matching results for the rectified image sets: Tapetal Cell (column 1), Copper Bar (column 2), Copper Grid (column 3), Hexagonal Grid (column 4) and Pollen Grain
(column 5). The first row shows the initial difference map. The second row shows the minimization trend for the optimization process defined using dense SIFT descriptors,
factor graph representation of the energy functional to be optimized (Fig. 4) and loopy belief propagation as means of optimization. The third row displays the difference
maps after the optimization process.

Fig. 8 summarizes the results of using dense SIFT descriptors in the dense correspondence framework for the five used micrograph sets, qualitatively. The first row shows the difference maps between the two rectified input images before the process of dense matching, while the third row displays the difference maps after the process, using dense SIFT descriptors and loopy belief propagation for minimizing the defined energy functional. The minimization process is implemented in a multi-resolution manner in three distinct stages, as can be seen from the second row of Fig. 8. In such a case, the aim is to recover the larger displacements on a coarser grid while compensating for the smaller displacements on a finer grid. The multi-resolution implementation not only reduces the computational time and complexity significantly, but also makes the recovery of true correspondence more achievable in the case of bigger disparities. The graphs are representative of the optimization trend of the dense energy functional.

Table 4 summarizes the results of dense correspondence numerically. Here, the root mean squared error (RMSE) is used as the means for assessing the performance of dense matching. The first row in the table represents the initial RMSE, while the second row shows the final RMSE. The error is reduced significantly. The residual error can be attributed to the noise contained in the micrographs as well as the differences in brightness in regions due to changing the tilt angle between each image acquisition. Using SEM in the secondary electron (SE) imaging mode, the contrast is mainly dominated by edge effects. This is due to having more secondary electrons that can leave the sample near edges, which results in increased brightness. A close inspection of the difference maps provided in the third row of Fig. 8 reveals that while the difference in non-edge regions is minimal, an increase in the difference can be seen near edges. Fortunately, SIFT is designed in such a way as to be able to handle these subtle intensity variations. Therefore, this does not have any impact on the outcome of the dense matching process. The initial and optimized values of the energy functional in Eq. (10) can be seen in the third and fourth rows of the table. Using the factor graph representation of the objective function and loopy belief propagation as the means for optimization, the energy is minimized by orders of magnitude. One of the assumptions in simplifying the process of depth estimation and 3D point cloud generation was based on the observation that the energy of the vertical disparity map is minuscule in comparison to the energy of the horizontal disparity map. This was expected as a result of the sparse feature-based rectification step. Further proof is presented in the fifth row of the table, which reveals that the amount of energy contained in the vertical disparity map is in fact very small in comparison to the energy of the horizontal disparity map. It should be noted that in the case of larger displacements between corresponding pixels of the initial micrographs, the ratio may increase due to the local nature of the belief propagation minimization approach. This is the case for the Tapetal Cell, where the ratio is larger compared to the rest. Of course, this is

Table 4
Summary of dense correspondence results using dense SIFT features, the factor graph representation of the objective function and loopy belief propagation as the means for optimization. The first and second rows represent the initial and final root mean squared error (RMSE) of the two input micrographs. The residual errors can be attributed to the noise contained in the micrographs as well as the differences in brightness due to edge effects caused by imaging in the secondary electron (SE) mode. The third and fourth rows show the initial and final values of the objective function (note the coefficients ×10⁹ and ×10⁷). The fifth row shows the ratio between the energy contained in the vertical disparity map and the energy contained in the horizontal disparity map; this provides additional proof of the efficiency of the rectification process as well as the depth estimation step. The last row displays the computational time needed for finding the dense correspondence between input micrographs.

                       Tapetal Cell   Copper Bar   Copper Grid   Hexagonal Grid   Pollen Grain
RMSE_initial           44.67          43.38        19.42         31.42            23.20
RMSE_final             25.26          17.50        8.50          12.18            7.34
E_initial (×10⁹)       2.29           0.20         1.68          1.79             0.73
E_final (×10⁷)         4.20           0.44         3.28          3.60             1.59
Σv² / Σu² (%)          1.29           0.20         0.08          0.16             0.58
≈ Elapsed time (s)     44.65          7.50         47.10         47.33            19.60

still very small to have a major negative impact on the outcome of the depth estimation step. Finally, the last row shows the computation time needed for dense matching between the micrographs in each set. The codes implemented here were a combination of MATLAB and MEX codes, executed on a Core i7 CPU @ 3.50 GHz with 12 GB of RAM using MS Windows 7 and MATLAB R2014b. As can be seen, the size of the input micrographs dominates the overall computational need of the proposed dense matching approach. This step is followed by disparity refinement using the approximate bilateral filter discussed in Section 2.5.
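The multi-resolution (coarse-to-fine) scheme used for the minimization above can be skeletonized as follows. This is a simplified illustration under our own naming: `match_level` is a hypothetical placeholder for any dense matcher that refines an initial flow, image sides are assumed divisible by powers of two, and a single (horizontal) flow channel is used for brevity:

```python
import numpy as np

def downsample(img):
    """2x block-average downsampling (a simple stand-in for a Gaussian pyramid)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample_flow(flow):
    """Double the grid resolution and scale the displacements accordingly."""
    return 2.0 * np.repeat(np.repeat(flow, 2, axis=0), 2, axis=1)

def coarse_to_fine(i1, i2, match_level, n_levels=3):
    """Recover large displacements on the coarse grid, then refine on finer ones."""
    pyramid = [(i1, i2)]
    for _ in range(n_levels - 1):
        pyramid.append((downsample(pyramid[-1][0]), downsample(pyramid[-1][1])))
    flow = np.zeros_like(pyramid[-1][0])       # start at the coarsest level
    for a, b in reversed(pyramid):
        if flow.shape != a.shape:
            flow = upsample_flow(flow)[:a.shape[0], :a.shape[1]]
        flow = match_level(a, b, flow)         # refine the propagated estimate
    return flow

# With an identity "matcher" the propagated zero flow survives all levels.
flow = coarse_to_fine(np.zeros((32, 32)), np.zeros((32, 32)), lambda a, b, f: f)
```

In the paper the per-level matcher is the BP-S minimization of the decoupled energy functional; here it is left abstract on purpose.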

3.3. 3D point cloud/surface mesh generation

Having the refined relative disparities and the tilt angle, the depth can be estimated using Eq. (19) and the three dimensional point cloud can be generated. Figs. 9–12 show the results of the proposed method for the Copper Bar, Copper Grid, Hexagonal Grid and Pollen Grain image sets, respectively. In each figure, the first row shows several views of the generated dense point cloud (with sub-sampling for better visualization) for each pair of input images. The second row represents several views of the generated high quality surface mesh, while the third row shows a magnified view of the generated surface mesh. Using the proposed approach, a high fidelity reconstruction of the microscopic samples is possible. In addition, a uniform surface mesh can be generated, since the distribution of the three dimensional points is uniform within the domain. This is one of the major advantages of using dense correspondence for 3D surface reconstruction in comparison to sparse feature based reconstruction approaches. Moreover, a higher amount of detail can be reconstructed employing the proposed sparse-dense methodology.

Sparse feature-based techniques rely only on the features detected in the images for the purpose of reconstruction (Tafti et al., 2016a,b). These features are not distributed uniformly within the image domain by default. This is especially challenging in regions that lack significant variations in intensity/depth and, therefore, are rather flat and uniform. It should also be noted that feature detection techniques, like the ones employed in SIFT, are designed to ignore spatially close features or features generated by edges. For the general problem of image matching, this is extremely useful since it helps avoid computational redundancy and possible effects of noise. However, for surface reconstruction of SEM images, this will lead to erroneous results.

For a better representation of the differences, Fig. 13 provides a visual comparison between the reconstruction results using the proposed approach and the state-of-the-art sparse feature based approaches, namely algebraic distance bundle adjustment (ADBA) and adaptive Sampson distance bundle adjustment (ASDBA) (Torr and Murray, 1997; Triggs et al., 1999; Albouy et al., 2004), 3DSEM using Differential Evolution (DE) (Tafti et al., 2015) and 3DSEM++ (Tafti et al., 2016b). The results are generated only for the Hexagonal Grid and Tapetal Cell, respectively. However, these can be adequate representations for comparing the performance of the various techniques. As can be seen, sparse feature based approaches suffer from non-uniform surface meshes, while sharp edges and small features are not truthfully recovered. It should be noted that for the proposed approach only two micrographs are used, while for the sparse feature based reconstructions five micrographs from different viewpoints are utilized. This is mainly to ensure sufficient matching points between image pairs for being able to build a more truthful reconstruction. However, as is obvious from the produced results, this is not enough to ensure a more accurate reconstruction. For the case of the Hexagonal Grid, not only is the produced mesh not uniform, but also identical areas are not reconstructed properly. For the Tapetal Cell, as previously shown in Fig. 2e and f, a small dent

Fig. 9. Qualitative visualization of the proposed 3D SEM reconstruction framework for the Copper Bar sample images, acquired by tilting the sample stage by 11◦. The set of two-view images can be seen in Table 1. The first row displays several views of the reconstructed dense point cloud. The initial cloud contains 196,608 points, which is sub-sampled here for better visualization. The second row shows the constructed triangular surface mesh. The third row depicts a magnified view of the constructed triangular surface mesh.

Fig. 10. Qualitative visualization of the proposed 3D SEM reconstruction framework for the Copper Grid sample images, acquired by tilting the sample stage by 7◦. The set of two-view images can be seen in Table 1. The first row displays several views of the reconstructed dense point cloud. The initial cloud contains 1,228,800 points, which is sub-sampled here for better visualization. The second row shows the constructed triangular surface mesh. The third row depicts a magnified view of the constructed triangular surface mesh.

Fig. 11. Qualitative visualization of the proposed 3D SEM reconstruction framework for the Hexagonal Grid sample images, acquired by tilting the sample stage by 10◦. The set of two-view images can be seen in Table 1. The first row displays several views of the reconstructed dense point cloud. The initial cloud contains 1,228,800 points, which is sub-sampled here for better visualization. The second row shows the constructed triangular surface mesh. The third row depicts a magnified view of the constructed triangular surface mesh.

can be observed in the middle of the cell structure. While all of the
reduction and blur removal to contrast enhancement. Given the
approaches provide a rather similar geometry for the cell surface,
low tilt angles and mid-range magnification factors used here, the
only the proposed sparse-dense approach can reconstruct the very
amount of spatial distortions were negligible. Depending on the
fine detail while the rest suffer from over-smoothing. Therefore, it is
various samples, a more rigorous analysis of the amount of distor-
concluded that the proposed approach is more suited for 3D recon-
tions may be necessary. In the cases of reconstruction using more
struction of microscopic samples with more details in comparison
than two views this is of high demand.
to the rest of the techniques. The results of such dense correspon-
dence approach can be further used for more quantitative analysis
of the surface attributes in various branches of science. 4.2. Different dense descriptors for dense correspondence

4. Pointers for future research The first assumption in the majority of the methods proposed
in the literature for dense matching and optical flow estimation
4.1. Micrograph pre-processing is the brightness constancy during movements of pixels between
images of the sequence. However, this is not always the case for
The quality of the SEM micrographs can be affected by sev- SEM micrographs. One solution, as pursued here, is to use struc-
eral sources of artifacts and distortions. These distortions can tural descriptors rather than pixels for estimating the matching.
range from image quality degradation factors (e.g. noise, low con- Use of dense descriptors for dense matching and optical flow esti-
trast, over-saturation, etc.) to spatial distortions caused by various mation has been investigated in our previous works (Baghaie et al.,
parameters involved in the process of imaging (e.g. SEM’s cal- 2015, 2017) using various dense descriptors, such as Leung–Malik
ibration, tilt angle, working distance, acceleration voltage, etc.) (LM) filter bank (Leung and Malik, 2001), Gabor filter bank (Gabor,
(Marinello et al., 2008). Therefore, appropriate steps should be 1946), Schmid filter bank (Schmid, 2001), root filter set (RFS) fil-
taken in order to ensure minimal distortions (Nolze, 2007; Guery ters, steerable filters (Freeman and Adelson, 1991), histogram of
et al., 2013). Here, due to the highly constrained optimization oriented gradients (HOG) (Dalal and Triggs, 2005) and speeded-
approach employed for dense matching with smoothness and small up robust features (SURF) (Bay et al., 2008). The same approaches
displacements regularization terms the effects of image-based dis- can be considered here using the above mentioned dense descrip-
tortions are reduced greatly. However, in special cases it may be tors with the possibility of newer descriptors such as DAISY (Tola
required to pre-process the input micrographs for a better match- et al., 2010) which is proven to be useful for high accuracy dense
ing. These pre-processing steps range from edge-preserving noise matching.
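As a minimal sketch of the dense-descriptor idea discussed above (NumPy only; the filter sizes, orientations and wavelengths are illustrative choices, not the settings used in the paper), each pixel can be described by its responses to a small Gabor filter bank:

```python
import numpy as np

def gabor_kernel(size, sigma, theta, wavelength):
    """Real part of a Gabor filter: a Gaussian-windowed cosine grating."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotate coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    return envelope * carrier

def dense_gabor_descriptors(image, thetas=(0, np.pi/4, np.pi/2, 3*np.pi/4),
                            wavelengths=(4, 8), size=15, sigma=3.0):
    """Return an (H, W, n_filters) array: one filter response per pixel."""
    h, w = image.shape
    responses = []
    for wl in wavelengths:
        for th in thetas:
            k = gabor_kernel(size, sigma, th, wl)
            # zero-padded convolution via the FFT
            sh = (h + size - 1, w + size - 1)
            full = np.fft.irfft2(np.fft.rfft2(image, s=sh) *
                                 np.fft.rfft2(k, s=sh), s=sh)
            half = size // 2
            responses.append(full[half:half + h, half:half + w])
    return np.stack(responses, axis=-1)

img = np.random.rand(32, 32)
desc = dense_gabor_descriptors(img)
print(desc.shape)  # (32, 32, 8): 2 wavelengths x 4 orientations
```

Matching then compares descriptor vectors (e.g. via L2 distance) instead of raw intensities, which is more robust to the brightness changes between tilted SEM views.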
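Returning to the pre-processing steps of Section 4.1, one edge-preserving option can be sketched as follows (a brute-force illustration; practical pipelines would use a fast bilateral approximation such as Paris and Durand, 2009, and the parameter values here are arbitrary):

```python
import numpy as np

def bilateral_filter(img, radius=3, sigma_s=2.0, sigma_r=0.1):
    """Brute-force bilateral filter: Gaussian weighting in space AND in
    intensity, so edges (large intensity jumps) survive while noise in
    flat regions is smoothed away."""
    h, w = img.shape
    pad = np.pad(img, radius, mode='reflect')
    out = np.zeros_like(img)
    # precompute the spatial Gaussian window once
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(x**2 + y**2) / (2 * sigma_s**2))
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + 2*radius + 1, j:j + 2*radius + 1]
            rng = np.exp(-(patch - img[i, j])**2 / (2 * sigma_r**2))
            wgt = spatial * rng
            out[i, j] = (wgt * patch).sum() / wgt.sum()
    return out

def stretch_contrast(img, lo_pct=2, hi_pct=98):
    """Simple linear contrast stretch between two percentiles."""
    lo, hi = np.percentile(img, [lo_pct, hi_pct])
    return np.clip((img - lo) / max(hi - lo, 1e-8), 0.0, 1.0)

noisy = np.clip(0.5 + 0.1 * np.random.rand(24, 24), 0, 1)  # flat region + noise
smoothed = stretch_contrast(bilateral_filter(noisy))
```

In a flat noisy region such as the synthetic patch above, the filtered output has a lower standard deviation than the input, while a real step edge would be left largely intact.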

4.3. Occlusion handling

Occlusion handling is an interesting, challenging, and widely studied problem in the computer vision community (Yang
et al., 2009; Xiao et al., 2006; Xu et al., 2012). This arises as a result of
movements of objects in the scene or changes of the imaging viewpoint. This is more problematic in the case of large displacements of
objects between frames in the image sequence which is largely the
case for general purpose optical flow estimation or stereo matching.
However, for the problem of 3D reconstruction of microscopic sam-
ples using SEM micrographs, the problem is more relaxed. On one
hand, the SEM micrograph acquisition is done in a very organized
manner with careful sample preparation and controlled imaging
procedures. On the other, unlike the general optical flow or stereo
matching, the amount of displacements can be adjusted by manual
manipulation of the specimen sample. This, as mentioned before,
does not have a negative impact on the subsequent depth esti-
mation since it will not alter the relative disparity between the
matching points and may only elevate or decrease the mean depth
of the whole microscopic sample. Moreover it should be noted that
we have limitations on the possible tilt angles dictated by the SEM
imaging system. However, in case of multiview stereopsis and/or
for more complex microscopic samples, by taking occlusion han-
dling procedures into account, a more accurate reconstruction can
be achieved.
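One common occlusion-handling device that could be adopted here is the left-right consistency check: a pixel is flagged as occluded when its disparity is not confirmed by the reverse disparity map. A minimal NumPy sketch (the tolerance and the 1-D horizontal-disparity convention are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def lr_consistency_mask(disp_left, disp_right, tol=1.0):
    """Flag pixels whose left-to-right disparity disagrees with the
    right-to-left disparity at the matched location (likely occlusions).
    Disparities are horizontal: x_right = x_left - disp_left[y, x]."""
    h, w = disp_left.shape
    xs = np.arange(w)[None, :].repeat(h, axis=0)
    ys = np.arange(h)[:, None].repeat(w, axis=1)
    # location each left-image pixel maps to in the right image
    x_r = np.clip(np.round(xs - disp_left).astype(int), 0, w - 1)
    diff = np.abs(disp_left - disp_right[ys, x_r])
    return diff > tol  # True = inconsistent, treat as occluded

dL = np.full((4, 8), 2.0)          # constant disparity of 2 pixels
dR = np.full((4, 8), 2.0)
dR[:, 3] = 7.0                     # corrupt one right-image column
mask = lr_consistency_mask(dL, dR)
print(mask.sum())                  # 4: one flagged pixel per row
```

Flagged pixels are then excluded from depth estimation or filled in from their consistent neighbors.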

Fig. 12. Qualitative visualization of the proposed 3D SEM reconstruction framework for the Pollen Grain sample images, acquired by tilting the sample stage by 3°. The set of two-view images can be seen in Table 1. First row displays several views of the reconstructed dense point cloud. The initial cloud contains 447,665 points, which is sub-sampled here for better visualization. Second row shows the constructed triangular surface mesh. Third row depicts a magnified view of the constructed triangular surface mesh.

Fig. 13. Performance comparison between the proposed method and various multiview 3D reconstruction algorithms, namely algebraic distance bundle adjustment (ADBA) and adaptive Sampson distance bundle adjustment (ASDBA) (Torr and Murray, 1997; Triggs et al., 1999; Albouy et al., 2004), 3DSEM using differential evolution (Tafti et al., 2015) and 3DSEM++ (Tafti et al., 2016b) for the Hexagonal Grid (first row) and the Tapetal Cell (second row). The proposed approach results in more uniform surface meshes with higher ability in recovering fine details.

Fig. 14. Shape, illumination, and reflectance estimation from shading, using only one image from the Copper Bar set and the method proposed in Barron and Malik (2015). From left to right: the initial image, followed by the estimated shape, normals, reflectance, shading and illumination.

4.4. Hybrid approaches: combining SFM & SFS

In the class of single-view 3D reconstruction approaches, images from a single viewpoint but with various lighting conditions are captured and used for the purpose of reconstruction. The methods in this class have been previously used for 3D reconstruction from SEM images (Lee and Kuo, 1993; Drzazga et al., 2005; Paluszyński and Slowko, 2005; Pintus et al., 2008). However, due to the difficulty of generating SEM micrographs under different illumination directions, they achieved moderate success. Even though several

hybrid approaches have been introduced in the literature combining SFS with SFM (Danzl and Scherer, 2003), the advent of modern SFS algorithms can improve the performance of 3D reconstruction approaches. Fig. 14 shows a sample result produced using only one image from the Copper Bar micrograph set by taking advantage of the work of Barron and Malik (2015). More rigorous analysis and investigation of the use of such techniques will benefit the field greatly.

4.5. Case studies: more complex biological samples

When going into the realm of microscopic samples, the complexity of the objects increases tremendously. An example is the set of Pollen Grain micrographs, in which the surface of the grain is highly porous. As evident from the results, sparse feature-based reconstruction approaches are not able to reconstruct an accurate surface representation, while the proposed sparse-dense correspondence framework can represent the surface with much higher accuracy. As for future research, more case studies are in order using more complex biological samples for better evaluation of the proposed method.

4.6. Surface mesh generation/optimization

For all the mesh-related processing steps involved in this study (point cloud manipulation, computation of the points' normal vectors, meshing, smoothing and Poisson surface reconstruction), MeshLab is used, an open source, portable, and extensible system for the processing and editing of unstructured 3D triangular meshes (MeshLab, 2005). Although MeshLab is widely used in the scientific community for mesh processing, the extent of the effects of the various parameters used in each step on the final mesh generated from the dense point cloud has not been investigated here. Moreover, texture mapping can be challenging. The use of different methods for mesh generation, and of data structures for incorporating the intensity levels in the 3D reconstruction, can be investigated in the future. Incorporating the mesh generation procedures into the process of depth estimation can also be considered as a solution (Zhang et al., 2015).

5. Conclusions

In this work, an end-to-end framework for high fidelity 3D reconstruction of microscopic samples from stereo SEM micrographs is proposed. Using a Hitachi S-4800 field emission scanning electron microscope (FE-SEM), equipped with a computer-controlled 5-axis motorized specimen stage that enables movements in the x, y and z directions as well as tilt and rotation, the specimen was tilted in successive 1° increments until reaching the final desired tilt angle, with manual movement of the stage in the x and/or y directions. Even with the most careful acquisition procedure, the acquired images need to be transformed in a manner that ensures more accurate 3D reconstruction. In this step, using sparse SIFT features/descriptors and employing a contrario RANSAC, matched features are found and outliers that do not satisfy a projective transform are removed. This is followed by a stage of rectification for transforming the images to have a more horizontally-concentrated disparity. In this manner, given the correct disparity, the process of depth estimation is simplified greatly, since the depth is directly proportional to the found disparity. For the next step, we take advantage of a constrained optimization procedure using dense SIFT descriptors, a factor graph representation of the energy functional to be optimized, and loopy belief propagation as the means of optimization. Finally, depth is estimated using the bilaterally-filtered horizontal disparity computed from the previous step. Extensive tests and experiments with several sets of SEM micrographs prove the robustness and reliability of the proposed method for high-quality, high-resolution 3D reconstruction of microscopic samples.

References

Agarwal, S., Furukawa, Y., Snavely, N., Simon, I., Curless, B., Seitz, S.M., Szeliski, R., 2011. Building Rome in a day. Commun. ACM 54 (10), 105–112.
Albouy, B., Treuillet, S., Lucas, Y., Birov, D., 2004. Fundamental matrix estimation revisited through a global 3D reconstruction framework. In: Advanced Concepts for Intelligent Vision Systems.
Baghaie, A., 2016. Markov random field model-based salt and pepper noise removal. arXiv:1609.06341.
Baghaie, A., D'Souza, R.M., Yu, Z., 2015. Dense correspondence and optical flow estimation using Gabor, Schmid and steerable descriptors. In: Advances in Visual Computing. Springer, pp. 406–415.
Baghaie, A., D'Souza, R.M., Yu, Z., 2017. Dense descriptors for optical flow estimation: a comparative study. J. Imaging 3 (1), 12.
Baghaie, A., Yu, Z., 2014. Curvature-based registration for slice interpolation of medical images. In: Computational Modeling of Objects Presented in Images. Fundamentals, Methods, and Applications. Springer, pp. 69–80.
Baghaie, A., Yu, Z., D'Souza, R.M., 2014. Fast mesh-based medical image registration. In: International Symposium on Visual Computing. Springer, pp. 1–10.
Baker, S., Scharstein, D., Lewis, J., Roth, S., Black, M.J., Szeliski, R., 2011. A database and evaluation methodology for optical flow. Int. J. Comput. Vision 92 (1), 1–31.
Barber, D., 2012. Bayesian Reasoning and Machine Learning. Cambridge University Press.
Barron, J.T., Malik, J., 2015. Shape, illumination, and reflectance from shading. IEEE Trans. Pattern Anal. Mach. Intell. 37 (8), 1670–1687.
Bay, H., Ess, A., Tuytelaars, T., Van Gool, L., 2008. Speeded-up robust features (SURF). Comput. Vision Image Understand. 110 (3), 346–359.
Bernal, J., Vilarino, F., Sánchez, J., 2010. Feature Detectors and Feature Descriptors: Where We Are Now. Universitat Autonoma de Barcelona, Barcelona.
Bozzola, J.J., Russell, L.D., 1999. Electron Microscopy: Principles and Techniques for Biologists. Jones & Bartlett Learning.
Dalal, N., Triggs, B., 2005. Histograms of oriented gradients for human detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005, CVPR 2005, vol. 1. IEEE, pp. 886–893.
Danzl, R., Scherer, S., 2003. Integrating shape from shading and shape from stereo for variable reflectance surface reconstruction from SEM images. In: 26th Workshop of the Austrian Association for Pattern Recognition.
Drzazga, W., Paluszynski, J., Slowko, W., 2005. Three-dimensional characterization of microstructures in a SEM. Meas. Sci. Technol. 17, 28.
Durand, F., Dorsey, J., 2002. Fast bilateral filtering for the display of high-dynamic-range images. In: ACM Transactions on Graphics (TOG), vol. 21. ACM, pp. 257–266.
Egerton, R.F., 2006. Physical Principles of Electron Microscopy: An Introduction to TEM, SEM, and AEM. Springer Science & Business Media.
Eulitz, M., Reiss, G., 2015. 3D reconstruction of SEM images by use of optical photogrammetry software. J. Struct. Biol. 191 (2), 190–196.
Fortun, D., Bouthemy, P., Kervrann, C., 2015. Optical flow modeling and computation: a survey. Comput. Vision Image Understand. 134, 1–21.
Freeman, W.T., Adelson, E.H., 1991. The design and use of steerable filters. IEEE Trans. Pattern Anal. Mach. Intell. 13 (9), 891–906.
Furukawa, Y., Ponce, J., 2010. Accurate, dense, and robust multiview stereopsis. IEEE Trans. Pattern Anal. Mach. Intell. 32 (8), 1362–1376.

Fusiello, A., Irsara, L., 2008. Quasi-Euclidean uncalibrated epipolar rectification. In: 19th International Conference on Pattern Recognition, 2008, ICPR 2008. IEEE, pp. 1–4.
Gabor, D., 1946. Theory of communication. Part 1: The analysis of information. J. Inst. Electr. Engrs. Part III Radio Commun. Eng. 93 (26), 429–441.
Geiger, A., Lenz, P., Urtasun, R., 2012. Are we ready for autonomous driving? The KITTI vision benchmark suite. In: Conference on Computer Vision and Pattern Recognition (CVPR).
Guery, A., Latourte, F., Hild, F., Roux, S., 2013. Characterization of SEM speckle pattern marking and imaging distortion by digital image correlation. Meas. Sci. Technol. 25 (1), 015401.
Hartley, R., Zisserman, A., 2003. Multiple View Geometry in Computer Vision. Cambridge University Press.
Hartley, R.I., 1997. In defense of the eight-point algorithm. IEEE Trans. Pattern Anal. Mach. Intell. 19 (6), 580–593.
Horn, B.K., Schunck, B.G., 1981. Determining optical flow. In: 1981 Technical Symposium East. International Society for Optics and Photonics, pp. 319–331.
Ilonen, J., Kämäräinen, J.-K., Kälviäinen, H., 2005. Efficient Computation of Gabor Features. Lappeenranta University of Technology.
Jensen, E., 2012. Types of imaging. Part 1: Electron microscopy. Anat. Rec. 295 (5), 716–721.
Kschischang, F.R., Frey, B.J., Loeliger, H.-A., 2001. Factor graphs and the sum–product algorithm. IEEE Trans. Inform. Theory 47 (2), 498–519.
Lee, K.M., Kuo, C.-C.J., 1993. Surface reconstruction from photometric stereo images. JOSA A 10 (5), 855–868.
Leung, T., Malik, J., 2001. Representing and recognizing the visual appearance of materials using three-dimensional textons. Int. J. Comput. Vision 43 (1), 29–44.
Lindeberg, T., 1994. Scale-space theory: A basic tool for analyzing structures at different scales. J. Appl. Stat. 21 (1–2), 225–270.
Liu, C., Yuen, J., Torralba, A., 2011. SIFT flow: dense correspondence across scenes and its applications. IEEE Trans. Pattern Anal. Mach. Intell. 33 (5), 978–994.
Lowe, D.G., 2004. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60 (2), 91–110.
Marinello, F., Bariani, P., Savio, E., Horsewell, A., De Chiffre, L., 2008. Critical factors in SEM 3D stereo microscopy. Meas. Sci. Technol. 19 (6), 065705.
MeshLab, 2005. MeshLab. http://meshlab.sourceforge.net/.
Mikolajczyk, K., Schmid, C., 2002. An affine invariant interest point detector. In: Computer Vision – ECCV 2002. Springer, pp. 128–142.
Mikolajczyk, K., Schmid, C., 2005. A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 27 (10), 1615–1630.
Moisan, L., Moulon, P., Monasse, P., 2016. Fundamental matrix of a stereo pair, with a contrario elimination of outliers. Image Process. On Line 6, 89–113.
Moisan, L., Stival, B., 2004. A probabilistic criterion to detect rigid point matches between two images and estimate the fundamental matrix. Int. J. Comput. Vision 57 (3), 201–218.
Monasse, P., 2011. Quasi-Euclidean epipolar rectification. Image Process. On Line, 1.
Movellan, J.R., 2002. Tutorial on Gabor Filters. Open Source Document.
Murphy, K.P., Weiss, Y., Jordan, M.I., 1999. Loopy belief propagation for approximate inference: an empirical study. In: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers Inc., pp. 467–475.
Musialski, P., Wonka, P., Aliaga, D.G., Wimmer, M., Gool, L., Purgathofer, W., 2013. A survey of urban reconstruction. In: Computer Graphics Forum, vol. 32. Wiley Online Library, pp. 146–177.
Nocedal, J., Wright, S., 2006. Numerical Optimization. Springer Science & Business Media.
Nolze, G., 2007. Image distortions in SEM and their influences on EBSD measurements. Ultramicroscopy 107 (2), 172–183.
Paluszyński, J., Slowko, W., 2005. Surface reconstruction with the photometric method in SEM. Vacuum 78 (2), 533–537.
Paris, S., Durand, F., 2009. A fast approximation of the bilateral filter using a signal processing approach. Int. J. Comput. Vision 81 (1), 24–52.
Paris, S., Kornprobst, P., Tumblin, J., Durand, F., 2009. Bilateral Filtering: Theory and Applications. Now Publishers Inc.
Pearl, J., 2014. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann.
Pham, T.Q., Van Vliet, L.J., 2005. Separable bilateral filtering for fast video preprocessing. In: 2005 IEEE International Conference on Multimedia and Expo. IEEE, p. 4.
Pintus, R., Podda, S., Vanzi, M., 2008. An automatic alignment procedure for a four-source photometric stereo technique applied to scanning electron microscopy. IEEE Trans. Instrum. Meas. 57 (5), 989–996.
Roy, S., Meunier, J., Marian, A., Vidal, F., Brunette, I., Costantino, S., 2012. Automatic 3D reconstruction of quasi-planar stereo scanning electron microscopy (SEM) images. In: 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE, pp. 4361–4364.
Samak, D., Fischer, A., Rittel, D., 2007. 3D reconstruction and visualization of microstructure surfaces from 2D images. CIRP Ann. Manuf. Technol. 56 (1), 149–152.
Schmid, C., 2001. Constructing models for content-based image retrieval. In: Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001, CVPR 2001, vol. 2. IEEE, pp. II–39.
Shen, S., 2013. Accurate multiple view 3D reconstruction using patch-based stereo for large-scale scenes. IEEE Trans. Image Process. 22 (5), 1901–1914.
Szeliski, R., 2010. Computer Vision: Algorithms and Applications. Springer Science & Business Media.
Szeliski, R., Zabih, R., Scharstein, D., Veksler, O., Kolmogorov, V., Agarwala, A., Tappen, M., Rother, C., 2008. A comparative study of energy minimization methods for Markov random fields with smoothness-based priors. IEEE Trans. Pattern Anal. Mach. Intell. 30 (6), 1068–1080.
Tafti, A.P., Baghaie, A., Kirkpatrick, A.B., Holz, J.D., Owen, H.A., D'Souza, R.M., Yu, Z., 2016a. A comparative study on the application of SIFT, SURF, BRIEF and ORB for 3D surface reconstruction of electron microscopy images. Comput. Methods Biomech. Biomed. Eng. Imaging Visual., 1–14.
Tafti, A.P., Holz, J.D., Baghaie, A., Owen, H.A., He, M.M., Yu, Z., 2016b. 3DSEM++: Adaptive and intelligent 3D SEM surface reconstruction. Micron 87, 33–45.
Tafti, A.P., Kirkpatrick, A.B., Alavi, Z., Owen, H.A., Yu, Z., 2015. Recent advances in 3D SEM surface reconstruction. Micron 78, 54–66.
Tafti, A.P., Kirkpatrick, A.B., Holz, J.D., Owen, H.A., Yu, Z., 2016c. 3DSEM: a 3D microscopy dataset. Data Brief 6, 112–116.
Tola, E., Lepetit, V., Fua, P., 2010. DAISY: an efficient dense descriptor applied to wide-baseline stereo. IEEE Trans. Pattern Anal. Mach. Intell. 32 (5), 815–830.
Torr, P.H., Murray, D.W., 1997. The development and comparison of robust methods for estimating the fundamental matrix. Int. J. Comput. Vision 24 (3), 271–300.
Triggs, B., McLauchlan, P.F., Hartley, R.I., Fitzgibbon, A.W., 1999. Bundle adjustment – a modern synthesis. In: International Workshop on Vision Algorithms. Springer, pp. 298–372.
Uijlings, J., Duta, I., Sangineto, E., Sebe, N., 2015. Video classification with densely extracted HOG/HOF/MBH features: an evaluation of the accuracy/computational efficiency trade-off. Int. J. Multimedia Inform. Retrieval 4 (1), 33–44.
Weiss, B., 2006. Fast median and bilateral filtering. ACM Trans. Graphics 25 (3), 519–526.
Wöhler, C., 2012. 3D Computer Vision: Efficient Methods and Applications. Springer Science & Business Media.
Xiao, J., Cheng, H., Sawhney, H., Rao, C., Isnardi, M., 2006. Bilateral filtering-based optical flow estimation with occlusion detection. In: European Conference on Computer Vision. Springer, pp. 211–224.
Xie, J., 2011. Stereomicroscopy: 3D Imaging and the Third Dimension Measurement. Agilent Technologies, Santa Clara, CA.
Xu, L., Jia, J., Matsushita, Y., 2012. Motion detail preserving optical flow estimation. IEEE Trans. Pattern Anal. Mach. Intell. 34 (9), 1744–1757.
Yang, Q., Wang, L., Yang, R., Stewénius, H., Nistér, D., 2009. Stereo matching with color-weighted correlation, hierarchical belief propagation, and occlusion handling. IEEE Trans. Pattern Anal. Mach. Intell. 31 (3), 492–504.
Zhang, C., Li, Z., Cheng, Y., Cai, R., Chao, H., Rui, Y., 2015. MeshStereo: a global stereo model with mesh alignment regularization for view interpolation. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2057–2065.
Zolotukhin, A., Safonov, I., Kryzhanovskii, K., 2013. 3D reconstruction for a scanning electron microscope. Pattern Recogn. Image Anal. 23 (1), 168–174.
