Stacked Sparse Autoencoder (SSAE) For Nuclei Detection On Breast Cancer Histopathology Images
Abstract—Automated nuclear detection is a critical step for a number of computer assisted pathology related image analysis algorithms, such as those for automated grading of breast cancer tissue specimens. The Nottingham Histologic Score system is highly correlated with the shape and appearance of breast cancer nuclei in histopathological images. However, automated nucleus detection is complicated by 1) the large number of nuclei and the size of high resolution digitized pathology images, and 2) the variability in size, shape, appearance, and texture of the individual nuclei. Recently there has been interest in the application of "Deep Learning" strategies for classification and analysis of big image data. Histopathology, given its size and complexity, represents an excellent use case for application of deep learning strategies. In this paper, a Stacked Sparse Autoencoder (SSAE), an instance of a deep learning strategy, is presented for efficient nuclei detection on high-resolution histopathological images of breast cancer. The SSAE learns high-level features from just pixel intensities alone in order to identify distinguishing features of nuclei. A sliding window operation is applied to each image in order to represent image patches via high-level features obtained via the auto-encoder, which are then subsequently fed to a classifier which categorizes each image patch as nuclear or non-nuclear. Across a cohort of 500 histopathological images (2200 × 2200) and approximately 3500 manually segmented individual nuclei serving as the ground truth, the SSAE was shown to have an improved F-measure of 84.49% and an average area under the Precision-Recall curve (AveP) of 78.83%. The SSAE approach also outperformed nine other state of the art nuclear detection strategies.

Index Terms—Automated nuclei detection, breast cancer histopathology, feature representation learning, stacked sparse autoencoder, digital pathology, deep learning.

Manuscript received April 18, 2015; revised July 17, 2015; accepted July 17, 2015. Date of publication July 20, 2015; date of current version December 29, 2015. This work is supported by the National Natural Science Foundation of China (61273259, 61272223); Six Major Talents Summit of Jiangsu Province (2013-XXRJ-019), and the Natural Science Foundation of Jiangsu Province of China (BK20141482); the National Cancer Institute of the National Institutes of Health under Awards R01CA136535-01, R01CA140772-01, R21CA167811-01, R21CA179327-01; the National Institute of Diabetes and Digestive and Kidney Diseases under Award R01DK098503-02, the DOD Prostate Cancer Synergistic Idea Development Award (PC120857); the DOD Lung Cancer Idea Development New Investigator Award (LC130463), the Ohio Third Frontier Technology development Grant, the CTSC Coulter Annual Pilot Grant, and the Wallace H. Coulter Foundation Program in the Department of Biomedical Engineering at Case Western Reserve University. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Asterisk indicates corresponding author.

*J. Xu is with the Jiangsu Key Laboratory of Big Data Analysis Technique and CICAEET, Nanjing University of Information Science and Technology, Nanjing 210044, China (e-mail: [email protected]).
L. Xiang and Q. Liu are with the Jiangsu Key Laboratory of Big Data Analysis Technique and CICAEET, Nanjing University of Information Science and Technology, Nanjing 210044, China.
H. Gilmore is with the Department of Pathology-Anatomic, University Hospitals Case Medical Center, Case Western Reserve University, OH 44106-7207 USA.
J. Wu and J. Tang are with the Jiangsu Cancer Hospital, Nanjing 210000, China.
A. Madabhushi is with the Department of Biomedical Engineering, Case Western Reserve University, OH 44106-7207 USA (e-mail: axm788@case.edu).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMI.2015.2458702

I. INTRODUCTION

With the recent advent of cost-effective whole-slide digital scanners, tissue slides can now be digitized and stored in digital image form [1]. Digital pathology makes computerized quantitative analysis of histopathology imagery possible. The diagnosis from a histopathology image remains the "gold standard" in diagnosing a considerable number of diseases, including almost all types of cancer. For breast cancer (BC), the Nottingham Histologic Score system enables the identification of the degree of aggressiveness of the disease, largely based on the morphologic attributes of the breast cancer nuclei. The arrangement and topological features of nuclei in tumor regions of breast histopathology thus represent important histologic based biomarkers for predicting patient outcome [2]. Accurate determination of breast cancer grade is important for guiding treatment selection for patients. Since the scoring system is highly correlated to nuclear appearance, accurate nuclei detection is a critical step in developing automated machine based grading schemes and computer assisted decision support systems for digital pathology. Additionally, the quantitative assessment of lymphocytes in breast tissue has been recently recognized as a strong prognostic predictor of favourable outcome [3]. Consequently it has become important to be able to automatically identify the extent of lymphocytes in digitized pathology images. However, qualitative estimation of the extent of lymphocytic infiltration in breast pathology images by pathologists is still largely subjective, and manual quantification is laborious, tedious, and hugely time consuming. Hence, there is a need to develop algorithms for automated detection of individual nuclei.

Additionally, the identification of the location of individual nuclei can also allow for automated characterization of spatial nuclear architecture. Features that reflect the spatial arrangement of nuclei (e.g. via graph algorithms such as the Voronoi, Delaunay Triangulation, Minimum Spanning Tree) have been shown to be strongly associated with grade [2] and cancer
0278-0062 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: RVR & JC College of Engineering. Downloaded on June 19,2020 at 10:01:31 UTC from IEEE Xplore. Restrictions apply.
120 IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 35, NO. 1, JANUARY 2016
TABLE I
THE CATEGORIZATION OF DIFFERENT NUCLEI DETECTION APPROACHES
progression [4]. Thus there are compelling reasons to find improved, automated, efficient ways to identify individual cancer nuclei on breast pathology images.

The rest of the paper is organized as follows: A review of previous related works is presented in Section II. A brief review of the basic autoencoder and its architecture, together with a detailed description of the Stacked Sparse Autoencoder (SSAE), is presented in Section III. The experimental setup and comparative strategies are discussed in Section IV. The experimental results and discussion are reported in Section V. Concluding remarks are presented in Section VI.

II. PREVIOUS RELATED WORK

Accurate detection of nuclei is a difficult task because of the complicated nature of histopathological images. Hematoxylin and Eosin (H&E) are standard stains that highlight nuclei in blue/purple and cytoplasm in pink in order to visualize the structures of interest in the tissue [5]. The images are complicated by (1) the large number of nuclei and the size of high resolution digitized pathology images, (2) the variability in size, shape, appearance, and texture of the individual nuclei, and (3) noise and non-homogenous backgrounds. Most current nuclei detection/segmentation methods on H&E stained images are based on exploiting low-level hand-crafted features [6], such as color [7]–[22], edge [23]–[28], contextual information [29]–[31], and texture [32], [33]; see Table I for a detailed enumeration of some of these approaches.

As Table I illustrates, most of the detection approaches are based on hand-crafted features. A number of these approaches, including ones previously presented by our group [15], have yielded high degrees of detection accuracy. However, since most of these approaches are dependent on either color or intensity related attributes, it is not clear how these approaches would generalize from one site to another. To the best of our knowledge, none of the approaches in Table I have been extensively tested on multi-site data.

Recently, significant progress has been made in learning image representations from pixel (or low) level image features to capture high level shape and edge interactions between different objects in images [35]. These higher level shape cues tend to be more consistently present and discernible across similar images across sites and scanners compared to lower level color, intensity or texture features. These low level image attributes tend to be more susceptible to signal drift artifacts. Deep learning (DL) is a hierarchical learning approach which learns high-level features from just the pixel intensities alone that are useful for differentiating objects by a classifier. DL has been shown to effectively address some of the most challenging problems in vision and learning since the first deep autoencoder network was proposed by Hinton et al. in [36]. An appealing attribute of DL approaches for histopathology image analysis is the ability to leverage and exploit large numbers of unlabeled instances. Consequently, in the last couple of years there has been interest in the use of DL approaches for different types of problems in histopathological image classification. For instance, in [34], the authors employed a convolutional neural network (CNN) architecture with an autoencoder for histopathological image representation learning. A softmax classifier is then employed for classifying regions as cancer and non-cancer. The work in [34], however, only used a one-layer autoencoder for high-level feature representation. Unlike CNN-based feature representation, which involves convolutional and subsampling operations to learn a set of locally connected neurons through local receptive fields for feature extraction, the approach presented in this paper (illustrated in Fig. 2) employs a full connection model for high-level feature learning. The Autoencoder or Stacked Sparse Autoencoder (SSAE) is an encoder-decoder architecture where the "encoder"
XU et al.: STACKED SPARSE AUTOENCODER (SSAE) FOR NUCLEI DETECTION ON BREAST CANCER HISTOPATHOLOGY IMAGES 121
TABLE II
ENUMERATION OF THE SYMBOLS USED IN THE PAPER
network represents pixel intensities modeled via lower dimensional attributes, while the "decoder" network reconstructs the original pixel intensities using the low dimensional features. CNN is a partial connection model which stresses the importance of locality, while SSAE is a full connection model which learns a single global weight matrix for feature representation. For our application, the size of nuclear and non-nuclear patches was set to 34 × 34 pixels. Additionally, each image patch may contain up to a single object (in this case a nucleus), which is appropriate for construction of a full connection model. Therefore, we choose to use SSAE instead of CNN in this paper. On the other hand, SSAE is trained, bottom up, in an unsupervised fashion, in order to extract hidden features. The efficient representations in turn pave the way for more accurate, supervised classification of the two types of patches. Moreover, this unsupervised feature learning is appropriate for histological images since we have a great deal of unlabeled image data to work with, image labels typically being expensive and laborious to come by. This higher level feature learning allows us to efficiently detect multiple nuclei from a large cohort of histopathological images. In our preliminary work [37], we employed the SSAE framework for learning high-level features corresponding to different regions of interest containing nuclei. These low-dimensional, high-level features were subsequently fed into a Softmax Classifier (SMC) for discriminating nuclei from non-nuclear regions of interest within an independent testing set. In this paper, we extend the framework presented in [37] to automatically detect multiple nuclei from whole slide images. To attain this goal, locally maximal confidence scores are computed by sliding a window across the entire image in order to selectively identify candidate image patches for subsequent nucleus classification. Each automatically selected image patch is then fed into a trained classifier, yielding a binary prediction for the absence or presence of a nucleus in each image patch.

Note that our approach is focused on nuclei detection and not on nuclear segmentation, i.e. explicitly extracting the nuclear boundaries [15], [25]. However, our approach could be used to provide initial seed points for subsequent application of segmentation models such as watershed [10], [38], active contour [15], and region-growing approaches [29].

The major contributions of this paper are:

1) Different from hand-crafted feature representation based methods, the SSAE model can transform the input pixel intensities into structured nuclei or non-nuclei representations. Therefore, the SSAE based framework is able to learn high-level structure information from a large number of unlabeled image patches. Our approach is thus fundamentally different from a number of existing hand-crafted methods that rely on low-level image information such as color, texture, and edge cues.

2) By training the SSAE classifier with unlabeled instances, the SSAE model employs a hierarchical architecture for transforming the original pixel signal intensities of input image patches into the corresponding high-level structural information. During the classification stage, each image patch to be evaluated is fed into the hierarchical architecture and represented by a high-level structured representation of nuclei or non-nuclei patches.

3) Using a sliding window approach enables rapid traversal across large images for detection of individual nuclei efficiently. Locally maximal confidence scores are assigned
Fig. 1. Illustration of the architecture of the basic Autoencoder (AE) with "encoder" and "decoder" networks for high-level feature learning of nuclei structures. The "encoder" network represents the input pixel intensities corresponding to an image patch via a lower dimensional feature. Then the "decoder" network reconstructs the pixel intensities within the image patch via the lower dimensional feature.

Fig. 2. Illustration of the Stacked Sparse Autoencoder (SSAE) plus Softmax Classifier (SMC) for identifying the presence or absence of nuclei in individual image patches.
to individual image patches and used to identify candidate patches which are then fed to a subsequent classifier. This two-tier classification reduces the computational burden on the classifier, sieving out non-candidate nuclear patches and thus making the overall model more efficient.

To sum up, this paper integrates the SSAE based framework for learning of high-level features associated with nuclei. Our approach also employs a sliding window scheme for efficiently detecting nuclear patches.

refer to the nuclear and non-nuclear patches, respectively. Note that in the SSAE learning procedure, the label information is not used. Therefore, SSAE learning is an unsupervised learning scheme. After the high-level feature learning procedure is complete, the learned high-level representation of the nuclear structures, as well as its label, are fed to the output layer.

B. The Output Layer: Softmax Classifier (SMC)

The Softmax classifier (SMC) is a supervised model which generalizes logistic regression as

P(y = j | x; W) = exp(w_j^T x) / Σ_{l=1}^{k} exp(w_l^T x)    (1)
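The encode-then-classify pipeline described above can be sketched as follows. This is an illustrative sketch only, not the paper's MATLAB implementation: the layer sizes, the random weights, and the helper names (`encode`, `classify_patch`) are hypothetical, and a real model would use weights learned by the SSAE training procedure rather than random initialization.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(x, weights, biases):
    """Pass a flattened patch through the stacked 'encoder' layers."""
    h = x
    for W, b in zip(weights, biases):
        h = sigmoid(W @ h + b)
    return h

def softmax(z):
    z = z - np.max(z)            # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify_patch(patch, weights, biases, W_out, b_out):
    """Return [P(nucleus), P(non-nucleus)] for one RGB patch."""
    x = patch.reshape(-1)        # 34 * 34 * 3 = 3468-dimensional input
    h = encode(x, weights, biases)
    return softmax(W_out @ h + b_out)

# Hypothetical two-hidden-layer SSAE: 3468 -> 400 -> 225 units
rng = np.random.default_rng(0)
sizes = [34 * 34 * 3, 400, 225]
weights = [rng.normal(0, 0.01, (m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
W_out, b_out = rng.normal(0, 0.01, (2, 225)), np.zeros(2)

patch = rng.random((34, 34, 3))
p = classify_patch(patch, weights, biases, W_out, b_out)
print(p.sum())  # ≈ 1.0 by construction of the softmax
```

With trained weights, the class with the larger probability gives the nuclear/non-nuclear prediction for the patch.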
TABLE III
MODELS CONSIDERED IN THIS WORK FOR COMPARISON WITH THE SSAE+SMC MODEL

is trained to map the input to its digital label by adjusting the weights. Finally, all three layers are combined to form a SSAE with 2 hidden layers and a final SMC layer capable of detecting the nuclei.

The model involves the SSAE being trained bottom up in an unsupervised fashion, followed by a Softmax classifier that employs supervised learning to train the top layer and fine-tune the entire architecture. It is easy to see that the label is not used during the training procedures 1) and 2), until the Softmax classifier is trained. Therefore we can conclude that SSAE learns feature representations in an unsupervised manner.

F. The Trained SSAE+SMC for Nuclei Detection

The detection procedure for the trained SSAE+SMC is shown in Fig. 3. The red square in Fig. 3 is an example of a selected image patch for nuclei detection. Each image patch is first converted into a 1156 × 3-dimensional vector. Then each input image patch yields an output based on (1). If the output value exceeds the decision threshold, the patch is identified as a nuclear patch; otherwise it is not.

G. Sliding Window Detector

Our strategy involves identifying the presence or absence of a nucleus within every individual image patch in a histopathologic image. A sliding window scheme is used to select candidate patches. Since the sliding window detector will typically evoke multiple responses around target nuclei, non-maxima suppression is applied to retain only those evoked responses above a pre-defined threshold. The threshold and overlapping rate for the non-maxima suppression algorithm are empirically defined as 0.8 and 30%, respectively.

(CD) based [12] algorithms. The EM algorithm has been employed to detect nuclei in our previous work in [15]. These methods are described in [7], [12], [15] and we direct interested readers to these papers for a detailed description of the methods.

The implementation of the EM based nuclei detection was based on our previous work in [15]. The BR and CD algorithms for nuclei detection were implemented based on the papers [7] and [12], respectively.

2) SSAE+SMC Versus Other Feature Representation Based Classification Models: We compare SSAE+SMC with the SMC classifier, which employs the SMC directly on the pixel intensities, where a SMC is learned from the pixel intensities of the training set to detect nuclei via the methods described in Section III-B. In the SSAE+SMC based framework (Fig. 3), the features learned by the SSAE are treated as "input" to the SMC for detection. In this experiment, we additionally attempt to show the efficiency of the sparsity constraint on the hidden layer. We compare SSAE with the Stacked Autoencoder (STAE) without the sparsity constraint, where β = 0 in (7).

We therefore compare the SSAE+SMC against three other models (see Table III) to evaluate the efficiency in detecting nuclei from histopathological images as follows:

1) Softmax Classifier (SMC): In this model, the inputs of the SMC in (1) are the pixel intensities of an image patch and the SMC's parameters are trained with the training set to minimize the cost function. Then the SMC (1), with the parameters determined during training, is integrated with the sliding window detector (see Section IV-G) to detect the presence of a nucleus within each image patch selected by the sliding window.

2) AE+SMC: The parameter β in (7) controls the sparsity constraint on the hidden layer of the AE. If the sparsity constraint is removed by setting β = 0, the SAE is reduced to a one layer AE model. The input of the SMC in (1) is learned via a single layer of AE, and the SMC is trained and employed for nuclei detection in a way similar to that described in Section IV-H-2 (1). Then the SMC (1) is coupled with the AE to detect the presence or absence of a nucleus within each image patch selected by the sliding window.

3) STAE+SMC: The Stacked AE (STAE) is a neural network consisting of multiple layers of basic AE in which the outputs of each layer are wired to the inputs of the successive, subsequent layer. STAE involves a two layered AE which in turn comprises two basic AEs. The input of the SMC in (1) is a feature learned via use of a two layer AE from the pixel intensities of an image patch. The SMC is subsequently trained and evaluated using the approach discussed in Section IV-H-2 (1).
Precision = TP / (TP + FP)    (2)

Recall = TP / (TP + FN)    (3)

F-measure = 2 × Precision × Recall / (Precision + Recall)    (4)

FPR = FP / (FP + TN)    (5)

AveP = ∫_0^1 p(r) dr    (6)
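A minimal sketch of the measures in (2)–(6), using toy counts rather than the paper's data; the average precision integral is approximated here by trapezoidal integration over a sampled precision-recall curve.

```python
# Sketch of the detection measures in (2)-(6), computed from raw
# true/false positive and negative counts (illustrative numbers only).

def precision(tp, fp):                       # (2)
    return tp / (tp + fp)

def recall(tp, fn):                          # (3)
    return tp / (tp + fn)

def f_measure(p, r):                         # (4)
    return 2 * p * r / (p + r)

def false_positive_rate(fp, tn):             # (5)
    return fp / (fp + tn)

def average_precision(recalls, precisions):  # (6), trapezoidal approximation
    """Approximate the area under the precision-recall curve."""
    total = 0.0
    for k in range(1, len(recalls)):
        total += (recalls[k] - recalls[k - 1]) * \
                 (precisions[k] + precisions[k - 1]) / 2
    return total

p = precision(80, 20)        # 0.8
r = recall(80, 10)           # 80/90
print(round(f_measure(p, r), 4))             # 0.8421
print(average_precision([0.0, 0.5, 1.0], [1.0, 0.9, 0.6]))  # ≈ 0.85
```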
Fig. 4. The visualization of learned high-level features of input pixel intensities with a three-layer SAE. (a) shows the learned feature representation in the first hidden layer (with 400 (20 × 20) units). The learned high-level feature representations in the second hidden layer (with 225 (15 × 15) units) and third hidden layer (with 100 (10 × 10) units) are shown in (b) and (c), respectively. As expected, (a) shows detailed boundary features of nuclei and other tissue while (b) and (c) show high-level features of nuclei. Also, the overfitting problem is apparent in (c) when learning high-level features of nuclei in the third layer.
Fig. 5. The nuclei detection results (b) of SSAE+SMC for a large breast histopathological image (a) from a whole-slide image of a patient. The green, yellow, and red dots represent the true positives (TP), false positives (FP), and false negatives (FN) with respect to the ground truth, respectively.
Other measures such as Precision, Recall, F-measure, and FPR are defined in (2)–(5). In (6), the Average Precision (AveP) involves computing the average value of p(r) over the interval between r = 0 and r = 1, where the precision p is a function of the recall r. Therefore, AveP represents the average area under the Precision-Recall curve (see Fig. 7(a)).

We also draw the precision-recall curves and Receiver Operating Characteristic (ROC) curves to assess the performance of nuclear detection provided by the different models.

V. EXPERIMENTAL RESULTS AND DISCUSSION

A. Qualitative Results

The visualization of learned high-level features in the first and the second hidden layers from training patches with SSAE are shown in Figs. 4(a) and 4(b), respectively. These features show that the SSAE model enables the uncovering of nuclear and non-nuclear structures from training patches.

Qualitative detection results of SSAE+SMC for a whole-slide breast histopathological image (Fig. 5(a)) are shown in Fig. 5(b). The detection results of the EM, BRT, CD, SMC, AE+SMC, STAE+SMC, SAE+SMC, CNN+SMC, and SSAE+SMC models on a magnified region of interest (ROI) are illustrated in Figs. 6(b)–(j), respectively. In these images, the green, yellow, and red dots represent the nuclei that had been correctly detected (true positive detection), the non-nuclei that had been wrongly detected as nuclei (false positive detection), and the nuclei that were missed with respect to the manually ascertained ground truth delineations, respectively. The SSAE+SMC model was found to outperform the other 8 models with respect to the ground truth (see Fig. 6(a)). These results appear to suggest that SSAE+SMC works well in learning useful high-level features for better representation of nuclear structures.

B. Quantitative Results

Fig. 7(a) shows the precision-recall curves corresponding to nuclear detection accuracy with respect to the EM, BRT, CD, SMC, AE+SMC, STAE+SMC, SAE+SMC, TSAE+SMC, CNN+SMC, and SSAE+SMC models across the 500 images, respectively. Each point on the X-axis and Y-axis represents the recall and precision in (3) and (2), respectively. Each model is quantitatively evaluated using AveP, as shown in Table IV. The results appear to suggest that SSAE+SMC achieves the highest AveP. Each of the Precision-Recall and ROC curves (Figs. 7(a) and (b), respectively) was generated by sequentially plotting the confidence scores (in descending order) associated with the various nuclear detection methods considered in this work. Precision-recall and ROC curves were generated for each detection method considered across the 500 images. A higher precision or True Positive Rate corresponds to a more accurate nuclear detection result. For the nuclear detection problem, we only had information pertaining to the total number of manually identified nuclear patches (or
Fig. 6. The nuclear detection results of the EM (b), BRT (c), CD (d), SMC (e), AE+SMC (f), SAE+SMC (g), STAE+SMC (h), CNN+SMC (i), and SSAE+SMC (j) models for a 400 × 400 patch selected from the black square region in Fig. 5(a). The ground truth of manually detected nuclei is shown as green dots in (a). The green, yellow, and red dots in (b)–(j) represent the TP, FP, and FN with respect to the ground truth, respectively.
positive patches). However, information on the total number of patches without nuclei (or negative patches) was not available. Therefore, to compute the False Positive Rate (FPR), we estimated the total number of negative patches with the sliding window scheme across the randomly chosen ROIs on each of the 500 images. The window slides across each ROI image row by row from the upper left corner to the lower right (the step size was fixed at 6 pixels). The number of negative patches is the sum of all the patches across the 500 images, excluding well-centered and annotated patches as well as those instances in which the distance between the center of the patch window and the closest pathologist-annotated nucleus was less than or equal to 17 pixels. Also, for Fig. 7(b), since the total number of False Positive detections is always smaller than the estimate of the total number of negative patches, the FPR can never reach 1. The trajectory of the ROC curves is therefore only plotted up to a false positive fraction of 0.4. The ROC curve (see Fig. 7(b)) shows that SSAE+SMC results in superior detection performance compared to the other models.

Moreover, Fig. 7(a) and Table IV show that SSAE+SMC tends to outperform SAE+SMC and TSAE+SMC in terms of AveP. While the results appear to suggest the importance of a "deeper" architecture compared to a "shallow" architecture in representing high-level features from pixel intensities, the relatively poor performance of the three layered SAE+SMC model as compared to the SSAE+SMC model suggests that increasing the number of layers may result in overfitting. Dropout is a popular approach employed for avoiding overfitting problems associated with deep networks. The dropout approach was originally proposed in [41], [42]. The Autoencoder model with dropout is also called the "Denoising Autoencoder (DAE)". The name "Denoising" is meant to reflect the idea of the DAE. The DAE is a simple variant of the basic autoencoder (AE). It is trained to reconstruct clean "repaired" input data from a corrupted version of the input. The idea is to add some noise, such as additive Gaussian noise, to the input data in the input layer. Recently, the authors in [43] employed "drop-out" to add noise to the hidden layer. We tried to implement "drop-out" in our work, but the "drop-out" implementation did not yield any gains in performance in our experiments. Therefore, we decided to exclude it from the paper. The estimated confidence intervals of Precision, Recall, and F-measure at a confidence level of 95% with the bootstrapping method for the different methods are shown in Figs. 7(c), (d), and (e), respectively. The means of Precision, Recall, and F-measure of SSAE+SMC and the comparative models are shown in Table IV. As expected, SSAE yields the highest F-measure of 84.49%.

C. Sensitivity Analysis

Fig. 7(f) shows the sensitivity of the window size (X-axis) on the detection accuracy (Y-axis) of the SSAE+SMC model. The curves in Fig. 7(f) show that the SSAE+SMC achieves the best F-measure value when the window size is around 34 pixels. As a result, we chose a window size of 34 pixels for all subsequent experiments. Figs. 7(g) and 7(h) show the execution time and F-measure of the SSAE+SMC model as a function of step size. They show the effect of step size on the computational efficiency and detection accuracy of the SSAE+SMC model. The figures show that both the execution time and F-measure value significantly decrease as the step size increases.

As expected, the SSAE achieves better performance in detecting nuclei as compared to hand-crafted feature based methods. This also appears to be related to the ability of the SSAE model to better capture higher level structural information, yielding better discriminability of nuclei versus non-nuclei.
Fig. 7. The precision-recall curves (a) and ROC curves (b) on the detection accuracies of SSAE+SMC compared to EM, BRT, CD, SMC, AE+SMC, STAE+SMC, TSAE+SMC, and CNN+SMC; the estimated confidence intervals of Precision (c), Recall (d), and F-measure (e) with the bootstrapping method for the different methods; (f) the F-measure value on the nuclei detection accuracies of SSAE+SMC with variable window size; the effect of different step sizes on the execution time (g) and F-measure value (h) of the SSAE+SMC model.
D. Computational Consideration

All the experiments were carried out on a PC (Intel Core(TM) 3.4 GHz processor with 16 GB of RAM) and a Quadro 2000 NVIDIA Graphics Processor Unit. The software implementation was performed using MATLAB 2014a. The training set included 14421 nuclei and 28032 non-nuclei patches. The size of each patch was 34 × 34 pixels. We compared the computational efficiency of SSAE+SMC against nine other state-of-the-art strategies. The execution time for each of these models is shown in Table V. In terms of training time, the four autoencoder-based models and the CNN model needed longer training times, while EM, CD, and BRT did not require a training phase. Also, the deeper the architecture (more layers), the longer the training time required. However, once trained, the autoencoder models were actually more efficient than the EM, CD, and BRT models in terms of run-time execution.

VI. CONCLUDING REMARKS

In this paper, a Stacked Sparse Autoencoder framework is presented for automated nuclei detection on breast cancer histopathology. The Stacked Sparse Autoencoder model can capture high-level feature representations of pixel intensity in an unsupervised manner. These high-level features enable the
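The unsupervised, greedy layer-wise pretraining at the heart of the SSAE can be sketched as follows (a minimal NumPy toy, not the authors' MATLAB implementation; the data, layer widths, sparsity target, and learning rate are illustrative): each layer is a sparse autoencoder whose loss adds a KL-divergence penalty pulling the average activation of every hidden unit toward a small target, and the second layer trains on the first layer's activations.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy stand-in for flattened image patches: 300 samples, 64 dims, values in (0, 1).
X = sigmoid(rng.random((300, 5)) @ rng.normal(0, 1, (5, 64)))

def train_sparse_ae(X, n_hid, rho=0.05, beta=3.0, lr=0.1, epochs=200, seed=1):
    """One sparse-autoencoder layer: squared reconstruction loss + KL sparsity penalty."""
    rng = np.random.default_rng(seed)
    n, n_in = X.shape
    W = rng.normal(0, 0.1, (n_in, n_hid))          # tied weights: decoder uses W.T
    b_h, b_o = np.zeros(n_hid), np.zeros(n_in)
    for _ in range(epochs):
        H = sigmoid(X @ W + b_h)
        X_hat = sigmoid(H @ W.T + b_o)
        rho_hat = H.mean(axis=0)                   # average activation per hidden unit
        d_out = (X_hat - X) * X_hat * (1 - X_hat)
        # Gradient of beta * sum_j KL(rho || rho_hat_j) w.r.t. the hidden activations.
        sparse = beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat))
        d_hid = (d_out @ W + sparse) * H * (1 - H)
        W -= lr * (X.T @ d_hid + d_out.T @ H) / n
        b_h -= lr * d_hid.mean(axis=0)
        b_o -= lr * d_out.mean(axis=0)
    return W, b_h

# Greedy layer-wise pretraining: layer 2 trains on layer 1's activations.
W1, b1 = train_sparse_ae(X, n_hid=32)
H1 = sigmoid(X @ W1 + b1)
W2, b2 = train_sparse_ae(H1, n_hid=16)
H2 = sigmoid(H1 @ W2 + b2)   # high-level features, to be fed to a supervised classifier
```

In the full pipeline the resulting features would be passed to a softmax classifier and the whole stack fine-tuned with labels; the sketch above covers only the unsupervised stage.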
XU et al.: STACKED SPARSE AUTOENCODER (SSAE) FOR NUCLEI DETECTION ON BREAST CANCER HISTOPATHOLOGY IMAGES 129