Breast Cancer Detection and Diagnosis Using Mammographic Data: Systematic Review
Review
Syed Jamal Safdar Gardezi, PhD; Ahmed Elazab, PhD; Baiying Lei, PhD; Tianfu Wang, PhD
National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and
Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
Corresponding Author:
Baiying Lei, PhD
National-Regional Key Technology Engineering Laboratory for Medical Ultrasound
Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging
School of Biomedical Engineering, Health Science Center, Shenzhen University
Shenzhen, China
Phone: 86 13418964616
Fax: 86 0755 86172219
Email: [email protected]
Abstract
Background: Machine learning (ML) has become a vital part of medical imaging research. ML methods have evolved over the
years from manual seeded inputs to automatic initializations. The advancements in the field of ML have led to more intelligent
and self-reliant computer-aided diagnosis (CAD) systems, as the learning ability of ML methods has been constantly improving.
More and more automated methods are emerging with deep feature learning and representations. Recent advancements of ML
with deeper and extensive representation approaches, commonly known as deep learning (DL) approaches, have made a very
significant impact on improving the diagnostics capabilities of the CAD systems.
Objective: This review aimed to survey both traditional ML and DL literature with particular application for breast cancer
diagnosis. The review also provided a brief insight into some well-known DL networks.
Methods: In this paper, we present an overview of ML and DL techniques with particular application for breast cancer.
Specifically, we search the PubMed, Google Scholar, MEDLINE, ScienceDirect, Springer, and Web of Science databases and
retrieve the studies in DL for the past 5 years that have used multiview mammogram datasets.
Results: The analysis of traditional ML reveals the limited usage of these methods, whereas the DL methods have great potential for implementation in clinical analysis and for improving the diagnostic capability of existing CAD systems.
Conclusions: From the literature, it can be found that heterogeneous breast densities make masses more challenging to detect and classify compared with calcifications. The traditional ML methods present confined approaches, limited to either a particular density type or dataset. Although the DL methods show promising improvements in breast cancer diagnosis, there are still issues of data scarcity and computational cost, which have been overcome to a significant extent by data augmentation and the improved computational power of DL algorithms.
KEYWORDS
breast cancer; lesion classification; malignant tumor; machine learning; convolutional neural networks; deep learning
Introduction

Breast cancer is one of the most common cancers among women worldwide. Around 53% of these cases come from developing countries, which represent 82% of the world population [1]. It is reported that 626,700 deaths would occur in 2018 alone [1]. Breast cancer is the leading cause of cancer death among women in developing countries and the second leading cause of cancer death (following lung cancer) among women in developed countries.

In the breast, the cancer cells may spread to lymph nodes or even cause damage to other parts of the body, such as the lungs. Breast cancer most often starts from the malfunctioning of the milk-producing ducts (invasive ductal carcinoma). However, it may also begin in the glandular tissues called lobules or in other cells or tissues within the breast [1]. Researchers have also found that hormonal, lifestyle, and environmental changes contribute to increasing the risk of breast cancer [2,3].

To visualize the internal breast structures, a low-dose x-ray of the breasts is performed; this procedure is known in medical terms as mammography. It is one of the most suitable techniques to detect breast cancer. Mammograms expose the breast to much lower doses of radiation compared with the devices used in the past [4]. In recent years, mammography has proved to be one of the most reliable tools for screening and a key method for the early detection of breast cancer [5,6]. Mammograms are acquired at 2 different views for each breast: the craniocaudal (CC) view and the mediolateral oblique (MLO) view (Figure 1).

In this review, we present the recent work in breast cancer detection using conventional machine learning (ML) and deep learning (DL) techniques. The aim of this work was to provide the reader with an introduction to the breast cancer literature and recent advancements in breast cancer diagnosis using multiview digital mammograms (DMs). The survey aimed to highlight the challenges in the application of DL for the early detection of breast cancer using multiview digital mammographic data. We present the recent studies that have addressed these challenges and finally provide some insights and discussions on the current open problems. This review is divided into 2 major parts. The first part presents a brief introduction to the different steps of a conventional ML method (ie, enhancement, feature extraction, segmentation, and classification), whereas the second part focuses on DL techniques, with an emphasis on multiview (ie, CC and MLO) mammographic data. The present DL literature can be characterized into breast density discrimination and the detection and classification of lesions in multiview digital mammographic data. The rest of this review is organized as follows.
Figure 1. Multiview breast mammogram of a patient. The first column presents two views of the right breast: right craniocaudal (RCC) view and right
mediolateral oblique (RMLO) view. The second column presents two views of the left breast: left craniocaudal (LCC) view and left mediolateral oblique
(LMLO) view.
Figure 3. (a) Original mammogram image (1024×1024). (b) Preprocessing to remove annotations. (c) Pectoral muscle (PM) removal by region growing. (d) PM removal by adaptive segmentation.
Table 1. Advantages and disadvantages of common segmentation techniques.

Global thresholding (GT)
Advantages: Widely used as a preprocessing step in image processing, as these methods are easy to implement.
Disadvantages: Not suitable for segmentation of regions of interest (ROIs), as GT methods produce high false-positive detections.

Local thresholding
Advantages: Works well compared with GT and is sometimes used to improve the GT results.
Disadvantages: Widely used in the literature as an initialization step for other algorithms, but local thresholding fails to separate the pixels accurately into suitable regions.

Region growing
Advantages: Uses pixel connectivity properties to grow iteratively and sum up regions having similar pixel properties.
Disadvantages: Needs an initialization point, that is, a seed point to begin with, and is highly dependent on the initial guess.

Region clustering
Advantages: No seed point is required to initialize; it can directly search the cluster regions.
Disadvantages: The total number of clusters needs to be predefined at the initial stage.

Edge detection
Advantages: Highly suitable for detecting the object boundaries and contours of the suspected ROIs.
Disadvantages: Requires some information about object properties.

Template matching
Advantages: Needs ground truth and is easy to implement; if the prototypes are suitably selected, it can produce good results.
Disadvantages: Needs prior information about the region properties of the objects, such as size, shape, and area.

Multiscale technique
Advantages: Does not require any prior knowledge about object properties; easily discriminates among the coefficients at different levels and scales of decomposition.
Disadvantages: Requires empirical evaluation to select the appropriate wavelet transform and the scale of decompositions.
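Since several methods in this review build on seeded region growing, a minimal sketch in Python may make the seed-point dependency noted in the table concrete; the intensity tolerance and 4-connectivity used here are illustrative assumptions, not taken from any particular study.

```python
import numpy as np
from collections import deque

def region_grow(image, seed, tol=0.1):
    """Grow a region from `seed` over 4-connected pixels whose intensity
    stays within `tol` of the seed intensity. `image` is a 2D float array
    scaled to [0, 1]; `seed` is (row, col). Tolerance and connectivity
    are illustrative choices."""
    h, w = image.shape
    seed_val = image[seed]
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # 4-connectivity
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w and not mask[rr, cc] \
                    and abs(image[rr, cc] - seed_val) <= tol:
                mask[rr, cc] = True
                queue.append((rr, cc))
    return mask

# Toy usage: a bright square on a dark background.
img = np.zeros((64, 64))
img[20:40, 20:40] = 0.9
roi = region_grow(img, seed=(30, 30), tol=0.2)
print(roi.sum())  # 400 pixels recovered
```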
Figure 5. An overview of mammogram processing using computer-aided diagnosis based on machine learning algorithms.
Deep Learning, an Overview

DL algorithms have made significant improvements in performance compared with traditional ML and artificial intelligence [47]. The applications of DL have grown tremendously in various fields such as image classification [47], natural language processing [48], and gaming [49]; in particular, it has become very popular in the medical imaging community for the detection and diagnosis of diseases such as skin cancer [50,51] and brain tumor detection and segmentation [52].
The DL architectures can be characterized into 3 categories: unsupervised DL networks, also known as generative networks; supervised networks, or discriminative networks; and hybrid or ensemble networks.
The convolutional neural network (CNN) is a state-of-the-art DL technique comprising many stacked convolutional layers [47]. The most common CNN discriminative architecture contains a convolutional layer, a maximum pooling layer to increase the field of view of the network, a rectified linear unit (ReLU), batch normalization, a softmax layer, and fully connected layers. The layers are stacked on top of each other to form a deep network that can learn the local and spatial information when a 2D or 3D image is presented as input [53].
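As a concrete illustration of this layer stacking, the following is a minimal sketch in PyTorch; the channel counts, the 64×64 patch size, and the 2-class output are illustrative assumptions rather than any architecture from the surveyed studies.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Minimal CNN following the stacking described above:
    convolution -> batch normalization -> ReLU -> max pooling,
    repeated, then fully connected layers and a softmax output."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # grayscale patch in
            nn.BatchNorm2d(16),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                             # widens the field of view
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 64),                 # assumes 64x64 input patches
            nn.ReLU(inplace=True),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        logits = self.classifier(self.features(x))
        return torch.softmax(logits, dim=1)

probs = SmallCNN()(torch.randn(4, 1, 64, 64))  # 4 patches -> 4 class distributions
```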
The AlexNet [47] architecture was one of the first deep networks to improve the ImageNet classification accuracy by a significant stride over the existing traditional methodologies. The architecture contained 5 convolutional layers followed by 3 fully connected layers. The ReLU activation function for the nonlinear part was introduced to replace the traditional activation functions, such as the Tanh or sigmoid functions used in neural networks. ReLU converges faster than the sigmoid, which suffers from the vanishing gradient problem.
Later, the VGG 16 architecture was proposed by the Visual Geometry Group (VGG) [54] at Oxford University. VGG improved on the AlexNet architecture by changing the kernel size and introducing multiple filters. The large kernel-sized filters (ie, 11×11 in Conv1 and 5×5 in Conv2, respectively) are replaced by multiple 3×3 kernel-sized filters placed one after another. The multiple smaller kernel filters improve the receptive field compared with a larger kernel, as the multiple nonlinear layers increase the depth of the network. The increased depth enables the network to learn more complex features at a lower cost. Although VGG achieved very good accuracy on classification tasks for the ImageNet dataset, it is computationally expensive and requires huge computational power, both in terms of storage memory and time, making it inefficient because of the large width of its convolutional layers.
GoogleNet [55] was built on the idea that most connections in a dense architecture, and their activations in the deep network, are redundant or unnecessary because of correlations between them, which makes the network computationally expensive. Therefore, GoogleNet aimed for a more efficient network with sparse connections between the activations. GoogleNet introduced the inception module, which effectively computes sparse activations in a CNN with a normal dense construction. The network also uses 3 different convolution sizes (ie, 5×5, 3×3, and 1×1) to obtain a better receptive field and extract details at very fine levels. One of the salient points of the inception module is that it also has a so-called bottleneck layer (1×1 convolution) that massively reduces the computation requirement. Another change that GoogleNet introduced is global average pooling at the last convolutional layer, which averages the channel values across the 2D feature map and thereby reduces the total number of parameters.
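The parallel-branch idea with 1×1 bottlenecks can be sketched as follows; the branch widths are illustrative assumptions and do not reproduce GoogleNet's exact configuration.

```python
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Parallel 1x1, 3x3, and 5x5 convolutions whose outputs are
    concatenated; 1x1 "bottleneck" convolutions shrink the channel
    count before the costly 3x3/5x5 filters."""
    def __init__(self, in_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, 16, kernel_size=1)
        self.b3 = nn.Sequential(
            nn.Conv2d(in_ch, 8, kernel_size=1),           # bottleneck
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
        )
        self.b5 = nn.Sequential(
            nn.Conv2d(in_ch, 8, kernel_size=1),           # bottleneck
            nn.Conv2d(8, 16, kernel_size=5, padding=2),
        )

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)

feat = InceptionBlock(32)(torch.randn(1, 32, 28, 28))     # -> (1, 48, 28, 28)
pooled = feat.mean(dim=(2, 3))                            # global average pooling -> (1, 48)
```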
With increasing network depth, the accuracy of the network saturates and then degrades rapidly. This degradation is not caused by overfitting; rather, with the addition of more layers, the training error also increases, which leads to the degradation problem. The degradation problem was solved by He et al [56] with the introduction of the residual network (ResNet). The residual module was introduced to effectively learn the training parameters in a deeper network. They introduced skip connections in the convolutional layers in a blockwise manner to construct a residual module. The performance of ResNet is better than that of VGG and GoogleNet [57].
Deep Learning for Breast Cancer Diagnosis

Many researchers have used DL approaches in medical image analysis. The success of DL largely depends on the availability of a large number of training samples to learn the descriptive feature mappings of the images, which gives very accurate classification results. In the ImageNet classification task, for example, the network is trained on more than 1 million images spanning more than 1000 classes. However, in the case of medical images, the amount of available training data is not that large. Moreover, it is also difficult to acquire a large number of labeled images, as the annotation itself is an expensive task, and some diseases (eg, lesions) are scarce in the datasets [58]. In addition, the annotation of these data samples, where it exists, suffers from intraobserver variations, as annotation is highly subjective and relies on the expert's knowledge and experience. To overcome the data insufficiency challenge, many research groups have devised different strategies: (1) using 2D patches or 3D cubes instead of the whole image as input [59,60], which also reduces the model parameters and alleviates overfitting; (2) introducing data augmentation using affine transformations (translation, rotation, and flipping [61,62]) and training the network on the augmented data; (3) using a transfer learning approach with pretrained weights [63,64] and replacing only the last layers for the new target classes; and (4) using trained models with small input sizes and then transforming the weights in the fully connected layers into convolutional kernels [65].
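Strategies (2) and (3) are straightforward to express in code. The following sketch uses torchvision; the augmentation parameters and the 2-class replacement head are illustrative assumptions.

```python
import torch.nn as nn
from torchvision import models, transforms

# (2) Affine augmentation: each epoch sees translated/rotated/flipped variants.
augment = transforms.Compose([
    transforms.RandomAffine(degrees=15, translate=(0.1, 0.1)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# (3) Transfer learning: load ImageNet-pretrained weights and replace
# only the final layer with a new head for the target classes.
model = models.resnet50(pretrained=True)
for p in model.parameters():
    p.requires_grad = False                      # freeze pretrained features
model.fc = nn.Linear(model.fc.in_features, 2)    # benign vs malignant head
```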
Search Strategy for Study Selection

To select the relevant recent studies on breast cancer diagnosis, we considered the studies from the past 5 years from well-known publishing platforms: the PubMed, Google Scholar, MEDLINE, ScienceDirect, Springer, and Web of Science databases. The search terms convolutional neural networks, deep learning, breast cancer, mass detection, transfer learning, and multiview were combined.
Table 2. Summary of convolutional neural network (CNN)–based methods for breast density estimation (—: code not available).

Mohamed et al [70]. Method: CNN (AlexNet; transfer learning). Dataset/number: private, University of Pittsburgh/200,00 digital mammograms (DMs; multiview). Task: breast density estimation. Performance metric (value): area under the curve (AUC; 0.9882). Code availability: —.

Ahn et al [71]. Method: CNN (transfer learning). Dataset/number: private, Seoul University Hospital/397 DMs (multiview). Task: breast density estimation. Performance metric (value): correlation coefficient (0.96). Code availability: —.

Xu et al [73]. Method: CNN (scratch based). Dataset/number: public, INbreast dataset/410 DMs (multiview). Task: breast density estimation. Performance metric (value): accuracy (92.63%). Code availability: —.

Wu et al [72]. Method: CNN (transfer learning). Dataset/number: private, New York University School of Medicine/201,179 cases (multiview). Task: breast density estimation. Performance metric (value): mean AUC (0.934). Code availability: [77].

Kallenberg et al [74]. Method: convolutional sparse autoencoder, ie, CNN+stacked autoencoder. Dataset/number: private, Dutch Breast Cancer Screening Program and Mayo Mammography, Minnesota/493+668 images (multiview). Task: breast density estimation and risk scoring. Performance metrics (values): mammographic texture (0.91) and AUC (0.61). Code availability: —.

Ionescu et al [75]. Method: CNN. Dataset/number: private dataset/67,520 DMs (multiview). Task: breast density estimation and risk scoring. Performance metric (value): average match concordance index (0.6). Code availability: —.

Geras et al [76]. Method: multiview deep neural network. Dataset/number: private, New York University/886,000 images (multiview). Task: breast density estimation and risk scoring. Performance metric (value): mean AUC (0.735). Code availability: —.
In another study, Wang et al [81] presented a semiautomated early detection approach using DL to discriminate microcalcifications and masses in a breast cancer dataset. The method aimed to detect microcalcifications that can be used as an indicator of early breast cancer [82,83]. The DL architecture consisted of stacked autoencoders (SAE) that stack multiple autoencoders hierarchically. The deep SAE model used layer-wise greedy training to extract the low-level semantic features of microcalcifications. The method had 2 scenarios: (1) microcalcifications alone and (2) microcalcifications and masses together to train and test the SAE model. Their method achieved good discriminative accuracy for identifying calcifications using an SVM classifier.
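The layer-wise greedy scheme trains one autoencoder per layer on the codes produced by the previous layer and then stacks the trained encoders. The following is a schematic sketch of this general recipe, not the exact configuration of Wang et al; the layer sizes are illustrative.

```python
import torch
import torch.nn as nn

def train_autoencoder(data, in_dim, hid_dim, epochs=20):
    """Train one autoencoder to reconstruct `data`; return its encoder."""
    enc, dec = nn.Linear(in_dim, hid_dim), nn.Linear(hid_dim, in_dim)
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    for _ in range(epochs):
        recon = dec(torch.sigmoid(enc(data)))
        loss = nn.functional.mse_loss(recon, data)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return enc

# Greedy layer-wise stacking: codes of layer k become inputs of layer k+1.
x = torch.rand(256, 1024)                  # eg, flattened 32x32 patches
dims, encoders = [1024, 256, 64], []
for in_dim, hid_dim in zip(dims[:-1], dims[1:]):
    enc = train_autoencoder(x, in_dim, hid_dim)
    encoders.append(enc)
    x = torch.sigmoid(enc(x)).detach()     # freeze and feed codes forward
# The stacked encoders (plus an SVM or softmax on top) form the deep model.
```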
Ribli et al [84] used transfer learning to implement the Faster R-CNN model to detect mammographic lesions and classify them into benign and malignant pathology, as can be seen in Figure 6 (adapted from [84]). The region proposal network in the Faster R-CNN generated possible suspected regions, which were refined by fine-tuning the hyperparameters. The method achieved significant classification results on the public INbreast database. However, one of the major limitations of this study is that it was tested on small-scale pixel-level annotated data for detection, whereas the classification task was evaluated on a larger screening dataset.
Singh et al [85] presented a conditional generative adversarial network (cGAN) to segment mammographic masses from an ROI. The generative model learns the lesion representations to create binary masks, whereas the adversarial network learns features that discriminate the real masses from the generated binary masks. The key advantage of the proposed cGAN is that it can work well for small-sample datasets. The results of their method showed high similarity coefficient values and intersection over union of the predicted masses with the ground truths. Moreover, the method also classified the detected masses into 4 types (ie, irregular, lobular, oval, and round) using a CNN, as shown in Figure 7 (adapted from [85]).
Some researchers used image features for lesion detection and classification. One such study by Agarwal and Carson [86] predicted semantic features such as the type of lesion and pathology in mammograms using a deep CNN. The motivation of the study was to propose a method that could automatically detect a lesion and its pathology (ie, calcification or mass, either benign or malignant). A scratch-based CNN was trained on the DDSM dataset, which contained mass as well as calcification cases. The method showed significant results in recognizing the semantic characteristics that can assist radiologists in clinical decision support tasks.
Gao et al [87] presented a shallow-deep CNN (SD-CNN) for lesion detection and classification in contrast-enhanced DMs (CEDM). A 4-layered shallow-deep CNN was used to extract the visualization mappings of the convolutional layer in the CEDM images and combine them with low-energy (LE) images. This virtual enhancement improved the quality of the LE images. ResNet was then applied to these virtual combined images to extract features for classifying benign and normal cases. Using the SD-CNN on the CEDM images resulted in a significant improvement in classification accuracy compared with DMs.
Hagos et al [88] presented a multiview CNN to detect breast masses in symmetrical images. The method used a CNN architecture with multiple patches as input to learn the symmetrical differences in the masses. Using the gradient orientation features and local lines on the images, the pixel likelihood was used to determine whether a patch was mass or nonmass. They used the AUC and the competition performance metric as performance measures for the proposed method against the baseline nonsymmetrical methods.
Later, Tuwen et al [89] proposed a multiview breast mass detection system based on DNNs. The 2-step method first detected the suspicious regions in the multiview data and then reduced FPs through neural learning and affirmed the mass regions. The second major module used transfer learning to train images with Fast R-CNN and Mask R-CNN, with 3 different variants of ResNet (ie, ResNet-101, ResNeXt-101, and ResNeXt-152) as the backbone. The 3 networks were trained on full images to capture enough context information to discriminate soft lesion tissues. Data augmentation was also applied to enrich the dataset.
Jung et al [90] proposed a single-stage mass detection model using RetinaNet. RetinaNet is a 1-stage object detection method that can overcome the class imbalance problem and perform better than 2-stage methods. The focal loss function of the model allowed RetinaNet to focus on the complex samples and detect objects. The mammogram RetinaNet was tested on 2 DM datasets, that is, INbreast and an in-house dataset, GURO. Moreover, data augmentation was also used to enrich the database. Using the transfer learning approach, the mass patches from each image were trained using random weight initialization and different combinations. With 6 different experimental settings, RetinaNet achieved significant detection accuracy compared with other state-of-the-art methods.
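The focal loss achieves this by scaling the cross-entropy of each example by (1 − p_t)^γ, so well-classified (easy) examples contribute almost nothing. A binary sketch follows; γ=2 and α=0.25 are the commonly reported defaults and are assumed here.

```python
import torch

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Binary focal loss: cross-entropy scaled by (1 - p_t)^gamma,
    so confident (easy) predictions contribute almost nothing."""
    bce = torch.nn.functional.binary_cross_entropy_with_logits(
        logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)           # prob of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()

# An easy negative (logit -6) is down-weighted far more than a hard one (logit +1).
print(focal_loss(torch.tensor([-6.0, 1.0]), torch.tensor([0.0, 0.0])))
```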
Shen et al [91] presented a deep architecture with end-to-end learning to detect and classify the mass regions in the whole digital breast image. The method was trained on the whole mammogram image by using a patch classifier to initialize the weights of the full-image network in an end-to-end fashion. The patch classifier uses the existing VGG and ResNet architectures for classification. Different combinations of patch sets and hyperparameters were trained to find the optimal combination on whole breast images from the DDSM and INbreast datasets.
We summarize the lesion detection and classification methods in detail in Table 3 and illustrate the datasets used, tasks, performance metrics, and code availability.
Figure 6. Sample results from the study by Ribli et al for mass detection and classification.
Figure 7. An overview of conditional generative adversarial network adapted from the study by Singh et al for mass segmentation and shape classification.
CNN: convolutional neural network.
Table 3. Summary of convolutional neural network (CNN)–based methods for breast mass detection (—: code not available).

Dhungel et al [78]. Method: hybrid CNN+level set. Dataset/number: public, INbreast dataset/410 images (multiview). Task: mass detection and classification of benign and malignant. Performance metrics (values): accuracy (0.9) and sensitivity (0.98). Code availability: —.

Dhungel et al [79]. Method: conditional random field (CRF)+CNN. Dataset/number: public, INbreast and Digital Database for Screening Mammography (DDSM)/116 and 158 images (multiview). Task: lesion detection and segmentation. Performance metric (value): Dice score (0.89). Code availability: —.

Zhu et al [80]. Method: fully convolutional network+CRF. Dataset/number: public, INbreast and DDSM/116 and 158 images (multiview). Task: lesion segmentation. Performance metric (value): Dice score (0.97). Code availability: [92].

Wang et al [81]. Method: stacked autoencoder (transfer learning). Dataset/number: private, Sun Yat-Sen University/1000 digital mammograms. Task: detection and classification of calcifications and masses. Performance metric (value): accuracy (0.87). Code availability: —.

Ribli et al [84]. Method: Faster R-CNN (transfer learning). Dataset/number: public, DDSM (2620) and INbreast (115), and private dataset by Semmelweis University Budapest/847 images. Task: detection and classification. Performance metric (value): area under the curve (AUC; 0.95). Code availability: Semmelweis dataset: [93]; code: [94].

Singh et al [85]. Method: conditional generative adversarial network and CNN. Dataset/number: public and private, DDSM and Reus Hospital Spain dataset/567+194 images. Task: lesion segmentation and shape classification. Performance metrics (values): Dice score (0.94) and Jaccard index (0.89). Code availability: —.

Agarwal and Carson [86]. Method: CNN (scratch based). Dataset/number: public, DDSM/8750 images (multiview). Task: classification of masses and calcifications. Performance metric (value): accuracy (0.90). Code availability: —.

Gao et al [87]. Method: shallow-deep convolutional neural network, ie, 4-layer CNN+ResNet. Dataset/number: private, Mayo Clinic Arizona (49 subjects) and public, INbreast dataset (89 subjects; multiview). Task: lesion detection and classification. Performance metrics (values): accuracy (0.9) and AUC (0.92). Code availability: —.

Hagos et al [88]. Method: multi-input CNN. Dataset/number: private (General Electric, Hologic, Siemens) dataset/28,294 images (multiview). Task: lesion detection and classification. Performance metrics (values): AUC (0.93) and competition performance metric (0.733). Code availability: —.

Tuwen et al [89]. Method: Fast R-CNN and Mask R-CNN with ResNet variants as backbone. Dataset/number: private (General Electric, Hologic, Siemens) dataset/23,405 images (multiview). Task: lesion detection and classification. Performance metric (value): sensitivity (0.97) with 3.56 false positives per image. Code availability: —.

Jung et al [90]. Method: RetinaNet model. Dataset/number: public and private, INbreast and GURO dataset by Korea University Guro Hospital/410+222 images (multiview). Task: mass detection and classification. Performance metric (value): accuracy (0.98) with 1.3 false positives per image. Code availability: [95].

Shen et al [91]. Method: CNN end-to-end (transfer learning through visual geometry group 16 and ResNet). Dataset/number: public, DDSM and INbreast/2584+410 images (multiview). Task: classification of masses. Performance metric (value): AUC (0.96). Code availability: [96].
Levy and Jain [97] demonstrated the usefulness of DL as a classification tool to discriminate benign and malignant cancerous regions. The authors used a transfer learning approach to implement 2 architectures: AlexNet and GoogleNet. Data augmentation was used to increase the number of samples and alleviate overfitting issues. The results showed the significance of DL features in the classification of the 2 classes.
Recently, Samala et al [98] presented a mass classification method for digital breast tomosynthesis (DBT) using a multistage fine-tuned CNN. The method used a multistage transfer learning approach, varying the layers and selecting the optimal combination. Initially, the CNN tuned on the ImageNet dataset was directly implemented on the DBT data, and the results were recorded; the multistage CNN was then fine-tuned on the DBT dataset. The classification layers of the CNN were used with different freeze patterns to extract the best combination that produces the highest accuracy. A total of 6 different combinations of transfer networks with varying freeze patterns for the convolutional layers were tested. The multistage transfer learning significantly improved the results, with the least variation compared with single-stage learning.
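Varying the freeze pattern between stages amounts to toggling gradient updates for different layer groups. The following schematic sketch illustrates the idea on AlexNet; the grouping and the two-stage schedule are illustrative assumptions, not the exact protocol of Samala et al.

```python
import torch
from torchvision import models

model = models.alexnet(pretrained=True)       # stage 0: ImageNet weights

def set_freeze_pattern(model, n_frozen_conv):
    """Freeze the first `n_frozen_conv` convolutional layers of AlexNet's
    feature extractor and leave the remaining layers trainable."""
    conv_seen = 0
    for layer in model.features:
        if isinstance(layer, torch.nn.Conv2d):
            conv_seen += 1
        for p in layer.parameters():
            p.requires_grad = conv_seen > n_frozen_conv

# Stage 1: fine-tune on mammograms with most convolutional layers frozen;
# stage 2: fine-tune on the DBT data with a shallower freeze.
for n_frozen in (4, 2):
    set_freeze_pattern(model, n_frozen)
    # ... train on the stage-specific dataset here ...
```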
Jadoon et al [99] presented a hybrid methodology for breast cancer classification by combining CNN with wavelet and curvelet transforms. This model targeted a 3-class classification study (ie, normal, malignant, and benign cases). In this study, 2 methods, namely, CNN-discrete wavelet (CNN-DW) and CNN-curvelet transform (CNN-CT), were used. Features from the wavelet and curvelet transforms were fused with features obtained from the CNN. Data augmentation was used to enrich the dataset and avoid overfitting of features at the classification stage. Features from CNN-DW and CNN-CT were extracted at 4-level sub-band decompositions separately, using dense scale-invariant features at each sub-band level. The obtained features were presented as input to train a CNN with softmax and SVM layers for the classification of normal, benign, and malignant cases.
In a similar study, Huynh et al [100] also used transfer learning and CNN as tools to classify breast cancer tumors. The authors proposed an ensemble method that used both CNN and handcrafted features (eg, statistical and morphological features). The features from each method were combined to obtain the ensemble feature matrix, and an SVM classifier was used with 5-fold cross-validation. The performance of the individual methods was compared with that of the ensemble method using 219 breast lesions. Their results showed that the ensemble could produce better results compared with the fine-tuned CNN and the analytical feature extractor.
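The fusion step in such ensembles reduces to concatenating the two feature matrices and cross-validating an SVM on the result. A sketch with scikit-learn follows; the feature extractors are replaced by random stand-ins, and only the 219-lesion sample size is taken from the study.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-ins: `cnn_feats` from a pretrained CNN's penultimate layer and
# `handcrafted` (eg, statistical/morphological descriptors), one row per lesion.
rng = np.random.default_rng(0)
cnn_feats = rng.normal(size=(219, 256))
handcrafted = rng.normal(size=(219, 20))
labels = rng.integers(0, 2, size=219)            # benign vs malignant

ensemble = np.hstack([cnn_feats, handcrafted])   # fused feature matrix
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scores = cross_val_score(clf, ensemble, labels, cv=5)  # 5-fold cross-validation
print(scores.mean())
```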
Domingues and Cardoso [101] used an autoencoder to classify mass versus nonmass regions in the INbreast dataset. The classifier architecture comprised 1025-500-500-2000-2 layers, with the same number of layers for the decoder. Except for the last 2 linear layers, all other layers were logistic. The method produced significant results. Moreover, it was also observed that increasing the depth of the network by adding more layers can also improve the detection and classification rates. The authors tested the performance of the DL method against 5 classifiers (ie, KNN, decision trees, LDA, naive Bayes, and SVM).
Wu et al [102] presented a DL approach to address the class imbalance and limited data issues in breast cancer classification. The approach used infilling to generate synthetic mammogram patches using a cGAN network. In the first step, the multiscale generator was trained to create synthetic patches in the target image using the GAN. The generator used cascading refinement to generate the multiscale features and ensure stability at high resolution. Figure 8 shows the synthetic images generated by the cGAN. The cGAN was restricted to infill only lesions, either masses or calcifications. The quality of the generated images was experimentally evaluated by training a ResNet-50 classifier, and the classification performance of cGAN-based and traditional augmentation methods was compared. The results showed that synthetic augmentation improves classification.
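The conditional-infilling idea can be sketched as a generator that completes a masked lesion region, trained against a discriminator with an adversarial term plus an L1 reconstruction term. This is a schematic of the general recipe, not the cascading multiscale model of Wu et al; all layer sizes are illustrative.

```python
import torch
import torch.nn as nn

G = nn.Sequential(  # generator: masked patch + mask in, completed patch out
    nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid())
D = nn.Sequential(  # discriminator: patch in, real/fake score out
    nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(32 * 16 * 16, 1))
bce = nn.BCEWithLogitsLoss()

real = torch.rand(8, 1, 32, 32)                  # real lesion patches
mask = torch.zeros_like(real)
mask[..., 8:24, 8:24] = 1                        # region to be infilled
fake = G(torch.cat([real * (1 - mask), mask], dim=1))  # generator's completed patch

# Discriminator learns real vs generated; generator learns to fool it
# while also reconstructing the original patch (L1 term).
d_loss = bce(D(real), torch.ones(8, 1)) + bce(D(fake.detach()), torch.zeros(8, 1))
g_loss = bce(D(fake), torch.ones(8, 1)) + (fake - real).abs().mean()
```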
Aboutalib et al [103] addressed the issue of reducing recall rates in breast cancer diagnosis. A higher number of FPs results in higher recalls, which leads to unnecessary biopsies and increased cost for the patients. In this study, a DL method to reduce the recall rates was proposed. A deep CNN, namely, AlexNet, was implemented, and a total of 6 different scenarios of mammogram classification were investigated. The CNN was able to discriminate and classify these 6 categories very efficiently. Moreover, it could also be inferred that certain features in recalled benign images caused them to be reexamined and recalled instead of being classified as negative (normal) cases.
Lately, Wang et al [104] presented a hybrid DL method for multiview breast mass diagnosis. The framework exploited the contextual information from the multiview data (ie, CC and MLO) using CNN features and an attention mechanism. The proposed multiview DNN aimed to help medical experts in the classification of breast cancer lesions. The method comprised 4 steps; mass cropping and the extraction of clinical features were done from the multiview patches. A recurrent neural network, in particular, long short-term memory, was used to extract the label co-occurrence dependency of the multiview information for the classification of mass regions into benign and malignant cases, using the clinical and CNN features as input.
In another study, Shams et al [105] proposed a GAN-based mammogram classification method, the Deep GeneRAtive Multitask (DiaGRAM) network, to deal with data scarcity and the limited availability of annotated data. DiaGRAM effectively uses end-to-end multitask learning to improve diagnostic performance on a limited number of datasets.
Gastounioti et al [106] presented an ensemble method for breast parenchymal classification. The texture feature maps extracted from lattice-based techniques are fed separately as input to a multichannel CNN. The meta-features from the CNN predicted the risk score associated with the breast parenchyma. The hybrid method showed better performance compared with the individual texture features and CNN, respectively.
Figure 8. Sample results from the study by Wu et al for synthetic generation of data using conditional generative adversarial network. GAN: generative
adversarial network.
Dhungel et al [107] introduced a multiview ensemble deep ResNet (mResNet) for the classification of malignant and benign tumors. Their ensemble network comprised deep ResNets capable of processing 6 input images with different views, that is, CC and MLO. The mResNet can automatically produce binary maps of the lesions. The final outputs of the mResNet are concatenated to obtain a fully connected layer that can classify the lesions into the malignant or benign class.

Generally, DL methods have significantly improved the performance of breast cancer detection, classification, and segmentation. We summarize these methods in detail in Table 4 and illustrate the datasets used, tasks, performance metrics, and code availability.
Table 4. Summary of convolutional neural network (CNN)–based methods for breast mass classification (—: code not available).

Levy and Jain [97]. Method: AlexNet and GoogleNet (transfer learning). Dataset/number: public, Digital Database for Screening Mammography (DDSM) dataset/1820 images (multiview). Task: breast mass classification. Performance metrics (values): accuracy (0.924), precision (0.924), and recall (0.934). Code availability: —.

Samala et al [98]. Method: multistage fine-tuned CNN (transfer learning). Dataset/number: private+public, University of Michigan and DDSM/4039 regions of interest (ROIs; multiview). Task: classification performance on varying sample sizes. Performance metric (value): area under the curve (AUC; 0.91). Code availability: [108].

Jadoon et al [99]. Method: CNN-discrete wavelet and CNN-curvelet transform. Dataset/number: public, Image Retrieval in Medical Applications dataset/2796 ROI patches. Task: classification. Performance metrics (values): accuracy (81.83 and 83.74) and receiver operating characteristic curve (0.831 and 0.836) for the 2 methods. Code availability: —.

Huynh et al [100]. Method: CNN (transfer learning). Dataset/number: private, University of Chicago/219 images (multiview). Task: classification of benign and malignant tumors. Performance metric (value): AUC (0.86). Code availability: —.

Domingues and Cardoso [101]. Method: autoencoder. Dataset/number: public, INbreast/116 ROIs. Task: classification of mass vs normal. Performance metric (value): accuracy (0.99). Code availability: [109].

Wu et al [102]. Method: generative adversarial network (GAN) and ResNet-50. Dataset/number: public, DDSM dataset/10,480 images (multiview). Task: detection and classification of benign and malignant calcifications and masses. Performance metric (value): AUC (0.896). Code availability: [110].

Aboutalib et al [103]. Method: CNN (transfer learning). Dataset/number: public, full-field digital mammography and DDSM/14,860 images (multiview). Task: classification. Performance metric (value): AUC (0.91). Code availability: —.

Wang et al [104]. Method: CNN and long short-term memory. Dataset/number: public, Breast Cancer Digital Repository (BCDR-F03)/763 images (multiview). Task: classification of breast masses using contextual information. Performance metric (value): AUC (0.89). Code availability: —.

Shams et al [105]. Method: CNN and GAN. Dataset/number: public, INbreast and DDSM (multiview). Task: classification. Performance metric (value): AUC (0.925). Code availability: —.

Gastounioti et al [106]. Method: texture features+CNN. Dataset/number: private/106 cases (mediolateral oblique view only). Task: classification. Performance metric (value): AUC (0.9). Code availability: —.

Dhungel et al [107]. Method: multiview ensemble ResNet. Dataset/number: public, INbreast (multiview). Task: classification. Performance metric (value): AUC (0.8). Code availability: —.
Discussion

On the other hand, many attempts have been made to reduce human intervention and produce fully automatic CAD systems, which is a very challenging task. In fact, all methods in the literature require annotated images (ground truth) to validate their findings during the training and testing stages. Thus, the acquisition of labeled mammograms with image-level and pixel-level annotations is one of the obstacles in designing robust DL methods. The main issue is not only the availability of data but also annotation by expert radiologists, which is time consuming, subjective, and expensive.

It is noted from the literature that automated DL methods require extensive experimentation, computational power, and preprocessing of data, which makes them inefficient for real-time use. Moreover, finding the optimal parameters of DL networks is also one of the major challenges in building a CAD system for clinical use. However, this issue can be resolved if sufficient training is provided to clinicians and CAD systems are made more user friendly. It is also noted that semisupervised approaches have shown good performance on the public and private datasets for breast cancer diagnosis.

From the analysis of the methods mentioned in Tables 2, 3, and 4, it can be noted that most of the methods adopt augmentation strategies to enrich the dataset. All these techniques only use geometric transformations to create rotated and scaled versions of existing samples without adding any morphological variations in the lesions. Thus, the enrichment of data with more samples is limited to affine transformations and cannot fully resolve the overfitting problem.

Developing DL models that can learn from limited data is still an open research area, not only in breast cancer diagnosis but also in other medical image analysis applications. Moreover, data augmentation techniques that can create morphological variations in augmented samples, while also preserving the lesion characteristics, are needed. One of the solutions to address these problems is to explore the capabilities of GANs, as successfully demonstrated in the studies by Singh et al [85] and Wu et al [102]. Techniques such as these will not only tackle the data insufficiency issue but will also provide a viable solution to the class imbalance problem, which is also an important research area.

Apart from the development of automatic DL techniques, there are other challenges for the medical imaging research community. First, it is very challenging to secure funding for the construction of a medical dataset; moreover, finding an expert for annotation is difficult, and the cost of annotation itself is very high. Second, privacy and copyright issues make medical images difficult to share compared with natural image datasets. Finally, because of the complex anatomy of human organs, a variety of datasets acquired with different imaging modalities is required. Despite these challenges, there has been a significant increase in the number of public datasets. Organizing a grand challenge is one of the good practices devised to share and enrich the datasets: the participants are provided with a certain number of tasks on a particular dataset, and the technique with the best results is announced as the winner. Moreover, different research centers join hands in research collaborations as well as common data sharing platforms.

Conclusions

From the aforementioned discussions, we can see that both supervised and unsupervised DL methods are used by the image analysis community, but the majority of the work uses the semisupervised approach. The presented literature aims to help in building a CAD system that is robust and computationally efficient to assist clinicians in the diagnosis of breast cancer at early stages. As DL requires a sufficient amount of annotated data for training, most researchers use a combination of public and private data followed by data augmentation techniques to overcome the data scarcity issue. These approaches have provided a feasible solution to the problems of data scarcity and overfitting.
Acknowledgments
This work was supported partly by National Natural Science Foundation of China (Nos 61871274, 61801305, 81771922, and
81571758), National Natural Science Foundation of Guangdong Province (Nos 2017A030313377 and 2016A030313047),
Shenzhen Peacock Plan (Nos KQTD2016053112051497 and KQTD2015033016104926), and Shenzhen Key Basic Research
Project (Nos JCYJ20170413152804728, JCYJ20180507184647636, JCYJ20170818142347251 and JCYJ20170818094109846).
Conflicts of Interest
None declared.
Multimedia Appendix 1
Commonly used metrics for performance evaluation in breast cancer diagnosis.
[PDF File (Adobe PDF File), 69KB-Multimedia Appendix 1]
References
1. American Cancer Society. 2018. Global Cancer: Facts & Figures, 4th edition URL:http://www.cancer.org/content/dam/
cancer-org/research/cancer-facts-and-statistics/global-cancer-facts-and-figures/global-cancer-facts-and-figures-4th-edition.
pdf
2. Blakely T, Shaw C, Atkinson J, Cunningham R, Sarfati D. Social inequalities or inequities in cancer incidence? Repeated
census-cancer cohort studies, New Zealand 1981-1986 to 2001-2004. Cancer Causes Control 2011 Sep;22(9):1307-1318.
[doi: 10.1007/s10552-011-9804-x] [Medline: 21717195]
3. Smigal C, Jemal A, Ward E, Cokkinides V, Smith R, Howe HL, et al. Trends in breast cancer by race and ethnicity: update
2006. CA Cancer J Clin 2006;56(3):168-183 [FREE Full text] [doi: 10.3322/canjclin.56.3.168] [Medline: 16737949]
4. Guide to Mammography And Other Breast Imaging Procedures. New York: Natl Council on Radiation; 2012.
5. Ponraj DN, Jenifer ME, Poongodi DP, Manoharan JS. A survey on the preprocessing techniques of mammogram for the
detection of breast cancer. J Emerg Trends Comput Inf Sci 2011;2(12):656-664 [FREE Full text]
6. Rangayyan RM, Ayres FJ, Leo Desautels JE. A review of computer-aided diagnosis of breast cancer: toward the detection
of subtle signs. J Franklin Inst 2007 May;344(3-4):312-348. [doi: 10.1016/j.jfranklin.2006.09.003]
7. Ganesan K, Acharya UR, Chua KC, Min LC, Abraham KT. Pectoral muscle segmentation: a review. Comput Methods
Programs Biomed 2013 Apr;110(1):48-57. [doi: 10.1016/j.cmpb.2012.10.020] [Medline: 23270962]
8. Ge M, Mainprize JG, Mawdsley GE, Yaffe MJ. Segmenting pectoralis muscle on digital mammograms by a Markov random
field-maximum a posteriori model. J Med Imaging (Bellingham) 2014 Oct;1(3):34503 [FREE Full text] [doi:
10.1117/1.JMI.1.3.034503] [Medline: 26158068]
9. Ali MA, Czene K, Eriksson L, Hall P, Humphreys K. Breast tissue organisation and its association with breast cancer risk.
Breast Cancer Res 2017 Sep 6;19(1):103 [FREE Full text] [doi: 10.1186/s13058-017-0894-6] [Medline: 28877713]
10. Oliver A, Lladó X, Torrent A, Martí J. One-Shot Segmentation of Breast, Pectoral Muscle, and Background in Digitised
Mammograms. In: Proceedings of the International Conference on Image Processing. IEEE; 2014 Presented at: IEEE'14;
October 27-30, 2014; Paris, France p. 912-916. [doi: 10.1109/ICIP.2014.7025183]
11. Bora VB, Kothari AG, Keskar AG. Robust automatic pectoral muscle segmentation from mammograms using texture
gradient and euclidean distance regression. J Digit Imaging 2016 Feb;29(1):115-125 [FREE Full text] [doi:
10.1007/s10278-015-9813-5] [Medline: 26259521]
12. Tourassi GD, Vargas-Voracek R, Catarious Jr DM, Floyd Jr CE. Computer-assisted detection of mammographic masses:
a template matching scheme based on mutual information. Med Phys 2003 Aug;30(8):2123-2130. [doi: 10.1118/1.1589494]
[Medline: 12945977]
13. Rampun A, Morrow PJ, Scotney BW, Winder J. Fully automated breast boundary and pectoral muscle segmentation in
mammograms. Artif Intell Med 2017 Dec;79:28-41. [doi: 10.1016/j.artmed.2017.06.001] [Medline: 28606722]
14. Eltoukhy MM, Faye I. An Adaptive Threshold Method for Mass Detection in Mammographic Images. In: Proceedings of
the International Conference on Signal and Image Processing Applications. 2013 Presented at: IEEE'13; October 8-10,
2013; Melaka, Malaysia p. 374-378.
15. Gardezi SJ, Faye I, Sanchez BJ, Kamel N, Hussain M. Mammogram classification using dynamic time warping. Multimed
Tools Appl 2017 Jan 10;77(3):3941-3962. [doi: 10.1007/s11042-016-4328-8]
16. Shih FY. Image Processing and Pattern Recognition: Fundamentals and Techniques. USA: Wiley-IEEE Press; 2010.
17. Biltawi M, Al-Najdawi N, Tedmori S. Mammogram Enhancement and Segmentation Methods: Classification, Analysis,
and Evaluation. In: Proceedings of the 13th International Arab Conference on Information Technology. 2012 Presented at:
ACIT'12; December 10-13, 2012; Zarqa, Jordan p. 477-485.
18. de Oliveira HC, Mencattini A, Casti P, Martinelli E, di Natale C, Catani JH, et al. Reduction of False-Positives in a CAD
Scheme for Automated Detection of Architectural Distortion in Digital Mammography. In: Proceeding of the Conference
on Computer-Aided Diagnosis. 2018 Presented at: SPIE'18; February 10-15, 2018; Houston, Texas, United States p.
105752P. [doi: 10.1117/12.2293388]
19. Liu X, Zeng Z. A new automatic mass detection method for breast cancer with false positive reduction. Neurocomputing
2015 Mar;152:388-402 [FREE Full text] [doi: 10.1016/j.neucom.2014.10.040]
20. Jen CC, Yu SS. Automatic detection of abnormal mammograms in mammographic images. Expert Syst Appl 2015
Apr;42(6):3048-3055. [doi: 10.1016/j.eswa.2014.11.061]
21. Ayer T, Chen Q, Burnside ES. Artificial neural networks in mammography interpretation and diagnostic decision making.
Comput Math Methods Med 2013;2013:832509 [FREE Full text] [doi: 10.1155/2013/832509] [Medline: 23781276]
22. Magna G, Casti P, Jayaraman SV, Salmeri M, Mencattini A, Martinelli E, et al. Identification of mammography anomalies
for breast cancer detection by an ensemble of classification models based on artificial immune system. Knowl Based Syst
2016 Jun;101:60-70. [doi: 10.1016/j.knosys.2016.02.019]
23. Wang H, Zheng B, Yoon SW, Ko HS. A support vector machine-based ensemble algorithm for breast cancer diagnosis.
Eur J Oper Res 2018 Jun;267(2):687-699 [FREE Full text] [doi: 10.1016/j.ejor.2017.12.001]
24. Sert E, Ertekin S, Halici U. Ensemble of Convolutional Neural Networks for Classification of Breast Microcalcification
From Mammograms. In: Proceedings of the 39th Annual International Conference of the IEEE Engineering in Medicine
and Biology Society. 2017 Presented at: IEEE'17; July 11-15, 2017; Seogwipo, South Korea p. 689-692. [doi:
10.1109/EMBC.2017.8036918]
25. Cheng H, Shi X, Min R, Hu L, Cai X, Du H. Approaches for automated detection and classification of masses in
mammograms. Pattern Recognit 2006 Apr;39(4):646-668. [doi: 10.1016/j.patcog.2005.07.006]
26. Hasan H, Tahir NM. Feature Selection of Breast Cancer Based on Principal Component Analysis. In: Proceedings of the
6th International Colloquium on Signal Processing & Its Applications. 2010 Presented at: IEEE'10; May 21-23, 2010;
Mallaca City, Malaysia p. 1-4. [doi: 10.1109/CSPA.2010.5545298]
27. Chan HP, Wei D, Helvie MA, Sahiner B, Adler DD, Goodsitt MM, et al. Computer-aided classification of mammographic
masses and normal tissue: linear discriminant analysis in texture feature space. Phys Med Biol 1995 May;40(5):857-876.
[Medline: 7652012]
28. Jin X, Xu A, Bie R, Guo P. Machine Learning Techniques and Chi-square Feature Selection for Cancer Classification
Using SAGE Gene Expression Profiles. In: Proceedings of the 2006 International Conference on Data Mining for Biomedical
Applications. 2006 Presented at: BioDM'06; April 9, 2006; Singapore p. 106-115. [doi: 10.1007/11691730_11]
29. Salama GI, Abdelhalim MB, Zeid MA. Breast cancer diagnosis on three different datasets using multi-classifiers. Int J
Comput Sci Inf Technol 2012;1(1):36-43 [FREE Full text]
30. Saghapour E, Kermani S, Sehhati M. A novel feature ranking method for prediction of cancer stages using proteomics data.
PLoS One 2017;12(9):e0184203 [FREE Full text] [doi: 10.1371/journal.pone.0184203] [Medline: 28934234]
31. Eltoukhy MM, Gardezi SJ, Faye I. A Method to Reduce Curvelet Coefficients for Mammogram Classification. In: Proceedings
of the Region 10 Symposium. 2014 Presented at: IEEE'14; April 14-16, 2014; Kuala Lumpur, Malaysia p. 663-666. [doi:
10.1109/TENCONSpring.2014.6863116]
32. Singh B, Jain V, Singh S. Mammogram mass classification using support vector machine with texture, shape features and
hierarchical centroid method. J Med Imaging & Health Infor 2014 Oct 1;4(5):687-696 [FREE Full text] [doi:
10.1166/jmihi.2014.1312]
33. Sonar P, Bhosle U, Choudhury C. Mammography Classification Using Modified Hybrid SVM-KNN. In: Proceedings of
the International Conference on Signal Processing and Communication. 2017 Presented at: IEEE'17; July 28-29, 2017;
Coimbatore, India p. 305-311. [doi: 10.1109/CSPC.2017.8305858]
34. Gardezi SJ, Faye I, Eltoukhy MM. Analysis of Mammogram Images Based on Texture Features of Curvelet Sub-Bands.
In: Proceedings of the 5th International Conference on Graphic and Image Processing. 2014 Presented at: SPIE'14; January
10, 2014; Hong Kong p. 906924. [doi: 10.1117/12.2054183]
35. Pratiwi M, Alexander, Harefa J, Nanda S. Mammograms classification using gray-level co-occurrence matrix and radial
basis function neural network. Procedia Comput Sci 2015;59:83-91 [FREE Full text] [doi: 10.1016/j.procs.2015.07.340]
36. Pal NR, Bhowmick B, Patel SK, Pal S, Das J. A multi-stage neural network aided system for detection of microcalcifications
in digitized mammograms. Neurocomputing 2008 Aug;71(13-15):2625-2634 [FREE Full text] [doi:
10.1016/j.neucom.2007.06.015]
37. Wu X, Kumar V, Quinlan JR, Ghosh J, Yang Q, Motoda H, et al. Top 10 algorithms in data mining. Knowl Inf Syst 2007
Dec 4;14(1):1-37 [FREE Full text] [doi: 10.1007/s10115-007-0114-2]
38. Tan PN, Steinbach M, Kumar V. Introduction to Data Mining. Boston, USA: Pearson Addison Wesley; 2006.
39. Sumbaly R, Vishnusri N, Jeyalatha S. Diagnosis of breast cancer using decision tree data mining technique. Int J Comput
Appl 2014 Jul 18;98(10):16-24. [doi: 10.5120/17219-7456]
40. Landwehr N, Hall M, Frank E. Logistic model trees. Mach Learn 2005 May;59(1-2):161-205 [FREE Full text] [doi:
10.1007/s10994-005-0466-3]
41. Ramirez-Villegas JF, Ramirez-Moreno DF. Wavelet packet energy, Tsallis entropy and statistical parameterization for
support vector-based and neural-based classification of mammographic regions. Neurocomputing 2012 Feb;77(1):82-100.
[doi: 10.1016/j.neucom.2011.08.015]
42. Wajid SK, Hussain A. Local energy-based shape histogram feature extraction technique for breast cancer diagnosis. Expert
Syst Appl 2015 Nov;42(20):6990-6999 [FREE Full text] [doi: 10.1016/j.eswa.2015.04.057]
43. Zhang X, Homma N, Goto S, Kawasumi Y, Ishibashi T, Abe M, et al. A hybrid image filtering method for computer-aided
detection of microcalcification clusters in mammograms. J Med Eng 2013;2013:615254 [FREE Full text] [doi:
10.1155/2013/615254] [Medline: 27006921]
44. Abbas Q, Celebi ME, García I. Breast mass segmentation using region-based and edge-based methods in a 4-stage multiscale
system. Biomed Signal Process Control 2013 Mar;8(2):204-214. [doi: 10.1016/j.bspc.2012.08.003]
45. Jiang M, Zhang S, Li H, Metaxas DN. Computer-aided diagnosis of mammographic masses using scalable image retrieval.
IEEE Trans Biomed Eng 2015 Feb;62(2):783-792. [doi: 10.1109/TBME.2014.2365494] [Medline: 25361497]
46. Michaelson J, Satija S, Moore R, Weber G, Halpern E, Garland A, et al. Estimates of the sizes at which breast cancers
become detectable on mammographic and clinical grounds. J Womens Health 2003;5(1):3-10 [FREE Full text] [doi:
10.1097/00130747-200302000-00002]
47. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM
2017 May 24;60(6):84-90. [doi: 10.1145/3065386]
48. Young T, Hazarika D, Poria S, Cambria E. Recent trends in deep learning based natural language processing. IEEE Comput
Intell Mag 2018 Aug;13(3):55-75. [doi: 10.1109/MCI.2018.2840738]
49. Schuurmans D, Zinkevich MA. Deep Learning Games. In: Proceedings of the Advances in Neural Information Processing
Systems. 2016 Presented at: NPIS'16; December 5-10, 2016; Barcelona, Spain.
50. Yang X, Zeng Z, Yeo SY, Tan C, Tey HL, Su Y. Cornell University. 2017. A Novel Multi-task Deep Learning Model for
Skin Lesion Segmentation and Classification URL:https://arxiv.org/abs/1703.01025
51. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, et al. Dermatologist-level classification of skin cancer with
deep neural networks. Nature 2017 Dec 2;542(7639):115-118. [doi: 10.1038/nature21056] [Medline: 28117445]
52. Havaei M, Davy A, Warde-Farley D, Biard A, Courville A, Bengio Y, et al. Brain tumor segmentation with deep neural
networks. Med Image Anal 2017 Dec;35:18-31. [doi: 10.1016/j.media.2016.05.004] [Medline: 27310171]
53. Lecun Y, Bottou L, Bengio Y, Haffner P. Gradient-Based Learning Applied to Document Recognition. In: Proceedings of
the IEEE. 1998 Presented at: IEEE'98; May 20-22, 1998; Pasadena, USA p. 2278-2324. [doi: 10.1109/5.726791]
54. Simonyan K, Zisserman A. Cornell University. 2014. Very Deep Convolutional Networks for Large-Scale Image Recognition
URL:https://arxiv.org/abs/1409.1556
55. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D. Going Deeper With Convolutions. In: Proceedings of the
Conference on Computer Vision and Pattern Recognition. 2015 Presented at: IEEE'15; June 7-12, 2015; Boston, MA, USA
p. 1-9. [doi: 10.1109/CVPR.2015.7298594]
56. He K, Zhang X, Ren S, Sun J. Identity Mappings in Deep Residual Networks. In: Proceedings of the Conference on Computer
Vision. 2016 Presented at: ECCV'16; October 11-14, 2016; Amsterdam, The Netherlands p. 630-645. [doi:
10.1007/978-3-319-46493-0_38]
57. He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. In: Proceedings of the Conference on
Computer Vision and Pattern Recognition. 2016 Presented at: IEEE'16; June 27-30, 2016; Las Vegas, NV, USA p. 770-778.
[doi: 10.1109/CVPR.2016.90]
58. Tajbakhsh N, Shin JY, Gurudu SR, Hurst RT, Kendall CB, Gotway MB, et al. Convolutional neural networks for medical
image analysis: full training or fine tuning? IEEE Trans Med Imaging 2016 Dec;35(5):1299-1312. [doi:
10.1109/TMI.2016.2535302] [Medline: 26978662]
59. Suk HI, Lee SW, Shen D, Alzheimer's Disease Neuroimaging Initiative. Hierarchical feature representation and multimodal
fusion with deep learning for AD/MCI diagnosis. Neuroimage 2014 Nov 1;101:569-582 [FREE Full text] [doi:
10.1016/j.neuroimage.2014.06.077] [Medline: 25042445]
60. Cheng JZ, Ni D, Chou YH, Qin J, Tiu CM, Chang YC, et al. Computer-aided diagnosis with deep learning architecture:
applications to breast lesions in US images and pulmonary nodules in CT scans. Sci Rep 2016 Apr 15;6:24454 [FREE Full
text] [doi: 10.1038/srep24454] [Medline: 27079888]
61. Ting FF, Tan YJ, Sim KS. Convolutional neural network improvement for breast cancer classification. Expert Syst Appl
2019 Apr;120:103-115 [FREE Full text] [doi: 10.1016/j.eswa.2018.11.008]
62. Setio AA, Ciompi F, Litjens G, Gerke P, Jacobs C, van Riel SJ, et al. Pulmonary nodule detection in CT images: false
positive reduction using multi-view convolutional networks. IEEE Trans Med Imaging 2016 Dec;35(5):1160-1169. [doi:
10.1109/TMI.2016.2536809] [Medline: 26955024]
63. Shin HC, Roth HR, Gao M, Lu L, Xu Z, Nogues I, et al. Deep convolutional neural networks for computer-aided detection:
CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging 2016 Dec;35(5):1285-1298
[FREE Full text] [doi: 10.1109/TMI.2016.2528162] [Medline: 26886976]
64. Shie CK, Chuang CH, Chou CN, Wu MH, Chang EY. Transfer Representation Learning for Medical Image Analysis. In:
Proceedings of the Engineering in Medicine and Biology Society. 2015 Presented at: IEEE'15; August 25-29, 2015; Milan,
Italy p. 711-714. [doi: 10.1109/EMBC.2015.7318461]
65. Qi D, Hao C, Lequan Y, Lei Z, Jing Q, Defeng W, et al. Automatic detection of cerebral microbleeds from MR images via
3D convolutional neural networks. IEEE Trans Med Imaging 2016 Dec;35(5):1182-1195. [doi: 10.1109/TMI.2016.2528129]
[Medline: 26886975]
66. Lehman CD, Yala A, Schuster T, Dontchos B, Bahl M, Swanson K, et al. Mammographic breast density assessment using
deep learning: clinical implementation. Radiology 2019 Jan;290(1):52-58. [doi: 10.1148/radiol.2018180694] [Medline:
30325282]
67. Sprague BL, Conant EF, Onega T, Garcia MP, Beaber EF, Herschorn SD, PROSPR Consortium. Variation in mammographic
breast density assessments among radiologists in clinical practice: a multicenter observational study. Ann Intern Med 2016
Oct 4;165(7):457-464 [FREE Full text] [doi: 10.7326/M15-2934] [Medline: 27428568]
68. Youk JH, Gweon HM, Son EJ, Kim JA. Automated volumetric breast density measurements in the era of the BI-RADS
fifth edition: a comparison with visual assessment. AJR Am J Roentgenol 2016 May;206(5):1056-1062. [doi:
10.2214/AJR.15.15472] [Medline: 26934689]
69. Brandt KR, Scott CG, Ma L, Mahmoudzadeh AP, Jensen MR, Whaley DH, et al. Comparison of clinical and automated
breast density measurements: implications for risk prediction and supplemental screening. Radiology 2016 Jun;279(3):710-719
[FREE Full text] [doi: 10.1148/radiol.2015151261] [Medline: 26694052]
70. Mohamed AA, Berg WA, Peng H, Luo Y, Jankowitz RC, Wu S. A deep learning method for classifying mammographic
breast density categories. Med Phys 2018 Jan;45(1):314-321 [FREE Full text] [doi: 10.1002/mp.12683] [Medline: 29159811]
71. Ahn CK, Heo C, Jin H, Kim JH. A Novel Deep Learning-Based Approach to High Accuracy Breast Density Estimation in
Digital Mammography. In: Proceedings of the Computer-Aided Diagnosis. 2017 Presented at: SPIE'17; February 11-16,
97. Lévy D, Jain A. Cornell University. 2016. Breast Mass Classification From Mammograms Using Deep Convolutional
Neural Networks URL:https://arxiv.org/pdf/1612.00542.pdf
98. Samala RK, Chan HP, Hadjiiski L, Helvie MA, Richter CD, Cha KH. Breast cancer diagnosis in digital breast tomosynthesis:
effects of training sample size on multi-stage transfer learning using deep neural nets. IEEE Trans Med Imaging 2019
Mar;38(3):686-696. [doi: 10.1109/TMI.2018.2870343]
99. Jadoon MM, Zhang Q, Haq IU, Butt S, Jadoon A. Three-class mammogram classification based on descriptive CNN
features. Biomed Res Int 2017;2017:3640901 [FREE Full text] [doi: 10.1155/2017/3640901] [Medline: 28191461]
100. Huynh BQ, Li H, Giger ML. Digital mammographic tumor classification using transfer learning from deep convolutional
neural networks. J Med Imaging (Bellingham) 2016 Jul;3(3):034501 [FREE Full text] [doi: 10.1117/1.JMI.3.3.034501]
[Medline: 27610399]
101. Domingues I, Cardoso J. Mass Detection on Mammogram Images: A First Assessment of Deep Learning Techniques. In:
Proceedings of the 19th edition of the Portuguese Conference on Pattern Recognition. 2013 Presented at: RecPad'13;
November 1, 2013; Lisbon, Portugal p. 2.
102. Wu E, Wu K, Cox D, Lotter W. Conditional Infilling GANs for Data Augmentation in Mammogram Classification. In:
Proceedings of the Image Analysis for Moving Organ, Breast, and Thoracic Images. 2018 Presented at: RAMBO'18;
September 16-20, 2018; Granada, Spain p. 98-106. [doi: 10.1007/978-3-030-00946-5_11]
103. Aboutalib SS, Mohamed AA, Berg WA, Zuley ML, Sumkin JH, Wu S. Deep learning to distinguish recalled but benign
mammography images in breast cancer screening. Clin Cancer Res 2018 Dec 1;24(23):5902-5909. [doi:
10.1158/1078-0432.CCR-18-1115] [Medline: 30309858]
104. Wang H, Feng J, Zhang Z, Su H, Cui L, He H, et al. Breast mass classification via deeply integrating the contextual
information from multi-view data. Pattern Recognit 2018 Aug;80:42-52 [FREE Full text] [doi: 10.1016/j.patcog.2018.02.026]
105. Shams S, Platania R, Zhang J, Kim J, Lee K, Park SJ. Deep Generative Breast Cancer Screening and Diagnosis. In:
Proceedings of the Medical Image Computing and Computer Assisted Intervention. 2018 Presented at: MICCAI'18;
September 16-20, 2018; Granada, Spain p. 859-867.
106. Gastounioti A, Oustimov A, Hsieh MK, Pantalone L, Conant EF, Kontos D. Using convolutional neural networks for
enhanced capture of breast parenchymal complexity patterns associated with breast cancer risk. Acad Radiol 2018
Dec;25(8):977-984. [doi: 10.1016/j.acra.2017.12.025] [Medline: 29395798]
107. Dhungel N, Carneiro G, Bradley AP. Fully Automated Classification of Mammograms Using Deep Residual Neural
Networks. In: Proceedings of the 14th International Symposium on Biomedical Imaging. 2017 Presented at: IEEE'17; April
18-21, 2017; Melbourne, Australia p. 310-314. [doi: 10.1109/ISBI.2017.7950526]
108. IEEE Xplore Digital Library. Breast Cancer Diagnosis in Digital Breast Tomosynthesis: Effects of Training Sample Size
on Multi-Stage Transfer Learning Using Deep Neural Nets URL:https://ieeexplore.ieee.org/document/8466816/media#media
109. Department of Computer Science, University of Toronto. Training a deep autoencoder or a classifier on MNIST digits
URL:http://www.cs.toronto.edu/~hinton/MatlabForSciencePaper.html
110. GitHub. Code for "Conditional infilling GAN for Data Augmentation in Mammogram Classification" URL:https://github.
com/ericwu09/mammo-cigan
Abbreviations
BI-RADS: Breast Imaging Reporting and Data System
CAD: computer-aided diagnosis
CC: craniocaudal
CEDM: contrast-enhanced digital mammograms
cGAN: conditional generative adversarial network
CNN: convolutional neural network
CRF: conditional random field
CT: curvelet transform
DBT: digital breast tomosynthesis
DiaGRAM: Deep Generative Multitask
DL: deep learning
DM: digital mammogram
DW: discrete wavelet
FCN: fully convolutional network
FP: false positive
IARC: International Agency for Cancer Research
KNN: k-nearest neighbor
LDA: linear discriminant analysis
LE: low energy
ML: machine learning
Edited by G Eysenbach; submitted 23.04.19; peer-reviewed by M Awais, E Javed, M Hamghlam; comments to author 30.05.19; revised
version received 11.06.19; accepted 12.06.19; published 26.07.19
Please cite as:
Gardezi SJS, Elazab A, Lei B, Wang T
Breast Cancer Detection and Diagnosis Using Mammographic Data: Systematic Review
J Med Internet Res 2019;21(7):e14464
URL: http://www.jmir.org/2019/7/e14464/
doi: 10.2196/14464
PMID: 31350843
©Syed Jamal Safdar Gardezi, Ahmed Elazab, Baiying Lei, Tianfu Wang. Originally published in the Journal of Medical Internet
Research (http://www.jmir.org), 26.07.2019. This is an open-access article distributed under the terms of the Creative Commons
Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction
in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The
complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license
information must be included.