Article
A New Deep-Learning-Based Model for Breast Cancer
Diagnosis from Medical Images
Salman Zakareya 1, Habib Izadkhah 1,2,* and Jaber Karimpour 1
Abstract: Breast cancer is one of the most prevalent cancers among women worldwide, and early
detection of the disease can be lifesaving. Detecting breast cancer early allows for treatment to begin
faster, increasing the chances of a successful outcome. Machine learning helps in the early detection
of breast cancer even in places where there is no access to a specialist doctor. The rapid advancement
of machine learning, and particularly deep learning, has increased the medical imaging community's interest in applying these techniques to improve the accuracy of cancer screening. However, data related to most diseases are scarce, whereas deep-learning models need large amounts of data to learn well. For this reason, existing deep-learning models cannot perform as well on medical images as they do on other image types. To overcome this limitation and improve breast cancer detection, this paper proposes a new deep model for classifying breast cancer, inspired by two state-of-the-art deep networks, GoogLeNet and the residual block, and extended with several new features. Utilizing granular computing, shortcut connections, two learnable activation functions in place of traditional activation functions, and an attention mechanism is expected to improve the accuracy of diagnosis and consequently decrease the load on doctors. Granular computing can improve diagnosis accuracy by capturing more detailed and fine-grained information about cancer images. The proposed model's superiority is demonstrated by comparing it to several state-of-the-art deep models and existing works using two case studies. The proposed model achieved an accuracy of 93%
and 95% on ultrasound images and breast histopathology images, respectively.

Keywords: medical image; breast cancer diagnosis; machine learning; deep learning; classification
1. Introduction
Breast cancer is the most commonly diagnosed form of cancer worldwide and the second leading cause of cancer-related deaths. In 2020, breast cancer was diagnosed in 2.3 million women globally, resulting in 685,000 fatalities. Additionally, as of the end of 2020, 7.8 million women had received a breast cancer diagnosis within the last five years [1]. Clinical studies have demonstrated that early detection is crucial for effective treatment and can significantly improve the survival rate of breast cancer patients [2].
Computer-aided detection and diagnosis (CAD) software systems have been developed and clinically used since the 1990s to support radiologists in screening, improve predictive accuracy, and prevent misdiagnosis due to fatigue, eye strain, or lack of experience [3].
The rapid progress of machine learning in both application and efficiency, especially deep learning, has increased the interest of the medical community in using these techniques to improve the accuracy of cancer screening from images. Machine learning can play an essential role in helping medical professionals in the early detection of cancerous lesions. Despite the benefits of using these techniques, cancer screening is associated with a high risk of false positives and false negatives. However, early detection of cancer can contribute to up to a 40% decrease in the mortality rate [2].
2. Related Work
Breast cancer ranks among the most prevalent cancers affecting women worldwide.
Early detection and accurate diagnosis are crucial factors for effective treatment and im-
proved patient outcomes [9]. Ultrasound imaging is a widely used method for breast
cancer screening and diagnosis, but it requires skilled radiologists to interpret the images
accurately [10]. According to the National Breast Cancer Foundation’s 2020 report, AI has
been successfully used to diagnose more than 276,000 breast cancer cases. By analyzing
breast cancer images with AI, one can detect breast lumps (masses), segment masses, assess breast density, and estimate breast cancer risk. In the majority of patients, lumps in the breast are the most common sign of breast cancer [9]; therefore, their detection is an essential step in CAD.
A review of deep-learning applications in breast tumor diagnosis utilizing ultrasound
and mammography images is provided in [11]. Moreover, the research summarizes the
latest progressions in computer-aided diagnosis/detection (CAD) systems that rely on deep-
learning methodologies to automatically recognize breast images, ultimately enhancing
radiologists’ diagnostic precision. Remarkably, the classification process underpinning the
novel deep-learning approaches has demonstrated significant usefulness and effectiveness
as a screening tool for breast cancer.
Recent studies have explored the use of deep-learning techniques, particularly convolu-
tional neural networks (CNNs), for automated breast ultrasound image classification [12–15].
These studies have shown encouraging results. Several convolutional neural network (CNN) models have been used for breast cancer image classification, including AlexNet, VGGNet,
GoogLeNet, ResNet, and Inception. In their study, the authors of [5] categorized breast
lesions as either benign or malignant. They developed a CNN model to remove speckle
noise from the ultrasound images and then proposed another CNN model for classifying
the ultrasound images. The study [16] discriminates benign cysts from malignant masses
in US images.
In the study presented in [17], various deep-learning models were employed to clas-
sify breast cancer ultrasound images based on their benign, malignant, or normal status.
A dataset comprising a total of 780 images was utilized, and data augmentation and
preprocessing techniques were applied. Three models were evaluated for classification.
Specifically, ResNet50 achieved an accuracy of 85.4%, ResNeXt50 achieved 85.83%, and
VGG16 achieved 81.11%.
The study [18] introduced a novel ensemble deep-learning-enabled clinical decision
support system for the diagnosis and classification of breast cancer based on ultrasound
images. The study presented an optimal multilevel thresholding-based image segmentation
technique for identifying tumor-affected regions. Additionally, an ensemble of three deep-
learning models was developed to extract features, and an optimal machine-learning
classifier was utilized to detect breast cancer.
In the study [14], the authors proposed a system to classify breast masses into normal,
benign, and malignant. Ten well-known, pre-trained CNNs classification models were
compared, and the best model was Inception ResNetV2. In [19], a vector-attention network
(BVA Net) was proposed to classify benign and malignant mass tumors in the breast.
In [20], the authors proposed a CNN-based CAD system for breast ultrasound image
classification (benign and malignant lesions). The study [21] developed a deep-learning
model based on ResNet18 CNN architecture for breast ultrasound image classification.
In addition, the study [22] compared the performance of different deep-learning models,
including CNNs, recurrent neural networks (RNNs), and hybrid models, for breast cancer
diagnosis on ultrasound images. In addition to binary classification, some studies have also
explored multiclass classification of breast ultrasound images. For example, the study [23]
proposed a CNN-based CAD system that can classify breast lesions into four categories:
benign, malignant, cystic, or complex cystic-solid. The system achieved an overall accuracy
of 87% on a dataset of 1000 images.
Gao et al. have devised a computer-aided diagnosis (CAD) system geared toward
screening mammography readings, which demonstrated an accuracy rate of approximately
92% [24]. Similarly, in several studies [25,26], multiple convolutional neural networks
(CNNs) were employed for mass detection in mammographic and ultrasound images.
The study conducted by [3] provides a comprehensive review of the techniques used
for the diagnosis of breast cancer in histopathological images. The state-of-the-art machine-
learning approaches employed at each stage of the diagnosis process, including traditional
methods and deep-learning methods, are presented, and a comparative analysis between
the different techniques is provided. The technical details of each approach and their
respective advantages and disadvantages are discussed in detail.
Lee et al. [27] conducted a study utilizing a deep-learning-based computer-aided pre-
diction system for ultrasound (US) images. The research involved a total of 153 women with
breast cancer, comprising 59 patients with lymph node metastases (LN+) and 94 patients
without (LN−). Multiple machine-learning algorithms, including logistic regression, sup-
port vector machines (SVMs), XGBoost, and DenseNet, were trained and evaluated on the
US image data. The study found that the DenseNet model exhibited the best performance,
achieving an area under the curve (AUC) of 0.8054. This study highlights the potential of
deep-learning techniques in the development of accurate and efficient prediction systems
for breast cancer diagnosis using US imaging.
Sun et al. [28] conducted a study utilizing a convolutional neural network (CNN)
trained and tested on ultrasound images of 169 patients. The training dataset consisted of
248 US images from 124 patients, while the testing dataset comprised 90 US images from
45 patients. The results of the study revealed a somewhat inferior performance, with an
AUC of 0.72 (SD 0.08) and an accuracy of 72.6% (SD 8.4). Notably, the validation process did
not include cross-validation or bootstrapping methods. These findings suggest that further
research is necessary to improve the performance of CNNs in breast cancer diagnosis using
ultrasound imaging.
In a study by [29], a comparison was made between convolutional neural networks
(CNNs) and traditional machine-learning (ML) methods, specifically random forests, in the
context of breast cancer diagnosis. The study utilized a dataset of 479 breast cancer patients,
comprising 2395 breast ultrasound images. The research also focused on different regions
of the ultrasound images, including intratumoral, peritumoral, and combined regions, to
train and evaluate the models. The study found that CNNs outperformed random forests
in all modalities (p < 0.05), and the combination of intratumoral and peritumoral regions
provided the best result, with an AUC of 0.912 [0.834–0.990]. While confidence intervals
were provided, the method used to determine them was not mentioned. These results
highlight the potential of CNNs in breast cancer diagnosis using ultrasound imaging and
the importance of considering different regions of the image in the analysis.
The study proposed by [30] implemented the multistage transfer-learning (MSTL) algo-
rithm using three pre-trained models, namely EfficientNetB2, InceptionV3, and ResNet50,
along with three optimizers, which included Adam, Adagrad, and stochastic gradient
descent (SGD). The study utilized 20,400 cancer cell images, 200 ultrasound images from
Mendeley, and 400 from the MT-Small dataset. This approach has the potential to reduce
the need for large ultrasound datasets to realize powerful deep-learning models. The
results of this study demonstrate the effectiveness of the MSTL algorithm in breast cancer
diagnosis using ultrasound imaging.
The study [31] presents a review of studies investigating the ability of deep-learning
(DL) approaches to classify histopathological breast cancer images. The article evaluates
current DL applications and approaches to classify histopathological breast cancer im-
ages based on papers published by November 2022. The study findings indicate that
convolutional neural networks, as well as their hybrids, represent the most advanced DL
approaches currently in use for this task. The authors of the study defined two categories
of classification approaches, namely binary and multiclass solutions, in the context of DL-
based classification of histopathological breast cancer images. Overall, this review provides
insights into the current state of the art in DL-based classification of histopathological
breast cancer images and highlights the potential of advanced DL approaches to improve
the accuracy and efficacy of breast cancer diagnosis.
The study [32] proposed a breast cancer classification technique that leverages a
transfer-learning approach based on the VGG16 model. To preprocess the images, a median
filter was employed to eliminate speckle noise. The convolution layers and max pooling
layers of the pre-trained VGG16 model were utilized as feature extractors, while a two-layer
deep neural network was devised as a classifier.
The vision transformer (ViT) architecture has been proven to be advantageous in
extracting long-range features and has thus been employed in various computer vision
tasks. However, despite its remarkable performance in traditional vision tasks, the ViT
model’s supervised training typically necessitates large datasets, thereby posing difficulties
in domains where it is challenging to amass ample data, such as medical image analysis.
In [33], the authors introduced an enhanced ViT architecture, denoted as ViT-Patch, and
investigated its efficacy in addressing a medical image classification problem, namely,
identifying malignant breast ultrasound images.
In summary, these studies showcase the capability of deep-learning techniques in
automating breast image classification and underscore the significance of devising precise
CAD systems to support radiologists in detecting breast cancer. The majority of current
approaches employ pre-existing deep-learning architectures for detecting breast cancer. In
the following, we introduce a novel architecture that surpasses all previous methods.
3. Methodology
Inspired by GoogLeNet [34] and residual block [35] and adding several other features,
in this paper, we developed a new deep architecture for breast cancer detection from
images. GoogLeNet and residual block are based on convolutional neural network (CNN)
architecture. GoogLeNet is a deep convolutional neural network architecture developed
by Google’s research team in 2014. It was the winner of the ImageNet Large Scale Visual
Recognition Challenge (ILSVRC) in 2014 and achieved state-of-the-art performance on a
variety of computer vision tasks.
The GoogLeNet architecture consists of a 22-layer deep neural network with a unique
“Inception” module that enables the network to efficiently capture spatial features at differ-
ent scales using parallel convolutional layers. The Inception module combines multiple
convolutional filters of different sizes and concatenates their outputs, allowing the network
to capture both fine-grained and high-level features. On the other hand, a residual block
is a building block used in deep neural networks that helps to address the problem of
vanishing gradients during training. A residual block consists of two or more stacked
convolutional layers followed by a shortcut connection that bypasses these layers. The
shortcut connection allows the gradient to be directly propagated to earlier layers, allowing
for better optimization and deeper architectures.
This study introduces a novel deep-learning-based architecture for breast cancer
detection that stands out from existing architectures in four significant aspects, resulting in
superior performance.
1. Proposing a granular computing-based algorithm aiming to extract more detailed and
fine-grained information from breast cancer images, leading to improved accuracy
and performance;
2. Utilizing wide and deep modules, shortcut connections, and intermediate classifiers
simultaneously in the architecture;
3. Designing an attention mechanism; the attention mechanism in CNNs provides a
powerful tool for selectively focusing on relevant features in the input data, enabling
the network to achieve better accuracy and efficiency;
4. Designing two learnable activation functions and using them instead of traditional
activation functions.
Figure 1 depicts the overall process proposed in this paper. The input of the proposed method is a breast cancer image. If the sizes of the images differ, they are resized to a pre-determined size. After that, the pixels of the image are normalized to [0, 1]. Resizing and normalization are preprocessing steps. After the preprocessing step, we use granular computing to highlight important features of an image. The output of the previous step is used to train the proposed deep-learning model. These steps are applied to all images in the dataset. In the following, we will describe each of these steps.

Figure 1. The overall process of the proposed method. All steps in this figure are repeated for all images.
The granular computing procedure involves three steps:
Granulation: Split the image into windows (granules) of a fixed size.
Feature Extraction: For each granule, extract features using techniques such as local binary patterns, histograms of oriented gradients, or a CNN. This results in a set of feature vectors, one for each granule.
Feature Aggregation: Combine the feature vectors obtained from the granules and use them to classify the image. This can be performed using techniques such as mean pooling or max pooling.
Figure 2 shows the proposed steps for granular computing used in this paper. Considering the above steps, we propose Algorithm 1 for applying granular computing. In this algorithm, we have used the pre-trained VGG16 architecture to extract features for each granule, whose size is set to 32 × 32. We apply granular computing to the dataset before starting the training process.

Figure 2. The granular computing process proposed in this paper.

Algorithm 1. Extracting more detailed features by granular computing
Repeat the following steps for all images
img = Load the image
Preprocessing step: resizing and normalization
img = resize(img, (224, 224))
img = normalize(img/255.0)
Granulation step: split the image into windows of size 32 × 32
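A minimal Python sketch of these steps is given below. It assumes PyTorch and torchvision with ImageNet-pretrained VGG16 weights; the helper names and the mean-pooling aggregation are illustrative choices, not the authors' released code.

import torch
import torchvision.models as models
import torchvision.transforms.functional as TF

# Pre-trained VGG16 used as a fixed per-granule feature extractor
# (assumption: ImageNet weights; the paper does not specify the weights used).
vgg16 = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
feature_extractor = vgg16.features.eval()

def granulate(img: torch.Tensor, size: int = 32) -> torch.Tensor:
    # Split a (C, H, W) image into non-overlapping size x size granules.
    c, _, _ = img.shape
    g = img.unfold(1, size, size).unfold(2, size, size)  # (C, nH, nW, size, size)
    return g.permute(1, 2, 0, 3, 4).reshape(-1, c, size, size)

@torch.no_grad()
def granular_features(img: torch.Tensor) -> torch.Tensor:
    # One VGG16 feature vector per 32 x 32 granule, aggregated by mean pooling.
    img = TF.resize(img, [224, 224])      # preprocessing: resizing
    img = img / 255.0                     # preprocessing: normalization to [0, 1]
    granules = granulate(img, 32)         # granulation: 7 x 7 = 49 granules
    feats = feature_extractor(granules)   # (49, 512, 1, 1) feature maps
    return feats.flatten(1).mean(dim=0)   # feature aggregation (mean pooling)

# Example with a random stand-in for a 500 x 500 ultrasound image.
print(granular_features(torch.rand(3, 500, 500) * 255).shape)  # torch.Size([512])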
Instead of fixed, hand-crafted activation functions, the proposed network uses a form of the function that has learnable parameters. Let us call this function the "Learnable Activation Function", or LAF for short:

LAF(x; W) = a ∗ F(x; W) + b (1)

Here, a and b are adjustable parameters that are learned during training, and F() is a non-linear function that defines the shape of the activation function. The weight matrix W contains learnable values that determine the shape of F(), and it is optimized through backpropagation.
To further develop the LAF, we can choose a suitable non-linear function F(). One possible choice is the sigmoid function:

F(x; W) = 1/(1 + exp(−Wx)) (2)

The sigmoid function is a common choice for activation functions due to its smoothness and boundedness, which is important for the stable training of neural networks. The LAF with the sigmoid function becomes:

LAF_sigmoid(x; W, a, b) = a ∗ (1/(1 + exp(−Wx))) + b (3)
This activation function will be used in the dense layers of the network.
Another possible choice for F() is the ReLU function:

F(x; W) = max(0, Wx) (4)

The ReLU function is preferred for some tasks because of its simplicity and computational efficiency. The learnable parameters a and b can be added to shift and scale the ReLU function, resulting in the LAF with the ReLU function:

LAF_ReLU(x; W, a, b) = a ∗ max(0, Wx) + b (5)
This activation function will be used in the convolutional layers of the network.
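As a concrete illustration, the following is a minimal PyTorch sketch of the two LAFs defined in Equations (1)-(5). The module structure and the initial values of a and b are assumptions for the example, not the authors' published implementation.

import torch
import torch.nn as nn

class LAFSigmoid(nn.Module):
    # LAF_sigmoid(x; W, a, b) = a * sigmoid(W x) + b, used in dense layers.
    def __init__(self, num_features: int):
        super().__init__()
        self.W = nn.Parameter(torch.randn(num_features, num_features) * 0.01)
        self.a = nn.Parameter(torch.ones(1))   # learnable scale (assumed init: 1)
        self.b = nn.Parameter(torch.zeros(1))  # learnable shift (assumed init: 0)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.a * torch.sigmoid(x @ self.W.T) + self.b

class LAFReLU(nn.Module):
    # LAF_ReLU(x; a, b) = a * relu(x) + b, used in convolutional layers;
    # here W is folded into the preceding convolution.
    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.ones(1))
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.a * torch.relu(x) + self.b

# Example: a dense layer followed by the sigmoid LAF.
layer = nn.Sequential(nn.Linear(64, 32), LAFSigmoid(32))
print(layer(torch.randn(8, 64)).shape)  # torch.Size([8, 32])

Because a, b, and W are registered as parameters, they receive gradients and are updated together with the rest of the network, exactly as described in the following paragraph.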
The values of a, b, and W can be trained through backpropagation using gradient
descent or other optimization algorithms. The choice of the initial values and number of
hidden units are important factors that can affect the success of training the neural network
using LAFs. There are several advantages of using learnable activation functions in artificial
neural networks:
1. Improved performance: By incorporating learnable activation functions, the neural
network performance can be improved significantly. This is because the activation
function adapts to the input data, allowing for a more accurate representation of
complex relationships between features;
2. Non-linear mapping: Learnable activation functions allow for non-linear mappings
between input and output, which can capture more complex patterns in the data;
3. Flexibility: With traditional activation functions, the network architecture is fixed.
However, using learnable activation functions allows for more flexibility in the
network architecture, as the activation function can be modified according to the
specific task;
4. Reduced overfitting: Learnable activation functions can also help reduce overfitting,
as they can adapt to the input data and generalize better to new data that has not been
seen before;
5. Efficient training: The use of learnable activation functions can also make the training
process more efficient by allowing gradients to be propagated through the network
more smoothly. This can lead to faster convergence and improved performance.
In this section, we propose an attention mechanism to apply to the output layer (top
layer). The attention mechanism in CNNs can highlight the salient regions in images that
are significant for the classification task.
In the case of breast cancer detection, this can be useful since certain regions of the
breast image may contain more relevant features for cancer detection compared to others.
Here, we develop an attention mechanism for CNN:
1. Start with a standard convolutional layer with filters of size (k, k) and stride s;
2. Add a second convolutional layer with filters of size 1 × 1 and stride 1. This layer will
compute a scalar attention weight for each pixel in the input image;
3. Apply a Softmax activation function to the output of the attention layer to ensure that
the weights sum up to 1 for each pixel;
4. Multiply the attention weight maps element-wise with the input image to obtain the
attended input image;
5. Feed the attended input image into the next layer of the CNN.
The idea behind this attention mechanism is that the second convolutional layer learns
to compute a scalar attention weight for each pixel in the input image, based on its relevance
to the task at hand. The softmax activation function ensures that the attention weights
sum up to one for each pixel, making them interpretable as a probability distribution
over the pixels. The element-wise multiplication of the attention weight maps and input
image highlights or downplays certain pixels, improving the accuracy of the CNN on the
given task.
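A compact PyTorch sketch of this five-step mechanism follows; the kernel size, channel counts, and class name are illustrative assumptions.

import torch
import torch.nn as nn

class PixelAttention(nn.Module):
    # A 1x1 convolution scores each pixel, softmax over all pixels turns the
    # scores into a probability distribution, and the features are reweighted.
    def __init__(self, in_channels: int, k: int = 3, stride: int = 1):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, in_channels, k, stride, padding=k // 2)  # step 1
        self.score = nn.Conv2d(in_channels, 1, kernel_size=1)                       # step 2

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.conv(x)
        n, _, h, w = feats.shape
        # Step 3: softmax so the attention weights form a distribution over pixels.
        weights = torch.softmax(self.score(feats).view(n, -1), dim=1).view(n, 1, h, w)
        return feats * weights  # steps 4-5: element-wise reweighting, fed onward

attended = PixelAttention(in_channels=3)(torch.randn(2, 3, 64, 64))
print(attended.shape)  # torch.Size([2, 3, 64, 64])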
3.4. Wide and Depth Networks, Short Connections, and 1 × 1 Convolutional Layers
Our developed network takes advantage of the features of wide and deep networks,
short connections, and 1 × 1 convolutional layers. In the following, we will examine
each one.
Wide and deep neural networks such as GoogLeNet offer improved accuracy, higher
capacity, faster convergence, and better regularization. They have demonstrated impressive
performance in various tasks, including image recognition, speech recognition, and natural
language processing. Their advantages make them an attractive choice when designing
neural networks.
In neural networks, short connections, which are used in ResNet and DenseNet
networks, are a type of connection between the neurons that bypass one or more layers
in the network. These connections allow information to flow between two layers that are
not directly connected in the network architecture. Short connections, also known as skip
connections, in neural networks can be represented mathematically as an element-wise
summation or concatenation operation between the input to a layer and the output of that
layer. In other words, the output of a layer is added to or concatenated with the input to
that layer or a previous layer. For example, in a convolutional neural network (CNN), a
short connection can be introduced between two convolutional layers by adding the output
of the first convolutional layer to the input of the second convolutional layer. This can be
added as follows:
x1 = Convolutional_layer_1(input)
x2 = Convolutional_layer_2(x1 + input)
where “+” denotes element-wise summation.
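For concreteness, a runnable PyTorch version of this two-layer pattern (the channel counts are assumed for the example):

import torch
import torch.nn as nn

class TwoConvSkip(nn.Module):
    # x2 = conv2(x1 + input): the input bypasses the first convolution.
    def __init__(self, channels: int = 16):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1 = self.conv1(x)
        return self.conv2(x1 + x)  # "+" is the element-wise shortcut connection

out = TwoConvSkip()(torch.randn(1, 16, 32, 32))
print(out.shape)  # torch.Size([1, 16, 32, 32])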
Short connections or residual connections in neural networks have several advantages:
1. Improved gradient flow: By adding short connections, the gradient can flow through
the network more effectively, which eliminates the vanishing gradient problem. The
gradient can be propagated directly to earlier layers, allowing the network to train
deeper architectures;
2. Improved training speed: The use of short connections reduces the number of layers in
the critical learning path, which can speed up the training process. The reduced depth
also means that less computation is required, resulting in a more efficient model;
3. Improved accuracy: Short connections enable the learning of more complex functions
by allowing the network to make use of the information from earlier layers. This can
result in higher accuracy in tasks such as image recognition and speech processing;
4. Reduced overfitting: Short connections can help reduce overfitting by providing a
regularization mechanism. They allow the network to learn simpler representations
for the input data, which leads to better generalization.
In CNNs, 1 × 1 convolutional layers are utilized as a type of layer that executes a
convolution operation by convolving the input tensor with a kernel of size 1 × 1. Despite
their small size, 1 × 1 convolutional layers have various advantages in CNNs:
1. Dimensionality reduction: 1 × 1 convolutional layers can be used to reduce the dimen-
sionality of feature maps, which can be useful in reducing the computational complex-
ity of CNNs while maintaining their accuracy. By using 1 × 1 convolutional layers,
the number of parameters can be reduced while still retaining the important features;
2. Non-linear transformations: Even though it has a kernel of size 1 × 1, this layer applies
non-linear transformations to the input feature maps. The non-linear activation
function applied after the convolution operation contributes to this non-linearity;
3. Improved model efficiency: By reducing the number of parameters, 1 × 1 convolu-
tional layers reduce the computational cost of the model. This can, in turn, improve
the efficiency of the implementation of the model, allowing it to be run on smaller
devices or with fewer computational resources;
4. Feature interaction: A 1 × 1 convolutional layer can act as a feature interaction layer
and induce correlations between features, which can further enhance the representa-
tion power of the network.
These advantages make 1 × 1 convolutional layers an important building block in
CNNs, especially in deeper networks where computational cost and memory usage are of
key concern.
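As a small numerical illustration of the dimensionality-reduction point (the channel counts are assumptions for the example):

import torch
import torch.nn as nn

# A 1x1 convolution reduces 256 feature maps to 64 before an expensive 5x5 conv.
reduce = nn.Conv2d(256, 64, kernel_size=1)           # 256*64 + 64 = 16,448 parameters
conv5 = nn.Conv2d(64, 64, kernel_size=5, padding=2)  # 5*5*64*64 + 64 = 102,464 parameters

x = torch.randn(1, 256, 28, 28)
print(conv5(reduce(x)).shape)  # torch.Size([1, 64, 28, 28])

# Applying the 5x5 convolution directly to 256 channels would instead cost
# 5*5*256*64 + 64 = 409,664 parameters, roughly 3.4x more than the two layers above.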
As depicted in Figure 3, the proposed architecture consists of the following components:
1. Input layer;
2. A convolutional-based attention layer;
3. Convolution layer;
4. Two X modules with different filter sizes followed by a down-sample module;
5. An auxiliary classifier with a learnable Softmax classifier;
6. Three X modules with different filter sizes followed by a down-sample module;
7. An auxiliary classifier with a learnable Softmax classifier;
8. An X-module followed by the average pool and dropout layers;
9. A dense layer-based attention layer;
10. A learnable Softmax classifier as the output layer.

Figure 3. The new CNN system proposed to diagnose medical images.

The details are explained as follows:
Input Layer: In this step, the medical image is entered into the system.
Convolutional-based attention layer: This layer allows for the selective focus on specific parts of the input data that hold significance in determining an outcome.
Convolution Layer: This layer uses convolution operations to produce new feature maps.
X-Module: This module considers both the depth and width of the network, with multiple filters of varying sizes operating at the same level. The outputs of these filters are concatenated before being transmitted to the subsequent module. The main unit in the X-module is a sub-block called the R-block (Figure 4), which is inspired by the residual block. The main difference is that the designed block uses learnable activation functions. The block uses a shortcut connection; the input is added to the output of the block to pass gradient updates through the entire network easily and reduce overfitting. The R-block consists of two convolutional layers stacked on top of each other, with the first layer being succeeded by a batch normalization layer and a learnable activation function that depends on parameters. Using a learnable, parameterized activation function helps to reduce time consumption and improves learning. Three types of R-blocks are implemented depending on the filter size. The filter size of the two convolutional layers in the first R-block is 3 × 3. In the second R-block, the filter size of the two convolutional layers is 5 × 5. The first convolutional layer in the third R-block has a 3 × 3 filter size and the second convolutional layer has a 5 × 5 filter size. We have reduced the number of parameters and computational costs in the X-module by incorporating an additional 1 × 1 convolution in the initial layer, preceding the 3 × 3 and 5 × 5 convolutions. An extra 1 × 1 convolution is also utilized after the max pooling layer.
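A sketch of the R-block as described (convolution, batch normalization, learnable activation, twice, plus the shortcut). The LAFReLU module mirrors the earlier activation sketch, and the channel handling is an assumption.

import torch
import torch.nn as nn

class LAFReLU(nn.Module):
    # Learnable activation a * relu(x) + b (see the LAF sketch above).
    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.ones(1))
        self.b = nn.Parameter(torch.zeros(1))
    def forward(self, x):
        return self.a * torch.relu(x) + self.b

class RBlock(nn.Module):
    # Residual-style sub-block with learnable activations, as described above.
    def __init__(self, channels: int, k1: int = 3, k2: int = 3):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, k1, padding=k1 // 2)
        self.bn1 = nn.BatchNorm2d(channels)
        self.act1 = LAFReLU()
        self.conv2 = nn.Conv2d(channels, channels, k2, padding=k2 // 2)
        self.bn2 = nn.BatchNorm2d(channels)
        self.act2 = LAFReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.act1(self.bn1(self.conv1(x)))
        out = self.act2(self.bn2(self.conv2(out)))
        return out + x  # shortcut: the input is added to the block output

# The three R-block variants: 3x3/3x3, 5x5/5x5, and the mixed 3x3/5x5 block.
r33, r55, r35 = RBlock(32, 3, 3), RBlock(32, 5, 5), RBlock(32, 3, 5)
print(r35(torch.randn(1, 32, 56, 56)).shape)  # torch.Size([1, 32, 56, 56])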
Figure 5. The different structures used in the X-module: (a) X-module includes three R-blocks of 3 × 3, called DNN R3_R3_R3. (b) X-module includes three R-blocks of 5 × 5, called DNN R5_R5_R5. (c) X-module includes three R-blocks of 3 × 3, 5 × 5, and 3 × 5, called DNN R3_R5_R35.
Dense layer-based attention layer: This attention takes as input a 3D tensor represent-
ing the output features of the previous layer and outputs a 2D tensor of attention scores,
where each score represents the relevance of a specific feature.
Output layer: The output in this step will be normal or abnormal.
Loss: This metric represents the error between the predicted output and the actual
output. The loss function is typically defined during the training phase of the model,
and it is used to optimize the model parameters by minimizing the difference between its
predictions and the true values. A commonly used loss function in deep learning
is the mean squared error (MSE), which measures the average of the squared differences
between the predicted and true values.
Precision: This metric measures the proportion of true positives (samples that were correctly classified as positive) to the total number of positive predictions made by the model. It can be calculated as Precision = TP/(TP + FP).
Recall: This metric measures the proportion of true positives to the total number of true positives and false negatives in the dataset: Recall = TP/(TP + FN).
F1: This metric calculates the harmonic mean of precision and recall: F1 = 2 × (Precision × Recall)/(Precision + Recall).
Different DNN models are implemented using different configurations of the X-module,
as depicted in Figure 5. The first model has three R-blocks of 3 × 3 convolution filters.
The second model utilized three R-blocks of 5 × 5. In the third model, the first R-block
is 3 × 3, the second is 5 × 5, and the third R-block has a 3 × 3 filter size in the first
convolutional layer, and the second convolutional layer has a 5 × 5 filter size. We have
utilized various filters and kernel sizes. By specifying multiple values for the kernel
parameter within a filter, our model can effectively identify patterns that occur at different
scales within an image. The incorporation of multiple kernels also assists in reducing
overfitting and improving the generalization of the model. This is because including filters
with varying kernel sizes compels the network to learn more diverse and robust feature
representations, leading to an improved ability to generalize the model to new images.
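The sketch below shows how such multi-kernel branches can be assembled into an X-module in the DNN R3_R5_R35 style, with the outputs concatenated along the channel axis; batch normalization and the learnable activations are omitted for brevity, and all channel counts are assumptions.

import torch
import torch.nn as nn

def x_branch(in_ch: int, out_ch: int, k1: int, k2: int) -> nn.Sequential:
    # One branch: 1x1 reduction, then the two stacked convolutions of an R-block.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 1),
        nn.Conv2d(out_ch, out_ch, k1, padding=k1 // 2),
        nn.Conv2d(out_ch, out_ch, k2, padding=k2 // 2),
    )

class XModule(nn.Module):
    # Parallel multi-scale branches, concatenated: 3x3/3x3, 5x5/5x5, 3x3/5x5,
    # plus a max-pooling branch followed by an extra 1x1 convolution.
    def __init__(self, in_ch: int = 64, branch_ch: int = 32):
        super().__init__()
        self.b33 = x_branch(in_ch, branch_ch, 3, 3)
        self.b55 = x_branch(in_ch, branch_ch, 5, 5)
        self.b35 = x_branch(in_ch, branch_ch, 3, 5)
        self.pool = nn.Sequential(nn.MaxPool2d(3, 1, 1),
                                  nn.Conv2d(in_ch, branch_ch, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.cat([self.b33(x), self.b55(x), self.b35(x), self.pool(x)], 1)

print(XModule()(torch.randn(1, 64, 56, 56)).shape)  # torch.Size([1, 128, 56, 56])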
The Cairo University ultrasound images dataset was collected in 2018 and consists of
780 images with an average image size of 500 × 500 pixels [36]. The images are categorized
into three classes, which are normal (133 images), benign (487 images), and malignant
(210 images). The data collected at baseline include breast ultrasound images from women aged between 25 and 75 years; in total, 600 female patients are represented.
Because the number of images in different classes is unbalanced, this may cause the
model to learn some classes better than others, and this can cause the model to perform
inappropriately during use or testing. To prevent this from happening, we randomly
selected an equal number of images from each class. Before starting the training process
of the model, due to the lack of data, we start the data augmentation process. We resize
the dataset using cubic interpolation to fit the input requirements of the model. For
augmentation, we applied width and height shifts of 0.1 and a horizontal flip, which tripled
the size of the dataset. With this technique, we tripled the number of data for each class.
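One way to realize this augmentation step in Python with torchvision (the authors' exact augmentation code is not given, so the transform parameters below simply mirror the text):

import torch
from torchvision import transforms

# Width/height shifts of 0.1 and a horizontal flip, as stated above.
augment = transforms.Compose([
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1)),
    transforms.RandomHorizontalFlip(p=0.5),
])

def triple_dataset(images: list) -> list:
    # Keep each original image and add two augmented copies, tripling the data.
    out = []
    for img in images:
        out.extend([img, augment(img), augment(img)])
    return out

tripled = triple_dataset([torch.rand(3, 224, 224) for _ in range(10)])
print(len(tripled))  # 30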
After this process, we split the dataset into training and test sets, allocating approximately 80% for training and 20% for testing. Of course, we have not used these new images for testing. We maintain the sequence of each image so that every image appears only once in each of the aforementioned sets.
Figure 6 depicts the accuracy and loss diagrams for the three proposed models on the Cairo University ultrasound dataset.
On the Cairo University ultrasound images dataset, the performance of the three proposed CNN models in the testing phase is summarized in Table 1. The third model, DNN R3_R5_R35, which uses convolutional filters of different sizes, achieves the best accuracy and the lowest loss, reaching 93% accuracy.
Table 2 presents the resultant confusion matrix of the DNN R3_R5_R35 model. This table shows promising results: the proposed model correctly diagnosed the presence or absence of cancer in most cases, misclassifying only five out of 68 cases.
Figure 6. Accuracy and loss diagrams for the three proposed models.
Table 1. The newly developed model results on the Cairo University ultrasound images dataset (test part).
Table 2. Confusion matrix for the test dataset (0 indicates no breast cancer; 1 indicates existing breast cancer, i.e., a benign or malignant finding in the image).

                 Predicted
                 0     1
Actual    0      33    4
          1      1     30
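Reading the matrix back, the reported figures can be verified with a few lines of Python (a quick arithmetic check, not the paper's code):

# Confusion matrix from Table 2: rows = actual, columns = predicted.
tn, fp = 33, 4   # actual 0: predicted 0 / predicted 1
fn, tp = 1, 30   # actual 1: predicted 0 / predicted 1

accuracy = (tp + tn) / (tp + tn + fp + fn)          # 63 / 68 ~ 0.93
precision = tp / (tp + fp)                          # 30 / 34 ~ 0.88
recall = tp / (tp + fn)                             # 30 / 31 ~ 0.97
f1 = 2 * precision * recall / (precision + recall)  # ~ 0.92

print(accuracy, precision, recall, f1)

The resulting 93% accuracy matches the value reported for DNN R3_R5_R35 above.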
As we used some of the features of the state-of-the-art GoogLeNet and ResNet archi-
tectures in the design of the new architecture, we compared the proposed architecture, i.e.,
DNN R3_R5_R35, with these architectures in Table 3. For this comparison, we utilized GoogLeNet with 22 layers and ResNet with 50 layers. The results
indicate that ResNet outperformed GoogLeNet in two critical evaluation metrics: accuracy
and F1 score. Inspection of the table containing the results reveals that the proposed method
has surpassed both of these models and yielded a higher detection accuracy than these two
state-of-the-art architectures.
Table 3. Comparison of the proposed model against GoogLeNet and ResNet on the test dataset.
Figure 7. Accuracy and loss diagrams for the GoogLeNet.
Table 4. Comparison of the proposed model against nine state-of-the-art image processing models on the Cairo University ultrasound images dataset.

Deep-Learning Model        Loss (%)    Accuracy (%)
AlexNet                    66          69
ZFNet                      64          69
VGG-16                     6           73
Inception v4               46          85
MobileNet                  55          85
WideResNet                 39          88
GoogLeNet                  59          87
ResNet34                   61          83
ResNet50                   51          88
Proposed DNN R3_R5_R35     21          93
We also applied the proposed model to the breast histopathology images dataset to further evaluate it. The original dataset consisted of 162 whole-mount slide images of breast cancer specimens scanned at 40×. From that, 277,524 patches of size 50 × 50 were extracted (198,738 IDC-negative and 78,786 IDC-positive). Invasive ductal carcinoma (IDC) is the most common subtype of all breast cancers. We chose 40,000 images in total from the dataset: 20,000 random images from each class. We split the dataset into training and test sets, allocating 80% for training and 20% for testing. The comparison results of the
proposed DNN R3_R5_R35 model against existing approaches are listed in Table 5. The
best results are highlighted in bold. It is evident from the table that the proposed model
has outperformed the other models in the two assessed criteria.
The objective of this study was to enhance the accuracy of breast cancer detection
through the application of deep-learning techniques in the development of computer-aided
detection systems. The proposed model, utilizing various filter sizes, demonstrated 93%
and 95% accuracy on two distinct datasets: ultrasound images and breast histopathology
images, respectively. The second goal was to decrease the parameters of the network,
aiming to improve the training time. The time issue during the training process for any
deep-learning model is still challenging and depends on the facilities that are used. Training
the model on ultrasound and histopathology images takes less than two hours and less
than six hours, respectively, which is suitable compared to other DNN models.
From the short review above, we can summarize the main findings of this paper as follows:
1. The granular computing technique used in this paper, by breaking down images
into smaller, more granular components, can effectively extract features from images,
allowing for more accurate and efficient image analysis. This leads to increased effi-
ciency by reducing the computational complexity of image analysis tasks. Moreover,
breaking down images into smaller, granular components can improve the accuracy of
image analysis tasks, leading to more reliable results;
2. Activation functions with learnable parameters offer greater flexibility and adaptabil-
ity compared to traditional activation functions with fixed parameters. This allows
the network to better adapt to different types of data and tasks. These functions can
also improve the flow of gradients through the network during training, making it
easier to optimize the network and reduce the risk of vanishing gradients. Better
regularization is another advantage of these functions. Learnable activation functions
can be used as a form of regularization, helping to prevent overfitting by constraining
the network’s capacity and reducing the risk of memorization;
3. In this study, a range of filters and kernels of varying sizes were employed to effec-
tively identify patterns at multiple scales within an image. By incorporating multiple
kernels within a filter, the network was able to learn diverse and robust feature repre-
sentations, which helped to reduce overfitting and improve the generalization of the
model. This approach enabled the model to consider a wider range of input features,
leading to higher accuracy in complex tasks compared with a model that employs a
single filter and kernel. The use of multiple kernels within a filter, therefore, represents
an effective strategy for improving the ability of a neural network to generalize to
new images by facilitating the learning of more sophisticated features across a range
of spatial scales;
4. Utilizing a wide and deep network, shortcut connections, attention layers, auxiliary classifiers, and learnable activation functions improves the accuracy of diagnosis and consequently decreases the load on doctors. In addition, using
1 × 1 convolutions reduces time consumption in the model. Compared to existing
breast cancer methods, the proposed model achieves the highest diagnostic accuracy.
There are two potential limitations to the presented model:
1. The extraction of some patterns from the image may be dependent on the granularity
size. In the proposed granulation, the granularity size was set to 32 × 32 pixels,
regardless of the image size. Consequently, some patterns may not be extracted, weak-
ening the effectiveness of granulation. However, the model’s overall performance
demonstrates that the proposed granulation method outperformed state-of-the-art
models for the datasets under consideration;
2. Incorporating granularity in a model requires additional time before the training
process can commence. It is worth noting, however, that once these granules have
been established, they can be reused multiple times.
Sketch2Photo [42] before starting the learning process. This will increase the accuracy of
the model. (4) Over the last 30 years, hyperspectral imagery (HSI) has gained prominence
for its ability to discern anomalies from natural ground objects based on their spectral
characteristics. The importance of HSI has been recognized in a variety of remote sensing
applications, including but not limited to object classification, hyperspectral unmixing,
anomaly detection, and change detection [43]. We can use this technique to identify breast
cancer. (5) The primary challenge in content-based image retrieval (CBIR) systems is the
presence of a semantic gap that must be narrowed for effective retrieval. To address this
issue, various techniques, such as those outlined in [44], can be employed to incorporate
semantic considerations. (6) To reduce the processing time, it is suggested to use distributed
and parallel similarity retrieval techniques, such as [45], on large CT image sequences.
(7) The proposed framework can be extended to include other types of cancer detection,
such as lung or prostate cancer. This would enable the development of a comprehensive
cancer detection system that can be integrated into existing healthcare systems.
Author Contributions: Conceptualization, S.Z. and H.I.; methodology, S.Z. and H.I.; software, S.Z.;
validation, S.Z., H.I. and J.K.; formal analysis, S.Z., H.I. and J.K.; investigation, S.Z., H.I. and J.K.; resources, S.Z., H.I. and J.K.; data curation, S.Z., H.I. and J.K.; writing—original draft preparation, S.Z.; writing—review and editing, S.Z., H.I. and J.K.; visualization, S.Z.; supervision, H.I. and J.K.; project
administration, H.I. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Publicly available datasets were analyzed in this study. Ultrasound
images dataset: https://www.data-in-brief.com/article/S2352-3409(19)31218-1/fulltext (accessed on
1 September 2022). Breast Histopathology Images: https://www.kaggle.com/datasets/paultimothymooney/breast-histopathology-images (accessed on 1 September 2022).
Conflicts of Interest: The authors declare no conflict of interest.
References
1. World Health Organization: Breast Cancer Web Site. Available online: https://www.who.int/news-room/fact-sheets/detail/breast-cancer (accessed on 30 September 2022).
2. Mahmood, T.; Li, J.; Pei, Y.; Akhtar, F.; Imran, A.; Rehman, K.U. A brief survey on breast cancer diagnostic with deep learning
schemes using multi-image modalities. IEEE Access 2020, 8, 165779–165809. [CrossRef]
3. Liew, X.Y.; Hameed, N.; Clos, J. A review of computer-aided expert systems for breast cancer diagnosis. Cancers 2021, 13, 2764.
[CrossRef]
4. Almajalid, R.; Shan, J.; Du, Y.; Zhang, M. Development of a deep-learning-based method for breast ultrasound image segmentation.
In Proceedings of the 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), Orlando, FL,
USA, 17–20 December 2018; pp. 1103–1108.
5. Latif, G.; Butt, M.O.; Al Anezi, F.Y.; Alghazo, J. Ultrasound image despeckling and detection of breast cancer using deep CNN. In
Proceedings of the 2020 RIVF International Conference on Computing and Communication Technologies (RIVF), Ho Chi Minh,
Vietnam, 14–15 October 2020; pp. 1–5.
6. Zhu, W.; Xiang, X.; Tran, T.D.; Hager, G.D.; Xie, X. Adversarial deep structured nets for mass segmentation from mammograms.
In Proceedings of the 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), Washington, DC, USA,
4–7 April 2018; pp. 847–850.
7. Al-Antari, M.A.; Al-Masni, M.A.; Choi, M.T.; Han, S.M.; Kim, T.S. A fully integrated computer-aided diagnosis system for digital
X-ray mammograms via deep learning detection, segmentation, and classification. Int. J. Med. Inform. 2018, 117, 44–54. [CrossRef]
[PubMed]
8. Yao, J.T.; Vasilakos, A.V.; Pedrycz, W. Granular computing: Perspectives and challenges. IEEE Trans. Cybern. 2013, 43, 1977–1989.
[CrossRef] [PubMed]
9. Lee, J.; Kang, B.J.; Kim, S.H.; Park, G.E. Evaluation of computer-aided detection (CAD) in screening automated breast ultrasound
based on characteristics of CAD marks and false-positive marks. Diagnostics 2022, 12, 583.
10. Cheng, H.D.; Shan, J.; Ju, W.; Guo, Y.; Zhang, L. Automated breast cancer detection and classification using ultrasound images: A
survey. Pattern Recognit. 2010, 43, 299–317. [CrossRef]
11. Jiménez-Gaona, Y.; Rodríguez-Álvarez, M.J.; Lakshminarayanan, V. Deep-learning-based computer-aided systems for breast
cancer imaging: A critical review. Appl. Sci. 2020, 10, 8298. [CrossRef]
12. Wang, S.; Huang, J. Breast Lesion Segmentation in Ultrasound Images by CDeep3M. In Proceedings of the 2020 International
Conference on Computer Engineering and Application (ICCEA), Guangzhou, China, 18–20 March 2020; pp. 907–911.
13. Wei, K.; Wang, B.; Saniie, J. Faster Region Convolutional Neural Networks Applied to Ultrasonic Images for Breast Lesion
Detection and Classification. In Proceedings of the 2020 IEEE International Conference on Electro Information Technology (EIT),
Romeoville, IL, USA, 31 July–1 August 2020; pp. 171–174.
14. Badawy, S.M.; Mohamed, A.E.N.A.; Hefnawy, A.A.; Zidan, H.E.; GadAllah, M.T.; El-Banby, G.M. Classification of Breast
Ultrasound Images Based on Convolutional Neural Networks—A Comparative Study. In Proceedings of the 2021 International
Telecommunications Conference (ITC-Egypt), Alexandria, Egypt, 13–15 July 2021; pp. 1–7.
15. Tang, P.; Yang, X.; Nan, Y.; Xiang, S.; Liang, Q. Feature Pyramid Nonlocal Network With Transform Modal Ensemble Learning for
Breast Tumor Segmentation in Ultrasound Images. IEEE Trans. Ultrason. Ferroelectr. Freq. Control 2021, 68, 3549–3559.
16. Xiao, T.; Liu, L.; Li, K.; Qin, W.; Yu, S.; Li, Z. Comparison of transferred deep neural networks in ultrasonic breast masses
discrimination. BioMed Res. Int. 2018, 2018, 4605191. [CrossRef]
17. Uysal, F.; Köse, M.M. Classification of Breast Cancer Ultrasound Images with Deep Learning-Based Models. Eng. Proc. 2022, 31, 8.
18. Ragab, M.; Albukhari, A.; Alyami, J.; Mansour, R.F. Ensemble deep-learning-enabled clinical decision support system for breast
cancer diagnosis and classification on ultrasound images. Biology 2022, 11, 439. [CrossRef]
19. Xing, J.; Chen, C.; Lu, Q.; Cai, X.; Yu, A.; Xu, Y.; Huang, L. Using BI-RADS stratifications as auxiliary information for breast
masses classification in ultrasound images. IEEE J. Biomed. Health Inform. 2020, 25, 2058–2070. [CrossRef]
20. Ragab, D.A.; Sharkas, M.; Marshall, S.; Ren, J. Breast cancer detection using deep convolutional neural networks and support
vector machines. PeerJ 2019, 7, e6201. [CrossRef]
21. Yu, X.; Wang, S.H. Abnormality diagnosis in mammograms by transfer learning based on ResNet18. Fundam. Inform. 2019,
168, 219–230. [CrossRef]
22. Islam, M.M.; Haque, M.R.; Iqbal, H.; Hasan, M.M.; Hasan, M.; Kabir, M.N. Breast cancer prediction: A comparative study using
machine learning techniques. SN Comput. Sci. 2020, 1, 1–14. [CrossRef]
23. Alzubaidi, L.; Al-Shamma, O.; Fadhel, M.A.; Farhan, L.; Zhang, J.; Duan, Y. Optimizing the performance of breast cancer
classification by employing the same domain transfer learning from hybrid deep convolutional neural network model. Electronics
2020, 9, 445. [CrossRef]
24. Gao, Y.; Geras, K.J.; Lewin, A.A.; Moy, L. New frontiers: An update on computer-aided diagnosis for breast imaging in the age of
artificial intelligence. AJR Am. J. Roentgenol. 2019, 212, 300. [CrossRef]
25. Yap, M.H.; Pons, G.; Marti, J.; Ganau, S.; Sentis, M.; Zwiggelaar, R.; Marti, R. Automated breast ultrasound lesions detection using
convolutional neural networks. IEEE J. Biomed. Health Inform. 2017, 22, 1218–1226. [CrossRef]
26. Moon, W.K.; Lee, Y.W.; Ke, H.H.; Lee, S.H.; Huang, C.S.; Chang, R.F. Computer-aided diagnosis of breast ultrasound images
using ensemble learning from convolutional neural networks. Comput. Methods Programs Biomed. 2020, 190, 105361. [CrossRef]
27. Lee, Y.W.; Huang, C.S.; Shih, C.C.; Chang, R.F. Axillary lymph node metastasis status prediction of early-stage breast cancer using
convolutional neural networks. Comput. Biol. Med. 2021, 130, 104206. [CrossRef]
28. Sun, Q.; Lin, X.; Zhao, Y.; Li, L.; Yan, K.; Liang, D.; Li, Z.C. Deep learning vs. radiomics for predicting axillary lymph node
metastasis of breast cancer using ultrasound images: Don’t forget the peritumoral region. Front. Oncol. 2020, 10, 53. [CrossRef]
[PubMed]
29. Sun, S.; Mutasa, S.; Liu, M.Z.; Nemer, J.; Sun, M.; Siddique, M.; Ha, R.S. Deep learning prediction of axillary lymph node status
using ultrasound images. Comput. Biol. Med. 2022, 143, 105250. [CrossRef] [PubMed]
30. Ayana, G.; Park, J.; Jeong, J.W.; Choe, S.W. A novel multistage transfer learning for ultrasound breast cancer image classification.
Diagnostics 2022, 12, 135. [CrossRef]
31. Yusoff, M.; Haryanto, T.; Suhartanto, H.; Mustafa, W.A.; Zain, J.M.; Kusmardi, K. Accuracy Analysis of Deep Learning Methods
in Breast Cancer Classification: A Structured Review. Diagnostics 2023, 13, 683. [CrossRef] [PubMed]
32. Hossain, A.A.; Nisha, J.K.; Johora, F. Breast Cancer Classification from Ultrasound Images using VGG16 Model based Transfer
Learning. Int. J. Image Graph. Signal Process. 2023, 13, 12. [CrossRef]
33. Feng, H.; Yang, B.; Wang, J.; Liu, M.; Yin, L.; Zheng, W.; Liu, C. Identifying Malignant Breast Ultrasound Images Using ViT-Patch.
Appl. Sci. 2023, 13, 3489. [CrossRef]
34. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Rabinovich, A. Going deeper with convolutions. In Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9.
35. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
36. Al-Dhabyani, W.; Gomaa, M.; Khaled, H.; Fahmy, A. Dataset of breast ultrasound images. Data Brief 2020, 28, 104863. [CrossRef]
37. Alanazi, S.A.; Kamruzzaman, M.M.; Islam Sarker, M.N.; Alruwaili, M.; Alhwaiti, Y.; Alshammari, N.; Siddiqi, M.H. Boosting
breast cancer detection using convolutional neural network. J. Healthc. Eng. 2021, 2021, 1–11. [CrossRef]
38. Seemendra, A.; Singh, R.; Singh, S. Breast cancer classification using transfer learning. In Evolving Technologies for Computing,
Communication and Smart World: Proceedings of ETCCS; Springer: Singapore, 2020; pp. 425–436.
39. Gour, M.; Jain, S.; Sunil Kumar, T. Residual learning based CNN for breast cancer histopathological image classification. Int. J.
Imaging Syst. Technol. 2020, 30, 621–635. [CrossRef]
40. Shahidi, F.; Daud, S.M.; Abas, H.; Ahmad, N.A.; Maarop, N. Breast cancer classification using deep learning approaches and
histopathology image: A comparison study. IEEE Access 2020, 8, 187531–187552. [CrossRef]
41. Hirra, I.; Ahmad, M.; Hussain, A.; Ashraf, M.U.; Saeed, I.A.; Qadri, S.F.; Alfakeeh, A.S. Breast cancer classification from
histopathological images using patch-based deep learning modeling. IEEE Access 2021, 9, 24273–24287. [CrossRef]
42. Liu, H.; Xu, Y.; Chen, F. Sketch2Photo: Synthesizing photo-realistic images from sketches via global contexts. Eng. Appl. Artif.
Intell. 2023, 117, 105608. [CrossRef]
43. Wang, S.; Hu, X.; Sun, J.; Liu, J. Hyperspectral anomaly detection using ensemble and robust collaborative representation. Inf. Sci.
2023, 624, 748–760. [CrossRef]
44. Zhuang, Y.; Chen, S.; Jiang, N.; Hu, H. An Effective WSSENet-Based Similarity Retrieval Method of Large Lung CT Image
Databases. KSII Trans. Internet Inf. Syst. 2022, 16, 7.
45. Zhuang, Y.; Jiang, N.; Xu, Y.; Xiangjie, K.; Kong, X. Progressive Distributed and Parallel Similarity Retrieval of Large CT Image
Sequences in Mobile Telemedicine Networks. Wirel. Commun. Mob. Comput. 2022, 2022, 6458350. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.