Classification of Breast Cancer Using Transfer Learning and Advanced Al-Biruni Earth Radius Optimization
Article
Classification of Breast Cancer Using Transfer Learning and
Advanced Al-Biruni Earth Radius Optimization
Amel Ali Alhussan 1 , Abdelaziz A. Abdelhamid 2,3, * , S. K. Towfek 4,5 , Abdelhameed Ibrahim 6, * ,
Laith Abualigah 7,8,9,10 , Nima Khodadadi 11 , Doaa Sami Khafaga 1 , Shaha Al-Otaibi 12 and Ayman Em Ahmed 13
1 Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah Bint
Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
2 Department of Computer Science, College of Computing and Information Technology, Shaqra University,
Shaqra 11961, Saudi Arabia
3 Department of Computer Science, Faculty of Computer and Information Sciences, Ain Shams University,
Cairo 11566, Egypt
4 Computer Science and Intelligent Systems Research Center, Blacksburg, VA 24060, USA
5 Department of Communications and Electronics, Delta Higher Institute of Engineering and Technology,
Mansoura 35111, Egypt
6 Computer Engineering and Control Systems Department, Faculty of Engineering, Mansoura University,
Mansoura 35516, Egypt
7 Computer Science Department, Prince Hussein Bin Abdullah Faculty for Information Technology, Al al-Bayt
University, Mafraq 25113, Jordan
8 Hourani Center for Applied Scientific Research, Al-Ahliyya Amman University, Amman 19328, Jordan
9 MEU Research Unit, Middle East University, Amman 11831, Jordan
10 School of Computer Sciences, Universiti Sains Malaysia, Pulau Pinang 11800, Malaysia
11 Department of Civil and Architectural Engineering, University of Miami, Coral Gables, FL 33146, USA
12 Department of Information Systems, College of Computer and Information Sciences, Princess Nourah Bint
Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
13 Faculty of Engineering, King Salman International University, El-Tor 8701301, Egypt
* Correspondence: [email protected] (A.A.A.); [email protected] (A.I.)
Abstract: Breast cancer is one of the most common cancers in women, with an estimated 287,850 new
cases identified in 2022. There were 43,250 female deaths attributed to this malignancy. The high death
rate associated with this type of cancer can be reduced with early detection. Nonetheless, a skilled
professional is always necessary to manually diagnose this malignancy from mammography images.
Many researchers have proposed several approaches based on artificial intelligence. However, they
still face several obstacles, such as overlapping cancerous and noncancerous regions, extracting
irrelevant features, and inadequate training models. In this paper, we developed a novel
computationally automated biological mechanism for categorizing breast cancer. Using a new optimization
approach based on the Advanced Al-Biruni Earth Radius (ABER) optimization algorithm, a boosting
to the classification of breast cancer cases is realized. The stages of the proposed framework include
data augmentation, feature extraction using AlexNet based on transfer learning, and optimized
classification using a convolutional neural network (CNN). Using transfer learning and optimized
CNN for classification improved the accuracy when the results are compared to recent approaches.
Two publicly available datasets are utilized to evaluate the proposed framework, and the average
classification accuracy is 97.95%. To ensure the statistical significance and difference between the
proposed methodology, additional tests are conducted, such as analysis of variance (ANOVA) and
Wilcoxon, in addition to evaluating various statistical analysis metrics. The results of these tests
emphasized the effectiveness and statistical difference of the proposed methodology compared to
current methods.
Keywords: biological mechanism; cancer detection; Al-Biruni Earth radius optimization algorithm;
machine learning
Citation: Alhussan, A.A.; Abdelhamid, A.A.; Towfek, S.K.; Ibrahim, A.; Abualigah, L.;
Khodadadi, N.; Khafaga, D.S.; Al-Otaibi, S.; Ahmed, A.E. Classification of Breast Cancer Using
Transfer Learning and Advanced Al-Biruni Earth Radius Optimization. Biomimetics 2023, 8, 270.
https://doi.org/10.3390/biomimetics8030270
Academic Editor: Huiling Chen
Received: 2 May 2023
Revised: 21 June 2023
Accepted: 24 June 2023
Published: 26 June 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Cancer is a global health problem. Among female cancers, breast cancer is by far the
most common [1]. However, 42 percent of NHS trusts say they cannot assign individuals
because they do not have enough staff, with many citing a lack of breast cancer specialists.
This shortage is a fundamental reason breast cancer has a dismal survival rate worldwide [2]. Breast
cancer specialists are in limited supply, which delays diagnosis, increases resistance to
effective screening and treatment, and creates inequalities in access to care [3]. Methods for
detecting breast cancer have been developed to identify anomalies and classify the
disease more accurately, which aids detection [4]. Death rates can be
reduced with early detection using screening mammography; however, this is challenging
because potential nodules are small relative to the entire breast [5]. Breast cancer has
a higher chance of being cured (about 90%) than other cancer types. Cancer patients often
go undiagnosed until they experience severe symptoms [6]. Patient age affects the
mortality and incidence rates of breast cancer. Between 2010 and 2014, breast cancer was
typically diagnosed in patients aged 62 [7].
With an estimated 90,000 new cases annually and a reported 40,000 deaths due to the
disease, Pakistan has Asia’s highest breast cancer mortality rate [8,9]. Survival rates for
certain cancers vary depending on their detection stages [10]. The survival rate counts those
who are predicted to live beyond a certain point after receiving a diagnosis and continue to
function normally. Mammography is the most reliable technology for
identifying breast cancer because it satisfies medical requirements at low cost.
The study of mammograms is the major approach doctors use to make a
diagnosis; however, it can be affected by bias and fatigue. Mammography, unfortunately,
also has a relatively low detection rate: depending on the kind of lesion, the breast density,
and the patient’s age, its false-negative rate can range from 5% to
30% [11]. Mammography uses low-dose radiography to reveal the
breast’s internal structure.
To diagnose breast cancer, machine learning algorithms are trained to look for specific
patterns and associations in data that are linked to the biological mechanisms through
which cancer develops. The aberrant multiplication and proliferation of breast cells are
central to the basic mechanisms behind breast cancer, which can have multiple underlying
causes, including heredity, lifestyle, and the external environment. These processes can
lead to the development of breast abnormalities such as lumps, masses, or cysts, which can
be discovered using mammography, ultrasound, or magnetic resonance imaging. These
imaging data can be fed into a machine-learning algorithm and trained to look for abnor-
malities or patterns that are indicative of breast cancer. For breast cancer, machine learning
algorithms can be trained to recognize telltale signs such as masses and microcalcifica-
tions [12,13]. Imaging data, patient history, and molecular biomarkers are just some data
sources that can be analyzed with machine learning algorithms to enhance breast cancer
detection. Machine-learning algorithms can improve the accuracy and timeliness of breast
cancer diagnostics by merging data from numerous sources to detect tiny changes in breast
tissue that may indicate the presence of cancer. Breast cancer risk factors include genet-
ics, lifestyle choices, and other factors; these can all be modeled using machine learning
algorithms to create prediction models of an individual’s likelihood of developing breast
cancer. These models can be used to inform screening and preventative strategies, which
in turn can help lower breast cancer rates. To a large extent, the biological mechanisms of
cancer development are intertwined with breast cancer detection using machine learning,
as these algorithms are trained to recognize patterns and abnormalities in breast tissue that
are linked to the development of cancer [14–16].
CNNs have recently demonstrated promising performance in detecting and categorizing
tumors in medical images. The performance of deep learning models is typically proportional to
the size of the datasets used for training. In contrast to deep learning-based strategies,
traditional methods perform poorly on complex datasets. Deep learning
employs CNNs to perform breast cancer classification [17–19]. Convolutional,
pooling, activation, and fully connected layers are some of the (hidden) layer types found in
a CNN model. Softmax is the classifier used in the final layer of a convolutional neural
network model. The use of deep learning enables automated artificial intelligence approaches
in medical imaging. Researchers have introduced several deep learning-based architectures
to detect and categorize infectious diseases. While several deep learning methods have
been established to aid in the classification and diagnosis of breast cancer, researchers
have encountered obstacles such as imbalanced datasets, noisy imaging data, and the
downsampling of critical features [20]. Prior work has focused on training deep
models through transfer learning, in which a model that
has already been trained is applied to a new problem or scenario [21,22]. While hyperparameters
such as learning rate and mini-batch size have been used successfully in training,
setting their values by hand is tedious and error-prone when dealing with breast cancer.
The authors of [23] described an improved hyperparameter-based deep-learning system for
breast cancer classification. Deep features were extracted from the fully connected layer
after training; nevertheless, analysis showed that numerous features
were redundant, which negatively impacted breast cancer classification [23]. An improved
method of classifying breast cancer using deep learning was recently presented by the
authors of [24]. The authors of [25] proposed dialectical feature selection to improve breast
cancer classification; however, these methods run into the issue of stopping after the ideal
values have been retrieved.
Due to its many benefits over alternative modalities, the mammogram has become
the preferred modality for screening for breast cancer [26]. First, mammography has been
the subject of much research and is useful in identifying breast cancer at an early stage.
When used with a clinical breast exam, it can detect small cancers or microcalcifications
that the naked eye could miss. Successful treatment results can be improved by prompt
action made possible by this early identification. Second, mammograms produce highly
detailed pictures of breast tissue, letting radiologists see any irregularities very plainly.
This screening method is safe and well-accepted because of the low-dose X-rays used
in mammography. As a bonus, mammography can even spot breast cancer in people
with dense breast tissue. Breasts often have dense tissue, which might obscure cancers
on conventional imaging techniques such as ultrasonography. Mammography is useful
for screening women with a wide variety of breast densities because it can successfully
penetrate dense tissue. The widespread accessibility and well-established infrastructure
of mammography are additional benefits. Most medical facilities, clinics, and screening
centers can access mammography equipment. Because of this, many women will be
able to get screened regularly, which will help with identification and treatment early
on. Compared to other screening methods, mammography also has a low cost. It strikes
a good compromise between price and accuracy in establishing a diagnosis, making it
a viable option for widespread breast cancer screening programs. Mammography has
been widely adopted as the standard screening method for breast cancer because of its
efficacy in detecting cancer at an early stage, its high-resolution imaging capabilities, its
ability to identify tumors even in dense breast tissue, its widespread availability, and its
cost-effectiveness. These benefits work together to make breast cancer treatment more effective
and decrease patient mortality [27].
The main contributions of this work are as follows:
• Optimizing the structure and training parameters of the classification CNN to boost
its performance.
• Employing two datasets to demonstrate the effectiveness and generalization of the
proposed approach.
• Studying the statistical difference of the proposed methodology using ANOVA and
Wilcoxon signed-rank tests.
• Applying statistical analysis to show the stability of the proposed methodology in
classifying breast cancer cases.
The main motivation for using the BER optimization algorithm is its efficiency in
exploring the search space for the best solution. The motivation for using
AlexNet is that it performed better than other deep networks, such as GoogleNet
and VGG; therefore, AlexNet is adopted for feature extraction. In addition, a CNN is used
for the classification of breast cancer, and the BER optimization algorithm is used to optimize
its parameters to achieve the best performance of the CNN.
2. Literature Review
Around 1.7 million women were diagnosed with cancer in 2012. Breast cancer is the
most frequent type of cancer worldwide. Risk factors for breast cancer include age, family
history, and previous health problems [4]. Women account for the lion’s share of cancer
deaths; annually, an estimated 2.1 million people are diagnosed with breast cancer. Recent
research estimates that 627 thousand women lost their lives to cancer in 2018, accounting
for fifteen percent of all cancer deaths in women [5]. Deep learning-based models are
commonly used in computer vision for breast cancer diagnosis and classification.
Clinicians face difficulties in making a cancer diagnosis from mammography
scans due to the complexity of early breast cancer and the fading of images. That is why it
is so important to enhance a doctor’s detection efficiency with the help of deep learning
algorithms used in computer-aided diagnosis (CAD) systems [28–31].
To categorize breast cancer, the authors of [4] proposed a convolutional neural network
(CNN) based framework for analyzing mammography images. First,
preprocessing was carried out to improve the visibility of the mammography images. Then, the deep
learning model used to extract the features was trained on the preprocessed
images. Softmax, a CNN classifier, was then used to categorize the features retrieved from the
last layer. The selected model improved the framework’s classification accuracy on
mammography images. The reported accuracy values of 0.8585 and 0.8271 were higher than
those of the state-of-the-art alternatives. The authors of [32]
reported early results for using transfer learning to identify breast abnormalities likely to
progress to cancer. After testing numerous deep learning models, they settled on ResNet50
and MobileNet as the best options, which achieved the highest accuracy levels
(78.4% and 74.3%, respectively). They used several preprocessing methods to further enhance the
accuracy of the categorization. Finally, in [33], researchers introduced a
novel hybrid processing approach that combines principal component analysis (PCA) and
logistic regression (LR).
Using a multi-view screening image-processing architecture, the authors of [34] were
able to improve diagnostic results. First-order local entropy, a texture-based technique,
segmented the tumor patches. Malignancy indicators such as radius and area were derived
using the feature extraction findings. Results from applying this strategy indicated that the
CC and MLO views were 88% and 80% accurate at detecting breast cancer, respectively.
The framework described by the authors in [35] centered on transferable knowledge.
Several augmentation methods were employed to increase the total number of mammograms
without overfitting and to produce accurate findings. The authors of [36] proposed a method
based on a very large mammography image dataset. Each image is first enhanced, and a
segmentation module is then used to identify breast cancer abnormalities. On the Breast
Imaging Reporting and Data System (BI-RADS) dataset, which comprises five groups, the
method achieved 92% precision.
Tumor identification with thresholding and CNN methods was the primary focus of
the previous research, along with information fusion, data enhancement, and manual
selection of hyperparameter values. However, these studies omitted
key steps that could have increased precision, namely improving the
contrast and then optimizing the retrieved features. The SGD and ADAM optimizers are
frequently used to fine-tune the weights of a deep model. A feature optimization method
is implemented following the feature extraction stage to combat computational complexity,
overfitting, and poor accuracy. Table 1 presents a summary of the related works
in terms of the presented methodology, advantages,
disadvantages, and overall performance. As shown in this table, the low accuracy of most
methods represents the research gap addressed through the methodology proposed in
this work.
3. Proposed Methodology
The proposed framework for mammogram-based breast cancer classification is presented
in this section. The steps of the proposed methodology are shown in Figure 1. The
pipeline starts with the adopted breast cancer datasets, followed by data augmentation to
enhance these datasets. The next step is feature extraction, in which pre-trained models
are employed. The pre-trained models include AlexNet, GoogleNet,
and VGG. The features extracted from the pre-trained model are then fed to the proposed
optimization algorithm to optimize the parameters of a custom convolutional neural network
(CNN). The proposed optimization algorithm is an improved version of the Al-Biruni Earth Radius
(BER) optimization algorithm, denoted by advanced BER (ABER). After its parameters are
optimized, the CNN is used to classify the test images of the given datasets.
The classification results are finally analyzed using several evaluation criteria and statistical
methods. The next sections present more details about these steps.
3.1. Dataset
The Digital Database for Screening Mammography (DDSM) dataset employed in this
research can be accessed at [45] and is denoted by Dataset-1 throughout this text. It
provides a large database of mammograms, both normal and abnormal. The proposed
optimized convolutional neural network (CNN) uses this dataset for training
and testing. A CNN is a robust deep-learning model developed especially for analyzing
and interpreting visual input, which makes it well suited to mammography classification.
The breast images are fed into the
deep learning network, which learns complex patterns and characteristics to identify
anomalies such as masses, calcifications, and architectural distortions. Applying the
proposed optimization strategies to the CNN design may improve its performance in
terms of overfitting reduction, generalization, and classification precision. To test how
well the optimized CNN works, one portion of the data is used for model training and a
held-out portion is used to verify its accuracy.
Together, the proposed optimized CNN and the DDSM mammography dataset provide
a robust system for the classification of mammograms, which can improve the accuracy
and efficiency with which mammograms are classified and hence facilitate the early
identification and diagnosis of breast problems. This dataset contains 1696 images,
including benign and malignant cases.
An additional dataset is considered to emphasize the effectiveness of the proposed
methodology. This dataset is publicly available on Kaggle [46] and is denoted by Dataset-2
throughout this text. It is a collection of mammograms and breast cancer images and serves
as a resource for training and evaluating
the proposed optimized convolutional neural network (CNN) for classification purposes.
By leveraging this dataset, the proposed optimized CNN can be trained to classify breast
cancer images and mammograms accurately. Utilizing this dataset in conjunction with
the optimized CNN can improve the efficiency and accuracy of a breast cancer diagnosis
significantly. The deep learning model can learn intricate patterns and features from the
images, enabling it to distinguish between malignant and benign cases. The proposed
optimization techniques applied to the CNN architecture can enhance its performance by
reducing overfitting, improving generalization, and increasing the overall accuracy of the
classification. With this dataset, researchers can thoroughly evaluate the performance of
the optimized CNN. The dataset’s diverse range of images and associated metadata allow
for a comprehensive evaluation of the proposed optimized CNN across various patient
demographics and imaging techniques. Together, the dataset and the proposed optimized
CNN present a promising approach for classifying mammograms and breast cancer images.
By harnessing the power of deep learning and leveraging this dataset, the optimized
CNN can contribute to accurate and efficient breast cancer diagnosis, and thus to improved
patient outcomes and better healthcare practices. This dataset contains
1356 images, including benign and malignant cases.
Table 2. Dataset information and the number of images before and after data augmentation.
Dataset | Classes | Before Augmentation | After Augmentation
Dataset-1 | 2 (Benign and Malignant) | 1696 | 6784
Dataset-2 | 2 (Benign and Malignant) | 1356 | 5424
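The counts in Table 2 correspond to a fourfold increase per dataset (1696 × 4 = 6784 and 1356 × 4 = 5424). The following minimal sketch shows one way such a factor could arise, assuming each original image is kept and three transformed copies are added. The specific transforms (two flips and a 180° rotation) and all function names are illustrative assumptions, not the augmentation pipeline used in this work.

```python
def flip_horizontal(image):
    """Mirror every row of a 2-D image (list of lists) left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reverse the order of the rows (top to bottom flip)."""
    return [row[:] for row in image[::-1]]


def rotate_180(image):
    """Rotate by 180 degrees, i.e. flip both rows and columns."""
    return [row[::-1] for row in image[::-1]]


def augment(image):
    """Return the original plus three transformed copies (hypothetical choice)."""
    return [image, flip_horizontal(image), flip_vertical(image), rotate_180(image)]


def augmented_count(n_images, factor=4):
    """Number of images after augmentation with a fixed per-image factor."""
    return n_images * factor


if __name__ == "__main__":
    print(augmented_count(1696), augmented_count(1356))
```

With a factor of four, the helper reproduces the "After Augmentation" column of Table 2 for both datasets.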
nected layers, rectified linear unit (ReLU) layers, pooling layers, and convolution layers.
Specifically, AlexNet consists of five convolutional layers, three of which are followed by a
pooling layer, and three fully connected layers. AlexNet uses several pragmatic strategies
that contribute to its impressive performance, including the dropout regularization technique
and the ReLU non-linearity. AlexNet is trained with the stochastic
gradient descent (SGD) algorithm, which relies on back-propagation to optimize the cost
function and thereby learn the convolutional kernels. The convolutional layers
apply sliding convolutional kernels to the input feature maps to produce convolved feature
maps. The pooling layers aggregate information within a given neighborhood window by
performing either a max pooling or an average pooling operation on the convolved feature
maps. The ReLU function acts as a half-wave rectifier, which reduces training time
and helps prevent overfitting. Dropout is a regularization method that randomly
sets several hidden or input neurons to zero during training; it is commonly used in the
fully connected layers of the
AlexNet architecture to reduce overfitting. Figure 3 shows the steps of the proposed feature
extraction method.
Figure 3. The process of extracting features from the breast cancer images dataset.
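The two layer-level operations described above, ReLU as a half-wave rectifier and dropout as random zeroing of neurons during training, can be sketched in a few lines. This is an illustrative sketch rather than AlexNet's implementation: it uses the "inverted" dropout variant, which rescales the surviving activations during training, and that choice is an assumption made here so that no rescaling is needed at test time.

```python
import random


def relu(values):
    """Half-wave rectification: negative activations become zero."""
    return [max(0.0, v) for v in values]


def dropout(values, rate, rng, training=True):
    """Zero each activation with probability `rate` while training.

    Survivors are divided by (1 - rate) so the expected activation is
    unchanged (inverted dropout, assumed here for illustration).
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


if __name__ == "__main__":
    rng = random.Random(0)
    activations = relu([-1.0, 0.5, 2.0, -0.3])
    print(activations)
    print(dropout(activations, rate=0.5, rng=rng))
```

At inference time the function is called with `training=False`, in which case the activations pass through unchanged.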
The transfer technique and the pre-training procedure [50,51] allow the parameters of
a neural network to be imported from natural imaging datasets. This was partly possible
because remote sensing imagery and natural imaging datasets are similar and comparable
in terms of their respective categories. Well-trained network parameters are critical for
launching the subsequent classification framework, and it makes sense that these param-
eters may be obtained by training an AlexNet architecture on massive and complicated
ImageNet datasets. Therefore, the AlexNet architecture’s capability to categorize HSR
sceneries from remote sensing data is improved using the pre-training method. For the first
time, the AlexNet architecture’s easy and comprehensive representation ability can be uti-
lized in HSR remote sensing imaging scene categorization due to the pre-training approach.
returning to the input node. Since CNN preserves the spatial correlations after filtering the
input images, it is one of the best machine-learning (ML) techniques used in medical image
analysis. The medical analytic community places a premium on these connections. This
section presents a high-level overview of the components that make up CNNs. As may be
seen in Figure 4, CNN is made up of several different layers. These are the levels:
Figure 4. The typical structure of the convolutional neural network used in image classification.
the convolutional layer but before the ReLU layer. Average and maximum pooling are the
two most used methods. The difference between max pooling and average pooling is that
the former takes the maximum value of the input within a kernel and discards the others,
while the latter takes the average.
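The difference between the two pooling methods can be illustrated on a small feature map. The sketch below assumes non-overlapping square windows (stride equal to the window size), which is a simplification chosen for clarity; the function name and the example values are hypothetical.

```python
def pool2d(feature_map, size, mode="max"):
    """Pool a 2-D feature map with non-overlapping size x size windows."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            if mode == "max":
                # Keep only the largest value in the window.
                pooled_row.append(max(window))
            elif mode == "average":
                # Replace the window by its mean value.
                pooled_row.append(sum(window) / len(window))
            else:
                raise ValueError("mode must be 'max' or 'average'")
        pooled.append(pooled_row)
    return pooled


if __name__ == "__main__":
    fmap = [[1, 2, 5, 6],
            [3, 4, 7, 8],
            [9, 10, 13, 14],
            [11, 12, 15, 16]]
    print(pool2d(fmap, 2, "max"))      # [[4, 8], [12, 16]]
    print(pool2d(fmap, 2, "average"))  # [[2.5, 6.5], [10.5, 14.5]]
```

Both variants halve each spatial dimension here; they differ only in how the four values of each window are summarized.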
Furthermore, the elitism technique is used: the leading solution is retained if no
better solution is found, which ensures that the optimization process for the population will
converge. If a solution’s fitness does not improve noticeably after three iterations of the
BER optimization procedure, the solution may have reached a local optimum,
in which case another exploring individual can be formed using the mutation operation.
For each iteration, the ABER selects the optimal option to implement, guaranteeing a
high standard of results. The elitism approach improves the effectiveness of algorithms,
but it can lead to early convergence in multimodal functions. The ABER’s mutation
process and ensuing search around members of the exploration group provide exceptional
exploration capabilities. Due to its robust exploratory capacities, the ABER can delay the
onset of convergence. In Algorithm 2, the ABER pseudo-code is displayed. To begin, we
feed the ABER some information by specifying the number of iterations, the size of the
population, and the mutation rate. The ABER then divides the participants into two groups:
the exploration group and the exploitation group. During iterations of the search for the
optimal solution, the ABER algorithm dynamically adjusts the size of each group. Each
group uses a different method to carry out its task. With each iteration, the ABER shuffles
the order of the solutions to increase diversity and exploration, so a solution that belongs
to the exploration group in one iteration may be part of the exploitation group in
the next. Because of the ABER’s elitist approach, the leader is unlikely to be removed as the
process iterates. The steps of the proposed ABER algorithm are presented in Algorithm 2.
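The control flow described above (two groups, per-iteration shuffling, a shrinking exploration group, elitism, and mutation after three stagnant iterations) can be sketched as follows. This is a structural illustration only: the position-update rules, the group-size schedule, the greedy acceptance step, and the toy sphere objective are assumptions made here, not the BER or ABER update equations of Algorithm 2.

```python
import random


def sphere(x):
    """Toy objective to minimize; a stand-in for the CNN validation error."""
    return sum(v * v for v in x)


def two_group_search(objective, dim, bounds, pop_size=30, iterations=500,
                     exploration_fraction=0.7, mutation_prob=0.5, seed=0):
    """Elitist exploration/exploitation search (illustrative, not ABER itself)."""
    rng = random.Random(seed)
    low, high = bounds

    def clip(v):
        return min(high, max(low, v))

    population = [[rng.uniform(low, high) for _ in range(dim)]
                  for _ in range(pop_size)]
    leader = min(population, key=objective)[:]
    leader_fit = objective(leader)
    history = [leader_fit]
    stagnant = 0

    for t in range(iterations):
        remaining = 1.0 - t / iterations
        k = 2.0 * remaining  # step scale decreasing from 2 towards 0
        # Assumed schedule: the exploration group shrinks over time.
        n_explore = max(1, int(pop_size * exploration_fraction * remaining))
        rng.shuffle(population)  # group membership changes every iteration

        for i, sol in enumerate(population):
            if i < n_explore:
                # Exploration: random step around the current solution.
                cand = [clip(v + k * rng.uniform(-1.0, 1.0)) for v in sol]
            else:
                # Exploitation: move part of the way towards the leader.
                cand = [clip(v + rng.random() * (b - v))
                        for v, b in zip(sol, leader)]
            if objective(cand) < objective(sol):  # greedy acceptance (assumed)
                population[i] = cand

        current = min(population, key=objective)
        current_fit = objective(current)
        if current_fit < leader_fit:
            # Elitism: the leader changes only when a better solution appears.
            leader, leader_fit, stagnant = current[:], current_fit, 0
        else:
            stagnant += 1

        if stagnant >= 3:
            # Mutation: replace the worst solution by a mutated leader copy.
            worst = max(range(pop_size), key=lambda j: objective(population[j]))
            population[worst] = [
                rng.uniform(low, high) if rng.random() < mutation_prob else v
                for v in leader
            ]
            stagnant = 0

        history.append(leader_fit)

    return leader, leader_fit, history


if __name__ == "__main__":
    best, best_fit, trace = two_group_search(sphere, dim=3, bounds=(-5.0, 5.0),
                                             iterations=100)
    print(best_fit, len(trace))
```

Because the leader is stored separately and replaced only by a strictly better solution, the recorded best fitness never worsens from one iteration to the next, which is the practical effect of the elitism step.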
4. Experimental Results
In this part, we provide and discuss the results of the proposed architecture for breast
cancer classification. Two datasets have been adopted in the conducted experiments,
and the achieved results are compared to the other techniques [53–57]. In addition,
five-fold cross-validation and a training/testing split of 70:30 are applied to obtain
reliable accuracy estimates. On the other hand, the proposed optimization approach is compared
to different recent approaches, including genetic algorithm (GA) [58], whale optimization
algorithm (WOA) [59], particle swarm optimization (PSO) [60], grey wolf optimization
(GWO) [61] and the standard Al-Biruni Earth radius (BER) [62]. The parameters of the CNN
are optimized using the suggested state-of-the-art BER method. Several experiments are
performed to arrive at the final findings, including (i) testing the adopted datasets based on
the extracted deep features using other models and (ii) testing the adopted datasets using
the extracted deep features and the optimized CNN. All tests are performed on a desktop
computer with 16 GB of RAM and an 8 GB graphics card, running MATLAB 2022a.
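The evaluation protocol mentioned above combines a 70:30 training/testing split with five folds of cross-validation. How the two are combined is not detailed in the text; the sketch below assumes that the folds are formed inside the 70% training portion while the 30% test portion is held out. Function names and the seed are hypothetical.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and testing parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(n_samples * train_fraction)
    return indices[:cut], indices[cut:]


def k_fold_indices(indices, k=5):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation


if __name__ == "__main__":
    # 6784 is the augmented size of Dataset-1 reported in Table 2.
    train_idx, test_idx = split_indices(6784)
    print(len(train_idx), len(test_idx))
    for fold_train, fold_val in k_fold_indices(train_idx, k=5):
        print(len(fold_train), len(fold_val))
```

Each training index appears in exactly one validation fold, and the held-out test indices are never used during cross-validation.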
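Metric | Value
Specificity | $\dfrac{TN}{TN + FP}$ (2)
Sensitivity | $\dfrac{TP}{TP + FN}$ (3)
Accuracy | $\dfrac{TP + TN}{TP + TN + FP + FN}$ (4)
Precision | $\dfrac{TP}{TP + FP}$ (5)
NPV | $\dfrac{TN}{TN + FN}$ (6)
F-score | $\dfrac{TP}{TP + 0.5\,(FP + FN)}$ (7)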
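Equations (2)–(7) can be evaluated directly from the four entries of a binary confusion matrix. The sketch below is a straightforward transcription of those formulas; the function name and the example counts are hypothetical and are not results from this work.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the metrics of Equations (2)-(7) from confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Return NaN instead of raising when a denominator is zero.
        return numerator / denominator if denominator else float("nan")

    return {
        "specificity": ratio(tn, tn + fp),                # Eq. (2)
        "sensitivity": ratio(tp, tp + fn),                # Eq. (3)
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),    # Eq. (4)
        "precision": ratio(tp, tp + fp),                  # Eq. (5)
        "npv": ratio(tn, tn + fn),                        # Eq. (6)
        "f_score": ratio(tp, tp + 0.5 * (fp + fn)),       # Eq. (7)
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the formulas.
    for name, value in classification_metrics(tp=90, tn=80, fp=20, fn=10).items():
        print(f"{name}: {value:.4f}")
```

For the hypothetical counts above, specificity is 80/100 = 0.8, sensitivity is 90/100 = 0.9, and accuracy is 170/200 = 0.85.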
Table 6. The configuration parameters used for the proposed ABER algorithm.
Parameter | Value
Number of runs | 30
Iterations count | 500
Population size | 30
K (decreases from 2 to 0) | 1
Exploration percentage | 70%
Mutation probability | 0.5
Random variables | [0, 1]
from 0.427 to 0.440, indicating that they can identify true positive cases with comparable
performance. The specificity values ranged from 0.925 to 0.949, indicating that the models
can correctly identify true negative cases with varying degrees of success. It is important to
note that sensitivity and specificity can be affected by the choice of the decision threshold,
and different thresholds may result in different performance levels. The Precision and
NPV are also relevant evaluation metrics for binary classification problems, as they provide
information on the prevalence of false positives and false negatives, respectively. The
Precision measures the proportion of predicted positive cases that are truly positive, whereas the
NPV measures the proportion of predicted negative cases that are truly negative. In this
case, the Precision ranged from 0.658 to 0.669, indicating that roughly one third of the
positive predictions were false positives. The NPV ranged from 0.846 to 0.889, indicating that
a smaller share (about 11% to 15%) of the negative predictions were false negatives. Finally, the F-score is a measure
that combines both precision and recall into a single value. It provides a valuable summary
of the model’s overall performance in correctly identifying positive cases.
In this case, the F-score values ranged from 0.521 to 0.529, indicating that the models
strike a similar balance between precision and recall. These evaluation
metrics provide a comprehensive view of the performance of the evaluated models for
breast cancer classification. By considering multiple metrics, it is possible to gain a more
nuanced understanding of the strengths and weaknesses of each model, and to make more
informed decisions about which model to use for a particular task. As presented in Table 7,
it can be shown that the performance of the AlexNet pre-trained model is superior to
the other models for both Dataset-1 and Dataset-2 and, thus, this model is adopted for
feature extraction.
Table 8. Evaluation of the results achieved by the proposed ABER-CNN compared to the baseline
CNN and optimized CNN using different optimization algorithms.
The other models achieved accuracy scores ranging from 0.914 to 0.943. It is important
to note that accuracy is only one evaluation metric, and other metrics such as sensitivity,
specificity, and F-score may be necessary to evaluate the models’ performance fully.
Additionally, further information about the dataset and the specific task would be necessary
to fully interpret and contextualize these results. These results suggest that the proposed
ABER-CNN model is a promising approach for breast cancer classification, achieving a
high accuracy score of 0.962. Similarly, the performance of the proposed approach on
Dataset-2 is also presented in Table 8. The results presented in this table support the
effectiveness and superiority of the proposed approach in breast cancer classification tasks
when tested on the adopted datasets. Figure 5 shows the confusion
matrices for the results of the proposed ABER-CNN approach applied to Dataset-1 and
Dataset-2. These matrices show that the proposed approach classifies the breast cancer
cases accurately, which supports its effectiveness in this domain
of medical diagnosis.
Figure 5. The confusion matrix of the classification results of breast cancer cases using the proposed
ABER-CNN applied to the adopted datasets (Dataset-1 and Dataset-2). (a) Confusion matrix for
Dataset-1. (b) Confusion matrix for Dataset-2.
Biomimetics 2023, 8, 270 15 of 24
The accuracy plot and accuracy histogram plot are valuable tools used to compare the
performance of several models in classifying breast cancer cases as shown in Figures 6–9 for
Dataset-1 and Dataset-2. In this context, the models evaluated include CNN, WOA-CNN,
GA-CNN, PSO-CNN, GWO-CNN, BER-CNN, and ABER-CNN, where ABER represents
the advanced Al-Biruni Earth radius optimization algorithm, and the proposed approach
is ABER-CNN. The accuracy plot visually presents the accuracy scores of each model,
allowing for a direct comparison of their performance. It typically displays the accuracy
rates on the y-axis and the different models on the x-axis. This plot enables researchers
to assess which model consistently achieves higher accuracy rates in classifying breast
cancer cases.
Figure 6. The accuracy of the classification results using the proposed approach compared to other
approaches applied to Dataset-1. The colors refer to the corresponding algorithms located on the
y-axis.
Figure 7. The accuracy of the classification results using the proposed approach compared to other
approaches applied to Dataset-2. The colors refer to the corresponding algorithms located on the
y-axis.
Figure 8. The accuracy histogram of the classification results using the proposed approach compared
to other approaches applied to Dataset-1.
Figure 9. The accuracy histogram of the classification results using the proposed approach compared
to other approaches applied to Dataset-2.
Similarly, the accuracy histogram plot provides a distribution of accuracy scores for
each model. It offers a more detailed view of the performance by illustrating the frequency
of accuracy scores within specific ranges. This plot allows for comparing the overall
accuracy and the accuracy distribution across different models. By analyzing these plots, it
becomes evident that the proposed optimized model, ABER-CNN, outperforms the other
models in classifying breast cancer cases. Its accuracy scores consistently exceed those
of CNN, WOA-CNN, GA-CNN, PSO-CNN, GWO-CNN, and BER-CNN. The superior
performance of ABER-CNN suggests that the advanced Al-Biruni Earth radius optimization
algorithm effectively enhances the CNN architecture for breast cancer classification. This
finding highlights the potential of the ABER-CNN model for more accurate and reliable
breast cancer diagnosis, paving the way for improved patient outcomes and healthcare
practices in the field. An additional experiment was performed to study the area under the curve
(AUC) for the results achieved by the proposed approach when applied to Dataset-1. The
results of this experiment are presented in Appendix A.
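As an illustration of how an accuracy histogram of this kind can be built, the sketch below counts how many accuracy scores of each model fall into fixed ranges. The accuracy values, the number of runs, and the bin edges are hypothetical and chosen only for demonstration; they are not the values behind Figures 8 and 9.

```python
from bisect import bisect_right


def histogram(scores, edges):
    """Count scores per bin; bins are [edges[i], edges[i+1]), last bin includes its right edge."""
    counts = [0] * (len(edges) - 1)
    for s in scores:
        i = bisect_right(edges, s) - 1
        if i == len(counts) and s == edges[-1]:
            i -= 1  # place a score equal to the last edge into the last bin
        if 0 <= i < len(counts):
            counts[i] += 1  # scores outside the edges are ignored
    return counts


# Hypothetical accuracy scores over repeated runs (illustration only)
runs = {
    "CNN": [0.912, 0.918, 0.921, 0.915, 0.925],
    "ABER-CNN": [0.958, 0.963, 0.961, 0.966, 0.955],
}
edges = [0.90, 0.92, 0.94, 0.96, 0.98]

for model, scores in runs.items():
    print(model, histogram(scores, edges))
```

A model whose counts concentrate in the higher bins, with little spread across bins, is both more accurate and more stable across runs.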
Table 9. The results of the statistical analysis performed on the results achieved by the proposed
approach compared to other approaches.
Table 10. The ANOVA test outcomes for the comparison models and the proposed approach.
The plots shown in Figures 10 and 11, which visualize the output of the ANOVA test, further
support the effectiveness of the proposed ABER-CNN model in breast cancer classification.
First, the QQ plot shows that the residuals align closely with the expected normal
distribution. This indicates that the normality assumption is met, enhancing the reliability
of the conclusions drawn from the test. Additionally, the Homoscedasticity plot reveals a
consistent spread of residuals across the levels of the independent variable, confirming the
homoscedasticity assumption. This suggests that the results are consistent across the
compared groups, further strengthening the robustness of the comparison. The Residual plot
shows minimal patterns or systematic deviations, indicating that no structure in the results
is left unexplained by the fitted model. Together, these diagnostics indicate that the ANOVA
assumptions hold, so the differences reported in favor of ABER-CNN can be interpreted with
confidence.
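The quantities inspected in these diagnostic plots can be sketched as follows: a one-way ANOVA compares the variation between group means with the variation within groups, and the residuals (each observation minus its group mean) are what the QQ, homoscedasticity, and residual plots examine. The accuracy values below are hypothetical and serve only as an illustration; the function name one_way_anova is likewise an assumption of this sketch.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F statistic, df_between, df_within, residuals) for a one-way ANOVA."""
    all_values = [x for g in groups.values() for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-group sum of squares: spread of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Residuals: deviation of each observation from its own group mean
    residuals = {name: [x - mean(g) for x in g] for name, g in groups.items()}
    ss_within = sum(r ** 2 for rs in residuals.values() for r in rs)
    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f_stat, k - 1, n - k, residuals


# Hypothetical accuracy scores per model (illustration only)
groups = {
    "CNN": [0.91, 0.92, 0.93],
    "BER-CNN": [0.94, 0.95, 0.96],
    "ABER-CNN": [0.96, 0.97, 0.98],
}
f_stat, df_between, df_within, residuals = one_way_anova(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F statistic means the group means differ by more than the within-group scatter would explain; similar residual spread in every group is what the homoscedasticity assumption requires.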
Figure 10. The plots visualizing the results of the ANOVA test based on Dataset-1. The blue dots
refer to the samples included in the ANOVA test.
Figure 11. The plots visualizing the results of the ANOVA test based on Dataset-2. The blue dots
refer to the samples included in the ANOVA test.
Furthermore, the Heatmap highlights the significance levels or p-values resulting from
the ANOVA test. The heatmap reveals that the ABER-CNN model exhibits significantly
higher accuracy rates than other models, such as CNN, WOA-CNN, GA-CNN, PSO-CNN,
GWO-CNN, and BER-CNN. The color-coded representation indicates the superiority of the
ABER-CNN model in classifying breast cancer cases, further supporting its effectiveness
and demonstrating its potential for improved patient outcomes and healthcare practices in
breast cancer diagnosis.
The results of the QQ plot, Homoscedasticity plot, Residual plot, and Heatmap col-
lectively confirm the effectiveness of the proposed ABER-CNN model in breast cancer
classification. These plots provide strong evidence of the model’s accuracy, adherence to
assumptions, and robust performance, solidifying its potential as a valuable tool in the
early detection and diagnosis of breast cancer.
Table 11. The Wilcoxon signed-rank test outcomes for the comparison models and the proposed
approach.
Because the p-values are derived from the exact probability distribution of the test
statistic, the Wilcoxon signed-rank test is considered an exact test. The p-values are exactly
0.002, which is a very small probability. The Wilcoxon signed-rank test verifies that there
are statistically significant differences in the effectiveness of the various models. It does
not, however, specify how large these differences are. The deviation values reveal the median
values of the performance differences, with ABER-CNN doing better than CNN by a
wide margin. One important thing to keep in mind about the Wilcoxon signed-rank test
as applied here is that it is a two-tailed test, which means that it can only tell whether the
models perform significantly better or worse than the theoretical median. It is not a test for
the direction of the differences in performance.
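A p-value of about 0.002 is what an exact two-sided signed-rank test returns when ten paired differences all share the same sign, since only 2 of the 2^10 equally likely sign patterns are that extreme. The sketch below enumerates the null distribution to show this. The sample size of ten and the difference values are assumptions made for illustration, and the function assumes no zero differences and no tied absolute values.

```python
from itertools import product


def wilcoxon_exact_p(differences):
    """Exact two-sided p-value of the Wilcoxon signed-rank test (no zeros, no ties)."""
    n = len(differences)
    # Rank the absolute differences from 1 (smallest) to n (largest)
    order = sorted(range(n), key=lambda i: abs(differences[i]))
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    w_plus = sum(r for r, d in zip(ranks, differences) if d > 0)
    total = n * (n + 1) // 2
    w_min = min(w_plus, total - w_plus)
    # Under the null hypothesis each rank is positive or negative with probability 1/2
    hits = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(range(1, n + 1), signs) if s)
        if w <= w_min:
            hits += 1
    # Double the one-sided tail probability (the null distribution is symmetric)
    return min(1.0, 2 * hits / 2 ** n)


# Hypothetical paired accuracy differences (ABER-CNN minus a baseline), illustration only
diffs = [0.041, 0.045, 0.039, 0.048, 0.043, 0.050, 0.038, 0.044, 0.047, 0.042]
p = wilcoxon_exact_p(diffs)
print(f"exact two-sided p-value: {p:.6f}")
```

With ten pairs this is the smallest p-value the exact two-sided test can produce, which is why the test alone cannot indicate how large the differences are.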
Author Contributions: Methodology, A.A.A. (Abdelaziz A. Abdelhamid), S.K.T. and A.I.; Soft-
ware, A.A.A. (Abdelaziz A. Abdelhamid), S.K.T., A.I., N.K. and L.A.; Data curation, D.S.K. and
A.A.A. (Amel Ali Alhussan); Writing—original draft, A.A.A. (Abdelaziz A. Abdelhamid), S.K.T., A.I.;
Writing—review & editing, A.A.A. (Amel Ali Alhussan), L.A., A.E.A., N.K. and D.S.K.; Funding
acquisition, A.A.A. (Amel Ali Alhussan) and D.S.K.; Validation and Resources, S.A.-O. All authors
have read and agreed to the published version of the manuscript.
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project number
(PNURSP2023R308), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Institutional Review Board Statement: Not applicable.
Data Availability Statement: The datasets employed in this research can be found at the following
links. Dataset-1: https://www.kaggle.com/datasets/skooch/ddsm-mammography, Dataset-2:
https://dx.doi.org/10.21227/9f0p-qx37.
Acknowledgments: The authors thank the Princess Nourah bint Abdulrahman University Researchers
Supporting Project number (PNURSP2023R308), Princess Nourah bint Abdulrahman University,
Riyadh, Saudi Arabia.
Conflicts of Interest: The authors declare that they have no conflict of interest to report regarding
the present study.
Appendix A
The Area under the ROC curve, shown in Figure A1, is a common metric used to
evaluate the performance of binary classifiers, such as a CNN model in this case, when
applied to Dataset-1. It measures the ability of the model to correctly classify positive and
negative cases across all possible classification thresholds. The results show that the AUC
score of the model is 1, which indicates perfect discrimination between the two classes
(controls and patients). As presented in Table A1, the standard error of the AUC is 0, which
means that the estimate of the AUC is very precise. The 95% confidence interval also
confirms that the true AUC value is within the range of 1.000 to 1.000, further supporting
the conclusion of perfect classification performance.
Figure A1. The area under the ROC curve for the results achieved by the proposed approach applied
to Dataset-1.
The p-value is 0.0002, which is less than the commonly used threshold of 0.05, indicating
that the observed AUC is significantly different from that of a random classifier. The data
used for the analysis consist of 10 controls (ABER-CNN) and 10 patients (BER-CNN). There
were no missing controls or patients in the dataset. These results suggest that the CNN
model performs excellently in discriminating between controls and patients, with perfect
classification performance according to the AUC metric.
Metric                     Value
Std. Error                 0
Area                       1
Controls (ABER-CNN)        10
Patients (BER-CNN)         10
Missing Patients           0
Missing Controls           0
p-value                    0.0002
95% confidence interval    1.000 to 1.000
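An AUC of exactly 1 arises whenever every value in one group exceeds every value in the other. The following sketch computes the AUC as the fraction of cross-group pairs that are ranked correctly, with ties counted as one half. The two groups of ten values are hypothetical and chosen to be perfectly separated; they are not the data behind Table A1.

```python
def auc_from_scores(higher_group, lower_group):
    """AUC as the probability that a value from higher_group exceeds one from lower_group."""
    wins = 0.0
    for a in higher_group:
        for b in lower_group:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5  # ties count as half a correctly ranked pair
    return wins / (len(higher_group) * len(lower_group))


# Hypothetical, perfectly separated groups of ten values each (illustration only)
group_a = [0.955, 0.958, 0.960, 0.961, 0.962, 0.963, 0.964, 0.965, 0.966, 0.968]
group_b = [0.930, 0.932, 0.934, 0.935, 0.937, 0.938, 0.940, 0.941, 0.943, 0.945]

auc = auc_from_scores(group_a, group_b)
print(f"AUC = {auc:.3f}")
```

When the groups overlap, some pairs are ranked incorrectly and the AUC drops below 1; for example, auc_from_scores([0.6, 0.8], [0.5, 0.7]) gives 0.75.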
References
1. Li, H.; Niu, J.; Li, D.; Zhang, C. Classification of breast mass in two-view mammograms via deep learning. IET Image Process.
2021, 15, 454–467. [CrossRef]
2. Azamjah, N.; Soltan-Zadeh, Y.; Zayeri, F. Global Trend of Breast Cancer Mortality Rate: A 25-Year Study. Asian Pac. J. Cancer Prev.
APJCP 2019, 20, 2015–2020. [CrossRef] [PubMed]
3. Medeiros, G.; Thuler, L.; Bergmann, A. Delay in breast cancer diagnosis: A Brazilian cohort study. Public Health 2019, 167, 88–95.
[CrossRef]
4. Tan, Y.J.; Sim, K.S.; Ting, F.F. Breast cancer detection using convolutional neural networks for mammogram imaging system. In
Proceedings of the 2017 International Conference on Robotics, Automation and Sciences (ICORAS), Melaka, Malaysia, 27–29
November 2017; pp. 1–5. [CrossRef]
5. Hekal, A.A.; Elnakib, A.; Moustafa, H.E.D. Automated early breast cancer detection and classification system. Signal Image Video
Process. 2021, 15, 1497–1505. [CrossRef]
6. Kurumety, S.K.; Howshar, J.T.; Loving, V.A. Breast Cancer Screening and Outcomes Disparities Persist for Native American
Women. J. Breast Imaging 2023, 5, 3–10. [CrossRef]
7. Attallah, O.; Anwar, F.; Ghanem, N.M.; Ismail, M.A. Histo-CADx: Duo cascaded fusion stages for breast cancer diagnosis from
histopathological images. PeerJ Comput. Sci. 2021, 7, e493. [CrossRef]
8. Menhas, R.; Umer, S. Breast Cancer among Pakistani Women. Iran. J. Public Health 2015, 44, 586–587.
9. Abdelhamid, A.A.; El-Kenawy, E.S.M.; Khodadadi, N.; Mirjalili, S.; Khafaga, D.S.; Alharbi, A.H.; Ibrahim, A.; Eid, M.M.; Saber,
M. Classification of Monkeypox Images Based on Transfer Learning and the Al-Biruni Earth Radius Optimization Algorithm.
Mathematics 2022, 10, 3614. [CrossRef]
10. Charan, S.; Khan, M.J.; Khurshid, K. Breast cancer detection in mammograms using convolutional neural network. In Proceedings
of the 2018 International Conference on Computing, Mathematics and Engineering Technologies (iCoMET), Sukkur, Pakistan, 3–4
March 2018; pp. 1–5. [CrossRef]
11. Ramadan, S.Z. Methods Used in Computer-Aided Diagnosis for Breast Cancer Detection Using Mammograms: A Review. J.
Healthc. Eng. 2020, 2020, 9162464. [CrossRef]
12. Shen, L.; Margolies, L.R.; Rothstein, J.H.; Fluder, E.; McBride, R.; Sieh, W. Deep Learning to Improve Breast Cancer Detection on
Screening Mammography. Sci. Rep. 2019, 9, 12495. [CrossRef]
13. Khafaga, D.S.; Ibrahim, A.; El-Kenawy, E.S.M.; Abdelhamid, A.A.; Karim, F.K.; Mirjalili, S.; Khodadadi, N.; Lim, W.H.; Eid, M.M.;
Ghoneim, M.E. An Al-Biruni Earth Radius Optimization-Based Deep Convolutional Neural Network for Classifying Monkeypox
Disease. Diagnostics 2022, 12, 2892. [CrossRef]
14. Naji, M.A.; Filali, S.E.; Aarika, K.; Benlahmar, E.H.; Abdelouhahid, R.A.; Debauche, O. Machine Learning Algorithms For Breast
Cancer Prediction And Diagnosis. Procedia Comput. Sci. 2021, 191, 487–492. [CrossRef]
15. Ragab, D.A.; Attallah, O.; Sharkas, M.; Ren, J.; Marshall, S. A framework for breast cancer classification using Multi-DCNNs.
Comput. Biol. Med. 2021, 131, 104245. [CrossRef]
16. Ragab, D.A.; Sharkas, M.; Attallah, O. Breast Cancer Diagnosis Using an Efficient CAD System Based on Multiple Classifiers.
Diagnostics 2019, 9, 165. [CrossRef]
17. Din, N.M.U.; Dar, R.A.; Rasool, M.; Assad, A. Breast cancer detection using deep learning: Datasets, methods, and challenges
ahead. Comput. Biol. Med. 2022, 149, 106073. [CrossRef]
18. Alhussan, A.A.; Khafaga, D.S.; El-Kenawy, E.S.M.; Ibrahim, A.; Eid, M.M.; Abdelhamid, A.A. Pothole and Plain Road Classifica-
tion Using Adaptive Mutation Dipper Throated Optimization and Transfer Learning for Self Driving Cars. IEEE Access 2022,
10, 84188–84211. [CrossRef]
19. Khafaga, D.S.; Alhussan, A.A.; El-Kenawy, E.S.M.; Ibrahim, A.; Eid, M.M.; Abdelhamid, A.A. Solving Optimization Problems of
Metamaterial and Double T-Shape Antennas Using Advanced Meta-Heuristics Algorithms. IEEE Access 2022, 10, 74449–74471.
[CrossRef]
Biomimetics 2023, 8, 270 23 of 24
20. Dhar, T.; Dey, N.; Borra, S.; Sherratt, R.S. Challenges of Deep Learning in Medical Image Analysis—Improving Explainability and
Trust. IEEE Trans. Technol. Soc. 2023, 4, 68–75. [CrossRef]
21. Ayana, G.; Dese, K.; Dereje, Y.; Kebede, Y.; Barki, H.; Amdissa, D.; Husen, N.; Mulugeta, F.; Habtamu, B.; Choe, S.W. Vision-
Transformer-Based Transfer Learning for Mammogram Classification. Diagnostics 2023, 13, 178. [CrossRef]
22. El-kenawy, E.S.M.; Albalawi, F.; Ward, S.A.; Ghoneim, S.S.M.; Eid, M.M.; Abdelhamid, A.A.; Bailek, N.; Ibrahim, A. Feature
Selection and Classification of Transformer Faults Based on Novel Meta-Heuristic Algorithm. Mathematics 2022, 10, 3144.
[CrossRef]
23. Awotunde, J.B.; Panigrahi, R.; Khandelwal, B.; Garg, A.; Bhoi, A.K. Breast cancer diagnosis based on hybrid rule-based feature
selection with deep learning algorithm. Res. Biomed. Eng. 2023, 39, 115–127. [CrossRef]
24. Atban, F.; Ekinci, E.; Garip, Z. Traditional machine learning algorithms for breast cancer image classification with optimized deep
features. Biomed. Signal Process. Control 2023, 81, 104534. [CrossRef]
25. Pereira, J.M.S.; Araújo De Santana, M.; Lins De Lima, C.; Fernandes De Lima, R.D.C.; Lopes De Lima, S.M.; Pinheiro Dos Santos,
W. Feature Selection Based on Dialectical Optimization Algorithm for Breast Lesion Classification in Thermographic Images. In
Research Anthology on Medical Informatics in Breast and Cervical Cancer; IGI Global: Hershey, PA, USA, 2021; pp. 47–71. [CrossRef]
26. Wetstein, S.C.; De Jong, V.M.T.; Stathonikos, N.; Opdam, M.; Dackus, G.M.H.E.; Pluim, J.P.W.; Van Diest, P.J.; Veta, M. Deep
learning-based breast cancer grading and survival analysis on whole-slide histopathology images. Sci. Rep. 2022, 12, 15102.
[CrossRef] [PubMed]
27. Eid, M.M.; El-Kenawy, E.S.M.; Khodadadi, N.; Mirjalili, S.; Khodadadi, E.; Abotaleb, M.; Alharbi, A.H.; Abdelhamid, A.A.;
Ibrahim, A.; Amer, G.M.; et al. Meta-Heuristic Optimization of LSTM-Based Deep Network for Boosting the Prediction of
Monkeypox Cases. Mathematics 2022, 10, 3845. [CrossRef]
28. Tummala, S.; Kim, J.; Kadry, S. BreaST-Net: Multi-Class Classification of Breast Cancer from Histopathological Images Using
Ensemble of Swin Transformers. Mathematics 2022, 10, 4109. [CrossRef]
29. El-Kenawy, E.S.M.; Mirjalili, S.; Abdelhamid, A.A.; Ibrahim, A.; Khodadadi, N.; Eid, M.M. Meta-Heuristic Optimization and
Keystroke Dynamics for Authentication of Smartphone Users. Mathematics 2022, 10, 2912. [CrossRef]
30. Mat Radzi, S.F.; Abdul Karim, M.K.; Saripan, M.I.; Abd Rahman, M.A.; Osman, N.H.; Dalah, E.Z.; Mohd Noor, N. Impact of
Image Contrast Enhancement on Stability of Radiomics Feature Quantification on a 2D Mammogram Radiograph. IEEE Access
2020, 8, 127720–127731. [CrossRef]
31. Singla, C.; Sarangi, P.K.; Sahoo, A.K.; Singh, P.K. Deep learning enhancement on mammogram images for breast cancer detection.
Mater. Today Proc. 2022, 49, 3098–3104. [CrossRef]
32. Falconi, L.G.; Perez, M.; Aguilar, W.G. Transfer Learning in Breast Mammogram Abnormalities Classification with Mobilenet and
Nasnet. In Proceedings of the 2019 International Conference on Systems, Signals and Image Processing (IWSSIP), Osijek, Croatia,
5–7 June 2019; pp. 109–114. [CrossRef]
33. Samee, N.A.; Alhussan, A.A.; Ghoneim, V.F.; Atteia, G.; Alkanhel, R.; Al-antari, M.A.; Kadah, Y.M. A Hybrid Deep Transfer
Learning of CNN-Based LR-PCA for Breast Lesion Diagnosis via Medical Breast Mammograms. Sensors 2022, 22, 4938. [CrossRef]
34. Hikmah, N.F.; Sardjono, T.A.; Mertiana, W.D.; Firdi, N.P.; Purwitasari, D. An Image Processing Framework for Breast Cancer
Detection Using Multi-View Mammographic Images. EMITTER Int. J. Eng. Technol. 2022, 10, 136–152. [CrossRef]
35. Alruwaili, M.; Gouda, W. Automated Breast Cancer Detection Models Based on Transfer Learning. Sensors 2022, 22, 876.
[CrossRef] [PubMed]
36. Almalki, Y.E.; Soomro, T.A.; Irfan, M.; Alduraibi, S.K.; Ali, A. Computerized Analysis of Mammogram Images for Early Detection
of Breast Cancer. Healthcare 2022, 10, 801. [CrossRef]
37. Khamparia, A.; Bharati, S.; Podder, P.; Gupta, D.; Khanna, A.; Phung, T.K.; Thanh, D.N.H. Diagnosis of breast cancer based on
modern mammography using hybrid transfer learning. Multidimens. Syst. Signal Process. 2021, 32, 747–765. [CrossRef] [PubMed]
38. Agarwal, R.; Díaz, O.; Yap, M.H.; Lladó, X.; Martí, R. Deep learning for mass detection in Full Field Digital Mammograms.
Comput. Biol. Med. 2020, 121, 103774. [CrossRef] [PubMed]
39. Chougrad, H.; Zouaki, H.; Alheyane, O. Deep Convolutional Neural Networks for breast cancer screening. Comput. Methods
Programs Biomed. 2018, 157, 19–30. [CrossRef] [PubMed]
40. Platania, R.; Shams, S.; Yang, S.; Zhang, J.; Lee, K.; Park, S.J. Automated Breast Cancer Diagnosis Using Deep Learning and Region
of Interest Detection (BC-DROID). In Proceedings of the 8th ACM International Conference on Bioinformatics, Computational
Biology,and Health Informatics, Boston, MA, USA, 20–23 August 2017; pp. 536–543. [CrossRef]
41. Guan, S.; Loew, M. Breast cancer detection using synthetic mammograms from generative adversarial networks in convolutional
neural networks. In Proceedings of the 14th International Workshop on Breast Imaging (IWBI 2018), Atlanta, GA, USA, 8–11 July
2018; Krupinski, E.A., Ed.; p. 43. . [CrossRef]
42. Aly, G.H.; Marey, M.; El-Sayed, S.A.; Tolba, M.F. YOLO Based Breast Masses Detection and Classification in Full-Field Digital
Mammograms. Comput. Methods Programs Biomed. 2021, 200, 105823. [CrossRef] [PubMed]
43. Ting, F.F.; Tan, Y.J.; Sim, K.S. Convolutional neural network improvement for breast cancer classification. Expert Syst. Appl. 2019,
120, 103–115. [CrossRef]
44. Zhang, Y.D.; Satapathy, S.C.; Guttery, D.S.; Górriz, J.M.; Wang, S.H. Improved Breast Cancer Classification Through Combining
Graph Convolutional Network and Convolutional Neural Network. Inf. Process. Manag. 2021, 58, 102439. [CrossRef]
Biomimetics 2023, 8, 270 24 of 24
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.