
Computers in Biology and Medicine 160 (2023) 106966


Optimized deep learning architecture for brain tumor classification using improved Hunger Games Search Algorithm

Marwa M. Emam (a), Nagwan Abdel Samee (b), Mona M. Jamjoom (c), Essam H. Houssein (a,*)

(a) Faculty of Computers and Information, Minia University, Minia, Egypt
(b) Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
(c) Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia

ARTICLE INFO

Keywords: Brain tumor; Deep learning; Residual network; Transfer learning; Convolutional neural network; Hunger games search (HGS); Local escaping operator; Brownian motion

ABSTRACT

One of the worst diseases is a brain tumor, which is defined by abnormal development of cells in the brain. Early detection of brain tumors is essential for improving prognosis, and classifying tumors is a vital step in the disease's treatment. Different classification strategies using deep learning have been presented for the diagnosis of brain tumors. However, several challenges exist, such as the need for a competent specialist in classifying brain cancers by deep learning models and the problem of building the most precise deep learning model for categorizing brain tumors. We propose an evolved and highly efficient model based on deep learning and improved metaheuristic algorithms to address these challenges. Specifically, we develop an optimized residual learning architecture for classifying multiple brain tumors and propose an improved variant of the Hunger Games Search algorithm (I-HGS) based on combining two enhancing strategies: the Local Escaping Operator (LEO) and Brownian motion. These two strategies balance solution diversity and convergence speed, boosting the optimization performance and staying away from local optima. First, we have evaluated the I-HGS algorithm on the IEEE Congress on Evolutionary Computation 2020 (CEC'2020) test functions, demonstrating that I-HGS outperformed the basic HGS and other popular algorithms regarding statistical convergence and various measures. The suggested model is then applied to the optimization of the hyperparameters of the Residual Network 50 (ResNet50) model (I-HGS-ResNet50) for brain cancer identification, proving its overall efficacy. We utilize several publicly available, gold-standard datasets of brain MRI images. The proposed I-HGS-ResNet50 model is compared with other existing studies as well as with other deep learning architectures, including Visual Geometry Group 16-layer (VGG16), MobileNet, and Densely Connected Convolutional Network 201 (DenseNet201). The experiments demonstrated that the proposed I-HGS-ResNet50 model surpasses the previous studies and other well-known deep learning models. I-HGS-ResNet50 acquired an accuracy of 99.89%, 99.72%, and 99.88% for the three datasets. These results efficiently prove the potential of the proposed I-HGS-ResNet50 model for accurate brain tumor classification.

1. Introduction

A brain tumor is an abnormal development of brain cells. It can be categorized as either non-cancerous (benign) or cancerous (malignant), and its symptoms may vary based on its size and location within the brain. Malignant tumors can originate within the brain or spread from other body parts. All types of brain tumors may create symptoms that differ depending on the part of the brain affected. These symptoms include headaches, seizures, eye issues, nausea, and mental disturbances. A brain tumor is one of the most deadly abnormalities that can afflict people of all ages and genders [1]. It is becoming increasingly prevalent among children because of a growth in technological use like mobile phones, tablets, and other devices [2].

Early detection of brain tumors is crucial for experts to choose the most effective course of therapy to prolong the patient's life [3,4]. Glioma and meningioma are two main tumors that can cause mortality if they are not detected early, and they are both lethal brain tumors [5]. The pituitary tumor is another variety of brain tumor. Adenohypophyseal cells, members of the neuroendocrine epithelial cell family that secrete hormones, are the primary source of pituitary tumors. The most successful way to diagnose brain tumors is through a biopsy. However,

* Corresponding author: Essam H. Houssein.
https://doi.org/10.1016/j.compbiomed.2023.106966
Received 10 February 2023; Received in revised form 5 April 2023; Accepted 19 April 2023
Available online 24 April 2023
0010-4825/© 2023 Elsevier Ltd. All rights reserved.

the biopsy method's invasive nature and potential for bleeding or even functional loss make it a risky way to make a diagnosis. Medical experts have started using medical imaging techniques more frequently to save time and provide better results. Brain tumor detection has long relied on a variety of medical imaging modalities. MRI and computed tomography are two image modalities commonly used to identify irregularities in the brain's shape, size, or location. Physicians prefer MRI over the other techniques, and researchers concentrate widely on it. Using MRIs, anomalies in brain tissue can be detected, and extensive information about the brain's structure is provided [5,6]. For many MRI scans, manual brain tumor classification is time-consuming. This problem might be resolved by automatic classification, requiring radiologists to only minimally intervene when classifying MRI images of brain tumors.

Machine learning (ML) has significantly advanced in the medical field. However, conventional ML methods have limitations such as lower accuracy, increased computational time and space, high susceptibility to errors, and a need for manual selection of algorithms [7]. Recent studies (Houssein and Emam et al. [8]; Yurdusev et al. [9]; Ezzat et al. [10]) have demonstrated that deep learning techniques do not require manually extracted features and combine feature extraction and classification in one process.

Deep learning (DL) is a class of artificial intelligence that uses artificial neural networks (ANNs) to simulate the human brain and gain knowledge from massive amounts of data. DL architectures are preferred over traditional machine learning algorithms because they can learn independently and focus on complex image features. New models are continually being developed to improve feature extraction in DL approaches, and these approaches have been used in different medical domains. The advantages of DL include the ability to handle big data, improved time efficiency, advanced analytics, proficiency with unstructured data, and affordability [11]. DL techniques encompass a range of methods commonly employed for applications including image processing, classification, and segmentation [8].

Recent studies have successfully applied DL approaches to a number of imaging applications. For instance, Zhifang et al. [12] suggested an approach that combines mathematical models of infectious diseases with DL, such as LSTM, to predict COVID-19. Their model improves single-day predictions by 50% compared to pure DL techniques and can be adapted to short- and medium-term predictions, making it more interpretable and robust. Similarly, Nusrat et al. [13] used a DL model for breast cancer detection. Moreover, DL methods play a crucial role in image segmentation problems [14-17]. For example, Guangqi et al. [14] designed a DL model explicitly for segmenting cervical cytology images. In contrast, Chen et al. [15] suggested a segmentation technique using a DL framework. Mei et al. [16] proposed a method for supervised segmentation utilizing a neural network, which involves a dual-branch soft-erase technique that enlarges the region of interest while preventing erroneous expansion of that region. These recent studies demonstrate the potential of DL in imaging fields and its ability to enhance prediction accuracy and solve complex problems.

Convolutional Neural Networks (CNNs) have become increasingly well-known for medical image classification and segmentation problems across multiple modalities, as seen in a review of current papers on DL applications [18]. CNN architectures have been successful in medical image analysis, such as recognizing cells, detecting tumors, and classifying skin diseases [11]. These architectures use layer-based feature extraction, and the values of their hyperparameters determine a CNN model's effectiveness. Finding the optimal values for these hyperparameters can be difficult, time-consuming, and involve trial and error. Hyperparameter tuning is considered an NP-hard optimization challenge regarding the vast search space. Metaheuristic algorithms are considered very effective in solving these kinds of problems [18].

Metaheuristics have been implemented to solve various real-world issues and have attracted great attention in classification problems. Because metaheuristics have significant performance and are simple to design, academics have demonstrated their knowledge addressing many challenging optimization problems, such as feature selection [19,20], global optimization [21], image segmentation [22-24], deep learning model optimization [8], renewable energy optimization problems [25,26], and drug design [27]. Randomization and the two types of searches, exploitation and exploration, are characteristics of metaheuristics. During the exploitation phase, the algorithm searches for a solution locally in the region of the best solution or another solution. In contrast, the exploration process searches across the entire search space. There are three main families of metaheuristic algorithms: (1) swarm methods that mimic the behavior of animals, birds, and humans; (2) evolutionary algorithms; and (3) natural algorithms that mimic the laws of physics and chemistry [21,23].

Various metaheuristic algorithms have been implemented in recent years. Some algorithms that were developed between 2016 and 2019 include the Whale Optimizer Algorithm (WOA) [28], Grey Wolf Optimization (GWO) [29], Harris Hawks Optimization (HHO) [30], and Colony Predation Algorithm (CPA) [31]. Between 2020 and 2021, some excellent algorithms were developed, including the Slime Mould Algorithm (SMA) [32], Marine Predators Algorithm (MPA) [33], and Aptenodytes Forsteri Optimization Algorithm (AFO) [34]. In 2022, two novel optimization algorithms were developed: the White Shark Optimizer (WSO) [35], inspired by the behavior of great white sharks, and the Komodo Mlipir Algorithm (KMA) [36], inspired by the behavior of Komodo dragons as well as the Javanese "mlipir" gait. Moreover, in 2022, Weighted Mean of Vectors (INFO) [37] was developed. In the same year, the Artificial Hummingbird Algorithm (AHA) [38] was implemented. Lastly, in 2023, RIME, a physics-based optimization algorithm, was proposed [39].

The Hunger Games Search (HGS) algorithm, implemented by Yang et al. [40], is a highly efficient metaheuristic algorithm that mimics the cooperative actions of animals and their hunger-driven movements. Inspired by the popular dystopian novel and movie franchise, the algorithm simulates a survival competition among agents to determine the best solution for a given optimization issue. The HGS algorithm has been proven effective through comparisons with other well-known algorithms on 23 functions and the IEEE CEC 2014 benchmarks. Its versatility has been proven by its usage in resolving a wide range of engineering issues. One of the key advantages of the HGS algorithm is its ability to handle both continuous and discrete optimization problems. Furthermore, it is easy to implement and adaptable to different optimization problems. The HGS algorithm can also handle multimodal optimization problems where the search space has multiple local optima.

Despite their usefulness, metaheuristic algorithms must balance exploration and exploitation to be effective. When it comes to high-dimensional problems, the HGS has trouble with slow convergence and getting stuck in local optima. It often only generates small, similar solutions at the end of iterations, preventing the search from progressing. According to the No Free Lunch (NFL) theory [41], no single metaheuristic algorithm can effectively address all issues. They all have limitations, including getting stuck in local optima, early convergence, and the need for global search capabilities. Many studies have suggested ways to overcome these limitations, such as adjusting the algorithm's focus on exploration or exploitation. Two primary ways to create a well-rounded and efficient algorithm include incorporating multiple metaheuristics and modifying or improving existing algorithms. The HGS could be improved for more complex optimization problems. One way to do this is by incorporating the Local Escaping Operator (LEO) [42] and the Brownian motion strategy [33]. The LEO has been shown to enhance the algorithm's performance. The Brownian motion strategy can increase search efficiency and enable agents to independently explore their surroundings, leading to a more thorough exploration of the problem domain.


1.1. Motivation

Classifying brain tumors from MRI images is a critical area of research in the biomedical domain. It is difficult because of the heterogeneity of the data, the wide range in image quality, and the demand for precision and efficiency. While medical image classification frameworks have matured over the years, there is still a need to improve their performance, especially in the context of brain tumor classification. In order to efficiently classify different types of brain tumors from MRI images, we present a hybrid model that combines a deep learning-based residual learning architecture with an enhanced metaheuristic algorithm. The goal is to develop the most effective model using a new technique and optimal hyperparameters for the ResNet50 model. This work aims to address the shortcomings of existing medical image classification frameworks and metaheuristic algorithms. Some problems plaguing currently available metaheuristic algorithms include their tendency to converge too quickly, their inability to escape from local optima, and their inability to conduct a truly global search. The HGS algorithm effectively stabilizes features and resolves both constrained and unconstrained problems. However, according to the NFL theorem, the studies demonstrate that no single metaheuristic is strong enough to solve all issues effectively. Therefore, the proposed I-HGS algorithm addresses these shortcomings by avoiding getting trapped in local optima, preventing premature convergence, and balancing the exploitation and exploration stages. Our proposed ResNet50-based architecture addresses some of the shortcomings of existing architectures, such as overfitting, low accuracy, and slow convergence. Classifying brain tumors from MRI scans is made easier and faster with the help of the ResNet50 model and the improved I-HGS algorithm. To sum up, this study provides a hybrid approach to the difficulties in efficient and accurate brain tumor classification from MRI images by combining an improved metaheuristic algorithm with a deep learning-based residual learning architecture. The proposed algorithm addresses the shortcomings of existing metaheuristic algorithms and optimizes the ResNet50 model's hyperparameters more effectively.

1.2. Contribution

The paper presents an enhanced version of the HGS algorithm called I-HGS, which combines two strategies: LEO and Brownian motion. The proposed algorithm is designed to solve global optimization problems and is evaluated on the CEC'2020 test suite challenges. To assess its effectiveness, I-HGS is compared with several other metaheuristic algorithms, including GWO [29], WOA [28], HHO [30], SMA [32], the Gradient-based optimizer (GBO) [43], the RUNge Kutta optimization algorithm (RUN) [44], and the original HGS [40]. Furthermore, the paper proposes an effective model for brain tumor MRI classification, called I-HGS-ResNet50, which uses the I-HGS algorithm for hyperparameter tuning of the ResNet50 model. The proposed model is evaluated on three MRI imaging benchmark datasets and achieves high classification accuracy. The performance of I-HGS-ResNet50 is compared against other pre-trained deep learning models, such as VGG16, DenseNet201, and MobileNet, as well as with some state-of-the-art studies. Overall, the paper provides a novel algorithm for global optimization and a powerful model for brain tumor classification that outperforms other existing models.

In summary, the following are the paper's main contributions:

- The paper proposes a novel deep learning-based classification model, I-HGS-ResNet50, for detecting brain tumors.
- A novel hyperparameter tuning technique, I-HGS, is presented that uses transfer learning and an improved metaheuristic algorithm.
- The I-HGS-ResNet50 model is trained and optimized using the I-HGS technique, which automatically determines the best hyperparameter values.
- An improved variant of the HGS algorithm (I-HGS) is developed, which uses the Local Escaping Operator (LEO) and Brownian motion to enhance performance.
- The suggested approaches have been shown to be superior to current techniques in extensive experimental evaluations, specifically for brain tumor classification and global optimization challenges.

1.3. Paper structure

The remaining sections have the following organization. Section 2 provides some state-of-the-art studies of previous work on brain tumor classification. Section 3 provides the methodologies on which the proposed I-HGS-ResNet50 model is based, such as the original HGS, the Brownian motion strategy, the LEO mechanism, the primary function of the CNN architecture, and transfer learning. Section 4 presents the proposed I-HGS algorithm. Moreover, in Section 5, the proposed hybrid I-HGS-ResNet50 classification model for brain tumors is described. In Section 6, I-HGS's performance is assessed using two experiments; the first experiment is presented in Section 6.1, which tests the performance of I-HGS as a global optimization technique on the CEC'2020 test functions. In addition, the second experiment is presented in Section 6.2, which discusses the experimental results of the I-HGS-ResNet50 brain tumor classification model. In Section 8, we conclude the paper and point out potential future avenues of research.

2. Related work

With the improvements of medical imaging, automated brain tumor detection systems based on MRI have been created to help physicians quickly identify and thoroughly analyze the disease to determine the best clinical procedures for the patient. With DL, ML, and medical image processing algorithms, diseases can be swiftly and precisely recognized for this purpose.

This section summarizes previous work on brain tumor classification using DL techniques. Several studies have proposed different ML and DL techniques for this task, such as fully connected networks, deep residual learning architectures, CNNs, and optimized deep residual learning models. Shahin et al. [4] provided a classification method that utilizes a fully deep-connected network for feature extraction, residual strip pooling, atrous spatial pooling, and classification. The model extracted contextual and local information to classify brain tumors and was tested on four benchmark datasets containing 9581 MRI images. Similarly, Kalaiselvi et al. [45] constructed six different convolutional neural network (CNN) architectures for classifying brain tumors, which varied in the number of layers. Two architectures utilized batch normalization and dropout layers, two used only dropout layers, and the last two used neither. The experiments indicated that the fourth architecture performed well, and the sixth architecture achieved the best results with an accuracy of 96%. Another work based on CNNs was proposed by Shanthi et al. [46]. To categorize brain MRI images, they proposed an autonomously optimized hybrid deep neural network (OHDNN) that combines a CNN with a long short-term memory (CNN-LSTM). The outcomes demonstrate that the OHDNN method obtained the best accuracy of 97.5%. Basaran [5] suggested a computer-based hybrid diagnosis technique for brain cancer with three phases. In the first stage, the features of the images were extracted using two commonly used approaches in the literature (Gray Level Co-occurrence Matrix and Local Binary Pattern). In the second phase, various CNNs were applied, and the results were evaluated by extracting the features of the images. In the final phase, all the acquired features were combined, and feature selection was made using a genetic algorithm, a particle swarm optimization algorithm, and an artificial bee colony optimization algorithm. The support vector machine classified the feature sets and obtained 98% accuracy. Moreover, Rajeev et al. [47] proposed a DL technique that uses a guided bilateral filter to pre-process the input images.


Tumor locations are then segmented by a threshold method, and the enhanced Gabor wavelet transform extracts the main edge features. The black widow optimization algorithm determines the best features. The proposed technique gained an accuracy of 98.4%. Another study using the CNN architecture was suggested by Mondal and Vimal [48]. They proposed a CNN architecture named BMRI-Net that uses an activation function named Parametric Flatten-p Mish to enhance performance and address the limitations of current activation functions. The architecture was validated on two datasets, Figshare and Br35H, and obtained accuracies of 97% and 98%, respectively. This technique's disadvantage is that the dataset is too small to test the CNN architecture on, and the authors did not apply any data augmentation techniques to mitigate this limitation. Also, the data are imbalanced. Kumar et al. [3] introduced an optimization-based DCNN. The BRATS dataset and the BRATS simulation were utilized in the studies. Using the fuzzy deformable fusion (FDF) method, the images were prepared and segmented. The Sine Cosine approach, inspired by dolphin echolocation, was used to fine-tune the FDF's sensitivity and precision. The proposed approach was shown to be 96.3 percent accurate.

Meanwhile, several methods have been developed based on pre-trained models. For example, Ismael et al. [49] proposed a deep residual learning architecture for classifying brain cancer. The architecture was tested on a benchmark dataset of 3064 MRIs representing three distinct brain tumors. The authors employed seven data augmentation techniques to prevent gradient vanishing and used the augmented dataset for training and testing. The results of the evaluation showed an accuracy of 98%. Moreover, Toğaçar et al. [50] provided a classification method that utilized the AlexNet and VGG16 deep networks for feature extraction. They enhanced the features using the hypercolumn method and employed the AlexNet and VGG16 architectures to extract the image features. Recurrent feature elimination was used for selecting the most relevant features. The SVM algorithm was used for classification, which obtained an accuracy of 96%. In addition, Çinar and Yildirim [51] proposed a modified ResNet50 architecture that effectively classifies brain MRI images. They added eight layers to the basic ResNet50 model. Several other models, including ResNet50, AlexNet, InceptionV3, DenseNet201, and GoogLeNet, were used to evaluate the model. The results of the studies demonstrated a 97% success rate for this strategy. Deepak and Ameer [52] proposed a classification system that utilizes a pre-trained GoogLeNet for feature extraction from MRIs. On the Figshare dataset, the experiment employed a five-fold cross-validation strategy. The suggested model obtained an accuracy of 98%. Similarly, Rehman et al. [53] presented a classification method that divides brain tumors into meningioma, glioma, and pituitary using three CNN architectures (AlexNet, GoogLeNet, and VGGNet). The models were trained using the Figshare dataset. Data augmentation techniques were applied to the MRI images to improve the model and avoid over-fitting. The fine-tuned VGG16 model acquired the highest accuracy of 98.69%. Another method using an ensemble of pre-trained models is that of Noreen et al. [54]. They suggested an ensemble method that uses Inception-V3 for feature extraction and combines the results from three classifiers, SVM, K-Nearest Neighbors, and Random Forest, for classification with an accuracy of 94.34%. Furthermore, Mesut et al. [55] presented a DL model called BrainMRNet for brain tumors based on a CNN architecture. A residual network based on attention modules and hypercolumn technology is incorporated into the design. Numerous techniques were used for image enhancement and preprocessing. After the attention modules selected the most significant characteristics, the images are sent to the convolutional layers. There was a 96 percent success rate with the proposed BrainMRNet model. Swati et al. [7] provided a block-wise fine-tuning method and a pre-trained CNN architecture for detecting brain tumors. After five rounds of cross-validation, the technique's accuracy averaged 94.82 percent when tested on a dataset of T1-weighted CE-MRIs.

The enormous number of hyperparameters prevents further improvement in brain tumor detection, despite the promising results gained by CNN architectures and fully connected deep learning models. Therefore, optimizing the hyperparameters of the deep learning architecture is crucial for improving CNN performance. Substantial effort has been reported for tuning DL hyperparameters. For example, Bacanin et al. [18] suggested an optimized CNN for classifying glioma brain tumor grade. The method uses an improved firefly algorithm (mFA) to tune the hyperparameters of the CNN architecture. The technique was tested on an axial dataset, and the performance was compared with a genetic algorithm (GA-CNN) and the original firefly algorithm. In the same context, Kumar et al. [56] presented a deep CNN method called Hyb-DCNN-ResNet 152 TL that combines a ResNet 152 transfer learning model with a nature-inspired approach. The images were pre-processed using the Otsu binarization method to remove noise and enhance image quality. After that, the features were extracted by the Gray-Level Co-Occurrence Matrix method. The COVID-19 optimization algorithm then fine-tuned the model. Also, Alshayeji et al. [57] presented an automated method for classifying brain tumors that uses two CNN models and optimizes the hyperparameters using Bayesian optimization. The method was applied to MRI images and obtained an accuracy of 97.37%. Moreover, Mehnatkesh and Hossein et al. [58] suggested a DL architecture that utilizes an improved ant colony optimization (IACO) algorithm to fine-tune the hyperparameters of a deep residual network. The IACO algorithm is enhanced with a differential evolution technique and a multi-population mechanism. In this work, the IACO algorithm is only used to tune the learning rate of the ResNet architecture, and the resulting IACO-ResNet model produced an accuracy of 98%. Therefore, metaheuristic algorithms demonstrate increased classification accuracy in DL models, as shown in the previous studies.

The studies mentioned above indicate a preference for using deep learning architectures, particularly pre-trained models, for brain tumor classification owing to their superior achievement. However, training these models can be challenging as it requires significant computational and memory resources and may face convergence and overfitting issues. To address these problems, researchers have employed metaheuristic algorithms to optimize and train deep learning models. Based on this, we attempted to develop an optimized deep residual learning model with the best hyperparameters. To achieve this, we first proposed an improved optimization algorithm called I-HGS to tackle the drawbacks of the HGS algorithm. This paper improves the HGS algorithm by combining two enhancing strategies: LEO and Brownian motion. We then used I-HGS to select the optimal hyperparameters of the ResNet50 model. We used three public datasets and applied image pre-processing techniques to enhance the images' quality. Additionally, we utilized data augmentation techniques to improve model classification accuracy and prevent overfitting. We integrated the ResNet50 model with the I-HGS algorithm to determine its hyperparameters, resulting in the I-HGS-ResNet50 model. We also employed the original HGS algorithm to select hyperparameters and compared the effectiveness of the I-HGS-ResNet50 model to other pre-trained models.

3. Preliminaries

This section explains the fundamental methodologies used in the proposed model, such as the structure of the hunger games search optimization algorithm, the local escaping operator (LEO), the Brownian motion strategy, the primary function of the CNN architecture, and transfer learning.

3.1. Hunger games search optimization algorithm

This subsection introduces the HGS algorithm and its mathematical model. The HGS algorithm, implemented by Yang et al. [40],


acts on the cooperative actions of animals and their hunger-driven movements. It uses a dynamic, fitness-wise search strategy based on the principle of "Hunger" as the most critical homeostatic inspiration and the motivation for choices and behaviors in the lives of all animals. The algorithm creates and uses an adaptive weight based on hunger to replicate the influence of hunger on all stages of the search process. It operates according to the computational logic rule (games) that animals play, and these competitive behaviors and games promote adaptive evolution by increasing the probability of acquiring food and survival. HGS's mathematical model was put forth based on a straightforward structure; however, it performs excellently. It consists of two stages: (1) the food approach and (2) the role of hunger.

3.1.1. Approach food

Social animals often collaborate when searching for food, but it is not uncommon for some individuals not to participate. The contraction mode and its approach behavior are modeled using the mathematical formulas below. Eq. (1) presents individuals' collaborative communication and foraging activity.

\vec{X}(q+1) =
\begin{cases}
\vec{X}(q) \cdot (1 + \mathrm{rand}(1)), & r_1 < z \\
\vec{W_1} \cdot \vec{X_b} + \vec{R} \cdot \vec{W_2} \cdot |\vec{X_b} - \vec{X}(q)|, & r_1 > z,\ r_2 > S \\
\vec{W_1} \cdot \vec{X_b} - \vec{R} \cdot \vec{W_2} \cdot |\vec{X_b} - \vec{X}(q)|, & r_1 > z,\ r_2 < S
\end{cases}  (1)

where r_1 and r_2 are random numbers in [0, 1], while rand(1) is a normally distributed random number. q is the actual step. \vec{W_1} and \vec{W_2} denote the hunger weights. \vec{X_b} indicates the position of the best candidate of the current iteration. \vec{X}(q) denotes the location of each candidate. z is a constant. S is defined by Eq. (2):

S = \mathrm{sech}\left( |fit_i - fit_{best}| \right), \quad i \in 1, 2, 3, \ldots, n  (2)

where fit_i is the objective function and fit_{best} is the best objective achieved in each process. sech is a hyperbolic function calculated by:

\mathrm{sech}(x) = \frac{2}{e^{x} + e^{-x}}  (3)

\vec{R} represents a number in [-a, a] that is used to set the activity range. It is defined by Eq. (4):

\vec{R} = 2 \times a \times rand - a  (4)

a = 2 \times \left( 1 - \frac{q}{it_{Max}} \right)  (5)

where rand is a random number in [0, 1] and it_{Max} is the uppermost number of steps.

3.1.2. Hunger role

During this phase, the specifications of hunger-driven activities are replicated. Eqs. (6) and (7) present the weights of hunger behavior.

\vec{W_1}(i) =
\begin{cases}
hungry(i) \cdot \frac{N_{pop}}{sumhungry} \times r_4, & r_3 < l \\
1, & r_3 > l
\end{cases}  (6)

\vec{W_2}(i) = \left( 1 - \exp\left( -|hungry(i) - sumhungry| \right) \right) \times r_5 \times 2  (7)

where N_{pop} represents the population size, hungry stands for the hunger of each individual, r_3, r_4, and r_5 are random numbers in [0, 1], and sumhungry is the sum of the hunger values of all solutions. hungry(i) is given by Eq. (8).

hungry(i) =
\begin{cases}
0, & AllFit_i = fit_{best} \\
hungry(i) + hungry_{new}, & AllFit_i \ne fit_{best}
\end{cases}  (8)

where AllFit_i is the objective function of each solution at the i-th iteration. hungry_{new} can be calculated by Eqs. (9) and (10):

hungry_{temp} = \frac{fit_i - fit_{best}}{fit_{worst} - fit_{best}} \times r_6 \times 2 \times (ub - lb)  (9)

hungry_{new} =
\begin{cases}
hungry_{largest} \times (1 + r), & hungry_{temp} < hungry_{largest} \\
hungry_{temp}, & hungry_{temp} \ge hungry_{largest}
\end{cases}  (10)

where ub and lb are the upper and lower bounds of the search area, r_6 is a random number in (0 \le r_6 \le 1), and fit_{worst} indicates the worst objective achieved in the current step.
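To make the update rules in Eqs. (1)-(10) concrete, the following is a minimal NumPy sketch of one HGS movement step. Function and variable names (positions, hungry, best_pos, etc.) are illustrative assumptions for this paper's description, not the authors' code, and the constants z and l are placeholders.

```python
# Illustrative sketch of the HGS "approach food" step (Eqs. (1)-(7)); not the authors' code.
import numpy as np

def hgs_position_update(positions, fitness, hungry, best_pos, best_fit, q, it_max,
                        z=0.03, l=0.08, rng=np.random.default_rng()):
    n, dim = positions.shape
    a = 2 * (1 - q / it_max)                              # Eq. (5)
    sumhungry = hungry.sum() + 1e-12
    new_pos = positions.copy()
    for i in range(n):
        S = 1.0 / np.cosh(abs(fitness[i] - best_fit))      # Eq. (2): sech(|fit_i - fit_best|)
        R = 2 * a * rng.random(dim) - a                    # Eq. (4)
        r3, r4, r5 = rng.random(3)
        W1 = hungry[i] * (n / sumhungry) * r4 if r3 < l else 1.0   # Eq. (6)
        W2 = (1 - np.exp(-abs(hungry[i] - sumhungry))) * r5 * 2    # Eq. (7)
        r1, r2 = rng.random(2)
        if r1 < z:                                         # Eq. (1), first branch
            new_pos[i] = positions[i] * (1 + rng.standard_normal())
        elif r2 > S:
            new_pos[i] = W1 * best_pos + R * W2 * np.abs(best_pos - positions[i])
        else:
            new_pos[i] = W1 * best_pos - R * W2 * np.abs(best_pos - positions[i])
    return new_pos
```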
3.2. Local Escaping Operator (LEO)

The LEO is included to improve the efficiency of an optimization process for resolving challenging issues. Using a local search strategy, such as simulated annealing or hill climbing, in conjunction with the Gradient-based optimizer (GBO) [42] algorithm can help improve the algorithm's ability to explore new regions and find better solutions in complex real-world problems. LEO is a technique that enhances the performance of gradient-based optimization algorithms by introducing a local search component. It does this by modifying the location of solutions during the optimization process in specific situations, such as when the algorithm is trapped in local optima. This helps improve the algorithm's convergence behavior and the quality of its solutions. Furthermore, LEO can help the algorithm escape local optima and explore new locations of the search area, resulting in better solutions. LEO aims to determine new solutions (X_{LEO}^{H}) with significant performance by using several solutions, including the best position X_{bbest}, two randomly selected solutions X_{r1}^{m} and X_{r2}^{m}, and a new randomly produced solution X_{k}^{m}. The solution X_{LEO}^{H} can be calculated using the following scheme:

If randN < p_r:

X_{LEO}^{H} =
\begin{cases}
x_{n}^{m} + f_1 \left( u_1 X_{bbest} - u_2 X_{k}^{m} \right) + f_2 \rho_1 \left( u_3 (X2_{n}^{m} - X1_{n}^{m}) + u_2 (X_{r1}^{m} - X_{r2}^{m}) \right) / 2, & randN < 0.5 \\
X_{bbest} + f_1 \left( u_1 X_{bbest} - u_2 X_{k}^{m} \right) + f_2 \rho_1 \left( u_3 (X2_{n}^{m} - X1_{n}^{m}) + u_2 (X_{r1}^{m} - X_{r2}^{m}) \right) / 2, & \text{otherwise}
\end{cases}  (11)

End

where f_1 and f_2 are uniformly distributed random numbers in [-1, 1], and P_r denotes a probability equal to 0.5. u_1, u_2, and u_3 are random numbers obtained from the following equations:

u_1 = \begin{cases} 2 \times randN, & \mu_1 < 0.5 \\ 1, & \text{otherwise} \end{cases}  (12)

u_2 = \begin{cases} randN, & \mu_1 < 0.5 \\ 1, & \text{otherwise} \end{cases}  (13)

u_3 = \begin{cases} randN, & \mu_1 < 0.5 \\ 1, & \text{otherwise} \end{cases}  (14)

where randN is a random value between zero and one and \mu_1 is a random value between 0 and 1. We can simplify the equations of u_1, u_2, and u_3 in the following mathematical representation:

u_1 = L_1 \times 2 \times randN + (1 - L_1)  (15)

u_2 = L_1 \times randN + (1 - L_1)  (16)

u_3 = L_1 \times randN + (1 - L_1)  (17)

where L_1 is a parameter with a value of zero or one (L_1 = 1 if \mu_1 < 0.5, and 0 otherwise).

X_{k}^{m} is calculated using Eq. (18):

X_{k}^{m} = \begin{cases} x_{randN}, & \text{if } \mu_2 < 0.5 \\ x_{p}^{m}, & \text{otherwise} \end{cases}  (18)


where x_{randN} is a new solution that can be computed using Eq. (19), x_{p}^{m} is a random solution selected from (X \in [1, 2, \ldots, N]), and \mu_2 denotes a random value in [0, 1].

x_{randN} = lb + randN(0, 1) \times (ub - lb)  (19)

Moreover, the phases of exploration and exploitation are balanced using \rho_1, defined by:

\rho_1 = 2 \times randn \times \alpha - \alpha  (20)

\alpha = \left| \beta \times \sin\left( \frac{3\pi}{2} + \sin\left( \beta \times \frac{3\pi}{2} \right) \right) \right|  (21)

\beta = \beta_{min} + (\beta_{max} - \beta_{min}) \times \left( 1 - \left( \frac{t}{t_{max}} \right)^{3} \right)^{2}  (22)

where \beta_{min} and \beta_{max} are equal to 0.2 and 1.2, respectively, t is the actual step, and t_{max} is the uppermost number of steps. The sine function in \alpha determines how the parameter \rho_1 changes, balancing the exploration and exploitation stages.

Eq. (18) can be simplified as follows:

X_{k}^{m} = L_2 \times x_{p}^{m} + (1 - L_2) \times x_{randN}  (23)

where L_2 has a value of 0 or 1: if the value of \mu_2 is less than 0.5, L_2 is set to 1; otherwise, it is set to 0.

3.3. Brownian motion

The step duration in the Brownian motion stochastic technique is controlled by the probability function specified by a normal Gaussian distribution with zero mean (\mu = 0) and unit variance (\sigma^2 = 1). For this motion, the controlling probability density function at point y is as follows [33]:

F_{BR}(y; \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left( -\frac{(y - \mu)^{2}}{2\sigma^{2}} \right) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{y^{2}}{2} \right)  (24)
3.4. Convolutional Neural Networks

CNNs are a category of DL that is popularly used for image classification tasks, designed to automatically recognize patterns from raw image data with minimal preprocessing. CNNs have seen advancements in design, such as transfer learning strategies like fine-tuning and freezing layers, which have greatly improved image classification performance and outperformed traditional machine learning models [53]. The topology of a CNN includes three primary layers [11]: the convolutional layer, which employs filter banks to combine input images and feature maps; the output of the weighted sum is then passed through an activation function such as the Sigmoid [59] or the rectified linear unit (ReLU) [60]. In the pooling layer, which minimizes the size of feature maps by grouping nearby pixels, the two most frequently used methods are max-pooling and average-pooling [61]. Based on the features collected by the CNN, the Fully Connected (FC) layer classifies the input data according to its category.

The hyperparameters of a CNN model are crucial, such as the learning rate, the number of units in a hidden layer, the dropout rate, the activation function, and the number of epochs. Therefore, according to some studies, these hyperparameters must be adjusted to produce excellent outputs and improve CNN accuracy [61]. The optimization of hyperparameters is regarded as an NP-hard problem due to the search space size. Metaheuristics are particularly effective in solving such problems.

3.5. Transfer learning

Transfer learning is a specific category of deep learning, particularly for visual categorization and image classification. It involves employing a pre-trained CNN architecture, as in the case of ImageNet [62]. On ImageNet, many models have completed pretraining, including AlexNet [63], VGG [64], ResNet [65], Inception [66], and DenseNet [67]. TL models can be used for feature extraction and fine-tuning. Feature extraction is performed using the pre-trained architecture's convolutional base, and a new classifier is trained using the extracted features. Fine-tuning involves retraining the unfrozen layers of the pre-trained architecture's hybrid model with the new classifier to make the features more relevant to the new task. The pre-trained architecture's features are intended to be changed by the fine-tuning procedure to make them more applicable to the new process [8].

The following steps describe how to use these methods:

- In a pre-trained model, the classifier base is deleted.
- The convolutional base is frozen.
- On top of the pre-trained architecture's convolutional base, a new classifier is provided and trained.
- Some layers of the pre-trained model's convolutional base are unfrozen.
- In the final step, the new classifier is trained jointly with these unfrozen layers.
3.6. Residual Neural Network

The Residual Network (ResNet), a highly renowned and significant deep learning model, emerged as the winner of the 2015 ILSVRC challenge [65]. ResNets comprise various layers, including convolutional, pooling, activation, and fully connected layers. ResNet architectures come in a wide range, including ResNet 18 and 34 with two-layer deep blocks and ResNet 50, 101, and 152 with three-layer deep blocks [68]. ResNet's strength enables us to train highly complex NNs with more than 150 layers. Before ResNet, vanishing and exploding gradients afflicted deep neural networks. ResNet proposed using residual connections to tackle this problem of deep networks. A residual block is distinct from a regular block in that it includes a shortcut that bypasses one or more layers, as displayed in Fig. 1.

An input x was considered with a ReLU activation function before being connected in Eq. (25), thus producing an output that is not identical to the input.

H(x) = F(x)  (25)

In Eq. (26), the outcome of the shortcut connection is added to the F(x) calculation.

H(x) = F(x) + x  (26)
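The identity shortcut of Eq. (26) can be sketched as a small Keras functional block. The filter sizes (64, 64, 256) follow the conv2 block described below (Eq. (27)), but the block as written is an illustration, not the library's or the authors' definition.

```python
# Sketch of a bottleneck residual block implementing H(x) = F(x) + x (Eq. (26)).
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters=(64, 64, 256)):
    f1, f2, f3 = filters
    shortcut = x                                          # the skip connection
    y = layers.Conv2D(f1, 1, padding="same", activation="relu")(x)
    y = layers.Conv2D(f2, 3, padding="same", activation="relu")(y)
    y = layers.Conv2D(f3, 1, padding="same")(y)           # F(x)
    if shortcut.shape[-1] != f3:                          # project x when channel counts differ
        shortcut = layers.Conv2D(f3, 1, padding="same")(shortcut)
    return layers.Activation("relu")(layers.Add()([y, shortcut]))   # H(x) = F(x) + x
```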
ResNet50 is a variant of the ResNet architecture that comprises 50 layers. It was trained on a large dataset containing at least one million images from the ImageNet database. Most ResNet models contain batch normalization and double- or triple-layer skips with nonlinearities. It is frequently compared with Highway-Net, a model that learns the skip weights using a separate weight matrix. The ResNet50 architecture consists of sequences of convolutional blocks with average pooling, and the final classification layer uses Softmax. Fig. 2 illustrates the ResNet architecture.

The ResNet50 model comprises five convolutional stages known as conv1, conv2, conv3, conv4, and conv5. When an image is input, it goes through the conv1 layer composed of 64 filters with a 7 x 7 kernel, then a max-pooling layer with a stride of two in width and height. The layers in conv2 are connected in a residual block. The matrix shown in Eq. (27) illustrates that there are two layers of 1 x 1 kernel size with 64 and 256 filters, respectively, and a third layer of 3 x 3 kernel size with 64 filters; this block is repeated three times and corresponds to the layers between the pooling layers. The convolution process was repeated up to the fifth stage, after which


Fig. 1. ResNet group blocks: (a) the plain block, (b) the residual block.

Fig. 2. Deep ResNet50 architecture.

average pooling was applied before the FC layer. Finally, a softmax function was used for classification.

\begin{bmatrix} 1 \times 1 & 64 \\ 3 \times 3 & 64 \\ 1 \times 1 & 256 \end{bmatrix}  (27)

4. Improved Hunger Games Search optimization algorithm (I-HGS)

This section presents the details of the improved hunger games search optimization algorithm (I-HGS) and how it improves the exploitation power and accelerates the exploration phase. It also discusses how getting stuck in local optima regions and rapid convergence are controlled. Moreover, it first presents the drawbacks of the original HGS algorithm. I-HGS involves two efficient schemes:

1. Local Escaping Operator.
2. Brownian motion.

4.1. Drawbacks of the original HGS algorithm

Although the HGS algorithm is highly efficient, it has some limitations in some optimization issues. The HGS algorithm gets trapped in sub-regions and suffers from an improper exploration-exploitation balance, especially in complicated and high-dimensional problems. Although each solution adjusts its location following the previous one, this lowers the algorithm's convergence rate and insufficiently covers the search area, causing the HGS to converge too soon. The NFL theory asserts that no single, superior algorithm can solve all optimization issues.

We develop a new version of the HGS to overcome these limitations. The LEO operator avoids getting stuck in local optima and counteracts slow convergence, since the LEO update positions follow a strong mechanism with a randomly determined solution through the search area. In addition, the Brownian motion technique is utilized to enhance search efficiency and assist agents in exploring their neighborhoods separately, which produces good exploration.

4.2. The proposed I-HGS initialization phase

The I-HGS method begins by randomly initializing a group of search agents as the starting point for the initialization process. Each individual has dimension (Dim) over the search area and is limited by lower and upper boundaries (L_b, U_b), as presented in Eq. (28):

X_j = L_b + rand \times (U_b - L_b), \quad j \in \{1, 2, \ldots, N\}  (28)

where X_j stands for the randomly initialized j-th solution vector and rand is a random value in [0, 1].
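The initialization of Eq. (28) maps directly to a one-line NumPy routine; the function name and array shapes are illustrative assumptions.

```python
# Sketch of the I-HGS population initialization in Eq. (28).
import numpy as np

def init_population(n_pop, dim, lb, ub, rng=np.random.default_rng()):
    # X_j = L_b + rand * (U_b - L_b), one row per search agent
    return lb + rng.random((n_pop, dim)) * (ub - lb)
```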
4.3. Process updating and objective assessment of I-HGS

To determine the best answers and refine the new ones in the following stage, the solutions must be evaluated in each step. The original HGS method is then used to update the individuals' locations after calculating each individual's fitness. The steps of the HGS are divided into two phases, as described in Section 3.1. The first phase is applied using Eq. (1) to Eq. (5). Then, the second phase is performed; in this phase, the Brownian motion is used, and the new individuals


are calculated using Eq. (30). After that, Eq. (8) to Eq. (10) of the HGS are performed to update the individuals' positions. The second process targets altering the solutions received from the previous process by utilizing the LEO operator (described in detail in Section 3.2). Depending on a specific criterion (randN < p_r), the final process is applied, where randN is a random value between zero and one and P_r is a probability value for performing the second process.

X_b(t+1) =
\begin{cases}
X_{LEO}^{H} \text{ using the LEO operator}, & \text{if } randN < p_r \\
X_{bbest} \text{ using the HGS updating process}, & \text{otherwise}
\end{cases}  (29)

\vec{W_2}(j) = \left( 1 - \exp\left( -|hungry(j) - sumhungry| \right) \right) \times r_5 \times 2 \times F_{BR}  (30)

where F_{BR} is the Brownian motion factor calculated by Eq. (24).

4.4. Termination criteria of I-HGS

The terminal conditions are examined; if they are met, the solution search is stopped because the best result has been achieved. The pseudocode of I-HGS is provided in Algorithm 1.

Algorithm 1 Pseudocode of the proposed I-HGS
  Initialize the parameters N_pop, z, sumhungry, Max-iterations T_m, Dim.
  Generate the positions of individuals x_i (i = 1, 2, 3, ..., N_pop).
  while (t <= T_m) do
      Compute the cost function of all individuals.
      Update the best fitness, best position, worst fitness, best individual.
      Calculate the hungry with Eq. (8)
      Compute the W1 with Eq. (6)
      > Using the Brownian motion
      Calculate the W2 by Eq. (30)
      for each individual do
          Compute the variation control S by Eq. (2).
          Update R with Eq. (4).
          Keep updating positions with Eq. (1).
      end for
      > Local escaping operator (LEO)
      if randN < p_r then
          if randN < 0.5 then
              Calculate X_LEO with the first rule of Eq. (11)
          else
              Calculate X_LEO with the second rule of Eq. (11)
          end if
      end if
      t = t + 1
  end while
  Return the best fitness, best position.
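The loop below is a self-contained, simplified Python rendering of Algorithm 1. It follows the structure of the pseudocode (HGS movement in Eqs. (1)-(10), the Brownian-scaled W2 of Eq. (30), and a reduced LEO step in the spirit of Eq. (11)), but the constants, the simplified hunger bookkeeping, and the condensed LEO rule are illustrative assumptions rather than the authors' implementation.

```python
# Simplified, runnable sketch of the I-HGS loop of Algorithm 1 (not the authors' code).
import numpy as np

def i_hgs(objective, lb, ub, dim, n_pop=30, t_max=500, z=0.03, l=0.08, pr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    X = lb + rng.random((n_pop, dim)) * (ub - lb)                    # Eq. (28)
    fit = np.array([objective(x) for x in X])
    hungry = np.zeros(n_pop)
    best, best_fit = X[fit.argmin()].copy(), fit.min()
    for t in range(t_max):
        worst_fit = fit.max()
        new_h = (fit - best_fit) / (worst_fit - best_fit + 1e-12) \
                * rng.random(n_pop) * 2 * (ub - lb)                   # Eqs. (9)-(10), simplified
        hungry = np.where(np.isclose(fit, best_fit), 0.0, hungry + np.maximum(new_h, 1e-3))
        sum_h = hungry.sum() + 1e-12
        a = 2 * (1 - t / t_max)                                       # Eq. (5)
        for i in range(n_pop):
            S = 1.0 / np.cosh(abs(fit[i] - best_fit))                 # Eq. (2)
            R = 2 * a * rng.random(dim) - a                           # Eq. (4)
            W1 = hungry[i] * n_pop / sum_h * rng.random() if rng.random() < l else 1.0
            W2 = (1 - np.exp(-abs(hungry[i] - sum_h))) * rng.random() * 2 \
                 * rng.standard_normal()                              # Eq. (30): Brownian factor
            r1, r2 = rng.random(2)
            if r1 < z:                                                # Eq. (1)
                X[i] = X[i] * (1 + rng.standard_normal())
            elif r2 > S:
                X[i] = W1 * best + R * W2 * np.abs(best - X[i])
            else:
                X[i] = W1 * best - R * W2 * np.abs(best - X[i])
            if rng.random() < pr:                                     # LEO step, condensed Eq. (11)
                r_a, r_b = rng.choice(n_pop, 2, replace=False)
                f1, f2 = rng.uniform(-1, 1, 2)
                x_rand = lb + rng.random(dim) * (ub - lb)             # Eq. (19)
                X[i] = best + f1 * (best - x_rand) + f2 * (X[r_a] - X[r_b]) / 2
        X = np.clip(X, lb, ub)
        fit = np.array([objective(x) for x in X])
        if fit.min() < best_fit:
            best, best_fit = X[fit.argmin()].copy(), fit.min()
    return best, best_fit

# Example: minimize the 10-dimensional sphere function.
# best, val = i_hgs(lambda x: float(np.sum(x**2)), lb=-100.0, ub=100.0, dim=10)
```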
4.5. I-HGS computational complexity

The complexity of the proposed I-HGS depends on the complexity of its fundamental operations, which are as follows: initialization, fitness assessment, sorting, hunger updating, weight updating, and position updating. Among them, N_n denotes the number of individuals within the population, d stands for the problem's dimension, and m_T denotes the maximum number of iterations. In the initialization phase, the fitness and hunger-update evaluations have a computational complexity of O(N_n), while sorting has a computational complexity of O(N_n log N_n). Updating weights and locations, on the other hand, has a computational complexity of O(N_n x d). In the LEO, the sorting process has a computational complexity of O(N_n log N_n) (quicksort), and the process of selecting representative solutions has O(N_n x d). From the previous investigation, we can determine the complexity of the I-HGS algorithm as O(N_n * (1 + m_T N_n (2 + log N_n + 2d) + N_n log N_n + N_n * d)).

5. Proposed model: Hybrid I-HGS-ResNet50 classification model for brain tumors

This section presents the proposed integration of the novel I-HGS with the ResNet50 model for hyperparameter-optimization-based brain tumor classification. To achieve the best accuracy, the I-HGS algorithm selects the optimal hyperparameter values of the ResNet50. After setting the best hyperparameters, ResNet50 is trained by TL approaches. When the training is performed, it is tested using a separate test set. The proposed model includes five phases, as shown in Fig. 3. The five phases of the proposed model operate in the following order:

1. Phase 1: Dataset acquisition.
2. Phase 2: Data preparation and augmentation.
3. Phase 3: Hyperparameter optimization.
4. Phase 4: The learning process.
5. Phase 5: Performance evaluation.

The following subsections provide extensive detail about each phase.

5.1. Phase 1: Dataset acquisition

In this subsection, the used datasets are presented in detail. The proposed model is assessed through three multi-class benchmark brain tumor MRI datasets, as listed in Table 1. Each dataset has axial, coronal, and sagittal views of the brain. Also, the various classes of brain tumors are derived from many different patients with a range of tumor grades, racial backgrounds, and ages. Fig. 4 presents samples from the used datasets with different classes. To provide a fair comparison, each dataset is split similarly to previous studies. The Cheng dataset (footnote 1) consists of 3064 grayscale MRI images in three categories: 708 Meningioma, 1426 Glioma, and 930 Pituitary tumors. The BT-large-4c dataset (footnote 2) includes 3264 images: 826 Glioma, 822 Meningioma, 827 Pituitary, and 395 normal. The BT-large-2c dataset (footnote 3) contains 3000 images: 1500 tumor images and 1500 normal images. All three datasets are split into an 80% training set and a 20% testing set of the overall dataset.

5.2. Phase 2: Data pre-processing and data augmentation

Data preprocessing. To improve the classification accuracy, it is important to crop out any empty region from MRI images affected by noise and distortion before using them for learning. The images were cleaned of noise and then cropped to remove empty spaces. The next step was to resize the images to the exact dimensions required by the ResNet model (224 x 224). This process involved several image processing techniques such as cropping, contour detection, extracting contour points, and resizing and augmenting the images.

Data augmentation. The proposed model may overfit owing to the few samples in the benchmark datasets. Image augmentation can be utilized to enhance model classification accuracy instead of collecting additional data [69]. So, the proposed model's effectiveness is improved by data augmentation methods that enlarge the number of samples. Various data augmentation techniques were applied to the training dataset to improve the accuracy of the model. This includes brightness and rotation adjustments, width and height shifts, shearing,

Footnotes:
1 https://figshare.com/articles/dataset/brain_tumor_dataset/1512427
2 https://www.kaggle.com/sartajbhuvaji/brain-tumor-classification-mri
3 https://www.kaggle.com/ahmedhamada0/brain-tumor-detection
3
sort), and the process of selecting representative solutions has 𝑂(𝑁𝑛 𝑥𝑑). https://www.kaggle.com/ahmedhamada0/brain-tumor-detection.


Fig. 3. The proposed I-HGS-ResNet50 model block-diagram.

Table 1
A summary of the datasets that have been used and the corresponding URLs.

Dataset no. | Dataset name | Classes (samples per class) | Total no. of images | URL
1 | Cheng Dataset | Glioma 1426; Meningioma 708; Pituitary 930 | 3064 | https://figshare.com/articles/dataset/brain_tumor_dataset/1512427
2 | BT-large-4c | Glioma 826; Meningioma 822; Pituitary 827; No tumor 395 | 3264 | https://www.kaggle.com/datasets/sartajbhuvaji/brain-tumor-classification-mri
3 | BT-large-2c | Tumor 1500; Non-Tumor 1500 | 3000 | https://www.kaggle.com/datasets/ahmedhamada0/brain-tumor-detection

zooming, and horizontal and vertical flips. Additionally, feature-wise centering and normalization were applied, and a reflect fill mode was used. The images were also resized to 180 x 180 before being used in other stages of the pipeline [69]. The training set of each dataset was enlarged by using the Keras ImageDataGenerator to implement data augmentation. The employed data augmentation methods are listed in Table 2 along with their values.

Table 2
The augmentation methods and corresponding values.

Augmentation method | Value
Shearing | 0.2
Zooming | 0.2
Width shift | 0.3
Height shift | 0.3
Rotation | 15
Feature-wise center | True
Feature-wise standard normalization | True
Fill mode | Reflect
Vertical flip | True
Horizontal flip | True
Feature-wise center True
Feature-wise standard normalization True
5.3. Phase 3: Hyperparameters selection Fill mode Reflect
Vertical flip True
Horizontal flip True
As discussed in Section 3.5, the TL model takes the pre-trained
model after constructing some adaptations. The important adjustment
is to swap out the existing classifier for a new classifier that involves
either adding new hyperparameters or modifying their values. The rate, and the neurons numbers in the first dense layer. Accordingly,
I-HGS-ResNet50 model proposed includes four hyperparameters that the search area is four dimensions, and each location in the search area
have been fine-tuned: the learning rate, the batch size, the dropout corresponds to a set of these four hyperparameters.


Fig. 4. MRI brain samples which contain four types of brain tumor.

5.4. Phase 4: Learning process

The feature extraction and fine-tuning techniques are utilized to prepare the ResNet50 architecture for learning on the given datasets. The convolutional base is not modified during the feature extraction procedure, and a new classifier base is implemented to replace the original one. The classifier includes four layers: a flatten layer, followed by a dense layer, then a new dropout layer, and finally another dense layer. The first dense layer utilizes the ReLU activation function. The proposed I-HGS algorithm selects the number of neurons and the dropout rate of the first dense layer. The second dense layer has four neurons and uses the softmax function. The final two layers of the convolutional base in the ResNet50 are retrained after training the provided classifier for a certain number of epochs.

5.5. Phase 5: Evaluation metrics of the classification model

Accuracy, Sensitivity, Specificity, Precision, and F-score metrics are used to assess the effectiveness of the proposed I-HGS-ResNet50 model.

Accuracy: This establishes the number of cases that have been accurately categorized. It is described by Eq. (31) [11]:

Accuracy = \frac{TP + TN}{TP + TN + FP + FN}  (31)

where TP denotes true positives, TN true negatives, FP false positives, and FN false negatives.

Sensitivity: This shows the proportion of positive instances that were accurately predicted, as calculated by Eq. (32) [11]:

Sensitivity = \frac{TP}{TP + FN}  (32)

Specificity: This measure reflects the proportion of negative (normal) instances that are correctly predicted. Eq. (33) is employed to express it [11]:

Specificity = \frac{TN}{TN + FP}  (33)

Precision: This metric is defined by Eq. (34) [11]:

Precision = \frac{TP}{TP + FP}  (34)

F-score: A measure of the accuracy of a test, calculated per class and averaged as in Eq. (35) [70]:

F1_j = \frac{TP_j}{TP_j + FP_j}, \qquad F1 = \frac{1}{q} \sum_{j=1}^{q} F1_j  (35)

6. Experimental evaluation and discussion

The purpose of this section is to evaluate I-HGS's performance through two experiments. The first experiment involves testing the algorithm on a standard set of benchmarks, while the second applies it to a real-world problem. This is a common approach when evaluating the effectiveness of newly developed algorithms. So, we assess the I-HGS algorithm's efficiency on CEC'2020 in Section 6.1. Secondly, Section 6.2 provides the results and discussion of the I-HGS-ResNet50 model-based brain tumor classification.
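For reference, Eqs. (31)-(34) reduce to a few lines of arithmetic on confusion-matrix counts; the helper below is a sketch with illustrative names.

```python
# Small helper evaluating Eqs. (31)-(34) from binary confusion-matrix counts.
def classification_metrics(tp, tn, fp, fn):
    accuracy    = (tp + tn) / (tp + tn + fp + fn)   # Eq. (31)
    sensitivity = tp / (tp + fn)                    # Eq. (32)
    specificity = tn / (tn + fp)                    # Eq. (33)
    precision   = tp / (tp + fp)                    # Eq. (34)
    return accuracy, sensitivity, specificity, precision
```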


Table 3
Benchmark functions of CEC'2020.
Class         No.   Function name                                        Optimal values
Unimodal      F1    Bent Cigar Function                                  100
Multi-modal   F2    Shifted and Rotated Schwefel's Function              1,100
              F3    Shifted and Rotated Lunacek bi-Rastrigin Function    700
              F4    Expanded Rosenbrock plus Griewangk Function          1,900
Hybrid        F5    Hybrid Function 1 (N = 3)                            1,700
              F6    Hybrid Function 2 (N = 4)                            1,600
              F7    Hybrid Function 3 (N = 5)                            2,100
Composition   F8    Composition Function 1 (N = 3)                       2,200
              F9    Composition Function 2 (N = 4)                       2,400
              F10   Composition Function 3 (N = 5)                       2,500

Table 4
Parametrization of I-HGS and the algorithms being compared.
Optimizers          Parameter values
Common parameters   Population size: NP = 30
                    Maximum fitness evaluations: MaxFE = 30,000
                    Number of controlling variables: Dim = 10
                    Number of experimental runs: 30
GWO                 a decreases from 2 to 0
WOA                 α decreases from 2 to 0; a2 decreases from −1 to −2
HHO                 β = 1.5
SMA                 z = 0.01
GBO                 pr = 0.5
RUN                 a = 20 and b = 12
HGS                 R in [−∞, ∞]; r1, r2 in [0, 1]
I-HGS               R in [−∞, ∞]; r1, r2 in [0, 1]

6.1. First experiment: Performance of the proposed I-HGS on addressing global optimization problems

The CEC'2020 benchmark functions illustrate how well the I-HGS performs. Its efficiency is contrasted with different popular metaheuristic algorithms: GWO [29], WOA [28], HHO [30], SMA [32], GBO [43], RUN [44], and the basic HGS algorithm [40].

We chose the compared algorithms based on various factors, including the size and complexity of the optimization problem and the algorithms' rate of convergence; the robustness of the optimization algorithms is another consideration. These comparison algorithms have already been used to address similar problems in several domains, including medicine, engineering, and other sophisticated applications.

6.1.1. CEC'2020 test suite

We check the effectiveness of I-HGS using the CEC'2020 benchmark functions [71] because they are among the most recent and challenging benchmarks. The CEC'2020 functions and the corresponding optimal values are presented in Table 3, and 3-D views of the CEC'2020 functions are shown in Fig. 5.

6.1.2. Parameter settings

The parameter settings applied to the algorithms are reported in Table 4. The parameters for all algorithms were left at their default settings, as suggested by a previous study [72], to ensure a fair comparison and avoid bias, and the simulation was run 30 times to further confirm the comparison's fairness. The effectiveness of the algorithms has been evaluated using both qualitative and quantitative criteria.

6.1.3. Performance measurements of the I-HGS algorithm across the CEC'2020 test functions

The efficacy of the suggested algorithm in selecting the best solutions, compared with the competing algorithms, is estimated using a set of performance metrics, defined as follows:

• Statistical mean: the average of the best fitness values obtained over all runs, as determined using Eq. (36):
\[ \mathrm{Mean} = \frac{1}{R_n} \sum_{j=1}^{R_n} \mathit{Fit}_j^{b} \tag{36} \]

• The worst value: the maximum fitness value reached by the algorithm, determined by Eq. (37):
\[ \mathrm{WORST} = \max_{1 \le j \le R_n} \mathit{Fit}_j^{b} \tag{37} \]

• The best value: the minimum fitness value, determined by Eq. (38):
\[ \mathrm{BEST} = \min_{1 \le j \le R_n} \mathit{Fit}_j^{b} \tag{38} \]

• Standard deviation (STD):
\[ \mathrm{STD} = \sqrt{\frac{1}{R_n - 1} \sum_{j=1}^{R_n} \left( \mathit{Fit}_j^{b} - \mathrm{Mean} \right)^2} \tag{39} \]

where R_n is the total number of runs.
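These per-run statistics can be computed in a few lines; the sketch below assumes one best objective value has been recorded per independent run (the array and function names are illustrative).

    import numpy as np

    def run_statistics(best_fitness_per_run):
        # best_fitness_per_run: one best objective value per independent run (Rn = 30 here).
        f = np.asarray(best_fitness_per_run, dtype=float)
        return {
            'Mean':  f.mean(),       # Eq. (36)
            'WORST': f.max(),        # Eq. (37)
            'BEST':  f.min(),        # Eq. (38)
            'STD':   f.std(ddof=1),  # Eq. (39), sample standard deviation with Rn - 1
        }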
6.1.4. Statistical significance analysis

This subsection gives exhaustive comparisons and data showing how I-HGS compares to HGS and various popular algorithms. Table 5 reports the mean, STD, best, and worst values for I-HGS and the other algorithms on the 10 benchmark functions from CEC'2020 with a dimension of 10; boldface is used to emphasize the best values. As reported in Table 5, the GBO algorithm obtains the best values on the F1 test function. The I-HGS algorithm outperforms the others on the multimodal test functions F2, F3, and F4. It produces superior results to the original HGS on the hybrid test functions F5, F6, and F7, in particular providing superior results for F5 and F6, while the GBO algorithm performs better on the F7 test function. The I-HGS algorithm outperforms the others on the composition functions F8 and F10, whereas on F9 the I-HGS performs well but the WOA algorithm surpasses the others. Overall, the I-HGS algorithm outperformed the compared algorithms on seven benchmark functions. Additionally, regarding the Friedman test, the proposed I-HGS performed better than the compared algorithms and obtained the highest ranking among all algorithms.
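The Friedman ranking reported at the bottom of Table 5 can be reproduced from the per-function mean results; the sketch below uses SciPy and assumes a hypothetical dictionary named results that maps each algorithm to its ten mean fitness values.

    import numpy as np
    from scipy.stats import friedmanchisquare, rankdata

    # results: hypothetical dict, algorithm name -> list of 10 mean fitness values (F1-F10).
    scores = np.column_stack([results[name] for name in results])  # shape: (10 functions, n algorithms)

    # Friedman test across algorithms, blocked by benchmark function.
    stat, p_value = friedmanchisquare(*scores.T)
    print(f'Friedman statistic = {stat:.3f}, p-value = {p_value:.4f}')

    # Mean rank per algorithm (lower is better), comparable to the last rows of Table 5.
    mean_ranks = rankdata(scores, axis=1).mean(axis=0)
    for name, rank in zip(results, mean_ranks):
        print(f'{name}: mean rank = {rank:.2f}')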


Fig. 5. The 3-D views of the CEC'2020 functions.

6.1.5. Boxplot analysis

The boxplot analysis is a powerful technique for showing the characteristics of the data distribution, and we used boxplots to present the distributions so that the results of Table 5 can be studied further. The boundaries of the lowest and highest whiskers are defined by the algorithm's minimum and maximum achieved values, and each rectangle's lower and upper ends represent the bottom and top quartiles; a narrower boxplot implies strong data agreement. The boxplots of F1 through F10 for Dim = 10 are shown in Fig. 6. The boxplots for I-HGS show a relatively small range of values for the majority of functions, with the minimum values among all the compared algorithms. Thus, the I-HGS algorithm is more promising than its competitors for most test functions.

6.1.6. Convergence curves analysis

This section provides an analysis of the convergence of I-HGS compared with the other algorithms. Fig. 7 displays the convergence plots of GWO, WOA, HHO, SMA, GBO, RUN, and the original HGS against the proposed I-HGS on the CEC'2020 benchmarks. The I-HGS algorithm achieves an early exploration compared to the HGS algorithm and the other competitors on the F1 test function in a unimodal space, as seen in Fig. 7(a). For the functions F2, F3, and F4 with multimodal features, shown in Figs. 7(b-d), I-HGS demonstrates superior performance compared to the other algorithms for F2 and F3, whereas the GBO and HGS algorithms perform similarly to I-HGS on the F4 test function. Additionally, as shown in Figs. 7(e-g), which depict the hybrid functions, I-HGS and GBO perform better when dealing with these types of functions. The composition functions (F8, F9, and F10), seen in Figs. 7(h-j), reveal that the proposed I-HGS performs relatively well when tackling problems in complex spaces. The I-HGS algorithm is observed to arrive at a stable point in the majority of the test functions across different dimensions; thus, the proposed I-HGS effectively converges to and reaches the optimal solution. Furthermore, the I-HGS yields faster convergence on most test functions than the other algorithms. This illustrates the steady performance of the I-HGS in converging to optimal solutions, making it an effective tool for addressing complicated problems.

6.2. Second experiment: Experimental results of the I-HGS-ResNet50 brain tumor classification model

The results of the proposed I-HGS-ResNet50 model, detailed in Section 5, are presented and analyzed in this section. All of the suggested model's processes have been implemented using Python and Keras [73] on Google Colaboratory [74]. Keras is a high-level neural network API written in Python that can run on top of TensorFlow or Theano; it was designed for quick use and for running numerous tests with minimal delay, which aids in conducting sufficient experiments. To make the results clearer, they are divided into six subsections.


Table 5
The results of 30 experiments on CEC’2020 functions, including the mean, STD, best, and worst fitness produced by competitor algorithms.
Function Metric GWO WOA HHO SMA GBO RUN HGS I-HGS
Mean 2.567E+07 2.796E+07 6.480E+05 6.883E+03 2.717E+03 3.851E+03 6.031E+03 3.126E+03
STD 8.669E+07 8.613E+07 6.385E+05 4.984E+03 2.342E+03 2.076E+03 4.207E+03 2.734E+03
F1
Best 1.411E+04 8.281E+05 2.763E+04 1.317E+02 1.010E+02 1.941E+02 1.266E+02 1.113E+02
Worst 3.530E+08 4.769E+08 3.264E+06 1.274E+04 9.534E+03 8.787E+03 1.274E+04 1.143E+04
Mean 1.629E+03 2.252E+03 2.027E+03 1.644E+03 1.896E+03 1.662E+03 1.641E+03 1.470E+03
STD 2.810E+02 3.749E+02 2.206E+02 2.142E+02 2.973E+02 2.145E+02 2.395E+02 1.911E+02
F2
Best 1.241E+03 1.467E+03 1.576E+03 1.232E+03 1.157E+03 1.118E+03 1.352E+03 1.110E+03
Worst 2.363E+03 2.940E+03 2.427E+03 2.161E+03 2.608E+03 1.959E+03 2.342E+03 1.876E+03
Mean 7.320E+02 7.757E+02 7.836E+02 7.304E+02 7.391E+02 7.601E+02 7.346E+02 7.241E+02
STD 1.043E+01 2.721E+01 2.138E+01 8.914E+00 1.240E+01 1.638E+01 1.598E+01 5.289E+00
F3
Best 7.168E+02 7.490E+02 7.398E+02 7.170E+02 7.178E+02 7.260E+02 7.125E+02 7.105E+02
Worst 7.542E+02 8.776E+02 8.267E+02 7.535E+02 7.699E+02 8.056E+02 7.740E+02 7.349E+02
Mean 1.906E+03 1.908E+03 1.907E+03 1.902E+03 1.902E+03 1.902E+03 1.903E+03 1.902E+03
STD 1.581E+01 4.288E+00 2.479E+00 6.181E−01 6.677E−01 1.334E+00 1.551E+00 9.700E−01
F4
Best 1.901E+03 1.902E+03 1.903E+03 1.901E+03 1.901E+03 1.901E+03 1.901E+03 1.900E+03
Worst 1.989E+03 1.920E+03 1.912E+03 1.903E+03 1.903E+03 1.907E+03 1.907E+03 1.905E+03
Mean 6.645E+04 1.979E+05 3.853E+04 7.861E+03 2.433E+03 4.103E+03 2.422E+04 4.671E+03
STD 1.360E+05 4.825E+05 4.496E+04 6.103E+03 3.193E+02 1.650E+03 4.496E+04 2.482E+03
F5
Best 2.616E+03 6.282E+03 3.669E+03 2.026E+03 1.938E+03 2.261E+03 2.768E+03 1.734E+03
Worst 3.888E+05 2.566E+06 1.394E+05 1.991E+04 3.400E+03 7.354E+03 2.467E+05 9.161E+03
Mean 1.606E+03 1.614E+03 1.620E+03 1.601E+03 1.602E+03 1.601E+03 1.603E+03 1.601E+03
STD 1.214E+01 1.252E+01 1.184E+01 2.665E−01 3.022E+00 2.673E−01 5.148E+00 2.147E−01
F6
Best 1.601E+03 1.601E+03 1.601E+03 1.601E+03 1.601E+03 1.601E+03 1.601E+03 1.601E+03
Worst 1.660E+03 1.659E+03 1.659E+03 1.601E+03 1.618E+03 1.602E+03 1.619E+03 1.601E+03
Mean 1.496E+04 1.434E+05 8.270E+03 5.757E+03 2.541E+03 4.551E+03 9.154E+03 3.027E+03
STD 3.586E+04 2.809E+05 6.394E+03 5.760E+03 2.543E+02 2.668E+03 7.119E+03 9.075E+02
F7
Best 2.764E+03 5.536E+03 2.743E+03 2.203E+03 2.107E+03 2.230E+03 2.369E+03 2.103E+03
Worst 2.029E+05 1.031E+06 3.092E+04 2.069E+04 3.025E+03 1.319E+04 2.996E+04 4.863E+03
Mean 2.308E+03 2.439E+03 2.429E+03 2.342E+03 2.301E+03 2.303E+03 2.449E+03 2.300E+03
STD 8.615E+00 3.800E+02 3.528E+02 2.474E+02 9.863E+00 1.967E+01 3.514E+02 1.504E+01
F8
Best 2.301E+03 2.237E+03 2.308E+03 2.224E+03 2.249E+03 2.230E+03 2.301E+03 2.221E+03
Worst 2.335E+03 3.803E+03 3.580E+03 3.648E+03 2.306E+03 2.316E+03 3.554E+03 2.306E+03
Mean 2.752E+03 2.749E+03 2.813E+03 2.758E+03 2.739E+03 2.740E+03 2.756E+03 2.664E+03
STD 1.299E+01 7.745E+01 1.055E+02 1.099E+01 6.638E+01 4.612E+01 7.055E+01 1.268E+02
F9
Best 2.734E+03 2.527E+03 2.501E+03 2.743E+03 2.500E+03 2.500E+03 2.500E+03 2.500E+03
Worst 2.784E+03 2.818E+03 3.009E+03 2.780E+03 2.796E+03 2.767E+03 2.804E+03 2.767E+03
Mean 2.936E+03 2.957E+03 2.938E+03 2.937E+03 2.923E+03 2.929E+03 2.943E+03 2.922E+03
STD 1.558E+01 3.497E+01 1.928E+01 3.300E+01 6.683E+01 2.304E+01 3.056E+01 2.304E+01
F10
Best 2.898E+03 2.906E+03 2.899E+03 2.898E+03 2.600E+03 2.900E+03 2.898E+03 2.898E+03
Worst 2.949E+03 3.053E+03 2.952E+03 3.024E+03 3.024E+03 2.950E+03 3.024E+03 2.952E+03
Friedman mean rank 5.60 6.2 5.14 4.80 4.50 4.20 2.35 2.01
Rank 7 8 6 5 4 2 3 1

6.2.1. The impact of data augmentation methods

The Keras ImageDataGenerator is used to construct augmentation methods that increase the number of images in the three used datasets; the employed data augmentation methods are listed in Table 2. Data augmentation plays a significant role in increasing the accuracy of the I-HGS-ResNet50 model. Owing to the data augmentation methods, the I-HGS-ResNet50 model improves its accuracy by 4.9% on the Cheng dataset, and its effectiveness on the large-4c and large-2c datasets improves by 2.3% and 3.5%, respectively. Fig. 8 shows the impact of using the data augmentation techniques on the proposed I-HGS-ResNet50 model.

6.2.2. Constructing the I-HGS for the hyperparameters selection and optimization phase

This subsection provides the lower and upper boundaries of the search area for the hyperparameters whose values have been determined by the proposed I-HGS algorithm.

Table 6 reports the settings of the parameter values for the combination of the I-HGS algorithm and the ResNet50 model. The table includes the following hyperparameters with their search spaces. Maximum number of iterations: this specifies the maximum number of iterations to run the optimization algorithm, and it is set to 20. Population size: this determines the size of the population, i.e., the number of candidate solutions in each iteration of the optimization algorithm; it is set to 30, meaning that 30 candidate solutions are evaluated in each iteration. Dimension: this specifies the dimensionality of the search space, which is the number of hyperparameters being optimized; in this case it is set to 4, indicating that four hyperparameters are optimized simultaneously. Learning rate: this controls the step size at which the optimization algorithm updates the neural network weights during training; the search space is set to [1e−7, 1e−3], which means that the learning rate can take any value between 1e−7 and 1e−3. It must be kept at a low value to avoid substantial changes and to preserve the features learned during the feature extraction step. Batch size: this specifies the number of samples used in each iteration during training; the search space is set to [1, 64]. Dropout rate: this controls the regularization of the neural network by randomly dropping out some units during training; the search space is set to [0.1, 0.9]. Number of neurons: this determines the number of neurons in the hidden (first dense) layer added to the ResNet50 architecture; the search space is set to [50, 600], indicating that the number of neurons can take any value between 50 and 600.
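A minimal sketch of how this four-dimensional search space could be encoded for the optimizer and decoded back into hyperparameters is shown below; the bounds follow Table 6, while the function names and the rounding of the integer-valued dimensions are illustrative assumptions.

    import numpy as np

    # Search-space bounds from Table 6: learning rate, batch size, dropout rate, neurons.
    LOWER = np.array([1e-7, 1.0, 0.1, 50.0])
    UPPER = np.array([1e-3, 64.0, 0.9, 600.0])

    def decode(position):
        # Clip a candidate position into the feasible region and map it to hyperparameters.
        p = np.clip(position, LOWER, UPPER)
        return {
            'learning_rate': float(p[0]),
            'batch_size':    int(round(p[1])),  # integer-valued dimension
            'dropout_rate':  float(p[2]),
            'n_neurons':     int(round(p[3])),  # integer-valued dimension
        }

    # Example: one random candidate drawn uniformly inside the bounds.
    candidate = LOWER + np.random.rand(4) * (UPPER - LOWER)
    print(decode(candidate))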


Fig. 6. The boxplot diagrams for I-HGS and its compared algorithms generated over CEC’2020 benchmarks with a dimension of 10.

Table 6
Settings of the parameter values for I-HGS in combination with ResNet50.
Hyperparameter                          Search space
Maximum No. of iterations               20
Population size                         30
Dimension                               4
Learning rate                           [1e−7, 1e−3]
Batch size                              [1, 64]
Dropout rate                            [0.1, 0.9]
Number of neurons                       [50, 600]
Uppermost training epochs of ResNet50   20

Table 7
The hyperparameters' optimal values as determined by I-HGS.
Hyperparameter                                 Best value
Learning rate                                  0.0001
Batch size                                     15
Dropout rate                                   0.3
Number of neurons in the first dense layer     150

Additionally, the training process for the ResNet50 model involves 20 epochs. The I-HGS algorithm's objective is to minimize the loss rate on the test set as far as feasible, and the effectiveness of the suggested I-HGS method is measured by assessing the test-set loss rate obtained with these parameters after 12 training cycles. After training the proposed I-HGS-ResNet50 model, the ideal values for the learning rate, batch size, dropout rate, and the number of neurons in the first dense layer have been selected. Table 7 lists the best hyperparameters determined by the I-HGS algorithm.

6.2.3. Learning the ResNet50 by the optimized hyperparameters

In this phase, the ResNet50 model was trained with the best hyperparameters selected by the proposed I-HGS. For N epochs, the ResNet50 model is trained using the data from the training set and evaluated using the data from the test set. Multiple experiments were done to find the value of N, and it was observed that the ResNet50 produced the optimal output on the test set at the 20th epoch. If no improvement was observed after five iterations, early stopping [75] was used to end the training process before reaching repetition N, in order to prevent overfitting. For the Cheng dataset and the BT-large-4c dataset, the ResNet50 was compiled using the sparse categorical cross-entropy [76], while for the BT-large-2c dataset the ResNet50 was compiled using the binary cross-entropy [77] and the Adam optimizer algorithm [78].
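A sketch of how this training phase could be wired together in Keras with the optimal values from Table 7 is given below. The build_model helper is the hypothetical classifier-head builder sketched in Section 5.4, and the dataset variables, the monitored quantity, and restore_best_weights are illustrative assumptions rather than the authors' exact configuration.

    from tensorflow.keras.optimizers import Adam
    from tensorflow.keras.callbacks import EarlyStopping

    # Best hyperparameters from Table 7.
    model = build_model(n_neurons=150, dropout_rate=0.3)

    # Sparse categorical cross-entropy for the multi-class datasets (Cheng, BT-large-4c);
    # 'binary_crossentropy' would be used instead for the two-class BT-large-2c dataset.
    model.compile(optimizer=Adam(learning_rate=1e-4),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    # Stop early if the monitored loss does not improve for five consecutive epochs.
    early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

    history = model.fit(x_train, y_train,
                        validation_data=(x_test, y_test),
                        epochs=20, batch_size=15,
                        callbacks=[early_stop])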


Fig. 7. Convergence plots of I-HGS and other algorithms on the CEC’2020 functions with a dimension of 10.

6.2.4. Measuring the performance of I-HGS-ResNet50 brain tumor classification

This section illustrates the performance evaluation of the I-HGS-ResNet50 model using the hyperparameter values obtained by the proposed I-HGS algorithm. Various quantitative measurements have been performed to estimate the effectiveness of the I-HGS-ResNet50 model, such as accuracy, sensitivity, specificity, precision, and F-score. The results of applying the proposed I-HGS-ResNet50 model to the three datasets (Cheng, large-4C, and large-2C) are displayed in Fig. 9. The experiments indicate that the I-HGS-ResNet50 model performed very well on all three datasets, achieving high scores for all evaluation metrics. For the Cheng dataset, the I-HGS-ResNet50 acquired an accuracy of 99.89%, a sensitivity of 99.91%, a specificity of 99.94%, a precision of 99.87%, and an F1-score of 99.92%. For the large-4C dataset, it acquired an accuracy of 99.72%, a sensitivity of 98.62%, a specificity of 98.78%, a precision of 99.66%, and an F1-score of 99.69%. For the large-2C dataset, it acquired an accuracy of 99.88%, a sensitivity of 99.55%, a specificity of 99.91%, a precision of 99.76%, and an F1-score of 99.82%. These experiments show that the I-HGS-ResNet50 model is highly accurate and reliable for this specific task.

Fig. 8. The impact of data augmentation techniques.

Fig. 9. The performance of the I-HGS-ResNet50 model as measured by different evaluation metrics across all datasets.

6.2.5. Comparison with the ResNet50 architecture on the brain tumor datasets
This section compares three different models: the proposed I-HGS-ResNet50 model, the original pretrained ResNet50 model, and the original HGS algorithm integrated with the ResNet50 model (HGS-ResNet50). The comparison is based on three different datasets, namely the Cheng, large-4C, and large-2C datasets. The main goal of this comparison is to showcase the effectiveness of the I-HGS algorithm in determining the optimal hyperparameters of the ResNet50 that result in the highest accuracy. Table 8 displays the evaluation results of the ResNet50, HGS-ResNet50, and I-HGS-ResNet50 models, assessed with different measurements such as accuracy, sensitivity, specificity, precision, and F1-score.

Table 8 demonstrates that the I-HGS-ResNet50 model surpasses the other two models on all three datasets across all evaluation metrics. On the Cheng dataset, the I-HGS-ResNet50 gained an accuracy of 99.89%, a sensitivity of 99.91%, a specificity of 99.94%, a precision of 99.87%, and an F-score of 99.92%. On the large-4C dataset, the I-HGS-ResNet50 achieved an accuracy of 99.72%, a sensitivity of 98.62%, a specificity of 98.78%, a precision of 99.66%, and an F-score of 99.69%. On the large-2C dataset, the I-HGS-ResNet50 achieved an accuracy of 99.88%, a sensitivity of 99.55%, a specificity of 99.91%, a precision of 99.93%, and an F-score of 99.74%.

Moreover, the HGS-ResNet50 model achieves more significant results than the ResNet50 model on all three datasets across all evaluation metrics. On the Cheng dataset, the HGS-ResNet50 achieved an accuracy of 97.96%, a sensitivity of 97.36%, a specificity of 97.03%, a precision of 97.12%, and an F-score of 97.18%. On the large-4C dataset, the HGS-ResNet50 achieved an accuracy of 98.11%, a sensitivity of 97.88%, a specificity of 97.15%, a precision of 98.02%, and an F-score of 98.06%. On the large-2C dataset, the HGS-ResNet50 achieved an accuracy of 98.44%, a sensitivity of 98.32%, a specificity of 98.13%, a precision of 98.22%, and an F-score of 98.06%. In contrast, the ResNet50 model yields markedly lower results on all three datasets across all evaluation metrics. On the Cheng dataset, the ResNet50 achieved an accuracy of 94.22%, a sensitivity of 93.01%, a specificity of 93.65%, a precision of 92.14%, and an F-score of 91.55%. On the large-4C dataset, the ResNet50 achieved an accuracy of 92.12%, a sensitivity of 91.22%, a specificity of 91.06%, a precision of 91.11%, and an F-score of 92.03%. On the large-2C dataset, the ResNet50 achieved an accuracy of 90.33%, a sensitivity of 89.08%, a specificity of 90.11%, a precision of 89.06%, and an F-score of 90.08%. In summary, based on the evaluation metrics, the I-HGS-ResNet50 model performed significantly better than the other two models on all three datasets.

6.2.6. Comparison with the state-of-the-art brain tumor classification methods

This section compares the I-HGS-ResNet50 model with previous models that have been used to classify the same three datasets for the problem of brain tumors. The effectiveness of the proposed I-HGS-ResNet50 model is compared with other methods using classification accuracy as the metric; only accuracy is reported because it is the measure most commonly used in the related works.


Table 8
Comparison between the proposed I-HGS-ResNet50, the HGS-ResNet50, and ResNet50 models.
Dataset Metric ResNet50 HGS-ResNet50 I-HGS-ResNet50
Accuracy 94.22% 97.96% 99.89%
Sensitivity 93.10% 97.36% 99.91%
Cheng-Dataset Specificity 93.65% 97.03% 99.94%
Precision 92.14% 97.12% 99.87%
F-score 92.55% 97.18% 99.92%
Accuracy 92.12% 98.11% 99.72%
Sensitivity 91.22% 97.88% 98.62%
large-4C Dataset Specificity 91.06% 97.15% 98.78%
Precision 91.11% 98.02% 99.66%
F-score 92.03% 98.06% 99.69%
Accuracy 90.11% 98.44% 99.88%
Sensitivity 89.80% 98.32% 99.55%
large-2C Dataset Specificity 90.33% 98.13% 99.91%
Precision 89.06% 98.22% 99.76%
F-score 90.08% 98.06% 99.82%

Table 9
A comparison of the I-HGS-ResNet50 model with other methods across all datasets, using classification accuracy.
Dataset Applied model Data augmentation Accuracy
CNN architecture [79] Yes 91.43%
Capsule networks [80] No 90.89%
CNN with Genetic Algorithm [81] Yes 94.02%
CNN architecture [82] Yes 96.13%
Cheng Dataset
Ensemble classifier [83] No 98.48%
Fine-tuned ResNet50 [49] Yes 99.00%
Optimized ResNet50(IACO-ResNet) [58] Yes 99.02%
Modular fully-CNN architecture [4] Yes 99.78%
Fine-tuned GoogLeNet [52] No 97.00%
CNN architecture [84] No 84.19%
CNN with piece-wise activation function [48] Yes 99.57%
Modified CNNBCN [85] No 95.49%
BrainMRNet [85] No 95.65%
CNN model using 22 layers [86] No 97.39%
Fine-tuned InceptionV3 [54] No 94.34%
ResNet18 + ShallowNet + SVM [87] No 97.25%
Fine-tuned VGG16 model [53] Yes 98.69%
Proposed I-HGS-ResNet50 Yes 99.89%
Ensemble of pre-trained DL model [88] Yes 91.58%
Modular fully-CNN architecture [4] Yes 96.03%
Ensemble learning [83] Yes –
large-4C Dataset
CNN architecture [89] Yes 98.8%
DenseNet121 + ResNet101 [88] – 98.67%
Proposed I-HGS-ResNet50 Yes 99.72%
large-2C Dataset Modular fully-CNN architecture [4] Yes 99.33%
Ensemble of pre-trained deep learning model [88] Yes 98.83%
Ensemble learning [83] Yes 98.00%
Proposed I-HGS-ResNet50 Yes 99.88%

Table 9 provides a comparison of the I-HGS-ResNet50 model with other state-of-the-art methods across the three datasets (Cheng Dataset, large-4C Dataset, and large-2C Dataset), using classification accuracy as the performance metric. The Applied model column specifies the type of deep learning architecture or algorithm used in each method. The Data augmentation column indicates whether the method used any data augmentation techniques during training, such as flipping or rotating images, to increase the size of the training set. Finally, the Accuracy column shows the classification accuracy achieved by each method on the corresponding dataset. The proposed I-HGS-ResNet50 achieves the highest accuracy of 99.89% on the Cheng dataset, 99.72% on the large-4C dataset, and 99.88% on the large-2C dataset. Among the compared methods, some are based on traditional convolutional neural networks (CNNs) with or without data augmentation, such as the models proposed by Paul et al. and Sultan et al. Others have used deep learning techniques such as capsule networks (Afshar et al.), fine-tuned GoogLeNet (Deepak et al.), fine-tuned InceptionV3 (Noreen et al.), and modified CNNBCN (Huang et al.). Ensemble learning approaches have also been used, such as the models proposed by Bansal et al., which have achieved high accuracy on all three datasets.

The first part of the table compares the I-HGS-ResNet50 model with other state-of-the-art methods on the Cheng dataset. The I-HGS-ResNet50 model achieved the highest accuracy of 99.89%, surpassing all other methods. The next best-performing method is the modular fully-CNN architecture proposed by Shahin et al. with an accuracy of 99.78%. Other methods, such as the CNN with a genetic algorithm, the multi-CNN architecture, and the optimized ResNet50, also perform well with accuracy above 94%; however, the proposed I-HGS-ResNet50 model significantly outperforms these methods in terms of accuracy.

The second part of the table compares the I-HGS-ResNet50 model with other methods on the large-4C dataset. The I-HGS-ResNet50 model achieved an accuracy of 99.72%, which is significantly higher than the other methods presented in the table. The next best-performing methods are the DenseNet121 and ResNet101 models proposed by Kang et al. with an accuracy of 98.67%, and the CNN architecture proposed by Naseer et al. [89] with an accuracy of 98.8%.


Finally, for the large-2C dataset, the proposed I-HGS-ResNet50 model obtained significant results and outperformed all other methods. The modular fully-CNN architecture proposed by Shahin et al. also reaches a high accuracy of 99.33%, and ensemble learning and pre-trained deep learning models yield accuracy rates in excess of 98%. Overall, the proposed I-HGS-ResNet50 model performed better than the other state-of-the-art approaches in this study's evaluation of brain tumor classification.

It is worth noting that data augmentation played a significant role in achieving high accuracy, as several models used various data augmentation techniques to enhance the training process. To further enhance classification accuracy, several models also employed an ensemble of models.

The results in the table show that deep learning methods are superior to other approaches for classifying brain tumors. The maximum accuracy is obtained with the suggested I-HGS-ResNet50 model across all three datasets, indicating its promise in practical settings. The results of the proposed I-HGS-ResNet50 model are superior to all previous studies that used CNN architectures on the same test datasets. Fine-tuned deep learning architectures have acquired better accuracy compared to other CNN architectures. It is observed that the proposed I-HGS-ResNet50 model is very competitive with the modular fully-CNN architecture and the IACO-ResNet method and outperforms them on some datasets. Furthermore, the proposed I-HGS-ResNet50 is superior to all fine-tuned methods such as the fine-tuned ResNet50, fine-tuned VGG16, and fine-tuned Inception-V3 models. Moreover, data augmentation methods are essential in improving the performance of brain tumor classification models.

6.2.7. Comparison with popular pre-trained deep learning models

Four popular pre-trained DL architectures are compared with the I-HGS-ResNet50 model in this section to demonstrate the overall effectiveness of the I-HGS-ResNet50 model. The deep learning models being compared are VGG16 [90], MobileNet [91], and DenseNet201 [92], configured by manual search.

We selected these models based on their popularity and performance in previous studies on medical image analysis tasks. VGG16 is a widely used deep learning model with high accuracy in image classification tasks. MobileNet is a lightweight and efficient model that can run on mobile devices, making it a practical choice for real-world medical applications. DenseNet201 is a newer model with promising results in various medical imaging tasks, including brain tumor classification.

The hyperparameters for these models were chosen by manual search, with a batch size of 32, a dropout rate of 0.5, and 250 neurons in the first dense layer. The comparison revealed that the I-HGS-ResNet50 architecture outperformed the VGG16, MobileNet, and DenseNet201 architectures. Table 10 presents a comprehensive comparison using various quantitative evaluation metrics for the Cheng dataset; it compares the performance of four models: the proposed I-HGS-ResNet50, VGG16, MobileNet, and DenseNet201. The proposed I-HGS-ResNet50 model outperformed all the other models on all metrics, including accuracy, sensitivity, specificity, precision, and F-score. Overall, the proposed model is 99.89% accurate, which is the maximum accuracy achieved. Sensitivity measures how well a model identifies true positives, whereas specificity measures how well it identifies true negatives; the suggested model offers a superior ability to distinguish between positive and negative cases, as indicated by its high sensitivity and specificity. Precision is the ratio of correct positive predictions to all positive predictions made by the model; once again, the suggested model outperforms the competition, demonstrating its superior capacity to reduce false positives. Last but not least, the F-score combines precision and recall to provide a comprehensive evaluation of the model, and the proposed model demonstrates its superior performance by achieving the highest F-score. Overall, Table 10 shows that the proposed I-HGS-ResNet50 model outperforms the other models on the Cheng dataset.

7. Strength and weakness of the proposed method

Regarding brain cancer classification, we applied a deep residual learning model to determine the kind of brain tumor as precisely as practicable. First, we considered four pre-trained deep learning models, including VGG16, MobileNet, DenseNet201, and ResNet50, and selected ResNet50 to improve its accuracy further. To achieve a model with the best hyperparameters, we constructed an optimized architecture that tunes the hyperparameters of the chosen model. For this purpose, we proposed an improved Hunger Games Search algorithm called I-HGS, which adds two powerful mechanisms to the original HGS to address its limitations in determining the optimal hyperparameters of the ResNet50 model.

The benefits and drawbacks of the I-HGS-ResNet50 model are discussed in this section. The proposed method has four primary benefits. First, the improvement to HGS has increased the convergence speed and the precision of choosing the optimal hyperparameters; it boosts the exploration ability of HGS and improves both the exploration and exploitation phases, as evaluated on the CEC'2020 suite, which presents extremely complex problem landscapes. Second, when the I-HGS approach is used to determine the ResNet50 model's ideal hyperparameter values, the accuracy is 99.89%, 99.72%, and 99.88% for the three datasets evaluated; the ResNet50 model performs much better when classifying MRI images of brain tumors when the I-HGS is used. The suggested algorithm determines the best values of the hyperparameters, in contrast to conventional methods that establish these values manually by trial and error. Third, the quality of network training has been enhanced by effective preprocessing performed on the input images before model training, and the performance of the suggested model has been assessed on popular brain tumor datasets. Fourth, using data augmentation techniques helps improve accuracy and avoid overfitting. The outcomes proved that the suggested model was superior to the other possibilities; consequently, brain cancers can be identified using the proposed optimization algorithm and the optimized ResNet50 model.

One of the primary issues of the suggested model is that the I-HGS algorithm implementation is time-consuming. In addition, the No Free Lunch (NFL) theorem argues that no single optimization algorithm can excel at every optimization problem, and the I-HGS algorithm, like other metaheuristic methods, follows this rule. Despite this, the I-HGS algorithm outperforms several other well-established and recent algorithms. We intend to expand our research beyond classification and investigate tumor localization using segmentation methods. Additionally, we aim to apply the proposed optimization algorithm to popular segmentation models.
18

Table 10
Comparison of the I-HGS-ResNet50 model's performance with the VGG16, MobileNet, and DenseNet201 models on the Cheng dataset.
Metrics       Proposed I-HGS-ResNet50   VGG16    MobileNet   DenseNet201
Accuracy      99.89%                    96.33%   94.03%      95.87%
Sensitivity   99.91%                    96.92%   93.78%      94.69%
Specificity   99.94%                    95.8%    93.33%      94.01%
Precision     99.87%                    96.02%   93.89%      95.06%
F-score       99.92%                    96.03%   94.07%      95.88%

8. Conclusions and future work

In this paper, a reliable method is proposed for analyzing MRI scans for signs of brain tumors. The suggested approach utilizes hyperparameter optimization to implement a deep residual learning architecture for feature extraction from brain images and tumor classification. We also proposed an improved metaheuristic algorithm (I-HGS) that integrates two enhancing techniques: the Local Escaping Operator (LEO) and Brownian motion. The I-HGS algorithm mitigates the drawbacks of the original HGS, such as its tendency to get trapped in local optima regions and its need for a proper balance between exploitation and exploration. The I-HGS was used for solving global optimization problems and applied to optimizing the hyperparameters of the ResNet50 model for brain tumor classification. Three benchmark brain tumor datasets have been used to assess the proposed model. The proposed classification model is called I-HGS-ResNet50 and contains five main phases: (1) dataset acquisition, (2) pre-processing and augmentation, (3) hyperparameter selection, (4) learning process, and (5) performance evaluation. The I-HGS-ResNet50 model was compared with previous methods and well-known pre-trained deep learning architectures and achieves the highest classification accuracy. The experiments demonstrated the proposed model's effectiveness in classifying brain tumors.

In future work, we aim to test the proposed model using a variety of hybrid datasets and evaluate its performance. Additionally, we will experiment with different pre-trained models in combination with the I-HGS algorithm for classifying brain tumors. Furthermore, we will explore utilizing various advanced metaheuristic algorithms for optimizing the model's hyperparameters. Finally, we will investigate the potential of the proposed model in solving other diagnostic tasks.

Compliance with ethical standards

This article does not contain any studies with human participants or animals performed by any of the authors.

Funding

This research was funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R104), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

CRediT authorship contribution statement

Marwa M. Emam: Software, Investigation, Visualization, Methodology, Resources, Conceptualization, Validation, Data curation, Formal analysis, Writing – review & editing. Nagwan Abdel Samee: Funding acquisition, Resources, Conceptualization, Data curation, Formal analysis, Validation, Writing – review & editing. Mona M. Jamjoom: Resources, Conceptualization, Data curation, Formal analysis, Validation, Writing – review & editing. Essam H. Houssein: Supervision, Methodology, Conceptualization, Investigation, Formal analysis, Writing – review & editing. All authors read and approved the final paper.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.

Acknowledgments

The authors would like to express their gratitude to Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R104), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

References

[1] M. Nazir, S. Shakil, K. Khurshid, Role of deep learning in brain tumor detection and classification (2015 to 2020): A review, Comput. Med. Imaging Graph. 91 (2021) 101940.
[2] A.M. Sarhan, et al., Brain tumor classification in magnetic resonance images using deep learning and wavelet transform, J. Biomed. Sci. Eng. 13 (06) (2020) 102.
[3] S. Kumar, D.P. Mankame, Optimization driven deep convolution neural network for brain tumor classification, Biocybern. Biomed. Eng. 40 (3) (2020) 1190–1204.
[4] A.I. Shahin, W. Aly, S. Aly, MBTFCN: A novel modular fully convolutional network for MRI brain tumor multi-classification, Expert Syst. Appl. 212 (2023) 118776.
[5] E. Başaran, A new brain tumor diagnostic model: Selection of textural feature extraction algorithms and convolution neural network features with optimization algorithms, Comput. Biol. Med. 148 (2022) 105857.
[6] G. Çelik, M.F. Talu, A new 3D MRI segmentation method based on generative adversarial network and atrous convolution, Biomed. Signal Process. Control 71 (2022) 103155.
[7] Z.N.K. Swati, Q. Zhao, M. Kabir, F. Ali, Z. Ali, S. Ahmed, J. Lu, Brain tumor classification for MR images using transfer learning and fine-tuning, Comput. Med. Imaging Graph. 75 (2019) 34–46.
[8] E.H. Houssein, M.M. Emam, A.A. Ali, An optimized deep learning architecture for breast cancer diagnosis based on improved marine predators algorithm, Neural Comput. Appl. (2022) 1–19.
[9] A.A. Yurdusev, K. Adem, M. Hekim, Detection and classification of microcalcifications in mammograms images using difference filter and Yolov4 deep learning model, Biomed. Signal Process. Control 80 (2023) 104360.
[10] D. Ezzat, A.E. Hassanien, H.A. Ella, An optimized deep learning architecture for the diagnosis of COVID-19 disease based on gravitational search optimization, Appl. Soft Comput. 98 (2021) 106742.
[11] E.H. Houssein, M.M. Emam, A.A. Ali, P.N. Suganthan, Deep and machine learning techniques for medical imaging-based breast cancer: A comprehensive review, Expert Syst. Appl. (2020) 114161.
[12] Z. Liao, P. Lan, X. Fan, B. Kelly, A. Innes, Z. Liao, SIRVD-DL: A COVID-19 deep learning prediction model based on time-dependent SIRVD, Comput. Biol. Med. 138 (2021) 104868.
[13] R.A. Dar, M. Rasool, A. Assad, et al., Breast cancer detection using deep learning: Datasets, methods, and challenges ahead, Comput. Biol. Med. (2022) 106073.
[14] G. Liu, Q. Ding, H. Luo, M. Sha, X. Li, M. Ju, Cx22: A new publicly available dataset for deep learning-based segmentation of cervical cytology images, Comput. Biol. Med. 150 (2022) 106194.
[15] Y. Chen, H. Gan, H. Chen, Y. Zeng, L. Xu, A.A. Heidari, X. Zhu, Y. Liu, Accurate iris segmentation and recognition using an end-to-end unified framework based on MADNet and DSANet, Neurocomputing 517 (2023) 264–278.
[16] M. Yu, M. Han, X. Li, X. Wei, H. Jiang, H. Chen, R. Yu, Adaptive soft erasure with edge self-attention for weakly supervised semantic segmentation: Thyroid ultrasound image case study, Comput. Biol. Med. 144 (2022) 105347.
[17] C. Zhao, H. Wang, H. Chen, W. Shi, Y. Feng, JAMSNet: A remote pulse extraction network based on joint attention and multi-scale fusion, IEEE Trans. Circuits Syst. Video Technol. (2022).
[18] N. Bacanin, T. Bezdan, K. Venkatachalam, F. Al-Turjman, Optimized convolutional neural network by firefly algorithm for magnetic resonance image classification of glioma brain tumor grade, J. Real-Time Image Process. 18 (4) (2021) 1085–1098.
[19] E.H. Houssein, D. Oliva, E. Çelik, M.M. Emam, R.M. Ghoniem, Boosted sooty tern optimization algorithm for global optimization and feature selection, Expert Syst. Appl. 213 (2023) 119015.
[20] E.H. Houssein, A. Sayed, Dynamic candidate solution boosted Beluga whale optimization algorithm for biomedical classification, Mathematics 11 (3) (2023) 707.
[21] M.M. Emam, E.H. Houssein, R.M. Ghoniem, A modified reptile search algorithm for global optimization and image segmentation: Case study brain MRI images, Comput. Biol. Med. 152 (2023) 106404.
[22] E.H. Houssein, D.A. Abdelkareem, M.M. Emam, M.A. Hameed, M. Younan, An efficient image segmentation method for skin cancer imaging using improved golden Jackal optimization algorithm, Comput. Biol. Med. 149 (2022) 106075.


[23] E.H. Houssein, M.M. Emam, A.A. Ali, Improved manta ray foraging optimization for multi-level thresholding using COVID-19 CT images, Neural Comput. Appl. 33 (24) (2021) 16899–16919.
[24] E.H. Houssein, M.M. Emam, A.A. Ali, An efficient multilevel thresholding segmentation method for thermography breast cancer imaging based on improved chimp optimization algorithm, Expert Syst. Appl. 185 (2021) 115651.
[25] A. Fathy, H. Rezk, S. Ferahtia, R.M. Ghoniem, R. Alkanhel, M.M. Ghoniem, A new fractional-order load frequency control for multi-renewable energy interconnected plants using skill optimization algorithm, Sustainability 14 (22) (2022) 14999.
[26] A. Eid, S. Kamel, E.H. Houssein, An enhanced equilibrium optimizer for strategic planning of PV-BES units in radial distribution systems considering time-varying demand, Neural Comput. Appl. 34 (19) (2022) 17145–17173.
[27] E.H. Houssein, M.E. Hosney, D. Oliva, W.M. Mohamed, M. Hassaballah, A novel hybrid harris hawks optimization and support vector machines for drug design and discovery, Comput. Chem. Eng. 133 (2020) 106656.
[28] S. Mirjalili, A. Lewis, The whale optimization algorithm, Adv. Eng. Softw. 95 (2016) 51–67.
[29] S. Mirjalili, S.M. Mirjalili, A. Lewis, Grey wolf optimizer, Adv. Eng. Softw. 69 (2014) 46–61.
[30] A.A. Heidari, S. Mirjalili, H. Faris, I. Aljarah, M. Mafarja, H. Chen, Harris hawks optimization: Algorithm and applications, Future Gener. Comput. Syst. 97 (2019) 849–872.
[31] J. Tu, H. Chen, M. Wang, A.H. Gandomi, The colony predation algorithm, J. Bionic Eng. 18 (3) (2021) 674–710.
[32] S. Li, H. Chen, M. Wang, A.A. Heidari, S. Mirjalili, Slime mould algorithm: A new method for stochastic optimization, Future Gener. Comput. Syst. 111 (2020) 300–323.
[33] A. Faramarzi, M. Heidarinejad, S. Mirjalili, A.H. Gandomi, Marine predators algorithm: A nature-inspired metaheuristic, Expert Syst. Appl. (2020) 113377.
[34] Z. Yang, L. Deng, Y. Wang, J. Liu, Aptenodytes forsteri optimization: Algorithm and applications, Knowl.-Based Syst. 232 (2021) 107483.
[35] M. Braik, A. Hammouri, J. Atwan, M.A. Al-Betar, M.A. Awadallah, White shark optimizer: A novel bio-inspired meta-heuristic algorithm for global optimization problems, Knowl.-Based Syst. 243 (2022) 108457.
[36] S. Suyanto, A.A. Ariyanto, A.F. Ariyanto, Komodo Mlipir algorithm, Appl. Soft Comput. 114 (2022) 108043.
[37] I. Ahmadianfar, A.A. Heidari, S. Noshadian, H. Chen, A.H. Gandomi, INFO: An efficient optimization algorithm based on weighted mean of vectors, Expert Syst. Appl. 195 (2022) 116516.
[38] W. Zhao, L. Wang, S. Mirjalili, Artificial hummingbird algorithm: A new bio-inspired optimizer with its engineering applications, Comput. Methods Appl. Mech. Engrg. 388 (2022) 114194.
[39] H. Su, D. Zhao, A.A. Heidari, L. Liu, X. Zhang, M. Mafarja, H. Chen, RIME: A physics-based optimization, Neurocomputing 532 (2023) 183–214.
[40] Y. Yang, H. Chen, A.A. Heidari, A.H. Gandomi, Hunger games search: Visions, conception, implementation, deep analysis, perspectives, and towards performance shifts, Expert Syst. Appl. 177 (2021) 114864.
[41] D.H. Wolpert, W.G. Macready, No free lunch theorems for optimization, IEEE Trans. Evol. Comput. 1 (1) (1997) 67–82.
[42] I. Ahmadianfar, O. Bozorg-Haddad, X. Chu, Gradient-based optimizer: A new metaheuristic optimization algorithm, Inform. Sci. 540 (2020) 131–159.
[43] I. Ahmadianfar, O. Bozorg-Haddad, X. Chu, Gradient-based optimizer: A new metaheuristic optimization algorithm, Inform. Sci. 540 (2020) 131–159.
[44] I. Ahmadianfar, A.A. Heidari, A.H. Gandomi, X. Chu, H. Chen, RUN beyond the metaphor: An efficient optimization algorithm based on Runge Kutta method, Expert Syst. Appl. 181 (2021) 115079.
[45] T. Kalaiselvi, S. Padmapriya, P. Sriramakrishnan, K. Somasundaram, Deriving tumor detection models using convolutional neural networks from MRI of human brain scans, Int. J. Inform. Technol. 12 (2) (2020) 403–408.
[46] S. Shanthi, S. Saradha, J. Smitha, N. Prasath, H. Anandakumar, An efficient automatic brain tumor classification using optimized hybrid deep neural network, Int. J. Intell. Netw. (2022).
[47] S. Rajeev, M.P. Rajasekaran, G. Vishnuvarthanan, T. Arunprasath, A biologically-inspired hybrid deep learning approach for brain tumor classification from magnetic resonance imaging using improved gabor wavelet transform and Elmann-BiLSTM network, Biomed. Signal Process. Control 78 (2022) 103949.
[48] A. Mondal, V.K. Shrivastava, A novel parametric flatten-p Mish activation function based deep CNN model for brain tumor classification, Comput. Biol. Med. 150 (2022) 106183.
[49] S.A.A. Ismael, A. Mohammed, H. Hefny, An enhanced deep learning approach for brain cancer MRI images classification using residual networks, Artif. Intell. Med. 102 (2020) 101779.
[50] M. Toğaçar, Z. Cömert, B. Ergen, Classification of brain MRI using hyper column technique with convolutional neural network and feature selection method, Expert Syst. Appl. 149 (2020) 113274.
[51] A. Çinar, M. Yildirim, Detection of tumors on brain MRI images using the hybrid convolutional neural network architecture, Med. Hypotheses 139 (2020) 109684.
[52] S. Deepak, P. Ameer, Brain tumor classification using deep CNN features via transfer learning, Comput. Biol. Med. 111 (2019) 103345.
[53] A. Rehman, S. Naz, M.I. Razzak, F. Akram, M. Imran, A deep learning-based framework for automatic brain tumors classification using transfer learning, Circuits Systems Signal Process. 39 (2) (2020) 757–775.
[54] N. Noreen, S. Palaniappan, A. Qayyum, I. Ahmad, M.O. Alassafi, Brain tumor classification based on fine-tuned models and the ensemble method, Comput. Mater. Continua 67 (3) (2021) 3967–3982.
[55] M. Toğaçar, B. Ergen, Z. Cömert, BrainMRNet: Brain tumor detection using magnetic resonance images with a novel convolutional neural network model, Med. Hypotheses 134 (2020) 109531.
[56] K.A. Kumar, A. Prasad, J. Metan, A hybrid deep CNN-Cov-19-res-net transfer learning architype for an enhanced brain tumor detection and classification scheme in medical image processing, Biomed. Signal Process. Control 76 (2022) 103631.
[57] M. Alshayeji, J. Al-Buloushi, A. Ashkanani, S. Abed, Enhanced brain tumor classification using an optimized multi-layered convolutional neural network architecture, Multimedia Tools Appl. 80 (19) (2021) 28897–28917.
[58] H. Mehnatkesh, S.M.J. Jalali, A. Khosravi, S. Nahavandi, An intelligent driven deep residual learning framework for brain tumor classification using MRI images, Expert Syst. Appl. (2022) 119087.
[59] Y. Wang, Y. Li, Y. Song, X. Rong, The influence of the activation function in a convolution neural network model of facial expression recognition, Appl. Sci. 10 (5) (2020) 1897.
[60] X. Glorot, A. Bordes, Y. Bengio, Deep sparse rectifier neural networks, in: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, JMLR Workshop and Conference Proceedings, 2011, pp. 315–323.
[61] A. Gaspar, D. Oliva, E. Cuevas, D. Zaldívar, M. Pérez, G. Pajares, Hyperparameter optimization in a convolutional neural network using metaheuristic algorithms, in: Metaheuristics in Machine Learning: Theory and Applications, Springer, 2021, pp. 37–59.
[62] A. Krizhevsky, I. Sutskever, G.E. Hinton, Imagenet classification with deep convolutional neural networks, Adv. Neural Inf. Process. Syst. 25 (2012) 1097–1105.
[63] A. Krizhevsky, I. Sutskever, G.E. Hinton, Imagenet classification with deep convolutional neural networks, in: Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[64] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2014, arXiv preprint arXiv:1409.1556.
[65] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[66] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, Z. Wojna, Rethinking the inception architecture for computer vision, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2818–2826.
[67] G. Huang, Z. Liu, L. Van Der Maaten, K.Q. Weinberger, Densely connected convolutional networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4700–4708.
[68] A. Khan, A. Sohail, U. Zahoora, A.S. Qureshi, A survey of the recent architectures of deep convolutional neural networks, Artif. Intell. Rev. 53 (8) (2020) 5455–5516.
[69] C. Shorten, T.M. Khoshgoftaar, A survey on image data augmentation for deep learning, J. Big Data 6 (1) (2019) 1–48.
[70] C. Goutte, E. Gaussier, A probabilistic interpretation of precision, recall and F-score, with implication for evaluation, in: European Conference on Information Retrieval, Springer, 2005, pp. 345–359.
[71] A.W. Mohamed, A.A. Hadi, A.K. Mohamed, N.H. Awad, Evaluating the performance of adaptive gaining sharing knowledge based algorithm on CEC 2020 benchmark problems, in: 2020 IEEE Congress on Evolutionary Computation, CEC, IEEE, 2020, pp. 1–8.
[72] A. Arcuri, G. Fraser, Parameter tuning or default values? An empirical investigation in search-based software engineering, Empir. Softw. Eng. 18 (3) (2013) 594–623.
[73] F. Chollet, A. Yee, R. Prokofyev, Keras: Deep learning for humans. 2015, 2020, https://Github.Com/Keras-Team/Keras Last Accessed 16.
[74] T. Carneiro, R.V.M. Da Nóbrega, T. Nepomuceno, G.-B. Bian, V.H.C. De Albuquerque, P.P. Reboucas Filho, Performance analysis of google colaboratory as a tool for accelerating deep learning applications, IEEE Access 6 (2018) 61677–61685.
[75] L. Prechelt, Early stopping-but when? in: Neural Networks: Tricks of the Trade, Springer, 1998, pp. 55–69.
[76] P. Singh, A. Manure, Neural networks and deep learning with TensorFlow, in: Learn TensorFlow 2.0, Springer, 2020, pp. 53–74.
[77] A.S. Bosman, A. Engelbrecht, M. Helbig, Visualising basins of attraction for the cross-entropy and the squared error neural network loss functions, Neurocomputing 400 (2020) 113–136.
[78] D.P. Kingma, J. Ba, Adam: A method for stochastic optimization, 2014, arXiv preprint arXiv:1412.6980.
[79] J.S. Paul, A.J. Plassard, B.A. Landman, D. Fabbri, Deep learning for brain tumor classification, in: Medical Imaging 2017: Biomedical Applications in Molecular, Structural, and Functional Imaging, Vol. 10137, SPIE, 2017, pp. 253–268.
[80] P. Afshar, K.N. Plataniotis, A. Mohammadi, Capsule networks for brain tumor classification based on MRI images and coarse tumor boundaries, in: ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP, IEEE, 2019, pp. 1368–1372.
[81] A.K. Anaraki, M. Ayati, F. Kazemi, Magnetic resonance imaging-based brain tumor grades classification and grading via convolutional neural networks and genetic algorithms, Biocybern. Biomed. Eng. 39 (1) (2019) 63–74.
[82] H.H. Sultan, N.M. Salem, W. Al-Atabany, Multi-classification of brain tumor images using deep neural network, IEEE Access 7 (2019) 69215–69225.
[83] T. Bansal, N. Jindal, An improved hybrid classification of brain tumor MRI images based on conglomeration feature extraction techniques, Neural Comput. Appl. 34 (11) (2022) 9069–9086.
[84] N. Abiwinanda, M. Hanif, S.T. Hesaputra, A. Handayani, T.R. Mengko, Brain tumor classification using convolutional neural network, in: World Congress on Medical Physics and Biomedical Engineering 2018, Springer, 2019, pp. 183–189.
[85] M. Toğaçar, B. Ergen, Z. Cömert, Tumor type detection in brain MR images of the deep model developed using hypercolumn technique, attention modules, and residual blocks, Med. Biol. Eng. Comput. 59 (1) (2021) 57–70.
[86] M.M. Badža, M.Č. Barjaktarović, Classification of brain tumors from MRI images using a convolutional neural network, Appl. Sci. 10 (6) (2020) 1999.
[87] C. Öksüz, O. Urhan, M.K. Güllü, Brain tumor classification using the fused features extracted from expanded tumor region, Biomed. Signal Process. Control 72 (2022) 103356.
[88] J. Kang, Z. Ullah, J. Gwak, Mri-based brain tumor classification using ensemble of deep features and machine learning classifiers, Sensors 21 (6) (2021) 2222.
[89] A. Naseer, T. Yasir, A. Azhar, T. Shakeel, K. Zafar, Computer-aided brain tumor diagnosis: Performance evaluation of deep learner CNN using augmented brain MRI, Int. J. Biomed. Imaging 2021 (2021).
[90] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2014, arXiv preprint arXiv:1409.1556.
[91] A.G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, H. Adam, Mobilenets: Efficient convolutional neural networks for mobile vision applications, 2017, arXiv preprint arXiv:1704.04861.
[92] G. Huang, Z. Liu, L. Van Der Maaten, K.Q. Weinberger, Densely connected convolutional networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 4700–4708.
