A Novel Control Factor and Brownian Motion-Based Improved Harris Hawks Optimization For Feature Selection
ORIGINAL RESEARCH
Abstract
The massive growth in data size has driven a growing need for Feature Selection (FS). Hence, FS has become an imperative method for dealing with high-dimensional data. This research proposes an enhanced feature-selection variant of Harris Hawks Optimization (HHO) based on a novel control factor and Brownian motion. The Brownian motion augments the exploitation of foragers. It also replicates the deceptive movement of prey, allowing predators to correct their location and direction according to the prey's position. At the same time, the novel control factor imitates the exact behavior of the prey's escaping energy. A comparative analysis with existing techniques on six real high-dimensional microarray datasets highlights the impact of the proposed Improved Harris Hawks Optimization (iHHO). The experimental results of FS and classification accuracy vividly depict how the proposed model outperforms the existing techniques.
Keywords Feature selection · Harris Hawks Optimization · Meta-heuristic optimization · Microarray dataset
K. Balakrishnan et al.
forcing the algorithm to escape the local optimum, allowing the algorithm to find the estimated optimal solution (Dash and Liu 2003). As a result, random search strategies for feature-subset selection, such as Simulated Annealing (SA) (Liu et al. 2018) and the Genetic Algorithm (GA) (Dong et al. 2018), outperform sequential search in most cases.

This research focuses on FS using Meta-Heuristic (MH) optimization techniques. Traditional optimization techniques have several limitations when it comes to FS problems. Population-based methods are of five types (Heidari et al. 2019): Evolutionary Algorithm (EA), Physics-based Algorithm (PBA), Swarm-based Algorithm (SBA), Human-based Algorithm (HBA), and Mathematical-based Algorithm (MBA). Figure 2 depicts numerous well-known population-based algorithms. In population-based algorithms, two major characteristics are typically found: intensification and diversification.

Fig. 2 Population-based MH optimization algorithms

Harris Hawks Optimization (HHO) is a nature-inspired MH algorithm developed by Heidari et al. (2019). HHO imitates the attacking strategies of Harris hawks, such as besiege, perching, and surprise pounce, to find an optimal solution. HHO is split into two exploration stages and four exploitation stages. In many real-world applications, HHO outperforms traditional meta-heuristic algorithms such as the Whale Optimization Algorithm (WOA) and the Genetic Algorithm (GA). However, through this research, two drawbacks associated with traditional HHO are identified. First, the representation of prey escape energy is inefficient in conventional HHO. Second, Levy flight, which imitates the leapfrog movements of prey fleeing from a predator, produces small and occasionally extended step sizes from the initial random solution that hinder the algorithm from emerging from local optima. As a result, a novel control factor that accurately imitates the prey's escape energy over time is developed through this research
critique. This research has employed Brownian motion rather than Levy flight to attain the optimum solution in the stipulated period. In brief, the following are the significant contributions of this research:

1. This research has proposed a robust iHHO for selecting significant features from high-dimensional microarray datasets.
2. A novel control factor is used to express the prey's escape energy accurately.
3. Considering that randomization is the key to any MH algorithm, the zigzag movement of escaping prey and the surprise attacks of predators are represented using Brownian motion instead of Levy flight.
4. The proposed iHHO's performance is assessed using six real high-dimensional microarray datasets and several unimodal and multimodal functions.
5. The outcomes of the proposed iHHO are compared with six well-established optimization techniques.

Section 2 of this research article covers the literature survey. Section 3 stresses the motivation for the proposed iHHO. The background and mathematical approach of iHHO are discussed in Sect. 4. Section 5 deals with the implementation of the proposed method. Section 6 focuses on the simulation findings, and discussions are elaborated in Sect. 7. Section 8 emphasizes the conclusion and the future scope.

2 Related work

The reliability factor has resulted in the stochastic optimizers' interest in solving problems in various fields such as the manufacturing industry, environmental quality, solar systems, power systems, and other engineering areas (Alabool et al. 2021). Compared to traditional methods, nature-inspired MH algorithms show promising results for complex problems. Feature Selection (FS) is one of the most popular applications of stochastic optimization algorithms. Owing to the slow convergence of MH algorithms, researchers improve them by adding one or more techniques to the existing algorithms. The well-known techniques are Levy flight, Brownian motion, binary variants, and opposition-based learning. Sihwail et al. proposed an improved version of HHO based on Elite Opposition Based Learning (EOBL) and Three Search Strategies (TSS) to boost the global and local searches of HHO (Sihwail et al. 2020). EOBL increased the diversity of the HHO population, and the TSS search strategies assisted the algorithm in its search for global optima by avoiding traps in local optima. From the results of the experiments, it could be understood that the proposed approach outperforms the other algorithms in all the metrics.

Elminaam et al. proposed a novel technique for the FS problem based on the Marine Predator Algorithm (MPA) (Elminaam et al. 2021). This study encompassed the hybridization of MPA with KNN. The proposed MPA-KNN adapts the basic exploratory and exploitative procedures to choose the most relevant features for the most accurate identification. Five different evaluation criteria are carried out on 18 UCI datasets to explore the suggested approach's performance. The obtained results show that the proposed model provides superior performance over conventional MH algorithms in selecting the optimal features. Elgamal et al. proposed an improved variant of HHO based on Simulated Annealing (SA) for FS in the medical field (Elgamal et al. 2020). The suggested model handles concerns such as population diversity and local optima of conventional HHO. In this proposed method, SA is used to improve the exploitation capability of HHO. The SA algorithm is also used by Abdel et al., who proposed a variant of HHO termed the Chaotic HHO (CHHO) algorithm. CHHO is developed using chaotic maps and incorporates the SA algorithm to enhance the population diversity and converging ability to avoid the local optima (Abdel-Basset et al. 2021).

Neggaz et al. proposed an improved variant of the Salp Swarm Algorithm (SSA) for FS problems using the Sine-Cosine Algorithm (SCA) and a disrupt operator (Neggaz et al. 2020a). The SCA facilitates exploration and prevents stagnation. In addition, the disrupt operator is used to enhance the population diversity. Neggaz et al. also suggested a novel approach based on Henry Gas Solubility Optimization (HGSO) to select the optimal features and improve classification accuracy (Neggaz et al. 2020b). Ahmed et al., in their research, used SSA with four different chaotic maps to balance exploration and exploitation (Ahmed et al. 2018). Twelve real-world datasets were used to test the proposed techniques. The findings indicate that chaotic maps improve the proposed model's performance considerably compared to conventional methods. In a study, Zhang et al. proposed a binary variant of HHO for global optimization and FS problems (Zhang et al. 2020). The SSA mechanism is merged into traditional HHO to enhance the exploitation and exploration behavior of HHO. The suggested HHO is tested on 23 classical functions using statistical metrics and convergence rate.

Houssein et al. balanced exploration and exploitation of HHO using genetic operators and two methods, Opposition Based Learning (OBL) and Random Opposition Based Learning (ROBL) (Houssein et al. 2021). Monoamine Oxidase and QSAR Biodegradation datasets are used to assess the efficacy of the proposed system. The findings indicate that the three variants of the proposed model outperform the conventional techniques in determining the best subset of chemical descriptors. Houssein et al. also
suggested a hybrid version of HHO based on Cuckoo Search Optimization (CSO) and chaotic maps to boost the efficiency of the original HHO (Houssein et al. 2020a). Furthermore, the proposed model was paired with the Support Vector Machine (SVM) as a machine learning classifier for performing chemical descriptor collection and chemical compound operations.

Hussien et al. proposed an improved variant of HHO with three strategies: OBL, Chaotic Local Search, and a self-adaptive technique (Hussien and Amin 2021). The OBL strategy is incorporated in the initialization phase, and the other two strategies are embedded in the update phase to enhance the converging ability of the proposed model. Hussain et al. suggested a unique hybrid approach for numerical methods and FS issues by combining two algorithms, SCA and HHO (Hussain et al. 2021). The primary goal of this research is to balance exploration and exploitation capabilities. Ismael et al. suggested a novel hybrid model that optimizes the hyperparameters of v-SVR while simultaneously embedding feature selection using OBL (Ismael et al. 2020).

Gao et al. discovered a tent map that could be used to improve the converging capabilities of HHO (Gao et al. 2019). The tent maps can initialize the population distribution and escape the equal distribution by generating chaos using non-periodic, non-converged, and bounded random numbers. The chaos is also applied to the algorithm to substitute the random numbers. The proposed algorithm is evaluated using 18 benchmark unimodal and multimodal functions. Gao et al. also proposed a binary variant of Evolutionary Optimization (EO) for FS problems (Gao et al. 2020b). The model employs the Sigmoid (S-shaped) and V-shaped transfer functions to transform EO into a binary version and change the particle's current position vector. The suggested method is tested using nineteen UCI benchmark datasets. Based on the experimental findings, the binary EO technique performs well compared to other approaches for addressing FS issues. Zhang et al. introduced a return-cost-based Firefly Algorithm (FFA), which uses a binary movement operator to change firefly positions (Zhang et al. 2017). Zhang et al. also presented an improved HHO based on SSA, assuming that SSA's powerful explorative capacity would facilitate exploring the original HHO (Zhang et al. 2020). Initialize and update are the two stages of the proposed method. The proposed approach clearly outperforms other approaches in terms of average feature length and error rate measures, according to the experimental findings.

To solve the issue of feature selection in classification tasks, Too et al. proposed two improved variants of HHO, namely Binary HHO (BHHO) and Quadratic Binary HHO (QBHHO) (Too et al. 2019). The BHHO has a built-in S-shaped or V-shaped conversion mechanism that converts continuous search agents to binary. The quadratic transfer function is used in QBHHO to rejuvenate BHHO's feature selection efficiency. The best fitness value, mean fitness value, standard deviation of fitness value, classification accuracy, and feature size are used to assess the results of the proposed algorithms. The findings of this analysis indicate that the quadratic transfer function algorithm provides the best results. Too and Mirjalili also introduced a general learning method to aid search agents in avoiding local optima and improving their capacity to find a promising region (Too and Mirjalili 2021). Sixteen biological datasets were used to test the proposed General Learning Equilibrium Optimizer (GLEO).

Mafarja and Mirjalili developed a new hybrid metaheuristic technique combining SA and WOA. The SA scheme improves the optimum solution found by WOA (Mafarja and Mirjalili 2017). The key objective of this hybridization is to improve the exploitation ability of WOA using SA. The performance assessment demonstrates an improvement in classification accuracy and produces better results than wrapper-based techniques. Eighteen standard datasets were taken from the UCI repository to compare the proposed method. Hussien et al. suggested a novel binary variant of WOA (BWOA) to choose the best feature subset for the FS problem (Hussien et al. 2019). An S-shaped transfer function is used to control the novel strategy. Over eleven different datasets, a series of parameters is used to evaluate and compare the proposed model with the existing one. Emary et al. used a threshold value to solve feature selection problems, encompassing the first binary variant of the Firefly Algorithm (FFA) (Emary et al. 2015). The suggested algorithm had a high level of exploration and was able to find a simple solution to the problem. Kanimozhi et al. proposed an image retrieval strategy based on SVM classifiers and FFA (Kanimozhi and Latha 2015). The fundamental goal was to improve the algorithm's accuracy by using optimal functionality, and the algorithm was put to the test on a variety of image datasets.

3 Motivation

The existing Harris Hawks Optimization presents the escaping energy of the prey using Eq. (1).

E = 2E0 (1 − t/T)    (1)

In Eq. (1), E denotes the prey's escaping energy, T is the maximum number of iterations, and E0 is the energy's initial state, which varies randomly across the interval (−1, 1). The graphical representation of the behavior of E is shown in Fig. 3. The figure imitates the deceptive zigzag movement of the prey. Figure 3 represents an unexpected hike in the
prey's energy as it gradually loses its energy while running away from its predator.

The accurate behaviour of the escaping energy of the prey is depicted in Eq. (2), which is the proposed novel Control Factor (CF) that accurately imitates the behaviour of the prey's escaping energy.

CF = (1 − t/T)^(2t/T)    (2)

The graphical representation of the CF in Fig. 4 shows the smooth deprivation of the escaping energy of the prey as the model progresses. The CF gradually decreases from 1 to 0, the exact imitation of prey losing its energy while escaping from the predator.

Fig. 4 The behaviour of novel control factor-based escaping energy (Ee) in improved HHO during 1000 iterations

The conventional Harris Hawks use Levy flight with progressive dives in the exploitation phase in the hard besiege stage. Figure 5 shows the nature of the movements of Harris Hawks using the Levy flight. Due to the smaller and occasionally larger step sizes, the algorithm sometimes fails to recover from local optima.

The aforementioned drawback of the existing Harris Hawks Optimization is overcome by the proposed movement of Harris hawks using Brownian motion, as stated in Eq. (3). The graphical representation of Brownian motion is shown in Fig. 6.

Brownian Motion = fB(x) = (1/√(2π)) exp(−x²/2)    (3)

Fig. 6 Movement of search agent using Brownian motion in the improved Harris Hawks

In Eq. (3), x denotes the Harris hawk's current location in the iteration. The Brownian motion allows hawks to take longer step sizes, which eventually precludes the algorithm from stagnating at local minima.
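The quantities in Eqs. (1)-(3) can be sketched in a few lines of Python (a minimal illustration with my own function names, not the authors' code):

```python
import math

def escaping_energy_hho(E0, t, T):
    """Eq. (1): conventional HHO escaping energy, linear decay over iterations."""
    return 2 * E0 * (1 - t / T)

def control_factor(t, T):
    """Eq. (2): proposed control factor, decays smoothly from 1 to 0."""
    return (1 - t / T) ** (2 * t / T)

def brownian_density(x):
    """Eq. (3): standard normal density underlying the Brownian steps."""
    return math.exp(-x ** 2 / 2) / math.sqrt(2 * math.pi)

T = 1000
print(control_factor(0, T))       # 1.0: full energy at the first iteration
print(control_factor(T // 2, T))  # 0.5
print(control_factor(T, T))       # 0.0: energy exhausted at the last iteration
```

Evaluating `control_factor` over t = 0..T reproduces the smooth monotone decay from 1 to 0 described above, in contrast to the straight-line decay of Eq. (1).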
4 Improved Harris Hawks Optimization (iHHO)

The modified HHO is divided into three phases: initialization, update, and classification. The following subsections elucidate each component.

4.1 Initialization phase

The initialization of iHHO reflects other meta-heuristic techniques: it initializes the random search agents and identifies the current best solution using a defined objective function.

4.2 Update phase

In this phase, to achieve the optimum solution, the proposed algorithm performs both intensification and diversification. Figure 7 represents the specific conditions that contribute to the diversification and intensification of the algorithm. This procedure is categorized into four stages: soft besiege, soft besiege with progressive rapid dives, hard besiege, and hard besiege with progressive rapid dives.

4.2.1 Exploration

In this phase, the predator (Harris hawk) monitors the grander search space to discover the location of the prey (rabbit). Let q denote the probability of success for the perching strategy. For the condition q < 0.5, the predator perches based on the position of other family members and the rabbit. Otherwise, a random search agent changes its position if q ≥ 0.5. The mathematical representation of the perching strategy of Harris hawks is modeled in Eq. (4).

Y(t + 1) = Yrand(t) − r1 |Yrand(t) − 2 r2 Y(t)|,            q ≥ 0.5
Y(t + 1) = (Yrabbit(t) − Ym(t)) − r3 (lb + r4 (ub − lb)),   q < 0.5    (4)

where Y(t) and Y(t + 1) are the positions of the search agent in iterations t and t + 1, respectively. r1, r2, r3, and r4 are arbitrary values in the range [0, 1]. ub and lb denote the upper and lower bound values of every individual search agent. Yrabbit(t) and Yrand(t) are the position of the prey and the current random position, respectively. Ym(t) is the average location of the search agents, obtained using Eq. (5).

Ym(t) = (1/N) Σ_{i=1}^{N} Yi(t)    (5)

where N is the number of search agents and Yi(t) epitomizes the position of search agent i in iteration t.

4.2.2 Transition from exploration to exploitation

Generally, the performance of MH algorithms is determined by the factor that balances exploration and exploitation. The evolution of iHHO from exploration to exploitation is governed by the prey's escape energy, using the following equation:

Escaping_Energy (Ee) = 2 × CF × E0    (6)

In Eq. (6), E0 is a random initial state whose values range from −1 to 1, and CF denotes the control factor. |Ee| ≥ 1 indicates the exploration process, and |Ee| < 1 represents the exploitation process.
4.2.3 Exploitation phase

In this phase, iHHO uses four strategies to attack the prey.

Y(t + 1) = X    if F(X) < F(Y(t))
Y(t + 1) = Z    if F(Z) < F(Y(t))    (12)

where F(·) denotes the fitness function, and X and Z are the candidate positions generated during the progressive rapid dives.
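The selection rule of Eq. (12) amounts to keeping a candidate dive position only when it improves fitness; a minimal sketch (X, Z, and the sphere fitness here are illustrative assumptions, not values from the paper):

```python
def greedy_update(Y_t, X, Z, F):
    """Eq. (12): accept a dive position only if it improves the fitness.

    X and Z are the two candidate positions produced by the progressive
    rapid dives; their generating equations are part of the full iHHO
    update and are not reproduced here.
    """
    if F(X) < F(Y_t):
        return X
    if F(Z) < F(Y_t):
        return Z
    return Y_t  # otherwise keep the current position

# toy fitness: sphere function (an assumption for illustration)
sphere = lambda v: sum(x * x for x in v)
print(greedy_update([2.0, 2.0], [1.0, 1.0], [3.0, 3.0], sphere))  # [1.0, 1.0]
```

This greedy acceptance is what keeps the rapid dives from degrading the current solution.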
During the updating phase, the iHHO updates Yrabbit and evaluates the fitness value associated with it. The proposed model returns the best search agent with the lowest objective function value at the end of each iteration. The weight of each feature of the input data is indicated by the unsurpassed search agent, which aids in determining the significance of the corresponding feature. The Support Vector Machine (SVM) classifier with a polynomial kernel function of degree three verifies the validity of the selected feature subset during the classification phase. Training and testing are the two divisions of the reduced feature set used in the tenfold cross-validation (CV) method. Using a tenfold CV minimizes the probability of overfitting the predictive model. For each epoch, nine out of ten input data blocks are used as a training group, and one set is used as a testing set. All results are recorded and compared based on the average performance of the optimizers over 30 independent runs, in which each run includes 100 iterations. The performance of the proposed approach is measured using precision, recall, f1-score, and accuracy measures, in addition to the ROC-AUC curve.

6 Experimental results and discussion

From Sect. 6.1, the experimental results and an intricate discussion highlight the proposed technique's effectiveness.

6.1 Performance on high dimensional microarray datasets

This phase focuses on how the suggested iHHO is applied to six real high-dimensional microarray datasets. The outcomes of iHHO are compared with numerous existing optimization techniques proposed in earlier research. Particulars of the several microarray datasets used in this research are presented in Table 1.

6.1.1 Comparison based on converging ability

The unique potentialities and efficacies of the suggested model are assessed employing the cross-entropy objective function, which calculates the error rate in each iteration. Every outcome of this research experiment is documented and compared, considering the average performance of the optimizers over 30 independent runs, in which each run includes 100 iterations. The ability of the proposed model to converge to global minima is demonstrated by the decrease in error rate with each iteration. Figure 9 compares the converging ability of iHHO to that of conventional HHO for three microarray datasets. The results and performance of the proposed iHHO are compared with other well-established optimization techniques such as the MFO (Mirjalili 2015), MPA (Faramarzi et al. 2020), SCA (Mirjalili 2016), SSA (Mirjalili et al. 2017), WOA (Mirjalili and Lewis 2016), and HHO algorithms based on six different microarray datasets. The proposed iHHO's optimum convergence rate against
Fig. 9 Convergence curves (error rate versus iterations) of MFO, MPA, SCA, SSA, WOA, HHO, and the proposed iHHO on the six microarray datasets, including the CNS and OSCC panels
the global minimum demonstrates its efficacy in discovering and locating a better solution. Following the convergence curves in Fig. 9, it could be analyzed that the proposed iHHO reaches the global minima. In contrast, the other conventional techniques failed to provide an optimum solution even at the 100th iteration. The projected movement as iHHO transits from exploration to exploitation can also be detected. Also, it is noted that the iHHO exposes an augmented convergence trend. It could be analyzed that the proposed iHHO is free from premature convergence, an ailment of MH optimization that prevents the algorithm from achieving the optimum output. The proposed iHHO requires more than 50 epochs to reach the global minima, whereas traditional HHO requires only 20 iterations in the majority of the datasets. This requirement of epochs indicates that HHO is experiencing premature convergence, whereas iHHO is devoid of the same.

6.1.2 Comparison based on training accuracy

The comparison based on training accuracy between the proposed iHHO and other conventional optimizers is illustrated in Fig. 10. In contrast to the different approaches, the proposed model has a higher potential for increased accuracy in the early stages. From Fig. 10, it can be understood that, compared to the other algorithms, there is a gradual increase and noteworthy development
Fig. 10 Training accuracy curves of MFO, MPA, SCA, SSA, WOA, HHO, and the proposed iHHO over 100 iterations on the six microarray datasets, including the CNS and OSCC panels
in the accurateness of the proposed iHHO. The precision value of the suggested model ranges between 90 and 100%, whereas the values of other techniques range around 50-70%. In the majority of the datasets, the proposed iHHO shows significant improvement. However, in the case of colon cancer, though the traditional HHO demonstrates higher accuracy than the proposed approach, it is to be noted that there is no increase in its accuracy value after the initial iterations.

6.1.3 Comparison based on unseen test data

In this phase, an assessment of the unseen data is utilized to investigate the credibility of the iHHO in handling unknown data. The performance of the proposed iHHO is compared with other accustomed optimization techniques based on box plots, violin plots, heat maps, the Standard Deviation (STD), and the average of the test results (AVG). The non-parametric Wilcoxon rank-sum test, commonly known as the Mann-Whitney U statistical test, with a 5% level of significance, is carried out alongside the experimental evaluations to trace the substantial variances among the obtained outcomes. The research hypothesis of the Wilcoxon rank-sum test states that there is a noteworthy difference between the two groups.

The performance of the proposed iHHO for several microarray test datasets is demonstrated in Figs. 11, 12, and 13. A box plot in Fig. 11 utilizes boxes and lines to illustrate the distributions of one or more groups of numeric data. Box limits point to the central 50% of the data range, with a
Table 2 Comparison of average accuracy results over 30 runs

Dataset          Metric  MFO     MPA     SCA     SSA     WOA     HHO     iHHO
Breast Cancer    AVG     0.4929  0.6123  0.6367  0.4487  0.5313  0.5640  0.7236
                 STD     0.0000  0.0534  0.0755  0.0236  0.0047  0.0121  0.0531
CNS              AVG     0.4614  0.5229  0.6535  0.5853  0.5862  0.6964  0.6352
                 STD     0.0197  0.0535  0.0566  0.0037  0.0188  0.0404  0.0532
Colon Cancer     AVG     0.5745  0.5607  0.5970  0.7504  0.5234  0.8117  0.6646
                 STD     0.0057  0.0282  0.1013  0.1052  0.0105  0.0333  0.1034
Leukemia         AVG     0.5674  0.6569  0.6112  0.5392  0.7500  0.6651  0.7717
                 STD     0.0171  0.0485  0.0175  0.0331  0.0683  0.0573  0.0965
OSCC             AVG     0.8553  0.6894  0.8116  0.8476  0.7959  0.8281  0.9521
                 STD     0.0039  0.0016  0.0083  0.0304  0.0000  0.0315  0.0208
Ovarian Cancer   AVG     0.5408  0.7141  0.6765  0.6346  0.6193  0.6091  0.7566
                 STD     0.0100  0.1006  0.0515  0.0025  0.0024  0.0099  0.0479

Bold values represent the significant difference between the proposed iHHO and other conventional techniques
Table 3 p values of the Mann Whitney U test with 5% significance

Datasets        Metric  MFO        MPA        SCA        SSA        WOA        HHO
Breast Cancer   U val   0          639        1799       0          0          0
                p val   5.60E−39   1.02E−26   5.02E−15   2.28E−34   2.01E−34   1.03E−37
                RBC     1          0.8722     0.6402     1          1          1
CNS             U val   0          457        6001.5     2037.5     2131       8892
                p val   1.77E−34   1.06E−28   0.014378   1.01E−14   1.13E−12   1.90E−21
                RBC     1          0.9086     −0.2003    0.5925     0.5738     −0.7784
Colon Cancer    U val   100        86         63         2559       0          115
                p val   1.43E−33   6.14E−34   1.26E−33   2.43E−09   2.22E−34   7.31E−33
                RBC     0.98       0.9828     0.9874     0.4882     1          0.977
Leukemia        U val   450        839        541        453        4326       739
                p val   7.75E−29   2.11E−24   6.99E−28   9.94E−29   0.099742   3.33E−26
                RBC     0.91       0.8322     0.8918     0.9094     0.1348     0.8522
OSCC            U val   6260       300        300        300        2822       10,000
                p val   4.94E−07   7.98E−38   2.52E−38   6.94E−35   9.44E−09   1.62E−37
                RBC     −0.252     0.94       0.94       0.94       0.4356     −1
Ovarian Cancer  U val   9981       1698       1158       0          77         0
                p val   5.92E−35   2.74E−16   2.01E−21   3.46E−37   1.94E−34   4.11E−35
                RBC     −0.9962    0.6604     0.7684     1          0.9846     1
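A Mann-Whitney U comparison of this kind can be run with SciPy; the following sketch uses toy accuracy samples of my own, and its rank-biserial sign convention (positive when the first sample tends to be larger) may differ from the one used in the table:

```python
from scipy.stats import mannwhitneyu

def compare_optimizers(acc_a, acc_b, alpha=0.05):
    """Two-sided Mann-Whitney U test between two samples of run accuracies.

    Returns the U statistic of the first sample, the p value, and the
    rank-biserial correlation RBC = 2*U/(n1*n2) - 1.
    """
    u, p = mannwhitneyu(acc_a, acc_b, alternative="two-sided")
    rbc = 2 * u / (len(acc_a) * len(acc_b)) - 1
    return u, p, rbc

# toy example: two clearly separated accuracy samples over 30 runs
a = [0.90 + 0.001 * i for i in range(30)]
b = [0.60 + 0.001 * i for i in range(30)]
u, p, rbc = compare_optimizers(a, b)
print(p < 0.05, rbc)  # True 1.0
```

With complete separation, U equals n1 × n2 for the dominant sample and the effect size saturates at ±1, which is the pattern visible in most rows of Table 3.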
central line denoting the median value. Lines extend from each box to capture the range of the remaining data, with dots placed past the line edges to indicate outliers. Box plots are a strong basis for comparative analysis between two groups, and they are precise in data summarization. Compared to the other optimizers, the accuracy values produced by the proposed iHHO over 100 iterations and 30 independent runs have a symmetric distribution in the majority of datasets. Except for CNS and colon cancer, the proposed optimizer evenly distributes outliers on either side of the box. The average and standard deviation values in Table 2 show that iHHO performs significantly better than the other algorithms on 66.66% of the six datasets, demonstrating the optimizer's superior performance. Concerning the p values in Table 3, it is identified that the solutions of iHHO are significantly better than those produced by the other procedures in almost all cases.

For the other optimizers, a box median located off-center, together with an imbalance in whisker lengths, where one whisker is short with no outliers and the other has a long tail with myriads of outliers, indicates that the distribution is skewed. Box plots provide only a high-level data summary and cannot present the particulars of a data distribution's shape. A box plot's simplicity also limits the data density that it can demonstrate. It obscures the ability to detect the meticulous
shape of the dissemination, such as oddities in a distribution's modality (the number of 'humps' or peaks) and skew.

The violin plots are used to assess the proposed iHHO in Fig. 12 to overcome this drawback of box plots. A violin plot illustrates distributions of numeric data for one or more groups using density curves. The width of each curve corresponds with the approximate frequency of data points in each region. Densities are frequently accompanied by an overlaid chart type, such as a box plot, to provide additional information. In the middle of each density curve is a small box plot, with the rectangle showing the ends of the first and third quartiles and the central dot the median. The plots in Fig. 12 show that as more data points are added to a region, the height of the density curve in that area of the proposed iHHO increases in the majority of datasets. The proposed optimizer has a much more curtailed distribution compared to the other conventional techniques.

The heatmaps in Fig. 13 indicate the variations in the accuracy value throughout the 100 iterations. A sequential color ramp between value and color shows that lighter colors correspond to larger values and darker colors to smaller values. In support of the evidence provided by Tables 2 and 3, the range of accuracy values produced by the proposed iHHO in Breast cancer, Leukemia, OSCC, and
Ovarian cancer are much higher than those of other well-established optimizers.

6.1.4 Performance analysis of the selected features subset

The Receiver Operating Characteristic (ROC) curve scrutinizes the performance of a classification model at various threshold settings to determine the effectiveness of a predictive model in evaluating the data sample class. Figure 14 compares the performance of the features selected by the proposed iHHO with other optimizers for different microarray datasets. The proposed optimizer shows significant improvement over other methods in CNS, Colon Cancer, Leukemia, and OSCC. In addition to the ROC-AUC score, the precision, recall, f1-score, and accuracy metrics are used to validate the performance of the selected features, as shown in Table 4. In the case of Breast and Ovarian cancer, the proposed iHHO yields 50% and 60% accuracy values, respectively. Even though the outcome in Breast and Ovarian cancer is not significant, it is the highest accuracy value shown in that category. However, the proposed iHHO produces precision scores ranging from 81 to 96% in most datasets. A higher ROC-AUC score of iHHO indicates greater confidence in the predictive potential of the selected
The higher the AUC value, the less likely the proposed model is to confuse the classes, and the higher its level of separability. The features chosen by iHHO provide greater confidence in disease prediction and higher accuracy across all six datasets.

6.2 Performance based on unimodal and multimodal functions

The effectiveness of the suggested iHHO optimizer is thoroughly analyzed using a set of varied benchmark functions. The recommended benchmark functions encompass a set of Unimodal (UM) and Multimodal (MM) functions: the UM functions (F1–F4) can disclose the intensification abilities of diverse optimizers, while the MM functions (F5–F8) can divulge their diversification abilities. Tables 5 and 6 present the mathematical formulation of the UM and MM problems. The outcomes of the suggested iHHO are compared with those of existing optimization techniques, namely the MFO, MPA, SCA, SSA, WOA, and HHO algorithms.
6.2.1 Scalability analysis

Scalability evaluation is employed to reconnoiter the effect of dimension on the outcome of iHHO. This test reveals the impact of dimensionality on the quality of solutions and identifies the effectiveness of the iHHO optimizer for problems with lower- and higher-dimensional tasks. This research employs the iHHO to handle the scalable UM and MM F1–F8 test cases with 30, 100, 500, and 1000 dimensions. Table 7 represents the analysis outcomes of the iHHO in handling the F1–F8 problems with different dimensions and depicts that the iHHO can expose exceptional results, which remain consistent in all dimensions. With 100 dimensions, the iHHO attains the best results for 87.5% (F1–F6 and F8) of the search space problems. Table 10 demonstrates that the iHHO achieves the paramount results in AVG and STD in dealing with all 8 test cases with 500 dimensions. Table 11 presents that the iHHO performs significantly better when dealing with the F1–F8 test functions with 1000 dimensions. It can be observed from these tables that the performance of the conventional methods diminishes as the dimension increases, which reveals the potential of the iHHO in consistently balancing exploratory and intensification tendencies.

6.3 Comparative analysis of HHO variants
The HHO with the novel control factor performs almost symmetrically but shows meagre accuracy compared to iHHO. The traditional HHO performs poorly as compared to the other approaches in all datasets.

[Table 4: class-wise precision, recall, F1-score, and accuracy of the feature subsets selected by the compared optimizers (MFO, MPA, SSA, WOA, HHO, and the proposed iHHO) on the six microarray datasets; the numeric cells are too garbled in this extraction to reconstruct.]

6.4 Complexity analysis

With t iterations, n hawks in the population, and d decision variables, the overall computational complexity of the proposed iHHO is O(t·n·d).

7 Results discussion

From the results, it can be inferred that the results of iHHO are superior to those of the conventional optimizers.

[Tables 5 and 6: mathematical formulation of the UM and MM benchmark problems, each evaluated for dimensions 30, 100, 500, and 1000 with f_min = 0. Only the following entries are recoverable from this extraction (F5–F7 are not):]

F1(x) = Σ_{i=1}^{n} x_i², range [−100, 100]
F2(x) = Σ_{i=1}^{n} |x_i| + Π_{i=1}^{n} |x_i|, range [−10, 10]
F3(x) = Σ_{i=1}^{n} (Σ_{j=1}^{i} x_j)², range [−100, 100]
F4(x) = max_i {|x_i|, 1 ≤ i ≤ n}, range [−100, 100]
F8(x) = (1/4000) Σ_{i=1}^{n} x_i² − Π_{i=1}^{n} cos(x_i/√i) + 1, range [−600, 600]
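Under their standard textbook definitions (an assumption, since the entries of Tables 5 and 6 are garbled in this extraction), a few of the F1–F8 benchmark functions can be sketched as:

```python
import math

# Sphere (F1): unimodal, global minimum 0 at the origin, range [-100, 100].
def f1(x):
    return sum(xi ** 2 for xi in x)

# Schwefel 2.22 (F2): unimodal, global minimum 0 at the origin, range [-10, 10].
def f2(x):
    return sum(abs(xi) for xi in x) + math.prod(abs(xi) for xi in x)

# Griewank (F8): multimodal, global minimum 0 at the origin, range [-600, 600].
def f8(x):
    s = sum(xi ** 2 for xi in x) / 4000.0
    p = math.prod(math.cos(xi / math.sqrt(i + 1)) for i, xi in enumerate(x))
    return s - p + 1.0

origin = [0.0] * 30
print(f1(origin), f2(origin), f8(origin))  # → 0.0 0.0 0.0
```

Evaluating each function at the origin confirms the stated f_min = 0, which is how the AVG/STD entries in Tables 8–11 should be read: values closer to zero indicate better convergence.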
Table 8 Results and comparisons of unimodal and multimodal benchmark functions with 30 dimensions

Problem/dimension Metric MFO MPA SCA SSA WOA HHO iHHO
Unimodal F1 Avg 2.05E−67 9.98E+01 1.73E+04 1.46E+02 2.74E+01 9.12E−05 3.56E−85
Std 1.68E−66 5.67E+01 6.89E+03 4.90E+01 3.67E+01 3.07E−04 1.02E−84
F2 Avg 3.95E−23 8.83E+01 8.30E+00 4.82E+01 7.15E−01 1.56E−51 7.19E−92
Std 6.98E−22 1.07E+01 2.89E+03 1.90E+01 6.28E−02 8.55E−50 9.35E−91
F3 Avg 2.39E−04 6.94E−02 1.25E+02 5.83E−71 3.97E+05 5.96E−88 3.92E−89
Std 1.08E−03 2.95E−02 7.89E+01 9.31E−71 2.12E+05 2.36E−87 8.37E−89
F4 Avg 5.01E−02 2.98E−53 7.98E−23 1.36E+01 6.06E−27 2.91E−49 7.32E−51
Std 1.53E−02 7.62E−53 2.67E−22 1.10E+01 1.92E−26 5.62E−48 1.09E−50
Multimodal F5 Avg −1.07E+02 −1.09E+05 −5.98E−04 −7.29E+02 −2.32E+04 −1.96E+10 −2.05E+16
Std 3.05E+01 8.38E+01 7.09E−04 3.95E+01 1.96E+03 4.52E+11 6.68E+15
F6 Avg 1.42E−09 3.65E−02 1.96E+04 6.62E−09 2.35E−29 3.36E−04 4.68E−38
Std 6.52E−08 9.64E−02 8.72E+03 1.35E−08 7.03E−28 8.96E−03 7.61E−38
F7 Avg 5.69E−04 8.45E+01 9.01E−07 6.75E−06 2.98E+04 1.32E+02 5.21E−09
Std 3.81E−04 1.92E+01 6.34E−06 1.68E−03 3.39E+04 5.21E+02 4.02E−08
F8 Avg 3.97E+06 2.39E+04 2.95E−09 9.45E−24 1.94E−05 5.60E−08 1.28E−65
Std 1.92E+06 2.09E+04 1.56E−08 6.65E−23 5.50E−05 4.94E−07 7.09E−64
Bold values represent the significant difference between the proposed iHHO and other conventional techniques
Table 9 Results and comparisons of unimodal and multimodal benchmark functions with 100 dimensions

Problem/dimension Metric MFO MPA SCA SSA WOA HHO iHHO
Unimodal F1 Avg 7.86E−39 3.65E+03 1.73E+04 7.91E−09 4.14E−03 1.72E−07 2.93E−79
Std 3.68E−38 1.29E+03 5.36E+03 2.39E−08 8.63E−03 6.91E−06 5.53E−79
F2 Avg 8.65E−04 2.35E+01 7.38E−07 9.92E+02 3.67E+01 5.02E−61 6.85E−87
Std 9.61E−04 5.91E+01 5.60E−06 4.27E+04 7.39E+02 6.94E−61 1.43E−86
F3 Avg 7.35E−06 2.98E+02 6.09E−32 3.36E−05 1.25E−30 5.69E−06 5.90E−92
Std 3.65E−07 5.66E+05 4.95E−31 9.42E−04 5.68E−29 8.32E−06 6.52E−92
F4 Avg 2.37E+01 6.62E−04 3.09E−02 7.50E−28 7.56E−59 6.96E−64 5.40E−69
Std 6.87E+01 1.25E−03 8.08E−01 3.45E−27 3.54E−62 2.35E−63 2.77E−69
Multimodal F5 Avg −1.24E+05 −7.56E+15 −6.68E+10 −5.23E+04 −1.29E+07 −1.18E+05 −7.45E+16
Std 6.32E+04 5.94E+11 8.97E+09 3.62E+03 4.26E+06 1.28E+04 1.35E+15
F6 Avg 9.65E+01 2.91E−09 4.29E−04 2.22E+02 2.06E−01 5.19E−10 6.28E−16
Std 5.94E+01 7.91E−09 1.93E−03 3.62E+02 8.39E−03 7.95E−08 1.64E−16
F7 Avg 2.95E−01 8.62E+02 1.62E−03 6.56E+01 4.50E−09 6.72E+01 3.69E−05
Std 6.54E−01 9.65E+02 7.32E−02 5.06E+01 2.62E−08 2.55E+01 9.26E−05
F8 Avg 2.84E−01 5.02E−14 5.24E−18 4.02E+02 6.30E−03 4.92E−06 6.89E−21
Std 8.92E−01 4.93E−14 2.93E−18 6.83E+02 2.96E−03 1.97E−05 4.02E−21
Bold values represent the significant difference between the proposed iHHO and other conventional techniques
Table 10 Results and comparisons of unimodal and multimodal benchmark functions with 500 dimensions

Problem/dimension Metric MFO MPA SCA SSA WOA HHO iHHO
Unimodal F1 Avg 5.92E+01 3.51E−04 8.23E+02 2.38E−23 6.24E−29 7.55E+01 1.92E−62
Std 1.25E+01 6.56E−03 7.19E+02 5.09E−22 4.95E−28 3.66E+02 6.10E−62
F2 Avg 3.51E−04 6.16E−02 5.57E−32 8.56E−15 2.69E+00 6.98E+02 4.66E−72
Std 2.60E−04 3.58E−02 4.92E−31 7.95E−14 4.62E+00 2.69E+01 1.06E−71
F3 Avg 1.09E−19 5.91E−02 6.82E+01 8.11E−02 7.62E−04 4.01E−27 4.53E−88
Std 5.25E−18 3.71E−01 4.92E+01 6.62E−02 4.69E−04 3.62E−26 3.62E−87
F4 Avg 7.62E−02 5.36E+01 3.93E−29 4.87E−24 6.02E+02 5.68E−13 4.95E−62
Std 5.62E−02 2.26E+01 1.78E−29 8.66E−24 5.91E+01 3.69E−12 1.92E−62
Multimodal F5 Avg −1.24E−14 −1.65E−24 −2.26E+01 −4.92E+12 −1.35E−15 −1.24E−18 −1.52E+20
Std 4.25E−14 3.26E−24 7.55E+01 2.15E+12 6.56E−15 5.62E−18 2.31E+20
F6 Avg 8.26E+01 7.52E−05 6.90E+02 1.25E−16 7.62E−14 8.10E+02 5.89E−22
Std 5.56E+02 2.98E−04 9.58E+02 4.56E−15 3.25E−13 2.25E+02 2.67E−21
F7 Avg 2.32E+01 7.93E−02 1.46E−19 3.74E−01 7.24E+04 3.56E−02 2.34E−24
Std 5.92E+01 3.48E−01 4.74E−18 8.27E−01 3.89E+04 6.40E−01 4.72E−24
F8 Avg 6.98E−04 5.38E+02 8.34E−06 9.65E−04 6.62E+01 8.26E−02 3.62E−18
Std 2.49E−04 4.28E+02 3.62E−06 5.62E−04 2.95E+01 5.26E−02 1.94E−18
Bold values represent the significant difference between the proposed iHHO and other conventional techniques
In addition, while handling diverse classes of problems, the algorithm has tactfully eluded local optima (LO) and immature convergence. The recommended iHHO has demonstrated a better ability to leap out of local optimum solutions under any local-optima stagnation. The features stated below would assist in comprehending the various theoretical reasons that substantiate the constructive nature of the recommended iHHO in exploring and exploiting the search space of a given optimization problem:

• The modified escaping energy (Ee), based on the novel control factor (CF) parameter, represents the smooth deprivation of the prey's escaping energy as it tries to escape from the predator. It not only ensures an even shift between the exploration and exploitation patterns of iHHO but also enhances them.
• When piloting a local search, diversification mechanisms such as Brownian motion (fB) preclude the algorithm from stagnating at local minima and improve the exploitative nature of iHHO.
• There is a constructive impact on the exploitation ability of iHHO, as it employs a sequence of searching stratagems centered on the Ee and r parameters and then selects the best movement step.
• The randomized jump (J) potentiality can support optimum solutions in maintaining the tendencies of intensification and diversification.

Table 11 Results and comparisons of unimodal and multimodal benchmark functions with 1000 dimensions

Problem/dimension Metric MFO MPA SCA SSA WOA HHO iHHO
Unimodal F1 Avg 6.89E−04 3.81E−02 7.82E+01 5.62E−35 4.98E+09 1.09E+02 1.49E−44
Std 7.26E−03 5.26E−02 2.54E+01 6.29E−34 2.95E+07 6.25E+01 3.26E−44
F2 Avg 2.61E+02 6.23E+04 3.64E−02 9.56E+01 5.62E−13 7.29E+10 3.32E−56
Std 5.65E+01 4.92E+04 1.95E−02 5.62E+01 3.62E−12 2.32E+10 2.91E−55
F3 Avg 8.92E−18 3.26E−04 6.26E+02 4.69E−10 3.52E+10 9.55E+01 2.35E−63
Std 2.55E−18 1.98E−03 4.55E+02 6.28E−09 8.14E+10 8.77E+01 1.09E−62
F4 Avg 3.62E+10 2.92E−01 9.56E−15 2.62E+10 6.29E−42 7.92E−56 3.69E−58
Std 2.60E+10 1.59E−01 4.29E−14 7.69E+10 3.27E−42 5.62E−55 1.62E−57
Multimodal F5 Avg −5.34E+15 −3.68E−09 −1.35E−17 −7.69E−04 −6.25E−24 −7.75E+01 −1.06E+17
Std 9.62E+13 1.64E−08 4.69E−16 2.78E−03 5.62E−24 3.62E+01 6.38E+16
F6 Avg 5.16E+01 8.13E+03 7.23E−09 2.30E−03 6.29E+10 8.26E−05 4.61E−13
Std 2.30E+01 7.33E+03 6.62E−08 1.18E−02 5.69E+10 5.95E−04 6.62E−12
F7 Avg 4.26E−13 6.78E+03 1.16E+12 7.95E−15 8.69E−04 5.22E+01 1.62E−17
Std 8.96E−11 9.21E+02 2.36E+11 5.26E−14 1.96E−03 3.26E+01 3.92E−17
F8 Avg 3.25E−01 8.65E+04 6.62E+10 2.62E−01 9.28E+10 5.91E−03 1.62E−08
Std 6.23E−01 5.00E+00 1.95E+09 7.28E−01 4.25E+10 1.56E−02 6.69E−08
Bold values represent the significant difference between the proposed iHHO and other conventional techniques

[Figure: accuracy convergence curves over 100 iterations for HHO, HHO+Brownian motion, HHO+proposed control factor, and HHO+Brownian motion+proposed control factor on each dataset; panels include CNS and OSCC.]

8 Conclusion

Harris Hawks Optimization is a population-based optimizer inspired by the cooperative nature and hurtling skills of the predatory bird, the Harris hawk. This research has stressed two setbacks associated with the traditional HHO. Firstly, the conventional HHO representation of the prey's escape energy is inefficient. Secondly, Levy flight's smaller and occasionally more extended step sizes in the initial random solution prevented the algorithm from emerging from local optima.

Randomization is a core component of swarm intelligence-based optimization techniques. In this research critique, randomization is instilled in the traditional HHO using Brownian motion rather than Levy flight. By using Brownian motion to update the best location of the initial random solution, the proposed model avoids stagnation in local minima and successfully converges to global optima. In addition, a novel control factor that efficiently imitates the escape energy of the prey is also incorporated. The performance of iHHO is assessed using six publicly accessible real high-dimensional microarray datasets. Performance measures such as precision, recall, f1-score, and classification accuracy are used to evaluate the confidence of the selected features. From the outcome of this research evaluation, presented by the different performance measures, it is understood that the features chosen by the proposed iHHO provide profound insight in detecting life-threatening diseases.

Various unimodal and multimodal problems were employed to scrutinize the exploitative and exploratory behavior and the local-optima evasion of the proposed iHHO. The outcomes illustrate that iHHO is proficient in discovering exceptional results compared to archetypal optimizers. Furthermore, the results on the six high-dimensional microarray datasets exposed that the iHHO can demonstrate results superior to those of the other optimizers. The proposed randomization using Brownian motion improves the convergence ability and helps to avoid premature convergence. However, like other optimization methods, iHHO has a few shortcomings: the proposed model is experimented with exclusively on FS problems with high-dimensional datasets. It can also be applied to constrained engineering design tasks for various engineering applications. To conclude, the performance of the Brownian motion has demonstrated a novel way to improve the outcomes of other optimization approaches in the future. Future works can utilize other evolutionary schemes, such as mutation and crossover schemes, evolutionary updating structures, and chaos-based phases, to develop binary and multi-objective versions of iHHO. In addition, iHHO can be employed to tackle problems in image segmentation, sentiment analysis, signal processing, fuzzy systems, and different engineering applications.
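As an illustrative aside, the two ingredients summarized above can be sketched schematically. The paper's novel control factor (CF) and Brownian update equations are not reproduced in this excerpt, so the classical HHO energy schedule E = 2·E0·(1 − t/T) from Heidari et al. (2019) stands in for the energy model, and `brownian_step` with its `scale` parameter is a generic Gaussian perturbation, not the authors' implementation:

```python
import random

random.seed(7)

T = 100  # total iterations, matching the 100-iteration experiments

def escaping_energy(t, e0):
    """Classical HHO escaping energy E = 2*E0*(1 - t/T), with E0 ~ U(-1, 1).

    The proposed iHHO replaces the linear (1 - t/T) decay with a novel
    control factor (CF) whose exact formula is not given in this excerpt.
    """
    return 2.0 * e0 * (1.0 - t / T)

def brownian_step(position, scale=0.01):
    """Brownian-motion perturbation: many small zero-mean Gaussian moves,
    in contrast to Levy flight's occasional very long jumps."""
    return [x + random.gauss(0.0, scale) for x in position]

e0 = random.uniform(-1.0, 1.0)
energies = [abs(escaping_energy(t, e0)) for t in range(T + 1)]
print(energies[0] >= energies[-1], energies[-1])  # → True 0.0
```

The magnitude of the energy decides whether a hawk explores (high |E|) or besieges the prey (low |E|); by the final iteration it has decayed to zero, forcing pure exploitation.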
Acknowledgements The authors sincerely thank the Department of Science and Technology (DST), Government of India, for funding this research project work under the Interdisciplinary Cyber-Physical Systems (ICPS) scheme (Grant No. T-54).