
Article in Journal of Ambient Intelligence and Humanized Computing · January 2022
DOI: 10.1007/s12652-021-03621-y


Journal of Ambient Intelligence and Humanized Computing
https://doi.org/10.1007/s12652-021-03621-y

ORIGINAL RESEARCH

A novel control factor and Brownian motion-based improved Harris Hawks Optimization for feature selection
K. Balakrishnan1 · R. Dhanalakshmi1 · Utkarsh Mahadeo Khaire2

Received: 8 April 2021 / Accepted: 25 November 2021


© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2021

Abstract
The massive growth in data size has prompted a proliferating need for Feature Selection (FS). Hence, FS has become an imperative method for dealing with high-dimensional data. This research proposes an enhanced feature selection method based on Harris Hawks Optimization (HHO) with a novel control factor and Brownian motion. The Brownian motion augments the exploitation ability of the foragers. It also replicates the deceptive movement of the prey, allowing predators to correct their location and direction according to the prey's position. At the same time, the novel control factor imitates the exact behavior of the prey's escaping energy. A comparative analysis with existing techniques using six real high-dimensional microarray datasets highlights the impact of the proposed Improved Harris Hawks Optimization (iHHO). The experimental results of FS and classification accuracy vividly depict how the proposed model outperforms the existing techniques.

Keywords Feature selection · Harris Hawks Optimization · Meta-heuristic optimization · Microarray dataset

* Utkarsh Mahadeo Khaire
[email protected]

1 Department of Computer Science and Engineering, Indian Institute of Information Technology, Tiruchirappalli, Tamil Nadu 620012, India
2 Department of Data Science and Intelligent Systems, Indian Institute of Information Technology, Dharwad, Karnataka 580009, India

1 Introduction

Feature selection (FS) is an established preprocessing stage used in machine learning tasks to deal with high-dimensionality issues. The primary aim of FS is to identify the main features and eliminate extraneous and insignificant features from the raw dataset for knowledge discovery. As illustrated in Fig. 1, FS is the process of selecting a smaller subset of features without radically decreasing classification accuracy. The first step in generating a subset from the input dataset is a valid search procedure. The second step compares the evaluated optimal subset with the antecedent subset. The newly updated subset replaces the older one if it compares favorably with the existing subset. The loop continues until the termination criteria are satisfied. FS has assisted in the development of a wide range of algorithms in various fields of engineering such as sentiment analysis (Madasu and Elango 2020), image processing applications (Bolón-Canedo and Remeseiro 2020), medical applications (Tuba et al. 2019), power systems (Abedinpourshotorban et al. 2016), text classification (Kou et al. 2020), pattern recognition (Gunal and Edizkan 2008), drug design (Houssein et al. 2020a), wireless sensor networks (Houssein et al. 2020b), information retrieval (Lew 2001), job scheduling (Gao et al. 2020a) and many other real-world applications.

Many scholars have defined feature selection as the ability of feature subsets to identify targets, maximize prediction accuracy, or alter the composition of the original data group. The three types of feature-subset search strategies are global optimal, sequence, and random. Discovering the current optimal subset of the initial feature sets is the aim of global optimal search. The three types of sequence search algorithms are forward search, backward search, and bidirectional search (Marcano-Cedeño et al. 2010). Forward search refers to the greedy method of adding the element with the best score to the Selected Feature Subset (SFS). Backward search denotes eliminating one component at a time from the selected feature subset using Sequence Backward Search (SBS). Bidirectional search is a combined forward and backward search technique that allows both adding and removing features (Gu et al. 2015). The random search strategy's feature selection is chaotic, with uncertainty,


Fig. 1 General feature selection procedure

forcing the algorithm to escape the local optimum and allowing it to find the estimated optimal solution (Dash and Liu 2003). As a result, random search strategies such as Simulated Annealing (SA) (Liu et al. 2018) and the Genetic Algorithm (GA) (Dong et al. 2018) outperform sequence search for feature subset selection in most cases.

This research focuses on FS using Meta-Heuristic (MH) optimization techniques, since traditional optimization techniques have several limitations when it comes to FS problems. Population-based methods are of five types (Heidari et al. 2019): Evolutionary Algorithm (EA), Physics-based Algorithm (PBA), Swarm-based Algorithm (SBA), Human-based Algorithm (HBA), and Mathematical-based Algorithm (MBA). Figure 2 depicts numerous well-known population-based algorithms. In population-based algorithms, two major characteristics are typically found: intensification and diversification.

Fig. 2 Population-based MH optimization algorithms

Harris Hawks Optimization (HHO) is a nature-inspired MH algorithm developed by Heidari et al. (2019). HHO imitates the attacking strategies of Harris hawks, such as besiege, perching, and surprise pounce, to find an optimal solution. HHO is split into two exploration stages and four exploitation stages. In many real-world applications, HHO outperforms traditional meta-heuristic algorithms such as the Whale Optimization Algorithm (WOA) and the Genetic Algorithm (GA). However, through this research, two drawbacks associated with traditional HHO are identified. First, the representation of the prey's escape energy is inefficient in conventional HHO. Second, Levy flight imitates the leapfrog movements of prey fleeing from a predator; its predominantly small and occasionally long step sizes hinder the algorithm from emerging out of local optima. As a result, a novel control factor that accurately imitates the prey's escape energy over time is developed through this research


critique. This research has employed Brownian motion rather than Levy flight to attain the optimum solution in the stipulated period. In brief, the following are the significant contributions of this research:

1. This research has proposed a robust iHHO for selecting significant features from high-dimensional microarray datasets.
2. A novel control factor is used to express the prey's escape energy accurately.
3. Considering that randomization is the key to any MH algorithm, the zigzag movement of escaping prey and the surprise attacks of predators are represented using Brownian motion instead of Levy flight.
4. The proposed iHHO's performance is assessed using six real high-dimensional microarray datasets and several unimodal and multimodal functions.
5. The outcomes of the proposed iHHO are compared with six well-established optimization techniques.

Section 2 of this research article covers the literature survey. Section 3 stresses the motivation of the proposed iHHO. The background and mathematical approach of iHHO are discussed in Sect. 4. Section 5 deals with the implementation of the proposed method. Section 6 focuses on the simulation findings, and discussions are elaborated in Sect. 7. Section 8 emphasizes the conclusion and the future scope.

2 Related work

The reliability factor has resulted in the stochastic optimizers' interest in solving problems in various fields such as the manufacturing industry, environmental quality, solar systems, power systems, and other engineering areas (Alabool et al. 2021). Compared to traditional methods, nature-inspired MH algorithms have promising results for complex problems. Feature Selection (FS) is one of the most popular applications of stochastic optimization algorithms. Owing to the slow convergence of MH algorithms, researchers use different techniques to improve an MH algorithm by adding one or more techniques to the existing algorithms. The well-known techniques are Levy flight, Brownian motion, binary variants, and opposition-based learning. Sihwail et al. proposed an improved version of HHO based on Elite Opposition Based Learning (EOBL) and Three Search Strategies (TSS) to boost the global and local searches of HHO (Sihwail et al. 2020). EOBL increased the diversity of the HHO population, and the TSS search strategies assisted the algorithm in its search for global optima by avoiding traps in local optima. The results of the experiments show that the proposed approach outperforms the other algorithms in all the metrics.

Elminaam et al. proposed a novel technique for the FS problem based on the Marine Predator Algorithm (MPA) (Elminaam et al. 2021). This study encompassed the hybridization of MPA with KNN. The proposed MPA-KNN adapts the basic exploratory and exploitative procedures to choose the most relevant features for the most accurate identification. Five different evaluation criteria are carried out on 18 UCI datasets to explore the suggested approach's performance. The obtained results show that the proposed model provides superior performance to conventional MH algorithms in selecting the optimal features. Elgamal et al. proposed an improved variant of HHO based on Simulated Annealing (SA) for FS in the medical field (Elgamal et al. 2020). The suggested model handles concerns such as population diversity and local optima of conventional HHO; SA is used to improve the exploitation capability of HHO. The SA algorithm is also used by Abdel-Basset et al., who proposed a variant of HHO termed the Chaotic HHO (CHHO) algorithm. CHHO is developed using chaotic maps and incorporates the SA algorithm to enhance the population diversity and converging ability to avoid local optima (Abdel-Basset et al. 2021).

Neggaz et al. proposed an improved variant of the Salp Swarm Algorithm (SSA) for FS problems using the Sine-Cosine Algorithm (SCA) and a disrupt operator (Neggaz et al. 2020a). The SCA facilitates exploration and prevents stagnation. In addition, the disrupt operator is used to enhance the population diversity. Neggaz et al. also suggest a novel approach based on Henry Gas Solubility Optimization (HGSO) to select the optimal features and improve classification accuracy (Neggaz et al. 2020b). Ahmed et al., in their research, used SSA with four different chaotic maps to balance exploration and exploitation (Ahmed et al. 2018). Twelve real-world datasets are used to test the proposed techniques. The findings indicate that chaotic maps improve the proposed model's performance considerably compared to conventional methods. In a study, Zhang et al. proposed a binary variant of HHO for global optimization and FS problems (Zhang et al. 2020). The SSA mechanism is merged into traditional HHO to enhance the exploitation and exploration behavior of HHO. The suggested HHO is tested on 23 classical functions using statistical metrics and convergence rate.

Houssein et al. balanced exploration and exploitation of HHO using genetic operators and two methods, Opposition Based Learning (OBL) and Random Opposition Based Learning (ROBL) (Houssein et al. 2021). The Monoamine Oxidase and QSAR Biodegradation datasets are used to assess the efficacy of the proposed system. The findings indicate that the three variants of the proposed model outperform the conventional techniques in determining the best subset of chemical descriptors. Houssein et al. also


suggested a hybrid version of HHO based on Cuckoo Search Optimization (CSO) and chaotic maps to boost the efficiency of the original HHO (Houssein et al. 2020a). Furthermore, the proposed model was paired with the Support Vector Machine (SVM) as a machine learning classifier for performing chemical descriptor collection and chemical compound operations.

Hussien et al. proposed an improved variant of HHO with three strategies: OBL, Chaotic Local Search, and a self-adaptive technique (Hussien and Amin 2021). The OBL strategy is incorporated in the initialization phase, and the other two strategies are embedded in the update phase to enhance the converging ability of the proposed model. Hussain et al. suggested a unique hybrid approach for numerical methods and FS issues by combining two algorithms, SCA and HHO (Hussain et al. 2021). The primary goal of this research is to balance exploration and exploitation capabilities. Ismael et al. suggested a novel hybrid model that optimizes the hyperparameters of v-SVR while simultaneously embedding feature selection using OBL (Ismael et al. 2020).

Gao et al. discovered a tent map that could be used to improve the converging capabilities of HHO (Gao et al. 2019). The tent maps can initialize the population distribution and escape the equal distribution by generating chaos using non-periodic, non-converged, and bounded random numbers. The chaos is also applied to the algorithm to substitute the random numbers. The proposed algorithm is evaluated using 18 benchmarks of unimodal and multimodal functions. Gao et al. also proposed a binary variant of Evolutionary Optimization (EO) for FS problems (Gao et al. 2020b). The model employs the Sigmoid (S-shaped) and V-shaped transfer functions to transform EO into its binary version and change the particle's current position vector. The suggested method is tested using nineteen UCI benchmark datasets. Based on the experimental findings, the binary EO technique performs well compared to other approaches for addressing FS issues. Zhang et al. introduced a return-cost-based Firefly Algorithm (FFA), which uses the binary movement operator to change firefly positions (Zhang et al. 2017). Zhang et al. also presented an improved HHO based on SSA, assuming that SSA's powerful explorative capacity would facilitate exploring the original HHO (Zhang et al. 2020). Initialize and update are the two stages of the proposed method. According to the experimental findings, the proposed approach clearly outperforms other approaches in terms of average feature length and error rate measures.

To solve the issue of feature selection in classification tasks, Too et al. proposed two improved variants of HHO, namely Binary HHO (BHHO) and Quadratic Binary HHO (QBHHO) (Too et al. 2019). The BHHO has a built-in S-shaped or V-shaped conversion mechanism that converts continuous search agents to binary. The quadratic transfer function is used in QBHHO to rejuvenate BHHO's feature selection efficiency. The best fitness value, mean fitness value, standard deviation of fitness value, classification accuracy, and feature size are used to assess the results of the proposed algorithms. The findings of this analysis indicate that the quadratic transfer function algorithm provides the best results. Too and Mirjalili have also introduced a general learning method to aid search agents in avoiding local optima and improving their capacity to find a promising region (Too and Mirjalili 2021). Sixteen biological datasets were used to test the proposed General Learning Equilibrium Optimizer (GLEO).

Mafarja and Mirjalili developed a new hybrid metaheuristic technique combining SA and WOA; the SA scheme improves the optimum solution found by WOA (Mafarja and Mirjalili 2017). The key objective of this hybridization is to improve the exploitation ability of WOA, represented by SA. The performance assessment demonstrates an improvement in classification accuracy and produces better results than wrapper-based techniques. Eighteen standard datasets were taken from the UCI repository to compare the proposed method. Hussien et al. suggested a novel binary variant of WOA (BWOA) to choose the best feature subset for the FS problem (Hussien et al. 2019). An S-shaped transfer function is used to control the novel strategy. Over eleven different datasets, a series of parameters is used to evaluate and compare the proposed model with the existing one. Emary et al. used a threshold value to solve feature selection problems, encompassing the first binary variant of the Firefly Algorithm (FFA) (Emary et al. 2015). The suggested algorithm had a high level of exploration and was able to find a simple solution to the problem. Kanimozhi et al. proposed an image retrieval strategy based on SVM classifiers and FFA (Kanimozhi and Latha 2015). The fundamental goal was to improve the algorithm's accuracy by using optimal functionality, and the algorithm was put to the test on a variety of image datasets.

3 Motivation

The existing Harris Hawks Optimization presents the escaping energy of the prey using Eq. (1).

E = 2E0 × (1 − t/T)    (1)

In Eq. (1), E denotes the prey's escaping energy, T is the maximum number of iterations, and E0 is the energy's initial state, which varies randomly across the interval (−1, 1). The graphical representation of the behavior of E is shown in Fig. 3. The figure imitates the deceptive zigzag movement of the prey. Figure 3 represents an unexpected hike in the


Fig. 3 Behavior of escaping energy (E) in traditional HHO during 1000 iterations

Fig. 4 The behaviour of novel control factor-based escaping energy (Ee) in improved HHO during 1000 iterations

Fig. 5 Movement of Harris hawks using Levy flight

Fig. 6 Movement of search agent using Brownian motion in the improved Harris Hawks

prey's energy even as it gradually loses its energy while running away from its predator.

The accurate behaviour of the escaping energy of the prey is depicted in Eq. (2), which is the proposed novel Control Factor (CF) that accurately imitates the behaviour of the prey's escaping energy.

CF = (1 − t/T)^(2 × (t/T))    (2)

The graphical representation of the CF in Fig. 4 shows the smooth deprivation of the escaping energy of the prey as the model progresses. The CF gradually decreases from 1 to 0, an exact imitation of the prey losing its energy while escaping from the predator.

The conventional Harris Hawks use Levy flight with progressive dives in the exploitation phase during the hard besiege stage. Figure 5 shows the nature of the movements of Harris hawks using the Levy flight. Due to the smaller and occasionally larger step sizes, the algorithm sometimes fails to recover from local optima.

The aforementioned drawback of existing Harris Hawks Optimization is overcome by the proposed movement of Harris hawks using Brownian motion, as stated in Eq. (3). The graphical representation of Brownian motion is shown in Fig. 6.

Brownian Motion = fB(x) = (1/√(2π)) × exp(−x²/2)    (3)

In Eq. (3), x denotes the Harris hawk's current location in the iteration. The Brownian motion allows hawks to take a longer step size, which eventually precludes the algorithm from stagnation at local minima.
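To make the contrast concrete, the two energy schedules of Eqs. (1), (2) and (6), and the standard-normal steps behind Eq. (3), can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code; the seed and the 1000-iteration horizon are arbitrary choices mirroring Figs. 3 and 4.

```python
import numpy as np

T = 1000                         # maximum number of iterations
t = np.arange(T)
rng = np.random.default_rng(0)

# Eq. (1): escaping energy in traditional HHO (linearly decaying envelope)
E0 = rng.uniform(-1.0, 1.0, T)   # initial energy state, redrawn each iteration
E = 2.0 * E0 * (1.0 - t / T)

# Eq. (2): proposed control factor, decaying smoothly from 1 to 0
CF = (1.0 - t / T) ** (2.0 * t / T)

# Eq. (6): escaping energy of iHHO driven by the control factor
Ee = 2.0 * CF * E0

# Eq. (3): Brownian-motion steps are standard-normal samples
brownian_steps = rng.standard_normal(T)

print(CF[0], CF[-1])             # CF starts at 1 and approaches 0
```

Plotting E against Ee over the 1000 iterations reproduces the qualitative difference between Figs. 3 and 4: both envelopes shrink, but only the CF-driven schedule decays smoothly without late-stage spikes.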


4 Improved Harris Hawks Optimization (iHHO)

The modified HHO is divided into three phases: initialization, update, and classification. The following subsections elucidate each component.

4.1 Initialization phase

The initialization of iHHO reflects other meta-heuristic techniques: it initializes the random search agents and identifies the current best solution using a defined objective function.

4.2 Update phase

In this phase, to achieve the optimum solution, the proposed algorithm performs both intensification and diversification. Figure 7 represents the specific conditions that contribute to the diversification and intensification of the algorithm. This procedure is categorized into four stages: soft besiege, soft besiege with progressive rapid dives, hard besiege, and hard besiege with progressive rapid dives.

4.2.1 Exploration

In this phase, the predator (Harris hawk) monitors the grander search space to discover the location of the prey (rabbit). Let q denote the probability of success for the perching strategy. For the condition q < 0.5, the predator perches based on the position of the other family members and the rabbit. Otherwise (q ≥ 0.5), a random search agent changes its position. The mathematical representation of the perching strategy of Harris hawks is modeled in Eq. (4).

Y(t + 1) = Yrand(t) − r1 × |Yrand(t) − 2 × r2 × Y(t)|,    if q ≥ 0.5
Y(t + 1) = (Yrabbit(t) − Ym(t)) − r3 × (lb + r4 × (ub − lb)),    if q < 0.5    (4)

where Y(t) and Y(t + 1) are the positions of the search agent in iterations t and t + 1, respectively. r1, r2, r3, and r4 are arbitrary values in the range [0, 1]. ub and lb denote the upper and lower bound values of every individual search agent. Yrabbit(t) and Yrand(t) are the position of the prey and the current random position, respectively. Ym(t) is the average location of the search agents, obtained using Eq. (5).

Ym(t) = (1/N) × Σ_{i=1..N} Yi(t)    (5)

where N is the number of search agents and Yi(t) epitomizes the position of search agent i in iteration t.

4.2.2 Transition from exploration to exploitation

Generally, the performance of MH algorithms is determined by the factor that balances exploration and exploitation. The evolution of iHHO from exploration to exploitation is governed by the prey's escape energy using the following equation:

Escaping_Energy (Ee) = 2 × CF × E0    (6)

In Eq. (6), E0 is a random initial state whose values range from −1 to 1. |Ee| ≥ 1 indicates the exploration process, |Ee| < 1 represents the exploitation process, and CF denotes the control factor.

Fig. 7  Phases of improved HHO
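The perching rules of Eqs. (4) and (5) can be sketched as one NumPy update step. This is an illustrative sketch under stated assumptions: the bounds, population size, dimension, and the stand-in best agent are arbitrary choices, not values from the paper, and agents are clipped back into the bounds as a common (here assumed) safeguard.

```python
import numpy as np

rng = np.random.default_rng(1)
N, dim = 100, 30                 # number of search agents, problem dimension
lb, ub = -1.0, 1.0               # lower / upper bounds per dimension
Y = rng.uniform(lb, ub, (N, dim))
Y_rabbit = Y[0].copy()           # stand-in for the current best agent

def explore(Y, Y_rabbit, lb, ub, rng):
    """One exploration move per agent following Eqs. (4) and (5)."""
    Y_new = np.empty_like(Y)
    Ym = Y.mean(axis=0)                          # Eq. (5): mean position
    for i in range(len(Y)):
        r1, r2, r3, r4, q = rng.uniform(0.0, 1.0, 5)
        Y_rand = Y[rng.integers(len(Y))]         # a randomly chosen agent
        if q >= 0.5:                             # perch on a random tall tree
            Y_new[i] = Y_rand - r1 * np.abs(Y_rand - 2.0 * r2 * Y[i])
        else:                                    # perch relative to family and rabbit
            Y_new[i] = (Y_rabbit - Ym) - r3 * (lb + r4 * (ub - lb))
        Y_new[i] = np.clip(Y_new[i], lb, ub)     # keep agents inside the bounds
    return Y_new

Y = explore(Y, Y_rabbit, lb, ub, rng)
print(Y.shape)
```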


4.2.3 Exploitation phase

In this phase, iHHO uses four strategies to attack the prey.

4.2.3.1 Soft besiege During this stage, the prey has enough energy to flee from the predator (r ≥ 0.5). However, the predator surrounds the prey tactfully and attacks unexpectedly. The behavior of the unpredictable attack is modeled as follows:

Y(t + 1) = ΔY(t) − (Ee) × |J × Yrabbit(t) − Y(t)|    (7)

ΔY(t) = Yrabbit(t) − Y(t)    (8)

In Eq. (7), J = 2 × (1 − r5) represents the random jump strength of the rabbit throughout the escaping procedure, where r5 is an arbitrary number in the range (0, 1). The J value changes randomly in each iteration to simulate the nature of rabbit motions.

4.2.3.2 Hard besiege The escaping energy of the prey is low; when r ≥ 0.5 and |Ee| < 0.5, the predator encircles the victim fiercely and strikes randomly, as formulated in Eq. (9).

Y(t + 1) = Yrabbit(t) − (Ee) × |ΔY(t)|    (9)

4.2.3.3 Soft besiege with progressive rapid dives The prey has sufficient energy to flee from the predator when |Ee| ≥ 0.5 and r < 0.5. The prey performs a leapfrog movement to escape from the predator. Compared to other previous techniques, performing a soft besiege before a sudden attack is a better approach. The respective zigzag movements of the prey and the attacking strategy of the predator are modeled using the succeeding equations:

X = Yrabbit(t) − (Ee) × |J × Yrabbit(t) − Y(t)|    (10)

According to the movement of the prey, predators make more abrupt dives that suit the prey's deceptive movements. This research has espoused the Brownian motion to imitate the predator's most irregular and rapid plunge. Equation (11) envisages the procedure of a predator changing its position based on the Brownian motion.

stepsize = fB ⊗ (Yrabbit − fB ⊗ Y(t))
Z = X + P × R ⊗ stepsize    (11)

In Eq. (11), P = 0.5 is a constant number. The final procedure for updating the position of the search agent can be performed by Eq. (12), in which F is the fitness function.

Y(t + 1) = X if F(X) < F(Y(t)); Z if F(Z) < F(Y(t))    (12)

4.2.3.4 Hard besiege with progressive rapid dives When |Ee| < 0.5 and r < 0.5, the prey cannot flee from the hawks owing to its insufficient energy. The hard besiege of hawks with progressive dives is formulated using the following equations:

X = Yrabbit(t) − (Ee) × |J × Yrabbit(t) − Ym(t)|    (13)

stepsize = fB ⊗ (Yrabbit − fB ⊗ Y(t))
Z = X + P × R ⊗ stepsize    (14)

The final position updating of the Harris hawks is done using Eq. (15):

Y(t + 1) = X if F(X) < F(Y(t)); Z if F(Z) < F(Y(t))    (15)

4.3 Classification phase

The standard sigmoid function is used to classify the data samples to verify whether the algorithm progresses correctly. The sigmoid function is a customary binary classifier that assures the data sample category using a predefined threshold value. As shown in Eq. (16), the binary cross-entropy is used to calculate the error rate, where yi is the actual value and p(yi) is the predicted value of the data sample.

l = −(1/N) × Σ_{i=1..N} [yi × log(p(yi)) + (1 − yi) × log(1 − p(yi))]    (16)

The improved Harris Hawks Optimization (iHHO) algorithm's pseudo-code is as follows:
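A compact sketch of the soft besiege with progressive rapid dives, Eqs. (10)-(12), may make the greedy selection concrete. This is an illustrative NumPy implementation, not the authors' code: the sphere function is a stand-in unimodal benchmark (not the paper's cross-entropy objective), and the agent positions and Ee value are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 30
P = 0.5                                   # constant from Eq. (11)

def sphere(y):                            # stand-in fitness (unimodal benchmark)
    return float(np.sum(y ** 2))

def soft_besiege_rapid_dives(Y_i, Y_rabbit, Ee, rng):
    """Eqs. (10)-(12): soft besiege with progressive rapid dives."""
    J = 2.0 * (1.0 - rng.uniform())                        # random jump strength
    X = Y_rabbit - Ee * np.abs(J * Y_rabbit - Y_i)         # Eq. (10)
    fB = rng.standard_normal(Y_i.shape)                    # Brownian-motion steps
    R = rng.uniform(0.0, 1.0, Y_i.shape)
    stepsize = fB * (Y_rabbit - fB * Y_i)                  # Eq. (11)
    Z = X + P * R * stepsize
    # Eq. (12): greedy selection, keep a dive only if it improves the fitness
    if sphere(X) < sphere(Y_i):
        return X
    if sphere(Z) < sphere(Y_i):
        return Z
    return Y_i

Y_i = rng.uniform(-1.0, 1.0, dim)
Y_rabbit = 0.1 * rng.uniform(-1.0, 1.0, dim)
Y_next = soft_besiege_rapid_dives(Y_i, Y_rabbit, Ee=0.7, rng=rng)
```

Because Eq. (12) only ever accepts X or Z when they strictly improve the fitness, the updated agent is never worse than the current one; the hard besiege variant of Eqs. (13)-(15) is identical except that Ym(t) replaces Y(t) inside Eq. (13).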

13
K. Balakrishnan et al.

Algorithm 1: Pseudo-code of iHHO

Initialize random solutions Yi (i = 1, 2, 3, …, n)
Compute the log loss of each initial random solution using Eq. (16) to identify Yrabbit (best location)
While (current iteration t < T)
    for each search agent (Yi)
        Update the escaping energy Ee based on Eq. (6)
        if (|Ee| ≥ 1)
            Update the position vector based on Eq. (4)
        if (|Ee| < 1)
            if (|Ee| ≥ 0.5 and r ≥ 0.5)
                Update position vector based on Eq. (7)
            else if (|Ee| < 0.5 and r ≥ 0.5)
                Update position vector based on Eq. (9)
            else if (|Ee| ≥ 0.5 and r < 0.5)
                Update position vector based on Eq. (12)
            else if (|Ee| < 0.5 and r < 0.5)
                Update position vector based on Eq. (15)
Return Yrabbit
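The log-loss fitness of Eq. (16) that drives Algorithm 1 can be written directly in NumPy. Note that the epsilon clipping is a numerical-stability safeguard added here (to avoid log(0)), not part of the paper's formula; the sample labels and probabilities below are arbitrary.

```python
import numpy as np

def log_loss(y_true, y_prob, eps=1e-12):
    """Binary cross-entropy of Eq. (16), used as the iHHO fitness value."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)   # guard against log(0)
    return float(-np.mean(y_true * np.log(y_prob)
                          + (1.0 - y_true) * np.log(1.0 - y_prob)))

y_true = np.array([1.0, 0.0, 1.0, 0.0])
perfect = log_loss(y_true, np.array([1.0, 0.0, 1.0, 0.0]))
poor = log_loss(y_true, np.array([0.6, 0.4, 0.7, 0.5]))
assert perfect < poor                          # lower loss = better search agent
```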

5 Implementation

This section exemplifies the experimental setup, the process flow, and an overview of the high-dimensional datasets used in the present research.

5.1 Dataset

The proposed model is evaluated using six publicly available high-dimensional microarray datasets. Table 1 provides an overview of the datasets used in the study. A microarray dataset contains gene expression levels in a continuous data format, which helps analyze a multitude of genes in a limited period. Analyzing microarray data helps in the diagnosis and prognosis of life-threatening diseases. Microarray data is a combination of significant and noisy features; irrelevant and insignificant features in microarray data encumber the diagnosis of diseases. All extraneous columns of the input datasets (such as Patient id, Name, Address, etc.) are removed before starting the experiment. All the datasets used in the process have two classes, viz. normal and malignant, in the target variable.

Table 1 Overview of datasets

Dataset name                     Number of features    Number of samples
Breast cancer                    24,481                97
Central nervous system (CNS)     7129                  60
Colon Cancer                     2000                  60
Leukemia                         7129                  72
OSCC                             41,003                50
Ovarian cancer                   15,154                253

5.2 Process flow

Figure 8 depicts the implementation of the proposed technique. In the initial phase, the data balancing step uses the SMOTE-Tomek algorithm to manage the disproportion of data samples in the input data. The SMOTE-Tomek algorithm is a combination of over-sampling and under-sampling techniques that aid in the elimination of data imbalances. Due to the addition of new data samples, data balancing typically increases the size of the original input data. In the initial stage, the input data is also normalized using Min-Max scaling. Along with the data preprocessing, the maximum number of iterations is set to 100, and a position matrix containing 100 random search agents is generated. The fitness value of the logistic function is determined using an unbiased cross-entropy function for each row of the position matrix, which aids in choosing the best location of the prey (Yrabbit).
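The Min-Max normalization and position-matrix initialization described above can be sketched as follows. This is an illustrative sketch: the toy data shape mirrors the Colon Cancer dataset (60 × 2000), the random values stand in for real gene expressions, and the SMOTE-Tomek balancing step is omitted because it requires a dedicated resampling library.

```python
import numpy as np

rng = np.random.default_rng(3)

def min_max_scale(X):
    """Column-wise Min-Max normalization of the input data to [0, 1]."""
    X_min, X_max = X.min(axis=0), X.max(axis=0)
    span = np.where(X_max > X_min, X_max - X_min, 1.0)   # avoid divide-by-zero
    return (X - X_min) / span

# toy gene-expression matrix: 60 samples x 2000 features (Colon Cancer shape)
X = rng.normal(size=(60, 2000)) * 50.0 + 100.0
X_scaled = min_max_scale(X)

# position matrix: 100 search agents, one candidate weight per feature
n_agents = 100
positions = rng.uniform(0.0, 1.0, (n_agents, X.shape[1]))
```

Each row of the position matrix is then scored with the cross-entropy fitness of Eq. (16) to pick the initial Yrabbit.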


Fig. 8 The proposed approach's process flow

During the updating phase, the iHHO updates the Yrabbit and evaluates the fitness value associated with it. The proposed model returns the best search agent with the lowest objective function value at the end of each iteration. The weight of each feature of the input data is indicated by the unsurpassed search agent, which aids in determining the significance of the corresponding feature. A Support Vector Machine (SVM) classifier with a polynomial kernel function of degree three verifies the validity of the selected feature subset during the classification phase. Training and testing are the two divisions of the reduced feature set used in the tenfold cross-validation (CV) method. Using a tenfold CV minimizes the probability of overfitting the predictive model. For each epoch, nine out of ten input data blocks are used as the training group, and one block is used as the testing set. All results are recorded and compared based on the average performance of the optimizers over 30 independent runs, in which each run includes 100 iterations. The performance of the proposed approach is measured using precision, recall, f1-score, and accuracy measures, in addition to the ROC-AUC curve.

6 Experimental results and discussion

From Sect. 6.1 onwards, the experimental results and an intricate discussion highlight the proposed technique's effectiveness.

6.1 Performance on high dimensional microarray datasets

This phase focuses on how the suggested iHHO is applied to six real high-dimensional microarray datasets. The outcomes of iHHO are compared to numerous existing optimization techniques proposed in earlier research. Particulars of the several microarray datasets used in this research are presented in Table 1.

6.1.1 Comparison based on converging ability

The unique potentialities and efficacies of the suggested model are assessed employing the cross-entropy objective function, which calculates the error rate in each iteration. Every outcome of this research experiment is documented and compared, considering the average performance of the optimizers over 30 independent runs, in which each run includes 100 iterations. The ability of the proposed model to converge to global minima is demonstrated by the decrease in the error rate with each iteration. Figure 9 compares the converging ability of iHHO to that of conventional HHO for the microarray datasets. The results and performance of the proposed iHHO are compared with other well-established optimization techniques such as MFO (Mirjalili 2015), MPA (Faramarzi et al. 2020), SCA (Mirjalili 2016), SSA (Mirjalili et al. 2017), WOA (Mirjalili and Lewis 2016), and HHO on six different microarray datasets. The proposed iHHO's optimum convergence rate against

K. Balakrishnan et al.

[Figure: six panels (Breast Cancer, Leukemia, CNS, OSCC, Colon Cancer, Ovarian Cancer) plotting error rate against iterations 1–100 for MFO, MPA, SCA, SSA, WOA, HHO, and iHHO]

Fig. 9 Convergence curve

the global minimum demonstrates its efficacy in discovering and locating a better solution. Following the convergence curves in Fig. 9, it can be seen that the proposed iHHO reaches the global minimum, whereas the other conventional techniques fail to provide an optimum solution even at the 100th iteration. The projected movement with which the iHHO transits from exploration to exploitation can also be detected. It is also noted that the iHHO exhibits an augmented convergence trend. The proposed iHHO is free from premature convergence, an ailment of MH (metaheuristic) optimization that prevents an algorithm from achieving the optimum output. The proposed iHHO requires more than 50 epochs to reach the global minimum, whereas traditional HHO requires only 20 iterations on the majority of the datasets. This requirement of epochs indicates that HHO is experiencing premature convergence, whereas iHHO is devoid of it.

6.1.2 Comparison based on training accuracy

The comparison of training accuracy between the proposed iHHO and the other conventional optimizers is illustrated in Fig. 10. In contrast to the other approaches, the proposed model has a higher potential for increased accuracy in the early stages. From Fig. 10, it can be understood that, compared to the other algorithms, there is a gradual increase and noteworthy development


[Figure: six panels (Breast Cancer, Leukemia, CNS, OSCC, Colon Cancer, Ovarian Cancer) plotting training accuracy against iterations 1–100 for MFO, MPA, SCA, SSA, WOA, HHO, and iHHO]

Fig. 10 Training accuracy throughout 100 epochs

in the accuracy of the proposed iHHO. The accuracy of the suggested model ranges between 90 and 100%, whereas the values of the other techniques lie around 50–70%. On the majority of the datasets, the proposed iHHO shows significant improvement. However, in the case of colon cancer, though the traditional HHO demonstrates higher accuracy than the proposed approach, it is to be noted that its accuracy does not increase after the initial iterations.

6.1.3 Comparison based on unseen test data

In this phase, an assessment on unseen data is utilized to investigate the credibility of the iHHO in handling unknown data. The performance of the proposed iHHO is compared with other conventional optimization techniques using box plots, violin plots, heat maps, the Standard Deviation (STD), and the average of the test results (AVG). The non-parametric Wilcoxon rank-sum test, commonly known as the Mann Whitney U statistical test, with a 5% level of significance, is carried out alongside the experimental evaluations to trace the substantial variances among the outcomes of the varied contenders. The research hypothesis of the Wilcoxon rank-sum test states that there is a noteworthy difference between the two groups.

The performance of the proposed iHHO on several microarray test datasets is demonstrated in Figs. 11, 12, and 13. A box plot (Fig. 11) utilizes boxes and lines to illustrate the distributions of one or more groups of numeric data. Box limits point to the central 50% of the data range, with a


Table 2 Comparison of average accuracy results over 30 runs

Dataset Metric MFO MPA SCA SSA WOA HHO iHHO
Breast Cancer AVG 0.4929 0.6123 0.6367 0.4487 0.5313 0.5640 0.7236
STD 0.0000 0.0534 0.0755 0.0236 0.0047 0.0121 0.0531
CNS AVG 0.4614 0.5229 0.6535 0.5853 0.5862 0.6964 0.6352
STD 0.0197 0.0535 0.0566 0.0037 0.0188 0.0404 0.0532
Colon Cancer AVG 0.5745 0.5607 0.5970 0.7504 0.5234 0.8117 0.6646
STD 0.0057 0.0282 0.1013 0.1052 0.0105 0.0333 0.1034
Leukemia AVG 0.5674 0.6569 0.6112 0.5392 0.7500 0.6651 0.7717
STD 0.0171 0.0485 0.0175 0.0331 0.0683 0.0573 0.0965
OSCC AVG 0.8553 0.6894 0.8116 0.8476 0.7959 0.8281 0.9521
STD 0.0039 0.0016 0.0083 0.0304 0.0000 0.0315 0.0208
Ovarian Cancer AVG 0.5408 0.7141 0.6765 0.6346 0.6193 0.6091 0.7566
STD 0.0100 0.1006 0.0515 0.0025 0.0024 0.0099 0.0479

Bold values represent the significant difference between the proposed iHHO and other conventional tech-
niques

Table 3 p values of the Mann Whitney U test with 5% significance

Datasets Metric MFO MPA SCA SSA WOA HHO
Breast Cancer U val 0 639 1799 0 0 0
p val 5.60E−39 1.02 E−26 5.02 E−15 2.28 E−34 2.01 E−34 1.03 E−37
RBC 1 0.8722 0.6402 1 1 1
CNS U val 0 457 6001.5 2037.5 2131 8892
p val 1.77 E−34 1.06 E−28 0.014378 1.01 E−14 1.13 E−12 1.90 E−21
RBC 1 0.9086 − 0.2003 0.5925 0.5738 − 0.7784
Colon Cancer U val 100 86 63 2559 0 115
p val 1.43 E−33 6.14 E−34 1.26 E−33 2.43 E−09 2.22 E−34 7.31 E−33
RBC 0.98 0.9828 0.9874 0.4882 1 0.977
Leukemia U val 450 839 541 453 4326 739
p val 7.75 E−29 2.11 E−24 6.99 E−28 9.94 E−29 0.099742 3.33 E−26
RBC 0.91 0.8322 0.8918 0.9094 0.1348 0.8522
OSCC U val 6260 300 300 300 2822 10,000
p val 4.94 E−07 7.98 E−38 2.52 E−38 6.94 E−35 9.44 E−09 1.62 E−37
RBC − 0.252 0.94 0.94 0.94 0.4356 −1
Ovarian Cancer U val 9981 1698 1158 0 77 0
p val 5.92 E−35 2.74 E−16 2.01 E−21 3.46 E−37 1.94 E−34 4.11 E−35
RBC − 0.9962 0.6604 0.7684 1 0.9846 1
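The statistics reported in Table 3 (U value, p value, rank-biserial correlation) can be reproduced in outline with SciPy. The per-run accuracies below are invented for illustration; the RBC formula r = 1 − 2U/(n1·n2) is the standard rank-biserial form, and passing the competitor's sample first (so that U = 0 and RBC = 1 mean complete dominance by iHHO, as in the Breast Cancer/MFO row) is an inferred convention:

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(7)
# Invented per-run test accuracies for iHHO and one competitor.
acc_ihho = rng.normal(0.72, 0.05, 30)
acc_other = rng.normal(0.56, 0.05, 30)

# Two-sided Mann Whitney U test at the 5% significance level.
u, p = mannwhitneyu(acc_other, acc_ihho, alternative="two-sided")
significant = p < 0.05

# Rank-biserial correlation (the RBC rows of Table 3).
n1, n2 = len(acc_other), len(acc_ihho)
rbc = 1.0 - 2.0 * u / (n1 * n2)
```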

central line denoting the median value. Lines extend from each box to capture the range of the balanced data, with dots placed past the line edges to indicate outliers. Box plots are well suited to comparative analysis between two groups, and they are precise in data summarization. Compared to the other optimizers, the accuracy values produced by the proposed iHHO over 100 iterations and 30 independent runs have a symmetric distribution on the majority of the datasets. Except for CNS and colon cancer, the proposed optimizer evenly distributes outliers on either side of the box. The average and standard deviation values in Table 2 show that iHHO performs significantly better than the other algorithms on 66.66% of the six datasets, demonstrating the optimizer's superior performance. Concerning the p values in Table 3, the solutions of iHHO are significantly better than those of the other procedures in almost all cases.

A box median sitting off-centre, together with an imbalance in whisker lengths, where one whisker is short with no outliers and the other has a long tail with myriads of outliers, as seen for the other optimizers, indicates that a distribution is skewed. Box plots provide only a high-level data summary and cannot present the particulars of a data distribution's shape. A box plot's simplicity also limits the data density it can demonstrate; it hides the ability to detect the meticulous


Fig. 11  Test accuracy assessment using box plots

shape of the distribution, such as oddities in the distribution's modality (number of 'humps' or peaks) and skew. The violin plots in Fig. 12 are therefore used to assess the proposed iHHO and overcome this drawback of box plots. A violin plot illustrates the distributions of numeric data for one or more groups using density curves. The width of each curve corresponds to the approximate frequency of data points in each region. The densities are accompanied by an overlaid box plot to provide additional information: in the middle of each density curve is a small box plot, with the rectangle showing the ends of the first and third quartiles and the central dot the median. The plots in Fig. 12 show that, as more data points are added to a region, the height of the density curve in that area increases for the proposed iHHO on the majority of the datasets. The proposed optimizer has a much more compact distribution than the other conventional techniques.

The heatmaps in Fig. 13 indicate the variation in the accuracy values throughout the 100 iterations. A sequential color ramp between value and color shows that lighter colors correspond to larger values and darker colors to smaller values. In support of the evidence provided by Tables 2 and 3, the range of accuracy values produced by the proposed iHHO on Breast cancer, Leukemia, OSCC, and


Fig. 12  Test accuracy assessment using violin plots

Ovarian cancer is much higher than that of the other well-established optimizers.

6.1.4 Performance analysis of the selected features subset

The Receiver Operating Characteristic (ROC) curve scrutinizes the performance of a classification model at various threshold settings to determine the effectiveness of a predictive model in evaluating the class of a data sample. Figure 14 compares the performance of the features selected by the proposed iHHO with those of the other optimizers on the different microarray datasets. The proposed optimizer shows significant improvement over the other methods on CNS, Colon Cancer, Leukemia, and OSCC. In addition to the ROC-AUC score, the precision, recall, f1-score, and accuracy metrics are used to validate the performance of the selected features, as shown in Table 4. In the case of Breast and Ovarian cancer, the proposed iHHO yields 50% and 60% accuracy values, respectively. Even though the outcome on Breast and Ovarian cancer is not remarkable, it is the highest accuracy value in that category. Moreover, the proposed iHHO produces precision scores ranging from 81 to 96% on most datasets. A higher ROC-AUC score of iHHO indicates greater confidence in the predictive potential of the selected


Fig. 13  Test accuracy assessment using heatmaps

features. The higher the AUC value, the less likely the proposed model is to reverse the classes, and the higher its level of separability. The features chosen by iHHO provide greater confidence in disease prediction and higher accuracy across all six datasets.

6.2 Performance based on unimodal and multimodal functions

The effectiveness of the suggested iHHO optimizer is thoroughly analyzed using a set of varied benchmark functions. The recommended benchmark functions encompass a set of Unimodal (UM) and Multimodal (MM) functions. The UM functions (F1–F4) can disclose the intensification abilities of the diverse optimizers, while the MM functions (F5–F8) can divulge their diversification. Tables 5 and 6 present the mathematical formulation of the UM and MM problems. The outcomes of the suggested iHHO are compared with existing optimization techniques, namely the MFO, MPA, SCA, SSA, WOA, and HHO algorithms.
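The comparisons in Tables 8, 9, 10 and 11 report the average (AVG) and standard deviation (STD) of the best objective value found over repeated runs at each dimension. A minimal harness for this protocol, using simple random search as a stand-in for any of the compared optimizers, might look like:

```python
import numpy as np

def sphere(x):                      # unimodal F1 from Table 5
    return float(np.sum(x * x))

def random_search(objective, d, iters=100, pop=30, seed=0):
    """Stand-in optimizer: keep the best of pop random agents per iteration."""
    rng = np.random.default_rng(seed)
    best = np.inf
    for _ in range(iters):
        agents = rng.uniform(-100.0, 100.0, (pop, d))
        best = min(best, min(objective(a) for a in agents))
    return best

# AVG/STD of the best value over 30 independent runs, as in the tables.
runs = [random_search(sphere, d=30, seed=s) for s in range(30)]
avg, std = float(np.mean(runs)), float(np.std(runs))
```

Repeating the loop with d = 100, 500, and 1000 yields the scalability columns of the tables.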


Fig. 14  ROC-AUC Curve of Selected Features
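The ROC-AUC evaluation behind Fig. 14 can be outlined as below: train a classifier on the columns picked by an optimizer and score its held-out probabilities. The selection mask and synthetic data are placeholders, and the degree-3 polynomial SVM mirrors the classifier used earlier in the paper:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import precision_recall_fscore_support, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=120, n_features=50, n_informative=8,
                           shuffle=False, random_state=3)

# Hypothetical mask standing in for the subset chosen by an optimizer.
selected = np.zeros(50, dtype=bool)
selected[:10] = True

Xtr, Xte, ytr, yte = train_test_split(
    X[:, selected], y, test_size=0.3, stratify=y, random_state=3)

clf = SVC(kernel="poly", degree=3, probability=True, random_state=3)
clf.fit(Xtr, ytr)

auc = roc_auc_score(yte, clf.predict_proba(Xte)[:, 1])       # ROC-AUC score
prec, rec, f1, _ = precision_recall_fscore_support(
    yte, clf.predict(Xte), average="binary")                  # Table 4 metrics
```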

6.2.1 Scalability analysis

A scalability evaluation is employed to explore the effect of dimension on the outcome of iHHO. This test can reveal the impact of the number of dimensions on the quality of solutions and identify the effectiveness of the iHHO optimizer on lower- and higher-dimensional tasks. This research employs the iHHO to handle the scalable UM and MM F1–F8 test cases with 30, 100, 500, and 1000 dimensions. Table 7 presents the analysis outcomes of the iHHO in handling the F1–F8 problems with different dimensions. Table 7 also depicts that the iHHO exposes exceptional results, which remain consistent in all dimensions.

6.2.2 Comparative analysis

A comparative analysis over diverse dimensions of the F1–F8 benchmark functions is done between the proposed iHHO and the conventional optimizers to highlight the significant performance of the iHHO, which is demonstrated in Tables 8, 9, 10, and 11. According to Table 8, the iHHO achieves the best results on 87.5% of the 30-dimensional functions (F1–F3 and F5–F8), demonstrating its superior performance compared to the existing optimizers. According to Table 9, the iHHO substantially outperforms the other techniques and attains the best results for 87.5% (F1–F6 and F8) of the 100-dimensional search-space problems. Table 10 demonstrates that the iHHO achieves the best AVG and STD results on all 8 test cases with 500 dimensions. Table 11 shows that iHHO performs significantly better on the F1–F8 test functions with 1000 dimensions. It can be observed from these tables that the performance of the conventional methods diminishes as the number of dimensions increases, which reveals the potential of the iHHO in consistently balancing its exploratory and intensification tendencies.

6.3 Comparative analysis of HHO variants

The comparison of accuracy between traditional HHO, HHO with Brownian motion, HHO with the proposed control factor, and HHO with both Brownian motion and the proposed control factor is illustrated in Fig. 15. Compared to the other techniques, the suggested model has a greater chance of improving accuracy in the initial stages. As depicted in Fig. 15, the accuracy of the suggested iHHO algorithm increases steadily in comparison to the other methods. From Fig. 15, it is vividly understood that the proposed iHHO outperforms the other variants of HHO in terms of classification accuracy by producing accuracy in the range of [0.6, 0.9]. The combination of HHO with Brownian motion and HHO


Table 4 Validation of Selected Features
[Per-class precision, recall, F1-score, and accuracy of the feature subsets selected by MFO, MPA, SCA, SSA, WOA, HHO, and iHHO on each of the six microarray datasets]

Table 5 Description of unimodal benchmark functions

Functions                                              Dimensions           Range         fmin
f1(x) = \sum_{i=1}^{n} x_i^2                           30, 100, 500, 1000   [−100, 100]   0
f2(x) = \sum_{i=1}^{n} |x_i| + \prod_{i=1}^{n} |x_i|   30, 100, 500, 1000   [−10, 10]     0
f3(x) = \sum_{i=1}^{n} ( \sum_{j=1}^{i} x_j )^2        30, 100, 500, 1000   [−100, 100]   0
f4(x) = \max_i \{ |x_i|, 1 ≤ i ≤ n \}                  30, 100, 500, 1000   [−100, 100]   0

with the novel control factor performs almost symmetrically but shows meagre accuracy compared to iHHO. The traditional HHO performs poorly compared to the other approaches on all datasets.

6.4 Complexity analysis

The time complexity of the suggested method is analysed in this section. O(n*d) time and space are required to construct the criteria and initialize the random population, where n is the population size and d is the number of dimensions determined by the dataset. The update phase also requires O(n*d). Moreover, the outer loop has an O(t) time complexity, where t is the maximum number of iterations. The overall complexity of the proposed approach is therefore O(t*n*d).

7 Results discussion

From the results, it can be inferred that the outcomes of iHHO are significantly superior on the multi-dimensional F1–F8 problems. On six real high-dimensional microarray datasets, iHHO is juxtaposed with existing, renowned optimizers, namely the MFO, MPA, SCA, SSA, WOA, and HHO methods. The converging ability and accuracy of the other methods degrade significantly as the algorithms progress. Figures 9 and 10 represent how iHHO retains a balanced equilibrium between its exploratory and exploitative propensities on landscapes with a plethora of variables. As recorded for F1–F8 in Tables 8, 9, 10 and 11, a vast, significant gap can be observed between the outcomes of the varied contenders, the MFO, MPA, SCA, SSA, WOA, and HHO, and the high-quality solutions produced by iHHO. This analysis endorses the progressive exploitative merits of the suggested iHHO. From the solutions noted for unseen data in Figs. 11, 12, 13 and Tables 2 and 3, it can be traced that iHHO discovers superior and competitive results grounded on a balance between exploration and exploitation predispositions and an unwavering shift between the searching modes. The outcomes of the experiment exemplify that the suggested iHHO has a multitude of exploratory and exploitative strategies. In

Table 6 Description of multimodal benchmark functions

Functions                                                                                                  Dimensions           Range           fmin
f5(x) = \sum_{i=1}^{n} −x_i \sin(\sqrt{|x_i|})                                                             30, 100, 500, 1000   [−500, 500]     −418.98 × n
f6(x) = \sum_{i=1}^{n} [ x_i^2 − 10\cos(2\pi x_i) + 10 ]                                                   30, 100, 500, 1000   [−5.12, 5.12]   0
f7(x) = −20\exp(−0.2\sqrt{(1/n)\sum_{i=1}^{n} x_i^2}) − \exp((1/n)\sum_{i=1}^{n}\cos(2\pi x_i)) + 20 + e   30, 100, 500, 1000   [−32, 32]       0
f8(x) = (1/4000)\sum_{i=1}^{n} x_i^2 − \prod_{i=1}^{n}\cos(x_i/\sqrt{i}) + 1                               30, 100, 500, 1000   [−600, 600]     0
Table 7 Results of iHHO for different dimensions of F1–F8 functions

Problem/dimension Metric 30 100 500 1000
Unimodal F1 Avg 3.56E−85 2.93 E−79 1.92 E−62 1.49 E−44
Std 1.02 E−84 5.53 E−79 6.10 E−62 3.26 E−44
F2 Avg 7.19 E−92 6.85 E−87 4.66 E−72 3.32 E−56
Std 9.35 E−91 1.43 E−86 1.06 E−71 2.91 E−55
F3 Avg 3.92 E−89 5.90 E−92 4.53 E−88 2.35 E−63
Std 8.37 E−89 6.52 E−92 3.62 E−87 1.09 E−62
F4 Avg 7.32 E−51 5.40 E−69 4.95 E−62 3.69 E−58
Std 1.09 E−50 2.77 E−69 1.92 E−62 1.62 E−57
Multimodal F5 Avg − 2.05E + 16 − 7.45E + 16 − 1.52E + 20 − 1.06E + 17
Std 6.68E + 15 1.35E + 15 2.31E + 20 6.38E + 16
F6 Avg 4.68 E−38 6.28 E−16 5.89 E−22 4.61 E−13
Std 7.61 E−38 1.64 E−16 2.67 E−21 6.62 E−12
F7 Avg 5.21 E−09 3.69 E−05 2.34 E−24 1.62 E−17
Std 4.02 E−08 9.26 E−05 4.72 E−24 3.92 E−17
F8 Avg 1.28 E−65 6.89 E−21 3.62 E−18 1.62 E−08
Std 7.09 E−64 4.02 E−21 1.94 E−18 6.69 E−08

Table 8  Results and comparisons of unimodal and multimodal benchmark functions with 30 dimensions
Problem/dimension Metric MFO MPA SCA SSA WOA HHO iHHO

Unimodal F1 Avg 2.05 E−67 9.98E + 01 1.73E + 04 1.46E + 02 2.74E + 01 9.12 E−05 3.56 E−85
Std 1.68 E−66 5.67E + 01 6.89E + 03 4.90E + 01 3.67E + 01 3.07 E−04 1.02 E−84
F2 Avg 3.95 E−23 8.83E + 01 8.30E + 00 4.82E + 01 7.15 E−01 1.56 E−51 7.19 E−92
Std 6.98 E−22 1.07E + 01 2.89E + 03 1.90E + 01 6.28 E−02 8.55 E−50 9.35 E−91
F3 Avg 2.39 E−04 6.94 E−02 1.25E + 02 5.83 E−71 3.97E + 05 5.96 E−88 3.92 E−89
Std 1.08 E−03 2.95 E−02 7.89E + 01 9.31 E−71 2.12E + 05 2.36 E−87 8.37 E−89
F4 Avg 5.01 E−02 2.98 E−53 7.98 E−23 1.36E + 01 6.06 E−27 2.91 E−49 7.32 E−51
Std 1.53 E−02 7.62 E−53 2.67 E−22 1.10E + 01 1.92 E−26 5.62 E−48 1.09 E−50
Multimodal F5 Avg − 1.07E + 02 − 1.09E + 05 − 5.98 E−04 − 7.29E + 02 − 2.32E + 04 − 1.96E + 10 − 2.05E + 16
Std 3.05E + 01 8.38E + 01 7.09 E−04 3.95E + 01 1.96E + 03 4.52E + 11 6.68E + 15
F6 Avg 1.42 E−09 3.65 E−02 1.96E + 04 6.62 E−09 2.35 E−29 3.36 E−04 4.68 E−38
Std 6.52 E−08 9.64 E−02 8.72E + 03 1.35 E−08 7.03 E−28 8.96 E−03 7.61 E−38
F7 Avg 5.69 E−04 8.45E + 01 9.01 E−07 6.75 E−06 2.98E + 04 1.32E + 02 5.21 E−09
Std 3.81 E−04 1.92E + 01 6.34 E−06 1.68 E−03 3.39E + 04 5.21E + 02 4.02 E−08
F8 Avg 3.97E + 06 2.39E + 04 2.95 E−09 9.45 E−24 1.94 E−05 5.60 E−08 1.28 E−65
Std 1.92E + 06 2.09E + 04 1.56 E−08 6.65 E−23 5.50 E−05 4.94 E−07 7.09 E−64

Bold values represent the significant difference between the proposed iHHO and other conventional techniques


Table 9  Results and comparisons of unimodal and multimodal benchmark functions with 100 dimensions
Problem/dimension Metric MFO MPA SCA SSA WOA HHO iHHO

Unimodal F1 Avg 7.86 E−39 3.65E + 03 1.73E + 04 7.91 E−09 4.14 E−03 1.72 E−07 2.93 E−79
Std 3.68 E−38 1.29E + 03 5.36E + 03 2.39 E−08 8.63 E−03 6.91 E−06 5.53 E−79
F2 Avg 8.65 E−04 2.35E + 01 7.38 E−07 9.92E + 02 3.67E + 01 5.02 E−61 6.85 E−87
Std 9.61 E−04 5.91E + 01 5.60 E−06 4.27E + 04 7.39E + 02 6.94 E−61 1.43 E−86
F3 Avg 7.35 E−06 2.98E + 02 6.09 E−32 3.36 E−05 1.25 E−30 5.69 E−06 5.90 E−92
Std 3.65 E−07 5.66E + 05 4.95 E−31 9.42 E−04 5.68 E−29 8.32 E−06 6.52 E−92
F4 Avg 2.37E + 01 6.62 E−04 3.09 E−02 7.50 E−28 7.56 E−59 6.96 E−64 5.40 E−69
Std 6.87E + 01 1.25 E−03 8.08 E−01 3.45 E−27 3.54 E−62 2.35 E−63 2.77 E−69
Multimodal F5 Avg − 1.24E + 05 − 7.56E + 15 − 6.68E + 10 − 5.23E + 04 − 1.29E + 07 − 1.18E + 05 − 7.45E + 16
Std 6.32E + 04 5.94E + 11 8.97E + 09 3.62E + 03 4.26E + 06 1.28E + 04 1.35E + 15
F6 Avg 9.65E + 01 2.91 E−09 4.29 E−04 2.22E + 02 2.06 E−01 5.19 E−10 6.28 E−16
Std 5.94E + 01 7.91 E−09 1.93 E−03 3.62E + 02 8.39 E−03 7.95 E−08 1.64 E−16
F7 Avg 2.95 E−01 8.62E + 02 1.62 E−03 6.56E + 01 4.50 E−09 6.72E + 01 3.69 E−05
Std 6.54 E−01 9.65E + 02 7.32 E−02 5.06E + 01 2.62 E−08 2.55E + 01 9.26 E−05
F8 Avg 2.84 E−01 5.02 E−14 5.24 E−18 4.02E + 02 6.30 E−03 4.92 E−06 6.89 E−21
Std 8.92 E−01 4.93 E−14 2.93 E−18 6.83E + 02 2.96 E−03 1.97 E−05 4.02 E−21

Bold values represent the significant difference between the proposed iHHO and other conventional techniques

Table 10  Results and comparisons of unimodal and multimodal benchmark functions with 500 dimensions
Problem/dimension Metric MFO MPA SCA SSA WOA HHO iHHO

Unimodal F1 Avg 5.92E + 01 3.51 E−04 8.23E + 02 2.38 E−23 6.24 E−29 7.55E + 01 1.92 E−62
Std 1.25E + 01 6.56 E−03 7.19E + 02 5.09 E−22 4.95 E−28 3.66E + 02 6.10 E−62
F2 Avg 3.51 E−04 6.16 E−02 5.57 E−32 8.56 E−15 2.69E + 00 6.98E + 02 4.66 E−72
Std 2.60 E−04 3.58 E−02 4.92 E−31 7.95 E−14 4.62E + 00 2.69E + 01 1.06 E−71
F3 Avg 1.09 E−19 5.91 E−02 6.82E + 01 8.11 E−02 7.62 E−04 4.01 E−27 4.53 E−88
Std 5.25 E−18 3.71 E−01 4.92E + 01 6.62 E−02 4.69 E−04 3.62 E−26 3.62 E−87
F4 Avg 7.62 E−02 5.36E + 01 3.93 E−29 4.87 E−24 6.02E + 02 5.68 E−13 4.95 E−62
Std 5.62 E−02 2.26E + 01 1.78 E−29 8.66 E−24 5.91E + 01 3.69 E−12 1.92 E−62
Multimodal F5 Avg − 1.24 E−14 − 1.65 E−24 − 2.26E + 01 − 4.92E + 12 − 1.35 E−15 − 1.24 E−18 − 1.52E + 20
Std 4.25 E−14 3.26 E−24 7.55E + 01 2.15E + 12 6.56 E−15 5.62 E−18 2.31E + 20
F6 Avg 8.26E + 01 7.52 E−05 6.90E + 02 1.25 E−16 7.62 E−14 8.10E + 02 5.89 E−22
Std 5.56E + 02 2.98 E−04 9.58E + 02 4.56 E−15 3.25 E−13 2.25E + 02 2.67 E−21
F7 Avg 2.32E + 01 7.93 E−02 1.46 E−19 3.74 E−01 7.24E + 04 3.56 E−02 2.34 E−24
Std 5.92E + 01 3.48 E−01 4.74 E−18 8.27 E−01 3.89E + 04 6.40 E−01 4.72 E−24
F8 Avg 6.98 E−04 5.38E + 02 8.34 E−06 9.65 E−04 6.62E + 01 8.26 E−02 3.62 E−18
Std 2.49 E−04 4.28E + 02 3.62 E−06 5.62 E−04 2.95E + 01 5.26 E−02 1.94 E−18

Bold values represent the significant difference between the proposed iHHO and other conventional techniques

addition, the iHHO has tactfully eluded local optima (LO) and immature convergence while handling diverse classes of problems. The recommended iHHO has demonstrated a better ability to leap out of local optimum solutions in any local-optima stagnation. The features stated below would assist in comprehending the various theoretical reasons that substantiate the constructive nature of the recommended iHHO in exploring or exploiting the search space of a given optimization problem:

• The modified escaping energy (Ee), using the novel control factor (CF) parameter, represents the smooth deprivation of the escaping energy of the prey as it tries to escape from the predator. It not only requires an even shift between the exploration and exploitation patterns of iHHO but also enhances it.
• When piloting a local search, diversification mechanisms such as Brownian motion (fB) eventually preclude the algo-


Table 11  Results and comparisons of unimodal and multimodal benchmark functions with 1000 dimensions
Problem/dimension Metric MFO MPA SCA SSA WOA HHO iHHO

Unimodal F1 Avg 6.89 E−04 3.81 E−02 7.82E + 01 5.62 E−35 4.98E + 09 1.09E + 02 1.49 E−44
Std 7.26 E−03 5.26 E−02 2.54E + 01 6.29 E−34 2.95E + 07 6.25E + 01 3.26 E−44
F2 Avg 2.61E + 02 6.23E + 04 3.64 E−02 9.56E + 01 5.62 E−13 7.29E + 10 3.32 E−56
Std 5.65E + 01 4.92E + 04 1.95 E−02 5.62E + 01 3.62 E−12 2.32E + 10 2.91 E−55
F3 Avg 8.92 E−18 3.26 E−04 6.26E + 02 4.69 E−10 3.52E + 10 9.55E + 01 2.35 E−63
Std 2.55 E−18 1.98 E−03 4.55E + 02 6.28 E−09 8.14E + 10 8.77E + 01 1.09 E−62
F4 Avg 3.62E + 10 2.92 E−01 9.56 E−15 2.62E + 10 6.29 E−42 7.92 E−56 3.69 E−58
Std 2.60E + 10 1.59 E−01 4.29 E−14 7.69E + 10 3.27 E−42 5.62 E−55 1.62 E−57
Multimodal F5 Avg − 5.34E + 15 − 3.68 E−09 − 1.35 E−17 − 7.69 E−04 − 6.25 E−24 − 7.75E + 01 − 1.06E + 17
Std 9.62E + 13 1.64 E−08 4.69 E−16 2.78 E−03 5.62 E−24 3.62E + 01 6.38E + 16
F6 Avg 5.16E + 01 8.13E + 03 7.23 E−09 2.30 E−03 6.29E + 10 8.26 E−05 4.61 E−13
Std 2.30E + 01 7.33E + 03 6.62 E−08 1.18 E−02 5.69E + 10 5.95 E−04 6.62 E−12
F7 Avg 4.26 E−13 6.78E + 03 1.16E + 12 7.95 E−15 8.69 E−04 5.22E + 01 1.62 E−17
Std 8.96 E−11 9.21E + 02 2.36E + 11 5.26 E−14 1.96 E−03 3.26E + 01 3.92 E−17
F8 Avg 3.25 E−01 8.65E + 04 6.62E + 10 2.62 E−01 9.28E + 10 5.91 E−03 1.62 E−08
Std 6.23 E−01 5.00E + 00 1.95E + 09 7.28 E−01 4.25E + 10 1.56 E−02 6.69 E−08

Bold values represent the significant difference between the proposed iHHO and other conventional techniques

rithm from stagnation at local minima and improves the is assessed using six publically accessible real high-dimen-
exploitative nature of iHHO. sional microarray datasets. The performance measures such
• There is a constructive impact on the exploitation ability as precision, recall, f1-score, and classification accuracy are
of iHHO as it employs a sequence of searching stratagems used to evaluate the confidence of selected features. From
centered on Ee and r parameters and then selects the best the outcome of this research evaluation presented by differ-
movement step. ent performance measures, it is understood that the feature
• The randomized jump (J) potentiality can support optimum chosen by the proposed iHHO provides profound insight in
solutions in maintaining the tendencies of intensification detecting life-threatening diseases.
8 Conclusion

Harris Hawks Optimization (HHO) is a population-based optimizer inspired by the cooperative hunting behavior and swooping skills of predatory Harris hawks in nature. This research has addressed two drawbacks of traditional HHO. First, the conventional HHO representation of the prey's escape energy is inefficient. Second, the mostly small but occasionally very long step sizes of the Levy flight applied to the initial random solution prevented the algorithm from escaping local optima.

Randomization is a core component of swarm intelligence-based optimization techniques. In this work, randomization is instilled in traditional HHO using Brownian motion rather than Levy flight. By using Brownian motion to update the best location of the initial random solution, the proposed model avoids stagnation in local minima and converges toward the global optimum. In addition, a novel control factor that more efficiently imitates the escape energy of the prey is incorporated. The performance of iHHO was assessed in terms of intensification and diversification. Various unimodal and multimodal problems were employed to scrutinize the exploitative and exploratory behavior of the proposed iHHO and its ability to evade local optima. The outcomes illustrate that iHHO discovers better solutions than classical optimizers. Furthermore, results on six high-dimensional microarray datasets showed that iHHO outperforms the other optimizers considered. The proposed randomization using Brownian motion improves convergence ability and helps to avoid premature convergence. However, like other optimization methods, iHHO has a few shortcomings. The proposed model was evaluated exclusively on feature selection (FS) problems with high-dimensional datasets; it could also be applied to constrained engineering design tasks in various engineering applications. To conclude, Brownian motion has demonstrated a novel way to improve the outcomes of other optimization approaches in the future. Future work can utilize other evolutionary schemes, such as mutation and crossover, evolutionary updating structures, and chaos-based phases, to develop binary and multi-objective versions of iHHO. In addition, iHHO can be employed to tackle problems in image segmentation,
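The two modifications summarized above can be illustrated with a short sketch. The paper's exact update equations are not reproduced in this excerpt, so the code below is a minimal illustration under stated assumptions: `levy_step` uses Mantegna's algorithm (the Levy generator commonly used in HHO), `brownian_step` draws unit-variance Gaussian increments, `linear_escape_energy` is the classical HHO energy decay that the proposed control factor replaces (the new factor's formula is not given in this excerpt), and `perturb_best` with its `step_scale` constant is a hypothetical helper, not the authors' exact update rule.

```python
import numpy as np
from math import gamma, pi, sin

def levy_step(dim, beta=1.5, rng=None):
    """Levy-flight step via Mantegna's algorithm: mostly tiny steps with
    rare huge jumps, the behavior the paper blames for stagnation when
    used to perturb the initial random solution."""
    rng = rng or np.random.default_rng()
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / beta)

def brownian_step(dim, rng=None):
    """Brownian-motion step: i.i.d. standard-normal increments, so step
    sizes are bounded in probability and the search moves steadily."""
    rng = rng or np.random.default_rng()
    return rng.normal(0.0, 1.0, dim)

def linear_escape_energy(t, T, e0):
    """Classical HHO escape energy E = 2*E0*(1 - t/T) with E0 in (-1, 1);
    the paper replaces this with its novel control factor."""
    return 2.0 * e0 * (1.0 - t / T)

def perturb_best(best, step_scale=0.01, rng=None):
    """Perturb the best-so-far solution with a Brownian step instead of a
    Levy flight (step_scale is an assumed tuning constant)."""
    best = np.asarray(best, dtype=float)
    return best + step_scale * brownian_step(best.size, rng)
```

With a fixed seed, the Brownian perturbation stays close to the incumbent solution, whereas the Levy generator occasionally emits jumps orders of magnitude larger; that contrast motivates swapping the generators.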


Fig. 15  Comparative analysis of the variants of HHO (classification accuracy versus iterations on the Breast Cancer, Leukemia, CNS, OSCC, Colon Cancer, and Ovarian Cancer datasets for HHO, HHO + Brownian motion, HHO + proposed control factor, and HHO + Brownian motion + proposed control factor)
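Curves like those in Fig. 15 are produced by recording the best classification accuracy found so far after each iteration. A generic recorder is sketched below; `one_iteration` and `toy_iteration` are hypothetical stand-ins for a full HHO/iHHO iteration (population update plus wrapper-classifier evaluation), which is far more involved in practice.

```python
import numpy as np

def record_convergence(one_iteration, n_iters=100, seed=0):
    """Run an optimizer step by step and keep the best-so-far fitness
    (e.g. classification accuracy) after every iteration."""
    rng = np.random.default_rng(seed)
    best = -np.inf
    curve = []
    for t in range(n_iters):
        fitness = one_iteration(t, rng)  # best fitness found in iteration t
        best = max(best, fitness)        # best-so-far, so the curve never drops
        curve.append(best)
    return curve

# Toy stand-in objective: a noisy score that tends to improve over time,
# capped at 1.0 like an accuracy value.
def toy_iteration(t, rng):
    return min(1.0, 0.5 + 0.005 * t + 0.05 * rng.random())

curve = record_convergence(toy_iteration)
```

Plotting one such curve per variant and per dataset reproduces the layout of Fig. 15.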

sentiment analysis, signal processing, fuzzy systems, and different engineering applications.

Acknowledgements The authors sincerely thank the Department of Science and Technology (DST), Government of India, for funding this research project work under the Interdisciplinary Cyber-Physical Systems (ICPS) scheme (Grant No. T-54).

K. Balakrishnan et al.



