
Journal of Network and Systems Management (2022) 30:40

https://doi.org/10.1007/s10922-022-09653-9

A Feature Selection Based on the Farmland Fertility Algorithm for Improved Intrusion Detection Systems

Touraj Sattari Naseri¹ · Farhad Soleimanian Gharehchopogh¹

Received: 29 July 2021 / Revised: 24 December 2021 / Accepted: 25 February 2022


© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022

Abstract
The development and expansion of the Internet and cyberspace have increased attacks on computer systems; therefore, Intrusion Detection Systems (IDSs) are needed more than ever. Machine learning algorithms have recently been used successfully in IDSs; however, due to the high dimensionality of IDS data, Feature Selection (FS) plays an essential role in these systems' performance. In this paper, a binary version of the Farmland Fertility Algorithm (FFA), called BFFA, is presented for FS in the classification of IDSs. In the proposed method, a V-shaped function is used to move the FFA processes into binary space; that is, the V-shaped function maps the continuous positions of the solutions in the FFA algorithm to binary values. A hybrid of classifiers and the BFFA is presented as a fast and robust IDS. The proposed method is tested on two valid IDS datasets, NSL-KDD and UNSW-NB15, and is compared in terms of the Accuracy, Precision, Recall, and F1_Score criteria with K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), AdaBoost (ADA_BOOST), and Naive Bayes (NB) classifiers. The simulation results show that the proposed method outperforms these classifiers in the Accuracy, Precision, and Recall criteria; moreover, the proposed method has a better run time in the FS operation.

Keywords: Farmland fertility algorithm · Feature selection · Hybridization · Intrusion detection systems

* Farhad Soleimanian Gharehchopogh
[email protected]

1 Department of Computer Engineering, Urmia Branch, Islamic Azad University, Urmia, Iran


1 Introduction

The Internet is considered a very open, active, and rapidly growing space. People
increasingly depend on the Internet to access a wide range of electronic services in
their daily lives, including shopping, job search, online education, access to gov-
ernment services, banking, business, etc. With the growth of such dependence, different types of sophisticated new attacks also appear; thus, new threats and harms emerge daily in the Internet space, endangering vital infrastructure. Ensur-
ing the integrity and confidentiality of information on a network and the Internet is
becoming a real challenge. Many software tools, such as firewalls, antivirus, encryption, and authentication, secure data and networks to some extent against incoming threats; however, they cannot be effective against all existing threats [1].
To deal with such new and sophisticated attacks, an IDS must be constantly updated to detect and counter them.
IDSs can perform various tasks such as monitoring and analyzing the user and
system activities, analyzing system configuration and vulnerability, analyzing abnor-
mal activity patterns, identifying common attacks, tracking users who break the
rules, and more. These systems can run as a hardware device or be used as a software product that automatically analyzes network traffic and sends security reports to the management office when needed. In general, IDSs operate on two
distinct techniques: pattern matching and statistical anomalies. A pattern-matching-
based IDS called a signature- or misuse-based system can detect known attack pat-
terns whose signatures are stored in the database of IDSs. In statistical anomaly-
based systems, i.e., anomaly-based IDSs, typical behavior patterns are stored in the
database of IDSs. In these systems, all actions are monitored and carefully analyzed
throughout the network. Any deviation from the typical pattern is considered an
attack that causes the IDS to alarm the network security manager about the newly
identified attack [2–4].
Meanwhile, network viruses, eavesdropping, and malicious attacks are growing, causing network security to decline dramatically; moreover, intrusion into a private network can damage the entire system. Many companies have lost their business and credibility due to stolen customer account passwords or leaks of sensitive information. A great deal of research has therefore been carried out, and IDSs have been developed to monitor and filter network activities by identifying attacks and alerting network administrators. IDSs thus play an essential role in network security by detecting and preventing malicious activities [5].
Machine learning techniques can be used in IDSs individually, combined, or as ensembles for classification, and classification can be performed in three operational modes: supervised, semi-supervised, and unsupervised [6–14]. In general, the supervised mode (i.e., classification) works better than the other modes; however, due to the high dimensionality of the data exchanged in IDSs, using classifiers is very time-consuming, especially when intrusion must be detected in real time. Therefore, FS, i.e., a


part of the dimension reduction problem, is required to select the optimal subset of
the features that represent the entire dataset. Training a classification model on a high-dimensional dataset may lead to a model that overfits the training data. Overfitting reduces the model's generalizability and therefore reduces the classification accuracy on new test samples. Another drawback of large datasets is the extended processing time needed to learn and test the model. Apply-
ing FS in the dataset before the learning process plays a crucial role in improving
classification performance [15, 16]. Therefore, FS is a mandatory and essential
step in data mining [17]. Due to their random nature, metaheuristic approaches are widely used in FS. The FFA [18] is a new algorithm inspired by the fertility of farmland in nature; it attempts to optimize the solutions based on the division of farmlands and the optimal utilization of two types of memory, internal and external. Inspired by farmland fertility, the authors of this algorithm mathematically formulated a new method with six steps.
A critical step is using metaheuristic algorithms to choose the best representative features. Dimensionality reduction minimizes the number of features by deleting those that are redundant or unimportant; the computational cost is high when the number of features is large. This paper introduces a Binary Farmland Fertility Algorithm (BFFA) to find the best features for FS in the classification of IDSs. The solution is evaluated on two IDS datasets, NSL-KDD and UNSW-NB15, and its performance (accuracy, precision, recall, and F1_score) is compared with other classifiers. The contributions of this paper are summarized as follows:

• The BFFA is presented for FS in the classification of IDSs.
• In the proposed method, the V-shaped function is used for the BFFA processes.
• The BFFA is run on the NSL-KDD and UNSW-NB15 datasets.
• DT, KNN, and SVM classifiers are hybridized to increase accuracy.
• The proposed method is examined under different criteria against different classifiers.

The rest of the paper is organized as follows: Sect. 2 provides a literature review on IDS approaches. Section 3 explains the FFA theory. The proposed method, including the BFFA based on the V-shaped function and its formulation, is presented in Sect. 4. In Sect. 5, the efficiency and performance of the proposed method are tested. Finally, Sect. 6 presents a general conclusion and suggestions for future work.

2 Related Works

One of the most effective ways to solve an FS problem is to use metaheuristic optimization methods [19, 20]. A review of recent work shows that the use of metaheuristic optimization algorithms to reduce dimensions and select salient features has increased. The following summarizes some of the most important research conducted in recent years on IDSs with the help of FS using metaheuristic algorithms.
For FS in intrusion detection, an enhanced multi-objective immune algo-
rithm (MOIA) has been proposed [21]. Individuals for immune optimization are


handled as feature subsets for intrusion detection, which were chosen as accept-
able feature subsets to decrease the dataset’s dimensions. The classification
model is trained using ANN, and the classification model’s output is treated as
the goal fitness value for each person. The proposed technique enhanced clas-
sification accuracy while also speeding up classification convergence. The sug-
gested algorithm’s better classification accuracy was confirmed by experimental
findings on the NSL-KDD and UNSW-NB15 datasets. The false alarm rate of
MOIA is 10.63%, which is lower than that of using all features (14.38%). There-
fore, the performance is enhanced. An FS algorithm for intrusion detection systems based on the Binary Pigeon-Inspired Optimizer (BPIO) is proposed [22]. All examined algorithms were evaluated in terms of TPR, FPR, F-score, and accuracy. The BPIO was evaluated using three public datasets: KDDCUP99, NSL-KDD, and UNSW-NB15. The BPIO helped reduce the number of features needed to build a robust IDS while maintaining a high detection rate and accuracy with low false alarms. A wrapper-based FS model called Tabu Search-Random Forest (TS-RF) is proposed for IDS [23]. Tabu search is applied as the search method, while RF is applied as the learning algorithm for Network Intrusion Detection Systems (NIDS). The proposed model is evaluated on the UNSW-NB15 dataset, and the obtained results were compared with other FS methods. Results showed that classification accuracy is improved and that TS-RF has the lowest cost among all other recent techniques.
Adel Sabry Eesa et al. [24] used the cuttlefish algorithm as a search strategy to
determine the optimal subset of features and used DT to judge the features selected
by cuttlefish. The dataset used in this paper to evaluate the proposed method was
KDD cup 99. The results showed that the proposed method increased detection and
accuracy rates and reduced false alarm rates compared to how all features were used
to detect intrusion. Mohammadi et al. [25] also introduced a feature-selection-based IDS. Their method used a feature grouping algorithm based on the linear correlation coefficient and the cuttlefish algorithm to select the filter- and wrapper-based features, respectively. Choosing the more important features can increase accuracy and speed; therefore, a more accurate IDS with a high detection rate and low false-positive rate needs a precise method to extract important features from the dataset. The proposed method, Feature Grouping based on Linear Correlation Coefficient-Cuttlefish Algorithm (FGLCC-CFA), is a hybridization of filter- and wrapper-based methods and has the advantages of both for extracting the most optimal subset of the dataset's features. The FGLCC-CFA algorithm combines FGLCC's high speed with CFA's high accuracy to select a subset of features, since with the CFA method alone, selecting the optimal subset of features is time-consuming. The FGLCC filter is used to rank the primary features, and the selected best features are given to the CFA as input features. The KDD Cup 99 dataset is used to validate the proposed method. The results obtained from FGLCC-CFA show that the hybrid method increased the accuracy and detection rate and reduced the false-positive rate compared to the FGLCC and CFA algorithms.
In [26], an FS method based on an Ant Colony Optimization (ACO) algorithm was used to detect intrusion; it selects the more valuable features to increase the efficiency of the IDS. Simulations on the two datasets, KDD Cup 99 and NSL-KDD,


indicate that the IDS process’s speed increases without any significant impact on
its accuracy by focusing on more essential and useful features. Also, in [27], a new
multi-objective approach based on bee colony was presented for FS in IDSs. The
aim of this study was not only to provide a new approach to FS but also to introduce
a proper function to achieve the goals of FS, namely: minimizing the number of
selected features, minimizing false alarm rates, minimizing classification error rates,
and optimizing the accuracy of the classifier compared to the case when all the fea-
tures are used.
In [28], a two-step hybrid method is proposed for detecting intrusion. This solution consists of two anomaly detection components and one misuse detection component. In step 1, an anomaly detection method with low computational complexity is developed and used to construct the detection component; the KNN algorithm is fundamental in constructing the two detection components of step 2. Experimental results show that the proposed method can effectively detect network anomalies with a low false-positive rate.
Moreover, in [29], a summary of IDS classification algorithms based on well-known machine learning algorithms is presented; different ensemble and hybrid techniques are examined according to both homogeneous and heterogeneous ensemble methods. Also, Particle Swarm Optimization (PSO), ACO, and Artificial Bee Colony (ABC) algorithms were used to select the essential features of the NSL-KDD dataset to improve the performance of IDSs, with SVM and KNN classifiers used to evaluate the FS process [30]. Comparing the performance of the SVM and KNN classifiers with and without FS shows that the number of features in the dataset decreased from 41 to 7 and from 11 to 7, respectively; the run time also decreased significantly, the accuracy and detection rate increased, and the false alarm rate was reduced. Among the three metaheuristic algorithms, ABC performed better than the others.
In [31], random forest modeling is presented for a network IDS using a random forest classifier. Random forest is an ensemble classifier and performs better than traditional classifiers in attack classification; experimental results show that the proposed method has high efficiency due to its high detection rate and low false alarm rate. Moreover, an IDS with a reduced number of features using an Artificial Neural Network (ANN) classifier is presented in [32], which ranks features based on information gain and correlation. Feature reduction is then performed using a new approach that identifies valuable and useless features with the ANN classifier. The results for the proposed method are promising.
Moreover, in [33], an intelligent water drops algorithm based on FS was used for
IDS to increase the SVM classifier’s accuracy. It was done by reducing the KDD
CUP 99 dataset features using the intelligent water drops algorithm. According to
ACC, DR, FAR, and PC criteria, the proposed method (a hybridization of SVM and the intelligent water drops algorithm) reported significant improvement compared with SVM, NB, k-means, and hybrid Genetic Algorithm (GA) + SVM algorithms. In [34], the generation of new detection rules was proposed to classify both common and rare attacks. This approach used both DT methods and genetic algorithms to obtain linguistic and precise detection rules. Experimental results showed


that this method could achieve better results compared with other advanced and old
techniques.
In [35], the Firefly Algorithm (FA) was used for FS in the IDS problem. Out of
41 features in the kdd cup99 dataset, ten were extracted by the proposed method,
which was sufficient to detect intrusion. With the reduction of the required informa-
tion, the run time was reduced, the structure was simplified, and the classification
performance was improved. The evaluated criteria were DR, FPR, and F-measure.
The results showed that, compared with using all of the features, the detection per-
formance was improved, and the cost of calculations was reduced. Unfortunately, in
this paper, the firefly algorithm’s output results were only compared with C4.5 and
Bayesian network classifiers, and no comparison was made with the performance
of other metaheuristic algorithms. In [36], the binary Grey Wolf Optimization (GWO) algorithm was used for FS on the NSL-KDD dataset, and simulation results showed that the algorithm could balance increasing the accuracy and detection rate against decreasing the FPR while reducing the number of features; the values obtained for these parameters were 99.22%, 99.10%, 0.0064, and 14, respectively. The proposed method was compared with binary GWO, AdaBoost, and PSO-discretize-HNB on the criteria mentioned above and was shown to establish a suitable balance between the considered parameters. In [37], FS methods, including principal component analysis, information gain, and Chi-square, were used to reduce the number of features in IDSs for intrusion
classification using the SVM classifier. The proposed method was validated on the
KDD99 dataset. The results showed that the proposed method could detect up to
99% of attacks such as U2R. In [38], deep learning methods for network IDSs were investigated using the NSL-KDD dataset; an IDS that handles unidentified attacks in network traffic is an effective tool for network security. An anomalous traffic detection model (the BAT model) was suggested to address problems such as low accuracy and manual feature engineering in IDS.
A review of previous work on IDS shows that hybridizing machine learning classifiers plays a vital role; on the other hand, the high volume of data in IDSs has reduced the accuracy and speed of classification algorithms, so researchers have used metaheuristic methods to reduce the features of IDS datasets. Therefore, hybridizing the classifiers and then selecting valuable features play an essential role in detecting intrusion. In this paper, the classifiers are hybridized, and an effective FFA-based method for selecting features is presented, which effectively reduces the features and increases accuracy.

3 Farmland Fertility Algorithm

FFA [18] is a new metaheuristic algorithm inspired by the fertility of farmland in nature. In general, farmers divide a farm into different sections according to soil quality, so each section's soil quality differs from that of the others. Adding particular substances to the soil of each section changes its quality, so farmers try to optimize each section of the farm by adding the particular

substances that the soil needs. According to this algorithm, farmers change each sec-
tion of the land; and by recording each section’s soil quality, they can later decide
how to improve each section’s soil. After determining the soil’s quality in each sec-
tion, the best and most needed substances are allocated to that section to improve
the quality. Therefore, each section of the farm needs a certain percentage of a sub-
stance to improve the quality, and the worst section of the land should undergo the
most change.
It should be noted that warehouses, called the memory of each section (or local
memory), are built next to each section of the land to store the best soil obtained by
adding particular substances. Also, total memory (or global memory) warehouses are
constructed to store the total farm area’s best-quality soil. Overall, the optimal solu-
tions found so far for each section are stored in local memory, and the best solutions
for the entire area are stored in global memory. The farm section with the worst quality
is combined with global memory ideas, and other solutions for other sections of the
farm are combined with all the solutions. After changing the solutions with the global
memory and random solutions in the search space, farmers combine each section of the
land with the best case in the local memory. FFA is being introduced; now, it is time
to explain this algorithm’s steps, formulas, and key points [18, 39–41]. Generation of
the Initial Population: This step determines the number of sections of the land and the
number of solutions for each section, and their role in the population’s generation. The
initial population is obtained using Eq. (1), where N represents the total population in
the search space, and K is the number of sections of the optimization problem. The
optimization problem determines the standard number of farm sections, and therefore,
by dividing the search space into K sections, each section is assigned a specific number
of solutions, such as the amount of water. In this equation, N shows the number of solu-
tions for each section of the land, a variable, and an integer number.
N=k×n (1)
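As an illustrative sketch (not the authors' code; the section counts, variable bounds, and seed below are assumptions), Eq. (1) amounts to generating k × n random solutions:

```python
import numpy as np

def init_population(k, n, dim, lower=-1.0, upper=1.0, seed=0):
    """Eq. (1): generate N = k * n random solutions in a dim-dimensional space."""
    rng = np.random.default_rng(seed)
    N = k * n  # total population size
    return rng.uniform(lower, upper, size=(N, dim))

pop = init_population(k=4, n=5, dim=10)
print(pop.shape)  # (20, 10)
```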
Segmentation and Determination of Each Section's Quality: Once the initial population is determined, the quality of each section of the land (i.e., each section of the search space) is obtained using Eqs. (2) and (3) from the average fitness of the solutions in that section.

Sections = x(a_j),  a = n × (s − 1) : n × s,  s = {1, 2, …, k},  j = {1, 2, …, D}    (2)

Equation (2) separates the solutions of each section so that the average of that section can be calculated separately. In Eq. (2), the variable x represents all the solutions of the search space, s is the index of each section, and j, in the interval [1, …, D], indexes the dimensions of x.

Fit_Sections = Mean(all Fit(x_i^j) in Sections),  s = {1, 2, …, k},  i = {1, 2, …, n}    (3)

In Eq. (3), Fit_Sections shows the quality of the solutions in each section of the land; each section has a specific value equal to the average fitness of all the solutions within it. Thus, the average of the solutions within each section of the land is calculated and stored in Fit_Sections.
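A minimal sketch of Eqs. (2)–(3) (the `fitness` callable and the contiguous section layout are illustrative assumptions): section s covers rows n·(s−1) to n·s of the population, and its quality is the mean fitness of those rows.

```python
import numpy as np

def section_fitness(pop, fitness, k, n):
    """Eqs. (2)-(3): mean fitness of each of the k sections of n solutions."""
    fits = np.array([fitness(x) for x in pop])
    # Section s holds the solutions with indices n*(s-1) .. n*s - 1 (Eq. 2);
    # its quality is the mean of their fitness values (Eq. 3).
    return fits.reshape(k, n).mean(axis=1)

pop = np.array([[1.0], [2.0], [3.0], [4.0]])
print(section_fitness(pop, fitness=np.sum, k=2, n=2))  # [1.5 3.5]
```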


Updating Local and Global Memory: At this step, the local memory of each section and the global memory are updated: the best solutions in each section are stored in that section's local memory, and the best solutions across all sections are stored in the global memory.

M_Global = round(t × N),  0.1 < t < 1    (4)

M_local = round(t × n),  0.1 < t < 1    (5)

Equation (4) gives the number of solutions kept in the global memory, and Eq. (5) the number kept in each local memory: M_Global represents the number of global memory solutions, and M_local the number of local memory solutions. The solutions stored in these memories are selected according to their fitness, and then the local and global memories are updated. After determining the worst and best sections, it is time to move to the step of "changing soil quality in each section."
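A sketch of this memory-update step under the stated formulas (minimization and the value of the memory-fraction parameter t are assumptions):

```python
import numpy as np

def update_memories(pop, fits, k, n, t=0.3):
    """Eqs. (4)-(5): keep the best round(t*N) solutions overall (global memory)
    and the best round(t*n) solutions of each section (local memories).
    Lower fitness is assumed better (minimization)."""
    m_global = round(t * len(pop))   # Eq. (4)
    m_local = round(t * n)           # Eq. (5)
    global_mem = pop[np.argsort(fits)[:m_global]]
    local_mems = []
    for s in range(k):
        sec = slice(n * s, n * (s + 1))
        best = np.argsort(fits[sec])[:m_local]
        local_mems.append(pop[sec][best])
    return global_mem, local_mems
```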
Changing Soil Quality in Each Section: The quality of each section having been determined by Eq. (3), the most significant change is applied to the section with the worst quality; Eq. (2) gives the solutions belonging to each section. According to Eqs. (6) and (7), all solutions in the farm's worst section are combined with global memory solutions.

h = α × rand(−1, 1)    (6)

X_new = h × (X_ij − X_MGlobal) + X_ij    (7)

In Eq. (7), X_MGlobal is one of the global memory solutions selected at random, and α (a number between 0 and 1) is set at the beginning of the algorithm. To apply changes to the farm's worst section, X_ij gives a solution, and Eq. (6) gives h (a decimal number). The applied changes then create a new solution, X_new. In a similar manner, the solutions of all other farm sections are combined with solutions from the entire search space, as in Eqs. (8) and (9).

h = β × rand(0, 1)    (8)

X_new = h × (X_ij − X_uj) + X_ij    (9)

In Eqs. (8) and (9), X_uj is a random solution drawn from the search space; β (a number between 0 and 1) is set at the beginning, and X_ij selects a solution to be changed in each section except the worst one. Equation (8) gives h (a decimal number), and X_new is the new solution resulting from the changes.
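These two update rules can be sketched as follows (a minimal illustration; the random generator and the α and β values are assumptions):

```python
import numpy as np

rng = np.random.default_rng(42)

def change_worst_section(X_ij, global_mem, alpha=0.6):
    """Eqs. (6)-(7): combine a worst-section solution with a random
    global-memory solution, with step h = alpha * rand(-1, 1)."""
    h = alpha * rng.uniform(-1.0, 1.0)
    X_mglobal = global_mem[rng.integers(len(global_mem))]
    return h * (X_ij - X_mglobal) + X_ij

def change_other_section(X_ij, pop, beta=0.4):
    """Eqs. (8)-(9): combine a solution of any other section with a random
    solution X_uj from the whole search space, h = beta * rand(0, 1)."""
    h = beta * rng.uniform(0.0, 1.0)
    X_uj = pop[rng.integers(len(pop))]
    return h * (X_ij - X_uj) + X_ij
```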
Updating Solutions Based on Local Memory and Global Memory: In the final step, farmers mix the soil of each section of the farm with the best case in its local memory, Best_Local; however, not all solutions in each section are combined with the local memory, and some solutions are instead combined with the best solution found so far, Best_Global, to improve the quality of the solutions. This is shown in Eq. (10).

X_new = X_ij + ω1 × (X_ij − Best_Global(b)),   if Q > rand
X_new = X_ij + rand(0, 1) × (X_ij − Best_Local(b)),   otherwise    (10)

In Eq. (10), there are two ways to create the new solution. Q (a variable between 0 and 1) is set initially and represents the extent to which solutions are combined with Best_Global. ω1 is a parameter of the algorithm set initially; its value is gradually reduced over the iterations. X_ij is a solution selected from any section to receive the changes, and X_new is the new solution obtained from applying them.
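A sketch of this final mixing rule of Eq. (10) (the Q and ω1 values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(7)

def final_update(X_ij, best_local, best_global, Q=0.5, omega1=1.0):
    """Eq. (10): with probability controlled by Q, mix the solution with
    Best_Global; otherwise mix it with its section's Best_Local."""
    if Q > rng.random():
        return X_ij + omega1 * (X_ij - best_global)
    return X_ij + rng.uniform(0.0, 1.0) * (X_ij - best_local)
```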

4 Proposed Method

In this section, the proposed method for detecting intrusion, illustrated in Fig. 1, is explained. The proposed method has several parts, each of which is described separately below. Two valid datasets, NSL-KDD and UNSW-NB15, are used to evaluate the proposed method's performance, and two pre-processing steps are carried out, namely data conversion and data normalization. Then, a new method for selecting features based on the FFA is presented; the results of FS using the FFA are discussed in a separate sub-section of the analysis. After the FS step, the reduced dataset is sent to the hybridization of the DT, KNN, and SVM classifiers for classification. The two datasets are described in detail in the following subsection.
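The paper does not specify how the three classifiers are combined; one plausible realization, sketched here purely as an assumption, is a majority-voting ensemble of DT, KNN, and SVM in scikit-learn, with min-max normalization as the pre-processing step:

```python
from sklearn.ensemble import VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

def build_hybrid():
    """Majority-vote hybrid of DT, KNN, and SVM over min-max-normalized data."""
    return make_pipeline(
        MinMaxScaler(),  # data-normalization pre-processing step
        VotingClassifier(
            estimators=[
                ("dt", DecisionTreeClassifier(random_state=0)),
                ("knn", KNeighborsClassifier(n_neighbors=5)),
                ("svm", SVC(kernel="rbf", random_state=0)),
            ],
            voting="hard",  # majority vote over the three predictions
        ),
    )
```

In use, the BFFA-reduced feature matrix would be passed to `fit`; `voting="hard"` takes the majority label of the three base classifiers for each sample.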

4.1 Datasets

NSL-KDD: This dataset is an improved version of the KDD Cup 99 dataset. The first and most crucial shortcoming of KDD Cup 99 is its large number of duplicate records: 78% of its training data and 75% of its test data are redundant [42]. The NSL-KDD dataset includes records showing the relationship between two network hosts based on network protocols. There are 41 features in each record representing different traffic features, and a label is assigned to each record: attack or regular traffic. The 42nd feature covers five different network communication classes: one regular class (normal traffic) and four intrusion classes. The four main intrusion classes are DoS, Probe, R2L, and U2R [43].
UNSW-NB15: This dataset [44] is used, as in [43], to evaluate the proposed method against newer attacks. In this dataset, records are classified into a normal class and nine attack classes (Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, and Worms).


Fig. 1  The flowchart of the proposed method based on the hybridization of classifiers and BFFA

4.2 A New Method for Feature Selection Based on FFA

The continuous form of FFA was fully described in Sect. 3. In this sub-section, building on that basic algorithm, we present a new version of FFA to solve FS in IDSs: a binary version of FFA based on a V-shaped transfer function. The goal is to apply the V-shaped function to the FFA algorithm so that it can be used for binary problems. Since each feature is either selected or not, the new binary solution must consist of the values 0 and 1, with 1 indicating that a feature is selected for the new dataset and 0 that it is not. The proposed method uses the V-shaped function to move the FFA processes in binary space; therefore, the V-shaped function is used to change the continuous positions of the solutions in BFFA, as shown in Eq. (11).
V(FFA_i^d(t)) = | erf( (√π / 2) × FFA_i^d(t) ) |    (11)

Mirjalili et al. expanded Eq. (11) into Eq. (12) [45, 46]:

V(FFA_i^d(t)) = | (2 / √π) ∫₀^((√π / 2) FFA_i^d(t)) e^(−t²) dt |    (12)
In Eqs. (11) and (12), FFA_i^d is the continuous value of the ith solution in the FFA population at the dth dimension in iteration t. The V-shaped function is presented in Fig. 2. In the following, we analyze this function's output and explain how it is used in FFA.
According to Fig. 2, the V-shaped transfer function's output is still a continuous value between 0 and 1; therefore, it must be thresholded to obtain a binary value. The random threshold given in Eq. (13) is used to convert a solution to a binary value for FS with the V-shaped function:

FFA_i^d(t + 1) = { 0, if rand < V(FFA_i^d(t));  1, if rand ≥ V(FFA_i^d(t)) }    (13)

In Eq. (13), FFA_i^d indicates the position of the ith solution in the FFA population at iteration t in the dth dimension, and rand is a uniformly distributed number between 0 and 1. Thus, the proposed method's solutions are forced to move in a binary search space using Eqs. (11) and (12). Finally, the pseudo-code of the proposed method based on the V-shaped function is presented.
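A sketch of Eqs. (11) and (13), with the threshold rule transcribed exactly as stated above (Python's `math.erf` supplies the error function):

```python
import math
import random

def v_transfer(x):
    """V-shaped transfer function, Eq. (11): V(x) = |erf(sqrt(pi)/2 * x)|."""
    return abs(math.erf(math.sqrt(math.pi) / 2.0 * x))

def binarize(x_continuous, rand=None):
    """Eq. (13): convert a continuous position to 0/1 via a random threshold."""
    r = random.random() if rand is None else rand
    return 0 if r < v_transfer(x_continuous) else 1
```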

Fig. 2  A general schematic of the V-shaped transfer function


Fig. 3  BFFA based on V-shaped function

Figure 3 presents the proposed method for BFFA based on the V-shaped function; this transfer function is simply used to binarize the solutions.

4.3 Fitness Function

FS is a binary optimization problem in which search agents are limited to the values {0, 1}. FS in machine learning can be considered a multi-objective optimization problem with two conflicting objectives: selecting the minimum number of features and maximizing classification accuracy. To solve this multi-objective problem, the binary optimization approach BFFA is used, and a multi-objective fitness function, defined in Eq. (14), is used to evaluate the solutions.
$$Fitness = \alpha\,\gamma_R(D) + \beta\,\frac{|R|}{|N|} \quad (14)$$
In Eq. (14), $\gamma_R(D)$ is the error rate of the KNN classifier, which is widely used because of its simplicity. Also, |R| indicates the number of selected features and |N| the total number of features in the primary dataset. α and β are two parameters weighting the importance of classification accuracy and subset length, with $\alpha \in [0, 1]$ and $\beta = 1 - \alpha$.
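Eq. (14) reduces to a one-line computation once the KNN error rate on a candidate subset is known. A minimal sketch follows; the default α = 0.99 is a value commonly used in wrapper FS work, not one stated in this paper, and the function name is ours:

```python
def fitness(error_rate, n_selected, n_total, alpha=0.99):
    """Multi-objective fitness of Eq. (14): alpha * gamma_R(D) + beta * |R| / |N|.

    error_rate  -- KNN classification error gamma_R(D) on the selected subset
    n_selected  -- |R|, number of selected features
    n_total     -- |N|, total number of features in the dataset
    alpha       -- weight of the classification error; beta = 1 - alpha
    """
    beta = 1.0 - alpha
    return alpha * error_rate + beta * n_selected / n_total
```

Lower fitness is better: the first term penalizes misclassification, the second term penalizes large subsets, so a solution with the same error but fewer features always scores lower.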


4.4 Complexity Analysis

In this section, the computational complexity of the proposed method is analyzed. The complexity of the proposed method depends on the complexity of the BFFA and of the hybrid classification (DT + SVM + KNN). Hence, the complexity of the proposed method is given as follows:

O(proposed method) = O(BFFA) + O(Hybrid)
O(BFFA) = O(N × D) + O(T × N × D)
O(Hybrid) = O(DT) + O(SVM) + O(KNN) = O(N K log(N)) + O(N³) + O(N D K) + F

In BFFA, the time complexity of the initialization is O(N × D), and the time complexity of the position updates is O(T × N × D). In Hybrid, N is the number of training examples, K is the number of features, D is the dimension of each sample, and T denotes the total number of iterations. F is the cost of the fitness function.

5 Evaluation and Results

In this section, the proposed method and the other algorithms are evaluated. This paper uses two well-known datasets, NSL-KDD and UNSW-NB15, both of which are considered large IDS datasets. The proposed method is compared with the KNN, SVM, DT, RF, ADA_BOOST, and NB algorithms in terms of the Accuracy, Precision, Recall, and F1_Score criteria; the results are presented in graphical and tabular form. We investigate the results in two separate sections, comparing the classification algorithms with the proposed method on the NSL-KDD dataset and on the UNSW-NB15 dataset. In all experiments, the population size and the number of iterations of the BFFA algorithm are set to 50 and 100, respectively. All methods are implemented in MATLAB 2019 and run on an Intel Core i7 PC with a 2.2 GHz CPU and 8 GB of RAM.
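The four criteria used throughout this section can all be derived from confusion-matrix counts. A minimal sketch of the standard definitions (the function name and example counts below are illustrative, not taken from the paper):

```python
def metrics(tp, fp, fn, tn):
    """Accuracy, Precision, Recall, and F1_Score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1
```

For instance, `metrics(50, 10, 5, 35)` yields an accuracy of 0.85 with precision 5/6 and recall 10/11, illustrating how the four criteria can differ on the same predictions.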

5.1 Evaluation of NSL‑KDD

The proposed method is evaluated in this section and compared with the other classification algorithms for different numbers of NSL-KDD features, using the Accuracy, Precision, Recall, and F1_Score criteria; the results are presented in graphical and tabular form. The NSL-KDD dataset includes 42 features, the last of which indicates the class; it is divided into training and testing subsets: the number of training samples is 125,973, and the number of test samples is 22,544. The BFFA is then run on the NSL-KDDTest+ test set and the NSL-KDDTrain+ training set; the best state of the features is examined, and the number and rank of the features are reported in Table 1. The 0 and 1 values were generated by the solution vector of the BFFA algorithm.


Table 1  Results of FS using BFFA on NSL-KDD dataset


Iteration Number of Binary version (selected (1), unselected (0))
features

1 18 [1,0,1,1,1,1,1,1,0,0,0,1,1,0,0,0,0,0,0,0,1,0,0,1,1,1,0,0,0,0,0,0,0,0,1,0,1,1,0,1,1]
2 12 [1,1,1,0,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,1,1]
3 16 [1,0,1,0,1,1,0,1,1,0,1,0,1,0,0,0,0,0,0,1,1,0,0,0,1,0,0,0,1,1,0,0,0,1,1,0,0,0,0,1,0]
4 21 [1,1,1,0,1,1,1,1,0,1,0,0,1,0,1,0,1,1,1,1,0,0,0,0,0,1,1,1,1,0,0,0,0,1,1,0,0,0,0,1,0]
5 15 [1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,0,0,0,1,0,0,1,0,0,0,0,1,1,0,0,1,1,0]
6 17 [1,1,1,1,1,1,0,0,0,0,0,0,0,0,1,0,0,0,1,1,0,1,0,0,1,0,0,1,1,1,0,0,0,1,0,1,1,0,0,0,0]
7 19 [1,1,1,1,1,1,0,1,0,0,0,1,1,0,0,1,0,1,0,0,1,0,0,1,1,1,0,0,0,0,0,0,0,0,1,0,1,1,0,0,1]
8 19 [0,0,1,0,1,1,1,0,1,0,0,1,1,1,1,1,0,1,1,1,1,0,1,1,0,0,0,0,0,0,0,0,0,1,1,0,0,1,0,0,0]
9 14 [0,0,1,1,1,0,0,0,0,1,0,1,0,0,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,0,0,1,0,1,1,1,1,0,0]
10 20 [1,0,1,0,1,1,1,1,0,0,0,1,1,0,0,1,0,1,1,0,1,1,0,0,0,1,0,1,1,0,0,0,0,0,1,0,0,1,0,1,1]

Fig. 4  Comparison of the classifiers using BFFA and base for 12 and 14 FS on NSL-KDD

Fig. 5  Comparison of the classifiers using BFFA and base for 15 and 16 FS on NSL-KDD


Fig. 6  Comparison of the classifiers using BFFA and base for 17 and 18 FS on NSL-KDD

Fig. 7  Comparison of the classifiers using BFFA and base for 19 FS on NSL-KDD

Fig. 8  Comparison of the classifiers using BFFA and base for 20 and 21 FS on NSL-KDD


A value of 1 denotes a feature that is influential in classification; the chosen features are those whose corresponding element in the population vector equals 1.
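This mapping from a binary vector to a feature subset can be sketched as follows; `row2` copies the second row of Table 1 (12 features selected out of 41), and the function name is ours:

```python
def selected_features(mask):
    """Return the 1-based indices of the features whose mask element is 1."""
    return [i + 1 for i, bit in enumerate(mask) if bit == 1]

# Second row of Table 1: 12 of the 41 NSL-KDD candidate features are selected.
row2 = [1,1,1,0,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,
        0,0,0,0,0,0,0,1,0,1,1]
```

Only the columns of the original dataset whose index appears in `selected_features(row2)` would then be fed to the classifiers.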
Table 1 shows that the number of selected features ranges from 12 to 21. The BFFA algorithm has thus reduced the NSL-KDD dataset features by about 50% to 70%, while increasing the classifiers' speed and accuracy. We perform further analysis to show how effective the BFFA algorithm has been in FS based on the selected features. Therefore, in the following, we examine the results of FS with the different classifiers on the NSL-KDD dataset using the BFFA algorithm in terms of the Accuracy, Precision, Recall, and F1_Score criteria. Figures 4, 5, 6, 7 and 8 compare the BFFA with the KNN, SVM, DT, RF, ADA_BOOST, and NB algorithms for different numbers of features on the NSL-KDD dataset in terms of the Accuracy criterion.
Figure 5 presents the average results of 20 runs for the classifiers on NSL-KDD using the features selected by BFFA. The results presented in Fig. 5 show that the BFFA has higher accuracy than the Base model. ADA_BOOST achieves good accuracy compared to RF and outperforms the other examined classifiers.
The results shown in Figs. 4, 5, 6, 7 and 8 demonstrate that FS by the BFFA algorithm combined with classification by KNN, SVM, DT, RF, ADA_BOOST, and NB on the NSL-KDD dataset was successful in terms of the Accuracy criterion. Across a total of 70 runs, FS by the BFFA algorithm improved the different classifiers in about 86% of cases. The exact results of comparing the proposed method and the other classifiers, for the various feature subsets selected by the BFFA algorithm, in terms of accuracy are shown in Table 2.
The results of Table 2 indicate that the proposed method with 16 and 21 features (0.84 and 0.85, respectively) achieved better accuracy than with other numbers of features. This experiment shows that the proposed method performed better than KNN, SVM, DT, RF, ADA_BOOST, and NB in 90% of the FS cases and, compared to the non-selected-feature state (0.83), outperformed all other classifiers. The comparison of the proposed method and the other classifiers for the different feature subsets selected by the BFFA algorithm in terms of Precision is shown in Fig. 9.
The results in Fig. 9 indicate that selecting features with the BFFA algorithm and hybridizing it with the other classifiers was successful in terms of the Precision criterion on the

Table 2  Results of the comparison of the proposed method and other classifiers in accuracy
Methods Feature-BFFA
18 12 16 21 15 17 19 19 14 20 41

KNN 0.77 0.81 0.84 0.80 0.81 0.81 0.78 0.81 0.81 0.81 0.81
SVM 0.77 0.79 0.83 0.81 0.79 0.8 0.75 0.80 0.80 0.80 0.79
DT 0.79 0.77 0.81 0.81 0.79 0.77 0.79 0.79 0.82 0.81 0.79
RF 0.71 0.74 0.72 0.75 0.74 0.77 0.75 0.74 0.71 0.79 0.75
ADA_BOOST 0.78 0.79 0.80 0.80 0.81 0.81 0.85 0.78 0.80 0.84 0.81
NB 0.77 0.76 0.76 0.75 0.76 0.74 0.77 0.76 0.74 0.78 0.75
Proposed method 0.78 0.83 0.84 0.85 0.83 0.83 0.79 0.83 0.83 0.83 0.83

Bold values indicate the best result compared with the other algorithms


Fig. 9  Comparison of the proposed method and other classifiers based on precision on NSL-KDD

NSL-KDD dataset. The proposed method with 16, 21, and 19 features obtained Precision values of 0.85, 0.85, and 0.86, respectively, outperforming the other numbers of features. Figure 10 shows the results of comparing the proposed method and the other classifiers for the different feature subsets selected by the BFFA algorithm in terms of Recall.
The results in Fig. 10 show that selecting features with the BFFA algorithm and hybridizing it with the other classifiers was successful in terms of the Recall criterion on the NSL-KDD dataset. The proposed method with 16, 21, and 19 features obtained Recall values of 0.85, 0.85, and 0.86, respectively, outperforming the other numbers of features. The Adaboost algorithm (ADA_BOOST) obtained a value of 0.86 using 15 features. The comparison of the proposed method and the other classifiers for the various feature subsets selected by the BFFA algorithm in terms of F1_Score is shown in Fig. 11.
The results in Fig. 11 show that selecting features with the BFFA algorithm and hybridizing it with the other classifiers on the NSL-KDD dataset was successful in terms of the F1_Score criterion. The proposed method with different numbers of features performed better in F1_Score than the other classifiers. The Adaboost algorithm (ADA_BOOST) obtained a value of 0.85 with 15 features; however, Table 2 shows that


Fig. 10  Comparison of the proposed method and other classifiers based on recall on NSL-KDD

this algorithm with 15 features is less accurate. The experimental results in this section show that, in most FS states, the proposed method outperformed KNN, SVM, DT, RF, ADA_BOOST, and NB on the NSL-KDD dataset in terms of the F1_Score criterion; moreover, the proposed method performed better than the non-selected-feature state. Overall, the experiments showed that 16 and 21 features are the best states for FS, and that the proposed method achieved the best classification performance.
Figure 12 shows a runtime comparison chart for the different classifiers on NSL-KDD. As shown in Fig. 12, the Hybrid model has a longer runtime than the other models, but, according to the results, its detection percentage is higher. The BFFA model also has a longer runtime than the BASE model: because the BFFA model performs more mathematical operations, it takes additional time to find the important features. However, the superiority of the BFFA model over the BASE model is what matters in intrusion detection.

5.2 Evaluation of the UNSW‑NB15

In this sub-section, the proposed method is evaluated on the UNSW-NB15 dataset and compared with the other classification algorithms for different numbers of features. Note


Fig. 11  Comparison of the proposed method and other classifiers based on F1_score on NSL-KDD

Fig. 12  Runtime comparison chart for different classifiers on NSL-KDD

that the proposed method is compared with KNN, SVM, DT, RF, ADA_BOOST, and NB in terms of the Accuracy, Precision, Recall, and F1_Score criteria, and the results are presented in graphical and tabular form. The UNSW-NB15 dataset includes 44 features, the last of which indicates the class; it is divided into training and testing subsets: the number of training samples is 175,341, and the number of test samples is 82,332 (see Fig. 13).


Fig. 13  UNSW-NB15 training and testing dataset (NB15train: 175,341 samples, 68%; NB15test: 82,332 samples, 32%)

Table 3  The results of FS by BFFA algorithm on the UNSW-NB15 dataset


Iteration Number Binary version (selected (1), unselected (0))
of features

1 19 [0,0,0,1,1,0,1,1,0,0,0,1,1,1,0,1,0,0,1,0,1,0,0,0,1,0,0,1,0,1,0,0,1,0,1,1,0,0,0,1,1,1,0]
2 18 [0,0,1,0,0,0,0,0,1,1,1,1,1,0,0,1,1,0,1,0,0,0,0,0,1,0,1,0,1,1,0,0,0,1,1,0,0,1,0,1,1,0,0]
3 21 [0,1,0,1,0,1,0,1,1,0,1,1,0,1,0,1,1,1,0,0,1,0,0,0,1,1,1,0,0,0,1,0,0,1,0,1,1,0,1,1,0,0,0]
4 18 [0,0,0,1,1,0,0,1,0,1,1,0,0,0,0,0,1,0,0,0,1,0,0,1,0,1,1,0,0,1,1,1,1,0,1,0,1,0,1,1,0,0,0]
5 17 [0,0,0,0,0,0,0,1,0,0,1,0,1,1,0,0,0,1,0,1,1,0,0,0,0,0,0,1,1,0,1,1,1,0,1,0,0,0,1,1,1,0,1]
6 19 [0,0,0,1,0,0,1,1,1,1,1,1,0,0,1,0,1,1,0,0,0,0,0,0,0,1,0,0,0,1,1,0,0,1,1,0,1,1,1,0,1,0,0]
7 14 [0,1,0,0,0,0,1,1,0,0,0,0,1,1,1,1,0,0,0,1,0,0,0,0,0,1,0,1,0,0,0,1,1,0,0,1,0,0,0,0,0,0,1]
8 19 [0,0,0,1,1,1,0,0,1,1,0,0,1,1,0,1,0,0,0,1,0,1,1,0,0,0,0,1,0,1,1,1,0,0,0,0,1,1,0,0,1,0,1]
9 19 [0,0,1,1,0,1,0,1,1,0,1,1,0,0,0,1,0,0,1,1,1,0,0,1,0,1,1,1,0,0,0,0,1,0,1,1,0,0,0,0,1,0,0]
10 18 [0,0,0,1,0,0,0,1,0,0,1,0,0,0,0,1,0,0,1,1,0,0,0,1,1,0,0,1,0,0,1,0,1,0,1,1,1,1,0,1,1,1,0]

Fig. 14  Comparison of the classifiers using BFFA and base for 18 and 19 FS on UNSW-NB15


Fig. 15  Comparison of the classifiers using BFFA and base for 18 and 21 FS on UNSW-NB15

Fig. 16  Comparison of the classifiers using BFFA and base for 17 and 19 FS on UNSW-NB15

Fig. 17  Comparison of the classifiers using BFFA and base for 14 and 19 FS on UNSW-NB15


Fig. 18  Comparison of the proposed method and other classifiers based on precision on UNSW-NB15

Table 4  Results of the comparison of the proposed method and other classifiers based on the accuracy
Methods Feature-BFFA
19 18 21 18 17 19 14 19 19 18 43

KNN 0.95 0.94 0.95 0.96 0.95 0.96 0.95 0.92 0.95 0.95 0.93
SVM 0.94 0.93 0.94 0.95 0.94 0.95 0.94 0.91 0.94 0.94 0.91
DT 0.97 0.97 0.98 0.98 0.97 0.98 0.97 0.97 0.97 0.97 0.99
RF 0.91 0.93 0.92 0.93 0.92 0.93 0.93 0.92 0.93 0.91 0.92
ADA_BOOST 0.93 0.93 0.94 0.94 0.93 0.94 0.93 0.92 0.94 0.94 0.98
NB 0.72 0.69 0.67 0.74 0.74 0.72 0.74 0.68 0.72 0.68 0.74
Proposed method 1 0.99 1 1 1 1 0.99 1 0.99 1 1

Bold values indicate the best result compared with the other algorithms

Figure 13 shows that about 70% of the data is used for training and about 30% for testing. The BFFA algorithm is then run on the UNSW_NB15_testing and UNSW_NB15_training datasets, and the best state of the FS is examined. The numbers of selected features are listed in Table 3. The results presented in Table 3 show that 14 to 21 features are selected; the BFFA algorithm selects an average of 18.2 features at this step, meaning that about 58% of the features are removed and 42% are selected, which increases the

speed and accuracy of the classifiers on the UNSW-NB15 dataset. We perform further analysis to show how effective the BFFA algorithm was in FS based on the selected features. In the following, we examine the results of FS with the different classifiers using the BFFA algorithm in terms of the Accuracy, Precision, Recall, and F1_Score criteria on the UNSW-NB15 dataset. Figures 14 and 15 compare the proposed method with different numbers of features against KNN, SVM, DT, RF, ADA_BOOST, and NB on the UNSW-NB15 dataset in terms of the Accuracy criterion (see also Figs. 16, 17).
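The reduction percentages quoted above follow directly from the per-run counts in Table 3, assuming 43 candidate features (the 44 UNSW-NB15 attributes minus the class label). A quick sanity-check of that arithmetic:

```python
# Number of features selected by BFFA in each of the 10 runs (Table 3)
selected = [19, 18, 21, 18, 17, 19, 14, 19, 19, 18]
total = 43  # candidate features in UNSW-NB15 (44 minus the class label)

average = sum(selected) / len(selected)  # average subset size: 18.2
kept = average / total                   # fraction of features kept (~42%)
removed = 1.0 - kept                     # fraction of features removed (~58%)
```

Rounding `kept` and `removed` to whole percentages reproduces the 42% / 58% split reported in the text.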
The results presented in Figs. 14, 15, 16 and 17 show that feature selection by the BFFA algorithm and classification by KNN, SVM, DT, RF, ADA_BOOST, and NB on the UNSW-NB15 dataset was relatively successful in terms of the Accuracy criterion. Moreover, the exact results of comparing the proposed method and the other classifiers, for the various feature subsets selected by the BFFA, in terms of accuracy are shown in Table 4. Table 4 presents the sets of features selected from the UNSW-NB15 dataset by BFFA; each row contains the number of selected features together with the resulting accuracy.
The results in Table 4 show that the proposed method, with different numbers of features and 100% accuracy, outperformed the other classifiers. This experiment shows that the proposed method performed better than KNN, SVM, DT, RF, ADA_BOOST, and NB in 100% of the FS cases. Figure 18 compares the proposed method and the other classifiers for the different feature subsets selected by the BFFA algorithm in terms of Precision.
The results presented in Fig. 18 show that selecting features with the BFFA algorithm and hybridizing it with the other classifiers on the UNSW-NB15 dataset was successful in terms of the Precision criterion. The proposed method with different numbers of features also outperformed the other classifiers in Precision. In the following, the results of comparing the proposed method and the other classifiers for the different feature subsets selected by the BFFA algorithm in terms of Recall are presented in Fig. 19.
Fig. 19  Comparison of the proposed method and other classifiers based on recall on UNSW-NB15


Fig. 20  Comparison of the proposed method and other classifiers based on F1_score on UNSW-NB15

The results presented in Fig. 19 show that selecting features with the BFFA algorithm and hybridizing it with the other classifiers on the UNSW-NB15 dataset was successful in terms of the Recall criterion. The proposed method with different numbers of features obtained a Recall of 1.00 and outperformed the other

Fig. 21  Runtime comparison chart for different classifiers on UNSW-NB15


features. The comparison of the proposed method and the other classifiers, in terms of F1_Score, for the various feature subsets selected by the BFFA algorithm is shown in Fig. 20.
The results presented in Fig. 20 show that selecting features with the BFFA algorithm and hybridizing it with the other classifiers on the UNSW-NB15 dataset was successful in terms of the F1_Score criterion, and the proposed method with different numbers of features outperformed the other classifiers in F1_Score. The experimental results in this section show that the proposed method, in most cases, performs FS better than KNN, SVM, DT, RF, ADA_BOOST, and NB on the UNSW-NB15 dataset under the different criteria; moreover, the proposed method outperformed the non-selected-feature state.
Figure 21 shows a runtime comparison chart for the different classifiers on UNSW-NB15. As shown in Fig. 21, the Hybrid model has a longer runtime than the other models, but, according to the results, its detection percentage is higher. The BFFA model also has a longer runtime than the BASE model: because the BFFA model performs more mathematical operations, it takes additional time to find the important features. However, the superiority of the BFFA model over the BASE model is what matters in intrusion detection.

6 Conclusion and Future Works

This paper presented a binary version of FFA, called BFFA, for FS in IDSs. In the proposed method, the V-shaped function was used to move the FFA algorithm's processes into the binary space; as a result, the V-shaped function changed the continuous position of the solutions in the FFA to binary mode. A hybridization of classifiers was also presented together with the BFFA algorithm as a fast and robust IDS. Finally, the proposed method was compared with KNN, SVM, DT, RF, ADA_BOOST, and NB on two well-known datasets, NSL-KDD and UNSW-NB15, in terms of the Accuracy, Precision, Recall, and F1_Score criteria. The experiments on the NSL-KDD dataset showed that BFFA was able to reduce the NSL-KDD dataset features by about 50% to 70% while increasing the accuracy and speed of the different classifiers. The proposed method outperformed the other algorithms in accuracy in 90% of the FS cases. According to the accuracy and F-score measures, the proposed method achieved the best results on NSL-KDD, with an accuracy of 0.83 and an F-score of 0.85; the DT achieved the second-best results with 0.82 and 0.82. On UNSW-NB15, the proposed method again achieved the best results, with an accuracy of 1.0 and an F-score of 1.0; the DT achieved the second-best results with 0.99 and 0.99. Moreover, the experiments on the UNSW-NB15 dataset showed that BFFA selected an average of 18.2 features; this means that about 58% of the features were removed and 42% selected. As a result, the accuracy and speed of the different classifiers increased, and the proposed method outperformed the other algorithms in accuracy in 100% of the FS cases. For future work, we consider hybrid algorithms such as PSO-FFA and GWO-FFA for FS on large datasets.


References
1. Khammassi, C., Krichen, S.: A GA-LR wrapper approach for feature selection in network intrusion
detection. Comput. Secur. 70, 255–277 (2017)
2. Samadi Bonab, M., et al.: A wrapper‐based feature selection for improving performance of intrusion
detection systems. Int. J. Commun. Syst. 33, e4434 (2020)
3. Aldweesh, A., Derhab, A., Emam, A.Z.: Deep learning approaches for anomaly-based intrusion
detection systems: a survey, taxonomy, and open issues. Knowl.-Based Syst. 189, 105124 (2020)
4. Husain, M.S.: Nature inspired approach for intrusion detection systems. In: Design and Analysis of
Security Protocol for Communication, pp. 171–182 (2020)
5. Jiang, K., et al.: Network intrusion detection combined hybrid sampling with deep hierarchical net-
work. IEEE Access 8, 32464–32476 (2020)
6. Nadimi-Shahraki, M.H., et  al.: Migration-based Moth-flame optimization algorithm. Processes
9(12), 2276 (2021)
7. Zamani, H., Nadimi-Shahraki, M.H., Gandomi, A.H.: QANA: Quantum-based avian navigation
optimizer algorithm. Eng. Appl. Artif. Intell. 104, 104314 (2021)
8. Nadimi-Shahraki, M.H., et  al.: An improved Moth-flame optimization algorithm with adaptation
mechanism to solve numerical and mechanical engineering problems. Entropy 23(12), 1637 (2021)
9. Ghafori, S., Gharehchopogh, F.S.: Advances in spotted hyena optimizer: a comprehensive survey.
Arch. Comput. Methods Eng. 1, 1–22 (2021). https://doi.org/10.1007/s11831-021-09624-4
10. Zamani, H., Nadimi-Shahraki, M.H., Gandomi, A.H.: CCSA: conscious neighborhood-based crow
search algorithm for solving global optimization problems. Appl. Soft Comput 85, 105583 (2019)
11. Nadimi-Shahraki, M.H., et al.: B-MFO: a binary moth-flame optimization for feature selection from
medical datasets. Computers 10(11), 136 (2021)
12. Gharehchopogh, F.S., Shayanfar, H., Gholizadeh, H.: A comprehensive survey on symbiotic organ-
isms search algorithms. Artif. Intell. Rev. 53, 2265–2312 (2019)
13. Gharehchopogh, F.S., Gholizadeh, H.: A comprehensive survey: whale optimization algorithm and
its applications. Swarm Evol. Comput. 48, 1–24 (2019)
14. Garcia-Teodoro, P., et  al.: Anomaly-based network intrusion detection: techniques, systems and
challenges. Comput. Secur. 28(1–2), 18–28 (2009)
15. Rajamohana, S., Umamaheswari, K.: Hybrid approach of improved binary particle swarm optimiza-
tion and shuffled frog leaping for feature selection. Comput. Electr. Eng. 67, 497–508 (2018)
16. Liao, T.W., Kuo, R.: Five discrete symbiotic organisms search algorithms for simultaneous optimi-
zation of feature subset and neighborhood size of knn classification models. Appl. Soft Comput. 64,
581–595 (2018)
17. Abdel-Basset, M., et al.: A new fusion of grey wolf optimizer algorithm with a two-phase mutation
for feature selection. Exp. Syst. Appl. 139, 112824 (2020)
18. Shayanfar, H., Gharehchopogh, F.S.: Farmland fertility: a new metaheuristic algorithm for solving
continuous optimization problems. Appl. Soft Comput. 71, 728–746 (2018)
19. Gharehchopogh, F.S., Abdollahzadeh, B.: An efficient harris hawk optimization algorithm for solving the travelling salesman problem. Clust. Comput. (2021). https://doi.org/10.1007/s10586-021-03304-5
20. Abdollahzadeh, B., Gharehchopogh, F.S.: A multi-objective optimization algorithm for feature selection problems. Eng. Comput. (2021). https://doi.org/10.1007/s00366-021-01369-9
21. Wei, W., et al.: A multi-objective immune algorithm for intrusion feature selection. Appl. Soft Com-
put. 95, 106522 (2020)
22. Alazzam, H., Sharieh, A., Sabri, K.E.: A feature selection algorithm for intrusion detection system
based on Pigeon Inspired Optimizer. Exp. Syst. Appl. 148, 113249 (2020)
23. Nazir, A., Khan, R.A.: A novel combinatorial optimization based feature selection method for net-
work intrusion detection. Comput. Secur. 102, 102164 (2021)
24. Eesa, A.S., Orman, Z., Brifcani, A.M.A.: A novel feature-selection approach based on the cuttlefish
optimization algorithm for intrusion detection systems. Expert Syst. Appl. 42(5), 2670–2679 (2015)
25. Mohammadi, S., et  al.: Cyber intrusion detection by combined feature selection algorithm. J. Inf.
Secur. Appl. 44, 80–88 (2019)
26. Aghdam, M.H., Kabiri, P.: Feature selection for intrusion detection system using ant colony optimi-
zation. Int. J. Netw. Secur. 18(3), 420–432 (2016)

13
Journal of Network and Systems Management (2022) 30:40 Page 27 of 27  40

27. Ghanem, W., Jantan, A.: Novel multi-objective artificial bee Colony optimization for wrapper based
feature selection in intrusion detection. Int. J. Adv. Soft Comput. Appl. 8(1) (2016)
28. Guo, C., et al.: A two-level hybrid approach for intrusion detection. Neurocomputing 214, 391–400
(2016)
29. Farnaaz, N., Jabbar, M.: Random forest modeling for network intrusion detection system. Procedia
Comput. Sci. 89, 213–217 (2016)
30. Manzoor, I., Kumar, N.: A feature reduced intrusion detection system using ANN classifier. Expert
Syst. Appl. 88, 249–257 (2017)
31. Aburomman, A.A., Reaz, M.B.I.: A survey of intrusion detection systems based on ensemble and
hybrid classifiers. Comput. Secur. 65, 135–152 (2017)
32. Khorram, T., Baykan, N.A.: Feature selection in network intrusion detection using metaheuristic
algorithms. Int. J. Adv. Res. Ideas Innov. Technol. 4(4), 704–710 (2018)
33. Acharya, N., Singh, S.: An IWD-based feature selection method for intrusion detection system. Soft.
Comput. 22(13), 4407–4416 (2018)
34. Papamartzivanos, D., Mármol, F.G., Kambourakis, G.: Dendron: genetic trees driven rule induction
for network intrusion detection systems. Future Gener. Comput. Syst. 79, 558–574 (2018)
35. Selvakumar, B., Muneeswaran, K.: Firefly algorithm based feature selection for network intrusion
detection. Comput. Secur. 81, 148–155 (2019)
36. Alzubi, Q.M., et al.: Intrusion detection system based on a modified binary grey wolf optimisation.
Neural Comput. Appl. 32, 6125–6137 (2019)
37. Garg, L., Aggarwal, N.: A hybrid feature reduced approach for intrusion detection system. In: Com-
puting and Network Sustainability, pp. 179–186. Springer, Berlin (2019)
38. Su, T., et al.: BAT: deep learning methods on network intrusion detection using NSL-KDD dataset.
IEEE Access 8, 29575–29585 (2020)
39. Benyamin, A., Farhad, S.G., Saeid, B.: Discrete farmland fertility optimization algorithm with
metropolis acceptance criterion for traveling salesman problems. Int. J. Intell. Syst. 36(3), 1270–
1303 (2021)
40. Gharehchopogh, F.S., Farnad, B., Alizadeh, A.: A modified farmland fertility algorithm for solving
constrained engineering problems. Concurr. Comput.: Pract. Exp. 33(17), e6310 (2021)
41. Hosseinalipour, A., et al.: A novel binary farmland fertility algorithm for feature selection in analy-
sis of the text psychology. Appl. Intell. 51, 4824–4859 (2021)
42. Kevric, J., Jukic, S., Subasi, A.: An effective combining classifier approach using tree algorithms for
network intrusion detection. Neural Comput. Appl. 28(1), 1051–1058 (2017)
43. Dhanabal, L., Shantharajah, S.: A study on NSL-KDD dataset for intrusion detection system based
on classification algorithms. Int. J. Adv. Res. Comput. Commun. Eng. 4(6), 446–452 (2015)
44. Moustafa, N., Slay, J.: UNSW-NB15: a comprehensive data set for network intrusion detection sys-
tems (UNSW-NB15 network data set). In: 2015 Military Communications and Information Systems
Conference (MilCIS). IEEE (2015)
45. Mirjalili, S., Lewis, A.: S-shaped versus V-shaped transfer functions for binary particle swarm opti-
mization. Swarm Evol. Comput. 9, 1–14 (2013)
46. Arora, S., Anand, P.: Binary butterfly optimization approaches for feature selection. Expert Syst.
Appl. 116, 147–160 (2019)

Publisher’s Note  Springer Nature remains neutral with regard to jurisdictional claims in published
maps and institutional affiliations.
