Paper 8
Computer Networks
journal homepage: www.elsevier.com/locate/comnet
Article info

Keywords: ISCX 2012; CIC-IDS2017; Evolutionary algorithm; Grasshopper optimization algorithm; Intrusion detection

Abstract

Due to the proliferation of sophisticated cyber extortion with exponentially critical effects, intrusion detection systems are being evolved systematically in their revealing, understanding, attribution and mitigation capabilities. Unfortunately, most modern Intrusion Detection System (IDS) techniques do not provide sufficient defense services in the wireless environment while maintaining operational continuity and the stability of the defense objective in the presence of intruders and modern attacks. To resolve this problem, we propose a new feature selection technique, called ECAGOA, which combines an Ensemble of Feature Selection (EFS) method with the Chaotic Adaptive Grasshopper Optimization Algorithm (CAGOA). The proposed method is capable of preventing stagnation, which is credited to the following three aspects. Firstly, the EFS method is applied to select the high-ranked subset of attributes. Then, we employ the chaos concept in the Grasshopper Optimization Algorithm (GOA), which generates a uniformly distributed population to enhance the quality of the initial populations and helps to manage two different abilities in the optimization process: searching new space, termed exploration, and using existing space, termed exploitation. Lastly, in order to avoid local optima and premature convergence, an adaptive grasshopper optimization algorithm is developed by using an organized parameter adaptation method. Furthermore, the adaptive behavior of GOA is applied to decide whether a record signifies an anomaly or not, differing from some approaches found in the literature. A support vector machine (SVM) is used as the fitness function in the proposed method to choose the relevant features that can help classify the attacks accurately. In addition, the method is also applied to optimize the penalty factor (C), kernel parameter (𝜎), and tube size (𝜖) of the SVM. The proposed algorithm is evaluated using three popular datasets: ISCX 2012, NSL-KDD and CIC-IDS2017. The evaluation results show that the proposed method outperformed several state-of-the-art feature selection techniques in terms of detection rate, accuracy, and false alarm rate.
∗ Corresponding author.
E-mail address: [email protected] (S. Dwivedi).
https://doi.org/10.1016/j.comnet.2020.107251
Received 10 August 2019; Received in revised form 3 March 2020; Accepted 3 April 2020
Available online 1 May 2020
1389-1286/© 2020 Elsevier B.V. All rights reserved.
S. Dwivedi, M. Vardhan and S. Tripathi Computer Networks 176 (2020) 107251
This is due to irrelevant and redundant features in the dataset, which decrease the speed of detection. This study addresses the existing limitation by utilizing a preprocessing step so that the performance of the IDS can be improved. Feature selection, as a preprocessing step, is used in this work to eliminate non-essential features from datasets and can help to classify external records into normal or abnormal activities. Studies on intrusion detection systems show that feature selection algorithms fall into two categories: filter and wrapper [6]. Filter methods identify the relations between the input features and the target class and then remove irrelevant features from the input, whereas wrapper methods use a learning model to select feature subsets. In general, wrapper methods deliver better performance than filter methods.

In recent years, a variety of metaheuristic techniques have been introduced for intrusion detection systems, which can provide better security and are able to find advanced attacks [7]. Many scientists have considered intrusion detection a hard problem in terms of classification and feature selection [8]. Almost all metaheuristic algorithms depend on three aspects inspired by nature, namely ecology, biology, and ethology; they reproduce the random variables as offspring. To overcome the computational cost of time-consuming trial-and-error parameter tuning, researchers have used chaotic concepts and adaptation strategies in metaheuristic algorithms for solving the IDS problem. By adopting chaos and adaptive strategies, the performance of metaheuristic algorithms can be improved in terms of diversity, premature convergence and quality of solutions [9]. In recent years, various Evolutionary Computation (EC) techniques such as Differential Evolution (DE), Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and Grasshopper Optimization Algorithm (GOA) have been established which work as wrapper methods [10].

The grasshopper is an insect well known as a plant pest. Grasshoppers are usually seen individually; nonetheless, when they combine into one swarm, the swarm may contain millions of grasshoppers. Recently, the grasshopper optimization algorithm has been proposed, which is inspired by the behavior of grasshopper swarms in nature [11]. The fundamental limitation of GOA is that it cannot guarantee optimality. The solution quality also deteriorates as the number of control parameters increases. In addition, it gives poor-quality solutions for some problems and function types. To improve the performance of GOA in intrusion detection systems, in this study we introduce an adaptive variant of GOA with the chaos concept, called Chaotic Adaptive Grasshopper Optimization Algorithm (CAGOA), to find attacks accurately. In [12], the authors introduced chaos theory into the optimization process of GOA so as to accelerate the global convergence speed. The chaotic maps were
employed to balance exploration and exploitation efficiently and to reduce the repulsion/attraction forces between grasshoppers in the optimization process.

Several machine learning techniques and frameworks, namely naive Bayes, multi-layer perceptron, support vector machine, and artificial neural network [13], have been proposed for IDSs to find the type of attacks from network traffic. They can also distinguish attacks by examining the factors of the network data. Network data contain irrelevant features, which decrease the detection accuracy, whereas relevant input features can increase the accuracy of an intrusion detection system. Hence, the selection of informative features as input to learning approaches is a vital issue in an IDS. Selecting a feature subset is always a challenge, and the FS problem is NP-hard [14].

Several integrations of filter and wrapper methods, called hybrid FS detection, analysis and investigation approaches (combination of misuse and anomaly detection), have been proposed to defend against malware. However, malicious programs employ a variety of propagation and escape techniques to bypass defensive mechanisms [15]. In [16], a new hybrid technique was introduced which employs the linear correlation coefficient as a filter and the cuttlefish algorithm for feature selection, with a decision tree classifier used as the fitness function. Moreover, several hybrid classification techniques such as GA-PSO, GA-SVM and SVM-PSO have been proposed for identifying the type of attacks [17].

The main aim of this study is to provide an accurate and effective anomaly intrusion detection system utilizing evolutionary and machine learning techniques that rely on specific attack signatures to distinguish between normal and malicious activities with high accuracy and fast learning speed. The adaptive grasshopper optimization algorithm with chaos concept (CAGOA) is applied to optimize the tuning parameters of the SVM. Initially, redundant features are removed from the original data by the EFS method. Then, a uniformly distributed population is generated by the chaos concept to enhance the quality of the initial populations. Finally, the reduced data are passed to the wrapper method (CAGOA) to control the observed diversity from the original feature space. The main contributions of this paper are as follows:

• An efficient filter-based method called Ensemble Feature Selection (EFS) is introduced by combining JMI, mRMR, and CMIM.
• A chaos map is used to generate uniformly distributed populations to enhance the quality of the initial population.
• In the original GOA, the local optima and stagnation problems still occur and cannot be ignored. To overcome them, we introduce new position-updating and natural selection mechanisms which are applied to the basic GOA; the resulting algorithm is known as CAGOA.
• This paper proposes a new hybrid algorithm called ECAGOA, which combines EFS and CAGOA and can improve the detection rate.
• CAGOA can choose the optimal number of features that help to recognize the type of attacks. In addition, it is applied to tune the penalty factor, kernel parameter, and tube size of the support vector machine.
• The proposed approach is evaluated and compared with other reported methods on three standard IDS datasets: ISCX 2012, CIC-IDS2017 and NSL-KDD. The experimental results show that the proposed method is superior to other FS methods in terms of accuracy, detection rate, and false alarm rate.

However, present network traffic data, which are often large in size, present a vital challenge to IDSs. Such big data slow down the entire detection process and may lead to unsatisfactory classification accuracy due to the computational difficulties in handling them. Therefore, in this study we use filter and wrapper algorithms, namely EFS and CAGOA, to distinguish the malicious records in IDS datasets and improve the classification accuracy. As discussed earlier, several filter and wrapper methods have been employed to classify attacks; however, to our knowledge, there has been no effort in the literature to tackle the FS problem by a combination of EFS and CAGOA, called hybrid feature selection.

The rest of this paper is organized as follows. Section 2 outlines the work related to this study. Section 3 introduces the filter feature selection algorithms and the grasshopper method. Section 4 introduces the proposed approach and outlines how it chooses the notable features from the IDS datasets and discriminates the different types of attacks. Section 5 illustrates the overall working of the proposed method. In Section 6, we evaluate the performance on the IDS datasets. Section 7 presents the conclusion.

2. Related work

Dorothy E. Denning introduced the first intrusion detection model. Intrusion detection is the process of monitoring and analyzing events that occur in a computer or networked computer system to detect user behavior that conflicts with the intended use of the system [18]. The model contains records to represent the actions of subjects with respect to metrics and statistical models, and rules to acquire facts about these actions from audit records and to detect abnormal behavior. It is also independent of any particular system, application environment, system vulnerability or category of intrusion, and thus provides the outline of a general-purpose intrusion detection expert system. After that, the same research group investigated a new expert intrusion detection system. In 2019, a new ranker algorithm was introduced in [19] to rank features for cost-effective classification of network traffic. The proposed method was validated on the MIT-DARPA, CAIDA, ISCX-IDS and TU-DDoS datasets. On large datasets (50,000–1,000,000 instances), the feature ranking algorithm finds the best possible features and obtains high accuracy (92%–97%) in a parallel environment, which takes significantly less time (71%–85% lower) than a sequential environment.

Increasingly, fruitful applications such as feature selection and classification models have been applied to IDS datasets for finding the type of attacks. According to [20], FS algorithms combined with learning algorithms cannot handle, or do not scale to, extremely large volumes of data. To handle this type of problem, this work focuses on filter and wrapper methods, namely mutual information and CAGOA, for finding attacks. Detection of attacks and intrusions is a broad strand of research that follows the trend of applying intrusion detection. To choose effective features and improve the performance of intrusion detection systems, a new hybrid classification method based on the artificial bee colony and artificial fish swarm algorithms was proposed in [21]. The simulation results on the NSL-KDD and UNSW-NB15 datasets demonstrated that the proposed method performs better in terms of performance metrics and achieves a 99% detection rate and a 0.01% false positive rate.

To solve cyber-attack problems, several metaheuristic techniques have recently been proposed that address the anomaly-based problem by using new update schemes. Metaheuristic methods imitate natural processes or biological phenomena to search for the best solution efficiently. In [22], a novel FS algorithm using a support vector machine was proposed to reduce the feature domain of IDS datasets. To predict web traffic activities, Hiroshi et al. [23] introduced a framework using a genetic algorithm with a fuzzy computing model for network detection over a given time interval. In this approach, GA is employed to produce a digital signature of a network segment utilizing flow analysis, where the evidence is extracted from network flow data. The experimental results indicated that the proposed method attained an accuracy of 96.53% and a false positive rate of 0.56% on the network traffic flow. In [24], a novel distributed blind intrusion detection framework was proposed by modeling sensor measurements as the target graph-signal and utilizing the statistical properties of the graph-signal for intrusion detection. To fully take into account the underlying network structure, the graph similarity matrix is constructed using both the data measured by the sensors and the sensors' proximity, resulting in a data-adaptive and structure-aware monitoring solution.
A combination of the gravitational search algorithm (GSA) and differential evolution (DE) algorithm was employed to optimize the parameters of HKELM, which improved its global and local optimization abilities when predicting attacks [25]. In addition, the kernel principal component analysis (KPCA) algorithm was introduced for dimensionality reduction and feature extraction of the intrusion detection data. Then, a novel intrusion detection approach, KPCA-DEGSA-HKELM, was obtained. The proposed approach was eventually applied to the classic benchmark KDD Cup 99 dataset, the real modern UNSW-NB15 dataset and the industrial intrusion detection dataset from the Tennessee Eastman process.

In contrast to the NSL-KDD and KDD Cup 99 datasets, there are some recent datasets which include modern attack trends, such as the ADFA [26], AWID [27], UNSW-NB15 [28], CIC-IDS 2017 [29], and ISCX 2012 [30] datasets frequently used in research studies. Sharma and Chaurasia [31] introduced an IDS based on density maximization-based fuzzy c-means clustering (DM-FCC). The simulations were conducted on the ADFA dataset, and the proposed approach performed better in terms of accuracy, precision, detection rates, and false alarms. In [29], a system based on deep learning was suggested for hybrid intrusion detection and signature generation for unknown web attacks (D-Sign). D-Sign was proficient in detecting attacks and generating attack signatures, achieving high accuracy, specificity and sensitivity for web-based attacks. The experiments were performed on the CIC-IDS 2017 and NSL-KDD datasets to demonstrate the efficiency of the proposed model. Aminanto et al. [32] presented a novel deep-feature extraction and selection (D-FES) method, integrating stacked feature extraction and weighted feature selection. The experimental outcomes on a Wi-Fi network dataset, the Aegean Wi-Fi Intrusion dataset, demonstrated the effectiveness and efficacy of the proposed D-FES, which attained a detection accuracy of 99.918% and a false alarm rate of 0.012% in accurately identifying the impersonation attacks discussed in previous research studies.

Recently, considerable work has been done in the area of intrusion detection which attempts to design anomaly and/or misuse detection systems to detect malicious attacks with a high detection rate and low false alarm rate. In [33], a novel two-phase model called Real-time Alert Correlation method based on Code-books (RACC) was proposed for intrusion detection systems. First, in the offline phase, RACC pre-processes a knowledge base to build matrices, called code-books, as the main data structure of the method. Instead of keeping alerts in memory, those matrices just hold keys to the corresponding meta-alerts. An index based on red-black trees is used to access matrix elements. Generating the matrices and the mentioned index is independent of the alerts, so using them can facilitate the alert correlation process in an online manner in phase two of the proposed model.

To mitigate Denial of Service (DoS) attacks, a pattern matching algorithm called the All-Ready State Traversal pattern matching algorithm was proposed in [34]. The proposed algorithm constructs the state traversal machine with a size of 1280 bytes and enables users to store large string patterns in the pattern database. The state traversal machine facilitates the easy retrieval of these patterns through the path vector. Further, the proposed work also follows a number of basic ASCII characters with a size of 128 bytes, and designs the memory architecture using a binary search tree structure. Another IDS dataset, ISCX 2012, was generated through a real network configuration and collects packet activities in normal and abnormal form. It describes 𝛼 and 𝛽 profiles, where the 𝛼 profile defines multistage scenarios of attacks and the 𝛽 profile defines mathematical distributions of the entities, which contain preconditions and postconditions [30].

Vidal et al. [35] offered an artificial immune framework which gains proficiency by matching several immune reactions and building an immune memory framework. Experiments on publicly available datasets such as KDD Cup 99 and CAIDA have been conducted. A multi-objective approach based on an anomaly detection technique, called multi-objective PSO, has been proposed in the context of neural networks [36]. The experimental simulations demonstrated that the hybrid algorithm can rapidly and effectively respond to and mitigate DoS attacks in adversarial conditions with regard to the functional performance criteria. Security is a prime challenge in wireless mesh networks. To handle this challenge, the use of a support vector machine for intrusion detection in wireless mesh networks was proposed in [37].

Security issues in the field of computer and/or network security have been studied extensively. To detect attacks in networks and computer systems, IDSs need to collect, store and analyze a wide scope of data; to this end, this paper integrates an ensemble of filters with a wrapper. In the first phase, the filter method (EFS) attempts to exclude irrelevant features from the original data. By reducing the search space from the entire original feature space to the pre-selected features, this phase accelerates processing in the next phase, the wrapper method (CAGOA). The proposed approach is eventually applied to well-known benchmark datasets. The numerical results validate both the high accuracy and the time-saving benefit of the proposed approach. It outperformed several state-of-the-art feature selection algorithms in terms of false alarm rate and detection rate.

3. Background details of methods

3.1. Feature selection

Feature selection is becoming an essential part of building intrusion detection systems: it eliminates irrelevant and redundant features and selects the subset of features that produces a better characterization of patterns belonging to different attacks. Let $X^{m \times n} = \{x_{i,j}\}$ be a matrix containing m features and n records originating from different groups denoted by a target attack, $X^{m \times n} = [X_1^{m \times n_1}\; X_2^{m \times n_2}\; \dots\; X_p^{m \times n_p}]$, where each matrix $X_i^{m \times n_i}$ contains records from the same group and $n_1 + n_2 + \dots + n_p = n$. Selecting the most informative features consists of identifying the feature subset $S^{k \times n} \in X^{m \times n}$, $k \ll m$, which is the most discriminative for the outlined attacks. An ensemble method can be formed in several ways; the proposed ensemble framework depends on a mutual-information-based filter approach that blends the Conditional Mutual Information Maximization (CMIM), minimum-Redundancy-Maximum-Relevancy (mRMR), and Joint Mutual Information (JMI) ranking methods, together with the wrapper method CAGOA, which selects the relevant attributes for better classification of attack types.

3.1.1. Filter based methods

Recently, selecting the best feature subsets through information theory has become a promising task, in which attributes are selected that are highly correlated with the class and uncorrelated with other features. Compared with other filter approaches, mutual-information-based algorithms are adopted for the following reasons: they (1) perform more reliably in noisy problems, (2) generalize to multi-class problems, (3) generalize to numerical outcome problems, and (4) are robust to incomplete (i.e. missing) data. In addition, they calculate a score for each feature, which can be used to rank and select the top-scoring features; these scores may also be applied as feature weights to guide downstream modeling [38]. None of the existing approaches can overcome problems such as low performance, redundant features, and high computational burden [39].

In the CMIM technique, feature subset selection is based on maximizing conditional mutual information with respect to the class; in addition, a selected feature is highly correlated with the class and uncorrelated with other features. It makes a compromise between the predictive power of the nominated candidate (relevance to the class) and its independence from all previously selected features. The Mutual Information (MI) between the class y and features X is expressed in Eq. (1).

$$I(y; X) = H(y) - H(y \mid X) \tag{1}$$

where $H(y)$ and $H(y \mid X)$ denote the entropy and conditional entropy of the class variable. Some researchers have addressed these problems
using Mutual Information based Feature Selection (MIFS) [40]. Hence, we utilize this model to lessen the redundancy between data attributes and class y, as expressed in Eq. (2). The primary objective of CMIM is to choose the final feature subset that conveys as much information as possible from the record S. The relevance of input attributes determined by CMIM is:

$$M_{CMIM}(X) = \min_{x_j \in S} I(y; x_k \mid x_j) \tag{2}$$

where $M_{CMIM}$ estimates the mutual information between the full feature set $x_k$ and certain features $x_j$ with respect to the class label y, and S denotes the subset of selected features. $I(y; x_k \mid x_j)$ measures the quantity of classification information that $x_k$ affords when $x_j$ has already been chosen. This information may not be offered by the chosen feature subset S. In comparison to $I(y; x_k)$, $I(y; x_k \mid x_j)$ does not contain the redundant information of pairwise features for classification.

Several feature selection algorithms have been reported in [41], which recommends Mutual Information (MI) for solving IDS classification problems. Reducing redundancy can improve the discriminating capability of feature subsets. The direct way is to maximize the classification information that candidate features newly deliver to the feature subset. This idea is directly applied in the joint mutual information between the subset and classes. The relevance of input attributes defined by JMI is shown in Eq. (3).

$$M_{JMI}(X) = \sum_{x_j \in S} I(x_k; x_j; y) \propto \sum_{x_j \in S} I(y; x_k \mid x_j) \tag{3}$$

where $I(x_k; x_j; y)$ denotes the MI between the original attribute set $x_k$ and a selected attribute $x_j$ from the selected feature subset S, with respect to the class y.

The dependency and relevancy of feature sets are hard to analyze. One useful method available in the literature for this issue is minimal-Redundancy-Maximal-Relevance (mRMR). The maximum dependency on the target class y, called Max Dependency, is described in Eq. (4).

$$\max\, w(X, y) = I(y; x_1, x_2, \dots, x_N) = H(y) - H(y \mid x_1, x_2, \dots, x_N) \tag{4}$$

As shown in Eq. (4), the dependency among features X is estimated, and it can be a large value. The redundancy between features and its trade-off with relevance are expressed in Eqs. (5) and (6).

$$\min\, Z(X, y) = \frac{1}{|s^2|} \sum_{x_j \in s} I(x_j; x_k) \tag{5}$$

$$\max\, \varpi(w, Z) = w - Z \tag{6}$$

The integration of Eqs. (5) and (6) is known as minimal-redundancy-maximal-relevance (mRMR), which is described in Eq. (7).

$$j_{mRMR}(\varpi) = I(y; X) - \frac{1}{|s^2|} \sum_{x_j \in s} I(x_j; x_k) \tag{7}$$

where $x_j$ is the selected subset of features and $x_k$ is the original feature set.

3.2. Grasshopper optimization algorithm

The grasshopper optimization algorithm, first introduced by Saremi et al. [11], is one of the new nature-inspired and population-based techniques that mimic the behavior of grasshopper swarms in nature. In GOA, the position of a grasshopper in the swarm signifies a candidate solution for a given optimization problem. Grasshoppers have an exclusive way of flying [42]. During the food search, the grasshoppers have two vital phases of optimization, which are exploration and exploitation of the search space. In the exploration phase, the agents are encouraged to make sudden movements, whereas they tend to move locally in the exploitation phase [43]. Based on the mathematical model proposed for this optimization approach [44], the movement of grasshoppers is generally influenced by three factors: gravity force, social interaction, and advection of wind. The position of the ith grasshopper $Y_i$ is given in Eq. (8).

$$Y_i = So_i + Gr_i + Aw_i \tag{8}$$

where $Y_i$ represents the grasshopper position, $So_i$ represents the ith social interaction, $Gr_i$ represents the gravitational force on the ith agent, and $Aw_i$ denotes the advection of wind. The random behavior of the swarm can be illustrated as in Eq. (9).

$$Y_i = r_1 \, So_i + r_2 \, Gr_i + r_3 \, Aw_i \tag{9}$$

where $r_1$, $r_2$, and $r_3$ are random numbers between 0 and 1.

$$So_i = \sum_{\substack{j=1 \\ j \neq i}}^{n} sf(D_{ij}) \, \hat{D}_{ij} \tag{10}$$

where $D_{ij}$ is the distance between the ith and the jth grasshopper, $sf$ is a function that decides the power of social forces as shown in Eq. (10), and $\hat{D}_{ij} = \frac{Y_j - Y_i}{D_{ij}}$ is a unit vector from the ith grasshopper to the jth grasshopper. The $sf$ function is given in Eq. (11).

$$sf(r) = I e^{-r/\mathit{ln}} - e^{-r} \tag{11}$$

where I is the intensity of attraction and $\mathit{ln}$ is the attractive length scale. Fig. 2 shows the $sf$ function, which demonstrates the rational model of interactions between agents and the comfort zone. The $Gr_i$ and $Aw_i$ components in Eq. (8) are given by Eqs. (12)–(13) as follows:

$$Gr_i = -gr \, \hat{e}_{gr} \tag{12}$$

$$Aw_i = d \, \hat{e}_w \tag{13}$$

where $gr$ is the gravitational constant, $\hat{e}_{gr}$ is a unit vector towards the center of the earth, d is the constant drift, and $\hat{e}_w$ is a unit vector in the direction of the wind. Substituting $So_i$, $Gr_i$ and $Aw_i$ into Eq. (8) gives Eq. (14).

$$Y_i = \sum_{\substack{j=1 \\ j \neq i}}^{n} sf\left(|Y_j - Y_i|\right) \frac{Y_j - Y_i}{D_{ij}} - gr \, \hat{e}_{gr} + d \, \hat{e}_w \tag{14}$$

where n is the number of grasshoppers. In the optimization algorithm, Eq. (14) is not used directly, as it prevents the optimization technique from exploring and exploiting the search space around a solution. A variant of Eq. (14) is utilized to solve optimization problems, as given in Eq. (15).

$$Y_i^d = cz \left\{ \sum_{\substack{j=1 \\ j \neq i}}^{n} cz \, \frac{ul_d - ll_d}{2} \, sf\left(|Y_j^d - Y_i^d|\right) \frac{Y_j - Y_i}{D_{ij}} \right\} + \hat{T}_d \tag{15}$$

Here, $ul_d$ and $ll_d$ are the upper and lower limits in the dth dimension, $\hat{T}_d$ is the value of the target in the dth dimension, and the coefficient $cz$ decreases the comfort zone in proportion to the number of iterations and is calculated as in Eq. (16).

$$cz = cz_{\max} - t \, \frac{cz_{\max} - cz_{\min}}{t_{\max}} \tag{16}$$

where $cz_{\max}$ denotes the maximum value, $cz_{\min}$ denotes the minimum value, t represents the current iteration, and $t_{\max}$ denotes the maximum number of iterations. Following Saremi et al. [11], we take $cz_{\max} = 1$ and $cz_{\min} = 0.00001$ in this study.

4. Proposed methodology

4.1. Ensemble of feature selection

Recently, filter-based feature selection techniques have gained attention from many researchers in the field of intrusion detection. In this section, we propose an ensemble of feature selection (EFS) technique which
[Figure: zone labels Attraction Zone, Comfort Zone, Repulsion Zone]
combines the output of three filter-based feature selection techniques based on mutual information, namely mRMR, JMI and CMIM, and selects the top-ranked features. The proposed scheme is presented in Algorithm 1. These filter-based techniques are employed to rank the

Algorithm 1 Proposed ensemble feature selection.
Input Data: 𝜓, the number of ranking techniques
Value: 𝜃, the threshold of the number of features to be chosen
Outcome: P, the best-ranking feature set
Begin
    for 𝑙 ← 1 to 𝜓 do
        Acquire ranking 𝐴𝑙 utilizing feature selection technique 𝑙
    end for
    A = combine the ranked features 𝐴1, …, 𝐴𝜓 with a ranking combination technique
    𝐴𝑡 = choose the 𝜃 top features from A
    Obtain optimal ranked features (P)
    Return P

4.1.2. Frequency vote

Frequency vote is a group decision-making scheme which is used to combine the output of various filter techniques and has a setup that is helpful compared to other compound schemes [46]. In this scheme, each model generates predictions (votes) for each ranked feature, and the final result is determined based on whether each ranked feature acquires more than half of the votes or not. If none of the predictions for a particular feature obtains more than half of the votes, we may assume that the ensemble technique is incapable of making a consistent prediction. In that case, we prefer the ranked feature which has the lowermost vote value over the score value. Therefore, we select the most voted prediction as the final prediction of the ensemble-based model, as given in Eq. (17).

$$\sum_{n=1}^{U} d_{n,j} = \operatorname{argmax}_{k \in \{1, 2, \cdots, L\}} \sum_{n=1}^{U} d_{n,k} \tag{17}$$
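As an illustration of the ensemble ranking and frequency vote described above, the following Python sketch combines several feature rankings by majority vote. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `frequency_vote`, the tie-break by mean rank position, and the example rankings and feature names are all hypothetical, and the three rankings are assumed to have already been produced by the mRMR, JMI and CMIM filters.

```python
from collections import Counter


def frequency_vote(rankings, theta):
    """Combine feature rankings (best feature first) by frequency vote.

    rankings: list of rankings, one per filter; every ranking is assumed
              to contain the same features.
    theta:    number of features to keep.
    """
    n_rankers = len(rankings)
    features = list(rankings[0])

    # Each filter casts one vote for every feature in its own top-theta list.
    votes = Counter(f for ranking in rankings for f in ranking[:theta])

    # Mean position over all filters; an assumed tie-break, not from the paper.
    mean_pos = {f: sum(r.index(f) for r in rankings) / n_rankers for f in features}

    # Features with more than half of the votes are ranked before the rest.
    def sort_key(f):
        has_majority = votes[f] > n_rankers / 2
        return (not has_majority, -votes[f], mean_pos[f])

    return sorted(features, key=sort_key)[:theta]


# Hypothetical rankings, standing in for the outputs of mRMR, JMI and CMIM.
example_rankings = [
    ["duration", "src_bytes", "dst_bytes", "flag", "protocol"],  # mRMR (assumed)
    ["src_bytes", "duration", "flag", "dst_bytes", "protocol"],  # JMI (assumed)
    ["src_bytes", "dst_bytes", "duration", "protocol", "flag"],  # CMIM (assumed)
]
selected = frequency_vote(example_rankings, theta=3)
print(selected)
```

With three filters, a feature needs at least two votes to reach a majority; features below that threshold are only used to fill any remaining slots, ordered by vote count and then by mean position.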
4.2. Chaos-based population initialization

The generation of the initial population in the search space plays an essential role in GOA. From the reported literature [12], we have observed that numerous chaos-based AGOA techniques have been studied to resolve global optimization issues. In this work, we apply chaotic map initialization in the AGOA optimization process to accelerate its global convergence speed. Chaotic maps are used to efficiently balance exploration and exploitation and to reduce the repulsion/attraction forces between grasshoppers in the optimization process [49]. Using a chaotic sequence instead of a random sequence in AGOA is a fundamental strategy, since it can carry out a thorough search at higher speed than a stochastic search that is based primarily on probabilities. Few functions (chaotic maps) and few parameters (initial conditions) are necessary even for long sequences. In addition, a huge number of diverse sequences can be produced by merely varying the initial condition. Furthermore, these chaotic sequences are deterministic and reproducible. The logistic map is one of the best-known chaos-based techniques that researchers have applied to global search. It is defined in Eq. (18).

$$x_{n+1} = \phi(x_n, \mu) = \mu \, x_n (1 - x_n) \tag{18}$$

where $x_n$ signifies the nth chaotic variable, $x_n \in (0, 1)$, under the condition that the initial value $x_0 \in (0, 1)$ is not one of the definite periodic static points (0, 0.25, 0.5, 0.75, 1), and $\mu$, also known as the bifurcation coefficient, is set to 4.

In this procedure, a large number of multi-period components appear in narrower and narrower $\mu$ intervals as $\mu$ increases. This period doubling continues without restriction, but it has a limit value at $\mu_t = 3.60$. When $\mu$ approaches $\mu_t$, the period becomes infinite or the sequence even becomes non-periodic. At this point, the whole system evolves into a chaotic state. However, when $\mu$ is larger than 4, the whole system becomes unstable. Therefore, the interval $[\mu_t, 4]$ is commonly regarded as the chaotic region of the whole system. The bifurcation diagram is shown in Fig. 3.

More precisely, whenever a given number of chaotic iterations is executed, the chaotic variables are produced accordingly. Subsequently, by re-mapping these variables into the optimization space, the initial variables are generated for the optimization problem. The logistic-map flowchart is given in Fig. 4; it produces a uniformly distributed sequence and effectively prevents it from being trapped in small periodic cycles.

4.3. Adaptive grasshopper optimization algorithm

GOA is a population-based metaheuristic technique that is widely used to solve various engineering problems. However, it has certain limitations, such as fixed parameter values and a lack of adaptation in dynamic environments. In the initial phase of GOA, the positions of all agents are randomly initialized. The agents develop an inclination towards the goal (i.e., $\hat{T}_d$ in the social interaction of swarms). Conversely, if the initialized positions of the agents are concentrated near a local best and far from the best of the whole swarm, the agents tend to exploit the local best agent. This means that GOA is sensitive to the selection of starting positions, so updating strategies are needed to help the search agents escape the local-best trap.

GOA with adaptive parameters, called AGOA, is a substantial and promising variant of GOA. A self-adapting strategy for control parameters is normally employed when their proper values are not known in advance or when they must change throughout the stochastic search process. The $cz$ control parameter is self-adapted in AGOA. In the basic GOA, the adaptation of $cz$ depends only on the generation (Eq. (16)) and does not consider the possibility of dynamically changing the value of $cz$ through feedback from the search process. As a result, $cz$ should be treated as an adaptive parameter associated with the estimated fitness value of each generation and the performance of the current agent of the population. That is, based on Eq. (19), the fitness of the population (Pop) chosen by the selection technique and of the best population is returned through the feedback mechanism.

In this paper, we utilize the natural selection strategy ($\delta$) and a self-adaptive mechanism to help GOA jump out of the local optimum trap. To implement the adaptive scheme for $cz$, a dynamic feedback mechanism based on the adaptive scheme of the genetic technique is utilized. According to the natural selection methodology [50], $\delta$ ($\delta <$ Pop) grasshopper agents are selected and removed randomly from the population. The population is chosen by the tournament selection strategy according to the fitness function of each grasshopper. Then, the positions of the excluded grasshoppers are randomly initialized again.
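The natural selection step of Section 4.3 (remove 𝛿 randomly chosen agents, re-initialize them, and pick agents by tournament) might look like the sketch below. The function names, the tournament size `k`, and the uniform re-initialization range are assumptions made for illustration; the paper gives no implementation details here.

```python
import random


def tournament_select(population, fitness, k=2, rng=random):
    """Return the fittest of k agents drawn at random (higher fitness is better)."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: fitness[i])]


def natural_selection(population, delta, lower, upper, rng=random):
    """Re-initialise delta randomly chosen agents uniformly in [lower, upper]."""
    if not 0 <= delta < len(population):
        raise ValueError("delta must be smaller than the population size")
    new_population = [agent[:] for agent in population]
    for i in rng.sample(range(len(population)), delta):
        new_population[i] = [rng.uniform(lower, upper) for _ in population[i]]
    return new_population


rng = random.Random(0)                       # fixed seed so the sketch is repeatable
swarm = [[0.0, 0.0] for _ in range(6)]
refreshed = natural_selection(swarm, delta=2, lower=-1.0, upper=1.0, rng=rng)
```

Copying the population before re-initialization keeps the original swarm untouched, which makes it easy to compare diversity before and after the step.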
S. Dwivedi, M. Vardhan and S. Tripathi Computer Networks 176 (2020) 107251
[Fig. 4: logistic-map flowchart. Recoverable steps: Start; randomly initialize chaotic variable (x_n); iter = iter + 1; is iter = max_iter? (N: repeat, Y: End).]
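The logistic-map initialization of Eq. (18) and Fig. 4 can be sketched as follows. The default seed value `x0 = 0.7`, the function names, and the way chaotic values are assigned to agents and dimensions are assumptions for illustration only.

```python
def logistic_sequence(x0, length, mu=4.0):
    """Iterate x_{n+1} = mu * x_n * (1 - x_n) (Eq. (18)) and return `length` values."""
    if not 0.0 < x0 < 1.0 or x0 in (0.25, 0.5, 0.75):
        raise ValueError("x0 must lie in (0, 1) and avoid 0.25, 0.5 and 0.75")
    values, x = [], x0
    for _ in range(length):
        x = mu * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_population(n_agents, n_dims, lower, upper, x0=0.7):
    """Re-map chaotic values from the unit interval into [lower, upper]."""
    seq = logistic_sequence(x0, n_agents * n_dims)
    return [
        [lower + seq[i * n_dims + d] * (upper - lower) for d in range(n_dims)]
        for i in range(n_agents)
    ]


population = chaotic_population(n_agents=5, n_dims=3, lower=-1.0, upper=1.0)
```

Because the sequence is deterministic, calling `chaotic_population` twice with the same `x0` yields the same population, which is the reproducibility property mentioned in Section 4.2.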
Utilizing a natural selection scheme, the population can move in the direction of a better-quality solution. Reinitializing the positions of the discarded agents spreads the positions of the grasshoppers and extends the search space of the population. Consequently, local solutions can be avoided to some extent. The value of $cz$ largely determines the accuracy of the solution and the speed of convergence that the grasshopper optimization algorithm can achieve. Rather than utilizing fixed $cz$ values, AGOA makes use of population information in each generation and modifies $cz$ adaptively to retain population diversity and preserve convergence ability. In AGOA, the setting of $cz$ relies on the fitness values of the solutions. In this study, we identify convergence by observing the interval between the maximum and minimum fitness values of the population. When the $cz$ value is large, the search is too widespread and the optimal solution may be missed. When the $cz$ value is small, the search may stall in the region of a local minimum. In this scenario, we have used an alternative approach for the assessment of population selection that employs simple SVM classifiers. In AGOA, the value of $cz$ is assessed according to Eq. (19).

Algorithm 2 Pseudo code of BCAGOA.
Begin
  Set the swarm size, cz_max, cz_min and maximum number of iterations t_max;
  Generate the population Y using the chaos-based initialization;
  Compute the fitness value of each agent;
  Choose the elite grasshopper using tournament strategy;
  T̂α, T̂β, T̂γ = best search agents;
  while (t < t_max) do
    Update cz using Eq. (19);
    for i ← 1 to n do
      Normalize the distances between grasshoppers Y;
      Update the current agent Y_i position using Eq. (20);
    end for
    Update T̂α, T̂β, T̂γ
    t = t + 1;
  end while
  Return T̂α
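The control flow of the BCAGOA pseudo code can be sketched as a small driver loop. Eqs. (19) and (20) are not reproduced in this section, so they are passed in as the hypothetical callables `update_cz` and `update_position`; the distance normalization step is assumed to happen inside `update_position`. The toy rules in the usage part are placeholders, not the paper's equations.

```python
def bcagoa_loop(population, fitness_fn, update_cz, update_position, t_max):
    """Skeleton of the main loop; the Eq. (19)/(20) rules are supplied by the caller."""

    def three_best(agents):
        return sorted(agents, key=fitness_fn, reverse=True)[:3]

    leaders = three_best(population)            # T_alpha, T_beta, T_gamma
    for t in range(t_max):
        scores = [fitness_fn(agent) for agent in population]
        cz = update_cz(t, t_max, scores)        # stands in for Eq. (19)
        population = [
            update_position(agent, population, cz, leaders[0])  # stands in for Eq. (20)
            for agent in population
        ]
        leaders = three_best(population + leaders)  # elitism: the best never gets worse
    return leaders[0]


# Toy usage with placeholder rules (maximise -x^2, so the optimum is x = 0).
fitness = lambda agent: -agent[0] ** 2
linear_cz = lambda t, t_max, scores: 1.0 - t / t_max
move_to_target = lambda agent, pop, cz, target: [agent[0] + 0.5 * cz * (target[0] - agent[0])]
best = bcagoa_loop([[1.0], [-2.0], [0.5]], fitness, linear_cz, move_to_target, t_max=5)
```

Keeping the previous leaders in the candidate pool is a design choice of this sketch; it guarantees that the returned agent is at least as fit as the best initial agent.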
algorithm is a population-based evolutionary search technique and was primarily proposed as an optimization approach to address continuous problems [7]. However, numerous optimization problems, such as feature selection, are defined in a binary space. GOA can be used to solve binary optimization problems, where the solutions are restricted to the binary values {0, 1}, which motivates us to propose a binary variant of AGOA [52]. Inspired by a prior study [53], we utilize a transfer function to transform the continuous values of the continuous space into the binary values 0 or 1 of the binary space. Typically, the transfer function returns the probability of changing a position element from 0 to 1 or vice versa; it takes the position of the ith agent in the jth dimension in the current iteration (t) as its input parameter.

4.3.2. New encoding scheme

During the search process, the grasshoppers move towards the target, which is the best position found so far by the swarm. The swarm is initialized with a population of random solutions (binary vectors) and searches for the best solution while updating the position of each grasshopper agent according to Eq. (20) in continuous AGOA. Updating the position of each grasshopper agent in a binary search space is not as easy as in a continuous space. In continuous search spaces, the grasshopper agent updates its position by adding the first component of Eq. (20) to the target vector. In a binary search space, on the other hand, the position cannot simply be updated by adding values, because the position vectors of a grasshopper agent can only contain 0 or 1. We therefore take the first component of Eq. (20), which acts as the difference between the two binary vectors (position vector and target vector), as the step vector $Y_i^d$. The position update equation of the grasshopper in the original AGOA can then be rewritten in terms of the step vector $Y_i^d$ as in Eq. (21).

$$Y_i^d = cz \left\{ \sum_{j=1,\, j \ne i}^{n} cz \, \frac{ul_d - ll_d}{2} \, sf\left(\left|Y_j^d - Y_i^d\right|\right) \frac{Y_j - Y_i}{D_{ij}} \right\} \tag{21}$$

In the reported literature, the most common transfer function is the sigmoid function, which is employed to normalize the step vector to the range between 0 and 1. The disadvantage of the sigmoid function is that it does not treat a large value of $Y_i^d$ in the positive and negative directions consistently, even though a large magnitude in either direction indicates that more movement is required relative to the previous position. To resolve this issue, we present a new encoding scheme for the step-vector component. The proposed V-shaped transfer function $T(Y_i^d)$ is given in Eq. (22).

$$T(Y_i^d) = \frac{\exp\left(\left|\left(Y_i^{d+1} + a\right)/(1 - b)\right|\right) - 1}{\exp\left(\left|\left(Y_i^{d+1} + a\right)/(1 - b)\right|\right) + 1} \tag{22}$$

where a and b are pre-defined constants that remain fixed during the whole search process. After computing the probabilities, the agents update their positions using the rule in Eq. (23).

$$Y_i^d = \begin{cases} 1, & \text{if } rand < T(Y_i^d) \\ 0, & \text{otherwise} \end{cases} \tag{23}$$

In general, a particular problem of existing algorithms is that they do not prevent individuals from being stuck in a local optimum. To resolve this kind of issue, this paper proposes a novel hybrid FS algorithm to find better-quality solutions in terms of classification accuracy and to minimize the number of irrelevant features. In addition, a new encoding scheme is investigated to evaluate the classification.

4.3.3. Optimizing SVM parameters with CAGOA

GOA is an evolutionary computation technique that has been employed to solve a variety of real-world engineering problems. In this study, CAGOA is employed as an optimization technique to select relevant features and optimize SVM parameters in order to improve the classification performance in the prediction of attacks.

The main aim of the proposed approach is to obtain the best solutions while utilizing the self-adapting and collective learning ability of grasshoppers. The chaotic maps utilized in the proposed approach are capable of considerably enhancing the solution quality of optimization techniques. In the basic GOA, there is no requirement to retain linear decrements. In practice, varying the SVM parameters through chaotic variables can be more appropriate for the search and can also help the proposed approach reach the optimum value quickly. Hence, in this study, the adjustment of the SVM parameters ($\sigma$, $\epsilon$, and C) using CAGOA is shown in Fig. 5.

4.3.4. Support vector machine

Support vector machine [54] is a powerful machine learning tool that is extensively employed in various applications such as intrusion detection, classification, and pattern recognition [22]. SVM has certain dominant characteristics in comparison to other techniques, for instance outstanding generalization performance, which enables it to create high-quality decision boundaries based on a small subset of training data points. Additionally, SVM has a high capability to model complex and non-linear relations. The basic idea behind the SVM algorithm is to obtain the optimal hyperplane that separates two classes while maximizing the distance between the hyperplane and the nearest data points in the specified data set. The hyperplane is given in Eq. (24).

$$W^T x + c = 0 \tag{24}$$

where the parameter c adjusts the displacement from the origin and $W^T x$ determines the direction of the plane. The margin lies between two hyperplanes. It can be changed by rotating the normal vector and shifted by tuning the parameter c. The margin width is equal to $2/|W|$, which means that the width is inversely proportional to the length of the normal vector. The optimization problem is therefore to maximize the width of the margin according to Eq. (25).

$$f(W) = \min \left[ |W|^2 / 2 \right] \tag{25}$$

The inequalities are $W^T x_m + c \ge 1$ for the positive class and $W^T x_n + c \le -1$ for the negative class. Multiplying each constraint by its label gives $y_m (W^T x_m + c) \ge 1$ and $y_n (W^T x_n + c) \ge 1$. The points $x_m$, $x_n$ have labels $y_m$, $y_n$, where $x_m = [-1, +1]^T$, $y_m = +1$, $x_n = [+1, -1]^T$ and $y_n = -1$. Finally, we need to optimize $\min [|W|^2/2]$ subject to $y_i (W^T x_i + c) - 1 \ge 0$. To solve this problem, we multiply the inequalities by the Lagrange multipliers $\gamma_i$, which gives Eq. (26).

$$L_{pd} = |W|^2 / 2 - \sum_{i=1}^{N} \gamma_i \left\{ y_i (W^T x_i + c) - 1 \right\} \tag{26}$$

By applying the Karush-Kuhn-Tucker (KKT) conditions [55], we obtain the following conditions: $\partial L_{pd} / \partial W = 0$; $\partial L_{pd} / \partial c = 0$; $\gamma_i \ge 0$; $\gamma_i \{ y_i (W^T x_i + c) - 1 \} = 0$. From $\partial L_{pd} / \partial W = 0$ we get $W = \sum_{i=1}^{N} \gamma_i x_i y_i$, and from $\partial L_{pd} / \partial c = 0$ we get $\sum_{i=1}^{N} \gamma_i y_i = 0$.

Applying the KKT conditions to the primal form gives the dual form $L_d$ and its partial derivatives according to Eqs. (27) and (28).

$$L_d = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \gamma_i \gamma_j y_i y_j x_i^T x_j + \sum_{i=1}^{N} \gamma_i \tag{27}$$

Substituting the known values $y_m = +1$, $y_n = -1$, $x_m = [-1, +1]^T$, $x_n = [+1, -1]^T$ gives $L_d = -\gamma_m^2 - \gamma_n^2 - 2\gamma_m \gamma_n + \gamma_m + \gamma_n$.

$$\frac{\partial L_d}{\partial \gamma} = 0 \rightarrow \gamma_m + \gamma_n = 1/2; \quad \sum_{i=1}^{N} \gamma_i y_i = 0 \rightarrow \gamma_m = \gamma_n \tag{28}$$

Since $\gamma_m + \gamma_n = 1/2$, it follows that $\gamma_m = \gamma_n = 1/4$. The normal vector is $W = \sum_{i=1}^{N} \gamma_i y_i x_i = [-1/2, +1/2]^T$. To find the bias c, we use $\gamma_i \{ y_i (W^T x_i + c) - 1 \} = 0$, so $c = y_i - W^T x_i$ for all i such that $\gamma_i \ne 0$, which gives $c = 0$.
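The two-point worked example of Section 4.3.4 can be checked numerically. The sketch below plugs 𝛾_m = 𝛾_n = 1/4 into W = Σ 𝛾_i y_i x_i and c = y_i − Wᵀx_i; the variable names are illustrative. Note that the second component of W comes out positive.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


points = [([-1.0, 1.0], 1.0), ([1.0, -1.0], -1.0)]  # (x_m, y_m) and (x_n, y_n)
gammas = [0.25, 0.25]                               # gamma_m = gamma_n = 1/4

# Normal vector: W = sum_i gamma_i * y_i * x_i
W = [sum(g * y * x[d] for g, (x, y) in zip(gammas, points)) for d in range(2)]

# Bias from any support vector (gamma_i != 0): c = y_i - W^T x_i
c = points[0][1] - dot(W, points[0][0])

# Margin width 2 / |W| and the active constraints y_i (W^T x_i + c) = 1
margin = 2.0 / dot(W, W) ** 0.5
constraints = [y * (dot(W, x) + c) for x, y in points]
```

Running it gives W = [-0.5, 0.5], c = 0.0 and a margin of 2√2, which equals the distance between the two points, as expected when both points are support vectors.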
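[Fig. 5: flowchart of adjusting the SVM parameters using CAGOA. Recoverable labels: Agents (binary strings 010111……101, 101101……110, 110010……011); SVM Classifier Modelling; Criteria Evaluation? (Yes/No); Parameter Obtained.]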
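The binary encoding of agents through the V-shaped transfer function (Eqs. (22) and (23)) can be sketched as below. The constants `a = 0.0` and `b = 0.5` are placeholder values chosen for illustration, since the paper only states that a and b are pre-defined constants; the function names are hypothetical as well.

```python
import math
import random


def v_transfer(step, a=0.0, b=0.5):
    """V-shaped transfer: (e^z - 1) / (e^z + 1) with z = |(step + a) / (1 - b)|."""
    z = abs((step + a) / (1.0 - b))
    # tanh(z / 2) equals (exp(z) - 1) / (exp(z) + 1) and does not overflow for large z.
    return math.tanh(z / 2.0)


def binarize(steps, rng=random, a=0.0, b=0.5):
    """Set bit d to 1 when a uniform draw falls below T(step_d), else 0."""
    return [1 if rng.random() < v_transfer(s, a, b) else 0 for s in steps]


rng = random.Random(42)
agent_bits = binarize([0.0, 3.5, -3.5, 0.2], rng=rng)
```

With a = 0 the function is symmetric around zero, so a step of +3.5 and a step of -3.5 produce the same flip probability, which is the property that motivates a V-shaped function over a sigmoid.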
Finally, we find the maximum margin between the positive and negative classes to solve the intrusion detection problem.

4.3.5. Kernel function and parameters of SVM

In the last decades, SVMs have received considerable attention from researchers. Several types of separating classification surfaces can be realized through a kernel function (k), for instance linear, polynomial, and Gaussian Radial Basis Function (RBF) kernels, to handle non-linear problems. SVM is used for both binary and multi-class classification, where multi-class classification is performed by training an SVM for each pair of classes. In this work, we list three kernel functions taken from Shukla et al. [56]:

• Linear function: $k(x_i, x_j) = x_i^T x_j + d$
• Polynomial function: $k(x_i, x_j) = \left(x_i^T x_j + d\right)^p$
• RBF function: $k(x_i, x_j) = \exp\left\{ -\frac{|x_i - x_j|^2}{2\sigma^2} \right\}$

where d is a constant, $x_i$ and $x_j$ are samples or instances, and p is the order of the polynomial function.

5. Overall structure of proposed approach

In this study, we employ chaotic maps to enhance the initial population and an adaptive scheme for automatic parameter tuning; the population is then updated using the AGOA algorithm. Ensemble-based feature selection is used to select the top-ranked features, and the selected feature set is then optimized using the CAGOA approach. In addition, during the tuning of the SVM parameters, the optimal parameters are dynamically adjusted by the proposed approach in the training phase via 10-fold cross validation to improve classification performance. The optimized parameters are then fed to the SVM classification module, as illustrated in Fig. 6. The proposed algorithm is called ECAGOA. The overall process of the hybrid algorithm for selecting the relevant feature subset that helps in the identification of attacks is as follows:

1. First, the filter-based methods are applied as an ensemble to obtain the final ranking, called EFS.
2. The logistic map is utilized to generate uniformly distributed agents to improve the quality of the initial population.
3. Agents are selected by the tournament-based approach in CAGOA. The number of agents is set to 30. Each agent consists of the features selected by the filter.
4. The proposed encoding scheme is applied to initialize the agents and the SVM parameters. The three parameters 𝜎, 𝜖 and C are encoded in binary form and represented as part of the agent. After encoding, the length of each individual corresponds to the few features present in the reduced dataset.
5. For each agent, the fitness is computed using the SVM classifier, with classification accuracy as the fitness function.
6. The agent with the highest score is taken as the target in the initial phase.
7. The position of each agent (Y_i) is then updated to obtain a new target.
8. The old agent is updated according to the new value of the agent.
9. The parameters of the SVM classifier are determined. If the current best fitness value meets the termination condition, stop; otherwise, go to step 3.
10. The optimal feature subset and the optimal SVM parameters 𝜎, 𝜖 and C are output.

The proposed algorithm is performed in several phases: (1) data collection, (2) data preprocessing, (3) classifier training, and (4) attack recognition. The experimental results show that the proposed approach outperforms other existing intrusion detection techniques in the identification of intrusions, but it has certain limitations, which are discussed in the next subsection.

[Fig. 6: flowchart of the proposed approach. Recoverable labels: Data Pre-processing; Compact dataset; Anomaly Detection Engine; Normal Behavior Found? (Yes: Permitted; No).]

5.1. Data collection

Data collection is an important phase in intrusion detection research. During the data collection phase, data is collected from numerous different sources, for instance network packets, system logs and management information base data. After collection, the data is preprocessed, including data transferring and data normalization, so that it can be sent to the intrusion detection module for further evaluation.

5.2. Data preprocessing

In this phase, the training and testing data acquired in the data collection phase are pre-processed to produce basic features. Data pre-processing consists of three main steps. The first step is data transferring, where each symbolic feature value in a dataset is transformed to a numerical value. The second step is data normalization, which is essential to retain a uniform distribution of each feature value before any learning process begins. In this context, we utilize the min-max scheme. In min-max, the values of all features are scaled to the range [0, 1] to remove the bias in favour of features with larger values. During the third step, feature selection is performed, in which the proposed approach is employed to select the most significant features, which are then utilized to train the classifier in the intrusion detection model.

5.3. Classifier training

In this phase, the classifier is trained. Once the best feature subset is chosen, it is evaluated by the classifier during the training phase, where an explicit classification technique is used. Since SVM can deal with both binary and multi-class classification problems, in this work we use SVM to classify the IDS datasets; for instance, the NSL-KDD dataset has five classes: a normal class and four attack classes.

5.4. Attack recognition

In this phase, the trained model is employed to distinguish attacks in the test data. After all iterations are finished and the final classifier, which contains the most significant features, is trained, normal and attack traffic can be predicted using the trained classifier. The test data is passed through the trained model to discover intrusions. In this work, the NSL-KDD, ISCX 2012, and CIC-IDS2017 datasets are used to evaluate the proposed approach.
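The first two preprocessing steps of Section 5.2 (symbolic-to-numeric transfer and min-max normalization) can be sketched as follows. The integer coding by order of first appearance and the handling of constant columns are assumptions of this sketch; the paper does not specify either.

```python
def encode_symbolic(column):
    """Map each distinct symbolic value to an integer code (order of first appearance)."""
    codes = {}
    return [codes.setdefault(value, len(codes)) for value in column]


def min_max_scale(column):
    """Scale numeric values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


protocol = encode_symbolic(["tcp", "udp", "tcp", "icmp"])
duration = min_max_scale([2, 4, 6])
```

In practice the minimum and maximum would usually be computed on the training data and reused for the test data, so that both sets are scaled consistently.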
categories of attacks, namely DoS, Probe, R2L and U2R. Consequently, each connection sample complies with one of the five labeled classes (DoS, Normal, R2L, Probe and U2R). The NSL-KDD has six different network protocols and services such as SMTP, HTTP, FTP, Telnet, ICMP, and SNMP. Finally, it covers 41 attributes (six binary, three nominal, and thirty-two numeric) in each record, which can be divided into three types, namely, basic features, traffic-based features, and content-based features.

Table 1
NSL-KDD dataset description.
Data set type            Details of dataset
                         Samples   Normal   DoS      Probe    U2R    R2L
Train        Sample      125,973   67,343   45,927   11,656   52     995
             %                     53.46    36.45    9.25     0.04   0.79
Test         Sample      22,543    9711     7458     2421     200    2754
             %                     43.08    33.08    10.74    0.89   12.22
Table 2
Comparison of feature ranking of top 20 features in all attacks in two datasets.
NSL-KDD dataset
CMIM f_5,f_8,f_11,f_15,f_19,f_24,f_31,f_34,f_40,f_13,f_36,f_14,f_2,f_3,f_9,f_17,f_20,f_33,f_32,f_18
JMI f_4,f_2,f_11,f_15,f_19,f_23,f_25,f_13,f_16,f_1,f_7,f_40,f_18,f_37,f_18,f_15,f_12,f_34,f_39,f_5
mRMR f_5,f_2,f_19,f_17,f_20,f_1,f_29,f_16,f_11,f_12,f_38,f_15,f_18,f_3,f_9,f_40,f_22,f_14,f_4,f_5
EFS f_5,f_2,f_11,f_15,f_19,f_1,f_25,f_13,f_12,f_16,f_14,f_17,f_3,f_7,f_1,f_1,f_12,f_39,f_5
CIC-IDS2017 dataset
CMIM f_6,f_12,f_14,f_18,f_22,f_25,f_31,f_35,f_40,f_42,f_46,f_59,f_62,f_65,f_66,f_69,f_72,f_75,f_76,f_78
JMI f_5,f_8,f_12,f_18,f_20,f_25,f_27,f_30,f_36,f_38,f_40,f_45,f_48,f_50,f_53,f_66,f_67,f_71,f_72,f_78
mRMR f_3,f_5,f_9,f_11,f_17,f_22,f_28,f_30,f_34,f_38,f_42,f_50,f_52,f_60,f_64,f_66,f_67,f_74,f_76,f_77
EFS f_3,f_5,f_9,f_18,f_17,f_25,f_28,f_30,f_34,f_38,f_40,f_45,f_48,f_50,f_53,f_66,f_67,f_71,f_76,f_78
Table 3
Comparison of feature ranking of nineteen features in ISCX 2012 data.
CMIM f_1,f_8,f_4,f_6,f_14,f_12,f_18,f_1,f_5,f_2,f_16,f_11,f_7,f_17,f_15,f_8,f_10,f_9,f_19
JMI f_2,f_8,f_9,f_12,f_4,f_13,f_3,f_6,f_5,f_2,f_8,f_10,f_9,f_19,f_16,f_11,f_7,f_17,f_15,
mRMR f_1,f_9,f_4,f_6,f_5,f_2,f_16,f_11,f_8,f_13,f_14,f_12,f_18,f_7,f_17,f_15,f_10,f_9,f_19
EFS f_1,f_8,f_4,f_6,f_5,f_2,f_3,f_10,f_7,f_17,f_9,f_19
Table 4
Percentage of average performance of three feature selection methods and the proposed filter method for all attacks with NSL-KDD and CIC-IDS2017.
False Alarm Rate (FAR), and F-measure are applied to evaluate the performance of the proposed algorithm. These performance metrics are defined in Eqs. (29) to (32).
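Eqs. (29) to (32) are not reproduced in this section, so the sketch below uses the standard confusion-matrix definitions of accuracy, detection rate (DR), false alarm rate (FAR) and F-measure. This is an assumption: the paper's exact formulas may differ, and the counts in the usage line are made-up illustration values, not results from the paper.

```python
def ids_metrics(tp, tn, fp, fn):
    """Accuracy, detection rate, false alarm rate and F-measure from a confusion matrix.

    tp: attacks flagged as attacks      fn: attacks missed
    tn: normal traffic passed           fp: normal traffic flagged as attack
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    detection_rate = tp / (tp + fn)      # recall on the attack class
    false_alarm_rate = fp / (fp + tn)    # share of normal traffic raising an alarm
    precision = tp / (tp + fp)
    f_measure = 2 * precision * detection_rate / (precision + detection_rate)
    return accuracy, detection_rate, false_alarm_rate, f_measure


acc, dr, far, f1 = ids_metrics(tp=90, tn=95, fp=5, fn=10)
```

The function returns fractions in [0, 1]; multiplying by 100 gives percentages in the style used by the result tables.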
Table 5
Percentage of average performance of three feature selection methods and the proposed filter method with the ISCX 2012 dataset.
Table 6
Comparative performance on ISCX 2012 dataset.
ISCX 2012 Single SVM 0.7254 1.0235 4.8546 93.54 91.86 1012.85
EFS-SVM 0.6524 0.0235 5.7424 95.81 94.63 784.52
GA-SVM 0.4547 0.0495 5.0210 98.85 97.63 658.91
GOA-SVM 0.0581 0.0364 1.8254 99.06 98.99 542.63
Proposed 0.0715 0.0009 1.5438 99.41 99.23 288.1547
Table 7
Comparative performance on NSL-KDD and CIC-IDS2017 datasets.
with SVM-R, and significantly overtakes all other classification methods. Similarly, Table 5 demonstrates that when ISCX 2012 is combined with EFS, an accuracy of 94.98% is achieved on the ISCX 2012 dataset with SVM-R, which significantly overtakes all other classification methods.

6.4.2. Comparison of proposed method with existing state of the art

In the available literature, several feature selection methods have been applied to identify attacks in IDS data sets. Despite this, there is no agreement on which FS method produces noteworthy subsets of attributes for attack classification. A specific FS strategy can be superior to others for some IDS datasets, while another FS strategy may work better for other data sets.

In Tables 6 and 7, all the SVM methods use RBF as the kernel function. In the normal SVM detection model, the parameters 𝜎, 𝜖, and C are randomly selected for the ISCX 2012, NSL-KDD and CIC-IDS 2017 datasets. In the proposed SVM detection model, the optimal parameters 𝜎, 𝜖, and C are obtained by the proposed algorithm through 30 simulation experiments. From Tables 6 and 7, it is observed that the proposed algorithm obtains 99.23% accuracy on the ISCX 2012 dataset, and 99.63% and 99.25% accuracy on NSL-KDD and CIC-IDS2017, respectively, in the testing phase. The performance evaluations of all employed approaches concerning their execution training time (EtrD) and execution testing time (EteD) on the three datasets are displayed in Fig. 7.

In the experimental work, we compare the performance of the proposed ECAGOA with three different optimization algorithms combined with the SVM classifier. Tables 8 to 10 compare the performance of the ECAGOA method in terms of average accuracy on the IDS datasets. We evaluate the proposed method against GOA-SVM, GA-SVM, EFS-SVM and SVM in terms of detection rate (DR), execution training time (EtrD), false alarm rate (FAR) and execution testing time (EteD) over each fold in Tables 8 to 10. From Tables 8 and 9, the outcomes illustrate that the proposed technique is better than the other nature-inspired approaches, achieving a 99.71% detection rate and 0.085 false alarm rate on NSL-KDD and a 99.52% detection rate and 0.007 false alarm rate on CIC-IDS 2017. To show the performance of ECAGOA with SVM-R compared to GA, GOA and EFS with SVM-R and a single SVM-R, experiments have been carried out with all types of attacks. Table 10 shows that the ECAGOA method achieves a 99.32% detection rate and 0.053 false alarm rate on the ISCX 2012 data.

As illustrated in Table 11, the proposed algorithm achieves significant results in comparison to the prevailing state of the art regarding accuracy, high DR and low FAR among all employed approaches on both types of data. Fig. 8 shows the overall performance of proposed tech-
Table 8
Comparison of the experimental performance for all attacks on NSL-KDD.
Methods Measures Fold1 Fold2 Fold3 Fold4 Fold5 Fold6 Fold7 Fold8 Fold9 Fold10
Proposed DR (%) 95.11 95.23 94.07 96.04 96.89 95.23 98.04 98.07 96.31 99.71
FAR (%) 1.012 1.223 0.876 1.124 1.046 0.865 1.112 0.983 0.843 0.085
EtrD(sec) 1.54 2.67 3.43 4.43 2.45 3.57 1.89 2.76 2.59 1.57
EteD(sec) 2.24 3.69 4.56 5.36 4.57 5.34 3.78 3.68 4.56 4.58
GOA-SVM DR (%) 94.50 94.63 93.85 94.63 95.24 94.99 98.63 95.70 96.15 95.66
FAR (%) 1.547 1.906 1.745 1.896 1.654 1.974 2.023 2.523 1.924 1.024
EtrD(sec) 2.54 4.49 4.64 3.65 4.13 4.42 4.14 5.37 6.34 4.34
EteD(sec) 4.67 5.66 6.53 4.39 6.35 6.54 5.34 8.27 7.14 6.54
GA-SVM DR (%) 89.53 87.53 88.61 88.74 89.23 88.52 90.63 89.54 88.1 88.63
FAR (%) 4.954 4.635 5.564 5.0123 5.745 5.635 4.523 4.623 5.546 4.630
EtrD(sec) 3.75 6.03 6.54 4.56 5.46 5.39 6.42 6.68 7.47 7.24
EteD(sec) 4.78 7.52 7.38 5.43 6.57 7.47 7.54 7.18 9.05 10.08
EFS-SVM DR (%) 85.63 86.52 86.12 85.16 86.54 88.64 84.34 86.87 87.91 88.52
FAR (%) 4.654 5.657 6.972 6.025 6.872 5.854 5.793 6.391 4.895 7.382
EtrD(sec) 5.86 8.58 7.49 7.41 8.73 8.38 7.69 7.36 9.53 8.47
EteD(sec) 6.37 8.45 9.35 6.29 8.51 9.65 10.48 8.93 10.52 11.52
SVM DR (%) 87.76 85.01 87.32 84.08 83.36 86.32 87.21 86.38 89.22 88.36
FAR (%) 6.572 11.232 13.746 9.054 14.213 11.365 16.382 10.396 9.067 11.234
EtrD(sec) 14.78 18.53 15.38 17.27 19.16 16.46 18.25 16.29 20.17 18.21
EteD(sec) 18.75 19.87 20.48 21.32 24.37 18.25 20.44 19.11 24.12 25.08
Table 9
Comparison of the experimental performance for all attacks on CIC-IDS 2017.
Methods Measures Fold1 Fold2 Fold3 Fold4 Fold5 Fold6 Fold7 Fold8 Fold9 Fold10
Proposed DR (%) 93.25 94.54 93.64 95.46 97.61 96.32 98.59 99.35 96.14 99.52
FAR (%) 1.432 1.524 1.205 0.982 1.325 0.876 0.793 0.696 0.815 0.007
EtrD(sec) 1.63 2.23 3.46 4.68 2.02 3.25 2.10 3.53 1.92 1.41
EteD(sec) 2.56 3.16 4.38 3.25 4.20 3.59 2.92 3.19 4.38 3.94
GOA-SVM DR (%) 91.42 93.39 92.85 94.37 96.24 97.32 95.67 94.27 95.36 96.51
FAR (%) 1.764 1.572 1.874 2.052 1.254 1.635 2.037 1.108 1.354 1.046
EtrD(sec) 3.52 2.65 4.19 2.97 3.15 4.52 3.40 4.97 5.39 4.53
EteD(sec) 4.46 4.32 5.20 3.73 4.29 6.21 5.37 4.83 6.23 4.69
GA-SVM DR (%) 90.24 89.25 87.63 88.92 89.64 90.25 92.52 93.49 91.38 93.52
FAR (%) 5.362 3.524 3.943 4.415 4.201 3.474 2.768 3.982 2.856 2.074
EtrD(sec) 3.27 5.43 6.08 4.28 3.14 5.35 6.27 5.62 4.25 5.21
EteD(sec) 4.57 6.23 7.22 4.15 5.78 6.20 6.53 6.36 8.49 9.02
EFS-SVM DR (%) 89.26 88.54 86.22 88.64 89.21 87.35 88.20 86.58 89.73 89.65
FAR (%) 5.251 7.185 5.382 7.378 4.472 6.158 7.216 5.252 7.203 6.503
EtrD(sec) 5.60 6.15 8.32 7.25 8.47 6.20 7.18 9.21 8.94 10.02
EteD(sec) 8.25 7.26 8.58 9.21 7.62 8.10 9.23 8.17 10.18 11.29
SVM DR (%) 86.34 87.25 85.21 84.57 86.24 88.76 87.52 85.93 84.52 86.21
FAR (%) 5.722 7.251 9.045 8.351 10.152 9.672 10.213 8.273 9.219 10.201
EtrD(sec) 11.54 13.25 15.20 14.52 16.34 15.26 17.65 16.28 18.34 19.22
EteD(sec) 16.25 17.54 19.53 20.04 22.58 19.34 21.28 20.15 22.35 24.12
Table 10
Comparison of the experimental performance on the ISCX 2012 dataset.
Methods Measures Fold1 Fold2 Fold3 Fold4 Fold5 Fold6 Fold7 Fold8 Fold9 Fold10
Proposed DR (%) 95.46 96.21 94.15 97.62 95.34 96.75 98.41 99.38 98.29 99.32
FAR (%) 1.123 1.241 0.986 1.314 0.942 0.864 1.015 0.931 0.894 0.053
EtrD(sec) 1.76 1.27 2.51 4.23 3.75 2.39 2.17 1.96 2.58 2.81
EteD(sec) 1.67 2.25 3.27 4.15 3.21 3.96 4.36 4.05 3.26 4.90
GOA-SVM DR (%) 94.23 93.47 92.59 95.14 93.57 95.17 97.76 96.67 95.27 96.34
FAR (%) 1.564 1.702 1.635 2.215 1.547 1.625 2.254 1.037 1.615 1.362
EtrD(sec) 3.92 4.17 4.21 3.82 4.42 5.60 4.32 5.28 6.10 5.21
EteD(sec) 5.45 6.22 7.54 5.29 7.65 6.27 5.27 8.74 7.09 6.92
GA-SVM DR (%) 90.24 89.53 91.22 93.46 92.02 93.21 91.76 94.28 92.44 93.34
FAR (%) 5.784 4.164 3.463 4.241 5.352 4.172 2.563 3.754 4.215 3.163
EtrD(sec) 4.89 5.16 6.25 4.16 5.24 6.24 5.87 6.38 7.28 7.69
EteD(sec) 6.87 7.20 6.29 8.93 7.26 8.68 7.54 7.39 9.27 9.76
EFS-SVM DR (%) 88.28 86.03 87.24 85.68 87.33 86.39 89.12 90.12 88.14 89.34
FAR (%) 4.267 5.254 4.325 6.215 5.237 4.892 5.732 6.493 5.294 7.683
EtrD(sec) 6.42 7.95 6.27 7.53 9.41 8.20 9.54 7.84 9.21 8.94
EteD(sec) 7.26 8.65 9.48 7.52 8.73 9.25 10.20 9.27 10.68 10.12
SVM DR (%) 88.16 85.76 86.12 85.26 84.43 85.21 86.52 87.26 88.20 86.24
FAR (%) 6.742 11.5711 10.234 9.245 11.213 10.394 9.425 10.682 9.892 10.951
EtrD(sec) 15.24 16.33 14.25 16.37 15.69 17.26 18.75 16.49 19.52 17.42
EteD(sec) 17.65 18.29 16.78 19.52 22.26 19.45 20.44 23.51 22.72 24.15
Fig. 7. Comparison of training time and testing time (in seconds) on NSL-KDD, ISCX 2012 and CIC-IDS2017 datasets.
Table 11
Comparison results based on all attack in all used datasets.
Table 12
Description of selected features.
nique is superior to the other methods for intrusion recognition across different datasets. The results demonstrate that the proposed method significantly improves the DR. From Table 12, we observe that the proposed method selects five, seven and fifteen optimal attributes from the NSL-KDD, ISCX 2012 and CIC-IDS 2017 datasets, respectively. These attributes can be used to identify attacks in the network and play a significant role in the IDS. Table 12 lists the selected features and gives a short description of each.

7. Conclusion

An intrusion detection system is a major line of defense for protecting computer resources from unauthorized activities. One approach in an intrusion detection model is to select the best set of features with the help of a classifier, which improves performance, learning speed, accuracy and reliability and removes noise from the feature set; such approaches nevertheless have a few drawbacks. To overcome these limitations, this paper introduces a novel hybrid IDS that combines a filter (EFS) and a wrapper (CAGOA) to characterize network traffic behavior accurately. The main contributions of this study can be summarized as follows. Firstly, a filter-based FS method, called ensemble of feature selection, is introduced to eliminate irrelevant features. After that, chaos-based (logistic map) solutions are used to initialize uniformly distributed initial agents and enhance the stability of GOA; the result is called CGOA. To overcome the shortcomings of slow convergence, high computational burden and low interpretability, we have introduced an adaptive behavior of CGOA, called ACGOA, to predict network traffic behavior accurately. Finally, the ECAGOA technique is applied to select suitable SVM parameters, which avoids the over-fitting problem of SVM.

Based on the experimental results obtained on three datasets, it can be concluded that the proposed approach achieves promising and significant performance in detecting intrusions over computer networks. In particular, the proposed approach achieves a detection rate of 99.71%, an accuracy of 99.63% and a false alarm rate of 0.085 on the NSL-KDD dataset. On the CIC-IDS 2017 dataset, it achieves an accuracy of 99.25%, a detection rate of 99.52% and a false alarm rate of 0.007; on the ISCX 2012 dataset, it achieves a detection rate of 99.32%, an accuracy of 99.23% and a false alarm rate of 0.053. Overall, ECAGOA performed best among the state-of-the-art models compared. In addition, the impact of an unbalanced sample distribution on an IDS needs careful consideration in our future studies.
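The chaos-based initialization summarized in the conclusion can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `logistic_map_population`, the control parameter `mu = 4.0`, the seed `x0 = 0.7` and the search bounds are assumptions chosen for illustration. The sketch iterates the logistic map x_{k+1} = mu * x_k * (1 - x_k) and rescales each iterate from [0, 1] to the search range to obtain the initial agents.

```python
def logistic_map_population(n_agents, n_dims, lower, upper, mu=4.0, x0=0.7):
    """Build an initial population from a logistic-map sequence.

    Hypothetical helper: mu and x0 are illustrative assumptions,
    not parameter values reported in the paper.
    """
    if not 0.0 < mu <= 4.0:
        raise ValueError("mu must lie in (0, 4] so iterates stay in [0, 1]")
    # For mu = 4 the seeds 0.25, 0.5 and 0.75 collapse onto fixed points.
    if not 0.0 < x0 < 1.0 or x0 in (0.25, 0.5, 0.75):
        raise ValueError("x0 must lie in (0, 1) and avoid 0.25, 0.5 and 0.75")

    population = []
    x = x0
    for _ in range(n_agents):
        agent = []
        for _ in range(n_dims):
            x = mu * x * (1.0 - x)                      # logistic map step
            agent.append(lower + x * (upper - lower))   # rescale [0, 1] to [lower, upper]
        population.append(agent)
    return population


if __name__ == "__main__":
    # Five agents in a three-dimensional search space bounded by [-1, 1].
    for agent in logistic_map_population(n_agents=5, n_dims=3, lower=-1.0, upper=1.0):
        print(["%.4f" % value for value in agent])
```

The sequence is deterministic for a given seed, so repeated runs start from the same population; a different `x0` yields a different starting population.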
Declaration of Competing Interest

The authors declare that there is no conflict of interest regarding the publication of this article.

CRediT authorship contribution statement

Shubhra Dwivedi: Conceptualization, Writing - original draft, Methodology, Software. Manu Vardhan: Supervision, Writing - review & editing. Sarsij Tripathi: Visualization, Investigation, Validation.
References [34] P. Suresh, R. Sukumar, S. Ayyasamy, Efficient pattern matching algorithm for secu-
rity and binary search tree (BST) based memory system in wireless intrusion detec-
[1] S. Hajiheidari, K. Wakil, M. Badri, N.J. Navimipour, Intrusion detection systems in tion system (WIDS), Comput. Commun. 151 (2020) 111–118.
the internet of things: a comprehensive investigation, Comput. Netw. (2019). [35] J.M. Vidal, A.L.S. Orozco, L.J.G. Villalba, Adaptive artificial immune networks for
[2] I.F. Akyildiz, T. Melodia, K.R. Chowdhury, A survey on wireless multimedia sensor mitigating dos flooding attacks, Swarm Evol. Comput. 38 (2018) 94–108.
networks, Comput. Netw. 51 (4) (2007) 921–960. [36] A. Karami, M. Guerrero-Zapata, A hybrid multiobjective RBF-PSO method for
[3] A.A. Aburomman, M.B.I. Reaz, A novel weighted support vector machines multiclass mitigating dos attacks in named data networking, Neurocomputing 151 (2015)
classifier based on differential evolution for intrusion detection systems, Inf. Sci. 414 1262–1282.
(2017) 225–246. [37] R. Vijayanand, D. Devaraj, B. Kannapiran, Intrusion detection system for wireless
[4] M. Tariq, H. Majeed, M.O. Beg, F.A. Khan, A. Derhab, Accurate detection of sitting mesh network using multiple support vector machine classifiers with genetic-algo-
posture activities in a secure IoT based assisted living environment, Future Gener. rithm-based feature selection, Comput. Secur. 77 (2018) 304–314.
Comput. Syst. 92 (2019) 745–757. [38] J.E. Tapia, C.A. Perez, Gender classification based on fusion of different spatial scale
[5] A.K. Shukla, P. Singh, Building an effective approach toward intrusion detection features selected by mutual information from histogram of LBP, intensity, and shape,
using ensemble feature selection, Int. J. Inf. Secur.Privacy (IJISP) 13 (3) (2019) IEEE Trans. Inf. Forensics Secur. 8 (3) (2013) 488–499.
31–47. [39] S. Dwivedi, M. Vardhan, S. Tripathi, A.K. Shukla, Implementation of adaptive
[6] K. Hwang, M. Cai, Y. Chen, M. Qin, Hybrid intrusion detection with weighted signa- scheme in evolutionary technique for anomaly-based intrusion detection, Evol. In-
ture generation over anomalous internet episodes, IEEE Trans. Dependable Secure tell. 13 (1) (2020) 103–117.
Comput. 4 (1) (2007) 41–55. [40] C. Liu, W. Wang, Q. Zhao, X. Shen, M. Konan, A new feature selection method based
[7] A. Zakeri, A. Hokmabadi, Efficient feature selection method using real-valued on a validity index of feature subset, Pattern Recognit. Lett. 92 (2017) 1–8.
grasshopper optimization algorithm, Expert Syst. Appl. 119 (2019) 61–72. [41] M. Bennasar, Y. Hicks, R. Setchi, Feature selection using joint mutual information
[8] G. Folino, P. Sabatino, Ensemble based collaborative and distributed intrusion de- maximisation, Expert Syst. Appl. 42 (22) (2015) 8520–8532.
tection systems: a survey, J. Netw. Comput. Appl. 66 (2016) 1–16. [42] A. Fathy, Recent meta-heuristic grasshopper optimization algorithm for optimal re-
[9] S. Dwivedi, M. Vardhan, S. Tripathi, Incorporating evolutionary computation for configuration of partially shaded PV array, Sol. Energy 171 (2018) 638–651.
securing wireless network against cyberthreats, J. Supercomput. (2020) 1–38. [43] J. Luo, H. Chen, Y. Xu, H. Huang, X. Zhao, et al., An improved grasshopper optimiza-
[10] A.K. Shukla, S.K. Pippal, S.S. Chauhan, An empirical evaluation of teaching–learn- tion algorithm with application to financial stress prediction, Appl. Math. Model 64
ing-based optimization, genetic algorithm and particle swarm optimization, Int. J. (2018) 654–668.
Comput. Appl. (2019) 1–15. [44] A.A. Ewees, M.A. Elaziz, E.H. Houssein, Improved grasshopper optimization algo-
[11] S. Saremi, S. Mirjalili, A. Lewis, Grasshopper optimisation algorithm: theory and rithm using opposition-based learning, Expert Syst. Appl. (2018).
application, Adv. Eng. Softw. 105 (2017) 30–47. [45] M.K. Ebrahimpour, M. Eftekhari, Ensemble of feature selection methods: a hesitant
[12] S. Arora, P. Anand, Chaotic grasshopper optimization algorithm for global optimiza- fuzzy sets approach, Appl. Soft Comput. 50 (2017) 300–312.
tion, Neural Comput. Appl. (2018) 1–21. [46] S.A. Rankawat, R. Dubey, Robust heart rate estimation from multimodal physiolog-
[13] A.K. Shukla, P. Singh, M. Vardhan, An adaptive inertia weight teaching-learn- ical signals using beat signal quality index based majority voting fusion method,
ing-based optimization algorithm and its applications, Appl. Math. Model. 77 (2020) Biomed. Signal Process. Control 33 (2017) 201–212.
309–326. [47] A.A. Aburomman, M.B.I. Reaz, A survey of intrusion detection systems based on
[14] O. Ertenlice, C.B. Kalayci, A survey of swarm intelligence for portfolio optimization: ensemble and hybrid classifiers, Comput. Secur. 65 (2017) 135–152.
algorithms and applications, Swarm Evol. Comput. 39 (2018) 36–52. [48] A.J. Ferreira, M.A.T. Figueiredo, Efficient feature selection filters for high-dimen-
[15] C.-F. Tsai, Y.-F. Hsu, C.-Y. Lin, W.-Y. Lin, Intrusion detection by machine learning: sional data, Pattern Recognit. Lett. 33 (13) (2012) 1794–1804.
a review, Expert Syst. Appl. 36 (10) (2009) 11994–12000. [49] F. Kuang, S. Zhang, Z. Jin, W. Xu, A novel SVM by combining kernel principal com-
[16] S. Mohammadi, H. Mirvaziri, M. Ghazizadeh-Ahsaee, H. Karimipour, Cyber intrusion ponent analysis and improved chaotic particle swarm optimization for intrusion de-
detection by combined feature selection algorithm, J. Inform. Secur. Appl. 44 (2019) tection, Soft Comput. 19 (5) (2015) 1187–1199.
80–88.
S. Dwivedi, M. Vardhan and S. Tripathi Computer Networks 176 (2020) 107251
[50] T. Blickle, L. Thiele, A comparison of selection schemes used in evolutionary algo- [62] S. Lakhina, S. Joseph, B. Verma, Feature reduction using principal component anal-
rithms, Evol. Comput. 4 (4) (1996) 361–394. ysis for effective anomaly–based intrusion detection on NSL-KDD(2010).
[51] S. Mirjalili, S.M. Mirjalili, A. Lewis, Grey wolf optimizer, Adv. Eng. Softw. 69 (2014) [63] G.V. Nadiammai, M. Hemalatha, Effective approach toward intrusion detection sys-
46–61. tem using data mining techniques, Egyptian Inform. J. 15 (1) (2014) 37–50.
[52] M. Mafarja, I. Aljarah, A.A. Heidari, H. Faris, P. Fournier-Viger, X. Li, S. Mirjalili, [64] P. Gogoi, M.H. Bhuyan, D.K. Bhattacharyya, J.K. Kalita, Packet and flow based net-
Binary dragonfly optimization for feature selection using time-varying transfer func- work intrusion dataset, in: International Conference on Contemporary Computing,
tions, Knowl. Based Syst. 161 (2018) 185–204. Springer, 2012, pp. 322–334.
[53] C.-P. Lee, Y. Leu, W.-N. Yang, Constructing gene regulatory networks from microar- [65] M.M. Abd-Eldayem, A proposed HTTP service based IDS, Egyptian Inform. J. 15 (1)
ray data using GA/PSO with DTW, Appl. Soft Comput. 12 (3) (2012) 1115–1124. (2014) 13–24.
[54] C. Cortes, V. Vapnik, Support-vector networks, Mach. Learn. 20 (3) (1995) 273–297. [66] M. Prasad, S. Tripathi, K. Dahal, An efficient feature selection based bayesian and
[55] B.O. Alijla, C.P. Lim, L.-P. Wong, A.T. Khader, M.A. Al-Betar, An ensemble of intel- rough set approach for intrusion detection, Appl. Soft Comput. 87 (2020) 105980.
ligent water drop algorithm for feature selection optimization problem, Appl. Soft [67] O. Depren, M. Topallar, E. Anarim, M.K. Ciliz, An intelligent intrusion detection
Comput. 65 (2018) 531–541. system (IDS) for anomaly and misuse detection in computer networks, Expert Syst.
[56] A.K. Shukla, P. Singh, M. Vardhan, A hybrid framework for optimal feature subset Appl. 29 (4) (2005) 713–722.
selection, J. Intell. Fuzzy Syst. 36 (3) (2019) 2247–2259. [68] W. Yassin, N.I. Udzir, A. Abdullah, M.T. Abdullah, Z. Muda, H. Zulzalil, Packet
[57] M. Tavallaee, E. Bagheri, W. Lu, A.A. Ghorbani, A detailed analysis of the KDD CUP header anomaly detection using statistical analysis, in: International Joint Confer-
99 data set, in: Computational Intelligence for Security and Defense Applications, ence SOCO14-CISIS14-ICEUTE14, Springer, 2014, pp. 473–482.
2009. CISDA 2009. IEEE Symposium on, IEEE, 2009, pp. 1–6. [69] Z. Tan, A. Jamdagni, X. He, P. Nanda, R.P. Liu, J. Hu, Detection of denial-of-service
[58] A. Shiravi, H. Shiravi, M. Tavallaee, A.A. Ghorbani, Toward developing a systematic attacks based on computer vision techniques, IEEE Trans. Comput. 64 (9) (2014)
approach to generate benchmark datasets for intrusion detection, Comput. Secur. 31 2519–2533.
(3) (2012) 357–374. [70] H. Huang, R.S. Khalid, H. Yu, Distributed machine learning on smart-gateway net-
[59] W. Elmasry, A. Akbulut, A.H. Zaim, Evolving deep learning architectures for network work towards real-time indoor data analytics, in: Data Science and Big Data: An
intrusion detection using a double PSO metaheuristic, Comput. Netw. 168 (2020) Environment of Computational Intelligence, Springer, 2017, pp. 231–263.
107042. [71] F. Salo, A.B. Nassif, A. Essex, Dimensionality reduction with IG-PCA and ensemble
[60] I. Sharafaldin, A.H. Lashkari, A.A. Ghorbani, Toward generating a new intrusion de- classifier for network intrusion detection, Comput. Netw. 148 (2019) 164–175.
tection dataset and intrusion traffic characterization, in: ICISSP, 2018, pp. 108–116.
[61] C.-C. Chang, C.-J. Lin, LIBSVM: a library for support vector machines, ACM Trans.
Intell. Syst.Technol. (TIST) 2 (3) (2011) 27.