
Paper 8

This paper presents a novel method called ECAGOA, which combines Ensemble Feature Selection (EFS) and Chaotic Adaptive Grasshopper Optimization Algorithm (CAGOA) to enhance intrusion detection systems (IDS) against sophisticated cyber attacks. The proposed approach effectively addresses issues of local optima and stagnation in existing algorithms, improving detection rates, accuracy, and reducing false alarm rates when evaluated on standard datasets like ISCX 2012, CIC-IDS2017, and NSL-KDD. The study highlights the importance of feature selection in optimizing IDS performance and proposes a hybrid algorithm that outperforms traditional methods in classifying network intrusions.


Computer Networks 176 (2020) 107251

Contents lists available at ScienceDirect

Computer Networks
journal homepage: www.elsevier.com/locate/comnet

An effect of chaos grasshopper optimization algorithm for protection of network infrastructure
Shubhra Dwivedi∗, Manu Vardhan, Sarsij Tripathi
Department of Computer Science & Engineering, NIT Raipur, Chhattisgarh, India

ARTICLE INFO

Keywords: ISCX 2012; CIC-IDS2017; Evolutionary algorithm; Grasshopper optimization algorithm; Intrusion detection

ABSTRACT

Due to the proliferation of sophisticated cyber extortion with exponentially critical effects, intrusion detection systems are systematically evolving their revealing, understanding, attribution and mitigation capabilities. Unfortunately, most modern Intrusion Detection System (IDS) techniques do not provide sufficient defense services in the wireless environment while maintaining operational continuity and the stability of the defense objective in the presence of intruders and modern attacks. To resolve this problem, we propose a new feature selection technique, called ECAGOA, that combines Ensemble of Feature Selection (EFS) with the Chaotic Adaptive Grasshopper Optimization Algorithm (CAGOA). The proposed method is able to prevent stagnation, which is credited to the following three aspects. Firstly, the EFS method is applied to select the highest-ranked subset of attributes. Then, we employ the chaos concept in the Grasshopper Optimization Algorithm (GOA), which generates a uniformly distributed population to enhance the quality of the initial population and can manage two different issues in the optimization process: the ability to search new space, termed exploration, and the ability to use existing space, termed exploitation. Lastly, in order to avoid local optima and premature convergence, an adaptive grasshopper optimization algorithm is developed using an organized parameter adaptation method. Furthermore, the adaptive behavior of GOA is applied to decide whether a record signifies an anomaly or not, differing from some approaches in the literature. A support vector machine (SVM) is used as the fitness function in the proposed method to choose the relevant features that help classify the attacks accurately. In addition, the method is also applied to optimize the penalty factor (C), kernel parameter (σ), and tube size (ε) of the SVM. The proposed algorithm is evaluated using three popular datasets: ISCX 2012, NSL-KDD and CIC-IDS2017. The evaluation results show that the proposed method outperformed several state-of-the-art feature selection techniques in terms of detection rate, accuracy, and false alarm rate.

1. Introduction

Today, technology is developing constantly, which consequently increases the likelihood of a computer system containing new security flaws and vulnerabilities [1]. Dependence on convenient security systems to defend computers and networks against intruders is an essential task in computer science because of the considerable development of network-based computer services. In this situation, it is necessary to develop powerful security mechanisms, known as Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS), to defend and analyze the events in a computer and/or network and to identify any deviation from normal behavior [2]. An IDS is a service that helps in the inspection of external as well as internal activities that are taking place in the network, as can be seen in Fig. 1. When IDS and IPS mechanisms work together, they can ensure the prevention and detection of cyber-attacks or intrusions. A cyber-attack is an action that aims to compromise the pillars of information security, such as integrity, confidentiality and the availability of a computational resource, regardless of whether it succeeds [3].

According to Tariq et al. [4], detection methods can be classified into two approaches: (i) misuse (signature), and (ii) anomaly. The purpose of misuse detection is to identify behavior on the basis of pre-defined attack patterns by comparing the perceived network activity with predefined patterns of known attacks, as done by signature-based tools such as Snort. In contrast, an anomaly-based IDS checks for occurrences of non-standard actions in the network or system: a profile is created for each normal action, and when a behavior does not correspond to one of these profiles, an attack is possibly being performed [5]. Anomaly IDSs depend on usual patterns and exploit them to identify actions that deviate significantly from those patterns. One leading restriction of existing IDS technologies is the requirement to filter false alarms lest the system be overwhelmed with data.


∗ Corresponding author.
E-mail address: [email protected] (S. Dwivedi).

https://doi.org/10.1016/j.comnet.2020.107251
Received 10 August 2019; Received in revised form 3 March 2020; Accepted 3 April 2020
Available online 1 May 2020
1389-1286/© 2020 Elsevier B.V. All rights reserved.
S. Dwivedi, M. Vardhan and S. Tripathi Computer Networks 176 (2020) 107251

Fig. 1. Intrusion detection system overview.

This is due to irrelevant and redundant features in the dataset, which decrease the speed of detection. This study resolves this limitation by utilizing a preprocessing step so that the performance of the IDS can be improved. Feature selection, as a preprocessing step, is used in this work to eliminate non-essential features from datasets and can help to classify external records into normal or abnormal activities. Studies on intrusion detection systems show that feature selection algorithms fall into two categories: filter and wrapper [6]. Filter methods identify the relations between the input features and the target class and then remove irrelevant features from the input, whereas wrapper methods use a learning model to select feature subsets. In general, wrapper methods deliver better performance than filter methods.

In recent years, a variety of metaheuristic techniques have been introduced for intrusion detection systems, which can provide better security and are able to find advanced attacks [7]. Many scientists have considered intrusion detection a hard problem in terms of classification and feature selection [8]. Almost all metaheuristic algorithms draw on three nature-inspired aspects, namely ecology, biology, and ethology, and they reproduce random variables as offspring. To overcome the computational cost of time-consuming trial-and-error parameter tuning, researchers have used chaotic concepts and adaptation strategies in metaheuristic algorithms for solving the IDS problem. By adopting chaos and adaptive strategies, the performance of metaheuristic algorithms can be improved in terms of diversity, premature convergence and quality of solutions [9]. In recent years, various Evolutionary Computation (EC) techniques such as Differential Evolution (DE), Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and Grasshopper Optimization Algorithm (GOA) have been established, which work as wrapper methods [10].

The grasshopper is an insect well known as a plant pest. Grasshoppers are usually seen individually; nonetheless, when they combine into one swarm, the swarm may contain millions of grasshoppers. Recently, the grasshopper optimization algorithm has been proposed, inspired by the behavior of grasshopper swarms in nature [11]. The fundamental limitation of GOA is that it cannot guarantee optimality. The solution quality also deteriorates as the number of control parameters increases. In addition, it gives poor-quality solutions for some problems and function types. To improve the performance of GOA in intrusion detection systems, in this study we introduce an adaptive variant of GOA with the chaos concept, called Chaotic Adaptive Grasshopper Optimization Algorithm (CAGOA), to find attacks accurately. In [12], the authors introduced chaos theory into the optimization process of GOA so as to accelerate the global convergence speed. The chaotic maps have

been employed to balance exploration and exploitation efficiently and to reduce the repulsion/attraction forces between grasshoppers in the optimization process.

Several machine learning techniques and frameworks, namely naive Bayes, multi-layer perceptron, support vector machine, and artificial neural network [13], have been proposed for IDSs to find the type of attacks in network traffic. They can also distinguish attacks by examining the factors of the network data. Network data contains irrelevant features, which decrease detection accuracy, whereas relevant input features can increase the accuracy of an intrusion detection system. Hence, selecting informative features as input to learning approaches is a vital issue in an IDS. Selecting a feature subset is always a challenge, and the FS problem is NP-hard [14].

Several integrations of filter and wrapper methods, called hybrid FS, as well as detection, analysis and investigation approaches (combinations of misuse and anomaly detection), have been proposed to defend against malware. However, malicious programs employ a variety of propagation and evasion techniques to bypass defensive mechanisms [15]. In [16], a new hybrid technique was introduced that combines a linear correlation coefficient filter with the cuttlefish algorithm for feature selection, with a decision tree classifier used as the fitness function. Moreover, several hybrid classification techniques such as GA-PSO, GA-SVM and SVM-PSO have been proposed for identifying the type of attacks [17].

The main aim of this study is to provide an accurate and effective anomaly intrusion detection system utilizing evolutionary and machine learning techniques that rely on specific attack signatures to distinguish between normal and malicious activities with high accuracy and fast learning speed. The adaptive grasshopper optimization algorithm with chaos concept (CAGOA) is applied to optimize the tuning parameters of the SVM. Initially, redundant features are removed from the original data by the EFS method. Then, a uniformly distributed population is generated by the chaos concept to enhance the quality of the initial population. Finally, the reduced data is passed to the wrapper method (CAGOA) to control the observed diversity from the original feature space. The main contributions of this paper are as follows:

• An efficient filter-based method called Ensemble Feature Selection (EFS) is introduced by combining JMI, mRMR, and CMIM.
• A chaotic map is used to generate uniformly distributed populations to enhance the quality of the initial population.
• In the original GOA, the local optima and stagnation problems still occur and cannot be ignored. To overcome them, we introduce new position-updating and natural selection mechanisms applied to the basic GOA; the resulting algorithm is called CAGOA.
• A new hybrid algorithm called ECAGOA is proposed by combining EFS and CAGOA, which can improve the detection rate.
• CAGOA can choose the optimal number of features that help to recognize the type of attacks. In addition, it is applied to tune the penalty factor, kernel parameter, and tube size of the support vector machine.
• The proposed approach is evaluated and compared with other reported methods on three standard IDS datasets: ISCX 2012, CIC-IDS2017 and NSL-KDD. The experimental results show that the proposed method is superior to other FS methods in terms of accuracy, detection rate, and false alarm rate.

However, present network traffic data, which are often large in size, pose a vital challenge to IDSs. Such big data slow down the whole detection process and may lead to unsatisfactory classification accuracy due to the computational difficulties in handling them. Therefore, in this study we use filter and wrapper algorithms, namely EFS and CAGOA, to distinguish malicious records in IDS datasets and to improve classification accuracy. As discussed earlier, several filter and wrapper methods have been employed to classify attacks; to our knowledge, however, there has been no effort in the literature to tackle the FS problem by a combination of EFS and CAGOA, called hybrid feature selection.

The rest of this paper is organized as follows. Section 2 outlines the works related to this study. Section 3 introduces the filter feature selection algorithms and the grasshopper method. Section 4 introduces the proposed approach and outlines how it chooses the salient features from the IDS datasets and discriminates between the different types of attacks. Section 5 illustrates the overall working of the proposed method. In Section 6, we evaluate the performance on the IDS datasets. Section 7 presents the conclusion.

2. Related work

Dorothy E. Denning introduced the first intrusion detection system, defined as the process of monitoring and analyzing events that occur in a computer or networked computer system to detect user behavior that conflicts with the intended use of the system [18]. The model contains records that represent the actions of subjects in terms of metrics and statistical models, and rules to acquire facts about these actions from audit records and to detect abnormal behavior. It is also independent of any particular system, application, system vulnerability or category of intrusion, taking the form of an expert system for general-purpose intrusion detection. After that, the same research group investigated a new expert intrusion detection system. In 2019, the author of [19] introduced a new ranker algorithm to rank the features for cost-effective classification of network traffic. The author validated the proposed method on the MIT-DARPA, CAIDA, ISCX-IDS and TU-DDoS datasets. On large datasets (50,000–1,000,000 instances), the feature ranking algorithm finds the best possible features and obtains high accuracy (92%–97%) in a parallel environment, which takes significantly less time (71%–85% lower) than a sequential environment.

Increasingly, applications such as feature selection and classification models have been applied to IDS datasets to find the type of attacks. According to the reported literature [20], FS algorithms combined with learning algorithms cannot handle, or do not scale to, extremely large volumes of data. To handle this type of problem, this research work focuses on filter and wrapper methods, namely mutual information and CAGOA, for finding attacks. The detection of attacks and intrusions is a broad strand of research that follows the trend of applied intrusion detection. To choose effective features and improve the performance of intrusion detection systems, the authors of [21] proposed a new hybrid classification method based on the artificial bee colony and artificial fish swarm algorithms. The simulation results on the NSL-KDD and UNSW-NB15 datasets demonstrated that the proposed method performs better on the performance metrics, achieving a 99% detection rate and a 0.01% false positive rate.

To solve cyber-attack problems, several metaheuristic techniques have recently been proposed that address the anomaly-based problem by using a new update scheme. Metaheuristic methods imitate natural processes or biological phenomena to search for the best solution efficiently. As reported in [22], the author proposed a novel FS algorithm using a support vector machine to reduce the feature domain of IDS datasets. To predict web traffic activities, Hiroshi et al. [23] introduced a framework using a genetic algorithm with a fuzzy computing model for network detection over a given time interval. In this approach, GA is employed to produce a digital signature of a network segment using flow analysis, where the evidence is extracted from network flow data. The experimental results indicated that the proposed method attained an accuracy of 96.53% and a false positive rate of 0.56% on the network traffic flow. In [24], the author proposed a novel distributed blind intrusion detection framework by modeling sensor measurements as the target graph-signal and utilizing the statistical properties of the graph-signal for intrusion detection. To fully take into account the underlying network structure, the graph similarity matrix is constructed using both the data measured by the sensors and the sensors' proximity, resulting in a data-adaptive and structure-aware monitoring solution.
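The studies reviewed above, like the evaluation later in this paper, report results as accuracy, detection rate and false positive (false alarm) rate. As a minimal sketch, assuming the standard confusion-matrix definitions, these metrics can be computed as follows. The function name `ids_metrics` and the example counts are hypothetical and are not taken from any cited study.

```python
def ids_metrics(tp, tn, fp, fn):
    """Return accuracy, detection rate and false alarm rate.

    Assumed (standard) definitions, with attacks as the positive class:
      accuracy         = (TP + TN) / (TP + TN + FP + FN)
      detection rate   = TP / (TP + FN)   # share of attacks that are flagged
      false alarm rate = FP / (FP + TN)   # share of normal records flagged
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "detection_rate": tp / (tp + fn),
        "false_alarm_rate": fp / (fp + tn),
    }


# Hypothetical counts for 100 attack records and 100 normal records.
metrics = ids_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A feature selection method is preferable when it raises the detection rate without raising the false alarm rate, which is why the two are usually reported together.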

A combination of the gravitational search algorithm (GSA) and the differential evolution (DE) algorithm was employed to optimize the parameters of HKELM, which improved its global and local optimization abilities when predicting attacks [25]. In addition, the kernel principal component analysis (KPCA) algorithm was introduced for dimensionality reduction and feature extraction of the intrusion detection data. Then, a novel intrusion detection approach, KPCA-DEGSA-HKELM, was obtained. The proposed approach was eventually applied to the classic benchmark KDD Cup 99 dataset, the real modern UNSW-NB15 dataset and the industrial intrusion detection dataset from the Tennessee Eastman process.

In contrast to the NSL-KDD and KDD Cup 99 datasets, some recent datasets that include modern attack trends, such as ADFA [26], AWID [27], UNSW-NB15 [28], CIC-IDS 2017 [29] and ISCX 2012 [30], are frequently used in research studies. Sharma and Chaurasia [31] introduced an IDS based on density maximization-based fuzzy c-means clustering (DM-FCC). The simulations were conducted on the ADFA dataset, and the proposed approach performed better in terms of accuracy, precision, detection rate, and false alarms. In [29], the author suggested a deep learning based system for hybrid intrusion detection and signature generation for unknown web attacks (D-Sign). D-Sign was proficient in detecting attacks and generating signatures for them, achieving high accuracy, specificity and sensitivity for web-based attacks. The experiments were performed on the CIC-IDS 2017 and NSL-KDD datasets to demonstrate the efficiency of the proposed model. Aminanto et al. [32] presented a novel deep-feature extraction and selection (D-FES) method, integrating stacked feature extraction and weighted feature selection. The experimental outcomes on a Wi-Fi network data set, the Aegean Wi-Fi Intrusion data set, demonstrated the effectiveness and efficacy of the proposed D-FES, which attained a detection accuracy of 99.918% and a false alarm rate of 0.012% in accurately identifying the impersonation attacks discussed in previous research studies.

Recently, considerable work has been done in the area of intrusion detection that attempts to design anomaly and/or misuse detection systems to detect malicious attacks with a high detection rate and a low false alarm rate. In [33], the authors proposed a novel two-phase model called the Real-time Alert Correlation method based on Code-books (RACC) for intrusion detection systems. First, in the offline phase, RACC pre-processes a knowledge base to produce some matrices, called code-books, as the main data structure of the method. Instead of keeping alerts in memory, those matrices just hold keys to the corresponding meta-alerts. An index based on red-black trees was used to access matrix elements. Generating the matrices and the index was independent of the alerts, so using them can facilitate the alert correlation process in an online manner in phase two of the proposed model.

To mitigate Denial of Service (DoS) attacks, the author of [34] proposed a pattern matching algorithm called the All-Ready State Traversal pattern matching algorithm. The proposed algorithm constructs the state traversal machine with a size of 1280 bytes and enables users to store large string patterns in the pattern database. The state traversal machine facilitates the easy retrieval of these patterns through the path vector. Further, the proposed work also follows a number of basic ASCII characters with a size of 128 bytes and designs the memory architecture using a binary search tree structure. Another IDS dataset, ISCX 2012, was generated through a real network configuration and collected packet activities in normal and abnormal form. It describes α and β profiles, where the α profile defines multistage scenarios of attacks and the β profile defines mathematical distributions of the entity, which contain preconditions and postconditions [30].

Vidal et al. [35] offered an artificial immune framework that gains proficiency by matching several immune reactions and building an immune memory framework. Experiments on publicly available datasets such as KDD Cup 99 and CAIDA have been conducted. A multi-objective anomaly detection approach, called multi-objective PSO, has been proposed in connection with neural networks [36]. The experimental simulations demonstrated that the hybrid algorithm can rapidly and commendably respond to and ease DoS attacks in adversarial conditions with respect to the functional performance criteria. Security is a prime challenge in wireless mesh networks. To handle this challenge, the use of a support vector machine for intrusion detection in wireless mesh networks was proposed in [37].

Security issues in the field of computer and/or network security have been studied extensively. To detect attacks in networks and computer systems, IDSs need to collect, store and analyze a wide scope of data; this paper therefore integrates an ensemble of filters with a wrapper. In the first phase, the filter method (EFS) attempts to exclude irrelevant features from the original data. By reducing the search space from the entire original feature space to the pre-selected features, this phase accelerates processing in the next phase, the wrapper method (CAGOA). The proposed approach is eventually applied to well-known benchmark datasets. The numerical results validate both the high accuracy and the time-saving benefit of the proposed approach. It outperformed several state-of-the-art feature selection algorithms in terms of false alarm rate and detection rate.

3. Background details of methods

3.1. Feature selection

Feature selection is becoming an essential part of building intrusion detection systems, as it eliminates irrelevant and redundant features and selects the optimal subset of features that produces a better characterization of the patterns belonging to different attacks. Let $X^{m \times n} = \{x_{i,j}\}$ be a matrix containing m features and n records originating from different groups denoted by a target attack, $X^{m \times n} = [X_1^{m \times n_1} \; X_2^{m \times n_2} \; \ldots \; X_p^{m \times n_p}]$, where each matrix $X_i^{m \times n_i}$ contains records from the same group and $n_1 + n_2 + \ldots + n_p = n$. Selecting the most informative features consists of identifying the feature subset $S^{k \times n} \in X^{m \times n}$, $k \ll m$, which is the most discriminative for the outlined attacks. An ensemble can be formed in several ways, but the proposed ensemble framework relies on a mutual information based filter approach that blends the Conditional Mutual Information Maximization (CMIM), minimum-Redundancy-Maximum-Relevancy (mRMR), and Joint Mutual Information (JMI) ranking methods, with CAGOA as the wrapper method that selects the relevant attributes for better classification of attack types.

3.1.1. Filter based methods

Recently, the selection of the best feature subsets has become one of the promising tasks in information theory, whereby attributes are selected that are highly correlated with the class and uncorrelated with the other features. Among filter approaches, mutual information based algorithms are adopted for the following reasons: (1) they perform more reliably in noisy problems, (2) they generalize to multi-class problems, (3) they generalize to numerical outcome problems, and (4) they are robust to incomplete (i.e. missing) data. In addition, they calculate a score for each feature, which can be used to rank and select the top-scoring features for feature selection; these scores may also be applied as feature weights to guide downstream modeling [38]. None of the existing approaches can defeat problems such as low performance, redundant features, and high computational burden [39].

In the CMIM technique, feature subset selection is based on maximizing the conditional mutual information with respect to the class, so that the selected features are highly correlated with the class and uncorrelated with each other. It makes a compromise between the predictive power of the candidate (its relevance to the class) and its independence from all previously selected features. The Mutual Information (MI) between the class y and the features X is expressed in Eq. (1).

$$ I(y; X) = H(y) - H(y \mid X) \tag{1} $$

where $H(y)$ and $H(y \mid X)$ denote the entropy and the conditional entropy of the class variable. Some researchers have addressed these problems

using Mutual Information based Feature Selection (MIFS) [40]. Hence, we utilize this model to lessen the redundancy between the data attributes and the class y, as expressed in Eq. (2). The primary objective of CMIM is to choose the final feature subset that conveys as much information as possible from the record S. The relevance of the input attributes determined by CMIM is:

$$ M_{CMIM}(X) = \min_{x_j \in S} I(y; x_k \mid x_j) \tag{2} $$

where $M_{CMIM}$ estimates the mutual information between a feature $x_k$ of the full feature set and a certain feature $x_j$ with respect to the class label y, and S denotes the subset of selected features. $I(y; x_k \mid x_j)$ measures the amount of classification information that $x_k$ provides when $x_j$ has already been chosen. This information may not be offered by the chosen feature subset S. Compared with $I(y; x_k)$, $I(y; x_k \mid x_j)$ does not contain the redundant information of pairwise features for classification.

Several feature selection algorithms have been reported in [41], which recommends Mutual Information (MI) for solving IDS classification problems. Reducing redundancy can improve the discriminating capability of feature subsets. The direct way is to maximize the classification information that candidate features newly deliver to the feature subset. This idea is applied directly in the joint mutual information between the subset and the classes. The relevance of the input attributes defined by JMI is shown in Eq. (3).

$$ M_{JMI}(X) = \sum_{x_j \in S} I(x_k; x_j; y) \propto \sum_{x_j \in S} I(y; x_k \mid x_j) \tag{3} $$

where $I(x_k; x_j; y)$ denotes the MI between the original attribute $x_k$ and a selected attribute $x_j$ from the selected feature subset S, with respect to the class y.

The dependency and relevancy within feature sets are hard to analyze. To address this issue, one useful method available in the literature is minimal-Redundancy-Maximal-Relevance (mRMR). The maximum dependency on the target class y, called Max Dependency, is described in Eq. (4).

$$ \max w(X, y) = I(y; x_1, x_2, \ldots, x_N) = H(y) - H(y \mid x_1, x_2, \ldots, x_N) \tag{4} $$

As shown in Eq. (4), the dependency among the features X is estimated, and it can take a large value. The relationship with the redundancy between features is expressed in Eqs. (5) and (6).

$$ \min Z(X, y) = \frac{1}{|s^{2}|} \sum_{x_j \in s} I(x_j; x_k) \tag{5} $$

$$ \max \varpi(w, Z) = w - Z \tag{6} $$

The integration of Eqs. (5) and (6) is known as minimal-redundancy-maximal-relevance (mRMR), which is described in Eq. (7).

$$ j_{mRMR}(\varpi) = I(y; X) - \frac{1}{|s^{2}|} \sum_{x_j \in s} I(x_j; x_k) \tag{7} $$

where $x_j$ belongs to the selected subset of features and $x_k$ to the original feature set.

3.2. Grasshopper optimization algorithm

The grasshopper optimization algorithm, first introduced by Saremi et al. [11], is a recent nature-inspired, population-based technique that mimics the behavior of grasshopper swarms in nature. In GOA, the position of each grasshopper in the swarm represents a candidate solution for a given optimization problem. Grasshoppers have a distinctive way of flying [42]. During the food search, the grasshoppers go through two vital phases of optimization: exploration and exploitation of the search space. In the exploration phase, the agents are encouraged to make sudden movements, whereas they tend to move locally in the exploitation phase [43]. Based on the mathematical model proposed for this optimization approach [44], the movement of grasshoppers is generally influenced by three factors: gravity force, social interaction, and wind advection. The position of the ith grasshopper $Y_i$ is given in Eq. (8).

$$ Y_i = So_i + Gr_i + Aw_i \tag{8} $$

where $Y_i$ represents the grasshopper position, $So_i$ represents the social interaction of the ith agent, $Gr_i$ represents the gravitational force on the ith agent, and $Aw_i$ represents the wind advection. The random behavior of the swarm can be expressed as in Eq. (9).

$$ Y_i = r_1 \, So_i + r_2 \, Gr_i + r_3 \, Aw_i \tag{9} $$

where $r_1$, $r_2$, and $r_3$ are random numbers between 0 and 1.

$$ So_i = \sum_{j=1,\, j \neq i}^{n} sf(D_{ij}) \, \hat{D}_{ij} \tag{10} $$

where $D_{ij}$ is the distance between the ith and the jth grasshopper, sf is a function that determines the strength of the social forces, as shown in Eq. (10), and $\hat{D}_{ij} = (Y_j - Y_i) / D_{ij}$ is a unit vector from the ith grasshopper to the jth grasshopper. The sf function is given in Eq. (11).

$$ sf(r) = I e^{-r / l_n} - e^{-r} \tag{11} $$

where I is the intensity of attraction and $l_n$ is the attractive length scale. Fig. 2 shows the sf function, which illustrates the model of interactions between agents and the comfort zone. The $Gr_i$ and $Aw_i$ components in Eq. (8) are given by Eqs. (12) and (13):

$$ Gr_i = -gr \, \hat{e}_{gr} \tag{12} $$

$$ Aw_i = d \, \hat{e}_w \tag{13} $$

where gr is the gravitational constant, $\hat{e}_{gr}$ is a unit vector towards the center of the earth, d is a constant drift, and $\hat{e}_w$ is a unit vector in the direction of the wind. Substituting $So_i$, $Gr_i$ and $Aw_i$ into Eq. (8) gives Eq. (14).

$$ Y_i = \sum_{j=1,\, j \neq i}^{n} sf\left(|Y_j - Y_i|\right) \frac{Y_j - Y_i}{D_{ij}} - gr \, \hat{e}_{gr} + d \, \hat{e}_w \tag{14} $$

where n is the number of grasshoppers. Eq. (14) is not used in the optimization algorithm, as it prevents the optimization technique from exploring and exploiting the search space around a solution. A variant of Eq. (14) is used to solve optimization problems, as given in Eq. (15).

$$ Y_i^d = cz \left( \sum_{j=1,\, j \neq i}^{n} cz \, \frac{ul_d - ll_d}{2} \, sf\left(|Y_j^d - Y_i^d|\right) \frac{Y_j - Y_i}{D_{ij}} \right) + \hat{T}_d \tag{15} $$

Here, $ul_d$ and $ll_d$ are the upper and lower limits in the dth dimension, $\hat{T}_d$ is the value of the target in the dth dimension, and the coefficient cz shrinks the comfort zone in proportion to the number of iterations and is calculated as in Eq. (16).

$$ cz = cz_{\max} - t \, \frac{cz_{\max} - cz_{\min}}{t_{\max}} \tag{16} $$

where $cz_{\max}$ denotes the maximum value, $cz_{\min}$ denotes the minimum value, t is the current iteration, and $t_{\max}$ is the maximum number of iterations. Following Saremi et al. [11], we use $cz_{\max} = 1$ and $cz_{\min} = 0.00001$ in this study.

4. Proposed methodology

4.1. Ensemble of feature selection

Recently, filter-based feature selection techniques have gained the attention of many researchers in the field of intrusion detection. In this section, we propose an ensemble of feature selection (EFS) technique which
Fig. 2. Swarm of grasshoppers in primitive corrective patterns between individuals. (Zones labelled in the figure: attraction zone, comfort zone, repulsion zone.)

combines the output of three filter-based feature selection techniques based on mutual information, namely mRMR, JMI and CMIM, and selects the top ranked features. The proposed scheme is presented in Algorithm 1. These filter-based techniques are employed to rank the features of IDS datasets, and the output of the EFS is then obtained by integrating the output of every filter technique using a simple frequency vote scheme to estimate the final chosen feature subset.

Algorithm 1 Proposed ensemble feature selection.
Input Data: ψ the number of ranking techniques
Value: θ the threshold of the number of features to be chosen
Outcome: P Best ranking feature set
Begin
  for l ← 1 to ψ do
    Acquire ranking A_ψ utilizing feature selection technique ψ
  end for
  A = combined ranked features A_ψ with a ranking combination technique
  A_t = choose θ top features from A
  Obtain optimal ranked features (P)
  Return P

4.1.1. Ranking combination

In the available literature, an ensemble of feature selection is a productive method to increase classification performance. In this study, we have studied a new ensemble of filter techniques that obtains the top ranked features by integrating the results of each filter technique with a frequency vote, producing a final subset of features. Several modified combination schemes can be found in the literature, but they have some difficulties [45]. Therefore, we have employed the frequency vote scheme to examine the rank produced by each filter selection technique.

4.1.2. Frequency vote

Frequency vote is a group decision-making scheme used to combine the output of various filter techniques, and its setup is helpful compared to other compound schemes [46]. Each model generates predictions (votes) for each ranked feature, and the final result is determined by whether each ranked feature acquired more than half of the votes. If none of the predictions for a particular feature obtains more than half of the votes, we may assume that the ensemble technique is incapable of making a consistent prediction. So, we prefer the ranked feature which has lowermost vote value over score value. Therefore, we can select the most voted prediction as the final prediction of the ensemble-based model, as depicted in Eq. (17).

$$\sum_{n=1}^{U} d_{n,j} = \operatorname{argmax}_{k \in \{1, 2, \cdots, L\}} \sum_{n=1}^{U} d_{n,k} \tag{17}$$

Here, U represents the number of feature selection methods, and L is the number of candidate attributes, so that k ranges over {1, 2, ⋯, L}. For attribute k, the sum $\sum_{n=1}^{U} d_{n,k}$ tabulates the number of votes for k. Plurality chooses the attribute k that maximizes the sum.

In the reported literature [47], different forms of ensemble algorithms were developed for intrusion detection, but traditional techniques have several drawbacks. For example, seemingly unrelated features that could perform well in terms of classification performance within a suitable subset of features are left out of the selection. To the best of our knowledge, this is the first time that an Ensemble of Feature Selection (EFS) technique using the relevance/redundancy concept (i.e. Dispersion measure) [48] as a threshold has been applied to select the top ranked features that distinguish attacks in IDS datasets with high classification performance.
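The frequency vote over several filter rankings can be illustrated with a small sketch. This is not the authors' implementation: the function name `frequency_vote`, the feature names, and the tie-breaking rule (best position in any ranking, then name) are assumptions made only for illustration. Each filter votes for its top θ features, and a feature is kept when it collects more than half of the votes.

```python
from collections import Counter


def frequency_vote(rankings, theta):
    """Combine ranked feature lists (one per filter) by counting votes.

    rankings: list of ranked feature-name lists, best feature first
              (e.g. one list each from mRMR, JMI and CMIM).
    theta:    number of top features each filter votes for.
    Returns (selected_features, vote_counts).
    """
    votes = Counter()
    for ranked in rankings:
        votes.update(ranked[:theta])  # each filter votes for its top-theta features

    def best_position(feature):
        # Assumed tie-breaker: best (lowest) index of the feature in any ranking.
        return min(r.index(feature) for r in rankings if feature in r)

    ordered = sorted(votes, key=lambda f: (-votes[f], best_position(f), f))
    majority = len(rankings) / 2
    selected = [f for f in ordered if votes[f] > majority]  # more than half of the votes
    return selected[:theta], dict(votes)


# Hypothetical rankings from three filters.
example_rankings = [
    ["f3", "f1", "f7", "f2"],
    ["f1", "f3", "f2", "f9"],
    ["f1", "f5", "f3", "f7"],
]
selected, counts = frequency_vote(example_rankings, theta=3)
```

In this toy case only `f1` and `f3` are named by more than half of the three filters, so they form the selected subset.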

Fig. 3. Bifurcation diagram of the Logistic map.

4.2. Chaos-based population initialization

The generation of the initial population in the search space plays an essential role in GOA. From the reported literature [12], we have observed that numerous chaos-based AGOA techniques have been studied to solve global optimization problems. In this work, we introduce chaotic-map initialization into the AGOA optimization process to accelerate its global convergence. Chaotic maps are used to balance exploration and exploitation efficiently and to reduce the repulsion/attraction forces between grasshoppers in the optimization process [49]. Using a chaotic sequence instead of a random sequence in AGOA is a fundamental strategy, since it can carry out a thorough search at higher speed than a stochastic search based primarily on probabilities. Few functions (chaotic maps) and few parameters (initial conditions) are necessary even for long sequences. In addition, a huge number of diverse sequences can be produced merely by varying the initial condition. Furthermore, these chaotic sequences are deterministic and reproducible. The logistic map is one of the chaos-based techniques that has received the most attention from researchers for global search. It is defined in Eq. (18).

$$x_{n+1} = \phi(x_n, \mu) = \mu * x_n (1 - x_n) \tag{18}$$

Where $x_n$ is the nth chaotic variable, $x_n \in (0, 1)$, under the condition that the initial $x_0 \in (0, 1)$ is not one of the periodic fixed points (0, 0.25, 0.5, 0.75, 1), and $\mu$, also known as the bifurcation coefficient, is set to 4.

In this procedure, a large number of multiple-period components appear in narrower and narrower intervals of $\mu$ as it increases. This period doubling continues without restriction, but it has a limit value at $\mu_t = 3.60$. When $\mu$ approaches $\mu_t$, the period becomes infinite, or the sequence even becomes non-periodic. At this point, the whole system evolves into a chaotic state. However, when $\mu$ is larger than 4, the whole system becomes unstable. Therefore, the interval $[\mu_t, 4]$ is commonly regarded as the chaotic region of the whole system. The bifurcation diagram is shown in Fig. 3.

Whenever a given number of chaotic iterations is executed, the chaotic variables are produced accordingly. Subsequently, by re-mapping these variables into the optimization space, the initial variables for the optimization problem are generated. The logistic-map flowchart is shown in Fig. 4. It produces a uniformly distributed sequence and effectively prevents the sequence from being trapped in small periodic cycles.

4.3. Adaptive grasshopper optimization algorithm

GOA is a population-based metaheuristic technique which is widely used to solve various engineering problems. However, it has certain limitations, such as fixed parameter values and a lack of adaptation in dynamic environments. In the initial phase of GOA, the positions of all the agents are randomly initialized. The agents develop an inclination towards the goal (i.e., $\hat{T}_d$ in the social interaction of swarms). Conversely, if the initial positions of the agents are concentrated near a local best and far from the best of the whole swarm, the agents may focus on the local best agent. This means that GOA is sensitive to the selection of starting positions, so update strategies are needed that help the search agents escape from the local-best trap.

GOA with adaptive parameters, called AGOA, is a substantial and promising variant of GOA. A self-adapting strategy for control parameters is normally employed when their proper values are not known in advance or when they need to change throughout the stochastic search process. The cz control parameter is self-adapted in AGOA. In the basic GOA, the adaptation of cz depends only on the generation, Eq. (16), and does not allow the value of cz to change dynamically through feedback from the search process. As a result, cz should be treated as an adaptive parameter associated with the estimated fitness value of each generation and the performance of the current agents of the population. That is, based on Eq. (19), the fitness of the population (Pop) chosen by the selection technique and of the best population is fed back to the parameter.

In this paper, we use the natural selection strategy ($\delta$) and a self-adaptive mechanism to help GOA jump out of the local optimum trap. To implement the adaptive scheme for cz, a dynamic feedback mechanism based on the adaptive scheme of the genetic technique is used. According to the natural selection methodology [50], $\delta$ ($\delta$ < Pop) grasshopper agents are selected and removed randomly from the population. The population is chosen by the tournament selection strategy according to the fitness function of each grasshopper. Then, the positions of the excluded grasshoppers are randomly initialized again.
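For reference, the generation-only schedule of Eq. (16), which the adaptive scheme is meant to replace, can be sketched in a few lines. The sketch assumes the usual GOA reading of Eq. (16), cz = cz_max − t (cz_max − cz_min) / t_max, with the values cz_max = 1 and cz_min = 0.00001 quoted from Saremi et al. [11]; the function name `comfort_zone` is hypothetical.

```python
def comfort_zone(t, t_max, cz_max=1.0, cz_min=0.00001):
    """Linear comfort-zone coefficient, assuming Eq. (16) reads
    cz = cz_max - t * (cz_max - cz_min) / t_max."""
    return cz_max - t * (cz_max - cz_min) / t_max


# cz depends only on the iteration counter t, not on any fitness feedback.
schedule = [comfort_zone(t, t_max=100) for t in (0, 25, 50, 75, 100)]
```

The schedule decreases from cz_max at t = 0 to cz_min at t = t_max regardless of how the search is progressing, which is the limitation the fitness-based feedback is intended to address.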
Fig. 4. Flowchart of logistic map for initialization. (Steps shown: start; randomly initialize the chaotic variable $x_n$; if the chaotic variable falls into fixed points or a small periodic cycle, apply a small positive random perturbation and map it by Eq. (18), otherwise update the variable directly by Eq. (18); iter = iter + 1; repeat until iter = max_iter; end.)
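The logistic-map initialization can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the perturbation size `eps`, the tolerance used to detect fixed points, and the linear re-mapping to `[lower, upper]` are all hypothetical choices, and detection of short periodic cycles is omitted for brevity.

```python
import random

FIXED_POINTS = (0.0, 0.25, 0.5, 0.75, 1.0)


def logistic_sequence(x0, length, mu=4.0, eps=1e-3, rng=None):
    """Iterate x_{n+1} = mu * x_n * (1 - x_n), Eq. (18).

    If x sits on one of the fixed points, a small positive random
    perturbation is applied before mapping (assumed magnitude: up to eps).
    """
    rng = rng or random.Random(0)
    x, out = x0, []
    for _ in range(length):
        if any(abs(x - p) < 1e-12 for p in FIXED_POINTS):
            x = x + rng.uniform(1e-6, eps)
            x = min(max(x, 1e-6), 1.0 - 1e-6)  # keep x strictly inside (0, 1)
        x = mu * x * (1.0 - x)
        out.append(x)
    return out


def chaotic_population(n_agents, dim, lower, upper, seed=0):
    """Re-map a chaotic sequence into the search space [lower, upper]^dim."""
    rng = random.Random(seed)
    x0 = rng.uniform(0.01, 0.99)
    seq = logistic_sequence(x0, n_agents * dim, rng=rng)
    return [
        [lower + seq[i * dim + d] * (upper - lower) for d in range(dim)]
        for i in range(n_agents)
    ]


population = chaotic_population(n_agents=30, dim=5, lower=-1.0, upper=1.0, seed=1)
```

Because the sequence is deterministic for a given seed and initial value, the same population is reproduced on every run, which is one of the properties of chaotic sequences noted in Section 4.2.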

With the natural selection scheme, the population can move in the direction of better quality solutions. Reinitializing the positions of the discarded agents spreads the positions of the grasshoppers and extends the search space of the population. Consequently, local solutions can be avoided to some extent. The value of cz largely determines the accuracy of the solution and the convergence speed that the grasshopper optimization algorithm can achieve. Rather than using fixed cz values, AGOA makes use of population information in each generation and modifies cz adaptively to retain population diversity and preserve convergence ability. In AGOA, the setting of cz relies on the fitness values of the solutions. In this study, we identify convergence by observing the interval between the maximum and minimum fitness values of the population. When cz is large, a wide search becomes too difficult and the optimal solution may be missed. When cz is small, the search can become trapped in a local minimum area. In this scenario, we have used an alternative approach that assesses population selection with simple SVM classifiers. In AGOA, the value of cz is computed according to Eq. (19).

$$cz = \begin{cases} \delta\left[1 + k_1 \cdot \left(\dfrac{f_g}{f_{\max} - f_{\min} + f_{avg}}\right)\right], & f_g \geq f_{avg} \\ k_2, & f_g < f_{avg} \end{cases} \tag{19}$$

Where $f_{\max}$ and $f_{\min}$ are the maximum and minimum fitness values over all agents when AGOA performs a search operation, $f_{avg}$ is the average fitness, and $f_g$ is the average fitness of the three parents in the selection operation. $\delta$, $k_1$ and $k_2$ are constants between 0 and 1. The overall procedure of adaptive grasshopper optimization is given in Algorithm 2.

Algorithm 2 Pseudo code of BCAGOA.
Begin
  Set the swarm size, cz_max, cz_min and maximum number of iterations t_max;
  Generate the population Y using chaos;
  Compute the fitness value of each agent;
  Choose the elite grasshopper using tournament strategy;
  T̂_α, T̂_β, T̂_γ = best search agents;
  while (t < t_max) do
    Update cz using Eq. (19);
    for i ← 1 to n do
      Normalize the distances between grasshoppers Y;
      Update the position of the current agent Y_i using Eq. (20);
    end for
    Update T̂_α, T̂_β, T̂_γ;
    t = t + 1;
  end while
  Return T̂_α

Motivated by Grey Wolf Optimization (GWO) [51], a democratic decision-making scheme is introduced into the AGOA strategy. The probable positions of all grasshopper agents are decided simultaneously using the leading grasshopper agents. The best three grasshoppers in AGOA, denoted $\alpha$, $\beta$, $\gamma$, are the leading agents. The mathematical formulation of the AGOA technique is given in Eq. (20).

$$Y_i^d = cz * \left\{ \sum_{j=1;\, i \neq j}^{n} cz\, \frac{ul_d - ll_d}{2}\, sf\left(|Y_j^d - Y_i^d|\right) \frac{|Y_j - Y_i|}{D_{ij}} \right\} + \frac{\hat{T}_\alpha + \hat{T}_\beta + \hat{T}_\gamma}{3} \tag{20}$$

Where $\hat{T}_\alpha$, $\hat{T}_\beta$, $\hat{T}_\gamma$ are the leading agents among all the grasshoppers.

4.3.1. Binary chaos adaptive grasshopper optimization algorithm

In this work, CAGOA is used to improve the quality of the solution and increase the convergence rate of the AGOA algorithm. A chaotic map (the Logistic map) is one of the advisable methods to enhance the performance of AGOA in terms of both avoidance of local optima and convergence rate. The Logistic map is used to produce uniformly distributed agents that improve the quality of the initial population in AGOA. In general, GOA

algorithm is a population-based evolutionary search technique and was primarily proposed as an optimization approach for continuous problems [7]. However, numerous optimization problems, such as feature selection, are defined in a binary space. To use GOA for binary optimization problems, where the solutions are restricted to the binary values (0, 1), we propose a binary variant of the AGOA [52]. Inspired by a prior study [53], we use a transfer function to transform the continuous values of the continuous space into the binary values 0 or 1 of the binary space. Typically, the transfer function returns the probability of changing a position from 0 to 1 or vice versa, taking as input the position of the ith agent in the jth dimension at the current iteration (t).

4.3.2. New encoding scheme

During the search process, the grasshoppers move towards the target, which is the best position found so far by the swarm. The swarm is initialized with a population of random solutions (binary vectors) and searches for the best solution while updating the position of each grasshopper agent according to Eq. (19) in continuous AGOA. Updating the position of each grasshopper agent in a binary search space is not as easy as in a continuous space. In continuous search spaces, the grasshopper agent can update its position by adding the value of the first component in Eq. (20) to the target vector. In a binary search space, however, the position of a grasshopper cannot be updated simply by adding values, because the position vector of a grasshopper agent can only contain 0 or 1. So, we take the first component in Eq. (20), the difference between the two binary vectors (position vector and target vector), as the step vector $Y_i^d$. The position update equation of the grasshopper, based on the step vector $Y_i^d$ in the original AGOA, then becomes Eq. (21).

$$Y_i^d = cz * \left\{ \sum_{j=1;\, i \neq j}^{n} cz\, \frac{ul_d - ll_d}{2}\, sf\left(|Y_j^d - Y_i^d|\right) \frac{|Y_j - Y_i|}{D_{ij}} \right\} \tag{21}$$

In the reported literature, the most common activation function is the sigmoid function, which is employed to normalize the step vector to the range between 0 and 1. The disadvantage of the sigmoid function is that it treats a significant value of $Y_i^d$ in the positive and in the negative direction differently, although a large value in either direction indicates that more movement is required relative to the previous position. To resolve this issue, we present a new encoding scheme for the components of the step vector. The proposed V-shaped transfer function $T(Y_i^d)$ is given in Eq. (22).

$$T(Y_i^d) = \frac{\exp\left(\left|\dfrac{Y_i^{d+1} + a}{1 - b}\right|\right) - 1}{\exp\left(\left|\dfrac{Y_i^{d+1} + a}{1 - b}\right|\right) + 1} \tag{22}$$

Where a and b are pre-defined constants that remain fixed during the whole search process. After computing the probabilities, the agents update their positions using the rule in Eq. (23).

$$Y_i^d = \begin{cases} 1, & \text{if } rand < T(Y_i^d) \\ 0, & \text{otherwise} \end{cases} \tag{23}$$

In general, a particular problem of existing algorithms is that they do not prevent individuals from being stuck in a local optimum. To resolve this kind of issue, this paper proposes a novel hybrid FS algorithm that finds better quality solutions in terms of classification accuracy and minimizes the number of irrelevant features. In addition, a new encoding scheme is investigated to evaluate the classification.

4.3.3. Optimizing SVM parameters with CAGOA

GOA is an evolutionary computation technique that has been employed to solve a variety of real-world engineering problems. In this study, CAGOA is employed as an optimization technique to select relevant features and optimize SVM parameters in order to improve the classification performance in the prediction of attacks.

The main aim of the proposed approach is to obtain the best solutions by exploiting the self-adapting and collective learning ability of grasshoppers. The chaotic maps used in the proposed approach are capable of considerably enhancing the solution quality of optimization techniques. In the basic GOA, there is no requirement to retain linear decrements. In practice, varying the SVM parameters with a chaotic variable can be more appropriate for the search, which can also help the proposed approach reach the optimum value quickly. The adjustment of the SVM parameters ($\sigma$, $\epsilon$, and C) using CAGOA is shown in Fig. 5.

4.3.4. Support vector machine

The support vector machine (SVM) [54] is a powerful machine learning tool that is extensively employed in applications such as intrusion detection, classification, and pattern recognition [22]. Compared to other techniques, SVM has certain dominant characteristics, for instance outstanding generalization performance, which enables it to create high-quality decision boundaries that depend on a small subset of training data points. Additionally, SVM has a high capability to model complex and non-linear relations. The basic idea behind the SVM algorithm is to obtain the optimal hyperplane which separates two classes while maximizing the distance between the hyperplane margin and the data points in the given data set. The hyperplane is given in Eq. (24).

$$W^T * x + c = 0 \tag{24}$$

Where the parameter c adjusts the displacement from the origin and $W^T * x$ gives the plane direction. The margin lies between two hyperplanes. The margin can be changed by rotating the normal vector and shifted by tuning the parameter c. The margin gap is equal to $2/|W|$, which means that the width is inversely proportional to the length of the normal vector. The optimization problem is to maximize the width of the margin, which corresponds to Eq. (25).

$$f(w) = \min\left[|W|^2 / 2\right] \tag{25}$$

The inequalities are $W^T x_m + c \geq 1$ for the positive class and $W^T x_n + c \leq -1$ for the negative class. Multiplying each constraint by its label gives $y_m (W^T x_m + c) \geq 1$ and $y_n (W^T x_n + c) \geq 1$. The points $x_m$, $x_n$ have the labels $y_m$, $y_n$, where $x_m = [-1, +1]^T$, $y_m = +1$, $x_n = [+1, -1]^T$ and $y_n = -1$. Finally, we need to optimize $\min\left[|W|^2/2\right]$ subject to $y_i (W^T x_i + c) - 1 \geq 0$. To solve this problem, we multiply the inequalities by Lagrange multipliers $\gamma_i$, as in Eq. (26).

$$L_{pd} = \frac{|W|^2}{2} - \sum_{i=1}^{N} \gamma_i \left\{ y_i (W^T x_i + c) - 1 \right\} \tag{26}$$

By applying the Karush Kuhn Tucker (KKT) conditions [55], we obtain the following partial derivatives and conditions:

$\frac{\partial L_{pd}}{\partial W} = 0$; $\frac{\partial L_{pd}}{\partial c} = 0$; $\gamma_i \geq 0$; $\gamma_i \left\{ y_i (W^T x_i + c) - 1 \right\} = 0$.

$\frac{\partial L_{pd}}{\partial W} = 0$ gives $W = \sum_{i=1}^{N} \gamma_i x_i y_i$, and $\frac{\partial L_{pd}}{\partial c} = 0$ gives $\sum_{i=1}^{N} \gamma_i y_i = 0$.

Applying KKT to the primal-dual problem, we obtain the dual function $L_d$ and its partial derivatives according to Eqs. (27) and (28).

$$L_d = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \gamma_i \gamma_j y_i y_j x_i^T x_j + \sum_{i=1}^{N} \gamma_i \tag{27}$$

Substituting the known values $y_m = +1$, $y_n = -1$, $x_m = [-1, +1]^T$, $x_n = [+1, -1]^T$ gives $L_d = -\gamma_m^2 - \gamma_n^2 - 2\gamma_m \gamma_n + \gamma_m + \gamma_n$.

$$\frac{\partial L_d}{\partial \gamma} = 0 \rightarrow \gamma_m + \gamma_n = 1/2; \qquad \sum_{i=1}^{N} \gamma_i y_i = 0 \rightarrow \gamma_m = \gamma_n \tag{28}$$

Since $\gamma_m + \gamma_n = 1/2$ and $\gamma_m = \gamma_n$, we have $\gamma_m = \gamma_n = 1/4$. The normal vector is then $W = \sum_{i=1}^{N} \gamma_i y_i x_i = [-1/2, +1/2]^T$.

To find the bias c, we use $\gamma_i \left\{ y_i (W^T x_i + c) - 1 \right\} = 0$, so that $c = y_i - W^T x_i$ for all i such that $\gamma_i \neq 0$, which gives $c = 0$ for all i.
Fig. 5. Optimizing the parameters of the SVM with CAGOA. (Elements shown: encoding the populations generated by the chaos concept as binary agents; implementing CAGOA optimization; tuning parameter selection (C, ϵ, σ); training and testing datasets; SVM classifier modelling; check whether the parameter criteria are met (Yes/No); result of IDS with tuned SVM parameters.)

Finally, we find the maximum margin between the positive and negative classes to solve the intrusion detection problem.

4.3.5. Kernel function and parameters of SVM

In recent decades, SVMs have received considerable attention from researchers. Several types of separating surfaces can be obtained through a kernel (k), for instance polynomial, linear, or Gaussian Radial Basis Function (RBF), to handle non-linear problems. SVM is used in binary classification and multi-class classification, where multi-class classification is performed by introducing an SVM for each pair of classes. In this work, we list three kernel functions taken from Shukla et al. [56]:

• Linear function: $k(x_i, x_j) = x_i^T * x_j + d$
• Polynomial function: $k(x_i, x_j) = \left(x_i^T * x_j + d\right)^p$
• RBF function: $k(x_i, x_j) = \exp\left\{ -\dfrac{|x_i - x_j|^2}{2\sigma^2} \right\}$

Where d is a constant, $x_i$ and $x_j$ are samples or instances, and p is the order of the function.

5. Overall structure of proposed approach

In this study, we have employed chaotic maps to enhance the initial population and an adaptive scheme for automatic tuning of the parameter. The population is then updated using the AGOA algorithm. Ensemble-based feature selection is used to select the top ranked features, and the selected feature set is then optimized using the CAGOA approach. In addition, during the tuning of the SVM parameters, the optimal parameters are dynamically adjusted by the proposed approach in the training phase via 10-fold cross validation to improve classification performance. The optimized parameters are then fed to the SVM classification module, as illustrated in Fig. 6. The proposed algorithm is called ECAGOA. The overall process of the hybrid algorithm for selecting the relevant feature subset that helps in the identification of attacks is as follows:

1. Firstly, we apply the filter-based methods as an ensemble to get the final ranking, called EFS.
2. The Logistic map is used to generate uniformly distributed agents to improve the quality of the initial population.
3. Agents are selected by the tournament-based approach in CAGOA. The number of agents produced is set to 30. Each agent consists of the features selected by the filters.
4. Apply the proposed encoding scheme to initialize the agents and the SVM parameters. The three parameters $\sigma$, $\epsilon$ and C are encoded in binary form and represented as part of the agent. After encoding, the length of each individual depends on the features present in the reduced dataset.
5. For each agent, compute the fitness function using the SVM classifier, with classification accuracy as the fitness function.
6. Select the agent with the highest score as the intelligent agent, i.e. the target, in the initial phase.
7. Then update the position of agent ($Y_i$) to obtain a new updated target.
8. Update the old agent according to the new value of the agent.
9. Find the parameters of the SVM classifier. If the current best fitness value meets the termination condition, proceed to the next step; otherwise, go to step 3.
10. Output the optimal feature subset together with the optimal SVM parameters $\sigma$, $\epsilon$ and C.

The proposed algorithm is performed in several phases: (1) data collection, (2) data preprocessing, (3) classifier training, and (4) attack recognition. The experimental results show that the proposed approach outperforms other existing intrusion detection techniques in the identification of intrusions, but it has certain limitations, which are discussed in Section 5.5.

Fig. 6. Overall structure of our approach for intrusion detection. (Stages shown: IDS datasets; data pre-processing; hybrid model for feature selection; compact dataset; 10-fold cross validation; training set and testing set; SVM-L, SVM-P, SVM-R; building learning algorithm using SVMs; SVM classifier; anomaly detection engine; normal behavior found? Yes: permitted; No: alert and counter measure.)

5.1. Data collection

Data collection is an important phase in intrusion detection research. During the data collection phase, data is collected from numerous different sources, for instance network packets, system logs and management information base data. After collection, the data is preprocessed, including data transferring and data normalization, so that it can be sent to the intrusion detection module for further evaluation.

5.2. Data preprocessing

In this phase, the training and testing data acquired in the data collection phase undergo pre-processing to produce basic features. The data pre-processing phase contains three main steps. The first step is data transferring, where each symbolic feature value in a dataset is transformed to a numerical value. The second step is data normalization, which is essential to keep the values of each feature on a uniform scale before any learning process begins. For this we use the min-max scheme, in which the values of all features are scaled to the range [0, 1] to remove the bias in favor of features with larger values. In the third step, feature selection is performed: the proposed approach is employed to select the most significant features, which are then used to train the classifier in the intrusion detection model.

5.3. Classifier training

In this phase, the classifier is trained. Once the best feature subset is chosen, it is evaluated by the classifier during the training phase, where an explicit classification technique is used. Since SVM can readily deal with binary and multi-class classification problems, in this work we have used SVM to classify the IDS datasets. For instance, the NSL-KDD dataset has five different classes, including a normal class and four attack classes.

5.4. Attack recognition

In this phase, the trained model is employed to distinguish attacks in the test data. After all iterations have finished and the final classifier, which contains the most significant features, has been trained, normal and attack traffic can be predicted using the trained classifier. The test data is passed through the trained model to discover intrusions. In this work, the NSL-KDD, ISCX 2012, and CIC-IDS2017 datasets are used to evaluate the proposed approach.
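The min-max step of Section 5.2 can be sketched in a few lines. This is an illustrative sketch, not the authors' preprocessing code: the function name `min_max_scale` is hypothetical, and mapping a constant column to 0.0 is an assumption made here to avoid division by zero. In practice the minimum and maximum would usually be computed on the training data and reused for the test data, a detail the text does not specify.

```python
def min_max_scale(rows):
    """Scale every feature (column) of a list of numeric rows to [0, 1].

    Each value v becomes (v - min) / (max - min) for its column.
    Constant columns are mapped to 0.0 (assumption, avoids division by zero).
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical records with features on very different scales.
records = [
    [2.0, 10.0, 5.0],
    [4.0, 30.0, 5.0],
    [6.0, 20.0, 5.0],
]
scaled = min_max_scale(records)
```

After scaling, a feature measured in the thousands and one measured in fractions contribute on the same [0, 1] scale, so neither dominates the kernel distance computed by the SVM.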

Table 1
NSL-KDD dataset description.

| Data set type | | Samples | Normal | DoS | Probe | U2R | R2L |
| Train | Sample | 125,973 | 67,343 | 45,927 | 11,656 | 52 | 995 |
| Train | % | | 53.46 | 36.45 | 9.25 | 0.04 | 0.79 |
| Test | Sample | 22,543 | 9711 | 7458 | 2421 | 200 | 2754 |
| Test | % | | 43.08 | 33.08 | 10.74 | 0.89 | 12.22 |

5.5. Limitations of our method

Although the proposed method is efficient, it has some limitations that should be addressed in future studies. The limitations of this paper are:

1. As shown in this manuscript, ECAGOA techniques are suited to developing effective, efficient, and adaptive intrusion detection techniques. However, the proposed approach is not suitable for dynamic network environments. With the introduction of new attack strategies and new types of networks, the development of effective intrusion detection techniques is a continuously growing field. Hence, to improve the performance of IDS, it is necessary to investigate the simultaneous effects of several attacks and to widen the scope of the IDS to incorporate other types of intrusions in which the changing intensity of traffic is not clearly revealed by intruders. A good IDS must adapt continuously to upcoming intrusions and to variations in the system.
2. Feature selection is a multi-objective task in the accurate detection of intrusions. It has two main objectives, which are to maximize classification accuracy and to minimize the number of features. These two objectives usually conflict with each other, and a decision that is optimal in the presence of a trade-off between them is more valuable in real-time scenarios. Handling feature selection as a multi-objective problem can yield a set of non-dominated feature subsets that meet different requirements of IDS in real-world applications. However, few studies treat feature selection as a multi-objective problem in the detection of attacks. Although ECAGOA has been shown to be successful in addressing many single-objective problems in the identification of attacks, it has never been applied as a multi-objective feature selection approach for the detection of intrusions.

6. Experimental results and discussion

6.1. Description of the benchmark datasets

At this time, only a small number of data sets are openly available for intrusion detection evaluation. Among the available IDS data sets, NSL-KDD (see Table 1), CIC-IDS2017 and ISCX 2012 have been used in this work to assess the performance and to identify the type of attacks. In addition, these data sets have different data values and different numbers of features that meet exhaustive tests to validate feature

categories of attacks, namely DoS, Probe, R2L and U2R. Consequently, each connection sample belongs to one of the five labeled classes (DoS, Normal, R2L, Probe and U2R). The NSL-KDD has six different network protocols and services, such as SMTP, HTTP, FTP, Telnet, ICMP, and SNMP. Finally, it covers 41 attributes (six binary, three nominal, and thirty-two numeric) in each record, which can be divided into three types, namely basic features, traffic-based features, and content-based features.

6.1.2. ISCX 2012

The Information Security Center of Excellence (ISCX) dataset [58] was developed by the Information Security Centre of Excellence at the University of New Brunswick. It contains seven days of captured traffic with 2,450,324 flows overall, including DoS attacks. The dataset is derived from real packets for seven days of network activity over protocols such as HTTP, SMTP, SSH, IMAP, POP3 and FTP, covering various scenarios of normal and malicious activities. The dataset includes a total of 2,450,324 flows and contains 19 features, such as app name, total destination packets, total source packets, total source bytes, total destination bytes, source payload as base64, source port, destination, destination payload as base64, destination payload as UTF, direction, source TCP flags description, destination TCP flags description, source, protocol name, destination port, start date time, stop date time, and tag, excluding the class.

6.1.3. CIC-IDS2017

The CIC-IDS2017 dataset, produced by the Canadian Institute of Cyber Security, is a modern anomaly-based IDS dataset proposed in 2017 that is publicly available on the Internet upon request to the owners [59]. Existing publicly available datasets lack traffic diversity and volume, cover a restricted range of attacks, anonymize packet payload information, and lack metadata and feature sets. Therefore, it is necessary to provide a reliable and up-to-date dataset containing realistic data to help researchers properly evaluate their models and overcome the shortcomings of other existing IDS datasets. CICFlowMeter [60] was used to analyze the pcap data acquired over five working days. The network connection records in this dataset are based on HTTP, HTTPS, FTP, SSH and email protocols. In addition, the attack flows consist of a total of 20 attacks, grouped into seven major categories, namely Brute Force, Heartbleed, Botnet, DoS, DDoS, Web attack, and Infiltration. Finally, CIC-IDS2017 contains 80 different features as well as a class label that assigns each traffic record to one of eight possible classes.

Unlike the NSL-KDD dataset, in which there is a specific number of samples for training and testing, CIC-IDS2017 is a very large dataset that has approximately 3 million network flows in different files [29]. Therefore, in this study, we selected 10% of CIC-IDS2017 for training and testing in order to reasonably reduce training and testing times. The 10% of CIC-IDS2017 is selected randomly using sampling without replacement, which ensures that once an object is selected, it is removed from the population. In order to ensure the diversity of the traffic records and avoid overfitting, we have implemented balanced training and test sets, i.e. of equivalent size (150 thousand samples in each). The samples in the training set are evenly distributed as follows: Normal (18750), Brute Force (18750), Heartbleed (18750), Botnet (18750), DoS (18750),
selection methods. The detailed description of datasets is discussed as DDoS (18750), Web attack (18750), and Infiltration (18750). Therefore,
follows. the same process is followed during the creation of the test set, but with
samples that do not exist in the training data and are distributed equally
6.1.1. NSL-KDD as in the training set. In addition, some types of attacks are included only
Tavallaee et al. [57] have proposed premeditated version of KDD in the test set rather than in the training set to examine the ability of
Cup 99 named NSL-KDD. In NSL-KDD dataset, we have addresses some the NIDS used to classify them correctly. According to this procedure,
problems such as number of dismissed samples in KDD Cup 99. With we believe that both training and test data are balanced and the used
respect of the KDD Cup 99 data, each sample in the NSL-KDD dataset models will not bias to any class during the training phase.
is collected of 41 dissimilar features namely the feature name, records,
and feature portrayal.The NSL-KDD dataset used in this work consists of 6.2. Performance measures
41 valuable features along with one label for classification. It comprises
of 24 kinds of attack. Additionally, to improve the detection rate, simi- To better evaluate the advantages and disadvantages of the proposed
lar attacks are combined into a single category which leads to four main approach with SVM classifier, accuracy, precision, Detection rate (DR),

Table 2
Comparison of feature ranking of top 20 features in all attacks in two datasets.

NSL-KDD dataset

Methods Feature Ranking

CMIM f_5,f_8,f_11,f_15,f_19,f_24,f_31,f_34,f_40,f_13,f_36,f_14,f_2,f_3,f_9,f_17,f_20,f_33,f_32,f_18
JMI f_4,f_2,f_11,f_15,f_19,f_23,f_25,f_13,f_16,f_1,f_7,f_40,f_18,f_37,f_18,f_15,f_12,f_34,f_39,f_5
mRMR f_5,f_2,f_19,f_17,f_20,f_1,f_29,f_16,f_11,f_12,f_38,f_15,f_18,f_3,f_9,f_40,f_22,f_14,f_4,f_5
EFS f_5,f_2,f_11,f_15,f_19,f_1,f_25,f_13,f_12,f_16,f_14,f_17,f_3,f_7,f_1,f_1,f_12,f_39,f_5
CIC-IDS2017 dataset
CMIM f_6,f_12,f_14,f_18,f_22,f_25,f_31,f_35,f_40,f_42,f_46,f_59,f_62,f_65,f_66,f_69,f_72,f_75,f_76,f_78
JMI f_5,f_8,f_12,f_18,f_20,f_25,f_27,f_30,f_36,f_38,f_40,f_45,f_48,f_50,f_53,f_66,f_67,f_71,f_72,f_78
mRMR f_3,f_5,f_9,f_11,f_17,f_22,f_28,f_30,f_34,f_38,f_42,f_50,f_52,f_60,f_64,f_66,f_67,f_74,f_76,f_77
EFS f_3,f_5,f_9,f_18,f_17,f_25,f_28,f_30,f_34,f_38,f_40,f_45,f_48,f_50,f_53,f_66,f_67,f_71,f_76,f_78

Table 3
Comparison of feature ranking of nineteen features in ISCX 2012 data.

ISCX 2012 dataset

Methods Feature Ranking

CMIM f_1,f_8,f_4,f_6,f_14,f_12,f_18,f_1,f_5,f_2,f_16,f_11,f_7,f_17,f_15,f_8,f_10,f_9,f_19
JMI f_2,f_8,f_9,f_12,f_4,f_13,f_3,f_6,f_5,f_2,f_8,f_10,f_9,f_19,f_16,f_11,f_7,f_17,f_15,
mRMR f_1,f_9,f_4,f_6,f_5,f_2,f_16,f_11,f_8,f_13,f_14,f_12,f_18,f_7,f_17,f_15,f_10,f_9,f_19
EFS f_1,f_8,f_4,f_6,f_5,f_2,f_3,f_10,f_7,f_17,f_9,f_19

False Alarm Rate (FAR), and F-measure are applied to evaluate the performance of the proposed algorithm. These performance metrics are defined in Eqs. (29) to (33).

\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{29} \]

\[ \mathrm{Detection\ Rate\ (DR)} = \frac{TP}{TP + FN} \tag{30} \]

\[ \mathrm{Precision\ (Pr)} = \frac{TP}{TP + FP} \tag{31} \]

\[ \mathrm{F\text{-}measure\ (Fmes)} = \frac{2 \cdot TP}{2 \cdot TP + FN + FP} \tag{32} \]

\[ \mathrm{FAR} = \frac{FP}{TN + FP} \tag{33} \]

Here, TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives, respectively, obtained from the confusion matrix on the independent datasets. Tables 2 and 3 show the features selected by the EFS approach in comparison to the other filter methods.

6.3. Experimental settings

The proposed technique was implemented in the MATLAB R2016a environment on the Windows 8 operating system with 16 GB RAM and an i7 CPU. To evaluate the proposed approach, we use three real-world attack datasets: ISCX 2012, NSL-KDD and CIC-IDS2017. SVM is taken as the classifier, and because the choice of its parameters can adversely affect the performance of our approach, we have employed the LibSVM tool [61]. For the proposed method, the number of iterations is set to 100, since we observed no further improvement in classifier performance after 100 iterations, and the population size is set to 30.

6.4. Experimental results and analysis

To verify the effectiveness of the proposed approach, we present the data sets used for assessment and the tests carried out in our investigations. The corresponding evaluation measures are used to assess the performance of the proposed model. In the available literature, many researchers have performed experiments on the ISCX 2012 [58], NSL-KDD [62] and CIC-IDS2017 [59] data sets using various experimental tools and techniques [63].

6.4.1. Comparison of filter feature selection methods using SVMs
In the experimental work, we employ 10-fold cross-validation using the SVM classifier on the IDS datasets. The results of this experiment are displayed in Tables 4 and 5 in terms of accuracy, DR, precision, and F-measure. From Table 4, we can observe that the accuracy of the classifier is not very remarkable on the NSL-KDD and CIC-IDS2017 data sets, particularly for SVM-L. Large values of these performance measures signify an outstanding classification performance. The results demonstrate that when the IDS is combined with EFS, it achieves an accuracy of 96.08% on the NSL-KDD dataset and 94.69% on the CIC-IDS2017 dataset

Table 4
Average performance (%) of the three filter feature selection methods and the proposed EFS filter method for all attacks on NSL-KDD and CIC-IDS2017.

Dataset      Classifier  Measure    CMIM   mRMR   JMI    EFS

NSL-KDD      SVM-R       Accuracy   92.02  92.76  89.80  96.08
                         DR         91.05  90.89  89.12  95.72
                         Precision  86.18  90.16  87.65  94.69
                         F-measure  90.14  88.11  88.95  95.19
             SVM-P       Accuracy   88.26  90.33  90.14  94.36
                         DR         86.99  87.22  88.87  92.85
                         Precision  86.48  86.76  87.79  90.98
                         F-measure  85.34  88.32  86.23  91.17
             SVM-L       Accuracy   82.32  82.89  82.23  89.92
                         DR         81.11  80.87  81.78  91.51
                         Precision  82.43  82.16  81.36  91.13
                         F-measure  83.24  80.96  83.24  90.31
CIC-IDS2017  SVM-R       Accuracy   92.54  94.26  93.32  94.69
                         DR         91.32  92.15  92.82  94.19
                         Precision  92.12  91.48  90.76  94.35
                         F-measure  91.24  90.73  90.42  93.16
             SVM-P       Accuracy   89.47  91.43  90.12  92.53
                         DR         88.43  89.18  90.26  92.34
                         Precision  89.12  87.93  89.54  92.47
                         F-measure  89.23  90.42  88.36  91.65
             SVM-L       Accuracy   86.57  85.29  88.32  90.26
                         DR         87.02  85.16  89.28  90.45
                         Precision  89.27  86.62  88.23  90.14
                         F-measure  87.34  86.75  87.68  89.36

Table 5
Average performance (%) of the three filter feature selection methods and the proposed EFS filter method on the ISCX 2012 dataset.

Dataset Classifiers Measure CMIM mRMR JMI EFS

ISCX 2012 SVM-R Accuracy 91.52 92.06 88.40 94.98
DR 90.05 91.69 88.72 93.12
Precision 86.18 90.16 87.65 91.69
F-measure 90.14 88.11 88.95 92.09
SVM-P Accuracy 88.26 90.33 90.14 93.32
DR 86.99 87.22 88.87 92.85
Precision 86.48 86.76 87.79 91.01
F-measure 85.34 88.32 86.23 90.55
SVM-L Accuracy 82.32 82.89 82.23 90.16
DR 82.11 83.87 84.78 90.31
Precision 82.43 82.16 81.36 88.67
F-measure 83.24 80.96 83.24 87.62
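
The measures reported in Tables 4 and 5 follow the confusion-matrix definitions of Section 6.2. As a minimal sketch of how they can be computed, the Python function below takes the four counts TP, TN, FP and FN and returns each measure as a fraction. The function name `ids_metrics` and the example counts are hypothetical and not taken from the paper, and FAR is assumed here to be the standard false positive rate FP / (FP + TN).

```python
def ids_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, DR, precision, F-measure and FAR from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "detection_rate": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "f_measure": 2 * tp / (2 * tp + fn + fp),
        # Assumption: FAR is the false positive rate, FP / (FP + TN).
        "false_alarm_rate": fp / (fp + tn),
    }


# Hypothetical counts, chosen only for illustration (not results from the paper).
metrics = ids_metrics(tp=950, tn=900, fp=100, fn=50)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

Returning fractions rather than percentages keeps the function independent of how a table chooses to display the values.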

Table 6
Comparative performance on ISCX 2012 dataset.

Datasets   Methods     σ       ε       C       Training accuracy (%)  Testing accuracy (%)  Runtime

ISCX 2012  Single SVM  0.7254  1.0235  4.8546  93.54                  91.86                 1012.85
           EFS-SVM     0.6524  0.0235  5.7424  95.81                  94.63                 784.52
           GA-SVM      0.4547  0.0495  5.0210  98.85                  97.63                 658.91
           GOA-SVM     0.0581  0.0364  1.8254  99.06                  98.99                 542.63
           Proposed    0.0715  0.0009  1.5438  99.41                  99.23                 288.1547
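
Tables 6 and 7 list the SVM parameters σ, ε and C for an RBF kernel, while Tables 4 and 5 compare SVM-R, SVM-P and SVM-L. The sketch below shows the three kernel functions in plain Python, under two assumptions that this section does not confirm: that R, P and L stand for RBF, polynomial and linear kernels, and that σ is the Gaussian width in K(x, y) = exp(-||x - y||² / (2σ²)). Under that convention, LibSVM's `gamma` parameter would correspond to 1 / (2σ²). The feature vectors, `degree` and `coef0` are illustrative values only.

```python
import math


def linear_kernel(x, y):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """Polynomial kernel (<x, y> + coef0) ** degree; degree and coef0 are illustrative."""
    return (linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, sigma):
    """Gaussian RBF kernel, assuming sigma is the kernel width."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


# Two hypothetical, already normalized feature vectors.
x, y = (0.2, 0.4), (0.3, 0.1)

# A larger sigma gives a smoother kernel; a very small sigma makes
# distinct points look almost unrelated.
k_wide = rbf_kernel(x, y, sigma=0.7023)
k_narrow = rbf_kernel(x, y, sigma=0.0117)
print(k_wide, k_narrow)
```

The two σ values are the "Single SVM" and "Proposed" entries for NSL-KDD in Table 7, used here only to show how strongly the kernel response depends on σ under the assumed parametrization.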

Table 7
Comparative performance on NSL-KDD and CIC-IDS2017 datasets.

Datasets     Methods     σ       ε        C       Training accuracy (%)  Testing accuracy (%)  Runtime

NSL-KDD      Single SVM  0.7023  0.0345   5.5647  93.12                  92.54                 1025.94
             EFS-SVM     0.6852  0.0264   5.8214  95.87                  94.82                 965.81
             GA-SVM      0.4854  0.0126   4.9654  98.37                  98.02                 842.63
             GOA-SVM     0.4043  0.0097   3.8543  98.88                  98.34                 524.63
             Proposed    0.0117  0.0010   1.1245  99.86                  99.63                 266.3780
CIC-IDS2017  Single SVM  0.4536  0.05632  3.5463  91.25                  89.52                 954.2421
             EFS-SVM     0.6047  0.0143   4.5324  94.35                  93.24                 664.5426
             GA-SVM      0.4365  0.0047   3.6842  95.38                  94.20                 537.2513
             GOA-SVM     0.0943  0.0034   1.8254  97.66                  96.25                 514.6532
             Proposed    0.0124  0.0004   1.0468  99.82                  99.25                 320.1485

with SVM-R, and significantly outperforms all other methods. Similarly, Table 5 demonstrates that when the IDS is combined with EFS, it achieves an accuracy of 94.98% on the ISCX 2012 dataset with SVM-R, and significantly outperforms all other methods.

6.4.2. Comparison of proposed method with existing state-of-the-art methods
In the available literature, several feature selection methods have been applied to identify attacks in IDS data sets. Despite this, there is no single FS method that produces noteworthy subsets of attributes for attack classification on every dataset. A specific FS strategy can be superior to others for some IDS datasets, while another FS strategy may work better for other data sets.
In Tables 6 and 7, all the SVM methods use RBF as the kernel function. In the normal SVM detection model, the parameters σ, ε, and C are randomly selected for the ISCX 2012, NSL-KDD and CIC-IDS2017 datasets. In the proposed SVM detection model, the optimal parameters σ, ε, and C are obtained by the proposed algorithm through 30 simulation experiments. From Tables 6 and 7, it is observed that the proposed algorithm obtains 99.23% accuracy on the ISCX 2012 dataset, and 99.63% and 99.25% accuracy on NSL-KDD and CIC-IDS2017, respectively, in the testing phase. The performance of all employed approaches in terms of execution training time (EtrD) and execution testing time (EteD) is displayed in Fig. 7 for the three datasets used.
In the experimental work, we compare the performance of the proposed method (ECAGOA) with three other optimization approaches using the SVM classifier. Tables 8 to 10 report the performance of the ECAGOA method on the IDS datasets. We evaluate the proposed method against GOA-SVM, GA-SVM, EFS-SVM and SVM in terms of detection rate (DR), false alarm rate (FAR), execution training time (EtrD) and execution testing time (EteD) over each fold. From Tables 8 and 9, the results illustrate that the proposed technique outperforms the other nature-inspired approaches, reaching a 99.71% detection rate and a 0.085 false alarm rate on NSL-KDD, and a 99.52% detection rate and a 0.007 false alarm rate on CIC-IDS2017. To show the performance of ECAGOA with SVM-R compared to GA, GOA and EFS with SVM-R and to a single SVM-R, experiments have been carried out with all types of attacks. Table 10 shows that the ECAGOA method reaches a 99.32% detection rate and a 0.053 false alarm rate on the ISCX 2012 data.
As illustrated in Table 11, the proposed algorithm achieves significant results in comparison to the prevailing state of the art regarding accuracy, high DR, and low FAR among all employed approaches on the datasets used. Fig. 8 shows that the overall performance of the proposed tech-

Table 8
Comparison of the experimental performance for all attacks on NSL-KDD.

Methods Measures Fold1 Fold2 Fold3 Fold4 Fold5 Fold6 Fold7 Fold8 Fold9 Fold10

Proposed DR (%) 95.11 95.23 94.07 96.04 96.89 95.23 98.04 98.07 96.31 99.71
FAR (%) 1.012 1.223 0.876 1.124 1.046 0.865 1.112 0.983 0.843 0.085
EtrD(sec) 1.54 2.67 3.43 4.43 2.45 3.57 1.89 2.76 2.59 1.57
EteD(sec) 2.24 3.69 4.56 5.36 4.57 5.34 3.78 3.68 4.56 4.58
GOA-SVM DR (%) 94.50 94.63 93.85 94.63 95.24 94.99 98.63 95.70 96.15 95.66
FAR (%) 1.547 1.906 1.745 1.896 1.654 1.974 2.023 2.523 1.924 1.024
EtrD(sec) 2.54 4.49 4.64 3.65 4.13 4.42 4.14 5.37 6.34 4.34
EteD(sec) 4.67 5.66 6.53 4.39 6.35 6.54 5.34 8.27 7.14 6.54
GA-SVM DR (%) 89.53 87.53 88.61 88.74 89.23 88.52 90.63 89.54 88.1 88.63
FAR (%) 4.954 4.635 5.564 5.0123 5.745 5.635 4.523 4.623 5.546 4.630
EtrD(sec) 3.75 6.03 6.54 4.56 5.46 5.39 6.42 6.68 7.47 7.24
EteD(sec) 4.78 7.52 7.38 5.43 6.57 7.47 7.54 7.18 9.05 10.08
EFS-SVM DR (%) 85.63 86.52 86.12 85.16 86.54 88.64 84.34 86.87 87.91 88.52
FAR (%) 4.654 5.657 6.972 6.025 6.872 5.854 5.793 6.391 4.895 7.382
EtrD(sec) 5.86 8.58 7.49 7.41 8.73 8.38 7.69 7.36 9.53 8.47
EteD(sec) 6.37 8.45 9.35 6.29 8.51 9.65 10.48 8.93 10.52 11.52
SVM DR (%) 87.76 85.01 87.32 84.08 83.36 86.32 87.21 86.38 89.22 88.36
FAR (%) 6.572 11.232 13.746 9.054 14.213 11.365 16.382 10.396 9.067 11.234
EtrD(sec) 14.78 18.53 15.38 17.27 19.16 16.46 18.25 16.29 20.17 18.21
EteD(sec) 18.75 19.87 20.48 21.32 24.37 18.25 20.44 19.11 24.12 25.08
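
Table 8 reports one value per fold, so a single summary figure depends on how the folds are aggregated. As a minimal sketch, the Python snippet below summarizes the DR and FAR rows of the proposed method, copied from Table 8, by their mean, standard deviation, minimum and maximum. The helper name `summarize` is hypothetical, and the paper does not state that it aggregates folds this way.

```python
from statistics import mean, stdev

# DR (%) and FAR (%) of the proposed method over the ten folds (Table 8, NSL-KDD).
dr = [95.11, 95.23, 94.07, 96.04, 96.89, 95.23, 98.04, 98.07, 96.31, 99.71]
far = [1.012, 1.223, 0.876, 1.124, 1.046, 0.865, 1.112, 0.983, 0.843, 0.085]


def summarize(values):
    """Return basic per-fold statistics for one measure."""
    return {
        "mean": mean(values),
        "std": stdev(values),  # sample standard deviation across folds
        "min": min(values),
        "max": max(values),
    }


dr_summary = summarize(dr)
far_summary = summarize(far)
print("DR :", dr_summary)
print("FAR:", far_summary)
```

With these inputs the fold mean of DR is about 96.47% and the fold mean of FAR is about 0.917%, whereas the 99.71% DR and 0.085 FAR quoted for NSL-KDD coincide with the Fold10 entries of Table 8.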

Table 9
Comparison of the experimental performance for all attacks on CIC-IDS2017.

Methods Measures Fold1 Fold2 Fold3 Fold4 Fold5 Fold6 Fold7 Fold8 Fold9 Fold10

Proposed DR (%) 93.25 94.54 93.64 95.46 97.61 96.32 98.59 99.35 96.14 99.52
FAR (%) 1.432 1.524 1.205 0.982 1.325 0.876 0.793 0.696 0.815 0.007
EtrD(sec) 1.63 2.23 3.46 4.68 2.02 3.25 2.10 3.53 1.92 1.41
EteD(sec) 2.56 3.16 4.38 3.25 4.20 3.59 2.92 3.19 4.38 3.94
GOA-SVM DR (%) 91.42 93.39 92.85 94.37 96.24 97.32 95.67 94.27 95.36 96.51
FAR (%) 1.764 1.572 1.874 2.052 1.254 1.635 2.037 1.108 1.354 1.046
EtrD(sec) 3.52 2.65 4.19 2.97 3.15 4.52 3.40 4.97 5.39 4.53
EteD(sec) 4.46 4.32 5.20 3.73 4.29 6.21 5.37 4.83 6.23 4.69
GA-SVM DR (%) 90.24 89.25 87.63 88.92 89.64 90.25 92.52 93.49 91.38 93.52
FAR (%) 5.362 3.524 3.943 4.415 4.201 3.474 2.768 3.982 2.856 2.074
EtrD(sec) 3.27 5.43 6.08 4.28 3.14 5.35 6.27 5.62 4.25 5.21
EteD(sec) 4.57 6.23 7.22 4.15 5.78 6.20 6.53 6.36 8.49 9.02
EFS-SVM DR (%) 89.26 88.54 86.22 88.64 89.21 87.35 88.20 86.58 89.73 89.65
FAR (%) 5.251 7.185 5.382 7.378 4.472 6.158 7.216 5.252 7.203 6.503
EtrD(sec) 5.60 6.15 8.32 7.25 8.47 6.20 7.18 9.21 8.94 10.02
EteD(sec) 8.25 7.26 8.58 9.21 7.62 8.10 9.23 8.17 10.18 11.29
SVM DR (%) 86.34 87.25 85.21 84.57 86.24 88.76 87.52 85.93 84.52 86.21
FAR (%) 5.722 7.251 9.045 8.351 10.152 9.672 10.213 8.273 9.219 10.201
EtrD(sec) 11.54 13.25 15.20 14.52 16.34 15.26 17.65 16.28 18.34 19.22
EteD(sec) 16.25 17.54 19.53 20.04 22.58 19.34 21.28 20.15 22.35 24.12

Table 10
Comparison of the experimental performance on the ISCX 2012 dataset.

Methods Measures Fold1 Fold2 Fold3 Fold4 Fold5 Fold6 Fold7 Fold8 Fold9 Fold10

Proposed DR (%) 95.46 96.21 94.15 97.62 95.34 96.75 98.41 99.38 98.29 99.32
FAR (%) 1.123 1.241 0.986 1.314 0.942 0.864 1.015 0.931 0.894 0.053
EtrD(sec) 1.76 1.27 2.51 4.23 3.75 2.39 2.17 1.96 2.58 2.81
EteD(sec) 1.67 2.25 3.27 4.15 3.21 3.96 4.36 4.05 3.26 4.90
GOA-SVM DR (%) 94.23 93.47 92.59 95.14 93.57 95.17 97.76 96.67 95.27 96.34
FAR (%) 1.564 1.702 1.635 2.215 1.547 1.625 2.254 1.037 1.615 1.362
EtrD(sec) 3.92 4.17 4.21 3.82 4.42 5.60 4.32 5.28 6.10 5.21
EteD(sec) 5.45 6.22 7.54 5.29 7.65 6.27 5.27 8.74 7.09 6.92
GA-SVM DR (%) 90.24 89.53 91.22 93.46 92.02 93.21 91.76 94.28 92.44 93.34
FAR (%) 5.784 4.164 3.463 4.241 5.352 4.172 2.563 3.754 4.215 3.163
EtrD(sec) 4.89 5.16 6.25 4.16 5.24 6.24 5.87 6.38 7.28 7.69
EteD(sec) 6.87 7.20 6.29 8.93 7.26 8.68 7.54 7.39 9.27 9.76
EFS-SVM DR (%) 88.28 86.03 87.24 85.68 87.33 86.39 89.12 90.12 88.14 89.34
FAR (%) 4.267 5.254 4.325 6.215 5.237 4.892 5.732 6.493 5.294 7.683
EtrD(sec) 6.42 7.95 6.27 7.53 9.41 8.20 9.54 7.84 9.21 8.94
EteD(sec) 7.26 8.65 9.48 7.52 8.73 9.25 10.20 9.27 10.68 10.12
SVM DR (%) 88.16 85.76 86.12 85.26 84.43 85.21 86.52 87.26 88.20 86.24
FAR (%) 6.742 11.5711 10.234 9.245 11.213 10.394 9.425 10.682 9.892 10.951
EtrD(sec) 15.24 16.33 14.25 16.37 15.69 17.26 18.75 16.49 19.52 17.42
EteD(sec) 17.65 18.29 16.78 19.52 22.26 19.45 20.44 23.51 22.72 24.15

Fig. 7. Comparison of training time and testing time (in seconds) on NSL-KDD, ISCX 2012 and CIC-IDS2017 datasets.

Table 11
Comparison of results for all attacks on all datasets used.

Dataset Method DR FPR Accuracy

NSL-KDD TUIDS [64] 98.88 1.120 96.55


HTTP based IDS [65] 99.03 1.000 99.38

ABC-AdaBoost [30] 99.61 98.9
Proposed method 99.71 0.085 99.63
CIC-IDS2017 BRS [66] 96.37 0.014 97.95
DBN [59] 95.81 1.050 98.95

LSTM [67] 99.45 99.10
Proposed method 99.52 0.007 99.25
∗ ∗
ISCX 2012 HbPHAD [68] 99.04
EMD [69] 90.04 7.920 90.12

SLFN [70] 88.16 5.560
IG-PCA-Ensemble [71] 99.10 0.010 99.01
Proposed method 99.32 0.053 99.23

Fig. 8. Comparison of detection rate on NSL-KDD, ISCX 2012 and CIC-IDS2017 datasets: panels (a), (b) NSL-KDD; (c), (d) ISCX 2012; (e), (f) CIC-IDS2017.

Table 12
Description of selected features.

Dataset      Feature description

NSL-KDD      Service, num_compromised, is_hot_login, srv_rerror_rate, dst_host_serror_rate
ISCX 2012    app name, total source bytes, total source packets, source payload as base64, protocol name, source port, destination port
CIC-IDS2017  Protocol, Forward packet length max, Flow bytes/s, Flow IAT max, Backward IAT std, Forward PSH flags, Forward header length, Packet length std., SYN flag count, CWE flag count, Down/Up ratio, Init_Win_bytes_fwd, act_data_packet_forward, Active std., Idle Max

nique is superior to the other methods for intrusion recognition on the different datasets. The results demonstrate that the proposed method shows significant improvements in DR. From Table 12, we observe that five, seven and fifteen optimal attributes have been selected by the proposed method in the NSL-KDD, ISCX 2012 and CIC-IDS2017 datasets, respectively, which can be used to identify attacks in networks for intrusion detection. These attributes play a significant role in an IDS. Table 12 shows the optimal selected features and also gives a short description of them.

7. Conclusion

An intrusion detection system is a major line of defense to protect computer resources from unauthorized activities. A common approach in intrusion detection models is to select the best set of features by using a classifier, improving performance, learning speed, accuracy, and reliability while removing noise from the feature set, but such approaches have a few drawbacks. To overcome the existing limitations, this paper introduces a novel hybrid IDS that combines a filter (EFS) and a wrapper (CAGOA) for accurate characterization of network traffic behaviors. The main contributions of this study can be summarized as follows. Firstly, a filter-based FS method, called ensemble feature selection, is introduced to eliminate irrelevant features. After that, chaos-based (logistic map) solutions are used to initialize uniformly distributed initial agents to enhance the stability of GOA; this variant is called CGOA. To overcome slow convergence, high computational burden and low interpretability, we introduce adaptive behavior into CGOA, called ACGOA, to predict network traffic behavior accurately. Finally, the ECAGOA technique is applied to select suitable SVM parameters, which avoids the over-fitting concern of SVM. Based on the experimental results obtained on three datasets, it can be concluded that the proposed approach achieves promising and significant performance in detecting intrusions over computer networks. In particular, the proposed approach achieves a detection rate of 99.71%, an accuracy of 99.63%, and a false alarm rate of 0.085 on the NSL-KDD dataset. On the CIC-IDS2017 dataset, it achieves an accuracy of 99.25%, a detection rate of 99.52%, and a false alarm rate of 0.007; and on the ISCX 2012 dataset, a detection rate of 99.32%, an accuracy of 99.23% and a false alarm rate of 0.053. Overall, ECAGOA performs best when compared with the other state-of-the-art models. In addition, the impact of unbalanced sample distribution on an IDS needs careful consideration in our future studies.

Declaration of Competing Interest

The authors declare that there is no conflict of interests regarding the publication of this article.

CRediT authorship contribution statement

Shubhra Dwivedi: Conceptualization, Writing - original draft, Methodology, Software. Manu Vardhan: Supervision, Writing - review & editing. Sarsij Tripathi: Visualization, Investigation, Validation.

References

[1] S. Hajiheidari, K. Wakil, M. Badri, N.J. Navimipour, Intrusion detection systems in the internet of things: a comprehensive investigation, Comput. Netw. (2019).
[2] I.F. Akyildiz, T. Melodia, K.R. Chowdhury, A survey on wireless multimedia sensor networks, Comput. Netw. 51 (4) (2007) 921–960.
[3] A.A. Aburomman, M.B.I. Reaz, A novel weighted support vector machines multiclass classifier based on differential evolution for intrusion detection systems, Inf. Sci. 414 (2017) 225–246.
[4] M. Tariq, H. Majeed, M.O. Beg, F.A. Khan, A. Derhab, Accurate detection of sitting posture activities in a secure IoT based assisted living environment, Future Gener. Comput. Syst. 92 (2019) 745–757.
[5] A.K. Shukla, P. Singh, Building an effective approach toward intrusion detection using ensemble feature selection, Int. J. Inf. Secur. Privacy (IJISP) 13 (3) (2019) 31–47.
[6] K. Hwang, M. Cai, Y. Chen, M. Qin, Hybrid intrusion detection with weighted signature generation over anomalous internet episodes, IEEE Trans. Dependable Secure Comput. 4 (1) (2007) 41–55.
[7] A. Zakeri, A. Hokmabadi, Efficient feature selection method using real-valued grasshopper optimization algorithm, Expert Syst. Appl. 119 (2019) 61–72.
[8] G. Folino, P. Sabatino, Ensemble based collaborative and distributed intrusion detection systems: a survey, J. Netw. Comput. Appl. 66 (2016) 1–16.
[9] S. Dwivedi, M. Vardhan, S. Tripathi, Incorporating evolutionary computation for securing wireless network against cyberthreats, J. Supercomput. (2020) 1–38.
[10] A.K. Shukla, S.K. Pippal, S.S. Chauhan, An empirical evaluation of teaching–learning-based optimization, genetic algorithm and particle swarm optimization, Int. J. Comput. Appl. (2019) 1–15.
[11] S. Saremi, S. Mirjalili, A. Lewis, Grasshopper optimisation algorithm: theory and application, Adv. Eng. Softw. 105 (2017) 30–47.
[12] S. Arora, P. Anand, Chaotic grasshopper optimization algorithm for global optimization, Neural Comput. Appl. (2018) 1–21.
[13] A.K. Shukla, P. Singh, M. Vardhan, An adaptive inertia weight teaching-learning-based optimization algorithm and its applications, Appl. Math. Model. 77 (2020) 309–326.
[14] O. Ertenlice, C.B. Kalayci, A survey of swarm intelligence for portfolio optimization: algorithms and applications, Swarm Evol. Comput. 39 (2018) 36–52.
[15] C.-F. Tsai, Y.-F. Hsu, C.-Y. Lin, W.-Y. Lin, Intrusion detection by machine learning: a review, Expert Syst. Appl. 36 (10) (2009) 11994–12000.
[16] S. Mohammadi, H. Mirvaziri, M. Ghazizadeh-Ahsaee, H. Karimipour, Cyber intrusion detection by combined feature selection algorithm, J. Inform. Secur. Appl. 44 (2019) 80–88.
[17] K. Zheng, X. Wang, Feature selection method with joint maximal information entropy between features and class, Pattern Recognit. 77 (2018) 20–29.
[18] D.E. Denning, An intrusion-detection model, IEEE Trans. Softw. Eng. (2) (1987) 222–232.
[19] R.K. Deka, D.K. Bhattacharya, J.K. Kalita, Active learning to detect DDos attack using ranked features, Comput. Commun. (2019).
[20] S. Dwivedi, M. Vardhan, S. Tripathi, Defense against distributed dos attack detection by using intelligent evolutionary algorithm, Int. J. Comput. Appl. (2020) 1–11.
[21] V. Hajisalem, S. Babaie, A hybrid intrusion detection system based on ABC-AFS algorithm for misuse and anomaly detection, Comput. Netw. 136 (2018) 37–50.
[22] M.S. Pervez, D.M. Farid, Feature selection and intrusion classification in NSL-KDD cup 99 dataset employing SVMs, in: Software, Knowledge, Information Management and Applications (SKIMA), 2014 8th International Conference on, IEEE, 2014, pp. 1–6.
[23] A.H. Hamamoto, L.F. Carvalho, L.D.H. Sampaio, T. Abrão, M.L. Proença Jr., Network anomaly detection system using genetic algorithm and fuzzy logic, Expert Syst. Appl. 92 (2018) 390–402.
[24] H. Sadreazami, A. Mohammadi, A. Asif, K.N. Plataniotis, Distributed-graph-based statistical approach for intrusion detection in cyber-physical systems, IEEE Trans. Signal Inf. Process. Netw. 4 (1) (2017) 137–147.
[25] L. Lv, W. Wang, Z. Zhang, X. Liu, A novel intrusion detection system based on an optimal hybrid kernel extreme learning machine, Knowl. Based Syst. (2020) 105648.
[26] M. Xie, J. Hu, Evaluating host-based anomaly detection systems: a preliminary analysis of ADFA-LD, in: 2013 6th International Congress on Image and Signal Processing (CISP), vol. 3, IEEE, 2013, pp. 1711–1716.
[27] R. Abdulhammed, M. Faezipour, A. Abuzneid, A. Alessa, Enhancing wireless intrusion detection using machine learning classification with reduced attribute sets, in: 2018 14th International Wireless Communications & Mobile Computing Conference (IWCMC), IEEE, 2018, pp. 524–529.
[28] N. Moustafa, J. Slay, The evaluation of network anomaly detection systems: statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set, Inf. Secur. J. 25 (1–3) (2016) 18–31.
[29] S. Kaur, M. Singh, Hybrid intrusion detection and signature generation using deep recurrent neural networks, Neural Comput. Appl. (2019) 1–19.
[30] M. Mazini, B. Shirazi, I. Mahdavi, Anomaly network-based intrusion detection system using a reliable hybrid artificial bee colony and adaboost algorithms, J. King Saud Univ.-Comput. Inform. Sci. (2018).
[31] R. Sharma, S. Chaurasia, An enhanced approach to fuzzy c-means clustering for anomaly detection, in: Proceedings of First International Conference on Smart System, Innovations and Computing, Springer, 2018, pp. 623–636.
[32] M.E. Aminanto, R. Choi, H.C. Tanuwidjaja, P.D. Yoo, K. Kim, Deep abstraction and weighted feature selection for Wi-Fi impersonation detection, IEEE Trans. Inf. Forensics Secur. 13 (3) (2017) 621–636.
[33] E. Mahdavi, A. Fanian, F. Amini, A real-time alert correlation method based on codebooks for intrusion detection systems, Comput. Secur. 89 (2020) 101661.
[34] P. Suresh, R. Sukumar, S. Ayyasamy, Efficient pattern matching algorithm for security and binary search tree (BST) based memory system in wireless intrusion detection system (WIDS), Comput. Commun. 151 (2020) 111–118.
[35] J.M. Vidal, A.L.S. Orozco, L.J.G. Villalba, Adaptive artificial immune networks for mitigating dos flooding attacks, Swarm Evol. Comput. 38 (2018) 94–108.
[36] A. Karami, M. Guerrero-Zapata, A hybrid multiobjective RBF-PSO method for mitigating dos attacks in named data networking, Neurocomputing 151 (2015) 1262–1282.
[37] R. Vijayanand, D. Devaraj, B. Kannapiran, Intrusion detection system for wireless mesh network using multiple support vector machine classifiers with genetic-algorithm-based feature selection, Comput. Secur. 77 (2018) 304–314.
[38] J.E. Tapia, C.A. Perez, Gender classification based on fusion of different spatial scale features selected by mutual information from histogram of LBP, intensity, and shape, IEEE Trans. Inf. Forensics Secur. 8 (3) (2013) 488–499.
[39] S. Dwivedi, M. Vardhan, S. Tripathi, A.K. Shukla, Implementation of adaptive scheme in evolutionary technique for anomaly-based intrusion detection, Evol. Intell. 13 (1) (2020) 103–117.
[40] C. Liu, W. Wang, Q. Zhao, X. Shen, M. Konan, A new feature selection method based on a validity index of feature subset, Pattern Recognit. Lett. 92 (2017) 1–8.
[41] M. Bennasar, Y. Hicks, R. Setchi, Feature selection using joint mutual information maximisation, Expert Syst. Appl. 42 (22) (2015) 8520–8532.
[42] A. Fathy, Recent meta-heuristic grasshopper optimization algorithm for optimal reconfiguration of partially shaded PV array, Sol. Energy 171 (2018) 638–651.
[43] J. Luo, H. Chen, Y. Xu, H. Huang, X. Zhao, et al., An improved grasshopper optimization algorithm with application to financial stress prediction, Appl. Math. Model 64 (2018) 654–668.
[44] A.A. Ewees, M.A. Elaziz, E.H. Houssein, Improved grasshopper optimization algorithm using opposition-based learning, Expert Syst. Appl. (2018).
[45] M.K. Ebrahimpour, M. Eftekhari, Ensemble of feature selection methods: a hesitant fuzzy sets approach, Appl. Soft Comput. 50 (2017) 300–312.
[46] S.A. Rankawat, R. Dubey, Robust heart rate estimation from multimodal physiological signals using beat signal quality index based majority voting fusion method, Biomed. Signal Process. Control 33 (2017) 201–212.
[47] A.A. Aburomman, M.B.I. Reaz, A survey of intrusion detection systems based on ensemble and hybrid classifiers, Comput. Secur. 65 (2017) 135–152.
[48] A.J. Ferreira, M.A.T. Figueiredo, Efficient feature selection filters for high-dimensional data, Pattern Recognit. Lett. 33 (13) (2012) 1794–1804.
[49] F. Kuang, S. Zhang, Z. Jin, W. Xu, A novel SVM by combining kernel principal component analysis and improved chaotic particle swarm optimization for intrusion detection, Soft Comput. 19 (5) (2015) 1187–1199.

[50] T. Blickle, L. Thiele, A comparison of selection schemes used in evolutionary algorithms, Evol. Comput. 4 (4) (1996) 361–394.
[51] S. Mirjalili, S.M. Mirjalili, A. Lewis, Grey wolf optimizer, Adv. Eng. Softw. 69 (2014) 46–61.
[52] M. Mafarja, I. Aljarah, A.A. Heidari, H. Faris, P. Fournier-Viger, X. Li, S. Mirjalili, Binary dragonfly optimization for feature selection using time-varying transfer functions, Knowl. Based Syst. 161 (2018) 185–204.
[53] C.-P. Lee, Y. Leu, W.-N. Yang, Constructing gene regulatory networks from microarray data using GA/PSO with DTW, Appl. Soft Comput. 12 (3) (2012) 1115–1124.
[54] C. Cortes, V. Vapnik, Support-vector networks, Mach. Learn. 20 (3) (1995) 273–297.
[55] B.O. Alijla, C.P. Lim, L.-P. Wong, A.T. Khader, M.A. Al-Betar, An ensemble of intelligent water drop algorithm for feature selection optimization problem, Appl. Soft Comput. 65 (2018) 531–541.
[56] A.K. Shukla, P. Singh, M. Vardhan, A hybrid framework for optimal feature subset selection, J. Intell. Fuzzy Syst. 36 (3) (2019) 2247–2259.
[57] M. Tavallaee, E. Bagheri, W. Lu, A.A. Ghorbani, A detailed analysis of the KDD CUP 99 data set, in: Computational Intelligence for Security and Defense Applications, 2009. CISDA 2009. IEEE Symposium on, IEEE, 2009, pp. 1–6.
[58] A. Shiravi, H. Shiravi, M. Tavallaee, A.A. Ghorbani, Toward developing a systematic approach to generate benchmark datasets for intrusion detection, Comput. Secur. 31 (3) (2012) 357–374.
[59] W. Elmasry, A. Akbulut, A.H. Zaim, Evolving deep learning architectures for network intrusion detection using a double PSO metaheuristic, Comput. Netw. 168 (2020) 107042.
[60] I. Sharafaldin, A.H. Lashkari, A.A. Ghorbani, Toward generating a new intrusion detection dataset and intrusion traffic characterization, in: ICISSP, 2018, pp. 108–116.
[61] C.-C. Chang, C.-J. Lin, LIBSVM: a library for support vector machines, ACM Trans. Intell. Syst. Technol. (TIST) 2 (3) (2011) 27.
[62] S. Lakhina, S. Joseph, B. Verma, Feature reduction using principal component analysis for effective anomaly-based intrusion detection on NSL-KDD (2010).
[63] G.V. Nadiammai, M. Hemalatha, Effective approach toward intrusion detection system using data mining techniques, Egyptian Inform. J. 15 (1) (2014) 37–50.
[64] P. Gogoi, M.H. Bhuyan, D.K. Bhattacharyya, J.K. Kalita, Packet and flow based network intrusion dataset, in: International Conference on Contemporary Computing, Springer, 2012, pp. 322–334.
[65] M.M. Abd-Eldayem, A proposed HTTP service based IDS, Egyptian Inform. J. 15 (1) (2014) 13–24.
[66] M. Prasad, S. Tripathi, K. Dahal, An efficient feature selection based bayesian and rough set approach for intrusion detection, Appl. Soft Comput. 87 (2020) 105980.
[67] O. Depren, M. Topallar, E. Anarim, M.K. Ciliz, An intelligent intrusion detection system (IDS) for anomaly and misuse detection in computer networks, Expert Syst. Appl. 29 (4) (2005) 713–722.
[68] W. Yassin, N.I. Udzir, A. Abdullah, M.T. Abdullah, Z. Muda, H. Zulzalil, Packet header anomaly detection using statistical analysis, in: International Joint Conference SOCO14-CISIS14-ICEUTE14, Springer, 2014, pp. 473–482.
[69] Z. Tan, A. Jamdagni, X. He, P. Nanda, R.P. Liu, J. Hu, Detection of denial-of-service attacks based on computer vision techniques, IEEE Trans. Comput. 64 (9) (2014) 2519–2533.
[70] H. Huang, R.S. Khalid, H. Yu, Distributed machine learning on smart-gateway network towards real-time indoor data analytics, in: Data Science and Big Data: An Environment of Computational Intelligence, Springer, 2017, pp. 231–263.
[71] F. Salo, A.B. Nassif, A. Essex, Dimensionality reduction with IG-PCA and ensemble classifier for network intrusion detection, Comput. Netw. 148 (2019) 164–175.
