Implementation of Adaptive Scheme in Evolutionary
https://doi.org/10.1007/s12065-019-00293-8
SPECIAL ISSUE
Received: 11 March 2019 / Revised: 31 May 2019 / Accepted: 4 June 2019 / Published online: 21 September 2019
© Springer-Verlag GmbH Germany, part of Springer Nature 2019
Abstract
Intrusion detection has become important to network security because of the increasing connectivity between computers and the internet. Various Intrusion Detection Systems have been investigated to protect the web or networks using several evolutionary methods and classification techniques. In this study, we propose a new technique that combines Ensemble of Feature Selection (EFS) and Adaptive Grasshopper Optimization Algorithm (AGOA) methods, called EFSAGOA, which can help to identify the types of attack. In the proposed approach, the EFS method is first applied to rank the attributes and select the high-ranked subset of attributes. Then, AGOA is employed to determine important attributes from the reduced datasets that can contribute to predicting the network traffic behavior. Furthermore, the adaptive behavior of GOA is used to decide whether a record represents an anomaly or not, differing from some approaches described in the literature. AGOA uses the Support Vector Machine (SVM) as a fitness function to choose the most efficient features and to maximize the classification performance. In addition, it is also applied to optimize the penalty factor (C), kernel parameter (𝜎), and tube size (𝜖) of the SVM classifier. The performance of EFSAGOA has been evaluated on modern intrusion data, namely ISCX 2012. The experimental results demonstrate that the proposed method performs better and obtains a high detection rate, high accuracy, and a low false alarm rate compared to other state-of-the-art techniques on ISCX 2012 data.
Keywords ISCX 2012 · Network security · Adaptive grasshopper optimization algorithm · Intrusion detection
which selects set of features from network intrusion detection datasets that maximizes the objective function. In recent decades, a large diversity of approaches has been introduced to develop metaheuristic-based intrusion detection systems, which can achieve better network security and also find modern attacks [4].

To overcome the computational costs of time-consuming trial-and-error parameter tuning, researchers have been actively applying adaptation strategies in metaheuristic algorithms for solving the IDS problem. By using this concept, the performance of metaheuristic algorithms regarding diversity, premature convergence, and solution quality can be improved. In recent decades, various Evolutionary Computation (EC) techniques such as Differential Evolution (DE), Genetic Algorithm (GA), Particle Swarm Optimisation (PSO), and Grasshopper Optimisation Algorithm (GOA) have been established for intrusion detection, which work as wrapper methods [5].

Saremi et al. [6] proposed a capable nature-inspired algorithm called GOA, which is inspired by the swarm behavior of insects in nature. The elemental limitation of GOA is that it cannot guarantee optimality. In 2018, Mafarja et al. [7] proposed a wrapper approach based on GOA that contains selection operators and evolutionary population dynamics, using four diverse strategies to overcome the drawbacks of the basic GOA such as slow convergence and stagnation. Similarly, Ewees et al. [8] proposed an Opposition-based Learning (OBL) variant of GOA, known as OBLGOA, that contains two phases: in the first phase, an initial population is generated; then OBL is used as an additional phase to modify the GOA population in each iteration. The proposed algorithm was found to give better-quality results and to be competent on global unconstrained and constrained optimisation problems and various real-life problems.

Among the learning methods (Naive Bayes, Multi-Layer Perceptron and Artificial Neural Network (ANN)), SVM is an effective classification technique; the main reason is that the distribution of different types of attacks is imbalanced, where the learning sample size of the low-frequency attacks is too small compared to the high-frequency attacks [9]. SVM is a margin-based classifier based on small sample learning with good generalization capabilities, which is frequently used in real-world applications. It realizes the theory of marginal dimension and the principle of structural risk minimization, thus it does not have the over-fitting problem. To overcome the problem, Kuang et al. [10] investigated an SVM model built on Genetic Algorithm (GA) and Kernel Principal Component Analysis (KPCA) to extract intrusion characteristics, and GA to enhance the SVM parameters.

To avoid the expensive computational costs of choosing the most suitable learners along with their control parameter values, we propose a framework for adaptive learning signatures using a biologically-inspired approach called AGOA. To overcome the limitation of individual approaches, in this study we propose a novel hybrid method that combines EFS as a filter with an adaptive learning approach called Adaptive Grasshopper Optimisation Algorithm (AGOA) to detect security violations in IDSs. It can improve the solution quality by adjusting the regulation parameter values and controlling premature convergence and stagnation.

The rest of the paper is planned as follows. Section 2 provides the related work on intrusion detection based techniques for feature selection. Section 3 provides the background details of filter methods, the SVM classification algorithm, and the grasshopper method. Section 4 introduces the proposed approach and outlines how to choose the noticeable features from the ISCX 2012 dataset and discriminate the different types of attacks. Section 5 illustrates the overall working of the proposed method. In Sect. 6, we evaluate the performance on ISCX 2012 data. Section 7 discusses the conclusion.

2 Related work

From the reported literature of intrusion detection systems (IDSs) it is found that these systems come under five strategies: statistics-based, pattern-based, rule-based, state-based, and heuristic-based [11]. In statistics-based methods, a predefined threshold, mean, and standard deviation are used to identify the type of attacks. This work provides a literature review of existing filter and wrapper FS methods on intrusion detection datasets. At first, the underlying data is described in more detail. Then, this paper analyzes the properties of the ISCX 2012 dataset, which is often used in the available literature to evaluate the quality of network-based datasets. To make a fair comparison with standard filter and wrapper methods, the proposed model is evaluated on a new dataset, ISCX 2012.

Progressively, abundant applications such as feature selection and classification models have been applied to IDS datasets for finding the type of attacks. According to the literature [12], FS algorithms combined with learning algorithms cannot handle, or do not scale to, extremely large volumes of data. To handle this type of problem, in this research work our focus is on filter and wrapper methods, namely mutual information and AGOA, for finding the attacks. Detection of attacks and intrusions are broadly thematic researches that follow the trend of applying detection of intrusions. The work of Tariq [13] combined methods of semantic analysis and artificial neural networks for the detection of network behaviors. Ambusaidi et al. [3] presented an approach based on mutual information to systematically choose the optimal features for classification. In addition, it is to deal with dependent data
features which are linear and nonlinear proposed approach was used. The efficacy of the proposed approach was assessed for network intrusion identification using a Least Square Support Vector Machine based IDS (LSSVM-IDS). The experiments were conducted on three datasets: KDD Cup 99, NSL-KDD and Kyoto 2006+. In addition, the author of [14] proposed a new technique to detect anomalies regarding normal and DoS attack traffic. The idea was based on the behavior of the network traffic records evaluated by the proposed detection system. The experiment used two different datasets, KDD Cup 99 and ISCX 2012, and acquired detection accuracies of 99.95% and 90.12%, respectively.

To solve cyber-attack problems, several metaheuristic techniques have recently been proposed to avoid the anomaly-based problem by using new update schemes. Metaheuristic methods imitate natural processes or biological phenomena to search for the best solution efficiently. Several metaheuristic methods such as Differential Evolution (DE), Genetic Algorithm (GA) [5], and Particle Swarm Optimisation (PSO) have been established for intrusion detection and generally work as wrapper approaches.

GOA has been applied to several fields due to its ease and proficiency, such as financial stress prediction, partially shaded PV arrays, wireless node localization, vibration signal analysis in machine learning, and computer networks [15]. Numerous continuous, discrete, single-objective and multi-objective optimization problems were effectively solved by GOA in comparison with various classic metaheuristic algorithms such as differential evolution and particle swarm optimisation. The authors of [16] proposed a new FS method based on a mathematical model of the interaction between grasshoppers to find food sources. Some modifications were applied to the grasshopper optimization algorithm to make it suitable for a feature selection problem. GOFS was supplemented by statistical measures during iterations to replace duplicate features with the most promising features. As reported in [17], the author proposed a novel FS algorithm using a support vector machine to reduce the feature domain of IDS datasets. Keeping this in mind, GOA is used here as the wrapper approach for obtaining near-optimal solutions efficiently in the boundless search space.

Each FS technique has its merits and demerits, and the performance depends on the intrusion detection system being built, including the limitations associated with the scenario, such as precision, time and cost [18]. Despite the availability of strategies, analysts generally agree that there is no FS method which is impeccable. To deal with the above limitations, various researchers endeavor to overcome the limitations of both filter and wrapper methods [19]. Therefore, we have used an ensemble method based on the filter FS method and the wrapper approach on the IDS data.

Tsang et al. [20] presented a new IDS method to generate both precise and interpretable fuzzy rules from web-traffic data for classification, to improve the classification performance. Using this dataset, the proposed model attained a detection accuracy of 99.24% with optimal feature subsets. Recently, Khammassi and Krichen [21] presented a new wrapper method in which GA works as a heuristic search method with a logistic regression learner for web intrusion detection systems, selecting the finest subset of features. The experimental study was conducted on the KDD Cup 99 and UNSW-NB15 datasets. Shahreza et al. [22] proposed a novel anomaly detection technique by implementing a self-organizing map and particle swarm optimization approach.

In addition, using fuzzy and SVM methods, many methods have been proposed that can increase the classification efficiency on revised versions of benchmark datasets like DARPA KDD 99 [23]. In contrast to filter methods, evolutionary methods are claimed to be more precise as feature selection methods using radial basis function [24]. Manzoor et al. [25] presented a feature reduction method that merges ranks obtained from information gain and correlation-based methods to classify normal and abnormal traffic. A novel support vector machine model was proposed by Kuang et al. [10] that combined kernel PCA with GA for intrusion detection; here, KPCA was employed as a pre-processor of SVM to reduce the feature vectors and shorten the training time. To shrink the noise produced by differences in features and to enhance the performance of SVM, an improved kernel function is suggested by embedding the mean square difference and mean values of features in the Radial Basis Function (RBF) kernel.

To mitigate Denial of Service (DoS) attacks, Vidal et al. [26] offered an artificial immune framework which works by matching several immune reactions and building an immune memory framework. Experiments on publicly available datasets such as KDD Cup 99 and CAIDA have been conducted. A multi-objective anomaly detection technique, called multi-objective PSO, has been proposed in relation to neural networks [27]. The experimental simulations demonstrated that the hybrid algorithm can rapidly and commendably identify and ease DoS attacks in adversarial conditions regarding the functional performance criteria. In [28], the author proposed a model for trajectory optimisation of solar-powered UAVs that combined object tracking in an urban environment.

Overall, in the previous literature, traditional IDS-based techniques are either verified on only one or two kinds of attacks or show high execution time and false positive rate. To solve the problem stated above, this paper integrates an ensemble of filters with a wrapper. In the first phase, the filter method (EFS) attempts to exclude irrelevant features from the original data. Reduction in search space from the
entire original feature space to the pre-selected features, this phase accelerates the processing for the next phase, the wrapper method (AGOA). Consequently, this work obtains high accuracy and detection rates, a low false positive rate, and less processing time in comparison to the reported literature. Furthermore, to validate the performance of the proposed approach, experiments are carried out on the most important forms of intrusion attacks.

3 Background details

3.1 Feature selection

Feature selection is the process of automatically or manually selecting those features which contribute most to the prediction variable or output of interest. Having irrelevant features in the data can decrease the accuracy of the models, because a model can learn from irrelevant features.

As a generally accepted rule, feature selection methods are grouped into two sets: filter and wrapper methods [29]. Wrapper approaches utilise FS as a part of training the learning model, whereas filter approaches choose features independently from the classification model. The filter approach is a straightforward method that assesses the features based on the intrinsic behavior of the dataset before the learning task [30]. In this paper, we have chosen three filter methods, namely Conditional Mutual Information Maximization (CMIM), minimum-Redundancy-Maximum-Relevancy (mRMR), and Joint Mutual Information (JMI) [31, 32], and AGOA as the wrapper method, which selects the relevant attributes for better classification of attack types.

Recently, the selection of the best feature subsets through information theory has become a promising task. In this study, we use the concept of mutual information and measure the effectiveness of the picked features based on their information value, with the intention of retaining correlated features and excluding redundant features. None of the available approaches can defeat problems such as low performance, redundant features, and high computational burden. Therefore, this study develops a new hybrid algorithm by integrating the ensemble of feature selection (EFS) method and AGOA to select the useful features for accurate attack prediction. It helps us to concentrate on worthy feature sets and to differentiate among records of different labels, which leads us to predict an attack correctly.

3.2 Grasshopper optimization algorithm

The grasshopper optimisation algorithm [6] is an innovative evolutionary algorithm motivated by the lives of grasshoppers. Their lifespan has two parts, namely nymph and adulthood. Grasshoppers can move in swarms that are among the largest of all kinds of creatures, although they are generally found individually in nature [33]. The swarm behavior appears in both the nymph and the adulthood stages, which is a unique aspect of the grasshopper swarm [34]. The nymph grasshoppers jump and move as rotating cylinders in millions.

In general, nature-inspired algorithms divide the search process into two phases: exploration and exploitation. In the exploration phase, agents are stimulated to move involuntarily, while they tend to move locally in the exploitation phase. Widespread exploration and prompt convergence inspire the use of GOA. The mathematical model used to simulate the swarm behavior of grasshoppers is given in [8] according to Eq. (1).

$$Y_i = So_i + Gr_i + Aw_i \tag{1}$$

where $Y_i$ represents the position of the $i$th grasshopper, $So_i$ represents the $i$th social interaction expressed in Eq. (3), $Gr_i$ represents the gravitational force on the $i$th agent, and $Aw_i$ is the wind advection. The random behavior of the swarm can be written as in Eq. (2).

$$Y_i = r_1 \, So_i + r_2 \, Gr_i + r_3 \, Aw_i \tag{2}$$

where $r_1$, $r_2$, and $r_3$ are random numbers between 0 and 1.

$$So_i = \sum_{j=1,\, j \neq i}^{n} s_f(D_{ij}) \, \hat{D}_{ij} \tag{3}$$

where $D_{ij}$ is the distance between the $i$th and the $j$th grasshopper, determined by $D_{ij} = |Y_j - Y_i|$, $s_f$ is a function that decides the strength of social forces, as shown in Eq. (4), and $\hat{D}_{ij} = (Y_j - Y_i)/D_{ij}$ is a unit vector from the $i$th grasshopper to the $j$th grasshopper. The $s_f$ function, which describes the social forces, is given in Eq. (4):

$$s_f(r) = I e^{-r/l_n} - e^{-r} \tag{4}$$

where $I$ is the intensity of attraction and $l_n$ is the attractive length scale.

Figure 1 shows the $s_f$ function, which demonstrates the rational model of interactions between agents and the comfort zone. It can be seen that this social interaction between grasshoppers has been the driving force in some previous models of swarms in an easier form.

The $Gr_i$ component in Eq. (1) is computed by Eq. (5) as follows:

$$Gr_i = -gr \, \hat{e}_{gr} \tag{5}$$

where $gr$ is the gravitational constant and $\hat{e}_{gr}$ represents the unit vector towards the center of the earth. The $Aw_i$ component in Eq. (1) is computed according to Eq. (6).

$$Aw_i = d \, \hat{e}_w \tag{6}$$
where $d$ represents the constant drift and $\hat{e}_w$ represents the unit vector in the wind direction.

Their movement is strongly tied to the wind direction because the small grasshoppers have no wings. So, $Gr_i$ and $Aw_i$ in Eq. (1) are substituted, giving the following equation:

$$Y_i = \sum_{j=1,\, j \neq i}^{n} s_f\left(|Y_j - Y_i|\right) \frac{Y_j - Y_i}{D_{ij}} - gr \, \hat{e}_{gr} + d \, \hat{e}_w \tag{7}$$

where $s_f(r) = I e^{-r/l_n} - e^{-r}$ and $n$ is the number of grasshoppers. Equation (7) can be used to simulate the interaction between grasshoppers in a swarm. However, because the agents quickly arrive at the comfort zone and, as a result, the swarm diverges from a definite point, this mathematical model may not be used directly to solve optimisation problems. A variant of Eq. (8) is used as follows to resolve optimisation problems:

in Eq. (8). Note that the first term of this equation considers the position of the current grasshopper with respect to the other grasshoppers.

To balance exploration and exploitation, the $cz$ parameter is required to decrease in proportion to the number of iterations. This favours exploitation as the number of iterations rises. The $cz$ coefficient shrinks the comfort zone in proportion to the number of iterations and is computed as in Eq. (9):

$$cz = cz_{\max} - t \, \frac{cz_{\max} - cz_{\min}}{t_{\max}} \tag{9}$$

where $cz_{\max}$ represents the maximum value, $cz_{\min}$ represents the minimum value, $t$ represents the current iteration, and $t_{\max}$ represents the maximum number of iterations.

3.3 Support vector machine
for algorithms that can be expressed in terms of dot products. There are many implementation methods to solve the problem, but in this manuscript we have used three kernels in SVM, which are taken from the LibSVM tool [36]. The RBF kernel function is formulated as:

$$k(R_i, R_j) = \exp\left(-\frac{|R_i - R_j|^2}{2\sigma^2}\right)$$

where $d$ is a constant value, $R_i$ and $R_j$ are records or instances, and $p$ is the order of the function.

When we use the RBF function, SVM in general provides decent classification performance, which is not a nominal kernel function for less number of parameters. A network sample contains a large number of attributes, and there exist substantial differences among them. When the variances between the attributes are very large, then excluding the RBF function in the process of training will yield a weighty support vector and the training time will not be shorter too.

4 Proposed methodology

the frequency vote approach for examining the rank of the individual feature selection method.

4.1.2 Frequency vote

Frequency vote (FV) is a group decision-making scheme with a simple setup that is as beneficial as other, more complex schemes [38]. In the context of FS, each model casts a prediction (vote) for each feature rank, and the final output is the one that receives more than half of the votes. If none of the predictions acquires more than half of the votes, we can observe that the
ensemble method is unable to make a stable prediction for the record. If none of the predictions gets more than one vote, we can assume that the ensemble method is unable to make a stable prediction for the particular instance, and we then prefer the lowermost vote value regarding the score value. So, we take the most voted prediction as the final prediction, as illustrated in Eq. (12).

$$\sum_{n=1}^{M} d_{n,j} = \max_{k \in \{1, 2, \ldots, L\}} \sum_{n=1}^{M} d_{n,k} \tag{12}$$

Here, $M$ represents the number of feature selection methods, and $L$ is the number of candidate attributes. For attribute $k$, the sum $\sum_{n=1}^{M} d_{n,k}$ tabulates the number of votes for $k$. Plurality chooses the attribute $k$ which maximizes the sum.

4.2 Adaptive grasshopper optimization algorithm

To enhance the merits of the basic GOA, we first analyse its disadvantage. At the start of GOA, the positions of all the agents are randomly initialised. Agents develop a tendency to move towards the target (i.e., $\hat{T}_d$ in the social interaction of swarms); however, if the initialised positions of the agents are concentrated near a local best and far from the global best, the agents can be attracted by the local best. This means GOA is sensitive to the selection of initial positions, so improvement strategies should help search agents escape from the local-best trap.

The grasshopper optimisation algorithm with adaptive strategy, referred to as AGOA, is a significant and promising variant of GOA. The adaptation of $cz$ in the basic GOA is only related to the generation [Eq. (9)] and does not dynamically change the value of $cz$ using feedback from the search process. Therefore, $cz$ should be designed as an adaptive parameter related to the fitness values of each generation and the performance of the current population. That is, based on Eq. (13), the fitness of the population (Pop) selected by the selection technique and of the best population should be introduced as feedback.

For the local-best trap, this paper utilises the natural selection strategy ($\delta$) and an adaptation mechanism to help GOA escape the trap. For the adaptive strategy on $cz$, a dynamic feedback mechanism based on the genetic algorithm adaptation strategy is used. According to the methodology of natural selection [39], $\delta$ ($\delta <$ Pop) grasshoppers are selected and eliminated randomly from the population. The population is selected by tournament selection according to the fitness of each grasshopper. Then, the positions of the eliminated $\delta$ grasshoppers are randomly re-initialized. By using the natural selection strategy, the population can move towards better-quality solutions. Meanwhile, the position re-initialization of the eliminated agents can diffuse the positions of grasshoppers properly and spread the search space of the population. Thus, local solutions can be avoided to some extent.

Instead of using fixed values of $cz$, AGOA utilizes the population information in each generation and adaptively adjusts $cz$ in order to maintain the population diversity as well as to sustain the convergence capacity. In AGOA, the adjustment of $cz$ depends on the fitness values of the solutions. This research detects convergence by observing the range between the maximum and minimum fitness values of the population. When the value of $cz$ is large, the complete search is too difficult and the optimal solution may be lost. When the $cz$ value is small, the exploration can stop in the minimum zone. In this scenario, we have used an alternative approach to an evaluation of population selection assessment using simple SVM classifiers. In AGOA, the value of $cz$ can be estimated by using Eq. (13).

$$cz = \begin{cases} k_1 \left[ \dfrac{f_{\max} - f_g}{f_{\max} - f_{avg}} \right], & f_g \geq f_{avg} \\ k_2, & f_g < f_{avg} \end{cases} \tag{13}$$

where $f_{\max}$ represents the maximum fitness of all agents when AGOA performs a search operation, $f_{avg}$ represents the average fitness, and $f_g$ represents the average fitness of the three parents in the selection operation. $k_1$ and $k_2$ are constant values between 0 and 1. The overall procedure of the adaptive grasshopper optimization algorithm is demonstrated in Algorithm 2.

Inspired by Grey Wolf Optimization (GWO) [40], a democratic decision-making mechanism is introduced to the AGOA strategy. The expected positions of all grasshoppers are decided by the leading agents simultaneously. The best three grasshoppers in AGOA are denoted $\alpha$, $\beta$, $\gamma$ and act as the leading agents. The mathematical formulation of the AGOA mechanism is described in Eq. (14).

$$Y_i^d = cz \left\{ \sum_{j=1,\, j \neq i}^{n} cz \, \frac{ul_d - ll_d}{2} \, s_f\left(|Y_j^d - Y_i^d|\right) \frac{Y_j - Y_i}{D_{ij}} \right\} + \frac{\hat{T}_\alpha + \hat{T}_\beta + \hat{T}_\gamma}{3} \tag{14}$$

where $\hat{T}_\alpha$, $\hat{T}_\beta$, $\hat{T}_\gamma$ are the leading agents among all grasshoppers.

4.2.1 Binary adaptive grasshopper optimization algorithm

Generally, in a binary FS problem, the agents (particles) of a binary optimization algorithm can only move to nearer and farther corners of the hypercube by flipping one or more bits of the position vector; $Y = \{Y_1, Y_2, \ldots, Y_d\}$ is designated as the
search space of a hypercube [41]. The location of an agent is updated by adding to the current position vector, since the basic GOA was designed to deal with continuous problems. Hence, the current AGOA approach cannot be used directly for the binary FS problem. As stated by a prior study [42], employing a transfer function is an efficient and appropriate way of transforming a continuous value into a binary value. Mostly, a transfer function is utilized to yield the probability of changing a position element from 0 to 1 or vice versa; the position of the $i$th agent in the $j$th dimension in the current iteration ($t$) is taken as the input parameter, as described in Algorithm 2.

the agents update their positions using the rule given in Eq. (16).

$$Y_i^{d+1} = \begin{cases} 1, & \text{if } rand < T\left(Y_i^{d+1}\right) \\ 0, & \text{otherwise} \end{cases} \tag{16}$$

In general, a particular problem in existing algorithms is that they do not prevent individuals from being stuck in a local optimum. To resolve this kind of issue, this paper proposes a novel hybrid FS algorithm to find better-quality solutions regarding classification accuracy and to minimize the number of irrelevant features. In addition, a new encoding scheme is also investigated to evaluate the classification.
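The binary update rule of Eq. (16) can be illustrated with a minimal sketch. The transfer function $T$ is not defined in this part of the text, so the S-shaped (sigmoid) function below is an assumption, and the names `sigmoid_transfer` and `binarize_position` are hypothetical helpers introduced only for illustration.

```python
import math
import random


def sigmoid_transfer(x):
    """Assumed S-shaped transfer function T; the text does not specify T here."""
    return 1.0 / (1.0 + math.exp(-x))


def binarize_position(position, rng=random):
    """Rule of Eq. (16): a bit becomes 1 if rand < T(y), otherwise 0."""
    return [1 if rng.random() < sigmoid_transfer(y) else 0 for y in position]


# Usage: map a continuous position vector to a feature mask.
rng = random.Random(0)
continuous = [-6.0, -0.5, 0.0, 0.5, 6.0]
mask = binarize_position(continuous, rng)
# Indices with bit 1 are the features this agent would select.
selected = [i for i, bit in enumerate(mask) if bit == 1]
```

Under this assumption, strongly positive components are almost always mapped to 1 and strongly negative ones to 0, while values near zero are selected with probability close to one half.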
where $TP$, $TN$, $FP$, and $FN$ represent True Positive, True Negative, False Positive, and False Negative.

5 Overall structure of proposed methodology

Each FS method has its potentials and limitations, and the performance relies on the type of data, as well as on the constraints associated with the situation, i.e., accuracy, time and budget. Despite the availability of developed methods, experts generally agree that there is no perfect FS method. Keeping this in mind, in this paper we first present an ensemble FS method; then the wrapper method, AGOA, is applied to choose an optimal subset of features from the ISCX 2012 dataset filtered by the EFS method. The proposed method enriches the classification performance and also reduces the search complexity of generating the feature sets for the ISCX 2012 dataset. Generally, GOA is widely applied to numerical optimisation problems. In this article, we propose a novel hybridization of ensemble feature selection and adaptive GOA methods, called EFSAGOA, to solve the IDS problem, as can be seen in Fig. 3. AGOA is a population-based metaheuristic search method inspired by grasshopper swarm phenomena, determined as a unit of cultural evolution that is capable of local modifications and of separating a type of attack in the case of a new agent, which aids early detection [43]. In addition, to evaluate the performance of the proposed method, we have used SVM as the fitness function in AGOA to find the best agent with the optimal number of features.

The overall process of the hybrid algorithm for feature selection comprises four main steps: (a) data collection; (b) data preprocessing, where training and test data are preprocessed and important features that can distinguish one class from the others are selected; (c) classifier training, where the model for classification is trained using AGOA-SVM; and (d) attack recognition, where the trained classifier is used to detect intrusions on the test data.

5.1 Data collection

Data collection is the first and a fundamental step for an intrusion detection system. The type of data source and the location where the data is collected are the two determining factors in the design and efficacy of an IDS. To provide the most appropriate security for the host or target network, this study uses a web-network-based IDS to test the proposed approach. Throughout the training phase, the collected data samples are classified according to the web-level protocols and are labeled on the basis of domain knowledge. However, the data gathered in the test phase are classified only according to the protocol types.
[Figure: framework diagram; recoverable block labels: Data Pre-processing, Reduced dataset, Anomaly Detection Engine]
5.2 Data preprocessing

In this section, the data obtained during the data collection phase is utilized for our experiments to evaluate the necessary features in the IDS data, namely ISCX 2012 [44]. The data pre-processing process covers three vital components, as follows:

fixed range so that the bias in favor of higher-valued features is eliminated from the data set. Each feature within each sample is normalized by the maximum value and falls in the same interval, [−1, +1]. The data transfer and normalization process are also employed on the test data.

5.3 Classifier training
part of the experiments in the field of article used, the class of a record is set to the normal class if it is reported as normal data, and to the attack class otherwise. The parameters that achieve the maximum classification accuracy are nominated as the most suitable parameters. The optimal parameters are then used to train the SVM classifiers.

The Information Security Center of Excellence (ISCX) dataset was built at the University of New Brunswick ISCX to provide a modern-day benchmark for ID systems [46]. The dataset is derived from real packets captured over seven days of network activity involving HTTP, SMTP, SSH, IMAP, POP3 and FTP protocols, covering various scenarios of normal and malicious activities. The dataset includes a total of 2,450,324 flows and contains 19 features such as “app name”, “total destination packets”, “total source packets”, “total source bytes”, “total destination bytes”, “source payload as base64”, “source port”, “destination”, “destination payload as base64”, “destination payload as UTF”, “direction”, “source TCP flags description”, “destination TCP flags description”, “source”, “protocol name”, “destination port”, “start date time”, “stop date time”, and “tag”, excluding class.

Pentium Core i7, 16 GB RAM. SVM is taken as the classifier; since the selection of its parameters can adversely affect the performance of the approach, we have employed the LibSVM tool [47]. The parameters of all optimisation approaches are set as follows: the population size is 30 and the maximum number of iterations is 100.

6.4 Experimental results and analysis

6.4.1 Comparison of filter feature selection methods using SVMs

In the experimental work, we have employed tenfold cross-validation using the SVM classifier on the ISCX 2012 dataset. The results of this experiment are displayed in Table 2 in terms of accuracy, DR, precision, and F-measure. From Table 2, we can observe that the accuracy of the classifier is not remarkable on the ISCX 2012 dataset, particularly for SVM-L. Greater values of these performance measures signify better classification performance. The results demonstrate that when EFS is applied to ISCX 2012, SVM-R achieves an accuracy of 94.98% and overtakes all other classification methods.
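The measures reported in this section can be computed from a binary confusion matrix. The sketch below uses the standard textbook definitions, with attack as the positive class; the paper's own formulas are not part of this section, so treating them as identical is an assumption. The function name `detection_metrics` is hypothetical and the counts are made up for illustration.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix measures, with 'attack' as the positive class."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    detection_rate = tp / (tp + fn)       # DR, also called recall
    precision = tp / (tp + fp)
    false_alarm_rate = fp / (fp + tn)     # FAR, also called false positive rate
    f_measure = 2 * precision * detection_rate / (precision + detection_rate)
    return {
        "accuracy": accuracy,
        "DR": detection_rate,
        "precision": precision,
        "FAR": false_alarm_rate,
        "F-measure": f_measure,
    }


# Illustrative counts only; they are not taken from the paper.
metrics = detection_metrics(tp=90, fp=5, tn=95, fn=10)
```

In a tenfold cross-validation these values would be computed per fold and then averaged.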
CMIM f_4,f_13,f_3,f_6,f_14,f_12,f_18,f_1,f_5,f_2,f_16,f_11,f_7,f_17,f_15,f_8,f_10,f_9,f_19
JMI f_14,f_12,f_18,f_1,f_4,f_13,f_3,f_6,f_5,f_2,f_8,f_10,f_9,f_19,f_16,f_11,f_7,f_17,f_15
mRMR f_4,f_3,f_6,f_1,f_5,f_2,f_16,f_11,f_8,f_13,f_14,f_12,f_18,f_7,f_17,f_15,f_10,f_9,f_19
EFS f_4,f_3,f_1,f_2,f_5,f_8,f_16,f_10,f_7,f_15,f_10,f_9,f_19
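One simple way to combine several filter rankings into a single ordering is to sum each feature's positions across the rankings and sort by that sum (a Borda-style rule). The sketch below applies this to the CMIM, JMI and mRMR rows above. This aggregation rule is an assumption chosen for illustration: it is not stated to be the EFS combination rule, and its output is not expected to reproduce the EFS row. The function name `aggregate_by_rank_sum` is hypothetical.

```python
# Feature indices in ranked order, copied from the CMIM, JMI and mRMR rows.
cmim = [4, 13, 3, 6, 14, 12, 18, 1, 5, 2, 16, 11, 7, 17, 15, 8, 10, 9, 19]
jmi = [14, 12, 18, 1, 4, 13, 3, 6, 5, 2, 8, 10, 9, 19, 16, 11, 7, 17, 15]
mrmr = [4, 3, 6, 1, 5, 2, 16, 11, 8, 13, 14, 12, 18, 7, 17, 15, 10, 9, 19]


def aggregate_by_rank_sum(rankings):
    """Order features by the sum of their 1-based positions across all rankings."""
    totals = {}
    for ranking in rankings:
        for position, feature in enumerate(ranking, start=1):
            totals[feature] = totals.get(feature, 0) + position
    # A lower total means a better consensus rank; ties are broken by feature index.
    return sorted(totals, key=lambda f: (totals[f], f))


combined = aggregate_by_rank_sum([cmim, jmi, mrmr])
top_five = combined[:5]
```

A high-ranked subset would then be taken from the front of `combined`; the cut-off size is a separate design choice.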
Table 2 Percentage of average performance in three feature selection method and Proposed filter methods with ISCX 2012 dataset
Dataset    Classifiers  Measure    CMIM   mRMR   JMI    EFS
ISCX 2012  SVM-R        Accuracy   91.52  92.06  88.40  94.98
                        DR         90.05  91.69  88.72  93.12
                        Precision  86.18  90.16  87.65  91.69
                        F-measure  90.14  88.11  88.95  92.09
           SVM-P        Accuracy   88.26  90.33  90.14  93.32
                        DR         86.99  87.22  88.87  92.85
                        Precision  86.48  86.76  87.79  91.01
                        F-measure  85.34  88.32  86.23  90.55
           SVM-L        Accuracy   82.32  82.89  82.23  90.16
                        DR         82.11  83.87  84.78  90.31
                        Precision  82.43  82.16  81.36  88.67
                        F-measure  83.24  80.96  83.24  87.62
2012 dataset. Despite this, there is no rule by which one FS method always produces noteworthy subsets of attributes for attack classification. A specific FS strategy can be superior to others for some IDS datasets, while another FS strategy may work better for other datasets. In Table 3, all the SVM methods use RBF as the kernel function. In the normal SVM detection model, the parameters 𝜎, 𝜖, and C are randomly selected on the ISCX 2012 dataset. In the proposed SVM detection model, the optimal parameters 𝜎, 𝜖, and C are obtained by the EFSAGOA algorithm through 30 simulation experiments.

In the experimental work, we have compared the performance of the proposed method (EFSAGOA) with that of three different optimization algorithms using the SVM classifier. Table 4 compares the performance of the EFSAGOA method concerning average accuracy for ISCX 2012 data. From Table 4, the acquired outcomes illustrate that the proposed technique performs better than the other nature-inspired approaches. We have then evaluated the proposed method in comparison to GOA-SVM, GA-SVM, EFS-SVM and SVM in terms of the detection rate (DR), execution training time (EtrD), false alarm rate (FAR) and execution test time (EteD) on the ISCX 2012 dataset over each fold. To show the performance of EFSAGOA with SVM-R, experiments have been carried out on the attack data. The performance evaluations of all employed approaches concerning their execution training time (EtrD) and execution testing time (EteD) are displayed in Fig. 4 for the ISCX 2012 dataset.

As illustrated in Table 5, the proposed algorithm has achieved significant outcomes in comparison to the prevailing state of the art regarding accuracy, high DR, and low FAR amongst all employed approaches on IDS data. Figure 5 shows that the overall performance of the proposed technique is superior to the other methods for intrusion recognition in different datasets. The acquired outcomes demonstrate that the proposed method shows significant improvements in DR and accuracy.

7 Conclusion

An IDS monitors network traffic, searching for suspicious activity and known threats, and sends alerts when it finds malicious activity. Long a part of corporate cyber-security equipment, intrusion detection as a function remains critical in the modern enterprise, but may not be a standalone solution. Keeping this in mind, in this paper a novel hybrid IDS has been introduced by combining EFS and AGOA for the detection and classification of anomalies caused by an attack in the computer network. The main contributions of this study can be summarized as follows. Firstly, a supervised filter-based feature selection method is introduced, namely ensemble FS, which combines the rankings of features generated by different filters and can help to remove the
ISCX 2012 Single SVM 0.6215 0.0215 4.2437 92.93 90.45 927.9467
EFS-SVM 0.6987 0.0041 6.3956 97.95 96.37 652.2594
GA-SVM 0.3245 0.0029 4.8167 99.96 98.26 390.7865
GOA-SVM 0.0943 0.0015 1.7432 98.95 98.45 418.9453
Proposed 0.0765 0.0011 1.5521 99.32 99.13 324.4312
Fig. 4 Comparison of training time and testing time on ISCX 2012 dataset
Table 4 Comparison of the experimental performance in ISCX 2012 dataset
Methods Measures Fold1 Fold2 Fold3 Fold4 Fold5 Fold6 Fold7 Fold8 Fold9 Fold10
Proposed DR (%) 95.43 95.02 94.61 94.51 94.99 97.65 99.45 98.57 99.01 98.96
FAR (%) 1.012 1.012 0.876 1.124 1.00 0.865 1.112 0.98 0.843 0.913
EtrD 1.54 2.67 3.43 4.43 2.45 3.57 1.89 2.76 2.59 1.57
EteD 2.24 3.69 4.56 5.36 4.57 5.34 3.78 3.68 4.56 4.58
GOA-SVM DR (%) 93.56 92.36 91.31 93.36 94.26 95.26 97.06 94.26 93.08 96.34
FAR (%) 1.654 1.832 1.865 2.452 1.436 1.542 2.543 1.127 1.532 1.214
EtrD 2.54 4.49 4.64 3.65 4.13 4.42 4.14 5.37 6.34 4.34
EteD 4.67 5.66 6.53 4.39 6.35 6.54 5.34 8.27 7.14 6.54
GA-SVM DR (%) 88.43 85.43 87.72 85.16 86.32 88.31 89.37 88.34 90.10 89.42
FAR (%) 5.624 3.265 3.122 4.342 5.010 5.046 3.235 3.253 3.145 3.254
EtrD 3.75 6.03 6.54 4.56 5.46 5.39 6.42 6.68 7.47 7.24
EteD 4.78 7.52 7.38 5.43 6.57 7.47 7.54 7.18 9.05 10.08
EFS-SVM DR (%) 87.98 85.23 87.57 84.78 85.43 87.43 88.32 87.22 89.68 89.03
FAR (%) 5.78 4.34 5.62 5.73 6.57 6.49 5.79 6.39 4.89 7.38
EtrD 5.86 8.58 7.49 7.41 8.73 8.38 7.69 7.36 9.53 8.47
EteD 6.37 8.45 9.35 6.29 8.51 9.65 10.48 8.93 10.52 11.52
SVM DR (%) 87.76 85.01 87.32 84.08 83.36 86.32 87.21 86.38 89.22 88.36
FAR (%) 6.57 11.23 13.74 9.05 14.21 11.36 16.38 10.39 9.06 11.23
EtrD 14.78 18.53 15.38 17.27 19.16 16.46 18.25 16.29 20.17 18.21
EteD 18.75 19.87 20.48 21.32 24.37 18.25 20.44 19.11 24.12 25.08
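Per-fold values such as those in Table 4 are usually summarized by their mean and spread across the ten folds. The sketch below does this for the DR and FAR rows of the proposed method, using only Python's standard `statistics` module. The helper name `summarize` is hypothetical, and which summary statistics to report is our choice rather than something stated in the paper.

```python
from statistics import mean, stdev

# DR (%) and FAR (%) of the proposed method over the ten folds, copied from Table 4.
proposed_dr = [95.43, 95.02, 94.61, 94.51, 94.99, 97.65, 99.45, 98.57, 99.01, 98.96]
proposed_far = [1.012, 1.012, 0.876, 1.124, 1.00, 0.865, 1.112, 0.98, 0.843, 0.913]


def summarize(values):
    """Return the mean, sample standard deviation, minimum and maximum of the fold values."""
    return {
        "mean": mean(values),
        "stdev": stdev(values),
        "min": min(values),
        "max": max(values),
    }


dr_summary = summarize(proposed_dr)
far_summary = summarize(proposed_far)
```

The same helper can be applied to the rows of the other methods to compare them on equal terms.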
Table 5 Comparison results in ISCX 2012 dataset
Dataset    Method                 Feature  DR     FPR    Accuracy
ISCX 2012  HbPHAD [51]            All      99.04  *      *
           EMD [14]               *        90.04  7.92   90.12
           SLFN [52]              11       88.16  5.56   *
           IG-PCA-Ensemble [53]   7        99.1   0.01   99.011
           Proposed method        5        99.23  0.067  99.13

Fig. 5 Comparison of average accuracy and detection rate in ISCX 2012 dataset

irrelevant features. However, GOA is quickly trapped in local optima, and premature convergence appears when it is applied to sophisticated problems. To overcome this issue, we have introduced an adaptive behavior of GOA, called AGOA, that can contribute to predicting network traffic behavior accurately. In addition, the AGOA technique is applied to choose appropriate SVM parameters, which helps to avoid the over-fitting concern of SVM. The proposed model has provided a high detection rate of 99.23%, an accuracy of 99.13% and a low false alarm rate of 0.067 in the ISCX 2012 dataset. This quality makes the model computationally more efficient. Future work will include integrating SVM with other metaheuristic optimization feature search approaches to obtain more proficient hybrid models.

References

1. Kusyk J, Uyar MU, Sahin CS (2018) Survey on evolutionary computation methods for cybersecurity of mobile ad hoc networks. Evol Intell 10:95–117
2. Yao X (2017) The realisation of goal-driven airport enclosures intrusion alarm system. Int J Grid Util Comput 8:1–6
3. Ambusaidi MA, He X, Nanda P, Tan Z (2016) Building an intrusion detection system using a filter-based feature selection algorithm. IEEE Trans Comput 65:2986–2998
4. Alkhamisi GTMB Abrar Omar, Buhari Seyed M (2016) An integrated incentive and trust-based optimal path identification in ad hoc on-demand multipath distance vector routing for manet. Int J Grid Util Comput
5. Mirjalili SZ, Mirjalili S, Saremi S, Faris H, Aljarah I (2018) Grasshopper optimization algorithm for multi-objective optimization problems. Appl Intell 48:805–820
6. Saremi S, Mirjalili S, Lewis A (2017) Grasshopper optimisation algorithm: theory and application. Adv Eng Softw 105:30–47
7. Mafarja M, Aljarah I, Heidari AA, Hammouri AI, Faris H, Ala’M A-Z, Mirjalili S (2018) Evolutionary population dynamics and grasshopper optimization approaches for feature selection problems. Knowl Based Syst 145:25–45
8. Ewees AA, Elaziz MA, Houssein EH (2018) Improved grasshopper optimization algorithm using opposition-based learning. Expert Syst Appl 112:156–172
9. Aburomman AA, Reaz MBI (2017) A novel weighted support vector machines multiclass classifier based on differential evolution for intrusion detection systems. Inf Sci 414:225–246
10. Kuang F, Xu W, Zhang S (2014) A novel hybrid KPCA and SVM with GA model for intrusion detection. Appl Soft Comput 18:178–184
11. Denning DE (1987) An intrusion-detection model. IEEE Trans Softw Eng 2:222–232
12. Benmessahel I, Xie K, Chellal M, Semong T (2019) A new evolutionary neural networks based on intrusion detection systems using locust swarm optimization. Evol Intell 12:1–16
13. Tariq M, Majeed H, Beg MO, Khan FA, Derhab A (2019) Accurate detection of sitting posture activities in a secure iot based assisted living environment. Future Gener Comput Syst 92:745–757
14. Tan Z, Jamdagni A, He X, Nanda P, Liu RP, Hu J (2014) Detection of denial-of-service attacks based on computer vision techniques. IEEE Trans Comput 64:2519–2533
15. Satyapal Singh AKS, Mohan Kubendiran (2019) A review of intrusion detection approaches in cloud security systems. Int J Grid Util Comput 10:361–374
16. Zakeri A, Hokmabadi A (2019) Efficient feature selection method using real-valued grasshopper optimization algorithm. Expert Syst Appl 119:61–72
17. Pervez MS, Farid DM (2014) Feature selection and intrusion classification in NSL-KDD cup 99 dataset employing SVMS. In: 2014 8th international conference on software, knowledge, information management and applications (SKIMA). IEEE, pp 1–6
18. Abraham A, Jain R, Thomas J, Han SY (2007) D-SCIDS: distributed soft computing intrusion detection system. J Netw Comput Appl 30:81–98
19. Hamamoto AH, Carvalho LF, Sampaio LDH, Abrão T, Proença ML Jr (2018) Network anomaly detection system using genetic algorithm and fuzzy logic. Expert Syst Appl 92:390–402
20. Tsang C-H, Kwong S, Wang H (2007) Genetic-fuzzy rule mining approach and evaluation of feature selection techniques for anomaly intrusion detection. Pattern Recognit 40:2373–2391
21. Khammassi C, Krichen S (2017) A GA-LR wrapper approach for feature selection in network intrusion detection. Comput Secur 70:255–277
22. Shahreza ML, Moazzami D, Moshiri B, Delavar M (2011) Anomaly detection using a self-organizing map and particle swarm optimization. Sci Iran 18:1460–1468
23. Zaman S, Karray F (2009) Lightweight ids based on features selection and ids classification scheme. In: 2009 international conference on computational science and engineering, vol 3. IEEE, pp 365–370
24. Buchtala O, Klimek M, Sick B (2005) Evolutionary optimization of radial basis function classifiers for data mining applications. IEEE Trans Syst Man Cybern Part B (Cybern) 35:928–947
25. Manzoor I, Kumar N et al (2017) A feature reduced intrusion detection system using ann classifier. Expert Syst Appl 88:249–257
26. Vidal JM, Orozco ALS, Villalba LJG (2018) Adaptive artificial immune networks for mitigating dos flooding attacks. Swarm Evol Comput 38:94–108
27. Karami A, Guerrero-Zapata M (2015) A hybrid multiobjective RBF-PSO method for mitigating dos attacks in named data networking. Neurocomputing 151:1262–1282
28. Wu J, Wang H, Li N, Yao P, Huang Y, Su Z, Yu Y (2017) Distributed trajectory optimization for multiple solar-powered UAVs target tracking in urban environment by adaptive grasshopper optimization algorithm. Aerosp Sci Technol 70:497–510
29. Al-Betar MA, Awadallah MA (2018) Island bat algorithm for optimization. Expert Syst Appl 107:126–145
30. Cai J, Luo J, Wang S, Yang S (2018) Feature selection in machine learning: a new perspective. Neurocomputing 300:70–79
31. Il-Agure Z, Attallah B (2019) How mutual information interprets anomalies using different clustering. Int J Grid Util Comput 10:36–41
32. Cover TM, Thomas JA (2012) Elements of information theory. Wiley, Hoboken
33. Fathy A (2018) Recent meta-heuristic grasshopper optimization algorithm for optimal reconfiguration of partially shaded PV array. Sol Energy 171:638–651
34. Luo J, Chen H, Xu Y, Huang H, Zhao X et al (2018) An improved grasshopper optimization algorithm with application to financial stress prediction. Appl Math Model 64:654–668
35. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297
36. Chang C-C, Lin C-J (2011) Libsvm: a library for support vector machines. ACM Trans Intell Syst Technol (TIST) 2:27
37. Ebrahimpour MK, Eftekhari M (2017) Ensemble of feature selection methods: a hesitant fuzzy sets approach. Appl Soft Comput 50:300–312
38. Rankawat SA, Dubey R (2017) Robust heart rate estimation from multimodal physiological signals using beat signal quality index based majority voting fusion method. Biomed Signal Process Control 33:201–212
39. Blickle T, Thiele L (1996) A comparison of selection schemes used in evolutionary algorithms. Evol Comput 4:361–394
40. Mirjalili S, Mirjalili SM, Lewis A (2014) Grey wolf optimizer. Adv Eng Softw 69:46–61
41. Mafarja M, Aljarah I, Heidari AA, Faris H, Fournier-Viger P, Li X, Mirjalili S (2018) Binary dragonfly optimization for feature selection using time-varying transfer functions. Knowl Based Syst 161:185–204
42. Lee C-P, Leu Y, Yang W-N (2012) Constructing gene regulatory networks from microarray data using GA/PSO with DTW. Appl Soft Comput 12:1115–1124
43. Soufan O, Kleftogiannis D, Kalnis P, Bajic VB (2015) DWFS: a wrapper feature selection tool based on a parallel genetic algorithm. PLoS ONE 10:e0117988
44. Shiravi A, Shiravi H, Tavallaee M, Ghorbani AA (2012) Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Comput Secur 31:357–374
45. Elhag S, Fernández A, Bawakid A, Alshomrani S, Herrera F (2015) On the combination of genetic fuzzy systems and pairwise learning for improving detection rates on intrusion detection systems. Expert Syst Appl 42:193–202
46. Nisioti A, Mylonas A, Yoo PD, Katos V (2018) From intrusion detection to attacker attribution: a comprehensive survey of unsupervised methods. IEEE Commun Surv Tutor 20:3369–3388
47. Ravale U, Marathe N, Padiya P (2015) Feature selection based hybrid anomaly intrusion detection system using K means and RBF kernel function. Procedia Comput Sci 45:428–435
48. Shukla AK (2019) Building an effective approach toward Intrusion detection using ensemble feature selection. Int J Inf Secur Priv (IJISP) 13(3):31–47
49. Tavallaee M, Bagheri E, Lu W, Ghorbani AA (2009) A detailed analysis of the KDD cup 99 data set. In: IEEE symposium on computational intelligence for security and defense applications, 2009. CISDA 2009. IEEE, pp 1–6
50. Nadiammai G, Hemalatha M (2014) Effective approach toward intrusion detection system using data mining techniques. Egypt Inform J 15:37–50
51. Yassin W, Udzir NI, Abdullah A, Abdullah MT, Muda Z, Zulzalil H (2014) Packet header anomaly detection using statistical analysis. In: International joint conference SOCO’14-CISIS’14-ICEUTE’14. Springer, pp 473–482
52. Huang H, Khalid RS, Yu H (2017) Distributed machine learning on smart-gateway network towards real-time indoor data analytics. In: Data science and big data: an environment of computational intelligence. Springer, pp 231–263
53. Salo F, Nassif AB, Essex A (2019) Dimensionality reduction with ig-pca and ensemble classifier for network intrusion detection. Comput Netw 148:164–175

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.