Journal of Ambient Intelligence and Humanized Computing (2021) 12:497–514

https://doi.org/10.1007/s12652-020-02014-x

ORIGINAL RESEARCH

A survey of neural networks usage for intrusion detection systems


Anna Drewek‑Ossowicka1 · Mariusz Pietrołaj1 · Jacek Rumiński1

Received: 19 August 2019 / Accepted: 17 April 2020 / Published online: 12 May 2020
© The Author(s) 2020

Abstract
In recent years, advancements in the field of artificial intelligence (AI) have gained huge momentum due to the worldwide application of this technology by industry. One of the crucial areas of AI is neural networks (NN), which enable commercial utilization of functionalities previously not accessible by computers. Intrusion detection systems (IDS) are one of the domains in which neural networks are widely tested for improving overall computer network security and data privacy. This article gives a thorough overview of recent literature regarding neural network usage in the intrusion detection system area, including surveys and new method proposals. Short tutorial descriptions of neural network architectures, intrusion detection system types and training datasets are also provided.

Keywords Neural network · Deep learning · Machine learning · Intrusion detection system

1 Introduction

Cyber security is an extremely important topic for contemporary society. Instant access to the global network exposes individuals and organizations to cyber threats. For some time now, various methods such as firewalls and antivirus software have been used to protect both users' privacy and sensitive data (Choo 2011). Intrusion detection systems (IDS) represent another important area of cyber security. An IDS analyzes network traffic or a particular computer environment in order to identify signs of malicious activity (Liao et al. 2013).

The recent rise of interest in the field of artificial intelligence (AI) resulted in major advancements in, among others, pattern recognition and anomaly detection mechanisms. Neural networks (NN) are a common choice for such problems, and their usage is no longer held back, mainly due to the increase in available computational power. This situation encouraged researchers to adapt NN architectures for IDS implementation or improvement (Saied et al. 2016; Kang and Kang 2016; Yin et al. 2017).

This article presents the results of a literature survey concerning neural network usage in the cyber security area, specifically in intrusion detection systems. It focuses on reviewing the literature on the application of particular NN models to intrusion detection systems. NN became an emerging area of interest in machine learning (ML) research due to several breakthrough events, such as the success of convolutional neural network proposals in the ImageNet competition (Krizhevsky et al. 2012). This work also describes and compares recent NN methods and models used for defining new, refined IDS solutions proposed in the reviewed literature.

The main contributions of this paper are the following:

• Review of the most relevant recent papers (method proposals, surveys and tutorials) for intrusion detection systems.
• A main focus on neural network application to IDSs. Other surveys known to the authors generally focus on the wider field of machine learning.
• A solid base of knowledge for future researchers in terms of NN application to IDS.
• Stating and defining problems which have a challenging impact on related research.

* Anna Drewek‑Ossowicka
Mariusz Pietrołaj
Jacek Rumiński
1 Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Gdańsk, Poland


This paper is organized in the following way. Section 2 describes the background of the presented research. Section 3 presents a theoretical overview of IDS and NN related terms. Section 4 gives a summary of datasets that are used for IDSs, including custom solutions that we came across during our research. Section 5 describes the methodology of the literature review and the decisions made during that process. Section 6 includes the results of our literature review, including an overview of surveys, new method proposals and other papers, with categorization based on AI area and IDS focus. Section 7 covers NN security. Section 8 presents our conclusions derived from the presented work.

2 Background

In our work we decided to focus mostly on neural network applications for modern IDSs. Based on the conducted review and to the best of our knowledge, most surveys cover wide areas such as machine learning and/or data mining (Buczak and Guven 2016), not only neural networks. Additionally, some of them are older than 2015, which is our limit for searching the papers (Ahmad et al. 2009; Shah and Trivedi 2012; Vinchurkar and Reshamwala 2012). Such an approach can be limited in terms of describing specific architectures or network models used for threat detection. Another important aspect is the role of the NN in a particular solution, as it can be used for classification or, e.g., reduction of data dimensionality, as shown by available hybrid IDS methods (Pandeeswari and Kumar 2016; Erfani et al. 2016; Al-Yaseen et al. 2017).

Neural networks are among the fastest advancing technologies in terms of real-life influence. Robust usage of NN in mobile solutions and by automotive, IoT, medical and military companies makes it an exciting technology that is readily adopted by industry. All of this has a high impact on the number of analyses regarding NN application in security and privacy branches, including IDS and network tracking tools.

Finally, due to rapid advancements in the AI field, new, more efficient algorithms and NN specifications are being described (Almási et al. 2016). This is why focusing on the latest experiments is so important.

3 Intrusion detection systems and machine learning

3.1 Intrusion detection system

Intrusion detection systems are entities for auditing systems and network operations against hostile actions and policy violations (Tran et al. 2018). The IDS model was first described in the 1980s by, among others, Denning (1987). IDSs can be divided into categories using several approaches. The first two types are network-based and host-based, depending on where the intrusive behavior may be observed. Network-based IDSs monitor and analyze network traffic and are focused on network security. Host-based IDSs identify malicious activities by monitoring processes and system events in the software environment of a particular computer (Camastra et al. 2013; Buczak and Guven 2016).

Another division of IDS types is based on the data analytics approach used: signature-based (misuse-based), anomaly-based and hybrid. The signature-based approach analyzes network packets or data from a particular system (e.g. logs) in order to find signatures, i.e. patterns which are characteristic of intrusive behavior. This technique is significantly more effective for known attacks, as it leverages previously labelled data from a database. Although it is a simple and effective method, it cannot recognize unknown attacks and requires frequent database updates (Liao et al. 2013; Modi et al. 2013; Lin et al. 2015; Buczak and Guven 2016).

The anomaly-based approach analyzes data in order to recognize abnormal situations that differ from normal network and system behavior. This ability is achieved based on previously provided data, which were used to train a particular algorithm. The method is promising because, in contrast to the previous technique, it enables finding zero-day attacks. It also allows more robust customization for a particular system or network. The significant drawback is that these techniques are characterized by a high rate of false positive alarms. Because they are not based only on labelled data but are taught to recognize anomalies from previously provided data, they may flag situations that are anomalies but not necessarily cyber security attacks (Liao et al. 2013; Camastra et al. 2013; Buczak and Guven 2016; Besharati et al. 2019).

Hybrid techniques are combinations of signature and anomaly detection. Such a method is created in order to combine the advantages of both previous solutions: to minimize false alarms and also raise detection effectiveness for known attacks (Buczak and Guven 2016).

A comprehensive review conducted by Liao et al. (2013) also distinguishes some additional types, such as wireless-based, network behavior analysis, mixed IDS and stateful protocol analysis. A wireless-based IDS is analogous to a network-based one, but it captures wireless traffic. A network behavior analysis system analyzes network traffic to find malicious attacks with unexpected traffic flows. Mixed IDSs combine multiple technologies to provide more comprehensive and accurate intrusion detection. Stateful protocol analysis, on the other hand, is used to analyze specific states of a particular network protocol to find potentially harmful patterns (Liao et al. 2013) (Table 1).

Table 1  Summary of some IDS types (Liao et al. 2013)

Criterion               IDS type
Detection area          Host-based
                        Network-based
                        Wireless-based
                        Network behavior analysis
                        Mixed
Detection methodology   Signature-based
                        Anomaly-based
                        Stateful protocol analysis
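The difference between the signature-based, anomaly-based and hybrid approaches described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two signature strings, the use of request size as the only monitored feature, and the three-standard-deviation threshold are hypothetical and are not taken from any of the surveyed systems.

```python
from statistics import mean, stdev

# Hypothetical signature database: substrings that mark known attacks.
SIGNATURES = {"' OR 1=1": "sql_injection", "/etc/passwd": "path_traversal"}


def signature_detect(payload):
    """Return the name of the first matching known-attack signature, or None."""
    for pattern, name in SIGNATURES.items():
        if pattern in payload:
            return name
    return None


class AnomalyDetector:
    """Flags values that lie far from the mean of the (normal) training data."""

    def __init__(self, normal_values, threshold=3.0):
        self.mu = mean(normal_values)
        self.sigma = stdev(normal_values)
        self.threshold = threshold  # assumed cut-off, in standard deviations

    def is_anomalous(self, value):
        return abs(value - self.mu) > self.threshold * self.sigma


def hybrid_detect(payload, size, detector):
    """Check known signatures first, then fall back to the anomaly model."""
    name = signature_detect(payload)
    if name is not None:
        return "known attack: " + name
    if detector.is_anomalous(size):
        return "anomaly"
    return "normal"


# Train the anomaly model on request sizes observed during normal operation.
detector = AnomalyDetector([100, 110, 90, 105, 95])
print(hybrid_detect("GET /../../etc/passwd", 102, detector))
print(hybrid_detect("GET /index.html", 5000, detector))
print(hybrid_detect("GET /index.html", 102, detector))
```

The sketch also shows the trade-off noted in the text: the signature check cannot see an attack that is missing from its database, while the anomaly check reports any unusual request size, whether or not it is an attack.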
Camastra et al. (2013) also present a categorization of machine learning and soft computing (SC) approaches used for IDS modeling. Four groups of ML and SC approaches are described: supervised learning-based, unsupervised learning-based, statistical modeling-based and ensemble-based. The first is used for detecting attacks that are known, while unsupervised techniques work for new intrusions. The statistical modeling-based approach is used for monitoring user behavior and assessing whether it differs from the behavior defined as 'normal'. Ensemble-based approaches, on the other hand, combine several models in order to improve efficiency and accuracy.

3.2 Neural networks

It is not easy to find an unambiguous definition of an artificial neural network (ANN) in the literature (Guresen and Kayakutlu 2011). An accurate explanation is given by Haykin, who describes an ANN as a “massively parallel combination of simple processing unit which can acquire knowledge from the environment through a learning process and store the knowledge in its connections” (Haykin 1994; Guresen and Kayakutlu 2011). In general, it can be stated that neural networks aim to resemble the inference of the human brain.

Many different architectures of neural networks have been applied in the domain of intrusion detection systems. The most extensively used are described below (Veen 2016):

Multi-layer perceptron (MLP) is a feed-forward NN built from single perceptrons, which are simple computational models resembling biological neurons (Rosenblatt 1958). The network consists of at least three fully connected layers of perceptrons: input, hidden and output. Figure 1 presents a general example of such a network. Neurons inside a particular layer have no connections with each other. Supervised training of an MLP usually uses the backpropagation algorithm, based on the input and output examples provided to the network. The error between the expected and calculated results is propagated back to previous layers of the network, hence the name. Figure 2 depicts a simple diagram of the NN learning process, including the backpropagation step. With a proper number of neurons and hidden layers, an MLP should be able to learn a quite accurate approximation of the function relating input and output data.

Fig. 1  The simple architecture of a three layer feed-forward neural network (LeNail 2019). Created using a program distributed with MIT license: https://github.com/alexlenail/NN-SVG and described in the article under CC-BY license: https://joss.theoj.org/papers/10.21105/joss.00747.pdf

Fig. 2  The generalized learning process of artificial neural network, including feedforward and backpropagation steps (LeNail 2019). Created using a program distributed with MIT license: https://github.com/alexlenail/NN-SVG and described in the article under CC-BY license: https://joss.theoj.org/papers/10.21105/joss.00747.pdf

Recurrent neural network (RNN) is an extension of the standard feed-forward NN that leverages time and sequence dependencies. The main difference introduced by the RNN architecture is a cyclic neuron connection, which enables inference to take previous states of neurons into consideration. This feature allows a network unit to remember its previous state (Elman 1990). RNN is especially useful in the area of language and video processing, where the context of the data sequence is highly relevant to the structure of the input data. A major obstacle in training RNNs is the known problem of exploding or vanishing gradients (Kim et al. 2016).
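The feed-forward and backpropagation steps of the MLP described above can be sketched for a very small network. This is only an illustration under stated assumptions: a 2-2-1 layout, sigmoid activations, a squared-error loss and a learning rate of 0.5 are arbitrary choices, and the function names `forward` and `train_step` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Forward pass through a 2-2-1 MLP; every weight row ends with a bias term."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row[:-1], x)) + row[-1])
              for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out[:-1], hidden)) + w_out[-1])
    return hidden, output


def train_step(x, target, w_hidden, w_out, lr=0.5):
    """One in-place backpropagation update; returns the loss before the update."""
    hidden, output = forward(x, w_hidden, w_out)
    error = output - target
    # Output delta: derivative of 0.5 * error**2 through the output sigmoid.
    delta_out = error * output * (1.0 - output)
    # Hidden deltas: the output delta propagated back through the output weights.
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
    for j, h in enumerate(hidden):
        w_out[j] -= lr * delta_out * h
    w_out[-1] -= lr * delta_out
    for j, row in enumerate(w_hidden):
        for i, v in enumerate(x):
            row[i] -= lr * delta_hidden[j] * v
        row[-1] -= lr * delta_hidden[j]
    return 0.5 * error ** 2


# Arbitrary starting weights; the last entry of each row is the bias.
w_hidden = [[0.1, -0.2, 0.0], [0.3, 0.4, 0.0]]
w_out = [0.5, -0.5, 0.0]
losses = [train_step([1.0, 0.0], 1.0, w_hidden, w_out) for _ in range(200)]
print("loss before training:", losses[0])
print("loss after training: ", losses[-1])
```

Repeating the update on a single example drives the loss down, which is the behavior summarized in Fig. 2. A practical IDS model would train on many labelled records and would normally rely on an established library.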


Long short-term memory (LSTM) was presented as a solution to the difficulties related to RNNs. LSTM helps to overcome the previously mentioned vanishing and exploding gradient problem. In order to avoid weight conflicts, this architecture introduces a new memory cell (Hochreiter and Schmidhuber 1997). The structure of such a cell includes input, output and forget gates. The main advantage of this architecture is the ability of the network to learn over long sequences of data. This is why it is widely used for text and video processing.
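How the input, forget and output gates protect the memory cell can be sketched with a scalar LSTM cell. This follows the commonly used formulation of the cell update. The parameter layout (one weight for the input, one for the previous hidden state and a bias per gate) and the name `lstm_step` are assumptions made for this illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell.

    params maps each gate name to (weight on x, weight on h_prev, bias).
    """
    def pre_activation(name):
        w_x, w_h, bias = params[name]
        return w_x * x + w_h * h_prev + bias

    i = sigmoid(pre_activation("input"))        # how much new content to write
    f = sigmoid(pre_activation("forget"))       # how much old content to keep
    o = sigmoid(pre_activation("output"))       # how much of the cell to expose
    g = math.tanh(pre_activation("candidate"))  # proposed new content
    c = f * c_prev + i * g                      # updated memory cell
    h = o * math.tanh(c)                        # new hidden state
    return h, c


# Hypothetical parameters: the forget gate is held open and the input gate shut.
params = {
    "input": (0.5, 0.5, -100.0),
    "forget": (0.5, 0.5, 100.0),
    "output": (0.5, 0.5, 0.0),
    "candidate": (0.5, 0.5, 0.0),
}
h, c = 0.0, 0.7
for x in [1.0, -1.0, 0.5] * 10:
    h, c = lstm_step(x, h, c, params)
print("cell state after 30 steps:", c)
```

With the forget gate saturated near one and the input gate near zero, the cell state is carried unchanged across all 30 steps, which illustrates why the architecture can retain information over long sequences.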
Autoencoder (AE) is a variation of MLP used in an unsupervised manner, although, as presented in Fig. 3, the architecture of the network is quite similar. One of the possible uses of an AE is compression or reduction of input dimensionality. The input layer passes data towards the output layer through a limited number of hidden units, which create a bottleneck in the network structure and encode the provided data (Bourlard and Kamp 1988). Decoding takes place in the further layers up to the output layer, whose number of neurons usually corresponds to that of the input layer. Such a network construction resembles the shape of an hourglass.

Sparse autoencoder (sparse AE) is a NN whose architecture is opposite to the AE presented earlier. Instead of having a bottleneck in the central part of the network, the central hidden layer is the one with the highest number of neurons, as depicted in Fig. 4. Sparse AE is an example of an unsupervised method for learning overcomplete features. The proposed model consists of the encoder, the “sparsifying” logistic, which is a non-linear data transformer, and the decoder (Ranzato et al. 2007). This architecture is mostly used to extract features from a large set of unlabeled data.

Fig. 4  The architecture of a sparse autoencoder (LeNail 2019). Created using a program distributed with MIT license: https://github.com/alexlenail/NN-SVG and described in the article under CC-BY license: https://joss.theoj.org/papers/10.21105/joss.00747.pdf

Deep belief network (DBN) is an example of a deep neural network, which basically consists of stacked restricted Boltzmann machines (RBM) that communicate with each other. An RBM is a simple two-layer neural network that can learn the probability distribution of particular inputs. DBN tries to overcome the problem of suboptimal solutions reached by commonly used gradient-based learning algorithms. The unsupervised greedy layer-wise learning algorithm utilized by DBN trains the network part by part in order to find an optimal general solution (Liu et al. 2017).

Convolutional neural network (CNN) is a deep neural network consisting of multiple layers, as presented in Fig. 5. The main use of CNN is image recognition, but with additional architectural or input modifications it can be used for various other use cases. By design, CNN provides specific layer functions for filtering, such as convolution and pooling. In contrast to other NN architectures, not all of the layers in CNNs are fully connected. Some neurons focus on a specific group of data, which helps to analyze or extract features for a particular region of an image. As these NNs are usually designed to deal with 2D shapes, they are mostly used for data or image classification (Lecun et al. 1998; Guo et al. 2016).
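The convolution and pooling operations mentioned for CNNs can be sketched directly on a small 2D input. The example is a simplified illustration: the 5x5 input, the 2x2 edge-detecting kernel and the function names are assumptions, and the convolution is implemented as the sliding-window product (cross-correlation) that CNN layers typically compute.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image without padding and sum the products."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k_h) for j in range(k_w))
             for c in range(out_w)]
            for r in range(out_h)]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]


# A 5x5 input with a vertical edge between the second and third column.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# A kernel that responds where values increase from left to right.
kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d(image, kernel)
print(feature_map)            # [[0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]
print(max_pool(feature_map))  # [[2, 0], [2, 0]]
```

Each output value depends only on a small region of the input, which is the local connectivity that distinguishes these layers from fully connected ones, and pooling keeps the strongest response while shrinking the feature map.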
Fig. 3  The architecture of an autoencoder (LeNail 2019). Created using a program distributed with MIT license: https://github.com/alexlenail/NN-SVG and described in the article under CC-BY license: https://joss.theoj.org/papers/10.21105/joss.00747.pdf

Fig. 5  An example of convolution neural network architecture (LeNail 2019). Created using a program distributed with MIT license: https://github.com/alexlenail/NN-SVG and described in the article under CC-BY license: https://joss.theoj.org/papers/10.21105/joss.00747.pdf

Extreme learning machine (ELM) is an example of another modification of a standard feed-forward neural network. The main purpose of this solution is to address


bottlenecks that slow down the training process and come mostly from using backpropagation-based algorithms. This method can speed up the training of the network up to a thousand times, with similar accuracy in comparison to standard NN methods. This result is achieved by random connections between neuron layers and a different learning algorithm based on a least-squares fit (Huang et al. 2006).

Self-organizing map (SOM) defines yet another NN mechanism for unsupervised data aggregation. One of the goals stated for SOM is a reduction of the dimensional complexity of input data. Due to that, such a network architecture can find specific clusters of categories in a large input database. In contrast to the commonly used approaches, backpropagation is replaced here with a competitive learning algorithm to enable mapping of features (Kohonen 1982).

3.3 Other machine learning methods

A lot of research focuses on hybrid approaches to IDS, which makes the NN only a part of the final method. Several different machine learning architectures are used for IDS. The examples that appear in the reviewed literature use the following other ML techniques:

K-nearest neighbors (K-NN) is a supervised learning algorithm based on calculating the Euclidean distance between given input data. The K-NN method is commonly used for classification of a given collection. The simplicity of the solution comes from categorizing the input according to the calculated Euclidean distances from the classified samples of the training set. A particular element is then assigned the class held by the majority of its K nearest neighbors, hence the name (Cunningham and Delany 2007).

Support-vector machine (SVM) belongs to supervised ML methods and is motivated by statistical learning theory. The process of training SVMs relates to solving a constrained quadratic optimization problem. The easiest explanation of how an SVM works is that it finds an optimal hyperplane that separates examples into two groups. In 2D space such a hyperplane is a straight line. Its main role is binary classification of given data. When given unlabeled data, SVM can be used as a clustering mechanism (Evgeniou and Pontil 2001).

4 Datasets

Neural network training requires a significant amount of data in order to approximate an effective correlation between the provided input and the expected results. This issue is particularly noticeable in the case of supervised learning. Unfortunately, the major part of publicly available datasets for IDSs is quite old and does not provide an ideal representation of network traffic and possible threats. This obstacle might be resolved by gathering data manually or using customized versions of already available datasets. However, not having a common benchmark for new IDS implementations makes it difficult to compare methods in terms of accuracy and false-positive alerts.

The following section gives an overview of datasets available for, among others, neural network training regarding IDS implementation, that were used in the discussed papers and beyond. Some custom methods to generate training data are mentioned as well (Narudin et al. 2016; Wang et al. 2017) in order to help researchers find new ways to verify their own IDSs.

4.1 Public datasets

4.1.1 DARPA 1998 and DARPA 1999

The Defense Advanced Research Projects Agency (DARPA) datasets are treated as a basic, publicly available standard. DARPA 1998 was introduced by the Cyber Systems and Technology Group of the Massachusetts Institute of Technology Lincoln Laboratory (Lincoln Laboratory 1998; Lippmann et al. 2000a, 2000b; Buczak and Guven 2016).

It was created based on (including both network and OS data):

• TCP/IP network data.
• Solaris basic security module logs.
• Solaris file system dumps (root and user) (Buczak and Guven 2016).

This dataset consists of network and operating system data. The data was gathered for 9 weeks: 7 for the training set and 2 for the testing set (Lippmann et al. 2000b).

DARPA 1999 is a successor of DARPA 1998. In this case, data was gathered for 5 weeks: 3 for training and 2 for testing. The major distinction between them is an expanded range of available attack scenarios (Lippmann et al. 2000a).

Although DARPA 1998 and DARPA 1999 are usually presented as commonly used datasets for experiments, during our research we did not encounter new methods using them. A possible reason is that those datasets turned out not to be fully capable of simulating physical network systems (McHugh 2000; Aljawarneh et al. 2018) and are currently being replaced by newer proposals.

4.1.2 KDD Cup 1999

The KDD Cup 1999 dataset (KDD Cup 1999) is one of the most often used datasets for evaluating IDSs. It utilizes TCP/IP data from the DARPA 1998 dataset. While DARPA 1998


consists of about 5 million records in training data and around 2 million records in testing data, the KDD Cup 1999 training part has around 4,900,000 connection vectors (Tavallaee et al. 2009). Each vector has 41 features and is classified as a normal connection or an attack. Additionally, it can belong to one of four attack types (Tavallaee et al. 2009; Dhanabal and Shantharajah 2015):

Denial of service (DoS): a case when an attacker purposely exhausts the victim's resources with a flood of malicious requests in order to make it unable to handle legitimate calls to the service.

User to root (U2R): raising normal user privileges to those of a super user (root) by exploiting some vulnerabilities in the attacked system.

Probe (probing): exploring or examining the victim or its environment in order to gain information. Port scanning or checking the duration of a connection are only a few examples.

Remote to local (R2L): access of an unauthorized entity to a remote machine and gaining local privileges.

The 41 features are divided into three groups (Tavallaee et al. 2009):

Basic features: general features for TCP connections.

Content features: features describing invalid behaviors for single connections, helping to discover R2L and U2R attacks.

Traffic features: features defined using a time window.

Despite its huge popularity and the amount of available data, KDD Cup 1999 struggles with problems. Some of them were inherited from the DARPA 1998 dataset, such as being a fully synthetic dataset or the lack of examination of possibly dropped packets while the dataset was being created. KDD Cup 1999 itself also suffers from an uneven distribution of the attacks and from record redundancy (Tavallaee et al. 2009). While describing KDD Cup 1999, we noticed that one of the traffic features, dst_host_same_src_port_rate, is described in the literature as “same_src_port_rate for destination host” (KDD Cup 1999; Shanmugavadivu and Nagarajan 2011; Amiri et al. 2011; Songma et al. 2012), while in KDD Cup 1999, to the best of our knowledge, we could not find a same_src_port_rate feature, hence the lack of description in Table 2.

4.1.3 NSL‑KDD

NSL-KDD is a dataset that was created to overcome the issues of the DARPA and KDD Cup 1999 datasets (Tavallaee et al. 2009; Dhanabal and Shantharajah 2015). It was proposed by Tavallaee et al. (2009). The main advantages over KDD Cup 1999 are (NSL-KDD 2009):

• No redundant records in the training set and no duplicated records in the testing set.
• The number of records is reasonable, so there is no need to create subsets of the dataset for the experiments.
• The number of selected records from each difficulty level group is inversely proportional to the percentage of records in the original KDD Cup 1999 dataset.

NSL-KDD is still not perfect (due to problems that are going to be listed in the next section); nevertheless, it can be used for effective benchmarking of IDSs.

4.1.4 UNSW‑NB15

Extensive usage of the KDD Cup 1999 and NSL-KDD datasets resulted in discovering the following challenges:

• Missing some low footprint attack characteristics,
• Missing some traffic schemes (e.g. normal and modern),
• Discrepancy between the distributions of particular data sets (training vs. testing) (Moustafa and Slay 2016).

UNSW-NB15 was created as a response to the above problems. It was created with the use of the IXIA PerfectStorm tool in the Cyber Range Lab of the Australian Centre for Cyber Security (ACCS) (UNSW-NB15 2015; Moustafa and Slay 2016). UNSW-NB15 consists of 49 features. There are two attributes for the data provided: label (0 for normal and 1 otherwise) and attack_cat for the attack category (Moustafa and Slay 2016). There are five categories of features: Flow, Basic, Content, Time and Additional Generated Features. The types of attacks are: Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode and Worms (Moustafa and Slay 2015, 2016) (Table 3).

4.1.5 Kyoto2006+

Another example of publicly available benchmark data for IDS training and testing is the Kyoto2006+ dataset. It presents 24 different network related features which have been extracted from servers placed at Kyoto University. 14 features are obtained from KDD Cup 1999, while 10 other features were newly added (Song et al. 2011). Data was gathered for three years, from 2006 to 2009 (Ambusaidi et al. 2016). It was created as an alternative to KDD Cup 1999 (Song et al. 2011). There is also a benchmark version described, which contains 17 features (14 derived from KDD Cup 1999 and 3 additional) (Kyoto dataset 2015).

4.2 Other datasets

Some of the researchers decided to experiment with datasets other than the public ones presented above. Erfani et al. (2016) used six real-life datasets and two synthetic ones. The six real-life ones were received from the UCI Machine Learning Repository:


Table 2  Features of individual TCP connections in KDD Cup 1999 (KDD Cup 1999; Amiri et al. 2011)

Feature                       Description

Basic features
Duration                      Connection length expressed in seconds
Protocol_type                 Type of connection protocol, e.g. udp
Service                       Destination network service, e.g. telnet
Flag                          Connection status (normal or error)
Src_bytes                     Bytes from source to destination point
Dst_bytes                     Bytes from destination point to source
Land                          1 if connection is from/to the same host/port, 0 otherwise
Wrong_fragment                Number of incorrect fragments
Urgent                        Number of packets marked as urgent

Content features
Hot                           Number of “hot” indicators
Num_failed_logins             Number of failed login attempts
Logged_in                     1 if login attempt was successful, 0 otherwise
Num_compromised               Number of “compromised” conditions
Root_shell                    1 if root shell was accessed, 0 otherwise
Su_attempted                  1 if there was a “su root” attempt, 0 otherwise
Num_root                      Number of “root” accesses
Num_file_creations            Number of file creation operations
Num_shells                    Number of shell prompts
Num_access_files              Number of operations on access control files
Num_outbound_cmds             Number of outbound commands in an ftp connection
Is_hot_login                  1 if login is on the “hot” list, 0 otherwise
Is_guest_login                1 if the login is classified as “guest”, 0 otherwise

Traffic features
Count                         Number of connections to the same host as the current one in the past two seconds
Srv_count                     Number of connections to the same service as the current one in the past two seconds
Serror_rate                   Percentage of connections having “SYN” errors
Srv_serror_rate               Percentage of connections having “SYN” errors (service)
Rerror_rate                   Percentage of connections having “REJ” errors
Srv_rerror_rate               Percentage of connections having “REJ” errors (service)
Same_srv_rate                 Percentage of connections to the same service
Diff_srv_rate                 Percentage of connections to different services
Srv_diff_host_rate            Percentage of connections to different hosts
Dst_host_count                Count for destination host
Dst_host_srv_count            Srv_count for destination host
Dst_host_same_srv_rate        Same_srv_rate for destination host
Dst_host_diff_srv_rate        Diff_srv_rate for destination host
Dst_host_same_src_port_rate   Lack of detailed description
Dst_host_srv_diff_host_rate   Srv_diff_host_rate for destination host
Dst_host_serror_rate          Serror_rate for destination host
Dst_host_srv_serror_rate      Srv_serror_rate for destination host
Dst_host_rerror_rate          Rerror_rate for destination host
Dst_host_srv_rerror_rate      Srv_rerror_rate for destination host
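As an illustration of how the 41 features in Table 2 can be turned into input for a model, the sketch below parses one connection record. It rests on several assumptions: records are comma-separated, their columns follow the order of Table 2 (names lowercased) with the class label as a final field ending in a period, and only a handful of well-known labels are mapped to the four attack types. The sample record is made up for this example.

```python
# Column names follow Table 2 (lowercased), in the order given there.
BASIC = ("duration protocol_type service flag src_bytes dst_bytes land "
         "wrong_fragment urgent").split()
CONTENT = ("hot num_failed_logins logged_in num_compromised root_shell "
           "su_attempted num_root num_file_creations num_shells "
           "num_access_files num_outbound_cmds is_hot_login "
           "is_guest_login").split()
TRAFFIC = ("count srv_count serror_rate srv_serror_rate rerror_rate "
           "srv_rerror_rate same_srv_rate diff_srv_rate srv_diff_host_rate "
           "dst_host_count dst_host_srv_count dst_host_same_srv_rate "
           "dst_host_diff_srv_rate dst_host_same_src_port_rate "
           "dst_host_srv_diff_host_rate dst_host_serror_rate "
           "dst_host_srv_serror_rate dst_host_rerror_rate "
           "dst_host_srv_rerror_rate").split()
FEATURES = BASIC + CONTENT + TRAFFIC

# Features holding text rather than numbers.
SYMBOLIC = {"protocol_type", "service", "flag"}

# Partial, illustrative mapping from record labels to the four attack types.
ATTACK_CATEGORY = {
    "normal": "normal",
    "smurf": "DoS", "neptune": "DoS",
    "buffer_overflow": "U2R", "rootkit": "U2R",
    "guess_passwd": "R2L", "ftp_write": "R2L",
    "ipsweep": "Probe", "portsweep": "Probe",
}


def parse_record(line):
    """Split one record into a feature dict, its label and its attack type."""
    fields = line.strip().rstrip(".").split(",")
    if len(fields) != len(FEATURES) + 1:
        raise ValueError("expected %d fields, got %d"
                         % (len(FEATURES) + 1, len(fields)))
    *values, label = fields
    features = {name: (value if name in SYMBOLIC else float(value))
                for name, value in zip(FEATURES, values)}
    return features, label, ATTACK_CATEGORY.get(label, "unknown")


# A made-up record assembled field by field (41 values plus the label).
SAMPLE = ",".join(
    ["0", "tcp", "http", "SF", "181", "5450"] + ["0"] * 5 + ["1"] + ["0"] * 10
    + ["8", "8"] + ["0.00"] * 4 + ["1.00", "0.00", "0.00"] + ["9", "9"]
    + ["1.00", "0.00", "0.11"] + ["0.00"] * 5 + ["normal."]
)

features, label, category = parse_record(SAMPLE)
print(len(features), label, category)
```

The symbolic features would still need to be encoded numerically (for example one-hot) and the numeric ones scaled before being passed to a neural network. Those steps are left out of this sketch.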

• Forest,
• Adult,
• Gas sensor array drift (Gas),
• Opportunity activity recognition (OAR),
• Daily and sport activity (DSA),
• Human activity recognition using smartphones (HAR),

and have dimensionalities of 54, 123, 128, 242, 315 and 561 attributes. Synthetic datasets were the “Banana” dataset, created by mixing “two banana shaped distributions”, and


Table 3  UNSW-NB15: groups of features and labels (Moustafa and Slay 2015, 2016)

Group of features | Name
Flow features | srcip, sport, dstip, dsport, proto
Basic features | state, dur, sbytes, dbytes, sttl, dttl, sloss, dloss, service, sload, dload, spkts, dpkts
Content features | swin, dwin, stcpb, dtcpb, smeansz, dmeansz, trans_depth, res_bdy_len
Time features | sjit, djit, stime, ltime, sintpkt, dintpkt, tcprtt, synack, ackdat
Additional features | is_sm_ips_ports, ct_state_ttl, ct_flw_http_mthd, is_ftp_login, ct_ftp_cmd, ct_srv_src, ct_srv_dst, ct_dst_ltm, ct_src_ltm, ct_src_dport_ltm, ct_dst_sport_ltm, ct_dst_src_ltm
Labels | attack_cat, label

the second one was “Smiley”, a combination of Gaussians and arc-shaped distributions (Erfani et al. 2016).

Kang and Kang (2016) worked on data created by the packet generator of the open car test-bed and network experiments (OCTANE) (Borazjani et al. 2014), which was able to generate CAN (controller area network) packets, a standard for in-vehicle network communication.

Narudin et al. (2016) focused on mobile malware detection. They used two datasets: a public one (MalGenome) and a self-collected, private dataset. MalGenome consists of 1260 malware records categorized into 49 different groups. It was gathered between 2010 and 2011 (Narudin et al. 2016).

Saied et al. (2016) used an artificial neural network for detecting DDoS attacks. In order to generate datasets, they built a safe, realistic network, where they performed DDoS attacks (TCP, UDP and ICMP protocols).

David and Netanyahu (2015) used an extensive dataset provided by C4 Security with multiple malware categories. It was used for a Deep Belief Network to generate signatures for malware records.

An interesting example is provided by Du et al. (2017): the HDFS log dataset and the OpenStack log dataset. The first comes from a Hadoop-based environment, the second from the OpenStack environment. Those datasets are particularly interesting because they are examples of datasets based on logs, not network packets.

Wang et al. (2017) present, on the other hand, an example of a self-generated dataset called USTC-TFC2016. It consists of two parts. The first contains ten types of malware traffic from a publicly accessible website, the second contains ten types of non-malware traffic.

Network traffic data can also be gathered by NetFlow, which was created as a Cisco router feature, as mentioned in (Buczak and Guven 2016). A network flow in this understanding is a sequence of packets sharing exactly the same packet features: IP protocol, source port, destination port, IP type of service, ingress interface, source IP address and destination IP address (Buczak and Guven 2016).

Both approaches (public vs. private/privately generated datasets) have their pros and cons. In the case of public datasets it is possible to easily compare the results of the experiments with the results of other methods and to benchmark particular solutions. On the other hand, those datasets are considered too general and not flexible enough to address contemporary needs in terms of IDSs. Private datasets can be prepared for a specific experiment and better address particular needs; nevertheless, they can be subject to privacy concerns and have too specific a form, which makes them hard to use on a wider scale.

5 Research method

This paper is designed for researchers who need a comprehensive source of data concerning the available literature on Neural Network usage for Intrusion Detection Systems. We decided to review the newest scientific literature concerning the topic above. In order to achieve that, we performed a systematic literature review. The Google Scholar database was searched with two search strings: (1) “intrusion detection system” AND “neural network”, (2) “intrusion detection system” AND “neural networks”. We did not include the word “artificial” in the search strings, because articles refer to NN both as “neural network(s)” and as “artificial neural network(s)”, and we did not want to accidentally exclude any important articles, especially considering that the term “Artificial Neural Network” is quite frequently used for a particular architecture, like the multi-layer perceptron. These search strings were searched separately, due to lack of confirmation that Google Scholar accepts parentheses in search strings (Tay 2015). Publish or Perish software (Harzing 2007) was


used to perform the queries due to the ease of exporting data to Excel files, which was necessary to perform the later review. We searched for articles published between 2015 and 2019 and sorted the results by number of citations. The search was performed on April 6th, 2019. The goal was to review the literature that currently has the biggest influence, and we decided that citation count is a reasonably good indicator of a paper's potential impact, as the authors of other surveys/literature reviews proposed (Buczak and Guven 2016). From each sorted list (one for each of the two search strings) we excluded patents and books in order to focus on journal and conference papers. After exclusion we chose 50 positions from each list and merged both lists. The majority of entries were repeated, so the final number of articles prepared for abstract review was 62. Next, we performed an abstract review to assess whether a particular article is relevant to our research. The final list of articles for literature review contained 34 articles (Fig. 6).

Fig. 6  Number of articles at each step of the manual literature review

We are conscious that citation count is not a flawless measure. The older a paper is, the bigger the chance it has of obtaining a high number of citations. That is why we decided to review papers from a short time range. We are also aware that surveys and tutorials, due to their informative nature, generally receive a high number of citations, hence in the final summary they are presented in a separate category (Table 4).

Table 4  Search criteria used for literature review

Search parameter | Value
Database | Google Scholar
Search date | 6th of April 2019
Search strings | “Intrusion detection system” AND “neural network”; “Intrusion detection system” AND “neural networks”
Timeline | 2015–2019
Sorting | Highest citation count
Document type | Journals; conference papers

6 Results

The reviewed articles can be categorized into three separate groups. The first one consists of surveys focused on the usage of machine learning algorithms for IDSs. It covers not only particular examples of intrusion detection methods, but also general knowledge about Artificial Intelligence and datasets available for training ML algorithms. The second group gathers all articles that focus on new method proposals or experiments, including strict neural network usage and hybrid solutions for IDSs. The third group contains articles that cannot be categorized into either of the first two groups. Ten articles have been marked as other, as they focus mostly on the datasets themselves or on machine learning methods that are not strictly related to NN. For the purpose of this research we decided to include them as well, as they present a wider range of available IDS solutions. The chart below summarizes the ratio of the reviewed articles (Fig. 7):

Fig. 7  The categories of the final group of the reviewed articles

6.1 Surveys

During the conducted literature review we came across a couple of surveys regarding possible IDS implementations or enhancements with usage of Artificial Intelligence. Most of the articles present a good theoretical background in machine learning and IDSs for researchers interested in this field. Nevertheless, neural networks are usually treated only as a small part of the available solutions. It is also worth mentioning that papers that emphasized recent advancements in deep learning were able to depict a wider area of the NN field.

Having in mind how important the aspect of datasets is for NN related methods, we decided to review the mentioned surveys in terms of described or proposed datasets. This shows that, besides long-serving databases such as KDD Cup 1999, NSL-KDD etc., it is hard to define a reliable dataset for


Table 5  Summary of reviewed surveys

Authors | Citations (6th Apr 2019) | AI area focus | Datasets focus | Summary | Year
Buczak and Guven (2016) | 423 | Machine learning and data mining: neural networks, association and fuzzy association rules, Bayesian network, clustering, decision trees, ensemble learning, evolutionary computation, hidden Markov models, inductive learning, Naïve Bayes, sequential pattern mining, support vector machine | NetFlow, DARPA 1998, DARPA 1999, KDD Cup 99, NSL-KDD | Detailed overview of available data mining and machine learning methods, including comparison of datasets, performance comparisons and recommendations for IDS implementation | 2016
Agrawal and Agrawal (2015) | 139 | Clustering: k-means, k-medoids, EM clustering, outlier detection algorithms. Classification: classification tree, fuzzy logic, Naïve Bayes network, genetic algorithm, neural networks, support vector machine. Hybrid: cascading supervised techniques, combining supervised and unsupervised techniques | N/A | Overview of available data mining techniques and hybrid methods with examples that have been implemented for IDSs. Focused on anomaly detection | 2015
Fadlullah et al. (2017) | 117 | Machine learning and deep learning: convolutional NN, recurrent NN, long short term memory NN, stacked auto-encoder, deep Boltzmann machines, deep reinforcement learning | N/A | Overview of deep learning architectures and their applications for network related traffic control | 2017
Narudin et al. (2016) | 104 | Machine learning: random forest, J48, multi-layer perceptron, Bayesian network, k-NN | MalGenome; custom, self-collected database | Focus on mobile malware detection, also with usage of IDS. Overview of possible applications of the described methods and their verification on chosen datasets. It is not a classic survey, rather an evaluation of existing methods; the authors decided to keep it in this category, as to our best understanding it provides an evaluation of machine learning classifiers | 2016
Kwon et al. (2019) | 51 | Deep learning and machine learning: restricted Boltzmann machine, deep belief network, deep neural network, recurrent neural network | KDD Cup 99, NSL-KDD | Overview of multiple methods of data dimensionality reduction and possible DL applications to IDS enhancement. The authors also performed an experiment with a fully connected network model on the NSL-KDD dataset | 2017


benchmarking that could be used across all reviewed solutions. UNSW-NB15 is an example of a newer one.

Table 5 briefly summarizes the surveys that were finally chosen for review in our research. As mentioned before, we decided to focus on two major aspects: the AI area and the datasets described in a particular paper. This gives a good base for further data gathering or research in terms of NN and IDSs.

Not all ML techniques have been thoroughly explained in this paper. The number of proposed solutions is so high that we decided to list them in the table and redirect our readers to the specific article.

6.2 New methods proposals and experiments

The major part of the reviewed articles presents new methods for IDSs based on neural networks, or performed experiments. The table below summarizes the NN architectures used by researchers. Additionally, the dataset used for method validation is stated. We did not perform an accuracy comparison of the proposed solutions. Such a comparison might not be informative due to the different datasets or data subsets that were used. Additionally, there are some differences in data preparation steps or in the types of attacks detected by a particular IDS. Due to the vast variety of available method proposals, only part of the algorithms or NN architectures listed below have been described in this article. For each listed publication, the column “method used” lists the general mechanism used by the particular solution. For more details the reader is redirected to the related paper (Table 6).

6.3 Other related papers

Some of the papers that we reviewed could not be easily classified into the category of surveys or new method proposals/experiments. In this group we placed works that present:

• Interesting IDS enhancing methods that are not directly connected to neural networks,
• Papers that focus on the datasets themselves.

Although this paper focuses on NN based IDSs, we think that mentioning the most cited ML based solutions might be beneficial for future research. As we presented, quite often hybrid methods are used instead of plain NN. Although those articles do not exactly match our research criteria, we found them useful in terms of application in the field of IDS.

Publications focusing on comparison and analysis of the datasets might be especially helpful, as the number of public training datasets for IDS is quite limited. Extended knowledge of the structure and possible challenges of these common learning sets might enable researchers to improve the accuracy of the proposed solutions (Table 7).

7 Security concerns

In this paper, we present an overview of the latest literature concerning NN usage in IDSs. While describing this topic, it is important to highlight that an IDS can itself be the target of a security attack (Corona et al. 2013). The usage of machine learning based solutions in modern IDS architecture can also raise security concerns. Applying machine learning in the cybersecurity area may result in NN based IDSs inheriting its flaws, as well as in new attack vectors.

Corona et al. (2013) provided an interesting taxonomy proposal for adversarial attacks against Intrusion Detection Systems in general. The types of attacks that can directly harm NN based IDSs are, among others, poisoning and evasion (Corona et al. 2013; Pitropakis et al. 2019). The first type of attack concerns manipulating training data in order to decrease the algorithm’s performance, resulting in, for example, misclassification (Baracaldo et al. 2018). This obviously concerns the wide usage of ML algorithms, not only in IDSs (Baracaldo et al. 2018). Evasion attacks, on the other hand, are focused on the testing phase of the algorithm. Pitropakis et al. (2019) provide an example of such an attack in the context of NN and IDSs. They describe an experiment prepared by Demetrio et al. (2019), where an evasion black-box attack was performed against a convolutional neural network in order to compromise its classification capabilities (Pitropakis et al. 2019).

Another classification of attacks that can be performed on ML based IDSs differentiates between black-box, gray-box and white-box attacks (Darvish Rouani et al. 2019). In the case of black-box attacks, the intruder has no knowledge about the ML algorithm or model. Gray-box attacks involve knowledge about the ML algorithm or model, but without any information about the model parameters. In a white-box attack, the attacker has knowledge about all of the above (Darvish Rouani et al. 2019).

The most important question is how IDSs that use ML in general (including NN) can be defended from adversarial attacks. One of the solutions for defending against a poisoning attack is training data manipulation; nevertheless, it can require increased computational resources (Corona et al. 2013). One of the proposals in the literature for NN defense in general is Mixup, which, among others, helps to act against adversarial examples (Zhang et al. 2018; Stewart 2019). Yuan et al. (2019) presented a classification of two types of defense strategies against adversarial examples: reactive and proactive. The first type consists of adversarial detecting, input reconstruction and network verification, while the second

Table 6  List of reviewed new IDS methods

Authors | Citations (6th Apr 2019) | Dataset | Method used | IDS focus | Summary | Year
Erfani et al. (2016) | 198 | Six real-life and two synthetic datasets (not network related) | Hybrid: DBN (reduction of data dimensionality), SVM (anomaly detection) | N/A | New method proposal of unsupervised dimensionality reduction. No example on a network traffic database has been presented | 2016
Ashfaq et al. (2017) | 187 | NSL-KDD | Hybrid: NN (preparation of fuzzy membership vector), fuzziness based algorithm (categorization) | DOS, U2R, R2L, PROBE | Semi-supervised ML method. Results compared with: J48, Naive Bayes, NB tree, Random forests, Random tree, Multi-layer perceptron, SVM. Two-class classification: normal vs. attack | 2017
Javaid et al. (2016) | 142 | NSL-KDD | Hybrid: sparse AE (unsupervised feature learning), soft-max regression (classifier) | DOS, U2R, R2L, PROBE | ML method for IDS enhancement: self-taught learning. 2-, 5- and 23-class classification | 2016
Kang and Kang (2016) | 120 | Generated with OCTANE simulation software | DBN, DNN | VANET communication: vehicle to vehicle, vehicle to infrastructure | Method for threat detection in automotive communication | 2016
Tang et al. (2016) | 96 | NSL-KDD | Deep NN (three hidden layers) | DOS, U2R, R2L, PROBE | Method based on a six-feature subset from NSL-KDD. Comparison of results with: J48, Naive Bayes, NB Tree, Random Forest, Random Tree, MLP, SVM. Focused on Software Defined Networking | 2016
Saied et al. (2016) | 94 | Generated with simulation of DDOS attacks with special tools | NN | DDOS | Supervised method for detection of known and unknown DDOS attacks in real time | 2016
Aljawarneh et al. (2018) | 93 | NSL-KDD | Hybrid. First: filtering data with Vote algorithm. Second: two of classifiers: J48, Meta pagging, Random tree, REPTree, AdaBoostM1, Decision stump, Naïve Bayes | DOS, U2R, R2L, PROBE | Method including a variety of available classifiers. Clear comparison of the presented mechanism with respect to possible attack types. Binary and multiclass classification | 2018
Ozay et al. (2016) | 88 | Generated test system for smart grid | Various machine learning techniques | Smart grid related attacks | Technical overview with performance and accuracy comparison of new ML methods for smart grid attack detection | 2016
Yin et al. (2017) | 82 | NSL-KDD | Recurrent NN | DOS, U2R, R2L, PROBE | Method including experiments with different topologies of recurrent neural network. Detailed comparison with other ML methods such as: J48, Naïve Bayes, NB Tree, Random Forest, MLP, SVM | 2017
Singh et al. (2015) | 79 | NSL-KDD, Kyoto 2006+ | Online sequential extreme learning machine (OS-ELM) | DOS, U2R, R2L, PROBE | Methodology with reduced computational time and memory requirements. Multiple topologies of the neural network architecture are proposed, based on manipulation of the neuron count in the hidden layer. Results of the proposed technique are compared with: ANN, AdaBoost, Naive Bayes and ELM | 2015
Kim et al. (2016) | 74 | KDD Cup 1999 | LSTM + RNN | DOS, U2R, R2L, PROBE | ML method with training data based on a KDD Cup 99 subset. The article includes experimentation on NN parameters in order to improve the proposed solution. Results are compared with: GRNN, PNN, RBNN, KNN, SVM, Bayesian | 2016
Hodo et al. (2016) | 66 | Internet packets | NN | DOS, DDOS | Method focused on IoT security against DoS and DDoS attacks. The solution is validated on a simulated IoT network | 2016
Pandeeswari and Kumar (2016) | 62 | KDD Cup 1999 | Hybrid: fuzzy means clustering (FMC) (clustering of incoming data), NN (trained based on FMC output) | DOS, U2R, R2L, PROBE | Hybrid method proposed for cloud environments. Results compared with: Naïve Bayes and ANN | 2016
De la Hoz et al. (2015) | 61 | NSL-KDD | Hybrid: principal component analysis (PCA) / Fisher discriminant ratio (FDR) (feature selection and noise removal), self-organizing map (classification) | DOS, U2R, R2L, PROBE | Unsupervised hybrid method for Intrusion Detection Systems, including variation of hybrid classification methods based on SOM | 2015
Du et al. (2017) | 60 | HDFS logs, OpenStack logs, VAST challenge 2011 (testing) | LSTM | Suspicious activities: DOS, port scan, socially engineered attack, undocumented IP address | Method based on system logs as training data. Verification done also on the VAST challenge 2011 dataset | 2017
Shone et al. (2018) | 56 | KDD Cup 99, NSL-KDD | Non-symmetric deep AE, random forest | DOS, U2R, R2L, PROBE | Method based on stacked AE and Random Forest. Extensive comparison with DBN results for each network threat defined in the KDD dataset | 2018
Alom et al. (2015) | 50 | NSL-KDD | DBN | DOS, U2R, R2L, PROBE | Method plainly using DBN. According to the authors it achieves better accuracy than SVM and DBN-SVM based solutions with only a subset of the NSL-KDD database used as training input | 2015
Wang et al. (2017) | 47 | Self-created USTC-TFC2016 | CNN | 10 types of normal traffic, e.g. Gmail; 10 types of malware traffic, e.g. Cridex | Method taking network traffic data as images (training input for CNN). Using raw traffic data. The paper includes a new dataset of network traffic created by the authors, called USTC-TFC2016 (around 3.71 GB), along with a data preprocessing toolkit called USTC-TK2016 | 2017
Alheeti et al. (2015) | 46 | Generated with simulation tools: SUMO and MOVE | NN | VANET communication (DOS): vehicle to vehicle, vehicle to infrastructure | IDS method for DOS detection in VANET infrastructure | 2015

Table 7  Other reviewed papers

Authors | Citations (6th Apr 2019) | AI area | Dataset focus | Summary | Year
Lin et al. (2015) | 214 | Cluster center and nearest neighbor approach (CANN). Contains k-NN | KDD Cup 99 | Proposal of a new ML feature representation method based on cluster center and nearest neighbor approach | 2015
Aburomman and Ibne Reaz (2016) | 132 | Hybrid: SVM, k-NN | KDD Cup 99 | Hybrid machine learning method trained with random subsets of KDD Cup 99 dataset. Several ensemble approach usage | 2016
Moustafa and Slay (2016) | 125 | N/A | KDD Cup 99, NSL-KDD, UNSW-NB15 | Comparison of databases in terms of complexity and usability for ML related techniques applied to IDSs | 2016
Vasilomanolakis et al. (2015) | 116 | Neural Networks: HIDE method. Several other Collaborative IDS architectures | N/A | Overview of Collaborative IDSs approaches and possible network related threats | 2015
Ambusaidi et al. (2016) | 111 | Least square SVM | KDD Cup 99, NSL-KDD, Kyoto 2006+ | Supervised filter-based feature selection algorithm is presented for finding optimal data features for further classification | 2016
Dhanabal and Shantharajah (2015) | 97 | Machine learning: J48, SVM, Naïve Bayes | NSL-KDD | A thorough analysis of NSL-KDD database for IDS appliance. Additionally ML techniques are used in order to check NSL-KDD usability as a training data | 2015
Weller-Fahy et al. (2015) | 72 | Machine Learning | N/A | Overview of multiple methods defining similarity and distance measures in area of Network Intrusion Anomaly Detection | 2015
Iglesias and Zseby (2015) | 61 | Machine learning: decision tree classifiers, Naïve Bayes, k-NN, ANN, SVM | NSL-KDD | Proposal of multi-stage feature selection method based on stepwise regression wrappers and filters | 2015
David and Netanyahu (2015) | 61 | DBN | Custom malware database | Novel method of malware signature detection based on network traffic and host logs | 2015
Ingre and Yadav (2015) | 48 | NN | NSL-KDD | Evaluation of NSL-KDD performance using NN related method for binary and five-class classification for each attack type | 2015

type contains network distillation, adversarial training and classifier robustifying.

Defensive techniques against adversarial attacks on Neural Networks have been discussed in the literature only very recently. This highlights the importance of the topic for contemporary Neural Network usage.

8 Conclusions

The paper summarizes the literature review performed in order to present the usage of neural network architectures for intrusion detection systems. We decided to perform it because cyber security is an emerging research topic and constant progress in the neural network area is a fact. Neural network architectures are widely used for creating new models
work architectures are widely used for creating new models


for IDSs. Nevertheless, it is significant that they are quite often combined with other ML techniques in hybrid models, which prove to be quite efficient solutions. Pure NN solutions may not be sufficient to create highly operational solutions.

There is also a long-lasting challenge with the datasets available for performing experiments for IDSs. There are several public datasets described in these articles, but all of them have their drawbacks, like age or record redundancy. They are also not always representative of real-life data. On the other hand, because they are publicly available, they enable researchers to perform comparable benchmarking. Some experiments are executed on self-generated datasets. They may be more suitable for the needs of particular research groups, but they are subject to privacy concerns.

Another important observation that came from our literature review is the fact that NN are also quite often used for reducing the dimensionality of the data, generating signatures for datasets or general data preparation. High dimensionality of data can prevent effective training of an ML algorithm, and this is currently being noted in the newest articles.

It is quite significant that nowadays researchers are looking for effective solutions for fields other than only the “classic” computer network. Intrusion detection systems are now applied to areas like the internet of things, clouds, automotive, smart grids and mobile communication. All those topics were covered by the articles we reviewed, and they are at the same time emerging technological areas where cybersecurity plays a crucial role. Therefore it is clear that the approaching challenges will be connected to the fact that new network protocols and network types are being created.

One of the biggest challenges we can see ahead in terms of intrusion detection systems is creating a system that could react to any new and low-frequency attacks. Currently available public databases are not a sufficient base for such a use case. One of the promising approaches that can be taken is focusing on particular types of attacks and preparing a solution directly for them, as shown in a couple of the reviewed papers. This could make the proposed solutions more adaptive to new types of threats. Additionally, what would have to be addressed is the enormous amount of data that is processed every day in the world. IDSs that will be created in the future will have to be resistant to the problems connected with data volume.

It is worth highlighting that only the basics of the security implications of NN usage in IDSs are covered in this article. This is an important area of research that should not be neglected in favor of studies on the precision and performance of NN based IDSs.

Based on the conducted review we tried to create a coherent source of knowledge about NN application for IDSs. This work is meant as an overall introduction for future work in the field. We hope that all the solutions and datasets listed in this paper will enable researchers to do more efficient and influential work regarding new IDS proposals.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

Aburomman AA, Ibne Reaz MB (2016) A novel SVM-kNN-PSO ensemble method for intrusion detection system. Appl Soft Comput 38:360–372. https://doi.org/10.1016/j.asoc.2015.10.011
Agrawal S, Agrawal J (2015) Survey on anomaly detection using data mining techniques. Procedia Comput Sci 60:708–713. https://doi.org/10.1016/j.procs.2015.08.220
Ahmad I, Abdullah AB, Alghamdi AS (2009) Artificial neural network approaches to intrusion detection: a review. In: Proceedings of the 8th Wseas International Conference on Telecommunications and Informatics. World Scientific and Engineering Academy and Society (WSEAS), pp 200–205
Al-Yaseen WL, Othman ZA, Nazri MZA (2017) Multi-level hybrid support vector machine and extreme learning machine based on modified K-means for intrusion detection system. Expert Syst Appl 67:296–303. https://doi.org/10.1016/j.eswa.2016.09.041
Alheeti KMA, Gruebler A, McDonald-Maier KD (2015) An intrusion detection system against malicious attacks on the communication network of driverless cars. In: 2015 12th Annual IEEE Consumer Communications and Networking Conference (CCNC). pp 916–921
Aljawarneh S, Aldwairi M, Yassein MB (2018) Anomaly-based intrusion detection system through feature selection analysis and building hybrid efficient model. J Comput Sci 25:152–160. https://doi.org/10.1016/j.jocs.2017.03.006
Almási A-D, Woźniak S, Cristea V et al (2016) Review of advances in neural networks: neural design technology stack. Neurocomputing 174:31–41. https://doi.org/10.1016/j.neucom.2015.02.092
Alom MdZ, Bontupalli V, Taha TM (2015) Intrusion detection using deep belief networks. In: 2015 National Aerospace and Electronics Conference (NAECON). pp 339–344
Ambusaidi MA, He X, Nanda P, Tan Z (2016) Building an intrusion detection system using a filter-based feature selection algorithm. IEEE Trans Comput 65:2986–2998. https://doi.org/10.1109/TC.2016.2519914
Amiri F, Rezaei Yousefi M, Lucas C et al (2011) Mutual information-based feature selection for intrusion detection systems. J Netw Comput Appl 34:1184–1199. https://doi.org/10.1016/j.jnca.2011.01.002
Ashfaq RAR, Wang X-Z, Huang JZ et al (2017) Fuzziness based semi-supervised learning approach for intrusion detection system. Inf Sci 378:484–497. https://doi.org/10.1016/j.ins.2016.04.019


Baracaldo N, Chen B, Ludwig H et al (2018) Detecting poisoning attacks on machine learning in IoT environments. In: 2018 IEEE International Congress on Internet of Things (ICIOT). pp 57–64
Besharati E, Naderan M, Namjoo E (2019) LR-HIDS: logistic regression host-based intrusion detection system for cloud environments. J Ambient Intell Humaniz Comput 10:3669–3692. https://doi.org/10.1007/s12652-018-1093-8
Borazjani PN, Everett CE, McCoy D (2014) OCTANE: An extensible open source car security testbed. In: Proceedings of the Embedded Security in Cars Conference. p 10
Bourlard H, Kamp Y (1988) Auto-association by multilayer perceptrons and singular value decomposition. Biol Cybern 59:291–294. https://doi.org/10.1007/BF00332918
Buczak AL, Guven E (2016) A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun Surv Tutor 18:1153–1176. https://doi.org/10.1109/COMST.2015.2494502
Camastra F, Ciaramella A, Staiano A (2013) Machine learning and soft computing for ICT security: an overview of current trends. J Ambient Intell Humaniz Comput 4:235–247. https://doi.org/10.1007/s12652-011-0073-z
Choo K-KR (2011) The cyber threat landscape: challenges and future research directions. Comput Secur 30:719–731. https://doi.org/10.1016/j.cose.2011.08.004
Corona I, Giacinto G, Roli F (2013) Adversarial attacks against intrusion detection systems: taxonomy, solutions and open issues. Inform Sci 239:201–225. https://doi.org/10.1016/j.ins.2013.03.022
Cunningham P, Delany SJ (2007) K-nearest neighbour classifiers. Mult Classif Syst 34:1–17
Darvish Rouani B, Samragh M, Javidi T, Koushanfar F (2019) Safe machine learning and defeating adversarial attacks. IEEE Secur Priv 17:31–38. https://doi.org/10.1109/MSEC.2018.2888779
David OE, Netanyahu NS (2015) DeepSign: deep learning for automatic malware signature generation and classification. In: 2015 International Joint Conference on Neural Networks (IJCNN). pp 1–8
Demetrio L, Biggio B, Lagorio G et al (2019) Explaining vulnerabilities of deep learning to adversarial malware binaries. https://arxiv.org/abs/1901.03583
Denning DE (1987) An intrusion-detection model. IEEE Trans Softw Eng SE 13:222–232. https://doi.org/10.1109/TSE.1987.232894
Guresen E, Kayakutlu G (2011) Definition of artificial neural networks with comparison to other networks. Procedia Comput Sci 3:426–433. https://doi.org/10.1016/j.procs.2010.12.071
Harzing A-W (2007) Publish or Perish. In: Harzing.com. https://harzing.com/resources/publish-or-perish. Accessed 1 Apr 2019
Haykin S (1994) Neural networks: a comprehensive foundation, 1st edn. Prentice Hall PTR, USA
Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9:1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
Hodo E, Bellekens X, Hamilton A et al (2016) Threat analysis of IoT networks using artificial neural network intrusion detection system. In: 2016 International Symposium on Networks, Computers and Communications (ISNCC). pp 1–6
De la Hoz E, De La Hoz E, Ortiz A et al (2015) PCA filtering and probabilistic SOM for network intrusion detection. Neurocomputing 164:71–81. https://doi.org/10.1016/j.neucom.2014.09.083
Huang G-B, Zhu Q-Y, Siew C-K (2006) Extreme learning machine: theory and applications. Neurocomputing 70:489–501. https://doi.org/10.1016/j.neucom.2005.12.126
Iglesias F, Zseby T (2015) Analysis of network traffic features for anomaly detection. Mach Learn 101:59–84. https://doi.org/10.1007/s10994-014-5473-9
Ingre B, Yadav A (2015) Performance analysis of NSL-KDD dataset using ANN. In: 2015 International Conference on Signal Processing and Communication Engineering Systems. pp 92–96
Javaid A, Niyaz Q, Sun W, Alam M (2016) A deep learning approach for network intrusion detection system. In: Proceedings of the 9th EAI International Conference on Bio-inspired Information and Communications Technologies (formerly BIONETICS). ICST, pp 21–26
KDD Cup (1999) KDD Cup 1999 Data. In: KDD Cup 1999 Data. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html. Accessed 1 Jun 2019
Kang M-J, Kang J-W (2016) Intrusion detection system using deep neural network for In-vehicle network security. PLoS One. https://doi.org/10.1371/journal.pone.0155781
Kim J, Kim J, Thu HLT, Kim H (2016) Long short term memory recurrent neural network classifier for intrusion detection. In: 2016 International Conference on Platform Technology and Service (PlatCon). pp 1–5
Kohonen T (1982) Self-organized formation of topologically correct feature maps. Biol Cybern 43:59–69. https://doi.org/10.1007/
Dhanabal L, Shantharajah DSP (2015) A study on NSL-KDD dataset BF003​37288​
for intrusion detection system based on classification algorithms. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classifica‑
Int J Adv Res Comput Commun Eng 4:446–452 tion with deep convolutional neural networks. In: Pereira F,
Du M, Li F, Zheng G, Srikumar V (2017) DeepLog: anomaly detec‑ Burges CJC, Bottou L, Weinberger KQ (eds) Advances in neural
tion and diagnosis from system logs through deep learning. In: information processing systems 25. Curran Associates, Inc., pp
Proceedings of the 2017 ACM SIGSAC Conference on Computer 1097–1105
and Communications Security. ACM, pp 1285–1298 Kwon D, Kim H, Kim J et al (2019) A survey of deep learning-based
Elman JL (1990) Finding structure in time. Cogn Sci 14:179–211. https​ network anomaly detection. Clust Comput 22:949–961. https​://
://doi.org/10.1207/s1551​6709c​og140​2_1 doi.org/10.1007/s1058​6-017-1117-8
Erfani SM, Rajasegarar S, Karunasekera S, Leckie C (2016) High- Kyoto dataset (2015) Traffic Data from Kyoto University’s Honey‑
dimensional and large-scale anomaly detection using a linear pots. http://www.takak​ura.com/Kyoto​_data/. Accessed 1 Jun
one-class SVM with deep learning. Pattern Recognit 58:121–134. 2019
https​://doi.org/10.1016/j.patco​g.2016.03.028 LeNail A (2019) NN-SVG: publication-ready neural network archi‑
Evgeniou T, Pontil M (2001) Support vector machines: theory and tecture schematics. J Open Source Softw 4:747. https​: //doi.
applications. In: Paliouras G, Karkaletsis V, Spyropoulos CD org/10.21105​/joss.00747​
(eds) Machine learning and its applications: advanced lectures. Lecun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based
Springer, Berlin, pp 249–257 learning applied to document recognition. Proc IEEE 86:2278–
Fadlullah ZMD, Tang F, Mao B et al (2017) State-of-the-art deep 2324. https​://doi.org/10.1109/5.72679​1
learning: evolving machine intelligence toward tomorrow’s intel‑ Liao H-J, Richard Lin C-H, Lin Y-C, Tung K-Y (2013) Intrusion
ligent network traffic control systems. IEEE Commun Surv Tutor detection system: a comprehensive review. J Netw Comput Appl
19:2432–2455. https​://doi.org/10.1109/COMST​.2017.27071​40 36:16–24. https​://doi.org/10.1016/j.jnca.2012.09.004
Guo Y, Liu Y, Oerlemans A et al (2016) Deep learning for visual Lin W-C, Ke S-W, Tsai C-F (2015) CANN: An intrusion detec‑
understanding: a review. Neurocomputing 187:27–48. https​:// tion system based on combining cluster centers and
doi.org/10.1016/j.neuco​m.2015.09.116

13
514 A. Drewek‑Ossowicka et al.

nearest neighbors. Knowl-Based Syst 78:13–21. https://doi.org/10.1016/j.knosys.2015.01.009
Lincoln L (1998) DARPA 1998 & 1999 datasets. In: DARPA 1998 1999 Datasets. https://www.ll.mit.edu/r-d/datasets. Accessed 1 Apr 2020
Lippmann R, Haines JW, Fried DJ et al (2000a) The 1999 DARPA off-line intrusion detection evaluation. Comput Netw 34:579–595. https://doi.org/10.1016/S1389-1286(00)00139-0
Lippmann RP, Fried DJ, Graf I et al (2000b) Evaluating intrusion detection systems: the 1998 DARPA off-line intrusion detection evaluation. In: Proceedings DARPA Information Survivability Conference and Exposition. DISCEX’00. pp 12–26 vol.2
Liu W, Wang Z, Liu X et al (2017) A survey of deep neural network architectures and their applications. Neurocomputing 234:11–26. https://doi.org/10.1016/j.neucom.2016.12.038
McHugh J (2000) Testing Intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Trans Inf Syst Secur TISSEC 3:262–294
Modi C, Patel D, Borisaniya B et al (2013) A survey of intrusion detection techniques in Cloud. J Netw Comput Appl 36:42–57. https://doi.org/10.1016/j.jnca.2012.05.003
Moustafa N, Slay J (2016) The evaluation of network anomaly detection systems: statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set. Inf Secur J Glob Perspect 25:18–31. https://doi.org/10.1080/19393555.2015.1125974
Moustafa N, Slay J (2015) UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In: 2015 Military Communications and Information Systems Conference (MilCIS). pp 1–6
NSL-KDD (2009) NSL-KDD | Datasets | Research | Canadian Institute for Cybersecurity | UNB. https://www.unb.ca/cic/datasets/nsl.html. Accessed 1 Jun 2019
Narudin FA, Feizollah A, Anuar NB, Gani A (2016) Evaluation of machine learning classifiers for mobile malware detection. Soft Comput 20:343–357. https://doi.org/10.1007/s00500-014-1511-6
Ozay M, Esnaola I, Yarman Vural FT et al (2016) Machine learning methods for attack detection in the smart grid. IEEE Trans Neural Netw Learn Syst 27:1773–1786. https://doi.org/10.1109/TNNLS.2015.2404803
Pandeeswari N, Kumar G (2016) Anomaly detection system in cloud environment using fuzzy clustering based ANN. Mob Netw Appl 21:494–505. https://doi.org/10.1007/s11036-015-0644-x
Pitropakis N, Panaousis E, Giannetsos T et al (2019) A taxonomy and survey of attacks against machine learning. Comput Sci Rev 34:100199. https://doi.org/10.1016/j.cosrev.2019.100199
Ranzato M, Poultney C, Chopra S, Cun YL (2007) Efficient learning of sparse representations with an energy-based model. In: Schölkopf B, Platt JC, Hoffman T (eds) Advances in neural information processing systems 19. MIT Press, pp 1137–1144
Rosenblatt F (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev 65:386–408. https://doi.org/10.1037/h0042519
Saied A, Overill RE, Radzik T (2016) Detection of known and unknown DDoS attacks using artificial neural networks. Neurocomputing 172:385–393. https://doi.org/10.1016/j.neucom.2015.04.101
Shah B, Trivedi BH (2012) Artificial neural network based intrusion detection system: a survey. Int J Comput Appl 39:13–18
Shanmugavadivu R, Nagarajan N (2011) Network intrusion detection system using fuzzy logic. Indian J Comput Sci Eng IJCSE 2:101–111
Shone N, Ngoc TN, Phai VD, Shi Q (2018) A deep learning approach to network intrusion detection. IEEE Trans Emerg Top Comput Intell 2:41–50. https://doi.org/10.1109/TETCI.2017.2772792
Singh R, Kumar H, Singla RK (2015) An intrusion detection system using network traffic profiling and online sequential extreme learning machine. Expert Syst Appl 42:8609–8624. https://doi.org/10.1016/j.eswa.2015.07.015
Song J, Takakura H, Okabe Y et al (2011) Statistical analysis of honeypot data and building of Kyoto 2006+ dataset for NIDS evaluation. In: Proceedings of the First Workshop on Building Analysis Datasets and Gathering Experience Returns for Security. ACM, pp 29–36
Songma S, Chimphlee W, Maichalernnukul K, Sanguansat P (2012) Classification via k-means clustering and distance-based outlier detection. In: 2012 Tenth International Conference on ICT and Knowledge Engineering. pp 125–128
Stewart M (2019) Security vulnerabilities of neural networks. In: Medium. https://towardsdatascience.com/hacking-neural-networks-2b9f461ffe0b. Accessed 1 Jan 2020
Tang TA, Mhamdi L, McLernon D et al (2016) Deep learning approach for network intrusion detection in software defined networking. In: 2016 International Conference on Wireless Networks and Mobile Communications (WINCOM). pp 258–263
Tavallaee M, Bagheri E, Lu W, Ghorbani AA (2009) A detailed analysis of the KDD CUP 99 data set. In: 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications. pp 1–6
Tay A (2015) 6 common misconceptions when doing advanced Google Searching. http://musingsaboutlibrarianship.blogspot.com/2015/10/6-common-misconceptions-when-doing.html. Accessed 1 Apr 2019
Tran NN, Sarker R, Hu J (2018) An Approach for Host-Based Intrusion Detection System Design Using Convolutional Neural Network. In: Hu J, Khalil I, Tari Z, Wen S (eds) Mobile Networks and Management. Springer International Publishing, Cham, pp 116–126
UNSW-NB15 (2015) The UNSW-NB15 data set description. https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-NB15-Datasets/. Accessed 1 Jun 2019
Vasilomanolakis E, Karuppayah S, Mühlhäuser M, Fischer M (2015) Taxonomy and survey of collaborative intrusion detection. ACM Comput Surv CSUR. https://doi.org/10.1145/2716260
Veen F van (2016) The Neural Network Zoo. In: Asimov Inst. https://www.asimovinstitute.org/neural-network-zoo/. Accessed 1 Jun 2019
Vinchurkar DP, Reshamwala A (2012) A review of intrusion detection system using neural network and machine learning technique. Int J Eng Sci Innov Technol IJESIT 1:10
Wang W, Zhu M, Zeng X et al (2017) Malware traffic classification using convolutional neural network for representation learning. In: 2017 International Conference on Information Networking (ICOIN). pp 712–717
Weller-Fahy DJ, Borghetti BJ, Sodemann AA (2015) A survey of distance and similarity measures used within network intrusion anomaly detection. IEEE Commun Surv Tutor 17:70–91. https://doi.org/10.1109/COMST.2014.2336610
Yin C, Zhu Y, Fei J, He X (2017) A deep learning approach for intrusion detection using recurrent neural networks. IEEE Access 5:21954–21961. https://doi.org/10.1109/ACCESS.2017.2762418
Yuan X, He P, Zhu Q, Li X (2019) Adversarial examples: attacks and defenses for deep learning. IEEE Trans Neural Netw Learn Syst 30:2805–2824. https://doi.org/10.1109/TNNLS.2018.2886017
Zhang H, Cisse M, Dauphin YN, Lopez-Paz D (2018) Mixup: beyond empirical risk minimization. https://arxiv.org/abs/1710.09412

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
