Survey of Attack Projection, Prediction, and
4 authors, including:
Pavel Celeda
Masaryk University
All content following this page was uploaded by Elias Bou-Harb on 19 October 2018.
Abstract: This paper provides a survey of prediction and forecasting methods used in cyber security. Four main tasks are discussed first: attack projection and intention recognition, in which there is a need to predict the next move or the intentions of the attacker; intrusion prediction, in which there is a need to predict upcoming cyber attacks; and network security situation forecasting, in which we project the cybersecurity situation in the whole network. Methods and approaches for addressing these tasks often share the theoretical background and are often complementary. In this survey, both methods based on discrete models, such as attack graphs, Bayesian networks, and Markov models, and continuous models, such as time series and grey models, are surveyed, compared, and contrasted. We further discuss machine learning and data mining approaches, which have gained a lot of attention recently and appear promising for such a constantly changing environment as cyber security. The survey also focuses on the practical usability of the methods and problems related to their evaluation.

Index Terms: Cyber security, intrusion detection, situational awareness, prediction, forecasting, model checking.

Martin Husák, Jana Komárková and Pavel Čeleda are with the Masaryk University, Institute of Computer Science, Botanická 68a, 602 00 Brno, Czech Republic (email: [email protected], [email protected], [email protected])
Elias Bou-Harb is with Florida Atlantic University, 777 Glades Road, Boca Raton, FL 33431, USA (email: [email protected])

I. INTRODUCTION
Cyber security is a broad field of research, and the detection of malicious activities on the network is among the oldest and most common problems [1]. However, intrusion detection is mostly reactive and responds to specific patterns or observed anomalies. The intuitive next step is taking a proactive approach, in which there is a need to preemptively infer the upcoming malicious activities so that we could react to such events before they cause any harm [2]. Research efforts and progress in predictions and forecasting in cyber security are not as prominent as in attack detection. However, the topic is gaining more attention, and a breakthrough in this field would benefit the whole discipline of cyber security [1].
Before we can start making predictions about cyber security, there is a need to examine what can actually be predicted and what obstacles make this problem hard. First, if there is an attack taking place, it is possible to predict its next steps. Such a task is called attack projection [3]. A similar task is intention recognition [4], in which we also estimate the ultimate goal of an adversary, which can also help us in predicting the adversary’s next moves. Another task is predicting cyber attacks that are going to happen. In this case, we talk about intrusion prediction [5], although we can use similar approaches to also predict vulnerabilities. Finally, we might be interested in overall statistics of attacks, the presence of threats, and other pieces of information that together form a network security situation. In this context, we talk about network security situation forecasting [6]. Numerous methods and systems were proposed to approach these problems, and as we point out in this survey, they often share a common theoretical background, which makes the particular tasks and use cases similar to each other.
To summarize the open problems, we emphasize the following research challenges of predictions and forecasting in cyber security:
• What can be predicted in a cyber security domain? Is it the next move of an adversary, the appearance of a new attacker, or the cyber security situation from a global perspective?
• How usable are the predictions in cyber security? Can they be used to effectively mitigate an attack or to get prepared for an upcoming security threat?
• How to evaluate predictions in cyber security and what metrics should be used? Is it sufficient to rely on evaluation using datasets and testbeds or can the actual prediction accuracy be measured in a live network setting?
To this end, such research challenges impact both theoretical and practical perspectives. In this survey, we examine whether predictions and forecasts are possible, and we are also interested in the applicability and evaluation of the theoretical results.

A. Paper Organization
This paper is divided into nine sections. Section II introduces the main use cases of predictive and forecasting methods in cyber security. A taxonomy of attack prediction methods is presented in Section III. A literature review of methods of cyber attack prediction is presented in Sections IV–VII with a detailed explanation of the methods. Section VIII discusses evaluation of attack prediction and lessons learned. Finally, Section IX concludes the paper and provides an outlook on future research.
This paper is intended for an audience familiar with computer networks and cyber attacks. Nevertheless, the tasks and use cases of attack prediction, projection, and forecasting are defined in Section II, so the reader does not need to be an expert in the field. Probably the most interesting part of this survey can be found in Sections III–VII. A taxonomy in Section III provides a high-level view of the discussed methods. Sections IV–VII contain the theoretical background and a list of recent literature for each group of methods. There is
also a table included in each of the four sections (Tables II–V) that summarizes all the prediction methods. If a paper listed in the table is discussed in the text, it is distinguished by the author name(s) in italic. Selected papers are highlighted with a gray background in the table and announced in the text as recommended reading. Practitioners are advised to read Section VIII, which contains practical implications and open problems in the field.

B. Literature Search Methodology
A literature search for this survey covered many journals and conference proceedings. Although the discussed problems are studied in the field of cyber security, the topics are often addressed in journals and conferences on computer networks and communications. Due to the specific nature of this work, we also had to go through journals and conferences dedicated to formal methods in computer science, such as expert systems and their applications, which appeared to be an important source of papers for this survey.
First, we reviewed survey-oriented journals like IEEE Communications Surveys and Tutorials and ACM Computing Surveys, although no survey was found to discuss predictions in cyber security. Subsequently, we used Google Scholar, IEEE Xplore, and ACM Digital Library to search for related papers using the queries “cyber security” AND “prediction”, “cyber security” AND “attack projection”, “cyber security” AND “forecasting”. Further, we looked for publications citing or cited by already found works or having the same author. The publications are presented in chronological order from 2012 to 2018. Papers published prior to 2012 are not included in this survey unless they make a fundamental contribution or are still highly relevant. The numbers of citations reported by Google Scholar and Scopus were used to identify the most influential research papers.

C. Existing Surveys
To the best of our knowledge, prediction and forecasting methods in cyber security were not surveyed in such scope yet, although several surveys of particular tasks and use cases were published in recent years. Wei and Jiang [7] in 2013 analyzed the problem of network security situation prediction and compared predictions of NSSA using neural networks, time series, and support vector machines, although mostly to illustrate the limitations of the available methods. Yang et al. [3] formalized the task of attack projection and surveyed literature on the topic in 2014. Three categories are listed: prediction based on attack plans, estimates of attackers’ capabilities and intentions, and predictions by learning attack patterns and attacker’s behavior. Leau and Manickam [6] in 2015 surveyed several existing techniques of network security situation forecasting. They grouped them into three categories by their theoretical background: machine learning, Markov models, and Grey theory. In 2016, Gheyas and Abdallah [8] surveyed detection and prediction of insider threats. Although this topic is still of interest, the predictive approaches do not seem to be studied in recent years. Ramaki and Atani [2] surveyed early warning systems, which often use predictive analytics, although these are not discussed in much detail. A simple yet usable taxonomy of intrusion prediction methods can also be found in a paper by Abdlhamed et al. [9]. The authors first split related work into two groups, prediction methods and intrusion detection enhancement. The prediction methods are categorized into three groups: methods using hidden Markov models, methods based on Bayesian networks, and genetic algorithms. Subsequently, they classify artificial neural networks, data mining, and algorithmic methodologies as three enhancements for intrusion detection, which enhance the effectiveness of prediction systems. The same authors later published a survey of intrusion prediction [5], in which they categorize prediction methodologies and prediction systems. Prediction methodologies can be based on alert correlation, sequences of actions, statistical and probabilistic methods, and feature extraction. Prediction systems are then categorized as based on hidden Markov models, Bayesian networks, genetic algorithms, neural networks, data mining, and algorithmic methods. Recently, Ahmed and Zaman [4] surveyed methods of attack intention recognition, a field dominated by methods based on graphical models. The authors recognize four categories: causal networks, path analysis, graphical models, and dynamic Bayesian networks. Methods based on causal networks were evaluated as the most effective.

II. USE CASES OF PREDICTION AND FORECASTING IN CYBER SECURITY
From the surveyed research papers, we distilled several tasks that pose a use case of prediction or forecasting in cyber security. The tasks are summed up in Table I. Historically, the first such use cases are attack projection [3] and attack intention recognition [4], which are closely tied to intrusion detection. The task is to predict what an attacker (in an already observed attack) is going to do next, and what the attacker’s ultimate goal is [4]. In practice, these two tasks use very similar methods, and can often be used interchangeably. Later, the task of predicting attacks emerged [5]. This task is more general as it does not require observation of a preceding activity. The expected outcome is a prediction of an attack before it actually occurs, not a prediction of the continuation of an observed series of events. Finally, the task of forecasting a security situation [6] is a highly generic use case related to cyber situational awareness. The task is not to predict an attack, but rather to forecast the situation in the whole network [2]. The outcomes may be a forecast of an increase or decrease in the number of attacks or vulnerabilities in the network. The following subsections discuss the use cases in more detail.

A. Attack Projection and Intention Recognition
The initial idea of attack projection dates back to 2001 when Geib and Goldman [10] proposed attack projection as an extension of attack plan recognition and identified its prerequisites and possible problems, such as a need to work with unobserved actions, failure to observe, and consideration of multiple concurrent goals. The first methods started to appear around 2003 [11], [12] and the research in this field is still active, including literature reviews [3], [4].
[Figure: taxonomy of prediction and forecasting methods]
Discrete Models (Section IV): Graph Models, comprising Attack Graphs (Section IV-A), Bayesian Networks (Section IV-B), and Markov Models (Section IV-C); Game Theoretical (Section IV-D)
Continuous Models (Section V): Time series (Section V-A); Grey Models (Section V-B)
Machine Learning and Data Mining (Section VI): Machine Learning (Neural Networks, SVM, ...); Data Mining
Other Approaches (Section VII): similarity-based approaches, evolutionary computing, prediction from unconventional data, DDoS volume forecasting, ...
TABLE II
COMPARISON OF PREDICTION METHODS, PART I – APPROACHES BASED ON DISCRETE MODELS.
also served as a basis for other model-checking approaches, e.g., methods using Bayesian networks and Markov models and game-theoretical methods.
1) Method Description: An attack graph (often abbreviated as AG) is a tuple $G = (S, r, S_0, S_s)$, where $S$ is a set of states, $r \subseteq S \times S$ is a transition relation, $S_0 \subseteq S$ is a set of initial states, and $S_s \subseteq S$ is a set of success states [47]. The initial state represents the state before the attack starts. Transition relations represent possible actions of an attacker. These are usually weighted, e.g., by the probability that the attacker will choose the action. If an attacker takes all the actions to transition from the initial state to any of the success states, the attack is successful, as the success state represents a system compromise.
As stated earlier, an attack graph is constructed either manually or automatically; a popular approach is using data mining to generate attack graphs [14]. An example of an attack graph is shown in Figure 3. In the nodes, we can see possible events that comprise an attack. Edge values represent the probability with which the event associated with the end node will happen. The edge value is referred to as predictability.
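The attack graph definition above maps naturally onto a small data structure. The following is a minimal sketch, not a method taken from the surveyed papers: the class name `AttackGraph`, its methods, and the example states and probabilities are all hypothetical and serve only to illustrate how weighted transitions can be used to rank an attacker's possible next steps and to find the most probable path to a success state.

```python
from dataclasses import dataclass


@dataclass
class AttackGraph:
    """Attack graph G = (S, r, S_0, S_s) with weighted transitions."""
    states: set
    transitions: dict  # (source, target) -> probability of the attacker's action
    initial: set
    success: set

    def next_steps(self, state, threshold=0.0):
        """Candidate next states from `state`, most probable first."""
        steps = [(dst, p) for (src, dst), p in self.transitions.items()
                 if src == state and p >= threshold]
        return sorted(steps, key=lambda step: step[1], reverse=True)

    def best_path(self, state, seen=frozenset()):
        """Most probable path from `state` to a success state: (probability, path)."""
        if state in self.success:
            return 1.0, [state]
        best = (0.0, [])
        for nxt, p in self.next_steps(state):
            if nxt in seen:  # do not revisit states already on the current path
                continue
            q, path = self.best_path(nxt, seen | {state})
            if p * q > best[0]:
                best = (p * q, [state] + path)
        return best


# Hypothetical example graph; states and probabilities are made up.
graph = AttackGraph(
    states={"scan", "ftp_exploit", "ssh_exploit", "root"},
    transitions={
        ("scan", "ftp_exploit"): 0.6,
        ("scan", "ssh_exploit"): 0.4,
        ("ftp_exploit", "root"): 0.5,
        ("ssh_exploit", "root"): 0.9,
    },
    initial={"scan"},
    success={"root"},
)

likely_next = graph.next_steps("scan")
probability, path = graph.best_path("scan")
```

In this toy graph the single most probable next step after `scan` is `ftp_exploit`, while the most probable complete path to `root` goes through `ssh_exploit`, which shows why projecting one step ahead and projecting the whole attack can give different answers.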
[Figure 4: Bayesian attack graph. Node D (remote attacker) has Pr(D) = 0.70. Node B (Matu FTP BOF, 192.216.0.10) has Pr(B | D) = 0.85 and node C (remote BOF on SSH daemon, 192.216.0.10) has Pr(C | D) = 0.70; both are 0.00 without D. Node A (root/FTP server, 192.216.0.10) has Pr(A | B, C) = 1.00, 0.65, 1.00, and 0.00 for (B, C) = (1, 1), (1, 0), (0, 1), and (0, 0).]
Fig. 4. Simple Bayesian Attack Graph illustrating probability computations (inspired by [50]).

graph with nodes as the discrete or continuous random variables and edges as the relationships between them. The nodes maintain the states of the random variables and conditional probability form.
There are several equivalent definitions of a Bayesian network. A Bayesian network is usually represented as a directed acyclic graph (DAG). Each node represents a variable that has a certain set of states. The edges represent the causal relationships between the nodes. Formally, let $G = (V, E)$ be a DAG, and let $X = (X_v)_{v \in V}$ be a set of random variables indexed by $V$. A Bayesian network consists of a set of variables and a set of directed edges between variables. Each variable has a finite set of mutually exclusive states. The variables and directed edges form a DAG. To each variable $A$ with parents $B_1, B_2, \ldots, B_n$, there is attached a conditional probability table $P(A \mid B_1, B_2, \ldots, B_n)$.
An example of a Bayesian attack graph is shown in Figure 4 [50]. We can derive that the Bayesian network models an activity of an attacker (D), who is likely to use one of the buffer overflow exploits (B, C) to get access to a server (A). Probability tables are attached to each node, informing us about the probability that the attacker will use the related exploit and the probability of a successful exploit.
Further extensions or constraints are used for specific purposes, including cyber security. For example, a Bayesian attack graph is an attack graph in the form of a Bayesian network [32]. A causal network is a special case of a Bayesian network which explicitly requires the relationships in the network to be causal [12].
In order to create a Bayesian network or a Bayesian attack graph, the list of events, causal dependencies between events, and the probability of transitions between events are required. Building the model requires either expert knowledge, or it can be trained using data mining or machine learning. Typically, the probability tables are calculated from the training datasets or historical records. Structure learning, parameter learning, and unobserved variable inference are the main tasks of building the network.
Alert prediction using Bayesian networks or Bayesian attack graphs uses probabilities depicted in the model. The event with the highest posterior probability is the most probable to appear in the future. For practical purposes, a threshold is required to filter out predicted alerts with low probability. If the probability of the predicted event is higher than the threshold, the predicted event can be reported, and appropriate defense mechanisms can be set.
2) Literature Review: A fundamental contribution is the research work by Qin and Lee from 2004 [12], which remains recommended reading even today. The authors presented an approach to attack plan recognition and prediction of upcoming attacks based on predefined attack plans. According to their proposal, a causal network is constructed from low-level alerts. Subsequently, probabilistic inference is conducted to evaluate the likelihood of the next attack step. Their approach was evaluated using DARPA’s Grand Challenge Problem datasets. However, only limited results are presented. A drawback of their work is that it requires a library of attack plans, from which the causal network is derived. Thus, input from a human expert is needed. The authors acknowledge this as a challenge for future work. They also stated that there is a need to distinguish between the deceptive plan and the real goal of the attack, and also between attacks conducted by one attacker and by a group of collaborating attackers. These issues remain open research problems even today.
Similarly to the situation with attack graphs, methods based on Bayesian networks peaked in the late 2000s and are not getting that much attention lately. Wu et al. [31] in 2012 proposed minor updates to building Bayesian networks from attack graphs for attack predictions. The authors propose to include the presence of vulnerabilities and three environmental factors into the Bayesian networks to reflect the potential impact of predicted attacks. The environmental factors are the value of assets in the network, the utilization of the host in the network, and the attack history. However, the paper only outlines the approach and does not include any results.
Ramaki et al. [32] proposed a real-time alert correlation and prediction framework in 2015. The framework has two modes, online and offline. In the offline mode, a Bayesian attack graph (BAG) is constructed from low-level alerts. In the online mode, the most probable next step of the attacker according to the BAG is predicted. The authors evaluated their approach using the DARPA 2000 dataset. The accuracy of prediction was observed to increase with the length of the attack scenario. Thus, accuracy ranged from 92.3% when processing the first attack step to 99.2% when processing the fifth attack step.
Recently, Okutan et al. [33] incorporated signals unrelated to the target network into an attack prediction method based on a Bayesian network. The signals are mentions of attacks on Twitter or the current number of attacks from Hackmageddon [51]. The results show that the prediction accuracy ranges from 63 % to 99 %, which makes it a promising approach. Huang et al. [34] in 2018 included attack prediction using a Bayesian network in their framework for assessing cyber attacks in cyber-physical systems. However, there are no improvements to the prediction method itself; it is more of an application.
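The posterior-based alert prediction described in the Method Description of this section can be illustrated on the network from Figure 4. The sketch below is only an illustration under stated assumptions: it uses the conditional probabilities shown in the figure, computes probabilities by brute-force enumeration (practical only for tiny networks), and the function names and the threshold value of 0.8 are hypothetical choices rather than anything prescribed by the surveyed papers.

```python
from itertools import product

# Conditional probabilities as read from Figure 4 (1 = event occurs, 0 = it does not).
P_D = 0.70                          # remote attacker is active
P_B_GIVEN_D = {1: 0.85, 0: 0.00}    # FTP buffer overflow exploit
P_C_GIVEN_D = {1: 0.70, 0: 0.00}    # SSH daemon buffer overflow exploit
P_A_GIVEN_BC = {(1, 1): 1.00, (1, 0): 0.65, (0, 1): 1.00, (0, 0): 0.00}  # root access


def bernoulli(p, value):
    """Probability that a binary variable with P(1) = p takes `value`."""
    return p if value == 1 else 1.0 - p


def joint(d, b, c, a):
    """Joint probability of one full assignment, factorized along the DAG."""
    return (bernoulli(P_D, d)
            * bernoulli(P_B_GIVEN_D[d], b)
            * bernoulli(P_C_GIVEN_D[d], c)
            * bernoulli(P_A_GIVEN_BC[(b, c)], a))


def probability(query, evidence=None):
    """P(query | evidence) by enumeration; both are dicts such as {"A": 1}."""
    evidence = evidence or {}
    numerator = denominator = 0.0
    for d, b, c, a in product((0, 1), repeat=4):
        world = {"D": d, "B": b, "C": c, "A": a}
        if any(world[name] != value for name, value in evidence.items()):
            continue
        p = joint(d, b, c, a)
        denominator += p
        if all(world[name] == value for name, value in query.items()):
            numerator += p
    return numerator / denominator


def predict(evidence, threshold=0.8):
    """Unobserved events whose posterior exceeds the threshold, most probable first."""
    scored = [(name, probability({name: 1}, evidence))
              for name in "DBCA" if name not in evidence]
    return sorted([s for s in scored if s[1] >= threshold],
                  key=lambda s: s[1], reverse=True)


prior_compromise = probability({"A": 1})
predicted = predict({"D": 1})
```

With no evidence, the probability of a server compromise comes out at roughly 0.61. Once the attacker is observed (D = 1), the events A and B pass the assumed 0.8 threshold and would be reported, while C (posterior 0.70) would be filtered out.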
C. Markov Models
Another common approach to predicting attacks based on model-checking prediction methods is using Markov models. Markov models form a popular category of models, including well-known examples of Markov chains and Hidden Markov Models (HMM). Markov models are often represented as a graph, which makes methods based on them similar to the methods based on attack graphs and Bayesian networks. Contrary to previously described approaches, Markov models operate well in the presence of unobservable states and transitions, which removes the dependency of intrusion detection and attack prediction methods on possessing complete information. This allows for successful intrusion detection and attack prediction even if some attack steps were undetected or cannot be completely inferred.
1) Method Description: There are several variants of Markov models used for attack prediction: Hidden Markov models (HMM), Variable-length Markov models (VLMM), and Variable-order Markov models (VOMM). In this section, we show how to construct the model and predict an attack using an HMM. VLMM and VOMM, however, share the same theoretical background and their utilization for attack prediction is very similar. An HMM is a statistical model where the system being modeled is assumed to be a Markov process with unobserved (hidden) states. Hence, we cannot observe the state of a model directly, but only the outputs dependent on the current state.
Consider having attack sequences consisting of classes such as enumeration, host and service probing, exploitation, etc. These events may be detected by an IDS, and thus the alerts will be raised. From the perspective of HMMs, the alerts are observable outputs of attack classes. Keep in mind that not all the events can be detected by an IDS. In order to construct an HMM from the attack sequences, we need to determine the number of states in the model, the number of distinct observation symbols per state, the state transition probability distribution, and the initial state distribution [15]. The number of states is the number of attack classes. The observation symbols represent IDS alerts. State transition and observation probabilities are extracted from historical records or provided by an expert.
HMMs are often visualized as graphs. In cyber security, attack classes are the nodes, observation symbols are the edges, and the probabilities are weights of the edges. Figure 5 shows an example of an HMM used for attack prediction [35]. We can see four states representing the attacker’s progress from a normal state (nothing is happening) to a successful compromise.
When having a sequence of attack classes, there is a need to predict the next activity of an attacker, i.e., the next element in the sequence. Intuitively, there is a need to find the most likely path from the current state node. The most likely path provides a sequence of attack classes that are the predicted actions of the attacker. To eliminate false positives, it is recommended to set a probability threshold so that lower probabilities are discarded, and such paths are not considered for further actions [15].

[Figure 5: four HMM states in sequence: Normal, Attempt, Progress, Compromise.]
Fig. 5. Hidden Markov Model states for predicting cyber attacks (inspired by [35]).

2) Literature Review: The methods based on Markov models appeared along with the methods based on attack graphs and Bayesian networks in the late 2000s. Farhadi et al. [15] in 2011 proposed a complex framework for alert correlation and prediction. In this work, sequential pattern mining is used to extract attack scenarios, which are then represented using a Hidden Markov model that is used for attack plan recognition. The authors claim that their work is the first to use an unsupervised method of attack plan recognition. Research works like this one are part of a trend in research on predictions in cyber security that overcomes a major drawback of previous works. Instead of relying on a predefined model constructed or supervised by a human expert, it incorporates unsupervised methods of data mining or machine learning. Thus, we selected this work as recommended reading to illustrate this transition.
Sendi et al. [35] in 2012 proposed a method of intrusion prediction in real time that uses HMMs. Multi-step attacks are the prime interest in this work. An experimental evaluation shows how their method can predict multi-step attacks, which is especially useful for preventing the attacker from gaining control over more and more hosts in the network.
Shin et al. [36] in 2013 proposed an advanced probabilistic approach for network-based IDS (APAN), which uses a Markov chain to model unusual events in the network traffic and to forecast intrusion. Contrary to other methods based on Markov models, this method processes network anomalies and, thus, is not aiming at predicting the next move of an attacker like other model-checking approaches.
Zhang et al. [37] in 2014 discussed differences between trained and untrained Markov models as applied to detection and prediction of multi-step attacks. The authors first train the HMM using the Baum-Welch algorithm. Subsequently, the attack scenario corresponding to an alert is found using the Forward algorithm. Finally, the next possible attack sequence is predicted using the Viterbi algorithm. The approach was evaluated using the DARPA 2000 dataset. Trained HMMs scored better than their untrained counterparts in both recognition and prediction.
Kholidy et al. published a series of three papers on attack predictions in cloud systems in 2014. First, attack prediction models for intrusion detection systems in the cloud are proposed [38]. Subsequently, the utilization of finite state HMMs for predicting multi-stage attacks in the cloud is discussed [39]. Finally, the intrusion prediction model with finite context with a probabilistic suffix tree is described [40].
Abraham and Nair [41] proposed a predictive cybersecurity framework based on Markov models for exploitability analysis. The authors use CVSS data to assess the life-cycle of vulnerabilities and predict their impact on the network.
Most recently, Bar et al. [42], [43] in 2016 used data from honeypots for complex modeling of attack propagation using Markov chains. Several frequent patterns of attack propagation were observed and described in detail. However, the prediction of the next attacked honeypot is only briefly mentioned and left for future work.

D. Game Theory
Game-theoretical approaches to attack prediction are similar to the graphical model-checking approaches discussed earlier. The game is used as a model of interaction between an attacker and a defender. Contrary to the graphical model-checking approaches, game-theoretical methods aim to find the best strategy for the players instead of the most frequent attack progression observed in historical data. Thus, game-theoretical approaches seem promising especially for prediction of an advanced attacker’s activity.
1) Method Description: Game theory is a mathematical tool designed for analysis of an interaction between subjects with often conflicting objectives. The basic assumptions in game theory are that participants are rational (they pursue their objectives) and that they reason strategically (they take into account their knowledge or expectations of other participants).
A game is a model of strategic interaction. The game consists of 1) a finite set $N$ of players (usually attacker and defender/administrator in the context of network security), 2) a nonempty set of actions $A_i$ for each player $i \in N$, 3) a payoff function $u_i$ for each player $i \in N$ that assigns each outcome $a \in \times_{j \in N} A_j$ a utility of player $i$.
A strategy of a player is a function that provides a player’s action for each situation in which the player should make a decision. We distinguish between two types of strategies. Pure strategies provide a single action for each situation. By contrast, a mixed strategy assigns each situation a probability distribution over the set of the player’s actions. There is no single, explicit concept of a game solution in game theory. The most commonly used solution concept is a Nash equilibrium [52]. In a Nash equilibrium, the players have chosen strategies such that neither of them would benefit by unilaterally deviating from his strategy. Finding the Nash equilibria of a game is often computationally intractable [53]. However, algorithms with lower computational complexity approximating the Nash equilibria are available for some types of games [54], [55].
There are various classes of game models that can be used for attack prediction. One such classification distinguishes extensive vs. strategic games. In a game in strategic form, each player chooses his action only once, and the actions of all players are made simultaneously. By contrast, in games in extensive form, the players make the choice of action multiple (possibly infinitely many) times simultaneously or in turns and the players may include all available information in their decision at the time the decision is made.
Alternatively, we distinguish games with imperfect vs. perfect information. In extensive games with perfect information, at any stage of the game, all players are informed about each other’s moves in previous turns. On the contrary, if all information about past moves is not available to all players, the extensive form game is said to have imperfect information.
2) Literature Review: Lisý et al. [44] used a zero-sum game in extensive form with imperfect information to infer the attacker’s plan in situations when the attacker tries to actively mislead the defender about his goals. They assume the targets and their respective value for the attacker are known as well as the set of all attack scenarios. Every round of the game, the attacker chooses an action, and the defender chooses a sensor from a given set of sensors. Each sensor has a given capability of detecting various attacker’s actions. The attacker tries to reach the most valuable target while avoiding detection and misleading the defender about the ultimate goal. The defender tries to guess as many of the attacker’s moves as possible. They present an algorithm to compute an approximation of the Nash equilibria. Another presented algorithm identifies the most probable scenarios in each turn, thus enabling the defender to guess not only the attacker’s next action but also his ultimate goal.
Píbil et al. [45] focus on predicting the target of the attacker rather than his next move. They consider the zero-sum finite game in extensive form with imperfect information between the attacker and defender. The defender selects the deployment of honeypots, mainly how valuable they appear to the attacker. The attacker chooses which target to attack. They consider two scenarios; in the first scenario, the attacker has no information other than the perceived value of the target, while in the second scenario the attacker can probe a few targets and receive noisy information of their type. The Nash equilibria of this game help the defender to best disguise the honeypots and the attacker to select which targets to attack.
Abdlhamed et al. [9] in 2016 proposed a system for intrusion prediction in a cloud computing environment. Their system is designed to address the problem that theoretical models such as game theory can be highly unreliable with insufficient or uncertain input data. Their system first tries to match the situation to the built attack models and scenarios. If the match is sufficient, the system assumes the situation is covered by the theoretical game theory based model and applies the model’s prediction. In case the input data are not sufficient, statistical methods are applied for prediction. Thus, this work serves as an example of using a combination of different approaches.

V. METHODS BASED ON CONTINUOUS MODELS
The second group of methods uses continuous models, namely time series and grey models, as discussed in the appropriate subsections. Such approaches are in most cases suitable for forecasting the network security situation. Common results are forecasts of the numbers, volumes, and composition of attacks in the network and their distribution in time. Alternatively, spatiotemporal patterns in time series may be used to predict cyber attacks. A summary of methods and research papers can be found in Table III.

A. Time Series
Time series are a very interesting tool for predictive analysis that is used in various fields, including cyber security.
TABLE III
COMPARISON OF PREDICTION METHODS, PART II – APPROACHES BASED ON CONTINUOUS MODELS.
interesting results, up to 87.9 % prediction accuracy was achieved 1 hour ahead of time of an attack. In 2017, Werner et al. [65] used ARIMA time series to predict the intensity of cyber attacks, i.e., the expected number of attacks in the next day. Sokol et al. [64] used an AR(1) model to predict attacks against a honeynet. A similar yet simplistic method using random sampling in temporal variance was proposed by Dowling et al. [66] for attack type predictability. Recent work by Okutan et al. [67] uses a broad range of unconventional signals, such as Twitter events, to improve forecasting of security incidents using a time series and ARIMA model.
Time series were already mentioned in Section IV-D, where a combined approach using game theory and supported by time series analysis was presented [9]. Machine learning methods (see Section VI) may also use time series to train classifiers [71].
B. Grey Models
Grey Models are typically used for predicting cyber security situations and are yet another example of methodologies that employ a continuous mathematical model. The Grey Theory was first presented by Deng in 1982 [72]. In grey theory terminology, a situation with no information is defined as black and a situation with complete information as white. Since both options are idealized, real-world problems are somewhere in the middle, in a situation defined as grey. Thus, a grey situation can be modeled using a Grey Model (GM).
1) Method Description: The most widely used grey forecasting models are GM(1, 1) and its modification, the Grey-Verhulst model. The forecasting ability of these models is limited to predicting the next members of a time series. They are most suitable for short-term prediction based on a small sample of data. In network security, authors usually measure the network security situation and predict its next value.
Let $X^0 = \{x^0(1), \ldots, x^0(n)\}$ be a sequence of length $n$ whose next value will be predicted, usually a time series. First, the Accumulating Generation Operation (1-AGO) is applied and a new sequence $X^1 = \{x^1(1), \ldots, x^1(n)\}$ is created, where $x^1(k) = \sum_{i=1}^{k} x^0(i)$. By applying the accumulation operation, the influence of random fluctuations present in the original sequence is weakened. Moreover, the original sequence can be easily reconstructed as $x^0(k) = x^1(k) - x^1(k-1)$ for $k > 1$, $x^0(1) = x^1(1)$.
The model is created for the sequence $X^1$. Different modifications use different models. The original GM(1, 1) model assumes the data satisfy the differential equation
$$\frac{dx^1(k)}{dk} + a x^1(k) = b.$$
The model works best for data with exponential growth. The Grey-Verhulst model, which is more appropriate for data following an S-curve [73], assumes a differential equation
$$\frac{dx^1(k)}{dk} + a x^1(k) = b[x^1(k)]^2.$$
The model parameters $a$, $b$ are estimated from the sample data using the least squares method. The solution of the differential equation $\hat{x}^1(k)$ is computed, and the future values of the sequence $X^0$ are predicted as $\hat{x}^0(k+1) = \hat{x}^1(k+1) - \hat{x}^1(k)$ for $k \geq n$. The various methods based on the Grey model usually use a modified model or extend the model with error prediction.
2) Literature Review: Preliminary work on network security situation forecasting using Grey models from 2006 to 2014 is covered in a survey by Leau and Manickam [6]. Thus, we only surveyed later research works.
In 2014, Lin et al. [68] introduced their definition of the network security situation. They claim the network defense is similar to an immune system; the severity of a situation is proportional to the strength of the response. The authors compute the network security situation based on the number of defensive measures currently in place. They improve the prediction by considering various factors that influence the network security situation. The most influential factors are selected using the method of grey entropy correlation analysis, and the Kalman filter is applied to improve the prediction.
In 2016, Leau and Manickam [69] endeavored to overcome the limitations of the GM(1, 1) and Grey-Verhulst models, namely that they are accurate only for specific input series. In their work, they introduce an adaptive Grey-Verhulst model that is robust when applied to a wider range of time series. The modification consists of an extension of the underlying Grey-Verhulst model. While the original model from which the differential equation is derived assumes that $x^0(k) + a z^1(k) = b(z^1(k))^2$, where $z^1(k) = \frac{1}{2} x^1(k) + \frac{1}{2} x^1(k-1)$, the modified version assumes $z^1(k) = x^1(k-1) + \frac{1}{2} x^0(k) + \frac{1}{6} x^0(k-1) - \frac{1}{6} x^0(k-2)$. The value of $z^1(k)$ is derived so that the error due to different shapes of the original time series is reduced as much as possible. The same authors also introduce [70] an adaptive Grey-Verhulst-Kalman prediction model, which utilizes the adaptive Grey-Verhulst model from their previous work and improves it by applying the Kalman filter to predict the next residual, thus increasing the prediction accuracy.
VI. MACHINE LEARNING AND DATA MINING METHODS
Machine learning (ML) is gaining popularity in the research community in wide areas of exploration, and cyber security is no exception [89]. It contains a vast landscape of approaches and methods, such as neural networks and support vector machines, which makes it difficult to properly categorize machine learning in terms of attack prediction methods. Machine learning is closely tied to data mining [89], which was already mentioned several times in this work. Typically, data mining was exploited to create a model used in attack prediction, e.g., an attack graph [14] and a Markov model [15]. The utilization of data mining in this context is intended to overcome a major drawback of model-based attack prediction models, i.e., the dependency on models provided by a security expert [3]. However, data mining does not directly influence the method itself. Thus, in this section, we only list approaches that make direct use of machine learning. Methods that are only supported by machine learning or data mining are discussed in other sections.
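The GM(1, 1) procedure from Section V-B (1-AGO, least squares estimation of a and b, solution of the differential equation, inverse accumulation) can be illustrated with a minimal Python sketch. It is not taken from any of the surveyed papers. It assumes the commonly used discretization x0(k) + a z1(k) = b with background values z1(k) = (x1(k) + x1(k-1))/2 and the closed-form solution with initial condition x1_hat(1) = x0(1). The function name `gm11_forecast` and the sample series are hypothetical.

```python
import math


def gm11_forecast(x0, steps=1):
    """Forecast the next `steps` values of a short series with GM(1, 1).

    Assumes len(x0) >= 3 and a fitted parameter a != 0.
    """
    n = len(x0)

    # 1-AGO: x1(k) is the sum of x0(1..k)
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)

    # Background values z1(k) = (x1(k) + x1(k-1)) / 2 for k = 2..n
    z1 = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(x0[1:])

    # Least squares fit of x0(k) = -a * z1(k) + b (simple linear regression)
    m = n - 1
    mean_z = sum(z1) / m
    mean_y = sum(y) / m
    szz = sum((z - mean_z) ** 2 for z in z1)
    szy = sum((z - mean_z) * (v - mean_y) for z, v in zip(z1, y))
    slope = szy / szz
    a = -slope
    b = mean_y - slope * mean_z

    # Solution of dx1/dk + a*x1 = b with x1_hat(1) = x0(1); k is 1-based
    def x1_hat(k):
        return (x0[0] - b / a) * math.exp(-a * (k - 1)) + b / a

    # Inverse accumulation: x0_hat(k+1) = x1_hat(k+1) - x1_hat(k) for k >= n
    return [x1_hat(k + 1) - x1_hat(k) for k in range(n, n + steps)]


# Hypothetical daily situation values, for illustration only
history = [2.0, 2.3, 2.5, 2.9, 3.2, 3.4]
print(gm11_forecast(history, steps=2))
```

The sketch fits only two parameters, which is consistent with the remark above that grey models suit short-term prediction from a small sample.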
TABLE IV
COMPARISON OF PREDICTION METHODS, PART III – APPROACHES BASED ON MACHINE LEARNING AND DATA MINING.
Fig. 7. Artificial neural network for network security situation prediction (inspired by [90]). [Diagram labels: Situation Sample, Prediction Output; nodes I2 ... In, H2 ... Hn, O2 ... On.]
works. Those are neural networks, support vector machines, and data mining. The remaining research works are discussed after that. It is hard to properly categorize this group of methods because of frequent combinations of approaches or uniquely used approaches.
1) Neural Networks: A number of papers deal with the application of machine learning to predict the network security situation for the needs of NSSA. These papers are rather short and focus on the theoretical background of NSSA modeling and forecasting, such as the mathematical formalization of the problem. However, the proposed approaches are rarely supported by experimental evaluation and, thus, provide limited value for security practitioners. Nevertheless, the common statement that NSSA is of vital interest is unquestionable. Various types of neural networks are discussed in these papers, and a summary is provided herein for completeness. The first papers started to appear in 2008, and the work continues to this day. In 2012, Zheng et al. [74] discussed using back-propagation neural networks. Zhang et al. [76] in 2013 compared back-propagation and radial basis function neural networks, and Chen et al. [75] proposed using a small-world echo state network, which is a kind of recurrent neural network. Zhang et al. [78] proposed using wavelet neural networks in 2016. Most recently, He et al. [79] proposed using a mixed wavelet-based neural network.
Neural networks were also used for intrusion prediction in 2016 by Xing-zhu [77]. The research work is, in essence, similar to the works on network security situation forecasting, only the motivation is focused more towards predicting a particular intrusion.
2) Support Vector Machine: Cheng and Lang [80] suggested using a support vector regression machine to forecast the network security situation, although this work mostly presents an alternative to the neural network-based methods. Apart from a different classifier, their work is, in essence, similar to research performed in this field using neural networks.
Support vector machines proved suitable for predicting very specific attacks. Jayasinghe et al. [81] in 2014 predicted
a for the learning phase [83].
3) Data Mining: Fachkha et al. [84] in 2012 investigated the data from a darknet, a large unassigned IP address space, to profile the darknet traffic and corresponding cyber threats. Frequent pattern mining and association rule mining were used to find hidden correlations between events in darknet traffic. The found patterns and rules are then proposed to be used for predicting events in the darknet traffic and cyber threats in general. Due to the nature of the darknet, in this case, the CAIDA darknet that represents 1/256 of the IPv4 address space, the results of such threat prediction have global scope.
Kim and Park [85] in 2014 used data mining to build the attack graph for attack prediction. The authors used sequential association rule mining to reflect the order of events. Although the paper indicates that the mined sequences are used for constructing the attack graph, it does not specify how this is actually done but rather focuses on the sequence mining. Thus, it was not categorized under attack graph-based models in Section IV-A. Sequence mining was also used in recent work by Husák and Kašpar [86], in which the authors mined sequential rules from cyber security alerts contained in a large-scale alert sharing platform. Contrary to [85], the emphasis was put on analyzing live data from real networks and evaluating the suitability of such an approach in practice.
4) Other Machine Learning Methods: In 2014, Soska and Christin [87] used machine learning to automatically detect vulnerable websites before they turn malicious. Traffic statistics, filesystem structure, and website content were used to train an ensemble of decision-tree classifiers. The authors performed a year-long evaluation with promising results of 66% true positive rate and 17% false positive rate, which is a good result among methods evaluated in practice.
Liu et al. [71] in 2015 characterized the extent to which cyber security incidents can be predicted. The research work is focused on data breaches, which are predicted using a random forest classifier against more than 1,000 real data breaches. The number of features used for training the classifier is remarkable: 258 features were collected from organizations’ networks. The features either describe mismanagement symptoms (misconfigured DNS, BGP, etc.) or malicious activity time series (spam, phishing, network scans, etc.). The resulting 90% true positive rate and 10% false positive rate only underline the extent of this work. Due to its significant extent, we list this work as recommended reading.
Veeramachaneni et al. [88] in 2016 presented AI$^2$, a machine learning system for attack prediction that includes human input. First, the authors use an ensemble of unsupervised outlier detection methods, including principal component analysis and autoencoders. Subsequently, feedback from an analyst is obtained and a supervised learning module
is used. The model is constantly refined as more feedback is gathered, which leads to promising results. AI$^2$ improves the detection rate more than threefold on average while reducing the false positive rate fivefold, compared to unsupervised methods alone.
Shao et al. [18] in 2016 used user behavior analysis to predict cyber attacks with a motivation to include the reasoning behind the attacks. A user security rating is derived from his/her consistency (usage patterns), accuracy (frequency of mistakes), and constancy (how long the user displays good online behavior). Rule mining is then used to find hidden relations in the behavior patterns. Finally, unsupervised clustering, such as k-means, and manual filtration of the results are used to identify groups of users that are prone to malicious operations.
VII. OTHER APPROACHES
In this section, we discuss the fourth group of prediction methods: methods that are hard to categorize properly or that are highly specialized in terms of the use case or the method used. The full list of approaches and papers is presented in Table V. There is no common background to these methods, so we only provide the literature review and briefly explain the background there.
1) Similarity-based Approaches: The first of the alternative approaches is based on similarity, mostly addressing the problem of attacker’s intention recognition by calculating a similarity metric with a previously observed attack. In 2012, Jantan and Rasmi et al. [91], [92] proposed a model of attack strategy that allows comparisons of the attack strategies. The observed security alerts are expressed numerically, and cosine similarity is applied to infer a similarity between two attack strategies. It is worth mentioning that the same authors have previously developed models based on Bayesian networks [106].
In 2014, AlEroud and Karabatis [93] proposed an approach to detect cyber attacks using a semantic link network (SLN), which utilizes contextual information of network flows and alerts raised in response to them. Subsequently, the SLN is used to predict and detect malicious flows, focusing on multi-step attacks, using similarity measures. The same authors recently published a novel approach [94] based on contextual relationships between cyber attacks and calculating their similarity.
In 2016, Jiang et al. [95] proposed an intrusion prediction mechanism based on honeypot log similarity. System logs from honeypots were first analyzed using association rule mining to find useful implicit information and to select features. Subsequently, the flows are mapped into a metric space, and distance calculation is used to identify flows that are most similar to the known malicious flows, thus adding them to the prediction list. This approach aims at reducing false positive alarms and was evaluated in a live environment of a Taiwanese academic network.
Recently, AlEroud and Alsmadi [96] used similarity to predict and mitigate attacks in software-defined networks. The network traffic is aggregated to flows, and the flows’ characteristics are subsequently compared to flow signatures of known attacks. If a similarity is found between known malicious flows and current flows, it is possible to predict a continuation of the traffic and mitigate the attack.
2) DDoS Volume Forecasting: DDoS attacks and predictions related to them are deeply studied topics. The prediction of DDoS attacks focuses mostly on identifying the initial phase of an attack, in which the volume of bogus network traffic rises, and on predicting the volume of the attack. The volume of a DDoS attack is the most important feature of such attacks. The metrics for DDoS volume are the packet or byte rate per second and the estimated number of compromised machines involved in the attack. Knowing the attack volume in advance tells us whether the target system or the network can withstand the attack or whether there is enough capacity for defense, e.g., in scrubbing centers.
Since 2012, several authors have proposed their methods of DDoS forecasting. Kwon et al. [97] used honeynets to capture the initial phases of the DDoS attack and predict its size. Later, they used statistical approaches to predict the DDoS volume [98]. Fachkha et al. [99] proposed an approach based on analysis of data from darknets. Olabelurin et al. [100] improved the forecasting techniques by including entropy in the calculations.
3) Evolutionary computing: A very recent approach to forecasting the network security situation is based on belief rule base (BRB) models and evolutionary algorithms, namely CMA-ES. This approach emerged in 2016 and has since been described and continuously improved by Hu et al. [101], [102] and Wei et al. [103], including improvements in network security situation assessment [107]. A BRB model includes a series of belief rules and can be built from expert knowledge as well as historical data. These might be subjective and inaccurate. Subsequently, the covariance matrix adaptation evolution strategy (CMA-ES) is used to optimize the parameters of the BRB model, which can then forecast the network security situation. This novel approach seems very promising and might be a good alternative to grey models, which were used for the same purpose, as discussed in Section V-B. Nevertheless, this method is too novel for us to compare its impact, e.g., by the number of citations.
4) Unconventional data sources: A novel trend in cyber security predictions is using unconventional data sources. For example, DNS logs are used for attack prediction in work by Mahjoub and Mathew [104] from 2015, who proposed a principle called Spike Rank or SPRank, which detects domains showing a sudden spike in DNS queries issued from millions of clients worldwide towards OpenDNS resolvers. Using these spikes, several malware campaigns as well as phishing campaigns were detected.
In addition, even non-technical data sources were considered for cyber attack prediction. Hernandez et al. [16] in 2016 performed sentiment analysis on Twitter to predict cyber attacks. Sentiment analysis of social networks was also a data source for Shu et al. [17] in 2018. Information foraging for improving cyber attack predictions was also discussed by Dalton et al. [105] in 2017. The authors, however, discuss various strategies for information foraging and only briefly mention the data sources with which they work.
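The similarity-based approaches in Section VII-1 compare numerically encoded alerts with a similarity metric such as cosine similarity. The following Python sketch shows the calculation only and does not reproduce the model of [91], [92]. The encoding of a strategy as a vector of alert counts per category, the category names, and all values are assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equally long numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector carries no direction to compare
    return dot / (norm_u * norm_v)


# Hypothetical alert counts per category:
# (scan, brute force, exploit, exfiltration)
observed = [12, 3, 1, 0]
known_strategies = {
    "recon_then_exploit": [10, 2, 2, 0],
    "data_theft": [0, 1, 2, 9],
}

# Pick the previously observed strategy most similar to the current alerts
best_match = max(
    known_strategies,
    key=lambda name: cosine_similarity(observed, known_strategies[name]),
)
print(best_match)
```

Cosine similarity ignores vector magnitude, so under this encoding a short and a long alert sequence with the same mix of alert types are treated as the same strategy.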
TABLE V
COMPARISON OF PREDICTION METHODS, PART IV – OTHER APPROACHES.
VIII. EVALUATION AND LESSONS LEARNED
In this section, we evaluate the findings from the literature review, and we answer the questions stated in the introduction. In the first question, we were interested in what can be predicted in the cyber security domain. Although many use cases were proposed, they can be reduced to several main use cases, namely, attack projection and intent recognition, attack or intrusion prediction, and security situation forecasting. These were already described in detail in Section II. The remaining questions are summed up and answered in the following subsections. First, we sum up the practical implications, i.e., how ready the attack prediction methods are to effectively mitigate the attacks. Further, we take a closer look at the evaluation of predictions and forecasts in cyber security. A separate subsection is dedicated to metrics, as there appear to be multiple approaches to defining a set of evaluation metrics. Finally, we sum up open and resolved problems in the field.
A. Practical implications
Regarding the practical implications, the prime issues are the accuracy and efficiency of predictions, but it is hard to evaluate and compare the methods. Even setting the right metrics is a problem, as we discuss later in this section. However, high prediction accuracy is a good indicator of a method’s usability in practice. As we inferred in the literature review, there are many approaches that achieved high accuracies of over 90 % [26], [32], [15]. However, such results were obtained when evaluating the approaches over datasets. When we take a look at methods evaluated on live network traffic, the prediction accuracies drop to around 60–70 % [25], [33], [58], [63]. Some works show even worse results, which indicates that the prediction accuracy in practice is at the lower bounds.
Other practical aspects of predictions are the time criteria, namely the time needed to predict future events and the time that remains until the predicted event. While older works focused on the computational complexity of the prediction algorithms, the field reports are scarce. However, modern approaches are implemented to operate in real time with minimal time delay [32], which effectively solves the problem. Nevertheless, there is a need to find out how much time there is to react to a predicted attack. Kholidy et al. [38], [39], [40] claim that they can predict an attack forthcoming in 39 minutes, which is a promising result that leaves enough time even for manual inspection of the predicted event. However, there are no other works using the same metrics.
There are two other major issues common to many methods: populating the knowledge base of the attacks and placing attack prediction at the most suitable level of abstraction [3]. First, attack prediction methods require either a library of attack plans completed by experts or a dataset of historical records, from which the attack plans might be constructed. Although both approaches are prone to errors and missing attack descriptions, the use of machine learning and data mining for model construction or direct prediction has prevailed in recent years. However, if an automatically found attack plan is going to be used in practice, one has to be careful to manually inspect the results. Second, it is computationally demanding to implement attack prediction at the network level, e.g., as part of an IDS. Working with alerts from IDS is much more scalable and flexible than working with packets or network
flows. Additionally, it is convenient to combine alerts from multiple IDS, e.g., a network-based and a host-based one, to get the complete trace of the attack. However, correlating alerts from heterogeneous sources adds additional complexity and stands as a research problem of its own [108].
Suthaharan [109] states that network intrusion detection and prediction are time-sensitive applications requiring highly efficient Big Data techniques to tackle the problem on the fly. Thus, it is proven that the data fall into the category of big data. However, a new definition of big data is provided based on three new parameters, cardinality, continuity, and complexity, instead of the traditional volume, variety, and velocity. Further, the suitability of machine learning for big data is discussed. Although methods based on Support Vector Machines provide excellent accuracies, they are not suitable for big data due to their computational complexity. Representational learning might be suitable for big data classification, but Machine Lifelong Learning is recommended to be used.
B. Datasets
During the literature search, we encountered several datasets that were often used for evaluation of the proposed methods. The most popular datasets were produced by MIT Lincoln Labs and are generally recognized as the DARPA datasets [110], [111]. There are three distinct datasets available: DARPA 1998, DARPA 1999, and DARPA 2000. DARPA 2000 further contains two attack scenarios, LLDOS 1.0 and LLDOS 2.0.2; often only LLDOS 1.0 was used in attack prediction method evaluations. Although the dataset is popular and well documented, its main problem is its age; an almost 20-year-old dataset does not reflect current cyber security threats and network traffic patterns.
ACM SIGKDD announced KDD Cup 1999 [112], a contest on knowledge discovery from cyber security data. In this contest, the DARPA 1998 dataset was used, although many authors referenced the dataset as the KDD 1999 dataset. The KDD Cup 1999 gained a lot of attention from numerous researchers on the problem of intrusion detection as well as attack prediction, thus allowing further comparisons of various methods. However, substantial flaws in the dataset were revealed in a thorough evaluation [113]. Thus, the dataset is now considered unreliable and even harmful by the community, although attempts at improving the dataset quality were made [114]. Still, the dataset is used even in recent works [9], [77].
Other public datasets are used scarcely; the researchers often crafted their own datasets [76] and evaluated their proposed methods using these data. While some data are obtained from real network traffic, which provides fresh data, it is quite problematic to publish such data due to the need for data anonymization. Another common option is to design a testbed [22], [41], [100], which is often laborious to set up, even if a proper description is provided. Thus, custom datasets and testbeds seem suitable for evaluating the proposed methods, but the reproducibility of such research is often disputable.
There is one more common problem related to many datasets used for evaluating methods of attack prediction, and that is that the datasets are not designed for the purpose of evaluating attack prediction. As Fava et al. stated back in 2008 [115], commonly known datasets, including the DARPA datasets, are crafted for intrusion detection and, thus, do not have the notion of attack tracks, i.e., there is no information available on the attackers’ intentions or correlation of attack steps. Thus, we can only confirm the accuracy of predicting the attacker’s next move, but we cannot confirm or discard the predicted attack plan.
C. Evaluation in live network
Evaluation of attack prediction in real-life scenarios is challenging. It is hazardous to let the adversary execute an intrusion in a real network only to evaluate the predictions. In large networks, it is also problematic to get access to every host that could be compromised. Nevertheless, several live data sources were used, such as the data from DShield [116], a collaborative database of firewall logs, and Hackmageddon [51], a compilation of cyber attack timelines and statistics. Very often, researchers set up a honeypot to capture cyber security data and use them to evaluate predictions. The main advantage of honeypots is that they typically contain only malicious data. However, they are not useful for studying advanced attackers for the purpose of attack intention recognition. Finally, darknets, large unassigned IP blocks, such as the CAIDA network telescope, were used for prediction on a global scale [84], [99].
In addition, the research on attack projection is often accompanied by research on deception and network traffic manipulation. The aim of deception in cyber security is to guide the adversary to the target of the defender’s choice, typically a honeypot. Several researchers [117] continued their work on attack prediction by proposing a deception system, which prepares an attractive target for an attacker. For example, if an adversary is supposed to exploit a certain service, a honeypot emulating such a service is set up in the target network, either as a new target or as a clone of a real system. If the predictions are correct and the honeypot setup is quick enough, the attacker would exploit a honeypot, and the attack can be studied. Manipulating the terrain for the attacker was problematic mostly due to the need for rapid deployment of honeypots and movement of targets, as traffic manipulation was too costly. However, recent development in networking, namely in Software Defined Networks (SDN), allowed easy traffic manipulation. The emerging field of SDN thus began producing security-related frameworks focusing on early-stage attack mitigation and traffic redirection, e.g., diverting the attack traffic to a honeypot instead of the original target. AVANT-GUARD [118] is one of the early examples. Combining such a framework with attack prediction has been proposed recently [96], and we expect more work on this topic in the near future.
D. Metrics
Setting the metrics to evaluate and compare attack prediction methods is a challenging task. Naturally, we are interested in the prediction accuracy as a prime indicator, but that may rely on a given context and specific use case. In practical
setups, we encountered the time criteria, such as prediction efficiency and the time remaining until the predicted event. Specific tasks, such as predictions based on specific attack traits, require specific metrics. In this section, we summarize and evaluate the metrics that are typically used in the literature.
The most important metric for evaluating prediction methods is their accuracy. As we have seen in many surveyed papers, the authors often report the accuracy as the percentage of successfully predicted events or situations. However, accuracy can be understood broadly, and not all the papers use it in a formal sense. Often, we can see a confusion matrix as a more descriptive metric of a prediction method. The confusion matrix is used for the evaluation of intrusion detection. Hence, it is natural to use it to evaluate prediction in cyber security as well. However, there are several issues with the use of confusion matrices. First, all the elements can be obtained when evaluating a method over an annotated dataset, but we can never be sure about the results when evaluating the methods over live network data. Second, different methods may use different criteria for true and false positives and negatives. For example, if a certain exploit is predicted to happen at a certain time on a specific host, but the attacker exploits another target or the time of the attack is significantly different, it is quite unclear whether we should consider such events as true positives. Finally, in predictive analytics and other fields of research, precision and recall values are often used instead of the full confusion matrix, but are calculated from it. Precision is defined as $tp/(tp + fp)$, while recall is defined as $tp/(tp + fn)$. Precision and recall are favored to prevent the accuracy paradox, i.e., a situation in which a predictive model with a given level of accuracy may have greater predictive power than models with higher accuracy. These metrics were often used to evaluate statistical methods and methods based on machine learning that we surveyed in the literature review. To sum up, even though many surveyed papers use similar metrics, they are hardly comparable because different works go into different levels of detail or use less formal definitions of prediction accuracy.
Time criteria were used for evaluation of attack prediction methods by Kholidy et al., who measured the time difference between the prediction and the predicted attack [38], [39], [40]. Thus, it is possible to estimate when the attack is going to appear and how much time there is to prepare an appropriate defense. On the one hand, the time delay between individual attack steps can be inferred from the history of attacks in most of the attack prediction methods. On the other hand, the time criterion may be used as an indicator of the practical usability
to be created and maintained. Similarly, if a security situation is formally represented, there is a need to consider all the factors contributing to it, which is not always straightforward. Here we recapitulate minor problems that were successfully approached and those that remain open.
An example of a successfully resolved problem is the generation and maintenance of attack models or attack plan libraries. The first attack prediction methods depended on attack plan libraries that had to be populated by human experts. It was tedious to formally represent all the possible attack paths and, even if this was done, the model parameters, such as transition probabilities in graph models, were hard to obtain accurately. Often, a model library built upon historical records was proposed, which enabled realistic model parameters but still required laborious manual work by experts. However, the introduction of data mining into the cyber security domain created a breakthrough for attack predictions. Using data mining, an attack plan library can be constructed automatically and continuously updated. Data mining became especially popular for constructing graph-based models, for example [14], [15], [32], [37]. Data mining closely relates to machine learning, which became another popular approach to attack prediction. Machine learning-supported methods do not need an external model as they construct their own internal representation of cyber security events and predictive rules during the learning phase. However, human experts still play a vital role in constructing attack models and consulting the results [88]. Further, a current research trend is using deep machine learning, which has not been observed in the surveyed literature yet. We expect to see deep learning-based prediction methods in cyber security in the near future.
Although the problems outlined earlier in this section have been resolved, many other issues remain. The major issue is how prediction methods can react to new trends in cyber security, e.g., novel attack vectors and security paradigms. Even though we cannot effectively predict 0-day attacks, their attack progression is typically similar to some of the known attacks, thus making the actual attack predictable to some extent. However, how can we react to paradigm shifts and novel attack vectors that arose with the development of the Internet of Things (IoT), cyber-physical systems, software-defined networking (SDN), and other current trends? Indeed, the first attempts to predict attacks in these novel paradigms have already been proposed [34], [96]. Nevertheless, it is definitely interesting to see how we can adapt the general methods to work under emerging paradigms in networking and security.
of a prediction method. Thus, the time criterion should be
considered especially by practitioners who require some time IX. C ONCLUSION & O UTLOOK
to react to a prediction. In this paper, we presented a literature survey of attack
prediction methods. The problem was set in a context of re-
search on intrusion detection and cyber situational awareness.
E. Open and Resolved Problems A taxonomy of methods was provided, and each category
In the introduction and the literature survey in Sections IV– was described in detail and evaluated. The final evaluation
VII, we have mentioned a number of problems associated compared the methods and discussed related problems and
with attack prediction and forecasting. Many of these problems lessons learned. Herein, we conclude our findings on the
were common to multiple attack prediction methods. For ex- theory and practice of attack prediction and suggest future
ample, if a method depends on an attack model, the model has events in the field.
Three important findings emerged from the literature review. First, many
of the prediction methods in cyber security use a model to represent and
project the future state of an attack or a security situation. Although
there is an apparent division of the models given by their use case
(attack projection more often uses discrete models, while forecasting the
network security situation predominantly uses continuous models), the two
main use cases often complement each other and overlap in many cases.
Second, we have seen many new approaches based on data mining and machine
learning, which substantially change the state of the research in cyber
security predictions. Data mining resolves the dependence on artificially
provided prediction models, while machine learning challenges the
model-based approaches in general. Finally, we have encountered many
problems related to the evaluation of predictions in cyber security. In
the context of empirical datasets, popular datasets are old, unreliable,
and created for other purposes, while evaluations in live networks are
not reproducible. We do not even have a common set of metrics to compare
the methods.
In the future, we are likely going to see further improvements of attack
prediction and its utilization in practice. Keeping in mind that attack
prediction is one step behind intrusion detection, we outline a few
directions in which the research is likely to head. First, a transition
in processing the network data and alerts from batches to stream data
processing has already started, and we may expect further utilization of
Big Data analytics [119], [109]. Second, in the near future, we are going
to see research on attack prediction in a collaborative environment, such
as collaborative intrusion detection systems or alert sharing platforms.
Predicting attacks in such an environment is a natural next step of the
research in this area [120], [86]. Finally, we are going to see more and
more data mining and machine learning in cyber security [89], and attack
prediction is no exception. Specifically, we will know better if machine
learning alone can be used to learn about the attacks and predict them at
the same time, or if data mining and machine learning will be used only
to learn about the attacks and the prediction will still use pattern
matching.
To conclude this paper, the issue of attack prediction is an interesting
research problem that has been approached many times by a number of
researchers. Although many solutions have been proposed, there is still
no definite answer on how to effectively and precisely predict cyber
attacks. Attack prediction is not yet used in practice and is sometimes
seen as rather misleading [121], but it is still an open, imperative, and
desirable research problem [1], [3], [120].

REFERENCES
[1] A. Kott, Towards Fundamental Science of Cyber Security. New York, NY: Springer New York, 2014, pp. 1–13.
[2] R. A. Ahmadian and A. R. Ebrahimi, “A survey of IT early warning systems: architectures, challenges, and solutions,” Security and Communication Networks, vol. 9, no. 17, pp. 4751–4776.
[3] S. J. Yang, H. Du, J. Holsopple, and M. Sudit, Attack Projection. Cham: Springer International Publishing, 2014, pp. 239–261.
[4] A. A. Ahmed and N. A. K. Zaman, “Attack intention recognition: A review,” IJ Network Security, vol. 19, no. 2, pp. 244–250, 2017.
[5] M. Abdlhamed, K. Kifayat, Q. Shi, and W. Hurst, Intrusion Prediction Systems. Cham: Springer International Publishing, 2017, pp. 155–174.
[6] Y.-B. Leau and S. Manickam, Network Security Situation Prediction: A Review and Discussion. Berlin, Heidelberg: Springer Berlin Heidelberg, 2015, pp. 424–435.
[7] X. Wei and X. Jiang, “Comprehensive analysis of network security situational awareness methods and models,” in Instrumentation and Measurement, Sensor Network and Automation (IMSNA), 2013 2nd International Symposium on. IEEE, 2013, pp. 176–179.
[8] I. A. Gheyas and A. E. Abdallah, “Detection and prediction of insider threats to cyber security: a systematic literature review and meta-analysis,” Big Data Analytics, vol. 1, no. 1, p. 6, Aug 2016.
[9] M. Abdlhamed, K. Kifayat, Q. Shi, and W. Hurst, “A system for intrusion prediction in cloud computing,” in Proceedings of the International Conference on Internet of Things and Cloud Computing, ser. ICC ’16. New York, NY, USA: ACM, 2016, pp. 35:1–35:9.
[10] C. W. Geib and R. P. Goldman, “Plan recognition in intrusion detection systems,” in DARPA Information Survivability Conference & Exposition II, 2001. DISCEX ’01. Proceedings, vol. 1, 2001, pp. 46–55.
[11] T. Hughes and O. Sheyner, “Attack scenario graphs for computer network threat analysis and prediction,” Complexity, vol. 9, no. 2, pp. 15–18, 2003.
[12] X. Qin and W. Lee, “Attack plan recognition and prediction using causal networks,” in Computer Security Applications Conference, 2004. 20th Annual, Dec 2004, pp. 370–379.
[13] E. Bou-Harb, M. Debbabi, and C. Assi, “Cyber Scanning: A Comprehensive Survey,” Communications Surveys & Tutorials, IEEE, vol. 16, no. 3, pp. 1496–1519, 2013.
[14] Z. t. Li, J. Lei, L. Wang, and D. Li, “A data mining approach to generating network attack graph for intrusion prediction,” in Fuzzy Systems and Knowledge Discovery, 2007. FSKD 2007. Fourth International Conference on, vol. 4, Aug 2007, pp. 307–311.
[15] H. Farhadi, M. AmirHaeri, and M. Khansari, “Alert Correlation and Prediction Using Data Mining and HMM,” ISeCure, vol. 3, no. 2, 2011.
[16] A. Hernández, V. Sanchez, G. Sánchez, H. Pérez, J. Olivares, K. Toscano, M. Nakano, and V. Martinez, “Security attack prediction based on user sentiment analysis of Twitter data,” in 2016 IEEE International Conference on Industrial Technology (ICIT), March 2016, pp. 610–617.
[17] K. Shu, A. Sliva, J. Sampson, and H. Liu, “Understanding cyber attack behaviors with sentiment information on social media,” in Social, Cultural, and Behavioral Modeling. Cham: Springer International Publishing, 2018, pp. 377–388.
[18] P. Shao, J. Lu, R. K. Wong, and W. Yang, “A transparent learning approach for attack prediction based on user behavior analysis,” in Information and Communications Security. Cham: Springer International Publishing, 2016, pp. 159–172.
[19] M. R. Endsley, “Situation awareness global assessment technique (SAGAT),” in Aerospace and Electronics Conference, 1988. NAECON 1988., Proceedings of the IEEE 1988 National. IEEE, 1988, pp. 789–795.
[20] ——, “Toward a Theory of Situation Awareness in Dynamic Systems,” Human Factors, vol. 37, no. 1, pp. 32–64, 1995.
[21] A. Kott, C. Wang, and R. F. Erbacher, Cyber defense and situational awareness. Springer, 2014, vol. 62.
[22] C. J. Chung, P. Khatkar, T. Xing, J. Lee, and D. Huang, “NICE: Network Intrusion Detection and Countermeasure Selection in Virtual Network Systems,” IEEE Transactions on Dependable and Secure Computing, vol. 10, no. 4, pp. 198–211, July 2013.
[23] I. Kotenko and A. Chechulin, “A cyber attack modeling and impact assessment framework,” in 2013 5th International Conference on Cyber Conflict (CYCON 2013), June 2013, pp. 1–24.
[24] P. Cao, K.-w. Chung, Z. Kalbarczyk, R. Iyer, and A. J. Slagell, “Preemptive intrusion detection,” in Proceedings of the 2014 Symposium and Bootcamp on the Science of Security, ser. HotSoS ’14. New York, NY, USA: ACM, 2014, pp. 21:1–21:2.
[25] P. Cao, E. Badger, Z. Kalbarczyk, R. Iyer, and A. Slagell, “Preemptive Intrusion Detection: Theoretical Framework and Real-world Measurements,” in Proceedings of the 2015 Symposium and Bootcamp on the Science of Security, ser. HotSoS ’15. New York, NY, USA: ACM, 2015, pp. 5:1–5:12.
[26] A. A. Ramaki, M. Amini, and R. E. Atani, “RTECA: Real time episode correlation algorithm for multi-step attack scenarios detection,” Computers & Security, vol. 49, pp. 206–219, 2015.
[27] M. GhasemiGol, A. Ghaemi-Bafghi, and H. Takabi, “A comprehensive approach for network attack forecasting,” Computers & Security, vol. 58, pp. 83–105, 2016.
[28] M. GhasemiGol, H. Takabi, and A. Ghaemi-Bafghi, “A foresight model for intrusion response management,” Computers & Security, vol. 62, pp. 73–94, 2016.
[29] N. Polatidis, E. Pimenidis, M. Pavlidis, and H. Mouratidis, “Recommender systems meeting security: From product recommendation to cyber-attack prediction,” in Engineering Applications of Neural Networks. Cham: Springer International Publishing, 2017, pp. 508–519.
[30] N. Polatidis, E. Pimenidis, M. Pavlidis, S. Papastergiou, and H. Mouratidis, “From product recommendation to cyber-attack prediction: generating attack graphs and predicting future attacks,” Evolving Systems, May 2018.
[31] J. Wu, L. Yin, and Y. Guo, “Cyber Attacks Prediction Model Based on Bayesian Network,” in Parallel and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference on, Dec 2012, pp. 730–731.
[32] A. A. Ramaki, M. Khosravi-Farmad, and A. G. Bafghi, “Real time alert correlation and prediction using Bayesian networks,” in Information Security and Cryptology (ISCISC), 2015 12th International Iranian Society of Cryptology Conference on. IEEE, 2015, pp. 98–103.
[33] A. Okutan, S. J. Yang, and K. McConky, “Predicting Cyber Attacks with Bayesian Networks Using Unconventional Signals,” in Proceedings of the 12th Annual Conference on Cyber and Information Security Research, ser. CISRC ’17. ACM, 2017, pp. 13:1–13:4.
[34] K. Huang, C. Zhou, Y. C. Tian, S. Yang, and Y. Qin, “Assessing the physical impact of cyberattacks on industrial cyber-physical systems,” IEEE Transactions on Industrial Electronics, vol. 65, no. 10, pp. 8153–8162, Oct 2018.
[35] A. S. Sendi, M. Dagenais, and M. Jabbarifar, “Real Time Intrusion Prediction based on Optimized Alerts with Hidden Markov Model,” Journal of Networks, vol. 7, no. 2, 2012.
[36] S. Shin, S. Lee, H. Kim, and S. Kim, “Advanced probabilistic approach for network intrusion forecasting and detection,” Expert Systems with Applications, vol. 40, no. 1, pp. 315–322, 2013.
[37] Y. Zhang, D. Zhao, and J. Liu, “The Application of Baum-Welch Algorithm in Multistep Attack,” The Scientific World Journal, vol. 2014, 2014.
[38] H. A. Kholidy, A. Erradi, and S. Abdelwahed, “Attack Prediction Models for Cloud Intrusion Detection Systems,” in Artificial Intelligence, Modelling and Simulation (AIMS), 2014 2nd International Conference on, Nov 2014, pp. 270–275.
[39] H. A. Kholidy, A. Erradi, S. Abdelwahed, and A. Azab, “A Finite State Hidden Markov Model for Predicting Multistage Attacks in Cloud Systems,” in Dependable, Autonomic and Secure Computing (DASC), 2014 IEEE 12th International Conference on, Aug 2014, pp. 14–19.
[40] H. A. Kholidy, A. M. Yousof, A. Erradi, S. Abdelwahed, and H. A. Ali, “A Finite Context Intrusion Prediction Model for Cloud Systems with a Probabilistic Suffix Tree,” in Modelling Symposium (EMS), 2014 European, Oct 2014, pp. 526–531.
[41] S. Abraham and S. Nair, “Exploitability analysis using predictive cybersecurity framework,” in 2015 IEEE 2nd International Conference on Cybernetics (CYBCONF), June 2015, pp. 317–323.
[42] A. Bar, B. Shapira, L. Rokach, and M. Unger, “Identifying Attack Propagation Patterns in Honeypots Using Markov Chains Modeling and Complex Networks Analysis,” in Software Science, Technology and Engineering (SWSTE), 2016 IEEE International Conference on. IEEE, 2016, pp. 28–36.
[43] ——, “Scalable attack propagation model and algorithms for honeypot systems,” in 2016 IEEE International Conference on Big Data (Big Data), Dec 2016, pp. 1130–1135.
[44] V. Lisý, R. Píbil, J. Stiborek, B. Bošanský, and M. Pěchouček, “Game-theoretic Approach to Adversarial Plan Recognition,” in ECAI, 2012, pp. 546–551.
[45] R. Píbil, V. Lisý, C. Kiekintveld, B. Bošanský, and M. Pěchouček, “Game theoretic model of strategic honeypot selection in computer networks,” in Decision and Game Theory for Security. Springer, 2012, pp. 201–220.
[46] C. Phillips and L. P. Swiler, “A graph-based system for network-vulnerability analysis,” in Proceedings of the 1998 Workshop on New Security Paradigms, ser. NSPW ’98. New York, NY, USA: ACM, 1998, pp. 71–79.
[47] O. Sheyner, J. Haines, S. Jha, R. Lippmann, and J. M. Wing, “Automated generation and analysis of attack graphs,” in Security and privacy, 2002. Proceedings. 2002 IEEE Symposium on. IEEE, 2002, pp. 273–284.
[48] H. Debar and A. Wespi, “Aggregation and correlation of intrusion-detection alerts,” in International Workshop on Recent Advances in Intrusion Detection. Springer, 2001, pp. 85–103.
[49] N. Polatidis and C. K. Georgiadis, “A multi-level collaborative filtering method that improves recommendations,” Expert Systems with Applications, vol. 48, pp. 100–110, 2016.
[50] N. Poolsappasit, R. Dewri, and I. Ray, “Dynamic Security Risk Management Using Bayesian Attack Graphs,” IEEE Transactions on Dependable and Secure Computing, vol. 9, no. 1, pp. 61–74, Jan 2012.
[51] P. Passeri. (2017) Hackmageddon Information Security Timelines and Statistics. [Online]. Available: http://www.hackmageddon.com/
[52] J. Nash, “Non-cooperative games,” Annals of mathematics, pp. 286–295, 1951.
[53] V. Conitzer and T. Sandholm, “Complexity Results About Nash Equilibria,” in Proceedings of the 18th International Joint Conference on Artificial Intelligence, ser. IJCAI’03. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2003, pp. 765–771.
[54] S. C. Kontogiannis and P. G. Spirakis, “Well supported approximate equilibria in bimatrix games,” Algorithmica, vol. 57, no. 4, pp. 653–667, 2010.
[55] H. Tsaknakis and P. Spirakis, “An optimization approach for approximate Nash equilibria,” Internet and Network Economics, pp. 42–56, 2007.
[56] H. Park, S.-O. D. Jung, H. Lee, and H. P. In, “Cyber Weather Forecasting: Forecasting Unknown Internet Worms Using Randomness Analysis,” in Information Security and Privacy Research. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 376–387.
[57] Z. Zhan, M. Xu, and S. Xu, “Characterizing Honeypot-Captured Cyber Attacks: Statistical Framework and Case Study,” IEEE Transactions on Information Forensics and Security, vol. 8, no. 11, pp. 1775–1789, Nov 2013.
[58] A. Silva, E. Pontes, F. Zhou, A. Guelf, and S. Kofuji, “PRBS/EWMA based model for predicting burst attacks (Brute Froce, DoS) in computer networks,” in Ninth International Conference on Digital Information Management (ICDIM 2014), Sept 2014, pp. 194–200.
[59] A. B. Abdullah, T. R. Pillai, and L. Z. Cai, “Intrusion detection forecasting using time series for improving cyber defence,” International Journal of Intelligent Systems and Applications in Engineering, vol. 3, no. 1, pp. 28–33, 2015.
[60] T. R. Pillai, S. Palaniappan, A. Abdullah, and H. M. Imran, “Predictive modeling for intrusions in communication systems using GARMA and ARMA models,” in 2015 5th National Symposium on Information Technology: Towards New Smart World (NSITNSW), Feb 2015.
[61] J. Freudiger, E. De Cristofaro, and A. E. Brito, Controlled Data Sharing for Collaborative Predictive Blacklisting. Cham: Springer International Publishing, 2015, pp. 327–349.
[62] Y.-Z. Chen, Z.-G. Huang, S. Xu, and Y.-C. Lai, “Spatiotemporal patterns and predictability of cyberattacks,” PLOS ONE, vol. 10, no. 5, pp. 1–19, May 2015.
[63] Z. Zhan, M. Xu, and S. Xu, “Predicting cyber attack rates with extreme values,” IEEE Transactions on Information Forensics and Security, vol. 10, no. 8, pp. 1666–1677, Aug 2015.
[64] P. Sokol and A. Gajdoš, Prediction of Attacks Against Honeynet Based on Time Series Modeling. Cham: Springer International Publishing, 2018, pp. 360–371.
[65] G. Werner, S. Yang, and K. McConky, “Time series forecasting of cyber attack intensity,” in Proceedings of the 12th Annual Conference on Cyber and Information Security Research, ser. CISRC ’17. New York, NY, USA: ACM, 2017, pp. 18:1–18:3.
[66] S. Dowling, M. Schukat, and H. Melvin, “Using analysis of temporal variances within a honeypot dataset to better predict attack type probability,” in 2017 12th International Conference for Internet Technology and Secured Transactions (ICITST), Dec 2017, pp. 349–354.
[67] A. Okutan, G. Werner, K. McConky, and S. J. Yang, “POSTER: Cyber Attack Prediction of Threats from Unconventional Resources (CAPTURE),” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, ser. CCS ’17. New York, NY, USA: ACM, 2017, pp. 2563–2565.
[68] Z. Lin, L. Xiujie, M. Jing, S. Wenchang, and W. Xiufang, “The prediction algorithm of network security situation based on grey correlation entropy Kalman filtering,” in Information Technology and Artificial Intelligence Conference (ITAIC), 2014 IEEE 7th Joint International, Dec 2014, pp. 321–324.
[69] Y.-B. Leau and S. Manickam, “A Novel Adaptive Grey Verhulst Model for Network Security Situation Prediction,” International Journal of Advanced Computer Science & Applications, vol. 1, no. 7, pp. 90–95, 2016.
[70] ——, “An Enhanced Adaptive Grey Verhulst Prediction Model for Network Security Situation,” International Journal of Computer Science and Network Security (IJCSNS), vol. 16, no. 5, p. 13, 2016.
[71] Y. Liu, A. Sarabi, J. Zhang, P. Naghizadeh, M. Karir, M. Bailey, and M. Liu, “Cloudy with a Chance of Breach: Forecasting Cyber Security Incidents,” in USENIX Security Symposium, 2015, pp. 1009–1024.
[72] D. Ju-Long, “Control problems of grey systems,” Systems & Control Letters, vol. 1, no. 5, pp. 288–294, 1982.
[73] F.-s. Zhang, F. Liu, W.-b. Zhao, Z.-a. Sun, and G.-y. Jiang, “Application of grey Verhulst model in middle and long term load forecasting,” Power System Technology, vol. 5, pp. 37–40, 2003.
[74] R. Zheng, D. Zhang, Q. Wu, M. Zhang, and C. Yang, “A strategy of network security situation autonomic awareness,” in Network Computing and Information Security. Springer, 2012, pp. 632–639.
[75] F. Chen, Y. Shen, G. Zhang, and X. Liu, “The network security situation predicting technology based on the small-world echo state network,” in Software Engineering and Service Science (ICSESS), 2013 4th IEEE International Conference on. IEEE, 2013, pp. 377–380.
[76] Y. Zhang, S. Jin, X. Cui, X. Yin, and Y. Pang, Network Security Situation Prediction Based on BP and RBF Neural Network. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013, pp. 659–665.
[77] W. Xing-zhu, “Network Intrusion Prediction Model based on RBF Features Classification,” International Journal of Security and Its Applications, vol. 10, no. 4, pp. 241–248, 2016.
[78] H. Zhang, Q. Huang, F. Li, and J. Zhu, “A network security situation prediction model based on wavelet neural network with optimized parameters,” Digital Communications and Networks, vol. 2, no. 3, pp. 139–144, 2016, Advances in Big Data.
[79] F. He, Y. Zhang, D. Liu, Y. Dong, C. Liu, and C. Wu, “Mixed Wavelet-Based Neural Network Model for Cyber Security Situation Prediction Using MODWT and Hurst Exponent Analysis,” in Network and System Security. Cham: Springer International Publishing, 2017, pp. 99–111.
[80] X. Cheng and S. Lang, “Research on network security situation assessment and prediction,” in Computational and Information Sciences (ICCIS), 2012 Fourth International Conference on. IEEE, 2012, pp. 864–867.
[81] G. K. Jayasinghe, J. S. Culpepper, and P. Bertok, “Efficient and effective realtime prediction of drive-by download attacks,” Journal of Network and Computer Applications, vol. 38, pp. 135–149, 2014.
[82] S. O. Uwagbole, W. J. Buchanan, and L. Fan, “Applied Machine Learning predictive analytics to SQL Injection Attack detection and prevention,” in 2017 IFIP/IEEE Symposium on Integrated Network and Service Management (IM), May 2017, pp. 1087–1090.
[83] ——, “An applied pattern-driven corpus to predictive analytics in mitigating SQL injection attack,” in 2017 Seventh International Conference on Emerging Security Technologies (EST), Sept 2017, pp. 12–17.
[84] C. Fachkha, E. Bou-Harb, A. Boukhtouta, S. Dinh, F. Iqbal, and M. Debbabi, “Investigating the dark cyberspace: Profiling, threat-based analysis and correlation,” in 2012 7th International Conference on Risks and Security of Internet and Systems (CRiSIS), Oct 2012.
[85] Y.-H. Kim and W. H. Park, “A study on cyber threat prediction based on intrusion detection event for APT attack detection,” Multimedia Tools and Applications, vol. 71, no. 2, pp. 685–698, Jul 2014.
[86] M. Husák and J. Kašpar, “Towards Predicting Cyber Attacks Using Information Exchange and Data Mining,” in Proceedings of 2018 International Wireless Communications and Mobile Computing Conference (IWCMC), 2018, (to appear).
[87] K. Soska and N. Christin, “Automatically detecting vulnerable websites before they turn malicious,” in USENIX Security Symposium, 2014, pp. 625–640.
[88] K. Veeramachaneni, I. Arnaldo, V. Korrapati, C. Bassias, and K. Li, “AI²: Training a Big Data Machine to Defend,” in 2016 IEEE 2nd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS), April 2016, pp. 49–54.
[89] A. L. Buczak and E. Guven, “A survey of data mining and machine learning methods for cyber security intrusion detection,” IEEE Communications Surveys & Tutorials, vol. 18, no. 2, pp. 1153–1176, Second quarter 2016.
[90] J.-B. Lai, H.-Q. Wang, X.-W. Liu, Y. Liang, R.-J. Zheng, and G.-S. Zhao, “WNN-based network security situation quantitative prediction method and its optimization,” Journal of Computer Science and Technology, vol. 23, no. 2, pp. 222–230, 2008.
[91] A. Jantan, M. Rasmi, M. I. Ibrahim, and A. H. A. Rahman, A Similarity Model to Estimate Attack Strategy Based on Intentions Analysis for Network Forensics. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012, pp. 336–346.
[92] M. Rasmi and A. Jantan, “A new algorithm to estimate the similarity between the intentions of the cyber crimes for network forensics,” Procedia Technology, vol. 11, pp. 540–547, 2013, 4th International Conference on Electrical Engineering and Informatics, ICEEI 2013.
[93] A. AlEroud and G. Karabatis, “Context Infusion in Semantic Link Networks to Detect Cyber-attacks: A Flow-Based Detection Approach,” in 2014 IEEE International Conference on Semantic Computing, June 2014, pp. 175–182.
[94] A. AlEroud and G. Karabatis, “Methods and techniques to identify security incidents using domain knowledge and contextual information,” in 2017 IFIP/IEEE Symposium on Integrated Network and Service Management (IM), May 2017, pp. 1040–1045.
[95] C.-B. Jiang, I. Liu, Y.-N. Chung, J.-S. Li et al., “Novel intrusion prediction mechanism based on honeypot log similarity,” International Journal of Network Management, 2016.
[96] A. AlEroud and I. Alsmadi, “Identifying cyber-attacks on software defined networks: An inference-based intrusion detection approach,” Journal of Network and Computer Applications, vol. 80, pp. 152–164, 2017.
[97] D. Kwon, J. W.-K. Hong, and H. Ju, “DDoS attack forecasting system architecture using Honeynet,” in Network Operations and Management Symposium (APNOMS), 2012 14th Asia-Pacific, Sept 2012, pp. 1–4.
[98] D. Kwon, H. Kim, D. An, and H. Ju, “DDoS Attack Volume Forecasting Using a Statistical Approach,” in TODO, 2017.
[99] C. Fachkha, E. Bou-Harb, and M. Debbabi, “Towards a Forecasting Model for Distributed Denial of Service Activities,” in Network Computing and Applications (NCA), 2013 12th IEEE International Symposium on, Aug 2013, pp. 110–117.
[100] A. Olabelurin, S. Veluru, A. Healing, and M. Rajarajan, “Entropy clustering approach for improving forecasting in DDoS attacks,” in Networking, Sensing and Control (ICNSC), 2015 IEEE 12th International Conference on, April 2015, pp. 315–320.
[101] G.-Y. Hu, Z.-J. Zhou, B.-C. Zhang, X.-J. Yin, Z. Gao, and Z.-G. Zhou, “A method for predicting the network security situation based on hidden BRB model and revised CMA-ES algorithm,” Applied Soft Computing, vol. 48, pp. 404–418, 2016.
[102] G. Y. Hu and P. L. Qiao, “Cloud Belief Rule Base Model for Network Security Situation Prediction,” IEEE Communications Letters, vol. 20, no. 5, pp. 914–917, May 2016.
[103] H. Wei, G. Hu, X. Han, P. Qiao, Z. Zhou, Z. Feng, and X. Yin, “A New BRB Model for Cloud Security-state Prediction based on the Large-scale Monitoring Data,” IEEE Access, 2017.
[104] D. Mahjoub and T. Mathew, “SPRank and IP Space Monitoring at BruCON & Hack.lu,” https://umbrella.cisco.com/blog/2015/11/19/sprank-and-ip-space-monitoring/, 2015.
[105] A. Dalton, B. Dorr, L. Liang, and K. Hollingshead, “Improving cyber-attack predictions through information foraging,” in 2017 IEEE International Conference on Big Data (Big Data), Dec 2017, pp. 4642–4647.
[106] M. Rasmi and A. Jantan, Attack Intention Analysis Model for Network Forensics. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011, pp. 403–411.
[107] H. Wei, G.-Y. Hu, Z.-J. Zhou, P.-L. Qiao, Z.-G. Zhou, and Y.-M. Zhang, “A new BRB model for security-state assessment of cloud computing based on the impact of external and internal environments,” Computers & Security, vol. 73, pp. 207–218, 2018.
[108] H. T. Elshoush and I. M. Osman, “Alert correlation in collaborative intelligent intrusion detection systems – a survey,” Applied Soft Computing, vol. 11, no. 7, pp. 4349–4365, 2011, Soft Computing for Information System Security.
[109] S. Suthaharan, “Big data classification: Problems and challenges in network intrusion prediction with machine learning,” SIGMETRICS Perform. Eval. Rev., vol. 41, no. 4, pp. 70–73, Apr. 2014.
[110] R. P. Lippmann, D. J. Fried, I. Graf, J. W. Haines, K. R. Kendall, D. McClung, D. Weber, S. E. Webster, D. Wyschogrod, R. K. Cunningham, and M. A. Zissman, “Evaluating intrusion detection systems: the 1998 DARPA off-line intrusion detection evaluation,” in DARPA Information Survivability Conference and Exposition, 2000. DISCEX ’00. Proceedings, vol. 2, 2000, pp. 12–26.
[111] MIT Lincoln Laboratory. DARPA Intrusion Detection Data Sets. [Online]. Available: https://www.ll.mit.edu/ideval/data/
[112] The UCI KDD Archive. (1999, Oct.) KDD Cup 1999 Data. [Online]. Available: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
[113] M. Mahoney and P. Chan, “An analysis of the 1999 DARPA/Lincoln Laboratory evaluation data for network anomaly detection,” in Recent Advances in Intrusion Detection. Springer, 2003, pp. 220–237.
[114] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed analysis of the KDD CUP 99 data set,” in 2009 IEEE Symposium on