Breaching Data Analytics Paper
Abstract—Analyzing cyber incident data sets is an important method for deepening our understanding of the evolution of the threat situation. This is a relatively new research topic, and many studies remain to be done. In this paper, we report a statistical analysis of a breach incident data set corresponding to 12 years (2005–2017) of cyber hacking activities that include malware attacks. We show that, in contrast to the findings reported in the literature, both hacking breach incident inter-arrival times and breach sizes should be modeled by stochastic processes, rather than by distributions, because they exhibit autocorrelations. Then, we propose particular stochastic process models to, respectively, fit the inter-arrival times and the breach sizes. We also show that these models can predict the inter-arrival times and the breach sizes. In order to get deeper insights into the evolution of hacking breach incidents, we conduct both qualitative and quantitative trend analyses on the data set. We draw a set of cybersecurity insights, including that the threat of cyber hacks is indeed getting worse in terms of their frequency, but not in terms of the magnitude of their damage.

Index Terms—Hacking breach, data breach, cyber threats, cyber risk analysis, breach prediction, trend analysis, time series, cybersecurity data analytics.

I. INTRODUCTION

DATA breaches are among the most devastating cyber incidents. The Privacy Rights Clearinghouse [1] reports 7,730 data breaches between 2005 and 2017, accounting for 9,919,228,821 breached records. The Identity Theft Resource Center and Cyber Scout [2] report 1,093 data breach incidents in 2016, which is 40% higher than the 780 data breach incidents in 2015. The United States Office of Personnel Management (OPM) [3] reports that the personnel information of 4.2 million current and former Federal government employees, and the background investigation records of current, former, and prospective Federal employees and contractors (including 21.5 million Social Security Numbers), were stolen in 2015. The monetary price incurred by data breaches is also substantial. IBM [4] reports that in year 2016, the global average cost for each lost or stolen record containing sensitive or confidential information was $158. NetDiligence [5] reports that in year 2016, the median number of breached records was 1,339, the median per-record cost was $39.82, the average breach cost was $665,000, and the median breach cost was $60,000.

While technological solutions can harden cyber systems against attacks, data breaches continue to be a big problem. This motivates us to characterize the evolution of data breach incidents. This will not only deepen our understanding of data breaches, but also shed light on other approaches for mitigating the damage, such as insurance. Many believe that insurance will be useful, but the development of accurate cyber risk metrics to guide the assignment of insurance rates is beyond the reach of the current understanding of data breaches (e.g., owing to the lack of modeling approaches) [6].

Recently, researchers started modeling data breach incidents. Maillart and Sornette [7] studied the statistical properties of the personal identity losses in the United States between year 2000 and 2008 [8]. They found that the number of breach incidents increased dramatically from 2000 to July 2006 but remained stable thereafter. Edwards et al. [9] analyzed a dataset containing 2,253 breach incidents that span over a decade (2005 to 2015) [1]. They found that neither the size nor the frequency of data breaches has increased over the years. Wheatley et al. [10] analyzed a dataset that is combined from [8] and [1] and corresponds to organizational breach incidents between year 2000 and 2015. They found that the frequency of large breach incidents (i.e., the ones that breach more than 50,000 records) occurring to US firms is independent of time, but the frequency of large breach incidents occurring to non-US firms exhibits an increasing trend.

The present study is motivated by several questions that have not been investigated until now, such as: Are data breaches caused by cyber attacks increasing, decreasing, or stabilizing? A principled answer to this question will give us a clear insight into the overall situation of cyber threats. This question was not answered by previous studies. Specifically, the dataset analyzed in [7] only covered the time span from 2000 to 2008 and does not necessarily contain the breach incidents that are caused by cyber attacks; the dataset analyzed in [9] is more recent, but contains two kinds of incidents: negligent breaches (i.e., incidents caused by lost, discarded, or stolen devices and other reasons) and malicious breaching. Since negligent breaches represent more human errors than cyber attacks, we do not consider them in the present study. Because the malicious breaches studied in [9] contain four sub-categories, namely hacking (including malware), insider, payment card fraud, and unknown, this study will focus on the hacking sub-category (called the hacking breach dataset hereafter), while noting that the other three sub-categories are interesting on their own and should be analyzed separately.

Manuscript received November 22, 2017; revised March 16, 2018 and April 23, 2018; accepted April 28, 2018. Date of publication May 16, 2018; date of current version May 23, 2018. This work was supported in part by ARL under Grant W911NF-17-2-0127. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Mauro Conti. (Corresponding author: Shouhuai Xu.)
M. Xu is with the Department of Mathematics, Illinois State University, Normal, IL 61761 USA.
K. M. Schweitzer and R. M. Bateman are with the U.S. Army Research Laboratory South (Cyber), San Antonio, TX 78284 USA.
S. Xu is with the Department of Computer Science, The University of Texas at San Antonio, San Antonio, TX 78249 USA (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIFS.2018.2834227
A. Our Contributions

In this paper, we make the following three contributions. First, we show that both the hacking breach incident inter-arrival times (reflecting incident frequency) and the breach sizes should be modeled by stochastic processes, rather than by distributions. We find that a particular point process can adequately describe the evolution of the hacking breach incidents inter-arrival times and that a particular ARMA-GARCH model can adequately describe the evolution of the hacking breach sizes, where ARMA is the acronym for "AutoRegressive and Moving Average" and GARCH is the acronym for "Generalized AutoRegressive Conditional Heteroskedasticity." We show that these stochastic process models can predict the inter-arrival times and the breach sizes. To the best of our knowledge, this is the first paper showing that stochastic processes, rather than distributions, should be used to model these cyber threat factors.

Second, we discover a positive dependence between the incidents inter-arrival times and the breach sizes, and show that this dependence can be adequately described by a particular copula. We also show that when predicting inter-arrival times and breach sizes, it is necessary to consider this dependence; otherwise, the prediction results are not accurate. To the best of our knowledge, this is the first work showing the existence of this dependence and the consequence of ignoring it.

Third, we conduct both qualitative and quantitative trend analyses of the cyber hacking breach incidents. We find that the situation is indeed getting worse in terms of the incidents inter-arrival times because hacking breach incidents become more and more frequent, but the situation is stabilizing in terms of the incident breach size, indicating that the damage of individual hacking breach incidents will not get much worse.

We hope the present study will inspire more investigations, which can offer deep insights into alternate risk mitigation approaches. Such insights are useful to insurance companies, government agencies, and regulators because they need to deeply understand the nature of data breach risks.

B. Related Work

1) Prior Works Closely Related to the Present Study: Maillart and Sornette [7] analyzed a dataset [8] of 956 personal identity loss incidents that occurred in the United States between year 2000 and 2008. They found that the personal identity losses per incident, denoted by X, can be modeled by a heavy-tail distribution Pr(X > n) ∼ n^(−α), where α = 0.7 ± 0.1. This result remains valid when dividing the dataset per type of organization: business, education, government, and medical institution. Because the probability density function of the identity losses per incident is static, the situation of identity loss is stable from the point of view of the breach size.

Edwards et al. [9] analyzed a different breach dataset [1] of 2,253 breach incidents that span over a decade (2005 to 2015). These breach incidents include two categories: negligent breaches (i.e., incidents caused by lost, discarded, or stolen devices, or other reasons) and malicious breaching (i.e., incidents caused by hacking, insider, and other reasons). They showed that the breach size can be modeled by the log-normal or log-skewnormal distribution and the breach frequency can be modeled by the negative binomial distribution, implying that neither the breach size nor the breach frequency has increased over the years.

Wheatley et al. [10] analyzed an organizational breach incidents dataset that is combined from [8] and [1] and spans over a decade (year 2000 to 2015). They used the Extreme Value Theory [11] to study the maximum breach size, and further modeled the large breach sizes by a doubly truncated Pareto distribution. They also used linear regression to study the frequency of the data breaches, and found that the frequency of large breaching incidents is independent of time for United States organizations, but shows an increasing trend for non-US organizations.

There are also studies on the dependence among cyber risks. Böhme and Kataria [12] studied the dependence between cyber risks at two levels: within a company (internal dependence) and across companies (global dependence). Herath and Herath [13] used the Archimedean copula to model cyber risks caused by virus incidents, and found that there exists some dependence between these risks. Mukhopadhyay et al. [14] used a copula-based Bayesian Belief Network to assess cyber vulnerability. Xu and Hua [15] investigated using copulas to model dependent cyber risks. Xu et al. [16] used copulas to investigate the dependence encountered when modeling the effectiveness of cyber defense early-warning. Peng et al. [17] investigated multivariate cybersecurity risks with dependence.

Compared with all of these studies, the present paper is unique in that it uses a new methodology to analyze a new perspective of breach incidents (i.e., cyber hacking breach incidents). This perspective is important because it reflects the consequence of cyber hacking (including malware). The new methodology found, for the first time, that both the incidents inter-arrival times and the breach sizes should be modeled by stochastic processes rather than distributions, and that there exists a positive dependence between them.

2) Other Prior Works Related to the Present Study: Eling and Loperfido [18] analyzed a dataset [1] from the point of view of actuarial modeling and pricing. Bagchi and Udo [19] used a variant of the Gompertz model to analyze the growth of computer and Internet-related crimes. Condon et al. [20] used the ARIMA model to predict security incidents based on a dataset provided by the Office of Information Technology at the University of Maryland. Zhan et al. [21] analyzed the posture of cyber threats by using a dataset collected at a network telescope. Using datasets collected at a honeypot, Zhan et al. [22], [23] exploited their statistical properties, including long-range dependence and extreme values, to describe and predict the number of attacks against the honeypot; a predictability evaluation of a related dataset is described in [24]. Peng et al. [25] used a marked point process to predict extreme attack rates. Bakdash et al. [26] extended these studies into related cybersecurity scenarios. Liu et al. [27] investigated how to use externally observable features of a network (e.g., mismanagement symptoms) to forecast the potential of data breach incidents to that network. Sen and Borle [28] studied the factors that could increase or decrease the contextual risk of data breaches, by using tools that include the opportunity theory of crime, the institutional anomie theory, and the institutional theory.
2858 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 13, NO. 11, NOVEMBER 2018
TABLE I
SUMMARY OF NOTATIONS (r.v. STANDS FOR RANDOM VARIABLE)
D. Remark

In this paper, we use a number of statistical techniques, a thorough review of which would be lengthy. In order to comply with the space requirement, we only briefly review these techniques at a high level, and refer the reader to specific references for each technique when it is used. We use the autoregressive conditional mean point process [30], [31], which was introduced for describing the evolution of conditional means, to model the evolution of the inter-arrival times. We use the ARMA-GARCH time series model [32], [33] to model the evolution of the breach sizes, where the ARMA part models the evolution of the mean of the breach sizes and the GARCH part models their high volatility. We use copulas [34], [35] to model the nonlinear dependence between the inter-arrival times and the breach sizes. Table I summarizes the main notations used in the paper.

skewness, which make it difficult to model the breach sizes. We observe a large volatility in the breach sizes and the volatility clustering phenomenon of large (small) changes followed by large (small) changes. We also observe that some breach sizes are especially large (meaning severe hacking breach incidents). We will pay particular attention to modeling these extreme breach incidents.
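The division of labor described above, with the ARMA part driving the conditional mean and the GARCH part driving the conditional volatility, can be illustrated by a short simulation. This is only a sketch: the parameter values below are made up for illustration and are not the estimates fitted to the breach data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Illustrative parameters (not the paper's fitted values).
phi, theta, mu = 0.5, 0.2, 0.0          # ARMA(1, 1): conditional mean
omega, alpha, beta = 0.1, 0.1, 0.8      # GARCH(1, 1): conditional variance

y = np.zeros(n)                                 # simulated series (e.g., log breach sizes)
eps = np.zeros(n)                               # innovations
sig2 = np.full(n, omega / (1 - alpha - beta))   # conditional variances

for t in range(1, n):
    # GARCH part: today's variance reacts to yesterday's squared shock and
    # yesterday's variance, which produces volatility clustering.
    sig2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sig2[t - 1]
    eps[t] = np.sqrt(sig2[t]) * rng.normal()
    # ARMA part: the conditional mean of the series.
    y[t] = mu + phi * y[t - 1] + theta * eps[t - 1] + eps[t]
```

Runs of large shocks follow earlier large shocks (the clustering phenomenon noted for the breach sizes), while the level of the series is governed by the ARMA recursion.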
Fig. 3. The sample ACF and PACF of incidents inter-arrival times. (a) ACF of inter-arrival times. (b) PACF of inter-arrival times.

Fig. 4. The sample ACF and PACF of log-transformed breach sizes. (a) ACF of transformed breach sizes. (b) PACF of transformed breach sizes.
the observations in between them, and PACF measures the correlation between the observations at earlier times and the observations at later times while disregarding the observations in between them. The formal definitions of ACF and PACF are given in Appendix A. ACF and PACF are widely used to detect temporal correlations in time series [36], [37].

Figure 3 plots the sample ACF and PACF, respectively. We observe correlations in both plots because there are correlation values that exceed the dashed blue lines (i.e., the threshold values, which are derived based on the asymptotic statistical theory [36], [38]). This means that there are significant correlations between the inter-arrival times and that the inter-arrival times do not follow the exponential distribution. Moreover, we should use a stochastic process to describe the inter-arrival times [39]. In summary, we have:

Insight 1: The hacking breach incidents inter-arrival times exhibit some clusters of small inter-arrival times (i.e., multiple incidents occur within a short period of time) and the incidents are irregularly spaced. Moreover, there are correlations between the inter-arrival times, meaning that the inter-arrival times should be modeled by an appropriate stochastic process rather than by a distribution.

B. Basic Analysis of Hacking Breach Sizes

TABLE III
STATISTICS OF HACKING BREACH SIZES, WHERE 'SD' STANDS FOR STANDARD DEVIATION

Table III summarizes the basic statistics of the hacking breach sizes. We observe that three Business categories have much larger mean breach sizes than the others. We further observe that there exists a large standard deviation for the breach size in each of the victim categories, and that the standard deviation is always much larger than the corresponding mean. Figure 2(b) plots the log-transformed breach sizes because, as we can observe from Table III, the breach sizes exhibit large volatility and skewness (indicated by the substantial difference between the median and the mean values), which make them hard to model without transformations.

In order to answer the question whether the breach sizes should be modeled by a distribution or by a stochastic process, we plot the temporal correlations between the breach sizes. Figures 4(a) and 4(b) plot the sample ACF and PACF for the log-transformed breach sizes, respectively. We observe correlations between the breach sizes, meaning that we should use a stochastic process, rather than a distribution, to model the breach sizes [33], [36]. This is in contrast to the insight offered by previous studies [7], [18], which suggest using a skewed distribution to model the breach sizes. We attribute this difference to the fact that these studies [7], [18] did not look into the perspective of temporal correlations. Whether a distribution or a stochastic process should be used to describe a quantity depends on whether or not there is temporal autocorrelation between the individual samples: zero temporal autocorrelation means that the samples are independent of each other, whereas non-zero temporal autocorrelation means that they are not independent of each other and should not be modeled by a distribution.

Insight 2: The hacking breach sizes exhibit a large volatility, a large skewness, and a volatility clustering phenomenon, namely large (small) changes followed by large (small) changes. Moreover, there are correlations between the breach sizes, implying that they should be modeled by an appropriate stochastic process rather than by a distribution.

IV. MODELING THE HACKING BREACH DATASET

In this section, we develop a novel statistical model to fit the breach dataset, or more specifically the in-sample of 320 incidents. The fitted model will be used for prediction, which will be evaluated on the out-of-sample of 280 incidents (Section V).

A. Modeling the Inter-Arrival Times

Insight 1 suggests that we model the hacking breach incidents inter-arrival times with an autoregressive conditional duration (ACD) model, which was originally introduced to model the evolution of the inter-arrival time, or duration, between
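The ACF-based diagnostic behind Insights 1 and 2 can be reproduced in a few lines. The two series below are synthetic stand-ins, not the breach data: memoryless exponential inter-arrival times stay inside the approximate white-noise band (the "dashed blue lines"), while an autocorrelated series escapes it.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation function at lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xm = x - x.mean()
    denom = np.sum(xm ** 2)
    return np.array([np.sum(xm[k:] * xm[:n - k]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(1)
n = 2000
band = 1.96 / np.sqrt(n)        # approximate 95% band for white noise

iid = rng.exponential(size=n)   # memoryless durations: ACF stays within the band

ar = np.zeros(n)                # autocorrelated series: ACF exceeds the band
for t in range(1, n):
    ar[t] = 0.8 * ar[t - 1] + rng.normal()
```

When the sample ACF at low lags exceeds the band, the samples are not independent, which is exactly the criterion used above to rule out a plain distributional model.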
XU et al.: MODELING AND PREDICTING CYBER HACKING BREACHES 2861
TABLE V
THE p-VALUES OF STATISTICAL TESTS FOR THE RESIDUALS
Fig. 7. The qq-plots of the residuals of ARMA(1, 1)-GARCH(1, 1) with innovations following different distributions for fitting the log-transformed breach sizes. (a) The qq-plot of the skewed Student-t. (b) The qq-plot of the mixed distribution.

Fig. 8. Normal score plot and fitted contour plot. (a) Normal scores plot. (b) Gumbel contour plot.

propose an extreme value mixture distribution for describing the innovations.

The Extreme Value Theory (EVT) [32], [45] is a useful tool for modeling heavy-tailed distributions. A popular method is known as the peaks over threshold (POT) approach. Given a sequence of i.i.d. observations X_1, . . . , X_n, the excesses X_i − μ over some suitably high threshold μ can be modeled, under certain mild conditions, by the generalized Pareto distribution (GPD). The survival function of the GPD is

    Ḡ_{ξ,σ,μ}(x) = 1 − G_{ξ,σ,μ}(x) = (1 + ξ(x − μ)/σ)_+^(−1/ξ),   if ξ ≠ 0,
    Ḡ_{ξ,σ,μ}(x) = 1 − G_{ξ,σ,μ}(x) = exp(−(x − μ)/σ),            if ξ = 0,

where x ≥ μ if ξ ∈ R_+ and x ∈ [μ, μ − σ/ξ] if ξ ∈ R_−, and ξ and σ are respectively called the shape and scale parameters.

Because Figure 7(a) shows that both tails cannot be modeled by the skewed Student-t distribution, we propose modeling both tails with the GPD and modeling the middle part with the normal distribution. This leads to a mixed extreme value distribution that is used to model the innovations as follows:

    G_m(x) = p_l [1 − G(−x | ξ_l, σ_l, −μ_l)],   if x ≤ μ_l,
    G_m(x) = p_l + (1 − p_l − p_u) [Φ(x | μ_m, σ_m) − Φ(μ_l | μ_m, σ_m)] / [Φ(μ_u | μ_m, σ_m) − Φ(μ_l | μ_m, σ_m)],   if μ_l < x < μ_u,
    G_m(x) = 1 − p_u + p_u G(x | ξ_u, σ_u, μ_u),   if x ≥ μ_u,

where p_l = P(X ≤ μ_l) and p_u = P(X > μ_u) are the probabilities corresponding to the tails, Φ(· | μ_m, σ_m) denotes the normal distribution function, and μ_m and σ_m are respectively the mean and the standard deviation of the normal distribution. It is worth mentioning that a similar idea has been used to model the impact of the financial crisis on stock and index returns [46], [47].

The estimated parameters for the tail proportions are (p_l, p_u) = (0.126, 0.098), which means that each of the two tails accounts for about 10% of the observations. The estimated parameters (μ̂_m, σ̂_m, μ̂_l, σ̂_l, ξ̂_l, μ̂_u, σ̂_u, ξ̂_u) for the normal and GPD distributions are (−0.002, 0.963, −1.105, 0.877, −0.694, 1.243, 0.471, 0.001). It is interesting to note that the upper tail shape parameter ξ̂_u = .001 indicates that the upper tail is heavy. The qq-plot in Figure 7(b) indicates that the mixed distribution describes the tails well because all of the points are around the 45-degree line. This leads to:

Insight 4: The log-transformed hacking breach sizes exhibit a significant temporal correlation, and therefore should be modeled by a stochastic process rather than a distribution. Moreover, the log-transformed hacking breach sizes exhibit the volatility clustering phenomenon with possibly extremely large breach sizes. These two properties lead to the development of ARMA(1, 1)-GARCH(1, 1) with innovations that follow a mixed extreme value distribution, which can adequately describe the evolution of the log-transformed breach sizes. Note that the ARMA(1, 1) part models the means of the observations and the GARCH(1, 1) part models the large volatility exhibited by the data.

C. Dependence Between Inter-Arrival Times and Breach Sizes

In order to answer the question whether or not there exists dependence between the inter-arrival times and the breach sizes, we propose applying the normal score transformation [35] to the residuals that are obtained after fitting these two time series. For the residuals of the LACD1 fitting, denoted by e_1, . . . , e_n, we use the fitted generalized gamma distribution G(· | γ, k) to convert them into empirical normal scores:

    e_i → Φ^(−1)(G(e_i | γ, k)),   i = 1, . . . , n,

where Φ^(−1) is the inverse of the standard normal distribution function. For the residuals of the ARMA(1, 1)-GARCH(1, 1) fitting, we use the estimated mixed extreme value distribution to convert them into empirical normal scores.

Figure 8(a) plots the bivariate normal scores. We observe that large transformed durations are associated with large transformed sizes, implying a positive dependence between the inter-arrival times and the breach sizes. In order to statistically test the dependence, we compute the sample Kendall's τ and Spearman's ρ for the incidents inter-arrival times and the breach sizes, which are 0.07578 and 0.11515, respectively. The nonparametric rank tests [43] for the two statistics lead to p-values of 0.04313 and 0.03956, respectively, which are very small. This means that there indeed exists some positive dependence between the inter-arrival times and the breach sizes.
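The peaks-over-threshold construction behind the mixed extreme value distribution can be sketched with scipy. The data below are a synthetic heavy-tailed stand-in (not the fitted residuals), and the 10%/90% thresholds are illustrative.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(7)
x = rng.standard_t(df=4, size=20000)   # synthetic heavy-tailed "innovations"

# Upper tail: model the excesses over a high threshold with the GPD.
mu_u = np.quantile(x, 0.90)
excess = x[x > mu_u] - mu_u
xi_u, _, sigma_u = genpareto.fit(excess, floc=0.0)  # shape and scale, loc fixed at 0

# Middle part: model the bulk between the two thresholds with a normal
# distribution; the lower tail would be handled symmetrically by fitting
# a GPD to -(x - mu_l) for x below mu_l.
mu_l = np.quantile(x, 0.10)
middle = x[(x > mu_l) & (x < mu_u)]
m_mean, m_sd = middle.mean(), middle.std()
```

Stitching the two GPD tails and the normal middle together with the weights p_l and p_u yields exactly the piecewise form of G_m(x) described in the text.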
1: for i = m + 1, . . . , n do
2:   Estimate the LACD1 model of the incidents inter-arrival times based on {d_s | s = 1, . . . , i − 1}, and predict the conditional mean ψ_i = exp(ω + a_1 log d_{i−1} + b_1 log ψ_{i−1});
3:   Estimate the ARMA-GARCH model of the log-transformed sizes, and predict the next mean μ̂_i and standard error σ̂_i;
4:   Select a suitable copula for the bivariate residuals from the previous models based on AIC;
5:   Based on the estimated copula, simulate 10000 2-dimensional copula samples (u_{1,i}^(k), u_{2,i}^(k)), k = 1, . . . , 10000;
6:   For the incidents inter-arrival times, convert the simulated dependent samples u_{1,i}^(k) into the z_{1,i}^(k) by using the inverse of the estimated generalized gamma distribution, k = 1, . . . , 10000;
7:   For the breach sizes, convert the simulated dependent samples u_{2,i}^(k) into the z_{2,i}^(k) by using the inverse of the estimated mixed extreme value distribution, k = 1, . . . , 10000;
8:   Compute the predicted 10000 2-dimensional breach data (d_i^(k), y_i^(k)), k = 1, . . . , 10000, based on Eq. (IV.1) and (IV.3), respectively;
9:   Compute the VaR_{α,d}(i) for the incidents inter-arrival times and the VaR_{α,y}(i) for the log-transformed breach sizes based on the simulated breach data;
10:  if d_i > VaR_{α,d}(i) then
11:    A violation to the incidents inter-arrival time occurs;
12:  end if
13:  if y_i > VaR_{α,y}(i) then
14:    A violation to the breach size occurs;
15:  end if
16: end for
Output: Numbers of violations in inter-arrival times and breach sizes.

values, we use the following three popular tests [54]. The first test is the unconditional coverage test, denoted by LRuc, which evaluates whether or not the observed fraction of violations is significantly different from the fraction expected under the model. The second test is the conditional coverage test, denoted by LRcc, which is a joint likelihood ratio test for the independence of violations and unconditional coverage. The third test is the dynamic quantile test (DQ) [55], which is based on the sequence of 'hit' variables.

B. Algorithm for Separate Prediction and Results

We use Algorithm 1 to perform the recursive rolling prediction for the inter-arrival times and the breach sizes. Because we use rolling prediction, the training data grows as the prediction operation moves forward, and the newly enlarged training data needs to be re-fitted, possibly requiring different copula models. As such, we need to consider more dependence structures. This explains why we need to re-select the copula structure, which can fit the newly updated training data better, via the AIC criterion (see Step 4 of Algorithm 1).

Table VIII reports the prediction results. We observe that the prediction models pass all of the tests at the .1 significance level. In particular, the models can predict the future inter-arrival times for all of the α levels. For the breach sizes, at level α = .90, the model predictions have 28 violations, while the number of violations from the observed values is 31, which is fairly close. For α = .95, the number of violations from the observed values is 20, while the model's expected number of violations is 14. This indicates that the models for predicting the future breach sizes are somewhat conservative.

Figure 9 plots the prediction results for the 280 out-of-samples. Figure 9(a) plots the prediction results for the incidents inter-arrival times. Figure 9(c) plots the original breach sizes, but it is hard to inspect visually. For a better visualization effect, we plot in Figure 9(b) the log-transformed breach sizes. We observe from Figure 9(c) that for the breach sizes, there are several extremely large values, which are far from the predicted VaR.95's. This means that the prediction missed some of the extremely large breaches, the prediction of which is left as an open problem.

In conclusion, the proposed models can effectively predict the VaR's of both the incidents inter-arrival times and the breach sizes, because they both pass the three statistical tests. However, there are several extremely large inter-arrival times and extremely large breach sizes that are far above the predicted VaR.95's, meaning that the proposed models may not be able to precisely predict the exact values of the extremely large inter-arrival times or the extremely large breach sizes. Nevertheless, as shown in Section V-C below, our models can predict the joint probabilities that an incident of a certain magnitude of breach size will occur during a future period of time.

C. Algorithm for Joint Prediction and Results

In practice, it is important to know the joint probability that the next breach incident of a particular size happens at a particular time (i.e., with a particular inter-arrival time). For this purpose, we consider the 10000 values predicted
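Steps 5-7 of Algorithm 1 can be sketched as follows. For illustration we assume a Gaussian copula and simple stand-in marginals; the paper itself re-selects the copula family by AIC at each step and inverts the fitted generalized gamma and mixed extreme value distributions.

```python
import numpy as np
from scipy.stats import norm, gamma, lognorm

rng = np.random.default_rng(0)
K = 10000
rho = 0.3  # assumed copula parameter (illustrative)

# Step 5 (sketch): K samples from a bivariate Gaussian copula.
cov = [[1.0, rho], [rho, 1.0]]
z = rng.multivariate_normal([0.0, 0.0], cov, size=K)
u = norm.cdf(z)                                   # dependent uniforms (u1, u2)

# Steps 6-7 (sketch): invert hypothetical fitted marginals.
d_sim = gamma.ppf(u[:, 0], a=1.2, scale=2.0)      # simulated inter-arrival times
y_sim = lognorm.ppf(u[:, 1], s=1.5, scale=1e4)    # simulated breach sizes

# Step 9 (sketch): VaR_alpha as the empirical alpha-quantile of the simulation.
var_d = np.quantile(d_sim, 0.95)
var_y = np.quantile(y_sim, 0.95)
```

A violation is then flagged whenever an observed value exceeds the corresponding simulated quantile, mirroring steps 10-15.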
Fig. 9. Predicted inter-arrival times and breach sizes, where black-colored circles represent the observed values. (a) Incidents inter-arrival times.
(b) Log-transformed breach sizes. (c) Breach sizes (prior to the transformation).
TABLE IX
PREDICTED JOINT PROBABILITIES OF INCIDENTS INTER-ARRIVAL TIMES AND BREACH SIZES, WHERE "PROB." IS THE PROBABILITY OF A CERTAIN PREDICTED BREACH SIZE y_t OCCURRING WITHIN THE NEXT TIME d_t ∈ (0, ∞)
by Algorithm 1. Specifically, we consider several combinations of (d_i, y_{t_i}), where d_i = t_i − t_{i−1} and y_{t_i} is the breach size at time t_i, for i = 1, . . . , n as mentioned above.

We divide the predicted inter-arrival time of the next breach incident into the following time intervals: (i) longer than one month, or d_t ∈ (30, ∞); (ii) between two weeks and one month, or d_t ∈ (14, 30]; (iii) between one and two weeks, or d_t ∈ (7, 14]; (iv) between one day and one week, or d_t ∈ (1, 7]; (v) within one day, or d_t ∈ (0, 1]. Similarly, we divide the predicted breach size of the next breach incident into the following size intervals: (i) greater than one million records, or y_t ∈ (1 × 10^6, ∞), indicating a large breach; (ii) y_t ∈ (5 × 10^5, 1 × 10^6]; (iii) y_t ∈ (1 × 10^5, 5 × 10^5]; (iv) y_t ∈ (5 × 10^4, 1 × 10^5]; (v) y_t ∈ (1 × 10^4, 5 × 10^4]; (vi) y_t ∈ (5 × 10^3, 1 × 10^4]; (vii) y_t ∈ (1 × 10^3, 5 × 10^3]; (viii) smaller than 1,000 records, or y_t ∈ [1, 1 × 10^3], indicating a small breach. We use the models mentioned above to fit these bivariate observations, and predict the joint events by using Algorithm 1 (steps 2-8).

Table IX describes the predicted probabilities of joint events (d_t, y_t) using the copula model, as well as the joint probabilities predicted by the benchmark model, which makes the independence assumption between the incidents inter-arrival times and the breach sizes. We observe that the copula model's probabilities are different from those of the benchmark model. For example, the probability of a data breach is .0460 for breach sizes exceeding one million records (i.e., severe breach incidents), namely y_t ∈ (1 × 10^6, ∞), while the probability based on the benchmark model is only .0339. Moreover, when we look at the joint event of inter-arrival time d_t ∈ (0, 7) and breach size y_t ∈ (1 × 10^6, ∞), the copula model predicts the probability as .0332, whereas the benchmark model predicts the probability as .0255. This means that the benchmark model underestimates the severity of data breach incidents.

We further observe that both models predict that there will be a breach incident occurring within a month, where the copula model predicts the probability of this incident as .9976, and the benchmark model predicts this probability as .9969. This indicates that almost certainly a data breach incident will happen within a month. Further, the copula model predicts a probability of .7783 that a breach incident will occur within a week, while the benchmark model predicts
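The interval probabilities reported in Table IX are, in effect, empirical frequencies over the simulated pairs. A minimal sketch, using synthetic stand-ins for the 10000 simulated (d, y) pairs rather than the fitted models:

```python
import numpy as np

rng = np.random.default_rng(2)
d_sim = rng.exponential(5.0, size=10000)                # stand-in inter-arrival times
y_sim = rng.lognormal(mean=9.0, sigma=2.5, size=10000)  # stand-in breach sizes

# Joint probability of a breach within a week that exceeds one million records.
p_joint = np.mean((d_sim <= 7) & (y_sim > 1e6))

# Marginal probability of some breach occurring within a month.
p_month = np.mean(d_sim <= 30)
```

Under an independence (benchmark) model the joint probability would instead be the product of the two marginal frequencies, which is the assumption the copula model corrects.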
TABLE X
QUANTITATIVE TREND ANALYSIS STATISTICS OF HACKING BREACH INCIDENTS, WHERE 'SD' STANDS FOR STANDARD DEVIATION
Fig. 11. The estimated VaR.9 ’s of the hacking breach incidents inter-arrival Fig. 12. Using the ARMA-GARCH model to decompose the log-transformed
times based on the LACD1 model. breach sizes into a trend part and a random part.
This finding is different from the conclusion drawn in [9], which was based on a superset of our dataset in terms of incident types (i.e., both negligent breaches and malicious breaches, as we will discuss in Section I-B); whereas the present study focuses on hacking breach incidents only (i.e., a proper sub-type of the malicious breaches analyzed in [9]).

2) Qualitative Trend Analysis of the Hacking Breach Sizes: In Section IV-B, we used the ARMA-GARCH model with innovations that follow the mixed extreme value distribution to describe the log-transformed breach sizes. Figure 12 plots the decomposition of the time series using this model. The trend is defined as

Y_t = μ + φ_1 Y_{t−1} + θ_1 ε_{t−1},

and the random part is defined as ε_t, which is modeled by the GARCH(1, 1) model described in Eq. (IV.4). We observe that although the breach sizes vary over time, there is no clear trend. This conclusion coincides with what was concluded in [9], which is drawn from, as mentioned above, a proper superset of the dataset we analyze.

B. Quantitative Trend Analysis

In order to quantify the trend, we propose using the following metrics to characterize the growth of hacking breach incidents. Recall that {(t_i, y_{t_i})}_{i=1,...,n} is the sequence of breach incidents occurring at time t_i with breach size y_{t_i}. Inspired by the growth rate analysis in economics [56], we propose:

• Growth Rate (GR): We define the breach-size GR as

GR_i = (y_{t_{i+1}} − y_{t_i}) / y_{t_i}.

The inter-arrival time GR can be defined similarly.

• Average Growth Rate over Time (AGRT): We define the AGRT as

AGRT_i = (1 / d_{i+1}) · (y_{t_{i+1}} − y_{t_i}) / y_{t_i}.

• Compound Growth Rate over Time (CGRT): We define the CGRT as

CGRT_i = (y_{t_{i+1}} / y_{t_i})^{1/d_{i+1}} − 1.

Note that AGRT represents the percentage change of the breach size over time, and CGRT describes the rate at which the breach size would grow.

Table X summarizes the results of the quantitative trend analysis. For the breach-size GR, we observe that the means of the GR are all positive, meaning that the breach size becomes increasingly larger each year. Note that the means of the GR
are largely affected by the extreme GR. For example, for year 2016, we have the maximum GR 411,999, which leads to a very large mean GR (i.e., 3,917.1173). In terms of the medians, we observe that from 2005 to 2008, the GRs are negative, meaning that the breach sizes decrease during these years. The negative GRs of breach sizes are also observed for years 2010, 2013 and 2014. For years 2015 and 2016, we observe positive GRs, 2.0172 and 0.2699, meaning that the breach size increases for these two years. For year 2017, we have a negative median GR (i.e., −0.3092) until April 7, 2017. It is worth mentioning that for years 2010, 2013, and 2016, we have very large standard deviations, which indicate that there exist extreme breach sizes during these years.

For the inter-arrival time GR, we observe that the median GR for each year is relatively small. In particular, we observe that the median is 0 for years 2007, 2009, 2016, and 2017, meaning that during these years, the breach inter-arrival times are relatively stable. We also observe that for years 2014 and 2015, the medians of the inter-arrival time GR are negative, meaning that the inter-arrival time decreases for these years. We also note that since year 2012 (except for year 2015), the standard deviations of the GRs of the inter-arrival time are relatively small (smaller than 3.6). We conclude that the hacking breach incidents' inter-arrival time decreases in recent years. This deepens the qualitative trend analysis in the previous section.

The AGRT and CGRT metrics consider both the breach size and the inter-arrival time. We observe that the means of the AGRT are all positive, meaning that the breach size increases on average. In terms of the median, we observe that the AGRTs of years 2013 and 2014 are negative. Compared to the GRs of these two years, we observe that the absolute values of the AGRTs are smaller, namely, 0.0318 and 0.0360 for the AGRTs versus 0.2633 and 0.2878 for the GRs, respectively. This can be explained by the evolution of the inter-arrival times. Based on AGRT, we conclude that although the breach size turns out to be smaller (negative growth) in years 2013 and 2014, it becomes larger (positive growth) in years 2015 and 2016, and becomes smaller at the beginning of year 2017. A similar conclusion can be drawn for the CGRT metric. The median value 0.0808 of CGRT in year 2016 can be interpreted as a median daily growth rate of 0.0808 for year 2016.

By summarizing the preceding qualitative and quantitative trend analysis, we draw:

Insight 7: The situation of hacking breach incidents is getting worse in terms of their frequency, but appears to be stabilizing in terms of their breach sizes, meaning that more devastating breach incidents are unlikely in the future.

VII. CONCLUSION

We analyzed a hacking breach dataset from the points of view of the incidents' inter-arrival time and the breach size, and showed that they both should be modeled by stochastic processes rather than distributions. The statistical models developed in this paper show satisfactory fitting and prediction accuracies. In particular, we propose using a copula-based approach to predict the joint probability that an incident with a certain magnitude of breach size will occur during a future period of time. Statistical tests show that the methodologies proposed in this paper are better than those which are presented in the literature, because the latter ignored both the temporal correlations and the dependence between the incidents' inter-arrival times and the breach sizes. We conducted qualitative and quantitative analyses to draw further insights. We drew a set of cybersecurity insights, including that the threat of cyber hacking breach incidents is indeed getting worse in terms of their frequency, but not in terms of the magnitude of their damage. The methodology presented in this paper can be adopted or adapted to analyze datasets of a similar nature.

There are many open problems left for future research. For example, it is both interesting and challenging to investigate how to predict the extremely large values and how to deal with missing data (i.e., breach incidents that are not reported). It is also worthwhile to estimate the exact occurrence times of breach incidents. Finally, more research needs to be conducted towards understanding the predictability of breach incidents (i.e., the upper bound of prediction accuracy [24]).

APPENDIX

A. ACF and PACF

ACF and PACF [36] are two important tools for examining temporal correlations. Consider a sequence of samples {Y_1, ..., Y_n}. The sample ACF is defined as

r_k = [ Σ_{t=k+1}^{n} (Y_t − Ȳ)(Y_{t−k} − Ȳ) ] / [ Σ_{t=1}^{n} (Y_t − Ȳ)^2 ],  k = 1, ..., n − 1,

where Ȳ = Σ_{t=1}^{n} Y_t / n is the sample mean. The PACF is defined as a conditional correlation of two variables given the information of the other variables. Specifically, the PACF of (Y_t, Y_{t−k}) is the autocorrelation between Y_t and Y_{t−k} after removing any linear dependence on Y_{t−1}, Y_{t−2}, ..., Y_{t−k+1}; see [36] for more details.

B. AIC and BIC

AIC and BIC are the most commonly used criteria for model selection in statistics [36], [37], [53]. AIC is meant to balance the goodness-of-fit and the penalty for model complexity (the smaller the AIC value, the better the model). Specifically,

AIC = −2 log(MLE) + 2k,

where MLE is the likelihood associated with the fitted model and measures the goodness-of-fit, and k is the number of estimated parameters and measures the model complexity. Similarly, the smaller the BIC value, the better the model. Specifically,

BIC = −2 log(MLE) + k log(n),

where n is the sample size. BIC penalizes complex models more heavily than AIC, thus favoring simpler models.

C. Ljung-Box and McLeod-Li Tests

The Ljung-Box test considers a group of ACFs of a time series [37], [57]. The null hypothesis is

H0: The time series are independent,

and the alternative is

Ha: The time series are not independent.
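The sample ACF and the Ljung-Box test described in the Appendix can be sketched as follows. This minimal implementation uses the common full-sample denominator for r_k, and the white-noise and AR(1) example series are made-up illustrations.

```python
import numpy as np
from scipy import stats

def sample_acf(y, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag."""
    y = np.asarray(y, dtype=float)
    yc = y - y.mean()
    denom = np.sum(yc ** 2)
    return np.array([np.sum(yc[k:] * yc[:-k]) / denom
                     for k in range(1, max_lag + 1)])

def ljung_box(y, k):
    """Ljung-Box Q statistic and chi-squared p-value for lags 1..k."""
    n = len(y)
    r = sample_acf(y, k)
    q = n * (n + 2) * np.sum(r ** 2 / (n - np.arange(1, k + 1)))
    p_value = stats.chi2.sf(q, df=k)  # reject H0 (independence) if p is small
    return q, p_value

rng = np.random.default_rng(0)
white = rng.standard_normal(500)          # independent noise: large p expected
q_w, p_w = ljung_box(white, k=10)

ar = np.zeros(500)                        # strongly autocorrelated AR(1) series
for i in range(1, 500):
    ar[i] = 0.8 * ar[i - 1] + rng.standard_normal()
q_a, p_a = ljung_box(ar, k=10)

print(f"white noise: Q={q_w:.2f}, p={p_w:.3f}; AR(1): Q={q_a:.2f}, p={p_a:.2e}")
```

The McLeod-Li test is obtained by applying the same `ljung_box` routine to the squared (and centered) series.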
2870 IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, VOL. 13, NO. 11, NOVEMBER 2018
The Ljung-Box test statistic is defined as

Q = n(n + 2) ( r̂_1^2 / (n − 1) + ··· + r̂_k^2 / (n − k) ),

where r̂_i is the estimated correlation coefficient at lag i. We reject the null hypothesis if Q > χ^2_{1−α,k}, where χ^2_{1−α,k} is the (1 − α)th quantile of the chi-squared distribution with k degrees of freedom.

The McLeod-Li test is similarly defined, but it tests whether the first m autocorrelations of the squared data are zero, using the Ljung-Box test [31], [57].

D. Goodness-of-Fit Test Statistics

The goodness-of-fit of a distribution describes how well the distribution fits a set of samples. Three commonly used test statistics are: the Kolmogorov-Smirnov (KS) test, the Anderson-Darling (AD) test, and the Cramér-von Mises (CM) test [58], [59]. Specifically, let X_1, ..., X_n be independent and identically distributed random variables with distribution F. The empirical distribution F_n is defined as

F_n(x) = (1/n) Σ_{i=1}^{n} I(X_i ≤ x),

where I(X_i ≤ x) is the indicator function: I(X_i ≤ x) = 1 if X_i ≤ x, and 0 otherwise.

The KS, CM, and AD test statistics are defined as:

KS = √n sup_x |F_n(x) − F(x)|,

CM = n ∫ (F_n(x) − F(x))^2 dF(x),

AD = n ∫ (F_n(x) − F(x))^2 w(x) dF(x),

where w(x) = [F(x)(1 − F(x))]^{−1}.

ACKNOWLEDGMENT

The authors thank the reviewers for their constructive comments that helped improve the paper. In Section V, they incorporated some insightful comments of one reviewer on how to connect the prediction models to real-world quantitative cyber defense risk management.

REFERENCES

[1] P. R. Clearinghouse. Privacy Rights Clearinghouse's Chronology of Data Breaches. Accessed: Nov. 2017. [Online]. Available: https://www.privacyrights.org/data-breaches
[2] ITR Center. Data Breaches Increase 40 Percent in 2016, Finds New Report From Identity Theft Resource Center and CyberScout. Accessed: Nov. 2017. [Online]. Available: http://www.idtheftcenter.org/2016databreaches.html
[3] C. R. Center. Cybersecurity Incidents. Accessed: Nov. 2017. [Online]. Available: https://www.opm.gov/cybersecurity/cybersecurity-incidents
[4] IBM Security. Accessed: Nov. 2017. [Online]. Available: https://www.ibm.com/security/data-breach/index.html
[5] NetDiligence. The 2016 Cyber Claims Study. Accessed: Nov. 2017. [Online]. Available: https://netdiligence.com/wp-content/uploads/2016/10/P02_NetDiligence-2016-Cyber-Claims-Study-ONLINE.pdf
[6] M. Eling and W. Schnell, "What do we know about cyber risk and cyber risk insurance?" J. Risk Finance, vol. 17, no. 5, pp. 474–491, 2016.
[7] T. Maillart and D. Sornette, "Heavy-tailed distribution of cyber-risks," Eur. Phys. J. B, vol. 75, no. 3, pp. 357–364, 2010.
[8] R. B. Security. Datalossdb. Accessed: Nov. 2017. [Online]. Available: https://blog.datalossdb.org
[9] B. Edwards, S. Hofmeyr, and S. Forrest, "Hype and heavy tails: A closer look at data breaches," J. Cybersecur., vol. 2, no. 1, pp. 3–14, 2016.
[10] S. Wheatley, T. Maillart, and D. Sornette, "The extreme risk of personal data breaches and the erosion of privacy," Eur. Phys. J. B, vol. 89, no. 1, p. 7, 2016.
[11] P. Embrechts, C. Klüppelberg, and T. Mikosch, Modelling Extremal Events: For Insurance and Finance, vol. 33. Berlin, Germany: Springer-Verlag, 2013.
[12] R. Böhme and G. Kataria, "Models and measures for correlation in cyber-insurance," in Proc. Workshop Econ. Inf. Secur. (WEIS), 2006, pp. 1–26.
[13] H. Herath and T. Herath, "Copula-based actuarial model for pricing cyber-insurance policies," Insurance Markets Companies: Anal. Actuarial Comput., vol. 2, no. 1, pp. 7–20, 2011.
[14] A. Mukhopadhyay, S. Chatterjee, D. Saha, A. Mahanti, and S. K. Sadhukhan, "Cyber-risk decision models: To insure it or not?" Decision Support Syst., vol. 56, pp. 11–26, Dec. 2013.
[15] M. Xu and L. Hua. (2017). Cybersecurity Insurance: Modeling and Pricing. [Online]. Available: https://www.soa.org/research-reports/2017/cybersecurity-insurance
[16] M. Xu, L. Hua, and S. Xu, "A vine copula model for predicting the effectiveness of cyber defense early-warning," Technometrics, vol. 59, no. 4, pp. 508–520, 2017.
[17] C. Peng, M. Xu, S. Xu, and T. Hu, "Modeling multivariate cybersecurity risks," J. Appl. Stat., pp. 1–23, 2018.
[18] M. Eling and N. Loperfido, "Data breaches: Goodness of fit, pricing, and risk measurement," Insurance, Math. Econ., vol. 75, pp. 126–136, Jul. 2017.
[19] K. K. Bagchi and G. Udo, "An analysis of the growth of computer and Internet security breaches," Commun. Assoc. Inf. Syst., vol. 12, no. 1, p. 46, 2003.
[20] E. Condon, A. He, and M. Cukier, "Analysis of computer security incident data using time series models," in Proc. 19th Int. Symp. Softw. Rel. Eng. (ISSRE), Nov. 2008, pp. 77–86.
[21] Z. Zhan, M. Xu, and S. Xu, "A characterization of cybersecurity posture from network telescope data," in Proc. 6th Int. Conf. Trusted Syst., 2014, pp. 105–126. [Online]. Available: http://www.cs.utsa.edu/~shxu/socs/intrust14.pdf
[22] Z. Zhan, M. Xu, and S. Xu, "Characterizing honeypot-captured cyber attacks: Statistical framework and case study," IEEE Trans. Inf. Forensics Security, vol. 8, no. 11, pp. 1775–1789, Nov. 2013.
[23] Z. Zhan, M. Xu, and S. Xu, "Predicting cyber attack rates with extreme values," IEEE Trans. Inf. Forensics Security, vol. 10, no. 8, pp. 1666–1677, Aug. 2015.
[24] Y.-Z. Chen, Z.-G. Huang, S. Xu, and Y.-C. Lai, "Spatiotemporal patterns and predictability of cyberattacks," PLoS ONE, vol. 10, no. 5, p. e0124472, 2015.
[25] C. Peng, M. Xu, S. Xu, and T. Hu, "Modeling and predicting extreme cyber attack rates via marked point processes," J. Appl. Stat., vol. 44, no. 14, pp. 2534–2563, 2017.
[26] J. Z. Bakdash et al. (2017). "Malware in the future? Forecasting analyst detection of cyber events." [Online]. Available: https://arxiv.org/abs/1707.03243
[27] Y. Liu et al., "Cloudy with a chance of breach: Forecasting cyber security incidents," in Proc. 24th USENIX Secur. Symp., Washington, DC, USA, 2015, pp. 1009–1024.
[28] R. Sen and S. Borle, "Estimating the contextual risk of data breach: An empirical approach," J. Manage. Inf. Syst., vol. 32, no. 2, pp. 314–341, 2015.
[29] F. Bisogni, H. Asghari, and M. Eeten, "Estimating the size of the iceberg from its tip," in Proc. Workshop Econ. Inf. Secur. (WEIS), La Jolla, CA, USA, 2017.
[30] R. F. Engle and J. R. Russell, "Autoregressive conditional duration: A new model for irregularly spaced transaction data," Econometrica, vol. 66, no. 5, pp. 1127–1162, 1998.
[31] N. Hautsch, Econometrics of Financial High-Frequency Data. Berlin, Germany: Springer-Verlag, 2011.
[32] P. Embrechts, C. Klüppelberg, and T. Mikosch, Modelling Extremal Events: For Insurance and Finance. Berlin, Germany: Springer, 1997.
[33] T. Bollerslev, J. Russell, and M. Watson, Volatility and Time Series Econometrics: Essays in Honor of Robert Engle. London, U.K.: Oxford Univ. Press, 2010.
[34] R. B. Nelsen, An Introduction to Copulas. New York, NY, USA: Springer-Verlag, 2007.
[35] H. Joe, Dependence Modeling With Copulas. Boca Raton, FL, USA: CRC Press, 2014.
[36] J. D. Cryer and K.-S. Chan, Time Series Analysis With Applications in R. New York, NY, USA: Springer, 2008.
[37] P. J. Brockwell and R. A. Davis, Introduction to Time Series and Forecasting. New York, NY, USA: Springer-Verlag, 2002.
[38] P. J. Brockwell and R. A. Davis, Introduction to Time Series and Forecasting. New York, NY, USA: Springer-Verlag, 2016.
[39] D. J. Daley and D. Vere-Jones, An Introduction to the Theory of Point Processes, vol. 1, 2nd ed. New York, NY, USA: Springer-Verlag, 2002.
[40] M. Y. Zhang, J. R. Russell, and R. S. Tsay, "A nonlinear autoregressive conditional duration model with applications to financial transaction data," J. Econ., vol. 104, no. 1, pp. 179–207, 2001.
[41] L. Bauwens and P. Giot, "The logarithmic ACD model: An application to the bid-ask quote process of three NYSE stocks," Ann. Économie Stat., no. 60, pp. 117–149, Oct./Dec. 2000.
[42] L. Bauwens, P. Giot, J. Grammig, and D. Veredas, "A comparison of financial duration models via density forecasts," Int. J. Forecasting, vol. 20, no. 4, pp. 589–609, 2004.
[43] G. W. Corder and D. I. Foreman, Nonparametric Statistics: A Step-by-Step Approach. Hoboken, NJ, USA: Wiley, 2014.
[44] P. R. Hansen and A. Lunde, "A forecast comparison of volatility models: Does anything beat a GARCH(1, 1)?" J. Appl. Econ., vol. 20, no. 7, pp. 873–889, 2005.
[45] S. I. Resnick, Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. New York, NY, USA: Springer-Verlag, 2007.
[46] X. Zhao, C. Scarrott, L. Oxley, and M. Reale, "Extreme value modelling for forecasting market crisis impacts," Appl. Financial Econ., vol. 20, nos. 1–2, pp. 63–72, 2010.
[47] C. Scarrott, "Univariate extreme value mixture modeling," in Extreme Value Modeling and Risk Analysis: Methods and Applications, J. Yan and D. K. Dey, Eds. London, U.K.: Chapman & Hall, 2016, pp. 41–67.
[48] H. Joe, Multivariate Models and Dependence Concepts (Monographs on Statistics and Applied Probability), vol. 73. London, U.K.: Chapman & Hall, 1997.
[49] H. White, "Maximum likelihood estimation of misspecified models," Econometrica, vol. 50, no. 1, pp. 1–25, 1982.
[50] W. Huang and A. Prokhorov, "A goodness-of-fit test for copulas," Econ. Rev., vol. 33, no. 7, pp. 751–771, 2014.
[51] W. Wang and M. T. Wells, "Model selection and semiparametric inference for bivariate failure-time data," J. Amer. Statist. Assoc., vol. 95, no. 449, pp. 62–72, 2000.
[52] C. Genest, J.-F. Quessy, and B. Rémillard, "Goodness-of-fit procedures for copula models based on the probability integral transformation," Scandin. J. Stat., vol. 33, no. 2, pp. 337–366, 2006.
[53] A. McNeil, R. Frey, and P. Embrechts, Quantitative Risk Management: Concepts, Techniques, and Tools. Princeton, NJ, USA: Princeton Univ. Press, 2010.
[54] P. F. Christoffersen, "Evaluating interval forecasts," Int. Econ. Rev., vol. 39, no. 4, pp. 841–862, 1998.
[55] R. F. Engle and S. Manganelli, "CAViaR: Conditional autoregressive value at risk by regression quantiles," J. Bus. Econ. Stat., vol. 22, no. 4, pp. 367–381, 2004.
[56] P. M. Romer, "Increasing returns and long-run growth," J. Political Econ., vol. 94, no. 5, pp. 1002–1037, 1986.
[57] G. M. Ljung and G. E. P. Box, "On a measure of lack of fit in time series models," Biometrika, vol. 65, no. 2, pp. 297–303, 1978.
[58] G. R. Shorack and J. A. Wellner, Empirical Processes With Applications to Statistics. Philadelphia, PA, USA: SIAM, 1986.
[59] M. A. Stephens, "Tests based on EDF statistics," in Goodness-of-Fit Techniques, R. B. d'Agostino and M. A. Stephens, Eds. New York, NY, USA: Marcel Dekker, 1986, pp. 97–193.

Maochao Xu received the Ph.D. degree in statistics from Portland State University in 2010. He is currently an Associate Professor of mathematics with Illinois State University. His research interests include statistical modeling, cyber risk analysis, and ensuring cyber security. He also serves as an Associate Editor for Communications in Statistics.

Kristin M. Schweitzer is a Mechanical Engineer with the U.S. Army Research Laboratory (ARL), Cyber and Networked Systems Branch. Her current role is to conduct and coordinate use-inspired basic research in cyber security for the ARL South office located at the University of Texas at San Antonio. Previously for ARL, she provided Human Systems Integration analyses for U.S. Army, Marine Corps, Air Force, and Department of Homeland Security systems. She also conducted research on human performance in uncontrolled environments.

Raymond M. Bateman received the Ph.D. degree in mathematical and computer sciences (operations research) from the Colorado School of Mines. He retired as a Lieutenant Colonel from the U.S. Army Special Forces with 20 years of enlisted and officer service. He conducted research for significant and relevant issues affecting the U.S. Army Medical Department Center and School, Health Readiness Center of Excellence by applying human systems integration (HSI) and operations research techniques. He currently serves as the Army Research Laboratory (ARL) South Lead for cybersecurity for use-inspired basic research at The University of Texas, San Antonio. His projects included serving as the Non-Medical Operations Research Systems Analyst and HSI Expert for the Medical Command Root-Cause Analysis Event Support and the Engagement Team that investigates sentinel events that result in permanent harm or death. He has two deployments to Iraq as the Army Civilian Science Advisor to Commander III Corps and Army Materiel Command.

Shouhuai Xu received the Ph.D. degree in computer science from Fudan University. He is currently a Full Professor with the Department of Computer Science, The University of Texas at San Antonio. He is also the Founding Director of the Laboratory for Cybersecurity Dynamics. He pioneered the Cybersecurity Dynamics framework for modeling and analyzing cybersecurity from a holistic perspective. He is interested in both theoretical modeling and analysis of cybersecurity and devising practical cyber defense solutions. He co-initiated the International Conference on Science of Cyber Security (SciSec) in 2018 and the ACM Scalable Trusted Computing Workshop. He is/was a Program Committee Co-Chair of SciSec'18, ICICS'18, NSS'15, and Inscrypt'13. He was/is an Associate Editor of IEEE TDSC, IEEE T-IFS, and IEEE TNSE.