Limiting The Impact of Stealthy Attacks On Industrial Control Systems
models outperform output-only autoregressive models, (d)
time and space correlated models outperform models that
do not exploit these correlations, and (e) from the point of
view of an attacker, launching undetected actuator attacks is
more difficult than launching undetected false-data injection
attacks on sensor values.
The remainder of this paper is organized as follows: in § 2 we define the scope of the paper and provide the background needed to analyze previous proposals. We introduce our attacker model and the need for new metrics in § 3. We introduce a way to evaluate the impact of undetected attacks and of attack-detection systems in § 4, and then we use this adversary model and metric to evaluate the performance of these systems in physical testbeds, real-world systems, and simulations in § 5.

Figure 1: Different attack points in a control system: (1) attack on the actuators (blue): vk ≠ uk; (2) attack on the sensors (purple): yk ≠ zk; (3) attack on the controller (red): uk ≠ K(yk).
2. BACKGROUND AND TAXONOMY

Scope of Our Study. We focus on using real-time measurements of the physical world to build indicators of attacks. In particular, we look at the physics of the process under control, but our approach can be extended to the physics of devices as well [18]. Our work is motivated by false sensor measurements [35, 58] or false control signals like manipulating vehicle platoons [19], manipulating demand-response systems [58], and the sabotage Stuxnet created by manipulating the rotation frequency of centrifuges [17, 32]. The question we are trying to address is how to detect these false sensor or false control attacks in real-time.

2.1 Background

A general feedback control system has four components: (1) the physical phenomenon of interest (sometimes called the process or plant), (2) sensors that send a time series yk denoting the value of the physical measurement zk at time k (e.g., the voltage at 3am is 120kV) to a controller, (3) a controller that, based on the sensor measurements yk it receives, sends control commands uk = K(yk) (e.g., open a valve by 10%) to actuators, and (4) actuators that produce a physical change vk in response to the control command (the actuator is the device that opens the valve).

A general security monitoring architecture for control systems that looks into the "physics" of the system needs an anomaly detection system that receives as inputs the sensor measurements yk from the physical system and the control commands uk sent to the physical system, and then uses them to identify any suspicious sensor or control commands, as shown in Fig. 1.

2.2 Taxonomy

Anomaly detection is usually performed in two steps. First we need a model of the physical system that predicts the output of the system ŷk. The second step compares that prediction ŷk to the observations yk and then performs a statistical test on the difference. The difference between prediction and observation is usually called the residual rk. We now present our new taxonomy for related work, based on four aspects: (1) physical model, (2) detection statistic, (3) metrics, and (4) validation.

Physical Model. The model of how a physical system behaves can be developed from physical equations (Newton's laws, fluid dynamics, or electromagnetic laws), or it can be learned from observations through a technique called system identification [4, 38]. In system identification one often has to use either Auto-Regressive Moving Average with eXogenous inputs (ARMAX) or linear state-space models. Two popular models used by the papers we survey are Auto-Regressive (AR) models and Linear Dynamical State-space (LDS) models.

An AR model for a time series yk is given by

    ŷ_{k+1} = Σ_{i=k−N}^{k} α_i y_i + α_0        (1)

where the α_i are obtained through system identification and the y_i are the last N sensor measurements. The coefficients α_i can be obtained by solving an optimization problem that minimizes the residual error (e.g., least squares) [37].

If we have both inputs (control commands uk) and outputs (sensor measurements yk) available, we can use subspace model identification methods, producing LDS models:

    x_{k+1} = A x_k + B u_k + ε_k
    y_k = C x_k + D u_k + e_k        (2)

where A, B, C, and D are matrices modeling the dynamics of the physical system. Most physical systems are strictly causal, and therefore D = 0 in general. The control commands u_k ∈ R^p affect the next time step of the state of the system x_k ∈ R^n, and the sensor measurements y_k ∈ R^q are modeled as a linear combination of these hidden states. e_k and ε_k are sensor and perturbation noise, and are assumed to be random processes with zero mean. To make a prediction, we i) first use y_k and u_k to obtain a state estimate x̂_{k+1}, and ii) use the estimate to predict ŷ_{k+1} = C x̂_{k+1}. A large body of work on power systems employs the second equation from Eq. (2) without the dynamic state equation. We refer to this special case of LDS used in power systems as Static Linear State-space (SLS) models.

Detection Statistic. If the observations we get from the sensors yk are significantly different from the ones we expect (i.e., if the residual is large), we generate an alert. A stateless test raises an alarm for every deviation at time k, i.e., if |y_k − ŷ_k| = r_k ≥ τ, where τ is a threshold.

In a stateful test we compute an additional statistic S_k that keeps track of the historical changes of r_k (no matter how small) and generate an alert if S_k ≥ τ, i.e., if there is a persistent deviation across multiple time-steps. There are many tests that can keep track of the historical behavior of the residual r_k, such as taking an average over a time-window, an exponentially weighted moving average (EWMA), or change detection statistics such as the non-parametric CUmulative SUM (CUSUM) statistic. The non-parametric CUSUM statistic is defined recursively as S_0 = 0 and S_{k+1} = (S_k + |r_k| − b)^+, where (x)^+ denotes max(0, x) and the bias b is selected so that the expected value of |r_k| − b is negative under hypothesis H_0 (i.e., b prevents S_k from increasing consistently under normal operation). An alert is generated whenever the statistic is greater than a previously defined threshold, S_k > τ, and the test is then restarted with S_{k+1} = 0. The summary of our taxonomy for modeling the system and detecting an anomaly in the residuals is given in Fig. 2.

Figure 2: The detection block from Fig. 1 focusing on our taxonomy: residual generation (r_k = y_k − ŷ_k, with ŷ_k produced by an LDS or AR physical model from y_k and u_k), followed by anomaly detection (a stateless or stateful test that raises alerts).

…other, and this makes it difficult to build upon previous work (it is impossible to identify best practices without a way to compare different proposals). To address this problem we propose a general-purpose evaluation metric in § 4 that leverages our stealthy adversary model, and we then compare previously proposed methods. Our results show that while stateless tests are more popular in the literature, stateful tests are better at limiting the impact of stealthy attackers. In addition, we show that LDS models are better than AR models, that AR models proposed in previous work can be improved by leveraging correlation among different signals, and that having an integral controller can limit the impact of stealthy actuation attacks.

To address point (4) we conduct experiments using all three options: a testbed with a real physical process under control (§ 5.1), real-world data (§ 5.2), and simulations (§ 5.3). We show the advantages and disadvantages of each experimental setup, and the insights each of these experiments provides.
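To make the AR workflow of Eq. (1) concrete, here is a minimal sketch (our own illustrative code, not from the paper) that fits the coefficients by least squares and produces the one-step prediction ŷ_{k+1}:

```python
import numpy as np

def fit_ar(y, N):
    """Least-squares fit of the AR coefficients in Eq. (1):
    yhat[k+1] = alpha_1*y[k-N+1] + ... + alpha_N*y[k] + alpha_0."""
    # Each regression row holds the N previous measurements plus a
    # constant 1 for the intercept alpha_0.
    X = np.array([np.r_[y[i - N:i], 1.0] for i in range(N, len(y))])
    alpha, *_ = np.linalg.lstsq(X, y[N:], rcond=None)
    return alpha                     # N lag weights, then the intercept

def ar_predict(window, alpha):
    """One-step prediction from the last N measurements."""
    return float(np.r_[window, 1.0] @ alpha)

# A noiseless ramp y_k = k is matched exactly by an AR(2) model,
# so the prediction after y = 0..99 should be (close to) 100.
y = np.arange(100, dtype=float)
alpha = fit_ar(y, N=2)
print(ar_predict(y[-2:], alpha))
```

The residual r_k = y_k − ŷ_k from such a model is exactly what the stateless and stateful (CUSUM) tests described in this section consume.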
Table 1: Taxonomy of related work. Columns are organized by publication venue (Control, Smart/Power Grid, Security, Misc.); example entries include Mo et al. [45], Do et al. [15], Bai et al. [7], and Smith [56]. For each surveyed paper the rows indicate the detection statistic (stateless or stateful), the physical model (AR, SLS, LDS, or other), the metrics used (impact, statistic, TPR, FPR), and the validation method (simulation, real data, or testbed). [Per-paper table entries not recoverable from this extraction.]
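Since several rows of Table 1 hinge on the difference between AR and LDS models, a minimal sketch of the LDS prediction step (Eq. (2)) may help; the scalar A, B, C below are hypothetical values of our own choosing (D = 0, noise-free):

```python
import numpy as np

# Hypothetical one-dimensional system; A, B, C are illustrative only,
# not identified from any real plant.
A = np.array([[0.9]])
B = np.array([[0.1]])
C = np.array([[2.0]])

def lds_predict(x_hat, u):
    """x̂_{k+1} = A x̂_k + B u_k, then ŷ_{k+1} = C x̂_{k+1} (Eq. (2), D = 0)."""
    x_next = A @ x_hat + B @ u       # state prediction
    return x_next, C @ x_next        # predicted sensor value

x_hat = np.array([1.0])              # current state estimate
u = np.array([0.5])                  # current control command
x_next, y_pred = lds_predict(x_hat, u)
print(y_pred)                        # ŷ_{k+1}
```

In practice x̂_k would be maintained by an observer (e.g., a Kalman filter) driven by both y_k and u_k; the residual y_{k+1} − ŷ_{k+1} then feeds the detection statistic.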
The wide oscillations of the pH levels occur because there is a delay between the control actions of the HCl pump and the water pH responding to them.

Figure 4: During normal operation, the water pH is kept at safe levels.

…by injecting a malicious device in the EtherNet/IP ring of the testbed, given that the implementation of this protocol is unauthenticated. A detailed implementation of our attack is given in our previous work [64]. In particular, our MitM intercepts sensor values coming from the HCl pump and the pH sensor, and intercepts actuator commands going to…

[Figure residue: real water pH vs. compromised pH during the sensor attack.]
4. A STRONGER ADVERSARY MODEL
We assume an attacker that has compromised a sensor (e.g., the pH level in our motivating example) or an actuator (e.g., the pump in our motivating example) in our system. We also assume that the adversary has complete knowledge of our system, i.e., she knows the physical model we use, the statistical test we use, and the thresholds we select to raise alerts. Given this knowledge, she generates a stealthy attack, where the detection statistic will always remain below the selected threshold.

While similar stealthy attacks have been previously proposed [13, 35, 36], in this paper we extend them to generic control systems including process perturbations and measurement noise, we force the attacks to remain stealthy against stateful tests, and we also force the adversary to optimize the negative impact of the attack. In addition, we assume our adversary is adaptive, so if we lower the threshold to fire an alert, the attacker will also change the attack so that the anomaly detection statistic remains below the threshold. This last property is illustrated in Fig. 7.

Figure 6: Attack on the pump actuator (real water pH vs. compromised HCl pump state, and the stateful vs. stateless detection metrics with the alarm threshold).
…tain a probability of false alarm of 0.01). We also launched an attack on the pump (actuator). Here the pump ignores Off control commands from the PLC and sends back messages stating that it is indeed Off, while in reality it is On. As illustrated in Fig. 6, only the stateful test detects this attack. We also launched several random attacks that were easily detected by the stateful statistic, and if we were to plot the ROC curve of these attacks, we would get a 100% detection rate.

Observations. As we can see, it is very easy to create attacks that can be detected. From these simulations we could initially conclude that our LDS model combined with stateful anomaly detection is good enough; after all, it detected all the attacks we launched. However, are these attacks enough to conclude that our LDS model is good enough? And if these attacks are not enough, then which types of attacks should we launch?

Notice that for any physical system, a sophisticated attacker can spoof deviations that follow relatively closely the "physics" of the system while still driving the system to a different state. How can we measure the performance of our anomaly detection algorithm against these attacks? How can we measure the effectiveness of our anomaly detection tool if we assume that the attacker will always adapt to our algorithms and launch an undetected attack? And if our algorithms are not good enough, how can we design better algorithms? If by definition the attack is undetected, then we will always have a 0% true positive rate; therefore we need to devise new metrics to evaluate our systems.

Figure 7: Our attacker adapts to different detection thresholds: if we select τ2, the adversary launches an attack such that the detection statistic (dotted blue) remains below τ2. If we lower our threshold to τ1, the adversary selects a new attack such that the detection statistic (solid red) remains below τ1.

Notice that this type of adaptive behavior is different from how traditional metrics such as ROC curves work, because they use the same attacks for different thresholds of the anomaly detector. Our adversary model, on the other hand, requires a new and unique (undetected) attack specifically tailored for every anomaly detection threshold. If we try to compute an ROC curve under our adversary model, we would get a 0% detection rate, because the attacker would generate a new undetected attack for every anomaly detection threshold.

This problem is not unique to ROC curves: most popular metrics for evaluating the classification accuracy of intrusion detection systems (like the intrusion detection capability, the Bayesian detection rate, accuracy, expected cost, etc.) are known to be a multi-criteria optimization problem between two fundamental trade-off properties: the false alarm rate and the true positive rate [11]. As we have argued, using any metric that requires a true positive rate will be ineffective against our adversary model launching undetected attacks.

Observation. Most intrusion detection metrics are variations of the fundamental trade-off between false alarms and true positive rates [11]; however, our adversary by definition will never be detected, so we cannot use true positive rates (or variations thereof). Notice, however, that by forcing our adversary to remain undetected, we are effectively forcing her to launch attacks that follow closely the physical behavior of the system (more precisely, we are forcing our attacker to follow more closely our Physical Model), and by following the behavior of the system more closely, the attack impact is reduced: the attack needs to appear to be a plausible physical system behavior. So the trade-off we are looking for with this new adversary model is not one of false positives vs. true positives, but one between false positives and the impact of undetected attacks.

New Metric. To define precisely what we mean by the impact of an undetected attack, we select one (or more) variables of interest (usually a variable whose compromise can affect the safety of the system) in the process we want to control, e.g., the pH level in our motivating example. The impact of the undetected attack will then be how much the attacker can drive that value towards its intended goal (e.g., how much the attacker can lower the pH level while remaining undetected) per unit of time. Therefore we propose a new metric consisting of the trade-off…
[Figure: the usability metric is the expected time between false alarms, E[Tfa]; a longer time between false alarms means a more usable system.]

…attack y^a_k is a new vector where some (or all) of the sensor [values are compromised] … satisfies the equation y^{a*}_{k+1} = arg max_{y^a_{k+1}} |y^a_{k+1} − ŷ_{k+1}|; the greedy attack for a stateless test is therefore y^{a*}_{k+1} = ŷ_{k+1} ± τ. The greedy optimization problem for an attacker facing a stateful CUSUM test becomes y^{a*}_{k+1} = max{ y^a_{k+1} : S_{k+1} ≤ τ }. Because S_{k+1} = (S_k + r_k)^+, the optimal attack is…
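These greedy strategies can be sketched as follows; the threshold τ and the CUSUM bias b are illustrative numbers of our own, and the residual is written r = y^a − ŷ:

```python
TAU = 1.0     # detection threshold (illustrative)
BIAS = 0.1    # CUSUM bias b (illustrative)

def stateless_greedy(y_hat):
    """Stateless test alarms when |y - ŷ| >= τ, so the largest
    undetected value sits at the boundary: y^a = ŷ ± τ."""
    return y_hat + TAU               # pushing the signal upward

def cusum_greedy(y_hat, S):
    """Stateful test: S_{k+1} = max(0, S_k + r_k - b) must stay <= τ.
    The greedy attacker picks the largest residual satisfying that."""
    r = TAU - S + BIAS               # solves S + r - b = τ
    S_next = max(0.0, S + r - BIAS)  # the statistic saturates at τ
    return y_hat + r, S_next

# Against CUSUM, only the first step gains τ + b; afterwards the
# statistic is pinned at τ and each step gains only b, whereas the
# stateless attacker keeps gaining τ at every single step.
S, y_hat, gains = 0.0, 0.0, []
for _ in range(3):
    y_a, S = cusum_greedy(y_hat, S)
    gains.append(y_a - y_hat)
print(gains)
```

This is the mechanism behind the results later in the paper: once the CUSUM statistic saturates, larger thresholds stop helping the attacker, while the stateless bound grows linearly with time.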
Algorithm 1: Computing the Y axis
1: Define f(y^a_{k+1})
2: Select τ_set = {τ1, τ2, …} and K_set = {…, kf − 1}
3: For every (τ, k) ∈ τ_set × K_set, find
       y^{a*}_{k+1}(τ) = arg max_{y^a_{k+1}} f(y^a_{k+1})
       s.t. Detection Statistic ≤ τ
4: For every τ ∈ τ_set, calculate
       y axis = max_{k ∈ K_set} f(y^{a*}_{k+1}(τ))

Algorithm 2: Computing the X axis
1: Record observations Y^{na} with no attacks, of time-duration T_E
2: For every τ ∈ τ_set, compute
       the detection statistic DS(Y^{na}),
       the number of false alarms nFA(DS, τ), and
       x axis = E[T_fa(τ)] = T_E / nFA

Table 2 legend: ● = well suited, ◐ = partially suitable, ○ = least suitable.

We evaluate anomaly detection systems in the light of our stronger adversary model (see § 4), using our new metrics in a range of test environments with individual strengths and weaknesses (see Table 2). As shown in the table, real-world data allows us to analyze operational large-scale scenarios, and therefore it is the best way to test the x-axis metric E[T_fa]. Unfortunately, real-world data does not give researchers the flexibility to launch attacks and measure the impact on all parts of the system. Such interactive testing requires the use of a dedicated physical testbed.

A physical testbed typically has a smaller scale than a real-world operational system, so the fidelity in false alarms might not be as good as with real data, but on the other hand, we can launch attacks. The attacks we can launch are, however, constrained, because physical components and devices may suffer damage from attacks that violate the safety requirements and conditions for which they were designed. Moreover, attacks could also drive the testbed to states that endanger the operator's and the environment's safety. Therefore, while a testbed provides more experimental interaction than real data, it introduces safety constraints for launching attacks.

Simulations, on the other hand, do not have these constraints, and a wide variety of attacks can be launched. So our simulations will focus on attacks on actuators and demonstrate settings that cannot be achieved while operating a real-world system because of safety constraints. Simulations also allow us to easily change the control algorithms, and to our surprise, we found that control algorithms have a big impact on the ability of our attacker to achieve good results in the y-axis of our metric. However, while simulations allow us to test a wide variety of attacks, the problem is that the false alarms measured with a simulation are not going to be as representative as those obtained from real data or from a testbed.

Figure 10: pH deviation imposed by greedy attacks while using stateful detection (τ = 0.05) with both LDS and nonlinear models (legend: pH with nonlinear order-100, nonlinear order-50, LDS order-20, and pH without attack).
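The x-axis computation of Algorithm 2 above can be sketched as follows; the CUSUM detector, thresholds, bias, and noise level are illustrative stand-ins of our own:

```python
import numpy as np

def expected_time_between_false_alarms(residuals, tau, bias, T_E):
    """Algorithm 2 (sketch): run the stateful CUSUM detector over
    attack-free residuals, count false alarms, and return
    E[T_fa] = T_E / nFA (infinite if no alarm ever fires)."""
    S, n_fa = 0.0, 0
    for r in residuals:
        S = max(0.0, S + abs(r) - bias)
        if S > tau:                  # false alarm: record it, restart test
            n_fa += 1
            S = 0.0
    return T_E / n_fa if n_fa else float("inf")

# Toy attack-free trace: zero-mean noise, 8 "hours" of data at 1 Hz.
rng = np.random.default_rng(0)
residuals = rng.normal(0.0, 0.1, 8 * 3600)
for tau in (0.5, 2.0):
    print(tau, expected_time_between_false_alarms(residuals, tau,
                                                  bias=0.08, T_E=480.0))
```

The y-axis of the metric would pair each of these thresholds with the worst impact achievable by an undetected attack (Algorithm 1), which requires solving the constrained maximization for the specific physical model at hand.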
[Figure residue: maximum Δ pH/min under the nonlinear order-100, nonlinear order-50, and LDS order-20 models.]

Now we turn to another stage in our testbed. The goal of the attacker this time is to deviate the water level in a tank as much as possible, until the tank overflows.

While in the pH example we had to use system identification to learn LDS and nonlinear models, the evolution of the water level in a tank is a well-known LDS system that can be derived from first principles. In particular, we use a mass balance equation that relates the change in the water level h to the inlet Q^in and outlet Q^out volumes of water:

    Area · dh/dt = Q^in − Q^out,

where Area is the cross-sectional area of the base of the tank. Note that in this process the control actions for the valve and the pump are On/Off. Hence, Q^in and Q^out remain constant if they are open, and zero otherwise. Using a time-discretization of 1 s, we obtain an LDS model of the form

    h_{k+1} = h_k + (Q^in_k − Q^out_k) / Area.

Note that while this equation might look like an AR model, it is in fact an LDS model, because the input Q^in_k − Q^out_k changes over time depending on the control actions of the PLC (open/close inlet or start/stop pump). In particular, it is an LDS model with x_k = h_k, u_k = [Q^in_k, Q^out_k]^T, B = [1/Area, −1/Area], A = 1, and C = 1.
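The tank model and the attacker's goal can be sketched with a short simulation; all numbers (Area, inlet flow, High and Overflow levels) are illustrative, not the testbed's:

```python
AREA, Q_IN, DT = 2.0, 0.02, 1.0   # m^2, m^3/s, s  (illustrative values)
HIGH, OVERFLOW = 0.8, 1.0         # reported High level vs. physical overflow (m)

def real_level_when_plc_closes_inlet(fake_rate):
    """h_{k+1} = h_k + (Qin_k - Qout_k)/Area with the outlet closed.
    The compromised sensor grows by fake_rate per step, so the PLC
    keeps the inlet open until the *reported* level reaches HIGH."""
    h_real = h_fake = 0.0
    while h_fake < HIGH:
        h_real += (Q_IN / AREA) * DT     # true mass balance
        h_fake += fake_rate              # attacker-chosen slower ramp
    return h_real

true_rate = (Q_IN / AREA) * DT           # 0.01 m per step, truthful sensor
honest = real_level_when_plc_closes_inlet(true_rate)
attacked = real_level_when_plc_closes_inlet(true_rate / 2)
print(honest, attacked, attacked > OVERFLOW)
```

Halving the reported rate roughly doubles the real level reached before the inlet closes, so the deviation Δ exceeds Overflow − High and the tank spills; the detection tests bound how small the attacker can make fake_rate without raising an alarm.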
Recall that the goal of the attacker is to deviate the water level in a tank as much as possible, until the tank overflows. In particular, the attacker increases the water level sensor signal at a lower rate than the real level of water (Fig. 12), with the goal of overflowing the tank. A successful attack occurs if the PLC receives from the sensor a High water-level message (the point when the PLC sends a command to close the inlet) and, at that point, the deviation (Δ) between the real level of water and the "fake" level (which just reached the High warning) is ≥ Overflow − High. Fig. 12 shows three water level attacks with different increment rates, starting from the Low level setting and stopping at the High level setting, and their induced maximum over the real level. Only attacks a1 and a2 achieve a successful overflow (only…

Because it was derived from "first principles", our LDS model is a highly accurate physical model of the system, so there is no need to test alternative physical models. However, we can combine our LDS model with a stateless test, and with a stateful test, and see which of these detection tests can limit the impact of stealthy attacks. In particular, to compute our metric we need to test stateless and stateful mechanisms and obtain the security metric that quantifies the impact of undetected attacks for several thresholds τ. We selected the parameter b = 0.002 for the stateful (CUSUM) algorithm, such that the detection metric S_k remains close to zero when there is no attack. The usability metric is calculated for T_E = 8 h, which is the duration of the experiment without attacks.

Figure 13: Comparison of stateful and stateless detection. At 0.3 m the tank overflows, so stateless tests are not good for this use case. τb and τc correspond to the thresholds associated with some values of E[T_fa].

Fig. 13 illustrates the maximum impact caused by 20 different undetected attacks, each of them averaging 40 minutes. Even though the attacks remained undetected, the impact using stateless detection is such that a large amount of water can be spilled. Only for very small thresholds is it possible to avoid overflow, but this causes a large number of false alarms. On the other hand, stateful detection limits
the impact of the adversary. Note that to start spilling water (i.e., Δ > 0.3 m) a large threshold is required. Clearly, selecting a threshold such that E[T_fa] = 170 min can avoid the spilling of water with a tolerable number of false alarms.

In addition to attacking sensor values, we would like to analyze undetected actuation attacks. To launch attacks on the actuators (pumps) of this testbed, we would need to turn them On and Off in rapid succession in order to try to keep the residuals of the system low enough to avoid being detected. We cannot do this on real equipment because the pumps would get damaged. Therefore, we will analyze undetected…

5.2 Large-Scale Operational Systems (Modbus)

…108 Modbus devices, of which one acts as central master, one as external network gateway, and 106 are slave PLCs; 3) of the commands sent from the master to the PLCs, 74% are Read Coils (0x01) commands and 6% are Read Discrete Inputs (0x02) commands; and 4) 78% of the PLCs have 200 to 600 registers, 15% between 600 and 1,000, and 7% with…

We replay the traffic traces in packet capture (pcap) format and use Bro [51] to track the memory map of holding (read/write) registers from the PLCs. We then use Pandas [68], a Python Data Analysis Library, to parse the log generated by Bro and to extract, per PLC, the time series corresponding to each of the registers. Each time series corresponds to a signal (yk) in our experiments. We classify the signals as 91.5% constant, 5.3% discrete, and 3.2% continuous, based on the data characterization approach proposed to analyze Modbus traces [21], which uses AR models (as in Eq. (1)). We follow that approach by modeling the continuous time-series in our dataset with AR models. The order of the AR model is selected using the Best Fit criterion from the Matlab system identification toolbox [39], which uses the unexplained output variance, i.e., the portion of the output not explained by the AR model, for various orders [41].

Using the AR model, our first experiment centers on deciding which statistical detection test is better: a stateless test or the stateful CUSUM change detection test. Fig. 14 shows the comparison of stateless vs. stateful tests with our proposed metrics (where the duration of an undetected attack is 10 minutes). As expected, once the CUSUM statistic reaches the threshold S_k = τ, the attack no longer has enough room to continue deviating the signal without being detected, and larger thresholds τ do not make a difference once the attacker reaches the threshold, whereas for the stateless test, the attacker has the ability to change the measurement by τ units at every time step.

[Figure 14 residue: stateless vs. stateful comparison of maximum deviation per second.]

Having shown that a CUSUM (stateful) test reduces the impact of a stealthy attack when compared to the stateless test, we now show how to improve the AR physical model previously used by Hadziosmanovic et al. [21]. In particular, we notice that Hadziosmanovic et al. use an AR model per signal; this misses the opportunity of modeling how multiple signals are correlated, and creating correlated physical models will limit the impact of undetected attacks.

Figure 15: Three example signals with significant correlations. Signal S16 is more correlated with S19 than it is with S8.

Spatial and Temporal Correlation. In an ideal situation the water utility operators could help us identify all control loops and spatial correlations of all variables (the water pump that controls the level of water in a tank, etc.); however, this process becomes difficult to perform in a large-scale system with thousands of control and sensor signals exchanged every second; therefore we now attempt to find correlations empirically from our data. We correlate signals by computing the correlation coefficients of different signals s1, s2, …, sN. The correlation coefficient is a normalized variant of the covariance function:

    corr(si, sj) = cov(si, sj) / sqrt( cov(si, si) · cov(sj, sj) ),

where cov(si, sj) denotes the covariance between si and sj, and the correlation ranges between −1 ≤ corr(si, sj) ≤ 1. We then calculate the p-value of the test to measure the significance of the correlation between signals. The p-value is the probability of having a correlation as large (or as negative) as the observed value when the true correlation is zero (i.e., testing the null hypothesis of no correlation, so lower values of p indicate higher evidence of correlation). We were able to find 8,620 correlations to be highly significant with p = 0.
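This computation can be sketched as follows; the signals are synthetic stand-ins of our own making (names echo Fig. 15), and the p-value here is a permutation estimate rather than the analytic test:

```python
import numpy as np

def corr(a, b):
    """Normalized covariance, as defined in the text."""
    ca, cb = a - a.mean(), b - b.mean()
    return float(ca @ cb / np.sqrt((ca @ ca) * (cb @ cb)))

def perm_pvalue(a, b, n=1000, seed=0):
    """Estimate P(|corr| >= observed) under the no-correlation null
    by repeatedly shuffling one of the two signals."""
    rng = np.random.default_rng(seed)
    obs = abs(corr(a, b))
    hits = sum(abs(corr(a, rng.permutation(b))) >= obs for _ in range(n))
    return hits / n

# Synthetic stand-ins for correlated Modbus register time series.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 400)
s16 = np.sin(t) + rng.normal(0, 0.05, t.size)   # two noisy copies of the
s19 = np.sin(t) + rng.normal(0, 0.05, t.size)   # same underlying process
s8 = rng.normal(0, 1.0, t.size)                 # unrelated noise

print(round(corr(s16, s19), 3), perm_pvalue(s16, s19, n=200))
```

Signals driven by the same physical process show a coefficient near 1 with a vanishing p-value, while an unrelated signal stays near 0, which is the basis for selecting correlated pairs such as (s8, s17) above.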
Because corr(si, sj) = corr(sj, si), there are 4,310 unique significant correlated pairs. We narrow down our attention to corr(si, sj) > 0.96. Fig. 15 illustrates three of the correlated signals we found. Signals s16 and s19 are highly correlated, with corr(s16, s19) = 0.9924, while s8 and s19 are correlated but with a lower correlation coefficient of corr(s8, s19) = 0.9657. For our study we selected signal s8 and its most correlated signal s17, which are among the most correlated signal pairs we found, with corr(s8, s17) = 0.9996.

…output signal is not under the complete control of the attacker: consumers can also affect the frequency of the system (by increasing or decreasing electricity consumption), and therefore they can cause an alarm to be generated if the attacker is not conservative. We assume the worst possible case of an omniscient adversary that knows how much consumption will happen at the next time-step (this is a conservative approach to evaluate the security of our system; in practice we expect the anomaly detection system to perform better, because no attacker can predict the future).
[Figure residue: stateless vs. stateful tests and AR vs. LDS models compared by maximum frequency deviation Δf (Hz).]
Figure 18: Left: the real (and trusted) frequency signal is increased to a level higher than the one expected (red) by our model of the physical system given the control commands. Right: if the defender uses a P control algorithm, the attacker is able to maintain a large deviation of the frequency from its desired 60 Hz set point.

[Figure 19 residue: actuator attack with I/O estimation and PI control (real vs. estimated frequency; real vs. compromised control).]

In all our previous examples with attacked sensors (except for the pH case), the worst possible deviation was achieved at the end of the attack; but for actuation attacks (and PI control), we can see that the controller is compensating for the attack in order to correct the observed frequency deviation, and thus the final deviation will be zero: that is, the asymptotic deviation is zero, while the transient impact of the attacker can be high. Fig. 20 illustrates the difference between measuring the maximum final deviation of the state of the system achieved by the attacker, and the maximum temporary deviation of the state of the system achieved by the attacker.

Figure 20: Differences between attacking sensors and actuators, and the effects when the controller runs a P control algorithm vs. a PI control algorithm.

As we can see, the control algorithm plays a fundamental role in how effective an actuation attack can be. An attacker that can manipulate the actuators at will can cause a larger frequency error, but only for a short time, when we use PI control; however, if we use P control, the attacker can launch more powerful attacks causing long-term effects. On the other hand, attacks on sensors have the same long-term negative effects independent of the type of control we use (P or PI). Depending on the type of system, short-term effects may be more harmful than long-term errors. In our power plant example, a sudden frequency deviation larger than 0.5 Hz can cause irreparable damage to the generators and to equipment in transmission lines (and will trigger protection mechanisms disconnecting parts of the grid). Small long-term deviations may cause cascading effects that can propagate and damage the whole grid.

While it seems that the best option to protect against actuator attacks is to deploy PI controls in all generators,…

6. CONCLUSIONS

6.1 Findings

We introduced theoretical and practical contributions to the growing literature of physics-based attack detection in control systems. Our literature review from different domains of expertise unifies disparate terminology and notation. We hope our efforts can help other researchers refine and improve a common language to talk about physics-based attack detection across computer security, control theory, and power system venues.

In particular, in our survey we identified a lack of unified metrics and adversary models. We explained in this paper the limitations of previous metrics and adversary models, and proposed a novel stealthy and adaptive adversary model, together with its derived intrusion detection metric, that can be used to study the effectiveness of physics-based attack-detection algorithms in a systematic way.

We validated our approaches in multiple setups, including a room-size water treatment testbed, a real large-scale operational system managing more than 100 PLCs, and simulations of primary frequency control in the power grid. We showed in Table 2 how each of these validation setups has advantages and disadvantages when evaluating the x-axis and y-axis of our proposed metric.

One result we obtained across our testbed, real operational systems, and simulations is that stateful tests perform better than stateless tests. This is in stark contrast to the popularity of stateless detection statistics as summarized in Table 1. We hope our paper motivates more implementations of stateful instead of stateless tests in future work.

We also show that, for a stealthy actuator attack, PI controls play an important role in limiting the impact of the attack. In particular, we show that the integral part of the controller corrects the system deviation and forces the attacker to have a negligible effective impact asymptotically.

Finally, we also provided the following novel observations: (1) finding spatio-temporal correlations of Modbus signals has not been proposed before, and we showed that these…

…an unsafe state. Therefore maintaining safety under both attacks and false alarms will need to take priority in the study of any automatic response to alerts.

Acknowledgments

The work at UT Dallas was supported by NIST under award 70NANB14H236 from the U.S. Department of Commerce and by NSF CNS-1553683. The work of Justin Ruths at SUTD was supported by grant NRF2014NCR-NCR001-40 from NRF Singapore. H. Sandberg was supported in part by the Swedish Research Council (grant 2013-5523) and the Swedish Civil Contingencies Agency through the CERCES project. We thank the iTrust center at SUTD for enabling the experiments on the SWaT testbed.

Disclaimer

Certain commercial equipment, instruments, or materials are identified in this paper in order to specify the experimental…
models are better than models of single signals, (2) while tal procedure adequately. Such identification is not intended
input/output models like LDS are popular in control the- to imply recommendation or endorsement by the National
ory, they are not frequently used in papers published in se- Institute of Standards and Technology, nor is it intended to
curity conferences, and we should start using them because imply that the materials or equipment identified are neces-
they perform better than the alternatives, unless we deal sarily the best available for the purpose.
with a highly-nonlinear model, in which case the only way
to limit the impact of stealthy attacks is to estimate non- 7. REFERENCES
linear physical models of the system, and (3) we show why [1] S. Amin, X. Litrico, S. Sastry, and A. Bayen. Cyber
launching undetected attacks in actuators is more difficult security of water SCADA systems; Part I: Analysis
than in sensors. and experimentation of stealthy deception attacks.
IEEE Transactions on Control Systems Technology,
6.2 Discussion and Future Work 21(5):1963–1970, 2013.
While physics-based attack detection can improve the se- [2] S. Amin, X. Litrico, S. Sastry, and A. Bayen. Cyber
curity of control systems, there are some limitations. For ex- security of water SCADA systems; Part II: Attack
ample, in all our experiments the attacks a↵ected the resid- detection using enhanced hydrodynamic models. IEEE
uals and anomaly detection statistics while keeping them Transactions on Control Systems Technology,
below the thresholds; however, there are special cases where 21(5):1679–1693, 2013.
depending on the power of the attacker or the characteris- [3] M. Andreasson, D. V. Dimarogonas, H. Sandberg, and
tics of the plant, the residuals can remain zero (ignoring the K. H. Johansson. Distributed pi-control with
noise) while the attacker can drive the system to an arbi- applications to power systems frequency control. In
trary state. For example, if the attacker has control of all Proceedings of American Control Conference (ACC),
sensors and actuators, then it can falsify the sensor readings pages 3183–3188. IEEE, 2014.
so that our detector believes the sensors are reporting the
[4] K. J. Åström and P. Eykho↵. System identification—a
expected state given the control signal, while in the mean-
survey. Automatica, 7(2):123–162, 1971.
time, the actuators can control the system to an arbitrary
unsafe condition. [5] S. Axelsson. The base-rate fallacy and the difficulty of
Similarly, some properties of the physical systems can intrusion detection. ACM Transactions on Information
also limit us from detecting attacks. For example, systems and System Security (TISSEC), 3(3):186–205, 2000.
vulnerable to zero-dynamics attacks [61], unbounded sys- [6] C.-z. Bai and V. Gupta. On Kalman filtering in the
tems [62], and highly non-linear or chaotic systems [48]. presence of a compromised sensor : Fundamental
Finally, one of the biggest challenges for future work is performance bounds. In Proceedings of American
the problem of how to respond to alerts. While in some Control Conference, pages 3029–3034, 2014.
control systems simply reporting the alert to operators can [7] C.-z. Bai, F. Pasqualetti, and V. Gupta. Security in
be considered enough, we need to consider automated re- stochastic control systems : Fundamental limitations
sponse mechanisms in order to guarantee the safety of the and performance bounds. In Proceedings of American
system. Similar ideas in our metric can be extended to Control Conference, 2015.
this case, where instead of measuring the false alarms, we [8] R. B. Bobba, K. M. Rogers, Q. Wang, H. Khurana,
measure the impact of a false response. For example, our K. Nahrstedt, and T. J. Overbye. Detecting false data
previous work [10] considered switching a control system to injection attacks on DC state estimation. In
open-loop control whenever an attack in the sensors was de- Proceedings of Workshop on Secure Control Systems,
tected (meaning that the control algorithm will ignore sensor volume 2010, 2010.
measurements and will attempt to estimate the state of the [9] A. Carcano, A. Coletta, M. Guglielmi, M. Masera,
system based only on the expected consequences of its con- I. N. Fovino, and A. Trombetta. A multidimensional
trol commands). As a result, instead of measuring the false critical state analysis for detecting intrusions in
alarm rate, we focused on making sure that a reconfiguration SCADA systems. IEEE Transactions on Industrial
triggered by a false alarm would never drive the system to Informatics, 7(2):179–186, 2011.
1103
[10] A. A. Cárdenas, S. Amin, Z.-S. Lin, Y.-L. Huang, C.-Y. Huang, and S. Sastry. Attacks against process control systems: Risk assessment, detection, and response. In Proceedings of the ACM Symposium on Information, Computer and Communications Security, pages 355–366, 2011.
[11] A. A. Cárdenas, J. S. Baras, and K. Seamon. A framework for the evaluation of intrusion detection systems. In Proceedings of the Symposium on Security and Privacy, pages 77–91. IEEE, 2006.
[12] S. Cui, Z. Han, S. Kar, T. T. Kim, H. V. Poor, and A. Tajer. Coordinated data-injection attack and detection in the smart grid: A detailed look at enriching detection solutions. IEEE Signal Processing Magazine, 29(5):106–115, 2012.
[13] G. Dán and H. Sandberg. Stealth attacks and protection schemes for state estimators in power systems. In Proceedings of the Smart Grid Communications Conference (SmartGridComm), October 2010.
[14] K. R. Davis, K. L. Morrow, R. Bobba, and E. Heine. Power flow cyber attacks and perturbation-based defense. In Proceedings of the Conference on Smart Grid Communications (SmartGridComm), pages 342–347. IEEE, 2012.
[15] V. L. Do, L. Fillatre, and I. Nikiforov. A statistical method for detecting cyber/physical attacks on SCADA systems. In Proceedings of Control Applications (CCA), pages 364–369. IEEE, 2014.
[16] E. Eyisi and X. Koutsoukos. Energy-based attack detection in networked control systems. In Proceedings of the Conference on High Confidence Networked Systems (HiCoNS), pages 115–124, New York, NY, USA, 2014. ACM.
[17] N. Falliere, L. O. Murchu, and E. Chien. W32.Stuxnet dossier. White paper, Symantec Corp., Security Response, 2011.
[18] D. Formby, P. Srinivasan, A. Leonard, J. Rogers, and R. Beyah. Who's in control of your control system? Device fingerprinting for cyber-physical systems. In Network and Distributed System Security Symposium (NDSS), February 2016.
[19] R. M. Gerdes, C. Winstead, and K. Heaslip. CPS: An efficiency-motivated attack against autonomous vehicular transportation. In Proceedings of the Annual Computer Security Applications Conference (ACSAC), pages 99–108. ACM, 2013.
[20] A. Giani, E. Bitar, M. Garcia, M. McQueen, P. Khargonekar, and K. Poolla. Smart grid data integrity attacks: Characterizations and countermeasures. In Proceedings of the Smart Grid Communications Conference (SmartGridComm), pages 232–237. IEEE, 2011.
[21] D. Hadžiosmanović, R. Sommer, E. Zambon, and P. H. Hartel. Through the eye of the PLC: Semantic security monitoring for industrial processes. In Proceedings of the Annual Computer Security Applications Conference (ACSAC), pages 126–135. ACM, 2014.
[22] X. Hei, X. Du, S. Lin, and I. Lee. PIPAC: Patient infusion pattern based access control scheme for wireless insulin pump system. In Proceedings of INFOCOM, pages 3030–3038. IEEE, 2013.
[23] F. Hou, Z. Pang, Y. Zhou, and D. Sun. False data injection attacks for a class of output tracking control systems. In Proceedings of the Chinese Control and Decision Conference, pages 3319–3323, 2015.
[24] T. Kailath and H. V. Poor. Detection of stochastic processes. IEEE Transactions on Information Theory, 44(6):2230–2231, 1998.
[25] A. J. Kerns, D. P. Shepard, J. A. Bhatti, and T. E. Humphreys. Unmanned aircraft capture and control via GPS spoofing. Journal of Field Robotics, 31(4):617–636, 2014.
[26] T. T. Kim and H. V. Poor. Strategic protection against data injection attacks on power grids. IEEE Transactions on Smart Grid, 2(2):326–333, 2011.
[27] I. Kiss, B. Genge, and P. Haller. A clustering-based approach to detect cyber attacks in process control systems. In Proceedings of the Conference on Industrial Informatics (INDIN), pages 142–148. IEEE, 2015.
[28] O. Kosut, L. Jia, R. Thomas, and L. Tong. Malicious data attacks on smart grid state estimation: Attack strategies and countermeasures. In Proceedings of the Smart Grid Communications Conference (SmartGridComm), October 2010.
[29] G. Koutsandria, V. Muthukumar, M. Parvania, S. Peisert, C. McParland, and A. Scaglione. A hybrid network IDS for protective digital relays in the power transmission grid. In Proceedings of Smart Grid Communications (SmartGridComm), 2014.
[30] M. Krotofil, J. Larsen, and D. Gollmann. The process matters: Ensuring data veracity in cyber-physical systems. In Proceedings of the Symposium on Information, Computer and Communications Security (ASIACCS), pages 133–144. ACM, 2015.
[31] C. Kwon, W. Liu, and I. Hwang. Security analysis for cyber-physical systems against stealthy deception attacks. In Proceedings of the American Control Conference, pages 3344–3349, 2013.
[32] R. Langner. Stuxnet: Dissecting a cyberwarfare weapon. IEEE Security & Privacy, 9(3):49–51, 2011.
[33] J. Liang, O. Kosut, and L. Sankar. Cyber attacks on AC state estimation: Unobservability and physical consequences. In Proceedings of the PES General Meeting, pages 1–5, July 2014.
[34] H. Lin, A. Slagell, Z. Kalbarczyk, P. W. Sauer, and R. K. Iyer. Semantic security analysis of SCADA networks to detect malicious control commands in power grids. In Proceedings of the Workshop on Smart Energy Grid Security, pages 29–34. ACM, 2013.
[35] Y. Liu, P. Ning, and M. K. Reiter. False data injection attacks against state estimation in electric power grids. In Proceedings of the ACM Conference on Computer and Communications Security (CCS), pages 21–32. ACM, 2009.
[36] Y. Liu, P. Ning, and M. K. Reiter. False data injection attacks against state estimation in electric power grids. ACM Transactions on Information and System Security (TISSEC), 14(1):13, 2011.
[37] L. Ljung. The Control Handbook, chapter System Identification, pages 1033–1054. CRC Press, 1996.
[38] L. Ljung. System Identification: Theory for the User. Prentice Hall PTR, Upper Saddle River, NJ, USA, 2nd edition, 1999.
[39] L. Ljung. System Identification Toolbox for Use with MATLAB. The MathWorks, Inc., 2007.
[40] D. Mashima and A. A. Cárdenas. Evaluating electricity theft detectors in smart grid networks. In
Research in Attacks, Intrusions, and Defenses, pages 210–229. Springer, 2012.
[41] The MathWorks, Inc. Identifying input-output polynomial models. www.mathworks.com/help/ident/ug/identifying-input-output-polynomial-models.html, October 2014.
[42] S. McLaughlin. CPS: Stateful policy enforcement for control system device usage. In Proceedings of the Annual Computer Security Applications Conference (ACSAC), pages 109–118, New York, NY, USA, 2013. ACM.
[43] F. Miao, Q. Zhu, M. Pajic, and G. J. Pappas. Coding sensor outputs for injection attacks detection. In Proceedings of the Conference on Decision and Control, pages 5776–5781, 2014.
[44] Y. Mo and B. Sinopoli. Secure control against replay attacks. In Proceedings of the Allerton Conference on Communication, Control, and Computing (Allerton), pages 911–918. IEEE, 2009.
[45] Y. Mo, S. Weerakkody, and B. Sinopoli. Physical authentication of control systems: Designing watermarked control inputs to detect counterfeit sensor outputs. IEEE Control Systems, 35(1):93–109, 2015.
[46] Y. L. Mo, R. Chabukswar, and B. Sinopoli. Detecting integrity attacks on SCADA systems. IEEE Transactions on Control Systems Technology, 22(4):1396–1407, 2014.
[47] K. L. Morrow, E. Heine, K. M. Rogers, R. B. Bobba, and T. J. Overbye. Topology perturbation for detecting malicious data injection. In Proceedings of the Hawaii International Conference on System Science (HICSS), pages 2104–2113. IEEE, 2012.
[48] E. Ott, C. Grebogi, and J. A. Yorke. Controlling chaos. Physical Review Letters, 64(11):1196, 1990.
[49] M. Parvania, G. Koutsandria, V. Muthukumar, S. Peisert, C. McParland, and A. Scaglione. Hybrid control network intrusion detection systems for automated power distribution systems. In Proceedings of the Conference on Dependable Systems and Networks (DSN), pages 774–779, June 2014.
[50] F. Pasqualetti, F. Dörfler, and F. Bullo. Attack detection and identification in cyber-physical systems. IEEE Transactions on Automatic Control, 58(11):2715–2729, November 2013.
[51] V. Paxson. Bro: A system for detecting network intruders in real-time. Computer Networks, 31(23):2435–2463, 1999.
[52] S. Postalcioglu and Y. Becerikli. Wavelet networks for nonlinear system modeling. Neural Computing and Applications, 16(4-5):433–441, 2007.
[53] I. Sajjad, D. D. Dunn, R. Sharma, and R. Gerdes. Attack mitigation in adversarial platooning using detection-based sliding mode control. In Proceedings of the ACM Workshop on Cyber-Physical Systems-Security and/or PrivaCy (CPS-SPC), pages 43–53, New York, NY, USA, 2015. ACM. http://doi.acm.org/10.1145/2808705.2808713.
[54] H. Sandberg, A. Teixeira, and K. H. Johansson. On security indices for state estimators in power networks. In Proceedings of the Workshop on Secure Control Systems, 2010.
[55] Y. Shoukry, P. Martin, Y. Yona, S. Diggavi, and M. Srivastava. PyCRA: Physical challenge-response authentication for active sensors under spoofing attacks. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS), pages 1004–1015, New York, NY, USA, 2015. ACM.
[56] R. Smith. A decoupled feedback structure for covertly appropriating networked control systems. In Proceedings of the IFAC World Congress, volume 18, pages 90–95, 2011.
[57] S. Sridhar and M. Govindarasu. Model-based attack detection and mitigation for automatic generation control. IEEE Transactions on Smart Grid, 5(2):580–591, 2014.
[58] R. Tan, V. Badrinath Krishna, D. K. Yau, and Z. Kalbarczyk. Impact of integrity attacks on real-time pricing in smart grids. In Proceedings of the SIGSAC Conference on Computer & Communications Security (CCS), pages 439–450. ACM, 2013.
[59] A. Teixeira, S. Amin, H. Sandberg, K. H. Johansson, and S. S. Sastry. Cyber security analysis of state estimators in electric power systems. In Proceedings of the Conference on Decision and Control (CDC), pages 5991–5998. IEEE, 2010.
[60] A. Teixeira, D. Pérez, H. Sandberg, and K. H. Johansson. Attack models and scenarios for networked control systems. In Proceedings of the Conference on High Confidence Networked Systems (HiCoNS), pages 55–64. ACM, 2012.
[61] A. Teixeira, I. Shames, H. Sandberg, and K. H. Johansson. Revealing stealthy attacks in control systems. In Proceedings of the Allerton Conference on Communication, Control, and Computing (Allerton), pages 1806–1813. IEEE, 2012.
[62] A. Teixeira, I. Shames, H. Sandberg, and K. H. Johansson. A secure control framework for resource-limited adversaries. Automatica, 51:135–148, 2015.
[63] The Modbus Organization. Modbus application protocol specification, 2012. Version 1.1v3.
[64] D. Urbina, J. Giraldo, N. Tippenhauer, and A. Cárdenas. Attacking fieldbus communications in ICS: Applications to the SWaT testbed. In Proceedings of the Singapore Cyber-Security Conference (SG-CRC), Singapore, volume 14, pages 75–89, 2016.
[65] J. Valente and A. A. Cárdenas. Using visual challenges to verify the integrity of security cameras. In Proceedings of the Annual Computer Security Applications Conference (ACSAC). ACM, 2015.
[66] O. Vuković and G. Dán. On the security of distributed power system state estimation under targeted attacks. In Proceedings of the Symposium on Applied Computing, pages 666–672. ACM, 2013.
[67] Y. Wang, Z. Xu, J. Zhang, L. Xu, H. Wang, and G. Gu. SRID: State relation based intrusion detection for false data injection attacks in SCADA. In Proceedings of the European Symposium on Research in Computer Security (ESORICS), pages 401–418. Springer, 2014.
[68] Pandas: Python data analysis library. http://pandas.pydata.org, November 2015.
[69] M. Zeller. Myth or reality—does the Aurora vulnerability pose a risk to my generator? In Proceedings of the Conference for Protective Relay Engineers, pages 130–136. IEEE, 2011.