
Classification of Disturbances and Cyber Attacks


650 IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 11, NO. 3, JUNE 2015

Classification of Disturbances and Cyber-Attacks in Power Systems Using Heterogeneous Time-Synchronized Data
Shengyi Pan, Member, IEEE, Thomas Morris, Senior Member, IEEE,
and Uttam Adhikari, Student Member, IEEE

Manuscript received July 01, 2014; revised September 18, 2014, December 18, 2014 and February 19, 2015; accepted March 22, 2015. Date of publication April 08, 2015; date of current version June 02, 2015. Paper no. TII-14-0692.
The authors are with the Department of Electrical and Computer Engineering, Mississippi State University, Starkville, MS 39762 USA (e-mail: [email protected]; [email protected]; [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TII.2015.2420951
1551-3203 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Abstract—Visualization and situational awareness are of vital importance for power systems, as the earlier a power-system event such as a transmission line fault or cyber-attack is identified, the quicker operators can react to avoid unnecessary loss. Accurate time-synchronized data, such as system measurements and device status, provide benefits for system state monitoring. However, the time-domain analysis of such heterogeneous data to extract patterns is difficult due to the existence of transient phenomena in the analyzed measurement waveforms. This paper proposes a sequential pattern mining approach to accurately extract patterns of power-system disturbances and cyber-attacks from heterogeneous time-synchronized data, including synchrophasor measurements, relay logs, and network event monitor logs. The term common path is introduced. A common path is a sequence of critical system states in temporal order that represents an individual type of disturbance or cyber-attack. Common paths are unique signatures for each observed event type. They can be compared to observed system states for classification. In this paper, the process of automatically discovering common paths from labeled data logs is introduced. An included case study uses the common path-mining algorithm to learn common paths from a fusion of heterogeneous synchrophasor data and system logs for three types of disturbances (in terms of faults) and three types of cyber-attacks, which are similar to or mimic faults. The case study demonstrates the algorithm’s effectiveness at identifying unique paths for each type of event and the accompanying classifier’s ability to accurately discern each type of event.

Index Terms—Common paths, cyber-attack detection, disturbances, symmetric and unsymmetrical faults, synchrophasor data and device log mining.

I. INTRODUCTION

SITUATIONAL awareness technologies have been studied and continuously improved for decades. The need to continue situational awareness improvements is motivated by recent power disturbances, which have led to large-scale blackouts [1]. A power-system disturbance, such as a transmission line fault, can initiate a chain of reactions, which lead to a cascading blackout if timely actions from operators are not taken. Poor visibility across the power system may also cause the significance of an event to be misunderstood and lead to incorrect control actions by operators in control centers. Additionally, as power systems increasingly depend on communication infrastructures to provide wide-area monitoring and control, power systems are exposed to the threat of cyber-attacks. Cyber-attacks are another form of power-system contingency. Attacks that target power systems can exploit vulnerabilities in control devices and communication links to corrupt the control and measurement signals [2], [3], and interrupt monitoring algorithms [4]. Cyber-attacks that corrupt control and measurement signals can be disguised as power-system disturbances or control actions. Situational awareness technologies are needed, which distinguish between actual power-system disturbances related to natural events and cyber-attacks. The emphasis of this work is not on classifying disturbance types, as quite a number of methods have been proposed to do so in the power system, but on distinguishing between disturbances and cyber-attacks. First, in the case that a cyber-attack impersonates a disturbance or control action, proper classification will lead to proper response. Classifying a cyber-attack as a disturbance or control action can lead to improper response and cause an outage or other negative impacts on the power system. Conversely, incorrectly classifying a disturbance or control action as a cyber-attack can lead to improper response within the information and communications technology (ICT) system. Second, a single classifier, which identifies all types of power-system contingencies, is needed as an input to automated event response algorithms such as autonomic management frameworks, system integrity protection schemes (SIPS) [5], wide-area protection systems (WAPS) [6], and autonomic control frameworks [7]. This paper presents a methodology to mine the patterns for disturbances and cyber-attacks using a two-dimensional (2-D) graph from logged heterogeneous system data, to use the common paths in the graph as signatures of each type of modeled scenario, and finally, to classify specific disturbances and cyber-attacks. For proof of concept, in the paper, we consider disturbances as different types of line-to-ground and line-to-line faults.
Wide-area measurement systems (WAMS) couple time-synchronized voltage, current, and frequency measurements with high-speed networks to allow improved power-system situational awareness [8]. Compared with the traditional supervisory control and data acquisition (SCADA) systems that poll
Authorized licensed use limited to: Jaypee Insituite of Information Technology-Noida Sec 128 (L3). Downloaded on February 13,2024 at 05:12:56 UTC from IEEE Xplore. Restrictions apply.
PAN et al.: CLASSIFICATION OF DISTURBANCES AND CYBER-ATTACKS IN POWER SYSTEMS 651

field sensors once every several seconds, synchrophasor systems allow measurement of up to 120 samples/s. Synchrophasor data were used in this work for two reasons. First, the common path-mining algorithm uses a set of observed system states in temporal order as a signature for each observed event type. Synchrophasor measurements enable identification of fast-moving power-system events. Some power-system events involve fast-changing behaviors and may last only a few milliseconds [9]. For example, zone 1 faults are typically set to be cleared instantly. The presence of a fault and the system response of opening the breaker to clear the fault take just a few cycles. These events can be missed by slower speed measurement systems. Second, synchrophasor systems provide more accurate system-state visibility due to the use of time-synchronized measurements. The common path-mining algorithm can leverage this improved visibility to track events related to a single event from multiple sensors. The relatively high measurement frequency and time-synchronized characteristic offered by WAMS create very large volumes of data and enable various applications including wide-area protection schemes (WAPS), and SIPS [5], [6], [10]. The common path-mining algorithm is not dependent on synchrophasor systems. Common path mining requires the ability to observe sequences of events. Other devices such as fault data recorders or meters may potentially be substituted to detect events of interest. Using synchrophasor data alone is not enough to detect cyber-attacks. For example, a cyber-attack can mimic a real fault by first injecting false measurements, then tripping the relay. Such mimicry cannot be detected with synchrophasor data alone. The status of other power-system components such as relays and breakers is also available as time-synchronized data via synchrophasor systems [10]. Combining synchrophasor data with other system logs such as relay status logs and network event monitor logs can extend the situational awareness capabilities provided by a synchrophasor system to detect cyber-attacks. However, this creates the challenge of how heterogeneous data sources can be merged to train and use such a classifier. This paper provides a solution to this problem by proposing a data-mining approach that leverages the timestamped data to extract temporal patterns, which can be used to describe system behavior related to disturbances and cyber-attacks. Henceforth, disturbances and cyber-attacks are collectively referred to as scenarios.
In this work, a pattern for a scenario is presented as a common path that consists of a sequence of system states in temporal order. A system state in a common path is made up of multiple instantaneous readings from available sensors from the system. One advantage of the common path is that it overcomes the difficulty in analyzing time-domain waveforms by discovering the critical system states across very short time intervals (in milliseconds). The automatic process of discovering common paths is introduced by using a case study in a simulated three-bus two-line transmission system. For this work, a case study is provided, which considers disturbances including symmetric and asymmetric faults and different cyber-attacks that mimic the single-line-to-ground (1LG) fault to confuse operators in the control center. The cyber-attacks studied in this work belong to masquerading and/or man-in-the-middle (MITM) attacks that target physical devices such as phasor measurement units (PMU) and relays. These attacks may originate from a compromised node in the control center, sending control commands or measurement packets covered by legitimate source IP addresses and legal packet formats. As such, it is assumed that the masquerading packets cannot be detected by traditional network intrusion detection systems. Validation of the common path-mining algorithm is based on simulated data because actual synchrophasor data are not available to researchers due to the proprietary nature of the data, confidentiality issues, and the lack of a proper sharing mechanism among researchers and institutes. Additionally, datasets captured from utilities contain a limited number of scenarios. This limits diversity in the dataset. Some power-system scenarios are rare, especially cyber-attacks. Hardware-in-the-loop (HIL) simulation allows targeted dataset creation with realistic scenarios captured from the same commercial devices found in utilities. The same datasets used in this work have also been used in [11] for synchrophasor data-mining research.
This work has three primary contributions that distinguish it from existing methods. First, this work demonstrates a new classifier capable of distinguishing power-system disturbances and cyber security attacks that interrupt power-system control actions and mimic real disturbances. Compared to a similar work in [11], the method described in this paper provides precise classifications of fault types and the types of cyber-attacks with similar accuracy. Second, this work uses the common path-mining algorithm to mine fused heterogeneous data and create common paths for each known event type. The common path-mining algorithm uses less memory when compared to traditional data mining methods that require data to be mapped into memory before mining. The smaller memory requirement is achieved via a preprocessing step, which compresses the massive time-synchronized data into a sequence of system states, aka paths, which require considerably less memory than storing all time-synchronized measurements associated with an event. Third, power systems are dynamic in nature, which leads to minor variations in system state for known scenarios. The classifier presented in this paper learns by parsing datasets marked with scenario type. The training process results in an ordered sequence of system states, i.e., a path, representing each unique instance of a scenario found in the dataset. To avoid overfitting, the common path-mining algorithm was developed to discover critical states shared by similar paths representing the same scenario. The result of the common path algorithm is a merged set of paths representing all scenarios in the dataset. The classifier matches monitored state-transition patterns to common paths of known scenarios to provide a specific classification of the observed behavior.
The remainder of this paper is organized as follows. Section II presents related works including an overview of other data-mining approaches used for classification of power-system disturbances or cyber-attacks. Section III discusses the methodology, the process of common path mining, and the classifier training and validation phases. Section IV introduces the case study test bed, test data, and test data preprocessing


procedure. Section V presents the classification results of three experiments. Section VI concludes this work and proposes future work.

II. RELATED WORKS

Current research on applying data mining to synchrophasor data for power-system fault and disturbance classification can be found in [12] and [13]. The K-nearest neighbor algorithm was used to classify three phase faults (3LG), voltage oscillation, and voltage sag scenarios in [12]. The algorithm accuracy is not provided in [12]. Hoeffding Tree-based stream data mining is used in [13]. This approach was able to classify 3LG and 1LG faults grouped for binary classification with greater than 90% accuracy. Both [12] and [13] used simulated power-system data. Both [12] and [13] propose methods to mine synchrophasor data. However, both are designed for power-system measurement data only and do not incorporate any other types of system information. By only considering measurement data, it is impossible to detect cyber-attacks such as fault replay or command injection attacks in which valid measurements or control commands are replayed. The work described in this paper fuses synchrophasor data and control system log information to allow precise classification of power-system faults and cyber-attacks.
Multiple traditional data-mining algorithms were used to classify power-system faults and cyber-attacks in [11]. The authors of [11] used the same dataset for algorithm validation as that used for this paper. The traditional data-mining algorithms were able to differentiate between power-system disturbances and cyber-attacks. However, the traditional data-mining algorithms were not able to classify specific fault and cyber-attack types within each large category.
Many other data-mining approaches have been developed to extract signatures and classify power-system disturbances, but they have no ability to detect cyber-attacks. Many such approaches classify power-system disturbances in the time domain. Decision trees were used to classify power-system disturbances in [14] and [15]. Statistical characteristics of power-system frequency were used in [16] to represent the signatures of power-system disturbances. Many works have applied neural networks to classify faults. In [17], with the help of wavelet transforms, current phase is decomposed and fed into a particle swarm optimization-based neural network for fault classification. A Chebyshev neural network is examined in [18] on current signals to evaluate the fault classification performance. In [19], the neural network is integrated with a wavelet transform multiresolution analysis technique to extract patterns for faults in shipboard power systems using energy variation of fault signals. In [20], the authors used a neural network with current waveforms and data from digital fault recorders to classify faults, normal maintenance operations, and power-quality disturbances. The works above all propose batch processing data-mining approaches to learn patterns for power-system events. These methods are not suitable for synchrophasor measurement data because batch processing requires all data to be read into memory to learn patterns. A single PMU can generate two million daily samples of data and multiple PMUs can quickly exhaust available memory resources. The method proposed in this paper distinguishes itself from batch processing data-mining approaches by compressing fused synchrophasor and system log information into a set of system-state transitions, which minimize memory requirements during the training step. Furthermore, the same compression scheme is used during the classification step, allowing the use of pattern matching to support real-time classification.
The work presented in this paper uses a sequential data-mining approach to classify patterns from sequences of events. Sequential data mining is better suited for high-velocity and high-volume synchrophasor data streams because synchrophasor data are discrete data but continuous in time. Additionally, the common path-mining algorithm presented in this paper can learn to classify traditional power-system contingencies, such as faults, and cyber-attacks against power systems which masquerade as traditional contingencies.
Machine-learning approaches have also been applied to detect cyber-attacks against power systems, but they do not consider power-system fault detection. In [21], detection rules were derived by manually specifying allowable ranges for different system measurements using domain expert knowledge. Such specification-based methods have been shown to have high detection accuracy; however, the manual effort required to develop such a decision tree is too great to apply to a problem on the scale of power-system protection. Other works have been found, which provide intrusion detection for synchrophasor systems, but they still do not provide power-system fault detection. An intrusion detection system (IDS) was proposed, which uses white lists to detect invalid network behaviors based on a synchrophasor network protocol specification [22]. A second proposed IDS uses timing and data-volume information to identify data-integrity attacks against synchrophasor systems [23]. However, by looking only at protocol format, timing, and data-volume information, these methods are not able to detect insider attacks, e.g., the command injection from a valid machine where the network packets have legitimate format, valid timing, and data-volume information. In [24], the authors manually created rules using the industrial state modeling language (ISML) to track SCADA system states. Nader et al. used a kernel machine-learning method to model SCADA system normal behavior, in order to detect machine failures and intrusions [25]. Due to a lack of attack data, only system normal behavior was learnt, and therefore, the authors were not able to test detection of attacks.
This paper presents a data-mining technique to develop signatures of multiple types of power-system faults and cyber-attacks. The resulting signatures provide a hybrid specification, which specifies both normal reactions to faults and symptoms of cyber-attacks. The data-mining algorithm presented in this paper has the distinct advantage of requiring far less system expertise to create signatures.
The data-mining technique used in this paper is mining sequential patterns, which discovers patterns of activity from time-ordered data. The mining sequential patterns concept was first presented in [26] as a method to perform market basket analysis. Mining sequential patterns was used to discover patterns in clinical client-care management process

data that consist of patient records and log data over a period of treatment time in [27]. This technique was extended in [28] by employing a 2-D Bayesian network to graphically represent patterns in Hemodialysis processes, which consist of a sequence of medical activities over time. In order to discover patterns, a patient’s physiological “state” is defined using clinical log data and patient records (e.g., body temperature, weight, mood, etc.). The pattern is therefore represented as contiguous transitions of states in a 2-D graph. Classification was made using the learnt patterns.
For this work, the frequent pattern (FP)-growth algorithm was used to mine for frequent sequential patterns. FP-growth reduces the cost of searching for frequent sequences by adopting a divide-and-conquer strategy [29]. As demonstrated in [30], the FP-growth algorithm outperforms several popular frequent pattern-mining algorithms in run time, and therefore, it was chosen for this work. Frequent pattern mining is traditionally used for market basket analysis, a method to build associations between commonly purchased items at a store. In this paper, frequent pattern mining is used to identify associative relationships between observed power-system states related to a particular event type or scenario.
Compared with peer works, this work is unique in that we propose a data-mining algorithm that can learn patterns for both power-system disturbances and cyber-attacks from heterogeneous data including synchrophasor measurements and device logs from multiple locations in the power system. Learnt patterns are translated into common paths. Common paths are used as signatures for pattern recognition. This approach enables a fast low-memory process for detecting power-system contingencies and cyber-attacks. It is possible to use separate classifiers for power-system event detection and cyber intrusion detection. However, for attacks which mimic power-system events, a supervisor process (a human or another algorithm) will be required to analyze outputs from the two separate algorithms to resolve conflicts. Combining power-system event detection and cyber-intrusion detection resolves this issue. Furthermore, this work is unique because it provides a mechanism for precise classification of power-system disturbances and cyber-attacks which attempt to mimic the same disturbances. Such precise classification enables automated response algorithms which will lead to a more reliable power system.

III. COMMON PATH MINING

A. Sequential Events for a Power-System Scenario

Power-system scenarios can be described as an ordered sequence of measurable events. For example, Fig. 1 depicts phase a current magnitude during a 1LG fault on a transmission line. The current magnitude can be quantized into three ranges: high, normal, and low, which are represented by dark gray, white, and light gray rectangles shading Fig. 1. When the system is in a normal state, the current stays in the normal range, marked as node A in Fig. 1. When the 1LG fault occurs, current increases to the high range via node B. The protection scheme will operate two relays, R1 and R2, at both ends of the transmission line to open breakers and isolate the fault. Current magnitude then drops through node C to zero. The following six notations denote six events: “$I_{R1} = H$” as node “B,” meaning “Current measured by R1 increases to High;” “$I_{R2} = H$” for “Current measured by R2 increases to High;” “$R1 = \mathrm{Trip}$” for “Relay R1 trips;” “$R2 = \mathrm{Trip}$” for “Relay R2 trips;” “$I_{R1} = 0$” as node “C” for “Current measured by R1 drops to Zero;” and “$I_{R2} = 0$” for “Current measured by R2 drops to Zero.” With these notations, the timestamps of the 1LG fault and the resulting protection scheme operation can be represented by expression (1), where $t(\cdot)$ stands for the timestamp of the corresponding event

$$t(I_{R1} = H) = t(I_{R2} = H) < t(R1 = \mathrm{Trip}) = t(R2 = \mathrm{Trip}) < t(I_{R1} = 0) = t(I_{R2} = 0). \tag{1}$$

Fig. 1. Ideal versus actual 1LG fault and protection system response.

Expression (1) assumes a fault which appears at both relays at the same time and assumes that both relays operate at the same time. In fact, the fault may occur at different locations along the line, leading to variations in the time each relay observes the fault and variations in relay operation time. Power systems are dynamic. In Fig. 1, the dashed line shows an ideal waveform of current magnitude during a fault and the solid line graphs a waveform captured from real-time digital simulator (RTDS) simulation of a 1LG fault. The actual waveform includes multiple variations from the ideal waveform. A power system’s response to load variation, fault location variation, and transient behaviors results in irregular waveforms. Such variations are reflected as dispersions in the timestamps of node B and node C for different instances of the same scenario. The dispersion in timestamps can be seen not only in the events related to the current magnitude but also in the events related to other features. Fig. 2 shows box plots of timestamps of six events for three fault scenarios and one scenario where relays R1 and R2 are tripped by attackers. The X-axis of Fig. 2 is the set of observed events. The box plots represent 40 instances of each scenario. To provide an ordered sequence, the timestamp of the first event in a sequence was subtracted from timestamps of all later events in the sequence. The box plots and the interconnecting edges of a scenario are depicted using the same color. As shown in Fig. 2, events take place in temporal order. Event timestamps vary due to system dynamics. For each scenario, a track can be drawn by connecting box plot medians. The tracks shown in Fig. 2 generally agree with expression (1). Expert knowledge can be used to create similar expressions for all known system behaviors.
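As a rough illustration of expression (1), the following Python sketch checks whether one observed instance of a scenario follows the ideal event ordering. It is not taken from the paper: the event names, the millisecond timestamps, and the tolerance parameter `tol_ms` are all hypothetical. Timestamps are first shifted so the earliest event is at zero, as described for Fig. 2.

```python
# Illustrative sketch only: event names, timestamps, and the tolerance are
# hypothetical and are not taken from the paper's dataset.

# Events grouped in the order required by expression (1); events inside a
# group are expected to be (nearly) simultaneous.
EXPECTED_GROUPS = [
    ("IR1=H", "IR2=H"),
    ("R1=Trip", "R2=Trip"),
    ("IR1=0", "IR2=0"),
]


def relative_timestamps(events):
    """Subtract the earliest timestamp so each sequence starts at 0."""
    t0 = min(events.values())
    return {name: t - t0 for name, t in events.items()}


def matches_expression_1(events, tol_ms=0):
    """Return True if the events follow the ordering of expression (1).

    tol_ms is the allowed spread (in ms) inside a group; tol_ms=0 is the
    ideal case in which both relays see the fault at the same instant.
    """
    rel = relative_timestamps(events)
    previous_latest = None
    for group in EXPECTED_GROUPS:
        times = [rel[name] for name in group]
        if max(times) - min(times) > tol_ms:
            return False  # events in this group are not simultaneous
        if previous_latest is not None and min(times) <= previous_latest:
            return False  # this group does not follow the previous one
        previous_latest = max(times)
    return True


# Timestamps in integer milliseconds (hypothetical values).
ideal = {"IR1=H": 10000, "IR2=H": 10000, "R1=Trip": 10050,
         "R2=Trip": 10050, "IR1=0": 10100, "IR2=0": 10100}
# Same scenario, but relay R2 observes and clears the fault 8 ms later.
dispersed = {"IR1=H": 10000, "IR2=H": 10008, "R1=Trip": 10050,
             "R2=Trip": 10058, "IR1=0": 10100, "IR2=0": 10108}

print(matches_expression_1(ideal))                 # strict check, ideal case
print(matches_expression_1(dispersed))             # strict check fails
print(matches_expression_1(dispersed, tol_ms=10))  # passes with tolerance
```

The strict check rejects the dispersed instance even though it is the same scenario, which is the kind of timestamp variation the text describes; a hand-picked tolerance helps here but would have to be chosen per event and per scenario.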

However, time variation prevents these from serving as signatures for classification. This leads to the need for a graph to describe an ordered set of events describing a scenario while accommodating the variation in timestamps.

Fig. 2. Distribution of timestamps for events.

Tracks are an ordered list of events with measurements where each vertex is an event measured at a single sensor. The classifier presented in this paper uses paths, which are an ordered list of system states where a state is a snapshot of measurements from all available sensors at a given time instant. The steps taken to convert heterogeneous data collected during a scenario into a path will be introduced in the next section. Path vertices are states and path edges are transitions between states. Paths are a means for providing stateful monitoring of the system. The training process performed to create paths is subject to overfitting due to the time variations seen in Fig. 2. In the overfitting case, different instances of the same scenario may have different paths. A technique for common path mining is provided below to identify shared critical states between a set of paths for a scenario, leaving a common path that accommodates the variation in timestamps.

B. Common Path Mining

The common path-mining algorithm is used to derive common paths for each scenario of interest. Common paths are maximal frequent sequences found in the set of paths observed for a given scenario. Common paths can be used as a signature for a scenario and pattern matching can be used to classify system events by scenario type.
The common path-mining algorithm is described below. The algorithm must be run once for each scenario of interest.

Algorithm 1. Common Path Mining

Input: Raw data from power system for the scenario of interest
Output: A common path

Step 1) Collect raw data. Raw data consist of measurements and timestamps. Expressions (2)–(4) show three measurements and timestamps from two example sensors, s1 and s2. Each sensor may measure a single item or multiple items and each sensor provides a timestamp. For example, $s1_{1}$ denotes the measurements from sensor s1 at timestamp 1; $s2a_{1.5}$ is a measurement from sensor s2 for item a at timestamp 1.5. Many instances of raw data are needed for each scenario. All sensors must have a measurement at time 0

$$s1_{1} = (s1a_{1}, s1b_{1}, \ldots, ts1_{1}) \tag{2}$$
$$s2_{1.5} = (s2a_{1.5}, s2b_{1.5}, \ldots, ts2_{1.5}) \tag{3}$$
$$s1_{2} = (s1a_{2}, s1b_{2}, \ldots, ts1_{2}). \tag{4}$$

Step 2) Merge raw data. The various sensor data must be merged into a single database. Since each sensor may take measurements at different times, the merged data must be time aligned. The highest frequency sensor is used as a baseline. Slower rate sensor data are merged into the baseline sensor’s log file. Measurements from slower sensors, which are between timestamps of the baseline sensor, are delayed to the next baseline sensor timestamp. Table I shows an example of merged raw data based on the input data from expressions (2)–(4).

TABLE I. MERGED RAW DATA

Step 3) Quantize data. Data from sensors can take many forms: real numbers, integers, Boolean values, etc. Data must be quantized to reduce state space. For sensors with real and integer values, data can be quantized into numbered ranges. For example, voltage and current can be quantized into low (0), medium (1), and high (2) ranges according to two thresholds $r_1$ and $r_2$. The choice of $r_1$ and $r_2$ requires expert knowledge. Expression (5) provides an example quantization mapping for measurement $s_i$

$$q(s_i) = \begin{cases} 0, & \text{if } s_i \le r_1 \\ 1, & \text{if } r_1 < s_i < r_2 \\ 2, & \text{if } s_i \ge r_2. \end{cases} \tag{5}$$

Step 4) Map to states. A state is a set of merged and quantized sensor measurements and a timestamp. Expression (6) shows an example state

$$S_j = (q(s1_i), q(s2_i), \ldots, t_i). \tag{6}$$

States are stored in a state database. Only unique states are stored in the database and the state index j is incremented for each unique state. The state database is common for all instances of all scenarios.
After mapping to states, an instance of a scenario can be represented as an uncompressed path. Expression (7) shows an uncompressed path representing the kth instance of scenario U

$$U_k = (S_0, S_0, S_1, S_2, \ldots). \tag{7}$$

Step 5) Compress data into paths. The uncompressed paths are compressed by removing sequences of states that do not change, leaving just one instance of that state. This step provides a compression, which reduces memory usage and results in a tuple that represents all state transitions for the system. The state transitions correspond to events. The result of compression is a path that represents a single instance of a scenario. A path $P_i$ is a list of observed system states arranged in temporal order according to their timestamps, ordered by increasing time

$$P_i = (S_1, S_2, \ldots, S_n). \tag{8}$$

Dynamic systems will have many paths for the same scenario due to minor variations in sampled data resulting from measurement inaccuracies and changes in the larger system. For example, power systems are large interconnected systems. Changes outside the monitored portion of the power system may lead to variability in observed measurements for the same scenario in the monitored portion of the power system.

Step 6) Mine common paths. The common path-mining process uses the frequent pattern-mining algorithm FP-growth [29] to mine for common sequences of states from P. Among these frequent sequences, the maximal sequences are used as common paths. Note that there could be more than one

TABLE II. EXAMPLE PATHS FOR A SCENARIO

during a state transition. $P_4$ represents the case when a path is similar, but a state is different from the ideal case. This could happen when an event in a state (i.e., $S_2$) does not occur due to the variation in the timestamp, which results in a different state (i.e., $S_{11}$). $P_5$ represents an error path. In the error path, no sequences match the ultimate common path.
For this example, $G = \{P_1, P_2, P_3, P_4, P_5\}$. If the minimum support threshold is set to 60%, the output of the FP-growth algorithm for G is a set of frequent sequences which meet the minimum support threshold, including $\{S_1, S_2, S_3, S_4, S_5\}$ and $\{S_1, S_3, S_4, S_5\}$. For this example, $\{S_1, S_2, S_3, S_4, S_5\}$ is maximal and is therefore the common path. The sequence $\{S_1, S_3, S_4, S_5\}$ is not maximal because it is contained in $\{S_1, S_2, S_3, S_4, S_5\}$. If the minimum support threshold is changed to 70%, the maximal frequent sequence will be $\{S_1, S_3, S_4, S_5\}$. Since only one sequence meets the threshold, it is maximal.
mon paths. Note that there could be more than one Algorithm 2. Classification Using Common Paths, cp
common path for a scenario. Input: PUT (path under test)
Output: C (Class)
A sequence α is a subset of a path, i.e., α ⊆ P. Sequence 1: For each common path, cpi , in cp:
α is denoted by {Si+1 , Si+2 , . . . , Si+m }. A path P contains 2: If cpi ⊆ PUT:
sequence α if all of the elements in α appear in P in the same 3: Add cpi to CCP (list of candidate common paths)
order. In a set of sequences, a sequence α is maximal if α is not 4: Filter CCP for cpi with maximal length
contained in any other sequences. 5: If size(CCP) == 1
Let G be the set of all observed paths for a scenario Q, so that 6: Return class = look-up class of CCP0
G = {P1 , P2 , . . . , Pn } where n is the number of observed 7: Else return class = unknown.
paths for Q. A path supports sequence α if the sequence is
contained in the path. The number of paths that contain the The common path is used as a signature during classification.
sequence α is defined as support count. Given the support count Changing the minimum support threshold changes the number
for the sequence α and the total number of paths in G, the sup- of states in a common path and can affect classification accu-
port for the sequence α can be defined as the support count racy. It is not necessary to find a common path which matches
divided by the total number of paths in G. the ideal path, rather the goal is to find a common path which is
A sequence whose support is greater than a minimum unique for a scenario and which leads to maximum classifica-
support threshold is called a frequent sequence. A common tion accuracy. For a noisy system, a shorter common path may
path for scenario Q is a frequent sequence whose support is yield better classification results.
greater than a minimum support threshold and is maximal. Common paths are signatures which can be compared to
There may be multiple common paths for a single scenario. compressed paths for classification. Algorithm 2 shows the pro-
Common paths reflect the states that occur most frequently for a cess for classifying a single PUT. Algorithm 2 can be used
scenario. for real-time classification as shown in Algorithm 3. The while
Table II provides examples of different paths for one sce- loop in Algorithm 3 executes at the frequency of the sensor with
nario. Each path is mined from a measured event database. T the highest sample rate. The merge raw data step in Algorithm 1
represents the timestamps for states. P1 represents the ideal is not needed in for real-time processing since the value of all
case for a path representing a scenario. P2 matches P1 , except sensors can be read in each loop iteration. The steps {collect
that a subset of states is delayed. This may occur due to times- raw data, quantize data, and map to state} are the same as the
tamp variation in events or due to system dynamics. P3 contains steps of the same name in Algorithm 1. The function call class
an extra state. Dynamics may occur when a feature oscillates (PUT) in Algorithm 3 refers to calling Algorithm 2.
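Steps 4 and 5 of Algorithm 1 (mapping to states and compressing into paths) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the row layout, the names `StateDatabase`, `to_uncompressed_path`, and `compress`, and the decision to key states on quantized features only (ignoring the timestamp) are all assumptions made for the example. The current thresholds follow the low/normal/high ranges used in the case study, with boundary handling chosen here for convenience.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical row layout: (timestamp, I_R1, I_R2, R1 tripped, R2 tripped).
Row = Tuple[float, float, float, bool, bool]


def quantize_current(amps: float) -> str:
    """Quantize a current magnitude; exact boundary handling is an assumption."""
    if amps < 100:
        return "low"
    if amps < 1200:
        return "normal"
    return "high"


class StateDatabase:
    """Assigns one integer index to each unique quantized feature tuple."""

    def __init__(self) -> None:
        self._index: Dict[Tuple[str, ...], int] = {}

    def state_id(self, features: Tuple[str, ...]) -> int:
        # Assumption: the timestamp is not part of a state's identity.
        if features not in self._index:
            self._index[features] = len(self._index)
        return self._index[features]


def to_uncompressed_path(rows: Sequence[Row], db: StateDatabase) -> List[Tuple[float, int]]:
    """Step 4: map every merged row to (timestamp, state index)."""
    path = []
    for timestamp, i_r1, i_r2, r1_tripped, r2_tripped in rows:
        features = (
            quantize_current(i_r1),
            quantize_current(i_r2),
            "trip" if r1_tripped else "nontrip",
            "trip" if r2_tripped else "nontrip",
        )
        path.append((timestamp, db.state_id(features)))
    return path


def compress(path: Sequence[Tuple[float, int]]) -> List[int]:
    """Step 5: keep one instance of each run of repeated states."""
    compressed: List[int] = []
    for _, state in path:
        if not compressed or compressed[-1] != state:
            compressed.append(state)
    return compressed


# Made-up samples at 120 samples/s: normal, normal, fault current, trip, line open.
rows = [
    (0 / 120, 500.0, 480.0, False, False),
    (1 / 120, 505.0, 470.0, False, False),
    (2 / 120, 1500.0, 1450.0, False, False),
    (3 / 120, 1500.0, 1450.0, True, True),
    (4 / 120, 20.0, 15.0, True, True),
]
db = StateDatabase()
uncompressed = to_uncompressed_path(rows, db)
print([state for _, state in uncompressed])  # [0, 0, 1, 2, 3]
print(compress(uncompressed))                # [0, 1, 2, 3]
```

The uncompressed list has the same shape as Expression 7, and the compressed list corresponds to a path as in Expression 8.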
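The definitions of support, frequent sequence, and maximal sequence in Step 6 can be checked on a small example. The sketch below deliberately swaps FP-growth for a brute-force enumeration of ordered subsequences, which is only practical for short paths; it is meant to illustrate the definitions, not to reproduce the mining procedure used in the paper. The five paths are hypothetical stand-ins shaped like the cases described for Table II (ideal, delayed, extra state, one differing state, error path), and the threshold is applied as "support at least the minimum", which is an assumption.

```python
from itertools import combinations
from typing import List, Sequence, Set, Tuple

Path = Tuple[int, ...]


def contains(path: Sequence[int], seq: Sequence[int]) -> bool:
    """True if every element of seq appears in path in the same order."""
    remaining = iter(path)
    return all(state in remaining for state in seq)


def support(seq: Path, paths: Sequence[Path]) -> float:
    """Support count divided by the total number of paths."""
    return sum(contains(p, seq) for p in paths) / len(paths)


def frequent_sequences(paths: Sequence[Path], min_support: float) -> Set[Path]:
    """Brute-force stand-in for FP-growth: test every ordered subsequence."""
    candidates: Set[Path] = set()
    for p in paths:
        for length in range(1, len(p) + 1):
            for idx in combinations(range(len(p)), length):
                candidates.add(tuple(p[i] for i in idx))
    return {s for s in candidates if support(s, paths) >= min_support}


def common_paths(paths: Sequence[Path], min_support: float) -> List[Path]:
    """Frequent sequences that are not contained in another frequent sequence."""
    frequent = frequent_sequences(paths, min_support)
    return sorted(
        s for s in frequent
        if not any(s != t and contains(t, s) for t in frequent)
    )


# Hypothetical paths (state indices), not the contents of Table II.
G = [
    (1, 2, 3, 4, 5),     # ideal case
    (1, 2, 3, 4, 5),     # same states, some delayed
    (1, 2, 6, 3, 4, 5),  # extra state
    (1, 11, 3, 4, 5),    # one state differs
    (7, 8, 9),           # error path
]
print(common_paths(G, 0.6))  # [(1, 2, 3, 4, 5)]
print(common_paths(G, 0.7))  # [(1, 3, 4, 5)]
```

With these inputs the longer sequence is supported by three of five paths (60%) and the shorter one by four of five (80%), so raising the threshold from 60% to 70% shortens the common path, which is the behaviour the worked example in the text describes.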
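A possible Python rendering of Algorithm 2, together with the per-sample part of Algorithm 3, is sketched below. Several choices are assumptions rather than details from the paper: signatures are passed as hypothetical `(common_path, class_label)` pairs, an instance is reported as unknown when the longest matching common paths belong to more than one class (a reading of the unknown-rate definition, slightly looser than `size(CCP) == 1`), and steady-state detection is left out, so `classify_stream` only covers appending state changes to the PUT and calling the classifier.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple

Path = Tuple[int, ...]
Signature = Tuple[Path, str]  # (common path, class label); layout is an assumption
UNKNOWN = "unknown"


def contains(path: Sequence[int], seq: Sequence[int]) -> bool:
    """True if every element of seq appears in path in the same order."""
    remaining = iter(path)
    return all(state in remaining for state in seq)


def classify(put: Path, signatures: Sequence[Signature]) -> str:
    """Algorithm 2: keep matching common paths, filter for maximal length."""
    candidates = [(cp, label) for cp, label in signatures if contains(put, cp)]
    if not candidates:
        return UNKNOWN
    longest = max(len(cp) for cp, _ in candidates)
    labels = {label for cp, label in candidates if len(cp) == longest}
    # Assumption: ambiguity across classes, not across paths, yields unknown.
    return labels.pop() if len(labels) == 1 else UNKNOWN


def classify_stream(state_ids: Iterable[int], signatures: Sequence[Signature]) -> Iterator[str]:
    """Per-sample part of Algorithm 3 for already quantized and mapped states."""
    put: List[int] = []
    for state in state_ids:
        if not put or put[-1] != state:  # record state changes only
            put.append(state)
        yield classify(tuple(put), signatures)


# Hypothetical signatures: 0 = normal, 1 = high current, 2 = high current and
# tripped, 3 = low current and tripped, 4 = normal current and tripped.
signatures = [
    ((0, 1, 2, 3), "1LG fault"),
    ((0, 4, 3), "command injection"),
]
print(classify((0, 1, 2, 3), signatures))  # 1LG fault
print(classify((0, 4, 3), signatures))     # command injection
print(classify((0, 5), signatures))        # unknown
print(list(classify_stream([0, 0, 1, 1, 2, 3], signatures))[-1])  # 1LG fault
```

Classifying after every sample means early outputs are usually unknown until enough of the event has been observed; a deployment would also need the steady-state test described in the text to decide when to reset the PUT.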

Algorithm 3. Real-time Classification Using Common Paths
Input: Real-time raw data
Output: Classified scenario type
1: While (true)
2:     While (state != steady_state)
3:         Collect raw data sample
4:         Quantize data
5:         Map to state
6:         If (state_i != state_{i−1})
7:             Add state_i to path PUT
8:         class = class(PUT)

Algorithm 3 can be implemented in a daemon to monitor a power system in real time. The definition of steady state will vary by system and can be measured by a lack of state change over a user-defined period of time.

For the proof of concept described later in this paper, the common path-mining algorithm was verified by collecting raw data in advance and using Algorithm 2 to classify paths.

The rest of this paper presents a case study that applies the mining common path algorithm to a three-bus two-line transmission system for classifying four types of power-system symmetric and unsymmetrical faults and three cyber-attack scenarios.

IV. POWER-SYSTEM TEST BED

A real-world power system is dynamic and consists of thousands of buses, loads, transmission lines, and other components. The power-system operation goes through various states and is a continuous process. The three-bus two-line transmission system used in this work is modified from the IEEE nine-bus three-generator system [31] according to our simulation requirements. Although this system is relatively small, it captures the essence of the larger power system and is small enough to be comprehensible in every detail. Multiple instances of the classifier proposed in this work would be deployed to monitor sections of a power system. The case study system uses commercial PMUs and relays from two major vendors. The test bed and datasets exhibit behaviors of a real power system, yet fit into the resources available in the lab in terms of hardware and software limitations. Because the three-bus two-line transmission system is capable of varying generation from two sources, varying load, simulating faults on two transmission lines at locations with 1% increments, simulating loss of a transmission line due to control action or fault, and simulating multiple cyber-attacks, it is adequate for proof of concept of this work. The transmission system used for HIL simulation for this work is shown in Fig. 3.

Fig. 3. Three-bus two-line transmission system for case study.

A. Simulated Scenarios

The power-system disturbances and three types of attacks simulated for this work are described as follows.

1) Power-System Faults: In this work, we consider symmetric and unsymmetrical faults in a power system as the examples of disturbances. A power-system fault is a condition where the system voltage, current, and frequency are abnormal. Typically, 1LG faults, double-line-to-ground (2LG) faults, three-line-to-ground (3LG) faults, and line-to-line (LL) faults represent greater than 95% of faults in a power system [31]. In this work, for proof of concept, we simulated a phase-a-to-ground fault for 1LG faults, a phase-a-b-to-ground fault for 2LG faults, a phase-a-b-c-to-ground fault for 3LG faults, and a phase-a-to-b fault for LL faults.

2) Trip Command Injection Attack: Trip command injection attacks create contingencies by remotely sending unexpected relay trip commands from an attacker's computer to relays at the ends of a transmission line. The trip command injection attack used for this work closely mimics the 1LG fault. The attack was implemented against relays R1 and R2 by replaying relay trip commands captured from Modbus over Transmission Control Protocol (TCP) network traffic. However, we assume that these commands are sent from a compromised legitimate computer, such that these commands cannot be detected by a network event monitor as attacks since they are from a valid source and have valid formats. The two relay trip commands open the breakers at the ends of transmission line L1. This attack stresses the system by forcing L2 to carry more power flow, which may cause cascading failures in a power system. However, for this work, cascading failures were not simulated. The trip command injection attack instances were created under random load conditions in the same range used for faults.

3) Aurora Attack: The Aurora vulnerability refers to potential harm caused to a generator by intentionally opening and closing a breaker near the generator in rapid succession [33]. In this work, an Aurora cyber-attack was simulated, which periodically sends opening–closing commands to relays that cause the breaker on the transmission line to open and close at a very fast pace.

4) 1LG Fault Replay Attack: The 1LG fault replay attack attempts to emulate a valid fault by altering system measurements to mimic a 1LG fault followed by sending an illicit trip command from a compromised computer to relays at the ends of the transmission line. This attack may lead to confusion and potentially cause an operator to take invalid control actions. A Python script is used to initiate an MITM attack between the hardware PDC and the historian. The attack replays synchrophasor measurements from a valid 1LG fault and then replays commands to trip the relays on the affected line.

B. Test Bed Architecture

The HIL test bed shown in Fig. 4 was used to simulate the distance protection scheme on the three-bus two-line
transmission system and implement the fault and cyber-attack scenarios. The RTDS was used to simulate transmission lines, breakers, generators, and load. Four physical relays were wired to the RTDS in a HIL configuration. The relays implemented a two-zone distance protection scheme. The relays trip and open the breakers once a fault occurs on a transmission line. Fault logics for different types of faults were created in RSCAD and then the faults were implemented in the RTDS. Prior to each implementation of a fault, the system load was randomized in the range of 200–399 MW. Each fault instance was implemented at a random location in 1% increments from 10% to 90% of line L1.

The relays used in this work are the GE-D60 and SEL-421. Both are digital relays with integrated PMU functionality. However, PMUs and relays were drawn separately in Fig. 4. The PMUs stream real-time synchrophasor measurement data, using the IEEE C37.118 protocol at a rate of 120 samples/s, to the PDC. Then, aggregated synchrophasor data are forwarded to the OpenPDC software. The electrical parameters from RTDS simulation and PMU measured values were compared. The current transformer (CT) and potential transformer (PT) ratios of the simulated power-system model and of the actual hardware PMUs and the scaling factors of I/O components were adjusted to make the output from the RTDS simulation and PMU measurements close to identical. Validation of the HIL configuration required two steps. First, the power-system model described in [34] was implemented as a baseline. The RTDS simulation, PMU, and PDC voltage, current, and frequency were compared for dynamic and steady-state conditions described in [34]. Simulated and measured voltage, current, and frequency results matched the values noted in [34]. Second, the baseline system was modified to create the three-bus two-line transmission system without altering the external hardware configuration. After altering the power-system model, the RTDS, PMU, and PDC voltage, current, and frequency continued to match. Fig. 5 shows overlapping voltage magnitude from the RTDS simulation and PMU. The voltages seen in simulation and at the PMU are the same throughout the simulated events. Current and frequency plots, as well as PDC measurements, also match but are not shown in the figure to save space.

Fig. 4. Hardware in the loop test bed.

Fig. 5. Comparison of voltage measured by PMU and RTDS.

A Python script processes the synchrophasor measurement data received by OpenPDC into a comma-separated values (CSV) file for each instance of a scenario. A row in the CSV file includes readings of frequency, current phasors, voltage phasors, and sequence components from the four PMUs, and a timestamp. Each CSV file is labeled with the instance number, scenario name, as well as load ranges and/or fault location at the moment the instance of the scenario occurs. The label is useful for grouping instances as will be discussed in Section V. The label is also used for training and classifier testing. The four relays were sources of timestamped relay state changes. There is also a network event monitor that logs any trip command packets sent to relays. All logs and synchrophasor measurement CSV files were stored by a historian.

For this work, simulation of all scenarios starts from a stable state and ends at a stable state. Faults last for 1 s and the relay closes the breaker 2 s after opening. Also, the distance protection scheme was simplified by disabling reverse time-delay backup and limiting the number of protection zones for each relay to 2. Each relay provides primary protection up to 80% of the line (Zone 1 protection) and backup protection (Zone 2 protection) up to 150% of the line. The trip time for Zone 1 protection is set to instantaneous, while the trip time for the Zone 2 protection is set to 20 cycles.

C. Test Data and Data Preprocessing

In total, 1023 instances of 1LG faults, 274 instances of 2LG faults, 584 instances of 3LG faults, 272 instances of LL faults, 274 instances of command injection attacks, 225 instances of Aurora attack, and 703 instances of 1LG fault replay attack were simulated. Test data consist of the synchrophasor measurement CSV files, the four relay logs, and network event-monitor logs collected during all of these scenarios. The relay log that contains timestamp and corresponding event information (trip or nontrip) was extracted from the relays. The network event-monitor log contains timestamp and corresponding network events (trip command seen or not seen). Each CSV file contains tuples with 52 synchrophasor measurements as each of the four PMUs provides 13 measurements including voltage and current phasor magnitude ($V_a$, $V_b$, $V_c$ and $I_a$, $I_b$, $I_c$), zero, positive, and negative sequence voltage and current phasor magnitude ($V_0$, $V_1$, $V_2$ and $I_0$, $I_1$, $I_2$), and apparent line impedance ($Z$). A single CSV file has approximately 2000 tuples for an instance of a single scenario. Since the PMU streams at 120 samples/s, 2000 tuples correspond to 17 s of simulated system time per scenario. The test data were separated into training and testing datasets, each of which was the input to common path mining and classifier algorithms described in the previous section.

TABLE III
CONFUSION MATRIX FOR EXPERIMENT 1
Rather than using all recorded input features from the dataset, only a portion of measurements was retained as selected features. In this work, the selected features contain relay status and the three-phase current magnitudes ($I_a$, $I_b$, $I_c$). Relay status was used as a feature because all cyber-attacks studied in this work maliciously trip relays via the network. The network event-monitor log was selected as one of the features for the same reason. The three-phase current magnitudes were selected because the current magnitudes of the three phases were the most significant measurements during symmetric and unsymmetrical faults. Other unselected measurements were discarded from the input data.

The measurement data from the PMU and relay log were merged into a single file for one instance of a scenario. The PMU current magnitude measurements were measured at 120 samples/s, while relay status occurs asynchronously. To merge the features, phase current was chosen as a reference and the relay status was up-sampled prior to merging.

Each feature was quantized into finite ranges. The phase currents were quantized into low, normal, and high ranges. The low range was 0–99 Amperes (A). The normal range was 100–1199 A. The high range was greater than 1200 A. The relay status was quantized into two values: 1) tripped; and 2) nontripped.

The aggregated features describe the system state at a given timestamp. A system state thus is a vector of a timestamp and features with quantized measurements. An example of a state that describes relays R1 and R2 tripping due to high current magnitude can be represented as a vector (Timestamp, $I_{R1}$ = High, $I_{R2}$ = High, R1 = Trip, R2 = Trip, ...), where "$I_{R1}$ = High" and "$I_{R2}$ = High" in the vector represent high-current magnitudes measured by PMUs in R1 and R2. "R1 = Trip" and "R2 = Trip" in the vector represent relay trip status of the two relays. The time difference between two states is the same as that between two rows, which is the reciprocal of the synchrophasor measurement rate: 1/(120 samples/s) ≈ 8.3 milliseconds (ms). The timestamps of rows in the file were normalized by subtracting the time of the first row from all other rows. This causes all files for all scenario instances to start from time 0.

V. EVALUATION

Three experiments were performed to validate the common path-mining algorithm. Experiment 1 classifies two classes, 1LG fault and command injection attack. This was an initial proof of concept to show that the algorithm can distinguish a fault from a single attack intended to mimic the fault. Experiment 2 repeated Experiment 1 with the fault labels preprocessed into groups by fault location and system load. This was done to show that the common path-mining algorithm can learn unique paths for sequences of events with very small differences. In Experiment 3, four types of short-circuit faults and three types of cyber-attacks were simulated. This experiment demonstrates that the common path-mining algorithm can learn paths for larger sets of symmetric and asymmetric faults and multiple different cyber-attacks. To test for overfitting, captured data from Experiment 3 were used to test classifier accuracy using 10-round cross-validation. Experiment 3 was also repeated with varying PMU sample rates to show the effect of sample rate on classifier accuracy. For each experiment, the training phase that computes a set of common paths is described in Algorithm 1 in the previous section; the testing phase that classifies testing paths is described in Algorithm 2.

A. Experiment 1

For the first experiment, approximately half of the test data for 1LG fault and command injection attack was randomly chosen as a training dataset, while the rest was used as a testing dataset. This resulted in 519 instances of 1LG fault and 127 instances of the command injection attack, which were used for training. Table III is a confusion matrix from Experiment 1. For this work, accuracy, misclassification, and unknown rates were defined as follows. The accuracy rate is the percentage of instances correctly classified. The misclassification rate is the percentage of the instances of a class which were misclassified as another scenario. The unknown rate is the percentage of the instances of a scenario which were not classified as any scenario. Unknown instances either match no common paths or match more than one common path from more than one class.

For the first experiment, the overall classification accuracy was 95%. No instances were misclassified. A total of 5% of tested scenario instances were unknown. All unknown instances matched at least one fault and at least one command injection common path.

There were a total of 221 common paths found for the two scenarios: 203 for the 1LG fault scenario and 18 for the command injection scenario. This high number of paths results from the dynamic nature of the power system. Fig. 6 is a plot of the fault location, from the perspective of relay R1, versus relay trip times for relays R1 and R2. Fig. 6 clearly shows zone 1 and zone 2 trip boundaries for both relays. Additionally, Fig. 6 shows that the relay trip times vary with fault location especially in the fault location region from 24% to 79% of the transmission line. The large number of common paths for the 1LG fault injection scenario is primarily due to this variation. System behavior also varies as the system load changes.

Fig. 6. Relay trip time versus fault location for relays R1 and R2.

B. Experiment 2

Ideally, faults between 0% and 20% of the transmission line should have instant trip time for relay R1 and trip after 20 cycles for relay R2. Faults between 80% and 100% of the transmission line should trip after 20 cycles for relay R1 and instantly for relay R2. In the 21%–79% range, both relays should ideally trip instantly. Observed trip times match the ideal case for the 0%–20% and 80%–100% ranges. Note that the apparent impedance setting for zone 2 for relay R2 causes the zone 1-to-zone 2 transition to occur at approximately 23% of the line (77% of the line from relay R2's perspective) instead of at the expected 20% of the line (80% of the line from relay R2's perspective).

TABLE IV
CONFUSION MATRIX FOR EXPERIMENT 2
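The ideal trip behaviour just described can be written down as a small lookup. The function name and the use of whole cycles as the unit are illustrative choices, and the handling of the exact 20% and 80% boundaries follows the ranges as quoted in the text; the observed boundary near 23% is not modelled.

```python
from typing import Tuple

ZONE2_DELAY_CYCLES = 20  # Zone 2 trip time stated for the test bed


def ideal_trip_delay_cycles(fault_location_pct: float) -> Tuple[int, int]:
    """Return the ideal (R1, R2) trip delays in cycles.

    fault_location_pct is the fault position along the line as seen from
    relay R1 (0 = at R1, 100 = at R2). A delay of 0 means an instant trip.
    """
    if not 0 <= fault_location_pct <= 100:
        raise ValueError("fault location must be within 0-100% of the line")
    if fault_location_pct <= 20:
        return (0, ZONE2_DELAY_CYCLES)  # zone 1 for R1, zone 2 for R2
    if fault_location_pct >= 80:
        return (ZONE2_DELAY_CYCLES, 0)  # zone 2 for R1, zone 1 for R2
    return (0, 0)                       # zone 1 for both relays


print(ideal_trip_delay_cycles(10))  # (0, 20)
print(ideal_trip_delay_cycles(50))  # (0, 0)
print(ideal_trip_delay_cycles(85))  # (20, 0)
```

Comparing such an ideal table with measured trip times is one way to see where the observed zone boundary departs from the nominal setting.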
The trip times from 24% to 80% of the line are always instantaneous. Observed trip times tended to increase as the fault approached the zone 1 to zone 2 boundary points. To compensate for this observed behavior, the 1LG fault paths were grouped by fault location per the following groups: 10%–23%, 24%–29%, 30%–35%, 36%–40%, 41%–60%, 61%–65%, 66%–70%, 71%–80%, 81%–90%. Additionally, it was observed that trip times partially correlated with the system load. As a result, the 1LG fault class used in Experiment 1 was divided into multiple classes by fault location and load. Four load ranges were used: 200–249, 250–299, 300–349, and 350–399 MW. This subdivided the 1LG fault class into 9 × 4 = 36 subclasses.

The command injection attack class in Experiment 1 was also divided using four load ranges, which results in four command injection attack classes.

The extra step of subdividing the 1LG fault class and command injection attack results in a total of 40 classes. The training dataset and testing dataset in this experiment are the same as those used in Experiment 1.

Table IV is a confusion matrix for all scenarios for Experiment 2. As previously mentioned, the 1LG fault classes were divided by fault location and system load. To save space, the groups in the confusion matrix were combined to just show the fault location classes and one command injection class. An extra row (marked Unk. for unknown) was added to the confusion matrix to show instances of scenarios which were not classified.

Experiment 2 classification accuracy, misclassification, and unknown rates can be viewed from multiple perspectives. The overall accuracy rate for the groups shown in the confusion matrix was 87.6%. Misclassification and unknown rates for the same groups were 9.1% and 3.3%, respectively. From the confusion matrix, the majority of misclassification occurred when 1LG fault groups were classified as members of a neighboring or nearby fault group. The unknown cases are separated into unknown instances which resulted from an instance matching multiple fault common paths ("Unk. fault" in Table IV) and unknown instances which matched no common path. The 16 cases of faults which matched common paths from more than one group all occurred because both the 30%–35% and the 36%–40% groups shared a common path.

The intent of subdividing the 1LG fault class was not to classify 1LG faults by a specific fault location. Correctly classifying a fault as a fault is sufficient as many algorithms are available to provide fault location information. The accuracy rate when the fault location classes were combined into a single class is 96.7%. The misclassification rate was 0% and the unknown rate was 3.3%.

Common paths can be mapped into 2-D coordinates with the Y-axis indicating the state identification code (state ID) and the X-axis indicating normalized timestamps. An edge between two vertices represents the temporal transition between two states. Each vertex is marked with state information. Fig. 7 shows common paths for two scenarios, a 1LG fault in the 36%–40% fault location group and a command injection attack. Both the fault and command injection common paths start at the system normal state. These paths differ immediately because, for faults, the PMU will measure high current when a fault is present. This makes the second state of the fault common path high current detected at relay R1. The command injection attack occurs when there is no fault present. As such, the second state for the command injection attack has normal current at both relays, while both relays' status indicates a trip.

Fig. 7. 2-D coordinates documenting 1LG fault versus command injection attack common paths.

Fig. 8 shows common paths for two different 1LG fault locations. Note that not all features are displayed in the vertex labels. The 10%–23% fault is in relay R2 zone 2 and the 24%–29% fault is in relay R2 zone 1. This difference is the primary reason for different paths for the two fault subgroups.

Fig. 8. 2-D coordinates comparing two common paths for 1LG faults of different locations.

TABLE V
CONFUSION MATRIX FOR FOUR TYPES OF FAULTS AND THREE CYBER-ATTACKS
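The subclass labelling used in Experiment 2 amounts to a pair of range lookups. The sketch below is one way to express it, assuming integer fault locations and loads and four contiguous 50 MW load ranges (200–249, 250–299, 300–349, 350–399 MW); the label strings and function names are made up for illustration.

```python
from typing import List, Sequence, Tuple

Range = Tuple[int, int]  # inclusive (low, high)

# Fault-location groups (% of line L1) as listed for Experiment 2.
FAULT_LOCATION_GROUPS: List[Range] = [
    (10, 23), (24, 29), (30, 35), (36, 40), (41, 60),
    (61, 65), (66, 70), (71, 80), (81, 90),
]
# Load ranges in MW; contiguous 50 MW bins are an assumption.
LOAD_RANGES: List[Range] = [(200, 249), (250, 299), (300, 349), (350, 399)]


def find_range(value: int, ranges: Sequence[Range]) -> Range:
    """Return the inclusive range containing value, or raise ValueError."""
    for low, high in ranges:
        if low <= value <= high:
            return (low, high)
    raise ValueError(f"{value} is outside all configured ranges")


def fault_subclass(fault_location_pct: int, load_mw: int) -> str:
    loc = find_range(fault_location_pct, FAULT_LOCATION_GROUPS)
    load = find_range(load_mw, LOAD_RANGES)
    return f"1LG_loc{loc[0]}-{loc[1]}_load{load[0]}-{load[1]}"


def command_injection_subclass(load_mw: int) -> str:
    load = find_range(load_mw, LOAD_RANGES)
    return f"cmd_injection_load{load[0]}-{load[1]}"


def all_subclasses() -> List[str]:
    faults = [fault_subclass(loc[0], load[0])
              for loc in FAULT_LOCATION_GROUPS for load in LOAD_RANGES]
    attacks = [command_injection_subclass(load[0]) for load in LOAD_RANGES]
    return faults + attacks


print(fault_subclass(37, 260))  # 1LG_loc36-40_load250-299
print(len(all_subclasses()))    # 40
```

Nine location groups times four load ranges give 36 fault subclasses, and the four command injection subclasses bring the total to 40, matching the class count stated for Experiment 2.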
Figs. 7 and 8 demonstrate that common paths contain the critical states for different scenarios. The primary contribution of the common path-mining algorithm is the ability to automatically learn unique paths for each scenario type from data.

Training and testing processing time and memory usage were measured using an Ubuntu Linux Virtual Machine with 3.5 GHz CPU and 2 GB memory. For Experiment 1, training required 202 s and 25.3-MB memory with approximately 2.5-GB time-synchronized data. Experiment 1 testing required 0.85 s per scenario instance. For Experiment 2, training required 205 s and 25.2-MB memory for the same amount of training data. Experiment 2 testing required 0.83 s per scenario instance. The difference in classification time between the two experiments is likely due to host computer load varying between experiments. Classification is a pattern-matching exercise similar to other pattern-matching technologies such as virus scanners or rule-based network intrusion detection systems. The classification testing for this work was not optimized for actual use in a real system. To build a real-time classifier, a program would be required to collect raw data samples, quantize data, and map data to states (steps 3–5 of Algorithm 3). The instances used for testing in this work each required approximately 17 s of wall clock time to occur in the system. For a synchrophasor system with 120 samples/s, there are 8.3 ms between samples. This time could be utilized to process samples and perform the comparison, and a decision tree architecture could be used to facilitate fast pattern matching.

C. Experiment 3

A third experiment was conducted for classifying four types of symmetric and unsymmetrical faults and three types of cyber-attacks. The training phase used the same methodology as Experiments 1 and 2. Validation in this experiment used 10-round cross-validation. In each round, half of the test data was randomly chosen as a training dataset and the remaining data were used as the testing dataset. Table V is a combined confusion matrix for 10 rounds of validation for the 1LG, 2LG, 3LG, LL faults, command injection, Aurora, and fault replay attacks. Each entry in the table sums up numbers for 10 rounds in the corresponding location.

The total number of classifications made in Table V is 16 885, of which 15 740 instances are correctly classified. The average accuracy for the seven classes shown in Table V is 93.21%. Only 488 instances of faults (177 of 1LG fault, 58 of 2LG fault, 15 of 3LG fault, and 238 of LL fault) were classified as unknown, and only six instances of faults were misclassified as cyber-attacks. The lowest accuracy for an individual class or scenario type was for fault replay attacks. Fault replay attack classification accuracy was 90%. Fault replay attacks were misclassified as a fault for 3.6% of the tested instances and misclassified as a command injection attack for 3.5% of tested instances. The fault replay attack is intended to mimic a 1LG fault and as such is sometimes able to confuse the classifier. The fault replay includes elements from the command injection attack. This leads to similarities which cause occasional misclassification as a command injection attack. Table V demonstrates that the classifier is able to distinguish faults and cyber-attacks.

The accuracy rate for 10-round validation when the PMU sample rate is 20, 30, 60, and 120 Hertz (Hz) is plotted in Fig. 9. Classification accuracy is highest when the PMU is streaming at 120 Hz and lowest at 20 Hz. However, even at 20 Hz, accuracy exceeds 80%. This is reasonable as higher PMU sample rates give better visibility of the system states when fast-moving events, such as faults, are considered.

Fig. 9. Accuracy rates for 10-round cross-validation with different PMU streaming rates.

Table VI shows a comparison of classifier results from the Random Forest, JRip, Adaboost + JRip, and common path-mining algorithms. The values for Random Forest, JRip, and Adaboost + JRip are from the work described in [11] which
used datasets derived from the same test bed as this common path-mining algorithm evaluation for training and testing. Among the four algorithms, Adaboost + JRip has the highest value in four metrics: accuracy, precision, recall, and F measure. Accuracy, precision, recall, and F measure for the common path-mining algorithm were computed from the results for Experiment 3 of this work. Note that the mining common path classifier uses seven classes and has similar performance to the Adaboost + JRip classifier with only three classes. The Random Forest, JRip, and Adaboost + JRip classifiers used the following classes: normal behavior, attack events, and natural events. The mining common path classifier classes were three separate attacks and four types of faults (natural events).

TABLE VI
COMPARISON OF COMMON PATH MINING TO OTHER ALGORITHMS

the 1LG fault class is divided into a number of subclasses by taking advantage of power-system domain expertise. The extra step of subdividing classes in training produces slightly better accuracy, misclassification, and unknown rates. Both experiments required similar training time, testing time, and memory usage. A third experiment was conducted using the same training as Experiment 2. Ten-round cross-validation was performed with varying PMU sample rates. The 10-round validation shows that the classifier has not overfit the data. Comparison of varying PMU sample rates shows that the highest accuracy is achieved with the PMU sampled at 120 Hz. This is expected since faults are fast-moving events and the 120-Hz sample rate provides the most visibility of system-state changes.

This paper demonstrates a methodology to leverage synchrophasor measurements for power-system disturbance and cyber-attack detection and highlights the promise of the mining common paths algorithm. Future work includes applying the algorithm to larger systems with more known types of disturbances, control actions, and cyber-attacks.

The common path-mining algorithm was evaluated using a three-bus two-line transmission system. It is possible to scale up for larger systems by sampling system state from larger portions of a power system. Training and classification time will
The ability to make a precise classification versus a broad cat- increase linearly as the number of tuples in a sample increases
egory while maintaining high accuracy makes the common based on the property FP-growth algorithm [30]. This leads
path-mining algorithm promising. Such precise classification to an effective limit on the number of measured tuples used
is necessary to quickly understand the root cause of events to for one instance of the classifier. When this limit is reached,
enable automated response. different portions of a power system can be monitored by sep-
arate instances of the classifier. Using multiple instances of
the classifier leads to two potential future works. First, classi-
VI. C ONCLUSION AND F UTURE W ORK
fiers will have overlapping visibility. As such, a method will
The common path-mining algorithm creates common paths be needed to rationalize results from overlapping classifiers.
from heterogenerous data in the power system. A common path Second, a partitioning scheme is needed to determine classifier
represents a set of critical states in which a system will step boundaries.
through in temporal order for a scenario such as a disturbance
or a cyber-attack. Common paths can be used as signatures R EFERENCES
to classify power-system behaviors with high specificity. Such
[1] H. Polzin and B. McMillan. (2012, Apr.). Arizona-Southern California
a classifier is a useful tool for use with automated system Outages on September 8, 2011. Causes and Recommendations,
integrity protection systems and wide-area control systems, Federal Energy Regulatory Commission and the North American
which include responses for both natural, equipment failure, Electric Reliability Corporation, Washington, DC, USA [Online].
Available: http://www.nerc.com/fileUploads/File/News/AZOutage_
and cyber-attack-related contingiencies. Report_01MAY12.pdf
Simple paths can be derived from monitored instances of [2] N. Falliere, L. O’Murchu, and E. Chien, “W32. Stuxnet Dossier, V 1.4,”
sceanrios applied to a test bed. However, the transients present Symantec Corp., Mountain View, CA, USA, Tech. Rep. MS10-046, 2011
[Online]. Available: http://www.symantec.com/content/en/us/enterprise/
in time-domain measurement data lead to different paths for media/security_response/whitepapers/w32_stuxnet_dossier.pdf
different instances of the same scenario. The common path- [3] S. Ntalampiras, “Detection of integrity attacks in cyber-physical critical
mining algorithm uses a sequential pattern-mining approach to infrastructures using ensemble modeling,” IEEE Trans. Ind. Informat.,
vol. 11, no. 1, pp. 104–111, Nov. 2014.
overcome this challenge and common paths for the scenario. [4] H. Lin, Y. Deng, S. Shukla, J. Thorp, and L. Mili, “Cyber security impacts
To validate the correctness of the algorithm, a case study was on all-PMU state estimator—A case study on co-simulation platform
performed, which applied the common path-mining algorithm GECO,” in Proc. IEEE 3rd Int. Conf. Smart Grid Commun., Nov. 2012,
pp. 587–592.
and classifier to detect disturbances and cyber-attacks. The clas- [5] V. Madani et al., “IEEE PSRC report on global industry experiences with
sifier provides a capability to accurately distinguish between system integrity protection schemes (SIPS),” IEEE Trans. Power Del.,
different types of power-system faults and cyber-attacks includ- vol. 25, no. 4, pp. 2143–2155, Oct. 2010.
[6] V. Terzija et al., “Wide-area monitoring, protection, and control of
ing command injection, aurora attacks, and fault replay attacks. future electric power networks,” Proc. IEEE, vol. 99, no. 1, pp. 80–93,
Three separate experiments were performed. The first experi- Jan. 2011.
ment applied the common path-mining algorithm to data with [7] Y. Deng et al., “Communication network modeling and simulation
for Wide Area Measurement applications,” in Proc. IEEE PES Innov.
two classes: 1LG fault and command injection. The second Smart Grid Technol. (ISGT), Washington, DC, USA, Jan. 16–20, 2012,
experiment adds an extra step prior to the training phase where pp. 1–6.
[8] R. Amgai, J. Shi, and S. Abdelwahed, “An integrated lookahead control-based adaptive supervisory framework for autonomic power system applications,” Int. J. Elect. Power Energy Syst., vol. 63, pp. 824–835, 2014.
[9] A. Bose, “Smart transmission grid applications and their supporting infrastructure,” IEEE Trans. Smart Grid, vol. 1, no. 1, pp. 11–19, Jun. 2010.
[10] D. Bakken et al., “Smart generation and transmission with coherent, real-time data,” Proc. IEEE, vol. 99, no. 6, pp. 928–951, Jun. 2011.
[11] R. Borges et al., “Machine learning for power system disturbance and cyber-attack discrimination,” in Proc. 7th Int. Symp. Resilient Control Syst. (ISRCS), Aug. 2014, pp. 1–8.
[12] M. Al Karim, M. Chenine, K. Zhu, and L. Nordstrom, “Synchrophasor-based data mining for power system fault analysis,” in Proc. 3rd IEEE PES Int. Conf. Exhib. Innov. Smart Grid Technol. (ISGT Europe), Oct. 2012, pp. 1–8.
[13] N. Dahal, “Synchrophasor data mining for situational awareness in power systems,” Ph.D. dissertation, Dept. Elect. Comput. Eng., Mississippi State Univ., Starkville, MS, USA, 2012.
[14] P. K. Ray, S. R. Mohanty, N. Kishor, and J. P. S. Catalao, “Optimal feature and decision tree-based classification of power quality disturbances in distributed generation systems,” IEEE Trans. Sustain. Energy, vol. 5, no. 1, pp. 200–208, Jan. 2014.
[15] A. Rodriguez et al., “A Decision Tree and S-transform based approach for power quality disturbances classification,” in Proc. 4th Int. Conf. Power Eng. Energy Elect. Drives (POWERENG), May 2013, pp. 1093–1097.
[16] W. Gao and J. Ning, “Wavelet-based disturbance analysis for power system wide-area monitoring,” IEEE Trans. Smart Grid, vol. 2, no. 1, pp. 121–130, Mar. 2011.
[17] J. Upendar, C. P. Gupta, G. K. Singh, and G. Ramakrishna, “PSO and ANN-based fault classification for protective relaying,” IET Gener. Transmiss. Distrib., vol. 4, no. 10, pp. 1197–1212, Oct. 2010.
[18] B. Y. Vyas, B. Das, and R. P. Maheshwari, “Improved fault classification in series compensated transmission line: Comparative evaluation of Chebyshev neural network training algorithms,” IEEE Trans. Neural Netw. Learn. Syst., Oct. 2014 [Online]. Available: http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6920088&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D6920088
[19] W. Li, A. Monti, and F. Ponci, “Fault detection and classification in medium voltage DC shipboard power systems with wavelets and artificial neural networks,” IEEE Trans. Instrum. Meas., vol. 63, no. 11, pp. 2651–2665, Nov. 2014.
[20] K. Silva, B. Souza, and N. Brito, “Fault detection and classification in transmission lines based on wavelet transform and ANN,” IEEE Trans. Power Del., vol. 21, no. 4, pp. 2058–2063, Oct. 2006.
[21] R. Mitchell and I.-R. Chen, “Behavior-rule based intrusion detection systems for safety critical smart grid applications,” IEEE Trans. Smart Grid, vol. 4, no. 3, pp. 1254–1263, Sep. 2013.
[22] Y. Yang et al., “Intrusion detection system for network security in synchrophasor systems,” in Proc. IET Int. Conf. Inf. Commun. Technol., 2013, pp. 246–252.
[23] B. Sikdar and J. Chow, “Defending synchrophasor data networks against traffic analysis attacks,” IEEE Trans. Smart Grid, vol. 2, no. 4, pp. 819–826, Dec. 2011.
[24] A. Carcano et al., “A multidimensional critical state analysis for detecting intrusions in SCADA systems,” IEEE Trans. Ind. Informat., vol. 7, no. 2, pp. 179–186, May 2011.
[25] P. Nader, P. Honeine, and P. Beauseroy, “l_p-norms in one-class classification for intrusion detection in SCADA systems,” IEEE Trans. Ind. Informat., vol. 10, no. 4, pp. 2308–2317, Nov. 2014.
[26] R. Agrawal and R. Srikant, “Mining sequential patterns,” in Proc. 11th Int. Conf. Data Eng., Mar. 1995, pp. 3–14.
[27] F. Lin, S. Chen, S. Pan, and Y. Chen, “Mining time dependency patterns in clinical pathways,” Int. J. Med. Informat., vol. 62, no. 1, pp. 11–25, 2001.
[28] F. Lin, C. Chiu, and S. Wu, “Using Bayesian networks for discovering temporal-state transition patterns in hemodialysis,” in Proc. 35th Annu. Hawaii Int. Conf. Syst. Sci., Jan. 2002, pp. 1995–2002.
[29] J. Han, M. Kamber, and J. Pei, Data Mining Concepts and Techniques, 3rd ed. San Mateo, CA, USA: Morgan Kaufmann, 2012.
[30] J. Han, J. Pei, Y. Yin, and R. Mao, “Mining frequent patterns without candidate generation: a frequent-pattern tree approach,” Data Min. Knowl. Discovery, vol. 8, no. 1, pp. 53–87, Jan. 2004.
[31] P. Anderson, Analysis of Faulted Power Systems. Hoboken, NJ, USA: Wiley, 1995.
[32] H. Saadat, Power System Analysis, 3rd ed. Alexandria, VA, USA: PSA Publishing, 2010.
[33] M. Zeller, “Myth or reality—Does the Aurora vulnerability pose a risk to my generator?,” in Proc. 64th Annu. Conf. Protective Relay Eng., Apr. 11–14, 2011, pp. 130–136.
[34] H. Ferrer and E. Schweitzer, Modern Solutions for Protection, Control, and Monitoring of Electric Power Systems. Oregon, IL, USA: Quality Books Inc., 2010, pp. 57–104.

Shengyi Pan (S’12–M’14) received the B.Eng. degree in electronic information engineering from Fuzhou University, Fuzhou, China, in 2008; the M.Sc. degree in data communications from the University of Sheffield, Sheffield, U.K., in 2009; and the Ph.D. degree in electrical and computer engineering from Mississippi State University, Starkville, MS, USA, in 2014. From 2010 to 2014, he was a Research Assistant with the Department of Electrical and Computer Engineering, Mississippi State University, where his research focused on smart grid cyber security and data-driven intrusion detection technologies. He is currently a Software Engineer with MaxPoint Interactive Inc., Morrisville, NC, USA, for big data application development in Internet digital advertising. His research interests include smart grid technologies, cyber security, data mining, and big data technologies.

Thomas Morris (M’06–SM’08) received the B.S. degree in electrical engineering from Texas A&M University, College Station, TX, USA, in 1994, and the M.S. and Ph.D. degrees in computer engineering from Southern Methodist University, Dallas, TX, USA, in 2001 and 2008, respectively. He joined Mississippi State University, Starkville, MS, USA, in 2008. He currently serves as an Associate Professor of Electrical and Computer Engineering, Associate Director of the Distributed Analytics and Security Institute (DASI), and the Director of the Critical Infrastructure Protection Center (CIPC). His research interests include cyber security for power systems and industrial control systems.

Uttam Adhikari (S’11) received the B.S. degree in electrical engineering from Tribhuvan University, Kathmandu, Nepal, in 2005, and he is currently pursuing the Ph.D. degree in electrical and computer engineering at Mississippi State University, Starkville, MS, USA. His research interests include cyber-physical system modeling and simulation, wide-area measurement systems, data mining, and cyber security in smart grid.
