Delft University of Technology

Attack Graph Model for Cyber-Physical Power Systems Using Hybrid Deep Learning

Presekal, A.; Stefanov, Alexandru; Subramaniam Rajkumar, Vetrivel; Palensky, P.

DOI
10.1109/TSG.2023.3237011
Publication date
2023
Document Version
Final published version
Published in
IEEE Transactions on Smart Grid

Citation (APA)
Presekal, A., Stefanov, A., Subramaniam Rajkumar, V., & Palensky, P. (2023). Attack Graph Model for
Cyber-Physical Power Systems Using Hybrid Deep Learning. IEEE Transactions on Smart Grid, 14(5),
4007-4020. https://doi.org/10.1109/TSG.2023.3237011

IEEE TRANSACTIONS ON SMART GRID, VOL. 14, NO. 5, SEPTEMBER 2023 4007

Attack Graph Model for Cyber-Physical Power Systems Using Hybrid Deep Learning
Alfan Presekal, Member, IEEE, Alexandru Ştefanov, Member, IEEE, Vetrivel Subramaniam Rajkumar, Student Member, IEEE, and Peter Palensky, Senior Member, IEEE

Abstract—Electrical power grids are vulnerable to cyber attacks, as seen in Ukraine in 2015 and 2016. However, existing attack detection methods are limited. Most of them are based on power system measurement anomalies that occur when an attack is successfully executed at the later stages of the cyber kill chain. In contrast, the attacks on the Ukrainian power grid show the importance of system-wide, early-stage attack detection through communication-based anomalies. Therefore, in this paper, we propose a novel method for online cyber attack situational awareness that enhances the power grid resilience. It supports power system operators in the identification and localization of active attack locations in Operational Technology (OT) networks in near real-time. The proposed method employs a hybrid deep learning model of Graph Convolutional Long Short-Term Memory (GC-LSTM) and a deep convolutional network for time series classification-based anomaly detection. It is implemented as a combination of software defined networking, anomaly detection in communication throughput, and a novel attack graph model. Results indicate that the proposed method can identify active attack locations, e.g., within substations, control center, and wide area network, with an accuracy above 96%. Hence, it outperforms existing state-of-the-art deep learning-based time series classification methods.

Index Terms—Anomaly detection, cyber-physical system, graph neural network, network security, software defined networking, throughput, time series analysis.

Manuscript received 12 August 2022; revised 30 November 2022; accepted 4 January 2023. Date of publication 16 January 2023; date of current version 23 August 2023. This work was supported by the Designing Systems for Informed Resilience Engineering (DeSIRE) Program of the 4TU Center for Resilience Engineering (4TU.RE). Paper no. TSG-01168-2022. (Corresponding author: Alfan Presekal.)
The authors are with the Department of Electrical Sustainable Energy, Delft University of Technology, 2628 CD Delft, The Netherlands (e-mail: [email protected]; [email protected]; V.SubramaniamRajkumar@tudelft.nl; [email protected]).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TSG.2023.3237011.
Digital Object Identifier 10.1109/TSG.2023.3237011
1949-3053 © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: TU Delft Library. Downloaded on November 02,2023 at 12:40:52 UTC from IEEE Xplore. Restrictions apply.

NOMENCLATURE
$G$    Graph
$V$    Known vertices/nodes
$E$    Edges
$A_{i,j}$    Adjacency matrix, where $i$ and $j$ represent the node index numbers
$\tilde{A}$    Modified adjacency matrix, where $\tilde{A} = A + I$ (identity matrix)
$GCN_t^k$    Graph convolutional equation for each $k$ hop and time $t$
$k$    Number of neighbor hops in the graph
$W_{gcn}$    Weight of graph convolutional neural network
$\odot$    Hadamard product multiplication operator
$\{s_1, s_2, \ldots, s_n\} \in S$    $S$ as all observable substations, and each individual substation $s_n$
$X_t, X \in s_n$    Data of network traffic for each time $t$ for all nodes
$\{x_1, x_2, \ldots, x_n\} \in X$    Individual node traffic data
$i_t$    Input gate for long short-term memory
$f_t$    Forget gate for long short-term memory
$o_t$    Output gate for long short-term memory
$\tilde{c}_t$    Internal cell state for long short-term memory
$c_t$    Transferable state for long short-term memory
$h_t$    Hidden state for long short-term memory
$W_i, W_f, W_o, W_c, U_i, U_f, U_o, U_c$    Set of weights for long short-term memory
$b_i, b_f, b_o, b_c$    Set of biases for long short-term memory
$\sigma$    Sigmoid function
$\tanh$    Hyperbolic tangent function
$y_i^l$    Convolution operation output for each layer $l$ and element $i$
ReLU    Rectified linear unit function
$\sum^{m-1} w\, y_{(i)}^{l-1} + b$    Convolution operation for layer $l$ and element $i$ with filter size ($m$), weight ($w$), and bias ($b$)
$\Lambda$    Attack graph
$\{a_i, \bar{a}_i\} \in V \in \Lambda$    Normal nodes ($a_i$), anomalous nodes ($\bar{a}_i$)
$\{u_i\} \notin V;\ u_i \in \Lambda$    Unidentified nodes ($u_i$)
$Gmean$    Geometric mean function

List of Acronyms
CNN    Convolutional Neural Network
CPS    Cyber-Physical System
DDoS    Distributed Denial-of-Service
EI    Expected Improvement
FCN    Fully Convolutional Neural Network
GC-LSTM    Graph Convolutional Long Short-Term Memory
GCN    Graph Convolutional Network
GNN    Graph Neural Networks
IED    Intelligent Electronic Device
IT    Information Technology
LSTM    Long Short-Term Memory
MLP    Multi-Layer Perceptron
MU    Merging Unit
OT    Operational Technology
ROC    Receiver Operating Characteristic
RTU    Remote Terminal Unit
SCADA    Supervisory Control and Data Acquisition
SDN    Software Defined Networking
TDG    Traffic Dispersion Graph
TSC    Time Series Classification
WAN    Wide Area Network

I. INTRODUCTION AND RELATED WORK

Cyber attacks on power grids are high-impact and low-frequency disturbances with a wide range of consequences. These could include, but are not limited to, equipment damage, loss of load, and power system instability. In the worst-case scenario, cyber attacks and advanced persistent threats may cause system-wide cascading failures and a blackout. Therefore, cyber attacks on power grids are severe threats and have already been identified in the real world. For example, on December 23, 2015, a cyber attack was conducted on the power grid in Ukraine that resulted in a power outage, affecting 225,000 customers [1]. A more sophisticated cyber attack followed on December 17, 2016, resulting in a power outage in the distribution network, where 200 MW of load was left unsupplied [2]. The attackers employed several attack strategies and steps to achieve their objectives. These can be mapped with the seven stages of the cyber kill chain for an in-depth analysis of such an advanced persistent threat, i.e., reconnaissance, weaponization, delivery, exploitation, installation, command and control, and action on objectives [3] as depicted in Fig. 1. However, existing detection methods for cyber attacks on power grids are limited. Most of them are based on power system measurement anomalies that occur when an attack is successfully executed at the later stages of the cyber kill chain, e.g., false data injection [4], [5], [6], [7], [8], [9], [10], [11]. In contrast, in the aforementioned cyber attacks in Ukraine, the cyber kill chain lasted for more than six months between the reconnaissance and command and control stages. The latter caused power outages in a matter of minutes [1], [2], [12]. Hence, this highlights the urgency of timely early-stage attack detection through Information Technology-Operational Technology (IT-OT) system anomalies. Physical measurement-based anomaly detection is only valid for later stages in the cyber kill chain, i.e., command and control and actions on objectives. Therefore, in this research, we propose an early-stage anomaly detection method for OT systems. It is implemented in the control center to detect cyber attacks at the early stages of the cyber kill chain, based on throughput anomalies in OT communication traffic power system-wide.

Fig. 1. Cyber kill chain stages and impacts.

Cyber attack detection on power grids has been extensively studied in recent years. Nonetheless, the majority of the existing research is focused on the identification of cyber attacks on power grids under False Data Injection (FDI) attack scenarios. These scenarios focus on analyzing power system measurements to identify anomalies in power grids [4], [5], [6], [7], [8], [9], [10], [11]. However, in the real-world cyber attacks on power grids reported in [1], [2], [12], adversaries did not perform FDI attacks. Instead, in the early stages of the cyber kill chain, attackers targeted the IT-OT communications. Therefore, in this research, we omit power system measurements under FDI attack scenarios and focus on the OT communication traffic anomalies.

There are four major methods reported in the literature for power grid communication traffic anomaly detection, i.e., signature-based [13], sequence-based [14], rule-based [15], [16], [17], and machine learning-based [18], [19], [20]. Recent research shows that machine learning-based methods are gaining increased attention and provide superior performance for anomaly detection [21], [22], [23]. Therefore, in this work, we focus on machine learning-based communication traffic anomaly detection. Our proposed model is based on semi-supervised learning. It does not use signatures, sequences, or rules for detection and classification. The proposed model classifies OT network traffic into two categories, i.e., normal and anomalous, based on the network traffic throughput. Previous research in this area is discussed in [18], [20]. In [18], the authors used labeled communication packets from UNSW-NB15 and IDE2012/16 datasets as inputs to predict Distributed Denial-of-Service (DDoS) attacks. Meanwhile, in [20], the authors use traffic data logs from Snort to create a sequence-based anomaly detection technique. However, neither machine learning implementation uses traffic throughput data, which is our research focus. Furthermore, the vast majority of machine learning-based anomaly detection methods only focus on IT systems [21], [22], [23], [24]. Even though the IT and OT systems of a utility are integrated, the traffic characteristics are distinct. The network traffic in OT systems is generated from automated processes with deterministic and homogeneous behavior, whilst the IT system traffic consists of user-generated data with a stochastic behavior [25]. Hence, the implementation of traffic-based anomaly detection for OT systems is fundamentally different from that of IT systems.
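To make the throughput-based input concrete, the following minimal sketch shows one way a per-second throughput time series could be derived from captured packets. It is an illustration only, not the authors' implementation: the function name `throughput_series`, the `(timestamp_s, size_bytes)` packet format, and the convention of 1000 bytes per kilobyte are all assumptions made for this example.

```python
def throughput_series(packets, duration_s, bytes_per_kb=1000):
    """Bin (timestamp_s, size_bytes) records into per-second throughput in KB/s.

    Hypothetical helper: the packet format and the 1000-byte kilobyte
    are assumptions for illustration, not taken from the paper.
    """
    series = [0.0] * duration_s
    for timestamp_s, size_bytes in packets:
        # Ignore packets that fall outside the observation window.
        if 0 <= timestamp_s < duration_s:
            series[int(timestamp_s)] += size_bytes / bytes_per_kb
    return series


# Four packets observed at one (hypothetical) traffic observation point.
packets = [(0.1, 500), (0.7, 1500), (1.2, 1000), (3.9, 250)]
series = throughput_series(packets, duration_s=4)
print(series)
```

A series of this kind, one per observed node, is the sort of waveform a time series classifier would then label as normal or anomalous.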
Amongst the machine learning-based traffic anomaly detection methods, most recent works use deep learning models that provide a better performance [22], [26]. In [27], the authors propose a deep reinforcement learning-based method for traffic flow matching control. They focus on detection of DDoS attacks that systematically trigger considerable anomalies in traffic throughput. Therefore, this method is not suitable to detect infinitesimally small changes in OT network traffic throughput, e.g., caused by stealthy attacks [27]. In [28], the authors used a Convolutional Neural Network (CNN) for communication traffic classification. However, the CNN method cannot detect unknown cyber attacks because it depends on preliminary traffic data for the training. To address this gap, instead of using specific labeled data for each attack category, we use the quantitative anomaly. The quantitative anomaly detection uses the throughput of the OT communication traffic. The throughput is quantified as a time series to generate a unique waveform pattern as shown in [29], [30], [31]. Therefore, instead of classifying specific attack types or sequences, in this work we classify the time series traffic flow into two categories, i.e., normal and anomalous.

In other related work, time series-based anomaly detection and classification were studied in [32], [33], [34], [35]. The state-of-the-art Time Series Classification (TSC) methods are based on deep learning models, as described in [34], [35]. However, based on our experiments, they do not perform well in the detection of stealthy attacks due to infinitesimally small changes in the traffic throughput. Additionally, these methods do not perform well on imbalanced data, as indicated by their F1 and Geometric mean scores. Therefore, to address these challenges, we propose a novel hybrid deep learning model for anomaly detection in power grid OT network traffic. The hybrid model uses Graph Neural Networks (GNN), Long Short-Term Memory (LSTM), and CNN. It employs unsupervised learning to learn the complex behavior of OT network traffic throughput and supervised learning to classify the OT traffic.

GNN-based deep learning models have been implemented for various applications, e.g., residential load forecasting [36], detection of false data injection [37], road traffic prediction [38], and road traffic anomaly detection [39]. LSTM has been used to detect anomalies in Supervisory Control and Data Acquisition (SCADA) systems [40]. This method can detect anomalies based on temporal features of time series data. CNN has been proposed to detect anomalies in power system data [41]. It has advantages in learning spatial features and correlations of the datasets. In this research, we propose the application of a Graph-Convolutional Long Short-Term Memory (GC-LSTM) to preprocess the data of OT network traffic and generate traffic predictions. The output from the GC-LSTM is then used as an input for the CNN-based time-series classification. We generate an attack graph to identify in near real-time the active cyber attack locations in the power grid.

The attack graph provides topological information on the possible attack paths for a specific cyber attack on a given network. Hence, the attack graph is an important method to identify vulnerabilities in the system [42]. The knowledge about the attack path is also crucial to prevent and mitigate cyber attacks. At present, attack graphs are mostly constructed based on vulnerability information obtained from network elements [43], [44]. This type of attack graph is not flexible, because it heavily depends on system vulnerability data. However, in this research, we propose an alternative attack graph map generation model, based on the online traffic monitoring in the OT networks of power grids. This is made possible through the wide deployment of an emergent technology, i.e., Software Defined Networking (SDN). SDN is a networking paradigm based on network virtualization and segregation of data and control planes [45]. In the SDN architecture, as seen in Fig. 2, there are three abstraction layers present, i.e., data plane, control plane, and management plane. The data plane represents locations of conventional communication networks, while the control plane provides controllability over the data plane. Additionally, the management plane in SDN allows the deployment of network applications, e.g., attack graph model. Although SDN is an emergent paradigm in the field of computer networking, earlier research has investigated its implementation in cyber-physical power systems [46], [47], [48], [49], [50]. Earlier research has used SDN for anomaly detection based on traffic flow information [27], [51]. However, these works are not designed to detect anomalies triggered by cyber attacks in OT networks. In this research, we use SDN to monitor the network traffic in real-time, originating from the data plane of the OT Wide Area Network (WAN) for power systems. In summary, a critical examination of related state-of-the-art methods for communication traffic anomaly detection reveals the following. (1) Existing SDN applications for cyber-physical systems are not focused on cyber security of OT networks [27], [47], [48], [49], [50], [51]. (2) They are solely based on packet flow rules [51]. (3) They overlook the cyber kill chain and do not address any type of stealthy cyber attacks [27], [51].

Fig. 2. Abstraction layers of SDN architecture.

The scientific contributions of this paper are as follows:
1) To the best knowledge of the authors, we propose the first known SDN-based online cyber attack situational awareness

method, i.e., Cyber Resilient Grid (CyResGrid). It is specifically designed for anomaly detection using communication traffic throughput in OT networks for stealthy cyber attacks during the early stages of the cyber kill chain, e.g., network reconnaissance. Therefore, CyResGrid aids operators to locate and identify power system-wide cyber attacks in near real-time through an attack graph map.
2) We propose a hybrid deep learning model to classify the OT network traffic throughput as anomalous or normal. The model combines GC-LSTM and a deep convolutional network to detect OT network anomalies caused by cyber attacks. It outperforms existing state-of-the-art deep learning-based time series classifiers [34], [35], as indicated by Geometric mean and F1 scores. To achieve this, we use GC-LSTM for traffic normalization. Subsequently, to detect the anomaly, we design a deep convolutional network by tuning the hyperparameters through Bayesian optimization. Based on the network throughput monitoring and anomaly detection, we create an attack graph map of power system-wide cyber attacks, in near real-time.
3) As there is a strong need for synthetic Cyber-Physical System (CPS) datasets for research [52], we create the first synthetic dataset of OT communication traffic throughput, which is generated through a cyber-physical power system model. To the best of our knowledge, the majority of the existing datasets are not suitable for cyber security [53], [54], [55], [56], [57], [58], [59]. A cyber-physical system dataset was proposed in [60], [61] for intrusion detection. However, the OT traffic data is only in the form of signature-based logs without detailed traffic information [60], [61]. Therefore, in this research, we employ a CPS model of the power grid consisting of the physical system and associated OT communication networks. The model is used to co-simulate the power grid and OT network, from substations up to the control center. It also has cyber range capabilities to simulate various cyber attack scenarios. Based on this model, we generate a synthetic dataset of OT communication traffic throughput for cyber-physical power system operation under cyber attacks.

The paper is structured as follows. Section I is the introduction and Section II describes the methodology proposed in this paper, including the cyber-physical system model, Traffic Dispersion Graph (TDG), GC-LSTM, TSC for anomaly detection, and the attack graph model. Section III provides the experimental results. Section IV presents the conclusions and future work.

II. ANOMALY DETECTION AND ATTACK GRAPH MODEL

In this section, the proposed methods for anomaly detection and attack graph modeling are introduced. Furthermore, we also elaborate on the cyber-physical model that serves as the basis for the aforementioned methods. Fig. 3 summarizes the methodology of anomaly detection and attack graph creation. The method consists of four steps as follows.
Step 1: GC-LSTM training and TDG. The normal OT traffic is used to train the GC-LSTM model for traffic prediction. The process generates a trained GC-LSTM model. Additionally, the normal OT traffic is used to generate the OT network topology using a TDG.
Step 2: Deep CNN training. The trained GC-LSTM model is used to predict the OT network traffic. The prediction is then used to train a Deep Convolutional Neural Network for TSC. This process generates a trained Deep CNN model for OT traffic classification.
Step 3: Online node classification. This step monitors the online OT traffic as input for node classification. The trained GC-LSTM and Deep CNN are used sequentially to classify the nodes as normal or anomalous.
Step 4: Attack graph generation. The node classification results from step 3 in conjunction with OT graph data from step 1 are used to generate the attack graph visualization.
A more detailed explanation of the method in each step is provided in the following subsections.

Fig. 3. Attack graph creation using CyResGrid method.

A. Cyber-Physical System Model

Detailed CPS models are needed for research on cyber security of power grids. They are used to simulate the power systems along with their associated IT-OT communication

networks and cyber events. The state-of-the-art in smart grid modeling and simulations is discussed in [62], [63], [64], [65], [66], [67], [68]. Hence, as part of our CPS model, we perform a co-simulation of the power grid and IT-OT systems, as depicted in Fig. 4.

Fig. 4. Cyber-physical system model of the power grid with IT-OT communication networks.

The CPS model provides time-domain measurement data from substation bays, e.g., buses, lines, and generators, in the form of active and reactive power, voltage, and current measurements. All measurement data is then delivered from the substation to the control center via a WAN as SCADA telemetry. The SCADA data is also stored in local databases located in substations and the control center. For the cyber system, every node in the OT network is emulated using operating system-level virtualization. The network connectivity between substations, WAN, and control center is realized through network virtualization and SDN. With this configuration, the developed CPS architecture can model and simulate realistic OT network traffic for the power system.

The OT network is modeled based on custom functions for every device in the communication network. The measurement devices represent components, such as Merging Units (MUs), Remote Terminal Units (RTUs), and Intelligent Electronic Devices (IEDs). These devices perform data acquisition from the power grid, with a SCADA sampling rate of one sample per second. Legitimate control commands from the control center modify the set points for power grid controllers in real-time. For example, a control command can set a circuit breaker to open or close, or set values for the voltage and active power set points of generator automatic voltage regulators and governors. The measurement values and control set points are communicated across the OT network using Transmission Control Protocol/Internet Protocol (TCP/IP) packets.

The CPS model is integrated with SDN capability that creates network virtualization using virtual switches. Based on Fig. 4, the OT and IT networks are present in the data plane layer of the SDN. Meanwhile, the control and management planes are represented by the SDN controller. Network virtualization allows the SDN controller to monitor and control traffic and run custom network applications. Fig. 4 depicts how the SDN controller is applied to the typical SCADA architecture. SDN improves the OT network monitoring and control by collecting OT communication traffic reports in the control center. The traffic observation points are visualized as red squares, which are distributed across the substations and control center. Using these points, we observe real-time OT network traffic from the control center to detect traffic anomalies for each observation location and create a power system-wide attack graph.

B. Traffic Dispersion Graph

The TDG is an analytical model for communication traffic monitoring and analysis. The core idea for TDG is derived from the social behavior of hosts in a network [69]. Therefore, the flow of OT network traffic is analyzed based on the interactions between all hosts in the communication network. Based on this analysis, information related to communication sources and destinations is extracted. Furthermore, TDG represents nodal information using graph structures. Every host in a network is represented by a single node in a graph. On the other hand, communication between hosts is represented by connectivity between nodes, i.e., graph edges. Fig. 5 shows the TDG generation processes. Firstly, information on the IP address source and destination from flowing packets in

the network is in the collected information table. Information about the path between two IP addresses is added based on prior knowledge of the network topology. The information in the table is then used to create an individual flow graph. Finally, all individual graphs are converged into a dispersion graph, which provides an overall topology of the network.

Fig. 5. Traffic Dispersion Graph (TDG) processes.

The TDG has previously been used to analyze communication network patterns. For example, one study proposed an application of TDG for anomaly detection based on the degree distribution values of a graph [70]. In our research, the CyResGrid method uses TDG to generate graph structures of the power system OT network. This includes a graphical representation of the OT network topology between the control center and substations. The anomalous nodes in the graph are then detected based on OT network traffic anomalies. In our model, the CPS topology of a power grid possesses a tree-like network structure. Fig. 6 illustrates the TDG of the OT network that is used in our model, containing a total of 27 substations and one control center. Every substation consists of OT devices, e.g., MUs, IEDs, RTUs, etc., and a communication gateway, e.g., router/firewall, that communicates with the control center.

Fig. 6. Traffic dispersion graph of 27 substations.

In this research, the nodes represent traffic observation locations, while edges represent communication links between nodes. The traffic observation locations are situated in the Ethernet ports of virtual SDN switches that are directly connected to a host. All measurement data from each substation is sent to the control center via SCADA protocols, e.g., IEC 104 and DNP3. Thereby, this traffic flow allows the control center to gain a complete overview of the entire OT network. Using observation locations in the control center, the dispersion graph determines the nodes that actively communicate measurements. Also, the dispersion graph can determine unusual behavior, i.e., when a node is not sending measurement data or sending an abnormal quantity of traffic. In this research, anomaly detection works based on the total volume of observed network traffic, i.e., throughput, measured in KiloBytes per second (KBps). Furthermore, the dispersion graph can also identify unknown nodes with unidentified or unknown sources and destinations of IP Addresses or MAC Addresses.

C. Graph Convolutional Long Short-Term Memory

GC-LSTM aims to learn the traffic behavior of the OT network. Two machine learning models are applied in GC-LSTM, i.e., Graph Convolutional Network (GCN) and LSTM. GCN processes the OT network topological information expressed as a graph, along with localized features from neighboring communication nodes in the spatial domain. Subsequently, LSTM performs temporal learning based on time-series data of observed OT network traffic. The combination of GCN and LSTM has the advantage of learning from both the spatial and temporal domains. Various applications using graph-based spatial and temporal models were proposed in [36], [37], [38], [39]. In this research, we propose a novel method for nodal feature prediction based on communication network topology and features of neighboring nodes. CyResGrid proposes an innovative application of GC-LSTM to model the OT network traffic of the power system. It uses a hybrid combination of unsupervised and supervised models for OT traffic anomaly detection. The former is based on GC-LSTM, which learns the complex behavior of OT network data and topology. Subsequently, the GC-LSTM generates traffic predictions for the supervised TSCs. The OT traffic model is then integrated with deep convolutional network-based TSC to generate an attack graph based on observed anomalies in the communication network traffic.

The graph structure of the OT network topology serves as the main input for the GC-LSTM method. This graph structure is obtained from the TDG. It can be represented as $G = (V, E)$, where $G$ is the graph, $V$ represents the vertices/nodes, and $E$ represents the edges/links. The connection between the nodes in the graph is represented by the adjacency matrix $A$. Elements of the adjacency matrix are represented by $A_{i,j}$, where $i$ and $j$ represent the node index numbers, such that $A_{i,j} = 1$ when two nodes are connected, and $A_{i,j} = 0$ otherwise.

$$GCN_t^k \leftarrow W_{gcn} \odot \tilde{A}^k X_t \tag{1}$$
$$f_t = \sigma\left(W_f\, GCN_t^k + U_f h_{t-1} + b_f\right) \tag{2}$$
$$i_t = \sigma\left(W_i\, GCN_t^k + U_i h_{t-1} + b_i\right) \tag{3}$$
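As an illustration of the adjacency matrix definition and the graph convolution in (1), the sketch below builds a small adjacency matrix from source/destination flow records, forms $\tilde{A} = A + I$, and evaluates the $k$-hop product. It is a minimal sketch under stated assumptions rather than the authors' code: the helper names, the four-node topology, the throughput values, and the all-ones weight matrix are hypothetical, and (1) is read here as $(W_{gcn} \odot \tilde{A}^k) X_t$, which is one plausible grouping of the operators.

```python
import numpy as np


def tdg_adjacency(flows, nodes):
    """Build a symmetric 0/1 adjacency matrix from (source, destination) flows."""
    index = {name: i for i, name in enumerate(nodes)}
    A = np.zeros((len(nodes), len(nodes)))
    for src, dst in flows:
        i, j = index[src], index[dst]
        A[i, j] = A[j, i] = 1.0  # A_ij = 1 when two nodes are connected
    return A


def graph_convolution(A, X_t, W_gcn, k=2):
    """Assumed reading of (1): (W_gcn ⊙ Ã^k) X_t with Ã = A + I."""
    A_tilde = A + np.eye(A.shape[0])           # modified adjacency matrix
    A_k = np.linalg.matrix_power(A_tilde, k)   # k-hop neighborhood
    return (W_gcn * A_k) @ X_t                 # '*' is the Hadamard product


# Hypothetical tree-like topology: control center, gateway, two OT devices.
nodes = ["control_center", "gateway", "ied", "rtu"]
flows = [("gateway", "control_center"), ("ied", "gateway"), ("rtu", "gateway")]
A = tdg_adjacency(flows, nodes)

X_t = np.array([[40.0], [30.0], [10.0], [20.0]])  # assumed throughput per node at time t
W_gcn = np.ones_like(A)                           # untrained placeholder weights
gcn_t = graph_convolution(A, X_t, W_gcn, k=2)
print(gcn_t.ravel())
```

With placeholder weights of one, each output row is simply a 2-hop weighted sum of neighboring throughput values; in a trained model the entries of `W_gcn` would scale each neighbor's contribution.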


  
$$o_t = \sigma\left(W_o\, \mathrm{GCN}_t^k + (U_o h_{t-1}) + b_o\right) \tag{4}$$
$$\tilde{c}_t = \tanh\left(W_c\, \mathrm{GCN}_t^k + (U_c h_{t-1}) + b_c\right) \tag{5}$$
$$c_t = (f_t \odot c_{t-1}) + (i_t \odot \tilde{c}_t) \tag{6}$$
$$h_t = o_t \odot \tanh(c_t) \tag{7}$$

The GCN function is used to obtain the nodal features as described in (1). GCN operates based on the Hadamard product multiplication ($\odot$) of the weight matrix ($W_{gcn}$), adjacency matrix ($A$), and node features from the observed traffic data ($X_t$). The adjacency matrix captures information related to the OT network topology. The adjacency matrix ($A$) is added with the identity matrix ($I$) to form a modified adjacency matrix ($\tilde{A}$). The data set ($X_t$) is represented as a time series, where the equation considers the single time instant ($t$) and total number of time observations, $T$. The node feature matrix ($X$) contains individual nodal information ($x_i$), where the total number of nodes is represented by ($n$). The equation also considers the number of hops from a communication node to neighboring nodes, i.e., $k$ as an exponent of $\tilde{A}$, as explained in [38], [71]. In this research, the maximum number of hops between each substation and the control center is two, i.e., $k = 2$.

After obtaining the spatial features from the graph convolutional operation, LSTM is then used to analyze the temporal / time-series features. The LSTM functions and processes inside an LSTM cell are described in (2)-(7). There are six main sub-equations in the LSTM process, including the forget gate ($f_t$), input gate ($i_t$), output gate ($o_t$), internal cell state ($\tilde{c}_t$), transferable cell state ($c_t$), and hidden state ($h_t$). The previously calculated nodal features output ($\mathrm{GCN}_t^k$) serves as the input for the LSTM cell.

In this work, we consider each substation to have unique characteristics. Given the communication network traffic data from all nodes that are present in a substation as ($X$), Algorithm 1 describes how an independent process is performed for each substation to provide an independent set of GC-LSTM models for every substation ($s_i$). During the training process, this output is compared with the real OT traffic data ($X_{t+1}$) to update the weight values in GCN and LSTM. The final output of LSTM predicts the OT traffic in corresponding nodes represented by ($h_t$). This output serves as input for the TSC in the following stage.

Algorithm 1 CyResGrid Attack Graph Generation
Inputs: $S\{s_1, s_2, \ldots, s_n\}$; $X \in s_n$: Substations traffic data
        $\{x_1, x_2, \ldots, x_n\} \in X$: Nodes traffic data
Outputs: $\Lambda = \{\{a_i, \bar{a}_i \in V\}\}$: Nodes classification as attack graph
Iteration for each substation
1: for $s_i$ in $S$ do
2:   for $t = 1$ to $T$ do
       Traffic prediction
3:     $\mathrm{GCN}_t^k \leftarrow \left(W_{gcn} \odot \tilde{A}^k\right) X\{x_1, x_2, \ldots, x_n\}_t$
4:     $h_t, c_t = \mathrm{LSTM}(X\{x_1, x_2, \ldots, x_n\}_t, \mathrm{GCN}_t^k, h_{t-1}, c_{t-1})$
       Iteration for each node $a$ in $V$
5:     for $a$ in $V$
         Node classification
6:       $a_i = \sum^{m-1} w\, h_{t,(i)}^{l-1} + b$
7:     end for
8:   end for
9: end for
10: return: $\Lambda = \{\{a_i, \bar{a}_i \in V\}\}$

D. Time Series Anomaly Detection

TSC for anomaly detection was studied in [32], [33], [34], [35]. In this research, we propose a new method using TSC to detect anomalies in the OT communication network traffic throughput for power systems. As a benchmark, we focus on state-of-the-art deep learning-based anomaly detection techniques, i.e., ResNets [72], Inception [35], Fully Convolutional Neural Networks (FCN) [73], and Multi-Layer Perceptron (MLP) [74]. Meanwhile, in our research, we propose CyResGrid, a hybrid method for unsupervised and supervised OT traffic anomaly detection. The unsupervised learning application for time series data was studied in [75]. We specifically use an unsupervised GC-LSTM model to learn the complex behavior of OT network data and topology. Subsequently, the GC-LSTM generates traffic predictions as inputs to TSCs.

$$y_i^l = \mathrm{ReLU}\left(\sum^{m-1} w\, y_{(i)}^{l-1} + b\right) \tag{8}$$
$$x^* = \arg\max_x f(x) \tag{9}$$

We propose a supervised deep convolutional neural network for TSC-based anomaly detection. The deep convolutional network is based on a multi-layer one-dimensional convolution with the ReLU activation function, as shown in (8). In (8), we consider the number of layers ($l$), filter size ($m$), weight ($w$), and bias ($b$). This model is trained to optimize the performance of classification based on the previous GC-LSTM output. To formulate our hybrid deep learning model, we perform hyperparameter tuning based on the number of layers, filters, and kernel size. Bayesian optimization [76] is used to optimize the deep learning model. The objective function maximizes the deep learning performance as described in (9). Bayesian optimization works based on the surrogate model and acquisition function. The surrogate model is a Gaussian process that quantifies the uncertainty of the unobservable region. To achieve the optimum value of the objective function, we use the Expected Improvement (EI) as the acquisition function. Bayesian optimization performs iterations to obtain a function with the best performance. From the iterative process, we obtain the best performing deep convolutional network, which has 3 layers, 64 filters, and a kernel size of 3. Fig. 7 shows the architecture of the CyResGrid hybrid deep learning model, which consists of a GC-LSTM layer, three layers of convolutional neural network, and one layer of fully connected neural network (dense).

E. Attack Graph Model

An attack graph is a method to model CPS vulnerabilities and potential exploits. Since a successful exploit of a vulnerability may lead to a partial or even a total failure of the CPS,
an attack graph is an important tool for vulnerability analysis and mitigation strategies. Meanwhile, in a communication network, there are many hosts that may become vulnerable. As a result, the cyber security of the entire CPS cannot only rely on the security of a single host. Therefore, it is important to locate and identify all vulnerable nodes/hosts in a communication network as a set of potential threats in the CPS. Subsequently, in this research, we propose the observation and analysis of anomalous OT traffic behavior to detect nodes potentially compromised by cyber attacks. The information regarding anomalous nodes is then used to construct an online attack graph in near real-time for the entire OT network of the power grid.

Fig. 7. CyResGrid – hybrid deep learning model.

Algorithm 1 explains the process of attack graph generation. The OT network traffic ($X$) is the input for the algorithm. The network traffic from each substation ($X_n$) is used to predict the OT traffic using GC-LSTM. The GC-LSTM model provides a set of traffic predictions ($h_t$) as outputs. The output from the prediction is then used as input for the TSC-based CNN. The time series-based anomaly detection is performed for each node ($a$) in $V$. The classifier labels each node as anomalous or normal based on the input OT traffic prediction. This information is then used to construct the attack graph.

$$\Lambda = \{\{a_i, \bar{a}_i \in V\}\} \tag{10}$$
$$\Lambda = \{\{a_i, \bar{a}_i \in V\}, \{u_i \notin V\}\} \tag{11}$$

There are two types of attack graphs, as described through equations (10) and (11). The attack graph type I in (10) is constructed based on prior knowledge of the OT network topology and node classification results. Meanwhile, the attack graph type II in (11) considers unidentified nodes based on the TDG. There are two elements of attack graph ($\Lambda$) type I as indicated in (10), i.e., normal nodes ($a_i$) and anomalous nodes ($\bar{a}_i$). Both of the nodes are elements of the known nodes ($V$). In contrast, attack graph ($\Lambda$) type II as indicated in (11) contains one extra element of unidentified nodes ($u_i$). The unidentified nodes are considered as anomalous since these nodes are not elements of the known nodes ($V$).

Fig. 8 depicts an example comparison of attack graph representations of the OT network under normal network traffic conditions in Fig. 8(a) and anomalous traffic in Figs. 8(b) and 8(c). The anomalous network traffic conditions are determined based on observed abnormal node behavior shown in red. Subsequently, these nodes are combined to form an attack graph ($\Lambda$). There are three elements in the attack graph, i.e., normal nodes ($a_i$), anomalous nodes ($\bar{a}_i$), and unidentified nodes ($u_i$). The attack graph type I from Fig. 8(b) only classifies nodes as anomalous based on observed traffic from all known nodes. This notion is represented by a set of attack graphs ($\Lambda$) and described through (10). On the other hand, the attack graph type II in Fig. 8(c) also considers all unidentified nodes for the classification of anomalous behavior, as described in (11). The unidentified nodes ($u_i$) are determined based on unknown source or destination addresses obtained from the TDG. The unknown nodes ($u_i$) are assumed to indicate an active cyber attack, originating from an unlisted host in the known OT network ($V$).

III. EXPERIMENTAL RESULTS

A. Experimental Setting

All experiments in this paper are conducted using the previously discussed CPS model of the power grid represented in Fig. 4. The power system is simulated in real-time using a Root Mean Square (RMS) dynamic model of the IEEE 39-bus test system in DIgSILENT PowerFactory. The CPS model employs OPC UA implemented through Python to interface the time domain simulation of the power grid and emulated OT communication network. The OT network emulation is based on Mininet, which uses operating-system-level virtualization. The entire emulated OT network runs on 10 virtual servers and consists of 27 user-defined substations, 118 measurement devices, and over 800 data points for the entire simulated power system. SCADA device functionality within the OT network is realized through custom Python code. Therefore, we generate SCADA traffic from substations and the control center. All OT network traffic is captured using the Linux bwm-ng tool and used as the main dataset for this research. The OT network traffic is measured in KBps. The observed OT network traffic data under nominal operating conditions is used to train the GC-LSTM model.

We collect OT network traffic data during various cyber attack scenarios. Two types of cyber attacks are considered, i.e., DDoS and active reconnaissance in the form of OT network scanning. The DDoS attack is launched to target multiple substations and aims to disrupt the power system operation with a malicious increase of the OT network traffic loading. To this end, we use the well-known Syn Flood cyber attack vector that exploits vulnerabilities in the TCP/IP packets to target network hosts [77]. This attack vector is chosen as it can flood the OT network and cause the targeted hosts to crash. The DDoS attack is executed using the Linux hping3 tool. The second examined cyber attack scenario is based on OT network scanning. This attack aims to enumerate active hosts within the OT network. Network scanning targets IP addresses and ports within a specified range. It is typically performed during reconnaissance at the early stages of a cyber attack kill chain. In this work, we conduct a six-level network scanning using nmap, i.e., paranoid, sneaky, polite, normal, aggressive, and insane. The first two scanning levels are stealthy and used to evade intrusion detection systems [78]. The scanning intensity determines the number of packets delivered to the network. For all cyber attack scenarios and simulations, we collect the observed OT network traffic data into a labeled dataset for deep learning applications.
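The labeled dataset described above can be pictured with a small sketch. This is an illustration under stated assumptions, not the authors' tooling: the function name `label_traffic`, the attack windows, and the synthetic throughput values are all hypothetical. Each throughput sample receives label 0 for normal operation and label 1 when it falls inside a known attack interval.

```python
import numpy as np

def label_traffic(n_samples, attack_windows):
    """Return a 0/1 label vector: 1 inside any [start, end) attack window."""
    labels = np.zeros(n_samples, dtype=np.int8)
    for start, end in attack_windows:
        # Clip each window to the valid sample range before marking it.
        lo, hi = max(0, start), min(n_samples, end)
        if lo < hi:
            labels[lo:hi] = 1
    return labels

# Hypothetical throughput trace (KBps) for one traffic observation point.
rng = np.random.default_rng(0)
throughput = rng.normal(197.0, 2.0, size=20)

# Hypothetical attack intervals, given as sample indices.
windows = [(5, 8), (15, 30)]
labels = label_traffic(len(throughput), windows)

# One traffic column paired with one label column.
dataset = np.column_stack([throughput, labels])
```

Pairing every traffic column with its own label column keeps the classification univariate per observation point; a window that runs past the end of the recording is simply clipped.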


Fig. 8. Attack graph representation for normal and anomalous traffic: a) Normal graph, b) Attack graph type I which contains normal and anomalous nodes,
and c) Attack graph type II which contains normal, anomalous and unidentified nodes.
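The two attack graph types in (10) and (11), pictured in Fig. 8(b) and 8(c), can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function `build_attack_graph`, the node names, and the flow tuples are hypothetical. Known nodes keep the label assigned by the classifier, and any source or destination address outside the known node set V is reported as unidentified, which only the type II graph includes.

```python
def build_attack_graph(known_nodes, anomalous, flows, include_unidentified=True):
    """Group nodes into normal, anomalous, and (type II only) unidentified sets.

    known_nodes: set of node names in V
    anomalous:   set of nodes flagged by the classifier
    flows:       iterable of (source, destination) addresses seen in traffic
    """
    known = set(known_nodes)
    flagged = set(anomalous) & known  # only known nodes can be classified
    graph = {
        "normal": known - flagged,
        "anomalous": flagged,
    }
    if include_unidentified:
        # Any address that appears in traffic but is not listed in V.
        seen = {addr for flow in flows for addr in flow}
        graph["unidentified"] = seen - known
    return graph

# Hypothetical substation with one unlisted host scanning an OT device.
V = {"control_center", "s7_gateway", "s7_ied1", "s7_ied2"}
flags = {"s7_gateway", "s7_ied1"}
flows = [("control_center", "s7_gateway"), ("10.0.7.99", "s7_ied1")]

type_one = build_attack_graph(V, flags, flows, include_unidentified=False)
type_two = build_attack_graph(V, flags, flows)
```

Here `type_one` holds only the normal and anomalous sets drawn from V, while `type_two` adds the unlisted address as an unidentified node.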

Fig. 9. Comparison of real and predicted traffic under normal conditions.

Fig. 10. Histogram of real and predicted traffic under normal conditions.

Fig. 11. Statistical comparison of real (r) and predicted traffic (p).

B. Network Traffic Prediction

In this research, the training of the GC-LSTM model is performed using the simulated OT network traffic dataset. This dataset consists of operational data for 27 substations, resulting in a total of 146 columns and $25 \times 10^4$ rows. The number of columns represents the total number of traffic observation points in the OT network. On the other hand, the number of rows in the dataset represents the temporal observations. The sampling rate for all observations is 1 sample/second. Therefore, the dataset for normal OT traffic is collected for a total duration of $25 \times 10^4$ seconds. The training was performed using a computer with the following specifications: Intel Xeon CPU 3.60GHz, 64 GB of RAM, and an NVIDIA Quadro RTX 4000 graphics processing unit. During the training process, the OT observation points are further classified for each individual substation to create 27 independent models of traffic predictions. The total training time for all 27 substations is 26.5 hours.

Fig. 9 shows the comparison of the real OT traffic under normal conditions and GC-LSTM predicted traffic in node 2, substation 7. The observed traffic rate is around 197 KBps. However, occasionally, the real OT traffic slightly increases or drops to zero but we cannot consider this situation as an anomaly. In distributed communication systems, the zero-value and variability happen because of the latency and delay that lead to variations in the packet arrival time. These factors are common phenomena for distributed communications, which have been studied in [79]. The zero value in Fig. 9 represents zero in Fig. 13. On average, the observed OT traffic data contains 3.6% of zeroes.

Fig. 10 presents the histogram and probability distribution of the real and predicted OT traffic in node 2, substation 7. Fig. 10 shows that the predicted OT traffic is more concentrated. We also compare the normal and predicted OT traffic for nodes 1 to 5 in substation 7 as represented in Fig. 11. The box plot in Fig. 11 shows the statistical summary from the traffic data including the minimum, median, maximum, first quartile, and third quartile. The box plot also indicates the variability, spread, and skewness of the data. The circles in the plot indicate the outlier data. Based on the plots in Figs. 9–11, the predicted OT traffic has a more concentrated value and fewer outliers compared to the real data. Therefore, the GC-LSTM performs as a filter to normalize the traffic and reduce its variability and outliers.

Fig. 12 shows the comparison of the real and predicted OT traffic during a sneaky cyber attack. The cyber attack triggers
a higher spike in OT traffic. The time series-based anomaly detection is then expected to distinguish spikes due to traffic variability from those due to cyber attacks. Therefore, the GC-LSTM-based prediction is important to normalize the OT traffic and reduce data variability on the predicted traffic. This is then used to improve the anomaly detection accuracy of TSC.

Fig. 12. Comparison of throughput between real and predicted OT traffic for sneaky network scanning cyber attack scenario.

C. Anomaly Detection

To perform anomaly detection on the OT traffic, we generate a dataset with network traffic ($X$) and labels ($L$) for univariate TSC. This is depicted in Fig. 13. Each column ($x_n$) in the observed traffic data has one associated label column ($l_n$). A label value of zero corresponds to the normal operation, while one represents anomalous OT traffic. We simulate two types of cyber attacks to generate anomalous traffic, i.e., DDoS and OT network scanning during the reconnaissance stage of the cyber kill chain. The attack scenarios are summarized in Table I. There are nine variations in the intensity of the communication network scanning amongst the scenarios. In total, the cyber attacks run for 345,000 seconds, and data is collected every second to create the dataset, as represented in Fig. 13, from $t = 1$ until $t = 345{,}000$. This dataset is then split, with 70% used to train and 30% used to test the TSC algorithm.

Fig. 13. Dataset for time series classification.

TABLE I
CYBER ATTACK SCENARIOS

Using the same generated dataset, we compare our proposed CyResGrid method with four state-of-the-art deep learning-based TSC techniques for anomaly detection, i.e., ResNets [72], Inception [35], FCN [73], and MLP [74]. These deep learning models are chosen as they address the general time series classification problem and are not domain specific. This makes them suitable for benchmarking and comparison of various TSC methods. Additionally, we also combine them with the proposed GC-LSTM method and test their performances, as summarized in Table II.

$$G_{mean} = \sqrt{\text{true positive rate} \times \text{true negative rate}} \tag{12}$$
$$F1 = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{13}$$

In Table II, we classify the cyber attacks into two scenarios. The first is for all combined attacks, i.e., no. 1-9, and the second only focuses on stealthy attack scenarios, i.e., paranoid and sneaky attacks no. 10-16. We consider the test dataset as imbalanced because, for the combined attacks, only 6.4% of the data is labeled as an anomaly. Meanwhile, for the stealthy attacks, only 2.7% of the data is labeled as an anomaly. Therefore, to evaluate the anomaly detection performance, we use as metrics the Geometric mean (G mean) in Equation (12) [80] and F1 score in Equation (13) [81], [82]. From Table II, it is clearly seen that for the combined attack scenario, CyResGrid provides the best performance with the highest scores in the Area Under The Curve (AUC), accuracy, G mean, and F1. Meanwhile, for the stealthy attack dataset, we ignore the MLP method due to its lower performance. For this scenario, Inception seems to provide the best AUC and accuracy. However, its true positive rate is significantly low. Furthermore, its F1 and G mean scores are among the lowest. Therefore, we can still conclude that CyResGrid provides the most balanced performance, even for stealthy attack detection.

Table II also indicates that GC-LSTM hybrid models can significantly improve the performance of deep learning-based classification, as indicated in row number 5, 6, 7, 8, 13, 14, and 15. The performance comparisons are also shown in Figs. 14 and 15. The Receiver Operating Characteristic (ROC) curve shows the performance of the classifier. The hybrid classification integrated with GC-LSTM provides improved results, as seen in Fig. 15, in comparison to the one without GC-LSTM in Fig. 14. According to Figs. 9–11, the actual OT traffic data is noisier compared to the predicted one. This condition leads to better anomaly detection using the hybrid model, as described above.

D. Attack Graph Generation and Analysis

As discussed in Section II-E, the attack graph is modeled by comparing the normal and anomalous OT traffic. The result of this comparison is then used to determine the nodal abnormality. The attack graph classifies nodes into two categories, i.e., normal and anomalous. Anomalous nodes ($\bar{a}_i$) are indicated by red, while normal nodes ($a_i$) are highlighted in blue.

Fig. 16 illustrates the entire attack graph map for online cyber attack identification and visualization. Fig. 16 (a) depicts
OT network scanning, originating from the control center to an OT device in substation 7. Consequently, this leads to the control center, substation 7 gateway, and targeted OT device being flagged as anomalous, as shown in red. Fig. 16 (b) depicts a DDoS attack targeting substations 1-7 that originates from the control center. The DDoS attack on multiple substation targets triggers widespread traffic anomalies in substations 1-7, as indicated in red. It is considerably easier to detect a DDoS attack, as it results in notably increased OT network traffic volume, in comparison to a network scanning attack. Figs. 16 (c) and (d) depict attack graphs for cyber attacks originating from other sources than the control center. In Fig. 16 (c), we highlight OT network scanning performed by a compromised OT device located in substation 7. The scanning attacks lead to all nodes in substation 7 being classified as anomalous, except the router gateway. This scenario is explained as a local cyber attack that occurs in a substation. Finally, Fig. 16 (d) shows OT network scanning by an unidentified node, as indicated by an orange triangle. The attack source is classified as unidentified because it is not included on the list of known nodes in the OT network.

TABLE II
PERFORMANCE COMPARISON OF ANOMALY DETECTION METHODS

Fig. 14. ROC comparison of the deep learning-based TSC.

Fig. 15. ROC comparison of the hybrid deep learning-based TSC.

IV. CONCLUSION AND FUTURE WORK

With the ever-increasing threat of cyber attacks on power grids, it is now crucial to improve attack detection capabilities in OT systems. In this work, we proposed CyResGrid, a hybrid model of GC-LSTM and a deep convolutional network for anomaly detection in OT communication networks for power grids. It helps power system operators to localize and identify cyber attacks in near real-time. GC-LSTM creates OT traffic predictions based on the spatial and temporal features of the input data. Through its predictions, the data variability and outliers are reduced. GC-LSTM also serves as a mechanism to improve the anomaly detection performance of TSCs.
Fig. 16. Attack graph maps to identify and visualize cyber attack locations.

Furthermore, the deep convolutional network in CyResGrid is designed based on hyperparameter tuning using Bayesian optimization. Hence, CyResGrid outperforms the state-of-the-art deep learning-based TSC. It provides the best detection performance, with the highest accuracy of 96.45%, F1 score of 65.03%, and G mean of 17.16%, and the lowest false positive rate of 0.13%. Additionally, for stealthy cyber attack scenarios, i.e., paranoid and sneaky attacks, CyResGrid provides the best performance, indicated by the highest F1 score of 2.32% and G mean score of 3.08%. Other methods seem to provide higher accuracy and AUC. However, they have a lower performance to detect anomalies, as indicated by the lower True Positive (TP), F1, and G mean scores. This classification is then used to generate an attack graph that serves as an online tool for power system operators to identify and localize active cyber attacks in OT networks of power systems.

In future work, we will focus on augmenting the proposed CyResGrid method with prevention capabilities, in addition to the existing detection features. Subsequently, it can be integrated with an intrusion detection and prevention system. The developed method is equally applicable to different OT networks and CPS topologies, besides other cyber attack vectors, such as malware-based and privilege escalation attacks. Moreover, the performance of the detection algorithm can further be improved to detect more variations of cyber attacks with infinitesimally small changes to OT network traffic intensity and frequency of occurrences.

ACKNOWLEDGMENT

DeSIRE is funded by the 4TU-program High Tech for a Sustainable Future (HTSF). 4TU is the federation of the four technical universities in the Netherlands.

REFERENCES

[1] D. E. Whitehead, K. Owens, D. Gammel, and J. Smith, “Ukraine cyber-induced power outage: Analysis and practical mitigation strategies,” in Proc. 70th Annu. Conf. Prot. Relay Eng., College Station, TX, USA, Apr. 2017, pp. 1–8.
[2] M. J. Assante, R. M. Lee, and T. Conway, “ICS defense use case no. 6: Modular ICS malware,” Electricity Inf. Sharing Center (E-ISAC), Washington, DC, USA, Rep. 6, Aug. 2017, vol. 2.
[3] E. M. Hutchins, M. J. Cloppert, and R. M. Amin, “Intelligence-driven computer network defense informed by analysis of adversary campaigns and intrusion kill chains,” Lockheed Martin Corp., Bethesda, MD, USA, Rep. 1, 2011. Accessed: Jul. 5, 2022. [Online]. Available: https://www.lockheedmartin.com/content/dam/lockheed-martin/rms/documents/cyber/LM-White-Paper-Intel-Driven-Defense.pdf
[4] G. Liang, J. Zhao, F. Luo, S. R. Weller, and Z. Y. Dong, “A review of false data injection attacks against modern power systems,” IEEE Trans. Smart Grid, vol. 8, no. 4, pp. 1630–1638, Jul. 2017.
[5] R. Deng, G. Xiao, R. Lu, H. Liang, and A. V. Vasilakos, “False data injection on state estimation in power systems—Attacks, impacts, and defense: A survey,” IEEE Trans. Ind. Informat., vol. 13, no. 2, pp. 411–423, Apr. 2017.
[6] A. S. Musleh, G. Chen, and Z. Y. Dong, “A survey on the detection algorithms for false data injection attacks in smart grids,” IEEE Trans. Smart Grid, vol. 11, no. 3, pp. 2218–2234, May 2020.
[7] H. T. Reda, A. Anwar, and A. Mahmood, “Comprehensive survey and taxonomies of false data injection attacks in smart grids: Attack models, targets, and impacts,” Renew. Sustain. Energy Rev., vol. 163, pp. 1–24, Jul. 2022.
[8] A. Sayghe et al., “Survey of machine learning methods for detecting false data injection attacks in power systems,” IET Smart Grid, vol. 3, no. 5, pp. 581–595, Oct. 2020.
[9] H. Zhang, B. Liu, and H. Wu, “Smart grid cyber-physical attack and defense: A review,” IEEE Access, vol. 9, pp. 29641–29659, 2021.
[10] U. Inayat, M. F. Zia, S. Mahmood, H. M. Khalid, and M. Benbouzid, “Learning-based methods for cyber attacks detection in IoT systems: A survey on methods, analysis, and future prospects,” Electronics, vol. 11, no. 9, pp. 1–20, Jan. 2022.
[11] A. S. Musleh, H. M. Khalid, S. M. Muyeen, and A. Al-Durra, “A prediction algorithm to enhance grid resilience toward cyber attacks in WAMCS applications,” IEEE Syst. J., vol. 13, no. 1, pp. 710–719, Mar. 2019.
[12] SANS ICS, “Analysis of the cyber attack on the Ukrainian power grid,” Electricity Inf. Sharing Center (E-ISAC), Washington, DC, USA, Rep. 2, Mar. 2016, vol. 388.
[13] C.-W. Ten, J. Hong, and C.-C. Liu, “Anomaly detection for cybersecurity of the substations,” IEEE Trans. Smart Grid, vol. 2, no. 4, pp. 865–873, Dec. 2011.
[14] Q. Wang, X. Cai, Y. Tang, and M. Ni, “Methods of cyber-attack identification for power systems based on bilateral cyber-physical information,” Int. J. Electr. Power Energy Syst., vol. 125, pp. 1–12, Feb. 2021.
[15] R. Mitchell and I.-R. Chen, “Behavior-rule based intrusion detection systems for safety critical smart grid applications,” IEEE Trans. Smart Grid, vol. 4, no. 3, pp. 1254–1263, Sep. 2013.
[16] G. M. Coates, K. M. Hopkinson, S. R. Graham, and S. H. Kurkowski, “Collaborative, trust-based security mechanisms for a regional utility Intranet,” IEEE Trans. Power Syst., vol. 23, no. 3, pp. 831–844, Aug. 2008.
[17] Y. Yang et al., “Intrusion detection system for network security in synchrophasor systems,” in Proc. IET Int. Conf. Inf. Commun. Technol. (IETICT), Beijing, China, 2013, pp. 246–252.
[18] S. Ali and Y. Li, “Learning multilevel auto-encoders for DDoS attack detection in smart grid network,” IEEE Access, vol. 7, pp. 108647–108659, 2019.
[19] M. Ozay, I. Esnaola, F. T. Y. Vural, S. R. Kulkarni, and H. V. Poor, “Machine learning methods for attack detection in the smart grid,” IEEE Trans. Neural Netw. Learn. Syst., vol. 27, no. 8, pp. 1773–1786, Aug. 2016.
[20] M. Panthi, “Anomaly detection in smart grids using machine learning techniques,” in Proc. 1st Int. Conf. Power Control Comput. Technol. (ICPC2T), Raipur, India, Jan. 2020, pp. 220–222.
[21] A. Khraisat, I. Gondal, P. Vamplew, and J. Kamaruzzaman, “Survey of intrusion detection systems: Techniques, datasets and challenges,” Cybersecurity, vol. 2, no. 1, pp. 1–22, Dec. 2019.


[22] H. Liu and B. Lang, “Machine learning and deep learning methods for intrusion detection systems: A survey,” Appl. Sci., vol. 9, no. 20, pp. 1–28, Oct. 2019.
[23] A. Aldweesh, A. Derham, and A. Z. Emam, “Deep learning approaches for anomaly-based intrusion detection systems: A survey, taxonomy, and open issues,” Knowl. Based Syst., vol. 189, pp. 1–19, Feb. 2020.
[24] P. Mishra, V. Varadharajan, U. Tupakula, and E. S. Pilli, “A detailed investigation and analysis of using machine learning techniques for intrusion detection,” IEEE Commun. Surveys Tuts., vol. 21, no. 1, pp. 686–728, 1st Quart., 2019.
[25] R. Barbosa, R. Sadre, and A. Pras, “Difficulties in modeling SCADA traffic: A comparative analysis,” in Proc. Int. Conf. Passive Active Meas., Berlin, Germany, Mar. 2012, pp. 126–135.
[26] R. Chalapathy and S. Chawla, “Deep learning for anomaly detection: A survey,” Jan. 2019, arXiv:1901.03407.
[27] T. V. Phan, T. G. Nguyen, N.-N. Dao, T. T. Huong, N. H. Thanh, and T. Bauschert, “DeepGuard: Efficient anomaly detection in SDN with fine-grained traffic flow monitoring,” IEEE Trans. Netw. Service Manag., vol. 17, no. 3, pp. 1349–1362, Sep. 2020.
[28] R.-H. Hwang, M.-C. Peng, C.-W. Huang, P.-C. Lin, and V.-L. Nguyen, “An unsupervised deep learning model for early network traffic anomaly detection,” IEEE Access, vol. 8, pp. 30387–30399, 2020.
[29] X. Guan, T. Qin, W. Li, and P. Wang, “Dynamic feature analysis and measurement for large-scale network traffic monitoring,” IEEE Trans. Inf. Forensics Security, vol. 5, pp. 905–919, 2010.
[30] A. Kind, M. P. Stoecklin, and X. Dimitropoulos, “Histogram-based traffic anomaly detection,” IEEE Trans. Netw. Service Manag., vol. 6, no. 2, pp. 110–121, Jun. 2009.
[31] K. Xu, Z.-L. Zhang, and S. Bhattacharyya, “Internet traffic behavior profiling for network security monitoring,” IEEE/ACM Trans. Netw., vol. 16, no. 6, pp. 1241–1252, Dec. 2008.
[32] H.-S. Wu, “A survey of research on anomaly detection for time series,” in Proc. 13th Int. Comput. Conf. Wavelet Active Media Technol. Inf. Process. (ICCWAMTIP), Chengdu, China, Dec. 2016, pp. 426–431.
[33] K. Shaukat et al., “A review of time-series anomaly detection techniques: A step to future perspectives,” in Proc. Future Inf. Commun. Conf., Vancouver, BC, Canada, Apr. 2021, pp. 865–877.
[34] H. I. Fawaz, G. Forestier, J. Weber, L. Idoumghar, and P.-A. Muller, “Deep learning for time series classification: A review,” Data Min. Knowl. Discov., vol. 33, no. 4, pp. 917–963, Jul. 2019.
[35] I. Fawaz et al., “InceptionTime: Finding AlexNet for time series classification,” Data Min. Knowl. Discov., vol. 34, no. 6, pp. 1936–1962, Sep. 2020.
[36] W. Lin, D. Wu, and B. Boulet, “Spatial-temporal residential short-term load forecasting via graph neural networks,” IEEE Trans. Smart Grid, vol. 12, no. 6, pp. 5373–5384, Nov. 2021.
[45] D. Kreutz, F. M. V. Ramos, P. E. Veríssimo, C. E. Rothenberg, S. Azodolmolky, and S. Uhlig, “Software-defined networking: A comprehensive survey,” Proc. IEEE, vol. 103, no. 1, pp. 14–76, Jan. 2015.
[46] J. Wu, S. Luo, S. Wang, and H. Wang, “NLES: A novel lifetime extension scheme for safety-critical cyber-physical systems using SDN and NFV,” IEEE Internet Things J., vol. 6, no. 2, pp. 2463–2475, Apr. 2019.
[47] Y. Li, Y. Qin, P. Zhang, and A. Herzberg, “SDN-enabled cyber-physical security in networked microgrids,” IEEE Trans. Sustain. Energy, vol. 10, no. 3, pp. 1613–1622, Jul. 2019.
[48] X. Zhang, K. Wei, L. Guo, W. Hou, and J. Wu, “SDN-based resilience solutions for smart grids,” in Proc. Int. Conf. Softw. Netw. (ICSN), Jeju, South Korea, May 2016, pp. 1–5.
[49] A. Montazerolghaem and M. H. Yaghmaee, “Demand response application as a service: An SDN-based management framework,” IEEE Trans. Smart Grid, vol. 13, no. 3, pp. 1952–1966, May 2022.
[50] M. H. Rehmani, F. Akhtar, A. Davy, and B. Jennings, “Achieving resilience in SDN-based smart grid: A multi-armed bandit approach,” in Proc. IEEE Conf. Netw. Softw. Workshop (NetSoft), Montreal, QC, Canada, Jun. 2018, pp. 366–371.
[51] P. Zhang et al., “Network-wide forwarding anomaly detection and localization in software defined networks,” IEEE/ACM Trans. Netw., vol. 29, no. 1, pp. 332–345, Feb. 2021.
[52] V. Krishnan et al., “Validation of synthetic U.S. electric power distribution system data sets,” IEEE Trans. Smart Grid, vol. 11, no. 5, pp. 4477–4489, Sep. 2020.
[53] X. Zheng et al., “A multi-scale time-series dataset with benchmark for machine learning in decarbonized energy grids,” Nat. Sci. Data, vol. 9, p. 359, Jun. 2022.
[54] S. Soltan, A. Loh, and G. Zussman, “A learning-based method for generating synthetic power grids,” IEEE Syst. J., vol. 13, no. 1, pp. 625–634, Mar. 2019.
[55] A. Venzke, D. K. Molzahn, and S. Chatzivasileiadis, “Efficient creation of datasets for data-driven power system applications,” Electr. Power Syst. Res., vol. 190, pp. 1–8, Jan. 2021.
[56] M. F. Elaha, M. Jin, and P. Zeng, “Review of load data analytics using deep learning in smart grids: Open load datasets, methodologies, and application challenges,” Int. J. Energy Res., vol. 45, no. 10, pp. 14274–14302, Aug. 2021.
[57] S. Tavakkoli, J. Macknick, G. A. Heat, and S. M. Jordaan, “Spatiotemporal energy infrastructure datasets for the United States: A review,” Renew. Sustain. Energy Rev., vol. 152, pp. 1–10, Dec. 2021.
[58] Y. Himeur, A. Alsalemi, F. Bensaali, and A. Amira, “Building power consumption datasets: Survey, taxonomy and future directions,” Energy Build., vol. 227, pp. 1–16, Nov. 2020.
[59] M. Naglic, PMU Measurements of IEEE 39-Bus Power System Model, IEEE DataPort, Piscataway, NJ, USA, Jun. 2019.
[37] O. Boyaci, M. R. Narimani, K. R. Davis, M. Ismail, T. J. Overbye, and [60] S. Pan, T. Morris, and U. Adhikari, “Developing a hybrid intrusion detec-
E. Serpedin, “Joint detection and localization of stealth false data injec- tion system using data mining for power systems,” IEEE Trans. Smart
tion attacks in smart grids using graph neural networks,” IEEE Trans. Grid, vol. 6, no. 6, pp. 3104–3113, Nov. 2015.
Smart Grid, vol. 13, no. 1, pp. 807–819, Jan. 2022. [61] U. Adhikari, T. Morris, and S. Pan, “WAMS cyber-physical test bed
[38] Z. Cui, K. Henrickson, R. Ke, and Y. Wang, “Traffic graph convolutional for power system, cybersecurity study, and data mining,” IEEE Trans.
recurrent neural network: A deep learning framework for network- Smart Grid, vol. 8, no. 6, pp. 2744–2753, Nov. 2017.
scale traffic learning and forecasting,” IEEE Trans. Intell. Transp. Sys., [62] A. Hahn, A. Ashok, S. Sridhar, and M. Govindarasu, “Cyber-physical
vol. 21, no. 11, pp. 4883–4894, Nov. 2020. security testbeds: Architecture, application, and evaluation for smart
[39] L. Deng, D. Lian, Z. Huang, and E. Chen, “Graph convolutional adver- grid,” IEEE Trans. Smart Grid, vol. 4, no. 2, pp. 847–855, Jun. 2013.
sarial networks for spatiotemporal anomaly detection,” IEEE Trans. [63] M. H. Cintuglu, O. A. Mohammed, K. Akkaya, and A. S. Uluagac, “A
Neural Netw. Learn. Syst., vol. 33, no. 6, pp. 2416–2428, Jun. 2022. survey on smart grid cyber-physical system testbeds,” IEEE Commun.
[40] H. Chen, H. Liu, X. Chu, Q. Liu, and D. Xue, “Anomaly detection Surveys Tuts., vol. 19, no. 1, pp. 446–464, 1st Quart., 2017.
and critical SCADA parameters identification for wind turbines based [64] B. B. Gupta and T. Akhtar, “A survey on smart power grid: Frameworks,
on LSTM-AE neural network,” Renew. Energy, vol. 172, pp. 829–840, tools, security issues, and solutions,” Ann. Telecommun., vol. 72, no. 9,
Jul. 2021. pp. 517–549, Sep. 2017.
[41] S. Basumallik, R. Ma, and S. Eftekharnejad, “Packet-data anomaly [65] C.-C. Sun, A. Hahn, and C.-C. Liu, “Cyber security of a power grid:
detection in PMU-based state estimator using convolutional neural State-of-the-art,” Int. J. Electr. Power Energy Syst., vol. 99, pp. 45–56,
network,” Int. J. Electr. Power Energy Syst., vol. 107, pp. 690–702, Jul. 2018.
May 2019. [66] X. Zhou, X. Gou, T. Huang, and S. Yang, “Review on testing of
[42] X. Ou, W. F. Boyer, and M. A. McQueen, “A scalable approach to attack cyber physical systems: Methods and testbeds,” IEEE Access, vol. 6,
graph generation,” in Proc. ACM Conf. Comput. Commun. Security, pp. 52179–52194, 2018.
Oct. 2006, pp. 336–345. [67] M. Z. Gunduz and R. Das, “A comparison of cyber-security ori-
[43] K. Kaynar and F. Sivrikaya, “Distributed attack graph generation,” ented testbeds for IoT-based smart grids,” in Proc. 6th Int. Symp.
IEEE Trans. Dependable Secure Comput., vol. 13, no. 5, pp. 519–532, Digit. Forensic Security (ISDFS), Antalya, Turkey, Mar. 2018,
Sep./Oct. 2016. pp. 1–6.
[44] S. Yoon, J.-H. Cho, D. S. Kim, T. J. Moore, F. Free-Nelson, and [68] J. Montoya et al., “Advanced laboratory testing methods using real-
H. Lim, “Attack graph-based moving target defense in software- time simulation and hardware-in-the-loop techniques: A survey of smart
defined networks,” IEEE Trans. Netw. Service Manag., vol. 17, no. 3, grid international research facility network activities,” Energies, vol. 13,
pp. 1653–1668, Sep. 2020. no. 12, pp. 1–38, Jun. 2020.

[69] M. Iliofotou, P. Pappu, M. Faloutsos, M. Mitzenmacher, S. Singh, and G. Varghese, “Network monitoring using traffic dispersion graphs (TDGS),” in Proc. 7th ACM SIGCOMM Conf. Internet Meas., San Diego, CA, USA, Oct. 2007, pp. 315–320.
[70] D. Q. Le, T. Jeong, H. E. Roman, and J. W.-K. Hong, “Traffic dispersion graph based anomaly detection,” in Proc. 2nd Symp. Inf. Commun. Technol., Hanoi, Vietnam, Oct. 2011, pp. 36–41.
[71] J. Chen, X. Wang, and X. Xu, “GC-LSTM: Graph convolution embedded LSTM for dynamic network link prediction,” Appl. Intell., vol. 52, pp. 7513–7528, Sep. 2021.
[72] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Las Vegas, NV, USA, Jun. 2016, pp. 770–778.
[73] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Boston, MA, USA, Jun. 2015, pp. 3431–3440.
[74] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, May 2015.
[75] M. Längkvist, L. Karlsson, and A. Loutfi, “A review of unsupervised feature learning and deep learning for time-series modeling,” Pattern Recognit. Lett., vol. 42, pp. 1–14, Jun. 2014.
[76] J. Snoek, H. Larochelle, and R. P. Adams, “Practical Bayesian optimization of machine learning algorithms,” in Proc. Int. Conf. Adv. Neural Inf. Process. Syst., vol. 25, Dec. 2012, pp. 1–9.
[77] R. Mohammadi, R. Javidan, and M. Conti, “SLICOTS: An SDN-based lightweight countermeasure for TCP SYN flooding attacks,” IEEE Trans. Netw. Service Manag., vol. 14, no. 2, pp. 487–497, Jun. 2017.
[78] T. Zitta et al., “Penetration testing of intrusion detection and prevention system in low-performance embedded IoT device,” in Proc. 18th Int. Conf. Mechatronics (ME), Brno, Czech Republic, Dec. 2018, pp. 1–5.
[79] J. M. Johansson, “On the impact of network latency on distributed systems design,” Inf. Technol. Manag., vol. 1, no. 3, pp. 183–194, Jan. 2000.
[80] R. Barandela, J. S. Sánchez, V. García, and E. Rangel, “Strategies for learning in class imbalance problems,” Pattern Recognit., vol. 36, no. 3, pp. 849–851, Mar. 2003.
[81] J. M. Johnson and T. M. Khoshgoftaar, “Survey on deep learning with class imbalance,” J. Big Data, vol. 6, no. 1, pp. 1–54, Dec. 2019.
[82] B. Kim, Y. Ko, and J. Seo, “Novel regularization method for the class imbalance problem,” Expert Syst. Appl., vol. 188, pp. 1–8, Feb. 2022.

Alfan Presekal (Member, IEEE) received the B.Eng. degree in computer engineering from Universitas Indonesia in 2014, and the M.Sc. degree in secure software system from the Department of Computing, Imperial College London, U.K., in 2016. He is an Assistant Professor of Computer Engineering with the Department of Electrical Engineering, Universitas Indonesia. He holds various cyber security certifications from EC Council, CompTIA, and CISCO. He is currently a Doctoral Researcher in Cyber Resilient Power Grids within Intelligent Electrical Power Grids with the Department of Electrical Sustainable Technology, Delft University of Technology. His main research interests include cyber security, cyber-physical systems, and artificial intelligence.

Alexandru Ştefanov (Member, IEEE) received the M.Sc. degree from the University Politehnica of Bucharest, Romania, in 2011, and the Ph.D. degree from University College Dublin, Ireland, in 2015. He is an Assistant Professor of Intelligent Electrical Power Grids with TU Delft, The Netherlands. He is the Director of the Control Room with the Future Technology Centre. He is leading the Cyber Resilient Power Grids Research Group. His research interests include cyber security of power grids, resilience of cyber-physical systems, and next generation grid operation. He holds the Professional Title of Chartered Engineer from Engineers Ireland.

Vetrivel Subramaniam Rajkumar (Student Member, IEEE) received the bachelor’s degree in electrical engineering from Anna University, India, in 2013, and the M.Sc. degree in electrical power engineering from the Delft University of Technology, The Netherlands, in 2019, where he is currently a Doctoral Researcher with the Intelligent Electrical Power Grids Group, Department of Electrical Sustainable Technology. His current research interests include cyber security for power grids and impact analysis of cyber attacks on power systems.

Peter Palensky (Senior Member, IEEE) received the M.Sc. degree in electrical engineering and the Ph.D. and Habilitation degrees from the Vienna University of Technology, Austria, in 1997, 2001, and 2015, respectively. He co-founded Envidatec, a German startup on energy management and analytics. In 2008, he joined the Lawrence Berkeley National Laboratory, Berkeley, CA, USA, as a Researcher, and the University of Pretoria, South Africa. In 2009, he became the Head of the Business Unit, Austrian Institute of Technology in Sustainable Building Technologies, where he was the first Principal Scientist of Complex Energy Systems. In 2014, he was appointed as a Full Professor of Intelligent Electric Power Grids with TU Delft, The Netherlands. He is active in international committees, such as ISO or CEN. His research interests include energy automation networks, smart grids, and modeling intelligent energy systems. He also serves as an IEEE IES AdCom Member-at-Large in various functions for IEEE. He is the past Editor-in-Chief of IEEE Industrial Electronics Magazine and an Associate Editor of several other IEEE publications and regularly organizes IEEE conferences.
