IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 19, NO. 1, JANUARY 2023 153

Attack Detection in Automatic Generation Control Systems Using LSTM-Based Stacked Autoencoders

Ahmed S. Musleh, Member, IEEE, Guo Chen, Member, IEEE, Zhao Yang Dong, Fellow, IEEE, Chen Wang, and Shiping Chen, Senior Member, IEEE

Abstract: Automatic generation control (AGC) is paramount in maintaining the stability and operation of power grids. Its dependence on communication systems makes it vulnerable to various cyberphysical attacks. False data injection attacks (FDIA) are particularly difficult to detect and represent a major threat to AGC systems. This article proposes a novel spatio-temporal learning algorithm that can learn the normal dynamics of the power grid with AGC system to deal with this problem. The algorithm first uses a long short-term memory autoencoder to learn the normal dynamics. It then utilizes this unsupervised learned model in detecting the various possibilities of FDIA affecting the AGC system by evaluating the reconstruction residual of each measurements sample. The proposed algorithm is data-driven which makes it resilient against AGC's parameters uncertainties and modeling nonlinearities. The effectiveness of the developed algorithm is evaluated through test cases with various basic and stealth FDIAs.

Index Terms: Automatic generation control (AGC), cyberphysical security, false data injection attacks (FDIAs), long short-term memory autoencoders (LSTM-AE), situational awareness.

Manuscript received 10 September 2021; revised 23 January 2022 and 19 April 2022; accepted 8 May 2022. Date of publication 27 May 2022; date of current version 8 November 2022. This work was supported by Australian Research Council under Grant DP180103217, Grant FT190100156, Grant IH180100020, and Grant LP200100056. Paper no. TII-21-3920. (Corresponding author: Guo Chen.)
Ahmed S. Musleh is with the School of Electrical Engineering and Telecommunications and UNSW Digital Grid Futures Institute, The University of New South Wales, Sydney, NSW 2052, Australia, and also with CSIRO DATA61, Sydney, NSW 2015, Australia (e-mail: [email protected]).
Guo Chen is with the School of Electrical Engineering and Telecommunications and UNSW Digital Grid Futures Institute, The University of New South Wales, Sydney, NSW 2052, Australia (e-mail: [email protected]).
Zhao Yang Dong is with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798 (e-mail: [email protected]).
Chen Wang and Shiping Chen are with CSIRO DATA61, Sydney, NSW 2015, Australia (e-mail: [email protected]; ship-[email protected]).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TII.2022.3178418.
Digital Object Identifier 10.1109/TII.2022.3178418
1551-3203 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

AUTOMATIC generation control (AGC) is a main closed-loop control system that maintains the stability of today's highly interconnected power grids [1]. To do this, AGC utilizes a variety of information and communication infrastructures that cover the widely spread power grid and control centers [2]. Being a form of a cyberphysical platform, this control system becomes an attractive target for cyberphysical attackers who aim at either gaining energy market profits or causing service outages and infrastructural damages. The vulnerabilities of such systems are thoroughly studied and illustrated in literature [2]–[5]. Recent high-profile attacks on energy systems include Stuxnet, Dragonfly, and the 2015 attack in Ukraine which led to power outages affecting hundreds of thousands of people [6]. While the forms and impacts of cyberphysical attacks differ immensely, false data injection attacks (FDIA) represent a substantial problem as they have extensive effects and are hard to detect, as stated by the National Institute of Standards and Technology [7]. FDIA on AGC systems are studied in [8] where optimal attack strategies are designed. FDIA affecting AGC system could cause severe damage and losses [9], [10]; thus, detection of these attacks is a paramount necessity.

Detection of FDIA in AGC systems is achieved via different methods in the literature. These could be split into model-based and data-driven. Many model-based algorithms were presented to detect FDIA in AGC systems. Kernel density estimation was utilized in [9] for detecting FDIA. In [10], the authors proposed a load forecasting-based algorithm to detect FDIA. This requires a precise forecast of the load profile of the grid. Liu et al. [11] developed a detection algorithm utilizing the Lyapunov stability theory to overcome FDIA in AGC systems. Model-based prediction and threshold testing are used in detecting FDIA in [12]. All these methods provide competent detection methodologies. However, the need for precise system models and parameters is the main drawback of model-based detection algorithms as these are difficult to obtain in the real industry [5]. Furthermore, model-based methods are associated with extensive computations, serious detection delay, unscalability, and possible divergence [5], [13]. An attack mitigation algorithm is proposed in [14] where the authors utilize dynamic control loops to overcome the possible FDIA. This provides a good mitigation approach in theory, yet the applicability of this
algorithm in real-life power systems is questioned. For instance, the authors assume that all of the generators in the power grid are equipped with AGC. This is not realistic and would require many hardware updates in practice. Furthermore, the complexity and nonlinearity of the AGC systems limit the utilization of many model-based detection algorithms, as they assume a linearized model of the AGC system, which results in wrong decisions in the industry [15]. These drawbacks question the applicability of these model-based algorithms in real-life networks.

To overcome these issues, data-driven detection algorithms are developed. In [16], the authors developed a detection algorithm based on a multilayer perceptron classifier. Abbaspour et al. [17] proposed a neural network-based Luenberger observer for detecting FDIA in AGC system. A comparative analysis of multiple supervised classifiers with neighborhood component analysis is presented in [18]. Regression-based FDIA signal prediction is developed in [19] using long short-term memory (LSTM) networks. These are model-free detection algorithms. However, these supervised data-driven detection algorithms face many challenges such as the high dimensionality of measurements, which may be difficult to fit. Also, the limited FDIA scenarios (attack labels) utilized in the training process question the ability of these detection algorithms to identify novel attacks that were not trained for. This may result in an overfitting problem where only attacks that were used in the training process are detected while other attacks are not. Furthermore, the spatio-temporal correlations of the power grid measurements cannot be captured by the aforementioned data-driven detection algorithms. This limits the detection possibility of replay FDIA or FDIA of temporal nature.

To overcome these challenges and to address the remaining research gaps in adopting data-driven detection algorithms of FDIA in AGC systems, an unsupervised spatio-temporal learning model is developed in this article. This is based on LSTM-autoencoder (LSTM-AE), where the spatial correlations are learned through the AE structure and the temporal correlations via the LSTM units implemented within the AE [20]. This enables the unsupervised learning of dynamic sequential behavior of the AGC system.

The key contributions of this article are as follows.
1) Development of an unsupervised deep network model to learn the normal dynamic behavior of the AGC system in the power grid by extracting the nonlinear spatio-temporal correlations of the measurements collected.
2) Development of FDIA detection algorithm based on the developed deep network model.
3) Identification of the attacked measurements signals under basic, stealth, and replay FDIA.

The rest of this article is organized as follows: Section II discusses the FDIA on AGC systems; Section III presents the proposed algorithm; Section IV includes the performance results. Finally, Section V concludes this article.

II. FDIA ON AGC: A BACKGROUND

The AGC system aims to maintain the frequency of the system at its nominal value (e.g., 60 Hz) and to mitigate power fluctuations by maintaining the scheduled power exchange at the tie-lines between the different areas in the grid. Because the time constant of the AGC is considerably larger than that of the automatic voltage regulator (AVR), the AGC control loop can be decoupled from the AVR loop. Consequently, a steady-state operating point can be considered for the AVR when analyzing AGC system [1]. Thus, a typical area of the power network with the AGC system can be represented by the block diagram illustrated in Fig. 1. For the $i$th area, the approximated linearized load-frequency dynamics can be mathematically represented as [21]

$$\Delta \dot{f}_i = \frac{1}{2H_i}\left(\Delta P_{mi} - \Delta P_{Li} - \Delta P_{tie_i} - D_i \Delta f_i\right) \tag{1}$$

where $\Delta \dot{f}_i$ denotes the first derivative of the frequency deviation $\Delta f_i$ with respect to time. $H_i$, $\Delta P_{Li}$, and $D_i$ are the $i$th area's equivalent inertia constant, load change, and equivalent damping coefficient, respectively. $\Delta P_{mi}$ and $\Delta P_{tie_i}$ are the $i$th area's total mechanical power and tie-lines power deviations, respectively, represented as

$$\Delta P_{mi} = \sum_{g=1}^{n_i} \Delta P_{mg.i} \tag{2}$$

$$\Delta P_{tie_i} = \sum_{j=1}^{\delta_i} \Delta P_{tie_{i.j}} \tag{3}$$

where $n_i$ and $\Delta P_{mg.i}$ are the total number of generators in area $i$ and the mechanical power deviation of the $g$th generator, respectively. $\delta_i$ and $\Delta P_{tie_{i.j}}$ are the total number of areas connected to area $i$ and the power deviation at the tie-line connecting areas $i$ and $j$, respectively.

Given the frequency deviation $\Delta f_i$ and the power generation control setpoint $P_{cg.i}$, the turbine's valve position $P_{vg.i}$ of each generator's governor is adjusted as

$$\Delta \dot{P}_{vg.i} = \frac{1}{T_{gg.i}}\left(\Delta P_{cg.i} - \frac{1}{R_{g.i}}\Delta f_i - \Delta P_{vg.i}\right) \tag{4}$$

where $T_{gg.i}$ and $R_{g.i}$ are the governor's time constant and the droop coefficient of the $g$th generator, respectively. Adjusting the turbine's valve position $P_{vg.i}$ regulates the steam flowing into its turbine, which changes the mechanical output power as

$$\Delta \dot{P}_{mg.i} = \frac{1}{T_{chg.i}}\left(\Delta P_{vg.i} - \Delta P_{mg.i}\right) \tag{5}$$

where $T_{chg.i}$ denotes the turbine's time constant of the $g$th generator. If the $g$th generator is under the AGC control loop, its power generation control setpoint $P_{cg.i}$ would be tuned to allow the generator to compensate for the load changes, which maintains the frequency and the tie-lines power accordingly. Given the frequency deviation and the power deviation at the tie-lines, the control center evaluates the area control error (ACE) for the $i$th area using the following equation:

$$ACE_i = \beta_i \Delta f_i + \sum_{j=1}^{\delta_i} \Delta P_{tie_{i.j}} \tag{6}$$
where $\beta_i$ is the frequency bias of area $i$ expressed by

$$\beta_i = D_i + \sum_{g=1}^{n_i} \frac{1}{R_{g.i}}. \tag{7}$$

Taking the $ACE_i$ as the input, the AGC evaluates the needed change in the power generation control setpoint $\Delta P_{cg.i}$ as

$$\Delta \dot{P}_{cg.i} = -k_i \times (ACE_i) \tag{8}$$

where $k_i$ is the AGC gain. This setpoint is sent to the power plant every 2 to 4 s [21]. On the other hand, if the $g$th generator is not under AGC, its power generation control setpoint $P_{cg.i}$ would be determined by the control center according to the planned generation schedule.

Actual AGC systems include nonlinearities that are usually excluded in model-based analysis methods for simplicity. The two main nonlinearities are the governor dead-band and generation rate constraint [15]. Omitting these nonlinearities questions the applicability of model-based attack detection methods in AGC systems in real power grids.

Fig. 1. Block diagram of AGC model for area 1 in a virtual three area power grid.

A. Vulnerability of AGC Toward FDIA

The control center receives the tie-lines power and frequency measurements over different communication and network platforms. AGC communications made within the electrical substation are built according to the IEC-61850 standard [22], while communications made between the substations are based on the IEC-60870-5 standard [23], specifically the distributed network protocol 3.0 (DNP3). The IEC-62351 standard was developed to address the security issues in the aforementioned protocols [24]. It provides security aspects like hash-based message authentication code for communication authentication. Several vulnerabilities still exist with the widely spread architecture of the power grids despite the various security-enhancing measures available nowadays. This is demonstrated by the many cyberphysical security issues that have been noted in the last few years [5]. Several access methods can be utilized by attackers to launch their AGC-targeted attacks [2], [5]. These include device access (e.g., microprobing and circuit bonding); proximity access (e.g., meter GPS spoofing); and network access (e.g., fake access points and DNS spoofing). Even though numerous verification techniques may be employed in the network as in the IEC-62351 standard, the attackers can circumvent these by observing the network data and traffic for an extended time (e.g., weeks) prior to launching the intended attack. Several vulnerabilities and attacks analyses have been conducted on AGC-associated protocols [2]–[4]. It must be noted that a typical FDIA includes several steps. These start with gathering and investigating the technical design of the system through eavesdropping, extracting the original measurements data, creating untraceable FDIA that would serve the attacker's goals, and finally injecting the planned FDIA's data into the system.

B. FDIA Motivations and Impacts

FDIA targeting AGC systems aims for two goals: first, creating system instability and potential load shedding or blackouts, and second, manipulating the economic operation of the power grid to either gain profits or cause losses. The first motivation is realized as a form of aggressive action against a specific power network where the adversary is interested in disabling the electric network. An example of this is the blackout that occurred in Ukraine in 2015 [6]. The second motivation is the more common one, where the adversary tries to manipulate the measurements of the system to gain profits. Examples of these motivations and possible impacts are discussed next.

1) System Stability Impacts: A main part of the AGC is to stabilize the system by damping the transfer power oscillations between the areas and maintaining the frequency of the system. If the attacker manipulates the readings of the frequency or the power measurements, the AGC system starts to deteriorate, which causes unstable conditions in the grid. This ends with either under-frequency load shedding (UFLS) or overfrequency generation shedding (OFGS) and may lead to a full blackout. In real grids, protective relays (e.g., underfrequency relay) are started, and operators are alerted, under the following conditions.
1) The continuous system frequency deviation exceeds the defined threshold for a period of time, e.g., an alarm rises if a ±0.15 Hz deviation exists for 5 mins, and UFLS
or OFGS initiates for a ±1.5 Hz deviation as per the Australian Energy Market Commission [25].
2) The ACE value exceeds a predefined threshold, e.g., 0.1 p.u., or does not return to zero within 10 mins as per the North American Electric Reliability Council [21].

These alarms and protective measures may worsen the stability of the grid in attack situations, ending with cascaded blackouts. This attack is seen in the FDIAs applied at the ACE signal in Fig. 2. ACE FDIA 1 and ACE FDIA 2 represent two scaling FDIAs that affect the ACE signal with sudden scaling of 2 and gradual scaling change to -1, respectively. As illustrated in Fig. 2, the system frequency destabilizes and causes the frequency protection relay to trigger. This leads to disconnecting some parts of the grid, which may lead to partial or cascaded full blackouts.

Fig. 2. (a) Frequency. (b) ACE signal during normal and different FDIAs.

2) Economic Impacts: To manipulate the economic operation of the grid, the attacker manipulates the received measurements slightly to meet the goals but without causing system instability. In deregulated electric energy markets, the locational marginal pricing (LMP) of the energy in the power grid is governed by the Regional Transmission Organizations, such as the Australian Energy Market Operator in Australia. While the predicted and real-time supplies and demands are almost equivalent, some differences do occur (particularly with renewable energy resources). However, these slight differences imply major profits/losses for the wholesale retailers. Reasons for the LMP changes include transmission congestion and generation costs between and at the different nodes. By manipulating the power readings at the tie-lines, the attacker may gain extra profits. For example, assume the LMP is set at 38.45 $/MWh at the tie-lines connecting different areas that are controlled by different utilities. If the sending utility increases the tie-line power flow measurements by 5% of the line carrying 500 MW, it gains a total illegitimate revenue of $961.25 in one hour (0.05 × 500 MW × 1 h × 38.45 $/MWh). FDIAs targeting tie-line powers are illustrated in Fig. 2. Ptie FDIA 1 and Ptie FDIA 2 represent two shifting FDIAs that affect the tie-line power measurements with 2% and 5% shifts, respectively. These attacks do not lead to frequency instability, which lets them go unnoticed for a long time if planned carefully.

Moreover, the AGC control strategy embraces other economic goals. These include minimizing the fuel cost, avoiding sustained operation of generating units in undesirable conditions, and minimizing the equipment wear and tear by avoiding unnecessary maneuvering of generating units [21]. Thus, any FDIA on AGC affects the economic operation of the grid regardless of the motives of the attack.

C. FDIA Model in AGC

In investigating the probable FDIA on AGC, the following assumptions on the constraints are made: the attacker's resources are limited and they cannot affect the entire system simultaneously; the attacker holds sufficient information of the parameters of the area of interest of the power grid; and the attacker holds sufficient computational capacity to evaluate some needed information. The first assumption is reasonable as the key feature of FDIA is to remain invisible. This requires the attacker to minimize the resources used and the measurements manipulated at one time to reduce the risk of detection [26]. Assumption 2 is acceptable as these parameters can be evaluated using the historic measurements that the attacker analyzes before launching FDIA. Assumption 3 is tolerable given the growing computing capabilities obtainable by the attackers. This is noted in the growing number of cyberattack incidents despite the advanced security measures adopted nowadays [5].

1) Basic FDIA: This attack targets one measurement signal. It does not involve any coordination or design process before launching. This attack can be modeled as

$$\Delta f_i^A = a_f\left(\Delta f_i + A_f\right) \tag{9a}$$

$$\Delta P_{tie_{i.j}}^A = a_P\left(\Delta P_{tie_{i.j}} + A_P\right) \tag{9b}$$

where $\Delta f_i^A$ and $\Delta P_{tie_{i.j}}^A$ refer to the attacked signals of the frequency and tie-line power deviations, respectively. $a_f$ and $a_P$ are arbitrary attack multipliers for the frequency and the tie-line power, respectively. These multipliers result in over-compensation (where $a_f, a_P > 1$), under-compensation (where $0 < a_f, a_P < 1$), or negative compensation (where $a_f, a_P < 0$). Over/undercompensation leads to unstable behavior of the AGC (increased oscillations), while negative compensation leads to greater deviation in the system measurements directly. $A_f$ and $A_P$ are arbitrary attack addends for the frequency and the tie-line power, respectively. Unlike the multipliers, these only lead to a changed setpoint for the AGC and will not affect the stability of the system if they are within an acceptable range, e.g., with $A_f = 0.05$, the frequency stabilizes at 60.05 Hz.

2) Replay FDIA: This is an advanced form of the basic attack. The adversary here records some of the measurements of the targeted signals and then reinjects them into the system. This is modeled as

$$\Delta f_i^A = \Delta f_i^{record} \tag{10a}$$

$$\Delta P_{tie_{i.j}}^A = \Delta P_{tie_{i.j}}^{record} \tag{10b}$$

where $\Delta f_i^{record}$ and $\Delta P_{tie_{i.j}}^{record}$ are the recorded system frequency and tie-line power measurements at an earlier time. This attack
is more difficult to detect as the measurements resemble the normal dynamics of the system. The adversary applies this attack to imitate previous system instability or congestion, or to deliver stable measurements to the control center while the actual system experiences unstable behavior.

3) Stealth FDIA: Unlike replay attacks, this attack is carefully designed to produce the measurements that are desired. Here, the attacker needs to have full knowledge about the system architecture and parameters of the specific area that is attacked. The covertness of this attack is achieved via the following conditions.
1) C1: The attacked frequency signal is within the acceptable range
$$\left|\Delta f_i^A\right| < \Delta f_{MAX}. \tag{11}$$
2) C2: The rate of change of the attacked frequency does not exceed the maximum acceptable threshold $\xi_f$ as
$$\left|\frac{d}{dt}\Delta f_i^A\right| < \xi_f. \tag{12}$$
3) C3: The attacked area's ACE value is within the acceptable range
$$\left|ACE_i^A\right| < ACE_{MAX}. \tag{13}$$
4) C4: The attacked ACE's rate of change does not exceed a specific threshold $\gamma_{ACE}$ as
$$\left|\frac{d}{dt}ACE_i^A\right| < \gamma_{ACE}. \tag{14}$$
5) C5: The final values of the attacked signals $\Delta f_i^A$ and $\Delta P_{tie_{i.j}}^A$ converge to the desired attacker goals.

To satisfy these conditions, a generalized stealth FDIA can be modeled as

$$\Delta f_i^A = a_f(t) * \left(\Delta f_i + A_f(t)\right) \tag{15a}$$

$$\Delta P_{tie_{i.j}}^A = a_P(t) * \left(\Delta P_{tie_{i.j}} + A_P(t)\right) \tag{15b}$$

Unlike the basic attack model in (9), the multipliers and the addends are time-dependent in this attack model. This allows the gradual change of the attack until the adversary reaches the desired value. To reach the desired values in the shortest time, a minimization problem can be formulated subject to the attack model (15) and conditions C1 to C5. The resultant multipliers and addends are known to be the optimal attack sequence.

III. FDIA DETECTION USING LSTM-AE

A. Autoencoders

An AE is a type of deep neural network that tries to reconstruct a given input through encoding and decoding processes [27]. The encoder, constructed from bottleneck layers, maps the input into latent space with a compressed representation of the input. The decoder utilizes this encoded input representation vector in the latent space to reconstruct the original input. An AE can be trained to find a robust representation of the input and recover the original signal from the noise in the input.

Given a multivariate dataset $x = \{x_1, x_2, \ldots, x_m\}$ where $m$ is the total number of input features, the encoder and the decoder processes can be modeled as

$$\psi : x \rightarrow H : f_e(x, \Theta_e) \tag{16}$$

$$\phi : H \rightarrow \tilde{x} : f_d(H, \Theta_d) \tag{17}$$

where $\psi$ and $\phi$ are the encoding and decoding transitions, respectively. $H$ represents the minimum latent feature space. $f_e$ and $f_d$ are nonlinear feature mapping functions of the encoder and the decoder, respectively. $\Theta_e = \{W_e, b_e\}$ and $\Theta_d = \{W_d, b_d\}$ are the trainable weight and bias matrices for the encoder and decoder layers. $f_{e.1} = \sigma(W_e x + b_e)$ and $f_{d.1} = \sigma(W_d H + b_d)$ represent one hidden layer in the encoder and the decoder, respectively, with $\sigma$ representing the sigmoid function. Stacked AE layers include multiple hidden layers with cascaded $f_{e.1}$ and $f_{d.1}$ and different $\Theta_e$ and $\Theta_d$. To reconstruct the input with minimum reconstruction residual, AE tries to find the optimal sets of $\Theta_e$, $\Theta_d$ as

$$\{\Theta_e^*, \Theta_d^*\} = \underset{\Theta_e,\Theta_d}{\operatorname{argmin}} \left\|x - \tilde{x}\right\|_2 = \underset{\Theta_e,\Theta_d}{\operatorname{argmin}} \sum_{i=1}^{m} \left\|x_i - f_d\left(f_e(x_i, \Theta_e), \Theta_d\right)\right\|_2. \tag{18}$$

By checking the reconstruction residual, AE can detect outliers that do not agree with the correlations of the original normal dataset that was used in training. Stacking multiple layers of AE enables us to learn high-level features of the data, which consequently decreases the reconstruction residual. However, AE does not deal with temporal dependency between the samples over time. We, therefore, use a recurrent neural network (RNN) to learn the representation of time samples in the sequence data to feed into the AE.

B. LSTM Neural Networks

LSTM can retrieve information from previous timesteps, forget, and update the data in the internal memory cells at each time step [28]. This enables the neural network to recognize time-dependent relationships. Given a multivariate sequence dataset $\{x^1, x^2, \ldots, x^T\}$, where $x^t \in \mathbb{R}^m$ represents the $m$-dimensional vector at time step $t$, the output of the LSTM unit $h^t$ is determined based on the memory cell state $C^{t-1}$, the intermediate output $h^{t-1}$, and the subsequent input $x^t$. In addition to the memory state, the LSTM structure also includes input gate $i^t$, output gate $o^t$, forget gate $f^t$, and candidate memory cell $\tilde{C}^t$. The full model of the LSTM unit is represented as

$$i^t = \sigma\left(W^{ix} x^t + W^{ih} h^{t-1} + b^i\right) \tag{19}$$

$$f^t = \sigma\left(W^{fx} x^t + W^{fh} h^{t-1} + b^f\right) \tag{20}$$

$$o^t = \sigma\left(W^{ox} x^t + W^{oh} h^{t-1} + b^o\right) \tag{21}$$

$$\tilde{C}^t = \tanh\left(W^{\tilde{C}x} x^t + W^{\tilde{C}h} h^{t-1} + b^{\tilde{C}}\right) \tag{22}$$

$$C^t = i^t \odot \tilde{C}^t + f^t \odot C^{t-1} \tag{23}$$

$$h^t = \tanh\left(C^t\right) \odot o^t \tag{24}$$
where $W^{ix}$, $W^{ih}$, $W^{fx}$, $W^{fh}$, $W^{ox}$, $W^{oh}$, $W^{\tilde{C}x}$, and $W^{\tilde{C}h}$ are the corresponding weight matrices; $b^i$, $b^f$, $b^o$, and $b^{\tilde{C}}$ are the corresponding bias vectors. $\odot$ represents an element-wise multiplication operator. The detailed diagram of a single LSTM unit is illustrated in Fig. 3. The forget gate decides what to forget from the previous memory cell state, the input gate decides what to preserve in the current memory cell state, and the output gate decides what to pass as the LSTM output. Thus, the information from the current timestep affects the output of the LSTM in the future timesteps accordingly. This maps the temporal correlation between the data samples over time.

Fig. 3. LSTM unit Architecture.
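The LSTM unit equations (19)–(24) can be sketched as a single time-step update. The following NumPy snippet is a minimal illustration only, not the authors' implementation (the article reports training in MATLAB). The names `lstm_step` and `init_params`, the dictionary keys, the random initialization, and the layer sizes are assumptions introduced for this example.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid used by the input, forget, and output gates."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step following (19)-(24).

    x_t: input vector at time t (length m)
    h_prev, c_prev: previous output and memory cell state (length n)
    p: dict of weights and biases (hypothetical key names)
    """
    i_t = sigmoid(p["W_ix"] @ x_t + p["W_ih"] @ h_prev + p["b_i"])      # (19) input gate
    f_t = sigmoid(p["W_fx"] @ x_t + p["W_fh"] @ h_prev + p["b_f"])      # (20) forget gate
    o_t = sigmoid(p["W_ox"] @ x_t + p["W_oh"] @ h_prev + p["b_o"])      # (21) output gate
    c_tilde = np.tanh(p["W_cx"] @ x_t + p["W_ch"] @ h_prev + p["b_c"])  # (22) candidate cell
    c_t = i_t * c_tilde + f_t * c_prev                                  # (23) new cell state
    h_t = np.tanh(c_t) * o_t                                            # (24) unit output
    return h_t, c_t

def init_params(m, n, seed=0):
    """Small random weights and zero biases (illustrative initialization only)."""
    rng = np.random.default_rng(seed)
    p = {}
    for gate in "ifoc":
        p[f"W_{gate}x"] = rng.normal(scale=0.1, size=(n, m))
        p[f"W_{gate}h"] = rng.normal(scale=0.1, size=(n, n))
        p[f"b_{gate}"] = np.zeros(n)
    return p

# Run a short synthetic sequence of 6-dimensional samples through a 3-unit cell.
params = init_params(m=6, n=3)
h, c = np.zeros(3), np.zeros(3)
for x_t in np.random.default_rng(1).normal(size=(5, 6)):
    h, c = lstm_step(x_t, h, c, params)
```

Because the cell state `c` is carried from one call to the next, each output depends on earlier samples, which is the temporal dependency the text describes.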
C. FDIA Detection Process Using LSTM-AE

By utilizing LSTM units within the AE network, the spatio-temporal correlations of the dataset sequence can be captured. This allows us to have a machine learning model that can reconstruct the input samples sequence with the minimum reconstruction residual. For AE, this residual is generated as

$$\varepsilon_{AE}(x_i) = \left\|x_i - f_d\left(f_e(x_i, \Theta_e^*), \Theta_d^*\right)\right\|_2 \tag{25}$$

where $\varepsilon_{AE}(x_i)$ represents the residual value of the $i$th sample in the $m$-dimensional vector sample. $\Theta_e^*$ and $\Theta_d^*$ are the optimal sets of AE parameters derived from (18). For LSTM-AE, the residual is represented as

$$\varepsilon_{AE}^{LSTM}\left(x_i^t\right) = \left\|x_i^t - f_d^{LSTM}\left(f_e^{LSTM}\left(x_i^t, \Theta_{LSTM.e}^*\right), \Theta_{LSTM.d}^*\right)\right\|_2 \tag{26}$$

where $\varepsilon_{AE}^{LSTM}(x_i^t)$ represents the residual value of the $i$th sample in the $m$-dimensional vector of samples at timestep $t$; $f_e^{LSTM}$ and $f_d^{LSTM}$ are the nonlinear feature mapping functions of the LSTM-based encoder and the decoder; and $\Theta_{LSTM.e}^*$ and $\Theta_{LSTM.d}^*$ are the optimal sets of LSTM-AE parameters derived from (18) while utilizing (19)–(24). Once the training of the network is finalized, the reconstruction residual should be at a minimum for normal system dynamics. A deviation in this reconstruction residual shall be noted when any of the input data does not agree with the spatio-temporal correlations of the historic dataset that the network was trained on. Based on this fact, the following FDIA detector can be utilized

$$\Psi\left(x_i^t\right) = \begin{cases} 1, & \text{if } \varepsilon_{AE}^{LSTM}\left(x_i^t\right) > \tau_i \\ 0, & \text{otherwise} \end{cases} \tag{27}$$

where $\Psi(x_i^t)$ is the detector output of the $i$th sample in the $m$-dimensional vector of samples at timestep $t$. An output of 1 means an FDIA is present, and 0 indicates a normal measurements sample. $\tau_i$ is the decision threshold that is designed based on the normal distribution of the reconstruction residual of the normal validation set of the system. In this article, the decision threshold is decided based on the maximum reconstruction error in each measuring signal for the normal attack-free samples.

After training the LSTM-AE using normal attack-free grid measurements, we utilize the online LSTM-AE based detection scheme illustrated in Fig. 4. We start by collecting the measurements samples from the grid. These include the different frequency and tie-line power measurements. We normalize them and feed them to the LSTM-AE. The output of the LSTM-AE is a near representation of the original input sequences. We generate the residual of the output after the denormalizing process by comparing it to the input. Finally, we utilize (27) to test for the presence of attacks. The proposed algorithm is implemented within the computing facilities of the control center to examine the received frequency and tie-line power measurements sent to the AGC system. The full generic scheme is shown in Fig. 5.

Fig. 4. Proposed LSTM-AE detection algorithm.

IV. RESULTS AND DISCUSSIONS

To assess the practical performance of the proposed detection algorithm, the New-England IEEE 39 bus system is utilized as illustrated in Fig. 6. It includes ten synchronous machines (with IEEE type 1 excitation system), 34 transmission lines, 12 transformers, and 19 dynamic loads. The total complex power of the system is around 6500 MVA. The full parameters of the system are shown in [29]. The system is divided into three areas with AGC applied to one generating unit in each area. AGC system's parameters are given in Table I. The system is simulated in a real-time
Fig. 5. Generic diagram of the proposed detection algorithm in the AGC system.

Fig. 6. Single-line diagram of IEEE 39 bus system with three areas.

TABLE II
TRAINING PERFORMANCE

A. LSTM-AE Parameters and Training Performance

The training data for a real power grid should include as much of the system's dynamics as possible. In this article, the dataset of 3000 s includes various random load variations that represent real-life power grid dynamics. From this dataset, the LSTM-AE model can be learned, and the threshold can be decided based on the maximum reconstruction error. Using the grid simulated in RTDS, two sets of measurements are collected, one for training purposes and the other for validation tests. Both sets include random dynamic load changes with an overall variation of ±30%. Each dataset includes 3000 s of records of all the frequencies and tie-lines power measurements, with 12 000 total observations at a sampling frequency of 4 samples per second. Therefore, the time series input vector of the LSTM-AE is $x^t = \{f_1^t, f_2^t, f_3^t, P_{tie_{1.2}}^t, P_{tie_{2.3}}^t, P_{tie_{3.1}}^t\}$. These measurements are first normalized using the z-score method. Thus, each measurement signal has a mean of 0 and a standard deviation of 1. The LSTM encoder structure includes an LSTM layer with six units followed by an LSTM layer with three units. The LSTM decoder includes an LSTM layer with three units and an LSTM layer with six units.
The training process includes 1000 epochs, L2 regularization of
TABLE I 0.0001, and a learning rate of 0.001. The loss function is the root
T 
NOMINAL AGC PARAMETERS OF THE SYSTEM  2
mean squared error RMSE = xt − x̃t /T , where T
t−1
is the total observations per sample. The optimization utilized
is the stochastic gradient-based optimization algorithm—Adam.
The training is done using MATLAB R2021a on an Intel Core
i7 CPU @ 3.2 GHz PC with 16 GB RAM. The size of the hidden
units (amount of information remembered between time steps) in
each LSTM unit is varied in three different networks. Network 1
includes LSTM-AE with three hidden units. Therefore, it utilizes
data from 3 previous timesteps before generating the output. Net-
environment using real-time digital simulator (RTDS) where works 2 and 3 include 10 and 25 hidden units, respectively. The
frequency and tie-line power measurements are collected for training performance of these networks is given in Table II along
LSTM-AE training. This real-time simulation could be seen as with the AE network. All LSTM-AE provide better training per-
the closest form of simulation to real-life power grids [30]. The formance compared to the AE network. LSTM-AE (3) illustrates
RTDS Simulator is the world’s leading real-time power system the best result with the minimum RMSE obtained. The fact that
simulator which allows engineers to validate the performance of LSTM-AE with three hidden units performs better than the other
power system devices and derisk deployment. It also allows re- networks indicates that the information retained in these cells is
alistic hardware in the loop studies as shown in [31]. We adopted enough to learn the temporal correlation of the AGC system. In-
this simulation tool as it is difficult to obtain the measurements creasing the number of hidden cells tends to increase the learning
data from real industry for security issues. Second, even if we error slightly which indicates that the information retained with
can obtain these data, we will not have the different and widely a higher LSTM units number degrades the learning process or
ranged grid dynamics scenarios and FDIA scenarios that we does not add much importance to the learning process. It must
need to validate our proposed detection algorithms. be noted that the difference is very minimal among the two


first models LSTM-AE(3) and LSTM-AE(10), especially in the power measurements, which fluctuate more than the frequency measurements. This is further shown in the validation process.

B. Model Performance Evaluation With Normal Grid Dynamics

To verify the efficiency of the developed LSTM-AE networks, a validation process is carried out using the test dataset. Fig. 7 illustrates the testing dataset with the real values of the area 1 frequency (a) and tie-line 1–2 power measurements (b). The outputs of the four trained networks are illustrated as well. LSTM-AE (3) shows the best tracking performance. This is further clarified in Table III. Mean absolute percentage error $\mathrm{MAPE} = (100/T)\sum_{t=1}^{T} |(x^t - \tilde{x}^t)/x^t|$ and absolute percentage error $\mathrm{APE} = 100\,|(x^t - \tilde{x}^t)/x^t|$ are used to compare the networks. LSTM-AE (3) provides the lowest maximum APE for both the frequency and power measurements. Based on the maximum APE, the threshold value can be determined for frequency and tie-line power measurements. Since LSTM-AE (3) gives the best performance, it is adopted for the detection of the FDIA in the following sections. Given that AGC systems might have different dynamics in different power systems, we believe that the number of hidden units should be calibrated for AGC systems in different power grids.

Fig. 7. Validation of the tracking performance on normal measurements signals. (a) Area 1 frequency. (b) Tie-line 1-2 power flow.

TABLE III
VALIDATION PERFORMANCE

C. FDIA Detection

Two scenarios are considered in applying FDIA. The first applies FDIA on a single measurement signal while the second applies FDIA on whole area measurements signals.

1) FDIA on Single Measurement: In this scenario, different FDIAs are applied to a single measurement signal which is the second area frequency $f_2^t$. These attacks are shown and highlighted in Fig. 8(a) where the red color illustrates the attacked measurement signal. The original measurement signal is indicated in the blue color and the yellow color denotes the output of the LSTM-AE. It must be noted that the attacks are applied only in specific intervals with various settings and are summarized as follows.

1) A1: Basic attack with a constant step addition of 0.01 Hz, from t = 200 s to t = 300 s.
2) A2: Basic attack with a constant step subtraction of 0.015 Hz, from t = 700 s to t = 800 s.
3) A3: Stealth attack with a gradual quadratic addition as in (15a), from t = 1000 s to t = 1500 s.
4) A4: Replay attack with a constant frequency of 60 Hz from t = 1700 s to t = 1850 s.
5) A5: Replay attack with reading from t = 1500 s to t = 1600 s, from t = 2000 s to t = 2100 s.
6) A6: Replay attack with reading from t = 1800 s to t = 1900 s, from t = 2200 s to t = 2300 s.
7) A7: Stealth attack with a gradual linear addition as in (15a), from t = 2400 s to t = 2700 s.

Fig. 8. FDIA on area 2 frequency. (a) Frequency. (b) APE.

These attacks are chosen to be small to verify whether the detection scheme can detect these small variations. As illustrated in Fig. 8(b), all these attacks are detected as the reconstruction residual associated with them exceeds the predefined threshold of the system that was defined based on the maximum APE in the previous section. This illustrates the capability of the proposed detection algorithm in detecting small and stealth FDIA in the system. Nonetheless, it must be noted that FDIA that have lower


magnitude are not detected as the reconstruction error is not elevated beyond the predefined threshold.

2) FDIA on Multiple Measurements on the Whole Area: In this scenario, different FDIAs are applied to multiple measurements signals. These are area 1 frequency $f_1^t$, and tie-lines powers $P_{\mathrm{tie}1.2}^t$ and $P_{\mathrm{tie}3.1}^t$. These attacks are shown in Fig. 9(a)–(c) and are summarized as follows.

1) A1: basic multiple attacks with a constant step addition of 0.01 Hz at $f_1^t$, 2 MW at $P_{\mathrm{tie}1.2}^t$, and −4 MW at $P_{\mathrm{tie}3.1}^t$, from t = 200 s to t = 300 s.
2) A2: basic multiple attacks with a constant step subtraction of 0.01 Hz at $f_1^t$, 5 MW at $P_{\mathrm{tie}1.2}^t$, and −5 MW at $P_{\mathrm{tie}3.1}^t$, from t = 700 s to t = 800 s.
3) A3: stealth attack with a gradual quadratic addition in $f_1^t$ and $P_{\mathrm{tie}1.2}^t$, quadratic subtraction in $P_{\mathrm{tie}3.1}^t$ as in (15a) and (15b), from t = 1000 s to t = 2000 s.
4) A4: replay attack with reading from t = 1800 s to t = 1900 s, from t = 2200 s to t = 2300 s.
5) A5: stealth attack with a gradual linear addition in $f_1^t$ and $P_{\mathrm{tie}1.2}^t$, linear subtraction in $P_{\mathrm{tie}3.1}^t$ as in (15a) and (15b), from t = 2400 s to t = 2700 s.

Fig. 9. FDIA on area 1. (a) Frequency. (b) Tie-line power 1-2. (c) Tie-line power 3-1. (d) Frequency APE. (e) Tie-lines power APE.

As shown in Fig. 9(d) and (e), all of these attacks end up crossing the predefined thresholds for both the frequency and tie-line power. This indicates the capability of the proposed LSTM-AE detection algorithm in detecting multiple FDIA that are applied instantaneously at measurements collected from one area in the system. Some parts of the FDIA are not detected when they are very small as shown in the stealth attacks at their beginning.

D. Comparative Analysis With Recent Literature and Industry Solutions

To analyze the performance of the proposed LSTM-AE detection algorithm, a new scenario is made where multiple FDIA are considered at different measurements and areas. The dataset includes 12 000 samples of dynamic grid response, with 50% of these being affected by FDIA. FDIA cases include single measurement attacks, where the measurement signals are affected separately; and whole area attacks, where the measurement signals of an entire area are attacked simultaneously. These attacks include basic and stealth attacks equally. The performance of the LSTM-AE is compared to some state-of-the-art baseline learning-based models, model-based algorithms, and an industrial security protocol. Learning-based models include supervised and unsupervised models. Supervised models include support vector machine (SVM), K-nearest neighbor (KNN), random forest (RF), RNN, and multilayer perceptron network (MLP), where some FDIA and labels are utilized in the training process. Unsupervised learning-based models include traditional stacked AE, principal component analysis k-means clustering (PCAKM), and dynamic time warping k-means clustering (DTWKM). The training of these classifiers is based on the normalized samples. SVM classifier utilizes an RBF kernel to handle the nonlinearity of the system. KNN uses 1 nearest neighbor for classification. RF classifier includes 100 trees with 10 max splits. RNN includes a hidden RNN layer with a single output layer to indicate any attack. MLP includes two hidden layers, each with a number of neurons equal to half of the number of the input neurons, and one output layer. AE has the same parameters as LSTM-AE but without the LSTM units. Similar to LSTM-AE, AE is an unsupervised classification method that is based on defining a threshold for normal unattacked reconstruction residual. K-means clustering algorithms utilize squared Euclidean distance in the clustering process. PCAKM is an unsupervised clustering method applied to the data samples after they are converted to the principal components space where the main first few components are utilized. DTWKM is utilized after the dynamic time warping is applied to the original data samples, which aligns the different time-series data to reduce the sum of the Euclidean distances between corresponding points. All these classifiers are trained with the same PC mentioned earlier. Four evaluation metrics are employed in the comparative analysis and are calculated as:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{28}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{29}$$

$$\mathrm{FAR} = \frac{FP}{FP + TN} \tag{30}$$

$$\mathrm{F1\text{-}score} = \frac{2\,TP}{2\,TP + FP + FN} \tag{31}$$

where TP is the overall accurately labeled FDIA, FP is the overall inaccurately labeled FDIA, TN is the overall accurately


labeled normal sample, and FN is the overall inaccurately labeled normal sample. The cost of FN is that of having a FDIA that is not detected. This may lead to both stability and economic impacts as discussed in Section II.B. On the other hand, the cost of FP could be summarized by having false emergency procedures that would require the control center to run further investigative tasks to confirm the source of this attack. This means increased pressure on the system operators and engineers. It could also affect the stability of the system if the AGC is not responding to a disturbance in the system that is considered as FP. Accuracy illustrates the overall model accuracy. Recall (also known as sensitivity) measures the ability of the classifier to capture the different cases of the FDIA. False alarm rate (FAR) measures the error percentage of characterizing a normal sample as a FDIA. F1-score combines precision and recall into a single measure; a high F1-score means that both FP and FN are low. The outcomes of these evaluation metrics for the different considered classifiers are given in Table IV. LSTM-AE demonstrates the best accuracy and F1-score at 98.78% and 98.77%, respectively. While SVM provides a recall of 100%, it performs poorly on FAR, scoring 10.90%. RNN demonstrates a good recall of 98.21%, which indicates the importance of utilizing recurrent neural units to capture the temporal dependencies. Unsupervised clustering illustrates the worst performance, with DTW clustering showing a slightly better recall rate. AE and LSTM-AE have zero FAR because they utilize predefined thresholds to evaluate the reconstruction residual. This eliminates false alarms in the tested scenarios. LSTM-AE records the second-best recall rate at 97.56%. This indicates that some FDIAs are not detected due to their small magnitude, which should not be of great concern in practice. This shows the excellent performance of the LSTM-AE in comparison to the other baseline classification methods. While some supervised learning models illustrate better recall rates, these models are associated with more drawbacks compared to the unsupervised models. These drawbacks include the need for a historical dataset (labeled samples with synthetic attack scenarios) and the possible overfitting of the training samples, especially for the utilized attack samples. On the other hand, the proposed unsupervised model requires unlabeled historical data that could be easily captured in the industry. The main remaining issue is the extensive training time required, which can take hours. Nonetheless, this is offline training that is required initially before the network is deployed online. While the training of the proposed algorithm requires hours as given in Table II, the detection is made through a straightforward passing of the received data sample through the developed LSTM-AE model. This is basically a multiplication process of the received data sample at each time step with the weights of the developed model, which results in a new output of the LSTM-AE and a new data update in the memory state of the LSTM units. This time is noted to be below 10 ms during the testing process for each data sample. This small detection time allows the quick online handling of FDIA once it is detected.

TABLE IV
COMPARISON OF THE PERFORMANCE WITH RECENT LEARNING-BASED AND OTHER STATE OF THE ART METHODS

On the other hand, several model-based detection algorithms were designed to detect the FDIA in AGC systems. Kernel density estimation was utilized in [9] for detecting FDIA. In [10], the authors proposed a load forecasting-based algorithm to detect FDIA. This requires a precise forecast of the load profile of the grid. Liu et al. [11] developed a detection algorithm utilizing the Lyapunov stability theory to overcome FDIA in AGC systems. Model-based prediction and threshold testing are used in detecting FDIA in [12]. All these methods provide competent detection methodologies with more than a 95% detection rate and less than 5% FAR for FDIA of more than 5%. While this may be seen as a good performance, the drawbacks associated with these algorithms are the main challenge.
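As an illustration of how the threshold detector in (27) and the metrics in (28)–(31) fit together, the following is a minimal Python sketch rather than the implementation used in this article (which was developed in MATLAB). It assumes that the reconstruction residual is the APE, that the threshold is the maximum APE on attack-free validation data, and that measurements are nonzero and expressed in their original units. The function names (`ape`, `fit_threshold`, `detect`, `confusion_metrics`) and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence


def ape(x: Sequence[float], x_hat: Sequence[float]) -> List[float]:
    """Absolute percentage error per sample: 100 * |(x - x_hat) / x|.

    Assumes every measurement x is nonzero (hypothetical simplification).
    """
    return [100.0 * abs((a - b) / a) for a, b in zip(x, x_hat)]


def fit_threshold(x_normal: Sequence[float], x_normal_hat: Sequence[float]) -> float:
    """Threshold tau taken as the maximum APE on attack-free data."""
    return max(ape(x_normal, x_normal_hat))


def detect(x: Sequence[float], x_hat: Sequence[float], tau: float) -> List[int]:
    """Detector in the spirit of (27): 1 if the residual exceeds tau, else 0."""
    return [1 if e > tau else 0 for e in ape(x, x_hat)]


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, recall, FAR, and F1-score as written in (28)-(31)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "far": fp / (fp + tn) if (fp + tn) else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0,
    }


# Toy frequency readings in Hz (made-up values, not results from the article).
x_val = [60.00, 60.01, 59.99, 60.02]        # attack-free measurements
x_val_hat = [60.00, 60.00, 60.00, 60.01]    # stand-in for autoencoder outputs
tau = fit_threshold(x_val, x_val_hat)

x_test = [60.00, 60.05, 60.005, 60.06]      # second and fourth samples attacked
x_test_hat = [60.00, 60.00, 60.00, 60.00]   # stand-in for autoencoder outputs
y_true = [0, 1, 0, 1]
y_pred = detect(x_test, x_test_hat, tau)
metrics = confusion_metrics(y_true, y_pred)
print(y_pred, metrics)
```

Because the threshold is fixed from normal data alone, the sketch needs no labeled attacks to build the detector; labels are used only afterwards to score it.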


These drawbacks include the need for precise system models and parameters, extensive computation, serious detection delay, and possible divergence [5], [13]. The extensive computation of model-based detection algorithms is due to the associated equations and possible optimizations that need to be solved for each received measurements sample as detailed in [5]. On the other hand, data-driven methods would indeed require extensive training, but they would not require extensive computation once the training process is finalized. Therefore, the online detection of FDIA would not require extensive computations using data-driven algorithms.

Unlike other detection methods, a mitigation algorithm is proposed in [14] where the authors utilize dynamic control loops to overcome the possible FDIA. This provides a good mitigation approach in theory, yet the applicability of this algorithm in real-life power systems is questionable. For instance, the authors assume that all of the generators in the power grid are equipped with AGC. This is not realistic and would require extensive hardware upgrades in practice. Furthermore, the complexity and nonlinearity of the AGC systems limit the utilization of many model-based detection algorithms as they assume a linearized model of the AGC system, which results in wrong decisions in the industry [15]. These drawbacks call into question the applicability of these model-based algorithms in real-life networks.

The IEC-62351 standard was prepared to handle the security problems in power grid communications in the industry [32]. It is the main solution for securing the communication platforms for measurements exchange in the power system. While this protocol offers extensive solutions, it cannot protect the exchanged data fully. Table IV gives a comparative analysis of the proposed detection algorithm against the IEC-62351 standard. The IEC-62351 standard can detect FDIA at the network and communication links. However, it cannot detect FDIA targeting the measurement device itself. This is because the original data are injected with false data while being captured and prior to the communication process and the verification step. Moreover, the IEC-62351 standard can be circumvented when an expert attacker is involved. This attacker can spend weeks observing communication links and network packets to decode the verification processes involved before launching FDIA. These kinds of FDIA are detectable with a good percentage using the proposed algorithm in this article.

E. Robustness to Noise

In actual power systems, the collected grid measurements are affected by different levels of noise. For instance, the signal-to-noise ratio (SNR) of the collected measurements data is 87 dB in New England [33]. Reported SNR values vary starting from 30 dB to higher values [34], [35]. To check the robustness of the detection algorithm developed in this article, the performance metrics are compared when the SNR varies from 30 to 100 dB. White Gaussian noise with zero mean and a standard deviation corresponding to the desired SNR is added to both the training and the testing datasets. Table V gives the performance metrics in the noisy dataset. As given in the table, the recall rates of the FDIA detection decrease slightly when the SNR tends to decrease. This decrease is due to the fact that a higher noise ratio affects the detection performance of the proposed algorithm by increasing the threshold level to adapt to the effect of the noise. Thus, the detection algorithm would not be able to detect FDIA with a magnitude of less than the noise levels in the data samples. Nevertheless, even with these noise levels, the accuracy and the F1-score of the proposed detection algorithm illustrate better rates than the other data-driven detection algorithms without the noise effect.

TABLE V
COMPARISON OF THE PERFORMANCE METRICS WITH THE NOISY MEASUREMENT

V. CONCLUSION

This article proposed a spatio-temporal learning model to learn the normal spatial and temporal correlations of the power grid dynamics under AGC. This was achieved using LSTM-AE neural networks. Once the model was learned, the detection of the FDIA was achieved by evaluating the reconstruction residual of the measurement samples. A main advantage of the proposed algorithm is that learning is done in a self-supervised manner that does not require labeled attack samples in the learning process. The comparative analysis with the baseline attack detectors in the literature illustrates the superior performance of the proposed algorithm. The proposed unsupervised algorithm highlights any suspicious data samples that do not agree with the spatio-temporal correlation of the original system. Thus, the limitations of the algorithm include the inability to detect FDIA that agree with the spatio-temporal correlation of the original system and the inability to detect low-magnitude FDIA that do not raise the reconstruction error of the LSTM-AE above the threshold level. These FDIA are either inapplicable in real life or have a negligible effect on the operation of the AGC system. For instance, a FDIA that agrees fully with the spatio-temporal correlation of the original system requires the adversary to have access to the entire set of measurements of the system. This is an extremely challenging task as suggested in the literature [5]. The main remaining issue is the extensive training time required, which can take hours. Nonetheless, this is offline training that is required initially before the network is deployed online. Future work shall include extending the learned model to comprise other input features, such as network data traffic. This shall further enhance the detection performance for FDIA applied at the network layer.

REFERENCES

[1] A. J. Wood, B. F. Wollenberg, and G. B. Sheblé, Power Generation, Operation, and Control. New York, NY, USA: Wiley, 2013.
[2] A. S. Musleh, G. Chen, Z. Y. Dong, C. Wang, and S. Chen, “Vulnerabilities, threats, and impacts of false data injection attacks in smart grids: An overview,” in Proc. Int. Conf. Smart Grids Energy Syst., 2020, pp. 77–82.


[3] Y. Xu, Y. Yang, T. Li, J. Ju, and Q. Wang, “Review on cyber vulnerabilities of communication protocols in industrial control systems,” in Proc. IEEE Conf. Energy Internet Energy Syst. Integr., 2017, pp. 1–6.
[4] I. Darwish, O. Igbe, O. Celebi, T. Saadawi, and J. Soryal, “Smart grid DNP3 vulnerability analysis and experimentation,” in Proc. Int. Conf. Cyber Secur. Cloud Comput., 2015, pp. 141–147.
[5] A. S. Musleh, G. Chen, and Z. Y. Dong, “A survey on the detection algorithms for false data injection attacks in smart grids,” IEEE Trans. Smart Grid, vol. 11, no. 3, pp. 2218–2234, May 2020.
[6] K. E. Hemsley and R. E. Fisher, “History of industrial control system cyber incidents,” Dec. 31, 2018. Accessed: Jan. 21, 2021. [Online]. Available: https://www.osti.gov/biblio/1505628-history-industrial-control-system-cyber-incidents
[7] V. Y. Pillitteri and T. L. Brewer, “Guidelines for smart grid cybersecurity,” Sep. 25, 2014. Accessed: Jun. 12, 2020. [Online]. Available: https://www.nist.gov/publications/guidelines-smart-grid-cybersecurity
[8] M. Vrakopoulou, P. M. Esfahani, K. Margellos, J. Lygeros, and G. Andersson, “Cyber-Attacks in the automatic generation control,” in Cyber Physical Systems Approach to Smart Electric Power Grid. Berlin, Germany: Springer, 2015, pp. 303–328.
[9] S. Sridhar and M. Govindarasu, “Model-Based attack detection and mitigation for automatic generation control,” IEEE Trans. Smart Grid, vol. 5, no. 2, pp. 580–591, Mar. 2014.
[10] S. D. Roy and S. Debbarma, “Detection and mitigation of cyber-attacks on AGC systems of low inertia power grid,” IEEE Syst. J., vol. 14, no. 2, pp. 2023–2031, Jun. 2020.
[11] J. Liu, Y. Gu, L. Zha, Y. Liu, and J. Cao, “Event-triggered H∞ load frequency control for multiarea power systems under hybrid cyber attacks,” IEEE Trans. Syst., Man, Cybern., Syst., vol. 49, no. 8, pp. 1665–1678, Aug. 2019.
[12] R. Tan et al., “Modeling and mitigating impact of false data injection attacks on automatic generation control,” IEEE Trans. Inf. Forensics Secur., vol. 12, no. 7, pp. 1609–1624, Jul. 2017.
[13] S. Wen, Y. Wang, Y. Tang, Y. Xu, P. Li, and T. Zhao, “Real-Time identification of power fluctuations based on LSTM recurrent neural network: A case study on Singapore power system,” IEEE Trans. Ind. Informat., vol. 15, no. 9, pp. 5266–5275, Sep. 2019.
[14] A. Patel, S. Roy, and S. Baldi, “Wide-Area damping control resilience towards cyber-attacks: A dynamic loop approach,” IEEE Trans. Smart Grid, vol. 12, no. 4, pp. 3438–3447, Jul. 2021.
[15] J. Morsali, K. Zare, and M. T. Hagh, “Appropriate generation rate constraint (GRC) modeling method for reheat thermal units to obtain optimal load frequency controller (LFC),” in Proc. Conf. Thermal Power Plants, 2014, pp. 29–34.
[16] C. Chen, K. Zhang, K. Yuan, L. Zhu, and M. Qian, “Novel detection scheme design considering cyber attacks on load frequency control,” IEEE Trans. Ind. Informat., vol. 14, no. 5, pp. 1932–1941, May 2018.
[17] A. Abbaspour, A. Sargolzaei, P. Forouzannezhad, K. K. Yen, and A. I. Sarwat, “Resilient control design for load frequency control system under false data injection attacks,” IEEE Trans. Ind. Electron., vol. 67, no. 9, pp. 7951–7962, Sep. 2020.
[18] S. D. Roy, S. Debbarma, and S. Deb, “A comparative analysis of supervised classifiers employing NCA for feature selection to secure generation control,” in Proc. Int. Conf. Power Electron. Energy, 2021, pp. 1–6.
[19] C. Chen, Y. Chen, J. Zhao, K. Zhang, M. Ni, and B. Ren, “Data-Driven resilient automatic generation control against false data injection attacks,” IEEE Trans. Ind. Informat., vol. 17, no. 12, pp. 8092–8101, Dec. 2021.
[20] A. Essien and C. Giannetti, “A deep learning model for smart manufacturing using convolutional LSTM neural network autoencoders,” IEEE Trans. Ind. Informat., vol. 16, no. 9, pp. 6069–6078, Sep. 2020.
[21] P. Kundur, Power System Stability and Control. New York, NY, USA: McGraw-Hill, 1994.
[22] “IEC 61850 Communication networks and systems for power utility,” Int. Electrotechn. Commiss., Geneva, Switzerland, 2004.
[23] G. Clarke, D. Reynders, and E. Wright, Practical Modern SCADA Protocols: DNP3, 60870.5 and Related Systems. London, U.K.: Newnes, 2003.
[24] S. M. S. Hussain, T. S. Ustun, and A. Kalam, “A review of IEC 62351 security mechanisms for IEC 61850 message exchanges,” IEEE Trans. Ind. Informat., vol. 16, no. 9, pp. 5643–5654, Sep. 2020.
[25] R. Panel, “Frequency operating standard,” Jan. 1, 2020. Accessed: Mar. 3, 2021. [Online]. Available: https://www.aemc.gov.au/sites/default/files/content/c2716a96-e099-441d-9e46-8ac05d36f5a7/REL0065-The-Frequency-Operating-Standard-stage-one-final-for-publi.pdf
[26] J. Zhao, L. Mili, and M. Wang, “A generalized false data injection attacks against power system nonlinear state estimator and countermeasures,” IEEE Trans. Power Syst., vol. 33, no. 5, pp. 4868–4877, Sep. 2018.
[27] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge, MA, USA: MIT Press, 2016.
[28] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, 1997.
[29] A. Pai, Energy Function Analysis for Power System Stability. London, U.K.: Kluwer Acad. Publ., 1989.
[30] “RTDS Technologies,” RTDS, Canada. Accessed: Jan. 2, 2022. [Online]. Available: https://www.rtds.com/technology/
[31] A. S. Musleh, S. Muyeen, A. Al-Durra, and I. Kamwa, “Testing and validation of wide-area control of STATCOM using real-time digital simulator with hybrid HIL–SIL configuration,” IET Gener., Transmiss. Distrib., vol. 11, no. 12, pp. 3039–3049, 2017.
[32] “IEC 62351:2022 SER,” Int. Electrotechn. Commiss., Geneva, Switzerland. Jan. 4, 2022, Accessed: Jan. 8, 2022. [Online]. Available: https://webstore.iec.ch/publication/6912
[33] W. Li, M. Wang, and J. H. Chow, “Real-time event identification through low-dimensional subspace characterization of high-dimensional synchrophasor data,” IEEE Trans. Power Syst., vol. 33, no. 5, pp. 4937–4947, Sep. 2018.
[34] A. Bhandari, H. Yin, Y. Liu, W. Yao, and L. Zhan, “Real-time signal-to-noise ratio estimation by universal grid analyzer,” in Proc. Int. Conf. Smart Grid Synchronized Meas. Analytics, 2019, pp. 1–6.
[35] Y. Zhang et al., “Wide-area frequency monitoring network (FNET) architecture and applications,” IEEE Trans. Smart Grid, vol. 2, no. 1, pp. 159–167, Sep. 2010.

Ahmed S. Musleh (Member, IEEE) received the M.Sc. degree in electrical engineering from the Petroleum Institute (currently Khalifa University), Abu Dhabi, United Arab Emirates, in 2016. He is currently working toward the Ph.D. degree in electrical engineering with the School of Electrical Engineering and Telecommunications, University of New South Wales, Sydney, NSW, Australia.
His research interests include smart grid technologies, wide-area monitoring and control, cyberphysical security, and machine learning applications.
Mr. Musleh was the recipient of the Abu Dhabi University Overall Award of Excellence and the Petroleum Institute Graduate Fellowship in 2014 and 2015, respectively.

Guo Chen (Member, IEEE) received the Ph.D. degree in electrical engineering from the University of Queensland, Brisbane, QLD, Australia, in 2010.
He is currently a Senior Lecturer with the School of Electrical Engineering and Telecommunications, University of New South Wales, Sydney, NSW, Australia. Previously, he held academic positions with the Australian National University, University of Sydney, and the University of Newcastle. His research interests include sustainable energy system modeling, artificial intelligence, optimization and control, and their applications in smart grids.
Dr. Chen is an Editor for the IEEE TRANSACTIONS ON SMART GRID.

Zhao Yang Dong (Fellow, IEEE) received the Ph.D. degree in electrical engineering from the University of Sydney, Sydney, NSW, Australia, in 1999.
He is currently with Nanyang Technological University, Singapore. His previous roles include SHARP Professor and the Director of UNSW Digital Grid Futures Institute, University of New South Wales, Australia; and the Ausgrid Chair and the Director of the Ausgrid Centre for Intelligent Electricity Networks providing support for the Smart Grid, Smart City national demonstration project. His research interests include smart grid, smart cities, power system planning and stability, renewable energy systems, electricity market, and computational methods for power engineering applications.
Dr. Dong is currently an Editor for a number of IEEE Transactions and IET Journals.


Chen Wang received the Ph.D. degree in computer science from Nanjing University, Nanjing, China, in 1998.
He is currently a Principal Research Scientist with CSIRO Data61, Sydney, NSW, Australia. He leads and develops machine learning and data analytics systems for various domains, including radio astronomy, health, and agriculture as well as smart grids. His research interests include the interpretability, robustness, and scalability of data-driven systems.

Shiping Chen (Senior Member, IEEE) received the Ph.D. degree in computer science from the University of New South Wales, Sydney, Australia, in 2001.
He is a Principal Research Scientist in CSIRO Data61, Sydney. He is currently a Conjoint Professor with the University of New South Wales (UNSW), Sydney, and Macquarie University Macquarie Park, Sydney. He has been working on distributed systems for more than 20 years with a focus on performance and security. He has authored or coauthored more than 240 research papers and technical reports in these areas. He has been actively participating in research communities through publishing papers, journal editorships, and conference PC/Chair services. His current research interests include application security, blockchain, and service-oriented trusted collaboration.
Mr. Chen is a Fellow of the Institute of Engineering Technology.
