IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, VOL. 18, NO. 8, AUGUST 2022 5055

Data/Model Jointly Driven High-Quality Case Generation for Power System Dynamic Stability Assessment

Lipeng Zhu, Member, IEEE, and David J. Hill, Life Fellow, IEEE

Abstract: For data-driven dynamic stability assessment (DSA) in power systems, learning cases collected from actual historical records appear to be more reliable than those obtained from numerical simulations with an inevitable reality gap. However, due to the scarceness of transient events in practical systems, historical case sets generally encounter the small sample size and class-imbalance problems. To tackle these challenging issues, this article proposes a novel data/model jointly driven framework to generate high-quality cases for power system DSA applications. Model-driven numerical simulations are first utilized for rough case generation, based upon which case refinement is then intelligently carried out via cycle generative adversarial network (CycleGAN) learning. In this data-driven manner, the CycleGAN is able to produce refined cases highly resembling actual historical ones. A long short-term memory-based semisupervised learning scheme is further designed to reliably label all the refined cases. Numerical tests are comprehensively carried out on the realistic Guangdong Power Grid in South China. With only a small and skewed historical case set initially provided, the proposed framework is able to generate highly realistic cases to augment the set and mitigate the class-imbalance issue. These synthetic cases further help derive a more discerning DSA model, which contributes to enhanced reliability and adaptability of online DSA in practical power grids.

Index Terms: Case generation, deep neural networks (DNNs), dynamic stability assessment (DSA), generative adversarial networks (GANs), synchrophasor measurements, time series (TS).

Manuscript received May 24, 2021; revised August 9, 2021, August 30, 2021, and September 18, 2021; accepted October 24, 2021. Date of publication October 29, 2021; date of current version May 6, 2022. This work was supported in part by the Fundamental Research Funds for the Central Universities of China and in part by the Research Grants Council of Hong Kong Special Administrative Region under the Theme-based Research Scheme under Project T23-701/14-N. Paper no. TII-21-2172. (Corresponding author: Lipeng Zhu.)
Lipeng Zhu is with the College of Electrical and Information Engineering, Hunan University, Changsha 410082, China (e-mail: zhulpwhu@126.com).
David J. Hill is with the Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong 999077, China, and also with the School of Electrical Engineering and Telecommunications, The University of New South Wales, Sydney, NSW 2052, Australia (e-mail: [email protected]).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TII.2021.3123823.
Digital Object Identifier 10.1109/TII.2021.3123823
1551-3203 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

How to perform reliable yet efficient online dynamic stability assessment (DSA) after event/contingency occurrence is of vital importance in electric power system online monitoring. Owing to the rise of synchrophasor measurement technologies in modern power systems [1], [2], wide-area phasor measurement units (PMUs) provide immense opportunities to fulfill this task. For example, with system-wide dynamic states/responses captured by massive synchronized PMU data, data-driven solutions can be figured out to perform online DSA in a reliable and efficient way. In particular, on the basis of machine learning (ML) techniques, the relationships between initial system states/responses and eventual stability statuses/margins can be intelligently deduced for online DSA. Following this idea, many ML approaches have recently shown high potential to derive powerful data-driven DSA models, including decision trees (DTs) [3], [4], support vector machines (SVMs) [5], artificial neural networks (ANNs) [6], [7], extreme learning machines (ELMs) [8], [9], and emerging deep neural networks (DNNs) [10], [11], etc. These efforts have achieved excellent results in various dynamic stability issues, e.g., transient stability [3], [5], [7], [10], [11], voltage stability [4], [6], [8], and frequency stability [9].

The realization of the majority of existing ML-based DSA schemes in practical power systems takes an implicit assumption that data-driven DSA models can be reliably derived from an offline learning case set (CS), which is often generated by numerical time-domain (TD) simulations of the physical systems. However, as precise power system modeling and simulation is still an open issue [12], errors are inevitable in TD simulation results. In this regard, the simulation results present a nontrivial reality gap with respect to the systems' actual behaviors, which would degrade the fidelity of the simulated cases. Consequently, the DSA models derived by ML could drift regarding the actual characteristics of the physical systems, leading to insufficient adaptability and reliability in practical online applications.

To address this issue, instead of relying on TD simulations for case generation, one may try to collect more reliable cases that actually happened in practical power systems by retrieving their historical operating records. However, this is likely to introduce small sample size and class-imbalance issues into DSA model training. Specifically, with the enhancement of modern power system planning, operation, and control, transient events or
contingencies rarely occur in practice. Moreover, even after being subject to severe faults, today's power grids can remain stable in most cases, which makes unstable scenarios quite scarce. Under this circumstance, the collected historical CS may contain only a small number of cases, within which very few unstable cases are included. Given such a skewed CS with limited cases, it is difficult for standard ML algorithms to sufficiently learn stability knowledge for online DSA.

Considering this difficulty, a viable solution is to augment the historical CS with synthetic cases that behave realistically. Then, the main concern lies in how to improve the quality and fidelity of the synthetic cases. In fact, although the simulated cases obtained by TD simulations may not be realistic enough for DSA model training, they provide a rough version of synthetic cases. If the underlying relationships between such simulated cases and actual historical cases are fully captured, they can be leveraged to refine the former and, thus, boost their fidelity. By doing so, one can circumvent the intractable problem of reducing power system modeling and simulation errors to produce high-quality cases for DSA applications. The goal of capturing the potentially complex relationships between simulated cases and actual cases can be achieved by using a new kind of intelligent DNNs called the generative adversarial network (GAN) [13], [14]. Based on a two-player gaming mechanism, GAN can be utilized to create realistic data with no model/distribution assumption. Recently, a number of inspiring GAN-based research efforts have been made in the power engineering field [15]–[19]. Their success reveals the feasibility and potential of generating realistic data with GANs in practical power system monitoring and operation.

Drawing the abovementioned ideas together, this article aims to tackle the challenging problem of DSA-oriented high-quality case generation via a data/model jointly driven framework. In particular, model-based TD simulations are first performed to roughly generate simulated cases. Taking them as the inputs, a powerful GAN algorithm called CycleGAN [20] is utilized to refine them and enhance their fidelity in a data-driven manner. By learning the bidirectional relationships between simulated cases and actual historical cases in an organized way, CycleGAN is able to produce highly realistic cases as many as necessary. Such cases would help augment the diversity and reliability of the historical CS, thus contributing to more reliable online DSA in practical power grids. The main contributions and merits of this work include the following.
1) This article systematically develops a data/model jointly driven framework to generate high-quality cases for DSA, which can help improve the reliability and practicability of general data-driven DSA schemes in realistic power grids.
2) A novel CycleGAN-based case refinement method is proposed to intelligently refine roughly simulated cases, making them resemble actual cases for fidelity augmentation.
3) To generate cases with complete input–output information, a long short-term memory (LSTM)-based semisupervised learning (SSL) scheme is carefully designed to efficiently and reliably label refined cases.

The rest of this article is organized as follows. Section II first provides an overview of related work in the literature. Then, focusing on power system transient stability assessment (TSA), Section III describes the data inadequacy issue encountered by historical data-driven TSA and presents the basic idea to address it, i.e., data/model jointly driven case generation. Section IV details the proposed case generation framework. In Section V, numerical tests are carried out on the realistic Guangdong Power Grid (GPG) in South China to verify the proposed framework's performance. Finally, Section VI concludes this article and summarizes relevant future work.

II. RELATED WORK

A. Generative Learning and CycleGAN

In the broad field of ML-based industrial informatics, many practical applications suffer from the small sample size issue due to the difficulty in acquiring sufficient samples from the operational contexts. To address this issue, one of the most effective ways is to appropriately supplement synthetic samples via generative learning [21]. For example, given the pedestrian reidentification task in smart transportation, Han et al. [22] utilized a genetic algorithm to produce virtual samples, showing that virtually generated samples can effectively alleviate the small sample size problem. Based on hand-crafted feature representation, this method produces samples in the form of dimension-reduced feature vectors. Consequently, it may be incompetent in other domains, where samples with highly complicated data structures and characteristics need to be generated. To support sample generation with complex data structures and distributions, Goodfellow et al. [13] developed the powerful GAN approach in 2014. With no need for explicit data distribution assumptions or hand-crafted feature engineering, this deep learning-enabled technique and its improved versions have been successfully applied to many areas, e.g., image synthesis/editing, style transfer, object detection, music/video generation, and language/speech processing [23]–[25].

Among various existing variants of the GAN technique, CycleGAN is a special version devoted to unpaired data transformation tasks, e.g., image-to-image translation [20]. It introduces a cycle consistency regularization to preserve the original data information after bidirectional data transformation, thus enabling unpaired data transformation in a flexible yet plausible manner [24]. Owing to this merit, CycleGAN has exhibited high potential in industrial informatics. To list a few, CycleGAN was adopted in [26] to translate pedestrian images across different camera views, which effectively mitigates the class-imbalance problem in ML-based pedestrian reidentification. In [27], the cycle consistency idea of CycleGAN was incorporated into the adversarial autoencoder (AAE) algorithm to learn the latent characteristics of operational data in process industries, which results in human-level decision-making. To build a high-fidelity digital twin for a life-cycle rolling bearing, CycleGAN was used in [28] to reduce the gap between the virtual environment and the physical context.

Considering the need for high-quality data generation, this work also employs the CycleGAN method to generate power system operational data with complex spatial-temporal dynamics. Compared with the abovementioned representative CycleGAN-based studies [26]–[28], its major differences are

briefly highlighted here. In [26], only real images captured by networked cameras are taken as the source data to implement image translation, which may limit the number of generated samples and the effect of class-imbalance mitigation. By comparison, with a model/data jointly driven mechanism, this article utilizes a numerical simulation engine to prepare rough samples as many as necessary for CycleGAN-based translation. While the approach in [27] aims to perform posterior distribution inference by exploiting the capability of GAN in implicitly capturing complex data distributions, this article focuses on producing realistic training data via generative learning. Comparatively speaking, perhaps the scheme in [28] is more similar to the one presented in this article, yet it does not consider how to annotate the health state of the bearing data refined by CycleGAN. In contrast, as discussed in the next section, this article further introduces an SSL scheme to reliably label the synthetic samples.

B. SSL for Data Annotation

As GAN generally generates new samples without labels [26], additional efforts are needed to label them for downstream supervised applications. Since the initial real samples for GAN training are often labeled, data annotation can be implemented via SSL to fully learn knowledge from both labeled (real) and unlabeled (synthetic) samples [29]. Supposing testing samples are not seen during offline training, inductive SSL [30] is mainly considered here. According to the taxonomy in [29], inductive SSL methods can be grouped into the following categories:
1) SSL with unsupervised preprocessing, e.g., feature extraction, cluster-then-label, and pretraining;
2) wrapper-based SSL, including self-training, cotraining, and boosting;
3) intrinsic SSL (directly optimizing the learning objective with both labeled and unlabeled data), such as maximum margin-based, perturbation-based, manifold-based, and generative model-based SSL.

In many areas of industrial informatics, SSL has been extensively applied to reliably annotate data acquired from practical environments. Typical examples include SSL-based 3-D hand pose estimation [31], key performance indicator prediction in industrial processes based on ensemble SSL [32], and industrial fault diagnosis via semisupervised broad learning [33]. In this article, SSL is also carried out to efficiently label the synthetic data obtained from CycleGAN. The SSL scheme introduced in this work differs from those in the aforementioned representative studies [31]–[33] in the following respects. The approach in [31] implements SSL via predictive coding and regressive estimation, with a self-learning mechanism that forwards the estimation errors to inputs to enable learning from its own mistakes. On the contrary, with the integration of the LSTM and autoencoder algorithms, this article performs SSL in a two-step manner, i.e., unsupervised pretraining and supervised fine-tuning. Different from this paradigm, the SSL solution in [32] is realized via a gated stacked autoencoder, while the SSL alternative in [33] is carried out with manifold regularization. Besides, unlike the studies in [32] and [33] merely annotating vectorized data without intricate temporal dependencies, the SSL scheme in this article is devoted to labeling dynamic data with much more complex spatial-temporal correlations.

C. Relevant Applications in Power Engineering

In the specific domain of power engineering that covers the subject of power system DSA in this article, the abovementioned GAN and SSL algorithms have also been successfully adopted to address diverse issues. A brief overview of relevant applications reported in recent years and of the key differences between them and this work is provided below.

1) GAN-Related Applications: In [15], based on conditional GAN, sequential wind/solar power data were synthesized in a natural manner for renewable scenario generation. To cope with the missing data issue in power system DSA, Ren et al. [16] developed a GAN-based scheme to impute missing data in practical PMU measurements. Similarly, lossy solar power data were reliably imputed with GAN in [17]. Li et al. [18] proposed a GAN-based approach to recover PMU data against false data injection attacks. In [19], wind turbine monitoring data were augmented via GAN to tackle the small sample size problem in wind turbine fault detection. Considering the data privacy issue, a privacy-preserving renewable scenario generation scheme was designed in [34] by incorporating federated learning with GAN. Instead of leveraging GAN to synthesize data, Yuan et al. [35] utilized GAN to implicitly infer the complicated distributions of smart meter data under normal conditions for outage detection. Among these studies, despite their distinct application contexts, [16] and [18], which focus on producing system-wide operational data, appear to be the most similar ones to this article. However, they merely address single snapshots of power system data in steady states. Different from them, this article strives to tackle the more challenging task of generating power system transient data comprehensively covering prefault, fault-on, and postfault dynamics.

2) SSL-Related Applications: For the purpose of reducing the offline labeling efforts, Liu et al. [36] utilized a special version of cotraining algorithms called tritraining to implement SSL-assisted DSA. Based on manifold regularization, Huang et al. [37] introduced a semisupervised ELM algorithm to efficiently implement photovoltaic system fault diagnosis with limited labeled data. Analogously, Yan et al. [38] leveraged the semisupervised ELM technique to estimate dynamic security limits in hybrid ac/dc power grids. In [39], a semisupervised AAE-based scheme was developed to detect false data injection attacks in smart distribution systems. Overall, the SSL algorithms involved in these studies [36]–[39] are different from the SSL method in this article. Comparatively speaking, the work in [39] with unsupervised preprocessing and supervised training shares the most similarities with this article. Yet it is limited to handling flat operational data acquired from steady states, while the LSTM-enabled SSL method in this article can effectively address complex trajectory data obtained from wide-area power system dynamics.

To sum up, this work systematically integrates model-based rough simulations with data-driven CycleGAN learning and


LSTM-enabled SSL to form an intelligent framework of high-fidelity data generation for power system DSA. Such a unique learning framework and the resulting strong capability in generating and annotating power system transient data with complicated spatial-temporal correlations make this work differ from relevant research efforts in the literature.

III. PROBLEM DESCRIPTION

In general, an ML-based DSA scheme infers the relationship between initial system states/responses and eventual stability statuses by learning from a CS prepared offline. Taking TSA for instance, a CS with $n$ transient cases can be compactly described as

$$S = \{ (x_i, y_i) \mid x_i = \{V_i, \theta_i, \Delta f_i, \ldots\},\ \text{for } 1 \le i \le n \} \tag{1}$$

where $(x_i, y_i)$ is an input–output pair that represents the $i$th case ($1 \le i \le n$). Specifically, $x_i$ denotes the multivariable input of responsive time series (TS) data, characterizing system dynamics of case $i$ during the transient period; $y_i \in \{0, 1\}$ is the discrete output (class label), indicating the system stability status of case $i$ (0 → unstable, 1 → stable). Note that, for the sake of deriving a compact CS, only those variables that have strong correlations to the specific stability issue of interest and can be conveniently measured by PMUs are considered for input data acquisition. In this article, with emphasis on transient stability, three types of variables, which can be directly measured at generator buses by PMUs, i.e., bus voltage magnitudes ($V_i$), relative bus voltage angles ($\theta_i$, with reference to a certain generator bus), and bus frequency deviations ($\Delta f_i$), are taken as the components of $x_i$. Considering the need for fast stability prediction in early stages, a relatively short time window $T_{\text{win}}$ ($|T_{\text{win}}| < 1$ s) that covers prefault, fault-on, and postfault processes is employed to acquire PMU measurements of $\{V, \theta, \Delta f\}$ for each case. With the help of advanced fault detection technologies [40], $T_{\text{win}}$ is configured to start collecting PMU data from the prefault time instant being closest to fault occurrence, so as to include the prefault–fault-on transition into $x_i$.

The transient cases in $S$ can be jointly gathered from the historical database of the specific system and its batch TD simulation results

$$S = S_{\text{hist}} \cup S_{\text{sim}} \tag{2}$$

where $S_{\text{hist}}$ denotes the collection of cases acquired from the historical database, and $S_{\text{sim}}$ represents the group of cases collected from batch TD simulation results.

In $S_{\text{hist}}$, transient cases recorded by practical PMU measurements have high fidelity to characterize actual system dynamic behaviors. However, as transient events do not frequently occur in today's power grids, it would be impossible to construct a large CS by merely querying historical databases. For instance, practical operating records in China Southern Power Grid show that the whole grid encountered no more than 50 transient events in the first half of 2019. Moreover, with enhanced reliability and resilience, instability rarely happens in modern power grids even if severe transient faults occur. This would further cause a class-imbalance problem in $S_{\text{hist}}$, i.e., stable cases have a much higher proportion than unstable ones [4].

Faced with the shortage of historical cases, especially the scarceness of unstable cases, most of the existing DSA efforts resort to TD simulations for data augmentation. Based on a TD simulation engine, transient cases can be generated as many as needed, and unstable scenarios can be sufficiently simulated by considering extremely severe events. However, due to the inevitability of system modeling and simulation errors [12], simulated cases would deviate from the actual behaviors of the practical power grid, thereby impairing the data quality of $S$. Consequently, the derived DSA model may exhibit poor performances when commissioned for online monitoring.

To improve the quality of learning cases in $S$ and, thus, enhance the reliability of online DSA, a data/model jointly driven framework for high-quality case generation is designed in this article. Its basic idea is outlined as follows.
1) The TD simulation engine is employed to simulate historical cases by setting similar operating points and transient events to those in historical records (model-driven).
2) A powerful CycleGAN-based learning model is constructed to learn the underlying relationships between historical cases and simulated cases (data-driven).
3) Newly simulated cases are intelligently refined with the help of the above-learned relationships, which can boost the quality and reliability of $S$ for DSA implementation.

IV. PROPOSED CASE GENERATION FRAMEWORK

The overall framework for high-quality case generation is illustrated in Fig. 1. Its realization involves the following three major phases: 1) data integration and rough case generation; 2) data-driven refined case generation; and 3) DSA application. More technical details are presented in the following.

Fig. 1. Data/model jointly driven case-generation framework.

A. Data Integration and Rough Case Generation

Given a practical power system, its historical database is first visited to collect representative transient events previously occurring in practice. For each event, transient PMU measurements of $\{V, \theta, \Delta f\}$ at individual generator buses are gathered to form the input of a transient case. The output is ascertained by a widely adopted transient stability criterion [41]

$$y_i = \begin{cases} 1, & \text{if } \eta > 0 \text{ (stable)} \\ 0, & \text{else (unstable)} \end{cases}, \quad \text{for } \eta = \frac{360^\circ - |\Delta\delta|_{\max}}{360^\circ + |\Delta\delta|_{\max}} \tag{3}$$

where $|\Delta\delta|_{\max}$ is the largest rotor angle separation between any two synchronous generators in the transient time span of 0–10 s. With all the cases' data integrated together, a historical CS $S_{\text{hist}}$ is formed. Since these cases come from the physical system, they are also called real cases in this article.

For each real case, its prefault (steady-state) PMU measurements and event records are acquired to help generate a simulated counterpart that mimics it in the simulation engine. In this way, a simulated CS $S_{\text{sim}}$ is obtained. Due to modeling and simulation errors, however, these simulated cases may have nonnegligible deviations from real ones. Hence, they are termed as fake cases as well. Note that the stability statuses of these fake cases are also determined by the criterion in (3).
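
As an illustration of the labeling rule in (3), the following minimal Python sketch computes the index η from a set of rotor angle trajectories and derives the class label. The function names and the list-of-lists data layout (`rotor_angles_deg[t][g]`) are hypothetical choices made for this example and do not come from the article; the sketch also assumes the angles are given in degrees and are not wrapped to a ±180° range.

```python
from typing import Sequence


def transient_stability_index(max_angle_sep_deg: float) -> float:
    """Index eta from criterion (3); the argument is |delta-delta|_max in degrees."""
    sep = abs(max_angle_sep_deg)
    return (360.0 - sep) / (360.0 + sep)


def max_rotor_angle_separation(rotor_angles_deg: Sequence[Sequence[float]]) -> float:
    """Largest angle gap between any two generators over all time steps.

    rotor_angles_deg[t][g] is the rotor angle of generator g at time step t.
    This layout is an assumption of this sketch. At a single time step, the
    largest pairwise gap equals (max angle - min angle).
    """
    return max(max(step) - min(step) for step in rotor_angles_deg)


def stability_label(rotor_angles_deg: Sequence[Sequence[float]]) -> int:
    """Return 1 (stable) if eta > 0, else 0 (unstable), as in (3)."""
    eta = transient_stability_index(max_rotor_angle_separation(rotor_angles_deg))
    return 1 if eta > 0 else 0


# Toy trajectories: three generators that stay within 120 degrees of each other,
# and two generators that separate by 400 degrees.
stable_case = [[0.0, 10.0, 20.0], [0.0, 60.0, 120.0]]
unstable_case = [[0.0, 10.0], [0.0, 400.0]]
labels = (stability_label(stable_case), stability_label(unstable_case))
```

Because η is positive exactly when the largest separation stays below 360°, the rule amounts to a threshold on angle separation. The trajectories must cover the full 0–10 s span stated above for the label to be meaningful.
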
The main target of the proposed case generation framework
is to make the simulated fake cases be revised toward real ones,
i.e., enhancing their resemblance with respect to real cases. This
is fulfilled by refining simulated cases with the help of the
fake→real mapping intelligently learned by CycleGAN. How
the refinement is carried out is described in the second phase.

B. Data-Driven Refined Case Generation

1) CycleGAN-Based Case Refinement: As introduced previously, GAN is a highly promising generative learning technique attracting increasing attention from the ML community [13], [14]. With well-designed learning architectures and hyperparameters, it has high potential to generate realistic samples (cases) in the absence of explicit knowledge about the distributions or characteristics of real samples. Considering the merit and potential of GAN, it is leveraged to address the problem of high-quality case generation in this article. In particular, the simulated cases generated by TD simulations, i.e., cases in $S_{\text{sim}}$, are taken as the raw inputs of generative learning. In this regard, the simulated cases are deemed as rough cases, and the GAN learning procedure aims to refine them to improve their fidelity with respect to real cases.

To carry out the refinement in a reliable and reasonable way, one of the possible solutions is to regularize the GAN learning procedure to ensure the transitivity, which requires that the realistic domain be mapped back to the original simulated domain to recover the rough cases. In fact, this is similar to the encoder–decoder mechanism in autoencoder algorithms, being helpful to verify and improve the quality of case refinement. Following this learning idea, the powerful CycleGAN algorithm first proposed in [20] to learn the bidirectional fake⇔real mappings is utilized for GAN learning in this article.

Fig. 2. Structure of the CycleGAN.

As shown in Fig. 2, the CycleGAN adopted in this work includes two GANs, i.e., the forward GAN and the backward GAN; they try to learn the mappings of fake→real and real→fake, respectively. Each of them has a standard GAN structure, including a generative model and a discriminative model. Taking the forward GAN for instance, based on the adversarial loss function [13], its learning objective is described as

$$\min_{G_1} \max_{D_1} \mathcal{L}_{\text{GAN1}} = \mathbb{E}_{x_{\text{real}} \sim p(x_{\text{real}})} [\log D_1(x_{\text{real}})] + \mathbb{E}_{x_{\text{fake}} \sim p(x_{\text{fake}})} [\log(1 - D_1(G_1(x_{\text{fake}})))] \tag{4}$$

where $x_{\text{real}}$ and $x_{\text{fake}}$ represent real and fake (simulated) cases; $p(x_{\text{real}})$ and $p(x_{\text{fake}})$ denote their probability distributions; $G_1$ and $D_1$ represent the generative model¹ and the discriminative model (discriminator) of the forward GAN, respectively. Based on (4), $G_1$ and $D_1$ compete with each other in a two-player gaming manner: $G_1$ aims to fool $D_1$ by generating cases resembling real ones; $D_1$ fights back by learning to clearly discriminate real cases from fake ones produced by $G_1$ [13]. Such iterative competitions lead $G_1$ and $D_1$ to optimize themselves until the generated cases seem to be indistinguishable from real ones.

¹ It is often called generator in the ML community. To avoid confusion with the concept of synchronous generator in power systems, it is termed as generative model in this article.
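
To make the two-player objective in (4) concrete, the short Python sketch below estimates its value from discriminator outputs on a mini-batch. It is only an illustration under stated assumptions: the function name `forward_gan_value` and the toy probability values are hypothetical, and no actual networks are trained here.

```python
import math
from typing import Sequence


def forward_gan_value(d_on_real: Sequence[float],
                      d_on_refined: Sequence[float]) -> float:
    """Sample-average estimate of the forward adversarial objective in (4).

    d_on_real[i]    stands for D1(x_real_i), the score given to a real case.
    d_on_refined[j] stands for D1(G1(x_fake_j)), the score given to a refined case.
    All scores are assumed to be probabilities strictly inside (0, 1).
    """
    real_term = sum(math.log(d) for d in d_on_real) / len(d_on_real)
    fake_term = sum(math.log(1.0 - d) for d in d_on_refined) / len(d_on_refined)
    return real_term + fake_term


# A discriminative model that separates the two groups well yields a value near 0.
sharp = forward_gan_value([0.9, 0.95], [0.1, 0.05])

# A fully fooled discriminative model outputs 0.5 everywhere: value = 2 * log(0.5).
fooled = forward_gan_value([0.5, 0.5], [0.5, 0.5])
```

The discriminative model pushes this value up (toward `sharp`), while the generative model pushes it down (toward `fooled`), which mirrors the min-max structure of (4).
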


Similarly, the learning objective of the backward GAN is

$$\min_{G_2} \max_{D_2} \mathcal{L}_{\text{GAN2}} = \mathbb{E}_{x_{\text{fake}} \sim p(x_{\text{fake}})} [\log D_2(x_{\text{fake}})] + \mathbb{E}_{x_{\text{real}} \sim p(x_{\text{real}})} [\log(1 - D_2(G_2(x_{\text{real}})))]. \tag{5}$$

To coordinate the two GANs and ensure the transitivity, a cycle consistency loss function is taken into additional account here. Following the closed-loop procedure of case transition in Fig. 2, $G_2(G_1(x_{\text{fake}}))$ and $G_1(G_2(x_{\text{real}}))$ are expected to resemble their original profiles, i.e., $G_2(G_1(x_{\text{fake}})) \approx x_{\text{fake}}$ and $G_1(G_2(x_{\text{real}})) \approx x_{\text{real}}$. On the basis of the $L_1$ norm, the cycle consistency loss-based regularization is expressed as [20]
$$\min_{G_1, G_2} \mathcal{L}_{\text{cyc}} = \mathbb{E}_{x_{\text{fake}} \sim p(x_{\text{fake}})} \big[ \| G_2(G_1(x_{\text{fake}})) - x_{\text{fake}} \|_1 \big] + \mathbb{E}_{x_{\text{real}} \sim p(x_{\text{real}})} \big[ \| G_1(G_2(x_{\text{real}})) - x_{\text{real}} \|_1 \big]. \tag{6}$$

Based on the abovementioned preliminaries, the overall learning objective of the CycleGAN is determined by

$$\min_{G_1, G_2} \max_{D_1, D_2} \mathcal{L}(G_1, G_2, D_1, D_2) = \min_{G_1, G_2} \max_{D_1, D_2} \left( \mathcal{L}_{\text{GAN1}} + \mathcal{L}_{\text{GAN2}} + \lambda \mathcal{L}_{\text{cyc}} \right) \tag{7}$$

where $\lambda$ is the regularization parameter for the cycle consistency loss. To mitigate the possible gradient vanishing problem in GAN learning, the logarithmic function-based losses in (4) and (5) can be replaced by least-square losses [20], [42]. In particular, the learning objectives of the generative and discriminative models in (4) and (5) can be modified as

$$\min_{G_1} V(G_1) = \frac{1}{2} \mathbb{E}_{x_{\text{fake}} \sim p(x_{\text{fake}})} \big[ (D_1(G_1(x_{\text{fake}})) - 1)^2 \big] \tag{8}$$

$$\min_{D_1} V(D_1) = \frac{1}{2} \mathbb{E}_{x_{\text{real}} \sim p(x_{\text{real}})} \big[ (D_1(x_{\text{real}}) - 1)^2 \big] + \frac{1}{2} \mathbb{E}_{x_{\text{fake}} \sim p(x_{\text{fake}})} \big[ D_1(G_1(x_{\text{fake}}))^2 \big] \tag{9}$$

$$\min_{G_2} V(G_2) = \frac{1}{2} \mathbb{E}_{x_{\text{real}} \sim p(x_{\text{real}})} \big[ (D_2(G_2(x_{\text{real}})) - 1)^2 \big] \tag{10}$$

$$\min_{D_2} V(D_2) = \frac{1}{2} \mathbb{E}_{x_{\text{fake}} \sim p(x_{\text{fake}})} \big[ (D_2(x_{\text{fake}}) - 1)^2 \big] + \frac{1}{2} \mathbb{E}_{x_{\text{real}} \sim p(x_{\text{real}})} \big[ D_2(G_2(x_{\text{real}}))^2 \big]. \tag{11}$$

A fully convolutional network (FCN) architecture with high

Fig. 3. Semisupervised RNN for case labeling.

generated by TD simulations. To help alleviate the aforementioned class-imbalance problem in $S_{\text{hist}}$, transient contingencies being more severe than usual can be considered to produce more unstable cases in $S_{\text{sim}}$. All the cases in $S_{\text{sim}}$ are then refined by $G_1$, resulting in a refined CS $S_{\text{ref}}$ that reflects the realistic characteristics of the practical system.

With rectified TS profiles of $\{V, \theta, \Delta f\}$, the actual stability statuses of the refined cases in $S_{\text{ref}}$ may not be consistent with their original counterparts in $S_{\text{sim}}$. In this respect, it is necessary to relabel the stability statuses of all the cases in $S_{\text{ref}}$. However, as the TS profiles in $S_{\text{ref}}$ are relatively short ($|T_{\text{win}}| < 1$ s), the criterion in (3) cannot be utilized for stability status labeling here. Instead, an SSL scheme based on the LSTM method is introduced for the case labeling task.

As presented in Fig. 3, the implementation of this learning scheme has two steps: 1) taking all the cases in $S_{\text{hist}}$ and $S_{\text{ref}}$ as inputs, an LSTM autoencoder is trained for unsupervised feature learning [45] from TS profiles and 2) the hidden layer of the LSTM autoencoder that is expected to capture the latent features from TS profiles is taken as the pretrained layer and transferred to a supervised LSTM learning model, which is further tuned with labeled cases in $S_{\text{hist}}$. More detailed descriptions are presented in the following.

The LSTM autoencoder is a combination of the LSTM method and the autoencoder algorithm. It consists of an LSTM encoder and an LSTM decoder to perform sequence-to-sequence learning. The LSTM neurons have the standard LSTM architecture, including input gates, output gates, forget gates, and memory cells. Detailed interactions among these elements for LSTM learning can be found in [14]. Aiming at minimizing
performances on both generative and discriminative tasks [43]
the sequence reconstruction errors, the learning objective of the
is utilized to construct the four models {G1 , G2 , D1 , D2 } in the
LSTM autoencoder is described as
CycleGAN. Detailed settings of these models’ structures and

hyperparameters are provided along with the specific case study 1 n
in Section V-A. Following the experimental settings in [20], min LAE = min |X i − X i |2 (12)
n i=1
the regularization parameter in (7) is set to λ = 10. The Adam
optimizer [44] is employed to implement iterative CycleGAN where LAE denotes the loss function; X i is the normalized input
learning. of the LSTM encoder, i.e., the normalized TS profiles of case
2) Semisupervised Case Labeling: After the completion of i by applying the minmax normalization to xi ; and X i is the
CycleGAN learning, the model G1 can be used to generate real- corresponding sequential output of the LSTM decoder.
istic cases. Considering the current operating point and ongoing When the LSTM autoencoder is well trained with high re-
operating tendency as well as a list of presumed contingencies construction precisions, its hidden states would present a good

in the specific system, a new group of simulated cases Ssim are feature representation [45] for all the cases in Shist and Sref .
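To make the reconstruction objective in (12) concrete, the sketch below applies minmax normalization to each TS profile and then averages the squared reconstruction error over all cases. It is a minimal illustration under stated assumptions: the helper names `minmax_normalize` and `reconstruction_loss` are hypothetical, each profile is normalized independently, and the decoder outputs are supplied as plain lists rather than produced by a trained LSTM decoder.

```python
def minmax_normalize(profile):
    """Scale one TS profile (list of floats) to [0, 1].

    Assumption: a constant profile is mapped to all zeros.
    """
    lo, hi = min(profile), max(profile)
    span = hi - lo
    if span == 0.0:
        return [0.0 for _ in profile]
    return [(value - lo) / span for value in profile]


def reconstruction_loss(inputs, outputs):
    """Mean over cases of the squared reconstruction error, as in (12).

    inputs:  normalized encoder inputs X_i, one list per case
    outputs: decoder outputs for the same cases (hypothetical values)
    """
    total = 0.0
    for x, x_rec in zip(inputs, outputs):
        total += sum((a - b) ** 2 for a, b in zip(x, x_rec))
    return total / len(inputs)


normalized = minmax_normalize([1.0, 2.0, 3.0])
loss = reconstruction_loss(
    [[0.0, 1.0], [1.0, 1.0]],  # two normalized cases
    [[0.0, 0.0], [1.0, 1.0]],  # their reconstructions
)
```

Here the first case is reconstructed with a squared error of 1 and the second one exactly, so the averaged loss over the two cases is 0.5.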

Given this feature representation, the LSTM layer of the autoencoder is preserved for subsequent supervised LSTM learning. In particular, an LSTM classification model is built by replacing the LSTM decoder with a fully connected (FC) layer and a softmax layer for binary classification. Essentially, the features and knowledge learned in the LSTM autoencoder are transferred to this classification model, and it only needs fine-tuning with a small number of labeled cases in $S_{\text{hist}}$. As the case labeling issue is a two-class classification task, the LSTM classification model is tuned by minimizing a cross-entropy loss function
$$\min L_{\text{tune}} = \min \sum_{i=1}^{n}\left[-y_i \log_2 \bar{y}_i - (1 - y_i)\log_2(1 - \bar{y}_i)\right] \tag{13}$$

where $\bar{y}_i$ and $y_i$ stand for the numerical prediction and the actual class label of case $i$, respectively. Specifically, $\bar{y}_i$ is estimated by exponential normalization in the softmax layer

$$\bar{y}_i = e^{\nu_{i,1}} / \left(e^{\nu_{i,1}} + e^{\nu_{i,0}}\right) \tag{14}$$

where $\nu_{i,1}$ and $\nu_{i,0}$ are the outputs of the FC layer regarding potential stable and unstable classes for case $i$.

After the LSTM classification model is tuned, it is exploited to label transient cases in $S_{\text{ref}}$ via the following binary rule:

$$\hat{y}_i = \begin{cases} 1 & \text{for } \bar{y}_i > \varepsilon \text{ (stable)} \\ 0 & \text{otherwise (unstable)} \end{cases} \tag{15}$$

where $\hat{y}_i$ is the predicted stability status, and $\varepsilon$ is the decision threshold. To label cases without bias, $\varepsilon$ is set to 0.5.

C. DSA Application

Based on the abovementioned two phases, all the newly generated cases in $S_{\text{ref}}$ would have complete and realistic input–output information in the form of $(x_i, \hat{y}_i)$. $S_{\text{ref}}$ is further combined with $S_{\text{hist}}$ to form a full CS for follow-up DSA applications. As all the cases in the CS are expected to have a high quality to capture the realistic behaviors and characteristics of the practical system, they can help derive reliable data-driven DSA models. Since the CS is a standard case repository for DSA, various popular ML algorithms, such as ANN, DT, SVM, and emerging DNN, can be readily exploited to learn stability knowledge from it for online DSA.

When a DSA model is trained, three metrics, i.e., misdetection rate, false-alarm rate, and accuracy [4], are calculated to evaluate its online performance. With $n_t$ cases fed into the DSA model for online tests, suppose $n_{fs}$ unstable cases are misclassified as stable and $n_{fu}$ stable cases are falsely alarmed to be unstable. The three metrics are computed by

$$\text{Mis} = n_{fs}/n_t \times 100\% \tag{16}$$

$$\text{Fal} = n_{fu}/n_t \times 100\% \tag{17}$$

$$\text{Acc} = (n_t - n_{fs} - n_{fu})/n_t \times 100\%. \tag{18}$$

V. CASE STUDY

The proposed case generation framework was comprehensively tested on the realistic provincial GPG in South China. Its 500-kV bulk network is depicted in Fig. 4. In the grid, PMUs are deployed at all the 500-kV substations and 22 power plants with major impacts on system transient stability. Numerical TD simulations were executed in PSD-BPA, a commercial power system simulation toolkit maintained by China-EPRI.

Fig. 4. Structure of GPG.

A. Simulation Setting and Training Setup

The original operating point of the GPG model, denoted as $\text{OP}_A$, was assumed to be the actual operating condition in practice. A representative event list involving various $(N-1)\sim(N-2)$ contingencies (provided by system operators) was taken as the historical event records. Based on these settings, 600 transient cases were generated by TD simulations in PSD-BPA, so as to mimic the historical CS $S_{\text{hist}}$ collected in the past few years. In $S_{\text{hist}}$, only 21 cases turn out to be transient unstable, which indicates an extremely skewed proportion of unstable cases (3.5%). Given the basic operating point $\text{OP}_A$, random errors following the Gaussian distribution $N(0, \sigma^2)$ (with $\sigma = 5\%$) were imposed onto the major model parameters of the generators, loads, and transmission lines in the GPG numerical simulation model. By doing so, a new operating point $\text{OP}_B$ was produced to simulate system modeling and parameter errors introduced into TD simulations. With the same event list, TD simulations were conducted to generate the simulated (fake) CS, i.e., $S_{\text{sim}}$. To alleviate the small sample size and class-imbalance issues in $S_{\text{hist}}$, another group of transient events involving much more severe $(N-3)\sim(N-5)$ contingencies were considered for TD simulations. This resulted in 2000 newly simulated cases to form $S'_{\text{sim}}$. For each case in $S_{\text{hist}}$, $S_{\text{sim}}$, and $S'_{\text{sim}}$, multidimensional TS trajectories of $\{V, \theta, \Delta f\}$ were acquired from the 22 generator buses by PMUs with a sampling rate of 100 Hz. The length of the time window for TS data acquisition was specified as $|T_{\text{win}}| = 0.5$ s to make the information carried by the obtained CS be abundant enough. Note that, for specific DSA applications, the observation time window can be appropriately set to a shorter value (e.g., 0.2 s) to support fast online decision-making.

Taking all the cases in $S_{\text{hist}}$ and $S_{\text{sim}}$ as inputs, a CycleGAN was trained for case refinement. Given the dimensionality ($22 \times 3$) and TS length (50) in the acquired cases, the architecture of the FCN-based CycleGAN with three channels (i.e., $V$, $\theta$, and $\Delta f$) was carefully designed, as summarized in Tables I and II. A semisupervised LSTM learning model was then trained
for case labeling. All the programming and computations were implemented in Python 3.6 with the Tensorflow back end. After the completion of the whole learning procedure, a refined CS $S_{\text{ref}}$ with 2000 cases was obtained. Among them, 957 and 1043 cases were labeled as unstable and stable, respectively. In the combined CS, i.e., $S_{\text{ref}} \cup S_{\text{hist}}$, unstable cases have a proportion of 37.62%, which implies that the class-imbalance issue has been largely mitigated.

TABLE I
PRIMARY STRUCTURE OF THE GENERATIVE MODELS ($G_1$ AND $G_2$)

Notes: $N$ represents the number of cases in each mini batch. The abbreviated terms of the layers are represented as: Conv2D → 2-D convolution; LeakyReLU → leaky version of a rectified linear unit; InsNorm → instance normalization; UpSp2D → 2-D upsampling; Concate → concatenation of inputs (the same below). The generative models in the forward and backward GANs have the same structure.

TABLE II
PRIMARY STRUCTURE OF THE DISCRIMINATIVE MODELS ($D_1$ AND $D_2$)

Notes: The discriminative models in the forward and backward GANs have the same structure.

B. Performance on Transient Case Generation

To verify the proposed framework's performance on case generation, extensive tests were conducted here to show how the cases in $S_{\text{ref}}$ (refined from $S'_{\text{sim}}$) resemble the real historical cases in $S_{\text{hist}}$.

1) Illustration of Refined TS Profiles: Without loss of generality, a refined case in $S_{\text{ref}}$ was randomly chosen to compare its TS profiles with those of the corresponding simulated and historical cases encountering the same transient event. In particular, the voltage profiles ($V$) of all the 22 generator buses were collected for illustration, as shown in Fig. 5.

Clearly, compared with the simulated case, the refined one has more consistent voltage evolution with respect to the actual historical case. The latter two present similar zigzag transitions during the processes of fault occurrence and clearance, with down–up–down and/or up–down–up tendencies. On the contrary, the simulated one has simple down–down or up–up transitions due to deviated system models and parameter errors. In addition, at the end of the time window, the simulated voltage profiles concentrate together, while the refined and historical ones do not exhibit such gathering behaviors. As can be intuitively observed, the refined voltage values of individual buses in the postfault stage (0.21–0.5 s) are able to follow the actual values to a closer extent. A quantitative comparison was further made by calculating the mean absolute errors (MAEs) and mean square errors (MSEs) of the simulated/refined voltage profiles with respect to the actual historical ones throughout the time window of 0–0.5 s. For the simulated case, its MAE and MSE are 0.0878 and 0.0134, respectively. For the refined version, its MAE and MSE are largely reduced to 0.0508 and 0.0077, respectively.

These comparisons indicate that the refined profiles resemble the historical ones to a relatively high degree. In fact, similar resemblances are also observed in the profiles of $\theta$ and $\Delta f$ of different buses. Due to limited space, however, such resemblances are not presented here. For brevity, the following tests will mainly focus on comparing the differences between the refined cases and historical ones.

2) Statistical Analysis: From the statistical perspective, the distributions of all the TS sampling values of ($V$, $\theta$, and $\Delta f$) in $S_{\text{sim}}$, $S_{\text{hist}}$, and $S_{\text{ref}}$ were estimated by drawing their histograms, as depicted in Fig. 6. For all the three types of variables, $S_{\text{ref}}$ generally has more consistent data distributions with $S_{\text{hist}}$ than $S_{\text{sim}}$. This reveals that $S_{\text{ref}}$ shares a higher similarity with $S_{\text{hist}}$. To quantify and compare their differences in a more straightforward manner, the Kullback–Leibler divergence (KLD) metric [46], [47] widely used for estimating the dissimilarity between two data distributions was adopted here. Let the data histograms of $S_{\text{hist}}$ be the reference distributions. The KLDs of the data distributions of $S_{\text{sim}}$ and $S_{\text{ref}}$ with respect to the reference ones of $S_{\text{hist}}$ were separately computed. As illustrated in Fig. 6, $S_{\text{ref}}$ generally possesses much smaller KLDs than $S_{\text{sim}}$.

Comparatively speaking, the distribution of $\Delta f$ in Fig. 6(i) has a much larger KLD value than those of $V$ and $\theta$ in Fig. 6(g) and (h), respectively, which implies a lower fidelity of generating frequency deviation data. This is perhaps because the three types of variables ($V$, $\theta$, and $\Delta f$) are equally treated as three channels during generative learning. In fact, due to the complicated electrical couplings among the three types of variables, they cannot be ideally deemed as three equally weighted channels like RGB channels in conventional color images. From the perspective
of mathematical optimization, the whole generative learning procedure can be regarded as a three-objective optimization task to produce three types of data. In this respect, the precision of synthesizing frequency deviation data may be further improved by properly coordinating the three learning objectives. This remains to be investigated in future research.

Fig. 5. Illustration of typical voltage profiles of simulated, historical, and refined cases (with the same transient event). (a) Voltage profiles of a simulated case. (b) Voltage profiles of a historical case. (c) Voltage profiles of a refined case.

Fig. 6. Histograms of TS sampling values in simulated, historical, and refined cases.

To further verify the statistical similarity between $S_{\text{hist}}$ and $S_{\text{ref}}$, the mean values and standard deviations of the ($V$, $\theta$, and $\Delta f$) variables at the 22 generator buses were calculated. As presented in Fig. 7, the mean values and standard deviations of the refined cases highly approximate those of the historical cases, especially for $V$ and $\theta$. This again demonstrates the high similarity of the refined cases to the historical ones.

3) Correlation Analysis: For the sake of examining whether the refined cases can preserve the inherent spatial-temporal correlations among individual buses in the power grid, an additional correlation analysis based on mutual information (MI) [15], [19] was performed here. Specifically, the MIs with respect to TS profiles of pairwise generator buses were computed and averaged over all the historical and refined cases, as summarized in Fig. 8.

At first glance, the refined cases share similar MI correlations with the historical ones. This reveals that the proposed framework is able to capture the intrinsic spatial-temporal correlations in the practical grid. Such a desirable feature would boost the fidelity of the generated cases, making them behave more realistically to better support a data-driven DSA scheme. Nevertheless, it is noticed that the refined cases present weaker MI correlations among generator buses than the actual correlations reflected in historical cases. Thus, there exists some room to further enhance the reality of the refined cases, which would be explored in relevant future work.

For the purpose of quantifying the overall differences of the MI correlations between the refined cases and historical ones,
the mean relative correlation (MRC) regarding each variable was calculated. For the $i$th MI matrix illustrated in Fig. 8 [$1 \le i \le 6$, corresponding to Fig. 8(a)–(f)], let it be denoted as $\Lambda_i = [\lambda_{uv}]_{22\times 22}$. In $\Lambda_i$, the average MI of variable $j$ (corresponding to column $j$) with respect to the other variables was estimated by $M_j = \frac{1}{22}\sum_{k=1,k\neq j}^{22}\lambda_{jk}$. The MRC of variable $j$ was then calculated as

$$\bar{M}_j = \frac{M_j}{\max_k M_k}. \tag{19}$$

Essentially, $\bar{M}_j \in [0, 1]$ measures the relative correlation of bus $j$ to the remaining buses in the system by observing a specific type of variables of ($V$, $\theta$, and $\Delta f$). Based on (19), the MRC values related to ($V$, $\theta$, and $\Delta f$) of each bus were estimated. As summarized in Fig. 9, the historical and refined cases share similar MRC values at most buses. Interestingly, for the frequency deviation data, unlike the results in Figs. 6 and 7 that exhibit relatively large statistical differences, here the MRCs present more consistent values. This implies that the proposed case refinement framework can still reliably preserve the relative correlations among individual buses in the grid, despite the difficulty in precisely capturing the absolute correlations, as shown in Fig. 8.

Fig. 7. Mean values and standard deviations of different variables in historical and refined cases. (a) Voltage magnitudes' mean values. (b) Voltage phases' mean values. (c) Frequency deviations' mean values. (d) Voltage magnitudes' standard deviations. (e) Voltage phases' standard deviations. (f) Frequency deviations' standard deviations.

Fig. 8. MI-based correlations among multiple variables in historical and refined cases. (a) MI correlation w.r.t. voltage magnitudes of different buses (historical cases). (b) MI correlation w.r.t. voltage phases of different buses (historical cases). (c) MI correlation w.r.t. frequency deviations of different buses (historical cases). (d) MI correlation w.r.t. voltage magnitudes of different buses (refined cases). (e) MI correlation w.r.t. voltage phases of different buses (refined cases). (f) MI correlation w.r.t. frequency deviations of different buses (refined cases).

4) Computational Efficiency: To test the computational efficiency of the proposed framework, its time consumptions during CycleGAN training, semisupervised LSTM learning for class labeling, and application to the refinement of newly generated
cases were tracked. Their computation times are reported in Table III. All the computations were carried out using a PC with a 3.60 GHz×8 Intel Core i7-7700 CPU, 32.0 GB RAM, and an NVIDIA GeForce GTX-1080 GPU.

Fig. 9. MRCs of individual variables observed in historical and refined cases. (a) Voltage magnitudes' mean relative correlations. (b) Voltage phases' mean relative correlations. (c) Frequency deviations' mean relative correlations.

TABLE III
STATISTICS OF COMPUTATION TIMES

Notes: The 3rd column represents the averaged time consumption of applying the CycleGAN model and the LSTM-based classifier to generate a single refined case in $S_{\text{ref}}$.

As can be observed, the proposed framework is able to complete the procedures of CycleGAN training and semisupervised LSTM learning in less than 4.5 h and 15 min, respectively. Although the computational burden of CycleGAN training is a bit heavy, it does not impair the efficiency of case refinement due to its offline property. In fact, when applied to case refinement, it costs no more than 60 ms to refine a simulated case for improving its reality. Such a high efficiency would be extremely helpful for practical DSA applications. It can not only help efficiently generate high-quality cases for offline DSA model training, but also assist to quickly supplement new learning cases when periodic DSA model update is needed during online monitoring.

C. Effect on Online DSA

To further demonstrate the advantage of the proposed case generation framework in helping enhance the online DSA performance, the eventually obtained CS, i.e., $S_{\text{ref}} \cup S_{\text{hist}}$, was exploited to train a DSA-oriented classification model. In particular, the convolutional neural network (CNN)-based approach in [11] that learns from transient (fault-on + 2) TS profiles (fault-on trajectories plus two closest data points in prefault and postfault stages) was employed to efficiently perform online DSA. For comparative study, the historical CS without case supplementation ($S_{\text{hist}}$) and the CS composed of historical cases and simulated cases without case refinement ($S_{\text{sim}} \cup S_{\text{hist}}$), respectively, were separately fed into the same CNN algorithm to build another two DSA schemes. On the basis of the actual operating point $\text{OP}_A$, 1500 transient cases involving $(N-3)\sim(N-5)$ events not seen in previous case simulations were generated and taken as a testing set for online DSA performance verification. All the three learning schemes' DSA performances on the 600 historical cases and the 1500 unseen testing cases are summarized in Tables IV and V, respectively.

TABLE IV
DSA PERFORMANCE ON 600 HISTORICAL CASES

TABLE V
DSA PERFORMANCE ON 1500 UNSEEN TESTING CASES

For the 600 historical cases, all the three schemes have very high learning performances, with the overall DSA accuracy remaining above 99%. However, when faced with unseen scenarios, not all of them maintain high performances. For the scheme merely learning from historical cases, it does not generalize well to those unknown cases, with the DSA accuracy reduced by more than 13%. When simulated cases are supplemented, the online DSA performance is improved, yet it still fails to identify more than 7% of unseen cases. With the simulated cases refined by the proposed framework, the corresponding DSA scheme significantly reduces the misdetection and false alarm rates, resulting in a desirable online DSA accuracy of more than 97.5%. This reveals that the proposed framework is capable of generating high-quality cases to augment the diversity of the stability knowledge base, which enhances the reliability and adaptability of DSA applications, especially under unknown scenarios.

VI. CONCLUSION

Focusing on augmenting the reliability of data-driven DSA applications in practical power systems, this article developed a data/model jointly driven framework for high-quality case generation. Instead of merely relying on model-based TD simulations for case generation, the quality of simulated cases was intelligently improved by CycleGAN-based refinement. With no need for resorting to costly domain expertise, the stability statuses of the refined cases were efficiently labeled via semisupervised LSTM learning. Numerical test results on the realistic GPG show that the proposed framework is able to produce high-quality learning cases highly resembling actual historical ones. With the help of this framework, the reliability and adaptability of online DSA were significantly improved.
Yet it should be noted that there still exists a nonnegligible gap between the refined cases and actual ones, especially for frequency deviation data. Future research efforts would be devoted to further improving the fidelity of refined cases by designing a more appropriate generative learning framework.

REFERENCES

[1] J. De La Ree, V. Centeno, J. S. Thorp, and A. G. Phadke, “Synchronized phasor measurement applications in power systems,” IEEE Trans. Smart Grid, vol. 1, no. 1, pp. 20–27, Jun. 2010.
[2] R. Yadav, A. K. Pradhan, and I. Kamwa, “Real-time multiple event detection and classification in power system using signal energy transformations,” IEEE Trans. Ind. Informat., vol. 15, no. 3, pp. 1521–1531, Mar. 2019.
[3] M. He, J. Zhang, and V. Vittal, “Robust online dynamic security assessment using adaptive ensemble decision-tree learning,” IEEE Trans. Power Syst., vol. 28, no. 4, pp. 4089–4098, Nov. 2013.
[4] L. Zhu, C. Lu, Z. Y. Dong, and C. Hong, “Imbalance learning machine-based power system short-term voltage stability assessment,” IEEE Trans. Ind. Informat., vol. 13, no. 5, pp. 2533–2543, Oct. 2017.
[5] B. Wang, B. Fang, Y. Wang, H. Liu, and Y. Liu, “Power system transient stability assessment based on big data and the core vector machine,” IEEE Trans. Smart Grid, vol. 7, no. 5, pp. 2561–2570, Sep. 2016.
[6] D. Q. Zhou, U. D. Annakkage, and A. D. Rajapakse, “Online monitoring of voltage stability margin using an artificial neural network,” IEEE Trans. Power Syst., vol. 25, no. 3, pp. 1566–1574, Aug. 2010.
[7] F. Hashiesh et al., “An intelligent wide area synchrophasor based system for predicting and mitigating transient instabilities,” IEEE Trans. Smart Grid, vol. 3, no. 2, pp. 645–652, Jun. 2012.
[8] Y. Zhang, Y. Xu, Z. Dong, and R. Zhang, “A hierarchical self-adaptive data-analytics method for power system short-term voltage stability assessment,” IEEE Trans. Ind. Informat., vol. 15, no. 1, pp. 74–84, Jan. 2019.
[9] Y. Xu et al., “Extreme learning machine-based predictor for real-time frequency stability assessment of electric power systems,” Neural Comput. Appl., vol. 22, no. 3/4, pp. 501–508, 2013.
[10] J. Yu et al., “Intelligent time-adaptive transient stability assessment system,” IEEE Trans. Power Syst., vol. 33, no. 1, pp. 1049–1058, Jan. 2018.
[11] L. Zhu, D. J. Hill, and C. Lu, “Data-driven fast transient stability assessment using (fault-on + 2) generator trajectories,” in Proc. IEEE Power Energy Soc. Gen. Meeting, 2019, pp. 1–5.
[12] N. Tleis, Power Systems Modelling and Fault Analysis: Theory and Practice, 2nd ed. New York, NY, USA: Elsevier, 2019.
[13] I. Goodfellow et al., “Generative adversarial nets,” in Proc. Adv. Neural Inf. Process. Syst., 2014, pp. 2672–2680.
[14] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. Cambridge, MA, USA: MIT Press, 2016.
[15] Y. Chen, Y. Wang, D. Kirschen, and B. Zhang, “Model-free renewable scenario generation using generative adversarial networks,” IEEE Trans. Power Syst., vol. 33, no. 3, pp. 3265–3275, May 2018.
[16] C. Ren et al., “A fully data-driven method based on generative adversarial networks for power system dynamic security assessment with missing data,” IEEE Trans. Power Syst., vol. 34, no. 6, pp. 5044–5052, Nov. 2019.
[17] W. Zhang, Y. Luo, Y. Zhang, and D. Srinivasan, “SolarGAN: Multivariate solar data imputation using generative adversarial network,” IEEE Trans. Sustain. Energy, vol. 12, no. 1, pp. 743–746, Jan. 2021.
[18] Y. Li, Y. Wang, and S. Hu, “Online generative adversary network based measurement recovery in false data injection attacks: A cyber-physical approach,” IEEE Trans. Ind. Informat., vol. 16, no. 3, pp. 2031–2043, Mar. 2020.
[19] J. Liu, F. Qu, X. Hong, and H. Zhang, “A small-sample wind turbine fault detection method with synthetic fault data using generative adversarial nets,” IEEE Trans. Ind. Informat., vol. 15, no. 7, pp. 3877–3888, Jul. 2019.
[20] J.-Y. Zhu et al., “Unpaired image-to-image translation using cycle-consistent adversarial networks,” in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 2223–2232.
[21] D. De Silva, S. Sierla, D. Alahakoon, E. Osipov, X. Yu, and V. Vyatkin, “Toward intelligent industrial informatics: A review of current developments and future directions of artificial intelligence in industrial applications,” IEEE Ind. Electron. Mag., vol. 14, no. 2, pp. 57–72, Jun. 2020.
[22] H. Han, M. Zhou, and Y. Zhang, “Can virtual samples solve small sample size problem of KISSME in pedestrian re-identification of smart transportation?” IEEE Trans. Intell. Transp. Syst., vol. 21, no. 9, pp. 3766–3776, Sep. 2020.
[23] A. Creswell, T. White, V. Dumoulin, K. Arulkumaran, B. Sengupta, and A. A. Bharath, “Generative adversarial networks: An overview,” IEEE Signal Process. Mag., vol. 35, no. 1, pp. 53–65, Jan. 2018.
[24] Y. Hong, U. Hwang, J. Yoo, and S. Yoon, “How generative adversarial networks and their variants work: An overview,” ACM Comput. Surv., vol. 52, no. 1, pp. 1–43, 2019.
[25] J. Gui, Z. Sun, Y. Wen, D. Tao, and J. Ye, “A review on generative adversarial networks: Algorithms, theory, and applications,” 2020, arXiv:2001.06937.
[26] H. Han, W. Ma, M. Zhou, Q. Guo, and A. Abusorrah, “A novel semi-supervised learning approach to pedestrian reidentification,” IEEE Internet Things J., vol. 8, no. 4, pp. 3042–3052, Feb. 2021.
[27] N. Zheng, J. Ding, and T. Chai, “DMGAN: Adversarial learning-based decision making for human-level plant-wide operation of process industries under uncertainties,” IEEE Trans. Neural Netw. Learn. Syst., vol. 32, no. 3, pp. 985–998, Mar. 2021.
[28] Y. Qin, X. Wu, and J. Luo, “Data-model combined driven digital twin of life-cycle rolling bearing,” IEEE Trans. Ind. Informat., to be published, doi: 10.1109/TII.2021.3089340.
[29] J. E. Van Engelen and H. H. Hoos, “A survey on semi-supervised learning,” Mach. Learn., vol. 109, no. 2, pp. 373–440, 2020.
[30] X. Zhu and A. B. Goldberg, “Introduction to semi-supervised learning,” Synth. Lectures Artif. Intell. Mach. Learn., vol. 3, no. 1, pp. 1–130, 2009.
[31] J. Banzi, I. Bulugu, and Z. Ye, “Learning a deep predictive coding network for a semi-supervised 3D-hand pose estimation,” IEEE/CAA J. Autom. Sinica, vol. 7, no. 5, pp. 1371–1379, Sep. 2020.
[32] Q. Sun and Z. Ge, “Deep learning for industrial KPI prediction: When ensemble learning meets semi-supervised data,” IEEE Trans. Ind. Informat., vol. 17, no. 1, pp. 260–269, Jan. 2021.
[33] X. Pu and C. Li, “Online semisupervised broad learning system for industrial fault diagnosis,” IEEE Trans. Ind. Informat., vol. 17, no. 10, pp. 6644–6654, Oct. 2021.
[34] Y. Li, J. Li, and Y. Wang, “Privacy-preserving spatiotemporal scenario generation of renewable energies: A federated deep generative learning approach,” IEEE Trans. Ind. Informat., to be published, doi: 10.1109/TII.2021.3098259.
[35] Y. Yuan, K. Dehghanpour, F. Bu, and Z. Wang, “Outage detection in partially observable distribution systems using smart meters and generative adversarial networks,” IEEE Trans. Smart Grid, vol. 11, no. 6, pp. 5418–5430, Nov. 2020.
[36] R. Liu, G. Verbič, and J. Ma, “A new dynamic security assessment framework based on semi-supervised learning and data editing,” Electric Power Syst. Res., vol. 172, pp. 221–229, 2019.
[37] J.-M. Huang, R.-J. Wai, and G.-J. Yang, “Design of hybrid artificial bee colony algorithm and semi-supervised extreme learning machine for pv fault diagnoses by considering dust impact,” IEEE Trans. Power Electron., vol. 35, no. 7, pp. 7086–7099, Jul. 2020.
[38] J. Yan, C. Li, and Y. Liu, “Insecurity early warning for large scale hybrid ac/dc grids based on decision tree and semi-supervised deep learning,” IEEE Trans. Power Syst., vol. 36, no. 6, pp. 5020–5031, Nov. 2021.
[39] Y. Zhang, J. Wang, and B. Chen, “Detecting false data injection attacks in smart grids: A semi-supervised deep learning approach,” IEEE Trans. Smart Grid, vol. 12, no. 1, pp. 623–634, Jan. 2021.
[40] Q. Jiang, X. Li, B. Wang, and H. Wang, “PMU-based fault location using voltage measurements in large transmission networks,” IEEE Trans. Power Del., vol. 27, no. 3, pp. 1644–1652, Jul. 2012.
[41] User Manual-Transient Security Assessment Tool (TSAT), Powertech Labs Inc., Surrey, BC, Canada, 2009.
[42] X. Mao, Q. Li, H. Xie, R. Y. Lau, Z. Wang, and S. P. Smolley, “Least squares generative adversarial networks,” in Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 2794–2802.
[43] J. Johnson, A. Alahi, and L. Fei-Fei, “Perceptual losses for real-time style transfer and super-resolution,” in Proc. Eur. Conf. Comput. Vis., 2016, pp. 694–711.
[44] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Proc. Int. Conf. Learn. Representations, 2015, pp. 1–15.
[45] A. M. Dai and Q. V. Le, “Semi-supervised sequence learning,” in Proc. Adv. Neural Inf. Process. Syst., 2015, pp. 3079–3087.
[46] S. Kullback, Information Theory and Statistics. Chelmsford, MA, USA: Courier Corp., 1997.
[47] X. Wang, Q. Kang, J. An, and M. Zhou, “Drifted twitter spam classification using multiscale detection test on K-L divergence,” IEEE Access, vol. 7, pp. 108384–108394, 2019.
