
Robust and Scalable Power System State Estimation Via Composite Optimization

This paper presents novel algorithms for robust and scalable power system state estimation (PSSE) using least-absolute-value (LAV) estimators, which are more resilient to bad data compared to traditional weighted least-squares methods. The proposed deterministic and stochastic algorithms leverage composite optimization techniques to efficiently handle large-scale power networks, significantly improving computational speed and robustness. Numerical evaluations demonstrate that these methods outperform existing alternatives in terms of accuracy and efficiency for medium to large power systems.
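The robustness claim can be illustrated on a toy scalar problem. For repeated direct measurements of a single quantity, the least-squares estimate is the sample mean, whereas a least-absolute-value estimate is the sample median. The readings below are made up for illustration and are not taken from the paper.

```python
import numpy as np

# Hypothetical readings of one quantity whose true value is 1.0;
# the last entry is a grossly corrupted ("bad data") reading.
z = np.array([1.01, 0.99, 1.00, 1.02, 0.98, 50.0])

ls_estimate = np.mean(z)     # minimizes the sum of squared residuals
lav_estimate = np.median(z)  # a minimizer of the sum of absolute residuals

print(f"least squares: {ls_estimate:.3f}, LAV: {lav_estimate:.3f}")
```

A single outlier drags the mean far from the true value, while the median stays close to it.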


IEEE TRANSACTIONS ON SMART GRID, VOL. 10, NO. 6, NOVEMBER 2019

Robust and Scalable Power System State Estimation via Composite Optimization
Gang Wang, Member, IEEE, Georgios B. Giannakis, Fellow, IEEE, and Jie Chen, Fellow, IEEE

Abstract—In today’s cyber-enabled smart grids, high penetration of uncertain renewables, purposeful manipulation of meter readings, and the need for wide-area situational awareness call for fast, accurate, and robust power system state estimation. The least-absolute-value (LAV) estimator is known for its robustness relative to the weighted least-squares one. However, due to nonconvexity and nonsmoothness, existing LAV solvers based on linear programming are typically slow and, hence, inadequate for real-time system monitoring. This paper develops two novel algorithms for efficient LAV estimation, which draw from recent advances in composite optimization. The first is a deterministic linear proximal scheme that handles a sequence of (5 to 10 in general) convex quadratic problems, each efficiently solvable either via off-the-shelf toolboxes or through the alternating direction method of multipliers. Leveraging the sparse connectivity inherent to power networks, the second scheme is stochastic and updates only a few entries of the complex voltage state vector per iteration. In particular, when only voltage magnitude and (re)active power flow measurements are used, this number reduces to one or two regardless of the number of buses in the network. This computational complexity evidently scales well to large-size power systems. Furthermore, by carefully mini-batching the voltage and power flow measurements, accelerated implementation of the stochastic iterations becomes possible. The developed algorithms are numerically evaluated using a variety of benchmark power networks. Simulated tests corroborate that improved robustness can be attained at comparable or markedly reduced computation times for medium- or large-size networks relative to existing alternatives.

Index Terms—SCADA measurements, nonlinear AC estimation, cyberattacks, alternating direction method of multipliers, prox-linear algorithms.

Manuscript received August 14, 2017; revised November 2, 2017, March 10, 2018, and October 7, 2018; accepted January 29, 2019. Date of publication February 1, 2019; date of current version October 30, 2019. The work of G. Wang and G. B. Giannakis was supported in part by the National Science Foundation under Grant 1514056, Grant 1505970, and Grant 1711471. The work of J. Chen was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 61621063 and Grant 61522303, in part by the NSFC-Zhejiang Joint Fund for the Integration of Industrialization and Informatization under Grant 61720106011, in part by the Projects of Major International (Regional) Joint Research Program NSFC under Grant 61720106011, and in part by the Program for Changjiang Scholars and Innovative Research Team in University under Grant IRT1208. Paper no. TSG-01182-2017. (Corresponding author: Georgios B. Giannakis.)

G. Wang and G. B. Giannakis are with the Digital Technology Center, University of Minnesota, Minneapolis, MN 55455 USA, and also with the Electrical and Computer Engineering Department, University of Minnesota, Minneapolis, MN 55455 USA (e-mail: [email protected]; [email protected]).

J. Chen is with the State Key Laboratory of Intelligent Control and Decision of Complex Systems, and the School of Automation, Beijing Institute of Technology, Beijing 100081, and also the Tongji University, Shanghai 200092, China (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSG.2019.2897100

1949-3053 © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

The North American electric grid, the largest machine on earth, is recognized as the greatest engineering achievement of the 20th century [1]: thousands of miles of transmission lines and millions of miles of distribution lines, linking thousands of power plants to millions of factories and homes. Accurately monitoring the grid’s operating condition is critical to several control and optimization tasks, including optimal power flow, reliability analysis, attack detection, and future network expansion planning [2], [3].

To enable grid-wide monitoring, power system engineers in the 1960s pursued voltages at a few critical buses based on readings collected from current and potential transformers. But the power flow equations were never feasible due to timing and modeling inaccuracies. In a seminal work [4], the statistical foundation was laid for power system state estimation (PSSE), whose central task is to obtain the voltage magnitude and angle information at all buses given the network parameters and measurements acquired from across the grid. Since then, there have been numerous PSSE contributions; see, for example, [3] for a recent review of PSSE advances, some of which are outlined below.

Power grids are primarily monitored by supervisory control and data acquisition (SCADA) systems. Parameter uncertainty, instrument mis-calibration, and unmonitored topology changes can, however, yield grossly corrupted SCADA measurements (a.k.a. ‘bad data’) [5]. Designed for functionality and efficiency with little attention paid to security, today’s SCADA systems are vulnerable to cyberattacks [6]. Bad data also come in the form of purposeful manipulation of smart meter readings, as evidenced by the first hacker-caused power outage: the 2015 Ukraine blackout [7]. Any of these events can render a given data collection far less accurate than popular mathematical models assume. Efficient robust PSSE approaches against cyber threats are thus well motivated in the smart grid context [8].

Commonly used PSSE criteria include the weighted least-squares (WLS) and the least-absolute-value (LAV) [9]. Other estimators with enhanced robustness include the Schweppe-Huber generalized M-estimator [10], as well as the least-median and the least-trimmed squares state estimators [11]. The WLS criterion coincides with the maximum likelihood criterion when additive white Gaussian noise is assumed. Unfortunately, WLS comes with a number of limitations. First, obtaining the WLS estimate based on SCADA measurements amounts to minimizing a nonconvex quartic polynomial. As a result, the quality of the resultant iterative estimators relies

heavily on the initialization (see justifications in [12] and [13]). Furthermore, the convergence of Gauss-Newton iterations for nonconvex objectives is hardly guaranteed in general [14]. At least as important, WLS estimators are sensitive to bad data [5]. They may yield very poor estimates in the presence of outliers. These issues were somewhat mitigated by incorporating the largest normalized residual (LNR) test for bad data removal [5], or by reformulating the (possibly regularized) WLS into a semidefinite program (SDP) via convex relaxation [15], [16]. The former alternates between the LNR test and the estimation, while the latter solves SDPs. The least-median-squares and the least-trimmed-squares estimators have provably improved performance under certain conditions [11]. Unfortunately, their computational complexities and storage requirements scale unfavorably with the number of buses in the network [3].

On the other hand, LAV estimators simultaneously identify and reject bad data while acquiring an accurate estimate of the state [17]. Recent research efforts have focused on dealing with the nonconvexity and nonsmoothness in LAV estimation. Upon linearizing the nonlinear measurement functions at the most recent iterate, a series of linear programs was solved [17]. Techniques for improving the linear programming by exploiting the system’s structure [18], or via iterative reweighting [19] have also been reported. LAV estimation based only on PMU data was studied [20], [21], in which a strategic scaling was suggested to eliminate the effect of leverage measurements [10], [11]. Despite these efforts, LAV estimators have not yet been widely employed in today’s power networks, mostly due to their computational inefficiency [20].

The LAV-based PSSE is revisited in this work from the viewpoint of composite optimization [22], [23], which considers minimizing functions $f(\mathbf{v}) = h(c(\mathbf{v}))$ that are compositions of a convex function $h$ and a smooth vector function $c$. Two novel proximal linear (prox-linear) procedures are developed based upon minimizing a sequence of convex quadratic subproblems. The first, deterministic LAV solver minimizes functions constructed from a linearized approximation to the original objective and a quadratic regularization, each efficiently implementable using either off-the-shelf solvers or the alternating direction method of multipliers (ADMM). The convergence of such deterministic prox-linear algorithms has been documented [23], [24].

The second LAV solver builds on a stochastic prox-linear algorithm, and has each iteration minimizing the summation of a linearized approximation to the LAV loss of a single measurement and the regularization term [25], [26]. Interestingly, each iteration of the stochastic LAV solver has a closed-form update. Thanks to the sparse connectivity inherent to power networks, this amounts to updating very few entries of the state vector. Moreover, even faster implementation of the stochastic solver is realized by means of judiciously mini-batching the measurements.

Bad leverage points may pose a challenge, but the proposed prox-linear algorithms can be generalized to accommodate robust estimation formulations including the Huber estimation, Huber M-estimation, and the Schweppe-Huber generalized M-estimation [10], [11], [27]–[30]. The novel algorithms were numerically tested using the IEEE 14-, 118-bus, and the PEGASE 9,241-bus benchmark networks. Simulations corroborate their merits relative to the WLS-based Gauss-Newton iterations.

Outline. Grid modeling and problem formulation are given in Section II. Upon reviewing the basics of composite optimization, Section III presents the deterministic LAV solver, followed by its stochastic alternative in Section IV. Extensive numerical tests are presented in Section V, while the paper is concluded in Section VI.

Notation: Matrices (column vectors) are denoted by upper- (lower-) case boldface letters. Symbols $(\cdot)^T$ and $(\cdot)^H$ represent (Hermitian) transpose, and $\overline{(\cdot)}$ complex conjugate. Sets are denoted using calligraphic letters. Symbol $\Re(\cdot)$ ($\Im(\cdot)$) takes the real (imaginary) part of a complex number. Operator $\mathrm{dg}(\mathbf{x}_i)$ defines a diagonal matrix holding entries of $\mathbf{x}_i$ on its diagonal, while $[\mathbf{x}_i]_{1\le i\le N}$ returns a matrix with $\mathbf{x}_i^H$ being its $i$-th row.

II. GRID MODELING AND PROBLEM FORMULATION

An electric grid having $N$ buses and $L$ lines is modeled as a graph $\mathcal{G} = (\mathcal{N}, \mathcal{E})$, whose nodes $\mathcal{N} := \{1, 2, \ldots, N\}$ correspond to buses and whose edges $\mathcal{E} := \{(n, n')\} \subseteq \mathcal{N}\times\mathcal{N}$ correspond to lines. The complex voltage per bus $n \in \mathcal{N}$ is expressed in rectangular coordinates as $v_n = \Re(v_n) + j\Im(v_n)$, with all nodal voltages forming the vector $\mathbf{v} := [v_1 \cdots v_N]^H \in \mathbb{C}^N$.

The voltage magnitude square $V_n := |v_n|^2 = \Re^2(v_n) + \Im^2(v_n)$ can be compactly expressed as a quadratic function of $\mathbf{v}$, namely

$$V_n = \mathbf{v}^H\mathbf{H}_n^V\mathbf{v}, \quad \text{with } \mathbf{H}_n^V := \mathbf{e}_n\mathbf{e}_n^T \tag{1}$$

where $\mathbf{e}_n$ denotes the $n$-th canonical vector in $\mathbb{R}^N$. To express power injections as functions of $\mathbf{v}$, introduce the so-termed bus admittance matrix $\mathbf{Y} = \mathbf{G} + j\mathbf{B} \in \mathbb{C}^{N\times N}$ [2]. In rectangular coordinates, the active and reactive power injections $p_n$ and $q_n$ at bus $n$ are given by

$$p_n = \Re(v_n)\sum_{n'=1}^{N}\big[\Re(v_{n'})G_{nn'} - \Im(v_{n'})B_{nn'}\big] + \Im(v_n)\sum_{n'=1}^{N}\big[\Im(v_{n'})G_{nn'} + \Re(v_{n'})B_{nn'}\big] \tag{2}$$

$$q_n = \Im(v_n)\sum_{n'=1}^{N}\big[\Re(v_{n'})G_{nn'} - \Im(v_{n'})B_{nn'}\big] - \Re(v_n)\sum_{n'=1}^{N}\big[\Im(v_{n'})G_{nn'} + \Re(v_{n'})B_{nn'}\big] \tag{3}$$

which admit a compact representation as

$$p_n = \mathbf{v}^H\mathbf{H}_n^p\mathbf{v}, \quad \text{with } \mathbf{H}_n^p := \frac{\mathbf{Y}^H\mathbf{e}_n\mathbf{e}_n^T + \mathbf{e}_n\mathbf{e}_n^T\mathbf{Y}}{2} \tag{4a}$$

$$q_n = \mathbf{v}^H\mathbf{H}_n^q\mathbf{v}, \quad \text{with } \mathbf{H}_n^q := \frac{\mathbf{Y}^H\mathbf{e}_n\mathbf{e}_n^T - \mathbf{e}_n\mathbf{e}_n^T\mathbf{Y}}{2j}. \tag{4b}$$

Recognize that the line current from bus $n$ to $n'$ at the ‘from’ end obeys $i^f_{nn'} = \mathbf{e}_{nn'}^T\mathbf{i}_f = \mathbf{e}_{nn'}^T\mathbf{Y}_f\mathbf{v}$, where $\mathbf{i}_f \in \mathbb{C}^{|\mathcal{E}|}$ collects all line currents, and $\mathbf{Y}_f \in \mathbb{C}^{|\mathcal{E}|\times N}$ relates the bus voltages


to all line currents at the ‘from’ (sending) end. Ohm’s and Kirchhoff’s laws assert that the ‘from-end’ power flow over line $(n, n')$ can be expressed as $S^f_{nn'} = P^f_{nn'} - jQ^f_{nn'} = \bar{v}_n i^f_{nn'} = (\mathbf{v}^H\mathbf{e}_n)(\mathbf{e}_{nn'}^T\mathbf{i}_f) = \mathbf{v}^H\mathbf{e}_n\mathbf{e}_{nn'}^T\mathbf{Y}_f\mathbf{v}$, yielding

$$P^f_{nn'} = \mathbf{v}^H\mathbf{H}^P_{nn'}\mathbf{v}, \quad \text{with } \mathbf{H}^P_{nn'} := \frac{\mathbf{Y}_f^H\mathbf{e}_{nn'}\mathbf{e}_n^T + \mathbf{e}_n\mathbf{e}_{nn'}^T\mathbf{Y}_f}{2} \tag{5a}$$

$$Q^f_{nn'} = \mathbf{v}^H\mathbf{H}^Q_{nn'}\mathbf{v}, \quad \text{with } \mathbf{H}^Q_{nn'} := \frac{\mathbf{Y}_f^H\mathbf{e}_{nn'}\mathbf{e}_n^T - \mathbf{e}_n\mathbf{e}_{nn'}^T\mathbf{Y}_f}{2j}. \tag{5b}$$

The active and reactive power flows measured at the ‘to’ (receiving) ends, $P^t_{nn'}$ and $Q^t_{nn'}$, can be written symmetrically to $P^f_{nn'}$ and $Q^f_{nn'}$; hence, they are omitted here for brevity.

Given line parameters collected in $\mathbf{Y}$ and $\mathbf{Y}_f$, all SCADA measurements including squared voltage magnitudes as well as active and reactive power injections and flows can be expressed as quadratic functions of the voltages $\mathbf{v} \in \mathbb{C}^N$. This justifies why $\mathbf{v}$ is referred to as the system state. If $\mathcal{S}_V$, $\mathcal{S}_p$, $\mathcal{S}_q$, $\mathcal{S}_P^f$, $\mathcal{S}_Q^f$, $\mathcal{S}_P^t$, and $\mathcal{S}_Q^t$ signify the smart meter locations of the corresponding type, we have available the following (possibly noisy or even corrupted) measurements: $\{\check{V}_n\}_{n\in\mathcal{S}_V}$, $\{\check{p}_n\}_{n\in\mathcal{S}_p}$, $\{\check{q}_n\}_{n\in\mathcal{S}_q}$, $\{\check{P}^f_{nn'}\}_{(n,n')\in\mathcal{S}_P^f}$, $\{\check{Q}^f_{nn'}\}_{(n,n')\in\mathcal{S}_Q^f}$, $\{\check{P}^t_{nn'}\}_{(n,n')\in\mathcal{S}_P^t}$, and $\{\check{Q}^t_{nn'}\}_{(n,n')\in\mathcal{S}_Q^t}$, henceforth concatenated in the vector $\mathbf{z} \in \mathbb{R}^M$, where $M$ denotes the total number of measurements.

In this paper, the following corruption model is considered [31]: If $\{\xi_i\} \subseteq \mathbb{R}$ models an arbitrary attack (or outlier) sequence, given the measurement matrices $\{\mathbf{H}_m\}_{m=1}^M$ in (1)–(5), we observe for $1 \le m \le M$ the samples

$$z_m \approx \begin{cases} \mathbf{v}^H\mathbf{H}_m\mathbf{v} & \text{if } m \in \mathcal{I}^{\rm nom} \\ \xi_m & \text{if } m \in \mathcal{I}^{\rm out} \end{cases} \tag{6}$$

where an additive measurement-noise term can be written explicitly if $\approx$ is replaced by equality, and $\mathcal{I}^{\rm nom}, \mathcal{I}^{\rm out} \subseteq \{1, 2, \ldots, M\}$ collect the indices of nominal data and outliers, respectively. In other words, $\mathcal{I}^{\rm out}$ is the set of meter indices that can be compromised. The indices in $\mathcal{I}^{\rm out}$ are assumed chosen randomly from $\{1, 2, \ldots, M\}$. Instrument failures occur at random, although the attack sequence $\{\xi_m\}$ may rely on $\{\mathbf{H}_m\}$ (even adversarially). Specifically, two models will be considered for the attacks.

M1 Matrices $\{\mathbf{H}_m\}_{m=1}^M$ are independent of $\{\xi_m\}_{m=1}^M$.

M2 Nominal measurement matrices $\{\mathbf{H}_m\}_{m\in\mathcal{I}^{\rm nom}}$ are independent of $\{\xi_m\}_{m\in\mathcal{I}^{\rm out}}$.

It is worth highlighting that M1 requires full independence between the corruption and measurements. That is, the attacker may only corrupt $\xi_m$ without knowing $\mathbf{H}_m$. On the contrary, M2 allows completely arbitrary dependence between $\xi_m$ and $\mathbf{H}_m$ for $m \in \mathcal{I}^{\rm out}$, which is natural as the type of corruption may also rely on the individual measurement $\mathbf{H}_m$ being taken.

Having elaborated on the system and corruption models, the PSSE problem can be stated as follows: Given matrices $\mathbf{Y}$, $\mathbf{Y}_f$, and the available measurements $\mathbf{z} \in \mathbb{R}^M$, with entries as in (6) obeying M1 or M2, recover the voltage vector $\mathbf{v} \in \mathbb{C}^N$. The first attempt may be seeking the WLS estimate, or the ML one when assuming independent Gaussian noise [4]. It is known, however, that the WLS criterion is sensitive to outliers, and may yield very bad estimates even if there are few grossly corrupted measurements [5]. As is well documented in statistics and optimization, the $\ell_1$-based losses yield median-based estimators [32], and handle gross errors in the measurements $\mathbf{z}$ in a relatively benign way. Prompted by this, we will consider here minimizing the $\ell_1$ loss of the residuals, which leads to the so-called LAV estimate [17]

$$\underset{\mathbf{v}\in\mathbb{C}^N}{\text{minimize}}\ f(\mathbf{v}) := \frac{1}{M}\sum_{m=1}^{M}\big|\mathbf{v}^H\mathbf{H}_m\mathbf{v} - z_m\big|. \tag{7}$$

Because of $\{\mathbf{v}^H\mathbf{H}_m\mathbf{v}\}_{m=1}^M$ and the absolute-value operation, the LAV objective in (7) is nonsmooth, nonconvex, and not even locally convex near the optima $\pm\mathbf{v}^*$. This is clear from the real-valued scalar case $f(v) = |v^{*}v - 1|$, where $v \in \mathbb{R}$. A local analysis based on convexity and smoothness is thus impossible, and $f(\mathbf{v})$ is difficult to minimize. For this reason, Gauss-Newton is not applicable to minimize (7). Nevertheless, the criterion $f(\mathbf{v})$ possesses several unique structural properties, which we exploit next to develop efficient algorithms.

Remark 1: For an $N$-bus power system, most existing PSSE approaches have relied on optimizing over $(2N - 1)$ real variables, which consist of either the polar or the rectangular components of the complex voltage phasors after excluding the angle or the imaginary part of the reference bus that is often set to 0. Nevertheless, when iterative algorithms are used, working directly with the $N$-dimensional complex voltage vector has in general lower complexity and computational burden than in the real case. This is due to the compact quadratic representations of all SCADA quantities in complex voltage phasors, namely the natural sparsity of quadratic measurement matrices in the unknown complex voltage phasor vector.

III. DETERMINISTIC PROX-LINEAR LAV SOLVER

In this section, we will develop a deterministic solver of (7). To that end, let us start by rewriting the objective in (7) as

$$\underset{\mathbf{v}\in\mathbb{C}^N}{\text{minimize}}\ f(\mathbf{v}) := h(c(\mathbf{v})) \tag{8}$$

the composition of a convex function $h : \mathbb{R}^M \to \mathbb{R}$, and a smooth vector function $c : \mathbb{C}^N \to \mathbb{R}^M$, a structure that is known to be amenable to efficient algorithms [22], [23]. It is clear that this general form subsumes (7) as a special case, for which we can take $h(\mathbf{u}) = (1/M)\|\mathbf{u}\|_1$ and $c(\mathbf{v}) = [\mathbf{v}^H\mathbf{H}_m\mathbf{v} - z_m]_{1\le m\le M}$. The compositional structure lends itself well to the proximal linear (prox-linear) algorithm, which is a variant of the Gauss-Newton iterations [22]. Specifically, define close to a given $\mathbf{v}$ the local “linearization” of $f$ as

$$f_{\mathbf{v}}(\mathbf{w}) := h\big(c(\mathbf{v}) + \Re(\nabla^H c(\mathbf{v})(\mathbf{w} - \mathbf{v}))\big) \tag{9}$$

where $\nabla c(\mathbf{v}) \in \mathbb{C}^{N\times M}$ denotes the Jacobian matrix of $c$ at $\mathbf{v}$ based on Wirtinger derivatives for functions of complex-valued variables [33, Appendix]. In contrast to the nonconvex



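$f(\mathbf{v})$, function $f_{\mathbf{v}}(\mathbf{w})$ in (9) is convex in $\mathbf{w}$, which is the key behind the prox-linear method. Starting with some point $\mathbf{v}_0$, which can be the flat-voltage profile point, namely the all-one vector, construct the iteration

$$\mathbf{v}_{t+1} := \arg\min_{\mathbf{v}\in\mathbb{C}^N}\Big\{ f_{\mathbf{v}_t}(\mathbf{v}) + \frac{1}{2\mu_t}\|\mathbf{v} - \mathbf{v}_t\|_2^2 \Big\} \tag{10}$$

where $\mu_t > 0$ is a stepsize that can be fixed in advance, or be determined by a line search [23].

Evidently, the subproblem (10) to be solved at every iteration of the prox-linear algorithm is convex, and can be handled by off-the-shelf solvers such as CVX [34]. However, these interior-point based solvers may not scale well when $\{\mathbf{H}_m\}$ are large. For this reason, we derive next a more efficient iterative procedure using ADMM iterations [35], [36].

When specifying $f$ to be the LAV objective of (7), the minimization in (10) becomes, after multiplying the objective by $\mu_t$,

$$\mathbf{v}_{t+1} = \arg\min_{\mathbf{v}\in\mathbb{C}^N}\ \big\|\Re(\mathbf{A}_t(\mathbf{v} - \mathbf{v}_t)) - \mathbf{c}_t\big\|_1 + \frac{1}{2}\|\mathbf{v} - \mathbf{v}_t\|_2^2 \tag{11}$$

with coefficients given by

$$\mathbf{A}_t := \big[(2\mu_t/M)\,\mathbf{v}_t^H\mathbf{H}_m\big]_{1\le m\le M} \tag{12a}$$

$$\mathbf{c}_t := \big[(\mu_t/M)\big(z_m - \mathbf{v}_t^H\mathbf{H}_m\mathbf{v}_t\big)\big]_{1\le m\le M}. \tag{12b}$$

For brevity, let $\mathbf{w} := \mathbf{v} - \mathbf{v}_t$, and rewrite (11) equivalently as a constrained optimization problem

$$\underset{\mathbf{u}\in\mathbb{C}^M,\ \mathbf{w}\in\mathbb{C}^N}{\text{minimize}}\ \|\Re(\mathbf{u}) - \mathbf{c}_t\|_1 + \frac{1}{2}\|\mathbf{w}\|_2^2 \tag{13a}$$

$$\text{subject to}\ \mathbf{A}_t\mathbf{w} = \mathbf{u}. \tag{13b}$$

To decouple constraints and also facilitate the implementation of ADMM, introduce auxiliary copies $\tilde{\mathbf{u}}$ and $\tilde{\mathbf{w}}$ for $\mathbf{u}$ and $\mathbf{w}$ accordingly, and rewrite (13) into

$$\underset{\tilde{\mathbf{u}},\ \tilde{\mathbf{w}},\ \mathbf{u},\ \mathbf{w}}{\text{minimize}}\ \|\Re(\tilde{\mathbf{u}}) - \mathbf{c}_t\|_1 + \frac{1}{2}\|\tilde{\mathbf{w}}\|_2^2 \tag{14a}$$

$$\text{subject to}\ \tilde{\mathbf{u}} = \mathbf{u},\ \tilde{\mathbf{w}} = \mathbf{w},\ \mathbf{A}_t\mathbf{w} = \mathbf{u}. \tag{14b}$$

Letting $\boldsymbol{\lambda} \in \mathbb{C}^N$ and $\boldsymbol{\nu} \in \mathbb{C}^M$ be the Lagrange multipliers corresponding to the $\mathbf{w}$- and $\mathbf{u}$-consensus constraints, respectively, the augmented Lagrangian after leaving out the last equality in (14b) can be expressed as

$$\mathcal{L}(\tilde{\mathbf{w}}, \tilde{\mathbf{u}}, \mathbf{w}, \mathbf{u}; \boldsymbol{\lambda}, \boldsymbol{\nu}) := \|\Re(\tilde{\mathbf{u}}) - \mathbf{c}_t\|_1 + \frac{1}{2}\|\tilde{\mathbf{w}}\|_2^2 + \Re\big(\boldsymbol{\lambda}^H(\tilde{\mathbf{w}} - \mathbf{w})\big) + \Re\big(\boldsymbol{\nu}^H(\tilde{\mathbf{u}} - \mathbf{u})\big) + \frac{\rho}{2}\|\tilde{\mathbf{w}} - \mathbf{w}\|_2^2 + \frac{\rho}{2}\|\tilde{\mathbf{u}} - \mathbf{u}\|_2^2 \tag{15}$$

where $\rho > 0$ is a predefined step size. With $k \in \mathbb{N}$ denoting the iteration index for solving (13), or equivalently (10), ADMM cycles through the following recursions

$$\tilde{\mathbf{w}}^{k+1} := \arg\min_{\tilde{\mathbf{w}}}\ \frac{1}{2}\|\tilde{\mathbf{w}}\|_2^2 + \frac{\rho}{2}\big\|\tilde{\mathbf{w}} - (\mathbf{w}^k - \boldsymbol{\lambda}^k)\big\|_2^2 \tag{16a}$$

$$\tilde{\mathbf{u}}^{k+1} := \arg\min_{\tilde{\mathbf{u}}}\ \|\Re(\tilde{\mathbf{u}}) - \mathbf{c}_t\|_1 + \frac{\rho}{2}\big\|\tilde{\mathbf{u}} - (\mathbf{u}^k - \boldsymbol{\nu}^k)\big\|_2^2 \tag{16b}$$

$$\{\mathbf{w}^{k+1}, \mathbf{u}^{k+1}\} := \arg\min_{\mathbf{w},\ \mathbf{u}}\ \big\|\mathbf{w} - (\tilde{\mathbf{w}}^{k+1} + \boldsymbol{\lambda}^k)\big\|_2^2 + \big\|\mathbf{u} - (\tilde{\mathbf{u}}^{k+1} + \boldsymbol{\nu}^k)\big\|_2^2 \quad \text{subject to}\ \mathbf{A}_t\mathbf{w} = \mathbf{u} \tag{16c}$$

$$\begin{bmatrix}\boldsymbol{\lambda}^{k+1} \\ \boldsymbol{\nu}^{k+1}\end{bmatrix} = \begin{bmatrix}\boldsymbol{\lambda}^k + \tilde{\mathbf{w}}^{k+1} - \mathbf{w}^{k+1} \\ \boldsymbol{\nu}^k + \tilde{\mathbf{u}}^{k+1} - \mathbf{u}^{k+1}\end{bmatrix} \tag{16d}$$

where all the dual variables have been scaled by the factor $\rho > 0$ [35].

Interestingly enough, the solutions of (16a)-(16c) can be provided in closed form, as we elaborate in the following two propositions, whose proofs are deferred to the Appendix.

Proposition 1: The solutions of (16a) and (16b) are respectively

$$\tilde{\mathbf{w}}^{k+1} := \frac{\rho}{1+\rho}\big(\mathbf{w}^k - \boldsymbol{\lambda}^k\big) \tag{17a}$$

$$\tilde{\mathbf{u}}^{k+1} := \mathbf{c}_t + \mathcal{S}_{1/2\rho}\big(\Re(\mathbf{u}^k - \boldsymbol{\nu}^k) - \mathbf{c}_t\big) + i\,\Im\big(\mathbf{u}^k - \boldsymbol{\nu}^k\big) \tag{17b}$$

where the shrinkage operator $\mathcal{S}_\tau(\mathbf{x}) : \mathbb{R}^N \times \mathbb{R}_+ \to \mathbb{R}^N$ is $\mathcal{S}_\tau(\mathbf{x}) := \mathrm{sign}(\mathbf{x}) \odot \max(|\mathbf{x}| - \tau\mathbf{1}, \mathbf{0})$, with $\odot$ and $|\cdot|$ denoting the entrywise multiplication and absolute operators, respectively, and

$$\mathrm{sign}(x) := \begin{cases} x/|x|, & x \neq 0 \\ 0, & x = 0 \end{cases}$$

provides an entrywise definition of operator $\mathrm{sign}(\mathbf{x})$.

The constrained minimization of (16c) essentially projects the pair $(\tilde{\mathbf{w}}^{k+1} + \boldsymbol{\lambda}^k,\ \tilde{\mathbf{u}}^{k+1} + \boldsymbol{\nu}^k)$ onto the convex set specified by the linear equality constraint, namely $\{(\mathbf{w}, \mathbf{u}) : \mathbf{A}_t\mathbf{w} = \mathbf{u}\}$. Its solution is derived in a simple closed form next.

Proposition 2: Given $\mathbf{b} \in \mathbb{C}^N$ and $\mathbf{d} \in \mathbb{C}^M$, the solution of

$$\underset{\mathbf{w}\in\mathbb{C}^N,\ \mathbf{u}\in\mathbb{C}^M}{\text{minimize}}\ \frac{1}{2}\|\mathbf{w} - \mathbf{b}\|_2^2 + \frac{1}{2}\|\mathbf{u} - \mathbf{d}\|_2^2 \quad \text{subject to}\ \mathbf{A}\mathbf{w} = \mathbf{u}$$

is given as

$$\mathbf{w}^* := \big(\mathbf{I} + \mathbf{A}^H\mathbf{A}\big)^{-1}\big(\mathbf{b} + \mathbf{A}^H\mathbf{d}\big) \tag{18a}$$

$$\mathbf{u}^* := \mathbf{A}\mathbf{w}^*. \tag{18b}$$

Using Proposition 2, the minimizer of (16c) is found as

$$\mathbf{w}^{k+1} := \big(\mathbf{I} + \mathbf{A}_t^H\mathbf{A}_t\big)^{-1}\big[\tilde{\mathbf{w}}^{k+1} + \boldsymbol{\lambda}^k + \mathbf{A}_t^H\big(\tilde{\mathbf{u}}^{k+1} + \boldsymbol{\nu}^k\big)\big] \tag{19a}$$

$$\mathbf{u}^{k+1} := \mathbf{A}_t\mathbf{w}^{k+1}. \tag{19b}$$

The four updates in (16) are computationally simple except for the matrix inversion of (19a), which nevertheless can be cached once computed during the first iteration. In addition, variables $\tilde{\mathbf{u}}^0$, $\boldsymbol{\lambda}^0$, and $\boldsymbol{\nu}^0$ of ADMM can be initialized to zero. Finally, the solution of (11) can be obtained as

$$\mathbf{v}_{t+1} := \mathbf{v}_t + \mathbf{w}^* \tag{20}$$

where $\mathbf{w}^*$ is the converged $\mathbf{w}$-iterate of the ADMM iterations in (17), (19), and (16d).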
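The two closed-form ingredients of the ADMM recursions, the shrinkage operator of Proposition 1 and the projection of Proposition 2, can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `shrink` and `project_onto_graph` and the random toy data are hypothetical, and a dense linear solve stands in for the cached inverse of I + A^H A mentioned above.

```python
import numpy as np


def shrink(x, tau):
    """Entrywise shrinkage S_tau(x) = sign(x) * max(|x| - tau, 0) for real x."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def project_onto_graph(A, b, d):
    """Closest pair (w, u) to (b, d) subject to A w = u, following (18a)-(18b)."""
    n = A.shape[1]
    AH = A.conj().T
    # Solve (I + A^H A) w = b + A^H d rather than forming the inverse explicitly.
    w = np.linalg.solve(np.eye(n) + AH @ A, b + AH @ d)
    return w, A @ w


# Hypothetical toy data: M = 4 measurements, N = 3 complex unknowns.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
b = rng.standard_normal(3) + 1j * rng.standard_normal(3)
d = rng.standard_normal(4) + 1j * rng.standard_normal(4)

w_star, u_star = project_onto_graph(A, b, d)
# First-order optimality of the projection: (w - b) + A^H (A w - d) = 0.
optimality_residual = np.linalg.norm((w_star - b) + A.conj().T @ (u_star - d))
shrunk = shrink([-3.0, -0.5, 0.0, 2.0], tau=1.0)
```

In a full inner loop, `A` would be the matrix in (12a) for the current outer iterate, so the system matrix I + A^H A stays fixed across ADMM iterations and could be factorized once.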
The novel deterministic LAV solver based on ADMM is summarized in Table I, in which the inner loop consisting


of Steps 3-8 can be replaced with off-the-shelf solvers to solve (11) for $\mathbf{v}_{t+1}$.

TABLE I
DETERMINISTIC LAV SOLVER USING ADMM

As far as performance is concerned, if $h$ is $L$-Lipschitz and $\nabla c$ is $\beta$-Lipschitz, then taking a constant stepsize $\mu \le 1/(L\beta)$ in (10) guarantees that [24]:
i) the proposed solver in Table I is a descent method; and,
ii) the iterate sequence $\{\mathbf{v}_t\}$ converges to a stationary point of the LAV objective in (7).

The computational burden of the ADMM-based deterministic solver is dominated by the projection step of (19), which incurs per-iteration computational complexity on the order of $\mathcal{O}(MN^2)$. This complexity can be afforded in small- or medium-size PSSE tasks, but may not be efficient enough for today’s increasingly interconnected power networks. This motivates our stochastic alternative of the ensuing section that relies on very inexpensive iterations.

IV. STOCHASTIC PROX-LINEAR LAV SOLVER

Finding the minimizer of (11) exactly per iteration of the deterministic LAV solver may be computationally expensive, and can be intractable when the network size grows very large. Considering the wide applicability of LAV estimation as well as the increasing interconnection of microgrids, scalable online and stochastic approaches become of substantial interest. In this section, a stochastic linear proximal algorithm of [25] is adapted to our PSSE task, and enables the prox-linear method to efficiently solve the LAV estimation problem at scale. Advantages of the stochastic approaches over their deterministic counterparts include oftentimes simple closed-form updates as well as faster convergence to yield an (approximately) optimal solution. Leveraging the sparsity structure of the measurement matrices and judiciously grouping measurements into small mini-batches can considerably speed up implementation of the stochastic solver.

A. Stochastic LAV Solver

Instead of dealing with the quadratic subproblems in (11), each iteration of the stochastic LAV solver samples a datum $m_t \in \{1, 2, \ldots, M\}$ uniformly at random from the total $M$ of measurements, and substitutes functions $(h, c)$ by $(h_{m_t}, c_{m_t})$ associated with the sampled datum in the local linearization of (9), hence also in (11), yielding

$$\mathbf{v}_{t+1} := \arg\min_{\mathbf{v}\in\mathbb{C}^N}\ \big|\Re\big(\mathbf{a}_{m_t}^H(\mathbf{v} - \mathbf{v}_t)\big) - c_{m_t}\big| + \frac{1}{2\mu_t}\|\mathbf{v} - \mathbf{v}_t\|_2^2 \tag{21}$$

where the coefficients are given by

$$\mathbf{a}_{m_t} := 2\mathbf{H}_{m_t}\mathbf{v}_t \tag{22a}$$

$$c_{m_t} := z_{m_t} - \mathbf{v}_t^H\mathbf{H}_{m_t}\mathbf{v}_t. \tag{22b}$$

Different from iteratively seeking solutions of (11) based on ADMM iterations, the minimization of (21) admits a simple closed-form minimizer presented in the next result, which is proved in the Appendix.

Proposition 3: Given $\mathbf{a} \in \mathbb{C}^N$ and $c \in \mathbb{R}$, the solution of

$$\underset{\mathbf{w}\in\mathbb{C}^N}{\text{minimize}}\ \big|\Re(\mathbf{a}^H\mathbf{w}) - c\big| + \frac{1}{2\tau}\|\mathbf{w}\|_2^2 \tag{23}$$

is given by $\hat{\mathbf{w}} := \mathrm{proj}_\tau(c/\|\mathbf{a}\|_2^2)\cdot\mathbf{a}$, where the projection operator $\mathrm{proj}_\tau(x) : \mathbb{R}\times\mathbb{R}_+ \to \mathbb{R}$ returns the real number in interval $[-\tau, \tau]$ closest to any given $x \in \mathbb{R}$.

Based on Proposition 3, the solution of (21) is given by

$$\mathbf{v}_{t+1} := \mathbf{v}_t + \mathrm{proj}_{\mu_t}\big(c_{m_t}/\|\mathbf{a}_{m_t}\|_2^2\big)\cdot\mathbf{a}_{m_t}. \tag{24}$$

Intuitively, measurements with a relatively small absolute residual, namely $|c_{m_t}| \le \mu_t\|\mathbf{a}_{m_t}\|_2^2$, are deemed ‘nominal,’ and $\mathbf{v}_t$ is updated with a step of $c_{m_t}/\|\mathbf{a}_{m_t}\|_2^2$ along the current direction of $\mathbf{a}_{m_t}$. On the other hand, the measurements of larger absolute residuals are likely to be outliers, so $\mathbf{v}_t$ is updated along its direction $\mathbf{a}_{m_t}$ by only a step of magnitude $\mu_t$ as opposed to $c_{m_t}/\|\mathbf{a}_{m_t}\|_2^2$.

The proposed stochastic prox-linear LAV solver is listed in Table II. For convergence, a diminishing stepsize sequence $\{\mu_t\}$ is required. Specifically, we consider stepsizes that are square summable but not summable; that is,

$$\sum_{t=0}^{\infty}\mu_t = \infty, \quad \text{and} \quad \sum_{t=0}^{\infty}\mu_t^2 < \infty. \tag{25}$$

For instance, one can choose $\mu_t = \alpha t^{-\beta}$ with appropriately selected constants $\alpha > 0$ and $\beta \in (0.5, 1]$. Then the sequence $\{\mathbf{v}_t\}$ converges to a stationary point of the LAV objective in (7) almost surely [25, Th. 1].

TABLE II
STOCHASTIC LAV SOLVER

In terms of computational complexity, it can be verified that each $\mathbf{H}_m$ matrix corresponding to a square voltage magnitude or (re)active power flow measurement [see (1) and (5)] has exactly one or three nonzero entries, respectively. As such,


if the available measurements include only these two types of measurements, evaluating $(\mathbf{a}_{m_t}, c_{m_t})$ as well as updating $\mathbf{v}_t$ per stochastic iteration requires just a small number ($\le 10$) of scalar multiplications and additions, therefore incurring per-iteration complexity of $\mathcal{O}(1)$, which is independent of the network size $N$. This complexity evidently scales favorably to very large interconnected power networks. It is also worth highlighting that only one or two entries of $\mathbf{v}_t$ are updated depending on whether a voltage or power flow measurement is processed at each iteration. On the other hand, even if the power injections are measured too, the number of scalar operations per iteration increases to the order of the number of neighboring buses, which still remains much smaller than $N$ in most real-world networks.

Remark 2: PMU measurements can be easily accounted for in (7). The developed prox-linear LAV schemes apply without any algorithmic modification.

B. Accelerated Implementation Using Mini-Batches

Although it involves only simple closed-form updates, the stochastic solver in Table II may require a large number of iterations to converge for high-dimensional power networks. Stochastic approaches based on mini-batches of measurements have recently become popular in large-scale machine learning tasks, because they offer a means of accelerating the stochastic algorithms. Yet, the naive way of designing mini-batches by grouping measurements randomly would yield a sequence of quadratic programs as in (11) of the deterministic LAV solver, which do not have closed-form solutions due to the $\ell_1$ term. The novelty here is to fully exploit the sparsity of the $\mathbf{H}_m$ matrices to group measurements into mini-batches in a way that closed-form solutions of the resulting quadratic programs are possible.

Suppose that active and reactive power flows over all lines and the square voltage magnitude at all buses are measured. Since $\mathbf{H}_n^V$ in (1) has exactly one nonzero entry at the $(n, n)$-th position, the corresponding updating vector $\mathbf{a}_n$ in (22a) has (at most) one nonzero entry at the $n$-th position. Updating $\mathbf{v}_t$ using (24) modifies the $n$-th position only. Hence, refining $\mathbf{v}$ using the $N$ voltage measurements sequentially in $N$ stochastic iterations is equivalent to updating $\mathbf{v}$ using all $N$ measurements simultaneously in a single iteration. For power flows, every $\mathbf{H}_m$ has three nonzero entries in two rows. For instance, $\mathbf{H}^P_{nn'}$ has nonzero entries indexed by $(n, n')$, $(n, n)$, and $(n', n)$; and so does $\mathbf{H}^Q_{nn'}$. Processing the active or reactive power flow measurement over line $(n, n')$ amounts to updating the $n$-th and $n'$-th entries of $\mathbf{v}_t$. Hence, so long as no measurement in a mini-batch shares common indices with the remaining ones, processing a mini-batch of such measurements one after the other boils down to processing all measurements simultaneously in one iteration.

For illustration, consider the IEEE 14-bus test system depicted in Fig. 1 [37]. Consider a total of 54 measurements, which include 14 square voltage magnitudes, as well as 20 ‘from-end’ active and 20 ‘from-end’ reactive power flows. All voltage magnitudes can be grouped as a single mini-batch, or into several mini-batches by any means. One way of mini-batching each type of power flow measurements is suggested in Table III, where 20 active (reactive) power flows yield 5 mini-batches of equal size. It can be easily verified that any two measurements within a group (mini-batch) are measured over two lines of non-overlapping indices.

TABLE III
MINI-BATCHING POWER FLOW MEASUREMENTS

Fig. 1. The IEEE 14-bus benchmark configuration.

Let the entire set of measurements be divided into $B$ mini-batches denoted by $\{\mathcal{B}_b\}_{b=1}^B \subseteq \{1, 2, \ldots, M\}$. If a mini-batch $\mathcal{B}_{b_t}$ of measurements is drawn uniformly at random from $\{\mathcal{B}_b\}_{b=1}^B$ at iteration $t$, the accelerated implementation by means of mini-batching updates the state estimate according to [see (24)]

$$\mathbf{v}_{t+1} := \mathbf{v}_t + \sum_{m\in\mathcal{B}_{b_t}}\mathrm{proj}_{\mu_t}\big(c_m/\|\mathbf{a}_m\|_2^2\big)\cdot\mathbf{a}_m \tag{26}$$

which is in sharp contrast to that of using ADMM iterations to deal with the quadratic subproblems (10) in the deterministic LAV solver.

V. NUMERICAL TESTS

The proposed linear proximal LAV solvers are numerically tested in this section. Three power network benchmarks including the IEEE 14-, 118-bus, and the PEGASE 9,241-bus systems were simulated, following the MATLAB-based toolbox MATPOWER [37], [38].

The linear programming (LP) and the iteratively reweighted least-squares (IRLS) based LAV estimators [17], [18], [39], along with the ‘workhorse’ Gauss-Newton iterations for the WLS-based PSSE [2] were adopted as baselines. It is worth mentioning that the LP-based implementation can be regarded as a special case of the deterministic prox-linear algorithm


with a constant step size of $\infty$. To see this, per iteration, the LP-based scheme [18] solves the minimization problem in (10) but without the augmented majorization term $\frac{1}{2\mu_t}\|v - v_t\|_2^2$, or equivalently with $\mu_t = \infty$. To be specific, the linear program was formulated over $2N - 1$ real variables consisting of the real and imaginary parts of the unknown voltage phasor vector, after excluding the imaginary part of the reference bus, which was set to 0. Per iteration, the resultant linear program was solved by calling the convex optimization package CVX [34] together with its embedded interior-point solver SeDuMi [40]. Given that there is no parameter in the LP-based LAV estimator, although the time performance may vary if different toolboxes are used for solving the resultant linear programs, its convergence behavior in terms of the number of iterations is independent of the toolbox used. Furthermore, the Gauss-Newton iterations were implemented by calling the embedded state estimation function 'doSE.m' in MATPOWER.
Regarding the initialization, when all squared voltage magnitudes are measured, the initial point is taken to be the voltage magnitude vector, unless otherwise specified. Each simulated scheme stops either when a maximum of 100 iterations is reached, or when the normalized distance between two consecutive estimates becomes smaller than $10^{-10}$, namely $\|v_t - v_{t-1}\|_2 / \sqrt{N} \le 10^{-10}$. In order to fix the phase ambiguity, the phase generated at the reference bus was set to 0 in all tests. For numerical stability, and to eliminate the effect of certain leverage measurements [20], the developed solvers were implemented using the normalized data, namely $\{(z_m / \|H_m\|_2,\ H_m / \|H_m\|_2)\}_{m=1}^{M}$. Although this work focused on fast and scalable implementations of LAV estimators, certain enhanced solvers that possess similar compositional structure as in LAV (7) can also benefit from the developed composite optimization algorithms. Those include, e.g., the (robustified) Schweppe-Huber generalized M-estimator [10].

A. Noiseless Case
The first experiment simulates the noiseless data to evaluate the convergence and runtime of the novel algorithms relative to the WLS-based Gauss-Newton iterations, as well as the LP- and IRLS-based LAV solvers on the IEEE 14-bus test system. The default voltage profile was employed. Measurements including all ('sending-end') active and reactive power flows, as well as all squared voltage magnitudes were obtained from MATPOWER [38]. The ADMM-based deterministic prox-linear solver in Table I was implemented with stepsize $\mu = 200$, where each quadratic subproblem was solved using a maximum of 150 ADMM iterations with stepsize $\rho = 100$. It is worth mentioning that the deterministic prox-linear solver can be also implemented using standard convex programming approaches (by solving subproblem (10) directly). It typically converges in a few (less than 10) iterations yet at a higher computational complexity. The stochastic algorithm in Table II used the diminishing stepsize $1/t^{0.8}$. The accelerated scheme in (26) was implemented with stepsize 0.8 using a total of 11 mini-batches: 1 for all voltage magnitudes, and 5 of equal size for (sending-end) active and reactive power flows, each grouped as in Table III. The normalized root mean-square error (RMSE) $\|v_t - v\|_2 / \|v\|_2$ was evaluated at every Gauss-Newton iteration, per linear program, and every $M$ stochastic iterations of the stochastic and accelerated schemes, where $v$ is the true voltage profile, and $v_t$ denotes the estimate obtained at the $t$-th iteration.
Figure 2 compares the normalized RMSE for the LP-, and IRLS-based, deterministic, stochastic, and accelerated LAV solvers with that of the WLS-based Gauss-Newton iterations, whose corresponding runtime and number of iterations to reach the stopping criterion are tabulated in Table IV. Evidently, the deterministic scheme is the fastest in terms of both the number of iterations and runtime, and converges to a point of machine accuracy (i.e., $10^{-16}$) in 8 iterations. The IRLS is also fast, but similar to the WLS-based Gauss-Newton iterations, it requires inverting a matrix per iteration which may discourage its use in large power systems. Even though the time of solving each LP may vary across toolboxes, convergence of the LP-based scheme in terms of the number of iterations will be the same. Evidently, solving an LP of $2M$ constraints and $2N - 1$ real variables is computationally more cumbersome and slower than performing $M$ (accelerated) stochastic LAV iterations, hence justifying the fast convergence rate of the proposed LAV solvers.

TABLE IV
COMPARISONS OF DIFFERENT STATE ESTIMATORS

Fig. 2. Convergence performance for the IEEE 14-bus system.

The Gauss-Newton method terminated after six iterations, but at a sub-optimal point of normalized RMSE $10^{-3}$ or so. The accelerated implementation is comparable with the stochastic LAV solver, and yields an accurate solution with RMSE $= 4.28 \times 10^{-8}$ in time also comparable to the Gauss-Newton iterations. The proposed LAV solvers are much faster than the LP-based implementation.
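The power-flow mini-batches used in this experiment rely on one structural property: no two lines within a mini-batch share a bus index. The following is a minimal sketch of one way to build such groups, assuming a greedy first-fit rule; it is not necessarily how the grouping in Table III was obtained, and the function name `group_lines` and the toy line list are hypothetical.

```python
def group_lines(lines):
    """Greedy first-fit grouping of lines into bus-disjoint mini-batches.

    `lines` is a list of (from_bus, to_bus) pairs. Within each returned
    mini-batch, no bus index appears more than once.
    """
    batches = []  # each entry: (set of buses already used, list of lines)
    for n, k in lines:
        for used, members in batches:
            if n not in used and k not in used:
                used.update((n, k))
                members.append((n, k))
                break
        else:
            # No existing mini-batch can take this line: open a new one.
            batches.append(({n, k}, [(n, k)]))
    return [members for _, members in batches]


# Hypothetical 5-bus toy network (not the IEEE 14-bus system).
toy_lines = [(1, 2), (1, 5), (2, 3), (2, 4), (3, 4), (4, 5)]
for b, members in enumerate(group_lines(toy_lines), start=1):
    print(f"mini-batch {b}: {members}")
```

First-fit does not guarantee mini-batches of equal size or the smallest possible number of them; it only enforces the non-overlap property that makes the per-batch update available in closed form.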


B. Presence of Outliers
The second experiment was set to assess the robustness of the novel deterministic and stochastic solvers to measurements with outliers using the IEEE 118-bus benchmark network [37], while the IRLS-based LAV implementation and the WLS-based Gauss-Newton iterations were simulated as baselines. The actual voltage magnitude (in p.u.) and angle (in radians) of each bus were uniformly distributed over $[0.9, 1.1]$ and over $[-0.1\pi, 0.1\pi]$, respectively. To assess the PSSE performance versus the measurement size, an additional type of measurements was included in a deterministic manner, as described next. All seven types of SCADA measurements were first enumerated as $\{|V_n|^2, P_{nk}^f, Q_{nk}^f, P_n, Q_n, P_{nk}^t, Q_{nk}^t\}$. Each x-axis value in Fig. 3 signifies the number of ordered types of measurements used in the experiment to yield the corresponding normalized RMSEs, which are obtained by averaging over 100 independent realizations. For example, 5 implies that the first 5 types of data (i.e., $|V_n|^2, P_{nk}^f, Q_{nk}^f, P_n, Q_n$ over all buses and lines) were measured. Additive noise was independently generated from Gaussian distributions having zero-mean and standard deviation 0.004, 0.008, and 0.01 p.u. for the voltage magnitude, line flow, and power injection measurements, respectively [18]. Ten percent of the measured data were corrupted according to model M1, chosen randomly from line flows and bus injections. The outlying data $\{\xi_m\}$ were drawn independently from a Laplacian distribution with zero-mean and standard deviation 30. The subproblems (10) with $\mu = 100$ of the deterministic scheme were solved using a maximum of 200 ADMM iterations with stepsize $\rho = 100$, while the stochastic one was implemented with a diminishing stepsize $\mu_t = 10/t^{0.9}$. It is evident from Fig. 3 that our prox-linear LAV schemes are resilient to outlying measurements under corruption model M1, yielding improved performance relative to the IRLS estimator. Furthermore, IRLS works well when the number of measurements grows large. Finally, the ADMM-based prox-linear estimator requires the least number of iterations for convergence. It is followed by the IRLS estimator, and then by the stochastic prox-linear estimator.

Fig. 3. Robustness to outliers for the IEEE 118-bus system.

The third experiment tests the scalability and efficacy of the stochastic iterations on a larger power network of 9,241 buses available in MATPOWER [38]. The true voltage magnitude of each bus was uniformly distributed over $[0.95, 1.05]$, and its angle over $[-0.05\pi, 0.05\pi]$. The maximum number of iterations for the Gauss-Newton method was set to 10. All seven types of SCADA data were measured with additive noise described in the last experiment, 5% of which were compromised under model M2. The corrupted data $\xi_m := \tilde{v}^H H_m \tilde{v}$ relying on the individual $H_m$ were generated using $\tilde{v} \in \mathbb{R}^N$ from the standardized multivariate Gaussian distribution. In words, there were a total of 18,481 variables to be estimated, a total of 91,919 measurements were obtained, 4,595 of which were purposefully manipulated by adversaries. Initialized with the flat voltage profile point, the WLS-based Gauss-Newton iterations yielded an estimate of RMSE 0.9846, whereas the stochastic scheme in Table II with diminishing stepsize $\mu_t = 100/t^{0.8}$ attained an RMSE of 0.0412. The corresponding computational runtime of each scheme was reported in Table V. Evidently, the stochastic LAV implementation is several times faster than the WLS-based Gauss-Newton iterations.

TABLE V
COMPARISONS OF THE GAUSS-NEWTON AND STOCHASTIC LAV ESTIMATORS

Remark 3: Each Gauss-Newton iteration involves inverting a $(2N - 1) \times (2N - 1)$ matrix, which incurs computational complexity $O((2N - 1)^3)$. It is clear that when the system size $N$ grows large, say $N \ge 10{,}000$, this cubic complexity of Gauss-Newton iterations as well as the memory required may easily become prohibitive for a desktop computer. On the contrary, the per-iteration complexity of the proposed stochastic LAV scheme can be as low as $O(1)$, which is clearly scalable, and well-tailored for PSSE tasks of large dimensions. It is thus intuitive that in large-scale power systems, the proposed stochastic iterations based LAV implementation is faster than the Gauss-Newton iterations. The advantage of adopting inexpensive stochastic iterations to handle large-scale optimization problems has been corroborated by the recent success of deep learning for visual recognition and speech translation too, where stochastic gradient based approaches (e.g., stochastic gradient descent) constitute the 'workhorse' in training deep neural networks [41].

VI. CONCLUSION
Robust power system state estimation was pursued using contemporary tools of composite optimization. Building on recent algorithmic advances, two solvers were put forward to efficiently handle the LAV-based PSSE. Specifically, a deterministic LAV method was developed based on a linear proximal method, which yields a sequence of convex quadratic subproblems that can be efficiently solved using off-the-shelf solvers, or, through fast ADMM iterations. It converges as fast as Gauss-Newton iterations, amounting to solving only 5 to 10 quadratic programs in general. Inspired by the sparse connectivity inherent to power networks, a highly scalable stochastic scheme that can afford simple closed-form updates was also


devised. When only line flows and voltage magnitudes are measured, each stochastic iteration performs merely a few complex scalar operations, incurring per-iteration complexity $O(1)$, regardless of the number of buses in the entire network. If, on the other hand, the power injections are included as well, this time complexity increases to the order of the number of neighboring buses, which still remains much smaller than the network size in general. A mini-batching technique was suggested to further accelerate the stochastic iterations by means of leveraging the sparsity of measurement matrices. Numerical tests on a variety of benchmark networks of up to 9,241 buses showcase the robustness and computational efficiency of the developed approaches relative to existing alternatives, particularly over large-size networks.
Devising decentralized and parallel implementations of the novel approaches constitutes interesting future research directions. Since the LAV estimator may yield non-robust estimates in the presence of bad leverage points, and the measurement scaling may not be able to effectively identify and eliminate a certain type of leverage measurements, it is meaningful and promising to generalize the presented deterministic and stochastic proximal-linear based algorithmic tools to other robustness-enhanced estimators such as the Schweppe-Huber generalized M-estimator [10], [29], [30]. Coping with the Y- and Δ-connection, as well as investigating the technical approaches in multiphase unbalanced distribution systems are practically relevant future research topics too.

APPENDIX
Proof of Proposition 1: It is easy to check that the solution of (16a) is given by (17a), whose proof is thus omitted. Considering any $c \in \mathbb{R}^N$ and $d \in \mathbb{C}^N$, solving (16b) is equivalent to solving
$$u^* := \arg\min_{u \in \mathbb{C}^N}\ \lambda \|\Re(u) - c\|_1 + \frac{1}{2}\|u - d\|_2^2. \tag{27}$$
Upon defining $x := u - c$, problem (27) becomes
$$\min_{x \in \mathbb{C}^N}\ \lambda \|\Re(x)\|_1 + \frac{1}{2}\|x - (d - c)\|_2^2$$
or equivalently,
$$\min_{x}\ \lambda \|\Re(x)\|_1 + \frac{1}{2}\|\Re(x) - \Re(d - c)\|_2^2 + \frac{1}{2}\|\Im(x) - \Im(d - c)\|_2^2. \tag{28}$$
Problem (28) can be decomposed into two subproblems that correspond to optimizing over the real and imaginary parts of $x = \Re(x) + j\Im(x) := x_r + jx_i$; that is
$$\min_{x_r \in \mathbb{R}^N}\ \lambda \|x_r\|_1 + \frac{1}{2}\|x_r - \Re(d - c)\|_2^2 \tag{29}$$
and
$$\min_{x_i \in \mathbb{R}^N}\ \frac{1}{2}\|x_i - \Im(d - c)\|_2^2. \tag{30}$$
The optimal solutions of the convex programs in (29) and (30) can be found as, see [35]
$$x_r^* := S_\lambda(\Re(d - c)) = S_\lambda(\Re(d) - c)$$
$$x_i^* := \Im(d - c) = \Im(d)$$
thus yielding the optimal solution of (28) as
$$x^* := x_r^* + jx_i^* = S_\lambda(\Re(d) - c) + j\Im(d).$$
Recalling $u = x + c$, the optimal solution of (27) is
$$u^* = x^* + c = c + S_\lambda(\Re(d) - c) + j\Im(d) \tag{31}$$
which completes the proof.
Proof of Proposition 2: Letting $\chi$ denote the dual variable associated with the constraint $u = Aw$, the KKT optimality conditions are given by [35]
$$w^* - b + A^H \chi^* = 0$$
$$u^* - d - \chi^* = 0$$
$$Aw^* - u^* = 0$$
or in the following compact form
$$\begin{bmatrix} I_N & 0 & A^H \\ 0 & I_M & -I_M \\ A & -I_M & 0 \end{bmatrix} \begin{bmatrix} w^* \\ u^* \\ \chi^* \end{bmatrix} = \begin{bmatrix} b \\ d \\ 0 \end{bmatrix}.$$
Eliminating the dual variable via $\chi^* = u^* - d$ from the KKT system yields
$$\begin{bmatrix} I_N & A^H \\ A & -I_M \end{bmatrix} \begin{bmatrix} w^* \\ u^* \end{bmatrix} = \begin{bmatrix} b + A^H d \\ 0 \end{bmatrix}. \tag{32}$$
By further eliminating $u^*$ and solving for $w^*$, the solution to (32) and also to the minimization in (18) can be found in two steps as
$$w^* := \left(I + A^H A\right)^{-1}\left(b + A^H d\right)$$
$$u^* := Aw^*$$
which completes the proof of the claim.
Proof of Proposition 3: The optimality condition for (23) is
$$0 \in \partial\left|\Re(a^H w) - c\right| + \frac{1}{\tau} w \iff 0 \in \partial\left|\Re(a^H w) - c\right| \cdot a + \frac{1}{\tau} w$$
or equivalently,
$$0 \in \partial\left|\Re(a^H w) - c\right| \cdot (\tau a) + w,$$
where $\partial$ denotes the subdifferential. Let us first examine the case where $\Re(a^H w) - c \ne 0$. We thus have $\partial|\Re(a^H w) - c| = \mathrm{sign}(\Re(a^H w) - c)$, which yields the optimum
$$w^* = -\tau\, \mathrm{sign}\left(\Re(a^H w) - c\right) \cdot a.$$
Note that if $c/\|a\|_2^2 \ge \tau$, or $\Re(a^H w^*) - c = -\tau \|a\|_2^2\, \mathrm{sign}(\Re(a^H w^*) - c) - c < 0$, then $w^* = \tau a$. Equivalently, if $c/\|a\|_2^2 \le -\tau$, or $\Re(a^H w^*) - c = -\tau \|a\|_2^2\, \mathrm{sign}(\Re(a^H w^*) - c) - c > 0$, then $w^* = -\tau a$.
If $\Re(a^H w) - c = 0$, the subdifferential of the absolute-value operator belongs to the interval $[-1, 1]$; hence, the optimality condition becomes
$$0 \in -[-1, 1] \cdot (\tau a) + w \iff w \in [-\tau, \tau]\, a.$$
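The case analysis in this proof can be checked numerically. The sketch below assumes that the objective in (23) has the form $|\Re(a^H w) - c| + \frac{1}{2\tau}\|w\|_2^2$, which is what the stated optimality condition implies but is not reproduced here from (23) itself. It evaluates the candidate obtained by clipping $c/\|a\|_2^2$ to $[-\tau, \tau]$ and scaling $a$, and compares its objective value against nearby perturbed points. The names `prox_step` and `objective` are illustrative only.

```python
import numpy as np


def prox_step(a, c, tau):
    """Candidate minimizer: clip c / ||a||^2 to [-tau, tau], then scale a."""
    scale = np.clip(c / np.vdot(a, a).real, -tau, tau)
    return scale * a


def objective(w, a, c, tau):
    """Assumed objective |Re(a^H w) - c| + ||w||^2 / (2 tau)."""
    # np.vdot conjugates its first argument, so vdot(a, w) equals a^H w.
    return abs(np.vdot(a, w).real - c) + np.vdot(w, w).real / (2.0 * tau)


rng = np.random.default_rng(0)
a = rng.normal(size=4) + 1j * rng.normal(size=4)
tau = 0.5
for c in (-50.0, 0.3, 50.0):  # intended to exercise the three cases
    w_star = prox_step(a, c, tau)
    f_star = objective(w_star, a, c, tau)
    # The assumed objective is convex, so no perturbed point should do better.
    trials = [
        objective(w_star + 1e-2 * (rng.normal(size=4) + 1j * rng.normal(size=4)),
                  a, c, tau)
        for _ in range(200)
    ]
    print(c, f_star <= min(trials) + 1e-12)
```

A random-perturbation check of this kind is only a sanity test, not a proof; it is useful mainly for catching sign or scaling mistakes when the closed-form update is implemented.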


Upon letting $\mathrm{proj}_\tau(x)$ denote the projection of a real number $x$ onto the interval $[-\tau, \tau]$, one can combine the aforementioned three cases, and express compactly the optimum as follows
$$w^* := \mathrm{proj}_\tau\!\left(c / \|a\|_2^2\right) \cdot a$$
which concludes the proof of the proposition.

REFERENCES
[1] W. A. Wulf, “Great achievements and grand challenges,” Bridge, vol. 30, nos. 3–4, pp. 5–10, 2010. [Online]. Available: http://www.greatachievements.org/
[2] A. Abur and A. Gómez-Expósito, Power System State Estimation: Theory and Implementation. New York, NY, USA: Marcel Dekker, 2004.
[3] G. Wang, G. B. Giannakis, J. Chen, and J. Sun, “Distribution system state estimation: An overview of recent developments,” Front. Inf. Technol. Electron. Eng., vol. 20, no. 1, pp. 4–17, Jan. 2019.
[4] F. C. Schweppe, J. Wildes, and D. Rom, “Power system static-state estimation: Parts I, II, and III,” IEEE Trans. Power App. Syst., vol. PAS-89, pp. 120–135, Jan. 1970.
[5] H. M. Merrill and F. C. Schweppe, “Bad data suppression in power system static state estimation,” IEEE Trans. Power App. Syst., vol. PAS-90, no. 6, pp. 2718–2725, Nov. 1971.
[6] P. Fairley, “Cybersecurity at U.S. utilities due for an upgrade: Tech to detect intrusions into industrial control systems will be mandatory,” IEEE Spectr., vol. 53, no. 5, pp. 11–13, May 2016.
[7] D. U. Case, “Analysis of the cyber attack on the Ukrainian power grid,” 2016.
[8] S. Zonouz et al., “SCPSE: Security-oriented cyber-physical state estimation for power grid critical infrastructures,” IEEE Trans. Smart Grid, vol. 3, no. 4, pp. 1790–1799, Dec. 2012.
[9] E. Caro and A. J. Conejo, “State estimation via mathematical programming: A comparison of different estimation algorithms,” IET Gener. Transm. Distrib., vol. 6, no. 6, pp. 545–553, Jun. 2012.
[10] L. Mili, M. G. Cheniae, N. S. Vichare, and P. J. Rousseeuw, “Robust state estimation based on projection statistics [of power systems],” IEEE Trans. Power Syst., vol. 11, no. 2, pp. 1118–1127, May 1996.
[11] L. Mili, M. G. Cheniae, and P. J. Rousseeuw, “Robust state estimation of electric power systems,” IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 41, no. 5, pp. 349–358, May 1994.
[12] G. Wang, G. B. Giannakis, and Y. C. Eldar, “Solving systems of random quadratic equations via truncated amplitude flow,” IEEE Trans. Inf. Theory, vol. 64, no. 2, pp. 773–794, Feb. 2018.
[13] G. Wang, G. B. Giannakis, Y. Saad, and J. Chen, “Phase retrieval via reweighted amplitude flow,” IEEE Trans. Signal Process., vol. 66, no. 11, pp. 2818–2833, Jun. 2018.
[14] D. P. Bertsekas, Nonlinear Programming, 2nd ed. Belmont, MA, USA: Athena Sci., 1999.
[15] H. Zhu and G. B. Giannakis, “Power system nonlinear state estimation using distributed semidefinite programming,” IEEE J. Sel. Topics Signal Process., vol. 8, no. 6, pp. 1039–1050, Dec. 2014.
[16] Y. Zhang, R. Madani, and J. Lavaei, “Conic relaxations for power system state estimation with line measurements,” IEEE Trans. Control Netw. Syst., vol. 5, no. 3, pp. 1193–1205, Sep. 2018.
[17] W. W. Kotiuga and M. Vidyasagar, “Bad data rejection properties of weighted least absolute value techniques applied to static state estimation,” IEEE Trans. Power App. Syst., vol. PER-2, no. 4, pp. 844–853, Apr. 1982.
[18] A. Abur and M. K. Celik, “A fast algorithm for the weighted least absolute value state estimation (for power systems),” IEEE Trans. Power Syst., vol. 6, no. 1, pp. 1–8, Feb. 1991.
[19] R. A. Jabr and B. C. Pal, “Iteratively re-weighted least absolute value method for state estimation,” IEE Proc. Gener. Transm. Distrib., vol. 150, no. 4, pp. 385–391, Jul. 2003.
[20] M. Göl and A. Abur, “LAV based robust state estimation for systems measured by PMUs,” IEEE Trans. Smart Grid, vol. 5, no. 4, pp. 1808–1814, Jul. 2014.
[21] C. Xu and A. Abur, “A fast and robust linear state estimator for very large scale interconnected power grids,” IEEE Trans. Smart Grid, vol. 9, no. 5, pp. 4975–4982, Sep. 2018.
[22] J. V. Burke, “Descent methods for composite nondifferentiable optimization problems,” Math. Program., vol. 33, no. 3, pp. 260–279, Dec. 1985.
[23] A. S. Lewis and S. J. Wright, “A proximal method for composite minimization,” Math. Program., vol. 158, nos. 1–2, pp. 501–546, Jul. 2016.
[24] D. Davis and D. Drusvyatskiy, “Stochastic model-based minimization of weakly convex functions,” SIAM J. Optim., vol. 29, no. 1, pp. 207–239, Jan. 2019.
[25] J. C. Duchi and F. Ruan, “Stochastic methods for composite optimization problems,” arXiv:1703.08570, 2017.
[26] G. Wang, H. Zhu, G. B. Giannakis, and J. Sun, “Robust power system state estimation from rank-one measurements,” IEEE Trans. Control Netw. Syst., to be published. doi: 10.1109/TCNS.2019.2890954.
[27] M. A. Gandhi and L. Mili, “Robust Kalman filter based on a generalized maximum-likelihood-type estimator,” IEEE Trans. Signal Process., vol. 58, no. 5, pp. 2509–2520, May 2010.
[28] R. C. Pires, A. S. Costa, and L. Mili, “Iteratively reweighted least-squares state estimation through Givens Rotations,” IEEE Trans. Power Syst., vol. 14, no. 4, pp. 1499–1507, Nov. 1999.
[29] J. Zhao and L. Mili, “Vulnerability of the largest normalized residual statistical test to leverage points,” IEEE Trans. Power Syst., vol. 33, no. 4, pp. 4643–4646, Jul. 2018.
[30] J. Zhao, L. Mili, and R. C. Pires, “Statistical and numerical robust state estimator for heavily loaded power systems,” IEEE Trans. Power Syst., vol. 33, no. 6, pp. 6904–6914, Nov. 2018.
[31] J. C. Duchi and F. Ruan, “Solving (most) of a set of quadratic equalities: Composite optimization for robust phase retrieval,” arXiv:1705.02356, 2017.
[32] P. J. Huber, “Robust statistics,” in International Encyclopedia of Statistical Science. Heidelberg, Germany: Springer, 2011, pp. 1248–1251.
[33] G. Wang, A. S. Zamzam, G. B. Giannakis, and N. D. Sidiropoulos, “Power system state estimation via feasible point pursuit: Algorithms and Cramér–Rao bound,” IEEE Trans. Signal Process., vol. 66, no. 6, pp. 1649–1658, Mar. 2018.
[34] M. Grant and S. Boyd. (2008). CVX: MATLAB Software for Disciplined Convex Programming. [Online]. Available: http://cvxr.com/cvx
[35] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Found. Trends Mach. Learn., vol. 3, no. 1, pp. 1–122, 2010.
[36] V. Kekatos and G. B. Giannakis, “Distributed robust power system state estimation,” IEEE Trans. Power Syst., vol. 28, no. 2, pp. 1617–1626, May 2013.
[37] Power Systems Test Case Archive, Univ. Washington, Seattle, WA, USA. [Online]. Available: http://www.ee.washington.edu/research/pstca
[38] R. D. Zimmerman, C. E. Murillo-Sanchez, and R. J. Thomas, “MATPOWER: Steady-state operations, planning and analysis tools for power systems research and education,” IEEE Trans. Power Syst., vol. 26, no. 1, pp. 12–19, Feb. 2011.
[39] R. A. Jabr and B. C. Pal, “Iteratively reweighted least-squares implementation of the WLAV state-estimation method,” IEE Proc. Gener. Transm. Distrib., vol. 151, no. 1, pp. 103–108, Feb. 2004.
[40] J. F. Sturm, “Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones,” Optim. Method Softw., vol. 11, nos. 1–4, pp. 625–653, Jan. 1999.
[41] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, May 2015.

Gang Wang (M'18) received the B.Eng. degree in electrical engineering and automation from the Beijing Institute of Technology, Beijing, China, in 2011 and the Ph.D. degree in electrical and computer engineering from the University of Minnesota, Minneapolis, MN, USA, in 2018.
He is currently a Post-Doctoral Associate with the Department of Electrical and Computer Engineering, University of Minnesota. His research interests focus on the areas of statistical learning, optimization, and deep learning with applications to data science and smart grids.
Dr. Wang was a recipient of the National Scholarship in 2014, the Guo Rui Scholarship in 2017, and the Innovation Scholarship (First Place) in 2017, all from China, as well as the Best Student Paper Award at the European Signal Processing Conference in 2017.


Georgios B. Giannakis (F'97) received the Diploma degree in electrical engineering from the National Technical University of Athens, Greece, in 1981, the first M.Sc. degree in electrical engineering, the second M.Sc. degree in mathematics, and the Ph.D. degree in electrical engineering from the University of Southern California, CA, USA, in 1983, 1986, and 1986, respectively.
From 1982 to 1986, he was with the University of Southern California. He was a Faculty Member with the University of Virginia, VA, USA, from 1987 to 1998 and since 1999, he has been a Professor with the University of Minnesota, MN, USA, where he holds an ADC Endowed Chair, a University of Minnesota McKnight Presidential Chair in ECE, and serves as the Director of the Digital Technology Center. His general interests span the areas of statistical learning, communications, and networking, subjects on which he has published over 450 journal papers, 750 conference papers, 25 book chapters, two edited books, and two research monographs with an H-index of 142. He is the (co-)inventor of 32 patents issued. Current research focuses on data science, and network science with applications to the Internet of Things, social, brain, and power networks with renewables.
Dr. Giannakis was the (co-)recipient of nine Best Journal Paper Awards from the IEEE Signal Processing (SP) and Communications Societies, including the G. Marconi Prize Paper Award in Wireless Communications, the Technical Achievement Awards from the SP Society in 2000 and from EURASIP in 2005, the Young Faculty Teaching Award, the G. W. Taylor Award for Distinguished Research from the University of Minnesota, and the IEEE Fourier Technical Field Award in 2015. He is a fellow of EURASIP, and has served the IEEE in a number of posts, including that of a Distinguished Lecturer for the IEEE-SPS.

Jie Chen (M'09–SM'12–F'19) received the B.Sc., M.Sc., and Ph.D. degrees in control theory and control engineering from the Beijing Institute of Technology, Beijing, China, in 1986, 1996, and 2001, respectively.
From 1989 to 1990, he was a Visiting Scholar with the California State University, Long Beach, CA, USA. From 1996 to 1997, he was a Research Fellow with the School of Engineering, University of Birmingham, Birmingham, U.K. He is a Professor of control science and engineering with the Beijing Institute of Technology, where he also serves as the Head of the State Key Laboratory of Intelligent Control and Decision of Complex Systems. He is currently the President of the Tongji University, Shanghai 200092, China. He is also a member of the Chinese Academy of Engineering. He has (co-)authored four books and over 200 research papers. His main research interests include intelligent control and decision in complex systems, multi-agent systems, and optimization.
Dr. Chen served as a Managing Editor for the Journal of Systems Science & Complexity and an Associate Editor for the IEEE TRANSACTIONS ON CYBERNETICS and for several other international journals.
