Robust and Scalable Power System State Estimation Via Composite Optimization
heavily on the initialization (see justifications in [12] and [13]). Furthermore, the convergence of Gauss-Newton iterations for nonconvex objectives is hardly guaranteed in general [14]. At least as important, WLS estimators are sensitive to bad data [5]. They may yield very poor estimates in the presence of outliers. These issues were somewhat mitigated by incorporating the largest normalized residual (LNR) test for bad data removal [5], or by reformulating the (possibly regularized) WLS into a semidefinite program (SDP) via convex relaxation [15], [16]. The former alternates between the LNR test and the estimation, while the latter solves SDPs. The least-median-squares and the least-trimmed-squares estimators have provably improved performance under certain conditions [11]. Unfortunately, their computational complexities and storage requirements scale unfavorably with the number of buses in the network [3].

On the other hand, LAV estimators simultaneously identify and reject bad data while acquiring an accurate estimate of the state [17]. Recent research efforts have focused on dealing with the nonconvexity and nonsmoothness in LAV estimation. Upon linearizing the nonlinear measurement functions at the most recent iterate, a series of linear programs was solved [17]. Techniques for improving the linear programming by exploiting the system's structure [18], or via iterative reweighting [19] have also been reported. LAV estimation based only on PMU data was studied [20], [21], in which a strategic scaling was suggested to eliminate the effect of leverage measurements [10], [11]. Despite these efforts, LAV estimators have not been widely employed yet in today's power networks due mostly to their computational inefficiency [20].

The LAV-based PSSE is revisited in this work from the viewpoint of composite optimization [22], [23], which considers minimizing functions $f(\mathbf{v}) = h(c(\mathbf{v}))$ that are compositions of a convex function $h$, and a smooth vector function $c$. Two novel proximal linear (prox-linear) procedures are developed based upon minimizing a sequence of convex quadratic subproblems. The first deterministic LAV solver minimizes functions constructed from a linearized approximation to the original objective and a quadratic regularization, each efficiently implementable using either off-the-shelf solvers, or, [...] convergence of such deterministic prox-linear algorithms has been documented [23], [24].

The second LAV solver builds on a stochastic prox-linear algorithm, and has each iteration minimizing the summation of a linearized approximation to the LAV loss of a single measurement and the regularization term [25], [26]. Interestingly, each iteration of the stochastic LAV solver has a closed-form update. Thanks to the sparse connectivity inherent to power networks, this amounts to updating very few entries of the state vector. Moreover, even faster implementation of the stochastic solver is realized by means of judiciously mini-batching the measurements.

Bad leverage points may pose a challenge, but the proposed prox-linear algorithms can be generalized to accommodate robust estimation formulations including the Huber estimation, Huber M-estimation, and the Schweppe-Huber generalized M-estimation [10], [11], [27]–[30]. The novel algorithms were numerically tested using the IEEE 14-, 118-bus, and the PEGASE 9,241-bus benchmark networks. Simulations corroborate their merits relative to the WLS-based Gauss-Newton iterations.

Outline. Grid modeling and problem formulation are given in Section II. Upon reviewing the basics of composite optimization, Section III presents the deterministic LAV solver, followed by its stochastic alternative in Section IV. Extensive numerical tests are presented in Section V, while the paper is concluded in Section VI.

Notation: Matrices (column vectors) are denoted by upper- (lower-) case boldface letters. Symbols $(\cdot)^T$ and $(\cdot)^H$ represent (Hermitian) transpose, and $\overline{(\cdot)}$ complex conjugate. Sets are denoted using calligraphic letters. Symbol $\Re(\cdot)$ ($\Im(\cdot)$) takes the real (imaginary) part of a complex number. Operator $\mathrm{dg}(\mathbf{x}_i)$ defines a diagonal matrix holding entries of $\mathbf{x}_i$ on its diagonal, while $[\mathbf{x}_i]_{1\le i\le N}$ returns a matrix with $\mathbf{x}_i^H$ being its $i$-th row.

II. GRID MODELING AND PROBLEM FORMULATION

An electric grid having $N$ buses and $L$ lines is modeled as a graph $\mathcal{G} = (\mathcal{N}, \mathcal{E})$, whose nodes $\mathcal{N} := \{1, 2, \ldots, N\}$ correspond to buses and whose edges $\mathcal{E} := \{(n, n')\} \subseteq \mathcal{N} \times \mathcal{N}$ correspond to lines. The complex voltage per bus $n \in \mathcal{N}$ is expressed in rectangular coordinates as $v_n = \Re(v_n) + j\Im(v_n)$, with all nodal voltages forming the vector $\mathbf{v} := [v_1 \cdots v_N]^H \in \mathbb{C}^N$.

The voltage magnitude square $V_n := |v_n|^2 = \Re^2(v_n) + \Im^2(v_n)$ can be compactly expressed as a quadratic function of $\mathbf{v}$, namely

$$V_n = \mathbf{v}^H \mathbf{H}_n^V \mathbf{v}, \quad \text{with } \mathbf{H}_n^V := \mathbf{e}_n \mathbf{e}_n^T \tag{1}$$

where $\mathbf{e}_n$ denotes the $n$-th canonical vector in $\mathbb{R}^N$. To express power injections as functions of $\mathbf{v}$, introduce the so-termed bus admittance matrix $\mathbf{Y} = \mathbf{G} + j\mathbf{B} \in \mathbb{C}^{N \times N}$ [2]. In rectangular coordinates, the active and reactive power injections $p_n$ and $q_n$ at bus $n$ are given by

$$p_n = \Re(v_n) \sum_{n'=1}^{N} \left[\Re(v_{n'}) G_{nn'} - \Im(v_{n'}) B_{nn'}\right] + \Im(v_n) \sum_{n'=1}^{N} \left[\Im(v_{n'}) G_{nn'} + \Re(v_{n'}) B_{nn'}\right] \tag{2}$$

$$q_n = \Im(v_n) \sum_{n'=1}^{N} \left[\Re(v_{n'}) G_{nn'} - \Im(v_{n'}) B_{nn'}\right] - \Re(v_n) \sum_{n'=1}^{N} \left[\Im(v_{n'}) G_{nn'} + \Re(v_{n'}) B_{nn'}\right] \tag{3}$$

which admits a compact representation as

$$p_n = \mathbf{v}^H \mathbf{H}_n^p \mathbf{v}, \quad \text{with } \mathbf{H}_n^p := \frac{\mathbf{Y}^H \mathbf{e}_n \mathbf{e}_n^T + \mathbf{e}_n \mathbf{e}_n^T \mathbf{Y}}{2} \tag{4a}$$

$$q_n = \mathbf{v}^H \mathbf{H}_n^q \mathbf{v}, \quad \text{with } \mathbf{H}_n^q := \frac{\mathbf{Y}^H \mathbf{e}_n \mathbf{e}_n^T - \mathbf{e}_n \mathbf{e}_n^T \mathbf{Y}}{2j}. \tag{4b}$$

Recognize that the line current from bus $n$ to $n'$ at the 'from' end obeys $I_{nn'} = \mathbf{e}_{nn'}^T \mathbf{i}_f = \mathbf{e}_{nn'}^T \mathbf{Y}_f \mathbf{v}$, where $\mathbf{i}_f \in \mathbb{C}^{|\mathcal{E}|}$ collects all line currents, and $\mathbf{Y}_f \in \mathbb{C}^{|\mathcal{E}| \times N}$ relates the bus voltages
Authorized licensed use limited to: Consortium - Saudi Arabia SDL. Downloaded on February 21,2025 at 15:24:55 UTC from IEEE Xplore. Restrictions apply.
WANG et al.: ROBUST AND SCALABLE PSSE VIA COMPOSITE OPTIMIZATION 6139
to all line currents at the 'from' (sending) end. Ohm's and Kirchhoff's laws assert that the 'from-end' power flow over line $(n, n')$ can be expressed as $S_{nn'}^f = P_{nn'}^f - jQ_{nn'}^f = \overline{v}_n i_{nn'}^f = (\mathbf{v}^H \mathbf{e}_n)(\mathbf{e}_{nn'}^T \mathbf{i}_f) = \mathbf{v}^H \mathbf{e}_n \mathbf{e}_{nn'}^T \mathbf{Y}_f \mathbf{v}$, yielding

$$P_{nn'}^f = \mathbf{v}^H \mathbf{H}_{nn'}^P \mathbf{v}, \quad \text{with } \mathbf{H}_{nn'}^P := \frac{\mathbf{Y}_f^H \mathbf{e}_{nn'} \mathbf{e}_n^T + \mathbf{e}_n \mathbf{e}_{nn'}^T \mathbf{Y}_f}{2} \tag{5a}$$

$$Q_{nn'}^f = \mathbf{v}^H \mathbf{H}_{nn'}^Q \mathbf{v}, \quad \text{with } \mathbf{H}_{nn'}^Q := \frac{\mathbf{Y}_f^H \mathbf{e}_{nn'} \mathbf{e}_n^T - \mathbf{e}_n \mathbf{e}_{nn'}^T \mathbf{Y}_f}{2j}. \tag{5b}$$

The active and reactive power flows measured at the 'to' (receiving) ends $P_{nn'}^t$ and $Q_{nn'}^t$ can be written symmetrically to $P_{nn'}^f$ and $Q_{nn'}^f$; and hence, they are omitted here for brevity.

Given line parameters collected in $\mathbf{Y}$ and $\mathbf{Y}_f$, all SCADA measurements including squared voltage magnitudes as well as active and reactive power injections and flows can be expressed as quadratic functions of the voltages $\mathbf{v} \in \mathbb{C}^N$. This justifies why $\mathbf{v}$ is referred to as the system state. If $\mathcal{S}_V$, $\mathcal{S}_p$, $\mathcal{S}_q$, $\mathcal{S}_P^f$, $\mathcal{S}_Q^f$, $\mathcal{S}_P^t$, and $\mathcal{S}_Q^t$ signify the smart meter locations of the corresponding type, we have available the following (possibly noisy or even corrupted) measurements: $\{\check{V}_n\}_{n \in \mathcal{S}_V}$, $\{\check{p}_n\}_{n \in \mathcal{S}_p}$, $\{\check{q}_n\}_{n \in \mathcal{S}_q}$, $\{\check{P}_{nn'}^f\}_{(n,n') \in \mathcal{S}_P^f}$, $\{\check{Q}_{nn'}^f\}_{(n,n') \in \mathcal{S}_Q^f}$, $\{\check{P}_{nn'}^t\}_{(n,n') \in \mathcal{S}_P^t}$, and $\{\check{Q}_{nn'}^t\}_{(n,n') \in \mathcal{S}_Q^t}$, henceforth concatenated in the vector $\mathbf{z} \in \mathbb{R}^M$, where $M$ denotes the total number of measurements.

In this paper, the following corruption model is considered [31]: If $\{\xi_i\} \subseteq \mathbb{R}$ models an arbitrary attack (or outlier) sequence, given the measurement matrices $\{\mathbf{H}_m\}_{m=1}^M$ in (1)-(5a), we observe for $1 \le m \le M$ the samples

$$z_m \approx \begin{cases} \mathbf{v}^H \mathbf{H}_m \mathbf{v} & \text{if } m \in \mathcal{I}^{\mathrm{nom}} \\ \xi_m & \text{if } m \in \mathcal{I}^{\mathrm{out}} \end{cases} \tag{6}$$

where additive measurement noise can be included if $\approx$ is replaced by equality, and $\mathcal{I}^{\mathrm{nom}}, \mathcal{I}^{\mathrm{out}} \subseteq \{1, 2, \ldots, M\}$ collect the indices of nominal data and outliers, respectively. In other words, $\mathcal{I}^{\mathrm{out}}$ is the set of meter indices that can be compromised. The indices in $\mathcal{I}^{\mathrm{out}}$ are assumed chosen randomly from $\{1, 2, \ldots, M\}$. Instrument failures occur at random, although the attack sequence $\{\xi_m\}$ may rely on $\{\mathbf{H}_m\}$ (even adversarially). Specifically, two models will be considered for the attacks.

M1 Matrices $\{\mathbf{H}_m\}_{m=1}^M$ are independent of $\{\xi_m\}_{m=1}^M$.
M2 Nominal measurement matrices $\{\mathbf{H}_m\}_{m \in \mathcal{I}^{\mathrm{nom}}}$ are independent of $\{\xi_m\}_{m \in \mathcal{I}^{\mathrm{out}}}$.

It is worth highlighting that M1 requires full independence between the corruption and measurements. That is, the attacker may only corrupt $\xi_m$ without knowing $\mathbf{H}_m$. On the contrary, M2 allows completely arbitrary dependence between $\xi_m$ and $\mathbf{H}_m$ for $m \in \mathcal{I}^{\mathrm{out}}$, which is natural as the type of corruption may also rely on the individual measurement $\mathbf{H}_m$ being taken.

Having elaborated on the system and corruption models, the PSSE problem can be stated as follows: Given matrices $\mathbf{Y}$, $\mathbf{Y}_f$, and the available measurements $\mathbf{z} \in \mathbb{R}^M$, with entries as in (6) obeying M1 or M2, recover the voltage vector $\mathbf{v} \in \mathbb{C}^N$. The first attempt may be seeking the WLS estimate, or the ML one when assuming independent Gaussian noise [4]. It is known however that the WLS criterion is sensitive to outliers, and may yield very bad estimates even if there are few grossly corrupted measurements [5]. As is well documented in statistics and optimization, the $\ell_1$-based losses yield median-based estimators [32], and handle gross errors in the measurements $\mathbf{z}$ in a relatively benign way.

Prompted by this, we will consider here minimizing the $\ell_1$ loss of the residuals, which leads to the so-called LAV estimate [17]

$$\underset{\mathbf{v} \in \mathbb{C}^N}{\text{minimize}} \;\; f(\mathbf{v}) := \frac{1}{M} \sum_{m=1}^{M} \left| \mathbf{v}^H \mathbf{H}_m \mathbf{v} - z_m \right|. \tag{7}$$

Because of $\{\mathbf{v}^H \mathbf{H}_m \mathbf{v}\}_{m=1}^M$ and the absolute-value operation, the LAV objective in (7) is nonsmooth, nonconvex, and not even locally convex near the optima $\pm\mathbf{v}^*$. This is clear from the real-valued scalar case $f(v) = |v^{\ast} v - 1|$, where $v \in \mathbb{R}$. A local analysis based on convexity and smoothness is thus impossible, and $f(\mathbf{v})$ is difficult to minimize. For this reason, Gauss-Newton is not applicable to minimize (7). Nevertheless, the criterion $f(\mathbf{v})$ possesses several unique structural properties, which we exploit next to develop efficient algorithms.

Remark 1: For an $N$-bus power system, most existing PSSE approaches have relied on optimizing over $(2N - 1)$ real variables, which consist of either the polar or the rectangular components of the complex voltage phasors after excluding the angle or the imaginary part of the reference bus that is often set to 0. Nevertheless, when iterative algorithms are used, working directly with the $N$-dimensional complex voltage vector has in general lower complexity and computational burden than in the real case. This is due to the compact quadratic representations of all SCADA quantities in complex voltage phasors, namely the natural sparsity of quadratic measurement matrices in the unknown complex voltage phasor vector.

III. DETERMINISTIC PROX-LINEAR LAV SOLVER

In this section, we will develop a deterministic solver of (7). To that end, let us start rewriting the objective in (7) as

$$\underset{\mathbf{v} \in \mathbb{C}^N}{\text{minimize}} \;\; f(\mathbf{v}) := h(c(\mathbf{v})) \tag{8}$$

the composition of a convex function $h : \mathbb{R}^M \to \mathbb{R}$, and a smooth vector function $c : \mathbb{C}^N \to \mathbb{R}^M$, a structure that is known to be amenable to efficient algorithms [22], [23]. It is clear that this general form subsumes (7) as a special case, for which we can take $h(\mathbf{u}) = (1/M)\|\mathbf{u}\|_1$ and $c(\mathbf{v}) = [\mathbf{v}^H \mathbf{H}_m \mathbf{v} - z_m]_{1 \le m \le M}$. The compositional structure lends itself well to the proximal linear (prox-linear) algorithm, which is a variant of the Gauss-Newton iterations [22]. Specifically, define close to a given $\mathbf{v}$ the local "linearization" of $f$ as

$$f_{\mathbf{v}}(\mathbf{w}) := h\!\left(c(\mathbf{v}) + \Re\!\left(\nabla^H c(\mathbf{v})(\mathbf{w} - \mathbf{v})\right)\right) \tag{9}$$

where $\nabla c(\mathbf{v}) \in \mathbb{C}^{N \times M}$ denotes the Jacobian matrix of $c$ at $\mathbf{v}$ based on Wirtinger derivatives for functions of complex-valued variables [33, Appendix]. In contrast to the nonconvex
6140 IEEE TRANSACTIONS ON SMART GRID, VOL. 10, NO. 6, NOVEMBER 2019
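$f(\mathbf{v})$, function $f_{\mathbf{v}}(\mathbf{w})$ in (9) is convex in $\mathbf{w}$, which is the key behind the prox-linear method. Starting with some point $\mathbf{v}_0$, which can be the flat-voltage profile point, namely the all-one vector, construct the iteration

$$\mathbf{v}_{t+1} := \arg\min_{\mathbf{v} \in \mathbb{C}^N} \; f_{\mathbf{v}_t}(\mathbf{v}) + \frac{1}{2\mu_t} \|\mathbf{v} - \mathbf{v}_t\|_2^2 \tag{10}$$

where $\mu_t > 0$ is a stepsize that can be fixed in advance, or be determined by a line search [23].

Evidently, the subproblem (10) to be solved at every iteration of the prox-linear algorithm is convex, and can be handled by off-the-shelf solvers such as CVX [34]. However, these interior-point based solvers may not scale well when $\{\mathbf{H}_m\}$ are large. For this reason, we derive next a more efficient iterative procedure using ADMM iterations [35], [36].

When specifying $f$ to be the LAV objective of (7), the minimization in (10) becomes

$$\mathbf{v}_{t+1} = \arg\min_{\mathbf{v} \in \mathbb{C}^N} \; \|\Re(\mathbf{A}_t(\mathbf{v} - \mathbf{v}_t)) - \mathbf{c}_t\|_1 + \frac{1}{2}\|\mathbf{v} - \mathbf{v}_t\|_2^2 \tag{11}$$

with coefficients given by

$$\mathbf{A}_t := \left[(2\mu_t/M)\, \mathbf{v}_t^H \mathbf{H}_m\right]_{1 \le m \le M} \tag{12a}$$

$$\mathbf{c}_t := \left[(\mu_t/M)\left(z_m - \mathbf{v}_t^H \mathbf{H}_m \mathbf{v}_t\right)\right]_{1 \le m \le M}. \tag{12b}$$

For brevity, let $\mathbf{w} := \mathbf{v} - \mathbf{v}_t$, and rewrite (11) equivalently as a constrained optimization problem

$$\underset{\mathbf{u} \in \mathbb{C}^M,\, \mathbf{w} \in \mathbb{C}^N}{\text{minimize}} \;\; \|\Re(\mathbf{u}) - \mathbf{c}_t\|_1 + \frac{1}{2}\|\mathbf{w}\|_2^2 \tag{13a}$$

$$\text{subject to} \;\; \mathbf{A}_t \mathbf{w} = \mathbf{u}. \tag{13b}$$

To decouple constraints and also facilitate the implementation of ADMM, introduce an auxiliary copy $\tilde{\mathbf{u}}$ and $\tilde{\mathbf{w}}$ for $\mathbf{u}$ and $\mathbf{w}$ accordingly, and rewrite (13) into

$$\underset{\tilde{\mathbf{u}},\, \tilde{\mathbf{w}},\, \mathbf{u},\, \mathbf{w}}{\text{minimize}} \;\; \|\Re(\tilde{\mathbf{u}}) - \mathbf{c}_t\|_1 + \frac{1}{2}\|\tilde{\mathbf{w}}\|_2^2 \tag{14a}$$

$$\text{subject to} \;\; \tilde{\mathbf{u}} = \mathbf{u}, \;\; \tilde{\mathbf{w}} = \mathbf{w}, \;\; \mathbf{A}_t \mathbf{w} = \mathbf{u}. \tag{14b}$$

Letting $\boldsymbol{\lambda} \in \mathbb{C}^N$ and $\boldsymbol{\nu} \in \mathbb{C}^M$ be the Lagrange multipliers corresponding to the $\mathbf{w}$- and $\mathbf{u}$-consensus constraints, respectively, the augmented Lagrangian after leaving out the last equality in (14b) can be expressed as

$$\mathcal{L}(\tilde{\mathbf{w}}, \tilde{\mathbf{u}}, \mathbf{w}, \mathbf{u}; \boldsymbol{\lambda}, \boldsymbol{\nu}) := \|\Re(\tilde{\mathbf{u}}) - \mathbf{c}_t\|_1 + \frac{1}{2}\|\tilde{\mathbf{w}}\|_2^2 + \Re\!\left(\boldsymbol{\lambda}^H(\tilde{\mathbf{w}} - \mathbf{w})\right) + \Re\!\left(\boldsymbol{\nu}^H(\tilde{\mathbf{u}} - \mathbf{u})\right) + \frac{\rho}{2}\|\tilde{\mathbf{w}} - \mathbf{w}\|_2^2 + \frac{\rho}{2}\|\tilde{\mathbf{u}} - \mathbf{u}\|_2^2 \tag{15}$$

where $\rho > 0$ is a predefined step size. With $k \in \mathbb{N}$ denoting the iteration index for solving (13), or equivalently (10), ADMM cycles through the following recursions

$$\tilde{\mathbf{w}}^{k+1} := \arg\min_{\tilde{\mathbf{w}}} \; \frac{1}{2}\|\tilde{\mathbf{w}}\|_2^2 + \frac{\rho}{2}\left\|\tilde{\mathbf{w}} - (\mathbf{w}^k - \boldsymbol{\lambda}^k)\right\|_2^2 \tag{16a}$$

$$\tilde{\mathbf{u}}^{k+1} := \arg\min_{\tilde{\mathbf{u}}} \; \frac{1}{2}\|\Re(\tilde{\mathbf{u}}) - \mathbf{c}_t\|_1 + \frac{\rho}{2}\left\|\tilde{\mathbf{u}} - (\mathbf{u}^k - \boldsymbol{\nu}^k)\right\|_2^2 \tag{16b}$$

$$\left(\mathbf{w}^{k+1}, \mathbf{u}^{k+1}\right) := \arg\min_{\mathbf{w},\, \mathbf{u}} \; \left\|\mathbf{w} - (\tilde{\mathbf{w}}^{k+1} + \boldsymbol{\lambda}^k)\right\|_2^2 + \left\|\mathbf{u} - (\tilde{\mathbf{u}}^{k+1} + \boldsymbol{\nu}^k)\right\|_2^2 \quad \text{subject to } \mathbf{A}_t \mathbf{w} = \mathbf{u} \tag{16c}$$

$$\begin{bmatrix} \boldsymbol{\lambda}^{k+1} \\ \boldsymbol{\nu}^{k+1} \end{bmatrix} = \begin{bmatrix} \boldsymbol{\lambda}^k + \tilde{\mathbf{w}}^{k+1} - \mathbf{w}^{k+1} \\ \boldsymbol{\nu}^k + \tilde{\mathbf{u}}^{k+1} - \mathbf{u}^{k+1} \end{bmatrix} \tag{16d}$$

where all the dual variables have been scaled by the factor $\rho > 0$ [35].

Interestingly enough, the solutions of (16a)-(16c) can be provided in closed form, as we elaborate in the following two propositions, whose proofs are deferred to the Appendix.

Proposition 1: The solutions of (16a) and (16b) are respectively

$$\tilde{\mathbf{w}}^{k+1} := \frac{\rho}{1+\rho}\left(\mathbf{w}^k - \boldsymbol{\lambda}^k\right) \tag{17a}$$

$$\tilde{\mathbf{u}}^{k+1} := \mathbf{c}_t + \mathcal{S}_{1/2\rho}\!\left(\Re(\mathbf{u}^k - \boldsymbol{\nu}^k) - \mathbf{c}_t\right) + j\,\Im(\mathbf{u}^k - \boldsymbol{\nu}^k) \tag{17b}$$

where the shrinkage operator $\mathcal{S}_\tau(\mathbf{x}) : \mathbb{R}^N \times \mathbb{R}_+ \to \mathbb{R}^N$ is $\mathcal{S}_\tau(\mathbf{x}) := \mathrm{sign}(\mathbf{x}) \odot \max(|\mathbf{x}| - \tau\mathbf{1}, \mathbf{0})$, with $\odot$ and $|\cdot|$ denoting the entrywise multiplication and absolute operators, respectively, and

$$\mathrm{sign}(x) := \begin{cases} x/|x|, & x \ne 0 \\ 0, & x = 0 \end{cases}$$

provides an entrywise definition of operator $\mathrm{sign}(\mathbf{x})$.

The constrained minimization of (16c) essentially projects the pair $(\tilde{\mathbf{w}}^{k+1} + \boldsymbol{\lambda}^k, \tilde{\mathbf{u}}^{k+1} + \boldsymbol{\nu}^k)$ onto the convex set specified by the linear equality constraint, namely $\{(\mathbf{w}, \mathbf{u}) : \mathbf{A}_t \mathbf{w} = \mathbf{u}\}$. Its solution is derived in a simple closed form next.

Proposition 2: Given $\mathbf{b} \in \mathbb{C}^N$ and $\mathbf{d} \in \mathbb{C}^M$, the solution of

$$\underset{\mathbf{w} \in \mathbb{C}^N,\, \mathbf{u} \in \mathbb{C}^M}{\text{minimize}} \;\; \frac{1}{2}\|\mathbf{w} - \mathbf{b}\|_2^2 + \frac{1}{2}\|\mathbf{u} - \mathbf{d}\|_2^2 \quad \text{subject to } \mathbf{A}\mathbf{w} = \mathbf{u}$$

is given as

$$\mathbf{w}^* := \left(\mathbf{I} + \mathbf{A}^H \mathbf{A}\right)^{-1}\left(\mathbf{b} + \mathbf{A}^H \mathbf{d}\right) \tag{18a}$$

$$\mathbf{u}^* := \mathbf{A}\mathbf{w}^*. \tag{18b}$$

Using Proposition 2, the minimizer of (16c) is found as

$$\mathbf{w}^{k+1} := \left(\mathbf{I} + \mathbf{A}_t^H \mathbf{A}_t\right)^{-1}\left(\tilde{\mathbf{w}}^{k+1} + \boldsymbol{\lambda}^k + \mathbf{A}_t^H\left(\tilde{\mathbf{u}}^{k+1} + \boldsymbol{\nu}^k\right)\right) \tag{19a}$$

$$\mathbf{u}^{k+1} := \mathbf{A}_t \mathbf{w}^{k+1}. \tag{19b}$$

The four updates in (16) are computationally simple except for the matrix inversion of (19a), which nevertheless can be cached once computed during the first iteration. In addition, variables $\tilde{\mathbf{u}}^0$, $\boldsymbol{\lambda}^0$, and $\boldsymbol{\nu}^0$ of ADMM can be initialized to zero. Finally, the solution of (11) can be obtained as

$$\mathbf{v}_{t+1} := \mathbf{v}_t + \mathbf{w}^* \tag{20}$$

where $\mathbf{w}^*$ is the converged $\mathbf{w}$-iterate of the ADMM iterations in (17), (19), and (16d).

The novel deterministic LAV solver based on ADMM is summarized in Table I, in which the inner loop consisting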
TABLE I
DETERMINISTIC LAV SOLVER USING ADMM

TABLE II
STOCHASTIC LAV SOLVER
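As a rough illustration of the inner loop of the deterministic solver, the ADMM recursions (17a), (17b), (19a), (19b), and (16d) can be sketched in NumPy as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `l1_weight` parameter, and the choice to start $\mathbf{w}^0$ and $\mathbf{u}^0$ at zero are all hypothetical. With `l1_weight=0.5` the shrinkage level equals the $1/(2\rho)$ appearing in (17b), and the loop then targets the objective `l1_weight * ||Re(A w) - c||_1 + 0.5 * ||w||_2^2`.

```python
import numpy as np


def soft_threshold(x, tau):
    """Entrywise shrinkage S_tau(x) = sign(x) * max(|x| - tau, 0) for real x."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def admm_subproblem(A, c, rho=1.0, l1_weight=0.5, iters=500):
    """Approximately minimize l1_weight*||Re(A w) - c||_1 + 0.5*||w||_2^2 over complex w.

    A (M x N, complex) and c (length M, real) play the roles of A_t and c_t in (12).
    All names here are illustrative assumptions.
    """
    A = np.asarray(A, dtype=complex)
    c = np.asarray(c, dtype=float)
    M, N = A.shape
    AH = A.conj().T
    # Inverse needed by the projection step (19a); computed once and reused.
    P = np.linalg.inv(np.eye(N) + AH @ A)
    w = np.zeros(N, dtype=complex)
    u = np.zeros(M, dtype=complex)
    lam = np.zeros(N, dtype=complex)   # scaled dual for the w-consensus constraint
    nu = np.zeros(M, dtype=complex)    # scaled dual for the u-consensus constraint
    for _ in range(iters):
        w_tilde = rho / (1.0 + rho) * (w - lam)                      # (17a)
        d = u - nu
        u_tilde = c + soft_threshold(d.real - c, l1_weight / rho) \
            + 1j * d.imag                                            # (17b)
        w_next = P @ (w_tilde + lam + AH @ (u_tilde + nu))           # (19a)
        u_next = A @ w_next                                          # (19b)
        lam = lam + w_tilde - w_next                                 # (16d)
        nu = nu + u_tilde - u_next                                   # (16d)
        w, u = w_next, u_next
    return w


def subproblem_objective(A, c, w, l1_weight=0.5):
    """Objective value that admm_subproblem is driving down."""
    return l1_weight * np.abs((A @ w).real - c).sum() + 0.5 * np.linalg.norm(w) ** 2
```

In an outer prox-linear loop, one would rebuild `A` and `c` from the current iterate via (12a) and (12b), call `admm_subproblem`, and apply the update (20). The numerical tests reported later use $\mu = 200$, $\rho = 100$, and at most 150 ADMM iterations per subproblem; the defaults above are placeholders chosen only for the small check.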
TABLE III
MINI-BATCHING POWER FLOW MEASUREMENTS
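The closed-form single-measurement update established in Proposition 3 of the Appendix, $\mathbf{w}^* = \mathrm{proj}_\tau(c/\|\mathbf{a}\|_2^2)\cdot\mathbf{a}$, can be sketched in plain Python as follows. This is only a sketch: the objective $|\Re(\mathbf{a}^H\mathbf{w}) - c| + \|\mathbf{w}\|_2^2/(2\tau)$ is inferred from the optimality condition stated in that proof, and the function names as well as the way `a` and `c` would be formed from a measurement are hypothetical. Since the update is a scalar multiple of `a`, entries where `a` is zero stay untouched, which is the sparsity that the stochastic and mini-batched schemes rely on.

```python
def lav_single_measurement_step(a, c, tau):
    """Return proj_tau(c / ||a||^2) * a for a complex list a, real c, and tau > 0."""
    norm_sq = sum(abs(x) ** 2 for x in a)
    if norm_sq == 0.0:
        # Degenerate case (assumption): a zero vector a leaves w at zero.
        return [0j] * len(a)
    # Projection of the real number c / ||a||^2 onto the interval [-tau, tau].
    scale = max(-tau, min(tau, c / norm_sq))
    return [scale * x for x in a]


def step_objective(a, c, tau, w):
    """Evaluate |Re(a^H w) - c| + ||w||^2 / (2 tau), the assumed per-measurement objective."""
    inner = sum(x.conjugate() * y for x, y in zip(a, w))   # a^H w
    return abs(inner.real - c) + sum(abs(y) ** 2 for y in w) / (2.0 * tau)


# Example: only the first and third entries of a are nonzero, so only those move.
a = [1 + 1j, 0j, 2j]
w = lav_single_measurement_step(a, c=3.0, tau=0.2)
```

Here $\|\mathbf{a}\|_2^2 = 6$ and $c/\|\mathbf{a}\|_2^2 = 0.5$ exceeds $\tau = 0.2$, so the step is clipped to $\mathbf{w}^* = \tau\mathbf{a}$.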
TABLE IV
COMPARISONS OF DIFFERENT STATE ESTIMATORS

Fig. 2. Convergence performance for the IEEE 14-bus system.

with a constant step size of $\infty$. To see this, per iteration, the LP-based scheme [18] solves the minimization problem in (10) but without the augmented majorization term $\frac{1}{2\mu_t}\|\mathbf{v} - \mathbf{v}_t\|_2^2$, or equivalently with $\mu_t = \infty$. To be specific, the linear program was formulated over $2N - 1$ real variables consisting of the real and imaginary parts of the unknown voltage phasor vector, after excluding the imaginary part of the reference bus which was set to 0. Per iteration, the resultant linear program was solved by calling the convex optimization package CVX [34] together with its embedded interior-point solver SeDuMi [40]. Given that there is no parameter in the LP-based LAV estimator, although the time performance may vary if different toolboxes are used for solving the resultant linear programs, its convergence behavior in terms of the number of iterations is independent of the toolbox used. Furthermore, the Gauss-Newton iterations were implemented by calling the embedded state estimation function 'doSE.m' in MATPOWER.

Regarding the initialization, when all squared voltage magnitudes are measured, the initial point is taken to be the voltage magnitude vector, unless otherwise specified. Each simulated scheme stops either when a maximum of 100 iterations is reached, or when the normalized distance between two consecutive estimates becomes smaller than $10^{-10}$, namely $\|\mathbf{v}_t - \mathbf{v}_{t-1}\|_2/\sqrt{N} \le 10^{-10}$. In order to fix the phase ambiguity, the phase generated at the reference bus was set to 0 in all tests. For numerical stability, and to eliminate the effect of certain leverage measurements [20], the developed solvers were implemented using the normalized data, namely $\{(z_m/\|\mathbf{H}_m\|_2,\; \mathbf{H}_m/\|\mathbf{H}_m\|_2)\}_{m=1}^M$. Although this work focused on fast and scalable implementations of LAV estimators, certain enhanced solvers that possess similar compositional structure as in LAV (7) can also benefit from the developed composite optimization algorithms. Those include, e.g., the (robustified) Schweppe-Huber generalized M-estimator [10].

A. Noiseless Case

The first experiment simulates the noiseless data to evaluate the convergence and runtime of the novel algorithms relative to the WLS-based Gauss-Newton iterations, as well as the LP- and IRLS-based LAV solvers on the IEEE 14-bus test system. The default voltage profile was employed. Measurements including all ('sending-end') active and reactive power flows, as well as all squared voltage magnitudes were obtained from MATPOWER [38]. The ADMM-based deterministic prox-linear solver in Table I was implemented with stepsize $\mu = 200$, where each quadratic subproblem was solved using a maximum of 150 ADMM iterations with stepsize $\rho = 100$. It is worth mentioning that the deterministic prox-linear solver can be also implemented using standard convex programming approaches (by solving subproblem (10) directly). It typically converges in a few (less than 10) iterations yet at a higher computational complexity. The stochastic algorithm in Table II used the diminishing stepsize $1/t^{0.8}$. The accelerated scheme in (26) was implemented with stepsize 0.8 using a total of 11 mini-batches: 1 for all voltage magnitudes, and 5 of equal size for (sending-end) active and reactive power flows, each grouped as in Table III. The normalized root mean-square error (RMSE) $\|\mathbf{v}_t - \mathbf{v}\|_2/\|\mathbf{v}\|_2$ was evaluated at every Gauss-Newton iteration, per linear program, and every $M$ stochastic iterations of the stochastic and accelerated schemes, where $\mathbf{v}$ is the true voltage profile, and $\mathbf{v}_t$ denotes the estimate obtained at the $t$-th iteration.

Figure 2 compares the normalized RMSE for the LP- and IRLS-based, deterministic, stochastic, and accelerated LAV solvers with that of the WLS-based Gauss-Newton iterations, whose corresponding runtime and number of iterations to reach the stopping criterion are tabulated in Table IV. Evidently, the deterministic scheme is the fastest in terms of both the number of iterations and runtime, and converges to a point of machine accuracy (i.e., $10^{-16}$) in 8 iterations. The IRLS is also fast, but similar to the WLS-based Gauss-Newton iterations, it requires inverting a matrix per iteration which may discourage its use in large power systems. Even though the time of solving each LP may vary across toolboxes, convergence of the LP-based scheme in terms of the number of iterations will be the same. Evidently, solving an LP of $2M$ constraints and $2N - 1$ real variables is computationally more cumbersome and slower than performing $M$ (accelerated) stochastic LAV iterations, hence justifying the fast convergence rate of the proposed LAV solvers.

The Gauss-Newton method terminated after six iterations, but at a sub-optimal point of normalized RMSE $10^{-3}$ or so. The accelerated implementation is comparable with the stochastic LAV solver, and yields an accurate solution with RMSE $= 4.28 \times 10^{-8}$ in time also comparable to the Gauss-Newton iterations. The proposed LAV solvers are much faster than the LP-based implementation.
TABLE V
COMPARISONS OF THE GAUSS-NEWTON AND STOCHASTIC LAV ESTIMATORS
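devised. When only line flows and voltage magnitudes are measured, each stochastic iteration performs merely a few complex scalar operations, incurring per-iteration complexity $\mathcal{O}(1)$, regardless of the number of buses in the entire network. If, on the other hand, the power injections are included as well, this time complexity grows to the order of the number of neighboring buses, which still remains much smaller than the network size in general. A mini-batching technique was suggested to further accelerate the stochastic iterations by means of leveraging the sparsity of measurement matrices. Numerical tests on a variety of benchmark networks of up to 9,241 buses showcase the robustness and computational efficiency of the developed approaches relative to existing alternatives, particularly over large-size networks.

Devising decentralized and parallel implementations of the novel approaches constitutes interesting future research directions. Since the LAV estimator may yield non-robust estimates in the presence of bad leverage points, and the measurement scaling may not be able to effectively identify and eliminate a certain type of leverage measurements, it is meaningful and promising to generalize the presented deterministic and stochastic proximal-linear based algorithmic tools to other robustness-enhanced estimators such as the Schweppe-Huber generalized M-estimator [10], [29], [30]. Coping with the Y- and $\Delta$-connection, as well as investigating the technical approaches in multiphase unbalanced distribution systems are practically relevant future research topics too.

APPENDIX

Proof of Proposition 1: It is easy to check that the solution of (16a) is given by (17a), whose proof is thus omitted. Considering any $\mathbf{c} \in \mathbb{R}^N$ and $\mathbf{d} \in \mathbb{C}^N$, solving (16b) is equivalent to solving

$$\mathbf{u}^* := \arg\min_{\mathbf{u} \in \mathbb{C}^N} \; \lambda\|\Re(\mathbf{u}) - \mathbf{c}\|_1 + \frac{1}{2}\|\mathbf{u} - \mathbf{d}\|_2^2. \tag{27}$$

Upon defining $\mathbf{x} := \mathbf{u} - \mathbf{c}$, problem (27) becomes

$$\min_{\mathbf{x} \in \mathbb{C}^N} \; \lambda\|\Re(\mathbf{x})\|_1 + \frac{1}{2}\|\mathbf{x} - (\mathbf{d} - \mathbf{c})\|_2^2$$

or equivalently,

$$\min_{\mathbf{x}} \; \lambda\|\Re(\mathbf{x})\|_1 + \frac{1}{2}\|\Re(\mathbf{x}) - \Re(\mathbf{d} - \mathbf{c})\|_2^2 + \frac{1}{2}\|\Im(\mathbf{x}) - \Im(\mathbf{d} - \mathbf{c})\|_2^2. \tag{28}$$

Problem (28) can be decomposed into two subproblems that correspond to optimizing over the real and imaginary parts of $\mathbf{x} = \Re(\mathbf{x}) + j\Im(\mathbf{x}) := \mathbf{x}_r + j\mathbf{x}_i$; that is

$$\min_{\mathbf{x}_r \in \mathbb{R}^N} \; \lambda\|\mathbf{x}_r\|_1 + \frac{1}{2}\|\mathbf{x}_r - \Re(\mathbf{d} - \mathbf{c})\|_2^2 \tag{29}$$

and

$$\min_{\mathbf{x}_i \in \mathbb{R}^N} \; \frac{1}{2}\|\mathbf{x}_i - \Im(\mathbf{d} - \mathbf{c})\|_2^2. \tag{30}$$

The optimal solutions of the convex programs in (29) and (30) can be found as, see [35]

$$\mathbf{x}_r^* := \mathcal{S}_\lambda(\Re(\mathbf{d} - \mathbf{c})) = \mathcal{S}_\lambda(\Re(\mathbf{d}) - \mathbf{c})$$

$$\mathbf{x}_i^* := \Im(\mathbf{d} - \mathbf{c}) = \Im(\mathbf{d})$$

thus yielding the optimal solution of (28) as

$$\mathbf{x}^* := \mathbf{x}_r^* + j\mathbf{x}_i^* = \mathcal{S}_\lambda(\Re(\mathbf{d}) - \mathbf{c}) + j\Im(\mathbf{d}).$$

Recalling $\mathbf{u} = \mathbf{x} + \mathbf{c}$, the optimal solution of (27) is

$$\mathbf{u}^* = \mathbf{x}^* + \mathbf{c} = \mathbf{c} + \mathcal{S}_\lambda(\Re(\mathbf{d}) - \mathbf{c}) + j\Im(\mathbf{d}) \tag{31}$$

which completes the proof.

Proof of Proposition 2: Letting $\boldsymbol{\chi}$ denote the dual variable associated with the constraint $\mathbf{u} = \mathbf{A}\mathbf{w}$, the KKT optimality conditions are given by [35]

$$\mathbf{w}^* - \mathbf{b} + \mathbf{A}^H \boldsymbol{\chi}^* = \mathbf{0}$$

$$\mathbf{u}^* - \mathbf{d} - \boldsymbol{\chi}^* = \mathbf{0}$$

$$\mathbf{A}\mathbf{w}^* - \mathbf{u}^* = \mathbf{0}$$

or in the following compact form

$$\begin{bmatrix} \mathbf{I}_N & \mathbf{0} & \mathbf{A}^H \\ \mathbf{0} & \mathbf{I}_M & -\mathbf{I}_M \\ \mathbf{A} & -\mathbf{I}_M & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{w}^* \\ \mathbf{u}^* \\ \boldsymbol{\chi}^* \end{bmatrix} = \begin{bmatrix} \mathbf{b} \\ \mathbf{d} \\ \mathbf{0} \end{bmatrix}.$$

Eliminating the dual variable via $\boldsymbol{\chi}^* = \mathbf{u}^* - \mathbf{d}$ from the KKT system yields

$$\begin{bmatrix} \mathbf{I}_N & \mathbf{A}^H \\ \mathbf{A} & -\mathbf{I}_M \end{bmatrix} \begin{bmatrix} \mathbf{w}^* \\ \mathbf{u}^* \end{bmatrix} = \begin{bmatrix} \mathbf{b} + \mathbf{A}^H \mathbf{d} \\ \mathbf{0} \end{bmatrix}. \tag{32}$$

By further eliminating $\mathbf{u}^*$ and solving for $\mathbf{w}^*$, the solution to (32) and also to the minimization in (18) can be found in two steps as

$$\mathbf{w}^* := \left(\mathbf{I} + \mathbf{A}^H \mathbf{A}\right)^{-1}\left(\mathbf{b} + \mathbf{A}^H \mathbf{d}\right)$$

$$\mathbf{u}^* := \mathbf{A}\mathbf{w}^*$$

which completes the proof of the claim.

Proof of Proposition 3: The optimality condition for (23) is

$$\mathbf{0} \in \partial\left|\Re(\mathbf{a}^H \mathbf{w}) - c\right| + \frac{1}{\tau}\mathbf{w} \iff \mathbf{0} \in \partial\left|\Re(\mathbf{a}^H \mathbf{w}) - c\right| \cdot \mathbf{a} + \frac{1}{\tau}\mathbf{w}$$

or equivalently,

$$\mathbf{0} \in \partial\left|\Re(\mathbf{a}^H \mathbf{w}) - c\right| \cdot (\tau\mathbf{a}) + \mathbf{w},$$

where $\partial$ denotes the subdifferential. Let us first examine the case where $\Re(\mathbf{a}^H \mathbf{w}) - c \ne 0$. We thus have $\partial|\Re(\mathbf{a}^H \mathbf{w}) - c| = \mathrm{sign}(\Re(\mathbf{a}^H \mathbf{w}) - c)$, which yields the optimum

$$\mathbf{w}^* = -\tau\, \mathrm{sign}\!\left(\Re(\mathbf{a}^H \mathbf{w}^*) - c\right) \cdot \mathbf{a}.$$

Note that if $c/\|\mathbf{a}\|_2^2 \ge \tau$, or $\Re(\mathbf{a}^H \mathbf{w}^*) - c = -\tau\|\mathbf{a}\|_2^2\, \mathrm{sign}(\Re(\mathbf{a}^H \mathbf{w}^*) - c) - c < 0$, then $\mathbf{w}^* = \tau\mathbf{a}$. Equivalently, if $c/\|\mathbf{a}\|_2^2 \le -\tau$, or $\Re(\mathbf{a}^H \mathbf{w}^*) - c = -\tau\|\mathbf{a}\|_2^2\, \mathrm{sign}(\Re(\mathbf{a}^H \mathbf{w}^*) - c) - c > 0$, then $\mathbf{w}^* = -\tau\mathbf{a}$.

If $\Re(\mathbf{a}^H \mathbf{w}) - c = 0$, the subdifferential of the absolute-value operator belongs to the interval $[-1, 1]$; hence, the optimality condition becomes

$$\mathbf{0} \in -[-1, 1] \cdot (\tau\mathbf{a}) + \mathbf{w} \iff \mathbf{w} \in [-\tau, \tau]\, \mathbf{a}.$$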
Upon letting projτ (x) denote the projection of a real number [23] A. S. Lewis and S. J. Wright, “A proximal method for compos-
x onto the interval [−τ, τ ], one can combine the aforemen- ite minimization,” Math. Program., vol. 158, nos. 1–2, pp. 501–546,
Jul. 2016.
tioned three cases, and express compactly the optimum as [24] D. Davis and D. Drusvyatskiy, “Stochastic model-based minimization of
follows weakly convex functions,” SIAM J. Optim., vol. 29, no. 1, pp. 207–239,
Jan. 2019.
w∗ := projτ c/ a 22 · a [25] J. C. Duchi and F. Ruan, “Stochastic methods for composite optimization
problems,” arXiv:1703.08570, 2017.
which concludes the proof of the proposition. [26] G. Wang, H. Zhu, G. B. Giannakis, and J. Sun, “Robust power system
state estimation from rank-one measurements,” IEEE Trans. Control
Netw. Syst., to be published. doi: 10.1109/TCNS.2019.2890954.
R EFERENCES [27] M. A. Gandhi and L. Mili, “Robust Kalman filter based on a general-
ized maximum-likelihood-type estimator,” IEEE Trans. Signal Process.,
[1] W. A. Wulf, “Great achievements and grand challenges,” vol. 58, no. 5, pp. 2509–2520, May 2010.
Bridge, vol. 30, nos. 3–4, pp. 5–10, 2010. [Online]. Available: [28] R. C. Pires, A. S. Costa, and L. Mili, “Iteratively reweighted least-
http://www.greatachievements.org/ squares state estimation through Givens Rotations,” IEEE Trans. Power
[2] A. Abur and A. Gómez-Expósito, Power System State Estimation: Syst., vol. 14, no. 4, pp. 1499–1507, Nov. 1999.
Theory and Implementation. New York, NY, USA: Marcel Dekker, 2004. [29] J. Zhao and L. Mili, “Vulnerability of the largest normalized residual
[3] G. Wang, G. B. Giannakis, J. Chen, and J. Sun, “Distribution system statistical test to leverage points,” IEEE Trans. Power Syst., vol. 33,
state estimation: An overview of recent developments,” Front. Inf. no. 4, pp. 4643–4646, Jul. 2018.
Technol. Electron. Eng., vol. 20, no. 1, pp. 4–17, Jan. 2019. [30] J. Zhao, L. Mili, and R. C. Pires, “Statistical and numerical robust state
[4] F. C. Schweppe, J. Wildes, and D. Rom, “Power system static-state esti- estimator for heavily loaded power systems,” IEEE Trans. Power Syst.,
mation: Parts I, II, and III,” IEEE Trans. Power App. Syst., vol. PAS-89, vol. 33, no. 6, pp. 6904–6914, Nov. 2018.
pp. 120–135, Jan. 1970. [31] J. C. Duchi and F. Ruan, “Solving (most) of a set of quadratic equalities:
[5] H. M. Merrill and F. C. Schweppe, “Bad data suppression in Composite optimization for robust phase retrieval,” arXiv:1705.02356,
power system static state estimation,” IEEE Trans. Power App. Syst., 2017.
vol. PAS-90, no. 6, pp. 2718–2725, Nov. 1971. [32] P. J. Huber, “Robust statistics,” in International Encyclopedia
[6] P. Fairley, “Cybersecurity at U.S. utilities due for an upgrade: Tech of Statistical Science. Heidelberg, Germany: Springer, 2011,
to detect intrusions into industrial control systems will be mandatory,” pp. 1248–1251.
IEEE Spectr., vol. 53, no. 5, pp. 11–13, May 2016. [33] G. Wang, A. S. Zamzam, G. B. Giannakis, and N. D. Sidiropoulos,
[7] D. U. Case, “Analysis of the cyber attack on the Ukrainian power grid,” “Power system state estimation via feasible point pursuit: Algorithms
2016. and Crmér–Rao bound,” IEEE Trans. Signal Process., vol. 66, no. 6,
[8] S. Zonouz et al., “SCPSE: Security-oriented cyber-physical state esti- pp. 1649–1658, Mar. 2018.
mation for power grid critical infrastructures,” IEEE Trans. Smart Grid, [34] M. Grant and S. Boyd. (2008). CVX: MATLAB Software for Disciplined
vol. 3, no. 4, pp. 1790–1799, Dec. 2012. Convex Programming. [Online]. Available: http://cvxr.com/cvx.
[9] E. Caro and A. J. Conejo, “State estimation via mathematical programming: A comparison of different estimation algorithms,” IET Gener. Transm. Distrib., vol. 6, no. 6, pp. 545–553, Jun. 2012.
[10] L. Mili, M. G. Cheniae, N. S. Vichare, and P. J. Rousseeuw, “Robust state estimation based on projection statistics [of power systems],” IEEE Trans. Power Syst., vol. 11, no. 2, pp. 1118–1127, May 1996.
[11] L. Mili, M. G. Cheniae, and P. J. Rousseeuw, “Robust state estimation of electric power systems,” IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 41, no. 5, pp. 349–358, May 1994.
[12] G. Wang, G. B. Giannakis, and Y. C. Eldar, “Solving systems of random quadratic equations via truncated amplitude flow,” IEEE Trans. Inf. Theory, vol. 64, no. 2, pp. 773–794, Feb. 2018.
[13] G. Wang, G. B. Giannakis, Y. Saad, and J. Chen, “Phase retrieval via reweighted amplitude flow,” IEEE Trans. Signal Process., vol. 66, no. 11, pp. 2818–2833, Jun. 2018.
[14] D. P. Bertsekas, Nonlinear Programming, 2nd ed. Belmont, MA, USA: Athena Sci., 1999.
[15] H. Zhu and G. B. Giannakis, “Power system nonlinear state estimation using distributed semidefinite programming,” IEEE J. Sel. Topics Signal Process., vol. 8, no. 6, pp. 1039–1050, Dec. 2014.
[16] Y. Zhang, R. Madani, and J. Lavaei, “Conic relaxations for power system state estimation with line measurements,” IEEE Trans. Control Netw. Syst., vol. 5, no. 3, pp. 1193–1205, Sep. 2018.
[35] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Found. Trends Mach. Learn., vol. 3, no. 1, pp. 1–122, 2010.
[36] V. Kekatos and G. B. Giannakis, “Distributed robust power system state estimation,” IEEE Trans. Power Syst., vol. 28, no. 2, pp. 1617–1626, May 2013.
[37] Power Systems Test Case Archive, Univ. Washington, Seattle, WA, USA. [Online]. Available: http://www.ee.washington.edu/research/pstca
[38] R. D. Zimmerman, C. E. Murillo-Sanchez, and R. J. Thomas, “MATPOWER: Steady-state operations, planning and analysis tools for power systems research and education,” IEEE Trans. Power Syst., vol. 26, no. 1, pp. 12–19, Feb. 2011.
[39] R. A. Jabr and B. C. Pal, “Iteratively reweighted least-squares implementation of the WLAV state-estimation method,” IEE Proc. Gener. Transm. Distrib., vol. 151, no. 1, pp. 103–108, Feb. 2004.
[40] J. F. Sturm, “Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones,” Optim. Method Softw., vol. 11, nos. 1–4, pp. 625–653, Jan. 1999.
[41] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, May 2015.
[17] W. W. Kotiuga and M. Vidyasagar, “Bad data rejection properties of weighted least absolute value techniques applied to static state estimation,” IEEE Trans. Power App. Syst., vol. PER-2, no. 4, pp. 844–853, Apr. 1982.
[18] A. Abur and M. K. Celik, “A fast algorithm for the weighted least absolute value state estimation (for power systems),” IEEE Trans. Power Syst., vol. 6, no. 1, pp. 1–8, Feb. 1991.
[19] R. A. Jabr and B. C. Pal, “Iteratively re-weighted least absolute value method for state estimation,” IEE Proc. Gener. Transm. Distrib., vol. 150, no. 4, pp. 385–391, Jul. 2003.
[20] M. Göl and A. Abur, “LAV based robust state estimation for systems measured by PMUs,” IEEE Trans. Smart Grid, vol. 5, no. 4, pp. 1808–1814, Jul. 2014.
[21] C. Xu and A. Abur, “A fast and robust linear state estimator for very large scale interconnected power grids,” IEEE Trans. Smart Grid, vol. 9, no. 5, pp. 4975–4982, Sep. 2018.
[22] J. V. Burke, “Descent methods for composite nondifferentiable optimization problems,” Math. Program., vol. 33, no. 3, pp. 260–279, Dec. 1985.

Gang Wang (M’18) received the B.Eng. degree in electrical engineering and automation from the Beijing Institute of Technology, Beijing, China, in 2011 and the Ph.D. degree in electrical and computer engineering from the University of Minnesota, Minneapolis, MN, USA, in 2018.
He is currently a Post-Doctoral Associate with the Department of Electrical and Computer Engineering, University of Minnesota. His research interests focus on the areas of statistical learning, optimization, and deep learning with applications to data science and smart grids.
Dr. Wang was a recipient of the National Scholarship in 2014, the Guo Rui Scholarship in 2017, and the Innovation Scholarship (First Place) in 2017, all from China, as well as the Best Student Paper Award at the European Signal Processing Conference in 2017.
Georgios B. Giannakis (F’97) received the Diploma degree in electrical engineering from the National Technical University of Athens, Greece, in 1981, the first M.Sc. degree in electrical engineering, the second M.Sc. degree in mathematics, and the Ph.D. degree in electrical engineering from the University of Southern California, CA, USA, in 1983, 1986, and 1986, respectively.
From 1982 to 1986, he was with the University of Southern California. He was a Faculty Member with the University of Virginia, VA, USA, from 1987 to 1998, and since 1999 he has been a Professor with the University of Minnesota, MN, USA, where he holds an ADC Endowed Chair, a University of Minnesota McKnight Presidential Chair in ECE, and serves as the Director of the Digital Technology Center. His general interests span the areas of statistical learning, communications, and networking, subjects on which he has published over 450 journal papers, 750 conference papers, 25 book chapters, two edited books, and two research monographs with an H-index of 142. He is the (co-)inventor of 32 patents issued. Current research focuses on data science, and network science with applications to the Internet of Things, social, brain, and power networks with renewables.
Dr. Giannakis was the (co-)recipient of nine Best Journal Paper Awards from the IEEE Signal Processing (SP) and Communications Societies, including the G. Marconi Prize Paper Award in Wireless Communications, the Technical Achievement Awards from the SP Society in 2000 and from EURASIP in 2005, the Young Faculty Teaching Award, the G. W. Taylor Award for Distinguished Research from the University of Minnesota, and the IEEE Fourier Technical Field Award in 2015. He is a fellow of EURASIP, and has served the IEEE in a number of posts, including that of a Distinguished Lecturer for the IEEE-SPS.

Jie Chen (M’09–SM’12–F’19) received the B.Sc., M.Sc., and Ph.D. degrees in control theory and control engineering from the Beijing Institute of Technology, Beijing, China, in 1986, 1996, and 2001, respectively.
From 1989 to 1990, he was a Visiting Scholar with the California State University, Long Beach, CA, USA. From 1996 to 1997, he was a Research Fellow with the School of Engineering, University of Birmingham, Birmingham, U.K. He is a Professor of control science and engineering with the Beijing Institute of Technology, where he also serves as the Head of the State Key Laboratory of Intelligent Control and Decision of Complex Systems. He is currently the President of Tongji University, Shanghai 200092, China. He is also a member of the Chinese Academy of Engineering. He has (co-)authored four books and over 200 research papers. His main research interests include intelligent control and decision in complex systems, multi-agent systems, and optimization.
Dr. Chen served as a Managing Editor for the Journal of Systems Science & Complexity and an Associate Editor for the IEEE Transactions on Cybernetics and for several other international journals.