4796 IEEE TRANSACTIONS ON POWER SYSTEMS, VOL. 34, NO. 6, NOVEMBER 2019

Data-Driven Learning-Based Optimization for Distribution System State Estimation

Ahmed S. Zamzam, Student Member, IEEE, Xiao Fu, Member, IEEE, and Nicholas D. Sidiropoulos, Fellow, IEEE

Abstract—Distribution system state estimation (DSSE) is a core task for monitoring and control of distribution networks. Widely used algorithms such as Gauss-Newton perform poorly with the limited number of measurements typically available for DSSE, often require many iterations to obtain reasonable results, and sometimes fail to converge. DSSE is a non-convex problem, and working with a limited number of measurements further aggravates the situation, as indeterminacy induces multiple global (in addition to local) minima. Gauss-Newton is also known to be sensitive to initialization. Hence, the situation is far from ideal. It is therefore natural to ask if there is a smart way of initializing Gauss-Newton that will avoid these DSSE-specific pitfalls. This paper proposes using historical or simulation-derived data to train a shallow neural network to “learn to initialize,” that is, map the available measurements to a point in the neighborhood of the true latent states (network voltages), which is used to initialize Gauss-Newton. It is shown that this hybrid machine learning/optimization approach yields superior performance in terms of stability, accuracy, and runtime efficiency, compared to conventional optimization-only approaches. It is also shown that judicious design of the neural network training cost function helps to improve the overall DSSE performance.

Index Terms—Distribution network state estimation, phasor measurement units, machine learning, neural networks, Gauss-Newton, least squares approximation.

Manuscript received October 23, 2018; revised February 20, 2019; accepted March 29, 2019. Date of publication April 18, 2019; date of current version October 24, 2019. The work of A. S. Zamzam and N. D. Sidiropoulos was partially supported by NSF under Grant CCF-1525194. Paper no. TPWRS-01617-2018. (Corresponding author: Nicholas D. Sidiropoulos.)
A. S. Zamzam is with the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455 USA (e-mail: ahmedz@umn.edu).
X. Fu is with the School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331 USA (e-mail: xiao.fu@oregonstate.edu).
N. D. Sidiropoulos is with the Department of Electrical and Computer Engineering, University of Virginia, Charlottesville, VA 22904 USA (e-mail: nikos@virginia.edu).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TPWRS.2019.2909150
0885-8950 © 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: University of Saskatchewan. Downloaded on September 26, 2021 at 23:58:48 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION

State estimation (SE) techniques are used to monitor power grid operations in real time. Accurately monitoring the network operating point is critical for many control and automation tasks, such as Volt/VAr optimization, feeder reconfiguration and restoration. SE uses measured quantities like nodal voltages, injections, and line flows, together with physical laws, in order to obtain an estimate of the system state variables, i.e., bus voltage magnitudes and angles [1], [2] across the network. SE techniques have also proven to be useful in network ‘forensics’, such as spotting bad measurements and identifying gross modelling errors [3].

Unlike transmission networks, where measurement units are placed at almost all network nodes, the SE task in distribution systems is particularly challenging due to the scarcity of real-time measurements. This is usually compensated by the use of so-called pseudo-measurements. Obtained through short-term load and renewable energy forecasting techniques, these pseudo-measurements play a vital role in enabling distribution system state estimation (DSSE) [4]–[6]. Several DSSE solvers based on weighted least squares (WLS) transmission system state estimation methods have been proposed [7]–[11]. A three-phase nodal voltage formulation was used to develop a WLS-based DSSE solver in [7], [8]. Recently, the authors of [12] used Wirtinger calculus to devise a new approach for WLS state estimation in the complex domain. In order to reduce the computational complexity and storage requirements, the branch-based WLS model was proposed in [13], [14]. However, such gains can be obtained only when the target power system features only wye-connected loads that are solidly grounded. It is also recognized that incorporating phasor measurements in DSSE improves the observability and the estimation accuracy [15], [16]. Therefore, the DSSE approach developed in this paper considers the case where classical (quadratic) and phasor (linear) measurements are available, as well as pseudo-measurements provided through short-term forecasting algorithms.

WLS DSSE is a non-convex problem that may have multiple local minima, and working with a limited number of measurements empirically aggravates the situation, as it may introduce multiple local minima as well. Furthermore, Gauss-Newton type algorithms behave very differently when using different initializations: the algorithms may need many iterations, or even fail to converge. It is therefore natural to ask if there is a smart way of initializing Gauss-Newton that will avoid these pitfalls.

Contributions. In this paper, we propose a novel learning architecture for the DSSE task. Our idea is as follows. A wealth of historical data is often available for a given distribution system. This data is usually stored and utilized in various other network management tasks, such as load and injection forecasting. Even without detailed recording of the network state, we can reuse this data to simulate network operations off-line. We
can then think of network states and measurements as (output, input) training pairs, which can be used to train a neural network (NN) to ‘learn’ a function that maps measurements to states. After the mapping function is learned, estimating the states associated with a fresh set of measurements only requires very simple operations: passing the measurements through the learned NN. This would greatly improve the efficiency of DSSE, bringing real-time state estimation within reach. Accurate and cheap DSSE using an NN may sound too good to be true, and in some sense (in its raw form) it is; but there is also a silver lining, as we will see.

Known as universal function approximators, neural networks have made a remarkable comeback in recent years, outperforming far more complicated (and disciplined) methods in several research fields; see [17]–[19] for examples. One nice feature of neural networks and other machine learning approaches is that they alleviate the computational burden at the operation stage by shifting computationally intensive ‘hard work’ to the off-line training stage.

However, accurately learning the end-to-end mapping from the available measurements to the exact network state is very challenging in our context: the accuracy achieved by convergent Gauss-Newton iterates (under good initializations) is hard to obtain using learning approaches. The mapping itself is very complex, necessitating a wide and/or deep NN that is hard to train with reasonable amounts of data. In addition, training a deep NN (DNN) is computationally cumbersome, requiring significant computing resources. Also, a DNN slows down real-time estimation, as passing the input through its layers is a sequential process that cannot be parallelized. To circumvent these obstacles, we instead propose to train a shallow neural network to ‘learn to initialize’, that is, map the available measurements to a point in the neighborhood of the true latent states, which is then used to initialize Gauss-Newton; see Fig. 1 for an illustration. When the Gauss-Newton solver is initialized at a point in the vicinity of the optimal solution, it enjoys quadratic convergence [20]; otherwise, divergence is possible. We show that such a hybrid machine learning / optimization approach yields superior performance compared to conventional optimization-only approaches, in terms of stability, accuracy, and runtime efficiency. We demonstrate these benefits using convincing experiments with the benchmark IEEE-37 distribution feeder with several renewable energy sources installed and several types of phasor and conventional measurements, as well as pseudo-measurements. The key to success is appropriate design of the NN training cost function for the ‘neighborhood-finding’ NN. As we will see, the proposed cost function serves our purpose much better than using a generic cost function for conventional NN training. In addition, owing to the special design of the training cost function, the experiments corroborate the resiliency of the proposed approach in case of modest network reconfiguration events.

Fig. 1. The proposed learning-based DSSE.

Context. Machine learning approaches are not entirely new in the power systems / smart grid area. For instance, an online learning algorithm was used in [21] to shape residential energy demand and reduce operational costs. In [22], a multi-armed bandit online learning technique was employed to forecast the power injection of renewable energy sources. An early example of using NNs in estimation problems appeared in [23] as part of damage-adaptive intelligent flight control. Closer to our present context, [5] proposed the use of an artificial neural network that takes the measured power flows as input and aims to estimate the bus injections, which are later used as pseudo-measurements in the state estimation. In contrast to our approach, where the NN is used to approximate the network state given the conventional measurements as well as the pseudo-measurements, the authors of [5] designed an artificial NN to generate pseudo-measurements from the available power flow measurements. In addition, artificial neural networks have been used for the prediction step in dynamic state estimation [24], and for forecasting-aided state estimation [25], [26], where the state of the network is estimated from the previous sequence of states, without using conventional measurements to anchor the solution. This is better suited for transmission systems, which are more predictable relative to distribution systems with time-varying loads. As the installation of renewable energy sources in the distribution grid surges, the huge volatility brought by these energy sources [27] induces rapid changes in network state, and thus the previously estimated state is often a bad initialization for the next instance of DSSE. To the best of our knowledge, machine learning approaches have not yet been applied to the core DSSE optimization task, which is the focus of our work.

Notation: matrices (vectors) are denoted by boldface capital (small) letters; $(\cdot)^T$, $\overline{(\cdot)}$ and $(\cdot)^H$ stand for transpose, complex-conjugate and complex-conjugate transpose, respectively; and $|(\cdot)|$ denotes the magnitude of a number or the cardinality of a set.

II. DISTRIBUTION SYSTEM STATE ESTIMATION

A. Network Representation

Consider a multi-phase distribution network consisting of $N + 1$ nodes and $L$ edges represented by a graph $\mathcal{G} := (\mathcal{N}, \mathcal{L})$, whose set of multi-phase nodes (buses) is indexed by $\mathcal{N} := \{0, 1, \ldots, N\}$, and $\mathcal{L} \subseteq \mathcal{N} \times \mathcal{N}$ represents the lines in the network. Let the node 0 be the substation that connects the system to the transmission grid. The sets of phases at bus $n$ and line $(l, m)$ are denoted by $\varphi_n$ and $\varphi_{lm}$, respectively. Let the voltage at the $n$-th bus for phase $\phi$ be denoted by $v_{n,\phi}$. Then, define $\mathbf{v}_n := [v_{n,\phi}]_{\phi \in \varphi_n}$ to collect the voltage phasors at the phases of bus $n$. In addition, let the vector $\mathbf{v}$ concatenate the vectors $\mathbf{v}_n$ for all $n \in \mathcal{N}$. Lines $(l, m) \in \mathcal{L}$ are modeled as $\pi$-equivalent
circuits, where the phase impedance and shunt admittance are denoted by $\mathbf{Z}_{lm} \in \mathbb{C}^{|\varphi_{lm}| \times |\varphi_{lm}|}$ and $\check{\mathbf{Y}}_{lm} \in \mathbb{C}^{|\varphi_{lm}| \times |\varphi_{lm}|}$, respectively.

B. Problem Formulation

The DSSE task amounts to recovering the voltage phasors of buses given measurements related to real-time physical quantities, and the available pseudo-measurements. Actual measurements are acquired by smart meters, PMUs, and μPMUs that are placed at some locations in the distribution network. The measured quantities are usually noisy and adhere to

$$\tilde{z}_\ell = \tilde{h}_\ell(\mathbf{v}) + \xi_\ell, \quad 1 \le \ell \le L_m \tag{1}$$

where $\xi_\ell$ accounts for the zero-mean measurement noise with known variance $\tilde{\sigma}_\ell^2$. The functions $\tilde{h}_\ell(\mathbf{v})$ are dependent on the type of the measurement, and can be either linear or quadratic relationships. In the next section, the specific form of $\tilde{h}_\ell(\mathbf{v})$ will be discussed. In addition, load and generation forecasting methods are employed to obtain pseudo-measurements that can help with identifying the network state. The forecasted quantities are modeled as

$$\check{z}_\ell = \check{h}_\ell(\mathbf{v}) + \zeta_\ell, \quad 1 \le \ell \le L_s \tag{2}$$

where $\zeta_\ell$ represents the zero-mean forecast error, which has a variance of $\check{\sigma}_\ell^2$. Since the $\check{z}_\ell$'s represent power-related quantities, they are usually modeled as quadratic functions of the state variable $\mathbf{v}$. While the value of the measurement noise variance $\tilde{\sigma}_\ell^2$ depends on the accuracy of the measuring equipment, the variance of the forecast error can be determined using historical forecast data.

Let $\mathbf{z}$ be a vector of length $L = L_m + L_s$ containing the measurements and pseudo-measurements, and $\mathbf{h}(\mathbf{v})$ the equation relating the measurements to the state vector $\mathbf{v}$, which will be specified in the next section. Adopting a weighted least-squares formulation, the problem can be cast as follows

$$\min_{\mathbf{v}} \; J(\mathbf{v}) = \sum_{\ell=1}^{L_m} \tilde{w}_\ell \left(\tilde{z}_\ell - \tilde{h}_\ell(\mathbf{v})\right)^2 + \sum_{\ell=1}^{L_s} \check{w}_\ell \left(\check{z}_\ell - \check{h}_\ell(\mathbf{v})\right)^2 = (\mathbf{z} - \mathbf{h}(\mathbf{v}))^T \mathbf{W} (\mathbf{z} - \mathbf{h}(\mathbf{v})) \tag{3}$$

where the values of $\tilde{w}_\ell$ and $\check{w}_\ell$ are inversely proportional to $\tilde{\sigma}_\ell^2$ and $\check{\sigma}_\ell^2$, respectively. The optimization problem (3) is non-convex due to the nonlinearity of the measurement mappings $\mathbf{h}(\mathbf{v})$ inside the squares.

C. Available Measurements for DSSE

As indicated in the previous subsection, only a few real-time measurements are usually available in distribution networks, relative to the obtainable measurements in transmission systems. Therefore, pseudo-measurements are used to alleviate the issue of solving an underdetermined problem. Different sources of measurements always have different latencies, which brings up the issue of time skewness. Many approaches have been proposed in the literature to tackle the issue [28]. In this work, we assume that the issue is resolved using one of the solutions in the literature, and hence, the measurements are assumed to be synchronized. First, the measurement functions $\tilde{\mathbf{h}}(\mathbf{v})$ will be introduced for all types of available measurements. Then, the construction of the pseudo-measurement mappings $\check{\mathbf{h}}(\mathbf{v})$ will be explained.

The measurement functions consist of:

• phasor measurements, which represent the complex nodal voltages $\mathbf{v}_n$ or current flows $\mathbf{i}_{lm}$. The corresponding measurement function is linear in the state variable $\mathbf{v}$. These measurements are usually obtained by the PMUs and μPMUs. Each measurement of this type is handled as two measurements, i.e., the real and imaginary parts of the complex quantity. For the nodal voltages, the real and imaginary parts are given as follows

$$\Re\{v_{n,\phi}\} = \frac{1}{2} \mathbf{e}_{n,\phi}^T (\mathbf{v}_n + \overline{\mathbf{v}}_n), \tag{4}$$

$$\Im\{v_{n,\phi}\} = \frac{1}{2j} \mathbf{e}_{n,\phi}^T (\mathbf{v}_n - \overline{\mathbf{v}}_n) \tag{5}$$

where $\mathbf{e}_{n,\phi}$ is the $\phi$-th canonical basis in $\mathbb{R}^{|\varphi_n|}$. In addition, the current flow measurements can be modeled as

$$\Re\{i_{lm,\phi}\} = \frac{1}{2} \mathbf{e}_{lm,\phi}^T \left(\mathbf{Y}_{lm} (\mathbf{v}_l - \mathbf{v}_m) + \overline{\mathbf{Y}}_{lm} (\overline{\mathbf{v}}_l - \overline{\mathbf{v}}_m)\right) \tag{6}$$

$$\Im\{i_{lm,\phi}\} = \frac{1}{2j} \mathbf{e}_{lm,\phi}^T \left(\mathbf{Y}_{lm} (\mathbf{v}_l - \mathbf{v}_m) - \overline{\mathbf{Y}}_{lm} (\overline{\mathbf{v}}_l - \overline{\mathbf{v}}_m)\right) \tag{7}$$

where $\mathbf{Y}_{lm}$ is the inverse of $\mathbf{Z}_{lm}$, and $\mathbf{e}_{lm,\phi}$ is the $\phi$-th canonical basis in $\mathbb{R}^{|\varphi_{lm}|}$.

• real-valued measurements, which encompass voltage magnitudes $|v_{n,\phi}|$, current magnitudes $|i_{lm,\phi}|$, and real and reactive power flow measurements $p_{lm,\phi}$, $q_{lm,\phi}$. These measurements are obtained by SCADA systems, Distribution Automation, Intelligent Electronic Devices, and PMUs. The real-valued measurements are nonlinearly related to the state variable $\mathbf{v}$. The squared voltage magnitude and the active and reactive power flows can be represented as quadratic functions of the state variable $\mathbf{v}$; see [29]. The current magnitude squared can be written as follows

$$|i_{lm,\phi}|^2 = (\mathbf{v}_l - \mathbf{v}_m)^H \mathbf{y}_{lm,\phi}^H \mathbf{y}_{lm,\phi} (\mathbf{v}_l - \mathbf{v}_m) \tag{8}$$

where $\mathbf{y}_{lm,\phi}$ is the $\phi$-th row of the admittance matrix $\mathbf{Y}_{lm}$. Therefore, all the real-valued measurements can be written as quadratic measurements of the state variable $\mathbf{v}$.

The available real-time measurements are usually insufficient to ‘pin down’ the network state, as we have discussed. In this case, the system is said to be unobservable. Hence, pseudo-measurements that augment the real-time measurements are crucial in DSSE as they help achieve network observability. Pseudo-measurements are obtained through load and generation forecast procedures that aim at estimating the energy consumption or generation utilizing historical data and location-based information. They are considered less accurate than real-time measurements, and hence are assigned low weights in the WLS formulation. The functions governing the mapping from the state variable to the forecasted load and renewable energy source injections can be formulated as quadratic functions [29], [30].
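To make the measurement model concrete, the following is a minimal NumPy sketch, not the authors' implementation, of the phasor voltage map in (4)-(5), the squared current magnitude in (8), and the WLS objective in (3). The function names, the toy two-bus line, its impedance values, and the noise levels are all hypothetical choices made for illustration; the weights are taken as exactly $1/\sigma_\ell^2$, which is one admissible choice of "inversely proportional".

```python
import numpy as np

def voltage_real_imag(v_n, phi):
    """Real and imaginary parts of v_{n,phi}, written as in (4)-(5)."""
    e = np.zeros(len(v_n))
    e[phi] = 1.0                                  # canonical basis vector
    re = 0.5 * (e @ (v_n + np.conj(v_n)))
    im = (e @ (v_n - np.conj(v_n))) / 2j
    return re.real, im.real

def current_mag_sq(Y_lm, v_l, v_m, phi):
    """Squared current magnitude on phase phi, written as in (8)."""
    y = Y_lm[phi, :]                              # phi-th row of Y_lm
    d = v_l - v_m
    # (v_l - v_m)^H y^H y (v_l - v_m), with y treated as a row vector
    return ((np.conj(d) @ np.conj(y)) * (y @ d)).real

def wls_cost(z, h_v, sigma):
    """WLS objective (3) with weights 1/sigma^2 (assumed constant of 1)."""
    r = z - h_v
    return float(np.sum(r**2 / sigma**2))

# Toy data: hypothetical values, not taken from the paper.
a = np.exp(-2j * np.pi / 3)
v_l = np.array([1.0, a, a**2])                    # balanced three-phase bus
v_m = 0.98 * np.exp(-0.01j) * v_l                 # slightly lower voltage
Z_lm = (0.1 + 0.3j) * np.eye(3) + (0.02 + 0.05j) * (1 - np.eye(3))
Y_lm = np.linalg.inv(Z_lm)                        # Y_lm is the inverse of Z_lm

def h(v_l, v_m):
    """Stack one phasor voltage (two real values) and one |i|^2 value."""
    re, im = voltage_real_imag(v_l, 0)
    return np.array([re, im, current_mag_sq(Y_lm, v_l, v_m, 0)])

sigma = np.array([1e-3, 1e-3, 1e-4])              # assumed noise levels
rng = np.random.default_rng(0)
z = h(v_l, v_m) + sigma * rng.standard_normal(3)  # noisy measurements
print("J at the true state:", wls_cost(z, h(v_l, v_m), sigma))
print("J at a flat start:  ", wls_cost(z, h(v_l, v_l), sigma))
```

The phasor terms are linear in the state while the current-magnitude term is quadratic, so squaring the residuals in `wls_cost` is what produces the non-convex objective discussed in the text.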


Therefore, any measurement synthesizing function $h_\ell(\mathbf{v})$ can be written in the following form

$$h_\ell(\mathbf{v}) = \overline{\mathbf{v}}^T \mathbf{D}_\ell \mathbf{v} + \mathbf{c}_\ell^T \mathbf{v} + \overline{\mathbf{c}}_\ell^T \overline{\mathbf{v}} \tag{9}$$

where $\mathbf{D}_\ell$ is a Hermitian matrix. This renders $J(\mathbf{v})$ a fourth-order function of the state variable, which is very challenging to optimize.

The Gauss-Newton algorithm linearizes the first-order optimality conditions to iteratively update the state variables until convergence. The algorithm is known to perform well in practice provided that it is initialized from a point in the vicinity of the true network state, although it lacks a provable convergence guarantee. Several variants of the algorithm have been proposed in the literature using polar [7], rectangular [31] and complex [12] representations of the state variables. All these algorithms work to a certain extent, but failure cases are also often observed. Again, stable convergence performance is only observed when the initialization is close enough to the optimal solution of (3). This is not entirely surprising, given the non-convex nature of the DSSE problem.

III. PROPOSED APPROACH: LEARNING-AIDED DSSE OPTIMIZATION

Assume that there exists a mapping $F(\cdot)$ such that

$$F(\mathbf{z}) = \mathbf{v};$$

i.e., $F(\cdot)$ maps the (noiseless) measurements to the ground-truth states. An example of such a mapping is an optimization algorithm that can optimally solve the DSSE problem in the noiseless case, assuming that the solution is unique. The algorithm takes $\mathbf{z}$ as input and outputs $\mathbf{v}$. In reality the actual (and the virtual) measurements will be noisy, so we can only aim for

$$F(\mathbf{z}) \approx \mathbf{v};$$

which is also what optimization-based DSSE aims for in the noisy case.

Inspired by the recent successes of machine learning, it is intriguing to ask whether it is possible to learn the mapping $F(\cdot)$ from historical data. If the answer is affirmative and the learned $\hat{F}(\cdot)$ is easy to evaluate, then the DSSE problem could be solvable in a very efficient way online, after the mapping $\hat{F}(\cdot)$ is learned offline.

In machine learning, neural networks are known as universal function approximators. In principle, a three-layer (input, hidden, output) NN can approximate any continuous multivariate function down to prescribed accuracy, if there are no constraints on the number of neurons [32]. This motivates us to consider employing an NN for approximating $F(\mathbf{z})$ in the DSSE problem. An NN with vector input $\mathbf{z}$, vector output $\mathbf{g}$, and one hidden layer comprising $T$ neurons synthesizes a function of the following form

$$\mathbf{g}_T(\mathbf{z}) = \sum_{t=1}^{T} \boldsymbol{\alpha}_t \, \sigma(\mathbf{w}_t^T \mathbf{z} + \beta_t), \tag{10}$$

where $\mathbf{w}_t$ represents the linear combination of the inputs in $\mathbf{z}$ that is fed to the $t$-th neuron (i.e., the unit represented by $\sigma(\mathbf{w}_t^T \mathbf{z} + \beta_t)$), $\beta_t$ is the corresponding scalar bias, and the vectors $\boldsymbol{\alpha}_t$ combine the outputs of the neurons in the hidden layer to produce the vector output of the NN. The parameters $\{\boldsymbol{\alpha}_t, \mathbf{w}_t, \beta_t\}_{t=1}^{T}$ can be learned by minimizing the training cost function

$$\min_{\{\boldsymbol{\alpha}_t, \mathbf{w}_t, \beta_t\}_{t=1}^{T}} \; \sum_j \left\| \mathbf{v}_j - \mathbf{g}_T(\mathbf{z}_j) \right\|_2^2, \tag{11}$$

where the pair $(\mathbf{z}_j, \mathbf{v}_j)$ is a training sample of measurements and the corresponding underlying voltages to be estimated, in our context.

The above training cost function ideally seeks an NN that works perfectly, at least over the training set. This approach is similar in spirit to the one in [33], which considered a problem in wireless resource allocation with the objective of ‘learning to optimize’, meaning training an NN to learn the exact end-to-end input-output mapping of an optimization algorithm. Our experience has been that, for DSSE, such an approach works to some extent, but its performance is not ideal. Trying to learn the end-to-end DSSE mapping appears to be too ambitious, requiring very large $T$ or a deep NN, and very high training sample complexity. To circumvent these obstacles, we instead propose to train a shallow neural network, as above, to ‘learn to initialize’, that is, map the available measurements to a point in the neighborhood of the true latent state, which is then used to initialize Gauss-Newton as depicted in Fig. 2.

Fig. 2. Learning-based state estimator structure.

More specifically, we propose using the following cost function for training the NN:

$$\min_{\{\mathbf{w}_t, \beta_t, \boldsymbol{\alpha}_t\}_{t=1}^{T}} \; \sum_j \max\left\{ \left\| \mathbf{v}_j - \mathbf{g}_T(\mathbf{z}_j) \right\|_2^2 - \epsilon^2, \; 0 \right\} \tag{12}$$

where the cost function indicates that the NN parameters are tuned with the relaxed goal that $\mathbf{g}_T(\mathbf{z}_j)$ lies in the ball of radius $\epsilon$ around $\mathbf{v}_j$. Fig. 3 illustrates the effect of changing the value of $\epsilon$ on the empirical loss function. The high-level idea is as follows: instead of enforcing minimization of $\sum_j \|\mathbf{v}_j - \mathbf{g}_T(\mathbf{z}_j)\|_2^2$, we seek a ‘lazy’ solution such that $\|\mathbf{v}_j - \mathbf{g}_T(\mathbf{z}_j)\|_2^2 \le \epsilon^2$ for as many $j$ as possible; in other words, it is enough to get to the right neighborhood. As we will show, this ‘lowering of the bar’ can significantly reduce the complexity of the NN (measured by the number of neurons) that is needed to learn such an approximate mapping, and with it also the number of training samples required for learning. These complexity benefits are
obtained while still reproducing a point that is close enough to serve as a good initialization for Gauss-Newton, ensuring stable and rapid convergence. To back up this intuition, we have the following result.

Proposition 1: Let $\sigma(\cdot)$ be any continuous sigmoidal function, and let $\mathbf{g}_T(\mathbf{z}) : \mathbb{R}^L \to \mathbb{R}^K$ be in the form

$$\mathbf{g}_T(\mathbf{z}) = \sum_{t=1}^{T} \boldsymbol{\alpha}_t \, \sigma(\mathbf{w}_t^T \mathbf{z} + \beta_t).$$

Then, for approximating a continuous mapping $F : \mathbb{R}^L \to \mathbb{R}^K$, the complexity for a shallow network to solve Problem (12) exactly (i.e., with zero cost) for a finite number of bounded training samples $\mathbf{z}_j$, $\mathbf{v}_j = F(\mathbf{z}_j)$ is at least in the order of

$$T = O\left( \left( \frac{\epsilon}{\sqrt{K}} \right)^{-\frac{L}{r}} \right),$$

where $r$ is the number of continuous derivatives of $F(\cdot)$.

The proof of this proposition is relegated to Appendix A. Note that the boundedness assumption on the inputs is a proper assumption since these quantities represent voltages and powers. The implication here is very interesting, as controlling $\epsilon$ can drastically reduce the required $T$ (and, along with it, sample complexity) while still ensuring an accurate enough prediction to enable rapid convergence of the ensuing Gauss-Newton stage. Furthermore, keeping the network shallow and $T$ moderate makes the actual online computation (passing the input measurements through the NN to obtain the sought initialization) simple enough for real-time operation. This way, the relative strengths of learning-based and optimization-based methods can be effectively combined, and the difficulties of both methods can be circumvented.

One important remark is that Proposition 1 is derived under the assumption that $F(\cdot)$ is a continuous mapping that can be parametrized with $L$ parameters, which is hard to verify in our case. Nevertheless, we find that the theoretical result here is interesting enough and intuitively pleasing. In the case of a simple single-phase feeder, the state estimation mapping is indeed continuous and finitely parametrizable; see Appendix B. More importantly, as will be seen, this corroborating theory is consistent with our empirical results.

Fig. 3. The empirical loss function used for training.

In order to tune the neural network parameters, $N_t$ training samples have to be used in order to minimize the cost function in (12). Two different ways can be utilized in order to obtain such training data. First, historical data for load and generation can be utilized. Note that these data are not readily available unless all the buses in the network are equipped with measuring devices; however, such load and generation profiles can be estimated using a state estimation algorithm. Then, the network power flow equations can be solved to obtain the system state, which is used later to synthesize the measurements using (1) and (2). Hence, for each historical load and generation instance, a noiseless training pair $(\mathbf{z}_j, \mathbf{v}_j)$ can be generated. The second way to obtain the training data is to resort to an operating state estimation procedure. In this case, the goal of the neural network approach is to emulate the mapping of the estimator from the measurement space to the state space. The second approach suffers from all the limitations of the current state estimation algorithms, such as inaccuracy or computational inefficiency. In addition to providing noisy training pairs, these limitations result in a much more time-consuming way of generating training data. Therefore, the first way is adopted for the rest of this paper, and the detailed procedure is presented in the experiments section.

Remark: One concern for data-driven approaches is that the mapping is learned from historical data under a certain network topology. What if the configuration changes for some reason (e.g., maintenance)? Is the trained mapping still useful? The answer is, surprisingly, affirmative, thanks to the ‘lazy’ training objective: since we do not seek exact solutions for the mapping, modest reconfiguration of the network will not destroy the effectiveness of the trained NN for initialization, as we will see in Section V.

IV. EXPERIMENTAL RESULTS

The proposed state estimation procedure is tested on the benchmark IEEE-37 distribution feeder. This network is recommended for testing state estimation algorithms by the Test Feeder Working Group of the Distribution System Analysis Subcommittee of the IEEE PES [34]. The feeder is known to be a highly unbalanced system that has several delta-connected loads, which are blue-colored in Fig. 4.

Fig. 4. IEEE-37 distribution feeder. Nodes in blue circles have loads, and red square nodes represent buses with DER installed. Buses with PMUs are circled, and the links where the current magnitudes are measured have a small rhombus on them.

The feeder has nodes that feature different types of connections, i.e., single-, two-, and three-phase connections. Additionally, distributed energy resources are assumed to be installed
TABLE I
LOADS AND DER CONNECTIONS

Fig. 5. Histogram of the distance between the shallow NN output and the true
voltage profile with ( = 0) and ( = 1).
at six different buses, which are colored in red in Fig. 4. In
Table I, the types of the connections of all the loads and
DERs are presented where (L) and (G) mean load and DER, number of distributed energy sources is 6. Therefore, the
respectively. state estimator obtains 58 real pseudo-measurements relat-
Historical load and generation data available in [35] modu- ing to the active and reactive forecasted demand/injection
lated by the values of the loads are used to generate the training at these buses.
samples. Each time instance has an injection profile which is The state estimator obtains noisy measurements and inexact
used as an input to the linearized power flow solver in [36]. load demands and energy generation quantities. It is assumed
The algorithm returns a voltage profile (network state variable) which is utilized to generate the value of the measurements at this point of time. A total of 100,000 loading and generation scenarios were used to train a shallow neural network. The network has an input size of 103, 2048 nodes in the hidden layer, and an output of size 210.

The available measurements are detailed as follows.
• PMU measurements: four PMUs are installed at buses 701, 704, 709, and 734, which are circled in Fig. 4. It is assumed that the voltage phasors of all the phases are measured at these buses. This sums up to 12 complex measurements, i.e., 24 real measurements. We installed a unit at the substation, and then placed the rest to be almost evenly distributed along the network in order to achieve better observability.
• Current magnitude measurements: The magnitude of the current flow is measured on all phases of the lines that are marked with a rhombus in Fig. 4. This gives 21 real current magnitude measurements. We installed the units such that the state estimation problem can be solved without unobservability problems. We tested different installations for the current flow measuring devices with noiseless measurements, and then chose one such that the problem is not ill-posed.
• Pseudo-measurements: The aggregate load demand of the buses with load installed, which are blue-colored in Fig. 4, is estimated using a load forecasting algorithm using historical and situational data. Therefore, only two real quantities are obtained by the state estimator that relate to the active and reactive estimated load demand at the load buses. In addition, an energy forecast method is used to obtain an estimated injection from the renewable energy sources located at the DER buses, which are colored in red in Fig. 4. The total number of load buses in the feeder is 23, and the [...]

[...] that the noise in the PMU voltage measurements is drawn from a Gaussian distribution with zero mean and a standard deviation of $10^{-3}$. Additionally, the noise added to current magnitudes is Gaussian distributed with a standard deviation of $10^{-2}$. Finally, the differences between the pseudo-measurements and the real load demand and generation are assumed to be drawn from a Gaussian distribution with a standard deviation of $10^{-1}$. The proposed learning-based state estimation approach aims at estimating the voltage phasor at all the phases of all the buses in the network.

The shallow neural network is trained using the TensorFlow [37] software library with 90% of the data used for training while the rest is used for verification. After tuning the network parameters, noisy measurements are generated and then passed to the state estimator architecture in Fig. 2. In order to show the effect of the modified cost function, we test the networks trained with different values of ε on 1,000 loading and generation scenarios. Fig. 5 shows the histogram of the distance between the output of the shallow NN and the true network state. With the conventional training cost function (ε = 0) the resulting distribution is more spread than the histogram that we obtain through the network trained with a relaxed cost function (ε = 1).

Two performance indices (13)-(14) are introduced to quantify the quality of the estimate as well as the performance of the proposed approach. The first index, which is denoted by ν, represents the squared Frobenius norm of the estimation error. Also, the cost function value at the estimate is denoted by μ.

$$\nu = \|\hat{v} - v_{\mathrm{true}}\|_2^2 \tag{13}$$

$$\mu = \sum_{\ell=1}^{L} \left(z_\ell - h_\ell(\hat{v})\right)^2 \tag{14}$$
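As a minimal sketch of how the two indices in (13)-(14) could be evaluated, the following Python snippet computes ν and μ for a toy example. The function names, the toy measurement function `toy_h` (voltage magnitudes only), and the numerical values are illustrative assumptions and are not taken from the paper's experiments.

```python
from typing import Callable, List, Sequence


def estimation_error(v_hat: Sequence[complex], v_true: Sequence[complex]) -> float:
    """nu in (13): squared 2-norm of the complex estimation error."""
    return sum(abs(a - b) ** 2 for a, b in zip(v_hat, v_true))


def fitting_error(
    z: Sequence[float],
    h: Callable[[Sequence[complex]], List[float]],
    v_hat: Sequence[complex],
) -> float:
    """mu in (14): sum of squared residuals z_l - h_l(v_hat)."""
    predicted = h(v_hat)
    return sum((z_l - h_l) ** 2 for z_l, h_l in zip(z, predicted))


def toy_h(v: Sequence[complex]) -> List[float]:
    """Hypothetical measurement function: voltage magnitude at every bus."""
    return [abs(x) for x in v]


# Toy two-bus voltage profile (assumed values, per unit).
v_true = [1.0 + 0.0j, 0.98 - 0.02j]
v_hat = [1.0 + 0.0j, 0.97 - 0.02j]

z = toy_h(v_true)                      # noiseless measurements of the true state
nu = estimation_error(v_hat, v_true)   # distance to the true state
mu = fitting_error(z, toy_h, v_hat)    # measurement fitting error at the estimate
print(f"nu = {nu:.3e}, mu = {mu:.3e}")
```

Here ν requires knowledge of the true state and is therefore only available in simulation, whereas μ can be computed from the measurements alone.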


TABLE II
THE ESTIMATOR PERFORMANCE WITH DIFFERENT VALUES OF (ε)

TABLE III
PERFORMANCE COMPARISON OF DIFFERENT STATE ESTIMATORS

Furthermore, in order to show the effect of changing the cost function used for training, the average cost achieved using the proposed approach is shown in Table II when different values of ε are used in the training cost function. In addition, the average number of iterations required by the Gauss-Newton iterates to converge to the optimal estimate is also presented. Using a positive value of ε can lead to savings of up to 25% in computations, which is valuable when solving the DSSE for large systems. Also, it can be seen that choosing non-zero values for ε enhances the performance of the proposed architecture. The estimation accuracy can be almost 5 times better using a positive ε. As the approximation requirement is relaxed while training the shallow NN, the network gains in generalization ability, accommodating more scenarios of loading and generation profiles.

To assess the efficacy of the proposed approach we compare it against the complex variable Gauss-Newton state estimator using [38] as a state-of-the-art Gauss-Newton solver for a real-valued optimization problem in complex variables. The shallow NN was trained with ε = 12 in the next comparisons.

In Table III, the average accuracy achieved in estimating the true voltage profile using both the Gauss-Newton method and the proposed architecture is presented for 1000 scenarios. In the Gauss-Newton implementation, the complex voltages provided by the PMUs are used to initialize the voltage phasors corresponding to these buses. This provides a better initialization point to the Gauss-Newton algorithm, which also enhances its stability. Still, the proposed approach is able to achieve almost 10 times better accuracy on average. In addition, the fitting error, which represents the WLS cost function, is greatly reduced using the proposed approach.

In order to assess the computational time of the proposed algorithm, we tried 1000 simulated cases for the NN-assisted state estimator and the Gauss-Newton (optimization-only) state estimator. In Table IV, the number of divergent cases out of the 1000 trials is presented for both approaches. While the Gauss-Newton approach failed to converge in 28 scenarios, the proposed architecture converged for all considered cases. In addition, the time taken by the proposed learning approach is almost four times less than that of the Gauss-Newton algorithm. This is due to the fact that only a few Gauss-Newton iterations need to be done when the proposed approach is utilized.

TABLE IV
TIMING AND CONVERGENCE OF DIFFERENT STATE ESTIMATORS

V. SYSTEM RECONFIGURATION
In distribution systems, the network configuration may be subject to changes either for restoration [39], i.e., to isolate a fault, or for system loss reduction [40], [41]. An important task is to identify the underlying topology of the feeder. In order to perform this task, several approaches have been recently developed utilizing measurement data [42]–[44]. Without access to the correct network topology, accurate state estimation is untenable, as the estimator will attempt to fit the measurements to a wrong model. In other words, the ground truth function generating z is different from h(v), and hence, solving (3) is meaningless. Therefore, in this study we only consider the case where the (new) system topology has been adequately identified. Hence, in the latter part of the proposed approach, the Gauss-Newton iterations utilize accurate system topology information.

In order to test the robustness of the proposed approach, we assume that switches are available on several lines in the feeder as depicted in Fig. 4. Also, we add three additional lines to the network as redundant lines that are assumed to be unenergized under normal operating conditions [45]. Specifically, switches are assumed to be present on the original lines (710, 735), (703, 730) and (727, 744), and 3 additional tie lines (742, 744), (735, 737) and (703, 741) are added to the feeder. We assess the robustness of the proposed learning approach under the following three scenarios.
• Scenario A: a fault has occurred in line (727, 744) and the tie-switch on line (742, 744) has been turned on;
• Scenario B: a fault has occurred in line (703, 730) and the line (703, 741) has been energized; and
• Scenario C: the switch on line (710, 735) has been turned off, and the switch on line (735, 737) has been connected.
We train the NN using data that are generated from the original network topology. We assess the performance of the proposed learning-based state estimator on Scenarios A, B and C. Note that, although in these scenarios the neural network is trained on a different generating model, the proposed approach still has advantageous performance when compared with the plain Gauss-Newton approach. This can be attributed to the specially designed cost function (12) that was proposed to train the NN. During the course of our experiments, we noticed that the robust performance against topology reconfiguration events is more


pronounced when positive ε is used for training the NN. Table V compares the performance of the proposed state estimator with (ε = 12) against the plain Gauss-Newton approach under the three system reconfiguration events. Clearly, the proposed approach still provides performance gains even under modest topology changes. When the approach was tested under significant reconfiguration events, our simulations showed that the initialization produced by projecting the flat voltage profile onto the linear space defined by the PMU measurements performs better than the neural network initialization. The training procedure is not computationally intensive due to the simplicity of the model and the relaxed training cost function. Therefore, in case of severe system reconfigurations that are expected to last for a long time, the shallow neural network can be retrained in the order of a few minutes to match the underlying physical model.

TABLE V
PERFORMANCE COMPARISON WITH SYSTEM RECONFIGURATION (AVERAGED OVER 100 RUNS)

VI. CONCLUSION
This paper presented a data-driven learning-based state estimation architecture for distribution networks. The proposed approach designs a neural network that can accommodate several types of measurements as well as pseudo-measurements. Historical load and energy generation data is used to train a neural network in order to produce an approximation of the network state. Then, this estimate is fed to a Gauss-Newton algorithm for refinement. Our realistic experiments suggest that the combination offers fast and reliable convergence to the optimal solution. The IEEE-37 test feeder was used to test the proposed approach in scenarios that include distributed energy sources. The proposed learning approach shows superior performance in terms of the accuracy of the estimates as well as computation time.

APPENDIX A
PROOF OF PROPOSITION 1
To prove the proposition, we first invoke the following lemma:

Lemma 1 ([32, Theorem 2]): Let $\sigma(\cdot)$ be any continuous sigmoidal function. Then, given any function $f(\cdot)$ that is continuous on the $d$-dimensional unit cube $I^d = [0, 1]^d$, and $\epsilon > 0$, there is a sum, $g(\cdot) : \mathbb{R}^d \to \mathbb{R}$, of the form

$$g(z) = \sum_{t=1}^{T} \alpha_t\, \sigma(w_t^T z + \beta_t) \tag{15}$$

for which

$$|g(z) - f(z)| < \epsilon \quad \forall z \in I^d.$$

Proof of Proposition 1: Note that the vector-valued function $F(\cdot)$ can be represented as $K$ separate scalar-valued functions. In order to prove the proposition, we start by considering approximating a scalar-valued function $f_k(z)$ that represents the mapping between $z$ and the $k$-th element of $F(z)$.

Since the $z_j$'s are finite with length $L$, finite maximum and minimum along each dimension can be obtained. Let the vectors that collect the maximum and minimum values be denoted by $\bar{z}$ and $\underline{z}$, respectively. Then, each training sample $z_j$ is replaced by $\tilde{z}_j = D_{\bar{z}-\underline{z}}^{-1}(z_j - \underline{z})$, where $D_{\bar{z}-\underline{z}}$ is a diagonal matrix that has the values of $\bar{z} - \underline{z}$ on the diagonal. Therefore, the vectors $\tilde{z}_j$ are inside the $L$-dimensional cube $I^L$. According to Lemma 1, there exists a sum $\tilde{g}_k(\tilde{z})$ in the form of

$$\tilde{g}_k(\tilde{z}) = \sum_{t=1}^{T_k} \tilde{\alpha}_{t,k}\, \sigma(\tilde{w}_{t,k}^T \tilde{z} + \tilde{\beta}_{t,k}) \tag{16}$$

that satisfies

$$|f_k(\tilde{z}_j) - \tilde{g}_k(\tilde{z}_j)| < \epsilon_1 \quad \forall\, \tilde{z}_j \tag{17}$$

for $\epsilon_1 > 0$. Then, let $g_k(z)$ be a mapping in the form of (16) where the parameters are given by

$$\alpha_{t,k} = \tilde{\alpha}_{t,k}, \qquad \beta_{t,k} = \tilde{\beta}_{t,k} - \tilde{w}_{t,k}^T D_{\bar{z}-\underline{z}}^{-1}\, \underline{z}, \qquad w_{t,k} = D_{\bar{z}-\underline{z}}^{-1}\, \tilde{w}_{t,k}.$$

Then, for all $z_j$ we have

$$|f_k(z_j) - g_k(z_j)| < \epsilon_1. \tag{18}$$

This result holds for each of the $K$ scalar elements of $F(z)$. Therefore, by parallel concatenation of the $K$ neural networks used to approximate the $K$ scalar-valued functions, we obtain a shallow neural network that has $K$ outputs and $\left(\sum_i T_i\right)$ neurons at the hidden layer. Setting $\epsilon = \sqrt{K}\,\epsilon_1$, we deduce that there exists a sum $g_T(z)$ in the form of (10) that satisfies

$$\|F(z_j) - g_T(z_j)\|_2 < \epsilon \quad \forall\, z_j. \tag{19}$$

It is clear now that the parameters of this function $g_T(z)$, i.e., $\alpha_t$, $w_t$, and $\beta_t$, achieve a zero cost function in Problem (12), and hence are optimal in solving (12).

In addition, since an approximation can be realized using any sigmoid functions, the main result in [46] specifies that the minimum number of neurons required to achieve accuracy at least $\epsilon_1$ for a scalar-valued function is given by

$$T = O\!\left(\epsilon_1^{-L/r}\right) \tag{20}$$

where $r$ denotes the number of continuous derivatives of the approximated function $f(z)$, and $L$ represents the number of parameters of the function. In order to achieve accuracy $\epsilon$ for approximating $F(z)$, at least one of the real-valued functions that construct $F(z)$ has to achieve $\epsilon/\sqrt{K}$. Hence, the complexity of shallow neural networks that optimally solve (12) for $\epsilon > 0$ is at least

$$T = O\!\left(\left(\frac{\epsilon}{\sqrt{K}}\right)^{-L/r}\right). \qquad \blacksquare$$
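The change of variables used in the proof can be checked numerically. The sketch below assumes that the normalization divides each coordinate of $z - \underline{z}$ by the corresponding range $\bar{z}_i - \underline{z}_i$ (i.e., it applies the inverse of the diagonal range matrix), and verifies that a sum of the form (16) evaluated on the normalized input equals the same sum evaluated on the raw input with the mapped parameters. All function names and numerical values are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(x: float) -> float:
    """Logistic function, one example of a continuous sigmoidal function."""
    return 1.0 / (1.0 + math.exp(-x))


def shallow_net(z: Sequence[float], alpha, w, beta) -> float:
    """Scalar sum of the form (15)/(16): sum_t alpha_t * sigma(w_t^T z + beta_t)."""
    return sum(
        a * sigmoid(sum(wi * zi for wi, zi in zip(wt, z)) + b)
        for a, wt, b in zip(alpha, w, beta)
    )


def normalize(z, z_min, z_max) -> List[float]:
    """Map z into the unit cube (assumes z_max[i] > z_min[i] for every i)."""
    return [(zi - lo) / (hi - lo) for zi, lo, hi in zip(z, z_min, z_max)]


def map_parameters(alpha_t, w_t, beta_t, z_min, z_max) -> Tuple[list, list, list]:
    """Translate parameters fitted on normalized inputs to the raw input scale."""
    scale = [1.0 / (hi - lo) for lo, hi in zip(z_min, z_max)]
    # w = D^{-1} w_tilde (element-wise, since D is diagonal)
    w = [[wi * s for wi, s in zip(wt, scale)] for wt in w_t]
    # beta = beta_tilde - w_tilde^T D^{-1} z_min = beta_tilde - w^T z_min
    beta = [b - sum(wi * lo for wi, lo in zip(wt, z_min)) for b, wt in zip(beta_t, w)]
    return list(alpha_t), w, beta


# Hypothetical two-input, two-neuron example.
z_min, z_max = [0.9, -2.0], [1.1, 3.0]
alpha_t = [0.5, -1.2]
w_t = [[1.0, -2.0], [0.3, 0.7]]
beta_t = [0.1, -0.4]
z = [1.05, 0.5]

on_normalized = shallow_net(normalize(z, z_min, z_max), alpha_t, w_t, beta_t)
on_raw = shallow_net(z, *map_parameters(alpha_t, w_t, beta_t, z_min, z_max))
print(abs(on_normalized - on_raw))
```

Because the two evaluations agree for every input, any approximation guarantee obtained on the unit cube carries over unchanged to the original measurement range, which is the step the proof relies on.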


APPENDIX B
AN EXAMPLE NETWORK
Neural networks are known as universal function approximators. Nevertheless, the theoretical results on the ability to approximate functions are usually limited to continuous functions. Hence, for continuous functions, neural networks are expected to be able to achieve high approximation accuracy. Unfortunately, checking the continuity of the state estimation solution, which is an inverse mapping of a highly nonlinear function, is not simple.

In this appendix, a 3-bus balanced lossless network is presented in Fig. 6 in order to inspect the continuity of the state estimation mapping. We assume that the simple network has 3 buses and that the voltage magnitudes are measured at all buses. In addition, the active and reactive power flows are measured at all lines. Since the phase at Bus 1 can be taken as a reference for the other buses, the state estimation problem amounts to estimating the line phase differences, or equivalently, the phases at Bus 2 and Bus 3.

Fig. 6. Example 3-bus network.

The power flow equations can be expressed as follows

$$P_{12} = B_{12}|v_1||v_2| \sin(\theta_{12}), \tag{21}$$

$$Q_{12} = |v_1|^2 - B_{12}|v_1||v_2| \cos(\theta_{12}), \tag{22}$$

$$P_{13} = B_{13}|v_1||v_3| \sin(\theta_{13}), \tag{23}$$

$$Q_{13} = |v_1|^2 - B_{13}|v_1||v_3| \cos(\theta_{13}) \tag{24}$$

where $B_{ij}$ is the susceptance of the line between Bus $i$ and Bus $j$, $|v_i|$ is the voltage magnitude at the $i$-th bus, and $\theta_{ij}$ is the angle difference on the line $(i, j)$. Assuming that the collected measurements are noiseless, the solution of the state estimation problem can be written in closed form as

$$\theta_{12} = \sin^{-1}\!\left(\frac{P_{12}}{B_{12}|v_1||v_2|}\right), \tag{25}$$

$$\theta_{13} = \sin^{-1}\!\left(\frac{P_{13}}{B_{13}|v_1||v_3|}\right). \tag{26}$$

Claim 1: The mapping between the measurements $P_{12}$, $P_{13}$, $|v_1|$, and $|v_2|$ and the state of the network, i.e., $\theta_{12}$ and $\theta_{13}$, is continuous if

$$B_{12}|v_1||v_2| \geq \epsilon, \quad \text{and} \quad B_{13}|v_1||v_3| \geq \epsilon \tag{27}$$

for any $\epsilon > 0$.

Proof: The proof is straightforward and builds upon basic results from real analysis. First, the function

$$f_1(P_{12}, |v_1|, |v_2|) = \frac{P_{12}}{B_{12}|v_1||v_2|} \tag{28}$$

is continuous on $B_{12}|v_1||v_2| \in [\epsilon, \infty)$ for any $\epsilon > 0$. Then, the mapping in (25) is the composition of $f_1$ and $\sin^{-1}(\cdot)$, which is a continuous function. Therefore, the mapping in (25) is continuous on $B_{12}|v_1||v_2| \in [\epsilon, \infty)$ for any $\epsilon > 0$ [47]. The same follows for $\theta_{13}$. $\blacksquare$

REFERENCES
[1] V. Kekatos, G. Wang, H. Zhu, and G. B. Giannakis, “PSSE redux: Convex relaxation, decentralized, robust, and dynamic approaches,” 2017, arXiv:1708.03981.
[2] G. Wang, G. B. Giannakis, J. Chen, and J. Sun, “Distribution system state estimation: An overview of recent developments,” Frontiers Inf. Technol. Electron. Eng., vol. 20, no. 1, pp. 1–14, Jan. 2019.
[3] Y. Lin and A. Abur, “A highly efficient bad data identification approach for very large scale power systems,” IEEE Trans. Power Syst., vol. 33, no. 6, pp. 5979–5989, Nov. 2018.
[4] A. K. Ghosh, D. L. Lubkeman, and R. H. Jones, “Load modeling for distribution circuit state estimation,” IEEE Trans. Power Del., vol. 12, no. 2, pp. 999–1005, Apr. 1997.
[5] E. Manitsas, R. Singh, B. C. Pal, and G. Strbac, “Distribution system state estimation using an artificial neural network approach for pseudo measurement modeling,” IEEE Trans. Power Syst., vol. 27, no. 4, pp. 1888–1896, Nov. 2012.
[6] I. Džafić, R. A. Jabr, I. Huseinagić, and B. C. Pal, “Multi-phase state estimation featuring industrial-grade distribution network models,” IEEE Trans. Smart Grid, vol. 8, no. 2, pp. 609–618, Mar. 2017.
[7] M. E. Baran and A. W. Kelley, “State estimation for real-time monitoring of distribution systems,” IEEE Trans. Power Syst., vol. 9, no. 3, pp. 1601–1609, Aug. 1994.
[8] K. Li, “State estimation for power distribution system and measurement impacts,” IEEE Trans. Power Syst., vol. 11, no. 2, pp. 911–916, May 1996.
[9] R. Singh, B. Pal, and R. Jabr, “Choice of estimator for distribution system state estimation,” IET Gener., Transmiss., Distrib., vol. 3, no. 7, pp. 666–678, Jul. 2009.
[10] V. Kekatos and G. B. Giannakis, “Distributed robust power system state estimation,” IEEE Trans. Power Syst., vol. 28, no. 2, pp. 1617–1626, May 2013.
[11] G. Wang, A. S. Zamzam, G. B. Giannakis, and N. D. Sidiropoulos, “Power system state estimation via feasible point pursuit: Algorithms and Cramer–Rao bound,” IEEE Trans. Signal Process., vol. 66, no. 6, pp. 1649–1658, Mar. 2018.
[12] I. Dzafic, R. A. Jabr, and T. Hrnjic, “Hybrid state estimation in complex variables,” IEEE Trans. Power Syst., vol. 33, no. 5, pp. 5288–5296, Sep. 2018, doi: 10.1109/TPWRS.2018.2794401.
[13] M. E. Baran and A. W. Kelley, “A branch-current-based state estimation method for distribution systems,” IEEE Trans. Power Syst., vol. 10, no. 1, pp. 483–491, Feb. 1995.
[14] H. Wang and N. N. Schulz, “A revised branch current-based distribution system state estimation algorithm and meter placement impact,” IEEE Trans. Power Syst., vol. 19, no. 1, pp. 207–213, Feb. 2004.
[15] A. G. Phadke, J. S. Thorp, and K. Karimi, “State estimation with phasor measurements,” IEEE Trans. Power Syst., vol. 1, no. 1, pp. 233–238, Feb. 1986.
[16] R. Zivanovic and C. Cairns, “Implementation of PMU technology in state estimation: An overview,” in Proc. IEEE AFRICON, Stellenbosch, South Africa, 1996, pp. 1006–1011.
[17] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Proc. Adv. Neural Inf. Process. Syst., 2012, pp. 1097–1105.
[18] I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. C. Courville, and Y. Bengio, “Maxout networks,” in Proc. 30th Int. Conf. Mach. Learn., Jun. 2013.
[19] L. Wan, M. Zeiler, S. Zhang, Y. Le Cun, and R. Fergus, “Regularization of neural networks using dropconnect,” in Proc. Int. Conf. Mach. Learn., 2013, pp. 1058–1066.


[20] O. P. Ferreira, “Local convergence of Newton’s method in Banach space from the viewpoint of the majorant principle,” IMA J. Numer. Anal., vol. 29, no. 3, pp. 746–759, 2009.
[21] D. O’Neill, M. Levorato, A. Goldsmith, and U. Mitra, “Residential demand response using reinforcement learning,” in Proc. 1st IEEE Int. Conf. Smart Grid Commun., Gaithersburg, MD, USA, 2010, pp. 409–414.
[22] X. Fang, D. Yang, and G. Xue, “Online strategizing distributed renewable energy resource access in islanded microgrids,” in Proc. IEEE Global Telecommun. Conf., 2011, pp. 1–6.
[23] S. Amin, V. Gerhart, and E. Rodin, “System identification via artificial neural networks-applications to on-line aircraft parameter estimation,” in Proc. World Aviation Congr., Anaheim, CA, USA, 1997, Art. no. 5612.
[24] P. Rousseaux, D. Mallieu, T. Van Cutsem, and M. Ribbens-Pavella, “Dynamic state prediction and hierarchical filtering for power system state estimation,” Automatica, vol. 24, no. 5, pp. 595–618, 1988.
[25] A. A. da Silva, A. L. da Silva, J. S. de Souza, and M. Do Coutto Filho, “State forecasting based on artificial neural networks,” in Proc. 11th Power Syst. Comput. Conf., 1993, pp. 461–467.
[26] M. B. D. C. Filho and J. C. S. de Souza, “Forecasting-aided state estimation - Part I: Panorama,” IEEE Trans. Power Syst., vol. 24, no. 4, pp. 1667–1677, Nov. 2009.
[27] K. Schneider and E. Stuart, “Necpuc distribution systems & planning training: Data and analytics for distribution,” Sep. 2017. [Online]. Available: http://eta-publications.lbl.gov/sites/default/files/9a._data_and_analytics_ems_core-ls.pdf
[28] Q. Zhang, Y. Chakhchoukh, V. Vittal, G. T. Heydt, N. Logic, and S. Sturgill, “Impact of PMU measurement buffer length on state estimation and its optimization,” IEEE Trans. Power Syst., vol. 28, no. 2, pp. 1657–1665, May 2013.
[29] E. Dall’Anese, H. Zhu, and G. B. Giannakis, “Distributed optimal power flow for smart microgrids,” IEEE Trans. Smart Grid, vol. 4, no. 3, pp. 1464–1475, Sep. 2013.
[30] A. S. Zamzam, N. D. Sidiropoulos, and E. Dall’Anese, “Beyond relaxation and Newton–Raphson: Solving AC OPF for multi-phase systems with renewables,” IEEE Trans. Smart Grid, vol. 9, no. 5, pp. 3966–3975, Sep. 2018.
[31] R. Nuqui and A. G. Phadke, “Hybrid linear state estimation utilizing synchronized phasor measurements,” in Proc. IEEE Power Tech, 2007, pp. 1665–1669.
[32] G. Cybenko, “Approximation by superpositions of a sigmoidal function,” Math. Control, Signals, Syst., vol. 2, no. 4, pp. 303–314, 1989.
[33] H. Sun, X. Chen, Q. Shi, M. Hong, X. Fu, and N. D. Sidiropoulos, “Learning to optimize: Training deep neural networks for wireless resource management,” IEEE Trans. Signal Proc., vol. 66, no. 20, pp. 5438–5453, Oct. 2018.
[34] K. P. Schneider et al., “Analytic considerations and design basis for the IEEE distribution test feeders,” IEEE Trans. Power Syst., vol. 33, no. 3, pp. 3181–3188, May 2018.
[35] J. Bank and J. Hambrick, “Development of a high resolution, real time, distribution-level metering system and associated visualization, modeling, and data analysis functions,” National Renewable Energy Laboratory, Golden, CO, USA, Tech. Rep. NREL/TP-5500-56610, 2013.
[36] A. Garces, “A linear three-phase load flow for power distribution systems,” IEEE Trans. Power Syst., vol. 31, no. 1, pp. 827–828, Jan. 2016.
[37] M. Abadi et al., “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015. [Online]. Available: https://www.tensorflow.org/
[38] L. Sorber, M. V. Barel, and L. D. Lathauwer, “Unconstrained optimization of real functions in complex variables,” SIAM J. Optim., vol. 22, no. 3, pp. 879–898, 2012.
[39] A. L. Morelato and A. J. Monticelli, “Heuristic search approach to distribution system restoration,” IEEE Trans. Power Del., vol. 4, no. 4, pp. 2235–2241, Oct. 1989.
[40] M. E. Baran and F. F. Wu, “Network reconfiguration in distribution systems for loss reduction and load balancing,” IEEE Trans. Power Del., vol. 4, no. 2, pp. 1401–1407, Apr. 1989.
[41] F. V. Gomes, S. Carneiro, J. L. R. Pereira, M. P. Vinagre, P. A. N. Garcia, and L. R. D. Araujo, “A new distribution system reconfiguration approach using optimum power flow and sensitivity analysis for loss reduction,” IEEE Trans. Power Syst., vol. 21, no. 4, pp. 1616–1623, Nov. 2006.
[42] D. Deka, M. Chertkov, and S. Backhaus, “Structure learning in power distribution networks,” IEEE Trans. Control Netw. Syst., vol. 5, no. 3, pp. 1061–1074, Sep. 2018.
[43] G. Cavraro and V. Kekatos, “Graph algorithms for topology identification using power grid probing,” IEEE Control Syst. Lett., vol. 2, no. 4, pp. 689–694, Oct. 2018.
[44] Y. Weng, Y. Liao, and R. Rajagopal, “Distributed energy resources topology identification via graphical modeling,” IEEE Trans. Power Syst., vol. 32, no. 4, pp. 2682–2694, Jul. 2017.
[45] E. Dall’Anese and G. B. Giannakis, “Convex distribution system reconfiguration using group sparsity,” in Proc. IEEE Power Energy Soc. Gen. Meeting, Jul. 2013, pp. 1–5.
[46] H. N. Mhaskar, “Neural networks for optimal approximation of smooth and analytic functions,” Neural Comput., vol. 8, no. 1, pp. 164–177, 1996.
[47] W. Rudin, Principles of Mathematical Analysis, vol. 3. New York, NY, USA: McGraw-Hill, 1964.

Ahmed S. Zamzam (S’14) received the B.Sc. degree from Cairo University, Giza, Egypt, in 2013 and the M.Sc. degree from Nile University, Giza, Egypt, in 2015. He is currently working toward the Ph.D. degree with the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN, USA, where he is also affiliated with the Signal and Tensor Analytics Research (STAR) group under the supervision of Professor N. D. Sidiropoulos. He received the Louis John Schnell Fellowship (2015), and the Doctoral Dissertation Fellowship (2018) from the University of Minnesota. His research interests include control and optimization of smart grids, large-scale complex energy systems, grid data analytics, and machine learning. He also received Student Travel Awards from the IEEE Signal Processing Society in 2017, the IEEE Power and Energy Society in 2018, and the Council of Graduate Students at the University of Minnesota in 2016 and 2018.

Xiao Fu (S’12–M’15) received the Ph.D. degree in electronic engineering from The Chinese University of Hong Kong, Hong Kong, in 2014. He was a Postdoctoral Associate with the Department of Electrical and Computer Engineering, University of Minnesota Twin Cities, Minneapolis, MN, USA, from 2014 to 2017. He is currently an Assistant Professor with the School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA. His research interests include the broad area of signal processing and machine learning, with a recent emphasis on tensor/matrix factorization. He was the recipient of the Best Student Paper Award at ICASSP 2014 and was a finalist of the Best Student Paper Competition at IEEE SAM 2014. He also coauthored a paper that received a Best Student Paper Award at IEEE CAMSAP 2015.

Nicholas D. Sidiropoulos (F’09) received the Diploma in electrical engineering from the Aristotelian University of Thessaloniki, Thessaloniki, Greece, in 1988, and the M.S. and Ph.D. degrees in electrical engineering from the University of Maryland-College Park, College Park, MD, USA, in 1990 and 1992, respectively. He has served on the faculty of the University of Virginia (UVA), University of Minnesota, and the Technical University of Crete, Greece, prior to his current appointment as the Chair of Electrical and Computer Engineering with UVA. His research interests include signal processing, communications, optimization, tensor decomposition, and factor analysis, with applications in machine learning and communications. He received the NSF/CAREER award in 1998, the IEEE Signal Processing Society (SPS) Best Paper Award in 2001, 2007, and 2011, served as IEEE SPS Distinguished Lecturer (2008–2009), and currently serves as the Vice-President, Membership of IEEE SPS. He received the 2010 IEEE Signal Processing Society Meritorious Service Award, and the 2013 Distinguished Alumni Award from the University of Maryland, Department of Electrical and Computer Engineering. He is also a Fellow of EURASIP (2014).
